Google is the largest, most-used search engine in the world, with a global market share that has held steady at about 90 percent since Google Search launched in 1997 as Backrub. In 2017, Google became the most valuable brand in the world, topping Apple, according to the Brand Finance Global 500 report. Google’s position is due mainly to its core business as a search engine and its ability to transform users into payers via advertising.
About 32 percent of Google visitors come from the US, where the company holds 63.9 percent of the search engine market, according to statista.com. Google had 247 million unique US users in November 2015. Globally, it boasts 1.5 billion search engine users and more than 1 billion users of Gmail.
Google data centers process an average of 40,000 searches per second, resulting in 3.5 billion searches per day and 1.2 trillion searches per year, Internet Live Stats reports. That’s up from 795.2 million searches per year in 1999, one year after Google was launched.
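A quick arithmetic check shows which per-second rate is consistent with the daily and yearly totals quoted above: about 40,000 searches per second. The sketch below is only a sanity check; the function names are ours, and it assumes a constant rate and a 365-day year.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
DAYS_PER_YEAR = 365             # assumption: ignore leap years


def searches_per_day(per_second):
    """Scale a constant per-second search rate to a daily total."""
    return per_second * SECONDS_PER_DAY


def searches_per_year(per_second):
    """Scale a constant per-second search rate to a yearly total."""
    return searches_per_day(per_second) * DAYS_PER_YEAR


rate = 40_000  # searches per second
print(f"{searches_per_day(rate):,} searches per day")
print(f"{searches_per_year(rate):,} searches per year")
```

At that rate the totals come to roughly 3.46 billion searches per day and 1.26 trillion per year, in line with the rounded figures Internet Live Stats reports.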
In a reorganization in October 2015, Google became a subsidiary of a new company it created called Alphabet. Since then, several projects have been canceled or scaled back, including a halt to the further rollout of Google Fiber. Following the reorg, however, Google has put a lot of focus on (and dedicated a lot of resources to) selling cloud services to enterprises, going head-to-head against the market giant Amazon Web Services and the second-largest player in the space, Microsoft Azure.
That has meant a major expansion of Google data centers specifically to support those cloud services. At the Google Cloud Next conference in San Francisco in March 2017, the company’s execs revealed that it spent nearly $30 billion on data centers over the preceding three years. While the company already has what is probably the world’s largest cloud, it was not built to support enterprise cloud services. To do that, the company needs to have data centers in more locations, and that’s what it has been doing, adding new locations to support cloud services and adding cloud data center capacity wherever it makes sense in existing locations.
Here are some of the most frequently asked questions about Google data centers and our best stab at answering them:
Where Are Google Data Centers Located?
Google lists eight data center locations in the U.S., one in South America, four in Europe and two in Asia. Its cloud sites, however, are expanding, and Google’s cloud map shows many points of presence worldwide. The company also has many caching sites in colocation facilities throughout the world, whose locations it does not share.
This far-flung network is necessary not only to support operations that run 24/7, but also to meet region-specific regulations (like the EU’s privacy regulations) and to ensure business continuity in the face of risks like natural disasters.
Google data centers in The Dalles, Oregon, 2006 (Photo: Craig Mitchelldyer/Getty Images)
In the works as of March 2017 are Google data centers for cloud services in California, Canada, the Netherlands, Northern Virginia, São Paulo, London, Finland, Frankfurt, Mumbai, Singapore, and Sydney.
Here are the U.S. data center sites listed by Google:
- Berkeley County, South Carolina
- Council Bluffs, Iowa
- Douglas County, Georgia
- Jackson County, Alabama
- Lenoir, North Carolina
- Mayes County, Oklahoma
- Montgomery County, Tennessee
- The Dalles, Oregon
Here's a March 2017 map of the global infrastructure that supports Google's enterprise cloud services, including existing and future company-owned data centers, leased edge sites in colocation facilities (there are more than 100 of those), and leased and owned fiber routes:
How Big Are Google Data Centers?
A paper presented to the IEEE 802.3bs Task Force in May 2014 estimates the size of five of Google’s US facilities as:
- Pryor Creek (Mayes County), Oklahoma, 980,000 square feet
- Lenoir, North Carolina, 337,000 square feet
- The Dalles, Oregon, 200,000 square feet (before the 164,000-square-foot expansion in 2016)
- Council Bluffs, Iowa, 200,000 square feet
- Berkeley County, South Carolina, 200,000 square feet
Many of these sites have multiple data center buildings, as Google prefers to build additional structures as sites expand rather than containing operations in a single massive building.
Google itself doesn’t disclose the size of its data centers. Instead, it mentions the cost of the sites or number of employees. Sometimes, facility size slips out. For example, the announcement about the opening of The Dalles in Oregon said the initial building was 164,000 square feet. The size of subsequent expansions, however, has been kept tightly under wraps.
Reports discussing Google’s new data center in Eemshaven, Netherlands, which opened in December 2016, didn’t mention size. Instead, they said the company had contracted for the entire 62-megawatt output of a nearby wind farm and ran 9,941 miles of computer cable within the facility. The data center employs 150 people.
Data center operators building multiple new facilities often standardize aspects of their plans. While Google clearly standardizes much of its operations inside its data centers, the difference in the square footage reports for the data centers in The Dalles and Lenoir suggests that Google hasn't standardized on a single data center size (at least not on the level of MCI/WorldCom, which once built identical 109,000-square-foot data centers in 25 cities). Google's Barry Schnitt says that although Google data centers "potentially" could all be a similar size, they are not cookie-cutter designs, either. Google is constantly updating its data center design and equipment to take advantage of the latest technological advances and efficiencies, according to Schnitt.
How Many Google Data Centers Are There?
Few outside Google know exactly how many data centers Google operates. There are the massive Google data center campuses, of which it says it has 15. Some of its enterprise cloud regions are on those campuses, and some are elsewhere. As of March 2017, the company had six enterprise cloud regions online and 11 in the works (see map above). Most if not all of these locations have or will have multiple data centers each. Google has not shared publicly exactly how many there are in each location.
Also unclear is the number of caching sites, also referred to as edge Points of Presence, that Google has around the world. These are small-capacity deployments in leased spaces inside colocation facilities operated by data center providers like Equinix, Interxion, or NTT. The company says there are more than 100 such sites but doesn't share the exact number.
How Many Servers Does Google Have?
There’s no official data on how many servers there are in Google data centers, but Gartner estimated in a July 2016 report that Google at the time had 2.5 million servers. This number, of course, is always changing as the company expands capacity and refreshes its hardware.
How Does Google Decide Where to Build Its Data Centers?
Here are the factors that are known to influence Google's data center site selection process:
- The availability of large volumes of cheap electricity to power the data centers
- Google's commitment to carbon neutrality has sharpened its focus on renewable power sources such as wind power and hydro power. The Dalles was chosen primarily for the availability of hydro power from the Columbia River, while the local utility's wind power program influenced the selection of Council Bluffs, Iowa.
- The presence of a large supply of water to support the chillers and water towers used to cool Google's data centers. A number of recent Google data center sites have been next to rivers or lakes.
- Large parcels of land, which allow for large buffer zones of empty land between the data center and nearby roads. This makes the facilities easier to secure, and is consistent with Google's focus on secrecy about its data centers. Google purchased 215 acres in Lenoir, 520 acres for the Goose Creek project, 800 acres of land in Pryor, and more than 1,200 acres of land in Council Bluffs. The extra land may also be used for building windmill farms to provide supplemental power at some facilities.
- Distance to other Google data centers. Google is known to value lightning-fast response time for its searches. Since data from more than one data center may be required for a query or application, Google prizes fast connections between its data centers. While big pipes can help address this requirement, some observers believe Google carefully spaces its data centers to preserve low latency in its connections between facilities.
- Tax incentives. Legislators in North Carolina, South Carolina, Oklahoma and Iowa all passed measures to provide tax relief to Google.
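To make the point about distance between data centers concrete, here is a rough sketch of how fiber distance translates into latency. It is our own back-of-the-envelope illustration, not a description of how Google plans its sites. The refractive index is an assumed typical value for silica fiber, the function names are hypothetical, and the estimate ignores switching, queuing, and the fact that real fiber routes are longer than straight-line distances.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.47          # assumption: typical silica fiber


def one_way_latency_ms(fiber_km):
    """Propagation delay over a fiber route, in milliseconds."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S / FIBER_REFRACTIVE_INDEX
    return fiber_km / speed_in_fiber * 1000


def round_trip_latency_ms(fiber_km):
    """Propagation delay for a request and its response."""
    return 2 * one_way_latency_ms(fiber_km)


for km in (100, 1_000, 5_000):
    print(f"{km:>5} km: {round_trip_latency_ms(km):.1f} ms round trip")
```

Under these assumptions, every 1,000 kilometers of fiber adds roughly 5 milliseconds each way, a floor that no amount of bandwidth removes.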
Google chooses the locations of its data centers based on a combination of factors that include customer location, available workforce, proximity to transmission infrastructure, tax rebates, utility rates and other related factors. Its recent focus on expanding its cloud infrastructure has added more considerations, such as enterprise cloud customer demand for certain locations and proximity to high-density population centers.
The choice of St. Ghislain, Belgium, for a data center (which opened in 2010) was based on a combination of energy infrastructure, developable land, strong local support for high-tech jobs and the presence of a technology cluster of businesses that actively supports technology education in nearby schools and universities.
A positive business climate is another factor. That, coupled with available land and power, made Oklahoma particularly attractive, according to Google’s senior director of operations when the Pryor Creek site was announced. In Oregon, the positive business environment means locating in a state that has no sales tax. Local Wasco County commissioners also exempted Google from most of its property taxes while requiring it to make a one-time payment of $1.7 million to local governments and payments of at least $1 million each year afterward.
Proximity to renewable energy sources is becoming increasingly important, too. Google is strategically invested in renewable resources and considers its environmental footprint when siting new data centers.
Do Google's Sites Ever Go Offline?
Not very often. The web site monitoring service Pingdom tracked Google's worldwide network of search sites for a one-year period ending in October 2007, and found that all 32 of Google's worldwide search portals (including google.co.uk, google.in, etc.) maintained uptime of at least "four nines" (99.99 percent).
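For a sense of what "four nines" allows, the sketch below converts an availability percentage into a downtime budget. The function name is ours, and the calculation assumes a 365-day measurement period.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Maximum downtime, in minutes, that still meets the availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.99 percent, a site can be unreachable for a little under 53 minutes over the course of a year.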
How Does Google Handle Data Security?
Google deploys progressive layers of security around its physical locations, hardware and software, processes and data for enterprise and consumer computing. Approximately 550 security professionals review security plans for all elements of the network, detect and manage vulnerabilities (including those in third-party software) and scan for malware sites, among their other activities. Its internal audit team monitors security regulations globally to ensure compliance.
Physical access to Google’s data centers is severely limited, so only a tiny fraction of employees ever set foot on the premises. Layered protection includes biometric identification, metal detection, vehicle barriers and laser-based intrusion detection systems. Within the data center, Google deploys its own custom-designed security chips to identify and authenticate Google servers and peripherals to minimize the chance that unauthorized hardware can go online without detection.
Data on Google’s internal network is encrypted. Application layer protocols are encapsulated within remote procedure call (RPC) mechanisms, effectively isolating the application layer so data is secure even if the network is breached. Additionally, all infrastructure RPC traffic sent over the WAN between data centers is encrypted automatically. The deployment of hardware cryptographic accelerators is extending encryption to all infrastructure RPC traffic.
Google also takes pains to encrypt data before writing it to physical storage. Important elements of this strategy include automatic key rotation and audit logs. Encrypting data at the application layer helps the infrastructure isolate itself from such possible threats as malicious disk firmware. Hardware encryption also is enabled for hard drives and solid state devices. For end-users, permission tickets are used, linking encrypted data to users.
Google says its use of bare-bones servers and self-designed software reduces vulnerabilities, along with its process of replicating and distributing data across multiple servers and locations to eliminate single points of failure. Before storage devices are decommissioned, they are wiped in a multi-step process that includes two independent verifications. Devices that don’t meet those requirements are shredded on site.
The practices that protect Google’s infrastructure also secure Google’s cloud platform. An additional safeguard (among many others) is virtual machine isolation, provided by using the KVM stack to virtualize hardware. Google’s implementation of KVM is further hardened by moving part of the control and hardware emulation stack outside the kernel and into an unprivileged process.
Does Google Have a Floating Data Center?
Rumors of Google’s floating data center were rampant, starting in 2013. A barge docked near Treasure Island in San Francisco Bay led to wild speculation as to its uses – one version being that it was a data center. In the end, the barge turned out to be an interactive learning center.
That doesn’t mean floating data centers won’t feature in Google’s future, though. The company received a patent in 2008 for a wave-powered data center that would use the ocean to provide cooling and, through waves’ kinetic action, power. The patent describes potential data center locations as areas 3 to 7 miles from shore in 50 to 70 meters of water. So far, however, Google doesn’t appear to have plans to actually build a floating data center.
What Does a Google Data Center Look Like Inside?
Few people are allowed inside Google data centers, so Google produced some videos to ease your curiosity. Have a look:
Google Data Center 360° Tour – Virtual Reality
Google Data Center – Street View
How Much Do Google Data Centers Cost to Build?
Google’s newest data center at The Dalles in Oregon, a 164,000-square-foot building that opened in 2016, brought its total investment in that site to $1.2 billion. The overall size totals 352,000 square feet of data center space divided among three buildings. The site first opened in 2006 and currently employs 175 people. Google has announced plans to add another $600 million data center about a mile away, bringing the investment to $1.8 billion. That center is expected to employ about 50 people.
Likewise, the Pryor Creek, Oklahoma, data center is continuing to expand. It first went online in 2011 with a 130,000-square-foot, $600 million facility, and Google soon after built another building for staff offices. When the expansion announced in 2016 is completed, Google’s Pryor Creek data center will represent a $2 billion investment.
The new data center under construction in 2016 in Eemshaven, Netherlands, is expected to cost $773 million. In typical Google fashion, there’s no word on size.
Overall, Google's capital expenditures for 2016 were just under $10.2 billion. Most of that can be accounted for by its data centers and land acquisitions.
Do Google Data Centers Use Renewable Energy?
Google has sought a leadership role in clean energy and energy efficiency. In early 2007 the company announced that it would be carbon neutral for 2007 and beyond, and co-founded an industry consortium, the Climate Savers Computing Initiative, to advocate for "less wasteful computing infrastructure" such as high-efficiency power supplies. Google has also launched RE<C, an initiative to develop electricity from renewable sources that is cheaper than electricity produced from coal. The project's initial focus is on advanced solar thermal power, wind power technologies and geothermal systems.
Google buys more renewable energy than any other corporation in the world. In 2016 it bought enough renewable energy to account for more than half its energy usage. In 2017 the company expects to offset all of its energy usage with renewable energy. To do that, Google has signed 20 purchase agreements for 2.6 gigawatts (GW) of renewable energy. This means that, while renewable energy may not be available everywhere or in the quantities Google needs, Google purchases the same amount of renewable energy as it consumes.
Google also has committed $2.5 billion in equity funding to develop solar and wind energy that can be added to the power grid throughout the world. That willingness to fund renewable projects is an attempt to gradually expand the renewable energy market, both in terms of availability and by changing the ways renewable energy can be purchased. In the process, using renewable sources becomes easier and more cost effective for everyone.
Sustainability is a focus inside data centers, too. The St. Ghislain, Belgium, data centers were Google’s first to rely entirely on free cooling. And, that facility’s on-site water purification plant allows the data centers there to recycle water from an industrial canal rather than tapping the region’s fresh water supply.
How Much Energy Do Google Data Centers Use?
Google's major data centers are supported by at least 50 megawatts of electric power, with some estimates ranging as high as 103 megawatts, which is what Harper's magazine estimated to be the power load for Google's data center in The Dalles, Oregon. We suspect that estimate is slightly high, as it is based on math that assumes 500 watts per square foot and multiplies that by the total square footage of the facility.
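The arithmetic behind that kind of estimate is simple enough to sketch. The snippet below is our own illustration of the method, not the magazine's actual worksheet; the function names are hypothetical, and the 200,000-square-foot input is the IEEE task force paper's estimate for The Dalles cited earlier.

```python
WATTS_PER_MEGAWATT = 1_000_000


def estimated_load_mw(square_feet, watts_per_square_foot=500):
    """Power load implied by a flat watts-per-square-foot assumption."""
    return square_feet * watts_per_square_foot / WATTS_PER_MEGAWATT


def implied_square_feet(load_mw, watts_per_square_foot=500):
    """Floor area that a given load estimate implies at the same density."""
    return load_mw * WATTS_PER_MEGAWATT / watts_per_square_foot


print(estimated_load_mw(200_000))   # load for a 200,000-square-foot site
print(implied_square_feet(103))     # area behind a 103-megawatt estimate
```

A 103-megawatt figure at 500 watts per square foot corresponds to about 206,000 square feet, which means every square foot of the facility is being counted as densely packed server space. That is why we think the estimate runs high.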
Data center energy use represents a sizeable chunk of the 5.7 terawatt hours its parent company, Alphabet, used in 2015. With an average PUE of 1.12 (versus the industry average of 1.7), Google says its data centers use half the energy of a typical data center. A growing portion of this is renewable, supplied through power purchase agreements.
Chilled-water cooling coils, seen at the top of the enclosure, cool the air as it ascends. The silver pipes visible on the left-hand side of the photo carry water to and from cooling towers. (Photo: Connie Zhou)
Here’s a rare look inside the hot aisle of a Google data center. The exhaust fans on the rear of the servers direct server exhaust heat into the enclosed area.
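PUE (power usage effectiveness) is total facility energy divided by the energy that reaches IT equipment, so a PUE of 1.0 would mean zero overhead. The sketch below applies that standard definition to the two PUE figures quoted above. The function names and the 1,000 kWh sample load are ours, chosen only for illustration.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def total_energy_kwh(it_equipment_kwh, pue_value):
    """Total facility energy needed to deliver a given IT load at a given PUE."""
    return it_equipment_kwh * pue_value


it_load_kwh = 1_000  # hypothetical IT load, for illustration only
google_total = total_energy_kwh(it_load_kwh, 1.12)
typical_total = total_energy_kwh(it_load_kwh, 1.7)

print(f"Google-style facility: {google_total:.0f} kWh")
print(f"Typical facility:      {typical_total:.0f} kWh")
print(f"Ratio:                 {google_total / typical_total:.2f}")
```

For the same IT load, the overhead (cooling, power distribution and so on) is 12 percent of IT energy at a PUE of 1.12 versus 70 percent at 1.7. On PUE alone, that works out to about a one-third reduction in total energy.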
What Kind of Hardware and Software Does Google Use in Its Data Centers?
It’s no secret that Google has built its own Internet infrastructure since 2004 from commodity components, resulting in nimble, software-defined data centers. The resulting hierarchical mesh design is standard across all its data centers.
The hardware is dominated by Google-designed custom servers and Jupiter, the data center switching fabric Google introduced in 2012. With its economies of scale, Google contracts directly with manufacturers to get the best deals.
Google’s servers and networking software run a hardened version of the Linux open source operating system. Individual programs have been written in-house. They include, to the best of our knowledge:
- Google Web Server (GWS) – custom Linux-based Web server that Google uses for its online services.
- Colossus – the cluster-level file system that replaced the Google File System
- BigTable – a high performance NoSQL database service for large analytical and operational workloads
- Spanner – a globally-distributed NewSQL database
- Google F1 – a distributed, relational database that replaced MySQL
- Chubby lock service – provides coarse-grained locking and reliable, low-volume storage for loosely coupled distributed systems.
- Programming languages – C++, Java and Python dominate
- Caffeine – a continuous indexing system launched in 2010 to replace TeraGoogle
- Hummingbird – major search index algorithm introduced in 2013.
- Borg – a cluster manager that runs hundreds of thousands of jobs from thousands of applications across multiple clusters on thousands of machines
Google also has developed several abstractions that it uses for storing most of its data:
- Protocol Buffers – a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more
- SSTable (Sorted Strings Table) – a persistent, ordered, immutable map from keys to values, where both keys and values are arbitrary byte strings. It is also used as one of the building blocks of BigTable
- RecordIO – a file defining IO interfaces compatible with Google’s IO specifications
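To illustrate the SSTable idea described above, here is a minimal in-memory sketch of an ordered, immutable map from byte-string keys to byte-string values. It is not Google's implementation or file format: the class and method names are hypothetical, and a real SSTable is persisted to disk in indexed blocks, which this sketch leaves out.

```python
import bisect


class SSTableSketch:
    """Ordered, immutable map from byte-string keys to byte-string values."""

    def __init__(self, items):
        # Sort once at construction time; tuples keep the table read-only.
        pairs = sorted(items.items())
        self._keys = tuple(key for key, _ in pairs)
        self._values = tuple(value for _, value in pairs)

    def get(self, key):
        """Binary-search for a key; return its value, or None if absent."""
        index = bisect.bisect_left(self._keys, key)
        if index < len(self._keys) and self._keys[index] == key:
            return self._values[index]
        return None

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        low = bisect.bisect_left(self._keys, start)
        high = bisect.bisect_left(self._keys, end)
        for index in range(low, high):
            yield self._keys[index], self._values[index]


table = SSTableSketch({b"com.google.www": b"page-3", b"com.cnn.www": b"page-1",
                       b"com.example.www": b"page-2"})
print(table.get(b"com.cnn.www"))
print(list(table.scan(b"com.d", b"com.h")))
```

Because the keys are sorted and never change after construction, both point lookups and range scans reduce to binary searches, which is part of what makes the structure a useful building block for a system like BigTable.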
How Does Google Use Machine Learning in Its Data Centers?
Machine learning is integral to dealing with big data. As Ryan Den Rooijen, global capabilities lead, Insights & Innovation, said before the Big Data Innovation Summit in London (March 2017), “Most issues I have observed relate to how to make this data useful…to drive meaningful business impact.” So, in addition to using machine learning for products like Google Translate, Google also uses its neural networks to predict the PUE of its data centers.
Google calculates PUE every 30 seconds, and continuously tracks IT load, external air temperature and the levels for mechanical and cooling equipment. This data lets Google engineers develop a predictive model that analyzes the complex interactions of many variables to uncover patterns that can be used to help improve energy management. For example, when Google took some servers offline for a few days, engineers used this model to adjust cooling to maintain energy efficiency and save money. The model is 99.6 percent accurate.
In July 2016, Google announced results from a test of an AI system built by its British acquisition DeepMind. That system reduced the energy consumption of its data center cooling units by as much as 40% and overall PUE overhead by 15%. The system predicts temperatures one hour in advance, allowing cooling to be adjusted in anticipation.
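To put the 15 percent figure in context, the sketch below assumes it applies to PUE overhead (the portion of PUE above 1.0), which is how DeepMind's announcement framed it. The baseline of 1.12 is the fleet-wide average quoted earlier in this article, not the PUE of the test site, which was not disclosed, and the function name is ours.

```python
def pue_after_overhead_reduction(baseline_pue, overhead_reduction):
    """Apply a fractional reduction to the overhead portion of PUE."""
    overhead = baseline_pue - 1.0  # energy spent on everything except IT gear
    return 1.0 + overhead * (1.0 - overhead_reduction)


baseline = 1.12  # assumption: fleet-wide average, used only for illustration
improved = pue_after_overhead_reduction(baseline, 0.15)
print(f"PUE before: {baseline:.3f}, after: {improved:.3f}")
```

Under those assumptions, PUE would move from 1.12 to about 1.10. The change looks small in absolute terms because so little overhead is left to cut.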
Does Google Lease Space in Other Companies’ Data Centers?
Yes. Google leases space from others when it makes sense. Not every Google data center has its name on the door. Instead, the company uses a variety of strategies to meet its data center needs. It leases space for caching sites, for example, and uses a mixed build-and-lease strategy for its global cloud data center rollout.