Google Data Center FAQ, Part 3


Does Google operate “green” data centers?
Google has sought a leadership role in clean energy and energy efficiency since early 2007, when it announced that it would be carbon neutral for 2007 and beyond. According to the company’s website, it has built super-efficient servers, invented more efficient ways to cool data centers and invested heavily in green energy sources, with the goal of being powered 100 percent by renewable energy. Compared to five years ago, Google’s data centers now get around 3.5 times the computing power out of the same amount of energy.

What kind of hardware and software does Google use in its data centers?
Google uses commodity web servers that it customizes with high-efficiency power supplies. It also holds a patent on a power supply that integrates a battery, allowing it to function as an uninterruptible power supply (UPS). Google also uses its own energy-efficient 10 Gigabit Ethernet switches in its data centers. The latest Google network is called Jupiter, and it can deliver 1.3 Pb/sec of aggregate bisection bandwidth across an entire data center, which is enough for 100,000 servers to be linked to the network at 10 Gb/sec each.
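The Jupiter figures can be checked with simple arithmetic. The sketch below is only an illustration of that calculation: it assumes decimal SI prefixes (1 Pb = 1,000,000 Gb), and the constant and function names are made up for this example rather than taken from any Google source.

```python
# Figures quoted in the text; decimal SI prefixes are an assumption.
GBITS_PER_PBIT = 1_000_000

BISECTION_CAPACITY_GBPS = 1_300_000  # 1.3 Pb/sec expressed in Gb/sec
SERVERS = 100_000
PER_SERVER_GBPS = 10


def aggregate_demand_gbps(servers: int, per_server_gbps: int) -> int:
    """Total bandwidth needed if every server sends at its full link rate."""
    return servers * per_server_gbps


demand_gbps = aggregate_demand_gbps(SERVERS, PER_SERVER_GBPS)
headroom_gbps = BISECTION_CAPACITY_GBPS - demand_gbps
max_servers = BISECTION_CAPACITY_GBPS // PER_SERVER_GBPS

print(f"demand:   {demand_gbps / GBITS_PER_PBIT} Pb/sec")
print(f"headroom: {headroom_gbps / GBITS_PER_PBIT} Pb/sec")
print(f"servers supportable at {PER_SERVER_GBPS} Gb/sec: {max_servers}")
```

Under these assumptions, 100,000 servers at 10 Gb/sec each need 1 Pb/sec in total, which fits within the quoted 1.3 Pb/sec.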

Google is known to use in-house software for its operations. These programs include:

  • Google Web Server (GWS) – custom Linux-based Web server that Google uses for its online services.
  • Storage systems:
    • Google File System and its successor, Colossus
    • BigTable – structured storage built upon GFS/Colossus
    • Spanner – planet-scale structured storage system, next generation of BigTable stack
    • Google F1 – a distributed, quasi-SQL DBMS based on Spanner, replacing a custom version of MySQL.
  • Chubby lock service
  • MapReduce and Sawzall programming language
  • Indexing/search systems:
    • TeraGoogle – Google’s large search index (launched in early 2006).
    • Caffeine (Percolator) – continuous indexing system (launched in 2010).
    • Hummingbird – major search index update, including complex search and voice search.
  • Borg declarative process scheduling software

Google has developed several abstractions which it uses for storing most of its data:

  • Protocol Buffers – “Google’s lingua franca for data”, a binary serialization format which is widely used within the company.
  • SSTable (Sorted Strings Table) – a persistent, ordered, immutable map from keys to values, where both keys and values are arbitrary byte strings. It is also used as one of the building blocks of BigTable.
  • RecordIO – a sequence of variable sized records.
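To make the SSTable description above concrete, here is a minimal in-memory sketch of an ordered, immutable map from byte-string keys to byte-string values. It is not Google’s implementation: the class name `SSTableSketch` and its methods are hypothetical, and persistence (the on-disk file format) is deliberately left out.

```python
import bisect
from typing import Iterable, Iterator, Optional, Tuple


class SSTableSketch:
    """Hypothetical in-memory stand-in for an SSTable: sorted and read-only."""

    def __init__(self, items: Iterable[Tuple[bytes, bytes]]) -> None:
        # Sort once at construction time; tuples keep the contents immutable.
        pairs = sorted(dict(items).items())
        self._keys = tuple(key for key, _ in pairs)
        self._values = tuple(value for _, value in pairs)

    def get(self, key: bytes) -> Optional[bytes]:
        """Point lookup by binary search over the sorted keys."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]:
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for i in range(lo, hi):
            yield self._keys[i], self._values[i]


table = SSTableSketch([(b"b", b"2"), (b"a", b"1"), (b"c", b"3")])
print(table.get(b"a"))
print(list(table.scan(b"a", b"c")))
```

Because the keys are kept sorted, both point lookups and range scans reduce to binary searches, which is one reason an ordered immutable map is a convenient building block for a larger storage system.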

Does Google lease space in other companies’ data centers?
Although Google prefers to use in-house data centers rather than lease space in multi-tenant facilities, for certain functions it makes sense to use others’ facilities. For example, Google uses Equinix real estate to help clients in 15 markets, including New York, Atlanta, Frankfurt (Germany), and Hong Kong, access Google’s business applications and cloud infrastructure.

Are there pictures of Google’s servers and data centers?
There are many pictures of the exterior of Google’s data centers, and a smaller selection of images of the equipment inside. A Google Image Search turns up many photos around the web, and many others have been posted on Flickr.

Equipment: The Computer History Museum in Silicon Valley has a display of the first Google production server rack from 1999. Several more recent photos of Google racks, taken from presentations, have appeared on the web, including a slide from Google Developer Day in 2007 and a widely circulated image from 2006.

