Google never says how many servers are running in its data centers. But a recent presentation by a Google engineer shows that the company is preparing to manage as many as 10 million servers in the future.
Google’s Jeff Dean was one of the keynote speakers at an ACM workshop on large-scale computing systems, and discussed some of the technical details of the company’s mighty infrastructure, which is spread across dozens of data centers around the world.
In his presentation (link via James Hamilton), Dean also discussed a new storage and computation system called Spanner, which will seek to automate management of Google services across multiple data centers. That includes automated allocation of resources across “entire fleets of machines.”
Dean says Spanner will be designed for a future scale of “10^6 to 10^7 machines,” meaning 1 million to 10 million machines. The goal will be “automatic, dynamic world-wide placement of data & computation to minimize latency or cost.”
Over the long term, that type of cost management strategy could address regional differences in bandwidth costs and power costs. As we’ve previously noted, the ability to seamlessly shift workloads between data centers creates energy management possibilities, including a “follow the moon” strategy that takes advantage of lower costs for power and cooling during overnight hours. In this scenario, virtualized workloads are shifted across data centers in different time zones to capture savings from off-peak utility rates.
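To make the “follow the moon” idea concrete, here is a minimal Python sketch of how a scheduler might pick the cheapest site at a given hour. It is not how Spanner works; the site names, UTC offsets, power rates and the 22:00 to 06:00 off-peak window are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCenter:
    name: str               # hypothetical site label
    utc_offset_hours: int   # assumed fixed offset; ignores daylight saving
    peak_rate: float        # illustrative daytime cost per kWh
    off_peak_rate: float    # illustrative overnight cost per kWh


def local_hour(dc: DataCenter, utc_hour: int) -> int:
    """Convert an hour in UTC (0-23) to the site's local hour."""
    return (utc_hour + dc.utc_offset_hours) % 24


def current_rate(dc: DataCenter, utc_hour: int,
                 night_start: int = 22, night_end: int = 6) -> float:
    """Return the off-peak rate if it is night at the site, else the peak rate."""
    hour = local_hour(dc, utc_hour)
    is_night = hour >= night_start or hour < night_end
    return dc.off_peak_rate if is_night else dc.peak_rate


def cheapest_site(sites: list[DataCenter], utc_hour: int) -> DataCenter:
    """Pick the site with the lowest power rate at the given UTC hour."""
    return min(sites, key=lambda dc: current_rate(dc, utc_hour))


# Hypothetical fleet: three regions roughly eight hours apart.
SITES = [
    DataCenter("europe", 1, 0.12, 0.07),
    DataCenter("us-west", -8, 0.11, 0.06),
    DataCenter("asia", 8, 0.13, 0.08),
]

if __name__ == "__main__":
    for utc_hour in (0, 8, 16):
        print(utc_hour, cheapest_site(SITES, utc_hour).name)
```

A real placement system would also weigh latency, bandwidth costs and spare capacity, which this sketch leaves out.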
Another motivation for automated capacity management is to route around failures or data center downtime. Google has acknowledged developing software with this goal in mind, and several recent Gmail service outages have reinforced the value of rapid load-shifting across data centers.
This kind of automation could also allow Google to plan more energy-efficient facilities like its chiller-less data center in Belgium. If the weather gets too warm to operate servers safely, Google says it will turn off equipment as needed in Belgium and shift computing load to other data centers.
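The weather-driven load shifting described for Belgium can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperature threshold, site names, load units and capacities are hypothetical, and Google has not published the logic it uses.

```python
def shift_load(sites: dict, max_safe_temp_c: float = 27.0) -> dict:
    """Move load off sites that are too warm onto cooler sites with spare capacity.

    `sites` maps a site name to a dict with "temp_c", "load" and "capacity".
    The 27 C threshold is an assumed value, not a published one.
    Returns a new mapping of site name to load after shifting.
    """
    loads = {name: info["load"] for name, info in sites.items()}
    hot = [name for name, info in sites.items() if info["temp_c"] > max_safe_temp_c]
    cool = [name for name in sites if name not in hot]

    for name in hot:
        to_move = loads[name]
        for target in cool:
            spare = sites[target]["capacity"] - loads[target]
            moved = min(spare, to_move)
            loads[target] += moved
            to_move -= moved
            if to_move == 0:
                break
        # Anything that could not be placed elsewhere stays where it is.
        loads[name] = to_move
    return loads


if __name__ == "__main__":
    # Hypothetical fleet; loads and capacities are in arbitrary units.
    fleet = {
        "belgium": {"temp_c": 30, "load": 50, "capacity": 100},
        "site_b": {"temp_c": 15, "load": 60, "capacity": 100},
        "site_c": {"temp_c": 18, "load": 20, "capacity": 100},
    }
    print(shift_load(fleet))
```

The sketch fills cooler sites in order and leaves any unplaceable load on the warm site, which is one simple policy among many.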