Facebook announced the launch of its newest massive data center in Altoona, Iowa, adding a third U.S. site to the list of company-owned data centers and fourth globally.
The Altoona facility is the first in Facebook’s fleet to feature a building-wide network fabric – an entirely new way to do intra-data center networking the company’s infrastructure engineers have devised.
The social network is moving away from the approach of arranging servers into multiple massive compute clusters within a building and interconnecting them with each other. Altoona has a single network fabric whose scalability is limited only by the building’s physical size and power capacity.
Inter-Cluster Connectivity Became a Bottleneck
Alexey Andreyev, network engineer at Facebook, said the new architecture addresses bandwidth limitations in connecting the massive several-hundred-rack clusters the company has been deploying thus far. A huge amount of traffic takes place within each cluster, but the ability of one cluster to communicate with another is limited by the already high-bandwidth, high-density switches. This means the size of the clusters was limited by capacity of these inter-cluster switches.
By deploying smaller clusters (or “pods,” as Facebook engineers call them) and using a flat network architecture, where every pod can talk to every other pod, the need for high-density switch chassis goes away. “We don’t have to use huge port density on these switches,” Andreyev said.
It’s easier to develop lower-density high-speed boxes than high-density and high-speed boxes, he explained.
Each pod includes four devices Facebook calls “fabric switches,” and 48 top-of-rack switches, every one of them connected to every fabric switch via 40G uplinks. Servers in a rack are connected to the TOR switch via 10G links, and every rack has 160G total bandwidth to the fabric.
Here's a graphic representation of the architecture, courtesy of Facebook:
The system is fully automated, and engineers never have to manually configure an individual device. If a device fails, it gets replaced and automatically configured by software. The same goes for capacity expansion. The system configures any device that gets added automatically.
Using Simple OEM Switches
The fabric does not use the home-baked network switches Facebook has been talking about this year. Jay Parikh, the company’s vice president of infrastructure engineering, announced the top-of-rack switch and Facebook’s own Linux-based operating system for it in June.
The new fabric relies on gear available from the regular hardware suppliers, Najam Ahmad, vice president of network engineering at Facebook, said. The architecture is designed, however, to use the most basic functionality in switches available on the market, which means the company has many more supplier options than it has had in the older facilities that rely on those high-octane chassis for inter-cluster connectivity. “Individual platforms are relatively simple and available in multiple forms or multiple sources,” Ahmad said.
New Architecture Will Apply Everywhere
All data centers Facebook is going to build from now on will use the new network architecture, Andreyev said. Existing facilities will transition to it within their natural hardware refresh cycles.
The company has built data centers in Prineville, Oregon, Forest City, North Carolina, and Luleå, Sweden. It also leases data centers space from wholesale providers in California and Northern Virginia, but has been moving out of those facilities and subleasing the space until its long-term lease agreements expire.
In April, Facebook said it had started the planning process for a second Altoona data center, before the first one was even finished, indicating a rapidly growing user base.
The company has invested in a 138 megawatt wind farm in Iowa that will generate electricity for the electrical grid to offset energy consumption of its data center there.