MENLO PARK, Calif. – Facebook is ready for Act Two of its mission to disrupt the data center. Barely a year after unveiling its custom hardware, the social network has retooled its server and storage designs.
Facebook’s next-generation custom hardware designs are driven by its overhaul of the data center rack, which widens the equipment trays to make servers easier to cool and maintain. The rack design concepts, known as Open Rack, were outlined at the Open Compute Summit last month. While the internal components of the server remain similar, Facebook has rearranged them in a new form factor with electrical and mechanical adaptations for the Open Rack design, changes that also make it easier to replace disks and network cards.
Data Center Knowledge recently got a detailed look at the new designs. Here’s a summary of what’s new, and links to videos in which Facebook’s hardware team discusses its innovations:
Open Rack: A new single-width rack replaces the three-wide “triplet” racks used in Facebook’s first data center in Prineville, Oregon. The racks have widened the equipment trays to 21 inches (from the traditional 19 inches) while retaining the standard 24-inch footprint. Facebook has also reworked the power system. Power supplies are now separate from the server motherboards and reside in a “power shelf” at the base of the rack, where they tie into the busbar at the rear of the unit. The 12V power is then distributed across three busbars that connect to the servers, a design that Facebook says improves the efficiency of its power distribution. See Facebook’s Open Rack Revealed for a video overview of the new design.
Windmill Server: Facebook has redesigned its servers, housing motherboards in 2U modules that replace the 1.5U rack-mount servers from Open Compute. Each module is about seven inches wide, allowing three to fit in each tray. The taller design allows Facebook to improve the airflow through the server by using 80 millimeter fans, a change from the 60 millimeter fans in the first generation of servers. See Facebook’s Windmill Server Design for a video overview of the updated server design.
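The arithmetic behind those dimensions can be checked with a rough sketch. It assumes the standard rack unit of 1.75 inches (44.45 mm) and ignores sheet metal, clearances and mounting hardware, so it illustrates the geometry rather than Facebook’s actual mechanical design. The function names are hypothetical.

```python
# Back-of-the-envelope check of the Windmill dimensions quoted above.
# Assumption: 1U = 1.75 inches = 44.45 mm (the standard rack unit).
# Real chassis lose some height to sheet metal and clearances, which
# this sketch deliberately ignores.

RACK_UNIT_MM = 44.45


def chassis_height_mm(rack_units: float) -> float:
    """Nominal height of a chassis occupying the given number of rack units."""
    return rack_units * RACK_UNIT_MM


def fan_fits(fan_mm: float, rack_units: float) -> bool:
    """True if a square fan of the given size fits within the nominal height."""
    return fan_mm <= chassis_height_mm(rack_units)


def modules_per_tray(tray_width_in: float, module_width_in: float) -> int:
    """Number of whole modules that fit side by side in one tray."""
    return int(tray_width_in // module_width_in)


if __name__ == "__main__":
    print(chassis_height_mm(1.5))   # nominal height of the first-generation server
    print(chassis_height_mm(2))     # nominal height of a Windmill module
    print(fan_fits(80, 1.5), fan_fits(80, 2))
    print(modules_per_tray(21, 7))  # 21-inch tray, modules of about 7 inches
```

Under these assumptions, a 1.5U chassis is nominally about 67 mm tall, enough for a 60 millimeter fan but not an 80 millimeter one, while a 2U module at about 89 mm has room for the larger fan.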
Knox Storage: Facebook has extended its custom hardware development to include storage, which wasn’t included in the first round of Open Compute releases in April 2011. Its new storage prototype, known as Knox, houses up to 15 disks in trays that slide in and out of the rack for easy maintenance. Knox features a hinge that allows each tray of disks to hang down at an angle once it slides out of the rack, making it easier for staff to replace disks in the upper area of the storage rack. See Facebook’s Storage Design Prototype for a video overview.
With its new designs, Facebook continues to rethink conventional approaches to data center design. The company’s hardware development is driven by Facebook’s desire to optimize its environment for hyper-scale, where modest efficiency gains can have a beneficial impact as they ripple across tens of thousands of servers. A theme in the second-generation designs is the unbundling of components that have traditionally been closely integrated, including CPUs and power supplies.
Separating CPU Refresh Cycles from Other Hardware
One of Facebook’s conceptual goals is to separate the technology refresh cycle for CPUs from the surrounding equipment. “Bringing the hottest new CPUs into our environment can have a big impact” on performance and efficiency, said Frank Frankovsky, Facebook’s VP of Hardware Design and Supply Chain. Why not just swap out the CPUs and leave the rest of the server and rack elements in place, rather than rolling in pre-packaged rackloads of new servers?
Frankovsky said the ability to easily swap out CPUs could transform the way chips are procured at scale, perhaps shifting to a subscription model.
“If you think about it from a high level, it could be a win for everyone,” said Frankovsky. “There’s bound to be a model that allows vendors to move their new technology into an environment (more quickly).” In addition to making it easier for customers to upgrade to the latest processors, a subscription model might also give chip vendors and OEMs more options in repurposing used processors that still have useful life.
‘The Next Frontier of Efficiency’
Making it easier to swap out CPUs can also extend the lifespan of components that often get replaced in a server refresh. Power supplies and DDR3 memory can last through several server refresh cycles, said Matt Corddry, Facebook’s Manager of Hardware Design.
“We believe the rack has a much longer life than some of the compute nodes,” said Corddry. “Why do we scrap all that equipment every three years? I think that’s the next frontier of efficiency in data center operations. The equipment is interchangeable.”
While Facebook has sought to forge a new standard, its focus on chassis-level design in Open Rack is reminiscent in many ways of blade server designs. A key challenge is avoiding the fate of blade servers, which focused on proprietary designs that didn’t interoperate with other vendors’ gear. That’s where the Open Compute Project (OCP) can play a key role, organizing the hyper-scale community around a single standard. Facebook developed the Open Rack design for its own use and has submitted the plans to the OCP for review.
“We invested a lot of time up front to decide on a standard (for Open Rack),” said Frankovsky. “It’s been moving along really quickly.”