SAN JOSE, Calif. – Unlike past years, when most of the focus was on servers, the data center network took center stage at the Open Compute Summit in Silicon Valley earlier this month.
Facebook and a number of other data center end users and vendors announced network technology contributions to the Open Compute Project, the Facebook-led open source data center and hardware design community.
If somebody wants to build a custom data center network using open source hardware and software, they now have access to just about every part of the stack, save for some software programming work that is still required. Key parts of that stack come from the network Facebook’s engineers built for the company’s own use.
“All the pieces are available,” Najam Ahmad, Facebook’s director of network engineering, said. “It took us about a year and a half to get here, but I’m now really excited that we’ve done all the base platform work, and I see that momentum building.”
The programming bit that would still be required is not trivial, however. The user would still have to build their own network protocols on top, he said.
Ahmad’s team has changed the way Facebook builds networks in its data centers, and the company has contributed some of those innovations to the open source community – the same way it has opened up its server design specs.
Enter Open Source Network Hardware
In February, Facebook announced the Six Pack, its latest network switch that will enable its new network fabric, and said it would contribute the spec to OCP. The first Facebook data center where the fabric is implemented is the company’s Altoona, Iowa, facility, launched last November.
The Six Pack is not currently running in Facebook data centers at scale. The new switches are being tested in production in several parts of the infrastructure, Ahmad said.
The Facebook network switch that is already running at scale is the top-of-rack switch called Wedge, which the company announced in June of last year. At this month’s summit in San Jose, Facebook said it would contribute the Wedge spec to OCP as well.
Not only will the spec be available, but there’s also already a vendor that will sell Wedge switches. They will be available from the Taiwanese network equipment maker Accton Technology and its channel partners.
Najam Ahmad, director of network engineering, Facebook (Photo: Facebook)
Managing Switches Like Servers
Facebook will also contribute a portion of FBOSS, the set of software applications for managing Wedge switches, to the open source project, making both the network hardware and the network management software it designed publicly available.
Even though FBOSS stands for Facebook Open Switching System, it is not an operating system. It is a set of apps that can be run on a standard Linux OS, Adam Simpkins, a Facebook software engineer, explained in a blog post.
The point is to make switches less like switches and more like servers. A Facebook switch behaves like a server that needs some FBOSS software to perform functions of a network switch.
This way, Facebook data centers no longer need a network management team that sits in its own silo, separate from the team that manages servers. The same team can now manage both, widening the pool of people who can manage the entire infrastructure, Ahmad said.
The company is contributing the FBOSS agent, which it uses to manage Wedge switches, as well as OpenBMC, which provides management capabilities for power, environmentals, and other system-level parameters.
APIs Make Open Source FBOSS Possible
Switch hardware and management software on their own are not enough for a data center network. The FBOSS agent doesn’t manage the switches directly. It communicates with switching ASICs (the hardware circuits that perform packet forwarding in switches) through SDKs, or software development kits.
Network vendors have traditionally kept ASIC specs and SDKs closed, but now several ASIC vendors are beginning to open them up, starting with Broadcom, which has released OpenNSL APIs that FBOSS can use to program the Broadcom ASICs on Wedge switches.
Release of OpenNSL is what made release of FBOSS possible. “We couldn’t open source FBOSS if [it] talked directly to the SDK, because then we would be releasing Broadcom’s proprietary information,” Ahmad said. “OpenNSL talks to the SDK, and that’s Broadcom’s problem. Not our problem.”
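The layering Ahmad describes can be pictured with a small sketch. It is an illustration only: the class and method names below (SwitchAsicApi, VendorOpenApi, Agent, add_route) are hypothetical and are not the actual OpenNSL or FBOSS interfaces. The point it tries to show is that the agent is written purely against an open API boundary, while the proprietary SDK stays hidden behind it.

```python
from abc import ABC, abstractmethod
from typing import Dict


class SwitchAsicApi(ABC):
    """Hypothetical open API boundary (stands in for a layer like OpenNSL)."""

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None:
        """Program one forwarding entry into the ASIC."""

    @abstractmethod
    def routes(self) -> Dict[str, str]:
        """Return the entries currently programmed."""


class _ClosedVendorSdk:
    """Stand-in for a proprietary SDK; its internals are never published."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def program_entry(self, prefix: str, next_hop: str) -> None:
        self._table[prefix] = next_hop

    def dump(self) -> Dict[str, str]:
        return dict(self._table)


class VendorOpenApi(SwitchAsicApi):
    """Vendor-maintained shim: open interface on top, closed SDK underneath."""

    def __init__(self) -> None:
        self._sdk = _ClosedVendorSdk()

    def add_route(self, prefix: str, next_hop: str) -> None:
        self._sdk.program_entry(prefix, next_hop)

    def routes(self) -> Dict[str, str]:
        return self._sdk.dump()


class Agent:
    """Hypothetical management agent that only sees the open API."""

    def __init__(self, asic: SwitchAsicApi) -> None:
        self._asic = asic

    def sync_routes(self, desired: Dict[str, str]) -> int:
        """Program any missing or changed routes; return how many changed."""
        current = self._asic.routes()
        changed = 0
        for prefix, next_hop in desired.items():
            if current.get(prefix) != next_hop:
                self._asic.add_route(prefix, next_hop)
                changed += 1
        return changed
```

Because the agent in this sketch depends only on SwitchAsicApi, its source could be published without exposing anything inside the SDK stand-in, which mirrors the division of responsibility Ahmad describes.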
Non-FBOSS Options Available
But FBOSS isn’t the only option for management software. A company called Big Switch contributed its Linux-based network operating system to OCP this month.
Earlier, a company called Cumulus Networks, which also has a Linux OS for open switches, contributed ONIE (Open Network Install Environment) to the project. ONIE enables installation of any network OS on switches and makes it possible to manage switches like Linux servers.
So users now have a choice between running Wedge switches with a Linux OS by Cumulus or Big Switch, or running FBOSS and OpenBMC.
But they are not limited to using this software on Facebook’s Wedge switches. So-called “incumbent” network vendors have been announcing “open” switch products since early last year.
They include Dell, Juniper, and, most recently, HP. Dell is offering a choice between Big Switch, Cumulus, and its own network OS. Juniper is planning to start shipping switches that support any open network OS sometime this year. HP’s first open switches – also manufactured by Accton – will ship with Cumulus software.
Disaggregation Unlocks Potential
Disaggregation between network hardware and network software is exactly the idea Ahmad and his team at Facebook had in mind when they started the networking group within OCP in 2013. This is the first time such disaggregation has occurred in data center networking, and OCP has had a lot to do with it.
“We wanted to disaggregate the network appliance, because the network appliance was very much a black box,” Ahmad said. “And a black-box environment doesn’t work.”
In the server world, you know what chip is being used, and what other hardware components are inside. You can install an OS of your choice, and you have the flexibility to program and customize it to your needs.
That kind of flexibility has been simply impossible with networking hardware. “You get what you get, and the only thing you can do with it is whatever protocols they have implemented; whatever [Command Line Interface] is available; and that just doesn’t scale at our size, and for most people,” Ahmad said.
Facebook’s engineers needed the flexibility, and if off-the-shelf products didn’t have it, they had the drive and the resources to design what they needed themselves. Other large-scale data center operators, such as Google, Amazon, and Microsoft, have taken the same approach.
Pent-Up Demand for Open Hardware
The difference in Facebook’s approach was OCP – the idea that if you open source some of the pieces you create for your own use, you can spur an entire ecosystem of vendors and users who aren’t satisfied with business-as-usual. The rate of the OCP ecosystem’s growth indicates there was quite a bit of pent-up demand for that level of flexibility in the market.
OCP has reached a point where it isn’t just Facebook making contributions anymore. The list of data center end users who are active in OCP now includes the likes of Apple, Microsoft, Goldman Sachs, and Fidelity Investments.
The list of active vendors now extends beyond the Asian design manufacturers that supported the project from the start. It now has names like Cisco, Juniper, HP, Dell, Emerson Network Power, and Schneider Electric.
It turns out the market wanted more openness and disaggregation, and the data center vendor establishment has reacted. There was a need for major changes in the way the hardware market worked, and OCP gave the push that was needed to get the ball rolling.