If you’re still wondering why Nvidia is prepared to spend so much on Arm, perhaps this will address some of your concerns: In a Tuesday morning announcement ahead of its virtually held VMworld 2020 conference, VMware stated that the next edition of its vSphere virtualization management platform will allow for “server disaggregation”: a new model of server architecture in which core processing is handled by one class of node, while data processing and storage are handled by a separate class altogether.
The result: CPUs and new DPUs (Data Processing Units) that are housed separately, networked together using rack-level fabric, and at some point potentially separated from GPUs that act as accelerators for both.
“Project Monterey is about re-imagining the data center architecture and the cloud architecture to support the needs of next-generation applications,” Rajiv Ramaswami, VMware’s co-COO for products and cloud services, said during a press conference. “It’s perhaps the next biggest set of changes that we are making not only to our own platforms but also to the industry ecosystem around us.”
Last year VMware instituted its first large-scale rearchitecture of vSphere, titled Project Pacific. That project is now complete, with VMware incorporating Pivotal assets, finalizing a container-based orchestration model called Tanzu — its commercial form of Kubernetes — and hard-wiring it into vSphere.
Project Monterey, whose preview edition is being released today, represents the next big architectural leap. While Pacific was said at the time to have been “integrating Kubernetes” from a software perspective, it was actually more like attaching a second layer. With Monterey, Kubernetes will be directly integrated into the system, not as a pre-attached add-on but as the sole engine orchestrating both first-generation, hypervisor-driven virtual machines and next-generation containers.
That change, asserted Kit Colbert, VP and CTO for VMware’s cloud business unit, triggers a new wave of complexity in the server. The new disaggregated server architecture alleviates this burden.
“The core x86 CPU is getting more and more requirements put on top of it,” said Colbert during a press briefing. “You need to run the applications’ business logic. You need to handle all the network I/O overheads. And now, with things like AI and machine learning, there’s an additional sort of data processing that’s needed there as well. And because of that, we’ve seen really a huge amount of innovation in the hardware accelerator space — GPUs and FPGAs — to take some of the burden off of the core CPU in order to improve performance across the board.”
The CPU is not well-positioned in the current x86 architecture to make optimum use of data at the speed it’s being ingested into systems, argued Ramaswani. At the intersection of these processors is a new class of smart network interface controllers (SmartNICs), including several produced by Mellanox, which Nvidia set about acquiring last March. These are devices with Arm CPUs, capable of running the virtual switches and security controls that are presently gumming up the CPU.
As Tom Gillis, senior VP and general manager for VMware’s Networking and Security business unit, told DCK in an interview, Project Monterey’s principal goal is to distribute and orchestrate container-based switches and micro-firewalls in a next-generation server rack running both CPUs and DPUs. Here, ESXi, the hypervisor critical for staging vSphere virtual workloads (even on cloud platforms), will run on DPUs.
“With Project Monterey, we have the ability to deliver the network services in the hypervisor, which is what we do today; or in a software agent that we put into an x86-based workload, Linux or Windows Server, without a hypervisor; or now we have a third option, a connection run in a NIC,” Gillis said. “It’s definitely not an agent, because it’s not running in the host, but it’s not a box that you have to put on the network. It’s this interesting space in-between, which is the ideal place for doing security.”
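The three delivery options Gillis enumerates can be sketched as a simple placement decision. This is an illustrative model only, not a VMware API; every class and field name below is hypothetical, and the preference order reflects his framing that the NIC is the "ideal place" when one is available.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical model of the three service-insertion points Gillis
# describes; names are illustrative, not VMware product APIs.
class InsertionPoint(Enum):
    HYPERVISOR = "hypervisor"   # network service runs in ESXi on the host
    GUEST_AGENT = "agent"       # software agent inside the workload OS
    SMARTNIC = "smartnic"       # service offloaded to the DPU/SmartNIC

@dataclass
class Workload:
    name: str
    has_hypervisor: bool   # False for bare-metal Linux/Windows workloads
    nic_is_dpu: bool       # True if the server's NIC is a SmartNIC/DPU

def place_firewall(w: Workload) -> InsertionPoint:
    """Pick where the micro-firewall runs for a given workload.

    Prefer the SmartNIC when present: the service then sits off the
    host (not an agent) but on the server (not a box on the network).
    """
    if w.nic_is_dpu:
        return InsertionPoint.SMARTNIC
    if w.has_hypervisor:
        return InsertionPoint.HYPERVISOR
    return InsertionPoint.GUEST_AGENT

# A Monterey-style server lands on the SmartNIC; a plain bare-metal
# box falls back to the in-guest agent.
print(place_firewall(Workload("vm-01", True, True)).value)           # smartnic
print(place_firewall(Workload("baremetal-db", False, False)).value)  # agent
```

The point of the sketch is that the decision is per-workload, which is why the NIC option fills the "interesting space in-between" Gillis describes.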
Pacific, he said, achieved VMware’s goal of applying security policies uniformly across VMs and containers. While it may not matter to vSphere at a high level whether a firewall runs as a VM or as a container, it does matter to the security developer, who can now produce a firewall as a much simpler container. That component may then be deployed on an entirely new class of processor (not a co-processor, not an accelerator) that has direct access to the data plane.
Specifically crediting Mellanox, Gillis said the virtue of this new architecture is that it separates the data plane from the central processor: ironically, the inverse of SDN’s most-touted feature when the idea was first proposed. In SDN’s early days, the CPU was the faster route for processing data; in the years since, data has grown both faster and bigger.
The push to move security workloads onto SmartNICs has been underway for some time; that part isn’t new. What is new here is the extension of the virtualized network platform onto SmartNICs, making the security layer homogeneous across processors of entirely different classes. Furthermore, the data path may be offloaded to FPGAs or ASICs without affecting the network controller, which continues to reach the data path through network addressing.
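That last property, a controller reaching the data path by address regardless of what implements it, can be shown in a minimal sketch. This is a toy model under my own assumptions, not any real NIC or NSX API; the class and method names are invented for illustration.

```python
# Illustrative sketch (not a real controller API): the control plane
# programs forwarding rules by network address, so swapping the fast
# path from software to FPGA or ASIC does not change the controller.
class DataPath:
    """Abstract fast path; its concrete kind is opaque to the controller."""
    def __init__(self, kind: str, address: str):
        self.kind = kind          # "software", "fpga", or "asic"
        self.address = address    # how the controller reaches it
        self.rules = {}           # flow -> action

    def install_rule(self, flow: str, action: str) -> None:
        self.rules[flow] = action

class NetworkController:
    def __init__(self):
        self.paths = {}           # address -> DataPath

    def register(self, path: DataPath) -> None:
        self.paths[path.address] = path

    def push_rule(self, address: str, flow: str, action: str) -> None:
        # Identical call whether the target is an ASIC or a software switch.
        self.paths[address].install_rule(flow, action)

ctrl = NetworkController()
ctrl.register(DataPath("asic", "10.0.0.7"))
ctrl.register(DataPath("software", "10.0.0.8"))
ctrl.push_rule("10.0.0.7", "tcp:443", "allow")
print(ctrl.paths["10.0.0.7"].rules)  # {'tcp:443': 'allow'}
```

The design choice being illustrated is that only the address, not the hardware type, appears in the controller's interface, which is what lets the data path migrate to FPGAs or ASICs untouched.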
“I would argue that the advance the SmartNIC is making is going to change both the economics of the data center and also what’s possible there,” Gillis said. “If you combine this with high-density CPU cores from the likes of AMD and Intel (64-core socket in a 2-socket device), you could run ten thousand VMs in a single 4U chassis with one of these SmartNICs for I/O. It’s a data center in a box; it is amazing the densities we’ll be able to achieve and the economic efficiencies we will achieve. I think if we’re proactive, we can deliver all of that efficiency, and we can do better security than we did in the physical world.”
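Gillis’s ten-thousand-VM figure is roughly plausible as back-of-the-envelope arithmetic. Only the 64-core, 2-socket figure comes from his quote; the thread count, node count, and consolidation ratio below are assumptions chosen to show one way the math could work out.

```python
# Back-of-the-envelope check of the density claim. Everything except
# the 64-core, 2-socket figure is an assumed, illustrative number.
cores_per_socket = 64          # from the quote (AMD/Intel high-density parts)
sockets_per_node = 2           # from the quote
threads_per_core = 2           # assumed SMT/Hyper-Threading
nodes_per_4u = 4               # assumed multi-node 4U chassis
vcpu_oversubscription = 10     # assumed consolidation ratio, 1 vCPU per VM

threads = cores_per_socket * sockets_per_node * threads_per_core * nodes_per_4u
vms = threads * vcpu_oversubscription
print(threads, vms)  # 1024 10240
```

Under those assumptions a 4U chassis exposes about a thousand hardware threads, and a 10:1 consolidation ratio lands at roughly ten thousand lightly loaded VMs, with the SmartNIC absorbing the I/O work that would otherwise consume those cores.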