NVIDIA, the Silicon Valley-based chipmaker that has emerged as the top supplier of computing muscle for artificial intelligence software, has partnered with four Taiwanese electronics manufacturing giants that will design and build its latest AI servers for data centers operated by the largest cloud providers, such as Microsoft, Google, and Amazon.
Foxconn, Inventec, Quanta, and Wistron will build hardware powered by NVIDIA’s next-generation GPUs, codenamed Volta, using the reference architecture for the chipmaker’s own HGX supercomputer design, developed together with Microsoft for AI software workloads.
NVIDIA CEO Jensen Huang announced a lineup of Volta-based products, including the Tesla V100 data center GPU, at the company’s big annual conference in Silicon Valley earlier this month, saying the chips will be available later this year.
“We’re working with everybody on the transition to Volta,” Keith Morris, NVIDIA’s senior director of product management for accelerated computing, said in an interview with Data Center Knowledge, referring to the top cloud providers. He said he expects these companies to start upgrading their platforms from the current-generation Tesla P100 GPUs this year.
A New Frontier in Cloud Wars
Providing hardware for machine learning – the fastest-growing and most widely used type of AI today – as a cloud service is the latest frontier in the war for market share among cloud giants. GPU-powered servers are the most common hardware for these workloads, but they are expensive and difficult to support in a data center.
Jensen Huang, CEO, NVIDIA, speaking at the GPU Technology Conference in San Jose in May 2017 (Photo: Yevgeniy Sverdlik)
GPUs are very power-hungry, and servers used for a subset of machine learning workloads called training can pack as many as eight of them on a single motherboard. The result is extremely power-dense data center deployments, where densities north of 30kW per rack are common. Most of the world’s data centers have been designed to support average densities of 3kW to 6kW per rack.
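A rough back-of-the-envelope calculation shows how eight-GPU training servers get to those densities. The per-component wattages and server count below are illustrative assumptions, not figures from NVIDIA or the article:

```python
# Back-of-the-envelope rack power density for eight-GPU training servers.
# All figures are illustrative assumptions, not vendor specifications.
GPU_TDP_W = 300        # rough power draw of one data center GPU
GPUS_PER_SERVER = 8    # eight-GPU training box, as described above
OTHER_LOAD_W = 800     # assumed CPUs, memory, NICs, and fans per server
SERVERS_PER_RACK = 10  # assumed

server_w = GPU_TDP_W * GPUS_PER_SERVER + OTHER_LOAD_W  # 3,200 W per server
rack_kw = server_w * SERVERS_PER_RACK / 1000

print(f"~{rack_kw:.0f} kW per rack")  # ~32 kW, versus a typical 3-6 kW rack
```

Even with conservative assumptions, a rack of such servers lands around the 30kW mark – five to ten times what a typical facility was built to cool and power.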
This is why renting GPU servers from cloud providers is an attractive proposition for companies either doing AI research or running AI software in production. They can pay as they go instead of spending large sums upfront to stand up this infrastructure on their own, and they can take advantage of the latest hardware as soon as it becomes available.
Use of Machine Learning on the Rise
A recent survey by MIT Technology Review and Google Cloud found that 60 percent of respondents had already implemented machine-learning strategies and committed to ongoing investment in machine learning. Additionally, 18 percent said they were planning to implement machine-learning strategies within the next 12 to 24 months. Just 5 percent said they had no interest in machine learning and no plans in that area for the foreseeable future.
Here’s a breakdown of the respondents active in machine learning by company size:
Source: MIT Technology Review, Google Cloud
NVIDIA’s partnership with the four manufacturers focuses on one particular form factor for GPU servers: machines that contain only GPUs, rather than hybrid servers that combine CPUs and GPUs. Cloud companies plug these servers in as extensions of the regular CPU-powered machines in their data centers, Morris said.
This is one of the reasons this particular partnership does not include traditional data center hardware vendors like Hewlett Packard Enterprise or Dell Technologies, which tend to build boxes that combine GPUs and CPUs, he explained. “We’re working with all the main OEMs (the traditional vendors) on these products too.”
NVIDIA Anticipates Huge Data Center Revenue Growth
NVIDIA’s revenue from data center products more than doubled between the first and last quarters of its fiscal 2017, growing from $143 million in Q1 to $296 million in Q4, according to the company’s earnings reports. The business segment accelerated even further the following year, generating $409 million in revenue in Q1 of fiscal 2018.
NVIDIA still makes most of its money selling GPUs for video games and doesn’t expect that to change in the near future.
IDC forecasts that companies will spend $10.36 billion on compute infrastructure for cognitive workloads (one way to describe AI software) in 2022, an average annual growth rate of nearly 19 percent over the next five years. The market-research firm also notes that investment growth in cloud computing infrastructure for these workloads will outpace investment in on-premises infrastructure.
“This presents a significant worldwide opportunity for silicon vendors, ISVs, and services vendors,” the analysts said.
Competition from Google
Earlier this month, Google announced that in addition to its cloud GPU services it will offer similar services built on TPUs (Tensor Processing Units), the custom AI processors the company designed in-house. The announcement means the cloud market share for GPUs will be smaller than it would have been had Google continued using TPUs strictly to run its own applications. The company operates one of the world’s largest clouds and is known for attracting some of the world’s brightest engineering minds.
A pod of TPU servers inside a Google data center (Photo: Google)
“Obviously … there will be some GPU workloads that will run on the TPU that could’ve run on the GPU,” Morris said. “Overall, I think Google has said that they’re going to continue to use NVIDIA GPUs, and I think the NVIDIA GPUs are a great platform for developing this technology. We expect them to continue to be a great customer of ours.”