Nvidia has partnered with Oracle Cloud, Microsoft Azure, Google Cloud, and others to make its AI supercomputers available as a cloud service.
The company’s new offering, called Nvidia DGX Cloud, will enable enterprises to use a web browser to immediately access Nvidia’s DGX AI supercomputers and the AI software they need to train models for generative AI and other AI applications.
At its GTC AI developer conference on Tuesday, Nvidia executives said Oracle has already gone live with Nvidia’s DGX Cloud service, providing enterprises with bare-metal compute and high-performance storage that can scale to superclusters of 32,000 GPUs. The service is expected to be available on Azure next quarter and will expand to Google Cloud and others in the future, the company said.
Nvidia also announced a set of three cloud services, dubbed Nvidia AI Foundations, that enable enterprises to build and run large language models and generative AI models trained with their own proprietary data and for their own specific tasks.
The NeMo cloud service is for building custom text-to-text generative language models, while the Picasso cloud service is for building image and video generation models. A third cloud service, BioNeMo, helps accelerate life sciences research and drug discovery. All three run on DGX Cloud.
Nvidia executives said DGX Cloud and its family of AI software, which includes the AI Enterprise software suite, enable enterprises to rent DGX AI supercomputers on a monthly basis and to scale their AI projects quickly and easily without the complexity and cost of buying, deploying and managing their own infrastructure.
“What we’ve done over the years with DGX is not just [create] a state-of-the-art supercomputer, but we’ve built a software stack that sits on top of it,” said Manuvir Das, Nvidia’s vice president of enterprise computing, during a media briefing. “That turns this into a turnkey training as a service. You just provide your job, point to your data set and you hit go and all of the orchestration and everything is taken care of. In DGX Cloud, now the same model is available on infrastructure that is hosted at a variety of public clouds. It's the same interface, the same model for running your training.”
Nvidia CEO Jensen Huang made more than a dozen announcements at the company’s annual four-day GTC conference, which began Tuesday. In his keynote speech, he explained how Nvidia and its partners are bringing AI to every industry.
Alexander Harrowell, Omdia’s principal analyst of advanced computing for AI, said Nvidia’s decision to launch an AI supercomputing cloud service is a smart move.
“NVIDIA has been positioning itself as enterprises' all-round partner for AI development for some time. Its investment in software tools and increasing engagement with the cloud should be seen in this light,” Harrowell said in an interview with Data Center Knowledge.
“DGX Cloud both enables enterprises to use NVIDIA's own tools for managing big GPU clusters in the context of their existing hyperscale skills and relationships and amounts to a very attractive pricing offer compared to current selling prices for A100s and H100s,” added Harrowell in reference to Nvidia’s latest generation GPUs. “Overall this contributes to the defensive moat NVIDIA is building against alternative AI semiconductor options.”
New Breakthrough in Chip Design
Nvidia on Tuesday also announced a breakthrough in computational lithography, a process in which chip designs created on computers are physically printed on a piece of silicon.
Nvidia’s new cuLitho software library for computational lithography will reportedly use less power and enable faster design. It will also enable the industry to build more powerful, more energy-efficient next-generation processors that are “2nm and beyond,” the company said.
TSMC, the world’s largest contract chip manufacturer, is incorporating Nvidia’s cuLitho software library into its processes, while electronic design automation (EDA) leader Synopsys is integrating it into its software. Equipment maker ASML is also collaborating with Nvidia on cuLitho and GPUs, the company said.
Better EDA tools are an important enabler both for the move down to 2nm and beyond and for the growing custom silicon space, Harrowell said.
“It will be interesting to see what exactly you can do with this as a real EDA AI breakthrough could have the ironic consequence of empowering custom developers at the expense of big name chipmakers,” he added.
Two New Nvidia GPUs
Nvidia on Tuesday also announced two new GPUs for specific generative AI inference workloads. They are:
- Nvidia L4. This GPU can reportedly deliver 120 times more AI-powered video performance than CPUs. It offers enhanced video decoding and transcoding capabilities, video streaming, augmented reality and generative AI video. Google Cloud is the first cloud service provider to offer L4 to its customers, Nvidia said.
- Nvidia H100 NVL. This GPU is for deploying massive language models like ChatGPT at scale. It features 94GB of memory and delivers up to 12 times faster performance on GPT-3 compared to the prior generation A100 at data center scale, the company said.
“The H100 NVL addresses the demand for scalability in the face of continuing AI model growth by enhancing last year’s H100 with double the NVLink chip-to-chip interconnects, which helps with building large clusters,” Harrowell said. “Interestingly, NVIDIA is emphasizing inference serving rather than training with this feature — although model training is more commonly associated with HPC-style clusters, it’s possible that I/O is a latency bottleneck for inference when very large models are split across multiple accelerators.”
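To put Harrowell’s point about splitting very large models across accelerators in rough numbers, the back-of-the-envelope Python sketch below estimates how many 94GB GPUs it would take just to hold a model’s weights. It is an illustration only, not a figure from Nvidia: it assumes GPT-3’s published 175 billion parameters stored at 2 bytes each (FP16) and ignores activations, caches and other runtime overhead. The function name is hypothetical.

```python
import math


def gpus_needed_for_weights(num_params, bytes_per_param, gpu_memory_gb):
    """Return the minimum number of GPUs needed to hold the weights alone."""
    # Weight footprint in decimal gigabytes (1 GB = 1e9 bytes).
    weights_gb = num_params * bytes_per_param / 1e9
    # Round up, because a fraction of a GPU still means one more device.
    return math.ceil(weights_gb / gpu_memory_gb)


# Assumptions (not from the article): 175 billion parameters, FP16 weights.
# The 94GB per-GPU memory figure is the one quoted for the H100 NVL above.
print(gpus_needed_for_weights(175e9, 2, 94))
```

Under those assumptions the weights come to about 350GB, so they cannot fit on a single GPU and every inference request has to cross the chip-to-chip interconnect.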
Cloud Partnerships and Other Hardware Announcements
Nvidia also made several announcements with cloud service providers. They include:
- Microsoft announced it will make Nvidia’s industrial metaverse available on Azure. More specifically, Azure will host the Nvidia Omniverse Cloud, a platform-as-a-service that gives customers instant access to a full-stack environment to develop, deploy and manage industrial metaverse applications.
- Microsoft also announced that it will integrate Nvidia Omniverse with Microsoft 365 applications, such as Teams, SharePoint and OneDrive.
- Oracle Cloud Infrastructure and other cloud service providers also announced that they are offering products and services running on Nvidia’s H100 Tensor Core GPU. Amazon Web Services said its forthcoming EC2 UltraClusters of P5 instances can scale up to 20,000 interconnected H100 GPUs.
- Nvidia and Quantum Machines announced they have built a GPU-accelerated quantum computing system called the Nvidia DGX Quantum, which is powered by the Nvidia Grace Hopper Superchip.