Chip Watch
The Cerebras WSE-3 chip features 4 trillion transistors and 900,000 cores (Image: Cerebras)

Cerebras Introduces ‘World’s Fastest AI Chip’ and New AI Server

The 5-nanometer WSE-3 processor powers Cerebras’ new CS-3 AI server, which is designed to train the largest AI models.

UPDATED AI hardware startup Cerebras Systems has introduced a new, third-generation AI processor that it claims is the fastest in the world. The WSE-3 chip doubles the performance of its predecessor, the previous record holder, the company said today (March 13).

“Once again, we’ve delivered the biggest and fastest AI chip on the planet with the same dinner plate-size form factor,” said Andy Hock, Cerebras’ vice president of product management.

As more enterprises race to build generative AI models and run other AI workloads, AI processors have become a hot commodity, and data center operators are increasingly upgrading their infrastructure to meet surging demand.

Cerebras, a Sunnyvale, California-based startup, entered the hardware market in 2019 when it introduced a super-sized AI chip called the Wafer Scale Engine (WSE), which measured eight inches by eight inches. It was 56 times larger than the largest GPU and featured 1.2 trillion transistors and 400,000 compute cores, making it the fastest and largest AI chip available at the time.

Then in 2021, the company launched the WSE-2, a 7-nanometer chip that doubled the performance of the original with 2.6 trillion transistors and 850,000 cores.

900,000 Cores

The company today doubled performance again with the WSE-3 chip, which features 4 trillion transistors and 900,000 cores, delivering 125 petaflops of peak performance. The new 5-nanometer processor powers Cerebras’ new CS-3 AI server, which is designed to train the largest AI models.

“The CS-3 is a big step forward for us,” Hock told Data Center Knowledge. “It’s two times more performance than our CS-2 [server]. So, it’s two times faster training for large AI models with the same power draw, and it’s available at the same price [as the CS-2] to our customers.”  

Since its launch, Cerebras has positioned itself as an alternative to Nvidia GPU-powered AI systems. The startup’s pitch: instead of using thousands of GPUs, customers can run their AI training on Cerebras hardware using significantly fewer chips.

“One [Cerebras] server can do the same work as 10 racks of GPUs,” said Karl Freund, founder and principal analyst of Cambrian AI Research.

The WSE-3 processor powers Cerebras’ new CS-3 AI server, which is designed to train the largest AI models (Image: Cerebras)

Cerebras Makes Inroads Into AI Market

Nvidia dominates the space, with its GPUs capturing about 85% of the AI chip market, while the remaining players – AMD, Intel, Google, AWS, Microsoft, Cerebras, and others – have captured about 15%, Freund said.

CPUs used for AI training include AMD EPYC and Intel Xeon server processors. And although it might not be apparent from all the hype about GPUs, most AI training takes place on CPUs today, said Matt Kimball, vice president and principal analyst at Moor Insights & Strategy.

However, GPUs such as Nvidia’s H100 and AMD’s MI300 deliver much better training performance, as does AI-specific silicon like Intel’s Gaudi accelerator, he said. Startups in the AI chip space include Cerebras and Tenstorrent.

While these competitors have not yet proven they can take a big chunk of market share from Nvidia, Cerebras has found success since launching its first product five years ago, said Freund, who called Cerebras the most successful AI startup today.

“From the beginning, Cerebras took a very different approach,” he said. “Everybody else is trying to outdo Nvidia, which is really hard to do. Cerebras said, ‘We’re going to build an entire wafer-scale AI engine,’ which no one has ever done. The benefit is incredibly high performance.”

Cloud Access

Cerebras doesn’t make money selling its processors on their own. It makes money selling the servers built around those chips, which, according to a company spokesperson, cost millions of dollars each. Cerebras makes its CS-3 systems available to customers over the cloud, but it also sells them directly to large enterprises, government agencies, and international cloud providers.

For example, Cerebras recently added healthcare provider Mayo Clinic to its growing roster of customers, which includes Argonne National Laboratory and pharmaceutical giant GlaxoSmithKline.  

In July 2023, Cerebras also announced it had inked a $100 million deal to build the first of nine interconnected, cloud-based AI supercomputers for G42, a technology holding group based in the United Arab Emirates.

Since then, the two companies have built two supercomputers totaling eight exaflops of AI compute. Accessible over the cloud, the supercomputers are optimized for training large language models and generative AI models, and are being used by organizations across industries for climate, health, and energy research, among other projects.

Cerebras and G42 are currently building a third supercomputer, the Condor Galaxy 3 in Dallas, which will be powered by 64 CS-3 systems and will produce eight exaflops of AI compute. By the end of 2024, the companies plan to complete the nine supercomputers, which will total 55.6 exaflops of compute.
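The arithmetic behind that eight-exaflop figure follows from the per-system spec cited earlier in this article. Here is a minimal back-of-the-envelope sketch, assuming each CS-3 delivers the quoted 125 petaflops (a reader’s sanity check, not vendor-published math):

```python
# Back-of-the-envelope check of the Condor Galaxy 3 figure above.
# Assumes the per-system spec quoted earlier: one CS-3 system = 125 petaflops.
CS3_PETAFLOPS = 125   # peak AI compute per CS-3 system
CG3_SYSTEMS = 64      # CS-3 systems planned for Condor Galaxy 3

total_exaflops = CS3_PETAFLOPS * CG3_SYSTEMS / 1_000  # 1 exaflop = 1,000 petaflops
print(f"Condor Galaxy 3: {total_exaflops:.0f} exaflops of AI compute")  # -> 8
```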

“The fact that Cerebras has now produced a third-generation Wafer Scale Engine is a testament to its customer traction. They generated the kind of revenue they needed to pay for all that engineering,” Freund said.

Cerebras’ Market Challenges

While Cerebras has found success as an early player in the hot AI chip sector, the company addresses the very high end of the market, so its hardware is not a high-volume product, Kimball said.

“It’s not like an HPE or Dell product where everybody – from a small business to a large enterprise – buys it,” he said, referring to Cerebras’ hardware. “Smaller companies are slower to adopt new technology, and I suspect this may be cost-prohibitive.”

Cerebras faces a challenge that every chip startup has: how to further break into a market dominated by giants such as Nvidia, AMD, and Intel, Kimball said. The fact that large cloud providers AWS, Google, and Microsoft Azure are designing their own custom AI chips is also a challenge for Cerebras and the other chipmakers because it reduces their market, he said.

“And of course, the smaller players are impacted more than the larger vendors,” Kimball said.  

In Numbers: WSE-3 Chip and CS-3 AI System

Cerebras’ WSE-3 features 52 times more cores than Nvidia’s H100 Tensor Core GPU. Compared with an Nvidia DGX H100 system, the Cerebras CS-3 system – powered by the WSE-3 chip – trains models eight times faster, features 1,900 times more memory, and can train AI models of up to 24 trillion parameters, 600 times larger than a DGX H100 can handle, Cerebras executives said.

A 70-billion-parameter Llama model that takes 30 days to train on GPUs can be trained in one day using a CS-3 cluster, Hock said.
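Those vendor ratios are easy to sanity-check. Below is a minimal sketch assuming 16,896 CUDA cores for an H100 SXM GPU – an assumption on the reader’s part, since Cerebras did not say which H100 core count its 52x comparison uses:

```python
# Rough arithmetic behind the comparisons above. The H100 core count is an
# assumption (16,896 CUDA cores on the H100 SXM variant); Cerebras did not
# specify which figure underlies its 52x claim.
WSE3_CORES = 900_000        # from the WSE-3 spec cited in this article
H100_CUDA_CORES = 16_896    # assumed H100 SXM CUDA core count

print(f"Core ratio: {WSE3_CORES / H100_CUDA_CORES:.1f}x")  # ~53.3x, in line with the claimed 52x

# The Llama example quoted above: 30 days on a GPU cluster vs. one day on a CS-3 cluster
gpu_days, cs3_days = 30, 1
print(f"Wall-clock speedup: {gpu_days // cs3_days}x")  # 30x end to end
```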

Cerebras Partners with Qualcomm on AI Inferencing

Because Cerebras’ hardware focuses on AI training, the company previously didn’t have an answer for customers’ AI inferencing needs. Now it does, thanks to a new partnership with Qualcomm.

The two companies said today that they have collaborated so that models trained on Cerebras’ hardware are optimized to run inferencing on Qualcomm’s Cloud AI 100 Ultra accelerator.

“They optimized the output of the big CS-3 machines to run really well on these very low-cost, low-power Qualcomm AI inferencing engines,” Freund said.


This article was updated on March 15.
