Supercomputer maker Cray on Tuesday announced the last supercomputer architecture it will introduce before the era of exascale computing. Additionally, the US Department of Energy announced that its lab in California will be the first to deploy a “pre-exascale” system based on the new architecture, codenamed “Shasta.”
Cray’s new architecture is designed to enable exascale computing when processing technology makes it feasible, which is generally anticipated around 2021 or 2022. It’s also designed to better position the company to capture what it expects will be an explosion of demand for supercomputing technology among traditional enterprises, as they look for ways to scale the machine-learning models they’re just starting to experiment with today.
The Shasta architecture supports multiple processor types. Users will be able to deploy a mix of x86, Arm, GPU, or FPGA processors in a single system. It also features a new Cray-designed interconnect technology, called Slingshot, which according to the company is both faster and more flexible than other protocols for interconnecting supercomputer processors, such as Intel’s Omni-Path technology or Mellanox’s InfiniBand. (Users can still choose Omni-Path or InfiniBand; Shasta supports those as well.)
The focus on improving interconnect technology and supporting a wide variety of processor architectures reflects a broader trend in design of computing systems. As the market-dominating x86 chips fall increasingly behind the progress curve predicted by Moore’s Law, the frontier for greatest performance gains has shifted to the network fabrics that link system components and customization of computing architectures for specific workloads. If you have a strategically important workload that must be as fast as possible, a chip that’s best for that particular workload – but not necessarily great for others – will get you more bang for the buck than a general-purpose one.
The Government’s New Big Iron
Perlmutter, the Shasta system slated for deployment at the DOE’s National Energy Research Scientific Computing Center (part of Lawrence Berkeley National Laboratory) in 2020, will be one of the largest in Cray’s history. At $146 million, it’s also the third-largest deal Cray has ever closed, the company’s president and CEO Peter Ungaro said in a briefing with reporters.
The system, named after the Nobel Prize-winning astrophysicist Saul Perlmutter, who is a physics professor at University of California, Berkeley, will more than triple NERSC’s current computational power. The center currently operates four supercomputers, including Cori (also by Cray), which about a year ago was ranked the eighth-fastest supercomputer in the world.
Taking advantage of Shasta’s flexible architecture, the Perlmutter system will include both CPU-only cabinets and cabinets that combine CPUs and GPU accelerators. It will be the center’s first system “specifically designed to meet the needs of large-scale simulations as well as data analysis from experimental and observational facilities,” according to a NERSC statement.
The Exascale Race
Building a system capable of exascale computing (crunching data at the speed of 1 exaFLOPS, or a billion billion calculations per second) has been an important strategic goal for governments around the world, and countries have been racing to reach it. Summit, the DOE system currently considered to be the world’s fastest supercomputer, is capable of computing at 200 petaFLOPS, or one-fifth of 1 exaFLOPS.
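The relationship between those figures is simple unit arithmetic, sketched below as a back-of-the-envelope check (the 200-petaFLOPS figure for Summit is the one cited above, not an independent benchmark result):

```python
# 1 exaFLOPS = 10**18 floating-point operations per second
# ("a billion billion" calculations per second).
EXAFLOPS = 10**18
PETAFLOPS = 10**15

summit = 200 * PETAFLOPS        # Summit's cited speed: 200 petaFLOPS
fraction = summit / EXAFLOPS    # what share of an exaFLOPS that is
print(fraction)                 # prints 0.2, i.e. one-fifth of 1 exaFLOPS
```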
Former US president Barack Obama in 2015 launched the National Strategic Computing Initiative, whose number-one objective was to accelerate delivery of an exascale system. The Chinese government included the goal of reaching exascale computing capacity in the Five-Year Plan ending in 2020.
President Donald Trump’s administration appears to be taking supercomputing leadership seriously as well. “Continued leadership in high-performance computing is vital to America’s competitiveness, prosperity, and national security,” US secretary of energy Rick Perry said in a statement issued Tuesday. The Perlmutter system “will be an important milestone on the road to the coming era of exascale computing.”
Expanding Beyond HPC
But Cray’s design goals for Shasta extended beyond helping the government deliver on its exascale computing strategy. The company sees the rise of machine learning and the explosion of data companies can now use as an opportunity to grow its reach beyond the relatively small HPC market.
Only a handful of hyperscale companies – the likes of Facebook and Alphabet – run applications that take advantage of machine learning at scale today. Many of the more traditional enterprises, however, are experimenting with machine learning, Ungaro, the Cray chief executive, said. “Those companies see that if these experiments or proofs of concept are successful, they’re going to have to scale up these AI models in their environment.”
Once they scale them up, their problems expand beyond simply using machine-learning frameworks like TensorFlow and running them on GPUs. Their biggest problems at that point shift to moving data across all the processors and keeping them synchronized and busy, to ensure efficient utilization of the expensive computing resources.
“This starts to have a very different look,” Ungaro said. “And it looks kind of like a supercomputer. Most people believe that AI models at scale will be run on supercomputer technologies.”
Even leaving machine learning aside, the supercomputer architecture lends itself well to applications that deal with massive amounts of data and need to run across large numbers of processors. The kind of synchronization across processors necessary in such cases is difficult to achieve with the scale-out architectures you see in traditional commodity server clusters running in corporate data centers today. For supercomputers it’s been bread-and-butter for years.
The oil-and-gas industry has already switched to supercomputers from traditional Intel boards interconnected via Ethernet, as companies realized they needed a more efficient way to use more data and build larger models, Ungaro said. Other verticals he sees as big opportunities include automotive and aerospace manufacturing, financial services, pharmaceuticals, and life sciences.
A Supercomputer for Enterprise Data Centers
As it targets this broader customer set, Cray is thinking of ways to make the transition to HPC easier for traditional enterprise IT teams.
“Now we have to think about how do we build supercomputers that we can put into standard commercial data centers,” Ungaro said. “Not in national laboratories like at NERSC – who have been doing this for years and years and years and are experts at it – but in Fortune 500 companies and in various automotive and engineering companies on the planet.”
That’s why Shasta can be ordered in standard 19-inch cabinets instead of Cray’s custom supercomputer cabinets and cooled with air (using liquid-based rear-door heat exchangers) instead of direct-to-chip liquid cooling.
It also supports Ethernet in addition to the supercomputer interconnect technologies, so it can be connected to enterprise storage systems in companies’ existing data centers or cloud storage. That’s a first for Cray and one of the “crown jewels” of the Shasta architecture, Jeff Brooks, the company’s director of supercomputing products, said.
Taming the Tail
Another crown jewel is the fabric’s congestion-control mechanism. Congestion is one of the biggest issues on supercomputer networks, which are often used by many researchers simultaneously.
People often talk about bandwidth and network latency between point A and point B, Brooks said, “but that’s always a zero-load kind of statistic.” He likened it to assessing commute times in the middle of the night, on empty streets. “It doesn’t really mean a whole lot,” he said. “What we tried to do with this is make your commute during rush hour the same time as when there’s nobody on the road.”
The congestion mechanism achieves this mostly by optimizing tail latency, he explained. The bulk of packets on a network may travel with super low latency, but there is often some data that travels much slower, with unpredictably high latency, dragging down the system’s overall performance.
It’s like a marathon, Ungaro said, where a few runners cross the finish line first, followed by a big group that includes most runners in the race, followed sometime later by a few of the slowest runners. That last group’s timing is the tail latency.
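Tail latency is conventionally quantified as a high percentile of the latency distribution, such as the 99th percentile (p99). The sketch below illustrates the marathon analogy with made-up numbers (they are not Slingshot measurements): the median barely registers the stragglers, while the p99 is dominated by them.

```python
import random

random.seed(0)
# Simulated packet latencies in microseconds: most packets are fast,
# but 2% straggle badly -- the "slowest runners" in the marathon.
latencies = [random.uniform(1.0, 2.0) for _ in range(980)]
latencies += [random.uniform(20.0, 50.0) for _ in range(20)]

def percentile(values, p):
    """Nearest-rank percentile: the value below which roughly p% of samples fall."""
    ordered = sorted(values)
    k = max(0, round(p / 100 * len(ordered)) - 1)
    return ordered[k]

p50 = percentile(latencies, 50)   # median: dominated by the fast bulk
p99 = percentile(latencies, 99)   # tail: dominated by the stragglers
print(f"p50 = {p50:.1f} us, p99 = {p99:.1f} us")
```

Even though 98 percent of the simulated packets finish in under 2 microseconds, the p99 lands an order of magnitude higher, which is why quoting only median or zero-load latency says little about real congested behavior.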
Watching the Arm Ecosystem
Supporting x86, Arm, GPUs, and FPGAs in a single system is another first for Cray. “This flexibility of adding the right processor or the right processor mix is a new thing for us,” Brooks said.
It’s important not only because users increasingly want to use GPU accelerators for machine learning, but also because x86 processors aren’t getting faster as quickly as they used to, pushing companies to look for the optimal processor mix for each specific workload. That includes new, custom, workload-specific processor architectures.
“It’ll be more and more difficult to build a big homogeneous system all with generic processors in it and have that be the most value for money that you can deliver to a customer,” Brooks said.
Cray is paying special attention to developments in the Arm ecosystem. Arm’s model enables companies to license blocks of intellectual property and build custom ASICs for various markets. It substantially lowers the barrier to entry into the processor market, and it’s quite possible that an ASIC that does HPC workloads better than general-purpose chips will come out of the Arm space.
“We’re not a huge market, but $150 million on a processor that’s super great for [the HPC] market? That’s something someone might do,” Brooks said. “I think Arm technology makes that a lot easier.”