Sandia National Laboratories
Part of Sandia National Laboratories' main campus on Kirtland Air Force Base in Albuquerque, N.M.

Cooling the World's Largest ARM Supercomputer

Astra, the ARM-based supercomputer Hewlett Packard Enterprise is building for the US Department of Energy, will use an unusual cooling design for a supercomputer.

• The data center will be designed to use indirect liquid cooling, meaning the servers will be air-cooled, but the air will be cooled by a liquid-based cooling loop.
• Despite this, the data center is expected to be highly energy efficient, designed to get the LEED Gold rating.
• Astra is being developed as part of the DOE's Vanguard program, whose goal is to advance new computing architectures in support of a mandate to secure the nation's nuclear stockpile.

When Astra, the ARM-powered supercomputer the US Department of Energy is building with Hewlett Packard Enterprise, gets booted for the first time later this year, it will be housed in a building that already has some supercomputing history.

The facility at Sandia National Laboratories in Albuquerque was originally constructed for Red Storm, once the world's second-fastest supercomputer, sources at HPE and Sandia told Data Center Knowledge. Built in 2002 and decommissioned in 2012, it was the first computer to pass the terascale speed mark. It's now being expanded to accommodate Astra and other, future systems.

The building's infrastructure was designed for liquid-cooled supercomputers. But Astra will not be liquid-cooled in the traditional supercomputer sense. It won't be bringing liquid coolant directly to the servers; there will be no cold plates, liquid disconnects, or any tubing inside the HPE Apollo 70 servers comprising its computing muscle.

"We decided to do the standard Apollo 70 air-cooled solution, so that we're not entering the Apollo 70 with cooling but rather the liquid cooling is outside of the enclosure," Mike Vildibill, VP of HPE's advanced technologies group, explained. "We do have solutions where we can bring direct liquid cooling into some of our systems for more invasive cooling approaches, but for the Vanguard system it's technically an air-cooled system that is housed in a liquid-cooled envelope."

Astra is part of the DOE's Vanguard program, whose current focus is to push forward the development of supercomputers powered by ARM chips. The program's general direction is to support the National Nuclear Security Administration's mandate to secure the country's nuclear stockpile.

Despite the indirect liquid cooling approach, the Sandia facility operators expect very low cooling costs for the 1.2MW Astra system.

"In the hottest hour of the summer, at peak compute performance, the HPE fan coils use 75 degrees Fahrenheit warm water and 89kW to keep the 1.2MW load cool," a Sandia spokesperson wrote in an email to DCK. "Sandia National Laboratories’ Thermosyphon and open cooling tower uses an additional 80kW to make that 75-degree Fahrenheit warm water on a 102-degrees Fahrenheit dry bulb August afternoon. During the extremely rare high-humidity summer hours, a mechanical chiller can run at 50kW to trim the supply water down to 75 degrees Fahrenheit."
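Taken together, those figures imply a very small cooling overhead relative to the compute load. As a rough sanity check (a sketch using only the numbers quoted above; the "cooling overhead" framing is ours, not Sandia's):

```python
# Back-of-the-envelope check of the cooling figures quoted above.
# All power numbers come from the article; the overhead metric is our framing.

IT_LOAD_KW = 1200   # Astra's 1.2 MW compute load
FAN_COIL_KW = 89    # HPE fan coils, hottest summer hour at peak performance
TOWER_KW = 80       # Thermosyphon plus open cooling tower
CHILLER_KW = 50     # mechanical chiller trim, rare high-humidity hours

def cooling_overhead(*cooling_loads_kw):
    """Total cooling power as a fraction of the IT load."""
    return sum(cooling_loads_kw) / IT_LOAD_KW

# Typical worst-case summer hour: fan coils + thermosyphon/tower, ~14% overhead
typical = cooling_overhead(FAN_COIL_KW, TOWER_KW)

# Rare high-humidity hour with the chiller running as well, ~18% overhead
humid = cooling_overhead(FAN_COIL_KW, TOWER_KW, CHILLER_KW)

print(f"hot-hour cooling overhead:      {typical:.1%}")
print(f"with mechanical chiller trim:   {humid:.1%}")
```

Even in the article's worst-case scenarios, the cooling plant draws well under a fifth of the power the compute load does, which is consistent with the lab's expectation of very low cooling costs.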

The lab plans to install several supercomputers in the facility, requiring high power density and about 70 percent more warm water. The data center staff is planning to install supporting infrastructure to test different ideas for efficiency improvements, such as using DC power instead of AC or different liquid-cooling solutions.

The Astra data center will measure 14,000 square feet on a 3-foot raised floor to match the existing data center design and will be built so that the wall between data halls can be removed and rebuilt as needed to accommodate growth. The facility will be expanded in the future for other HPC systems, as well as for future phases of the Vanguard program, of which Astra is a part. The completion date for the current expansion is July 18, 2018.

The facility is expected to earn a LEED Gold rating, the certification's second-highest tier, with features like motion-activated LED lighting and passive daylighting during the day.

The Astra processors will be air-cooled. Liquid will be brought into the data center at a temperature of 72-85 degrees Fahrenheit by an automated pumping system regulated based on cooling demand. The piping will run under the floor around the entire perimeter of the data center, with a four-inch valve placed every 15 feet for easy tie-in to new systems. The heated return water will also pass through valves so that, once a constant load is established, it can eventually provide heating to adjacent buildings.

The return water will pass through a Thermosyphon, a new cooling technology designed to save both energy and the water usually lost through evaporation. If the Thermosyphon can return the water to a temperature suitable for cooling, it will be returned immediately to the process loop. If not, it will flow to a tower system for additional cooling. In extreme conditions the water may flow through a mechanical system, but in the future all mechanical systems will be removed from this process.

The amount of water the Thermosyphon saves from evaporation is substantial. The technology has been tested at a National Renewable Energy Laboratory HPC facility in a collaborative effort that included Johnson Controls, NREL, and Sandia, with reported water savings of 1 million gallons per year. Eventually, four Thermosyphons will be put in service at the Astra facility for expected annual water savings of 2.5 million gallons.


