Oak Ridge: The Frontier of Supercomputing

Some of the cabinets for the Jaguar supercomputer at Oak Ridge National Laboratory, currently the sixth-fastest machine in the world. An upgrade is underway that will transform Jaguar into the 20-petaflop Titan. (Photo: Rich Miller)

OAK RIDGE, Tenn. – At first glance, the data hall within Oak Ridge National Laboratory resembles many raised floor environments. But a stroll past the dozens of storage cabinets reveals three of the world’s most powerful supercomputers, including a machine that looms as the once and future king of the supercomputing realm.

The Oak Ridge Leadership Computing Facility (OLCF) is on the frontier of supercomputing, forging a path toward “exascale” computing. The data center features an unusual concentration of computing horsepower, focusing 18 megawatts of electric capacity on a 20,000-square-foot raised-floor area. “The power demands are about what you would see for a small town,” says Rick Griffin, Senior Electrical Engineer at Oak Ridge National Laboratory (ORNL).

That power sustains three Cray systems that rank among the top supercomputers in the latest Top 500 list: NOAA’s Gaea (33rd), the University of Tennessee’s Kraken system (21st) and ORNL’s Jaguar, which is currently ranked sixth at 2.37 petaflops but topped the list when it made its debut in November 2009. (See our photo feature, Inside the Oak Ridge Supercomputing Facility, for more.)

Jaguar is currently undergoing a metamorphosis into Titan, an upgraded Cray XE6 system. When it goes live late this year, Titan will be capable of a peak performance of up to 20 petaflops – or 20 million billion calculations per second. Titan will be accelerated by a hybrid computing architecture teaming traditional central processing units (CPUs) from AMD with the latest high-speed graphics processing units (GPUs) from NVIDIA to create a faster and more efficient machine.

The Road to Exascale

At 20 petaflops, Titan would be significantly more powerful than the current Top 500 champ, the Sequoia supercomputer at Lawrence Livermore National Laboratory, which clocks in at 16.3 petaflops. The data center team at Oak Ridge expects that Titan will debut as the fastest machine within the Department of Energy, which operates the most powerful research supercomputers in the U.S.

But Titan is just a first step toward the goal of creating an exascale supercomputer, one able to deliver 1 million trillion calculations each second, by 2018.
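To put these figures on one scale, the short Python sketch below converts the performance numbers quoted in this article into a rough "distance to exascale." The names are illustrative only and do not come from ORNL or Cray. Note that the comparison is approximate: the Jaguar and Sequoia figures are Top 500 rankings, while Titan's 20 petaflops is a projected peak.

```python
# Rough scale comparison using only figures quoted in this article.
# One petaflop = 10**15 calculations per second; one exaflop = 10**18.
PETA = 10**15
EXA = 10**18

# Illustrative labels; values are in petaflops as reported in the article.
systems_pflops = {
    "Jaguar (current)": 2.37,
    "Sequoia": 16.3,
    "Titan (projected peak)": 20.0,
}


def exascale_gap(pflops: float) -> float:
    """Return how many times faster a 1-exaflop machine would be."""
    return EXA / (pflops * PETA)


for name, pflops in systems_pflops.items():
    gap = exascale_gap(pflops)
    print(f"{name}: {pflops} petaflops, about {gap:.0f}x short of exascale")
```

By this measure, even a 20-petaflop Titan would deliver one-fiftieth of an exaflop.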

Jaguar is being upgraded in several phases. The dual six-core AMD Opteron chips have been upgraded to a single 16-core Opteron CPU, while Jaguar’s SeaStar interconnect has been updated with Cray’s ground-breaking new Gemini interconnect. In the current phase, NVIDIA Tesla 20-series GPUs are being added to the system; these will later be upgraded to NVIDIA’s brand-new Kepler architecture. Upon completion, Titan will feature 18,688 compute nodes loaded with 299,008 CPU cores, with at least 960 of those nodes also housing GPUs to add more parallel computing power.
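The 299,008 figure is consistent with one 16-core Opteron in each of the 18,688 nodes. The quick check below assumes exactly that layout, one 16-core processor per node; the variable names are illustrative.

```python
# Sanity check on the node and core counts quoted above.
# Assumption: every compute node carries one 16-core Opteron.
NODES = 18_688
CORES_PER_NODE = 16
MIN_GPU_NODES = 960  # "at least 960" nodes also house GPUs

total_cores = NODES * CORES_PER_NODE
gpu_node_share = MIN_GPU_NODES / NODES

print(f"Total CPU cores: {total_cores:,}")
print(f"Nodes with GPUs: at least {gpu_node_share:.1%} of the system")
```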

Cooling 54 kilowatts per Cabinet

Each of Titan’s 200 cabinets will require up to 54 kilowatts of power, an intense high-density load. The system is cooled by an advanced design developed by Cray that uses both water and refrigerant. The ECOPhlex (short for PHase-change Liquid Exchange) cooling system uses two cooling loops, one filled with a refrigerant (R-134a) and the other with chilled water. Cool air flows vertically through the cabinet from bottom to top. As the air reaches the top of the cabinet, the server waste heat it carries boils the R-134a, which absorbs the heat through a change of phase from liquid to gas. The refrigerant is then returned to the heat exchanger inside a Liebert XDP pumping unit, where it interacts with a chilled water loop and is converted from gas back to liquid.

ORNL estimates that the efficiency of ECOPhlex allowed it to save at least $1 million in annual cooling costs on Jaguar. The advanced nature of the ECOPhlex design will allow the existing cooling system for Jaguar to handle the upgrade to Titan, accommodating a 10-fold increase in computing power within the same 200-cabinet footprint.

Upon completion, Titan will require between 10 and 11 megawatts of power. Oak Ridge has 140 additional cabinets for the other systems within its facility, and currently has 14 megawatts of total power for its IT. Another 4.2 megawatts of power is dedicated to Oak Ridge’s chiller plant.
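The power figures in this article can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope check: it assumes all 200 Titan cabinets draw the full 54 kilowatts at once, and that the 14 megawatts of IT power serves the 20,000-square-foot raised floor mentioned earlier. Variable names are illustrative.

```python
# Back-of-the-envelope power budget from figures quoted in the article.
TITAN_CABINETS = 200
KW_PER_CABINET_MAX = 54      # "up to 54 kilowatts" per cabinet
IT_POWER_MW = 14.0           # total power for IT
CHILLER_POWER_MW = 4.2       # dedicated to the chiller plant
FLOOR_AREA_SQFT = 20_000     # raised-floor area (assumed to host the IT load)

# Worst case: every Titan cabinet at its maximum draw.
titan_peak_mw = TITAN_CABINETS * KW_PER_CABINET_MAX / 1000

# IT plus chiller plant, to compare with the 18 megawatts cited earlier.
facility_mw = IT_POWER_MW + CHILLER_POWER_MW

# IT power density across the raised floor, in watts per square foot.
it_watts_per_sqft = IT_POWER_MW * 1_000_000 / FLOOR_AREA_SQFT

print(f"Titan at full cabinet load: {titan_peak_mw:.1f} MW")
print(f"IT plus chiller plant: {facility_mw:.1f} MW")
print(f"IT power density: {it_watts_per_sqft:.0f} W per square foot")
```

The cabinet total lands inside the 10-to-11-megawatt range quoted for Titan, and the IT and chiller figures together roughly match the 18 megawatts of capacity cited for the facility.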

About the Author

Rich Miller is the founder and editor-in-chief of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.

11 Comments

  1. "exascale supercomputer, able to deliver 1 million trillion calculations each second." Now that's something.

  2. Bill Hopper

    In the "Cooling 54 kilowatts per Cabinet" paragraph, "it interacts with a chilled water loop and is converted from liquid back to gas." should be "...gas back into liquid."

  3. twodogs

    Typo: Last line should end with "gas back to liquid." "Cool air flows vertically through the cabinet from bottom to top. As it reaches the top of the cabinet, the server waste heat boils the R-134a, absorbing the heat through a change of phase from a liquid to a gas. It is then returned to the heat exchanger inside a Liebert XDP pumping unit, where it interacts with a chilled water loop and is converted from liquid back to gas."

  4. Thanks for noticing. Yes, we meant "gas back to liquid" and have corrected this.

  5. Chris

    Typo: Jeff Nichols (not Nicheols)

  6. Thanks, Chris. I've corrected this.

  7. Jeffrey Plum

    Has anyone done research on recovering energy from data center and general business cooling? Mining waste heat might power associated businesses, or even living space. Heat mining, like data mining, could be a new area of extracting value from overlooked assets of an operation.