How Advances in Thermal Management Can Revolutionize Data Center Cooling

COOLERCHIPS is one of the initiatives pioneering innovative cooling technologies that have the potential to improve data center cooling, reduce energy consumption, and extend the lifespan of electronic components.

Antonella Gina Fleitas

October 19, 2023

With an estimated 1.7 megabytes of data produced every second for every person on the planet, it's fair to say that information has never been easier to come by. However, as gathering data gets easier, new problems surface. Global data center electricity consumption in 2022 was estimated at 240-340 terawatt-hours (TWh), or approximately 1% to 1.3% of global final electricity demand, so the urgency of addressing energy efficiency in data management has never been clearer. This is where thermal design power (TDP) and cooling techniques come in.

In this article, we will examine thermal design power: what it is, why it matters in contemporary electronics, and which advancements are shaping the field. From pioneering cooling solutions to energy-efficient architectural designs, we will highlight the innovations that are extending what thermal design can achieve, including those coming out of the COOLERCHIPS initiative, a concerted effort to push the boundaries of thermal design and energy efficiency in data centers.


Thermal Design Power and the Data Industry

Thermal design power is a critical specification used in the design and marketing of computer processors, such as CPUs and GPUs, and it essentially represents the maximum amount of heat that a component can generate under normal operating conditions. This value is typically measured in watts and is valuable to both manufacturers and users as it helps determine the appropriate cooling solution required to keep the component within its temperature limits.
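
To make the link between TDP and cooling selection concrete, here is a minimal sketch based on the standard steady-state relation T_component = T_ambient + P × R, where R is the thermal resistance of the cooling path in °C/W. The function name, the 90 °C temperature limit, and the 35 °C inlet temperature are illustrative assumptions, not figures from any vendor's specification.

```python
def max_thermal_resistance(tdp_watts: float, max_temp_c: float, ambient_c: float) -> float:
    """Largest junction-to-ambient thermal resistance (°C/W) that keeps the
    component at or below max_temp_c while it dissipates tdp_watts."""
    if tdp_watts <= 0:
        raise ValueError("tdp_watts must be positive")
    headroom_c = max_temp_c - ambient_c
    if headroom_c <= 0:
        raise ValueError("ambient temperature must be below the temperature limit")
    return headroom_c / tdp_watts


# Hypothetical parts sharing an assumed 90 °C limit and 35 °C inlet air.
for tdp in (100.0, 200.0, 400.0):
    r = max_thermal_resistance(tdp, max_temp_c=90.0, ambient_c=35.0)
    print(f"TDP {tdp:.0f} W -> cooling path must achieve <= {r:.4f} °C/W")
```

Doubling the TDP halves the thermal resistance the cooler is allowed to have, which is why higher-TDP parts push designs from simple heat sinks toward liquid cooling.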

The history of thermal design in the data industry is a journey that stretches back to the very origins of electronic computing. As electronic computers emerged in the mid-20th century, they brought with them a formidable challenge: the relentless generation of heat. In those early days, vacuum tubes were the primary building blocks of these computing machines. They held incredible promise but also came with a significant drawback — they produced a substantial amount of heat. Engineers like John Presper Eckert and John Mauchly — the architects behind the Electronic Numerical Integrator and Computer (ENIAC), the first general-purpose electronic computer completed in 1946 — were among the first to grapple with the necessity of intricate cooling systems to keep their colossal machines from overheating.

The whole picture changed with the introduction of transistors in the late 1940s. These tiny devices were a revelation, being significantly more energy-efficient and heat-friendly than vacuum tubes. However, as computing power continued its meteoric rise, so did the density of transistors on integrated circuits, introducing new complexities in managing localized hot spots.

In the 1970s, the era of microprocessors arrived, exemplified by the Intel 4004. While these microprocessors brought computing power to the masses, they also generated considerable heat. Visionaries like Bob Noyce and Gordon Moore, co-founders of Intel, stepped up to the challenge. They pioneered innovative thermal solutions, including heat sinks and fans, which became indispensable in managing the rising heat levels.

Moore's Law, Gordon Moore's observation that the number of transistors on a microchip doubles roughly every two years, drove exponential advancements in computing but also meant a perpetual escalation in heat generation. This gave rise to the green computing movement in the late 20th century, championed by figures like Urs Hölzle at Google. It ushered in an era of energy-efficient data centers, featuring server virtualization and sophisticated cooling techniques such as liquid cooling and free cooling.

The 21st century brought fresh challenges. High-performance computing, the demands of artificial intelligence, and the voracious appetite for cryptocurrency mining placed unprecedented stress on thermal design. Innovators like David Reinsel and a cadre of data center professionals pushed the envelope, exploring cutting-edge technologies such as immersion cooling and AI-driven thermal management systems.

Today, thermal design remains at the core of data center operations across the world. "Enabling and innovating aggressive and scalable thermal technologies is the need of the hour to align with the exponential increase in power expected by processors over the next decade," said Tejas Shah, a lead thermal architect who works on Intel's cooling projects. Processors with higher TDP values tend to produce more heat, which necessitates more robust cooling solutions, like larger heat sinks, liquid cooling, or advanced thermal management techniques. Thermal design is the linchpin for reliability, cost-effectiveness, and reducing environmental impact. Pioneers, ingenious engineers, and a relentless drive for innovation have collectively shaped the narrative of thermal design in the data industry.

TDP and Immersion Cooling

The relationship between TDP and immersion cooling lies in the need for efficient thermal management in high-performance computing environments, such as gaming rigs and data centers. Immersion cooling technologies, including single-phase liquid systems and two-phase (phase-change) systems, are designed to provide more effective heat dissipation than traditional air cooling methods.

The implementation of immersion cooling technology marks a paradigm shift in data center management. Traditional cooling methods often struggle to cope with the escalating demands of modern computing, leading to increased energy consumption and water usage. In contrast, immersion cooling promises to mitigate these challenges by dissipating heat more efficiently, resulting in substantial reductions in energy consumption.

"Immersion cooling is a transformative and disruptive technology," said Jen Huffstetler, Intel's chief product sustainability officer within the Data Center and AI Group, emphasizing its profound impact. "This groundbreaking technology not only addresses some of the most pressing issues in data centers, including the reduction of energy consumption and water usage, but it also empowers our customers to improve their total cost of ownership while simultaneously boosting overall compute density."


The Main Challenges Efficient Cooling Projects Must Consider

There are a number of challenges that projects must consider as they work to improve cooling efficiency:

Miniaturization and Heat Density

As electronic devices shrink in size and increase in computational power, they generate higher heat densities. This phenomenon, quantified as power dissipation per unit area (W/cm²), poses a formidable challenge. The miniaturization of components, like transistors, exacerbates this issue. Efficient thermal design must incorporate advanced cooling techniques such as microchannel heat sinks, thermoelectric coolers, and phase-change materials to cope with these heat densities.
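
As a rough illustration of why shrinking components raises heat density, the sketch below computes power dissipation per unit area in W/cm². The 300 W power figure and the die areas are made-up values chosen only to show the trend.

```python
def heat_flux_w_per_cm2(power_watts: float, die_area_cm2: float) -> float:
    """Average power dissipation per unit area, in W/cm²."""
    if die_area_cm2 <= 0:
        raise ValueError("die_area_cm2 must be positive")
    return power_watts / die_area_cm2


# Hypothetical chip: the same 300 W spread over progressively smaller dies.
for area_cm2 in (8.0, 4.0, 2.0):
    flux = heat_flux_w_per_cm2(300.0, area_cm2)
    print(f"{area_cm2:.0f} cm² die -> {flux:.1f} W/cm²")
```

This is an average over the whole die; real chips have localized hot spots where the flux is considerably higher than the average suggests.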

Energy-Efficient Cooling Systems

Traditional cooling methods, such as forced convection (e.g., air cooling) or natural convection, often result in significant energy consumption. Advanced cooling solutions, including heat pipes and two-phase cooling systems, utilize phase transitions (e.g., vaporization and condensation) to enhance heat transfer efficiency, substantially reducing energy consumption.
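
The advantage of phase transitions can be sketched by comparing sensible heat (warming a liquid) with latent heat (vaporizing it). The example uses approximate textbook values for water at atmospheric pressure purely for illustration. Real two-phase data center systems typically use engineered dielectric fluids with different properties, so treat the ratio as indicative only.

```python
# Approximate properties of water (illustrative; not a data center coolant spec).
SPECIFIC_HEAT_KJ_PER_KG_K = 4.18   # energy to warm 1 kg of liquid by 1 K
LATENT_HEAT_KJ_PER_KG = 2257.0     # energy to vaporize 1 kg at its boiling point


def sensible_heat_kj(mass_kg: float, delta_t_k: float) -> float:
    """Heat absorbed by warming the liquid without a phase change."""
    return mass_kg * SPECIFIC_HEAT_KJ_PER_KG_K * delta_t_k


def latent_heat_kj(mass_kg: float) -> float:
    """Heat absorbed by vaporizing the liquid at constant temperature."""
    return mass_kg * LATENT_HEAT_KJ_PER_KG


single_phase = sensible_heat_kj(1.0, delta_t_k=10.0)  # assumed 10 K coolant rise
two_phase = latent_heat_kj(1.0)
print(f"Warming 1 kg by 10 K absorbs {single_phase:.1f} kJ")
print(f"Vaporizing 1 kg absorbs {two_phase:.1f} kJ")
print(f"Ratio: about {two_phase / single_phase:.0f}x")
```

Because each kilogram of fluid carries far more heat when it changes phase, less fluid has to be moved for the same load, which is one source of the energy savings.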

Material Selection

Thermal conductivity, a material property that determines its ability to conduct heat, is a critical consideration in thermal management. Engineers often choose materials such as copper or aluminum with high thermal conductivity values for heat sinks and thermal interface materials. However, emerging materials such as carbon nanotubes and graphene composites are being explored for even higher thermal conductivities.
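
To show how thermal conductivity feeds into a design, the following sketch applies Fourier's law for steady one-dimensional conduction through a flat slab, Q = k × A × ΔT / L. The conductivity values are approximate room-temperature figures for pure metals, and the plate dimensions are hypothetical. Real heat sinks involve fins, convection, and contact resistances that this idealized model ignores.

```python
# Approximate thermal conductivities in W/(m·K); alloys and temperature shift these.
CONDUCTIVITY_W_PER_M_K = {
    "copper": 400.0,
    "aluminum": 237.0,
}


def conducted_heat_watts(k: float, area_m2: float, thickness_m: float, delta_t_k: float) -> float:
    """Steady-state heat flow through a flat slab (Fourier's law, 1-D)."""
    if thickness_m <= 0:
        raise ValueError("thickness_m must be positive")
    return k * area_m2 * delta_t_k / thickness_m


# Hypothetical 4 cm x 4 cm base plate, 5 mm thick, with a 10 K drop across it.
area = 0.04 * 0.04
for material, k in CONDUCTIVITY_W_PER_M_K.items():
    q = conducted_heat_watts(k, area, thickness_m=0.005, delta_t_k=10.0)
    print(f"{material}: {q:.1f} W")
```

For the same geometry and temperature drop, heat flow scales linearly with conductivity, which is why higher-conductivity materials attract research interest despite their cost.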

Heat Spreading and Dissipation

Achieving uniform heat spreading across components and efficient dissipation are paramount. Heat spreaders, made of thermally conductive materials, are designed to evenly distribute heat. Heat pipes, which rely on phase-change principles, efficiently transport heat from heat sources to heat sinks. Thermal simulation software aids in optimizing these designs.

Design Integration

Integrating thermal solutions into device designs requires meticulous planning. Computational fluid dynamics (CFD) simulations help optimize airflow and component placement to ensure efficient heat dissipation without compromising device functionality. Material choices for insulation and thermal barriers play a role in preventing thermal leakage.

Variable Workloads

Many systems experience dynamic workloads, causing fluctuations in heat generation. Advanced thermal management solutions incorporate sensors and intelligent control systems that adjust cooling mechanisms, such as fan speeds or liquid flow rates, in real time based on temperature and workload data.
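
One simple way such a control loop might work is a proportional controller that raises fan duty as the measured temperature climbs above a target. The sketch below is a deliberately minimal illustration. The target temperature, gain, and duty limits are assumed values, and production systems generally use more sophisticated schemes (PID loops or predictive models, for example).

```python
def fan_duty(temp_c: float, target_c: float = 60.0, gain: float = 0.04,
             min_duty: float = 0.2, max_duty: float = 1.0) -> float:
    """Proportional control: duty rises linearly with the temperature error
    above target_c and is clamped to the range [min_duty, max_duty]."""
    error_c = temp_c - target_c
    duty = min_duty + gain * error_c
    return max(min_duty, min(max_duty, duty))


# Hypothetical sensor readings as a workload ramps up and back down.
for reading_c in (50.0, 60.0, 70.0, 90.0, 65.0):
    print(f"{reading_c:.0f} °C -> fan at {fan_duty(reading_c):.0%}")
```

The clamp keeps a minimum airflow at idle and prevents the controller from requesting more than the fan can deliver under heavy load.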

Energy-Efficient Algorithms

In data centers and internet of things (IoT) devices, software plays a pivotal role in energy optimization. Algorithms for workload distribution and dynamic voltage and frequency scaling (DVFS) are used to balance performance and energy consumption. Machine learning techniques are employed to predict workload patterns and optimize cooling.
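
The appeal of DVFS follows from the common first-order model of dynamic CMOS power, P ≈ C × V² × f, where C is the switched capacitance, V the supply voltage, and f the clock frequency. The sketch below computes power relative to a baseline under that model. It ignores static leakage and the fact that the achievable voltage depends on frequency, so it is an approximation rather than a chip-accurate model.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to baseline under P ≈ C·V²·f.
    Both arguments are fractions of the baseline value (1.0 = unchanged)."""
    if voltage_scale <= 0 or frequency_scale <= 0:
        raise ValueError("scales must be positive")
    return voltage_scale ** 2 * frequency_scale


# Hypothetical operating points: (voltage scale, frequency scale).
for v, f in ((1.0, 1.0), (0.9, 0.8), (0.8, 0.6)):
    p = relative_dynamic_power(v, f)
    print(f"V x{v:.1f}, f x{f:.1f} -> {p:.1%} of baseline dynamic power")
```

Because voltage enters as a square, lowering it alongside frequency cuts heat output faster than it cuts performance, which is what makes DVFS useful for thermal management.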

Environmental Considerations

Achieving energy efficiency aligns with sustainability goals. Thermal engineers aim to design systems with low power usage effectiveness (PUE) in data centers, efficient power delivery systems, and heat reuse strategies to minimize environmental impact.
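
PUE is defined as total facility energy divided by the energy delivered to IT equipment, so a value of 1.0 would mean every watt goes to computing. The short sketch below computes PUE and the implied overhead. The energy figures are hypothetical and chosen only to illustrate the arithmetic.

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = total facility energy / IT equipment energy (lower is better, minimum 1.0)."""
    if it_equipment_kwh <= 0:
        raise ValueError("it_equipment_kwh must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical facility: 1,000 kWh of IT load before and after a cooling upgrade.
for label, total_kwh in (("before", 1500.0), ("after", 1100.0)):
    pue = power_usage_effectiveness(total_kwh, 1000.0)
    overhead = pue - 1.0
    print(f"{label}: PUE {pue:.2f} ({overhead:.0%} overhead on top of IT load)")
```

Cooling is usually a large share of that overhead, which is why improvements in thermal design show up directly in a facility's PUE.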

Testing and Validation

Rigorous testing is essential to validate thermal designs. Environmental chambers simulate real-world conditions, and thermal imaging cameras provide invaluable insights into temperature distribution. Computational models are validated against experimental data to ensure accuracy.

Cost Constraints

While advanced thermal solutions exist, their implementation can be costly. Engineers must weigh the benefits of reduced energy consumption against the upfront costs of innovative cooling technologies, taking into account the total cost of ownership over the system's lifespan.
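
The trade-off can be sketched as a simple total-cost-of-ownership comparison: upfront cost plus energy cost over the system's lifespan. All figures below (capital costs, annual energy use, electricity price, lifespan) are invented for illustration. A real analysis would also include maintenance, water, floor space, and the time value of money.

```python
def total_cost_of_ownership(upfront_cost: float, annual_energy_kwh: float,
                            price_per_kwh: float, years: int) -> float:
    """Upfront cost plus undiscounted energy cost over the lifespan."""
    return upfront_cost + annual_energy_kwh * price_per_kwh * years


def payback_years(extra_upfront: float, annual_kwh_saved: float, price_per_kwh: float) -> float:
    """Years of energy savings needed to recover the extra upfront spend."""
    annual_savings = annual_kwh_saved * price_per_kwh
    if annual_savings <= 0:
        raise ValueError("the alternative must save energy to pay back")
    return extra_upfront / annual_savings


# Hypothetical options for the same IT load, at an assumed $0.10/kWh over 10 years.
air = total_cost_of_ownership(100_000, 500_000, 0.10, 10)
liquid = total_cost_of_ownership(250_000, 200_000, 0.10, 10)
print(f"Air cooling TCO:    ${air:,.0f}")
print(f"Liquid cooling TCO: ${liquid:,.0f}")
print(f"Payback: {payback_years(150_000, 300_000, 0.10):.1f} years")
```

In this made-up scenario the costlier system wins over ten years but would lose over a shorter horizon, which is why the assumed lifespan matters as much as the price tag.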

Technological Advancements

Future-proofing thermal designs involves forecasting technological advancements. Engineers must anticipate shifts in power density, device architectures, and cooling techniques to ensure long-term efficacy.

Interdisciplinary Collaboration

Success in thermal design often hinges on interdisciplinary collaboration. Electrical engineers, materials scientists, thermodynamic specialists, and software developers must collaborate effectively, requiring precise communication and coordination.

Introducing the COOLERCHIPS Projects

One of the pivotal projects at the forefront of this technological revolution is COOLERCHIPS, an acronym for Cooling Operations Optimized for Leaps in Energy, Reliability, and Carbon Hyperefficiency for Information Processing Systems. The COOLERCHIPS initiative leverages advanced engineering principles and state-of-the-art technologies to redefine the cooling paradigm. This entails the development of novel cooling solutions that not only reduce energy consumption but also enhance the overall performance of data center infrastructure.

The COOLERCHIPS projects are a series of initiatives funded by the U.S. Department of Energy, administered by the Advanced Research Projects Agency-Energy (ARPA-E), aimed at advancing data center cooling systems.

COOLERCHIPS has awarded $40 million in grants to 15 enterprise and academic projects. These projects seek to achieve a minimum tenfold improvement in cooling efficiency to address the rising heat challenges posed by data centers. By fostering innovative technologies and approaches, COOLERCHIPS aims to reduce energy consumption, lower carbon emissions, and enhance the sustainability and resilience of data center cooling while promoting economic viability and competitiveness.

Among the standout projects, Flexnode in Bethesda, Maryland, is developing a modular data center with cutting-edge cooling technologies. HP, located in Corvallis, Oregon, is working on liquid cooling to reduce thermal resistance, allowing heat rejection into external air at 40°C and 60% humidity. NVIDIA, based in Santa Clara, California, is engineering a modular data center with two-phase cold plates for outstanding thermal efficiency. The University of California, Davis, in Davis, California, is pioneering thermal management solutions for edge computing data centers, optimizing heat extraction through cost-effective heat exchangers.

These projects collectively aim to revolutionize data center cooling, addressing climate concerns and ensuring reliable digital operations.

Conclusion: COOLERCHIPS Initiatives Push Boundaries of Data Center Cooling

COOLERCHIPS initiatives are at the forefront of research and development, pushing the boundaries of what's possible in terms of cooling technology. By focusing on innovative approaches such as microfluidic cooling, advanced materials, and smart thermal management, COOLERCHIPS projects are revolutionizing the way we manage heat in electronic devices. Not only do these projects have the potential to make our devices more efficient and durable, but they also contribute to reducing our overall carbon footprint.

The adoption of COOLERCHIPS technology can lead to substantial energy savings, prolong the lifespan of electronic components, and ultimately reduce electronic waste.

As we continue to ask more from our devices, whether it's in the realm of high-performance computing, artificial intelligence, or data centers, we must remember that efficient thermal design and cooling systems are the unsung heroes that make it all possible.
