Sustainability has emerged as a top corporate priority, and corporate data centers and IT systems are a key area of focus given the substantial resources consumed in powering and cooling increasingly energy-hungry processing environments.
As organizations move toward faster and more powerful computing environments, newer processors, GPUs, and solid-state storage all require significantly more power to operate than their technological predecessors. In some cases, the power requirements of these large data centers can rival those of a small city, and this is driving organizations of all sizes to pursue alternatives that can offset this growth in consumption.
While some organizations are pursuing their own data center efficiency and sustainability initiatives, many are shifting workloads to the cloud to simplify their IT management environment while concurrently driving down the carbon footprint associated with their compute utilization.
It might seem counterintuitive to think that major public cloud data centers filled with thousands of racks of computer servers could somehow be a more power-efficient option, but the reality is that the major cloud providers have become 'black belts' at measuring, evaluating, and driving down the power-related costs of operating their mega data centers for both operational efficiency/profitability – and sustainability – reasons.
An obvious step for the cloud providers to take would be to leverage as many sources of green or renewable energy as possible, and they are absolutely pursuing that avenue. Taken together, the major public cloud providers are some of the largest consumers of renewable energy in the world. However, rather than simply buying sustainable energy, these titans of IT have increasingly turned their attention to reducing the raw amount of power consumed in the first place.
Driving Environmental and Server Efficiency
The focus on energy efficiency has driven new interest in a power-related metric called power usage effectiveness (PUE) that has long been associated with high-performance computing (HPC) workloads run by some of the largest users of compute resources – such as the US Department of Energy.
PUE measures a data center's energy effectiveness: the total power entering the facility divided by the power used to run the IT equipment within it. A perfectly efficient data center would have a PUE of 1.0 – indicating that 100% of the power entering the data center goes to the computing equipment itself, with no waste.
In reality, PUE calculations must account for the power used for cooling and power conversion. They should also be measured as a year-round average that includes the hot summer months, when cooling drives up the power required for operations.
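The calculation described above can be sketched with hypothetical facility numbers (these figures are illustrative, not drawn from any provider's actual data):

```python
# Worked PUE example with hypothetical facility numbers.
def pue(total_facility_mwh: float, it_equipment_mwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by
    the energy used by the IT equipment alone."""
    return total_facility_mwh / it_equipment_mwh

# Suppose a facility draws 12,000 MWh over a year, of which 10,000 MWh
# powers servers, storage, and networking; the remainder goes to cooling,
# power conversion, and lighting.
print(pue(12_000, 10_000))  # 1.2 -> 20% overhead beyond the IT load itself
```

A PUE of 1.2 means that for every watt delivered to the computing hardware, another 0.2 watts is spent on facility overhead; the closer the ratio gets to 1.0, the less energy is wasted.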
Major cloud providers continue to make major investments to drive down their PUE. Google's largest public cloud data center environments are, on average, more than 1.5 times as energy efficient as the typical enterprise data center, and other public cloud providers are driving toward similar results.
Just as auto racing teams win races by continually finding small efficiency gains in aerodynamics, cloud operators shave energy utilization by implementing operational innovations such as running their data centers at 80 degrees Fahrenheit, using outside air for cooling, and designing their own super-efficient servers.
In fact, the results of the innovation and investment by large public cloud operators have been studied, measured, and published. A 2020 paper published in the journal Science showed that while the amount of computing done in data centers worldwide increased by about 550% between 2010 and 2018, the energy consumed by those data centers grew by only 6% over the same period.
The study's authors note that these energy efficiency gains outpaced anything seen in other major sectors of the economy.
Intelligent Software Control as the Next Step
The cloud's power-utilization gains can be amplified by providing IT organizations with intelligent control planes that allow administrators to configure, tune, start, and stop cloud-based resources to precisely match the needs of user workloads.
For example, traditional corporate data centers have a fixed configuration of servers and resources that typically run in an 'always-on' state, regardless of the current usage profile and the needs of active workloads. This drives a consistent, relatively high power draw that is suboptimal from a PUE and sustainability perspective.
In contrast, software-controlled cloud-based environments might offer a catalog of 20 different compute instance configurations that can be dynamically assigned to a particular user job – and that can be rapidly turned on and off as needed.
This more dynamic ability to select node types that precisely provide the required amount of processing power for a given workload (and to only utilize those servers/instances when needed) can provide the same type of usage optimization on an application-specific basis as the cloud providers perform in their PUE initiatives.
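The selection logic described above can be sketched as a simple best-fit search over an instance catalog. The catalog entries, names, and prices below are hypothetical, not any provider's actual offerings:

```python
# Hypothetical instance catalog and right-sizing logic (illustrative only).
from dataclasses import dataclass

@dataclass
class InstanceType:
    name: str
    vcpus: int
    memory_gb: int
    price_per_hour: float

CATALOG = [
    InstanceType("small", 2, 8, 0.10),
    InstanceType("medium", 8, 32, 0.40),
    InstanceType("gpu-large", 32, 256, 3.20),
]

def best_fit(required_vcpus: int, required_memory_gb: int) -> InstanceType:
    """Pick the cheapest instance type that satisfies the workload's needs,
    instead of leaving an oversized server running 'always on'."""
    candidates = [i for i in CATALOG
                  if i.vcpus >= required_vcpus and i.memory_gb >= required_memory_gb]
    if not candidates:
        raise ValueError("no instance type satisfies the request")
    return min(candidates, key=lambda i: i.price_per_hour)

print(best_fit(4, 16).name)  # medium
```

Because each job is matched to the smallest adequate configuration and the instance is released when the job finishes, the aggregate fleet runs closer to its useful capacity, which is the application-level analogue of the providers' facility-level PUE work.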
The result within the cloud is that a single set of compute resources can be dynamically assigned and re-assigned between multiple groups of users, or even groups of users from multiple companies, in a manner that is completely secure, and which maximizes the efficient use of the environment.
This can be particularly useful for compute-intensive workloads such as AI and high-performance computing (HPC) where varying applications can achieve significant performance acceleration from the use of specific processors and server configurations.
The Power of Choice
While advanced software control planes for workload management are a key component of cloud-based execution, there is a growing trend toward hybrid cloud environments that allow IT organizations to combine the best elements of their on-premises data centers seamlessly with the public cloud.
In these cases, software-based control planes can enable IT managers to target workloads at the 'best fit' location given current activity and system needs. For example, a workload might run on-premises most of the time, but could 'burst' to a much larger set of cloud resources on a monthly or quarterly basis during periods of very high usage.
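A burst-to-cloud placement decision of the kind described above can be sketched as a capacity check. The threshold and function names are assumptions for illustration, not a specific product's API:

```python
# Hedged sketch of a hybrid-cloud placement decision (illustrative only).
ON_PREM_CAPACITY_NODES = 100  # assumed fixed on-premises fleet size

def place_workload(requested_nodes: int, on_prem_busy_nodes: int) -> str:
    """Run on-premises while free capacity allows; burst to the cloud
    when a peak request exceeds what the local fleet can absorb."""
    free_nodes = ON_PREM_CAPACITY_NODES - on_prem_busy_nodes
    return "on-premises" if requested_nodes <= free_nodes else "cloud-burst"

print(place_workload(20, 50))  # on-premises: 50 free nodes cover the request
print(place_workload(80, 50))  # cloud-burst: request exceeds free capacity
```

In practice a real control plane would weigh many more factors (data locality, cost, queue wait times), but the core idea is the same: keep the steady-state load on owned hardware and rent capacity only for the peaks.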
Similarly, certain high-performance computing and AI workloads that require access to the latest GPU resources might execute in the cloud where the latest processor technologies are more readily available on a pay-as-you-go basis. It's all about choice.
There's little doubt that the major cloud providers are leading the charge on the path to data center operational efficiency – and their advances benefit both the customers who execute their workloads in the cloud and the IT teams and colocation facility operators who can take advantage of the evolving set of tools and techniques.
The most advanced enterprises will find ways – using intelligent control plane software – to execute their workloads in the right place at the right time to deliver maximum benefit to their organizations and to a sustainable future.
Mark Seamans is Vice President, Cloud and Services – High-Performance Computing and AI for Penguin Solutions, a wholly-owned business within SGH (NASDAQ: SGH).