RAI: A Metric to Measure Whether Your Data Center is Operating Lean

RAJAT GHOSH
Georgia Tech

Rajat Ghosh is a Postdoctoral Fellow at the CEETHERM Data Center Laboratory, G.W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology.
His website is RajatGhosh.wordpress.com and he is available on LinkedIn.

Making a data center operation “lean” is increasingly becoming a critical business and regulatory requirement. A lean data center operation can satisfy its customers in the most cost-effective manner.

One potential solution for data center operational expenditure (OPEX) optimization lies in adopting the usage-based pricing model for resource consumption. To realize that goal, a data center’s resource supply and demand sides should match as closely as possible. Following the time-tested management adage, “You can’t manage what you can’t measure,” this article proposes a metric, the resource allocation index (RAI), to measure how closely the demand and supply sides of data center resources match.

Data Center Value Chain

Figure 1: Resource Utilization Landscape for an Internet Data Center (IDC) Value Chain

Figure 1 shows the resource utilization landscape for an internet data center (IDC) value chain, which acts as an engine room for the prevalent client-server based commodity computing.

The demand side of an IDC is driven by its users. It can be defined as the number of login requests received. Depending on the nature of the IDC, the login requests could be transactional (e.g. the Bank of America website), computational (e.g. analysis websites such as WolframAlpha Mathematica), archival (e.g. Facebook), merchandise (e.g. Amazon), or content query (e.g. Google). This incoming network traffic places demands, in the form of electronic operations, on IT equipment (ITE) such as volume servers, network switches, and storage disks. Depending on the type of application, an IDC might use different combinations of its IT-enabled capabilities. However, the common denominator of these IT operations is the consumption of electricity, whether from renewable or non-renewable sources. Therefore, electricity should be considered the most fundamental resource for a data center.

Metric to Measure Resource (Electricity) Utilization: PUE vs. RAI

Although a data center’s electricity should be consumed primarily by its ITE, a few exhaustive surveys indicate that on average 35-45 percent of data center electricity is consumed by its cooling hardware, such as server fans, computer room air conditioning (CRAC) units, building chillers, and cooling towers. The fraction of total electricity that is consumed by a data center’s ITE is given by the reciprocal of its power usage effectiveness (PUE), which is the ratio of total facility power to ITE power.
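As a rough illustration of the arithmetic, the sketch below computes PUE under its standard definition (total facility power divided by ITE power) and derives the ITE share of electricity as its reciprocal. The function names and the 1,000 kW and 600 kW figures are hypothetical and are not taken from any survey.

```python
def pue(total_facility_kw: float, ite_kw: float) -> float:
    """Return PUE = total facility power / ITE power (dimensionless, >= 1)."""
    if ite_kw <= 0:
        raise ValueError("ITE power must be positive")
    if total_facility_kw < ite_kw:
        raise ValueError("total facility power cannot be less than ITE power")
    return total_facility_kw / ite_kw


def ite_fraction(total_facility_kw: float, ite_kw: float) -> float:
    """Return the share of total electricity consumed by ITE, i.e. 1 / PUE."""
    return 1.0 / pue(total_facility_kw, ite_kw)


# Hypothetical facility: 1,000 kW total draw, of which 600 kW reaches the ITE.
example_pue = pue(1000.0, 600.0)
example_share = ite_fraction(1000.0, 600.0)
print(f"PUE = {example_pue:.2f}, ITE share = {example_share:.0%}")
```

In this hypothetical case the remaining 40 percent of the electricity goes to cooling and other overhead, which falls inside the 35-45 percent range quoted above.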

Despite being a useful and prevalent metric, PUE focuses only on the supply side of the value chain for an IDC. In fact, the state-of-the-art trend of designing the data center as a warehouse-scale computer calls for a single holistic metric that encompasses the entire resource utilization value chain. In that direction, a metric called the resource allocation index (RAI) is proposed as follows:

RAI = Normalized Resource Supply / Normalized Resource Demand

RAI measures how much electricity a data center requires to serve its incoming requests. Because it takes the ratio of the two end-points of the value chain (as shown in Figure 1), RAI is an end-to-end metric that is useful for the holistic assessment of a data center’s resource allocation. Supply and demand are each normalized with respect to their respective peak values.
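The definition above can be sketched in a few lines of code. This is a minimal illustration, assuming supply is measured as electrical power in kW and demand as requests per second. The function name, the units, and all numeric values are hypothetical, since the article does not prescribe how either quantity is sampled.

```python
def resource_allocation_index(
    supply: float,
    peak_supply: float,
    demand: float,
    peak_demand: float,
) -> float:
    """Return RAI = (supply / peak_supply) / (demand / peak_demand)."""
    if peak_supply <= 0 or peak_demand <= 0:
        raise ValueError("peak values must be positive")
    if demand <= 0:
        raise ValueError("RAI is undefined when demand is zero or negative")
    normalized_supply = supply / peak_supply
    normalized_demand = demand / peak_demand
    return normalized_supply / normalized_demand


# Hypothetical snapshot: the facility draws 400 kW against a 500 kW peak
# while serving 6,000 requests/s against a 10,000 requests/s peak.
rai = resource_allocation_index(400.0, 500.0, 6_000.0, 10_000.0)
print(f"RAI = {rai:.2f}")
```

Here the normalized supply (0.8) exceeds the normalized demand (0.6), so the ratio is greater than 1.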

RAI as a Quantitative Standard for Resource Provisioning

Table 1: Comparison of Resource Allocation Performances of Two Data Centers

RAI can be used to compare the resource allocation performances of multiple data centers. Table 1 shows RAI values of two hypothetical data centers. While the RAI value for Data Center 1 is equal to 0.91, that value for Data Center 2 is 1.26. With a lower RAI value, Data Center 1 performs better in utilizing its resources.

A cursory glance at the RAI definition suggests that a lower value of RAI is desirable because the data center is spending fewer resources to satisfy its demand. Nevertheless, a lower RAI value does not necessarily indicate better resource allocation performance. In fact, an RAI value that is too low suggests that the data center is not drawing the electricity required to support its demand. Such resource under-provisioning might cause degraded service performance (e.g. slow response of amazon.com during Black Fridays) or even downtime (e.g. the downtimes seen by users of healthcare.gov).

On the other hand, an RAI value that is too high means resource over-provisioning, which leads to significant waste of electricity. Therefore, a data center’s RAI value indicates which of three modes its operating resources have been allocated in: over-provisioned, under-provisioned, or optimally-provisioned. Depending on the tier status and the operating constraints of a given data center, an upper limit (UL) and a lower limit (LL) for the RAI value can be defined. Between these limits, the data center can be considered optimally-provisioned. If we suppose UL=1.25 and LL=0.75, then Data Center 1 (RAI=0.91) is optimally-provisioned and Data Center 2 (RAI=1.26) is over-provisioned. Figure 2 illustrates the concept schematically.

Figure 2: Assessment of Resource Allocation Performance based on RAI Values
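The three provisioning modes described above can be sketched as a simple threshold check. The function name is hypothetical, the default limits are the example values UL=1.25 and LL=0.75 from the text, and treating the limits themselves as optimally-provisioned is an assumption, since the article does not say how boundary values should be handled.

```python
def classify_provisioning(
    rai: float,
    lower_limit: float = 0.75,
    upper_limit: float = 1.25,
) -> str:
    """Map an RAI value to one of the three provisioning modes."""
    if lower_limit >= upper_limit:
        raise ValueError("lower_limit must be smaller than upper_limit")
    if rai < lower_limit:
        return "under-provisioned"
    if rai > upper_limit:
        return "over-provisioned"
    # Assumption: values exactly on a limit count as optimally-provisioned.
    return "optimally-provisioned"


# RAI values of the two hypothetical data centers discussed in the text.
data_centers = {"Data Center 1": 0.91, "Data Center 2": 1.26}
for name, rai in data_centers.items():
    print(f"{name}: RAI = {rai:.2f} -> {classify_provisioning(rai)}")
```

In practice the limits would be set per facility, for example tighter for a data center with strict uptime requirements, so they are kept as parameters rather than constants.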

2 Comments

  1. Patrick Mannion

    Highly relevant approach here - and presented in an easily-understood manner, thank you. Maybe as a followup you could compare it to the Green Grid's DCeP framework?

  2. Great Advice! I will get back with a follow-up article.