The 3 Layers of Data Center Management
Clemens Pfeiffer is chief technology officer of Power Assure. Before co-founding Power Assure, he served as founder, president, and CEO for 10 years at International SoftDevices Corporation and was Chief Software Architect at Hewlett Packard.
Many data center managers wonder, “Why can’t I manage all my data center activity with the monitoring and reporting systems I have in place?” Popular IT management software often describes server utilization, but doesn’t provide a clear picture of the facility power and cooling situation. Inventory management systems can keep track of servers as assets, but not the number of transactions the servers perform, which correlates better with power demand. A change management system can tell you where each server is and how to access it, while an application monitoring system keeps an eye on the virtual machines and the service levels the applications provide. Very few systems incorporate all of this data into a single, holistic picture of how all your data centers are performing.
Furthermore, metrics that attempt to capture the environmental impact of a data center, such as PUE (Power Usage Effectiveness) and CADE (Corporate Average Data Center Efficiency), provide very useful pieces of data but each reduces the facility to a single number, so they don’t capture the multidimensional complexity of data center performance.
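To make the limitation concrete, recall that PUE is total facility power divided by the power delivered to IT equipment. The short Python sketch below uses hypothetical meter readings and a hypothetical utilization figure (none of these numbers come from a real site) to show two data centers with identical PUE but very different amounts of useful work.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical readings, for illustration only.
sites = {
    "site-a": {"total_kw": 1500.0, "it_kw": 1000.0, "avg_server_utilization": 0.10},
    "site-b": {"total_kw": 1500.0, "it_kw": 1000.0, "avg_server_utilization": 0.60},
}

# Compute PUE per site; utilization is not an input to the metric at all.
results = {name: pue(s["total_kw"], s["it_kw"]) for name, s in sites.items()}

for name, s in sites.items():
    print(name, "PUE:", results[name], "utilization:", s["avg_server_utilization"])
```

Both sites report the same PUE even though one does far more work per kilowatt, which is why the ratio has to be read alongside IT and application data.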
To paraphrase Einstein, “We can’t solve problems by using the same kind of thinking we used when we created them.” But that’s what most data center managers do—they run separate building management and IT management platforms and then attempt to correlate them using spreadsheets. Inside each silo, the requirements of the application drive the focus of the observation and analysis. Outside the silos, the business drives the requirements.
Data center managers must find ways to tie together diverse data from different sources, such as power, temperature, humidity, utilization, and application service levels, into an overarching monitoring and analytics environment. Such an environment lets them make better decisions, optimize their capital spending, and increase overall efficiency, not only in power but also in capacity, utilization, and operations.
When you set up an environment that models, monitors, analyzes, and manages a data center, even as the underlying structures change continuously, you finally get a complete picture of the data center across facility, IT, and finance. Think of it as a business intelligence and management platform that provides a truly holistic solution for data center management inside a single data center and across multiple data centers.
A Three-Layer Approach: Monitor, Analyze, Automate
Data center management requires three layers. First, monitor and visualize the details and activities across all systems and locations. Next, analyze how to utilize the data center more efficiently to save energy and space or increase the utilization of existing equipment. Finally, automate the action layer, which allows for synchronized management across the silos of facility, IT hardware, networks, and applications.
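The three layers can be pictured as a small pipeline. The Python below is only a minimal sketch under stated assumptions, not a description of any particular product: the Reading fields, the server names, the 20 percent idle threshold, and the text of the actions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    server: str
    power_watts: float      # from the facility side (hypothetical feed)
    cpu_utilization: float  # from the IT side, 0.0 to 1.0 (hypothetical feed)


def monitor(raw_feeds):
    """Layer 1: merge facility and IT feeds into one record per server."""
    power, cpu = raw_feeds["power"], raw_feeds["cpu"]
    return [Reading(s, power[s], cpu[s]) for s in sorted(power) if s in cpu]


def analyze(readings, idle_threshold=0.20):
    """Layer 2: flag servers that draw power while doing little work."""
    return [r for r in readings if r.cpu_utilization < idle_threshold]


def automate(candidates):
    """Layer 3: turn findings into actions (here, only described as text)."""
    return [f"consolidate workload from {r.server}" for r in candidates]


feeds = {
    "power": {"srv-01": 310.0, "srv-02": 295.0, "srv-03": 305.0},
    "cpu": {"srv-01": 0.72, "srv-02": 0.08, "srv-03": 0.15},
}

actions = automate(analyze(monitor(feeds)))
print(actions)
```

The point of the structure is that the analysis step sees facility and IT data in the same record; a real action layer would call into facility, hardware, network, and application controls rather than return strings.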
The same three-layer approach must also extend across multiple data centers, a likely scenario in any organization that needs multi-site redundancy. At this level, the management of each data center must be integrated with broader coordination among the network providers and power utilities that serve and connect the sites. If a power or network outage affects one particular site or application stack, the data center manager needs to be able to switch that stack seamlessly to another data center or even across multiple data centers. An integrated, holistic data center management framework enables this kind of agility.
The Application-Service Data Center Ideal
CIOs and data center managers are being barraged by hype about the cloud, and the traumas of the past decade have raised the bar for what is considered an acceptable level of business continuity/disaster recovery. For anyone who runs most of their applications on hardware they control, it’s worth looking at the model Amazon.com uses.
Although most companies don’t operate at Amazon.com’s scale, the company has an efficient distribution of data center assets that provides load balancing and redundancy with a minimum of waste. The company splits its operations across three data centers, with fully redundant network hardware and cooling systems across multiple sites but minimum redundancy within a single data center. If one center goes down, the other two automatically take over, which limits capital expense while still providing a high level of service. Mission-critical applications are load-balanced across multiple sites, eliminating the need for hot or cold standby facilities that sit unused most of the time and waste large amounts of space, energy, and money.
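A rough calculation shows why the three-site arrangement described above limits capital expense. The Python sketch below assumes a simplified model that is not taken from any real deployment: load is spread evenly across all active sites, and the surviving sites must carry the full peak load after a failure. The function name and the load figure of 900 units are hypothetical.

```python
def provisioned_capacity(peak_load: float, sites: int, tolerated_failures: int = 1) -> float:
    """Total capacity to build so the remaining sites can carry the peak load.

    Simplified model: load is spread evenly, and every site is sized so that
    the survivors of `tolerated_failures` site outages cover `peak_load`.
    """
    survivors = sites - tolerated_failures
    if survivors < 1:
        raise ValueError("at least one site must survive")
    per_site = peak_load / survivors
    return per_site * sites


peak = 900.0  # hypothetical peak load, in arbitrary capacity units

# Primary site plus a dedicated standby site: each must carry the full load.
standby_total = provisioned_capacity(peak, sites=2)

# Three active sites: any two must carry the full load.
three_site_total = provisioned_capacity(peak, sites=3)

print("primary + standby:", standby_total)
print("three active sites:", three_site_total)
```

Under these assumptions the standby design builds twice the peak load, while the three-site design builds one and a half times the peak load, and all of that capacity is in active use during normal operation.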
This kind of arrangement will continue to gain acceptance. The service level dictates the design of the physical environment, not the other way around. The concept of a dedicated “disaster recovery” site, which simply waits for an emergency, may eventually become, to use the word in a different context, redundant. Instead, the application service level itself provides redundancy across multiple data centers. Data center managers are then free to use spare capacity for lower-priority applications when it is not needed for disaster recovery (which is most of the time). This eliminates the need for costly redundant data center configurations, large UPS devices, and reliability build-outs, and it reduces the redundancy required within each data center from 2N to N+1 or even N.
Even with a redundant application environment spanning multiple data centers in place, swings in load still free up capacity during periods of low utilization, and that capacity can be used cost-efficiently for second- or third-tier (less critical) applications. Currently, many IT managers consider such lower-priority applications targets for outsourcing to a cloud hosting environment, even though they can run reliably and effectively on in-house capacity “for free.”
This vision can only come to life with software that takes into account all five levels of the data center stack: the network/power level, the facility level, the IT infrastructure level, the application software level, and the application service level. Emergency failover and routine transfer of resources between physical assets must be fully automated, and data centers must be fully instrumented so that the integrity of operations and applications can be maintained according to how critical they are to the business. You need a holistic view: the big picture.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.