As the data center increasingly becomes the heart of the enterprise, data center reliability needs increase. But data center design isn’t simply about infrastructure redundancy. As senior company executives pay more attention to what’s happening in the data center, it is more important than ever for a data center design to match specific company needs.
More redundancy than necessary means overspending and, as you’ll learn later in the article, can actually work against reliability. Steven Shapiro, mission critical practice lead at Morrison Hershfield, an engineering firm that does a lot of data center projects, said companies have to align their business mission with their expectations of data center performance when deciding how redundant the design should be.
Shapiro talked about the basics of data center design decisions from the availability perspective in a presentation at last week’s Data Center World conference in National Harbor, Maryland. Here are some of the highlights from his presentation:
More redundancy doesn’t always mean more reliability
It is important to design, as much as possible, for the actual reliability needs of the applications, and more infrastructure redundancy doesn’t automatically make a system more reliable. In fact, there is a point at which increasing component redundancy lowers reliability because the system becomes more complex and difficult to manage, Shapiro said.
Tier IV costs twice as much as Tier II
Infrastructure reliability level has to match the needs of the applications the data center is supporting. Simply designing and building the most reliable data center you can afford is not the smart way to go, especially considering the cost of redundancy.
While the difference in cost between an Uptime Institute Tier I and a Tier II design, or between a Tier III and a Tier IV one, is small, the jump from Tier II to Tier III is enormous: the cost nearly doubles. Citing Uptime’s own estimates, Shapiro said a Tier I data center with 15,000 square feet of computer space would cost $10,000 per kW of usable UPS-backed power capacity. The cost goes up to $11,000 per kW for a Tier II facility, but to $20,000 for Tier III and $22,000 for Tier IV.
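The step-by-step increases are easy to check from the per-kW figures quoted above. The short Python sketch below is only an illustration of that arithmetic; the dictionary and function names are made up for this example, and the inputs are the four estimates Shapiro cited.

```python
# Cost per kW of usable UPS-backed capacity, as quoted in the article
# (Uptime estimates for a facility with 15,000 sq ft of computer space).
COST_PER_KW = {
    "Tier I": 10_000,
    "Tier II": 11_000,
    "Tier III": 20_000,
    "Tier IV": 22_000,
}


def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


tiers = list(COST_PER_KW)
# Increase for each step up the tier ladder: I -> II, II -> III, III -> IV.
steps = {
    f"{a} -> {b}": pct_increase(COST_PER_KW[a], COST_PER_KW[b])
    for a, b in zip(tiers, tiers[1:])
}

for step, pct in steps.items():
    print(f"{step}: +{pct:.0f}%")

# Ratio behind the section heading: Tier IV versus Tier II.
tier_iv_vs_tier_ii = COST_PER_KW["Tier IV"] / COST_PER_KW["Tier II"]
print(f"Tier IV / Tier II: {tier_iv_vs_tier_ii:.1f}x")
```

With these inputs, the Tier I to Tier II and Tier III to Tier IV steps each add about 10 percent, the Tier II to Tier III step adds about 82 percent, and Tier IV comes out at twice the Tier II cost.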
2(N+1) UPS config not much more reliable than 2N UPS
In another example of more redundancy not meaning more reliability, Shapiro said a design doesn’t get much more reliable by going from a 2N UPS configuration, which has two sets of UPS modules, each sized to carry the full IT load, to a 2(N+1) configuration, which adds one spare module to each of those two sets.
The probability of failure for a system that has 2N UPS, N+1 generator capacity, dual utility feeds, an alternate-source transfer switch, and IT gear with dual power cords is 4.41 percent, according to Shapiro. A system that is the same in every other respect but has a UPS configuration of 2(N+1) has the same probability of failure.
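To make the redundancy labels concrete, here is a minimal Python sketch that counts UPS modules for each configuration. The 1,000 kW IT load and 250 kW module size are hypothetical numbers chosen for illustration, not figures from Shapiro’s presentation, and the function names are invented for this example.

```python
import math


def modules_needed(it_load_kw, module_kw):
    """N: the smallest number of modules that can carry the IT load."""
    return math.ceil(it_load_kw / module_kw)


def ups_module_counts(it_load_kw, module_kw):
    """Total module count for common redundancy configurations."""
    n = modules_needed(it_load_kw, module_kw)
    return {
        "N": n,                   # just enough capacity, no spare
        "N+1": n + 1,             # one spare module
        "2N": 2 * n,              # two full sets of N
        "2(N+1)": 2 * (n + 1),    # two full sets, each with a spare
    }


# Hypothetical example: 1,000 kW of IT load served by 250 kW modules.
counts = ups_module_counts(it_load_kw=1000, module_kw=250)
for config, total in counts.items():
    print(f"{config}: {total} modules")
```

In this hypothetical case, moving from 2N to 2(N+1) means buying and maintaining 10 modules instead of 8, which is extra cost and complexity for a configuration that, by Shapiro’s numbers, has the same probability of failure.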
2(N+1) generator config is marginally more reliable than N+1
A 2(N+1) generator configuration does improve availability compared with an N+1 configuration, albeit slightly. In a system with 2(N+1) UPS, dual utility feeds, an alternate-source transfer switch, and dual-corded IT equipment, the difference in failure probability between an N+1 generator configuration and a 2(N+1) configuration is about 1.5 percentage points: 4.41 percent for the former and 2.94 percent for the latter.
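The gap between the two generator configurations can be expressed both in percentage points and as a relative reduction. The Python sketch below simply works through that arithmetic using the two failure probabilities Shapiro quoted; the variable names are illustrative, and the time period those probabilities cover is not stated in the presentation summary, so none is assumed here.

```python
# Failure probabilities quoted in the article for otherwise identical systems.
P_FAIL_GEN_N_PLUS_1 = 0.0441      # N+1 generator configuration
P_FAIL_GEN_2_N_PLUS_1 = 0.0294    # 2(N+1) generator configuration

# Absolute difference, expressed in percentage points.
absolute_drop_pp = (P_FAIL_GEN_N_PLUS_1 - P_FAIL_GEN_2_N_PLUS_1) * 100

# Relative reduction in failure probability from the extra generators.
relative_drop = (P_FAIL_GEN_N_PLUS_1 - P_FAIL_GEN_2_N_PLUS_1) / P_FAIL_GEN_N_PLUS_1

print(f"Absolute drop: {absolute_drop_pp:.2f} percentage points")
print(f"Relative drop: {relative_drop:.0%}")
```

The absolute drop is 1.47 percentage points, which is about a one-third reduction in the quoted failure probability. Whether that is worth the cost of a second generator set is the kind of business decision the presentation was about.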
Satisfying even the highest Tier IV requirement in Uptime’s rating system doesn’t require prime-rated generators. A standby rating is enough. Uptime’s requirements call for a generator that will run continuously, even during maintenance. That’s a guarantee all major generator manufacturers will readily provide, satisfying the requirement, Shapiro said.
Tier III and Tier IV requirements do, however, call for redundant power distribution from the generator plant and for the fuel supply infrastructure to be concurrently maintainable or fault-tolerant.
15 percent of generators fail after eight hours of running
Generator redundancy is important because generators aren’t infallible. Even if a generator starts successfully, and the facility switches to backup power without incident, things change when generators have to run for prolonged periods of time.
Hurricane Sandy’s aftermath in New York provided a rare test of generator reliability over long runs, and many generators failed it. A number of facilities operated by Morrison Hershfield clients switched to generator power and saw the lights go down after hours of operation, Shapiro said. The failures happened for different reasons, but in one case, a genset failed when it reached the bottom of the fuel tank and took in impurities that had accumulated there and were not filtered out.
He cited a study by the Idaho National Engineering Laboratory that found that 15 percent of emergency diesel generators failed after eight hours of continuous operation; 1 percent failed after 24 hours; 5 percent failed after half an hour; and 2 percent failed to start.
Tier requirements alone won’t determine reliability
While Uptime’s Tier system defines reliability of infrastructure design, there are many factors that affect reliability beyond design. They include site location, construction of the building, quality of the equipment, the commissioning process, age of the site, operations and maintenance practices of the management, personnel training and level of personnel coverage.
Corrected: A previous version of the article erroneously said Uptime’s Tier IV requirements did not call for redundant generators. They do, and the article has been corrected accordingly. Tier IV doesn’t require prime-rated generators; a standby rating is satisfactory.