In an unfortunate series of unrelated equipment failures, Internap recently experienced three outages at its Manhattan data centers within a single week.
The May 16 outage at 111 8th Avenue we reported on earlier was followed by two outages of the hosting service provider’s data center at 75 Broad Street. All three were caused by component failures in uninterruptible power supply systems.
The first outage at 75 Broad occurred around 10:30 a.m. on May 20 as a result of a failure of electrical capacitors on one of the data center’s UPS systems. This was “a little bit unusual to say the least,” Mike Higgins, senior vice president of data center services at Internap, said.
Such capacitors have an average lifespan of seven years, and the ones that failed at 75 Broad were replaced about three years ago. Moreover, Internap’s UPS vendors service the systems annually.
‘Single-corded’ tenants suffer
The data center is a “legacy site,” which essentially means it is old and does not have the most recent and advanced support infrastructure. It does have a redundant UPS system, but not all tenants choose to use the redundancy because it costs more.
When capacitors failed on one of the systems, customers that did pay for the dual feed were automatically switched to the other UPS, but the “single-corded” customers saw their power go down for about three seconds while the system switched them to the utility power feed.
Each of the customers in the latter group stayed down for as long as it took them to bring their servers back up and return their systems to normal. They remained on non-UPS utility power until the next morning, when the damaged capacitors were replaced.
Higgins could not say how many of the facility’s tenants were in the single-corded group, but said it was fewer than half.
Faulty breaker causes more downtime
The same group of customers saw their equipment go down two days later, around 10:40 p.m. on May 22. The second outage was caused by the failure of an output breaker on a UPS system.
Again, customers that opted for redundant power distribution continued business as usual, while single-corded customers were without power for about four hours until Internap’s engineers finished replacing the breaker, Higgins said.
Because the faulty output breaker blocked all power distribution downstream of the UPS, the affected customers’ equipment could not be switched over to utility power as it was during the first outage. “We couldn’t get to utility power quickly enough,” Higgins said.
Internap keeps an inventory of spare breakers on site, so it was a matter of getting technicians to the building late in the evening and replacing the part. “Fortunately for us we have very responsive vendors,” Higgins said. “Before business hours we had everybody up and running.”
Outage at 111 8th also breaker-related
The problems at 75 Broad came just a few days after Internap’s other Manhattan data center, the one in the Google-owned carrier hotel at 111 8th Avenue, went down. That outage disrupted operations for about 20 customers, including the well-known online video streaming platform Livestream and Stack Exchange, a network of well-trafficked websites for developers.
Coincidentally, the 111 8th outage was also caused by an output breaker on a UPS system. Internap has a higher degree of redundancy at this facility, but because the system registered the issue as a “major fault,” it took the UPS out of the lineup and shut off the utility feed.
While the failover system at 75 Broad did what it was designed to do during both incidents there, the system at 111 8th did not. The failure at “111 8th was very odd,” Higgins said. “We’ve never had an outage there [in the past]. Ever.”
He declined to say who Internap’s UPS vendors in Manhattan were, but said they were well-recognized leading brands. Whether the provider will hold the vendors liable for the incidents remains to be seen. There is still a lot of work to be done to get to the root of each failure.
Tenants don’t mind ‘sharing pain’
Internap management, however, was not lucky enough to get a grace period before getting an earful from some of its disgruntled tenants. “There were customers today that shared their feelings with me, and I was in the ‘listen’ mode,” Higgins said the day after the second outage at 75 Broad. “They experienced some customer pain, so they shared that pain with us.”
Internap warns customers who go for the cheaper single-corded option that they are choosing the less reliable route. Safeguards against legal action by such customers are written into their contracts with the provider, Higgins said.