A UPS failure caused a power outage today at a Rackspace data center in London, leaving several hundred servers offline for hours as technicians needed to help restarting equipment. The incident occurred at the LON03 data center, one of several facilities in the company’s growing London operation.
The power interruption started at 9:19 a.m. local time when a module failed on an uninterruptible power supply (UPS) and the unit failed to transfer the load properly, Rackspace said in its status update. Power was restored for most customers by 11:30 a.m., but a subset of servers failed to restart properly.
“Although the majority of devices were down for a minimal amount of time we had particular issues with a group of about 220 Whiteboxes which have taken several hours to bring up,” Rackspace reported. “DCOps have had to manually intervene to bring the servers back online. In numerous instances they had to replace power supplies in servers, replace firewalls, reconfigure switches and to log on to servers to get them to boot properly. We can only apologise for this incident. We take this type of incident extremely seriously and will work to fix the root cause.”
Rackspace has experienced a number of outages this year, but all had involved servers housed at its Dallas/Fort Worth data center.