An Equinix Inc. (EQIX) data center near Paris suffered a cooling outage late Wednesday, leaving some customer services offline for hours as the provider relied on backup cooling units to bring temperatures in the facility back under control. The incident occurred on the second floor of the Paris 2 (PA2) IBX data center in St. Denis, France, the same facility where Equinix suffered an outage in early July.
The problems began late Wednesday afternoon as temperatures in Paris rose into the upper 90s Fahrenheit. “Multiple chillers that support the second floor failed, and the standby chiller system did not start in time to absorb the load,” Michel Brignano, General Manager of Equinix France, wrote in an incident report posted to the FrNOG mailing list.
“This impacted temperatures on the second floor and had an indirect effect on the ground floor as well,” Brignano continued. “The specific causes of the failures are still under investigation, but there appear to have been component/subsystem failures in at least two of the three primary chiller systems supporting the second floor. At this time, we cannot say definitively whether the failures were related or not.”
Temperatures Rise in Data Center
Dedicated server host SD-France reported that inlet temperatures inside the Paris 2 data center soared above 50 degrees C (122 degrees Fahrenheit). French web host and registrar Gandi had to discontinue all services it hosted at the Equinix Paris 2 center, leaving some customers offline for as long as eight hours. “This impacts pretty much a quarter of our hosting and all sitemaker services as well as one of our DNS clusters,” Gandi said in its incident report.
Outside temperatures in Paris reached 97 degrees on Wednesday and 90 degrees on Thursday as Equinix staff continued working to restore full cooling capacity at the Paris 2 data center. “Equinix engineers and supplier support personnel were dispatched to the site (Wednesday) evening and continue to work on the systems,” Brignano wrote.
“The full site returned to target temperature early (Thursday) morning using a combination of some primary capacity and the standby system (which subsequently started). While returning to target temperature this morning was important, we were well aware that the system would be stressed again during the day and have been working to get the failed systems repaired and to undertake other interim steps to avoid a recurrence.
Recovery Work Continuing
“At this point, one of the failed systems has been partially repaired and is providing some cooling supply,” Brignano added in his update at midday Thursday. “Work is continuing on the other units and additional engineers are en route to the site to diagnose and repair the units. Because the site is operating with less than full operating infrastructure, on-site engineers are closely monitoring key cooling equipment and the UPS units. In addition, we are using a number of tactics to maximize the performance of the cooling equipment that is operating.”
SD-France shared its unhappiness about the outage in its public report to customers. “With regard to this last incident, what disappointed us the most is the indisputable lack of communication, transparency and reactivity of Equinix,” SD-France wrote.
The Equinix Paris 2 outage in July was caused by human error by a vendor conducting routine maintenance on a UPS system, according to Equinix. The mistake took one UPS unit offline, dropping power to a third of the PA2 center for one minute.