An Amazon Web Services data center in northern Virginia lost power Friday night, causing extended downtime for services including Netflix, Heroku, Pinterest, Instagram and many others. The incident occurred as a powerful electrical storm struck the Washington, D.C. area, leaving as many as 1.5 million residents without power.
The data center in Ashburn, Virginia that hosts the US-East-1 region lost power for about 30 minutes, but customers were affected for a longer period as Amazon worked to recover virtual machine instances. “We can confirm that a large number of instances in a single Availability Zone have lost power due to electrical storms in the area,” Amazon reported at 8:30 pm Pacific time. An update 20 minutes later said that “power has been restored to the impacted Availability Zone and we are working to bring impacted instances and volumes back online.”
By 1:42 am Pacific time, Amazon reported that it had “recovered the majority of EC2 instances and are continuing to work to recover the remaining EBS (Elastic Block Storage) volumes.”
UPDATE: While most Amazon customers recovered within several hours, a number of prominent services were offline for much longer. The photo-sharing service Instagram was unavailable until about noon Pacific time Saturday, more than 15 hours after the incident began. Cloud infrastructure provider Heroku, which runs its platform atop AWS, reported 8 hours of downtime for some services.
Latest in Series of Outages
The outage marked the second time this month that the Amazon data center hosting the US-East-1 region lost power during a utility outage. Major data centers are equipped with large backup generators to maintain power during utility outages, but the Amazon facility was apparently unable to make the switch to backup power.
Amazon experienced an outage June 15 in its US-East-1 region that was triggered by a series of failures in the power infrastructure, including the failure of a generator cooling fan while the facility was on emergency power. The same data center also had problems early Friday, when customers experienced connectivity issues.
Even Netflix Impacted
The latest outage was unusual in that it affected Netflix, a marquee customer for Amazon Web Services that is known to spread its resources across multiple AWS availability zones, a strategy that allows cloud users to route around problems at a single data center. Netflix has remained online through past AWS outages affecting a single availability zone.
Adrian Cockcroft, the Director of Architecture at Netflix, said the problem was a failure of Amazon’s Elastic Load Balancing service. “We only lost hardware in one zone, we replicate data over three,” Cockcroft tweeted. “Problem was traffic routing was broken across all zones.”
The Washington area was hit by powerful storms late Friday that left two people dead and more than 1.5 million residents without power. Dominion Power’s outage map showed that sporadic outages continued to affect the Ashburn area. Although the storm was intense, there were no immediate reports of other data centers in the region losing power. Ashburn is one of the busiest data center hubs in the country, and home to key infrastructure for dozens of providers and hundreds of Internet services.
Here’s a look at the Twitter updates from some of the companies affected:
We’re sorry for the outage and working to get your Friday streaming back to normal as quickly as possible. Thank you for bearing with us.
— Netflix (@netflix) June 30, 2012
Pinterest is currently unavailable due to server outages. Our goal is to be back up by 10:30PM PST. Thanks for your patience!
— Pinterest (@Pinterest) June 30, 2012
It looks like AWS is having issues again,we have numerous EC2 instances that are no longer responding,we will let you know when we know more
— dotcloud status (@dotcloudstatus) June 30, 2012
We’re currently experiencing technical difficulties and we’re working to correct the issues. Thanks for your patience
— Instagram Support (@InstagramHelp) June 30, 2012