Amazon Data Center Loses Power During Storm

Lightning has figured in several data center power incidents. But there are steps you can take to prepare (Source: NOAA).

An Amazon Web Services data center in northern Virginia lost power Friday night, causing extended downtime for services including Netflix, Heroku, Pinterest and many others. The incident occurred as a powerful electrical storm struck the Washington, D.C. area. Read More

More Problems for Amazon EC2 Cloud

It was a stormy week in the cloud, as an outage at Amazon Web Services affected some customers and sparked discussion about resiliency strategies. (Photo by BCP via Flickr.)

Amazon Web Services is reporting more service problems this morning for some customers of its EC2 cloud computing service. Amazon has reported connectivity issues in its US-East-1 availability zone, the same zone that was hit by an outage earlier this month. Read More

Generator Fan Failure Triggered AWS Outage


Last week’s outage at Amazon Web Services was triggered by a series of failures in the power infrastructure in a northern Virginia data center, including the failure of a generator cooling fan while the facility was on emergency power. Read More

Power Outage Affects Amazon Customers

A slide of a data center from a presentation at the Amazon Technology Open House.

A power outage at an Amazon Web Services data center in northern Virginia last night knocked some customers offline. Among the sites affected were Heroku, Pinterest, Quora and HootSuite, along with a host of smaller sites. Read More

Warm Weather Clobbers Climate Data Archive

The unseasonably warm weather across much of the country has meteorologists and weather enthusiasts searching for precedents – and doing so in large enough numbers that it caused an outage for a leading repository of historic weather data. Read More

Windows Azure Cloud Hit By Downtime

Microsoft’s Windows Azure cloud service has been hit with a series of performance problems today, leaving customers unable to manage their applications for about 8 hours and knocking Azure-based services offline for some North American users. Read More

Anticipating The Perfect Storm of Impossible Events

Jesse Robbins of Opscode says that resiliency is a function of culture, as well as engineering. “You cannot learn the lessons of failure without experiencing it,” said Robbins. That can be a difficult message for IT operations teams that view downtime as an enemy to be avoided at all costs. Read More

Additional Downtime Articles