It was a stormy week in the cloud, as an outage at Amazon Web Services affected some customers and sparked discussion about resiliency strategies. (Photo by BCP via Flickr.)

More Problems for Amazon EC2 Cloud

Amazon Web Services is reporting another service outage this morning for some customers of its EC2 cloud computing service. Amazon has reported connectivity issues in a single availability zone in its US-East-1 region, the same region that was hit by an outage earlier this month.

The problems began at about 10:45 a.m. Eastern time, and were confirmed by Amazon a short time later. “We can confirm network connectivity issues for some EC2 instances in a single Availability Zone in the US-EAST-1 region,” Amazon reported in its Service Health Dashboard. “Customers may be experiencing impaired read/write access to their EBS (Elastic Block Store) volumes. New instance launches are also delayed. We are applying mitigations to address the connectivity issues … and connectivity is beginning to recover.” dotCloud also reported downtime due to the AWS problems.

UPDATE: As of 12:30 p.m. Eastern, Amazon reports progress. “Connectivity has been restored to the affected subset of EC2 instances and EBS volumes in the single Availability Zone in the US-EAST-1 region. New instance launches are completing normally. Some of the affected EBS volumes are still re-mirroring causing increased IO latency for those volumes.”

Amazon experienced an outage June 15 in its US-East-1 region that was triggered by a series of failures in the power infrastructure in a northern Virginia data center, including the failure of a generator cooling fan while the facility was on emergency power. The downtime affected AWS customers including Heroku, Pinterest, Quora and HootSuite, along with a host of smaller sites.

Today’s problems seem to have affected fewer customers than the June 15 incident. One service reporting availability problems was the AppFog platform. “More AWS outages this morning (EC2, RDS, EBS), attempting to work around as best as we can,” the company reported on its Twitter feed. “Sorry for any inconvenience this has caused.”

It’s not clear whether the smaller number of visible customer problems was due to the issue being more limited, or whether companies affected by the incident two weeks ago have since opted to extend their infrastructure across additional EC2 availability zones, as recommended by Amazon.

Today’s incident was the fourth in the last 14 months for the US-East-1 region, which is Amazon’s oldest region and resides in a data center in Ashburn, Virginia. US-East-1 also had downtime in April 2011 and another less serious incident in March.

 

About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.

6 Comments

  1. The CTOs and sysadmins for all the companies should be fired Monday morning... Let's not blame the weather but instead blame the fact that these companies allowed a "failure point" in one datacenter to affect their entire chain of service instead of using "high availability" best practices.

  2. Today, Newvem will publish usage tips and recommendations on how to prevent and protect from AWS outages. Go here to read more - http://goo.gl/XFqmz