After The Storm: Architecting AWS for Reliability

It was a stormy week in the cloud, as an outage at Amazon Web Services affected some customers and sparked discussion about resiliency strategies. (Photo by BCP via Flickr.)

The recent data center outages at Amazon Web Services have focused attention on how to deploy applications on AWS to avoid downtime. The breadth of Amazon’s cloud computing infrastructure offers many options for routing around problems in a single availability zone or region, and the most recent outage, on Friday night, prompted much discussion around the web about how to architect AWS apps for maximum reliability. Here’s a sampling of notable analysis, commentary and resources:

EC2 / EBS Outage: Lessons Learned – From Agile Sysadmin: “Don’t treat AWS like a traditional datacenter. Amazon provides up to four availability zones per region, and a range of free and paid-for tools for using them. Techniques for taking advantage of these features range from as simple as using elastic IP addresses and remapping manually to a different zone, to using multi-availability zone RDS instances to replicate database updates across zones.”
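As a rough illustration of the manual remapping idea in that quote, the sketch below picks a healthy standby in a different availability zone from the failed instance. The Instance type, the instance IDs and the pick_failover_target helper are hypothetical names invented for this example. It covers only the selection step. Moving the elastic IP itself would go through the EC2 API (the AssociateAddress action), which is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Instance:
    instance_id: str  # hypothetical identifier, for illustration only
    zone: str         # availability zone the instance runs in
    healthy: bool     # result of whatever health check you already run


def pick_failover_target(
    current: Instance, candidates: Sequence[Instance]
) -> Optional[Instance]:
    """Return the first healthy standby outside the current instance's zone."""
    for candidate in candidates:
        if candidate.healthy and candidate.zone != current.zone:
            return candidate
    return None  # nothing suitable; a cross-region plan would be needed


# Assumed example data: the primary has failed along with its zone.
primary = Instance("i-primary", "us-east-1a", healthy=False)
standbys = [
    Instance("i-standby-a", "us-east-1a", healthy=True),
    Instance("i-standby-b", "us-east-1b", healthy=True),
]

target = pick_failover_target(primary, standbys)
print(target.instance_id if target else "no standby available")
```

Skipping standbys in the same zone as the failed instance is the point of the exercise: a standby that shares the zone is likely to share the outage.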

Rainforest: Guide to a Cross-Region Strategy – From the Rainforest blog: “So, you’re hosted in a single zone, and if you’re in US-East presumably went down last night. Stop it. You need to use more than just multiple zones if you want to stay up during one of these outages. Cross region is the answer. The following is a rough guide to making your website available regardless of ‘catastrophic’ events.”

How to Synch S3 Buckets in AWS and design for failover – From Dan Morrill at CloudAve: “Synching S3 buckets is fairly easy, and there is no reason not to do it, you can cut your own custom script that will determine all the contents in an S3 bucket and make sure that even though the data centers are somewhat isolated ensure the contents are the same in each S3 bucket. You can use a freeware/minimal cost software like S3CMD from S3 tools to help you out if you need a baseline system command set to do this.”
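The custom script described in that quote boils down to comparing two bucket listings and working out what has to change. The sketch below shows only that comparison, under the assumption that each listing has already been fetched into a dictionary mapping object key to a content fingerprint such as an ETag. The plan_sync helper and the sample listings are hypothetical. Fetching the listings and copying the objects would be done with the S3 API or a tool such as s3cmd.

```python
from typing import Dict, List, Tuple


def plan_sync(
    source: Dict[str, str], replica: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Compare two listings (key -> fingerprint) and return (to_copy, to_delete).

    to_copy:   keys missing from the replica or whose fingerprint differs.
    to_delete: keys present in the replica but no longer in the source.
    """
    to_copy = sorted(
        key for key, fingerprint in source.items()
        if replica.get(key) != fingerprint
    )
    to_delete = sorted(key for key in replica if key not in source)
    return to_copy, to_delete


# Assumed example listings; real ones would come from listing each bucket.
source_listing = {"index.html": "a1", "logo.png": "b2", "app.js": "c3"}
replica_listing = {"index.html": "a1", "logo.png": "old", "stale.css": "d4"}

to_copy, to_delete = plan_sync(source_listing, replica_listing)
print("copy:", to_copy)
print("delete:", to_delete)
```

Keeping the planning step separate from the transfer step makes it easy to review or log what a sync run would do before any data moves between buckets.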

Ask HN: Best setups to Avoid Outages on AWS – This thread at Hacker News discusses the question: “Using only AWS services, what do you put in place to help prevent disruptions when a single availability zone goes down?”

AWS Architecture Center: This Amazon resource provides reference architectures and white papers on best practices, disaster recovery and building fault-tolerant applications on AWS. Amazon’s description: “The AWS Architecture Center is designed to provide you with the necessary guidance and best practices to build highly scalable and reliable applications in the AWS Cloud. These resources will help you understand the AWS platform, its services and features, and will provide architectural guidance for design and implementation of systems that run on the AWS infrastructure.”



About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.