The recent data center outages at Amazon Web Services have focused attention on how best to deploy applications on AWS to avoid downtime. The breadth of Amazon's cloud computing infrastructure offers many options for routing around problems in a single availability zone or region, and the most recent outage on Friday night prompted much discussion around the web about how to architect AWS apps for maximum reliability. Here's a sampling of notable analysis, commentary and resources:
EC2 / EBS Outage: Lessons Learned - From Agile Sysadmin: "Don’t treat AWS like a traditional datacenter. Amazon provides up to four availability zones per region, and a range of free and paid-for tools for using them. Techniques for taking advantage of these features range from as simple as using elastic IP addresses and remapping manually to a different zone, to using multi-availability zone RDS instances to replicate database updates across zones."
Rainforest: Guide to a Cross-Region Strategy - From the Rainforest blog: "So, you’re hosted in a single zone, and if you’re in US-East presumably went down last night. Stop it. You need to use more than just mulitple zones if you want to stay up during one of these outages. Cross region is the answer. The following is a rough guide to making your website available regardless of ‘catastrophic’ events."
How to Synch S3 Buckets in AWS and design for failover - From Dan Morrill at CloudAve: "Synching S3 buckets is fairly easy, and there is no reason not to do it, you can cut your own custom script that will determine all the contents in an S3 bucket and make sure that even though the data centers are somewhat isolated ensure the contents are the same in each S3 bucket. You can use a freeware/minimal cost software like S3CMD from S3 tools to help you out if you need a baseline system command set to do this."
Ask HN: Best setups to Avoid Outages on AWS - This thread at Hacker News discusses the question: "Using only AWS services, what do you put in place to help prevent disruptions when a single availability zone goes down?"
AWS Architecture Center - This Amazon resource provides reference architectures and white papers on best practices, disaster recovery and building fault-tolerant applications on AWS. Amazon's description: "The AWS Architecture Center is designed to provide you with the necessary guidance and best practices to build highly scalable and reliable applications in the AWS Cloud. These resources will help you understand the AWS platform, its services and features, and will provide architectural guidance for design and implementation of systems that run on the AWS infrastructure."
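For readers curious what the "custom script" in the CloudAve item might look like, here is a minimal sketch in Python of the comparison step only. It assumes each bucket's contents have already been listed into a plain dictionary mapping object key to a content hash. In a real script those listings would come from the S3 API or a tool such as S3CMD. The names `SyncPlan` and `plan_bucket_sync` are hypothetical and do not come from any of the linked articles.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_copy: List[str]    # keys missing from the replica, or with different content
    to_delete: List[str]  # keys that exist only in the replica


def plan_bucket_sync(primary: Dict[str, str], replica: Dict[str, str]) -> SyncPlan:
    """Compare two bucket listings (key -> content hash) and plan a one-way sync.

    Assumption: the listings were fetched elsewhere; no AWS calls are made here.
    """
    # Copy anything the replica lacks or holds with a different hash.
    to_copy = sorted(k for k, h in primary.items() if replica.get(k) != h)
    # Remove anything the replica has that the primary no longer does.
    to_delete = sorted(k for k in replica if k not in primary)
    return SyncPlan(to_copy, to_delete)


if __name__ == "__main__":
    # Hypothetical listings for a primary bucket and its replica in another region.
    primary = {"index.html": "a1", "logo.png": "b2", "app.js": "c3"}
    replica = {"index.html": "a1", "logo.png": "old", "stale.css": "d4"}

    plan = plan_bucket_sync(primary, replica)
    print("copy:", plan.to_copy)
    print("delete:", plan.to_delete)
```

Keeping the comparison separate from the transfer step means the plan can be inspected or logged before anything is copied or deleted, which is a reasonable precaution when the replica is the failover copy.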
And some Twitter commentary from Lori MacVittie of F5:
Architectural analogy of the day: Relying on single site/region is like load balancing one server.
— Lori MacVittie (@lmacvittie) July 2, 2012