Amazon S3 Issues: Load Balancers and MD5


Amazon’s S3 storage system had some data corruption issues last week, which came to light when developers using MD5 checksums to verify the integrity of their files saw mismatches. After some investigation, Amazon confirmed the problems and identified the cause:

We’ve isolated this issue to a single load balancer that was brought into service at 10:55pm PDT on Friday, 6/20. It was taken out of service at 11am PDT Sunday, 6/22. While it was in service it handled a small fraction of Amazon S3’s total requests in the US. Intermittently, under load, it was corrupting single bytes in the byte stream. … Based on our investigation with both internal and external customers, the small amount of traffic received by this particular load balancer, and the intermittent nature of the above issue on this one load balancer, this appears to have impacted a very small portion of PUTs during this time frame.

There are several follow-ups of note: Alistair Croll at GigaOm looks at the role of load balancers in cloud platforms, while Craig Balding of Cloud Security examines the MD5 issues.
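
To illustrate why an MD5 check catches the kind of single-byte corruption Amazon describes, here is a minimal Python sketch. It is not taken from Amazon or from any affected developer's code: the helper names (`md5_hex`, `content_md5`, `flip_one_byte`, `transfer_is_intact`) and the sample payload are made up for illustration, and the corruption is simulated locally rather than sent through a real load balancer. The `content_md5` helper shows the base64 form used by the standard HTTP Content-MD5 header, which a client can send with an upload so the receiving service can reject a body that does not match.

```python
import base64
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def content_md5(data: bytes) -> str:
    """Return the base64-encoded MD5 digest, the format of a Content-MD5 header."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def flip_one_byte(data: bytes, index: int) -> bytes:
    """Simulate in-flight corruption by inverting the bits of a single byte."""
    corrupted = bytearray(data)
    corrupted[index] ^= 0xFF
    return bytes(corrupted)


def transfer_is_intact(sent: bytes, received: bytes) -> bool:
    """Compare digests of what was sent and what arrived."""
    return md5_hex(sent) == md5_hex(received)


if __name__ == "__main__":
    payload = b"example object body" * 1000      # hypothetical upload
    damaged = flip_one_byte(payload, 4242)       # one corrupted byte

    print(transfer_is_intact(payload, payload))  # True
    print(transfer_is_intact(payload, damaged))  # False
```

A client that skips this comparison has no signal that the stored object differs from the original, which is why only the developers who coded MD5 checks noticed the problem.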

About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.

One Comment

  1. Thanks for the link. One thing to clarify: it was the developers using MD5 that identified there was data corruption. Or to put it another way, the developers that did not code MD5 checks would not be aware that their transfers were getting silently corrupted by the load balancer. Cheers, Craig