Glacier: Cold Cloud Storage From Amazon

Will the archival of huge volumes of rarely used files move to the public cloud? Amazon Web Services today rolled out Glacier, a new “cold storage” service that expands the company’s cloud computing platform. Glacier offers archive storage at a cost of one penny per gigabyte per month. The tradeoff: the archives won’t be available instantaneously (hence the “Glacier” moniker) and there will be charges for accessing backups. The announcement immediately prompted questions about how Glacier works and how it may impact the storage world. Here’s a roundup of notable analysis and commentary from around the web, starting with the Amazon team and including early takes from analysts and journalists.

Amazon Web Services Blog – Jeff Barr outlines the use case for Glacier: “Amazon Glacier differs from S3 in two crucial ways. First, S3 is optimized for rapid retrieval (generally tens to hundreds of milliseconds per request). Glacier is not (we didn’t call it Glacier for nothing). With Glacier, your retrieval requests are queued up and honored at a somewhat leisurely pace. Your archive will be available for downloading in 3 to 5 hours. … Retrieval requests are priced differently, too. You can retrieve up to 5% of your average monthly storage, pro-rated daily, for free each month. Beyond that, you are charged a retrieval fee starting at $0.01 per Gigabyte (see the pricing page for details). So for data that you’ll need to retrieve in greater volume more frequently, S3 may be a more cost-effective service.”
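Barr’s description of the free retrieval tier can be sketched as a quick calculation. This is a simplified model built only from the figures quoted above (5% of stored data per month, pro-rated daily, with overage at $0.01/GB); the actual formula on Amazon’s pricing page has additional detail, so treat this as an illustration rather than a billing tool.

```python
def retrieval_cost(stored_gb, retrieved_gb_today, overage_rate=0.01):
    """Estimate one day's Glacier retrieval fee under the simplified
    5%-per-month free tier described in the AWS blog post."""
    daily_free_gb = stored_gb * 0.05 / 30      # 5% per month, pro-rated daily
    billable_gb = max(0.0, retrieved_gb_today - daily_free_gb)
    return billable_gb * overage_rate

# Example: 6 TB archived, 20 GB retrieved in one day.
# 10 GB falls in the daily free allowance; the remaining 10 GB is billed.
print(retrieval_cost(6000, 20))   # -> 0.10 (ten cents)
```

Under this model, a customer who spreads retrievals evenly across the month never pays a retrieval fee, which matches Barr’s point that Glacier suits data you pull back rarely and in small volumes.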

All Things Distributed – Amazon CTO Werner Vogels says small companies are a key market for Glacier: “Although archiving is often associated with established enterprises, many SMBs and startups have similar archival needs, but dedicated archiving solutions have been out of their reach (either due to the upfront capital investments required or the lack of bandwidth to deal with the operational burden of traditional storage systems). With Amazon Glacier any organization now has access to the same data archiving capabilities as the world’s largest organizations. We see many young businesses engaging in large-scale big-data collection activities, and storing all this data can become rather expensive over time; archiving their historical data sets in Amazon Glacier is an ideal solution.”

Perspectives – Amazon’s James Hamilton looks at the economics of long-term storage: “Cold storage is unusual because the focus needs to be singular. How can we deliver the best price per capacity now and continue to reduce it over time? … Cold storage is a natural cloud solution in that the cloud can provide the volume economics and allow even small-scale users to have access to low-cost, off-site, multi-datacenter, cold storage at a cost previously only possible at very high scale. Implementing cold storage centrally in the cloud makes excellent economic sense in that all customers can gain from the volume economics of the aggregate usage.”

GigaOm Cloud – Barb Darrow has her eye on the competitive landscape: “AWS Glacier is cheap, slow and Amazon hopes startups find it the perfect place to put files that aren’t accessed very often. And if it takes off it could become a problem for the existing backup and recovery business (which is often the first offering many smaller telco cloud providers launch to customers). Amazon argues that backup services generally require an upfront payment to a vendor, are generally over-provisioned because no one wants to be the guy who lost a key file because he cheaped out on hard drives and are thus more expensive than they need to be. So for less than a penny per gigabyte per month Amazon can keep your genome data, files required for regulatory compliance and even archival data.”

ZDNet – Jack Clark digs into the hardware behind Glacier: “Asked what IT equipment Glacier uses, Amazon told ZDNet it does not run on tape. ‘Essentially you can see this as a replacement for tape,’ a company spokesman said via email. Instead, Glacier runs on ‘inexpensive commodity hardware components,’ he said, noting that the service is designed to be hardware-agnostic. This suggests the system will be based on very large storage arrays consisting of a multitude of high-capacity low-cost discs.”

Ars Technica – Ars also wonders about the back-end infrastructure: “We don’t know exactly how Amazon measures the reliability of its storage, but the company is promising 11 nines of annual durability (99.999999999 percent) for each item, with data stored ‘in multiple facilities and on multiple devices within each facility.’ While Amazon says ‘Glacier can sustain the concurrent loss of data in two facilities,’ there is still risk data could be lost forever. If you store 1TB, Amazon’s promised durability rate suggests you can expect to lose an average of 10 bytes per year. Amazon is betting that will be an acceptable risk for the service’s low price.”
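Ars’ “10 bytes per year” figure is a straightforward back-of-the-envelope calculation. The sketch below treats the 11-nines durability as an annual per-byte survival probability, which is a simplification (Amazon states durability per object, not per byte), but it reproduces the arithmetic:

```python
# Expected annual data loss at eleven nines of durability, applied
# naively per byte. This is an illustration of the Ars math, not a
# statement of how Amazon actually measures durability.
durability = 0.99999999999          # 99.999999999%: "eleven nines"
stored_bytes = 1e12                 # 1 TB
expected_loss = stored_bytes * (1 - durability)
print(expected_loss)                # roughly 10 bytes per year
```

The same math at petabyte scale yields an expected loss of about 10 KB per year, which helps explain why Amazon frames the durability promise in nines rather than bytes.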

PlanForCloud – The cloud cost forecasting service does some quick math on Amazon’s storage services: “We just ran a quick cost forecast and it’s interesting: If you start with 100GB then add 10GB/month, it would cost $102.60 after 3 years on AWS Glacier vs $1,282.50 on AWS S3!”
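PlanForCloud’s figures can be reproduced by summing the monthly storage balance over three years. The rates below are assumptions consistent with 2012 list prices ($0.01/GB-month for Glacier, $0.125/GB-month for S3 standard storage in the first-terabyte tier); retrieval and request fees are ignored:

```python
def storage_cost(rate_per_gb_month, initial_gb=100, growth_gb=10, months=36):
    """Total storage bill: the balance grows by growth_gb each month,
    and each month's balance is billed at the given per-GB rate."""
    total_gb_months = sum(initial_gb + growth_gb * m
                          for m in range(1, months + 1))
    return rate_per_gb_month * total_gb_months

print(round(storage_cost(0.01), 2))    # Glacier: 102.6
print(round(storage_cost(0.125), 2))   # S3:      1282.5
```

The 12.5x gap comes entirely from the per-gigabyte rate; the comparison assumes the archive is essentially never retrieved, which is exactly the workload Glacier targets.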

Rick Branson – How does Amazon make the economics work? Branson, an infrastructure engineer at Instagram, offers some math: “Economics of AMZN Glacier: 3TB drives are about $0.003/mo/GB racked and powered + erasure encoding = thin, but survivable margins.”
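Branson’s figures imply a rough margin model. The erasure-coding overhead factor below is a hypothetical placeholder (e.g. storing 1.5 GB of raw capacity per GB of customer data); Amazon has not disclosed its redundancy scheme, so this sketch only illustrates the shape of the math:

```python
price_per_gb_month = 0.01      # Glacier list price
raw_cost_per_gb_month = 0.003  # Branson's estimate: 3TB drives, racked and powered
redundancy_factor = 1.5        # hypothetical erasure-coding overhead (assumption)

cost = raw_cost_per_gb_month * redundancy_factor
margin = (price_per_gb_month - cost) / price_per_gb_month
print(f"cost ${cost:.4f}/GB-month, gross margin {margin:.0%}")
```

Even at a heavier 2x redundancy factor the model leaves a positive margin, which is consistent with Branson’s “thin, but survivable” characterization.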

Image of Perito Moreno ice cliff in illustration via S. Rossi / CC BY 3.0

About the Author

Rich Miller is the founder and editor-in-chief of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.