Amazon cloud storage options enhanced with Glacier

StorageIO industry trend for storage IO

In case you missed it, Amazon Web Services (AWS) has enhanced their cloud services (Elastic Compute Cloud or EC2) along with storage offerings. These include Relational Database Service (RDS), DynamoDB, Elastic Block Store (EBS), and Simple Storage Service (S3). Enhancements include new functionality along with improved availability or reliability in the wake of recent events (outages or service disruptions). Earlier this year AWS announced their Cloud Storage Gateway solution; you can read an analysis of it here. More recently AWS announced provisioned IOPS among other enhancements (see the AWS what's new page here).

Amazon Web Services logo

Before announcing Glacier, options for Amazon storage services relied on general purpose S3, or EBS with other Amazon services. S3 has provided users the ability to select different regions (i.e. the geographical locations where data is stored) along with the level of reliability at different price points for their applications or services being offered.

Note that AWS S3 flexibility lends itself to individuals or organizations using it for various purposes. This ranges from storing backup or file sharing data to being used as a target for other cloud services. S3 pricing options vary depending on which region you select as well as whether you choose standard or reduced redundancy. As its name implies, reduced redundancy trades lower durability for a lower cost per given amount of space capacity.

AWS has now announced a new class or tier of storage service called Glacier, which as its name implies moves very slowly and is capable of supporting large amounts of data. In other words, it targets inactive or seldom accessed data where the emphasis is on ultra-low cost in exchange for a longer RTO. In exchange for an RTO that AWS states can be measured in hours, your monthly storage cost can be as low as 1 cent per GByte, or about 12 cents per year per GByte, plus any extra fees (see here).
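To put the penny per GByte figure in context, here is a minimal sketch of the storage-only arithmetic. It assumes the lowest quoted price of $0.01 per GByte per month and deliberately ignores the extra fees mentioned above (retrieval, requests, data transfer), so treat the result as a floor rather than a bill. The function name and its parameters are made up for illustration.

```python
from decimal import Decimal

# Lowest price quoted in the announcement (USD per GByte per month).
# Actual pricing varies by region; extra fees are NOT modeled here.
PRICE_PER_GB_MONTH = Decimal("0.01")


def glacier_storage_cost(gigabytes, months=12, price_per_gb_month=PRICE_PER_GB_MONTH):
    """Return the storage-only cost in USD for keeping `gigabytes` for `months`."""
    return Decimal(gigabytes) * price_per_gb_month * months


if __name__ == "__main__":
    # One GByte for a year: 12 months at one cent per month.
    print("1 GByte for a year:    $", glacier_storage_cost(1))
    # 1,000 GBytes for a year.
    print("1000 GBytes for a year: $", glacier_storage_cost(1000))
```

Using `Decimal` instead of floats keeps the cents exact, which matters once the capacity numbers get large.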

Here is a note that I received from the Amazon Web Services (AWS) team:

Dear Amazon Web Services Customer,
We are excited to announce the immediate availability of Amazon Glacier – a secure, reliable and extremely low cost storage service designed for data archiving and backup. Amazon Glacier is designed for data that is infrequently accessed, yet still important to keep for future reference. Examples include digital media archives, financial and healthcare records, raw genomic sequence data, long-term database backups, and data that must be retained for regulatory compliance. With Amazon Glacier, customers can reliably and durably store large or small amounts of data for as little as $0.01/GB/month. As with all Amazon Web Services, you pay only for what you use, and there are no up-front expenses or long-term commitments.

Amazon Glacier is:

  • Low cost – Amazon Glacier is an extremely low-cost, pay-as-you-go storage service that can cost as little as $0.01 per gigabyte per month, irrespective of how much data you store.
  • Secure – Amazon Glacier supports secure transfer of your data over Secure Sockets Layer (SSL) and automatically stores data encrypted at rest using Advanced Encryption Standard (AES) 256, a secure symmetric-key encryption standard using 256-bit encryption keys.
  • Durable – Amazon Glacier is designed to give average annual durability of 99.999999999% for each item stored.
  • Flexible – Amazon Glacier scales to meet your growing and often unpredictable storage requirements. There is no limit to the amount of data you can store in the service.
  • Simple – Amazon Glacier allows you to offload the administrative burdens of operating and scaling archival storage to AWS, and makes long term data archiving especially simple. You no longer need to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and repair, or time-consuming hardware migrations.
  • Designed for use with other Amazon Web Services – You can use AWS Import/Export to accelerate moving large amounts of data into Amazon Glacier using portable storage devices for transport. In the coming months, Amazon Simple Storage Service (Amazon S3) plans to introduce an option that will allow you to seamlessly move data between Amazon S3 and Amazon Glacier using data lifecycle policies.

Amazon Glacier is currently available in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), EU-West (Ireland), and Asia Pacific (Japan) Regions.

A few clicks in the AWS Management Console are all it takes to set up Amazon Glacier. You can learn more by visiting the Amazon Glacier detail page, reading Jeff Barr's blog post, or joining our September 19th webinar.
Sincerely,
The Amazon Web Services Team

What is AWS Glacier?

Glacier is low-cost, lower-performance (e.g. longer access time) storage suited to applications such as archiving and to inactive or idle data that you are not in a hurry to retrieve. Pricing is pay as you go and can be as low as $0.01 USD per GByte per month (other optional fees may apply, see here) depending on region. Regions include US West (Oregon or Northern California), US East (Northern Virginia), Europe (Ireland) and Asia Pacific (Tokyo).

Now, what is understood should not have to be discussed; however, just to be safe: pity the fool who complains about signing up for AWS Glacier at its penny per month per GByte cost and then finds it too slow for their iTunes or videos, as you know it's going to happen. Likewise, you know that some creative vendor or their surrogate is going to try to show a mismatch of AWS Glacier vs. their faster service that caters to a different usage model; it is just a matter of time.

Let's be clear, Glacier is designed for low-cost, high-capacity, slow access to infrequently accessed data such as an archive or other items. This means that you will be more than disappointed if you try to stream a video, or access a document or photo, from Glacier as you would from S3 or EBS or any other cloud service. The reason is that Glacier is designed on the premise of low cost, high capacity and high availability at the expense of slow access time or performance. How slow? AWS states that you may have to wait several hours to reach your data when needed; that is the tradeoff. If you need faster access, pay more or find a different class and tier of storage service to meet that need; for those with a real need for speed, perhaps AWS SSD capabilities ;).

Here is a link to a good post over at Planforcloud.com comparing Glacier vs. S3, which is like comparing apples and oranges; however, it helps to put things into context.

In terms of functionality, Glacier security includes Secure Sockets Layer (SSL) for data in transit, Advanced Encryption Standard (AES) 256 (256-bit encryption keys) encryption of data at rest, along with AWS Identity and Access Management (IAM) policies.

Storage is persistent and designed for 99.999999999% durability, with data automatically placed in different facilities on multiple devices for redundancy when it is ingested or uploaded. Self-healing is accomplished with automatic background data integrity checks and repair.
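Eleven nines is hard to picture, so here is a rough sketch that turns the stated design target into an expected loss rate. It assumes the 99.999999999% figure is an average annual per-item durability (as the AWS note above words it) and that item losses are independent, which is a simplification. The function names are hypothetical and only for illustration.

```python
from decimal import Decimal

# Design target quoted by AWS, as a percentage per item per year.
DURABILITY_PERCENT = "99.999999999"


def expected_annual_losses(item_count, durability_percent=DURABILITY_PERCENT):
    """Expected number of items lost per year, assuming independent losses."""
    loss_probability = Decimal(1) - Decimal(durability_percent) / Decimal(100)
    return loss_probability * item_count


def years_per_lost_item(item_count, durability_percent=DURABILITY_PERCENT):
    """Average number of years between single-item losses."""
    return Decimal(1) / expected_annual_losses(item_count, durability_percent)


if __name__ == "__main__":
    items = 10_000_000
    print("Expected losses per year:", expected_annual_losses(items))
    print("Years per lost item:", years_per_lost_item(items))
```

Under those assumptions, ten million stored items works out to an average of one lost item every 10,000 years. This is a design target, not a guarantee, and it says nothing about losses caused by your own mistakes such as accidental deletes.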

Scale and flexibility are bound by the size of your budget or credit card spending limit along with which regions and other options you choose. Glacier integrates with other AWS services, including Import/Export, where you can ship large amounts of data to Amazon on portable storage devices. Note that AWS has also made a statement of direction (SOD) that S3 will be enhanced to seamlessly move data in and out of Glacier using data lifecycle policies.

Part of stretching budgets for organizations of all sizes is to avoid treating all data and applications the same (a key theme of data protection modernization). This means classifying applications and data and addressing how and where they are placed on various types of servers and storage, along with revisiting and modernizing data protection.
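As a rough illustration of what classifying data might look like in practice, here is a small sketch that suggests a tier based on how long you can wait for the data and how often you expect to read it. The thresholds, the `DataSet` class and the `suggest_tier` function are all assumptions made up for this example; they are not AWS rules or an AWS API, and a real policy would weigh more factors (compliance, retrieval fees, object size).

```python
from dataclasses import dataclass

# Assumed thresholds for illustration only (not published AWS values).
ASSUMED_GLACIER_WAIT_HOURS = 4.0   # "several hours" to reach your data
ASSUMED_MAX_READS_PER_YEAR = 1.0   # "seldom accessed" cut-off


@dataclass
class DataSet:
    name: str
    tolerable_wait_hours: float     # how long you can wait for a restore
    expected_reads_per_year: float  # how often you expect to access it


def suggest_tier(data_set):
    """Suggest 'Glacier' for patient, rarely read data; otherwise 'S3'."""
    can_wait = data_set.tolerable_wait_hours >= ASSUMED_GLACIER_WAIT_HOURS
    rarely_read = data_set.expected_reads_per_year <= ASSUMED_MAX_READS_PER_YEAR
    return "Glacier" if can_wait and rarely_read else "S3"


if __name__ == "__main__":
    for ds in (
        DataSet("2005 financial records", tolerable_wait_hours=24, expected_reads_per_year=0.1),
        DataSet("video library", tolerable_wait_hours=0.01, expected_reads_per_year=500),
    ):
        print(ds.name, "->", suggest_tier(ds))
```

The point is simply that both conditions have to hold: data that is rarely read but needed within minutes when it is read does not belong on Glacier.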

While the low-cost of Amazon Glacier is an attention getter, I am looking for more than just the lowest cost, which means I am also looking for reliability, security among other things to gain and keep confidence in my cloud storage services providers. As an example, a few years ago I switched from one cloud backup provider to another not based on cost, rather functionality and ability to leverage the service more extensively. In fact, I could switch back to the other provider and save money on the monthly bills; however I would end up paying more in lost time, productivity and other costs.

What do I see as the barrier to AWS Glacier adoption?

Simple: getting vendors and other service providers to enhance their products or services to leverage the new AWS Glacier storage category. This means backup/restore, BC and DR vendors ranging from Amazon (e.g. releasing S3 to Glacier automated policy-based migration), Commvault, Dell (via their acquisitions of AppAssure and Quest), EMC (Avamar, Networker and other tools), HP, IBM/Tivoli, Jungledisk/Rackspace, NetApp, Symantec and others, not to mention cloud gateway providers, will need to add support for these new capabilities, along with those from other providers.

As an Amazon EC2 and S3 customer, it is great to see Amazon continue to expand their cloud compute, storage, networking and application service offerings. I look forward to actually trying out Amazon Glacier for storing encrypted archive or inactive data to complement what I am doing. Since I am not using the Amazon Cloud Storage Gateway, I am looking into how I can use Rackspace Jungledisk to manage an Amazon Glacier repository similar to how it manages my S3 stores.

Some more related reading:
Only you can prevent cloud data loss
Data protection modernization, more than swapping out media
Amazon Web Services (AWS) and the NetFlix Fix?
AWS (Amazon) storage gateway, first, second and third impressions

As of now, it looks like I will have to wait either for Jungledisk to add native support, as they have today for managing my S3 storage pool, or for the automated policy-based movement between S3 and Glacier to be transparently enabled.

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved