Glacier ….. 101


Glacier is a low-cost cold-storage service for data archival. It exposes a REST interface. Archives and vaults are the two main components of the Glacier service. A vault is a container for archives and has a name and a region. There is no limit on the number of archives a vault can hold.

An archive is any data that needs to be stored, and it is the base unit of data in Glacier. An archive can be a photo, a video, a ZIP file or a document. Each archive has a unique ID and an optional description. Glacier adds an average of 32 KB of metadata overhead per archive stored.

Glacier supports both IAM policies (identity-based) and vault policies (resource-based) for access control. IAM roles can be used to authenticate and grant access. Glacier is designed for eleven 9s of data durability (99.999999999%).

Glacier is very low cost, and charges fall mainly into four categories: storage charges ($0.004 per GB per month), retrieval charges ($0.0025/GB for Bulk, $0.01/GB for Standard and $0.03/GB for Expedited), data transfer charges, and PUT request charges ($0.05 per thousand objects). Large archives can be uploaded in parts, in parallel, using multipart uploads.
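To make the pricing concrete, here is a rough sketch of a monthly cost estimate built only from the figures quoted above. The function name and its parameters are made up for illustration, data transfer charges are left out because no rate is given here, and real prices vary by region and over time.

```python
# Prices quoted in this post (USD). Treat them as illustrative, not current.
STORAGE_PER_GB_MONTH = 0.004
RETRIEVAL_PER_GB = {"Bulk": 0.0025, "Standard": 0.01, "Expedited": 0.03}
PUT_PER_THOUSAND = 0.05


def estimate_monthly_cost(stored_gb, uploads=0, retrieved_gb=0.0, tier="Standard"):
    """Hypothetical helper: storage + PUT requests + retrieval, in USD.

    Data transfer charges are intentionally not modelled.
    """
    storage = stored_gb * STORAGE_PER_GB_MONTH
    requests = uploads / 1000 * PUT_PER_THOUSAND
    retrieval = retrieved_gb * RETRIEVAL_PER_GB[tier]
    return round(storage + requests + retrieval, 4)


# 1 TB stored, 2,000 archives uploaded, 50 GB pulled back with Bulk retrieval.
print(estimate_monthly_cost(1000, uploads=2000, retrieved_gb=50, tier="Bulk"))
```

In this example storage dominates the bill; retrieval only becomes significant when large volumes are restored through the faster tiers.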

Data retrieval options include:

  • Bulk retrieval typically completes within 5-12 hours.
  • Standard retrieval typically completes within 3-5 hours.
  • Expedited retrieval typically completes within 1-5 minutes.
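The trade-off between the three options is speed versus price. As a small sketch, the helper below picks the cheapest tier whose worst-case typical completion time still fits a deadline. The function name is hypothetical, and the table simply pairs the upper bounds listed above with the per-GB retrieval prices quoted earlier in this post.

```python
# (tier, upper bound of typical completion time in minutes, USD per GB retrieved)
TIERS = [
    ("Bulk", 12 * 60, 0.0025),
    ("Standard", 5 * 60, 0.01),
    ("Expedited", 5, 0.03),
]


def cheapest_tier(deadline_minutes):
    """Hypothetical helper: cheapest tier that typically finishes in time.

    Returns None when even the fastest tier may not meet the deadline.
    """
    candidates = [t for t in TIERS if t[1] <= deadline_minutes]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t[2])[0]


print(cheapest_tier(24 * 60))  # a full day to spare
print(cheapest_tier(6 * 60))   # needed within six hours
print(cheapest_tier(30))       # needed within half an hour
```

These times are typical figures rather than guarantees, so a real application should still handle jobs that take longer.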

Glacier also supports running queries directly on archived data without having to retrieve the entire archive. This feature is called Glacier Select. The select expression is a SQL query of the form:

SELECT s.id, s.name, s.city FROM user_info s WHERE  s.age > 21

A select expression can be passed to Glacier via a POST request or via the CLI. Upon completion, the output is written to the S3 bucket specified in the OutputLocation element of the request. In the request, we can specify the Type as SELECT, an ACL for the output object, and the Tier as Bulk, Standard or Expedited. Once the select output is written, SNS can notify the application.
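As a rough illustration of what such a request carries, the sketch below assembles the job parameters as a plain Python dictionary. The helper name, archive ID, bucket and prefix are placeholders, the CSV serialization settings are an assumption about the archived data, and the field names follow my reading of the Initiate Job request, so check them against the current API reference before relying on them.

```python
import json

VALID_TIERS = ("Bulk", "Standard", "Expedited")


def build_select_job(archive_id, expression, bucket, prefix, tier="Standard"):
    """Hypothetical helper: build job parameters for a Glacier Select request."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}, got {tier!r}")
    return {
        "Type": "select",
        "ArchiveId": archive_id,
        "Tier": tier,
        "SelectParameters": {
            # Assumes the archive is a CSV file with a header row.
            "InputSerialization": {"csv": {"FileHeaderInfo": "USE"}},
            "ExpressionType": "SQL",
            "Expression": expression,
            "OutputSerialization": {"csv": {}},
        },
        "OutputLocation": {
            "S3": {
                "BucketName": bucket,   # placeholder bucket name
                "Prefix": prefix,       # placeholder key prefix for the output
                "CannedACL": "bucket-owner-full-control",
            }
        },
    }


params = build_select_job(
    archive_id="example-archive-id",
    expression="SELECT s.id, s.name, s.city FROM user_info s WHERE s.age > 21",
    bucket="example-output-bucket",
    prefix="select-results/",
    tier="Bulk",
)
print(json.dumps(params, indent=2))
```

With boto3, a dictionary like this would typically be passed as the jobParameters argument of the Glacier client's initiate_job call, together with the vault name; that call is not shown here because it needs real credentials and a real vault.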
