Amazon Glacier
What is Amazon Glacier?
Amazon Glacier, also known as Amazon Simple Storage Service (S3) Glacier, is a low-cost cloud storage service from Amazon Web Services (AWS) for data that can tolerate longer retrieval times.
Amazon Glacier provides storage for data archiving and backup of cold data. Cold data refers to files that are infrequently accessed but are kept in case they are needed at a later date. A developer can use a cold data service such as Amazon Glacier to move rarely needed data to archival storage and save money on storage costs. Developers can also move database backups from tape storage media to the cloud for long-term Glacier storage. Storing cold data this way relieves IT teams of the burden of managing archival data and backups.
Amazon advertises Amazon Glacier as an extremely low-cost storage service. To keep costs low, the service is optimized for infrequently accessed data, for which a standard retrieval can take three to five hours.
AWS also manages the operational heavy lifting required for data retention. Customers can store small or large amounts of data with Amazon Glacier. After setting the service up with the AWS Management Console, users upload whatever data they choose.
When an organization wants to retrieve data, it has three options:
- Bulk retrievals, which recover large amounts of data in a five- to 12-hour window; this is the lowest cost option.
- Standard retrievals, which take about three to five hours to recover stored data.
- Expedited retrievals, which can return data within five minutes. This option, which is the most costly, is designed for organizations that may need their stored data at any time.
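The trade-off among the three options can be sketched as a small lookup. This is an illustrative sketch, not AWS code: the time windows are the worst-case figures from the list above, and `cheapest_tier` and `build_job_parameters` are hypothetical helper names. The dictionary keys are assumed to match the `jobParameters` argument that the boto3 Glacier client accepts in `initiate_job`; check the current AWS documentation before relying on them.

```python
from datetime import timedelta

# Worst-case retrieval windows from the list above, ordered cheapest first.
RETRIEVAL_TIERS = [
    ("Bulk", timedelta(hours=12)),
    ("Standard", timedelta(hours=5)),
    ("Expedited", timedelta(minutes=5)),
]


def cheapest_tier(deadline: timedelta) -> str:
    """Return the lowest-cost tier whose worst-case window fits the deadline."""
    for name, worst_case in RETRIEVAL_TIERS:
        if worst_case <= deadline:
            return name
    raise ValueError(f"no retrieval tier can meet a deadline of {deadline}")


def build_job_parameters(archive_id: str, deadline: timedelta) -> dict:
    """Build a retrieval-job request body (hypothetical helper).

    The result is assumed to be usable as the jobParameters argument of
    boto3's Glacier initiate_job call; no AWS call is made here.
    """
    return {
        "Type": "archive-retrieval",
        "ArchiveId": archive_id,
        "Tier": cheapest_tier(deadline),
    }


if __name__ == "__main__":
    # The archive ID is a placeholder, not a real identifier.
    print(build_job_parameters("EXAMPLE-ARCHIVE-ID", timedelta(hours=6)))
```

With a six-hour deadline, the sketch selects the standard tier, because the bulk window of up to 12 hours would be too slow and the expedited tier costs more than necessary.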
Amazon considers Amazon Glacier an official part of Amazon Simple Storage Service, which is why it is also called S3 Glacier.
What are the uses for Amazon S3 Glacier?
Amazon designed S3 Glacier as a cloud-based alternative to on-premises magnetic tape backup drives. Although tape libraries can lower storage costs, they still require upfront investments and maintenance. Amazon's Glacier service enables an organization to bypass the management and cost investments involved in tape backup drives by instead using a low-cost cloud service.
S3 Glacier is for long-term storage, so users can retain data for years. For example, S3 Glacier is a good option for organizations that need to reference a set of data only once or twice a year, or those that want to use the data for backup.
S3 Glacier is used in other scenarios and fields, such as:
- Regulatory or compliance archiving. Glacier enables financial services and other industries to keep data for regulatory and compliance archives over extended periods of time.
- Healthcare archival data. Glacier can help hospitals meet regulatory requirements by archiving patient record data securely and cost-effectively.
- Digital preservation. Government agencies or libraries can use Glacier for digital preservation efforts.
- Media asset storage. Media files such as video tend to take up a lot of space when stored over time. Glacier provides a way to store media files affordably.
Where Amazon Glacier stores data: Archives and vaults
Amazon Glacier stores data in archives and vaults. An archive is a block of data that may consist of a single file or aggregated data, commonly in the form of a Tape Archive (TAR) or ZIP file. Glacier archives range in size from 1 byte to 40 terabytes (TB). However, there is no limit on how much data or how many archives an AWS user can store in Glacier. Amazon offers a multipart upload feature that provides higher throughput and reliability for archives larger than 100 MB.
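To illustrate how a multipart upload might be planned, the sketch below picks a part size for a given archive. It assumes limits recalled from the AWS documentation rather than stated in this article: part sizes of 1 MiB multiplied by a power of two, a 4 GiB ceiling per part and at most 10,000 parts per upload (which is roughly where the 40 TB archive maximum comes from). The function names are hypothetical.

```python
MIB = 1024 * 1024
MAX_PARTS = 10_000          # assumed limit on parts per multipart upload
MAX_PART_SIZE = 4096 * MIB  # assumed 4 GiB ceiling on a single part


def choose_part_size(archive_size: int) -> int:
    """Return the smallest power-of-two MiB part size that keeps the
    upload within MAX_PARTS parts."""
    if archive_size < 1:
        raise ValueError("an archive must contain at least 1 byte")
    part_size = MIB
    while part_size * MAX_PARTS < archive_size:
        part_size *= 2
        if part_size > MAX_PART_SIZE:
            raise ValueError("archive is too large for a multipart upload")
    return part_size


def part_count(archive_size: int, part_size: int) -> int:
    """Number of parts needed; the last part may be smaller than the rest."""
    return -(-archive_size // part_size)  # ceiling division


if __name__ == "__main__":
    size = 100 * 1024 * MIB  # a 100 GiB archive
    chosen = choose_part_size(size)
    print(chosen // MIB, "MiB parts,", part_count(size, chosen), "parts")
```

Choosing the smallest part size that fits keeps each retry cheap if a part fails, at the cost of more requests; a larger part size makes the opposite trade.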
An AWS user can group archives together into a vault, which helps organize data. Up to 1,000 vaults can be created and configured per region with the AWS Management Console. A user selects a host vault for each archive and can manage access to that vault via AWS Identity and Access Management. An administrator can attach a resource-based access policy to each vault, governing who can access a specific set of archives and how they are accessed. Users can also attach notification policies to vaults.
A vault lock helps enforce compliance controls for each vault. Once a vault is locked, S3 Glacier enforces the preset policy, which can no longer be changed, to ensure the vault meets compliance standards.
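As an illustration of the kind of control a vault lock can enforce, the sketch below builds a policy document that denies archive deletion until an archive reaches a minimum age. The action name `glacier:DeleteArchive` and the condition key `glacier:ArchiveAgeInDays` are given as recalled from AWS examples and should be verified; the function name, the account number and the vault name are hypothetical.

```python
import json


def build_vault_lock_policy(vault_arn: str, min_age_days: int) -> str:
    """Return a JSON policy that denies deleting archives younger than
    min_age_days (a sketch; verify key names against AWS documentation)."""
    if min_age_days < 1:
        raise ValueError("min_age_days must be at least 1")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-early-delete",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
                "Condition": {
                    "NumericLessThan": {
                        # Condition values are written as strings in policy JSON.
                        "glacier:ArchiveAgeInDays": str(min_age_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder ARN: the account ID and vault name are made up.
    arn = "arn:aws:glacier:us-east-1:111122223333:vaults/example-vault"
    print(build_vault_lock_policy(arn, 365))
```

Because a locked policy cannot be edited afterward, a document like this would normally be reviewed carefully before the lock is completed.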
AWS maintains an inventory of the archives in each vault, which can be used for backup and disaster recovery if needed. S3 Glacier is also designed for 11 nines of durability, or 99.999999999%, storing data on multiple devices in multiple facilities within an AWS Region.
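To put the 11 nines figure in perspective, the sketch below converts it into an expected loss rate. It assumes the figure is read as an annual, per-object durability and that losses are independent, which is a simplification; the function names are hypothetical. Exact fractions are used to avoid floating-point rounding on a number this close to 1.

```python
from fractions import Fraction

# 99.999999999% expressed exactly (11 nines).
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year, assuming independent losses."""
    return object_count * (1 - ANNUAL_DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    print(years_per_expected_loss(10_000_000))
```

Under these assumptions, an organization storing 10 million archives would expect to lose a single archive about once every 10,000 years.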
What are the benefits of Amazon Glacier?
Amazon Glacier offers the following benefits:
- Lower cost. Glacier is designed as Amazon's lowest-cost storage class. This enables an organization to store large amounts of data at a reduced cost compared to other Amazon storage services.
- Maintains archival database. An organization does not have to maintain its own archival database. AWS handles administrative tasks such as capacity planning and hardware.
- Durability. Glacier data is stored redundantly across at least three physical AWS Availability Zones, increasing the ability to restore data if it is lost in one zone.
- Scalability. Organizations can scale the stored data up or down as needed.
- Multiple data retrieval options. Organizations can choose from expedited, standard and bulk retrievals.
- Security. S3 Glacier supports security and compliance standards such as General Data Protection Regulation, Payment Card Industry Data Security Standard, Health Insurance Portability and Accountability Act and Federal Information Security Management Act. It also supports encryption and monitors storage application programming interface call activities.
- Integration. Glacier integrates with other AWS offerings such as AWS Snowball and AWS Direct Connect.
Glacier vs. Amazon S3
Although Amazon considers Glacier an official part of S3, they are still two separate storage options with two different use cases.
Amazon S3 is a high-speed, web-based cloud storage service that is designed for online backup and archival of data or applications on AWS. Amazon S3 is also used for disaster recovery, application hosting and website hosting.
Amazon S3 Glacier provides durable storage for any type of data format. With a standard retrieval, data is accessible within three to five hours. A developer could use Amazon Glacier in conjunction with storage lifecycle management, moving rarely used data to cold storage to save money.
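A lifecycle rule of this kind might look like the sketch below, which builds a configuration that transitions objects under a key prefix to the Glacier storage class after a number of days. The structure is assumed to match what the boto3 S3 client's `put_bucket_lifecycle_configuration` call accepts as `LifecycleConfiguration`; the function name, rule ID and prefix are hypothetical, and no AWS call is made.

```python
def build_lifecycle_configuration(rule_id: str, prefix: str, days_until_archive: int) -> dict:
    """Return a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class after `days_until_archive` days."""
    if days_until_archive < 0:
        raise ValueError("days_until_archive cannot be negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # Only objects whose keys start with this prefix are affected.
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_until_archive, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Example values only: archive anything under logs/ after 90 days.
    print(build_lifecycle_configuration("archive-old-logs", "logs/", 90))
```

Scoping the rule to a prefix keeps frequently read objects in S3 while older, rarely used data moves to cheaper storage automatically.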
The biggest difference between the two Amazon storage services is that S3 is designed for data that needs to be retrieved in real time, while Amazon Glacier is used for archival. S3 Glacier suits use cases that prioritize low storage cost and do not need data at a moment's notice. S3, however, is suggested for when an organization needs frequent and fast access to its data.
While S3 Glacier uses archives and vaults to store data, S3 uses storage buckets.
How much does Amazon S3 Glacier cost?
S3 Glacier uses a pay-as-you-go pricing model, so organizations do not have to worry about buying more storage than they need. Amazon charges per gigabyte (GB) of data stored per month on Glacier and advertises a minimum cost of $4 per terabyte per month, with additional costs associated with whichever of the three data retrieval options noted above is used.
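The storage charge is simple arithmetic, sketched below using the advertised $4 per terabyte per month from the paragraph above. The sketch assumes decimal terabytes (1 TB = 1,000 GB), which gives $0.004 per GB, and ignores retrieval, request and transfer charges; the function name is hypothetical.

```python
from decimal import Decimal

PRICE_PER_TB_MONTH = Decimal("4.00")  # advertised minimum quoted in the text
GB_PER_TB = Decimal("1000")           # assumption: decimal terabytes


def monthly_storage_cost(stored_gb) -> Decimal:
    """Monthly storage charge in dollars, rounded to the nearest cent."""
    per_gb = PRICE_PER_TB_MONTH / GB_PER_TB
    # Convert through str so float inputs do not carry binary rounding noise.
    return (Decimal(str(stored_gb)) * per_gb).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(monthly_storage_cost(2500))
```

At that rate, 2,500 GB of archived data would cost $10.00 per month to store.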
Although uploading data to Glacier is free, Amazon charges a fee for a retrieval request that is more than approximately 5% of the customer's average monthly storage. This is meant to discourage customers from using Glacier as a general online storage service.
The following are the costs in the Eastern U.S. for the three data retrieval options, covering both the per-GB charge and the charge for retrieval requests:
- Expedited: $0.03 per GB; $10 for 1,000 requests
- Standard: $0.01 per GB; $0.03 for 1,000 requests
- Bulk: $0.0025 per GB; $0.025 for 1,000 requests
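The prices listed above can be combined into a rough retrieval estimate, as in the sketch below. It uses only the per-GB and per-1,000-request figures from the list and ignores taxes, data transfer and any free retrieval allowance; the function name is hypothetical.

```python
from decimal import Decimal

# (price per GB, price per 1,000 requests) from the list above, in dollars.
RETRIEVAL_PRICES = {
    "Expedited": (Decimal("0.03"), Decimal("10")),
    "Standard": (Decimal("0.01"), Decimal("0.03")),
    "Bulk": (Decimal("0.0025"), Decimal("0.025")),
}


def retrieval_cost(tier: str, gigabytes, requests: int) -> Decimal:
    """Estimated retrieval charge in dollars for one tier."""
    per_gb, per_1000_requests = RETRIEVAL_PRICES[tier]
    data_charge = Decimal(str(gigabytes)) * per_gb
    request_charge = Decimal(requests) * per_1000_requests / 1000
    return data_charge + request_charge


if __name__ == "__main__":
    for tier in RETRIEVAL_PRICES:
        print(tier, retrieval_cost(tier, 500, 200))
```

For 500 GB retrieved with 200 requests, the estimate works out to $17 for expedited, about $5.01 for standard and about $1.26 for bulk, which shows how much the choice of tier drives the bill.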
Even though uploading data to Amazon S3 Glacier is free, upload requests are billed at $0.03 per 1,000 requests.
Transferring data out of S3 Glacier to the same region is free; however, there is a cost for transferring data to a different region. The price differs depending on location but is either $0.01 or $0.02 per GB. If an organization is transferring data out of S3 Glacier to the internet, the cost is between $0.00 and $0.05 per GB, depending on the amount of data. Prices do not include tax.
Choosing between storage services, such as S3 and S3 Glacier, will depend heavily on what the data being stored is used for. S3 Glacier is a cost-effective option for archival storage.