Data archiving techniques: Choosing the best archive media
Read this tip on data archiving techniques to find out which archive media is best for your organization: tape, disk, optical or the cloud.
What you will learn in this tip: There are a few storage media types that can be leveraged for storing data archives; however, their performance and capacity vary and they aren't a good fit for every situation. This tip provides an overview of archive media options and other data archiving techniques.
Many storage and backup administrators understand the importance of performance, resilience, scalability and functionality when developing a strategy for data protection and storage. However, as data reaches the point where it is no longer needed on a daily basis but must be retained as an archive for compliance or legal reasons, some of the storage choices aren't always as clear. What follows is an overview of some of the most common storage archive media available for data archiving, along with a brief discussion of the advantages and disadvantages inherent to each technology or format.
Tape storage media
Although many storage administrators now favor disk for backups, tape remains the most widely used archive media, mostly because of its cost-to-capacity ratio but also because of its very broad and well-established implementation base.
Pros: Tape formats like LTO-5 and DLT-S4 hold a significant amount of uncompressed data (1.5 TB for LTO-5, 800 GB for DLT-S4), and LTO-5 still achieves a maximum throughput of up to 140 MBps. In addition, tape is still relatively inexpensive when compared to disk. Beyond the cost of drives and media, a tape kept in a vault consumes no electricity and doesn't use expensive data center floor space. Tape can also be taken off-site at a relatively low cost.
Cons: Tape is a sequential-access storage medium, which means data is written and read in block sequence (one block after the other), whereas a disk can be written to and read from randomly. This introduces seek-time delays, although with archive data they tend to matter only when you need to access or search it, such as during a legal discovery process. While tape's portability may be an advantage, it also introduces media handling and management overhead for off-site storage and increases the chance of damage due to mishandling.
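To put the throughput figure in perspective, here is a minimal Python sketch that estimates how long it takes to stream an entire cartridge end to end, which is the best case for a sequential-access medium. It assumes LTO-5's native capacity of 1.5 TB (1,500 GB in decimal units) and the 140 MBps maximum throughput quoted above. The function name `full_pass_hours` is purely illustrative, and real-world transfer rates are usually lower than the rated maximum.

```python
def full_pass_hours(capacity_gb: float, throughput_mbps: float) -> float:
    """Hours needed to read or write a whole cartridge at a sustained rate.

    capacity_gb     -- cartridge capacity in decimal gigabytes
    throughput_mbps -- sustained transfer rate in megabytes per second
    """
    if capacity_gb < 0 or throughput_mbps <= 0:
        raise ValueError("capacity must be >= 0 and throughput must be > 0")
    seconds = (capacity_gb * 1000) / throughput_mbps  # 1 GB = 1000 MB
    return seconds / 3600


# Assumed figures: LTO-5 native capacity (1.5 TB) at its rated maximum of 140 MBps.
lto5_hours = full_pass_hours(1500, 140)
print(f"Full LTO-5 pass: about {lto5_hours:.1f} hours")
```

Under these assumptions a full pass takes just under three hours, which gives a sense of why locating a single item on tape during discovery can be slow compared with a random-access disk.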
Optical media storage
Optical media storage was once very popular for archives as it offered one of the earliest forms of write once, read many (WORM) data overwrite protection, which means that once the media has been written to, it can only be read, not overwritten.
Pros: Optical media has long been considered the media with the longest shelf life, with estimates ranging from 30 to 200 years depending on the manufacturer. This is obviously somewhat speculative because the technology has been in widespread use for just a little more than 30 years, but tests have shown that it can outlast other media. Optical media is also more resistant than tape to repeated read passes because it is read by a laser, with none of the physical contact that tape has with a tape drive's rollers, guides or read heads.
Cons: Low capacity, slow read performance and even slower write performance, and cost are the elements that made optical media storage lose ground to other technology options for long-term storage. Mainstream optical cartridges that were common until the mid-2000s could store approximately 9 GB of data, offered read speeds of 8 MBps and write speeds of 4 MBps.
Some attempts are being made to promote Blu-ray technology for home-office backups and archives, but there are no mainstream offerings yet. Electronics manufacturer Pioneer introduced a 400 GB read-only disc in 2008 and then announced a rewritable version for release in 2010, but low write performance and higher cost will likely keep it from becoming a popular archive technology in data centers.
Disk storage
Disk storage has become the biggest challenger to tape as the media of choice for data archiving. The availability of SATA drives that offer up to 2 TB of capacity and can cost less than $150 definitely positions the technology to compete with tape.
Disk storage also benefits from a wide range of features such as local and remote replication, data deduplication and fast search capabilities.
Pros: Disk is a random-access device, which, as mentioned earlier, allows data to be written or read at any location on the media. Enterprise-class disk storage doesn't require media handling for local or off-site copies thanks to replication. Disk storage provides better single point of failure protection than tape media by leveraging RAID technology. Disk storage can be paired with indexing engines for faster discovery. Data deduplication is now a common feature on disk-based data archival solutions.
Cons: Disk storage requires more data center floor space compared to off-site vault space for tape media. When copies of archives are required to be stored in different locations, the cost of additional storage capacity, the power consumption and network bandwidth needed for off-site replication must be taken into consideration.
Removable disk storage
A discussion of data archiving techniques wouldn't be complete without mentioning removable disk storage, which offers the benefits of disk-based archiving with the portability of tape. Dell Inc., Hewlett-Packard (HP) Co., Quantum Corp., ProStor Systems and Tandberg Data each have an RDX removable disk technology offering, with single disk cartridge capacity ranging from 160 GB to 640 GB for Dell and Quantum, and up to 750 GB for HP. However, the 30 MBps performance and current lack of high-capacity appliances make it a technology better suited for small- to medium-sized businesses (SMBs) with smaller volumes of archive data.
Pros: Provides the random read and write capability inherent to disk media. The technology is also available as multi-disk appliances (Quantum and Dell). Removable disk offers portability similar to tape media.
Cons: Removable disk is a relatively expensive data archive storage media in comparison to tape. A high-capacity 750 GB cartridge can cost between $300 and $400 vs. an LTO-5 tape that costs approximately $100. Removable disk storage also introduces a media handling component similar to tape storage. The lack of high-capacity appliances limits use of the technology to smaller environments.
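The price gap is easier to judge on a per-gigabyte basis. The following Python sketch is a rough comparison using the prices quoted in this tip; the LTO-5 capacity of 1.5 TB is the format's native (uncompressed) capacity and is an assumption added here, the $150 SATA price is the upper bound mentioned earlier, and the helper name `cost_per_gb` is illustrative only. Drive, appliance and operating costs are deliberately left out.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Media cost in US dollars per decimal gigabyte."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return price_usd / capacity_gb


# (price in USD, capacity in GB); prices are the approximate figures from the text.
media = {
    "RDX 750 GB cartridge (low estimate)": (300, 750),
    "RDX 750 GB cartridge (high estimate)": (400, 750),
    "LTO-5 tape (assumed 1.5 TB native)": (100, 1500),
    "SATA drive, 2 TB (upper-bound price)": (150, 2000),
}

costs = {name: cost_per_gb(price, cap) for name, (price, cap) in media.items()}

for name, cost in costs.items():
    print(f"{name}: ${cost:.3f} per GB")
```

On these figures the removable cartridge works out to roughly $0.40 to $0.53 per GB, against about $0.07 per GB for the LTO-5 tape.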
On the horizon: Cloud archiving
A relative newcomer to data archiving, cloud archiving can make sense for many organizations, particularly SMBs. The cost can generally be lower than that of on-premises archiving systems. Pricing ranges from $0.25 per GB per month to as much as $12 per GB per month, depending on the cloud-based archiving service purchased.
Some examples of cloud-based archiving vendors include i365 (a Seagate company), Iron Mountain Inc., Nirvanix Inc. and Sonian.
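Because cloud archiving is billed per GB per month, the total depends heavily on how long the archive must be retained. The sketch below estimates cumulative cost at the low and high ends of the price range quoted above. The 500 GB archive size and 36-month retention period are hypothetical values chosen only for illustration, the function name `cloud_archive_cost` is an assumption, and the model ignores archive growth as well as any retrieval or setup fees a provider might charge.

```python
def cloud_archive_cost(size_gb: float, months: int, price_per_gb_month: float) -> float:
    """Cumulative cost in USD of holding a fixed-size archive in the cloud."""
    if size_gb < 0 or months < 0 or price_per_gb_month < 0:
        raise ValueError("inputs must be non-negative")
    return size_gb * months * price_per_gb_month


ARCHIVE_GB = 500        # hypothetical archive size
RETENTION_MONTHS = 36   # hypothetical retention period

low = cloud_archive_cost(ARCHIVE_GB, RETENTION_MONTHS, 0.25)    # low end of quoted range
high = cloud_archive_cost(ARCHIVE_GB, RETENTION_MONTHS, 12.00)  # high end of quoted range

print(f"Low-end service:  ${low:,.2f}")
print(f"High-end service: ${high:,.2f}")
```

Running the same calculation with your own archive size and retention requirement gives a quick first figure to set against the media costs discussed in the earlier sections.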
When it comes to selecting the right archive storage media, there are a number of elements to consider but, most importantly, the selection process must start with a clear understanding of the requirements. This usually means understanding why archives are needed, the nature of the data to archive, the likelihood or frequency of access to the archive and the search capabilities required. Of course, budgetary considerations can't be overlooked when choosing the best data archive storage media for your organization.
About the author:
Pierre Dorion is the data center practice director and a senior consultant with Long View Systems Inc. in Phoenix, Ariz., specializing in the areas of business continuity and DR planning services and corporate data protection.