Using snapshot backups for your data backup system
The quick and simple nature of snapshot backup systems can be a big win for your data protection platform. However, there are elements of snapshots that aren't great for backup.
Snapshot-based backup systems can change the game for anyone interested in using them as their primary method for backing up and restoring critical data.
Snapshots make backups significantly easier and faster than any traditional backup system can. They also allow recovery time objectives and recovery point objectives that are out of reach for a traditional backup system, which converts files into a backup format and then writes that format to tape or disk.
However, not all snapshot backup systems are alike, and not all of them have what it takes to completely replace a backup system. Here, we'll help you understand the benefits and drawbacks of snapshot backups and enable you to make your own decisions as to whether or not you might want to investigate using this kind of backup system for your organization.
There are misconceptions about snapshot-based backup systems. The main one is that storage snapshots aren't backups at all, merely point-in-time copies. Some believe that, if a copy of data doesn't change form -- such as being put inside a tar image -- then it's not a backup. It's unclear where this idea came from, but changing form is not a requirement for a backup.
How are snapshot backups defined?
The Storage Networking Industry Association (SNIA) defines a backup in the following way: "A collection of data stored on (usually removable) non-volatile storage media for purposes of recovery in case the original copy of data is lost or becomes inaccessible -- also called a backup copy. To be useful for recovery, a backup must be made by copying the source data image when it is in a consistent state." The only part of this definition that snapshot-based backups might have trouble with is the "usually removable" part, but that phrase simply acknowledges that some backups are placed on tape.
SNIA's definition does bring up one important aspect of snapshot backups: A snapshot is not really a backup until it has been replicated to another storage system. This is because a snapshot is a virtual copy of the data, not an actual copy of the data. If something happens to the volume upon which a snapshot resides, the snapshot of the volume will be of no use -- unless it was copied to another volume via replication.
In a traditional backup system, backup software and tapes create the ability to restore multiple points in time. This is a critical function of a backup system, as data corruption or other factors may require us to restore the system to a point in time other than the most recent backup. In a snapshot-based backup system, the snapshots provide this functionality. Multiple snapshots -- each created at different times -- are used to present multiple virtual views of the file system as it existed at different points in time.
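The way multiple snapshots present multiple points in time can be illustrated with a toy model. What follows is a minimal sketch of a copy-on-write scheme, not any vendor's implementation; the `Volume` class and its method names are hypothetical and exist only for illustration.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self):
        self.blocks = {}      # live data: block number -> contents
        self.snapshots = {}   # snapshot name -> {block number: preserved old contents}

    def snapshot(self, name):
        # Creating a snapshot copies nothing; it only starts tracking changes.
        self.snapshots[name] = {}

    def write(self, block, data):
        # Before overwriting, preserve the old contents for every snapshot
        # that has not already saved its own copy of this block.
        old = self.blocks.get(block)  # None means the block did not exist yet
        for saved in self.snapshots.values():
            saved.setdefault(block, old)
        self.blocks[block] = data

    def view(self, name):
        # A snapshot view is the live volume overlaid with the preserved blocks.
        merged = dict(self.blocks)
        merged.update(self.snapshots[name])
        return {b: d for b, d in merged.items() if d is not None}


vol = Volume()
vol.write(0, "a")
vol.snapshot("mon")
vol.write(0, "b")
vol.snapshot("tue")
vol.write(0, "c")

print(vol.view("mon"))  # {0: 'a'}
print(vol.view("tue"))  # {0: 'b'}
```

Note that `view` reads from the live blocks as well as the preserved ones. In this model, a snapshot holds only the blocks that changed after it was taken, which is why losing the underlying volume also loses the snapshot unless it has been replicated elsewhere.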
Another important function of a backup system is to provide a copy of the data in case of disaster. A traditional backup system does this by backing up to the cloud or by sending tapes off-site via a vendor such as Iron Mountain. A snapshot-based backup system accomplishes the same thing via replication.
A snapshot backup system can place multiple copies of the data in multiple locations using replication. For example, operational recoveries may come from an on-site storage system that is physically separate from the storage system being backed up, and disaster recovery may come from an off-site storage system that receives a replication stream from the same system. This may be accomplished by having the primary storage system replicate to both systems, or by having it replicate to the on-site storage system and having that system replicate in turn to the off-site storage system. There are advantages and disadvantages to each approach.
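The two replication layouts described above can be written down as small graphs and compared. This is only a sketch: the system names and the `replication_hops` helper are hypothetical, and the trade-off noted afterward is a common rule of thumb, not something a specific product guarantees.

```python
from collections import deque

# Each key replicates to the systems in its list.
FAN_OUT = {"primary": ["onsite", "offsite"]}
CASCADE = {"primary": ["onsite"], "onsite": ["offsite"]}


def replication_hops(topology, source="primary"):
    """Return {system: replication hops from source} using breadth-first search."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in topology.get(node, []):
            if target not in hops:
                hops[target] = hops[node] + 1
                queue.append(target)
    return hops


for name, topology in (("fan-out", FAN_OUT), ("cascade", CASCADE)):
    streams = len(topology.get("primary", []))
    print(name, replication_hops(topology), "streams from primary:", streams)
```

In the fan-out layout, the primary system carries two outbound replication streams but both copies are one hop away. In the cascade layout, the primary carries only one stream, but the off-site copy is two hops away and therefore depends on the on-site system being healthy and up to date.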
The last part of the SNIA definition of backup is that the data must be in a consistent state when it is copied. With traditional data backup applications, this is usually done via file system and database agents. Snapshot-based backup systems must also find a way to copy data in a consistent state for the backups to be worth anything.
Don't 'fall short' with snapshot backups
It's not acceptable simply to make a snapshot of the database and ask the crash recovery system of that database to make the image consistent during recovery. The snapshot must be created in a way that is supported by the database application. One example of this would be snapshot systems that integrate with Microsoft Volume Shadow Copy Service, as it acts as an intermediary between a snapshot system and the applications that need to be placed in a consistent state. Before considering any snapshot-based product as the core of your backup system, make sure the product has a good answer to this particular requirement.
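The general pattern behind application-consistent snapshots is to quiesce the application, take the snapshot and then release the application, even if the snapshot fails. The sketch below shows that ordering only. It does not use the Volume Shadow Copy Service API; `freeze`, `thaw`, `FakeDatabase` and `take_consistent_snapshot` are hypothetical names standing in for whatever hooks a real application and storage system provide.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app):
    """Hold the application in a consistent state for the duration of the block."""
    app.freeze()      # hypothetical hook: flush buffers and pause writes
    try:
        yield
    finally:
        app.thaw()    # always resume writes, even if the snapshot fails


class FakeDatabase:
    """Stand-in application that just records the order of calls."""

    def __init__(self):
        self.events = []

    def freeze(self):
        self.events.append("freeze")

    def thaw(self):
        self.events.append("thaw")


def take_consistent_snapshot(app, create_snapshot):
    # create_snapshot is any callable that asks the storage system for a snapshot.
    with quiesced(app):
        return create_snapshot()


db = FakeDatabase()
snapshot_id = take_consistent_snapshot(
    db, lambda: db.events.append("snapshot") or "snap-1"
)
print(db.events)  # ['freeze', 'snapshot', 'thaw']
```

The `finally` clause is the important design choice: an application left frozen after a failed snapshot is an outage, so the release step must not depend on the snapshot succeeding.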
Another area where snapshot backup systems often fall short is indexing. Some vendors give the impression that, since all you have to do is "cd" into a certain directory and grab the file you need, there's no need for any centralized backup catalog or index -- the way traditional backup systems have. While it is true that a snapshot backup system is somewhat self-indexing, it is also true that, sometimes, people don't know the location of the file they need to restore, and a backup catalog can help with that. With some products, this functionality may actually be provided by marrying a traditional backup product and a snapshot-based product. Some traditional products offer indexing of snapshot-based backups via Network Data Management Protocol.
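To make the indexing gap concrete, here is a minimal sketch of a catalog built by walking snapshot directories. It assumes, purely for illustration, that each snapshot is exposed as a subdirectory of a single root; `build_catalog` and the directory layout are hypothetical, and a real product would also record sizes, timestamps and ownership.

```python
import os
from collections import defaultdict


def build_catalog(snapshot_root):
    """Map each file name to the (snapshot, relative path) pairs where it appears."""
    catalog = defaultdict(list)
    for snapshot in sorted(os.listdir(snapshot_root)):
        snap_dir = os.path.join(snapshot_root, snapshot)
        if not os.path.isdir(snap_dir):
            continue  # skip stray files next to the snapshot directories
        for dirpath, _dirnames, filenames in os.walk(snap_dir):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), snap_dir)
                catalog[name].append((snapshot, rel))
    return catalog
```

With a catalog like this, someone who remembers only a file name can look up `catalog["report.txt"]` and see every snapshot and path that holds a copy, instead of searching each snapshot directory by hand.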
Make sure your snapshot backup product can scale
Configuration, monitoring and reporting play important roles in snapshot backup. What works for a small shop with one storage system isn't going to work for an enterprise with hundreds or thousands of them. When examining this area of functionality, be sure to ask yourself how well a particular product's capabilities will scale if your data center grows drastically over time. Some systems require you to maintain snapshot and volume relationships via the command line, whereas others have sophisticated web-based UIs to do that for you.
The most important question for any backup administrator to answer every day is: Did the backups work? Larger shops may actually have a staff of operators watching backups as they are performed, and smaller shops may have a single person who checks last night's backups first thing in the morning. Either way, the monitoring functionality of the backup system must be able to answer this question quickly and efficiently.
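One simple way to answer "Did the backups work?" is to check, per volume, whether the newest snapshot is recent enough. The sketch below assumes the monitoring tool can already list the latest snapshot time for each volume; the `stale_volumes` function, the volume names and the 24-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def stale_volumes(latest_snapshot, now, max_age=timedelta(hours=24)):
    """Return the volumes whose newest snapshot is missing or older than max_age."""
    return sorted(
        volume
        for volume, taken_at in latest_snapshot.items()
        if taken_at is None or now - taken_at > max_age
    )


latest = {
    "vol_a": datetime(2024, 1, 2, 1, 0),    # snapshot taken last night
    "vol_b": datetime(2023, 12, 31, 1, 0),  # snapshot is more than two days old
    "vol_c": None,                          # no snapshot exists at all
}
print(stale_volumes(latest, now=datetime(2024, 1, 2, 8, 0)))  # ['vol_b', 'vol_c']
```

An empty result is the answer an administrator wants to see each morning; anything else is a short list of volumes to investigate.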
Reporting is slightly different, as it helps to understand backup trends over time. Are there certain volumes that have difficulty backing up on a regular basis? Is there enough capacity for snapshots and production data? Are there any snapshots that are taking up significantly more room than other snapshots? These questions are answered by the reporting functionality of the product.
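The question about snapshots that take up significantly more room than others can be sketched as a comparison against the typical snapshot size. This is one possible heuristic, not a method the article prescribes: the `oversized_snapshots` function, the use of the median and the factor of 3 are assumptions chosen for illustration.

```python
from statistics import median


def oversized_snapshots(sizes_gb, factor=3.0):
    """Return snapshot names whose size exceeds factor times the median size."""
    if not sizes_gb:
        return []
    typical = median(sizes_gb.values())
    return sorted(name for name, size in sizes_gb.items() if size > factor * typical)


sizes = {"mon": 10, "tue": 12, "wed": 95, "thu": 11}
print(oversized_snapshots(sizes))  # ['wed']
```

The median is used here because a single very large snapshot would pull an average upward and could hide itself; a real report would likely track these sizes over time as well.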
One final thing to consider if you are thinking about replacing your traditional backup systems with a snapshot backup system is that most of the former are host-based, and most of the latter are storage-based. The significant increase in server and storage virtual machines (VMs) increases the difficulty of doing storage-based backups. In a world where a "server" can magically move from one physical server and its associated storage to a completely different server and storage resources with a single mouse click, host-based backups are the easiest way to ensure that that server -- actually, a VM -- is backed up, no matter where it resides. Storage-based backups need to account for this particular phenomenon.
It's possible under certain circumstances to completely replace a backup system and all of its functionality with a snapshot-based system. Just make sure you think through all of the things that your backup system does for you today, and make sure that your new system of snapshot backups can do those things as well.
Editor's note: The case against snapshots as backup replacement
It's important to point out that, since this article was originally written in 2010, there have been numerous takes on snapshots and how they may not be a suitable replacement for your backup system. Some experts have argued that snapshots are a good complement to your backup and should be integrated with your overall data protection process.
While snapshots provide quick and easy access to data and help enable capabilities such as instant recovery, they take up a lot of storage capacity, which can hurt performance. A snapshot is not a full copy of data, so it does not provide a complete picture in a recovery situation. A snapshot is also dependent on its source data, so it would not work properly if that data were lost -- though replication is one way around that issue.
Storage expert George Crump points out that there is no global snapshot standard, so one vendor's snapshot may not be compatible with another's. That can lead to an increased workload in managing snapshots.
Crump also notes that searching across snapshots can be tough. Compare that to the top backup products of the day that have comprehensive search and indexing tools.
At the very least, though, snapshots and backup can work well together as part of a comprehensive data protection platform.