A beginner's guide to Hyper-V checkpoints
Hyper-V checkpoints help mitigate problems with upgrade rollouts, but they accumulate over time, so they need storage space and active management.
A checkpoint is a differencing file that captures the state, data and hardware configuration of a VM in operation. Checkpoints establish a known-good or known-working VM snapshot at a given point in time. They can help IT administrators manage and mitigate risk in VM environments.
Admins typically capture a checkpoint before performing a major process such as patching or upgrading the software within the VM. Once a known-good state is established, the upgrade can proceed as planned. If the process results in problems -- such as installation failures, undetected bugs or other compatibility or performance issues -- the IT administrator can revert, restore or roll back the VM to its previous known-good state. This undoes or discards any changes that the troubled process introduced and should return the VM to its working state.
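The capture, upgrade and revert pattern described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes the Hyper-V PowerShell cmdlets Checkpoint-VM and Restore-VMSnapshot, while the function name, the VM and checkpoint names, and the injected `run` callback are hypothetical. The callback receives each command as a list of arguments, which lets the logic be exercised without a Hyper-V host.

```python
from typing import Callable, List

Command = List[str]


def guarded_upgrade(
    vm: str,
    checkpoint: str,
    upgrade: Callable[[], bool],
    run: Callable[[Command], None],
) -> bool:
    """Take a checkpoint, attempt an upgrade, and roll back if it fails.

    `run` is a hypothetical callback that executes one PowerShell command;
    on a real host it might hand the arguments to a PowerShell process.
    """
    # 1. Establish the known-good state before touching the VM.
    run(["Checkpoint-VM", "-Name", vm, "-SnapshotName", checkpoint])

    # 2. Perform the risky change; treat an exception as a failure.
    try:
        succeeded = upgrade()
    except Exception:
        succeeded = False

    if succeeded:
        # The checkpoint is kept until the upgrade has been validated.
        return True

    # 3. On failure, revert the VM to the known-good checkpoint.
    run(["Restore-VMSnapshot", "-VMName", vm, "-Name", checkpoint, "-Confirm:$false"])
    return False
```

In this sketch a successful upgrade leaves the checkpoint in place, on the assumption that admins will remove it only after validating the new state.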
The differencing file represents the difference between the VM's current state and the VM's state at its last checkpoint. The first checkpoint represents the entire VM, but subsequent checkpoints contain only the changes between the current state and the previous checkpoint. Thus, the checkpoints work together to create a sequence of state changes called a "checkpoint tree."
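A toy model can make the differencing idea concrete. The Python sketch below is an illustration only and does not reflect the real on-disk format Hyper-V uses; the class and layer names are hypothetical. Each layer stores only the blocks written to it, and a read walks back through the chain until it finds the block.

```python
from typing import Dict, Optional


class DiskLayer:
    """One layer in a toy disk chain: a base image or a differencing file."""

    def __init__(self, name: str, parent: Optional["DiskLayer"] = None) -> None:
        self.name = name
        self.parent = parent
        self.blocks: Dict[int, str] = {}  # only blocks written to this layer

    def write(self, block: int, data: str) -> None:
        self.blocks[block] = data

    def read(self, block: int) -> Optional[str]:
        # Walk from the newest layer toward the base until the block is found.
        layer: Optional[DiskLayer] = self
        while layer is not None:
            if block in layer.blocks:
                return layer.blocks[block]
            layer = layer.parent
        return None


base = DiskLayer("base")
base.write(0, "os-v1")
base.write(1, "app-v1")

# Taking a checkpoint freezes the current layer and starts a new one for writes.
after_cp1 = DiskLayer("after-checkpoint-1", parent=base)
after_cp1.write(1, "app-v2")  # the upgrade changes only block 1

print(after_cp1.read(0))  # os-v1, inherited from the base layer
print(after_cp1.read(1))  # app-v2, stored in the differencing layer
print(base.read(1))       # app-v1, the state a revert would return to
```

Because the newer layer holds only one block, it stays small. Reverting amounts to discarding that layer and reading from its parent again.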
How Hyper-V checkpoints work
There are two types of checkpoints in Hyper-V: standard checkpoints and production checkpoints. Both capture the state, data and configuration details of a running VM. The difference is in data consistency.
A standard checkpoint captures the VM and its memory state at the moment the checkpoint is taken, so running applications resume where they left off, but it does not guarantee data consistency. When the checkpoint is restored, some data or transactions may be lost. Such restoration problems are common for transaction-sensitive VM workloads such as Exchange and SQL Server. Standard checkpoints are most commonly used in development environments, or when admins use previous VM states to test and troubleshoot problems.
A production checkpoint uses backup technologies or tools, such as Volume Shadow Copy Service, to create a data-consistent VM image that admins can restore later and continue normal operation with no loss of data. Production checkpoints are the default in Hyper-V on Windows Server 2016 and Windows 10 or later, and admins can switch a VM to standard checkpoints and back using Hyper-V Manager or PowerShell.
How to disable Hyper-V checkpoints
Hyper-V administrators can manually enable or disable checkpoints as desired using Hyper-V Manager. Administrators can also use Hyper-V Manager to change checkpoint types, use automatic checkpoints, or change checkpoint file locations. The basic steps to enable or disable checkpoints are:
- Start Hyper-V Manager.
- Right-click the name of the desired VM and select Settings.
- Locate the Management section on the left side of the dialog and select the Checkpoints entry.
- The right side of the dialog now shows all VM checkpoint options, so check or uncheck the Enable checkpoints checkbox accordingly.
- Click Apply to save any changes.
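The same settings can be scripted. The Python sketch below only builds the PowerShell command line rather than running it, assuming the Set-VM cmdlet and its -CheckpointType and -AutomaticCheckpointsEnabled parameters; parameter availability depends on the Windows version. The helper name and the VM name "web01" are hypothetical.

```python
from typing import List, Optional

# Values accepted by the -CheckpointType parameter of Set-VM.
CHECKPOINT_TYPES = ("Disabled", "Production", "ProductionOnly", "Standard")


def set_checkpoint_command(
    vm: str,
    checkpoint_type: str,
    automatic: Optional[bool] = None,
) -> List[str]:
    """Build (but do not run) a Set-VM command for checkpoint settings."""
    if checkpoint_type not in CHECKPOINT_TYPES:
        raise ValueError(f"unknown checkpoint type: {checkpoint_type!r}")
    command = ["Set-VM", "-Name", vm, "-CheckpointType", checkpoint_type]
    if automatic is not None:
        # PowerShell Boolean literals are $true and $false.
        command += ["-AutomaticCheckpointsEnabled", "$true" if automatic else "$false"]
    return command


# Disable checkpoints for a hypothetical VM named "web01".
print(" ".join(set_checkpoint_command("web01", "Disabled")))
# Re-enable them as production checkpoints and turn automatic checkpoints off.
print(" ".join(set_checkpoint_command("web01", "Production", automatic=False)))
```

Joining the arguments with spaces is fine for simple names like the one shown. A VM name that contains spaces would need quoting before being passed to PowerShell.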
Using Hyper-V checkpoints in a production environment
Both standard and production Hyper-V checkpoints are suited to many production environments, and the fundamental technologies that support checkpoints are mature and well-developed. But checkpoints are not a panacea for all data center challenges, and they are not a substitute for proper backups.
Checkpoints are intended to offer only short-term protection to guard against events that might cause disruption to a VM or an environment that depends on the VM. For example, a checkpoint might be an ideal solution to guard a VM against unplanned problems with an unproven OS or application upgrade.
The notion of "short-term" is murky and can mean anywhere from a few hours to a year, depending on the opinions of individual IT professionals. There is no hard rule for exactly how long a checkpoint should exist, but the rule of thumb is to never allow a checkpoint to exist beyond its usefulness. That is, if reverting to a given checkpoint would impair the VM's usefulness, the checkpoint should be removed. For example, once admins have properly validated the unproven OS or application upgrade, the checkpoint can be deleted. In Hyper-V, deleting a checkpoint merges its changes into the parent disk, so the validated state is preserved rather than discarded.
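A retention rule of this kind is easy to automate once an organization settles on its own definition of "short-term." The Python sketch below flags checkpoints older than a chosen age. The seven-day threshold, the checkpoint names and the dates are made-up illustrations, and on a real host the creation times would have to come from the hypervisor, for instance from the output of Get-VMSnapshot.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_checkpoints(
    created: Dict[str, datetime],
    now: datetime,
    max_age: timedelta,
) -> List[str]:
    """Return the names of checkpoints older than max_age, oldest first."""
    stale = [name for name, when in created.items() if now - when > max_age]
    return sorted(stale, key=lambda name: created[name])


now = datetime(2024, 6, 15, 12, 0)
created = {
    "pre-os-upgrade": datetime(2024, 6, 1, 9, 0),
    "pre-app-patch": datetime(2024, 6, 14, 18, 0),
    "pre-driver-update": datetime(2024, 5, 20, 8, 0),
}

print(stale_checkpoints(created, now, max_age=timedelta(days=7)))
# ['pre-driver-update', 'pre-os-upgrade']
```

The function only reports candidates. Whether a flagged checkpoint has actually outlived its usefulness remains a judgment for the admin who took it.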
Common concerns about checkpoints consuming too much storage space, impairing performance or being difficult to manage are typically overblown. Checkpoints have extremely small risks of adverse impacts to VMs as long as the checkpoint environment is well-managed. There are no technical reasons to avoid using checkpoints in a production environment.
Admins can use both standard and production checkpoints successfully in production environments. The difference is in the way the checkpoint handles data consistency. For example, protecting an Apache web server as a front end to a database server might be best handled as a production checkpoint because a production checkpoint captures no memory state: a restored VM starts from a powered-off, data-consistent state. If the effort were attempted with a standard checkpoint, there is a possibility that data in motion might be lost and result in data inconsistency.
There are several complex cases where checkpoints aren't a good idea:
- Do not use checkpoints with Active Directory servers in an environment with multiple domain controllers; reverting one domain controller can cause an update sequence number (USN) rollback.
- Do not use checkpoints with cluster members to prevent the possibility of inadvertently rolling back an entire cluster.
- Do not checkpoint applications that can already replicate or synchronize data.
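The exclusions above can be encoded as a simple pre-flight check in a checkpoint script. The Python sketch below is a hedged illustration: the role labels are hypothetical tags that an inventory system might attach to a VM, not anything Hyper-V itself provides.

```python
from typing import List, Set

# Hypothetical role tags mapped to the reason each one makes checkpoints risky.
RISKY_ROLES = {
    "domain-controller": "risk of a USN rollback with multiple domain controllers",
    "cluster-member": "risk of inadvertently rolling back an entire cluster",
    "self-replicating-app": "application already replicates or synchronizes its data",
}


def checkpoint_warnings(roles: Set[str]) -> List[str]:
    """Return the reasons, if any, to avoid checkpointing a VM with these roles."""
    return [reason for role, reason in RISKY_ROLES.items() if role in roles]


print(checkpoint_warnings({"web-server"}))  # []
print(len(checkpoint_warnings({"domain-controller", "cluster-member"})))  # 2
```

A script could refuse to create a checkpoint, or ask for explicit confirmation, whenever the returned list is not empty.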
Tradeoffs of disabling a Hyper-V checkpoint
There are no significant benefits to disabling Hyper-V checkpoints. Common arguments about checkpoints taking too much storage or impairing VM performance are typically exaggerated and are readily mitigated through comprehensive checkpoint management. For example, if the business is running short of storage because of Hyper-V checkpoints, then the problem is far more likely to be rooted in poor storage capacity planning than in Hyper-V checkpoint use.
The main advantage of disabling Hyper-V checkpoints is in management simplification. Issues such as checkpoint storage arise because checkpoints proliferate over time. The checkpoint tree can become unwieldy if left unattended. This requires management and well-considered checkpoint retention policies to address checkpoint merging and removal.
However, the primary disadvantage of disabling Hyper-V checkpoints is the loss of that powerful and effective short-term protection for VMs. VMs still require some form of effective backup (which should be implemented even with checkpoints enabled). But the process of VM backup and restoration with more traditional backup tactics and VM-aware tools can take considerably more time -- and expose the business to more risk -- than using fast checkpoint rollbacks.
Chances are that the potential disadvantages of disabling Hyper-V checkpoints far outweigh any meager benefits.