Look closer at the granularity, flexibility of hypervisor replication
Increasingly, storage services are moving to the hypervisor. Functions like snapshots, compression, deduplication and caching are being run on the same physical hosts that virtual machines (VMs) run on. Hypervisor-based storage services tend to bring down storage costs while having a better awareness of the environment they run in. One of the storage services that stands to gain the most from being hypervisor-based is replication.
Traditionally, replication has been executed at the storage-system level. Essentially, two storage systems communicate with each other over IP or a Fibre Channel extension product. The advantage of this approach is that it is relatively simple to manage when many applications need to be replicated because entire volumes or even the entire array can be replicated with a single command.
Another advantage is that when the storage system performs replication, it does not consume server resources, which can be important in a virtual environment. Finally, most of these systems replicate over a private network, so replication traffic doesn't compete with production systems for bandwidth.
On the downside, legacy replication tools lack the granularity that virtual environments need, because most of those environments store many VMs on a single LUN or volume. Array replication, because it operates at the LUN or volume level, replicates all of those VMs regardless of their value to the organization. This wastes bandwidth and consumes storage capacity on the secondary array.
Another problem with array-based replication is that most storage vendors require the replication target to be one of their own systems; in some cases, it has to be an identical system. Many organizations do not need a secondary system of equal quality and performance, but with array-based replication, that is often the only option.
Finally, if the environment has more than one storage system supporting the virtual environment -- an all-too-common reality -- a replication product has to be bought from each of those vendors. This also means that a unique secondary system must be bought per production array to be replicated.
Hypervisor-based replication
Hypervisor-based replication runs on a physical server either alongside the VMs or inside specific VMs. Either implementation provides much finer granularity than an array-based replication option. This means that no matter how many VMs are stored in a given volume, the administrator can choose exactly which VMs should be replicated. Since most VMs don't require real-time replication, the ability to individually select which VMs are replicated can greatly reduce both WAN bandwidth consumption and secondary storage requirements.
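The difference in replicated capacity can be sketched in a few lines of Python. This is only an illustration: the VM names, sizes and the replicate flag are invented for the example and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    size_gb: int
    replicate: bool  # hypothetical per-VM policy flag set by the administrator


def array_level_gb(volume):
    """Array-based replication copies the whole volume, so every VM counts."""
    return sum(vm.size_gb for vm in volume)


def vm_level_gb(volume):
    """Hypervisor-based replication copies only the VMs that were selected."""
    return sum(vm.size_gb for vm in volume if vm.replicate)


# One volume holding four VMs; only two of them are marked for replication.
volume = [
    VM("erp-db", 500, True),
    VM("web-01", 80, False),
    VM("test-01", 120, False),
    VM("mail", 300, True),
]

print(array_level_gb(volume))  # 1000
print(vm_level_gb(volume))     # 800
```

In this made-up volume, selecting VMs individually trims the replicated capacity from 1,000 GB to 800 GB; the savings in a real environment depend entirely on how many VMs actually need replication.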
In addition, a hypervisor-based replication approach is completely storage-system-agnostic: It can replicate from any storage system to any storage system. This means that the production storage system can be an advanced, flash-based storage system and the secondary system can be something less expensive. It can even be the old storage system that was replaced when production storage was refreshed.
The any-to-any capabilities of hypervisor replication also mean that a single replication tool can be chosen for the environment regardless of the mixture of vendor arrays used in the data center. The ability to standardize on a single tool reduces costs and streamlines management.
Finally, hypervisor-based replication can take advantage of its granularity for more than just VM selection. For example, a VM-based tool is more likely to be application-aware rather than just VM-aware. This means that at certain intervals, it can place the application into a non-cached mode so that a clean copy of the data can be made.
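To see why the non-cached mode matters, consider the toy Python model below. It is a sketch under stated assumptions, not how any real replication product works: ToyApp, non_cached and clean_copy are hypothetical names, and the "disk" is just a list.

```python
from contextlib import contextmanager


class ToyApp:
    """Hypothetical application that buffers writes in memory before disk."""

    def __init__(self):
        self.cache = []
        self.disk = []
        self.caching = True

    def write(self, record):
        # While caching is on, writes sit in memory and are not yet on disk.
        if self.caching:
            self.cache.append(record)
        else:
            self.disk.append(record)

    def flush(self):
        self.disk.extend(self.cache)
        self.cache.clear()


@contextmanager
def non_cached(app):
    """Turn caching off and flush pending writes, then restore caching."""
    app.caching = False
    app.flush()
    try:
        yield app
    finally:
        app.caching = True


def clean_copy(app):
    """Copy the disk only while the application is in non-cached mode."""
    with non_cached(app):
        return list(app.disk)


app = ToyApp()
app.write("order-1")
app.write("order-2")

naive_copy = list(app.disk)  # taken without coordination: misses cached writes
good_copy = clean_copy(app)  # taken in non-cached mode: sees both writes

print(naive_copy)  # []
print(good_copy)   # ['order-1', 'order-2']
```

A tool that sees only the disk captures the empty, inconsistent state; a tool that can coordinate with the application captures everything the application considers written.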
This granularity can also be used to monitor the condition of the VM or hypervisor itself and take appropriate action. For example, if a VM locks up, an array-based replication tool won't know it. A hypervisor-based replication tool can monitor specific VM conditions and take specific steps if a VM locks up, including starting a failover to the secondary system.
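The monitoring idea can be sketched as a simple heartbeat check. Everything here is an assumption made for illustration: the VMStatus record, the 30-second timeout and the failed_over flag, which stands in for whatever a real tool does to start the replica on the secondary system.

```python
from dataclasses import dataclass


@dataclass
class VMStatus:
    name: str
    last_heartbeat: float  # seconds, as read from any monotonic clock
    failed_over: bool = False


def check_and_failover(vms, now, timeout=30.0):
    """Mark VMs whose heartbeat is older than `timeout` as failed over.

    Returns the names of the VMs that were newly failed over on this pass.
    """
    triggered = []
    for vm in vms:
        if not vm.failed_over and now - vm.last_heartbeat > timeout:
            vm.failed_over = True  # stand-in for starting the replica VM
            triggered.append(vm.name)
    return triggered


vms = [VMStatus("erp-db", 100.0), VMStatus("mail", 125.0)]

# At t=140 s, erp-db has been silent for 40 s and mail for only 15 s.
print(check_and_failover(vms, now=140.0))  # ['erp-db']
# A second pass does not fail the same VM over again.
print(check_and_failover(vms, now=141.0))  # []
```

An array-based tool has no equivalent of last_heartbeat to inspect, which is why it cannot react when a VM locks up.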
Hypervisor replication downsides
Hypervisor-based replication does have its downsides. First, if the replication software is installed in the VM itself, an environment with thousands of VMs could be an installation and management nightmare. While some installation woes could be mitigated by VM templates, managing and monitoring the replication process could still be challenging. A hypervisor-based replication tool that is installed at the physical host layer simplifies the implementation and operation of replication jobs because there are fewer hosts than there are VMs. But host-level, hypervisor-based replication software may lose some of the granularity that an in-VM option has.
Many of these management limitations can be overcome by a graphical user interface (GUI) that does not overwhelm IT operations staff and instead presents them only with failed replication jobs or groups. One limitation that a GUI can't fix is resource consumption. A hypervisor-based replication tool, whether it is host-level or in-VM, will consume CPU and network resources. A host loaded with a couple dozen VMs doing replication may require additional CPU power and potentially a dedicated network connection.
The key is to use the tool that makes the most sense for your environment. Hypervisor-based replication is compelling, but array-based replication has its place.
If you have one or two storage systems for the virtual infrastructure, or if you have thousands of VMs, array-based replication may make more sense and actually be less expensive than hypervisor-based replication. If you are dealing with a more manageable number of VMs or have a wide mixture of storage systems supporting your virtual infrastructure, hypervisor-based replication may be more appropriate.