Block-level storage virtualization: Reasons to implement it
Learn about the reasons to implement block-level storage virtualization, in Part 1 of a four-part series.
This is Part 1 of a four-part series on block-level storage virtualization. In this story, we explain the reasons why IT departments would want to implement block virtualization. In the rest of the series, we examine how it’s implemented at the server level, in the storage array and at the network appliance level.
Block-level storage virtualization is a storage service that provides a flexible, logical arrangement of storage capacity to applications and users while abstracting its physical location. As a software layer, it intercepts I/O requests to that logical capacity and maps them to the appropriate physical locations. In this way virtualization enables administrators to provide the storage capacity when and where it’s needed while isolating users from the potentially disruptive details of expansion, data protection and system maintenance.
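The mapping step described above can be illustrated with a minimal sketch. It assumes a volume built from fixed-size extents; the extent size, the device names and the `VirtualVolume` and `PhysicalExtent` classes are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass

EXTENT_BLOCKS = 1024  # hypothetical extent size, in blocks


@dataclass(frozen=True)
class PhysicalExtent:
    device: str       # hypothetical device name, e.g. "array-a"
    start_block: int  # first physical block of the extent on that device


class VirtualVolume:
    """A logical volume backed by an ordered list of physical extents."""

    def __init__(self, extents):
        self.extents = list(extents)

    @property
    def size_blocks(self):
        return len(self.extents) * EXTENT_BLOCKS

    def resolve(self, logical_block):
        """Translate a logical block address to (device, physical block)."""
        if not 0 <= logical_block < self.size_blocks:
            raise ValueError("logical block out of range")
        index, offset = divmod(logical_block, EXTENT_BLOCKS)
        extent = self.extents[index]
        return extent.device, extent.start_block + offset


# One logical volume spanning two (hypothetical) physical devices.
volume = VirtualVolume([
    PhysicalExtent("array-a", 0),
    PhysicalExtent("array-b", 4096),
])
print(volume.resolve(10))    # ('array-a', 10)
print(volume.resolve(1500))  # ('array-b', 4572)
```

The host only ever sees logical block addresses 0 through 2047; which device actually holds a given block is a detail of the mapping table.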
Among the various virtualization technologies available, block-level storage virtualization has been around for many years, and while it hasn’t spread through data centers as widely as server virtualization has, it is being implemented. Among 433 respondents to Storage magazine’s fall 2011 storage Purchasing Intentions survey, 32% had virtualized at least some of their storage, and 23% were planning to evaluate it. And among those who had virtualized at least some of their storage, 14% reported that all of their block storage had been virtualized, while 72% said that some of their block storage had been virtualized.
The technology may be applied internally, to capacity that’s only connected to the controller running the virtualization engine, and it may also be applied to external storage devices on a network. These external systems can include storage capacity that’s exclusively from the same vendor (homogeneous) or may span multiple systems from different vendors (heterogeneous).
The most common implementations are as host-level or array-level services, mostly homogeneous, or as network-based appliances, which are mostly heterogeneous. In addition, storage virtualization is being implemented as a way to bring shared storage to a virtual server environment, typically as a VM-based appliance, and enable VM mobility, host clustering and dynamic provisioning.
Storage consolidation was perhaps the primary historical driver for the original SAN implementations. In this use case, block-level storage virtualization was used to create a physical pool of shared storage that would mimic the direct-attached storage (DAS) being replaced on each server as it was connected to the SAN (Fibre Channel carries the SCSI command set, so SAN volumes appear to the host much like local SCSI disks). In addition to capacity provisioning for multiple host servers, these “enterprise” disk array systems provided a number of services, such as snapshots, remote replication and, later, thin provisioning and deduplication.
Newer scale-out storage architectures are dependent upon logical storage virtualization to create a single pool of storage from multiple physically separate modules. But even with traditional scale-up designs, storage virtualization in one form or another has become a standard feature on practically all enterprise-class arrays and many midrange systems as well. It’s essentially required to efficiently manage any shared storage system, especially in larger implementations with a significant number of hosts accessing it, or if any sort of higher uptime is required.
Reasons to virtualize storage
So why would an IT department want to virtualize its storage resources? There are a number of scenarios in which storage virtualization makes sense. Let’s examine them one by one.
Supporting server virtualization and high availability (HA). Storage virtualization facilitates shared storage, which can enable VM migration and support load balancing among hosts, without having to migrate data between storage systems. It also simplifies storage resource optimization in a dynamic virtualized environment. A shared storage pool can support the clustering of virtualization hosts to enable higher availability with automated failover or faster restart of VMs after a failure is detected. In a similar fashion, a shared pool of highly available storage can be used to support critical applications on physical servers -- for manual failover of storage systems or to support clustering applications.
Easing administration. From an administrative perspective, a larger, shared storage system represents fewer points of management than having DAS systems connected to each host. Storage virtualization can also allow for nondisruptive growth in storage capacity when the time comes to expand existing arrays and for migrating data between storage systems. Most larger, shared storage arrays are likely to have better management tools and functionality as well, which can make new server allocation and the daily “care and feeding” easier to handle, reducing administrative overhead on a per-terabyte basis.
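To make the idea of nondisruptive growth and migration concrete, here is a minimal sketch. It assumes the virtualization layer keeps a table from logical extents to physical locations; the `MappedVolume` class, the device names and the in-memory `storage` dictionary are hypothetical stand-ins. A real product would also have to handle writes that arrive while an extent is being copied, which this sketch leaves out.

```python
class MappedVolume:
    """Logical extents mapped to (device, slot) physical locations."""

    def __init__(self, mapping):
        # logical extent index -> (device, slot); indices assumed to run 0..n-1
        self.mapping = dict(mapping)

    def locate(self, logical_extent):
        return self.mapping[logical_extent]

    def migrate(self, logical_extent, new_device, new_slot, copy_fn):
        """Copy one extent to a new location, then switch the mapping."""
        old_location = self.mapping[logical_extent]
        copy_fn(old_location, (new_device, new_slot))
        self.mapping[logical_extent] = (new_device, new_slot)

    def grow(self, device, slot):
        """Add capacity by mapping one more logical extent."""
        new_index = len(self.mapping)
        self.mapping[new_index] = (device, slot)
        return new_index


# Toy in-memory "arrays": (device, slot) -> extent contents.
storage = {("old-array", 0): b"payroll", ("old-array", 1): b"orders"}


def copy_extent(source, target):
    storage[target] = storage[source]


volume = MappedVolume({0: ("old-array", 0), 1: ("old-array", 1)})
volume.migrate(0, "new-array", 7, copy_extent)
new_index = volume.grow("new-array", 8)

print(volume.locate(0))           # ('new-array', 7)
print(storage[volume.locate(0)])  # b'payroll'
print(new_index)                  # 2
```

The host keeps addressing logical extent 0 throughout; only the table entry behind it changes, which is why the move does not have to interrupt the application.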
Enabling consolidation and tiering. Storage virtualization, especially on a network-based appliance, can also be used to consolidate existing storage and repurpose assets. For example, more performance-oriented production data can be put on newer arrays and older systems can be used as capacity tiers or for disk backup. Many storage systems and virtualization appliances also have tiering functionality to support this use case.
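As a rough illustration of tiering, the sketch below places the most frequently accessed extents on a faster tier and everything else on a capacity tier. The `plan_tiers` function, the tier names and the access counts are all hypothetical; actual tiering engines use their own policies and statistics.

```python
def plan_tiers(access_counts, fast_tier_slots):
    """Return {extent_id: tier}, putting the busiest extents on the fast tier.

    access_counts:   {extent_id: number of recent I/Os} (hypothetical metric)
    fast_tier_slots: how many extents the fast tier can hold
    """
    if fast_tier_slots < 0:
        raise ValueError("fast_tier_slots must be non-negative")
    # Busiest first; ties broken by extent id so the plan is deterministic.
    ranked = sorted(access_counts, key=lambda e: (-access_counts[e], e))
    hot = set(ranked[:fast_tier_slots])
    return {e: ("fast" if e in hot else "capacity") for e in access_counts}


counts = {"ext0": 5, "ext1": 900, "ext2": 40, "ext3": 310}
plan = plan_tiers(counts, fast_tier_slots=2)
print(plan)
# {'ext0': 'capacity', 'ext1': 'fast', 'ext2': 'capacity', 'ext3': 'fast'}
```

In this toy plan the "fast" tier would correspond to the newer arrays and the "capacity" tier to the older, repurposed systems.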
Simplifying data protection and DR. Storage virtualization can help to consolidate data across an environment in preparation for being copied off-site for DR. Some storage virtualization solutions include remote, asynchronous replication for this purpose.
Implementation
Block virtualization is typically implemented in one of three ways. It can run at the server level: as part of the software stack on the hypervisor, as a VM, or on an application server with access to DAS. It can also be part of the software running in the storage controller of a standalone disk array system. Or storage virtualization can be embedded in a storage appliance that’s connected to the network -- either as a turnkey hardware appliance with embedded storage or as a software solution that runs on standard server hardware. We’ll discuss each of these methods in subsequent tips.
Eric Slack is a senior analyst with Storage Switzerland.