
virtualization sprawl (VM sprawl)

What is virtualization sprawl (VM sprawl)?

Virtualization sprawl is a phenomenon that occurs when the number of virtual machines (VMs) on a network reaches a point where administrators can no longer manage them effectively. Sprawl can also occur when VMs fall into disuse or are abandoned, consuming computing resources while returning no benefit to the business. Virtualization sprawl is also referred to as virtual machine sprawl, VM sprawl or virtual server sprawl.

VM sprawl has become a common challenge for many organizations, and the more they rely on virtualization, the more likely they are to encounter this problem. Because sprawl can occur gradually, IT teams and business leaders might not be aware of it at first. But by the time they do realize it, the problem is often quite serious, offsetting many of the benefits that come with virtualization. Even when VM admins are aware of the issue, they can still have a difficult time identifying and removing the unwanted VMs.

Virtualization sprawl can leave unused VMs spread across the network, many of which are ignored or forgotten. These VMs might still run in the background and waste resources while serving no function. Even if a virtual machine is stopped or shut down, it still takes up valuable disk space and can pose a security risk.

Several factors can contribute to virtualization sprawl:

  • Creating VMs is much easier and faster than standing up physical servers, and VM provisioning can often be done automatically without having to go through approval or justification processes.
  • Because VMs can be created easily, VM owners frequently lose track of them and forget that they're out there, especially late in the VM or software lifecycle.
  • VM owners often hang on to their VMs in case they're needed for future projects, even if they haven't been used for a long time.
  • Because VMs are created with software rather than hardware, many users think of them as being free, not considering operating system (OS) licensing fees or resource usage and other operational expenses, such as hardware power and cooling.
  • Many organizations don't have governance policies or standardized processes in place to control VM creation and VM lifecycle management, potentially leading to business problems.

Because of these factors, VMs are being created faster than they can be removed, leading to virtualization sprawl and the serious consequences that come with it.

Why is VM sprawl an issue?

Virtualization sprawl can undermine many of the benefits that come with virtualization, including increased security, better resource utilization, easier management and lower costs. It introduces serious risks in four areas: security and compliance, management, performance and cost.

Security and compliance

Even if a virtual server was only used for a few days or weeks, it can run for years, potentially causing increased security and compliance risks. If one of these VMs is compromised, the organization might not know it's happened until it's too late. Other potential security issues include the following:

  • Unused VMs might not get patched or receive proper maintenance, and they tend to be easily forgotten.
  • Some VMs might contain or have access to sensitive corporate data or private information. Even if a VM has been shut down, its files still exist.
  • Unpatched VMs can become attack vectors.
  • Abandoned VMs might not receive important configuration changes, opening other attack vectors.
  • VMs without proper lifecycle or security monitoring could lead to attacks going unnoticed.
  • Successful attacks on poorly protected VMs could compromise the organization's regulatory compliance or business governance.

Management

Virtual server sprawl can add significant management overhead in the following ways:

  • If the VMs aren't running, admins must still manage the storage they use.
  • If the VMs are running, admins must balance the physical resources allocated to them against the needs of active VMs. In some cases, VM admins continue to update and patch unused VMs as part of their routine maintenance plans, incurring extra management overhead.
  • Sprawl can also affect data protection efforts, such as complicating disaster recovery strategies or increasing the number of backups that must be maintained.
  • VM sprawl can make it more difficult to forecast resource usage because of the uncertainties that come with all the unused VMs.

Performance

Virtual server sprawl can affect performance in the following ways:

  • A running VM, even if it sits idle, has resources allocated to it.
  • If a server hosts many of these VMs, the operational VMs might experience resource availability issues, slowing them down and affecting application performance.
  • Even if a VM is turned off, it still uses disk space, potentially affecting the performance of the operational VMs.
  • If unused VMs are spread across multiple hosts and still running, they might continue to use network bandwidth for routine maintenance tasks, which can affect both virtualized and bare-metal applications.
  • Resources consumed by abandoned and unused VMs can limit the number of other VMs that can share server and network resources.
  • Resource bottlenecks caused by unneeded VMs can impair server and network performance.
  • Sprawl resulting in performance issues can require unnecessary troubleshooting and expense to resolve.

Cost

Virtual server sprawl affects cost in the following ways:

  • Unused VMs take up disk space, regardless of whether they're running, often resulting in the need to purchase additional storage space.
  • The effect on compute and network resources can also translate to increased costs if IT must beef up infrastructure to support the operational VMs.
  • Even an idle VM can require CPU time and use network bandwidth. An organization might also be paying a substantial amount in licensing fees for its unused VMs.
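
The cost factors above can be turned into a rough estimate of what unused VMs cost each month. The following is a minimal Python sketch, not a pricing model: the `VmCost` fields, the sample inventory and the storage rate of 0.10 per GB are all hypothetical values chosen for illustration, and the sketch ignores compute, network, power and cooling costs.

```python
from dataclasses import dataclass


@dataclass
class VmCost:
    name: str
    disk_gb: float          # disk space consumed, whether or not the VM is running
    monthly_license: float  # OS and software licensing attributed to this VM
    in_use: bool            # outcome of a usage review (assumed to be known)


# Hypothetical storage rate per GB per month; substitute real figures.
STORAGE_RATE_PER_GB = 0.10


def monthly_waste(vms, storage_rate=STORAGE_RATE_PER_GB):
    """Sum the storage and licensing cost of VMs that are not in use."""
    return sum(
        vm.disk_gb * storage_rate + vm.monthly_license
        for vm in vms
        if not vm.in_use
    )


# Hypothetical inventory for illustration only.
inventory = [
    VmCost("web-01", disk_gb=80, monthly_license=20.0, in_use=True),
    VmCost("test-old", disk_gb=200, monthly_license=20.0, in_use=False),
    VmCost("demo-2022", disk_gb=50, monthly_license=0.0, in_use=False),
]

print(f"Estimated monthly cost of unused VMs: {monthly_waste(inventory):.2f}")
```

Even a coarse figure like this can help make the case that unused VMs are not free.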

Clearly, organizations that rely on virtualization must take sprawl seriously or risk significant consequences. Each unused VM wastes resources and introduces risks. To avoid virtualization sprawl, IT teams must take specific steps to address the unused VMs that already exist and to prevent more of those VMs from being created.

Causes of VM sprawl

Ironically, VM sprawl is often an unintended consequence of the very features that make virtualization so beneficial. It is driven by a combination of factors:

  • Ease and simplicity. Traditional server provisioning was a lengthy affair that required formal approvals to purchase and install hardware, configure the system and supporting network, then install and configure the resident application. It could take weeks or even months for a business to deploy a new bare-metal application. Virtualization lets companies provision a new VM from existing resources in a matter of minutes. This typically leads to VMs being perceived as cheap or free. Business leaders can easily request more VMs than they might need and keep them far longer than needed, sometimes abandoning or forgetting about unused VMs over time.
  • Lack of policy. The ease and speed with which VMs can be provisioned has led many businesses to forgo the traditional formalities of workload deployment and management. Simply taking the time to justify what a VM is for, who owns it and how long it's needed, as well as allocating the VM's operating costs to departments through accounting techniques such as showback or chargeback, could raise vital awareness of VM accountability.
  • Lack of VM lifecycle management. Lifecycle management is a core part of any hardware and software asset management. But VMs are often excluded from traditional lifecycle management because they're so readily created and easily destroyed that such formality is often overlooked. In fact, the ease and simplicity of VMs makes lifecycle management even more important. Take the time to add the management tooling needed to oversee VMs just as if they were physical systems. Watch the regular traffic and utilization trends over time.
  • Lack of governance. Traditional silos can remain between business leaders, application developers and operations staff. Once a VM is created, there might be no reliable trail of VM ownership or responsibility. Application owners think the VMs belong to IT, and IT thinks the VMs belong to project or department managers. Software developers might deploy and test builds in VMs but might not account for releasing those resources later. A business needs to clearly delineate the responsibility for VMs and implement the management infrastructure to discover and audit the presence of VMs, assign their ownership, and provide timely reporting needed to support VM lifecycle management decisions.

How can you prevent VM sprawl?

To get VM sprawl under control, IT teams must stop the careless behavior that leads to sprawl and take a more proactive approach to VM lifecycle management. Proper prevention can include the following steps:

  1. Establish policies. Start by implementing a comprehensive set of documented VM policies for controlling virtualization usage. The policies should help standardize the processes used to create, maintain, archive and destroy VMs so the unused ones are kept to a minimum. Users should be able to create VMs only when they're needed, and only the necessary physical resources should be allocated to the virtual servers to avoid overprovisioning.
  2. Implement tooling. In addition to defining policies, IT should implement VM lifecycle management and other software management tools that can audit the existing VMs. The audit should determine which VMs are actively operational and under the control of a virtualization platform and which aren't being used and can potentially be deleted or archived. The goal is to identify every VM on the network and document its usage, whether it's fully operational, running in an idle state or completely shut down. Admins should also evaluate the operational VMs to determine whether they conform to the newly defined policies and then take the steps necessary to bring them into compliance.
  3. Review reporting. Take the time to review the audits and reports provided by tools and evaluate reports against established policies. For the unused VMs, admins should carefully assess them to ensure they're no longer needed. However, they shouldn't delete or disconnect any unused VM until its status can be verified. That said, determining whether a VM is still needed isn't always a straightforward process. Sometimes it takes shutting down or disconnecting the VM to see whether anyone raises an objection. Proceed with caution. Some VMs might appear to be out of service but still serve an important function, if only part of the time.
  4. Take corrective action. Once it's determined that certain VMs are no longer in use, they can be archived or destroyed. When destroying VMs, admins should ensure no sensitive data can be compromised. They should also look for any VM file fragments and secondary files, such as temporary or configuration files, left behind. In addition, they should search for orphaned snapshots or backups and delete those in a secure manner once they've verified that they're no longer needed.

Figure: Four key signs of abandoned VMs. Abandoned and unused VMs can cause VM sprawl and lead to serious consequences.
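
The audit and review steps can be illustrated with a short Python sketch that sorts VMs into the three usage states mentioned above (fully operational, idle or shut down) and lists the ones that deserve a closer look. This is only a sketch under stated assumptions: the `VmRecord` fields, the sample data and the thresholds (under 2% average CPU, no login for 90 days) are hypothetical, and a real audit would pull this data from the virtualization platform's own reporting. The output is a list of candidates for review, not a deletion list, because a VM's status still needs to be verified with its owner.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class VmRecord:
    name: str
    power_state: str            # "running" or "stopped"
    avg_cpu_percent: float      # average CPU use over the observation window
    last_login: Optional[date]  # None if no login was ever recorded


def classify(vm, today, idle_cpu=2.0, stale_days=90):
    """Return 'stopped', 'idle' or 'operational' for one VM.

    The thresholds are illustrative defaults, not recommendations.
    """
    if vm.power_state != "running":
        return "stopped"
    no_recent_login = (
        vm.last_login is None
        or (today - vm.last_login) > timedelta(days=stale_days)
    )
    if vm.avg_cpu_percent < idle_cpu and no_recent_login:
        return "idle"
    return "operational"


def review_candidates(vms, today):
    """Names of VMs that are stopped or idle and should be verified with owners."""
    return [vm.name for vm in vms if classify(vm, today) != "operational"]


# Hypothetical audit data for illustration only.
today = date(2024, 5, 1)
audit = [
    VmRecord("app-01", "running", 35.0, date(2024, 4, 28)),
    VmRecord("build-tmp", "running", 0.5, date(2023, 11, 2)),
    VmRecord("legacy-db", "stopped", 0.0, None),
    VmRecord("batch-q", "running", 0.8, date(2024, 4, 20)),
]

for vm in audit:
    print(vm.name, classify(vm, today))
print("Review:", review_candidates(audit, today))
```

Note that `batch-q` is treated as operational despite its low CPU use because someone logged in recently, which reflects the caution that some VMs appear out of service but still serve a function part of the time.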

Concurrent with cleaning up the virtual environment, IT teams should take several additional steps as part of their VM lifecycle management plan. They might implement practices such as the following:

  • Monitor systems for signs of VM sprawl, such as inexplicably slow performance or server logs that show no login activity.
  • Use the advanced VM management features available through their virtualization platforms, such as automatically decommissioning VMs based on specified expiration dates.
  • Implement VM tagging to more easily track and inventory VMs after they've been deployed.
  • Establish a VM baseline and then perform regular audits at assigned intervals.
  • Assign costs to internal customers for VM usage, using cost models such as chargeback or showback so business groups are more aware of VM-related expenses.
  • Implement strict access controls that limit the number of users who can create VMs or reallocate resources to existing VMs.
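
Several of these practices, namely expiration dates, tagging and showback, lend themselves to simple automation. The Python sketch below shows one way they might fit together, assuming each VM carries `owner` and `expires` tags. The tag names, the sample VMs and the cost figures are hypothetical; actual virtualization platforms expose tags and expiration settings through their own interfaces, which this sketch does not attempt to model.

```python
from collections import defaultdict
from datetime import date

# Hypothetical inventory; tag names and costs are assumptions for illustration.
vms = [
    {"name": "erp-report", "tags": {"owner": "finance", "expires": "2024-03-31"}, "monthly_cost": 42.0},
    {"name": "ci-runner-7", "tags": {"owner": "engineering", "expires": "2024-12-31"}, "monthly_cost": 30.0},
    {"name": "poc-sandbox", "tags": {}, "monthly_cost": 18.5},
    {"name": "ledger-test", "tags": {"owner": "finance"}, "monthly_cost": 12.0},
]


def expired(vms, today):
    """Names of VMs whose 'expires' tag (ISO date) is earlier than today."""
    names = []
    for vm in vms:
        expires = vm["tags"].get("expires")
        if expires is not None and date.fromisoformat(expires) < today:
            names.append(vm["name"])
    return names


def untagged(vms):
    """Names of VMs with no 'owner' tag, which makes accountability unclear."""
    return [vm["name"] for vm in vms if "owner" not in vm["tags"]]


def showback(vms):
    """Total monthly cost per owner; VMs without an owner go to 'unassigned'."""
    totals = defaultdict(float)
    for vm in vms:
        totals[vm["tags"].get("owner", "unassigned")] += vm["monthly_cost"]
    return dict(totals)


today = date(2024, 5, 1)
print("Expired:", expired(vms, today))
print("No owner tag:", untagged(vms))
print("Showback:", showback(vms))
```

A report like this only informs business groups of their VM-related expenses (showback). Billing those amounts back to departments (chargeback) would be a separate policy decision.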

To maintain control over their VMs, IT teams also need the proper tools to manage and monitor VM operations across their networks. The right tools can offer insights into the entire VM ecosystem, providing information such as how many VMs are running, who owns the VMs, which computers are hosting the VMs and where VM data is stored. Many tools can also track details about VM software and OS licenses. Some tools also offer advanced automation and orchestration capabilities to streamline management operations and reduce VM sprawl. Most full-featured infrastructure management and monitoring tools now offer ample support for virtualized environments with VMs and containers.

With the right management tools and well-defined policies, IT teams can overcome their VM sprawl challenges. But they must first recognize the seriousness of the problem and be willing to properly address it.


This was last updated in May 2024
