Top considerations for VMware GPU virtualization
Virtualizing GPUs isn't easy without proper planning. In VMware environments, admins should consider licensing, resource requirements and hardware capabilities.
Virtualizing a GPU is no longer exclusive to VDI; more and more desktop, server and high-performance computing applications rely on advanced vGPUs. As a result, it's critical for virtual administrators to realize that virtualizing a GPU is not the same as virtualizing CPU or RAM. vGPUs require a different approach to design, licensing and deployment, especially in VMware environments.
When it comes to VDI systems, admins will find many discussions on the required I/O, CPU and memory resources, which makes sense as these are often critical in setting up a VDI system. However, one of the things admins don't always consider is the GPUs themselves.
Admins need a virtualization-capable GPU to successfully build a VDI system. The two main vGPU vendors are Nvidia and AMD, but it's the Nvidia product line that has the longer history with the VMware product families.
Consider licensing, resource needs prior to VMware GPU virtualization
Although a GPU is hardware that sits under the hypervisor like any other device, each GPU model is unique. Admins can mix and match GPU hardware within a cluster, which can become a challenge for some. For example, admins can have different models of GPU cards in the same VMware cluster, which enables them to run VMs on different tiers of GPU cards for optimal performance, cost benefit and flexibility.
However, each host in that cluster must run the same GPU cards internally. So, while admins' hosts can have different GPU models, each host can only have one model installed. It would be similar to having two different CPU models in the same server platform; admins can't have two different CPUs and maintain a stable hypervisor.
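Before building such a cluster, it can help to confirm what each host actually exposes. The following is a minimal pyVmomi sketch that lists the GPU devices per host; the vCenter address, credentials and cluster name are placeholders, and it assumes the hosts report their graphics devices through the graphicsInfo property available in recent vSphere versions.

```python
# Minimal sketch using pyVmomi to list GPU devices per host in a cluster.
# The connection details and the cluster name below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                  pwd='password', sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
for cluster in view.view:
    if cluster.name != 'GPU-Cluster':   # hypothetical cluster name
        continue
    for host in cluster.host:
        # graphicsInfo lists each graphics device and whether it runs in
        # shared (vSGA) or sharedDirect (vGPU) mode
        devices = host.config.graphicsInfo or []
        models = {d.deviceName for d in devices}
        print(host.name, models or 'no GPU detected')

Disconnect(si)
```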
What this means for admins is they must pay more attention to VMware Distributed Resource Scheduler (DRS) and high availability groups to ensure each workload runs on a host with the GPU model assigned to it. This doesn't mean admins can't move a workload from a host with one GPU model to a host with a different model, but to do so, they must shut down the guest first and ensure their licensing is in order. Between hosts that share the same GPU model, admins can use vMotion to migrate guests as usual.
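As an illustration of the DRS side of this, the sketch below creates a host group, a VM group and a required VM-Host affinity rule through pyVmomi so vGPU workloads stay on matching hosts. The group and rule names are hypothetical, and the cluster, gpu_hosts and gpu_vms variables are assumed to be managed object references looked up elsewhere, for example with a container view as in the previous sketch.

```python
# Sketch: pin vGPU VMs to the hosts that carry the matching GPU model by
# creating DRS groups and a required VM-Host affinity rule.
# 'cluster', 'gpu_hosts' and 'gpu_vms' are assumed pyVmomi managed object
# references obtained elsewhere; the names used here are hypothetical.
from pyVmomi import vim

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[
        vim.cluster.GroupSpec(operation='add',
                              info=vim.cluster.HostGroup(name='t4-hosts', host=gpu_hosts)),
        vim.cluster.GroupSpec(operation='add',
                              info=vim.cluster.VmGroup(name='t4-vms', vm=gpu_vms)),
    ],
    rulesSpec=[
        vim.cluster.RuleSpec(operation='add',
                             info=vim.cluster.VmHostRuleInfo(
                                 name='run-t4-vms-on-t4-hosts',
                                 enabled=True,
                                 mandatory=True,   # "must run on" rather than "should run on"
                                 vmGroupName='t4-vms',
                                 affineHostGroupName='t4-hosts')),
    ],
)
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```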
Admins must ensure they have hosts with the right GPUs to handle failover or, better yet, select GPUs that can cover a wider range of workloads so they can standardize on a common model. One of the great things about VMware and Nvidia is the ability to allocate up to four GPUs per VM. Admins can scale to midtier and upper-tier GPUs, such as the Nvidia Tesla T4 or Quadro, to handle higher-end workloads without breaking the bank or requiring several different hardware GPUs to support more traditional knowledge workers.
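To show what allocating a vGPU to a VM looks like in practice, here is a hedged pyVmomi sketch that attaches a single Nvidia vGPU profile to an existing, powered-off VM. The vm variable is assumed to be a VirtualMachine reference obtained elsewhere, and the profile name grid_t4-16q is only an example of Nvidia's naming; the device entry can be repeated to attach additional vGPUs where the card and license allow it.

```python
# Sketch: attach an Nvidia vGPU profile to a powered-off VM via pyVmomi.
# 'vm' is an existing VirtualMachine reference; the profile string is an
# example of Nvidia's grid_<card>-<framebuffer>q naming and may differ.
from pyVmomi import vim

vgpu_device = vim.vm.device.VirtualPCIPassthrough(
    backing=vim.vm.device.VirtualPCIPassthrough.VmiopBackingInfo(vgpu='grid_t4-16q'))

spec = vim.vm.ConfigSpec(
    deviceChange=[vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add,
        device=vgpu_device)],
    memoryReservationLockedToMax=True)  # vGPU VMs require their full memory reserved

task = vm.ReconfigVM_Task(spec=spec)
```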
In addition, admins must change the host graphics setting from Shared to Shared Direct. If admins skip this step, their VMs won't boot.
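The same change can be scripted. Below is a minimal pyVmomi sketch, assuming host is a HostSystem reference that was looked up beforehand; the values mirror the Shared and Shared Direct options in the vSphere Client, and the host's Xorg/vGPU services typically need a restart afterward for the change to take effect.

```python
# Sketch: switch a host's default graphics type from 'shared' (vSGA) to
# 'sharedDirect' (vGPU). 'host' is an assumed HostSystem reference.
from pyVmomi import vim

gfx_mgr = host.configManager.graphicsManager
gfx_mgr.UpdateGraphicsConfig(
    vim.host.GraphicsConfig(
        hostDefaultGraphicsType='sharedDirect',
        sharedPassthruAssignmentPolicy='performance'))  # or 'consolidation'
```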
Once admins have their GPUs set on a specific set of hosts, they also require a license. With vGPUs, admins must have a software license that enables the guest driver to unlock the virtual GPU functionality. Though this adds another license to manage, it offers a great opportunity for admins.
This is because the vGPU's features are determined by the license edition, not by the driver itself. Enabling or disabling certain functions is a matter of applying the appropriate license rather than reinstalling or reconfiguring drivers, which is a real benefit for admins who might otherwise worry about reworking settings as needs change.
Additionally, if admins are using multiple GPUs, they must rely on a technology such as Nvidia's NVLink to pull everything together. Admins can't piece together partial resources from several GPUs to allocate two to four GPUs per VM; each one must be a fully allocated card.
Additional considerations to ensure successful vGPU deployment
Because vGPUs are virtualized through the same hypervisor technology as other resources, they share the same security benefits. Though graphics security might not be the first priority for some admins, it does come into play when admins use GPUs for high-performance computing, deep learning and AI.
The last thing admins should be aware of is that the host hardware platform most likely will need to change. GPUs are serious workhorse cards, and they require the right platform to support them. Hardware platforms that are not GPU-optimized can lack the physical internal space, cooling and power to support several GPU cards.
Admins must work with both VMware and Nvidia to find certified hardware platforms, select the right cards for their use cases and account for the power and cooling requirements in their data center. The challenges that come with virtualizing GPU technology should be somewhat familiar to admins, with a few differences. Virtualizing GPUs isn't a whole new game; it's moving the game to a different field. If admins know the quirks, they can avoid them and have a successful deployment.