ControlUp targets virtual desktop, server overprovisioning
If IT pros overprovision virtual desktops and servers, they waste money and resources. But there are other problems associated with poor resource allocation.
ANAHEIM, Calif. -- One trick for desktop or server virtualization success is to find the right amount of resources to dedicate to each machine. Too little RAM or virtual CPU capacity can grind performance to a halt, but too much probably means IT overspent.
It comes down to finding an equilibrium that makes a deployment as efficient as possible. ControlUp, a monitoring software company based in San Jose, Calif., conducted a study of more than 1,000 IT networks over 60 days to see how common provisioning issues are. The report highlights key performance indicators and helps IT professionals understand how they compare to other organizations. It can also help organizations drill down and see where they are overprovisioning.
The results were conclusive: Eighty-eight percent of desktops and 92% of servers were overprovisioned by more than one virtual CPU. In fact, only one in 10 desktops and 7% of servers the company analyzed were provisioned properly. Memory overprovisioning costs companies $233 per server on average.
At Citrix Synergy 2018, ControlUp CTO Yoni Avital spoke about the results of the study -- which the company plans to release later this month -- and other market trends.
Why is overprovisioning so common?
Yoni Avital: VMware, Microsoft and Citrix have a tendency to recommend larger-than-necessary sizes to make sure their own applications work. They don't really care about the efficiency of the workload on the customer side. Obviously, if you provision enough network and RAM, you increase the chances that everything will work OK; virtualization admins tend to overprovision for much the same reason.
You take these two factors and think about an application project. A team comes in saying, 'Our vendor recommends eight CPUs and 64 GB of RAM for each server.' Then they say, 'I want to make sure this project will work, so let's put in 10 CPUs and 85 GB of RAM.' From that moment on, no one really touches it as long as it's working fine. You find yourself with some customers where almost 100% of the workloads were overprovisioned with no need whatsoever.
Why is overprovisioning such a problem?
Avital: When we speak about compute resources, it's all about how many CPU cycles we have available, how much RAM and how much disk I/O and network I/O there is to work with. Different applications require different resources. If you overprovision them, on one hand you increase the chances that you will never face a resource consumption problem, which is fine. But if you're doing it too much, you're seriously not efficient. You have tons of hardware, all of it is allocated, but it's not really being used. That's the main point: The allocation-to-usage ratio [needs to be] closer [to one-to-one] to be more efficient.
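To make the allocation-to-usage idea concrete, here is a minimal sketch that compares what each VM is given to what it actually consumes. The inventory data, metric names and the 2x threshold are assumptions for illustration, not ControlUp's actual model.

```python
# Minimal sketch (not ControlUp's method): estimate an allocation-to-usage
# ratio per VM from hypothetical inventory data.

from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus_allocated: int
    ram_gb_allocated: float
    avg_cpu_used: float      # average vCPUs actually busy
    avg_ram_gb_used: float   # average RAM actually in use


def allocation_to_usage(vm: VM) -> dict:
    """Return how much larger the allocation is than observed usage."""
    return {
        "vm": vm.name,
        "cpu_ratio": vm.vcpus_allocated / max(vm.avg_cpu_used, 0.1),
        "ram_ratio": vm.ram_gb_allocated / max(vm.avg_ram_gb_used, 0.1),
    }


fleet = [
    VM("app-server-01", vcpus_allocated=10, ram_gb_allocated=85,
       avg_cpu_used=2.5, avg_ram_gb_used=24),
    VM("desktop-pool-07", vcpus_allocated=4, ram_gb_allocated=16,
       avg_cpu_used=3.1, avg_ram_gb_used=12),
]

for vm in fleet:
    ratios = allocation_to_usage(vm)
    # A ratio close to 1.0 means allocation tracks usage; a large ratio
    # (above 2.0 here, an arbitrary threshold) suggests overprovisioning.
    flagged = ratios["cpu_ratio"] > 2.0 or ratios["ram_ratio"] > 2.0
    print(ratios, "-> overprovisioned?", flagged)
```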
What can organizations do to combat overprovisioning?
Avital: What we recommend customers do is, we say, 'You have 100 workloads, and 80 of them are overprovisioned, so go ahead and right-size some of them based on our recommendations. Wait a week or two, make sure everything works -- no complaints whatsoever -- so you're a bit more confident with the data points we offer, then go ahead and continue the right-sizing process for everything.'
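A staged rollout along those lines could be scripted roughly as in the sketch below. The workload list, the pilot size and the apply_recommendation() stub are hypothetical placeholders, not ControlUp's tooling.

```python
# Rough sketch of a staged right-sizing rollout: apply recommendations to a
# small pilot group first, then (after a validation period) to the rest.
# All names and the apply_recommendation() stub are hypothetical.

import random

workloads = [
    # (name, currently overprovisioned?, recommended vCPUs, recommended RAM GB)
    ("srv-01", True, 4, 32),
    ("srv-02", True, 2, 16),
    ("srv-03", False, 8, 64),
    ("srv-04", True, 4, 24),
]


def apply_recommendation(name: str, vcpus: int, ram_gb: int) -> None:
    # Placeholder for whatever hypervisor or cloud API call would resize the VM.
    print(f"Resizing {name} to {vcpus} vCPUs / {ram_gb} GB RAM")


overprovisioned = [w for w in workloads if w[1]]

# Phase 1: right-size a small pilot subset and monitor for a week or two.
pilot = random.sample(overprovisioned, k=min(2, len(overprovisioned)))
for name, _, vcpus, ram_gb in pilot:
    apply_recommendation(name, vcpus, ram_gb)

# Phase 2 (after confirming no complaints): right-size everything else.
remaining = [w for w in overprovisioned if w not in pilot]
for name, _, vcpus, ram_gb in remaining:
    apply_recommendation(name, vcpus, ram_gb)
```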
What other metrics do you consider?
Avital: We're continuously calculating benchmarks for various popular applications and agents you would find in any enterprise. For example, an antivirus agent should behave pretty much the same across enterprises. It should schedule scans, compare what it finds to its own signature patterns and either send an alert or not, depending on what it finds. If you look at hundreds of enterprises, the footprint of the antivirus agent should be similar.
We calculated the benchmark for antivirus resource consumption. Say, on average, it should consume 100 MB of RAM and 1.5% CPU. On a per-customer basis, we compare that benchmark to what they're actually seeing. If the gap is large -- for example, the benchmark is 1.5% CPU, but the customer is seeing 10% CPU -- something is happening there. It's the same version, the same software running on similar workloads, but the gap is there. In one case, it turned out to be a configuration issue for a customer using a problematic version, and by fixing that, they lowered the bandwidth consumption of that specific service.
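As a rough illustration of that kind of comparison, the sketch below flags agents whose observed footprint far exceeds a fleet-wide baseline. The figures and the 3x gap threshold are assumed for the example and are not ControlUp's actual benchmark values.

```python
# Minimal sketch of comparing observed agent resource use against a
# cross-customer benchmark. Numbers and threshold are illustrative only.

benchmark = {"antivirus_agent": {"ram_mb": 100, "cpu_pct": 1.5}}

observed = {
    "customer-a": {"antivirus_agent": {"ram_mb": 110, "cpu_pct": 1.7}},
    "customer-b": {"antivirus_agent": {"ram_mb": 95, "cpu_pct": 10.0}},
}

GAP_FACTOR = 3.0  # flag when observed use exceeds the benchmark by this factor

for customer, agents in observed.items():
    for agent, usage in agents.items():
        base = benchmark[agent]
        for metric in ("ram_mb", "cpu_pct"):
            if usage[metric] > base[metric] * GAP_FACTOR:
                # A gap this large on the same software version usually points
                # at a configuration problem rather than legitimate load.
                print(f"{customer}: {agent} {metric} is {usage[metric]} "
                      f"vs. benchmark {base[metric]} -- investigate")
```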
What market trends are you watching?
Avital: Probably the main thing we're seeing [in end-user computing] is user experience issues, whether it's slow logons or slow launches of applications. Perception of performance is crucial for day-to-day work routines. Without the visibility that [monitoring] provides, it turns into constant complaints and admins saying, 'We hear you, but we don't really see what's happening. We tried to fix that.' It's an endless cycle with nothing being resolved.
How does ControlUp help organizations that have an eye to the cloud?
Avital: Lots of customers have the CIO saying, 'I want to understand how much it will cost to migrate our current workloads to Microsoft Azure or Amazon Web Services.'
They take their current on-premises configuration, compare it to the equivalent instance in the public cloud and say, 'It's going to cost us a million bucks per month.' What we do [is make] sizing recommendations and use those for cloud migration scenarios. We tell customers: Don't base the cost on the current allocations; base it on the recommended sizing from what we actually discovered. Then the estimates are much lower and make more sense.
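A back-of-the-envelope version of that comparison might look like the sketch below. The instance sizes and hourly prices are invented for illustration; they do not reflect real Azure or AWS rates or ControlUp's estimates.

```python
# Sketch: estimate monthly cloud cost from current allocations vs.
# right-sized recommendations. All prices and sizes are hypothetical.

HOURS_PER_MONTH = 730

# Hypothetical price list keyed by (vCPUs, RAM GB) -> $/hour.
price_per_hour = {
    (16, 64): 0.80,
    (8, 32): 0.40,
    (4, 16): 0.20,
}

servers = [
    # (name, currently allocated size, recommended right-sized size)
    ("erp-app", (16, 64), (8, 32)),
    ("file-srv", (8, 32), (4, 16)),
]


def monthly_cost(size: tuple) -> float:
    return price_per_hour[size] * HOURS_PER_MONTH


as_allocated = sum(monthly_cost(alloc) for _, alloc, _ in servers)
right_sized = sum(monthly_cost(rec) for _, _, rec in servers)

print(f"Estimate from current allocations: ${as_allocated:,.0f}/month")
print(f"Estimate from right-sized workloads: ${right_sized:,.0f}/month")
```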