Why hyper-converged infrastructure simplifies IT management
As fractured data center landscapes shift to HCI environments to consolidate IT operations, learn the many advantages, disadvantages and challenges of hyper-converged deployments.
As businesses expand their IT infrastructures, hasten their migration to multi-cloud environments and add layers of specialized, separately managed resources and workloads, the pressure is on IT to harness and consolidate under one virtual roof a fractured infrastructure comprised of data centers and assorted hardware, software, IaaS, PaaS and SaaS platforms.
Hyper-converged infrastructure (HCI) combines compute, storage and virtualization resources to help simplify, automate and streamline IT operations. "By putting the different resources under the control of one system and allowing them to be pooled in a single, virtualized resource," reported Enterprise Strategy Group (ESG), a division of TechTarget, "HCI can make data centers more scalable, flexible and cost-effective while improving IT management and productivity."
Only 8% of organizations currently have more than half of their business applications and workloads running on HCI, but over the next two years, that number is expected to jump to 28%, according to ESG's October 2021 HCI survey of IT professionals. In addition, HCI is seen surpassing hybrid cloud as the preferred on-premises infrastructure, with 82% of respondents planning to increase their HCI spending and 68% intending to deploy a new HCI cluster.
"HCI has a lot of potential value to contribute in the data center by making IT's experience of managing data center infrastructure a lot more cloud-like, making the infrastructure look like cloud infrastructure," says John Burke, CTO at advisory and strategic consultancy Nemertes Research. "In evaluating HCI, IT needs to carefully weigh the benefits that they'll get from simplified management -- and really, simplified procurement in some ways -- against the various challenges that come with that simplified management and this deployment of a new platform."
In this video, Burke details why hyper-converged infrastructure and the many forms of software- and hardware-based HCI for data centers, outside branches, IoT and the edge come with their share of advantages and disadvantages. "[Y]ou should evaluate at least three options," Burke advises. "Hands-on experience of the platform is critical to making the right decision as to whether to deploy and what to deploy."
Transcript
My name is John Burke, CTO of Nemertes Research, and I'm here to talk about using hyper-converged infrastructure to simplify IT. I'll start by introducing Nemertes so you have an understanding of where the data I'm presenting comes from. We'll define what we mean when we're talking about HCI. And we'll discuss how hyper-converged infrastructure can help in the age of multi-cloud. We'll discuss some of the challenges that come along with using HCI and wrap up with some recommendations.
Nemertes is a research and strategic advisory company that is focused on uncovering and quantifying the business impact of emerging technologies. We interview and collect data from IT practitioners and analyze it to uncover how the deployment of newer technologies -- emerging technologies -- is driving business value in enterprises, and also what the challenges and obstacles to deploying that technology have been and how folks have overcome them, how they've gotten to succeed. And we'll quantify what we mean by success as well, providing metrics and KPIs.
HCI is something that we have been covering for probably six or seven years now. And in the beginning, hyper-converged infrastructure meant one very specific thing. It meant boxes that were purpose-built to run a platform for virtualizing the storage and compute within the box, and providing a virtual network among the virtual machines that were spun up inside that box. Scaling in this kind of early hyper-converged infrastructure was achieved by adding boxes. You would just get another of these specialized boxes and connect it to the first one, and you'd have a bigger pool of resources. And the magic under the covers was that, as soon as you connected the two of them together, the platform would recognize that it had more resources to play with, would incorporate those new resources into the management pool, and would provide all of that to the hypervisor inside and make it available for deployment immediately. You could immediately assign virtual machine images to those new resources.
Now, these days, seven years of early use cases, seven years of evolution in the platforms and seven years of market-driven divergence in what is called HCI have resulted in a current market space that is still inclusive of that hardware-based approach -- you can still go out and buy those. But, more often now, folks are thinking of a software-based solution that they will themselves layer onto off-the-shelf commodity servers -- servers that will work with in-chassis storage, drives right in the boxes, or storage that is directly attached to one of those boxes. So, you can still scale by buying more boxes, that's still an option. Or, you can scale by buying bigger boxes, or you can scale by upgrading boxes that you already have. You can add storage in a chassis, you can add processing power, you can add memory.
Still, though, the goal is -- and the value proposition is -- that you can manage it all as one pool of resources, and adding new resources behind the scenes is still transparent from the systems administrator's view. They don't have to do anything special. Once they've plugged the new boxes in, the software waves its magic wand and all of those resources are available for use immediately. There are platforms available from folks like VMware, Nutanix, HPE and others; they'll provide a software-based platform. Some of them will also still provide a hardware-based platform, and a lot of options are available in the market.
As I said, we've been doing research and touching on the subject of HCI for many years now and we have seen adoption rates gradually creep up to the point where immediately pre-pandemic, we were seeing a little over a quarter of organizations already using hyper-converged infrastructure. And within that year -- so, by the end of 2019 -- another 20% were planning on deploying their first HCI. Now, another 15% were planning on deploying in 2020 at that time; but, as we know, the pandemic put a lot of IT plans on hold, especially plans that required IT staff to be on premises, like the installation of new data center infrastructure. And so, we don't actually know how that has all come out and won't be asking about HCI again until next year. So, it's safe to assume that almost half of organizations, or about half of organizations, have deployed HCI at this point and that a substantial fraction more are planning to. Now, where and why is the question.
In the current IT environment, we have crossed a major threshold -- and we crossed this threshold in 2018, 2019. And that was the point where over half of enterprise workloads were running outside enterprise walls. And, at this point, we see about 40% of enterprise workloads still running in data centers -- only 40%. And the majority of the rest is running in either SaaS, 27%, or IaaS environments, about 29%. And that's, you know, things like Azure or AWS or Google Cloud or Oracle Cloud. That leaves just a little bit of residual work, about 4%, still on premises but not in a data center -- what we used to call branch computing and what is now turning into something called edge computing.
Focusing first on the data center, we can see that HCI has a lot of potential value to contribute in the data center by making IT's experience of managing data center infrastructure a lot more cloud-like, making the infrastructure look like cloud infrastructure. And that can reduce dramatically the amount of management that IT has to do in providing that hardware level of infrastructure that their cloud, or private cloud environment, or the environment they're trying to turn into a private cloud can then take advantage of.
But, and this is important to remember, it's not in itself a private cloud platform. A cloud management platform can be, and increasingly is, wrapped around HCI. This is, I think, the biggest single use case for HCI in the data center; something like 40% of deployments are focused on providing infrastructure for a private cloud environment. But it's only one piece of that. And it's critical to understand that the HCI is going to be wrapped with a cloud management platform, something that does more than just manage the resource pools. A cloud management platform is focused on providing that layer of functionality that makes consuming those resources like the cloud experience that folks have when they use external clouds -- so that they can provision images or whole stacks of images from a portal or catalog or, increasingly, if they're a DevOps shop, call on those catalog items via an API from software that's being constructed in-house. The cloud management platform is also the layer that is supposed to take care of orchestrating the deployment of multiple workloads with each other, scaling workloads up or down dynamically in response to demand. It is supposed to be able to move a workload or components of one from one environment to another, from one piece of hardware to another, inside the environment -- again, in response to demand. Sometimes it will also include a cloud service broker functionality, where it can decide which cloud to run a given catalog item on. Will it deploy it against internal resources, or will it push it out into one of those external cloud environments?
Importantly, cloud service management platforms also are focused on providing that layer of reporting that makes the experience, again, more cloud-like. Being able to report against service-level agreements -- is this service staying up and performing at the rates that we're promising our end users? It can monitor performance and report back to that. It can monitor compliance. Are these workloads running in the geographies that they're supposed to be running, if you're working in multiple data centers? Are they running separately from those other workloads that they're never supposed to share hardware with? And, of course, monitoring and reporting on security -- who's accessing this workload? What is that workload trying to talk to, and why?
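To make the compliance questions in the transcript concrete, here is a minimal Python sketch of the two checks mentioned: whether a workload runs in a permitted geography, and whether two workloads that must never share hardware have landed on the same host. The Placement record, the function name and the sample data are hypothetical and not taken from any particular cloud management platform; a real product would read this information from its own inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical record of where one workload currently runs."""
    workload: str
    host: str
    region: str


def compliance_violations(placements, allowed_regions, anti_affinity):
    """Return human-readable violations of geography and separation rules.

    allowed_regions: dict mapping workload name -> set of permitted regions
                     (workloads without an entry are unrestricted).
    anti_affinity:   list of (workload_a, workload_b) pairs that must never
                     share a host.
    """
    violations = []

    # Geography check: is each workload running where it is supposed to run?
    for p in placements:
        allowed = allowed_regions.get(p.workload)
        if allowed is not None and p.region not in allowed:
            violations.append(
                f"{p.workload} runs in {p.region}, allowed: {sorted(allowed)}"
            )

    # Separation check: do any forbidden pairs share the same host?
    workloads_by_host = {}
    for p in placements:
        workloads_by_host.setdefault(p.host, set()).add(p.workload)
    for host, workloads in workloads_by_host.items():
        for a, b in anti_affinity:
            if a in workloads and b in workloads:
                violations.append(f"{a} and {b} share host {host}")

    return violations


# Illustrative data only.
placements = [
    Placement("payroll", "host1", "eu"),
    Placement("web", "host1", "eu"),
    Placement("analytics", "host2", "us"),
]
report = compliance_violations(
    placements,
    allowed_regions={"analytics": {"eu"}},
    anti_affinity=[("payroll", "web")],
)
for line in report:
    print(line)
```

The sketch only reports violations; deciding whether to move a workload in response is the orchestration role described earlier in the transcript.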
And lastly, resource usage. A critical component of the external cloud experience is that you can tell how many resources your given job is consuming. And that's critical both for you, because that determines how much you pay in that external cloud environment; it's also critical for the folks running that cloud, that they understand where the work is getting done, who it's getting done for and how much they're charging for it. If you're running a true private cloud inside your organization, you might, again -- and like an external cloud provider -- charge back based on actual resource usage to recover funds from the organization business units that are demanding these workloads be run. Or, you might just use that information to do what's called show back and show to the organization how much of your overall resource utilization is going to deliver which kinds of functionality, how much to the jobs that HR is running, how much to the jobs that marketing or sales or production are running. And there are lots of cloud management platforms out there, again, from VMware, from Red Hat, the OpenStack deployments that are available. Overall, a little over half of organizations have begun to use a cloud service management platform to begin that transition to a true private cloud environment inside their data centers.
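The chargeback and showback ideas described above come down to simple arithmetic over metered usage. The following Python sketch is an illustration under stated assumptions: the usage records, the two rates and the function names are invented for the example and do not come from the transcript or from any specific product.

```python
from collections import defaultdict

# Hypothetical metered usage: (business_unit, vcpu_hours, storage_gb_months).
USAGE = [
    ("HR", 1200.0, 500.0),
    ("Marketing", 3000.0, 1500.0),
    ("Sales", 1800.0, 1000.0),
    ("HR", 800.0, 0.0),
]

# Assumed internal rates, for illustration only.
RATE_PER_VCPU_HOUR = 0.05
RATE_PER_GB_MONTH = 0.10


def chargeback(records):
    """Chargeback: total metered cost to bill back to each business unit."""
    totals = defaultdict(float)
    for unit, vcpu_hours, gb_months in records:
        totals[unit] += (
            vcpu_hours * RATE_PER_VCPU_HOUR + gb_months * RATE_PER_GB_MONTH
        )
    return dict(totals)


def showback(records):
    """Showback: each unit's fraction of total cost, with nothing billed."""
    costs = chargeback(records)
    total = sum(costs.values())
    if total == 0:
        return {unit: 0.0 for unit in costs}
    return {unit: cost / total for unit, cost in costs.items()}


if __name__ == "__main__":
    for unit, share in sorted(showback(USAGE).items()):
        print(f"{unit}: {share:.1%} of metered cost")
```

The same usage records feed both functions; the difference between chargeback and showback is only whether the resulting figures are billed or simply reported.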
Outside the data center -- where that, you know, roughly 4% of work is happening currently -- HCI has a significant role to play as well, and there a hardware-based HCI is especially appropriate. But we do see some software deployment there as well. The idea is that moving from legacy branch computing to true edge computing is, in part, making the transition from everything kind of growing the way it grows and being managed the way it's managed based on the history of that particular branch of the organization, to an environment where any compute resources that are still outside the data center are pretty much cookie-cutter deployments of the same technology and the same platform. There may be different resources in the box, or different sizes or numbers of boxes, based on the needs of a particular branch -- but all of them are basically the same, so that they can be centrally managed and administered to some extent as a single, widely distributed but uniform infrastructure. So, edge infrastructure is more centralized and more manageable.
And that is really a sweet spot for HCI, providing branch-in-a-box kind of infrastructure that you can manage like the box in every other branch that you've still got infrastructure in. And this is going to become increasingly important across different kinds of enterprises as we see the need for edge computing spread, mainly in response to internet of things-type deployments. And, for example, in manufacturing companies as production lines become steadily more intelligent, as there's more sensor data generated in every line, as there are more points of remote control on each line, it becomes increasingly important to have the ability to receive all that sensor data and do initial analytics on it and make some rapid decisions based on it in real time or hard real time, you know, in the sub-five-millisecond range. And that leads to the need to have the compute resources dedicated to that analysis and decision-making right there on premises. And in that kind of edge compute scenario, HCI can provide a simple drop-in solution for that. Hardware-based HCI vendors often have ruggedized deployments, for example, where the compute and storage and networking are hardened against high temperatures, low temperatures, exposure to the elements, high dust environments, you name it. HCI can be tailored to be an ideal drop-in solution for that kind of IoT-driven edge compute deployment that is becoming increasingly important. And other use cases include image recognition against camera inputs in a maintenance yard for a railroad or a truck line, or image recognition against security camera footage for folks running remote facilities, warehouses, distribution points, things like that. They want to know if somebody is walking around in their lumber yard late at night who is not supposed to be, but they don't want to call the police out if it's just that a coyote somehow got through the fence and is wandering around. Motion sensors alone are not enough.
So, IoT will be driving a much broader deployment of HCI as we go forward.
But for all the benefits -- the potential benefits, at least -- in simplifying the environment and making it easier to manage the hardware resource pools that are at the core of data centers and at the core of edge compute deployments, HCI does come with some challenges.
If you're using that kind of hardware-based HCI, the classic HCI, your choices as to what you can deploy are limited to what the vendor's choices are in the configurations that they offer to you. And it can be tricky getting the balance of storage and compute right without overspending or underprovisioning, initially. And folks tend to err on the side of overspending if their financial resources allow, and then tailor down and back as they gain experience with the platforms and how to size them.
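The balancing problem described above can be illustrated with a short sizing calculation. This Python sketch assumes a single fixed node configuration offered by a vendor; the function name, the node specification and the demand figures are hypothetical. It shows how whichever resource runs out first dictates the node count and leaves the other resource overprovisioned.

```python
import math


def size_cluster(vcpus_required, storage_tb_required,
                 vcpus_per_node, storage_tb_per_node):
    """Estimate how many identical nodes a workload mix needs.

    The node count is driven by whichever resource is exhausted first;
    the other resource ends up partly idle, which is the overspending
    risk of a fixed hardware configuration.
    """
    nodes_for_compute = math.ceil(vcpus_required / vcpus_per_node)
    nodes_for_storage = math.ceil(storage_tb_required / storage_tb_per_node)
    nodes = max(nodes_for_compute, nodes_for_storage)
    return {
        "nodes": nodes,
        "limiting_resource": (
            "compute" if nodes_for_compute >= nodes_for_storage else "storage"
        ),
        "idle_vcpus": nodes * vcpus_per_node - vcpus_required,
        "idle_storage_tb": nodes * storage_tb_per_node - storage_tb_required,
    }


# Illustrative figures only: a storage-heavy workload on a compute-heavy node.
plan = size_cluster(
    vcpus_required=400,
    storage_tb_required=300,
    vcpus_per_node=64,
    storage_tb_per_node=20,
)
print(plan)
```

In this made-up case storage forces 15 nodes, even though compute alone would need only 7, so more than half of the purchased vCPUs sit idle. A real sizing exercise would also account for memory, replication overhead and growth, which this sketch leaves out.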
If you're taking that more software-based approach, using that flexibility that allows it to run on top of just off-the-shelf hardware, whatever you happen to have in the data center or want to buy this year, does come at a cost of its own -- which is that you're not saving yourself any work at the hardware level when it comes to hardware maintenance. You're still having to monitor and patch firmware for security problems or bug fixes, and you're still going to have to do the kinds of, essentially, regression testing of those patched hardware elements against your operating platforms. So, when you patch a problem with the storage infrastructure inside one of your boxes, is your HCI still going to be able to manage it the way it expects to? Usually yes, but you don't know until you test.
And of course, if you try and get around that by buying your own hardware but making it all the same -- keeping it uniform as though you were buying boxes instead of just a software HCI -- that comes with cost as well: either having to upgrade your boxes too frequently in order to meet the needs of your highest demand workloads, or replacing your boxes less frequently than you need to in order to keep them all uniform without upgrading in advance of their lifecycle running out, basically.
When you're putting actual work into an HCI environment, people can and do run into design problems based on the storage performance required by different jobs that they're running. And it can be difficult to meet the needs of high data performance programs without trial and error, basically, some testing and some shifting of jobs from box to box until you've got exactly the right placement to deliver the functionality that you need to deliver. And that presence of hotspots in your infrastructure can lead to design problems -- not just meeting their needs initially, but making sure that if there are problems with the box they're running on, there's sufficient resources for them to fail over to. So, a complicated thing there can arise for folks with some significantly demanding jobs on the data front.
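The failover concern above can be checked with a simple headroom calculation: for each box, would the spare capacity on the remaining boxes absorb its load if it failed? The Python sketch below is a deliberately rough illustration; the node names, the figures (thousands of IOPS) and the function name are hypothetical, and it treats spare capacity as one aggregate pool rather than modeling where individual jobs could actually be placed.

```python
def nodes_without_failover_headroom(nodes):
    """Return names of nodes whose load the rest of the cluster cannot absorb.

    nodes: dict mapping node name -> (capacity, used), in the same unit
           (for example, thousands of storage IOPS).
    """
    at_risk = []
    for failed, (_, failed_used) in nodes.items():
        # Spare capacity on every node except the one assumed to have failed.
        spare_elsewhere = sum(
            capacity - used
            for name, (capacity, used) in nodes.items()
            if name != failed
        )
        if failed_used > spare_elsewhere:
            at_risk.append(failed)
    return at_risk


# Illustrative cluster: one large box hosting a storage hotspot.
cluster = {
    "node-a": (200, 150),
    "node-b": (100, 40),
    "node-c": (100, 40),
}
print(nodes_without_failover_headroom(cluster))
```

Here only node-a is flagged: the two smaller boxes have 120 units of spare capacity between them, which is less than the 150 units node-a is serving.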
No matter how you're approaching it, HCI does tend to add a critical-path vendor and platform in the data center without, for most folks, fully replacing any existing one. Nobody that we've talked to, and few people that we see, have ripped everything else out of their data centers and just put their HCI vendor in place. Instead, they're adding an HCI rack or two or three to their environment and still keeping other equipment and other racks, and having to deal with all the vendors for those as well as now their HCI vendor. And it's important, too, especially for software HCI to recall that it is just a reality in the marketplace that vendors don't want any problem you're having with performance or platform stability to be their problem. And so, again, especially if you're running an HCI that's software-based, you're putting in a platform that is sort of in the middle between things that consume your resources and the pool of resources underneath it. And performance problems can be the cause of a lot of finger pointing and buck passing, as application vendors point at the HCI platform, as the HCI platform vendors point at the hardware that you've put underneath them, and the hardware vendors point at the HCI platform right on top of them, and the HCI vendor points at the application vendor. It can be a real challenge when there are significant performance problems.
Looking at a higher level, we see that HCI pushes even harder against the walls of the data center silos that still, in most organizations, exist around storage, compute and networking. They've been eroding slowly but steadily since virtualization became the norm in data centers back around 2008, 2010. But they certainly haven't gone away. Rolling HCI into their environment will just put further stress on them and, where they still exist in most organizations, offer yet another layer of opportunity for confusion about who's responsible for what, arguments over who should be allowed to do what, and tangling of processes for design, deployment, operation and troubleshooting. So, it needs to be taken into consideration and carefully worked around as HCI is deployed.
And lastly, as a sort of highest-level challenge, conceptual challenge, HCI is a step to a goal -- but that goal is not HCI. That goal is a real private cloud in the data center. And HCI isn't a real private cloud. It's just a step towards that, and it's not even a necessary one. It's not like it distracts you from deploying private cloud, but you can certainly move in the direction of a private cloud without deploying HCI. So, it's important to understand that it's not the endpoint. It's a step towards it, and an optional one.
So, we certainly encourage everybody to explore HCI for their data centers, for their edge compute cases, and to consider all of the different market-driven variations on the HCI theme: the hardware-centric and software-centric HCI that we've been focused on up to now; something called disaggregated HCI, which is just the software-centric approach with an even less prescriptive hardware model underneath it; the idea of composable infrastructure that really brings more mainframe flavor to the HCI platform; and even on-premises cloud appliances. You can get on-premises deployments of what are essentially Amazon AWS compute nodes, or Azure stacks or Google Cloud stacks or Oracle Cloud stacks. And those represent sort of the endpoint of the HCI game, in that they are not only giving you on-premises private cloud, but they are identical to one of your external cloud environments and provide a kind of seamless migration of work that is ideal.
In evaluating HCI, IT needs to carefully weigh the benefits that they'll get from simplified management -- and really, simplified procurement in some ways -- against the various challenges that come with that simplified management and this deployment of a new platform. And where HCI can help you advance your IT strategy, which of course should be tightly tied to overall business or organizational strategy, you should evaluate at least three options. Hands-on experience of the platform is critical to making the right decision as to whether to deploy and what to deploy. And with that, I'll say thank you. This ends the session.