Server lifecycle approach lowers risk, raises app performance

Nemertes Research CTO John Burke warns that a server's lifecycle may be shorter than you think and offers a four-stage approach to maintain on-premises servers at peak efficiency.

At no time has there been greater pressure for data-driven enterprises to streamline processes, sustain supply chains, monetize vast amounts of data, improve customer and employee engagements, and drive business outcomes. To gain a competitive edge, huge investments in advanced technologies have been among the highest priorities for businesses.

But there comes a time when companies digitally transforming at unprecedented speeds need to pause, evaluate their business models, validate their ROIs, and intelligently assess which technology equipment needs to be refreshed, replaced, recycled or abandoned. Racks of aging servers housed in physical data centers are no exception.

Although the trend is toward services in the cloud, there remains a significant percentage of servers on premises. "Those servers are critically important to the enterprise," said John Burke, CTO at advisory Nemertes Research. "And that speaks to the fact that that 40% or so of work remaining in the data centers is often mission-critical, or is often so sensitive that the enterprise still feels like it needs to be kept on premises and within its four walls. … [R]unning that kind of a workload in a cloud environment is typically more expensive and sometimes several times more expensive than running it in a well-managed data center on premises."

To satisfy increasingly heavy workload demands, IT leaders are constantly faced with critical decisions about their data center infrastructure's functionality, hardware and server lifecycle. "The idea of having to provide proper resources for these workloads in an evolving data center environment is still, if not a conundrum, at least an ongoing challenge for IT staff managing the data centers," Burke explained.

In this video, Burke provides a four-stage server lifecycle approach to manage workloads, improve application performance and keep those on-premises servers working efficiently with the lowest possible risk to business operations.

Transcript

John Burke: Hi, my name is John Burke, CTO with Nemertes, and I'm here to talk to you about server lifecycle management today. First, a word about Nemertes for those of you who are not familiar with us. Nemertes is a research and strategic consulting firm that is focused on finding the business value that companies realize by deploying emerging technologies. We have several focus areas, including the internet of things, cybersecurity and risk management, next generation networking, and cloud and automation. I've been with Nemertes since 2005, after spending about 20 years doing IT in enterprises, and my focus then, and for the first few years I was at Nemertes, was very much on data center technologies. And I have continued to follow data center technologies as they have morphed into coverage of the cloud.

And that transition has been the defining one of the past decade, with an IT organization and infrastructure that was very much built around the idea of providing services out of the data center having to adapt as it gradually embraced providing services out of different kinds of cloud environments. We've proceeded to the point now where the majority of organizations are using all kinds of clouds to deliver services to their end users, whether it's software as a service, which is essentially ubiquitous, with nearly 100% of companies using it; or IaaS and PaaS, with almost 80% now providing production services out of those kinds of environments; or any of the various stripes of PaaS: application PaaS, integration PaaS, communications PaaS. The point is, IT is now using a very broad spectrum of sourcing options to deliver services to its end users. And that balance has very much shifted towards the cloud.

But not all of the work is out in those external cloud environments. In fact, at this point, approximately 40% of the enterprise workload is still running in on-premises data centers that are managed by the enterprise, resourced by the enterprise and operated by the enterprise. And another 4% or so of enterprise workloads are now running on enterprise premises, but outside the actual data centers. And you know, this is either traditional branch office computing, file servers and the like, or this is new-wave edge computing, where the focus is on enterprise applications that require extremely low latency or generate but don't need to provide to the rest of the organization enormous volumes of data, things that need to stay on premises where they're created. So, in all, about 44% of the work of IT is still based in resources run on the premises, whether in the data center or in branch offices. So, IT is still responsible for sourcing and operating and managing a large number of servers in service of running all these enterprise workloads.

So those servers are critically important to the enterprise. And this is, I think, fairly well illustrated by the fact that one of the newest waves of technology being deployed in enterprises right now, SD-WAN, is often being driven by a desire to provide improved performance across the network for applications being delivered from data centers. And that speaks to the fact that that 40% or so of work remaining in the data centers is often mission-critical, or is often so sensitive that the enterprise still feels like it needs to be kept on premises and within its four walls, whether to satisfy legal requirements or some other kinds of nontechnical but risk-related requirements around intellectual property perhaps.

Of course, a substantial number of these workloads -- mission-critical, sensitive, whatever -- are also what I like to call 'cloud-hostile.' You can't see me making the air quotes. But cloud-hostile workloads, things that were architected for an age of maybe dedicated hardware, or virtual machines that had access all the time to the full resources of the hardware at need, rarely sharing and always able to burst up to the full capacity. And so, not particularly cloud-friendly in that sense, not able to take advantage of horizontal scaling, not able to shut down and turn themselves back on when a new request arrives. Instead, systems that are expected to be up 24/7/365. And to have full access to the full resources they need for peak load at any given moment -- cloud-hostile workloads. Still a very important part of how business gets done in many businesses and a significant piece of that 40% that remains on premises. Because running that kind of a workload in a cloud environment is typically more expensive and sometimes several times more expensive than running it in a well-managed data center on premises.

And of course, these applications are typically not static. They are themselves still evolving and expanding. So they are themselves still developing a need for more CPU power, developing a need for more memory, for more storage. The idea of having to provide proper resources for these workloads in an evolving data center environment is still, if not a conundrum, at least an ongoing challenge for IT staff managing the data centers.

So we're strong advocates at Nemertes for taking a lifecycle approach to many things in the organization, whether it's data and managing it in its lifecycle, from acquisition to archiving or disposal, or servers and managing them through their lifecycle, which is typically understood to have four stages -- procurement being the first, obviously, when you get the server sent to you, basically. It's important to note that this is not an activity that is free to the enterprise. This is an activity that has some associated costs, including working out what the specs are for the servers you're going to order, selecting vendors, which I hope is something most organizations still do on an ongoing basis, that they always compare offers from different vendors, negotiating the deal and trying to get the best discount you can from the account team that you work with. And of course contract management for the purchases and billing management that go with that. All of these things have overhead. Some of it's all in IT, some of it's not in IT at all, all of it's something you need to keep an eye on as you set up a lifecycle and regularly execute it and re-execute it to keep your infrastructure up to date.

Deployment is the second piece and obviously involves a lot of its own kind of overhead for planning and executing the inventorying of, the installation of, the networking of all of these servers as you bring them on board and the integration of those new resources into your resource pools. So, into your VMware infrastructure, or whatever other virtualization technology you're using to host your virtual machines in your data center or to host containers in your data center if you're running containers now.

After deployment comes operation and maintenance. And this is where, obviously, servers spend most of their life and IT spends most of its effort on servers. How much that costs is dependent to some extent on hardware quality and the effectiveness of your management software. And, of course, on time, because we see that with servers, operating and maintenance costs do tend to go up over time across the supported lifetime of the server.

And lastly, the last phase in the lifecycle of any given server is its retirement. And whether that is a matter of you taking that equipment out of your racks and reselling it to somebody else or recycling it -- in essence, selling it to the recycler -- your goal with retirement is that you spend as little as possible of your money taking something out of service, or even try through resale or recycling to offset the staff costs of retiring the server by making a little back on that resale.

Now why we like to see folks manage servers on a lifecycle basis in their data centers is focused largely on reducing risk and improving outcomes. Reducing risk is really critical. And it's something folks often think of in their security thinking but don't think of as often on something like server procurement. But managing your servers through a lifecycle process helps you reduce overall risk in several ways -- the most important ones being that hardware is less prone to failure within its warranty window, and the older it gets, the more likely it is that components will fail. And component failures can lead to downtime or impaired performance. So there's an operational risk there associated with running your hardware as it gets older and older.

Similarly, within its maintained service lifetime, from the vendor, you can expect to have spares available, you can expect to have firmware that's actively maintained and updated for security or for bug fixes. And you can expect the hardware to continue to be supported by current versions of hypervisors and current versions of operating systems or container operating systems. All of that also reduces both operational and security risks.
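The support-window point above can be turned into a simple inventory check. The following is a minimal sketch, not anything Burke describes: it assumes you keep a record of each server's vendor end-of-support date, and the function name, server names and dates are hypothetical.

```python
from datetime import date
from typing import List, Mapping


def past_vendor_support(end_of_support: Mapping[str, date], today: date) -> List[str]:
    """Return the names of servers whose vendor support has already ended.

    end_of_support maps a server name to the (assumed) date on which the
    manufacturer stops providing spares, firmware updates and support.
    """
    return sorted(name for name, end in end_of_support.items() if end < today)


# Hypothetical inventory for illustration only.
inventory = {
    "rack1-a": date(2023, 1, 1),
    "rack1-b": date(2030, 1, 1),
    "rack2-a": date(2024, 6, 30),
}
overdue = past_vendor_support(inventory, date(2025, 1, 1))
```

Servers that show up in such a list are the ones carrying the added operational and security risk discussed here.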

So, one big motivation for being on a lifecycle basis is to make sure that you're keeping your servers in service only as long as they are themselves being maintained by their manufacturers. On the other side of the spectrum to improve outcomes, you're looking to improve performance for applications -- especially if you're bringing new applications in-house or improving upon or upgrading the applications you already have in-house. It's likely that you're going to be wanting to give additional resources to those improved or expanded or upgraded applications and that you'll be able to do that more easily with newer hardware in the mix. You can bring in new hardware, put the hottest workloads on that and cycle other workloads on to the hardware that you've just displaced and, of course, retire the oldest stuff.
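The "hottest workloads on the newest hardware" idea can be sketched as a small matching routine. This is an illustrative simplification under stated assumptions: one workload per server, a single relative demand score per workload, and hypothetical class and function names. Real placement is normally handled by the virtualization or container platform.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Server:
    name: str
    age_years: float  # time since deployment


@dataclass
class Workload:
    name: str
    demand: float  # relative resource demand; higher means "hotter"


def place_hottest_on_newest(
    servers: List[Server], workloads: List[Workload]
) -> Tuple[Dict[str, str], List[str]]:
    """Pair the most demanding workloads with the newest servers.

    Returns (placement, spare). placement maps workload name to server name.
    spare lists servers left without a workload, oldest first, as the first
    candidates for retirement. Workload names are assumed to be unique.
    """
    newest_first = sorted(servers, key=lambda s: s.age_years)
    hottest_first = sorted(workloads, key=lambda w: w.demand, reverse=True)
    used = min(len(newest_first), len(hottest_first))
    placement = {
        w.name: s.name for w, s in zip(hottest_first[:used], newest_first[:used])
    }
    spare = [s.name for s in reversed(newest_first[used:])]
    return placement, spare
```

The oldest-first ordering of the spare list mirrors the advice to retire the oldest equipment first once newer hardware has absorbed the demanding workloads.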

You're also going to be looking at reducing the overhead of managing your server plant as you keep it newer. Newer usually means not just faster equipment, but also more efficient equipment, stuff that draws less power, so it generates less heat. Less cooling is required in your environment, reducing your operating costs. It also typically takes up less space. So, you can get by with fewer racks or fewer slots in a rack. You can consolidate workloads onto a smaller number of servers and get by with fewer servers. That typically has multiple payoffs, including the fact that it generally takes you less staff time to manage a smaller number of servers, in addition to newer servers typically being more manageable. And you can sometimes reduce your licensing costs, if you can reduce server counts, for example, for hypervisors.

Lastly, it's good always to keep your business investments focused on the front end of the technology funnel and bringing in the new and the more capable, and less of that investment going to nursing along hardware that's approaching its end of life. So, getting on to a lifecycle can help you manage all of those things and reduce those kinds of risks and see those kinds of improvements in your operating costs and in your environment.

Now it's been typical in the past for servers to cycle through a five-to-seven-year lifecycle as the norm. For folks who are really focused on minimizing their capital expenses and stretching them across as much time as possible, you'd sometimes see that lifecycle stretch to eight years or even beyond -- basically people running servers as long as they would run, or at least as long as they were able to still get parts for them and still run software on them, noting that it didn't always have to be current software, it didn't always have to be supported software. So, there's always a negotiation there with the security folks on how old is too old, what are acceptable levels of not having patching, not having firmware fixes and so forth.

For folks who are in those kinds of situations with the longer lifecycles or this capital focus still lingering, leasing has commonly been a tactic to jump-start a lifecycle approach and get to and stick to, at least for some time, a three- or four-year lifecycle. And we're seeing that three-to-five-year lifecycles are increasingly common. That is in part due to this shift towards operationalizing costs and IT getting away from capital expenditures. So into leases. And leases for three and four years are common; five is uncommon. Also, due to the fact that people are seeing this wave of changes in the types of workloads that they're running and in how those workloads consume resources -- those things can make older hardware less efficient in running those new kinds of workloads and can even hurt performance. New workloads, new processors optimized for different kinds of workloads, can really make a difference to application performance.

And we also see that, where in the past we often saw folks have a year of big buying and cycling lots of hardware out and then a couple of years of not buying, we're now seeing most folks getting onto a 'let's get some hardware in every year, and let's roll some out every year' basis, always with the newest stuff getting to the hottest jobs, hottest in the sense of highest demand, and the oldest stuff flowing out of the server pool as quickly as possible.
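The arithmetic behind an every-year refresh is straightforward: in a steady state, a fleet on an N-year lifecycle replaces roughly one Nth of its servers each year. The sketch below assumes a constant fleet size and uses hypothetical function names and figures; it is an illustration, not a sizing method from the talk.

```python
from typing import List, Mapping


def annual_refresh_count(fleet_size: int, lifecycle_years: int) -> int:
    """Servers to buy (and retire) each year to hold a steady lifecycle.

    Rounds up so that no server outlives the target lifecycle.
    """
    if fleet_size < 0 or lifecycle_years <= 0:
        raise ValueError("fleet_size must be >= 0 and lifecycle_years > 0")
    return -(-fleet_size // lifecycle_years)  # integer ceiling division


def oldest_servers(age_years_by_server: Mapping[str, float], count: int) -> List[str]:
    """Pick the `count` oldest servers as this year's retirement candidates."""
    ranked = sorted(
        age_years_by_server, key=lambda name: age_years_by_server[name], reverse=True
    )
    return ranked[:count]
```

For a hypothetical fleet of 120 servers on a four-year lifecycle, `annual_refresh_count(120, 4)` gives 30 servers in and 30 out each year, instead of one large purchase followed by several idle years.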

A goal in any kind of a lifecycle, though, is to also minimize the number of times you have to touch the hardware. Touching the hardware is more expensive than touching software, touching configurations. And so setting your data center up, your policies and processes up to support a low-touch to no-touch lifecycle, once it's been racked, you shouldn't have to go back and touch it until you're un-racking it to send it out the door. That can be really strong support for and actually a good outcome from adopting a lifecycle approach to server management.

So, in sum, and I'll offer as takeaways from this, organizations should adopt and stick to a server lifecycle approach to managing their server plant. Exceptions to server lifecycles tend to come with additional cost, especially in terms of maintenance, but possibly also in terms of added security risk, added operational risk, and so should be avoided when possible. And if you're needing to jump-start that mindset of getting onto a lifecycle, consider leasing, at least through one full cycle as a way to get the company used to the idea of rolling them in and rolling them out and not keeping them forever and not keeping them until they break.

We do suggest buying some every year to make sure that you have adequate resources for new kinds of workloads, workloads that are newly demanding, and that you always have an option to get your oldest and least well-performing boxes out of the racks and into the recycler's bin on a regular basis.

As I was noting in suggesting that you need to get to a low-touch, no-touch main part of the lifecycle for servers, make sure that the cabling in your data center isn't the destiny of the servers that get plugged in. Structure the cabling and especially your data and its separate storage networks, logical networks, to make sure that you can change what a machine is doing, which workloads it's running, which data it has access to purely in configuration, purely in software, without having to go move the box from rack to rack or even unplug and re-plug connectors from the machine or from some patch bay somewhere. Cabling should not be destiny, especially in an efficient, modern, virtualized data center.

For that 40% of the work that's still in the data center, that's the kind of data center you need to have if you're going to operate efficiently and at the lowest possible risk. And with that, I'll say thank you very much and goodbye.
