Hyper-converged infrastructure vendors adapt to AI workloads

Hyper-converged infrastructure evolves to accommodate the increasing demands of resource-hungry AI workloads. AI, in turn, promises to improve management of HCI and non-HCI resources.

Artificial intelligence and machine learning are significantly impacting IT infrastructures nowadays -- not only as a result of the increasing number of AI and machine learning workloads, but also because management tools now integrate AI and machine learning to better control mission-critical workloads and the systems that support them.

Given this strong movement toward AI and machine learning, it should come as no surprise that these technologies are now affecting hyper-converged infrastructure (HCI), prompting HCI vendors to develop more robust systems that enterprises can better integrate into their overall IT framework.

Supporting AI workloads

At its most basic, AI carries out the simulation of human intelligence on machines -- most notably, computing systems -- by using rules-based learning and reasoning to analyze data and arrive at approximate or definitive conclusions. AI also includes self-correcting mechanisms to continually refine its analytics as more data becomes available.

The field of AI contains a number of disciplines, including machine learning, which is a software-specific form of AI that enables applications to predict outcomes without requiring explicit programming. At its core, machine learning is a set of intelligent algorithms that perform statistical analysis on data, looking for patterns that can be used to predict outcomes and subsequently take actions.

Workloads specific to AI and machine learning are claiming more compute and storage resources than ever, leaving IT teams to scramble for AI-compatible resources to deliver the necessary application performance. In response, hyper-converged infrastructure vendors are updating their offerings to accommodate AI and machine learning workloads by incorporating hardware and software components that meet the demands for processing large quantities of data, often in real or near-real time.

For example, Dell EMC built its latest VxRail appliances on 14th generation PowerEdge servers that support high-memory CPUs and Nvidia P40 GPUs, in addition to 25 Gbps connectivity and NVMe flash drives. According to Dell, the latest HCI appliances are designed for today's mission-critical workloads and significantly outperform the previous Dell EMC VxRail G Series, providing more processing power and memory, as well as greater IOPS and faster response times, all of which are essential to AI workloads.

Dell isn't alone among hyper-converged infrastructure vendors. Hewlett Packard Enterprise (HPE) has updated its SimpliVity HCI series to support fluctuating, resource-intensive workloads, such as AI and machine learning. For example, SimpliVity systems are now available with HPE's Composable Fabric, a software-defined networking system that's integrated in the HCI stack. Composable Fabric automates routine network management tasks, such as provisioning the network fabric in response to real-time compute and storage events. It can also automatically discover hyper-converged nodes, virtual controllers and virtual machines.

Cisco, meanwhile, is touting its release of Cisco HyperFlex 3.5 as an AI-friendly HCI environment, with support for Nvidia's most advanced data center GPU, Tesla V100. In addition, the Cisco systems can scale using compute-only nodes equipped with GPUs to help better accommodate AI workload requirements. The HyperFlex systems also include the FlexVolume driver to provide persistent volumes for Kubernetes containers, making it easier to deploy, manage and scale containers that support AI applications.

IBM and Nutanix are another example of hyper-converged infrastructure vendors supporting AI and machine learning. They have joined forces to deliver an HCI system for enterprises planning to implement private clouds to support AI and machine learning applications, as well as other mission-critical workloads. The HCI platform builds in Nutanix's AHV virtualization, which is based on the vendor's Acropolis HCI platform and hypervisor.

Bringing AI to HCI

In addition to better supporting AI and machine learning workloads, HCI systems themselves are benefiting from AI technologies. AI helps HCI platforms manage systems and workloads, as well as automate everyday tasks. Furthermore, advancements in AI and machine learning are leading to more intelligent data tools for working across the entire IT infrastructure, a trend known as AI for IT operations (AIOps).

AIOps incorporates machine learning and other AI technologies, as well as big data analytics, to streamline administrative operations and optimize workload management, particularly as it applies to resource utilization. AIOps takes a holistic approach to data center management that spans the entire environment, analyzing data collected from both HCI and non-HCI data points in order to uncover patterns, identify anomalies and predict outcomes.
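
To make "identify anomalies" concrete, here is a minimal sketch of one simple approach: flagging metric samples that sit far from the mean of their series (a z-score test). It is an illustration only, not how any particular AIOps product works. The source names, the latency and CPU readings, and the 2.5 threshold are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(samples, threshold=2.5):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    if len(samples) < 2:
        return []  # not enough data to estimate spread
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []  # a perfectly flat series has no outliers
    return [
        (i, x) for i, x in enumerate(samples)
        if abs(x - mu) / sigma > threshold
    ]


# Hypothetical metrics collected from an HCI node and a non-HCI server.
metrics = {
    "hci-node-1 latency (ms)": [2.1, 2.0, 2.2, 1.9, 2.1, 2.0, 9.5, 2.1, 2.0, 2.2],
    "bare-metal-db cpu (%)": [40, 41, 39, 40, 42, 41, 40, 39],
}

for source, samples in metrics.items():
    print(source, find_anomalies(samples))
```

Because both series pass through the same function, the HCI and non-HCI sources are analyzed identically, which is the point of the holistic approach described above. Production tools would likely use more robust statistics, since a single large outlier inflates the standard deviation it is measured against.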

Ideally, AIOps treats HCI environments just like any other resource, whether virtualized or running on bare metal, making it possible to map workloads to the most appropriate compute, storage and network resources. AIOps also helps address issues that result from siloed IT resources, such as individual HCI systems.
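
As a rough illustration of mapping workloads to the most appropriate resources, the sketch below uses a simple best-fit rule over a mixed pool of HCI and bare-metal hosts. The `Resource` and `Workload` types, the `place` function, and all capacities are hypothetical and do not represent any vendor's scheduler.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str          # e.g. "hci" or "bare-metal"; not used in the decision
    free_cpu: int
    free_mem_gb: int
    has_gpu: bool = False


@dataclass
class Workload:
    name: str
    cpu: int
    mem_gb: int
    needs_gpu: bool = False


def place(workload, resources):
    """Best-fit placement: pick the host with the least leftover capacity."""
    candidates = [
        r for r in resources
        if r.free_cpu >= workload.cpu
        and r.free_mem_gb >= workload.mem_gb
        and (r.has_gpu or not workload.needs_gpu)
    ]
    if not candidates:
        return None
    best = min(
        candidates,
        key=lambda r: (
            r.has_gpu and not workload.needs_gpu,  # keep GPU hosts free if possible
            r.free_cpu - workload.cpu,
            r.free_mem_gb - workload.mem_gb,
        ),
    )
    # Reserve the capacity so later placements see the updated pool.
    best.free_cpu -= workload.cpu
    best.free_mem_gb -= workload.mem_gb
    return best.name


pool = [
    Resource("hci-node-1", "hci", 16, 128),
    Resource("hci-gpu-node", "hci", 32, 256, has_gpu=True),
    Resource("bare-metal-1", "bare-metal", 8, 64),
]
print(place(Workload("model-training", 16, 128, needs_gpu=True), pool))
print(place(Workload("web-frontend", 4, 16), pool))
```

The placement rule never looks at `kind`, so an HCI node competes on the same terms as any other host. That is the behavior the paragraph above describes as ideal.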

[Figure: AIOps tools analyze infrastructure for data patterns to aid with the automation and orchestration of operations.]

An HCI appliance typically includes comprehensive software for managing the system itself, but connecting that system to other systems in a meaningful way is much more difficult, especially with traditional IT tools. AIOps addresses this issue by coordinating collected data from all systems, whether multiple HCI appliances, virtual servers, storage nodes or any number of other systems.

Using predictive analytics, AIOps can help streamline capacity planning and resource allocation, while leading to more accurate and faster insights. In addition, the software can automate routine processes and more quickly detect and respond to service disruptions, system failures, security threats and other issues.
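
One very simple form of predictive analytics for capacity planning is to fit a straight line to recent storage usage and estimate when it will reach capacity. The sketch below does this with ordinary least squares. The `days_until_full` function and the usage figures are hypothetical, and real tools would likely account for seasonality and non-linear growth.

```python
def days_until_full(used_tb, capacity_tb):
    """Estimate days until capacity is reached from daily usage samples.

    Fits a least-squares line to the samples (one per day) and extrapolates.
    Returns None if usage is flat or shrinking.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(used_tb) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, used_tb))
    slope = sxy / sxx                      # growth in TB per day
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None
    day_full = (capacity_tb - intercept) / slope
    # Count from the most recent sample (day n - 1), never below zero.
    return max(0.0, day_full - (n - 1))


# Hypothetical daily usage for a 20 TB cluster, growing 1 TB per day.
print(days_until_full([10, 11, 12, 13, 14], capacity_tb=20))
```

With the sample data above, usage grows by exactly 1 TB per day from 10 TB, so the estimate is six days from the last reading.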

For example, FixStream AIOps can incorporate Nutanix HCI systems into its monitoring and analytics, offering such capabilities as automating resource discovery, maintaining inventories in near-real time, generating reports dynamically and identifying application flows in and out of an HCI environment.

Despite such offerings, AIOps is still a relatively young technology, and its full impact has yet to be felt in the enterprise. Most organizations still rely on traditional tools to manage disparate resources, including HCI environments. Even so, as AI management technologies continue to improve, the impact of AIOps on IT infrastructure should be keenly felt in the world of HCI.

Meanwhile, efforts with less lofty goals than AIOps are also under way, with vendors starting to provide management tools that incorporate AI technologies. For example, some tools use AI to help streamline specific operations, such as optimizing hardware resources, balancing workloads across available storage nodes, moving data from HCI environments to secondary storage or tracking personally identifiable information across HCI clusters to ensure compliance.

The limitless world of AI

As the size and quantity of AI workloads have steadily evolved and increased, so too have the HCI systems that support those workloads, with hyper-converged infrastructure vendors enhancing their products to better accommodate mission-critical applications. In comparison, AIOps and AI-based tooling -- for optimizing and managing infrastructures -- still have a long way to go to catch up.

AI is nonetheless starting to make significant inroads into IT management, promising to better manage both HCI and non-HCI resources. Not only will this make resource and workload administration more effective, it also promises to pull HCI systems into the larger picture, where all resources are treated as components of a unified system across the entire IT infrastructure, thereby helping to deliver workloads more efficiently and securely, while better utilizing the resources at hand.
