Enterprise AIOps quietly gets real
Machine learning algorithms are being used to automate some aspects of enterprise IT operations, but the original goal of advanced self-healing systems is still a long way off.
Buzz in the IT industry around AIOps has died down considerably over the last three years. Amid the waning fanfare, however, real-world use of IT automation based on machine learning algorithms has emerged among enterprises.
AIOps -- or AI for IT Operations -- refers to the use of machine learning algorithms to automate routine IT tasks. That can include sifting through IT monitoring alerts, responding to incidents or handling the so-called "undifferentiated heavy lifting" required to do routine maintenance on infrastructure systems.
The term emerged in the mainstream in 2018, and by 2019, AIOps had become a common industry buzzword, prompting several mergers and acquisitions among IT vendors and, along with them, plenty of speculation about a highly automated, AI-driven future for computing. Some AIOps vendors, such as Dynatrace, publicly embraced the concept of "NoOps," envisioning a world of fully self-healing, self-managing systems that would eliminate the need for human intervention entirely.
Then the COVID-19 pandemic struck. IT spending plans were upended, and digital transformation went from a long-term goal to an immediate necessity. Sweeping futuristic ideas such as "NoOps" no longer commanded the same attention.
However, amid this upheaval, broadening adoption of cloud computing and cloud-native infrastructure brought with it new IT observability tools and a glut of IT monitoring data, which in turn fed AIOps machine learning algorithms and helped them become more effective. The pandemic also tightened IT budgets while cloud migration heightened the complexity of systems, and IT teams turned to automation tools to compensate for staffing shortfalls.
"Some of the newer applications are written in a fundamentally different way, where it's just not enough if you monitor it the way you monitored things before," said Arun Chandrasekaran, an analyst at Gartner. "Another trend, I would argue, is a shift away from [reactive monitoring] to more real-time and predictive [tools]."
As a result, while most enterprise IT shops have yet to get anywhere close to "NoOps," AIOps is slowly becoming an everyday reality.
"The ability to collect more metrics and more machine data is growing, and the ability to process this data at scale is growing, thanks to time series databases and open source parallel processing engines," Chandrasekaran said. "There are clearly some areas where statistical machine learning can be extremely valuable, in a very targeted way."
Accenture AIOps tackles routine tasks amid data quality push
Accenture, a multinational IT professional services and consulting company, is among the enterprises where AIOps has begun to take hold over the last two years. The company has deployed IT service management (ITSM) and IT operations management (ITOM) software from its business partner ServiceNow. The software uses machine learning algorithms to correlate IT monitoring alerts and reduce the number that are surfaced to IT pros, a cardinal use case for AIOps.
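To make the idea of alert correlation concrete, here is a minimal sketch in Python. It is not ServiceNow's algorithm, which the article does not describe; it swaps in a simple rule-based grouping (same host and check, arriving within a time window) as a stand-in for the machine learning step. The `Alert` fields, the `correlate` function and the 300-second window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Alert:
    """A hypothetical monitoring alert; the field names are illustrative."""
    host: str
    check: str
    timestamp: float  # seconds since some reference point


def correlate(alerts: List[Alert], window: float = 300.0) -> List[List[Alert]]:
    """Group alerts that share a host and check and arrive close together.

    An alert joins the open group for its (host, check) key if it arrives
    within `window` seconds of that group's latest alert; otherwise it
    starts a new group. Each group can then be surfaced as one item.
    """
    groups: List[List[Alert]] = []
    open_group: Dict[Tuple[str, str], List[Alert]] = {}
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.host, alert.check)
        group = open_group.get(key)
        if group is not None and alert.timestamp - group[-1].timestamp <= window:
            group.append(alert)
        else:
            group = [alert]
            groups.append(group)
            open_group[key] = group
    return groups


alerts = [
    Alert("web1", "disk", 0.0),
    Alert("db1", "cpu", 30.0),
    Alert("web1", "disk", 60.0),
    Alert("web1", "disk", 120.0),
    Alert("web1", "disk", 1000.0),
]
print([len(g) for g in correlate(alerts)])  # prints [3, 1, 1]
```

In this toy run, five raw alerts collapse into three items for a human to review. A production system would likely correlate on far richer signals, such as topology and historical patterns.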
Within Accenture, these tools also automate some routine remediation tasks, which for some IT ops teams has freed up time that used to be spent responding to minor incidents.
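As a rough illustration of what automated remediation with alert suppression can look like, the Python sketch below uses a nearly full disk as the triggering condition. It is an assumption-laden outline rather than Accenture's or ServiceNow's implementation: the function name, the callables it accepts and the 90% threshold are all hypothetical.

```python
from typing import Callable


def remediate_disk_alert(
    get_usage_pct: Callable[[], float],
    run_cleanup: Callable[[], None],
    threshold_pct: float = 90.0,
) -> str:
    """Run a pre-approved cleanup, then decide what to do with the alert.

    Returns "suppress" if disk usage is below the threshold after cleanup,
    or "escalate" if the problem persists and a human should look at it.
    """
    run_cleanup()
    if get_usage_pct() < threshold_pct:
        return "suppress"
    return "escalate"


# Simulated disk so the example runs without touching a real filesystem.
disk = {"used_pct": 97.0}


def fake_cleanup() -> None:
    """Stand-in for a real cleanup script; pretends to free space."""
    disk["used_pct"] = 60.0


print(remediate_disk_alert(lambda: disk["used_pct"], fake_cleanup))  # prints "suppress"
```

Passing the usage check and the cleanup step in as callables keeps the decision logic separate from any particular orchestration platform, which is one way such a flow could be tested before it is trusted with production servers.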
"Storage cleanup has been a big one, whenever rogue logging tools have just started filling up disks," said Bryan Locke, global IT operations management lead at Accenture. "A lot of those have involved disk cleanup scripts pre-configured by our operating system standards team -- our orchestration platform can run those on any of our servers or environments that we manage, and if the problem is mitigated, suppress incident alerts."
This is a step toward the goal of widely using AIOps to run self-healing systems, but that's still something Accenture is working toward, Locke said. Much of the work that has already gone into the AIOps system has been groundwork to ensure data quality, such as migrating data from six previously separate ITSM tools and implementing ServiceNow's Common Service Data Model (CSDM).
CSDM is a standardized data model the vendor first introduced in 2017 to support the Configuration Management Database at the core of its Now Platform. The model also standardizes data formatting across all ServiceNow products.
Data quality controls have also matured within ServiceNow's ITOM product in recent years, which, along with conversion to CSDM, will help Accenture ensure that AIOps algorithms are being fed reliable data.
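A completeness check of the general kind such data quality controls perform might be sketched as follows in Python. This is not ServiceNow's rule engine; the required field names and the `completeness` function are hypothetical, chosen only to show the idea of scoring how many records have every mandatory attribute filled in.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical mandatory attributes for a configuration record.
REQUIRED_FIELDS = ("name", "owner", "environment")


def completeness(
    records: List[Dict[str, Optional[str]]],
    required: Sequence[str] = REQUIRED_FIELDS,
) -> float:
    """Return the fraction of records with every required field populated.

    A field counts as populated if it is present and is neither None nor
    an empty string. An empty record list is treated as fully complete.
    """
    if not records:
        return 1.0
    complete = sum(
        1
        for record in records
        if all(record.get(field) not in (None, "") for field in required)
    )
    return complete / len(records)


records = [
    {"name": "web1", "owner": "ops", "environment": "prod"},
    {"name": "db1", "owner": "", "environment": "prod"},
    {"name": "cache1", "owner": "ops", "environment": "dev"},
    {"name": "app1", "owner": "dev", "environment": "test"},
]
print(completeness(records))  # prints 0.75
```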
"ServiceNow has been gradually adding more and more, I'm going to call them data patrol insights, regarding how compliant [with policy] and how complete data sets are," Locke said. "They have a series of out-of-the-box rules in the platform that we've been using quite a lot."
Locke said that once Accenture has a standardized set of ITSM data shared among all of its lines of business in the Now platform, he would like to proactively remediate more IT incidents. He added that he'd also like to automate more routine IT tasks, including DevOps deployments.
"Where we want to go is getting artifacts released within Azure DevOps auto-approved, instead of manual approvals or semi-manual approvals," Locke said. "But that's a gradual shift."
Accenture isn't alone in undertaking a rather lengthy journey toward broad AIOps-driven auto-remediation; gradual describes the general state of AIOps growth in enterprise IT, according to a 2021 Gartner report.
"Although AIOps technology has existed for a number of years, successful deployments require time and effort, including a structured roadmap by the end user," according to the report, Market Guide for AIOps Platforms, published in April 2021. "Implementations typically run into a number of problems, including data ingestion, providing contextually relevant analysis and long time to value."
Still, however gradual the progress, Gartner expects AIOps growth to remain steady at a compound annual growth rate of 15% until 2025.
"There is no future of IT operations that does not include AIOps," the report states. "It is simply impossible for humans to make sense of thousands of events per second being generated by their IT systems."
Atlassian acquisition boosts AI, DevOps tie-ins
AIOps has also begun to play a more prominent role in DevOps toolchains, especially with the growing popularity of soup-to-nuts DevOps platforms sourced from major IT vendors. Among such vendors, Atlassian has expanded the AI-based features for its Jira Service Management ITSM tool over the last three years to include predictive issue assignment and triage, AI-driven IT automation and personalized search results for individual users. This month it added to that mix by acquiring Percept.AI, which automates Tier 1 service desk tasks.
Tier 1 incident resolution is a separate area of IT management from AIOps, but this move indicates deepening dedication to AI automation throughout the IT stack, said Forrester Research analyst Will McKeon-White.
"AIOps is very much focused on signal-driven resolution and Tier 1 is usually more human-driven," McKeon-White said. "AIOps and automated resolution have a bit of an odd relationship, [but] it's allowing more people to commit to those directions."
Atlassian's acquisition also reflects that AI-driven automation is becoming a much more vendor-dominated market, as enterprises often struggle with do-it-yourself approaches, Gartner's Chandrasekaran said.
"Success with AIOps depends as much on having the right use case as it does on having the proper data and implementation," he said. "This is one of the reasons why DIY efforts have been less successful and there has been a move to consume these capabilities from commercial vendors."
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached at [email protected] or on Twitter @PariseauTT.