
How AI in the NOC will transform network operations

AI can improve network operations through automation and troubleshooting. Experts at ONUG emphasized phased adoption and methods that enhance NOC monitoring and support.

Since the explosion of AI, experts have speculated about how it will influence network operations. Most predict AI will be a boon to networking, enabling network engineers to automate routine tasks, troubleshoot issues, gain insights into operations, and improve overall productivity and performance. Skeptics, on the other hand, claim AI will render network engineers redundant and replace them in the workforce.

With so many contrasting opinions, one might wonder how extensive a role AI will play in network operations. Members of the AI-Driven NOC/SOC Automation Project at ONUG's AI Networking Summit Fall 2024 conference gathered at a panel to discuss the project's recent survey that polled network and security professionals on AI adoption.

AI can currently aid network administrators with various tasks in the network operations center (NOC). But, because it hasn't fully developed, it has yet to reach its potential. While the capabilities it offers now are limited or underdeveloped, pretrained large language models (LLMs) can absorb large amounts of data to understand business requirements and support network administrator use cases.

Follow a phased approach

Michael Haugh, vice president of product marketing at Gluware, said that, because of the complexities with AI and generative AI (GenAI) adoption, he suggests organizations follow a phased approach when they implement the technology:

  1. Start with a pretrained LLM. Begin with an existing tool, such as ChatGPT, and figure out how to integrate it with additional products and services.
  2. Enhance the AI model with retrieval-augmented generation. RAG is an AI framework that supplies a model with additional data and information at query time to improve the quality of its responses. Apply RAG to ground the AI model in specific use cases for the network.
  3. Fine-tune a pretrained LLM or build a custom model. Fine-tune AI models with documentation, or build a custom tool.

The final step, Haugh said, is the most complex, time-consuming and costly part of the process, but it enables network administrators to create an AI tool trained on the intricacies of a network. This tool can provide more specific info to users during troubleshooting.
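
The RAG step in the list above can be illustrated with a minimal sketch. It is not the panel's implementation: the document snippets, device names and function names are hypothetical, and simple word overlap stands in for the vector search a production system would more likely use. The sketch only shows the core idea of retrieving relevant documentation and placing it in the prompt before the question reaches a pretrained LLM.

```python
import re
from collections import Counter

# Hypothetical documentation snippets; a real deployment would load
# runbooks, FAQs and configuration guides instead.
DOCS = {
    "lb-guide": "Load balancers lb-east and lb-west distribute HTTPS traffic across the web tier.",
    "vlan-faq": "VLAN 20 carries voice traffic and VLAN 30 carries guest wireless traffic.",
    "bgp-runbook": "If a BGP session flaps, check the peer interface and review recent configuration changes.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, docs, k=1):
    """Return up to k document IDs ranked by word overlap with the question.

    Word overlap is an assumption made for brevity; it stands in for
    embedding-based similarity search.
    """
    question_words = Counter(tokenize(question))
    scores = {}
    for doc_id, text in docs.items():
        doc_words = Counter(tokenize(text))
        scores[doc_id] = sum(min(n, doc_words[w]) for w, n in question_words.items())
    ranked = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return [doc_id for doc_id in ranked[:k] if scores[doc_id] > 0]


def build_prompt(question, docs, k=1):
    """Prepend the retrieved snippets to the question as context for an LLM."""
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Which load balancers handle HTTPS traffic?", DOCS))
```

The resulting prompt would then be sent to whichever pretrained LLM the organization chose in step 1. Because the model's weights never change, this step is cheaper to experiment with than fine-tuning.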

Haugh recommended that, when organizations follow this phased approach, they focus on each step and develop it until it becomes an AI copilot -- a virtual assistant that helps users accomplish tasks more efficiently. Network administrators should develop this model until it reaches an operational state, Haugh said. When network administrators verify the AI is fully functional at each stage of the process, they can be confident the model will support the organization's use cases.

Pradeep Kathail, CTO of enterprise networking at Cisco, agreed.

"Using RAG is a good way for you to experiment with LLMs and experiment with AI," Kathail said. "You can get a lot of output and solve a lot of use cases to begin with, and then you can start realizing which use cases are going to give you a real ROI."

How enterprises use AI in the NOC

With so many promised use cases, AI hype might overshadow actual performance. But, when administrators evaluate AI's use cases, they can understand the practical applications of AI in the NOC. During the panel discussion, members of the AI-Driven NOC/SOC Automation Project team identified the following AI capabilities:

Identification of network issues

AI chatbots are a top use case of NOC automation, Haugh said. Approximately 13% of survey respondents said AI chatbots were a use case that justified investment in GenAI. A typical organization has dozens of IT professionals who must understand the multiple devices that are part of the network. This introduces complexity in network management. A chatbot enables organizations to integrate documentation into a single tool that network professionals can use to access information quickly.

Organizations can train chatbots on information about the network, Haugh said. For example, if an administrator asks a chatbot about the network's load balancers -- a topic on which the AI has been trained -- it can provide an answer to the query.

"When you've supplemented [an LLM] with your data, information, frequently asked questions and configuration guides, it's going to leverage that information first and give you an accurate response," Haugh said.

AI network monitoring

NOC professionals can also train AI models on syslog information. Network devices generate syslog messages -- based on rules written by network administrators to configure and manage notifications -- which monitor and alert teams on issues. But, if a device has an unprecedented error, the message doesn't appear in the syslog because the error isn't in the rule base.

Network administrators can train AI to understand the syslog, however, said Parantap Lahiri, vice president of network and data center engineering at eBay. An AI tool that has knowledge of syslog errors can identify when a new one occurs and help administrators troubleshoot issues in the network. Lahiri said he recommends organizations make the notification an actionable alert that administrators can review.
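
One simple way to approximate this idea, offered as a sketch rather than as eBay's method, is to reduce each syslog message to a template by masking variable fields, then flag any incoming message whose template was rarely or never seen in a baseline period. The sample messages, function names and `max_seen` threshold below are hypothetical, and a trained model would likely generalize better than exact template matching.

```python
import re
from collections import Counter

# Hypothetical sample data; real input would come from a syslog collector.
BASELINE = [
    "Interface Gi0/1 changed state to up",
    "Interface Gi0/2 changed state to up",
    "Login from 10.0.0.5 succeeded",
]
INCOMING = [
    "Interface Gi0/7 changed state to up",
    "Login from 10.0.0.9 succeeded",
    "Fan tray 2 speed below threshold",
]


def to_template(message):
    """Mask IPv4 addresses and standalone numbers so similar messages match."""
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", message)
    message = re.sub(r"\b\d+\b", "<NUM>", message)
    return message


def find_uncommon(baseline, incoming, max_seen=0):
    """Return incoming messages whose template appears at most max_seen times
    in the baseline. Severity is ignored on purpose, so an uncommon warning or
    info message is flagged just like a critical one.
    """
    counts = Counter(to_template(m) for m in baseline)
    return [m for m in incoming if counts[to_template(m)] <= max_seen]


if __name__ == "__main__":
    for message in find_uncommon(BASELINE, INCOMING):
        print("Review:", message)
```

With the sample data, only the fan tray message is flagged, because the other two messages match templates already present in the baseline. Each flagged message could then be routed to administrators as an actionable alert.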

"It doesn't even have to be a critical syslog," he said. "It could be a warning or info, but it's uncommon because people don't see it as much. So, [AI can] help us to identify issues before they become big."

Faster incident response

Syslog reports are also an important part of incident response, which pretrained AI tools can help organizations speed up. When an issue occurs in the network, administrators work to fix the problem, but this often highlights a gap in skill sets, Kathail said.

When network users request assistance with an issue in the network, it falls on the experienced administrators to fix it for them. This takes away time they could spend on more critical work, such as network architecture, design or the implementation of new services, Kathail said. But, if network administrators use AI to detect issues in the network, they can fix problems more quickly and dedicate more time to those activities.

"If you design your virtual assistant or chatbot the right way and provide a little bit of navigation through the chatbot into your product, a lot of Level Zero, Level One support becomes easier," he said. "The skill gap becomes much smaller for people to adopt [AI]."

Xiaobo Long, head of backbone network services at Citi, said an AI copilot is especially useful to help network administrators accomplish their tasks.

"A lot of our employees don't have extensive programming skills, but they have some skills, and they want to do a lot of development work," she said. "Based on my team's feedback, [a copilot] really improves productivity."

Future automation in the NOC

The use of AI in the NOC helps network professionals streamline operations, improve productivity and reduce the time it takes to mitigate a problem that occurs in the network. However, because AI has yet to fully mature, it hasn't quite reached its potential. AI models, for example, aren't fully autonomous and serve more as copilots.

In the future, when AI fully matures, Haugh said he expects AI models to become automated enough that humans need to get involved in the process only when necessary. Lahiri said eBay plans to develop AI that improves how applications interact with infrastructure, which should aid troubleshooting.

Whatever the use case might be, it will take time before AI fully develops. When it does, AI will be a supplemental tool to aid network professionals in the NOC, rather than replace them entirely.

Deanna Darah is site editor for TechTarget's Networking site. She began editing and writing at TechTarget after graduating from the University of Massachusetts Lowell in 2021.
