Small language models are taking the spotlight

An industry analyst shares his perspective on why larger models aren't always better for enterprises.

Large language models demonstrate exceptional capabilities in text generation, language translation and creative writing. But their large size and heavy data processing requirements make them costly and complex to deploy, particularly in enterprise settings.

Small language models (SLMs) represent a growing trend in AI development. They offer a viable alternative that balances performance, cost-effectiveness and customization for enterprise requirements.

As the name suggests, SLMs are reduced-scale LLMs trained on specialized data sets to perform distinct tasks efficiently. This focused approach pays off with enterprise-specific information: Whereas LLMs are trained on extensive public data sources, SLMs can be adapted to an organization's proprietary knowledge base to deliver precise information for essential business functions.

This specialization proves vital from a content perspective. Consider an SLM-based customer query response system that uses a business's product manuals and FAQ resources as its training data. The SLM can be customized to understand product subtleties, enabling the system to deliver precise, context-specific answers that exceed the performance of generic LLMs. The result is better customer satisfaction, lower support expenses and uniform brand interactions. SLMs can also assist in producing internal reports, summarizing meeting notes and generating personalized training materials tailored to an organization's language and context.

SLMs also provide compelling financial advantages. LLMs require extensive computational power and expert knowledge to develop and deploy, at considerable cost. These models typically run on expensive GPUs, mostly from Nvidia.

The reduced size of SLMs means they use fewer processing resources, which translates to lower development and deployment costs. SLMs can be deployed on GPUs at reduced costs or, in many cases, even run on much cheaper CPUs from companies like AMD and Intel. This can make advanced AI capabilities affordable to organizations of all sizes. The specialized design of SLMs also allows faster inference, leading to quicker responses and a better UX.

Tools like retrieval-augmented generation (RAG) and vector databases are useful for grounding SLMs in enterprise data. RAG enables an SLM to retrieve and incorporate external data sources at query time, such as a company's knowledge base or internal documents. This keeps the SLM's responses factual and current, reducing hallucinations and boosting response accuracy.
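The RAG pattern described above can be sketched in a few lines. This is a minimal, illustrative example: the keyword-overlap scoring is a toy stand-in for a real retriever, and the document snippets, function names and prompt format are all made up for demonstration.

```python
# Minimal sketch of the RAG pattern: retrieve relevant enterprise
# documents, then prepend them to the model prompt. The keyword-overlap
# scoring and the snippets below are illustrative stand-ins for a real
# retriever and a real knowledge base.

KNOWLEDGE_BASE = [
    "The X100 router supports firmware updates over the admin portal.",
    "Refunds are processed within 14 business days of approval.",
    "The X100 router's default admin password must be changed on first login.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many query words they share (toy retriever)."""
    query_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(query_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How do I update the X100 router firmware?", KNOWLEDGE_BASE)
print(prompt)
```

In a production system, the retrieval step would query a vector database and the assembled prompt would be passed to the SLM, but the overall flow is the same: retrieve, then generate.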

Vector databases can also enable efficient storage and retrieval of contextual data points. These systems convert information into vector format, enabling SLMs to rapidly identify relevant data, regardless of differences in how queries are phrased. Combined with RAG and vector databases, SLMs establish a system that lets intelligent applications access and process enterprise data with high accuracy and efficiency.
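The similarity search at the heart of a vector database can be illustrated with plain Python. The 3-dimensional embeddings below are invented for the example; a real deployment would use an embedding model and a dedicated vector store, but the nearest-neighbor idea is the same.

```python
# Toy illustration of vector search: documents are stored as embedding
# vectors, and a query vector is matched by cosine similarity. The
# 3-dimensional vectors are made up; a real system would use an
# embedding model and a vector database.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# doc id -> (embedding, text); embeddings are illustrative only
VECTOR_STORE = {
    "doc1": ([0.9, 0.1, 0.0], "Password reset steps for the admin portal"),
    "doc2": ([0.1, 0.8, 0.2], "Quarterly revenue report template"),
    "doc3": ([0.8, 0.2, 0.1], "Admin portal login troubleshooting"),
}

def nearest(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the texts of the k stored documents most similar to the query."""
    ranked = sorted(VECTOR_STORE.values(),
                    key=lambda item: cosine_similarity(query_vec, item[0]),
                    reverse=True)
    return [text for _, text in ranked[:k]]

# A query vector near the "admin portal" documents retrieves them first,
# regardless of how the original question was worded.
print(nearest([0.85, 0.15, 0.05]))
```

Because matching happens in embedding space rather than on exact keywords, differently phrased queries about the same topic land near the same documents, which is what makes this approach robust to variations in wording.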

The value of SLMs for businesses is vast. Organizations can use SLMs to do the following:

  • Automate internal processes, such as report generation and data summarization, to enhance knowledge management and streamline workflows.
  • Enhance customer service with AI-powered chatbots and virtual assistants, delivering timely and personalized support to users.
  • Improve employee productivity by providing workers with innovative tools to access and analyze information efficiently.
  • Drive innovation by revealing valuable insights within enterprise data and helping develop fresh concepts for products, services and business approaches.

Don't view SLMs as merely smaller versions of LLMs. They can fundamentally change how an organization uses AI to extract value from data. They will become essential for modern data-driven businesses because they specialize in targeted tasks while using enterprise data and vector databases to provide substantial cost savings. Paired with enterprise data, SLMs can be the foundation for generative AI systems and AI agents that increase productivity across organizations.

As in any new industry, we are at a stage of market maturity where the technology starts out inefficient and costly. Market forces, such as the demand to refine the technology and drive down costs, then lead to broader and faster market adoption.

With AI, organizations across industries want the technology, but early financial modeling shows very high operational costs. SLMs can drive these costs down while supplying the precise, relevant context that makes AI tools more accurate in their responses, actions and reasoning. It's truly exciting to watch this market work through its maturity cycle, with many more efficiencies to come.

Stephen Catanzano is a senior analyst at Enterprise Strategy Group, now part of Omdia, where he covers data management and analytics.

Enterprise Strategy Group is part of Omdia. Its analysts have business relationships with technology vendors.
