Enterprises shift to on-premises AI to control costs

In 2025, many companies will shift to on-premises AI to cut cloud costs that can easily reach $1 million a month for large enterprises.

In 2025, enterprises shifting generative AI to production after two years of experimentation will consider on-premises deployment as a cost-effective alternative to the cloud.

Organizations have tested the capabilities of large language models underpinning GenAI services on AWS, Microsoft Azure and Google Cloud since OpenAI sparked the AI gold rush in late 2022. The experimentation showed that GenAI could significantly improve business operations, but also that it could send cloud costs dramatically higher.

Rather than having tough conversations with CFOs over soaring cloud expenses, CIOs will pursue on-premises AI when the underlying technology is less expensive than the cloud, experts said. Better software from startups and packaged infrastructure from vendors such as HPE and Dell make private data centers a way to balance cloud costs.

In the fall of 2024, Menlo Ventures surveyed 600 U.S. IT decision-makers in enterprises with at least 50 employees. The venture capital firm found that 47% developed GenAI in-house.

In roughly the same timeframe, Informa TechTarget's Enterprise Strategy Group surveyed 1,351 senior IT and business managers. The study found that the percentage considering both on-premises and public cloud equally for new applications in 2025 rose from 37% in 2024 to 45%.

At the same time, hardware makers reported significant increases in AI system sales.

In early December, HPE reported that revenue from AI systems rose 16% to $1.5 billion for the quarter ending Oct. 31. In nearly the same timeframe, Dell reported that orders for AI servers rose to a record $3.6 billion while the company's sales pipeline grew more than 50% across all customer types.

"Customers are looking for all shapes and sizes of AI-ready or AI-capable servers," said David Schmidt, senior director of Dell's PowerEdge server line.

Heavily regulated companies or those with strict governance rules have typically avoided the cloud to fully control data privacy and security. AI will not change that dynamic.

What's different is that Fortune 2000 companies will pursue on-premises AI because it offers more cost controls than the cloud, said John Annand, an analyst at research and advisory firm Info-Tech Research Group.

"It's not uncommon for us to get bills that we review for our members, and they're spending $750,000 or a million dollars or more a month in the cloud," Annand said.

Global manufacturing company Jabil develops and deploys nearly all GenAI applications on AWS. CIO May Yap monitors costs and constantly reviews the company's cloud use to ensure maximum efficiency.

"Does moving to the cloud actually give you the cost advantage? In certain cases, it doesn't," Yap said. "But what we have done is undergo this [constant process] called cloud financial optimization that inspects and sees how you rationalize the cloud consumption."

On-prem AI technology

Enterprises today can weigh cloud costs for GenAI applications against as-a-service offerings from Dell and HPE. Dell APEX and HPE GreenLake offer pay-per-use pricing for the AI servers introduced this year, combined with storage and networking, for private data centers or a colocation facility.

"Organizations are now looking at wanting more predictable costs," said Tiffany Osias, vice president of global colocation services at Equinix. "The cost of cloud is so high that infrastructure costs -- when you hit that tipping point -- are low enough to where they can get much better economics in purchasing equipment and running it on their own."
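The tipping point Osias describes comes down to simple arithmetic: the upfront cost of owned equipment divided by the monthly savings over the cloud bill. The sketch below is illustrative only. The $750,000 monthly cloud bill echoes the figure Annand cited, while the $6 million hardware purchase and $250,000 monthly operating cost are hypothetical numbers chosen for the example, not data from any vendor or survey.

```python
def breakeven_months(cloud_monthly, onprem_upfront, onprem_monthly):
    """Months until cumulative on-prem cost equals cumulative cloud cost.

    Returns None when on-prem never catches up, i.e. when running the
    owned equipment costs as much per month as the cloud bill or more.
    """
    monthly_savings = cloud_monthly - onprem_monthly
    if monthly_savings <= 0:
        return None
    return onprem_upfront / monthly_savings


# Hypothetical inputs: only the cloud bill mirrors a figure from the article.
cloud_bill = 750_000        # dollars per month in the cloud
hardware_purchase = 6_000_000  # assumed one-time equipment cost
onprem_running = 250_000    # assumed monthly power, space and staff cost

months = breakeven_months(cloud_bill, hardware_purchase, onprem_running)
print(f"Break-even after {months:.1f} months")
```

Under these assumed numbers, the purchase pays for itself in a year. A pay-per-use contract would change the inputs, since the upfront cost shrinks and the monthly cost grows, but the comparison works the same way.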

Companies developing GenAI in-house include Walmart, which built a document summarization application for its benefits help desk and an AI assistant that answers job-related questions from employees at its corporate headquarters.

Many startups have entered the market with software for developing in-house applications on top of the latest hardware, said Tim Tully, a partner at Menlo Ventures. Many of the products are sufficiently mature for on-premises AI development.

"I would say 80% of what generative AI requires is a push-button, turnkey solution, and that already exists in a series of startups today," Tully said. "They're out there selling software, and they're doing fairly decently."

The Menlo Ventures survey found that startups' gains are aided by a growing dissatisfaction with the AI offered by incumbent vendors, which tend to layer GenAI capabilities on existing products. At 64%, the number of companies preferring to buy from established vendors remained high. However, 40% of IT decision-makers questioned whether the current tools fit their needs, and 18% expressed disappointment with the offerings.

Examples of startup products include ready-to-use retrieval-augmented generation (RAG). RAG is critical to AI development because the framework allows enterprises to ground models in in-house data, retrieved at query time, to control their responses to prompts.
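At its core, RAG looks up relevant in-house passages for each question and places them in the prompt sent to the model. The following is a minimal sketch of that retrieval step, assuming a simple word-overlap score in place of the vector search a commercial product would likely use. The sample documents and every function name here are hypothetical, and the call to an actual language model is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count the words the query and the document have in common."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum(min(q[word], d[word]) for word in q)


def retrieve(query, docs, k=2):
    """Return the k documents that best match the query."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


def build_prompt(query, docs, k=2):
    """Assemble a prompt that confines the model to the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical in-house knowledge base for a benefits help desk.
DOCS = [
    "Employees accrue 15 vacation days per year.",
    "The dental plan covers two cleanings per year.",
    "Badge access requests go to the facilities team.",
]

print(build_prompt("How many vacation days do employees get?", DOCS, k=1))
```

The model itself is unchanged in this approach. Only the prompt carries company data, which is one reason the technique suits data that enterprises want to keep on their own systems.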

Startups in the space include RAG-as-a-service platform provider Ragie and GenAI platform-as-a-service provider Lamatic.ai.

Other startups are focused on integrating AI with internal systems. For example, Squid AI -- backed by Norwest Venture Partners, Zeev Ventures and Ridge Ventures -- offers a platform connecting various data sources and models to let companies integrate custom AI agents with existing infrastructure. Enterprises can deploy Squid AI on-premises or in the cloud.

LangChain is an example of an open-source orchestration framework for developing AI applications on-premises. The company offers prebuilt libraries, prompt templates and chain constructs for creating chatbots, virtual assistants and intelligent search systems. LangGraph, an extension of the framework, provides an open-source library for creating agent and multi-agent workflows.

Companies that decide on specific in-house development of AI applications will likely use tools outside of traditional IT computing environments, Annand said. Therefore, on-premises development will benefit consulting companies.

"Any company that can offer consulting services around how to effectively use the tools and how the tools can generate business outcomes are going to do well," Annand said.

Antone Gonsalves is an editor at large for TechTarget Editorial, reporting on industry trends critical to enterprise tech buyers. He has worked in tech journalism for 25 years and is based in San Francisco.
