SambaNova Systems intros AI inference platform

The new platform runs Llama 3.1 405B and, with the vendor's full-stack approach, helps enterprises build agentic applications. It also rivals GPU architectures, the vendor claims.

SambaNova Systems on Tuesday introduced SambaNova Cloud, an AI inference platform.

SambaNova Cloud is powered by the independent AI hardware/software vendor's AI chip SN40L. The inference platform enables developers to create generative AI applications using the Llama 3.1 405B and Llama 3.1 70B LLMs from Meta through an API now available for free to any developer who logs into the platform.

SambaNova Cloud also has developer and enterprise tiers.

The developer tier lets developers build at a higher rate with the Llama 3.1 8B, 70B and 405B models. It will be available by the end of 2024.

The enterprise tier gives enterprise customers the ability to more quickly build models for production workloads. It is also now available.

SambaNova Cloud comes at a time when enterprises are starting to implement generative AI workflows and deploy them at various scales.

Alternative infrastructure

Enterprises are also exploring both open and closed source models. Meta's Llama LLMs are open.

"Almost every single call I take from enterprises and venture capitalists involves how to leverage Meta Llama 3 and 3.1 models," Gartner Research analyst Chirag Dekate said.

Most enterprises adopting and implementing generative AI workflows are using GPU-oriented infrastructure while also exploring other types of architectures they can easily integrate with.

SambaNova Cloud offers enterprises an alternative infrastructure.

"Anytime you have an ASIC [application-specific integrated circuit] like SambaNova, it will almost always deliver better performance than any conventional GPU," Dekate said. That is because ASICs do not require as much energy as GPUs, making them more efficient and leading to better performance.

However, a big challenge for AI ASICs is how to integrate a software stack.

This is a challenge that SambaNova Cloud seeks to address. With SambaNova Cloud, enterprises do not have to figure out how to build the right software stack. Instead, SambaNova does this for enterprises by offering SN40L as a service, which helps enterprises deal with the integration challenge, Dekate said.

Moreover, SambaNova's full-stack approach gives it the advantages it needs to offer Llama 3.1 405B, IDC analyst Matthew Eastwood said.

"LLMs require advanced resource scaling for inference and cost-efficient deployment strategies," Eastwood said. "Overall, these workloads demand a much more tailored approach to hardware, software and performance optimization."

As a provider of ASICs, SambaNova is generally able to deliver more efficiency than AI vendors that use GPUs, Dekate said.

"An AI ASIC approach almost always delivers better efficiency; energy efficiency; cost efficiency; and, more importantly, sustainability efficiency than an equivalent GPU ecosystem," he said.

AI agents

Besides running the largest Llama 3.1 model, SambaNova Cloud lets developers build agentic applications at "unparalleled speed," SambaNova claimed.

According to IDC research, AI agents need "fast and efficient operation to handle real-world, dynamic environments," Eastwood said.

AI agents have "transformative potential … for businesses looking to automate processes and scale operations," he added.

AI agents are autonomous systems that can perform tasks without human intervention.

However, while SambaNova could eventually make good on its promise of AI agents, there is a risk of too much hype around agentic applications, Dekate said.

"There is this perception in the market that is now starting to take shape that agents are achieving a level of maturity where they can solve production problems effectively at scale," he said. "The reality is far from that."

He added that for agents to deliver true value, several things must happen. First, models need to move beyond just language into multimodal models. How multimodal techniques are integrated into applications must also evolve.

"These capabilities are not available today. So enterprises that are engaging in agentic AI should measure 10 times before investing once, because chances are that agentic AI investments today are going to fall short of their expectations," Dekate continued.

The versatility of the SambaNova architecture will help the vendor adapt as the techniques needed for true AI agentic workflows evolve, he added. Since the architecture is fluid, it won't need much of a change as the AI market moves more into agentic workflows.

Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.
