New Anyscale service enables fine-tuning of open source LLMs
The AI startup introduced a service that lets enterprises deploy large language models into their applications using APIs for popular open source LLMs like Llama 2.
Anyscale, the creator of the popular open source unified framework Ray, introduced a new service for the generative AI market that enables developers to integrate LLMs into their applications using APIs for popular open source LLMs like Meta's Llama 2.
The vendor introduced a new service named Endpoints on Monday during its Anyscale Ray Summit conference in San Francisco. It is available now.
Endpoints is meant to make it easy for enterprises to get started with generative AI right away, said Robert Nishihara, Anyscale's cofounder and CEO, in an interview.
Endpoints helps enterprises with the prototyping phase of generative AI, when they are still figuring out how the technology fits their needs, he said.
Open source popularity
The service helps Anyscale capitalize on the growing interest in open source models from independent software vendors trying to embed generative AI into new and existing products, according to Arun Chandrasekaran, an analyst at Gartner.
Open source models are particularly useful for financial and telecommunications companies, which have a high level of engineering expertise that enables them to work with and adapt open source models to fit their needs, he said.
"We have seen more state-of-the-art open source models emerge," Chandrasekaran said, noting that the most significant among these is Llama 2. "Anyscale is really trying to provide a platform that will enable [enterprises] to take these models and fine-tune them and perhaps run them more efficiently in production."
For Anyscale, the interest in open source is telling of the future.
"We believe open models will dominate," Nishihara said. "Looking at the trajectory, open models will become the default ways to power most business applications."
A gap exists
However, despite the popularity of open source, there is still a gap between closed and open source models. Modern closed source models are likely to deliver value faster than open source ones, Chandrasekaran said.
That is partly because the level of skill and talent required for open source is greater than for closed source, he added.
Therefore, Anyscale will likely be challenged to convince enterprises that the open source approach is better than well-integrated stacks like those from Microsoft or Google.
"A lot of clients are looking for simple, easy, low-hanging fruit," Chandrasekaran said. "They're looking for simpler use cases, meaning they might go with a more verticalized approach."
Nishihara acknowledged that a performance gap currently exists between closed and open source models, but said optimizing the models is the technique that can close it.
"Fine-tuning is the technique that really can play a big role in helping businesses reduce costs," he said.
In line with that approach, Anyscale also launched a fine-tuning API, which lets enterprises using Endpoints not only run inference with the models but also customize the Llama 2 model.
Enterprises concerned about privacy can use Anyscale Private Endpoints, which lets users run the entire API in their own cloud accounts and infrastructure.
Anyscale also revealed that Nvidia AI software will be integrated into the Anyscale computing platform, including Ray open source infrastructure, the Anyscale platform and Endpoints.
The integration brings Nvidia software to Ray, including Nvidia TensorRT-LLM, Nvidia Triton Inference Server and Nvidia NeMo.
Anyscale Endpoints costs $1 per million tokens for the 70-billion-parameter Llama 2 model. It's less expensive for the smaller models.
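To put that rate in perspective, the following is a rough back-of-the-envelope sketch in Python. The $1 per million tokens figure comes from the pricing above; the function name, the workload numbers and the assumption that every token is billed at the same flat rate are illustrative only and are not taken from Anyscale.

```python
# Rough cost estimate at a flat per-token rate.
# The $1 per million tokens figure is the reported Llama 2 70B price;
# everything else here is a hypothetical illustration.
PRICE_PER_MILLION_TOKENS_USD = 1.00


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Return the estimated cost in U.S. dollars for a given token count.

    Assumes all tokens are billed at the same flat rate, which may not
    match how the service actually meters usage.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests a day at 1,500 tokens each.
    daily_tokens = 10_000 * 1_500
    print(f"Daily tokens: {daily_tokens:,}")
    print(f"Estimated daily cost: ${estimate_cost_usd(daily_tokens):.2f}")
```

Under those assumed numbers, a prototype handling 15 million tokens a day would cost about $15 a day on the largest Llama 2 model.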
Esther Ajao is a TechTarget Editorial news writer covering artificial intelligence software and systems.