
Mistral tries to differentiate itself with new OCR API

The new tool enables the vendor to target specific use cases for its multimodal model. It also aims to help enterprises that want to self-host due to privacy concerns.

In a market in which large language models are becoming commoditized, AI startup Mistral is seeking to differentiate itself by targeting a specific application and being more sensitive to privacy and security.

The Paris-based vendor introduced Mistral OCR, an optical character recognition API. According to the vendor, Mistral OCR accurately understands different elements of documents (media including photos and graphics, text, tables and equations). It takes images and PDFs as input and extracts content from text and images.

The AI startup said Mistral OCR is ideal for use with retrieval-augmented generation systems and when taking multimodal documents like slides or PDFs as input.

Mistral said some enterprises have already started to use the API. For example, customer service departments are using Mistral OCR to transform documentation and manuals into indexed knowledge. Research institutions are also using it to convert scientific papers into "AI-ready" formats, according to the vendor.

The API is also multilingual, handling documents in different languages. Mistral OCR capabilities are free on the Mistral chat website, Le Chat. Those wanting to try the API can do so on a free trial basis on Mistral's platform.

With this launch, Mistral joins other vendors in offering features and products for document extraction.

For example, Google's Cloud Vision API can detect and extract text from images. Microsoft also offers OCR through Azure AI Vision.

"What they're essentially doing is they're using a multimodal AI model to make that extraction automated and more qualitative, to make that extraction a lot more accurate," said Gartner analyst Arun Chandrasekaran.

Two strategies

While document extraction is not unique to Mistral, the launch highlights a trend in the generative AI market of vendors aiming their models at specific use cases.

"It's really hard to sell general-purpose models," Chandrasekaran said. "You have to identify specific business problems within the enterprise that they can solve."

Therefore, Mistral OCR is an attempt by the AI vendor to step into traditional business processes within the enterprise, Chandrasekaran continued.

"It's a monetization strategy," he said. "This is an opportunity for Mistral to monetize some of the multimodality they want to build."

Mistral is also targeting data privacy. Mistral OCR offers a self-hosting option for organizations with strict data privacy requirements. Those organizations can host the API within their infrastructure.

Some challenges

The option to self-host will be necessary for many enterprises concerned that generative AI is moving too fast while needed governance and guardrails lag, said David Nicholson, an analyst at The Futurum Group.

Enterprises have been considering self-hosting AI tools and models in their data centers to safeguard against future problems such as data breaches.

So, the OCR model Mistral is offering will likely interest those enterprises, Nicholson said.

"I see that as a competitive advantage," he said. "I also think it is a relatively easy advantage to overcome."

Another challenge for Mistral is targeting the right user, Chandrasekaran said.

"The fact that it's more of a developer-centric offering is going to slow down the monetization," he said. "If they directly have access to and focus on the business unit leaders, the monetization will happen much faster."

Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.
