XAI's Grok-3 highlights openness and transparency concerns

The AI startup's new Grok model has 10 times more compute power than the previous generation. XAI also introduced reasoning capabilities that it says surpass those of rival reasoning models.

XAI CEO Elon Musk is again challenging OpenAI through his startup's updated large language model, Grok-3, about one week after he made a bid to buy OpenAI.

XAI introduced Grok-3 and Grok-3 mini on Monday through a livestream with Musk, xAI co-founders Jimmy Ba and Yuhuai Wu, and lead engineer Igor Babuschkin. The new models have 10 times more compute power than Grok-2, according to the vendor. Grok-3 and Grok-3 mini surpassed OpenAI's GPT-4o, Google Gemini and DeepSeek-V3 across benchmarks testing for math, science and coding, xAI said.

The startup also built reasoning capabilities into Grok-3 and Grok-3 mini, which it said surpass other reasoning models such as OpenAI o1, DeepSeek-R1 and Gemini 2.0 Flash Thinking on benchmarks testing for math, science and coding.

The AI startup claimed that an early version of Grok-3 achieved a high score on Chatbot Arena, a public LLM benchmarking site that shows users answers from two anonymous models for comparison. The early version of Grok-3 was code-named Chocolate.

XAI also revealed a new Deep Search tool, which will act as a next-generation search engine.

More reasoning models

Grok-3 comes as competition among AI vendors has intensified in the past few weeks, starting with Chinese AI startup DeepSeek. Since then, AI vendors including xAI's rival OpenAI have refined their reasoning models or introduced new ones.

With DeepSeek-R1 being an open source model, many vendors can now turn any of their models into reasoning models, said Bradley Shimmin, an analyst at Omdia, a division of Informa TechTarget.

"You can train any model to behave as a test-time reasoner," he said. "That's what they're doing with Grok-3."

XAI is not the only vendor that can do this. For example, on Feb. 12, Open Thoughts, a community of researchers, released OpenThinker-32B, an open-data reasoning model derived from reasoning traces from DeepSeek-R1.

Grok-3 also seems similar to DeepSeek's reasoning model, said David Nicholson, an analyst at Futurum Group.

"I don't see huge differences, except that it isn't encumbered by the censorship built into DeepSeek," Nicholson said.

Open vs. closed

It's unclear whether xAI used DeepSeek to develop Grok-3, and it's also unclear how the vendor added reasoning and thinking to its models. The vendor did not release any supporting material or information outside its livestream.

"There's no transparency into how this thing was made, what it's doing and why it is -- as Elon so eloquently put it -- so based," Shimmin said.

The lack of supporting material is a significant departure from xAI's initial approach, when it released Grok-1 as open source. Musk said on Monday's livestream that while the vendor has yet to open source Grok-2, it plans to do so once Grok-3 is fully available and mature.

Shimmin said the strategy of open sourcing only the previous version of the model, rather than the current one, helps xAI protect its value proposition.

XAI's strategy is a reasonable middle ground in the debate over open source and AI vendors making money from their technology, Nicholson said.

"That's a reasonable balancing act, to say, 'We reserve the right to keep secret, close to the vest, the leading edge of what we do, and then over time we will open this stuff up for developers to use with unlimited licensing,'" he said.

Enterprise use of Grok

However, the lack of transparency could also mean many enterprises take a wait-and-see approach to adopting Grok-3.

Enterprises tend to prefer vendors like IBM that are very transparent and make even their pretraining data open, versus those that are closed, Shimmin said.

"That level of transparency is crucial for companies to make a choice of a model that they know ... is indemnified against any sort of future litigation, or at least lets them address any sort of biases that they want to address in their solution," Shimmin said. "We don't know at all what those based biases are in Grok-3."

There is also a question of whether enterprises are ready for the kind of "honesty" that Grok-3 might have, Nicholson said.

"It remains to be seen whether enterprise customers will embrace an approach that is personified by the kinds of behavior that Elon Musk exhibits," he said. Musk has been clear that Grok does not embody what he calls a "woke" agenda. This starkly contrasts with OpenAI's and Google's approach to censoring their respective LLMs, and it's unclear which approach enterprises prefer.

However, Nicholson added that Grok-3 being a contender in the AI market is beneficial.

"It's good news that another contender is jumping in, and ultimately, it will drive down the cost of AI for everyone," he said.

According to Musk, the current version of Grok-3 will have some imperfections, but improvements will be made daily. Moreover, xAI will introduce voice capability in the coming months.

XAI also revealed that it's starting a new SuperGrok subscription and a website called Grok.com.

Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.
