Microsoft unveils world AI model for video games

The new type of foundation model is similar to Nvidia Cosmos. While it makes for a better gaming experience, it's also applicable to enterprises.

Microsoft on Wednesday presented its research into a foundation model that can generate video game visuals and controller actions, advancing generative AI applications.

The first World and Human Action Model (WHAM), also called Muse, was developed by the Microsoft Research Game Intelligence and Teachable AI Experiences teams with Microsoft Xbox Game Studios.

Microsoft is currently open sourcing weights and sample data. Developers can also learn and experiment with the weights, sample data and WHAM Demonstrator -- a prototype with a visual interface for interacting with WHAM models -- on Azure AI Foundry. Azure AI Foundry is a platform that developers can use to build AI applications.

World models

Microsoft's WHAM is the next set of foundation models in a generative AI market that already produces language models -- which imitate how humans write things -- and action models, which focus on applications or how people use things.

WHAM builds on concepts from Google DeepMind, in which models can simulate the world, said Omdia analyst Bradley Shimmin.

"Muse is a continuation of that idea in world-building," Shimmin said.

The models learn as they go without training and can change.

Microsoft is not the first vendor to create a model like this. In January, Nvidia introduced world foundation models, or Nvidia Cosmos. Nvidia Cosmos was trained to understand the physical world. Nvidia trained Cosmos on hours of video footage of humans walking, hands moving and objects being manipulated.

While Nvidia Cosmos is different from Microsoft's Muse, both models are important for building AI technology that can interact with any type of world, said Dion Hinchcliffe, an analyst at The Futurum Group.

"This is a very big, new dimension in foundation models and AI," Hinchcliffe said.

Video games and enterprise applications

While there are several applications of world models, Shimmin said Microsoft's video game application will benefit both gamers and designers.

"It's talking about increasingly immersive and increasingly responsive world modeling using generative AI to help both the designer and the player," he said. "For the designer, [it helps] create a more interesting game, and for the player, [it helps] to have the game evolve more fluidly around them."

Microsoft, with Xbox, and Nvidia, with GeForce Now, are the two giants that can unleash generative AI in the video game industry, Hinchcliffe said.

However, world models have applications beyond video games.

For enterprises, world models like Muse could be helpful with complex supply chain systems, Shimmin said. For example, a grocer that needs to manage an egg supply chain might use a world model to simulate how its system could respond to disruptions.

"That's worth a tremendous amount of money in the enterprise," Shimmin said. He added that there's a lot of planning software, but deep learning neural networks like Muse could help enterprises anticipate problems better than the planning software available today.

However, world models come with challenges, Hinchcliffe said.

"We will have to be very careful with any AI that takes action," he said. "When you have AI actually doing things, directly changing the world, that's an issue. And so, when they cross over to that, that's going to be where the major risk factor skyrockets."

Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.
