
Google's Gemma 3 can run on a single TPU or GPU

Google says the new models are user-friendly, cost-effective and tailored for edge devices.

Google on Wednesday introduced Gemma 3, a new generation of its family of open models.

The release came a day after rival AI developer OpenAI launched a series of tools for building agentic AI applications. It highlighted the near-monthly competition between the generative AI pioneers as they aim to match and exceed each other's advances while fending off a host of smaller challengers.

Gemma 3 shows how generative AI models are improving and becoming more economically viable and accessible to business users. Developers can use Gemma 3 models to create applications that fit on a single GPU or TPU, which makes them cheaper to run, Google said.

"The massive reduction in resource requirements is probably the biggest or most interesting component of this," said William McKeon-White, an analyst at Forrester Research. "It becomes more economical to run these things."

The models, which Google touted as "lightweight," portable and responsibly developed, can run on devices such as phones, laptops and workstations. They come in parameter sizes of 1 billion, 4 billion, 12 billion and 27 billion.

According to Google, human evaluators preferred the models over similar systems such as Meta's Llama 3.1 405B, DeepSeek-V3 and OpenAI's o3-mini on some benchmarks. The models are also multilingual, with built-in support for more than 35 languages and pretrained support for 140 languages.

Developers can use Gemma 3 to build applications that analyze text and short videos. The models also support function calling and structured output to help users automate tasks and build agentic applications.
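In practice, the function-calling pattern the article describes typically works by having the model emit a structured (often JSON) description of a tool call, which the application then executes. The following is a minimal, hedged sketch of that pattern in Python; the `get_weather` tool, the JSON shape of the model's reply and the surrounding names are assumptions for illustration, not Gemma 3's actual API.

```python
import json

# Hypothetical tool an agentic app might expose to the model.
def get_weather(city: str) -> str:
    # Stubbed lookup; a real app would query a weather service here.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(model_reply: str) -> str:
    """Parse a structured (JSON) tool call emitted by the model
    and invoke the matching local function with its arguments."""
    call = json.loads(model_reply)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Example of a reply the model might produce when asked about the weather.
reply = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch(reply))  # -> Sunny in Paris
```

The structured-output half of the feature is the mirror image: the application constrains the model to emit JSON matching a schema, so downstream code can parse it reliably instead of scraping free-form text.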


Developers can download and customize Gemma 3 models from Hugging Face, Ollama and Kaggle. The models are also available on Google AI Studio and Vertex AI.

Economically viable

Organizations no longer need to buy or rent the most expensive GPUs to get generative AI models up and running, which makes the technology much more accessible, McKeon-White said.

The affordability and performance of models like Gemma 3 will lead to interesting business models from new vendors that can attract startup funding and create ready-made offerings, he said.

"It significantly simplifies the upfront getting started and reduces the ongoing run cost of that," McKeon-White said. "It does allow for a lot more potentially new business to be built off of these models faster than most enterprises could take proper advantage of these advances."

One size does not fit all

Google's openness and adaptability with Gemma 3 are also key benefits, according to Futurum Group analyst Bradley Shimmin.

"They're extremely adaptive, not just in terms of what they've been trained to do, but also adaptive in terms of where they're meant to run," Shimmin said. He has been able to run the model on his AMD laptop, and the model's performance has been "snappy," he said.

However, the models do not replace test-time reasoning models like DeepSeek-R1, Shimmin added. Test-time reasoning models prioritize reasoning during inference rather than relying only on pretrained data.

"It's so easy for the market to convince itself that every innovation is the next phase of AI and the next step in the evolution of AI, but there are refinements of each," he said. "Reasoning has its place, but just as important, you might be building an agentic system that uses a reasoning model to then assign tasks to multiple smaller, traditional transformer models, like Gemma, to carry out more specific tasks."

Google built Gemma 3 recognizing that one model cannot fit all scenarios and that some models might be best for specific use cases, Shimmin said.

Some challenges

Despite Gemma 3's accessibility and adaptability, Google faces the same challenge as other vendors: Many enterprises are not yet ready to take full advantage of these AI systems. While they see the opportunity, they tend to move slowly when integrating them.

"That is the biggest impediment at the moment to seeing success with these things, because you can do pretty remarkable business-transforming changes with them -- it's just hard to get right," McKeon-White said.

Another problem for Google is figuring out how to provide a unified architecture for enterprises to work with, Shimmin said.

"Google's challenge, as it's always been, is that they have many irons in the fire, and many projects that overlap and can benefit from closer integration and a more coordinated go-to-market and support mechanism for those in the enterprise," he said.

Along with Gemma 3, Google also launched ShieldGemma 2, a 4 billion-parameter image safety checker built on the Gemma 3 foundation. It provides added guardrails for image safety, outputting safety labels across three categories: dangerous content, sexually explicit content and violence.

Google's research unit Google DeepMind also introduced two new AI models based on Gemini 2.0 on Wednesday: Gemini Robotics and Gemini Robotics-ER.

Gemini Robotics is an advanced vision-language-action model. It is able to understand new situations and perform different tasks, including those it has never seen before, Google said. It can tackle complex tasks such as packing a snack in a ziplock bag or folding origami.

Gemini Robotics-ER can create new capabilities on the fly, Google said. The model can perform the necessary steps to control a robot, such as perception, state estimation, spatial understanding, planning and code generation.

Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.
