Getty Images
Google strikes back at OpenAI with Gemini foundation model
After falling behind, the vendor drew on its AI roots in a bid to outdo OpenAI's GPT with a potent foundation model to inject across its app portfolio and for advanced coding.
Listen to this article. This audio was generated by AI.
Google on Wednesday struck back at rival OpenAI and its GPT family of large language models with Gemini, a powerful new multimodal foundation model the cloud giant plans to use to power its generative AI chatbot, Bard, and the rest of its portfolio.
The move came as somewhat of a surprise after reports that Google was delaying the release of Gemini, which has been developing for a long time. Many in the tech world see it as an aggressive step to match, if not exceed, the rapid generative AI strides of OpenAI and its deep-pocketed patron, Microsoft, over the past year while Google lagged.
"Google already had the capability to out-AI OpenAI. Gemini is it," said Mike Gualtieri, a Forrester Research analyst. "Google has all the data, infrastructure and talent to create the best models in the world."
Google drew on its long history in AI R&D to create what it called its largest and most capable AI model. That history includes the DeepMind AI unit it bought in 2014; BERT, an early LLM from 2019; and the AI technology behind its Search engine.
Gemini 1.0 comes in three sizes: Ultra, the biggest configuration for the most complex tasks; Pro, the mid-range version for scaling a wide range of applications; and Nano, a lightweight version that Google said it will use in its Pixel smartphones to bring on-device generative AI capabilities.
Google touted dramatically improved performance over previous models in mathematical calculations -- an area in which LLMs have been weak. Gemini Ultra scored a 90%, outperforming humans, on Cornell University's massive multitask language understanding test, according to Google.
Multimodal LLM
With its native multimodal capabilities able to train on and generate text, images, audio and video, Google sought to differentiate Gemini from previous attempts at multimodal LLMs, which have consisted of integrating separate, single-mode text- and image-generating models.
"There's nobody that I know of that's put [multimodal] out," said Mark Beccue, an analyst at The Futurum Group.
Comparisons to GPT are inevitable, and if the test results claimed by Google are reflective of Gemini's actual capabilities, then "Gemini establishes the new standard that peer models will need to meet or beat," Gartner analyst Chirag Dekate said.
"Gemini is game-changing and sets the benchmark in the fast-evolving generative AI landscape," Dekate continued. "From its ability to explain reasoning in math and physics to advanced coding, Gemini potentially unlocks unprecedented new insights and inspires new applications."
As for the AI arms race, Google appeared to gain significant ground on, if not surpass, OpenAI and other rivals, including AWS and IBM. All have been issuing rapid-fire iterations of their generative AI technology.
Google's latest advancements might have put it in an advantageous position technologically, but any such advantage might be transitory.
"There are four really capable cloud players doing [GenAI] here. And at the end of the day, they'll keep matching each other," Beccue said. "Nobody's going to win this."
Other applications
Beyond advanced math and coding work, Google has more plans for Gemini.
They include applying Gemini Pro to Bard in its first major upgrade since it was unveiled as an alternative to ChatGPT in February. The upgraded Bard is generally available now.
Starting on Dec. 13, developers and enterprise customers will be able to use Gemini Pro through an API in Google AI Studio and Google Cloud Vertex AI.
Chirag DecateAnalyst, Gartner
For Gemini Ultra, Google said it is completing trust and safety checks before making it available to select users as well as for early experimentation and feedback. It will then release to developers and enterprise customers early next year.
Google also plans to release another version of the chatbot, Bard Advanced, early next year.
The tech giant said it has applied Gemini to its Search Generative Experience tool to make it faster. The company is also experimenting with Gemini in classic Search.
Meanwhile, some observers found Google's move to put generative AI capabilities directly on its smartphone -- something not available natively on smartphones from other vendors -- one of the more noteworthy aspects of the Gemini introduction.
"While this is limited to Pixel devices at the moment, watching this develop will be very interesting," said William McKeon-White, a Forrester Research analyst.
Shaun Sutner is a senior news director for TechTarget Editorial's enterprise AI, business analytics, data management, customer experience and unified communications coverage areas. Enterprise AI news writer Esther Ajao contributed to this story.