
Google's Veo 2 is technically advanced, but concerns remain

The new version can generate videos up to two minutes long. It seems better than OpenAI's and Adobe's video generators. However, it might not be safe for commercial use.

Google continues to improve its AI image and video-generating systems.

On Sunday, the cloud provider introduced the next version of its video generation model, Veo 2, and updated Imagen 3. Veo 2 is available in Google's VideoFX tool. Users can join the waitlist at Google Labs.

Veo 2 creates high-quality videos across a wide range of subjects and styles, according to Google. It can understand cinematography, deliver resolutions up to 4K and extend to minutes in length.

Google's update to Veo comes two weeks after the vendor made the model available on Vertex AI. It also comes one week after OpenAI made its video generator generally available, following months in preview.

The video generator also competes against Amazon Nova Reels, a new foundation model announced by AWS that generates videos about six seconds long at about 24 frames per second.

Big leap

Veo 2 is a big leap compared with the other video generators in the market, said Liz Miller, an analyst at Constellation Research.

"The big difference is the resolution and in the duration of clips that can be generated," Miller said.

Veo 2 can generate clips over two minutes long, at up to 4K in resolution, according to Google. VideoFX, however, is still limited to 720p resolution and eight seconds in length.

Google also focused on how and where cameras can be placed and seemed to have trained Veo 2 on motion and how a texture or material will move, Miller said.

"Think of the difference between water and oil. ... There is an inherent difference in how those two materials and textures will move and how they will land and spread," she said. "Veo 2 looks pretty darn good in understanding this difference."

Focus on safety

The improvements Google made are enough to attract more competitors, but enterprises tend to prioritize reliability and safety over cutting-edge technology, said Keith Kirkpatrick, an analyst at Futurum Group.

"The technological capabilities are undoubtedly really interesting in terms of what can be done," Kirkpatrick said. "They still need to keep a close eye on these models to make sure that if users are putting in prompts, that they're not going to be generating material that could be considered offensive or toxic."

Adobe, by contrast, focuses on safety with its Firefly model rather than on technological advances, as Google does.

"They want to make sure that what they're doing is safe to use because they have such a large customer base of large organizations that are not going to use tools that are not reliably safe," Kirkpatrick said.

He added that it will be hard for Google to address issues of trust unless it is willing to train the models on only data it has fully vetted. The vendor has also not been clear about the training data it used to train Veo 2, although it's likely that it used YouTube, Miller said.

Another challenge for enterprise customers is that Google has not included Veo 2 users in its contractual indemnification clauses. The lack of an indemnification clause might make enterprises wary due to possible copyright questions around data ownership.

Overpromising, underdelivering

While Google said Veo 2 is less likely to hallucinate, it could face the issue of overpromising and underdelivering, Miller said.

For example, most video models struggle with hands. While Veo 2 seems to have reduced the number of errors in hands or human images, it will probably still get things wrong, she said.

"The better any artist -- human or AI -- gets, the more the details look off," she continued. "Animations that are supposed to be hyper-realistic have eyes that are flat and 'dead.' Hands lack veins; textures go wonky for no apparent reason."

Technological advancement alone will not determine the eventual winner among video models, Miller said. Enterprise use will decide which model wins.

She added that many already see video models as tools that further the creative process.

Moreover, it's likely that as time goes on, technology from Adobe and OpenAI will catch up to Google.

"If you look down the road -- three months, six months, a year -- you get to a certain point where everyone sort of gets to the same spot," Kirkpatrick said.

Despite the technological advances, safe commercial use will be a key factor for enterprises.

"The key here will be what documentation, visibility and 'ingredient lists' will these solutions be willing to provide to ensure safe commercial use and responsible creation," Miller said.

Google also improved its Imagen 3 image-generation model to generate brighter images. The Imagen 3 model will also roll out to ImageFX, Google's image generation tool from Google Labs, in more than 100 countries.

The cloud vendor also introduced Whisk, its newest experiment from Google Labs. Whisk allows users to input or create images to convey any subject, scene or style.

Esther Shittu is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.
