Nvidia unveils text-to-3D AI research project
The new model helps generate 3D images faster. It was trained on different animals and objects using the vendor's A100 GPU and prompts generated with OpenAI's ChatGPT.
Nvidia on Thursday introduced LATTE3D, a new research project for text-to-3D content creation. The model turns text prompts into 3D representations of objects and animals within seconds, Nvidia said.
While researchers trained the model on two datasets of animals and objects, developers could use the same model architecture to train the AI system on other data types, such as a dataset of plants.
LATTE3D was trained on an Nvidia A100 Tensor Core GPU using prompts generated with OpenAI's ChatGPT.
Nvidia said users can export the shapes or objects generated into graphics software applications or platforms such as Nvidia Omniverse, the AI hardware-software vendor's platform for creating metaverse applications.
The new research project comes days after Nvidia introduced its two powerful new Blackwell GPUs, along with a project for using foundation models to design humanoid robots.
The evolution of 3D images
What Nvidia is doing with LATTE3D is displaying the prowess of its GPUs and the possibilities of its hardware for a range of applications, said Chirag Shah, professor at the information school at the University of Washington.
"This work doesn't necessarily produce better 3D images -- at least, not in a noticeable, groundbreaking way," Shah said. "What it does, however, is produce such images much faster."
Rendering 3D objects is computationally expensive and complex with the added layer of text-to-image, he added.
LATTE3D is a natural evolution of text-to-image generation technology, Futurum Group analyst David Nicholson said.
"We've been conditioned over decades to retrieve things," Nicholson said, referring not only to LLMs but also to retrieving images from search. "When we go to a 3D thing, it's even more complicated."
For LATTE3D, Nvidia used amortized optimization, an approach that spreads training cost across many prompts: a single network is trained on a whole set of prompts at once, so generating a 3D shape for a new prompt takes one fast forward pass rather than a lengthy per-prompt optimization. With this approach, Nvidia is aiming to tackle one of the main challenges of earlier 3D image generation models: the speed at which images are generated.
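The contrast between per-prompt optimization and amortized generation can be sketched in a few lines of toy Python. This is purely illustrative (the function names and the stand-in "optimization loop" are invented for this sketch, not Nvidia's code): the point is that the amortized approach pays its cost once, during training, while each later request is a single cheap step.

```python
def per_prompt_generate(prompt, steps=1000):
    # Simulate an expensive optimization loop run from scratch for every prompt,
    # as in earlier per-prompt text-to-3D methods.
    result = 0.0
    for _ in range(steps):
        result += len(prompt) * 1e-3  # stand-in for one gradient step
    return result

class AmortizedGenerator:
    def __init__(self, training_prompts):
        # One-time training cost, shared ("amortized") across all future requests.
        self.table = {p: per_prompt_generate(p) for p in training_prompts}

    def generate(self, prompt):
        # Inference is a single cheap lookup/forward pass, not a fresh optimization.
        return self.table.get(prompt, len(prompt) * 1.0)

# Usage: training happens once; each generate() call afterward is near-instant.
gen = AmortizedGenerator(["a red fox", "an oak tree"])
fast_result = gen.generate("a red fox")
```

In the real system the lookup table is replaced by a neural network that generalizes to prompts it was not trained on, but the cost structure is the same: heavy up-front training, near-instant generation per prompt.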
The vendor is using both software and hardware to do that, Nicholson noted. Nvidia is also using plain-language prompts and diffusion technology to create and structure this model inexpensively and quickly, he added.
"It is illustrative of Nvidia's leadership position in all sorts of different areas of AI," he continued.
Some limitations
However, LATTE3D might not become the go-to model for creating video or animated films, since it is still limited in the tasks it can perform.
"Just because this helps to refine the creation of 3D images and 3D models doesn't mean this will find its way into say, the production of a full Hollywood animated film," Nicholson said.
Moreover, the techniques Nvidia used carry the same limitations and risks as other text-to-image generation techniques. That is similar to what happened with Google Gemini, Shah said, referring to the bias problem that led Google to temporarily suspend the image generation feature of its Gemini multimodal model.
"Things will still cause issues," he said. "We just have the ability to do that faster."
Other than LATTE3D, Nvidia introduced another research project named EDM2. EDM2 is a generative AI model that improves the structure of diffusion-based neural networks used in current image and video generators.
Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.