An explanation of AI model collapse

In this video, TechTarget editor Sabrina Polin talks about AI model collapse and the threat it poses to data.

Just like how a healthy ecosystem needs biodiversity, AI needs diversity in its training data to be effective. Otherwise, you get model collapse.

Model collapse is what happens when AI models degrade after being trained on synthetic, AI-generated content -- as opposed to human-generated content. Simply put, it's a feedback loop.

As generative AI models create more and more content that gets shared on the internet, the next generations of AI models eventually train on that content, instead of human-generated content.

These new models will rely too heavily on the most common patterns in their training data, overestimating probable events and underestimating improbable ones. As a result, synthetically trained models compound errors, misinterpret data and produce increasingly wrong and homogeneous outputs.
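That narrowing effect can be sketched with a toy simulation. The code below is a hypothetical illustration, not how any real model works: it represents a "model" as a probability distribution over ten made-up phrases, and each training generation over-weights the probable phrases while forgetting the rare ones. Within a few generations, only the single most common output survives.

```python
def train_next_model(dist, sharpen=1.5, floor=1e-3):
    # Toy stand-in for training on synthetic data: the new model
    # over-weights common outcomes (raise probabilities to a power > 1)
    # and forgets rare ones (drop anything below a small floor),
    # then renormalizes so probabilities sum to 1.
    powered = {k: p ** sharpen for k, p in dist.items()}
    powered = {k: p for k, p in powered.items() if p >= floor}
    total = sum(powered.values())
    return {k: p / total for k, p in powered.items()}

# A hypothetical "human" distribution over ten outputs:
# one common phrase and nine increasingly rare ones.
dist = {f"phrase_{i}": p for i, p in enumerate(
    [0.4, 0.2, 0.1, 0.08, 0.06, 0.05, 0.04, 0.03, 0.02, 0.02])}

# Each generation trains only on the previous generation's output.
for generation in range(8):
    dist = train_next_model(dist)

print(len(dist))  # → 1: only the most probable phrase remains
```

The rare phrases vanish first, then the moderately common ones, until the distribution collapses to its single most likely output -- a simplified analogue of the homogeneous, low-diversity text a collapsed model produces.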

This phenomenon has the potential to create data pollution on a large scale. Although generative AI enables more efficient text generation than ever seen before, model collapse implies that none of this data will be valuable to train the next generation of AI models.

Sabrina Polin is a managing editor of video content for the Learning Content team. She plans and develops video content for TechTarget's editorial YouTube channel, Eye on Tech. Previously, Sabrina was a reporter for the Products Content team.