Is AI-generated content copyrighted?

The current legal landscape surrounding AI-generated content is murky and fraught with confusion. Here's where it currently stands and what to know before publishing.

Copyright law, as it currently stands, was written at a time when humans were directly involved in fair use practices, such as citing sources and creating derivative works, which are works based on other copyrighted works. However, various generative AI tools can create text, images, songs, videos and other content. New AI models can scan copyright-protected content at scale to distill an image's style, a novel's plot or a program's logic. Once trained on protected content, these AI models can generate new content different enough from the original that some might consider it fair use.

Yet, users of popular GenAI platforms can't determine how these services were trained.

"The problem with AI-generated content is that users don't know exactly where the AI is sourcing things from and which parts of the content it creates are from scratch or just pulled from another piece of copyrighted content, or even another person's AI-generated artwork on the platform," said Nizel Adams, CEO and principal engineer at AI consultancy Nizel Corp.

If an organization wants to incorporate AI-generated content into its marketing or content strategies, it must consider the following two questions to avoid copyright infringement.

Can AI-generated content be copyrighted?

If users of GenAI platforms don't know what content the tools were trained on, then they might hesitate to adopt these platforms in case the content the tools generate reproduces copyrighted material and doesn't fall within fair use.

The U.S. Copyright Office has provided some guidance on this topic, but the overarching answer is that AI-generated content can sometimes be copyrighted, according to David Siegel, partner at Grellas Shah LLP. Siegel said he expects more guidance as the courts tackle the issue.

Thus far, the Copyright Office, in line with existing case law, has explained that for a work to be afforded copyright protection in the U.S., it must have a human author. Yet, Siegel said he is not sure what that means in the world of AI.

"If the only human involvement is the input of a chat prompt into ChatGPT, for example, one cannot obtain copyright protection for the raw result of that prompt," he said.

On the other hand, if a user inputs a prompt into an AI tool, gets a response and then modifies the result in creative ways, that can potentially result in content afforded copyright protection. However, only human-authored parts of the work can be copyrighted.

In other words, AI can be a tool authors use to generate materials and create copyrighted works. "But that is a far cry from what most people think about when considering whether AI-generated content can be copyrighted, which is traditionally focused on copyrighting raw outputs," Siegel said.

What is considered copyright infringement for AI-generated content?

Users might wonder if AI-generated content from models trained on protected intellectual property is considered copyright infringement. This question has a murkier answer and is the subject of numerous lawsuits on images, songs and books. For example, in early 2023, TikTok, Spotify and YouTube removed an AI-generated song that mimicked the voices of rapper Drake and R&B artist The Weeknd. However, the implications for AI-generated content that resembles its source material less closely than this example are unclear.

A copyright infringement inquiry begins with the long-established test of access and substantial similarity, according to William Scott Goldman, managing attorney and founder at Goldman Law Group. Overall, this means the plaintiff would have to prove that the AI or a human had access to the content and that the resulting work is similar enough to convince a jury it was copied.

"Although there is no established case law for generative AI just yet, I believe without clear-cut proof of access, such infringement claims will fail unless the copying in question is deemed identical to the original," Goldman said.

However, Goldman also said he believes copyright owners and plaintiffs could assert unauthorized use, especially if this use is not considered de minimis -- too small to be considered meaningful -- and the resulting work is substantially similar to the original.

Once both issues have been demonstrated, the case would turn on a fair use defense. GenAI output could be considered a derivative work under existing copyright law if it contains sufficient original authorship, Goldman said.

Courts now grapple with whether AI-generated content differs enough from the originals under existing fair use precedents.

Lawsuits over AI-generated content

Various lawsuits regarding AI-generated content and GenAI tools have started to make their way through the courts -- both related and unrelated to copyright.

In November 2022, programmers filed a class-action lawsuit against GitHub, Microsoft and OpenAI focusing on breach of contract and privacy claims. In January 2023, the same law firm also filed a class-action lawsuit related to AI-generated image services, such as Stability AI's Stable Diffusion, Midjourney and DreamUp, which raises copyright infringement issues.

A few days later, Getty Images also filed a lawsuit relating to Stable Diffusion, arguing the service had "copied more than 12 million photographs from Getty Images' collection, along with the associated captions and metadata, without permission from or compensation to Getty Images," according to the lawsuit. In July 2023, Sarah Silverman and other authors sued OpenAI and Meta, claiming the GenAI training process infringed on the copyright protection of their works.

These lawsuits differ in important ways, according to Siegel. Stability AI allegedly uses images from the web to train its models. As a result, Getty's customers arguably have less need to license more images from Getty. This gets at the heart of copyright law, which is to incentivize people to develop creative works. Photographers and artists might be less willing to spend time and resources developing photos and images if those are used to train AI to replace them.

"If you are a photographer, would you be willing to spend your time and resources creating photos if those photos were going to be used to train an AI model, without compensation or permission, and potential licensees of your images could simply go to the AI model instead? Doubtful," Siegel said.

The Silverman case against OpenAI and Meta centers on the models' ability to provide summaries of books without the authors' permission to create derivative works. Siegel said this case differs from the Getty Images one because the use is similar to CliffsNotes, which is considered fine because people can still buy the book to get the full story.

To use or not to use?

In his May 2024 article "AI lawsuits explained: Who's getting sued?," TechTarget Site Editor Ben Lutkevich noted that settling cases involving the use of AI will help answer the following questions:

  • Do you need a license to train a GenAI model on copyrighted material?
  • Does AI-generated content infringe on the copyright of the materials used to train the model?
  • Does GenAI violate controls regarding removal or alteration of copyright information?
  • Is work generated by AI in the style of someone else a violation of that person's rights?
  • What are the ramifications of using open source licenses to train AI models?

Read Lutkevich's full article about copyright lawsuits brought against GenAI companies.

How will AI change copyright laws?

The short-term future of AI-generated content copyright will likely aim to clarify existing concepts, such as fair use and authorship for the AI age. Most AI companies rely on fair use to justify how they train their models, but it presents a gray area, Siegel said.

"To the extent the U.S. wants to foster the development of AI businesses, the laws around the use of copyrighted works in training AI models need to be sufficiently clear that even an early-stage startup can predictably determine whether their business model will run afoul of copyright laws. We are not even close to that point," Siegel said.

Overall, AI is changing copyright law. It is causing the legal system to define what constitutes authorship and how to protect human-generated content even if it contains AI-generated content, said Robert Scott, managing partner at Scott & Scott LLP.

In the U.S., the Copyright Office guidance states that works containing AI-generated content are not copyrightable without evidence that a human author contributed creatively. New laws can help clarify the level of human contribution needed to protect works containing AI-generated content.

Editor's note: This article was updated in June 2024 to provide updated information regarding current and proposed copyright legislation, and to improve the reader experience.

George Lawton is a journalist based in London. Over the last 30 years, he has written more than 3,000 stories about computers, communications, knowledge management, business, health and other areas that interest him.
