- Critics argue that developers of generative AI systems such as ChatGPT and DALL-E have unfairly trained their models on copyrighted works.
- Daniel Castro, director of the Center for Data Innovation, says those concerns are misguided. Moreover, restricting AI systems from training on legally accessed data would significantly curtail the development and adoption of generative AI across many sectors.
- Nonetheless, there are legitimate IP issues for policymakers to consider, and current copyright laws should be enforced and strengthened to protect rightsholders.
READ MORE: Critics of Generative AI Are Worrying About the Wrong IP Issues (Center for Data Innovation)
Anyone suggesting generative AI systems are unfairly exploiting the works of creators is wrong, says Daniel Castro, director of the Center for Data Innovation.
He argues that generative AI systems should not be exempt from complying with intellectual property (IP) laws, but neither should they be held to a higher standard than human creators.
Castro’s report refutes the arguments made about how generative AI is unfair to creators and also acknowledges that there are legitimate IP rights at stake.
Training AI Models
The biggest debate when it comes to copyright is whether generative AI systems should be allowed to train their models on text, audio, images, and videos that are legally accessible to Internet users but are also protected by copyright.
Some creators argue that it is unfair for developers to train their AI systems on content they have posted on the Internet without their consent, credit, or compensation.
Castro says that people do not have the right to use copyrighted content any way they want just because they can legally access it on the Internet. However, that does not mean they cannot do anything with such content. For example, search engines can legally crawl websites without violating copyright laws.
“While it will ultimately be up to the courts to decide whether a particular use of generative AI infringes on copyright, there is precedent for them to find most uses to be lawful and not in violation of rightsholders’ exclusive rights.”
Is training AI systems on copyrighted content just theft? Online piracy is clearly theft, says Castro, but seeking inspiration and learning from others is not.
“In fact, all creative works are shaped by past works, as creators do not exist in a vacuum. Calling this process theft is clearly inaccurate when applied to the way humans observe and learn, and it is equally inaccurate to describe training a generative AI system.”
Is it wrong to train AI systems on copyrighted content without first obtaining affirmative consent from the copyright holder?
According to Castro, copyright owners have the right to decide whether to display or perform their works publicly. But if they choose to display their work in public, others can use their works in certain ways without their permission. For example, photographers can take pictures of sculptures or graffiti in public places even when those works are protected by copyright.
“There is no intrinsic rationale for why users of generative AI systems would need to obtain permission to train on copyrighted content they have legal access to,” he says. “Learning from legally accessed works does not violate a copyright owner’s exclusive reproduction and distribution rights. Unless human creators will be required to obtain permission before they can study another person’s work, this requirement should not be applied to AI.”
Critics of generative AI are also likely to overestimate individual contributions. According to figures in the report, Stable Diffusion trained on a dataset of 600 million images. In a sample of 12 million of the most “aesthetically attractive” images (which presumably skew more toward works of art than other random images from the Internet), the most popular artist (Thomas Kinkade) appeared 9,268 times. Put differently, the most popular artist in the dataset likely represented only about 0.0015% of all images in it.
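The 0.0015% figure follows from simple arithmetic on the report’s numbers (9,268 appearances out of 600 million training images); a quick sanity check:

```python
# Sanity check on the report's figures: the most popular artist
# (Thomas Kinkade) appeared 9,268 times in Stable Diffusion's
# training set of roughly 600 million images.
kinkade_images = 9_268
total_images = 600_000_000

share_pct = kinkade_images / total_images * 100
print(f"{share_pct:.4f}%")  # prints "0.0015%"
```

Even the single most-represented artist accounts for a vanishingly small share of the training data, which is the point Castro draws from these numbers.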
Or consider LaMDA, a large language model created by Google, which was trained on 1.56 trillion words scraped from the Internet.
“Given the size of these models, the contribution of any single person is minuscule,” Castro concludes.
Critics also contend that generative AI systems should not be able to produce content that mimics a particular artist’s distinctive visual style without their permission. “However, once again, such a demand would require holding AI systems to a different standard than humans,” fires back Castro. “Artists can create an image in the style of another artist because copyright does not give someone exclusive rights to a style. For example, numerous artists sell Pixar-style cartoon portraits of individuals.”
And it is perfectly legal to commission someone to write an original poem in the style of Dr. Seuss or an original song in the style of Louis Armstrong. Users of generative AI systems should retain the same freedom, he says.
Legitimate IP Issues of Concern
Nonetheless, there are legitimate IP issues for policymakers to consider. Castro dives into them.
Individuals who use AI to create content deserve copyright protection for their works. The US Copyright Office has developed initial guidance for registering works created using AI tools. The Copyright Office should not grant copyright to an AI system itself, or for works in which there is no substantial human input.
He argues that copyright protection for AI-generated content should function similarly to that of photographs wherein a machine (such as a camera) does much of the mechanical work in producing the initial image, but it is a variety of decisions by the human photographer (subject, composition, lighting, post-production edits, etc.) that shape the final result.
Likewise, individuals who use AI tools to create content do more than just click a button, such as experimenting with different prompts, making multiple variations, and editing and combining final works.
Just as it is illegal for artists to misrepresent their works as those of someone else, so too is it unlawful to use generative AI to misrepresent content as having been created by another artist.
“For example, someone might enjoy creating drawings of their own original characters in the style of Bill Watterson, the cartoonist behind the popular Calvin and Hobbes comic strip, but they cannot misrepresent those drawings as having been created by Watterson himself.
“Artists can and should continue to enforce their rights in court when someone produces nearly identical work that unlawfully infringes on their copyright, whether that work was created entirely by human hands or involved the use of generative AI.”
Generative AI has not changed the fact that individuals should continue to enforce their publicity rights by bringing cases against those who violate them.
This right is especially important for celebrities, as it enables them to control how others use their likeness commercially, such as in ads or in film and TV.
Castro says, “While deepfake technology makes it easier to create content that impersonates someone else, the underlying problem itself is not new. Courts have repeatedly upheld this right, including for cases involving indirect uses of an individual’s identity.”
Generative AI also raises questions about who owns rights to certain character elements. For example, if a movie studio wants to create a sequel to a film, can it use generative AI to digitally recreate a character (including the voice and image) or does the actor own those rights? And does it matter how the film will depict the character, including whether the character might engage in activities or dialogue that could reflect negatively on the actor?
Castro thinks these types of questions will likely be settled through the contracts performers sign addressing who has rights to a performer’s image, voice, and more.
Castro finds that while there are many important considerations for how generative AI impacts IP rights and how policymakers can protect rightsholders, critics are wrong to claim that such models should not be allowed to train on legally accessed copyrighted content.
Moreover, restricting generative AI models from training on lawfully accessed content could unnecessarily limit their development.
“Instead, policymakers should offer guidance and clarity for those using these tools, focus on robust IP rights enforcement, create new legislation to combat online piracy, and expand laws to protect individuals from impersonation.”
Even with AI-powered text-to-image tools like DALL-E 2, Midjourney and Craiyon still in their relative infancy, artificial intelligence and machine learning are already transforming the definition of art — including cinema — in ways no one could have predicted. Gain insights into AI’s potential impact on Media & Entertainment in NAB Amplify’s ongoing series of articles examining the latest trends and developments in AI art:
- What Will DALL-E Mean for the Future of Creativity?
- Recognizing Ourselves in AI-Generated Art
- Are AI Art Models for Creativity or Commerce?
- In an AI-Generated World, How Do We Determine the Value of Art?