Amrusha Chati
13 July 2023 • 6.8 min read
Stock photo provider Getty Images has sued artificial intelligence company Stability AI Inc in the UK and the US, accusing it of illegally using over 12 million copyrighted photos to train its popular Stable Diffusion AI image-generation system.
This is the latest in a string of high-profile cases, from comic books to music, involving generative AI technology like ChatGPT.
But there's one key difference. Others are fighting over the final output, but Getty Images vs. Stability AI is about what lies beneath generative AI technology.
Where does AI get its input from? Do copyright laws apply to these sources? Should AI companies be accountable for the vast amounts of data used to train AI systems?
We take a deep dive into the world of AI to untangle these questions.
Generative Artificial Intelligence (AI) refers to algorithms that can create new content such as audio, video, code, text, images, and simulations.
This forms the basis of large language models (LLMs) like ChatGPT. Such LLMs are fed huge datasets to "train" them to mimic human cognition, language, and creativity.
These datasets come from sources such as The Pile.
The Pile is an open-source dataset that stores vast amounts of data to train AI systems. It contains text from more than 190k pirated books, including the Harry Potter and Game of Thrones series.
Similarly, OpenAI's GPT-3 was trained on 45 terabytes of text data. That's approximately 90 million novels.
Most of this material is copyrighted and "scraped" from the internet without licenses or permissions.
And this has made Getty Images very, very angry.
Getty Images is a US-based global digital media provider. It produces, licenses, and sells royalty-free images, stock images, online music, and videos.
Stability AI is one of the world's leading open-source generative AI companies. Its marquee products are Stable Diffusion and DreamStudio.
Getty Images filed a lawsuit with the Delaware federal court in March 2023, protesting:
“Stability AI's brazen infringement of Getty Images' intellectual property on a staggering scale.”
And that may not be hyperbole. Getty Images claims Stability AI copied over 12 million photographs from its collection, along with their captions and metadata, all without permission or compensation. Stability AI is also accused of providing false copyright management information.
Getty Images says this content was used to train Stable Diffusion. The AI model relies on millions of images "scraped" from the internet on a massive scale. It uses these to generate computer-synthesized images from a given text input or prompt.
It's not the underlying innovation that Getty has a problem with. The company itself licenses its images and associated metadata to others for developing AI and machine learning tools.
But Getty Images says Stability AI did not legally license its images, leading to the copyright infringement claim.
Getty also alleges trademark infringement. It says Stable Diffusion's output often includes a modified version of the Getty Images watermark. This hurts Getty's reputation as the output “ranges from the bizarre to the grotesque.”
Getty Images and Stability AI declined to comment as the case is currently pending.
Stability AI's defense in this lawsuit will probably rely on the US fair use doctrine, which allows the unlicensed use of copyrighted work for purposes such as news reporting, teaching, or research.
Stability AI has previously said that “training these models is an acceptable and transformative use of content protected by fair use.”
But the recent Supreme Court ruling in the Andy Warhol copyright infringement case will be a significant precedent for this case.
This doesn’t bode well for Stability AI. SCOTUS ruled against the Andy Warhol Foundation after much debate around "fair use" and “transformative use.”
The landmark decision was hailed for providing much-needed clarity about copyright infringement. So proving "transformative use" and claiming "fair use" will be much harder for Stability AI in the wake of the Warhol ruling.
Intellectual property law is in uncharted waters when it comes to AI. Policymakers worldwide need help to define and regulate generative AI and its output.
The US House of Representatives is trying to assess the effects and dangers of AI on copyright law. In a hearing earlier this month, members of Congress quizzed a panel of AI experts about this.
Congressman Darrell Issa said that lawmakers need to strike a balance. They need to protect human creators from AI copyright infringement while promoting technological innovation.
"We must first and foremost address properly the concerns surrounding unauthorized use of copyrighted material while also recognizing that the potential of generative AI can only be achieved with massive amounts of data, far more than is available outside of copyright," Issa said.
Sy Damle (former general counsel, US Copyright Office) argued that most existing generative AI models come under the fair use doctrine.
“Replacing fair use with a licensing regime would stifle AI development and pose a difficult enforcement challenge.”
The Getty Images lawsuit will set an important precedent in this area. Focusing on the input rather than the output of AI tools has added another layer to the ongoing debate around AI and intellectual property.
Whichever side wins, this case will be an important step forward. It'll give us more clarity on where the line between copyright infringement and AI innovation should be drawn.
Amrusha Chati
AUTHOR
Amrusha is a versatile professional with over 12 years of experience in journalism, broadcast news production, and media consulting. Her impressive career includes collaborating extensively with prominent global enterprises. She garnered recognition for her exceptional work in producing acclaimed shows for Bloomberg, a renowned business news network. Notably, these shows have been incorporated into the esteemed curriculum of Harvard Business School. Amrusha's expertise also encompassed a 4-year tenure as a consultant at Omidyar Network, a leading global impact investing firm. In addition, she played a pivotal role in the launch and content strategy management of the startup Live History India.