Adobe Under Fire: Class-Action Lawsuit Alleges Misuse of Authors' Work for AI Training

Adobe is facing a proposed class-action lawsuit alleging that it used pirated books, including copyrighted works by author Elizabeth Lyon, to train its artificial intelligence model, SlimLM. The lawsuit, filed on behalf of Lyon, claims that SlimLM, Adobe's small language model designed for document assistance tasks on mobile devices, was pretrained on SlimPajama-627B. This dataset, described by Adobe as a "deduplicated, multi-corpora, open-source dataset," was released by Cerebras in June 2023.
According to Lyon, who specializes in non-fiction writing guidebooks, some of her copyrighted works were incorporated into a pretraining dataset used by Adobe. The lawsuit, initially reported by Reuters, asserts that Lyon's writing was part of a processed subset of a manipulated dataset that formed the foundation of Adobe's program. Specifically, it states, "The SlimPajama dataset was created by copying and manipulating the RedPajama dataset (including copying Books3). Thus, because it is a derivative copy of the RedPajama dataset, SlimPajama contains the Books3 dataset, including the copyrighted works of Plaintiff and the Class members."
"Books3," a collection of 191,000 books, has become a recurring point of legal contention within the tech community due to its alleged use in training generative AI systems. The RedPajama dataset has likewise been named in multiple lawsuits. The case against Adobe is part of a growing wave of copyright infringement suits targeting the tech industry's use of massive datasets for AI training, many of which allegedly contain pirated materials.
The issue of copyrighted content in AI training data has led to numerous legal battles. For instance, in September, Apple faced a lawsuit claiming it used copyrighted material to train its Apple Intelligence model, specifically mentioning the RedPajama dataset and accusing the company of copying protected works without consent or compensation. A similar lawsuit was filed against Salesforce in October, also citing the use of RedPajama for training purposes. These cases highlight a pervasive challenge for the tech industry, as AI algorithms rely on extensive datasets, and the provenance of some of these materials is increasingly being scrutinized.
A notable earlier case came in September, when Anthropic agreed to a $1.5 billion settlement with several authors who had accused the company of using pirated versions of their work to train its chatbot, Claude. The settlement was widely regarded as a significant moment in the ongoing legal dispute over copyrighted material in AI training data, underscoring the legal and ethical complexities of developing and deploying advanced AI technologies.