Adobe Under Fire: Class-Action Lawsuit Alleges Misuse of Authors' Work for AI Training

Adobe, a prominent technology company, is facing a proposed class-action lawsuit alleging that it utilized pirated books, including copyrighted works by author Elizabeth Lyon, to train its artificial intelligence model, SlimLM. The lawsuit, filed on behalf of Lyon, claims that Adobe's small language model, designed for document assistance tasks on mobile devices, was pre-trained on SlimPajama-627B. This dataset, described by Adobe as a "deduplicated, multi-corpora, open-source dataset," was released by Cerebras in June 2023.
According to Lyon, who specializes in non-fiction writing guidebooks, some of her copyrighted works were incorporated into a pretraining dataset used by Adobe. The lawsuit, initially reported by Reuters, asserts that Lyon's writing was part of a processed subset of a manipulated dataset that formed the foundation of Adobe's program. Specifically, it states, "The SlimPajama dataset was created by copying and manipulating the RedPajama dataset (including copying Books3). Thus, because it is a derivative copy of the RedPajama dataset, SlimPajama contains the Books3 dataset, including the copyrighted works of Plaintiff and the Class members."
"Books3," a collection of 191,000 books, has become a recurring point of legal contention within the tech community due to its alleged use in training generative AI systems. The RedPajama dataset has likewise been named in multiple lawsuits. The case against Adobe is part of a growing wave of copyright infringement suits targeting the tech industry's use of massive datasets for AI training, many of which allegedly contain pirated materials.
The issue of copyrighted content in AI training data has led to numerous legal battles. For instance, in September, Apple faced a lawsuit claiming it used copyrighted material to train its Apple Intelligence model, specifically mentioning the RedPajama dataset and accusing the company of copying protected works without consent or compensation. A similar lawsuit was filed against Salesforce in October, also citing the use of RedPajama for training purposes. These cases highlight a pervasive challenge for the tech industry, as AI algorithms rely on extensive datasets, and the provenance of some of these materials is increasingly being scrutinized.
In a notable related case, Anthropic agreed in September to a $1.5 billion settlement with several authors who had accused the company of using pirated versions of their work to train its chatbot, Claude. The settlement was widely regarded as a significant development in the ongoing legal disputes over copyrighted material in AI training data, and it underscored the legal and ethical complexities of building and deploying advanced AI systems.