Reddit Sues Anthropic Over Alleged User Data Scraping for AI Training

Published 1 week ago • 4 minute read

Social media platform Reddit sued artificial intelligence startup Anthropic on Wednesday, June 4, 2025, filing a 42-page complaint in California Superior Court in San Francisco, where both companies are headquartered. Reddit accuses Anthropic, described in the complaint as a $61.5 billion startup, of unlawfully scraping user-generated content from its site to train its AI models, including the Claude chatbot, without permission. According to the complaint, using the site's data for commercial purposes violates Reddit's user agreement. Reddit also alleges that Anthropic intentionally trained its models on the personal data of Reddit users without ever requesting their consent.

The lawsuit contends that Anthropic's actions constitute a breach of Reddit's terms of use and amount to unfair competition. Notably, this legal challenge is distinct from many other lawsuits targeting AI companies as it does not primarily allege copyright infringement. Reddit is seeking damages and has requested a jury trial. According to TechCrunch, this lawsuit represents the first instance of a major tech company legally challenging an AI startup over the materials used for training AI models.

Ben Lee, Reddit's Chief Legal Officer, stated, "We will not tolerate profit-seeking entities like Anthropic commercially exploiting Reddit content for billions of dollars without any return for redditors or respect for their privacy." He further emphasized, "AI companies should not be allowed to scrape information and content from people without clear limitations on how they can use that data." Reddit highlighted that it has established formal licensing agreements with other AI companies, such as a $60 million deal with Google in February 2024 for training its Gemini AI, and a similar contract with OpenAI in May 2024 for refining ChatGPT. These agreements, Reddit states, permit the use of public Reddit content "but only after agreeing to Reddit's licensing terms," which include provisions to protect user privacy, ensure the right to delete content, and prevent users from being spammed. These licensing deals also played a role in helping the 20-year-old online platform raise funds before its Initial Public Offering (IPO) in March 2024, after which Reddit was valued at over $21 billion.

Anthropic, formed in 2021 by former OpenAI executives, disputes the allegations. "We disagree with Reddit's claims and will defend ourselves vigorously," an Anthropic spokesperson said. The company's flagship product, the Claude chatbot, is a key competitor to OpenAI's ChatGPT, and Anthropic's primary commercial partner is Amazon, which uses Claude to enhance its Alexa voice assistant.

The dispute is not new. In July 2024, Reddit CEO Steve Huffman publicly called out Anthropic, along with Microsoft and Perplexity, for scraping Reddit's site for training data without authorization. At that time, an Anthropic spokesperson reportedly assured Reddit that such activities had ceased. However, Reddit's complaint alleges that Anthropic's bots have since crawled its site more than 100,000 times.

Like many AI firms, Anthropic has relied on extensive online resources, such as Wikipedia and Reddit, which offer deep troves of written material crucial for teaching AI assistants the patterns of human language. A 2021 paper co-authored by Anthropic CEO Dario Amodei, cited in the lawsuit, identified specific subreddits (subject-matter forums on Reddit) that contained high-quality AI training data, including those focused on gardening, history, relationship advice, and even shower thoughts. In a 2023 letter to the U.S. Copyright Office, Anthropic argued that "the way Claude was trained qualifies as a quintessentially lawful use of materials," by making copies of information for statistical analysis of large datasets.

This lawsuit from Reddit adds to Anthropic's legal challenges, as the AI company is already battling a lawsuit from major music publishers who allege that Claude improperly reproduces copyrighted song lyrics. Reddit, which boasts over 100 million daily active users across hundreds of thousands of subreddit communities, is pursuing this case to protect its platform and user data from unauthorized commercial exploitation.

From Zeal News Studio