Reddit sues Perplexity and three other companies for allegedly using its content without paying


Reddit is suing SerpApi, Oxylabs, AWMProxy and Perplexity for allegedly scraping its data from search results and using it without a license, The New York Times reports. The new lawsuit follows legal action against AI startup Anthropic, which allegedly used Reddit content to train its Claude chatbot.

Since 2023, Reddit has charged companies looking to access posts and other content, in the hopes of making money on data that could be used for AI training. The company has also signed licensing deals with companies like Google and OpenAI, and even built an AI answer machine of its own to leverage the knowledge in users’ posts. Scraping search results for Reddit content avoids those payments, which is why the company is seeking financial damages and a permanent injunction that prevents companies from selling previously scraped Reddit material.

Some of the companies Reddit is focused on, like SerpApi, Oxylabs and AWMProxy, are not exactly household names, but they’ve all made collecting data from search results and selling it a key part of their business. Perplexity’s inclusion in the lawsuit might be more obvious. The AI company needs data to train its models, and has already been caught seemingly copying and regurgitating material it hasn’t paid to license. That also includes reportedly ignoring the robots.txt protocol, a way for websites to communicate that they don’t want their material scraped.
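For readers unfamiliar with robots.txt, here is a minimal sketch of how a well-behaved crawler might check it before fetching a page, using Python's standard `urllib.robotparser` module. The rules, the bot name `ExampleBot` and the `example.com` URL are made up for illustration; they are not Reddit's actual robots.txt or any company's real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named bot entirely, allow everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/test/comments/123"
    print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "SomeOtherBot", url))
```

The protocol is voluntary: nothing in the file technically stops a crawler that chooses not to run a check like this.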

Per a copy of the lawsuit provided to Engadget, Reddit had already sent a cease-and-desist to Perplexity asking it to stop scraping posts without a license. Perplexity claimed it didn’t use Reddit data, but it continued to cite the platform in answers from its chatbot. Reddit says it was able to prove Perplexity was using scraped Reddit content by creating a “test post” that “could only be crawled by Google’s search engine and was not otherwise accessible anywhere on the internet.” Within a few hours, queries made to Perplexity’s answer engine were able to reproduce the content of the post.

“The only way that Perplexity could have obtained that Reddit content and then used it in its ‘answer engine’ is if it and/or its co-defendants scraped Google [search results] for that Reddit content and Perplexity then quickly incorporated that data into its answer engine,” the lawsuit claims.

When asked to comment, Perplexity provided the following statement:

Perplexity has not yet received the lawsuit, but we will always fight vigorously for users’ rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest.

This new lawsuit fits with the aggressive stance Reddit has taken towards protecting its data, including rate-limiting unknown bots and web crawlers in 2024, and even limiting what access the Internet Archive’s Wayback Machine has to its site in August 2025. The company has also sought to define new terms around how websites are crawled by adopting the Really Simple Licensing standard, which adds licensing terms to robots.txt.


