Meta's AI Read Pirated Books. The Copyright Question Looms.

Key Takeaways

A San Francisco court is examining whether Meta’s use of copyrighted books to train its AI model Llama counts as “fair use.”
Authors, including Sarah Silverman, argue Meta knowingly used pirated books, constituting copyright infringement.
Meta claims its AI training is a transformative use of the material, protected under the fair use doctrine, regardless of how the books were acquired.
This case is one of many similar lawsuits against major AI companies like OpenAI and Google.
The judge’s decision could significantly influence the future of AI development and copyright law, and could cost Meta billions if the court rules against it.

A crucial legal battle unfolds in a San Francisco courtroom this Thursday, pitting Meta Platforms Inc. against a group of authors, including comedian Sarah Silverman.

The core issue is whether Meta’s use of allegedly pirated books to train its Llama artificial intelligence model constitutes fair use under copyright law. Judge Vince Chhabria will hear arguments and may decide the matter himself or determine whether it should go before a jury.

This lawsuit, which also counts writers like Ta-Nehisi Coates among its plaintiffs, is part of a growing number of legal challenges targeting major AI developers, such as OpenAI and Google, according to Bloomberg Law.

These cases are testing the legal boundaries of training AI models, which often involves processing vast amounts of data, including copyrighted works. The outcome could reshape the AI industry’s billion-dollar business model, which often assumes training with copyrighted content is permissible.

An unfavorable ruling for Meta could lead to substantial financial damages. Experts note this decision will offer insight into how courts view fair use in the age of AI and could influence other pending cases.

The authors contend that Meta intentionally downloaded huge numbers of books from known online “shadow libraries” like Library Genesis, engaging in direct copyright infringement.

Meta counters that its actions fall under the fair use exception. It argues that the books were used to create something entirely new and different – an AI tool, not a replacement for reading books. Meta emphasizes in court filings that Llama is fundamentally different from a book.

Past court decisions on technology and copyright offer some guidance, but generative AI presents unique challenges that haven’t been directly addressed by courts until now.

A key point for the authors is evidence suggesting Meta disregarded copyright concerns to quickly source training data from these unauthorized libraries, allegedly even distributing the pirated material through torrenting.

Meta disputes parts of this narrative, but primarily argues that the *use* of the material for training qualifies as fair use, no matter how it was obtained.

Some legal observers believe focusing on the source of the books side-steps the main fair use question. Others argue Meta’s alleged choice to use pirated sources demonstrates bad faith that the court cannot ignore.

How the court weighs the way Meta acquired the training materials against the fair use arguments will be critical, potentially setting a significant precedent for the AI industry.
