
Apple named in AI lawsuit over data set it says doesn’t power Apple Intelligence

Apple is named in a new AI lawsuit by publisher Chicken Soup for the Soul, reports Reuters. However, the lawsuit points to a data set that Apple has already said doesn’t power Apple Intelligence.

Per the Reuters report:

Book publisher Chicken Soup for the Soul sued several Big Tech companies in California federal court late Tuesday for allegedly misusing its content to train their artificial intelligence systems.

The publisher said that Apple (AAPL.O), Google (GOOGL.O), Nvidia (NVDA.O), Meta Platforms (META.O), OpenAI, Anthropic, Perplexity AI and Elon Musk’s xAI used pirated copies of its books to teach their chatbots to respond to human prompts.

The lawsuit, which you can read in full here, accuses Apple of using books to train its AI technology:

This case concerns a straightforward and deliberate act of theft that constitutes copyright infringement. Anthropic, Google, OpenAI, Meta, xAI, Apple, Perplexity, and NVIDIA, illegally copied vast quantities of copyrighted books without permission and then used those stolen copies to build and train their commercial large language models (“LLMs”) and/or optimize their product. Defendants helped themselves to the copyrighted works of thousands of authors—including bestselling writers, Pulitzer Prize-winning journalists, and creators of widely read nonfiction and fiction.

Later in the filing, the lawsuit alleges that The Pile was used to train Apple Foundation Models:

Rather than obtain licenses or pay for the use of these works, each Defendant downloaded pirated copies of Plaintiff’s books from shadow-library websites such as The Pile, LibGen, Z-Library, and Anna’s Archive and then reproduced, parsed, analyzed, re-copied, used, and embedded those works into their LLMs (and/or used those works to optimize their product) to accelerate commercial development and win the generative-AI race. The Copyright Act prohibits exactly this conduct. […]

“Apple Foundation Models” relied upon The Pile and Books 3.

If The Pile rings a bell, that’s likely because it surfaced in a different AI training accusation in 2024, one involving YouTube videos.

At the time, however, Apple said that the dataset in question was only used for research purposes and not actually used in any models that powered Apple Intelligence or machine learning features.

Will that make a difference in this legal case? It will certainly be relevant, but we’ll have to see what happens in court to know whether it’s a distinction without a difference.




Author: Zac Hall

Zac covers Apple news, hosts the 9to5Mac Happy Hour podcast, and created SpaceExplored.com.