Adobe Faces Class-Action Lawsuit Over Allegations of Improper Use of Authors’ Work for AI Training

Like many tech companies, Adobe has pivoted sharply toward artificial intelligence in recent years, rolling out a suite of AI-driven services since 2023, including Firefly, its media-generation platform. That embrace of generative AI may now have landed the company in hot water: Adobe faces a lawsuit alleging that it used pirated literary works to train one of its AI models.

The Lawsuit Against Adobe

Elizabeth Lyon, an author from Oregon, has filed a proposed class-action lawsuit claiming that Adobe used unauthorized copies of several books, including her own, to train its SlimLM models.

Understanding SlimLM

Adobe describes SlimLM as a series of compact language models optimized for document-assistance tasks on mobile devices. The models were reportedly pre-trained on SlimPajama-627B, a dataset Adobe characterizes as “deduplicated, multi-corpora, open-source” and which Cerebras released in June 2023. Lyon contends that her works were included in that pre-training dataset.

The Details of the Allegations

Lyon’s lawsuit, first reported by Reuters, states that her writings appeared in a processed subset of the dataset underlying Adobe’s model. The complaint asserts that “SlimPajama was created by copying and manipulating the RedPajama dataset,” which in turn incorporated Books3, a collection of roughly 191,000 books that has been at the center of several copyright controversies in the tech industry.

The Broader Context

The Books3 dataset has surfaced repeatedly in legal cases against major tech companies. A lawsuit against Apple, for instance, claimed the company improperly used copyrighted material to train its Apple Intelligence models, and Salesforce has faced allegations over its use of RedPajama.

Such lawsuits have become increasingly common as the tech industry grapples with questions about where training data comes from: AI models require vast datasets, and some have reportedly included pirated content. In September, Anthropic agreed to pay $1.5 billion to settle claims from authors who accused it of using pirated copies of their works to train its chatbot, Claude, a settlement that could mark a turning point in the legal debate over copyright and AI training data.

Conclusion

The dynamic world of artificial intelligence is undoubtedly exciting, but it also raises pressing ethical questions about the sources of training data. As Adobe and other tech companies navigate this landscape, the outcomes of lawsuits like this one will likely shape the future relationship between AI and intellectual property rights. If you’re in the tech or literary field, or simply curious about these developments, stay informed and engaged. Together, we can foster a more ethical approach to innovation.
