Adobe has built part of its artificial intelligence strategy on a very recognizable banner: protecting creators in a time of profound change. While other technology companies drew criticism over the origin of their training data, the company presented itself as a responsible alternative. That position now faces a lawsuit that focuses on the training of one of its models and the use of copyrighted works. The case is not an anomaly, but rather a reflection of a question that the industry has not yet been able to answer clearly.

The Lawsuit Against Adobe

The lawsuit was filed in the U.S. District Court for the Northern District of California and takes the form of a proposed class action. An author named Elizabeth Lyon accuses Adobe of using copyrighted books, including her own, without permission to train the company’s AI models, with SlimLM at the center of the case. According to court filings, these works became part of the training process for systems designed to respond to human instructions. Lyon says she is acting on behalf of other rights holders who find themselves in a similar situation.

The Great Debate About Data That Trains AI

To understand why this type of litigation is increasingly common, it is essential to look at how modern artificial intelligence operates. Beyond the visible applications, such as chatbots and image generators, there are underlying models that serve as the core of the system, learning from huge volumes of data. Generally speaking, more data can improve performance; however, the issue arises around the origin of that information and the conditions under which it has been used.

Understanding SlimLM

The model mentioned in the lawsuit is not Firefly, Adobe’s well-known creative system, but rather SlimLM, a family of smaller language models designed for specific tasks. These models assist users with document-related functions, particularly on mobile devices. This distinction is significant as it indicates that the debate over training data extends beyond the most prominent applications.

Implications of Training Data

The lawsuit argues that the conflict lies not in SlimLM as a product, but in the data used during its training phase. Adobe has stated that these models were pre-trained with SlimPajama-627B, an open-source dataset released by Cerebras in June 2023. However, SlimPajama derives from RedPajama, another dataset that reportedly incorporates Books3, a large collection of copyrighted books. According to the plaintiff, this chain of data sourcing led to the inclusion of works without proper authorization.

Separating the Narratives

Until now, Adobe’s public narrative regarding artificial intelligence has been largely framed around Firefly, which is closely associated with creator rights and licensed content. Adobe asserts that the Firefly models were trained with content that is licensed or in the public domain, accompanied by compensation initiatives for contributors. However, this lawsuit targets SlimLM, a more discreet model that operates in the background and lacks a direct commercial presence. This separation is crucial for understanding the broader implications of the case.

The Broader Legal Landscape

The legal challenge against Adobe forms part of a larger trend in the United States, where authors and rights holders are increasingly taking tech companies to court over the unauthorized use of their works for AI model training. Numerous lawsuits against companies such as OpenAI and Anthropic have surfaced, with varying outcomes: some cases are still active, while others have resulted in multimillion-dollar settlements. This ongoing litigation is integral to defining the legal boundaries of data usage within AI.

Conclusion and Future Insights

The case is currently in its early stages, leaving many questions unanswered. The plaintiff seeks unspecified financial compensation on behalf of other potentially affected parties, and Adobe has not yet commented on the lawsuit. The judicial process will ultimately determine the outcome, whether that is a settlement or a dismissal. Regardless of its conclusion, this legal matter brings attention back to a pressing issue: how to balance technological advances in AI with the rights of content creators.


