Data laundering and the threat it poses under copyright law

Publication date: August 20, 2025

Rapid technological advances have led modern businesses to operate in ways unforeseen decades ago. Computerization in particular, without which modern business operations are inconceivable, has broadened the horizons of many entrepreneurs while also leaving room for abuse by cybercriminals. Protecting data stored on companies’ internal servers has therefore become crucial. Despite the efforts of both EU and national authorities, new threats continue to emerge in the field of personal data protection, and they may lead to violations not only of the general provisions on the protection of personal rights but also of rules in other areas of law, such as copyright. The unprecedented mass digitization of artistic works means that records of paintings, photographs, films, music, architectural designs, and many other manifestations of creative activity now exist as data. From this perspective, data laundering takes on a distinct character and carries new threats.

What is data laundering?

Data laundering essentially involves transforming stolen data so that it can be sold or used through ostensibly legitimate databases. For such a transformation to be possible, an organization must first obtain the data illegally, either by fraud or by theft via malware. The laundering process itself boils down to three stages:

1. Establishing an organization that will face relatively few restrictions on data processing because the data is used for non-commercial or scientific purposes. These are most often non-profit organizations.

2. Performing data manipulation operations using new technologies (for example, randomizers). The goal of these operations is to modify the data so that it appears more credible to potential buyers.

3. Selling data to other businesses or making it available to for-profit organizations that will then use it for commercial purposes.

The entire process thus aims to make illegally obtained data appear free of any suspicion as to its origin. Data laundering can have a significant impact on individual rights, as expressed not only in the Personal Data Protection Act but also in the Copyright and Related Rights Act.

A specific type of data laundering involves exploiting artists’ creative work. Specifically, it involves processing data containing information about a work without the required authorization, in such a way that its reuse would be unrecognizable to others.

Controversy over data laundering under copyright law

It is worth pointing out from the outset that processing data related to works, as defined by the Copyright and Related Rights Act, creates an entirely new work, the disclosure of which would not constitute a direct infringement of the author’s moral rights. This is problematic, as in this situation, it is difficult to define the boundary beyond which the author would be able to bring a copyright lawsuit and effectively pursue claims.

The problem of defining the limits of using someone else’s work to create an entirely new work has become particularly pressing with the development of AI-based tools. These tools operate on the basis of a database composed of many works, which is crucial to their effective functioning in the future. It is therefore impossible to deny the significant contribution of authors to the existence of such models, for which they receive no remuneration. The key question becomes: how did the organizations training the AI tool obtain this data?

Under Polish law, an artist can freely dispose of their rights to their work unless the law provides other restrictions. Therefore, if they grant a license to an organization using an AI tool, it does not constitute an infringement of the artist’s copyright, as, in accordance with the principle of freedom of contract, the artist may authorize the use of their work for the purpose of training an AI tool.

The problem arises, however, with so-called fair use, regulated in Articles 23-35 of the Copyright and Related Rights Act. These provisions rely on vague terms that require interpretation, so each case must be examined individually on the basis of the evidence collected. To date, no judgments in Poland have established a line of jurisprudence on fair use in the context of AI tools, but certain conclusions can be drawn from the rulings of foreign courts.

One ruling on the fair use of someone else’s work that may affect the legal situation of organizations using AI-based tools is the U.S. Supreme Court decision of May 18, 2023, in Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith et al. The Supreme Court emphasized that, when determining whether a use qualifies as fair use, the similarity between the original and secondary works is not the only factor that matters. The manner in which the resulting copy is used is also crucial: use solely for research purposes or by a non-profit organization will be assessed differently than use whose direct or indirect purpose is to generate profit.

Another lawsuit, this time closely related to the use of AI-based tools to create new works from a database containing information about artists’ existing, original works, is Andersen et al. v. Stability AI Ltd et al. The lawsuit was filed by artists accusing three companies (Stability AI, DeviantArt, and Midjourney) of using their works without the creators’ consent to train AI-based tools. According to the plaintiffs, the works created using these models are derivative of the artists’ original works and are essentially copies, the use of which constitutes copyright infringement. The main counterargument the court will have to consider is the fair use issue discussed above: although the artists maintain that the generated works are merely copies of existing works, the defendants will attempt to prove that AI-based tools produce transformative results that differ significantly from the originals.

Conclusions

Data laundering in copyright law is a problematic phenomenon. The primary reasons are the novelty of AI technology and the lack of sufficiently precise legal provisions for assessing the legality of works created by AI tools that rely on datasets containing information about the works of other creators. Given the dynamic development of AI, legislators should strive to shape regulations so that courts can issue the most equitable judgments possible. It cannot be denied that without the works contributed by artists, tools like Stable Diffusion would be unable to function, as they would lack the data needed to train the model. At the same time, the argument that the resulting work is not a copy in the strict sense of the word is partially valid. Defining the boundaries of fair use is therefore crucial, because without clear criteria the market for tools that generate works will remain a gray area, which is never desirable.

KG LEGAL \ INFO BLOG
