
Adversarial AI attacks and regimes of liability

Publication date: October 23, 2025

Adversarial attacks and why they are possible

The capabilities of AI systems are increasingly impressive and are largely based on machine learning. This technique allows algorithms to be “trained” on vast amounts of data, which in turn automates the algorithm and radically increases its “cognitive” capabilities, particularly through generalization, i.e., drawing conclusions from the data obtained. Predictive language models work on a related principle: given a statement or sentence and its context, they predict the most probable next word. To be precise, most AI systems are based on a specific variety of machine learning referred to as an artificial neural network, or deep neural network, a metaphor that likens the logic of AI systems to the functioning of the human brain.
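
By way of illustration only, the toy script below is a hypothetical, deliberately simplified example (a bigram model, far simpler than the deep neural networks discussed here) showing the basic idea of next-word prediction: counting which word most often follows another in a small corpus and returning the most probable continuation.

```python
from collections import Counter, defaultdict

# Toy illustration of next-word prediction (a hypothetical bigram model,
# not any of the large predictive language models discussed in this article).
corpus = "the car stops at the stop sign and the car waits".split()

# Count which word follows which in the corpus.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the word most frequently observed after `word`."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(predict_next("the"))   # -> "car" (seen twice after "the")
print(predict_next("stop"))  # -> "sign"
```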

Neural networks are exceptionally complex, particularly due to the number of parameters that make up such a network and the connections between them. This complexity puts neural networks beyond human cognitive abilities, making it impossible to answer precisely the question of what actually happens inside a neural network. Unlike ordinary algorithms that execute a given command, previously predicted and programmed by a human, AI systems are prepared to encounter commands they were not explicitly programmed to execute, as well as material not necessarily included in the training set. It is also worth considering the “other side of the coin” of this same feature: the so-called black box effect, meaning a lack of knowledge about exactly how the system arrived at a response or executed a command. Machine learning, neural networks, and the black box effect underpin AI systems’ capabilities, but they are also their “Achilles’ heel” in terms of security and vulnerability to undesirable exploitation. Among the greatest threats are so-called adversarial attacks.

Adversarial attacks are techniques that exploit loopholes and vulnerabilities in AI models resulting directly from their “nature”. Generalization based on training data creates the risk of introducing manipulated data: perturbations that are often indistinguishable to the human eye, very subtle, yet fundamentally change the result generated by the AI system. Highly advanced systems increasingly take into account the so-called multimodal nature of data, meaning they read not only text but also images, sounds, and video material. One can therefore imagine data that appears to represent identical objects but in reality conceals completely different content, with humans having a very limited ability to perceive the difference. A frequently cited example, intended to illustrate the impact of adversarial attacks on AI systems, as well as the seriousness of this threat and the potential damage resulting from an attack, is an autonomous car misreading a road sign, such as a stop sign, because adversarial noise causes the sign to be confused with another and the car to proceed. There are at least several types of adversarial attacks, and consequently several equally dangerous scenarios leading to undesirable behavior of AI systems.
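
A minimal numeric sketch, assuming a purely hypothetical linear classifier, illustrates why such subtle perturbations can work: in a high-dimensional input, per-feature changes that are each imperceptible can accumulate into a large shift in the model's score.

```python
import numpy as np

# Hypothetical linear classifier: the point is only that many imperceptible
# per-pixel changes, all aligned with the model's weights, add up.
rng = np.random.default_rng(0)
dim = 10_000                          # think "number of pixels"
w = rng.choice([-1.0, 1.0], size=dim) # toy weight vector
x = rng.normal(0.0, 1.0, size=dim)    # toy input

epsilon = 0.01                        # each feature changes by at most 0.01
x_adv = x + epsilon * np.sign(w)      # perturbation aligned with the weights

print("clean score:               ", float(w @ x))
print("perturbed score:           ", float(w @ x_adv))  # shifted by epsilon * dim = 100
print("largest per-feature change:", float(np.abs(x_adv - x).max()))
```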

Some attacks modify input data, leading to its incorrect interpretation by the AI system, which can consequently lead to unauthorized operations. Such manipulations can be performed without modifying the training data or the model, for example, by attaching a sticker to a road sign, which the AI system may take into account, creating the risk that the sign is read as a different one. Similarly, systems with biometric functions that ensure security, for example in public places, may malfunction and be manipulated, for instance through clothing specially prepared for this purpose (e.g., headgear or T-shirts). There is also so-called model poisoning, which involves modifying data in the training set, i.e., at an earlier stage, as well as attacks that steal information about the data used to train the model, which can potentially lead to the disclosure of confidential, personal, and sensitive data. Furthermore, a common distinction is made between targeted and untargeted attacks. A targeted attack is intended to produce a specific effect: for example, an image of the digit 7 is modified in a way a human would not notice, so that the AI system interprets it as the digit 4. An untargeted attack is designed only to make the AI system misinterpret the data presented, for example in graphical form, without ensuring that the 7 is always read as a 4; it simply must not be read as a 7.
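
The targeted versus untargeted distinction can be sketched with a simple gradient-based perturbation in the spirit of the fast gradient sign method (FGSM). The snippet below is only a hedged illustration: it uses a randomly initialised stand-in classifier rather than a real digit recognizer, so the exact predictions are meaningless; what matters is the direction of the perturbation in each case.

```python
import torch
import torch.nn.functional as F

# Illustrative stand-in classifier (untrained): 784 "pixels", 10 digit classes.
torch.manual_seed(0)
model = torch.nn.Linear(28 * 28, 10)

x = torch.rand(1, 28 * 28, requires_grad=True)  # stand-in for an image of a "7"
true_label = torch.tensor([7])                  # what the input really shows
target_label = torch.tensor([4])                # what a targeted attacker wants
epsilon = 0.1                                   # maximum per-pixel change

# Untargeted attack: push the loss for the true class up ("just not a 7").
F.cross_entropy(model(x), true_label).backward()
x_untargeted = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
x.grad.zero_()

# Targeted attack: push the loss for the attacker's chosen class down ("read it as a 4").
F.cross_entropy(model(x), target_label).backward()
x_targeted = (x - epsilon * x.grad.sign()).clamp(0, 1).detach()

for name, adv in [("untargeted", x_untargeted), ("targeted", x_targeted)]:
    print(name, "prediction:", model(adv).argmax(dim=1).item())
```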

Currently, there are no procedures, techniques, or methods that provide 100% protection against adversarial attacks, because, as indicated above, vulnerability to adversarial attacks is, in a sense, an inherent feature of systems based on neural networks and machine learning. The future use of so-called AI agents, i.e., systems that make decisions independently, may raise particular security concerns. Some protective measures, such as setting a threshold for accepting a categorization as correct, can make the system significantly less effective. Other techniques used for protection include using additional metrics and models, enriching the training data with various types of noise (so-called augmentation), and pruning (a method of network optimization that removes parameters contributing little to the signal).
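
Two of the mitigations mentioned above, a confidence threshold and noise augmentation, can be sketched as follows. The classifier is again a hypothetical stand-in, and the threshold and noise level are arbitrary illustrative values, not recommended settings.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(784, 10)  # hypothetical stand-in classifier

def predict_with_threshold(x: torch.Tensor, threshold: float = 0.9):
    """Return the predicted class, or None (abstain) if confidence is too low."""
    probs = F.softmax(model(x), dim=1)
    confidence, label = probs.max(dim=1)
    return label.item() if confidence.item() >= threshold else None

def augment_with_noise(batch: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Gaussian-noise augmentation applied to a training batch."""
    return batch + sigma * torch.randn_like(batch)

x = torch.rand(1, 784)
print(predict_with_threshold(x))    # often None: abstaining trades coverage for safety
print(augment_with_noise(x).shape)  # same shape, noisier inputs for training
```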

The Artificial Intelligence Act, a key regulation concerning AI

Of course, Regulation (EU) 2024/1689 of the European Parliament and of the Council (the Artificial Intelligence Act, AI Act) provides the crucial regulations in this regard. The Regulation establishes, among other things, a definition of an artificial intelligence system, introduces a catalogue of prohibited practices (Article 5), and introduces the concept of high-risk AI systems, for which the European legislator has provided numerous obligations. Article 9 introduces the obligation to establish, implement, document, and maintain a risk management system for high-risk AI systems. This includes identifying known and reasonably foreseeable threats, assessing and estimating risks (including those related to reasonably foreseeable misuse), adopting risk mitigation measures, and evaluating the system after it is placed on the market. Section 2 of Chapter III of the Regulation (Articles 8-15) also includes requirements regarding data and data governance, technical documentation, record-keeping (event logging), transparency and provision of information to deployers, human oversight, accuracy, robustness, and cybersecurity. Section 3 of Chapter III (Articles 16-27) in turn defines the scope of obligations for providers, importers, distributors, and deployers of high-risk AI systems.

The AI Act also introduces the category of the general-purpose AI model, defined in Article 3(63) as an AI model, including one trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks, regardless of the way the model is placed on the market, and that can be integrated into a variety of downstream systems or applications, excluding AI models used for research, development, and prototyping activities before they are placed on the market. Article 51 in turn sets out the conditions under which a general-purpose AI model is classified as a general-purpose AI model with systemic risk. Articles 53 and 55 regulate, respectively, the obligations of providers of general-purpose AI models and the obligations of providers of general-purpose AI models with systemic risk. Article 55(1)(a) directly addresses the problem of adversarial attacks, requiring providers of general-purpose AI models with systemic risk to “perform model evaluation in accordance with standardised protocols and tools reflecting the state of the art, including conducting and documenting adversarial testing of the model with a view to identifying and mitigating systemic risks”. Article 72, in turn, governs the provider’s obligations regarding post-market monitoring of the system.

A significant issue remains that of civil liability for damage caused by AI systems, including damage resulting from an adversarial attack. Until the announced EU directive on the principles shaping the regime of liability for damage caused by AI is adopted and enters into force, the provisions of the Civil Code should apply in the first place. The issue is quite complex and ambiguous, not least because of the range of entities potentially liable for such damage, which include manufacturers, system creators, distributors, producers of devices equipped with an AI system, owners of specific devices using AI, entities using AI systems as part of their business activities, as well as users and consumers who use AI for personal purposes. At the same time, although this may sound somewhat futuristic at this point, it should be noted that we are talking about liability for AI, not the liability of AI itself or of a specific AI system, as AI has no legal capacity or personality.

Possible liability regimes for artificial intelligence

The simplest case to classify is when an AI system is used by a human who, through their own fault, causes damage by a tortious act. Here the AI system’s role is not the primary one, but merely incidental, technical, and instrumental. Under Article 415 of the Civil Code, liability for, for example, a hacker attack carried out using AI should, of course, be attributed to the hacker. The situation becomes much more difficult when the damage results from an error in the AI system’s operation rather than from direct human involvement or will. Given the very broad range of entities that could be held liable in such a situation, analogy should be used, drawing, in addition to the principle of fault, on the principle of risk or equity, and sometimes on the separate regime of contractual liability (Article 471 of the Civil Code) or even warranty liability. The basic regime of fault-based tort liability should be considered first, but strict liability and liability for a dangerous product may also prove appropriate.

The fundamental parameter for correctly assigning liability, emphasized in the doctrine, is the autonomy of a given technological solution. As noted above, minimal system autonomy allows liability to be based on fault, treating an AI system similarly to software that is a standard “non-intelligent” algorithm or to a physical object, that is, as a tool. High autonomy, in turn, may make it necessary to go beyond the principle of fault, since damage may occur as a result of the AI system behaving differently than expected. There are, however, intermediate possibilities, such as the principle of “organizational fault” (Article 416 of the Polish Civil Code). This provision is most often applied in healthcare settings, particularly in the context of AI systems used by doctors. Organizational fault would be an appropriate basis for liability if, for example, the maintenance or updating of devices equipped with an AI system were neglected. Potentially, strict liability could also be applied; it does not allow exemption from liability by demonstrating a lack of fault in the harmful event, except for so-called exoneration circumstances.

In assigning this type of liability, besides the level of system autonomy, a second parameter would be useful, namely the AI system’s potential for causing harm, captured, for example, by the concept of a high-risk AI system. There is also a concept, accepted by some scholars, according to which an entrepreneur would be liable for all consequences of AI actions unless they demonstrated exoneration circumstances (liability based on risk), while a non-entrepreneur would be liable on the basis of fault in supervision. This is a rather interesting concept, but an equally controversial one.

In some cases, it seems appropriate to apply Article 435 of the Civil Code, a form of strict liability, i.e., liability for damage caused by the operation of an enterprise or plant set in motion by the forces of nature. Although this provision clearly originated in the context of traditional industry, it seems that software could also be treated, by analogy, as a “force of nature” that sets an enterprise in motion. It is worth noting, however, that any extension of liability from a fault-based regime to a risk-based regime should be made with caution, meaning that such an extension should be limited to cases where the enterprise poses a significant risk. In the often-cited example of an autonomous car, Article 436 of the Civil Code, which establishes the liability of the possessor of a mechanical means of transport propelled by the forces of nature, should be considered. Applying this provision carries far-reaching consequences, as the person responsible for damage caused by an autonomous vehicle would be its user, not the software developer, tester, or manufacturer of the vehicle. As in the previous case, the use of this path should depend on an assessment of the general danger associated with driving such a vehicle. An assessment indicating a significant danger would lead to the conclusion that strict liability is appropriate; otherwise, attributing such broad liability to the user would likely be inappropriate. Of course, in this case too, it is possible to invoke exoneration grounds.

The possibility of assigning liability for artificial intelligence under the provisions on dangerous products is unclear due to the wording of Article 2 of Council Directive 85/374/EEC, which defines a product as any movable item, even if it is a component part of another movable or immovable item, and which also treats electricity as a product. Similarly, Article 449¹ § 2 of the Civil Code states: “A product shall be understood as a movable item, even if it is connected to another item. Animals and electricity shall also be considered a product.” This clearly refers to the material nature of the product, meaning that it is not the AI system itself that may be considered a dangerous product, but rather the device whose functionality is based to some extent on the AI system. An interpretation broadening the definition of a product to also include software or an AI system would be controversial.

Contractual liability under Article 471 of the Civil Code is also significant. In this situation, it will be necessary to demonstrate damage resulting from non-performance or improper performance of an obligation for which the debtor is liable, i.e., culpable non-performance or a failure to exercise due diligence in performance. Article 473 of the Civil Code, which allows the debtor’s liability to be modified (either limited or extended), appears to be crucial in this case. In the absence of such modifications, attributing liability to the debtor solely on the basis of Article 471 of the Civil Code may be difficult, unless it can be shown that the debtor failed to exercise due diligence, for example, by not implementing any safety procedures. There also remains the entire scope of liability arising from contractual provisions themselves, both in the form of contractual penalties and warranty liability.
