Publication date: May 09, 2025
In recent years, artificial intelligence (AI) technology has developed rapidly, significantly influencing various sectors of the digital economy. AI systems are now widely used, among others, in HR (automation of recruitment processes), in e-commerce (analysis of consumer preferences) and in public administration (management of social benefits). This technological progress carries enormous potential, but at the same time it raises serious legal and ethical challenges, especially in the area of privacy protection.
From the perspective of personal data protection, the development of artificial intelligence can have both positive and negative consequences. One of the key threats is deep user profiling: AI enables the processing of huge amounts of data to build precise user profiles, on the basis of which behavior can be predicted and automated decisions made. Under Article 22 of the GDPR, the data subject has the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning them or similarly significantly affects them. This provision clearly limits the scope for such automated decision-making. Nevertheless, practice shows that recruitment or predictive systems can violate the GDPR if they do not provide for the right to human intervention. Such operations carry the risk of discrimination, incorrect assessments, lack of transparency and violation of the principle of data minimization. These problems underscore the need to strengthen the protection of personal data in the context of AI at both the national and EU level.
In the European Union, the European Data Protection Board (EDPB) plays an important role in this area, tasked with ensuring the consistent application of the General Data Protection Regulation (GDPR) in all Member States. The EDPB’s activities, including the opinions it issues, are intended to support national supervisory authorities and set interpretative standards for new technological challenges. Although such opinions are not binding, they are considered authoritative and provide guidance for the interpretation of the rules in practice.
EDPB OPINION ON AI MODELS
On 18 December 2024, the European Data Protection Board adopted an important opinion on the use of personal data in the process of creating and implementing artificial intelligence (AI) models.
This document is a response to the growing importance of AI-based technologies in various economic sectors and to the challenges that their development poses from the perspective of personal data protection. The opinion was issued on the basis of Article 64(2) of Regulation (EU) 2016/679 of the European Parliament and of the Council, commonly known as the GDPR, under which the Board may issue opinions on matters of general application in order to ensure the consistent application of data protection rules throughout the European Union.
The work on the opinion was initiated by the Irish data protection authority, which asked the EDPB to take a position on the admissibility of processing personal data in the context of AI. The aim was to harmonize, at European level, the rules on the use of personal data in the context of artificial intelligence.
In the course of work on the opinion, the EDPB held consultations, including meetings with representatives of the scientific community, industry, non-governmental organizations and the newly established EU Office for Artificial Intelligence. The document addresses three key issues that are fundamental to the compliance of AI models with personal data protection requirements: when an AI model can be considered anonymous, whether legitimate interest can serve as a legal basis for its development and deployment, and what consequences unlawful processing at the development stage has for the model's subsequent use.
ANONYMITY OF AI MODELS
Personal data
According to Article 4(1) of the GDPR, “personal data” means any information relating to an identified or identifiable natural person. In the context of AI, it is crucial to understand two elements of this definition. The term “relates” means that the information must relate to the individual in a sufficiently precise manner to have an impact on them (e.g. consumer profile, risk category, result of a predictive model), while the term “identifiable” refers to direct identification (e.g. name and surname), but also indirect identification if there is a “reasonably probable” possibility of identification. According to the EDPB, the use of the expression “any information” in the definition of “personal data” in Article 4(1) of the GDPR is intended to give the concept a broad scope – covering any information as long as it “relates” to a person who can be identified directly or indirectly.
In the context of AI, indirect identification need not involve an obvious identifier: for example, a risk score attributed to behavior or a disease prediction may in itself constitute personal data. AI models usually do not contain personal data directly, but rather parameters describing relationships between data. It is possible, however, that personal data can be inferred from those parameters.
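How personal data can be inferred from model parameters alone may be easier to see with a minimal, purely illustrative Python sketch (all data synthetic, the scenario hypothetical, not drawn from the opinion): a nearest-centroid "model" stores only per-group averages as its parameters, yet where a group contains a single person, the stored parameter reproduces that person's record exactly.

```python
# Illustration only: model parameters can encode personal data.
# All data below is synthetic; departments and features are hypothetical.
from statistics import mean

# Training data: department -> [weekly_overtime_hours, sick_days] per employee.
records = {
    "sales": [[2.0, 1.0], [4.0, 3.0], [3.0, 2.0]],  # three employees
    "legal": [[9.5, 6.0]],                          # a single employee!
}

# "Training" a nearest-centroid model: the parameters are per-class means.
centroids = {
    dept: [mean(col) for col in zip(*rows)]
    for dept, rows in records.items()
}

print(centroids["sales"])  # [3.0, 2.0] -> an aggregate over three people
print(centroids["legal"])  # [9.5, 6.0] -> exactly one person's record

# The model no longer "contains" names, yet the 'legal' parameter vector is
# trivially attributable to the only employee in that department: personal
# data inferable from parameters, in the sense discussed above.
```

The point generalizes: the smaller the group a parameter summarizes, the closer that parameter comes to an individual record.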
Is information generated by inference also subject to the GDPR?
According to Article 4(1) of the Regulation, "personal data" means "any information (…)". It is therefore not only "primary data" (such as name, surname or a personal identification number) that qualifies, but also "secondary data" that are obtained or inferred, as long as they refer to a specific person. How the information was obtained is irrelevant; what is decisive is the relationship between the person and the information, and in establishing that relationship the content of the information is of primary importance.
Whether the person concerned has legal capacity, and whether that capacity is full or limited, is irrelevant to the assessment of whether information constitutes personal data.
Case law confirms this. In its judgment of 19 October 2016 in case C-582/14, Patrick Breyer v Bundesrepublik Deutschland, the CJEU highlighted the contextual nature of personal data, holding that even a dynamic Internet Protocol (IP) address registered by an online media services provider may constitute personal data if the provider has legal means enabling it to identify the user with the help of additional information held by a third party.
Anonymization vs. pseudonymization
Anonymization and pseudonymization are two different approaches to limiting the possibility of identifying a natural person in the context of personal data processing. The GDPR does not contain a legal definition of anonymization. Data that have been effectively anonymized, that is, in such a way that the data subject cannot be identified by any means by either the controller or a third party, are no longer personal data and are therefore not subject to the GDPR, because in such a situation the data cannot be linked to a specific person.
In contrast to anonymization, pseudonymization is defined in Article 4(5) of the GDPR as the processing of data in such a way that they can no longer be attributed to a specific person without the use of additional information, provided that such information is kept separately and is subject to technical and organizational measures. Under Article 32(1)(a) it is a security measure, but pseudonymized data remain personal data.
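The difference can be made concrete with a minimal Python sketch (the identifiers, key and record fields below are hypothetical, chosen only for illustration): pseudonymization replaces a direct identifier with a keyed token, but anyone holding the separately stored key can rebuild the mapping, so the output remains personal data.

```python
import hmac
import hashlib

# Hypothetical secret key; under Article 4(5) GDPR the re-identification
# information must be kept separately and protected by technical and
# organizational measures.
PSEUDONYMISATION_KEY = b"stored-separately-under-access-control"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a deterministic keyed token."""
    return hmac.new(PSEUDONYMISATION_KEY, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()[:16]

record = {"email": "jan.kowalski@example.com", "sick_days": 6}
safe_record = {"subject": pseudonymise(record["email"]),
               "sick_days": record["sick_days"]}
print(safe_record)

# Whoever holds the key can re-identify the person (e.g. by re-tokenising a
# list of known emails), so the data remain personal data under the GDPR.
# Deleting the key and any lookup table removes that path, but true
# anonymization also requires that the remaining attributes cannot single
# the person out indirectly.
```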
In its judgment of 20 December 2017 in case C-434/16, Peter Nowak v Data Protection Commissioner, the CJEU ruled that even an examiner's comments on an examination paper may constitute personal data if the examinee can be identified. The judgment does not directly concern anonymization, but it underlines the broad understanding of "personal data" and the importance of the context of processing. Importantly, the Court adopted a functional approach to identifiability: whether data are personal depends on the possibility of linking them to a specific person, not on their formal nature.
Anonymity conditions according to EDPB
In terms of determining the anonymity of AI models, the European Data Protection Board stated that anonymity should be assessed case by case by data protection authorities. The EDPB also specified the conditions an AI model should meet in order to be considered anonymous: both the likelihood of extracting, directly or probabilistically, personal data of the individuals whose data were used to develop the model, and the likelihood of obtaining such personal data, intentionally or not, from queries to the model, must be insignificant.
To assess whether these conditions are met, supervisory authorities should take into account three elements. First, drawing on the WP29 Opinion 05/2014 on anonymisation techniques, data can be considered anonymous only if it is not possible to single out an individual, to link records relating to them, or to infer information concerning them; if any of these criteria is not satisfied, a detailed identification risk assessment should be carried out (a simple screening check for the first criterion is sketched after this enumeration).
Secondly, this assessment should take into account "all means reasonably likely to be used" by the controller or another person, judged against objective factors in line with Recital 26 of the GDPR, such as the costs of and the amount of time required for identification, the technology available at the time of the processing, and foreseeable technological developments.
Finally, it is important to check whether the controller has assessed the risk of identification, both by itself and by third parties who may, even unintentionally, gain access to the model and process the data. The EDPB provides a non-binding, non-exhaustive list of elements that supervisory authorities may take into account when assessing a controller's claim of anonymity. Other approaches are also acceptable if they provide an equivalent level of protection, especially in light of the state of the art. The presence or absence of specific elements is not conclusive proof of anonymity; the assessment must be holistic. Clarifying these criteria helps avoid ambiguity in the application of data protection rules and provides greater legal certainty for organizations and for the individuals whose data may be processed.
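The first WP29 criterion, singling out, can be approximated in practice by checking how many records share each combination of quasi-identifiers (a k-anonymity count). The sketch below uses synthetic data and hypothetical attributes, and is only a first screen, not proof of anonymity.

```python
from collections import Counter

# Synthetic records: (age_band, postcode_prefix, role) act as quasi-identifiers.
rows = [
    ("30-39", "00-1", "analyst"),
    ("30-39", "00-1", "analyst"),
    ("30-39", "00-1", "analyst"),
    ("40-49", "02-7", "director"),   # unique combination -> singled out
]

def min_group_size(records):
    """Smallest number of records sharing one quasi-identifier combination."""
    return min(Counter(records).values())

k = min_group_size(rows)
print(f"k = {k}")  # k = 1: at least one person can be singled out
if k < 2:
    print("Singling out possible; the first criterion fails, "
          "proceed to a detailed identification risk assessment.")
```

Passing such a check says nothing about the linkability and inference criteria, which is why the EDPB insists on a holistic assessment.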
Practical consequences of misclassifying data as anonymous
Classifying data as anonymous has great practical significance, as it means that the GDPR does not apply at all. However, incorrect classification can lead to serious legal consequences – from failure to fulfil legal obligations to illegal processing of personal data.
If data deemed anonymous in fact still allow the identification of a natural person, they remain personal data. The AI model is then built without a legal basis for processing, in breach of Article 6 of the GDPR. Personal data protection may also be violated in other respects, and administrative fines under Article 83 may be imposed.
The GDPR requires a risk-based approach from the controller, who cannot automatically assume that the data are not personal. The controller is required to carry out the test discussed in the previous subsection and to keep documentation demonstrating compliance, in line with the accountability principle (Article 5(2) of the GDPR).
Legitimate interest as a basis for data processing in AI
Legitimate interest (Article 6(1)(f) of the GDPR) is one of the legal grounds invoked for data processing in the context of AI systems, especially where there is no consent or legal obligation.
Data processing is lawful if it is "necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child". The EU legislator thus assumes that there are cases in which the controller cannot rely on consent, a legal obligation or the performance of a contract, but should nevertheless be able to process the data.
The concept of legitimate interest can be understood in different ways. It may, for instance, be an interest recognized in legal provisions which do not themselves regulate the admissibility of data processing, but merely point to an interest worthy of protection.
Legitimate interest according to the EDPB opinion
The opinion provides general guidance for data protection authorities to assess whether legitimate interest can constitute an appropriate legal basis for the processing of personal data in the context of artificial intelligence models. For this purpose, the three-step test established by the Board should be followed. For processing to be based on Article 6(1)(f) of the GDPR, three cumulative conditions must be met: the pursuit of a legitimate interest by the controller or by a third party, the necessity of the processing for the purposes of that interest, and the absence of overriding interests or fundamental rights and freedoms of the data subjects.
The EDPB has also published guidance on how to conduct this test before relying on legitimate interest. First, the controller must establish the existence of a legitimate interest, its own or a third party's, that is clearly articulated, real and present at the time of processing. Second, when assessing whether the processing is necessary to pursue that interest, it must be checked whether the purpose could be achieved by other means that intrude less on the rights and freedoms of data subjects. Third, the interests and reasonable expectations of data subjects, as well as the potential impact of the processing on their rights, must be weighed to confirm that they do not override the interest pursued.
The EDPB emphasizes that before processing data on the basis of legitimate interest, controllers should carefully consider and document the fulfilment of the above conditions, ensuring compliance with the accountability principle set out in Article 5(2) of the GDPR. AI-based services may be beneficial to individuals and may rely on legitimate interest as a legal basis, but only if the processing is strictly necessary and the balancing of rights is maintained.
The Opinion sets out a number of criteria to help data protection authorities assess whether individuals can reasonably expect their personal data to be used in a specific way. These criteria include, among others: whether the data were publicly available; the nature of the relationship between the data subject and the controller; the nature of the service; the context in which the data were collected; the source from which they were obtained; the potential further uses of the model; and whether individuals are actually aware that their data are online.
Examples of legitimate interests:
A legitimate interest must be specific, real and present; it cannot be hypothetical. The EDPB and national case law indicate that legitimate interests in the context of AI may include, among others, internal research and development (e.g. developing recommendation models), detection of abuse and system security (e.g. analysis of anomalies in transaction data), and improvement of services (e.g. analysis of user data to improve the interface).
It is worth adding that if processing based on legitimate interest leads to automated decision-making as referred to in Article 22 of the GDPR, then such a decision is, as a rule, permitted only where it is necessary for the conclusion or performance of a contract, authorized by Union or Member State law, or based on the data subject's explicit consent (Article 22(2)); in addition, suitable safeguards must be in place, including at least the right to obtain human intervention, to express one's point of view and to contest the decision (Article 22(3)).
Possibility of filing an objection – Article 21 of the GDPR
The GDPR explicitly provides that the data subject has the right to object to processing based on legitimate interest. If such an objection is raised, the controller may continue processing only if it demonstrates compelling legitimate grounds which override the interests, rights and freedoms of that person.
In the context of AI, this means that a person must be clearly informed of their right to object and that an easy mechanism for filing an objection must be provided. In addition, procedures for responding to objections must be prepared.
Consequences of unlawful data processing
A violation of the GDPR can render the entire AI model unlawful. This is especially so where personal data were obtained without an appropriate legal basis, the data subject was not informed (in violation of Articles 13 and 14 of the GDPR), or the processing did not respect the principles of purpose limitation or data minimization. There is a noticeable trend of "fixing" a model through subsequent anonymization, but this does not always cure the unlawful processing, because the violation concerns the moment of data acquisition or the processing carried out ex ante.
In accordance with Article 32 of the GDPR, the controller and the processor are required to ensure a level of security appropriate to the risk by applying technical and organizational measures. The Regulation pays particular attention to the ability to ensure the ongoing confidentiality, integrity, availability and resilience of processing systems and services. The EDPB emphasizes that supervisory authorities are responsible for monitoring the activities of these entities in order to ensure the protection of fundamental rights and freedoms.
However, if it turns out that processing should not take place due to its negative impact on individuals, mitigating measures can be applied. The opinion provides a non-exhaustive list of examples of such measures. They may be technical in nature, may facilitate the exercise of individuals' rights, or may increase transparency. Technical mitigating measures include reducing the input data or designing the AI architecture in a way that allows for subsequent deletion of data. Procedural mitigating measures include mechanisms for handling requests from data subjects and consultations with the data protection officer in high-risk projects. The opinion also emphasizes that a person whose data have been unlawfully processed may request their erasure under Article 17 of the GDPR.
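By way of illustration of the "reducing input data" measure mentioned above, the following minimal sketch (field names and coarsening rules are hypothetical, not taken from the opinion) drops direct identifiers and coarsens a precise timestamp before records ever reach model training.

```python
# Illustrative pre-training minimization step; field names are hypothetical.
DIRECT_IDENTIFIERS = {"name", "email", "employee_id"}

def minimise(record: dict) -> dict:
    """Drop direct identifiers and coarsen precise attributes."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Coarsen an ISO timestamp to the hour to reduce linkability.
    if "login_time" in cleaned:
        cleaned["login_time"] = cleaned["login_time"][:13] + ":00"
    return cleaned

raw = {"name": "Jan Kowalski", "email": "jan@example.com",
       "login_time": "2025-03-14T08:27:31", "messages_sent": 42}
print(minimise(raw))
# {'login_time': '2025-03-14T08:00', 'messages_sent': 42}
```

Such filtering reduces, but does not by itself eliminate, identification risk; it is one mitigating measure among the technical and procedural ones listed above.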
In addition, creating an AI model using unlawfully processed personal data may undermine the lawfulness of its subsequent use, unless the controller demonstrates that the data were effectively anonymized before further use of the model.
Taking into account the scope of the Irish DPA’s inquiry, as well as the wide variety and dynamic development of AI models, the opinion aims to provide general guidance to assist in the analysis of specific cases.
Case study – practical analysis
Example case:
Startup NeuroMetrics is developing an AI system to predict burnout using data from employee communication platforms (emails, chats, meeting times, login times, response time data, etc.).
The company claims the data has been anonymized: the system does not store names or email addresses, and the data is converted into feature vectors before training the model.
However, one of the employees, Ms. Z, recognizes her writing style and work calendar in a visualization of the results and files a complaint with the UODO (the Polish data protection authority), claiming that her data is still being used.
According to the EDPB's approach (building on WP29 Opinion 05/2014), data are effectively anonymous only if they cannot be attributed to a specific person, even indirectly, taking into account all means reasonably likely to be used for identification.
In this situation, writing style, meeting calendar and activity patterns remain recognizable, which makes it possible to attribute a unique behavioral signature to a given person; a toy illustration follows below.
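The re-identification risk can be illustrated with a toy stylometric sketch in Python (synthetic texts, hypothetical feature choices; real stylometry uses far richer features): even after names are removed, a simple feature vector built from writing habits can match an "anonymous" sample to its likely author by similarity.

```python
import math

def style_vector(text: str) -> list[float]:
    """Toy stylometric features: avg word length, exclamation rate, comma rate."""
    words = text.split()
    n = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n,
        text.count("!") / n,
        text.count(",") / n,
    ]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

# Synthetic "training" messages whose authors are known internally.
known = {
    "employee_A": "Quick note, see attached, thanks!",
    "employee_B": "I have reviewed the proposal and I believe we should proceed carefully.",
}

# An "anonymized" message from the training set, with names stripped.
sample = "I have considered the draft and I believe we should respond cautiously."

best = max(known, key=lambda k: cosine(style_vector(known[k]), style_vector(sample)))
print(best)  # employee_B: the writing style alone acts as a signature
```

The point is not these particular features but that behavioral regularities can function as an identifier, which is precisely the risk Ms. Z's complaint describes.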
In line with the previously cited CJEU judgment in Nowak (C-434/16), the mere possibility of identification is sufficient for such information to constitute personal data.
Therefore, it must be concluded that NeuroMetrics has not performed effective anonymization, but only pseudonymization, which is still subject to the GDPR.
The company relied on Article 6(1)(f) of the GDPR: a legitimate interest in improving employee well-being and preventing burnout. To determine whether this basis was valid, the EDPB's three-step test should be conducted. First, the interest itself, protecting employees' health, is real, present and lawful. Second, however, the necessity step is doubtful: the same aim could arguably be achieved by less intrusive means, such as voluntary surveys or genuinely anonymized aggregate statistics, rather than the analysis of individual emails and chats. Third, in the balancing step, employees do not reasonably expect their everyday workplace communication to be mined for behavioral predictions, and the imbalance of power in the employment relationship weighs against the employer.
It can be concluded that the processing probably did not meet the proportionality and necessity test.
Article 17 of the GDPR grants the right to erasure where, among other grounds, the data have been processed unlawfully, the person has withdrawn consent, or the person has raised an effective objection.
In practice, however, it may not be technically feasible to remove a specific person's data from an already trained model in its entirety. It is, however, possible to delete that person's input data and to ensure that the model is no longer used in relation to that person.
The opinion of the European Data Protection Board of 18 December 2024 is an important voice in the discussion on the compliance of AI models with the provisions of the GDPR. Although it is not binding, its authoritative tone and wide-ranging consultations make it a practical guide for supervisory authorities, AI implementers and data protection experts.
The EDPB opinion gives rise to three key conclusions. First, AI models trained on personal data cannot automatically be treated as anonymous; anonymity must be demonstrated case by case against the criteria described above. Second, legitimate interest may serve as a legal basis for developing and deploying AI models, but only after the three-step test has been carried out and documented in line with the accountability principle. Third, unlawful processing at the development stage may taint the subsequent use of a model, unless the controller demonstrates its effective anonymization.
In light of the dynamic development of AI technology, the EDPB opinion is a foundation for further organizing the boundaries of permissibility of using personal data in automated systems. It is becoming crucial not only to comply with the letter of the law, but also to respect its spirit – protecting human information autonomy in the era of algorithms.
Sources:
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (OJ EU L 119, 4.5.2016, p. 1, as amended).
https://www.edpb.europa.eu/system/files/2024-12/edpb_opinion_202428_ai-models_en.pdf
P. Fajgielski [in:] Commentary to Regulation No. 2016/679 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) [in:] General Data Protection Regulation. Personal Data Protection Act. Commentary, 2nd edition, Warsaw 2022, art. 4.
Judgment of the Court of Justice of 19 October 2016, C-582/14, Patrick Breyer v Bundesrepublik Deutschland, ZOTSiS 2016, no. 10, item I-779.
Judgment of the Court of Justice of 20 December 2017, C-434/16, Peter Nowak v Data Protection Commissioner, ZOTSiS 2017, no. 12, item I-994.