Publication date: June 23, 2025
With the ongoing digitalization of practically every aspect of our lives, the creators of technologies such as those based on artificial intelligence (AI) need to collect high-quality personal data in order to develop them. With such data, algorithms can reach the desired outcome far more precisely.
However, every use of personal data carries a risk of violating our privacy. Depersonalization techniques, which by definition should ensure the anonymity of our data, address these concerns. The connection with the GDPR is immediately apparent here, since its purpose is to ensure that data is accessible only to those who have consent or are required to process it. In the context of implementing depersonalization techniques, three provisions of the GDPR stand out.
The first is Article 22, which limits automated decision-making.
Article 25(1), in turn, requires the controller to implement appropriate technical and organizational measures designed to put data protection principles into effect from the design stage onward. Under paragraph 2 of the same article, the controller should implement measures ensuring that, by default, only the personal data necessary for each specific purpose of processing are processed. The regulation also requires an assessment of the impact of planned processing operations on the protection of personal data (Article 35 of the GDPR).
Before presenting individual depersonalization techniques, it is worth explaining the concept of anonymization. The GDPR itself does not define this concept. A definition can be found, however, in Article 3, Section 1 of the Act on the exchange of information with law enforcement agencies of the Member States of the European Union, third countries, European Union agencies and international organizations. According to this act, anonymization is “the transformation of personal data in a way that prevents the assignment of individual information to a specific or identifiable natural person or if such assignment would require disproportionate costs, time or activities”. With this definition in mind, it is worth looking at Recital 26 of the GDPR. It states that the regulation does not apply to the processing of anonymous data, i.e. information that does not relate to an identified or identifiable natural person, or to personal data anonymized in such a way that the data subjects cannot be identified at all or can no longer be identified. In summary, anonymization techniques create data sets that resemble the originals in structure but hide confidential information. Because the data subjects cannot be identified, such data falls outside the scope of the GDPR, which of course does not mean that anonymization techniques are illegal under EU law.
Anonymization is best explained through examples of its various techniques. These include obfuscation or deletion of data (e.g. removing names) and data replacement, in which the true values of various elements are replaced with substitutes so that the people behind them can no longer be identified. Randomization is a broad concept as well: it forms a separate family of methods that eliminate the link between the data and the person they describe by introducing randomness. The second such family consists of techniques based on generalization, which modify the scale or order of magnitude of the data, for example by giving a region rather than a city.
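The three kinds of technique mentioned above (deletion, generalization and randomization) can be illustrated with a minimal Python sketch. The record, the field names and the helper functions `suppress`, `generalize_age` and `add_noise` are hypothetical and chosen only for illustration; applying them does not by itself guarantee that a data set is anonymous in the legal sense.

```python
import random

def suppress(record, fields):
    """Deletion: return a copy of the record without the given identifying fields."""
    return {k: v for k, v in record.items() if k not in fields}

def generalize_age(age, width=10):
    """Generalization: replace an exact age with a range `width` years wide."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def add_noise(value, spread, rng):
    """Randomization: shift a numeric value by a random offset in [-spread, spread]."""
    return value + rng.uniform(-spread, spread)

# Hypothetical input record.
record = {"name": "Jan Kowalski", "age": 37, "city": "Krakow", "salary": 8200.0}
rng = random.Random(0)  # fixed seed only so that the example is repeatable

out = suppress(record, {"name"})                                 # remove the name
out["age"] = generalize_age(out["age"])                          # 37 becomes "30-39"
out["salary"] = round(add_noise(out["salary"], 500.0, rng), 2)   # blur the salary
```

Whether the result can really be treated as anonymous depends on the whole data set: a rare combination of the remaining attributes may still point to a single person.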
An introduction to depersonalization techniques must also cover the important term pseudonymization. Its definition is found in Article 4(5) of the GDPR, according to which it is “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. The key difference between pseudonymization and anonymization is therefore that pseudonymized data can still be attributed to specific natural persons, for example with the help of a specific key or program. Crucially, this means that such data remains personal data and is subject to the provisions of the GDPR. Examples of pseudonymization include assigning a customer number in an online store or a student index number.
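One possible way to picture pseudonymization with a separately kept key is the following Python sketch. It uses a keyed hash (HMAC-SHA256) from the standard library to derive a stable pseudonym from an e-mail address. The `pseudonymize` function, the order records and the lookup table are assumptions made for illustration, not a method prescribed by the GDPR.

```python
import hashlib
import hmac
import secrets

def pseudonymize(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using a secret key."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in this example

# The "additional information": a secret key that must be stored separately
# from the pseudonymized data set.
key = secrets.token_bytes(32)

# Hypothetical orders from an online store.
orders = [
    {"email": "anna@example.com", "total": 120},
    {"email": "piotr@example.com", "total": 45},
    {"email": "anna@example.com", "total": 80},
]

# Data set passed on for analysis: the e-mail is replaced by a pseudonym.
pseudonymized = [
    {"customer": pseudonymize(o["email"], key), "total": o["total"]}
    for o in orders
]

# Lookup table kept separately by the controller; with it, each pseudonym
# can be attributed to a specific person again.
lookup = {pseudonymize(o["email"], key): o["email"] for o in orders}
```

Because anyone holding the key or the lookup table can restore the link to a person, the output remains personal data, which is exactly the difference from anonymization described above.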
As can be seen, the term depersonalization techniques covers many concepts, techniques and processes that require explanation from both a technical and a legal perspective. Understanding the nuances of these issues is extremely important before choosing a security method, and this series of studies is intended to make them easier to understand.