USE OF MEDICAL DATA FOR AI TRAINING

Publication date: October 21, 2025

Two acts of EU law govern this question. The first is Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (hereinafter “GDPR”). The second, which has not yet become fully applicable, is Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act) (hereinafter “AIA”). Under these acts, the use of sensitive data (including medical data) for AI training is possible only with consent, in cases specified by law, or when the data is anonymized. The AIA is not a lex specialis vis-à-vis the GDPR, so training an AI model on personally identifiable data requires meeting the requirements of both acts.

Anonymized data

First, it should be noted that the GDPR, in accordance with its Recital 26 and Article 4(1), applies to personal data, meaning information relating to an identified or identifiable natural person, and therefore does not apply to anonymized information. The AIA’s definition of “special categories of personal data” refers to the GDPR definition, so it does not cover anonymized data either. Therefore, it can be concluded that the use of anonymized data for training AI models is permissible under both acts.

Pursuant to Article 11 of the GDPR, if the purposes for which a controller processes personal data do not or no longer require the identification of a data subject by the controller, the controller shall not be obliged to process additional information in order to identify the data subject for the sole purpose of complying with the GDPR.

GDPR

The GDPR states that personal data may be processed only in strictly defined cases and in compliance with certain standards. Generally, pursuant to Article 6(1)(a) of the GDPR, consent is required for the lawfulness of personal data processing (see exceptions below). Consent to data processing must be given for a specific purpose. It must also be freely given, specific, informed, and unambiguous. Consent cannot be presumed, and the data controller bears the burden of demonstrating that the data subject has consented to processing (Article 7(1) of the GDPR).

Article 5 of the GDPR stipulates that personal data must be processed lawfully, fairly, and in a manner transparent to the data subject. Personal data must be adequate, relevant, and limited to what is necessary in relation to the purposes for which they are processed. They must also be accurate and, where necessary, kept up to date. The controller must take every reasonable step to ensure that personal data that are inaccurate in relation to the purposes of processing are erased or rectified without delay.

The literature indicates that the purpose of processing must be specific and clear and cannot be abstract, and that processing data for a purpose other than the one for which it was collected is possible only on the basis of consent or a legal provision. Consequently, data collected with consent for one purpose (for example, providing a medical service) could not be processed for the purpose of training artificial intelligence without the additional consent of the data subject.

The list of cases in which data processing is permitted without the data subject’s consent is contained in Article 6(1)(b) to (f) of the GDPR. It would be difficult to argue for the use of data for AI training under any of these cases other than point (f), which covers legitimate interests. However, medical data falls into the category of so-called sensitive data described in Article 9 of the GDPR. That article prohibits the processing of such data and sets out a separate list of exceptions in its paragraph 2, which does not include “legitimate interests”. Therefore, it would not be permissible to use this data without consent for purposes such as commercial ones on the basis of the controller’s “legitimate interest”.

AI ACT

Meanwhile, the AIA introduces the possibility of using sensitive data (understood in the same way as under the GDPR) when developing AI systems. According to Article 10(5) of the AIA (to be applied from August 2, 2026), providers of AI systems may exceptionally use this data if strictly necessary for the purpose of detecting and correcting bias in high-risk AI systems. This requires compliance with the GDPR requirements (including consent) and with the following conditions:

(a) it is not possible to effectively detect and correct bias by processing other data;

(b) special categories of personal data are subject to technical restrictions on the re-use of personal data and state-of-the-art security and privacy measures, including pseudonymisation;

(c) special categories of personal data are secured, protected and subject to appropriate safeguards, including strict access controls and documentation, to avoid abuse and ensure that only authorised persons, subject to appropriate confidentiality obligations, have access to such data;

(d) this data may not be sent, transferred or otherwise made available to other entities;

(e) special categories of personal data shall be deleted once the bias has been corrected or after the personal data retention period has expired, whichever comes first;

(f) records of processing activities must include a justification as to why the processing of special categories of personal data was strictly necessary to detect and correct bias and why this purpose could not be achieved by processing other data.

This provision appears to envisage a narrow exception: if the conditions described in this article are not met, the use of sensitive data for training AI models will not be permissible.
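For teams that track compliance in software, the six conditions (a) to (f) listed above can be recorded as a simple checklist. The following is only a minimal sketch under stated assumptions: the class name, field names and function are hypothetical labels chosen for illustration, they are not terms defined by the AIA, and a passing checklist does not replace a legal assessment or the separate GDPR requirements.

```python
from dataclasses import dataclass, fields


@dataclass
class Article10_5Checklist:
    """Hypothetical record of conditions (a)-(f) of Article 10(5) AIA."""
    no_alternative_data: bool                  # (a) other data cannot detect/correct the bias
    reuse_restricted_and_pseudonymised: bool   # (b) technical re-use limits, security measures
    access_controlled_and_documented: bool     # (c) strict access controls and documentation
    not_shared_with_other_entities: bool       # (d) no transfer or disclosure to other entities
    deletion_scheduled: bool                   # (e) deleted after correction or retention expiry
    necessity_justified_in_records: bool       # (f) justification kept in processing records


def unmet_conditions(checklist):
    """Return the names of all conditions that are not satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


if __name__ == "__main__":
    example = Article10_5Checklist(True, True, True, False, True, False)
    print(unmet_conditions(example))
```

Because all conditions apply together, an empty result is the only outcome under which the exception could even be considered in this sketch.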

Sale of personal data

As mentioned above, the GDPR does not apply to anonymised data, so their sale or other “commercial use” is not regulated by the GDPR. It could still encounter legal obstacles arising from other regulations, for example sector-specific ones.

In the case of selling personal data that is not anonymised, consent or one of the conditions of Article 9(2) of the GDPR would be required. Because sensitive data is involved, it would not be possible to invoke the “legitimate interest” of the “seller”.

Summary

In summary, the use of anonymized data (unless prohibited by specific regulations) for training AI models is permissible. In the case of non-anonymized or pseudonymized personal data, compliance with GDPR and AIA requirements, including consent, would be required.
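The conclusion above can be restated as a small decision function. This is a deliberately simplified sketch of the article's reasoning, not a legal test: the function name, parameter names and returned strings are hypothetical, and each boolean input stands for a legal assessment that has to be made separately.

```python
def assess_training_use(anonymized,
                        has_valid_consent=False,
                        legal_basis_in_law=False,
                        sector_rule_prohibits=False):
    """Simplified restatement of the summary; every input is an assumption."""
    if anonymized:
        # Anonymized data falls outside the GDPR, but other rules may still apply.
        if sector_rule_prohibits:
            return "blocked by specific regulations"
        return "permissible"
    # Personal data, including pseudonymized data: "legitimate interest" alone
    # is not enough for sensitive data, so consent or a legal basis is needed.
    if has_valid_consent or legal_basis_in_law:
        return "possible, subject to GDPR and AIA requirements"
    return "not permissible"


if __name__ == "__main__":
    print(assess_training_use(anonymized=True))
    print(assess_training_use(anonymized=False))
```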

KG LEGAL \ INFO BLOG
