Generative AI and the GDPR - a complete guide
A comprehensive guide for companies using generative AI tools: ensuring compliance and transparency, addressing potential challenges such as bias and security issues, and processing personal data under the GDPR.
Generative AI and the Legal Foundation for Data Processing Under the GDPR
In compliance with the principle of lawfulness set out in Article 5(1)(a) of the General Data Protection Regulation (GDPR), controllers must rely on a valid legal basis for any processing of personal data. Given the potential for AI tools to process personal data, data protection authorities are scrutinizing providers of generative AI applications, particularly OpenAI. The European Data Protection Board has also established a task force dedicated to AI-related issues. One recurring concern among authorities is the determination of a valid legal basis for processing personal data.
Legal Bases Under the GDPR
Article 6 of the GDPR outlines various options for establishing a legal basis: processing is permitted when the data subject has provided consent (Article 6(1)(a) GDPR), when processing is necessary for the performance of a contract or pre-contractual measures (Article 6(1)(b) GDPR), or when processing is necessary for pursuing legitimate interests (Article 6(1)(f) GDPR). Additionally, Article 9 of the GDPR sets out stricter conditions for processing special categories of personal data, such as health data, biometric data, and political opinions.
Legal Basis Requirements for Processing Steps
When determining the applicable legal basis, it is crucial to differentiate between the various processing steps. First, a distinction must be made between the processing of input data by the user, the processing of that input data by the tool itself, and the processing of output data by both the tool and the user. Furthermore, input and output data are typically used by AI tool providers to train the underlying algorithm. Whenever personal data is processed during any of these individual steps, the controller must rely on a valid legal basis.
Criticisms of Applicable Legal Bases by Authorities
The Italian data protection authority, the Garante, temporarily banned ChatGPT's service in Italy partly due to concerns regarding the lack of a legal basis for processing a substantial amount of personal data for training purposes. German supervisory authorities share similar concerns, and both the German Data Protection Conference (DSK) and the European Data Protection Board (EDPB) have set up AI task forces, which will address the legal basis for the various personal data processing operations in ChatGPT. The Schleswig-Holstein State Commissioner for Data Protection has requested that, if consent is used as the legal basis, OpenAI provide a sample declaration of consent. The Hessian data protection authority has asked OpenAI to explain how it handles data subject access and deletion requests and how it processes withdrawals of consent. If OpenAI or third parties invoke legitimate interests as the legal basis, OpenAI should explain the rationale and considerations behind the respective balancing of interests.
Selecting the Appropriate Legal Basis
For companies allowing their employees to use AI tools, it is essential to establish a legal basis for entering personal data in the form of prompts. The applicable legal basis depends on the specific data processing involved in each use case and can take the form of consent or a legitimate interest in entering the data. Companies should carefully consider these factors for their respective use cases and document them in their records of processing activities. Additionally, other data protection obligations of a controller, such as conducting data protection impact assessments and fulfilling transparency obligations, as discussed in subsequent articles, must be considered.
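For illustration only, the sketch below shows one way such documentation might be structured internally, assuming a Python-based compliance tooling setup. The `ProcessingActivityRecord` class, its field names, and the example values are hypothetical and do not reflect an official GDPR template or any authority's required format.

```python
from dataclasses import dataclass

@dataclass
class ProcessingActivityRecord:
    """Hypothetical entry in a record of processing activities for a
    generative AI use case; illustrative only, not an official template."""
    use_case: str                    # the business scenario in which the AI tool is used
    processing_step: str             # e.g. entering prompts, processing output, training
    categories_of_data: list[str]    # categories of personal data involved
    legal_basis: str                 # e.g. consent (Art. 6(1)(a)) or legitimate interest (Art. 6(1)(f))
    balancing_test_documented: bool  # relevant where legitimate interest is relied upon
    dpia_required: bool              # whether a data protection impact assessment is needed

# Example entry for employees entering personal data as prompts (assumed values).
record = ProcessingActivityRecord(
    use_case="Employees use a generative AI tool to summarize customer emails",
    processing_step="User enters input data (prompts) containing personal data",
    categories_of_data=["name", "contact details", "correspondence content"],
    legal_basis="Art. 6(1)(f) GDPR - legitimate interest",
    balancing_test_documented=True,
    dpia_required=True,
)
print(f"{record.processing_step} -> {record.legal_basis}")
```

The point of such a structure is simply that each processing step and its chosen legal basis are recorded explicitly, including whether a balancing test has been performed and whether a data protection impact assessment is required.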
In April 2023, the Italian Garante informed OpenAI that the processing of personal data for training the ChatGPT algorithm could not be based on the legal basis of contract fulfillment (Article 6(1)(b) GDPR). Whether OpenAI can invoke a valid legal basis for processing for training purposes remains to be determined. Since consent under Article 6(1)(a) GDPR is only effective if it is informed and transparent, and AI applications are currently considered black boxes in this regard, OpenAI will likely rely on the legal basis of legitimate interest.
It will be intriguing to observe whether OpenAI can convincingly balance interests in this context. The CJEU recently confirmed that processing mass quantities of data bundled together from a multitude of different sources for the purpose of product development and improvement may, in principle, be based on a legitimate interest under Article 6(1)(f) GDPR (Meta vs. the German Cartel Office, C-252/21, 4 July 2023). However, the court also held that a strict necessity test must be applied: 1) there must be a legitimate interest of all controllers, 2) the processing must be strictly necessary, meaning that this interest cannot be achieved by other means (taking into account the principle of data minimization), and 3) the individuals' fundamental rights and freedoms must not outweigh the legitimate interests. The CJEU also confirmed in its Meta ruling that sensitive data in the public domain remains protected and requires prior consent for further processing. ChatGPT will therefore need to rely on filters to prevent the unauthorized processing of sensitive data.
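To make the idea of such filtering more concrete, the following is a minimal, purely illustrative sketch of a pre-processing filter that redacts likely special-category terms from a prompt before it reaches a model. The keyword patterns and the `redact_special_categories` function are assumptions made for the example; this is not OpenAI's actual mechanism, and real-world detection of sensitive data would require far more robust techniques than keyword matching.

```python
import re

# Hypothetical, highly simplified keyword patterns for GDPR Art. 9 special categories.
# A production system would need proper context-aware detection, not keyword lists.
SPECIAL_CATEGORY_PATTERNS = {
    "health data": r"\b(diagnos(is|ed)|cancer|diabetes|HIV)\b",
    "political opinions": r"\b(votes? for|party member(ship)?)\b",
    "religious beliefs": r"\b(catholic|muslim|jewish|buddhist)\b",
}

def redact_special_categories(prompt: str) -> str:
    """Replace likely special-category terms with a placeholder before the
    prompt is passed to a model. Illustrative only."""
    redacted = prompt
    for category, pattern in SPECIAL_CATEGORY_PATTERNS.items():
        redacted = re.sub(pattern, f"[REDACTED: {category}]", redacted, flags=re.IGNORECASE)
    return redacted

# Example usage with an assumed prompt containing special-category data.
print(redact_special_categories(
    "The customer was diagnosed with diabetes and votes for the opposition."
))
```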