Companies whose processing of personal data falls under the term "scientific research purposes" in the General Data Protection Regulation ("GDPR") are privileged under data protection law. For example, they may use their existing data stock (including sensitive data) for secondary purposes without having to comply with the otherwise applicable strict rules on changes of purpose. This privilege can be a particular boost to innovation for companies that want to develop innovative and competitive data-driven products for the European market through artificial intelligence/machine learning ("AI/ML").
So far, however, it is still largely unclear to what extent companies operating in the private sector in the European market may invoke the privilege of scientific research. The current legal situation, which is characterised by a high degree of legal uncertainty, has a strong inhibiting effect on innovation in practice. New guidelines announced by the European Data Protection Board ("EDPB") for 2022 are intended to shed light on this question. The significance of these guidelines is immense, as they are of considerable relevance for the data economy in Europe and thus especially for AI/ML-based product innovations. Not least, the considerable innovation potential of AI/ML could be better exploited.
The current legal uncertainty and the EDPB's hesitation to provide legal clarity are surprising. Although there is no legally binding definition of the term "scientific research" at European level, the legal framework of the GDPR, as well as European primary law, requires a broad interpretation of this term. The EDPB has also already stated that the term must be measured against its general meaning. This general meaning and the criteria that "scientific research" must fulfil have often been formulated elsewhere and are not particularly controversial.
Therefore, if the EDPB takes the legal framework seriously, it would only be logical for it to use the opportunity presented by these announced guidelines to specify unambiguously the criteria for the existence of scientific research purposes, in line with the intention of the European legislator and the general meaning of the term. As a result, companies would also be able to invoke the research privilege of the GDPR for their AI/ML-based product innovations if they comply with the generally accepted processes for scientific conduct. This would allow companies to benefit from this privilege through the right processes and methods, further promoting AI as a future technology for the benefit of all.
This year, the EDPB has a great opportunity to decisively promote the commercialisation of data through AI/ML for competitive product innovation. AI/ML is internationally recognised as a key driver of future growth, competitiveness and job creation. The impact of AI on productivity is so disruptive that lack of promotion of this technology or regulatory barriers could lead to the loss of significant market share or even drive certain companies' business models out of competition altogether.
The commercialisation of data is increasingly taking place by means of AI and ML. AI learns from large amounts of data (so-called big data) and recognises correlations that become the basis for new services and products. Data protection plays a key role in this context, as the large amount of (often personal) data required for training the algorithms must be processed in accordance with the requirements of the GDPR.
This applies in particular against the background of the rather strict interpretation of the European Court of Justice (“ECJ”) in its so-called Breyer ruling with regard to the question of when data is personal or anonymised (in which the ECJ recognised dynamic IP addresses as personal data, see ref. no.: C-582/14).
Personal data is therefore often an indispensable component on the way to AI/ML-based product innovations. In practice, however, this results in numerous obstacles in terms of data protection law, which are essentially based on unclear specifications for the secondary use of this data (i.e. the secondary use of data from an existing data stock for purposes other than those for which the data were originally collected).
One privilege for the secondary use of data that is regularly discussed in practice in the context of AI/ML-based product innovations, and is thus enormously important, is the processing of personal data for "scientific research purposes". Under the GDPR, "scientific research purposes" privilege the processing of personal data in many ways, e.g. by providing a justification for the processing of special categories of personal data (Art. 9(2)(j) GDPR) or by allowing the secondary use of personal data for different purposes without a separate compatibility assessment (Art. 6(4) in conjunction with Art. 5(1)(b) GDPR). Some member states have also used the opening clause of Art. 9(2)(j) GDPR to exempt data controllers processing personal data for scientific research purposes from complying with data subject rights (such as access and erasure) if compliance would seriously compromise the research purposes (e.g. Section 27(2) of the German Federal Data Protection Act; like Germany, most member states have made use of the opening clause in favour of privileging scientific research purposes in one way or another, see the European Commission study "Assessment of the EU Member States' rules on health data in the light of GDPR" for further details).
So far, the scope of the notion of "scientific research purposes" is largely unclear. Neither the GDPR nor the EDPB nor its predecessor, the Article 29 Working Party, has yet worked out the exact criteria according to which "scientific research purposes" would be fulfilled. It is unclear how far the research privilege extends, in particular whether a distinction should be made between (to be privileged) non-commercial, public research on the one hand and (not to be privileged) research by commercially operating companies on the other. For example, it is argued that only third-party funded research and not privately funded research is covered (see e.g. the renowned German commentary on data protection law by Simitis/Hornung/Spiecker gen. Döhmann, Art. 9 GDPR).
Furthermore, it is unclear what in detail constitutes a (privileged) activity that would fall under scientific research by private companies. It is predominantly argued that in the age of big data and data mining, not every analysis of data can be covered as research within the meaning of the GDPR. However, a demarcation between privileged research purposes and non-privileged other purposes has not yet been made.
In practice, this has led to many companies behaving in a risk-averse manner, not using or not fully using the opportunities for AI/ML that theoretically open up as a result of this privilege. Yet any AI is only as good as its training data, and the use of incomplete data also impairs the qualitative and thus competitive development of AI.
The EDPB seems to be aware of the uncertainties in the interpretation of "scientific research purposes". In a document dated 2 February 2021 responding to a request from the European Commission, the EDPB announced that the scope of the scientific research exemption and the safeguards under Article 89(1) of the GDPR require further clarification. According to the EDPB, such clarification is to be provided in the future guidelines on the processing of personal data for scientific research purposes, which are currently in preparation (see para. 42 of that document). However, a public consultation on this was already closed in April 2021.
Measured against the legal framework resulting from the GDPR itself and European primary law in the form of the TFEU, the EDPB must seize the opportunity presented to it and clarify in principle that commercial research by private companies is also privileged if these companies adhere to scientific processes (methodology, systematics, need for evidence, verifiability, openness to criticism, willingness to revise) and thus want to gain new knowledge in a targeted manner. The latter is always fulfilled by the intention to develop innovative and competitive data-driven products and services for the European market. The former can be fulfilled if the approach to the development is designed in accordance with the process steps mentioned and in line with scientific procedures.
There is no legally binding definition of the term "scientific research" at European level. However, when interpreting it, the EDPB must consider that, according to recital 159 of the GDPR, scientific research should be interpreted broadly to include "processing for technological development and demonstration, fundamental research, applied research and privately funded research". Furthermore, according to recital 159 of the GDPR, it should take into account the objective of creating a European area of research as laid down in Article 179(1) TFEU. The reference to Article 179(1) TFEU, which as primary law is decisive (as it takes precedence!) for the interpretation of the GDPR, makes it clear that commercial research also falls under the concept of scientific research. According to Article 179(1) TFEU and its usual interpretation, research - as a component of science - means the methodical generation of new knowledge. According to the wording, industrial research is also included. This is in line with the objective of Art. 179 TFEU to enable broad research funding without restrictions that hinder innovation (see e.g. the commentary on EU law by Groeben, von der/Schwarze, Art. 179 TFEU).
In conclusion, the term privileges those who wish to generate new knowledge in the above sense, irrespective of whether they are a public or a commercially operating enterprise. However, this does not answer the question of whether private companies can invoke the research privilege of the GDPR for product innovations based on AI/ML. The decisive factor is whether the specific purpose pursued, in this case product innovations based on AI/ML, is to be regarded as a scientific research purpose. In this regard, the EDPB has at least stated in its Guidelines 5/2020 that the term must be measured against its general meaning (but without further details). As described, however, a legally binding definition of the term "scientific research" does not yet exist at the European level. Nevertheless, the general criteria for scientific research have often been formulated elsewhere, e.g. in the context of constitutional rulings by various courts (for example, the Federal Constitutional Court in Germany). This of course has an influence on how the "general meaning" of this term must be understood. According to unanimous opinion, the following two elements are particularly important:
With this approach, it could be established on the one hand that not every simple analysis and processing of data can claim the privilege of "scientific research". This corresponds to the proposal put forward by the European Parliament in the negotiations on the GDPR that, against the backdrop of big data analyses and data mining, the privilege should be explicitly restricted to research purposes in the narrower sense, so that not every analysis and processing of data can claim the privilege as "science". On the other hand, this interpretation based on the general meaning of the term would also mean that private companies pursuing commercial purposes could invoke the research privilege of the GDPR for product innovations based on AI/ML if they comply with the generally recognised processes for scientific procedures.
In this context, it would be extremely important and conducive to innovation for companies if the EDPB clearly circumscribed this fundamentally very broad scope of application in its announced guidelines. If it takes the legal framework seriously, the EDPB cannot avoid doing so, because the prerequisites emerge automatically from the framework determined by the GDPR itself and from the EDPB's own position that the term "scientific research" must be interpreted according to the standard of its general meaning.