Long-awaited German judgment by the District Court of Hamburg (Kneschke v. LAION) on the text and data mining exception(s)

Written By

Dr. Simon Hembt

Associate
Germany

Senior associate for IP, Copyright, and Industry Regulation – Specialising in Artificial Intelligence, Digital Media, and Games.

Dr. Niels Lutzhöft, LL.M.

Partner
Germany

I am a strategic advisor and litigator for life science and digital media companies – stepping in at the crossroads of IP and sector regulation.

Toby Bond

Partner
UK

I'm a partner in our Intellectual Property Group. Having studied physics at university, I'm fascinated by technology and the way in which it is reshaping our world.

What was the case about?

A photographer (“Plaintiff”) filed a lawsuit against LAION, a nonprofit organisation that aims to promote research in the field of Artificial Intelligence (“AI”) by providing open datasets for training purposes. LAION's work resulted in a dataset consisting of nearly six billion image-text pairs. One of these images belonged to the Plaintiff, who had uploaded his picture to a stock photo site. The terms of use of this stock photo site, however, stated that images may not be used for “automated programs.” LAION used the Plaintiff’s image from this site and included it in the training dataset. The Plaintiff claimed copyright infringement, arguing that none of the limitations to copyright (such as those for text and data mining, “TDM”) applied.

What did the court decide and why is it significant?

The court dismissed the claim, recognising that LAION could benefit from a limitation to copyright under Sec. 60d German Copyright Act (Urheberrechtsgesetz, “UrhG”) / Art. 3 Dir. (EU) 2019/790 (Digital Single Market Directive, “DSM-Dir.”). This exception covers TDM for scientific research purposes.

While the court was not required to decide on the general application of the commercial TDM exception under Sec. 44b UrhG / Art. 4 DSM-Dir., its comments on the scope of that exception will be of great interest to those developing AI systems in the EU, especially as compliance with reservations of rights under Art. 4 DSM-Dir. has been put front and center for general-purpose AI model developers by Art. 53(1)(c) of the EU AI Act.

However, it remains to be seen whether the Plaintiff will appeal this judgment and whether this line of reasoning will be solidified in further rulings (especially before the CJEU).

What are the key takeaways?

Application of the TDM exception for scientific purposes (Sec. 60d UrhG; see Art. 3 DSM-Dir. and Art. 5(3)(a) Dir. 2001/29/EC, “InfoSoc-Dir.”)

  • The court dismissed the Plaintiff's argument that the mere creation of a dataset is not yet associated with any gain in scientific insight and therefore would not qualify as a reproduction for scientific purposes. The court found that creating a dataset as the basis for training AI systems can be considered scientific research, as it is a fundamental step aimed at using the dataset for future knowledge generation.

  • The court also rejected the Plaintiff's allegation that LAION pursues commercial purposes, which would exclude the exception under Sec. 60d(2) No. 1 UrhG. The Plaintiff had argued that commercially operating companies were using the datasets. The court found no commercial activity within the meaning of Sec. 60d(2) No. 1 UrhG in the case at hand, as the dataset itself is made freely available to the public. The fact that the dataset is also used by commercially active companies is irrelevant for the assessment under Sec. 60d(2) UrhG.

Analysis of other exceptions relevant to AI training

The court also discussed the temporary copies and commercial TDM copyright exceptions in detail:

  • Temporary copies: Sec. 44a UrhG (Art. 5(1) InfoSoc-Dir.): This provision allows temporary and incidental reproductions that are necessary for the functioning of a technical process. However, the court held that the copies were not temporary in this case, as the deletion of the data was not independent of the user (that is, an automatic consequence of the technical process itself) but rather the result of deliberate programming of the analysis process.

  • Commercial TDM: Sec. 44b UrhG (Art. 4 DSM-Dir.):

    • Applicability of Sec. 44b UrhG: The court found that the reproduction of images within the described process fell under Sec. 44b UrhG. Some voices in the legal literature have argued that Sec. 44b UrhG should not extend to training processes that extract the content of intellectual creations. The court was not convinced by this argument, referring in particular to Art. 53(1)(c) of the AI Act, which requires general-purpose AI providers (“GPAI Providers”) to put in place a policy that takes right holders’ opt-outs into account. If this obligation to consider opt-outs applies to GPAI Providers, then the EU legislator must have assumed that Art. 4 DSM-Dir. also applies to AI training on datasets with intellectual content, as is typical for GPAI Providers. Furthermore, applying the three-step test in Art. 5(5) InfoSoc-Dir. does not lead to a different result. According to this test, the stipulated exceptions may only be applied in certain special cases where the normal exploitation of the work is not impaired and the legitimate interests of the rights holder are not unreasonably prejudiced. The court considers these requirements to be met, as the reproduction is limited to the purpose of analyzing the files, and any potential later creation of competing AI-generated works does not justify treating the reproduction itself as a fundamental infringement of reproduction rights.

    • Publicly Accessible Requirement (Sec. 44b UrhG): The court found that a preview image with a watermark made available online in a stock image library satisfies the 'publicly accessible' requirement (and may therefore be subject to Sec. 44b UrhG), even if access to the full-quality image without a watermark requires a licensing agreement.

    • Machine-Readability of Opt-Outs (Sec. 44b UrhG): While not necessary to reach its main conclusion, the court suggested a generous approach to machine-readability, finding that an opt-out in natural language would have been sufficient in this case to disapply the commercial TDM exception. However, the court also indicated that this is not a general rule which will always apply; it depends on the case at hand and in particular on the state of technical development at the time the work is used. The court justified its position by referring to Art. 53(1)(c) AI Act, which states that Art. 4(3) DSM-Dir. should be observed using state-of-the-art technology, including the use of AI. The objection that machine-readability should be interpreted more narrowly (for instance, in line with Dir. (EU) 2019/1024 (“PSI-Dir.”), under which natural language would likely not be sufficient) was dismissed by the court, which noted the different objectives of the two directives. Essentially, the court considered it inconsistent for Sec. 44b UrhG to enable the development of increasingly powerful models on the one hand while, on the other hand, not requiring the use of AI to detect opt-outs within the scope of its limitation (i.e., the opt-out).
