AI training and copyright – how does the new Finnish Copyright Act deal with using copyright-protected works to train AI?

On 27 February 2023, the Finnish Parliament approved amendments to the Finnish Copyright Act (404/1961) and the Finnish Electronic Communications Services Act (917/2014). This is the most significant reform of the Copyright Act in Finland in 20 years. The new Copyright Act implements the much-debated Directive on copyright and related rights in the Digital Single Market (the so-called DSM Directive, (EU) 2019/790) and the Directive on the exercise of copyright and related rights applicable to certain online transmissions of broadcasting organizations and retransmissions of television and radio programs (Directive (EU) 2019/789).

The main goal of both the new Finnish Copyright Act and the DSM Directive is to modernize copyright legislation so that it reflects the realities of today's society. Among the most significant and challenging of these is undoubtedly artificial intelligence (AI), which has taken major steps forward in recent years. Does the new Copyright Act bring new answers to copyright issues related to AI applications? Since AI technology will be crucial for companies in the future, businesses should consider how the legislative reforms may change the rules concerning AI.

This article examines how the new Copyright Act affects the legal evaluation of AI training. Copyright issues related to AI applications were discussed at a general level in my previous article.

New copyright exception for text and data mining in Section 13 b of the Copyright Act

The most significant legal challenges related to AI-assisted content generation include the risk of copyright infringement during the training of AI applications. Content generation with AI is often based on machine learning. During machine learning, algorithms are trained to look for patterns and correlations in large sets of data. This is also how the widely popular and publicly available ChatGPT works. With the help of machine learning, AI applications learn to make decisions and predictions and to separate relevant information from irrelevant information.

In terms of copyright, problems arise when the training data includes material protected by copyright or related rights.

The question of whether using copyright-protected works for machine learning purposes is allowed has long been unclear in our legal system. For example, in Finland, there has not been a copyright exception in force that would specifically allow the copying of works for the purpose of training AI applications. The situation has not changed with the new Copyright Act either, as no such explicit exception was included in it. However, the new Copyright Act may clarify the legal evaluation of machine learning to some extent, as its Section 13 b contains a provision on text and data mining, which implements the text and data mining Articles of the DSM Directive.

According to Section 13 b (1) of the new Copyright Act, anyone who has lawful access to a work is allowed to make copies of it for the purpose of text and data mining and to keep those copies solely for that purpose, unless the author has expressly and appropriately reserved this right. In other words, text and data mining is now generally permitted, unless the author has expressly prohibited it.

Application to the training of AI unclear

At first glance, the new provision appears to significantly clarify the legal situation. However, it does not yet offer a shortcut to success, as it is unclear whether the new Section 13 b applies to the training of AI applications. The DSM Directive does not address the question, as it does not mention AI or machine learning. Similarly, the government proposal on the new Copyright Act does not take a position on whether the provisions are applicable to the operation of AI.

If the definition of text and data mining is interpreted narrowly, it could be considered to apply only to activities intended to produce analytics. Traditionally, text and data mining has been defined as "an automatic analysis technique designed to produce information, trends, or correlations." Under this interpretation, the operation of AI would be deemed to fall outside the scope of the provision on text and data mining. After all, the aim of training AI applications is often to produce content rather than merely analytics.

However, it is also possible that text and data mining will be given a broader interpretation in legal practice, in which case Section 13 b could also apply to the training of artificial intelligence. As the situation is currently unclear, we must wait for clarifying decisions from the courts or more detailed guidance from authorities.

If the new provision on text and data mining were considered to apply to the training process of AI as well, the reform could be considered positive for AI developers: in the future, data mining would be permitted unless expressly prohibited. The provision can be seen as promoting free competition and freedom of trade in a situation where a work is used only indirectly, i.e., a copy of the work is made solely for the purpose of analyzing information using technical methods.

According to the DSM Directive, the author's reservation of this right should be made expressly and in an appropriate manner. This could be, for example, a clause in the terms of subscription to a content service or a clause in machine-readable form. It will be interesting to see how such reservations are actually made in practice. Some international artist portfolio platforms already use a metadata tag "NoAi", which can be added to the metadata of works. In the future, the range of different methods is likely to expand.
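To illustrate what a machine-readable reservation could look like from the perspective of someone collecting training data, the following is a minimal sketch in Python. It assumes, purely for illustration, that the reservation is expressed as a "noai" directive inside an HTML robots meta tag; neither the Copyright Act nor the DSM Directive prescribes any particular tag or format, and the function name `has_mining_reservation` is hypothetical.

```python
from html.parser import HTMLParser

# Assumption: the reservation is expressed as a directive inside a
# <meta name="robots" content="..."> tag. The directive name "noai" is
# illustrative only; no vocabulary is prescribed by the Act or the Directive.
RESERVATION_DIRECTIVES = {"noai"}


class _RobotsMetaParser(HTMLParser):
    """Collects the comma-separated directives of every robots meta tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def has_mining_reservation(html_text: str) -> bool:
    """Return True if the page carries one of the assumed reservation directives."""
    parser = _RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return bool(parser.directives & RESERVATION_DIRECTIVES)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="NoAi, noindex"></head></html>'
    if has_mining_reservation(page):
        print("Reservation found: skip this work when collecting data.")
    else:
        print("No reservation detected by this check.")
```

A check of this kind only detects one assumed signal. Whether any given tag amounts to an express and appropriate reservation in the legal sense, and whether the absence of a tag means that no reservation has been made elsewhere (for example in terms of use), remains a legal question that code cannot answer.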

Future perspectives

As stated above, the new Copyright Act does not significantly clarify copyright issues related to artificial intelligence. It does not contain any specific exception for training AI, although the provision on text and data mining comes fairly close. Since the applicability of that provision to AI is still uncertain, the question of whether training AI on copyright-protected works is allowed remains unclear.

If this article raises questions or you would like more information on the topic, our experts Sofia Wang and Henri Kaikkonen would be very happy to discuss it with you.
