03 | 2023
Forum – Zur Diskussion | A discuter

Damian Hartmann

Text and Data Mining and Copyright in Switzerland and the European Union

Die Schweiz und die Europäische Union haben neue Urheberrechtsschranken geschaffen, um den Einsatz von Text und Data Mining vor urheberrechtlichen Hindernissen zu bewahren. In diesem Beitrag werden die neuen urheberrechtlichen Schrankenbestimmungen beleuchtet sowie die Situation in der Schweiz mit der Situation in der Europäischen Union verglichen.

La Suisse et l’Union européenne ont imposé de nouvelles restrictions au droit d’auteur afin de préserver l’utilisation du Text and Data Mining des obstacles liés au droit d’auteur. Le présent article examine ces nouvelles restrictions et compare la situation en Suisse à celle de l’Union européenne.

Both Switzerland and the European Union have enacted new copyright exceptions to shield the use of text and data mining from copyright obstacles. This paper examines these new exceptions and compares the situation in Switzerland with the situation in the European Union.

Damian Hartmann, M.A. HSG in Law and Economics, Trainee Lawyer, St. Gallen.

Table of contents

I. Introduction

1. Problem background

2. Structure

II. Text and data mining

1. Definitions

2. How it works

3. Connections to copyright

III. Legal situation in Switzerland

1. Copyright protection

2. Relevant copyright barriers for text and data mining

IV. Legal situation in the European Union

1. Copyright protection

2. Relevant copyright barriers for text and data mining

V. Controversies and comparison of text and data mining provisions

1. Controversies in Switzerland

2. Controversies in the European Union

3. Comparison

VI. Conclusion

I. Introduction

«Text and Data Mining (TDM) serves as an essential tool to navigate the endless sea of online information, in search of this invaluable treasure that big data might hold for the European economy.»

1. Problem background

Text and data mining (TDM) is a highly topical subject. Worldwide, researchers in the fields of medicine, biology, pharmaceuticals, anthropology and criminology, as well as companies in banking, forestry, fashion and marketing, are exploiting the potential of diverse TDM applications to discover new knowledge in, and profit from, the immense amount of text and data in our data-driven world. This potential and the associated high value of TDM are illustrated by the opening quote.

One of the biggest challenges that TDM users face is complying with copyright in the text, data and databases used, which may be affected by TDM processes. Due to legal uncertainties in the copyright field, various legislators have decided to introduce a new exception into copyright law. In the European Union, this was done with the enactment of the Digital Single Market Directive (Directive 2019/790/EU) and in Switzerland with the introduction of a new provision into the national Copyright Act (CopA). Both the EU and the Swiss rules are still recent, so – as far as can be seen – the courts have not yet had the opportunity to rule on these new provisions.

The aim of this article is to discuss and evaluate the new TDM provisions in Switzerland and the EU in a comparative legal manner, after providing an overview of the copyright issues associated with TDM applications in Switzerland and the EU.

Because of its various points of contact, TDM can also be placed within broader subject areas, for example artificial intelligence, machine learning, big data or (open) innovation. These subject areas are not addressed here, since this article focuses specifically on TDM applications. From a legal point of view, TDM is relevant to several areas of law, such as copyright law, data protection law, contract law and unfair competition law. However, this article is limited to issues arising from copyright law.

2. Structure

This article is structured as follows: It starts with the topic of text and data mining (II). Various definitions of TDM are presented, followed by a description of how TDM works. In particular, the three steps in which TDM usually takes place are shown (access to content, extraction and copying of data, and pre-processing and discovery of hidden knowledge). In addition, the points of connection relevant to copyright law are presented. In a subsequent step, the legal situation in Switzerland is examined (III). Here, copyright protection of text and databases is presented and the Swiss copyright barriers relevant for TDM are highlighted. After that, the paper deals with the legal situation in the European Union (IV). Again, copyright protection in general and for databases in particular is shown, followed by a presentation of the European copyright limitations and exceptions relevant for TDM. Finally, the controversies surrounding the Swiss and the European TDM provisions are discussed, and a comparison is drawn between the Swiss TDM provision and the two European TDM provisions (V). The focus is on Art. 24d CopA for Switzerland and on Arts. 3 and 4 of Directive 2019/790/EU for the European Union.

II. Text and data mining

1. Definitions

Data mining originates from computer science, where it became established in the late 1980s. From a computer science perspective, data mining is defined as «a set of mechanisms and techniques, realized in software, to extract hidden information from data». Text mining is a type of data mining in which the data mining activity is applied to text. Data mining is based on statistics and machine learning technology. Therefore, data mining is not a separate technology, but rather an application of machine learning and statistics.

Somewhat less technical than the definition from a computer science perspective is that of the Swiss Federal Council. According to the Federal Council, TDM is a research tool for the electronic evaluation of large amounts of text and data, whereby cross-references in the large amounts of text and data can be quickly identified by using various statistical and mathematical methods. A legally binding definition of TDM does not exist in Swiss copyright law.

A similar description of TDM forms the basis of the Digital Single Market Directive. According to Recital 8 of the aforementioned directive, TDM comprises new processes with which information available in digital form, such as text and data, can be automatically evaluated using computers. In this way, large amounts of information can be processed and findings can be obtained. According to the binding definition of the EU, TDM means any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations.

2. How it works

A broad spectrum of different TDM techniques and methods exists. Accordingly, there are different ways to represent the different steps of TDM. For the purposes pursued here, it is convenient to divide the TDM process as follows: (a) accessing data and text; (b) extracting and copying data and text; and (c) discovering knowledge. These three steps are briefly outlined below.

a) Accessing text and data

In a first step, access to text and data is needed. Text and data can derive from different sources. One possibility is to use «crawling» to automatically search websites and create individual data collections. It is possible to use unstructured data from public websites or social media, but structured data from databases of professional database operators also come into consideration. In addition to such individually compiled text and data collections, existing databases can also be used for TDM. The core activity of TDM takes place only in the following second and third steps. Nevertheless, access to text and data is a prerequisite for TDM to operate at all.

b) Extracting and copying of text and data

As soon as a data set becomes available, the actual TDM process begins. In order for the collected data or the database to be analyzed, the data in their entirety or at least portions thereof must be copied or extracted and stored permanently or temporarily, depending on the specific TDM technique. The extracted or copied collection of data and text used for a TDM analysis is called a «corpus». There are TDM applications in which neither data nor words are copied or extracted, and TDM applications where only a marginal part, e.g. two to three words, is copied. However, such TDM applications tend to be rare. Usually, a TDM application relies on some type of copying.

c) Pre-processing and discovering hidden knowledge

Depending on what kind of data has been compiled and what goal is being pursued through the TDM analysis, the data must be prepared for the analysis. For instance, it may be necessary to first structure the data, delete superfluous data, or correct erroneous data (so-called «data clean-up»). Furthermore, depending on the TDM technique used, text must be transformed into a format suitable for computers (so-called «normalization»). These actions represent the preparatory actions for the subsequent analysis.

Once these steps have been completed, the data are ready for analysis. Depending on the TDM application, different analysis results are created. Typical results of a TDM analysis are pattern findings, clustering and classification or summaries. However, TDM not only allows the identification of previously unknown relationships and patterns, but also statements about future developments.
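The three steps described above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical in-memory corpus in place of crawled or licensed content, and it uses simple term counts and document-level co-occurrences as a stand-in for the far more elaborate statistical methods that real TDM applications employ. All function names, the sample texts and the stop-word list are illustrative assumptions, not part of any specific TDM tool.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical corpus: stands in for the copies extracted in step (b).
corpus = [
    "Aspirin reduces fever. Aspirin may cause stomach irritation!",
    "Fever is a common symptom;   ibuprofen also reduces FEVER.",
    "",  # an empty record that data clean-up should drop
]

STOPWORDS = {"a", "is", "may", "also", "the"}  # illustrative, not a standard list


def clean(documents):
    """Data clean-up: drop empty records and collapse whitespace."""
    return [re.sub(r"\s+", " ", d).strip() for d in documents if d.strip()]


def normalize(document):
    """Normalization: lower-case, tokenize, and remove stop words."""
    tokens = re.findall(r"[a-z]+", document.lower())
    return [t for t in tokens if t not in STOPWORDS]


def term_frequencies(token_lists):
    """Count how often each term occurs across the whole corpus."""
    return Counter(t for tokens in token_lists for t in tokens)


def cooccurrences(token_lists):
    """Count in how many documents each pair of terms appears together."""
    pairs = Counter()
    for tokens in token_lists:
        pairs.update(combinations(sorted(set(tokens)), 2))
    return pairs


# Step (c): pre-processing followed by a very simple analysis.
token_lists = [normalize(d) for d in clean(corpus)]
frequencies = term_frequencies(token_lists)
pairs = cooccurrences(token_lists)

print(frequencies.most_common(3))
print(pairs[("fever", "reduces")])
```

Even in this toy example, the texts are held in memory in full before any result is produced, which is the kind of copying on which the copyright analysis in the following sections turns.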

3. Connections to copyright

From a copyright point of view, two central sets of questions arise with regard to TDM, which concern the object of protection on the one hand and the content of copyright on the other hand.

Regarding the object of protection, data and text used for TDM must first be qualified under copyright law. TDM is only relevant with respect to copyright law if the underlying data and text enjoy copyright protection at all. Since there is no internationally harmonized world copyright law, the copyright protection of the data and text actually used for TDM is determined by the applicable national copyright law. In determining the material protected by copyright, the question of the copyright protection of databases is of particular interest. Of particular importance here are the requirements of the TRIPS Agreement regarding copyright protection of data collections, since both Switzerland and all EU Member States are parties to the TRIPS Agreement.

If copyright-protected data and text are used during TDM, the question that arises in the area of the content of copyright is to what extent TDM interferes with the rights of the copyright holder. The focus here is on the temporary and permanent copies that are regularly produced during the TDM process.

In the following two sections, these two sets of questions are examined for Switzerland and the European Union.

III. Legal situation in Switzerland

1. Copyright protection

a) Text

The Swiss Copyright Act protects all literary and artistic works that are intellectual creations and have an individual character. Art. 2 CopA contains a non-exhaustive list of categories of works, which includes in particular (scientific) linguistic works. A linguistic work exists if a conceptual content is expressed by means of language. According to case law, the requirements concerning the criterion of individual character are low for linguistic works. In the case of scientific linguistic works, the individual character may result not only from the text, but also from the selection, division and arrangement of the scientific material. Single words are usually part of common speech, which is why they lack individuality and are accordingly not protected by copyright, except in exceptional cases. The same applies to short word sequences and to the title of a copyrighted work. In general, text used as a basis for text mining is thus capable of being protected by copyright.

b) Databases

The copyright protection of a database itself (not of the data contained therein) is governed by the regulation on collected works. A more extensive sui generis protection right for databases, as exists in the EU, does not exist in Swiss copyright law. According to the regulation on collected works, collections are protected in their own right, provided that they are intellectual creations with an individual character with regard to selection or arrangement. The protection of works included in a collected work is reserved. This means that databases can obtain copyright protection via the provision of collected works, regardless of the copyright protection of the underlying data.

Accordingly, databases which are pure compilations of freely available data, such as a database with the addresses of all doctors in a certain region, are not protectable because neither the selection nor the arrangement has an individual character. In general, such databases, which are designed to be complete, are not protectable. On the other hand, protection is affirmed for databases that compile statistically significant groups of persons for representative market surveys or clinical trials.

Compared to the requirements for data compilations under the TRIPS Agreement, the Swiss regulation on collected works takes the same approach. To the extent that databases are used for analysis in TDM, this means that not only the individual works may be protected by copyright, but additional and independent copyright protection may exist at the level of the database itself.

c) Reproduction

The author has the exclusive right to decide whether, when and how his or her work is used. In particular, the author has the right to produce copies of the work. Although the CopA, which is still somewhat antiquated, uses the term «produce copies» and thus assumes that copyrighted works are physically copied, reproduction as a copyright-relevant act must also be observed in the digital world. Thus, both the permanent and the temporary electronic saving of data on a computer constitute a reproduction within the meaning of Art. 10(2)(a) CopA, which is generally subject to the consent of the copyright holder.

Since most TDM applications make copies, as mentioned above, TDM would be dependent on the consent of all copyright holders concerned.

2. Relevant copyright barriers for text and data mining

a) Temporary copies

Due to the broad concept of «reproduction», the internet would not be usable in a meaningful way if certain limits were not placed on copyright. One of these limits is Art. 24a CopA, which applies to temporary copies. According to this provision, the making of temporary copies of a work is permitted if: (i) they are transient or incidental; (ii) they represent an integral and essential part of a technological process; (iii) their sole purpose is to enable a transmission of the work in a network between third parties by an intermediary or a lawful use of the work; and (iv) they have no independent economic significance. These conditions must be met cumulatively for the exception of Art. 24a CopA to apply.
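The cumulative nature of the four conditions can be made explicit in a short sketch. It is merely a schematic illustration of the logical structure (a conjunction of four requirements) and not a tool for legal assessment; the class and field names are hypothetical labels chosen here for conditions (i) to (iv).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporaryCopy:
    # Hypothetical labels for conditions (i) to (iv) of Art. 24a CopA.
    transient_or_incidental: bool
    integral_part_of_technical_process: bool
    enables_transmission_or_lawful_use: bool
    no_independent_economic_significance: bool


def covered_by_art_24a(copy: TemporaryCopy) -> bool:
    """Return True only if all four conditions are met (cumulative test)."""
    return all((
        copy.transient_or_incidental,
        copy.integral_part_of_technical_process,
        copy.enables_transmission_or_lawful_use,
        copy.no_independent_economic_significance,
    ))


# A browser-cache style copy that satisfies every condition.
cache_copy = TemporaryCopy(True, True, True, True)
# A corpus stored permanently fails condition (i), so the test fails as a whole.
stored_corpus = TemporaryCopy(False, True, True, True)

print(covered_by_art_24a(cache_copy))
print(covered_by_art_24a(stored_corpus))
```

The second case shows why a single unmet condition is enough to take a copy outside the exception, which is the situation of permanent copies.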

Art. 24a CopA predominantly addresses processes such as browsing and caching, but is also relevant in the context of TDM. If temporary copies are created during TDM – which is the case during the actual analysis process – they are covered by Art. 24a CopA; however, the aforementioned provision does not permit the creation of permanent copies, which is why the provision does not cover all mechanisms of reproduction in the various forms of TDM applications.

b) Private personal use

According to Art. 19(1)(a) CopA, published works may be used for private use, whereby private use means any personal use of a work or use within a circle of persons closely connected to each other, such as relatives or friends. The exception for private personal use does not depend on the lawfulness of access to the works, so that works made accessible unlawfully can also be used for private personal use as long as they have been published. Moreover, private personal use covers all types of use of the work.

With regard to TDM, under Art. 19(1)(a) CopA, a private researcher can thus create both a permanent data set and the permanent and temporary copies that arise during the TDM process. This means that if TDM is conducted as private research in the sense of Art. 19(1)(a) CopA, it can be carried out without infringing copyright. It should be noted, however, that private researchers must perform all actions themselves. If the work is outsourced to a third party, a restriction already applies in that the third party may not make a complete or extensive reproduction of commercially available works. However, TDM often takes place with the help of a third party and is regularly not limited to a private group of persons within the meaning of Art. 19(1)(a) CopA. Furthermore, the exception in Art. 19(1)(a) CopA is not sufficient to justify commercial TDM and is applicable exclusively to natural persons. Thus, in practice, many TDM activities do not fall under the Art. 19(1)(a) CopA exception.

c) Use of works for the purpose of scientific research

It has been shown that the existing barriers of copyright law do not completely cover the various TDM applications, so that there is a residual risk that copyright infringements are committed by means of TDM. TDM has great significance in practice and is indispensable in a modern, data-driven economy. Furthermore, since research, an area where TDM frequently occurs, is of great importance for Switzerland, the existing uncertainties were to be eliminated by a new provision in the CopA – the so-called science barrier. To this end, a new provision, Art. 24d CopA, entered into force on 1 April 2020.

According to Art. 24d CopA it is permissible for the purposes of scientific research to reproduce a work if the copying is due to the use of a technical process and if the works to be copied can be lawfully accessed. Upon conclusion of the scientific research, the copies made in accordance with Art. 24d CopA may be retained for archiving and backup purposes. The new exception in Art. 24d CopA supplements the already existing copyright barriers, which is why it applies cumulatively to the latter. The individual elements of the science barrier will be explained in the following paragraphs.

First, it should be noted that Art. 24d CopA does not permit all uses of a work within the meaning of Art. 10 CopA, but only reproduction. However, the right of reproduction is comprehensive, so that a complete work may also be reproduced. The reproductions may be stored for archiving and backup purposes and thus do not have to be destroyed. This storage option is useful because it allows research results to be reviewed and the research project to be repeated.

The core of Art. 24d CopA lies in the purpose limitation to scientific research. The reproduction of copyrighted works is only permitted if it is done for the purpose of scientific research. According to the Federal Council, scientific research is understood to be the systematic search for new knowledge within and across different scientific disciplines. It includes both basic and applied research. In the case of a research project that serves not only scientific but also other purposes, scientific research must remain the main purpose of the project for Art. 24d CopA to apply. No distinction is made between commercial and non-commercial research because the distinction would be too difficult to make. Therefore, Art. 24d CopA is intended to allow the reproduction of a work for the purpose of scientific research, regardless of who is conducting the research and how that research is funded.

Furthermore, the reproduction must be the result of a technical process. Although this technology-neutral formulation has been chosen, it is clear that Art. 24d CopA is primarily directed at TDM processes.

Finally, there must be lawful access to the copyrighted works. This is to be understood – analogous to the copyright exception for private personal use – as meaning that the user of the work has lawfully obtained access to the works, but not that the copy of the work used was lawfully created.

IV. Legal situation in the European Union

1. Copyright protection

a) General

In principle, each EU Member State has its own copyright laws. In detail, these differ considerably, not least because some Member States follow the common law copyright system and others the continental European author’s right system. Although a uniform copyright law at Union level would be useful for the creation of a single market, the considerable differences that exist make it impossible. Nevertheless, the copyright laws of the EU Member States are harmonized on a selective basis by a total of four regulations and twelve directives. These EU legal acts are of great importance for the global development of copyright law since they are binding for a large number of different countries.

In relation to TDM, this means that the copyright protection of text and data is assessed from the perspective of the respective applicable national copyright laws. On the European level, the following three legal acts are in the foreground regarding TDM: Directive 96/9/EC («Database Directive»), Directive 2001/29/EC («Information Society Directive») and the above-mentioned Directive 2019/790/EU («Digital Single Market Directive»).

b) Databases

Databases are protected in the EU in two ways. On the one hand, databases are protected as such by copyright law – in accordance with the TRIPS Agreement – provided that they constitute an intellectual creation by virtue of the selection or arrangement of their contents. On the other hand, the EU grants a sui generis right to the producer of a database, by which the extraction and further use of the database or parts thereof can be prohibited, provided that a substantial investment was required in producing the database. For both the copyright in databases and the sui generis right, Member States may provide an exception for scientific research, as long as the source is indicated and a non-commercial purpose is pursued.

TDM can thus encompass actions that interfere with both the copyright of databases and the sui generis right. The focus here is again on the reproductions that occur during the TDM process. Due to the various manifestations of TDM, it remains unclear to what extent TDM interferes with the copyright of databases and the sui generis right and thus requires the consent of the right holder, including consideration of the respective exception provisions.

2. Relevant copyright barriers for text and data mining

a) Temporary copies in the Information Society Directive

According to the Information Society Directive, the author has the right to authorize or prohibit reproductions and partial reproductions in any form. Exceptions to this are temporary, technically necessary reproductions that occur during transmission in a network or during legitimate use and have no economic value of their own.

Similar to the situation with Art. 24a CopA in Switzerland, the exception for temporary copies in the Information Society Directive is often insufficient in cases of TDM.

b) Exceptions and limitations in the Digital Single Market Directive

In the EU, the legal uncertainty surrounding TDM was also seen as a threat to the competitive position of the Union. A new exception has therefore been created in the area of scientific research («text and data mining for the purposes of scientific research»). Since TDM can also play an important role for private and public actors outside of scientific research, e.g. to make complex decisions which require the analysis of large amounts of data or to develop new technologies, the Digital Single Market Directive also creates an exception for private and public entities («exception or limitation for text and data mining»). This exception is intended to encourage innovation in the private sector. Both exceptions were to be implemented by Member States by 7 June 2021.

The provision on TDM for the purposes of scientific research provides that research organizations and cultural heritage institutions may reproduce and extract works and databases that are protected by the Database Directive and the Information Society Directive and to which they have lawful access, in order to conduct TDM for scientific research purposes. This exception cannot be limited by contract. Scientific research includes both the natural sciences and the humanities. Universities, higher education institutions, libraries and research institutes are considered research organizations, regardless of their legal form. However, research organizations must either not be profit-oriented or act pursuant to a state-recognized mission, which is characterized, for example, by state funding or a public law contract. Conversely, organizations are not considered to be research organizations if they are subject to the determining influence of commercial enterprises, e.g. through structural circumstances such as the enterprises' capacity as shareholders, which could thereby obtain preferential access to the research results. Lawful access exists for content that is made available on an open access basis, content for whose access there is a contractual basis between the research institution and the right holders, and content that is freely available on the internet.

The provision on the exception or limitation for TDM provides that, for reproductions and extractions of works and databases protected by the Database Directive and the Information Society Directive made in the context of TDM, an exception or limitation is made to the rights granted in those Directives, provided that lawful access to the works exists and right holders do not place machine-readable conditions of use on their works. Here, lawful access is understood to mean that content published on the internet is considered lawfully accessible as long as the right holder does not prohibit TDM. A reservation may be made by the right holder by machine-readable means, by contractual agreement or by unilateral declaration.
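How a TDM user might check for a machine-readable reservation before mining a website can be sketched as follows. The sketch assumes, purely for illustration, that the right holder expresses the reservation in a robots.txt file; whether such a file is a sufficient reservation in the legal sense is not addressed here. The bot name `example-tdm-bot`, the URLs and the helper `may_mine` are hypothetical. The parsing itself relies on `urllib.robotparser` from the Python standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a right holder's website: the named TDM bot
# is excluded from /articles/, all other user agents are allowed everywhere.
ROBOTS_TXT = """\
User-agent: example-tdm-bot
Disallow: /articles/

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules do not exclude this agent from the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from a string, no network access
    return parser.can_fetch(user_agent, url)


print(may_mine(ROBOTS_TXT, "example-tdm-bot", "https://example.org/articles/1"))
print(may_mine(ROBOTS_TXT, "example-tdm-bot", "https://example.org/about"))
print(may_mine(ROBOTS_TXT, "other-bot", "https://example.org/articles/1"))
```

A check of this kind only detects reservations made by machine-readable means; reservations made by contract or unilateral declaration would have to be identified separately.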

Obviously, the two exception provisions can overlap. Therefore, it is provided that the exception provision «TDM for the purposes of scientific research» (Art. 3 of Directive 2019/790/EU) is lex specialis to the exception provision «exception or limitation for TDM» (Art. 4 of Directive 2019/790/EU).

V. Controversies and comparison of text and data mining provisions

1. Controversies in Switzerland

The Swiss TDM provision was only slightly controversial in Switzerland, in contrast to the other copyright law amendments addressed in the Message from November 2017. Nevertheless, some differences of opinion between the various stakeholders emerged during the consultation process, which can be summarized as follows. Consumers fully agreed with the draft. Among producers and culture creators, some welcomed the draft on the condition that the publication of copyrighted works would not be allowed. Others opposed the draft outright or supported it only provided there would be no licensing option available for TDM users. Remuneration was supported by producers and culture creators. The users and the cantons agreed with the draft but criticized the remuneration obligation – which was still provided for in the draft but was ultimately deleted – because this would have led to multiple remuneration. Some users and cantons also wanted the TDM provision not to be limited by the criterion of scientific purpose. Opinions were divided among copyright holders. Some supported the TDM provision or even wanted to extend it, some demanded a limitation of the TDM provision with the criterion of licensing possibility, some supported the limitation with the criterion of scientific purpose, and some rejected the TDM provision altogether. The majority of the political parties supported the TDM provision but rejected the remuneration obligation. Some parties wanted to waive the scientific purpose criterion.

A picture typical of copyright protection emerges here. While consumers and users argue for an exception to copyright, culture creators and producers are in favour of maintaining far-reaching copyright protection. This is intuitive, since copyright holders want to exploit their exclusive right to the greatest extent possible and therefore seek strong copyright protection, while users and consumers are interested in the most unhindered use of works possible. With Art. 24d CopA, a compromise was reached: the needs of researchers were met by the TDM exception, while the restriction to the scientific field was made in favour of copyright holders. Ultimately, it can be said that the winners are researchers and copyright holders, while private individuals and companies using TDM for non-scientific purposes are the losers.

In Swiss doctrine, the TDM provision (Art. 24d CopA) in its final form has been welcomed. Despite the praise, however, there is also a central criticism of the new TDM provision: the question arises whether it was really necessary to insert the criterion of scientific purpose, since the TDM provision already requires lawful access to copyrighted works. The requirement of lawful access means that in many cases copyright holders are already compensated for access to the work (e.g. through a fee for the use of a database or an electronic library of a scientific publisher). From an economic point of view, it would therefore also have been possible not to tie Art. 24d CopA to the condition of scientific purpose, because access is already compensated in most cases and therefore there is no economic interest in maintaining copyright protection outside of scientific research (related to TDM). However, waiving the criterion of scientific purpose would obviously have gone too far in view of the various interest groups involved, which is why this criterion was retained. In this respect, Art. 24d CopA cannot be considered the result of a purely economic analysis, but rather of a political weighing of interests.

Ultimately, in view of the various interests involved in the legislative process and the legally sound design of Art. 24d CopA, the Swiss TDM provision can be described as successful.

2. Controversies in the European Union

The design of the European TDM provisions gave rise to criticism. First, however, the positive aspects should be considered. One of these positive aspects is the harmonizing effect of Art. 3 of Directive 2019/790/EU. Since this exception is mandatory, it creates a uniform regulation for researchers in the EU and thus promotes large, cross-border and Europe-wide research projects. Moreover, it is emphasized in particular that Art. 3 of Directive 2019/790/EU cannot be overridden by contractual agreements, but is absolutely mandatory. This ensures that countless consents are not required for the application of TDM, which was a core concern of the TDM discussion.

On the other hand, the narrow scope of Art. 3 of Directive 2019/790/EU has been criticized. This provision is limited twice: once by the criterion of scientific purpose, and once by the limitation of the exception to research organizations. Apart from the fact that these two criteria give rise to various delimitation issues, the narrow scope of application means that the EU cannot keep up with innovation-promoting countries that work with the flexible instrument of «fair use». Although the EU has created an additional exception with a broader scope of application in Art. 4 of Directive 2019/790/EU, this exception is weak if not ineffective because it allows contractual TDM prohibitions.

Furthermore, it is feared that the two TDM provisions of the EU will lead to further fragmentation, since the demarcation between Arts. 3 and 4 of Directive 2019/790/EU is not always easy. Fragmentation, however, is precisely what European copyright law seeks to avoid. A major problem of European copyright law has always been that the various exceptions and limitations to copyright are structured in different ways. Some are mandatory, some are voluntary; some apply to all protected works (horizontal exceptions), some only to certain works (vertical exceptions); and some can be overridden by contract, while others take precedence over contractual agreements. Mandatory exceptions that cannot be overridden by contract are therefore desirable, as they help to achieve harmonization. However, this has been taken into consideration only in Art. 3 of Directive 2019/790/EU, not in Art. 4.

3. Comparison

The Swiss TDM provision (Art. 24d CopA) differs in several ways from the two European TDM provisions (Arts. 3 and 4 of Directive 2019/790/EU). This is because Switzerland decided, for once, not to wait for the EU regulation, but to create an exception for TDM as quickly as possible and independently of the EU.

Unlike the EU, Switzerland refrains from reserving TDM for research institutions only and thus avoids complicated delimitation issues. Although in Switzerland TDM must also be performed for a scientific purpose, this refers to the method and not the institution that performs TDM. The scope of Art. 3 of Directive 2019/790/EU, on the other hand, depends on the narrow definition of research organizations. While the EU has created an additional TDM provision in Art. 4 of Directive 2019/790/EU that is not limited to research organizations or research for scientific purposes, this provision is severely limited by giving copyright holders the ability to contractually prohibit TDM. This ability of copyright holders to prohibit TDM can be viewed both positively and negatively. On the positive side, thanks to this opt-out solution, the criterion of scientific purpose can be waived, thus making TDM available for non-scientific research from a copyright perspective. The right to prohibit TDM represents, so to speak, the price paid for extending the TDM exception to the non-scientific field. However, the possibility for copyright holders to prohibit TDM can also be seen as a weighty disadvantage compared to the Swiss TDM provision, because a prohibition right has an effect similar to that of a consent requirement. Consent requirements, however, were precisely what was intended to be abolished in the context of TDM. In Switzerland, by contrast, even if copyright holders prohibit TDM by contract (e.g. in the General Terms and Conditions), this prohibition is not enforceable due to the mandatory nature of Art. 24d CopA. However, authors are free to provide their works with technical protection measures that prevent TDM, since there is no obligation to make works available in a TDM-friendly manner. Paradoxically, this means that the owner of a database with non-copyrighted content can enforce a contractual TDM prohibition, whereas a copyright owner cannot because of Art. 24d CopA, which can leave the former better off.

Furthermore, it is positive that Switzerland has chosen a technology-neutral formulation in Art. 24d CopA. This is important because, due to technological developments, it seems restrictive to permit copies for the purpose of scientific research only for TDM and not for other technologies. The TDM provisions of the EU, on the other hand, explicitly speak of text and data mining. This is seen as a disadvantage not only because it does not include other technologies but also because it legally establishes that TDM is a copyright-relevant action. This is not self-evident because, as mentioned at the beginning, not every TDM application is relevant from a copyright law perspective. It is true that there is now no longer any legal uncertainty as to whether or not TDM is covered by the already existing exceptions. However, the question now arises whether TDM applications, which would not have raised copyright issues before the new TDM provisions came into force, now also fall under the new TDM provisions. New legal uncertainty has thus been created especially in the area of non-copyrighted content.

Besides, the Swiss TDM provision only mentions the reproduction of works, whereas the TDM provisions of the EU additionally explicitly mention extraction. However, this difference is not relevant insofar as the concept of extraction primarily addresses the sui generis right for databases, which does not exist in Switzerland.

Finally, both the Swiss and the European TDM provisions are a step in the right direction, as they solve copyright problems in the field of scientific research. The attempt of the EU to create a TDM exception outside of scientific research with Art. 4 of Directive 2019/790/EU was also progressive. Since this attempt can be considered a failure due to the possibility of a contractual prohibition, the Swiss TDM solution can be considered more successful overall. This is because the scope of Art. 24d CopA is broader than that of Art. 3 of Directive 2019/790/EU. For cross-border research projects, this means that the Swiss TDM regulation represents a smaller hurdle than the European TDM regulation and is therefore more attractive for innovation. Although this may appear to be an advantage from a Swiss perspective, identical TDM regulations in Switzerland and the EU would have been desirable to avoid regional differences.

VI. Conclusion

TDM can be defined as a variety of electronic applications that can be used to analyze large amounts of text and data and discover hidden knowledge. Although the processes of the various TDM applications differ, TDM usually takes place in three steps: in the first step, access to text and data (from various sources) is obtained; in the second step, the text and data are extracted or copied and a corpus is created; in the third step, the content of this corpus is prepared and finally analyzed. What is relevant in terms of copyright is, on the one hand, that protected works may be used and, on the other hand, that temporary and permanent copies of the content may be created during the TDM process.

In Switzerland, text and data in the form of intellectual works of individual character are protected by copyright. Databases are protected in accordance with the TRIPS Agreement provision on data collections. Additional protection exists for databases in the EU with the sui generis right. In Switzerland as well as in the EU, there are various exception and limitation provisions that partially cover TDM. However, only the newly introduced TDM provisions – Art. 24d CopA for Switzerland and Arts. 3 and 4 of Directive 2019/790/EU for the EU – are able to fully cover TDM.

Since the Swiss and the European TDM provisions were enacted independently of each other, there are some differences. Due to the various interests in the legislative process, in Switzerland the scope of the TDM exception was limited by the criterion of scientific purpose. The EU has opted for an even narrower scope of the TDM exception in the area of scientific research by applying the provision only to certain non-profit or governmental research organizations and cultural heritage institutions. While the second European TDM exception, which applies outside of scientific research, was a promising and innovative idea, it has been weakened by the possibility of excluding it by contract. Overall, therefore, the Swiss TDM regulation seems to have been more successful than the European one, at least from the perspective of promoting innovation.

The future will tell to what extent the Swiss and European TDM provisions prove their worth. Looking further ahead, it is worth asking whether isolated exceptions in copyright law are still appropriate for new technologies of the digital world. If one wants to offer the most innovative environment possible, copyright obstacles must be prevented. Perhaps one must even ask the fundamental dogmatic question of the extent to which automatic copyright protection in the digital environment still makes sense. At the same time, the legitimate interests of copyright holders must not be overlooked.

Zusammenfassung

Text und Data Mining wird eingesetzt, um aus grossen Datenmengen neue Erkenntnisse zu gewinnen. Hierbei bestanden regelmässig urheberrechtliche Hindernisse, denen mit den herkömmlichen Urheberrechtsschranken nicht genügend begegnet werden konnte. Die neu geschaffenen Urheberrechtsschranken für Text und Data Mining in der Schweiz und der Europäischen Union sollen hier Abhilfe schaffen. Der Vergleich der schweizerischen und europäischen Urheberrechtsschranken für Text und Data Mining zeigt, dass erhebliche Unterschiede bei der urheberrechtlichen Handhabung von Text und Data Mining bestehen.

Résumé

Le Text and Data Mining est utilisé pour obtenir de nouvelles connaissances à partir de grandes quantités de données. Des obstacles liés au droit d’auteur, qui ne pouvaient pas être suffisamment contrés par les exceptions traditionnelles au droit d’auteur, se dressaient régulièrement face à cette pratique. Les nouvelles restrictions du droit d’auteur lors de l’utilisation du Text and Data Mining en Suisse et dans l’Union européenne devraient y remédier. La comparaison des exceptions au droit d’auteur prévues en droit suisse et en droit européen dans ce contexte montre qu’il existe des différences considérables dans le traitement du droit d’auteur dans le cadre du Text and Data Mining.

Summary

Text and data mining is employed to acquire new insights from large amounts of data. In this technique, copyright obstacles often exist which cannot be sufficiently mitigated by conventional copyright limitations. The newly created copyright limitations for text and data mining in Switzerland and the European Union are intended to remedy this situation. A comparison of the Swiss and European copyright limitations for text and data mining shows that there are significant differences in dealing with copyright in the context of text and data mining.