Evolving Paradise for Machine Learning—Revisions to the Copyright Act Further Accelerate Development of Japan’s AI
Revised Copyright Act Article 30-4
How will it change?
In response to these issues, the current Article 47-7 has been expanded, made more
flexible, and has become the new Article 30-4(ii).
Let’s start off by reading this provision.
“A work may be used in any of the following cases to the extent that is deemed necessary without any regards to the manner of such use, if the use is not intended for own enjoyment or for enjoyment of others of ideas or emotions expressed in the work. This shall not, however, apply to the cases where in view of the type of the work and the use and manner of the use, the interests of the copyright holder are unduly harmed.”
“Article 30-4(ii) when a work is used for information analysis (which means extracting information concerning language, sound, image, etc. from a large volume of information consisting of a large number of works, etc., and carrying out analysis such as comparison and classification. This definition is also applicable to Article 47-5(1)(ii).)” (Unofficial translation)
(Remainder is omitted.)
The chart below summarizes the major points revised from the current Article 47-7 to the new Article 30-4(ii).
Based on these revisions, the new Article 30-4 will apply to, and render legal, all three “legally impermissible acts to which Article 47-7 does not apply” that were introduced earlier and are set forth below.
1. An act of preparing a training dataset for others who generate models, and selling it to an unspecified number of third parties or disclosing it on the web, instead of generating a model yourself
Example: A situation where a training dataset for generating an image recognition model is created and sold by copying a large volume of image data available on the web or provided to the public by the right holder.
2. An act of a business operator that created a training dataset and generated a model on its own, selling the training dataset used in the model generation to an unspecified large number of third parties or making it available on the web at no charge
Example: A situation where a business operator that generated an image generation model sells the training dataset used in that model generation together with that model as a set.
3. An act of sharing a training dataset among a consortium consisting of specific business operators
Example: A situation where a business operator that generates an automatic translation engine utilizing deep learning shares, within the consortium, a translation corpus generated by collecting a large volume of natural language data from the web.
Further, the new Article 47-5(1)(ii) provides for “information analysis by computer and provision of its
Therefore, when engaging in a business that generates datasets and sells them to the public, the use of copyrighted products incorporated in such datasets is permitted to a certain extent (for example, uploading a small amount of the data on the sales website). However, since the permitted use here is restricted to “minimal use” (“minimal matters in light of the proportion of the portion of the copyrighted products provided for public use, the volume of the portion provided for such use, the indicia of accuracy when provided for such use, and other factors”), the copyrighted product may not be used without restriction.
Issues that will certainly pose business problems after enforcement of the revised Copyright Act
As just described, the transfer and provision to the public of training datasets will be permitted under
the new Article 30-4 from and after January 1, 2019. Following this, I believe that the practical question
will be precisely which kinds of cases are covered by the proviso in Article 30-4 (“This shall not,
however, apply to the cases where in view of the type of the work and the use and manner of the use, the
interests of the copyright holder are unduly harmed.” (unofficial translation)).
First, below are two cases in which the application of the proviso is certain.
Acts falling under the proviso of the current Article 47-7 are not permitted
First, acts falling under the proviso of the current Article 47-7, which involve the use of “database works compiled for use by persons who carry out data analyses”, fall under the proviso of new Article 30-4 and are impermissible (statement of Deputy Commissioner for Cultural Affairs Nagaoka in the House of Representatives Committee on Education, Culture, Sports, Science and Technology (April 6, 2018)).
Lawful acts to which the current Article 47-7 applies will not become unlawful
Second, judging from the Diet’s supplemental resolution, I believe that lawful acts to which the current
Article 47-7 applies will not become unlawful.
Apart from the cases above, in what kinds of situations would the interests of the copyright holder generally be considered “unduly harmed”?
Let’s explore and think about this a bit more.
With respect to this issue, it is important to understand the structure of the Copyright Act and the concepts behind the “update on flexible prescribed limitations on rights” among these revisions to the Copyright Act.
Conceptual background for the “update on flexible prescribed limitations on rights” among these revisions to the Copyright Act
Three-layer structure of the prescribed limitation on rights
The provisions of new Article 30-4 and new Article 47-5 that I introduced above relate to, among several
themes of these revisions to the Copyright Act, the theme of “update on flexible prescribed
limitations on rights that correspond to the evolution of digitalization and the network”.
Since “prescribed limitations on rights” is a system that restricts a copyright holder’s rights to a certain extent, these new provisions update, in short, the rules on “in what situations a copyrighted product can be used without the consent of the copyright holder”.
The revised Copyright Act codifies the suggestions made in the 2017 report. That report divides the prescribed limitations on rights into 3 layers and considers it appropriate to provide, for each of these 3 layers, updated provisions that secure adequate flexibility.
First, when you classify the provisions of the current law by these 3 layers, it looks like this.