8 Points about AI Development Agreements that can be learned from the “Contract Guidance on Utilization of AI and Data”

6. Know-how

Developing AI software requires various kinds of know-how. Examples include know-how about how to acquire and select raw data, how to process it into a training dataset, how to train efficiently using a training program, and how to set up the trained model in the production environment.

(1) Whether or not it is covered by intellectual property rights
Since know-how is intangible information, it cannot be the object of a copyright; however, if know-how meets the requirements for an “invention”, it may be the object of patent rights. Know-how is also protected if it qualifies as a trade secret (Unfair Competition Prevention Act, Article 2, Paragraph 6) or as limited provision data (Revised Unfair Competition Prevention Act, Article 2, Paragraph 7).

(2) Who has what rights under the default rules (i.e., a legal rule)?
Since know-how that does not fall under the trade secret category or the invention category involves no intellectual property rights, no one holds any rights [to such know-how]. As such, both the user and the vendor have no choice but to stipulate in a contract who can use the know-how and in what manner.

7. Summary

Summarizing the above gives something like the chart below. (Postscript added.)

Note:
By the way, the Ministry of Economy, Trade and Industry’s “Enhanced Environment for Open Data Distribution Structures” (August 29, 2016, METI, hereafter the “METI material”) has a similar chart pertaining to AI-related intellectual property rights (P. 82, hereafter the “Enhanced Environment Chart”), so I would like to briefly explain the relationship between the summary chart above and the Enhanced Environment Chart. (Mr. Takuji Hashizume, thank you for pointing this out!!)

First, whether raw data involves intellectual property rights depends on its type: certain kinds of data (for example, mechanical operating data, sensor data, and factual data) involve no intellectual property rights, while copyrightable data (such as photographs, voices, images, and novels) involves copyrights. I have therefore revised the (raw) data entry of the summary chart as it first appeared so that it is consistent with the Enhanced Environment Chart on this point.
Next, in the summary chart and the Enhanced Environment Chart, practically the same items are mentioned with respect to “training dataset”.
In addition, if you look at page 78 in the METI material, I think that the “learning” in the Enhanced Environment Chart probably refers to the “training program” in the summary chart.
Further, judging from page 79 of the METI material, which indicates that “a trained model is an enumeration of numbers calculated by a computer (matrix, etc.)…”, I think that the “trained model” in the Enhanced Environment Chart refers to the “trained parameter” in the summary chart. (Obviously, this is not an issue of whether the summary chart or the Enhanced Environment Chart is correct; rather, as explained before, it is an issue of how “trained model” is defined.)
Moreover, although it is noted in the Enhanced Environment Chart (the trained parameter section in the summary chart) that the trained model may be protected under the Patent Act and Copyright Act, I personally believe that this would be very difficult for the reason given earlier (i.e., [it is] “a large string of numerical values automatically generated by the training program”).
Finally, judging from the explanation on page 75 of the METI material, “incorporate the model in the application and use it as software”, I believe that the “use” in the Enhanced Environment Chart probably refers to the “trained model” (an inference program that incorporates trained parameters) in the summary chart. (This is confusing, isn’t it?)

Reference:
“Enhanced Environment for Open Data Distribution Structures” (August 29, 2016, METI)

3. Know how to craft contract provisions that benefit your own company (without being particular about the “ownership of intellectual property rights”, prioritize the “terms of use”)

By now, you understand the legal default rules for the 6 objects in AI development.
The next important matter is how to craft contract provisions that benefit your company premised on those default rules.

Typical Deadlocked Pattern
In AI development agreements, a typical deadlocked pattern concerning rights and intellectual property is shown below.

[User’s Position]
The training dataset and the trained model are generated using raw data that is filled with our know-how and secrets, and we pay a subcontracting fee for the development. We have certain rights, don’t we?
[Vendor’s Position]
No, you cannot generate a trained model with only raw data. A high-performing model is possible largely thanks to our advanced know-how and intensive labor in both the preprocessing of the data and the training of the model. We have certain rights, don’t we?

How should we think about this?
This type of confrontation stems mainly from each party’s position that the deliverables belong to it; in other words, from each party’s insistence that the rights to the deliverables should belong to its own company.
So, as long as both the vendor and the user keep insisting on “who has the rights” (ownership of intellectual property rights) without finding common ground, the negotiations will take a tremendous amount of effort and time, and both companies will ultimately lose competitiveness. In fact, because the business structures of the party providing the data (the user) and the party generating the trained model (the vendor) are different, there should be more contractual provisions that can meet the needs of both parties at once than either party initially thought.
For this reason, the AI Guidelines suggest handling the “ownership of intellectual property rights” and the “terms of use” separately and setting flexible conditions.
For example, with respect to the trained model, it might be possible to conclude an agreement that benefits both parties by (1) allowing the vendor to own the rights (ownership of intellectual property rights) and then (2) prohibiting the vendor from using the trained model for anything other than a certain purpose, or for a competitive purpose for a fixed period after development, while allowing the user free use of the trained model (terms of use).
In other words, the idea is for both parties to prioritize the “terms of use” of the objects over the “ownership of intellectual property rights”.
To give an extreme example, even if your company holds no rights relating to the trained model and the counter-party owns all of the rights, if, as a result of the negotiations, a provision can be included in the terms of use allowing the free use of the model without any restriction, including providing such model to third parties, this would be virtually the same as holding the rights to the model. (Of course, this is not exactly the same as holding the rights to the model, given issues such as whether the rights can be assigned and whether the terms of use can be asserted against a new rights holder if the rights are transferred.)

Concrete Approach
In this way, if one considers the “ownership of intellectual property rights” and the “terms of use” separately, this logically leads to fixing both the “ownership of intellectual property rights” and the “terms of use” for each of the 6 objects mentioned. Incidentally, the ownership of the intellectual property rights for raw data is not mentioned in the chart below because such raw data does not give rise to any intellectual property rights under the current laws, so stipulating how it may be used directly in the “terms of use” is sufficient. (Of course, for data that does give rise to intellectual property rights, such as copyrighted works, the ownership of those rights will be an issue.)

In reality, there are probably many simpler patterns. The software development agreement for the development phase (model contract) appended to the AI Guidelines contains provisions for ownership of intellectual property rights in Article 16 and Article 17 and provisions for the terms of use in Article 13 and Article 18.