China Institute for Socio-Legal Studies, Shanghai Jiao Tong University

2023-08-28, Shen Weixing


On the Hierarchy of the Data Property Rights System: The "Three-Three System" Method of Data Rights Confirmation

Author: Shen Weixing

Professor, Tsinghua University School of Law

Abstract: Data rights confirmation has become the biggest institutional obstacle to the rapid development of China's digital economy. Data rights are difficult to confirm because existing discussions conflate concepts and think in flat terms. The design of a data property rights system should instead adopt hierarchical thinking based on the concept of order, decoupling raw data from data applications through horizontal and vertical layering. Horizontally, at the three levels of object, subject, and content, it separates data from information, the data source from the data processor, and the source's ownership from the processor's usufruct. Vertically, following the data generation cycle, it divides the process into three stages: collection of data resources, processing and utilization of data collections, and management of data products. While respecting the data source's initial ownership of the data, it takes the enterprise's data usufruct as the base right and builds, for data collection, data processing and utilization, and data product transactions respectively, a three-stage hierarchical pattern of rights confirmation: the right to hold data resources, the right to process and use data, and the right to operate data products.

Introduction

With the development of modern society, the factors of production are no longer limited to land, labor, and capital. In today's digital economy in particular, data is the core factor of production. In October 2019, the Fourth Plenary Session of the Nineteenth Central Committee of the Communist Party of China proposed to "improve the mechanism of production factors such as labor, capital, land, knowledge, technology, management, data, etc., which are evaluated by the market and determined according to the contribution," thereby listing data among the factors of production. As data plays an ever more important role as a factor of production, data rights confirmation has become a pressing issue. The central level has made forward-looking arrangements for constructing data property rights. In April 2020, the Central Committee of the Communist Party of China and the State Council issued the programmatic document "Opinions on Constructing a More Complete System and Mechanism for the Market-oriented Allocation of Factors", which proposed the policy goal of "improving the nature of property rights based on the nature of data".

Since then, the central government has reiterated and refined this goal in many important documents. In February 2022, the Central Committee of the Communist Party of China and the State Council issued the "Opinions on Accelerating the Improvement of the Socialist Market Economic System in the New Era", reiterating the important goal of establishing a data property rights system and improving the definition of data ownership. On June 22, 2022, the 26th meeting of the Central Commission for Comprehensively Deepening Reform reviewed and approved the "Opinions on Building a Data Basic System to Better Play the Role of Data Elements" (hereinafter the "Twenty Articles on Data"); presiding over the meeting, General Secretary Xi Jinping emphasized the need to coordinate the advancement of data property rights, circulation and transactions, income distribution, and security governance, and to accelerate the construction of the basic data system. On December 2, 2022, the Central Committee of the Communist Party of China and the State Council officially released the "Twenty Articles on Data", which proposes to "establish a property rights operation mechanism that separates the right to hold data resources, the right to process and use data, and the right to operate data products." It can be seen that establishing a structurally separated data property rights system and improving the rules for allocating data ownership have reached a high degree of consensus at the central level and have become the logical starting point for constructing the entire basic data system.

However, existing academic research still diverges on whether data rights should be confirmed, what kind of property rights should be established, and how those rights should be allocated, and judicial decisions remain inconclusive. It can be said that data rights confirmation has become the biggest difficulty and bottleneck in the development of today's digital economy. The reason is that, in discussing data rights confirmation, the academic community has neither grasped the graded order among data, information, and privacy nor recognized the complex hierarchical structure inside data itself, and has thus fallen into a flat mode of thinking about rights confirmation. One specific manifestation is that, when people talk about data, they confuse it with privacy and personal information and draw the radical conclusion that data cannot be traded, which muddles the understanding of data rights confirmation. In fact, the formation of data is a chain involving multiple subjects. The generation process of "data resources-data collections-data products" involves many participants, including data collectors, storers, transmitters, processors, analysts, and users. These participants play different roles and make different contributions at different stages of data generation. If this process is compressed into a single point when discussing the allocation of property rights, it will be difficult to reflect the contributions of the different participants accurately, and the resulting distribution of property rights can hardly be fair. Therefore, the distinctive hierarchical structure of data and the plurality of subjects involved in forming it call for a change in thinking about data rights confirmation: traditional flat thinking is no longer appropriate, and a three-dimensional, hierarchical approach to data rights confirmation should be established.

A Shift in Thinking on Data Rights Confirmation: From Flat to Hierarchical

Constructing a logically rigorous and well-defined data hierarchy is an important prerequisite for data rights confirmation. The author believes that, within this system, data, information, and privacy need to be distinguished and decoupled horizontally, and a hierarchical pattern of flow and transition between data and its neighboring concepts needs to be established. Vertically, a three-level progressive logical chain, "data resources-data collections-data products", needs to be built on the linear character of the data generation process. The issue of data ownership needs to be embedded in this hierarchical system and discussed within it.

In the history of legal thought, Heck, a leading figure of the jurisprudence of interests, proposed three types of concepts in law: the concept of what ought to be (Sollensbegriff), the concept of what is (Seinsbegriff), and the concept of order (Ordnungsbegriff). Radbruch followed up and proposed hierarchical type thinking based on the concept of order. In 1938, Radbruch published Klassenbegriffe und Ordnungsbegriffe im Rechtsdenken (Class Concepts and Order Concepts in Legal Thinking), which introduced hierarchical thinking into the field of law for the first time. The paper examines the concept of classification and the concept of order separately. The concept of classification (Klassenbegriff, the traditional concept in the logical sense) sorts things by their characteristics. It is a separating mode of thought that distinguishes this thing from that thing, in which the boundaries between concepts are clear and mutually exclusive. This mode of thought "disintegrates and destroys the overall relevance of life" and ignores the fact that the boundaries of everything in life are blurred and fluid. The concept of order (Ordnungsbegriff) adopts type-based hierarchical thinking: a type has hierarchical character, the scope of one type can consist of several different levels, and those levels can flow and transition into one another and appear in a sequential arrangement.

The core feature of the concept of order is its hierarchy and the fluid transition between levels. As Larenz pointed out: "A type is not bounded by fixed boundaries compared with other comparable types; on the contrary, it seems to be fluid: through different shifts of emphasis and changes of character, it turns into another type." Based on hierarchical thinking, it is possible to build a conceptual system of flow and transition among privacy, information, and data. Specifically, personal data, as the carrier of personal information, is located at the symbol layer. Data in the broad sense can be divided into a symbol layer and a content layer, and data at the content layer enters the category of personal information. If personal information describes private facts, that private information enters the category of personal privacy at the factual layer. It can be seen that personal information is distinct from privacy, yet it flows upward, by way of private information, into privacy at the factual layer; at the same time, as the content layer, it connects downward to data at the symbol layer. In this way, privacy, personal information, and data constitute a system of order concepts. Only on the basis of this clarification of the relationship among privacy, personal information, and data as objects of rights can we go on to examine data rights themselves and build a graded structure for the system of rights in the digital age.

The concept of order attends to the different hierarchical relationships that exist within a single type. Applied to the definition of property rights, it means confirming, level by level, the different hierarchical forms of the same type of property right. Natural resources, for example, carry interests of the whole people, of the state or the collective, and of individuals. Yet much of the academic debate over the nature of state (collective) ownership of natural resources is confined to a flat logic of characterization, defining it simply as public power, as a private right, and so on. The criss-crossing relationships of rights over natural resources mean that state (collective) ownership is difficult to characterize in flat terms and should instead be placed within the hierarchy of natural resource rights and interpreted systematically. Across the entire chain of natural resource rights, ownership can be divided into four levels: state (collective) ownership of natural resources under the constitution, state (collective) ownership of natural resources under civil law, usufruct of natural resources under civil law, and ownership of natural resource products.

The rights over data are likewise deeply hierarchical, and the interests of different subjects differ from level to level. The nature and attribution of data property rights therefore need to be discussed within the chain "data resources-data collections-data products"; it is not appropriate to confirm data rights in a flat and isolated way. Data and data elements are two different concepts. Only when data is combined with labor such as collection, storage, processing, analysis, and application does it truly have value and play a role, and only then can it be called a factor of production. In the process of turning data into a factor of production, data passes from the collection of initial data resources, through cleaning, warehousing into standard products, and aggregation into data sets, and finally to the provision of data products or services. In this process the form of the data keeps changing and its value multiplies, tracing a complex process line. If the process is compressed into one point for flat confirmation of rights, a single point will inevitably be taken for the whole. Only by discussing data ownership level by level can data rights confirmation rest on a solid foundation in which logic and fact are unified.

To this end, it is necessary to build a hierarchical theory of data property rights that effectively solves the problem of data rights confirmation and strengthens both the theoretical logic and the practical guidance of any rights confirmation scheme. This theory can be developed along two dimensions, horizontal and vertical.

From a horizontal point of view, although data at each level has a different focus, all levels share common problems. First, when people talk about data, they often think of the information the data carries and then extend this to privacy; when different objects are mixed together for the grant of rights, logical entanglement is inevitable. Second, at different stages of the data life cycle, many data participants often work together. Faced with so many participants, what logic should determine the ownership of property rights and the corresponding legal status of each party, and how to construct an allocation of property rights that meets the reasonable demands of all parties, has become a challenge of our time. Horizontally, therefore, it is necessary to distinguish and decouple data from information and privacy, and to establish a hierarchical structure of flow and transition between data and its neighboring concepts. At the same time, for the many participants involved in data generation, data property rights should be structurally divided according to the source and degree of each subject's contribution, so that rights such as the right to hold data and the right to process and use data are reasonably distributed, thereby moving from a single property right to divided property rights.

From a vertical point of view, in the process of turning data into a factor of production, the form of data changes three times: raw data resources are generated through initial collection; these are processed into data collections; and data products are then developed and derived from them. This forms a three-level progressive value chain within data: "data resources-data collections-data products". At each link, whenever the form of the data element changes, the object, subject, and content of data property rights change with it, so ownership needs to be defined separately. The "right to hold data resources, right to process and use data, and right to operate data products" proposed in the "Twenty Articles on Data" in fact imply layered rights confirmation. In the data collection stage, authorization based on informed consent separates the data ownership of the user, as the data source, from the data usufruct of the enterprise, as the data processor; by virtue of that usufruct, the enterprise enjoys the right to hold the data resources of others that it has collected. After collection, the enterprise, on the basis of the usufruct already obtained, enjoys the right to process and use the data it holds, so that raw data resources can be processed into standardized data products such as data collections, which can then be priced and traded. Finally, on the basis of these standardized data products, the enterprise can develop a variety of further data products and enjoys the right to operate them.

Horizontal layering: three-layer separation of data property rights elements

At each of the three levels of "data resources-data collections-data products", the object of rights is easily confused in the process of rights confirmation, and a single property right structure is difficult to apply. It is therefore necessary to use the idea of rights division to separate the object, subject, and content of data property rights layer by layer, stripping away interfering elements unrelated to data rights confirmation while achieving a reasonable allocation of data property rights among multiple subjects.

The idea of rights division had already emerged in ancient Rome. Roman law divided real rights into rights in one's own property and rights in the property of another. The holder of a right in his own property could, as needed, transfer part of his powers within a certain scope and period and thereby create rights in that property for others. Because such rights are derived from the right in one's own property, and their content is limited by a specific purpose, they are also called derived or limited real rights. Discussion of the essential elements of ownership in modern theory likewise shows that the powers and functions of ownership are divisible. Based on the idea of rights division (Abspaltungsgedanke), German scholarship put forward the two concepts of source right (Quellrecht) and limited right (beschränktes Recht). The holder of a source right can split limited rights off from it and grant them to others. Taking ownership as an example, the holder can split off the right to use the thing and the right to realize its value, creating usufructuary rights and security interests, so that both the use value and the exchange value of the thing are realized and the thing is put to its best use. Chinese scholars also call this model of rights division the "mother right-child right" structure.

The idea of rights division is highly instructive for the construction of a data property rights system. Constructing data property rights first requires a strict distinction between information and data at the level of the object of rights, creating for individuals a personality right in personal information and ownership of personal data respectively, and then, following the idea of rights division, splitting the enterprise's data usufruct off from the latter.

1. Object of data property rights: separation of data and information

In constructing data property rights in private law, the object of those rights is particularly important. It is both the basis on which data property rights are established and one of their constituent elements. In existing research, the view that "the object of the right is the foundation of the establishment of the right" has become a basic consensus; it can be said that "the object is the constituent element and carrier of private rights". The construction of data property rights from the perspective of private law must therefore start by determining the category of their object. However, data and personal information are often used loosely in institutional norms, judicial decisions, and academic research. This not only makes it harder for courts to protect data property rights and reason about them, but also distorts the design of rights. For this reason, before constructing data property rights, it is necessary to distinguish data from personal information and to make clear that they have different legal characteristics and are different objects of rights.

The European Union's General Data Protection Regulation does not distinguish between data and personal information. Article 4(1) of the Regulation defines "personal data" as "any information relating to an identified or identifiable natural person ('data subject')". This undifferentiated treatment of information and data has had a great influence on Chinese scholarship. In discussing these issues, many Chinese scholars conflate information and data, treating personal information and personal data as the same thing under different names. This has produced a habit of mind in which any mention of data calls up personal information and, by extension, privacy. That way of thinking leads to a very radical conclusion: enterprises cannot control personal data, because controlling data means controlling personal information or privacy. This in turn yields the fallacy that personal data cannot be traded, which makes it difficult to establish a data property rights system.

It is true that data and personal information are so closely combined that they are difficult to tell apart. But just as molecules can be subdivided into atoms, and atoms into electrons, protons, and neutrons, however closely the parts are related it remains both necessary and possible to distinguish them. The difference between information and data is this: the function of information is to eliminate uncertainty in human cognition, so information embodies semantics and is located at the content layer, whereas data is the syntactic record of that content and is the carrier of information. Information and data are interdependent, and their relationship can be described as that between "orange flesh and orange peel". Orange peel is still different from orange flesh, and the two do not become indistinguishable merely because they are closely combined.

The distinction between data and information is not merely theoretical analysis or scholarly imagination; it also has a basis in law. At the normative level, the drafting of the "General Provisions of the Civil Law" already took the position of distinguishing information from data. Paragraph 2 of Article 108 of the "General Provisions of the Civil Law (Draft · First Review Draft)" used the general expression "data information" and defined it, together with works, patents, trademarks, and so on, as an object of intellectual property rights. The "General Provisions of the Civil Law (Draft · Second Review Draft)", however, promptly changed position and split "data information" into personal information and data, which were provided for in Articles 109 and 124 respectively. This clearly differentiated pattern was carried over into the General Provisions of the Civil Code: Article 111 of the Civil Code provides for personal information and Article 127 provides for data, which clearly shows the legislators' position of dividing information and data between personality rights and property rights.

On this basis, the definitions of personal information and data in current law further specify the difference between the two. Article 4, paragraph 1 of the Personal Information Protection Law and related provisions describe personal information as information "recorded electronically or in other ways", and Article 3, paragraph 1 of the Data Security Law defines data as "records of information electronically or otherwise." Accordingly, the relationship between data and information is that between the record and what is recorded or, more precisely, between form and content. This provides a solid normative basis for the conceptual distinction between information and data. In short, information is content, knowledge, and the like, and its function is to resolve uncertainty, whereas data is form, a carrier for expressing information. In a semiotic sense, information is at the semantic (content) level, while data is at the syntactic (symbol) level. The two thus sit at different levels and differ accordingly: personal information is located at the content level and is the object of a new type of personality right (or interest), whereas objectively existing personal data is located at the syntactic level and is the object of a property right. That data is the carrier of information is also clearly reflected in the "Twenty Articles on Data", for example in Article 4, "Public data that does not carry personal information and does not affect public safety", and Article 6, "Data that carries personal information".

Distinguishing data from personal information further clarifies what data property rights protect: data at the syntactic level, not information at the semantic level. Some Chinese scholars believe that the law protects property rights in data not in order to protect the data itself, but so that, by confirming rights in the data as carrier, the right holder can control the information carried on it. On this view, the right holder's rights in data are essentially a projection of their interests in specific information, and the content of data rights and the ways they are exercised should be constructed around the characteristics of that information. The German scholar Specht takes the opposite view: "The object of data property rights is data as a carrier at the syntactic level, not information at the semantic level." It follows that data rights cannot simply be equated with information rights. Although the data owner has the right to use the data as he wishes, the data property right does not give the right holder any rights over the content of the data (that is, the semantic information). Moreover, when other pre-existing rights attach to the semantic information, data property rights are restricted accordingly. The relationship between a work and a book offers a comparison. When the information contained in a book constitutes a work, the owner's ownership of the book is undoubtedly absolute and the owner can fully control the book, but use of the book by way of exhibition, rental, and the like is limited by the copyright in the work it carries. In the same way, when data carries personal information, care should be taken to protect that personal information when the personal data is circulated and used.

The analogy of a diary is also instructive. When a diary contains private matters, its owner has full ownership of it, but if someone else enjoys a right to privacy in the content recorded there, the owner may use the diary only in ways that do not violate that right. On the premise that personal information is protected, data processors can use technical means, such as the "available but invisible" privacy-preserving computation techniques of differential privacy and homomorphic encryption, to make use of personal data. If personal information has been completely anonymized, the data is no longer personal data, and data processors can naturally use it without restriction.
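
To make "available but invisible" more concrete, the following minimal sketch applies the Laplace mechanism, a standard differential privacy technique, to a counting query: the processor can publish an aggregate statistic while the noise masks whether any one individual's record is present. The record fields, function names, and the epsilon value are hypothetical and chosen only for illustration; they are not drawn from the article, from any statute, or from a particular library.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person's record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is used.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical personal data held by a processor (illustrative fields only).
    records = [{"user_id": i, "opted_in": i % 3 == 0} for i in range(300)]
    noisy = private_count(records, lambda r: r["opted_in"], epsilon=0.5, rng=rng)
    print(f"noisy count of opted-in users: {noisy:.1f}")
```

The design choice here is that only the noisy aggregate leaves the processor; the underlying records, and the personal information they carry, are never disclosed. Homomorphic encryption pursues the same goal by a different route and is not shown.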

Information and data have different functions and should be kept apart conceptually, but distinguishing them does not mean severing them: data and information remain two aspects of a single whole. Information is the source of data, and data can in turn generate new information content. The real value of data lies not in its external form as code but in the information content within it. Data and information are therefore bound up with each other and form a symbiotic unity. Given this close relationship, only on the basis of personal data ownership can the integrity of the personal information carried on that data be maintained and a protective cover be provided for the dignity of the person in the modern information society. As some scholars have said: "For the data containing personality rights and interests, it is obvious that the ownership cannot be attributed to the owner of the big data, which will make the individual's control and domination of personal data based on personality rights and interests fall into a passive situation." It can be said that personal data ownership is the result of the further development of the right to informational self-determination in the digital age and is the basis on which individual personality can develop freely in the digital world. Informational self-determination can be effectively guaranteed only when individuals have ownership of their data. On the one hand, the "Personal Information Protection Law" gives information subjects personal information rights, including the rights to correction, deletion, and portability. Although these rights refer to personal information at the semantic level, they can be realized only through personal data rights at the syntactic level. Taking the right to deletion as an example, only when individuals are given ownership of their personal data do they have a legitimate basis for requesting that enterprises delete data at the syntactic level. On the other hand, personal information rights run through the entire life cycle of personal data processing, and the personal information subject can continuously influence processing activities, and so holds a legal status similar to that of an owner. If this state of affairs is not recognized and the property rights in personal data are allocated to enterprises as processors, the result is likely to undermine personal information rights.

2. Subject of data property rights: separation of source and processor

Data is often formed by the joint action of many data participants. The production and use of data typically involve multiple stakeholders, such as users, the enterprises that originally hold the data, and third-party data enterprises. Faced with so many participants, what logic should determine the ownership of property rights and the legal status of each party, and how to construct an allocation of property rights that meets the reasonable demands of all parties, has become an urgent problem.

On the allocation of property rights in user data, judicial practice follows the logic of "whoever collects it, owns it." Many cases characterize the personal data collected by enterprises as the enterprises' trade secrets. Many academic opinions also support allocating property rights in personal data to enterprises. This position rests mainly on the labor empowerment theory, which justifies enterprise property rights in data by the relationship between labor and the value of data. First, users do not produce data consciously; data is only a "by-product" of users' online behavior, not a product of labor, so users do not necessarily have property rights in the data they generate. Second, the data of an individual user has almost no property value; user data acquires value only after it has been collected, analyzed, and processed by the enterprise. Since the property value in user data is created mainly by enterprises, the property rights in user data should belong to enterprises. On this view, the property rights in user data should be allocated to enterprises under the labor empowerment theory, and users enjoy only personality rights, such as personal information rights and interests, in the data they generate. This model of allocation is also said to be more conducive to efficient use of data.

From the perspective of comparative law, German scholarship also adopted the view of "whoever collects it, owns it" in the early days. In 1988, Welp first proposed the concept of the data producer (Skribent): whoever generates data through input or by running a program is the data owner. However, this theory ignores the cooperative relationships that may lie behind the generation of data, such as a principal and the party it commissions. If an employer assigns employees to operate equipment that generates data, the data producer should be the employer rather than the employee. Zech accordingly revised the data producer theory: whoever is economically or organizationally responsible for initiating the production of data is the data producer and data owner.

However, the data producer theory leans too heavily toward data enterprises and largely ignores the contributions of users. Over the entire life cycle of data, data originates from users' network access behavior. Because users are the originators of data, empowering users should be the starting point for allocating data rights. The German scholar Fezer likewise emphasizes that users' personal behavior is the source of data generation and that the original data so generated should be regarded as personal intangible property. Fezer interprets personal behavior broadly: whether data is actively generated by individuals or passively recorded, such as machine-generated or automatically recorded reflexive data (Reflexive Daten), it all counts as data generated by personal behavior. On this basis, Fezer offers a further outlook: "In the history of bourgeois society, the legitimacy of intangible property rights lies in the intellectual creation of individuals. In the digital information society of the 21st century, the legitimacy of data ownership rests on the fact that data is generated by the individual actions of citizens. This shows a clear historical line in the development of property rights: from ownership of movable or immovable tangible objects, to intellectual property rights arising from intellectual effort, and then to contemporary data ownership generated by digital behavior."

This source-oriented line of thought is also reflected in the United States. In hiQ v. LinkedIn, the United States Court of Appeals for the Ninth Circuit stated that the data on the LinkedIn platform is based on users' contributions and that users, not LinkedIn, should enjoy ownership of that data. The court therefore required LinkedIn, by way of a preliminary injunction, to remove its technical blocking of hiQ. The same idea led Tim Berners-Lee, the inventor of the World Wide Web, to propose the Solid project, which is based on distributed storage technology and allows users to control their personal data. From the perspective of improving overall social welfare, Bergelson also advocates allocating data rights to data subjects.

With the development of the digital economy, scholars have deepened their understanding of users' data-generating behavior, and digital labor theory has further justified the establishment of personal data ownership. In the digital age, the boundary between labor and entertainment has gradually dissolved. The network services that enterprises nominally provide for free actually become users' tools of labor, and the leisure time users spend on network services is also labor time in which they produce data for enterprises without compensation. Enterprises, meanwhile, actually control user data and aggregate it into a commodity that they sell at a profit. In this sense, there is a relationship of economic exploitation between enterprises and users: the surplus value and profits created by users' unpaid labor are monopolized by enterprises. To correct this unfairness, property rights in personal data must be allocated to users as digital laborers, so that all netizens have a legitimate basis for participating in the distribution of data dividends.

It should be pointed out that, during the drafting of the "Twenty Articles on Data," the drafters believed that promoting the structural separation of data property rights required moving beyond the fixed mindset of ownership and focusing on rights to use data. In answers to reporters' questions, expressions such as "downplaying ownership and emphasizing use rights" appeared frequently. This view calls for further reflection: moving beyond or downplaying ownership-centered thinking does not mean denying ownership. If data ownership, as the "parent right" and source right (Quellrecht) of the right to use data, were denied entirely, the right to hold data resources, the right to process and use data, and the right to operate data products would lose their foundation and become water without a source and a tree without roots. Downplaying ownership is a shorthand used when comparing data ownership with data utilization; the emphasis is on "promoting the exchange and market-oriented circulation of data use rights" through the structural separation of the three data rights. In other words, the focus of the "Twenty Articles on Data" is to establish that enterprises have the right to use the data they hold and to promote, through institutional arrangements, the circulation and utilization of enterprises' data use rights. This does not mean denying ownership; denying data ownership is consistent with neither logic nor fact. As Article 3 of the "Twenty Articles on Data" points out, it is necessary to "define the legal rights enjoyed by each participant in the process of data production, circulation, and use according to the source of data and the characteristics of data generation." The structural three-right separation of data property rights is thus built on respect for two important facts, the source of data and the generation of data, which accords with the normative intent of the "Twenty Articles on Data."

Giving users ownership of personal data also helps extend the protection of personality rights from physical space to virtual space. As the saying goes, "With the skin gone, what can the hair adhere to?" If the law merely declares that individuals have a right to informational self-determination but does not give them ownership of their data, that right can only be a dead letter. Informational self-determination can be effectively guaranteed only when individuals own their data. For an individual, the value of personal data ownership lies not only in being able to use one's own data but also in being able to prevent others from using it. Only when individuals can fundamentally exclude others from using their data is it possible to speak of informational self-determination. In this sense, personal data ownership is the further development of the right to informational self-determination in the digital age, and it is the basis on which individual personality can freely unfold in the digital world. To strengthen the comprehensive, multi-dimensional protection of personal information, personality rights must be extended from the content layer to the symbol layer by assigning data ownership to individuals.

3. Content of data property rights: separation of ownership and usufruct

Having distinguished data sources from data processors, we must next ask how the various rights derived from data should be reasonably divided between them. One view holds that the data users generate on digital platforms, and the rights derived from that data, are co-owned by users and digital platforms. The disadvantages of the co-ownership model, however, are obvious. It is difficult to determine reasonably and clearly the extent of each party's share, and co-ownership does not help realize the full utility of data. Unless it is strictly necessary, therefore, the co-ownership model should be avoided.

Over the entire life cycle of data, data originates from users' network access behavior, so empowering users should be the starting point for allocating data rights. Data processing enterprises, for their part, invest substantial labor and capital, so granting them relatively stable property rights helps optimize the allocation of data resources and creates incentives. If data processors were given ownership of the data, however, this would contradict the logical starting point that data is generated by users, and it would not help build a jointly built, shared, and interconnected Internet. Here we can draw on the division of rights found in "ownership and rights in another's property" and in "copyright and neighboring rights": a dual structure in which the data source holds data ownership and the data processor holds data usufruct, which achieves a balanced allocation of data property rights between users and enterprises. Under this structure, if the collected data originates from a natural person user, such as that person's online records or whereabouts, the natural person user owns the data and the enterprise that lawfully collected it holds the usufruct. If the collected data does not originate from natural persons but is public data such as meteorological or geographic data, the situation resembles that of the radio spectrum: ownership of the data belongs to the state, while the usufruct still belongs to the enterprise that lawfully collected it. Because a single item or a small quantity of data has little economic value, in the digital economy the source's purpose in holding data ownership is generally not to obtain a direct economic return but to use data as a functional element in exchange for intelligent information services.

Under the "separation of two rights" model of "ownership + usufruct," data usufruct is derived from data ownership, and data ownership is the parent right of data usufruct. Data usufruct can arise by statute or by agreement, and it can be obtained through paid transactions or free authorization. Statutory data usufruct applies mainly to the collection of data about the natural environment. Ownership of such data should belong to the state, but data collectors may make reasonable use of it within the scope of statutory authorization, provided that they do not harm the public interest. Agreed data usufruct arises from the authorization given by individuals and organizations together with the factual act of data processing.
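The allocation rule described in the two paragraphs above can be restated as a small decision procedure. The following Python sketch is illustrative only: the names `Origin`, `Basis`, `Allocation`, and `allocate` are hypothetical, and the sketch covers just the two cases named in the text (data originating from a natural person, and natural-environment data such as meteorological data). It assumes that no usufruct arises without the owner's authorization or a statutory grant.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    NATURAL_PERSON = "natural person"            # e.g. online records, whereabouts
    NATURAL_ENVIRONMENT = "natural environment"  # e.g. meteorological or geographic data


class Basis(Enum):
    AGREED = "agreed"        # usufruct arising from the owner's authorization
    STATUTORY = "statutory"  # usufruct arising from statutory authorization


@dataclass(frozen=True)
class Allocation:
    owner: str
    usufructuary: Optional[str]
    basis: Optional[Basis]


def allocate(origin: Origin, source: str, collector: str, authorized: bool) -> Allocation:
    """Return who owns the data and who, if anyone, holds the usufruct."""
    if origin is Origin.NATURAL_PERSON:
        owner, basis = source, Basis.AGREED
    else:
        owner, basis = "the state", Basis.STATUTORY
    if not authorized:
        # Assumption: without consent or a statutory grant, no usufruct arises.
        return Allocation(owner, None, None)
    return Allocation(owner, collector, basis)
```

In both cases ownership stays with the source (the user or the state); only the usufruct moves to the lawful collector, which is the point of the dual structure.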

Granting data usufruct to enterprises that, as data processors, collect and store data accords with the logic of data generation, and allocating rights according to the different contributions of data sources and processors accords with the principle of fairness. This model of data usufruct can be compared to copyright and neighboring rights. The writing of a novel is the primary source of the work and of the copyrights attached to it. Later adaptations of the novel, such as storytelling performances or film and television productions, may increase the novel's influence and sometimes become more famous than the novel itself. Even so, storytelling performers and film and television producers hold no copyright in the novel, only neighboring rights, because the originality of the work is the source of all subsequent property rights, whatever their value. The same reasoning applies to the allocation of data ownership. However much a platform, enterprise, or data company invests in collecting, storing, and processing data, that investment cannot place it above the source of the data, namely the user, or make it the owner of the data; it can obtain only a right in another's property, comparable in status to a neighboring right. This reflects how data is actually generated and objectively presents the different roles the participants play in the formation of data. More importantly, the value of the enterprise's data usufruct is not affected in any way by its status as a right in another's property.

Giving data ownership to users as data sources respects the source of data rights, while data processors' usufruct rests on a double basis of legitimacy. On the one hand, according to the labor theory of property advocated by Locke, laborers should enjoy property rights in the products of their labor, so data obtained through processing should be protected by property law; that is, processors should be granted the usufruct of the data. Moreover, data processors must invest substantial funds and other costs in collecting and processing data, and granting usufruct also encourages market investment and avoids a "tragedy of the commons." On the other hand, labor and investment alone do not fully justify the processor's usufruct, because usufruct derives from prior ownership, and the authorization of the data owner, as the holder of the source right, is therefore essential. Whether the data owner is a natural person, an enterprise, or the state, a necessary prerequisite for lawful data processing is the informed consent of the data owner, which is the second basis of legitimacy for obtaining the usufruct of the data.

4. From the separation of two rights to the separation of three rights

The two-right separation model of "data ownership + data usufruct" is reflected in the "Twenty Articles on Data." On the one hand, Article 7 adopts the concept of the "data source," emphasizing the need to "fully protect the legitimate rights and interests of data sources, promote data circulation and use models based on informed consent or statutory grounds, and safeguard data sources' rights and interests in accessing, or copying and transferring, the data whose generation they brought about." It also specifically proposes a way to realize property rights and interests in personal data, namely to "explore mechanisms under which a trustee represents individuals' interests and supervises market entities' collection, processing, and use of personal information data," which in effect recognizes that individuals, as data sources, have property rights and interests in their own data. On the other hand, with regard to the data rights of data processors, the "Twenty Articles on Data" clearly provides that it is necessary to "reasonably protect data processors' rights and interests in autonomously controlling the data they hold in accordance with laws and regulations," and Article 3 requires, "on the premise of ensuring security, promoting data processors' development and utilization of raw data in accordance with laws and regulations, supporting data processors in exercising rights related to data application in accordance with laws and regulations, promoting the reuse and full utilization of the use value of data, and promoting the exchange and market-oriented circulation of data use rights." More importantly, the opening of the "Twenty Articles on Data" treats "enabling all people to share the dividends of digital economic development" as one of the important goals of building the basic data system, and only when all users are given data ownership do they truly have a rights basis for sharing those dividends. In this sense, the provisions of the "Twenty Articles on Data" have formed a two-right separation model of "data ownership for the source + data usufruct for the processor."

Building on the separation of two rights, the "Twenty Articles on Data" goes on to propose a separation of three data rights. Article 3 provides: "According to the source of data and the characteristics of data generation, the legal rights enjoyed by each participant in the process of data production, circulation, and use shall be defined separately; a property rights operating mechanism shall be established that separates the right to hold data resources, the right to process and use data, and the right to operate data products; a new model of 'common use and shared benefits' for non-public data shall be promoted in a market-oriented manner; and a basic institutional guarantee shall be provided for activating the value creation and value realization of data elements." However, "the right to hold data resources, the right to process and use data, and the right to operate data products" are innovative formulations of the "Twenty Articles on Data" with no matching rights in law, which has to some extent hindered implementation of the three-right separation scheme. Specifically, under the current legal framework it is difficult to find rights that correspond to the three separated data rights, and it is difficult to incorporate these new rights into the existing legal system. First, as for the right to hold data resources, the concept of a "right to hold" has never been used in China's civil law. It was introduced in order to downplay "ownership" and emphasize the "right to use." This way of distinguishing "ownership" from "holding" is an important guide to understanding the right to hold data resources. In other words, the drafters of the policy document used the concept of a "right to hold data resources" rather than "ownership of data resources," which means that "holding" is meant to be distinguished from "ownership." On this premise, it is necessary to discuss how the right to hold can be implemented when current civil law does not use this concept. Second, as for the right to process and use data, China's civil law has never provided for a "right to process and use." Generally, whoever owns the raw materials automatically has the power to process and use them. For example, the owner of flour may process it into bread, and the owner of wood may make it into furniture. This is a natural expression of ownership, but it is a power rather than an independent right. If it is to be called a right, it must satisfy the conditions and standards of a legal right. Finally, as for the right to operate data products, product owners can license, use, and even trade their products, and such operation is likewise a natural power of product ownership.

Against this background, how to translate the policy language of the "Twenty Articles on Data" into legal language, so that the three-right separation of data can move from a policy on paper to a legal norm of rights, has become a theoretical problem that urgently needs to be solved. In fact, the three-right separation of data is built on the two-right separation, that is, the separation of the data source's ownership from the data processor's usufruct, and it is a system designed with data usufruct as its basic right. Once the three-right separation is placed within the theoretical framework of data usufruct, the right to hold data resources, the right to process and use data, and the right to operate data products can be regarded as the specific manifestations of data usufruct at each stage of "data resources, data collections, data products." This yields an approach to rights confirmation that moves from the "separation of two rights" to the "separation of three rights."

That the three separated data rights are the specific realization of data usufruct at different stages of data generation and utilization can also be corroborated from the "Twenty Articles on Data." For example, Article 7 provides: "On the premise of protecting the public interest, data security, and the legitimate rights and interests of data sources, recognize and protect the right to process and use data obtained in accordance with legal provisions or contractual agreements, respect the labor and other factor contributions of data processors engaged in data collection, processing, and the like, and fully protect data processors' rights to use data and obtain benefits." The wording of this article shows that the drafters equate the "right to process and use data" with "the right to use data and obtain benefits," and "use and benefit" are precisely the core content of the data usufruct advocated by the author.

The move from two-right separation to three-right separation as an approach to rights confirmation is not unique to data. In the reform of rural land property rights, the three-right separation of land ownership, land contracting rights, and land management rights has been widely applied. This model assigns land ownership, contracting rights, and management rights to different subjects, making land transfers more flexible and maximizing the benefits of land use. Drawing on this experience, data ownership and data usufruct can be separated within data property rights, and data usufruct can be further divided into three separately established rights at different stages: the right to hold data resources, the right to process and use data, and the right to operate data products. This structural separation of rights can effectively break down the property rights barriers facing the development of the data market and maximize the circulation and utilization of data while protecting the legitimate rights and interests of data sources. First, having received consideration, the owner of data resources can hand over the right to hold and the right to process and use to a professional data processor. This ensures that the data owner shares in the data dividend, promotes the full use of data resources, and avoids their waste. Second, recognizing that data processors enjoy, on the basis of data usufruct, the right to hold data resources and the right to process and use data can effectively prevent improper data crawling, stimulate processors' willingness to innovate, and promote the continued development of data processing technology. It encourages processors to apply their technical and professional capabilities to mine and process data in depth, thereby increasing the use value and economic benefit of the data. Finally, separating out the right to operate data products can promote the formation of a data product market, foster diversity and competition among data products, and improve their quality and service level. Data product operators can sell and license processed data products to realize their economic value.

Vertical classification: three-stage data right confirmation and its realization path

The data element market is divided, according to the object of the transaction, into a three-level structure of data resources, data collections, and data products. On the whole, from data resources to data collections to data products, the form of data property rights moves from complex to simple and becomes progressively standardized, and the number of stakeholders shrinks stage by stage until only one remains. Three-stage rights confirmation can therefore effectively resolve the complex problem of data rights confirmation.

1. Data resource stage: the split structure of "ownership + holding right" of data resources

Data resources are data in its most original form. They are large in volume, complex in structure, and diverse in source, and without sorting and analysis their value is usually hard to realize. The sources of data resources form a huge "open set": unstructured raw data generated by individuals, enterprises, governments, and all kinds of social organizations can all fall within the category of data resources. Individuals, enterprises, and governments may therefore all be sources and owners of data resources. For data processors who lawfully collect data resources, the fact that they actually possess and control the data gives rise to the right to hold data resources. By nature, the right to hold data resources resembles possession in property law. Holders of data resources can resist improper crawling by others, but they cannot actively dispose of and use the data. Logically, within the separate arrangement of the right to hold data resources, the right to process and use data, and the right to operate data products, the right to hold data resources is the preliminary right on which the other two rest. As a weak right without positive powers, however, the right to hold data resources cannot provide fundamental institutional support for the subsequent circulation of data resources. At the stage of collecting data resources, data ownership is the key. Whether a third party can obtain the right to hold data resources still depends on the will of the individual data owner. At the data resource stage, therefore, the focus is on resolving the confirmation and authorization of personal data ownership. Only when personal data ownership is fully respected and compensated can the legitimacy of subsequent data circulation and utilization be established.

(1) Data resource ownership confirmation path

According to the data originator theory, individuals own the data that originates from themselves. For enterprise data, the criterion for determining ownership of data generated by the enterprise itself is similar to that for personal data. Enterprises own data that originates from themselves, such as enterprise operating data and industrial data generated by equipment. For data of this kind, which belongs purely to the enterprise, Article 5 of the "Twenty Articles on Data" specifically provides for the enterprise's rights and interests: "For data collected and processed by market entities in production and business activities that does not involve personal information or the public interest, market entities enjoy the rights and interests of holding, using, and obtaining benefits in accordance with laws and regulations; reasonable returns for their labor and other factor contributions shall be ensured, and incentives for the supply of data elements shall be strengthened." In practice, however, data that derives purely from the enterprise itself makes up only a small share, and most data is collected by enterprises from outside. For personal data collected by enterprises, such as individuals' Internet browsing records and purchase records, ownership of the data still belongs to the individual who is its source.

Public data includes not only personal data and enterprise data collected by government agencies but also government affairs data formed by government agencies in the course of performing their duties. Where government affairs data formed in the performance of official duties has nothing to do with individuals or enterprises, the state enjoys ownership and the right to hold the data according to law. Whether personal and enterprise data collected by government agencies is owned by the state, however, is disputed. One view holds that public data is collected, processed, and stored by state agencies and has a public nature, so it should be owned by the state or by the whole people, whatever type of public data is involved. On this view, individuals do not own the personal data contained in public data, and the state has the right to possess, use, and transfer it according to law. The other view holds that where public data is personal data or enterprise data, the individuals or enterprises concerned should enjoy ownership, while other data should be owned by the state. This view rests on the protection of personal information, the safeguarding of personal dignity, and private autonomy. The author holds the latter view: for personal data and enterprise data collected by government departments, individuals and enterprises, as data sources, should enjoy data ownership. This is also consistent with the conclusion reached above that ownership of data resources belongs to the data source.

(2) Realization mechanism of data resource ownership

From the perspective of exercising rights, where the state or an enterprise owns data resources, it usually also controls the data and has a strong capacity to exercise its rights. For individuals, however, the personal data they generate is often scattered across different platforms, and they are in a structurally weak position when negotiating with platforms. Even if users are given ownership of their personal data, therefore, it is difficult for them to obtain fair and reasonable compensation from enterprises. Given the current state of digital technology and the structure of the data industry, German scholars have suggested that personal data transactions can take the form of "service for data" (Dienst gegen Daten), later developed into "payment for data" (Leistung gegen Daten), which offers a feasible channel for realizing personal data ownership. It must also be pointed out, however, that empirical research shows that the "payment for data" model has not worked smoothly in realizing users' data ownership. The marginal cost to platform companies of providing network services is close to zero, while the value of personal data in fact far exceeds the value of the network services. This is also the root cause of digital companies' current excess profits. Because the user and the platform are unequal in status, a user who wants the network service can only passively accept the "payment for data" service agreement the enterprise offers. The right to use user data that the platform obtains under the service agreement creates only a formal illusion of legitimacy, and the user has no ability at all to claim compensation from the enterprise. To remedy this unfairness, tax law scholars have suggested levying a data tax on enterprises and compensating users at the stage of secondary distribution, as a form of "consideration" for their data. However, taxation is an act of the state: individuals cannot benefit from it directly, nor can the consideration be quantified according to each individual's network behavior. The logic of taxation differs from that of exchanging data for consideration. Taxation is a state act of external balancing; the data tax collected is handed over to the state (government) to provide public goods and meet society's common needs, regardless of how much "digital labor" each person contributes. "Payment for data," by contrast, is a way of realizing personal data ownership according to each person's different "digital labor." It is an internal mechanism for sharing data benefits among market players. Its function differs from that of a data tax, and neither can replace the other.

For personal data ownership to be truly realized, it must be equipped with a corresponding implementation mechanism. What is currently feasible is the personal data asset account business model that the industry is exploring. In practice, personal data is stored in a scattered fashion, and the personal data of the same subject in different fields sits on different platforms. The personal data generated by a single act of a user can be collected by several platforms from different dimensions. For example, when an individual shops on Taobao over a mobile network, the telecom operator collects data such as the address of the user's device and traffic usage, the e-commerce platform collects the user's product browsing records, orders, and other data, and the bank records the user's spending data. This not only fragments the storage of personal data and complicates its management but also makes it difficult for users to exercise their personal data ownership effectively. To strengthen individuals' control over their own data, practitioners have begun to explore the business model of creating personal data asset accounts, of which the British Midata project and the Korean MyData project are the most typical. In this model, a third-party organization creates a dedicated data account for each data subject, similar to a bank account. The personal data asset account has three specific functions: (1) Data aggregation. Individuals can aggregate scattered data into their personal data asset accounts by exercising the right to data portability. (2) Data management. Once personal data has been classified and stored on the personal data asset account platform, it can be cleaned, analyzed, and mined. Data integration, data interfaces, and similar mechanisms make it convenient for users to access, manage, and use their data, which provides convenient services for personal life and lays the foundation for subsequent transactions in personal data. (3) Data transaction. The platform operates users' personal data assets commercially and carries out data property rights transactions and data service transactions. The transaction process is transparent to data owners; the platform supports value-added services such as data query and sharing, data use and transaction, and data processing and products, and it provides individuals with data revenue and settlement services. With the help of personal data asset accounts, individuals can effectively control their own data under the principle "my data, my decision," and then realize "my data benefits me."

The Fourth Plenary Session of the 19th Central Committee of the Communist Party of China proposed for the first time that "data should be used as a factor of production to participate in social distribution". In theory, individuals, as producers of data, could have participated in social distribution by virtue of this factor of production. However, for lack of a carrier for storing and aggregating personal data, the data produced by individuals has been discarded like bread crumbs, or at most exchanged for so-called free network services. Once a personal data asset account is established, it will play the same role as a bakery: individuals can aggregate their data scraps for sale and receive a micropayment every time their data is used, so as to truly realize control over, and benefit from, personal data. In the future, individuals can rely on their personal data asset accounts and use their own data as a factor of production to participate in social distribution, and thus share in the co-created dividends of the digital economy.

As for registration and confirmation of rights, data subjects generate new data every day, so the total amount of personal data keeps expanding and its content is constantly changing. It is not feasible for reviewers to examine the data item by item. The author therefore proposes to draw on the review and registration system for patents and trademarks, establish a registration and filing system for personal data asset accounts, and register each account as a whole, which makes it easier to clarify prior rights and to collect evidence.

From the perspective of legal practice abroad, countries have begun to implement personal data asset account projects to strengthen individuals' control over their own data, so that individuals receive a micropayment every time their data is used. Studies have shown that most users are dissatisfied with the collection and use of their personal data, but in order to participate fully in contemporary life they have to agree to its continued collection and use by platforms. This is starting to change, and efforts are being made to change the way Internet commerce works. For example, Harvard University launched a "Vendor Relationship Management" (VRM) project. The project focuses on enabling users to become competent economic actors with regard to their data, reversing the relationship between users and digital enterprises (vendors), and preventing users from becoming economic units exploited by digital enterprises. To achieve this goal, the project encourages the development of technologies that enable individuals to manage their own data more fully while increasing their control over how others use it. If technology allows users to collect and control their own data, to share it selectively, or even to control the conditions under which others use it, they can change their relationship with digital businesses. Many data products currently apply the VRM concept. For example, the renowned computer scientist Jaron Lanier has envisioned a world in which individuals receive micropayments every time their data is used; others focus on developing personal data banks or accounts in which users can store their data and sell access to it; and some scholars are exploring the market development of personal data banks by building models in the laboratory. These practices all indicate that the mechanism for realizing personal data rights is promising. For data companies, the future can be summed up as follows: whoever wins personal data wins the digital world.

(3) How to obtain the right to hold data resources

In terms of normative purpose, the right to hold data resources is established mainly to encourage holding that is lawful and compliant, and to evaluate "illegal holding" negatively. A data holder who holds data in accordance with laws and regulations has the right to control it autonomously (Article 7 of the "Twenty Articles on Data"). Without the holder's permission, others may not access, copy, tamper with or delete the data at will. Of course, protection of this right of autonomous control should remain within reasonable limits. In the future, the restrictions on intellectual property rights can serve as a reference for establishing limits on data usufruct, such as fair use, statutory licensing and compulsory licensing of data, so as to define its boundaries. To strike a balance between data rights confirmation and data utilization, it is necessary not only to avoid the "tragedy of the commons" but also to prevent the "tragedy of the anti-commons".

As mentioned above, whether an enterprise can hold, or process and use, data resources collected from outside depends on the authorization of the data owner. Whether the owner is a natural person, an enterprise or the state, and except where statutory grounds apply, the basic prerequisite for lawful processing is the informed consent of the data owner. The data processor's authorization from the data source can be confirmed through the user service agreement. The authorization mechanism needs to distinguish between general authorization and special authorization. General authorization permits the data processor to use the data only to a limited extent, that is, as an internal enterprise asset used solely to serve the data source; apart from fully anonymized data, personal data may not be used for other purposes. Special authorization permits the processor to use the data for other purposes, including opening the data, sharing it, and licensing others to use it.

Public data is divided into two types. The first is government data formed by the government itself in the course of performing its duties. If such data has nothing to do with individuals or enterprises, the state can enjoy ownership of it according to law, and government agencies have the right to hold it in accordance with their statutory duties. The second is data containing personal or corporate information that party and government organs at all levels collect in the course of performing their duties and providing public services according to law. Under the aforementioned model of separating the rights of the source and the processor, ownership of such data belongs to the persons from whom it was collected, and the collecting government agency, on the basis of a right to use the data grounded in statutory authorization and individual consent, enjoys the right to hold the data resources. Regarding the use of such public data, Article 4 of the "Twenty Articles on Data" provides that public data is encouraged, on the premise of protecting personal privacy and ensuring public safety, to be provided to society in the form of products and services such as models and verification, in accordance with the requirement that "raw data does not leave its domain, and data is usable but not visible", so as to balance data utilization, personal information protection and public safety. For public data that carries no personal information and does not affect public safety, the scope of supply and use should be expanded according to purpose. At the same time, the "Twenty Articles on Data" also sets out a negative list: public data that is kept confidential in accordance with laws and regulations shall not be opened; the direct entry into the market of raw public data that has not been disclosed in accordance with laws and regulations shall be strictly controlled; and the public interest in the supply and use of public data shall be safeguarded. Whether to charge for the use of public data should be judged by its purpose: public data used for public governance and public welfare undertakings is to be used free of charge, subject to conditions; public data used for industrial and sectoral development is to be used for a fee, subject to conditions.

2. Data set stage: the right to process and use data

Raw data resources are massive, scattered and "many-to-many", which makes it difficult for them to become the direct subject of transactions. For distributed, massive data resources to flow efficiently toward diverse demands for data products, it is necessary, as in factor markets such as land and capital, to establish a circulation system built on an intermediate state. In the data circulation system, it is reasonable to regard the data set composed of data elements as the intermediate state between data resources and data products. A data element is the result of sorting and refining raw data according to specific criteria (such as age, income and education), eliminating incomplete, erroneous and redundant data, authenticating the data, and checking its authenticity, reliability, credibility and consistency, which finally yields standardized data labels. These data labels can be further aggregated to form a data set. Compared with raw data resources, data sets have higher application value and market value, and significant advantages in scale, structure and standardization. In the process of assembling individual raw data into a data set, the data moves from dispersed to aggregated and the relevant subjects gradually change from many to one, which lays the foundation for the effective operation of "one-to-many" data property rights.

From the perspective of property rights allocation, the data set is formed by the preliminary processing and sorting of massive raw data. An enterprise that wants to obtain the right to process and use the data set needs to negotiate with the owners of the raw data. Because a data set involves a large number of subjects, users can, in order to reduce transaction costs and improve negotiation efficiency, hand over their personal data asset accounts to a third-party data trust agency for custody. The professional trust agency negotiates collectively with data companies on behalf of users and concludes standard data licensing contracts, under which the enterprise obtains the right to process and use the data set.

(1) Acquisition and implementation mechanism of the right to process and use data sets

The data set is formed by aggregating massive user data. How to negotiate with each user and obtain the right to process and use the data has become a practical problem. If companies were required to negotiate with users one by one on issues such as how much data to obtain and how much data revenue to return, transaction costs would rise sharply and the overall function of the data set would be impaired. In other words, requiring the two parties to negotiate and price the licensed use of complex and changeable personal data one on one, so as to determine the scope, extent and manner of its use, would result in excessive negotiation and transaction costs.

Therefore, to reduce negotiation and transaction costs, a third-party organization needs to be introduced to exercise personal data ownership on behalf of individuals. As data application technology develops, personal data is used in ever newer ways, and personal data subjects usually lack the professional knowledge to make rational decisions. To reduce rights holders' enforcement costs, improve the efficiency with which rights are exercised, and facilitate the use of personal data by others, it is necessary to draw on the collective management system in the field of copyright and entrust personal data asset accounts to collective management organizations (such as data trust agencies) for custody. The "Twenty Articles on Data" takes a positive attitude toward this business model and encourages "exploring a mechanism where trustees represent personal interests and supervise the collection, processing, and use of personal information by market players." Aggregating the personal data asset accounts of different subjects also improves bargaining power, and the data trust agency negotiates specific usage fees with the data processor. With the authorization of the data subject, the data trust agency can exercise data rights in its own name, including controlling the use of personal data, signing licensing contracts with data users, distributing usage fees to rights holders, and conducting rights protection litigation and arbitration.

From the perspective of foreign law, many countries have studied personal data trust systems similar to collective management organizations. For example, according to the data trust theory proposed by the UK Open Data Institute, personal data rights holders can entrust their personal data to an independently qualified third-party organization for management, and that third party supervises the data processing of the data controller. The UK data trust model is thus a three-party trust structure: to protect personal data rights and interests from unlawful infringement by the data controller, a third-party trust agency is set up as trustee, alongside the personal data rights holder and the data controller, and accepts the rights holder's entrustment to manage the personal data. In practice, the British data trust model has been applied to the management of citizens' personal data in smart cities. In the smart city data management project, Sidewalk Labs, a third-party organization, acts as the trustee that manages city data and establishes data sharing standards so as to distribute data benefits to individuals.

(2) Registration and effect of the right to process and use data sets

Under the guidance of the "Twenty Articles on Data", the right to use data will become the key object of data property rights registration in the future. Article 3 of the "Twenty Articles on Data" stipulates: "Study new methods of data property rights registration. On the premise of ensuring security, promote data processors' development and utilization of raw data in accordance with laws and regulations, support data processors in exercising rights related to data application in accordance with laws and regulations, promote the reuse and full utilization of data use value, and promote the exchange and market-based circulation of data use rights. Treat the circulation and trading of raw data with prudence." Accordingly, the transfer of data use rights, rather than of data ownership, is the future direction of data property rights transactions. It is therefore urgent to establish a registration system for data use rights.

At the current stage, the right to process and use a data set is created mainly by contract. It is therefore subject to privity of contract and must be registered before it can be asserted against third parties. Here the experience of land management rights can serve as a reference, giving the parties the right to choose whether to register. The "Civil Code" does not give a clear answer on the nature of the land management right but leaves the choice to the parties, so that in different contexts the land management right can be classified as either a real right or a creditor's right. Where land management rights are transferred by lease for less than five years, the short duration and limited impact justify allowing the parties to create rights and obligations freely by contract. Where land management rights run for more than five years, especially those transferred by means of shareholding and mortgage, the duration is long, the input and output of the land are large, and the scope of influence is wide. To ensure the security of transactions, such a land management right is a real right: it is effective against the world and exclusive, and once registered it can be asserted against bona fide third parties. Accordingly, land management rights that are registered or have a transfer period of more than five years are real rights, and land management rights transferred by lease for less than five years are creditor's rights, which meets the different needs of different holders of land management rights.

Similarly, for data sets with a long value "half-life", enterprises can choose to register the right to process and use the data set on the trading platform; for data sets with a short value "half-life", the parties can arrange the transaction freely by contract. The former has the effect of a real right and is a right effective against the world, whose content is obtained in accordance with statutory procedures and requirements; the latter has the effect of an obligatory right and is a right in personam, whose content is determined by agreement.

3. Data product stage: the separation structure of "ownership + management rights" of data products

Data products are derivative data with market value that developers form by using algorithms to analyze, filter, refine, integrate and desensitize data sets in depth. In current research on data rights confirmation, scholars usually examine data products, data resources and data sets on the same plane. For example, some scholars believe that the business consultant data in the "Taobao v. Meijing case" and the user review data in the "Dianping v. Baidu case" are both corporate data, with no substantial difference between the two. In fact, business consultant data and user review data are very different in nature: the former is a data product, while the latter is a raw data resource. The business consultant is a data product that the platform developed, using digital technology, on the basis of its right to process and use the user data set. The platform has independent ownership of the data product and has the right to operate it independently and enjoy the benefits. Article 7 of the "Twenty Articles on Data" likewise emphasizes: "Protect the right to operate data or data derivative products formed through processing and analysis, regulate the right of data processors to authorize others to use data or data derivative products in accordance with laws and regulations, and promote the circulation and reuse of data elements." In contrast, user review data still falls into the category of raw data resources. Although these raw data are the fruit of the technology and labor invested by the website, Dianping.com does not use algorithms to perform standardized work such as processing, cleaning and screening on these personal data. The user review data should therefore be owned by the users, and the company enjoys only the rights to hold, process and use the data on the basis of its right to use the data.

(1) Data products are independent rights objects

In the data product generation stage, data processors summarize and analyze data sets in depth, extract underlying patterns from disordered data sets, and form valuable information from which data controllers can reason by induction and deduction. This is the stage at which data passes from quantitative change to qualitative change, and it is also the high point of data value generation. Data products are in essence new data formed through processing and can be handled under the general principle that "processing confers ownership". Although data products are derived from data resources or data sets, the value added by processing exceeds the value of the original data, so the processed data products can be owned by the processor. This point also finds support in comparative law. Some German scholars argue that Article 950, paragraph 1, of the German Civil Code should be applied to data processing by analogy. This paragraph stipulates: "A person who produces a new movable property by processing or transforming one or more materials shall acquire the ownership of this new thing, except that the value of the processing or transformation is significantly lower than the value of the materials." Accordingly, when processing makes the value of the data product significantly greater than the value of the original data or data set, the processor can obtain ownership of the data product. If the creativity of the data product reaches the threshold for intellectual property protection, the protection of intellectual property law naturally applies, and data product developers enjoy intellectual property rights in the data products they create. At present, data services are increasingly presented in the form of data products. For example, credit investigation services are ultimately delivered as personal credit reports, which still use data as the carrier. In this sense, the boundary between data products and data services is becoming blurred, and drawing it is of little significance.

As a new type of data formed through deep processing and refining, data products become independent objects because human labor is embodied in them; they can be owned outright by data product developers and can enter the market for circulation. If data products were instead confused with data resources and data sets, the many transaction restrictions attached to the latter two (such as personal information rights) would apply to data products as well, leaving data products unable to enter the market. Distinguishing data products from data resources and data sets also answers the questions of "co-ownership" and "dual ownership". Some scholars believe that derivative data generated by a platform through the aggregation, processing and analysis of personal data should be owned separately or jointly by the natural person and the platform. Other scholars have proposed a dual data ownership structure between enterprises and individuals, comprising nominal data ownership for individuals and actual data ownership for enterprises. However, China's current law cannot accommodate such a dual ownership structure. Dual ownership not only fails to allocate rights clearly but also creates disputes over rights, which in turn impairs the utility of data. In fact, data resource ownership and data product ownership have different objects. Data products have become independent of data resources through the exercise of ownership or usufruct over those resources; the object of the right is the specific data product, which can be attributed to its developer. The object of data resource ownership, by contrast, is the data resource in its original state. No so-called "dual ownership" or "co-ownership" arises at all.

Data product developers have ownership of the data products they develop, which mainly comprises four powers: possession, use, income and disposal. Possession means that the product developer has the right to store and control the data product and to decide how it is controlled. The right to use means that the owner may use the data product itself, or allow others to use it for a fee. When licensing others, the rights holder may stipulate the licensee's obligations in the licensing contract, such as not re-transferring, disseminating or using the data product beyond the license, and may also limit the number of times the licensee can transfer and use it. The right to income includes not only the right to obtain benefits by selling the product or licensing others to use it, but also the right to profit by using the product to provide others with services such as forecasting and analysis. The right to income may be realized only if the rights holder abides by market order, does not violate the public interest, does not engage in vicious competition, and does not infringe on personal information. The right to dispose means that the owner can transfer the data product to another person. Whether the original owner should delete the data product after the transfer should be settled by negotiation between the two parties. From these four powers it follows that the right to operate data products is inherent in data product ownership: the owner has the right to operate data products independently and obtain the benefits. Data product ownership is the "parent right" of data product management rights.

(2) Generation Path of Data Product Ownership

The ownership of data products can be acquired originally based on the exercise of data ownership or usufruct, and can also be acquired based on a creditor's rights contract.

First, data product ownership is generated through the exercise of data ownership. Enterprises, as data owners, have the right to use the data generated by their own activities. If an enterprise uses such data to develop data products, the data resources leave their natural state through the input of human labor and become data products independent of the original data, and ownership of those products likewise belongs to the enterprise.

Second, data product ownership is generated through the exercise of data usufruct. This mainly covers the situation where enterprises develop data products from user data sets. As holders of usufruct over the data set, enterprises separate specific data products from the data resources, which is an exercise of the powers of use and income contained in their data usufruct. Here the data usufruct becomes the bridge that transforms data resources into data products.

Third, ownership of data products is acquired derivatively under an obligatory contract, for example by acquiring ownership of an algorithmic model under a sales contract. In this case, ownership of the data product has already been generated through the exercise of data ownership or usufruct before the buyer obtains it. Data products differ from data resources and can serve as objects of transactions. In practice, attention should be paid to distinguishing the disposal of data resources from the disposal of data products.

epilogue

The criss-crossing relationships of rights over data mean that data rights are difficult to delimit on a single plane. They should instead be approached through hierarchical thinking about data rights, establishing a three-dimensional paradigm that moves from point to line and from line to volume. To solve the problem of data rights confirmation, the author proposes a "three-three system" of data rights confirmation based on hierarchical thinking. The first element is a three-layer separation in the horizontal dimension: information and data are strictly distinguished at the level of rights objects; data sources and processors are distinguished at the level of rights subjects; and data ownership and usufruct are distinguished at the level of rights content. By strictly distinguishing personal information from data at the level of rights objects, and creating personal information rights and personal data ownership for individuals, corporate data usufruct can be carved out of the latter on the basis of the division of rights. Vertically, there are three levels within data, and its formation follows a linear chain: initial data resources are collected to generate raw data; the raw data is then aggregated and processed into data sets; finally, data products are developed and derived from the data sets. A three-level progressive chain is thus formed within data: "data resource, data set, data product". In the raw data collection stage, ownership of data resources belongs to the data source, and data processors can enjoy the right to hold the data resources they collect and store in accordance with law and contract. In the data set stage, users can place their personal data asset accounts in the custody of trust agencies, the trust agency signs a data licensing contract with the data processor on behalf of the users, and the data processor obtains the right to process and use the data set under that contract. In the data product stage, data processors develop and derive data products from the data set, enjoy independent ownership of those products, and can operate them independently and enjoy the benefits. This approach not only makes the "Twenty Articles on Data" workable, but, thanks to its solid theoretical foundation, also offers important guidance for the practice of data rights confirmation, circulation and utilization.

The plurality of interests borne by data and the diversity of data forms provide a deep theoretical basis for the hierarchical nature of data rights, and they are also the main thread running through the three-layer, three-stage allocation of data rights. While securing the initial rights of data sources and affirming their ownership of, and control over, data resources, this approach to rights confirmation also meets the needs of data processors to use data and to be protected. On the basis of data usufruct, it establishes a three-stage layered confirmation pattern of data resource holding rights, data processing and use rights, and data product management rights for data collection, data processing and utilization, and data product transactions respectively.

The original article was published in Issue 4, 2023 of "China Law", thanks to the WeChat official account "China Law" for authorizing the reprint!
