Author: HU Ling
Peking University Law School
Abstract: In recent years, the concept and practice of opening public data have gradually become popular, and public data opening has become the primary link in the supply of the factor market. Its legalization has progressed rapidly, but previous legal studies show some deviations. This article responds to these issues theoretically, arguing that the function of public data originates from the mode of production and is meant to be applied in a larger unified market. This requires dividing public data by function into exhibition data and auxiliary data: the former directly becomes information services or data products, while the latter assists the safe and orderly flow of factors in the general market. In practice, the government has more incentive to promote information production that serves as infrastructure and less incentive to promote the opening of exhibition data; truly valuable exhibition data are often traded or licensed for operation through the cultural industry, helping to increase local fiscal revenue. The intergovernmental power structure between central and local governments can further explain what kinds of functional data are produced and used by different levels of government. It is therefore helpful to understand, from both a functional and a structural perspective, how the actual classification of public data opening works under the existing legal framework.
Key Words: Public Data Opening; Classification; Function; Mode of Production; Administrative Subcontracting System
1. Introduction
Since the Central Leading Group for Comprehensively Deepening Reform issued the "Opinions on Promoting the Opening of Public Information Resources" in 2017, the concept and practice of public data opening have gradually become popular. In recent years, with the introduction of the notion of "data elements", and especially the release in 2022 of the "Opinions of the CPC Central Committee and the State Council on Building Basic Data Systems to Better Give Play to the Role of Data Elements" (hereinafter "Data Twenty"), public data opening has become the first link in the supply of data markets.
In this context, the legalization of public data opening has progressed rapidly, with more and more provinces and cities promulgating regulations and implementing measures on the sharing and opening of public data resources. A large number of legal studies have followed, focusing on issues such as the scope, procedures, and classification standards of public data opening and the design of authorized operation systems. It seems that the relevant institutional practice is already on the right track and only needs to be advanced step by step. The understanding of public data formed under this logic has two layers. First, consistent with the general properties of data, public data is treated as a universal element that can be operated on and processed in a vacuum. As long as procedures and security safeguards are designed in advance, professionalism and efficiency can be ensured through authorized operation, and supply can be increased continuously according to market demand, providing a fair, reasonable and non-discriminatory information source that different entities can obtain for market competition and that then generates broader social value. On this view, if the public sector fails to implement opening, the fault lies not with the market-oriented concept but with the institutional, technical and budgetary constraints of government departments. Second, because of concerns about personal information, trade secrets, or other negative externalities and risks that commonly arise in the course of operation, public data opening and operation need to focus on data products rather than raw data. This may give rise to the "Arrow information paradox" and calls for continuous improvement of market mechanisms (such as exchanges and brokers). On this view, if the trading mechanism cannot operate effectively, the problem is not the trading concept itself but that the relevant public data has not reached the high compliance standards required for entering the exchange.
However, the increasingly detailed formal norms produced under this logic leave the goal of public data opening unclear, and the logic has difficulty explaining the constant adjustment of existing policies and practices. For example, the 2022 policy of promoting the construction of a "unified national market" pays more attention to public agencies' input of "market information" than to a "data market"; the "Data Twenty" mentions the principle of sharing and common use, which goes beyond sharing within government departments under the existing legal framework and extends to a wider range of market entities; and some provinces, in promoting public data utilization, have proposed building a two-level data element market system that relies on both administrative and market mechanisms rather than on a single exchange mechanism. All of these require us to rethink the direction and goal of public data opening policy: How much public data needs to be pushed into the market? How can it be made a public good available to all that facilitates the flow of all kinds of market elements? What are the driving forces and consequences of different choices? These are old problems, but they need further discussion in light of changes in practice.
Correspondingly, shaped by the above understanding of public data policy, previous legal research also has some biases. First, public data is imagined as a static element that can be extracted at any time, which gives rise to questions of data ownership. For data elements, however, what matters is not an analysis of rights modeled on ownership but the power of control grounded in different legal relations, together with the ability to keep producing a given type of information according to actual needs. Data needs to be produced and used, not solidified and hoarded. Previous studies have therefore paid too little attention to public data opening as a system and an order, including the motivation for opening, supply capacity, how value is generated and circulated, and mode-of-production issues such as bureaucratic structure, all of which determine the modes and conditions of use in the real world. Second, discussing public data in the abstract makes it difficult to grasp intuitively what is being discussed. Even institutional designs for grading and classification are guided mainly by risk of use rather than by the actual needs of the market. It is therefore necessary to re-examine typification and grading on the basis of practice. For example, some data are themselves the supply of information products and elements, while other data supply the infrastructure that helps various elements, including information products, to flow. Focusing only on the former produces bias. Third, existing legal research has focused on the formation of data element markets while neglecting the needs of the unified national market that these markets are meant to serve. This includes the question of what kinds of information tools general markets need, and the role of non-market mechanisms in this process. Even when discussing non-market mechanisms, existing research has focused on the practices of local government departments and neglected the role of the central government.
This article attempts to address these issues by offering a new perspective on public data opening policy. The author argues that it is important to see how this policy operates in the real world. The functions of public data are rooted in modes of production, and their goal is to be used in a wider unified market. This requires classifying public data into two categories: exhibition data, which directly becomes information services or data products, and auxiliary data, which is used to facilitate the safe and orderly flow of elements in general markets. In practice, governments are more motivated to promote the production of information that serves as infrastructure and less motivated to promote the opening of exhibition data. Truly valuable exhibition data is often traded or operated through cultural industries, which helps to increase local fiscal revenue. The power structure between central and local governments can also explain what types of functional data are produced and used by different levels of government.
The article proceeds as follows: First, it discusses data function theory and classifies public data according to market function. This is different from simply classifying data based on risk, or by industry. Second, it places the administrative structure that distinguishes between different data functions in the theoretical framework of administrative contracting. This allows the author to examine the logic of tiered public data opening, where central and local governments have different priorities for different types of data. In practice, tiered data opening is constrained by intergovernmental structural relations. The accumulation of core data under the construction of a unified national information system drives local governments to promote more economic innovation activities, which in turn generates more data for local use. As an example of the application of the theory, the article discusses exhibition data and two types of auxiliary data: authentication and credit information. The author pays particular attention to how these types of data are produced and change. Finally, the article summarizes the author's findings.
2. Data Function and Typification Theoretical Framework
2.1 Data's Two Market Functions
Local governments generally take one of two approaches to classifying public data. The first is to assess the degree of risk and determine whether the government is able to anonymize the data before opening it or to respond to potential consequences afterward. For example, the "Shanghai Municipal Public Data Open Classification and Grading Guidelines (Trial)" treats public data that places high demands on data security and processing capabilities, is highly time-sensitive, or requires continuous access as conditionally open, and treats public data that involves trade secrets or personal privacy, or whose opening is prohibited by laws and regulations, as non-open. However, the difference between the two categories comes down to whether the government is able to anonymize the data. The second approach is to classify data by its natural attributes. For example, the "Jiangsu Provincial Public Data Classification and Grading Specifications" classifies public data by subject, industry, and object, but this method is superficial and the data it covers is not fixed. The common problem with both approaches is that neither offers substantive reasons that explain the value of classifying data. This section re-examines the issue from a functional perspective.
It is generally believed that the value of data comes from the display of information or from data analysis and prediction. These functions are closely linked to the digital-economy mode of production brought about by information technology. Digital platforms can mobilize and match different factor resources across regions, giving rise to business models centered on information content services: they mobilize the public to produce information for free and charge members for access. Platform companies tightly integrate production and consumption, track users, make predictions and recommendations, and thereby generate more trading opportunities. These are all familiar data applications. However, information systems do not by themselves generate trust and stable cooperative relationships. Traders may still face a large number of one-off transactions, and they need default mechanisms that constrain counterparties, make their behavior credible, and secure repeated transactions in the future. For example, the identity of a party can be verified, and an abstract credit record can be formed from its tracked historical behavior. These two types of auxiliary information show the key role data plays in keeping market order functioning, a role that amounts to an infrastructure function.
For example, the "Hangzhou Municipal Public Data Authorization Operation Implementation Plan (Trial)" was released for public comment in February 2023. The plan lists several kinds of public data for specific application scenarios. Those with an exhibition function include medical and health data products, fitness facility information in the sports field, travel information in the culture and tourism field, and research information in the scientific research field. Those with an auxiliary function include enterprise credit information in the finance and insurance field, connection and matching information in the trade, logistics, and industrial manufacturing fields, and public opinion information in the social governance field. Only by effectively distinguishing the functions of data can public data be opened in a more targeted way and explored and used in an orderly manner.
Functional analysis of this kind is an analytical framework the author commonly uses for legal issues concerning data. It has mainly been applied to personal data or data generated by digital platforms and less often to public data. The following discussion shows that functional analysis is also essential to the rules on public data use: only by understanding the function of data can we decide how to open and use it. At the same time, functional analysis is itself scenario analysis, which helps address the difficulty previous research has had in pursuing scenario-based inquiry in depth.
The premise of any data analysis is to recognize that if a key external social function is to be realized, the conditions and systems for using data must be designed to serve that function, rather than deriving the method of use from the so-called nature of the data. For example, a personal ID number is personal information and should be strictly protected, but it was originally created by the state for identity verification in the course of public services. The function of basic identity information is therefore authentication, and the further question is when and why verification is needed. Without understanding this function, one falls into the mistaken belief that the individual controls the use of the ID number.
When we discuss the function of data, we implicitly assume several preconditions. First, data is dependent: it concerns the identity or behavior of social subjects and objects and serves the flow, improvement, or other goals of those objects. This means that we first need to attend to the legal and social relations in which these subjects and objects are involved. If these elements are endogenous to public services, then opening the corresponding public data must also benefit the public interest, and the data is unlikely to be converted entirely into private resources. Second, data is in essence still information. It is not simply an object of direct trading or display; its use depends on which function the external social or market system wants the information to perform. Unlike the earlier distinction between "data" and "information" according to whether it is read by machines or by humans, a simple functional analysis divides data into two categories: exhibition data and auxiliary data. Exhibition data is mainly meant for people to browse the information itself, while auxiliary data helps a specific social system build trust and ensure transaction security; it includes identity information, credit rating information, and matching information about market entities. Auxiliary data can detach from specific scenarios and be continuously integrated and abstracted, merging several small scenarios into larger ones and thereby extending the application of basic information. Both traditional local markets and expanded digital-economy markets need this type of functional data.
Functional analysis is both a descriptive and a normative framework: it discusses data types from the standpoint of the large demand of market entities, identifies what the market wants, produces the corresponding data, and subjects this to cost-benefit analysis. Similarly, if cross-border data flow is an important goal that supports the normal operation of cross-border business, then a certain amount of risk must be tolerated and rules improved gradually, rather than reducing or halting the flow out of concern about risk. Risk can be revealed and reduced through pilot projects and competition, laying the foundation for continued promotion of data use. More importantly, functions and the information they require change with the environment. In a highly mobile society, the increase in sources of insecurity leaves people without sufficient expectations to cooperate, and social entities have stronger incentives to conceal personal information. Public authorities therefore need to mandate disclosure of relevant risk information or disclose it centrally so that it can be queried. In an environment of relatively high trust, this is unnecessary. The emergence of data functions and the ways they are realized thus keep changing, and factors such as the cost of acquiring information, the perceptions of social entities, binding force, and consequences must be considered together. Classifying public data on this basis makes it possible to break out of the existing vague and fixed ideas about public data ownership and risk, to recognize that practice is changing, and to keep promoting the development and use of new data.
2.2 What Do the Market and Market Entities Want?
Focusing only on how the data element market is formed may not be enough to understand the impact of public data opening on market entities. We need to return to the characteristics of the market itself. From the perspective of modes of production, there are at least two types of factor markets in reality: the cross-regional, flat factor market dominated by digital platforms (networked), and the localized market divided along administrative boundaries (gridded). In the networked market, authentication and identification are provided by platform companies as a default basic service, oriented mainly toward mobile individuals, especially natural-person users. The behavioral data tracked by platform companies flows within their closed architectures. The gridded market is formed by the government within the administrative system. Through different types of administrative acts, the government obtains information about administrative counterparties and evaluates the relevant acts and entities. This market involves more enterprises and fewer natural persons, which helps to establish and maintain a general open market quickly.
As noted above, under the originally envisioned institutional logic, public data is developed in order to drive the aggregation of social data. This is easily understood to mean that public data is concentrated for trading through exchanges or authorized operation, and that the derived data generated by users must be returned to the original public resource pool. In reality this picture may be biased. First, if local governments release only part of their exhibition data resources in line with existing thinking, this merely increases supply at the margin in specific industries (local markets favor trade in cultural and artistic content) and does not truly support market upgrading. It is not a core function for either online or offline markets. Second, if valuable derived data can indeed be returned, this simply means that the data can track specific social entities and provide more comprehensive behavioral profiles, which is precisely the troublesome problem that the public sector is unwilling or unable to handle. Third, users may be more inclined to use the data to train artificial intelligence models than to put it to straightforward commercial use. From the perspective of market entities, therefore, anonymized exhibition public data may be less attractive than imagined. What they need more may be the other functions of public data, which can benefit general markets and help market entities benefit from public information.
The "Data Twenty" provides that public data that carries no personal information and does not affect public security should be provided to society in the form of models, verification, and other products and services, in accordance with the requirements that "raw data does not leave its domain" and that "data is usable but not visible". Following this thinking, some provinces and cities have recognized that public data operation can be promoted through non-market mechanisms; that is, not all public data needs to be handled in a market-oriented manner. What matters is to provide a stable supply of the information that production markets need, promote sharing, and thereby reduce the cost of factor flow. We have already seen that the high-threshold compliance mechanisms of data exchanges attract only a small number of enterprises, and most enterprises can only choose over-the-counter transactions because of high compliance costs. This further highlights the need to bring auxiliary-function data into the market quickly.
In summary, we can see that there are differences in the needs of different types of markets for public data (see Table 1):
| | Networked market | Gridded market |
| --- | --- | --- |
| Exhibition public data | The demand for public data is weak, and the market mechanism is needed. | The demand for public data is strong, and the market mechanism is needed. |
| Auxiliary public data | The demand for public data is strong, and the non-market mechanism is needed. | The demand for public data is strong, and the non-market mechanism is needed. |

Table 1 Data needs for different types of markets
3. Administrative Contracting and the Theoretical Framework of Public Data Classification
3.1 Overview of Administrative Contracting
The general principle by which local governments classify public data for opening is still risk-oriented. For example, the "Jiangsu Provincial Public Data Classification and Grading Specifications" stipulates that public data should be graded according to the importance of the data to economic and social development and the degree of harm that would be caused to national security, public interests, or the legitimate rights and interests of individuals and organizations if the data were tampered with, destroyed, leaked, or illegally obtained or used. This principle can be applied within the range that a local government can control, but it cannot clearly define the boundaries of that range. This section rethinks the issue from the perspective of administrative contracting.
Local governments' practice of public data opening is gradually deepening, but researchers rarely discuss it from an intergovernmental perspective. It is taken for granted that local governments should be the ones to promote it, and many reports point to the local public data needed by the gridded market. A careful examination, however, reveals that central government departments play a pivotal role in the current process of public data opening, not only shaping the rules but also being able to change the structure of data flows. This section uses administrative contracting, a common theoretical framework in political science, to explain this practice in terms of intergovernmental relations. The framework is mainly used to describe the contracting relationship between the central and local governments in China: the central government contracts out various administrative affairs and public services to local governments, provides implicit incentives through personnel promotion and local fiscal revenue, and continuously pushes for completion of the governance tasks assigned. Under this framework, the central government divides responsibilities between the central and local levels according to the types of power and governance risk involved, and local governments differ in their motivation and performance during implementation. Governance tasks can therefore be divided roughly as follows (see Table 2):
| | High Degree of Horizontal Promotion Competition among Local Governments | Low Degree of Horizontal Promotion Competition among Local Governments |
| --- | --- | --- |
| High Degree of Vertical Administrative Contracting by the Central Government | Ⅱ. Investment promotion, stability maintenance, competitive sports, disaster reconstruction, public health, innovation pilot | Ⅰ. Healthcare, education, environmental protection, market regulation, social security, food safety, safety supervision |
| Low Degree of Vertical Administrative Contracting by the Central Government | Ⅲ. General infrastructure construction (transportation, finance, media) | Ⅳ. National defense, foreign policy, customs, aerospace |

Table 2 Specific Types of Administrative Contracting
Table 2 describes the four quadrants of administrative contracting in China. Quadrant I covers general public services that the central government has an incentive to contract out but for which local governments' incentives are weak. Quadrant II involves strong promotion incentives and the possibility of obtaining special policies, and sometimes carries certain governance risks; the central government has an incentive to contract out and may provide detailed guidance when necessary. Quadrant III is the area in which the central government is not motivated to contract out vertically. It mainly involves national infrastructure and core areas of power, and most services are delivered by enterprises and institutions affiliated with the central government, although local governments also have a strong desire to obtain the relevant licenses or qualifications. Quadrant IV covers matters of pure public security and national development.
This theoretical framework takes into account both performance and risk sharing and is not purely risk-oriented. If local governments can bear a certain degree of risk in order to innovate and can confine negative externalities to a certain geographical area, they can be allowed to experiment and promote new practices. National infrastructure is planned and guided uniformly by central government departments, and its risks can likewise be reduced effectively through pilot projects.
3.2 Understanding the Classification of Public Data Openness
The framework of administrative contracting explains the practice of public data opening well: not all types of public data can be promoted by local governments. Even where the central government is willing to contract out, local governments may lack sufficient motivation to implement, because they also consider whether they can benefit. This does not mean that good performance always brings direct rewards; rather, under the same expectations, the question is which kinds of data bring more visible benefits (such as contributing to local economic growth). Table 2 can therefore be refined as follows (see Table 3):
| | High Degree of Horizontal Promotion Competition among Local Governments | Low Degree of Horizontal Promotion Competition among Local Governments |
| --- | --- | --- |
| High Degree of Vertical Administrative Contracting by the Central Government | Ⅱ. Information system development and integration, digital collectibles, cultural assets, corporate credit, conditional authorization opening (authorized operation) | Ⅰ. Environment, transportation, meteorology, market regulation, public service, unconditional data opening |
| Low Degree of Vertical Administrative Contracting by the Central Government | Ⅲ. Authentication, personal credit, social credit, health code | Ⅳ. Judgment documents and knowledge-based data |

Table 3 Public Data Openness of Administrative Contracting
Current local public data regulations generally classify data by risk level, but they do not take into account the dynamics that drive central and local government departments. Sometimes, even if certain information carries some risk, local governments, especially those with an appetite for risk, can take on and develop it entirely by themselves, thereby promoting local market development. Public data with infrastructure functions, however, needs to be standardized and regulated nationwide. This reasoning explains the pattern in Table 3. In Quadrant I, the public data involved is basically localized exhibition data; local governments lack motivation to keep developing it and do so only as required by higher-level governments. In Quadrant II, local governments may play a more active role, especially where unified information systems are involved or where exhibition data can add to fiscal revenue, and the central government also intends to mobilize local governments to carry out construction. Quadrant III involves infrastructure-type auxiliary data. The central government has no motivation to contract out and hopes to plan and design on a unified nationwide basis. This process, however, is accompanied by reversals and adjustments. For example, enterprise credit services currently operate under a registration system, and market competition is fairly robust across localities, so enterprise credit remains in Quadrant II. Personal credit services, by contrast, have moved from being built by several platform companies to being operated centrally by two licensed national institutions, and so have entered Quadrant III. Quadrant IV involves specific types of key exhibition content, each with its own peculiarities.
In this stable structure, public data opening policies are implemented not by local governments alone but by a two-tier structure of central and local governments, which differ clearly in perspective and in the data functions they care about. The way the public data hierarchy is actually formed, as opposed to how it appears on paper, can thus be explained under the administrative contracting system. First, from the central government's perspective, since the issuance of the "Opinions on Using Big Data to Strengthen the Service and Supervision of Market Subjects" and the "Opinions on Promoting the Opening of Public Information Resources," the central government has systematically promoted the construction of "one network," "one platform," and "one center." Each central department has also built a nationwide unified information system in its own vertical field, such as the National Enterprise Credit Information Disclosure System, the National Medical Insurance Information Platform, and the National Public Resources Trading Electronic Service System. Building these nationwide networks means that central departments collect large amounts of data in their respective fields, facilitate the flow of government information across the country, strengthen their own powers of supervision and contracting, and supervise and verify how local governments use the data. Second, this mechanism means that unless local industries can keep producing data within the scope of local affairs, local governments have little authority to decide autonomously on the opening and use of a considerable amount of important data. Finally, understood under the administrative contracting system, the central government's motivation is to treat data resources as an important tool for constraining and managing local economic activities.
For example, when the central government needs to promote economic policies, it can, through the operators of national databases in specific industries, cooperate with local governments in a targeted way, authorize data development and application, or cooperate with national industry associations to release data value, so that key industry data is genuinely treated as a public resource under centralized control and allocation. The "Guidelines for the Construction of the National Integrated Government Big Data System," issued by the General Office of the State Council in 2022, clearly state that the relevant departments of the State Council will coordinate the data resources of their departments and industries, take stock of those data resources, compile government data directories, and rely on the national government big data platform to carry out data sharing and application with local governments and departments. No new cross-department data sharing and exchange channels shall be built, and existing channels shall be brought under the data sharing system management of the national government big data platform.
In this process, central departments establish vertical information sharing mechanisms, promote the use of interconnected and interoperable information systems, and operate them through specific central state-owned enterprises. The latter's responsibilities include continuously urging the collection of relevant data, compiling big data industry statistics, and commercializing the data according to policy needs, but these data are often not shared horizontally between departments. This also involves the complex cooperative relationship between local governments and the national systems operated by state-owned enterprises. Discrimination in granting usage authorization is often encountered, which may lead to data monopoly problems. In addition, under this cooperation mechanism, local governments may not be subject to local public data disclosure rules, which also shows, from another angle, that the capacity and scope of action of local governments are limited.
Similar to the behavioral patterns of central government departments, local governments under the administrative contracting system also have the motivation to issue their own public data sharing and disclosure regulations in order to control and utilize localized public data resources. The first motivation is data collection and internal sharing within the government. Some provincial governments require all localities to aggregate their data on the provincial platform and establish a provincial one-network, hoping to strengthen control and constraint over lower-level governments. Within the bureaucratic system, the scope of action of grassroots governments is becoming smaller and smaller. When necessary, the provincial government can cooperate with cities, strengthen targeted support for specific counties and cities, designate them as pilots to be tested first, and then gradually extend the practice. The second motivation is to use data to promote local economic development, facilitate loans and financing for small and medium-sized enterprises, address employment problems, and respond to the policy requirements of the central government. The third motivation is to hold data power in order to track and restrict the flow of local factors. For example, during the COVID-19 pandemic, the collection of personal health code and travel data was gradually transferred from the grassroots to the provincial level, since data abuse might otherwise occur.
It is worth noting that whether fees can be charged for the use of public data has been controversial. The "Data Twenty" proposes to "promote the free use of public data for public governance and public welfare undertakings under certain conditions, and explore the conditional paid use of public data for industrial development and sector development," which has provided policy support for charging for authorized data operation. Since then, local governments have been more motivated to promote economic activities such as cultural industry data trading. This is why, despite the illegal financing risks posed by trading on digital collectible (NFT) platforms, a considerable number of local governments are still promoting them. Public data trading has become a way to strengthen "data finance." While local governments may open only some government information that has no special purpose or value merely to meet policy requirements, in general public data opening can be expected to continue as long as it can be connected and attached to existing incentive mechanisms and tasks and helps achieve other, more important contracted governance tasks.
The administrative contracting system has shaped the entire production order and structure of the public data cycle. The central government drives local governments to continuously produce data and then feeds the data resources back to local governments. The action space of local governments is a key issue for public data opening. In the following, we discuss this issue in relation to demonstrative and auxiliary data, and analyze their impact on the market, their production structure, whether they need to be operated and traded, and other issues.
4. Application: Comparative Explanation of Three Types of Public Data
4.1 Demonstrative Data
The foundation of data element policy was laid in the early 21st century, when the state promoted the disclosure of demonstrative public data elements. At that time, however, the Internet was just beginning to develop, and the state had not yet recognized the importance of auxiliary data elements. It was not until the platform economy took shape and formed a huge commercial ecosystem that people realized the importance of auxiliary information infrastructure in the market. As mentioned above, the mere disclosure or authorized operation of demonstrative public data may not be enough to generate unique incremental value beyond that of the original government information disclosure. For example, environmental or spatial data are very helpful for transportation services, but they remain marginal and cannot be traced to individual users to generate the kind of value seen in the consumer Internet. Therefore, apart from generally continuing the traditional government information disclosure catalog, local governments have no special incentive to keep expanding the output of public data, as this involves a series of investments in information systems.
The demonstrative function means that the data itself can be directly read and used, mainly for informational queries (and their basic matching functions); if the volume is relatively large, it may also be read by machines, eventually forming slightly more abstract data products. From the perspective of value development, it can be divided into three categories:
The first category is spatial data, such as transportation, map, environmental, and weather data. This type of data is used to describe the structure of and changes in physical space and to match services provided in that space. Map services and weather services in particular can be controlled through separate licensing, or can serve as auxiliary elements of other types of services.
The second category is industrial chain data, which is used to generate industrial big data and knowledge graphs. It is used mostly for aggregating and building local industrial chains, especially to provide industry guidance for important enterprise investment and financing. This type of information can also be authorized for operation or disclosed.
The third category is the more traditional cultural industry property trading data. The main driving force for local governments is to convert the content or physical objects of cultural institutions such as art galleries and museums into digital collectibles through blockchain technology for sale, as a type of data trading. A unified dedicated network and trading centers have been established nationwide for interconnection and interoperability.
It is not difficult to see that most of the above demonstrative public data falls in Quadrant I (such as environmental and spatial data), where local governments have little incentive to increase investment in data opening. Even if investment is increased, it is constrained by the structure of data types and value generation, and needs only to be promoted gradually to increase value at the margin. However, projects that are particularly important to the central government and can bring economic benefits belong to Quadrant II (such as industrial chain data and digital collectibles), and local governments are willing to seize the policy window to quickly promote related industries. Where it is profitable, local governments also have the incentive to entrust some demonstrative data to local state-owned enterprises for operation. Private enterprises, unless they are large platform companies, find it difficult to obtain qualifications, which easily gives rise to fair competition problems. Therefore, the open content in Quadrant I, on which previous studies often focused, is not very important in the overall process of public data opening.
4.2 Identity Data: Authentication and Identification
Identity information is one of the most ubiquitous, yet often overlooked, information infrastructure services. It is a public service that any market needs to build first in order to operate normally. If this type of service is lacking and market participants cannot be identified and tracked, the market may quickly fall into opportunistic transactions and eventually collapse. In the production mode of the digital economy, the purpose of identity data is to authenticate a person's real basic identity so as to ensure the security of using public services (flow security). This requires a unified national technical standard, which cannot be designed piecemeal by local governments. In the first 15 years of the Internet's development, identity authentication appeared under the name of network real-name registration. Owing to the lack of overall planning, early practices were guided by specific policy goals. That is, the intermediary, in accordance with legal requirements, verified the online identity against the traditional identity card and restricted the subject's eligibility to use the network service.
Since 2010, the state has comprehensively promoted network sovereignty and real-name authentication. This stage faces the challenges of the mobile Internet and the Internet of Things, as well as the key issue of how to extend online authentication to a wide range of offline scenarios. The premise of this stage is to authenticate the service providers and information resources at each layer of the Internet through licensing, and then to authenticate the wider body of users through them. Gradually, identity authentication has become a very important type of auxiliary data in the digital economy. The current construction of "one-network clearance" and "zero-proof cities" is in fact the result of promoting extensive authentication services.
The original identity card served only as proof of identity, but its function began to change in this process. In the exploration of network real-name registration, the identity authentication practices led by different departments have continually demonstrated and expanded the functions of the modern identity card:
(1) Identity proof. The identity card first represents the national citizenship granted to natural persons, which is a political identity and thus demonstrates sovereign power. With this identifier, citizens have the right to participate in public affairs and obtain public services, such as voting, public medical insurance, financial services, and public transportation services.
(2) Verification and presentation. On certain occasions, citizens need to show their identity cards to the police, for example when a person is suspected of violating the law or committing a crime; during on-site control carried out in accordance with the law; when an emergency seriously endangering public security occurs; at railway stations, long-distance bus stations, ports, docks, and airports; or in places designated by the municipal government of a prefecture-level city during major events.
(3) Tracking and deterrence. The original logic of real-name registration is to track users through real-name authentication, according to the principle of "back-end real-name, front-end voluntary," and thereby deter potential illegal behavior. The government can track the movements of criminals who use specific services (such as hotels, public transportation, and Internet cafes) through their identity cards, and can further increase the number of identity authentication points in public areas through facial recognition and gait recognition technology. Once it is widely used where private services are received (such as shopping malls), identity authentication can become an important way to fulfill the obligation of security guarantee.
These functions are increasingly linked together through a single terminal authentication, which is both an identity identification process and a behavior record tracking process. As the applicable scenarios continue to expand, this power will cover a wider range of areas. Back-end records can not only be used for credit punishment but also more easily create deterrence. In this sense, in addition to granting citizenship, the state has first turned the resident ID card into a nationwide super account and accumulated data under this account, which facilitates after-the-fact tracing.
However, from the perspective of the state, individuals are still managed through various organizations rather than directly. Individuals and enterprises are therefore treated differently. At present, the focus is still on authenticating corporate identity, while the authentication of individual identity remains at the pilot stage. After the "Cybersecurity Law of the People's Republic of China" (hereinafter referred to as the "Cybersecurity Law") was passed, it imposed further requirements for identity authentication. The Institute of Public Security of the Ministry of Public Security, which leads the offline second-generation ID card work, has built the Trusted Identity Authentication Platform (CTID) and cooperated with super apps such as WeChat and Alipay to launch the "Resident Identity Card Online Functional Credential" (also known as the "Net Certificate"). This is an electronic file that is generated from the ID card production data, uniquely mapped to the physical ID card chip, and signed by CTID. Users only need to apply to activate it in the corresponding mini program on the platform, and they can then present it to others or authenticate by having it scanned by a device. As more and more people register for WeChat or Alipay, it is foreseeable that this authentication method, promoted by application platforms, can be popularized quickly; it is also more flexible and can adapt to different terminal devices and app platforms.
The power of authentication has remained stably in Quadrant III, designed and promoted by central government departments. During the COVID-19 pandemic, however, travel cards and health codes were created and used for the functions of display and tracking. To prevent the abuse of local power, the level at which health codes were designed and accessed was continuously raised. Before control measures for COVID-19 infection were relaxed in late 2022, most health codes had already been unified into provincial systems. At the same time, some places have also created other types of authentication information, and CTID has not yet been fully popularized. It can therefore be said that the basic information for personal authentication is still fragmented at present. For enterprises and other organizations, by contrast, identity authentication data went without unification for a long time. Before 2015, a considerable number of institutional codes were not unified owing to departmental influence, and the state also lacked an effective coordination and management mechanism for information sharing. Most codes were used only for the internal management of individual departments, and some departmental information was isolated and closed. At the same time, the different types of institutional codes differed in length, meaning, and function, so that legal persons and other organizations had to apply for codes from multiple departments when being established and handling related business, and some departments even charged fees, increasing the social burden.
Therefore, the National Development and Reform Commission and other eight departments issued the "Overall Plan for the Construction of the Unified Social Credit Code System for Legal Persons and Other Organizations," gradually streamlining the code management system and mechanism and establishing a comprehensive, stable, and unique unified social credit code system for legal persons and other organizations based on the organization code, laying the foundation for the construction of a unified national market.
It is not difficult to see that the identity data of market entities is a special type of public data. Its generation and adjustment are not predetermined but are constantly tested and iterated according to actual circumstances. However, this type of information has never been an object of market transactions, because it is provided by the state as a public service to reduce risks and transaction costs. Applying this type of information directly in different settings helps promote public services, whereas introducing a price may reduce efficiency.
4.3 Credit Data: Public Credit and Credit Reporting
In its current framework, the social credit system is divided into four main areas: government credibility, business credibility, social credibility, and judicial credibility, and into two main systems: the public credit information system and the credit reporting system. After the State Council released the "Plan for the Construction of the Social Credit System (2014-2020)," public credit affairs were placed in Quadrant II from the outset under the administrative contracting system, so that local governments could better carry out construction and promote experiments. However, because this tool lacked constraints, some local governments used it to abuse their power. The central government began to take back power and regulate it, gradually moving it into Quadrant III. In November 2020, the State Council's executive meeting required "to adhere to compliance with the law, protect rights, be prudent and appropriate, and manage by list, to regulate and improve the system of credit constraint, and promote the orderly and healthy construction of the social credit system." Six aspects were regulated: scientifically defining the scope and procedures for the inclusion of credit information; standardizing the scope and procedures for the sharing and disclosure of credit information; standardizing the criteria for identifying seriously defaulting entities; imposing credit punishment in accordance with laws and regulations and ensuring that the punishment is commensurate with the offense; establishing a credit repair mechanism conducive to self-correction; and strengthening information security and privacy protection. Therefore, public credit, as a new type of administrative punishment measure, has gradually been transferred to the central government's remit, while local governments still retain a certain degree of freedom.
In addition to the national public credit information directory formulated by the central government, provinces and specific cities still have the right to formulate supplementary directories. In contrast, the enterprise credit industry is more developed and market-oriented and has become an important tool for local governments' market management. It has not been unified and centralized by the central government, and local governments have used credit and small and medium-sized enterprise financing information as leverage in different economic cycles. This matter has therefore always been located in Quadrant II. At present, the auxiliary function of credit information is still mainly directed at localized enterprises and other types of organizations, and the relevant information disclosure mainly serves local enterprises and organizations.
In the traditional economy, natural persons are often affiliated with enterprises or other social organizations and are managed by them. With the rise of digital platforms, production and consumption processes organized around individuals, including e-commerce, the sharing economy, and the gig economy, have become more influential. The constraints that traditional enterprise organizations impose on individuals have weakened, and a new platform-based governance approach is needed. In this context, platforms are able to track consumers' purchasing and usage behavior, thus closely linking production, circulation, and consumption. To support a digital market with faster flows, platform companies capture people's real payment scenarios through third-party payment services and keep expanding into offline services, providing a data foundation for consumer credit businesses and eventually forming a "big data credit" model through the scoring and labeling of individual consumers. Before 2020, the big data credit model within platforms had always been located in Quadrant II. Many leading platform companies used this type of model to promote popular consumer credit businesses and loan assistance models, and the state did not intervene. However, with the subsequent strengthening of personal information protection and the recognition of the risks of loan assistance models, the state gradually tightened their regulation. The People's Bank of China's Personal Credit Information Center provides public services, and market-based personal credit licenses are issued. Both operate in a centralized manner, so the matter has shifted to Quadrant III.
This section compares demonstrative and auxiliary public data, hoping to expand our cognitive framework for public data opening. First, the supply of data elements in the market includes not only demonstrative data but also auxiliary data. The former is not as important or extensive as previously imagined, while the latter is more important but often neglected. Second, it is necessary to re-examine the role of local governments in the process of public data opening. Only by seeing the structure they occupy under the administrative contracting system can we understand what type of opening they are more motivated to promote. Finally, public data opening has formed another ecosystem outside of exchanges, but it is more complicated for historical reasons and needs to be analyzed separately. The pursuit of interests by the central and local governments may be one of those reasons.
5. Conclusion
This paper revisits how the de facto "classification" of public data opening operates under the existing legal framework from the perspectives of function and structure. Classification is a common approach to handling public data, typically guided by the principle of risk. This approach can certainly cover the opening practices of some local governments, such as carefully anonymizing content involving personal information and trade secrets, but it cannot explain why we still need an entire credit system to disclose information about relevant social entities. Since the risk often cannot be substantiated, the risk framework is relatively subjective. The functional framework, by contrast, rests on a degree of cost-benefit analysis: it holds that the disclosure or use of specific information can itself promote the efficient operation of the market and meet the needs of market entities, and that this function needs to be established regardless of the type of data involved. How such functions are understood determines the investment and promotion efforts of different levels of government, all of which shows that the actual classification in practice effectively supports the relevant institutions on paper. This framework can be further applied to understanding important data, critical information infrastructure, data exchanges, and other regulatory matters that require classification.
In conclusion, the main points of this study are as follows. First, the types of public data opening are not infinite and need to be reclassified according to the function of the data. This is not to say that abstraction should be avoided, but that data must be considered from different perspectives; in particular, one must see the provision of information services with market infrastructure functions rather than the simple supply of data content. Second, it is important to pay attention to the productive structure and the hierarchical structure of data flow. The productive structure means that a given level of government is unlikely to track a wide range of social entities in real time, so it is weaker than platform companies in credit scoring and behavior analysis. Such public data works mainly through administrative punishment, regulatory convenience, and the provision of basic public credit. The hierarchical structure means that, for a given type of public data, central and local government departments (and their operating organizations) have formed a de facto classification in how different types of public data are opened and used (not merely a division of central-local powers). Finally, a significant portion of public data only serves to improve public services and does not help the general flow of market factors or the improvement of service quality. Therefore, in addition to market-based transactions, it is also necessary to see the more extensive data sharing with social entities driven by administrative power, and to promote authentic investment and transaction information so that cooperation can be better matched. The focus is not whether the data itself has formed a market, but what market the data serves; only then can one discuss whether administrative or market-based means are better able to meet that goal. In this process, the differing motivations and considerations of different levels of government will also affect the production and disclosure of different data.
For example, the digital identity infrastructure and the credit system for natural persons have both developed from the platform economy and then been extended to the general market, which shows that market mechanisms have helped to promote more unified credit services.
This paper does not provide a complete answer to how the supply of public data elements should be arranged between central and local governments, but hopes to show its complexity. Traditionally, the discussion of public data opening has focused on clarifying concepts, but changes in reality and in our understanding of it make it meaningless to focus on the concept itself. The key is to see the implications of the concept and the problems one wants to solve. This paper introduces a functional analysis framework into the analysis of public data, holding that it is first necessary to understand that the market needs some basic information infrastructure functions to operate normally, and that new types of data must constantly be created for public purposes as society changes. What matters, therefore, is production and usage capability, not fixed assets. Discussion from the perspective of function helps to break out of existing concepts and see how different types of data can realize their true value, and then to discuss, around that function, how to make the best value choice, whether through administrative or market means. Discussion from the perspective of structure helps to understand the constraints on the choices of social actors, especially the structural game between central and local governments, so that we can understand the constraints and limits that any legal enforcement will encounter. It is precisely in such a dynamic dimension that we can better think about how to adjust the relevant incentives and action space to promote the implementation of specific laws.
This paper also uses the issue of public data opening to extend the discussion to two sub-issues. The first is the relationship between the state and the market. The state can promote market construction by taking the lead on data elements. In addition to establishing new data markets, the state can also help traditional markets develop by releasing auxiliary information. From the state's perspective, the more data can serve as general market infrastructure, the more it needs to be uniformly authorized or licensed. This paper has discussed this point through the analysis of identity authentication information and credit information. The second is the relationship between the central and local governments. The central government has the incentive to centralize the construction of infrastructure services in order to promote the unified market. For other types of public data, however, the central government hopes to continue outsourcing them to local governments. The motivations of the central and local governments to use public data to promote different markets have led to a stable power game, and thus a cyclical process of public data production has formed.