ZHENG Ge
Keywords: AI ethics; mechanism design; incentive compatibility; revelation principle; implementation theory
Introduction
AI ethics is a hot topic that has attracted widespread attention. Yet, as is common with many popular issues, the field suffers from an abundance of rhetorical ethical principles and a dearth of effective implementation mechanisms. From international organizations to national governments, from industry associations to individual enterprises, and from technical experts to scholars in the humanities and social sciences, countless parties propose various ethical principles for AI. However, very few address how to translate these principles into practice. The reasons are straightforward: proposing principles is easy and appears noble, but when it comes to implementation, one must confront a host of complex practical issues, including but not limited to:
1. Severe Information Asymmetry:
Enterprises that develop and deploy AI technologies—controlling data, algorithms, and computing power—possess far more information than individuals or even governments. They not only manipulate information but also exercise digital power. Because of this asymmetry, even the enforcement of rigid legal rules is often outsourced to these companies under the banner of “platform responsibility”; flexible, merely guiding ethics fares even worse.
2. Complexity in Applying Ethical Principles to AI:
Traditional ethical principles assume human autonomy: people, endowed with free will, are held responsible for their actions. In contrast, AI systems are characterized by behavior that even their designers cannot fully control, as it continuously evolves through machine learning. Moreover, users engage with AI under the influence of attention-inducing mechanisms and “dark patterns.” Increasingly, people’s self-perception is mediated by digital mirrors powered by AI—trapping them in “information cocoons,” “filter bubbles,” or “echo chambers,” often without realizing it, or even embracing it for the convenience and emotional value it brings.
3. Complexity in Assigning Responsibility:
In traditional ethics, clear moral responsibility is ascribed to individual agents. In AI scenarios, however, determining responsibility is far more complicated. For example, if an autonomous vehicle is involved in an accident, it is difficult to ascertain whether the manufacturer, the software developer, the owner, or the AI system itself is responsible. Furthermore, in situations where AI independently makes decisions—such as a medical resource allocation system determining distribution based on its algorithm—the question of moral autonomy arises. Traditional moral decision making, based on free will and moral judgment, is thus challenged by AI’s data-driven processes, prompting us to rethink the essence and source of ethical decision making.
Legal research on AI ethics has long recognized these challenges. For instance, Shen Kui has identified an “effectiveness deficit” in AI ethics—the gap between the numerous ethical norms formulated and their failure to achieve the desired practical outcomes, leaving ethical risks unmitigated. He attributes this deficit to factors such as:
(1) Lack of enforceability: Most AI ethical norms lack legal binding force, so companies and individuals have little incentive to comply voluntarily.
(2) Abstraction and vagueness: Many norms are overly abstract and lack concrete implementation guidelines.
(3) Fragmentation: Differing ethical codes proposed by various actors (governments, companies, industry groups) lead to inconsistency.
(4) Insufficient voluntary compliance: Driven by economic interests, developers may favor profit over ethical commitments.
(5) Practical difficulties: Technological challenges may make implementing ethical norms costly.
(6) Communication barriers among social systems hinder widespread adoption of these norms.
(7) The pace of AI development outstrips the formulation and implementation of ethical guidelines.
Shen Kui proposes soft-law mechanisms to remedy this effectiveness deficit, including:
(1) Creating organizational structures—be they governmental, corporate, or third-party—to continuously promote the implementation of ethical norms.
(2) Using industry standards and market pressures to create compliance incentives.
(3) Providing economic or other incentives (such as certifications or awards) to encourage adherence.
(4) Developing technical methods (e.g., data privacy protection tools, algorithm transparency measures).
(5) Establishing unified benchmark standards to reduce fragmentation.
(6) Combining soft and hard law approaches to bolster ethical enforcement.
Li Xueyao, in his work, observes that mainstream approaches to AI ethics are heavily influenced by “principlism” from biomedical ethics—extending the four principles of respect for autonomy, nonmaleficence, beneficence, and justice into frameworks of seven, three, or even five principles (with additions such as explainability). He also argues that the inherent differences between AI and biomedicine render a straightforward transplantation of biomedical ethical models problematic.
His solutions, however, remain rooted in traditional legal approaches:
(1) Legalizing AI ethics—that is, incorporating moral and religious norms into legally enforceable standards—by using methods (such as Coffman’s “essence analysis”) to identify and protect substantive interests underlying legal norms.
(2) Employing the proportionality principle to reconcile ethical guidelines with legal rules, thus merging applied ethics with legal analysis.
(3) Enacting dedicated AI ethics legislation distinct from biomedical ethics law, possibly as a separate chapter within AI law.
While these scholars accurately diagnose the problems and reflect current social consensus, their proposed solutions tend to revert to traditional disciplinary approaches. Unfortunately, their strategies for dealing with revolutionary information technologies neglect issues of information distribution and cost and do not propose leveraging information technology itself—much like trying to rein in a spirited horse with mule reins. In reality, to harness a spirited horse one must understand its behavior and design incentive mechanisms so that its self-interested actions (aimed at outperforming all others) align with the controller’s objectives (to make one’s own horse win), using rewards and punishments to prevent it from running wild.
Concerned that existing institutions cannot keep pace with technological development, Geoffrey Hinton—the father of neural network algorithms and recipient of the Turing Award and a Nobel Prize in Physics—recently remarked, “The greatest problem facing humanity today is an Old Stone Age brain, medieval institutions, and godlike technology.” His point is that we have invented technologies that we cannot fully control or understand.
The human brain operates as an analog simulation system: neural signals, though akin to digital ones, work at low power with limited bandwidth (only a few bits per second), and knowledge is shared via language. In contrast, AI relies on transistors to process binary data (0s and 1s) at the instruction level, consuming vast energy while exchanging data at scales reaching trillions of bits with far greater efficiency. To control such superior information-processing and decision-making tools, we still depend on ethics and law inherited from the agricultural era—attempts to confine their use through simple, linear, rule-based commands that are ultimately doomed to fail.
Ethics is the social mechanism by which we morally evaluate human practices; it is an integral part of practical reason. A discourse that merely parades ethical principles without affecting practice is bound to suffer from an “effectiveness deficit.” As Suli has noted, “Focusing on legal discourse is important, but one must not miss the substantive legal issues. What matters in legal language is to address the matter at hand, to deeply understand the facts and context of specific disputes, and to compare the possible outcomes of different legal responses—not merely to reflect on words.” The same holds for ethics. Kant argued that all conscious human activity is ethical—actions are evaluated as right or wrong based on a universal moral law. For Kant, practical reason—our capacity for ethical action—is paramount, as it is not mere cognition but the ability to act that endows life with meaning. Detached ethical platitudes, therefore, cannot effectively influence practice.
Motivated by these fundamental concerns, this paper departs from conventional AI ethics debates that focus on abstract principles and their codification into algorithms (value alignment). Instead, it accepts the current state and forms of AI technology as given and, using mechanism design theory from economics, proposes an information-efficient, implementable mechanism for forming and enforcing AI ethics.
1. Applicability of Mechanism Design Theory in AI Ethics
1.1 What Is Mechanism Design?
Both Shen Kui and Li Xueyao have noted that AI ethics struggles with a lack of incentive for individuals and companies to voluntarily comply with ethical norms. Their solutions, however, lean toward legalizing ethics—adding rigidity and enforceability. Yet even legal enforcement faces the challenge of insufficient voluntary compliance, as it requires information about both norms and factual circumstances to “apply legal consequences based on facts and law.” Centralizing rule-making power, clarifying rules, and increasing enforcement can only address one dimension; the challenge posed by dispersed factual information is the true obstacle that digital technology presents to law and ethics.
Digital technology companies possess clear advantages in computing power, algorithms, and data relative to governments (and certainly individuals). They have both the capability and the incentive to create an appearance of compliance while acting solely in their own interests. Government digital infrastructure—such as e-government services powered by Alibaba Cloud, digital courts using iFlytek’s natural language processing, pandemic health codes supported by Alipay, and travel codes based on Tencent’s WeChat—is provided by these companies, leaving the government far behind in its ability to obtain comprehensive information.
The goal of AI ethics is precisely to compensate for the shortcomings of legal enforcement by obtaining information in a distributed manner. This way, even as technology providers pursue private gains, they do not completely disregard public welfare. Legalizing AI ethics runs counter to that aim.
Given that “private information” is pervasive and dominates the information landscape, designing an incentive-compatible mechanism that motivates information holders to voluntarily disclose information—and ensures that their pursuit of private interests also promotes public welfare—is exactly what mechanism design theory sets out to achieve. Originating from the study of information asymmetry and incentive problems in economics, mechanism design theory seeks to create rules and mechanisms that align individual actions with overall social objectives. Because information asymmetry is the major challenge in AI ethics—with various stakeholders proclaiming ethical principles that are hard for regulators and observers to verify—the theory is naturally applicable to this domain.
In a 1960 article, the Polish–American economist Leonid Hurwicz examined the efficiency of information transmission in economic systems and proposed methods for achieving optimal resource allocation under conditions of dispersed information. He emphasized that information efficiency is a key measure of an economic mechanism’s effectiveness. His work laid the foundation for mechanism design theory by exploring how to design mechanisms that achieve set objectives despite information asymmetry and self-interested behavior.
In a 1972 paper, Hurwicz introduced the core concept of incentive compatibility—providing an economic model for how self-interested individuals, possessing private information, can be led to cooperate for mutual benefit. This work has enabled economists to systematically address resource allocation problems in the face of information asymmetry.
The principle of incentive compatibility directly informs our approach to AI ethics. The “effectiveness deficit” observed in AI ethics—where numerous ethical norms fail to be implemented in practice—is essentially an incentive compatibility problem. Data is the lifeblood of AI, and acquiring data (including personal data and content protected by intellectual property rights) is central to training AI. Proposing ever-stricter laws and ethical requirements for personal data protection conflicts with companies’ inherent incentives; strict enforcement could stifle AI development and damage the industry. Likewise, most users prefer convenient, personalized services while still wanting to prevent excessive leakage of their data.
Thus, an incentive compatible ethical norm in this context is data security—that is, ensuring that companies safeguard the data used for providing specific services. In this sense, digital technology companies are naturally motivated to secure data, and doing so can lead to a Nash equilibrium where safeguarding privacy becomes an ethical standard. In Part II of this paper, the issue of cyber violence will serve as an example to further illustrate how to develop incentive compatible AI ethical principles.
Roger Myerson later developed the revelation principle—the second core element of mechanism design theory—which states that any equilibrium outcome of a mechanism can be achieved by a direct, incentive compatible mechanism in which participants truthfully report their private information rather than engaging in complex strategic behavior. For example, an auction can be designed so that false bids are not in a bidder’s best interest, and truthful bids both serve the bidder and promote collective welfare. The revelation principle suggests that we should develop mechanisms that encourage AI providers to directly disclose the ethical principles they follow, rather than merely proclaiming a set of principles they neither practice nor intend to practice—a safeguard against “ethics washing.” This issue will be discussed further in Part III.
In addition to the revelation principle, Eric Maskin contributed decisively to implementation theory—the third cornerstone of mechanism design. The core of implementation theory is to identify the necessary and sufficient conditions for Nash equilibrium implementation, encapsulated in what is known as “Maskin’s Theorem.” This theorem provides the mathematical foundation for mechanism design by proving under which conditions social objectives can be achieved through carefully designed mechanisms. Maskin’s work also includes a constructive argument (an explicit canonical mechanism) demonstrating sufficiency. In essence, Maskin’s Theorem can be summarized as follows:
1. Sufficient Condition:
If a social choice rule satisfies monotonicity and no veto power, then there exists a mechanism that implements it in Nash equilibrium.
(a) Monotonicity: If a social choice is optimal under one preference profile and, in moving to a different profile, does not fall in any participant’s ranking relative to any other option, then it remains optimal under the new profile.
(b) No Veto Power: If a social choice is regarded as optimal by almost all participants, no single individual should be able to block it. With three or more participants, the influence of any one individual is diluted, making this condition easier to satisfy.
2. Necessary Condition:
If a social choice rule is implementable in Nash equilibrium, then it must satisfy monotonicity.
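Stated compactly (a standard textbook formalization added here for reference; the notation is not drawn from the original text), let F map each preference profile θ to a set of chosen outcomes and let u_i(a, θ) denote participant i's utility from outcome a under profile θ. The two conditions then read:

```latex
% Maskin monotonicity: if a is chosen at \theta and does not fall in any
% participant's ranking relative to any alternative when moving to \theta',
% then a must still be chosen at \theta'.
a \in F(\theta)
\;\wedge\;
\Big[\,\forall i,\ \forall b:\; u_i(a,\theta) \ge u_i(b,\theta)
      \Rightarrow u_i(a,\theta') \ge u_i(b,\theta')\,\Big]
\;\Longrightarrow\; a \in F(\theta')

% No veto power: if at least n-1 of the n participants rank a at the top,
% then a must be chosen.
\big|\{\, i : \forall b,\; u_i(a,\theta) \ge u_i(b,\theta) \,\}\big| \;\ge\; n-1
\;\Longrightarrow\; a \in F(\theta)
```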
Moreover, Maskin’s research clarifies that mechanism design is particularly applicable in repeated interactions. In one-shot settings, only strategic behavior exists (the prisoner’s dilemma being a classic example). In repeated games, by designing appropriate mechanisms (such as reputation or punishment systems), social objectives can be achieved even when short-term incentives to defect are present. The key is to leverage incentive compatibility over long-term interactions, guiding participants toward cooperation that yields socially optimal outcomes. Indeed, all institutional mechanisms—including legal and ethical ones—are the result of repeated interactions where incentive compatibility becomes entrenched. AI research and applications, whether in commercial or public domains, are typical examples of repeated games where long-term investments in computing power, algorithm refinement, and data accumulation create conditions favorable to effective ethical mechanisms.
2. Incentive Compatibility
2.1 What Is Incentive Compatibility?
Incentive compatibility is a central concept in mechanism design. It means that within a given mechanism, each participant’s optimal strategy (given the actions of others) aligns with the desired behavior specified by the mechanism designer. In other words, participants have no incentive to lie or manipulate information because truthfully reporting their actual information (such as preferences or costs) is their best course of action. In mechanism design, incentive compatibility is typically associated with concepts such as Nash equilibrium or dominant strategy equilibrium. If a mechanism is incentive compatible, then truthful reporting is an equilibrium strategy for all.
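In the standard notation (added here for reference, under the usual assumptions of a direct mechanism with outcome function f and privately known types θ_i), dominant-strategy incentive compatibility says that truthful reporting is a best response to every possible profile of other participants' reports:

```latex
% Dominant-strategy incentive compatibility of a direct mechanism f:
% no participant i can gain by reporting \hat{\theta}_i instead of the true
% type \theta_i, whatever the others report.
\forall i,\ \forall \theta_i,\ \forall \hat{\theta}_i,\ \forall \theta_{-i}:\quad
u_i\big(f(\theta_i,\theta_{-i}),\,\theta_i\big)
\;\ge\;
u_i\big(f(\hat{\theta}_i,\theta_{-i}),\,\theta_i\big)
```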
2.2 Designing an Incentive Compatible Mechanism
Designing an incentive compatible ethical mechanism generally involves:
(a) Clearly Defining Objectives and Constraints: Objectives and constraints must be specific and measurable—not merely described using adjectives like “beautiful” or “good”—so that the possible actions participants might take to achieve the objective can be delineated.
(b) Determining the Strategy Space: Identifying the actions available to participants or the information they might report.
(c) Constructing an Interactive Mechanism: Designing a mechanism that maps participants’ strategies to outcomes. While many discussions of mechanism design focus on auctions, a one-shot bidding mechanism is not suitable for ethical governance. Instead, mechanisms that capture participants’ behavior over long-term interactions are more appropriate. A public institution should be tasked with collecting, analyzing, and processing the ethical choices manifested in outcomes—especially regarding large digital platform companies. Such an institution, acting as an “impartial observer,” is more effective than an ethics committee that drafts principles without adequate information.
(d) Verifying Incentive Compatibility: Using mathematical analysis or game-theoretic tools to confirm that truthful reporting is a dominant strategy or Nash equilibrium under the mechanism (a minimal sketch of such a check follows this list).
(e) Adjusting the Mechanism as Necessary: Refining the mechanism until incentive compatibility is achieved.
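As an illustration of step (d), the following is a minimal, self-contained sketch; the setting (two bidders, a small grid of valuations, a standard second-price rule) and all names are hypothetical choices made for illustration. It brute-force checks whether truthful reporting is a dominant strategy in a small direct mechanism:

```python
from itertools import product

# Toy setting: two bidders, valuations on a small discrete grid.
VALUES = [0, 1, 2, 3]

def second_price_outcome(reports):
    """Direct mechanism: the highest report wins and pays the second-highest report.
    Ties are broken in favor of bidder 0 for simplicity."""
    winner = max(range(len(reports)), key=lambda i: (reports[i], -i))
    price = sorted(reports, reverse=True)[1]
    return winner, price

def utility(i, true_value, reports):
    winner, price = second_price_outcome(reports)
    return true_value - price if winner == i else 0

def is_dominant_strategy_ic():
    """For every bidder, true value, misreport, and opponent report, truthful
    reporting must yield at least as much utility as the misreport."""
    for i in range(2):
        for true_v, lie, other in product(VALUES, VALUES, VALUES):
            truthful, deviant = [0, 0], [0, 0]
            truthful[i], deviant[i] = true_v, lie
            truthful[1 - i] = deviant[1 - i] = other
            if utility(i, true_v, truthful) < utility(i, true_v, deviant):
                return False
    return True

if __name__ == "__main__":
    print("Truthful reporting is a dominant strategy:", is_dominant_strategy_ic())
```

The same brute-force pattern extends to any finite strategy space; if the check fails, step (e) calls for adjusting the mechanism and checking again.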
2.3 Incentive Compatible Mechanism Design for Regulating Cyber Violence
Cyber violence is a problem that requires both legal and ethical solutions and is of widespread public concern. Unfortunately, few scholars have examined its connection to AI, much less discussed it from an AI ethics perspective. Most discussions of AI ethics remain at the level of abstract principles and fail to address ethical issues in concrete AI application scenarios. In reality, cyber violence—albeit unintentionally—exploits nearly every AI algorithm, and technical solutions to cyber violence are embedded within these algorithms. For example:
(a) Natural Language Processing (NLP) Algorithms: Used for text classification (to identify types of content), sentiment analysis (to gauge emotional tone), and entity recognition (to identify names and places). Platforms deploy these algorithms to drive traffic; extreme rhetoric and negative news attract attention and can inadvertently fuel cyber violence, even though the same algorithms can help detect malicious or abusive content.
(b) Machine Learning Algorithms: Both supervised (using labeled data to train models for detecting harmful content) and unsupervised learning (to discover anomalous patterns) are used.
(c) Deep Learning Algorithms: Convolutional Neural Networks (CNNs) process text and images to identify harmful content; Recurrent Neural Networks (RNNs) analyze sequential data (such as chat logs) to detect abusive language.
(d) Graph Neural Networks (GNNs): Analyze social network relationships to trace the propagation of harmful behavior.
(e) Anomaly Detection Algorithms: Identify unusual user behavior, such as mass posting of offensive content.
(f) Recommendation Algorithms: Analyze user interests to push related content, which can sometimes amplify harmful messages.
(g) Reinforcement Learning Algorithms: Dynamically adjust content moderation strategies to optimize the detection and filtering of harmful content.
These algorithms can be misused to exacerbate cyber violence but can also be harnessed to counter it. Technology here is a double-edged sword; ethics must guide its use for good. Discussing AI ethics without addressing how AI can help implement ethical outcomes is akin to treating ethics as a superficial veneer on technology—a stance that inevitably leads to an “effectiveness deficit.” As the popular saying goes, “only magic can defeat magic.”
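To make item (b) in the list above concrete, here is a minimal sketch of supervised harmful-content detection; the tiny hand-labeled corpus, the TF-IDF features, and the logistic regression classifier are illustrative assumptions, and a production moderation pipeline would be far more elaborate (large annotated datasets, multilingual models, human review loops):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled corpus (1 = abusive, 0 = benign); real systems use large annotated datasets.
texts = [
    "you are worthless and everyone hates you",
    "go away, nobody wants you here",
    "thanks for sharing, this was really helpful",
    "great article, I learned a lot today",
    "you should be ashamed, pathetic loser",
    "congratulations on the new job!",
]
labels = [1, 1, 0, 0, 1, 0]

# TF-IDF features plus logistic regression: a simple supervised text classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# Score new comments; a platform would flag high-probability items for review.
new_comments = ["nobody wants you here, loser", "thanks, great explanation"]
for comment, prob in zip(new_comments, model.predict_proba(new_comments)[:, 1]):
    print(f"{prob:.2f}  {comment}")
```

The point is not the specific model but that detection capability rests on the same algorithmic foundations as amplification, which is precisely what makes an incentive-compatible assignment of platform responsibility feasible.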
On a principle level, while no one openly endorses cyber violence, in reality thousands of ordinary people participate in it. Digital platforms—whose business models rely on attention mechanisms—objectively contribute to its spread, even if they profess indifference. Moreover, government responses aimed at preventing public opinion from spiraling out of control often fail to protect victims promptly; in some cases, victims are even penalized to calm the situation, creating a perverse incentive structure. To design an effective ethical and legal response to cyber violence, it is crucial to understand the clear objectives and constraints of all parties involved.
According to China’s “Regulations on the Governance of Cyber Violence Information,” “cyber violence” is defined as “illegal and harmful information distributed online—via text, images, audio, video, etc.—that involves insults, slander, defamation, incitement to hatred, threats, coercion, invasion of privacy, or content that harms one’s physical or mental wellbeing.” Article 33 provides exemptions for lawful reporting or public supervision of illegal activities. Cyber violence is pervasive mainly because:
(a) Online speech is rarely constrained by the social norms governing face-to-face interactions. In personal interactions, socialized adults exercise restraint; online, anonymity or pseudonymity and the lack of a shared community render reputational pressures ineffective.
(b) The cost of online speech is virtually zero—what once required substantial effort to craft and publish is now as simple as a few keystrokes.
Furthermore, perpetrators of cyber violence often feel an illusory moral superiority, a phenomenon underpinned by cognitive biases such as the blind spot effect and self-affirmation bias.
For platforms providing the infrastructure for online speech, enhancing “user experience” and increasing user “stickiness” are key business objectives. Cyber violence—where many netizens participate just for the thrill—fits neatly into these incentives. Thus, designing AI algorithms to detect trending topics (including those laden with cyber violence) aligns with these platforms’ commercial logic (for instance, Weibo’s trending lists have often amplified cyber violence). Capital exploits human weaknesses—such as vanity and the desire to vent anger—rather than promoting virtuous behavior. Unless we consciously use open social media as a platform for self-education, the “cocoon effect” of information will persist. This is not a problem that legal intervention alone can solve; for ethics to be effective, an incentive-compatible negative feedback mechanism is essential—and triggering such a mechanism requires government intervention.
Currently, aside from post-incident judicial remedies (sometimes including criminal sanctions), government responses to cyber violence mainly take the form of public opinion management. Key departments—the Cyberspace Administration, Publicity Department, Letters and Visits Office, and emergency management bodies—monitor and respond to public sentiment. Because the goal is to quell disturbances and maintain stability, the practical effect is often that “the bigger the uproar, the faster and harsher the response,” sometimes even punishing victims to suppress the issue. Given that in the digital age AI can cheaply fabricate fake news, videos, or images, achieving a fair resolution through complete clarification is either impossible or prohibitively expensive. Consequently, when uncertainty exists, the allocation of the burden of proof becomes a value judgment. The logic of stability tends to lead to outcomes unfavorable to cyber violence victims—an individual is insignificant compared to the collective of perpetrators.
In such situations, a positive feedback loop can form among all parties. Positive feedback, where the feedback signal reinforces the input, drives the system further from equilibrium—like a chain reaction of falling dominoes. At this point, the government should act as a negative feedback trigger. In control theory, negative feedback counteracts the input signal, helping restore equilibrium. For example, when reaching for a cup, one constantly adjusts one’s movements based on visual feedback until the cup is grasped.
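The control-theoretic contrast can be made concrete with a toy simulation (the linear dynamics, the gain, and the set point below are arbitrary illustrative choices, not a model of any real platform): positive feedback amplifies a small disturbance step after step, while negative feedback steers the state back toward equilibrium:

```python
def simulate(steps=10, gain=0.5, setpoint=0.0, mode="negative"):
    """Toy first-order system. Positive feedback amplifies deviations from the
    set point; negative feedback counteracts them."""
    x = 1.0  # initial disturbance
    trajectory = [x]
    for _ in range(steps):
        error = x - setpoint
        x = x + gain * error if mode == "positive" else x - gain * error
        trajectory.append(x)
    return trajectory

if __name__ == "__main__":
    print("positive feedback:", [round(v, 2) for v in simulate(mode="positive")])
    print("negative feedback:", [round(v, 2) for v in simulate(mode="negative")])
```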
For cyber violence incidents, government public opinion departments should not aim to validate or empower online bullies. Instead, they should require platforms to remove streams of cyber violence from the user interface (the “front end”) while retaining relevant data on the back end (servers) to serve as evidence for subsequent fact-finding and legal action. Understanding the psychological mechanisms behind cyber violence shows that targeting victims not only fails to quell the problem but may also encourage further abuse—since perpetrators are less concerned with the specific individual and more with venting collective anger. To activate a negative feedback mechanism, it is also necessary to leverage another component of mechanism design theory: the revelation principle.
3. The Revelation Principle
The revelation principle is a fundamental concept in mechanism design, stating that any mechanism can be transformed into a direct, incentive compatible mechanism where participants simply report their private information truthfully without needing to engage in complex strategic behavior. This direct mechanism produces equilibrium outcomes equivalent to those in a strategic game, with truthful reporting as the optimal strategy.
Originally conceived in the context of auctions—where bidding and pricing rules can be designed so that truth-telling is the best strategy—the revelation principle suggests that we can construct a direct mechanism in which AI providers are encouraged to disclose the ethical principles they actually follow, rather than merely proclaiming a set of principles they do not implement. This helps to prevent “ethics washing.” The auction example below illustrates the underlying logic.
For instance, consider a modified auction in which the winner pays twice the second-highest bid. In such an auction, the dominant strategy is not to bid truthfully but to bid half one’s valuation. Based on the revelation principle, an equivalent direct mechanism can be constructed in which bidders report their valuations truthfully and the auctioneer applies the adjusted pricing rule (charging the second-highest reported valuation), achieving the same outcome as the original auction. The key idea is that if we design the environment properly, truthful disclosure will be in every participant’s best interest. In the realm of AI ethics, if companies’ ethical practices are sufficiently transparent, then with appropriate ethical guidance or governmental negative feedback, improvements in overall social ethics can be achieved.
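The equivalence claimed above can be checked numerically. The sketch below (random toy valuations; all function names are illustrative) compares the indirect auction, in which bidders play the equilibrium strategy of bidding half their valuations and the winner pays twice the second-highest bid, with the direct second-price mechanism, in which bidders simply report their valuations; winner and payment coincide in every draw:

```python
import random

def indirect_auction(valuations):
    """Indirect mechanism: winner pays twice the second-highest bid.
    Equilibrium strategy: each bidder bids half of his or her valuation."""
    bids = [v / 2 for v in valuations]
    winner = max(range(len(bids)), key=lambda i: bids[i])
    payment = 2 * sorted(bids, reverse=True)[1]
    return winner, payment

def direct_mechanism(valuations):
    """Direct mechanism (revelation principle): bidders report valuations truthfully
    and the winner pays the second-highest reported valuation."""
    winner = max(range(len(valuations)), key=lambda i: valuations[i])
    payment = sorted(valuations, reverse=True)[1]
    return winner, payment

if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        vals = [round(random.uniform(0, 100), 2) for _ in range(4)]
        assert indirect_auction(vals) == direct_mechanism(vals)
        print(vals, "->", direct_mechanism(vals))
```

This is exactly what the revelation principle asserts in general: whatever equilibrium an indirect mechanism induces, a direct mechanism can replicate its outcome with truthful reporting.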
4. Implementation Theory
Implementation theory studies how to design a mechanism (i.e., a set of game rules) so that its equilibrium outcomes meet socially optimal standards. If, in every possible state, the set of equilibrium outcomes of a mechanism equals the socially optimal outcomes defined by a social choice rule, then that rule is said to be implemented by the mechanism. Maskin’s work, notably his paper “Mechanism Design: How to Implement Social Goals,” outlines a step-by-step process from defining objectives to designing mechanisms that achieve those objectives. The process can be summarized as follows:
1. Define the Social Objective:
Clearly state the social objective or desired outcome—not in vague terms like “uplifting” or “virtuous” but in concrete, achievable terms within realworld constraints. For example, China’s “Guidelines for Algorithm Governance” list “uplifting and virtuous” as a major goal while also specifying more concrete targets (see Table 2).
2. Identify Constraints:
Recognize that mechanism designers typically lack complete information about individual preferences, necessitating the design of an incentive-compatible mechanism to steer behavior toward the predetermined objective. For instance, this might involve prompting companies to disclose verifiable information about their ethical practices regarding cyber violence prevention rather than issuing abstract ethical proclamations.
3. Design an Incentive-Compatible Mechanism:
Develop a mechanism that ensures that, under equilibrium conditions (such as Nash equilibrium), the strategies chosen by the participants yield the desired outcome. For example, a strategy matrix might detail the current behaviors of companies and indicate the improvements expected by society—using government regulatory oversight and potential negative evaluations as negative feedback mechanisms. Such a mechanism should rely on observable manifestations of algorithmic impact (such as user interface design) rather than on unrealistic demands like “opening the algorithmic black box.”
4. Ensure Robustness:
Check that the mechanism satisfies key properties such as monotonicity and no veto power to guarantee its implementability in Nash equilibrium. Monotonicity ensures that if an outcome does not worsen with changes in individual preferences, it remains optimal; no veto power ensures that no single participant can block a widely accepted outcome (a small check of these properties is sketched after this list).
5. Verify and Promote:
Validate the mechanism with concrete examples (e.g., the “Qinglang Action”) and generalize the findings through theoretical refinement.
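As a companion to step 4, the following sketch (a toy three-participant, three-outcome setting; the weak Pareto rule is chosen only because its properties are easy to verify by hand) brute-force tests whether a given social choice rule satisfies Maskin monotonicity and no veto power as defined earlier:

```python
from itertools import permutations, product

OUTCOMES = ("a", "b", "c")
RANKINGS = list(permutations(OUTCOMES))          # strict preference orders, best first
N_AGENTS = 3
PROFILES = list(product(RANKINGS, repeat=N_AGENTS))

def prefers(ranking, x, y):
    """True if x is ranked at least as high as y in this strict order."""
    return ranking.index(x) <= ranking.index(y)

def weak_pareto_rule(profile):
    """Toy social choice rule: all weakly Pareto-efficient outcomes, i.e. outcomes
    for which no alternative is strictly preferred by every participant."""
    return {
        a for a in OUTCOMES
        if not any(all(r.index(b) < r.index(a) for r in profile)
                   for b in OUTCOMES if b != a)
    }

def maskin_monotonic(rule):
    choices = {p: rule(p) for p in PROFILES}     # cache rule values
    for theta, theta_prime in product(PROFILES, repeat=2):
        for a in choices[theta]:
            # "a does not fall": whenever i weakly preferred a to b at theta,
            # i still weakly prefers a to b at theta_prime.
            no_fall = all(
                (not prefers(theta[i], a, b)) or prefers(theta_prime[i], a, b)
                for i in range(N_AGENTS) for b in OUTCOMES
            )
            if no_fall and a not in choices[theta_prime]:
                return False
    return True

def no_veto_power(rule):
    choices = {p: rule(p) for p in PROFILES}
    for theta in PROFILES:
        for a in OUTCOMES:
            if sum(r[0] == a for r in theta) >= N_AGENTS - 1 and a not in choices[theta]:
                return False
    return True

if __name__ == "__main__":
    print("Maskin monotonic:", maskin_monotonic(weak_pareto_rule))
    print("No veto power:  ", no_veto_power(weak_pareto_rule))
```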
On November 12, 2024, a Joint Notification titled “Qinglang · Governance of Typical Algorithmic Issues on Online Platforms” was issued by the Secretary Bureau of the Cyberspace Administration, the General Office of the Ministry of Industry and Information Technology, the General Office of the Ministry of Public Security, and the General Office of the State Administration for Market Regulation. This notification proposed concrete measures to address common practices by digital technology companies—using algorithms to harm individual rights and public interests—and urged enterprises to conduct thorough self-inspections and make rectifications. Its attachment provides detailed “Guidelines for Algorithm Governance,” which list 27 rectification recommendations across six major areas, including “information cocoons,” trending topics, rights of new employment forms, big data price discrimination, ethical algorithm design, and accountability for algorithmic security. From the perspective of mechanism design theory, this notification and the ensuing actions can be understood as follows:
First, the action required is not about “cracking down” or “severe punishment” but is a targeted enforcement campaign aimed at protecting network and data security, personal rights, labor rights, consumer rights, and the rights of youth and the elderly—all of which are legally protected. This reflects both a legal baseline and ethical guidance.
Second, the enforcement method focuses on “platform responsibility,” urging companies to conduct self-inspections and make rectifications. The guidelines serve as clear instructions so that companies can evaluate whether their algorithmic designs meet the requirements and adjust if they do not.
Third, the choice of enforcement objectives reflects a correct understanding of the commercial models underlying algorithm applications in the digital economy. The rectification proposals are both practically targeted and value-oriented, based on an accurate grasp of the current mainstream algorithmic designs of digital companies and their impacts on user rights and public interests—thus satisfying the principle of incentive compatibility.
Finally, these rectification recommendations can be implemented using existing legal frameworks—such as those governing network and data security and personal information protection—where platform responsibility is a core concept. In today’s digital age, as society increasingly depends on digital technologies to process vast amounts of behavioral data, it is impossible for regulators to oversee every aspect of data processing. Individual litigation or administrative remedies can only address isolated cases. Traditional rights-based legal models and command-and-control regulatory approaches are inadequate. In this context, the most pragmatic institutional design is for the law to set out a framework of basic, bottom-line principles and, through platform responsibility, impose compliance obligations on data controllers and processors—internalizing these principles into their operating practices and business costs, with the government monitoring and intervening when necessary. This model—integrating rights protection and risk regulation—is often referred to as “meta-regulation” and is exemplified by the “Qinglang Action” as a new “ethics + law” approach in the digital era.
Of course, the “Qinglang Action” is a targeted campaign rather than a permanent mechanism. Mechanism design theory tells us that top–down attempts to solve problems once and for all will inevitably encounter endless grassroots challenges. Ethical principles that ignore incentive compatibility cannot solve the problem of information costs; effective ethical implementation requires not only that you know something but that I know you know it, and so on—a recursive reasoning process involving the balance of private and public information. In this light, there may be no universally applicable AI ethics, only a series of targeted campaigns addressing specific issues (e.g., cyber violence or youth internet addiction), with practices that satisfy incentive compatibility becoming embedded in the relevant stakeholders’ operations. These campaigns serve to collect information, conduct cost–benefit analyses, evaluate ethical compliance, and propose corrective measures when necessary.
Conclusion
Mechanism design theory studies how, under given information structures and resource constraints, one can design a mechanism or set of rules so that the self-interested actions of participants align with the social objectives determined by the designer. In the field of technology ethics, mechanism design theory can guide us in creating mechanisms that both foster technological innovation and ensure that such activities adhere to ethical norms. Designing an incentive-compatible AI ethics mechanism requires considering multiple factors—including the establishment of ethical norms and behavioral guidelines, the setup of ethical review and regulatory systems, the implementation of incentive and punishment measures, the enhancement of ethics education and training, and the promotion of public participation and oversight. By implementing these measures, we can promote the healthy development of AI technologies, ensure that technological activities conform to ethical standards, and achieve a harmonious coexistence between technology and society.
In theorem form, the content of this paper can be summarized as what I call the “Impartial Spectator” Theorem, which consists of three propositions:
1. The Incentive Compatibility Theorem for AI Ethics (Theorem I):
An ethical principle can only be behaviorally binding if it roughly aligns with the inherent incentives of the actors it seeks to regulate—that is, if it does not force them to act in ways completely divorced from their normal behavior solely to satisfy formal requirements. Otherwise, the ethical principle will devolve into mere formalism.
2. The Revelation Theorem for AI Ethics (Theorem II):
Ethical principles should not be imposed top–down by Kantian “moral legislators” but should emerge from the equilibria reached by all relevant actors through social interaction—principles that, over the long term, are compatible with both self-interest and public welfare. Under such conditions, a direct mechanism will develop in which all actors are willing to report the ethical practices they actually implement, since doing so is in their best interest. In this framework, the government’s role is primarily to verify whether these publicly stated practices conform to accepted ethical standards and whether there is clear evidence that an actor is not fulfilling them. The government thereby acts as an information recorder and, with its coercive power to punish and reward, enforces ethical behavior.
3. The Implementation Theorem for AI Ethics (Theorem III):
AI ethical principles that satisfy the above two theorems will inherently meet Maskin’s conditions of monotonicity and no veto power. That is, when ethical principles are distilled from practical experience in a bottom–up manner to reflect an optimal equilibrium state, any concentrated force representing public welfare (typically the government) must consider whether its corrective measures disturb the equilibrium among various factors. Only by making gradual, targeted adjustments can the overall ethical state of a digital society—where AI is ubiquitous—be continuously improved.