2024-11-19

On the Universal Representation Method of Legal Rules in Computers



Author: DENG Jinting, Associate Professor, Renmin University of China



Abstract: The theory of computer representation of legal rules is the foundational theory for the continued development of computational law and for deepening our understanding of the application of computers in the field of law. Current representation methods are based mainly on rules and on task-specific models, and they suffer from poor scalability, poor interpretability, and low accuracy. A universal representation method for legal rules should instead be built on the logical and structural characteristics of the legal rules themselves. Starting from legal rules, we can identify common structures, relationships, and elements; construct a universal dataset; and establish a universal model and rule graph. In this way, the process of judicial syllogism can be expressed while improving accuracy, scalability, and interpretability, deepening the controllability of computer applications in the field of law, and enriching the discussion of scenario theory and related theories of algorithm regulation.

Keywords: Computational Law, Legal Rules, Computer Representation, General Elements, General Model, Rule Graph


1. Overview and Significance of Computer Representation Theory for Legal Rules


Legal rules stipulate legal rights, obligations, and responsibilities, or give legal meaning to a certain factual state. Together with legal principles, they constitute legal norms and are the main elements of the legal system, manifested through legal provisions in normative legal documents. Computer representation of legal rules therefore means using computer language to represent the legal provisions containing legal rules and to present them on computer systems. Computer language is the language used for communication between humans and computers, including assembly language and high-level languages. Human instructions are first conveyed to an application layer through a high-level language, and then gradually translated and interpreted into machine language that the hardware can execute directly. In this way, legal rules expressed in human language are transformed into digital electronic signals that computers can read and store. This is a very complex process, but after years of development it can now be achieved with ease. At present, all currently valid normative legal documents in China can be viewed on a computer.

However, this computer representation only facilitates human reading of legal rules on computer systems: digital electronic signals are encoded to correspond one-to-one with text, and the text in turn represents the legal rules. Such representation essentially still uses human language to represent legal rules, with the meaning of the rules determined through the understanding of words. As computer systems have grown more complex, the understanding of these texts, the relationships between them, and the structure of legal provisions have gradually come to be represented through digital electronic signals as well; for example, different legal documents and legal provisions can be linked through words with the same or similar meanings. At this point, computers not only present legal rules for people to read but also partially participate in people's understanding of legal rules. This representation goes further than the earlier conversion of bare text into electronic signals: it partially converts the understanding of text into electronic signals, bringing the representation of legal rules in computer language closer to their representation in human language. One can imagine that once the understanding that ordinary legal professionals have of the text and structure of legal provisions is converted into electronic signals, the representation of legal rules in computer language will be essentially equivalent to their representation in human language, and computers' understanding of legal rules will be similar to that of ordinary legal professionals.

Therefore, the theory of computer representation of legal rules is not only a theory of presenting text containing legal rules in computer systems, but more importantly a theory of representing in computers the understanding that ordinary legal professionals have of legal provisions, thereby fundamentally realizing the representation of legal rules in computer language. Beyond the one-to-one correspondence between text and electronic signals, this theory also covers converting into electronic signals the meaning of legal text, the relationships between texts, and the structure of texts. In the future, computer language may even replace human language in directly constructing and representing legal rules.

The theory of computer representation of legal rules is the fundamental theory for the further development of computational law. Since the beginning of the big-data era, computational law, whether as a new paradigm of legal research or as a new discipline of law, has been constantly discussed in the legal community. It is generally understood as the study of computer science applied in the field of law. Many specific studies have already used computer technology to solve legal problems: for example, using computers to automatically obtain, process, and analyze judgment data for empirical legal analysis, or designing and training models for specific legal tasks so that those tasks can be resolved automatically. Whether as a research method or as a new discipline, the further development of computational law relies on the development of computer technology's ability to solve legal problems. This ability has two aspects: first, the ability to transform legal problems into computational problems; second, the ability to apply computer technology to help solve the transformed problems. The first is, in other words, the ability to express legal problems in computer language. Law is a relatively ancient discipline, and its problems are wide-ranging and variously classified. One common classification distinguishes problems of legal interpretation and application, legislative design, the judicial system, and the extraction of judicial trial experience. The core issue of the discipline, however, remains the interpretation and application of legal rules: the problem of reaching a specific judgment in a case through rigorous logical reasoning and value judgments based on established legal provisions. Many other legal issues revolve around this one. The computer representation of the interpretation and application of legal rules is naturally the fundamental issue in transforming legal problems into computational problems. If problems of legal rules can be well represented by computers, other legal problems will be easier to transform into computational problems, since they can often be solved by recasting them as representation problems for certain types of legal rules.

The theory of computer representation of legal rules is also a fundamental theory for understanding the application of computers in the field of law. With the emergence of weak artificial intelligence, more and more specific legal scenarios are using computer technology to reduce workloads and improve the efficiency of judicial work and legal services. At the same time, there is much discussion and concern about whether computer technology should be used in these scenarios and whether such applications comply with the spirit of the rule of law and even with moral ethics. Whether one is further exploring the application of computer technology in the field of law or discussing its limitations and the concerns it raises, one must first establish an understanding of how computers are applied in this field; otherwise such discussions remain merely empirical and uncertain, lacking academic rigor. Only by theoretically grasping the characteristics and laws of computer applications in the legal field can we reliably discuss the problems these applications present. Understanding these applications also follows the two-step approach of computational law: first understanding how the legal problem being solved is transformed into a computational problem, and then understanding how that problem is solved with the help of computer technology. The computer representation theory of legal rules, as the fundamental theory for transforming legal problems into computational problems, is thus also a fundamental theory for understanding these applications. Through it, we can understand how legal rules are understood inside computers, and thereby clarify how computers complete established legal tasks based on those rules.


2. Current Computer Representation Methods for Legal Rules and Their Limitations


2.1 Legal rules are currently represented mainly through rules and task-based models

The path of converting legal texts into electronic signals has become very mature after years of development. This part of the representation is the computer representation of text, achieved mainly by establishing a code that puts each character in one-to-one correspondence with a number. Encoding text means arranging 0s and 1s in a fixed order and length to represent a given character, and using that character sequence as the unified form in which the character is recorded, stored, exchanged, and transmitted within a computer system. The codes related to the Chinese character library include the national standard GB code, the GBK code, and the Hong Kong/Taiwan BIG-5 code. Using these encodings, together with the data structures, software program formats, and programming requirements of the common protocols at the various layers of the computer, legal texts can be read and stored on a given computer system. So that the languages and scripts of different countries can be encoded without conflict, the International Organization for Standardization (ISO) has released a series of coding standards, and the encoding of Chinese character libraries follows them. As Chinese character libraries have developed, more and more Chinese characters, punctuation marks, and other symbols have been encoded into them. These character libraries must remain mutually compatible so that Chinese applications are unaffected by library iteration and can be used continuously. These Chinese codes therefore only determine, under the ISO standards, which coding areas may be used for Chinese, and then place characters into those areas one by one in the order in which they appear in a general dictionary. Even with iteration, the development of character libraries can only expand the encoding area, for example from double bytes to four bytes, so that more characters can be encoded.
However, the characteristics of character shape, pronunciation, and meaning were not taken into account when the specific encodings were determined, so no information about a character's shape, pronunciation, or meaning can be obtained from its encoding.
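The point that an encoding carries no semantic information can be shown with a short sketch in Python; the characters chosen here are arbitrary examples:

```python
# 法 ("law") and 水 ("water") share the water radical; 火 ("fire") does not.
# Their code points and byte encodings reveal none of this: the numbers
# are assigned by standard/dictionary order, not by shape or meaning.
for ch in ("法", "水", "火"):
    print(ch,
          ord(ch),                   # Unicode code point (a bare number)
          ch.encode("utf-8").hex(),  # ISO/Unicode-standard byte sequence
          ch.encode("gbk").hex())    # Chinese national-standard GBK bytes
```

Nothing in the printed numbers indicates that the first two characters are semantically related while the third is not, which is exactly the limitation described above.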

On the basis of these Chinese encodings, Chinese-language applications have made significant progress. However, because these encodings contain no information beyond the correspondence to a character, they also limit how much of a word's meaning the encoding can represent. This is why the further development of natural language processing led to representing words as word vectors in a vector space, so that numerical information could carry more of a word's meaning. A computer working through such encodings can represent the meaning of words only by associating a number with each word and by treating words represented by the same number in different files as the same word. If all the characters and punctuation in two paragraphs are represented by the same numbers, the computer will treat the two paragraphs as identical. On this basis, words and phrases with similar meanings can be associated through rules, telling the computer that if certain words appear in a paragraph, they may be treated as the same even though their corresponding numerical sequences differ. These associations form the foundation of associative retrieval. By continuously adding rules, the meanings of words can be partially expressed, enabling computers to recognize certain words and phrases with similar meanings. For example, some applications count the frequency of the same words appearing in different files and use this to estimate how similar the files' meanings are: the more identical words two files contain, and the more frequently those words appear, the more similar the files are taken to be.
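The frequency-based similarity just described can be sketched as a toy computation; this is a generic bag-of-words cosine score over shared word counts, not the algorithm of any particular product:

```python
from collections import Counter
import math

def file_similarity(text_a: str, text_b: str) -> float:
    """Score two 'files' by shared word counts (cosine over raw counts):
    the more identical words, and the more often they appear, the higher."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

doc1 = "the court held the defendant liable for damages"
doc2 = "the court found the defendant not liable"
doc3 = "rainfall statistics for the northern region"
print(file_similarity(doc1, doc2))  # higher: many shared words
print(file_similarity(doc1, doc3))  # lower: few shared words
```

Note that the score rises with shared word counts alone; it knows nothing about meaning, which is precisely the limitation of encoding-level representation discussed here.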

Although current text encoding merely assigns a numerical sequence to each character, and the arrangement, structure, and values of that sequence were not designed to relate to the character's meaning or structure, it does represent text as numbers. A binary sequence can be read as a numerical value (a decimal number), so through encoding a character is represented as a number. With the later emergence of word vector technology, text came to be represented as a multidimensional vector in space, which is in essence also a set of numbers. In this way, the paragraphs and chapters composed of text can be represented as combinations of numbers. These numbers stand for the words and can participate in specific mathematical calculations, which, combined with the function computations available on current computers, allows the representation of word meaning to be partially achieved through function operations and parameter calculations. One assumption here is that when text is represented as a number, certain mathematical characteristics of the number, such as its magnitude and direction, should be related to characteristics of the text such as its meaning and shape, if the number is to represent the word's meaning, even though a number was randomly associated with each character during encoding.
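The assumption stated above can be made concrete with toy word vectors; the vectors below are invented purely for illustration and come from no trained model:

```python
import math

# Invented 3-dimensional "word vectors". In a real embedding model the
# direction of a vector is learned so that related words point in similar
# directions; here the numbers are hand-picked to illustrate the idea.
vectors = {
    "bribery":      (0.9, 0.1, 0.2),
    "embezzlement": (0.8, 0.2, 0.1),
    "banana":       (0.1, 0.9, 0.0),
}

def cosine(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

print(cosine(vectors["bribery"], vectors["embezzlement"]))  # near 1
print(cosine(vectors["bribery"], vectors["banana"]))        # much smaller
```

Here the direction of the vector, a purely mathematical characteristic, is what carries the relatedness of the two charges, which is exactly what a designed encoding lacks.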

Based on this assumption, there is currently another method beyond rules for having the computer automatically determine the meaning of a paragraph or file, or represent the meaning of a text: task-based modeling. Specifically, the meaning of a piece of text is extracted and annotated with a few simple labels capturing its general meaning. Then many different texts with the same label are collected, or in other words, many different textual expressions that can be reduced to the same label. These texts are converted into numbers through encoding, the mathematical characteristics of the numbers are computed, and the computer builds a model reflecting the relationship between the numbers and the labels. Once such a model is established, if words with similar meanings are input, the computer can compute the mathematical characteristics of the numbers representing them, calculate the output under the model, and map the output to the labels, concluding that these words point to those labels and thereby extracting their meaning. Likewise, if text without a similar meaning is input, the computer will calculate under the model that it does not point to those labels, obtaining the understanding that "the meaning of these words is not these labels". So, by training on existing annotated data and building a model, the computer can determine whether the meaning of a given text belongs to a given label. This method is known as machine learning.

It can be seen that the computer representation of legal provisions, that is, the computer representation of text, has evolved continuously in response to the needs of computer technology and its applications. At first the goal was only to store and transmit text through computer systems, so text needed only to be represented as numbers. Later came the hope that computers could also represent the meaning of text; since the earlier encodings had already played an important role in computer applications, are difficult to replace from scratch, and have no better alternative, development has naturally continued on top of the existing representation and its logic. On the one hand, similar characters are associated through specific rules; on the other hand, models are built by computing the mathematical characteristics of representative characters' encodings or word vectors, so as to map the meaning of a passage of text onto labels.


2.2 Limitations of the current representation methods

The development of computer representation of legal rules in both directions is limited by the computer's ability to represent the meaning of text. As for storing word meanings in computers through rules: because computers operate on electronic signals, and the meaning, structure, and other characteristics of characters are not reflected in their encodings, computers cannot grasp word meanings through the encodings, and it is difficult for them to analyze and reason on that basis or to grasp the complex and varied relationships between different words. For example, characters sharing the same radical often have related meanings, a given radical indicating a certain category of meaning; none of this information is reflected in the encoding. Only simple semantic relationships can be stored in the computer through rules, and the computer cannot reason on their basis. The meanings of characters, and the combinations of different characters and their meanings, grow exponentially, so by this method computers can advance only slowly in representing the meaning of characters.

As for task-based modeling, machine learning methods are mainly used, and traditional machine learning has further developed into deep learning. At present, most applications related to the smart rule of law use deep learning: a specific task is determined, a dataset is built for the task, a model is trained with deep learning algorithms, the model's effectiveness is tested for further improvement, and the model is then run to complete the task. Because this method directly computes, without theoretical basis, an empirical relationship between the mathematical characteristics of text represented as numbers and the meaning of the text, a large amount of sample data is required to summarize that experience. For situations with only small samples, such as the meanings of individual words, basic concepts, and complex expressions of meaning, this method cannot be used. For abstract interpretive tasks it cannot be used either, because there is no specific problem from which sample data could be built. This is why a specific and clear task scenario is a prerequisite for the method: only then can specific data be formed. Only by knowing exactly which meaning of a textual expression is to be judged can one collect and annotate textual expressions that do and do not carry that meaning, allowing the computer to compute the mathematical characteristics of each sample and its relationship to the labels and to build a model.

Because these models are established empirically for specific tasks, they lack scalability and are difficult to reuse for other tasks. Moreover, because deep learning models are so large, it is difficult to understand the relationship between the mathematical characteristics of the numbers representing text and the meaning of the text, under what circumstances that relationship holds, and how it evolves, for example the relationship between a word vector's magnitude, direction, and dimensionality and the word's meaning. Even within a specific task, therefore, it is impossible to understand how the model links a given textual expression with a given label's meaning. This has drawn wide criticism. First is the algorithmic black-box problem: the connections the model establishes cannot be explained, and how judgments based on the model are reached cannot be understood. Second, the model's reliability cannot be determined theoretically; its effectiveness can be judged only by performance on the test set. That effectiveness never reaches 100%, yet there is no theoretical basis for discussing how high it ought to be, and for more complex tasks the accuracy is often not high, sometimes dropping to 70-80%. Third, the model's performance can be severely affected by the state of the sample data, such as imbalanced positive and negative samples, unclean data, or biased data; if the data change, the model must be discarded and rebuilt from scratch.

In summary, after text encoding, the computer representation of legal rules has continued to attempt to represent the meaning of rules in two directions. In storing specific knowledge in computers through rules, some large companies are slowly pushing forward with large amounts of manual labor. In machine learning, many specific applications have been developed, but they are scattered, fragmented, and difficult to integrate. For many years there has been almost no progress in computer science on the interpretation of abstract rules, individual words, and basic concepts. To this day, natural language technology can only segment sentences and words and determine parts of speech effectively; it cannot determine the meaning of a word independently of task and context.


3. General representation method of legal rules in computers


Like the representation of text generally, the current representation of legal rules first represents the text of the legal provisions containing them, and then represents the meaning of the rules through two methods: rule-based representation and task-based modeling. The difference is that, compared with ordinary language, legal provisions have more regular expressions, more rigorous structures, more standardized interpretations, and more complex variations. With the continuous improvement of legislative technique in China, these characteristics of legal provisions have become more prominent; for example, the Civil Code promulgated in 2020 systematically compiled many scattered civil laws and regulations into a single code. Legal provisions also contain concepts and structures with specific legal meanings, which require background knowledge to master correctly. More importantly, understanding legal rules also includes the deductive process of connecting them with specific factual scenarios, that is, applying the law to specific facts to obtain the corresponding legal result.

Given these characteristics, the meaning of legal rules is harder to represent than that of ordinary words. The richer structure of legal provisions makes expressing them through rules very complex. Legal concepts are often abstract words and phrases for which no data exist, making it impossible to build datasets and train models. Connecting legal rules with specific factual scenarios currently relies mainly on deep learning, and the strong requirement of interpretability in legal application naturally hinders the use of that method. The deductive process of legal rules cannot be achieved through rules either, because computers have only a numerical memory of the text's representation and cannot understand the meaning of individual words, let alone legal provisions with specific meanings. To this day, computers' representation of the meaning of rules remains very limited.

Yet legal rules have more rigorous structures and characteristics than ordinary words, which should make their meanings easier to determine, precisely so that they can better guide behavior and make legal consequences predictable. The meaning of legal rules should therefore be easier for computers to represent. Because current representation methods do not exploit the structural characteristics of legal rules, however, legal rules fare no better than ordinary text under the current approach. On the one hand, the stronger logic and normativity of legal rules are embodied in their structure and concepts, while existing computer representation methods pay little attention to the relationship between the structure and basic concepts of words and their meanings and do not reflect that relationship in the computer's representation logic, so the advantages of legal rules are difficult to exploit in existing computer representations. On the other hand, because existing methods lack basic word understanding and logical reasoning over the meaning of text, it is difficult to represent the structural characteristics and deductive process of legal rules; legal rules are rich in legal concepts, all of which are abstract basic words, so models cannot be trained on annotated data to make computers understand them.

Accordingly, to develop the computer representation of legal rules, the difficulties of rule representation must be overcome and existing representation methods improved so as to fully exploit the characteristics of legal rules. Legal provisions should be decomposed into elements, sub-elements, and the situations enumerated by the provisions, collectively referred to as elements. Common structures, relationships, and elements in legal rules should be identified, and these common elements modeled and represented with graphs, so as to strengthen the computer's logical reasoning and grasp of basic concepts when expressing the meaning of legal provisions, cope with the exponential variation of textual expression, and compensate for the computer's lack of life experience. The author calls this method of representing legal rules, based on the general structures, relationships, and elements of legal provisions, the general representation method. Developing this method would be groundbreaking for both law and computer science.


3.1 Decompose legal provisions and model common elements

Besides improving character libraries and encodings, an important method for the computer representation of word meaning is task-based modeling. Combined with the powerful computing capacity of computers, this method has seen great development and rich application in recent years, with much room left to grow. The computer representation of legal rules should make full use of this technology by decomposing legal provisions and modeling common legal elements, thereby improving the technology's accuracy, scalability, and interpretability.

At present, this technology is mainly applied to a legal problem in some line of business taken as a whole, relying on deep learning to complete the task automatically. Automatic conviction, for example, means training a model to help determine whether a new case falls under a certain charge. The implementation treats this as a binary classification problem: a given charge, or not. The entire factual-description section of tens of thousands of existing cases (or more) is labeled as positive and negative samples; an existing model architecture then reads in 70-80% of these samples for training, with the rest held out as a test set. The model fits its parameters to the training data, and the test data show how well the trained model judges. If the results are good, a new case description can be fed into the trained model, which computes an output value to classify it as a case of that charge or not. The biggest problems with representing legal rules this way are low accuracy and the inability to explain how the trained model reaches its judgments.
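A drastically simplified sketch of this whole-text pipeline follows; a nearest-centroid word-count classifier and four invented one-line "cases" stand in for a deep model and real judgments, so only the shape of the procedure is faithful:

```python
from collections import Counter

# Invented toy "factual descriptions"; label 1 = the target charge.
samples = [
    ("official demanded property for helping with a contract", 1),
    ("official accepted money and sought benefits for the giver", 1),
    ("driver caused a traffic accident while speeding", 0),
    ("defendant stole goods from a warehouse at night", 0),
]

def train(data):
    """'Training': sum word counts per class, a crude stand-in for
    fitting the parameters of a real model."""
    centroids = {0: Counter(), 1: Counter()}
    for text, label in data:
        centroids[label].update(text.split())
    return centroids

def predict(centroids, text):
    """Classify a new description by word overlap with each class."""
    words = Counter(text.split())
    scores = {label: sum(words[w] * c[w] for w in words)
              for label, c in centroids.items()}
    return max(scores, key=scores.get)

model = train(samples)  # with real data, only 70-80% would be used here
print(predict(model, "official demanded money for a favor"))
```

Even in this toy, nothing explains *why* a description is classified one way or the other beyond raw word overlap, which previews the interpretability problem discussed below.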

By decomposing legal provisions and modeling common legal elements, interpretability can be increased and accuracy improved. Putting a case's entire factual description into model training naturally makes it difficult to understand how the model relates the mathematical characteristics of so many word encodings to the case's classification. But if the problem is decomposed into several steps yielding more fundamental sub-problems, each modeled separately, the significance of each model and its impact on the final judgment can be better understood. This amounts to using legal knowledge to participate in machine learning and partially control its process. Take automatic conviction as an example. The current method starts not from the legal provisions but from annotated data. Decomposing the problem instead starts from an understanding of the legal provisions, breaking the provision in the Criminal Law that stipulates a given charge into its normative elements of conviction. For the crime of accepting bribes, Article 385 of the Criminal Law can be decomposed into the following elements: (1) the perpetrator is a state official; (2) who takes advantage of the convenience of his or her position; (3) to solicit or illegally accept property from others; (4) and to seek benefits for others. The charge is established when all four elements are met, so Article 385 can be represented by these four elements. These normative elements can be further decomposed into sub-elements, or enumerated as specific situations according to the legal provisions; for example, state officials can be enumerated as two types of situations, delegated and non-delegated. By modeling these elements, sub-elements, or situations separately, the entire legal provision can be represented.
For example, in the non-delegated situation, the defendant's work unit and position can be extracted from bribery judgments as positive samples and from embezzlement judgments as negative samples. With this sample data established, a small portion is held out as a test set and the model is trained. After testing, the accuracy with which the computer uses the model to determine whether a defendant holding a given position in a given unit is a state official can reach over 98%. The other elements can be modeled in the same way. The computer uses these models to determine whether each of the four elements holds, and on that basis ultimately determines whether the crime of accepting bribes is established, with a reliable probability. With this improvement of decomposition and separate modeling, legal rules are represented more directly by the computer and reflected in the logic of its judgment.
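The element-wise decomposition of Article 385 described above can be sketched as follows. The four element models here are stub functions with invented keyword checks and invented probabilities; in a real system each would be a separately trained classifier like the state-official model just described:

```python
# Stub models for the four elements of Article 385 (accepting bribes).
# Each returns an invented probability that its element is satisfied;
# the keyword checks and numbers are for illustration only.
def is_state_official(facts):        return 0.98 if "official" in facts else 0.02
def used_position(facts):            return 0.95 if "position" in facts else 0.05
def solicited_or_accepted(facts):    return 0.97 if "accepted money" in facts else 0.03
def sought_benefit_for_giver(facts): return 0.90 if "benefit" in facts else 0.10

ELEMENT_MODELS = [is_state_official, used_position,
                  solicited_or_accepted, sought_benefit_for_giver]

def bribery_established(facts, threshold=0.5):
    """The charge holds only if every element holds; treating the element
    judgments as independent, the joint probability is their product."""
    probs = [m(facts) for m in ELEMENT_MODELS]
    joint = 1.0
    for p in probs:
        joint *= p
    return all(p > threshold for p in probs), round(joint, 3)

facts = ("a state official used his position and accepted money, "
         "seeking a benefit for the giver")
print(bribery_established(facts))
```

Because each element is judged by its own model, one can see exactly which element carried or sank the final conclusion, which is the interpretability gain claimed above.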

Modeling common elements can improve the scalability of the model. When legal provisions are broken down into several elements, sub elements, and specific circumstances, some commonly used words and basic legal concepts often appear. For example, taking advantage of one's position not only appears in the crime of bribery, but also in many other charges such as bribery of non-state personnel and illegal operation of similar businesses. The term 'property' appears more than 50 times in the Criminal Law. 'Receiving' and 'demanding' have also appeared multiple times. These frequently used words and phrases are relatively common. The term 'property' has a defined legal meaning and is specifically defined in judicial interpretations, making it a fundamental legal concept. These common phrases and basic concepts are referred to as the common elements of legal provisions in this article. General vocabulary and basic concepts often lack sample data and require the use of multiple methods to be represented by computers. One approach is to establish positive and negative samples by annotating relevant factual descriptions in the judgment. The very abstract and vague element of taking advantage of one's position refers to a wide range of specific situations, which are difficult to express through rules or enumeration methods. Therefore, only by marking the factual description of taking advantage of one's position in the judgment as a positive sample, such as the testimony of a briber about the defendant's assistance to him, and then marking the factual description of non taking advantage of one's position behavior patterns in other charges as a negative sample. Another way is to express it through enumeration or rule description. Words and phrases with limited scope, such as' property ', can be expressed through enumeration. 
A further approach is to search social media and online platforms for language materials related to these common phrases and words to supplement the dataset. This method is suitable for generic words and phrases without special legal meanings. When sample data are limited and the parameters involved are numerous, existing achievements of natural language technology can be used to preprocess the sample data, giving the computer background knowledge of the samples' semantics and improving the model's effectiveness. From the perspective of the systematization of criminal law, although these basic legal concepts may be given different specific provisions in the judicial interpretations of different charges, they should have a unified meaning within the Criminal Law; this is all the more true for common phrases and words. Therefore, models built specifically for the semantics of these common phrases and basic concepts should be universally applicable wherever they appear elsewhere in criminal law, where they can be directly called upon as part of the representation of other legal provisions.
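The enumeration approach and the reuse of common-element models can be sketched as follows. This is a hypothetical illustration: the category list is not an exhaustive statement of any judicial interpretation, and the registry pattern is one possible way to make an element model callable from multiple rule representations.

```python
# Illustrative categories only; the actual legal definition of 'property'
# in judicial interpretations is broader and more precise.
PROPERTY_CATEGORIES = ("money", "goods", "property interest")

def is_property(item: str) -> bool:
    """Enumeration-based model for the common element 'property'."""
    return any(cat in item.lower() for cat in PROPERTY_CATEGORIES)

# A shared registry lets the same element model be called wherever the
# concept appears in other provisions, as the text proposes.
COMMON_ELEMENTS = {"property": is_property}

print(COMMON_ELEMENTS["property"]("goods: one luxury watch"))       # True
print(COMMON_ELEMENTS["property"]("a verbal promise of friendship"))  # False
```

Because the registry is shared, a correction to the 'property' model (say, when a new interpretation is issued) propagates automatically to every rule representation that calls it.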


3.2 Using diagrams to represent the structure of legal rules

After decomposing legal rules into several elements and modeling them separately, the relationships among these elements can be represented with a graph, organically combining them to jointly complete the representation of the legal rule. In this way, a legal rule is represented as a graph composed of several models. This representation can improve the computer's capacity for logical reasoning over legal rules and can represent the deductive process of applying them. A graph can visually represent the relationships among complex entities, and computer science has developed many mature graph-based algorithms, so introducing graphs can greatly improve the computer representation of legal rules. The basic components of a graph are nodes and the edges between them, so it is necessary to determine what the nodes represent and what the relationships between nodes are.

Legal rules share some common structures. When a legal provision contains exactly one complete legal rule, the provision has a standard rule structure. Article 385 of the Criminal Law is such a standard provision, stipulating the complete constituent elements of the crime of bribery. Many specific provisions of the Criminal Law stipulate not only the constituent elements of charges but also the elements of sentencing; such provisions contain at least two complete legal rules, namely a rule for conviction and a rule for sentencing. A standard legal provision is equivalent to one complete legal rule, so it shares the same structure: "applicable conditions + behavioral patterns + outcomes". The applicable conditions and behavioral patterns are also referred to as the normative elements of the legal rule. Normative elements can be combined through logical "or", "and", and composite relationships to point to different outcomes. For example, the four elements of Article 385 must all be established simultaneously to constitute the crime, so their logical relationship is "and". Element three can be established in either of two ways, demanding or accepting, and one of the two suffices, so the logical relationship between them is "or". The various elements, sub-elements, specific circumstances, and outcomes are represented as nodes in the graph; the logical relationships among them are represented as edges between related nodes. Different combinations of elements connect to nodes representing different outcomes. In this way, a standard legal provision, that is, its legal rule, can easily be represented by a graph.
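The and/or structure described above can be sketched as a small evaluable graph. This is an illustrative simplification, not the author's implementation: element names are invented, and the decomposition of Article 385 into these four leaves follows the text's example rather than a full legal analysis.

```python
# A rule graph as nested (operator, children) pairs; leaves are element names
# whose truth values come from the separately trained element models.
article_385 = ("AND", [
    "state_functionary",                                    # element 1
    "taking_advantage_of_position",                         # element 2
    ("OR", ["demanding_property", "accepting_property"]),   # element 3: either way suffices
    "seeking_benefits_for_others",                          # element 4 (illustrative)
])

def evaluate(node, facts):
    """facts maps leaf element names to True/False judgments from element models."""
    if isinstance(node, str):
        return facts[node]
    op, children = node
    results = [evaluate(child, facts) for child in children]
    return all(results) if op == "AND" else any(results)

facts = {"state_functionary": True, "taking_advantage_of_position": True,
         "demanding_property": False, "accepting_property": True,
         "seeking_benefits_for_others": True}
print(evaluate(article_385, facts))  # True: all four elements established
```

Flipping any single "AND" leaf to False makes the whole rule fail, mirroring the text's point that each element must be simultaneously established.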

Many actual legal provisions do not have the standard rule structure, but these non-standard situations still share some common structures, which can be roughly divided into the following categories. (1) Some legal provisions express legal principles rather than legal rules. The two differ clearly: legal rules have clear and specific content and apply to a certain type of behavior in an "all or nothing" manner, while the requirements of legal principles are relatively general and vague, are not applied in an "all or nothing" manner, and have greater coverage and abstraction. The structures of the two kinds of provisions therefore differ significantly. (2) Some legal provisions are subordinate to legal rules and stipulate technical legal content, such as definitions of specialized legal terms, the promulgating authority, and the effective date of the law. These provisions have a structure very different from that of legal rules. (3) Many legal provisions do not correspond one-to-one with legal rules. Often a complete legal rule is expressed across several provisions; the content of a legal rule is distributed across different provisions, even provisions in different normative legal documents; one provision expresses multiple legal rules or their elements; or a provision specifies only certain elements of a legal rule. In these cases, the structure of the provisions departs somewhat from the standard rule structure. (4) Some legal provisions that express a single legal rule belong to the category of non-determinate rules, whose content is delegated or referring rather than clearly determined. Specifically, a determinate rule is a legal rule whose content is clearly fixed, with no need to cite or refer to other rules to ascertain it.
A delegating rule is a legal rule whose content is not yet determined; it gives only general directions, leaving the content to be fixed by the corresponding state organs through the corresponding channels or procedures. A referring rule is one that does not itself specify the behavioral pattern required of people, but must cite or refer to the content of other corresponding provisions. Evidently, provisions expressing non-determinate rules are themselves indeterminate in content and must be combined with other provisions to express the rule fully. The graphs for these non-standard cases must be adjusted relative to those for the standard rule structure, and various methods must be applied comprehensively according to the different structural characteristics, so as to represent accurately and completely the relationships among the elements and concepts obtained by decomposing the legal rules, and to connect them organically to represent the legal rules together.

In summary, the general representation method for legal rules means first decomposing legal provisions, then modeling certain common elements separately, and finally representing the structure of the provisions with graphs. The method uses a graph over several general models and datasets to represent legal rules, which I call a rule graph. It enhances logical reasoning over, and conceptual understanding of, legal provisions; it can better cope with the combinatorial explosion of legal rules and their specific circumstances; and it compensates for the computer's lack of life experience. Compared with current representation methods, this universal method can directly represent the concepts, logic, and structure of legal rules in computer language, and can also represent the deductive process of applying legal rules in specific cases. It can solve many problems of the existing approaches and has pioneering significance for both law and computer science.


4. The universal representation method of legal rules and the representation of the judicial syllogism


The judicial syllogism is the process of deducing from legal rules and connecting them with the facts of a specific case. Current representation methods can hardly represent this process. The general representation method proposed by the author strengthens logical reasoning and conceptual understanding in the representation of legal rules, so it can at least partially represent the process of applying legal rules. This means that computers can at least partially apply legal rules, which demonstrates that computers can deepen their grasp of the meaning of legal rules. The universal representation method in fact gestates a grand project: representing and connecting as many legal rules as possible with graphs and universal models, forming a continuously extensible and compatible database of legal rules. This grand project cannot be accomplished overnight; it must be realized through countless small projects completed separately and then integrated.


4.1 Representing the deductive process of established legal rules

When the elements and structure of a legal rule are represented as a rule graph, the deductive process of that rule can be carried out, as follows. (1) When the facts of a specific case are input into the computer, the computer looks up the prerequisite elements for applying the rule in the rule's graph, then finds the sub-elements and enumerated specific circumstances, and thereby determines the factual elements that need to be obtained from the case facts. For example, through the graph of Article 385 of the Criminal Law, it finds that one element of the crime of bribery is being a state functionary, whose enumerated specific circumstances are that the work unit and position fall within several work situations stipulated by law; the corresponding factual elements are the defendant's work unit and position in the case. (2) Based on the determined factual elements, the computer processes the input case facts and extracts the corresponding factual elements; if the input data do not contain them, the operator is asked to supplement the necessary factual elements, such as the defendant's work unit and position. (3) After obtaining the factual elements, the computer runs the models of the relevant sub-elements and specific circumstances over the corresponding factual elements, obtains output values, and determines whether the sub-elements and specific circumstances are established. For example, after a civil servant of a certain government agency is input, the model computes the judgment value for a certain category of state functionary and gives an estimated accuracy of 99.8%.
(4) After determining whether the sub-elements and specific circumstances are established, the computer judges whether the prerequisite elements are established based on the logical relationships between them and the rule's prerequisite elements, that is, the nature of the edges between these nodes in the graph, and thereby judges whether the rule can be applied. (5) If the rule can be applied, the computer obtains the elements of the behavioral pattern in the rule from the graph, repeats the above steps, determines the factual elements, uses the models to judge whether the sub-elements and specific circumstances are established, uses the graph to judge whether the behavioral-pattern elements are established, and obtains the corresponding outcome, that is, the result of applying the rule.
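The five steps above can be sketched as a small deduction loop. This is a hypothetical illustration under simplifying assumptions: the element "model" is a keyword stand-in for a trained classifier, the confidence value is invented, and only one prerequisite element is shown.

```python
# A stand-in for a trained element model: returns (judgment, estimated confidence).
def element_state_functionary(fact: str):
    return ("bureau" in fact or "government" in fact), 0.99  # illustrative heuristic

RULE = {
    "prerequisites": {"state_functionary": element_state_functionary},
}

def deduce(case_facts: dict):
    judgments = {}
    for name, model in RULE["prerequisites"].items():
        fact = case_facts.get(name)
        if fact is None:
            # step (2): input lacks this factual element; ask the operator to supplement it
            return {"status": "needs_input", "missing": name}
        judgments[name] = model(fact)                        # step (3): element judgment
    applicable = all(j for j, _ in judgments.values())       # step (4): combine via graph logic
    return {"status": "done", "applicable": applicable, "judgments": judgments}

print(deduce({"state_functionary": "deputy chief of the district tax bureau"}))
print(deduce({}))  # missing factual element triggers a request for supplementation
```

Step (5), judging the behavioral-pattern elements, would repeat the same loop over a second set of models once the prerequisites are found to hold.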

Through the above five steps, the computer can represent the deductive process of a legal rule: it connects the input case facts with the legal rule and obtains the result of the rule's application. Since some rules do not distinguish between applicable conditions and behavioral patterns, it is sometimes possible to judge directly, from the establishment of the decomposed elements, sub-elements, and specific circumstances, whether the rule can be applied and with what result. For example, Article 385 of the Criminal Law can be broken down into four elements; if all are met, the provision applies and the crime of bribery is established, while if any element is not met, the provision cannot apply and the result is that the crime of bribery is not established. When a legal provision contains only part of a legal rule, or when the legal rule is a non-determinate rule, the judgment the computer obtains from the graph is likewise conditional and undetermined. In this state, the connections between graphs can automatically trigger the deductive process of the associated graph, and the two can be combined to reach the final judgment on whether the rule applies and with what result.


4.2 Representation of the search process

The steps above realize only the deductive process for established rules, that is, the stage after the potentially applicable legal rules have been identified in the judicial syllogism. When a large number of legal rule graphs have been constructed and interconnected, they can help realize the most important step in the judicial syllogism: finding the law and determining the major premise. Finding the law means identifying, among numerous legal rules, those most relevant to the facts of the case. This process often requires value judgment to determine which legal rules should apply to the case at hand. Although value judgment is very complex, and there is considerable controversy over whether it can and should be represented by computers, a large body of legal rule graphs can supply clear and specific data support for finding the most relevant rules, helping value judgment to be carried out effectively.

How to find the legal rules most relevant to the input case facts through a large number of interrelated rule graphs depends on how those graphs are related. The first and most brute-force, direct search method is to have the computer match the input case facts against every legal rule graph one by one and return the rules whose prerequisite elements are all established, or those with the largest number of established prerequisite elements. Since the weight with which different prerequisite elements influence a rule's application may vary, the computer can be asked to return, say, the ten rules with the most established prerequisite elements. The drawback of this method is obvious: every case must be matched against every rule graph, and each graph involves multiple nodes and layers of judgment, so even a powerful computer may take a long time. But the advantages are equally obvious: it is simple and direct, resistant to error, and misses nothing. The second method is to classify the legal rule graphs, find the characteristic elements of each category, and construct a classification graph of legal rules. For example, rules can be divided into civil, criminal, and administrative legal relationships; the general elements of these major types can be identified and their category graphs constructed, determining the factual elements that distinguish the major types of rules. Data on these factual elements are then obtained from the case facts to determine which type of legal rule the case belongs to, and classification can continue until only one or two legal rules remain in a category.
Faced with such a classification of graphs, the computer repeatedly finds the factual elements required for classification from the case facts and, through model computation, judges the classification until the most relevant legal rules can no longer be subdivided. The advantage of this method is that it is closer to how legal professionals search for the law, saving time and reducing computational cost, because each classification decision eliminates at least half of the remaining legal rules. The number of classification levels is quite limited, at most a few dozen: after all, 2 to the power of 24 exceeds 16 million, far more than the total number of provisions in currently effective legal documents. The disadvantage is equally obvious: each classification step carries some error rate, and classification increases the complexity of the graph database, making it prone to errors and omissions. The third approach seeks a balance between the first two. For example, perform some classification first and then directly match all the rules within the resulting category; or randomly select a legal rule to match, use the result to estimate the distance between the case facts and that rule, exclude some rules accordingly, and select the next rule to match, thereby optimizing the search for the most relevant rules. There are many other ways to combine the first two methods: where classification criteria are clear, classify directly to exclude irrelevant rules; where they are unclear, directly match all possible rules to obtain multiple potentially relevant ones. The third method can draw on many optimization techniques from current computer retrieval technology. Clearly, the third method is the more realistic and feasible approach, and the direction of development.
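The first search method, brute-force matching with top-k ranking by established prerequisites, can be sketched as follows. The rule names and element lists are invented for illustration; a real system would evaluate each prerequisite through its element model rather than read it from a dictionary.

```python
# Score every rule by how many of its prerequisite elements are established
# in the case facts, and return the k highest-scoring rules.
def match_all(rules, facts, k=10):
    scored = []
    for rule_id, prerequisites in rules.items():
        established = sum(1 for p in prerequisites if facts.get(p, False))
        scored.append((established, rule_id))
    scored.sort(reverse=True)
    return [rule_id for _, rule_id in scored[:k]]

rules = {
    "bribery": ["state_functionary", "taking_advantage", "accepting_property"],
    "non_state_bribery": ["company_employee", "taking_advantage", "accepting_property"],
}
facts = {"state_functionary": True, "taking_advantage": True, "accepting_property": True}
print(match_all(rules, facts, k=1))  # ['bribery']
```

The second and third methods would wrap this matcher: a classification graph first narrows `rules` to one category, and the hybrid approach alternates between classification steps and direct matching within the surviving candidates.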

Using a graph database to find the several most relevant legal rules allows the search process to be presented clearly, showing the logical reasoning, the factual elements, the specific data on which relevance is based, the establishment of sub-elements or enumerated circumstances, and the judgment and basis at each step. Moreover, the relevant rules can then be deduced through their graphs and applied to the case facts to obtain the corresponding results. On this basis, when making value judgments and trade-offs among these relevant rules, one first has a clear understanding of their relevance, reliability, and applicable outcomes, which avoids unnecessary conceptual disputes and reduces logical errors. The process of finding the law can therefore be at least partially represented through a graph database of legal rules.


4.3 Methods for automatically constructing large numbers of legal rule graphs

Building graphs for a large number of legal rules can be done by manually processing each provision and constructing each rule's graph, or by attempting more intelligent methods that let the computer automatically identify the elements, sub-elements, and specific circumstances in legal provisions, automatically analyze the relationships among them and the results of rule application, and automatically complete construction of the rule graph. The latter requires further research but is not necessarily impossible, because current knowledge graph construction already has computers automatically perform named entity recognition and relation extraction, that is, learn from existing data so as to automatically identify and determine the nodes and the direction and properties of the edges in a graph from input data. Applied to legal rule graphs, this means enabling computers to automatically identify the normative elements and outcomes of legal rules from legal provisions, and even to determine sub-elements and specific circumstances automatically. On current approaches there are two directions. One is to write into the program, as rules, the linguistic features of legal rules and normative elements that people have summarized and discovered, and let the computer locate and determine the normative elements according to these features. The other is to annotate a large number of normative elements, train a model, and let the computer learn the features of normative elements automatically. Both methods require further research to establish their effectiveness. The difficulty of the former lies in discovering general features of normative elements that can be clearly expressed in computer language.
For example, in the specific provisions of the Criminal Law, commas, semicolons, and periods can be used to preliminarily decompose a provision into several elements and outcomes. The logical relationship between elements or sub-elements joined by a comma is 'or': satisfying any one suffices. Some linking words, such as 'and', 'or', and 'as well as', may reflect the logical relationships among elements. Obviously, these features alone are not sufficient to automatically identify elements and their relationships, and their reliability is unclear and requires further verification. The difficulty of the latter approach is that, given the current state of natural language processing and machine learning, it is hard for computers to master such abstract general features through annotation and automatic learning, and the results may be unsatisfactory, especially since normative elements are constantly changing.
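The punctuation heuristic just described can be sketched in a few lines. This is a hypothetical illustration of the rule-based direction, and, as the text itself notes, this heuristic alone is far from sufficient to identify elements reliably; the sample sentence is paraphrased for illustration.

```python
import re

def decompose(provision: str):
    """Split a provision into clauses at periods/semicolons, then treat
    comma-separated items within a clause as a candidate 'or' group."""
    clauses = [c.strip() for c in re.split(r"[;.]", provision) if c.strip()]
    return [[part.strip() for part in clause.split(",")] for clause in clauses]

text = ("demanding property from another person, or illegally accepting property; "
        "punished in accordance with the relevant provisions.")
for or_group in decompose(text):
    print(or_group)
```

A real pipeline would add the linking-word cues ('and', 'or', 'as well as') and then hand ambiguous cases to a trained model, combining the two directions the text distinguishes.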

In summary, the universal representation method for legal rules can express the judicial syllogism. After a large number of legal rule graphs have been built and associated, when a specific factual description is input, the computer uses graph classification and association logic to search for the most relevant legal rules. Based on the graphs of the relevant rules found, it determines the normative elements, sub-elements, or specific circumstances; determines the factual elements; computes with the models; judges whether the minor premise is established; and then, connecting to the major premise through the graph, judges whether the major premise is established, thereby obtaining the corresponding result.


5. The necessity of adopting a universal representation method for legal rules


The general representation method for legal rules can model the more common elements in legal provisions, construct graphs of common legal structures, strengthen rule-based deduction in computer models, reduce model complexity, increase model generality and interpretability, improve the reliability of computers' automatic judgments on legal questions, and remedy the problems of current computer models.


5.1 The addition of legal elements and structural features to the general representation method can increase the generality and stability of the model

Current computer models are capable of representing legal texts: storing, reading, exchanging, and transmitting them. But in representing the meaning of legal texts, development has been slow, efficiency low, and the scope of application narrow; the results cannot be integrated and lack universality and stability. When people express their various subjective and objective activities through language, they do not improvise a new word for every activity, thing, or event. If they did, then as soon as the object represented by a word changed, the word could no longer be used; it would have no universality or stability and could not serve the purpose of communication. At present, computers develop specialized datasets and train targeted models for specific task scenarios, which is equivalent to improvising a word for a particular activity or event. Such models lack generality and stability.

When people use words to represent things, activities, and events, they begin with the simplest and most concrete, creating corresponding words for particular objects, simple actions, thoughts, and events. These basic words are then combined to represent categories of things, emotions, and events, and further combined, according to grammatical structures, into phrases, sentences, and paragraphs that represent more complex ideas, events, things, and situations. So people generally do not coin specialized words to express complex ideas or events; rather, following grammar rules, basic words are combined into ever-changing phrases, sentences, paragraphs, and articles that represent complex things and viewpoints and respond to the myriad world. In this way, different people can understand the ever-changing textual expressions through the grammar rules, find the corresponding content in the myriad world, and achieve the goal of communication and exchange. A computer that collects data and trains a model to represent a complex task through deep learning is like someone constructing a specialized word to represent a complex thing, and it is hard for such a model to express people's understanding of complex things. Basic words and grammar rules, then, are the key to expressing the myriad world in a unified way.

Beyond relying on basic words and grammar rules, legal provisions express legal rules above all through the structural characteristics of the provisions themselves, the construction rules of the legal system, and the prescribed meanings of legal concepts. Therefore, from the perspective of simulating human linguistic thought and the rules by which text represents things, to enable computers to grasp the meaning of legal provisions, computer language should represent legal provisions the way text does: by constructing computer models that represent basic elements and structural graphs that represent the logical relationships among them. These basic element models and general structural graphs can be combined according to rules to represent legal provisions composed of various combinations of elements, enabling computers to understand provisions through their understanding of basic concept models and rule-based logical reasoning.

With the elements and structural characteristics of legal provisions added, the computer's representation of the meaning of legal provisions is strengthened. After legal provisions are decomposed into basic rule elements, legal concepts, sub-elements, and specific circumstances, models are built for each; these have a certain universality and stability and can be reused in other provisions. After the logical relationships among these elements, and their correspondence to outcomes, are represented in several types of graphs, some general structures of legal provisions can be represented. Computers can then perform a degree of logical reasoning over the graphs to cope with the ever-changing facts of specific cases.


5.2 Universal representation methods can improve the reliability and interpretability of models

Current computer models suffer from poor interpretability and questionable reliability. These problems have become bottlenecks in the development of smart justice and urgently need to be addressed. Common applications today annotate data by task and train models on it. Under this method, the computer mainly works by stacking models and finding mathematical relationships between the annotated data and the classification values. The resulting models are often complex and hard to understand: people cannot grasp from the model the mathematical relationship between input and output data, nor the logical relationships involved. The input is a description of the facts of a specific case, and the output value is the answer to a legal question; the entire derivation is a heap of incomprehensible large models, which amounts to having no derivation at all, like a judge who listens to the parties' account of the case and pronounces judgment without any reasoning or discussion. Strengthening the computer's derivation process, reducing model complexity, and clarifying what each model affects in the result, and by how much, can therefore increase the interpretability and reliability of computer solutions to legal problems.

Incorporating legal elements and structural features into the general representation method can effectively address the incomprehensibility of large models and significantly enhance people's understanding of computer models. First, the computer's derivation process is strengthened. After the legal provisions are decomposed, the sub-elements, specific circumstances, and basic concepts are modeled separately and then connected according to the rules. When a pile of case descriptions is input, the computer determines from the rule graph which factual elements to look for, extracts the data relevant to each factual element from the descriptions, feeds them into each model to compute the judgment on each sub-element or circumstance, and then derives the corresponding result according to the rules. In this way, how the computer gets from a pile of case descriptions to the final legal answer becomes clearer and more specific: one can know which input data were used to decide what; which elements were or were not established, leading to the final result; which sub-elements or circumstances were present or absent; and how these were connected to determine the outcome.

Second, model complexity is greatly reduced, making models easier to understand. After decomposition and separate modeling, complexity drops significantly because each part's task is simpler than the overall task. The input data are extracted from the case descriptions according to the factual elements fixed by that part's task, so data quality is higher and the parameters are fewer and less variable. When a model is not too complex, its judgment logic is easier to understand, and the logic behind correctly and incorrectly judged samples in the test set can be discovered by inspection. For example, after modeling state functionaries in cases not involving delegation, comparing the test-set samples correctly and incorrectly judged as state or non-state functionaries revealed the model's logic: first check whether the input unit is a company or enterprise; if not, judge the person a state functionary; if so, determine from the unit's name and industry whether it is a state-owned enterprise, judging the person a state functionary if it is and a non-state functionary otherwise. This logic is broadly sound because, in China, units that are not companies or enterprises essentially work for the state, and their staff generally hold some public power; many corporate or enterprise units are state-owned enterprises and thus state work, while other, private enterprises are not. Obviously, there is still some distance between this logic and the statutory logic for judging whether a person is a state functionary. Yet it cannot be denied that this logic conforms to the statistical regularities reflected in the judgment data, is simpler and clearer, and better suits the way computers operate.
The gaps between this logic and the judgment logic stipulated by law should be listed as special cases and recorded in the computer. For example, when a non-public civil organization of a non-corporate nature appears, it should be treated as an exception and judged as non-state work. Once model complexity is reduced, the model's logic can be analyzed; it often reflects the statistical patterns of the input data and has a certain legitimacy. The gap between it and the statutory logic can likewise be analyzed, and constructing exceptions can prevent erroneous model judgments. This increases the interpretability and reliability of the model's judgments.
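The exception-recording idea can be sketched as follows. This is an illustrative simplification: the model function is a keyword stand-in for the learned logic just described, and the exception entry mirrors the non-public civil organization example from the text.

```python
# Statutory exceptions recorded alongside the learned model; where the model's
# statistical logic diverges from the law, the exception list overrides it.
EXCEPTIONS = {"non-public civil organization": False}  # not state work despite non-corporate form

def model_is_state_functionary(unit: str) -> bool:
    # Stand-in for the learned logic: non-corporate, non-enterprise units → state work
    return "company" not in unit and "enterprise" not in unit

def judge(unit: str) -> bool:
    for pattern, verdict in EXCEPTIONS.items():
        if pattern in unit:
            return verdict          # exception overrides the model
    return model_is_state_functionary(unit)

print(judge("a non-public civil organization"))  # False: exception applies
print(judge("the municipal finance bureau"))     # True: model's default logic
```

As new divergences between statistical and statutory logic are discovered in the test set, they are added to `EXCEPTIONS`, keeping the base model simple while correcting its known failure modes.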

Finally, each model's contribution to the understanding of the entire legal text is clarified. After decomposition and separate modeling, each model's impact on the computer's understanding of the whole legal text lies mainly in the sub-element, situation, or concept it judges, in the higher-level parts it feeds into, and ultimately in the final result. After a specific factual description is input, the data each model requires, its estimated judgment accuracy, its effect on the normative requirements, and its effect on whether the rule ultimately applies and with what result can all be read off the structural diagram. If the rule model makes an incorrect judgment, the source of the error can be traced along the graph and corrected accordingly. If a particular element model has a problem, it affects only the judgment of that element and does not necessarily cause the rule model to err. The impact of any single model on the computer's understanding of the legal text as a whole is therefore limited and controllable, and its direct and indirect effects are clearer and more specific.
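The error-tracing step described above can be sketched as a walk down the structural diagram: starting from the rule node, descend into whichever child element model disagrees with the expected judgment until the misjudging model is found. The node and element names here are hypothetical; the sketch assumes only the tree-shaped structure the text describes.

```python
# Minimal sketch of tracing an erroneous judgment along the rule
# structure graph. Names are illustrative assumptions.

class Node:
    """One model in the structural diagram: a rule, element, or sub-element."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)  # models feeding into this node
        self.judgment = None            # this node's output for one case

def trace_error(node, expected):
    """Return the name of the deepest model whose judgment is wrong."""
    if node.judgment == expected.get(node.name):
        return None  # this node is correct; no error originates here
    # This node is wrong: check whether some child model caused it.
    for child in node.children:
        source = trace_error(child, expected)
        if source is not None:
            return source
    # No child is wrong, so this node's own model is the error source.
    return node.name
```

Because each element model is a separate node, a fix is applied only at the node `trace_error` returns, leaving the rest of the diagram untouched.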


6. Theoretical Significance of the Universal Representation Method of Legal Rules


The development of a theory of computer representation of legal rules is the foundation for the deep integration of law and computer science, and the fundamental theory for the further development of computational law and a deeper understanding of its applications. Computers can accomplish more complex legal tasks only on the basis of a deeper understanding of legal rules, and people can entrust computers with more important legal tasks only when they know how computers understand legal rules. All of this depends on computers being able to represent the meaning of legal rules more comprehensively, accurately, and systematically than they do now. Current representation methods can accurately represent the text of legal provisions but are relatively weak at expressing their meaning, which hinders further development. The universal representation method proposed by the author can, building on current methods, effectively advance the computer representation of the meaning of legal provisions, and it plays an important role in promoting the continued development of computational law and deepening the understanding of its applications. The method can be applied to the various projects of smart rule of law, not only improving how those specific projects complete their tasks but also allowing the results of the various projects to be effectively integrated at a later stage, completing the construction of a grand legal rule graph database.

A significant theoretical value of this article lies in fundamentally explaining the mechanism by which computers represent legal rules, namely that encodings or word vectors represent the text of legal rules as numbers, and then in explaining at root why this representation suffers from low accuracy and poor interpretability and scalability: these numbers, and the large models built from them, lack stability and universality, contain no rules of logical inference, and are difficult to extend. On this basis, the universal representation method proposed by the author decomposes legal rules to obtain universal elements and builds models of these elements, which are universal and stable. Some general structures are then obtained from the legal provisions; by associating the elements through graphs and rules so as to represent these structures, a rule graph is obtained, and from it a graph database is constructed. These rule graphs and their databases integrate rules of logical inference, are interpretable and extensible, and can respond to the specific meanings of ever-changing legal provisions. This universal representation method therefore fundamentally improves on existing representation methods and can effectively solve the bottleneck restricting the development of smart justice.
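The relation between universal elements, rules, and logical inference described above can be sketched with a toy rule graph: rules are nodes linked to the universal elements whose judgments they require, and applicability follows by inference over those judgments rather than by retraining a model. The rule and element names are hypothetical placeholders, not content drawn from any actual statute.

```python
# Hedged sketch of a "rule graph": legal rules are associated with the
# universal elements they depend on, and whether a rule applies is derived
# by a logical inference rule (here, conjunction) over element judgments.
# Rule and element names are illustrative assumptions.

rule_graph = {
    # rule -> the universal elements whose judgments it requires
    "example_rule": ["state_functionary", "accepts_property", "uses_position"],
}

def rule_applies(rule: str, element_judgments: dict) -> bool:
    """The rule applies iff every required element judgment is True."""
    return all(element_judgments.get(e, False) for e in rule_graph[rule])
```

Because each element model is reused across every rule that references it, extending the graph with a new rule means adding one entry and listing its elements, which is the scalability property the text attributes to the rule graph.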

Beyond addressing the many problems currently faced, the application of the universal representation method will fundamentally affect legal academic discussions surrounding computer applications, because many current theories are premised on the assumption that computer applications suffer from the problems described above. For example, the widely accepted scenario theory in the field of "law + technology" analyzes the new legal problems caused by artificial intelligence scenario by scenario, including new legal relationships, algorithm problems, and new criminal phenomena. One important reason for the great influence of scenario theory is that many current artificial intelligence applications are built on task-specific datasets and trained as task-specific large models. This approach does not derive its various specific applications by deduction and inference from unified laws, and it has no general theoretical basis. The earlier-mentioned hypothesis that the mathematical characteristics of the numbers representing words bear some connection to the meanings of those words is only empirical and lacks theoretical understanding and recognition. Moreover, when text encodings are designed, the codes are chosen arbitrarily without regard to the meaning of the characters, so it is difficult to understand what this connection really is. Lacking an understanding of general connections, the various applications remain disconnected and cannot be linked; each model serves only its own dataset and task, and even slight changes in data or task require retraining, with the corresponding changes impossible to determine through logical reasoning. It is therefore indeed necessary to consider the legal issues specific to each application and to determine the dataset and algorithm to be used, and thus scenario theory emerged.
Similarly, a significant portion of current academic discussion on algorithmic ethics, morality, and legal regulation revolves around the low accuracy and the black-box character of artificial intelligence applications. Sometimes, because an application's accuracy is low and its algorithm is a black box, it is neither reliable nor fair and just, so the scope and scenarios of its use must be limited. Sometimes, because the application value and development prospects are significant while the problems of low accuracy and the algorithmic black box cannot be solved directly, it becomes necessary to study how to design indirect legal mechanisms to regulate, guide, and prevent them.

Since a large part of these theoretical discussions is premised on the characteristics and problems of computer applications produced by current representation methods, and since, as the universal representation method spreads, applications built on it will markedly improve in accuracy, interpretability, and scalability and will grow more numerous by the day, these theoretical discussions will also be greatly affected in such application fields and will need to be adjusted and reshaped according to the new characteristics of these applications. Because the universal representation method decomposes the task of representing legal rules according to legal theory, obtains universal elements, and then solves them one by one, the applications built on it follow the requirements of general legal theory in completing their tasks and exhibit general rules and characteristics. This makes it possible for general theories to conduct general legal analysis and evaluation of these applications, without scenario theory having to discuss them situation by situation. As for theoretical discussions of algorithm regulation, because applications built on the universal representation method have high accuracy and their algorithms' logic and specific operating rules are clear and explicit, it is no longer necessary to ground the need for algorithm regulation, or to construct a series of indirect regulatory mechanisms, on algorithmic unreliability, algorithmic bias, and the algorithmic black box. Instead, the logic and specific rules of these algorithms can be discussed and modified directly on the basis of relevant legal theories, and the legal liability system for errors, violations, crimes, and other situations arising in these applications can be determined accordingly.

It can thus be seen that the universal representation method not only fundamentally affects how scenario theory and theories of algorithm regulation apply, and what they say, in the computer applications built with this method, but also effectively integrates legal theories such as decomposition, interpretation, and judicial syllogism into the application of computers in the legal field. In specific legal applications, the theoretical systems and normative theories of the various departmental laws will also be introduced to guide the decomposition of the representation tasks for specific departmental-law rules and the determination of their general elements and structures.