
A Bayesian Artificial Intelligence Model for Legal Fact-Finding


*Author: Liu Hai


Associate Professor, School of Statistics and Information, Shanghai University of International Business and Economics



Abstract: Discretionary evaluation of evidence faces a triple dilemma: difficulty in clarifying the evidence, implicit subjective judgment, and fuzzy inner conviction, all of which hinder accurate determination of the truth. The Bayesian Artificial Intelligence (AI) model, as a visual, transparent, and quantitative intelligent tool, can assist fact-finders in completing the process of discretionary evaluation, uncovering the black box of that evaluation and promoting the accuracy of fact-finding. Bayesian AI uses Bayesian network diagrams as its knowledge system, visually representing the inferential relationships between evidence and facts. It uses probability numbers as its data system, transparently displaying the fact-finder's subjective judgment of the degree of dependence between evidence and facts. It uses probabilistic algorithms as its inference system, intelligently calculating posterior probabilities and quantifying the strength of inner conviction in inference. However, the Bayesian AI model also faces challenges. First, it requires a large number of probability values as input, but fact-finders' subjective judgments about evidence are difficult to translate accurately into probability numbers, which affects the accuracy of the model's outputs to a certain extent. Second, while the Bayesian AI model can avoid logical fallacies in evidential reasoning and is a tool for making subjectivity explicit, it cannot eliminate subjectivity.




Fact-finding comprises three processes: proof, cross-examination, and evaluation of evidence. It is the basis for applying the law and plays a key role in reaching a fair judicial decision. Evaluation refers to the fact-finder's cognitive process of assessing the evidence: the fact-finder reasons about and weighs the evidence that has been admitted, so as to decide what to believe and determine which claims are established. Evaluating evidence and inferring facts is accomplished primarily through the fact-finder's discretionary evaluation of evidence. However, discretionary evaluation faces a triple dilemma: difficulty in clarifying the evidence, implicit subjective judgment, and fuzzy inner conviction, all of which hinder accurate determination of the truth. First, the intricate relationships between a large body of evidence and the essential facts make it difficult for the fact-finder to clarify the evidence accurately and discover the truth. Second, in evaluating evidence and inferring facts, the fact-finder inevitably draws on subjective elements such as intuition, experience, and individual cognition; these "hidden" subjective factors impede accurate determination of the truth. Third, the degree of inner conviction with which the fact-finder believes the facts of the case to be true is ambiguous. This degree of inner conviction is called inferential strength. Can inferential strength be expressed quantitatively and calculated intelligently? In judicial practice, this is the question that concerns the fact-finder most.


The Bayesian Artificial Intelligence model provides a framework for addressing this issue. Driven by both knowledge and data, it constructs intelligence through four elements: knowledge, data, algorithms, and computing power, placing it within the realm of third-generation AI with enhanced robustness and interpretability. Firstly, it constructs a knowledge system using Bayesian network diagrams. Fact-finders combine case details and, based on causal or evidential relationships, weave the structure between all pieces of evidence and facts to be proven, thereby building a Bayesian network diagram. This network visually represents the structural relationships between a large amount of evidence and the facts. Secondly, it constructs a data system using probability numbers. Fact-finders exercise subjective initiative, using empirical rules, scientific principles, and causal links to assess the strength of dependency between evidence and facts to be proven, thereby creating probability distribution tables. Probability numbers transparently present the fact-finder’s subjective judgments about the evidence. Finally, it constructs an inference system using probabilistic algorithms. These algorithms automate and intelligently conduct probability calculations, automatically updating probability values to obtain the desired posterior probability. The posterior probability quantitatively represents the fact-finder’s degree of inner conviction regarding the truth of the case.


1. Bayesian network diagram of the inferential relationships between evidence and facts


Reasoning with evidence in fact-finding is a form of commonsense reasoning, involving knowledge representation and reasoning. Bayesian Artificial Intelligence uses Bayesian network diagrams to structurally represent knowledge, visualizing the relationships between a large amount of evidence and facts. A Bayesian network diagram is composed of nodes and arcs, as shown in Figure 1. Nodes represent variables; some variables indicate unobservable unknown hypotheses, i.e., facts to be proven, such as “Zhang is guilty (H1).” Other variables represent observable facts, i.e., evidence, such as “blood type DNA was matched (E1),” which can be observed through forensic testing techniques. An arc is a directed edge that points from one node to another, representing a ‘direct’ dependency relationship between variables, which is typically causal but can also represent probabilistic influences or other relational dependencies. Therefore, a Bayesian network diagram is a natural way to represent and communicate the relationships between multiple different hypotheses and evidence. If there is an arc pointing from node A to node B, node A is considered the parent of node B, and B is a child of A. In a Bayesian network diagram, if two nodes are not directly connected by an arc, they exhibit a form of conditional independence. This conditional independence significantly reduces the complexity of probabilistic calculations, which is one of the advantages of the Bayesian network model.



In 2013, Fenton and others pioneered an idiom-based approach by creating a series of reasoning patterns known as “idioms”, which can be easily used to construct a “consensus-based” Bayesian network diagram for legal cases. Idioms are specific fragments of Bayesian network diagrams that represent common types of reasoning used in legal uncertain reasoning. These idioms can be reused, thus accelerating the process of building Bayesian network diagrams and achieving higher-quality diagram structures. This idiom-based approach has been applied to the study of real legal cases, presenting and analyzing the evidence reasoning process of entire cases and assisting fact-finders in determining case facts. These practical case studies have shown that the idiom-based approach can effectively establish consensus-based Bayesian network diagrams, thus proving its practicality.


In legal evidence reasoning, there are six common idioms: evidence idiom, evidence dependence idiom, evidence accuracy idiom, motive idiom, opportunity idiom, and alibi idiom. By using these idioms, fact-finders can quickly create a Bayesian network diagram for the entire case. During the evaluation process, fact-finders, after considering the evidence, factual claims, and arguments presented by both the prosecution and the defense, use a Bayesian network diagram to structurally represent the relationships between a large amount of evidence and the facts to be proven, making their reasoning process transparent and visualized. This approach helps reveal potential reasoning errors that fact-finders might make during their evaluation, thereby enhancing the accuracy of fact-finding and achieving fairness and justice.


For criminal cases, constructing a Bayesian network diagram using idioms involves three steps. To clearly explain these steps, the following will use the evaluation of DNA forensic opinions and witness testimony as examples to illustrate its working principles. Suppose a murder case occurs, and the suspect, Zhang, is arrested. Let’s assume there are only two pieces of evidence: first, the DNA forensic opinion, where the forensic analysis indicates that Zhang’s blood type DNA matches the blood type DNA collected from the crime scene; second, the witness testimony, where witness Li testifies that he saw Zhang near the crime scene at the time of the incident. How, then, can we visually represent the structural relationships between these two pieces of evidence and the ultimate fact to be proven?


First, the fact-finder, considering the details of the case, identifies the ultimate fact to be proven (i.e., the prosecution’s hypothesis), as well as the motive and opportunity for the crime. In the aforementioned case, the ultimate fact to be proven is that “Zhang is guilty,” meaning Zhang committed the murder, represented by node H1 in Figure 1. In the Bayesian network diagram, each node represents a variable, and variables have values or states. Since evidence reasoning is based on limited knowledge, the state of each variable is uncertain, and probability measures the degree of uncertainty when a variable is in a particular state. Based on the type of state, variables are categorized into discrete and continuous variables. Discrete variables are further divided into Boolean variables, integer-valued variables, and multinomial variables. Each node in Figure 1 is a Boolean variable, with two states: true or false. H1 being true represents that “Zhang is guilty,” while false indicates that “Zhang is not guilty.” Whether H1 is “true” or “false” remains uncertain. Node H2 is an opportunity variable, indicating “Zhang was at the crime scene at the time of the incident.” In this case, there is no motive evidence, so there is no motive node.


Second, the fact-finder must identify the evidence already accepted in the case and other hypotheses related to the ultimate fact to be proven. Regarding the ultimate fact H1, there is no direct evidence associated with it in this case, but there is a related hypothesis H3 as its child node and an opportunity node H2 as its parent node. H2 points to H1, expressing the causal relationship: because Zhang was at the crime scene at the time of the incident, it is possible that he committed the murder. Let E1 represent the forensic opinion that “blood type DNA is detected as a match,” meaning that forensic experts found that Zhang’s blood type DNA matches the blood type DNA collected from the crime scene. E2 represents the witness testimony that “Li testified he saw Zhang near the crime scene at the time of the incident.” E1 is indirect evidence for H1, and E2 is indirect evidence for the opportunity node H2. Evidence E1 is connected to H1 through hypothesis H3, which represents the conclusion of the forensic opinion: “blood type DNA matches,” meaning Zhang’s blood type DNA matches that collected from the crime scene. Evidence E2 is connected to H2 through hypothesis H4, which represents the content of the witness testimony: “At the time of the incident, Li saw Zhang near the crime scene.” The inferential chain “H1→H3→E1” expresses the causal relationship: because Zhang committed the murder, he likely left blood at the scene, making it probable that Zhang’s blood type DNA matches the blood type DNA collected from the crime scene, which the forensic experts then identified as a match. The inferential chain “H2→H4→E2” illustrates the causal relationship: because Zhang was at the crime scene at the time of the incident, Li saw Zhang near the crime scene, and thus testified that he saw Zhang near the crime scene.


When analyzing evidence, it is important to distinguish between the evidence itself and the facts asserted by the evidence, which are known as “evidential facts.” A forensic opinion is evidence, while the conclusion of the forensic opinion is the evidential fact. Similarly, witness testimony is evidence, and the content described by the testimony is the evidential fact. Evidential facts are unobservable; they are a form of hypothesis with uncertainty, requiring evidence for proof. However, different pieces of evidence often have varying levels of reliability. Reliability affects the strength of inferences drawn from the evidence. Let node A1 represent “test reliability” and A2 represent “witness reliability”; they respectively influence the inferential strength of evidence E1 and E2, being the parent nodes of E1 and E2.


Finally, the fact-finder, based on the causal or relational connections between nodes, uses various idioms to establish the structural relationships between evidence and hypotheses, and then draws the complete diagram using the Bayesian AI software GeNIe. Figure 1 clearly presents the reasoning path of the fact-finder, visually displaying the structural relationships between evidence and hypotheses.
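
Although the article draws Figure 1 in GeNIe, the same structure can be written down directly as an arc list in code. The following is a minimal sketch, assuming the open-source Python library pgmpy as a stand-in for GeNIe and using the node names from Figure 1:

```python
from pgmpy.models import BayesianNetwork

# Arcs of Figure 1: each (parent, child) pair is one directed edge.
model = BayesianNetwork([
    ('H2', 'H1'),  # opportunity -> guilt (opportunity idiom)
    ('H1', 'H3'),  # guilt -> blood type DNA actually matches
    ('H3', 'E1'),  # actual match -> forensic opinion "match detected"
    ('A1', 'E1'),  # test reliability -> forensic opinion (accuracy idiom)
    ('H2', 'H4'),  # presence at the scene -> Li actually saw Zhang
    ('H4', 'E2'),  # actual sighting -> Li's courtroom testimony (evidence idiom)
    ('A2', 'E2'),  # witness reliability -> testimony (accuracy idiom)
])

print(model.get_parents('E1'))  # the DNA evidence node has parents H3 and A1
```

Because nodes not directly connected by an arc are conditionally independent, this arc list alone already encodes the factorization that later keeps the probability calculations tractable.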


It should be noted that the primary purpose of this article is to explain the operational mechanism of applying Bayesian Artificial Intelligence to fact-finding. For simplicity of the model and based on the specifics of this case, Figure 1 only employs three types of idioms: the evidence idiom (e.g., the structure between H2 and H4), the evidence accuracy idiom (e.g., the structure among A1, H3, and E1), and the opportunity idiom (between H2 and H1). Since the evidence is independent of one another, the evidence dependence idiom is not involved. Moreover, as there is no alibi evidence or motive evidence in this case, the motive idiom and alibi idiom are not included either.


Like any other model, a Bayesian network diagram is also an approximate representation of reality. When deciding the number of nodes and considering whether to draw an arc between two nodes, the fact-finder must strike a necessary balance between realism and efficiency. More nodes and arcs bring the model closer to reality but make the creation and operation of the model more challenging. Although the idiom-based approach provides a systematic method for constructing Bayesian network diagrams, the selection of evidence, hypotheses, and idioms still involves the cognitive and subjective judgment of the fact-finder, introducing a degree of subjectivity. To build a Bayesian network diagram that can reach consensus, the fact-finder can engage in discussions with experts or peers, continually refining and improving the structure of the diagram, ultimately achieving consensus.


2. Probabilistic model of the degree of dependence between evidence and facts


The Bayesian network diagram reveals only the inferential relationships between evidence and hypotheses. But how strong are these inferences? In other words, to what extent can the fact-finder believe in the truth of the ultimate fact to be proven based on all the evidence? To answer this question, one must turn to the data system of Bayesian Artificial Intelligence.


In a Bayesian network diagram, each node corresponds to a probability distribution table. A probability distribution table consists of each state of a node and the corresponding probability values for those states. For nodes without parent nodes, such as H2, A1, and A2, their probability distribution tables are composed of prior probabilities. For nodes with parent nodes, their probability distribution tables consist of conditional prior probabilities. The fact-finder, based on the background information of the case and empirical rules, evaluates the values of prior probabilities and conditional prior probabilities. Prior probabilities transparently represent the fact-finder’s subjective assessment of the likelihood that a hypothesis is in a particular state (true or false), while conditional prior probabilities transparently display the fact-finder’s subjective evaluation of the strength of the dependency relationship between parent and child nodes. Therefore, the probability distribution tables present, in numerical form, the fact-finder’s subjective judgment and belief about the relationship between evidence and facts.


Prior probabilities and conditional prior probabilities together constitute the data system of Bayesian Artificial Intelligence, serving as key “inputs” to the model. Another important “input” is the state of the evidence. Based on the data system, by inputting the state of the evidence and using Bayesian AI software, the posterior probability can be automatically calculated. The posterior probability is a quantitative model of inferential strength; its value reflects the strength of the inference and is the most critical “output” of Bayesian AI. The larger the posterior probability, the stronger the inferential strength, and the greater the degree to which the fact-finder believes that the ultimate fact to be proven is true based on the evidence. Conversely, the lower the posterior probability, the weaker the inferential strength.


2.1 Challenges in probability assessment and solutions


Creating a data system involves determining the corresponding probability distribution table for each node in a Bayesian network diagram, which encompasses the sources and methods of calculating prior probabilities and conditional prior probabilities. In most application domains, probability information can be obtained from various sources. The three most common sources are statistical data, literature, and human experts. In the field of fact-finding, aside from a few scientific pieces of evidence like DNA that have statistical databases, most other evidence lacks data and relevant literature containing probability figures. As a result, probability distribution tables are primarily estimated subjectively by the fact-finder based on their own or others’ expertise and experience. Fact-finders can also engage in discussions with other experts or peers to reach a consensus on probability assessment. Expertise and experience not only aid in evaluating the required probabilities but can also fine-tune probabilities obtained from other sources to suit the specifics of the current case, validating the figures within the Bayesian network. Deriving probability values from human expertise and experience is a popular research topic. Although this process is challenging, it is not unachievable. Some scholars criticize that subjective probabilities “could be any number, and they do not need to be constrained by the quality of evidence in any way.” However, this critique is biased. In reality, the Bayesian network itself sets many strong constraints on the feasible range of reasonable hypothesis probabilities, as it encodes some or all of the causal relationships between variables, making it easier and more accurate than deriving probabilities in isolation. “The subjectivity of probability values is an exaggerated obstacle.” In any case, regardless of the method used to analyze evidence, fact-finders inevitably involve personal subjective judgment. The Bayesian AI method simply makes the inherent subjectivity transparent in a numerical form that is open to testing and challenge.


Using the Bayesian AI model may also face a potential risk: in order to obtain a desired probability outcome, fact-finders might “reverse-engineer” prior probabilities, dressing their reasoning process in a guise of “logic,” misleading others into believing that their reasoning is highly rational. However, this risk is not a flaw of the Bayesian AI model but rather a consequence of its improper use by fact-finders. The Bayesian AI model is a structured tool that ensures the reasoning process follows logic, avoids logical errors, and assists fact-finders in visualizing, transparently presenting, and quantifying their reasoning process. This explicit presentation helps both others and the fact-finders themselves clearly see the reasoning process, facilitating discussion and exchange about whether the reasoning is reasonable, how it can be improved, and whether consensus can be reached. In this sense, the Bayesian AI model can indeed enhance the accuracy of fact-finding. However, it does not determine the fact-finder’s reasoning; the fact-finder remains in control throughout the reasoning process. If the fact-finder’s evaluation of the evidence is inaccurate or even deliberately misinterprets the evidence, and then uses the Bayesian AI model to present this flawed reasoning process, the results will inevitably be inaccurate. The accuracy of the reasoning process itself determines the accuracy of the Bayesian AI model’s outputs; the model merely aids in making the reasoning process explicit.


2.2 Probabilistic assessment of DNA forensic opinion and witness testimony


Based on the methods of probability assessment described above, we have created the eight probability distribution tables shown in Table 1, one for each of the eight nodes in Figure 1. Table 1 contains 34 probability values in total, of which 17 must be assessed and determined; by the axiom of complementary probabilities, the remaining 17 can be derived. "T" and "F" represent the two states, "True" and "False," respectively. These 34 probability values are all the input probabilities that Bayes' theorem requires for the case illustrated in Figure 1. With these inputs, probabilistic reasoning can be conducted to calculate the desired target probability outputs.




The first table in Table 1 is the prior probability distribution table for the opportunity node H2, while the fourth table is the conditional prior probability distribution table for the ultimate fact to be proven, H1. Fenton and colleagues defined concepts such as “crime scene,” “time of the crime,” “extended crime scene,” and “extended time of the crime.” They provided a method for determining the “total number of people n with the opportunity to commit the crime at the crime scene during the time of the crime” and the “total number of people N with the same or similar opportunity to commit the crime as the defendant in the extended crime scene and time.” Based on these two data points, Fenton et al. suggested that the prior probability of “the defendant being at the crime scene” is n/N, and the conditional prior probability that “the defendant is the perpetrator” given that they had the opportunity to commit the crime is 1/n. Thus, the prior probability that “the defendant is the perpetrator” is 1/N. Applying this method to the case in Figure 1, the fact-finder can use data provided by the police or forensic experts to assess the probability distribution tables for H2 and H1.


First, the forensic expert conducts an autopsy of the deceased to estimate the specific time of the crime. The police investigate the crime scene to estimate how many people were present at the crime scene during the time of the crime. This determines the total number of people with the opportunity to commit the crime, n, for example, n=10. Second, since the police cannot be certain that suspect Zhang is one of these individuals, they expand the search to a broader spatial and temporal scope, identifying the extended crime scene and time. The extended scene and time must meet two conditions. First, the total number of people identified in this scope, N, must include the defendant, Zhang. Second, this scope should be the closest to the crime scene and shortest in time while including Zhang. Suppose, in this case, N=100. That is, the police have identified 100 people who had the opportunity to commit the murder within the extended scene and time, and the suspect Zhang is among them. Finally, using the calculation formulas provided by Fenton and colleagues, the probability values in the first and fourth tables can be obtained.


In the first table, P(H2=T)=10/100=0.1, meaning that based on the data above, the prior probability that “Zhang was at the crime scene” is 0.1. According to the axiom of complementary probabilities, P(H2=F)=1-0.1=0.9, meaning that the probability that “Zhang was not at the crime scene” is 0.9.


In the second column of the fourth table, P(H1=T|H2=T)=1/10=0.1, indicating that, given “Zhang was at the crime scene,” the probability that “Zhang is guilty” is 0.1. According to the axiom of complementary probabilities, P(H1=F|H2=T)=1-0.1=0.9. Based on the causal relationship between the variables, the third column shows that, given “H2 is false”—meaning Zhang was not at the crime scene—the probability that “H1 is true” is 0, and the probability that “H1 is false” is 1.
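
In code, this opportunity-prior arithmetic takes only a few lines; the sketch below uses the article's assumed figures n=10 and N=100:

```python
n = 10   # people with the opportunity at the scene during the time of the crime
N = 100  # people with a comparable opportunity in the extended scene and time

p_h2_true = n / N                      # prior P(H2=T), Zhang at the scene: 0.1
p_h1_given_h2 = 1 / n                  # P(H1=T | H2=T), guilty given opportunity: 0.1
p_h1_true = p_h2_true * p_h1_given_h2  # prior P(H1=T) = 1/N: 0.01

print(p_h2_true, p_h1_given_h2, p_h1_true)  # 0.1 0.1 0.01
```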


The second table is the prior probability distribution table for the reliability of the test node A1, indicating that the probability of the forensic opinion being reliable is 0.9, while the probability of it being unreliable is 0.1. The probability of the reliability of the forensic opinion can be estimated using historical frequency data on the accuracy of past tests performed by the forensic expert. If historical data shows that the accuracy rate of the forensic expert is 0.9, the fact-finder should be 90% confident that the forensic opinion is accurate.


The third table is the prior probability distribution table for the witness reliability node A2, indicating that the fact-finder believes the witness is telling the truth with a probability of 70% and believes the witness is lying with a probability of 30%. Based on the witness’s behavior during testimony in court, other character information about the witness, and their own experience, the fact-finder can subjectively assess the probability of the witness’s reliability.


The fifth table is the conditional prior probability distribution table for the blood type DNA match node H3. Based on causal relationships, when Zhang is guilty, it is highly likely that Zhang’s blood type DNA matches the blood type DNA collected from the crime scene, resulting in the probability values of 1 and 0 in the second column. When Zhang is not the perpetrator, the probability that Zhang’s blood type DNA matches the DNA collected from the scene can be estimated using the “random match probability.” If the case occurred in City A, and suspect Zhang is a male resident of City A, the DNA database of all males in City A, including Zhang, can be accessed, containing x DNA samples. The forensic experts would analyze the blood type DNA collected from the crime scene, identify its genetic markers, and calculate the frequency of each genetic marker within the x-sample DNA database. Since the genetic markers at each locus of DNA are independent, multiplying these frequencies yields the random match probability. Due to the specificity of DNA, the random match probability is usually very low. In this case, we assume a random match probability of 0.0003, meaning that under the condition “H1 is false,” the probability that “H3 is true” is 0.0003. According to the axiom of complementary probabilities, P(H3=F|H1=F)=0.9997.
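
To illustrate how a random match probability is assembled, the sketch below multiplies per-locus frequencies; the three frequencies are hypothetical values, chosen only so that their product equals the 0.0003 assumed in this case:

```python
# Hypothetical per-locus frequencies of the crime-scene profile's genetic
# markers in the x-sample database (illustrative numbers, not from the article).
locus_frequencies = [0.10, 0.06, 0.05]

random_match_probability = 1.0
for freq in locus_frequencies:
    random_match_probability *= freq  # independent loci, so frequencies multiply

print(round(random_match_probability, 6))  # 0.0003 = P(H3=T | H1=F)
```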


The sixth table is the conditional prior probability distribution table for node H4. When Zhang was at the crime scene, witness Li is likely to have seen Zhang near the crime scene, resulting in probability values of 1 and 0 in the second column based on causal relationships. Similarly, when Zhang was not at the crime scene, it is impossible for witness Li to have seen Zhang, resulting in probability values of 0 and 1 in the third column.


The seventh table is the conditional prior probability distribution table for evidence node E1. Based on causal relationships, when “A1 is true and H3 is true,” that is, when the test is reliable and the blood type DNA matches, the DNA is certain to be detected as matching, so the probability of “E1 being true” is 1, and “E1 being false” is 0. Similarly, the probability values of 0 and 1 in the third column are derived. When “A1 is false,” that is, when the test is unreliable, forensic experts can make two types of errors: false negatives and false positives. A false negative means that DNA that is actually a match is identified as not matching, and a false positive means that DNA that is actually not a match is identified as matching. These error probabilities can be estimated using historical frequency data. In this case, the false negative probability is assumed to be 0.02, and the false positive probability is 0.03. Using the axiom of complementary probabilities, the probability values in the fourth and fifth columns are obtained.
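
To show what one of these tables looks like outside the GeNIe interface, here is a sketch of the seventh table encoded with pgmpy's TabularCPD (an assumption; the article builds the same table in GeNIe). Columns run over the parent combinations (A1, H3) = (T, T), (T, F), (F, T), (F, F):

```python
from pgmpy.factors.discrete import TabularCPD

# P(E1 | A1, H3): a reliable test (A1=T) reports the true match status; an
# unreliable test (A1=F) has a 0.02 false-negative and 0.03 false-positive rate.
cpd_e1 = TabularCPD(
    variable='E1', variable_card=2,
    values=[[1.0, 0.0, 0.98, 0.03],   # row E1=T
            [0.0, 1.0, 0.02, 0.97]],  # row E1=F (complementary probabilities)
    evidence=['A1', 'H3'], evidence_card=[2, 2],
    state_names={'E1': ['T', 'F'], 'A1': ['T', 'F'], 'H3': ['T', 'F']})

print(cpd_e1)
```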


The eighth table is the conditional prior probability distribution table for evidence node E2. Based on causal relationships, when "A2 is true and H4 is true," that is, when witness Li is reliable and at the time of the incident saw Zhang near the crime scene, Li will certainly testify in court that he saw Zhang, so the probability of "E2 being true" is 1, and "E2 being false" is 0. Similarly, the probability values of 0 and 1 in the third column can be derived. When "A2 is false," that is, when witness Li is unreliable, it is difficult to assess whether the witness is telling the truth. In the absence of any information, and based on the principle of maximum entropy, this case assumes the two states are "equally likely," resulting in probability values of 0.5 in the fourth and fifth columns.

Thus, it is evident that the 34 probability values in Table 1 are derived based on open and transparent data or assumptions, transparently reflecting the fact-finder’s subjective judgment of the case, which gives Bayesian Artificial Intelligence its “interpretability.” This interpretability helps to expose potential errors in the fact-finding process, facilitating the correction of viewpoints through review, exchange, and discussion, ultimately leading to consensus, improving the accuracy of fact-finding, and laying the foundation for further transformation of knowledge and experience into calculations. Additionally, creating probability distribution tables involves only an understanding of concepts such as prior probabilities and conditional prior probabilities, translating subjective judgments like knowledge and experience into probability values without requiring complex probability calculations. Therefore, even if a fact-finder is not particularly skilled in probabilistic reasoning, they can still construct probability distribution tables effectively.


3. Intelligent calculation of inferential strength between evidence and facts


Beyond the clear display of the dependency relationships and degrees between numerous pieces of evidence and hypotheses through Bayesian network diagrams and probability distribution tables, the greatest advantage of Bayesian Artificial Intelligence lies in the Bayesian network reasoning performed based on them. Probabilistic algorithms automate and intelligently conduct the reasoning process. Bayesian AI provides a rigorous systematic approach to propagating evidence and calculating the inferential strength of evidence on facts, assisting fact-finders in forming accurate and detailed opinions, thereby guiding what to believe and what to determine. “Appropriate use of Bayesian network reasoning allows for meaningful assessment and communication of evidence relevance, which can significantly enhance the efficiency, transparency, and fairness of the criminal justice system, as well as the accuracy of decisions.”


According to the direction of reasoning, Bayesian network reasoning is divided into two types. The first is reasoning from cause to effect, which follows the direction of the arcs in the network diagram and is also known as predictive reasoning. The second is diagnostic reasoning, which moves from results, phenomena, or evidence back to causes, with this type of reasoning running counter to the direction of the arcs. For example, in Figure 1, when we observe that the state of evidence E1 is true, our belief that H1 is true increases, meaning the posterior probability P(H1=T|E1=T) will be greater than the prior probability P(H1=T). In the process of fact-finding, people often use diagnostic reasoning, adjusting their belief in a hypothesis upon observing evidence. Typically, lawyers, jurors, or judges start with some prior assumptions about the ultimate fact to be proven, such as the presumption of innocence, which assumes that “the defendant is no more likely to be guilty than any other capable person.” Then, they update their belief in the ultimate fact based on the evidence obtained. This process aligns closely with Bayes’ theorem, so in this sense, the application of Bayesian AI methods to fact-finding is natural.
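
A minimal hand check shows this diagnostic update with Bayes' theorem alone, using only values from Table 1 (the likelihoods are obtained by summing out the reliability node A1 and the match node H3):

```python
# Diagnostic reasoning by hand: how much does observing E1=T
# ("DNA detected as a match") raise the belief that H1=T ("Zhang is guilty")?
p_h1 = 0.01                                # prior P(H1=T) = 1/N
p_e1_if_match = 0.9 * 1.0 + 0.1 * 0.98     # P(E1=T | H3=T) = 0.998
p_e1_if_no_match = 0.9 * 0.0 + 0.1 * 0.03  # P(E1=T | H3=F) = 0.003

# Guilt guarantees a true match; innocence leaves only the 0.0003 random match chance.
p_e1_given_h1 = p_e1_if_match                                     # = 0.998
p_e1_given_not_h1 = 0.0003 * p_e1_if_match + 0.9997 * p_e1_if_no_match

posterior = p_e1_given_h1 * p_h1 / (
    p_e1_given_h1 * p_h1 + p_e1_given_not_h1 * (1 - p_h1))
print(posterior)  # ~0.7535: the prior 1% belief in guilt rises to about 75.35%
```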


Bayesian network reasoning involves a large amount of probability calculations, requiring a high level of mathematical skill from users. Even with a Bayesian network diagram containing only a few nodes, the calculations needed for accurate probability reasoning can be overly complex. This complexity is challenging even for users with strong mathematical skills, let alone for fact-finders who may be less familiar with mathematics. The complexity of probability calculations was one of the main reasons why Bayesian network models were rarely applied in legal practice in their early years. However, in the late 1980s, breakthroughs in algorithms made it possible to automatically perform effective probabilistic reasoning for most Bayesian networks. These algorithms were subsequently integrated into widely accessible software tools, such as GeNIe, enabling users to create and run Bayesian network models without extensive mathematical or statistical knowledge. The probability calculations are entirely automated by the software tools, eliminating the need for manual operations. This has enabled the automation and intelligence of probabilistic reasoning.


Firstly, Bayesian network reasoning allows for logical and scientific propagation and updating of inferential strength during continuous inference. Based on the data system shown in Table 1, running the Bayesian network diagram in Figure 1 using GeNIe software yields partial results as shown in Table 2. Table 2 lists the effects of the DNA forensic opinion E1 and witness testimony E2 on the two most critical hypotheses: “Zhang was at the crime scene at the time of the incident (H2)” and “Zhang is guilty (H1).” The second row of Table 2 shows the prior probabilities P(H2=T) = 10% and P(H1=T) = 1%. This means that, without considering evidence E1 and E2, based on background knowledge, the fact-finder is 10% confident that Zhang was at the crime scene at the time of the incident and 1% confident that Zhang is guilty. The third row shows the posterior probabilities P(H2=T|E1=T) = 77.5875% and P(H1=T|E1=T) = 75.3463%. This indicates that when considering only evidence E1, the fact-finder’s confidence in “H2 being true” increases from the original 10% to 77.5875%, and their confidence in “H1 being true” increases from 1% to 75.3463%. The fourth row shows the posterior probabilities P(H2=T|E2=T) = 38.6364% and P(H1=T|E2=T) = 3.8636%. This indicates that when considering only evidence E2, the fact-finder’s confidence in “H2 being true” rises from the original 10% to 38.6364%, and their confidence in “H1 being true” rises from 1% to 3.8636%. Thus, for hypotheses H2 and H1, the inferential strength of evidence E1 is greater than that of evidence E2. The fifth row shows the posterior probabilities P(H2=T|E1=T, E2=T) = 95.1496% and P(H1=T|E1=T, E2=T) = 92.4010%. This demonstrates that when considering both mutually reinforcing pieces of evidence, E1 and E2, together, their combined effect results in a stronger inference than considering each piece of evidence separately. The changes in the posterior probabilities clearly reflect the effect of evidence reinforcement. When considering both E1 and E2, the fact-finder is 95.1496% confident that Zhang was at the crime scene at the time of the incident and 92.4010% confident that Zhang is guilty, indicating a very high degree of confidence.
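
Putting the pieces together, the following sketch (again assuming pgmpy in place of GeNIe) encodes Figure 1 and all of Table 1, then queries the posterior probabilities for each evidence scenario reported in Table 2, including the conflicting-evidence rows discussed below:

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

TF = ['T', 'F']

# Figure 1: arcs point from parent to child.
model = BayesianNetwork([('H2', 'H1'), ('H1', 'H3'), ('H3', 'E1'), ('A1', 'E1'),
                         ('H2', 'H4'), ('H4', 'E2'), ('A2', 'E2')])

model.add_cpds(
    # Priors: opportunity (n/N), test reliability, witness reliability.
    TabularCPD('H2', 2, [[0.1], [0.9]], state_names={'H2': TF}),
    TabularCPD('A1', 2, [[0.9], [0.1]], state_names={'A1': TF}),
    TabularCPD('A2', 2, [[0.7], [0.3]], state_names={'A2': TF}),
    # P(H1 | H2): guilty with probability 1/n if at the scene, impossible otherwise.
    TabularCPD('H1', 2, [[0.1, 0.0], [0.9, 1.0]], evidence=['H2'],
               evidence_card=[2], state_names={'H1': TF, 'H2': TF}),
    # P(H3 | H1): certain match if guilty, random match probability 0.0003 if not.
    TabularCPD('H3', 2, [[1.0, 0.0003], [0.0, 0.9997]], evidence=['H1'],
               evidence_card=[2], state_names={'H3': TF, 'H1': TF}),
    # P(H4 | H2): Li saw Zhang if and only if Zhang was at the scene.
    TabularCPD('H4', 2, [[1.0, 0.0], [0.0, 1.0]], evidence=['H2'],
               evidence_card=[2], state_names={'H4': TF, 'H2': TF}),
    # P(E1 | A1, H3): reliable tests report truly; unreliable ones err with
    # false-negative rate 0.02 and false-positive rate 0.03.
    TabularCPD('E1', 2, [[1.0, 0.0, 0.98, 0.03], [0.0, 1.0, 0.02, 0.97]],
               evidence=['A1', 'H3'], evidence_card=[2, 2],
               state_names={'E1': TF, 'A1': TF, 'H3': TF}),
    # P(E2 | A2, H4): reliable witnesses testify truly; unreliable ones 50/50.
    TabularCPD('E2', 2, [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5]],
               evidence=['A2', 'H4'], evidence_card=[2, 2],
               state_names={'E2': TF, 'A2': TF, 'H4': TF}))
assert model.check_model()

infer = VariableElimination(model)
scenarios = [{}, {'E1': 'T'}, {'E2': 'T'}, {'E1': 'T', 'E2': 'T'},
             {'E1': 'F', 'E2': 'T'}, {'E1': 'T', 'E2': 'F'}]
for ev in scenarios:  # one query per row of Table 2
    posteriors = [infer.query([h], evidence=ev or None,
                              show_progress=False).values[0]  # index 0 is state 'T'
                  for h in ('H2', 'H1')]
    print(ev or 'prior', [f'{p:.4%}' for p in posteriors])
# Expected to match Table 2: e.g. prior -> 10%, 1%;
# {'E1':'T','E2':'T'} -> 95.1496%, 92.4010%; {'E1':'F','E2':'T'} -> 36.1754%, 0.0081%.
```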




Secondly, Bayesian network reasoning can not only handle reinforcing evidence but also analyze conflicting evidence in a unified manner. Changes in posterior probability values transparently demonstrate the effects of evidence reinforcement and conflict. Row 6 of Table 2 shows the posterior probabilities P(H2=T|E1=F, E2=T) = 36.1754% and P(H1=T|E1=F, E2=T) = 0.0081%. Here, “E1=F” indicates that the observed evidence is contrary to E1, meaning “the blood type DNA is detected as not matching,” and “E2=T” refers to evidence E2 itself, which is “Li testifies that he saw Zhang near the crime scene at the time of the incident.” These two pieces of evidence are in conflict. However, Bayesian network reasoning can comprehensively analyze the inferential strength of these conflicting pieces of evidence. When considering “E1=F, E2=T” together, the probability of “H2 being true” increases from the initial 10% to 36.1754%, while the probability of “H1 being true” decreases from the initial 1% to 0.0081%. In contrast, Row 7 shows P(H2=T|E1=T, E2=F) = 37.9231% and P(H1=T|E1=T, E2=F) = 36.8276%. This indicates that when considering both “the blood type DNA is detected as a match” and “Li testifies that he did not see Zhang near the crime scene at the time of the incident,” the probability of “H2 being true” increases from 10% to 37.9231%, and the probability of “H1 being true” increases from 1% to 36.8276%.


The clear data-driven conclusions in Table 2 represent the greatest advantage of the Bayesian AI method. Based on prior assumptions and observed evidence, the Bayesian network model can calculate precise posterior probabilities, providing crucial references for judges or jurors in making factual determinations. It is important to note that the data conclusions here are fundamentally different from the data requirements in the “legal evidence system” model. The legal evidence system originated in medieval Europe, where laws pre-defined the probative value of various types of evidence and rules for evaluating evidence, such as the testimony of a cleric being superior to that of a layperson, or two typical witness testimonies constituting a complete piece of evidence. Fact-finders in this system mechanically calculate the quantity of evidence, its probative value, and the strength of inferences based on pre-established rules, thereby determining the facts of a case. The legal evidence system has been replaced by the currently prevalent free evaluation of evidence system. Within the framework of Bayesian AI, fact-finders freely evaluate evidence and facts to be proven. The Bayesian network diagram and probability distribution tables represent a fact-finder’s subjective evaluations in a visualized, quantitative, transparent, and open manner. These results are not bound by legal rules but are influenced by the fact-finder’s own knowledge, experience, and logical principles. Therefore, the posterior probabilities automatically calculated by Bayesian AI software using probabilistic logic and algorithms based on these results are not legally binding. The effectiveness of the Bayesian AI model lies in “assisting” discretionary judgment, “uncovering” the black box of the reasoning process, rather than “replacing” the reasoning process.


4. Conclusion and Outlook


This article elaborates on the operational principles of applying Bayesian Artificial Intelligence to the evidence evaluation phase of fact-finding. Although the explanation uses a criminal case as an example, the approach is equally applicable to civil cases, as the logic of evidence reasoning is consistent across cases, differing only in the standard of proof. First, constructing the knowledge system: After all evidence has been presented, cross-examined, and accepted, the fact-finder uses the idiom-based method to draw a Bayesian network diagram in Bayesian AI software, such as "GeNIe," based on the causal or correlational relationships between evidence and facts to be proven. Second, creating the data system: Based on the three sources of probability information (statistical data, literature, or expert knowledge and experience), the fact-finder assesses the strength of the dependency relationship between evidence and the facts to be proven, assigning a probability distribution table to each node. Finally, inputting the evidence state and allowing the software to reason automatically: The "GeNIe" software comes equipped with various probabilistic reasoning algorithms, enabling it to automatically calculate the posterior probability of the facts to be proven after inputting the evidence state. The posterior probability serves as a quantitative model of inferential strength, providing an answer to the degree to which a fact-finder can believe in the truth of their claim. In judicial practice, this answer is what the fact-finder seeks most and is the most crucial reference for making a factual determination.


Although Bayesian Artificial Intelligence is currently widely applied primarily in the field of forensic science, with forensic experts increasingly adopting Bayesian probability theory as a theoretical framework and using software tools to design Bayesian networks for evidence analysis, we believe that the legal community will eventually embrace this approach. Over time, it will break down disciplinary and cultural barriers, accepting the Bayesian analytical framework and applying it across more branches of law. There are five main reasons for this belief: First, Bayesian AI can not only process big data using machine learning methods, but it can also integrate “smart data,” such as expert knowledge and judgments. When dealing with risk-based decisions like court trials, where data may be sparse or unavailable, Bayesian AI can combine human knowledge and experience to provide viable analytical models. Second, Bayesian network reasoning enables the logical and scientific propagation and updating of inferential strength through continuous inference. It can handle not only reinforcing evidence but also analyze conflicting evidence in a unified manner. Third, the use of software tools addresses the challenge of “weak mathematical calculation skills” faced by fact-finders, facilitating the application of Bayesian AI methods in practical fact-finding. Even fact-finders who are not particularly skilled in probabilistic reasoning can effectively use software tools to construct Bayesian network models after simple training. Fourth, the Bayesian AI model helps to open the black box of discretionary judgment, presenting the subjective judgments and reasoning process of fact-finders in a visualized, transparent, and quantitative manner, thus enhancing the accuracy of fact-finding. Fifth, Bayesian AI is a probabilistic reasoning expert system that emphasizes “assisting humans” and “enhancing humans,” rather than “replacing humans.” All judicial decisions are still made by humans, maintaining the central role of the fact-finder. The application of Bayesian AI transforms the adjudication process into a semi-automated human-machine collaboration, improving judicial efficiency and alleviating the problem of limited judicial resources amid high case volumes. It injects new momentum into the modernization of the judicial system and its capacity.








The original article was published in the Journal of Shandong University (Philosophy and Social Sciences), Issue 3, 2024, and is reposted from the WeChat official account "Journal of Shandong University (Philosophy and Social Sciences Edition)."




Assistant Editor: Yang Shuhui

Responsible Editor: Tan Jun

Reviewer: Ji Weidong