Explainable Automated Reasoning in Law using Probabilistic Epistemic Argumentation
Inga Ibs, Nico Potyka
Technical University of Darmstadt, University of
[email protected], [email protected]
Abstract
Applying automated reasoning tools for decision support and analysis in law has the potential to make court decisions more transparent and objective. Since there is often uncertainty about the accuracy and relevance of evidence, non-classical reasoning approaches are required. Here, we investigate probabilistic epistemic argumentation as a tool for automated reasoning about legal cases. We introduce a general scheme to model legal cases as probabilistic epistemic argumentation problems, explain how evidence can be modeled and sketch how explanations for legal decisions can be generated automatically. Our framework is easily interpretable, can deal with cyclic structures and imprecise probabilities and guarantees polynomial-time probabilistic reasoning in the worst case.
1 Introduction

Legal reasoning problems can be addressed from different perspectives. From a lawyer's perspective, a trial may be best modeled as a strategic game. In a criminal trial, for example, the prosecutor may try to convince the judge or jury of the defendant's guilt while the defense attorney tries the opposite. The problem is then to interpret the law and the evidence in a way that maximizes the agent's utility. From this perspective, a legal reasoning problem is best modeled using tools from decision and game theory (Hanson, Hanson, and Hart 2014; Prakken and Sartor 1996; Riveret et al. 2007).

Our focus here is not on strategic considerations, but on the decision process that leads to the final verdict in a legal process like a trial. Given different pieces of evidence and beliefs about their authenticity and relevance, how can we merge them to make a plausible and transparent decision? Different automated reasoning tools have been applied in order to answer similar questions, for example, case-based reasoning (Bench-Capon and Sartor 2003; McCarty 1995), argumentation frameworks (Dung and Thang 2010; Prakken et al. 2013) or Bayesian networks (Fenton, Neil, and Lagnado 2013). Since lawyers and judges often struggle with the interpretation of Bayesian networks, recent work also tries to explain Bayesian networks by argumentation tools (Vlek et al. 2016).

Here, we investigate the applicability of the probabilistic epistemic argumentation framework developed in (Hunter 2013; Hunter, Polberg, and Thimm 2018; Hunter and Thimm 2016; Thimm 2012). As opposed to classical argumentation approaches, this framework allows expressing uncertainty by means of probability theory. In particular, we can compute reasoning results in polynomial time when we restrict the language (Potyka 2019). As it turns out, the resulting fragment is sufficiently expressive for our purpose, so that our framework is computationally more efficient than many other probabilistic reasoning approaches that suffer from exponential runtime in the worst case. At the same time, the graphical structure is easily interpretable and allows us to generate explanations for the final degrees of belief (probabilities) automatically, as we will explain later.

While we can incorporate objective probabilities in our framework, our probabilistic reasoning is best described as subjective in the sense that we basically merge beliefs about pieces of evidence and hypotheses (probabilities that can be either objective or subjective). In order to derive the beliefs about pieces of evidence from objective evidence and statistical information, another approach like Bayesian networks or more general tools from probability theory may be better suited. Our framework can then be applied on top of these tools. In this sense, our framework can be seen as a complement rather than a replacement of alternative approaches.

The remainder of this paper is structured as follows: Section 2 explains the necessary basics. We introduce a basic legal argumentation framework in Section 3 and discuss more sophisticated building blocks in Section 4. We discuss and illustrate the explainability capabilities of our approach as we proceed, but explain some more general ideas in Section 5. Finally, we add some discussion about related work, the pros and cons of our framework and future work in Sections 6 and 7.
2 Probabilistic Epistemic Argumentation

Our legal reasoning approach builds on the probabilistic epistemic argumentation approach developed in (Thimm 2012; Hunter 2013; Hunter and Thimm 2016; Hunter, Polberg, and Thimm 2018). In this approach, we assign degrees of belief in the form of probabilities to arguments using probability functions over possible worlds. A possible world basically interprets every argument as either accepted or rejected. In order to restrict to probability functions that respect prior beliefs and the structure of the argumentation graph, different constraints can be defined. Afterwards, we can assign a probability interval to every argument based on these constraints. We will restrict to a fragment of the constraint language here that allows polynomial-time computations (Potyka 2019).

Formally, we represent arguments and their relationships in a directed edge-weighted graph (A, E, w). A is a finite set of arguments, E ⊆ A × A is a finite set of directed edges between the arguments and w : E → Q assigns a rational number to every edge. If there is an edge (A, B) ∈ E, we say that A attacks B if w((A, B)) < 0 and A supports B if w((A, B)) > 0. We let Att(A) = {B ∈ A | (B, A) ∈ E, w((B, A)) < 0} be the set of attackers of an argument A and Sup(A) = {B ∈ A | (B, A) ∈ E, w((B, A)) > 0} be the set of supporters.

A possible world is a subset of arguments ω ⊆ A. Intuitively, ω contains the arguments that are accepted in a particular state of the world. Beliefs about the true state of the world are modeled by rational-valued probability functions P : 2^A → [0, 1] ∩ Q such that Σ_{ω ⊆ A} P(ω) = 1. The restriction to probabilities from the rational numbers is for computational reasons only. In practice, it does not really mean any loss of generality because implementations usually use finite-precision arithmetic. We denote the set of all probability functions over A by P_A. The probability of an argument A ∈ A under P is defined by adding the probabilities of all worlds in which A is accepted, that is, P(A) = Σ_{ω ⊆ A, A ∈ ω} P(ω). P(A) can be understood as a degree of belief, where P(A) = 1 means complete acceptance and P(A) = 0 means complete rejection.

The meaning of attack and support relationships can be defined by means of constraints in probabilistic epistemic argumentation. For example, the Coherence postulate in (Hunter and Thimm 2016) intuitively demands that the belief in an argument is bounded from above by one minus the belief in each of its attackers. Formally, a probability function P respects Coherence iff P(A) ≤ 1 − P(B) for all B ∈ Att(A). A more general constraint language has recently been introduced in (Hunter, Polberg, and Thimm 2018). Here, we will restrict to a fragment of this language that allows solving our reasoning problems in polynomial time (Potyka 2019). A linear atomic constraint is an expression of the form

    c_0 + c_1 · π(A_1) + ... + c_n · π(A_n) ≤ d_0 + d_1 · π(B_1) + ... + d_m · π(B_m),

where A_i, B_j ∈ A, c_i, d_j ∈ Q, n, m ≥ 0 (the sums can be empty) and π is a syntactic symbol that can be read as 'the probability of'. For example, the Coherence condition above can be expressed by a linear atomic constraint with n = m = 1, c_0 = 0, c_1 = 1, A_1 = A, d_0 = 1, d_1 = −1 and B_1 = B. However, we can also define more complex constraints that take the beliefs of more than just two arguments into account. Usually, the arguments that occur in a constraint are neighbors in the graph and the coefficients c_i, d_j will often be based on the weights of the edges between the arguments. We will see many examples later.

A probability function P satisfies a linear atomic constraint iff c_0 + Σ_{i=1}^{n} c_i · P(A_i) ≤ d_0 + Σ_{j=1}^{m} d_j · P(B_j). P satisfies a set of linear atomic constraints C, denoted as P ⊨ C, iff it satisfies all constraints c ∈ C. If such a P exists, we call C satisfiable.

We are interested in two reasoning problems here that have been introduced in (Hunter and Thimm 2016). First, the satisfiability problem is, given a graph (A, E, w) and a set of constraints C over this graph, to decide whether the constraints are satisfiable. This basically allows us to check that our modelling assumptions are consistent. Second, the entailment problem is, given a graph (A, E, w), a set of satisfiable constraints C and an argument A, to compute lower and upper bounds on the probability of A based on the probability functions that satisfy the constraints.
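Because any assignment of values in [0, 1] to the arguments can be realized as the marginals of some probability function over possible worlds (for instance, by treating the arguments as independent), linear atomic constraints can be checked and entailment bounds can be computed by a small linear program over the argument probabilities alone, which is in the spirit of the polynomial-time fragment (Potyka 2019). The following minimal sketch is our own illustration (not code from the paper) and assumes SciPy's LP solver; it already uses the three-argument graph of the example discussed next.

```python
# Minimal sketch: satisfiability and entailment for linear atomic constraints
# reduced to a linear program over the argument probabilities.
# Graph: A supports B (weight 1), B attacks C (weight -1), prior 0.6 <= pi(C).
from scipy.optimize import linprog

args = ["A", "B", "C"]                  # fixes the variable order
idx = {a: i for i, a in enumerate(args)}

# Rows of "lhs . p <= rhs" over the vector p = (pi(A), pi(B), pi(C)).
A_ub = [
    [1, -1, 0],    # support (A, B): 1 * pi(A) <= pi(B)
    [0, 1, 1],     # attack (B, C):  pi(C) <= 1 + (-1) * pi(B)
    [0, 0, -1],    # prior belief:   0.6 <= pi(C)
]
b_ub = [0, 1, -0.6]
bounds = [(0, 1)] * len(args)           # probabilities live in [0, 1]

def belief_interval(argument):
    """Lower and upper bound on the probability of an argument, or None if unsatisfiable."""
    c = [0.0] * len(args)
    c[idx[argument]] = 1.0
    lower = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)                    # minimize pi(arg)
    upper = linprog([-x for x in c], A_ub=A_ub, b_ub=b_ub, bounds=bounds)      # maximize pi(arg)
    if not (lower.success and upper.success):                                  # infeasible constraints
        return None
    return lower.fun, -upper.fun

for a in args:
    print(a, belief_interval(a))   # A: (0, 0.4), B: (0, 0.4), C: (0.6, 1.0)
```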
For example, suppose we have A = {A, B, C}, E = {(A, B), (B, C)}, w((A, B)) = 1 and w((B, C)) = −1. We encode the meaning of the support relationship (A, B) by w((A, B)) · π(A) ≤ π(B) (a supporter bounds the belief in the argument from below) and the meaning of the attack relationship (B, C) by π(C) ≤ 1 + w((B, C)) · π(B) (an attacker bounds the belief in the argument from above). Say, we also tend to accept C and model this by the constraint 0.6 ≤ π(C). Then our constraints are satisfiable and the entailment results are P(A) ∈ [0, 0.4], P(B) ∈ [0, 0.4], P(C) ∈ [0.6, 1]. To understand the reasoning, let us consider the upper bound for A. If we had P(A) > 0.4, we would also have P(B) > 0.4 because of the support constraint. But then, we would have P(C) < 0.6 because of the attack constraint. However, this would violate our constraint for C. Hence, we must have P(A) ≤ 0.4. In particular, if we added the constraint 1 ≤ π(A) (accept A), our constraints would become unsatisfiable.

Both the satisfiability and the entailment problem can be solved automatically by linear programming techniques. In general, the linear programs can become exponentially large. However, both problems can be solved in polynomial time when we restrict to linear atomic constraints (Potyka 2019).

3 A Basic Legal Argumentation Framework

Legal reasoning problems can occur in many forms and an attempt to capture all of them at once would most probably result in a framework that is hardly more concrete than a general abstract argumentation framework. We will therefore focus on a particular scenario, where the innocence of a defendant has to be decided. Modeling a single case may not be sufficient to illustrate the general applicability of probabilistic epistemic argumentation. We will therefore try to define a reasoning framework that can be instantiated for different cases, while still being easily comprehensible. As with every formal model, there are some simplifying assumptions about the nature of a trial. However, we think that our framework is sufficient to illustrate how real cases can be modeled and structured by means of probabilistic epistemic argumentation. We will make some additional comments about this as we proceed.

Following (Fenton, Neil, and Lagnado 2013), we regard a legal case roughly as a collection of hypotheses and pieces of evidence that support the hypotheses. We model both as abstract arguments, that is, as something that can be accepted or rejected to a certain degree by a legal decision maker like a judge, the jury or a lawyer.

[Figure 1: Meta-Graph for our Legal Reasoning Framework.]

To begin with, we introduce three meta-hypotheses that we model by three arguments E_inc (the defendant should be declared guilty because of the inculpatory evidence), E_ex (the defendant should be declared innocent because of the exculpatory evidence) and Innocence (the defendant is innocent).
We regard Innocence as the ultimate hypothesis that is to be decided within the trial. In general, it may be necessary to consider several ultimate hypotheses that may correspond to different qualitative degrees of legal liability (e.g. intent vs. accident vs. innocent). If necessary, these can be incorporated by adding additional ultimate hypotheses in an analogous way. E_inc and E_ex are supposed to merge hypotheses and pieces of evidence that speak against (E_inc) or for (E_ex) the defendant's innocence, as illustrated in Figure 1. Support relationships are indicated by a plus and attack relationships by a minus sign. There can also be attack and support relationships between pieces of evidence and additional hypotheses.

Intuitively, as our belief in E_inc increases, our belief in Innocence should decrease. As our belief in E_ex increases, our belief in Innocence should increase. From a classical perspective, accepting E_inc should result in rejecting Innocence and accepting E_ex should result in accepting Innocence. In particular, we should not accept E_ex and E_inc at the same time. Of course, in general, both the inculpatory evidence and the exculpatory evidence can be convincing to a certain degree. Probabilities are one natural way to capture this uncertainty. Intuitively, our basic framework is based on the following assumptions that we will make precise in the subsequent definition.

Inculpatory Evidence (IE): The belief in Innocence is bounded from above by one minus the belief in E_inc.

Exculpatory Evidence (EE): The belief in Innocence is bounded from below by the belief in E_ex.

Supporting Evidence (SE): The beliefs in E_inc and E_ex are bounded from below by the weighted beliefs in their supporting pieces of evidence.

Presumption of Innocence (PI): The belief in Innocence is the maximum belief that is consistent with all assumptions.

The following definition gives a more formal description of our framework. Our four main assumptions are formalized in items 4 and 5.
Definition 1 (Basic Legal Argumentation Framework (BLAF)). A BLAF is a quadruple (A, E, w, C), where A is a finite set of arguments, E is a finite set of directed edges between the arguments, w : E → Q is a weighting function and C is a set of linear atomic constraints over A such that:

1. A = A_M ∪ A_S ∪ A_E is partitioned into a set of meta-hypotheses A_M = {Innocence, E_inc, E_ex}, a set of sub-hypotheses A_S and a set of pieces of evidence A_E.
2. E = E_M ∪ E_S ∪ E_E is partitioned into a set of meta edges E_M = {(E_inc, Innocence), (E_ex, Innocence)}, a set of support edges E_S ⊆ (A_S ∪ A_E) × {E_inc, E_ex} and a set of evidential edges E_E ⊆ (A_S ∪ A_E) × (A_S ∪ A_E).
3. w((E_inc, Innocence)) = −1 and w((E_ex, Innocence)) = 1. Furthermore, 0 ≤ w(e) ≤ 1 for all e ∈ E_S.
4. C contains at least the following constraints:
   IE: π(Innocence) ≤ 1 + w((E_inc, Innocence)) · π(E_inc),
   EE: w((E_ex, Innocence)) · π(E_ex) ≤ π(Innocence),
   SE: w((E, H)) · π(E) ≤ π(H) for all (E, H) ∈ E_S.
5. For all A ∈ A, we call B⁻(A) = min_{P ⊨ C} P(A) the lower belief in A and B⁺(A) = max_{P ⊨ C} P(A) the upper belief in A. The belief in Innocence is defined as
   PI: B(Innocence) = B⁺(Innocence),
   and the belief in the remaining A ∈ A \ {Innocence} is the interval B(A) = [B⁻(A), B⁺(A)].

Items 1-3 basically give a more precise description of the graph illustrated in Figure 1. Item 4 encodes our first three main assumptions as linear atomic constraints. The general form of our basic constraints is π(B) ≤ 1 + w((A, B)) · π(A) for attack relations (A, B) (note that for w((A, B)) = −1, this is just the Coherence constraint from (Hunter and Thimm 2016)) and w((A, B)) · π(A) ≤ π(B) for support relations. Intuitively, attackers bound beliefs from above and supporters bound beliefs from below. Item 5 defines lower and upper beliefs in arguments as the minimal and maximal probabilities that are consistent with our constraints. Following our fourth assumption (presumption of innocence), the belief in Innocence is defined by the upper bound. The beliefs in the remaining arguments are the intervals defined by the lower and upper bounds. The following proposition summarizes some consequences of our basic assumptions.
Proposition 1.
For every BLAF (A, E, w, C), we have

1. B⁺(E_inc) ≤ 1 − B⁻(E_ex) and B⁺(E_ex) ≤ 1 − B⁻(E_inc).
2. For all support edges (a, E) ∈ E_S, we have
   • B⁺(E_ex) ≤ 1 − w((a, E_inc)) · B⁻(a) if E = E_inc,
   • B⁺(E_inc) ≤ 1 − w((a, E_ex)) · B⁻(a) if E = E_ex.

Proof. 1. We prove only the first statement; the second one follows analogously. Consider an arbitrary P ∈ P_A that satisfies C. By EE (Def. 1, item 4), P(E_ex) ≤ P(Innocence), and by IE along with the conditions on w (Def. 1, item 3), P(Innocence) ≤ 1 − P(E_inc). Hence P(E_inc) ≤ 1 − P(Innocence) ≤ 1 − P(E_ex) ≤ 1 − B⁻(E_ex), where the last inequality follows because B⁻(E_ex) ≤ P(E_ex) by definition of B⁻. Since P was arbitrary, B⁺(E_inc) ≤ 1 − B⁻(E_ex).

2. Again, we prove only the first statement. Note that SE (Def. 1, item 4) implies P(E_inc) ≥ w((a, E_inc)) · P(a) for all P ∈ P_A that satisfy C. Therefore, P(E_ex) ≤ 1 − P(E_inc) ≤ 1 − w((a, E_inc)) · P(a) ≤ 1 − w((a, E_inc)) · B⁻(a), where the first and third inequalities can be derived as in 1.

Intuitively, item 1 says that our upper belief that the defendant should be declared guilty because of the inculpatory evidence is bounded from above by one minus our lower belief that the defendant should be declared innocent because of the exculpatory evidence, and vice versa. By rearranging the inequalities, we can see that the lower belief in E_inc is also bounded from above by one minus the upper belief in E_ex, and vice versa. Item 2 says that every argument a that directly supports the inculpatory (exculpatory) evidence E induces an upper bound on the belief in E_ex (E_inc) that is based on our lower belief B⁻(a) and the relevance w((a, E)) of this argument. In a similar way, we could bound the beliefs in contributors to E_inc by the beliefs in contributors to E_ex by taking their respective weights into account. However, the general description becomes more and more difficult to comprehend. Therefore, we just illustrate the interactions by means of a simple example.

Example 1.
Let us consider a simple case of hit-and-run driving. The defendant is accused of having struck a car while parking at a shopping center. The plaintiff witnessed the accident from afar and noted the registration number from the licence plate when the car left (T1). The defendant denies the crime and testified that he was at home with his girlfriend at the time of the offence (T2). His girlfriend confirmed his alibi (T3). However, a security camera at the parking space recorded a person that bears strong resemblance to the defendant at the time of the crime (E1). We consider a simple formalization shown in Figure 2.

[Figure 2: BLAF for Example 1.]

We designed the graph in a way that allows illustrating the interactions in our framework. One may also want to regard T3 as a supporter of exculpatory evidence and consider attack relationships between E1 and T2 and T3. We do not introduce such edges because we want to illustrate the indirect interactions between arguments. In this example, we may weigh all edges with 1 and control the uncertainty only via the degrees of belief. However, we assign a weight of 0.9 to the edge from T1 in order to illustrate the effect of the weight. This may capture the uncertainty that the plaintiff may have written down the wrong registration number, for example. The probability for T1, T2 and T3 is our degree of belief that the corresponding testimonies are true. The probability of E1 is our degree of belief that the camera does indeed show the defendant and not just another person.

Without additional assumptions, we can only derive that our degree of belief in Innocence is 1 (presumption of innocence), as shown in the second column (B_1) of Table 1. We could now start adding assumptions and looking at the consequences. For example, let us assume that the statement of the defendant's girlfriend was very convincing. We could incorporate this by adding the constraint π(T3) ≥ 0.7. The consequences are shown in the third column (B_2) of Table 1. However, if the person on the camera bears strong resemblance to the defendant, we may find that the upper belief in E1 is too low. This means that our assumption is too strong and needs to be revised. Let us just delete the constraint π(T3) ≥ 0.7 and instead impose a constraint on E1. Let us assume that there is hardly any doubt that the camera shows the defendant. We could incorporate this by adding the constraint π(E1) ≥ 0.9. The consequences are shown in the fourth column (B_3) of Table 1.

Table 1: Beliefs under additional assumptions for Example 1 (rounded to two digits). Directly constrained beliefs are marked with *.

  Argument    B_1       B_2          B_3
  Innocence   1         1            0.1
  E_inc       [0, 1]    [0, 0.3]     [0.9, 1]
  E_ex        [0, 1]    [0.7, 1]     [0, 0.1]
  T1          [0, 1]    [0, 0.33]    [0, 1]
  T2          [0, 1]    [0.7, 1]     [0, 0.1]
  T3          [0, 1]    [0.7, 1]*    [0, 0.1]
  E1          [0, 1]    [0, 0.3]     [0.9, 1]*

The choice of probabilities (degrees of belief), weights (relevance) and additional attack or support relations is, of course, subjective. However, arguably, every court decision is subjective in that the decision maker(s) have to weigh the plausibility and the relevance of the evidence in one way or another. By making these assumptions explicit in a formal framework, the decision process can become more transparent. Furthermore, by computing probabilities while adding assumptions, possible inconsistencies can be detected and resolved early.
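For readers who want to reproduce the numbers, the following minimal sketch (our own illustration under the same SciPy assumption as before, not code from the paper) encodes the BLAF of Example 1 and recomputes the fourth column (B_3) of Table 1.

```python
# BLAF of Example 1 under the additional assumption pi(E1) >= 0.9 (column B_3).
from scipy.optimize import linprog

args = ["Innocence", "E_inc", "E_ex", "T1", "T2", "T3", "E1"]
idx = {a: i for i, a in enumerate(args)}
n = len(args)

# (source, target, weight): negative weight = attack, positive weight = support.
edges = [("E_inc", "Innocence", -1.0), ("E_ex", "Innocence", 1.0),
         ("T1", "E_inc", 0.9), ("E1", "E_inc", 1.0),
         ("T2", "E_ex", 1.0), ("T3", "T2", 1.0)]

A_ub, b_ub = [], []
for src, tgt, w in edges:
    row = [0.0] * n
    if w < 0:  # attack: pi(tgt) <= 1 + w*pi(src), i.e. pi(tgt) - w*pi(src) <= 1
        row[idx[tgt]], row[idx[src]] = 1.0, -w
        A_ub.append(row); b_ub.append(1.0)
    else:      # support: w*pi(src) <= pi(tgt), i.e. w*pi(src) - pi(tgt) <= 0
        row[idx[src]], row[idx[tgt]] = w, -1.0
        A_ub.append(row); b_ub.append(0.0)

row = [0.0] * n; row[idx["E1"]] = -1.0   # evidence assumption: pi(E1) >= 0.9
A_ub.append(row); b_ub.append(-0.9)

for a in args:
    c = [0.0] * n; c[idx[a]] = 1.0
    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n).fun
    hi = -linprog([-x for x in c], A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n).fun
    print(f"{a}: [{lo:.2f}, {hi:.2f}]")
# E_inc: [0.90, 1.00], E_ex: [0.00, 0.10], Innocence: [0.00, 0.10] (belief 0.1 by PI), ...
```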
Since we restrict to linear atomic constraints, computing probabilities can be done within a second even when there are thousands of arguments.

Let us note that our framework also allows defining some simple rules that allow deriving explanations for the verdict automatically. For example, the belief in Innocence can be explained directly from the beliefs in E_inc and E_ex. If both B⁻(E_inc) ≤ 0.5 and B⁻(E_ex) ≤ 0.5, our system may report that the defendant is found innocent because of lack of evidence. If B⁻(E_ex) > 0.5, it could report that the defendant is found innocent because the exculpatory evidence is more plausible than the inculpatory evidence (recall from Proposition 1 that B⁺(E_inc) ≤ 1 − B⁻(E_ex)). Finally, if B⁻(E_inc) is sufficiently large, it could report that the defendant is found guilty because of the inculpatory evidence. The beliefs in E_inc and E_ex can then be further explained based on the beliefs in supporting hypotheses and pieces of evidence. The influence of supporting arguments can be measured by their lower belief bounds and their weight. To illustrate this, consider again Table 1. For B_1, the system could report that the defendant is innocent because of lack of convincing evidence, while, for B_2, it can explain that there is convincing exculpatory evidence. If desired, it can then further report T2 as the direct explanation and, going backwards, T3 as an additional explanation. Similarly, for B_3, the system could report that the defendant is probably not innocent because of the inculpatory evidence. Again, the system could give further explanations by going backwards in the graph. We will discuss the idea in more general form in Section 5.

4 Additional Building Blocks

BLAFs can capture a wide variety of cases. However, it is often desirable to add additional structure that captures recurring patterns in legal reasoning. From a usability perspective, this makes the graph more easily comprehensible and allows modeling different cases in a consistent and standardized way. From an automated reasoning perspective, it allows adding additional general rules that can automatically derive explanations for decisions.

Two natural subsets of inculpatory evidence are direct (E_d) and circumstantial (E_c) inculpatory evidence. While direct evidence supports the defendant's guilt immediately, circumstantial evidence involves indirect evidence that requires multiple inferential steps (Fenton, Neil, and Lagnado 2013). For example, a camera that recorded the defendant while committing the crime can be seen as direct evidence, while a camera that recorded the defendant close to the crime scene like in Example 1 can be seen as a piece of circumstantial evidence. Two prominent categories of circumstantial evidence are motive (the defendant had a reason to commit the crime) and opportunity (the defendant had the opportunity to commit the crime). Figure 3 shows a refined BLAF. As indicated by the join of their support edges, the beliefs in pieces of circumstantial evidence are merged and not considered independently. Only if both a motive and the opportunity (and perhaps some additional conditions) were present should the defendant be found guilty. In contrast, pieces of direct evidence are standalone arguments for the defendant's guilt.

Two recurring patterns of exculpatory evidence are alibi and ability.
While an alibi indicates that the defendant was not at the crime scene at the time of the crime, ability can contain pieces of evidence that indicate that the defendant could not have committed the crime, for example, due to lack of physical strength. Figure 3 shows an extended BLAF with six additional meta-hypotheses. As before, we allow edges between all pieces of evidence and sub-hypotheses, but do not draw all possible direct connections in order to keep the graph comprehensible.

[Figure 3: Refined BLAF with additional meta-hypotheses.]

The meaning of the support edges pointing to inculpatory and exculpatory evidence is already defined by SE in Definition 1, item 4. That is, the corresponding support relations (A, B) are associated with the constraint w((A, B)) · π(A) ≤ π(B). This constraint could also be naturally used for the evidential edges that point to direct evidence, alibi and ability. However, the circumstantial evidence patterns motive and opportunity should not act independently, but complement each other. Neither a motive nor the opportunity alone is a good reason to find the defendant guilty. However, if both a good motive and the opportunity are present, this may be a good reason. We say that both items together provide collective support for the guilt of the defendant. To formalize collective support, we can consider a constraint w((Motive, E_c)) · π(Motive) + w((Opportunity, E_c)) · π(Opportunity) ≤ π(E_c) such that w((Motive, E_c)) + w((Opportunity, E_c)) ≤ 1. For example, we could set w((Motive, E_c)) = w((Opportunity, E_c)) = 0.3. Then the presence of a strong motive or the opportunity alone cannot decrease the belief in the defendant's innocence by more than 0.3, and both together cannot decrease the belief by more than 0.6. Opportunity is indeed considered a necessary requirement for the defendant's guilt in the legal reasoning literature and motive is, at least, widely accepted as such (Fenton, Neil, and Lagnado 2013). Collective support is an interesting pattern in general, so we give a more general definition here. Given arguments A_1, ..., A_n (pieces of evidence or sub-hypotheses) that support another argument B such that w((A_1, B)) + ... + w((A_n, B)) ≤ 1, the collective support constraint is defined as

    CS: w((A_1, B)) · π(A_1) + ... + w((A_n, B)) · π(A_n) ≤ π(B).
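To connect this to the linear programs used earlier, the sketch below (our own illustration; the helper name is hypothetical) shows that a collective support constraint contributes exactly one inequality row over the argument probabilities, here with the weights w((Motive, E_c)) = w((Opportunity, E_c)) = 0.3 from above.

```python
# CS constraint  0.3*pi(Motive) + 0.3*pi(Opportunity) <= pi(E_c)  rewritten in
# "A_ub x <= b_ub" form:  0.3*pi(Motive) + 0.3*pi(Opportunity) - pi(E_c) <= 0.
args = ["E_c", "Motive", "Opportunity"]
idx = {a: i for i, a in enumerate(args)}

def cs_row(weighted_supporters, target):
    """Build one LP row for a collective support constraint (right-hand side 0)."""
    row = [0.0] * len(args)
    for supporter, weight in weighted_supporters:
        row[idx[supporter]] = weight
    row[idx[target]] = -1.0
    return row

print(cs_row([("Motive", 0.3), ("Opportunity", 0.3)], "E_c"))
# -> [-1.0, 0.3, 0.3]
```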
The following example illustrates how the additional structure can be applied.

Example 2. Let us consider a simple robbery case. The defendant D is accused of having robbed the victim V. The extended BLAF is shown in Figure 4. Before the crime, D and V met in a bar and had a fight about money that V owed D. V testified that D threatened to get the money one way or another (V1). D acknowledged the fight, but denied the threat (D1). While D's testimony still contains a motive for the crime, it is now significantly weaker. This can be reflected in the weights. We could consider a more fine-grained view distinguishing the fight and the threat and add an attack between the contradicting statements, but in order to keep things simple, we refrain from doing so. V testified that he got robbed at 23:30 by a masked person and that he recognized the defendant based on his voice and stature (V2). This can be seen as direct evidence for the crime, but since the accused is of average stature, it should have only a small weight. A waiter working at the bar testified that the defendant left the bar at about 23:00 (W1). This may have allowed the defendant hypothetically to commit the crime, but he could have gone anywhere, so the weight should again be low. The defendant testified that he went to the movie theater and watched a movie that started at 23:15 (D2). If true, this is a strong alibi and should therefore have a large weight. An employee at the movie theater testified that the defendant is a frequent guest and that he recalled him buying a drink (E1). However, he did not recall the exact time. So it supports the alibi only weakly and should not get too much weight.

[Figure 4: Extended BLAF for Example 2.]
We weigh Motive and Opportunity equally with w((Motive, E_c)) = w((Opportunity, E_c)) = 0.3. The influence of the beliefs in motive and opportunity on the circumstantial evidence is defined by the collective support constraint that we described above. All evidential edges (E, A) that originate from a piece of evidence E are associated with the constraint w((E, A)) · π(E) ≤ π(A). Figure 4 shows the final graph structure and edge weights.

Having defined the structure of the graph and the meaning of the edges, we can start to assign beliefs to pieces of evidence. Again, without making any assumptions about the beliefs, we can only infer that the degree of belief in Innocence is 1. This is shown in the second column of Table 2. To begin with, we assume that the testimonies given by the cinema employee and the waiter of the bar are true (π(E1) = 1, π(W1) = 1). The third column of Table 2 shows the consequences of these assumptions. We can see, for example, that the alibi evidence E1 provides a lower bound for the belief in the exculpatory evidence and thus an upper bound for the beliefs in the inculpatory evidence and the related hypotheses. It also seems safe to assume that the defendant did not lie about his participation in the fight, so we add the constraint π(D1) = 1 next. The fourth column in Table 2 shows the resulting belief intervals. The new support for motive adds to the support of the circumstantial evidence and the lower bound on the belief in the inculpatory evidence is raised. This lowers the belief in the innocence of the accused slightly. Note again that it also decreases the upper bound on the belief in exculpatory evidence indirectly. Finally, let us assume that the victim does not lie about having recognized the defendant (π(V2) = 1) (recall that the uncertainty about the recognition reliability is incorporated in the edge weight). The fifth column in Table 2 shows the new beliefs. We can see that the belief in the defendant's innocence decreases significantly. If we notice that a larger or smaller change would be more plausible, we could take account of this by adapting the edge weight. In this way, legal cases can be analyzed in a systematic way and the plausibility of assumptions can be checked on the fly by looking at their ramifications.

Table 2: Belief in Innocence and entailment results under additional assumptions for Example 2 (rounded to two digits). Directly constrained beliefs are marked with *.

  Argument     Basic     W1, E1         W1, E1, D1     W1, E1, D1, V2
  Innocence    1         0.94           0.91           0.8
  E_inc        [0, 1]    [0.06, 0.7]    [0.09, 0.7]    [0.2, 0.7]
  E_ex         [0, 1]    [0.3, 0.94]    [0.3, 0.91]    [0.3, 0.8]
  E_c          [0, 1]    [0.06, 0.7]    [0.09, 0.7]    [0.09, 0.7]
  E_d          [0, 1]    [0, 0.7]       [0, 0.7]       [0.2, 0.7]
  Alibi        [0, 1]    [0.3, 0.94]    [0.3, 0.91]    [0.3, 0.8]
  Ability      [0, 1]    [0, 0.94]      [0, 0.91]      [0, 0.8]
  Motive       [0, 1]    [0, 1]         [0.1, 1]       [0.1, 1]
  Opportunity  [0, 1]    [0.2, 1]       [0.2, 1]       [0.2, 1]
  V1           [0, 1]    [0, 1]         [0, 1]         [0, 1]
  V2           [0, 1]    [0, 1]         [0, 1]         1*
  D1           [0, 1]    [0, 1]         1*             1*
  D2           [0, 1]    [0.3, 1]       [0.3, 1]       [0.3, 0.89]
  W1           [0, 1]    1*             1*             1*
  E1           [0, 1]    1*             1*             1*
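As a sanity check on the collective support pattern, the following tiny sketch (our own; the evidential edge weights for D1 and W1 are read off Figure 4 and are therefore assumptions here) recomputes the lower bound on E_c in the fourth column of Table 2.

```python
# Hypothetical check of the E_c lower bound under pi(W1) = pi(E1) = pi(D1) = 1.
# Assumed weights: w((D1, Motive)) = 0.1, w((W1, Opportunity)) = 0.2,
# w((Motive, E_c)) = w((Opportunity, E_c)) = 0.3, w((E_c, E_inc)) = 1.
motive_lower = 0.1 * 1.0          # SE: w((D1, Motive)) * pi(D1) <= pi(Motive)
opportunity_lower = 0.2 * 1.0     # SE: w((W1, Opportunity)) * pi(W1) <= pi(Opportunity)
e_c_lower = 0.3 * motive_lower + 0.3 * opportunity_lower   # CS constraint
innocence_upper = 1 - e_c_lower   # E_c supports E_inc (weight 1), IE caps Innocence
print(e_c_lower, innocence_upper)  # 0.09 0.91, matching Table 2
```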
In addition to the previously introduced categories of meta-hypotheses, another recurring pattern in legal cases is mutually dependent pieces of evidence. One way to model this in our framework is to define a meta-argument that is influenced by the dependent pieces of evidence. The collective support constraint CS is well suited to capture this relationship accurately. We illustrate this with an example from (Fenton, Neil, and Lagnado 2013, pp. 82-84).

Example 3. Let us assume that a person was recorded by two video cameras from different perspectives at a crime scene. If the person is the defendant, the defendant should resemble the person on both images. In the BLAF, we can incorporate the two camera observations as pieces of evidence Camera1 and Camera2 supporting a meta-hypothesis Camera that says that the defendant was at the crime scene because of camera evidence. Note that if we used the SE constraint for the evidential edges from Camera1 and Camera2, each of the two cameras would independently determine a lower bound for Camera, which seems too strong in this example. Instead, we can use the CS constraint that we already used to capture the relationship between opportunity and motive. In this example, the CS constraint becomes w((Camera1, Camera)) · π(Camera1) + w((Camera2, Camera)) · π(Camera2) ≤ π(Camera), where w((Camera1, Camera)) + w((Camera2, Camera)) ≤ 1. For example, both camera weights could be set to w((Camera1, Camera)) = w((Camera2, Camera)) = 0.5 to give equal relevance to both. Then, if the person resembles the defendant only from one perspective, say we have π(Camera1) = 1 and π(Camera2) = 0, the induced lower bound on the belief in Camera will be only 0.5. Only if the beliefs in both cameras are larger than 0.5 can the lower bound be larger than 0.5. For example, if we have π(Camera1) = 0.8 and π(Camera2) = 0.6, the induced lower bound is 0.5 · 0.8 + 0.5 · 0.6 = 0.7.

5 Generating Explanations

As we already illustrated at the end of Section 3, the structure of our framework allows generating explanations for decisions automatically.
In general, explaining the beliefs in the meta-hypotheses Innocence, E_inc and E_ex is easier than explaining the beliefs in other arguments because of their restricted form.

Note first that the only direct neighbors of Innocence are E_inc and E_ex, and we know that E_inc is an attacker and E_ex is a supporter. Therefore, we can basically distinguish three cases that we already described at the end of Section 3.

1. B⁻(E_inc) ≤ T and B⁻(E_ex) ≤ 0.5: the defendant is found innocent due to lack of evidence.
2. B⁻(E_ex) > 0.5: the defendant is found innocent because the exculpatory evidence is more plausible than the inculpatory evidence.
3. B⁻(E_inc) > T: the defendant is found guilty because of the inculpatory evidence.

Here, T is a threshold that should usually be chosen from the open interval (0.5, 1). 0.5 is sometimes regarded as the acceptance threshold, but in a legal setting, it may be more appropriate to choose a larger threshold like T = 0.9.

After having received a high-level explanation of the verdict, the user may be interested in more details and ask for reasons that explain the plausibility of inculpatory or exculpatory evidence. Explaining E_inc and E_ex is already more complicated because we now have an unknown number of neighbors in the graph. However, the only neighbors can be supporters (parents) and Innocence (child). By Definition 1, item 4, their meaning is encoded by the SE constraint. Assuming that the user did not add additional constraints about the relationships between E_inc, E_ex and Innocence, we can again define some simple rules. If additional constraints on E_inc and E_ex are desirable, these rules may need to be refined, of course. Otherwise, we can distinguish two cases. If the user asks for an explanation for the lower belief, we can reason as follows: a non-trivial lower bound (> 0) can only result from a supporter with a non-trivial lower bound. So in this case, we can go through the supporters, collect those supporters that induce the maximum lower bound and report them as an explanation.
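The rule for explaining lower bounds can be stated very compactly. The following minimal sketch is our own illustration (function and argument names are hypothetical, and the belief bounds are assumed to be precomputed); it collects the supporters that induce the maximum lower bound.

```python
# Explain a non-trivial lower bound on an argument by its strongest supporters.
# lower_belief maps arguments to precomputed lower bounds; weight maps support
# edges (source, target) to their weights w((source, target)).
def explain_lower_bound(argument, supporters, lower_belief, weight, eps=1e-9):
    # each supporter s induces the lower bound weight[(s, argument)] * lower_belief[s]
    induced = {s: weight[(s, argument)] * lower_belief[s] for s in supporters}
    best = max(induced.values(), default=0.0)
    if best <= 0.0:          # only a trivial lower bound, nothing to explain
        return []
    return [s for s, b in induced.items() if abs(b - best) <= eps]

# Example 1, scenario B_3: E_inc has supporters T1 (weight 0.9) and E1 (weight 1).
print(explain_lower_bound("E_inc", ["T1", "E1"],
                          {"T1": 0.0, "E1": 0.9},
                          {("T1", "E_inc"): 0.9, ("E1", "E_inc"): 1.0}))
# -> ['E1']: the lower bound 0.9 on E_inc is induced by E1
```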
The user may also ask for an explanation for the upper belief. A non-trivial upper bound (< 1) can only result from a non-trivial bound on the belief in Innocence. Let us assume that we want to explain a non-trivial upper bound on E_inc. From the IE constraint in Definition 1, item 4, we can see that this must be caused by a non-trivial lower bound on Innocence. This lower bound, in turn, must be caused by a non-trivial lower bound on E_ex by our assumptions. We could now report the lower bound on E_ex as an explanation. A more meaningful explanation is obtained by also explaining the lower bound on E_ex. This can be done as explained before by looking at the supporters of E_ex. A non-trivial upper bound on E_ex can be explained in a symmetrical manner.

Generating automatic explanations for the remaining sub-hypotheses and pieces of evidence is most challenging, but can be done as long as we can make assumptions about the constraints that are involved. For example, often the SE constraint gives a natural meaning to support edges and the weighted Coherence constraint gives a natural meaning to attack edges. Intuitively, they cause a lower/upper bound on the belief in an argument based on their own lower belief. If these are the only constraints that are employed, explanations for lower bounds can again be generated by collecting the supporters that induce the largest lower bound. For explaining the upper bound, we now have to consider two factors. The first factor are attackers with a non-trivial lower bound. The second factor are other arguments that are supported by the argument in question and have a non-trivial upper bound (then a too large belief in the supporting argument would cause an inconsistency). Therefore, we do not only collect the attacking arguments that induce the largest lower bound, but we also collect supported arguments. We can order the supported arguments by their upper belief multiplied by the weight of the support edge. If the smallest upper bound from the supported arguments is U and the largest lower bound from the attacking arguments is L, we report the collected supported arguments as an explanation if 1 − U > L, the collected attacking arguments as an explanation if 1 − U < L, or both if it happens that 1 − U = L.

For additional constraints, we may have to refine these rules again. One important constraint that we discussed is the CS constraint. In this case, we have to treat the supporters involved in this constraint differently since they all contribute to the induced lower bound. When collecting supporters for explaining lower bounds (the supporters are parents), supporting edges that belong to one CS constraint have to be considered jointly and not independently. If they induce a lower bound that is larger than all lower bounds caused by an SE constraint, they can be reported collectively as an explanation. When collecting supporters for explaining upper bounds (the supporters are children), the reasoning becomes more complicated because there can be various interactions between the beliefs in the involved arguments. We leave an analysis of this case and more general cases for future work.

6 Related Work

Our legal reasoning framework allows explicit formalization of uncertainty in legal decision making. Other knowledge representation and reasoning formalisms have been applied for this purpose. Studies of different game-theoretical tools can be found in (Prakken and Sartor 1996; Riveret et al. 2007; Roth et al. 2007). (Dung and Thang 2010) proposed a probabilistic argumentation framework where the beliefs of different jurors are represented by individual probability spaces. Intuitively, the jurors weigh the evidence and decisions can be made based on criteria like majority voting or belief thresholds. One particularly popular approach for probabilistic legal reasoning are Bayesian networks. (Fenton, Neil, and Lagnado 2013) provide a set of idioms used for the construction of Bayesian networks based on legal argument patterns and apply and discuss their framework for a specific case in (Fenton et al. 2019). (Timmer et al. 2017) developed an algorithm to extract argumentative information from a Bayesian network with an intermediate structure, a support graph, and analyze their approach in a legal case study. (Vlek et al. 2016) propose a method to model different scenarios about crimes with Bayesian networks using scenario scheme idioms and to extract information about the scenario and the quality of the scenario.

Determining the weights and beliefs for the edges and items of evidence poses a problem for our framework as well as for other symbolic approaches. For some items of evidence, the weights as well as the probabilities can be elicited based on statistical analysis and forensic evidence (Kwan et al. 2011; Fenton and Neil 2012; Zhang and Thai 2016). To test the robustness of Bayesian networks with respect to minor changes in subjective beliefs, (Fenton, Neil, and Lagnado 2013) propose to apply sensitivity analysis on the nodes in question.
In our framework, the impact of subjective beliefs can be analysed in a similar manner, by altering the beliefs which are associated with the evidence or the weights associated with the edges. The automated explanation generation outlined in Section 5 can then provide information about the influence that differing beliefs have on hypotheses and sub-hypotheses in the framework. In this way, the perspectives of different agents can be modeled, for example the defense and prosecution perspectives. The clear structure of argumentation frameworks is well suited for generating explanations automatically and related explanation ideas have been considered recently in (Cocarascu, Rago, and Toni 2019; Čyras et al. 2019; Zeng et al. 2018), for example.

In Bayesian networks, inconsistency is usually not an issue because of the way they are defined. In contrast, in our framework, inconsistencies can easily occur. For example, if a forensic expert judges both the accuracy of an alibi and the relevance of a direct piece of evidence with 1, our constraints become inconsistent. While this may be inconvenient, this inconsistency is arguably desirable. This is because the modeling assumptions are inconsistent and this should be recognized and reported by the system. If automated merging of the inconsistent beliefs is desirable, this can be achieved by different tools. One possibility is to apply inconsistency measures for probabilistic logics in order to evaluate the severity of conflicts (De Bona and Finger 2015; Potyka 2014; Thimm 2013). In order to determine the sources of the inconsistency and their impact, Shapley values can be applied (Hunter and Konieczny 2010). Alternatively, we could replace our exact probabilistic reasoning algorithms with inconsistency-tolerant reasoning approaches that resolve inconsistencies by minimizing conflicts (Adamcik 2014; Muiño 2011; Potyka and Thimm 2015) or based on priorities (Potyka 2015). This would be more convenient for the knowledge engineer, but the resulting meaning of the probabilities becomes less clear.

7 Conclusions and Future Work

We proposed a probabilistic abstract argumentation framework for automated reasoning in law based on probabilistic epistemic argumentation (Hunter and Thimm 2016; Hunter, Polberg, and Thimm 2018). Our framework is best suited for merging beliefs in pieces of evidence and sub-hypotheses. Computing an initial degree of belief for particular pieces of evidence based on forensic evidence can often be better accomplished by applying Bayesian networks or a conventional statistical analysis. Our framework can then be applied on top in order to merge the different beliefs in pieces of evidence and sub-hypotheses in a transparent and explainable way. In particular, point probabilities are not required, but imprecise probabilities in the form of belief intervals are supported as well.

It is also interesting to note that the worst-case runtime of our framework is polynomial (Potyka 2019). Bayesian networks also have polynomial runtime guarantees in some special cases, for example, when the Bayesian network structure is a polytree (i.e., it does not contain cycles when ignoring the direction of the edges). The polynomial runtime in probabilistic epistemic argumentation is guaranteed by restricting to a fragment of the full language. This fragment is sufficient for many cases and is all that we used in this work. However, sometimes it may be necessary to extend the language.
For example, instead of talking only about the probabilities of single pieces of evidence and sub-hypotheses, we may want to talk about the probabilities of logical combinations. Similarly, one may want to merge beliefs not only in a linear, but in a non-linear way. Both extensions are difficult to deal with in general. However, it seems worthwhile to study such cases in more detail in order to identify some other tractable special cases.

Another interesting aspect for future work is extending the automated support tools for designing and querying our legal argumentation frameworks. As explained in Section 5, the basic framework can be explained well automatically. However, when beliefs are merged in more complicated ways, for example by the collective support constraint, a deeper analysis is required. We will study explanation generation for collective support and other interesting merging patterns in more detail in future work. For the design of the framework, it may also be helpful to generate explanations for the sources of inconsistency. As explained in the related work section, a combination of inconsistency measures for probabilistic logics and Shapley values seems like a promising approach that we will study. It is also interesting to apply different approaches for inconsistency-tolerant reasoning in order to avoid inconsistencies altogether. However, while these approaches usually can give some meaningful analytical guarantees, it is important to study empirically whether these guarantees are sufficient to produce meaningful results in legal or other applications.

References
Adamcik, M. 2014. Collective reasoning under uncertainty and inconsistency. Ph.D. Dissertation, Manchester Institute for Mathematical Sciences, The University of Manchester.

Bench-Capon, T., and Sartor, G. 2003. A model of legal reasoning with cases incorporating theories and values. Artificial Intelligence.

Cocarascu, O.; Rago, A.; and Toni, F. 2019. Extracting dialogical explanations for review aggregations with argumentative dialogical agents. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1261-1269. International Foundation for Autonomous Agents and Multiagent Systems.

Čyras, K.; Letsios, D.; Misener, R.; and Toni, F. 2019. Argumentation for explainable scheduling. In AAAI Conference on Artificial Intelligence (AAAI), volume 33, 2752-2759.

De Bona, G., and Finger, M. 2015. Measuring inconsistency in probabilistic logic: rationality postulates and Dutch book interpretation. Artificial Intelligence.

Dung, P. M., and Thang, P. M. 2010. Towards (probabilistic) argumentation for jury-based dispute resolution. In International Conference on Computational Models of Argument (COMMA).

Fenton, N., and Neil, M. 2012. Risk Assessment and Decision Analysis with Bayesian Networks. CRC Press.

Fenton, N.; Neil, M.; Yet, B.; and Lagnado, D. 2019. Analyzing the Simonshaven case using Bayesian networks. Topics in Cognitive Science.

Fenton, N.; Neil, M.; and Lagnado, D. A. 2013. A general structure for legal arguments about evidence using Bayesian networks. Cognitive Science.

Hanson, J.; Hanson, K.; and Hart, M. 2014. Game theory and the law. In Game Theory and Business Applications. Springer. 233-263.

Hunter, A., and Konieczny, S. 2010. On the measure of conflicts: Shapley inconsistency values. Artificial Intelligence.

Hunter, A., and Thimm, M. 2016. On partial information and contradictions in probabilistic abstract argumentation. In International Conference on Principles of Knowledge Representation and Reasoning (KR), 53-62. AAAI Press.

Hunter, A.; Polberg, S.; and Thimm, M. 2018. Epistemic graphs for representing and reasoning with positive and negative influences of arguments. arXiv preprint.

Hunter, A. 2013. A probabilistic approach to modelling uncertain logical arguments. International Journal of Approximate Reasoning.

Kwan, M.; Overill, R.; Chow, K.-P.; Silomon, J.; Tse, H.; Law, F.; and Lai, P. 2011. Sensitivity analysis of Bayesian networks used in forensic investigations. In International Conference on Digital Forensics (IFIP), 231-243. Springer.

McCarty, L. T. 1995. An implementation of Eisner v. Macomber. In International Conference on Artificial Intelligence and Law (ICAIL), volume 95, 276-286.

Muiño, D. P. 2011. Measuring and repairing inconsistency in probabilistic knowledge bases. International Journal of Approximate Reasoning.

Potyka, N., and Thimm, M. 2015. Probabilistic reasoning with inconsistent beliefs using inconsistency measures. In International Joint Conference on Artificial Intelligence (IJCAI).

Potyka, N. 2014. Linear programs for measuring inconsistency in probabilistic logics. In International Conference on Principles of Knowledge Representation and Reasoning (KR).

Potyka, N. 2015. Reasoning over linear probabilistic knowledge bases with priorities. In International Conference on Scalable Uncertainty Management (SUM), 121-136. Springer.

Potyka, N. 2019. A polynomial-time fragment of epistemic probabilistic argumentation. International Journal of Approximate Reasoning.

Prakken, H., and Sartor, G. 1996. A dialectical model of assessing conflicting arguments in legal reasoning. In Logical Models of Legal Argumentation. Springer. 175-211.

Prakken, H.; Wyner, A.; Bench-Capon, T.; and Atkinson, K. 2013. A formalization of argumentation schemes for legal case-based reasoning in ASPIC+. Journal of Logic and Computation.

Riveret, R.; Prakken, H.; Rotolo, A.; and Sartor, G. 2007. Success chances in argument games: a probabilistic approach to legal disputes. In Conference on Legal Knowledge and Information Systems (JURIX).

Roth, B.; Riveret, R.; Rotolo, A.; and Governatori, G. 2007. Strategic argumentation: a game theoretical investigation. In International Conference on Artificial Intelligence and Law, 81-90. ACM.

Thimm, M. 2012. A probabilistic semantics for abstract argumentation. In European Conference on Artificial Intelligence (ECAI), volume 12, 750-755.

Thimm, M. 2013. Inconsistency measures for probabilistic logics. Artificial Intelligence.

Timmer, S. T.; Meyer, J.-J. C.; Prakken, H.; Renooij, S.; and Verheij, B. 2017. A two-phase method for extracting explanatory arguments from Bayesian networks. International Journal of Approximate Reasoning.

Vlek, C. S.; Prakken, H.; Renooij, S.; and Verheij, B. 2016. A method for explaining Bayesian networks for legal evidence with scenarios. Artificial Intelligence and Law.

Zeng, Z.; Fan, X.; Miao, C.; Leung, C.; Chin, J. J.; and Ong, Y. S. 2018. Context-based and explainable decision making with argumentation. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1114-1122. International Foundation for Autonomous Agents and Multiagent Systems.

Zhang, G., and Thai, V. V. 2016. Expert elicitation and Bayesian network modeling for shipping accidents: A literature review. Safety Science.