arXiv preprint [math.LO]

Probabilities with Gaps and Gluts
Dominik Klein, Ondrej Majer, Soroush Rafiee Rad
Abstract
Belnap-Dunn logic (BD), sometimes also known as First Degree Entailment, is a four-valued propositional logic that complements the classical truth values of True and False with two non-classical truth values, Neither and Both. The latter two account for the possibility of the available information being incomplete or providing contradictory evidence. In this paper, we present a probabilistic extension of BD that permits agents to have probabilistic beliefs about the truth and falsity of a proposition. We provide a sound and complete axiomatization for the framework defined and also identify policies for conditionalization and aggregation. Concretely, we introduce four-valued equivalents of Bayes' and Jeffrey updating and also suggest mechanisms for aggregating information from different sources.
Keywords: Belnap-Dunn logic, First Degree Entailment, Non-standard probability theory, Probability theory, Bayes' updating, Jeffrey updating, Probability aggregation
Introduction

In learning about a classical system that adheres to the laws of propositional logic, we may be faced with information that does not. Naturally, if information is scarce, our evidence may contain truth value gaps, indicating certain propositions to be neither true nor false. But we may also be faced with contradictory information, especially when our insights are gained by combining various bodies of evidence. This may lead to truth value gluts, i.e. propositions that are labelled as both true and false.

There have been many attempts in the literature to develop formal systems for capturing and analyzing such non-classical situations. These are generally divided into two camps. The first is motivated by adopting the philosophical position of dialetheism as defended by Priest (2006, 2007), advocating the thesis that there are true contradictions, i.e. sentences which are both true and false. Corresponding formal systems should thus allow for assigning both truth values to a sentence simultaneously. Probably the most well known example of such logical systems is the logic LP (Priest, 1979, 2002).

The second camp takes the existence of gaps and gluts as a pathological consequence of imperfect information. Crucially, one may hope that even imperfect information would allow for at least some reliable inferences. In general, there are two ways to go here. One could either make the set of premises consistent or develop non-trivial inference rules that work on inconsistent sets of premises. Consistency of premises can be obtained by focusing on maximal consistent subsets, cf. Rescher and Manor (1970); Klein and Marra (2020), or by employing belief revision, as in AGM systems (Alchourrón et al., 1985).
Mechanisms for dealing with inconsistent information, on the other hand, are developed in a variety of frameworks such as discussive logic (Jaśkowski, 1948), adaptive logic (Batens, 2001), Da Costa's logics of formal inconsistency (1974; 1989), the relevant logic of Anderson and Belnap (1975), and their variants.

Another well-known logical framework that falls into this last category is Belnap-Dunn logic (BD, cf. Belnap, 1977, 2019; Dunn, 1976), sometimes also going by the name of First Degree Entailment. Briefly, this system rests on two assumptions. The first is that gaps and gluts may occur even for boundedly rational agents, as information may be limited (gaps) and the question of whether a given belief set is consistent (i.e. checking for the absence of gluts) is known to be NP-hard. Building on the latter claim, BD's second assumption is that the logic of information should not validate the principle of explosion, which states that every formula can be derived from a contradiction. Just to the contrary, BD stipulates that a body of information may afford us substantial insights about some matter $q$, even if it contains contradictory information about some other $p$ that is completely unrelated to $q$. Belnap-Dunn logic, in short, is a substructural logic that invalidates explosion and tracks which insights can be inferred from an information base that may contain gaps and gluts.

But of course, the problem of insufficient or contradictory information does not apply to categorical true-false information only. Rather, probabilistic information is affected by similar arguments about gaps and gluts as those outlined above. In his 1997 paper, Jøsang puts forward a framework for three-valued probabilities, incorporating uncertainty as a third value that may occur naturally when evidence is ambiguous or insufficient. Notably, this framework circumvents the debated principle of insufficient reason by distinguishing situations of insufficient information from those where equally strong evidence is available for and against some proposition.

Later approaches extend this to four-valued probabilities, where the fourth value represents conflicting information, or gluts. The necessity of gluts is often argued for by considering a Bayesian agent who receives two pieces of mutually contradictory information from sources she judges highly reliable, cf. the firefighter example in Dunn and Kiefer (2019).

In short, these arguments call for a four-valued probabilistic generalization of Belnap-Dunn logic, in a similar way as classical probability theory generalizes propositional logic. In a first approach to this project, Michael Dunn (2010) has defined a four-valued probabilistic framework and has studied logical properties of the resulting probabilistic entailment. In a similar vein, Childers, Majer and Milne (2019) have put forward a single-valued approach to non-standard probabilities motivated by a frequentist interpretation where probability gaps and gluts may occur naturally if probabilities are derived from sampling two …
We start by giving a brief recollection of Belnap-Dunn four-valued logic before proceeding to introduce its probabilistic extensions. Belnap-Dunn four-valued logic is defined over a propositional language that is built over a set Prop of propositional variables. Formally, the logical language $\mathcal{L}_{\mathsf{Prop}}$ is given by the Backus-Naur form:

$$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi$$

Disjunction ($\vee$) is defined in the standard way. The main difference to classical propositional logic consists in the way that formulas are evaluated. In classical propositional logic, evaluations are defined as functions $v : \mathcal{L}_{\mathsf{Prop}} \to \{0,1\}$ that are derived from a valuation on the set of atoms Prop. For Belnap-Dunn logic there are two ways to define evaluations. One approach is to define evaluations as functions $v : \mathcal{L}_{\mathsf{Prop}} \to \mathcal{P}(\{0,1\})$. In other words, instead of evaluating formulas on the two-element lattice $\{0,1\}$, they are interpreted on the four-element lattice $\mathsf{BD} = \mathcal{P}(\{0,1\}) = \{\emptyset, \{0\}, \{1\}, \{0,1\}\}$. Evaluating formulas in the four-element lattice $\mathcal{P}(\{0,1\})$ allows for the assignment of two new truth values, $\emptyset$ and $\{0,1\}$. These represent so-called truth-value gaps and gluts, i.e. situations where formulas obtain neither resp. both of the classic truth values. Formally, the evaluation is defined inductively, starting from an atomic valuation $v : \mathsf{Prop} \to \mathcal{P}(\{0,1\})$, by:

$1 \in v(\neg\varphi)$ iff $0 \in v(\varphi)$
$0 \in v(\neg\varphi)$ iff $1 \in v(\varphi)$
$1 \in v(\varphi \wedge \psi)$ iff $1 \in v(\varphi)$ and $1 \in v(\psi)$
$0 \in v(\varphi \wedge \psi)$ iff $0 \in v(\varphi)$ or $0 \in v(\psi)$

An alternative approach is to use two separate classical valuations, called the positive valuation $v^+$ and the negative valuation $v^-$. Building on atomic valuations $v^+ : \mathsf{Prop} \to \{0,1\}$ and $v^- : \mathsf{Prop} \to \{0,1\}$, these are defined for $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ as:

$v^+(\neg\varphi) = 1$ iff $v^-(\varphi) = 1$
$v^-(\neg\varphi) = 1$ iff $v^+(\varphi) = 1$
$v^+(\varphi \wedge \psi) = 1$ iff $v^+(\varphi) = 1$ and $v^+(\psi) = 1$
$v^-(\varphi \wedge \psi) = 1$ iff $v^-(\varphi) = 1$ or $v^-(\psi) = 1$

Entailment is defined through preservation of positive evaluation: $\varphi \models_L \psi$ if and only if $v^+(\varphi) = 1$ implies $v^+(\psi) = 1$ for all valuation pairs $(v^+, v^-)$. This entailment relation goes by the name of first degree entailment.

An important property of Belnap-Dunn logic, that we will make heavy use of later, is that it admits disjunctive (as well as conjunctive) normal forms. Just as in classical logic, a formula in disjunctive normal form is written as a disjunction of conjunctions of literals. However, unlike in classical logic, an atom might appear both positively and negatively within a conjunctive clause.

Theorem 1. (Theorem 3.9 in Font (1997)) Every formula of Belnap-Dunn logic is equivalent to a formula in a conjunctive (disjunctive) normal form.
Moreover, up to permutation of conjuncts and disjuncts, formulas in conjunctive (disjunctive) normal form may be identified with finite families of finite sets of literals (Theorem 3.15 in Přenosil (2018)).
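To make the set-valued style of evaluation concrete, the following is a minimal sketch (our own illustration, not code from the paper) of Belnap-Dunn evaluation on the four-element lattice $\mathcal{P}(\{0,1\})$; the nested-tuple formula representation is an assumption of the sketch.

```python
def bd_not(v):
    """BD negation: 1 in v(~phi) iff 0 in v(phi), and 0 in v(~phi) iff 1 in v(phi)."""
    out = set()
    if 0 in v:
        out.add(1)
    if 1 in v:
        out.add(0)
    return out

def bd_and(v, w):
    """BD conjunction: true if both conjuncts are true; false if either is false."""
    out = set()
    if 1 in v and 1 in w:
        out.add(1)
    if 0 in v or 0 in w:
        out.add(0)
    return out

def bd_or(v, w):
    """Disjunction, defined in the standard way via De Morgan duality."""
    return bd_not(bd_and(bd_not(v), bd_not(w)))

def evaluate(phi, val):
    """phi is an atom name (str) or ('not', f) / ('and', f, g) / ('or', f, g)."""
    if isinstance(phi, str):
        return val[phi]
    if phi[0] == 'not':
        return bd_not(evaluate(phi[1], val))
    if phi[0] == 'and':
        return bd_and(evaluate(phi[1], val), evaluate(phi[2], val))
    return bd_or(evaluate(phi[1], val), evaluate(phi[2], val))

val = {'p': {0, 1},   # glut: evidence for and against p
       'q': set()}    # gap: no evidence either way
print(evaluate(('and', 'p', ('not', 'p')), val))  # {0, 1}
print(evaluate(('or', 'q', ('not', 'q')), val))   # set()
```

Note that $p \wedge \neg p$ still receives the value $\{0,1\}$ rather than triggering any collapse: this is the failure of explosion at the semantic level, since a glut in $p$ says nothing about an unrelated $q$.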
The double valuation approach's starting assumption is that positive and negative evidence are distinct. That is, the absence of positive evidence for some $p$ is not the same as negative evidence against $p$ (or positive evidence for $\neg p$, if you will). In particular, there may be gaps, where neither evidence for $p$ nor against $p$ is available, and gluts, where evidence of both types is present. Within our models, we must hence treat positive and negative evidence separately. In the following, we assume Prop finite and constant. Also, we will denote the set of literals over Prop by Lit, i.e. $\mathsf{Lit} := \mathsf{Prop} \cup \{\neg p \mid p \in \mathsf{Prop}\}$.

Definition 1. A non-standard model is a triple $\mathcal{M} = \langle \Sigma, v^+, v^- \rangle$ where $\Sigma$ is a finite or countably infinite set of states and $v^+, v^- : \Sigma \times \mathsf{Prop} \to \{0,1\}$ are called the positive (negative) valuation function respectively. For $p \in \mathsf{Prop}$ we let $v^\pm(p) = \{s \in \Sigma \mid v^\pm(s,p) = 1\}$.

Hence, a state $s$ of a model $\mathcal{M}$ might be assigned an inconsistent set of propositions (i.e., $s \in v^+(p) \cap v^-(p)$ for some $p \in \mathsf{Prop}$), and may remain undecided about some propositions ($s \notin v^+(q) \cup v^-(q)$ for some $q \in \mathsf{Prop}$). Non-standard models provide a semantics for BD. More specifically, logical formulas of $\mathcal{L}_{\mathsf{Prop}}$ are evaluated on model-state pairs, using relations $\models^+$ and $\models^-$. From this, we then obtain the notions of a positive and negative extension.

Definition 2.
Let $\mathcal{M} = \langle \Sigma, v^+, v^- \rangle$ be a non-standard model, $s \in \Sigma$ a state and $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ be formulas. Then

i) The semantics of $\mathcal{L}_{\mathsf{Prop}}$ on $(\mathcal{M}, s)$ is given by:

$\mathcal{M}, s \models^+ p$ iff $s \in v^+(p)$
$\mathcal{M}, s \models^- p$ iff $s \in v^-(p)$
$\mathcal{M}, s \models^+ \varphi \wedge \psi$ iff $\mathcal{M}, s \models^+ \varphi$ and $\mathcal{M}, s \models^+ \psi$
$\mathcal{M}, s \models^- \varphi \wedge \psi$ iff $\mathcal{M}, s \models^- \varphi$ or $\mathcal{M}, s \models^- \psi$
$\mathcal{M}, s \models^+ \neg\varphi$ iff $\mathcal{M}, s \models^- \varphi$
$\mathcal{M}, s \models^- \neg\varphi$ iff $\mathcal{M}, s \models^+ \varphi$

ii) The positive and negative extensions of $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ are

$|\varphi|^+_{\mathcal{M}} = \{s \in \Sigma \mid \mathcal{M}, s \models^+ \varphi\}$
$|\varphi|^-_{\mathcal{M}} = \{s \in \Sigma \mid \mathcal{M}, s \models^- \varphi\} \;\; (= \{s \in \Sigma \mid \mathcal{M}, s \models^+ \neg\varphi\})$
We define the entailment relation between sentences in the usual way: $\varphi \models^\pm \psi$ if and only if for all models $\mathcal{M}$ and states $s$, if $\mathcal{M}, s \models^\pm \varphi$ then $\mathcal{M}, s \models^\pm \psi$. Observe the obvious connection between positive and negative extension: $|\neg\varphi|^+_{\mathcal{M}} = |\varphi|^-_{\mathcal{M}}$. Moreover, we define the sets of pure belief, pure disbelief, conflict and uncertainty about $\varphi$ as

$|\varphi|^b_{\mathcal{M}} = |\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}}$
$|\varphi|^d_{\mathcal{M}} = |\varphi|^-_{\mathcal{M}} \setminus |\varphi|^+_{\mathcal{M}}$
$|\varphi|^c_{\mathcal{M}} = |\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}}$
$|\varphi|^u_{\mathcal{M}} = \Sigma \setminus (|\varphi|^+_{\mathcal{M}} \cup |\varphi|^-_{\mathcal{M}})$

The terms belief and disbelief, of course, refer to the intended interpretation as a doxastic state. Whenever clear from context, we omit the subscript $\mathcal{M}$.

Towards a semantics of non-standard probability theory, we expand the non-standard model defined above with a probability measure that is classic. Non-classicality of the ensuing probability assignments, then, will be derived from the underlying valuations only, i.e. from the fact that non-standard models allow for gaps and gluts of truth values.

Definition 3. A probabilistic model is a tuple $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ where $\langle \Sigma, v^+, v^- \rangle$ is a non-standard model and $\mu$ is a probability measure on the full subset algebra of $\Sigma$.

Building on probabilistic models, we can derive two different probability assignments from $\mathcal{M}$, one four-valued, the other single-valued. These are:

Definition 4.
For a probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$,

i) the induced non-standard probability function $p_\mu : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ is given by $p_\mu(\varphi) = \mu(|\varphi|^+_{\mathcal{M}})$;

ii) the induced four-valued probability function $\hat{p}_\mu : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$ is given by $\hat{p}_\mu(\varphi) = \big(\mu(|\varphi|^b),\, \mu(|\varphi|^d),\, \mu(|\varphi|^u),\, \mu(|\varphi|^c)\big)$.

(We use the following conventions in naming probability measures: i) $p$-like names stand for syntactic measures, on languages, and $\mu$-like names for measures on spaces; ii) hat-superscripts denote four-valued probabilities; iii) the subscript $\mu$ may be used if a syntactic measure is derived from a space.)

To end this section, we would like to highlight a strong similarity to classic probabilistic models. Classic probability assignments can be derived from possible worlds models equipped with a probability function, i.e. finite classical models akin to those in Definition 3. More explicitly, for a classical model of the form $\mathcal{M} = \langle W, v, \mu \rangle$ with $W$ a set of possible worlds, $v : W \times \mathsf{Prop} \to \{0,1\}$ a valuation, and $\mu : \mathcal{P}(W) \to [0,1]$ a probability measure, the probability of some $\varphi$ is given as $\mu([\varphi])$, with $[\varphi] = \{w \in W : \mathcal{M}, w \models \varphi\}$. In fact, if Prop is finite, every probability assignment on $\mathcal{L}_{\mathsf{Prop}}$ can be obtained in this way.

Moreover, every world $w$ of a possible worlds model $W$ naturally corresponds to its atomic valuation, which can be represented by the subset $V \subseteq \mathsf{Prop}$ given by $p \in V$ iff $v(w,p) = 1$, for $p \in \mathsf{Prop}$. In the same vein, each state $s$ of a probabilistic model corresponds to a non-standard possible assignment $n_s \in \mathcal{P}(\mathsf{Lit})$ defined by $p \in n_s$ iff $v^+(s,p) = 1$ and $\neg p \in n_s$ iff $v^-(s,p) = 1$, for $p \in \mathsf{Prop}$. Hence, non-standard probabilistic models are obtained from possible worlds models by replacing classical worlds, i.e. atomic valuations, with BD-possible worlds, that is, elements of $\mathcal{P}(\mathsf{Lit})$.

In the following, we present a number of axioms for non-standard and four-valued probabilities. The two sets of axioms given here are easily seen to be sound w.r.t. the semantics just presented. That they are also complete will be shown in Section 6. We can hence use these axioms for a purely syntactic definition of non-standard and four-valued probabilities.
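For concreteness, the following sketch (our own toy example, not code from the paper) induces both probability functions from a four-state probabilistic model over a single atom $p$, with one state for each evidential situation.

```python
from fractions import Fraction

# A toy probabilistic model M = (Sigma, mu, v+, v-) over one atom p.
# Each state is labelled by the evidence it carries: (v+(s,p), v-(s,p)).
states = {'gap': (0, 0), 'true': (1, 0), 'false': (0, 1), 'glut': (1, 1)}
mu = {s: Fraction(1, 4) for s in states}           # uniform measure

pos = {s for s, (vp, vn) in states.items() if vp}  # |p|+ = {'true', 'glut'}
neg = {s for s, (vp, vn) in states.items() if vn}  # |p|- = {'false', 'glut'}

def measure(A):
    return sum(mu[s] for s in A)

p_p = measure(pos)                          # induced p_mu(p) = mu(|p|+)

b = measure(pos - neg)                      # pure belief
d = measure(neg - pos)                      # pure disbelief
c = measure(pos & neg)                      # conflict
u = measure(set(states) - (pos | neg))      # uncertainty

print(p_p)            # 1/2
print(b + d + u + c)  # 1   (the four masses partition Sigma)

# gaps and gluts on the level of p_mu:
gap_wit  = measure(pos | neg)  # p_mu(p v ~p) = 3/4 < 1: a probabilistic gap
glut_wit = measure(pos & neg)  # p_mu(p ^ ~p) = 1/4 > 0: a probabilistic glut
```

Note that $b + c$ equals $p_\mu(p)$ here, which anticipates the translation between the two induced assignments discussed in the text.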
Non-standard probabilities
We begin with axioms for single-valued non-standard probabilities, i.e. probability measures assigning each $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ a unique real number.
Definition 5. A non-standard probability assignment is a function $p : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ satisfying, for all $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$:

(A1) $0 \le p(\varphi) \le 1$
(A2) if $\varphi \models_L \psi$ then $p(\varphi) \le p(\psi)$ (monotonicity)
(A3) $p(\varphi \wedge \psi) + p(\varphi \vee \psi) = p(\varphi) + p(\psi)$ (import-export rule)

where $\models_L$ in (A2) is the entailment relation of Belnap-Dunn logic (first-degree entailment).

These axioms are strictly weaker than the classic Kolmogorov axioms (Kolmogorov, 2018). Axioms (A1)-(A3) can be derived from the Kolmogorov axioms, using that first degree entailment is a sub-relation of classical entailment. In the converse direction, however, only the non-negativity axiom ($p(\varphi) \ge 0$) is derivable, from (A1). Neither Kolmogorov's unit axiom ($p(\top) = 1$) nor the ($\sigma$-)additivity axiom is derivable from (A1)-(A3), as is illustrated by the fact that assigning probability .5 to every formula satisfies (A1)-(A3). In fact, the import-export axiom is a weak counterpart to additivity, stating that a general rule for adding probabilities that is derivable from the Kolmogorov axioms, $p(\varphi \vee \psi) = p(\varphi) + p(\psi) - p(\varphi \wedge \psi)$, continues to hold. Within the above axiomatization, the import-export axiom (A3) is the only condition regulating the relation between the probability of a formula and its negation. As a result, the probabilities of $\varphi$ and $\neg\varphi$ need not sum up to 1. The constraint $p(\varphi \vee \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi) + p(\neg\varphi)$ allows for probabilistic gaps ($p(\varphi \vee \neg\varphi) < 1$) and gluts ($p(\varphi \wedge \neg\varphi) > 0$) to occur simultaneously. This squares with our original motivation of establishing independence between positive and negative evidence.

Four-valued probabilities

We now turn to four-valued probability assignments. These are characterized by a total of six axioms.
Definition 6. A four-valued probability assignment is a function $\hat{p} : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$. Writing $\hat{p}(\varphi)$ as $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$, this function must satisfy

(D1) $0 \le b_\varphi, d_\varphi, u_\varphi, c_\varphi$
(D2) $b_\varphi + d_\varphi + u_\varphi + c_\varphi = 1$
(D3) $b_{\neg\varphi} = d_\varphi$ and $c_{\neg\varphi} = c_\varphi$
(D4) if $\varphi \models_L \psi$ then $b_\varphi + c_\varphi \le b_\psi + c_\psi$
(D5) $b_{\varphi \wedge \neg\varphi} = 0$ and $c_{\varphi \wedge \neg\varphi} = c_\varphi$
(D6) $b_\varphi + c_\varphi + b_\psi + c_\psi = b_{\varphi \wedge \psi} + c_{\varphi \wedge \psi} + b_{\varphi \vee \psi} + c_{\varphi \vee \psi}$

where $\models_L$ is first-degree entailment and $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$.
The four entries of $\hat{p}$ stand for pure belief (i.e. $\varphi$ is true and $\neg\varphi$ is not), pure disbelief, uncertainty and conflict respectively. Let us briefly explain the axioms. The first two axioms (D1) and (D2) are classicality axioms, stating that probabilities are non-negative and that the probabilistic masses of pure belief, pure disbelief, uncertainty and conflict must add up to 1. This reflects the intuition that the four cases are mutually exclusive and jointly exhaustive, i.e. that the metatheory of gaps and gluts is classical.

Axioms (D3)-(D6) then represent structural relations between the four-valued assignments. (D3) emphasizes the strong relation between $\varphi$ and $\neg\varphi$: belief in one is the same as disbelief in the other, while both share the same conflict and uncertainty. (D4) is a direct counterpart of axiom (A2) above, stating that the total belief in $\varphi$ (i.e. the sum of pure belief in $\varphi$ and belief in $\varphi$ and $\neg\varphi$ together) must be monotone under first degree entailment. (D5) expresses that an agent cannot have pure belief in contradictory formulas of the form $\varphi \wedge \neg\varphi$. A fortiori, the conflict about $\varphi \wedge \neg\varphi$ must be derived from (and equal to) the conflict about $\varphi$ alone. (D6), finally, is a counterpart to the import-export axiom (A3). Briefly, it states that the total beliefs (i.e. the sum of pure belief and conflict together) of $\varphi, \psi, \varphi \vee \psi$ and $\varphi \wedge \psi$ must satisfy the import-export rule.

We should note that the axioms presented here are weaker than those put forward in Dunn (2010). There, the probability of a conjunction $\varphi \wedge \psi$ is determined by its conjuncts through:

$b_{\varphi \wedge \psi} = b_\varphi \cdot b_\psi$
$d_{\varphi \wedge \psi} = d_\varphi + d_\psi - d_\varphi d_\psi + c_\varphi u_\psi + u_\varphi c_\psi$
$u_{\varphi \wedge \psi} = u_\varphi b_\psi + b_\varphi u_\psi + u_\varphi u_\psi$
$c_{\varphi \wedge \psi} = b_\varphi c_\psi + c_\varphi b_\psi + c_\varphi c_\psi$

A similar axiom for three-valued probabilities (true/false/uncertain) can be found in Jøsang (1997). Notably, such a definition makes conjunction truth-functional, i.e. the probability of $\varphi \wedge \psi$ is fully determined by the probabilities of $\varphi$ and $\psi$.
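To see the truth-functional rule in action, here is a sketch (our own illustration, transcribing the equations above) that applies it to two classical four-valued vectors, i.e. vectors with no uncertainty and no conflict.

```python
from fractions import Fraction

def dunn_and(p, q):
    """Truth-functional conjunction of four-valued vectors (b, d, u, c),
    following the rule quoted in the text."""
    b1, d1, u1, c1 = p
    b2, d2, u2, c2 = q
    return (b1 * b2,
            d1 + d2 - d1 * d2 + c1 * u2 + u1 * c2,
            u1 * b2 + b1 * u2 + u1 * u2,
            b1 * c2 + c1 * b2 + c1 * c2)

# Two classical vectors: u = c = 0, so b + d = 1.
phi = (Fraction(3, 5), Fraction(2, 5), Fraction(0), Fraction(0))
psi = (Fraction(1, 2), Fraction(1, 2), Fraction(0), Fraction(0))

result = dunn_and(phi, psi)
print(result[0])   # 3/10, i.e. b_phi * b_psi: probabilistic independence
```

The belief component of the conjunction is forced to be the product of the conjuncts' beliefs, which is exactly the point made in the text: the rule builds probabilistic independence into every pair of propositions.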
We take this to be too strong, especially given that no such functional dependence holds in classic probability theory. Moreover, this truth-functional approach implies that all propositions are mutually probabilistically independent, precluding any interesting notion of conditionalization. To see this, assume that $\varphi$ and $\psi$ are classical, i.e. $\hat{p}(\varphi) = (b_\varphi, d_\varphi, 0, 0)$ and $\hat{p}(\psi) = (b_\psi, d_\psi, 0, 0)$. Then the above definition simplifies to $\hat{p}(\varphi \wedge \psi) = (b_\varphi b_\psi,\, d_\varphi + d_\psi - d_\varphi d_\psi,\, 0,\, 0)$. In other words, the probability (belief) of $\varphi \wedge \psi$ is the product of the probabilities of $\varphi$ and $\psi$, which is exactly the definition of probabilistic independence.

In the following section we will show a strong correspondence between non-standard and four-valued probability assignments. Thereafter, we show the axiom systems (A1)-(A3) and (D1)-(D6) to be sound and complete with respect to the class of probabilistic models defined above (Section 6). In Section 7 we then discuss approaches to conditionalization in either setting.

We have so far presented two different frameworks for non-standard probability, one real-valued, the other with values in $\mathbb{R}^4$. As we show now, both are different but equivalent perspectives on the same phenomenon. To this end, let $P_{ns}$ and $P_4$ be the sets of non-standard and four-valued probability assignments respectively. That is, $P_{ns}$ is the set of functions $\mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ satisfying (A1)-(A3) while $P_4$ consists of all mappings $\mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$ satisfying (D1)-(D6). We will show the translation map $tr_{ns} : P_4 \to P_{ns}$ defined by

$tr_{ns}(\hat{p})(\varphi) := b_\varphi + c_\varphi$, where $\hat{p}(\varphi) = (b_\varphi, d_\varphi, u_\varphi, c_\varphi)$,

to be a bijection. In the opposite direction, the map $tr_4 : P_{ns} \to P_4$ is given by

$tr_4(p)(\varphi) := \big(p(\varphi) - p(\varphi \wedge \neg\varphi),\; p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\; 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\; p(\varphi \wedge \neg\varphi)\big)$

As expected, the maps $tr_{ns}$ and $tr_4$ are inverse to each other:

Theorem 2. $tr_{ns}$ and $tr_4$ are well-defined. Moreover $tr_4 \circ tr_{ns} = id_{P_4}$ and $tr_{ns} \circ tr_4 = id_{P_{ns}}$.

Moreover, the translation maps $tr_{ns}$ and $tr_4$ cohere with the way we defined non-standard and four-valued assignments on a given probabilistic model.

Theorem 3. Let $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ and $\hat{p}_\mu$ the induced non-standard and four-valued probability functions. Then $tr_{ns} \circ \hat{p}_\mu = p_\mu$ and $tr_4 \circ p_\mu = \hat{p}_\mu$.

The remainder of this section is devoted to showing these two results.
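The round trip between the two kinds of assignment can be checked numerically. The following sketch is our own illustration of the translation maps (the names tr_ns and tr_4, and the restriction to the values an assignment takes on $\varphi$, $\neg\varphi$ and $\varphi \wedge \neg\varphi$, are simplifications for this example).

```python
from fractions import Fraction

def tr_ns(b, d, u, c):
    """Non-standard probability of phi from its four-valued vector: b + c."""
    return b + c

def tr_4(p_phi, p_neg, p_glut):
    """Four-valued vector of phi from p(phi), p(~phi) and p(phi ^ ~phi)."""
    return (p_phi - p_glut,
            p_neg - p_glut,
            1 - p_phi - p_neg + p_glut,
            p_glut)

# a four-valued vector (b, d, u, c) for phi, summing to 1 as (D2) demands:
b, d, u, c = Fraction(2, 5), Fraction(1, 5), Fraction(1, 10), Fraction(3, 10)

# the non-standard values it induces:
p_phi  = tr_ns(b, d, u, c)   # p(phi)        = b + c
p_neg  = d + c               # p(~phi),      by (D3)
p_glut = c                   # p(phi ^ ~phi), by (D5)

assert tr_4(p_phi, p_neg, p_glut) == (b, d, u, c)  # round trip recovers the vector
```

The final assertion is an instance of Theorem 2: translating a four-valued vector to its non-standard values and back is the identity, with (D2) supplying the uncertainty component.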
Proof of Theorem 2. To see that $tr_{ns}$ is well-defined, let $p = tr_{ns}(\hat{p})$ for a fixed $\hat{p} \in P_4$. First, note that for any $\psi \in \mathcal{L}_{\mathsf{Prop}}$ with $\hat{p}(\psi) = (b_\psi, d_\psi, u_\psi, c_\psi)$ we have $0 \le b_\psi + c_\psi \le 1$, showing that $p$ satisfies (A1). To see that $p$ satisfies (A2), assume that $\varphi \models_L \psi$. By (D4), we have that $b_\varphi + c_\varphi \le b_\psi + c_\psi$ and hence $p(\varphi) \le p(\psi)$. For (A3), finally, note that by (D6) we have for any $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ that $b_\varphi + c_\varphi + b_\psi + c_\psi = b_{\varphi \wedge \psi} + c_{\varphi \wedge \psi} + b_{\varphi \vee \psi} + c_{\varphi \vee \psi}$, which immediately implies that $p(\varphi) + p(\psi) = p(\varphi \wedge \psi) + p(\varphi \vee \psi)$.

Next, we show that also $tr_4$ is well-defined. For this, fix $p \in P_{ns}$. For $\psi \in \mathcal{L}_{\mathsf{Prop}}$ denote $tr_4(p)(\psi)$ by $(b_\psi, d_\psi, u_\psi, c_\psi)$. Using this notation, we obtain

$b_\psi + d_\psi + u_\psi + c_\psi = p(\psi) - p(\psi \wedge \neg\psi) + p(\neg\psi) - p(\psi \wedge \neg\psi) + 1 - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) + p(\psi \wedge \neg\psi)$

The latter term is easily seen to equal 1, showing (D2). For (D1), note that $\psi \wedge \neg\psi \models_L \psi$ and $\psi \wedge \neg\psi \models_L \neg\psi$. By (A2), we have that $p(\psi \wedge \neg\psi) \le p(\psi), p(\neg\psi)$, which, together with (A1), implies that $b_\psi, d_\psi, c_\psi \ge 0$. Finally, by (A1) and (A3),

$1 - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) \ge p(\psi \vee \neg\psi) - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) = 0$,

hence $u_\psi \ge 0$.

The first half of (D3) follows from the fact that $b_{\neg\psi} = p(\neg\psi) - p(\neg\psi \wedge \neg\neg\psi) = d_\psi$, using that $\neg\neg\psi \dashv\models_L \psi$ and hence, by (A2), $p(\neg\neg\psi) = p(\psi)$. The second half follows from the fact that $\neg\psi \wedge \neg\neg\psi \dashv\models_L \psi \wedge \neg\psi$ and hence, by (A2), $p(\neg\psi \wedge \neg\neg\psi) = p(\psi \wedge \neg\psi)$. Similarly, (D4) can be derived from (A2) together with the fact that $b_\psi + c_\psi = p(\psi) - p(\psi \wedge \neg\psi) + p(\psi \wedge \neg\psi) = p(\psi)$. Using the latter fact again, (D6) is an immediate consequence of (A3). For (D5), finally, note that $\psi \wedge \neg\psi \dashv\models_L \psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi)$ and hence, by (A2), $p(\psi \wedge \neg\psi) = p(\psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi))$. This implies that $c_{\psi \wedge \neg\psi} = c_\psi$ and that $b_{\psi \wedge \neg\psi} = p(\psi \wedge \neg\psi) - p(\psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi)) = 0$.

It remains to show that $tr_{ns} \circ tr_4 = id_{P_{ns}}$ and $tr_4 \circ tr_{ns} = id_{P_4}$, i.e. that $tr_{ns}$ and $tr_4$ are left and right inverses of each other. We begin by showing that $tr_{ns}(tr_4(p)) = p$ for any $p \in P_{ns}$. For $\varphi \in \mathcal{L}_{\mathsf{Prop}}$, we have that $tr_4(p)(\varphi)$ equals

$\big(p(\varphi) - p(\varphi \wedge \neg\varphi),\; p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\; 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\; p(\varphi \wedge \neg\varphi)\big)$.

Hence $tr_{ns}(tr_4(p))(\varphi) = p(\varphi) - p(\varphi \wedge \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi)$, as desired.

For the converse direction, let $\hat{p} \in P_4$. We have to show that $tr_4(tr_{ns}(\hat{p})) = \hat{p}$. For this, let $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ and denote $\hat{p}(\psi)$ by $(b_\psi, d_\psi, u_\psi, c_\psi)$ for any $\psi \in \mathcal{L}_{\mathsf{Prop}}$. By axioms (D3) and (D5) we have that $b_{\neg\varphi} = d_\varphi$, $c_{\neg\varphi} = c_\varphi$, $b_{\varphi \wedge \neg\varphi} = 0$ and $c_{\varphi \wedge \neg\varphi} = c_\varphi$. Hence, the values of $tr_{ns}(\hat{p})(\varphi)$, $tr_{ns}(\hat{p})(\neg\varphi)$ and $tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi)$ are $b_\varphi + c_\varphi$, $d_\varphi + c_\varphi$ and $c_\varphi$ respectively. We then get that

$tr_4(tr_{ns}(\hat{p}))(\varphi)$
$= \big(tr_{ns}(\hat{p})(\varphi) - tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; tr_{ns}(\hat{p})(\neg\varphi) - tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; 1 - tr_{ns}(\hat{p})(\varphi) - tr_{ns}(\hat{p})(\neg\varphi) + tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi)\big)$
$= \big(b_\varphi + c_\varphi - c_\varphi,\; d_\varphi + c_\varphi - c_\varphi,\; 1 - (b_\varphi + c_\varphi) - (d_\varphi + c_\varphi) + c_\varphi,\; c_\varphi\big)$
$= (b_\varphi,\; d_\varphi,\; 1 - b_\varphi - d_\varphi - c_\varphi,\; c_\varphi) = (b_\varphi, d_\varphi, u_\varphi, c_\varphi)$,

where the last equation employs (D2). Hence $tr_4(tr_{ns}(\hat{p})) = \hat{p}$, as desired. ∎

Proof of Theorem 3.
For $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ denote $\hat{p}_\mu(\varphi)$ by $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$. By Definition 4, we have

$b_\varphi = \mu(|\varphi|^b_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}})$
$c_\varphi = \mu(|\varphi|^c_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}})$

Hence,

$tr_{ns}(\hat{p}_\mu)(\varphi) = b_\varphi + c_\varphi = \mu(|\varphi|^b_{\mathcal{M}}) + \mu(|\varphi|^c_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}}) + \mu(|\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}})$

By definition, the latter term is exactly $p_\mu(\varphi)$. Thus $tr_{ns} \circ \hat{p}_\mu = p_\mu$, as desired. Moreover, the latter formula implies that $tr_4 \circ tr_{ns} \circ \hat{p}_\mu = tr_4 \circ p_\mu$. By Theorem 2, we have $tr_4 \circ tr_{ns} = id_{P_4}$. Hence, the last equation reduces to $\hat{p}_\mu = tr_4 \circ p_\mu$, proving the second part of the theorem. ∎

Having shown that non-standard and four-valued probability assignments are equivalent, as witnessed by the bijection $tr_{ns} : P_4 \to P_{ns}$, we now turn our attention to the class of probability functions that are induced by probabilistic models. As it turns out, these are fully characterized by our axioms (A1)-(A3). More specifically, we will show that axioms (A1)-(A3) are a sound and complete characterization of the induced non-standard probability functions of probabilistic models. Of course, by Theorems 2 and 3, this implies that also (D1)-(D6) are a sound and complete characterization of the induced four-valued probability functions of probabilistic models. In fact, the soundness part is easy to check:

Lemma 1. Let $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ the induced non-standard probability function. Then $p_\mu$ satisfies (A1)-(A3).

Towards completeness, we will show a stronger result. Recall that completeness expresses that every $p \in P_{ns}$ is the induced non-standard probability function of some probabilistic model $\mathcal{M}$. This $\mathcal{M}$ may, however, not be unique, as $p$ may not be expressive enough to completely determine all properties of $\mathcal{M}$. As we will show, $\mathcal{M}$ is almost unique. More specifically, we determine a class $\mathcal{M}_{can}$ of canonical models such that every $p \in P_{ns}$ is the induced non-standard probability function of exactly one $\mathcal{M} \in \mathcal{M}_{can}$.

Definition 7. i) We call a probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ canonical iff $\Sigma = \mathcal{P}(\mathsf{Lit})$ and $v^+, v^-$ satisfy

$v^+(p) = \{\sigma \in \mathcal{P}(\mathsf{Lit}) \mid p \in \sigma\}$
$v^-(p) = \{\sigma \in \mathcal{P}(\mathsf{Lit}) \mid \neg p \in \sigma\}$

ii) $\mathcal{M}_{can}$ is the set of canonical probabilistic models.

Remark: The set $\mathcal{M}_{can}$ is representative of the set of all models in the following sense: For any probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$, there is a unique canonical model $\mathcal{M}_c = \langle \mathcal{P}(\mathsf{Lit}), \mu_c, v^+_c, v^-_c \rangle$ and a unique function $f : \Sigma \to \mathcal{P}(\mathsf{Lit})$ such that $x \in v^\pm(p) \Leftrightarrow f(x) \in v^\pm_c(p)$ and $\mu_c(\sigma_c) = \mu(f^{-1}(\sigma_c))$ for all $\sigma_c \in \mathcal{P}(\mathsf{Lit})$. In particular, $p_\mu(\varphi) = p_{\mu_c}(\varphi)$ for all $\varphi \in \mathcal{L}_{\mathsf{Prop}}$. The main theorem of this section is:
Theorem 4. For any $p \in P_{ns}$ there is a unique canonical model $\mathcal{M}_p = \langle \mathcal{P}(\mathsf{Lit}), \mu, v^+, v^- \rangle$ with induced non-standard probability function $p_\mu$ such that $p = p_\mu$.

Corollary 1. Axioms (A1)-(A3) are sound and complete with respect to the class of induced non-standard probability functions of probabilistic models.

By Theorems 2 and 3, the previous result readily translates to the level of four-valued probability functions.

Theorem 5. For any $\hat{p} \in P_4$ there is a unique canonical model $\mathcal{M}_{\hat{p}} = \langle \mathcal{P}(\mathsf{Lit}), \mu, v^+, v^- \rangle$ with induced four-valued probability function $\hat{p}_\mu$ such that $\hat{p} = \hat{p}_\mu$.

Corollary 2. Axioms (D1)-(D6) are sound and complete with respect to the class of induced four-valued probability functions of probabilistic models.

Proof of Theorem 4.
Fix p P P ns . Let Σ “ P p Lit q and let v ˘ : Prop Ñ P p Σ q bedefined as v ` p q q “ t σ P Σ | q P σ u and v ´ p q q “ t σ P Σ | q P σ u respectively.We will construct a classic probability function µ : P p Σ q Ñ r
0; 1 s such that thecanonical model M “ x Σ , µ, v ` , v ´ y satisfies p µ “ p . It suffices to construct theunderlying probability mass function W : Σ Ñ r
0; 1 s , i.e. the function satisfying W p x q “ µ pt x uq for x P Σ. We will do so by induction on | x | for x P Σ “ P p Lit q .The construction proceeds in three steps. As an induction base, we set µ p x max q with x max the unique element in Σ with | x max | “ | Lit | . In the induction step,we define µ p x q for all x with | x | “ k ě
1, assuming that µ p y q has already beendefined for all y with | y | ą k . In the last step, finally, we define µ pHq , where H is the unique element of Σ of cardinality 0.We will need to ensure that that µ pr ϕ sq “ p p ϕ q for all ϕ P L Prop , where r ϕ s denotes the truth set of ϕ in the non-standard model x Σ , v ` , v ´ y , i.e. r ϕ s “t x Ď Lit | Ź q P x q ( L ϕ u . Note that by the normal form theorem (Theorem 1)and axiom (A2), it suffices to show this property for all ϕ P L Prop that are indisjunctive normal form. Moreover note that for any ϕ, ψ P L Prop in disjunctivenormal form, we have that µ pr ϕ _ ψ sq “ µ pr ϕ sq` µ pr ψ sq´ µ pr ϕ ^ ψ sq , as witnessedby µ pr ϕ _ ψ sq “ ÿ x ( ϕ _ ψ µ p x q “ ÿ x ( ϕ µ p x q ` ÿ x ( ψ µ p x q ´ ÿ x ( ϕ ^ ψ µ p x q“ µ pr ϕ sq ` µ pr ψ sq ´ µ pr ϕ ^ ψ sq . By (A3), hence, knowing that µ pr˚sq “ p p˚q for ˚ P t ϕ, ψ, ϕ ^ ψ u guarantees that µ pr ϕ _ ψ sq “ p p ϕ _ ψ q . It thus suffices to show that µ pr ϕ sq “ p p ϕ q whenever ϕ
12s a conjunction of literals, i.e. of the form Ź q P x q with x Ď Lit. We will showthis property to hold alongside our inductive construction.For the first step, let x max be Ź q P Lit q , the unique element in P p Σ q of max-imal cardinality. Note that t x max u is the truth set of the formula Ź q P Lit q . Wethus set W p x max q : “ p p Ź q P Lit q q . By axiom (A1) we have that 0 ď W p x max q ď
1. For the inductive step let k ě W p y q has already beendefined for all y with | y | ą k . We simultaneously define W p x q for all x Ď Lit with | x | “ k . Let such x be given. Note that the truth set of Ź q P x q is t y Ď Lit | x Ď y u . By induction assumption W p y q is already defined for all t y Ď Lit | x Ă y u . We can hence define W p x q : “ p p ľ q P x q q ´ ÿ t y Ď Lit | x Ă y u W p y q . (1)We have that W p x q ď p p Ź q P x q q and thus W p x q ď
1. On the other hand, note that {y ⊆ Lit | x ⊂ y} is the truth set of ⋁_{y⊃x} ⋀_{q∈y} q. Hence, by induction assumption,

Σ_{y⊃x} W(y) = p(⋁_{y⊃x} ⋀_{q∈y} q).  (2)

Moreover, note that ⋁_{y⊃x} ⋀_{q∈y} q ⊨ ⋀_{q∈x} q and hence, by (A2), p(⋁_{y⊃x} ⋀_{q∈y} q) ≤ p(⋀_{q∈x} q). Combining this inequality with (1) and (2) yields W(x) ≥ 0. Thus W(x) is already defined for all x ≠ ∅. We then set W(∅) = 1 − Σ_{x≠∅} W(x). It follows immediately that Σ_{x∈Σ} W(x) = 1. Moreover, by our induction, W(x) ≥ 0 for x ≠ ∅, hence W(∅) ≤ 1. On the other hand, note that {x ⊆ Lit | x ≠ ∅} is the truth set of ⋁_{q∈Lit} q. By induction assumption, Σ_{x≠∅} W(x) = p(⋁_{q∈Lit} q), and hence, by axiom (A1), W(∅) ≥ 0. By construction, µ([ϕ]) = p(ϕ) for all ϕ of the form ⋀_{q∈x} q for some x ⊆ Lit. By the above remark, this ensures that µ([ϕ]) = p(ϕ) for all ϕ, i.e. that p_µ = p.

To end the static part of this paper, we provide a graphical overview of the relationships identified so far. By Theorems 2 to 5, the diagram in Figure 1 commutes. Moreover, each pair of opposite arrows in the upper half of the diagram, i.e. the pairs (tr_ns, tr_ns), (p ↦ M_p, µ ↦ p_µ) and (p̂ ↦ M_p̂, µ ↦ p̂_µ), are left and right inverses of each other.

[Figure 1: The relationships identified so far. By Theorems 2 to 5, this diagram commutes.]

In a classic setting, Bayesian conditioning on a formula ϕ describes a situation where ϕ is learned to be true with probability 1, and hence ¬ϕ true with probability 0. A generalization of this rule is Jeffrey conditioning, where an agent may learn the probability of ϕ to be any value q ∈ [0,1], rather than only the extremal value of 1 (or 0, when ¬ϕ is learned) permitted in Bayes' conditioning.

Either method is best illustrated semantically. Within a classical setting, any formula ϕ defines a binary partition {[ϕ], [¬ϕ]} on the state space, cf. Figure 2.

[Figure 2: Classic conditioning]

Jeffrey conditioning is then executed by linearly expanding or contracting the original measure µ on [ϕ] and [¬ϕ] to some new µ̄ in such a way that µ̄([ϕ]) = q and µ̄([¬ϕ]) = 1 − q. We hence get for any ψ ∈ L_Prop that

µ̄([ψ]) = µ([ψ∧ϕ])·q/µ([ϕ]) + µ([ψ∧¬ϕ])·(1−q)/µ([¬ϕ])  (3)

which, in the case of Bayesian conditioning (i.e. q = 1), reduces to the well-known formula µ̄([ψ]) = µ([ψ∧ϕ])/µ([ϕ]).

Conditionalization in our extended setting follows a similar idea. Note, however, that both Bayes' and Jeffrey conditioning implicitly rest on the facts that p(ϕ) + p(¬ϕ) = 1 and p(ϕ∧¬ϕ) = 0, i.e. that there are no gaps and gluts. As this fact no longer holds, conditioning will behave differently in a non-standard setting. In fact, we will show that non-standard probabilities allow for two different notions of Jeffrey updating: one where a new value for the probability of ϕ, i.e. p(ϕ), is learned, and one where a new value of the four-valued vector p̂(ϕ) is acquired. The former version of Jeffrey updating is best described on the level of non-standard probability assignments, the latter on the level of four-valued assignments. Yet, using the translation maps (tr_ns and its inverse), both versions of updating can naturally be applied to either non-standard or four-valued probability assignments.

Just as in the standard case, non-standard Bayes conditioning can be defined as an extremal case of Jeffrey updates. In fact, non-standard Bayes conditioning has been studied independently, for instance in Mares (1997). The current framework generalizes the latter's approach by also incorporating Jeffrey updating and by identifying a number of different Bayes-like updates, containing the one put forward by Mares.

In our first notion of updating, the agent's update prescribes that she set the probability of ϕ to some q ∈ [0,1]. Notably, within a non-standard setting, this does not carry any information about the value of ¬ϕ; the agent may or may not leave p(¬ϕ) unchanged in her update. In line with classic Jeffrey updating, non-standard Jeffrey updating is best illustrated semantically. For any formula ϕ ∈ L_Prop, we can dissect the state space of a probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩ into two sets: the truth set [ϕ] of ϕ and its complement Σ∖[ϕ]. Unlike in the classic case, however, Σ∖[ϕ] is not the truth set of ¬ϕ, nor of any other ψ ∈ L_Prop. Yet, we can define Jeffrey updating as in the classic case.
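The semantic recipe just described, linearly rescaling the measure on a truth set and on its complement, can be sketched numerically. The encoding below is a toy illustration of ours (measures as dictionaries from states to weights, events as sets of states), not the paper's notation.

```python
# Jeffrey conditioning: rescale the measure on the event to total mass q
# and on its complement to total mass 1 - q.
def jeffrey(mu, phi, q):
    m_phi = sum(w for s, w in mu.items() if s in phi)
    return {s: w * (q / m_phi if s in phi else (1 - q) / (1 - m_phi))
            for s, w in mu.items()}

def measure(mu, event):
    return sum(w for s, w in mu.items() if s in event)

mu = {"s1": 0.2, "s2": 0.3, "s3": 0.5}   # toy state space
phi = {"s1", "s2"}                        # the truth set [phi]
updated = jeffrey(mu, phi, 0.8)
# After the update the event carries probability 0.8; with q = 1 the rule
# reduces to Bayesian conditioning on the event.
bayes = jeffrey(mu, phi, 1.0)
```

Taking q = 1 in the sketch recovers the Bayesian special case discussed above.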
Definition 8.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model. Let q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1). Then the semantic non-standard Jeffrey update for updating the probability of ϕ to q on M is the probabilistic model M_{ϕ,q} = ⟨Σ, µ_{ϕ,q}, v⁺, v⁻⟩ determined by:

µ_{ϕ,q}({x}) = µ({x})·q/µ([ϕ])            if x ∈ [ϕ]
µ_{ϕ,q}({x}) = µ({x})·(1−q)/(1−µ([ϕ]))    else.

Fact 1.
Non-standard Jeffrey updating is successful, i.e. for any probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩, any q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1), the non-standard Jeffrey update on M updating the probability of ϕ to q satisfies µ_{ϕ,q}([ϕ]) = q.

Despite the fact that the set Σ∖[ϕ] is not definable, we can give a syntactic characterization of non-standard Jeffrey updating. The following is a non-standard equivalent of classic Jeffrey updating, cf. Formula (3).

Lemma 2. Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model. Let q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1). Then for any ψ ∈ L_Prop, the non-standard Jeffrey update M_{ϕ,q} = ⟨Σ, µ_{ϕ,q}, v⁺, v⁻⟩ of M satisfies:

µ_{ϕ,q}([ψ]) = µ([ψ∧ϕ])·q/µ([ϕ]) + (µ([ψ]) − µ([ψ∧ϕ]))·(1−q)/(1−µ([ϕ]))

Notably, after translating the previous fact into its induced non-standard probability assignments p_µ and p_{µ_{ϕ,q}}, we obtain a fully syntactic characterization of non-standard Jeffrey updating.

Definition 9.
Let p : L_Prop → R be a non-standard probability assignment, let q ∈ [0,1] and ϕ ∈ L_Prop with p(ϕ) ∈ (0,1). Then the syntactic non-standard Jeffrey update setting the probability of ϕ to q is the probability function p_{ϕ,q} : L_Prop → R defined by

p_{ϕ,q}(ψ) = p(ψ∧ϕ)·q/p(ϕ) + (p(ψ) − p(ψ∧ϕ))·(1−q)/(1−p(ϕ))

By construction, semantic and syntactic non-standard Jeffrey updating coincide in the following sense.
Fact 2.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let q ∈ [0,1] and ϕ ∈ L_Prop with p_µ(ϕ) ∈ (0,1). Then p_{µ_{ϕ,q}} = (p_µ)_{ϕ,q}.

We will hence omit the labels and only speak of non-standard Jeffrey updating. We end this section with three facts about non-standard Jeffrey updating.
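Definition 9 can be transcribed directly. The function below (names are ours) computes the updated value of ψ from the three inputs p(ψ), p(ψ∧ϕ) and p(ϕ); it also shows the syntactic face of Fact 1, since taking ψ = ϕ (so that p(ψ∧ϕ) = p(ϕ)) returns exactly q.

```python
# Syntactic non-standard Jeffrey update (Definition 9): the mass of psi
# inside [phi] is rescaled to q, the remaining mass of psi to 1 - q.
def ns_jeffrey(p_psi, p_psi_and_phi, p_phi, q):
    return (p_psi_and_phi * q / p_phi
            + (p_psi - p_psi_and_phi) * (1 - q) / (1 - p_phi))

# Success: updating phi itself always yields the prescribed value q.
new_p_phi = ns_jeffrey(0.4, 0.4, 0.4, 0.7)
# With q = 1 the rule collapses to the Bayesian formula p(psi & phi)/p(phi).
bayes_value = ns_jeffrey(0.3, 0.2, 0.4, 1.0)
```

Note that the sketch takes the relevant probabilities as plain numbers; nothing about ¬ϕ is consulted, matching the discussion above.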
Fact 3.
Assume that the non-standard probability function p : L_Prop → R is classic, i.e. satisfies the Kolmogorov axioms. Moreover, let ϕ ∈ L_Prop with p(ϕ) ∈ (0,1) and q ∈ [0,1]. Then the non-standard and the classic Jeffrey update for setting the probability of ϕ to q coincide, i.e. for all ψ ∈ L_Prop

p_{ϕ,q}(ψ) = p(ψ∧ϕ)·q/p(ϕ) + p(ψ∧¬ϕ)·(1−q)/p(¬ϕ).

From this, it follows directly that
Fact 4.
Non-standard Jeffrey updating is not commutative. That is, there is a non-standard probability function p : L_Prop → R, formulas ϕ, ψ ∈ L_Prop and q, r ∈ [0,1] with p(ϕ), p(ψ), p_{ϕ,q}(ψ), p_{ψ,r}(ϕ) ∈ (0,1) such that (p_{ϕ,q})_{ψ,r} ≠ (p_{ψ,r})_{ϕ,q}.

Non-standard Bayesian updating
Just as in the classic case, we will define non-standard Bayesian updating as a special case of non-standard Jeffrey updating where the probability of ϕ is set to 1. In this case, the formula of Definition 9 simplifies to the same formula as in the classical case. Note that this is also the first of two approaches to Bayes updating proposed by Mares (1997). The second proposal by Mares, in contrast, is not related to any version of Bayes updating presented here, as it strives to actively minimize conflict.

Definition 10. Let p : L_Prop → R be a non-standard probability function and let ϕ ∈ L_Prop with p(ϕ) > 0. Then the (positive) non-standard Bayesian update on ϕ is the function p_{ϕ,pos} given by

p_{ϕ,pos}(ψ) = p(ψ∧ϕ)/p(ϕ)    for ψ ∈ L_Prop.

Unlike in the classical setting, however, non-standard Bayesian updating does not cover all extremal cases. Setting the probability of ϕ to 0 is not the same as setting the probability of ¬ϕ to 1, hence this case needs to be treated separately.

Definition 11.
Let p : L_Prop → R be a non-standard probability function and let ϕ ∈ L_Prop with p(ϕ) < 1. Then the negative non-standard Bayesian update on ϕ is the function p_{ϕ,neg} given by

p_{ϕ,neg}(ψ) = (p(ψ) − p(ψ∧ϕ))/(1 − p(ϕ))    for ψ ∈ L_Prop.

As their classic counterpart, positive and negative non-standard Bayesian conditioning are order independent:
Lemma 3.
Let p : L_Prop → R and let ϕ, ψ ∈ L_Prop with p(ϕ), p(ψ), p_ϕ(ψ), p_ψ(ϕ) ∈ (0,1). Then (p_{ϕ,∗})_{ψ,†} = (p_{ψ,†})_{ϕ,∗} for ∗, † ∈ {pos, neg}.

Within non-standard probability, knowing the probability of ϕ does not provide any information about the probability of ¬ϕ. Hence, in learning about ϕ, two cases are to be distinguished. In the first case, the agent only receives information about ϕ, without learning anything about ¬ϕ or ϕ∧¬ϕ. In the second case, the agent learns the full probabilistic information about ϕ, that is, the probabilities of ϕ and ¬ϕ, but also the size of the corresponding gap and glut. As discussed above, this information can be encoded in a vector (b, d, u, c) ∈ R⁴ specifying the new pure belief (i.e. belief without conflict), pure disbelief (belief in ¬ϕ without conflict), uncertainty and conflict about ϕ.

Again, the notion of four-valued Jeffrey updating is best illustrated semantically. As shown in Figure 3, for any ϕ ∈ L_Prop, the sets of pure belief, pure disbelief, uncertainty and conflict about ϕ jointly form a partition ([ϕ]∖[ϕ∧¬ϕ], [¬ϕ]∖[ϕ∧¬ϕ], Σ∖[ϕ∨¬ϕ], [ϕ∧¬ϕ]) of a probabilistic model M. Hence, a similar idea as in classic Jeffrey updating can be applied, linearly expanding or shrinking the measure on each of these four cells to its appropriate size. Notably, linear expansion (to a larger size) is only well defined if the cell to be expanded has a strictly positive measure. We capture this with the notion of admissibility of a vector (b, d, u, c):

Definition 12.
Let M = ⟨Σ, µ, v⁺, v⁻⟩, let ϕ ∈ L_Prop and denote p̂_µ(ϕ) by (b_ϕ, d_ϕ, u_ϕ, c_ϕ). We call a vector (b, d, u, c) ∈ [0,1]⁴ with b + d + u + c = 1 admissible for ϕ if it satisfies that b = 0 whenever b_ϕ = 0, d = 0 whenever d_ϕ = 0, u = 0 whenever u_ϕ = 0, and c = 0 whenever c_ϕ = 0.

[Figure 3: Four-valued conditioning]
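Before moving to the four-valued case, the order independence of Lemma 3 can be checked on a small semantic model. The one-atom encoding below is our own illustration (one state per cell of the partition just described); semantically, the positive and negative updates of Definitions 10 and 11 amount to conditioning the measure onto, or away from, a truth set.

```python
from fractions import Fraction

# One atom p, one state per cell: "t" p is true only, "f" false only,
# "b" both true and false (glut), "n" neither (gap).
mu = {"t": Fraction(3, 10), "f": Fraction(2, 10),
      "b": Fraction(4, 10), "n": Fraction(1, 10)}
P = {"t", "b"}        # truth set [p]
NOT_P = {"f", "b"}    # truth set [not p]

def prob(mu, event):
    return sum(mu[s] for s in event)

def pos_update(mu, event):
    """Positive non-standard Bayes (cf. Definition 10): keep only the
    event's mass and rescale it to total 1."""
    z = prob(mu, event)
    return {s: (w / z if s in event else Fraction(0)) for s, w in mu.items()}

def neg_update(mu, event):
    """Negative non-standard Bayes (cf. Definition 11): discard the
    event's mass and rescale the rest to total 1."""
    z = 1 - prob(mu, event)
    return {s: (Fraction(0) if s in event else w / z) for s, w in mu.items()}

# Order independence in the spirit of Lemma 3: a positive update on [p]
# and a negative update on [not p] commute on this model.
lhs = neg_update(pos_update(mu, P), NOT_P)
rhs = pos_update(neg_update(mu, NOT_P), P)
```

Exact rational arithmetic (`Fraction`) makes the equality of the two orders an exact check rather than a floating-point comparison.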
Definition 13.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then the four-valued Jeffrey update on ϕ to (b, d, u, c) is the model M_{ϕ,(b,d,u,c)} = ⟨Σ, µ_{ϕ,(b,d,u,c)}, v⁺, v⁻⟩ with:

µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·b/(µ([ϕ]) − µ([ϕ∧¬ϕ]))     if x ∈ [ϕ]∖[ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·d/(µ([¬ϕ]) − µ([ϕ∧¬ϕ]))    if x ∈ [¬ϕ]∖[ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·c/µ([ϕ∧¬ϕ])                if x ∈ [ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·u/(1 − µ([ϕ∨¬ϕ]))          else

Fact 5.
Four-valued Jeffrey updating is successful, i.e. for any probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩, any ϕ ∈ L_Prop and any (b, d, u, c) ∈ [0,1]⁴ that is admissible for ϕ, the four-valued Jeffrey update on M setting the probability of ϕ to (b, d, u, c) satisfies p̂_{µ_{ϕ,(b,d,u,c)}}(ϕ) = (b, d, u, c).

Just as in the case of non-standard Jeffrey conditioning, we obtain a purely syntactic characterization of four-valued Jeffrey updating. Unfortunately, the drop in elegance with respect to standard Jeffrey updating is significant.
Lemma 4.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then the four-valued Jeffrey update setting the probability of ϕ to (b, d, u, c) satisfies for any ψ ∈ L_Prop that

b̄_ψ = (b/b_ϕ)·b_{ϕ,ψ} + (d/d_ϕ)·b_{¬ϕ,ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

d̄_ψ = (b/b_ϕ)·b_{ϕ,¬ψ} + (d/d_ϕ)·b_{¬ϕ,¬ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

ū_ψ = (b/b_ϕ)·(b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (d/d_ϕ)·(d_ϕ − b_{¬ϕ,ψ} − b_{¬ϕ,¬ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})

c̄_ψ = (b/b_ϕ)·(c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ}) + (d/d_ϕ)·(c_{¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·c_{ϕ,¬ϕ,ψ,¬ψ}

where (b_ψ, d_ψ, u_ψ, c_ψ) and (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) denote the four-valued probability vector of ψ before and after the update. In the above equations, a list of formulas in a subscript abbreviates their conjunction, e.g. b_{ϕ,ψ} stands for b_{ϕ∧ψ}. For ease of notation, the formulas use the convention that 0/0 = 0.

Proof. Consider the propositions ϕ and ψ as well as the labeling of areas in the top row of Figure 4. By definition of updating, the mass of areas 1-4 needs to be multiplied by b/b_ϕ, that of areas 5-8 by d/d_ϕ, the weight of areas 9-12 by u/u_ϕ and that of areas 13-16 by c/c_ϕ. Moreover, the agent's pure belief in ψ, i.e. b̄_ψ, is the joint mass of areas 1, 5, 9 and 13, her disbelief in ψ the joint mass of areas 2, 6, 10 and 14, her uncertainty the joint weight of areas 3, 7, 11 and 15, and her conflict the sum of areas 4, 8, 12 and 16.

To check correctness of the above equations, it then suffices to verify that the formulas pick out the respective fields, i.e. that b_{ϕ,ψ} is the size of field 1, b_{¬ϕ,ψ} the size of field 5, d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ} the size of field 9, and so on. That this is the case follows from the pictures in Figure 4, showing the belief and disbelief sets for certain composites of ϕ and ψ. Again, the latter set of equations can be read purely syntactically.
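Semantically, the update behind these equations is just the cell-wise rescaling of Definition 13. The sketch below (our own toy encoding, tagging each state of a one-atom model with its cell in the partition of Figure 3) also verifies the success condition of Fact 5 on this example.

```python
from fractions import Fraction

# Cells: "pb" pure belief, "pd" pure disbelief, "un" uncertainty, "cf" conflict.
cell = {"s1": "pb", "s2": "pb", "s3": "pd", "s4": "un", "s5": "cf"}
mu = {"s1": Fraction(1, 10), "s2": Fraction(2, 10), "s3": Fraction(3, 10),
      "s4": Fraction(1, 10), "s5": Fraction(3, 10)}

def four_valued_jeffrey(mu, cell, target):
    """Cell-wise rescaling (cf. Definition 13): every state is scaled by
    new cell mass / old cell mass, so ratios within a cell are preserved."""
    names = ("pb", "pd", "un", "cf")
    old = {k: sum(w for s, w in mu.items() if cell[s] == k) for k in names}
    new = dict(zip(names, target))
    return {s: w * new[cell[s]] / old[cell[s]] for s, w in mu.items()}

# Prescribe a new vector (b, d, u, c); every component of the prior is
# positive here, so any vector summing to 1 is admissible.
target = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))
updated = four_valued_jeffrey(mu, cell, target)
# Success: the updated cell masses are exactly (b, d, u, c).
```

The design choice mirrors the text: only the total mass of each cell is prescribed, while the relative weights inside a cell are left untouched.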
Thus, we get a syntactic counterpart to semantic four-valued Jeffrey updates.

Definition 14.
Let p̂ : L_Prop → R⁴ be a four-valued probability function and let ϕ ∈ L_Prop. Moreover, let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then (syntactic) four-valued Jeffrey updating with the vector (b, d, u, c) yields a four-valued probability function p̂_{ϕ,(b,d,u,c)} defined by p̂_{ϕ,(b,d,u,c)}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with:

b̄_ψ = (b/b_ϕ)·b_{ϕ,ψ} + (d/d_ϕ)·b_{¬ϕ,ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

d̄_ψ = (b/b_ϕ)·b_{ϕ,¬ψ} + (d/d_ϕ)·b_{¬ϕ,¬ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

ū_ψ = (b/b_ϕ)·(b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (d/d_ϕ)·(d_ϕ − b_{¬ϕ,ψ} − b_{¬ϕ,¬ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})

c̄_ψ = (b/b_ϕ)·(c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ}) + (d/d_ϕ)·(c_{¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·c_{ϕ,¬ϕ,ψ,¬ψ}

[Figure 4: Belief and disbelief sets of ϕ (top left), ψ (top center) and various combinations thereof. Belief sets are dotted, disbelief sets shaded. The diagrams fall into 16 sections that are labelled as shown on the top right.]

By construction, semantic and syntactic four-valued Jeffrey updating coincide in the following sense.
Fact 6.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then p̂_{µ_{ϕ,(b,d,u,c)}} = (p̂_µ)_{ϕ,(b,d,u,c)}.

We will hence omit the distinction between semantic and syntactic and only speak of four-valued Jeffrey updating. We end this section with three facts about this updating.
Fact 7.
Assume that the four-valued probability function p̂ : L_Prop → R⁴ is classic, i.e. u_ψ = c_ψ = 0 for all ψ ∈ L_Prop. Moreover, let ϕ ∈ L_Prop and let (b, d, 0, 0) ∈ [0,1]⁴ be admissible for ϕ, i.e. b = 0 if p(ϕ) = 0 and d = 0 if p(¬ϕ) = 0. Then the four-valued and the classic Jeffrey update setting the probability of ϕ to b coincide, i.e. for all ψ ∈ L_Prop

p̂_{ϕ,(b,d,0,0)}(ψ) = ( b_{ψ∧ϕ}·b/b_ϕ + b_{ψ∧¬ϕ}·d/d_ϕ , 1 − b_{ψ∧ϕ}·b/b_ϕ − b_{ψ∧¬ϕ}·d/d_ϕ , 0 , 0 )

From this, it follows directly that
Fact 8.
Four-valued Jeffrey updating is not commutative. That is, there is a four-valued probability function p̂ : L_Prop → R⁴, some ϕ, ψ ∈ L_Prop and (b, d, u, c), (b′, d′, u′, c′) ∈ [0,1]⁴ such that (b, d, u, c) is admissible for ϕ in p̂ and in p̂_{ψ,(b′,d′,u′,c′)}, while (b′, d′, u′, c′) is admissible for ψ in both p̂ and p̂_{ϕ,(b,d,u,c)}, such that

(p̂_{ϕ,(b,d,u,c)})_{ψ,(b′,d′,u′,c′)} ≠ (p̂_{ψ,(b′,d′,u′,c′)})_{ϕ,(b,d,u,c)}

Four-valued Bayesian updating

Just as in the classical case, we can define four-valued Bayesian updating as a special instance of Jeffrey updating where the information acquired is extremal. Here, we focus on three cases. In the first, the agent learns the vector (1,0,0,0), i.e. she acquires full pure belief in ϕ. In the second and third case, the agent learns the vectors (0,0,1,0) or (0,0,0,1) respectively, acquiring full uncertainty or full conflict about ϕ. The remaining case, learning (0,1,0,0), follows from these, as it corresponds to updating on information (1,0,0,0) about ¬ϕ. In each of our three cases, the above definition of four-valued Jeffrey updating simplifies to:

Definition 15. (i) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that b_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then positive four-valued Bayesian updating on ϕ yields the function p̂_{ϕ,+} defined by p̂_{ϕ,+}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = b_{ϕ,ψ}/b_ϕ
d̄_ψ = b_{ϕ,¬ψ}/b_ϕ
ū_ψ = (b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})/b_ϕ
c̄_ψ = (c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ})/b_ϕ

(ii) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that u_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then uncertainty Bayesian updating about ϕ is defined as p̂_{ϕ,u}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = (d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
d̄_ψ = (d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
ū_ψ = (1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
c̄_ψ = (c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ

(iii) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that c_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then conflict Bayesian updating about ϕ is defined as p̂_{ϕ,c}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = (c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
d̄_ψ = (c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
ū_ψ = (c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
c̄_ψ = c_{ϕ,¬ϕ,ψ,¬ψ}/c_ϕ

Just as its classic counterpart, four-valued Bayesian conditioning in all three flavors is order independent:
Lemma 5.
Let p̂ : L_Prop → R⁴ and let ϕ, ψ ∈ L_Prop such that p̂_{ϕ,a}, p̂_{ψ,b}, (p̂_{ϕ,a})_{ψ,b} and (p̂_{ψ,b})_{ϕ,a} are all defined. Then (p̂_{ϕ,a})_{ψ,b} = (p̂_{ψ,b})_{ϕ,a} for a, b ∈ {+, u, c}.

Using the translation functions between non-standard and four-valued assignments, both notions of Jeffrey conditioning, non-standard and four-valued, work on both types of probability functions defined. However, the notions of updating do not correspond to each other. While non-standard Jeffrey conditioning applies to situations where only the probability of ϕ is set, without any mention of the probabilities of ¬ϕ or ϕ∧¬ϕ, four-valued Jeffrey conditioning covers cases where new probabilities of ϕ, ¬ϕ and the corresponding gap and glut are all prescribed simultaneously. Hence, even after appropriate transformations of their domains, the two types of Jeffrey updates are not interdefinable. This, however, changes if we move to non-standard and four-valued Bayesian updating. Each of the three types of four-valued Bayesian updating is equivalent to a composition of two steps of non-standard Bayesian updating. Moreover, the order of these two steps does not matter.

Lemma 6.
Let p̂ : L_Prop → R⁴ be a four-valued probability assignment and let ϕ ∈ L_Prop.

(i) if b_ϕ > 0, then tr_ns(p̂_{ϕ,+}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,pos}
(ii) if u_ϕ > 0, then tr_ns(p̂_{ϕ,u}) = (tr_ns(p̂)_{ϕ,neg})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,neg}
(iii) if c_ϕ > 0, then tr_ns(p̂_{ϕ,c}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,pos} = (tr_ns(p̂)_{¬ϕ,pos})_{ϕ,pos}

Proof. (i) By Theorem 5, there is a unique canonical model M = ⟨P(Lit), µ, v⁺, v⁻⟩ such that p̂_µ = p̂. By Facts 2 and 6, it hence suffices to show the claim for semantic four-valued Jeffrey updating on M. Note that the result of positive Bayesian updating, i.e. the updated measure µ_{ϕ,(1,0,0,0)} of M_{ϕ,(1,0,0,0)} = ⟨P(Lit), µ_{ϕ,(1,0,0,0)}, v⁺, v⁻⟩, is uniquely determined by the conditions

(1) µ_{ϕ,(1,0,0,0)}(x) = 0 for x ∉ [ϕ]∖[ϕ∧¬ϕ]
(2) µ_{ϕ,(1,0,0,0)}(x)/µ_{ϕ,(1,0,0,0)}(y) = µ(x)/µ(y) whenever x, y ∈ [ϕ]∖[ϕ∧¬ϕ] with µ(y) > 0.

µ_{ϕ,pos} and µ_{¬ϕ,neg} both satisfy (2). Moreover, µ_{ϕ,pos}(x) = 0 for x ∉ [ϕ] and µ_{¬ϕ,neg}(x) = 0 for x ∈ [¬ϕ]. Thus both (µ_{ϕ,pos})_{¬ϕ,neg} and (µ_{¬ϕ,neg})_{ϕ,pos} also satisfy (1). Hence, both (µ_{ϕ,pos})_{¬ϕ,neg} and (µ_{¬ϕ,neg})_{ϕ,pos} satisfy conditions (1) and (2) and, hence, are identical to µ_{ϕ,(1,0,0,0)}. This implies that tr_ns(p̂_{ϕ,+}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,pos}. The proofs of (ii) and (iii) follow similarly.

In the previous sections we investigated updating a probability function with a generalized Jeffrey rule by learning either only a new value for the belief in ϕ (Section 7.1) or the entire four-valued probability vector assigned to ϕ (Section 7.2). However, there may be other contexts where the agent acquires partial information about the (four-valued) probability of ϕ, e.g. only a new value for pure belief or pure disbelief in ϕ. The idea for conditioning on partial information proceeds along the same lines as for complete information, i.e. by a modified version of Jeffrey conditioning.
The only difference is that the partiality of the information about ϕ does not permit us to work with the full partition induced by ϕ on a model M, i.e. the partition {[ϕ]∖[ϕ∧¬ϕ], [¬ϕ]∖[ϕ∧¬ϕ], Σ∖[ϕ∨¬ϕ], [ϕ∧¬ϕ]}, cf. Figure 3, but with a coarsening thereof.

By obtaining partial information we mean that the agent learns the values of a partial assignment a : {b, d, u, c} ⇀ [0,1], i.e. an assignment prescribing new values for some of the agent's pure belief, pure disbelief, uncertainty and conflict, but not necessarily for all. Let us denote the domain of a, i.e. those x ∈ {b, d, u, c} for which a(x) is defined, by dom(a). For simplicity, we assume that ∅ ⊂ dom(a) ⊂ {b, d, u, c} with both inclusions strict. Following the same intuitions as in the four-valued case, we can define conditioning on the partial information a by setting the new pure belief, disbelief, uncertainty and conflict in ϕ to be a(b), a(d), a(u) and a(c) respectively whenever this is defined, and afterwards rescaling the probabilistic mass on the remaining area appropriately. Formally, to ensure that the corresponding operation is well defined, we need to assume that Σ_{y∈dom(a)} a(y) ≤ 1. Denoting the prior four-valued probability vector of ϕ by (b_ϕ, d_ϕ, u_ϕ, c_ϕ), the Jeffrey updating sketched above will lead to the posterior four-valued probability vector (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ) with:

x̄_ϕ = a(x)                                                       if x ∈ dom(a)
x̄_ϕ = x_ϕ · (1 − Σ_{y∈dom(a)} a(y))/(1 − Σ_{y∈dom(a)} y_ϕ)       else

for x ∈ {b, d, u, c}. With this, we can formally define partial Jeffrey updating.

Definition 16. Let a : {b, d, u, c} ⇀ [0,1] be a partial assignment such that Σ_{y∈dom(a)} a(y) ≤ 1. Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a model, let ϕ ∈ L_Prop and let the vector (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ) defined above be admissible for ϕ. Then the four-valued Jeffrey update of ϕ on the partial information a is defined as the four-valued Jeffrey update on ϕ to (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ).

Assume two agents inform you about their credences in ϕ. You take both agents to be similarly competent and equally informed. Yet, they equip you with different assessments of ϕ. How, then, should you combine these judgments towards forming your own belief about ϕ? Within standard probability theory, your options are fairly limited. You may, for instance, decide to follow one of the agents, or build a weighted average between the two. A broad number of approaches in the literature on peer disagreement, for instance, promotes splitting the difference equally; see for instance Elga (2007); Christensen (2007) on conciliationism, but also Kelly (2010) for an opposing opinion.

Within the non-standard probabilities studied here, further options open up. First, note that within classic probability theory, learning about an agent's credence in ψ also informs us about her degree of belief in ¬ψ. This does not hold true within the current non-standard setting. Hence, let us assume for the current analysis that agents inform us about both their positive and negative attitude towards ϕ, that is, about p(ϕ) and p(¬ϕ), or even about their four-valued vector p̂(ϕ). Of course, we may follow the previous strategies and form weighted averages between the agents' assessments of ϕ. If needed, this policy could be specified to also take a weighted average of the agents' conflict and uncertainty and, more generally, their remaining belief set.

Definition 17.
Let k ∈ [0,1].

(i) Assume agents A and E provide their non-standard assessments of ϕ, i.e. p_A(ϕ), p_E(ϕ), p_A(¬ϕ) and p_E(¬ϕ). Then their k-weighted non-standard aggregate belief p^k_{A,E} is defined by p^k_{A,E}(ϕ) = k·p_A(ϕ) + (1−k)·p_E(ϕ) and p^k_{A,E}(¬ϕ) = k·p_A(¬ϕ) + (1−k)·p_E(¬ϕ).

(ii) For agents A and E's four-valued probability assessments (b,d,u,c)_A and (b,d,u,c)_E for ϕ, i.e. p̂_A(ϕ) and p̂_E(ϕ), their k-weighted four-valued aggregate belief p̂^k_{A,E} is:

p̂^k_{A,E}(ϕ) = k·p̂_A(ϕ) + (1−k)·p̂_E(ϕ).

Lemma 7. Weighted averaging can be applied to an entire belief base simultaneously. That is, when agents A and E both provide their full subjective non-standard probability functions p_A, p_E : L_Prop → R (resp. p̂_A, p̂_E : L_Prop → R⁴), a weighted average belief p^k_{A,E} : L_Prop → R can be defined by k·p_A + (1−k)·p_E. Likewise, p̂^k_{A,E} : L_Prop → R⁴ can be defined by k·p̂_A + (1−k)·p̂_E. Moreover, these policies commute with tr_ns, that is tr_ns(p̂^k_{A,E}) = p^k_{A,E} and tr_ns(p^k_{A,E}) = p̂^k_{A,E}.

              p_{A,E}(ϕ)                       p_{A,E}(¬ϕ)
k-weighted    k·p_A(ϕ) + (1−k)·p_E(ϕ)         k·p_A(¬ϕ) + (1−k)·p_E(¬ϕ)
credulous     max(p_A(ϕ), p_E(ϕ))             max(p_A(¬ϕ), p_E(¬ϕ))
cautious      min(p_A(ϕ), p_E(ϕ))             min(p_A(¬ϕ), p_E(¬ϕ))
optimist      max(p_A(ϕ), p_E(ϕ))             min(p_A(¬ϕ), p_E(¬ϕ))
pessimist     min(p_A(ϕ), p_E(ϕ))             max(p_A(¬ϕ), p_E(¬ϕ))

Table 1: Different rules for aggregating agents A and E's non-standard beliefs in ϕ and ¬ϕ, i.e. p_A(ϕ), p_E(ϕ), p_A(¬ϕ) and p_E(¬ϕ).

Non-standard beliefs, however, allow for further aggregation policies that do not have classic counterparts.
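The rules of Table 1 are one-line operations on the pairs (p(ϕ), p(¬ϕ)); the sketch below (function names are ours) transcribes them and illustrates on a pair of classic but disagreeing inputs how credulous aggregation produces a glut and cautious aggregation a gap.

```python
# Aggregation rules of Table 1, acting on pairs (p(phi), p(not phi)).
def k_weighted(a, e, k):
    return tuple(k * x + (1 - k) * y for x, y in zip(a, e))

def credulous(a, e):
    return (max(a[0], e[0]), max(a[1], e[1]))

def cautious(a, e):
    return (min(a[0], e[0]), min(a[1], e[1]))

def optimist(a, e):
    return (max(a[0], e[0]), min(a[1], e[1]))

def pessimist(a, e):
    return (min(a[0], e[0]), max(a[1], e[1]))

# Two classic inputs (each pair sums to 1) that disagree about phi:
A, E = (0.8, 0.2), (0.4, 0.6)
# credulous(A, E) sums to more than 1 (a glut), cautious(A, E) to less
# than 1 (a gap); optimist and pessimist keep the inputs' classicality.
```

This is only an illustration of the table's entries, not of the consistency question for whole belief bases discussed next.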
Credulous agents, for instance, could opt for the maximal values of their input in terms of belief and disbelief simultaneously. That is, they could set their updated belief and disbelief in ϕ to be max(p_A(ϕ), p_E(ϕ)) and max(p_A(¬ϕ), p_E(¬ϕ)) respectively. Likewise, cautious agents may rather choose to believe and disbelieve ϕ only to an amount supported by all input information. Such agents would set their belief and disbelief in ϕ to min(p_A(ϕ), p_E(ϕ)) and min(p_A(¬ϕ), p_E(¬ϕ)) respectively.

In special situations, further policies are conceivable. When testing the safety of a new drug, for example, agents may be extremely wary of false positives while being much less concerned with false negatives. Such an agent might decide to set her new belief in ϕ to min(p_A(ϕ), p_E(ϕ)) while adopting max(p_A(¬ϕ), p_E(¬ϕ)) as her new disbelief in ϕ. Likewise, the combination of max(p_A(ϕ), p_E(ϕ)) with min(p_A(¬ϕ), p_E(¬ϕ)) is conceivable. In some sense, the latter two policies are aggregation functions that minimize type I and type II errors. For lack of a better name, we call these the pessimist and optimist updating rules respectively. See Table 1 for an overview.

Unlike weighted average, none of these four policies can be applied to an entire belief set simultaneously.

Fact 9.
Let p_A and p_E be such that p_A(ϕ) = 1 and p_A(¬ϕ) = p_A(ϕ∧¬ϕ) = 0, while p_E(¬ϕ) = 1 and p_E(ϕ) = p_E(ϕ∧¬ϕ) = 0. Then p_A and p_E are consistent, but the function p_{A,E} defined by p_{A,E}(∗) = max(p_A(∗), p_E(∗)) for ∗ ∈ {ϕ, ¬ϕ, ϕ∧¬ϕ} is not.

Proof. To see that p_A and p_E are consistent, consider a non-standard model with three worlds x, y, z and v⁺(p) = {x, y}, v⁻(p) = {x, z}. The measure µ_A putting all weight on y is such that p_{µ_A}(∗) = p_A(∗) for ∗ ∈ {ϕ, ¬ϕ, ϕ∧¬ϕ}, showing p_A consistent by Lemma 1. Likewise, the measure µ_E putting all weight on z shows p_E consistent. For the inconsistency of p_{A,E}, finally, note that p_{A,E}(ϕ∧¬ϕ) = 0 and p_{A,E}(ϕ) = p_{A,E}(¬ϕ) = 1. Plugging these three values into (A3) yields 0 + p_{A,E}(ϕ∨¬ϕ) = 2, contradicting (A1).

Likewise, the missing conditions for cautious updates cannot be retrieved by extending the policy of taking minima to the agents' assessments of ϕ∨¬ϕ, as can be seen from the previous Fact. In particular, there is no counterpart to Lemma 7 for credulous or cautious update. Neither can be performed for all ϕ ∈ L_Prop simultaneously.

Before proceeding to four-valued updating, we compare the above policies to operations in non-probabilistic Belnap-Dunn logic. For this, recall the classic Belnap-Dunn bi-lattice of truth values BD.

[Diagram: the Belnap-Dunn bi-lattice BD of the four values {T}, {F}, {T,F} and {}, ordered along a truth axis and an information axis.]

This bi-lattice can be interpreted in two directions, relating to truth values and the available information. We denote meet and join of the truth lattice by ∧ and ∨, while meet and join of the information lattice are ⊓ and ⊔. Note that we can identify an assignment of BD-values to some formula ϕ with a non-standard probability assignment of p(ϕ) and p(¬ϕ) into {0,1}. More specifically, assigning {T,F} to some ϕ corresponds to p(ϕ) = p(¬ϕ) = 1, while assigning {T}, resp. {F}, to ϕ corresponds to p(ϕ) = 1, p(¬ϕ) = 0, resp. p(ϕ) = 0, p(¬ϕ) = 1. Assigning {}, finally, corresponds to p(ϕ) = p(¬ϕ) = 0. For a probability assignment with p(ϕ), p(¬ϕ) ∈ {0,1}, we denote the corresponding BD value by t_p(ϕ). Applying this correspondence, we obtain the following characterization of the four updating policies introduced above:

Lemma 8.
Assume that, when asked about their credences in ϕ, agents A and E provide extremal assignments, i.e. p_A(ϕ), p_A(¬ϕ), p_E(ϕ), p_E(¬ϕ) ∈ {0,1}. Then the four rules yield beliefs in ϕ and ¬ϕ that are equal to:

Credulous update     t_{p_A}(ϕ) ⊔ t_{p_E}(ϕ)
Cautious update      t_{p_A}(ϕ) ⊓ t_{p_E}(ϕ)
Optimistic update    t_{p_A}(ϕ) ∨ t_{p_E}(ϕ)
Pessimistic update   t_{p_A}(ϕ) ∧ t_{p_E}(ϕ)

Finally, we consider the special case where both agents input classic probability values, i.e. values such that p(ϕ) + p(¬ϕ) = 1.

Fact 10. When p_A and p_E are classic, i.e. p_A(ϕ) + p_A(¬ϕ) = p_E(ϕ) + p_E(¬ϕ) = 1, then the same holds for the aggregated belief when aggregation follows weighted averaging, optimistic or pessimistic updates. That is, these three rules preserve classicality. This does not hold for credulous and cautious updating. The latter two rules turn classic input beliefs for agents A and E into non-classic aggregate values as soon as A and E disagree about p(ϕ).

So far, we have assumed aggregation to operate on non-standard probability assignments. Within the above framework, agents provide their subjective non-standard beliefs in both ϕ and ¬ϕ, which the various aggregative mechanisms described above then merge into aggregate belief values for ϕ and ¬ϕ. But of course, our agents might also provide their subjective four-valued probabilities p̂_A(ϕ) = (b_ϕ^A, d_ϕ^A, u_ϕ^A, c_ϕ^A) and p̂_E(ϕ) = (b_ϕ^E, d_ϕ^E, u_ϕ^E, c_ϕ^E) instead. Naturally, we could then hope to obtain an aggregate four-valued probability p̂_{A,E}(ϕ) = (b_ϕ^{A,E}, d_ϕ^{A,E}, u_ϕ^{A,E}, c_ϕ^{A,E}). Note that, by the map tr_ns, the non-standard probabilities p(ϕ) and p(¬ϕ) can be calculated from the four-valued probability p̂(ϕ). Hence, if p̂_{A,E}(ϕ) is defined, a corresponding two-valued aggregation mechanism for p_{A,E}(ϕ) and p_{A,E}(¬ϕ) follows immediately. However, the opposite does not hold.
p_{A,E}(ϕ) and p_{A,E}(¬ϕ) do not fully determine p̂_{A,E}(ϕ), and hence the various policies defined in the last section do not readily translate into four-valued aggregation procedures. In fact, when employing the map tr_ns, the three values p(ϕ), p(¬ϕ) and p(ϕ∧¬ϕ) are required to determine p̂(ϕ). In the case of weighted averaging, this is not a problem. By Lemma 7, setting

p^k_{A,E}(ϕ∧¬ϕ) = k·p_A(ϕ∧¬ϕ) + (1−k)·p_E(ϕ∧¬ϕ)

yields a consistent set of requirements, and the corresponding four-valued aggregation rule is exactly p̂^k_{A,E}(ϕ) = k·p̂_A(ϕ) + (1−k)·p̂_E(ϕ).

However, the situation is different in the case of credulous or cautious updating. As shown in Fact 9, requiring that p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)), p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)) and p_{A,E}(ϕ∧¬ϕ) = max(p_A(ϕ∧¬ϕ), p_E(ϕ∧¬ϕ)) may yield an inconsistent set of requirements. Hence, other choices are needed.

The vector p̂_{A,E}(ϕ) is determined by four choices. With two of them given by p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)) and p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)), and a third by axiom (D2), one last condition is missing. In the case of credulous update, we would arguably expect that c_ϕ^{A,E} ≥ max(c_ϕ^A, c_ϕ^E): if an agent opts to be credulous about both ϕ and ¬ϕ, she could not expect her conflict to fall below any of the input conflicts. Within this restriction, the definition of credulous update below assumes c_ϕ^{A,E} to be as close to max(c_ϕ^A, c_ϕ^E) as possible while maintaining consistency.

Likewise, in the case of cautious update, we would arguably expect overall uncertainty to grow, or at least not to shrink, through aggregation. That is, we would expect that u_ϕ^{A,E} ≥ max(u_ϕ^A, u_ϕ^E). Again, we will demand that u_ϕ^{A,E} is the maximal possible consistent value with this property.

Definition 18. Assume agents A and E provide four-valued probabilities p̂_A(ϕ) = (b_ϕ^A, d_ϕ^A, u_ϕ^A, c_ϕ^A) and p̂_E(ϕ) = (b_ϕ^E, d_ϕ^E, u_ϕ^E, c_ϕ^E). Then the credulously aggregated four-valued probability p̂_{A,E}(ϕ) = (b_ϕ^{A,E}, d_ϕ^{A,E}, u_ϕ^{A,E}, c_ϕ^{A,E}) is given by the following four conditions:

b_ϕ^{A,E} + c_ϕ^{A,E} = max(b_ϕ^A + c_ϕ^A, b_ϕ^E + c_ϕ^E)
d_ϕ^{A,E} + c_ϕ^{A,E} = max(d_ϕ^A + c_ϕ^A, d_ϕ^E + c_ϕ^E)
b_ϕ^{A,E} + d_ϕ^{A,E} + u_ϕ^{A,E} + c_ϕ^{A,E} = 1
c_ϕ^{A,E} = max( c_ϕ^E, c_ϕ^A, (b_ϕ^{A,E} + c_ϕ^{A,E}) + (d_ϕ^{A,E} + c_ϕ^{A,E}) − 1 )

By tr_ns, the first two of these equations correspond to the two conditions of credulous non-standard updates, i.e. p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)) and p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)). The third equation is axiom (D2). The last equation, finally, expresses that c_ϕ^{A,E} is the minimal consistent choice such that c_ϕ^{A,E} ≥ max(c_ϕ^A, c_ϕ^E). To see this, note that by (D2) we have

b_ϕ^{A,E} + d_ϕ^{A,E} + c_ϕ^{A,E} ≤ 1 iff (b_ϕ^{A,E} + c_ϕ^{A,E}) + (d_ϕ^{A,E} + c_ϕ^{A,E}) − 1 ≤ c_ϕ^{A,E}.

Likewise we can define a cautious aggregation of four-valued probabilities:
Definition 19. For $\hat{p}_A$ and $\hat{p}_E$ as above, the cautiously aggregated four-valued probability $\hat{p}_{\{A,E\}}(\varphi) = (b^{\{A,E\}}_\varphi, d^{\{A,E\}}_\varphi, u^{\{A,E\}}_\varphi, c^{\{A,E\}}_\varphi)$ is given by the following four equations:
$$b^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = \min(b^A_\varphi + c^A_\varphi,\ b^E_\varphi + c^E_\varphi)$$
$$d^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = \min(d^A_\varphi + c^A_\varphi,\ d^E_\varphi + c^E_\varphi)$$
$$b^{\{A,E\}}_\varphi + d^{\{A,E\}}_\varphi + u^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = 1$$
$$u^{\{A,E\}}_\varphi = \max\big(u^E_\varphi,\ u^A_\varphi,\ 1 - (b^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi) - (d^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi)\big)$$

Credulous and cautious aggregation as defined here cohere with their definitions for non-standard probabilities.
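To make these definitions concrete, the following sketch implements both aggregation rules, together with the translation to non-standard probabilities. The function names are illustrative, not from the paper; the closed forms follow from substituting the (D2) constraint into the remaining conditions of Definitions 18 and 19.

```python
def tr_ns(p4):
    """Translate a four-valued probability (b, d, u, c) into the
    non-standard pair (p(phi), p(not-phi)); the conflict c is
    p(phi and not-phi)."""
    b, d, u, c = p4
    return (b + c, d + c)

def credulous(pA, pE):
    """Credulous aggregation of two four-valued probabilities (Definition 18)."""
    bA, dA, uA, cA = pA
    bE, dE, uE, cE = pE
    bc = max(bA + cA, bE + cE)       # b + c of the aggregate
    dc = max(dA + cA, dE + cE)       # d + c of the aggregate
    c = max(cE, cA, bc + dc - 1)     # smallest consistent conflict >= both inputs
    b, d = bc - c, dc - c
    return (b, d, 1 - b - d - c, c)  # u is fixed by (D2)

def cautious(pA, pE):
    """Cautious aggregation of two four-valued probabilities (Definition 19)."""
    bA, dA, uA, cA = pA
    bE, dE, uE, cE = pE
    bc = min(bA + cA, bE + cE)
    dc = min(dA + cA, dE + cE)
    u = max(uE, uA, 1 - bc - dc)     # largest consistent uncertainty >= both inputs
    c = bc + dc + u - 1              # forced by (D2)
    return (bc - c, dc - c, u, c)
```

For instance, aggregating $(0.3, 0.4, 0.2, 0.1)$ and $(0.5, 0.2, 0.2, 0.1)$ credulously gives (up to rounding) $(0.5, 0.4, 0, 0.1)$, while cautious aggregation gives $(0.4, 0.3, 0.3, 0)$; applying `tr_ns` to either output returns the componentwise max (resp. min) of the inputs' non-standard values, matching the coherence with non-standard aggregation noted above.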
Lemma 9. Assume that agents $A$ and $E$ provide four-valued vectors $\hat{p}_A$ and $\hat{p}_E$ respectively. Then the following diagram commutes, where the application of $tr_{ns}$ makes use of the fact that $p(\varphi)$ and $p(\neg\varphi)$ can be calculated from $\hat{p}(\varphi)$:
$$\begin{array}{ccc}
\big(\hat{p}_A(\varphi),\ \hat{p}_E(\varphi)\big) & \xrightarrow{\ \text{credulous}\ } & \hat{p}_{\{A,E\}}(\varphi)\\[2pt]
\big\downarrow\, tr_{ns} & & \big\downarrow\, tr_{ns}\\[2pt]
\big(p_A(\varphi), p_A(\neg\varphi)\big),\ \big(p_E(\varphi), p_E(\neg\varphi)\big) & \xrightarrow{\ \text{credulous}\ } & \big(p_{\{A,E\}}(\varphi), p_{\{A,E\}}(\neg\varphi)\big)
\end{array}$$
and likewise with cautious aggregation in place of credulous aggregation. That is, aggregating two four-valued probabilities and then translating by $tr_{ns}$ yields the same result as first translating and then aggregating the corresponding non-standard probabilities.

The algebraic structure of credulous and cautious aggregation.

Definition 20.
For an aggregation strategy $S$, we call $\varphi$ a neutral element if for all $\psi$ we have $S(\varphi, \psi) = S(\psi, \varphi) = \psi$, and we call $\varphi$ an annihilator if for all $\psi$, $S(\varphi, \psi) = S(\psi, \varphi) = \varphi$.

Proposition 1. The subjective four-valued probability assignment $(0,0,0,1)$, i.e. the element of maximal conflict, is an annihilator with respect to credulous updating. Likewise, the subjective four-valued probability assignment $(0,0,1,0)$, representing maximal uncertainty, is an annihilator for the cautious strategy.

Proof. Let $\hat{p}_A(\varphi) = (0,0,0,1)$, let $\hat{p}_E(\varphi)$ be arbitrary and denote the result of credulous updating by $(B, D, U, C)$. Then by definition $B + C = D + C = C = 1$, so $B = D = 0$, and hence $U = 0$.

In a similar manner, let $\hat{p}_A(\psi) = (0,0,1,0)$, let $\hat{p}_E(\psi)$ be arbitrary, and denote the result of cautious updating by $(B, D, U, C)$. Then by definition $B + C = D + C = 0$, which implies $B = D = C = 0$ and $U = 1$.

Proposition 2.
The subjective four-valued probability assignment $(0,0,0,1)$, i.e. the element of maximal conflict, is a neutral element with respect to cautious updating. Likewise, the subjective four-valued probability assignment $(0,0,1,0)$, representing maximal uncertainty, is neutral with respect to credulous updating.

Proof. Let $\hat{p}_A(\varphi) = (0,0,0,1)$, let $\hat{p}_E(\varphi) = (b, d, u, c)$ be arbitrary and denote the result of cautious updating by $(B, D, U, C)$. Then by definition $B + C = b + c$ and $D + C = d + c$, which implies
$$B + D + 2C = b + d + 2c. \quad (4)$$
Using this, the last condition of cautious updating yields $U = \max(0, u, 1 - b - d - 2c)$. Since $u = 1 - b - d - c \ge 1 - b - d - 2c$, this implies $U = u$. Together with $1 = U + B + C + D = u + b + c + d$, it follows that $B + C + D = b + c + d$. In combination with equation (4), this implies $C = c$. With this, $B + C = b + c$ and $D + C = d + c$ imply that $B = b$ and $D = d$. The proof for the second claim follows from a similar argument.

Many classical approaches to reasoning address idealized situations, where the agents' information is consistent, closed under logical implication, and possibly even complete. These assumptions, of course, are at odds with many realistic reasoning scenarios, where the available evidence may be scarce and memory or observation faulty. In short, there is no guarantee for our available information to be consistent, nor complete. Yet, we would arguably hold that some valid inferences can be drawn from such imperfect information, as partial incompleteness or local contradictions may not preclude us from drawing conclusions about other parts of the data. As automated reasoning systems are becoming increasingly important, there is a need for a rigorous formal treatment of inferences from non-ideal information.
To this end, a wealth of non-classical logical systems for dealing with uncertainty or conflict has been put forward, with Belnap-Dunn logic (BD) arguably the most prominent such framework. However, the reasons for moving to non-classical, BD-like frameworks apply equally well to probabilistic settings. Agents may, for instance, have inconclusive, probabilistic evidence for the truth or falsity of various statements. Just as in the classical case, if such information comes from different sources or different experiments, it need not add up to 1, nor be mutually exclusive. It hence seems natural to investigate probabilistic extensions of BD. This was the focus of the current paper.

Paralleling recent work by Dunn (cf. Dunn, 2010; Dunn and Kiefer, 2019), we have investigated four-valued probability assignments that permit agents to have probabilistic beliefs about the truth and falsity of a statement, and about its gaps and gluts. More specifically, we have provided a theory of four-valued probabilities that slightly departs from Dunn's in its treatment of conjunctions. Yet, both are generalizations of Belnap-Dunn logic in that they coincide with BD whenever all probabilities are extremal, i.e. only assume the values 0 and 1. In this paper, we have clarified the connection between our four-valued probabilities and single-valued non-standard probabilities as introduced by Childers, Majer and Milne (2019). By providing a translation function between the two approaches, we have shown these to be equivalent. Moreover, we have introduced probabilistic models as semantics for four-valued probabilities, and have provided a sound and complete axiomatization with respect to the class of all such models. Lastly, we have enriched our frameworks with dynamical operations for updating and aggregation. As for the former, we have provided versions of Jeffrey and Bayes' conditioning that work in non-standard and four-valued settings and have clarified the relation between these.
For aggregation, finally, we have studied a host of different aggregation policies, some of which go beyond what is available in classical probabilistic settings.

Of course, there are other approaches to weakening classical probability theory, not all of which have a corresponding logic as starting point. Many such approaches take probability or weights as the central notion, but consider various cases where no exact probabilistic information is available. A typical example are inner measures, intended to approximate probability from below (Fagin and Halpern, 1991). Their underlying idea, briefly, is that an agent might lack probabilistic evidence about some proposition $\varphi$, for instance when $\varphi$ is not in the algebra of (possible) observations. The agent may, though, estimate a lower bound for the probability of $\varphi$ by building on her available information about other propositions. Formally, this gives rise to inner measures that only satisfy super-additivity instead of classical additivity, i.e. $\mu_*(\varphi \vee \psi) \ge \mu_*(\varphi) + \mu_*(\psi)$, where $\varphi \wedge \psi$ is a classical contradiction.

A related weakening of classical probability theory is the Dempster-Shafer (DS) theory of belief (Shafer, 1976; Halpern, 2017). The starting point of this theory is an agent's evidence about some state of affairs, usually represented as a normalized measure on a Boolean algebra of possible observations. This evidence then gives rise to a belief function, where $Bel(\varphi)$, the belief in some $\varphi$, is derived from all pieces of evidence that entail $\varphi$. As the agent might have strong evidence for a compound event, say $\psi \vee \varphi$, without having much evidence that entails either of its disjuncts alone, this belief function is super-additive in the sense defined above. More specifically, the degree of support for some $A$ need not be complementary to the support of $\neg A$. That is, $Bel(A)$ may be less than $1 - Bel(\neg A)$, just as in our framework.
While $Bel(A)$ can be seen as a lower bound for the classical probability of $A$, the term $1 - Bel(\neg A)$, sometimes denoted the plausibility of $A$, is its upper bound. The interval between both is then interpreted as the agent's uncertainty about $A$. As our presentation suggests, there is a tight connection between DS theory and inner measure approaches: both are equivalent, at least on a syntactic level where probabilities are associated with formulas rather than states (Fagin and Halpern, 1991; Zhou, 2013).

Both inner probability approaches and DS theory differ in two ways from our framework. In one dimension, our framework is more general than DS belief functions or inner probabilities, as it admits not only uncertainty but also conflict in probability assignments. By allowing for gluts, non-standard and four-valued probability assignments can represent contradictory information in ways that DS theory and inner measure frameworks cannot.

For a second difference, consider a classical tautology such as $p \vee \neg p$. Working with a classical meta-theory, DS theory associates a probability of 1 with this tautology. Yet, when evidence is scarce, the belief values assigned to $p$ and $\neg p$ need not add up to one, exemplifying the above super-additivity. In fact, it is compatible with DS theory that both $p$ and $\neg p$ are assigned a belief of zero. In our framework, in contrast, uncertainty or conflict derive directly from the information available about $p$ and $\neg p$, rather than from evidence about some larger proposition. Working with a non-classical, BD meta-theory, non-classical information about literals extends to complex formulas such as $p \vee \neg p$, as witnessed in the inclusion-exclusion axiom (A3). This axiom, in fact, can be seen to stand in direct opposition to the theory of inner measures: our axiom (A3) implies a subadditivity property (i.e. $\mu_*(\varphi \vee \psi) \le \mu_*(\varphi) + \mu_*(\psi)$ when $\varphi \wedge \psi$ is a classical contradiction), in contrast to the superadditivity of DS theory and inner measures.
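The super-additivity of DS belief functions can be made concrete with a toy example. The sketch below uses an invented two-world frame and mass function (all names and numbers are illustrative, not from the paper): belief in an event is the total mass of the evidence sets that entail it.

```python
# Toy frame of discernment: w1 is the world where p holds, w2 the world
# where it fails. A hypothetical mass function placing most weight on
# inconclusive evidence that entails only the tautology p-or-not-p.
mass = {
    frozenset({'w1'}): 0.2,        # evidence entailing p
    frozenset({'w2'}): 0.1,        # evidence entailing not-p
    frozenset({'w1', 'w2'}): 0.7,  # inconclusive evidence
}

def bel(event):
    """Bel(E): total mass of the evidence sets that entail E (subsets of E)."""
    e = frozenset(event)
    return sum(m for s, m in mass.items() if s <= e)

# Bel(p) + Bel(not-p) = 0.3 < 1 = Bel(p or not-p): super-additivity.
# The plausibility 1 - Bel(not-p) = 0.9 is the upper bound for p; the
# interval [Bel(p), 0.9] = [0.2, 0.9] is the agent's uncertainty about p.
```

Note how the tautology still receives belief 1 here, in line with the classical meta-theory of DS; in the four-valued framework, by contrast, scarce or conflicting information about $p$ and $\neg p$ carries over to $p \vee \neg p$ itself.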
A detailed comparison between DS belief functions and our approach would require a more careful analysis that exceeds the scope of this article. We leave this for future work.

Finally, another open line of inquiry concerns practical implications of the present framework. One may, for instance, ask how an ideally rational agent is to act if she has only imperfect information at her disposal. In future work, we hope to sketch the contours of a non-standard decision theory that rests on four-valued probabilities in the same manner as traditional decision theory employs classical probability. Doing so, we hope, can help to fill a gap between current frameworks for decisions under risk and under uncertainty.

References
Alchourrón, C. E., P. Gärdenfors, and D. Makinson (1985). On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic 50(2), 510–530.

Anderson, A. R. and N. D. Belnap (1975). Entailment: The Logic of Relevance and Necessity, Volume I. Princeton: Princeton University Press.

Batens, D. (2001). A general characterization of adaptive logics. Logique et Analyse 44(173–175), 45–68.

Belnap, N. D. (1977). A useful four-valued logic. In Modern Uses of Multiple-Valued Logic, pp. 5–37. Springer.

Belnap, N. D. (2019). How a Computer Should Think, pp. 35–53. Springer International Publishing.

Childers, T., O. Majer, and P. Milne (2019). The (relevant) logic of scientific discovery. (under review).

Christensen, D. (2007). Epistemology of disagreement: The good news. The Philosophical Review 116(2), 187–217.

da Costa, N. (1974). On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic 15(4), 497–510.

da Costa, N. and V. Subrahmanian (1989). Paraconsistent logic as a formalism for reasoning about inconsistent knowledge bases. Artificial Intelligence in Medicine 1, 167–174.

Dunn, J. M. (1976). Intuitive semantics for first degree entailment and 'coupled trees'. Philosophical Studies 29(3), 149–168.

Dunn, J. M. (2010). Contradictory information: Too much of a good thing. Journal of Philosophical Logic 39(4), 425–452.

Dunn, J. M. and N. M. Kiefer (2019). Contradictory information: Better than nothing? The paradox of the two firefighters. In Graham Priest on Dialetheism and Paraconsistency, pp. 231–247. Springer.

Elga, A. (2007). Reflection and disagreement. Noûs 41(3), 478–502.

Fagin, R. and J. Y. Halpern (1991). Uncertainty, belief, and probability. Computational Intelligence 7(3), 160–173.

Font, J. M. (1997). Belnap's four-valued logic and De Morgan lattices. Logic Journal of the IGPL 5(3), 1–29.

Halpern, J. (2017). Reasoning about Uncertainty. MIT Press.

Jaskowski, S. (1948). Propositional calculus for contradictory deductive systems. Studia Logica 24, 143–157.

Jøsang, A. (1997). Artificial reasoning with subjective logic. In Proceedings of the Second Australian Workshop on Commonsense Reasoning, Volume 48, pp. 34. Citeseer.

Kelly, T. (2010). Peer disagreement and higher order evidence. In A. I. Goldman and D. Whitcomb (Eds.), Social Epistemology: Essential Readings, pp. 183–217. Oxford University Press.

Klein, D. and A. Marra (2020). From oughts to goals: A logic for enkrasia. Studia Logica 108(1), 85–128.

Kolmogorov, A. N. (2018). Foundations of the Theory of Probability. Courier Dover Publications.

Mares, E. D. (1997). Paraconsistent probability theory and paraconsistent Bayesianism. Logique et Analyse 40(160), 375–384.

Přenosil, A. (2018). Reasoning with Inconsistent Information. Ph.D. thesis, Charles University, Faculty of Philosophy.

Priest, G. (1979). Logic of paradox. Journal of Philosophical Logic 8, 219–241.

Priest, G. (2002). Paraconsistent logic. In D. M. Gabbay and F. Guenthner (Eds.), Handbook of Philosophical Logic 6, 287–393.

Priest, G. (2006). In Contradiction. Oxford University Press.

Priest, G. (2007). Paraconsistency and dialetheism. In D. Gabbay and J. Woods (Eds.), Handbook of the History of Logic 8, 129–204.

Rescher, N. and R. Manor (1970). On inference from inconsistent premisses. Theory and Decision 1(2), 179–217.

Shafer, G. (1976). A Mathematical Theory of Evidence, Volume 42. Princeton University Press.

Zhou, C. (2013). Belief functions on distributive lattices.