arXiv preprint [math.LO]

Probabilities with Gaps and Gluts
Dominik Klein, Ondrej Majer, Soroush Rafiee Rad
Abstract
Belnap-Dunn logic (BD), sometimes also known as First Degree Entailment, is a four-valued propositional logic that complements the classical truth values of True and False with two non-classical truth values, Neither and Both. The latter two account for the possibility of the available information being incomplete or providing contradictory evidence. In this paper, we present a probabilistic extension of BD that permits agents to have probabilistic beliefs about the truth and falsity of a proposition. We provide a sound and complete axiomatization for the framework defined and also identify policies for conditionalization and aggregation. Concretely, we introduce four-valued equivalents of Bayes' and Jeffrey updating and also suggest mechanisms for aggregating information from different sources.
Keywords: Belnap-Dunn logic, First Degree Entailment, Non-standard probability theory, Probability theory, Bayes' updating, Jeffrey updating, Probability aggregation
Introduction

In learning about a classical system that adheres to the laws of propositional logic, we may be faced with information that does not. Naturally, if information is scarce, our evidence may contain truth value gaps, indicating certain propositions to be neither true nor false. But we may also be faced with contradictory information, especially when our insights are gained by combining various bodies of evidence. This may lead to truth value gluts, i.e. propositions that are labelled as both true and false.

There have been many attempts in the literature to develop formal systems for capturing and analyzing such non-classical situations. These are generally divided into two camps. The first is motivated by adopting the philosophical position of dialetheism as defended by Priest (2006, 2007), advocating the thesis that there are true contradictions, i.e. sentences which are both true and false. Corresponding formal systems should thus allow for assigning both truth values to a sentence simultaneously. Probably the most well known example of such logical systems is the logic LP (Priest, 1979, 2002).

The second camp takes the existence of gaps and gluts as a pathological consequence of imperfect information. Crucially, one may hope that even imperfect information would allow for at least some reliable inferences. In general, there are two ways to go here. One could either make the set of premises consistent or develop non-trivial inference rules that work on inconsistent sets of premises. Consistency of premises can be obtained by focusing on maximal consistent subsets, cf. Rescher and Manor (1970); Klein and Marra (2020), or by employing belief revision, as in AGM systems (Alchourrón et al., 1985).
Mechanisms for dealing with inconsistent information, on the other hand, are developed in a variety of frameworks such as discussive logic (Jaśkowski, 1948), adaptive logic (Batens, 2001), Da Costa's logics of formal inconsistency (1974; 1989), the relevant logic of Anderson and Belnap (1975), and their variants.

Another well-known logical framework that falls into this last category is Belnap-Dunn logic (BD, cf. Belnap, 1977, 2019; Dunn, 1976), sometimes also going by the name of First Degree Entailment. Briefly, this system rests on two assumptions. The first is that gaps and gluts may occur even for boundedly rational agents, as information may be limited (gaps) and the question of whether a given belief set is consistent (i.e. checking for the absence of gluts) is known to be NP-hard. Building on the latter claim, BD's second assumption is that the logic of information should not validate the principle of explosion, which states that every formula can be derived from a contradiction. Just to the contrary, BD stipulates that a body of information may afford us substantial insights about some matter $q$, even if it contains contradictory information about some other $p$ that is completely unrelated to $q$. Belnap-Dunn logic, in short, is a substructural logic that invalidates explosion and tracks which insights can be inferred from an information base that may contain gaps and gluts.

But of course, the problem of insufficient or contradictory information does not apply to categorical true-false information only. Rather, probabilistic information is affected by similar arguments about gaps and gluts as those outlined above. In his 1997 paper, Jøsang puts forward a framework for three-valued probabilities, incorporating uncertainty as a third value that may occur naturally when evidence is ambiguous or insufficient. Notably, this framework circumvents the debated principle of insufficient reason by distinguishing situations of insufficient information from those where equally strong evidence is available for and against some proposition.

Later approaches extend this to four-valued probabilities, where the fourth value represents conflicting information, or gluts. The necessity of gluts is often argued for by considering a Bayesian agent who receives two pieces of mutually contradictory information from sources she judges highly reliable, cf. the firefighter example in Dunn and Kiefer (2019).

In short, these arguments call for a four-valued probabilistic generalization of Belnap-Dunn logic, in a similar way as classical probability theory generalizes propositional logic. In a first approach to this project, Michael Dunn (2010) has defined a four-valued probabilistic framework and has studied logical properties of the resulting probabilistic entailment. In a similar vein, Childers, Majer and Milne (2019) have put forward a single-valued approach to non-standard probabilities motivated by a frequentist interpretation where probability gaps and gluts may occur naturally if probabilities are derived from sampling two …
We start by giving a brief recollection of Belnap-Dunn four-valued logic before proceeding to introduce its probabilistic extensions. Belnap-Dunn four-valued logic is defined over a propositional language that is built over a set Prop of propositional variables. Formally, the logical language $\mathcal{L}_{\mathsf{Prop}}$ is given by the Backus-Naur form:

$$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi$$

Disjunction ($\vee$) is defined in the standard way. The main difference to classical propositional logic consists in the way that formulas are evaluated. In classical propositional logic, evaluations are defined as functions $v : \mathcal{L}_{\mathsf{Prop}} \to \{0,1\}$ that are derived from a valuation on the set of atoms Prop. For Belnap-Dunn logic there are two ways to define evaluations. One approach is to define evaluations as functions $v : \mathcal{L}_{\mathsf{Prop}} \to \mathcal{P}(\{0,1\})$. In other words, instead of evaluating formulas on the two-element lattice $\{0,1\}$, they are interpreted on the four-element lattice $\mathsf{BD} = \mathcal{P}(\{0,1\}) = \{\emptyset, \{0\}, \{1\}, \{0,1\}\}$. Evaluating formulas in the four-element lattice $\mathcal{P}(\{0,1\})$ allows for the assignment of two new truth values, $\emptyset$ and $\{0,1\}$. These represent so-called truth-value gaps and gluts, i.e. situations where formulas obtain neither resp. both of the classic truth values. Formally, the evaluation is defined inductively, starting from an atomic valuation $v : \mathsf{Prop} \to \mathcal{P}(\{0,1\})$, by:

$1 \in v(\neg\varphi)$ iff $0 \in v(\varphi)$
$0 \in v(\neg\varphi)$ iff $1 \in v(\varphi)$
$1 \in v(\varphi \wedge \psi)$ iff $1 \in v(\varphi)$ and $1 \in v(\psi)$
$0 \in v(\varphi \wedge \psi)$ iff $0 \in v(\varphi)$ or $0 \in v(\psi)$

An alternative approach is to use two separate classical valuations, called the positive valuation $v^+$ and the negative valuation $v^-$. Building on atomic valuations $v^+ : \mathsf{Prop} \to \{0,1\}$ and $v^- : \mathsf{Prop} \to \{0,1\}$, these are defined for $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ as:

$v^+(\neg\varphi) = 1$ iff $v^-(\varphi) = 1$
$v^-(\neg\varphi) = 1$ iff $v^+(\varphi) = 1$
$v^+(\varphi \wedge \psi) = 1$ iff $v^+(\varphi) = 1$ and $v^+(\psi) = 1$
$v^-(\varphi \wedge \psi) = 1$ iff $v^-(\varphi) = 1$ or $v^-(\psi) = 1$

Entailment is defined through preservation of positive evaluation: $\varphi \models_L \psi$ if and only if $v^+(\varphi) = 1$ implies $v^+(\psi) = 1$ for all valuation pairs $(v^+, v^-)$. This entailment relation goes by the name of first degree entailment.

An important property of Belnap-Dunn logic, that we will make heavy use of later, is that it admits disjunctive (as well as conjunctive) normal forms. Just as in classical logic, a formula in disjunctive normal form is written as a disjunction of conjunctions of literals. However, unlike in classical logic, an atom might appear both positively and negatively within a conjunctive clause.

Theorem 1. (Theorem 3.9 in Font (1997)) Every formula of Belnap-Dunn logic is equivalent to a formula in a conjunctive (disjunctive) normal form.
Moreover, up to permutation of conjuncts and disjuncts, formulas in conjunctive (disjunctive) normal form may be identified with finite families of finite sets of literals (Theorem 3.15 in Přenosil (2018)).
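To make the set-valued style of evaluation concrete, the following is a minimal sketch (our own illustration, not code from the paper) of Belnap-Dunn evaluation on the four-element lattice $\mathcal{P}(\{0,1\})$; the nested-tuple formula representation is an assumption of the sketch.

```python
def bd_not(v):
    """BD negation: 1 in v(~phi) iff 0 in v(phi), and 0 in v(~phi) iff 1 in v(phi)."""
    out = set()
    if 0 in v:
        out.add(1)
    if 1 in v:
        out.add(0)
    return out

def bd_and(v, w):
    """BD conjunction: true if both conjuncts are true; false if either is false."""
    out = set()
    if 1 in v and 1 in w:
        out.add(1)
    if 0 in v or 0 in w:
        out.add(0)
    return out

def bd_or(v, w):
    """Disjunction, defined in the standard way via De Morgan duality."""
    return bd_not(bd_and(bd_not(v), bd_not(w)))

def evaluate(phi, val):
    """phi is an atom name (str) or ('not', f) / ('and', f, g) / ('or', f, g)."""
    if isinstance(phi, str):
        return val[phi]
    if phi[0] == 'not':
        return bd_not(evaluate(phi[1], val))
    if phi[0] == 'and':
        return bd_and(evaluate(phi[1], val), evaluate(phi[2], val))
    return bd_or(evaluate(phi[1], val), evaluate(phi[2], val))

val = {'p': {0, 1},   # glut: evidence for and against p
       'q': set()}    # gap: no evidence either way
print(evaluate(('and', 'p', ('not', 'p')), val))  # {0, 1}
print(evaluate(('or', 'q', ('not', 'q')), val))   # set()
```

Note that $p \wedge \neg p$ still receives the value $\{0,1\}$ rather than triggering any collapse: this is the failure of explosion at the semantic level, since a glut in $p$ says nothing about an unrelated $q$.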
The double valuation approach's starting assumption is that positive and negative evidence are distinct. That is, the absence of positive evidence for some $p$ is not the same as negative evidence against $p$ (or positive evidence for $\neg p$, if you will). In particular, there may be gaps, where neither evidence for $p$ nor against $p$ is available, and gluts, where evidence of both types is present. Within our models, we must hence treat positive and negative evidence separately. In the following, we assume Prop finite and constant. Also, we will denote the set of literals over Prop by Lit, i.e. $\mathsf{Lit} := \mathsf{Prop} \cup \{\neg p \mid p \in \mathsf{Prop}\}$.

Definition 1. A non-standard model is a triple $\mathcal{M} = \langle \Sigma, v^+, v^- \rangle$ where $\Sigma$ is a finite or countably infinite set of states and $v^+, v^- : \Sigma \times \mathsf{Prop} \to \{0,1\}$ are called the positive (negative) valuation function respectively. For $p \in \mathsf{Prop}$ we let $v^\pm(p) = \{s \in \Sigma \mid v^\pm(s,p) = 1\}$.

Hence, a state $s$ of a model $\mathcal{M}$ might be assigned an inconsistent set of propositions (i.e., $s \in v^+(p) \cap v^-(p)$ for some $p \in \mathsf{Prop}$), and may remain undecided about some propositions ($s \notin v^+(q) \cup v^-(q)$ for some $q \in \mathsf{Prop}$). Non-standard models provide a semantics for BD. More specifically, logical formulas of $\mathcal{L}_{\mathsf{Prop}}$ are evaluated on model-state pairs, using relations $\models^+$ and $\models^-$. From this, we then obtain the notions of a positive and negative extension.

Definition 2.
Let $\mathcal{M} = \langle \Sigma, v^+, v^- \rangle$ be a non-standard model, $s \in \Sigma$ a state and $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ be formulas. Then

i) The semantics of $\mathcal{L}_{\mathsf{Prop}}$ on $(\mathcal{M}, s)$ is given by:

$\mathcal{M}, s \models^+ p$ iff $s \in v^+(p)$
$\mathcal{M}, s \models^- p$ iff $s \in v^-(p)$
$\mathcal{M}, s \models^+ \varphi \wedge \psi$ iff $\mathcal{M}, s \models^+ \varphi$ and $\mathcal{M}, s \models^+ \psi$
$\mathcal{M}, s \models^- \varphi \wedge \psi$ iff $\mathcal{M}, s \models^- \varphi$ or $\mathcal{M}, s \models^- \psi$
$\mathcal{M}, s \models^+ \neg\varphi$ iff $\mathcal{M}, s \models^- \varphi$
$\mathcal{M}, s \models^- \neg\varphi$ iff $\mathcal{M}, s \models^+ \varphi$

ii) The positive and negative extensions of $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ are

$|\varphi|^+_{\mathcal{M}} = \{s \in \Sigma \mid \mathcal{M}, s \models^+ \varphi\}$
$|\varphi|^-_{\mathcal{M}} = \{s \in \Sigma \mid \mathcal{M}, s \models^- \varphi\} \;\; (= \{s \in \Sigma \mid \mathcal{M}, s \models^+ \neg\varphi\})$
We define the entailment relation between sentences in the usual way: $\varphi \models^\pm \psi$ if and only if for all models $\mathcal{M}$ and states $s$, if $\mathcal{M}, s \models^\pm \varphi$ then $\mathcal{M}, s \models^\pm \psi$. Observe the obvious connection between positive and negative extension: $|\neg\varphi|^+_{\mathcal{M}} = |\varphi|^-_{\mathcal{M}}$. Moreover, we define the sets of pure belief, pure disbelief, conflict and uncertainty about $\varphi$ as

$|\varphi|^b_{\mathcal{M}} = |\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}}$
$|\varphi|^d_{\mathcal{M}} = |\varphi|^-_{\mathcal{M}} \setminus |\varphi|^+_{\mathcal{M}}$
$|\varphi|^c_{\mathcal{M}} = |\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}}$
$|\varphi|^u_{\mathcal{M}} = \Sigma \setminus (|\varphi|^+_{\mathcal{M}} \cup |\varphi|^-_{\mathcal{M}})$

The terms belief and disbelief, of course, refer to the intended interpretation as a doxastic state. Whenever clear from context, we omit the subscript $\mathcal{M}$.

Towards a semantics of non-standard probability theory, we expand the non-standard model defined above with a probability measure that is classic. Non-classicality of the ensuing probability assignments, then, will be derived from the underlying valuations only, i.e. from the fact that non-standard models allow for gaps and gluts of truth values.

Definition 3. A probabilistic model is a tuple $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ where $\langle \Sigma, v^+, v^- \rangle$ is a non-standard model and $\mu$ is a probability measure on the full subset algebra of $\Sigma$.

Building on probabilistic models, we can derive two different probability assignments from $\mathcal{M}$, one four-valued, the other single-valued. These are:

Definition 4.
For a probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$,

i) the induced non-standard probability function $p_\mu : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ is given by $p_\mu(\varphi) = \mu(|\varphi|^+_{\mathcal{M}})$;

ii) the induced four-valued probability function $\hat{p}_\mu : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$ is given by $\hat{p}_\mu(\varphi) = \big(\mu(|\varphi|^b),\, \mu(|\varphi|^d),\, \mu(|\varphi|^u),\, \mu(|\varphi|^c)\big)$.

(We use the following conventions in naming probability measures: i) $p$-like names stand for syntactic measures, on languages, and $\mu$-like names for measures on spaces; ii) hat-superscripts denote four-valued probabilities; iii) the subscript $\mu$ may be used if a syntactic measure is derived from a space.)

To end this section, we would like to highlight a strong similarity to classic probabilistic models. Classic probability assignments can be derived from possible worlds models equipped with a probability function, i.e. finite classical models akin to those in Definition 3. More explicitly, for a classical model of the form $\mathcal{M} = \langle W, v, \mu \rangle$ with $W$ a set of possible worlds, $v : W \times \mathsf{Prop} \to \{0,1\}$ a valuation, and $\mu : \mathcal{P}(W) \to [0,1]$ a probability measure, the probability of some $\varphi$ is given as $\mu([\varphi])$, with $[\varphi] = \{w \in W : \mathcal{M}, w \models \varphi\}$. In fact, if Prop is finite, every probability assignment on $\mathcal{L}_{\mathsf{Prop}}$ can be obtained in this way.

Moreover, every world $w$ of a possible worlds model $W$ naturally corresponds to its atomic valuation, which can be represented by the subset $V \subseteq \mathsf{Prop}$ given by $p \in V$ iff $v(w,p) = 1$, for $p \in \mathsf{Prop}$. In the same vein, each state $s$ of a probabilistic model corresponds to a non-standard possible assignment $n_s \in \mathcal{P}(\mathsf{Lit})$ defined by $p \in n_s$ iff $v^+(s,p) = 1$ and $\neg p \in n_s$ iff $v^-(s,p) = 1$, for $p \in \mathsf{Prop}$. Hence, non-standard probabilistic models are obtained from possible worlds models by replacing classical worlds, i.e. atomic valuations, with BD-possible worlds, that is, elements of $\mathcal{P}(\mathsf{Lit})$.

In the following, we present a number of axioms for non-standard and four-valued probabilities. The two sets of axioms given here are easily seen to be sound w.r.t. the semantics just presented. That they are also complete will be shown in Section 6. We can hence use these axioms for a purely syntactic definition of non-standard and four-valued probabilities.
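For concreteness, the following sketch (our own toy example, not code from the paper) induces both probability functions from a four-state probabilistic model over a single atom $p$, with one state for each evidential situation.

```python
from fractions import Fraction

# A toy probabilistic model M = (Sigma, mu, v+, v-) over one atom p.
# Each state is labelled by the evidence it carries: (v+(s,p), v-(s,p)).
states = {'gap': (0, 0), 'true': (1, 0), 'false': (0, 1), 'glut': (1, 1)}
mu = {s: Fraction(1, 4) for s in states}           # uniform measure

pos = {s for s, (vp, vn) in states.items() if vp}  # |p|+ = {'true', 'glut'}
neg = {s for s, (vp, vn) in states.items() if vn}  # |p|- = {'false', 'glut'}

def measure(A):
    return sum(mu[s] for s in A)

p_p = measure(pos)                          # induced p_mu(p) = mu(|p|+)

b = measure(pos - neg)                      # pure belief
d = measure(neg - pos)                      # pure disbelief
c = measure(pos & neg)                      # conflict
u = measure(set(states) - (pos | neg))      # uncertainty

print(p_p)            # 1/2
print(b + d + u + c)  # 1   (the four masses partition Sigma)

# gaps and gluts on the level of p_mu:
gap_wit  = measure(pos | neg)  # p_mu(p v ~p) = 3/4 < 1: a probabilistic gap
glut_wit = measure(pos & neg)  # p_mu(p ^ ~p) = 1/4 > 0: a probabilistic glut
```

Note that $b + c$ equals $p_\mu(p)$ here, which anticipates the translation between the two induced assignments discussed in the text.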
Non-standard probabilities
We begin with axioms for single-valued non-standard probabilities, i.e. probability measures assigning each $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ a unique real number.
Definition 5. A non-standard probability assignment is a function $p : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ satisfying, for all $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$:

(A1) $0 \le p(\varphi) \le 1$
(A2) if $\varphi \models_L \psi$ then $p(\varphi) \le p(\psi)$ (monotonicity)
(A3) $p(\varphi \wedge \psi) + p(\varphi \vee \psi) = p(\varphi) + p(\psi)$ (import-export rule)

where $\models_L$ in (A2) is the entailment relation of Belnap-Dunn logic (first-degree entailment).

These axioms are strictly weaker than the classic Kolmogorov axioms (Kolmogorov, 2018). Axioms (A1)-(A3) can be derived from the Kolmogorov axioms, using that first degree entailment is a sub-relation of classical entailment. In the converse direction, however, only the non-negativity axiom ($p(\varphi) \ge 0$) is derivable, from (A1). Neither Kolmogorov's unit axiom ($p(\top) = 1$) nor the ($\sigma$-)additivity axiom is derivable from (A1)-(A3), as is illustrated by the fact that assigning probability .5 to every formula satisfies (A1)-(A3). In fact, the import-export axiom is a weak counterpart to additivity, stating that a general rule for adding probabilities that is derivable from the Kolmogorov axioms, $p(\varphi \vee \psi) = p(\varphi) + p(\psi) - p(\varphi \wedge \psi)$, continues to hold. Within the above axiomatization, the import-export axiom (A3) is the only condition regulating the relation between the probability of a formula and its negation. As a result, the probabilities of $\varphi$ and $\neg\varphi$ need not sum up to 1. The constraint $p(\varphi \vee \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi) + p(\neg\varphi)$ allows for probabilistic gaps ($p(\varphi \vee \neg\varphi) < 1$) and gluts ($p(\varphi \wedge \neg\varphi) > 0$) to occur simultaneously. This squares with our original motivation of establishing independence between positive and negative evidence.

Four-valued probabilities

We now turn to four-valued probability assignments. These are characterized by a total of six axioms.
Definition 6. A four-valued probability assignment is a function $\hat{p} : \mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$. Writing $\hat{p}(\varphi)$ as $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$, this function must satisfy

(D1) $0 \le b_\varphi, d_\varphi, u_\varphi, c_\varphi$
(D2) $b_\varphi + d_\varphi + u_\varphi + c_\varphi = 1$
(D3) $b_{\neg\varphi} = d_\varphi$ and $c_{\neg\varphi} = c_\varphi$
(D4) if $\varphi \models_L \psi$ then $b_\varphi + c_\varphi \le b_\psi + c_\psi$
(D5) $b_{\varphi \wedge \neg\varphi} = 0$ and $c_{\varphi \wedge \neg\varphi} = c_\varphi$
(D6) $b_\varphi + c_\varphi + b_\psi + c_\psi = b_{\varphi \wedge \psi} + c_{\varphi \wedge \psi} + b_{\varphi \vee \psi} + c_{\varphi \vee \psi}$

where $\models_L$ is first-degree entailment and $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$.
The four entries of $\hat{p}$ stand for pure belief (i.e. $\varphi$ is true and $\neg\varphi$ is not), pure disbelief, uncertainty and conflict respectively. Let us briefly explain the axioms. The first two axioms (D1) and (D2) are classicality axioms, stating that probabilities are non-negative and that the probabilistic masses of pure belief, pure disbelief, uncertainty and conflict must add up to 1. This reflects the intuition that the four cases are mutually exclusive and jointly exhaustive, i.e. that the metatheory of gaps and gluts is classical.

Axioms (D3)-(D6) then represent structural relations between the four-valued assignments. (D3) emphasizes the strong relation between $\varphi$ and $\neg\varphi$: belief in one is the same as disbelief in the other, while both share the same conflict and uncertainty. (D4) is a direct counterpart of axiom (A2) above, stating that the total belief in $\varphi$ (i.e. the sum of pure belief in $\varphi$ and belief in $\varphi$ and $\neg\varphi$ together) must be monotone under first degree entailment. (D5) expresses that an agent cannot have pure belief in contradictory formulas of the form $\varphi \wedge \neg\varphi$. A fortiori, the conflict about $\varphi \wedge \neg\varphi$ must be derived from (and equal to) the conflict about $\varphi$ alone. (D6), finally, is a counterpart to the import-export axiom (A3). Briefly, it states that the total beliefs (i.e. the sum of pure belief and conflict together) of $\varphi, \psi, \varphi \vee \psi$ and $\varphi \wedge \psi$ must satisfy the import-export rule.

We should note that the axioms presented here are weaker than those put forward in Dunn (2010). There, the probability of a conjunction $\varphi \wedge \psi$ is determined by its conjuncts through:

$b_{\varphi \wedge \psi} = b_\varphi \cdot b_\psi$
$d_{\varphi \wedge \psi} = d_\varphi + d_\psi - d_\varphi d_\psi + c_\varphi u_\psi + u_\varphi c_\psi$
$u_{\varphi \wedge \psi} = u_\varphi b_\psi + b_\varphi u_\psi + u_\varphi u_\psi$
$c_{\varphi \wedge \psi} = b_\varphi c_\psi + c_\varphi b_\psi + c_\varphi c_\psi$

A similar axiom for three-valued probabilities (true/false/uncertain) can be found in Jøsang (1997). Notably, such a definition makes conjunction truth-functional, i.e. the probability of $\varphi \wedge \psi$ is fully determined by the probabilities of $\varphi$ and $\psi$.
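To see the truth-functional rule in action, here is a sketch (our own illustration, transcribing the equations above) that applies it to two classical four-valued vectors, i.e. vectors with no uncertainty and no conflict.

```python
from fractions import Fraction

def dunn_and(p, q):
    """Truth-functional conjunction of four-valued vectors (b, d, u, c),
    following the rule quoted in the text."""
    b1, d1, u1, c1 = p
    b2, d2, u2, c2 = q
    return (b1 * b2,
            d1 + d2 - d1 * d2 + c1 * u2 + u1 * c2,
            u1 * b2 + b1 * u2 + u1 * u2,
            b1 * c2 + c1 * b2 + c1 * c2)

# Two classical vectors: u = c = 0, so b + d = 1.
phi = (Fraction(3, 5), Fraction(2, 5), Fraction(0), Fraction(0))
psi = (Fraction(1, 2), Fraction(1, 2), Fraction(0), Fraction(0))

result = dunn_and(phi, psi)
print(result[0])   # 3/10, i.e. b_phi * b_psi: probabilistic independence
```

The belief component of the conjunction is forced to be the product of the conjuncts' beliefs, which is exactly the point made in the text: the rule builds probabilistic independence into every pair of propositions.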
We take this to be too strong, especially given that no such functional dependence holds in classic probability theory. Moreover, this truth-functional approach implies that all propositions are mutually probabilistically independent, precluding any interesting notion of conditionalization. To see this, assume that $\varphi$ and $\psi$ are classical, i.e. $\hat{p}(\varphi) = (b_\varphi, d_\varphi, 0, 0)$ and $\hat{p}(\psi) = (b_\psi, d_\psi, 0, 0)$. Then the above definition simplifies to $\hat{p}(\varphi \wedge \psi) = (b_\varphi b_\psi,\, d_\varphi + d_\psi - d_\varphi d_\psi,\, 0,\, 0)$. In other words, the probability (belief) of $\varphi \wedge \psi$ is the product of the probabilities of $\varphi$ and $\psi$, which is exactly the definition of probabilistic independence.

In the following section we will show a strong correspondence between non-standard and four-valued probability assignments. Thereafter, we show the axiom systems (A1)-(A3) and (D1)-(D6) to be sound and complete with respect to the class of probabilistic models defined above (Section 6). In Section 7 we then discuss approaches to conditionalization in either setting.

We have so far presented two different frameworks for non-standard probability, one real-valued, the other with values in $\mathbb{R}^4$. As we show now, both are different but equivalent perspectives on the same phenomenon. To this end, let $P_{ns}$ and $P_4$ be the sets of non-standard and four-valued probability assignments respectively. That is, $P_{ns}$ is the set of functions $\mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}$ satisfying (A1)-(A3) while $P_4$ consists of all mappings $\mathcal{L}_{\mathsf{Prop}} \to \mathbb{R}^4$ satisfying (D1)-(D6). We will show the translation map $tr_{ns} : P_4 \to P_{ns}$ defined by

$tr_{ns}(\hat{p})(\varphi) := b_\varphi + c_\varphi$, where $\hat{p}(\varphi) = (b_\varphi, d_\varphi, u_\varphi, c_\varphi)$,

to be a bijection. In the opposite direction, the map $tr_4 : P_{ns} \to P_4$ is given by

$tr_4(p)(\varphi) := \big(p(\varphi) - p(\varphi \wedge \neg\varphi),\; p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\; 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\; p(\varphi \wedge \neg\varphi)\big)$

As expected, the maps $tr_{ns}$ and $tr_4$ are inverse to each other:

Theorem 2. $tr_{ns}$ and $tr_4$ are well-defined. Moreover $tr_4 \circ tr_{ns} = id_{P_4}$ and $tr_{ns} \circ tr_4 = id_{P_{ns}}$.

Moreover, the translation maps $tr_{ns}$ and $tr_4$ cohere with the way we defined non-standard and four-valued assignments on a given probabilistic model.

Theorem 3. Let $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ and $\hat{p}_\mu$ the induced non-standard and four-valued probability functions. Then $tr_{ns} \circ \hat{p}_\mu = p_\mu$ and $tr_4 \circ p_\mu = \hat{p}_\mu$.

The remainder of this section is devoted to showing these two results.
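The round trip between the two kinds of assignment can be checked numerically. The following sketch is our own illustration of the translation maps (the names tr_ns and tr_4, and the restriction to the values an assignment takes on $\varphi$, $\neg\varphi$ and $\varphi \wedge \neg\varphi$, are simplifications for this example).

```python
from fractions import Fraction

def tr_ns(b, d, u, c):
    """Non-standard probability of phi from its four-valued vector: b + c."""
    return b + c

def tr_4(p_phi, p_neg, p_glut):
    """Four-valued vector of phi from p(phi), p(~phi) and p(phi ^ ~phi)."""
    return (p_phi - p_glut,
            p_neg - p_glut,
            1 - p_phi - p_neg + p_glut,
            p_glut)

# a four-valued vector (b, d, u, c) for phi, summing to 1 as (D2) demands:
b, d, u, c = Fraction(2, 5), Fraction(1, 5), Fraction(1, 10), Fraction(3, 10)

# the non-standard values it induces:
p_phi  = tr_ns(b, d, u, c)   # p(phi)        = b + c
p_neg  = d + c               # p(~phi),      by (D3)
p_glut = c                   # p(phi ^ ~phi), by (D5)

assert tr_4(p_phi, p_neg, p_glut) == (b, d, u, c)  # round trip recovers the vector
```

The final assertion is an instance of Theorem 2: translating a four-valued vector to its non-standard values and back is the identity, with (D2) supplying the uncertainty component.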
Proof of Theorem 2. To see that $tr_{ns}$ is well-defined, let $p = tr_{ns}(\hat{p})$ for a fixed $\hat{p} \in P_4$. First, note that for any $\psi \in \mathcal{L}_{\mathsf{Prop}}$ with $\hat{p}(\psi) = (b_\psi, d_\psi, u_\psi, c_\psi)$ we have $0 \le b_\psi + c_\psi \le 1$, showing that $p$ satisfies (A1). To see that $p$ satisfies (A2), assume that $\varphi \models_L \psi$. By (D4), we have that $b_\varphi + c_\varphi \le b_\psi + c_\psi$ and hence $p(\varphi) \le p(\psi)$. For (A3), finally, note that by (D6) we have for any $\varphi, \psi \in \mathcal{L}_{\mathsf{Prop}}$ that $b_\varphi + c_\varphi + b_\psi + c_\psi = b_{\varphi \wedge \psi} + c_{\varphi \wedge \psi} + b_{\varphi \vee \psi} + c_{\varphi \vee \psi}$, which immediately implies that $p(\varphi) + p(\psi) = p(\varphi \wedge \psi) + p(\varphi \vee \psi)$.

Next, we show that also $tr_4$ is well-defined. For this, fix $p \in P_{ns}$. For $\psi \in \mathcal{L}_{\mathsf{Prop}}$ denote $tr_4(p)(\psi)$ by $(b_\psi, d_\psi, u_\psi, c_\psi)$. Using this notation, we obtain

$b_\psi + d_\psi + u_\psi + c_\psi = p(\psi) - p(\psi \wedge \neg\psi) + p(\neg\psi) - p(\psi \wedge \neg\psi) + 1 - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) + p(\psi \wedge \neg\psi)$

The latter term is easily seen to equal 1, showing (D2). For (D1), note that $\psi \wedge \neg\psi \models_L \psi$ and $\psi \wedge \neg\psi \models_L \neg\psi$. By (A2), we have that $p(\psi \wedge \neg\psi) \le p(\psi), p(\neg\psi)$, which, together with (A1), implies that $b_\psi, d_\psi, c_\psi \ge 0$. Finally, by (A1) and (A3),

$1 - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) \ge p(\psi \vee \neg\psi) - p(\psi) - p(\neg\psi) + p(\psi \wedge \neg\psi) = 0$,

hence $u_\psi \ge 0$.

The first half of (D3) follows from the fact that $b_{\neg\psi} = p(\neg\psi) - p(\neg\psi \wedge \neg\neg\psi) = d_\psi$, using that $\neg\neg\psi \dashv\models_L \psi$ and hence, by (A2), $p(\neg\neg\psi) = p(\psi)$. The second half follows from the fact that $\neg\psi \wedge \neg\neg\psi \dashv\models_L \psi \wedge \neg\psi$ and hence, by (A2), $p(\neg\psi \wedge \neg\neg\psi) = p(\psi \wedge \neg\psi)$. Similarly, (D4) can be derived from (A2) together with the fact that $b_\psi + c_\psi = p(\psi) - p(\psi \wedge \neg\psi) + p(\psi \wedge \neg\psi) = p(\psi)$. Using the latter fact again, (D6) is an immediate consequence of (A3). For (D5), finally, note that $\psi \wedge \neg\psi \dashv\models_L \psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi)$ and hence, by (A2), $p(\psi \wedge \neg\psi) = p(\psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi))$. This implies that $c_{\psi \wedge \neg\psi} = c_\psi$ and that $b_{\psi \wedge \neg\psi} = p(\psi \wedge \neg\psi) - p(\psi \wedge \neg\psi \wedge \neg(\psi \wedge \neg\psi)) = 0$.

It remains to show that $tr_{ns} \circ tr_4 = id_{P_{ns}}$ and $tr_4 \circ tr_{ns} = id_{P_4}$, i.e. that $tr_{ns}$ and $tr_4$ are left and right inverses of each other. We begin by showing that $tr_{ns}(tr_4(p)) = p$ for any $p \in P_{ns}$. For $\varphi \in \mathcal{L}_{\mathsf{Prop}}$, we have that $tr_4(p)(\varphi)$ equals

$\big(p(\varphi) - p(\varphi \wedge \neg\varphi),\; p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\; 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\; p(\varphi \wedge \neg\varphi)\big)$.

Hence $tr_{ns}(tr_4(p))(\varphi) = p(\varphi) - p(\varphi \wedge \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi)$, as desired.

For the converse direction, let $\hat{p} \in P_4$. We have to show that $tr_4(tr_{ns}(\hat{p})) = \hat{p}$. For this, let $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ and denote $\hat{p}(\psi)$ by $(b_\psi, d_\psi, u_\psi, c_\psi)$ for any $\psi \in \mathcal{L}_{\mathsf{Prop}}$. By axioms (D3) and (D5) we have that $b_{\neg\varphi} = d_\varphi$, $c_{\neg\varphi} = c_\varphi$, $b_{\varphi \wedge \neg\varphi} = 0$ and $c_{\varphi \wedge \neg\varphi} = c_\varphi$. Hence, the values of $tr_{ns}(\hat{p})(\varphi)$, $tr_{ns}(\hat{p})(\neg\varphi)$ and $tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi)$ are $b_\varphi + c_\varphi$, $d_\varphi + c_\varphi$ and $c_\varphi$ respectively. We then get that

$tr_4(tr_{ns}(\hat{p}))(\varphi)$
$= \big(tr_{ns}(\hat{p})(\varphi) - tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; tr_{ns}(\hat{p})(\neg\varphi) - tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; 1 - tr_{ns}(\hat{p})(\varphi) - tr_{ns}(\hat{p})(\neg\varphi) + tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi),\; tr_{ns}(\hat{p})(\varphi \wedge \neg\varphi)\big)$
$= \big(b_\varphi + c_\varphi - c_\varphi,\; d_\varphi + c_\varphi - c_\varphi,\; 1 - (b_\varphi + c_\varphi) - (d_\varphi + c_\varphi) + c_\varphi,\; c_\varphi\big)$
$= (b_\varphi,\; d_\varphi,\; 1 - b_\varphi - d_\varphi - c_\varphi,\; c_\varphi) = (b_\varphi, d_\varphi, u_\varphi, c_\varphi)$,

where the last equation employs (D2). Hence $tr_4(tr_{ns}(\hat{p})) = \hat{p}$, as desired. ∎

Proof of Theorem 3.
For $\varphi \in \mathcal{L}_{\mathsf{Prop}}$ denote $\hat{p}_\mu(\varphi)$ by $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$. By Definition 4, we have

$b_\varphi = \mu(|\varphi|^b_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}})$
$c_\varphi = \mu(|\varphi|^c_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}})$

Hence,

$tr_{ns}(\hat{p}_\mu)(\varphi) = b_\varphi + c_\varphi = \mu(|\varphi|^b_{\mathcal{M}}) + \mu(|\varphi|^c_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}} \setminus |\varphi|^-_{\mathcal{M}}) + \mu(|\varphi|^+_{\mathcal{M}} \cap |\varphi|^-_{\mathcal{M}}) = \mu(|\varphi|^+_{\mathcal{M}})$

By definition, the latter term is exactly $p_\mu(\varphi)$. Thus $tr_{ns} \circ \hat{p}_\mu = p_\mu$, as desired. Moreover, the latter formula implies that $tr_4 \circ tr_{ns} \circ \hat{p}_\mu = tr_4 \circ p_\mu$. By Theorem 2, we have $tr_4 \circ tr_{ns} = id_{P_4}$. Hence, the last equation reduces to $\hat{p}_\mu = tr_4 \circ p_\mu$, proving the second part of the theorem. ∎

Having shown that non-standard and four-valued probability assignments are equivalent, as witnessed by the bijection $tr_{ns} : P_4 \to P_{ns}$, we now turn our attention to the class of probability functions that are induced by probabilistic models. As it turns out, these are fully characterized by our axioms (A1)-(A3). More specifically, we will show that axioms (A1)-(A3) are a sound and complete characterization of the induced non-standard probability functions of probabilistic models. Of course, by Theorems 2 and 3, this implies that also (D1)-(D6) are a sound and complete characterization of the induced four-valued probability functions of probabilistic models. In fact, the soundness part is easy to check:

Lemma 1. Let $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ the induced non-standard probability function. Then $p_\mu$ satisfies (A1)-(A3).

Towards completeness, we will show a stronger result. Recall that completeness expresses that every $p \in P_{ns}$ is the induced non-standard probability function of some probabilistic model $\mathcal{M}$. This $\mathcal{M}$ may, however, not be unique, as $p$ may not be expressive enough to completely determine all properties of $\mathcal{M}$. As we will show, $\mathcal{M}$ is almost unique. More specifically, we determine a class $\mathcal{M}_{can}$ of canonical models such that every $p \in P_{ns}$ is the induced non-standard probability function of exactly one $\mathcal{M} \in \mathcal{M}_{can}$.

Definition 7. i) We call a probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$ canonical iff $\Sigma = \mathcal{P}(\mathsf{Lit})$ and $v^+, v^-$ satisfy

$v^+(p) = \{\sigma \in \mathcal{P}(\mathsf{Lit}) \mid p \in \sigma\}$
$v^-(p) = \{\sigma \in \mathcal{P}(\mathsf{Lit}) \mid \neg p \in \sigma\}$

ii) $\mathcal{M}_{can}$ is the set of canonical probabilistic models.

Remark: The set $\mathcal{M}_{can}$ is representative of the set of all models in the following sense: For any probabilistic model $\mathcal{M} = \langle \Sigma, \mu, v^+, v^- \rangle$, there is a unique canonical model $\mathcal{M}_c = \langle \mathcal{P}(\mathsf{Lit}), \mu_c, v^+_c, v^-_c \rangle$ and a unique function $f : \Sigma \to \mathcal{P}(\mathsf{Lit})$ such that $x \in v^\pm(p) \Leftrightarrow f(x) \in v^\pm_c(p)$ and $\mu_c(\sigma_c) = \mu(f^{-1}(\sigma_c))$ for all $\sigma_c \in \mathcal{P}(\mathsf{Lit})$. In particular, $p_\mu(\varphi) = p_{\mu_c}(\varphi)$ for all $\varphi \in \mathcal{L}_{\mathsf{Prop}}$. The main theorem of this section is:
Theorem 4. For any $p \in P_{ns}$ there is a unique canonical model $\mathcal{M}_p = \langle \mathcal{P}(\mathsf{Lit}), \mu, v^+, v^- \rangle$ with induced non-standard probability function $p_\mu$ such that $p = p_\mu$.

Corollary 1. Axioms (A1)-(A3) are sound and complete with respect to the class of induced non-standard probability functions of probabilistic models.

By Theorems 2 and 3, the previous result readily translates to the level of four-valued probability functions.

Theorem 5. For any $\hat{p} \in P_4$ there is a unique canonical model $\mathcal{M}_{\hat{p}} = \langle \mathcal{P}(\mathsf{Lit}), \mu, v^+, v^- \rangle$ with induced four-valued probability function $\hat{p}_\mu$ such that $\hat{p} = \hat{p}_\mu$.

Corollary 2. Axioms (D1)-(D6) are sound and complete with respect to the class of induced four-valued probability functions of probabilistic models.

Proof of Theorem 4.
Fix p P P ns . Let Σ “ P p Lit q and let v ˘ : Prop Ñ P p Σ q bedefined as v ` p q q “ t σ P Σ | q P σ u and v ´ p q q “ t σ P Σ | q P σ u respectively.We will construct a classic probability function µ : P p Σ q Ñ r
0; 1 s such that thecanonical model M “ x Σ , µ, v ` , v ´ y satisfies p µ “ p . It suffices to construct theunderlying probability mass function W : Σ Ñ r
0; 1 s , i.e. the function satisfying W p x q “ µ pt x uq for x P Σ. We will do so by induction on | x | for x P Σ “ P p Lit q .The construction proceeds in three steps. As an induction base, we set µ p x max q with x max the unique element in Σ with | x max | “ | Lit | . In the induction step,we define µ p x q for all x with | x | “ k ě
1, assuming that µ p y q has already beendefined for all y with | y | ą k . In the last step, finally, we define µ pHq , where H is the unique element of Σ of cardinality 0.We will need to ensure that that µ pr ϕ sq “ p p ϕ q for all ϕ P L Prop , where r ϕ s denotes the truth set of ϕ in the non-standard model x Σ , v ` , v ´ y , i.e. r ϕ s “t x Ď Lit | Ź q P x q ( L ϕ u . Note that by the normal form theorem (Theorem 1)and axiom (A2), it suffices to show this property for all ϕ P L Prop that are indisjunctive normal form. Moreover note that for any ϕ, ψ P L Prop in disjunctivenormal form, we have that µ pr ϕ _ ψ sq “ µ pr ϕ sq` µ pr ψ sq´ µ pr ϕ ^ ψ sq , as witnessedby µ pr ϕ _ ψ sq “ ÿ x ( ϕ _ ψ µ p x q “ ÿ x ( ϕ µ p x q ` ÿ x ( ψ µ p x q ´ ÿ x ( ϕ ^ ψ µ p x q“ µ pr ϕ sq ` µ pr ψ sq ´ µ pr ϕ ^ ψ sq . By (A3), hence, knowing that µ pr˚sq “ p p˚q for ˚ P t ϕ, ψ, ϕ ^ ψ u guarantees that µ pr ϕ _ ψ sq “ p p ϕ _ ψ q . It thus suffices to show that µ pr ϕ sq “ p p ϕ q whenever ϕ
12s a conjunction of literals, i.e. of the form Ź q P x q with x Ď Lit. We will showthis property to hold alongside our inductive construction.For the first step, let x max be Ź q P Lit q , the unique element in P p Σ q of max-imal cardinality. Note that t x max u is the truth set of the formula Ź q P Lit q . Wethus set W p x max q : “ p p Ź q P Lit q q . By axiom (A1) we have that 0 ď W p x max q ď
1. For the inductive step let k ě W p y q has already beendefined for all y with | y | ą k . We simultaneously define W p x q for all x Ď Lit with | x | “ k . Let such x be given. Note that the truth set of Ź q P x q is t y Ď Lit | x Ď y u . By induction assumption W p y q is already defined for all t y Ď Lit | x Ă y u . We can hence define W p x q : “ p p ľ q P x q q ´ ÿ t y Ď Lit | x Ă y u W p y q . (1)We have that W p x q ď p p Ź q P x q q and thus W p x q ď
1. On the other hand, note that {y ⊆ Lit | x ⊂ y} is the truth set of ⋁_{y⊃x} ⋀_{q∈y} q. Hence, by induction assumption,

Σ_{y⊃x} W(y) = p(⋁_{y⊃x} ⋀_{q∈y} q).  (2)

Moreover, note that ⋁_{y⊃x} ⋀_{q∈y} q ⊨ ⋀_{q∈x} q and hence, by (A2), p(⋁_{y⊃x} ⋀_{q∈y} q) ≤ p(⋀_{q∈x} q). Combining this inequality with (1) and (2) yields W(x) ≥ 0. Thus W(x) is already defined for all x ≠ ∅. We then set W(∅) = 1 − Σ_{x≠∅} W(x). It follows immediately that Σ_{x∈Σ} W(x) = 1. Moreover, by our induction, W(x) ≥ 0 for x ≠ ∅, hence W(∅) ≤ 1. On the other hand, note that {x ⊆ Lit | x ≠ ∅} is the truth set of ⋁_{q∈Lit} q. By induction assumption, Σ_{x≠∅} W(x) = p(⋁_{q∈Lit} q), and hence, by axiom (A1), W(∅) ≥ 0. By construction, µ([ϕ]) = p(ϕ) for all ϕ of the form ⋀_{q∈x} q for some x ⊆ Lit. By the above remark, this ensures that µ([ϕ]) = p(ϕ) for all ϕ, i.e. that p_µ = p.

To end the static part of this paper, we provide a graphical overview of the relationships identified so far. By Theorems 2 to 5, the diagram in Figure 1 commutes. Moreover, each pair of opposite arrows in the upper half of the diagram, i.e. the pairs (tr_ns, tr_ns), (p ↦ M_p, µ ↦ p_µ) and (p̂ ↦ M_p̂, µ ↦ p̂_µ), are left and right inverses of each other.

[Figure 1: The relationships identified so far. By Theorems 2 to 5, this diagram commutes.]

In a classic setting, Bayesian conditioning on a formula ϕ describes a situation where ϕ is learned to be true with probability 1, and hence ¬ϕ true with probability 0. A generalization of this rule is Jeffrey conditioning, where an agent may learn the probability of ϕ to be any value q ∈ [0,1], rather than only the extremal value of 1 (or 0, when ¬ϕ is learned) permitted in Bayes' conditioning.

Either method is best illustrated semantically. Within a classical setting, any formula ϕ defines a binary partition {[ϕ], [¬ϕ]} on the state space, cf. Figure 2.

[Figure 2: Classic conditioning]

Jeffrey conditioning is then executed by linearly expanding or contracting the original measure µ on [ϕ] and [¬ϕ] to some new µ̄ in such a way that µ̄([ϕ]) = q and µ̄([¬ϕ]) = 1 − q. We hence get for any ψ ∈ L_Prop that

µ̄([ψ]) = µ([ψ∧ϕ])·q/µ([ϕ]) + µ([ψ∧¬ϕ])·(1−q)/µ([¬ϕ])  (3)

which, in the case of Bayesian conditioning (i.e. q = 1), reduces to the well-known formula µ̄([ψ]) = µ([ψ∧ϕ])/µ([ϕ]).

Conditionalization in our extended setting follows a similar idea. Note, however, that both Bayes' and Jeffrey conditioning implicitly rest on the facts that p(ϕ) + p(¬ϕ) = 1 and p(ϕ∧¬ϕ) = 0, i.e. that there are no gaps and gluts. As this fact no longer holds, conditioning will behave differently in a non-standard setting. In fact, we will show that non-standard probabilities allow for two different notions of Jeffrey updating: one where a new value for the probability of ϕ, i.e. p(ϕ), is learned, and one where a new value of the four-valued vector p̂(ϕ) is acquired. The former version of Jeffrey updating is best described on the level of non-standard probability assignments, the latter on the level of four-valued assignments. Yet, using the translation maps (tr_ns and its inverse), both versions of updating can naturally be applied to either non-standard or four-valued probability assignments.

Just as in the standard case, non-standard Bayes conditioning can be defined as an extremal case of Jeffrey updates. In fact, non-standard Bayes conditioning has been studied independently, for instance in Mares (1997). The current framework generalizes the latter's approach by also incorporating Jeffrey updating and by identifying a number of different Bayes-like updates, containing the one put forward by Mares.

In our first notion of updating, the agent's update prescribes that she set the probability of ϕ to some q ∈ [0,1]. Notably, within a non-standard setting, this does not carry any information about the value of ¬ϕ; the agent may or may not leave p(¬ϕ) unchanged in her update. In line with classic Jeffrey updating, non-standard Jeffrey updating is best illustrated semantically. For any formula ϕ ∈ L_Prop, we can dissect the state space of a probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩ into two sets: the truth set [ϕ] of ϕ and its complement Σ∖[ϕ]. Unlike in the classic case, however, Σ∖[ϕ] is not the truth set of ¬ϕ, nor of any other ψ ∈ L_Prop. Yet, we can define Jeffrey updating as in the classic case.
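The semantic recipe just described, linearly rescaling the measure on a truth set and on its complement, can be sketched numerically. The encoding below is a toy illustration of ours (measures as dictionaries from states to weights, events as sets of states), not the paper's notation.

```python
# Jeffrey conditioning: rescale the measure on the event to total mass q
# and on its complement to total mass 1 - q.
def jeffrey(mu, phi, q):
    m_phi = sum(w for s, w in mu.items() if s in phi)
    return {s: w * (q / m_phi if s in phi else (1 - q) / (1 - m_phi))
            for s, w in mu.items()}

def measure(mu, event):
    return sum(w for s, w in mu.items() if s in event)

mu = {"s1": 0.2, "s2": 0.3, "s3": 0.5}   # toy state space
phi = {"s1", "s2"}                        # the truth set [phi]
updated = jeffrey(mu, phi, 0.8)
# After the update the event carries probability 0.8; with q = 1 the rule
# reduces to Bayesian conditioning on the event.
bayes = jeffrey(mu, phi, 1.0)
```

Taking q = 1 in the sketch recovers the Bayesian special case discussed above.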
Definition 8.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model. Let q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1). Then the semantic non-standard Jeffrey update for updating the probability of ϕ to q on M is the probabilistic model M_{ϕ,q} = ⟨Σ, µ_{ϕ,q}, v⁺, v⁻⟩ determined by:

µ_{ϕ,q}({x}) = µ({x})·q/µ([ϕ])            if x ∈ [ϕ]
µ_{ϕ,q}({x}) = µ({x})·(1−q)/(1−µ([ϕ]))    else.

Fact 1.
Non-standard Jeffrey updating is successful, i.e. for any probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩, any q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1), the non-standard Jeffrey update on M updating the probability of ϕ to q satisfies µ_{ϕ,q}([ϕ]) = q.

Despite the fact that the set Σ∖[ϕ] is not definable, we can give a syntactic characterization of non-standard Jeffrey updating. The following is a non-standard equivalent of classic Jeffrey updating, cf. Formula (3).

Lemma 2. Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model. Let q ∈ [0,1] and ϕ ∈ L_Prop such that µ([ϕ]) ∈ (0,1). Then for any ψ ∈ L_Prop, the non-standard Jeffrey update M_{ϕ,q} = ⟨Σ, µ_{ϕ,q}, v⁺, v⁻⟩ of M satisfies:

µ_{ϕ,q}([ψ]) = µ([ψ∧ϕ])·q/µ([ϕ]) + (µ([ψ]) − µ([ψ∧ϕ]))·(1−q)/(1−µ([ϕ]))

Notably, after translating the previous fact into its induced non-standard probability assignments p_µ and p_{µ_{ϕ,q}}, we obtain a fully syntactic characterization of non-standard Jeffrey updating.

Definition 9.
Let p : L_Prop → R be a non-standard probability assignment, let q ∈ [0,1] and ϕ ∈ L_Prop with p(ϕ) ∈ (0,1). Then the syntactic non-standard Jeffrey update setting the probability of ϕ to q is the probability function p_{ϕ,q} : L_Prop → R defined by

p_{ϕ,q}(ψ) = p(ψ∧ϕ)·q/p(ϕ) + (p(ψ) − p(ψ∧ϕ))·(1−q)/(1−p(ϕ))

By construction, semantic and syntactic non-standard Jeffrey updating coincide in the following sense.
Fact 2.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let q ∈ [0,1] and ϕ ∈ L_Prop with p_µ(ϕ) ∈ (0,1). Then p_{µ_{ϕ,q}} = (p_µ)_{ϕ,q}.

We will hence omit the labels and only speak of non-standard Jeffrey updating. We end this section with three facts about non-standard Jeffrey updating.
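Definition 9 can be transcribed directly. The function below (names are ours) computes the updated value of ψ from the three inputs p(ψ), p(ψ∧ϕ) and p(ϕ); it also shows the syntactic face of Fact 1, since taking ψ = ϕ (so that p(ψ∧ϕ) = p(ϕ)) returns exactly q.

```python
# Syntactic non-standard Jeffrey update (Definition 9): the mass of psi
# inside [phi] is rescaled to q, the remaining mass of psi to 1 - q.
def ns_jeffrey(p_psi, p_psi_and_phi, p_phi, q):
    return (p_psi_and_phi * q / p_phi
            + (p_psi - p_psi_and_phi) * (1 - q) / (1 - p_phi))

# Success: updating phi itself always yields the prescribed value q.
new_p_phi = ns_jeffrey(0.4, 0.4, 0.4, 0.7)
# With q = 1 the rule collapses to the Bayesian formula p(psi & phi)/p(phi).
bayes_value = ns_jeffrey(0.3, 0.2, 0.4, 1.0)
```

Note that the sketch takes the relevant probabilities as plain numbers; nothing about ¬ϕ is consulted, matching the discussion above.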
Fact 3.
Assume that the non-standard probability function p : L_Prop → R is classic, i.e. satisfies the Kolmogorov axioms. Moreover, let ϕ ∈ L_Prop with p(ϕ) ∈ (0,1) and q ∈ [0,1]. Then the non-standard and the classic Jeffrey update for setting the probability of ϕ to q coincide, i.e. for all ψ ∈ L_Prop

p_{ϕ,q}(ψ) = p(ψ∧ϕ)·q/p(ϕ) + p(ψ∧¬ϕ)·(1−q)/p(¬ϕ).

From this, it follows directly that
Fact 4.
Non-standard Jeffrey updating is not commutative. That is, there is a non-standard probability function p : L_Prop → R, formulas ϕ, ψ ∈ L_Prop and q, r ∈ [0,1] with p(ϕ), p(ψ), p_{ϕ,q}(ψ), p_{ψ,r}(ϕ) ∈ (0,1) such that (p_{ϕ,q})_{ψ,r} ≠ (p_{ψ,r})_{ϕ,q}.

Non-standard Bayesian updating
Just as in the classic case, we will define non-standard Bayesian updating as a special case of non-standard Jeffrey updating where the probability of ϕ is set to 1. In this case, the formula of Definition 9 simplifies to the same formula as in the classical case. Note that this is also the first of two approaches to Bayes updating proposed by Mares (1997). The second proposal by Mares, in contrast, is not related to any version of Bayes updating presented here, as it strives to actively minimize conflict.

Definition 10. Let p : L_Prop → R be a non-standard probability function and let ϕ ∈ L_Prop with p(ϕ) > 0. Then the (positive) non-standard Bayesian update on ϕ is the function p_{ϕ,pos} given by

p_{ϕ,pos}(ψ) = p(ψ∧ϕ)/p(ϕ)    for ψ ∈ L_Prop.

Unlike in the classical setting, however, non-standard Bayesian updating does not cover all extremal cases. Setting the probability of ϕ to 0 is not the same as setting the probability of ¬ϕ to 1, hence this case needs to be treated separately.

Definition 11.
Let p : L_Prop → R be a non-standard probability function and let ϕ ∈ L_Prop with p(ϕ) < 1. Then the negative non-standard Bayesian update on ϕ is the function p_{ϕ,neg} given by

p_{ϕ,neg}(ψ) = (p(ψ) − p(ψ∧ϕ))/(1 − p(ϕ))    for ψ ∈ L_Prop.

As their classic counterpart, positive and negative non-standard Bayesian conditioning are order independent:
Lemma 3.
Let p : L_Prop → R and let ϕ, ψ ∈ L_Prop with p(ϕ), p(ψ), p_ϕ(ψ), p_ψ(ϕ) ∈ (0,1). Then (p_{ϕ,∗})_{ψ,†} = (p_{ψ,†})_{ϕ,∗} for ∗, † ∈ {pos, neg}.

Within non-standard probability, knowing the probability of ϕ does not provide any information about the probability of ¬ϕ. Hence, in learning about ϕ, two cases are to be distinguished. In the first case, the agent only receives information about ϕ, without learning anything about ¬ϕ or ϕ∧¬ϕ. In the second case, the agent learns the full probabilistic information about ϕ, that is, the probabilities of ϕ and ¬ϕ, but also the size of the corresponding gap and glut. As discussed above, this information can be encoded in a vector (b, d, u, c) ∈ R⁴ specifying the new pure belief (i.e. belief without conflict), pure disbelief (belief in ¬ϕ without conflict), uncertainty and conflict about ϕ.

Again, the notion of four-valued Jeffrey updating is best illustrated semantically. As shown in Figure 3, for any ϕ ∈ L_Prop, the sets of pure belief, pure disbelief, uncertainty and conflict about ϕ jointly form a partition ([ϕ]∖[ϕ∧¬ϕ], [¬ϕ]∖[ϕ∧¬ϕ], Σ∖[ϕ∨¬ϕ], [ϕ∧¬ϕ]) of a probabilistic model M. Hence, a similar idea as in classic Jeffrey updating can be applied, linearly expanding or shrinking the measure on each of these four cells to its appropriate size. Notably, linear expansion (to a larger size) is only well defined if the cell to be expanded has a strictly positive measure. We capture this with the notion of admissibility of a vector (b, d, u, c):

Definition 12.
Let M = ⟨Σ, µ, v⁺, v⁻⟩, let ϕ ∈ L_Prop and denote p̂_µ(ϕ) by (b_ϕ, d_ϕ, u_ϕ, c_ϕ). We call a vector (b, d, u, c) ∈ [0,1]⁴ with b + d + u + c = 1 admissible for ϕ if it satisfies that b = 0 whenever b_ϕ = 0, d = 0 whenever d_ϕ = 0, u = 0 whenever u_ϕ = 0, and c = 0 whenever c_ϕ = 0.

[Figure 3: Four-valued conditioning]
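Before moving to the four-valued case, the order independence of Lemma 3 can be checked on a small semantic model. The one-atom encoding below is our own illustration (one state per cell of the partition just described); semantically, the positive and negative updates of Definitions 10 and 11 amount to conditioning the measure onto, or away from, a truth set.

```python
from fractions import Fraction

# One atom p, one state per cell: "t" p is true only, "f" false only,
# "b" both true and false (glut), "n" neither (gap).
mu = {"t": Fraction(3, 10), "f": Fraction(2, 10),
      "b": Fraction(4, 10), "n": Fraction(1, 10)}
P = {"t", "b"}        # truth set [p]
NOT_P = {"f", "b"}    # truth set [not p]

def prob(mu, event):
    return sum(mu[s] for s in event)

def pos_update(mu, event):
    """Positive non-standard Bayes (cf. Definition 10): keep only the
    event's mass and rescale it to total 1."""
    z = prob(mu, event)
    return {s: (w / z if s in event else Fraction(0)) for s, w in mu.items()}

def neg_update(mu, event):
    """Negative non-standard Bayes (cf. Definition 11): discard the
    event's mass and rescale the rest to total 1."""
    z = 1 - prob(mu, event)
    return {s: (Fraction(0) if s in event else w / z) for s, w in mu.items()}

# Order independence in the spirit of Lemma 3: a positive update on [p]
# and a negative update on [not p] commute on this model.
lhs = neg_update(pos_update(mu, P), NOT_P)
rhs = pos_update(neg_update(mu, NOT_P), P)
```

Exact rational arithmetic (`Fraction`) makes the equality of the two orders an exact check rather than a floating-point comparison.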
Definition 13.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then the four-valued Jeffrey update on ϕ to (b, d, u, c) is the model M_{ϕ,(b,d,u,c)} = ⟨Σ, µ_{ϕ,(b,d,u,c)}, v⁺, v⁻⟩ with:

µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·b/(µ([ϕ]) − µ([ϕ∧¬ϕ]))     if x ∈ [ϕ]∖[ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·d/(µ([¬ϕ]) − µ([ϕ∧¬ϕ]))    if x ∈ [¬ϕ]∖[ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·c/µ([ϕ∧¬ϕ])                if x ∈ [ϕ∧¬ϕ]
µ_{ϕ,(b,d,u,c)}({x}) = µ({x})·u/(1 − µ([ϕ∨¬ϕ]))          else

Fact 5.
Four-valued Jeffrey updating is successful, i.e. for any probabilistic model M = ⟨Σ, µ, v⁺, v⁻⟩, any ϕ ∈ L_Prop and any (b, d, u, c) ∈ [0,1]⁴ that is admissible for ϕ, the four-valued Jeffrey update on M setting the probability of ϕ to (b, d, u, c) satisfies p̂_{µ_{ϕ,(b,d,u,c)}}(ϕ) = (b, d, u, c).

Just as in the case of non-standard Jeffrey conditioning, we obtain a purely syntactic characterization of four-valued Jeffrey updating. Unfortunately, the drop in elegance with respect to standard Jeffrey updating is significant.
Lemma 4.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then the four-valued Jeffrey update setting the probability of ϕ to (b, d, u, c) satisfies for any ψ ∈ L_Prop that

b̄_ψ = (b/b_ϕ)·b_{ϕ,ψ} + (d/d_ϕ)·b_{¬ϕ,ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

d̄_ψ = (b/b_ϕ)·b_{ϕ,¬ψ} + (d/d_ϕ)·b_{¬ϕ,¬ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

ū_ψ = (b/b_ϕ)·(b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (d/d_ϕ)·(d_ϕ − b_{¬ϕ,ψ} − b_{¬ϕ,¬ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})

c̄_ψ = (b/b_ϕ)·(c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ}) + (d/d_ϕ)·(c_{¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·c_{ϕ,¬ϕ,ψ,¬ψ}

where (b_ψ, d_ψ, u_ψ, c_ψ) and (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) denote the four-valued probability vector of ψ before and after the update. In the above equations, a list of formulas in a subscript abbreviates their conjunction, e.g. b_{ϕ,ψ} stands for b_{ϕ∧ψ}. For ease of notation, the formulas use the convention that 0/0 = 0.

Proof. Consider the propositions ϕ and ψ as well as the labeling of areas in the top row of Figure 4. By definition of updating, the mass of areas 1-4 needs to be multiplied by b/b_ϕ, that of areas 5-8 by d/d_ϕ, the weight of areas 9-12 by u/u_ϕ and that of areas 13-16 by c/c_ϕ. Moreover, the agent's pure belief in ψ, i.e. b̄_ψ, is the joint mass of areas 1, 5, 9 and 13, her disbelief in ψ the joint mass of areas 2, 6, 10 and 14, her uncertainty the joint weight of areas 3, 7, 11 and 15, and her conflict the sum of areas 4, 8, 12 and 16.

To check correctness of the above equations, it then suffices to verify that the formulas pick out the respective fields, i.e. that b_{ϕ,ψ} is the size of field 1, b_{¬ϕ,ψ} the size of field 5, d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ} the size of field 9, and so on. That this is the case follows from the pictures in Figure 4, showing the belief and disbelief sets for certain composites of ϕ and ψ. Again, the latter set of equations can be read purely syntactically.
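Semantically, the update behind these equations is just the cell-wise rescaling of Definition 13. The sketch below (our own toy encoding, tagging each state of a one-atom model with its cell in the partition of Figure 3) also verifies the success condition of Fact 5 on this example.

```python
from fractions import Fraction

# Cells: "pb" pure belief, "pd" pure disbelief, "un" uncertainty, "cf" conflict.
cell = {"s1": "pb", "s2": "pb", "s3": "pd", "s4": "un", "s5": "cf"}
mu = {"s1": Fraction(1, 10), "s2": Fraction(2, 10), "s3": Fraction(3, 10),
      "s4": Fraction(1, 10), "s5": Fraction(3, 10)}

def four_valued_jeffrey(mu, cell, target):
    """Cell-wise rescaling (cf. Definition 13): every state is scaled by
    new cell mass / old cell mass, so ratios within a cell are preserved."""
    names = ("pb", "pd", "un", "cf")
    old = {k: sum(w for s, w in mu.items() if cell[s] == k) for k in names}
    new = dict(zip(names, target))
    return {s: w * new[cell[s]] / old[cell[s]] for s, w in mu.items()}

# Prescribe a new vector (b, d, u, c); every component of the prior is
# positive here, so any vector summing to 1 is admissible.
target = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))
updated = four_valued_jeffrey(mu, cell, target)
# Success: the updated cell masses are exactly (b, d, u, c).
```

The design choice mirrors the text: only the total mass of each cell is prescribed, while the relative weights inside a cell are left untouched.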
Thus, we get a syntactic counterpart to semantic four-valued Jeffrey updates.

Definition 14.
Let p̂ : L_Prop → R⁴ be a four-valued probability function and let ϕ ∈ L_Prop. Moreover, let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then (syntactic) four-valued Jeffrey updating with the vector (b, d, u, c) yields a four-valued probability function p̂_{ϕ,(b,d,u,c)} defined by p̂_{ϕ,(b,d,u,c)}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with:

b̄_ψ = (b/b_ϕ)·b_{ϕ,ψ} + (d/d_ϕ)·b_{¬ϕ,ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

d̄_ψ = (b/b_ϕ)·b_{ϕ,¬ψ} + (d/d_ϕ)·b_{¬ϕ,¬ψ}
      + (u/u_ϕ)·(d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})

ū_ψ = (b/b_ϕ)·(b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (d/d_ϕ)·(d_ϕ − b_{¬ϕ,ψ} − b_{¬ϕ,¬ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·(c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})

c̄_ψ = (b/b_ϕ)·(c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ}) + (d/d_ϕ)·(c_{¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ})
      + (u/u_ϕ)·(c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})
      + (c/c_ϕ)·c_{ϕ,¬ϕ,ψ,¬ψ}

[Figure 4: Belief and disbelief sets of ϕ (top left), ψ (top center) and various combinations thereof. Belief sets are dotted, disbelief sets shaded. The diagrams fall into 16 sections that are labelled as shown on the top right.]

By construction, semantic and syntactic four-valued Jeffrey updating coincide in the following sense.
Fact 6.
Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a probabilistic model, let ϕ ∈ L_Prop and let (b, d, u, c) ∈ [0,1]⁴ be admissible for ϕ. Then p̂_{µ_{ϕ,(b,d,u,c)}} = (p̂_µ)_{ϕ,(b,d,u,c)}.

We will hence omit the distinction between semantic and syntactic and only speak of four-valued Jeffrey updating. We end this section with three facts about this updating.
Fact 7.
Assume that the four-valued probability function p̂ : L_Prop → R⁴ is classic, i.e. u_ψ = c_ψ = 0 for all ψ ∈ L_Prop. Moreover, let ϕ ∈ L_Prop and let (b, d, 0, 0) ∈ [0,1]⁴ be admissible for ϕ, i.e. b = 0 if p(ϕ) = 0 and d = 0 if p(¬ϕ) = 0. Then the four-valued and the classic Jeffrey update setting the probability of ϕ to b coincide, i.e. for all ψ ∈ L_Prop

p̂_{ϕ,(b,d,0,0)}(ψ) = ( b_{ψ∧ϕ}·b/b_ϕ + b_{ψ∧¬ϕ}·d/d_ϕ , 1 − b_{ψ∧ϕ}·b/b_ϕ − b_{ψ∧¬ϕ}·d/d_ϕ , 0 , 0 )

From this, it follows directly that
Fact 8.
Four-valued Jeffrey updating is not commutative. That is, there is a four-valued probability function p̂ : L_Prop → R⁴, some ϕ, ψ ∈ L_Prop and (b, d, u, c), (b′, d′, u′, c′) ∈ [0,1]⁴ such that (b, d, u, c) is admissible for ϕ in p̂ and in p̂_{ψ,(b′,d′,u′,c′)}, while (b′, d′, u′, c′) is admissible for ψ in both p̂ and p̂_{ϕ,(b,d,u,c)}, such that

(p̂_{ϕ,(b,d,u,c)})_{ψ,(b′,d′,u′,c′)} ≠ (p̂_{ψ,(b′,d′,u′,c′)})_{ϕ,(b,d,u,c)}

Four-valued Bayesian updating

Just as in the classical case, we can define four-valued Bayesian updating as a special instance of Jeffrey updating where the information acquired is extremal. Here, we focus on three cases. In the first, the agent learns the vector (1,0,0,0), i.e. she acquires full pure belief in ϕ. In the second and third case, the agent learns the vectors (0,0,1,0) or (0,0,0,1) respectively, acquiring full uncertainty or full conflict about ϕ. The remaining case, learning (0,1,0,0), follows from these, as it corresponds to updating on information (1,0,0,0) about ¬ϕ. In each of our three cases, the above definition of four-valued Jeffrey updating simplifies to:

Definition 15. (i) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that b_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then positive four-valued Bayesian updating on ϕ yields the function p̂_{ϕ,+} defined by p̂_{ϕ,+}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = b_{ϕ,ψ}/b_ϕ
d̄_ψ = b_{ϕ,¬ψ}/b_ϕ
ū_ψ = (b_ϕ − b_{ϕ,ψ} − b_{ϕ,¬ψ} − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ})/b_ϕ
c̄_ψ = (c_{ϕ,ψ} − c_{ϕ,¬ϕ,ψ})/b_ϕ

(ii) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that u_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then uncertainty Bayesian updating about ϕ is defined as p̂_{ϕ,u}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = (d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
d̄_ψ = (d_{ϕ,¬ϕ,ψ,¬ψ} − d_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
ū_ψ = (1 − d_{ϕ,¬ϕ,ψ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ
c̄_ψ = (c_ψ − c_{ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{¬ϕ,ψ} + c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/u_ϕ

(iii) Let p̂ : L_Prop → R⁴ be a four-valued probability function such that c_ϕ > 0, where p̂(ϕ) = (b_ϕ, d_ϕ, u_ϕ, c_ϕ). Then conflict Bayesian updating about ϕ is defined as p̂_{ϕ,c}(ψ) = (b̄_ψ, d̄_ψ, ū_ψ, c̄_ψ) with

b̄_ψ = (c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
d̄_ψ = (c_{ϕ,¬ϕ,¬ψ} − c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
ū_ψ = (c_ϕ − c_{ϕ,¬ϕ,ψ} − c_{ϕ,¬ϕ,¬ψ} + c_{ϕ,¬ϕ,ψ,¬ψ})/c_ϕ
c̄_ψ = c_{ϕ,¬ϕ,ψ,¬ψ}/c_ϕ

Just as its classic counterpart, four-valued Bayesian conditioning in all three flavors is order independent:
Lemma 5.
Let p̂ : L_Prop → R⁴ and let ϕ, ψ ∈ L_Prop such that p̂_{ϕ,a}, p̂_{ψ,b}, (p̂_{ϕ,a})_{ψ,b} and (p̂_{ψ,b})_{ϕ,a} are all defined. Then (p̂_{ϕ,a})_{ψ,b} = (p̂_{ψ,b})_{ϕ,a} for a, b ∈ {+, u, c}.

Using the translation functions between non-standard and four-valued assignments, both notions of Jeffrey conditioning, non-standard and four-valued, work on both types of probability functions defined. However, the notions of updating do not correspond to each other. While non-standard Jeffrey conditioning applies to situations where only the probability of ϕ is set, without any mention of the probabilities of ¬ϕ or ϕ∧¬ϕ, four-valued Jeffrey conditioning covers cases where new probabilities of ϕ, ¬ϕ and the corresponding gap and glut are all prescribed simultaneously. Hence, even after appropriate transformations of their domains, the two types of Jeffrey updates are not interdefinable. This, however, changes if we move to non-standard and four-valued Bayesian updating. Each of the three types of four-valued Bayesian updating is equivalent to a composition of two steps of non-standard Bayesian updating. Moreover, the order of these two steps does not matter.

Lemma 6.
Let p̂ : L_Prop → R⁴ be a four-valued probability assignment and let ϕ ∈ L_Prop.

(i) if b_ϕ > 0, then tr_ns(p̂_{ϕ,+}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,pos}
(ii) if u_ϕ > 0, then tr_ns(p̂_{ϕ,u}) = (tr_ns(p̂)_{ϕ,neg})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,neg}
(iii) if c_ϕ > 0, then tr_ns(p̂_{ϕ,c}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,pos} = (tr_ns(p̂)_{¬ϕ,pos})_{ϕ,pos}

Proof. (i) By Theorem 5, there is a unique canonical model M = ⟨P(Lit), µ, v⁺, v⁻⟩ such that p̂_µ = p̂. By Facts 2 and 6, it hence suffices to show the claim for semantic four-valued Jeffrey updating on M. Note that the result of positive Bayesian updating, i.e. the updated measure µ_{ϕ,(1,0,0,0)} of M_{ϕ,(1,0,0,0)} = ⟨P(Lit), µ_{ϕ,(1,0,0,0)}, v⁺, v⁻⟩, is uniquely determined by the conditions

(1) µ_{ϕ,(1,0,0,0)}(x) = 0 for x ∉ [ϕ]∖[ϕ∧¬ϕ]
(2) µ_{ϕ,(1,0,0,0)}(x)/µ_{ϕ,(1,0,0,0)}(y) = µ(x)/µ(y) whenever x, y ∈ [ϕ]∖[ϕ∧¬ϕ] with µ(y) > 0.

µ_{ϕ,pos} and µ_{¬ϕ,neg} both satisfy (2). Moreover, µ_{ϕ,pos}(x) = 0 for x ∉ [ϕ] and µ_{¬ϕ,neg}(x) = 0 for x ∈ [¬ϕ]. Thus both (µ_{ϕ,pos})_{¬ϕ,neg} and (µ_{¬ϕ,neg})_{ϕ,pos} also satisfy (1). Hence, both (µ_{ϕ,pos})_{¬ϕ,neg} and (µ_{¬ϕ,neg})_{ϕ,pos} satisfy conditions (1) and (2) and, hence, are identical to µ_{ϕ,(1,0,0,0)}. This implies that tr_ns(p̂_{ϕ,+}) = (tr_ns(p̂)_{ϕ,pos})_{¬ϕ,neg} = (tr_ns(p̂)_{¬ϕ,neg})_{ϕ,pos}. The proofs of (ii) and (iii) follow similarly.

In the previous sections we investigated updating a probability function with a generalized Jeffrey rule by learning either only a new value for the belief in ϕ (Section 7.1) or the entire four-valued probability vector assigned to ϕ (Section 7.2). However, there may be other contexts where the agent acquires partial information about the (four-valued) probability of ϕ, e.g. only a new value for pure belief or pure disbelief in ϕ. The idea for conditioning on partial information proceeds along the same lines as for complete information, i.e. by a modified version of Jeffrey conditioning.
The only difference is that the partiality of the information about ϕ does not permit us to work with the full partition induced by ϕ on a model M, i.e. the partition {[ϕ]∖[ϕ∧¬ϕ], [¬ϕ]∖[ϕ∧¬ϕ], Σ∖[ϕ∨¬ϕ], [ϕ∧¬ϕ]}, cf. Figure 3, but with a coarsening thereof.

By obtaining partial information we mean that the agent learns the values of a partial assignment a : {b, d, u, c} ⇀ [0,1], i.e. an assignment prescribing new values for some of the agent's pure belief, pure disbelief, uncertainty and conflict, but not necessarily for all. Let us denote the domain of a, i.e. those x ∈ {b, d, u, c} for which a(x) is defined, by dom(a). For simplicity, we assume that ∅ ⊂ dom(a) ⊂ {b, d, u, c} with both inclusions strict. Following the same intuitions as in the four-valued case, we can define conditioning on the partial information a by setting the new pure belief, disbelief, uncertainty and conflict in ϕ to be a(b), a(d), a(u) and a(c) respectively whenever this is defined, and afterwards rescaling the probabilistic mass on the remaining area appropriately. Formally, to ensure that the corresponding operation is well defined, we need to assume that Σ_{y∈dom(a)} a(y) ≤ 1. Denoting the prior four-valued probability vector of ϕ by (b_ϕ, d_ϕ, u_ϕ, c_ϕ), the Jeffrey updating sketched above will lead to the posterior four-valued probability vector (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ) with:

x̄_ϕ = a(x)                                                       if x ∈ dom(a)
x̄_ϕ = x_ϕ · (1 − Σ_{y∈dom(a)} a(y))/(1 − Σ_{y∈dom(a)} y_ϕ)       else

for x ∈ {b, d, u, c}. With this, we can formally define partial Jeffrey updating.

Definition 16. Let a : {b, d, u, c} ⇀ [0,1] be a partial assignment such that Σ_{y∈dom(a)} a(y) ≤ 1. Let M = ⟨Σ, µ, v⁺, v⁻⟩ be a model, let ϕ ∈ L_Prop and let the vector (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ) defined above be admissible for ϕ. Then the four-valued Jeffrey update of ϕ on the partial information a is defined as the four-valued Jeffrey update on ϕ to (b̄_ϕ, d̄_ϕ, ū_ϕ, c̄_ϕ).

Assume two agents inform you about their credences in ϕ. You take both agents to be similarly competent and equally informed. Yet, they equip you with different assessments of ϕ. How, then, should you combine these judgments towards forming your own belief about ϕ? Within standard probability theory, your options are fairly limited. You may, for instance, decide to follow one of the agents, or build a weighted average between the two. A broad number of approaches in the literature on peer disagreement, for instance, promotes splitting the difference equally; see for instance Elga (2007); Christensen (2007) on conciliationism, but also Kelly (2010) for an opposing opinion.

Within the non-standard probabilities studied here, further options open up. First, note that within classic probability theory, learning about an agent's credence in ψ also informs us about her degree of belief in ¬ψ. This does not hold true within the current non-standard setting. Hence, let us assume for the current analysis that agents inform us about both their positive and negative attitude towards ϕ, that is, about p(ϕ) and p(¬ϕ), or even about their four-valued vector p̂(ϕ). Of course, we may follow the previous strategies and form weighted averages between the agents' assessments of ϕ. If needed, this policy could be specified to also take a weighted average of the agents' conflict and uncertainty and, more generally, their remaining belief set.

Definition 17.
Let k ∈ [0,1].

(i) Assume agents A and E provide their non-standard assessments of ϕ, i.e. p_A(ϕ), p_E(ϕ), p_A(¬ϕ) and p_E(¬ϕ). Then their k-weighted non-standard aggregate belief p^k_{A,E} is defined by p^k_{A,E}(ϕ) = k·p_A(ϕ) + (1−k)·p_E(ϕ) and p^k_{A,E}(¬ϕ) = k·p_A(¬ϕ) + (1−k)·p_E(¬ϕ).

(ii) For agents A and E's four-valued probability assessments (b,d,u,c)_A and (b,d,u,c)_E for ϕ, i.e. p̂_A(ϕ) and p̂_E(ϕ), their k-weighted four-valued aggregate belief p̂^k_{A,E} is:

p̂^k_{A,E}(ϕ) = k·p̂_A(ϕ) + (1−k)·p̂_E(ϕ).

Lemma 7. Weighted averaging can be applied to an entire belief base simultaneously. That is, when agents A and E both provide their full subjective non-standard probability functions p_A, p_E : L_Prop → R (resp. p̂_A, p̂_E : L_Prop → R⁴), a weighted average belief p^k_{A,E} : L_Prop → R can be defined by k·p_A + (1−k)·p_E. Likewise, p̂^k_{A,E} : L_Prop → R⁴ can be defined by k·p̂_A + (1−k)·p̂_E. Moreover, these policies commute with tr_ns, that is tr_ns(p̂^k_{A,E}) = p^k_{A,E} and tr_ns(p^k_{A,E}) = p̂^k_{A,E}.

              p_{A,E}(ϕ)                       p_{A,E}(¬ϕ)
k-weighted    k·p_A(ϕ) + (1−k)·p_E(ϕ)         k·p_A(¬ϕ) + (1−k)·p_E(¬ϕ)
credulous     max(p_A(ϕ), p_E(ϕ))             max(p_A(¬ϕ), p_E(¬ϕ))
cautious      min(p_A(ϕ), p_E(ϕ))             min(p_A(¬ϕ), p_E(¬ϕ))
optimist      max(p_A(ϕ), p_E(ϕ))             min(p_A(¬ϕ), p_E(¬ϕ))
pessimist     min(p_A(ϕ), p_E(ϕ))             max(p_A(¬ϕ), p_E(¬ϕ))

Table 1: Different rules for aggregating agents A and E's non-standard beliefs in ϕ and ¬ϕ, i.e. p_A(ϕ), p_E(ϕ), p_A(¬ϕ) and p_E(¬ϕ).

Non-standard beliefs, however, allow for further aggregation policies that do not have classic counterparts.
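The rules of Table 1 are one-line operations on the pairs (p(ϕ), p(¬ϕ)); the sketch below (function names are ours) transcribes them and illustrates on a pair of classic but disagreeing inputs how credulous aggregation produces a glut and cautious aggregation a gap.

```python
# Aggregation rules of Table 1, acting on pairs (p(phi), p(not phi)).
def k_weighted(a, e, k):
    return tuple(k * x + (1 - k) * y for x, y in zip(a, e))

def credulous(a, e):
    return (max(a[0], e[0]), max(a[1], e[1]))

def cautious(a, e):
    return (min(a[0], e[0]), min(a[1], e[1]))

def optimist(a, e):
    return (max(a[0], e[0]), min(a[1], e[1]))

def pessimist(a, e):
    return (min(a[0], e[0]), max(a[1], e[1]))

# Two classic inputs (each pair sums to 1) that disagree about phi:
A, E = (0.8, 0.2), (0.4, 0.6)
# credulous(A, E) sums to more than 1 (a glut), cautious(A, E) to less
# than 1 (a gap); optimist and pessimist keep the inputs' classicality.
```

This is only an illustration of the table's entries, not of the consistency question for whole belief bases discussed next.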
Credulous agents, for instance, could opt for the maximal values of their input in terms of belief and disbelief simultaneously. That is, they could set their updated belief and disbelief in ϕ to be max(p_A(ϕ), p_E(ϕ)) and max(p_A(¬ϕ), p_E(¬ϕ)) respectively. Likewise, cautious agents may rather choose to believe and disbelieve ϕ only to an amount supported by all input information. Such agents would set their belief and disbelief in ϕ to min(p_A(ϕ), p_E(ϕ)) and min(p_A(¬ϕ), p_E(¬ϕ)) respectively.

In special situations, further policies are conceivable. When testing the safety of a new drug, for example, agents may be extremely wary of false positives while being much less concerned with false negatives. Such an agent might decide to set her new belief in ϕ to min(p_A(ϕ), p_E(ϕ)) while adopting max(p_A(¬ϕ), p_E(¬ϕ)) as her new disbelief in ϕ. Likewise, the combination of max(p_A(ϕ), p_E(ϕ)) with min(p_A(¬ϕ), p_E(¬ϕ)) is conceivable. In some sense, the latter two policies are aggregation functions that minimize type I and type II errors. For lack of a better name, we call these the pessimist and optimist updating rules respectively. See Table 1 for an overview.

Unlike weighted average, none of these four policies can be applied to an entire belief set simultaneously.

Fact 9.
Let p_A and p_E be such that p_A(ϕ) = 1 and p_A(¬ϕ) = p_A(ϕ∧¬ϕ) = 0, while p_E(¬ϕ) = 1 and p_E(ϕ) = p_E(ϕ∧¬ϕ) = 0. Then p_A and p_E are consistent, but the function p_{A,E} defined by p_{A,E}(∗) = max(p_A(∗), p_E(∗)) for ∗ ∈ {ϕ, ¬ϕ, ϕ∧¬ϕ} is not.

Proof. To see that p_A and p_E are consistent, consider a non-standard model with three worlds x, y, z and v⁺(p) = {x, y}, v⁻(p) = {x, z}. The measure µ_A putting all weight on y is such that p_{µ_A}(∗) = p_A(∗) for ∗ ∈ {ϕ, ¬ϕ, ϕ∧¬ϕ}, showing p_A consistent by Lemma 1. Likewise, the measure µ_E putting all weight on z shows p_E consistent. For the inconsistency of p_{A,E}, finally, note that p_{A,E}(ϕ∧¬ϕ) = 0 and p_{A,E}(ϕ) = p_{A,E}(¬ϕ) = 1. Plugging these three values into (A3) yields 0 + p_{A,E}(ϕ∨¬ϕ) = 2, contradicting (A1).

Likewise, the missing conditions for cautious updates cannot be retrieved by extending the policy of taking minima to the agents' assessments of ϕ∨¬ϕ, as can be seen from the previous Fact. In particular, there is no counterpart to Lemma 7 for credulous or cautious update. Neither can be performed for all ϕ ∈ L_Prop simultaneously.

Before proceeding to four-valued updating, we compare the above policies to operations in non-probabilistic Belnap-Dunn logic. For this, recall the classic Belnap-Dunn bi-lattice of truth values BD.

[Diagram: the Belnap-Dunn bi-lattice BD of the four values {T}, {F}, {T,F} and {}, ordered along a truth axis and an information axis.]

This bi-lattice can be interpreted in two directions, relating to truth values and the available information. We denote meet and join of the truth lattice by ∧ and ∨, while meet and join of the information lattice are ⊓ and ⊔. Note that we can identify an assignment of BD-values to some formula ϕ with a non-standard probability assignment of p(ϕ) and p(¬ϕ) into {0,1}. More specifically, assigning {T,F} to some ϕ corresponds to p(ϕ) = p(¬ϕ) = 1, while assigning {T}, resp. {F}, to ϕ corresponds to p(ϕ) = 1, p(¬ϕ) = 0, resp. p(ϕ) = 0, p(¬ϕ) = 1. Assigning {}, finally, corresponds to p(ϕ) = p(¬ϕ) = 0. For a probability assignment with p(ϕ), p(¬ϕ) ∈ {0,1}, we denote the corresponding BD value by t_p(ϕ). Applying this correspondence, we obtain the following characterization of the four updating policies introduced above:

Lemma 8.
Assume that, when asked about their credences in ϕ, agents A and E provide extremal assignments, i.e. p_A(ϕ), p_A(¬ϕ), p_E(ϕ), p_E(¬ϕ) ∈ {0,1}. Then the four rules yield beliefs in ϕ and ¬ϕ that are equal to:

Credulous update     t_{p_A}(ϕ) ⊔ t_{p_E}(ϕ)
Cautious update      t_{p_A}(ϕ) ⊓ t_{p_E}(ϕ)
Optimistic update    t_{p_A}(ϕ) ∨ t_{p_E}(ϕ)
Pessimistic update   t_{p_A}(ϕ) ∧ t_{p_E}(ϕ)

Finally, we consider the special case where both agents input classic probability values, i.e. values such that p(ϕ) + p(¬ϕ) = 1.

Fact 10. When p_A and p_E are classic, i.e. p_A(ϕ) + p_A(¬ϕ) = p_E(ϕ) + p_E(¬ϕ) = 1, then the same holds for the aggregated belief when aggregation follows weighted averaging, optimistic or pessimistic updates. That is, these three rules preserve classicality. This does not hold for credulous and cautious updating. The latter two rules turn classic input beliefs for agents A and E into non-classic aggregate values as soon as A and E disagree about p(ϕ).

So far, we have assumed aggregation to operate on non-standard probability assignments. Within the above framework, agents provide their subjective non-standard beliefs in both ϕ and ¬ϕ, which the various aggregative mechanisms described above then merge into aggregate belief values for ϕ and ¬ϕ. But of course, our agents might also provide their subjective four-valued probabilities p̂_A(ϕ) = (b_ϕ^A, d_ϕ^A, u_ϕ^A, c_ϕ^A) and p̂_E(ϕ) = (b_ϕ^E, d_ϕ^E, u_ϕ^E, c_ϕ^E) instead. Naturally, we could then hope to obtain an aggregate four-valued probability p̂_{A,E}(ϕ) = (b_ϕ^{A,E}, d_ϕ^{A,E}, u_ϕ^{A,E}, c_ϕ^{A,E}). Note that, by the map tr_ns, the non-standard probabilities p(ϕ) and p(¬ϕ) can be calculated from the four-valued probability p̂(ϕ). Hence, if p̂_{A,E}(ϕ) is defined, a corresponding two-valued aggregation mechanism for p_{A,E}(ϕ) and p_{A,E}(¬ϕ) follows immediately. However, the opposite does not hold.
p_{A,E}(ϕ) and p_{A,E}(¬ϕ) do not fully determine p̂_{A,E}(ϕ), and hence the various policies defined in the last section do not readily translate into four-valued aggregation procedures. In fact, when employing the map tr_ns, the three values p(ϕ), p(¬ϕ) and p(ϕ∧¬ϕ) are required to determine p̂(ϕ). In the case of weighted averaging, this is not a problem. By Lemma 7, setting

p^k_{A,E}(ϕ∧¬ϕ) = k·p_A(ϕ∧¬ϕ) + (1−k)·p_E(ϕ∧¬ϕ)

yields a consistent set of requirements, and the corresponding four-valued aggregation rule is exactly p̂^k_{A,E}(ϕ) = k·p̂_A(ϕ) + (1−k)·p̂_E(ϕ).

However, the situation is different in the case of credulous or cautious updating. As shown in Fact 9, requiring that p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)), p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)) and p_{A,E}(ϕ∧¬ϕ) = max(p_A(ϕ∧¬ϕ), p_E(ϕ∧¬ϕ)) may yield an inconsistent set of requirements. Hence, other choices are needed.

The vector p̂_{A,E}(ϕ) is determined by four choices. With two of them given by p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)) and p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)), and a third by axiom (D2), one last condition is missing. In the case of credulous update, we would arguably expect that c_ϕ^{A,E} ≥ max(c_ϕ^A, c_ϕ^E): if an agent opts to be credulous about both ϕ and ¬ϕ, she could not expect her conflict to fall below any of the input conflicts. Within this restriction, the definition of credulous update below assumes c_ϕ^{A,E} to be as close to max(c_ϕ^A, c_ϕ^E) as possible while maintaining consistency.

Likewise, in the case of cautious update, we would arguably expect overall uncertainty to grow, or at least not to shrink, through aggregation. That is, we would expect that u_ϕ^{A,E} ≥ max(u_ϕ^A, u_ϕ^E). Again, we will demand that u_ϕ^{A,E} is the maximal possible consistent value with this property.

Definition 18. Assume agents A and E provide four-valued probabilities p̂_A(ϕ) = (b_ϕ^A, d_ϕ^A, u_ϕ^A, c_ϕ^A) and p̂_E(ϕ) = (b_ϕ^E, d_ϕ^E, u_ϕ^E, c_ϕ^E). Then the credulously aggregated four-valued probability p̂_{A,E}(ϕ) = (b_ϕ^{A,E}, d_ϕ^{A,E}, u_ϕ^{A,E}, c_ϕ^{A,E}) is given by the following four conditions:

b_ϕ^{A,E} + c_ϕ^{A,E} = max(b_ϕ^A + c_ϕ^A, b_ϕ^E + c_ϕ^E)
d_ϕ^{A,E} + c_ϕ^{A,E} = max(d_ϕ^A + c_ϕ^A, d_ϕ^E + c_ϕ^E)
b_ϕ^{A,E} + d_ϕ^{A,E} + u_ϕ^{A,E} + c_ϕ^{A,E} = 1
c_ϕ^{A,E} = max( c_ϕ^E, c_ϕ^A, (b_ϕ^{A,E} + c_ϕ^{A,E}) + (d_ϕ^{A,E} + c_ϕ^{A,E}) − 1 )

By tr_ns, the first two of these equations correspond to the two conditions of credulous non-standard updates, i.e. p_{A,E}(ϕ) = max(p_A(ϕ), p_E(ϕ)) and p_{A,E}(¬ϕ) = max(p_A(¬ϕ), p_E(¬ϕ)). The third equation is axiom (D2). The last equation, finally, expresses that c_ϕ^{A,E} is the minimal consistent choice such that c_ϕ^{A,E} ≥ max(c_ϕ^A, c_ϕ^E). To see this, note that by (D2) we have

b_ϕ^{A,E} + d_ϕ^{A,E} + c_ϕ^{A,E} ≤ 1 iff (b_ϕ^{A,E} + c_ϕ^{A,E}) + (d_ϕ^{A,E} + c_ϕ^{A,E}) − 1 ≤ c_ϕ^{A,E}.

Likewise we can define a cautious aggregation of four-valued probabilities:
Definition 19. For $\hat{p}_A$ and $\hat{p}_E$ as above, the cautiously aggregated four-valued probability $\hat{p}_{\{A,E\}}(\varphi) = (b^{\{A,E\}}_\varphi, d^{\{A,E\}}_\varphi, u^{\{A,E\}}_\varphi, c^{\{A,E\}}_\varphi)$ is given by the following four equations:
$$b^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = \min(b^A_\varphi + c^A_\varphi,\ b^E_\varphi + c^E_\varphi)$$
$$d^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = \min(d^A_\varphi + c^A_\varphi,\ d^E_\varphi + c^E_\varphi)$$
$$b^{\{A,E\}}_\varphi + d^{\{A,E\}}_\varphi + u^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi = 1$$
$$u^{\{A,E\}}_\varphi = \max\big(u^E_\varphi,\ u^A_\varphi,\ 1 - (b^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi) - (d^{\{A,E\}}_\varphi + c^{\{A,E\}}_\varphi)\big)$$

Credulous and cautious aggregation as defined here cohere with their definitions for non-standard probabilities.
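To make these definitions concrete, the following sketch implements both aggregation rules, together with the translation to non-standard probabilities. The function names are illustrative, not from the paper; the closed forms follow from substituting the (D2) constraint into the remaining conditions of Definitions 18 and 19.

```python
def tr_ns(p4):
    """Translate a four-valued probability (b, d, u, c) into the
    non-standard pair (p(phi), p(not-phi)); the conflict c is
    p(phi and not-phi)."""
    b, d, u, c = p4
    return (b + c, d + c)

def credulous(pA, pE):
    """Credulous aggregation of two four-valued probabilities (Definition 18)."""
    bA, dA, uA, cA = pA
    bE, dE, uE, cE = pE
    bc = max(bA + cA, bE + cE)       # b + c of the aggregate
    dc = max(dA + cA, dE + cE)       # d + c of the aggregate
    c = max(cE, cA, bc + dc - 1)     # smallest consistent conflict >= both inputs
    b, d = bc - c, dc - c
    return (b, d, 1 - b - d - c, c)  # u is fixed by (D2)

def cautious(pA, pE):
    """Cautious aggregation of two four-valued probabilities (Definition 19)."""
    bA, dA, uA, cA = pA
    bE, dE, uE, cE = pE
    bc = min(bA + cA, bE + cE)
    dc = min(dA + cA, dE + cE)
    u = max(uE, uA, 1 - bc - dc)     # largest consistent uncertainty >= both inputs
    c = bc + dc + u - 1              # forced by (D2)
    return (bc - c, dc - c, u, c)
```

For instance, aggregating $(0.3, 0.4, 0.2, 0.1)$ and $(0.5, 0.2, 0.2, 0.1)$ credulously gives (up to rounding) $(0.5, 0.4, 0, 0.1)$, while cautious aggregation gives $(0.4, 0.3, 0.3, 0)$; applying `tr_ns` to either output returns the componentwise max (resp. min) of the inputs' non-standard values, matching the coherence with non-standard aggregation noted above.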
Lemma 9. Assume that agents $A$ and $E$ provide four-valued vectors $\hat{p}_A$ and $\hat{p}_E$ respectively. Then the following diagram commutes, where the application of $tr_{ns}$ makes use of the fact that $p(\varphi)$ and $p(\neg\varphi)$ can be calculated from $\hat{p}(\varphi)$:
$$\begin{array}{ccc}
\big(\hat{p}_A(\varphi),\ \hat{p}_E(\varphi)\big) & \xrightarrow{\ \text{credulous}\ } & \hat{p}_{\{A,E\}}(\varphi)\\[2pt]
\big\downarrow\, tr_{ns} & & \big\downarrow\, tr_{ns}\\[2pt]
\big(p_A(\varphi), p_A(\neg\varphi)\big),\ \big(p_E(\varphi), p_E(\neg\varphi)\big) & \xrightarrow{\ \text{credulous}\ } & \big(p_{\{A,E\}}(\varphi), p_{\{A,E\}}(\neg\varphi)\big)
\end{array}$$
and likewise with cautious aggregation in place of credulous aggregation. That is, aggregating two four-valued probabilities and then translating by $tr_{ns}$ yields the same result as first translating and then aggregating the corresponding non-standard probabilities.

The algebraic structure of credulous and cautious aggregation.

Definition 20.
For an aggregation strategy $S$, we call $\varphi$ a neutral element if for all $\psi$ we have $S(\varphi, \psi) = S(\psi, \varphi) = \psi$, and we call $\varphi$ an annihilator if for all $\psi$, $S(\varphi, \psi) = S(\psi, \varphi) = \varphi$.

Proposition 1. The subjective four-valued probability assignment $(0,0,0,1)$, i.e. the element of maximal conflict, is an annihilator with respect to credulous updating. Likewise, the subjective four-valued probability assignment $(0,0,1,0)$, representing maximal uncertainty, is an annihilator for the cautious strategy.

Proof. Let $\hat{p}_A(\varphi) = (0,0,0,1)$, let $\hat{p}_E(\varphi)$ be arbitrary and denote the result of credulous updating by $(B, D, U, C)$. Then by definition $B + C = D + C = C = 1$, so $B = D = 0$, and hence $U = 0$.

In a similar manner, let $\hat{p}_A(\psi) = (0,0,1,0)$, let $\hat{p}_E(\psi)$ be arbitrary, and denote the result of cautious updating by $(B, D, U, C)$. Then by definition $B + C = D + C = 0$, which implies $B = D = C = 0$ and $U = 1$.

Proposition 2.
The subjective four-valued probability assignment $(0,0,0,1)$, i.e. the element of maximal conflict, is a neutral element with respect to cautious updating. Likewise, the subjective four-valued probability assignment $(0,0,1,0)$, representing maximal uncertainty, is neutral with respect to credulous updating.

Proof. Let $\hat{p}_A(\varphi) = (0,0,0,1)$, let $\hat{p}_E(\varphi) = (b, d, u, c)$ be arbitrary and denote the result of cautious updating by $(B, D, U, C)$. Then by definition $B + C = b + c$ and $D + C = d + c$, which implies
$$B + D + 2C = b + d + 2c. \quad (4)$$
Using this, the last condition of cautious updating yields $U = \max(0, u, 1 - b - d - 2c)$. Since $u = 1 - b - d - c \ge 1 - b - d - 2c$, this implies $U = u$. Together with $1 = U + B + C + D = u + b + c + d$, it follows that $B + C + D = b + c + d$. In combination with equation (4), this implies $C = c$. With this, $B + C = b + c$ and $D + C = d + c$ imply that $B = b$ and $D = d$. The proof for the second claim follows from a similar argument.

Many classical approaches to reasoning address idealized situations, where the agents' information is consistent, closed under logical implication, and possibly even complete. These assumptions, of course, are at odds with many realistic reasoning scenarios, where the available evidence may be scarce and memory or observation faulty. In short, there is no guarantee for our available information to be consistent, nor complete. Yet, we would arguably hold that some valid inferences can be drawn from such imperfect information, as partial incompleteness or local contradictions may not preclude us from drawing conclusions about other parts of the data. As automated reasoning systems are becoming increasingly important, there is a need for a rigorous formal treatment of inferences from non-ideal information.
To this end, a wealth of non-classical logical systems for dealing with uncertainty or conflict has been put forward, with Belnap-Dunn logic (BD) arguably the most prominent such framework. However, the reasons for moving to non-classical, BD-like frameworks apply equally well to probabilistic settings. Agents may, for instance, have inconclusive, probabilistic evidence for the truth or falsity of various statements. Just as in the classical case, if such information comes from different sources or different experiments, it need not add up to 1, nor be mutually exclusive. It hence seems natural to investigate probabilistic extensions of BD. This was the focus of the current paper.

Paralleling recent work by Dunn (cf. Dunn, 2010; Dunn and Kiefer, 2019), we have investigated four-valued probability assignments that permit agents to have probabilistic beliefs about the truth and falsity of a statement, and about its gaps and gluts. More specifically, we have provided a theory of four-valued probabilities that slightly departs from Dunn's in its treatment of conjunctions. Yet, both are generalizations of Belnap-Dunn logic in that they coincide with BD whenever all probabilities are extremal, i.e. only assume the values 0 and 1. In this paper, we have clarified the connection between our four-valued probabilities and single-valued non-standard probabilities as introduced by Childers, Majer and Milne (2019). By providing a translation function between the two approaches, we have shown these to be equivalent. Moreover, we have introduced probabilistic models as semantics for four-valued probabilities, and have provided a sound and complete axiomatization with respect to the class of all such models. Lastly, we have enriched our frameworks with dynamical operations for updating and aggregation. As for the former, we have provided versions of Jeffrey and Bayes' conditioning that work in non-standard and four-valued settings and have clarified the relation between these.
For aggregation, finally, we have studied a host of different aggregation policies, some of which go beyond what is available in classical probabilistic settings.

Of course, there are other approaches to weakening classical probability theory, not all of which have a corresponding logic as starting point. Many such approaches take probability or weights as the central notion, but consider various cases where no exact probabilistic information is available. A typical example are inner measures, intended to approximate probability from below (Fagin and Halpern, 1991). Their underlying idea, briefly, is that an agent might lack probabilistic evidence about some proposition $\varphi$, for instance when $\varphi$ is not in the algebra of (possible) observations. The agent may, though, estimate a lower bound for the probability of $\varphi$ by building on her available information about other propositions. Formally, this gives rise to inner measures that only satisfy super-additivity instead of classical additivity, i.e. $\mu_*(\varphi \vee \psi) \ge \mu_*(\varphi) + \mu_*(\psi)$, where $\varphi \wedge \psi$ is a classical contradiction.

A related weakening of classical probability theory is the Dempster-Shafer (DS) theory of belief (Shafer, 1976; Halpern, 2017). The starting point of this theory is an agent's evidence about some state of affairs, usually represented as a normalized measure on a Boolean algebra of possible observations. This evidence then gives rise to a belief function, where $Bel(\varphi)$, the belief in some $\varphi$, is derived from all pieces of evidence that entail $\varphi$. As the agent might have strong evidence for a compound event, say $\psi \vee \varphi$, without having much evidence that entails either of its disjuncts alone, this belief function is super-additive in the sense defined above. More specifically, the degree of support for some $A$ need not be complementary to the support of $\neg A$. That is, $Bel(A)$ may be less than $1 - Bel(\neg A)$, just as in our framework.
While $Bel(A)$ can be seen as a lower bound for the classical probability of $A$, the term $1 - Bel(\neg A)$, sometimes denoted the plausibility of $A$, is its upper bound. The interval between both is then interpreted as the agent's uncertainty about $A$. As our presentation suggests, there is a tight connection between DS theory and inner measure approaches: both are equivalent, at least on a syntactic level where probabilities are associated with formulas rather than states (Fagin and Halpern, 1991; Zhou, 2013).

Both inner probability approaches and DS theory differ in two ways from our framework. In one dimension, our framework is more general than DS belief functions or inner probabilities, as it admits not only uncertainty but also conflict in probability assignments. By allowing for gluts, non-standard and four-valued probability assignments can represent contradictory information in ways that DS theory and inner measure frameworks cannot.

For a second difference, consider a classical tautology such as $p \vee \neg p$. Working with a classical meta-theory, DS theory associates a probability of 1 with this tautology. Yet, when evidence is scarce, the belief values assigned to $p$ and $\neg p$ need not add up to one, exemplifying the above super-additivity. In fact, it is compatible with DS theory that both $p$ and $\neg p$ are assigned a belief of zero. In our framework, in contrast, uncertainty or conflict derive directly from the information available about $p$ and $\neg p$, rather than from evidence about some larger proposition. Working with a non-classical, BD meta-theory, non-classical information about literals extends to complex formulas such as $p \vee \neg p$, as witnessed in the inclusion-exclusion axiom (A3). This axiom, in fact, can be seen to stand in direct opposition to the theory of inner measures: our axiom (A3) implies a subadditivity property (i.e. $\mu_*(\varphi \vee \psi) \le \mu_*(\varphi) + \mu_*(\psi)$ when $\varphi \wedge \psi$ is a classical contradiction), in contrast to the superadditivity of DS theory and inner measures.
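The super-additivity of DS belief functions can be made concrete with a toy example. The sketch below uses an invented two-world frame and mass function (all names and numbers are illustrative, not from the paper): belief in an event is the total mass of the evidence sets that entail it.

```python
# Toy frame of discernment: w1 is the world where p holds, w2 the world
# where it fails. A hypothetical mass function placing most weight on
# inconclusive evidence that entails only the tautology p-or-not-p.
mass = {
    frozenset({'w1'}): 0.2,        # evidence entailing p
    frozenset({'w2'}): 0.1,        # evidence entailing not-p
    frozenset({'w1', 'w2'}): 0.7,  # inconclusive evidence
}

def bel(event):
    """Bel(E): total mass of the evidence sets that entail E (subsets of E)."""
    e = frozenset(event)
    return sum(m for s, m in mass.items() if s <= e)

# Bel(p) + Bel(not-p) = 0.3 < 1 = Bel(p or not-p): super-additivity.
# The plausibility 1 - Bel(not-p) = 0.9 is the upper bound for p; the
# interval [Bel(p), 0.9] = [0.2, 0.9] is the agent's uncertainty about p.
```

Note how the tautology still receives belief 1 here, in line with the classical meta-theory of DS; in the four-valued framework, by contrast, scarce or conflicting information about $p$ and $\neg p$ carries over to $p \vee \neg p$ itself.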
A detailed comparison between DS belief functions and our approach would require a more careful analysis that exceeds the scope of this article. We leave this for future work.

Finally, another open line of inquiry concerns practical implications of the present framework. One may, for instance, ask how an ideally rational agent is to act if she has only imperfect information at her disposal. In future work, we hope to sketch the contours of a non-standard decision theory that rests on four-valued probabilities in the same manner as traditional decision theory employs classical probability. Doing so, we hope, can help to fill a gap between current frameworks for decisions under risk and under uncertainty.

References
Alchourrón, C. E., P. Gärdenfors, and D. Makinson (1985). On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic 50(2), 510–530.

Anderson, A. R. and N. D. Belnap (1975). Entailment: The Logic of Relevance and Necessity, Volume I. Princeton: Princeton University Press.

Batens, D. (2001). A general characterization of adaptive logics. Logique et Analyse 44(173–175), 45–68.

Belnap, N. D. (1977). A useful four-valued logic. In Modern Uses of Multiple-Valued Logic, pp. 5–37. Springer.

Belnap, N. D. (2019). How a Computer Should Think, pp. 35–53. Springer International Publishing.

Childers, T., O. Majer, and P. Milne (2019). The (relevant) logic of scientific discovery. (under review).

Christensen, D. (2007). Epistemology of disagreement: The good news. The Philosophical Review 116(2), 187–217.

da Costa, N. (1974). On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic 15(4), 497–510.

da Costa, N. and V. Subrahmanian (1989). Paraconsistent logic as a formalism for reasoning about inconsistent knowledge bases. Artificial Intelligence in Medicine 1, 167–174.

Dunn, J. M. (1976). Intuitive semantics for first degree entailment and 'coupled trees'. Philosophical Studies 29(3), 149–168.

Dunn, J. M. (2010). Contradictory information: Too much of a good thing. Journal of Philosophical Logic 39(4), 425–452.

Dunn, J. M. and N. M. Kiefer (2019). Contradictory information: Better than nothing? The paradox of the two firefighters. In Graham Priest on Dialetheism and Paraconsistency, pp. 231–247. Springer.

Elga, A. (2007). Reflection and disagreement. Noûs 41(3), 478–502.

Fagin, R. and J. Y. Halpern (1991). Uncertainty, belief, and probability. Computational Intelligence 7(3), 160–173.

Font, J. M. (1997). Belnap's four-valued logic and De Morgan lattices. Logic Journal of the IGPL 5(3), 1–29.

Halpern, J. (2017). Reasoning about Uncertainty. MIT Press.

Jaskowski, S. (1948). Propositional calculus for contradictory deductive systems. Studia Logica 24, 143–157.

Jøsang, A. (1997). Artificial reasoning with subjective logic. In Proceedings of the Second Australian Workshop on Commonsense Reasoning, Volume 48, pp. 34. Citeseer.

Kelly, T. (2010). Peer disagreement and higher order evidence. In A. I. Goldman and D. Whitcomb (Eds.), Social Epistemology: Essential Readings, pp. 183–217. Oxford University Press.

Klein, D. and A. Marra (2020). From oughts to goals: A logic for enkrasia. Studia Logica 108(1), 85–128.

Kolmogorov, A. N. (2018). Foundations of the Theory of Probability. Courier Dover Publications.

Mares, E. D. (1997). Paraconsistent probability theory and paraconsistent Bayesianism. Logique et Analyse 40(160), 375–384.

Přenosil, A. (2018). Reasoning with Inconsistent Information. Ph.D. thesis, Charles University, Faculty of Philosophy.

Priest, G. (1979). Logic of paradox. Journal of Philosophical Logic 8, 219–241.

Priest, G. (2002). Paraconsistent logic. In D. M. Gabbay and F. Guenthner (Eds.), Handbook of Philosophical Logic 6, 287–393.

Priest, G. (2006). In Contradiction. Oxford University Press.

Priest, G. (2007). Paraconsistency and dialetheism. In D. Gabbay and J. Woods (Eds.), Handbook of the History of Logic 8, 129–204.

Rescher, N. and R. Manor (1970). On inference from inconsistent premisses. Theory and Decision 1(2), 179–217.

Shafer, G. (1976). A Mathematical Theory of Evidence, Volume 42. Princeton University Press.

Zhou, C. (2013). Belief functions on distributive lattices.