On Accuracy and Coherence with Infinite Opinion Sets

Abstract

There is a well-known equivalence between avoiding accuracy dominance and having probabilistically coherent credences (see, e.g., de Finetti 1974, Joyce 2009, Predd et al. 2009, Schervish et al. 2009, Pettigrew 2016). However, this equivalence has been established only when the set of propositions on which credence functions are defined is finite. In this paper, we establish connections between accuracy dominance and coherence when credence functions are defined on an infinite set of propositions. In particular, we establish the necessary results to extend the classic accuracy argument for probabilism originally due to Joyce (1998) to certain classes of infinite sets of propositions including countably infinite partitions.


Mikayla Kelley

Stanford University ([email protected])

Preprint of July 2020 (under review)


A central norm in the epistemology of partial belief is probabilism: a person's degrees of belief, or credences, should satisfy the laws of probability. There is a long tradition in the spirit of Savage (1971) and de Finetti (1974) of appealing to the epistemic virtue of accuracy to justify probabilism (also see Rosenkrantz 1981). One particular form of argument is the accuracy dominance argument for probabilism introduced by Joyce (1998). Let a set $\mathcal{F}$ of propositions be an opinion set and a function $c : \mathcal{F} \to [0,1]$ a credence function on $\mathcal{F}$. Let a credence function be coherent if it satisfies the axioms of probability. A credence function $c'$ on $\mathcal{F}$ accuracy dominates a credence function $c$ on $\mathcal{F}$ if $c$ is more inaccurate than $c'$ no matter how the world turns out to be (where inaccuracy is precisified as in Section 2). Then the existing accuracy dominance arguments purport to vindicate probabilism by showing that a credence function is not accuracy dominated if and only if it is coherent.

However, there is a limitation to almost all of the literature on accuracy arguments for probabilism: the opinion set is assumed to be finite. Indeed, de Finetti (1974), Lindley (1987), Joyce

This paper is based on work done in Kelley 2019.

In an unpublished manuscript, Walsh (2020) gives an accuracy dominance argument in the countably infinite context, to which we return in Section 3. In a related but distinct area, Huttegger (2013) and Easwaran (2013) extend to the infinite setting part of the literature on using minimization of expected inaccuracy to vindicate epistemic principles. See, e.g., Greaves and Wallace 2005. Schervish et al. (2014) prove that in certain countably infinite cases, coherence is sufficient to avoid strong dominance. Schervish et al. (2009) and Steeger (2019) explore a different way to weaken the assumption that the opinion set is finite. We return to their work in Section 4.

We first set up the framework that will be used throughout the paper. Fix a set $W$ (not necessarily finite) which represents the set of possible worlds and, for now, a finite set $\mathcal{F} \subseteq \mathcal{P}(W)$ of propositions that represents an opinion set: the set of propositions that an agent has beliefs about.

Definition 2.1.

An algebra over $W$ is a subset $\mathcal{F}^* \subseteq \mathcal{P}(W)$ such that:
1. $W \in \mathcal{F}^*$;
2. if $p, p' \in \mathcal{F}^*$, then $p \cup p' \in \mathcal{F}^*$;
3. if $p \in \mathcal{F}^*$, then $W \setminus p \in \mathcal{F}^*$.

Definition 2.2.

i. A credence function on an opinion set $\mathcal{F}$ is a function from $\mathcal{F}$ to $[0,1]$.
ii. A credence function $c$ is coherent if it can be extended to a finitely additive probability function on an algebra $\mathcal{F}^*$ over $W$ containing $\mathcal{F}$. That is, there is an algebra $\mathcal{F}^* \supseteq \mathcal{F}$ over $W$ and a function $c^* : \mathcal{F}^* \to [0,1]$ such that:
(a) $c^*(p) = c(p)$ for all $p \in \mathcal{F}$;
(b) $c^*(p \cup p') = c^*(p) + c^*(p')$ for $p, p' \in \mathcal{F}^*$ with $p \cap p' = \emptyset$;
(c) $c^*(W) = 1$.
iii. A credence function that is not coherent is incoherent.

Remark 2.3.

If $\mathcal{F} = \{p_1, \ldots, p_n\}$, we identify a credence function $c$ over $\mathcal{F}$ with the vector $(c(p_1), \ldots, c(p_n)) \in [0,1]^n$. Thus the space of all credence functions over $\mathcal{F}$ can be identified with $[0,1]^n \subseteq \mathbb{R}^n$. We often simplify notation by setting $c_i := c(p_i)$.

We now introduce an important subclass of the class of all credence functions, namely the (coherent) credence functions that match the truth values of $\mathcal{F}$ at a world $w$ exactly.

Definition 2.4.

Fix an opinion set $\mathcal{F}$. For each $w \in W$, let $v_w : \mathcal{F} \to \{0,1\}$ be defined by $v_w(p) = 1$ if and only if $w \in p$. We call $v_w$ the omniscient credence function at world $w$. We let $\mathcal{V}_{\mathcal{F}}$ denote the set of all omniscient credence functions on $\mathcal{F}$. Note that $|\mathcal{V}_{\mathcal{F}}| \le 2^{|\mathcal{F}|}$.

Next, we specify the inaccuracy measures we will be concerned with in this section. Fix a finite opinion set $\mathcal{F}$, and let $\mathcal{C}$ denote the set of credence functions on $\mathcal{F}$. We define an inaccuracy measure to be a function of the form $I : \mathcal{C} \times W \to [0,\infty]$. The class of inaccuracy measures we consider is a generalization of the class normatively defended by Pettigrew (2016): the inaccuracy measures defined in terms of what we call a quasi-additive Bregman divergence. It is a subclass of the inaccuracy measures assumed in Predd et al. 2009.

Definition 2.5.

Suppose $D : [0,1]^n \times [0,1]^n \to [0,\infty]$.
1. $D$ is a divergence if $D(\mathbf{x}, \mathbf{y}) \ge 0$ for all $\mathbf{x}, \mathbf{y} \in [0,1]^n$, with equality if and only if $\mathbf{x} = \mathbf{y}$.
2. $D$ is quasi-additive if there exists a function $d : [0,1]^2 \to [0,\infty]$ and a sequence of elements $\{a_i\}_{i=1}^n$ from $(0,\infty)$ such that $D(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^n a_i\, d(x_i, y_i)$, in which case we say $D$ is generated by $d$ and $\{a_i\}_{i=1}^n$.
3. $D$ is a quasi-additive Bregman divergence if $D$ is a quasi-additive divergence generated by $d$ and $\{a_i\}_{i=1}^n$, and in addition there is a function $\varphi : [0,1] \to \mathbb{R}$ such that:
(a) $\varphi$ is continuous, bounded, and strictly convex on $[0,1]$;
(b) $\varphi$ is continuously differentiable on $(0,1)$ with the formal definition $\varphi'(i) := \lim_{x \to i} \varphi'(x)$ for $i \in \{0,1\}$;
(c) for all $x, y \in [0,1]$, we have $d(x, y) = \varphi(x) - \varphi(y) - \varphi'(y)(x - y)$.
We call such a $d$ a one-dimensional Bregman divergence.

We take the inaccuracy of a credence function $c$ at a world $w$ to be the distance between $c$ and the omniscient credence function $v_w$, where distance is measured with a quasi-additive Bregman divergence.

Definition 2.6.

Let a legitimate inaccuracy measure be an inaccuracy measure given by $I(c, w) = D(v_w, c)$, where $D$ is a quasi-additive Bregman divergence.

Using terminology from Definition 2.5, Predd et al. consider a more general class in allowing different one-dimensional Bregman divergences for different propositions.

We do not require $\varphi'(i) < \infty$ for $i \in \{0,1\}$.
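Definitions 2.5 and 2.6 can be illustrated with a short Python sketch. The function names (`bregman_1d`, `inaccuracy`) and the choice $\varphi(x) = x^2$ are assumptions made only for illustration; they are not notation from the paper. The sketch builds a one-dimensional Bregman divergence from $\varphi$ and its derivative, then sums the weighted terms $a_i\, d(v_w(p_i), c(p_i))$.

```python
from typing import Callable, Sequence


def bregman_1d(phi: Callable[[float], float],
               dphi: Callable[[float], float]) -> Callable[[float, float], float]:
    """Return d(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    def d(x: float, y: float) -> float:
        return phi(x) - phi(y) - dphi(y) * (x - y)
    return d


def inaccuracy(d: Callable[[float, float], float],
               weights: Sequence[float],
               truth_values: Sequence[int],
               credences: Sequence[float]) -> float:
    """I(c, w) = D(v_w, c) = sum_i a_i * d(v_w(p_i), c(p_i))."""
    return sum(a * d(v, c) for a, v, c in zip(weights, truth_values, credences))


# Illustrative choice (an assumption of this sketch): phi(x) = x**2,
# for which d(x, y) simplifies algebraically to (x - y)**2.
squared_error = bregman_1d(lambda x: x * x, lambda y: 2.0 * y)

# Two propositions with weights a = (1, 2); at this world the first is true.
example_value = inaccuracy(squared_error, [1.0, 2.0], [1, 0], [0.5, 0.5])
```

Passing the truth values first and the credences second mirrors the argument order $D(v_w, c)$ in Definition 2.6; a Bregman divergence is not symmetric in general, so the order matters for other choices of $\varphi$.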

By allowing different weights depending on the proposition, we can accommodate the intuition that some propositions are more important to know than others (see, e.g., Levinstein 2019 for further discussion of the varying epistemic importance of propositions). Even if one thinks that inaccuracy measures should be additive, as Pettigrew (2016) does, relaxing this restriction makes our results more widely relevant. A popular example of an additive legitimate inaccuracy measure is the Brier score (see Section 12, "Homage to the Brier Score," of Joyce 2009): $I(c, w) = \sum_{i=1}^n (v_w(p_i) - c(p_i))^2$.

Remark 2.7.

The class of additive Bregman divergences is the class of additive and continuous strictly proper scoring rules. See Pettigrew 2016, p. 66. Also see, e.g., Banerjee et al. 2005 and Gneiting and Raftery 2007 for more details on Bregman divergences as well as their connection to strictly proper scoring rules.

We now recall the dominance result connecting coherence to accuracy dominance when the opinion set is finite. It was first proved for the Brier score by de Finetti (1974, pp. 87-90) and extended to any legitimate inaccuracy measure by Predd et al. (2009). See Schervish et al. 2009 for further generalizations of the finite result.

Deﬁnition 2.8.

For each pair of credence functions $c, c^*$ over $\mathcal{F}$:
1. $c^*$ weakly dominates $c$ relative to an inaccuracy measure $I$ if $I(c, w) \ge I(c^*, w)$ for all $w \in W$ and $I(c, w) > I(c^*, w)$ for some $w \in W$;
2. $c^*$ strongly dominates $c$ relative to $I$ if $I(c, w) > I(c^*, w)$ for all $w \in W$.

Theorem 2.9 (de Finetti 1974, Predd et al. 2009).

Let $\mathcal{F}$ be a finite opinion set, $I$ a legitimate inaccuracy measure, and $c$ a credence function on $\mathcal{F}$. Then the following are equivalent:
1. $c$ is not strongly dominated;
2. $c$ is not weakly dominated;
3. $c$ is coherent.
Further, if $c$ is incoherent, then $c$ is strongly dominated by a coherent credence function.

On the basis of Theorem 2.9, authors in the accuracy literature conclude that an incoherent credence function is objectionable because there is an undominated coherent credence function that does strictly better in terms of accuracy, no matter how the world turns out to be, whereas coherent credence functions are not accuracy dominated in this way. Since it is the basis of the accuracy argument for probabilism in the finite case, Theorem 2.9 is the result we would like to extend to infinite opinion sets. We now make progress toward this goal when $\mathcal{F}$ is countably infinite.

3 The Countable Case: Coherence is Necessary

We begin with a discussion of how to measure inaccuracy in the countably infinite setting. Fix a countably infinite opinion set $\mathcal{F}$ over a set $W$ of worlds (of arbitrary cardinality). Let $\mathcal{C}$ be the set of credence functions over $\mathcal{F}$, which can be identified with $[0,1]^\infty$ (see Remark 2.3). An inaccuracy measure remains a map from $\mathcal{C} \times W$ to $[0,\infty]$.

The class of inaccuracy measures that we use is defined in terms of generalizations of quasi-additive Bregman divergences.

Definition 3.1.

Suppose $D : [0,1]^\infty \times [0,1]^\infty \to [0,\infty]$. Then we call $D$ a generalized quasi-additive Bregman divergence if $D(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^\infty a_i\, d(x_i, y_i)$, where $d$ is a bounded one-dimensional Bregman divergence as in Definition 2.5.3 and $\{a_i\}_{i=1}^\infty$ is a sequence of elements from $(0,\infty)$ with $\sup_i a_i < \infty$.

Remark 3.2.

Note that $d$ (defined in terms of $\varphi$) being bounded is equivalent to $\varphi'$ being bounded on $[0,1]$. Further, we may assume that $\varphi(0) = \varphi'(0) = 0$ since $d_\varphi = d_{\bar\varphi}$ if $\varphi$ and $\bar\varphi$ differ by a linear function.

In the appendix, we show that generalized quasi-additive Bregman divergences are examples of what Csiszár (1995) calls Bregman distances, which are generalizations of quasi-additive Bregman divergences defined on spaces of non-negative functions.

Suggestively, we make the following definition.

Deﬁnition 3.3.

Given an enumeration of $\mathcal{F}$, let a generalized legitimate inaccuracy measure be an inaccuracy measure $I : \mathcal{C} \times W \to [0,\infty]$ given by

$I(c, w) = D(v_w, c)$  (1)

for $D$ a generalized quasi-additive Bregman divergence.

Notice that the Brier score extends to a generalized legitimate inaccuracy measure, namely the squared $\ell^2(\mathcal{F})$ norm

$I(c, w) = \|v_w - c\|^2_{\ell^2(\mathcal{F})} = \sum_{i=1}^\infty (v_w(p_i) - c(p_i))^2$.  (2)

We call (2) the generalized Brier score.

The name "generalized legitimate inaccuracy measure" is motivated by the observation that a generalized legitimate inaccuracy measure naturally restricted to the finite opinion sets is a legitimate inaccuracy measure. This is because 1) for both the generalized and finite legitimate

Recall that $\sup_i a_i = a \in \mathbb{R} \cup \{+\infty, -\infty\}$ such that $a_i \le a$ for all $i \in \mathbb{N}$ and, for any $b < a$, there is some $a_i$ such that $b < a_i \le a$.

Proof: Let $\bar\varphi(x) = \varphi(x) + ax + b$. Then $d_{\bar\varphi}(x, y) = \varphi(x) + ax + b - \varphi(y) - ay - b - (\varphi'(y) + a)(x - y) = \varphi(x) + ax + b - \varphi(y) - ay - b - \varphi'(y)(x - y) - ax + ay = \varphi(x) - \varphi(y) - \varphi'(y)(x - y) = d_\varphi(x, y)$. Further, if $\varphi$ satisfies the conditions in Definition 2.5.3, then $\bar\varphi$ does as well. Thus we may assume that any one-dimensional Bregman divergence is defined by a function $\varphi$ such that $\varphi(0) = \varphi'(0) = 0$.

The choice of enumeration does not matter since the terms in the infinite sum defining inaccuracy are non-negative. Thus convergence is absolute and independent of order.

We now state one of our main results: coherence is necessary to avoid accuracy dominance in the countably infinite case.

Theorem 3.4.

Let $\mathcal{F}$ be a countably infinite opinion set, $I$ a generalized legitimate inaccuracy measure, and $c$ an incoherent credence function. Then:
1. $c$ is weakly dominated relative to $I$ by a coherent credence function; and
2. if $I(c, w) < \infty$ for each $w \in W$, then $c$ is strongly dominated relative to $I$ by a coherent credence function.

Proof.

See the Appendix.

Remark 3.5.

By analyzing the proof of Theorem 3.4, one can see that the most general way to state the theorem is: assume $c$ is incoherent; if $I(c, w) < \infty$ for some $w$, then there is a coherent credence function $d$ such that $I(d, w) < I(c, w)$ for all $w$ such that $I(c, w) < \infty$; if $I(c, w) = \infty$ for all $w \in W$, then any omniscient credence function weakly dominates $c$.

Remark 3.6.

The following is easy to prove from the results of Schervish et al. (2009): any incoherent credence function $c$ over a countably infinite opinion set is weakly dominated, but not necessarily by a coherent credence function; and if $I(c, w) < \infty$ for each $w \in W$, then $c$ is strongly dominated, but not necessarily by a coherent credence function. Thus the value in the proof strategy to come is that the dominating credence function is proven to be coherent, which is analogous to the finite case.
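For intuition about the finite case referred to above (Theorem 2.9), the following Python sketch checks strong dominance numerically on the smallest nontrivial opinion set, a two-cell partition $\{p, W \setminus p\}$, under the Brier score. The credence values and the helper names (`brier`, `project_onto_coherent_line`) are assumptions chosen only for illustration. The dominating credence function is obtained by Euclidean projection onto the coherent credences $(t, 1 - t)$, in the spirit of the geometric argument for the Brier score.

```python
def brier(truth_values, credences):
    """Brier score: sum of squared differences between truth values and credences."""
    return sum((v - c) ** 2 for v, c in zip(truth_values, credences))


def project_onto_coherent_line(c):
    """Euclidean projection of (c1, c2) onto the line {(t, 1 - t)}.

    On a two-cell partition the coherent credence functions are exactly the
    points (t, 1 - t) with t in [0, 1]. For the example below the projection
    lands inside [0, 1]^2, so no clipping is needed (an assumption of this sketch).
    """
    shift = (c[0] + c[1] - 1.0) / 2.0
    return (c[0] - shift, c[1] - shift)


# Omniscient credence functions at the two kinds of world.
worlds = {"p true": (1, 0), "p false": (0, 1)}

c = (0.7, 0.6)                      # incoherent: credences on a partition sum to 1.3
c_star = project_onto_coherent_line(c)

# Strong dominance: strictly smaller inaccuracy at every world.
strongly_dominates = all(brier(v, c_star) < brier(v, c) for v in worlds.values())
```

Here `c_star` is coherent by construction, so the example exhibits a coherent dominator, which is the feature the remark above singles out.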

We note that one direction of Walsh's (2020) accuracy dominance result follows immediately from Theorem 3.4. We first recall his result.

Proof sketch: If $c$ is incoherent, then there is some finite $\mathcal{F}_0 \subseteq \mathcal{F}$ on which $c$ is incoherent. Restrict $c$ to $c_0$ on $\mathcal{F}_0$. Then by Theorem 2.9, there is some $d_0$ that strongly dominates $c_0$. Extend $d_0$ to a credence function $d$ on $\mathcal{F}$ by copying $c$ off of $\mathcal{F}_0$. Then so long as $c$ has finite inaccuracy at some world, $d$ will weakly dominate $c$. Thanks to Teddy Seidenfeld for suggesting this connection to the finite case.

Further, it is often argued that not all dominated credence functions are irrational, only those that are dominated by a credence function which is itself not dominated (see discussion of the Undominated Dominance principle in Pettigrew 2016, p. 22). For the opinion sets and inaccuracy measures discussed in Section 4, the undominated credence functions will be precisely the coherent credence functions, and so the added strength of Theorem 3.4 is normatively important, as well.

Theorem 3.7 (Walsh 2020).

Let $\mathcal{F}$ be a countably infinite opinion set. Let

$I(c, w) = \sum_{i=1}^\infty 2^{-i} (v_w(p_i) - c(p_i))^2$.  (3)

Then:
1. if $c$ is incoherent, then $c$ is strongly dominated relative to $I$ by a coherent credence function;
2. if $c$ is coherent, then $c$ is not weakly dominated relative to $I$ by any credence function $d \ne c$.

Part 1 of this result follows from Theorem 3.4 by defining $I$ in terms of the generalized quasi-additive Bregman divergence generated by $\{2^{-i}\}_{i=1}^\infty$ and $d(x, y) = x^2 - y^2 - 2y(x - y) = \varphi(x) - \varphi(y) - \varphi'(y)(x - y)$, where $\varphi(x) = x^2$. Note that $I(c, w) < \infty$ for all $c \in \mathcal{C}$ and $w \in W$ as $\sum_i 2^{-i} < \infty$.

Unlike coherent credence functions on finite opinion sets, coherent credence functions on countably infinite opinion sets can be strongly dominated.

Example 4.1.

Let $\mathcal{F} = \{\{n \ge N : n \in \mathbb{N}\} : N \in \mathbb{N}\}$ be an opinion set over $\mathbb{N}$ (including zero). Let $c(\{n \ge N\}) = \frac{1}{\sqrt{N + 1}}$. Then $c$ is coherent (in fact, countably coherent; see Definition 4.6), but $I(c, w) = \infty$ for all $w \in W$ when $I$ is the generalized Brier score. So any omniscient credence function strongly dominates $c$.

In fact, the classic example of a merely finitely additive probability function (the 0-1 function defined on the finite-cofinite algebra over $\mathbb{N}$ taking value 0 on finite sets) restricts to a coherent dominated credence function.

Example 4.2.

Let $\mathcal{F} = \{\{n \le N : n \in \mathbb{N}\} : N \in \mathbb{N}\}$ be an opinion set over $\mathbb{N}$ (including zero). Let $c(\{n \le N\}) = 0$. Then $c$ is coherent (as well as finitely supported and not countably coherent), but $I(c, w) = \infty$ for all $w \in W$ when $I$ is the generalized Brier score. So any omniscient credence function strongly dominates $c$.

The goal of this section is to characterize the opinion sets and inaccuracy measures for which some variant of Theorem 2.9 holds. We extend Theorem 2.9 by proving dominance results for countably coherent credence functions and using an opinion set compactification construction to transfer these results to merely coherent credence functions. At points, our results will only apply to the generalized Brier score. We conjecture that any such result extends to any generalized legitimate inaccuracy measure. In any case, this is a well motivated restriction since the Brier score has been defended by many, including Horwich (1982), Maher (2002), Joyce (2009), and Leitgeb and Pettigrew (2010a), as being a particularly appropriate way to measure inaccuracy. Throughout the rest of this section we assume the opinion set $\mathcal{F}$ is countably infinite.

We begin by introducing the notion of a countably coherent credence function and establishing a characterization theorem regarding countable coherence on countably discriminating opinion sets which extends a result of de Finetti (1974).
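As a numerical aside on Examples 4.1 and 4.2, the following Python sketch computes partial sums of the generalized Brier score, reading the credence in Example 4.1 as $c(\{n \ge N\}) = 1/\sqrt{N + 1}$. The function names and the truncation parameter `n_terms` are assumptions of this sketch. Finite partial sums cannot establish divergence; they only illustrate the unbounded growth that the examples assert.

```python
import math


def partial_brier(truth, credence, n_terms):
    """Sum of (v_w(p_N) - c(p_N))**2 over the propositions p_0, ..., p_{n_terms - 1}."""
    return sum((truth(N) - credence(N)) ** 2 for N in range(n_terms))


def example_4_1(w, n_terms):
    """Propositions p_N = {n >= N}; credence 1 / sqrt(N + 1); world w."""
    return partial_brier(lambda N: 1.0 if w >= N else 0.0,
                         lambda N: 1.0 / math.sqrt(N + 1),
                         n_terms)


def example_4_2(w, n_terms):
    """Propositions p_N = {n <= N}; credence 0 everywhere; world w."""
    return partial_brier(lambda N: 1.0 if w <= N else 0.0,
                         lambda N: 0.0,
                         n_terms)


def omniscient_vs_world_4_2(w_prime, w, n_terms):
    """Partial Brier score of the omniscient credence function v_{w'} at world w."""
    return partial_brier(lambda N: 1.0 if w <= N else 0.0,
                         lambda N: 1.0 if w_prime <= N else 0.0,
                         n_terms)
```

In Example 4.1 the false propositions at world $w$ each contribute $1/(N + 1)$, so the partial sums grow like a harmonic series; in Example 4.2 every true proposition contributes 1. By contrast, an omniscient credence function disagrees with the truth on only finitely many propositions, so its partial sums stabilize.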

Deﬁnition 4.3.

For $\mathcal{F} \subseteq \mathcal{P}(W)$, we define an equivalence relation $\sim$ on $W$ such that $w \sim w'$ if and only if $\{p \in \mathcal{F} : w \in p\} = \{p \in \mathcal{F} : w' \in p\}$. We call the set of equivalence classes of $W$ the quotient of $W$ relative to $\mathcal{F}$. If the quotient of $W$ relative to $\mathcal{F}$ is countable, then we call $\mathcal{F}$ countably discriminating.

Clearly, any countable opinion set over a countable set of worlds is countably discriminating.

The following characterization of the coherent credence functions on finite opinion sets is due to de Finetti (1974). Recall $\mathcal{V}_{\mathcal{F}}$ is the set of omniscient credence functions on $\mathcal{F}$, which is finite when $\mathcal{F}$ is finite.

Theorem 4.4 (de Finetti 1974).

$c$ is a coherent credence function on a finite opinion set $\mathcal{F}$ if and only if there are $\lambda_w \in [0,1]$ with $\sum_{v_w \in \mathcal{V}_{\mathcal{F}}} \lambda_w = 1$ such that $c(p) = \sum_{v_w \in \mathcal{V}_{\mathcal{F}}} \lambda_w v_w(p)$ for all $p \in \mathcal{F}$.

Theorem 4.4 is integral to Predd et al.'s proof that coherence is sufficient to avoid dominance in Theorem 2.9. We now show de Finetti's characterization of the coherent credence functions on finite opinion sets extends to countably coherent credence functions on countably discriminating opinion sets.

Definition 4.5.

A $\sigma$-algebra over $W$ is a subset $\mathcal{F}^* \subseteq \mathcal{P}(W)$ such that:
1. $W \in \mathcal{F}^*$;
2. if $\{p_i\}_{i=1}^\infty \subseteq \mathcal{F}^*$, then $\bigcup_{i=1}^\infty p_i \in \mathcal{F}^*$;
3. if $p \in \mathcal{F}^*$, then $W \setminus p \in \mathcal{F}^*$.

Definition 4.6.

Let a credence function $c$ be countably coherent if $c$ extends to a countably additive probability function on a $\sigma$-algebra $\mathcal{F}^*$ containing $\mathcal{F}$. That is, there is a $c^* : \mathcal{F}^* \to [0,1]$ such that:
1. $c^*(p) = c(p)$ for all $p \in \mathcal{F}$;
2. $c^*(\bigcup_{i=1}^\infty p_i) = \sum_{i=1}^\infty c^*(p_i)$ for $\{p_i\}_{i=1}^\infty \subseteq \mathcal{F}^*$ with $p_i \cap p_j = \emptyset$ for $i \ne j$;
3. $c^*(W) = 1$.
Otherwise, a credence function is countably incoherent.

Note that if $c$ is countably coherent on $\mathcal{F}$, then $c$ extends to a countably additive probability function on $\sigma(\mathcal{F})$, the $\sigma$-algebra generated by $\mathcal{F}$.

Proposition 4.7.

Let $\mathcal{F}$ be a countably discriminating opinion set. Then a credence function $c$ is countably coherent if and only if there are $\lambda_w \in [0,1]$ with $\sum_{v_w \in \mathcal{V}_{\mathcal{F}}} \lambda_w = 1$ such that $c(p) = \sum_{v_w \in \mathcal{V}_{\mathcal{F}}} \lambda_w v_w(p)$ for all $p \in \mathcal{F}$.

Proof.

See the Appendix.
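The characterization in Theorem 4.4 and Proposition 4.7 says that (countably) coherent credence functions are exactly the mixtures of omniscient credence functions. A minimal finite Python sketch follows; the worlds, opinion set, and weights are made up for illustration, and `omniscient` and `mixture` are hypothetical helper names.

```python
def omniscient(world, opinion_set):
    """v_w as a tuple: v_w(p) = 1 if the world belongs to p, else 0."""
    return tuple(1 if world in p else 0 for p in opinion_set)


def mixture(weights, opinion_set):
    """c(p) = sum over worlds of lambda_w * v_w(p), with the lambda_w summing to 1.

    Weights are indexed by world here. If two worlds induce the same omniscient
    credence function, their weights simply add, which matches summing over the
    set of distinct omniscient credence functions.
    """
    assert all(0.0 <= lam <= 1.0 for lam in weights.values())
    assert abs(sum(weights.values()) - 1.0) < 1e-12
    return tuple(
        sum(lam * omniscient(w, opinion_set)[i] for w, lam in weights.items())
        for i in range(len(opinion_set))
    )


# Illustrative data (assumptions of this sketch): three worlds, three propositions.
opinion_set = [frozenset({0}), frozenset({0, 1}), frozenset({1, 2})]
weights = {0: 0.5, 1: 0.3, 2: 0.2}

c = mixture(weights, opinion_set)
```

Each value $c(p)$ is the total weight of the worlds in $p$, so `c` is the restriction to the opinion set of the probability function that assigns weight $\lambda_w$ to world $w$.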

In this section, we introduce the compactification construction of what we call an opinion space. The construction will be relevant to transferring dominance results for countably coherent credence functions to merely coherent credence functions.

Definition 4.8.

An opinion space is a pair $(W, \mathcal{F})$, where $W$ is a nonempty set and $\mathcal{F} \subseteq \mathcal{P}(W)$.

From here on out we will speak in terms of opinion spaces as opposed to opinion sets in order to keep track of the underlying set of worlds.

Borkar et al. (2003) proved that the opinion spaces which satisfy a certain compactness property are precisely those where the set of coherent credence functions and the set of countably coherent credence functions coincide.

Definition 4.9.

Let $(W, \mathcal{F})$ be an opinion space. Let $f(n) \in \{0,1\}$ and set $p_n^{f(n)} = p_n$ if $f(n) = 0$ and $p_n^{f(n)} = p_n^c$ if $f(n) = 1$. Then $(W, \mathcal{F})$ is compact if for any choice of $\{p_n\}_{n=1}^\infty \subseteq \mathcal{F}$ and $f : \mathbb{N} \to \{0,1\}$, if $\bigcap_{n=1}^N p_n^{f(n)}$ is nonempty for every $N$, then $\bigcap_{n=1}^\infty p_n^{f(n)}$ is nonempty.

As an example, note that the opinion spaces from Examples 4.1 and 4.2 are not compact. Indeed, for the first example $\bigcap_{n=1}^\infty p_n = \emptyset$ and yet every finite subset of $\mathcal{F}$ has nonempty intersection; for the second example, $\bigcap_{n=1}^\infty p_n^c = \emptyset$ while $\bigcap_{n=1}^N p_n^c \ne \emptyset$ for every $N$.

Remark 4.10.

Assume $\mathcal{F}$ is closed under finite intersections. Let $\mathcal{A}(\mathcal{F})$ denote the algebra generated by $\mathcal{F}$, and let $\mathcal{T}(\mathcal{F})$ denote the topology generated by $\mathcal{F}$. By the Alexander subbase theorem (see, e.g., Kelley 1975, p. 139), $(W, \mathcal{F})$ is compact if and only if $\mathcal{T}(\mathcal{A}(\mathcal{F}))$ is compact.

Proof: The set of elements in $\mathcal{F}$ and their complements form a subbase for $\mathcal{T}(\mathcal{A}(\mathcal{F}))$.

Theorem 4.11 (Borkar et al. 2003).

The following are equivalent:
1. $(W, \mathcal{F})$ is compact;
2. for every credence function $c$ on $(W, \mathcal{F})$, $c$ is coherent if and only if $c$ is countably coherent.

We now show how to associate a compact space to any space and, in light of Theorem 4.11, a countably coherent credence function to any coherent credence function. Let $(W, \mathcal{F})$ be an opinion space. Let $S$ denote the set of sequences of the form $\{p_n^{f(n)}\}$ (as in Definition 4.9) such that $\bigcap_{n=1}^N p_n^{f(n)} \ne \emptyset$ for every $N$ but $\bigcap_{n=1}^\infty p_n^{f(n)} = \emptyset$. Define $W^* = W \cup \{x_s : s \in S\}$. Define $\mathcal{F}^* \subseteq \mathcal{P}(W^*)$ as follows: for each $p \in \mathcal{F}$, let $S_p$ denote the set of sequences $s$ of the form $\{p_n^{f(n)}\}$ (as in Definition 4.9) such that $s \in S$, $p_n = p$ for some $n$, and $f(n) = 0$. Then define $p^* = p \cup \{x_s : s \in S_p\}$. Finally, let $\mathcal{F}^* = \{p^* : p \in \mathcal{F}\}$. We call $(W^*, \mathcal{F}^*)$ the compactification of $(W, \mathcal{F})$. We always denote the compactification of $(W, \mathcal{F})$ by $(W^*, \mathcal{F}^*)$. Further, we let $\Psi$ denote the natural bijection from $\mathcal{F}$ to $\mathcal{F}^*$ given by $\Psi(p) = p^*$.

We first note that $(W^*, \mathcal{F}^*)$ is in fact compact.

Lemma 4.12.

For any opinion space $(W, \mathcal{F})$, $(W^*, \mathcal{F}^*)$ is compact.

Proof.

See the Appendix.

Next we note that we can naturally identify a coherent credence function on $(W, \mathcal{F})$ with a countably coherent credence function on $(W^*, \mathcal{F}^*)$.

Lemma 4.13.

Let $(W, \mathcal{F})$ be an opinion space and $c$ a coherent credence function on $(W, \mathcal{F})$. Let $(W^*, \mathcal{F}^*)$ be the compactification of $(W, \mathcal{F})$ and define $c^*(\Psi(p)) := c(p)$ for each $p \in \mathcal{F}$. Then $c^*$ is a countably coherent credence function on $(W^*, \mathcal{F}^*)$ and $I(c, w) = I(c^*, w)$ for $w \in W$.

Proof.

See the Appendix.

For a coherent credence function $c$ defined on an opinion space $(W, \mathcal{F})$, we let $c^*$ denote the countably coherent credence function on $(W^*, \mathcal{F}^*)$ given as in Lemma 4.13.

Example 4.14.

As an example, let us compute the compactification of the opinion space from Example 4.2 and show how to identify a coherent credence function on the space with a countably coherent credence function on its compactification. We note that only for $f(n) = 1$ for all $n \in \mathbb{N}$ is $\bigcap_{n=1}^N p_n^{f(n)}$ nonempty for every $N$ while $\bigcap_{n=1}^\infty p_n^{f(n)} = \emptyset$. Indeed, assume $f(m) = 0$ for some $m$. If $f(i) = 1$ for some $i \ge m + 1$, then since $p_i^c \cap p_m = \emptyset$, we have $\bigcap_{n=1}^i p_n^{f(n)} = \emptyset$. So $f(i) = 0$ for all $i \ge m + 1$. But then since $\bigcap_{n=1}^m p_n^{f(n)} \ne \emptyset$ and $\bigcap_{n=1}^m p_n^{f(n)} \subseteq p_i$ for all $i \ge m + 1$, it also follows that $\bigcap_{n=1}^\infty p_n^{f(n)} \ne \emptyset$, which contradicts our assumption. So $S$ is a single point $x$, $W^* = W \cup \{x\}$, and $\mathcal{F}^* = \{\{n \le N\} : N \in \mathbb{N}\}$. $\mathcal{F}^*$ is identical to $\mathcal{F}$, except that there is a point in the complement of every proposition in $\mathcal{F}^*$. For a coherent credence function $c$ on $(W, \mathcal{F})$, $c^*$ on $(W^*, \mathcal{F}^*)$ is identical to $c$ and is a countably coherent credence function on the compact opinion space $(W^*, \mathcal{F}^*)$. For example, for credence function $c$ in Example 4.2, $c^*$ extends to the countably additive omniscient credence function $v_x$ on the $\sigma$-algebra generated by $\mathcal{F}^*$.

Using Theorem 4.11, Lemma 4.12 and Lemma 4.13, the proof strategy for extending Theorem 2.9 is more precisely as follows. First, we establish dominance results for countably coherent credence functions. Second, we associate each coherent credence function on $(W, \mathcal{F})$ with a countably coherent credence function on $(W^*, \mathcal{F}^*)$ as in Lemma 4.13. Lastly, we use the dominance results for countably coherent credence functions to establish dominance results for coherent credence functions in certain cases where there is "accuracy dominance stability" in compactifying.

4.3 W-Stable Opinion Spaces

In this section, we establish the equivalence between coherence and avoiding weak dominance for certain opinion spaces (Theorem 4.19), as well as additional results extending Theorem 2.9 (Corollary 4.22 and Theorem 4.23). We first note that under certain circumstances countably coherent credence functions are not weakly dominated (Proposition 4.16 and Proposition 4.17); then we use the compactification construction from the previous section and a property of an opinion space, W-stability (Definition 4.18), to establish that for certain opinion spaces, mere coherence is also sufficient to avoid weak dominance.

We first prove that if a countably coherent credence function $c$ has finite expected inaccuracy, then $c$ is not weakly dominated.

Definition 4.15.

For $c$ a countably coherent credence function and $I$ a generalized legitimate inaccuracy measure, we say that $c$ has finite expected inaccuracy relative to $I$ if $c$ has a countably additive extension $\bar{c}$ defined on the opinion space $(W, \sigma(\mathcal{F}))$ such that $E_{\bar{c}}\, I(c, \cdot) < \infty$. For $c$ a coherent but not countably coherent credence function and $I$ a generalized legitimate inaccuracy measure, we say that $c$ has finite expected inaccuracy relative to $I$ if $c^*$ has finite expected inaccuracy relative to $I$.

Note that it follows by Definition 4.15 that any coherent credence function $c$ has finite expected inaccuracy if and only if $c^*$ has finite expected inaccuracy.

Proposition 4.16.

See the Appendix.

Here is another dominance result for countably coherent credence functions where we assume $\mathcal{F}$ is point-finite ($|\{p \in \mathcal{F} : w \in p\}| < \infty$ for all $w \in W$) but weaken the assumption that $c$ has finite expected inaccuracy considerably, namely to somewhere finitely inaccurate (there is a $w \in W$ such that $I(c, w) < \infty$). We also restrict to the generalized Brier score $B$.

Proposition 4.17.

Let $(W, \mathcal{F})$ be a point-finite opinion space with $\mathcal{F}$ countably infinite and $I$ a generalized legitimate inaccuracy measure. If a credence function $c$ is countably coherent and somewhere finitely inaccurate relative to $B$, then $c$ is not weakly dominated relative to $B$.

Proof.

See the Appendix.

We now introduce the notion of W-stability, which will allow us to use Propositions 4.16 and 4.17 to prove extensions of Theorem 2.9.

Deﬁnition 4.18.

Let $(W, \mathcal{F})$ be W-stable relative to $I$ if for any coherent credence function $c$ on $(W, \mathcal{F})$, if $c$ is weakly dominated relative to $I$, then $c^*$ on $(W^*, \mathcal{F}^*)$ is weakly dominated relative to $I$.

Consider the measure space $(W, \sigma(\mathcal{F}), \mu)$. Note $d(v_w(p_i), d_i) = 1_{p_i}(w)\, d(1, d_i) + (1 - 1_{p_i}(w))\, d(0, d_i)$, so that each term in $I(d, \cdot)$ is measurable for any credence function $d$, and so the infinite sum is measurable as the finite sum and limit of measurable functions are measurable. Thus we can take the expectation of $I(d, \cdot)$ with respect to $\mu$ for any credence function $d$.

Theorem 4.19.

Let $I$ be a generalized legitimate inaccuracy measure and $(W, \mathcal{F})$ a W-stable opinion space relative to $I$ where all coherent credence functions have finite expected inaccuracy relative to $I$. Then the following are equivalent:
1. $c$ is coherent;
2. $c$ is not weakly dominated.

Proof.

We prove that if $c$ is coherent, then $c$ is not weakly dominated. Let $(W^*, \mathcal{F}^*)$ be the compactification of $(W, \mathcal{F})$. If $c$ is coherent on $(W, \mathcal{F})$, then $c^*$ is countably coherent by Lemma 4.13. Further, $c^*$ has finite expected inaccuracy by definition and the assumption that $c$ has finite expected inaccuracy. So by Proposition 4.16, $c^*$ is not weakly dominated. But since $(W, \mathcal{F})$ is W-stable, this implies that $c$ is not weakly dominated. The other direction follows from Theorem 3.4.

Remark 4.20.

It is trivial to see that W-stability is necessary for the equivalence of coherence and not being weakly dominated. It is open how far finite expected inaccuracy can be weakened.

Remark 4.21.

If $I$ is defined with summable weights, that is, $\{a_i\}_{i=1}^\infty$ such that $\sum_{i=1}^\infty a_i < \infty$, then there is a $C < \infty$ such that $I(c, w) < C$ for all credence functions $c$ and $w \in W$. So, in particular, all coherent credence functions have finite expected inaccuracy relative to $I$.

If we add in an additional finiteness assumption, then we get the full equivalence of Theorem 2.9.

Corollary 4.22.

In Theorem 4.19, if in addition all coherent credence functions $c$ have $I(c, w) < \infty$ for all $w \in W$, then the following are equivalent:
1. $c$ is coherent;
2. $c$ is not weakly dominated;
3. $c$ is not strongly dominated.

We combine W-stability and Proposition 4.17 to get another set of sufficient conditions on $(W, \mathcal{F})$ for Theorem 2.9 to go through for the generalized Brier score.

Theorem 4.23.

Let $(W, \mathcal{F})$ be a W-stable opinion space with $(W^*, \mathcal{F}^*)$ point-finite such that all coherent credence functions are somewhere finitely inaccurate relative to $B$. Then the following are equivalent:
1. $c$ is coherent;
2. $c$ is not weakly dominated relative to $B$;
3. $c$ is not strongly dominated relative to $B$.

Proof.

If $c$ is coherent, then $c^*$ is countably coherent on a point-finite opinion set. Further, $c^*$ is somewhere finitely inaccurate relative to $B$, as $c$ is somewhere finitely inaccurate by assumption. Thus by Proposition 4.17, $c^*$ is not weakly dominated relative to $B$. By W-stability, $c$ is not weakly dominated relative to $B$. Clearly if $c$ is not weakly dominated then $c$ is not strongly dominated. Finally, we show that if $c$ is incoherent then $c$ is strongly dominated. First, if $c$ is not somewhere finitely inaccurate, then any omniscient credence function strongly dominates $c$ since $I(v_w, w') < \infty$ for every $w, w' \in W$ by point-finiteness. If $c$ is somewhere finitely inaccurate then $I(c, w) < \infty$ for all $w \in W$ by point-finiteness. Thus Theorem 3.4 establishes that $c$ is strongly dominated relative to $B$.

Remark 4.24.

We can drop the assumption that all coherent credence functions are somewhere finitely inaccurate in Theorem 4.23 if we strengthen W-stable to compact so that $(W, \mathcal{F}) = (W^*, \mathcal{F}^*)$. Indeed, compactness alongside point-finiteness implies coherent credence functions on $(W, \mathcal{F}) = (W^*, \mathcal{F}^*)$ are somewhere finitely inaccurate: if there were a coherent (and thus countably coherent) credence function infinitely inaccurate at all worlds, then it would be strongly dominated by an omniscient credence function, contradicting Proposition 4.27 below.

As an application of Theorem 4.19, we establish Theorem 2.9 for countably infinite partitions. In parts of the existing literature (e.g., in Joyce 2009), credence functions are assumed to be defined on a (finite) partition of $W$ to begin with, and so such a result might be especially relevant to extending the accuracy argument for probabilism to countably infinite opinion sets.

Lemma 4.25.

A partition is W-stable relative to any generalized legitimate inaccuracy measure.

Proof.

See the Appendix.

Theorem 4.26.

Let $(W, \mathcal{F})$ be a partition and $I$ a generalized legitimate inaccuracy measure. Then the following are equivalent:
1. $c$ is coherent;
2. $c$ is not weakly dominated;
3. $c$ is not strongly dominated.

Proof.

The result follows from Corollary 4.22, Lemma 4.25, and the fact that $I(c^*, \cdot)$ is bounded on $W^*$ for each coherent credence function $c$. To see the latter, note that since $c$ is coherent it follows that $\sum_i c_i = \sum_i c^*_i \le 1$. For $w \in W$ such that $w \in p_i$, recalling that $\varphi(0) = 0$,

$$I(c^*, w) = a_i\, d(1, c^*_i) + \sum_{j \ne i} a_j\, d(0, c^*_j) = a_i\, d(1, c^*_i) + \sum_{j \ne i} a_j \bigl(c^*_j \varphi'(c^*_j) - \varphi(c^*_j)\bigr) \le C + D \sum_j c^*_j \le C + D$$

for some constants $C, D$ independent of $c^*$ or $w$. Similarly, as seen in the proof of Lemma 4.25, $W^* \setminus W = \{w^*\}$ where

$$I(c^*, w^*) = \sum_{j=1}^{\infty} a_j\, d(0, c^*_j) = \sum_{j=1}^{\infty} a_j \bigl(c^*_j \varphi'(c^*_j) - \varphi(c^*_j)\bigr) \le C$$

for some constant $C$ independent of $c^*$ or $w$. It follows that i) all coherent credence functions have finite expected inaccuracy and ii) $I(c, w) < \infty$ for $c \in \mathcal{C}$ and $w \in W$. Thus Lemma 4.25 and Corollary 4.22 establish the result.

[Footnote: It has been noted that de Finetti's (1974) original proof of Theorem 2.9 assuming the Brier score extends to countably infinite opinion sets. However, the only proof we have seen is a sketch of the necessity of coherence for countably infinite partitions by Joyce (1998, footnote 6), and such a claim could not be true for arbitrary countable opinion sets, as Examples 4.1 and 4.2 show. Further, we prove the extension for arbitrary generalized legitimate inaccuracy measures.]

In this section, we establish the equivalence between coherence and avoiding strong dominance for certain opinion spaces (Theorem 4.29). The conditions are in terms of the analogous stability condition, S-stability (Definition 4.28), but a different finiteness assumption, and the proof strategy is the same as for Theorem 4.19.

We begin by establishing that on compact opinion spaces, coherent and thus countably coherent credence functions (recall Theorem 4.11) are not strongly dominated.

Proposition 4.27.

Let $(W, \mathcal{F})$ be a compact opinion space and $I$ a generalized legitimate inaccuracy measure. If $c$ is coherent (and thus countably coherent), then $c$ is not strongly dominated relative to $I$.

Proof.

See the Appendix.

We now introduce S-stability and the main theorem of this section.

Deﬁnition 4.28.

We say that $(W, \mathcal{F})$ is S-stable relative to $I$ if, whenever a coherent credence function $c$ defined on $(W, \mathcal{F})$ is strongly dominated relative to $I$, the credence function $c^*$ defined on $(W^*, \mathcal{F}^*)$ is also strongly dominated relative to $I$.

Theorem 4.29.

Let $I$ be a generalized legitimate inaccuracy measure and $(W, \mathcal{F})$ an S-stable opinion space relative to $I$. Assume that $I(c, w) < \infty$ for each coherent credence function $c$ and $w \in W$. Then the following are equivalent:

1. $c$ is coherent;
2. $c$ is not strongly dominated.

Proof.

Assume $c$ is coherent. Then $c^*$, defined on the compact opinion space $(W^*, \mathcal{F}^*)$, is countably coherent by Lemma 4.13. So by Proposition 4.27, $c^*$ is not strongly dominated. But since $\mathcal{F}$ is S-stable, this implies that $c$ is not strongly dominated relative to $W$. The other direction follows from Theorem 3.4.

Remark 4.30.

It is trivial to see that S-stability is necessary for the equivalence of coherence and avoiding strong dominance. It is open how much the assumption that coherent credence functions satisfy $I(c, w) < \infty$ for all $w$ can be weakened.

Remark 4.31.

Schervish et al. (2009) take a different approach to dropping the assumption that the opinion set is finite: they apply weak and strong dominance notions to finite subsets of arbitrarily sized opinion sets. They also explore connections between the two notions of dominance considered here (weak and strong dominance) and what they call coherence, which amounts to avoiding being susceptible to a finite Dutch book.

[Footnote: Thanks to Teddy Seidenfeld for pointing me to this work of Schervish et al. An additional point worth noting about their work is that they further generalize the finite results of Predd et al. (2009) by i) allowing a wider variety of inaccuracy measures, including those which are merely proper as opposed to strictly proper, and ii) scoring conditional probabilities. A natural direction for future work is to use these relaxations in the finite case to relax assumptions made here. Similarly, Steeger (2019) considers the property of avoiding strong dominance with respect to the Brier score for every finite subset of arbitrarily sized opinion sets (see "sufficient coherence" on p. 38).]

Remark 4.32.

Theorem 4.29 is related to Theorem 1 of Schervish et al. 2014. However, 1) their assumptions are in some ways weaker and in some ways stronger than those in Theorem 4.29, and 2) while Schervish et al. (2014) establish that coherence is sufficient for avoiding strong dominance in certain cases, unlike Theorems 4.19 and 4.29, their results do not show that coherence is sufficient for avoiding even weak dominance in certain cases or that incoherence always entails being weakly dominated (and sometimes strongly dominated) by a coherent credence function (see Remark 3.6).

[Footnote: Schervish et al. require that the prevision for the inaccuracy of the credence function be finite and that inaccuracy be pointwise finite, while we only assume the latter. On the other hand, we require the opinion set to be S-stable while they do not.]

While Theorems 4.19 and 4.29 come close to characterizing the opinion spaces on which not being weakly and strongly dominated, respectively, are equivalent to coherence, it is open how far the finiteness assumptions in the theorems can be weakened. This is a natural next line of inquiry. In addition, it would be useful to determine characterizations of W- and S-stability in terms of the inaccuracy measure that make it relatively easy to check whether an opinion set is W- or S-stable. Also, there are natural ways to generalize the results above to more closely match the finite results: allow different one-dimensional Bregman divergences for different propositions and allow unbounded one-dimensional Bregman divergences.

Another direction one could go in exploring the sufficiency of coherence for avoiding dominance is as follows: instead of characterizing the countable opinion sets on which Theorem 2.9 goes through, one could characterize the kinds of coherent credence functions for which Theorem 2.9 goes through on any countable opinion set. Doing so might show that while coherence is not enough to avoid dominance in all cases, coherence along with additional plausible constraints is sufficient. In particular, while restricting to finitely supported credence functions is not enough to establish the sufficiency of coherence for avoiding strong dominance (due to Example 4.2), it is open whether countable coherence is equivalent to avoiding weak or strong dominance on the restricted class.

[Footnote: Thanks to Thomas Icard and Milan Mosse for suggesting this alternative direction of study.]

So far we have been concerned with credences defined on countably infinite opinion sets. We now consider what can be said in favor of probabilism when credences are defined on uncountable opinion sets. When extending from the finite to the countably infinite setting, we used inaccuracy measures that naturally restrict to legitimate inaccuracy measures in the finite case. Similarly, in the uncountable case, we allow for measure theoretically defined inaccuracy measures that naturally restrict to generalized legitimate inaccuracy measures in the countable case. However, for the sake of generality, we allow inaccuracy to be defined by integration against any finite measure.

[Footnote: Note that since the counting measure over $\mathbb{N}$ is not a finite measure, the result below does not directly establish Theorem 3.4.]

Definition 5.1.

Let $(\mathcal{F}, \mathcal{A}, \mu)$ be a measure space and $c : \mathcal{F} \to \mathbb{R}^+$. If $c$ is $\mathcal{A}$-measurable and $\mu(\{p : c(p) \notin [0, 1]\}) = 0$, we call $c$ a $\mu$-credence function. We say a $\mu$-credence function $c$ is $\mu$-coherent if there is a coherent (in the usual sense) credence function $c'$ on $\mathcal{F}$ with $c = c'$ $\mu$-a.e. We say a $\mu$-credence function is $\mu$-incoherent if there is no coherent credence function $c'$ such that $c = c'$ $\mu$-a.e.

Definition 5.2.

Let $\mathcal{F}$ be an opinion set (of arbitrary cardinality) over a set $W$ of worlds. Let $(\mathcal{F}, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space over the opinion set $\mathcal{F}$. Let $\mathcal{C}$ be the space of all $\mu$-credence functions. Assume $I : \mathcal{C} \times W \to [0, \infty]$ is such that, for all $(c, w) \in \mathcal{C} \times W$, we have

$$I(c, w) = B_{\varphi,\mu}(v_w, c),$$

where $B_{\varphi,\mu}$ is a Bregman distance relative to $\varphi$ and $(\mathcal{F}, \mathcal{A}, \mu)$ (see Definition A.1). In particular, each $v_w$ is a $\mu$-credence function. Then we call $I$ an integral inaccuracy measure on $(\mathcal{F}, \mathcal{A}, \mu)$.

We now state a dominance result about integral inaccuracy measures. The proof is essentially a measure theoretic version of the proof of Theorem 3.4.

Theorem 5.3.

Let $I$ be an integral inaccuracy measure on a finite measure space $(\mathcal{F}, \mathcal{A}, \mu)$. Then for every $\mu$-credence function $c$, if $c$ is $\mu$-incoherent, then there is a $\mu$-coherent $\mu$-credence function $c'$ that strongly dominates $c$ relative to $I$.

Proof.

See the Appendix.

Here is an example of how Theorem 5.3 can be used to give an accuracy argument in a concrete uncountable setting. Assume we have a coin with unknown bias $\theta \in [0, 1]$ and a set of propositions of the form "$a \le \theta \le b$" for each $a, b \in [0, 1]$ with $a \le b$. Then a credence function on this uncountable opinion set can be represented by a function $c : X \to [0, 1]$, where

$$X = \{(a, b) : 0 \le a \le b \le 1\} \subseteq [0, 1]^2.$$

We put the Lebesgue measure $\lambda$ on $X$ to generalize the additive constraint often assumed in the finite case. We let

$$I(c, w) = \int_X d(v_w(x), c(x))\, \lambda(dx)$$

for a bounded one-dimensional Bregman divergence $d$. Then the assumptions of Theorem 5.3 hold, so we get the following dominance result: for any $\lambda$-credence function $c$, if $c$ is $\lambda$-incoherent, then there is a $\lambda$-coherent $\lambda$-credence function that strongly dominates $c$.

[Footnote: Again, we assume the one-dimensional Bregman divergence $d$ generated by $\varphi$ is bounded.]

Conclusion

There is plenty of normative work to be done using the results established above. In light of the failure of coherence being sufficient to avoid strong dominance on certain countably infinite opinion sets, the most pressing question seems to be: is there an accuracy-based argument for probabilism on at least all countable opinion sets? If not, what does this mean for the accuracy project as a whole? Can we give some sort of privileged status to certain kinds of opinion sets or inaccuracy measures for which coherence is equivalent to not being dominated, e.g., partitions? What is the normative status of the stronger condition of countable coherence? Further, while the measure theoretic framework introduced in Section 5 to score inaccuracy of credence functions over opinion sets of arbitrary cardinality seems like a natural extension of the finite and countably infinite frameworks, is it well motivated that inaccuracy does not track the behavior of a credence function on measure zero sets? The hope with this paper is to start a conversation about these questions by first establishing relevant mathematical results.

Acknowledgements

Thanks to participants of the Berkeley-Stanford Logic Circle (April 2019), Probability and Logic Conference (July 2019), Berkeley Formal Epistemology Reading Course (October 2019), and Stanford Logic and Formal Philosophy Seminar (November 2019), to whom earlier versions of this paper were presented. Special thanks to Craig Evans, Wesley Holliday, Thomas Icard, Kiran Luecke, Calum McNamara, Sven Neth, Richard Pettigrew, Eric Raidl, Teddy Seidenfeld, and James Walsh for helpful comments and discussion.

A Appendix

A.1 Proof of Theorem 3.4

We review the necessary background before proving Theorem 3.4.

A.1.1 Generalized Projections

Csiszár (1995) showed that what he calls generalized projections onto convex sets with respect to Bregman distances exist under very general conditions. We review his relevant results here (but assume knowledge of basic measure theory).

Deﬁnition A.1.

Fix a $\sigma$-finite measure space $(X, \mathcal{X}, \mu)$. The Bregman distance of non-negative ($\mathcal{X}$-measurable) functions $s$ and $t$ is defined by

$$B_{\varphi,\mu}(s, t) = \int d(s(x), t(x))\, \mu(dx) \in [0, \infty],$$

where $d(s(x), t(x)) = \varphi(s(x)) - \varphi(t(x)) - \varphi'(t(x))(s(x) - t(x))$ for some strictly convex, differentiable function $\varphi$ on $(0, \infty)$. Note that $B_{\varphi,\mu}(s, t) = 0$ iff $s = t$ $\mu$-a.e.

[Footnote: See Csiszár 1995, p. 165 for details. For $B_{\varphi,\mu}$ to be a distance measure, we do not need to assume that $\varphi(1) = \varphi'(1) = 0$, by the remark following (1.9) in Csiszár 1995.]

Remark A.2.

Notice that a generalized quasi-additive Bregman divergence $D$ with weights $\{a_i\}_{i=1}^{\infty}$ whose generating one-dimensional Bregman divergence $d$ is given in terms of $\varphi$ has a corresponding Bregman distance $B_{\bar\varphi,\mu}$ with

1. the measure space being $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu(A) = \sum_{i \in A} a_i$ for each $A \in \mathcal{P}(\mathbb{N})$, and
2. $\bar\varphi$ on $(0, \infty)$ being a strictly convex, differentiable extension of $\varphi$ on $[0, 1]$.

Thus non-negative ($\mathcal{P}(\mathbb{N})$-measurable) functions are elements of $(\mathbb{R}^+)^{\infty}$. Note, importantly, that the corresponding generalized legitimate inaccuracy measure $I$ determined by $D$ is also given by the corresponding Bregman distance. That is,

$$I(c, w) = B_{\bar\varphi,\mu}(v_w, c).$$

To simplify notation, let $B$ denote a Bregman distance $B_{\bar\varphi,\mu}$. Let $S$ be the set of non-negative measurable functions. For any $E \subseteq S$ and $t \in S$, we write

$$B(E, t) = \inf_{s \in E} B(s, t).$$

If there exists $s^* \in E$ with $B(s^*, t) = B(E, t)$, then $s^*$ is unique and is called the $B$-projection of $t$ onto $E$ (see Csiszár 1995, Lemma 2). As Csiszár notes, these projections may not exist. However, a weaker kind of projection exists in a large number of cases. To describe them, we need to introduce a kind of convergence called loose in $\mu$-measure convergence.

Definition A.3.

We say a sequence $\{s_n\}$ of elements from $S$ converges loosely in $\mu$-measure to $t$, denoted by $s_n \overset{\mu}{\rightsquigarrow} t$, if for every $A \in \mathcal{X}$ with $\mu(A) < \infty$, we have

$$\lim_{n \to \infty} \mu(A \cap \{p : |s_n(p) - t(p)| > \epsilon\}) = 0 \quad \text{for all } \epsilon > 0.$$

Definition A.4.

i. Given $E \subseteq S$ and $t \in S$, we say that a sequence $\{s_n\}$ of elements from $E$ is a $B$-minimizing sequence if $B(s_n, t) \to B(E, t)$.
ii. If there is an $s^* \in S$ such that every $B$-minimizing sequence converges to $s^*$ loosely in $\mu$-measure, then we call $s^*$ the generalized $B$-projection of $t$ onto $E$.

The result that is integral to proving Theorem 3.4 is the following (see Csiszár's Theorem 1, Lemma 2, and Corollary of Theorem 1).

Theorem A.5 (Csiszár 1995).

Let $E$ be a convex subset of $S$ and $t \in S$. If $B(E, t)$ is finite, then there exists $s^* \in S$ such that

$$B(s, t) \ge B(E, t) + B(s, s^*) \quad \text{for every } s \in E$$

and $B(E, t) \ge B(s^*, t)$. It follows that the generalized $B$-projection of $t$ onto $E$ exists and equals $s^*$.

[Footnote to item 2 of Remark A.2: Using that $\varphi'$ exists and is finite at $x = 1$, as we assumed $d$ is bounded, we extend $\varphi$ as follows: for $x \in [1, \infty)$, let $\bar\varphi(x) = q(x) = x^2 + bx + c$, where $b$ and $c$ are chosen so that $\varphi(1) = q(1)$ and $\varphi'(1) = q'(1)$. Then, using the fact that $\bar\varphi$ is differentiable at 1 by construction and that a function is strictly convex if and only if its derivative is strictly increasing, it is easy to see that $\bar\varphi$ is differentiable and strictly convex on $(0, \infty)$.]

A.1.2 Extending Partial Measures

We also use an extension result of Horn and Tarski (1948) in the proof of Theorem 3.4. Following Horn and Tarski, we introduce partial measures and recall that they can be extended to finitely additive probability functions. Recall the definition of a finitely additive probability function in Definition 2.2 (though we drop the assumption that $\mathcal{F}$ is finite).

Remark A.6.

It is a simple corollary of the definition of a finitely additive probability function $c$ over an algebra $\mathcal{F}$ that for any $p, p' \in \mathcal{F}$: if $p \subseteq p'$, then $c(p) \le c(p')$.

Here is another useful fact about finitely additive probability functions.

Proposition A.7.

If $c$ is a finitely additive probability function on an algebra $\mathcal{F}$ and $a_0, \ldots, a_{m-1} \in \mathcal{F}$, then

$$\sum_{k=0}^{m-1} c(a_k) = \sum_{k=0}^{m-1} c\Bigl(\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} a_{p_i}\Bigr) \qquad (4)$$

where $S_{m,k}$ is the set of all sequences $p = (p_0, \ldots, p_k)$ with $0 \le p_0 < \ldots < p_k < m$.

To introduce the notion of a partial measure, we need the following definition.

Definition A.8.

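Let $\varphi_0, \ldots, \varphi_{m-1}$ and $\psi_0, \ldots, \psi_{n-1}$ be elements of $\mathcal{F}$. Then we write

$$(\varphi_0, \ldots, \varphi_{m-1}) \subseteq (\psi_0, \ldots, \psi_{n-1})$$

to mean

$$\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} \subseteq \bigcup_{p \in S_{n,k}} \bigcap_{i \le k} \psi_{p_i} \quad \text{for every } k < m \qquad (5)$$

where $S_{r,k}$ ($r = m, n$) is as in Proposition A.7.

Definition A.9.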

A function $c$, defined on a subset $S$ of an algebra $\mathcal{F}$ over $W$, that maps to $\mathbb{R}$ is called a partial measure if it satisfies the following properties:

1. $c(x) \ge 0$ for $x \in S$;
2. If $\varphi_0, \ldots, \varphi_{m-1}, \psi_0, \ldots, \psi_{n-1} \in S$ and $(\varphi_0, \ldots, \varphi_{m-1}) \subseteq (\psi_0, \ldots, \psi_{n-1})$, then $\sum_{k=0}^{m-1} c(\varphi_k) \le \sum_{k=0}^{n-1} c(\psi_k)$;
3. $W \in S$ and $c(W) = 1$.

The following result is the point of introducing the above definitions.

Theorem A.10 (Horn and Tarski 1948).

Let $c$ be a partial measure on a subset $\mathcal{F}$ of an algebra $\mathcal{A}$. Then there is a finitely additive probability function $c^*$ on $\mathcal{A}$ that extends $c$.

[Footnote to Definition A.8: Note that if $m > n$, this condition implies $\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} = \bigcup_{p \in S_{n,k}} \bigcap_{i \le k} \psi_{p_i} = \emptyset$ for $k \ge n$.]

A.1.3 Proof

We now establish the necessity of coherence to avoid dominance.
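Before the proof, the inequality of Theorem A.5 can be checked by hand in a finite toy case. The sketch below is only an illustration under assumptions not made in the paper: a two-element opinion set {p, not-p}, unit weights, the generator φ(x) = x² (so the divergence is squared error), and a grid search standing in for the projection. All names (`bregman`, `coherent_grid`, `pi_c`) are hypothetical.

```python
def bregman(phi, dphi, s, t):
    """Unit-weight Bregman distance: sum_i phi(s_i) - phi(t_i) - phi'(t_i) * (s_i - t_i)."""
    return sum(phi(a) - phi(b) - dphi(b) * (a - b) for a, b in zip(s, t))


def phi(x):
    return x * x  # illustrative generator; yields squared error


def dphi(x):
    return 2.0 * x


# On the opinion set {p, not-p}, the coherent credences are the pairs (x, 1 - x).
coherent_grid = [(k / 1000, 1 - k / 1000) for k in range(1001)]

c = (0.3, 0.3)  # incoherent: the two credences sum to 0.6
# Grid approximation of the B-projection of c onto the coherent set E.
pi_c = min(coherent_grid, key=lambda s: bregman(phi, dphi, s, c))
b_E_c = bregman(phi, dphi, pi_c, c)  # approximates B(E, c)

omniscient = [(1.0, 0.0), (0.0, 1.0)]  # v_w at the p-world and at the not-p-world
checks = []
for v_w in omniscient:
    lhs = bregman(phi, dphi, v_w, c)             # I(c, w)
    rhs = b_E_c + bregman(phi, dphi, v_w, pi_c)  # B(E, c) + I(pi_c, w)
    checks.append((lhs, rhs))
```

In this example the coherent set is affine, so each pair in `checks` agrees up to rounding; since B(E, c) is strictly positive, the projected credence function is strictly more accurate than c at both worlds, which is the pattern the proof below establishes in general.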

Theorem 3.4.

Let $I$ be a generalized legitimate inaccuracy measure and thus defined by a Bregman distance $B_{\bar\varphi,\mu}$ (see Remark A.2). We write $B$ for $B_{\bar\varphi,\mu}$. Let $S$ be the set of non-negative functions on $\mathcal{F}$. Let $E \subseteq S$ be the set of coherent credence functions on $\mathcal{F}$. Then clearly $E$ is convex. Let $c$ be an incoherent credence function.

Case 1: $I(c, w) = \infty$ for all $w \in W$. Then since $I(v_w, w) = 0$ for all $w \in W$, any omniscient credence function weakly dominates $c$.

Case 2: $I(c, w') < \infty$ for some $w' \in W$. We show that there is a coherent credence function $\pi_c$ such that $I(c, w) > I(\pi_c, w)$ for any $w$ such that $I(c, w) < \infty$. Since $v_{w'} \in E$, we see that

$$B(E, c) \le B(v_{w'}, c) = I(c, w') < \infty.$$

Thus we can apply Theorem A.5 to get a $\pi_c \in S$ such that

$$B(s, c) \ge B(E, c) + B(s, \pi_c) \quad \text{for every } s \in E. \qquad (6)$$

In particular, (6) holds when $s$ is the omniscient credence function at world $w$ for any $w \in W$; and so we see that

$$I(c, w) \ge B(E, c) + I(\pi_c, w) \qquad (7)$$

for all $w$, where all numbers in (7) are finite whenever $I(c, w) < \infty$.

Next we show that $\pi_c$ is in fact coherent. This is due to the following claim: $E$ is closed under loose convergence in $\mu$-measure, where $\mu$ is a weighted counting measure on $\mathcal{P}(\mathbb{N})$ defined with weights $\{a_i\}_{i=1}^{\infty}$. To see this, let $c_n \in E$ for each $n$ and $c \in S$. Assume $c_n \to c$ loosely in $\mu$-measure. We show $c \in E$, i.e., $c$ is coherent. Note $c$ is coherent on $\mathcal{F}$ if and only if $c' : \mathcal{F} \cup \{W\} \to [0, 1]$ is coherent on $\mathcal{F} \cup \{W\}$, where $c' = c$ on $\mathcal{F}$ and $c'(W) = 1$. Thus it suffices to assume $c$ and $c_n$ for all $n$ are defined on $\mathcal{F} \cup \{W\}$ with $c(W) = c_n(W) = 1$ for all $n$.

It is easy to see that loose convergence in a weighted counting measure (where all weights are non-zero) implies pointwise convergence on $\mathcal{F}$, so

$$c(p) = \lim_{n \to \infty} c_n(p) \in [0, 1]$$

for each $p \in \mathcal{F} \cup \{W\}$. To show $c \in E$, it suffices to show $c$ can be extended to a finitely additive probability function on $\mathcal{P}(W)$.

We first show $c$ is a partial measure on $\mathcal{F} \cup \{W\}$. Definitions A.9.1 and A.9.3 clearly hold for $c$, so we just need to show Definition A.9.2 holds. Let $\varphi_0, \ldots, \varphi_{m-1}, \psi_0, \ldots, \psi_{m'-1} \in \mathcal{F} \cup \{W\}$ and

$$\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} \subseteq \bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i}$$

for every $k < m$. Since the $c_n$ are coherent and thus extend to finitely additive probability functions on algebras containing $\mathcal{F}$, we have by Proposition A.7 and Remark A.6 that

$$\sum_{k=0}^{m-1} c_n(\varphi_k) = \sum_{k=0}^{m-1} c_n\Bigl(\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i}\Bigr) \le \sum_{k=0}^{m'-1} c_n\Bigl(\bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i}\Bigr) = \sum_{k=0}^{m'-1} c_n(\psi_k),$$

using that

$$\bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} = \bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i} = \emptyset$$

for $k \ge m'$. Sending $n$ to infinity and using the pointwise convergence of $c_n$ to $c$ on $\mathcal{F} \cup \{W\}$, we obtain that

$$\sum_{k=0}^{m-1} c(\varphi_k) \le \sum_{k=0}^{m'-1} c(\psi_k).$$

Thus $c$ is a partial measure on $\mathcal{F} \cup \{W\}$. By Theorem A.10, it follows that there is a finitely additive probability function $c^*$ on an algebra $\mathcal{F}^* \supseteq \mathcal{F}$ that extends $c$, and so $c \in E$, which concludes the proof that $E$ is closed under loose $\mu$-convergence.

By Theorem A.5, $\pi_c$ is the generalized $B$-projection of $c$ onto $E$. Also, since

$$B(E, c) = \inf_{s \in E} B(s, c) < \infty,$$

there is a $B$-minimizing sequence $\{s_n\} \subseteq E$ such that $B(s_n, c) \to B(E, c)$ by the definition of infimum. By the definition of a generalized projection, $s_n \overset{\mu}{\rightsquigarrow} \pi_c$. Since $E$ is closed under loose convergence, it follows that $\pi_c \in E$. Further, by Theorem A.5,

$$B(E, c) \ge B(\pi_c, c) > 0,$$

since $\pi_c \ne c$ (as $c$ is incoherent) and $B(s, t) = 0$ if and only if $s = t$ (as $\mu$ is a weighted counting measure with all non-zero weights). So for every $w$ such that $I(c, w) < \infty$, we deduce that

$$I(c, w) \ge B(E, c) + I(\pi_c, w) > I(\pi_c, w).$$

This proves that $c$ is weakly dominated by $\pi_c$, and $c$ is strongly dominated by $\pi_c$ if $I(c, w) < \infty$ for all $w \in W$.

A.2 Proofs from Section 4

Proposition 4.7.

We adapt the proof of Proposition 1 in Predd et al. 2009. Let $\mathcal{F} = \{p_1, p_2, \ldots\}$. Let $X$ be the collection of all nonempty sets of the form $\bigcap_{i=1}^{\infty} p_i^*$ where $p_i^*$ is either $p_i$ or $p_i^c$. Then $X$ partitions $W$. Also, $X$ is in bijection with $V_{\mathcal{F}}$, the set of omniscient credence functions. Indeed, let $f$ map $v_w$ to $\bigcap_{i=1}^{\infty} p_i^*$ where $p_i^* = p_i$ if $v_w(p_i) = 1$ and $p_i^* = p_i^c$ otherwise. Then for each $w$, $w \in f(v_w)$ and so $f(v_w) \in X$. Note $f$ is onto. Indeed, let $w \in \bigcap_{i=1}^{\infty} p_i^*$, where $\bigcap_{i=1}^{\infty} p_i^* \in X$. Then $f(v_w) = \bigcap_{i=1}^{\infty} p_i^*$. Also, $f$ is injective. Indeed, assume $f(v_w) = f(v_{w'})$. Then

$$f(v_w) = \bigcap_{i=1}^{\infty} p_i^1 = \bigcap_{i=1}^{\infty} p_i^2 = f(v_{w'})$$

for $p_i^j = p_i$ or $p_i^j = p_i^c$ for all $i \in \mathbb{N}$ and $j \in \{1, 2\}$. If $p_i^1 \ne p_i^2$ for some $i$, then without loss of generality we may assume $p_i^1 = p_i$ and $p_i^2 = p_i^c$. So $w \in p_i^1$ but $w \notin p_i^2$ and thus $w \notin \bigcap_{i=1}^{\infty} p_i^2$. But $w \in \bigcap_{i=1}^{\infty} p_i^1$ by definition of $f$ and so $\bigcap_{i=1}^{\infty} p_i^1 \ne \bigcap_{i=1}^{\infty} p_i^2$, which is a contradiction. It follows that $p_i^1 = p_i^2$ for all $i$, but then by definition of $f$, this implies $v_w(p_i) = 1$ if and only if $v_{w'}(p_i) = 1$ for all $i$, and so $v_w = v_{w'}$.

It is easy to see that since $\mathcal{F}$ is countably discriminating, $V_{\mathcal{F}}$ is countable. It follows that $X$ is countable. Enumerate the elements of $V_{\mathcal{F}}$ and $X$ by $v_{w_1}, v_{w_2}, \ldots$ and $e_1, e_2, \ldots$, respectively, such that $f^{-1}(e_j) = v_{w_j}$. We have that $p_i$ is the disjoint union of the $e_j$ such that $e_j \subseteq p_i$, or equivalently the $e_j$ where $f^{-1}(e_j)(p_i) = 1$. Note i) for any countably additive probability function $\mu$ on a $\sigma$-algebra containing $\mathcal{F}$ (and thus containing $X$) and any $p_i \in \mathcal{F}$:

$$\mu(p_i) = \sum_{j=1}^{\infty} \mu(e_j)\, f^{-1}(e_j)(p_i).$$

Now we prove the equivalence. Assume $c$ is countably coherent. So $c$ extends to a countably additive probability function $\mu$ on a $\sigma$-algebra containing $\mathcal{F}$. Then by i),

$$c(p_i) = \mu(p_i) = \sum_{j=1}^{\infty} \mu(e_j)\, f^{-1}(e_j)(p_i)$$

for all $p_i \in \mathcal{F}$. But since the $\mu(e_j)$ are non-negative and sum to 1 (since the $e_j$'s partition $W$ and $\mu$ is a countably additive probability function), we have that $c$ has the form stated.

Now assume $c(p_i) = \sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i)$ for all $i$, where $\sum_{j=1}^{\infty} \lambda_j = 1$. Let $\sigma(\mathcal{F})$ be the smallest $\sigma$-algebra on $W$ containing $\mathcal{F}$. Then it is easy to check that the function on $\sigma(\mathcal{F})$ defined by $\bar{v}_{w_j}(p) = 1$ if and only if $w_j \in p$ extends $v_{w_j}$ and is a countably additive probability function on $\sigma(\mathcal{F})$. Then $\sum_{j=1}^{\infty} \lambda_j \bar{v}_{w_j}$ is a countably additive probability function on $\sigma(\mathcal{F})$, since a countable sum of countably additive probability functions with coefficients that sum to 1 is a countably additive probability function. Since

$$c(p_i) = \sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) = \sum_{j=1}^{\infty} \lambda_j \bar{v}_{w_j}(p_i)$$

for all $i$, it follows that $c$ extends to a countably additive probability function on a $\sigma$-algebra containing $\mathcal{F}$.

Lemma 4.12.

For any opinion space ( W, F ) , ( W ∗ , F ∗ ) is compact. Proof.

Let $\{\Psi(p_n)^{f(n)}\}_{n=1}^{\infty}$ be a sequence of elements of $\mathcal{F}^*$ or their complements as in Definition 4.9.

Case 1: for each $N$ there is some $w_N \in W$ such that $w_N \in \bigcap_{n=1}^{N} \Psi(p_n)^{f(n)}$. Then since i) $\Psi(p) \cap W = p$ and ii) $\Psi(p)^c \cap W = p^c$ for any $p \in \mathcal{F}$, it follows that $w_N \in \bigcap_{n=1}^{N} p_n^{f(n)}$ for each $N$. If there is some $w' \in W$ with $w' \in \bigcap_{n=1}^{\infty} p_n^{f(n)}$, then by i) and ii) it follows that $w' \in \bigcap_{n=1}^{\infty} \Psi(p_n)^{f(n)}$. Otherwise, by construction, we defined some $x_s$ to be such that $x_s \in \bigcap_{n=1}^{\infty} \Psi(p_n)^{f(n)}$. In either case, we are done.

Case 2: there is some $N$ such that $\bigcap_{n=1}^{N} \Psi(p_n)^{f(n)} \subseteq W^* \setminus W$. I claim this implies that $\bigcap_{n=1}^{N} \Psi(p_n)^{f(n)} = \emptyset$. Indeed, if there were some $w \in W^* \setminus W$ such that $w \in \bigcap_{n=1}^{N} \Psi(p_n)^{f(n)}$, then that is because $\{p_n^{f(n)}\}_{n=1}^{N}$ is an initial sequence of some sequence $\{\bar{p}_n^{\bar{f}(n)}\}_{n=1}^{\infty}$ such that $\bigcap_{n=1}^{l} \bar{p}_n^{\bar{f}(n)} \ne \emptyset$ for each $l$ and thus, in particular, $\bigcap_{n=1}^{N} p_n^{f(n)} \ne \emptyset$. So there is some $w \in W$ such that $w \in \bigcap_{n=1}^{N} \Psi(p_n)^{f(n)}$ by i) and ii), which is a contradiction. Thus we have established that $(W^*, \mathcal{F}^*)$ is compact.

Lemma 4.13.

Since $(W^*, \mathcal{F}^*)$ is compact, we only need to show that $c^*$ is coherent by Theorem 4.11. Thus it suffices to show that $c^*$ can be extended to a finitely additive probability function on $\mathcal{A}(\mathcal{F}^*)$. Since $c$ is coherent, there is a finitely additive probability function $\bar{c}$ such that:

1. $\bar{c}(p) = c(p)$ for $p \in \mathcal{F}$;
2. $\bar{c}(p \cup q) = \bar{c}(p) + \bar{c}(q)$ for $p, q \in \mathcal{F}$ with $p \cap q = \emptyset$;
3. $\bar{c}(W) = 1$.

First, define $\Psi(p^c) := \Psi(p)^c$ for each $p \in \mathcal{F}$. Then each element in $\mathcal{A}(\mathcal{F}^*)$ can be represented by $\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij})$ where $q_{ij}$ or its complement is in $\mathcal{F}$. We define

$$\bar{c}^*\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij})\Bigr) := \bar{c}\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij}\Bigr).$$

Using that $p = \Psi(p) \cap W$ and $p^c = \Psi(p)^c \cap W$, we show that $\bar{c}^*$ is a well-defined finitely additive probability function on $\mathcal{A}(\mathcal{F}^*)$ extending $c^*$. We first show $\bar{c}^*$ is well-defined. Assume that

$$\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}) = \bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} \Psi(r_{ij}).$$

Then

$$\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}) \cap W = \bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} \Psi(r_{ij}) \cap W,$$

which, noting that $p = \Psi(p) \cap W$ and $p^c = \Psi(p)^c \cap W$, establishes that

$$\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij} = \bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} r_{ij},$$

and so

$$\bar{c}^*\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij})\Bigr) = \bar{c}\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij}\Bigr) = \bar{c}\Bigl(\bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} r_{ij}\Bigr) = \bar{c}^*\Bigl(\bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} \Psi(r_{ij})\Bigr).$$

Thus $\bar{c}^*$ is well-defined. Clearly, $\bar{c}^*$ extends $c^*$. Now, since $W \subseteq W^*$, if

$$\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}) \cap \bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} \Psi(r_{ij}) = \emptyset$$

then

$$\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}) \cap W\Bigr) \cap \Bigl(\bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} \Psi(r_{ij}) \cap W\Bigr) = \emptyset$$

and so

$$\bar{c}\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij} \cup \bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} r_{ij}\Bigr) = \bar{c}\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij}\Bigr) + \bar{c}\Bigl(\bigcup_{i=1}^{N'} \bigcap_{j=1}^{M'} r_{ij}\Bigr).$$

Then noting the definition of $\bar{c}^*$ in terms of $\bar{c}$, we establish finite additivity. Lastly, if

$$W^* = \bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}), \quad \text{then} \quad W = \bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij}) \cap W,$$

and so

$$\bar{c}^*\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} \Psi(q_{ij})\Bigr) = \bar{c}\Bigl(\bigcup_{i=1}^{N} \bigcap_{j=1}^{M} q_{ij}\Bigr) = \bar{c}(W) = 1.$$

This establishes that $c^*$ is coherent on $(W^*, \mathcal{F}^*)$, and so since $(W^*, \mathcal{F}^*)$ is compact, $c^*$ is countably coherent. Further, $w \in p$ if and only if $w \in \Psi(p)$ for each $w \in W$, so $v_w$ defined on $\mathcal{F}$ is the same as $v_w$ defined on $\mathcal{F}^*$ for each $w \in W$. Since $c(p) = c^*(\Psi(p))$ for all $p \in \mathcal{F}$, this establishes that $I(c, w) = I(c^*, w)$ for each $w \in W$.

Proposition 4.16.

Let $(W, \mathcal{F})$ be an opinion space with $\mathcal{F}$ countably infinite and $I$ a generalized legitimate inaccuracy measure. If $c$ is a countably coherent credence function with finite expected inaccuracy, then $c$ is not weakly dominated.

Proof.

Since $c$ is countably coherent, let $\bar{c}$ be a countably additive probability function on $\sigma(\mathcal{F})$ extending $c$ such that $E_{\bar{c}} I(c, \cdot) < \infty$. Note that since the one-dimensional divergence $d$ is strictly proper (see Remark 2.7), we know that for any $i \in \mathbb{N}$, $E_{\bar{c}}\, d(v_w, c_i) < E_{\bar{c}}\, d(v_w, x)$ for $x \ne c_i$. Assume toward a contradiction that there is a credence function $d$ with $d \ne c$ and $I(d, w) \le I(c, w)$ for each $w$, with strict inequality for some $w$. Then

$$E_{\bar{c}} I(d, \cdot) \le E_{\bar{c}} I(c, \cdot) < \infty,$$

so both $I(d, \cdot)$ and $I(c, \cdot)$ are integrable with respect to the measure space $(W, \sigma(\mathcal{F}), \bar{c})$. Then let $i$ be any index such that $d_i \ne c_i$. There must be at least one since $c \ne d$. Then

$$E_{\bar{c}}\, d(v_w, c_i) < E_{\bar{c}}\, d(v_w, d_i).$$

If $d_i = c_i$ then clearly $E_{\bar{c}}\, d(v_w, c_i) = E_{\bar{c}}\, d(v_w, d_i)$. So since $E_{\bar{c}} I(c, \cdot) < \infty$ and $E_{\bar{c}} I(d, \cdot) < \infty$, we have

$$E_{\bar{c}} I(c, \cdot) = \sum_{i=1}^{\infty} a_i E_{\bar{c}}\, d(v_w, c_i) < \sum_{i=1}^{\infty} a_i E_{\bar{c}}\, d(v_w, d_i) = E_{\bar{c}} I(d, \cdot),$$

which implies that $E_{\bar{c}}(I(c, \cdot) - I(d, \cdot)) < 0$. Thus there is some nonempty set $E \in \sigma(\mathcal{F})$ with $\bar{c}(E) > 0$ on which $I(c, \cdot) - I(d, \cdot) < 0$ (since the Lebesgue integral is positive). But this contradicts our assumption that $d$ weakly dominates $c$, and so we are done.

Proposition 4.17.

Assume $d$ weakly dominates $c$. Note i) $c$ is somewhere finitely inaccurate if and only if $B(c, w) < \infty$ for all $w \in W$ if and only if $\sum_{i=1}^{\infty} c_i^2 < \infty$. It follows by weak dominance that $B(d, w) < \infty$ for all $w \in W$ and therefore $\sum_{i=1}^{\infty} d_i^2 < \infty$. Let $B(c, w) = D(v_w, c)$ for $D$ a generalized quasi-additive Bregman divergence.

Since $(W, \mathcal{F})$ is point-finite, it is also countably discriminating, as there are only countably many finite subsets of $\mathcal{F}$. So by Proposition 4.7, $c = \sum_{j=1}^{\infty} \lambda_j v_{w_j}$ for $\lambda_j \in [0, 1]$ with $\sum_{j=1}^{\infty} \lambda_j = 1$. First, note that $D(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, c) = 0$ and $(\sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) - c(p_i))^2 = 0$ for all $i$, so

$$D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, c\Bigr) - D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, d\Bigr) = \sum_{i=1}^{\infty} a_i \Bigl[\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) - c_i\Bigr)^2 - \Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) - d_i\Bigr)^2\Bigr].$$

Using that

$$\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) - c_i\Bigr)^2 - \Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}(p_i) - d_i\Bigr)^2 = \sum_{j=1}^{\infty} \lambda_j \bigl[(v_{w_j}(p_i) - c_i)^2 - (v_{w_j}(p_i) - d_i)^2\bigr]$$

for each $i$, since $\sum_{j=1}^{\infty} \lambda_j = 1$, we have that

$$D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, c\Bigr) - D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, d\Bigr) = \sum_{i=1}^{\infty} a_i \sum_{j=1}^{\infty} \lambda_j \bigl[(v_{w_j}(p_i) - c_i)^2 - (v_{w_j}(p_i) - d_i)^2\bigr] \qquad (8)$$

$$= \sum_{i=1}^{\infty} a_i \Bigl(\sum_{j : w_j \notin p_i} \lambda_j\Bigr)(c_i^2 - d_i^2) + a_i \Bigl(\sum_{j : w_j \in p_i} \lambda_j\Bigr)\bigl((1 - c_i)^2 - (1 - d_i)^2\bigr)$$

$$= \sum_{i=1}^{\infty} a_i (c_i^2 - d_i^2) + 2 a_i \Bigl(\sum_{j : w_j \in p_i} \lambda_j\Bigr)(d_i - c_i)$$

$$= \sum_{i=1}^{\infty} a_i (-c_i^2 - d_i^2) + 2 a_i \Bigl(\sum_{j : w_j \in p_i} \lambda_j\Bigr) d_i$$

since $c_i = \sum_{j : w_j \in p_i} \lambda_j$. We have $\sum_{i=1}^{\infty} c_i^2 + d_i^2 < \infty$ by i). Thus

$$0 \le \sum_{i=1}^{\infty} 2 a_i \Bigl(\sum_{j : w_j \in p_i} \lambda_j\Bigr) d_i < \infty \qquad (9)$$

because

$$0 \ge D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, c\Bigr) - D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, d\Bigr).$$

Having established (9), we claim we can use the dominated convergence theorem (see, e.g., Theorem 1.4.49 in Tao 2011) to switch limits in (8). Indeed,

$$\sum_{i=1}^{\infty} a_i \sum_{j=1}^{N} \lambda_j \bigl[(v_{w_j}(p_i) - c_i)^2 - (v_{w_j}(p_i) - d_i)^2\bigr] = \sum_{i=1}^{\infty} a_i \Bigl(\sum_{1 \le j \le N} \lambda_j\Bigr)(c_i^2 - d_i^2) + 2 a_i \Bigl(\sum_{\substack{j : w_j \in p_i \\ 1 \le j \le N}} \lambda_j\Bigr)(d_i - c_i).$$

Letting

$$g_N(i) = a_i \Bigl(\sum_{1 \le j \le N} \lambda_j\Bigr)(c_i^2 - d_i^2) + 2 a_i \Bigl(\sum_{\substack{j : w_j \in p_i \\ 1 \le j \le N}} \lambda_j\Bigr)(d_i - c_i)$$

and noting that $-\bigl(\sum_{j : w_j \in p_i,\, 1 \le j \le N} \lambda_j\bigr) c_i \ge -c_i^2$ since $c_i = \sum_{j : w_j \in p_i} \lambda_j$, we see that

$$|g_N(i)| \le a_i \Bigl(2 c_i^2 + d_i^2 + 2 a_i \Bigl(\sum_{j : w_j \in p_i} \lambda_j\Bigr) d_i\Bigr).$$

Each of $c_i^2$, $d_i^2$, and $(\sum_{j : w_j \in p_i} \lambda_j) d_i$ is summable in $i$, and $\sup_i a_i < \infty$. So the dominated convergence theorem applies, and we can switch limits.

Thus we have

$$0 \ge D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, c\Bigr) - D\Bigl(\sum_{j=1}^{\infty} \lambda_j v_{w_j}, d\Bigr)$$

$$= \sum_{j=1}^{\infty} \lambda_j \sum_{i=1}^{\infty} a_i \bigl[(v_{w_j}(p_i) - c(p_i))^2 - (v_{w_j}(p_i) - d(p_i))^2\bigr]$$

$$= \sum_{j=1}^{\infty} \lambda_j \bigl(D(v_{w_j}, c) - D(v_{w_j}, d)\bigr) \ge 0,$$

where we used that $c$ and $d$ are both finitely inaccurate for each $w \in W$ to break up the summation in the second line. Thus we conclude that $c = d$, as $D(c, d) = 0$ if and only if $c = d$.

Lemma 4.25.

A partition is W-stable relative to any generalized legitimate inaccuracy measure.

Proof.

Let $\mathcal{F} = \{p_1, p_2, \ldots\}$ be a partition. Assume a coherent credence function $c$ on $\mathcal{F}$ is weakly dominated by some credence function $d$. We can assume $d$ is coherent by Theorem 3.4, and so $\sum_{m=1}^{\infty} d_m \le 1$. We show that $c^*$ is weakly dominated by $d^*$, thereby establishing that a partition is W-stable.

First, $I(c^*, w) = I(c, w)$ and $I(d^*, w) = I(d, w)$ for all $w \in W$ by Lemma 4.13. Thus by assumption of weak dominance, $I(c^*, w) \ge I(d^*, w)$ for all $w \in W$, with a strict inequality for some $w \in W$. We therefore need only check what happens for $w \in W^* \setminus W$. The compactification of a partition consists in adding one point $w^*$ which is in the complement of all $p^* \in \mathcal{F}^*$. So $W^* = W \cup \{w^*\}$ and

$$I(c^*, w^*) - I(d^*, w^*) = \sum_{m=1}^{\infty} a_m \bigl(\varphi(d_m) - \varphi'(d_m) d_m\bigr) - \sum_{m=1}^{\infty} a_m \bigl(\varphi(c_m) - \varphi'(c_m) c_m\bigr),$$

which I claim is greater than or equal to 0. Indeed, assume toward a contradiction that

$$\sum_{m=1}^{\infty} a_m \bigl(\varphi(d_m) - \varphi'(d_m) d_m\bigr) < \sum_{m=1}^{\infty} a_m \bigl(\varphi(c_m) - \varphi'(c_m) c_m\bigr).$$

Then since $d_n \to 0$ as $\sum_{n=1}^{\infty} d_n \le 1$ and $c_n \to 0$ as $\sum_{n=1}^{\infty} c_n \le 1$, we have $\varphi'(d_n) - \varphi'(c_n) \to 0$ (since $\varphi'(0) = \lim_{x \to 0} \varphi(x) = 0$), and so we can find a $K$ such that

$$|\varphi'(d_n) - \varphi'(c_n)| < \Bigl|\sum_{m=1}^{\infty} a_m \bigl(\varphi(d_m) - \varphi'(d_m) d_m\bigr) - \sum_{m=1}^{\infty} a_m \bigl(\varphi(c_m) - \varphi'(c_m) c_m\bigr)\Bigr|$$

for $n \ge K$. Thus for $n \ge K$,

$$I(c, w_n) - I(d, w_n) = \sum_{m=1}^{\infty} a_m \bigl(\varphi(d_m) - \varphi'(d_m) d_m\bigr) - \sum_{m=1}^{\infty} a_m \bigl(\varphi(c_m) - \varphi'(c_m) c_m\bigr) + \varphi'(d_n) - \varphi'(c_n) < 0,$$

contradicting that $d$ weakly dominates $c$. So indeed, $d^*$ weakly dominates $c^*$.

Proposition 4.27.

Let $(W, \mathcal{F})$ be a compact opinion space and $I$ a generalized legitimate inaccuracy measure. If $c$ is coherent (and thus countably coherent), then $c$ is not strongly dominated relative to $I$.

Proof.

Let $I_n(c', w) := \sum_{i=1}^{n} a_i\, d(v_w(p_i), c'(p_i))$ for each $n \in \mathbb{N}$, $w \in W$, and credence function $c'$ on $F$. Consider a credence function $d \ne c$. Define
\[ T_n = \{ (v_w(p_1), \ldots, v_w(p_n)) : I_k(c, w) < I_k(d, w) \text{ for some } k \ge n,\ w \in W \} \]
and $T = \{e\} \cup \bigcup_{n=1}^{\infty} T_n$, where $e$ is the empty sequence. For each $s, t \in T$, we set $s < t$ if and only if $s$ is an initial sequence of $t$, and we set the height of $t \in T$ to be the length of the tuple. Then $T$ is a binary tree.

We claim $T$ is infinite. Fix $n \in \mathbb{N}$. Then there is a $t \in T$ with height $n$ if and only if $T_n \ne \emptyset$ if and only if $I_k(c, w) < I_k(d, w)$ for some $k \ge n$ and $w \in W$. Let $k$ be the maximum of $n$ and the smallest $i$ such that $c(p_i) \ne d(p_i)$. Then since $c$ restricted to any subset of $F$ is coherent, by Theorem 2.9, $I_k(c, w') < I_k(d, w')$ for some $w' \in W$ and so $(v_{w'}(p_1), \ldots, v_{w'}(p_n)) \in T_n$.

By König's lemma (see, e.g., Hrbacek and Jech 1999, Sec. 12.3), there exists an infinite branch
\[ B = \bigcup_{n=1}^{\infty} \{ (v_{w_n}(p_1), \ldots, v_{w_n}(p_n)) \} \]
through $T$, where $(v_{w_n}(p_1), \ldots, v_{w_n}(p_n)) < (v_{w_m}(p_1), \ldots, v_{w_m}(p_m))$ whenever $n < m$. For each $i$, let $p_i^* = p_i$ if $v_{w_i}(p_i) = 1$ and $p_i^* = p_i^c$ if $v_{w_i}(p_i) = 0$. Then $w_n \in \bigcap_{i=1}^{n} p_i^*$, since $v_{w_i}(p_i) = 1$ if and only if $v_{w_n}(p_i) = 1$ for $i < n$, as $(v_{w_i}(p_1), \ldots, v_{w_i}(p_i)) < (v_{w_n}(p_1), \ldots, v_{w_n}(p_n))$. Thus $\bigcap_{i=1}^{n} p_i^* \ne \emptyset$ for each $n$, and so by compactness there is some $w \in \bigcap_{i=1}^{\infty} p_i^*$. Then
\[ (v_w(p_1), \ldots, v_w(p_n)) = (v_{w_n}(p_1), \ldots, v_{w_n}(p_n)) \in T_n \]
for each $n \in \mathbb{N}$. By the definition of $T_n$, for each $n \in \mathbb{N}$ we have $I_{k_n}(c, w) < I_{k_n}(d, w)$ for some $k_n \ge n$. Sending $n$ to infinity, $I(c, w) \le I(d, w)$ and thus $d$ does not strongly dominate $c$.

A.3 Proof of Theorem 5.3

Theorem 5.3.

Let $I(c, w) = B_{\varphi,\mu}(v_w, c)$. We write $B$ for $B_{\varphi,\mu}$. Let $S$ be the set of non-negative $A$-measurable functions on $F$. Let $E \subseteq S$ be the set of $\mu$-coherent $\mu$-credence functions over $F$. Then $E$ is convex. Let $c$ be a $\mu$-incoherent $\mu$-credence function. Because $\mu$ is finite and $d$ is bounded, $B(E, c) < \infty$. Thus we can apply Theorem A.5 to get a $\pi_c \in S$ such that
\[ B(s, c) \ge B(E, c) + B(s, \pi_c) \quad \text{for every } s \in E. \tag{10} \]
In particular, (10) holds when $s$ is the omniscient credence function at world $w$ for each $w$, so we obtain
\[ I(c, w) \ge B(E, c) + I(\pi_c, w) \tag{11} \]
for all $w$, where all numbers in (11) are finite. We show that $\pi_c$ is in fact a $\mu$-coherent $\mu$-credence function. It suffices to show that $\pi_c$ is $\mu$-a.e. equal to a coherent credence function on $F$ (since $\pi_c \in S$, it is $A$-measurable). To do so, we prove the following claim: $E$ is closed under loose convergence in $\mu$-measure.

To see this, let $c_n \in E$ for each $n$ and $c \in S$. Assume $c_n \to c$ loosely in $\mu$-measure. The first thing to notice is that, since $\mu$ is finite, loose $\mu$-convergence implies $\mu$-a.e. convergence on a subsequence $\{a_n\}_{n=1}^{\infty}$ of $\{n\}_{n=1}^{\infty}$, so that
\[ c(p) = \lim_{n \to \infty} c_{a_n}(p) \in [0, 1] \]
for each $p \in G$ with $\mu(G^c) = 0$. (It is a standard fact that convergence in measure implies a.e. convergence on a subsequence, and loose convergence implies convergence in measure when the measure is finite.) Since the $c_{a_n}$ are $\mu$-coherent, we can change each $c_{a_n}$ on a (measurable) measure zero set $X_n$ to get a coherent $\mu$-credence function, which we continue to denote by $c_{a_n}$. Further, we replace $G$ with $G \setminus (\bigcup_{n=1}^{\infty} X_n)$. Assuming these adjustments have been made, we have that $c_{a_n} \to c$ on $G$ with $\mu(G^c) = 0$, and each $c_{a_n}$ is coherent. We now show $c \in E$ by showing it is equal to a coherent credence function on $F$ when restricting to $G$.

First, we extend $c$ (resp. $c_{a_n}$) to $\bar{c}$ (resp. $\bar{c}_{a_n}$), where $\bar{c}$ (resp. $\bar{c}_{a_n}$) is a credence function on $G \cup \{W\}$ such that $\bar{c} = c$ (resp. $\bar{c}_{a_n} = c_{a_n}$) on $G$ and $\bar{c}(W) = 1$ (resp. $\bar{c}_{a_n}(W) = 1$). Then notice that $c$ (resp. $c_{a_n}$) is coherent on $G$ if and only if $\bar{c}$ (resp. $\bar{c}_{a_n}$) is coherent on $G \cup \{W\}$. Thus we work with $\bar{c}$ and $\bar{c}_{a_n}$ instead, noting that $\bar{c} = \lim_n \bar{c}_{a_n}$ on $G \cup \{W\}$. To show $c \in E$, we first show $\bar{c}$ is a partial measure on $G \cup \{W\}$.

Definitions A.9.1 and A.9.3 clearly hold for $\bar{c}$, so we just need to show that Definition A.9.2 holds. Let $\varphi_0, \ldots, \varphi_{m-1}, \psi_0, \ldots, \psi_{m'-1} \in G \cup \{W\}$ and
\[ \bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} \subseteq \bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i} \]
for every $k < m$. Since the $\bar{c}_{a_n}$ are coherent on $G \cup \{W\}$ and thus extend to measures on an algebra containing $G \cup \{W\}$, we have by Corollary A.7 that
\[ \sum_{k=0}^{m-1} \bar{c}_{a_n}(\varphi_k) = \sum_{k=0}^{m-1} \bar{c}_{a_n}\Big( \bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} \Big) \le \sum_{k=0}^{m'-1} \bar{c}_{a_n}\Big( \bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i} \Big) = \sum_{k=0}^{m'-1} \bar{c}_{a_n}(\psi_k), \]
noting that
\[ \bigcup_{p \in S_{m,k}} \bigcap_{i \le k} \varphi_{p_i} = \bigcup_{p \in S_{m',k}} \bigcap_{i \le k} \psi_{p_i} = \emptyset \]
for $k \ge m'$. Sending $n$ to infinity and using the pointwise convergence of $\bar{c}_{a_n}$ to $\bar{c}$ on $G \cup \{W\}$, we conclude that
\[ \sum_{k=0}^{m-1} \bar{c}(\varphi_k) \le \sum_{k=0}^{m'-1} \bar{c}(\psi_k). \]
Thus $\bar{c}$ is a partial measure on $G \cup \{W\}$. By Theorem A.10, it follows that there is a finitely additive probability function $c^*$ on $A(F)$ such that $c^* = \bar{c}$ on $G \cup \{W\}$. Thus $c^*|_F$ is a coherent credence function on $F$ and $c = \bar{c}|_F = c^*|_F$ $\mu$-a.e. (specifically off $G^c$). Further, we already assumed $c$ is $A$-measurable, and $G \subseteq \{p : c(p) \in [0, 1]\}$ by the limit displayed above. Thus $c$ is a $\mu$-coherent $\mu$-credence function.

The proof is finished just as in the proof of Theorem 3.4. By Theorem A.5, $\pi_c$ is the generalized projection of $c$ onto $E$. Since
\[ B(E, c) = \inf_{s \in E} B(s, c) < \infty, \]
there is a $B$-minimizing sequence $\{s_n\}$ of elements in $E$ such that $B(s_n, c) \to B(E, c)$ by the definition of infimum. By the definition of a generalized projection, $s_n \to \pi_c$ loosely in $\mu$-measure. Since $E$ is closed under loose convergence, it follows that $\pi_c \in E$. Further, since $c$ is $\mu$-incoherent we know $c \ne \pi_c$ (up to $\mu$-a.e. equivalence), so we see
\[ B(E, c) \ge B(\pi_c, c) > 0, \]
since $B(s, t) = 0$ if and only if $s = t$ $\mu$-a.e. Since $I(c, w) < \infty$ for all $w$, we deduce that
\[ I(c, w) \ge B(E, c) + I(\pi_c, w) > I(\pi_c, w) \]
for all $w \in W$. This proves that $c$ is strongly dominated by $\pi_c$, and we are done.

References

A. Banerjee, Xin Guo, and Hui Wang. On the optimality of conditional expectation as a Bregman predictor. IEEE Transactions on Information Theory, 51(7):2664–2669, 2005.

Vivek Borkar, Vijay Konda, and Sanjoy Mitter. On de Finetti coherence and Kolmogorov probability. Statistics & Probability Letters, 66:417–421, 2003.

I. Csiszár. Generalized projections for non-negative functions. Acta Mathematica Hungarica, 68(1-2):161–186, 1995.

Bruno de Finetti. Theory of Probability. John Wiley, New York, 1974.

Kenny Easwaran. Expected accuracy supports conditionalization—and conglomerability and reflection. Philosophy of Science, 80(1):119–142, 2013.

Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.

Hilary Greaves and David Wallace. Justifying conditionalization: conditionalization maximizes expected epistemic utility. Mind, 115(459):607–632, 2005.

Alfred Horn and Alfred Tarski. Measures in Boolean algebras. Transactions of the American Mathematical Society, 64:467–497, 1948.

Paul Horwich. Probability and Evidence. Cambridge University Press, 1982.

Karel Hrbacek and Thomas Jech. Introduction to Set Theory. Marcel Dekker, New York, 1999.

Simon M. Huttegger. In defense of reflection. Philosophy of Science, 80(3):413–433, 2013.

James Joyce. A nonpragmatic vindication of probabilism. Philosophy of Science, 65(4):575–603, 1998.

James Joyce. Accuracy and coherence: Prospects for an alethic epistemology of partial belief. In Franz Huber and Christoph Schmidt-Petri, editors, Degrees of Belief, pages 263–297. Springer, 2009.

J.L. Kelley. General Topology. Graduate Texts in Mathematics. Springer, New York, 1975.

Mikayla Kelley. Accuracy dominance on infinite opinion sets. Master’s thesis, University of California, Berkeley, 2019.

Hannes Leitgeb and Richard Pettigrew. An objective justification of Bayesianism I: Measuring inaccuracy. Philosophy of Science, 77(2):201–235, 2010a.

Hannes Leitgeb and Richard Pettigrew. An objective justification of Bayesianism II: The consequences of minimizing inaccuracy. Philosophy of Science, 77(2):236–272, 2010b.

Benjamin A. Levinstein. An objection of varying importance to epistemic utility theory. Philosophical Studies, 176(11):2919–2931, 2019.

D. V. Lindley. Scoring rules and the inevitability of probability. In A. P. Sage, editor, System Design for Human Interaction, pages 182–208. IEEE Press, Piscataway, NJ, USA, 1987.

Patrick Maher. Joyce’s argument for probabilism. Philosophy of Science, 69(1):73–81, 2002.

Richard Pettigrew. Accuracy and the Laws of Credence. Oxford University Press, 2016.

J. B. Predd, R. Seiringer, E. H. Lieb, D. N. Osherson, H. V. Poor, and S. R. Kulkarni. Probabilistic coherence and proper scoring rules. IEEE Transactions on Information Theory, 55(10):4786–4792, Oct 2009.

Roger D. Rosenkrantz. Foundations and Applications of Inductive Probability. Ridgeview Press, Atascadero, CA, 1981.

Leonard Savage. Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, 66(336):783–801, 1971.

Mark Schervish, Teddy Seidenfeld, and Joseph B. Kadane. Proper scoring rules, dominated forecasts, and coherence. Decision Analysis, 6(4):202–221, 2009.

Mark Schervish, Teddy Seidenfeld, and Joseph B. Kadane. Dominating countably many forecasts. The Annals of Statistics, 42(2):728–756, 2014.

Jeremy Steeger. Probabilism for stochastic theories. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 66:34–44, 2019.

Terence Tao.