Exchangeable lower previsions
arXiv [math.PR]
GERT DE COOMAN, ERIK QUAEGHEBEUR, AND ENRIQUE MIRANDA

ABSTRACT. We extend de Finetti’s (1937) notion of exchangeability to finite and countable sequences of variables, when a subject’s beliefs about them are modelled using coherent lower previsions rather than (linear) previsions. We prove representation theorems in both the finite and the countable case, in terms of sampling without and with replacement, respectively. We also establish a convergence result for sample means of exchangeable sequences. Finally, we study and solve the problem of exchangeable natural extension: how to find the most conservative (point-wise smallest) coherent and exchangeable lower prevision that dominates a given lower prevision.
1. INTRODUCTION
This paper deals with belief models for both finite and countable sequences of exchangeable random variables taking a finite number of values. When such sequences of random variables are assumed to be exchangeable, this more or less means that the specific order in which they are observed is deemed irrelevant.

The first detailed study of exchangeability was made by de Finetti (1937) (with the terminology of ‘equivalent’ events). He proved the now famous Representation Theorem, which is often interpreted as stating that a sequence of random variables is exchangeable if it is conditionally independent and identically distributed (IID). Other important work on exchangeability was done by, amongst many others, Hewitt and Savage (1955), Heath and Sudderth (1976), Diaconis and Freedman (1980) and, in the context of the behavioural theory of imprecise probabilities that we are going to consider here, by Walley (1991). We refer to Kallenberg (2002, 2005) for modern, measure-theoretic discussions of exchangeability.

One of the reasons why exchangeability is deemed important, especially by Bayesians, is that, by virtue of de Finetti’s Representation Theorem, an exchangeable model can be seen as a convex mixture of multinomial models. This has given some ground (de Finetti, 1937, 1975; Dawid, 1985) to the claim that aleatory probabilities and IID processes can be eliminated from statistics, and that we can restrict ourselves to considering exchangeable sequences instead.

Key words and phrases. Exchangeability, lower prevision, Representation Theorem, Bernstein polynomials, convergence in distribution, exchangeable natural extension, sampling without replacement, multinomial sampling, imprecise probability, coherence.

Footnote. See de Finetti (1975, Section 11.4); and Cifarelli and Regazzini (1996) for an overview of de Finetti’s work.

Footnote. For a critical discussion of this claim, see Walley (1991, Section 9.5.6).

De Finetti presented his study of exchangeability in terms of the behavioural notion of previsions, or fair prices. The central assumption underlying his approach is that a subject should be able to specify a fair price $P(f)$ for any risky transaction (which we shall call a gamble) $f$ (de Finetti, 1974, Chapter 3). This is tantamount to requiring that he should always be willing and able to decide, for any real number $r$, between selling the gamble $f$ for $r$, or buying it for that price. This may not always be realistic, and for this reason, it has been suggested that we should explicitly allow for a subject’s indecision, by distinguishing between his lower prevision $\underline{P}(f)$, which is the supremum price for which he is willing to buy the gamble $f$, and his upper prevision $\overline{P}(f)$, which is the infimum price for which he is willing to sell $f$. For any real number $r$ strictly between $\underline{P}(f)$ and $\overline{P}(f)$, the subject is then not specifying a choice between selling or buying the gamble $f$ for $r$. Such lower and upper previsions are also subject to certain rationality or coherence criteria, in very much the same way as (precise) previsions are on de Finetti’s account. The resulting theory of coherent lower previsions, sometimes also called the behavioural theory of imprecise probabilities, and brilliantly defended by Walley (1991), generalises de Finetti’s behavioural treatment of subjective, epistemic probability, and tries to make it more realistic by allowing for a subject’s indecision. We give a brief overview of this theory in Section 2.

Also in this theory, it is interesting to consider the consequences of a subject’s exchangeability assessment, i.e., that the order in which we consider a number of random variables is of no consequence. This is our motivation for studying exchangeable lower previsions in this paper. An assessment of exchangeability will have a clear impact on the structure of so-called exchangeable coherent lower previsions.
We shall show they can be written as a combination of (i) a coherent (linear) prevision expressing that permutations of realisations of such sequences are considered equally likely, and (ii) a coherent lower prevision for the ‘frequency’ of occurrence of the different values the random variables can take. Of course, this is the essence of representation in de Finetti’s sense: we generalise his results to coherent lower previsions.

A subject’s probability assessments may be local, in the sense that they concern the probabilities or previsions of specific events or random variables. Assessments may on the other hand also be structural (see Walley, 1991, Chapter 9), in which case they specify relationships that should hold between the probabilities or previsions of a number of events or random variables. One may wonder if (and how) it is possible to combine local with structural assessments, such as exchangeability. We show that this is indeed the case, and give a surprisingly simple procedure, called exchangeable natural extension, for finding the point-wise smallest (most conservative) coherent and exchangeable lower prevision that dominates the local assessments. As an example, we use our conclusions to take a fresh look at the old question whether a given exchangeable model for $n$ variables can be extended to an exchangeable model for $n + k$ variables.

Before we go on, we want to draw attention to a number of distinctive features of our approach. First of all, the usual proofs of the Representation Theorem, such as the ones given by de Finetti (1937), Heath and Sudderth (1976), or Kallenberg (2005), do not lend themselves very easily to a generalisation in terms of coherent lower previsions. In principle it would be possible, at least in some cases, to start with the versions already known for (precise) previsions, and to derive their counterparts for lower previsions using so-called lower envelope theorems (see Section 2 for more details).
This is the method that Walley (1991, Sections 9.5.3 and 9.5.4) suggests. But we have decided to follow a different route: we derive our results directly for lower previsions, using an approach based on Bernstein polynomials, and we obtain the ones for previsions as special cases. We believe this method to be more elegant and self-contained, and it certainly has the additional benefit of drawing attention to what we feel is the essence of de Finetti’s Representation Theorem: specifying a coherent belief model for a countable exchangeable sequence is tantamount to specifying a coherent (lower) prevision on the linear space of polynomials on some simplex, and nothing more.
Secondly, we shall focus on, and use the language of, (lower and upper) previsions for gambles, rather than (lower and upper) probabilities for events. Our emphasis on prevision or expectation, rather than probability, is in keeping with de Finetti’s (1974) and Whittle’s (2000) approach to probabilistic modelling. But it is not merely a matter of aesthetic preference: as we shall see, in the behavioural theory of imprecise probabilities, the language of gambles is much more expressive than that of events, and we need its full expressive power to derive our results.

The plan of the paper is as follows. In Section 2, we introduce a number of results from the theory of coherent lower previsions necessary to understand the rest of the paper. In Section 3, we define exchangeability for finite sequences of random variables, and establish a representation of coherent exchangeable lower previsions in terms of sampling without replacement. In Section 4, we extend the notion of exchangeability to countable sequences of random variables, and in Section 5 we generalise de Finetti’s Representation Theorem (in terms of multinomial sampling) to exchangeable coherent lower previsions. The results we obtain allow us to develop a limit law for sample means in Section 6. Section 7 deals with exchangeable natural extension: combining local assessments with exchangeability. In an appendix, we have gathered a few useful results about multivariate Bernstein polynomials.

2. LOWER PREVISIONS, RANDOM VARIABLES AND THEIR DISTRIBUTIONS
In this section, we want to provide a brief summary of ideas, and known as well as new results from the theory of coherent lower previsions (Walley, 1991). This should lead to a better understanding of the developments in the sections that follow. For results that are mentioned without proof, proofs can be found in Walley (1991).

2.1. Epistemic uncertainty models.
Consider a random variable $X$ that may assume values $x$ in some non-empty set $\mathcal{X}$. By ‘random’, we mean that a subject is uncertain about the actual value of the variable $X$, i.e., does not know what this actual value is. But we do assume that the actual value of $X$ can be determined, at least in principle. Thus we may for instance consider tossing a coin, where $X$ is the outcome of the coin toss, and $\mathcal{X} = \{\text{heads}, \text{tails}\}$. It does not really matter here to distinguish between a subject’s belief before tossing the coin, or after the toss where, say, the outcome has been kept hidden from the subject. All that matters for us here is that our subject is in a state of (partial) ignorance because of a lack of knowledge. The uncertainty models that we are going to describe here are therefore epistemic, rather than physical, probability models.

Our subject may be uncertain about the value of $X$, but he may entertain certain beliefs about it. These beliefs may lead him to engage in certain risky transactions whose outcome depends on the actual value of $X$. We are going to try and model his beliefs mathematically by zooming in on such risky transactions. They are captured by the mathematical concept of a gamble on $\mathcal{X}$, which is a bounded map $f$ from $\mathcal{X}$ to the set $\mathbb{R}$ of real numbers. A gamble $f$ represents a random reward: if the subject accepts $f$, this means that he is willing to engage in the following transaction: we determine the actual value $x$ that $X$ assumes in $\mathcal{X}$, and then the subject receives the (possibly negative) reward $f(x)$, expressed in units of some predetermined linear utility. Let us denote by $\mathcal{L}(\mathcal{X})$ the set of all gambles on $\mathcal{X}$.

De Finetti (1974) has proposed to model a subject’s beliefs by eliciting his fair price, or prevision, $P(f)$ for certain gambles $f$. This $P(f)$ can be defined as the unique real number $p$ such that the subject is willing to buy the gamble $f$ for all prices $s$ (i.e., accept the gamble $f - s$) and sell $f$ for all prices $t$ (i.e., accept the gamble $t - f$) for all $s < p < t$.
The problem with this approach is that it presupposes that there is such a real number, or, in other words, that the subject, whatever his beliefs about $X$ are, is willing, for (almost) every real $r$, to make a choice between buying $f$ for the price $r$, or selling it for that price.

2.2. Coherent lower previsions and natural extension.
A way to address this problem is to consider a model which allows our subject to be undecided for some prices $r$. This is done in Walley’s (1991) theory of lower and upper previsions. The lower prevision of the gamble $f$, $\underline{P}(f)$, is our subject’s supremum acceptable buying price for $f$; similarly, our subject’s upper prevision, $\overline{P}(f)$, is his infimum acceptable selling price for $f$. Hence, he is willing to buy the gamble $f$ for all prices $t < \underline{P}(f)$ and sell $f$ for all prices $s > \overline{P}(f)$, but he may be undecided for prices $\underline{P}(f) \le p \le \overline{P}(f)$.

Since buying the gamble $f$ for a price $t$ is the same as selling the gamble $-f$ for the price $-t$ [in both cases we accept the gamble $f - t$], the lower and upper previsions are conjugate functions: $\overline{P}(f) = -\underline{P}(-f)$ for any gamble $f$. This allows us to concentrate on one of these functions, since we can immediately derive results for the other. In this paper, we focus mainly on lower previsions.

If a subject has made assessments about the supremum buying price (lower prevision) for all gambles in some domain $\mathcal{K}$, we have to check that these assessments are consistent with each other. First of all, we say that the lower prevision $\underline{P}$ avoids sure loss when
\[
\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] \right] \ge 0 \tag{1}
\]
for any natural number $n$, any gambles $f_1, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_1, \dots, \lambda_n$. When the inequality (1) is not satisfied, there is some non-negative combination of acceptable transactions that results in a transaction that makes our subject lose utiles, no matter the outcome, and we then say that his lower prevision $\underline{P}$ incurs sure loss.

More generally, we say that the lower prevision $\underline{P}$ is coherent when
\[
\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] - \lambda_0 \left[ f_0(x) - \underline{P}(f_0) \right] \right] \ge 0 \tag{2}
\]
for any natural number $n$, any gambles $f_0, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_0, \dots, \lambda_n$.
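As a small numerical illustration of condition (1), the following Python sketch evaluates the supremum in (1) for one fixed choice of gambles and weights on the coin-tossing space. The function name `worst_case_gain` and both price assessments are hypothetical, chosen only for illustration. A negative value exhibits a sure loss; a non-negative value for a single combination does not by itself establish condition (1), which quantifies over all $n$, all gambles and all weights.

```python
def worst_case_gain(outcomes, gambles, prices, weights):
    """Compute sup_x sum_k lambda_k * (f_k(x) - P(f_k)) on a finite outcome set."""
    return max(
        sum(w * (f[x] - p) for f, p, w in zip(gambles, prices, weights))
        for x in outcomes
    )


outcomes = ["heads", "tails"]
heads = {"heads": 1.0, "tails": 0.0}  # indicator gamble of the event {heads}
tails = {"heads": 0.0, "tails": 1.0}  # indicator gamble of the event {tails}

# Hypothetical assessment A: supremum buying price 0.6 for each indicator.
gain_a = worst_case_gain(outcomes, [heads, tails], [0.6, 0.6], [1.0, 1.0])

# Hypothetical assessment B: supremum buying price 0.4 for each indicator.
gain_b = worst_case_gain(outcomes, [heads, tails], [0.4, 0.4], [1.0, 1.0])

print(gain_a < 0)   # True: this combination loses utility whatever the outcome
print(gain_b >= 0)  # True: this particular combination exhibits no sure loss
```

Under assessment A the subject pays 1.2 in total for two gambles that together pay exactly 1, so he loses 0.2 whatever the outcome.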
Coherence means that our subject’s supremum acceptable buying price for a gamble $f$ in the domain cannot be raised by considering the acceptable transactions implicit in other gambles. In particular, it means that $\underline{P}$ avoids sure loss. We call an upper prevision coherent if its conjugate lower prevision is.

If a lower prevision $\underline{P}$ is defined on a linear space of gambles $\mathcal{K}$, then the coherence requirement (2) is equivalent to the following conditions: for any gambles $f$ and $g$ in $\mathcal{K}$ and any non-negative real number $\lambda$, it should hold that:
(P1) $\underline{P}(f) \ge \inf f$ [accepting sure gains];
(P2) $\underline{P}(\lambda f) = \lambda \underline{P}(f)$ [non-negative homogeneity];
(P3) $\underline{P}(f + g) \ge \underline{P}(f) + \underline{P}(g)$ [super-additivity].
Moreover, a lower prevision on a general domain is coherent if and only if it can be extended to a coherent lower prevision on some linear space.

A coherent lower prevision that is defined on indicators of events only is called a coherent lower probability. The indicator $I_A$ of an event $A$ is the $\{0, 1\}$-valued gamble given by $I_A(x) := 1$ if $x \in A$ and $I_A(x) := 0$ otherwise.

A lower prevision $\underline{P}$ on some set of gambles $\mathcal{K}$ that avoids sure loss can always be ‘corrected’ and extended to a coherent lower prevision on $\mathcal{L}(\mathcal{X})$, in a least-committal manner: the (point-wise) smallest, and therefore most conservative, coherent lower prevision on $\mathcal{L}(\mathcal{X})$ that (point-wise) dominates $\underline{P}$ on $\mathcal{K}$, is called the natural extension of $\underline{P}$, and it is given for all $f$ in $\mathcal{L}(\mathcal{X})$ by
\[
\underline{E}(f) := \sup \left\{ \inf_{x \in \mathcal{X}} \left[ f(x) - \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] \right] : n \ge 0,\ \lambda_k \ge 0,\ f_k \in \mathcal{K} \right\}. \tag{3}
\]
The natural extension of $\underline{P}$ provides the supremum acceptable buying prices that we can derive for any gamble $f$ taking into account only the buying prices for the gambles in $\mathcal{K}$ and the notion of coherence. Interestingly, $\underline{P}$ is coherent if and only if it coincides with its natural extension $\underline{E}$ on its domain $\mathcal{K}$, and in that case $\underline{E}$ is the point-wise smallest coherent lower prevision that extends $\underline{P}$ to $\mathcal{L}(\mathcal{X})$.

2.3. Linear previsions.
If the lower prevision $\underline{P}(f)$ and the upper prevision $\overline{P}(f)$ for a gamble $f$ happen to coincide, then the common value $P(f) = \underline{P}(f) = \overline{P}(f)$ is called the subject’s (precise) prevision for $f$. Previsions are fair prices in de Finetti’s (1974) sense. We shall call them precise probability models, and lower previsions will be called imprecise. Specifying a prevision $P$ on a domain $\mathcal{K}$ is tantamount to specifying both a lower prevision $\underline{P}$ and an upper prevision $\overline{P}$ on $\mathcal{K}$ such that $\underline{P}(f) = \overline{P}(f) = P(f)$. Since then, by conjugacy, $\underline{P}(f) = -\overline{P}(-f) = -P(-f)$, it is also equivalent to specifying a lower prevision $\underline{P}$ on the larger and negation invariant domain $\mathcal{K}' := \mathcal{K} \cup -\mathcal{K}$, by letting $\underline{P}(f) := P(f)$ if $f \in \mathcal{K}$ and $\underline{P}(f) := -P(-f)$ if $f \in -\mathcal{K}$. This prevision $P$ is then called coherent, or linear, if and only if the associated lower prevision $\underline{P}$ is coherent, and this is equivalent to the following condition
\[
\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - P(f_k) \right] - \sum_{\ell=1}^{m} \mu_\ell \left[ g_\ell(x) - P(g_\ell) \right] \right] \ge 0
\]
for any natural numbers $n$ and $m$, any gambles $f_1, \dots, f_n$ and $g_1, \dots, g_m$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_1, \dots, \lambda_n$ and $\mu_1, \dots, \mu_m$.

A prevision on the set $\mathcal{L}(\mathcal{X})$ of all gambles is linear if and only if it is a positive ($f \ge 0 \Rightarrow P(f) \ge 0$) and normed ($P(1) = 1$) real linear functional. A prevision on a general domain is linear if and only if it can be extended to a linear prevision on all gambles. We shall denote by $\mathbb{P}(\mathcal{X})$ the set of all linear previsions on $\mathcal{L}(\mathcal{X})$.

The restriction of a linear prevision $P$ on $\mathcal{L}(\mathcal{X})$ to the set $\wp(\mathcal{X})$ of (indicators of) all events, is a finitely additive probability. Conversely, a finitely additive probability on $\wp(\mathcal{X})$ has a unique extension (namely, its natural extension as a coherent lower probability) to a linear prevision on $\mathcal{L}(\mathcal{X})$. In this sense, such linear previsions and finitely additive probabilities can be considered equivalent: for precise probability models, the language of events is as expressive as that of gambles.

A linear prevision that is defined on indicators of events only, and therefore called a coherent probability, is always the restriction of some finitely additive probability.

There is an interesting link between precise and imprecise probability models, expressed through the following so-called lower envelope theorem: A lower prevision $\underline{P}$ on some domain $\mathcal{K}$ is coherent if and only if it is the lower envelope of some set of linear previsions, and in particular of the convex set $\mathcal{M}(\underline{P})$ of all linear previsions that dominate it: for all $f$ in $\mathcal{K}$,
\[
\underline{P}(f) = \inf \{ P(f) : P \in \mathcal{M}(\underline{P}) \},
\]
where
\[
\mathcal{M}(\underline{P}) := \{ P \in \mathbb{P}(\mathcal{X}) : (\forall f \in \mathcal{K})(P(f) \ge \underline{P}(f)) \}.
\]
We can also use the set $\mathcal{M}(\underline{P})$ to calculate the natural extension of $\underline{P}$: for any gamble $f$ on $\mathcal{X}$, we have that
\[
\underline{E}(f) = \inf \{ P(f) : P \in \mathcal{M}(\underline{P}) \}.
\]

If we have a coherent lower probability defined on some set of events, then there will generally be many (i.e., an infinity of) coherent lower previsions that extend it to all gambles. In this sense, the language of gambles is actually more expressive than that of events when we are considering lower rather than precise previsions.
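To make the lower envelope construction concrete, here is a minimal Python sketch on a three-element space. It takes the lower envelope of a small, hypothetical set of mass functions (each one standing for a linear prevision) and checks (P1) to (P3) and conjugacy on a few example gambles. The names `credal_set`, `lower_prevision` and `upper_prevision`, and all numbers, are assumptions made for illustration; checking a few gambles is of course not a proof of coherence, which the lower envelope theorem supplies.

```python
def expectation(mass, gamble):
    """Linear prevision of a gamble under a probability mass function."""
    return sum(mass[x] * gamble[x] for x in mass)


def lower_prevision(mass_functions, gamble):
    """Lower envelope: the smallest expectation over the given set."""
    return min(expectation(p, gamble) for p in mass_functions)


def upper_prevision(mass_functions, gamble):
    """Upper envelope: the largest expectation over the given set."""
    return max(expectation(p, gamble) for p in mass_functions)


# Hypothetical set of mass functions on the outcome set {a, b, c}.
credal_set = [
    {"a": 0.5, "b": 0.3, "c": 0.2},
    {"a": 0.2, "b": 0.5, "c": 0.3},
    {"a": 0.3, "b": 0.2, "c": 0.5},
]

f = {"a": 1.0, "b": 0.0, "c": -1.0}
g = {"a": 0.0, "b": 2.0, "c": 1.0}
f_plus_g = {x: f[x] + g[x] for x in f}
minus_f = {x: -f[x] for x in f}
three_f = {x: 3.0 * f[x] for x in f}

low_f = lower_prevision(credal_set, f)
low_g = lower_prevision(credal_set, g)

checks = {
    "P1 accepting sure gains": low_f >= min(f.values()),
    "P2 homogeneity": abs(lower_prevision(credal_set, three_f) - 3.0 * low_f) < 1e-12,
    "P3 super-additivity": lower_prevision(credal_set, f_plus_g) >= low_f + low_g - 1e-12,
    "conjugacy": abs(upper_prevision(credal_set, f) + lower_prevision(credal_set, minus_f)) < 1e-12,
}
print(checks)
```

Because a linear functional attains its infimum over a convex hull at one of the generating points, taking the minimum over the three listed mass functions gives the same value as the infimum over their convex hull.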
As already signalled in the Introduction, this greater expressive power of gambles is the main reason why, in the following sections, we shall formulate our study of exchangeable lower previsions in terms of gambles and lower previsions rather than events and lower probabilities.

2.4. Important consequences of coherence.
Let us list a few consequences of coherence that we shall have occasion to use further on. Besides the properties (P1)–(P3) we have already mentioned that hold when the domain of $\underline{P}$ is a linear space, the following properties hold for a coherent lower prevision whenever the gambles involved belong to its domain:
(i) $\underline{P}$ is monotone: if $f \le g$, then $\underline{P}(f) \le \underline{P}(g)$.
(ii) $\inf f \le \underline{P}(f) \le \overline{P}(f) \le \sup f$.
Moreover, coherent lower and upper previsions are continuous with respect to uniform convergence of gambles: if a sequence of gambles $f_n$ converges uniformly to a gamble $f$, meaning that for every $\epsilon > 0$ there is some $n_0$ such that $|f_n(x) - f(x)| < \epsilon$ for all $n \ge n_0$ and for all $x \in \mathcal{X}$, then $\underline{P}(f_n)$ converges to $\underline{P}(f)$ and $\overline{P}(f_n)$ converges to $\overline{P}(f)$. In particular, this implies that a coherent lower prevision defined on some domain $\mathcal{K}$ can be uniquely extended to a coherent lower prevision on the uniform closure of $\mathcal{K}$. As an immediate corollary, a coherent lower prevision on $\mathcal{L}(\mathcal{X})$ is uniquely determined by the values it assumes on simple gambles, i.e., gambles that assume only a finite number of values.

We end this section by introducing a number of new notions, which cannot be found in Walley (1991). They generalise familiar definitions in standard, measure-theoretic probability to a context where coherent lower previsions are used as belief models.

2.5. The distribution of a random variable.
We shall call a subject’s coherent lower prevision $\underline{P}$ on $\mathcal{L}(\mathcal{X})$, modelling his beliefs about the value that a random variable $X$ assumes in the set $\mathcal{X}$, his distribution for that random variable.

Now consider another set $\mathcal{Y}$, and a map $\varphi$ from $\mathcal{X}$ to $\mathcal{Y}$, then we can consider $Y := \varphi(X)$ as a random variable assuming values in $\mathcal{Y}$. With a gamble $h$ on $\mathcal{Y}$, there corresponds a gamble $h \circ \varphi$ on $\mathcal{X}$, whose lower prevision is $\underline{P}(h \circ \varphi)$. This leads us to define the distribution of $Y = \varphi(X)$ as the induced coherent lower prevision $\underline{Q}$ on $\mathcal{L}(\mathcal{Y})$, defined by
\[
\underline{Q}(h) := \underline{P}(h \circ \varphi), \quad h \in \mathcal{L}(\mathcal{Y}).
\]
For an event $A \subseteq \mathcal{Y}$, we see that $I_A \circ \varphi = I_{\varphi^{-1}(A)}$, where $\varphi^{-1}(A) := \{ x \in \mathcal{X} : \varphi(x) \in A \}$, and consequently $\underline{Q}(A) = \underline{P}(\varphi^{-1}(A))$. So we see that the notion of an induced lower prevision generalises that of an induced probability measure.

Finally, consider a sequence of random variables $X_n$, all taking values in some metric space $S$. Denote by $\mathcal{C}(S)$ the set of all continuous gambles on $S$. For each random variable $X_n$, we have a distribution in the form of a coherent lower prevision $\underline{P}_{X_n}$ on $\mathcal{L}(S)$. Then we say that the random variables converge in distribution if for all $h \in \mathcal{C}(S)$, the sequence of real numbers $\underline{P}_{X_n}(h)$ converges to some real number, which we denote by $\underline{P}(h)$. The limit lower prevision $\underline{P}$ on $\mathcal{C}(S)$ that we can define in this way, is coherent, because a point-wise limit of coherent lower previsions always is.
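The induced distribution $\underline{Q}(h) := \underline{P}(h \circ \varphi)$ of Section 2.5 can be sketched in a few lines of Python. The example below assumes, purely for illustration, that the distribution of $X$ is the lower envelope of two hypothetical mass functions on $\{1, 2, 3, 4\}$, and that $\varphi$ is the parity map. It then checks that $\underline{Q}(A) = \underline{P}(\varphi^{-1}(A))$ for the event $A = \{\text{even}\}$.

```python
def lower_prevision(mass_functions, gamble):
    """Lower envelope of the expectations of a gamble (a callable on outcomes)."""
    return min(sum(p[x] * gamble(x) for x in p) for p in mass_functions)


def induced_lower_prevision(mass_functions, phi, h):
    """Q(h) := P(h o phi), the distribution of Y = phi(X)."""
    return lower_prevision(mass_functions, lambda x: h(phi(x)))


# Hypothetical distribution for X on {1, 2, 3, 4}: a lower envelope of two mass functions.
mass_functions = [
    {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25},
    {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1},
]


def parity(x):
    """The map phi from {1, 2, 3, 4} to {"even", "odd"}."""
    return "even" if x % 2 == 0 else "odd"


def indicator_even(y):
    """Indicator gamble of the event A = {"even"} in the image space."""
    return 1.0 if y == "even" else 0.0


def indicator_preimage(x):
    """Indicator gamble of the preimage {2, 4} of A under phi."""
    return 1.0 if x in (2, 4) else 0.0


q_even = induced_lower_prevision(mass_functions, parity, indicator_even)
p_preimage = lower_prevision(mass_functions, indicator_preimage)
print(q_even, p_preimage)  # the two values coincide
```

Gambles are represented here as callables rather than dictionaries so that the composition $h \circ \varphi$ is literally a function composition.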
3. EXCHANGEABLE RANDOM VARIABLES
We are now ready to recall Walley’s (1991, Section 9.5) notion of exchangeability in the context of the theory of coherent lower previsions. We shall see that it generalises de Finetti’s definition for linear previsions (de Finetti, 1937, 1975).

3.1. Definition and basic properties.
Consider $N \ge 1$ random variables $X_1, \dots, X_N$ taking values in a non-empty and finite set $\mathcal{X}$. A subject’s beliefs about the values that these random variables $X = (X_1, \dots, X_N)$ assume jointly in $\mathcal{X}^N$ are given by their (joint) distribution, which is a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$.

Let us denote by $\mathcal{P}_N$ the set of all permutations of $\{1, \dots, N\}$. With any such permutation $\pi$ we can associate, by the procedure of lifting, a permutation of $\mathcal{X}^N$, also denoted by $\pi$, that maps any $x = (x_1, \dots, x_N)$ in $\mathcal{X}^N$ to $\pi x := (x_{\pi(1)}, \dots, x_{\pi(N)})$. Similarly, with any gamble $f$ on $\mathcal{X}^N$, we can consider the permuted gamble $\pi f := f \circ \pi$, or in other words, $(\pi f)(x) = f(\pi x)$ for all $x \in \mathcal{X}^N$.

A subject judges the random variables $X_1, \dots, X_N$ to be exchangeable when he is disposed to exchange any gamble $f$ for the permuted gamble $\pi f$, meaning that $\underline{P}^N_{\mathcal{X}}(\pi f - f) \ge 0$ for any permutation $\pi$. Taking into account the properties of coherence, this means that
\[
\underline{P}^N_{\mathcal{X}}(\pi f - f) = \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0
\]
for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$. In this case, we shall also call the joint coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ exchangeable. A subject will make an assumption of exchangeability when there is evidence that the processes generating the values of the random variables are (physically) similar (Walley, 1991, Section 9.5.2), and consequently the order in which the variables are observed is not important.

Footnote (on the restriction to a finite set $\mathcal{X}$). We could easily define exchangeability for variables that assume values in a set $\mathcal{X}$ that is not necessarily finite. But since we only prove interesting results for finite $\mathcal{X}$, we have decided to use a finitary context from the outset.

Footnote (on the condition $\underline{P}^N_{\mathcal{X}}(\pi f - f) \ge 0$). This means that the subject is willing to accept the gamble $\pi f - f$, i.e., to exchange $f$ for $\pi f$, in return for any positive amount of utility $\epsilon$, however small.

When $\underline{P}^N_{\mathcal{X}}$ is in particular a linear prevision $P^N_{\mathcal{X}}$, exchangeability is equivalent to having $P^N_{\mathcal{X}}(\pi f) = P^N_{\mathcal{X}}(f)$ for all gambles $f$ and all permutations $\pi$. Another equivalent formulation can be given in terms of the (probability) mass function $p^N_{\mathcal{X}}$ of $P^N_{\mathcal{X}}$, defined by $p^N_{\mathcal{X}}(x) := P^N_{\mathcal{X}}(\{x\})$. Indeed, if we apply linearity to find that $P^N_{\mathcal{X}}(f) = \sum_{x \in \mathcal{X}^N} f(x)\, p^N_{\mathcal{X}}(x)$, we see that the exchangeability condition for linear previsions is equivalent to having $p^N_{\mathcal{X}}(x) = p^N_{\mathcal{X}}(\pi x)$ for all $x$ in $\mathcal{X}^N$, or in other words, the mass function $p^N_{\mathcal{X}}$ should be invariant under permutation of the indices. This is essentially de Finetti’s (1937) definition for the exchangeability of a prevision. The following proposition, mentioned by Walley (1991, Section 9.5), and whose proof is immediate and therefore omitted, establishes an even stronger link between Walley’s and de Finetti’s notions of exchangeability.

Proposition 1. Any coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that dominates an exchangeable coherent lower prevision, is also exchangeable. Moreover, let $\underline{P}^N_{\mathcal{X}}$ be the lower envelope of some set of linear previsions $\mathcal{M}^N_{\mathcal{X}}$, in the sense that
\[
\underline{P}^N_{\mathcal{X}}(f) = \min \left\{ P^N_{\mathcal{X}}(f) : P^N_{\mathcal{X}} \in \mathcal{M}^N_{\mathcal{X}} \right\}
\]
for all gambles $f$ on $\mathcal{X}^N$. Then $\underline{P}^N_{\mathcal{X}}$ is exchangeable if and only if all the linear previsions $P^N_{\mathcal{X}}$ in $\mathcal{M}^N_{\mathcal{X}}$ are exchangeable.
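Proposition 1 can be probed numerically for $N = 2$ and two-valued variables. The Python sketch below uses hypothetical mass functions `p1`, `p2` (permutation invariant) and `q` (not invariant), and evaluates the lower prevision of $\pi f - f$ for one test gamble $f$. A negative value proves that a lower envelope is not exchangeable; a zero value for a single gamble is only consistent with exchangeability, which Proposition 1 guarantees for the envelope of `p1` and `p2`. All function names are illustrative assumptions.

```python
from itertools import permutations


def permute(x, pi):
    """Lifted permutation: (pi x)_k = x_{pi(k)}, with 0-based indices."""
    return tuple(x[i] for i in pi)


def is_exchangeable_mass(mass, n, tol=1e-12):
    """Check p(x) = p(pi x) for every outcome x and every permutation pi."""
    return all(
        abs(mass[x] - mass[permute(x, pi)]) <= tol
        for x in mass
        for pi in permutations(range(n))
    )


def lower_prevision(mass_functions, gamble):
    """Lower envelope of the expectations of a gamble (a callable on tuples)."""
    return min(sum(p[x] * gamble(x) for x in p) for p in mass_functions)


def worst_exchange_value(mass_functions, gamble, n):
    """Smallest lower prevision of (pi f - f) over all permutations pi."""
    return min(
        lower_prevision(
            mass_functions,
            lambda x, pi=pi: gamble(permute(x, pi)) - gamble(x),
        )
        for pi in permutations(range(n))
    )


# Hypothetical mass functions on {0, 1}^2.
p1 = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.2}  # exchangeable
p2 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.3}  # exchangeable
q = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.2}   # not exchangeable


def f(x):
    """Indicator gamble of the outcome (0, 1)."""
    return 1.0 if x == (0, 1) else 0.0


value_exchangeable = worst_exchange_value([p1, p2], f, 2)  # envelope of p1 and p2
value_mixed = worst_exchange_value([p1, q], f, 2)          # envelope of p1 and q
print(value_exchangeable, value_mixed)
```

For the envelope that contains `q`, swapping the two variables turns $f$ into the indicator of $(1, 0)$, and the expectation of $\pi f - f$ under `q` is $0.1 - 0.3 < 0$, so the exchangeability condition fails.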
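If a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is exchangeable, it is immediately guaranteed to be also permutable in the sense that
\[
\underline{P}^N_{\mathcal{X}}(\pi f) = \underline{P}^N_{\mathcal{X}}(f)
\]
for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$ [by super-additivity, $\underline{P}^N_{\mathcal{X}}(\pi f) \ge \underline{P}^N_{\mathcal{X}}(f) + \underline{P}^N_{\mathcal{X}}(\pi f - f) = \underline{P}^N_{\mathcal{X}}(f)$, and similarly with the roles of $f$ and $\pi f$ exchanged]. The converse does not hold in general. For linear previsions $P^N_{\mathcal{X}}$, permutability is equivalent to exchangeability, but this equivalence is generally broken for coherent lower previsions that are not linear.

Clearly, if $X_1, \dots, X_N$ are exchangeable, then any permutation $X_{\pi(1)}, \dots, X_{\pi(N)}$ is exchangeable as well, and has the same distribution $\underline{P}^N_{\mathcal{X}}$. Moreover, any selection of $1 \le n \le N$ random variables from amongst the $X_1, \dots, X_N$ is exchangeable too, and their distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$, given by $\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^N_{\mathcal{X}}(\tilde{f})$ for all gambles $f$ on $\mathcal{X}^n$, where the gamble $\tilde{f}$ on $\mathcal{X}^N$ is the cylindrical extension of $f$ to $\mathcal{X}^N$, given by $\tilde{f}(z_1, \dots, z_N) := f(z_1, \dots, z_n)$ for all $(z_1, \dots, z_N)$ in $\mathcal{X}^N$.

Running example.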
This is the place to introduce our running example. As we go along, we shall try to clarify our reasoning by looking at a specific special case, that is as simple as possible, namely where the random variables $X_k$ we consider can assume only two values. So we might be looking at tossing coins, or thumbtacks, and consider modelling the exchangeability assessment that the order in which these coin flips are considered is of no consequence. More generally, our random variables might be the indicators of events: $X_k = I_{E_k}$, and then we consider the events $E_1, \dots, E_N$ to be exchangeable when the order in which they are observed is of no consequence.

Formally, we denote the set of possible values for such variables by $\mathbb{B} = \{0, 1\}$, where 1 and 0 could stand for heads and tails, success and failure, the occurrence or not of an event, and so on. In what follows, we shall often call 1 a success, and 0 a failure.

The joint random variable $X = (X_1, \dots, X_N)$ then assumes values in the space $\mathbb{B}^N$, which is made up of all $N$-tuples of zeros and ones. As an example, in the case $N = 3$, two possible elements of $\mathbb{B}^3$ are ( , , ) and ( , , ). These elements can be related to each other by a permutation of the indices, i.e., of the order in which they occur, and therefore any exchangeable linear prevision should assign the same probability mass to them. And any exchangeable coherent lower prevision is a lower envelope of such exchangeable linear previsions. ♦

Count vectors.
Interestingly, exchangeable coherent lower previsions have a very simple representation, in terms of sampling without replacement. To see how this comes about, consider any $x \in \mathcal{X}^N$. Then the so-called (permutation) invariant atom
\[
[x] := \{ \pi x : \pi \in \mathcal{P}_N \}
\]
is the smallest non-empty subset of $\mathcal{X}^N$ that contains $x$ and that is invariant under all permutations $\pi$ in $\mathcal{P}_N$. We shall denote the set of permutation invariant atoms of $\mathcal{X}^N$ by $\mathcal{A}^N_{\mathcal{X}}$. It constitutes a partition of the set $\mathcal{X}^N$. We can characterise these invariant atoms using the counting maps $T^N_x : \mathcal{X}^N \to \mathbb{N}$ defined for all $x$ in $\mathcal{X}$ in such a way that
\[
T^N_x(z) = T^N_x(z_1, \dots, z_N) := |\{ k \in \{1, \dots, N\} : z_k = x \}|
\]
is the number of components of the $N$-tuple $z$ that assume the value $x$. Here $|A|$ denotes the number of elements in a finite set $A$, and $\mathbb{N}$ is the set of all non-negative integers (including zero). We shall denote by $T^N_{\mathcal{X}}$ the vector-valued map from $\mathcal{X}^N$ to $\mathbb{N}^{\mathcal{X}}$ whose component maps are the $T^N_x$, $x \in \mathcal{X}$. Observe that $T^N_{\mathcal{X}}$ actually assumes values in the set of count vectors
\[
\mathcal{N}^N_{\mathcal{X}} := \left\{ m \in \mathbb{N}^{\mathcal{X}} : \sum_{x \in \mathcal{X}} m_x = N \right\}.
\]
Since permuting the components of a vector leaves the counts invariant, meaning that $T^N_{\mathcal{X}}(z) = T^N_{\mathcal{X}}(\pi z)$ for all $z \in \mathcal{X}^N$ and $\pi \in \mathcal{P}_N$, we see that for all $y$ and $z$ in $\mathcal{X}^N$
\[
y \in [z] \iff T^N_{\mathcal{X}}(y) = T^N_{\mathcal{X}}(z).
\]
The counting map $T^N_{\mathcal{X}}$ can therefore be interpreted as a bijection (one-to-one and onto) between the set of invariant atoms $\mathcal{A}^N_{\mathcal{X}}$ and the set of count vectors $\mathcal{N}^N_{\mathcal{X}}$, and we can identify any invariant atom $[z]$ by the count vector $m = T^N_{\mathcal{X}}(z)$ of any (and therefore all) of its elements. We shall therefore also denote this atom by $[m]$; and clearly $y \in [m]$ if and only if $T^N_{\mathcal{X}}(y) = m$. The number of elements $\nu(m)$ in any invariant atom $[m]$ is given by the number of different ways in which the components of any $z$ in $[m]$ can be permuted, and is therefore given by
\[
\nu(m) := \binom{N}{m} = \frac{N!}{\prod_{x \in \mathcal{X}} m_x!}.
\]

If the joint random variable $X = (X_1, \dots, X_N)$ assumes the value $z$ in $\mathcal{X}^N$, then the corresponding count vector assumes the value $T^N_{\mathcal{X}}(z)$ in $\mathcal{N}^N_{\mathcal{X}}$. This means that we can see $T^N_{\mathcal{X}}(X) = T^N_{\mathcal{X}}(X_1, \dots, X_N)$ as a random variable in $\mathcal{N}^N_{\mathcal{X}}$.

Footnote (on the term ‘permutable’). We use the terminology in Walley (1991, Section 9.4).

Footnote (on permutability versus exchangeability). This is an instance of a more general phenomenon: we can generally consider two types of invariance of a belief model (a coherent lower prevision) with respect to a semigroup of transformations: weak and strong invariance. The former, of which permutability is a special case, tells us that the model or the beliefs are symmetrical (symmetry of evidence), whereas the latter, of which exchangeability is a special case, reflects that a subject believes there is symmetry (evidence of symmetry). Strong invariance generally implies weak invariance, but the two notions in general only coincide for linear previsions. For more details, see De Cooman and Miranda (2007).

Footnote (on the representation in terms of sampling without replacement). Actually this is a special case of a much more general representation result for coherent lower previsions on a finite space that are strongly invariant with respect to a finite group of permutations of that space; see De Cooman and Miranda (2007) for more details. Here we give a different proof.
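The correspondence between invariant atoms and count vectors can be checked by brute force for the two-valued running example with $N = 3$. The following Python sketch is an illustration only: the helper names `count_vector` and `atom_size` are assumptions, and count vectors are represented as tuples $(m_0, m_1)$. It groups all triples by their count vector and compares the size of each group with the multinomial coefficient $\nu(m)$.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def count_vector(z, values):
    """Count vector T(z): how often each value in `values` occurs in the tuple z."""
    counts = Counter(z)
    return tuple(counts[x] for x in values)


def atom_size(m):
    """nu(m) = N! / prod_x m_x!, the number of elements of the invariant atom [m]."""
    return factorial(sum(m)) // prod(factorial(k) for k in m)


values = (0, 1)  # the two-valued case of the running example
N = 3

# Group all N-tuples by their count vector; each group is one invariant atom.
atoms = {}
for z in product(values, repeat=N):
    atoms.setdefault(count_vector(z, values), []).append(z)

for m, elements in sorted(atoms.items()):
    print(m, elements, atom_size(m))
```

Each printed line shows a count vector, the elements of the corresponding atom, and $\nu(m)$; the number of listed elements agrees with $\nu(m)$ in every case, and the atoms together exhaust the $2^3 = 8$ triples.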
If the available information about the values that $\mathbf{X}$ assumes in $\mathcal{X}^N$ is given by the coherent exchangeable lower prevision $\underline{P}^N_{\mathcal{X}}$ – the distribution of $\mathbf{X}$ –, then the corresponding uncertainty model for the values that $T^N_{\mathcal{X}}(\mathbf{X})$ assumes in $\mathcal{N}^N_{\mathcal{X}}$ is given by the coherent induced lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ – the distribution of $T^N_{\mathcal{X}}(\mathbf{X})$ –, given by
\[ \underline{Q}^N_{\mathcal{X}}(h) := \underline{P}^N_{\mathcal{X}}(h \circ T^N_{\mathcal{X}}) = \underline{P}^N_{\mathcal{X}}\Big( \sum_{m \in \mathcal{N}^N_{\mathcal{X}}} h(m) I_{[m]} \Big) \tag{4} \]
for all gambles $h$ on $\mathcal{N}^N_{\mathcal{X}}$. We shall now prove a theorem that shows that, conversely, any exchangeable coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is in fact completely determined by the corresponding distribution $\underline{Q}^N_{\mathcal{X}}$ of the count vectors, also called its count distribution. It also establishes a relationship between exchangeability and sampling without replacement.

To get where we want, consider an urn with $N$ balls of different types, where the different types are characterised by the elements $x$ of the set $\mathcal{X}$. Suppose the composition of the urn is given by the count vector $m \in \mathcal{N}^N_{\mathcal{X}}$, meaning that $m_x$ balls are of type $x$, for $x \in \mathcal{X}$. We are now going to successively select (in a random way) $N$ balls from the urn, without replacing them. Denote by $Y_k$ the random variable in $\mathcal{X}$ that is the type of the $k$-th ball selected. The possible outcomes of this experiment, i.e., the possible values of the joint random variable $\mathbf{Y} = (Y_1, \dots, Y_N)$, are precisely the elements $z$ of the permutation invariant atom $[m]$, and random selection simply means that each of these outcomes is equally likely. Since there are $\nu(m)$ such possible outcomes, each of them has probability $1/\nu(m)$. Also, any $z$ not in $[m]$ has zero probability of being the outcome of our sampling procedure. This means that for any gamble $f$ on $\mathcal{X}^N$, its (precise) prevision (or expectation) is given by
\[ \mathrm{MuHy}^N_{\mathcal{X}}(f | m) := \frac{1}{\nu(m)} \sum_{z \in [m]} f(z). \]
The linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | m)$ is the one associated with a multiple hyper-geometric distribution (Johnson et al., 1997, Chapter 39), whence the notation. Indeed, for any $x = (x_1, \dots, x_n)$ in $\mathcal{X}^n$, where $1 \le n \le N$, the probability of drawing a sequence of balls $x$ from an urn with composition $m$ is given by
\[ \mathrm{MuHy}^N_{\mathcal{X}}(\{x\} \times \mathcal{X}^{N-n} | m) = \frac{\nu(m - \boldsymbol{\mu})}{\nu(m)} = \frac{1}{\nu(\boldsymbol{\mu})} \prod_{x \in \mathcal{X}} \binom{m_x}{\mu_x} \Big/ \binom{N}{n}, \]
where $\boldsymbol{\mu} = T^n_{\mathcal{X}}(x)$. This means that the probability of drawing without replacement any sample with count vector $\boldsymbol{\mu}$ is $\nu(\boldsymbol{\mu})$ times this probability [there are that many such samples], and is therefore given by
\[ \frac{\nu(m - \boldsymbol{\mu})\,\nu(\boldsymbol{\mu})}{\nu(m)} = \prod_{x \in \mathcal{X}} \binom{m_x}{\mu_x} \Big/ \binom{N}{n}, \]
which indeed gives the mass function for the multiple hyper-geometric distribution. For any permutation $\pi$ of $\{1, \dots, N\}$
\[ \mathrm{MuHy}^N_{\mathcal{X}}(\pi f | m) = \frac{1}{\nu(m)} \sum_{z \in [m]} f(\pi z) = \frac{1}{\nu(m)} \sum_{\pi^{-1} z \in [m]} f(z) = \mathrm{MuHy}^N_{\mathcal{X}}(f | m), \tag{5} \]
since $\pi^{-1} z \in [m]$ iff $z \in [m]$. This means that the linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | m)$ is exchangeable. The following theorem establishes an even stronger result.

Theorem 2 (Representation theorem for finite sequences of exchangeable variables). Let $N \ge 1$ and let $\underline{P}^N_{\mathcal{X}}$ be a coherent exchangeable lower prevision on $\mathcal{L}(\mathcal{X}^N)$. Let $f$ be any gamble on $\mathcal{X}^N$. Then the following statements hold:
1. The gamble $\hat{f}$ on $\mathcal{X}^N$ given by $\hat{f} := \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} \pi f$ is permutation invariant, meaning that $\pi \hat{f} = \hat{f}$ for all $\pi \in \mathcal{P}_N$. It is therefore constant on the permutation invariant atoms of $\mathcal{X}^N$, and also given by
\[ \hat{f} = \sum_{m \in \mathcal{N}^N_{\mathcal{X}}} I_{[m]} \, \mathrm{MuHy}^N_{\mathcal{X}}(f | m). \tag{6} \]
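2. $\underline{P}^N_{\mathcal{X}}(f - \hat{f}) = \underline{P}^N_{\mathcal{X}}(\hat{f} - f) = 0$, and therefore also $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$.
3. $\underline{P}^N_{\mathcal{X}}(f) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f | \cdot))$, where $\mathrm{MuHy}^N_{\mathcal{X}}(f | \cdot)$ is the gamble on $\mathcal{N}^N_{\mathcal{X}}$ that assumes the value $\mathrm{MuHy}^N_{\mathcal{X}}(f | m)$ in $m \in \mathcal{N}^N_{\mathcal{X}}$.
Consequently a lower prevision on $\mathcal{L}(\mathcal{X}^N)$ is exchangeable if and only if it has the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | \cdot))$, where $\underline{Q}$ is any coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

Proof. The first statement is fairly immediate. We therefore turn at once to the second statement. Observe that $f - \hat{f} = \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} [f - \pi f]$. Now use the coherence [super-additivity and non-negative homogeneity], and the exchangeability of the lower prevision $\underline{P}^N_{\mathcal{X}}$ to find that
\[ \underline{P}^N_{\mathcal{X}}(f - \hat{f}) \ge \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0. \]
In a completely similar way, we get $\underline{P}^N_{\mathcal{X}}(\hat{f} - f) \ge 0$. Since it also follows from the coherence [super-additivity] of $\underline{P}^N_{\mathcal{X}}$ that $\underline{P}^N_{\mathcal{X}}(f - \hat{f}) + \underline{P}^N_{\mathcal{X}}(\hat{f} - f) \le \underline{P}^N_{\mathcal{X}}(0) = 0$, we find that indeed $\underline{P}^N_{\mathcal{X}}(f - \hat{f}) = \underline{P}^N_{\mathcal{X}}(\hat{f} - f) = 0$. Now let $g := f - \hat{f}$, then $f = \hat{f} + g$ and $\hat{f} = f - g$, and use the coherence [super-additivity and accepting sure gains] of $\underline{P}^N_{\mathcal{X}}$ to infer that
\[ \underline{P}^N_{\mathcal{X}}(f) \ge \underline{P}^N_{\mathcal{X}}(\hat{f}) + \underline{P}^N_{\mathcal{X}}(g) = \underline{P}^N_{\mathcal{X}}(\hat{f}) \ge \underline{P}^N_{\mathcal{X}}(f) + \underline{P}^N_{\mathcal{X}}(-g) = \underline{P}^N_{\mathcal{X}}(f), \]
whence indeed $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$.

To prove the third statement, use $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$ together with Equations (4) and (6) to find that $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f}) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f | \cdot))$.

These statements imply that any exchangeable coherent lower prevision is of the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | \cdot))$, where $\underline{Q}$ is some coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$. Conversely, if $\underline{Q}$ is any coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, then $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | \cdot))$ is a coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that is exchangeable: simply observe that for any gamble $f$ on $\mathcal{X}^N$ and any $\pi \in \mathcal{P}_N$,
\[ \underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f - \pi f | \cdot)) = \underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f | \cdot) - \mathrm{MuHy}^N_{\mathcal{X}}(\pi f | \cdot)) = \underline{Q}(0) = 0, \]
taking into account that each $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | m)$ is an exchangeable linear prevision [Equation (5)]. □

This theorem implies that any exchangeable coherent lower prevision on $\mathcal{X}^N$ can be associated with, or equivalently, that any collection of $N$ exchangeable random variables in $\mathcal{X}$ can be seen as the result of, $N$ random draws without replacement from an urn with $N$ balls whose types are characterised by the elements $x$ of $\mathcal{X}$, whose composition $m$ is unknown, but for which the available information about the composition is modelled by a coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

That exchangeable linear previsions can be interpreted in terms of sampling without replacement from an urn with unknown composition, is of course well-known, and essentially goes back to de Finetti's work on exchangeability; see (de Finetti, 1937) and (Cifarelli and Regazzini, 1996). Heath and Sudderth (1976) give a simple proof for variables that may assume two values.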
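The first and third statements of Theorem 2 can be checked numerically in the special case of a precise (linear) model. The Python sketch below is only an illustration under stated assumptions: the helper names (`atom`, `muhy`, `symmetrise`), the binary set of values, the gamble `f` and the count mass function `q` are all hypothetical choices, and lower previsions are not modelled at all. It verifies that the symmetrised gamble takes the value MuHy(f | m) on each atom [m], and that an expectation under an exchangeable mass function equals the expectation of MuHy(f | ·) under its count distribution.

```python
from fractions import Fraction
from itertools import permutations, product


def count_vector(z, categories):
    """Count vector of the tuple z with respect to the ordered categories."""
    return tuple(z.count(x) for x in categories)


def atom(m, categories):
    """All tuples whose count vector equals m (the invariant atom [m])."""
    return [z for z in product(categories, repeat=sum(m))
            if count_vector(z, categories) == m]


def muhy(f, m, categories):
    """MuHy(f | m): uniform average of f over the atom [m]."""
    zs = atom(m, categories)
    return Fraction(sum(f(z) for z in zs), len(zs))


def symmetrise(f, N):
    """f_hat: average over all permutations pi of the gambles z -> f(pi z)."""
    perms = list(permutations(range(N)))

    def f_hat(z):
        return Fraction(sum(f(tuple(z[i] for i in pi)) for pi in perms), len(perms))

    return f_hat


categories = (0, 1)  # hypothetical binary set of values
N = 3
f = lambda z: z[0] + 2 * z[1] * z[2]  # an arbitrary, non-symmetric gamble
f_hat = symmetrise(f, N)

# Equation (6): f_hat equals MuHy(f | m) on every element of the atom [m].
for z in product(categories, repeat=N):
    assert f_hat(z) == muhy(f, count_vector(z, categories), categories)

# A hypothetical precise count distribution q on the count vectors (zeros, ones).
q = {(3, 0): Fraction(1, 2), (2, 1): Fraction(1, 6),
     (1, 2): Fraction(1, 3), (0, 3): Fraction(0)}

# Spread each q(m) uniformly over [m] to obtain an exchangeable mass function.
direct = sum(q[count_vector(z, categories)] / len(atom(count_vector(z, categories), categories)) * f(z)
             for z in product(categories, repeat=N))
via_counts = sum(q[m] * muhy(f, m, categories) for m in q)
assert direct == via_counts  # statement 3 of Theorem 2, precise case
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities hold exactly rather than up to rounding.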
We believe that our proof for the more general case of exchangeable coherent lower previsions and random variables that may assume more than two values is conceptually even simpler than Heath and Sudderth's proof, even though it is a special case of a much more general representation result (De Cooman and Miranda, 2007, Theorem 30). The essence of the present proof in the special case of linear previsions $P$ is captured wonderfully well by Zabell's (1992, Section 3.1) succinct statement: “Thus $P$ is exchangeable if and only if two sequences having the same frequency vector have the same probability.”

[Footnote: When $\underline{P}^N_{\mathcal{X}}$, and therefore also $\underline{Q}^N_{\mathcal{X}}$, is a linear prevision, i.e., a precise probability model, this interpretation (in terms of sampling without replacement from an urn with unknown composition) follows from the Theorem of Total Probability, by interpreting the $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot | m)$ as conditional previsions, and $\underline{Q}^N_{\mathcal{X}}$ as a marginal. For imprecise models $\underline{P}^N_{\mathcal{X}}$ and $\underline{Q}^N_{\mathcal{X}}$, the validity of this interpretation follows by analogous reasoning, using Walley's Marginal Extension Theorem; see Walley (1991, Section 6.7) and Miranda and De Cooman (2006).]

[Footnote: Walley (1991, Chapter 9) also mentions this result for exchangeable coherent lower previsions.]

Running example. We come back to the simple case considered before, where $\mathcal{X} = \mathbb{B}$. Any two elements $x$ and $y$ of $\mathbb{B}^N$ can be related by some permutation of the indices $\{1, \dots, N\}$ iff they have the same number of successes $s = T^N(x) = T^N(y)$ (and of course, the same number of failures $f = N - s$). We can identify the count space $\mathcal{N}^N_{\mathbb{B}} = \{(s, f) : s + f = N\}$ with the set $\{s : s = 0, \dots, N\}$, and count vectors $m = (s, N - s)$ with the corresponding number of successes $s$, which is what we shall do from now on.

The $2^N$ elements of $\mathbb{B}^N$ are divided into $N + 1$ invariant atoms $[s]$ of elements with the same number of successes $s$, each of which has $\nu(s) = \binom{N}{s} = \frac{N!}{s!\,(N-s)!}$ elements. We have depicted the situation for $N = 3$ in Figure 1.

[Figure 1. The four invariant atoms $[s]$ in the space $\mathcal{N}^3_{\mathbb{B}}$, characterised by the number of successes $s$.]

Exchangeability forces each of the elements within an invariant atom $[s]$ to be ‘equally likely’. So each $[s]$ is to be considered as a ‘lump’, within which probability mass is distributed uniformly. The only freedom exchangeability leaves us with, lies in assigning probabilities to the lumps $[s]$. This is the essence of Theorem 2, which tells us that any exchangeable coherent lower prevision $\underline{P}^N_{\mathbb{B}}$ on $\mathcal{L}(\mathbb{B}^N)$ can be seen as the composition of a coherent lower prevision $\underline{Q}^N_{\mathbb{B}}$ on $\mathcal{L}(\{0, 1, \dots, N\})$, representing beliefs about the number of successes $s$, and the hyper-geometric distributions on $[s]$, which guarantee that the probability is distributed uniformly over each of the $\nu(s) = \binom{N}{s}$ elements of $[s]$: for any gamble $f$ on $\mathbb{B}^N$,
\[ \mathrm{Hy}^N(f | s) := \mathrm{MuHy}^N_{\mathbb{B}}(f | s, N - s) = \frac{1}{\nu(s)} \sum_{x \in [s]} f(x). \quad ♦ \]

For an exchangeable random variable $\mathbf{X} = (X_1, \dots, X_N)$, with (exchangeable) distribution $\underline{P}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^N)$, we have seen that we can completely characterise this distribution by the corresponding distribution of the count vectors $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

We have also seen that any selection of $1 \le n \le N$ random variables from amongst the $X_1, \dots, X_N$ will be exchangeable too, and that their distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$. There is moreover an interesting relation between the distributions $\underline{Q}^N_{\mathcal{X}}$ and $\underline{Q}^n_{\mathcal{X}}$ of the corresponding count vectors, which we shall derive in the next section (Equation (9)). On the other hand, it is well-known (see for instance Diaconis and Freedman (1980); we shall come back to this in Section 7) that if we have an exchangeable $N$-tuple $(X_1, \dots, X_N)$, it is not always possible to extend it to an exchangeable $(N+1)$-tuple.

4. EXCHANGEABLE SEQUENCES

4.1. Definitions.
We now generalise the definition of exchangeability from finite to countable sequences of random variables. Consider a countable sequence $X_1, \dots, X_n, \dots$ of random variables taking values in the same non-empty set $\mathcal{X}$. This sequence is called exchangeable if any finite collection of random variables taken from this sequence is exchangeable. This is clearly equivalent to requiring that the random variables $X_1, \dots, X_n$ should be exchangeable for all $n \ge 1$.
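We can also consider the exchangeable sequence as a single random variable $\mathbf{X}$ assuming values in the set $\mathcal{X}^{\mathbb{N}}$, where $\mathbb{N}$ is the set of the natural numbers (positive integers, without zero). Its possible values $x$ are sequences $x_1, \dots, x_n, \dots$ of elements of $\mathcal{X}$, or in other words, maps from $\mathbb{N}$ to $\mathcal{X}$. We can model the available information about the value that $\mathbf{X}$ assumes in $\mathcal{X}^{\mathbb{N}}$ by a coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$, called the distribution of the exchangeable random sequence $\mathbf{X}$.

The random sequence $\mathbf{X}$, or its distribution $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$, is clearly exchangeable if and only if all its $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ are exchangeable for $n \ge 1$. These marginals $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are defined as follows: for any gamble $f$ on $\mathcal{X}^n$, $\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^{\mathbb{N}}_{\mathcal{X}}(\tilde{f})$, where $\tilde{f}$ is the cylindrical extension of $f$ to $\mathcal{X}^{\mathbb{N}}$, defined by $\tilde{f}(x) := f(x_1, \dots, x_n)$ for all $x = (x_1, \dots, x_n, x_{n+1}, \dots)$ in $\mathcal{X}^{\mathbb{N}}$. In addition, the family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, satisfies the following ‘time consistency’ requirement:
\[ \underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde{f}), \tag{7} \]
for all $n \ge 1$, $k \ge 0$, and all gambles $f$ on $\mathcal{X}^n$, where now $\tilde{f}$ denotes the cylindrical extension of $f$ to $\mathcal{X}^{n+k}$: $\underline{P}^n_{\mathcal{X}}$ should be the $\mathcal{X}^n$-marginal of any $\underline{P}^{n+k}_{\mathcal{X}}$.

It follows at once that any finite collection of $n \ge 1$ random variables taken from such an exchangeable sequence has the same distribution as the first $n$ variables $X_1, \dots, X_n$, which is the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$.

Conversely, suppose we have a collection of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, that satisfy the time consistency requirement (7). Then any coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$ that has $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ is exchangeable. The smallest, or most conservative such (exchangeable) coherent lower prevision is given by
\[ \underline{E}^{\mathbb{N}}_{\mathcal{X}}(f) := \sup_{n \in \mathbb{N}} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)) = \lim_{n \to \infty} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)), \]
where $f$ is any gamble on $\mathcal{X}^{\mathbb{N}}$, and its lower projection $\underline{\mathrm{proj}}_n(f)$ on $\mathcal{X}^n$ is the gamble on $\mathcal{X}^n$ that is defined by $\underline{\mathrm{proj}}_n(f)(x) := \inf_{z_k = x_k,\ k = 1, \dots, n} f(z)$ for all $x \in \mathcal{X}^n$, i.e., the lower projection of $f$ on $x$ is the infimum of $f$ over the elements of $\mathcal{X}^{\mathbb{N}}$ whose projection on $\mathcal{X}^n$ is $x$. See (De Cooman and Miranda, 2006, Section 5) for more details.

4.2. Time consistency of the count distributions.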
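It will be of crucial interest for what follows to find out what are the consequences of the time consistency requirement (7) on the marginals $\underline{P}^n_{\mathcal{X}}$ for the corresponding family $\underline{Q}^n_{\mathcal{X}}$, $n \ge 1$, of distributions of the count vectors $T^n_{\mathcal{X}}(X_1, \dots, X_n)$. Consider therefore $n \ge 1$, $k \ge 0$ and any gamble $h$ on $\mathcal{N}^n_{\mathcal{X}}$. Let $f := h \circ T^n_{\mathcal{X}}$, then
\[ \underline{Q}^n_{\mathcal{X}}(h) = \underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde{f}) = \underline{Q}^{n+k}_{\mathcal{X}}(\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde{f} | \cdot)), \]
where the first equality follows from Equation (4), the second from Equation (7), and the last from Theorem 2. Now for any $m'$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$, and any $z' = (z, y)$ in $\mathcal{X}^{n+k} = \mathcal{X}^n \times \mathcal{X}^k$ we have that $T^{n+k}_{\mathcal{X}}(z') = T^n_{\mathcal{X}}(z) + T^k_{\mathcal{X}}(y)$ and therefore
\begin{align*}
\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde{f} | m')
&= \frac{1}{\nu(m')} \sum_{z' \in [m']} \tilde{f}(z') = \frac{1}{\nu(m')} \sum_{(z, y) \in [m']} f(z) = \frac{1}{\nu(m')} \sum_{\substack{m \in \mathcal{N}^n_{\mathcal{X}} \\ m \le m'}} \ \sum_{y \in [m' - m]} \ \sum_{z \in [m]} f(z) \\
&= \frac{1}{\nu(m')} \sum_{\substack{m \in \mathcal{N}^n_{\mathcal{X}} \\ m \le m'}} \nu(m' - m)\, \nu(m)\, \mathrm{MuHy}^n_{\mathcal{X}}(f | m) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m' - m)\, \nu(m)}{\nu(m')}\, h(m), \tag{8}
\end{align*}
since $\mathrm{MuHy}^n_{\mathcal{X}}(f | m) = h(m)$, and $\nu(m' - m)$ is zero unless $m \le m'$. So we see that time consistency is equivalent to
\[ \underline{Q}^n_{\mathcal{X}}(h) = \underline{Q}^{n+k}_{\mathcal{X}}\Big( \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\cdot - m)\, \nu(m)}{\nu(\cdot)}\, h(m) \Big) \tag{9} \]
for all $n \ge 1$, $k \ge 0$ and $h \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.

5. A REPRESENTATION THEOREM FOR EXCHANGEABLE SEQUENCES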
De Finetti (1937, 1975) has proven a representation result for exchangeable sequences with linear previsions that generalises Theorem 2, and where multinomial distributions take over the rôle that the multiple hyper-geometric ones play for finite collections of exchangeable variables. One simple and intuitive way (see also de Finetti, 1975, p. 218) to understand why the representation result can be thus extended from finite collections to countable sequences, is based on the fact that the multinomial distribution can be seen as a limit of multiple hyper-geometric ones (Johnson et al., 1997, Chapter 39). This is also the central idea behind Heath and Sudderth's (1976) simple proof of this representation result in the case of variables that may only assume two possible values.

However, there is another, arguably even simpler, approach to proving the same results, which we present here. It also works for exchangeability in the context of coherent lower previsions. And as we shall have occasion to explain further on, it has the additional advantage of clearly indicating what the ‘representation’ is, and where it is uniquely defined. We make a start at proving our representation theorem by taking a look at multinomial processes.

5.1. Multinomial processes are exchangeable.
Consider a sequence of random variables $Y_1, \dots, Y_n, \dots$ that are mutually independent, and such that each random variable $Y_n$ has the same probability mass function $\boldsymbol{\theta}$: the probability that $Y_n = x$ is $\theta_x$ for $x \in \mathcal{X}$. [Footnote: In other words, the random variables are IID.] Observe that $\boldsymbol{\theta}$ is an element of the $\mathcal{X}$-simplex
\[ \Sigma_{\mathcal{X}} = \Big\{ \boldsymbol{\theta} \in \mathbb{R}^{\mathcal{X}} : (\forall x \in \mathcal{X})(\theta_x \ge 0) \text{ and } \sum_{x \in \mathcal{X}} \theta_x = 1 \Big\}. \]
Then for any $n \ge 1$ and any $z$ in $\mathcal{X}^n$ the probability that $(Y_1, \dots, Y_n)$ is equal to $z$ is given by $\prod_{x \in \mathcal{X}} \theta_x^{T_x(z)}$, which yields the multinomial mass function (Johnson et al., 1997, Chapter 35). As a result, we have for any gamble $f$ on $\mathcal{X}^n$ that its corresponding (multinomial) prevision (expectation) is given by
\begin{align*}
\mathrm{Mn}^n_{\mathcal{X}}(f | \boldsymbol{\theta})
&= \sum_{z \in \mathcal{X}^n} f(z) \prod_{x \in \mathcal{X}} \theta_x^{T_x(z)} = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \ \sum_{z \in [m]} f(z) \prod_{x \in \mathcal{X}} \theta_x^{m_x} \\
&= \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \mathrm{MuHy}^n_{\mathcal{X}}(f | m)\, \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x} = \mathrm{CoMn}^n_{\mathcal{X}}(\mathrm{MuHy}^n_{\mathcal{X}}(f | \cdot) | \boldsymbol{\theta}), \tag{10}
\end{align*}
where we defined the (count multinomial) linear prevision $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot | \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ by
\[ \mathrm{CoMn}^n_{\mathcal{X}}(g | \boldsymbol{\theta}) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} g(m)\, \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x}, \tag{11} \]
where $g$ is any gamble on $\mathcal{N}^n_{\mathcal{X}}$. The corresponding probability mass for any count vector $m$, namely
\[ \mathrm{CoMn}^n_{\mathcal{X}}(\{m\} | \boldsymbol{\theta}) = \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x} =: B_m(\boldsymbol{\theta}), \tag{12} \]
is the probability of observing some value $z$ for $(Y_1, \dots, Y_n)$ whose count vector is $m$. The polynomial function $B_m$ on the $\mathcal{X}$-simplex is called a (multivariate) Bernstein (basis) polynomial. We have listed a number of very interesting properties for these special polynomials in the Appendix. One important fact, which we shall need quite soon, is that the set $\{ B_m : m \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein (basis) polynomials of fixed degree $n$ forms a basis for the linear space of all (multivariate) polynomials on $\Sigma_{\mathcal{X}}$ whose degree is at most $n$; hence their name. If we have a polynomial $p$ of degree $m$, this means that for any $n \ge m$, $p$ has a unique (Bernstein) decomposition $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that
\[ p = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} b^n_p(m)\, B_m. \]
If we combine this with Equations (11) and (12), we find that $b^n_p$ is the unique gamble on $\mathcal{N}^n_{\mathcal{X}}$ such that $\mathrm{CoMn}^n_{\mathcal{X}}(b^n_p | \cdot) = p$.

We deduce from Equation (10) and Theorem 2 that the linear prevision $\mathrm{Mn}^n_{\mathcal{X}}(\cdot | \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{X}^n)$ – the distribution of $(Y_1, \dots, Y_n)$ – is exchangeable, and that $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot | \boldsymbol{\theta})$ is the corresponding distribution for the corresponding count vectors $T^n_{\mathcal{X}}(Y_1, \dots, Y_n)$. Therefore the sequence of IID random variables $Y_1, \dots, Y_n, \dots$ is exchangeable.

Running example.
Let us go back to our example, where $\mathcal{X} = \mathbb{B}$. Here the $\mathbb{B}$-simplex $\Sigma_{\mathbb{B}} = \{ (\theta, 1 - \theta) : \theta \in [0, 1] \}$ can be identified with the unit interval, and every element $\boldsymbol{\theta} = (\theta, 1 - \theta)$ can be identified with the probability $\theta$ of a success.

The count multinomial distribution $\mathrm{CoMn}^n_{\mathbb{B}}(\cdot | \boldsymbol{\theta})$ now of course turns into the (count) binomial distribution $\mathrm{CoBi}^n(\cdot | \theta)$ on $\mathcal{L}(\{0, \dots, n\})$, given by
\[ \mathrm{CoBi}^n(g | \theta) := \sum_{s=0}^{n} g(s) \binom{n}{s} \theta^s (1 - \theta)^{n-s} = \sum_{s=0}^{n} g(s)\, B^n_s(\theta) \tag{13} \]
for any gamble $g$ on the set $\{0, 1, \dots, n\}$ of possible values for the number of successes $s$. In this expression, the $B^n_s(\theta) := \binom{n}{s} \theta^s (1 - \theta)^{n-s}$ are the $n + 1$ Bernstein basis polynomials of degree $n$ (Lorentz, 1986; Prautzsch et al., 2002). For fixed $n$, they add up to one and are linearly independent, and they form a basis for the linear space of all polynomials on $[0, 1]$ of degree at most $n$. ♦

[Footnote: We assume implicitly that $a^0 = 1$ for all $a \ge 0$.]

5.2. A representation theorem.
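Consider the following linear subspace of $\mathcal{L}(\Sigma_{\mathcal{X}})$:
\[ \mathcal{V}(\Sigma_{\mathcal{X}}) := \{ \mathrm{CoMn}^n_{\mathcal{X}}(g | \cdot) : n \ge 1,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \} = \{ \mathrm{Mn}^n_{\mathcal{X}}(f | \cdot) : n \ge 1,\ f \in \mathcal{L}(\mathcal{X}^n) \}, \]
each of whose elements is a polynomial function on the $\mathcal{X}$-simplex:
\[ \mathrm{CoMn}^n_{\mathcal{X}}(g | \boldsymbol{\theta}) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} g(m)\, \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x} = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} g(m)\, B_m(\boldsymbol{\theta}), \]
and is actually a linear combination of Bernstein basis polynomials $B_m$ with coefficients $g(m)$. So $\mathcal{V}(\Sigma_{\mathcal{X}})$ is the linear space spanned by all Bernstein basis polynomials, and is therefore the set of all polynomials on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$.

Now if $\underline{R}_{\mathcal{X}}$ is any coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$, then it is easy to see that the family of coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, defined by
\[ \underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f | \cdot)), \quad f \in \mathcal{L}(\mathcal{X}^n) \tag{14} \]
is still exchangeable and time consistent, and the corresponding count distributions are given by
\[ \underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g | \cdot)), \quad g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}). \tag{15} \]
Here, we are going to show that a converse result also holds: for any time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, there is a coherent lower prevision $\underline{R}_{\mathcal{X}}$ on $\mathcal{V}(\Sigma_{\mathcal{X}})$ such that Equation (14), or its reformulation for counts (15), holds. We shall call such an $\underline{R}_{\mathcal{X}}$ a representation, or representing coherent lower prevision, for the family $\underline{P}^n_{\mathcal{X}}$. Of course, any representing $\underline{R}_{\mathcal{X}}$, if it exists, is uniquely determined on $\mathcal{V}(\Sigma_{\mathcal{X}})$.

So consider a family of coherent lower previsions $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ that are time consistent, meaning that Equation (9) is satisfied. It suffices to find an $\underline{R}_{\mathcal{X}}$ such that (15) holds, because the corresponding exchangeable lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are then uniquely determined by Theorem 2, and automatically satisfy the condition (14).

Our proposal is to define the functional $\underline{R}_{\mathcal{X}}$ on the set $\mathcal{V}(\Sigma_{\mathcal{X}})$ as follows: consider any element $p$ of $\mathcal{V}(\Sigma_{\mathcal{X}})$. Then, by definition, there is some $n \ge 1$ and a corresponding unique $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that $p = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p | \cdot)$. We then let $\underline{R}_{\mathcal{X}}(p) := \underline{Q}^n_{\mathcal{X}}(b^n_p)$.

Of course, the first thing to check is whether this definition is consistent: any polynomial $p$ of degree $m$ has unique representations $b^n_p$ for all $n \ge m$, which means that we have to check that no inconsistencies can arise in the sense that $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) \ne \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$ for some $n_1, n_2 \ge m$. It turns out that this is guaranteed by the time consistency of the $\underline{P}^n_{\mathcal{X}}$, or that of the corresponding $\underline{Q}^n_{\mathcal{X}}$, as is made apparent by the proof of the following lemma.

Lemma 3.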
Consider a polynomial $p$ of degree $m$, and let $n_1, n_2 \ge m$. Then $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) = \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$.

Proof. We may assume without loss of generality that $n_2 \ge n_1$. The Bernstein decompositions $b^{n_1}_p$ and $b^{n_2}_p$ are then related by Zhou's formula [see Equation (22) in the Appendix]:
\[ b^{n_2}_p(m_2) = \sum_{m_1 \in \mathcal{N}^{n_1}_{\mathcal{X}}} \frac{\nu(m_2 - m_1)\, \nu(m_1)}{\nu(m_2)}\, b^{n_1}_p(m_1), \quad m_2 \in \mathcal{N}^{n_2}_{\mathcal{X}}. \]
Consequently, by the time consistency requirement (9), we indeed get that $\underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p) = \underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p)$. □

We also have to check whether the functional $\underline{R}_{\mathcal{X}}$ thus defined on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ is a coherent lower prevision. This is established in the following lemma.

Lemma 4. $\underline{R}_{\mathcal{X}}$ is a coherent lower prevision on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$.
Proof. We show that $\underline{R}_{\mathcal{X}}$ satisfies the necessary and sufficient conditions (P1)–(P3) for coherence of a lower prevision on a linear space.

We first prove that (P1) is satisfied. Consider any $p \in \mathcal{V}(\Sigma_{\mathcal{X}})$. Let $m$ be the degree of $p$. We must show that $\underline{R}_{\mathcal{X}}(p) \ge \min p$. We find that $\underline{R}_{\mathcal{X}}(p) = \underline{Q}^n_{\mathcal{X}}(b^n_p) \ge \min b^n_p$ for all $n \ge m$, because of the coherence [accepting sure gains] of the count lower previsions $\underline{Q}^n_{\mathcal{X}}$. But Proposition 8 in the Appendix tells us that $\min b^n_p \uparrow \min p$, whence indeed $\underline{R}_{\mathcal{X}}(p) \ge \min p$.

Next, consider any $p$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$ and any real $\lambda \ge 0$. Consider any $n$ that is not smaller than the degree of $p$. Since obviously $b^n_{\lambda p} = \lambda b^n_p$, we get
\[ \underline{R}_{\mathcal{X}}(\lambda p) = \underline{Q}^n_{\mathcal{X}}(b^n_{\lambda p}) = \underline{Q}^n_{\mathcal{X}}(\lambda b^n_p) = \lambda\, \underline{Q}^n_{\mathcal{X}}(b^n_p) = \lambda\, \underline{R}_{\mathcal{X}}(p), \]
where the third equality follows from the coherence [non-negative homogeneity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$. This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ satisfies the non-negative homogeneity requirement (P2).

Finally, consider $p$ and $q$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$, and any $n$ that is not smaller than the maximum of the degrees of $p$ and $q$. Since obviously $b^n_{p+q} = b^n_p + b^n_q$, we get
\[ \underline{R}_{\mathcal{X}}(p + q) = \underline{Q}^n_{\mathcal{X}}(b^n_{p+q}) = \underline{Q}^n_{\mathcal{X}}(b^n_p + b^n_q) \ge \underline{Q}^n_{\mathcal{X}}(b^n_p) + \underline{Q}^n_{\mathcal{X}}(b^n_q) = \underline{R}_{\mathcal{X}}(p) + \underline{R}_{\mathcal{X}}(q), \]
where the inequality follows from the coherence [super-additivity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$. This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ also satisfies the super-additivity requirement (P3) and as a consequence it is coherent. □

We can summarise the argument above as follows.
Theorem 5 (Representation theorem for exchangeable sequences). Given a time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, there is a unique coherent lower prevision $\underline{R}_{\mathcal{X}}$ on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex, such that for all $n \ge 1$, all $f \in \mathcal{L}(\mathcal{X}^n)$ and all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$:
\[ \underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f | \cdot)) \quad \text{and} \quad \underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g | \cdot)). \tag{16} \]

Hence, the belief model governing any countable exchangeable sequence in $\mathcal{X}$ can be completely characterised by a coherent lower prevision on the linear space of polynomial gambles on $\Sigma_{\mathcal{X}}$.

In the particular case where we have a time consistent family of exchangeable linear previsions $P^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, then $\underline{R}_{\mathcal{X}}$ will be a linear prevision $R_{\mathcal{X}}$ on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex. As such, it will be characterised by its values $R_{\mathcal{X}}(B_m)$ on the Bernstein basis polynomials $B_m$, $m \in \mathcal{N}^n_{\mathcal{X}}$, $n \ge 1$, or on any other basis of $\mathcal{V}(\Sigma_{\mathcal{X}})$.

It is a consequence of coherence that $\underline{R}_{\mathcal{X}}$ is also uniquely determined on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$ of all continuous gambles on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$: by the Stone-Weierstraß theorem, any such gamble is the uniform limit of some sequence of polynomial gambles, and coherence implies that the lower prevision of a uniform limit is the limit of the lower previsions.

This unicity result cannot be extended to more general (discontinuous) types of gambles: the coherent lower prevision $\underline{R}_{\mathcal{X}}$ is not uniquely determined on the set of all gambles $\mathcal{L}(\Sigma_{\mathcal{X}})$ on the simplex, and there may be different coherent lower previsions on $\mathcal{L}(\Sigma_{\mathcal{X}})$ satisfying Equation (16). But any such lower previsions will agree on the class $\mathcal{V}(\Sigma_{\mathcal{X}})$ of polynomial gambles, which is the class of gambles we need in order to characterise the exchangeable sequence.

[Footnote: See Miranda et al. (2007) for a study of the gambles whose prevision is determined by the prevision of the polynomials.]

We now investigate the meaning of the representing lower prevision $\underline{R}_{\mathcal{X}}$ a bit further. Consider the sequence of so-called frequency random variables $F_n := T^n_{\mathcal{X}}(X_1, \dots, X_n)/n$ corresponding to an exchangeable sequence of random variables $X_1, \dots, X_n, \dots$, and assuming values in the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$. The distribution $\underline{P}_{F_n}$ of $F_n$, i.e., the coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$ that models the available information about the values that $F_n$ assumes in $\Sigma_{\mathcal{X}}$, is given by
\[ \underline{P}_{F_n}(h) := \underline{Q}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n}\big) = \underline{R}_{\mathcal{X}}\big(\mathrm{CoMn}^n_{\mathcal{X}}(h \circ \tfrac{1}{n} | \cdot)\big), \quad h \in \mathcal{L}(\Sigma_{\mathcal{X}}), \]
where $\tfrac{1}{n}$ stands for the map $m \mapsto m/n$, because we know that $\underline{Q}^n_{\mathcal{X}}$ is the distribution of $T^n_{\mathcal{X}}(X_1, \dots, X_n)$, and also taking into account Theorem 5 for the last equality.
Now,
\[ \mathrm{CoMn}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n} | \boldsymbol{\theta}\big) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} h\big(\tfrac{m}{n}\big)\, B_m(\boldsymbol{\theta}) \]
is the Bernstein approximant or approximating Bernstein polynomial of degree $n$ for the gamble $h$, and it is a known result (see (Feller, 1971, Section VII.2), (Heitzinger et al., 2003, Section 2)) that the sequence of approximating Bernstein polynomials $\mathrm{CoMn}^n_{\mathcal{X}}(h \circ \tfrac{1}{n} | \cdot)$ converges uniformly to $h$ for $n \to \infty$ if $h$ is continuous. So, because $\underline{R}_{\mathcal{X}}$ is defined uniquely, and is uniformly continuous, on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$, we find the following result, which provides an interpretation for the representation $\underline{R}_{\mathcal{X}}$, and which can be seen as another generalisation of de Finetti's Representation Theorem: $\underline{R}_{\mathcal{X}}$ is the limit of the frequency distributions.

Theorem 6.
For all continuous gambles $h$ on $\Sigma_{\mathcal{X}}$, we have that
\[ \lim_{n \to \infty} \underline{P}_{F_n}(h) = \underline{R}_{\mathcal{X}}(h), \]
or, in other words, the sequence of distributions $\underline{P}_{F_n}$ converges point-wise to $\underline{R}_{\mathcal{X}}$ on $\mathcal{C}(\Sigma_{\mathcal{X}})$, and in this specific sense, the sample frequencies $F_n$ converge in distribution.

Running example. Back to our example, where $\mathcal{X} = \mathbb{B}$. Here the Representation Theorem (Theorem 5) states that the coherent count lower previsions $\underline{Q}^n_{\mathbb{B}}$, $n \ge 1$, for any exchangeable sequence of variables in $\mathbb{B}$ have the form
\[ \underline{Q}^n_{\mathbb{B}}(g) = \underline{R}_{\mathbb{B}}(\mathrm{CoBi}^n(g | \cdot)), \]
for all gambles $g$ on the set $\{0, 1, \dots, n\}$ of possible numbers of successes $s$, where the (count) binomial distribution $\mathrm{CoBi}^n(\cdot | \theta)$ is given by Equation (13), and $\underline{R}_{\mathbb{B}}$ is some coherent lower prevision defined on the set $\mathcal{V}([0, 1])$ of all polynomials on $[0, 1]$, which is the set of possible values for the probability $\theta$ of a success.

This $\underline{R}_{\mathbb{B}}$ can be uniquely extended to a coherent lower prevision on the set $\mathcal{C}([0, 1])$ of all continuous gambles (functions) on $[0, 1]$. And Theorem 6 assures us that this $\underline{R}_{\mathbb{B}}$ on $\mathcal{C}([0, 1])$ is the ‘limiting distribution’ of the frequency of successes $F_n = T^n(X_1, \dots, X_n)/n$, as the number of ‘trials’ $n$ goes to infinity.

When all the count distributions $\underline{Q}^n_{\mathbb{B}}$ are linear previsions $Q^n_{\mathbb{B}}$, then the representation $\underline{R}_{\mathbb{B}}$ is a linear prevision $R_{\mathbb{B}}$, and vice versa. This linear prevision on $\mathcal{C}([0, 1])$, or equivalently, on $\mathcal{V}([0, 1])$ is completely determined by (and of course completely determines) its values on any basis of the set of polynomials on $[0, 1]$. If we take as a basis the set $\{ \theta^n : n \ge 0 \}$, then we see that $R_{\mathbb{B}}$ is completely determined by its (raw) moment sequence $m_n = R_{\mathbb{B}}(\theta^n)$, $n \ge 0$. It is well-known (see for instance Feller, 1971, Section VII.3) that in the case of finitely additive probabilities, or linear previsions, a moment sequence uniquely determines a distribution function, except in its discontinuity points. And this brings us right back to de Finetti's (1937) version of the Representation Theorem: “la loi de probabilité $\Phi_n(\xi) = P(Y_n \le \xi)$ tend vers une limite pour $n \to \infty$. [. . . ] il s'ensuit qu'il existe une loi-limite $\Phi(\xi)$ telle que $\lim_{n \to \infty} \Phi_n(\xi) = \Phi(\xi)$ sauf peut-être pour les points de discontinuité.” [In English: “the probability law $\Phi_n(\xi) = P(Y_n \le \xi)$ tends to a limit as $n \to \infty$. [. . . ] it follows that there exists a limit law $\Phi(\xi)$ such that $\lim_{n \to \infty} \Phi_n(\xi) = \Phi(\xi)$, except perhaps at the points of discontinuity.”] ♦

[Footnote: We refrain here from imposing conditions other than coherence (e.g., related to $\sigma$-additivity) on such extensions, which could guarantee unicity on the set of all measurable gambles; see Miranda et al. (2007) for related discussion.]
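The role played by Bernstein approximants in Theorem 6 can be illustrated numerically in the binary case. The Python sketch below is an assumption-laden illustration, not part of the paper: the function names, the test gamble `h` and the evaluation grid are all hypothetical. For a precise IID model with success probability theta, the prevision of h(F_n) is the Bernstein approximant of h evaluated at theta, so the shrinking uniform error shows the frequency distributions settling on the limit.

```python
from math import comb


def bernstein_basis(n, s, theta):
    """Univariate Bernstein basis polynomial B^n_s(theta) = C(n, s) theta^s (1 - theta)^(n - s)."""
    return comb(n, s) * theta**s * (1 - theta) ** (n - s)


def cobi(g, n, theta):
    """Count binomial prevision CoBi^n(g | theta) of a gamble g on {0, ..., n}."""
    return sum(g(s) * bernstein_basis(n, s, theta) for s in range(n + 1))


def bernstein_approximant(h, n, theta):
    """Degree-n Bernstein approximant of h: CoBi^n applied to s -> h(s / n)."""
    return cobi(lambda s: h(s / n), n, theta)


h = lambda t: abs(t - 0.5)  # hypothetical continuous, non-polynomial gamble on [0, 1]
grid = [i / 200 for i in range(201)]  # hypothetical evaluation grid


def sup_error(n):
    """Largest deviation between h and its degree-n approximant on the grid."""
    return max(abs(bernstein_approximant(h, n, t) - h(t)) for t in grid)


# For fixed n the basis polynomials add up to one.
assert abs(cobi(lambda s: 1.0, 10, 0.3) - 1.0) < 1e-12

# The uniform error decreases as the degree n grows.
errors = [sup_error(n) for n in (5, 20, 80, 320)]
assert all(a > b for a, b in zip(errors, errors[1:]))
```

Python evaluates `0.0 ** 0` as `1.0`, which matches the convention needed for the boundary points of the unit interval. The grid only approximates the supremum over [0, 1].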
6. LOOKING AT THE SAMPLE MEANS
Consider an exchangeable sequence $X_1, \dots, X_n, \dots$, and any gamble $f$ on $\mathcal{X}$. Then the sequence $f(X_1), \dots, f(X_n), \dots$ is again an exchangeable sequence of random variables, now taking values in the finite set $f(\mathcal{X})$. We are interested in the sample means
\[ S_n(f)(X_1, \dots, X_n) := \frac{1}{n} \sum_{k=1}^{n} f(X_k), \]
which form a sequence of random variables in $[\inf f, \sup f]$. For any $m$ in $\mathcal{N}^n_{\mathcal{X}}$ and any $z \in [m]$,
\[ S_n(f)(z) = \frac{1}{n} \sum_{k=1}^{n} f(z_k) = \frac{1}{n} \sum_{x \in \mathcal{X}} m_x f(x) =: S_{\mathcal{X}}\big(f \,\big|\, \tfrac{m}{n}\big), \]
where for each $\boldsymbol{\theta} \in \Sigma_{\mathcal{X}}$, we have defined the linear prevision $S_{\mathcal{X}}(\cdot | \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{X})$ by $S_{\mathcal{X}}(f | \boldsymbol{\theta}) := \sum_{x \in \mathcal{X}} f(x)\, \theta_x$. Observe that $S_{\mathcal{X}}(f | \cdot)$ is a very special (linear) polynomial gamble on the $\mathcal{X}$-simplex. We then get
\[ \mathrm{MuHy}^n_{\mathcal{X}}(S_n(f) | m) = \frac{1}{\nu(m)} \sum_{z \in [m]} S_n(f)(z) = \frac{1}{\nu(m)} \sum_{z \in [m]} S_{\mathcal{X}}\big(f \,\big|\, \tfrac{m}{n}\big) = S_{\mathcal{X}}\big(f \,\big|\, \tfrac{m}{n}\big), \]
so we find for the distribution $\underline{P}_{S_n(f)}$ of the sample mean $S_n(f)$, which is a coherent lower prevision on $\mathcal{L}([\inf f, \sup f])$, that
\[ \underline{P}_{S_n(f)}(h) = \underline{P}^n_{\mathcal{X}}(h(S_n(f))) = \underline{Q}^n_{\mathcal{X}}\big(h(S_{\mathcal{X}}(f | \cdot)) \circ \tfrac{1}{n}\big), \quad h \in \mathcal{L}([\inf f, \sup f]). \]
In terms of the representing lower prevision $\underline{R}_{\mathcal{X}}$, we see that
\[ \mathrm{CoMn}^n_{\mathcal{X}}\big(h(S_{\mathcal{X}}(f | \cdot)) \circ \tfrac{1}{n} \,\big|\, \boldsymbol{\theta}\big) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} h\big(S_{\mathcal{X}}(f | \tfrac{m}{n})\big)\, B_m(\boldsymbol{\theta}) \]
is the approximating Bernstein polynomial for the gamble $h(S_{\mathcal{X}}(f | \cdot))$ on $\Sigma_{\mathcal{X}}$. So for all continuous gambles $h$ on $[\inf f, \sup f]$, $h(S_{\mathcal{X}}(f | \cdot))$ is a continuous gamble on $\Sigma_{\mathcal{X}}$, and is therefore the uniform limit of its sequence of approximating Bernstein polynomials. Since a coherent lower prevision is uniformly continuous, we see that
\[ \lim_{n \to \infty} \underline{P}_{S_n(f)}(h) = \underline{R}_{\mathcal{X}}(h(S_{\mathcal{X}}(f | \cdot))). \tag{17} \]
This tells us that for an exchangeable sequence $X_1, \dots, X_n, \dots$ the sequence of sample means $S_n(f)(X_1, \dots, X_n)$ converges in distribution.

[Footnotes to the quotation from de Finetti (1937) at the end of Section 5: Our italics. In de Finetti's notation, $Y_n$ is our $F_n$, and $\Phi_n$ its distribution function.]
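Returning to the sample means of this section, the key observation is that the sample mean of a tuple depends on it only through its count vector. The following Python sketch checks this by enumeration; the set of values, the payoff table and the function names are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from itertools import product


def count_vector(z, categories):
    """Count vector of the tuple z with respect to the ordered categories."""
    return tuple(z.count(x) for x in categories)


def sample_mean(f, z):
    """S_n(f)(z) = (1/n) * sum_k f(z_k)."""
    return Fraction(sum(f(zk) for zk in z), len(z))


def linear_prevision(f, theta, categories):
    """S(f | theta) = sum_x f(x) * theta_x, for a mass function theta on the categories."""
    return sum(f(x) * t for x, t in zip(categories, theta))


categories = ("a", "b", "c")        # hypothetical set of values
payoff = {"a": -1, "b": 0, "c": 4}  # hypothetical gamble f on that set
f = payoff.__getitem__
n = 5

# On every tuple z with count vector m, the sample mean equals S(f | m / n).
# It is therefore constant on each atom [m], so MuHy(S_n(f) | m) = S(f | m / n).
for z in product(categories, repeat=n):
    m = count_vector(z, categories)
    frequency = tuple(Fraction(mx, n) for mx in m)
    assert sample_mean(f, z) == linear_prevision(f, frequency, categories)
```

Because the sample mean is constant on each invariant atom, averaging it uniformly over the atom returns the same value, which is the identity used in the derivation of Equation (17).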
7. EXCHANGEABLE NATURAL EXTENSION
Throughout this paper, we have always considered exchangeable lower previsions $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$. At first sight, it seems an impossible task to specify or assess such an exchangeable lower prevision: a subject must specify an uncountable infinity of supremum acceptable prices, and at the same time keep track of all the symmetry requirements imposed by exchangeability, as well as the coherence requirement.

Alternatively, a subject must specify a coherent count lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, and this means specifying an uncountable infinity of real numbers $\underline{Q}^N_{\mathcal{X}}(g)$, for all gambles $g$ on $\mathcal{N}^N_{\mathcal{X}}$. Is it therefore realistic, or of any practical relevance, to consider such exchangeable coherent lower previsions? Indeed it is, and we now want to show why.

7.1. The general problem.
What will usually happen in practice, is that a subject makesan assessment that N variables X , . . . , X N taking values in a finite set X are exchange-able, and in addition specifies supremum acceptable buying prices P ( f ) for all gamblesin some (typically finite, but not necessarily so) set of gambles K ⊆ L ( X N ) . The ques-tion then is: can we turn these assessments into an exchangeable coherent lower previsionP N X defined on all of L ( X N ) , that is furthermore as small (least-committal, conservative)as possible? To answer this question, we begin by looking at the most conservative (i.e., point-wisesmallest) exchangeable coherent lower prevision E P N for N variables. Since the mostconservative coherent lower prevision on L ( N N X ) is the vacuous lower prevision, given by Q N X ( g ) = min m ∈ N N X g ( m ) , our Representation Theorem for finite exchangeable sequences(Theorem 2) tells us that E P N ( f ) = min m ∈ N N X MuHy N X ( f | m ) (18)for all gambles f on X N , whose corresponding count lower prevision is vacuous. It modelsa subject’s beliefs about sampling without replacement from an urn with N balls, where thissubject is completely ignorant about the composition of the urn.Using this E P N , we can invoke a general theorem we have proven elsewhere, aboutthe existence of coherent lower previsions that are (strongly) invariant under a monoid oftransformations (De Cooman and Miranda, 2007, Theorem 16) to find that ENE-1. there are exchangeable coherent lower previsions on L ( X N ) that dominate P on K if and only if E P N (cid:18) n (cid:229) k = l k [ f k − P ( f k )] (cid:19) ≥ n ≥ l k ≥ f k ∈ K , k = , . . . , n ; (19) When Q N X is a linear prevision Q N X , it suffices to specify a finite number of real numbers Q N X ( { m } ) , for m in N N X , but such an extremely efficient reduction is generally not possible for coherent count lower previ-sions Q N X . This is a so-called structural assessment in Walley’s (1991) terminology. 
[Footnote: Equation (19) is closely related to the avoiding sure loss condition (1), but with the supremum replaced by the coherent upper prevision $\overline{E}_{\mathcal{P}_N}$. Similarly, Equation (20) is related to the expression (3) for natural extension, but with the infimum operator replaced by the coherent lower prevision $\underline{E}_{\mathcal{P}_N}$. There is a small and easily correctable oversight in the formulation of Theorem 16 of De Cooman and Miranda (2007), as becomes immediately apparent when considering its proof: it is there (but should not be) formulated without the multipliers $\lambda_k \geq 0$.]
ENE-2. in that case the point-wise smallest (most conservative) exchangeable coherent lower prevision $\underline{E}_{\underline{P},\mathcal{P}_N}$ on $\mathcal{L}(\mathcal{X}^N)$ that dominates $\underline{P}$ on $\mathcal{K}$ is given by
\[ \underline{E}_{\underline{P},\mathcal{P}_N}(f) := \sup \Big\{ \underline{E}_{\mathcal{P}_N}\Big( f - \sum_{k=1}^{n} \lambda_k [f_k - \underline{P}(f_k)] \Big) : n \geq 0,\ \lambda_k \geq 0,\ f_k \in \mathcal{K} \Big\}, \tag{20} \]
and is called the exchangeable natural extension of $\underline{P}$.

If we now combine Equation (18) with Equations (19) and (20), and define the lower prevision $\underline{Q}$ on the set $\mathcal{H} := \{ \mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot) : f \in \mathcal{K} \} \subseteq \mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ by letting
\[ \underline{Q}(g) := \sup \{ \underline{P}(f) : \mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot) = g,\ f \in \mathcal{K} \} \quad \text{for all } g \in \mathcal{H}, \]
then it is but a small technical step to prove the following result.

Theorem 7 (Exchangeable natural extension). There are exchangeable coherent lower previsions on $\mathcal{L}(\mathcal{X}^N)$ that dominate $\underline{P}$ on $\mathcal{K}$ if and only if $\underline{Q}$ is a lower prevision on $\mathcal{H}$ that avoids sure loss. In that case $\underline{E}_{\underline{P},\mathcal{P}_N} = \underline{E}_{\underline{Q}}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|\cdot))$, i.e., the count distribution for the exchangeable natural extension $\underline{E}_{\underline{P},\mathcal{P}_N}$ of $\underline{P}$ is the natural extension $\underline{E}_{\underline{Q}}$ of the lower prevision $\underline{Q}$.

Since there are quite efficient algorithms (Walley et al., 2004) for calculating the natural extension of a lower prevision based on a finite number of assessments, this theorem not only has intuitive appeal, but also provides us with an elegant and efficient manner to find the exchangeable natural extension, i.e., to combine (finitary) local assessments $\underline{P}$ with the structural assessment of exchangeability.

7.2. From n to n + k exchangeable random variables?
Suppose we have $n$ random variables $X_1, \dots, X_n$ that a subject judges to be exchangeable, and whose distribution is given by the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, with count distribution $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$. Can this model be extended to a coherent exchangeable model for $n + k$ variables? And if so, what is the most conservative such extended model?
It is well-known that when $\underline{P}^n_{\mathcal{X}}$ is a linear prevision, it cannot generally be extended (Diaconis and Freedman, 1980). In the more general case that we are considering here, we now look at our Theorem 7 to provide us with an elegant answer: the problem considered here is a special case of the one studied in Section 7.1.

Indeed, if we denote, as before in Section 4.1, by $\tilde{f}$ the cylindrical extension to $\mathcal{X}^{n+k}$ of the gamble $f$ on $\mathcal{X}^n$, then we see that the local assessments $\underline{P}$ are defined on the set of gambles $\mathcal{K} := \{ \tilde{f} : f \in \mathcal{L}(\mathcal{X}^n) \} \subseteq \mathcal{L}(\mathcal{X}^{n+k})$ by $\underline{P}(\tilde{f}) := \underline{P}^n_{\mathcal{X}}(f)$, $f \in \mathcal{L}(\mathcal{X}^n)$. Observe that here $N = n + k$. If we recall Equation (8) in Section 4.2, then we see that the corresponding set $\mathcal{H} \subseteq \mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ is given by $\mathcal{H} := \{ \overline{g} : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \}$, where for any gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ and all $\boldsymbol{m} \in \mathcal{N}^{n+k}_{\mathcal{X}}$
\[ \overline{g}(\boldsymbol{m}) := \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\boldsymbol{m} - m)}{\nu(\boldsymbol{m})}\, g(m) = P(g|\boldsymbol{m}), \]
where $P(\cdot|\boldsymbol{m})$ is the linear prevision associated with drawing $n$ balls without replacement from an urn with composition $\boldsymbol{m}$. Moreover, for any $h$ in $\mathcal{H}$, there is a unique gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ such that $h = \overline{g}$. This implies that the corresponding lower prevision $\underline{Q}$ on $\mathcal{H}$ is given by $\underline{Q}(\overline{g}) := \underline{Q}^n_{\mathcal{X}}(g)$, $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.

[Footnote to Theorem 7: Observe that it is necessary that $\underline{Q}(g)$ should be finite, in order for the condition (19) to hold. The explicit requirement that $\underline{Q}$ is a lower prevision means that $\underline{Q}$ must be nowhere infinite.]
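To make the lifting of a count gamble from $n$ to $n + k$ observations concrete, the following Python sketch evaluates the hypergeometric weights $\nu(m)\,\nu(\boldsymbol{m} - m)/\nu(\boldsymbol{m})$ exactly. It is only an illustration: the function names (`nu`, `count_vectors`, `kernel`, `lift`), the extra argument `d` for the number of categories, and the representation of count vectors as tuples are assumptions made here, not part of the paper.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def nu(m):
    """Number of sequences with count vector m (a multinomial coefficient)."""
    total = factorial(sum(m))
    for count in m:
        total //= factorial(count)  # each partial quotient is an integer
    return total


def count_vectors(n, d):
    """All count vectors with d non-negative components that sum to n."""
    return [m for m in product(range(n + 1), repeat=d) if sum(m) == n]


def kernel(m, big_m):
    """Chance of seeing counts m when drawing sum(m) balls without
    replacement from an urn with composition big_m."""
    rest = tuple(b - a for a, b in zip(m, big_m))
    if min(rest) < 0:
        return Fraction(0)  # m cannot be drawn from this urn
    return Fraction(nu(m) * nu(rest), nu(big_m))


def lift(g, n, k, d):
    """Lift a gamble g on count vectors of size n (a dict) to a gamble
    on count vectors of size n + k, using the weights from kernel()."""
    return {
        big_m: sum(kernel(m, big_m) * g[m] for m in count_vectors(n, d))
        for big_m in count_vectors(n + k, d)
    }
```

For instance, with two categories, `lift({(2, 0): 2, (1, 1): 1, (0, 2): 0}, 2, 1, 2)` lifts the gamble "number of balls of the first category among two draws" to urns with three balls; for the composition `(2, 1)` it yields 4/3, the expectation under sampling without replacement. Exact rational arithmetic is used so that identities such as the weights summing to one can be checked without rounding error.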
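Now observe that
(a) $\overline{\lambda} = \lambda$ for all real $\lambda$;
(b) $\overline{\lambda g} = \lambda \overline{g}$ for all $g$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ and all real $\lambda$;
(c) $\overline{g_1 + g_2} = \overline{g_1} + \overline{g_2}$ for all $g_1$ and $g_2$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.
This tells us that $\mathcal{H}$ is a linear subspace of $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ that contains all constant gambles. Moreover, because $\underline{Q}^n_{\mathcal{X}}$ is a coherent lower prevision, we find that
(i) $\underline{Q}(h_1 + h_2) \geq \underline{Q}(h_1) + \underline{Q}(h_2)$ for all $h_1$ and $h_2$ in $\mathcal{H}$;
(ii) $\underline{Q}(\lambda h) = \lambda \underline{Q}(h)$ for all real $\lambda \geq 0$ and all $h$ in $\mathcal{H}$;
(iii) $\underline{Q}(h + \lambda) = \underline{Q}(h) + \lambda$ for all real $\lambda$ and all $h$ in $\mathcal{H}$.

Because $\underline{Q}$ and $\mathcal{H}$ have these special properties, the condition for $\underline{P}^n_{\mathcal{X}}$ to be extendable to some coherent exchangeable model for $n + k$ variables, namely that $\underline{Q}$ avoids sure loss on $\mathcal{H}$, simplifies to $\max \overline{g} \geq \underline{Q}(\overline{g})$ for all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$, i.e., to
\[ \max_{\boldsymbol{m} \in \mathcal{N}^{n+k}_{\mathcal{X}}} \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\boldsymbol{m} - m)}{\nu(\boldsymbol{m})}\, g(m) \geq \underline{Q}^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}). \]
The expression for the natural extension $\underline{E}_{\underline{Q}}$ of $\underline{Q}$, applicable when the above condition holds, can also be simplified significantly, again because of the special properties of $\underline{Q}$ and $\mathcal{H}$:
\begin{align*}
\underline{E}_{\underline{Q}}(h)
&= \sup \Big\{ \inf \Big[ h - \sum_{k=1}^{n} \lambda_k [\overline{g_k} - \underline{Q}(\overline{g_k})] \Big] : n \geq 0,\ \lambda_k \geq 0,\ g_k \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \Big\} \\
&= \sup \big\{ \inf \big[ h - \overline{g} + \underline{Q}(\overline{g}) \big] : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup \big\{ \underline{Q}(\overline{g} + \inf [h - \overline{g}]) : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup \big\{ \underline{Q}(\overline{g}) : \overline{g} \leq h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup \big\{ \underline{Q}^n_{\mathcal{X}}(g) : \overline{g} \leq h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\},
\end{align*}
for all gambles $h$ on $\mathcal{N}^{n+k}_{\mathcal{X}}$.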
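A small worked instance, which is not part of the original development, may help to see the extendibility condition at work. Assume $\mathcal{X} = \{a, b\}$, $n = 2$ and $k = 1$, write count vectors as (number of $a$'s, number of $b$'s), and let the count distribution be the linear prevision $Q^2_{\mathcal{X}}$ that puts all mass on $(1,1)$, so that the two variables always differ. Take $g$ to be the indicator of $(1,1)$ and write $\overline{g}$ for its lifted version on count vectors of size $3$, obtained by weighting $g(m)$ with the chance $\nu(m)\,\nu(\boldsymbol{m} - m)/\nu(\boldsymbol{m})$ of drawing $m$ without replacement from an urn with composition $\boldsymbol{m}$. Then:

```latex
\begin{align*}
\overline{g}(3,0) &= 0, &
\overline{g}(2,1) &= \frac{\nu(1,1)\,\nu(1,0)}{\nu(2,1)} = \frac{2 \cdot 1}{3} = \frac{2}{3}, \\
\overline{g}(0,3) &= 0, &
\overline{g}(1,2) &= \frac{\nu(1,1)\,\nu(0,1)}{\nu(1,2)} = \frac{2 \cdot 1}{3} = \frac{2}{3},
\end{align*}
\[
\max \overline{g} = \frac{2}{3} < 1 = Q^2_{\mathcal{X}}(g).
\]
```

The condition fails for this $g$, so under these assumptions the model for two variables admits no coherent exchangeable extension to three variables. This is in line with the remark that linear previsions cannot generally be extended.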
The point-wise smallest extension of $\underline{P}^n_{\mathcal{X}}$ to a coherent exchangeable model on $\mathcal{L}(\mathcal{X}^{n+k})$ is then the coherent exchangeable lower prevision with count distribution $\underline{E}_{\underline{Q}}$, because of Theorem 7.

In the well-known case that $\underline{P}^n_{\mathcal{X}}$ is a linear prevision $P^n_{\mathcal{X}}$, and therefore $\underline{Q}^n_{\mathcal{X}}$ is also a linear prevision $Q^n_{\mathcal{X}}$, the condition for extendibility can also be written as
\[ \min_{\boldsymbol{m} \in \mathcal{N}^{n+k}_{\mathcal{X}}} P(g|\boldsymbol{m}) \leq Q^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}), \]
where on the left hand side we now see the lower prevision of the gamble $g$, associated with drawing $n$ balls from an urn with $n + k$ balls, of unknown composition. When this is satisfied, the lower prevision $\underline{Q}$ will actually be a linear prevision $Q$ on the linear space $\mathcal{H}$, and $\underline{E}_{\underline{Q}}$ will be the lower envelope of all linear previsions $Q^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ that extend $Q$. Similarly, the exchangeable natural extension will be the lower envelope of all the exchangeable linear previsions $P^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{n+k})$ that extend $P^n_{\mathcal{X}}$.

[Footnote on the uniqueness of the gamble $g$ with $h = \overline{g}$: To see this, consider the polynomial $p = \sum_{\boldsymbol{m} \in \mathcal{N}^{n+k}_{\mathcal{X}}} h(\boldsymbol{m}) B_{\boldsymbol{m}}$. Use Zhou's formula [Equation (22) in the Appendix] to find that if $h = \overline{g}$, then also $p = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} g(m) B_m$, and consider that expansions in a Bernstein basis are unique.]

8. CONCLUSIONS
We have shown that the notion of exchangeability has a natural place in the theory of coherent lower previsions. Indeed, on our approach using Bernstein polynomials, and gambles rather than events, it seems fairly natural and easy to derive representation theorems directly for coherent lower previsions, and to derive the corresponding results for precise probabilities (linear previsions) as special cases.

Interesting results can also be obtained in a context of predictive inference, where a coherent exchangeable lower prevision for $n + k$ variables is updated with the information that the first $n$ variables have been observed to assume certain values. For a fairly detailed discussion of these issues, we refer to De Cooman and Miranda (2007, Section 9.3).

In Section 6, we have argued that the sample means $S_n(f)(X_1, \dots, X_n)$ converge in distribution. It is possible (and quite easy for that matter) to prove stronger results. Indeed, using an approach that is completely similar to the one originally used by de Finetti (1937), we can prove that for all non-negative $n$ and $p$:
\[ \overline{P}^{\mathbb{N}}_{\mathcal{X}}\big( [S_{n+p}(f) - S_n(f)]^2 \big) \leq \frac{p}{n(n+p)} \sup f^2. \]
In other words, for any fixed $p \geq 1$, the sequence $S_{n+p}(f) - S_n(f)$ 'converges in mean-square' to zero as $n \to \infty$. Even stronger, we find that for any non-negative $k$ and $\ell$
\[ \overline{P}^{\mathbb{N}}_{\mathcal{X}}\big( [S_k(f) - S_\ell(f)]^2 \big) \leq \frac{|k - \ell|}{k\ell} \sup f^2, \]
and therefore the sequence $S_n(f)$ 'Cauchy-converges in mean-square'. These convergence results can also be used to derive the convergence in distribution of the $S_n(f)$, but we consider the approach using Bernstein polynomials to be distinctly more elegant.

ACKNOWLEDGEMENTS
We acknowledge financial support by research grant G.0139.01 of the Flemish Fund for Scientific Research (FWO), and by projects MTM2004-01269, TSI2004-06801-C04-01. Erik Quaeghebeur's research was financed by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen).

We would like to thank Jürgen Garloff for very helpful comments and pointers to the literature about multivariate Bernstein polynomials.

APPENDIX A. MULTIVARIATE BERNSTEIN POLYNOMIALS
With any $n \geq 0$ and $m \in \mathcal{N}^n_{\mathcal{X}}$ there corresponds a Bernstein (basis) polynomial of degree $n$ on $\Sigma_{\mathcal{X}}$, given by $B_m(\boldsymbol{\theta}) = \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x}$, $\boldsymbol{\theta} \in \Sigma_{\mathcal{X}}$. These polynomials have a number of very interesting properties (see for instance Prautzsch et al., 2002, Chapters 10 and 11), which we list here:
B1. The set $\{ B_m : m \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ is linearly independent: if $\sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \lambda_m B_m = 0$, then $\lambda_m = 0$ for all $m$ in $\mathcal{N}^n_{\mathcal{X}}$.
B2. The set $\{ B_m : m \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ forms a partition of unity: $\sum_{m \in \mathcal{N}^n_{\mathcal{X}} } B_m = 1$.
B3. All Bernstein basis polynomials are non-negative on $\Sigma_{\mathcal{X}}$.
B4. The set $\{ B_m : m \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ forms a basis for the linear space of all polynomials whose degree is at most $n$.
Property B4 follows from B1 and B2. It follows from B4 that:
B5. Any polynomial $p$ of degree $m$ has a unique expansion in terms of the Bernstein basis polynomials of fixed degree $n \geq m$,
or in other words, there is a unique gamble $b^n_p$ on $\mathcal{N}^n_{\mathcal{X}}$ such that
\[ p = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} b^n_p(m) B_m = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p|\cdot). \]
This tells us [also use B2 and B3] that each $p(\boldsymbol{\theta})$ is a convex combination of the Bernstein coefficients $b^n_p(m)$, $m \in \mathcal{N}^n_{\mathcal{X}}$, whence
\[ \min b^n_p \leq \min p \leq p(\boldsymbol{\theta}) \leq \max p \leq \max b^n_p. \tag{21} \]
It follows from a combination of B2 and B4 that for all $k \geq 0$ and all $\boldsymbol{m}$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,
\[ b^{n+k}_p(\boldsymbol{m}) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\boldsymbol{m} - m)}{\nu(\boldsymbol{m})}\, b^n_p(m). \tag{22} \]
This is Zhou's formula (see Prautzsch et al., 2002, Section 11.9). Hence [let $p = 1$], for all $k \geq 0$ and all $\boldsymbol{m}$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,
\[ \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\boldsymbol{m} - m)}{\nu(\boldsymbol{m})} = 1. \tag{23} \]
The expressions (22) and (23) also imply that each $b^{n+k}_p(\boldsymbol{m})$ is a convex combination of the $b^n_p(m)$, and therefore $\min b^{n+k}_p \geq \min b^n_p$ and $\max b^{n+k}_p \leq \max b^n_p$. Combined with the inequalities in (21), this leads to:
\[ [\min p, \max p] \subseteq [\min b^{n+k}_p, \max b^{n+k}_p] \subseteq [\min b^n_p, \max b^n_p] \tag{24} \]
for all $n \geq m$ and $k \geq 0$. This means that the non-decreasing sequence $\min b^n_p$ converges to some real number not greater than $\min p$, and, similarly, the non-increasing sequence $\max b^n_p$ converges to some real number not smaller than $\max p$. The following proposition strengthens this.

Proposition 8. For any polynomial $p$ on $\Sigma_{\mathcal{X}}$ of degree $m$,
\[ \lim_{n \to \infty,\ n \geq m} [\min b^n_p, \max b^n_p] = [\min p, \max p] = p(\Sigma_{\mathcal{X}}). \]

Proof. This follows from the fact that the $b^n_p$ converge uniformly to the polynomial $p$ as $n \to \infty$; see for instance Trump and Prautzsch (1996). Alternatively, it can be shown (see Prautzsch et al., 2002, Section 11.9) that for $n \geq m$
\[ b^n_p(\boldsymbol{m}) = \sum_{m \in \mathcal{N}^m_{\mathcal{X}}} b^m_p(m)\, B_m\Big(\frac{\boldsymbol{m}}{n}\Big) + O\Big(\frac{1}{n}\Big) = p\Big(\frac{\boldsymbol{m}}{n}\Big) + O\Big(\frac{1}{n}\Big), \quad \boldsymbol{m} \in \mathcal{N}^n_{\mathcal{X}}. \]
From this, we deduce that $\min b^n_p \geq \min p + O(\frac{1}{n})$ for any $n \geq m$, and as a consequence $\lim_{n \to \infty,\ n \geq m} \min b^n_p \geq \min p$. If we now use Equation (24), we see that $\lim_{n \to \infty,\ n \geq m} \min b^n_p = \min p$. The proof of the other equality is completely analogous. $\square$

REFERENCES
D. M. Cifarelli and E. Regazzini. De Finetti's contributions to probability and statistics. Statistical Science, 11:253–282, 1996.
A. P. Dawid. Probability, symmetry, and frequency. British Journal for the Philosophy of Science, 36(2):107–128, 1985.
G. de Cooman and E. Miranda. Weak and strong laws of large numbers for coherent lower previsions. Journal of Statistical Planning and Inference, 2006. Submitted for publication.
G. de Cooman and E. Miranda. Symmetry of models versus models of symmetry. In W. L. Harper and G. R. Wheeler, editors, Probability and Inference: Essays in Honor of Henry E. Kyburg, Jr., pages 67–149. King's College Publications, 2007.
B. de Finetti. La prévision: ses lois logiques, ses sources subjectives. Annales de l'Institut Henri Poincaré, 7:1–68, 1937. English translation in Kyburg Jr. and Smokler (1964).
B. de Finetti. Teoria delle Probabilità. Einaudi, Turin, 1970.
B. de Finetti. Theory of Probability, volume 1. John Wiley & Sons, Chichester, 1974. English translation of de Finetti (1970).
B. de Finetti. Theory of Probability, volume 2. John Wiley & Sons, Chichester, 1975. English translation of de Finetti (1970).
P. Diaconis and D. Freedman. Finite exchangeable sequences. The Annals of Probability, 8:745–764, 1980.
W. Feller. An Introduction to Probability Theory and Its Applications, volume II. John Wiley and Sons, New York, 1971.
D. C. Heath and W. D. Sudderth. De Finetti's theorem on exchangeable variables. The American Statistician, 30:188–189, 1976.
C. Heitzinger, A. Hössinger, and S. Selberherr. On Smoothing Three-Dimensional Monte Carlo Ion Implantation Simulation Results. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(7):879–883, 2003.
E. Hewitt and L. J. Savage. Symmetric measures on Cartesian products. Transactions of the American Mathematical Society, 80:470–501, 1955.
N. L. Johnson, S. Kotz, and N. Balakrishnan. Discrete Multivariate Distributions. Wiley Series in Probability and Statistics. John Wiley and Sons, New York, 1997.
O. Kallenberg. Foundations of Modern Probability. Springer-Verlag, New York, second edition, 2002.
O. Kallenberg. Probabilistic Symmetries and Invariance Principles. Springer, New York, 2005.
H. E. Kyburg Jr. and H. E. Smokler, editors. Studies in Subjective Probability. Wiley, New York, 1964. Second edition (with new material) 1980.
G. G. Lorentz. Bernstein Polynomials. Chelsea Publishing Company, New York, NY, second edition, 1986.
E. Miranda and G. de Cooman. Marginal extension in the theory of coherent lower previsions. International Journal of Approximate Reasoning, 2006. doi: 10.1016/j.ijar.2006.12.009. In press.
E. Miranda, G. de Cooman, and E. Quaeghebeur. The Hausdorff moment problem under finite additivity. Journal of Theoretical Probability, 2007. doi: 10.1007/s10959-007-0055-4. In press.
H. Prautzsch, W. Boehm, and M. Paluszny. Bézier and B-Spline Techniques. Springer, Berlin, 2002.
W. Trump and H. Prautzsch. Arbitrary degree elevation of Bézier representations. Computer Aided Geometric Design, 13:387–398, 1996.
P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991.
P. Walley, R. Pelessoni, and P. Vicig. Direct algorithms for checking consistency and making inferences from conditional probability assessments. Journal of Statistical Planning and Inference, 126:119–151, 2004.
P. Whittle. Probability via Expectation. Springer, New York, fourth edition, 2000.
S. L. Zabell. Predicting the unpredictable. Synthese, 90:205–232, 1992. Reprinted in Zabell (2005).
S. L. Zabell. Symmetry and Its Discontents: Essays on the History of Inductive Probability. Cambridge Studies in Probability, Induction, and Decision Theory. Cambridge University Press, Cambridge, UK, 2005.

GHENT UNIVERSITY, SYSTEMS RESEARCH GROUP, TECHNOLOGIEPARK–ZWIJNAARDE, ZWIJNAARDE, BELGIUM
E-mail address: [email protected], [email protected]

REY JUAN CARLOS UNIVERSITY, DEPT. OF STATISTICS AND OPERATIONS RESEARCH, C-TULIPÁN, S/N, 28933, MÓSTOLES, SPAIN
E-mail address: