Detailed Large Deviation Analysis of a Droplet Model Having a Poisson Equilibrium Distribution
Richard S. Ellis ([email protected])
Department of Mathematics and Statistics
University of Massachusetts
Amherst, MA 01003

Shlomo Ta’asan ([email protected])
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213
April 21, 2019
Abstract
One of the main contributions of this paper is to illustrate how large deviation theory can be used to determine the equilibrium distribution of a basic droplet model that underlies a number of important models in material science and statistical mechanics. The model is simply defined. Given b ∈ N and c > b, K distinguishable particles are placed, each with equal probability 1/N, onto the N sites of a lattice, where the ratio K/N, the average number of particles per site, equals c. We focus on configurations for which each site is occupied by a minimum of b particles. The main result is the large deviation principle (LDP), in the limit where K → ∞ and N → ∞ with K/N = c, for a sequence of random, number-density measures, which are the empirical measures of dependent random variables that count the droplet sizes. The rate function in the LDP is the relative entropy R(θ|ρ*), where θ is a possible asymptotic configuration of the number-density measures and ρ* is a Poisson distribution restricted to the set of positive integers n satisfying n ≥ b. This LDP reveals that ρ* is the equilibrium distribution of the number-density measures, which in turn implies that ρ* is the equilibrium distribution of the random variables that count the droplet sizes. We derive the LDP via a local large deviation estimate of the probability that the number-density measures equal θ for any probability measure θ in the range of these random measures.

American Mathematical Society 2010 Subject Classifications: 60F10 (primary), 82B05 (secondary)
Key words and phrases: large deviation principle, microcanonical ensemble, number-density measures, relative entropy
This paper contains the material in the companion paper [12] together with the following: full details of several routine proofs omitted from [12], additional appendices, and extra background information.

These two papers are motivated by a natural and simply stated question. Given b ∈ N and c > b, K distinguishable particles are placed, each with equal probability 1/N, onto the N sites of a lattice. Under the assumption that K/N = c and that each site is occupied by a minimum of b particles, what is the equilibrium distribution, as N → ∞, of the number of particles per site? We prove in Corollary 2.3 that this equilibrium distribution is a Poisson distribution ρ_{b,α_b(c)} restricted to the set of positive integers n satisfying n ≥ b; the parameter α_b(c) is chosen so that the mean of ρ_{b,α_b(c)} equals c. As we explain at the end of the introduction, this equilibrium distribution has important applications to technologies using sprays and powders.

We answer this question about the equilibrium distribution by first proving a large deviation principle (LDP) for a sequence of random, number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes. This LDP is stated in Theorem 2.1. The space for which we prove the LDP is a natural choice, being the smallest convex subset of probability measures containing the range of the number-density measures. Our proof of the LDP avoids general results in the theory of large deviations, many of which do not apply because the space for which we prove the LDP is not a complete, separable metric space. Our proof is completely self-contained and starts from first principles, using techniques that are familiar in statistical mechanics. For example, the proof of the local large deviation estimate in Theorem 3.1, a key step in the proof of the LDP for the number-density measures, is based on combinatorics, Stirling's formula, and Laplace asymptotics.
Our self-contained proof of the LDP perfectly matches the simplicity and elegance of our main result on the equilibrium distribution stated in the preceding paragraph.

In order to define the droplet model and to formulate the LDP for the number-density measures, a standard probabilistic model is introduced. We begin as in the first paragraph. Given b ∈ N and c > b, K distinguishable particles are placed, each with equal probability 1/N, onto the N sites of the lattice Λ_N = {1, 2, ..., N}. In section 2 we also consider the case b = 0. The large deviation limit, or in statistical mechanical terminology the thermodynamic limit, is defined by taking K → ∞ and N → ∞ with K/N equal to c. The ratio K/N equals the average number of particles per site or the average size of a droplet. The configuration space for the droplet model is the set Ω_N = (Λ_N)^K consisting of all ω = (ω_1, ω_2, ..., ω_K), where ω_i denotes the site in Λ_N occupied by the i'th particle. The cardinality of Ω_N equals N^K. Denote by P_N the uniform probability measure that assigns equal probability 1/N^K to each of the N^K configurations ω ∈ Ω_N. For subsets A of Ω_N, P_N(A) = card(A)/N^K, where card denotes cardinality.

The asymptotic analysis of the droplet model involves the following two random variables, which are functions of the configuration ω ∈ Ω_N: for ℓ ∈ Λ_N, K_ℓ(ω) denotes the number of particles occupying the site ℓ in the configuration ω; for j ∈ N ∪ {0}, N_j(ω) denotes the number of sites ℓ ∈ Λ_N for which K_ℓ(ω) = j.

We focus on the subset of Ω_N consisting of all configurations ω for which every site of Λ_N is occupied by at least b particles. Because of this restriction, N_j(ω) is indexed by j ∈ N_b = {n ∈ Z : n ≥ b}. It is useful to think of each particle as having one unit of mass and of the set of particles at each site ℓ as defining a droplet.
With this interpretation, for each configuration ω, K_ℓ(ω) denotes the mass or size of the droplet at site ℓ. The j'th droplet class has N_j(ω) droplets and mass jN_j(ω). Because the number of sites in Λ_N equals N and the sum of the masses of all the droplet classes equals K, the following conservation laws hold for such configurations:

∑_{j ∈ N_b} N_j(ω) = N and ∑_{j ∈ N_b} jN_j(ω) = K.   (1.1)

In addition, since the total number of particles is K, it follows that ∑_{ℓ ∈ Λ_N} K_ℓ = K. These equality constraints show that the random variables N_j and the random variables K_ℓ are not independent.

In order to carry out the asymptotic analysis of the droplet model, we introduce a quantity m = m(N) that converges to ∞ sufficiently slowly with respect to N; specifically, we require that m(N)/N → 0 as N → ∞. In terms of b and m we define the subset Ω_{N,b,m} of Ω_N consisting of all configurations ω for which every site of Λ_N is occupied by at least b particles and at most m of the quantities N_j(ω) are positive. This second condition is a key technical device that allows us to control the errors in several estimates.

The random quantities in the droplet model for which we formulate an LDP are the number-density measures Θ_{N,b}. For ω ∈ Ω_{N,b,m} these random probability measures assign to j ∈ N_b the probability N_j(ω)/N, which is the number density of the j'th droplet class. Thus for any subset A of N_b

Θ_{N,b}(ω, A) = ∑_{j ∈ N_b} Θ_{N,b;j}(ω) δ_j(A) = ∑_{j ∈ A} Θ_{N,b;j}(ω), where Θ_{N,b;j}(ω) = N_j(ω)/N.
Because of the two conservation laws in (1.1) and because K/N = c, for ω ∈ Ω_{N,b,m}, Θ_{N,b}(ω) is a probability measure on N_b = {n ∈ Z : n ≥ b} having mean

∑_{j ∈ N_b} j Θ_{N,b;j}(ω) = (1/N) ∑_{j ∈ N_b} jN_j(ω) = K/N = c.

Thus Θ_{N,b} takes values in P_{N_b,c}, which is defined to be the set of probability measures on N_b having mean c. P_{N_b,c} is topologized by the topology of weak convergence.

The probability measure P_{N,b,m} defining the droplet model is obtained by restricting the uniform measure P_N to the set of configurations Ω_{N,b,m}. Thus P_{N,b,m} equals the conditional probability P_N(·|Ω_{N,b,m}). For subsets A of Ω_{N,b,m}, P_{N,b,m}(A) takes the form

P_{N,b,m}(A) = (1/card(Ω_{N,b,m})) · card(A).

In the language of statistical mechanics P_{N,b,m} defines a microcanonical ensemble that incorporates the conservation laws for number and mass expressed in (1.1).

A natural question is to determine two equilibrium distributions: the equilibrium distribution ρ* of the number-density measures and the equilibrium distribution ρ** = ∑_{j ∈ N_b} ρ**_j δ_j of the droplet-size random variables K_ℓ. These distributions are defined by the following two limits: for any ε > 0, any ℓ ∈ Λ_N, and all j ∈ N_b

lim_{N → ∞} P_{N,b,m}(Θ_{N,b} ∈ B(ρ*, ε)) = 1 and lim_{N → ∞} P_{N,b,m}(K_ℓ = j) = ρ**_j,

where B(ρ*, ε) denotes the open ball with center ρ* and radius ε defined with respect to an appropriate metric on P_{N_b,c}. We make the following observations concerning these equilibrium distributions.

1. The equilibrium distributions ρ* for Θ_{N,b} and ρ** for K_ℓ coincide.

2. We first determine the equilibrium distribution ρ* of Θ_{N,b} and then prove that ρ* is also the equilibrium distribution of K_ℓ.
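The definitions above are easy to exercise numerically. The following sketch is ours, not the paper's: the function names are hypothetical, and we take b = 0, so the conditioning on Ω_{N,b,m} (which would require rejection sampling) is omitted. It places K = cN particles uniformly at random, computes the droplet-size counts K_ℓ(ω) and the number-density measure Θ_{N,b}(ω), and checks the two conservation laws (1.1).

```python
import random
from collections import Counter

def sample_configuration(N, c, seed=0):
    """Place K = c*N distinguishable particles uniformly on N sites."""
    random.seed(seed)
    K = c * N
    # omega[i] = site occupied by the i'th particle
    return [random.randrange(N) for _ in range(K)]

def number_densities(omega, N):
    """Return the number-density measure {j: N_j(omega)/N}."""
    K_ell = Counter(omega)                           # K_ell(omega): particles at site ell
    occupancies = [K_ell.get(ell, 0) for ell in range(N)]
    N_j = Counter(occupancies)                       # N_j(omega): sites holding j particles
    return {j: n / N for j, n in N_j.items()}

N, c = 1000, 3
theta = number_densities(sample_configuration(N, c), N)
# First conservation law: sum_j N_j(omega) = N, i.e. theta is a probability measure
assert abs(sum(theta.values()) - 1.0) < 1e-12
# Second conservation law: sum_j j*N_j(omega) = K, i.e. theta has mean K/N = c
assert abs(sum(j * t for j, t in theta.items()) - c) < 1e-9
```

For b ≥ 1 one would keep only configurations with every K_ℓ(ω) ≥ b, which is feasible by rejection only for small N; the conservation laws hold for every configuration regardless.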
As in many models in statistical mechanics, an efficient way to determine the equilibrium distribution ρ* of Θ_{N,b} is to prove an LDP for Θ_{N,b}, which we carry out in Theorem 2.1. The content of Theorem 2.1 is the following: as N → ∞ the sequence of number-density measures Θ_{N,b} satisfies the LDP on P_{N_b,c} with respect to the measures P_{N,b,m}. The rate function is the relative entropy R(θ|ρ_{b,α}) of θ ∈ P_{N_b,c} with respect to the Poisson distribution ρ_{b,α} on N_b having components

ρ_{b,α;j} = (1/Z_b(α)) · α^j/j! for j ∈ N_b.

In this formula Z_b(α) is the normalization that makes ρ_{b,α} a probability measure, and α equals the unique value α_b(c) for which ρ_{b,α_b(c)} has mean c [Thm. C.1(a)]. Using the fact that R(θ|ρ_{b,α_b(c)}) equals 0 at the unique measure θ = ρ_{b,α_b(c)}, we apply the LDP for Θ_{N,b} to conclude in Theorem 2.2 that ρ_{b,α_b(c)} is the equilibrium distribution of Θ_{N,b}. Corollary 2.3 then implies that ρ_{b,α_b(c)} is also the equilibrium distribution of K_ℓ.

The space P_{N_b,c} is the most natural space on which to formulate the LDP for Θ_{N,b} in Theorem 2.1. Not only is P_{N_b,c} the smallest convex set of probability measures containing the range of Θ_{N,b} for all N ∈ N, but also the union over N ∈ N of the range of Θ_{N,b} is dense in P_{N_b,c}. As we explain in part (a) of Theorem 2.4, P_{N_b,c} is not a complete, separable metric space, a situation that prevents us from applying the many general results in the theory of large deviations that require the setting of a complete, separable metric space. In our opinion the fact that we avoid using such general results makes our self-contained proof of the LDP even more attractive.

The droplet model is defined in section 2. Our proof of the LDP for Θ_{N,b} consists of the following three steps, the first of which is the topic of section 3 and the second and third of which are the topics of section 4.

1. Step 1 is to derive the local large deviation estimate in part (b) of Theorem 3.1.
This local estimate, one of the centerpieces of the paper, gives information not available in the LDP for Θ_{N,b}, which involves global estimates. It states that as N → ∞, for any probability measure θ in the range of the number-density measure Θ_{N,b}

(1/N) log P_{N,b,m}(Θ_{N,b} = θ) = −R(θ|ρ_{b,α_b(c)}) + o(1),   (1.2)

where o(1) is an error term converging to 0 uniformly for all measures θ in the range of Θ_{N,b}. Showing that the parameter of the Poisson distribution ρ_{b,α_b(c)} in the local large deviation estimate equals α_b(c) is one of the crucial elements of the proof. The proof of the local large deviation estimate involves combinatorics, Stirling's formula, and Laplace asymptotics.

2. Step 2 is to lift this local large deviation estimate to the large deviation limit for Θ_{N,b} lying in open balls and certain other subsets of P_{N_b,c}. This is done in Theorem 4.1 as a consequence of the general formulation given in Theorem 4.2 and the approximation procedure proved in appendix B.

3. Step 3 is to lift the large deviation limit for open balls and certain other subsets to the LDP for Θ_{N,b} stated in Theorem 2.1, thus proving this LDP. This is done by applying the general formulation given in Theorem 4.3.

The paper has four appendices. In appendix A we derive properties of the relative entropy needed in a number of our results. Appendix B is devoted to the proof of the approximation procedure to which we just referred in item 2 above. In appendix C we prove the existence of the quantity α_b(c) that defines the Poisson distribution ρ_{b,α_b(c)} and derive a number of properties of this quantity. Our proof of the existence of α_b(c) for general b is subtle. This proof should be contrasted with the straightforward proof of the existence of α_b(c) for b = 1, which is given in Theorem C.2. We now explain the contents of appendix D.
In order to control several errors in our self-contained proof of the LDP, we must introduce the restriction involving the quantity m = m(N) that, as mentioned earlier, requires no more than m of the quantities N_j to be positive. This restriction is explained in detail in section 2; it is incorporated in the definition (2.1) of the set of configurations Ω_{N,b,m} and the definition (2.3) of the microcanonical ensemble P_{N,b,m}. In appendix D we present evidence supporting the conjecture that this restriction can be eliminated. Eliminating this restriction would enable us to present our results in a more natural form.

The paper [13] explores how our work on the droplet model was inspired by the work of Ludwig Boltzmann on a simple model of a random ideal gas, for which the Maxwell-Boltzmann distribution is the equilibrium distribution. The form of the Maxwell-Boltzmann distribution can be proved using Sanov's theorem, which proves the LDP for the empirical measures of i.i.d. random variables [13]. Θ_{N,b} is the empirical measure of the random variables K_ℓ. However, Sanov's theorem for empirical measures of i.i.d. random variables cannot be applied because the K_ℓ are dependent and, since their distributions depend on N, they form a triangular array. In section 7 of [13] we explore how Sanov's theorem, although not applicable as stated, can be used to give a heuristic motivation of the LDP for Θ_{N,b}.

The main application of the results in this paper is to technologies using sprays and powders, which are ubiquitous in many fields, including agriculture, the chemical and pharmaceutical industries, consumer products, electronics, manufacturing, material science, medicine, mining, paper making, the steel industry, and waste treatment. In this paper we focus on sprays; our theory also applies to powders with only changes in terminology.
The behavior of sprays can be complex, depending on various parameters including evaporation, temperature, and viscosity. Our goal here is to consider the simplest model, where the only assumption is made on the average size of droplets in the spray. In many situations it is important to have good control over the sizes of the droplets, which can be translated into properties of probability distributions. The size distributions are important because they determine reliability and safety in each particular application.

Interestingly, there does not seem to be a rigorous theory that predicts the equilibrium distribution of droplet sizes, analogous to the Maxwell-Boltzmann distribution of energy levels in a random ideal gas [17, 20]. Our goal in the present paper is to provide such a theory. We do so by focusing on one aspect of the problem related to the relative entropy, an approach that characterizes the equilibrium distribution of droplet sizes as being a Poisson distribution restricted to N_b. We expect that this distribution will dominate experimental observations. A full understanding of droplet behavior under dynamic conditions requires treating many other aspects and is beyond the scope of this paper. A comparison of our results with experimental data will appear elsewhere. In addition we plan to apply the ideas in this paper to understand the entropy of dislocation networks.

Because of the length of this paper and its many technicalities, we would like to help the reader by summarizing the main results and explaining how one proceeds from the local large deviation estimate stated in (1.2) and proved in part (b) of Theorem 3.1 to the LDP for the number-density measures Θ_{N,b} stated in Theorem 2.1. We also summarize the theorems proved in appendices A, B, C, and D.

• Theorem 2.1.
This theorem states that the sequence of P_{N,b,m}-distributions of the number-density measures Θ_{N,b} on P_{N_b,c} satisfies the LDP on P_{N_b,c} with rate function R(θ|ρ_{b,α_b(c)}).

• Theorem 2.2.
In this theorem we identify the Poisson distribution ρ_{b,α_b(c)} as the equilibrium distribution of Θ_{N,b} with respect to P_{N,b,m}. It is a consequence of Theorem 2.1.

• Corollary 2.3.
The Poisson distribution ρ_{b,α_b(c)} is shown in this corollary also to be the equilibrium distribution of the droplet-size random variables K_ℓ with respect to P_{N,b,m}. It is a consequence of Theorem 2.2.

• Theorem 2.4.
This theorem proves a number of properties of two spaces of probability measures that arise in the large deviation analysis of Θ_{N,b}.

• Theorem 3.1.
In part (a) of this theorem we show that there exists a unique value α = α_b(c) ∈ (0, ∞) for which the measure ρ_{b,α_b(c)} has mean c; the components of ρ_{b,α_b(c)} are defined in (2.7). In part (b) we prove the local large deviation estimate (1.2).

• Theorems 4.1 and 4.2.
Theorem 4.1 shows how to lift the local large deviation estimate in part (b) of Theorem 3.1 to the large deviation limit for Θ_{N,b} lying in open balls and certain other subsets of P_{N_b,c}. Theorem 4.1 is derived as a consequence of the general formulation stated in Theorem 4.2.

• Theorem 4.3.
This theorem is a general formulation that allows us to lift the large deviation limit for open balls and certain other subsets in Theorem 4.1 to the LDP stated in Theorem 2.1, thus proving this LDP.

• Theorem A.1. In this theorem we collect a number of properties of the relative entropy used throughout the paper.

• Theorem B.1.
This result is an approximation theorem that allows us to approximate an arbitrary probability measure θ ∈ P_{N_b,c} by a sequence of probability measures θ^(N) in the range of Θ_{N,b} having the following property: the sequence of relative entropies R(θ^(N)|ρ_{b,α_b(c)}) converges to R(θ|ρ_{b,α_b(c)}) as N → ∞. This approximation theorem is applied in two key places. First, it allows us to prove the asymptotic estimate in Lemma 3.3, which is a basic ingredient in the proof of the local large deviation estimate in part (b) of Theorem 3.1. Second, it allows us to lift this local large deviation estimate to the large deviation limit for open balls and certain other subsets as formulated in Theorem 4.1.

• Theorem C.1.
This theorem studies a number of properties of the quantity α_b(c) that defines the Poisson-type equilibrium distribution ρ_{b,α_b(c)}.

• Theorem C.2.
This theorem studies a number of properties of the quantity α_b(c) for b = 1.

• Theorems D.1, D.2, and D.4 and Proposition D.3.
These results address issues related to the constraint involving the quantity m = m(N) in the definition (2.1) of the set of configurations Ω_{N,b,m} and the definition (2.3) of the microcanonical ensemble P_{N,b,m}. We discuss how, if we could eliminate this constraint, our results would have a more natural form. Theorem D.4 is based on a deep, classical result on the asymptotic behavior of Stirling numbers of the second kind.
Acknowledgments.
The research of Shlomo Ta’asan is supported in part by a grant from the National Science Foundation (NSF-DMS-1216433). Richard S. Ellis thanks Jonathan Machta for sharing his insights into statistical mechanics and for useful comments on this introduction, Luc Rey-Bellet for valuable conversations concerning large deviation theory, and Michael Sullivan for his generous help with a number of topological issues arising in this paper. We are also grateful to Jonathan Machta for suggesting the generalization, explained in section 2, from a minimum of 1 particle at each site to a minimum of b particles at each site, where b is any positive integer, and for helping us with the proof of part (a) of Theorem C.1.

After defining the droplet model, we state the main theorem in the paper, Theorem 2.1. The content of this theorem is the LDP for the sequence of random, number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes in the model. As we show in Theorem 2.2 and in Corollary 2.3, the LDP enables us to identify a Poisson distribution as the equilibrium distribution both of the number-density measures and of the droplet-size random variables. Finally, in Theorem 2.4 we prove a number of properties of two spaces of probability measures in terms of which the LDP for the number-density measures is formulated.

We start by fixing parameters b ∈ N ∪ {0} and c ∈ (b, ∞). The droplet model is defined by a probability measure P_{N,b} parametrized by N ∈ N and the nonnegative integer b. The measure depends on two other positive integers, K and m, where 1 ≤ m ≤ N < K. Both K and m are functions of N in the large deviation limit N → ∞. In this limit, which is the same as the thermodynamic limit in statistical mechanics, we take K → ∞ and N → ∞, where K/N, the average number of particles per site, stays equal to c. Thus K = Nc.
In addition, we take m → ∞ sufficiently slowly by choosing m to be a function m(N) satisfying m(N) → ∞ and m(N)/N → 0 as N → ∞; e.g., m(N) = N^δ for some δ ∈ (0, 1). Throughout this paper we fix such a function m(N). The parameter b and the function m = m(N) first appear in the definition of the set of configurations Ω_{N,b,m} in (2.1), where these quantities will be explained.

Because K and N are integers, c must be a rational number. This in turn imposes a restriction on the values of N and K. If c is a positive integer, then N → ∞ along the positive integers and K → ∞ along the subsequence K = cN. If c = x/y, where x and y are positive integers with y ≥ 2 and x and y relatively prime, then N → ∞ along the subsequence N = yn for n ∈ N and K → ∞ along the subsequence K = cN = xn. Throughout this paper, when we write N ∈ N or N → ∞, it is understood that N and K satisfy the restrictions discussed here.

In the droplet model K distinguishable particles are placed, each with equal probability 1/N, onto the sites of the lattice Λ_N = {1, 2, ..., N}. This simple description corresponds to a simple probabilistic model. The configuration space is the set Ω_N = (Λ_N)^K consisting of all sequences ω = (ω_1, ω_2, ..., ω_K), where ω_i ∈ Λ_N denotes the site in Λ_N occupied by the i'th particle. Let ρ^(N) be the measure on Λ_N that assigns equal probability 1/N to each site in Λ_N, and let P_N = (ρ^(N))^K be the product measure on Ω_N with equal one-dimensional marginals ρ^(N). Thus P_N is the uniform probability measure that assigns equal probability 1/N^K to each of the N^K configurations ω ∈ Ω_N; for subsets A of Ω_N we have P_N(A) = card(A)/N^K, where card denotes cardinality.

The asymptotic analysis of the droplet model involves two random variables that we now introduce. Our goal is to prove a large deviation principle (LDP) for a sequence of random probability measures defined in terms of these random variables.
The LDP is stated in Theorem 2.1.

• For ℓ ∈ Λ_N and ω ∈ Ω_N, K_ℓ(ω) denotes the number of particles occupying site ℓ in the configuration ω. In other words, K_ℓ(ω) = card{i ∈ {1, 2, ..., K} : ω_i = ℓ}.

• For j ∈ N ∪ {0} and ω ∈ Ω_N, N_j(ω) denotes the number of sites ℓ ∈ Λ_N for which K_ℓ(ω) = j.

The dependence of K_ℓ(ω) and N_j(ω) on N is not indicated in the notation. Because the distributions of both random variables depend on N, both K_ℓ and N_j form triangular arrays.

We now specify the role played by the nonnegative integer b, first focusing on the case where b is a positive integer. The case where b = 0 is discussed later. For ω ∈ Ω_N, in general there exist sites ℓ ∈ Λ_N for which K_ℓ(ω) = 0; i.e., sites that are occupied by 0 particles. For this reason the quantity N_j(ω) just defined is indexed by j ∈ N ∪ {0}. The next step in the definition of the droplet model is to specify a subset Ω_{N,b,m} of configurations ω ∈ Ω_N for which every site is occupied by at least b particles and another constraint holds. In the following definition of Ω_{N,b,m}, N_b denotes the set {n ∈ Z : n ≥ b}. Thus N_0 is the set of nonnegative integers.

1. Given b ∈ N, for any configuration ω ∈ Ω_{N,b,m} every site of Λ_N is occupied by at least b particles. In other words, for each ℓ ∈ Λ_N there exist at least b values of i ∈ {1, 2, ..., K} such that ω_i = ℓ. Equivalently, in the configuration ω and for each ℓ ∈ Λ_N we have K_ℓ(ω) ≥ b. It follows that for ω ∈ Ω_{N,b,m}, N_j(ω) is indexed by j ∈ N_b.

2. For any configuration ω ∈ Ω_{N,b,m} at most m of the components N_j(ω) for j ∈ N_b are positive. As specified at the start of this section, m = m(N) → ∞ and m(N)/N → 0 as N → ∞.

We denote by N(ω) the sequence {N_j(ω), j ∈ N_b} and define |N(ω)|_+ = card{j ∈ N_b : N_j(ω) ≥ 1}. In terms of this notation

Ω_{N,b,m} = {ω ∈ Ω_N : K_ℓ(ω) ≥ b ∀ ℓ ∈ Λ_N and |N(ω)|_+ ≤ m = m(N)}.
(2.1)

Constraint 2, which restricts the number of positive components of N(ω), is a useful technical device that allows us to control the errors in several estimates. In appendix D we explain why we impose this constraint and give evidence supporting the conjecture that this restriction can be eliminated. Because of the two constraints, the maximum number of particles that can occupy any site is K − b(N − 1) = N(c − b) + b. It follows that N_j(ω) = 0 for all j > N(c − b) + b.

When b is a positive integer, for each ω ∈ Ω_{N,b,m} each site in Λ_N is occupied by at least b particles. In this case it is useful to think of each particle as having one unit of mass and of the set of particles at each site ℓ as defining a droplet. With this interpretation, for each configuration ω, K_ℓ(ω) denotes the mass or the size of the droplet at site ℓ. The j'th droplet class has N_j(ω) droplets and mass jN_j(ω). Because the number of sites in Λ_N equals N and the sum of the masses of all the droplet classes equals K, it follows that the quantities N_j(ω) satisfy the following conservation laws for all ω ∈ Ω_{N,b,m}:

∑_{j ∈ N_b} N_j(ω) = N and ∑_{j ∈ N_b} jN_j(ω) = K.   (2.2)

We now consider the modifications that must be made in these definitions when b = 0. In this case constraint 1 in the definition of Ω_{N,b,m} disappears because we allow sites to be occupied by 0 particles, and therefore N_j(ω) is indexed by j ∈ N_0 = N ∪ {0}. On the other hand, we retain constraint 2 in the definition of Ω_{N,0,m}, which requires that for any configuration ω ∈ Ω_{N,0,m} at most m of the components N_j(ω) for j ∈ N_0 are positive. In terms of |N(ω)|_+ the definition of Ω_{N,0,m} becomes

Ω_{N,0,m} = {ω ∈ Ω_N : |N(ω)|_+ ≤ m = m(N)}.

Because the choice b = 0 allows sites to be empty, we lose the interpretation of the set of particles at each site as being a droplet. However, for ω ∈ Ω_{N,0,m} the two conservation laws (2.2) continue to hold.

For the remainder of this paper we work with any fixed nonnegative integer b. The probability measure P_{N,b,m} defining the droplet model is obtained by restricting the uniform measure P_N to the set Ω_{N,b,m}. Thus P_{N,b,m} equals the conditional probability P_N(·|Ω_{N,b,m}). For subsets A of Ω_{N,b,m}, P_{N,b,m}(A) takes the form

P_{N,b,m}(A) = P_N(A|Ω_{N,b,m}) = (1/P_N(Ω_{N,b,m})) · P_N(A)   (2.3)
             = (1/card(Ω_{N,b,m})) · card(A).
The second line of this formula follows from the fact that P_N assigns equal probability 1/N^K to every ω ∈ Ω_{N,b,m}. In the language of statistical mechanics P_{N,b,m} defines a microcanonical ensemble that incorporates the conservation laws for number and mass expressed in (2.2).

Having defined the droplet model, we introduce the random probability measures whose large deviations we will study. For ω ∈ Ω_{N,b,m} these measures are the number-density measures Θ_{N,b} that assign to j ∈ N_b the probability N_j(ω)/N. This ratio represents the number density of droplet class j. Thus for any subset A of N_b

Θ_{N,b}(ω, A) = ∑_{j ∈ N_b} Θ_{N,b;j}(ω) δ_j(A) = ∑_{j ∈ A} Θ_{N,b;j}(ω), where Θ_{N,b;j}(ω) = N_j(ω)/N.   (2.4)

By the two formulas in (2.2)

∑_{j ∈ N_b} Θ_{N,b;j}(ω) = 1 and ∑_{j ∈ N_b} j Θ_{N,b;j}(ω) = K/N = c.   (2.5)

Thus Θ_{N,b}(ω) is a probability measure on N_b having mean c.

We next introduce several spaces of probability measures that arise in the large deviation analysis of the droplet model. P_{N_b} denotes the set of probability measures on N_b = {n ∈ Z : n ≥ b}. Thus θ ∈ P_{N_b} has the form ∑_{j ∈ N_b} θ_j δ_j, where the components θ_j satisfy θ_j ≥ 0 and θ(N_b) = ∑_{j ∈ N_b} θ_j = 1. We say that a sequence of measures {θ^(n), n ∈ N} in P_{N_b} converges weakly to θ ∈ P_{N_b}, and write θ^(n) ⇒ θ, if for any bounded function f mapping N_b into R

lim_{n → ∞} ∫_{N_b} f dθ^(n) = ∫_{N_b} f dθ.

P_{N_b} is topologized by the topology of weak convergence. There is a standard technique for introducing a metric structure on P_{N_b}, for which we quote the main facts. Because N_b is a complete, separable metric space with metric d(x, y) = |x − y|, there exists a metric π on P_{N_b}, called the Prohorov metric, with the following properties:

• Convergence with respect to the Prohorov metric is equivalent to weak convergence [14, Thm. 3.3.1]; i.e., θ^(n) ⇒ θ if and only if π(θ^(n), θ) → 0 as n → ∞.
• With respect to the Prohorov metric, P_{N_b} is a complete, separable metric space [14, Thm. 3.1.7].

We denote by P_{N_b,c} the set of measures in P_{N_b} having mean c. Thus θ ∈ P_{N_b,c} has the form ∑_{j ∈ N_b} θ_j δ_j, where the components θ_j satisfy θ_j ≥ 0, ∑_{j ∈ N_b} θ_j = 1, and ∫_{N_b} x θ(dx) = ∑_{j ∈ N_b} j θ_j = c. By (2.5) the number-density measures Θ_{N,b} defined in (2.4) take values in P_{N_b,c}.

In part (a) of Theorem 2.4 we prove two properties of P_{N_b,c}: with respect to the Prohorov metric, P_{N_b,c} is a relatively compact, separable subset of P_{N_b}; however, P_{N_b,c} is not a closed subset of P_{N_b} and thus is not a compact subset or a complete metric space. The fact that P_{N_b,c} is not a closed subset of P_{N_b} is easily motivated. If θ^(n) is a sequence in P_{N_b,c} such that θ^(n) ⇒ θ for some θ ∈ P_{N_b}, then some of the mass of θ^(n) could escape to ∞, causing θ to have a mean strictly less than c; an example is given in (2.12). Although P_{N_b,c} is the natural space in which to formulate the LDP for Θ_{N,b} in Theorem 2.1, the fact that P_{N_b,c} is not a closed subset of P_{N_b} gives rise to a number of unique features in the LDP.

Because P_{N_b,c} is not a closed subset of P_{N_b}, it is natural to introduce the closure of P_{N_b,c} in P_{N_b}. As we prove in part (b) of Theorem 2.4, the closure of P_{N_b,c} in P_{N_b} equals P_{N_b,[b,c]}, which is the set of measures in P_{N_b} having mean lying in the closed interval [b, c]. For any θ ∈ P_{N_b} the minimum value of the mean of θ is b, which occurs if and only if θ = δ_b. Being the closure of the relatively compact, separable metric space P_{N_b,c}, P_{N_b,[b,c]} is a compact, separable metric space with respect to the Prohorov metric. This space appears in the formulation of the large deviation upper bound in part (c) of Theorem 2.1.

We next state Theorem 2.1, which is the LDP for the sequence of distributions P_{N,b,m}(Θ_{N,b} ∈ dθ) on P_{N_b,c} as N → ∞.
The rate function in the LDP is the relative entropy of $\theta$ with respect to a certain measure $\rho_{b,\alpha_b(c)} = \sum_{j \in \mathbb{N}_b} \rho_{b,\alpha_b(c);j}\,\delta_j$ defined in (2.7), where each $\rho_{b,\alpha_b(c);j} > 0$. Thus any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ is absolutely continuous with respect to $\rho_{b,\alpha_b(c)}$. For $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ the relative entropy of $\theta$ with respect to $\rho_{b,\alpha_b(c)}$ is defined by
\[
R(\theta \mid \rho_{b,\alpha_b(c)}) = \sum_{j \in \mathbb{N}_b} \theta_j \log(\theta_j / \rho_{b,\alpha_b(c);j}). \tag{2.6}
\]
If $\theta_j = 0$, then $\theta_j \log(\theta_j/\rho_{b,\alpha_b(c);j}) = 0$. For $A$ a subset of $\mathcal{P}_{\mathbb{N}_b,c}$ or $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, $R(A \mid \rho_{b,\alpha_b(c)})$ denotes the infimum of $R(\theta \mid \rho_{b,\alpha_b(c)})$ over $\theta \in A$.

For $j \in \mathbb{N}_b$ the components of the measure $\rho_{b,\alpha_b(c)}$ appearing in the LDP have the form
\[
\rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}, \tag{2.7}
\]
where $\alpha_b(c) \in (0,\infty)$ is chosen so that $\rho_{b,\alpha_b(c)}$ has mean $c$, and $Z_b(\alpha_b(c))$ is the normalization making $\rho_{b,\alpha_b(c)}$ a probability measure; thus $Z_0(\alpha_0(c)) = e^{\alpha_0(c)}$, and for $b \in \mathbb{N}$, $Z_b(\alpha_b(c)) = e^{\alpha_b(c)} - \sum_{j=0}^{b-1} [\alpha_b(c)]^j/j!$. As we show in part (a) of Theorem C.1, there exists a unique such value $\alpha_b(c)$. For $b \in \mathbb{N}$ the Poisson-type distribution $\rho_{b,\alpha_b(c)}$ differs from a standard Poisson distribution because the former has $0$ mass at $0, 1, \ldots, b-1$ while the latter has positive mass at these points. In fact, $\rho_{b,\alpha_b(c)}$ can be identified as the distribution of a Poisson random variable $\Xi_{\alpha_b(c)}$ with parameter $\alpha_b(c)$ conditioned on $\Xi_{\alpha_b(c)} \in \mathbb{N}_b$ [Thm. C.1(d)]. Despite this difference we shall also refer to $\rho_{b,\alpha_b(c)}$ as a Poisson distribution.

According to part (a) of Theorem 2.1, $R(\cdot \mid \rho_{b,\alpha_b(c)})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$. It is well known that the relative entropy has compact level sets in the complete space $\mathcal{P}_{\mathbb{N}_b}$. The level sets are also compact in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ because the latter is a compact subset of $\mathcal{P}_{\mathbb{N}_b}$.
However, because $\mathcal{P}_{\mathbb{N}_b,c}$ is not closed in $\mathcal{P}_{\mathbb{N}_b}$, the compactness of the level sets in $\mathcal{P}_{\mathbb{N}_b,c}$ is not obvious.

As a consequence of the fact that $\mathcal{P}_{\mathbb{N}_b,c}$ is not closed in $\mathcal{P}_{\mathbb{N}_b}$, the large deviation upper bound takes two forms depending on whether the subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$ is compact or merely closed. When $F$ is compact, in part (b) we obtain the standard large deviation upper bound for $F$ with $-R(F \mid \rho_{b,\alpha_b(c)})$ on the right-hand side. When $F$ is closed, in part (c) we obtain a variation of the standard large deviation upper bound: $-R(F \mid \rho_{b,\alpha_b(c)})$ on the right-hand side is replaced by $-R(\overline{F} \mid \rho_{b,\alpha_b(c)})$, where $\overline{F}$ is the closure of $F$ in the compact space $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ and is therefore compact. When $F$ is compact, its closure in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is $F$ itself, and in this case the large deviation upper bounds in parts (b) and (c) coincide.

The refinement in part (c) is important. It is applied in the proof of Theorem 2.2 to show that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of the number-density measures $\Theta_{N,b}$. In turn, Theorem 2.2 is applied in the proof of Corollary 2.3 to show that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of the droplet-size random variables $K_\ell$.

In the next theorem we assume that $m$ is the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. The assumption that $m(N)/N \to 0$ is used to control error terms in Lemmas 3.2, 3.3, and B.3. This assumption on $m(N)$ is optimal in the sense that it is a minimal assumption guaranteeing that an error term in the lower bound in part (a) of Lemma B.3 and in the upper bound in part (b) of that lemma converge to 0.

Theorem 2.1.
Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. Let $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$ be the distribution having the components defined in (2.7). Then as $N \to \infty$, with respect to the measures $P_{N,b,m}$, the sequence $\Theta_{N,b}$ satisfies the large deviation principle on $\mathcal{P}_{\mathbb{N}_b,c}$ with rate function $R(\theta \mid \rho_{b,\alpha_b(c)})$ in the following sense.

(a) $R(\theta \mid \rho_{b,\alpha_b(c)})$ maps $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0,\infty]$, and for any $M < \infty$ the level set $\{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : R(\theta \mid \rho_{b,\alpha_b(c)}) \le M\}$ is compact.

(b) For any compact subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation upper bound
\[
\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(F \mid \rho_{b,\alpha_b(c)}).
\]

(c) For any closed subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$, let $\overline{F}$ denote the closure of $F$ in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. We have the large deviation upper bound
\[
\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(\overline{F} \mid \rho_{b,\alpha_b(c)}).
\]
For any open subset $G$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation lower bound
\[
\liminf_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in G) \ge -R(G \mid \rho_{b,\alpha_b(c)}).
\]

As noted in the comments after the statement of Theorem 4.3, Theorem 2.1 is a consequence of that theorem and several other results proved in the paper. Part (b) of Theorem 3.1 proves a local large deviation estimate for probabilities of the form $P_{N,b,m}(\Theta_{N,b} = \theta)$, where $\theta$ is a probability measure in the range of $\Theta_{N,b}$. This local estimate is one of the centerpieces of this paper, giving information not available in the LDP for $\Theta_{N,b}$, which involves global estimates. In Theorem 4.1 we show how to lift this local estimate to the large deviation limit for $\Theta_{N,b}$ lying in open balls and certain other subsets of $\mathcal{P}_{\mathbb{N}_b,c}$ defined in terms of open balls. Theorem 4.1 is proved as an application of the general formulation given in Theorem 4.2.
Finally, we show how to lift the large deviation limit for open balls and certain other subsets defined in terms of open balls to the LDP stated in Theorem 2.1. We do so by applying the general formulation given in Theorem 4.3. In part (d) of Theorem A.1 we prove that the level sets of $R(\theta \mid \rho_{b,\alpha_b(c)})$ in $\mathcal{P}_{\mathbb{N}_b,c}$ are compact.

The rate function in Theorem 2.1 has the property that for $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$, $R(\theta \mid \rho_{b,\alpha_b(c)}) \ge 0$ with equality if and only if $\theta = \rho_{b,\alpha_b(c)}$ [Thm. A.1(a)]. As we explain in the next theorem, the large deviation upper bound and this property of the relative entropy allow us to interpret the Poisson distribution $\rho_{b,\alpha_b(c)}$ as the equilibrium distribution of the number-density measures $\Theta_{N,b}$. In this theorem $[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ denotes the complement in $\mathcal{P}_{\mathbb{N}_b,c}$ of the open ball in $\mathcal{P}_{\mathbb{N}_b,c}$ with center $\rho_{b,\alpha_b(c)}$ and radius $\varepsilon > 0$ with respect to the Prohorov metric $\pi$. This open ball is defined by
\[
B_\pi(\rho_{b,\alpha_b(c)},\varepsilon) = \{\nu \in \mathcal{P}_{\mathbb{N}_b,c} : \pi(\rho_{b,\alpha_b(c)},\nu) < \varepsilon\}.
\]
$[\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ denotes the complement in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ of the open ball defined by
\[
\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon) = \{\nu \in \mathcal{P}_{\mathbb{N}_b,[b,c]} : \pi(\rho_{b,\alpha_b(c)},\nu) < \varepsilon\}.
\]
There is a subtlety in the proof in the next theorem that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of $\Theta_{N,b}$. To prove this, we need an exponentially decaying estimate on the probability that $\Theta_{N,b} \in [B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. Since $[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ is closed in $\mathcal{P}_{\mathbb{N}_b,c}$ but is not compact, we obtain this estimate by applying the large deviation upper bound in part (c) of Theorem 2.1 to $[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ and using the fact that the closure of this set in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a subset of $[\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$.

Theorem 2.2.
We assume the hypotheses of Theorem 2.1. The following results hold for any $\varepsilon > 0$.

(a) The quantity
\[
x^\star = \inf\{R(\theta \mid \rho_{b,\alpha_b(c)}) : \theta \in [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c\}
\]
is strictly positive.

(b) For any number $y$ in the interval $(0, x^\star)$ and all sufficiently large $N$
\[
P_{N,b,m}(\Theta_{N,b} \in [B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c) \le \exp[-Ny].
\]
This upper bound implies that
\[
\lim_{N \to \infty} P_{N,b,m}(\Theta_{N,b} \in B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)) = 1 \quad \text{and} \quad \lim_{\varepsilon \to 0} \lim_{N \to \infty} P_{N,b,m}(\Theta_{N,b} \in B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)) = 1.
\]
These limits allow us to interpret the Poisson distribution $\rho_{b,\alpha_b(c)}$ having the components defined in (2.7) as the equilibrium distribution of the number-density measures $\Theta_{N,b}$ with respect to $P_{N,b,m}$.

Proof.
The starting point is the large deviation upper bound in part (c) of Theorem 2.1 applied to the closed set $[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$, which is a subset of $[\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. We denote the closure of $[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ by $\overline{[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c}$. We claim that $\overline{[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c} \subset [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. Indeed, any $\nu \in \overline{[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c}$ is the weak limit of a sequence $\nu^{(n)} \in [B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c \subset \mathcal{P}_{\mathbb{N}_b,c}$. Since the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ equals $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, we have $\nu \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$. In addition, since $\nu^{(n)} \in [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ and this set is closed in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, it follows that $\nu \in [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. This proves the claim. Because of this relationship, the large deviation upper bound in part (c) of Theorem 2.1 takes the form
\[
\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in [B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c) \tag{2.8}
\]
\[
\le -R(\overline{[B_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c} \mid \rho_{b,\alpha_b(c)}) \le -R([\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c \mid \rho_{b,\alpha_b(c)}).
\]
We now prove part (a) of Theorem 2.2. Since $R(\theta \mid \rho_{b,\alpha_b(c)})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, it attains its infimum $x^\star$ on the closed set $[\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. If $x^\star = 0$, then there would exist $\theta \in [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$ such that $R(\theta \mid \rho_{b,\alpha_b(c)}) = 0$. But on $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, $R(\theta \mid \rho_{b,\alpha_b(c)})$ attains its infimum of $0$ at the unique measure $\theta = \rho_{b,\alpha_b(c)}$. Hence we obtain a contradiction because $\rho_{b,\alpha_b(c)} \notin [\hat{B}_\pi(\rho_{b,\alpha_b(c)},\varepsilon)]^c$. This completes the proof of part (a). The inequality in part (b) is an immediate consequence of part (a) and the large deviation upper bound (2.8). This inequality yields the two limits stated in part (b).
The proof of Theorem 2.2 is complete.

We now apply Theorem 2.2 to prove that $\rho_{b,\alpha_b(c)}$ is also the equilibrium distribution of the random variables $K_\ell$, which count the droplet sizes at the sites of $\Lambda_N$. Although these random variables are identically distributed, they are dependent because for each $\omega \in \Omega_{N,b,m}$ they satisfy the equality constraint $\sum_{\ell \in \Lambda_N} K_\ell(\omega) = K$. Except for one step the proof that $\rho_{b,\alpha_b(c)}$ is also the equilibrium distribution of $K_\ell$ is completely algebraic and requires only the condition that the $K_\ell$ are identically distributed. Their dependence does not affect the proof. A key observation needed in the proof is that $\Theta_{N,b}$ is the empirical measure of these random variables; i.e., for $\omega \in \Omega_{N,b,m}$, $\Theta_{N,b}(\omega)$ assigns to subsets $A$ of $\mathbb{N}_b$ the probability
\[
\Theta_{N,b}(\omega, A) = \frac{1}{N} \sum_{\ell=1}^N \delta_{K_\ell(\omega)}(A).
\]
This characterization of $\Theta_{N,b}$ follows from the fact that the empirical measure of the $K_\ell$ assigns to $j \in \mathbb{N}_b$ the probability
\[
\frac{1}{N} \sum_{\ell=1}^N \delta_{K_\ell(\omega)}(\{j\}) = \frac{N_j(\omega)}{N} = \Theta_{N,b;j}(\omega). \tag{2.9}
\]

Corollary 2.3.
We assume the hypotheses of Theorem 2.1. Then for any site $\ell \in \Lambda_N$ and any $j \in \mathbb{N}_b$
\[
\lim_{N \to \infty} P_{N,b,m}(K_\ell = j) = \rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}.
\]

Proof.
Since the random variables $K_\ell$ are identically distributed, it suffices to prove the corollary for $\ell = 1$. Theorem 2.2 implies that if $g$ is any bounded continuous function mapping $\mathcal{P}_{\mathbb{N}_b,c}$ into $\mathbb{R}$, then
\[
\lim_{N \to \infty} \int_{\Omega_{N,b,m}} g(\Theta_{N,b})\,dP_{N,b,m} = g(\rho_{b,\alpha_b(c)}). \tag{2.10}
\]
Given any bounded function $\varphi$ mapping $\mathbb{N}_b$ into $\mathbb{R}$, we define for $\theta \in \mathcal{P}_{\mathbb{N}_b}$ the bounded function
\[
g(\theta) = \sum_{j \in \mathbb{N}_b} \varphi(j)\,\theta_j.
\]
By the definition of weak convergence, $g$ is continuous on $\mathcal{P}_{\mathbb{N}_b,c}$. Equation (2.9) now yields
\[
g(\Theta_{N,b}(\omega)) = \sum_{j \in \mathbb{N}_b} \varphi(j)\,\Theta_{N,b;j}(\omega) = \frac{1}{N} \sum_{\ell \in \Lambda_N} \sum_{j \in \mathbb{N}_b} \varphi(j)\,\delta_{K_\ell(\omega)}(\{j\}) = \frac{1}{N} \sum_{\ell \in \Lambda_N} \varphi(K_\ell(\omega)).
\]
Since the $K_\ell$ are identically distributed, it follows from (2.10) that
\[
\lim_{N \to \infty} \int_{\Omega_{N,b,m}} \varphi(K_1)\,dP_{N,b,m} = \lim_{N \to \infty} \frac{1}{N} \sum_{\ell=1}^N \int_{\Omega_{N,b,m}} \varphi(K_\ell)\,dP_{N,b,m} = \lim_{N \to \infty} \int_{\Omega_{N,b,m}} g(\Theta_{N,b})\,dP_{N,b,m} = g(\rho_{b,\alpha_b(c)}) = \sum_{j \in \mathbb{N}_b} \varphi(j)\,\rho_{b,\alpha_b(c);j}.
\]
Setting $\varphi = 1_{\{j'\}}$ for any $j' \in \mathbb{N}_b$ yields $\lim_{N \to \infty} P_{N,b,m}(K_1 = j') = \rho_{b,\alpha_b(c);j'}$. This completes the proof of the corollary.

The last theorem in this section proves several properties of $\mathcal{P}_{\mathbb{N}_b,c}$ and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ with respect to the Prohorov metric that are needed in the paper.

Theorem 2.4.
Fix a nonnegative integer $b$ and a real number $c \in (b,\infty)$. The metric spaces $\mathcal{P}_{\mathbb{N}_b,c}$ and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ have the following properties.

(a) $\mathcal{P}_{\mathbb{N}_b,c}$, the set of probability measures on $\mathbb{N}_b$ having mean $c$, is a relatively compact, separable subset of $\mathcal{P}_{\mathbb{N}_b}$. However, $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$ and thus is neither a compact subset nor a complete metric space.

(b) $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, the set of probability measures on $\mathbb{N}_b$ having mean lying in the closed interval $[b,c]$, is the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a compact, separable subset of $\mathcal{P}_{\mathbb{N}_b}$.

Proof. (a) For $\xi \in \mathbb{N}$ satisfying $\xi \ge b$ let $\Psi_\xi$ denote the compact subset $\{b, b+1, \ldots, \xi\}$ of $\mathbb{N}_b$, and let $[\Psi_\xi]^c$ denote its complement. For any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$
\[
c = \sum_{j \in \mathbb{N}_b} j\theta_j \ge \sum_{j \ge \xi+1} j\theta_j \ge \xi \sum_{j \ge \xi+1} \theta_j = \xi\,\theta([\Psi_\xi]^c).
\]
It follows that $\mathcal{P}_{\mathbb{N}_b,c}$ is tight; i.e., for any $\varepsilon > 0$ there exists $\xi \in \mathbb{N}$ such that
\[
\sup_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} \theta([\Psi_\xi]^c) < \varepsilon.
\]
Prohorov's Theorem implies that $\mathcal{P}_{\mathbb{N}_b,c}$ is relatively compact [14, Thm. 3.2.2]. The separability of $\mathcal{P}_{\mathbb{N}_b,c}$ is proved in Corollary B.2.

In the present setting the relative compactness of $\mathcal{P}_{\mathbb{N}_b,c}$ is easy to prove from its tightness without appealing to the general formulation of Prohorov's Theorem. Given any sequence $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$, a diagonal argument yields a subsequence $\theta^{(n')}$ such that $\theta_j = \lim_{n' \to \infty} \theta^{(n')}_j$ exists for all $j \in \mathbb{N}_b$. Define $\theta = \sum_{j \in \mathbb{N}_b} \theta_j \delta_j$. We claim that $\theta^{(n')} \Rightarrow \theta$. To see this let $f$ be any nonzero bounded function mapping $\mathbb{N}_b$ into $\mathbb{R}$. Given $\varepsilon > 0$ choose $\xi \in \mathbb{N}_b$ so large that
\[
\sup_{n'} \theta^{(n')}([\Psi_\xi]^c) < \varepsilon/[2\|f\|_\infty] \quad \text{and} \quad \theta([\Psi_\xi]^c) < \varepsilon/[2\|f\|_\infty].
\]
The latter bound is possible since by Fatou's Lemma $c = \liminf_{n' \to \infty} \sum_{j \in \mathbb{N}_b} j\theta^{(n')}_j \ge \sum_{j \in \mathbb{N}_b} j\theta_j$. It follows that
\[
\left| \int_{\mathbb{N}_b} f\,d\theta^{(n')} - \int_{\mathbb{N}_b} f\,d\theta \right| \le \sum_{j=b}^{\xi} |f(j)|\,|\theta^{(n')}_j - \theta_j| + \sum_{j \ge \xi+1} |f(j)|\,(\theta^{(n')}_j + \theta_j) \tag{2.11}
\]
\[
\le \sum_{j=b}^{\xi} |f(j)|\,|\theta^{(n')}_j - \theta_j| + \varepsilon.
\]
Since $\theta^{(n')}_j \to \theta_j$ for $j \in \{b, b+1, \ldots, \xi\}$ and $\varepsilon > 0$ is arbitrary, the weak convergence of $\theta^{(n')}$ to $\theta$ is proved. Taking $f$ to be identically 1 verifies that $\theta \in \mathcal{P}_{\mathbb{N}_b}$, which must be the case since $\mathcal{P}_{\mathbb{N}_b}$ is complete.

We now prove that $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$ by exhibiting a sequence $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ having a weak limit that does not lie in $\mathcal{P}_{\mathbb{N}_b,c}$. To simplify the notation, we denote the mean of $\sigma \in \mathcal{P}_{\mathbb{N}_b}$ by $\langle\sigma\rangle$. Let $\theta$ be any measure in $\mathcal{P}_{\mathbb{N}_b}$ with mean $\langle\theta\rangle = \beta \in [b,c)$; thus $\theta \notin \mathcal{P}_{\mathbb{N}_b,c}$. The sequence
\[
\theta^{(n)} = \frac{n-c}{n-\beta}\,\theta + \frac{c-\beta}{n-\beta}\,\delta_n \quad \text{for } n \in \mathbb{N},\ n > c \tag{2.12}
\]
has the property that $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ and that $\theta^{(n)} \Rightarrow \theta \notin \mathcal{P}_{\mathbb{N}_b,c}$. We conclude that $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$. This completes the proof of part (a).

(b) Since $\mathcal{P}_{\mathbb{N}_b,c}$ is a separable subset of $\mathcal{P}_{\mathbb{N}_b}$ and $\mathcal{P}_{\mathbb{N}_b,c}$ is dense in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, it follows that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is separable. We prove that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. Let $\theta^{(n)}$ be a sequence in $\mathcal{P}_{\mathbb{N}_b,c}$ converging weakly to $\theta \in \mathcal{P}_{\mathbb{N}_b}$. Since $\theta^{(n)} \Rightarrow \theta$ implies that $\theta^{(n)}_j \to \theta_j$ for each $j \in \mathbb{N}_b$, Fatou's Lemma implies that
\[
c = \liminf_{n \to \infty} \langle\theta^{(n)}\rangle \ge \langle\theta\rangle.
\]
Since for any $\theta \in \mathcal{P}_{\mathbb{N}_b}$ we have $\langle\theta\rangle \ge b$, it follows that $c \ge \langle\theta\rangle \ge b$. This shows that the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ is a subset of $\mathcal{P}_{\mathbb{N}_b,[b,c]}$.

We next prove that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a subset of the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ by showing that for any $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$ there exists a sequence $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ such that $\theta^{(n)} \Rightarrow \theta$. If $\langle\theta\rangle = c$, then we choose $\theta^{(n)} = \theta$ for all $n \in \mathbb{N}$. If $\langle\theta\rangle = \beta \in [b,c)$, then we use the sequence $\theta^{(n)}$ in (2.12), which converges weakly to $\theta$. We conclude that $\theta$ lies in the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ and thus that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a subset of the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. This completes the proof of part (b). The proof of Theorem 2.4 is done.

We end this section by giving examples of closed, noncompact subsets of $\mathcal{P}_{\mathbb{N}_b,c}$ and compact subsets of $\mathcal{P}_{\mathbb{N}_b,c}$. We do this to emphasize the care that must be taken in dealing with the nonclosed metric space $\mathcal{P}_{\mathbb{N}_b,c}$ and the necessity of having separate large deviation upper bounds for compact sets in part (b) of Theorem 2.1 and for closed sets in part (c) of Theorem 2.1. We construct these examples as level sets of lower semicontinuous functions $I$ mapping $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0,\infty]$ and having the form
\[
I(\theta) = \int_{\mathbb{N}_b} g\,d\theta = \sum_{j \in \mathbb{N}_b} g(j)\,\theta_j, \quad \text{where } g(j) \ge 0 \text{ for all } j \in \mathbb{N}_b.
\]
Since $\theta^{(n)} \Rightarrow \theta \in \mathcal{P}_{\mathbb{N}_b,c}$ implies that $\theta^{(n)}_j \to \theta_j$ for each $j \in \mathbb{N}_b$, Fatou's Lemma implies that $I$ is lower semicontinuous on $\mathcal{P}_{\mathbb{N}_b,c}$.
Thus for any $M < \infty$ the level set $U_M = \{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : I(\theta) \le M\}$ is closed in $\mathcal{P}_{\mathbb{N}_b,c}$.

For the next set of examples, we assume that $g$ is a nondecreasing function mapping $\mathbb{N}_b$ into $[0,\infty)$ and satisfying $g(j) \to \infty$ and $g(j)/j \to 0$ as $j \to \infty$. In this case, as in the proof in part (a) of Theorem 2.4 that $\mathcal{P}_{\mathbb{N}_b,c}$ is relatively compact, Prohorov's Theorem implies that the level set $U_M$ is relatively compact. However, in general $U_M$ is not compact because it is not closed in $\mathcal{P}_{\mathbb{N}_b}$. A sequence showing that $U_M$ is not closed in $\mathcal{P}_{\mathbb{N}_b}$ is given by $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ defined in (2.12), where $\theta$ has mean $\beta \in [b,c)$. For all sufficiently large $n$, $\theta^{(n)}$ lies in the level set $U_{\beta+1}$, but $\theta^{(n)} \Rightarrow \theta$, which is not in $\mathcal{P}_{\mathbb{N}_b,c}$.

For the final set of examples, we assume that $g$ is a nondecreasing function mapping $\mathbb{N}_b$ into $[0,\infty)$ and satisfying $g(j)/j \to \infty$ as $j \to \infty$. Again Prohorov's Theorem implies that $U_M$ is relatively compact. In addition, because of the assumption on $g$, $U_M$ is uniformly integrable; i.e.,
\[
\lim_{D \to \infty} \sup_{\theta \in U_M} \int_{\{x \in \mathbb{N}_b : x \ge D\}} x\,\theta(dx) = 0.
\]
This implies that if $\theta^{(n)} \in U_M$ converges weakly to $\theta \in \mathcal{P}_{\mathbb{N}_b}$, then $c = \langle\theta^{(n)}\rangle \to \langle\theta\rangle$. This standard consequence of uniform integrability, proved in Proposition 2.3 in the appendix of [2014], can be proved in the present setting as in (2.11) if $\theta^{(n')}$ is replaced by $\theta^{(n)}$ and $f(j)$ is replaced by $j$ for $j \in \mathbb{N}_b$. It follows that $\theta$ has mean $c$ and so lies in $\mathcal{P}_{\mathbb{N}_b,c}$ and therefore in $U_M$ because $U_M$ is closed in $\mathcal{P}_{\mathbb{N}_b,c}$. We conclude that $U_M$ is both relatively compact and closed in $\mathcal{P}_{\mathbb{N}_b,c}$, implying that $U_M$ is compact.

The rate function in Theorem 2.1 is the relative entropy $R(\theta \mid \rho_{b,\alpha_b(c)})$, a lower semicontinuous function mapping $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0,\infty]$ that does not have the simple form of $I$.
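The escaping-mass sequence (2.12), which drives the examples above, is easy to examine numerically. The sketch below (exact rational arithmetic via `fractions`; the helper name `theta_n` is ours) confirms that each $\theta^{(n)}$ has mean exactly $c$ while the weight at the escaping atom $n$ tends to $0$, so the weak limit is $\theta$ itself:

```python
from fractions import Fraction as F

def theta_n(theta, c, n):
    """The sequence (2.12): theta^(n) = ((n-c)/(n-beta)) theta + ((c-beta)/(n-beta)) delta_n,
    where beta < c is the mean of theta.  Each theta^(n) has mean exactly c."""
    beta = sum(j * p for j, p in theta.items())
    out = {j: (F(n - c) / (n - beta)) * p for j, p in theta.items()}
    out[n] = out.get(n, F(0)) + (c - beta) / (n - beta)
    return out

theta = {1: F(1)}                    # delta_1, mean beta = 1 (take b = 1, c = 3)
for n in (5, 50, 500):
    tn = theta_n(theta, 3, n)
    assert sum(tn.values()) == 1     # a probability measure ...
    assert sum(j * p for j, p in tn.items()) == 3   # ... with mean exactly c
# the escaping weight (c - beta)/(n - beta) at the atom n vanishes as n grows
assert theta_n(theta, 3, 10**6)[10**6] < F(1, 100000)
```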
The proof that $R(\cdot \mid \rho_{b,\alpha_b(c)})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$ relies on Lemma 5.1 in [7] and the fact that $\rho_{b,\alpha_b(c)}$ has a finite moment generating function $\int_{\mathbb{N}_b} \exp(wx)\,\rho_{b,\alpha_b(c)}(dx)$ for all $w \in (0,\infty)$ [Thm. A.1(d)].

In the next section we present the local large deviation estimate that will be used in section 4 to prove the LDP for $\Theta_{N,b}$ in Theorem 2.1.
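Before turning to the local estimate, we note that the parameter $\alpha_b(c)$ in (2.7) can be computed numerically: the function $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$ is strictly increasing on $(0,\infty)$ with limits $b$ and $\infty$ (appendix C), so the equation $\gamma_b(\alpha) = c$ can be solved by bisection. A minimal sketch (the function names `Z`, `gamma_b`, and `alpha_b` are ours):

```python
import math

def Z(b, alpha):
    """Z_b(alpha) = e^alpha - sum_{j=0}^{b-1} alpha^j/j!; in particular Z_0(alpha) = e^alpha."""
    return math.exp(alpha) - sum(alpha**j / math.factorial(j) for j in range(b))

def gamma_b(b, alpha):
    """gamma_b(alpha) = alpha Z_{b-1}(alpha)/Z_b(alpha), the mean of rho_{b,alpha}.
    For b = 0 this reduces to alpha, since Z(-1, alpha) = e^alpha here."""
    return alpha * Z(b - 1, alpha) / Z(b, alpha)

def alpha_b(b, c, tol=1e-12):
    """Unique root of gamma_b(alpha) = c, located by bisection on a bracket."""
    lo, hi = tol, 1.0
    while gamma_b(b, hi) < c:        # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gamma_b(b, mid) < c else (lo, mid)
    return 0.5 * (lo + hi)
```

For $b = 0$ this recovers $\alpha_0(c) = c$, and for $b \in \mathbb{N}$ it returns a value strictly below $c$, consistent with [Thm. C.1(b)].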
The main result needed to prove the LDP in Theorem 2.1 is the local large deviation estimate stated in part (b) of Theorem 3.1. The first step is to introduce a set $A_{N,b,m}$ that plays a central role in this paper. Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Given $N \in \mathbb{N}$ define $K = Nc$, and let $m$ be the function appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. Define $\mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$; thus $\mathbb{N}_0$ is the set of nonnegative integers. Let $\nu$ be a sequence $\{\nu_j, j \in \mathbb{N}_b\}$ for which each $\nu_j \in \mathbb{N}_0$; thus $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$. We define $A_{N,b,m}$ to be the set of $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$ satisfying
\[
\sum_{j \in \mathbb{N}_b} \nu_j = N, \quad \sum_{j \in \mathbb{N}_b} j\nu_j = K, \quad \text{and} \quad |\nu|_+ \le m = m(N), \tag{3.1}
\]
where $|\nu|_+ = \mathrm{card}\{j \in \mathbb{N}_b : \nu_j \ge 1\}$. Because each $\nu_j \in \mathbb{N}_0$, the two sums involve only finitely many terms.

For $\omega \in \Omega_{N,b,m}$ the components $\Theta_{N,b;j}(\omega)$ of the number-density measure defined in (2.4) are $N_j(\omega)/N$ for $j \in \mathbb{N}_b$, where $N_j(\omega)$ denotes the number of sites in $\Lambda_N$ containing $j$ particles in the configuration $\omega$. We denote by $N(\omega)$ the sequence $\{N_j(\omega), j \in \mathbb{N}_b\}$. By definition, for every $\omega \in \Omega_{N,b,m}$ each site $\ell \in \Lambda_N$ is occupied by at least $b$ particles, and $|N(\omega)|_+ \le m = m(N)$. It follows that $A_{N,b,m}$ is the range of $N(\omega)$ for $\omega \in \Omega_{N,b,m}$; the two sums involving $\nu_j$ in (3.1) correspond to the two sums involving $N_j(\omega)$ in (2.2).

Since the range of $N(\omega)$ is $A_{N,b,m}$, for $\omega \in \Omega_{N,b,m}$ the range of $\Theta_{N,b}(\omega)$ is the set of probability measures $\theta_{N,b,\nu}$ whose components for $j \in \mathbb{N}_b$ have the form
\[
\theta_{N,b,\nu;j} = \frac{\nu_j}{N} \quad \text{for } \nu \in A_{N,b,m}. \tag{3.2}
\]
By (3.1), $\theta_{N,b,\nu}$ takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, the set of probability measures on $\mathbb{N}_b$ having mean $c$. It follows that the set
\[
B_{N,b,m} = \{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : \theta_j = \nu_j/N \text{ for } j \in \mathbb{N}_b \text{ for some } \nu \in A_{N,b,m}\} \tag{3.3}
\]
is the range of $\Theta_{N,b}(\omega)$ for $\omega \in \Omega_{N,b,m}$.

In part (b) of the next theorem we state the local large deviation estimate for the event $\{\Theta_{N,b} = \theta_{N,b,\nu}\}$.
In part (a) we introduce the Poisson distribution $\rho_{b,\alpha_b(c)}$ that appears in the local estimate. This Poisson distribution is the restriction to $\mathbb{N}_b$ of a standard Poisson distribution on $\mathbb{N} \cup \{0\}$; $\rho_{b,\alpha_b(c)}$ is defined in terms of a parameter $\alpha_b(c)$ guaranteeing that it has mean $c$. If $b = 0$, then $\alpha_0(c) = c$, while if $b \in \mathbb{N}$, then $\alpha_b(c) < c$ [Thm. C.1(b)].

In Theorem C.2 we give the straightforward proof of the existence of $\alpha_b(c)$ for $b = 1$. The proof of the existence of $\alpha_b(c)$ for general $b \in \mathbb{N}$ is much more subtle; it is given in appendix C of the present paper, where it is the content of part (a) of Theorem C.1. Parts (b)–(d) of that theorem explore other properties of $\alpha_b(c)$. In particular, in part (b) we prove that $\alpha_b(c)$ is asymptotic to $c$ as $c \to \infty$.

We comment on the proof of part (a) of the next theorem for $b \in \mathbb{N}$ because the existence of $\alpha_b(c)$ is crucial to the paper. Define $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$, where $Z_b(\alpha) = e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!$. According to part (a), if for a given $c \in (b,\infty)$ there exists a unique solution $\alpha = \alpha_b(c) \in (0,\infty)$ of $\gamma_b(\alpha) = c$, then it follows that $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$. The existence of such a solution is a consequence of the following three steps, which are carried out in appendix C: (1) $\lim_{\alpha \to 0^+} \gamma_b(\alpha) = b$; (2) $\lim_{\alpha \to \infty} \gamma_b(\alpha) = \infty$; (3) $\gamma_b'(\alpha) > 0$ for $\alpha \in (0,\infty)$. To carry out step (3), we note that because $Z_b'(\alpha) = Z_{b-1}(\alpha)$, we can write $\gamma_b(\alpha) = \alpha (\log Z_b(\alpha))'$ and hence $\gamma_b'(\alpha) = (\alpha (\log Z_b(\alpha))')'$. To prove that $\gamma_b'(\alpha) > 0$, we express $Z_b(\alpha)$ first in terms of an incomplete gamma function and then in terms of a moment generating function. The log-convexity of the moment generating function and a short calculation involving power series complete the proof.

Theorem 3.1. (a)
Fix a nonnegative integer $b$ and a real number $c \in (b,\infty)$. For $\alpha \in (0,\infty)$ let $\rho_{b,\alpha}$ be the measure on $\mathbb{N}_b$ having components
\[
\rho_{b,\alpha;j} = \frac{1}{Z_b(\alpha)} \cdot \frac{\alpha^j}{j!} \quad \text{for } j \in \mathbb{N}_b,
\]
where $Z_0(\alpha) = e^\alpha$, and for $b \in \mathbb{N}$, $Z_b(\alpha) = e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!$. Then there exists a unique value $\alpha_b(c) \in (0,\infty)$ such that $\rho_{b,\alpha_b(c)}$ lies in the set $\mathcal{P}_{\mathbb{N}_b,c}$ of probability measures on $\mathbb{N}_b$ having mean $c$. If $b = 0$, then $\alpha_0(c) = c$. If $b \in \mathbb{N}$, then $\alpha_b(c)$ is the unique solution in $(0,\infty)$ of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.

(b) Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. For any $\nu \in A_{N,b,m}$ we define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. Then the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)})$ is finite, and we have the local large deviation estimate
\[
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)}) + \varepsilon_N(\nu).
\]
The quantity $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

We now prove the local large deviation estimate in part (b) of Theorem 3.1. This proof is based on a combinatorial argument that is reminiscent of, and as natural as, the combinatorial argument used to prove Sanov's theorem for empirical measures defined in terms of i.i.d. random variables having a finite state space [13]. Given $\nu \in A_{N,b,m}$, our goal is to estimate the probability $P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu})$, where $\theta_{N,b,\nu}$ has the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. A basic observation is that the set $\{\omega \in \Omega_{N,b,m} : \Theta_{N,b}(\omega) = \theta_{N,b,\nu}\}$ coincides with the set
\[
\Delta_{N,b,m;\nu} = \{\omega \in \Omega_{N,b,m} : N_j(\omega) = \nu_j \text{ for } j \in \mathbb{N}_b\}. \tag{3.4}
\]
It follows that
\[
P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = P_{N,b,m}(\Delta_{N,b,m;\nu}) = \frac{\mathrm{card}(\Delta_{N,b,m;\nu})}{\mathrm{card}(\Omega_{N,b,m})}. \tag{3.5}
\]
Our first task is to determine the asymptotic behavior of $\mathrm{card}(\Delta_{N,b,m;\nu})$. In determining the asymptotic behavior of $\mathrm{card}(\Omega_{N,b,m})$, we will use the fact that $\Omega_{N,b,m}$ can be written as the disjoint union
\[
\Omega_{N,b,m} = \bigcup_{\nu \in A_{N,b,m}} \Delta_{N,b,m;\nu}. \tag{3.6}
\]
Let $\nu \in A_{N,b,m}$ be given. We start by expressing $\mathrm{card}(\Delta_{N,b,m;\nu})$ as a product of two multinomial coefficients. For each configuration $\omega \in \Delta_{N,b,m;\nu}$, $K$ particles are distributed onto the $N$ sites of the lattice $\Lambda_N$ with $j$ particles going onto $\nu_j$ sites for $j \in \mathbb{N}_b$. We carry this out in two stages. In stage one, $K$ particles are placed into $N$ bins, $\nu_j$ of which have $j$ particles for $j \in \mathbb{N}_b$. The number of ways of making this placement equals the multinomial coefficient
\[
\frac{K!}{\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}}. \tag{3.7}
\]
This multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} j\nu_j = K$. Given this placement of $K$ particles into $N$ bins, the number of ways of moving the particles from the bins onto the sites $1, 2, \ldots, N$ of the lattice $\Lambda_N$ equals the multinomial coefficient
\[
\frac{N!}{\prod_{j \in \mathbb{N}_b} \nu_j!}. \tag{3.8}
\]
This second multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} \nu_j = N$. We conclude that the cardinality of $\Delta_{N,b,m;\nu}$ is given by the product of these two multinomial coefficients:
\[
\mathrm{card}(\Delta_{N,b,m;\nu}) = \frac{N!}{\prod_{j \in \mathbb{N}_b} \nu_j!} \cdot \frac{K!}{\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}}. \tag{3.9}
\]
Since $|\nu|_+ \le m$, at most $m$ of the components $\nu_j$ are positive. A related version of this formula, well known in combinatorial analysis, is derived in Example III.23 of [16].

The next two steps in the proof of the local estimate given in part (b) of Theorem 3.1 are to prove the asymptotic formula for $\mathrm{card}(\Delta_{N,b,m;\nu})$ in Lemma 3.2 and the asymptotic formula for $\mathrm{card}(\Omega_{N,b,m})$ in part (b) of Lemma 3.3.
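Formula (3.9) can be checked against brute-force enumeration on a tiny instance. In the sketch below (the helper name `card_delta` is ours) we take $N = 3$, $K = 6$, and occupancy numbers $\nu_1 = \nu_2 = \nu_3 = 1$, and count directly over all $3^6$ placements of the distinguishable particles:

```python
import math
from collections import Counter
from itertools import product

def card_delta(N, K, nu):
    """card(Delta_{N,b,m;nu}) = [N!/prod_j nu_j!] * [K!/prod_j (j!)^nu_j], formula (3.9)."""
    val = math.factorial(N) * math.factorial(K)
    for j, nu_j in nu.items():
        val //= math.factorial(nu_j) * math.factorial(j) ** nu_j
    return val

N, K = 3, 6
nu = {1: 1, 2: 1, 3: 1}               # one site with 1, 2, and 3 particles each
target = Counter({1: 1, 2: 1, 3: 1})  # the occupancy profile as a multiset
count = sum(1 for omega in product(range(N), repeat=K)
            if Counter(Counter(omega).values()) == target)
assert count == card_delta(N, K, nu)  # both sides equal 360
```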
The proof of Lemma 3.2 is greatly simplified by a substitution in line 3 of (3.16). This substitution involves a parameter $\alpha \in (0,\infty)$, which, we emphasize, is arbitrary in this lemma. The substitution allows us to express the asymptotic behavior of both $\mathrm{card}(\Delta_{N,b,m;\nu})$ in Lemma 3.2 and $\mathrm{card}(\Omega_{N,b,m})$ in Lemma 3.3 directly in terms of the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha})$, where $\rho_{b,\alpha}$ is the probability measure on $\mathbb{N}_b$ having the components defined in part (a) of Theorem 3.1. One of the major issues in the proof of part (b) of Theorem 3.1 is to show that the arbitrary parameter $\alpha$ appearing in Lemmas 3.2 and 3.3 must take the value $\alpha_b(c)$, which is the unique value of $\alpha$ guaranteeing that $\rho_{b,\alpha} \in \mathcal{P}_{\mathbb{N}_b,c}$ [Thm. 3.1(a)]. We show that $\alpha$ must equal $\alpha_b(c)$ after the statement of Lemma 3.3.

Lemma 3.2.
Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $\alpha$ be any real number in $(0,\infty)$, and let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. We define
\[
f(\alpha, b, c, K) = \log Z_b(\alpha) - c \log \alpha + c \log K - c.
\]
For any $\nu \in A_{N,b,m}$, we define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. Then
\[
\frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + f(\alpha, b, c, K) + \zeta_N(\nu).
\]
The quantity $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

Proof. The proof is based on a weak form of Stirling's approximation, which states that for all $N \in \mathbb{N}$ satisfying $N \ge 2$ and for all $n \in \mathbb{N}_0$ satisfying $0 \le n \le N$, with the convention $0 \log 0 = 0$,
\[
0 \le \log(n!) - (n \log n - n) \le 2 \log N. \tag{3.10}
\]
We summarize (3.10) by writing
\[
\log(n!) = n \log n - n + O(\log N) \quad \forall N \in \mathbb{N},\ N \ge 2 \text{ and } \forall n \in \{0, 1, \ldots, N\}. \tag{3.11}
\]
By (3.10) the term denoted by $O(\log N)$ satisfies $0 \le O(\log N) \le 2 \log N$. We will also use (3.10) with $N$ replaced by $K$ and by other quantities in the model.

To simplify the notation, we rewrite (3.9) in the form
\[
\mathrm{card}(\Delta_{N,b,m;\nu}) = M(N,\nu) \cdot M(K,\nu),
\]
where $M(N,\nu)$ denotes the first multinomial coefficient on the right side of (3.9) and $M(K,\nu)$ denotes the second. We have
\[
\frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) = \frac{1}{N} \log M(N,\nu) + \frac{1}{N} \log M(K,\nu). \tag{3.12}
\]
The asymptotic behavior of the first term on the right side of the last display is easily calculated. Since $\nu \in A_{N,b,m}$, there are $|\nu|_+ \in \{1, 2, \ldots, m\}$ positive components $\nu_j$. Because of this restriction on the number $|\nu|_+$ of positive components of $\nu$, we are able to control the error in line 3 of (3.13). We define $\Psi_N(\nu) = \{j \in \mathbb{N}_b : \nu_j \ge 1\}$. For each $j \in \Psi_N(\nu)$, since the components $\nu_j$ satisfy $1 \le \nu_j \le N$, we have $\log(\nu_j!) = \nu_j \log \nu_j - \nu_j + O(\log N)$ for all $N \ge 2$.
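A weak Stirling bound of the form used in (3.10)–(3.11), namely $0 \le \log(n!) - (n \log n - n) \le 2 \log N$ for $0 \le n \le N$ and $N \ge 2$ (with $0 \log 0 = 0$), is easy to validate numerically. The sketch below uses `math.lgamma`, so that $\log n!$ is `lgamma(n + 1)`:

```python
import math

def stirling_gap(n):
    """log(n!) - (n log n - n), with the convention 0 log 0 = 0."""
    return math.lgamma(n + 1) - (n * math.log(n) - n if n > 0 else 0.0)

# Check the two-sided bound 0 <= gap <= 2 log N for all 0 <= n <= N.
for N in (2, 10, 1000):
    for n in range(N + 1):
        gap = stirling_gap(n)
        assert 0.0 <= gap <= 2 * math.log(N)
```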
Using the fact that $\sum_{j \in \Psi_N(\nu)} \nu_j = N$, we obtain
\[
\frac{1}{N} \log M(N,\nu) \tag{3.13}
\]
\[
= \frac{1}{N} \log(N!) - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} \log(\nu_j!)
\]
\[
= \frac{1}{N} (N \log N - N + O(\log N)) - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} (\nu_j \log \nu_j - \nu_j + O(\log N))
\]
\[
= -\sum_{j \in \mathbb{N}_b} (\nu_j/N) \log(\nu_j/N) + \frac{O(\log N)}{N} - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} O(\log N)
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log \theta_{N,b,\nu;j} + \zeta^{(1)}_N - \zeta^{(2)}_N(\nu),
\]
where $\zeta^{(1)}_N = [O(\log N)]/N \to 0$ as $N \to \infty$ and
\[
\zeta^{(2)}_N(\nu) = \frac{1}{N} \sum_{j \in \Psi_N(\nu)} O(\log N).
\]
By the inequality noted after (3.11) and the fact that $|\nu|_+ \le m$,
\[
0 \le \max_{\nu \in A_{N,b,m}} \zeta^{(2)}_N(\nu) \le \max_{\nu \in A_{N,b,m}} \frac{1}{N} \sum_{j \in \Psi_N(\nu)} 2 \log N \le \frac{2m \log N}{N}.
\]
Since $(m \log N)/N \to 0$ as $N \to \infty$, we conclude that $\zeta^{(2)}_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

We now study the asymptotic behavior of the second term on the right side of (3.12). Since $K = Nc$, we obtain for all $K \ge 2$
\[
\frac{1}{N} \log M(K,\nu) \tag{3.14}
\]
\[
= \frac{1}{N} \log(K!) - \frac{1}{N} \sum_{j \in \mathbb{N}_b} \nu_j \log(j!)
\]
\[
= \frac{1}{N} (K \log K - K + O(\log K)) - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!)
\]
\[
= c \log K - c - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!) + \zeta^{(3)}_N,
\]
where
\[
0 \le \zeta^{(3)}_N = \frac{O(\log K)}{N} = \frac{O(\log N)}{N} \to 0 \quad \text{as } N \to \infty.
\]
The weak form of Stirling's formula is used to rewrite the term $\log(K!)$ in the last display, but not to rewrite the terms $\log(j!)$, which we leave untouched.

Substituting (3.13) and (3.14) into (3.12), we obtain
\[
\frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) \tag{3.15}
\]
\[
= \frac{1}{N} \log M(N,\nu) + \frac{1}{N} \log M(K,\nu)
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log \theta_{N,b,\nu;j} - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!) + c \log K - c + \zeta_N(\nu)
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\,j!) + c \log K - c + \zeta_N(\nu).
\]
In this formula $\zeta_N(\nu) = \zeta^{(1)}_N - \zeta^{(2)}_N(\nu) + \zeta^{(3)}_N$. As $N \to \infty$
\[
\max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \le \zeta^{(1)}_N + \max_{\nu \in A_{N,b,m}} \zeta^{(2)}_N(\nu) + \zeta^{(3)}_N \to 0.
\]
We conclude that $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

Now comes the key step, the purpose of which is to express the sum in the last line of (3.15) as the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha})$, where $\alpha \in (0,\infty)$ is arbitrary. To do so, we rewrite the sum as shown in line 3 of the next display:
\[
\frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) \tag{3.16}
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\,j!) + c \log K - c + \zeta_N(\nu)
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log\left(\frac{\theta_{N,b,\nu;j}}{\alpha^j/(Z_b(\alpha) \cdot j!)} \cdot \frac{\alpha^j}{Z_b(\alpha)}\right) + c \log K - c + \zeta_N(\nu)
\]
\[
= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}/\rho_{b,\alpha;j}) + (\log Z_b(\alpha)) \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} - (\log \alpha) \sum_{j \in \mathbb{N}_b} j\,\theta_{N,b,\nu;j} + c \log K - c + \zeta_N(\nu)
\]
\[
= -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + \log Z_b(\alpha) - c \log \alpha + c \log K - c + \zeta_N(\nu)
\]
\[
= -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + f(\alpha, b, c, K) + \zeta_N(\nu).
\]
We obtain the next-to-last equality by using the fact that since $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$,
\[
\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} = 1 \quad \text{and} \quad \sum_{j \in \mathbb{N}_b} j\,\theta_{N,b,\nu;j} = c.
\]
The proof of Lemma 3.2 is complete.

The local large deviation estimate in Lemma 3.2 suggests a beautiful connection with Boltzmann's calculation of the Maxwell–Boltzmann distribution for the random ideal gas. This connection and Boltzmann's calculation are described in [13].

The next step in the proof of the local large deviation estimate in part (b) of Theorem 3.1 is to prove the asymptotic formula for $\mathrm{card}(\Omega_{N,b,m})$ stated in part (b) of the next lemma. The proof of this lemma uses Lemma 3.2 in a fundamental way. After the statement of this lemma we show how to apply it and Lemma 3.2 to prove part (b) of Theorem 3.1. An important component of this proof is to calculate the quantity $\min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha})$, which appears in part (b) of the next lemma.
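The substitution in line 3 of (3.16) rests on an exact algebraic identity: for any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ and any $\alpha \in (0,\infty)$, $-\sum_j \theta_j \log(\theta_j\,j!) = -R(\theta \mid \rho_{b,\alpha}) + \log Z_b(\alpha) - c \log \alpha$. A quick numerical check on a toy measure with mean $c$ (the particular $\theta$ and $\alpha$ below are arbitrary choices of ours):

```python
import math

b, c, alpha = 1, 2.5, 1.7              # alpha is arbitrary in (0, inf)
Zb = math.exp(alpha) - sum(alpha**j / math.factorial(j) for j in range(b))

theta = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}   # uniform on {1,...,4}: mean 2.5 = c
assert abs(sum(j * p for j, p in theta.items()) - c) < 1e-12

lhs = -sum(p * math.log(p * math.factorial(j)) for j, p in theta.items())
# R(theta | rho_{b,alpha}) with rho_{b,alpha;j} = alpha^j / (Z_b(alpha) j!)
R = sum(p * math.log(p * Zb * math.factorial(j) / alpha**j) for j, p in theta.items())
rhs = -R + math.log(Zb) - c * math.log(alpha)
assert abs(lhs - rhs) < 1e-9           # the two sides agree for every alpha
```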
The proof of part (b) of the lemma depends on part (a), which is also used to verify hypothesis (i) of Theorem 4.2 in the setting of Theorem 4.1.

Lemma 3.3. Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. The following conclusions hold.

(a) The set $A_{N,b,m}$ defined at the beginning of the section has the property that
\[
\lim_{N \to \infty} \frac{1}{N} \log \mathrm{card}(A_{N,b,m}) = 0.
\]

(b) Let $\alpha$ be the positive real number appearing in Lemma 3.2, and let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. We define
\[
f(\alpha,b,c,K) = \log Z_b(\alpha) - c \log \alpha + c \log K - c.
\]
Then $R(\theta \,|\, \rho_{b,\alpha})$ attains its infimum over $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$, and
\[
\frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) = f(\alpha,b,c,K) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) + \eta_N. \tag{3.17}
\]
The quantity $\eta_N \to 0$ as $N \to \infty$.

Before proving Lemma 3.3, we derive the local large deviation estimate in part (b) of Theorem 3.1 by applying Lemmas 3.2 and 3.3. An integral part of the proof is to show how the arbitrary value of $\alpha \in (0,\infty)$ appearing in these lemmas is replaced by the specific value $\alpha_b(c)$ appearing in Theorem 3.1. As in the statement of part (b) of Theorem 3.1, let $\nu$ be any vector in $A_{N,b,m}$ and define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. By (3.5)
\begin{align*}
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) \tag{3.18}
&= \frac{1}{N} \log P_{N,b,m}(\Delta_{N,b,m;\nu}) \\
&= \frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) - \frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}).
\end{align*}
Substituting the asymptotic formula for $\log \mathrm{card}(\Delta_{N,b,m;\nu})$ derived in Lemma 3.2 and the asymptotic formula for $\log \mathrm{card}(\Omega_{N,b,m})$ given in part (b) of Lemma 3.3 yields
\begin{align*}
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) \tag{3.19}
&= -R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + f(\alpha,b,c,K) + \zeta_N(\nu) - \Big( f(\alpha,b,c,K) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) + \eta_N \Big) \\
&= -R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) + \varepsilon_N(\nu).
\end{align*}
Here $\varepsilon_N(\nu)$ equals $\zeta_N(\nu) - \eta_N$; $\zeta_N(\nu)$ is the error term in Lemma 3.2, and $\eta_N$ is the error term in Lemma 3.3.
As $N \to \infty$, $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$, and $\eta_N \to 0$. It follows that $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

We now consider the first two terms on the right side of the last line of (3.19). By assertion (ii) in part (f) of Theorem A.1 applied to $\theta = \theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$, for any $\alpha \in (0,\infty)$
\[
R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) = R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)}).
\]
With this step we have succeeded in replacing the relative entropy $R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha})$ with respect to $\rho_{b,\alpha}$, which appears in Lemma 3.2, by the relative entropy $R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)})$ with respect to $\rho_{b,\alpha_b(c)}$, which appears in Theorem 3.1. Substituting the last equation into (3.19) gives
\[
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)}) + \varepsilon_N(\nu),
\]
where $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$. This is the conclusion of part (b) of Theorem 3.1.

We now complete the proof of part (b) of Theorem 3.1 by proving Lemma 3.3.

Proof of Lemma 3.3. (a) To estimate the cardinality of $A_{N,b,m}$ we write
\[
A_{N,b,m} \subset \Big\{ \nu : \sum_{j \in \mathbb{N}_b} \nu_j = N, \ |\nu|_+ \le m \Big\} = \bigcup_{k=1}^{m} \Big\{ \nu : \sum_{j \in \mathbb{N}_b} \nu_j = N, \ |\nu|_+ = k \Big\}.
\]
Thus we can bound the cardinality of $A_{N,b,m}$ by bounding separately the cardinality of each of the disjoint sets in the union. By [2, Cor. 2.5] the number of elements in the set indexed by $k$ equals the binomial coefficient $\binom{N-1}{k-1}$. Since by assumption $m/N \to 0$ as $N \to \infty$, for all sufficiently large $N$ the quantities $\binom{N-1}{k-1}$ are increasing in $k$ for $1 \le k \le m$ and are maximal when $k = m$. Since $\binom{N-1}{k-1} \le \binom{N}{k}$, it follows that
\[
\mathrm{card}(A_{N,b,m}) \le \sum_{k=1}^{m} \binom{N}{k} \le m \binom{N}{m} = m\, \frac{N!}{m!(N-m)!}.
\]
An application of the weak form of Stirling's formula yields for all $m \ge 1$ and all $N \ge m+1$
\begin{align*}
0 &\le \frac{1}{N} \log \mathrm{card}(A_{N,b,m}) \le \frac{1}{N} \big( \log m + \log(N!) - \log(m!) - \log((N-m)!) \big) \\
&= \frac{\log m}{N} - \frac{m}{N} \log \frac{m}{N} - \Big( 1 - \frac{m}{N} \Big) \log \Big( 1 - \frac{m}{N} \Big) + \frac{O(\log N)}{N}.
\end{align*}
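The bound $m\binom{N}{m}$ grows subexponentially whenever $m = o(N)$, which is the substance of part (a). The following numerical sketch (with the illustrative choice $m(N) = \lceil\sqrt{N}\rceil$, so that $m/N \to 0$) evaluates the normalized logarithm of the bound via `math.lgamma` and watches it tend to 0:

```python
import math

def entropy_rate_bound(N, m):
    """(1/N) log( m * C(N, m) ), the bound on (1/N) log card(A_{N,b,m})."""
    log_binom = math.lgamma(N + 1) - math.lgamma(m + 1) - math.lgamma(N - m + 1)
    return (math.log(m) + log_binom) / N

# With m(N) ~ sqrt(N) (so m/N -> 0), the bound tends to 0.
rates = [entropy_rate_bound(N, math.isqrt(N) + 1) for N in [10**2, 10**3, 10**4, 10**5]]
print(rates)
```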
Since $m/N \to 0$ as $N \to \infty$, we conclude that as $N \to \infty$
\[
0 \le \frac{1}{N} \log \mathrm{card}(A_{N,b,m}) \le \frac{\log m}{N} - \frac{m}{N} \log \frac{m}{N} - \Big( 1 - \frac{m}{N} \Big) \log \Big( 1 - \frac{m}{N} \Big) + \frac{O(\log N)}{N} \to 0.
\]
This completes the proof of part (a).

(b) The starting point is (3.6), which states that
\[
\Omega_{N,b,m} = \bigcup_{\nu \in A_{N,b,m}} \Delta_{N,b,m;\nu}.
\]
For distinct $\nu \in A_{N,b,m}$ the sets $\Delta_{N,b,m;\nu}$ are disjoint. Hence
\begin{align*}
\frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) \tag{3.20}
&= \frac{1}{N} \log \sum_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu}) \\
&= \frac{1}{N} \log \Bigg( \max_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu}) \cdot \sum_{\nu \in A_{N,b,m}} \frac{\mathrm{card}(\Delta_{N,b,m;\nu})}{\max_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu})} \Bigg) \\
&= \frac{1}{N} \log \Big( \max_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu}) \Big) + \delta_N,
\end{align*}
where
\[
0 < \delta_N = \frac{1}{N} \log \sum_{\nu \in A_{N,b,m}} \frac{\mathrm{card}(\Delta_{N,b,m;\nu})}{\max_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu})} \le \frac{1}{N} \log \mathrm{card}(A_{N,b,m}).
\]
It follows from part (a) that $\delta_N \to 0$ as $N \to \infty$.

We continue with the estimation of $\mathrm{card}(\Omega_{N,b,m})$. By Lemma 3.2 and the fact that the logarithm is an increasing function
\begin{align*}
&-\min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + f(\alpha,b,c,K) - \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \\
&\qquad \le \max_{\nu \in A_{N,b,m}} \Big( \frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) \Big) = \frac{1}{N} \log \Big( \max_{\nu \in A_{N,b,m}} \mathrm{card}(\Delta_{N,b,m;\nu}) \Big) \\
&\qquad \le -\min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + f(\alpha,b,c,K) + \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)|.
\end{align*}
As proved in Lemma 3.2, $\max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \to 0$ as $N \to \infty$. Hence by (3.20)
\begin{align*}
&-\min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + f(\alpha,b,c,K) - \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| + \delta_N \tag{3.21} \\
&\qquad \le \frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) \\
&\qquad \le -\min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) + f(\alpha,b,c,K) + \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| + \delta_N.
\end{align*}
Under the assumption that $R(\cdot \,|\, \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$, we define
\[
\eta_N = \frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) - f(\alpha,b,c,K) + \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}).
\]
In the last two paragraphs of this proof, we show that $\eta_N \to 0$ as $N \to \infty$. Given this fact, the last equation yields the asymptotic formula (3.17) in part (b).

We now prove that $\eta_N \to 0$ as $N \to \infty$. To do this, we use (3.21) to write
\[
|\eta_N| \le \Big( \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) \Big) + \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| + \delta_N.
\]
Like the second and third terms on the right side, the first term on the right side is nonnegative because each measure $\theta_{N,b,\nu}$ with $\nu \in A_{N,b,m}$ lies in $\mathcal{P}_{\mathbb{N}_b,c}$. Since $\max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \to 0$ and $\delta_N \to 0$ as $N \to \infty$, it will follow that $\eta_N \to 0$ if we can show that $R(\cdot \,|\, \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ and that
\[
\lim_{N \to \infty} \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) = \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}). \tag{3.22}
\]
Given the existence of $\min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha})$, this assertion is certainly plausible since, as shown in Corollary B.2, the measures $\theta_{N,b,\nu}$ for $\nu \in \cup_{N \in \mathbb{N}} A_{N,b,m}$ are dense in $\mathcal{P}_{\mathbb{N}_b,c}$.

We start the proof of (3.22) by noting that since $R(\cdot \,|\, \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$ [Thm. A.1(d)], $R(\cdot \,|\, \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ at some measure $\theta^\star$. In assertion (i) in part (f) of Theorem A.1, we show that $\theta^\star = \rho_{b,\alpha_b(c)}$. However, this detail is not needed in the present proof, which we would like to keep as self-contained as possible.
We prove (3.22) by applying Theorem B.1 to $\theta = \theta^\star$, obtaining a sequence $\theta^{(N)}$ with the following properties:

• For $N \in \mathbb{N}$, $\theta^{(N)} \in B_{N,b,m}$ has components $\theta^{(N)}_j = \nu^{(N)}_j/N$ for $j \in \mathbb{N}_b$, where $\nu^{(N)}$ is an appropriate sequence in $A_{N,b,m}$.

• $\theta^{(N)} \Rightarrow \theta^\star$ as $N \to \infty$.

• $R(\theta^{(N)} \,|\, \rho_{b,\alpha}) \to R(\theta^\star \,|\, \rho_{b,\alpha})$ as $N \to \infty$.

The limit in (3.22) follows from the inequalities
\[
\min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha}) \le \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha}) \le R(\theta^{(N)} \,|\, \rho_{b,\alpha})
\]
and the limit $R(\theta^{(N)} \,|\, \rho_{b,\alpha}) \to R(\theta^\star \,|\, \rho_{b,\alpha}) = \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \,|\, \rho_{b,\alpha})$ as $N \to \infty$. This completes the proof of Lemma 3.3 and thus the proof of the local estimate in part (b) of Theorem 3.1.

We end this section by explaining the insight behind the key step in the proof of Lemma 3.2. This key step is to rewrite the sum in line 2 of (3.16) as shown in line 3. This allows us to express the sum in line 3 as the relative entropy $R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)})$ plus terms that are independent of $\theta_{N,b,\nu}$. We now motivate this step. In order to streamline this motivation, we drop all error terms and avoid rigor.

Our starting point is line 2 of (3.16). If we do not rewrite the sum as shown in line 3 of that display, then we have the following modification of the conclusion of Lemma 3.2:
\[
\frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) \approx -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) + c \log K - c. \tag{3.23}
\]
This in turn leads to the following modification of Lemma 3.3:
\[
\frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) \approx c \log K - c - \min_{\nu \in A_{N,b,m}} \Bigg( \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) \Bigg).
\]
For $\nu \in \cup_{N \in \mathbb{N}} A_{N,b,m}$ the probability measures $\theta_{N,b,\nu}$ are dense in $\mathcal{P}_{\mathbb{N}_b,c}$ [Cor. B.2]. Hence it is plausible that as $N \to \infty$ the minimum in the last display can be replaced by
\[
\min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} \Bigg( \sum_{j \in \mathbb{N}_b} \theta_j \log(\theta_j\, j!) \Bigg). \tag{3.24}
\]
To determine this minimum, we introduce two Lagrange multipliers corresponding to the two equality constraints $\sum_{j \in \mathbb{N}_b} \theta_j = 1$ and $\sum_{j \in \mathbb{N}_b} j\theta_j = c$ satisfied by $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$.
A formal calculation, which we omit, suggests that the minimum is attained at the unique $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ having components
\[
\theta_j = \frac{1}{Z_b(\alpha)} \cdot \frac{\alpha^j}{j!} \quad \text{for } j \in \mathbb{N}_b,
\]
where $\alpha = \alpha_b(c)$ and $Z_b(\alpha) = Z_b(\alpha_b(c))$ are chosen so that $\sum_{j \in \mathbb{N}_b} \theta_j = 1$ and $\sum_{j \in \mathbb{N}_b} j\theta_j = c$ [Thm. 3.1(a)]. The measure $\theta$ with $\alpha = \alpha_b(c)$ coincides with the Poisson distribution $\rho_{b,\alpha_b(c)}$ appearing in the local large deviation estimate in part (b) of Theorem 3.1. One easily checks that the value of the minimum in (3.24) is $c \log \alpha_b(c) - \log Z_b(\alpha_b(c))$. These calculations suggest that
\[
\frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) \approx c \log K - c - c \log \alpha_b(c) + \log Z_b(\alpha_b(c)). \tag{3.25}
\]
When (3.25) is combined with (3.23), we have by (3.18)
\begin{align*}
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu})
&= \frac{1}{N} \log P_{N,b,m}(\Delta_{N,b,m;\nu}) \\
&= \frac{1}{N} \log \mathrm{card}(\Delta_{N,b,m;\nu}) - \frac{1}{N} \log \mathrm{card}(\Omega_{N,b,m}) \\
&\approx -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) + c \log K - c - \big( c \log K - c - c \log \alpha_b(c) + \log Z_b(\alpha_b(c)) \big) \\
&\approx -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) + c \log \alpha_b(c) - \log Z_b(\alpha_b(c)).
\end{align*}
The last line of this display can be rewritten as
\[
-\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log \bigg( \frac{\theta_{N,b,\nu;j}}{[\alpha_b(c)]^j/(Z_b(\alpha_b(c)) \cdot j!)} \bigg) = -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}/\rho_{b,\alpha_b(c);j}) = -R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)}).
\]
It follows that
\[
\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) \approx -R(\theta_{N,b,\nu} \,|\, \rho_{b,\alpha_b(c)}).
\]
Except for the error terms, this coincides with the conclusion of part (b) of Theorem 3.1.

The calculation just presented was our first attempt to prove Lemmas 3.2 and 3.3. It also guided us to the much more efficient current proofs both of Lemma 3.2 — where the sum in line 2 of (3.16) is written directly in terms of the relative entropy — and of Lemma 3.3. An analogous but much simpler calculation motivates the solution of a finite-dimensional problem involving the minimum of a relative entropy over a set of probability measures having fixed mean.
This simpler calculation is directly related to the present paper because it gives the form of the Maxwell–Boltzmann distribution for a random ideal gas. For details see section 6.4 of [10], sections 4–5 of [11], and section 4 of [13], each of which emphasizes different aspects of the calculation. This completes the motivation of the proof of Lemma 3.2.

In the next section we show how the local large deviation estimate in part (b) of Theorem 3.1 yields the LDP in Theorem 2.1.
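As a numerical coda to this section, the formal minimization motivating (3.24)–(3.25) is easy to check. The sketch below (with $b = 2$, $c = 3$ as illustrative values and a hand-rolled bisection rather than any particular library routine) solves the mean constraint for $\alpha_b(c)$, computes the claimed minimum value $c \log \alpha_b(c) - \log Z_b(\alpha_b(c))$, and confirms that a feasible competitor in $\mathcal{P}_{\mathbb{N}_b,c}$ does worse:

```python
import math

b, c = 2, 3.0
TRUNC = 200  # truncation point for the series; the tail is negligible here

def term(j, alpha):
    """alpha^j / j!, computed in log space to avoid overflow."""
    return math.exp(j * math.log(alpha) - math.lgamma(j + 1))

def Z(alpha):  # Z_b(alpha) = sum_{j >= b} alpha^j / j!
    return sum(term(j, alpha) for j in range(b, TRUNC))

def mean(alpha):  # mean of the restricted Poisson distribution rho_{b,alpha}
    return sum(j * term(j, alpha) for j in range(b, TRUNC)) / Z(alpha)

# Solve mean(alpha) = c for alpha = alpha_b(c) by bisection
# (the mean is increasing in alpha).
lo, hi = 1e-6, 10.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean(mid) < c:
        lo = mid
    else:
        hi = mid
alpha_bc = 0.5 * (lo + hi)

min_value = c * math.log(alpha_bc) - math.log(Z(alpha_bc))

# A feasible competitor with mean c: theta_2 = 2/3, theta_5 = 1/3.
competitor = (2/3) * math.log((2/3) * math.factorial(2)) \
           + (1/3) * math.log((1/3) * math.factorial(5))

print(alpha_bc, min_value, competitor)  # min_value < competitor
```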
In Theorem 2.1 we state the LDP for the sequence $\Theta_{N,b}$ of number-density measures. This sequence takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, which is the set of probability measures on $\mathbb{N}_b$ having mean $c \in (b,\infty)$. The purpose of the present section is to show how the local large deviation estimate in part (b) of Theorem 3.1 yields the LDP for $\Theta_{N,b}$. The basic idea is first to prove the large deviation limit for $\theta_{N,b,\nu}$ lying in open balls in $\mathcal{P}_{\mathbb{N}_b,c}$ and in other subsets defined in terms of open balls and then to use this large deviation limit to prove the LDP in Theorem 2.1. Both of these steps are implemented as applications of the general formulations in Theorems 4.2 and 4.3.

In Theorem 4.1 we state the large deviation limit for open balls and other subsets defined in terms of open balls. Two types of open balls are considered. Let $\theta$ be a measure in $\mathcal{P}_{\mathbb{N}_b,c}$, and take $r > 0$. Part (a) states the large deviation limit for open balls in $\mathcal{P}_{\mathbb{N}_b,c}$ defined by
\[
B_\pi(\theta,r) = \{ \mu \in \mathcal{P}_{\mathbb{N}_b,c} : \pi(\theta,\mu) < r \},
\]
where $\pi$ denotes the Prohorov metric on $\mathcal{P}_{\mathbb{N}_b,c}$ [14]. This limit will be used to prove the large deviation upper bound for compact subsets of $\mathcal{P}_{\mathbb{N}_b,c}$ in part (b) of Theorem 2.1 and the large deviation lower bound for open subsets of $\mathcal{P}_{\mathbb{N}_b,c}$ in part (d) of Theorem 2.1. Now let $\theta$ be a measure in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. Part (b) states the large deviation limit for sets of the form $\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}$, where $\widehat{B}_\pi(\theta,r)$ is the open ball in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ defined by
\[
\widehat{B}_\pi(\theta,r) = \{ \mu \in \mathcal{P}_{\mathbb{N}_b,[b,c]} : \pi(\theta,\mu) < r \}.
\]
This limit will be used to prove the large deviation upper bound for closed subsets in part (c) of Theorem 2.1. Since $\mathcal{P}_{\mathbb{N}_b,c}$ is a dense subset of $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ [Thm. 2.4(b)], $\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}$ is nonempty. If $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$, then $B_\pi(\theta,r) = \widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}$, and the conclusions of parts (a) and (b) of the next theorem coincide. For $A$ a subset of $\mathcal{P}_{\mathbb{N}_b,c}$ or $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ we denote by $R(A \,|\, \rho_{b,\alpha_b(c)})$ the infimum of $R(\theta \,|\, \rho_{b,\alpha_b(c)})$ over $\theta \in A$.

Theorem 4.1.
Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. The following conclusions hold.

(a) Let $\theta$ be a measure in $\mathcal{P}_{\mathbb{N}_b,c}$ and take $r > 0$. Then for any open ball $B_\pi(\theta,r)$ in $\mathcal{P}_{\mathbb{N}_b,c}$, $R(B_\pi(\theta,r) \,|\, \rho_{b,\alpha_b(c)})$ is finite, and we have the large deviation limit
\[
\lim_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in B_\pi(\theta,r)) = -R(B_\pi(\theta,r) \,|\, \rho_{b,\alpha_b(c)}).
\]

(b) Let $\theta$ be a measure in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ and take $r > 0$. Then the set $\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}$ is nonempty, $R(\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c} \,|\, \rho_{b,\alpha_b(c)})$ is finite, and we have the large deviation limit
\[
\lim_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in \widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}) = -R(\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c} \,|\, \rho_{b,\alpha_b(c)}).
\]

We prove Theorem 4.1 by applying the local large deviation estimate in Lemma 3.2. A key step is to approximate probability measures in $B_\pi(\theta,r)$ and in $\widehat{B}_\pi(\theta,r) \cap \mathcal{P}_{\mathbb{N}_b,c}$ by appropriate sequences of probability measures in the range of $\Theta_{N,b}$. This procedure allows one to show in part (a) that the infimum $R(B_\pi(\theta,r) \,|\, \rho_{b,\alpha_b(c)})$ can be approximated by the infimum of $R(\cdot \,|\, \rho_{b,\alpha_b(c)})$ over measures lying in the intersection of $B_\pi(\theta,r)$ and the range of $\Theta_{N,b}$; a similar statement holds for the infimum in part (b). A set of hypotheses that allows one to carry out this approximation procedure is given in Theorem 4.2, a general formulation that yields Theorem 4.1 as a special case.

Theorem 4.2 is formulated for a complete, separable metric space $\mathcal{X}$ containing a relatively compact subset $\mathcal{W}$ that is not closed. We define $\mathcal{Z}$ to be the closure of $\mathcal{W}$ in $\mathcal{X}$. In the application to Theorem 4.1, $\mathcal{X}$ equals $\mathcal{P}_{\mathbb{N}_b}$, the set of probability measures on $\mathbb{N}_b$; $\mathcal{W}$ equals $\mathcal{P}_{\mathbb{N}_b,c}$, the subset of $\mathcal{P}_{\mathbb{N}_b}$ containing probability measures with mean $c$; and $\mathcal{Z}$ equals $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, the subset of $\mathcal{P}_{\mathbb{N}_b}$ containing probability measures with mean lying in the closed interval $[b,c]$.
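Why is $\mathcal{W} = \mathcal{P}_{\mathbb{N}_b,c}$ not closed? Mass can escape to infinity while the mean constraint holds at every step, so that the mean drops in the weak limit. A small numerical illustration (with $b = 2$, $c = 3$; the two-point measures below are our own construction, not taken from the paper): let $\theta_n$ put mass $1 - \varepsilon_n$ at $b$ and $\varepsilon_n$ at $n$, with $\varepsilon_n = (c-b)/(n-b)$, so that every $\theta_n$ has mean exactly $c$, yet $\theta_n \Rightarrow \delta_b$, whose mean is $b < c$:

```python
b, c = 2, 3.0

def theta_n(n):
    """Two-point measure with mass at b and at n, tuned to have mean exactly c."""
    eps = (c - b) / (n - b)
    return {b: 1 - eps, n: eps}

for n in [10, 100, 1000, 10000]:
    th = theta_n(n)
    mean = sum(j * p for j, p in th.items())
    print(n, mean, th[b])  # mean stays 3.0 while the mass at b tends to 1
```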
If $\tau$ denotes the metric on $\mathcal{X}$, then for $x \in \mathcal{W}$ and $r > 0$ open balls in $\mathcal{W}$ have the form
\[
B_\tau(x,r) = \{ y \in \mathcal{W} : \tau(x,y) < r \}.
\]
For $x \in \mathcal{Z}$ and $r > 0$ open balls in $\mathcal{Z}$ have the form
\[
\widehat{B}_\tau(x,r) = \{ y \in \mathcal{Z} : \tau(x,y) < r \}.
\]

Theorem 4.2.
For $N \in \mathbb{N}$ let $(\Omega_N, \mathcal{F}_N, Q_N)$ be a sequence of probability spaces. Let $\mathcal{X}$ be a complete, separable metric space, $\mathcal{W}$ a relatively compact subset of $\mathcal{X}$ that is not closed and thus not compact, and $\mathcal{Z}$ the closure of $\mathcal{W}$ in $\mathcal{X}$; thus $\mathcal{Z}$ is compact. Also let $Y_N$ be a sequence of random vectors mapping $\Omega_N$ into $\mathcal{W}$, and let $I$ be a function mapping $\mathcal{X}$ into $[0,\infty]$. For $A$ a subset of $\mathcal{X}$ we denote the infimum of $I$ over $A$ by $I(A)$. We assume the following four hypotheses.

(i) For $\omega \in \Omega_N$ the range of $Y_N(\omega)$ is a finite subset $\mathcal{W}_N$ of $\mathcal{W}$, and the cardinality of $\mathcal{W}_N$ satisfies
\[
\lim_{N \to \infty} \frac{1}{N} \log \mathrm{card}(\mathcal{W}_N) = 0.
\]

(ii) For each $y \in \mathcal{W}_N$ we have $I(y) < \infty$ and the local large deviation estimate
\[
\frac{1}{N} \log Q_N(Y_N = y) = -I(y) + \varepsilon_N(y),
\]
where $\varepsilon_N(y) \to 0$ as $N \to \infty$ uniformly for $y \in \mathcal{W}_N$.

(iii) There exists a dense subset $D$ of $\mathcal{W}$ such that $I(y) < \infty$ for all $y \in D$.

(iv) For any $y \in \mathcal{W}$ satisfying $I(y) < \infty$, there exists a sequence $y_N \in \mathcal{W}_N$ for which $y_N \to y$ and $I(y_N) \to I(y)$ as $N \to \infty$.

Under these hypotheses the following conclusions hold.

(a) For any open ball $B$ in $\mathcal{W}$, $I(B)$ is finite, and we have the large deviation limit
\[
\lim_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in B) = -I(B).
\]

(b) For any open ball $\widehat{B}$ in $\mathcal{Z}$, $\widehat{B} \cap \mathcal{W}$ is nonempty, $I(\widehat{B} \cap \mathcal{W})$ is finite, and we have the large deviation limit
\[
\lim_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in \widehat{B} \cap \mathcal{W}) = -I(\widehat{B} \cap \mathcal{W}).
\]

Proof. (a) By hypothesis (iii), for any open ball $B$ in $\mathcal{W}$ there exists $x \in B \cap D$ such that $I(x) < \infty$. Thus $I(B) \le I(x) < \infty$. By the local large deviation estimate in hypothesis (ii)
\[
Q_N(Y_N \in B) = \sum_{y \in B \cap \mathcal{W}_N} Q_N(Y_N = y) = \sum_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))].
\]
For the last sum in this equation we have the bounds
\[
\max_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))] \le \sum_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))] \le \mathrm{card}(\mathcal{W}_N) \cdot \max_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))].
\]
In addition, for the term $\max_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))]$ we have the bounds
\begin{align*}
\exp\Big[ -N \Big( I(B \cap \mathcal{W}_N) + \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) \Big) \Big]
&= \exp\Big[ -N \Big( \min_{y \in B \cap \mathcal{W}_N} I(y) + \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) \Big) \Big] \\
&\le \max_{y \in B \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))] \\
&\le \exp\Big[ -N \Big( \min_{y \in B \cap \mathcal{W}_N} I(y) - \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) \Big) \Big] = \exp\Big[ -N \Big( I(B \cap \mathcal{W}_N) - \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) \Big) \Big].
\end{align*}
It follows that
\[
-I(B \cap \mathcal{W}_N) - \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) \le \frac{1}{N} \log Q_N(Y_N \in B) \le -I(B \cap \mathcal{W}_N) + \max_{y \in B \cap \mathcal{W}_N} \varepsilon_N(y) + \frac{\log(\mathrm{card}(\mathcal{W}_N))}{N}.
\]
Since $\varepsilon_N(y) \to 0$ uniformly for $y \in \mathcal{W}_N$, by hypothesis (i) the proof is done once we show that
\[
\lim_{N \to \infty} I(B \cap \mathcal{W}_N) = I(B). \tag{4.1}
\]
Since $B \cap \mathcal{W}_N \subset B$, we have $I(B) \le I(B \cap \mathcal{W}_N)$, which implies that
\[
I(B) \le \liminf_{N \to \infty} I(B \cap \mathcal{W}_N).
\]
The limit in (4.1) is proved if we can show that
\[
\limsup_{N \to \infty} I(B \cap \mathcal{W}_N) \le I(B). \tag{4.2}
\]
For any $\delta > 0$ there exists $y^\star \in B$ such that $I(y^\star) \le I(B) + \delta < \infty$. Hypothesis (iv) guarantees the existence of a sequence $y_N \in \mathcal{W}_N$ such that $y_N \to y^\star$ and $I(y_N) \to I(y^\star)$. Since for all sufficiently large $N$ we have $y_N \in B \cap \mathcal{W}_N$, it follows that $I(B \cap \mathcal{W}_N) \le I(y_N)$. Hence
\[
\limsup_{N \to \infty} I(B \cap \mathcal{W}_N) \le \lim_{N \to \infty} I(y_N) = I(y^\star) \le I(B) + \delta.
\]
Sending $\delta \to 0$ gives (4.2) and thus proves the limit (4.1). This completes the proof of part (a).

(b) Let $\widehat{B}$ be any open ball in $\mathcal{Z}$. Since $\mathcal{W}$ is dense in $\mathcal{Z}$, $\widehat{B} \cap \mathcal{W}$ is nonempty. By hypothesis (iii) there exists $x \in \widehat{B} \cap D$ such that $I(x) < \infty$. Thus $I(\widehat{B} \cap \mathcal{W}) \le I(\widehat{B} \cap D) \le I(x) < \infty$.

To prove the limit in part (b), we proceed as in the proof of the limit in part (a), replacing the set $B$ in part (a) by the set $\widehat{B} \cap \mathcal{W}$. Since $\mathcal{W}_N \subset \mathcal{W}$, we have $\widehat{B} \cap \mathcal{W} \cap \mathcal{W}_N = \widehat{B} \cap \mathcal{W}_N$. By the local large deviation estimate in hypothesis (ii)
\[
Q_N(Y_N \in \widehat{B} \cap \mathcal{W}) = \sum_{y \in \widehat{B} \cap \mathcal{W} \cap \mathcal{W}_N} Q_N(Y_N = y) = \sum_{y \in \widehat{B} \cap \mathcal{W}_N} Q_N(Y_N = y) = \sum_{y \in \widehat{B} \cap \mathcal{W}_N} \exp[-N(I(y) - \varepsilon_N(y))].
\]
Exactly as in the proof of part (a), it follows that
\[
-I(\widehat{B} \cap \mathcal{W}_N) - \max_{y \in \widehat{B} \cap \mathcal{W}_N} \varepsilon_N(y) \le \frac{1}{N} \log Q_N(Y_N \in \widehat{B} \cap \mathcal{W}) \le -I(\widehat{B} \cap \mathcal{W}_N) + \max_{y \in \widehat{B} \cap \mathcal{W}_N} \varepsilon_N(y) + \frac{\log(\mathrm{card}(\mathcal{W}_N))}{N}.
\]
Since $\varepsilon_N(y) \to 0$ uniformly for $y \in \mathcal{W}_N$, by hypothesis (i) the proof is done once we show that
\[
\lim_{N \to \infty} I(\widehat{B} \cap \mathcal{W}_N) = I(\widehat{B} \cap \mathcal{W}). \tag{4.3}
\]
Since $\widehat{B} \cap \mathcal{W}_N \subset \widehat{B} \cap \mathcal{W}$, we have $I(\widehat{B} \cap \mathcal{W}) \le I(\widehat{B} \cap \mathcal{W}_N)$, which implies that
\[
I(\widehat{B} \cap \mathcal{W}) \le \liminf_{N \to \infty} I(\widehat{B} \cap \mathcal{W}_N).
\]
The limit in (4.3) is proved if we can show that
\[
\limsup_{N \to \infty} I(\widehat{B} \cap \mathcal{W}_N) \le I(\widehat{B} \cap \mathcal{W}). \tag{4.4}
\]
For any $\delta > 0$ there exists $y^\star \in \widehat{B} \cap \mathcal{W}$ such that $I(y^\star) \le I(\widehat{B} \cap \mathcal{W}) + \delta < \infty$. Hypothesis (iv) guarantees the existence of a sequence $y_N \in \mathcal{W}_N$ such that $y_N \to y^\star$ and $I(y_N) \to I(y^\star)$. Since for all sufficiently large $N$ we have $y_N \in \widehat{B} \cap \mathcal{W}_N$, it follows that $I(\widehat{B} \cap \mathcal{W}_N) \le I(y_N)$. Hence
\[
\limsup_{N \to \infty} I(\widehat{B} \cap \mathcal{W}_N) \le \lim_{N \to \infty} I(y_N) = I(y^\star) \le I(\widehat{B} \cap \mathcal{W}) + \delta.
\]
Sending $\delta \to 0$ gives (4.4) and thus proves the limit (4.3). This completes the proof of part (b) and thus the proof of the theorem.

We now prove Theorem 4.1 as an application of Theorem 4.2. In Theorem 4.2 we make the following identifications for $N \in \mathbb{N}$.

• The probability spaces $(\Omega_N, \mathcal{F}_N, Q_N)$ are $(\Omega_{N,b,m}, \mathcal{F}_{N,b,m}, P_{N,b,m})$, where $\Omega_{N,b,m}$ is the set defined in (2.1), $\mathcal{F}_{N,b,m}$ is the $\sigma$-algebra of all subsets of $\Omega_{N,b,m}$, and $P_{N,b,m}$ is the conditional probability defined in (2.3).

• $\mathcal{X}$ equals $\mathcal{P}_{\mathbb{N}_b}$, $\mathcal{W}$ equals $\mathcal{P}_{\mathbb{N}_b,c}$, and $\mathcal{Z}$ equals $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. These spaces have the properties postulated in Theorem 4.2: $\mathcal{P}_{\mathbb{N}_b}$ is a complete, separable metric space; $\mathcal{P}_{\mathbb{N}_b,c}$ is a relatively compact subset of $\mathcal{P}_{\mathbb{N}_b}$ that is not closed; and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. The properties of $\mathcal{P}_{\mathbb{N}_b}$ are proved in Theorems 3.3.1 and 3.1.7 of [14], and the properties of $\mathcal{P}_{\mathbb{N}_b,c}$ and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ are proved in Theorem 2.4.

• The random vectors $Y_N$ equal $\Theta_{N,b}$, where $\Theta_{N,b}$ is the number-density measure defined in (2.4). $\Theta_{N,b}$ maps $\Omega_{N,b,m}$ into the subspace $\mathcal{W} = \mathcal{P}_{\mathbb{N}_b,c}$ of $\mathcal{P}_{\mathbb{N}_b}$.

• The function $I$ is the relative entropy $R(\cdot \,|\, \rho_{b,\alpha_b(c)})$ on $\mathcal{P}_{\mathbb{N}_b}$. $R(\cdot \,|\, \rho_{b,\alpha_b(c)})$ maps $\mathcal{P}_{\mathbb{N}_b}$ into $[0,\infty]$ [Thm. A.1(a)], as specified in the third sentence of Theorem 4.2.

• The range $\mathcal{W}_N$ of $Y_N = \Theta_{N,b}$ is the set of probability measures $\theta_{N,b,\nu} \in B_{N,b,m}$, the components of which are specified in (3.2). The set $B_{N,b,m} \subset \mathcal{P}_{\mathbb{N}_b,c}$ is defined in (3.3).

We now verify that the four hypotheses of Theorem 4.2 are valid in the setting of Theorem 4.1.

Verification of hypothesis (i) in Theorem 4.2. In the setting of Theorem 4.1, $\mathcal{W}_N$ is the range of $\Theta_{N,b}(\omega)$ for $\omega \in \Omega_{N,b,m}$. This range is $B_{N,b,m}$, the elements of which are in one-to-one correspondence with the elements of the set $A_{N,b,m}$ defined in (3.1). As shown in part (a) of Lemma 3.3
\[
0 \le \frac{\log \mathrm{card}(\mathcal{W}_N)}{N} = \frac{\log \mathrm{card}(A_{N,b,m})}{N} \to 0 \quad \text{as } N \to \infty.
\]
This completes the verification of hypothesis (i) in Theorem 4.2.
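Hypothesis (i) is exactly what drives the Laplace-type bounds in the proof of Theorem 4.2: a subexponential number of summands cannot change the exponential decay rate of the dominant term. A toy numerical illustration follows; the rate values $I(y)$ below are invented purely for the demonstration, with $\mathrm{card}(\mathcal{W}_N) = N^2$ standing in for the subexponential cardinality:

```python
import math

def laplace_rate(N):
    """(1/N) log sum_y exp(-N I(y)) over a finite range with card = N^2."""
    card = N * N                                  # subexponential, as in hypothesis (i)
    rates = [0.25 + k / card for k in range(card)]  # hypothetical rates, minimum 0.25
    log_sum = math.log(sum(math.exp(-N * r) for r in rates))
    return log_sum / N

# The normalized log-sum converges to -min I = -0.25 as N grows.
for N in [10, 50, 200]:
    print(N, laplace_rate(N))
```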
Verification of hypothesis (ii) in Theorem 4.2. In the setting of Theorem 4.1, hypothesis (ii) in Theorem 4.2 is given by the local estimate in part (b) of Theorem 3.1. As shown there, the error $\varepsilon_N(\nu) \to 0$ as $N \to \infty$ uniformly for $\nu \in A_{N,b,m}$. Since there is a one-to-one correspondence between $\nu \in A_{N,b,m}$ and $\theta \in B_{N,b,m}$, the error in part (b) of Theorem 3.1 converges to 0 uniformly for $\theta \in B_{N,b,m}$, which is the range of $\Theta_{N,b}(\omega)$ for $\omega \in \Omega_{N,b,m}$. This completes the verification of hypothesis (ii) in Theorem 4.2.
Verification of hypothesis (iii) in Theorem 4.2. The fact that there exists a dense subset of $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ for which $R(\theta \,|\, \rho_{b,\alpha_b(c)}) < \infty$ is proved in Corollary B.2. This completes the verification of hypothesis (iii) in Theorem 4.2.

Verification of hypothesis (iv) in Theorem 4.2. In Theorem B.1 we prove that for any $\alpha \in (0,\infty)$ and any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ satisfying $R(\theta \,|\, \rho_{b,\alpha}) < \infty$ there exists a sequence $\theta^{(N)} \in B_{N,b,m}$ for which $\theta^{(N)} \Rightarrow \theta$ and $R(\theta^{(N)} \,|\, \rho_{b,\alpha}) \to R(\theta \,|\, \rho_{b,\alpha})$ as $N \to \infty$. In particular, this property holds for $\alpha = \alpha_b(c)$. This completes the verification of hypothesis (iv) in Theorem 4.2.

Having verified the four hypotheses of Theorem 4.2 in the context of Theorem 4.1, we have finished the proof of the latter theorem from the former theorem.

Theorem 2.1 states the LDP for the number-density measures $\Theta_{N,b}$ in the droplet model. In order to complete the proof of Theorem 2.1, we show how to lift the large deviation limit for open balls in Theorem 4.1 to the large deviation upper bound for compact sets and for closed sets in $\mathcal{P}_{\mathbb{N}_b,c}$ and the large deviation lower bound for open sets in $\mathcal{P}_{\mathbb{N}_b,c}$. This procedure is carried out as an application of Theorem 4.3, a general result formulated in a setting close to that of Theorem 4.2. In Theorem 4.3 the assumption in Theorem 4.2 on the function $I$ is strengthened to the assumption that $I$ is lower semicontinuous on $\mathcal{X}$.

The LDP in the next theorem has a number of unique features because $\mathcal{W}$ is not a closed subset of $\mathcal{X}$. The large deviation upper bound takes two forms depending on whether the subset $F$ of $\mathcal{W}$ is compact or whether $F$ is closed. When $F$ is compact, in part (a) we obtain the standard large deviation bound for $F$ with $-I(F)$ on the right hand side.
When $F$ is closed, in part (b) we obtain a different form of the standard large deviation upper bound; $-I(F)$ on the right hand side is replaced by $-I(\overline{F})$, where $\overline{F}$ is the closure of $F$ in the compact space $\mathcal{Z}$. When $F$ is compact, its closure in the compact space $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is $F$ itself. In this case the large deviation upper bounds in parts (a) and (b) coincide.

Theorem 4.3.
For $N \in \mathbb{N}$ let $(\Omega_N, \mathcal{F}_N, Q_N)$ be a sequence of probability spaces. Let $\mathcal{X}$ be a complete, separable metric space, $\mathcal{W}$ a relatively compact subset of $\mathcal{X}$ that is not closed and thus not compact, and $\mathcal{Z}$ the closure of $\mathcal{W}$ in $\mathcal{X}$; thus $\mathcal{Z}$ is compact. Also let $Y_N$ be a sequence of random vectors mapping $\Omega_N$ into $\mathcal{W}$, and let $I$ be a lower semicontinuous function mapping $\mathcal{X}$ into $[0,\infty]$. We assume the following two limits: for any open ball $B$ in $\mathcal{W}$
\[
\lim_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in B) = -I(B) \tag{4.5}
\]
and for any open ball $\widehat{B}$ in $\mathcal{Z}$
\[
\lim_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in \widehat{B} \cap \mathcal{W}) = -I(\widehat{B} \cap \mathcal{W}). \tag{4.6}
\]
Then, as $N \to \infty$, with respect to the measures $Q_N$, the sequence $Y_N$ satisfies the LDP on $\mathcal{W}$ with rate function $I$ in the following sense.

(a) For any compact subset $F$ of $\mathcal{W}$ we have the large deviation upper bound
\[
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \le -I(F).
\]

(b) For any closed subset $F$ of $\mathcal{W}$ we have the large deviation upper bound
\[
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \le -I(\overline{F}),
\]
where $\overline{F}$ denotes the closure of $F$ in $\mathcal{Z}$.

(c) For any open subset $G$ of $\mathcal{W}$ we have the large deviation lower bound
\[
\liminf_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in G\} \ge -I(G).
\]

Theorem 2.1 is an immediate consequence of this theorem, Theorem 4.1, and Theorem A.1. Part (a) of Theorem 4.1 proves the large deviation limit for any open ball in $\mathcal{P}_{\mathbb{N}_b,c}$, which corresponds to the limit (4.5) in Theorem 4.3. Part (b) of Theorem 4.1 proves the large deviation limit for $\widehat{B} \cap \mathcal{W}$, where $\widehat{B}$ is any open ball in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. This corresponds to the limit (4.6) in Theorem 4.3. In the application to Theorem 2.1, $\mathcal{W}$ is the relatively compact, nonclosed subset $\mathcal{P}_{\mathbb{N}_b,c}$ of $\mathcal{X} = \mathcal{P}_{\mathbb{N}_b}$ and $\mathcal{Z}$ is the compact subset $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ of $\mathcal{P}_{\mathbb{N}_b}$. According to parts (a) and (b) of Theorem A.1, $R(\cdot \,|\, \rho_{b,\alpha_b(c)})$ maps $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0,\infty]$ and is lower semicontinuous on $\mathcal{P}_{\mathbb{N}_b}$, while part (d) of that theorem proves that $R(\cdot \,|\, \rho_{b,\alpha_b(c)})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$. This last property of the relative entropy is needed for part (a) of Theorem 2.1.
Proof of Theorem 4.3.
We prove the three large deviation bounds in the order (c), (a), and (b).

(c) Let $G$ be any open subset of $\mathcal{W}$. We denote by $\tau$ the metric on $\mathcal{X}$. For any point $x \in G$ there exists $\varepsilon > 0$ such that the open ball $B_\tau(x,\varepsilon) = \{ y \in \mathcal{W} : \tau(x,y) < \varepsilon \}$ is a subset of $G$. The limit (4.5) implies that
\[
\liminf_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in G) \ge \lim_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in B_\tau(x,\varepsilon)) = -I(B_\tau(x,\varepsilon)) \ge -I(x).
\]
Since $x$ is an arbitrary point in $G$, it follows that
\[
\liminf_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in G) \ge -\inf_{x \in G} I(x) = -I(G).
\]
This completes the proof of the large deviation lower bound for any open set $G$ in $\mathcal{W}$.

(a) Let $F$ be any compact subset of $\mathcal{W}$. We first prove the large deviation upper bound for $F$ under the assumption that $I(F) < \infty$. The proof when $I(F) = \infty$ is given afterward. We start by showing that for each $x \in F$
\[
\liminf_{\varepsilon \to 0^+} I(B_\tau(x,\varepsilon)) \ge I(F). \tag{4.7}
\]
Let $\varepsilon_n$ be any positive sequence converging to 0, and take any $\delta > 0$. For any $n \in \mathbb{N}$ there exists $x_n \in B_\tau(x,\varepsilon_n)$ such that $I(B_\tau(x,\varepsilon_n)) + \delta \ge I(x_n)$. Since $x_n \to x$, the lower semicontinuity of $I$ on $\mathcal{W}$ and the fact that $x \in F$ imply that
\[
\liminf_{n \to \infty} I(B_\tau(x,\varepsilon_n)) + \delta \ge \liminf_{n \to \infty} I(x_n) \ge I(x) \ge I(F).
\]
Sending $\delta \to 0$ yields (4.7) because $\varepsilon_n$ is an arbitrary positive sequence converging to 0.

We now prove the large deviation upper bound in part (a). Take any $\eta > 0$. By (4.7) for each $x \in F$ there exists $\varepsilon_x > 0$ such that
\[
I(B_\tau(x,\varepsilon_x)) \ge I(F) - \eta.
\]
The open balls $\{ B_\tau(x,\varepsilon_x),\ x \in F \}$ cover $F$. Since $F$ is compact, there exist $T < \infty$ and finitely many points $x_i \in F$, $i = 1, 2, \ldots, T$, such that $F \subset \bigcup_{i=1}^{T} B_\tau(x_i,\varepsilon_i)$, where $\varepsilon_i = \varepsilon_{x_i}$. It follows that
\[
\min_{i=1,2,\ldots,T} I(B_\tau(x_i,\varepsilon_i)) \ge I(F) - \eta.
\]
By Lemma 1.2.15 in [6] and by the limit (4.5) applied to $B = B_\tau(x_i,\varepsilon_i)$
\begin{align*}
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \tag{4.8}
&\le \limsup_{N \to \infty} \frac{1}{N} \log Q_N\Big( Y_N \in \bigcup_{i=1}^{T} B_\tau(x_i,\varepsilon_i) \Big) \\
&\le \limsup_{N \to \infty} \frac{1}{N} \log \Big( \sum_{i=1}^{T} Q_N(Y_N \in B_\tau(x_i,\varepsilon_i)) \Big)
\end{align*}
\begin{align*}
&= \max_{i=1,2,\ldots,T} \Big( \limsup_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in B_\tau(x_i,\varepsilon_i)) \Big) \\
&= -\min_{i=1,2,\ldots,T} I(B_\tau(x_i,\varepsilon_i)) \le -I(F) + \eta.
\end{align*}
Sending $\eta \to 0$, we obtain
\[
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \le -I(F).
\]
This completes the proof of the large deviation upper bound for any compact subset $F$ of $\mathcal{W}$ under the assumption that $I(F) < \infty$.

We now assume that $I(F) = \infty$, which implies that $I(x) = \infty$ for each $x \in F$. The proof of the large deviation upper bound when $I(F) = \infty$ rests on the assertion that for each $x \in F$ there exists $\varepsilon_x > 0$ such that $I(B_\tau(x,\varepsilon_x)) = \infty$. Indeed, if this assertion were false, then there would exist a sequence $x_n \in \mathcal{W}$ satisfying $I(x_n) < \infty$ and $x_n \to x$. Since $I$ is lower semicontinuous on $\mathcal{W}$, it would follow that $\liminf_{n \to \infty} I(x_n) \ge I(x) = \infty$, which in turn would imply that $I(x_n) = \infty$. This contradiction completes the proof that for each $x \in F$ there exists $\varepsilon_x > 0$ such that $I(B_\tau(x,\varepsilon_x)) = \infty$. As in the case when $I(F) < \infty$, the open balls $\{ B_\tau(x,\varepsilon_x),\ x \in F \}$ cover $F$. Since $F$ is compact, there exist $T < \infty$ and finitely many points $x_i \in F$, $i = 1,2,\ldots,T$, such that $F \subset \bigcup_{i=1}^{T} B_\tau(x_i,\varepsilon_i)$, where $\varepsilon_i = \varepsilon_{x_i}$. It follows that
\[
\min_{i=1,2,\ldots,T} I(B_\tau(x_i,\varepsilon_i)) = \infty = I(F).
\]
By the same steps as in (4.8)
\[
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \le -\min_{i=1,2,\ldots,T} I(B_\tau(x_i,\varepsilon_i)) = -\infty = -I(F).
\]
This completes the proof of the large deviation upper bound for any compact subset $F$ of $\mathcal{W}$ when $I(F) = \infty$. The proof of part (a) is complete.

(b) Let $F$ be any closed subset of $\mathcal{W}$. We claim that $F$ equals $\overline{F} \cap \mathcal{W}$, where $\overline{F}$ is the closure of $F$ in $\mathcal{Z}$. Since $\mathcal{Z}$ is compact, the closed subset $\overline{F}$ is also compact. Clearly $F \subset \overline{F} \cap \mathcal{W}$. On the other hand, any $x \in \overline{F} \cap \mathcal{W}$ is a limit point lying in $\mathcal{W}$ of a sequence $x_n$ in $F$. Since $F$ is closed in $\mathcal{W}$, any $x \in \overline{F} \cap \mathcal{W}$ lies in $F$. This completes the proof that $F = \overline{F} \cap \mathcal{W}$.
This is a special case of a general result in topology stated in Theorem 17.2 of [18].

We first prove the large deviation upper bound for $F$ under the assumption that $I(\overline{F}) < \infty$. The proof when $I(\overline{F}) = \infty$ is given afterward. The proof proceeds as in part (a), essentially by replacing the balls $B_\tau(x,\varepsilon)$ for $x \in \mathcal{W}$ by $\widehat{B}_\tau(x,\varepsilon) \cap \mathcal{W}$ for $x \in \mathcal{Z}$, where $\widehat{B}_\tau(x,\varepsilon) = \{ y \in \mathcal{Z} : \tau(x,y) < \varepsilon \}$. As in the proof of part (a), we start by showing that for each $x \in \overline{F}$
\[
\liminf_{\varepsilon \to 0^+} I(\widehat{B}_\tau(x,\varepsilon) \cap \mathcal{W}) \ge I(\overline{F}). \tag{4.9}
\]
Let $\varepsilon_n$ be any positive sequence converging to 0, and take any $\delta > 0$. For any $n \in \mathbb{N}$ there exists $x_n \in \widehat{B}_\tau(x,\varepsilon_n) \cap \mathcal{W}$ such that $I(\widehat{B}_\tau(x,\varepsilon_n) \cap \mathcal{W}) + \delta \ge I(x_n)$. Since $x_n \to x$, the lower semicontinuity of $I$ and the fact that $x \in \overline{F}$ imply that
\[
\liminf_{n \to \infty} I(\widehat{B}_\tau(x,\varepsilon_n) \cap \mathcal{W}) + \delta \ge \liminf_{n \to \infty} I(x_n) \ge I(x) \ge I(\overline{F}).
\]
Sending $\delta \to 0$ yields (4.9) because $\varepsilon_n$ is an arbitrary positive sequence converging to 0.

We now prove the large deviation upper bound in part (b). Take any $\eta > 0$. By (4.9) for each $x \in \overline{F}$ there exists $\varepsilon_x > 0$ such that
\[
I(\widehat{B}_\tau(x,\varepsilon_x) \cap \mathcal{W}) \ge I(\overline{F}) - \eta.
\]
The open balls $\{ \widehat{B}_\tau(x,\varepsilon_x),\ x \in \overline{F} \}$ cover $\overline{F}$. Since $\overline{F}$ is compact, there exist $T < \infty$ and finitely many points $x_i \in \overline{F}$, $i = 1,2,\ldots,T$, such that $\overline{F} \subset \bigcup_{i=1}^{T} \widehat{B}_\tau(x_i,\varepsilon_i)$, where $\varepsilon_i = \varepsilon_{x_i}$. It follows that
\[
\min_{i=1,2,\ldots,T} I(\widehat{B}_\tau(x_i,\varepsilon_i) \cap \mathcal{W}) \ge I(\overline{F}) - \eta \quad \text{and} \quad \overline{F} \cap \mathcal{W} \subset \bigcup_{i=1}^{T} \big( \widehat{B}_\tau(x_i,\varepsilon_i) \cap \mathcal{W} \big).
\]
Since $F = \overline{F} \cap \mathcal{W}$, we have again by Lemma 1.2.15 in [6]
\begin{align*}
\limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in F\} \tag{4.10}
&= \limsup_{N \to \infty} \frac{1}{N} \log Q_N\{Y_N \in \overline{F} \cap \mathcal{W}\} \\
&\le \limsup_{N \to \infty} \frac{1}{N} \log Q_N\Big( Y_N \in \bigcup_{i=1}^{T} \big( \widehat{B}_\tau(x_i,\varepsilon_i) \cap \mathcal{W} \big) \Big) \\
&\le \limsup_{N \to \infty} \frac{1}{N} \log \Big( \sum_{i=1}^{T} Q_N(Y_N \in \widehat{B}_\tau(x_i,\varepsilon_i) \cap \mathcal{W}) \Big) \\
&= \max_{i=1,2,\ldots,T} \Big( \limsup_{N \to \infty} \frac{1}{N} \log Q_N(Y_N \in \widehat{B}_\tau(x_i,\varepsilon_i) \cap \mathcal{W}) \Big).
\end{align*}
We now apply the limit (4.6) to B̂ ∩ W = B̂_τ(x_i, ε_i) ∩ W, obtaining

limsup_{N→∞} (1/N) log Q_N{Y_N ∈ F}   (4.11)
≤ max_{i=1,...,T} ( limsup_{N→∞} (1/N) log Q_N(Y_N ∈ B̂_τ(x_i, ε_i) ∩ W) )
= −min_{i=1,...,T} I(B̂_τ(x_i, ε_i) ∩ W) ≤ −I(F) + η.

Sending η → 0, we obtain

limsup_{N→∞} (1/N) log Q_N{Y_N ∈ F} ≤ −I(F).

This completes the proof of the large deviation upper bound for any closed subset F of W under the assumption that I(F) < ∞.

We now assume that I(F) = ∞, which implies that I(x) = ∞ for each x ∈ F. The proof of the large deviation upper bound when I(F) = ∞ rests on the assertion that for each x ∈ F̄ there exists ε_x > 0 such that I(B̂_τ(x, ε_x) ∩ W) = ∞. As in the proof of part (a), this assertion is a consequence of the lower semicontinuity of I. As in the proof of the large deviation upper bound when I(F) < ∞, the open balls {B̂_τ(x, ε_x), x ∈ F̄} cover F̄. Since F̄ is compact, there exist T < ∞ and finitely many points x_i ∈ F̄, i = 1, 2, ..., T, such that F̄ ⊂ ∪_{i=1}^T B̂_τ(x_i, ε_i), where ε_i = ε_{x_i}. It follows that

min_{i=1,...,T} I(B̂_τ(x_i, ε_i) ∩ W) = ∞ = I(F) and F̄ ∩ W ⊂ ∪_{i=1}^T (B̂_τ(x_i, ε_i) ∩ W).

By the same steps as in (4.10) and (4.11)

limsup_{N→∞} (1/N) log Q_N{Y_N ∈ F} = limsup_{N→∞} (1/N) log Q_N{Y_N ∈ F̄ ∩ W} ≤ −min_{i=1,...,T} I(B̂_τ(x_i, ε_i) ∩ W) = −∞ = −I(F).

This completes the proof of the large deviation upper bound for any closed subset F of W when I(F) = ∞. The proof of part (b) as well as the proof of the theorem are done.

This paper contains four appendices. In appendix A we prove properties of the relative entropy needed in the paper. Theorem B.1 in appendix B states a basic approximation result that is applied in two crucial places in the paper. In appendix C we study a number of properties of the quantity α_b(c) appearing in part (a) of Theorem 3.1.
In appendix D we discuss why we impose the constraint involving m = m(N) in the definitions of Ω_{N,b,m} in (2.1) and P_{N,b,m} in (2.3), and how, if this constraint could be eliminated, our results could be formulated in a more natural way.
Appendices
A Properties of Relative Entropy
We fix a nonnegative integer b and a real number c ∈ (b, ∞). Given a probability measure θ on N_b = {n ∈ Z : n ≥ b}, the mean ∫_{N_b} x θ(dx) of θ is denoted by ⟨θ⟩. In Theorem A.1 we study properties of the relative entropies R(θ|ρ_{b,α}) and R(θ|ρ_{b,α_b(c)}) for θ in each of the following three spaces: P_{N_b}, the set of probability measures on N_b; P_{N_b,c}, the set of θ ∈ P_{N_b} satisfying ⟨θ⟩ = c; and P_{N_b,[b,c]}, the set of θ ∈ P_{N_b} satisfying ⟨θ⟩ ∈ [b, c]. The Prohorov metric induces a topology on P_{N_b} that is equivalent to the topology of weak convergence. These three spaces have the following properties: P_{N_b} is a complete, separable metric space; P_{N_b,c} is a relatively compact, separable subset of P_{N_b} that is not closed in P_{N_b} and therefore is not complete; P_{N_b,[b,c]} is the closure of P_{N_b,c} in P_{N_b} and is a compact, separable metric space. The properties of P_{N_b} are proved in Theorems 3.3.1 and 3.1.7 of [14], and the properties of P_{N_b,c} and P_{N_b,[b,c]} are proved in Theorem 2.4.

We recall that for α ∈ (0, ∞), ρ_{b,α} denotes the Poisson distribution on N_b having components

ρ_{b,α;j} = (1/Z_b(α)) · α^j/j! for j ∈ N_b,

where Z_0(α) = e^α and, for b ∈ N, Z_b(α) = e^α − Σ_{j=0}^{b−1} α^j/j!. According to part (a) of Theorem 3.1 there exists a unique value α = α_b(c) for which ⟨ρ_{b,α_b(c)}⟩ = c; thus ρ_{b,α_b(c)} lies in P_{N_b,c}. Assertion (ii) in part (f) of the next theorem plays an important role in the main part of the paper. After the statement of Lemma 3.3 we use this assertion to show that the arbitrary parameter α in Lemmas 3.2 and 3.3 must have the value α_b(c) in Theorem 3.1.

Theorem A.1.
Fix a nonnegative integer b and a real number c ∈ (b, ∞). For any α ∈ (0, ∞) the relative entropy R(θ|ρ_{b,α}) = Σ_{j∈N_b} θ_j log(θ_j/ρ_{b,α;j}) has the following properties.

(a) R(·|ρ_{b,α}) maps P_{N_b} into [0, ∞], and for θ ∈ P_{N_b}, R(θ|ρ_{b,α}) = 0 if and only if θ = ρ_{b,α}.

(b) R(·|ρ_{b,α}) is a convex, lower semicontinuous function on P_{N_b}. In other words, for θ and σ in P_{N_b}, λ ∈ (0, 1), and θ^(N) a sequence in P_{N_b} converging weakly to θ,

R(λθ + (1 − λ)σ | ρ_{b,α}) ≤ λR(θ|ρ_{b,α}) + (1 − λ)R(σ|ρ_{b,α})

and

liminf_{N→∞} R(θ^(N)|ρ_{b,α}) ≥ R(θ|ρ_{b,α}).

(c) R(·|ρ_{b,α}) is a strictly convex function on the set A = {θ ∈ P_{N_b} : R(θ|ρ_{b,α}) < ∞}. In other words, if θ ≠ σ are two measures in A, then for λ ∈ (0, 1)

R(λθ + (1 − λ)σ | ρ_{b,α}) < λR(θ|ρ_{b,α}) + (1 − λ)R(σ|ρ_{b,α}).

(d) R(·|ρ_{b,α}) has compact level sets in P_{N_b}, in P_{N_b,[b,c]}, and in P_{N_b,c}. In other words, for Y equal to any of these three spaces and any M < ∞, the set {θ ∈ Y : R(θ|ρ_{b,α}) ≤ M} is a compact subset of Y.

(e) Define

g(α, b, c) = log Z_b(α) − c log α − (log Z_b(α_b(c)) − c log α_b(c)),

where Z_0(α) = e^α and, for b ∈ N, Z_b(α) = e^α − Σ_{j=0}^{b−1} α^j/j!. Then for any θ ∈ P_{N_b,c}

R(θ|ρ_{b,α}) = R(θ|ρ_{b,α_b(c)}) + g(α, b, c).

(f) The following two assertions hold.

(i) R(θ|ρ_{b,α}) attains its infimum over θ ∈ P_{N_b,c} at the unique measure θ = ρ_{b,α_b(c)}, and

min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}) = R(ρ_{b,α_b(c)}|ρ_{b,α}) = g(α, b, c).

(ii) For any θ ∈ P_{N_b,c}, R(θ|ρ_{b,α}) is related to R(θ|ρ_{b,α_b(c)}) by the formula

R(θ|ρ_{b,α}) − min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}) = R(θ|ρ_{b,α_b(c)}).

Proof. (a)–(c) These properties are proved in Lemma 1.4.1 and in part (b) of Lemma 1.4.3 in [8].

(d) The fact that R(·|ρ_{b,α}) has compact level sets in P_{N_b} is proved in part (c) of Lemma 1.4.3 in [8].
According to part (b) of Theorem 2.4, P_{N_b,[b,c]} is a compact subset of P_{N_b}. Hence for any M < ∞

{θ ∈ P_{N_b,[b,c]} : R(θ|ρ_{b,α}) ≤ M} = {θ ∈ P_{N_b} : R(θ|ρ_{b,α}) ≤ M} ∩ P_{N_b,[b,c]}
is a compact subset of P_{N_b,[b,c]}. This completes the proof that R(·|ρ_{b,α}) has compact level sets in P_{N_b,[b,c]}.

Because P_{N_b,c} is not a closed subset of P_{N_b,[b,c]} [Thm. 2.4(a)], the proof that R(·|ρ_{b,α}) has compact level sets in P_{N_b,c} is more subtle. If θ^(n) is any sequence in P_{N_b,c} satisfying R(θ^(n)|ρ_{b,α}) ≤ M, then since θ^(n) ∈ P_{N_b} and R(·|ρ_{b,α}) has compact level sets in P_{N_b}, there exist θ ∈ P_{N_b} and a subsequence θ^(n′) such that θ^(n′) ⇒ θ and R(θ|ρ_{b,α}) ≤ M. To complete the proof that R(·|ρ_{b,α}) has compact level sets in P_{N_b,c}, we must show that θ ∈ P_{N_b,c}; i.e., that ⟨θ⟩ = c. By Fatou's Lemma

⟨θ⟩ ≤ liminf_{n′→∞} ⟨θ^(n′)⟩ = c.

In addition, for any w ∈ (0, ∞)

∫_{N_b} e^{wx} ρ_{b,α}(dx) = Σ_{j∈N_b} e^{wj} ρ_{b,α;j} = (1/Z_b(α)) · Σ_{j∈N_b} e^{wj} α^j/j! ≤ (1/Z_b(α)) · exp(αe^w) < ∞.

Lemma 5.1 in [7] implies that the sequence θ^(n′) is uniformly integrable; i.e.,

lim_{D→∞} sup_{n′} ∫_{{x ∈ N_b : x ≥ D}} x θ^(n′)(dx) = 0.

These properties of θ and θ^(n′) imply that c = lim_{n′→∞} ⟨θ^(n′)⟩ = ⟨θ⟩ [14, Appendix, Prop. 2.3]. This completes the proof that R(·|ρ_{b,α}) has compact level sets in P_{N_b,c}. The proof of part (d) is finished.

(e) For any θ ∈ P_{N_b,c} we have Σ_{j∈N_b} θ_j = 1 and Σ_{j∈N_b} jθ_j = c. Hence

R(θ|ρ_{b,α}) = Σ_{j∈N_b} θ_j log(θ_j/ρ_{b,α;j})
= Σ_{j∈N_b} θ_j log(θ_j/ρ_{b,α_b(c);j}) + Σ_{j∈N_b} θ_j log(ρ_{b,α_b(c);j}/ρ_{b,α;j})
= R(θ|ρ_{b,α_b(c)}) + Σ_{j∈N_b} θ_j log( ([α_b(c)]^j / (Z_b(α_b(c)) j!)) · (Z_b(α) j! / α^j) )
= R(θ|ρ_{b,α_b(c)}) + Σ_{j∈N_b} θ_j log(Z_b(α)/Z_b(α_b(c))) + Σ_{j∈N_b} jθ_j log(α_b(c)/α)
= R(θ|ρ_{b,α_b(c)}) + log(Z_b(α)/Z_b(α_b(c))) + c log(α_b(c)/α)
= R(θ|ρ_{b,α_b(c)}) + g(α, b, c).

(f) (i) Since R(·|ρ_{b,α}) has compact level sets in P_{N_b,c}, it attains its infimum over P_{N_b,c}.
By part (a), R(·|ρ_{b,α_b(c)}) attains its minimum value of 0 over P_{N_b,c} at the unique measure ρ_{b,α_b(c)}. Hence part (e) implies that the minimum value of R(·|ρ_{b,α}) over P_{N_b,c} equals

min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}) = min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α_b(c)}) + g(α, b, c) = g(α, b, c) = R(ρ_{b,α_b(c)}|ρ_{b,α_b(c)}) + g(α, b, c) = R(ρ_{b,α_b(c)}|ρ_{b,α}).

The last equality follows by applying part (e) with θ = ρ_{b,α_b(c)}. This display shows that R(·|ρ_{b,α}) attains its infimum over P_{N_b,c} at ρ_{b,α_b(c)}. Let us assume that R(·|ρ_{b,α}) attains its infimum over P_{N_b,c} at another measure θ⋆ ≠ ρ_{b,α_b(c)}. Then for any λ ∈ (0, 1) we have λρ_{b,α_b(c)} + (1 − λ)θ⋆ ∈ P_{N_b,c}. The strict convexity of R(·|ρ_{b,α}) in part (c) yields

min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}) ≤ R(λρ_{b,α_b(c)} + (1 − λ)θ⋆ | ρ_{b,α}) < λR(ρ_{b,α_b(c)}|ρ_{b,α}) + (1 − λ)R(θ⋆|ρ_{b,α}) = min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}).

The equality of the extreme terms contradicts the strict inequality, proving that R(·|ρ_{b,α}) attains its infimum over P_{N_b,c} at the unique measure ρ_{b,α_b(c)}. This completes the proof of assertion (i) in part (f).

(ii) By assertion (i), min_{θ∈P_{N_b,c}} R(θ|ρ_{b,α}) = g(α, b, c). Substituting this into part (e) yields assertion (ii). This completes the proof of part (f). The proof of Theorem A.1 is done.

This completes our discussion of properties of the relative entropy. The main theorem in appendix B is a basic approximation result that is applied in two crucial places in the paper.

B Approximating θ ∈ P_{N_b,c} by θ^(N) ∈ B_{N,b,m}
Fix a nonnegative integer b and a rational number c ∈ (b, ∞). P_{N_b,c} is the set of probability measures on N_b = {n ∈ Z : n ≥ b} having mean c. We recall the definitions of the sets A_{N,b,m} and B_{N,b,m}, which are introduced at the beginning of section 3:

A_{N,b,m} = { ν = {ν_j, j ∈ N_b} ∈ (N_0)^{N_b} : Σ_{j∈N_b} ν_j = N, Σ_{j∈N_b} jν_j = K, and |ν|_+ ≤ m = m(N) }

and

B_{N,b,m} = {θ ∈ P_{N_b,c} : θ_j = ν_j/N for j ∈ N_b for some ν ∈ A_{N,b,m}}.
In the formula defining A_{N,b,m}, N_0 is the set of nonnegative integers and |ν|_+ = card{j ∈ N_b : ν_j ≥ 1}. The quantities K and m are functions of N as N → ∞: K = Nc, and m is the function m(N) appearing in the definition of Ω_{N,b,m} in (2.1) and satisfying m(N) → ∞ and m(N)²/N → 0 as N → ∞.

Our goal in this appendix is to prove the approximation theorem, Theorem B.1, and Corollary B.2. The theorem is applied in two crucial places in the paper. It is first applied near the end of the proof of Lemma 3.3 to prove the limit in (3.22) and thus to complete the proof of that lemma. Theorem B.1 is also needed to verify hypothesis (iv) in Theorem 4.2 in the setting of Theorem 4.1. Theorem 4.2 is applied to lift the local large deviation estimate in part (b) of Theorem 3.1 to the large deviation limit for open balls and certain other subsets in Theorem 4.1.

Because R(·|ρ_{b,α}) is lower semicontinuous on P_{N_b} [Thm. A.1(b)], the weak convergence in part (a) of the next theorem implies that liminf_{N→∞} R(θ^(N)|ρ_{b,α}) ≥ R(θ|ρ_{b,α}). The proof of the convergence R(θ^(N)|ρ_{b,α}) → R(θ|ρ_{b,α}) in part (b) requires the finiteness of R(θ|ρ_{b,α}) and special properties of the sequence θ^(N) proved in Lemma B.3.

Theorem B.1.
Fix a nonnegative integer b and a rational number c ∈ (b, ∞), and let θ be any probability measure in P_{N_b,c}. Let m be the function m(N) appearing in the definition of Ω_{N,b,m} in (2.1) and satisfying m(N) → ∞ and m(N)²/N → 0 as N → ∞. Then for any α ∈ (0, ∞) there exists a sequence θ^(N) ∈ B_{N,b,m} for which the following properties hold.

(a) θ^(N) ⇒ θ as N → ∞.

(b) If R(θ|ρ_{b,α}) < ∞, then R(θ^(N)|ρ_{b,α}) → R(θ|ρ_{b,α}) as N → ∞.

We also need the following corollary, which is applied to verify hypothesis (iii) in Theorem 4.2 in the setting of Theorem 4.1. It also shows that P_{N_b,c} is separable, a fact needed in parts (a) and (b) of Theorem 2.4.

Corollary B.2.
Fix a nonnegative integer b and a rational number c ∈ (b, ∞). Let m be the function m(N) appearing in the definition of Ω_{N,b,m} in (2.1) and satisfying m(N) → ∞ and m(N)²/N → 0 as N → ∞. Then there exists a countable dense subset of P_{N_b,c} consisting of θ ∈ P_{N_b,c} for which R(θ|ρ_{b,α_b(c)}) < ∞. This countable dense subset is ∪_{N∈N} B_{N,b,m}, where B_{N,b,m} is defined at the beginning of this section. It follows that P_{N_b,c} is separable.

Proof.
Given any θ ∈ P_{N_b,c} and any ε > 0, let B_π(θ, ε) denote the open ball with center θ and radius ε defined in terms of the Prohorov metric π. We apply part (a) of Theorem B.1 with α = α_b(c). Since the measures θ^(N) constructed in part (a) of that theorem converge weakly to θ, for all sufficiently large N we have θ^(N) ∈ B_π(θ, ε). The fact that only finitely many of the components θ^(N)_j are nonzero implies that R(θ^(N)|ρ_{b,α_b(c)}) < ∞ for all N. Since ∪_{N∈N} B_{N,b,m} is a countable set, the proof is complete.

Given θ ∈ P_{N_b,c}, we determine a sequence ν^(N) ∈ A_{N,b,m} such that the probability measures θ^(N) with components θ^(N)_j = ν^(N)_j/N have the properties stated in parts (a) and (b) of Theorem B.1. We start by defining

j⋆ = min{j ∈ N_b : θ_j > 0}.

For example, for the Poisson distribution ρ_{b,α_b(c)} defined in part (a) of Theorem 2.1, j⋆ = b since for j ∈ N_b all the components ρ_{b,α_b(c);j} are positive.

We next define the components ν^(N)_j of ν^(N) for all j ∈ N_b except for the two values j = j⋆ and j = j⋆ + 1. The two components corresponding to these two values of j will then be defined so that ν^(N) satisfies the two summation constraints in the definition of A_{N,b,m}. In order to simplify the notation, the components ν^(N)_j are written as ν_j. For x ∈ R we denote by ⌊x⌋ the largest integer less than or equal to x. The definition of the components is the following:

ν_j = 0 if b ≤ j ≤ j⋆ − 1; ν_j = ⌊Nθ_j⌋ if j⋆ + 2 ≤ j ≤ j⋆ + m − 1; ν_j = 0 if j ≥ j⋆ + m.   (B.1)

We make a few simple observations. If j⋆ = b, then the first line of this definition is vacuous. For j⋆ + 2 ≤ j ≤ j⋆ + m − 1,

θ_j − 1/N ≤ ν_j/N ≤ θ_j for all N and lim_{N→∞} ν_j/N = θ_j.   (B.2)

In addition, for b ≤ j ≤ j⋆ − 1, we have ν_j/N = 0 = θ_j.
If for some j satisfying j⋆ + 2 ≤ j ≤ j⋆ + m − 1 we have θ_j = 0, then ν_j = 0.

We now define ν_j for j = j⋆ and j = j⋆ + 1 so that ν_j/N → θ_j for these two values and so that the following two summation constraints in the definition of A_{N,b,m} are valid:

Σ_{j∈N_b} ν_j = N and Σ_{j∈N_b} jν_j = K.   (B.3)

With these definitions of ν_{j⋆} and ν_{j⋆+1}, we have |ν|_+ ≤ m. According to part (d) of Lemma B.3, the resulting vector ν lies in A_{N,b,m} for all sufficiently large N.

In order to keep the notation manageable, we introduce the set of m − 2 indices

Φ(j⋆, m) = {j ∈ N_b : j⋆ + 2 ≤ j ≤ j⋆ + m − 1}.

Since ν_j = 0 for b ≤ j ≤ j⋆ − 1 and for j ≥ j⋆ + m, the two equalities in (B.3) can be rewritten in the form

ν_{j⋆} + ν_{j⋆+1} = N − Σ_{j∈Φ(j⋆,m)} ν_j   (B.4)

and

j⋆ν_{j⋆} + (j⋆ + 1)ν_{j⋆+1} = K − Σ_{j∈Φ(j⋆,m)} jν_j.   (B.5)

These are two linear equations for the two unknowns ν_{j⋆} and ν_{j⋆+1}. Solving them for the two unknowns and inserting ν_j = ⌊Nθ_j⌋ for j ∈ Φ(j⋆, m), we obtain the following definitions of ν_{j⋆} and ν_{j⋆+1}:

ν_{j⋆} = (j⋆ + 1)N − K + Σ_{j∈Φ(j⋆,m)} jν_j − (j⋆ + 1) Σ_{j∈Φ(j⋆,m)} ν_j = (j⋆ + 1)N − K + Σ_{j∈Φ(j⋆,m)} j⌊Nθ_j⌋ − (j⋆ + 1) Σ_{j∈Φ(j⋆,m)} ⌊Nθ_j⌋   (B.6)

and

ν_{j⋆+1} = K − j⋆N − Σ_{j∈Φ(j⋆,m)} jν_j + j⋆ Σ_{j∈Φ(j⋆,m)} ν_j = K − j⋆N − Σ_{j∈Φ(j⋆,m)} j⌊Nθ_j⌋ + j⋆ Σ_{j∈Φ(j⋆,m)} ⌊Nθ_j⌋.   (B.7)

The next lemma states a number of facts about ν_j for j ∈ N_b that are needed to prove Theorem B.1. Parts (a) and (b) give upper and lower bounds on ν_{j⋆} and ν_{j⋆+1} that follow from (B.6) and (B.7). The reason for imposing the condition that m²/N → 0 as N → ∞ in Theorem B.1 is the appearance of this quantity as an error term in parts (a) and (b). Part (c) focuses on the convergence of ν_j/N to θ_j for j⋆ ≤ j ≤ j⋆ + m − 1.
Part (d) shows that for all sufficiently large N the vector ν^(N) with components ν_j is an element of A_{N,b,m} and the measure θ^(N) with components θ^(N)_j = ν_j/N for j ∈ N_b is an element of B_{N,b,m} ⊂ P_{N_b,c}. In order to prove part (b) of Theorem B.1 concerning the convergence R(θ^(N)|ρ_{b,α}) → R(θ|ρ_{b,α}), we will use the fact, stated in part (e), that for all j ∈ N_b satisfying j ≠ j⋆ + 1 we have θ^(N)_j = ν_j/N ≤ θ_j for all N. The conclusion of part (f) is that such a bound does not exist for j = j⋆ + 1 and that in general there does not exist M < ∞ such that for any N ∈ N, ν_{j⋆+1}/N ≤ Mθ_{j⋆+1}.

Lemma B.3.
Fix a nonnegative integer b and a rational number c ∈ (b, ∞), and let θ be any probability measure in P_{N_b,c}. Let m be the function m(N) appearing in the definition of Ω_{N,b,m} in (2.1) and satisfying m(N) → ∞ and m(N)²/N → 0 as N → ∞. We define β_m = Σ_{j≥j⋆+m} θ_j and γ_m = Σ_{j≥j⋆+m} jθ_j; since θ ∈ P_{N_b,c}, β_m → 0 and γ_m → 0 as N → ∞. The following conclusions hold.

(a) ν_{j⋆} satisfies the inequalities

Nθ_{j⋆} ≥ ν_{j⋆} ≥ N( θ_{j⋆} + (j⋆ + 1)β_m − γ_m − m²/N ).

(b) ν_{j⋆+1} satisfies the inequalities

N( θ_{j⋆+1} + γ_m − j⋆β_m + m²/N ) ≥ ν_{j⋆+1} ≥ N( θ_{j⋆+1} + γ_m − j⋆β_m ) ≥ Nθ_{j⋆+1}.

(c) For all j ∈ N_b we have lim_{N→∞} θ^(N)_j = lim_{N→∞} ν_j/N = θ_j.

(d) For all sufficiently large N the vector ν^(N) with components ν_j defined in (B.1), (B.6), and (B.7) is an element of A_{N,b,m}. Hence for all sufficiently large N the measure θ^(N) with components θ^(N)_j = ν_j/N for j ∈ N_b is an element of B_{N,b,m} ⊂ P_{N_b,c}.

(e) For all j ∈ N_b satisfying j ≠ j⋆ + 1 we have θ^(N)_j = ν_j/N ≤ θ_j for all N ∈ N.

(f) The upper bound θ^(N)_{j⋆+1} = ν_{j⋆+1}/N ≤ θ_{j⋆+1} does not hold for any N. On the other hand, if θ_{j⋆+1} > 0, then for all sufficiently large N we have ν_{j⋆+1}/N ≤ 2θ_{j⋆+1}. However, if θ_{j⋆+1} = 0, then in general there does not exist M < ∞ such that for any N ∈ N, ν_{j⋆+1}/N ≤ Mθ_{j⋆+1}.

Proof. (a) We first prove the lower bound. According to (B.2), ν_j ≥ N(θ_j − 1/N) for all j ∈ Φ(j⋆, m). Since for all j ∈ Φ(j⋆, m) we have j > j⋆ + 1, the first line of (B.6) implies that

ν_{j⋆} = N( j⋆ + 1 − c + Σ_{j∈Φ(j⋆,m)} (j − j⋆ − 1) ν_j/N )   (B.8)
≥ N( j⋆ + 1 − c + Σ_{j∈Φ(j⋆,m)} (j − j⋆ − 1)(θ_j − 1/N) )
= N( (j⋆ + 1)(1 − Σ_{j∈Φ(j⋆,m)} θ_j) − c + Σ_{j∈Φ(j⋆,m)} jθ_j − Σ_{j∈Φ(j⋆,m)} (j − j⋆ −
1) · (1/N) ).

We now use the facts that θ_j = 0 for b ≤ j ≤ j⋆ − 1, Σ_{j∈N_b} θ_j = 1, and Σ_{j∈N_b} jθ_j = c to calculate

Σ_{j∈Φ(j⋆,m)} jθ_j = Σ_{j=j⋆+2}^{j⋆+m−1} jθ_j = Σ_{j∈N_b} jθ_j − j⋆θ_{j⋆} − (j⋆ + 1)θ_{j⋆+1} − γ_m = c − j⋆θ_{j⋆} − (j⋆ + 1)θ_{j⋆+1} − γ_m   (B.9)

and

Σ_{j∈Φ(j⋆,m)} θ_j = Σ_{j=j⋆+2}^{j⋆+m−1} θ_j = Σ_{j∈N_b} θ_j − θ_{j⋆} − θ_{j⋆+1} − β_m = 1 − θ_{j⋆} − θ_{j⋆+1} − β_m.   (B.10)

In addition

Σ_{j∈Φ(j⋆,m)} (j − j⋆ −
1) · (1/N) = (1/N) Σ_{j=1}^{m−2} j = (m − 2)(m − 1)/(2N) ≤ m²/N.   (B.11)

Substituting (B.9), (B.10), and (B.11) into the last expression in (B.8), we conclude that

ν_{j⋆} ≥ N( θ_{j⋆} + (j⋆ + 1)β_m − γ_m − m²/N ).

This is the lower bound in part (a).

We now prove the upper bound in part (a). According to (B.2), ν_j ≤ Nθ_j for all j ∈ Φ(j⋆, m). Since for all j ∈ Φ(j⋆, m) we have j > j⋆ + 1, the first line of (B.6) implies that

ν_{j⋆} = N( j⋆ + 1 − c + Σ_{j∈Φ(j⋆,m)} (j − j⋆ − 1) ν_j/N ) ≤ N( j⋆ + 1 − c + Σ_{j∈Φ(j⋆,m)} (j − j⋆ − 1) θ_j ).

Except for the absence of the term containing 1/N, this is the same expression that appears in the second line of (B.8). Hence by a calculation similar to that yielding the lower bound in part (a),

ν_{j⋆} ≤ N( θ_{j⋆} + (j⋆ + 1)β_m − γ_m ).

We now use the fact that

(j⋆ + 1)β_m − γ_m = Σ_{j≥j⋆+m} ((j⋆ + 1) − j)θ_j ≤ 0.

It follows that ν_{j⋆} ≤ Nθ_{j⋆}. This is the upper bound in part (a). The proof of part (a) is complete.

(b) We first prove the upper bound. According to (B.2), ν_j ≥ N(θ_j − 1/N) for all j ∈ Φ(j⋆, m). Since for all j ∈ Φ(j⋆, m) we have j > j⋆, the first line of (B.7) implies that

ν_{j⋆+1} = N( c − j⋆ − Σ_{j∈Φ(j⋆,m)} (j − j⋆) ν_j/N )   (B.12)
≤ N( c − j⋆ − Σ_{j∈Φ(j⋆,m)} (j − j⋆)(θ_j − 1/N) )
= N( c − Σ_{j∈Φ(j⋆,m)} jθ_j − j⋆(1 − Σ_{j∈Φ(j⋆,m)} θ_j) + Σ_{j∈Φ(j⋆,m)} (j − j⋆) · (1/N) ).

As in the proof of (B.11),

Σ_{j∈Φ(j⋆,m)} (j − j⋆) · (1/N) ≤ m²/N.
Substituting this inequality as well as the equalities in (B.9) and (B.10) into the last expression in (B.12), we conclude that

ν_{j⋆+1} ≤ N( θ_{j⋆+1} + γ_m − j⋆β_m + m²/N ).

This is the upper bound in part (b).

We now prove the lower bound in part (b). According to (B.2), ν_j ≤ Nθ_j for all j ∈ Φ(j⋆, m). Since for all j ∈ Φ(j⋆, m) we have j > j⋆, the first line of (B.12) implies that

ν_{j⋆+1} = N( c − j⋆ − Σ_{j∈Φ(j⋆,m)} (j − j⋆) ν_j/N ) ≥ N( c − j⋆ − Σ_{j∈Φ(j⋆,m)} (j − j⋆) θ_j ).

Except for the absence of the term containing 1/N, this is the same expression that appears in the second line of (B.12). Hence by a calculation similar to that yielding the upper bound in part (b),

ν_{j⋆+1} ≥ N( θ_{j⋆+1} + γ_m − j⋆β_m ).

This is the second inequality in part (b). We now use the fact that

N( θ_{j⋆+1} + γ_m − j⋆β_m ) = Nθ_{j⋆+1} + N Σ_{j≥j⋆+m} (j − j⋆)θ_j ≥ Nθ_{j⋆+1}.

This is the third inequality in part (b). The proof of part (b) is complete.

(c) For j = j⋆ and j = j⋆ + 1 the limits lim_{N→∞} ν_j/N = θ_j are immediate consequences of parts (a) and (b) since each of the quantities β_m, γ_m, and m²/N converges to 0 as N → ∞. For j ∈ N_b satisfying j ≥ j⋆ + 2 the limit lim_{N→∞} ν_j/N = θ_j follows from (B.2) and the fact that m → ∞ as N → ∞. Finally, for j ∈ N_b satisfying b ≤ j ≤ j⋆ − 1, ν_j/N = 0 = θ_j. The proof of part (c) is complete.

(d) According to (B.1), for all j ∈ N_b satisfying j ≠ j⋆, j⋆ + 1 we have ν_j ∈ N_0 for all N. We now consider ν_{j⋆}. As N → ∞, each of the quantities β_m, γ_m, and m²/N converges to 0. Since θ_{j⋆} > 0, it follows from the lower bound in part (a) of this lemma that ν_{j⋆} > 0 for all sufficiently large N. The definition of ν_{j⋆} in (B.6) shows that ν_{j⋆} is an integer for all N. It follows that ν_{j⋆} ∈ N_0 for all sufficiently large N. Finally we consider ν_{j⋆+1}. The lower bound in part (b) of this lemma shows that ν_{j⋆+1} ≥ 0.
The definition of ν_{j⋆+1} in (B.7) shows that ν_{j⋆+1} is an integer for all N. It follows that ν_{j⋆+1} ∈ N_0 for all N. We conclude that for all sufficiently large N the vector ν^(N) is an element of (N_0)^{N_b}. In addition, since ν_j = 0 for all j ∈ N_b satisfying b ≤ j ≤ j⋆ − 1 and j ≥ j⋆ + m, we have |ν^(N)|_+ ≤ m; i.e., at most m of the components ν_j are positive. These correspond to the indices j ∈ N_b satisfying j⋆ ≤ j ≤ j⋆ + m − 1. If the definitions of ν_{j⋆} and ν_{j⋆+1} in (B.6) and (B.7) are substituted into (B.4) and (B.5), then we see that the components ν^(N)_j satisfy the two equality constraints in the definition of A_{N,b,m} for all N. It follows that ν^(N) ∈ A_{N,b,m} for all sufficiently large N. We also conclude that the measure θ^(N) having components θ^(N)_j = ν_j/N for j ∈ N_b is an element of B_{N,b,m} ⊂ P_{N_b,c} for all sufficiently large N. The proof of part (d) is complete.

(e) For j = j⋆ and all N, we have ν_{j⋆}/N ≤ θ_{j⋆} by the upper bound in part (a) of Lemma B.3. For all j ∈ N_b satisfying j⋆ + 2 ≤ j ≤ j⋆ + m − 1 and for all N, we have ν_j/N ≤ θ_j by (B.2). Finally, by (B.1), for all j ∈ N_b satisfying b ≤ j ≤ j⋆ − 1 or j ≥ j⋆ + m and for all N, we have ν^(N)_j/N = 0 ≤ θ_j. The proof of part (e) is complete.

(f) Assume that θ_{j⋆+1} > 0. By the upper bound in part (b) of this lemma, γ_m − j⋆β_m + m²/N → 0 as N → ∞. Hence for all sufficiently large N, ν_{j⋆+1}/N ≤ 2θ_{j⋆+1}. However, even if θ_{j⋆+1} > 0, the upper bound ν_{j⋆+1}/N ≤ θ_{j⋆+1} cannot hold for any N because of the three additional terms in the upper bound in part (b); while γ_m and β_m can be 0 for sufficiently large N, the term m²/N > 0 for all N. This proves the first two assertions in part (f). Concerning the third assertion, let us see how the bound ν_{j⋆+1}/N ≤ Mθ_{j⋆+1} can fail. We assume that θ_{j⋆+1} = 0 and that there exists a subsequence j′ → ∞ such that θ_{j′} > 0 along this subsequence.
By the lower bound in part (b) of this lemma

ν_{j⋆+1} ≥ N( γ_m − j⋆β_m ) = N( Σ_{j≥j⋆+m} (j − j⋆)θ_j ).

Since θ_{j′} > 0 along the subsequence j′ → ∞, it follows that for all N ∈ N and all j′ ≥ j⋆ + m

ν_{j⋆+1} ≥ N(j′ − j⋆)θ_{j′} > 0.

Since θ_{j⋆+1} = 0 and ν_{j⋆+1}/N > 0 for all N ∈ N, the bound ν_{j⋆+1}/N ≤ Mθ_{j⋆+1} cannot hold for any M < ∞. This completes the proof of part (f). The proof of Lemma B.3 is done.

We are now ready to prove Theorem B.1. Given θ ∈ P_{N_b,c}, θ^(N) in this theorem is the sequence with components θ^(N)_j = ν_j/N for j ∈ N_b. The quantities ν_j = ν^(N)_j are defined in (B.1), (B.6), and (B.7). In the proof of the theorem we work with N ∈ N sufficiently large to guarantee, according to part (d) of Lemma B.3, that θ^(N) is a probability measure lying in B_{N,b,m} ⊂ P_{N_b,c}.

Proof of part (a) of Theorem B.1.
We prove that θ^(N) ⇒ θ by showing that for any bounded function f mapping N_b into R

lim_{N→∞} ∫_{N_b} f dθ^(N) = lim_{N→∞} Σ_{j∈N_b} f(j)θ^(N)_j = Σ_{j∈N_b} f(j)θ_j = ∫_{N_b} f dθ.

We use the facts that ν_j = 0 = θ_j for b ≤ j ≤ j⋆ − 1, ν_j = 0 for j ≥ j⋆ + m, and

max_{j⋆+2 ≤ j ≤ j⋆+m−1} |ν_j/N − θ_j| ≤ 1/N.

It follows that

|Σ_{j∈N_b} f(j)θ^(N)_j − Σ_{j∈N_b} f(j)θ_j|
≤ |f(j⋆)| · |ν_{j⋆}/N − θ_{j⋆}| + |f(j⋆ + 1)| · |ν_{j⋆+1}/N − θ_{j⋆+1}| + ‖f‖_∞ Σ_{j=j⋆+2}^{j⋆+m−1} |ν_j/N − θ_j| + ‖f‖_∞ Σ_{j≥j⋆+m} θ_j
≤ |f(j⋆)| · |ν_{j⋆}/N − θ_{j⋆}| + |f(j⋆ + 1)| · |ν_{j⋆+1}/N − θ_{j⋆+1}| + ‖f‖_∞ (m − 2) · max_{j⋆+2 ≤ j ≤ j⋆+m−1} |ν_j/N − θ_j| + ‖f‖_∞ Σ_{j≥j⋆+m} θ_j
≤ |f(j⋆)| · |ν_{j⋆}/N − θ_{j⋆}| + |f(j⋆ + 1)| · |ν_{j⋆+1}/N − θ_{j⋆+1}| + ‖f‖_∞ · m/N + ‖f‖_∞ Σ_{j≥j⋆+m} θ_j.

By part (c) of Lemma B.3, ν_{j⋆}/N → θ_{j⋆} and ν_{j⋆+1}/N → θ_{j⋆+1} as N → ∞. Since m/N → 0 and Σ_{j≥j⋆+m} θ_j → 0 as N → ∞, it follows that

lim_{N→∞} |Σ_{j∈N_b} f(j)θ^(N)_j − Σ_{j∈N_b} f(j)θ_j| = 0.

This completes the proof of part (a) of Theorem B.1.
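The construction underlying this proof, given in (B.1), (B.6), and (B.7), is completely explicit and can be checked numerically. The following Python sketch is our illustration, not part of the paper: the geometric target measure θ and the parameter values c, N, m are illustrative assumptions (with b = 1, so that j⋆ = 1). It builds ν^(N) and verifies the two summation constraints in (B.3) together with the count bound |ν|_+ ≤ m.

```python
from math import floor

def discretize(theta, c, N, m, b=1):
    """Build nu per (B.1), (B.6), (B.7): nu_j = floor(N*theta_j) for j in
    Phi(j_star, m), with nu_{j_star}, nu_{j_star+1} chosen so that
    sum_j nu_j = N and sum_j j*nu_j = K = N*c."""
    K = round(N * c)
    j_star = b                              # theta_b > 0 for our choice of theta
    phi = range(j_star + 2, j_star + m)     # indices j_star+2 .. j_star+m-1
    nu = {j: floor(N * theta(j)) for j in phi}
    s0 = sum(nu.values())                   # sum over Phi of nu_j
    s1 = sum(j * v for j, v in nu.items()) # sum over Phi of j*nu_j
    nu[j_star] = (j_star + 1) * N - K + s1 - (j_star + 1) * s0   # (B.6)
    nu[j_star + 1] = K - j_star * N - s1 + j_star * s0            # (B.7)
    return nu, K

# Illustrative theta: geometric distribution on {1, 2, ...} with mean c = 3
c = 3.0
theta = lambda j: (1 / c) * (1 - 1 / c) ** (j - 1)

N, m = 10_000, 20     # m**2/N = 0.04, small as the hypotheses require
nu, K = discretize(theta, c, N, m)

assert sum(nu.values()) == N                      # first constraint in (B.3)
assert sum(j * v for j, v in nu.items()) == K     # second constraint in (B.3)
assert all(v >= 0 for v in nu.values())           # valid count vector
assert sum(1 for v in nu.values() if v > 0) <= m  # |nu|_+ <= m
```

Parts (a) and (b) of Lemma B.3 are also visible in such runs: ν_{j⋆}/N stays at or below θ_{j⋆}, while ν_{j⋆+1}/N slightly overshoots θ_{j⋆+1}, absorbing the rounding error from the floors.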
Proof of part (b) of Theorem B.1.
Let θ be a probability measure in P_{N_b,c}. We prove that if R(θ|ρ_{b,α}) < ∞, then lim_{N→∞} R(θ^(N)|ρ_{b,α}) = R(θ|ρ_{b,α}). We use the following facts.

1. For all j ∈ N_b we have lim_{N→∞} θ^(N)_j = θ_j.

2. For all j ∈ N_b satisfying j ≠ j⋆ + 1, we have θ^(N)_j ≤ θ_j.

Item 1, which is stated in part (c) of Lemma B.3, follows from the weak convergence θ^(N) ⇒ θ proved in part (a) of Theorem B.1. Item 2, which is stated in part (e) of Lemma B.3, is easily verified. For j = j⋆ the upper bound θ^(N)_{j⋆} ≤ θ_{j⋆} is valid by part (a) of Lemma B.3. For all other j ∈ N_b satisfying j ≠ j⋆ + 1, the upper bound θ^(N)_j ≤ θ_j is a consequence of (B.1) and (B.2). According to part (f) of Lemma B.3, the upper bound θ^(N)_{j⋆+1} ≤ θ_{j⋆+1} is not valid for any N, and in general there does not exist M < ∞ such that for any N ∈ N, θ^(N)_{j⋆+1} ≤ Mθ_{j⋆+1}. Because of this anomaly the term in R(θ^(N)|ρ_{b,α}) corresponding to j = j⋆ + 1 must be handled separately.

Define φ(x) = x log x for x ∈ (0, ∞) and φ(0) = 0. This function is continuous on [0, ∞). For each j ∈ N_b, since θ^(N)_j → θ_j as N → ∞, it follows that φ(θ^(N)_j/ρ_{b,α;j}) → φ(θ_j/ρ_{b,α;j}) as N → ∞. To prove part (b) of Theorem B.1 we must justify the following interchange of the limit N → ∞ and the sum over j ∈ N_b \ {j⋆ + 1}:

lim_{N→∞} R(θ^(N)|ρ_{b,α})
= lim_{N→∞} ρ_{b,α;j⋆+1} φ(θ^(N)_{j⋆+1}/ρ_{b,α;j⋆+1}) + lim_{N→∞} Σ_{j∈N_b\{j⋆+1}} ρ_{b,α;j} φ(θ^(N)_j/ρ_{b,α;j})
= ρ_{b,α;j⋆+1} φ(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) + Σ_{j∈N_b\{j⋆+1}} ρ_{b,α;j} ( lim_{N→∞} φ(θ^(N)_j/ρ_{b,α;j}) )
= ρ_{b,α;j⋆+1} φ(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) + Σ_{j∈N_b\{j⋆+1}} ρ_{b,α;j} φ(θ_j/ρ_{b,α;j}) = R(θ|ρ_{b,α}).

We justify the interchange of the limit and the sum over j ∈ N_b \ {j⋆ + 1} by applying the Dominated Convergence Theorem.
This procedure requires finding constants a_j for j ∈ N_b \ {j⋆ + 1} such that for all sufficiently large N ∈ N

ρ_{b,α;j} |φ(θ^(N)_j/ρ_{b,α;j})| ≤ a_j and Σ_{j∈N_b\{j⋆+1}} a_j < ∞.

The key to applying the Dominated Convergence Theorem is to use two properties of φ(x) = x log x: its boundedness on the interval [0, 1] and its monotonicity on the interval [1, ∞).

Property 1.
For x ∈ [0, 1], 0 ≥ φ(x) ≥ −e⁻¹.

Property 2.
For x ∈ [1, ∞), φ(x) ≥ 0, φ(x) → ∞ as x → ∞, and φ is monotone in the sense that for 1 ≤ x < y, 0 ≤ φ(x) < φ(y).

Let Ψ = {j⋆ + 1}. We write φ(x) = φ₊(x) − φ₋(x), where φ₊(x) = φ(x)·1_{[1,∞)}(x) and φ₋(x) = −φ(x)·1_{[0,1)}(x). For N ∈ N define

C_N = {j ∈ N_b \ Ψ : θ^(N)_j/ρ_{b,α;j} ∈ [0, 1)} and D_N = {j ∈ N_b \ Ψ : θ^(N)_j/ρ_{b,α;j} ∈ [1, ∞)}.
In terms of these sets we write

Σ_{j∈N_b\Ψ} ρ_{b,α;j} |φ(θ^(N)_j/ρ_{b,α;j})| = Σ_{j∈C_N} ρ_{b,α;j} φ₋(θ^(N)_j/ρ_{b,α;j}) + Σ_{j∈D_N} ρ_{b,α;j} φ₊(θ^(N)_j/ρ_{b,α;j}).

For j ∈ C_N the boundedness of φ on [0, 1) implies that

0 ≤ ρ_{b,α;j} φ₋(θ^(N)_j/ρ_{b,α;j}) ≤ e⁻¹ ρ_{b,α;j}.

For j ∈ D_N the monotonicity of φ on [1, ∞) and the bound θ^(N)_j ≤ θ_j imply that

0 ≤ ρ_{b,α;j} φ₊(θ^(N)_j/ρ_{b,α;j}) ≤ ρ_{b,α;j} φ₊(θ_j/ρ_{b,α;j}) ≤ ρ_{b,α;j} |φ(θ_j/ρ_{b,α;j})|.

Thus for all j ∈ N_b \ Ψ

ρ_{b,α;j} |φ(θ^(N)_j/ρ_{b,α;j})| ≤ a_j = e⁻¹ ρ_{b,α;j} + ρ_{b,α;j} |φ(θ_j/ρ_{b,α;j})|.

Using the fact that R(θ|ρ_{b,α}) < ∞, we prove that Σ_{j∈N_b\Ψ} a_j < ∞. We have

Σ_{j∈N_b\Ψ} a_j ≤ e⁻¹ Σ_{j∈N_b\Ψ} ρ_{b,α;j} + Σ_{j∈N_b\Ψ} ρ_{b,α;j} |φ(θ_j/ρ_{b,α;j})| ≤ e⁻¹ + Σ_{j∈N_b\Ψ} ρ_{b,α;j} |φ(θ_j/ρ_{b,α;j})|.   (B.13)

Define

C = {j ∈ N_b \ Ψ : θ_j/ρ_{b,α;j} ∈ [0, 1)} and D = {j ∈ N_b \ Ψ : θ_j/ρ_{b,α;j} ∈ [1, ∞)}.

In terms of these sets we write

R(θ|ρ_{b,α}) = ρ_{b,α;j⋆+1} φ(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) + Σ_{j∈N_b\Ψ} ρ_{b,α;j} φ(θ_j/ρ_{b,α;j})
= ρ_{b,α;j⋆+1} φ(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) − Σ_{j∈C} ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j}) + Σ_{j∈D} ρ_{b,α;j} φ₊(θ_j/ρ_{b,α;j}).

For j ∈ C ∪ Ψ we have 0 ≤ ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j}) ≤ e⁻¹ ρ_{b,α;j}. Hence

ρ_{b,α;j⋆+1} φ₋(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) ≤ e⁻¹ ρ_{b,α;j⋆+1} ≤ e⁻¹ and Σ_{j∈C} ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j}) ≤ e⁻¹ Σ_{j∈C} ρ_{b,α;j} ≤ e⁻¹.

It follows that

Σ_{j∈N_b\Ψ} ρ_{b,α;j} |φ(θ_j/ρ_{b,α;j})| = Σ_{j∈C} ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j}) + Σ_{j∈D} ρ_{b,α;j} φ₊(θ_j/ρ_{b,α;j})
≤ e⁻¹ + Σ_{j∈D} ρ_{b,α;j} φ₊(θ_j/ρ_{b,α;j})
= e⁻¹ + R(θ|ρ_{b,α}) − ρ_{b,α;j⋆+1} φ(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) + Σ_{j∈C} ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j})
≤ e⁻¹ + R(θ|ρ_{b,α}) + ρ_{b,α;j⋆+1} φ₋(θ_{j⋆+1}/ρ_{b,α;j⋆+1}) + Σ_{j∈C} ρ_{b,α;j} φ₋(θ_j/ρ_{b,α;j})
≤ 3e⁻¹ + R(θ|ρ_{b,α}) < ∞.
Substituting the last display into (B.13), we conclude that

Σ_{j∈N_b\Ψ} a_j ≤ 4e⁻¹ + R(θ|ρ_{b,α}) < ∞.

This completes the proof of part (b). The proof of Theorem B.1 is done.

In appendix C we prove part (a) of Theorem 3.1 as well as a number of other properties of the parameter α_b(c) that defines the Poisson equilibrium distribution ρ_{b,α_b(c)}.

C Proof of Part (a) of Theorem 3.1 and Properties of α_b(c)

The goal of this appendix is to prove Theorem C.1. Part (a) restates part (a) of Theorem 3.1 concerning the existence of α_b(c). This parameter defines the Poisson distribution ρ_{b,α_b(c)} appearing in the local large deviation estimate in part (b) of Theorem 3.1. In part (b) we derive two sets of bounds on α_b(c) and use these bounds to show that α_b(c) is asymptotic to c as c → ∞. Part (c) shows an interesting monotonic relationship between α_b(c) and α_{b+1}(c), while part (d) makes precise the relationship between ρ_{b,α_b(c)} and a Poisson random variable having parameter α_b(c). Parts (a), (b), and (d) of the next theorem appear in Theorem C.1 in [12] as parts (a), (b), and (c). Part (c) of the next theorem is new.

The fact that α_b(c) is asymptotic to c as c → ∞ is certainly plausible. If c is large, then the mean of ρ_{b,α_b(c)}, which equals c, is not changed appreciably if ρ_{b,α_b(c)} is replaced by a standard Poisson distribution on N ∪ {0} with parameter α_b(c). Since the mean of an actual Poisson distribution on N ∪ {0} with parameter α_b(c) is α_b(c), we expect that if c is large, then α_b(c) should be close to c.
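This plausibility argument is easy to observe numerically. The Python sketch below is our illustration, not part of the paper: it computes α_b(c) by bisection from the mean equation αZ_{b−1}(α)/Z_b(α) = c characterizing α_b(c), using the facts that the mean of ρ_{b,α} is increasing in α (verified below for b = 1 via γ′ > 0) and that α_b(c) lies in (0, c).

```python
import math

def Z(b, a):
    """Z_b(alpha) = e^alpha - sum_{j=0}^{b-1} alpha^j/j!; in particular Z_0(alpha) = e^alpha."""
    return math.exp(a) - sum(a ** j / math.factorial(j) for j in range(b))

def mean_rho(b, a):
    """Mean of rho_{b,alpha}: alpha for b = 0, and alpha*Z_{b-1}(alpha)/Z_b(alpha) for b >= 1."""
    return a * Z(b - 1, a) / Z(b, a) if b >= 1 else a

def alpha_b(b, c, tol=1e-12):
    """Bisection for the unique alpha in (0, c) with mean_rho(b, alpha) = c."""
    lo, hi = 1e-12, c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mean_rho(b, mid) < c else (lo, mid)
    return 0.5 * (lo + hi)
```

For instance, alpha_b(1, 40.0) already agrees with c = 40 essentially to machine precision, and alpha_b(2, 5.0) < alpha_b(1, 5.0), matching the monotonicity in b asserted in the theorem below.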
Fix a nonnegative integer $b$ and a real number $c \in (b,\infty)$. For $\alpha \in (0,\infty)$ define $Z_0(\alpha) = e^\alpha$, and for $b \in \mathbb{N}$ define $Z_b(\alpha) = e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!$. Let $\rho_{b,\alpha}$ be the probability measure on $\mathbb{N}_b$ whose components are defined by
\[
\rho_{b,\alpha;j} = \frac{1}{Z_b(\alpha)} \cdot \frac{\alpha^j}{j!} \quad \text{for } j \in \mathbb{N}_b.
\]
The following conclusions hold.

(a)
There exists a unique value $\alpha_b(c) \in (0,\infty)$ such that $\rho_{b,\alpha_b(c)}$ lies in the set $\mathcal{P}_{\mathbb{N}_b,c}$ of probability measures on $\mathbb{N}_b$ having mean $c$. If $b = 0$, then $\alpha_0(c) = c$. If $b \in \mathbb{N}$, then $\alpha_b(c)$ is the unique solution in $(0,\infty)$ of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.

(b) For $b \in \mathbb{N}$
\[
c > \alpha_b(c) > c - b \quad \text{and} \quad c > \alpha_b(c) > c\,(1 - 2^b e^{-(c-b)/2}).
\]
Either of these bounds implies that $\alpha_b(c)$ is asymptotic to $c$ as $c \to \infty$; i.e., $\lim_{c\to\infty} \alpha_b(c)/c = 1$.

(c) For all $b \in \mathbb{N} \cup \{0\}$ and $c > b+1$, $\alpha_{b+1}(c) < \alpha_b(c)$.

(d) For $b \in \mathbb{N}$, if $\Xi_{\alpha_b(c)}$ is a Poisson random variable with parameter $\alpha_b(c)$, then $\rho_{b,\alpha_b(c)}$ is the distribution of $\Xi_{\alpha_b(c)}$ conditioned on $\Xi_{\alpha_b(c)} \in \mathbb{N}_b$.

Before we prove Theorem C.1, we state a second theorem that focuses on the case $b = 1$. In this case the equilibrium distribution $\rho_{1,\alpha_1(c)}$ is a probability measure on $\mathbb{N}_1 = \mathbb{N}$. In part (a) we give the proof of the existence of $\alpha_1(c)$ for $b = 1$, which is much more straightforward than the proof for general $b$. In parts (b) and (c) we give two iterative procedures for calculating $\alpha_1(c)$, while in part (d) we derive two sets of inequalities that are tighter than the inequalities for $\alpha_b(c)$ for general $b$ given in part (b) of Theorem C.1. Like the inequalities in part (b) of Theorem C.1, the inequalities in part (d) of the next theorem imply that $\alpha_1(c)$ is asymptotic to $c$ as $c \to \infty$.

Theorem C.2.
Fix a real number $c \in (1,\infty)$. The following results are valid.

(a) There exists a unique value $\alpha_1(c) \in (0,\infty)$ such that $\rho_{1,\alpha_1(c)}$ lies in the set $\mathcal{P}_{\mathbb{N},c}$ of probability measures on $\mathbb{N}$ having mean $c$. The quantity $\alpha_1(c)$ is the unique solution in $(0,\infty)$ of $\alpha e^\alpha = c\,(e^\alpha - 1)$.

(b) Let $\alpha_0 = c$ and consider the following iterative procedure defined for $n \ge 0$:
\[
\alpha_{n+1} = c\,(1 - e^{-\alpha_n}).
\]
Then the sequence $\{\alpha_n\}$ is monotonically decreasing and $\lim_{n\to\infty} \alpha_n = \alpha_1(c)$.

(c) Let $\beta_0 = \log c$ and consider the following iterative procedure defined for $n \ge 0$:
\[
\beta_{n+1} = c\,(1 - e^{-\beta_n}).
\]
Then the sequence $\{\beta_n\}$ is monotonically increasing and $\lim_{n\to\infty} \beta_n = \alpha_1(c)$.

(d) We have the following two bounds on $\alpha_1(c)$:
\[
c\,(1 - e^{-c}) > \alpha_1(c) > c - 1 \quad \text{and} \quad c\,(1 - e^{-c}) > \alpha_1(c) > c\,(1 - e^{-c+1}).
\]
Either of these bounds implies that $\alpha_1(c)$ is asymptotic to $c$ as $c \to \infty$; i.e., $\lim_{c\to\infty} \alpha_1(c)/c = 1$.

Proof. (a) The measure $\rho_{1,\alpha}$ is a probability measure on $\mathbb{N}$ having mean
\[
\sum_{j \in \mathbb{N}} j\rho_{1,\alpha;j} = \frac{1}{e^\alpha - 1} \cdot \sum_{j \in \mathbb{N}} \frac{\alpha^j}{(j-1)!} = \frac{1}{e^\alpha - 1} \cdot \alpha \sum_{j=0}^{\infty} \frac{\alpha^j}{j!} = \frac{1}{e^\alpha - 1} \cdot \alpha e^\alpha.
\]
Thus $\rho_{1,\alpha}$ has mean $c$ if and only if $\alpha$ satisfies $\alpha e^\alpha = c\,(e^\alpha - 1)$. We prove part (a) by showing that this equation has a unique solution $\alpha_1(c) \in (0,\infty)$ for any $c > 1$.

The proof that $\alpha e^\alpha = c\,(e^\alpha - 1)$ has a unique solution $\alpha_1(c) \in (0,\infty)$ for any $c > 1$ is straightforward. A positive real number $\alpha$ solves $\alpha e^\alpha = c\,(e^\alpha - 1)$ if and only if
\[
\gamma(\alpha) = c, \quad \text{where } \gamma(\alpha) = \frac{\alpha}{1 - e^{-\alpha}}.
\]
The function $\gamma$ is continuously differentiable on $(0,\infty)$ and $\lim_{\alpha\to 0^+} \gamma(\alpha) = 1$. In addition, for $\alpha \in (0,\infty)$
\[
\gamma'(\alpha) = \frac{1 - (1+\alpha)e^{-\alpha}}{(1 - e^{-\alpha})^2} = e^{-\alpha} \cdot \frac{e^\alpha - 1 - \alpha}{(1 - e^{-\alpha})^2} > 0.
\]
The inequality holds since for $\alpha > 0$, $e^\alpha - 1 - \alpha > 0$.
It follows that there exists a sufficiently small value of $\varepsilon > 0$ such that $1 < \gamma(\varepsilon) < c$ and $\gamma$ is monotonically increasing on $(\varepsilon,\infty)$. Since $\gamma(\alpha) \to \infty$ as $\alpha \to \infty$, we conclude that there exists a unique value $\alpha = \alpha_1(c) \in (0,\infty)$ solving $\gamma(\alpha_1(c)) = c$ and thus solving $\alpha_1(c)e^{\alpha_1(c)} = c\,(e^{\alpha_1(c)} - 1)$. This completes the proof of part (a).

(b) Since $e^{-c} < 1$, we have the inequality
\[
\alpha_1 = c\,(1 - e^{-\alpha_0}) = c\,(1 - e^{-c}) < c = \alpha_0.
\]
We use induction to prove that the sequence $\alpha_n$ is monotonically decreasing. For $n \in \mathbb{N}$, under the assumption that $\alpha_n < \alpha_{n-1}$, this property of the sequence is a consequence of the following calculation:
\[
\alpha_{n+1} - \alpha_n = c\,(e^{-\alpha_{n-1}} - e^{-\alpha_n}) < 0.
\]
We now use induction to prove that the sequence $\alpha_n$ is bounded below by $\log c$. For $n = 0$, $\alpha_0 = c > \log c$. Assuming that $\alpha_n > \log c$, we have
\[
\alpha_{n+1} = c\,(1 - e^{-\alpha_n}) > c\,(1 - e^{-\log c}) = c - 1 > \log c.
\]
The last inequality follows from the facts that when $c = 1$, $c - 1 = 0 = \log c$ and that for $c \in (1,\infty)$, $(c-1)' = 1 > 1/c = (\log c)'$. This completes the proof that $\alpha_n > \log c$ for all $n$. Since $\alpha_n$ is a monotonically decreasing sequence bounded above by $c$ and below by $\log c$, we conclude that $\alpha^\star = \lim_{n\to\infty} \alpha_n$ exists and satisfies both $\alpha^\star \in [\log c, c]$ and $\alpha^\star = c\,(1 - e^{-\alpha^\star})$. Because $\alpha_1(c)$ is the unique positive solution of this equation, it follows that $\lim_{n\to\infty} \alpha_n = \alpha_1(c)$. This completes the proof of part (b).

(c) Since $\beta_0 = \log c$, we have the inequality
\[
\beta_1 = c\,(1 - e^{-\beta_0}) = c\,(1 - e^{-\log c}) = c - 1 > \log c = \beta_0.
\]
We use induction to prove that the sequence $\beta_n$ is monotonically increasing. For $n \in \mathbb{N}$, under the assumption that $\beta_{n-1} < \beta_n$, this is a consequence of the following calculation:
\[
\beta_{n+1} - \beta_n = c\,(e^{-\beta_{n-1}} - e^{-\beta_n}) > 0.
\]
We now use induction to prove that the sequence $\beta_n$ is bounded above by $c$. For $n = 0$, $\beta_0 = \log c < c$. Assuming that $\beta_n < c$, we have
\[
\beta_{n+1} = c\,(1 - e^{-\beta_n}) < c\,(1 - e^{-c}) < c.
\]
This completes the proof that $\beta_n$ is bounded above by $c$. Since $\beta_n$ is a monotonically increasing sequence bounded above by $c$ and below by $\log c$, we conclude that $\beta^\star = \lim_{n\to\infty} \beta_n$ exists and satisfies both $\beta^\star \in [\log c, c]$ and $\beta^\star = c\,(1 - e^{-\beta^\star})$. Because $\alpha_1(c)$ is the unique positive solution of this equation, it follows that $\lim_{n\to\infty} \beta_n = \alpha_1(c)$. This completes the proof of part (c).

(d) We first prove that $c\,(1 - e^{-c}) > \alpha_1(c)$.
This follows immediately from the iterative procedure discussed in part (b), which implies that $c = \alpha_0 > \alpha_1 = c\,(1 - e^{-c}) > \alpha_1(c)$. One can obtain the weaker upper bound $c > \alpha_1(c)$ directly if one writes the equation solved by $\alpha_1(c)$ in the form
\[
\alpha_1(c) = c\,(1 - e^{-\alpha_1(c)}) \tag{C.1}
\]
and uses the fact that $e^{-\alpha_1(c)} \in (0,1)$.

We now prove a series of three lower bounds, the last two of which, in combination with the upper bound $c\,(1 - e^{-c}) > \alpha_1(c)$, imply that $\alpha_1(c) \sim c$ as $c \to \infty$. The first lower bound is $\alpha_1(c) > \log c$. To prove this, we use the fact that $\alpha_1(c) > 0$ to write $e^{\alpha_1(c)} - 1 > \alpha_1(c)$. It follows that
\[
\alpha_1(c) = c\,(1 - e^{-\alpha_1(c)}) = c e^{-\alpha_1(c)}(e^{\alpha_1(c)} - 1) > c e^{-\alpha_1(c)}\,\alpha_1(c),
\]
or equivalently that $e^{\alpha_1(c)} > c$. This implies that $\alpha_1(c) > \log c$, as claimed.

We now bootstrap this lower bound into a tighter lower bound by substituting $\alpha_1(c) > \log c$ into the right-hand side of (C.1), obtaining the second lower bound
\[
\alpha_1(c) = c\,(1 - e^{-\alpha_1(c)}) > c\,(1 - e^{-\log c}) = c\left(1 - \frac{1}{c}\right) = c - 1. \tag{C.2}
\]
It follows that
\[
1 - e^{-c} > \frac{\alpha_1(c)}{c} > 1 - \frac{1}{c}.
\]
This implies that $\lim_{c\to\infty} \alpha_1(c)/c = 1$ or that $\alpha_1(c)$ is asymptotic to $c$ as $c \to \infty$.

By bootstrapping the lower bound in (C.2), we obtain yet a tighter lower bound on $\alpha_1(c)$, which gives a second proof that $\alpha_1(c) \sim c$. To do this, we substitute $\alpha_1(c) > c - 1$ into the right-hand side of (C.1), obtaining the third lower bound $\alpha_1(c) > c\,(1 - e^{-c+1})$. It follows that
\[
1 - e^{-c} > \frac{\alpha_1(c)}{c} > 1 - e^{-c+1}. \tag{C.3}
\]
This implies $\lim_{c\to\infty} \alpha_1(c)/c = 1$ at a rate that is at least exponentially fast. By contrast, (C.2) shows a much slower rate of convergence to 1 that is only of the order $1/c$. Interestingly, iterating this procedure again does not give a tighter lower bound than that in (C.3). This completes the proof of Theorem C.2.

We now turn to the proof of Theorem C.1. According to part (a) of this theorem, for $b \in \mathbb{N}$, $\alpha_b(c)$ is the unique solution of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.
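The iterative procedures in parts (b) and (c) of Theorem C.2 are easy to check numerically. The following sketch (Python, standard library only; the function names are ours and are not part of the paper) runs the decreasing iteration started at $\alpha_0 = c$ and the increasing iteration started at $\beta_0 = \log c$, and verifies that their common limit satisfies the defining equation $\alpha e^\alpha = c\,(e^\alpha - 1)$ together with the bounds of part (d).

```python
import math

def alpha1_from_above(c, n_iter=200):
    # part (b): alpha_0 = c, alpha_{n+1} = c(1 - exp(-alpha_n)); decreasing
    a = c
    for _ in range(n_iter):
        a = c * (1.0 - math.exp(-a))
    return a

def alpha1_from_below(c, n_iter=200):
    # part (c): beta_0 = log c, same map; increasing
    b = math.log(c)
    for _ in range(n_iter):
        b = c * (1.0 - math.exp(-b))
    return b

c = 3.0
a_hi = alpha1_from_above(c)
a_lo = alpha1_from_below(c)
assert abs(a_hi - a_lo) < 1e-12                # both iterations converge to alpha_1(c)
a = a_hi
assert abs(a * math.exp(a) - c * (math.exp(a) - 1.0)) < 1e-9   # defining equation of part (a)
assert c * (1 - math.exp(-c)) > a > c - 1                      # part (d), first set of bounds
assert c * (1 - math.exp(-c)) > a > c * (1 - math.exp(-c + 1)) # part (d), second set of bounds
```

The iteration map $\alpha \mapsto c\,(1 - e^{-\alpha})$ is a contraction near the fixed point (its derivative there is $ce^{-\alpha_1(c)} \in (0,1)$ by the first lower bound above), which is why both starting points converge to the same limit.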
The heart of the proof of Theorem C.1, and its most subtle step, is to prove that the function $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$ satisfies $\gamma_b'(\alpha) > 0$ for $\alpha \in (0,\infty)$ and thus is monotonically increasing on this interval. This fact is proved in the next lemma.

Lemma C.3.
Fix a positive integer $b$ and a real number $c \in (b,\infty)$. For $\alpha \in (0,\infty)$ the function $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$ satisfies $\gamma_b'(\alpha) > 0$.

Proof. For $b \in \mathbb{N}$ and for $\alpha \in (0,\infty)$, we have $Z_b'(\alpha) = Z_{b-1}(\alpha)$. Thus
\[
\gamma_b(\alpha) = \frac{\alpha Z_{b-1}(\alpha)}{Z_b(\alpha)} = \alpha\,(\log Z_b(\alpha))'.
\]
The key to proving that $\gamma_b'(\alpha) > 0$ is to represent $\log Z_b(\alpha)$ in terms of the moment generating function of a probability measure. We do this by first expressing $Z_b(\alpha)$ in terms of the incomplete gamma function via the formula
\[
Z_b(\alpha) = \frac{e^\alpha}{(b-1)!} \int_0^\alpha x^{b-1} e^{-x}\,dx. \tag{C.4}
\]
This formula is easily proved by induction. For $b = 1$ the right side equals $e^\alpha - 1 = Z_1(\alpha)$. Assuming that it is true for $b = n$, we prove that it is true for $b = n+1$ by integrating by parts, which gives
\[
\frac{e^\alpha}{n!} \int_0^\alpha x^n e^{-x}\,dx = \frac{e^\alpha}{(n-1)!} \int_0^\alpha x^{n-1} e^{-x}\,dx - \frac{\alpha^n}{n!} = Z_n(\alpha) - \frac{\alpha^n}{n!} = Z_{n+1}(\alpha).
\]
This completes the proof of (C.4) for all $b \in \mathbb{N}$.

As suggested in [19], we now make the change of variables $x = -y\alpha$, obtaining the representation
\[
Z_b(\alpha) = \frac{e^\alpha}{b!}\,\alpha^b g_b(\alpha), \quad \text{where } g_b(\alpha) = \int_{-1}^0 e^{\alpha y}\,b\,(-y)^{b-1}\,dy. \tag{C.5}
\]
The function $g_b$ is the moment generating function of the probability measure on $\mathbb{R}$ having the density $h_b(y) = b\,(-y)^{b-1}$ on $[-1,0]$. For $\alpha \in (0,\infty)$ let $\sigma_{b,\alpha}$ be the probability measure on $\mathbb{R}$ having the density $e^{\alpha y}h_b(y)/g_b(\alpha)$ on $[-1,0]$. A straightforward calculation shows that
\[
(\log g_b)'(\alpha) = \int_{\mathbb{R}} y\,\sigma_{b,\alpha}(dy) \quad \text{and} \quad (\log g_b)''(\alpha) = \int_{\mathbb{R}} \big[y - (\log g_b)'(\alpha)\big]^2\,\sigma_{b,\alpha}(dy).
\]
As the variance of the nontrivial probability measure $\sigma_{b,\alpha}$, we conclude that $(\log g_b)''(\alpha) > 0$ for all $\alpha \in (0,\infty)$.

Using (C.5) and the power series representations
\[
Z_{b-1}(\alpha) = \sum_{j=b-1}^{\infty} \frac{\alpha^j}{j!} \quad \text{and} \quad Z_b(\alpha) = \sum_{j=b}^{\infty} \frac{\alpha^j}{j!},
\]
we calculate
\[
\gamma_b'(\alpha) = (\log Z_b(\alpha))' + \alpha\,(\log Z_b(\alpha))''
= (\log Z_b(\alpha))' + \alpha\left[\log\left(\frac{e^\alpha}{b!}\,\alpha^b g_b(\alpha)\right)\right]''
\]
\[
= \frac{Z_{b-1}(\alpha)}{Z_b(\alpha)} + \alpha\,[\alpha - \log(b!) + b\log\alpha + \log g_b(\alpha)]''
= \frac{Z_{b-1}(\alpha)}{Z_b(\alpha)} - \frac{b}{\alpha} + \alpha\,(\log g_b(\alpha))''
\]
\[
= \frac{\alpha Z_{b-1}(\alpha) - bZ_b(\alpha)}{\alpha Z_b(\alpha)} + \alpha\,(\log g_b(\alpha))''
= \frac{1}{\alpha Z_b(\alpha)} \cdot \sum_{j=b}^{\infty} \left(\frac{1}{(j-1)!} - \frac{b}{j!}\right)\alpha^j + \alpha\,(\log g_b(\alpha))''
\]
\[
= \frac{1}{Z_b(\alpha)} \cdot \sum_{j=b}^{\infty} \frac{j-b}{j!}\,\alpha^{j-1} + \alpha\,(\log g_b(\alpha))'' > 0.
\]
This completes the proof of the lemma.

We are now ready to prove Theorem C.1.
Proof of Theorem C.1. (a) We first consider $b = 0$. In this case $\rho_{0,\alpha}$ is a standard Poisson distribution on $\mathbb{N} \cup \{0\}$ having mean $\alpha$. It follows that $\alpha_0(c) = c$ is the unique value for which $\rho_{0,\alpha_0(c)}$ has mean $c$ and thus lies in $\mathcal{P}_{\mathbb{N}_0,c}$. This completes the proof of part (a) for $b = 0$.

We now consider $b \in \mathbb{N}$. In this case $\rho_{b,\alpha}$ is a probability measure on $\mathbb{N}_b$ having mean
\[
\sum_{j \in \mathbb{N}_b} j\rho_{b,\alpha;j} = \frac{1}{Z_b(\alpha)} \cdot \sum_{j \in \mathbb{N}_b} \frac{\alpha^j}{(j-1)!}
= \frac{1}{Z_b(\alpha)} \cdot \alpha \sum_{j=b-1}^{\infty} \frac{\alpha^j}{j!} = \frac{1}{Z_b(\alpha)} \cdot \alpha Z_{b-1}(\alpha). \tag{C.6}
\]
Thus $\rho_{b,\alpha}$ has mean $c$ if and only if $\alpha$ satisfies $\gamma_b(\alpha) = c$, where $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$. We prove part (a) by showing that $\gamma_b(\alpha) = c$ has a unique solution $\alpha_b(c) \in (0,\infty)$ for all $b \in \mathbb{N}$ and any $c > b$.

The proof depends on the following three steps:
1. $\lim_{\alpha\to 0^+} \gamma_b(\alpha) = b$;
2. $\lim_{\alpha\to\infty} \gamma_b(\alpha) = \infty$;
3. for all $\alpha \in (0,\infty)$, $\gamma_b'(\alpha) > 0$.

These three steps yield part (a). Indeed, by steps 1 and 3 there exists a sufficiently small value of $\varepsilon > 0$ such that $b < \gamma_b(\varepsilon) < c$, and by step 3 $\gamma_b$ is monotonically increasing on $(\varepsilon,\infty)$. Since by step 2 $\gamma_b(\alpha) \to \infty$ as $\alpha \to \infty$, we conclude that there exists a unique value $\alpha = \alpha_b(c) \in (0,\infty)$ solving $\gamma_b(\alpha_b(c)) = c$ and thus guaranteeing that $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$.

Step 3 is proved in Lemma C.3. We now prove steps 1 and 2.

Step 1.
For $b \in \mathbb{N}$ and for $\alpha \in (0,\infty)$
\[
\gamma_b(\alpha) = \frac{\alpha Z_{b-1}(\alpha)}{Z_b(\alpha)}
= \frac{\alpha \sum_{j=b-1}^{\infty} \alpha^j/j!}{\sum_{j=b}^{\infty} \alpha^j/j!}
= \frac{\sum_{j=b}^{\infty} \alpha^j/(j-1)!}{\sum_{j=b}^{\infty} \alpha^j/j!}
= \frac{\alpha^b/(b-1)! + o(\alpha^b)}{\alpha^b/b! + o(\alpha^b)} = b + o(1).
\]
The terms denoted by $o(\cdot)$ converge to 0 as $\alpha \to 0^+$. It follows that $\lim_{\alpha\to 0^+} \gamma_b(\alpha) = b$. This completes the proof of step 1.

Step 2.
For $b \in \mathbb{N}$ and for $\alpha \in (0,\infty)$
\[
\gamma_b(\alpha) = \frac{\alpha Z_{b-1}(\alpha)}{Z_b(\alpha)}
= \frac{\alpha\left(e^\alpha - \sum_{j=0}^{b-2} \alpha^j/j!\right)}{e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!}
= \frac{\alpha\left(1 - e^{-\alpha}\sum_{j=0}^{b-2} \alpha^j/j!\right)}{1 - e^{-\alpha}\sum_{j=0}^{b-1} \alpha^j/j!}
= \alpha\,(1 + o(1)).
\]
The term denoted by $o(1)$ converges to 0 as $\alpha \to \infty$. It follows that $\lim_{\alpha\to\infty} \gamma_b(\alpha) = \infty$. This completes the proof of step 2.

Having completed steps 1, 2, and 3, we have proved part (a) for all $b \in \mathbb{N}$. Since we also validated part (a) for $b = 0$, the proof of part (a) for all nonnegative integers $b$ is done.

(b) We first prove that $\alpha_b(c) < c$ for $b \in \mathbb{N}$ by observing that for any $\alpha \in (0,\infty)$ we have $Z_{b-1}(\alpha) > Z_b(\alpha)$. Thus $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha) > \alpha$, which implies that $\alpha_b(c) < \gamma_b(\alpha_b(c)) = c$. To prove that $\alpha_b(c) > c - b$, we use the inequality
\[
Z_b(\alpha) = \sum_{j=b}^{\infty} \frac{\alpha^j}{j!} > \frac{\alpha^b}{b!}
\]
to write
\[
\gamma_b(\alpha) = \frac{\alpha Z_{b-1}(\alpha)}{Z_b(\alpha)}
= \alpha + \frac{\alpha\,(Z_{b-1}(\alpha) - Z_b(\alpha))}{Z_b(\alpha)}
= \alpha + \frac{\alpha^b/(b-1)!}{Z_b(\alpha)}
< \alpha + \frac{\alpha^b/(b-1)!}{\alpha^b/b!} = \alpha + b.
\]
It follows that $c = \gamma_b(\alpha_b(c)) < \alpha_b(c) + b$, which gives the desired lower bound $\alpha_b(c) > c - b$.

We now bootstrap this lower bound into the tighter lower bound indicated in part (b). To do this we note that for any $\alpha \in (0,\infty)$
\[
\alpha e^\alpha > \alpha Z_{b-1}(\alpha) = \gamma_b(\alpha)Z_b(\alpha)
= \gamma_b(\alpha)\left(e^\alpha - \sum_{j=0}^{b-1} \frac{\alpha^j}{j!}\right)
> \gamma_b(\alpha)\,(e^\alpha - 2^b e^{\alpha/2}).
\]
The first lower bound $\alpha_b(c) > c - b$ now yields the tighter lower bound
\[
\alpha_b(c) > \gamma_b(\alpha_b(c))\,(1 - 2^b e^{-\alpha_b(c)/2}) = c\,(1 - 2^b e^{-\alpha_b(c)/2}) > c\,(1 - 2^b e^{-(c-b)/2}).
\]
This completes the proof of the bounds in part (b). Either of these bounds implies that $\lim_{c\to\infty} \alpha_b(c)/c = 1$. This proves that $\alpha_b(c)$ is asymptotic to $c$ as $c \to \infty$, completing the proof of part (b).

(c) According to part (b), for $c > 1$ we have $\alpha_1(c) < c = \alpha_0(c)$. In order to prove that for $b \in \mathbb{N}$ and $c > b+1$ we have $\alpha_{b+1}(c) < \alpha_b(c)$, we first prove that for $b \in \mathbb{N}$ and any $\alpha \in (0,\infty)$ we have $\gamma_b(\alpha) < \gamma_{b+1}(\alpha)$. As shown in the proof of part (b), for all $\alpha \in (0,\infty)$
\[
\gamma_b(\alpha) = \alpha + \frac{\alpha^b/(b-1)!}{Z_b(\alpha)} = \alpha + \frac{\alpha^b}{(b-1)!\,Z_b(\alpha)}.
\]
By substituting the power series representation for $Z_b(\alpha)$, we find that
\[
\frac{(b-1)!\,Z_b(\alpha)}{\alpha^b} = (b-1)! \cdot \sum_{j=0}^{\infty} \frac{\alpha^j}{(j+b)!} = \sum_{j=0}^{\infty} \frac{\alpha^j}{\prod_{i=0}^{j}(b+i)}.
\]
Since the product $\prod_{i=0}^{j}(b+i)$ is a strictly increasing function of $b \in \mathbb{N}$, it follows that for fixed $\alpha \in (0,\infty)$
\[
\gamma_b(\alpha) = \alpha + \frac{\alpha^b}{(b-1)!\,Z_b(\alpha)} = \alpha + \left(\sum_{j=0}^{\infty} \frac{\alpha^j}{\prod_{i=0}^{j}(b+i)}\right)^{-1}
\]
is a strictly increasing function of $b \in \mathbb{N}$. This proves that $\gamma_b(\alpha) < \gamma_{b+1}(\alpha)$ for $b \in \mathbb{N}$. We now choose $c > b+1$. Then $c = \gamma_b(\alpha_b(c)) < \gamma_{b+1}(\alpha_b(c))$. In step 3 in the proof of part (a) we showed that $\gamma_b'(\alpha) > 0$ for $\alpha \in (0,\infty)$ and thus that $\gamma_b$ is strictly increasing on $(0,\infty)$. If $\alpha_{b+1}(c) \ge \alpha_b(c)$, it would then follow that $c < \gamma_{b+1}(\alpha_b(c)) \le \gamma_{b+1}(\alpha_{b+1}(c))$. This contradicts the fact that $\gamma_{b+1}(\alpha_{b+1}(c)) = c$ and completes the proof of assertion (c).

(d) For $b \in \mathbb{N}$ we identify $\rho_{b,\alpha_b(c)}$ as the distribution of $\Xi_{\alpha_b(c)}$ conditioned on $\Xi_{\alpha_b(c)} \in \mathbb{N}_b$. Let $\Xi_{\alpha_b(c)}$ be defined on a probability space having measure $P$. For any $j \in \mathbb{N}_b$
\[
P(\Xi_{\alpha_b(c)} = j \mid \Xi_{\alpha_b(c)} \in \mathbb{N}_b)
= \frac{1}{P(\Xi_{\alpha_b(c)} \in \mathbb{N}_b)} \cdot P(\Xi_{\alpha_b(c)} = j)
= \frac{1}{1 - e^{-\alpha_b(c)}\sum_{i=0}^{b-1} [\alpha_b(c)]^i/i!} \cdot \frac{e^{-\alpha_b(c)}[\alpha_b(c)]^j}{j!}
= \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!} = \rho_{b,\alpha_b(c);j}.
\]
This completes the proof of part (d). The proof of Theorem C.1 is done, as is the proof of part (a) of Theorem 3.1.

In the next and final appendix we explore how the restriction involving $m = m(N)$ could be avoided in the definition of the set of configurations $\Omega_{N,b,m}$ in (2.1) and in the definition of the microcanonical ensemble $P_{N,b,m}$ in (2.3). Avoiding this restriction would enable us to present our results in a more natural form.
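The conclusions of Theorem C.1 can be illustrated numerically. The sketch below (Python, standard library only; the helper names are ours, not the paper's) computes $\alpha_b(c)$ by bisection, which is justified by the monotonicity of $\gamma_b$ proved in Lemma C.3, and then checks the bounds in part (b), the monotonicity in part (c), and the conditioned-Poisson identification in part (d) by verifying that a Poisson($\alpha_b(c)$) random variable conditioned on being at least $b$ has mean $c$.

```python
import math

def Z(b, a):
    # Z_b(alpha) = e^alpha - sum_{j=0}^{b-1} alpha^j / j!   (Z_0(alpha) = e^alpha)
    return math.exp(a) - sum(a**j / math.factorial(j) for j in range(b))

def gamma(b, a):
    # gamma_b(alpha) = alpha Z_{b-1}(alpha) / Z_b(alpha), the mean of rho_{b,alpha}
    return a * Z(b - 1, a) / Z(b, a)

def alpha_b(b, c, tol=1e-12):
    # gamma_b increases from b to infinity on (0, infinity) (Lemma C.3),
    # so the root of gamma_b(alpha) = c can be found by bisection
    lo, hi = 1e-3, c + b + 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gamma(b, mid) < c else (lo, mid)
    return 0.5 * (lo + hi)

c = 5.0
for b in (1, 2, 3):
    a = alpha_b(b, c)
    assert c > a > c - b                     # part (b): c > alpha_b(c) > c - b
    # part (d): Poisson(a) conditioned on >= b is rho_{b,alpha_b(c)}, hence has mean c
    p, probs = math.exp(-a), []
    for j in range(200):
        probs.append(p)
        p *= a / (j + 1)
    tail = sum(probs[b:])
    mean = sum(j * probs[j] for j in range(b, len(probs))) / tail
    assert abs(mean - c) < 1e-8
assert alpha_b(3, c) < alpha_b(2, c) < alpha_b(1, c) < c     # part (c)
```

The bisection bracket works because $\gamma_b(\alpha) \to b < c$ as $\alpha \to 0^+$ (step 1) and $\gamma_b(\alpha) > \alpha$ (proof of part (b)), so any $\alpha > c$ satisfies $\gamma_b(\alpha) > c$.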
D Avoiding Restriction Involving $m = m(N)$

In this appendix we explore a more natural formulation of our results, and we explain the issues that make such a formulation so challenging. Among these issues there is a limitation that seems to be inherent in the approximation procedure we use to prove our results. This discussion makes contact with several interesting ideas including Stirling numbers of the second kind and associated Stirling numbers of the second kind.

Let us review the notation. We start with the configuration space $\Omega_N = \Lambda_N^K$. For $\omega \in \Omega_N$, $K_\ell(\omega)$ is the droplet-size random variable denoting the number of particles occupying the site $\ell \in \Lambda_N$, and $N_j(\omega)$ is the number of sites for which $K_\ell(\omega) = j$. We also introduce $|N(\omega)|_+$, which is the number of indices $j$ for which $N_j(\omega) \ge 1$. Given $b$ a nonnegative integer, we focus on the configuration space $\Omega_{N,b,m}$ consisting of all $\omega \in \Omega_N$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles and for which $|N(\omega)|_+ \le m$. The quantity $m$ is a function $m(N)$ satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. In symbols
\[
\Omega_{N,b,m} = \{\omega \in \Omega_N : K_\ell(\omega) \ge b \ \forall \ell \in \Lambda_N \text{ and } |N(\omega)|_+ \le m\}. \tag{D.1}
\]
The first constraint involving $K_\ell$ is intrinsic to the definition of the model. By contrast, the second constraint involving $m$ is not intrinsic to the definition of the model, but rather is a useful technical device that enables us to control the errors that arise at various stages of the analysis. A more natural configuration space would be the set $\Omega_{N,b}$ consisting of all $\omega \in \Omega_N$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles but for which there is no restriction on the number of positive quantities $N_j(\omega)$. In symbols
\[
\Omega_{N,b} = \{\omega \in \Omega_N : K_\ell(\omega) \ge b \ \forall \ell \in \Lambda_N\}. \tag{D.2}
\]
We now come to the main point. Let $P_N$ be the uniform probability measure on $\Omega_N$ that assigns equal probability $1/N^K$ to each of the $N^K$ configurations in $\Omega_N$.
All of the results in the paper are formulated for the probability measure $P_{N,b,m}$, defined as the restriction of $P_N$ to $\Omega_{N,b,m}$. However, because the second constraint in the definition of $\Omega_{N,b,m}$ involving $m$ is not intrinsic to the definition of the model, it would be more natural to formulate our results for the probability measure $P_{N,b}$, defined as the restriction of $P_N$ to the larger and more natural configuration space $\Omega_{N,b}$.

In order to understand why our results are formulated for $P_{N,b,m}$ and not for $P_{N,b}$, we explain how the constraint involving $m$ arises in the paper. There are three sources. First, in Lemma 3.2 we require that $m\log N/N \to 0$ as $N \to \infty$ to prove that the error $\zeta^{(2)}_N(\nu)$ in (3.13) converges to 0 uniformly for $\nu \in A_{N,b,m}$. Second, we require that $m/N \to 0$ as $N \to \infty$ to prove part (a) of Lemma 3.3 and the weak convergence $\theta^{(N)} \Rightarrow \theta$ in part (a) of Theorem B.1. Part (a) of Lemma 3.3 is used to prove part (b) of the lemma and to verify hypothesis (i) in Theorem 4.2 when applied to Theorem 4.1. Third, to prove part (b) of Lemma 3.3 and to verify hypothesis (iv) in Theorem 4.2 when applied to Theorem 4.1, the stronger condition that $m^2/N \to 0$ as $N \to \infty$ is required. The source of this condition is Lemma B.3, which is used to prove the approximation result in Theorem B.1. This stronger condition on $m$ is optimal in the sense that it is a minimal assumption guaranteeing that an error term in the lower bound in part (a) of Lemma B.3 and in the upper bound in part (b) of the lemma converge to 0.

The stronger condition that $m^2/N \to 0$ as $N \to \infty$ means that $m \to \infty$ at a slower rate than $\sqrt{N}$. What we find fascinating is the fact that the relationship between $m$ and $\sqrt{N}$ is also central to another component of our analysis. As we show in the next theorem, if $m \to \infty$ at a faster rate than $\sqrt{N}$, then for all sufficiently large $N$ the configuration spaces $\Omega_{N,b,m}$ and $\Omega_{N,b}$ coincide, as do the conditional probability measures $P_{N,b,m}$ and $P_{N,b}$.

Theorem D.1.
Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Define $\Omega_{N,b}$ as in (D.2) and $\Omega_{N,b,m}$ as in (D.1), where $m = m(N)$ is any function satisfying $m(N) \to \infty$ as $N \to \infty$. The following conclusions hold.

(a) $\max_{\omega \in \Omega_{N,b}} |N(\omega)|_+ = \sqrt{2cN + 1/4} - 1/2$.

(b) If $m/\sqrt{N} \to \infty$ as $N \to \infty$, then for all sufficiently large $N$, $\Omega_{N,b,m} = \Omega_{N,b}$ and $P_{N,b,m} = P_{N,b}$.

Proof. (a) For $\omega \in \Omega_{N,b}$, $N(\omega)$ denotes the sequence $\{N_j(\omega), j \in \mathbb{N}_b\}$. Let $\kappa(\omega) = |N(\omega)|_+$, and let $b \le j_1 < j_2 < \cdots < j_{\kappa(\omega)}$ denote the indices for which $N_j(\omega) \ge 1$. We have strict inequality since the $|N(\omega)|_+$ droplet classes have different sizes. Since for each of these indices we have $j_k \ge k$, the second conservation law in (2.2) implies that
\[
K = cN = \sum_{k=1}^{\kappa(\omega)} j_k N_{j_k}(\omega) \ge \sum_{k=1}^{\kappa(\omega)} k N_{j_k}(\omega) \ge \sum_{k=1}^{\kappa(\omega)} k = \frac{\kappa(\omega)(\kappa(\omega)+1)}{2}.
\]
It follows that $2cN \ge \kappa(\omega)(\kappa(\omega)+1) = (\kappa(\omega)+1/2)^2 - 1/4$, which in turn implies that
\[
\kappa(\omega) = |N(\omega)|_+ \le \sqrt{2cN + 1/4} - 1/2. \tag{D.3}
\]
Now let $\omega$ be any configuration in $\Omega_{N,b}$ for which $N_k(\omega) = 1$ for $k = 1, 2, \ldots, |N(\omega)|_+$. In this case
\[
K = cN = \sum_{k=1}^{\kappa(\omega)} k N_k(\omega) = \frac{\kappa(\omega)(\kappa(\omega)+1)}{2},
\]
which in turn implies that $|N(\omega)|_+ = \sqrt{2cN + 1/4} - 1/2$. Since this gives equality in (D.3), the proof of part (a) is complete.

(b) Since $m/\sqrt{N} \to \infty$, part (a) implies that for any $\omega \in \Omega_{N,b}$ we have $|N(\omega)|_+ \le m$ for all sufficiently large $N$. It follows that for all sufficiently large $N$, $\Omega_{N,b,m} = \Omega_{N,b}$. Since $P_{N,b,m}$ and $P_{N,b}$ are the respective restrictions of $P_N$ to $\Omega_{N,b,m}$ and $\Omega_{N,b}$, it also follows that these two probability measures coincide for all sufficiently large $N$. The proof of the theorem is complete.

Theorem D.1 motivated us to seek a new approximation procedure. The new procedure would replace the condition $m^2/N \to 0$, needed to prove Lemma B.3, with a function $m = m(N)$ satisfying $m/\sqrt{N} \to \infty$, needed to prove Theorem D.1, and satisfying the conditions needed to prove Lemma 3.2, part (a) of Lemma 3.3, and part (a) of Theorem B.1, which are $m\log N/N \to 0$ and $m/N \to 0$; an example of such a function would be $m = N^\delta$ for some $\delta \in (1/2, 1)$. If we could find such an approximation procedure, then all our results formulated for $P_{N,b,m}$ would automatically hold for the more natural measure $P_{N,b}$. Unfortunately, despite great effort, we were unsuccessful.

Because of this situation it is worthwhile to look more closely at the two components of the approximation procedure presented in appendix B. Given any measure $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$, this procedure constructs a sequence $\theta^{(N)}$ lying in the range $B_{N,b,m}$ of $\Theta_{N,b}$ and having the following two properties:

(a) $\theta^{(N)} \Rightarrow \theta$ as $N \to \infty$;

(b) if $R(\theta|\rho_{b,\alpha}) < \infty$, then $R(\theta^{(N)}|\rho_{b,\alpha}) \to R(\theta|\rho_{b,\alpha})$ as $N \to \infty$.

We are able to construct a number of sequences $\theta^{(N)} \in B_{N,b,m}$ that satisfy property (a) under the hypothesis that $m/N \to 0$. However, none of these satisfy property (b) with a function $m$ satisfying $m/\sqrt{N} \to \infty$. On the basis of this experience, we conjecture that there exists no sequence $\theta^{(N)} \in B_{N,b,m}$ satisfying both properties (a) and (b) under a hypothesis that is weaker than the current condition that $m^2/N \to 0$.

This setback motivated us to seek an alternate approach that would allow us to replace the probability measure $P_{N,b,m}$, which is the restriction of the uniform measure $P_N$ to $\Omega_{N,b,m}$, with the probability measure $P_{N,b}$, which is the restriction of $P_N$ to $\Omega_{N,b}$.
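The combinatorial heart of part (a) of Theorem D.1 is the triangular-number constraint: $\kappa(\kappa+1)/2 \le cN$ forces $\kappa \le \sqrt{2cN + 1/4} - 1/2$, with equality when $cN$ is a triangular number and one droplet of each size $1, \ldots, \kappa$ is used. A small sketch (Python, standard library only; the function name is ours) confirms this arithmetic:

```python
import math

def max_distinct_classes(cN):
    # largest kappa with kappa(kappa+1)/2 <= cN, i.e. the most droplet
    # classes of pairwise distinct sizes that cN particles can occupy
    k = 0
    while (k + 1) * (k + 2) // 2 <= cN:
        k += 1
    return k

# sqrt(2cN + 1/4) - 1/2 = (sqrt(8cN + 1) - 1)/2; for triangular values of cN
# the bound (D.3) is attained by taking one droplet of each size 1, ..., kappa
for cN in (6, 10, 5050):
    k = max_distinct_classes(cN)
    assert k == (math.isqrt(8 * cN + 1) - 1) // 2
    assert k * (k + 1) // 2 == cN        # equality case: cN is triangular
```

For example, $cN = 5050 = 100 \cdot 101/2$ gives $\kappa = 100$: no configuration with $5050$ particles can populate more than $100$ droplet classes of distinct sizes.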
The alternate approach is based on equation (D.4) relating the probability measures $P_{N,b}$ and $P_{N,b,m}$. This approach is successful for $b = 0$ and $b = 1$ in transferring to $P_{N,b}$ the large deviation lower bound proved in part (d) of Theorem 2.1 for $P_{N,b,m}$. However, so far it has not been successful for any value of $b$ in transferring to $P_{N,b}$ either of the large deviation upper bounds proved in parts (b) and (c) of Theorem 2.1 for $P_{N,b,m}$.

The starting point of the alternate approach is the following relationship between $P_{N,b}$ and $P_{N,b,m}$. For $A$ any subset of $\Omega_{N,b}$
\[
P_{N,b}(A) = \frac{P_N(A \cap \Omega_{N,b})}{P_N(\Omega_{N,b})} \tag{D.4}
\]
\[
= \frac{P_N(A \cap \Omega_{N,b,m})}{P_N(\Omega_{N,b})} + \frac{P_N(A \cap (\Omega_{N,b} \setminus \Omega_{N,b,m}))}{P_N(\Omega_{N,b})}
= \frac{\mathrm{card}(\Omega_{N,b,m})}{\mathrm{card}(\Omega_{N,b})} \cdot P_{N,b,m}(A) + P_{N,b}(A \cap (\Omega_{N,b} \setminus \Omega_{N,b,m})).
\]
Part (a) of the next theorem gives a hypothesis that allows us to transfer the large deviation lower bound for open subsets of $\mathcal{P}_{\mathbb{N}_b,c}$ from $P_{N,b,m}$ to $P_{N,b}$. According to part (b), this hypothesis is satisfied for $b = 0$ and $b = 1$. We prove part (a) after the statement of the theorem. The proof of part (b) for $b = 0$ is based on Proposition D.3, while the proof for $b = 1$ is based on Proposition D.3 and Theorem D.4.

Theorem D.2.
Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. Let $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$ be the distribution having the components defined in (2.7). The following conclusions hold.

(a) Assume that
\[
\lim_{N\to\infty} \frac{1}{N}\log\left(\frac{\mathrm{card}(\Omega_{N,b,m})}{\mathrm{card}(\Omega_{N,b})}\right) = 0. \tag{D.5}
\]
Then for any open subset $G$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation lower bound
\[
\liminf_{N\to\infty} \frac{1}{N}\log P_{N,b}(\omega \in \Omega_{N,b} : \Theta_{N,b}(\omega) \in G) \ge -R(G|\rho_{b,\alpha_b(c)}). \tag{D.6}
\]

(b) The hypothesis in part (a) is satisfied for $b = 0$ and $b = 1$. Thus for these values of $b$ the large deviation lower bound (D.6) holds.

Proof of part (a).
Let $A = \{\omega \in \Omega_{N,b} : \Theta_{N,b}(\omega) \in G\}$. It follows from (D.4) that
\[
P_{N,b}(A) \ge \frac{\mathrm{card}(\Omega_{N,b,m})}{\mathrm{card}(\Omega_{N,b})} \cdot P_{N,b,m}(A).
\]
Hence by the hypothesis in part (a) and the large deviation lower bound in part (d) of Theorem 2.1
\[
\liminf_{N\to\infty} \frac{1}{N}\log P_{N,b}(A)
\ge \liminf_{N\to\infty} \frac{1}{N}\log\left(\frac{\mathrm{card}(\Omega_{N,b,m})}{\mathrm{card}(\Omega_{N,b})}\right) + \liminf_{N\to\infty} \frac{1}{N}\log P_{N,b,m}(A)
\]
\[
= \liminf_{N\to\infty} \frac{1}{N}\log P_{N,b,m}(A)
= \liminf_{N\to\infty} \frac{1}{N}\log P_{N,b,m}(\omega \in \Omega_{N,b,m} : \Theta_{N,b}(\omega) \in G)
\ge -R(G|\rho_{b,\alpha_b(c)}).
\]
This completes the proof of part (a).

In order to prove part (b) of Theorem D.2, we now show that condition (D.5) holds if $b = 0$ or $b = 1$. To prove this we compare the asymptotic behavior of $\mathrm{card}(\Omega_{N,b,m})$ with that of $\mathrm{card}(\Omega_{N,b})$ for these values of $b$. A formula for the asymptotic behavior of $\mathrm{card}(\Omega_{N,b,m})$ for any nonnegative integer $b$ is derived in part (b) of Lemma 3.3. In the next proposition we express this formula in a different and more useful form for $b = 0$ and $b = 1$. Although we do not apply it here, in part (c) we give the analogous formula for $b \in \mathbb{N}$ satisfying $b \ge 2$.

Proposition D.3. Let $b = 0$ or $b = 1$, and fix a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (2.1) and satisfying $m(N) \to \infty$ and $m(N)/N \to 0$ as $N \to \infty$. Let $\alpha_b(c)$ be the quantity defined in part (a) of Theorem 3.1. The following conclusions hold.

(a) For $b = 0$
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,0,m}) = c\log N + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$.

(b) For $b = 1$
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,1,m}) = c\log N + (c-1)\log[c/\alpha_1(c)] + \alpha_1(c) - c + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$.

(c) For $b \in \mathbb{N}$ satisfying $b \ge 2$
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,b,m}) = c\log N + c\log[c/\alpha_b(c)] + \log Z_b(\alpha_b(c)) - c + \eta_N,
\]
where $Z_b(\alpha_b(c)) = e^{\alpha_b(c)} - \sum_{j=0}^{b-1} [\alpha_b(c)]^j/j!$ and $\eta_N \to 0$ as $N \to \infty$.

Proof.
We start by considering any nonnegative integer $b$. Let $\alpha$ be the positive real number in Lemma 3.2, and define $f(\alpha,b,c,K) = \log Z_b(\alpha) - c\log\alpha + c\log K - c$. According to part (b) of Lemma 3.3
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,b,m}) = f(\alpha,b,c,K) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta|\rho_{b,\alpha}) + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$. We now appeal to item (i) in part (f) of Theorem A.1, which shows that
\[
\min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta|\rho_{b,\alpha}) = g(\alpha,b,c) = \log Z_b(\alpha) - c\log\alpha - (\log Z_b(\alpha_b(c)) - c\log\alpha_b(c)).
\]
Substituting this formula into the preceding display, we obtain
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,b,m})
= \log Z_b(\alpha_b(c)) - c\log\alpha_b(c) + c\log K - c + \eta_N
= c\log N + c\log[c/\alpha_b(c)] + \log Z_b(\alpha_b(c)) - c + \eta_N, \tag{D.7}
\]
where $\eta_N \to 0$ as $N \to \infty$.

We next use (D.7) to prove part (a) for $b = 0$ and part (b) for $b = 1$. Part (c) for $b \in \mathbb{N}$ satisfying $b \ge 2$ is obtained by specializing (D.7) to these values.

(a) As pointed out in part (a) of Theorem 3.1, if $b = 0$, then $\alpha_0(c) = c$. In this case (D.7) becomes
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,0,m}) = c\log N + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$. This completes the proof of part (a).

(b) For $b = 1$, $\alpha_1(c)$ is the unique solution in $(0,\infty)$ of the equation
\[
c = \frac{\alpha_1(c)\,Z_0(\alpha_1(c))}{Z_1(\alpha_1(c))} = \frac{\alpha_1(c)\,e^{\alpha_1(c)}}{Z_1(\alpha_1(c))}.
\]
It follows that
\[
\log Z_1(\alpha_1(c)) = \alpha_1(c) + \log\alpha_1(c) - \log c.
\]
Substituting this back into (D.7) yields
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,1,m}) = c\log N + (c-1)\log[c/\alpha_1(c)] + \alpha_1(c) - c + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$. The last equation coincides with the conclusion of part (b) for $b = 1$. This completes the proof of the proposition.

We now prove part (b) of Theorem D.2 first for $b = 0$ and then for $b = 1$.

Proof of part (b) of Theorem D.2 for $b = 0$. We verify condition (D.5) for $b = 0$. According to part (a) of Proposition D.3
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,0,m}) = c\log N + \eta_N,
\]
where $\eta_N \to 0$ as $N \to \infty$. On the other hand, when $b = 0$, $\Omega_{N,b}$ equals $\Omega_N = \Lambda_N^K$. Therefore
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,0}) = \frac{1}{N} \cdot K\log N = c\log N.
\]
We conclude that
\[
\frac{1}{N}\log\left(\frac{\mathrm{card}(\Omega_{N,0,m})}{\mathrm{card}(\Omega_{N,0})}\right) = \eta_N \to 0 \text{ as } N \to \infty.
\]
Thus condition (D.5) holds for $b = 0$, and the large deviation lower bound (D.6) is valid for $b = 0$. This completes the proof.

The verification of condition (D.5) for $b = 1$ is much deeper than that for $b = 0$.

Proof of part (b) of Theorem D.2 for $b = 1$. This proof depends on the relationship between $\mathrm{card}(\Omega_{N,1})$ and Stirling numbers of the second kind. Given $c$ a rational number in $(1,\infty)$, let $K$ and $N$ be positive integers satisfying $K/N = c$. We denote by $S(K,N)$ the Stirling number of the second kind, which is the number of ways to partition a set of $K$ elements into $N$ nonempty subsets [2, pp. 96–97]. The $N!$ permutations of the class of all such partitions correspond to all the ways of placing the $K$ particles in the droplet model onto the $N$ sites of $\Lambda_N$ and therefore are in one-to-one correspondence with the elements of $\Omega_{N,1}$. It follows that $\mathrm{card}(\Omega_{N,1}) = N! \cdot S(K,N)$.

The computation of $N^{-1}\log \mathrm{card}(\Omega_{N,1})$ is given in part (b) of the next theorem. This computation is based on a deep, classical result on the asymptotic behavior of $S(K,N)$ that is derived in Example 5.4 in [1] and is stated in part (a) of the next theorem in our notation.
The quantities in [1] denoted by $n$, $k$, and $r$ correspond respectively to our $K$, $N$, and $\alpha_1(c)$.

We now apply part (b) of Proposition D.3 and the conclusion of the next theorem; the former involves the error term $\eta_N \to 0$ as $N \to \infty$, and the latter involves the error term $\varepsilon_N \to 0$ as $N \to \infty$. Except for the error terms the asymptotic formulas are identical. Hence we obtain
\[
\frac{1}{N}\log\left(\frac{\mathrm{card}(\Omega_{N,1,m})}{\mathrm{card}(\Omega_{N,1})}\right) = \eta_N - \varepsilon_N \to 0 \text{ as } N \to \infty.
\]
This shows that the hypothesis in part (a) of Theorem D.2 is satisfied for $b = 1$. The proof of part (b) of this theorem for $b = 1$ will be complete after we prove the next result.

Theorem D.4.
Let $S(K,N)$ denote the Stirling number of the second kind. Fix a rational number $c \in (1,\infty)$, any $\delta \in (1,\infty)$, and any $M \in (\delta,\infty)$. Then as $K \to \infty$ and $N \to \infty$ with $K/N = c$
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,1}) = \frac{1}{N}\log(N! \cdot S(K,N))
= c\log N + (c-1)\log[c/\alpha_1(c)] + \alpha_1(c) - c + \varepsilon_N,
\]
where $\varepsilon_N \to 0$ as $N \to \infty$.

Proof.
We start with the asymptotic formula for $S(K,N)$ derived in Example 5.4 in [1] and stated here in our notation. For any $\delta \in (0,1)$ and any $M < \infty$, uniformly for $c \in (1+\delta, M)$ the asymptotic behavior of $S(K,N)$ is given by
\[
S(K,N) = \frac{K!\,e^{N\alpha_1(c)}}{N!\,c^N\,[\alpha_1(c)]^{K-N}\,[1 - ce^{-\alpha_1(c)}]^{1/2}\sqrt{2\pi K}}\,(1 + o(1)).
\]
The quantities in [1] denoted by $n$, $k$, and $r$ correspond respectively to our $K$, $N$, and $\alpha_1(c)$. It follows that
\[
\frac{1}{N}\log \mathrm{card}(\Omega_{N,1}) = \frac{1}{N}\log(N! \cdot S(K,N))
= \frac{1}{N}\log K! + \alpha_1(c) - \log c - \frac{K-N}{N}\log\alpha_1(c) + \varepsilon_N
\]
\[
= c\log N + c\log c - c + \alpha_1(c) - \log c - (c-1)\log\alpha_1(c) + \varepsilon_N
= c\log N + (c-1)\log[c/\alpha_1(c)] + \alpha_1(c) - c + \varepsilon_N,
\]
where $\varepsilon_N \to 0$ as $N \to \infty$. The proof of the theorem is complete.

According to Theorem D.2, for $b = 0$ and $b = 1$ the large deviation lower bound, proved in part (d) of Theorem 2.1 for $P_{N,b,m}$, is also valid for $P_{N,b}$. Thus for any open subset $G$ of $\mathcal{P}_{\mathbb{N}_b,c}$
\[
\liminf_{N\to\infty} \frac{1}{N}\log P_{N,b}(\omega \in \Omega_{N,b} : \Theta_{N,b}(\omega) \in G) \ge -R(G|\rho_{b,\alpha_b(c)}). \tag{D.8}
\]
For $b \in \mathbb{N}$ satisfying $b \ge 2$ the quantity $\mathrm{card}(\Omega_{N,b})$ is related to the $b$-associated Stirling number $S_b(K,N)$ of the second kind by the formula $\mathrm{card}(\Omega_{N,b}) = N! \cdot S_b(K,N)$. The quantity $S_b(K,N)$ is the number of ways to partition a set of $K$ elements into $N$ subsets, each of which contains at least $b$ elements [3, pp. 221–222]. One could verify condition (D.5) for these values of $b$ if there were an asymptotic formula for $S_b(K,N)$ analogous to the formula derived in Example 5.4 in [1]. However, we are unable to locate such a formula. Nevertheless, based on our calculation for $b = 0$ and $b = 1$ it is reasonable to conjecture that condition (D.5) holds for any $b \in \mathbb{N}$ satisfying $b \ge 2$, which would imply the large deviation lower bound (D.8) for these values.

We now explore whether we can extend to $P_{N,b}$ the large deviation upper bound proved in parts (b) and (c) of Theorem 2.1 for $P_{N,b,m}$. If we could do this, then we could transfer to $P_{N,b}$ the fact, proved in Theorem 2.2 and Corollary 2.3, that with respect to $P_{N,b,m}$, $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of $\Theta_{N,b}$ and of $K_\ell$. Unfortunately, we are unable to prove the large deviation upper bound for $P_{N,b}$ using either of two possible approaches explained briefly below. Concerning the statement about the equilibrium distribution, the best that we can do is to use the large deviation lower bound for $b = 0$ and $b = 1$ to prove that with respect to $P_{N,b}$ for these values of $b$, $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of $\Theta_{N,b}$ in the following weak form: for any $\varepsilon > 0$
\[
\lim_{N\to\infty} \frac{1}{N}\log P_{N,b}(\omega \in \Omega_{N,b} : \Theta_{N,b}(\omega) \in B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)) = 0,
\]
where $B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)$ is the open ball in $\mathcal{P}_{\mathbb{N}_b,c}$ with center $\rho_{b,\alpha_b(c)}$ and radius $\varepsilon$ with respect to the Prohorov metric $\pi$. This follows from (D.8) with $G = B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)$ and from the facts that $R(B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)\,|\,\rho_{b,\alpha_b(c)}) = 0$ and
\[
\limsup_{N\to\infty} \frac{1}{N}\log P_{N,b}(\omega \in \Omega_{N,b} : \Theta_{N,b}(\omega) \in B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)) \le \limsup_{N\to\infty} \frac{1}{N}\log 1 = 0.
\]
We end this section by discussing two possible approaches to transferring to $P_{N,b}$ the large deviation upper bound proved in parts (b) and (c) of Theorem 2.1 for $P_{N,b,m}$. The first approach is based on the following upper bound valid for any subset $A$ of $\Omega_{N,b}$:
\[
P_{N,b}(A) \le P_{N,b,m}(A) + P_{N,b}(A \cap (\Omega_{N,b} \setminus \Omega_{N,b,m})).
\]
This formula is a consequence of (D.4) and the fact that $\mathrm{card}(\Omega_{N,b,m})/\mathrm{card}(\Omega_{N,b}) \le 1$. Now let $F$ be a compact subset of $\mathcal{P}_{\mathbb{N}_b,c}$, and define $A = \{\omega \in \Omega_{N,b} : \Theta$
N,b ∈ F } . The case where F isa closed subset of P N b ,c can be handled analogously. By part (b) of Theorem 2.1 lim sup N →∞ N log P N,b ( A ) ≤ max (cid:18) lim sup N →∞ N log P N,b,m ( A ) , lim sup N →∞ N P
N,b ( A ∩ (Ω N,b \ Ω N,b,m ) (cid:19) ≤ max (cid:18) − R ( F | ρ b,α b ( c ) ) , lim sup N →∞ N P
N,b ( A ∩ (Ω N,b \ Ω N,b,m ) (cid:19) . If we could prove that − R ( F | ρ b,α b ( c ) ) is greater than or equal to the second expression on theright side of the last line, then we would be able to transfer the large deviation upper bound to P N,b . Unfortunately, however, we are unable prove that − R ( F | ρ b,α b ( c ) ) is greater than or equalto the second expression on the right side of the last line.The second approach to transferring to P N,b the large deviation upper bound in parts (c) and(d) of Theorem 2.1 rests on a careful analysis of how these upper bounds follow from the localestimate in part (b) of Theorem 3.1 and from Theorem 4.2 as applied to Theorem 4.1, for whichwe need only the large deviation upper bound for the sets appearing in Theorem 4.1. Omittingthe details, we claim that the crucial step is to show that lim N →∞ min ν ∈ A N,b,m R ( θ N,b,ν | ρ b,α ) = min θ ∈P N b,c R ( θ | ρ b,α ) . At the end of the proof of part (b) of Lemma 3.3 we prove this limit by applying the approxi-mation procedure in appendix B, which requires the condition that m /N → as N → ∞ . Ifwe could prove this limit without invoking the approximation procedure and under a condition79hat is compatible with m/ √ N → ∞ as n → ∞ , then the large deviation upper bound in parts(c) and (d) of Theorem 2.1 would hold with P N,b replacing P N,b,m . Unfortunately, we have notbeen able to carry this out.We end this section by proposing an interesting test case for gaining insight into whetherthe conditioned measure P N,b,m could be replaced by P N,b in the LDP for Θ N,b in Theorem2.1. This test case would be to use the methods of this paper to prove Sanov’s Theorem forthe empirical measures of i.i.d. random variables taking values in N b . This theorem, of course,can be proved directly without the methods of this chapter [6, Thm. 6.2.10], [7, Thm. 4.5]. 
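The equilibrium statement itself is easy to probe by simulation in the simplest case b = 0, where the equilibrium distribution is the (unrestricted) Poisson distribution with mean c: placing K = cN particles uniformly onto N sites and tallying the empirical occupancy distribution produces a measure close to Poisson(c). The following sketch is ours; the helper names and the total-variation comparison are illustrative, not from the paper.

```python
import math
import random

def number_density(N, c, rng):
    """Place K = c*N particles uniformly on N sites; return the empirical
    distribution of site occupancies (the number-density measure)."""
    K = c * N
    counts = [0] * N
    for _ in range(K):
        counts[rng.randrange(N)] += 1
    hist = {}
    for k in counts:
        hist[k] = hist.get(k, 0) + 1
    return {n: m / N for n, m in hist.items()}

def poisson_pmf(n, c):
    return math.exp(-c) * c**n / math.factorial(n)

rng = random.Random(0)
c, N = 2, 100_000
theta = number_density(N, c, rng)

# total variation distance between theta and the Poisson distribution with mean c
nmax = max(theta)
tv = 0.5 * sum(abs(theta.get(n, 0.0) - poisson_pmf(n, c)) for n in range(nmax + 1))
tv += 0.5 * (1.0 - sum(poisson_pmf(n, c) for n in range(nmax + 1)))  # Poisson tail mass
print(tv)  # small, and decreasing as N grows
```

Note that the occupancies are dependent (they sum to K, and the mean of theta is exactly c), which is precisely why the paper's setting differs from the i.i.d. setting of Sanov's Theorem.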
If one uses the methods of this paper, then one would first have to prove it for the analogue of the measure P_{N,b,m} restricted to the analogue of the restricted configuration space Ω_{N,b,m}, in which the number of positive components of N_j is restricted by m = m(N), where m(N) → ∞ at an appropriate rate. It would be instructive to see whether this restriction can be eliminated using one of the approaches proposed in this appendix.

References

[1] Edward A. Bender. Central and local limit theorems applied to asymptotic enumeration.
Journal of Combinatorial Theory, Series A 15:91–111, 1973.

[2] Charalambos A. Charalambides. Enumerative Combinatorics. Chapman & Hall/CRC, Boca Raton, 2002.

[3] Louis Comtet. Advanced Combinatorics: The Art of Finite and Infinite Expansions. Translated by J. W. Nienhuys. Revised and enlarged edition. D. Reidel Publishing Company, Dordrecht, Holland, 1974.

[4] Definition of "powder," retrieved May 16, 2014 from the Wikipedia website: http://en.wikipedia.org/wiki/Powder_(substance).

[5] Definition of "spray," retrieved March 4, 2014 from the Dictionary.com website: http://dictionary.reference.com/browse/spray?s=t.

[6] Amir Dembo and Ofer Zeitouni. Large Deviations Techniques and Applications. Second edition. Springer, New York, 1998.

[7] M. D. Donsker and S. R. S. Varadhan. Asymptotic evaluation of certain Markov process expectations for large time, III. Communications on Pure and Applied Mathematics 29:389–461, 1976.

[8] Paul Dupuis and Richard S. Ellis. A Weak Convergence Approach to the Theory of Large Deviations. John Wiley & Sons, New York, 1997.

[9] Richard S. Ellis. Entropy, Large Deviations, and Statistical Mechanics. Springer, New York, 1985. Reprinted in 2006 in Classics in Mathematics.

[10] Richard S. Ellis. The theory of large deviations and applications to statistical mechanics. In Long-Range Interacting Systems: Les Houches 2008 Session XC, 227–277. Edited by T. Dauxois, S. Ruffo, and L. F. Cugliandolo. Oxford University Press, New York, 2010. Posted at http://people.math.umass.edu/~rsellis/pdf-files/Les-Houches-paper.pdf.

[11] Richard S. Ellis. The theory of large deviations: from Boltzmann's 1877 calculation to equilibrium macrostates in 2D turbulence. Physica D 133:106–136, 1999.

[12] Richard S. Ellis and Shlomo Ta'asan. Large deviation analysis of a droplet model having a Poisson equilibrium distribution. International Journal of Stochastic Analysis. Posted at http://people.math.umass.edu/~rsellis/pdf-files/ldp-droplet-model.pdf.

[13] Richard S. Ellis and Shlomo Ta'asan. The Boltzmann-Sanov large deviation principle and applications to statistical mechanics. Unpublished 48-page LaTeX manuscript, 2014. Posted at http://people.math.umass.edu/~rsellis/pdf-files/boltzmann-sanov-applications.pdf.

[14] Stewart N. Ethier and Thomas G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, New York, 1986.

[15] Richard S. Ellis and Aaron Wyner. Uniform large deviation property of the empirical process of a Markov chain. Annals of Probability.

[16] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, Cambridge, 2009.

[17] R. A. Mugele and H. D. Evans. Droplet size distribution in sprays. Ind. Eng. Chem.

[18] James R. Munkres. Topology. Second edition. Prentice-Hall, Upper Saddle River, NJ, 2000.

[19] E. Neuman. Inequalities and bounds for the incomplete gamma function. Results in Math.

[20] Atomisation and Spray Technology