Fluctuations in Mean-Field Ising models
Nabarun Deb and Sumit Mukherjee∗

Department of Statistics, Columbia University
e-mail: [email protected]; [email protected]

Abstract: In this paper, we study the fluctuations of the average magnetization in an Ising model on an approximately $d_N$ regular graph $G_N$ on $N$ vertices. In particular, if $G_N$ is "well connected", we show that whenever $d_N \gg \sqrt{N}$, the fluctuations are universal and the same as those of the Curie-Weiss model in the entire ferromagnetic parameter regime. We give a counterexample demonstrating that the condition $d_N \gg \sqrt{N}$ is tight, in the sense that the limiting distribution changes if $d_N \sim \sqrt{N}$, except in the high temperature regime. By refining our argument, we extend universality in the high temperature regime to $d_N \gg N^{1/3}$ up to logarithmic factors. Our results yield universal fluctuations of the average magnetization in Ising models on regular graphs, Erdős-Rényi graphs (directed and undirected), stochastic block models, and sparse regular graphons. In fact, our results apply to general matrices with non-negative entries, including Ising models on a Wigner matrix and the block spin Ising model. As a by-product of our proof technique, we obtain Berry-Esseen bounds for these fluctuations, exponential concentration for the average of spins, and tight error bounds for the Mean-Field approximation of the partition function.

MSC 2010 subject classifications:
Primary 82B20; secondary 82B26.
Keywords and phrases:
Berry-Esseen bound, Ising model, Regular graphs, Mean-Field, Partition function.
1. Introduction
The Ising model is a discrete Markov random field which was initially introduced as a mathematical model of ferromagnetism in statistical physics, and has received extensive attention in Probability and Statistics (c.f. [1, 4, 5, 10, 14, 16, 17, 20, 21, 23, 24, 25, 26, 27, 29, 31, 32, 34, 35] and references therein). The model can be described by the following probability mass function on $\sigma := (\sigma_1, \cdots, \sigma_N) \in \{-1, 1\}^N$:
$$\mathbb{P}(\sigma) := \frac{1}{Z_N(\beta, B)} \exp\Big(\frac{\beta}{2}\,\sigma^\top A_N \sigma + B\sum_{i=1}^N \sigma_i\Big). \tag{1.1}$$
Here $A_N$ is a symmetric $N \times N$ matrix with non-negative entries and zeroes on its diagonal, and $\beta > 0$, $B \in \mathbb{R}$ are scalar parameters, often referred to in the statistical physics literature as the inverse temperature and the external magnetic field respectively. The factor $Z_N(\beta, B)$ is the normalizing constant/partition function of the model. The most common choice of the coupling matrix $A_N$ is the adjacency matrix of a graph $G_N$ on $N$ vertices, scaled by the average degree $d_N := \frac{1}{N}\sum_{i,j=1}^N G_N(i,j)$. Here and throughout the rest of the paper, we use the notation $G_N$ to denote both a graph and its adjacency matrix. A pivotal quantity of interest which has attracted extensive attention in the literature is the average of the spins/magnetization density, defined by $\bar{\sigma} := \frac{1}{N}\sum_{i=1}^N \sigma_i$.
The fluctuations of $\bar{\sigma}$ are known for only a few choices of the graph $G_N$, including the complete graph (see e.g., [14, 19, 21]), the directed Erdős-Rényi graph (see [26]), and sparse Erdős-Rényi graphs (see [24]). In this paper, we focus on studying fluctuations of $\bar{\sigma}$ when $A_N$ is the scaled adjacency matrix of an approximately regular graph $G_N$. The motivation for this work is the recent paper [4], where the authors show universal asymptotics of the partition function $Z_N(\beta, B)$ on any sequence of approximately regular graphs with diverging average degree, governed by the Mean-Field prediction formula. In particular, it follows from [4, Theorem 2.1] that the Mean-Field prediction formula is asymptotically correct for any sequence of approximately $d_N$ regular graphs $G_N$ with $d_N \to \infty$. A natural follow-up question is to what extent this universality extends to other properties of such "Mean-Field" Ising models. In this paper we address this question partially by studying the universal behavior of the statistic $\bar{\sigma}$.

∗Research partially supported by NSF grant DMS-1712037

We begin with a definition which partitions the parameter set $\{(\beta, B): \beta > 0, B \in \mathbb{R}\}$ into different domains.

Definition 1.1.
Let $\Theta_{11} := \{(\beta, 0): 0 < \beta < 1\}$, $\Theta_{12} := \{(\beta, B): \beta > 0, B \neq 0\}$, $\Theta_2 := \{(\beta, 0): \beta > 1\}$, and $\Theta_3 := \{(1, 0)\}$. Finally, let $\Theta_1 := \Theta_{11} \cup \Theta_{12}$. We will refer to $\Theta_1$ as the uniqueness regime, $\Theta_2$ as the non-uniqueness regime, and $\Theta_3$ as the critical point. The names of the different regimes are motivated by the next lemma, the proof of which follows from simple calculus (see e.g. [17, Page 10]).

Lemma 1.1.
Consider the fixed point equation $\phi(x) = 0$, where
$$\phi(x) := x - \tanh(\beta x + B). \tag{1.2}$$
(a) If $(\beta, B) \in \Theta_{11}$, then (1.2) has a unique solution at $t = 0$, and $\phi'(0) > 0$.

(b) If $(\beta, B) \in \Theta_{12}$, then (1.2) has a unique root $t$ with the same sign as that of $B$, and $\phi'(t) > 0$.

(c) If $(\beta, B) \in \Theta_2$, then (1.2) has two non-zero roots $\pm t$, where $t > 0$, and $\phi'(\pm t) > 0$.

(d) If $(\beta, B) \in \Theta_3$, then (1.2) has a unique solution at $t = 0$, and $\phi'(0) = 0$.

We will use $t$ as defined in the above lemma throughout the paper, noting that $t$ depends on $(\beta, B)$. The following result summarizes the fluctuations of $\bar{\sigma}$ in the Curie-Weiss model (see [21]), which is the Ising model on the complete graph.

Lemma 1.2.
Suppose $\sigma$ is a random vector from the Curie-Weiss model $\mathbb{P}^{CW}$ with p.m.f.
$$\mathbb{P}^{CW}(\sigma) = \frac{1}{Z_N^{CW}(\beta, B)} \exp\Big(\frac{N\beta}{2}\bar{\sigma}^2 + B\sum_{i=1}^N \sigma_i\Big). \tag{1.3}$$
Let $Z_\tau \sim N(0, \tau^2)$ with $\tau^2 := \frac{1 - t^2}{1 - \beta(1 - t^2)}$ for $(\beta, B) \notin \Theta_3$, and let $W$ be a continuous random variable with density proportional to $e^{-x^4/12}$. Then the following holds:
$$\sqrt{N}\big(\bar{\sigma} - t\big) \stackrel{d}{\to} Z_\tau \quad \text{if } (\beta, B) \in \Theta_1,$$
$$\sqrt{N}\big(\bar{\sigma} - M(\bar{\sigma})\big) \stackrel{d}{\to} Z_\tau \quad \text{if } (\beta, B) \in \Theta_2,$$
$$N^{1/4}\bar{\sigma} \stackrel{d}{\to} W \quad \text{if } (\beta, B) \in \Theta_3.$$
Here $M(\bar{\sigma})$ is a random variable which equals $t$ if $\bar{\sigma} \geq 0$, and $-t$ otherwise.

We will now explore to what extent the fluctuations of $\bar{\sigma}$ are universal. We need the following notation to state our main results.

Definition 1.2. (i) Given two positive sequences $x_N, y_N$, we use the notation $x_N \lesssim y_N$ to denote the existence of a finite constant $C$ free of $N$, such that $x_N \leq C y_N$.

(ii) Given a symmetric matrix $A_N$, let $R_i := \sum_{j=1}^N A_N(i,j)$ denote the row sums of $A_N$, and let $(\lambda_1(A_N), \cdots, \lambda_N(A_N))$ denote its eigenvalues arranged in decreasing order. Let $\|A_N\|_F$ and $\|A_N\|_{op}$ denote the Frobenius norm and operator norm of $A_N$ respectively.

(iii) Given two real valued random variables $X, Y$, define the Kolmogorov-Smirnov distance between $X$ and $Y$ by $d_{KS}(X, Y) := \sup_{x \in \mathbb{R}} |\mathbb{P}(X \leq x) - \mathbb{P}(Y \leq x)|$.

Theorem 1.1.
Suppose that $(\beta, B) \in \Theta_1$. Assume further that the sequence of matrices $A_N$ satisfies the following two conditions:
$$\max_{1 \leq i \leq N} R_i \lesssim 1, \tag{1.4}$$
$$\lim_{N \to \infty} \lambda_1(A_N) = 1. \tag{1.5}$$
If $\sigma$ is a random vector from the Ising model (1.1), then we have
$$d_{KS}\Big(\sqrt{N}(\bar{\sigma} - t),\, Z_\tau\Big) \lesssim \frac{1}{\sqrt{N}}\Big(\|A_N\|_F^2 + \sum_{i=1}^N (R_i - 1)^2 + t\Big|\sum_{i=1}^N (R_i - 1)\Big|\Big). \tag{1.6}$$

Note that Theorem 1.1 leaves out the parameter regime $\Theta_2 \cup \Theta_3$. The following example shows that such universal behavior is not expected in this parameter regime, unless we assume some notion of connectivity for $A_N$.

Example 1.1.
With $N$ even, let $A_N$ be the adjacency matrix of two disjoint complete graphs $K_{N/2}$, scaled by $N/2$. Then the following holds:

(a) If $(\beta, B) \in \Theta_2$, then $\bar{\sigma} \stackrel{d}{\to} \frac{1}{2}\delta_0 + \frac{1}{4}(\delta_t + \delta_{-t})$.

(b) If $(\beta, B) \in \Theta_3$, then $N^{1/4}\bar{\sigma} \stackrel{d}{\to} (W_1 + W_2)/2^{3/4}$, where $W_1, W_2$ are i.i.d. with the same distribution as that of $W$.

The above example shows that if we want universal fluctuations in the regimes $\Theta_2 \cup \Theta_3$, the matrix $A_N$ needs to be "well connected" in some asymptotic sense. If $A_N$ is exactly the adjacency matrix of a $d_N$ regular graph $G_N$ scaled by $d_N$, then $\lambda_1(A_N) = 1$, and it is easy to check that the graph $G_N$ is connected iff $\lambda_2(A_N) < 1$. Motivated by this, we propose the following asymptotic notion of well connectedness.
Definition 1.3.
We say a sequence of symmetric matrices $\{A_N\}_{N \geq 1}$ with non-negative entries is well connected if
$$\limsup_{N \to \infty} \frac{\lambda_2(A_N)}{\lambda_1(A_N)} < 1. \tag{1.7}$$
We note that assumption (1.7) is somewhat weak, in the sense that it does not imply connectivity in general. In particular it allows the existence of small disconnected subgraphs in $G_N$, as shown in the following example.

Example 1.2.
Let $G_N$ denote a graph which is the disjoint union of a $d_N$ regular graph $G_{1,N}$ on $N_1$ vertices and an arbitrary graph $G_{2,N}$ on $N_2$ vertices, with $N_1 + N_2 = N$ and $N_2 = o(d_N)$. Then the average degree of the whole graph $G_N$ is $\tilde{d}_N = d_N(1 + o(1))$. It is easy to check that if $G_{1,N}$ satisfies (1.7), then so does $G_N$, even though $G_N$ is disconnected.

Under the additional assumption of well connectedness, we show universal fluctuations in the non-uniqueness and critical regimes.
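As a concrete, if toy, numerical illustration of Example 1.2 (our own check, not part of the argument), a disjoint union of a large complete graph and a small complete graph still satisfies (1.7); the sizes 100 and 5 below are arbitrary choices for illustration:

```python
import numpy as np

def complete_graph(n):
    """Adjacency matrix of the complete graph K_n."""
    return np.ones((n, n)) - np.eye(n)

# Disjoint union of a 99-regular graph (K_100) and a small graph (K_5):
# the union is disconnected, yet lambda_2 / lambda_1 stays far below 1.
G = np.zeros((105, 105))
G[:100, :100] = complete_graph(100)
G[100:, 100:] = complete_graph(5)
ev = np.sort(np.linalg.eigvalsh(G))[::-1]
assert abs(ev[0] - 99) < 1e-8    # lambda_1 = 99 (from K_100)
assert abs(ev[1] - 4) < 1e-8     # lambda_2 = 4  (from K_5)
assert ev[1] / ev[0] < 0.1       # (1.7) holds despite disconnection
```

Here the small component plays the role of $G_{2,N}$ with $N_2 = o(d_N)$: the eigenvalue ratio stays bounded away from 1 even though the graph is disconnected.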
Theorem 1.2.
Suppose that $(\beta, B) \in \Theta_2$. Assume further that the sequence of matrices $A_N$ satisfies (1.4), (1.5), and (1.7). If $\sigma$ is a random vector from the Ising model (1.1), then we have
$$d_{KS}\Big(\sqrt{N}(\bar{\sigma} - M(\bar{\sigma})),\, Z_\tau\Big) \lesssim \frac{1}{\sqrt{N}}\Big(\|A_N\|_F^2 + \sum_{i=1}^N (R_i - 1)^2 + \Big|\sum_{i=1}^N (R_i - 1)\Big|\Big), \tag{1.8}$$
where $M(\bar{\sigma})$ is as in Lemma 1.2.

Theorem 1.3.
Suppose that $(\beta, B) \in \Theta_3$, and that $\sigma$ is a random vector from the Ising model (1.1), where $A_N$ satisfies
$$\limsup_{N \to \infty} N^{1/4}\max_{1 \leq i \leq N}|R_i - 1| \lesssim 1. \tag{1.9}$$
Then we have
$$d_{KS}\Big(N^{1/4}\bar{\sigma},\, W\Big) \lesssim \frac{\varepsilon_N}{\sqrt{N}} + \frac{\varepsilon_N r_N}{N^{1/4}} + \frac{\log N}{\sqrt{N}}\sqrt{\sum_{i=1}^N (R_i - 1)^2} + N^{-1/2}\Big[\sum_{i=1}^N (R_i - 1)\Big]^2, \tag{1.10}$$
where
$$r_N := \sqrt{(\log N)\max_{1 \leq i \leq N}\sum_{j=1}^N A_N(i,j)^2} + (\log N)\max_{1 \leq i \leq N}|R_i - 1|,$$
$$\varepsilon_N := \|A_N\|_F^2 + \frac{1}{N}\Big[\sum_{i=1}^N (R_i - 1)\Big]^2 + \frac{1}{N}\sum_{i=1}^N (R_i - 1)^2 + \log N,$$
and $W$ is as in Lemma 1.2.

Remark 1.1.
Using these results, in Section 1.2 we will show that for any sequence of well connected $d_N$ regular graphs the fluctuation of $\bar{\sigma}$ is universal in $\Theta_1 \cup \Theta_2$ if $d_N \gg \sqrt{N}$, and in $\Theta_3$ if $d_N \gg \sqrt{N}\log N$. We now give an example showing that the above conditions are actually necessary (up to a log factor in the critical regime), which suggests that the convergence rates obtained in this paper are tight. The proof of this example will appear in an upcoming draft [36].

Example 1.3.
Let $G_N$ denote the line graph of the complete graph $K_n$, so that $N = \binom{n}{2} = \frac{n^2}{2}(1 + o(1))$. This is a regular graph with degree $d_N = 2(n - 2) = 2\sqrt{2N}(1 + o(1))$, and its top two eigenvalues are $\lambda_1(G_N) = 2(n - 2)$ and $\lambda_2(G_N) = n - 4$ (see [15, Lemma 2]). It follows that $A_N = \frac{1}{d_N}G_N$ does satisfy (1.7), and
$$\lim_{N \to \infty} \frac{\|A_N\|_F^2}{\sqrt{N}} = \lim_{N \to \infty} \sqrt{N}\max_{1 \leq i \leq N}\sum_{j=1}^N A_N(i,j)^2 = \frac{1}{2\sqrt{2}} \neq 0.$$
In this case we have the following limiting distributions across the different regimes:
$$\sqrt{N}(\bar{\sigma}_N - t) + \mu \stackrel{w}{\to} Z_\tau \quad \text{if } (\beta, B) \in \Theta_{12},$$
$$\sqrt{N}(\bar{\sigma}_N - M(\bar{\sigma})) + \mathrm{sgn}(M(\bar{\sigma}))\,\mu \stackrel{w}{\to} Z_\tau \quad \text{if } (\beta, B) \in \Theta_2,$$
$$N^{1/4}\bar{\sigma}_N \stackrel{w}{\to} \widetilde{W} \quad \text{if } (\beta, B) \in \Theta_3,$$
where $\mu := \frac{\beta t^3}{\sqrt{2}\,(1 - \beta(1 - t^2))\,(2\sqrt{2} - \beta(1 - t^2))} \geq 0$, and $\widetilde{W}$ has density proportional to $\exp\big(-\frac{w^4}{12} - \frac{w^2}{4\sqrt{2}}\big)$. Therefore, the fluctuations do not match those of the Curie-Weiss model unless $(\beta, B) \in \Theta_{11}$.

Note that in the above example, $\bar{\sigma}$ has a different limit compared to the Curie-Weiss model in $\Theta_{12} \cup \Theta_2 \cup \Theta_3$, but continues to have universal fluctuations in the high temperature regime $\Theta_{11}$. We now state a modified theorem for the regime $\Theta_{11}$, which shows that in this regime we can do better.

Theorem 1.4.
Suppose that $(\beta, B) \in \Theta_{11}$, and that $A_N$ satisfies
$$\lim_{N \to \infty} \max_{1 \leq i \leq N} R_i = 1. \tag{1.11}$$
If $\sigma$ is a random vector from the Ising model (1.1), then setting $\alpha_N := \max_{1 \leq i \leq N}\sum_{j=1}^N A_N(i,j)^2$ we have
$$d_{KS}\Big(\sqrt{N}\bar{\sigma},\, Z_\tau\Big) \lesssim \frac{1}{\sqrt{N}} + \frac{\|A_N\|_F^2\sqrt{\alpha_N\log N}}{\sqrt{N}} + \Big[\|A_N\|_F^2\,\alpha_N\log N\Big]^{1/2}\sqrt{\frac{\sum_{i=1}^N (R_i - 1)^2}{N}}. \tag{1.12}$$
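The spectral facts about the line graph of $K_n$ quoted in Example 1.3 (top eigenvalue $2(n-2)$, second eigenvalue $n-4$) can be verified numerically; the following sketch, with the arbitrary choice $n = 8$, is our own illustration and not part of the paper:

```python
import itertools
import numpy as np

def line_graph_of_Kn(n):
    """Adjacency matrix of the line graph of K_n: vertices are the
    edges {i, j} of K_n, adjacent iff they share an endpoint."""
    verts = list(itertools.combinations(range(n), 2))
    N = len(verts)
    A = np.zeros((N, N))
    for a in range(N):
        for b in range(a + 1, N):
            if set(verts[a]) & set(verts[b]):
                A[a, b] = A[b, a] = 1.0
    return A

n = 8
ev = np.sort(np.linalg.eigvalsh(line_graph_of_Kn(n)))[::-1]
assert abs(ev[0] - 2 * (n - 2)) < 1e-8   # lambda_1 = 2(n-2) = 12
assert abs(ev[1] - (n - 4)) < 1e-8       # lambda_2 = n-4 = 4
```

As $n \to \infty$ the ratio $\lambda_2/\lambda_1 = (n-4)/(2(n-2)) \to 1/2 < 1$, consistent with (1.7) holding along this sequence.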
Remark 1.2.
It follows from the above result that in the regime $\Theta_{11}$, $\bar{\sigma}$ has universal fluctuations on regular graphs of degree $d_N \gg (N\log N)^{1/3}$. We believe this is not tight, and that universal fluctuations should hold on any sequence of regular graphs with $d_N \to \infty$. In [26] the authors prove such a result when $G_N$ is a non-symmetric Erdős-Rényi graph in the regime $\Theta_{11}$ (details in the example section below).

Note that we only expect behavior similar to the Curie-Weiss model if the underlying graphs are approximately regular and have large degree. Quantifying this philosophy, the bounds in each of the theorems have two parts: the first controls the sparsity of the underlying graph/matrix, and the second controls the extent of regularity of the graph/matrix. Recall Example 1.3, which suggests that the term controlling the sparsity is optimal. In a similar spirit, the following example suggests that the term controlling the extent of regularity is also optimal.
Example 1.4. (a) Assume that $\sqrt{N}$ is an integer, and let $G_N$ be the disjoint union of two complete graphs of sizes $N - \sqrt{N}$ and $\sqrt{N}$ respectively. Let $d_N$ denote the average degree of $G_N$, and $A_N = (d_N)^{-1}G_N$. In this case
$$\lim_{N \to \infty} \frac{1}{\sqrt{N}}\sum_{i=1}^N (R_i - 1)^2 > 0,$$
but every other term in the RHS of (1.6) converges to $0$. If $\sigma$ is a random vector from the Ising model (1.1) with $B \neq 0$, then $\sqrt{N}\big(\bar{\sigma} - t\big) \stackrel{w}{\to} \mu + Z_\tau$, where
$$\mu := \frac{\beta t(1 - t^2)}{1 - \beta(1 - t^2)} + \tanh(B) - t \neq 0.$$
(b) With $G_N = K_N$, let $A_N = \frac{1}{N - \sqrt{N}}G_N$. In this case
$$\lim_{N \to \infty} \frac{1}{\sqrt{N}}\Big|\sum_{i=1}^N (R_i - 1)\Big| > 0,$$
but every other term in the RHS of (1.6) converges to $0$. If $\sigma$ is a random vector from the Ising model (1.1) with $B \neq 0$, then $\sqrt{N}\big(\bar{\sigma} - t\big) \stackrel{w}{\to} \mu + Z_\tau$, where $\mu := \frac{\beta t(1 - t^2)}{1 - \beta(1 - t^2)} \neq 0$.

The main ingredient of our proof technique is a comparison of the Ising model on an approximately regular graph with an i.i.d. model/Curie-Weiss model. As a byproduct of this approach, we also obtain quantitative bounds for the asymptotics of the log partition function via the Mean-Field prediction formula, defined via the following lower bound (c.f. [4]):
$$\log Z_N(\beta, B) \geq \sup_{\sigma \in [-1,1]^N}\Big\{\frac{\beta}{2}\sigma^\top A_N\sigma + B\sum_{i=1}^N \sigma_i - \sum_{i=1}^N I(\sigma_i)\Big\},$$
where $I(x) := \frac{1+x}{2}\log\frac{1+x}{2} + \frac{1-x}{2}\log\frac{1-x}{2}$ is the binary entropy function. Choosing $\sigma_i \equiv t$ with $t$ as defined in Lemma 1.1 gives the further lower bound
$$\log Z_N(\beta, B) \geq N\Big\{\frac{\beta t^2}{2} + Bt - I(t)\Big\} + \frac{\beta t^2}{2}\sum_{i=1}^N (R_i - 1) =: M_N(\beta, B). \tag{1.13}$$
It follows from [4, Theorem 2.1] that $\log Z_N(\beta, B) - M_N(\beta, B) = o(N)$ as soon as $\|A_N\|_F^2 + \sum_{i=1}^N (R_i - 1)^2 = o(N)$. Our next result gives a bound on the error incurred in approximating $\log Z_N(\beta, B)$ by $M_N(\beta, B)$, which we henceforth refer to as the Mean-Field prediction in this paper.

Theorem 1.5.
Let $A_N$ satisfy (1.4) and (1.5).

(a) If $(\beta, B) \in \Theta_1$, then we have
$$\big|\log Z_N(\beta, B) - M_N(\beta, B)\big| \lesssim \|A_N\|_F^2 + t^2\sum_{i=1}^N (R_i - 1)^2.$$
(b) If $(\beta, B) \in \Theta_2$, then the same conclusion as in part (a) holds under the extra assumption that $A_N$ satisfies (1.7).

(c) If $(\beta, B) \in \Theta_3$, then under the extra assumption that $A_N$ satisfies (1.7) we have
$$\big|\log Z_N(\beta, B) - M_N(\beta, B)\big| \lesssim \|A_N\|_F^2 + \frac{1}{N}\Big[\sum_{i=1}^N (R_i - 1)\Big]^2 + \frac{1}{N}\Big[\sum_{i=1}^N (R_i - 1)^2\Big]^2 + \log N.$$

Remark 1.3.
To see how the error bounds of the above theorem compare to existing error bounds for the Mean-Field prediction formula in the literature, take the example where $A_N$ is the (scaled) adjacency matrix of a $d_N$-regular graph $G_N$. In this case, the above theorem gives the error bound $O(N/d_N)$ for the Mean-Field prediction formula. This immediately improves the bounds from [4, Theorem 1.1] — $o(N)$, [25, Theorem 1.1] — $O(N/d_N^{1/3})$, [20, Example 3] — $O(N/d_N^{1/2 - o(1)})$ (under strong expander type conditions not needed here), and [2, Corollary 2.9 and Example 2.10] — $O(N/\sqrt{d_N})$.

For our next result, define an i.i.d. probability measure $\mathbb{Q}$ on $\{-1, 1\}^N$ by setting
$$\mathbb{Q}(\sigma_1, \ldots, \sigma_N) := \frac{1}{(\exp(-\beta t - B) + \exp(\beta t + B))^N}\exp\Big((\beta t + B)\sum_{i=1}^N \sigma_i\Big). \tag{1.14}$$
Our next theorem shows that if an event is unlikely under the above i.i.d. measure/the Curie-Weiss model (depending on $(\beta, B)$), then it is also unlikely under an Ising model on an approximately regular graph with large degree.

Theorem 1.6.
Let $A_N$ satisfy (1.4) and (1.5). Also, let $E_N \subset \{-1, 1\}^N$ be arbitrary.

(a) If $(\beta, B) \in \Theta_1$, then we have
$$\log \mathbb{P}(E_N) \lesssim \log \mathbb{Q}(E_N) + \|A_N\|_F^2 + t^2\sum_{i=1}^N (R_i - 1)^2.$$
(b) If $(\beta, B) \in \Theta_2$, then under the further assumption (1.7) we have
$$\log \mathbb{P}(E_N) \lesssim \log \mathbb{P}^{CW}(E_N) + \|A_N\|_F^2 + \sum_{i=1}^N (R_i - 1)^2.$$
Suppose A N satisfies (1.4) , (1.5) , and lim N →∞ N N X i =1 ( R i − = 0 , lim N →∞ N k A N k F = 0 . • If ( β, B ) ∈ Θ , then for every δ > we have lim sup n →∞ N log P ( | σ − t | > δ ) < . Same conclusion holds for ( β, B ) ∈ Θ , under the extra assumption that A N satisfies (1.7) . • If ( β, B ) ∈ Θ , then under the extra assumption that A N satisfies (1.7) , for every δ > we have lim sup n →∞ N log P ( | σ − M ( σ ) | > δ ) < . Similar concentration results can be obtained for other higher order polynomials of σ , as studied in [1, 12, 23]and the references therein. However, these papers focus exclusively on the high temperature regime Θ ,whereas our result applies to all temperatures. As mentioned before, the most common example of a coupling matrix A N in model (1.1) is the scaledadjacency matrix d N G N , where G N is the adjacency matrix of a simple labelled graph on N vertices withdegree vector ( d , · · · , d N ), and d N := N P Ni =1 d i is the average degree of G N . The scaling discussed inthe above definition ensures that the resulting Ising model has non-trivial phase transition properties (seee.g., [4, 31]). Below we consider some specific examples of graphs to illustrate our theorems.(a) Regular graphs:
Let $G_N$ be a $d_N$ regular graph. Then $\|A_N\|_F^2 = N/d_N$ and $R_i = 1$ for all $i$, and so applying Theorems 1.1, 1.2, 1.3 and 1.4 gives
$$d_{KS}\Big(\sqrt{N}\bar{\sigma},\, Z_\tau\Big) \lesssim \sqrt{\frac{N\log N}{d_N^3}} + \frac{1}{\sqrt{N}} \quad \text{if } (\beta, B) \in \Theta_{11},$$
$$d_{KS}\Big(\sqrt{N}(\bar{\sigma} - t),\, Z_\tau\Big) \lesssim \frac{\sqrt{N}}{d_N} \quad \text{if } (\beta, B) \in \Theta_{12},$$
$$d_{KS}\Big(\sqrt{N}(\bar{\sigma} - M(\bar{\sigma})),\, Z_\tau\Big) \lesssim \frac{\sqrt{N}}{d_N} \quad \text{if } (\beta, B) \in \Theta_2 \text{ and } G_N \text{ satisfies (1.7)},$$
$$d_{KS}\Big(N^{1/4}\bar{\sigma},\, W\Big) \lesssim \frac{N^{3/4}\sqrt{\log N}}{d_N^{3/2}} + \frac{\sqrt{N}}{d_N} + \frac{\log N}{\sqrt{N}} \quad \text{if } (\beta, B) \in \Theta_3 \text{ and } G_N \text{ satisfies (1.7)}.$$
In particular this means that $\bar{\sigma}$ has the same fluctuations as those of the Curie-Weiss model as soon as
$$d_N \gg (N\log N)^{1/3} \quad \text{if } (\beta, B) \in \Theta_{11},$$
$$d_N \gg \sqrt{N} \quad \text{if } (\beta, B) \in \Theta_{12},$$
$$d_N \gg \sqrt{N} \quad \text{if } (\beta, B) \in \Theta_2 \text{ and (1.7) holds},$$
$$d_N \gg \sqrt{N}\log N \quad \text{if } (\beta, B) \in \Theta_3 \text{ and (1.7) holds}. \tag{1.15}$$
Further, as already shown in Example 1.3, the requirement $d_N \gg \sqrt{N}$ is sharp in the regimes $\Theta_{12} \cup \Theta_2 \cup \Theta_3$. Note that for the particular case of the Curie-Weiss model at criticality we get the convergence rate $N^{-1/2}\log N$, which matches the rate obtained in [14] up to the log factor. In fact, it is easy to modify our argument in the special case of the Curie-Weiss model to get rid of the log factor. We observe that for random $d_N$ regular graphs, condition (1.7) holds with high probability, as $\lambda_2(G_N) = O_p(\sqrt{d_N}) \ll d_N$ (see [11]), and so our results apply directly to random regular graphs if $d_N$ satisfies (1.15). We stress that our results apply to regular bipartite graphs as well, and do not need the graph to be an expander as in [10].

(b) Erdős-Rényi graphs:
Suppose $G_N \sim G(N, p_N)$ is the symmetric Erdős-Rényi random graph with $0 < p_N \leq 1$. Define $A_N(i,j) := \frac{G_N(i,j)}{(N-1)p_N}$, and note that
$$\max_{1 \leq i \leq N}|R_i - 1| = O_p\Big(\sqrt{\frac{\log N}{Np_N}}\Big), \quad \Big|\sum_{i=1}^N (R_i - 1)\Big| = O_p\Big(\frac{1}{\sqrt{p_N}}\Big), \quad \sum_{i=1}^N (R_i - 1)^2 = O_p\Big(\frac{1}{p_N}\Big). \tag{1.16}$$
Since $\lambda_2(G_N) = O_p(\sqrt{Np_N}) \ll Np_N$ ([22, Theorem 1.1]), (1.7) holds as well. Our theorems then yield universal fluctuations for $\bar{\sigma}$ as soon as
$$p_N \gg (\log N)^{1/3}N^{-2/3} \quad \text{if } (\beta, B) \in \Theta_{11},$$
$$p_N \gg N^{-1/2} \quad \text{if } (\beta, B) \in \Theta_{12} \cup \Theta_2,$$
$$p_N \gg (\log N)N^{-1/2} \quad \text{if } (\beta, B) \in \Theta_3, \tag{1.17}$$
both in the quenched and annealed settings. We note that our results also apply to the asymmetric Erdős-Rényi random graph $\widetilde{G}(N, p_N)$, under the same regime of $p_N$ as in the symmetric case. This is because an Ising model on the asymmetric Erdős-Rényi graph is equivalent to an Ising model with the symmetric coupling matrix $A_N(i,j) = \frac{\widetilde{G}_N(i,j) + \widetilde{G}_N(j,i)}{2(N-1)p_N}$. The asymmetric case was studied recently in [26], where the authors derive fluctuations as soon as $Np_N \to \infty$, but only in the sub parameter regime $\Theta_{11} \cup \Theta_{12}$. The authors conjecture similar results for the symmetric case, which we verify partially in this paper. Moreover, our theorems apply simultaneously to both the symmetric and the asymmetric cases with explicit convergence rates, thereby yielding fluctuations even in the hitherto unexplored regimes $\Theta_2$ and $\Theta_3$.

(c) Balanced stochastic block model:
Suppose $G_N$ is a stochastic block model with 2 communities of size $N/2$ each ($N$ even). Let the probability of an edge within a community be $a_N$, and across communities be $b_N$. This is the well known stochastic block model, which has received considerable attention in Probability, Statistics and Machine Learning (see [18, 27, 30] and references within). If we take $A_N = \frac{2}{N(a_N + b_N)}G_N$, universal asymptotics hold for $\bar{\sigma}$ as soon as $p_N := \frac{a_N + b_N}{2}$ satisfies (1.17) and $\liminf_{N \to \infty}\frac{b_N}{a_N} > 0$.

(d) Sparse regular graphons:
Suppose $W$ is a symmetric measurable function from $[0,1]^2$ to $[0,1]$, such that $\int_0^1 W(x,y)\,dy = a > 0$ for every $x \in [0,1]$, and $\lambda_2(W) < a$, where $\{\lambda_i(W)\}_{i \geq 1}$ is the countable set of ordered eigenvalues of the associated integral operator. Also let $(U_1, \cdots, U_N) \stackrel{i.i.d.}{\sim} U(0,1)$, and for $\gamma \in (0,1)$ let $\{G_N(i,j)\}_{1 \leq i < j \leq N}$ be conditionally independent Bernoulli random variables with success probabilities $p_N W(U_i, U_j)$, where $p_N := N^{-\gamma}$, and set $A_N := \frac{1}{aNp_N}G_N$. Then (1.16) continues to hold. Indeed, writing
$$R_i - 1 = \Big(R_i - \frac{\sum_{j=1}^N W(U_i, U_j)}{aN}\Big) + \Big(\frac{\sum_{j=1}^N W(U_i, U_j)}{aN} - 1\Big),$$
it is easy to check that (1.16) holds. Also, with $W_N$ denoting the $N \times N$ matrix with $W_N(i,j) = W(U_i, U_j)$, using [3, Corollary 3.3] we have $\|A_N - (aN)^{-1}W_N\|_{op} = O_p\big(\frac{1}{\sqrt{Np_N}}\big)$. Since $W_N$ converges in cut norm to $W$, it follows using [28, Section 11.6] that
$$\lim_{N \to \infty} \lambda_2(A_N) = a^{-1}\lim_{N \to \infty}\frac{\lambda_2(W_N)}{N} = a^{-1}\lambda_2(W) < 1,$$
and so $A_N$ satisfies (1.7). By our results, universal fluctuations hold for $\bar{\sigma}$ as soon as $p_N$ satisfies (1.17).

(e) Block spin Ising model:
Suppose that $N$ is even, and
$$A_N(i,j) = a_N \ \text{ if } i, j \leq N/2 \text{ or } i, j > N/2, \qquad A_N(i,j) = b_N \ \text{ if } i \leq N/2 < j \text{ or } j \leq N/2 < i.$$
$A_N$ can be thought of as the expectation of a stochastic block model with 2 communities. In the particular case $a_N = \beta/N$, $b_N = \alpha/N$, this model has been studied in [5, 29] under the name block spin Ising model. Again in this case universal asymptotics hold for $\bar{\sigma}$ as soon as $d_N := \frac{N(a_N + b_N)}{2}$ satisfies (1.15) and $\liminf_{N \to \infty}\frac{b_N}{a_N} > 0$. This in particular matches the results obtained from [29, Theorems 1.2, 1.4], which study the sub parameter regime $\Theta_{11} \cup \Theta_2$. Our results apply to the whole parameter regime of $(\beta, B)$ and a wide regime of scalings of $(a_N, b_N)$, providing explicit convergence rates. A similar extension holds when the matrix $A_N$ has more than 2 blocks as well.

(f) Wigner matrices:
To demonstrate that our techniques apply to examples well beyond scaled adjacency matrices, let $A_N$ be a Wigner matrix with entries $\{A_N(i,j), 1 \leq i < j \leq N\}$ i.i.d. from a distribution $F$ scaled by $N\mu$, where $F$ is a distribution on the non-negative reals with finite exponential moment and mean $\mu > 0$. In this case we have
$$\max_{1 \leq i \leq N}|R_i - 1| = O_p\Big(\sqrt{\frac{\log N}{N}}\Big), \quad \Big|\sum_{i=1}^N (R_i - 1)\Big| = O_p(1), \quad \sum_{i=1}^N (R_i - 1)^2 = O_p(1).$$
Also, [3, Corollary 3.5] shows that $\|A_N - \frac{1}{N}\mathbf{1}\mathbf{1}^\top\|_{op} = O_p(N^{-1/2})$, and so (1.7) holds. Thus our theorems apply, giving universal fluctuations for $\bar{\sigma}$.
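Before moving to the proofs, both the Curie-Weiss benchmark of Lemma 1.2 and the Mean-Field prediction (1.13) can be illustrated numerically in the simplest dense case $A_N(i,j) = 1/N$ for $i \neq j$ (so $\|A_N\|_F^2 \to 1$ and $\sum_i (R_i - 1) = -1$). The sketch below is our own check, not part of the paper; it computes the exact law of $\bar{\sigma}$ by summing over the magnetization:

```python
import math

def magnetization_logweights(N, beta, B):
    """Log-weights of S = sum_i sigma_i under (1.1) with A_N(i,j) = 1/N
    (i != j), using (beta/2) sigma^T A_N sigma = beta*S^2/(2N) - beta/2."""
    logw = []
    for k in range(N + 1):                 # k = number of +1 spins
        S = 2 * k - N
        logw.append(math.lgamma(N + 1) - math.lgamma(k + 1)
                    - math.lgamma(N - k + 1)
                    + beta * S * S / (2 * N) - beta / 2 + B * S)
    return logw

beta, B, N = 0.5, 0.3, 2000
t = 0.0
for _ in range(300):                       # root of t = tanh(beta*t + B), Lemma 1.1
    t = math.tanh(beta * t + B)

logw = magnetization_logweights(N, beta, B)
mx = max(logw)
log_Z = mx + math.log(sum(math.exp(l - mx) for l in logw))

# Theorem 1.5(a): log Z_N - M_N stays O(1) here, with M_N from (1.13)
I_t = ((1 + t) / 2 * math.log((1 + t) / 2)
       + (1 - t) / 2 * math.log((1 - t) / 2))
M = N * (beta * t * t / 2 + B * t - I_t) - beta * t * t / 2
assert 0 <= log_Z - M < 2                  # (1.13) is a lower bound

# Lemma 1.2: Var(sqrt(N) * sigma_bar) should approach tau^2
w = [math.exp(l - mx) for l in logw]
Z = sum(w)
mean = sum(wk * (2 * k - N) / N for k, wk in enumerate(w)) / Z
var = N * sum(wk * ((2 * k - N) / N - mean) ** 2
              for k, wk in enumerate(w)) / Z
tau2 = (1 - t * t) / (1 - beta * (1 - t * t))
assert abs(mean - t) < 0.01
assert abs(var - tau2) / tau2 < 0.1
```

The parameter choice $(\beta, B) = (0.5, 0.3) \in \Theta_{12}$ and the size $N = 2000$ are arbitrary; the observed gap $\log Z_N - M_N$ stays bounded as $N$ grows, consistent with the $O(\|A_N\|_F^2 + t^2\sum_i(R_i-1)^2) = O(1)$ error bound of Theorem 1.5(a) in this dense case.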
2. Proof of main results
We first state a lemma which will be needed in all parameter regimes.
Lemma 2.1.
Suppose $\sigma \sim$ (1.1) for some $A_N$ satisfying (1.4), and $\beta > 0$, $B \in \mathbb{R}$.

(a) Setting $m_i(\sigma) := \sum_{j=1}^N A_N(i,j)\sigma_j$, we have
$$\mathbb{E}\Big[\sum_{i=1}^N \big(\sigma_i - \tanh(\beta m_i(\sigma) + B)\big)\tanh(\beta m_i(\sigma) + B)\Big]^2 \lesssim N.$$
(b) For any $c = (c_1, \cdots, c_N) \in \mathbb{R}^N$ we have
$$\log \mathbb{P}\Big(\Big|\sum_{i=1}^N c_i\big(\sigma_i - \tanh(\beta m_i(\sigma) + B)\big)\Big| > t\Big) \lesssim -\frac{t^2}{\|c\|_2^2}.$$
Here, part (a) follows by invoking [13, Lemma 3.2], and (b) can be obtained by making minor adjustments in the proof of [32, Lemma 1].

We now present an exponential moment control lemma valid in all parameter regimes, which is one of the main estimates of this paper, and is itself new and possibly of independent interest. Its proof is deferred to Section 3.
Lemma 2.2.
Suppose $\sigma \sim$ (1.1), with $A_N$ satisfying (1.4) and (1.5).

(a) If $(\beta, B) \in \Theta_1$, then there exists a fixed positive number $\delta > 0$ such that
$$\log \mathbb{E}\Big[\exp\Big(\delta\sum_{i=1}^N (m_i(\sigma) - t)^2\Big)\Big] \lesssim \|A_N\|_F^2 + t^2\sum_{i=1}^N (R_i - 1)^2. \tag{2.1}$$
(b) If $(\beta, B) \in \Theta_2$, then the conclusion of part (a) holds under the additional assumption that $A_N$ satisfies (1.7).

(c) If $(\beta, B) \in \Theta_3$, then under the additional assumption that $A_N$ satisfies (1.7), there exists a fixed positive number $\delta > 0$ such that
$$\log \mathbb{E}\Big[\exp\Big(\delta\sum_{i=1}^N (m_i(\sigma) - \bar{m}(\sigma))^2\Big)\Big] \lesssim \|A_N\|_F^2 + \frac{1}{N}\Big[\sum_{i=1}^N (R_i - 1)\Big]^2 + \frac{1}{N}\Big[\sum_{i=1}^N (R_i - 1)^2\Big]^2 + \log N, \tag{2.2}$$
where $\bar{m}(\sigma) := N^{-1}\sum_{i=1}^N m_i(\sigma)$.

Without loss of generality we may assume that the RHS of (1.6) and (1.8) are bounded by 1, because otherwise the bound is trivial. With $\sigma$ an observation from the Ising model (1.1), form an exchangeable pair $(\sigma, \sigma')$ as follows: let $I$ denote a uniformly sampled index from $\{1, 2, \ldots, N\}$. Given $I = i$, replace $\sigma_i$ with an independent $\pm 1$ valued $\sigma_i'$ with mean $\tanh(\beta m_i(\sigma) + B) = \mathbb{E}[\sigma_i \mid (\sigma_j, j \neq i)]$, and let $\sigma' := (\sigma_1, \cdots, \sigma_{i-1}, \sigma_i', \sigma_{i+1}, \cdots, \sigma_N)$. Then $(\sigma, \sigma')$ is an exchangeable pair. Extend the definition of $M(\bar{\sigma})$ to all parameter regimes by setting $M(\bar{\sigma}) := t$ if $(\beta, B) \in \Theta_1 \cup \Theta_3$. Then, with $T_N := \sqrt{N}(\bar{\sigma} - M(\bar{\sigma}))$ and $T_N' := \sqrt{N}(\bar{\sigma}' - M(\bar{\sigma}'))$, the pair $(T_N, T_N')$ is exchangeable as well.
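The exchangeable pair above is just one step of the Glauber dynamics for (1.1); a minimal sketch (our own illustration, using the fact that a $\pm 1$ variable with mean $\tanh(\beta m_i + B)$ takes the value $+1$ with probability $(1 + \tanh(\beta m_i + B))/2$):

```python
import math
import random

def exchangeable_pair(sigma, A, beta, B, rng):
    """Given sigma from (1.1), resample one uniformly chosen coordinate
    from its conditional law given the rest, producing sigma'."""
    N = len(sigma)
    i = rng.randrange(N)
    m_i = sum(A[i][j] * sigma[j] for j in range(N))  # local field m_i(sigma)
    p_plus = (1 + math.tanh(beta * m_i + B)) / 2     # P(sigma'_i = +1 | rest)
    sigma_prime = list(sigma)
    sigma_prime[i] = 1 if rng.random() < p_plus else -1
    return sigma_prime

# Curie-Weiss coupling A(i,j) = 1/N for i != j, as a toy example
rng = random.Random(0)
N = 50
A = [[0.0 if i == j else 1.0 / N for j in range(N)] for i in range(N)]
sigma = [rng.choice([-1, 1]) for _ in range(N)]
sigma_prime = exchangeable_pair(sigma, A, 0.5, 0.0, rng)
assert sum(s != sp for s, sp in zip(sigma, sigma_prime)) <= 1  # one-site update
```

Run stationarily (i.e., with $\sigma$ drawn from (1.1)), the pair $(\sigma, \sigma')$ is exchangeable, which is what the Stein's method argument below exploits.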
A direct computation gives
$$\mathbb{E}[T_N - T_N' \mid T_N] = \frac{1}{N^{3/2}}\sum_{i=1}^N \mathbb{E}\big[\sigma_i - \tanh(\beta m_i(\sigma) + B) \mid T_N\big] - \sqrt{N}\,\mathbb{E}\big[M(\bar{\sigma}) - M(\bar{\sigma}') \mid T_N\big], \tag{2.3}$$
where the first term in the RHS above can be expanded using
$$\sum_{i=1}^N \tanh(\beta m_i(\sigma) + B) = N\tanh(\beta M(\bar{\sigma}) + B) + \beta(1 - t^2)\sum_{i=1}^N (m_i(\sigma) - M(\bar{\sigma})) + \sum_{i=1}^N \xi_i(m_i(\sigma) - M(\bar{\sigma}))^2$$
$$= NM(\bar{\sigma}) + \beta(1 - t^2)\sum_{i=1}^N (\sigma_i - M(\bar{\sigma})) + \beta(1 - t^2)M(\bar{\sigma})\sum_{i=1}^N (R_i - 1) + \beta(1 - t^2)\sum_{i=1}^N (R_i - 1)(\sigma_i - M(\bar{\sigma})) + \sum_{i=1}^N \xi_i(m_i(\sigma) - M(\bar{\sigma}))^2 \tag{2.4}$$
for random variables $(\xi_i)_{1 \leq i \leq N}$ satisfying $\max_{1 \leq i \leq N}|\xi_i| \lesssim 1$, where the second line uses the identity $M(\bar{\sigma}) = \tanh(\beta M(\bar{\sigma}) + B)$. Setting $h_i := \beta(1 - t^2)(R_i - 1)$ and plugging (2.4) into (2.3), we get
$$\mathbb{E}[T_N - T_N' \mid T_N] = \underbrace{\frac{T_N}{N}\big(1 - \beta(1 - t^2)\big)}_{g(T_N)} - \underbrace{\frac{1}{N^{3/2}}\sum_{i=1}^N \mathbb{E}\big[\xi_i(m_i(\sigma) - M(\bar{\sigma}))^2 \mid T_N\big]}_{H_1(T_N)} - \underbrace{\frac{1}{N^{3/2}}\mathbb{E}\Big[\sum_{i=1}^N h_i(\sigma_i - M(\bar{\sigma})) \,\Big|\, T_N\Big]}_{H_2(T_N)} - \underbrace{\Big(\sqrt{N}\,\mathbb{E}\big[M(\bar{\sigma}) - M(\bar{\sigma}') \mid T_N\big] + N^{-3/2}\beta(1 - t^2)M(\bar{\sigma})\sum_{i=1}^N (R_i - 1)\Big)}_{H_3(T_N)}. \tag{2.5}$$
Set $G(x) := \int_0^x g(y)\,dy = (1 - \beta(1 - t^2))\,x^2/2N$, $c_1 := 2N/(1 - t^2)$ and $c_2 := (2\pi\tau^2)^{-1/2}$, and note the existence of positive constants $c_3$ and $c_4$ free of $N$ such that assumptions (H1), (H2) and (H3) from [14, Page 2] are all satisfied. By [14, Theorem 1.2], we then have
$$d_{KS}(T_N, Z_\tau) \lesssim \mathbb{E}\Big|1 - \frac{N}{2(1 - t^2)}\mathbb{E}\big[(T_N - T_N')^2 \mid T_N\big]\Big| + \frac{N}{(1 - t^2)c_2}\,\mathbb{E}\big[|T_N - T_N'|^3\big] + \frac{Nc_3c_4}{(1 - t^2)c_2}\,\mathbb{E}\Big[\sum_{a=1}^3 |H_a(T_N)|\Big] + \frac{\mathbb{E}|T_N| + 1}{\sqrt{N}}. \tag{2.6}$$
We will now estimate each term in the RHS above. First observe that
$$N\,\mathbb{E}\big[|T_N - T_N'|^3\big] \lesssim \frac{1}{\sqrt{N}}\,\mathbb{E}\big[|\sigma_I - \sigma_I'|^3\big] + N^{5/2}\,\mathbb{E}\big[|M(\bar{\sigma}) - M(\bar{\sigma}')|^3\big] \lesssim \frac{1}{\sqrt{N}}, \tag{2.7}$$
where the control on the second term for $(\beta, B) \in \Theta_2$ follows on using part (b) of Theorem 1.6 with $E_N := \{\sum_{i=1}^N \sigma_i \in \{-1, 0, 1\}\}$, along with part (c) of Proposition 5.1, to note that
$$\limsup_{N \to \infty} \frac{1}{N}\log \mathbb{P}\big(M(\bar{\sigma}) \neq M(\bar{\sigma}')\big) \leq \limsup_{N \to \infty} \frac{1}{N}\log \mathbb{P}(E_N) < 0. \tag{2.8}$$
Proceeding to control $\mathbb{E}|H_1(T_N)|$, we have
$$N^{3/2}|H_1(T_N)| = \Big|\sum_{i=1}^N \mathbb{E}\big[\xi_i(m_i(\sigma) - M(\bar{\sigma}))^2 \mid T_N\big]\Big| \lesssim \mathbb{E}\Big[\sum_{i=1}^N \big(m_i(\sigma) - M(\bar{\sigma})\big)^2 \,\Big|\, T_N\Big], \tag{2.9}$$
and so
$$N^{3/2}\,\mathbb{E}|H_1(T_N)| \lesssim \mathbb{E}\sum_{i=1}^N \big(m_i(\sigma) - M(\bar{\sigma})\big)^2 \leq \eta_N \tag{2.10}$$
using Lemma 2.2, with $\eta_N := \|A_N\|_F^2 + t^2\sum_{i=1}^N (R_i - 1)^2$ denoting the RHS of (2.1).
Next, we have N √ N | H ( T N ) | ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E N X i =1 h i (cid:16) σ i − tan( βm i ( σ ) + B ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T N ! (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E N X i =1 h i (cid:16) tanh( βm i ( σ ) + B ) − tan( βM ( σ ) + B ) !(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T N ! (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (2.11)and so N √ N E | H ( T N ) | . vuut n X i =1 h i + vuut N X i =1 h i vuut E N X i =1 (cid:16) m i ( σ ) − M ( σ ) (cid:17) . vuut N X i =1 ( R i − (1 + √ η N ) . η N + N X i =1 ( R i − , (2.12)where the penultimate line uses part (b) of Lemma 2.1, and the last line again uses (2.10). Also observe that, N √ N | H ( T N ) | . N E (cid:16) | M ( σ ) − M ( σ ′ ) | T N (cid:17) + t (cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( R i − (cid:12)(cid:12)(cid:12)(cid:12) , (2.13)where the first term has an expectation which is exponentially small in N using (2.8). Finally we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E (cid:20) − N − t ) ( T N − T ′ N ) (cid:12)(cid:12)(cid:12)(cid:12) T N (cid:21) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E h − ( σ I − σ ′ I ) − t ) (cid:12)(cid:12)(cid:12)(cid:12) T N i(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + N E [ | M ( σ ) − M ( σ ′ ) | ] . The second term on the RHS above is exponentially small, by (2.8). For the first term on the RHS, note that E [1 − ( σ I − σ ′ I ) / − t ) | σ ] . N − (cid:12)(cid:12) N X i =1 ( σ i tanh( βm i ( σ ) + B ) − t ) (cid:12)(cid:12) . As a result we have E | E [1 − ( σ I − σ ′ I ) / − t ) | T N ] | . E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( σ i − tanh( βm i ( σ ) + B )) tanh( βm i ( σ ) + B ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + 1 N N X i =1 E ( m i ( σ ) − M ( σ )) . √ N + 1 N N X i =1 E ( m i ( σ ) − M ( σ )) ≤ η N N + 1 √ N , (2.14)where we have used (2.10), and part (a) of Lemma 2.1. We now claim that E T N . . 
(2.15)Given this claim, combining the estimates from (2.6), (2.7), (2.10), (2.12), (2.13),and (2.14) we get d KS ( T N , Z τ ) . √ N + η N √ N + 1 √ N N X i =1 ( R i − + t √ N (cid:12)(cid:12)(cid:12) N X i =1 ( R i − (cid:12)(cid:12)(cid:12) , from which the desired bound follows on noting that η N & k A N k F & eb and Mukherjee/Fluctuations in Mean-Field Ising models It thus suffices to prove (2.15). To this effect, using (2.5) we get: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E [ T N − T ′ N | T N ] − T N N (1 − β (1 − t )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . X a =1 | H a ( T N ) | . By multiplying both sides of the above display by | T N | and taking expectation gives E [ T N ] . (cid:12)(cid:12)(cid:12)(cid:12) E [ N ( T N − T ′ N ) T N ] (cid:12)(cid:12)(cid:12)(cid:12) + E " N | T N | (cid:18) X a =1 | H a ( T N ) | (cid:19) , where we have used the fact that β (1 − t ) = 1. By the exchangeability of T N and T ′ N we have E [ N ( T N − T ′ N ) T N ] = E [ N ( T ′ N − T N ) T ′ N ] = 12 E [ N ( T N − T ′ N ) ] . . Also, from (2.9), (2.11) and (2.13) we have N X a =1 E [ H a ( T N )] . η N + P Ni =1 ( R i − √ N . , where the last bound uses the fact that the RHS of (1.6) and (1.8) are bounded. Using Chebyshev’s inequalitythen gives E ( T N ) . q E ( T N ) sX a =1 E ( N H a ( T N )) . q E ( T N )which implies E ( T N ) .
1. This verifies (2.15), and hence completes the proof of the theorem.
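The exchangeable pair $(\sigma, \sigma')$ driving the argument above is obtained by replacing a uniformly chosen coordinate $\sigma_I$ with a fresh draw from its conditional distribution given the remaining spins. The following is a minimal, hedged Python sketch of this construction (not code from the paper; the coupling matrix and parameter values below are placeholders chosen for illustration):

```python
import numpy as np

def exchangeable_pair(A, sigma, beta, B, rng):
    """Resample one uniformly chosen spin from its conditional law,
    P(sigma_i = +1 | rest) = (1 + tanh(beta * m_i(sigma) + B)) / 2,
    where m_i(sigma) = sum_j A[i, j] * sigma[j] is the local field."""
    sigma_new = sigma.copy()
    i = rng.integers(len(sigma))
    m_i = A[i] @ sigma
    p_plus = 0.5 * (1.0 + np.tanh(beta * m_i + B))
    sigma_new[i] = 1 if rng.random() < p_plus else -1
    return sigma_new

rng = np.random.default_rng(0)
N = 50
A = (np.ones((N, N)) - np.eye(N)) / N   # Curie-Weiss-type coupling, zero diagonal
sigma = rng.choice([-1, 1], size=N)
sigma_prime = exchangeable_pair(A, sigma, beta=0.5, B=0.0, rng=rng)
# (sigma, sigma') differ in at most one coordinate, which is what makes
# the scaled magnetizations satisfy |T_N - T_N'| <= 2 / sqrt(N).
print(int(np.sum(sigma != sigma_prime)))
```

Since a single coordinate is resampled from its exact conditional law, the pair is exchangeable and the increment of the magnetization is bounded, the two properties used repeatedly in the bounds above.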
For proving Theorem 1.4 we need the following lemma, whose proof we defer to Section 4.
Lemma 2.3.
Assume that $\sigma$ is an observation from (1.1) with $(\beta, B) \in \Theta$, and $A_N$ satisfies (1.11). Setting $\alpha_N = \max_{1 \le i \le N} \sum_{j=1}^N A_N(i,j)^2$ as in Theorem 1.4, the following conclusions hold:
(a) $\log \mathbb{P}\big(\max_{1 \le i \le N} |m_i(\sigma)| \ge \lambda \sqrt{\alpha_N \log N}\big) \lesssim -\lambda^2$, for any $\lambda > 0$.
(b) $\mathbb{E}\big[\big(\sum_{i=1}^N (R_i - 1)\sigma_i\big)^2\big] \lesssim \big(\sum_{i=1}^N (R_i - 1)^2\big)\big[\|A_N\|_F^2\, \alpha_N\, (\log N)^2\big]$.

Proof.
Without loss of generality we can assume that the RHS of (1.12) is bounded as before. As in theproof of the previous theorems, it suffices to bound the RHS of (2.6), but with t = M ( σ ) = 0 which implies H ( T N ) = 0. To begin, use (2.14) to get E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E (cid:20) − N T N − T ′ N ) (cid:12)(cid:12) T N (cid:21) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . k A N k F N + 1 √ N , (2.16)using (2.1), which allows us to replace η N in the previous proof by k A N k F . Proceeding to bound E | H ( T N ) | ,use the first equality of (2.9) along with Cauchy-Schwarz inequality to note that N √ N E | H ( T N ) | . E max ≤ i ≤ N | m i ( σ ) | N X i =1 m i ( σ ) ≤ r E max ≤ i ≤ N m i ( σ ) vuut E (cid:16) N X i =1 m i ( σ ) (cid:17) . k A N k F p α N log N, (2.17)where the last inequality uses part (a) of Lemma 2.3. Finally, for E | H ( T N ) | we have N √ N E | H ( T N ) | ≤ E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( R i − σ i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . vuut ( N X i =1 ( R i − ) h k A N k α N log N i , eb and Mukherjee/Fluctuations in Mean-Field Ising models where we use part (b) of Lemma 2.3. Plugging in the above bounds in (2.6), we have d KS ( T N , Z τ ) . E ( T N ) √ N + k A N k F √ α N log N √ N + h k A N k α N log N is ( P Ni =1 ( R i − N , from which the claimed bound follows immediately, if we can verify (2.15). But the proof of this is the sameas in the previous theorem, and so we are done.
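The quantity $\alpha_N$ of Theorem 1.4, taken in this sketch to be the maximal row-wise sum of squares $\max_i \sum_j A_N(i,j)^2$ (an assumption; the printed exponent is ambiguous), and the local fields $m_i(\sigma)$ are straightforward to compute for a concrete coupling. A hedged Python illustration on an Erdős–Rényi coupling $A_N = G_N / d_N$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 200, 0.1
# symmetric Erdos-Renyi adjacency with zero diagonal
upper = np.triu(rng.random((N, N)) < p, 1)
G = (upper | upper.T).astype(float)
d_avg = G.sum() / N                     # average degree d_N
A = G / d_avg                           # coupling matrix A_N = G_N / d_N

alpha_N = (A ** 2).sum(axis=1).max()    # assumed form: max_i sum_j A_N(i,j)^2
sigma = rng.choice([-1.0, 1.0], size=N)
m = A @ sigma                           # local fields m_i(sigma)
print(alpha_N, np.abs(m).max())
```

For dense Erdős–Rényi graphs $\alpha_N$ is of order $1/d_N$, which is what drives the $\sqrt{\alpha_N \log N}$ bound on $\max_i |m_i(\sigma)|$ in Lemma 2.3.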
For proving Theorem 1.3 we need the following lemma, whose proof we again defer to Section 4.
Lemma 2.4.
Suppose σ ∼ (1.1) with ( β, B ) ∈ Θ , such that A N satisfies (1.9) and (1.7) . Suppose furtherthat the RHS of (1.10) is bounded. Then the following conclusions hold:(a) log P (cid:18) max ≤ i ≤ N | m i ( σ ) − m ( σ ) | ≥ λ p α N (log N ) + log N max ≤ i ≤ N | R i − | (cid:19) . − λ , for any λ > .(b) E hP Ni =1 ( R i − σ i i . (cid:18) P Ni =1 ( R i − + N − / h P Ni =1 ( R i − i (cid:19) (log N ) . (c) N / E ( σ ) . .Proof. With ( σ , σ ′ ) the usual exchangeable pair, setting T N := N / σ and T N := N / σ ′ we have E [ T N − T ′ N | σ ] = N − / ( σ − tanh( σ )) + N − / (tanh( σ ) − tanh( m ( σ ))) + N − / N X i =1 (tanh( m i ( σ )) − tanh( m ( σ ))) . Using Taylor’s expansion, this gives | E [ T N − T ′ N | σ ] − N − / ( σ − tanh( σ )) | . N − / | σ − m ( σ ) | + N − / | σ | N X i =1 ( m i ( σ ) − m ( σ )) + N − / (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( m i ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , (2.18)and so we have E [ T N − T ′ N | T N ] = g ( T N ) + H ( T N ) where g ( x ) = N − / x /
3, and H ( T N ) satisfies E [ | H ( T N ) | ] . N − E (cid:2) | T N | (cid:3) + N − / E (cid:2) | σ − m ( σ ) | (cid:3) + N − E " | T N | N X i =1 ( m i ( σ ) − m ( σ )) + N − / E "(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( m i ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12)(cid:12) . Invoking [14, Theorem 1.2] with G ( x ) := R x g ( t ) dt = N − / x /
12 we have d KS ( T N , W ) . E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − N / E (cid:2) ( T N − T ′ N ) | T N (cid:3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + N / E [ | H ( T N ) | ] + N − / E | T N | . (2.19)By part (c) of Lemma 2.4 we have E [ | T N | ] .
1. Set δ N := P Ni =1 ( R i − + N − / h P Ni =1 ( R i − i , and usepart (b) of Lemma 2.4 and the Cauchy-Schwarz inequality to get E (cid:2) | σ − m ( σ ) | (cid:3) . p E ( σ − m ( σ )) . N − (log N ) p δ N . Similarly, by the Cauchy-Schwarz inequality and part (c) of Lemmas 2.2 along with part (a) of Lemma 2.4,we get: E " | T N | N X i =1 ( m i ( σ ) − m ( σ )) ≤ p E ( T N ) vuut E (cid:16) n X i =1 ( m i ( σ ) − m ( σ )) (cid:17) . ε N , eb and Mukherjee/Fluctuations in Mean-Field Ising models E " N X i =1 | m i ( σ ) − m ( σ ) | ≤ r E max ≤ i ≤ N ( m i ( σ ) − m ( σ )) vuut E (cid:16) N X i =1 ( m i ( σ ) − m ( σ )) (cid:17) . r N ε N , where ε N is as in the statement of Theorem 1.3. Combining the above observations, we get N / E [ | H ( T N ) | ] . N − / + N − / (log N ) p δ N + N − / r N ε N . (2.20)Finally, we have E (cid:12)(cid:12)(cid:12)(cid:12) − N / E (cid:2) ( T N − T ′ N ) | T N (cid:3) (cid:12)(cid:12)(cid:12)(cid:12) . N E (cid:12)(cid:12)(cid:12)(cid:12) N X i =1 σ i tanh m i ( σ ) (cid:12)(cid:12)(cid:12)(cid:12) . N E (cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( σ i − tanh m i ( σ )) tanh m i ( σ ) (cid:12)(cid:12)(cid:12)(cid:12) + 1 N E N X i =1 (cid:16) m i ( σ ) − m ( σ ) (cid:17) + E σ . √ N + ε N N + 1 √ N (2.21)where the last inequality follows from part (a) of Lemma 2.1, part (c) of Lemma 2.2, and part (c) of Lemma 2.4.Combing (2.20) and (2.21) along with (2.19) gives d KS ( T N , W ) . √ N + √ δ N (log N ) N / + r N ε N N / , as desired.
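The Kolmogorov–Smirnov distance appearing in these bounds compares the law of $T_N$ with a centered Gaussian. A self-contained Python sketch of computing $d_{KS}$ between an empirical sample and the $N(0, \tau^2)$ distribution function (the sample below is synthetic, not drawn from the model):

```python
import math
import numpy as np

def d_ks_to_normal(sample, tau):
    """sup_x |F_emp(x) - Phi(x / tau)|: the Kolmogorov-Smirnov distance
    between the empirical CDF of `sample` and the N(0, tau^2) CDF."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    phi = np.array([0.5 * (1.0 + math.erf(v / (tau * math.sqrt(2.0)))) for v in x])
    d_plus = np.max(np.arange(1, n + 1) / n - phi)   # empirical CDF from above
    d_minus = np.max(phi - np.arange(0, n) / n)      # empirical CDF from below
    return max(d_plus, d_minus)

rng = np.random.default_rng(2)
z = rng.normal(0.0, 1.0, size=20000)
print(d_ks_to_normal(z, 1.0))   # small for a sample that really is N(0,1)
```

The supremum is attained at a jump point of the empirical distribution function, so checking the $2n$ one-sided gaps suffices; this is the quantity the Berry–Esseen-type bounds above control at rate roughly $N^{-1/2}$ (up to the stated correction terms).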
3. Proofs of Theorems 1.5, 1.6 and Lemma 2.2
We will need the following proposition, which expresses the Curie-Weiss model as a mixture of i.i.d. random variables, first shown in [31, Lemma 3].
Proposition 3.1.
Let $\sigma \sim \mathbb{P}_{CW}$ be an observation from the Curie-Weiss model (1.3). Given $\sigma$, let $W_N$ be a Gaussian random variable with mean $\bar{\sigma}$ and variance $(N\beta)^{-1}$. Then the following conclusions hold:
(a) Given $W_N$, the random variables $(\sigma_1, \sigma_2, \ldots, \sigma_N)$ are i.i.d. with mean $\widetilde{W}_N := \tanh(\beta W_N + B)$.
(b) The marginal density of $W_N$ is proportional to $\exp(-N f(w))$, where $f(w) = \frac{\beta w^2}{2} - \log \cosh(\beta w + B)$.

We state two more lemmas necessary for proving the results of this section, the proofs of which we defer to Section 5. The first lemma is a version of the Hanson-Wright inequality, which controls the exponential moment of quadratic forms of binary random variables.
Lemma 3.1.
Suppose $X_1, X_2, \ldots, X_N$, $N \ge 1$, are i.i.d. $\pm 1$ valued random variables such that $\mathbb{E}[X_1] = \mu$. Set $s_\mu := 2\mu / (\log(1 + \mu) - \log(1 - \mu))$, with $s_0$ being $1$. Also assume that $D_N$ is an $N \times N$ symmetric matrix such that $s_\mu \limsup_{N \to \infty} \lambda_1(D_N) < 1$. Then, given any vector $c^\top := (c_1, c_2, \ldots, c_N)$, we get:
$$\log \mathbb{E} \exp\Big(\sum_{i,j=1}^N D_N(i,j)\, \widetilde{X}_i \widetilde{X}_j + \sum_{i=1}^N c_i \widetilde{X}_i\Big) \lesssim \mathrm{Tr}_+(D_N) + \|D_N\|_F^2 + \sum_{i=1}^N c_i^2,$$
where $\widetilde{X}_i = X_i - \mu$ for $1 \le i \le N$, and $\mathrm{Tr}_+(D_N) = \sum_{i=1}^N \max(D_N(i,i), 0)$.

The second lemma gives a quantitative estimate which allows us to neglect the region where $\widetilde{W}_N$ is not close to $t$.

Lemma 3.2.
Suppose (1.4) , (1.5) and (1.7) hold, and further assume that k A N k F = o ( N ) , P Ni =1 ( R i −
1) = o ( N ) . Also, let V N be any random variable such that V N ≤ cN for some fixed c > , and ε > be fixed.Setting A N := A N − ⊤ /N , for any ( β, B ) ∈ Θ ∪ Θ there exists δ = δ ( ε, c, β ) > such that, lim sup N →∞ N log E CW (cid:20) exp (cid:18) δV N + β σ ⊤ A N σ (cid:19) ( | f W N − M ( σ ) | ≥ ǫ ) (cid:21) < . (3.1) eb and Mukherjee/Fluctuations in Mean-Field Ising models Proof of Theorem 1.5. (a) To begin, note that β σ ⊤ A N σ + B N X i =1 σ i = β σ − t ) ⊤ A N ( σ − t ) + N X i =1 ( βtR i + B ) σ i − ( βt / ⊤ A N , which on using (1.13) gives Z N ( β, B )exp( M N ( β, B )) = EQ exp β N X i,j =1 ( σ i − t ) A N ( i, j )( σ j − t ) + βt N X i =1 ( R i − σ i − t ) ! (3.2) where Q is as defined in (1.14). In this case with D N = βA N we have s t lim sup N →∞ λ ( D N ) = βs t lim sup N →∞ λ ( A N ) ≤ βs t = βtβt + B < , for ( β, B ) ∈ Θ . If ( β, B ) ∈ Θ , then we have t = 0, and s = 1, and so with D N = βA N asbefore, we have s t lim sup N →∞ λ ( D N ) = β <
1. Thus in both cases Lemma 3.1 is applicable with D N = β A N , c i = βt ( R i − EQ exp β N X i,j =1 ( σ i − t ) A N ( i, j )( σ j − t ) + βt N X i =1 ( R i − σ i − t ) . k A N k F + t N X i =1 ( R i − . (3.3)The conclusion of part (a) follows from this combined with (3.2).(b) Define Y N := ( σ − f W N ) ⊤ A N ( σ − f W N ) + 2 f W N N X i =1 ( R i − σ i − f W N ) + ( f W N − t ) N X i =1 ( R i − , (3.4)and note that σ ⊤ A N σ = Y N + t P Ni =1 ( R i − . Using this, with J N,ǫ := {| t | − ε ≤ | f W N | ≤ | t | + ε } forsome ǫ >
0, by a similar calculation as in part (a) we have: Z N ( β, B ) Z CWN ( β, B ) = E CW exp (cid:18) β σ ⊤ A N σ (cid:19) = E CW (cid:20) exp (cid:18) β σ ⊤ A N σ (cid:19) ( J cN,ǫ ) (cid:21) + exp (cid:16) β t N X i =1 ( R i − (cid:17) E CW ( e β Y N ( J N,ǫ )) (cid:3) . (3.5)The first term in the right hand side of (3.5) is o (1) by invoking Lemma 3.2 with δ = 0. For the secondterm, by Proposition 3.1, the inner (conditional) expectation is taken with respect to i.i.d. ± f W N . In this regime βs t = βt/βt = 1. But since lim sup N →∞ λ ( A N ) < J N,ε we havelim sup N →∞ s f W N λ ( β A N ) ≤ lim sup N →∞ sup µ ∈ J N,ε s µ λ ( β A N ) < ε small enough. Therefore, Lemma 3.1 is applicable with D N = β A N and c i = 2 f W N ( R i −
1) to givelog E CW ( e β Y N ( J N,ǫ ) | f W N ) ≤ C n k A N k F + N X i =1 ( R i − o + β (cid:12)(cid:12)(cid:12) ( f W N − t ) N X i =1 ( R i − (cid:12)(cid:12)(cid:12) for some C < ∞ , which on taking another expectation giveslog E CW ( e β Y N ( J N,ǫ )) ≤ C n k A N k F + N X i =1 ( R i − o + log E e β (cid:12)(cid:12)(cid:12) ( f W N − t ) P Ni =1 ( R i − (cid:12)(cid:12)(cid:12) . k A N k F + N X i =1 ( R i − + 1 N h N X i =1 ( R i − i , (3.6) eb and Mukherjee/Fluctuations in Mean-Field Ising models where the last step uses part (b) of Proposition 5.1. This along with (3.5) giveslog Z N ( β, B ) − log Z CWN ( β, B ) − βt N X i =1 ( R i − . k A N k F + N X i =1 ( R i − , from which the desired conclusion follows by another application of part (a) of Proposition 5.1 to notethat log Z CWN ( β, B ) − N h βt + Bt − I ( t ) i . t = 0, and so s t = s = 1, and βs = 1. As in the proof of part (b), the first termin the RHS of (3.5) is o (1) invoking Lemma 3.2 with δ = 0. For handling the second term, invoking(1.7) gives lim sup N →∞ s f W N λ ( A N ) ≤ lim sup N →∞ sup µ ∈ J N,ε s µ λ ( A N ) < ε small enough. Also Lemma 3.1 with D N = A N and c i = 2 f W N ( R i −
1) giveslog E CW ( e β Y N ( J N,ǫ ) | f W N ) ≤ C n k A N k F + f W N N X i =1 ( R i − o + β f W N N X i =1 ( R i − C < ∞ free of N . This, on taking another expectation along with (3.5) giveslog E CW ( e β Y N ( J N,ǫ )) ≤ C k A N k F + log E exp C f W N N X i =1 ( R i − + β f W N N X i =1 ( R i − ! . k A N k F + 1 N h N X i =1 ( R i −
1) + N X i =1 ( R i − i , (3.7)where the last bound uses part (b) of Proposition 5.1. Combining (3.5) and (3.7) giveslog Z N ( β, B ) − log Z CWN ( β, B ) . k A N k F + 1 N h N X i =1 ( R i − + 1 N [ N X i =1 ( R i − i . We incur an additional log factor in the final answer because log Z CWN ( β, B ) − N h βt + Bt − I ( t ) i . log N by part (a) of Proposition 5.1. Proof of Theorem 1.6. (a) Using a similar calculation as in (3.5), we get: P ( σ ∈ E N ) = c ( N ) EQ exp β N X i,j =1 ( σ i − t ) A N ( σ j − t ) + βt N X i =1 ( R i − σ i − t ) ( σ ∈ E N ) (3.8)where the deterministic sequence c ( N ) satisfies c ( N ) = exp( βt ( ⊤ A N − N )) (exp( βt + B ) + exp( − βt − B )) N Z N ( β, B ) exp (( βt / ⊤ A N ) ≤ , on invoking the Mean-Field lower bound (1.13). Next, by using H¨older’s inequality with exponent p (tobe chosen later), the left hand side of (3.8) can be bounded above by, EQ exp β (1 + p )2 N X i,j =1 ( σ i − t ) A N ( σ j − t ) + βt (1 + p ) N X i =1 ( R i − σ i − t ) p ( Q ( E N )) p p . (3.9)Using arguments similar to the derivation of (3.3) shows that for p small enough we havelog EQ exp β (1 + p )2 X i,j ( σ i − t ) A N ( σ j − t ) + βt (1 + p ) N X i =1 ( R i − σ i − t ) . k A N k F + t N X i =1 ( R i − . Combining this along with (3.8) and (3.9) gives the desired conclusion follows. eb and Mukherjee/Fluctuations in Mean-Field Ising models (b) With Y N as in (3.4), using a similar calculation as in the derivation of (3.5) we can bound P ( σ ∈ E N )by Z CWN ( β, B ) Z N ( β, B ) E CW e β σ ⊤ A N σ ( σ ∈ E N ) ≤ Z CWN ( β, B ) Z N ( β, B ) (cid:20) E CW e β σ ⊤ A N σ ( σ ∈ J cN,ε ) + e βt P Ni =1 ( R i − E CW e β Y N ( σ ∈ E N ) ( J N,ε ) (cid:21) . (3.10)For controlling the ratio of partition functions in the RHS of (3.10), use the Mean-Field approximation(1.13) to get a lower bound for log Z N ( β, B ), whereas part (a) of Proposition 5.1 gives log Z CWN ( β, B ) − N h βt + Bt − I ( t ) i . . 
Combining these two observations, we get:log Z CWN ( β, B ) − log Z N ( β, B ) + βt N X i =1 ( R i − . . (3.11)Also, the first term inside the parenthesis in the RHS of (3.10) is exponentially small in N , by invokingLemma 3.2 with δ = 0. Proceeding to control the second term in the RHS of (3.10) we have E CW e β Y N ( σ ∈ E N ) ( J N,ε ) ≤ h E CW e β (1+ p )2 Y N ( J N,ε ) i p (cid:2) P CW ( σ ∈ E N ) (cid:3) p p . (3.12)where the last step uses Holder’s inequality with p > p > E CW ( e β (1+ p )2 Y N ) ( J N,ε ) . k A N k F + N X i =1 ( R i − . (3.13)Combining (3.10), (3.11), (3.12) and (3.13), the desired conclusion follows.(c) All steps of part (b) above go through verbatim, except the RHS of (3.11) gets replaced by log N (bypart (a) of Proposition 5.1), and (3.13) is replaced by (c.f. (3.7))log E CW e β (1+ p )2 Y N ( J N,ε ) . k A N k F + 1 N h N X i =1 ( R i − i + 1 N h N X i =1 N X i =1 ( R i − i . (3.14)Combining this with (3.10) and (3.14) gives the desired conclusion. Proof of Lemma 2.2. (a) Invoking Theorem 1.6 and changing δ if necessary, it suffices to show the desiredconclusion under Q , where Q is the i.i.d. measure defined in (1.14)). A direct calculation shows that m i ( σ ) − t equals P Nj =1 A N ( i, j )( σ j − t ) + t ( R i − N X i =1 (cid:16) m i ( σ ) − t (cid:17) ≤ N X i =1 h N X j =1 A N ( i, j )( σ j − t ) i + 2 t N X i =1 ( R i − =2 N X i =1 N X j =1 A N ( i, j )( σ i − t )( σ j − t ) + 2 t N X i =1 ( R i − . (3.15)It therefore suffices to control the exponential moment of the first term in the RHS of (3.15). Sincelim sup N →∞ λ ( A N ) ≤
1, for any δ ∈ (0 , / D N = δA N and c i = 0 we havelog E Q exp (cid:0) δ ( σ − t ) ⊤ A N ( σ − t ) (cid:1) . k A N k F = N X i =1 λ i . N X i =1 λ i = k A N k F . This gives the desired conclusion. eb and Mukherjee/Fluctuations in Mean-Field Ising models (c) By invoking Theorem 1.6, it suffices to show the desired conclusion under the Curie-Weiss model. Startby noting that m ( σ ) = N P Ni =1 R i σ i , and so m i ( σ ) − m ( σ ) = N X j =1 A N ( i, j )( σ j − f W N ) + 1 N N X i =1 R i ( σ i − f W N ) + f W N ( R i − R ) . This shows that P Ni =1 (cid:16) m i ( σ ) − m ( σ ) (cid:17) is bounded by3 N X i =1 h N X j =1 A N ( i, j )( σ j − f W N ) i + 3 N h N X i =1 R i ( σ i − f W N ) i + 3 f W N N X i =1 ( R i − R ) ≤ N X i,j =1 (cid:18) A N ( i, j ) + 3 N R i R j (cid:19) ( σ i − f W N )( σ j − f W N ) + 3 f W N N X i =1 ( R i − . (3.16)Conditioning on f W N , we now control the exponential moment of the first term in the RHS of the abovedisplay under the Curie-Weiss model. By Proposition 3.1, under the Curie Weiss model, given f W N , therandom vector ( σ , · · · , σ N ) are i.i.d. with mean f W N . Sincelim sup N →∞ λ (cid:16) A N + 3 N RR ⊤ (cid:17) ≤ lim sup N →∞ λ ( A N ) + lim sup N →∞ N λ ( RR ⊤ ) . , on invoking Lemma 3.1 with c i = 0, D N = δ (cid:16) A N + N RR ⊤ (cid:17) for δ small enough, and noting that k N RR ⊤ k F = N ( P Ni =1 R i ) . E CW e δ P Ni =1 ( m i ( σ ) − m ( σ )) − δ f W N N X i =1 ( R i − . k A N k F + tr( A N ) . k A N k F , from which the desired conclusion follows on noting thatlog E CW e δ f W N P Ni =1 ( R i − . N (cid:2) N X i =1 ( R i − (cid:3) , which follows from part (b) of Proposition 5.1.(b) To begin note that N X i =1 ( m i − M ( σ )) . N X i =1 ( m i ( σ ) − m ( σ )) + 1 N h N X i =1 R i ( σ i − f W N ) i + ( f W N − M ( σ )) (cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( R i − (cid:12)(cid:12)(cid:12)(cid:12) . 
(3.17)By H¨older’s inequality, it suffices to bound the exponential moments of the first three terms of theabove display at some δ >
0. Exponential moment of the third term in the RHS of (3.17) is boundedby part (b) of Proposition 5.1, as P Ni =1 | R i − | = o ( N ). Proceeding to bound the sum of the first twoterms, use (3.16) to get N X i =1 ( m i ( σ ) − m ( σ )) + 1 N h N X i =1 R i ( σ i − f W N ) i ≤ N X i,j =1 (cid:18) A N ( i, j ) + 4 N R i R j (cid:19) ( σ i − f W N )( σ j − f W N )+3 N X i =1 ( R i − , and so it suffices to boundlog E CW exp δ N X i,j =1 (cid:18) A N ( i, j ) + 4 N R i R j (cid:19) ( σ i − f W N )( σ j − f W N ) for δ small enough. But this follows on invoking Lemma 3.1 with D N = δ ( A N + N RR ⊤ ) and c i = 0 togetlog E CW exp δ N X i,j =1 (cid:18) A N ( i, j ) + 4 N R i R j (cid:19) ( σ i − f W N )( σ j − f W N ) . k A N k F + tr( A N ) . k A N k F , which completes the proof of part (b). eb and Mukherjee/Fluctuations in Mean-Field Ising models
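Throughout this section the auxiliary variable $\widetilde{W}_N$ is localized near the mean-field magnetization $t$, the minimizer of the density exponent $f$ from Proposition 3.1. Reading that exponent as $f(w) = \beta w^2/2 - \log\cosh(\beta w + B)$ (an assumption of this sketch), its minimizer solves the fixed-point equation $t = \tanh(\beta t + B)$, since $f'(w) = \beta\,(w - \tanh(\beta w + B))$. A quick numerical check in Python:

```python
import numpy as np

def f(w, beta, B):
    # assumed density exponent from Proposition 3.1(b)
    return 0.5 * beta * w ** 2 - np.log(np.cosh(beta * w + B))

beta, B = 0.8, 0.3
w = np.linspace(-2.0, 2.0, 400001)
t = w[np.argmin(f(w, beta, B))]   # grid minimizer of f
print(t, np.tanh(beta * t + B))   # the two agree: t solves t = tanh(beta*t + B)
```

This is the same $t$ around which the interval $J_{N,\varepsilon}$ is centered in the proofs above, and Lemma 3.2 quantifies how negligible the complementary region is.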
4. Proofs of Lemmas 2.3 and 2.4
Proof of Lemma 2.3. (a) To begin, note that it suffices to prove the bound for λ large enough. To thiseffect, using part (b) of Lemma 2.1 we have the existence of a constant M free of N , such that for all λ > P (cid:18) | m i ( σ ) − N X j =1 A N ( i, j ) tanh( βm j ( σ )) | > λ vuut log N N X j =1 A N ( i, j ) (cid:19) ≤ e − λ NM , which on using a union bound with α N = max ≤ i ≤ N P Nj =1 A N ( i, j ) (as in Theorem 1.4) gives P (cid:18) max ≤ i ≤ N | m i ( σ ) − N X j =1 A N ( i, j ) tanh( βm j ( σ )) | > λ p α N log N (cid:19) ≤ N e − λ NM . On the set n max ≤ i ≤ N | m i ( σ ) − P Nj =1 A N ( i, j ) tanh( βm j ( σ )) | ≤ λ √ α N log N o using the bound | tanh( x ) | ≤ | x | we havemax ≤ i ≤ N | m i ( σ ) | ≤ p α N log N + β max ≤ i ≤ N R i max ≤ i ≤ N | m i ( σ ) | , which on using (1.11) gives max ≤ i ≤ N | m i ( σ ) | . √ α N log N . Thus there exists a constant c ′ such that P ( max ≤ i ≤ N | m i ( σ ) | > c ′ λ p α N log N ) ≤ N e − λ NM , from which the desired conclusion follows for all λ large enough.(b) More generally, we will show that for any vector c ∈ R N we have E N X i =1 c i σ i ! . (log N ) / N X i =1 c i . (4.1)To this effect, for every non-negative integer ℓ set c ( ℓ ) := β l A ℓN c , and x ℓ := E [( P i c ( ℓ ) i σ i ) ], and notethat c (0) = c , and the LHS of (4.1) is just x . For any ℓ ≥ x ℓ = T ,ℓ + T ,ℓ + T ,ℓ , (4.2)where T ,ℓ := E N X i =1 c ( ℓ ) i ( σ i − tanh( βm i ( σ ))) ! , T ,ℓ := E N X i =1 c ( ℓ ) i tanh( βm i ( σ )) ! T ,ℓ = 2 E X i = j c ( ℓ ) i c ( ℓ ) j ( σ i − tanh( βm i ( σ ))) tanh( βm j ( σ )) . For controlling T ,ℓ , setting m ji ( σ ) := P Nk =1 ,k = j A N ( i, k ) σ k σ j we have | T ,ℓ | =2 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i = j c ( ℓ ) i c ( ℓ ) j E h ( σ i − tanh( βm i ( σ )))(tanh( βm j ( σ )) − tanh( βm ij ( σ ))) i(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . N X i = j (cid:12)(cid:12) c ( ℓ ) i (cid:12)(cid:12)(cid:12)(cid:12) c ( ℓ ) j (cid:12)(cid:12) A N ( i, j ) . 
k c ( ℓ ) k (4.3)where we use E [ σ i − tanh( βm i ( σ )) | ( σ j , j = i )] = 0 in the first line, and the bound | tanh( βm i ( σ )) − tanh( βm ji ( σ )) | . A N ( i, j ) in the second line. eb and Mukherjee/Fluctuations in Mean-Field Ising models Proceeding to bound T ,ℓ , use a Taylor’s series expansion to get tanh( βm i ( σ )) = βm i ( σ ) + ξ i m i ( σ ) for random variables { ξ i } ≤ i ≤ N uniformly bounded by 1 in absolute value. Consequently, T ,ℓ − x ℓ +1 = E N X i =1 c ( ℓ ) i (cid:8) m i ( σ ) β + ξ i m i ( σ ) (cid:9)! − E N X i =1 c ( ℓ ) i m i ( σ ) ! ≤ √ x ℓ +1 k c ( ℓ ) k vuut E "X i m i ( σ ) + k c ( ℓ ) k E "X i m i ( σ ) . (4.4)Finally, using Cauchy-Schwarz inequality gives E (cid:16) N X i =1 m i ( σ ) (cid:17) ≤ vuut E ( N X i =1 m i ) r E max ≤ i ≤ N | m i ( σ ) | ≤ C k A N k F α N (log N ) (4.5)for some C free of N , where the last inequality uses part (a) of this lemma and part (b) of Lemma 2.2.Noting that T ,ℓ . k c ( ℓ ) k by part (b) of Lemma 2.1, combining (4.3), (4.4) and (4.5) along with (4.2)gives the existence of a constant D free of N, ℓ such that x ℓ ≤ x ℓ +1 + 2 √ x ℓ +1 k c k β ℓN δ N + k c k β ℓN δ N + Dβ ℓN k c k , (4.6)where we have also used the bound k c ( ℓ ) k ≤ β ℓN k c k with β N := β k A N k , and we set δ N :=max(1 , C k A N k α N log N ). Since β N → β <
1, for all N large we have β N ≤ β for some β < β ∈ (0 , , D >
0, there exists M large enough such that M > ( β √ M + 1) + D .With this M, β we claim that for all ℓ , we have x ℓ ≤ M k c k β ℓ δ N , (4.7)from which (4.1) is immediate on setting ℓ = 0. For proving (4.7) we use backwards induction on ℓ .Using Cauchy-Schwarz inequality gives x ℓ ≤ N k c ( ℓ ) k ≤ N β ℓN k c k , and so (4.7) holds for all ℓ large enough, as β N < β . Assume that the result holds for x ℓ +1 for some ℓ ,i.e. x ℓ +1 ≤ M k c k β ℓ +20 δ N . Using (4.6) gives x ℓ ≤ k c k β ℓ δ N (cid:16) M β + 2 √ M β + 1 + D (cid:17) ≤ M k c k β ℓ δ N , where the last step uses the choice of M . This verifies the claim for ℓ , and hence proves (4.7) bybackward induction, for all ℓ ≥ Proof of Lemma 2.4. (a) As in the proof of part (a) of Lemma 2.3, it suffices to prove the result for λ large. To this effect, define an N × N matrix e A N by setting e A N ( i, j ) := A N ( i, j ) /R max for i = j and e A N ( i, i ) := 1 − R i /R max where R max = max ≤ i ≤ N R i . Observe that ⊤ e A N = ⊤ , and so | ( m i ( σ ) − m ( σ )) − N X j =1 e A N ( i, j )( m j ( σ ) − m ( σ )) | = | m i ( σ ) − N X j =1 e A N ( i, j ) m j ( σ ) |≤| m i − N X j =1 A N ( i, j ) m j ( σ ) | + N X j =1 | A N ( i, j ) − e A N ( i, j ) | . | m i − N X j =1 A N ( i, j ) tanh( m j ( σ )) | + max ≤ i ≤ N | m i ( σ ) | + max ≤ i ≤ N | R i − | . (4.8) eb and Mukherjee/Fluctuations in Mean-Field Ising models Using part (b) of Lemma 2.1, a union bound as in the proof of part (a) of Lemma 2.3 shows that for all λ > P ( E cN ) ≤ e − cλ for some constant c > N , where E N := (cid:26) max ≤ i ≤ N (cid:12)(cid:12) m i ( σ ) − N X j =1 A N ( i, j ) tanh( m j ( σ )) (cid:12)(cid:12) ≤ λ p α N log N (cid:27) (4.9)for some constant c free of N , with α N = max ≤ i ≤ N P Nj =1 A N ( i, j ) as in Theorem 1.4. Proceeding to boundthe second term in the RHS of (4.8), note that | m i ( σ ) | . | m i ( σ ) − tanh( m i ( σ )) | ≤ | m i ( σ ) − N X j =1 A N ( i, j ) tanh( m j ( σ )) | + max ≤ i ≤ N | R i − | . 
(4.10)Thus, combining (4.8) and (4.10), on the set E N we havemax ≤ i ≤ N | ( m i ( σ ) − m ( σ )) − N X j =1 e A N ( i, j )( m j ( σ ) − m ( σ )) | ≤ C h λ p α N log N + max ≤ i ≤ N | R i − | i (4.11)for some C < ∞ free of N . Now, for any integer ℓ ≥ | ( m i ( σ ) − m ( σ )) − N X j =1 e A ℓN ( i, j )( m j ( σ ) − m ( σ )) |≤| ( m i ( σ ) − m ( σ )) − N X j =1 e A ℓ − N ( i, j )( m j ( σ ) − m ( σ )) | + (cid:12)(cid:12)(cid:12) N X j =1 e A ℓ − N ( i, j ) n ( m j ( σ ) − m ( σ )) − N X k =1 e A N ( j, k )( m k ( σ ) − m ( σ )) o(cid:12)(cid:12)(cid:12) ≤| ( m i ( σ ) − m ( σ )) − N X j =1 e A ℓ − N ( i, j )( m j ( σ ) − m ( σ )) | + max ≤ j ≤ N (cid:12)(cid:12)(cid:12) ( m j ( σ ) − m ( σ )) − N X k =1 e A ℓN ( j, k )( m k ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12) , which, via a recursive argument givesmax ≤ i ≤ N (cid:12)(cid:12)(cid:12) ( m i ( σ ) − m ( σ )) − N X j =1 e A ℓN ( i, j )( m j ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12) ≤ ℓ max ≤ i ≤ N | ( m i ( σ ) − m ( σ )) − N X j =1 e A N ( i, j )( m j ( σ ) − m ( σ )) | ≤ Cℓ (cid:16) λ p α N log N + max ≤ i ≤ N | R i − | (cid:17) , (4.12)where the last line uses (4.11) on the set E N . Using part (a) of Lemma 5.2, we note the existence of D freeof N such that for the choice ℓ = D log N we have max ≤ i ≤ N A ℓ ( i, i ) ≤ N . With this choice of ℓ , we have P (cid:18) max ≤ i ≤ N | m i ( σ ) − m ( σ ) | ≥ Cℓ h λ p α N log N + max ≤ i ≤ N | R i − | i , E N (cid:19) ≤ P (cid:18) max ≤ i ≤ N | N X j =1 e A ℓN ( i, j )( m j ( σ ) − m ( σ )) | ≥ Cℓ h λ p α N log N + max ≤ i ≤ N | R i − | i(cid:19) ≤ P (cid:18) N X j =1 ( m j ( σ ) − m ( σ )) ≥ C ℓ N h λ p α N log N + max ≤ i ≤ N | R i − | i (cid:19) , where the last line uses Cauchy-Schwarz inequality. Fixing δ small enough and using part (c) of Lemma 2.2,this giveslog P ( (cid:18) max ≤ i ≤ N | m i ( σ ) − m ( σ ) | ≥ Cℓ h λ p α N log N + max ≤ i ≤ N | R i − | i , E N (cid:19) . 
− N α N (log N ) λ − N (log N ) max ≤ i ≤ N | R i − | + log E e δ P Ni =1 ( m i ( σ ) − m ( σ )) . − N α N (log N ) λ − N (log N ) max ≤ i ≤ N | R i − | + k A N k F + 1 N h N X i =1 ( R i − i + 1 N h N X i =1 ( R i − i + log N, from which the desired conclusion follows for λ large enough on noting the inequality N α N ≥ k A N k F & eb and Mukherjee/Fluctuations in Mean-Field Ising models In order to prove Lemma 2.4, parts (b) and (c), we need the following lemma whose proof we defer to theend of this lemma.
Lemma 4.1.
Assume that (1.4) , (1.5) , (1.9) holds, and the RHS of (1.10) is bounded. Then, setting ν N := E N [( N / σ ) ] the following conclusions hold: ν N . ν / N + ν / N + ν / N + ν / N vuut E hP Ni =1 ( R i − σ i i N / , (4.13) E " N X i =1 ( R i − σ i . (log N ) N X i =1 ( R i − + N − / h N X i =1 ( R i − i ! (cid:18) E [( N / σ ) ] (cid:19) . (4.14) Proof of Lemma 2.4, parts (b) and (c).
Use (4.14) and the fact that the RHS of (1.10) is bounded to get E [ N X i =1 ( R i − σ i ] . √ N (1 + E [( N / σ ) ]) . √ N (1 + ν / N ) . Along with (4.13), this gives ν N . ν / N + ν / N + ν / N (1 + ν / N ) + 1, and so ν N must be bounded, therebyproving part (b). Now, part (c) is an immediate consequence of part (b) and (4.14). Proof of Lemma 4.1. (a) Proof of (4.13).To begin, borrowing notation from the proof of Theorem 1.3 and using (2.18) gives the existence of
C < ∞ such that (cid:12)(cid:12) E [ T N − T ′ N | σ ] − N − / T N / (cid:12)(cid:12) ≤ N − | T N | + C (cid:26) N − / | σ − m ( σ ) | + N − | T N | N X i =1 ( m i ( σ ) − m ( σ )) + N − / (cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( m i ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:27) . On multiplying both sides of the above inequality by | T N | and taking expectation gives E [ T N ] ≤ (2 / N − / E | T N | + 3 C (cid:26) N / E (cid:2) | T N | | σ − m ( σ ) | (cid:3) + N − / E " | T N | N X i =1 ( m i ( σ ) − m ( σ )) + N − / E " | T N | (cid:12)(cid:12) N X i =1 ( m i ( σ ) − m ( σ )) (cid:12)(cid:12) + 3 N / (cid:12)(cid:12) E ( T N − T ′ N ) T N (cid:12)(cid:12) . (4.15)We will now bound each of the terms in the RHS of (4.15). To begin, note that that | T N − T ′ N | ≤ N − / and E [ T N ] = E [ T ′ N ]. This, along with the fact that ( T N , T ′ N ) is an exchangeable pair gives E ( T N − T ′ N ) T N = (1 / E ( T N − T ′ N ) T N − (1 / E ( T N − T ′ N )( T ′ N ) = (1 / E (cid:2) ( T N − T ′ N ) ( T N + T N T ′ N + ( T ′ N ) ) (cid:3) ≤ N − / E [ T N ] ≤ N − / ν / N , (4.16)where ν N = E [( N / σ ) ] as in the statement of the lemma. Also with ε N , r N as in the statement ofTheorem 1.3, use part (c) of Lemma 2.2, and part (a) of Lemma 2.4 to get that for any positive integer p , we have E h N X i =1 ( m i ( σ ) − m ( σ )) i p . ε pN , E max ≤ i ≤ N | m i ( σ ) − m ( σ ) | p . r pN . (4.17)Finally, since the RHS of (1.10) is bounded, we have ε N . √ N , ε N r N . N / , k c k + N − / h N X i =1 c i i . √ N . (4.18) eb and Mukherjee/Fluctuations in Mean-Field Ising models Armed with these estimates and proceeding to bound the second, third and fourth terms in (4.15), useH¨older’s inequality to get N / E [ | T N | | σ − m ( σ ) | ] ≤ N − / √ ν N vuut E " N X i =1 ( R i − σ i (4.19) E " T N N X i =1 ( m i ( σ ) − m ( σ )) ≤ ν / N E " N X i =1 ( m i ( σ ) − m ) / . ν / N ε N . 
ν / N √ N (4.20) E " | T N | (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ( m i ( σ ) − m ( σ )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤√ ν N (cid:18) E h N X i =1 ( m i ( σ ) − m ( σ )) i (cid:19) / (cid:18) E h max ≤ i ≤ N ( m i ( σ ) − m ( σ )) i(cid:19) / . √ ν N ε N r N . √ ν N N / (4.21)where the last bounds in (4.20) and (4.21) use (4.17) and (4.18). Finally, for the fifth term in the RHSof (4.15), note that | T N | ≤ N / , and so the first term in the RHS of (4.15) is bounded by (2 / E [ T N ].Combining this along with (4.15), (4.16), (4.19), (4.20) and (4.21) gives ν N . ν / N + ν / N + ν / N + ν / N vuut E hP Ni =1 ( R i − σ i i N / which completes the proof of (4.13)(b) Proof of (4.14).To begin, for any vector h := ( h , · · · , h N ) write N X i =1 h i σ i = N X i =1 h i ( σ i − tanh( m i ( σ ))) + N X i =1 h i (tanh( m i ( σ )) − tanh( m ( σ ))) + tanh( m ( σ )) N X i =1 h i , which using part (b) of Lemma 2.1 gives E h N X i =1 h i σ i i . k h k + k h k ε N + h N X i =1 h i i E m ( σ ) . k h k ε N + h N X i =1 h i i E ( m ( σ ) ) , (4.22)where the second line uses part (c) of Lemma 2.2, and ε N equals the RHS of (2.2). Setting c = R − and using (4.22) with h = c gives E h N X i =1 c i σ i i . k c k + k c k ε N + h N X i =1 c i i E m ( σ ) . N, (4.23)where the last line uses (4.18). Along with (4.13) this gives ν N . ν / N + ν / N + ν / N + ν / N √ N , and so ν N . √ N ⇒ E σ . N − . (4.24)Also, an argument similar to the derivation of (4.23) shows that for any positive integer p , we have E ( σ − m ( σ )) p = N − p E h N X i =1 c i σ i i p . N − p (cid:18) k c k p + k c k p ε pN + N X i =1 c i ! p E m ( σ ) (cid:19) . N − p , (4.25)where the last bound uses (4.18). Combining we have the following conclusions: E m ( σ ) . E ( σ ) + E ( σ − m ( σ )) . N , (4.26) E (cid:16) N X i =1 m i ( σ ) (cid:17) . 
N E m ( σ ) + r E max ≤ i ≤ N ( m i ( σ ) − m ( σ )) vuut E h N X i =1 ( m i ( σ ) − m ( σ )) i eb and Mukherjee/Fluctuations in Mean-Field Ising models . r N ε N . , (4.27)where (4.26) uses (4.24) and (4.25) with p = 3, and (4.27) uses (4.26) along with (4.17) and (4.18).Armed with these estimates, we now focus on deriving (4.14).Let e A N be as defined in the proof of part (a) of Lemma 2.4, and set c ( ℓ ) := c ⊤ e A ℓN and x ℓ := E (cid:20) N P i =1 c ( ℓ ) i σ i (cid:21) for ℓ ≥
0. As in the proof of part (b) of Lemma 2.3, we can write x ℓ = T ,ℓ + T ,ℓ + T ,ℓ ,where T ,ℓ := E (cid:20) N X i =1 c ( ℓ ) i ( σ i − tanh m i ( σ )) (cid:21) , T ,ℓ := E (cid:20) N X i =1 c ( ℓ ) i tanh m i ( σ ) (cid:21) ,T ,ℓ := 2 E X i = j c ( ℓ ) i c ( ℓ ) j ( σ i − tanh m i ( σ )) tanh m j ( σ ) . By the argument presented in the proof of part (b) of Lemma 2.3 we have T ,ℓ . k c ( ℓ ) k ≤ k c k , and T ,ℓ . k c k . Next, using Taylor Series expansion, we can write tanh( m i ( σ )) = m i ( σ ) + ξ i m i ( σ ) forrandom variables { ξ i } ≤ i ≤ N which are uniformly bounded by 1 in absolute value. Consequently, T ,ℓ − x ℓ +1 = E (cid:20) σ ⊤ A N c ( ℓ ) + N X i =1 c ( ℓ ) i ξ i m i ( σ ) (cid:21) − E (cid:20) σ ⊤ e A N c ( ℓ ) (cid:21) ≤ √ x ℓ +1 vuut E h N X i =1 | c ( ℓ ) i m i ( σ ) | i + 2 √ x ℓ +1 r E h c ( ℓ ) H N σ i + E h c ( ℓ ) H N σ i + E h N X i =1 | c ( ℓ ) i m i ( σ ) | i + 2 r E h c ( ℓ ) H N σ i vuut E h N X i =1 | c ( ℓ ) i m i ( σ ) | i ≤ √ x ℓ +1 k c k vuut E N X i =1 m i ( σ ) + 2 √ x ℓ +1 r E h c ( ℓ ) H N σ i + E h c ( ℓ ) H N σ i + k c k E h N X i =1 m i ( σ ) i + 2 k c k r E h c ( ℓ ) H N σ i vuut E N X i =1 m i ( σ ) . (4.28)Proceeding to bound the RHS of (4.28), use (1.9) and (4.25) respectively to note that k H N k op . N − / ,and N E ( m ( σ )) . N E ( σ ) + 1, and use and so an application of (4.22) with h = H N c ( ℓ ) ) gives E h ( c ( ℓ ) ) ⊤ H N σ i . k c ( ℓ ) k k H N k (cid:18) ε N + N E ( m ( σ ) ) (cid:19) . k c k µ N , (4.29)with µ N := 1 + E ( N / σ ) , where the last line uses (4.17) and (4.18). We now claim that there existsa constant D > x D (log N ) . µ N (cid:18) k c k + N − / (cid:20) N X i =1 c i (cid:21) (cid:19) . (4.30)Given this claim, we have the existence of a constant C free of N such that x D (log N ) ≤ C µ N (cid:18) k c k + N − / (cid:20) N X i =1 c i (cid:21) (cid:19) . (4.31)Also, using (4.29) and (4.27), and making C bigger if needed, for all ℓ ≥ x ℓ ≤ x ℓ +1 + 2 C √ x ℓ +1 k c k √ µ N + C k c k µ N . 
(4.32)

With $L = D(\log N)^2$, we will now show that the bound
$$x_\ell \leq (L - \ell + 1)^2\, C^2\mu_N\Big(\|c\|_2^2 + N^{-1/2}\Big[\sum_{i=1}^N c_i\Big]^2\Big) \tag{4.33}$$
holds for all $\ell \in [0, L]$ by backwards induction. By (4.31) we have that (4.33) holds for $\ell = L$. Suppose (4.33) holds for $\ell + 1$ for some $\ell \in [0, L-1]$. Then (4.32) gives
$$x_\ell \leq C^2\mu_N\Big(\|c\|_2^2 + N^{-1/2}\Big[\sum_{i=1}^N c_i\Big]^2\Big)\Big[(L-\ell)^2 + 2(L-\ell) + 1\Big] = (L - \ell + 1)^2\, C^2\mu_N\Big(\|c\|_2^2 + N^{-1/2}\Big[\sum_{i=1}^N c_i\Big]^2\Big),$$
verifying (4.33) for $\ell$, thus verifying (4.33) for all $\ell \in [0, L]$ by induction. Setting $\ell = 0$ in (4.33) we get the bound
$$\mathbb{E}\Big(\sum_{i=1}^N c_i\sigma_i\Big)^2 \leq L^2C^2\mu_N\Big[\sum_{i=1}^N c_i^2 + N^{-1/2}\Big(\sum_{i=1}^N c_i\Big)^2\Big] \leq C^2D^2\mu_N(\log N)^4\Big[\sum_{i=1}^N c_i^2 + N^{-1/2}\Big(\sum_{i=1}^N c_i\Big)^2\Big],$$
which verifies (4.14), as desired.

It thus remains to verify (4.30), for which, using spectral decomposition, write $\widetilde{A}_N = \sum_{i=1}^N \widetilde{\lambda}_i\widetilde{q}_i\widetilde{q}_i^\top$, where we set $\widetilde{\lambda}_i := \lambda_i(\widetilde{A}_N)$ for convenience of notation. With $L = D(\log N)^2$, this gives
$$c^\top\widetilde{A}_N^L\sigma = \bar\sigma\sum_{i=1}^N c_i + \widetilde{\lambda}_N^L\, c^\top\widetilde{q}_N\widetilde{q}_N^\top\sigma + \sum_{i=2}^{N-1}\widetilde{\lambda}_i^L\, c^\top\widetilde{q}_i\widetilde{q}_i^\top\sigma = \bar\sigma\sum_{i=1}^N c_i + \widetilde{\lambda}_N^L\, c^\top\widetilde{q}_N\widetilde{q}_N^\top\sigma + O(N^{-cD+2}),$$
where the last equality uses Lemma 5.2 to get
$$\max_{2\leq i\leq N-1}|\widetilde{\lambda}_i|^L \leq \Big(1 - \frac{c}{\log N}\Big)^L \leq N^{-cD}$$
for some $c > 0$. Consequently, for $D$ large enough we have
$$\mathbb{E}\big[c^\top\widetilde{A}_N^L\sigma\big]^2 \lesssim \Big[\sum_{i=1}^N c_i\Big]^2\mathbb{E}[\bar\sigma^2] + \|c\|_2^2\,\mathbb{E}\big[(\widetilde{q}_N^\top\sigma)^2\big]. \tag{4.34}$$
Since $\widetilde{q}_N^\top\widetilde{A}_N = \widetilde{\lambda}_N\widetilde{q}_N^\top$, where $\widetilde{\lambda}_N$ is bounded away from $1$ by (1.7), we have
$$(1 - \widetilde{\lambda}_N)\sum_{i=1}^N\widetilde{q}_N(i)\sigma_i = \sum_{i=1}^N\widetilde{q}_N(i)\big(\sigma_i - m_i(\sigma)\big) + \widetilde{q}_N^\top H_N\sigma = \sum_{i=1}^N\widetilde{q}_N(i)\big(\sigma_i - \tanh(m_i(\sigma))\big) + \sum_{i=1}^N\widetilde{q}_N(i)\big(\tanh(m_i(\sigma)) - m_i(\sigma)\big) + \widetilde{q}_N^\top H_N\sigma.$$
This immediately gives
$$(1 - \widetilde{\lambda}_N)^2\,\mathbb{E}\Big[\sum_{i=1}^N\widetilde{q}_N(i)\sigma_i\Big]^2 \lesssim \mathbb{E}\Big[\sum_{i=1}^N\widetilde{q}_N(i)\big(\sigma_i - \tanh(m_i(\sigma))\big)\Big]^2 + \mathbb{E}\Big[\sum_{i=1}^N|\widetilde{q}_N(i)|\, m_i(\sigma)^2\Big]^2 + \mathbb{E}\big[\big(\widetilde{q}_N^\top H_N\sigma\big)^2\big]$$
$$\leq \sum_{i=1}^N\widetilde{q}_N(i)^2 + \sqrt{\sum_{i=1}^N\widetilde{q}_N(i)^2}\,\sqrt{\mathbb{E}\sum_{i=1}^N m_i(\sigma)^4} + \mathbb{E}\big[\big(\widetilde{q}_N^\top H_N\sigma\big)^2\big] \lesssim 1 + \sqrt{\mathbb{E}\sum_{i=1}^N m_i(\sigma)^4} + \|H_N\|_{op}^2\Big[\varepsilon_N^2 + \sum_{i=1}^N\mathbb{E}\, m_i(\sigma)^2\Big],$$
where the last bound uses (4.22) with $h = \widetilde{q}_N$. Since $\sum_{i=1}^N\mathbb{E}\, m_i(\sigma)^4 \lesssim N\,\mathbb{E}(\bar\sigma^2) + 1 \lesssim \sqrt{N}\mu_N$, using the last bound along with (4.34) gives
$$\mathbb{E}\big(c^\top\widetilde{A}_N^L\sigma\big)^2 \lesssim \mu_N\Big(N^{-1/2}\Big[\sum_{i=1}^N c_i\Big]^2 + \sum_{i=1}^N c_i^2\Big),$$
thus verifying (4.30), and hence completing the proof of the lemma.

Remark 4.1.
As in the proof of part (b) of Lemma 2.3, the above argument can be modified to bound the moments of general linear combinations $\sum_{i=1}^N c_i\sigma_i$ for any $c \in \mathbb{R}^N$.
5. Proof of supplementary lemmas
Proof of Lemma 3.1.
Noting the presence of $\mathrm{Tr}_+(D_N)$ in the RHS of the bound, it suffices to prove the result for $D_N$ with all diagonal entries set to $0$. With $(Z_1, Z_2, \ldots, Z_N)$ i.i.d. $N(0,1)$ random variables, we claim that
$$\mathbb{E}\exp\Big(\sum_{i,j=1}^N D_N(i,j)\widetilde{X}_i\widetilde{X}_j + \sum_{i=1}^N c_i\widetilde{X}_i\Big) \leq \mathbb{E}\exp\Big(s\mu^2\sum_{i,j=1}^N D_N(i,j)Z_iZ_j + \sqrt{s}\,\mu\sum_{i=1}^N c_iZ_i\Big). \tag{5.1}$$
Indeed, to see this, recall that the sub-Gaussian norm of $\widetilde{X}_i$ is given by $\sqrt{s}\,\mu$ for $1 \leq i \leq n$ (see e.g. [33, Theorem 2.1]). Consequently, for every $\theta \in \mathbb{R}$ we have $\mathbb{E}\big[\exp\big(\theta\widetilde{X}_i\big)\big] \leq \mathbb{E}\big[\exp\big(\theta\sqrt{s}\,\mu Z_i\big)\big]$. Using this, (5.1) can be obtained by inductively replacing each $\widetilde{X}_i$ on the left hand side of (5.1) with $\sqrt{s}\,\mu Z_i$. The RHS of (5.1) can be computed directly to get
$$\log\mathbb{E}\exp\Big(s\mu^2\sum_{i,j=1}^N D_N(i,j)Z_iZ_j + \sqrt{s}\,\mu\sum_{i=1}^N c_iZ_i\Big) = -\frac12\log\det\big(I_N - 2s\mu^2D_N\big) + \frac{s\mu^2}{2}\,c^\top\big(I_N - 2s\mu^2D_N\big)^{-1}c,$$
from which the desired bound follows on noting the existence of $\rho \in \big(2s\mu^2\limsup_{N\to\infty}\lambda_1(D_N),\,1\big)$, along with the bound $-\log(1-x) \lesssim x$ for $x \in [0,\rho]$.

Proof of Lemma 3.2.
By Hölder's inequality, for any $p > 0$ we have
$$\mathbb{E}_{CW}\Big[\exp\Big(\frac{\beta}{2}\sigma^\top A_N\sigma\Big)\mathbf{1}\big\{|\widetilde{W}_N - M(\bar\sigma)| > \varepsilon\big\}\Big] \leq \Big(\mathbb{E}_{CW}\Big[\exp\Big(\frac{\beta(1+p)}{2}\sigma^\top A_N\sigma\Big)\Big]\Big)^{1/(1+p)}\,\mathbb{P}\big(|\widetilde{W}_N - M(\bar\sigma)| > \varepsilon\big)^{\frac{p}{1+p}}.$$
Since $\limsup_{N\to\infty}\frac1N\log\mathbb{P}\big(|\widetilde{W}_N - M(\bar\sigma)| > \varepsilon\big) < 0$, it suffices to show that for some $p > 0$,
$$\limsup_{N\to\infty}\frac1N\log\mathbb{E}_{CW}\Big[\exp\Big(\frac{\beta(1+p)}{2}\sigma^\top A_N\sigma\Big)\Big] \leq 0. \tag{5.2}$$
To this effect, setting $g_p(\sigma) := \frac{\beta(1+p)}{2}\sigma^\top A_N\sigma + B\sum_{i=1}^N\sigma_i$, note that
$$\log\mathbb{E}_{CW}\Big[\exp\Big(\frac{\beta(1+p)}{2}\sigma^\top A_N\sigma\Big)\Big] = \sup_{\sigma\in[-1,1]^N}\Big\{g_p(\sigma) - \sum_{i=1}^N I(\sigma_i)\Big\} - \log Z^{CW}_N(\beta, B) + o(N), \tag{5.3}$$
where the last line uses [4, Theorem 1.1] along with the observation $\mathrm{Tr}(A_N^2) = o(N)$. Using the spectral theorem we have $A_N = \sum_{i=1}^N\lambda_iq_iq_i^\top$ with $\lambda_i = \lambda_i(A_N)$, and so
$$\sup_{\sigma\in[-1,1]^N}\Big(g_p(\sigma) - \frac{\beta}{2N}\Big(\sum_{i=1}^N\sigma_i\Big)^2 - B\sum_{i=1}^N\sigma_i\Big) = \sup_{\sigma\in[-1,1]^N}\Big[\frac{\beta}{2}\sum_{i=1}^N(\lambda_i - 1)\,\sigma^\top q_iq_i^\top\sigma + \frac{\beta p}{2}\,\sigma^\top\Big(\lambda_1q_1q_1^\top - \frac{\mathbf{1}\mathbf{1}^\top}{N}\Big)\sigma + \frac{\beta p}{2}\sum_{i=2}^N\lambda_i\,\sigma^\top q_iq_i^\top\sigma\Big]$$
$$\lesssim o(N) + \sup_{\sigma\in[-1,1]^N}\sum_{i=2}^N\big(\sigma^\top q_iq_i^\top\sigma\big)\Big(-\frac{\beta}{2}(1-\lambda_i) + \frac{\beta p}{2}\lambda_i\Big),$$
where the bound in the last line uses (1.5) and Lemma 5.1. Finally note that (1.7) shows the existence of $\rho < 1$ such that $\max_{2\leq i\leq N}\lambda_i \leq \rho$, and so there exists $p = p(\rho) > 0$ such that $\max_{2\leq i\leq N}\big(-\beta(1-\lambda_i) + \beta p\lambda_i\big) \leq 0$, giving
$$\sup_{\sigma\in[-1,1]^N}\Big(g_p(\sigma) - \frac{\beta}{2N}\Big(\sum_{i=1}^N\sigma_i\Big)^2 - B\sum_{i=1}^N\sigma_i\Big) \leq o(N),$$
and so
$$\sup_{\sigma\in[-1,1]^N}\Big(g_p(\sigma) - \sum_{i=1}^N I(\sigma_i)\Big) \leq \sup_{\sigma\in[-1,1]^N}\Big(g_p(\sigma) - \frac{\beta}{2N}\Big(\sum_{i=1}^N\sigma_i\Big)^2 - B\sum_{i=1}^N\sigma_i\Big) + \sup_{\sigma\in[-1,1]^N}\Big(\frac{\beta}{2N}\Big(\sum_{i=1}^N\sigma_i\Big)^2 + B\sum_{i=1}^N\sigma_i - \sum_{i=1}^N I(\sigma_i)\Big) = o(N) + M_N(\beta, B),$$
where $M_N(\beta, B)$ is the Mean-Field prediction defined in (1.13). Since $|\log Z^{CW}_N(\beta, B) - M_N(\beta, B)| \lesssim \log N$ by part (a) of Proposition 5.1, (5.2) follows, thus completing the proof of the lemma.

Lemma 5.1.
Let $\sum_{i=1}^N\lambda_i(A_N)q_iq_i^\top$ be the spectral decomposition of $A_N$. Suppose that (1.5) and (1.7) hold, and $\sum_{i=1}^N(R_i - 1)^2 = o(N)$.
(a) Then $\|q_1 - e\|_2 = o(1)$, where $e := N^{-1/2}\mathbf{1}$.
(b) Further we have $\limsup_{N\to\infty}\lambda_1(\mathcal{A}_N) < 1$, where $\mathcal{A}_N := A_N - \frac{1}{N}\mathbf{1}\mathbf{1}^\top$.

Proof. (a) Write $e = \sum_{i=1}^N c_iq_i$ with $c_1 > 0$ (without loss of generality). Then
$$1 + o(1) = \frac1N\sum_{i=1}^N R_i = e^\top A_Ne = \sum_{i=1}^N c_i^2\lambda_i(A_N) \leq \lambda_1(A_N)c_1^2 + \lambda_2(A_N)(1 - c_1^2).$$
Along with (1.5) and (1.7), this gives $c_1^2 = 1 + o(1)$, and so $\langle q_1, e\rangle = c_1 = 1 + o(1)$, thus completing the proof of part (a).
(b) This follows on using part (a) to note that
$$\|\mathcal{A}_N\|_{op} \leq \Big\|\sum_{i=2}^N\lambda_i(A_N)q_iq_i^\top\Big\|_{op} + \big\|\lambda_1(A_N)q_1q_1^\top - ee^\top\big\|_{op} \leq \lambda_2(A_N) + o(1),$$
and using (1.7).

Lemma 5.2.
Let $\Gamma_N$ be an $N\times N$ symmetric matrix with non-negative entries, such that $\mathbf{1}^\top\Gamma_N = \mathbf{1}^\top$ and $\Gamma_N$ satisfies (1.7). Then the following conclusions hold:
(a) There exists $c > 0$ such that for all $\ell \geq 2$ and $N$ large we have
$$\max_{1\leq i\leq N}\Gamma_N^\ell(i,i) \leq \frac2N + 2e^{-c\ell}.$$
(b) There exists $\delta > 0$ such that for all $N$ large enough we have
$$\max_{2\leq i\leq N-1}|\lambda_i(\Gamma_N)| \leq 1 - \frac{\delta}{\log N}.$$
Proof. (a) Setting $\lambda_i := \lambda_i(\Gamma_N)$ for simplicity of notation, let $\mathcal{J}_+ := \{j \in [2,N] : \lambda_j > 0\}$ and $\mathcal{J}_- := \{j \in [2,N] : \lambda_j < 0\}$, and use the spectral theorem to note that for any positive integer $\ell$ we have
$$\Gamma_N^\ell = \frac1N\mathbf{1}\mathbf{1}^\top + \sum_{j\in\mathcal{J}_+}|\lambda_j|^\ell q_jq_j^\top + (-1)^\ell\sum_{j\in\mathcal{J}_-}|\lambda_j|^\ell q_jq_j^\top,$$
where $(q_1,\ldots,q_N)$ are the eigenvectors of $\Gamma_N$. To begin, use (1.7) to note the existence of $c > 0$ such that for $N$ large enough we have $\lambda_2 \leq e^{-c}$, which gives
$$\sum_{j\in\mathcal{J}_+}|\lambda_j|^\ell q_j(i)^2 \leq \lambda_2^\ell \leq e^{-c\ell}. \tag{5.4}$$
For $\ell$ odd, noting that $\Gamma_N^\ell(i,i) \geq 0$, we get
$$\sum_{j\in\mathcal{J}_-}|\lambda_j|^\ell q_j(i)^2 \leq \frac1N + \sum_{j\in\mathcal{J}_+}|\lambda_j|^\ell q_j(i)^2 \leq \frac1N + \lambda_2^\ell \leq \frac1N + e^{-c\ell},$$
where the last inequality uses (5.4). Using the fact that $\max_{2\leq i\leq N}|\lambda_i| \leq 1$, for $\ell \geq 2$ this gives
$$\sum_{j\in\mathcal{J}_-}|\lambda_j|^\ell q_j(i)^2 \leq \sum_{j\in\mathcal{J}_-}|\lambda_j|^{\ell-1}q_j(i)^2 \leq \frac1N + e^{-c(\ell-1)}.$$
Combining these two bounds, for all $\ell \geq 2$ we get
$$|\Gamma_N^\ell(i,i)| \leq \frac1N + \sum_{j\in\mathcal{J}_+}|\lambda_j|^\ell q_j(i)^2 + \sum_{j\in\mathcal{J}_-}|\lambda_j|^\ell q_j(i)^2 \leq \frac2N + 2e^{-c(\ell-1)},$$
thus completing the proof of part (a), on shrinking $c$ if necessary.
(b) Let $\delta > 0$ be small enough that $3e^{-2\delta/c} > 2$. Using part (a) with $\ell$ even and $\ell \sim \frac{2}{c}\log N$, we have
$$\sum_{i=1}^N|\lambda_i|^\ell = \sum_{i=1}^N\Gamma_N^\ell(i,i) \leq 2 + 2Ne^{-c(\ell-1)} \to 2.$$
On the other hand, if $\max_{2\leq i\leq N-1}|\lambda_i| > 1 - \frac{\delta}{\log N}$, then (counting $\lambda_1 = 1$ together with the two smallest eigenvalues, since the maximum cannot be attained at $\lambda_2 \leq e^{-c}$)
$$\sum_{i=1}^N|\lambda_i|^\ell \geq 3\Big(1 - \frac{\delta}{\log N}\Big)^\ell \to 3e^{-2\delta/c}.$$
These two together imply $3e^{-2\delta/c} \leq 2$, a contradiction.
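Both parts of Lemma 5.2 are easy to sanity-check numerically on a concrete matrix satisfying its hypotheses. A minimal sketch, taking $\Gamma_N$ to be the adjacency matrix of the complete graph $K_N$ scaled by its degree (a symmetric, doubly stochastic choice with a large spectral gap; the numeric thresholds below are illustrative, not the constants of the lemma):

```python
import numpy as np

N = 60
# Complete graph K_N scaled by its degree N - 1: symmetric, doubly stochastic,
# with eigenvalue 1 (eigenvector 1/sqrt(N)) and -1/(N-1) with multiplicity N-1.
Gamma = (np.ones((N, N)) - np.eye(N)) / (N - 1)

lam = np.sort(np.linalg.eigvalsh(Gamma))  # ascending order
# Part (b): every eigenvalue except the top one sits well inside
# [-(1 - delta/log N), 1 - delta/log N] for a small illustrative delta.
assert abs(lam[-1] - 1.0) < 1e-10
assert np.max(np.abs(lam[:-1])) < 1 - 0.5 / np.log(N)

# Part (a): diagonal entries of Gamma^ell are at most 2/N + 2 e^{-c ell}.
ell = 8
diag = np.diag(np.linalg.matrix_power(Gamma, ell))
c = -np.log(1.0 / (N - 1))  # here lambda_2 = 1/(N-1) = e^{-c}
assert np.max(diag) <= 2.0 / N + 2.0 * np.exp(-c * ell) + 1e-12
print(np.max(diag) * N)  # close to 1 for this (non-bipartite) example
```

For this non-bipartite example the diagonal of $\Gamma_N^\ell$ is close to $1/N$ rather than $2/N$, which is the point of the bipartite discussion in the remark that follows.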
Remark 5.1.
Note that if $\Gamma_N$ is the adjacency matrix of a well connected bipartite graph scaled by the average degree, then our lemma implies
$$\lim_{N\to\infty}\max_{1\leq i\leq N}\Big|N\,\Gamma_N^\ell(i,i) - 2\Big| = 0$$
for even $\ell = D\log N$ with $D$ large enough. This highlights the asymptotic optimality of the bound obtained in part (a) of Lemma 5.2. Part (b) quantifies the graph theoretic fact that, for a connected regular graph scaled by its degree, the multiplicity of the eigenvalue $-1$ can be at most $1$. It is easy to check that if $-1$ happens to be an eigenvalue, the graph must be bipartite, and all other eigenvalues will be strictly larger than $-1$ (i.e. there is a unique bipartition for a connected bipartite graph). In fact, our proof can be modified to show the stronger conclusion that for a regular well connected bipartite graph the second smallest eigenvalue is bounded away from $-1$, i.e.
$$\liminf_{N\to\infty}\lambda_{N-1}(\Gamma_N) > -1.$$
The following proposition collects all the results for the Curie-Weiss model which we have used previously.
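The bipartite case can be checked directly: for the complete bipartite graph $K_{n,n}$ scaled by its degree $n$, the spectrum is $\{1, -1, 0\}$, and the diagonal of $\Gamma_N^\ell$ equals exactly $2/N$ for even $\ell$. A minimal numerical sketch (with small fixed $n$ and $\ell$ chosen purely for illustration):

```python
import numpy as np

n = 12
N = 2 * n
# Complete bipartite graph K_{n,n} scaled by its degree n.
A = np.zeros((N, N))
A[:n, n:] = 1.0 / n
A[n:, :n] = 1.0 / n

lam = np.sort(np.linalg.eigvalsh(A))
# Spectrum: +1 and -1 (each simple; the -1 comes from bipartiteness), rest 0.
assert abs(lam[-1] - 1.0) < 1e-10 and abs(lam[0] + 1.0) < 1e-10
assert np.max(np.abs(lam[1:-1])) < 1e-10

# For even powers the two unit eigenvalues reinforce on the diagonal,
# so every diagonal entry of A^ell equals exactly 2/N.
ell = 6
diag = np.diag(np.linalg.matrix_power(A, ell))
assert np.allclose(N * diag, 2.0)
```

This is the sense in which the $2/N$ term in part (a) of Lemma 5.2 cannot be improved.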
Proposition 5.1.
Suppose $\sigma$ is drawn from the Curie-Weiss model. With $\widetilde{W}_N$ as in Proposition 3.1, the following conclusions hold:
(a)
$$\Big|\log Z^{CW}_N(\beta, B) - N\Big\{\frac\beta2 t^2 + Bt - I(t)\Big\}\Big| \lesssim 1 \ \text{ if } (\beta,B)\in\Theta_1\cup\Theta_2, \qquad \lesssim \log N \ \text{ if } (\beta,B)\in\Theta_3.$$
(b) For any $\lambda > 0$, we have
$$\log\mathbb{P}_{CW}\big(|\widetilde{W}_N - M(\bar\sigma)| \geq \lambda\big) \lesssim -N\lambda^2 \ \text{ if } (\beta,B)\in\Theta_1\cup\Theta_2, \qquad \lesssim -N\min(\lambda^2, \lambda^4) \ \text{ if } (\beta,B)\in\Theta_3.$$
Consequently, for any sequence $\delta_N = o(N)$ we have
$$\log\mathbb{E}_{CW}\, e^{\delta_N(\widetilde{W}_N - M(\bar\sigma))} \lesssim \frac{\delta_N^2}{N} \ \text{ if } (\beta,B)\in\Theta_1\cup\Theta_2, \qquad \lesssim \frac{\delta_N^2}{N} + \frac{\delta_N^{4/3}}{N^{1/3}} \ \text{ if } (\beta,B)\in\Theta_3.$$
(c) For $(\beta,B)\in\Theta_2$, we have
$$\limsup_{N\to\infty}\frac1N\log\mathbb{P}_{CW}\Big(\sum_{i=1}^N\sigma_i \in \{-1, 0, 1\}\Big) < 0.$$

Proof. (a) With $f(w) = \frac\beta2 w^2 - \log\cosh(\beta w + B)$ as in Section 3.1, a direct computation gives
$$Z^{CW}_N(\beta, B) = e^{-\beta/2}\sqrt{\frac{N\beta}{2\pi}}\int_{\mathbb{R}}e^{-Nf(w)}\,dw,$$
where the function $f(w)$ has a unique global minimum at $w = t$ for $(\beta,B)\in\Theta_1\cup\Theta_3$, and two global minima at $\pm t$ for $(\beta,B)\in\Theta_2$. Also, it is easy to verify that
$$f(w) - f(t) \asymp (w-t)^2 \ \text{ for all } w\in\mathbb{R}, \ \text{ if } (\beta,B)\in\Theta_1,$$
$$f(w) - f(t) \asymp (w-t)^2 \ \text{ for all } w > 0, \ \text{ if } (\beta,B)\in\Theta_2,$$
$$f(w) - f(t) \asymp \min\big[(w-t)^2, (w-t)^4\big] \ \text{ for all } w\in\mathbb{R}, \ \text{ if } (\beta,B)\in\Theta_3. \tag{5.5}$$
The desired estimates follow from these bounds using the Laplace method.
(b) Noting that
$$|\widetilde{W}_N - M(\bar\sigma)| = |\tanh(\beta W_N + B) - \tanh(\beta M(\bar\sigma) + B)| \leq \beta|W_N - M(\bar\sigma)|,$$
it suffices to prove the desired bounds for $W_N$, which follow from straightforward computations using (5.5).
(c) This follows on using part (b) to note that, when $(\beta,B)\in\Theta_2$, the random variable $W_N$ has exponential concentration near the points $\pm t$, neither of which is near $0$.

References

[1]
Adamczak, R., Kotowski, M., Polaczyk, B. and Strzelecki, M. (2019). A note on concentration for polynomials in the Ising model. Electron. J. Probab., Paper No. 42, 22 pp. MR3949267
[2] Augeri, F. (2019). A transportation approach to the mean-field approximation. arXiv preprint arXiv:1903.08021.
[3] Bandeira, A. S. and van Handel, R. (2016). Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab.
[4] Basak, A. and Mukherjee, S. (2017). Universality of the mean-field for the Potts model. Probab. Theory Related Fields.
[5] Berthet, Q., Rigollet, P. and Srivastava, P. (2019). Exact recovery in the Ising blockmodel. Ann. Statist.
[6] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2008). Convergent sequences of dense graphs. I. Subgraph frequencies, metric properties and testing. Adv. Math.
[7] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2012). Convergent sequences of dense graphs II. Multiway cuts and statistical physics. Ann. of Math. (2).
[8] Borgs, C., Chayes, J. T., Cohn, H. and Zhao, Y. (2018). An $L^p$ theory of sparse graph convergence II: LD convergence, quotients and right convergence. Ann. Probab.
[9] Borgs, C., Chayes, J. T., Cohn, H. and Zhao, Y. (2019). An $L^p$ theory of sparse graph convergence I: Limits, sparse random graph models, and power law distributions. Trans. Amer. Math. Soc.
[10] Bresler, G. and Nagaraj, D. (2019). Stein's method for stationary distributions of Markov chains and application to Ising models. Ann. Appl. Probab.
[11] Broder, A. and Shamir, E. (1987). On the second eigenvalue of random regular graphs. In Proceedings of the 28th Annual Symposium on Foundations of Computer Science.
[12] Chatterjee, S. (2005). Concentration inequalities with exchangeable pairs. Thesis (Ph.D.), Stanford University. ProQuest LLC, Ann Arbor, MI. MR2707160
[13] Chatterjee, S. and Dembo, A. (2016). Nonlinear large deviations. Adv. Math.
[14] Chatterjee, S. and Shao, Q.-M. (2011). Nonnormal approximation by Stein's method of exchangeable pairs with application to the Curie-Weiss model. Ann. Appl. Probab.
[15] Chuang, H. and Omidi, G. R. (2009). Graphs with three distinct eigenvalues and largest eigenvalues less than 8. Linear Algebra Appl.
[16] Comets, F. and Gidas, B. (1991). Asymptotics of maximum likelihood estimators for the Curie-Weiss model. Ann. Statist.
[17] Dembo, A. and Montanari, A. (2010). Gibbs measures and phase transitions on sparse random graphs. Braz. J. Probab. Stat.
[18] Deshpande, Y., Sen, S., Montanari, A. and Mossel, E. (2018). Contextual stochastic block models. In Advances in Neural Information Processing Systems.
[19] Eichelsbacher, P. and Löwe, M. (2010). Stein's method for dependent random variables occurring in statistical mechanics. Electron. J. Probab., no. 30, 962–988. MR2659754
[20] Eldan, R. (2018). Taming correlations through entropy-efficient measure decompositions with applications to mean-field approximation. Probab. Theory Related Fields.
[21] Ellis, R. S. and Newman, C. M. (1978). The statistics of Curie-Weiss models. J. Statist. Phys.
[22] Feige, U. and Ofek, E. (2005). Spectral techniques applied to sparse random graphs. Random Structures Algorithms.
[23] Gheissari, R., Lubetzky, E. and Peres, Y. (2018). Concentration inequalities for polynomials of contracting Ising models. Electron. Commun. Probab., Paper No. 76, 12 pp. MR3873783
[24] Giardinà, C., Giberti, C., van der Hofstad, R. and Prioriello, M. L. (2016). Annealed central limit theorems for the Ising model on random graphs. ALEA Lat. Am. J. Probab. Math. Stat.
[25] Jain, V., Koehler, F. and Risteski, A. (2019). Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing.
[26] Kabluchko, Z., Löwe, M. and Schubert, K. (2019). Fluctuations of the magnetization for Ising models on Erdős-Rényi random graphs: the regimes of small p and the critical temperature. arXiv preprint arXiv:1911.10624.
[27] Liu, L. (2017). On the log partition function of Ising model on stochastic block model. arXiv preprint arXiv:1710.05287.
[28] Lovász, L. (2012). Large networks and graph limits. American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI. MR3012035
[29] Löwe, M. and Schubert, K. (2018). Fluctuations for block spin Ising models. Electron. Commun. Probab., Paper No. 53, 12 pp. MR3852267
[30] Mossel, E., Neeman, J. and Sly, A. (2012). Stochastic block models and reconstruction. arXiv preprint arXiv:1202.1499.
[31] Mukherjee, R., Mukherjee, S. and Yuan, M. (2018). Global testing against sparse alternatives under Ising models. Ann. Statist.
[32] Mukherjee, R. and Ray, G. (2019). On testing for parameters in Ising models. arXiv preprint arXiv:1906.00456.
[33] Ostrovsky, E. and Sirota, L. (2014). Exact value for subgaussian norm of centered indicator random variable. arXiv preprint arXiv:1405.6749.
[34] Ravikumar, P., Wainwright, M. J. and Lafferty, J. D. (2010). High-dimensional Ising model selection using $\ell_1$-regularized logistic regression. Ann. Statist.
[35] Sly, A. and Sun, N. (2014). Counting in two-spin models on d-regular graphs. Ann. Probab.
[36] Xu, M. and Mukherjee, S. (2020). Fluctuations in the two-star exponential random graph model.