Typicality and entropy of processes on infinite trees
arXiv preprint [math.PR]
ÁGNES BACKHAUSZ, CHARLES BORDENAVE, AND BALÁZS SZEGEDY
Abstract.
Consider a uniformly sampled random d-regular graph on n vertices. If d is fixed and n goes to ∞, then we can relate typical (large probability) properties of such a random graph to a family of invariant random processes (called "typical" processes) on the infinite d-regular tree T_d. This correspondence between ergodic theory on T_d and random regular graphs has already proven to be fruitful in both directions. This paper continues the investigation of typical processes with a special emphasis on entropy. We study a natural notion of micro-state entropy for invariant processes on T_d. It serves as a quantitative refinement of the notion of typicality and is tightly connected to the asymptotic free energy in statistical physics. Using entropy inequalities, we provide new sufficient conditions for typicality for edge-Markov processes. We also extend these notions and results to processes on unimodular Galton-Watson random trees.

1. Introduction
1.1. Typical processes and sofic entropy.
Random d-regular graphs have been extensively studied over the past 50 years [10, 20, 25, 29]. Sophisticated methods from probability theory, combinatorics and statistical physics have been successfully used to uncover many of their properties, such as the independence ratio, the density of a maximal cut or the spectral gap [18, 28, 20]. The recently emerging theory of graph limits [8, 22, 21, 23] gives a new, limiting point of view on the subject. It turns out that many of the crucial properties of random d-regular graphs, for d fixed and n going to infinity, can also be studied in the framework of ergodic theory on the infinite d-regular tree T_d [2]. An illustration of the power of this method is the proof of the Gaussianity of the almost eigenvectors of random d-regular graphs [3, 11]. The proof is a combination of ergodic theory on T_d with information theoretic methods.

In this paper we are interested in a question which is also related to the limit of random regular graphs, namely, the family of typical processes on T_d introduced in [2]. These objects arise as the local limits of vertex-colored random d-regular graphs (see the formal definition below). These typical processes contain useful information on the structure of d-regular graphs. For example, the classical fact from [9] that the independence ratio is separated from 1/2 is equivalent to the fact that the alternating coloring of the infinite d-regular tree is not typical. Several necessary conditions have been formulated for typical processes in recent years. Some of them are about the covariance structure, others are entropy inequalities [4, 5, 15]. However, general sufficient conditions for a process to be typical are less common.

In this paper we obtain sufficient conditions for the typicality of processes on T_d by studying a new micro-state entropy. This entropy measures, in some sense, the number of finite approximations of a process on a random instance of a large d-regular graph.
This entropy is tightly connected to Bowen's sofic entropy in measured group theory, see [13].

Our approach has another connection to the convergence of random graphs. While graph limit theory shows great promise in a variety of questions related to random d-regular graphs, it has also revealed an intriguing open problem. It is believed that for fixed d and n → ∞, a random d-regular graph on n vertices is convergent (in probability) in the local-global sense of [22] and also right-convergent in the sense of [21]; this last notion of convergence relies on a deep statistical physics theory, see the monograph [24]. These general conjectures are a common strengthening of a large variety of conjectures, many of which are already proven. For example, the convergence of the independence ratio is proven in [6, 17], and the convergence of the asymptotic free energies of a large class of statistical physics models on random d-regular graphs in [26, 14]. In this paper, we introduce an upper and a lower version of the micro-state entropies. Equality of these entropies implies the convergence of random regular graphs. For a family of processes we can establish this equality. This leads to the sufficient conditions mentioned above, and it suggests that our results might lead to a deeper understanding of the structure of random regular graphs.

We now formalize the main definitions. Let d ≥ 3 be an integer and T_d the infinite d-regular tree (all vertices have d neighbors). Let M be a finite set. A process X on T_d is a random variable on M^{T_d}. This process (or its law) is invariant if the law of X is invariant under all automorphisms of the tree T_d. We denote by I_d(M) the set of invariant probability measures on M^{T_d}.

We now define the Benjamini-Schramm topology. A pair (G, f) formed by a graph G = (V, E) and a map f : V → M will be called a colored graph with color set M.
A rooted colored graph is a triple (G, f, o) formed by a connected colored graph (G, f) and a distinguished vertex o of G, called the root. Two rooted colored graphs (G, f, o) and (G′, f′, o′) are isomorphic if there is an isomorphism of G and G′ which preserves the colors and the roots. An equivalence class of rooted colored graphs is called an unlabeled rooted colored graph in combinatorics.

Unlabeled rooted graphs give the proper setup for defining a meaningful notion of convergence. It is however more convenient to work with rooted labeled graphs instead of unlabeled rooted graphs. To this end, we now define a randomized canonical graph in each equivalence class. We define the set of finite integer sequences as

N_f = ∪_{k ≥ 0} N^k,   (1.1)

where N^0 = {o} and N = {1, 2, . . .} by convention. The tree T_d can be classically built on a subset of N_f as follows. The root of the tree is o, the d neighbors of o are V_1 = {1, . . . , d} ⊂ N, the neighbors of i ∈ V_1 are o and {(i, 1), . . . , (i, d − 1)} ⊂ N^2, and so on. More generally, if (G, o) is a rooted graph, the breadth-first search tree started at the root o, where ties between vertices are broken uniformly at random, defines a random graph (G′, o) on a subset of N_f whose law depends only on the equivalence class of (G, o): a vertex at distance k from the root receives a label in N^k; if (i_1, . . . , i_{k−1}) is the label of its parent in the search tree, it has the label (i_1, . . . , i_{k−1}, j) if it is the j-th offspring of its parent in the random ordering. We call this random rooted graph the randomly labeled rooted graph associated to (G, o). Conversely, we will say that a random labeled colored graph (G, f, o) on a subset of N_f is randomly labeled if its law is equal to the law of the randomly labeled rooted colored graph associated to its unlabeled rooted colored graph.
By definition, if X ∈ I_d(M) then (T_d, X, o) is randomly labeled.

Recall that a graph is locally finite if all its vertices have a finite number of neighbors. We denote by G• (respectively G•_M) the set of locally finite graphs (respectively locally finite colored graphs on the color set M) on the vertex set N_f, rooted at o, which are admissible in the sense that they are realizable as a breadth-first search labeling of a locally finite graph. The sets G• and G•_M are complete separable metric spaces when equipped with the distance

d(g, g′) = Σ_{r ≥ 0} 2^{−r} 1{(g)_r ≠ (g′)_r},

where 1 is the indicator function and (g)_r is the restriction of g to the vertices at distance at most r from the root. We denote by P(G•_M) the set of probability measures on G•_M. We equip P(G•_M) with a distance, also denoted by d, which generates the weak topology on P(G•_M) (for example, the Lévy-Prohorov distance).

If (G, f) is a locally finite colored graph and v is a vertex of G, then we denote by distr_{G,v}(f) the law in P(G•_M) of the randomly labeled rooted graph ((G, f)(v), v), where (G, f)(v) is the restriction of (G, f) to the connected component of G containing v. The law distr_{G,v} in P(G•) is defined similarly for a graph G and a vertex v. Finally, if G is finite with vertex set V, we may define the probability measures in P(G•) and P(G•_M):

distr_G = (1/|V|) Σ_{v ∈ V} distr_{G,v}   and   distr_G(f) = (1/|V|) Σ_{v ∈ V} distr_{G,v}(f).

For integers n ≥ d + 1 with nd even, the set G_n(d) of simple d-regular graphs on the vertex set [n] = {1, . . . , n} is not empty. For each integer n ≥ d + 1 with nd even, let G_n be a uniformly distributed random graph on G_n(d). Almost surely, the probability distribution distr_{G_n} converges, as n goes to infinity, to the Dirac mass at T_d rooted at o (it is a consequence of the fact that the number of cycles of length k in G_n is O(1) for any fixed k, see [10]).
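To make the truncation distance concrete, here is a small illustrative sketch (our own toy encoding, not the canonical BFS labeling of the paper): rooted graphs are given as adjacency dicts, and the ball (g)_r is encoded as the set of vertices at distance at most r together with their neighbors inside the ball. Vertex labels are compared literally, so this matches the formal definition only when both graphs are presented on the same canonical labels.

```python
def ball(adj, root, r):
    """Encode the ball of radius r around root: each vertex in the ball,
    paired with its (sorted) neighbors that also lie in the ball."""
    seen, frontier = {root}, {root}
    for _ in range(r):
        frontier = {w for v in frontier for w in adj[v]} - seen
        seen |= frontier
    return frozenset((v, tuple(sorted(w for w in adj[v] if w in seen)))
                     for v in sorted(seen))

def local_distance(adj1, root1, adj2, root2, rmax=20):
    """d(g, g') = sum_{r >= 0} 2^{-r} 1{(g)_r != (g')_r}, truncated at rmax."""
    return sum(2.0 ** -r
               for r in range(rmax + 1)
               if ball(adj1, root1, r) != ball(adj2, root2, r))
```

For instance, a 4-cycle and a 4-path rooted at vertex 0 agree at radius 0 but disagree at every larger radius, so their truncated distance is Σ_{r=1}^{20} 2^{−r}.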
It can further be checked that, a.s., if µ is an accumulation point of distr_{G_n}(f_n) for some sequence of colorings f_n ∈ M^n, then µ ∈ I_d(M). This motivates the following definition, introduced in [2].

Definition 1.1 (Typical process). A measure µ ∈ I_d(M) is weakly typical if

lim_{ε→0} lim sup_{n→∞} P(∃ f ∈ M^n : d(distr_{G_n}(f), µ) ≤ ε) = 1.

It is strongly typical if

lim_{ε→0} lim_{n→∞} P(∃ f ∈ M^n : d(distr_{G_n}(f), µ) ≤ ε) = 1.

Note that it is apparent from the definition that being typical does not depend on the choice of the distance d which generates the weak topology. We note also that the definition in [2] is slightly different, but it turns out to be equivalent by using some measure concentration phenomena (see [3, Section 5]).

We may also define a notion of micro-state entropy of µ ∈ I_d(M) as follows. For µ ∈ I_d(M) and r ≥ 0, the probability measure µ_r is defined as the restriction of µ to the vertices at distance at most r from the root. For any ε > 0, if G ∈ G_n(d), we define

F_G(µ, r, ε) = { f ∈ M^n : d(distr_G(f)_r, µ_r) ≤ ε }.   (1.2)

This is the set of coloring functions f on G which are ε-close to µ_r in the Benjamini-Schramm sense. Let G_n be a uniformly distributed random graph on G_n(d) with n ≥ d + 1 and nd even. Roughly speaking, the sofic entropy of µ is a limit of

H_{G_n}(µ, r, ε) = (1/n) log |F_{G_n}(µ, r, ε)|,   (1.3)

in n → ∞ and then in ε → 0 and r → ∞. However, since H_{G_n}(µ, r, ε) is a random variable in {−∞} ∪ [0, ∞), some care is needed. More formally, we fix 0 < α < 1 and consider the following median value of H_{G_n}(µ, r, ε):

h_n(µ, r, ε, α) = sup{ h ∈ {−∞} ∪ [0, ∞) : P(H_{G_n}(µ, r, ε) ≥ h) ≥ α }.

Since h_n is non-decreasing in ε, we may define the upper and lower entropies of µ as

h̄(µ, r, α) = lim_{ε→0} lim sup_{n→∞} h_n(µ, r, ε, α)   and   h(µ, r, α) = lim_{ε→0} lim inf_{n→∞} h_n(µ, r, ε, α).

These entropies are extended real numbers in {−∞} ∪ [0, ∞).
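For intuition, H_{G_n}(µ, r, ε) can be evaluated by brute force on a toy instance. The sketch below is our own illustration, with several simplifying assumptions: d = 3, n = 4 (the complete graph K_4, the unique simple 3-regular graph on 4 vertices), M = {0, 1}, µ the i.i.d. uniform coloring, r = 1, and a simplified 1-neighborhood statistic (root color plus multiset of neighbor colors, compared in total variation) in place of the distance d on P(G•_M).

```python
import itertools
import math
from collections import Counter
from fractions import Fraction

# K4, the 3-regular graph on 4 vertices (toy example graph)
adj = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
n, d, M = 4, 3, (0, 1)

def local_type(f, v):
    """Simplified 1-neighborhood type of v: (own color, sorted neighbor colors)."""
    return (f[v], tuple(sorted(f[u] for u in adj[v])))

def mu1(t):
    """mu_1 of a type under the i.i.d. uniform process on {0, 1}."""
    _c, nbrs = t
    k = sum(nbrs)  # number of 1s among the d neighbor colors
    return Fraction(1, 2) * Fraction(math.comb(d, k), 2 ** d)

def tv_to_mu1(f):
    """Total variation distance between the empirical type distribution and mu_1."""
    emp = Counter(local_type(f, v) for v in range(n))
    types = set(emp) | {(c, t) for c in M
                        for t in itertools.combinations_with_replacement(M, d)}
    return sum(abs(Fraction(emp.get(t, 0), n) - mu1(t)) for t in types) / 2

def micro_state_entropy(eps):
    """(1/n) log #{f : empirical 1-neighborhood statistics are eps-close to mu_1}."""
    count = sum(1 for f in itertools.product(M, repeat=n) if tv_to_mu1(f) <= eps)
    return math.log(count) / n if count else float("-inf")
```

With ε large, every coloring qualifies and the entropy is ln|M| = ln 2; with ε = 0, no coloring of K_4 matches µ_1 exactly and the value is −∞, illustrating why the variable lives in {−∞} ∪ [0, ∞).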
This entropy can be interpreted as a version of Bowen's sofic entropy, see the survey [13].

The sofic entropies h̄(µ, r, α) and h(µ, r, α) do not depend on α ∈ (0, 1). Note that they do not depend on the choice of the distance d either (in the sense that if two distances are topologically equivalent, the corresponding quantities h̄(µ, r, α) and h(µ, r, α) are equal). We shall prove the following.

Lemma 1.2.
Let µ ∈ I_d(M) and r ≥ 0. The function α ↦ (h̄(µ, r, α), h(µ, r, α)) is constant on (0, 1).

By Lemma 1.2, we may consider the common value of the entropies: for all α ∈ (0, 1), h̄(µ, r) = h̄(µ, r, α) and h(µ, r) = h(µ, r, α). By construction, h̄(µ, r) and h(µ, r) are the growth rates of the number of colorings of a random d-regular graph whose r-neighborhood is close to µ_r. Finally, since h̄(µ, r) and h(µ, r) are non-increasing in r, we may define the upper and lower sofic entropies as

h̄(µ) = lim_{r→∞} h̄(µ, r)   and   h(µ) = lim_{r→∞} h(µ, r).

Taking the limit as α → 1, for h ≥ 0, the inequality h̄(µ) ≥ h is equivalent to the existence of a vanishing sequence (ε_n) such that

lim sup_{n→∞} P( H_{G_n}(µ, 1/ε_n, ε_n) ≥ (h − ε_n)_+ ) = 1,

where (x)_+ = x ∨ 0, and similarly for h(µ). Since for ν, µ ∈ P(G•_M), d(ν_r, µ_r) converges to d(ν, µ) as r → ∞, the sofic entropy is thus closely related to typicality:

Lemma 1.3.
Let µ ∈ I_d(M). We have h̄(µ) ≥ 0 (resp. h(µ) ≥ 0) if and only if µ is weakly (resp. strongly) typical.

This work is notably motivated by the following conjecture, which is connected to the notion of right convergence, see [21] and Subsection 1.5 below.
Conjecture 1.4.
For all µ ∈ I_d(M), we have h(µ) = h̄(µ). In particular, µ is weakly typical if and only if it is strongly typical.

In this work, we will compute the entropy h(µ) = h̄(µ) for a large class of invariant measures µ ∈ I_d(M). This is a class of processes where a second moment method can be applied. In a subsequent work, we will use more advanced statistical physics methods to refine our criterion.

1.2. Annealed entropy. If r ≥ 0 is an integer and S is a subset of T_d, we define B_r(S) as the subset of vertices of T_d at distance at most r from a vertex in S. For ease of notation, for r ≥ 1, we define S_r = B_r(o) as the ball of radius r around the root of T_d, and E_r = B_{r−1}({o, 1}) as the set of vertices at distance at most r − 1 from the edge {o, 1} of T_d.

If X is an invariant process with law µ ∈ I_d(M) and r ≥ 1 is an integer, we set

Σ_r(X) = Σ_r(µ) = H(X_{S_r}) − (d/2) H(X_{E_r}),   (1.4)

where X_S is the restriction of X to the subset S ⊂ T_d and H is the usual Shannon entropy: if Y is a random variable taking values in a finite set F, then

H(Y) = − Σ_{x ∈ F} P(Y = x) ln P(Y = x).

It follows from [3, 12] (see Subsection 2.2 below for details) that Σ_r(µ) is non-increasing in r. We may thus define Σ(µ) = lim_{r→∞} Σ_r(µ). Note that the law of X_{S_r} is µ_r and that the law of X_{E_r} is a marginal of µ_r. The quantities Σ_r(µ) and Σ(µ) will be called the annealed entropy of µ_r and µ; the reason will become clear in the forthcoming Subsection 2.3. The following first moment bound is essentially contained in [2, 12].

Theorem 1.5.
For any µ ∈ I_d(M) and integer r ≥ 1, we have h̄(µ, r) ≤ Σ_r(µ) and h̄(µ) ≤ Σ(µ).

As a corollary, we recover the "star-edge inequality" of [2, 3], which is a necessary condition for typicality: if µ is weakly typical then, by Lemma 1.3, h̄(µ) ≥ 0 and thus, by Theorem 1.5, Σ(µ) ≥ 0.

Corollary 1.6 ([2, 3]). If µ ∈ I_d(M) is a weakly typical process then Σ(µ) ≥ 0.

The main result of this paper is a matching lower bound for a large class of invariant processes. To this end, we first recall the notion of coupling, restricted to our setting. Let M_1 and M_2 be two finite sets, and let X_1 and X_2 be two random variables on M_1^{T_d} and M_2^{T_d} with respective laws µ_1 and µ_2. A coupling of µ_1 and µ_2 is a distribution ν on (M_1 × M_2)^{T_d} such that if Y = (Y_1, Y_2) has law ν, then Y_i has law µ_i for i = 1, 2. If X_i is an invariant process for i = 1, 2, we say that ν or Y is an invariant coupling if ν ∈ I_d(M_1 × M_2).

Theorem 1.7.
Let µ ∈ I_d(M). For any integer r ≥ 1, if all invariant couplings ν of µ and µ satisfy Σ_r(ν) ≤ 2Σ_r(µ), then h(µ, r) = h̄(µ, r) = Σ_r(µ). In particular, if the above condition holds for an increasing sequence of integers (r_k)_{k ≥ 0}, then h(µ) = h̄(µ) = Σ(µ).

We note that the bound Σ_r(ν) ≤ 2Σ_r(µ) is attained for the independent coupling Y = (X_1, X_2) with X_1, X_2 independent with law µ. Note also that if ν is the trivial coupling of µ and µ, that is Y = (X, X) with X of law µ, we find Σ_r(ν) = Σ_r(µ). Hence, under the condition of Theorem 1.7, we have Σ_r(µ) ≤ 2Σ_r(µ), or equivalently Σ_r(µ) ≥ 0. As a corollary, by Lemma 1.3, we thus obtain the following sufficient condition for typicality.

Corollary 1.8.
Let µ ∈ I_d(M) and let (r_k)_{k ≥ 0} be an increasing sequence of integers such that for all invariant couplings ν of µ and µ and all k ≥ 0 we have Σ_{r_k}(ν) ≤ 2Σ_{r_k}(µ). Then µ is strongly typical.

1.3. Edge-Markov processes.
There is a specific class of processes in I_d(M) for which it is possible to improve on Theorem 1.7: the edge-Markov processes, defined as follows. As above, for an integer r ≥ 0, B_r(S) is the r-neighborhood of a subset S in T_d. For r ≥ 1, recall that S_r = B_r(o) and E_r = B_{r−1}({o, 1}).

Definition 1.9 (Edge-Markov process). A probability measure on M^{T_d} is edge-Markov if, conditioned on the value at an edge, the processes on the left and right subtrees of that edge are independent. More generally, for an integer r ≥ 1, a probability measure on M^{T_d} is r-Markov if, conditioned on the value at B_{r−1}(e), the (r − 1)-neighborhood of an edge e, the processes on the left and right subtrees of e are independent (for r = 1 we recover the edge-Markov processes).

Let I_{d,r}(M) denote the set of probability measures on M^{S_r} that are invariant under automorphisms of S_r and whose restriction to E_r is invariant under switching the two sides of the edge {o, 1}. If µ ∈ I_d(M), then µ_r, its restriction to S_r, is in I_{d,r}(M). Conversely, the following lemma is easy to see.

Lemma 1.10.
Let r ≥ 1 be an integer and p ∈ I_{d,r}(M). Then there is a unique r-Markov process µ(p) ∈ I_d(M) such that the marginal of µ(p) on S_r is equal to p.

If p ∈ I_{d,r}(M), we define Σ(p) = Σ_r(µ(p)) = H(X_{S_r}) − (d/2) H(X_{E_r}) as in Equation (1.4), where X has law µ(p). As above, if p_1 and p_2 are probability measures on M_1^{S_r} and M_2^{S_r}, a coupling of p_1 and p_2 is a probability measure on (M_1 × M_2)^{S_r} whose marginals are p_1 and p_2. The following theorem is a strengthening of Theorem 1.7 for edge-Markov processes.

Theorem 1.11.
Let r ≥ 1 be an integer and p ∈ I_{d,r}(M). If for all couplings q ∈ I_{d,r}(M × M) of p and p we have Σ(q) ≤ 2Σ(p), then h(µ(p)) = h̄(µ(p)) = Σ(p) and µ(p) is strongly typical.

Theorem 1.11 provides an easy-to-check criterion for typicality for edge-Markov processes. In the course of the proof, we will need an important maximizing property satisfied by edge-Markov processes (a closely related characterization can be found in [12, Theorem 1.3] and [3, Lemma 10.1]).
Lemma 1.12.
Let X ∈ I_d(M) and r ≥ 1. We have Σ_{r+1}(X) ≤ Σ_r(X), with equality if and only if X_{S_{r+1}} is an r-Markov process on S_{r+1}.

1.4. Vertex-Markov processes.
There is a subclass of edge-Markov processes for which the annealed entropy takes a particularly simple form.
Definition 1.13 (Vertex-Markov process). Let T be a tree; a probability measure on M^T is vertex-Markov if, conditioned on the value at a vertex, the processes on the pending subtrees of that vertex are independent.

Let I_e(M) denote the set of probability measures on M^{E_1} that are invariant under switching the two sides of the edge {o, 1}. If µ ∈ I_d(M), then its restriction to E_1 is in I_e(M). Conversely, if p ∈ I_e(M), there exists a unique vertex-Markov process in I_d(M) whose restriction to E_1 is p. We denote the law of this process by µ(p). If X ∈ I_d(M), we define

Σ_e(X) = (d/2) H(X_{E_1}) − (d − 1) H(X_o).

If p ∈ I_e(M), we set Σ(p) = Σ_e(µ(p)). Vertex-Markov processes satisfy the following extremal property.

Lemma 1.14. If X ∈ I_d(M) then Σ_1(X) ≤ Σ_e(X), with equality if and only if X_{S_1} is a vertex-Markov process on S_1.

Combined with Theorem 1.11, the above lemma implies the following corollary.
Theorem 1.15.
Let p ∈ I_e(M). If for all couplings q ∈ I_e(M × M) of p and p we have Σ(q) ≤ 2Σ(p), then h(µ(p)) = h̄(µ(p)) = Σ(p) and µ(p) is strongly typical.

Proof. Let p′ = µ(p)_1 ∈ I_{d,1}(M) be the law of µ(p) restricted to S_1; since µ(p) is vertex-Markov, Lemma 1.14 gives Σ(p′) = Σ_1(µ(p)) = Σ_e(µ(p)) = Σ(p). Let q′ be an invariant coupling of p′ and p′ and let q be its restriction to E_1. By construction, q ∈ I_e(M × M). Moreover, by Lemma 1.14, Σ(q′) ≤ Σ_1(µ(q)) = Σ_e(µ(q)) = Σ(q) ≤ 2Σ(p) = 2Σ(p′). It follows that Theorem 1.15 is a consequence of Theorem 1.11 applied to r = 1 and p′ = µ(p)_1. □
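As a small numerical illustration (our own sketch; the two symmetric edge laws below are example inputs), the quantity Σ(p) = Σ_e(µ(p)) can be evaluated directly from a symmetric probability law p on M × M:

```python
import math

def H(p):
    """Shannon entropy with natural logarithm; p maps outcomes to probabilities."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def sigma_e(edge_law, d):
    """Sigma_e(X) = (d/2) H(X_{E_1}) - (d - 1) H(X_o) for the vertex-Markov
    process determined by a symmetric law edge_law on M x M (restriction to E_1)."""
    root = {}
    for (a, _b), q in edge_law.items():
        root[a] = root.get(a, 0.0) + q
    return (d / 2) * H(edge_law) - (d - 1) * H(root)

# i.i.d. uniform colors on M = {0, 1}: Sigma_e = ln 2 for every d
iid = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
# perfectly correlated colors ("global coin flip"): Sigma_e = -(1/2) ln 2 for d = 3
corr = {(0, 0): 0.5, (1, 1): 0.5}
```

For d = 3, the i.i.d. uniform law gives Σ_e = ln 2 > 0, while the perfectly correlated law gives Σ_e = −(1/2) ln 2 < 0; by Corollary 1.6, the latter process cannot be typical.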
In this paragraph, we discuss abasic connection between asymptotic free energy of factor graphs and sofic entropy. This may serve an extramotivation for studying the sofic entropy.Let M be a finite set, r ≥ be an integer and let ϕ be a function on the set of rooted unlabeled M -coloredgraphs of radius r taking value in (0 , ∞ ) . If G ∈ G n ( d ) , Z G = X f ∈ M n n Y v =1 ϕ (( G, f, v ) r ) , here ( G, f, v ) r is the rooted colored graph associated to the ball of radius r around v in G . We set ψ = ln ϕ. The asymptotic free energy is defined as the limit of (1 /n ) ln Z G n where G n is a uniformly sampled graphin G n ( d ) (provided that the limit exists). By standard concentration inequality (see argument in Theorem2.1), it is easy to check that if G n is uniformly sampled in G n ( d ) , then, in probability, as n goes to infinity, n ln Z G n − E n ln Z G n → . (1.5)It is straightforward to express the limits of the expected free energy in terms of the entropy. If p ∈I d,r ( M ) , we set for ease of notation h ( p ) = h ( µ ( p ) , r ) and similarly for ¯ h ( p ) (there are the upper and lowergrowth rates of the number of colorings of G n whose r -neighborhood is close to p ). In the statement below,we use the notation h p, ψ i = E ψ ( X ) where X has law p . Lemma 1.16.
For an integer r ≥ 1 and ψ as above, if G_n is uniformly distributed on G_n(d) (with nd even and n ≥ d + 1), we have

sup_{p ∈ I_{d,r}(M)} ( h(p) + ⟨p, ψ⟩ ) ≤ lim inf_{n→∞} E (1/n) ln Z_{G_n} ≤ lim sup_{n→∞} E (1/n) ln Z_{G_n} ≤ sup_{p ∈ I_{d,r}(M)} ( h̄(p) + ⟨p, ψ⟩ ).

In particular, if Conjecture 1.4 holds true, then it would automatically imply the convergence of the expected free energy for all functions ψ. Note also that Theorem 1.7 can be used to obtain a lower bound on the expected free energy, while Theorem 1.5 can be used to get an upper bound. With proper technical conditions, it is possible to extend Lemma 1.16 to some hard-constrained models, that is, to some functions φ = e^ψ which take values in [0, ∞). For simplicity, we will however not discuss this possibility in detail here.

In the same vein, in combinatorial optimization problems, we are often interested in the computation of a graph functional of the form

L_G = max_{f ∈ M^n} Σ_{v=1}^n ψ((G, f, v)_r),

with r ≥ 1 and ψ as above. Again, it is easy to check that if G_n is uniformly sampled in G_n(d), we have, in probability, L_{G_n}/n − E L_{G_n}/n → 0. The following statement is a corollary of Lemma 1.16. It shows that typical processes are intimately connected to the computation of limits of E L_{G_n}/n.

Lemma 1.17.
For an integer r ≥ 1 and ψ as above, if G_n is uniformly distributed on G_n(d) (with nd even and n ≥ d + 1), we have

sup_{p ∈ I_{d,r}(M) : h(p) ≥ 0} ⟨p, ψ⟩ ≤ lim inf_{n→∞} E L_{G_n}/n ≤ lim sup_{n→∞} E L_{G_n}/n ≤ sup_{p ∈ I_{d,r}(M) : h̄(p) ≥ 0} ⟨p, ψ⟩.

Again, we observe that Theorem 1.7 can be used to obtain a lower bound on E L_{G_n}/n, and Theorem 1.5 an upper bound.

Remark. We conclude this paragraph by mentioning that, for r = 1, there is a simplification of Lemma 1.16 and Lemma 1.17 for functions ψ of the form

ψ((G, f, v)_1) = ψ_1(f(v)) + Σ_{u : u ∼ v} ψ_2(f(v), f(u)),

where the supremum in Lemma 1.16 and Lemma 1.17 is taken over p ∈ I_e(M) instead of p ∈ I_{d,1}(M), and the entropic term is given by h(p) = sup h(q), where the supremum is over all q ∈ I_{d,1}(M) whose restriction to E_1 is p, and similarly for h̄(p). This can be useful because it reduces the dimension of the underlying optimization problem. In that case, Theorem 1.15 can be used to give lower bounds.

1.6. Organization of the paper.
The remainder of this text is organized as follows. In Section 2, we establish the key properties of the sofic and annealed entropies. In Section 3, we prove the main results of this paper. In the final Section 4, we extend our framework and main results to invariant processes on unimodular Galton-Watson trees.

2. Properties of sofic and annealed entropies
2.1. Concentration of entropy: proof of Lemma 1.2.
Let G_n be a uniformly distributed random graph on G_n(d) with n ≥ d + 1 and nd even. Recall the definition of H_{G_n}(µ, r, ε) in (1.3). The aim of this subsection is to establish the following concentration result.

Theorem 2.1.
Let r ≥ 0, µ ∈ I_d(M), and h ∈ {−∞} ∪ [0, ∞). For all continuous functions δ : [0, ∞) → [0, ∞) with δ(0) = 0, we have the following: if for all ε > 0,

lim sup_{n→∞} (1/n) log P( H_{G_n}(µ, r, ε) ≥ h ) ≥ −δ(ε),   (2.1)

then h̄(µ, r, α) ≥ h for all 0 < α < 1. Conversely, there exists a function δ as above, positive on (0, ∞), such that: if for all ε > 0,

lim sup_{n→∞} (1/n) log P( H_{G_n}(µ, r, ε) ≤ h ) ≥ −δ(ε),   (2.2)

then h̄(µ, r, α) ≤ h for all 0 < α < 1. Finally, the same claims hold with lim inf and h replacing lim sup and h̄.

It is immediate to check that Lemma 1.2 is a corollary of Theorem 2.1. Beware of the asymmetry between the lower and the upper bound; we believe that it is a caveat of our proof. It is ultimately due to the fact that H_{G_n}(µ, r, ε) can be equal to −∞.

The proof of Theorem 2.1 makes a detour through a relaxation of the entropy. Fix µ ∈ I_d(M) and r ≥ 0. If G is in G_n(d), we define, for β > 0,

Z_G(β) = Σ_{f ∈ M^n} e^{−nβ d(distr_G(f)_r, µ_r)}.

We start the proof of the theorem with a concentration inequality.
Lemma 2.2.
Let G_n be uniformly distributed on G_n(d) with nd even and n ≥ d + 1. There exist a constant C depending on (d, r) and a deterministic number s_n(β) depending on (n, d, r, β) such that for any t > 0, we have

P( |(1/n) ln Z_{G_n}(β) − s_n(β)| ≥ t ) ≤ C exp( −n t² / (Cβ²) ).

Proof.
The proof follows a standard path. By classical contiguity results, it is enough to establish the claim for the configuration model (see Bollobás [10, Section 2.4]). Recall that the configuration model is the graph (with possible loops and multiple edges) obtained as follows. We attach d half-edges to each vertex in [n]. We sample a matching m on the set ~E of the nd half-edges uniformly at random (recall that a matching is an involution without fixed point). Finally, we form a d-regular graph G = G(m) by creating an edge for each pair of matched half-edges.

Let us say that two matchings m, m′ differ by a switch if there exist (a, b, c, d) in ~E such that m(e) = m′(e) for all e ∈ ~E \ {a, b, c, d} and m(a) = b, m′(a) = c, m(c) = d, m′(b) = d. If m and m′ differ by a switch, we claim that for any f ∈ M^n,

d(distr_{G(m)}(f)_r, distr_{G(m′)}(f)_r) ≤ C₀ d (d − 1)^r / n = θ/n,

where C₀ is the diameter of P(G•_M) for the distance d. Indeed, we have distr_{G,v}(f)_r = distr_{G′,v′}(f′)_r if the rooted subgraphs (G, f, v)_r and (G′, f′, v′)_r are isomorphic. Notably, distr_{G(m),v}(f)_r = distr_{G(m′),v}(f)_r unless v is at distance at most r from an edge in the symmetric difference of G(m) and G(m′). We deduce that

e^{−βθ} ≤ Z_{G(m′)} / Z_{G(m)} ≤ e^{βθ}   and   |ln Z_{G(m)} − ln Z_{G(m′)}| ≤ βθ.

From [29, Theorem 2.19], if m is a uniformly sampled matching on ~E, we get

P( |(1/n) ln Z_{G(m)} − E (1/n) ln Z_{G(m)}| ≥ t ) ≤ 2 exp( −n t² / (2dθ²β²) ).

The conclusion follows with s_n(β) = E (1/n) ln Z_{G(m)}(β). □
Proof of Theorem 2.1.
Recall the definition of F G ( µ, r, ǫ ) in (1.2). Since µ and r are fixed, we write simply F G ( ǫ ) and set F G ( ǫ ) = |F G ( ǫ ) | . For any ǫ > , we have ln Z G ( β ) ≥ ln F G ( ǫ ) − nβǫ, (2.3)where we have used that d(distr G ( f ) r , µ r ) ≤ ǫ for all f ∈ F G ( ǫ ) . The other way around, if f / ∈ F G ( ǫ ) , wehave d(distr G ( f ) r , µ r ) ≥ ǫ . Hence, ln Z G ( β ) ≤ ln (cid:0) | M | n e − nβǫ + F G ( ǫ ) (cid:1) ≤ ln 2 + (ln F G ( ǫ )) ∨ ( n (ln | M | − βǫ )) . (2.4)We may now prove the first claim of the theorem. Assume that (2.1) holds for some h ≥ (if h = −∞ ,there is nothing to prove). Let h < h and E = { H G n ( µ, r, ǫ ) ≥ h } . On the event E , from (2.3), we have forall β > , n ln Z G n ( β ) ≥ h − βǫ. There exists β ǫ such that, as ǫ → , β ǫ ǫ → , β ǫ → ∞ and β ǫ δ ( ǫ ) → . For this choice of β , ( t ǫ /β ǫ ) ≫ δ ( ǫ ) for some t ǫ → . It follows from Lemma 2.2 and (2.1) that for any h , h such that h < h < h < h , s n ( β ǫ ) ≥ h for all n large enough (depending on ǫ ). Let < α < . Applying again Lemma 2.2, we deducethat the event E = { n ln Z G n ( β ǫ ) ≥ h } has probability greater than α for all n large enough.Now, there exists η ǫ such that, as ǫ → , η ǫ → and β ǫ η ǫ → ∞ (for example η ǫ = 1 / √ β ǫ ). We apply(2.4) with ǫ = η ǫ . We get on the event E , if ǫ is small enough, H G n ( µ, r, η ǫ ) ≥ n ln Z G n ( β ǫ ) − n ln 2 ≥ h − n ln 2 . The right-hand side is larger than h if n is large enough. We deduce that ¯ h ( µ, r, α ) ≥ h , since h can bearbitrarily close to h , the first claim follows.The second claim is proven similarly. Since H G n ( µ, r, ǫ ) takes value in {−∞}∪ [0 , ∞ ) , we have H G n ( µ, r, ǫ ) ≤− if and only if H G n ( µ, r, ǫ ) = −∞ . We may thus prove the second claim with h ≥ − . Assume that (2.2)holds for some δ ( ǫ ) which will be defined later on. We now set E = { H G n ( µ, r, ǫ ) ≤ h } . On the event E , from(2.4), we have for all β > , n ln Z G n ( β ) ≤ n ln 2 + h ∨ (ln | M | − βǫ ) . 
If β ≥ β_ε = (ln |M| + 1)/ε, then we get, for all n large enough,

(1/n) ln Z_{G_n}(β) ≤ (1/n) ln 2 + h.

We choose δ(ε) so that δ(ε) β_ε² → 0 as ε → 0 (for example δ(ε) = ε³). Then, if h < h₁ < h₂ < h₃ and ε is small enough, we deduce by Lemma 2.2 that s_n(β_ε) ≤ h₁ for all n large enough. We apply again Lemma 2.2 and deduce that the event E₂ = {(1/n) ln Z_{G_n}(β_ε) ≤ h₂} has probability greater than α for all n large enough. Finally, let η_ε be such that, as ε → 0, η_ε → 0 and β_ε η_ε → 0. From (2.3) applied with η_ε, we have on the event E₂,

H_{G_n}(µ, r, η_ε) ≤ h₂ + β_ε η_ε.

The latter is less than h₃ for all ε small enough. The second claim follows. Obviously, the same argument works with lim inf and h(µ, r, α). □
In this sub-section, we prove Lemma 1.12 and Lemma 1.14. If
X, Y are discrete random variables, we recall that therelative entropy of X given Y is H ( X | Y ) = − X x,y P (( X, Y ) = ( x, y )) ln P ( X = x | Y = y ) , where P ( A | B ) = P ( A ∩ B ) / P ( B ) is the usual conditional probability (if P ( B ) = 0 , P ( A | B ) takes an arbitraryvalue). In other words, H ( X | Y ) is the average over Y of the entropy of the conditional law of X given Y .We will repeatedly use that H ( X, Y ) = H ( Y ) + H ( X | Y ) and H ( X | ( Y, Y ′ )) ≤ H ( X | Y ) , (2.5)with equality if and only if X conditioned on Y is independent of Y ′ .We start with the proof of Lemma 1.12. Proof of Lemma 1.12.
The following fact is useful. For a given integer r ≥ , we introduce the finite set N = M S r − where as usual S r − = B r − ( o ) . We consider the map Ψ from M T d to N T d which maps x to Ψ( x ) such that for v ∈ T d , Ψ( x ) v is the restriction of x to S r − ( v ) (composed by a given isomorphism from S r − ( v ) to S r − ). If X is a process on T d then for all integers t ≥ , we have Σ t + r ( X ) = Σ t +1 (Ψ( X )) .Moreover, if X is a r -Markov process, then Ψ( X ) is an edge-Markov process.As a byproduct, it is sufficient to prove Theorem 1.12 with r = 1 : we should check that Σ ( X ) ≤ Σ ( X ) with equality if and only if X S is an edge Markov process. Note that the above inequality can be equivalentlywritten as H ( X S ) − H ( X S ) − d H ( X E ) + d H ( X E ) ≤ . (2.6)To check that (2.6) holds, we need some extra notation. We denote by L = { , . . . , d } the left side of E along E = { o, } . We also set L i = { ( i, , . . . , ( i, d − } with i = 1 , . . . , d . We have S = L ∪ E and thus,from (2.5) H ( S ) = H ( E ) + H ( L | E ) , where for ease of notation, for sets S, T , we write H ( S ) and H ( S | T ) in place of H ( X S ) and H ( X S | X T ) .Similarly, since E = S ∪ L , H ( E ) = H ( S ) + H ( L | S ) = H ( E ) + H ( L | E ) + H ( L | S ) . Finally, since S is the disjoint union of E and ∪ di =2 L i , we have, H ( S ) = H ( E ) + H ( ∪ di =2 L i | E ) . The last three identities imply that Equation (2.6) is equivalent to H ( ∪ di =2 L i | E ) − (cid:18) d − (cid:19) H ( L | S ) − d H ( L | E ) ≤ . (2.7)Using the invariance, we deduce from (2.5) that H ( ∪ di =2 L i | E ) ≤ ( d − H ( L | E ) with equality if and only if there is conditional independence of the X L i ’s given X E . Now, since E contains S , we get H ( L | E ) ≤ H ( L | S ) = H ( L | S ) , with equality in case of conditional independence of X L and X E \ S given X S . It follows that the left-handside of (2.7) is upper bounded by d H ( L | S ) − d H ( L | E ) . 
(2.8)

From the invariance of X under switching the two sides of e = {o, 1}, we get H(L | E_1) = H(L_1 | E_1), and thus (2.8) is equal to

(d/2) (H(L_1 | S_1) − H(L_1 | E_1)).

Using (2.5) again, since E_1 ⊂ S_1, this last expression is always non-positive, with equality if and only if X_{L_1} is conditionally independent of X_{S_1} given X_{E_1}. This proves that (2.6) holds. By considering the cases of equality, it is then easy to check that they imply that X_{S_2} is an edge-Markov process. It concludes the proof of Lemma 1.12. □

We now prove Lemma 1.14.
Proof of Lemma 1.14.
Let X ∈ I_d(M). From (2.5), with the notation used in the proof of Lemma 1.12, we have

H(S_1) = H(o) + H(S_1 | o) ≤ H(o) + d H(E_1 | o),

with equality if the variables (X_o, X_i), 1 ≤ i ≤ d, conditioned on X_o, are independent. Using (2.5) again, we get H(S_1) ≤ d H(E_1) − (d − 1) H(o). So finally,

Σ_1(X) = H(S_1) − (d/2) H(E_1) ≤ (d/2) H(E_1) − (d − 1) H(o) = Σ_e(X),

as requested. □

Combinatorial characterization of the annealed entropy.
In this subsection, we give a combinatorial interpretation of the annealed entropy Σ_r(µ). Recall that G_n(d) is the set of simple d-regular graphs on the vertex set [n]. For µ ∈ I_d(M), an integer r ≥ 1 and ε > 0, we define the set of colored graphs whose r-neighborhood statistics are close to µ_r as

G_n(µ, r, ε) = {(G, f) : G ∈ G_n(d), f ∈ M^n, d(distr_G(f)_r, µ_r) ≤ ε} = ⊔_{G ∈ G_n(d)} F_G(µ, r, ε),

where ⊔ is the disjoint union and F_G(µ, r, ε) was defined in (1.2). We then set

Σ_n(µ, r, ε) = (1/n) (log |G_n(µ, r, ε)| − log |G_n(d)|) = (1/n) log E|F_{G_n}(µ, r, ε)|,   (2.9)

where the expectation is with respect to the random graph G_n uniformly distributed on G_n(d). In comparison with the definition of H_{G_n}(µ, r, ε) in (1.3), Σ_n(µ, r, ε) appears as an annealed quantity in the sense that there is an average over the randomness of G_n inside the logarithm. The following theorem asserts that Σ_n(µ, r, ε) is close to Σ_r(µ) as n goes to infinity and ε goes to 0.

Theorem 2.3.
Let µ ∈ I_d(M) and r ≥ 1 an integer. We have

lim_{ε→0} liminf_{n→∞} Σ_n(µ, r, ε) = lim_{ε→0} limsup_{n→∞} Σ_n(µ, r, ε) = Σ_r(µ).

Proof.
One side of this identity can be found in [3, Lemma 6.2]. We will however give a proof which relies on [16], which is a generalization of [12] to colored graphs. This is interesting because it connects [12, 16] to the entropic inequalities found in [2, 3]. First, a classical result of Bender and Canfield [7] implies that

(1/n) log |G_n(d)| = (d/2) log n − s(d) − log(d!) + o(1),   (2.10)

where s(d) = d/2 − (d/
2) log d. On the other hand, Propositions 5 and 6 in Delgosha and Anantharam [16] imply that

lim_{ε→0} liminf_{n→∞} ((1/n) log |G_n(µ, r, ε)| − (d/2) log n) = lim_{ε→0} limsup_{n→∞} ((1/n) log |G_n(µ, r, ε)| − (d/2) log n) = J_r(µ),

where J_r(µ) has an explicit formula that we now describe (the same formula appears in [12]). We define T̃^•_{r−1} as the set of unlabeled colored rooted (d−1)-ary trees of depth r−1. An element g = (t, t′) ∈ Ẽ_r = T̃^•_{r−1} × T̃^•_{r−1} can be seen as an unlabeled coloring of E_r rooted at the oriented edge (o, 1). For g = (t, t′) in Ẽ_r and X a coloring of T_d, we then define N_X(g) as the number of neighbors v of the root such that X restricted to E_r(o, v) = B_{r−1}({o, v}) is isomorphic to g: more precisely, such that the restriction of X to the (d−1)-ary tree rooted at o (respectively v) in E_r(o, v) \ {o, v} is isomorphic to t (respectively t′). By construction,

Σ_{g ∈ Ẽ_r} N_X(g) = deg(o) = d.   (2.11)

If X is a random coloring of T_d with law µ, we then define a probability measure on Ẽ_r by, for all g ∈ Ẽ_r:

π_µ(g) = E[N_X(g)] / d,

where the expectation is with respect to the randomness of X. We have

J_r(µ) = −s(d) + H(X̃_{S_r}) − (d/2) H(π_µ) − Σ_{g ∈ Ẽ_r} E[log(N_X(g)!)],

where X̃_{S_r} is the rooted unlabeled coloring associated to X_{S_r}. As a sanity check, if M is a singleton, then J_r(µ) = −s(d) − log(d!) and we retrieve Equation (2.10). Moreover, in view of Equation (2.10), the theorem follows from the claim

J_r(µ) = −s(d) − log(d!) + H(X_{S_r}) − (d/2) H(X_{E_r}).   (2.12)

The expression (2.12) is obtained by putting a random labeling on an unlabeled rooted coloring and following the effect on the Shannon entropy. We first observe that, since X is invariant, for any g ∈ Ẽ_r, we have

P(X_{E_r} ≃ g) = (1/d) Σ_{v=1}^d P(X_{E_r(o,v)} ≃ g) = π_µ(g).
It follows that π_µ is the law of X̃_{E_r}, defined as the unlabeled coloring associated to X_{E_r} rooted at the oriented edge (o, 1). Besides, since X is invariant, X_{E_r} is in one-to-one correspondence with the triple (X̃_{E_r}, σ, σ′) where, given X̃_{E_r}, σ and σ′ are independent, σ is a uniform random labeling of X̃_{E_r} restricted to E_r^o, the (d−1)-ary tree rooted at o in E_r \ {o, 1}, and similarly for σ′. From the relative entropy identity (2.5), we find that

H(X_{E_r}) = H(π_µ) + 2K,

where K is the relative entropy of σ given X̃_{E_r^o}. Secondly, we observe that X_{S_r} is in one-to-one correspondence with the vector Y = (X_{E_r^1}, . . . , X_{E_r^d}), where E_r^k is the (d−1)-ary tree rooted at k in E_r \ {o, k}. It follows that H(Y) = H(X_{S_r}). Also, if Ỹ = (X̃_{E_r^1}, . . . , X̃_{E_r^d}), we find from what precedes and the invariance of X that

H(X_{S_r}) = H(Y) = H(Ỹ) + dK.

Finally, the difference between Ỹ and X̃_{S_r} is that the neighbors of o are ordered (or labeled) in Ỹ. We deduce from Lemma 2.4 below that

H(Ỹ) = H(X̃_{S_r}) − Σ_{g ∈ Ẽ_r} E[log(N_X(g)!)] + log(d!).

This concludes the proof of (2.12). □
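Lemma 2.4 below, which relates the entropy of an exchangeable vector to that of its counting measure, can be verified exactly by enumeration on a small example. The sketch below (illustrative only; the alphabet F, the single-coordinate law and the length n are arbitrary choices of the demo) checks the identity H(Z) = H(N_Z) − Σ_x E[log N_Z(x)!] + log n!, which is the form produced by the multinomial computation in the proof, for an i.i.d. (hence exchangeable) vector Z:

```python
import itertools
import math
from collections import Counter

# An i.i.d. vector Z in F^n is exchangeable; enumerate its law exactly.
F = ['a', 'b', 'c']
marg = {'a': 0.5, 'b': 0.3, 'c': 0.2}   # arbitrary single-coordinate law
n = 4

pZ = {}                 # law of the vector Z
pN = Counter()          # law of the counting measure N_Z
E_log_fact = 0.0        # sum over x in F of E[log N_Z(x)!]
for z in itertools.product(F, repeat=n):
    q = math.prod(marg[s] for s in z)
    pZ[z] = q
    counts = Counter(z)
    pN[tuple(sorted(counts.items()))] += q
    E_log_fact += q * sum(math.lgamma(c + 1) for c in counts.values())

def H(p):
    """Shannon entropy (in nats) of a probability dict."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

lhs = H(pZ)
rhs = H(pN) - E_log_fact + math.lgamma(n + 1)   # lgamma(n+1) = log n!
assert abs(lhs - rhs) < 1e-10
```

The assertion holds exactly (up to floating-point error) because every exchangeable law factors through the equivalence classes counted by the multinomial coefficient, as in the proof of Lemma 2.4.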
In the proof of Theorem 2.3, we have used the following elementary lemma. Recall that a vector is exchangeable if its law is invariant under any permutation of its coordinates.
Lemma 2.4.
Let F be a finite set and Z = (Z_1, . . . , Z_n) a random exchangeable vector in F^n. The counting measure N_Z = Σ_{i=1}^n δ_{Z_i} associated to Z satisfies

H(Z) = H(N_Z) − Σ_{x ∈ F} E[log N_Z(x)!] + log n!.

Proof.
We consider the equivalence relation on F^n defined by z ∼ z′ if z and z′ are equal up to a permutation of the coordinates. We have z ∼ z′ if and only if N_z = N_{z′}. Moreover, the number of vectors in the equivalence class of z is given by the multinomial coefficient n! / ∏_{x ∈ F} N_z(x)!. Using the exchangeability of Z, we deduce that

P(Z = z) = (∏_{x ∈ F} N_z(x)! / n!) Σ_{z′ ∼ z} P(Z = z′) = (∏_{x ∈ F} N_z(x)! / n!) P(N_Z = N_z).

It then remains to use the relative entropy formula (2.5). □

Proofs of main results
First moment method: proof of Theorem 1.5.
Let r ≥ 1, ε > 0 and let G_n be uniformly sampled on G_n(d). From Markov's inequality, for any real h,

P(H_{G_n}(µ, r, ε) ≥ h) = P(|F_{G_n}(µ, r, ε)| ≥ e^{nh}) ≤ e^{−nh} E|F_{G_n}(µ, r, ε)|.

In particular, we find

(1/n) log P(H_{G_n}(µ, r, ε) ≥ h) ≤ Σ_n(µ, r, ε) − h.

From Theorem 2.3, we deduce the large deviation bound

limsup_{n→∞} (1/n) log P(H_{G_n}(µ, r, ε) ≥ h) ≤ Σ_r(µ) − h + δ(ε),

where δ(ε) goes to 0 as ε → 0. If h > Σ_r(µ), the right-hand side of the above expression is negative for all ε small enough. We deduce in particular that for all ε small enough, P(H_{G_n}(µ, r, ε) ≥ h) converges to 0. By Lemma 1.2, this proves that h̄(µ, r) < h. Since h > Σ_r(µ) was arbitrary, it concludes the proof of Theorem 1.5. □

Second moment method: proof of Theorem 1.7.
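The key probabilistic tool in the proof below is the Paley-Zygmund inequality: for a non-negative random variable W and θ ∈ (0, 1), P(W ≥ θ E[W]) ≥ (1 − θ)² (E[W])² / E[W²]. It is applied with W = |F_{G_n}(p, ε)| and θ = e^{−1}. A quick numerical illustration on a toy distribution (all values below are made up for the demo and have no connection to the graph model):

```python
import math

# A non-negative random variable W given by (value, probability) pairs.
support = [(0.0, 0.35), (1.0, 0.4), (5.0, 0.2), (20.0, 0.05)]

EW = sum(v * p for v, p in support)        # first moment E[W]
EW2 = sum(v * v * p for v, p in support)   # second moment E[W^2]

theta = 1 / math.e                          # the threshold used in the proof
tail = sum(p for v, p in support if v >= theta * EW)
lower = (1 - theta) ** 2 * EW ** 2 / EW2    # Paley-Zygmund lower bound
assert tail >= lower
```

The bound is useful precisely when the second moment E[W²] is of the same exponential order as (E[W])², which is what condition (3.2) below guarantees.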
Let µ ∈ I_d(M), r ≥ 1 and set p = µ_r ∈ I_{d,r}(M). In view of Theorem 1.5, we should prove that

h(µ, r) ≥ Σ(p).   (3.1)

For ease of notation, we write F_G(p, ε) in place of F_G(µ, r, ε) (since this depends on µ only through µ_r = p). The Paley-Zygmund inequality implies that

P(H_{G_n}(µ, r, ε) ≥ Σ_n(µ, r, ε) − 1/n) = P(|F_{G_n}(p, ε)| ≥ e^{−1} E|F_{G_n}(p, ε)|)
≥ (1 − e^{−1})² (E|F_{G_n}(p, ε)|)² / E|F_{G_n}(p, ε)|² = (1 − e^{−1})² exp(2n Σ_n(µ, r, ε)) / E|F_{G_n}(p, ε)|².

Since µ_r = p, we have Σ(p) = Σ_r(µ) and, by Theorem 2.3,

liminf_{n→∞} Σ_n(µ, r, ε) ≥ Σ(p) − δ(ε),

where δ(ε) goes to 0 as ε → 0. We deduce that if we manage to prove that

limsup_{n→∞} (1/n) log E|F_{G_n}(p, ε)|² ≤ 2Σ(p) + δ′(ε),   (3.2)

where δ′(ε) goes to 0 as ε → 0, then we would get that

liminf_{n→∞} (1/n) log P(H_{G_n}(µ, r, ε) ≥ Σ(p) − δ(ε)) ≥ −2δ(ε) − δ′(ε).

From Equation (2.1) in Theorem 2.1, this would imply that h(µ, r) ≥ Σ(p), as claimed in (3.1). It thus remains to prove Equation (3.2). For concreteness, we may assume that the chosen distance d generating the weak topology is the total variation distance. To that end, let ε > 0 and let N_ε be an ε-net on the set of invariant couplings q ∈ I_{d,r}(M²) of p and p. Given a graph G ∈ G_n(d), consider two colorings of G with color set M whose r-neighborhood statistics are at most at total variation distance ε from p. The number of such pairs is |F_G(p, ε)|². On the other hand, each pair is in fact a coloring of G with color set M². Its r-neighborhood statistics form an element q′ ∈ I_{d,r}(M²). Since both marginals of q′ are at most at total variation distance ε from p, there is a measure q* ∈ I_{d,r}(M²) whose total variation distance from q′ is at most 2ε and whose marginals are both exactly p (for each marginal of q′, there is an invariant coupling of this marginal and p such that the two colorings are equal with probability at least 1 − ε).
Therefore there exists an element of the ε-net, q ∈ N_ε, such that the distance of q from the statistics of the original pair of colorings is at most 3ε. We conclude that

|F_G(p, ε)|² ≤ Σ_{q ∈ N_ε} |F_G(q, 3ε)|.

This implies that

E|F_{G_n}(p, ε)|² ≤ (1/|G_n(d)|) Σ_{G ∈ G_n(d)} Σ_{q ∈ N_ε} |F_G(q, 3ε)| = Σ_{q ∈ N_ε} exp(n Σ_n(µ(q), r, 3ε)).

It follows that

(1/n) log E|F_{G_n}(p, ε)|² ≤ max_{q ∈ N_ε} Σ_n(µ(q), r, 3ε) + (1/n) log |N_ε|.

By Theorem 2.3, we deduce that, for some function δ′(ε) going to 0 as ε → 0,

limsup_{n→∞} (1/n) log E|F_{G_n}(p, ε)|² ≤ max_{q ∈ N_ε} Σ(q) + δ′(ε).

By assumption, for any coupling q ∈ I_{d,r}(M²) of p and p, Σ(q) ≤ 2Σ(p). We have thus proved that (3.2) holds. □

Proof of Theorem 1.11.
In view of Theorem 1.5 and Theorem 1.7, it remains to prove that for any t ≥ r,

h(µ(p), t) ≥ Σ_r(µ(p)) = Σ(p).

Let q ∈ I_{d,t}(M²) be an invariant coupling of (µ(p))_t and (µ(p))_t. Then q_r is an invariant coupling of p and p (since ((µ(p))_t)_r = µ(p)_r = p by construction). By Lemma 1.12, we have

Σ(q) = Σ_t(µ(q)) ≤ Σ_r(µ(q)) = Σ(q_r).

By assumption, Σ(q_r) ≤ 2Σ(p). However, by Lemma 1.12, we have Σ(p) = Σ(µ(p)_t). It follows that Σ(q) ≤ 2Σ(µ(p)_t). From Equation (3.1) applied to the radius t, this implies that h(µ(p), t) ≥ Σ(µ(p)_t). By a last application of Lemma 1.12, the right-hand side of the above expression is equal to Σ(p). This concludes the proof of Theorem 1.11. □

Application to factor graphs: proofs of Lemma 1.16 and Lemma 1.17.
We start with the proof of Lemma 1.16.
Proof of Lemma 1.16.
By construction, we have

Z_G = Σ_{f ∈ M^n} ∏_{v=1}^n e^{ψ((G,f,v)_r)} = Σ_{f ∈ M^n} e^{n⟨distr_G(f)_r, ψ⟩}.

Let ε > 0 and let N_ε be an ε-net of I_{d,r}(M). The function p ↦ ⟨p, ψ⟩ being uniformly continuous (since M is finite), there exists a function δ(ε) → 0 as ε → 0 such that for any probability measure q on rooted colored graphs of radius r, if d(q, p) ≤ ε then |⟨q, ψ⟩ − ⟨p, ψ⟩| ≤ δ(ε). If N_G is the number of colorings f such that distr_G(f)_r is at distance larger than ε from N_ε, it follows that

Z_G ≤ Σ_{p ∈ N_ε} |F_G(µ(p), r, ε)| e^{n⟨p,ψ⟩ + nδ(ε)} + N_G e^{n‖ψ‖_∞}
≤ |N_ε| max_{p ∈ I_{d,r}(M)} e^{n(H_G(µ(p), r, ε) + ⟨p,ψ⟩ + δ(ε))} + N_G e^{n‖ψ‖_∞}.

Now, if G_n is uniformly sampled on G_n(d), then, for any fixed ε > 0, P(N_{G_n} = 0) converges to 1 (since distr_{G_n} converges in probability to a Dirac mass at (T_d, o)). Using (1.5) and taking the limit in n, we find

limsup_{n→∞} E (1/n) ln Z_{G_n} ≤ max_{p ∈ I_{d,r}(M)} (h̄(p) + ⟨p, ψ⟩ + δ′(ε)),

with δ′(ε) → 0 as ε → 0. This gives the upper bound in Lemma 1.16. For the lower bound, note that for any p ∈ I_{d,r}(M), every f ∈ F_G(µ(p), r, ε) satisfies ⟨distr_G(f)_r, ψ⟩ ≥ ⟨p, ψ⟩ − δ(ε), so that

Z_G ≥ |F_G(µ(p), r, ε)| e^{n⟨p,ψ⟩ − nδ(ε)}, and hence Z_G ≥ max_{p ∈ I_{d,r}(M)} e^{n(H_G(µ(p), r, ε) + ⟨p,ψ⟩ − δ(ε))}.

The conclusion follows easily. □
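The upper bound above rests on a standard Laplace-type principle: a sum of finitely many terms of order e^{n a_i} is dominated by its largest term, since (1/n) log Σ_i e^{n a_i} lies between max_i a_i and max_i a_i + (log K)/n for K terms. A self-contained numerical illustration (the exponents below are arbitrary stand-ins, not quantities from the model):

```python
import math

def log_sum_exp_rate(a, n):
    """(1/n) * log( sum_i exp(n * a_i) ), computed stably via the max trick."""
    m = max(a)
    return m + math.log(sum(math.exp(n * (x - m)) for x in a)) / n

exponents = [0.2, 0.55, 0.54, -1.0]   # K = 4 competing exponential rates
K = len(exponents)
gaps = [log_sum_exp_rate(exponents, n) - max(exponents) for n in (10, 100, 1000)]
# each gap lies in [0, log(K)/n], so the sum has the rate of its largest term
```

This is exactly why the |N_ε| prefactor and the N_G error term above do not affect the exponential rate as n → ∞.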
Lemma 1.17 is a corollary of Lemma 1.16.
Proof of Lemma 1.17.
For β > 0, let Z_G(β) be the partition function of the factor graph model:

Z_G(β) = Σ_{f ∈ M^n} ∏_{v=1}^n e^{βψ((G,f,v)_r)}.

By construction, we have |M|^{−n} Z_G(β) ≤ e^{βL_G} ≤ Z_G(β). By Lemma 1.16, we find

sup_{p ∈ I_{d,r}(M)} (h(p)/β + ⟨p, ψ⟩ − ln|M|/β) ≤ liminf_n E L_{G_n}/n ≤ limsup_n E L_{G_n}/n ≤ sup_{p ∈ I_{d,r}(M)} (h̄(p)/β + ⟨p, ψ⟩).

We recall that h̄(p) and h(p) take values in {−∞} ∪ [0, ln|M|]. We get

sup_{p ∈ I_{d,r}(M) : h(p) ≥ 0} (⟨p, ψ⟩ − ln|M|/β) ≤ liminf_n E L_{G_n}/n ≤ limsup_n E L_{G_n}/n ≤ sup_{p ∈ I_{d,r}(M) : h̄(p) ≥ 0} (ln|M|/β + ⟨p, ψ⟩).

We obtain the statement of the lemma by taking the limit β → ∞. □

Extension to processes on unimodular Galton-Watson trees
An extended setting.
We now discuss an extension to processes on random trees. We will focus our attention on unimodular Galton-Watson trees. In this section, we fix a probability measure π on the non-negative integers with positive and finite expectation:

d = Σ_{k=0}^∞ k π(k) ∈ (0, ∞).

We define π̂, the size-biased version of π, as the probability measure defined by: for all integers k ≥ 0,

π̂(k) = (k + 1) π(k + 1) / d.

Then, the unimodular Galton-Watson tree with degree distribution π is the Galton-Watson tree whose vertex set is a subset of N^f defined in (1.1), such that the root o has a number of offspring N_o with distribution π, indexed by 1, . . . , N_o, and all other vertices v have an independent number of offspring N_v with distribution π̂, indexed by (v,1), . . . , (v,N_v). We will denote by T a realization of this random tree and by UGW(π) the law of the rooted tree (T, o). We note that (T, o) is randomly labeled in the sense defined in Subsection 1.1. For example, if π is a Dirac mass at d, then T is the d-regular tree. If π is the Poisson distribution with mean d, then π̂ = π and T is a standard Galton-Watson tree with Poisson offspring distribution. As its name suggests, the random rooted tree T is unimodular. Recall that a random rooted graph (G, o) is unimodular if for all non-negative functions f on the set of doubly rooted graphs (a connected graph with two ordered distinguished vertices) which are invariant by isomorphisms, we have

E Σ_{v ∈ V} f(G, o, v) = E Σ_{v ∈ V} f(G, v, o),   (4.1)

where V is the vertex set of G and the expectation is with respect to the randomness of (G, o). If M is a finite set, an invariant process X on T is defined as a random colored tree (T, X) such that (T, X, o) is unimodular (that is, it satisfies (4.1) with G = (T, X) and f defined on the set of doubly rooted colored graphs which are invariant by isomorphisms).
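The size-biased law π̂(k) = (k+1)π(k+1)/d and the Poisson fixed-point property π̂ = π mentioned above are easy to verify numerically. The sketch below (illustrative only; the truncation level K and the mean lam are arbitrary choices of the demo) computes π̂ from a truncated Poisson distribution and checks that it reproduces the input:

```python
import math

def size_biased(pi):
    """pi_hat(k) = (k + 1) * pi(k + 1) / d, with d the mean of pi."""
    d = sum(k * p for k, p in enumerate(pi))
    return [(k + 1) * pi[k + 1] / d for k in range(len(pi) - 1)]

K, lam = 60, 2.5   # truncation level and Poisson mean (demo values)
poisson = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(K)]
hat = size_biased(poisson)

# For Poisson offspring, size-biasing is a fixed point: pi_hat = pi.
err = max(abs(a - b) for a, b in zip(hat, poisson))
assert err < 1e-12
```

For a Dirac mass at d the same function returns a Dirac mass at d − 1, which is why the unimodular Galton-Watson tree with degree distribution δ_d is the d-regular tree.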
We denote by I_π(M) the set of laws of (T, X) with X an invariant coloring of T with color set M. Now, in order to define a relevant notion of sofic entropy, we need to choose an ensemble of finite graphs G_n such that distr G_n converges to UGW(π). A natural choice is the family of uniform random graphs with a given degree sequence. Let d_n = (d_n(1), . . . , d_n(n)) be a sequence of integers, indexed by a subset of N, whose sum is even and such that

distr d_n = (1/n) Σ_{v=1}^n δ_{d_n(v)}

converges weakly to π. For technical simplicity, we assume that the degree sequence is uniformly bounded: for some real ∆,

sup_n max_{1 ≤ v ≤ n} d_n(v) ≤ ∆.   (4.2)

Note in particular that (4.2) implies that the support of π is contained in {0, . . . , ∆}. From the Erdős-Gallai theorem [19], for all n large enough, the set G_n(d_n) of simple graphs with vertex set [n] = {1, . . . , n} such that each v ∈ [n] has degree d_n(v) is not empty. Under these conditions, if G_n is uniformly distributed on G_n(d_n), then almost surely distr G_n converges to UGW(π), see for example [27]. In the statements below, we will not repeat the above assumptions on the sequence (d_n). For a given probability measure µ ∈ I_π(M) (that is, µ is the law of an invariant coloring (T, X)), we can now reproduce the definitions of weakly and strongly typical processes and define the sofic entropy by taking limits of H_{G_n}(µ, r, ε) defined in (1.3). We do not repeat the definitions since they are identical, except that G_n is now a random graph uniformly distributed on G_n(d_n). For an integer r ≥ 1 and 0 < α < 1, we define the quantities h̄(µ, r, α) and h(µ, r, α) exactly as done below (1.3). Lemma 1.2 continues to hold in this more general setting.

Lemma 4.1.
Let µ ∈ I_π(M) and r ≥ 1. The function α ↦ (h̄(µ, r, α), h(µ, r, α)) is constant on (0, 1).

Proof. The proof of Theorem 2.1 works verbatim under the assumption (4.2). □
We define h̄(µ, r) and h(µ, r) as the common values of h̄(µ, r, α) and h(µ, r, α). The upper and lower sofic entropies h̄(µ) and h(µ) are the limits in r of h̄(µ, r) and h(µ, r). Exactly as in Lemma 1.3, as a corollary of Lemma 4.1, we obtain the following claim.
Let µ ∈ I_π(M). We have h̄(µ) ≥ 0 (resp. h(µ) ≥ 0) if and only if µ is weakly (resp. strongly) typical.

Annealed entropy.
In this broader setting, the annealed entropy is defined as follows. Let (T, o) be a randomly labeled rooted unimodular tree and X an invariant coloring of T with (T, X) having law µ. The degree of a vertex v of T is denoted by deg(v). We assume that d = E deg(o) > 0. As above, if r ≥ 0 is an integer and S is a subset of the vertices of T, B_r(S) is the subset of vertices of T at distance at most r from S. For r ≥ 1, we set S_r = B_r(o) and, if deg(o) ≥ 1, we set E_r = B_{r−1}({o, 1}) (since T is randomly labeled, the neighbors of the root are indexed by 1, . . . , deg(o)). We denote by X_{S_r} the colored tree (T, X) restricted to S_r: by construction, X_{S_r} has law µ_r. We also need to define the law of X restricted to E_r, but this requires a biasing of the tree T. This is done as follows. A (directed) edge-rooted graph is defined as a pair (G, ρ) formed by a connected graph G and a distinguished directed edge ρ = (u, v) (that is, {u, v} is an edge of the graph). Now, we denote by ~µ the law on colored edge-rooted trees defined by

~µ(·) = (1/d) E[deg(o) 1{(T, X, (o, 1)) ∈ ·}],   (4.3)

where (T, X) has law µ and d = E deg(o). Note that under the probability measure ~µ, o has degree at least 1 and thus {o, 1} is an edge of the tree. We denote by (~T, ~X, ρ), with ρ = (o, 1), a random variable with law ~µ. It is easy to check that Equation (4.1) implies that ~µ is invariant under switching the two sides of the oriented edge. Moreover, if T has law UGW(π), then ~T is given by two independent Galton-Watson trees with offspring distribution π̂ whose roots are connected by the root edge, see [1, Example 1.1]. In particular, if π is a Dirac mass at d, then T = ~T. We denote by ~X_{E_r} the colored tree (~T, ~X) restricted to E_r. The law of ~X_{E_r} is ~µ_r, the restriction of ~µ to E_r. We observe that ~µ_r depends on µ only through its marginal µ_r.
Now, if X is an invariant coloring of T with law µ ∈ I_π(M) and r ≥ 1 is an integer, we set

Σ_r(X) = Σ_r(µ) = H(X_{S_r}) − (d/2) H(~X_{E_r}) − H(π).   (4.4)

See Remark 4.5 for an alternative expression which is arguably more natural. Thanks to assumption (4.2), it is immediate that the above entropies are finite as soon as M is finite. Note also that Σ_r(µ) depends on µ only through µ_r. We will check in Lemma 4.10 below that Σ_r(µ) is non-increasing in r. We may thus define

Σ(µ) = lim_{r→∞} Σ_r(µ).

The quantities Σ_r(µ) and Σ(µ) are the annealed entropies of µ_r and µ. The following theorem generalizes Theorem 1.5.

Theorem 4.3.
For any µ ∈ I_π(M) and integer r ≥ 1, we have h̄(µ, r) ≤ Σ_r(µ) and h̄(µ) ≤ Σ(µ). There is also an analog of Theorem 1.7.
Theorem 4.4.
Let µ ∈ I_π(M). For any integer r ≥ 1, if all invariant couplings ν of µ and µ satisfy Σ_r(ν) ≤ 2Σ_r(µ), then h(µ, r) = h̄(µ, r) = Σ_r(µ). In particular, if the above condition holds for an increasing sequence of integers (r_k)_{k ≥ 1}, then h(µ) = h̄(µ) = Σ(µ).

Remark 4.5. The annealed entropy is also given by the formula:

Σ_r(X) = H(X_{S_r} | T_{S_r}) − (d/2) H(~X_{E_r} | ~T_{E_r}),

where H(X | Y) = H(X, Y) − H(Y) is the relative entropy. Indeed, ~T is the union of two independent copies of T′, a Galton-Watson tree with offspring distribution π̂, while T is the union of N independent copies of T′ attached to the root, with N independent with distribution π. It follows that H(~T_{E_r}) = 2 H(T′_{S_{r−1}}) and H(T_{S_r}) = H(N) + d H(T′_{S_{r−1}}) (from (2.5)). In particular, H(T_{S_r}) − (d/2) H(~T_{E_r}) = H(N) = H(π).

4.3. Markov processes.
There are extensions of Theorem 1.11 and Theorem 1.15 in our extended setting. The previous definitions of Markov processes carry over when conditioned on the random tree. More precisely, we use the following definitions.
Definition 4.6 (Markov process). Let (T, X) be a random coloring of a tree T with colors in a finite set M and law µ. For an integer r ≥ 1, X (or µ) is r-Markov if, conditioned on T and on the values on B_{r−1}(e), the (r−1)-neighborhood of an edge e, the processes on the left and right subtrees of e are independent. Similarly, X (or µ) is vertex-Markov if, conditioned on T and on the value at a vertex, the processes on the pending subtrees of that vertex are independent.

For an integer r ≥ 1, let I_{π,r}(M) denote the set of laws µ′ of colorings (T′, X′) on M which are randomly labeled, such that T′ has law UGW(π)_r (the law of the restriction of T to S_r) and such that ~µ′, defined as above, is invariant under switching the two sides of the oriented edge. If µ ∈ I_π(M), then µ_r ∈ I_{π,r}(M). Conversely, we have the following (see [12, Proposition 1.1]):

Lemma 4.7.
Let r ≥ 1 be an integer and p ∈ I_{π,r}(M). Then there is a unique r-Markov process µ(p) ∈ I_π(M) such that the marginal of µ(p) on S_r is equal to p.

If p ∈ I_{π,r}(M), we define Σ(p) = Σ_r(µ(p)) as in Equation (4.4). The following theorem is an extension of Theorem 1.11.

Theorem 4.8.
Let r ≥ 1 be an integer and p ∈ I_{π,r}(M). If for all couplings q ∈ I_{π,r}(M²) of p and p we have Σ(q) ≤ 2Σ(p), then h(µ(p)) = h̄(µ(p)) = Σ(p) and µ(p) is strongly typical.

There is a version of this theorem for vertex-Markov processes. As above, let I_e(M) denote the set of probability measures on M^{E_1} that are invariant under switching the two sides of the edge {o, 1}. If µ ∈ I_π(M), then the restriction of ~µ to E_1 is in I_e(M). Conversely, if p ∈ I_e(M), there exists a unique vertex-Markov process µ(p) in I_π(M) such that ~µ(p) restricted to E_1 is equal to p. If (T, X) ∈ I_π(M), we define

Σ_e(X) = (d/2) H(~X_{E_1}) − d H(~X_o) + H(X_o) − H(π).

If p ∈ I_e(M), we set Σ(p) = Σ_e(µ(p)). The following theorem is an extension of Theorem 1.15.

Theorem 4.9.
Let p ∈ I_e(M). If for all couplings q ∈ I_e(M²) of p and p we have Σ(q) ≤ 2Σ(p), then h(µ(p)) = h̄(µ(p)) = Σ(p) and µ(p) is strongly typical.

In the remainder of the paper, we explain the proofs of Theorem 4.3, Theorem 4.4, Theorem 4.8 and Theorem 4.9. The proofs are entirely similar to the proofs of the corresponding results for invariant processes on T_d. We will only sketch the proofs and explain the differences.

4.4. Maximizers of the annealed entropy.
The following lemma is the exact analog of Lemma 1.12 and Lemma 1.14.
Lemma 4.10.
Let X ∈ I_π(M) and r ≥ 1. We have Σ_{r+1}(X) ≤ Σ_r(X), with equality if and only if X_{S_{r+1}} is an r-Markov process on S_{r+1}. Moreover, Σ_1(X) ≤ Σ_e(X), with equality if and only if X_{S_1} is a vertex-Markov process on S_1.

Proof. We start with the first statement. Arguing as in the proof of Lemma 1.12, it is enough to check the inequality for r = 1. The inequality Σ_2(X) ≤ Σ_1(X) is equivalent to

H(X_{S_2}) − H(X_{S_1}) − (d/2) H(~X_{E_2}) + (d/2) H(~X_{E_1}) ≤ 0.   (4.5)

Let deg_T(o) and deg_~T(o) be the degrees of the root in T and ~T. For an integer k ≥ 1, from (4.3), we have, for any event A,

P((~X, ~T) ∈ A, deg_~T(o) = k) = (k/d) P((X, T) ∈ A, deg_T(o) = k).

It follows that P(deg_~T(o) = k) = kπ(k)/d and, if k ≥ 1 is in the support of π,

P((~X, ~T) ∈ A | deg_~T(o) = k) = P((X, T) ∈ A | deg_T(o) = k).   (4.6)

In other words, (X, T) and (~X, ~T) have the same law when conditioned on the root degree being k ≥ 1. For a set S of vertices, let us denote by H_k(S) the entropy of the variable (T, X) restricted to S and conditioned on the event deg_T(o) = k. Similarly, H_k(S | S′) = H_k(S, S′) − H_k(S′) is the associated relative entropy. From (2.5), we may write, for t = 1, 2,

H(X_{S_t}) = H(deg_T(o)) + π(0) H(X_o) + Σ_{k=1}^∞ π(k) H_k(S_t)

and

H(~X_{E_t}) = H(deg_~T(o)) + Σ_{k=1}^∞ (kπ(k)/d) H_k(E_t).

Hence, (4.5) is equivalent to the claim:

Σ_{k=1}^∞ π(k) (H_k(S_2) − H_k(S_1) − (k/2) H_k(E_2) + (k/2) H_k(E_1)) ≤ 0.   (4.7)

On the event deg_T(o) = k, for 1 ≤ i ≤ k, let L_i = {(i,1), . . . , (i, N_i)} be the offspring of vertex i, and let L = {2, . . . , k} be the neighbors of o distinct from 1. Note that the random variables X_{L_i}, conditioned on the event deg_T(o) = k, are exchangeable.
Then, the computation from (2.6) to (2.8) gives

H_k(S_2) − H_k(S_1) − (k/2) H_k(E_2) + (k/2) H_k(E_1) ≤ (k/2) H_k(L_1 | S_1) − (k/2) H_k(L | E_1).

The left-hand side of (4.7) is thus upper bounded by

Σ_{k=1}^∞ (k/2) π(k) (H_k(L_1 | S_1) − H_k(L | E_1)) = (d/2) (H(~X_{L_1} | ~X_{S_1}) − H(~X_L | ~X_{E_1})).

We now use the invariance of ~µ under switching the two sides of the edge E_1 = {o, 1}. We get H(~X_L | ~X_{E_1}) = H(~X_{L_1} | ~X_{E_1}). Finally, since E_1 ⊂ S_1, we deduce that the inequalities (4.5)-(4.7) hold. As in the proof of Lemma 1.12, the case of equality is a direct consequence of the case of equality in (2.5).

We now prove the second statement of Lemma 4.10, Σ_1(X) ≤ Σ_e(X). It is equivalent to prove that

H(X_{S_1}) − (d/2) H(~X_{E_1}) ≤ (d/2) H(~X_{E_1}) − d H(~X_o) + H(X_o).

Arguing as above, we find that this is equivalent to

Σ_{k=1}^∞ π(k) (H_k(S_1) − k H_k(E_1) + (k − 1) H_k(o)) ≤ 0.

As in the proof of Lemma 1.14, it remains to use that H_k(S_1) ≤ H_k(o) + k H_k(E_1 | o) and H_k(E_1 | o) = H_k(E_1) − H_k(o). In the case of equality, this implies the conditional independence of (X_1, . . . , X_k) given X_o on the event that the root degree equals k. □

Combinatorial characterization of the annealed entropy.
In our extended setting, the combinatorial interpretation of the annealed entropy Σ_r(µ) explained in Subsection 2.3 continues to hold. The definition of Σ_n(µ, r, ε) in (2.9) remains unchanged. The following theorem is an extension of Theorem 2.3.

Theorem 4.11.
Let µ ∈ I_π(M) and r ≥ 1 an integer. We have

lim_{ε→0} liminf_{n→∞} Σ_n(µ, r, ε) = lim_{ε→0} limsup_{n→∞} Σ_n(µ, r, ε) = Σ_r(µ).

Proof.
Let s(d) = d/2 − (d/
2) log d. From [10, Theorem 2.16], we have

lim_{n→∞} ((1/n) log |G_n(d_n)| − (d/2) log n) = −s(d) + H(deg(o)) − E[log(deg(o)!)],

where deg(o) has law π. Also, from [12, 16], we have

lim_{ε→0} liminf_{n→∞} ((1/n) log |G_n(µ, r, ε)| − (d/2) log n) = lim_{ε→0} limsup_{n→∞} ((1/n) log |G_n(µ, r, ε)| − (d/2) log n) = J_r(µ),

where J_r(µ) = −s(d) + H(X̃_{S_r}) − (d/2) H(π_µ) − Σ_{g ∈ Ẽ_r} E[log(N_X(g)!)] is defined exactly as in Theorem 2.3, the only difference being that (2.11) is replaced by

Σ_{g ∈ Ẽ_r} N_X(g) = deg(o).

Arguing as in the proof of Theorem 2.3, we have H(π_µ) = H(X̃_{E_r}), where X̃_{E_r} is the unlabeled coloring associated to ~X_{E_r}. Hence, the theorem follows by checking that

H(X̃_{S_r}) − (d/2) H(X̃_{E_r}) = H(X_{S_r}) − (d/2) H(~X_{E_r}) + Σ_{g ∈ Ẽ_r} E[log(N_X(g)!)] − E[log(deg(o)!)].   (4.8)

In order to prove that (4.8) holds, we decompose the left-hand side over the possible values of the root degree. We write

H(X̃_{S_r}) = H(π) + π(0) H(X_o) + Σ_{k=1}^∞ π(k) H_k(X̃_{S_r}),

where H_k is the entropy conditioned on deg_T(o) = k. Similarly,

(d/2) H(X̃_{E_r}) = (d/2) H(π̂) + Σ_{k=1}^∞ (k/2) π(k) H_k(X̃_{E_r}).

Let E_k[·] be the expectation conditioned on deg_T(o) = k. From (4.6), it follows that the identity (4.8) is equivalent to

Σ_{k=1}^∞ π(k) (H_k(X̃_{S_r}) − (k/2) H_k(X̃_{E_r})) = Σ_{k=1}^∞ π(k) (H_k(X_{S_r}) − (k/2) H_k(X_{E_r}) + Σ_{g ∈ Ẽ_r} E_k[log(N_X(g)!)] − log(k!)).   (4.9)

We denote by E_r^o the tree rooted at o in E_r \ {o, 1} and by E_r^1 the tree rooted at 1. Arguing as in the proof of Theorem 2.3, we find

H_k(X_{E_r}) = H_k(X̃_{E_r}) + K_k + K′_k,

where K_k is the relative entropy of a random labeling σ of X̃_{E_r^o} conditioned on deg(o) = k, and K′_k is the relative entropy of a random labeling σ′ of X̃_{E_r^1} conditioned on deg(o) = k.
Similarly, arguing as in the proof of Theorem 2.3, we get

H_k(X_{S_r}) = H_k(X̃_{S_r}) + k K′_k − Σ_{g ∈ Ẽ_r} E_k[log(N_X(g)!)] + log(k!).

We deduce that the right-hand side of (4.9) is equal to

Σ_{k=1}^∞ π(k) (H_k(X̃_{S_r}) − (k/2) H_k(X̃_{E_r})) + Σ_{k=1}^∞ π(k) ((k/2) K′_k − (k/2) K_k).

Finally, we observe that

Σ_{k=1}^∞ π(k) ((k/2) K′_k − (k/2) K_k) = (d/2) (H(σ′) − H(σ)).

The above expression is equal to 0 because the law ~µ is invariant under switching the two sides of the edge {o, 1}. This concludes the proof of Equation (4.8). □

Proofs of Theorem 4.3, Theorem 4.4, Theorem 4.8 and Theorem 4.9.
As already pointed out, the conclusion of Theorem 2.1 holds in our extended setting. We may thus repeat verbatim the proofs in Section 3, invoking Theorem 4.11 in place of Theorem 2.3 and Lemma 4.10 in place of Lemma 1.12. □
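The enumeration asymptotics underlying these proofs can be sanity-checked numerically in the regular case. The sketch below (illustrative only) uses the standard Bender-Canfield estimate |G_n(d)| ≈ ((dn)! / ((dn/2)! 2^{dn/2} (d!)^n)) e^{(1−d²)/4} and checks that (1/n) log |G_n(d)| − (d/2) log n approaches −s(d) − log(d!), as in Equation (2.10); the choice d = 3 and the values of n are arbitrary:

```python
import math

def log_count_regular(n, d):
    # log of the Bender-Canfield estimate for the number of simple
    # d-regular graphs on n vertices (dn assumed even).
    m = d * n
    return (math.lgamma(m + 1) - math.lgamma(m / 2 + 1)
            - (m / 2) * math.log(2.0) - n * math.log(math.factorial(d))
            + (1.0 - d * d) / 4.0)

d = 3
s = d / 2 - (d / 2) * math.log(d)                 # s(d) = d/2 - (d/2) log d
target = -s - math.log(math.factorial(d))         # limit in (2.10)
vals = {n: log_count_regular(n, d) / n - (d / 2) * math.log(n)
        for n in (10**3, 10**5)}
# vals[n] -> target as n grows, with an O(1/n) correction
```

The gap shrinks by roughly the ratio of the two values of n, consistent with the o(1) term in (2.10) being of order 1/n for this estimate.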
References

[1] D. Aldous and R. Lyons. Processes on unimodular random networks. Electronic Journal of Probability, 12:1454–1508, 2007.
[2] A. Backhausz and B. Szegedy. On large-girth regular graphs and random processes on trees. Random Structures Algorithms, 53(3):389–416, 2018.
[3] A. Backhausz and B. Szegedy. On the almost eigenvectors of random regular graphs. Ann. Probab., 47(3):1677–1725, 2019.
[4] A. Backhausz, B. Szegedy, and B. Virág. Ramanujan graphings and correlation decay in local algorithms. Random Structures Algorithms, 47(3):424–435, 2015.
[5] A. Backhausz and B. Virág. Spectral measures of factor of i.i.d. processes on vertex-transitive graphs. Ann. Inst. Henri Poincaré Probab. Stat., 53(4):2260–2278, 2017.
[6] M. Bayati, D. Gamarnik, and P. Tetali. Combinatorial approach to the interpolation method and scaling limits in sparse random graphs. Ann. Probab., 41(6):4080–4115, 2013.
[7] E. A. Bender and E. Canfield. The asymptotic number of labeled graphs with given degree sequences. Journal of Combinatorial Theory, Series A, 24(3):296–307, 1978.
[8] I. Benjamini and O. Schramm. Recurrence of distributional limits of finite planar graphs. Electron. J. Probab., 6:no. 23, 13 pp., 2001.
[9] B. Bollobás. The independence ratio of regular graphs. Proc. Amer. Math. Soc., 83(2):433–436, 1981.
[10] B. Bollobás. Random graphs, volume 73 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2001.
[11] C. Bordenave. Normalité asymptotique des vecteurs propres d'un graphe régulier aléatoire (d'après Backhausz et Szegedy). Séminaire Bourbaki, 71e année, 2018–2019, No. 1151, 2018.
[12] C. Bordenave and P. Caputo. Large deviations of empirical neighborhood distribution in sparse random graphs. Probab. Theory Related Fields, 163(1-2):149–222, 2015.
[13] L. P. Bowen. A brief introduction of sofic entropy theory. In Proceedings of the International Congress of Mathematicians—Rio de Janeiro 2018. Vol. III. Invited lectures, pages 1847–1866. World Sci. Publ., Hackensack, NJ, 2018.
[14] A. Coja-Oghlan and W. Perkins. Spin systems on Bethe lattices. Comm. Math. Phys., 372(2):441–523, 2019.
[15] E. Csóka, V. Harangi, and B. Virág. Entropy and expansion. Ann. Inst. Henri Poincaré Probab. Stat., 56(4):2428–2444, 2020.
[16] P. Delgosha and V. Anantharam. A notion of entropy for stochastic processes on marked rooted graphs. arXiv:1908.00964, 2019.
[17] J. Ding, A. Sly, and N. Sun. Maximum independent sets on random regular graphs. Acta Math., 217(2):263–340, 2016.
[18] W. Duckworth and N. C. Wormald. On the independent domination number of random regular graphs. Combin. Probab. Comput., 15(4):513–522, 2006.
[19] P. Erdős and T. Gallai. Graphs with prescribed degrees of vertices (Hungarian). Mat. Lapok, 11:264–274, 1960.
[20] J. Friedman. A proof of Alon's second eigenvalue conjecture and related problems. Mem. Amer. Math. Soc., 195(910):viii+100, 2008.
[21] D. Gamarnik. Right-convergence of sparse random graphs. Probab. Theory Related Fields, 160(1-2):253–278, 2014.
[22] H. Hatami, L. Lovász, and B. Szegedy. Limits of locally-globally convergent graph sequences. Geom. Funct. Anal., 24(1):269–296, 2014.
[23] L. Lovász. Large networks and graph limits, volume 60 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2012.
[24] M. Mezard and A. Montanari. Information, Physics, and Computation. Oxford University Press, Inc., USA, 2009.
[25] D. Puder. Expansion of random graphs: new proofs, new results. Invent. Math., 201(3):845–908, 2015.
[26] J. Salez. The interpolation method for random graphs with prescribed degrees. Combin. Probab. Comput., 25(3):436–447, 2016.
[27] R. van der Hofstad. Random Graphs and Complex Networks, Volume 2. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, to appear.
[28] N. C. Wormald. Differential equations for random processes and random graphs.
Ann. Appl. Probab. , 5(4):1217–1235, 1995.1[29] N. C. Wormald. Models of random regular graphs. In
Surveys in combinatorics, 1999 (Canterbury) , volume 267 of
LondonMath. Soc. Lecture Note Ser. , pages 239–298. Cambridge Univ. Press, Cambridge, 1999. 1, 8, pages 239–298. Cambridge Univ. Press, Cambridge, 1999. 1, 8