Biased measures for random Constraint Satisfaction Problems: larger interaction range and asymptotic expansion
aa r X i v : . [ c ond - m a t . d i s - nn ] S e p Biased measures for random Constraint Satisfaction Problems:larger interaction range and asymptotic expansion
Louise Budzynski and Guilhem Semerjian Laboratoire de physique de l’Ecole normale sup´erieure, ENS, Universit´e PSL,CNRS, Sorbonne Universit´e, Universit´e de Paris, F-75005 Paris, France
We investigate the clustering transition undergone by an exemplary random constraint satisfactionproblem, the bicoloring of k -uniform random hypergraphs, when its solutions are weighted non-uniformly, with a soft interaction between variables belonging to distinct hyperedges. We showthat the threshold α d ( k ) for the transition can be further increased with respect to a restrictedinteraction within the hyperedges, and perform an asymptotic expansion of α d ( k ) in the large k limit. We find that α d ( k ) = k − k (ln k + ln ln k + γ d + o (1)), where the constant γ d is strictly largerthan for the uniform measure over solutions. Contents
I. Introduction II. Definitions
III. The replica symmetric cavity formalism
IV. The dynamic transition
V. Finite k numerical results VI. Large k asymptotics k limit for finite n n limit 221. For γ > γ r ( b ) 222. A reweighting scheme 223. A Gaussian approximation for the quasi-hard fields 274. Algorithmic implementation 28D. Results 29 VII. Conclusions Acknowledgments A. Existence and uniqueness of the RS solution B. An inequality References I. INTRODUCTION
In a Constraint Satisfaction Problem (CSP) N discrete valued variables are subject to M constraints. Each of theconstraints enforces some requirement on a subset of the variables, a solution of the CSP is thus an assignement of thevariables that satisfies simultaneously all the constraints. Famous examples of CSPs are the k -satisfiability problemand the graph q -coloring one; we will focus in this paper on another related problem, the bicoloring of k -hypergraphs.In this problem the variables are boolean and lie on the vertices of an hypergraph, with hyperedges linking subsetsof k vertices (instead of two for a graph). The constraint associated to each hyperedge is that both values (or colors)are present among its k adjacent vertices, forbidding locally monochromatic configurations.CSPs have been studied from various perspectives. Computational complexity theory [1, 2] classifies them accordingto their worst-case difficulty, characterized by the existence or not of an efficient (running in a time polynomial inthe size N ) algorithm able to solve (i.e. decide whether they admit a solution) all their possible instances. Anotherapproach, in which this work takes place, focuses on the typical difficulty of CSPs, where typical is defined with respectto a random ensemble of instances (see e.g. [3–11]). The most commomly studied random ensemble is obtained bydrawing the M constraints uniformly at random, i.e. by constructing a G ( N, M ) Erd˝os-R´enyi random k -hypergraph.In this paper we will focus on a slightly different ensemble, the ( k, l + 1)-regular one, where the probability is uniformon the set of hypergraphs for which each vertex belongs to l + 1 hyperedges. A striking property of random CSPsis the occurence of phase transitions, or threshold phenomena, in the large size (thermodynamic) limit N, M → ∞ with a fixed ratio α = M/N (in the regular ensemble α = ( l + 1) /k ), the density of constraints per variable. Whenthe control parameter α is varied one observes several phase transitions, at which the probability of some propertiesjumps abruptly from 1 to 0 in the thermodynamic limit. In particular the satisfiability threshold α sat separates anunderconstrained regime α < α sat where typical instances are satisfiable, from an overconstrained regime α > α sat where they typically do not admit any solution. The existence of such a transition has been proven (in a slightlyweaker sense) in [12], as well as lower [13, 14] and upperbounds [15] on the threshold α sat , that become tighter andtighter as k grows [16, 17]. Statistical mechanics methods, adapted from the study of spin-glasses [18, 19], providedpredictions for the value of α sat ( k ) for several random CSPs [5, 6, 20], the correctness of this method was later provenrigorously for large but finite k [11, 21]. There appears to be a large universality among the various random CSPsthat have been studied, in particular in the large k limit; for the sake of readability the quantitative statements beloware given with the scale of α corresponding to the bicoloring problem, even when quoting papers that derived thisresult for another CSP.Several other phase transitions occur in the satisfiable phase α < α sat [7]. In this paper we focus on the clusteringphase transition α d , which is also known as the dynamic or reconstruction transition. This transition can be definedin several ways. Below α d the set of solutions of most instances is rather well-connected, any typical solution canbe reached from another one through a path constituted of nearby solutions. Above α d the solution set splitsinto an exponential number of isolated subsets of solutions, called clusters, which are internally well-connected butseparated one from each other by regions without solutions: this is called the clustering phenomenon. This transitionalso manifests itself with the appearance of a specific form of long range correlations between variables, known aspoint-to-set correlations, in the probability law defined as the uniform measure over the set of solutions. Thesecorrelations imply the solvability of an information-theoretic problem called tree reconstruction [22, 23], and forbidthe rapid equilibration of the stochastic processes that respect the detailed balance condition [24], which justifies thealternative “dynamic” name of the clustering transition. In the cavity method [25] treatment of the random CSPs α d can also be defined as the appearance of a non trivial solution of the one step of Replica Symmetry Breaking (1RSB)equation with Parisi breaking parameter m = 1, see in particular [23] for the connection between this formalismand the reconstruction problem. In the large k limit the dynamic transition happens at a much smaller constraintdensity than the satisfiablity one, the asymptotic expansion of these two thresholds being α d ( k ) ∼ k − ln k/k and α sat ( k ) ∼ k − ln 2.An important open problem in the field of random CSPs concerns the behavior of algorithms in the satisfiableregime, where the goal is to find a solution, as typical instances admit such configurations. In particular one wouldlike to determine the algorithmic threshold α alg ( k ) above which no algorithm is able to find a solution in polynomialtime with high probability (assuming P =NP). For small values of k it is possible to design algorithms (see [5, 26–29])that are found through numerical simulations to be efficient at densities very close to the satisfiability threshold.Unfortunately these algorithms cannot be studied numerically in the large k limit and one has to resort to analyticalstudies in this case, which can only be performed on relatively simple heuristics. The best result in this direction isthe one of [30], which provides an algorithm that provably works in polynomial time up to constraint densities of theorder of 2 k − ln k/k , i.e. the same asymptotic scaling as α d ( k ). This leaves a wide range of α where typical instanceshave a non-empty set of solutions, but no known algorithm is able to find them efficiently (and where some familiesof algorithms have been proven to fail [31–33]).One could hope that this algorithmic question, and in particular the value of α alg ( k ), could be enlightened by theaccumulated knowledge on the several structural phase transitions that occur in the satisfiable phase. The connectionbetween these two aspects is however very delicate because algorithms are intrisically out-of-equilibrium processes,either because their mere definition violates the detailed balance condition, or because they fall out of equilibriumduring their evolution on a time scale that is shorter than their relaxation time. In both cases there are no fundamentalprinciples to connect their dynamics with the static properties of the solution set. Even if one cannot understandprecisely α alg ( k ) in terms of a structural phase transition one can reasonably state that the dynamic transition is alower bound to the algorithmic one, α d ( k ) ≤ α alg ( k ). Indeed for α ≤ α d simulated annealing [34] should be ableto reach thermalization in polynomial time down to arbitrarily small temperatures, and hence sample uniformly thesolution set. For α slightly larger than α d one expects simulated annealing to fall out-of-equilibrium on polynomialtimescales but in many cases it should still be able to find (non-uniformly) solutions, hence the bound α d ( k ) ≤ α alg ( k )is not tight in general.The study of the structural phase transitions in the satisfiable regime, and in particular the definition of α d in termsof long-range correlations, relies on the characterization of a specific probability law on the space of configurations,namely the uniform measure over solutions. This paper belongs to a series of works studying a probability measureover the set of solutions that is biased instead of uniform, i.e. that weights differently the various solutions ofthe CSP instance. This idea has been used in several articles, see in particular [35–40], with slightly differentperspectives and results (for instance in [36, 37] solutions are weighted according to the number of other solutions intheir neighborhood, in [35] according to their number of frozen variables taking the same value in the whole cluster,while in [38] the solutions are non-overlapping positions of hard spheres, biased through an additional pairwise softinteraction between them). In [39] we have studied a simple implementation of the bias in the measure on the setof solutions of an hypergraph bicoloring instance, where the interactions induced by the bias can be factorized overthe bicoloring constraints, and studied systematically the modifications of the clustering threshold α d induced bythe non-uniformity between solutions. We showed, for k between 4 and 6, that with well-chosen parameters such abias allows to increase α d , and to improve the performances of simulated annealing, in agreement with the discussionabove. However in [39] we left essentially open the question of the increase of α d this bias could achieve in the large k limit, and hence whether such a strategy could reduce the algorithmic gap in this limit.As a matter of fact the large k behavior of α d is a rather involved asymptotic expansion, even for the uniformmeasure, and until recently only relatively loose bounds on the asymptotic behavior of α d were known [41–43]. Weconsidered this specific problem in [44] and found that the clustering threshold occurs on the scale α ∼ k − (ln k +ln ln k + γ ) /k with γ constant, and more precisely that for the uniform measure γ d , u ≈ . k it allows a further increase of the dynamic threshold α d . Moreover we adaptthe large k expansion of [44] to this biased measure and manage to assess the asymptotic effect of the bias on α d ; wefind that the factorized bias of [39] cannot improve on the constant γ d in the asymptotic expansion with respect to γ d , u , while the bias with larger interaction range we introduce here allows to increase its value up to γ d ≈ . α d , nevertheless it opensthe possibility to study further generalizations of the bias and to bring some light on the nature of the algorithmicgap between α alg and α sat .The rest of the paper is organized as follows. In Section II we define more precisely the bicoloring problem and thebiased measure we introduce on its set of solutions. Its treatment with the simplest version of the cavity method ispresented in Section III, while Sec. IV refines this description to incorporate the clustering phenomenon and presentsthe equations that allow to compute the dynamic threshold. In section V we display some numerical results forfinite values of k and compare them to the simpler biasing strategy of [39]. The analytical expansion of the dynamictransition threshold in the large k limit is the subject of section VI, followed by some conclusions and perspectivesfor future work in Sec. VII. More technical details of the computations are deferred to the Appendices A and B. II. DEFINITIONSA. Biased measures with interactions at distance 1
An instance of the bicoloring problem is specified by a k -uniform hypergraph G = ( V, E ), where V = { , . . . , N } is a set of N vertices and E a set of M hyperedges, each hyperedge a ∈ E linking a subset denoted ∂a of k vertices(we shall denote similarly ∂i the set of hyperedges in which a vertex i appears); a graphical representation for asmall example can be found in the left panel of Fig. 1. Binary variables, encoded as Ising spins σ i ∈ {− , +1 } , areplaced on the vertices of G , their global configuration being denoted σ = ( σ , . . . , σ N ) ∈ {− , +1 } N ; we shall write σ S = { σ i : i ∈ S } for the configuration of the variables in an arbitrary subset S ⊂ V of the vertices. A solutionof the bicoloring problem (also called proper bicoloring of G ) is, by definition, an assignment σ of the variables suchthat no hyperedge a ∈ E is monochromatic, in other words such that for each a ∈ E there is at least one neighboringvertex i ∈ ∂a with σ i = +1, and at least one i ∈ ∂a with σ i = −
1. Assuming that G admits proper bicolorings (i.e.that the bicoloring problem on G is satisfiable), we introduce the uniform measure over the set of solutions as ρ u ( σ ) = 1 Z Y a ∈ E ω ( σ ∂a ) , (1)where the subscript u stands for “uniform”, the normalization constant (partition function) Z counts the number ofsolutions, and ω ( σ . . . σ k ) is the indicator function of the event “ σ . . . σ k are not all equal”: ω ( σ , . . . σ k ) = ( σ = · · · = σ k , . (2)Our goal now is to define a measure ρ which, as ρ u , has for support the set of solutions ( ρ ( σ ) > σ is a proper bicoloring), but that can give different weights to different solutions. There are obviously an infinitenumber of ρ that fulfills this condition, we shall progressively narrow down these possibilities to arrive at the formof ρ studied in the rest of the paper. For simplicity we restrict ourselves from now on to regular hypergraphs, whereevery vertex has the same degree | ∂i | = l + 1.We will first impose a locality requirement for ρ : for this biased measure to be tractable in Monte Carlo simulationsor in analytical computations the additional interactions between variables induced by the non-uniformity shouldbe local, with respect to the notion of distance induced by the hypergraph G . Considering the shortest non-trivialinteraction range that allows the coupling of variables from different hyperedges yields the form ρ ( σ ) = 1 Z Y a ∈ E ω ( σ ∂a ) N Y i =1 ϕ (cid:16) σ i , { σ ∂a \ i } a ∈ ∂i (cid:17) , (3)where the biasing function ϕ > i -th variable with its ( l + 1)( k −
1) neighbors at distance 1; as G isregular and uniform we use the same function ϕ on all the vertices.There is still a vast freedom in the choice of ϕ ; we further restrict it by imposing its invariance under the spin-flipsymmetry σ → − σ (that is fulfilled by the set of solutions), and under the permutations of the l + 1 hyperedgesaround i , and of the k − ϕ (cid:16) σ i , { σ ∂a \ i } a ∈ ∂i (cid:17) = b ϕ ( { m a → i } a ∈ ∂i ) with m a → i = X j ∈ ∂a \ i σ i σ j , (4)and b ϕ invariant under the permutations of its l + 1 arguments. m a → i counts the number of variables in ∂a \ i that are of the same color as σ i ; we shall finally discard part of theinformation contained in m a → i , and only distinguish between the cases m a → i = 0 and m a → i >
0. This is indeed arelevant information about the solution σ : in the former case σ i is the only variable of its color in the a -th hyperedge,hence σ i cannot be flipped without violating the a -th monochromatic constraint, one says that a forces i in such asituation. On the contrary when m a → i > σ i is not forced to its value by the a -th hyperedge. Withthis simplification, and because of the invariance by permutation of the arguments of b ϕ , the weight on the variable i becomes a function of the number of constraints forcing it. As σ is a solution the event m a → i = 0 is equivalent to“the variables in σ ∂a \ i are all equal (a.e.)”, the biased measure becomes thus ρ ( σ ) = 1 Z Y a ∈ E ω ( σ ∂a ) N Y i =1 ψ X a ∈ ∂i I [ σ ∂a \ i a.e.] ! , (5)where here and in the following I [ A ] is the indicator function of the event A , and ψ ( p ) > p ∈ { , . . . , l + 1 } forcing hyperedges.The following study will be devoted to the properties of the measure (5); despite the several restricting hypotheseswe have made to reach this form of ρ there are still l + 1 free parameters in ψ (its argument can take l + 2 values, buta global multiplicative constant gets absorbed in the normalization Z ). Part of our computations will be made for anarbitrary ψ but in some places, and in particular in the large k analysis, we will use the specific form: ψ (0) = 1 , ψ ( p ) = b (1 − ǫ ) p for 1 ≤ p ≤ l + 1 , (6) PSfrag replacements aa ii b η a → i η i → a FIG. 1: Left panel: an example of an hypergraph G with N = 7 vertices represented by black circles, and M = 3 hyperedgeslinking k = 3 vertices, drawn as white squares. Center panel: the introduction of an interaction, represented as a black square,between all the vertices at distance 1 from the central vertex i , generates short loops even if G is a tree. Right panel: the factorgraph representation of the probability measure (9), the white circles stand for the variable nodes v ( i,a ) , the black circles (resp.white squares) are the interaction factors e ψ (resp. e ω ). The messages η i → a and b η a → i obey the Belief Propagation equations(12). which contains the two parameters b > ǫ <
1. This form actually encompasses two cases previously investigatedin the literature: • when ǫ = 0, in such a way that ψ ( p ) = ( p = 0 ,b if p > , (7)one recovers a measure studied in [35]. Indeed the weight of a solution σ is then b raised to a power equal to thenumber of variables that are forced by at least one constraint, i.e. those that are not “whitened” after T = 1step of a coarsening algorithm used in [8, 21, 45–49], whose large deviations on atypical solutions were studiedin [35] for arbitrary values of T (with an unfortunate conflict of notation the parameter called b here is denoted e ǫ in [35]). • when b = 1 one has ψ ( p ) = (1 − ǫ ) p , (8)the weight of a solution σ is thus (1 − ǫ ) raised to the number of forcing constraints. Indeed in a solution everyconstraint a forces at most one of its variable i ∈ ∂a , when σ i is the unique representant of its color in σ ∂a , hencethere is no double counting of forcing constraints in the product over variables in (5). This case was investigatedin [39], and is somehow simpler thanks to the factorization of the biasing function along the hyperedges of G .Obviously the uniform measure (1) is recovered when b = 1 and ǫ = 0. B. Factor graph and auxiliary variables
We will study the properties of the measure (5) when G is drawn uniformly at random among all k -uniform, l + 1-regular hypergraphs, in the thermodynamic limit N, M → ∞ with
M k = N ( l + 1); in this limit such hypergraphsconverge locally to hypertrees (they contain a bounded number of finite length cycles), which allows the use of theBelief Propagation algorithm [50–52] and of the cavity method analysis [19, 25], that are based on this locally tree-likestructure. However the probability law (5), interpreted as a graphical model with variables σ i , contain short loopseven if G is a tree, because the biasing term ψ couples all the variables at distance 1 from each vertex (see the middlepanel of Fig. 1). This prevents a direct application of the cavity method, and requires a preliminary step in order toget rid of these short loops. To achieve this we introduce some auxiliary, redundant variables in the following way: foreach edge ( i, a ) between a vertex i and one of its incident hyperedge a ∈ ∂i we introduce two variables, w a → i ∈ { , } and σ ai ∈ {− , +1 } , which are deterministic functions of the original configuration σ , according to w a → i = I [ σ ∂a \ i a.e.]and σ ai = σ i . We will call v ( i,a ) = ( σ ai , w a → i ) the value of these two auxiliary variables, and v = { v ( i,a ) } i ∈ V,a ∈ ∂i theirglobal configuration. Consider now the following probability law for v : ρ ( v ) = 1 Z N Y i =1 e ψ ( { v ( i,a ) } a ∈ ∂i ) Y a ∈ E e ω ( { v ( i,a ) } i ∈ ∂a ) , (9)where e ψ ( σ , w , . . . , σ l +1 , w l +1 ) = ψ l +1 X i =1 w i ! I [ σ = · · · = σ l +1 ] , (10) e ω ( σ , w , . . . , σ k , w k ) = ω ( σ , . . . , σ k ) k Y i =1 I [ w i = I [ { σ j } j = i a.e.]] . (11)One realizes easily that for a configuration v in the support of (9) σ ai is independent of a , and that the marginal lawof σ is nothing but (5). The partition function Z is the same in the two expressions (5) and (9), and in the support of ρ ( v ) the variables w a → i are the deterministic functions of σ defined above. This equivalent rewriting with redundantvariables has an important advantage: as shown in the right panel of Fig. 1 the graphical model corresponding to(9), with variables v on the edges ( i, a ) of G and interaction nodes both on the original hyperedges ( e ω ) and on theoriginal vertices ( e ψ ), respects the topology of G , and is thus (locally) a tree if G is. III. THE REPLICA SYMMETRIC CAVITY FORMALISMA. Belief Propagation Equations
We will study the typical properties of the probability law (9) for large random hypergraphs with the cavitymethod [19, 25]. We first briefly recall the main ideas that underly it: if G were a tree then all the marginals of (9), aswell as the normalization constant Z , could be efficiently computed by recursion, breaking the tree into independentsubtrees and combining the results of the computations in the smaller substructures. This procedure takes the formof local recursion relations between “messages” passed from one variable node to its adjacent interaction nodes, andvice-versa, where these messages are probability laws for the variables in amputated factor graphs where some nodeshave been removed. These local recursions are exact if the factor graph is a tree, they can nevertheless be used evenif it has some cycles; in that case the corresponding algorithm, called Belief Propagation (BP) [50–52], is a priorionly approximate, with no convergence guarantee. Sparse random graph models being locally tree-like, BP is a goodcandidate to describe asymptotically their behavior. This is indeed the case within an hypothesis of correlation decay,called Replica Symmetry (RS), which implies that the long cycles of typical graphs do not spoil the locally tree-likefeatures captured by BP. This RS hypothesis breaks down for frustrated enough models, in particular for constraintsatisfaction problems at high enough densities, as we shall discuss in more details in the next Section.For now let us state the form of the BP equations and of the RS cavity predictions for the model at hand. TheBP messages are η i → a and b η a → i , the marginal laws of the variable v ( i,a ) that is placed on the edge ( i, a ) of G , ingraphs where one has removed the interactions a and i , respectively. These definitions are illustrated in the rightpanel of figure 1. Note that in a literal application of the BP algorithm one would have introduced messages fromevery variable node to every interaction node, for instance η i → ( i,a ) and η ( i,a ) → a ; as the variable nodes are of degreetwo these two messages are actually equal, we denoted their common value η i → a to lighten the notations. The BPequations between these messages are of the form η i → a = f ( { b η b → i } b ∈ ∂i \ a ) , b η a → i = g ( { η j → a } j ∈ ∂a \ i ) , (12)where the functions f and g derive from the interaction nodes e ψ and e ω stated in (10,11). The relation η = f ( b η , . . . , b η l )is thus found to mean η ( σ, w ) = 1 z X w ,...,w l ψ w + l X i =1 w i ! l Y i =1 b η i ( σ, w i ) , (13)where z is a normalization constant. Similarly b η = g ( η , . . . , η k − ) stands for b η ( σ, w ) = 1 b z X σ ,...,σ k − w ,...,w k − ω ( σ, σ , . . . , σ k − ) I [ w = I [ σ , . . . , σ k − a.e.]] (14) k − Y i =1 η i ( σ i , w i ) I [ w i = I [ σ, σ , . . . , σ i − , σ i +1 , . . . , σ k − a.e.]] , with b z a normalization constant. More explicitly one has b η ( σ,
1) = 1 b z k − Y i =1 η i ( − σ, , (15) b η ( σ,
0) = 1 b z k − X i =1 η i ( − σ, Y j = i η j ( σ,
0) + X I ⊂{ ,...,k − } ≤| I |≤ k − Y i ∈ I η i ( − σ, Y i I η i ( σ, , (16)as there is at most one variable which is the unique representant of its color in a set of k ≥ B. The replica symmetric solution and thermodynamics
In a k -uniform l + 1-regular hypergraph the local neighborhood of every vertex is the same, it is thus natural tolook for a translationally invariant solution of the BP equations. Moreover the probability measure we are studyingis invariant under the spin-flip symmetry σ → − σ , we can thus further restrict ourselves to a solution of the BPequation that respects this invariance. This amounts to take η i → a ( σ, w ) = η ∗ ( w ), b η a → i ( σ, w ) = b η ∗ ( w ) for all edges( i, a ). Plugging this form into (12) yields the equations satisfied by η ∗ and b η ∗ : η ∗ ( w ) = 1 z l X p =0 (cid:18) lp (cid:19) ψ ( p + w ) b η ∗ (0) l − p b η ∗ (1) p , (17) b η ∗ (1) = 1 b z η ∗ (0) k − , (18) b η ∗ (0) = 1 b z (cid:2) ( k − η ∗ (1) η ∗ (0) k − + (2 k − − k − η ∗ (0) k − (cid:3) . (19)Introducing the ratio of probabilities y = η ∗ (0) η ∗ (1) , b y = b η ∗ (0) b η ∗ (1) , (20)one can get rid of the normalization factors z and b z and rewrite (17,18,19) more simply y = l P p =0 (cid:0) lp (cid:1) ψ ( p ) b y − pl P p =0 (cid:0) lp (cid:1) ψ ( p + 1) b y − p , b y = 2 k − − k − k − y . (21)It turns out that for any choice of the parameters k , l , and ψ there exists a unique solution ( y, b y ) to the equations(21), which might not be obvious at first sight; a proof of this existence and uniqueness is provided in Appendix A.The thermodynamic aspects of the probability law (5) are described by its free-entropy ln Z and its Shannon entropy.In the large size limit these are extensive self-averaging quantities, we thus define the typical value of their densitiesas φ = lim N →∞ N E [ln Z ] , s = lim N →∞ N E − X σ ρ ( σ ) ln ρ ( σ ) , (22)where the average E [ • ] is over the uniform choice of the k -uniform l + 1-regular hypergraph G . The RS cavitymethod prediction for these quantities is obtained through the Bethe-Peierls approximation of ln Z in terms of theBP messages [19, 25, 51, 52]; on the translationally invariant solution this yields after a short, standard computation: φ = (cid:18) − ( l + 1)( k − k (cid:19) ln 2 − ( l + 1) ln (cid:18) y b y (cid:19) + ln l +1 X p =0 (cid:18) l + 1 p (cid:19) ψ ( p ) b y − p ! + l + 1 k ln (cid:18) k − − k − ky (cid:19) , (23) s = φ − l +1 P p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − p ln ψ ( p ) l +1 P p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − p . (24)Note that for some choices of the parameters, in particular when l gets large enough, this expression of the entropy s becomes negative. This is impossible for a model with discrete degrees of freedom, the Shannon entropy being alwaysnon-negative, such a negativity of s is thus a clear evidence of the failure of the RS hypothesis. This is however notthe only mechanism for the appearance of Replica Symmetry Breaking (RSB), as we shall see next this phenomenoncan occur in a phase with s > IV. THE DYNAMIC TRANSITIONA. The reconstruction problem and its recursive distributional equations
We shall now present the formalism that allows to compute the location of the dynamic transition which, asexplained in the introduction, manifests itself in different ways. Here we shall exploit its definition in terms of theexistence of long-range point-to-set correlations in the probability measure ρ [23, 24], that are related to the solvabilityof a tree reconstruction problem [22].Let us define the point-to-set correlation function, or overlap, at distance n , as follows: C n = lim N →∞ E [ h σ h σ i σ Bn i − h σ i ] , (25)where 0 is an arbitrary reference vertex and B n the vertices at distance at least n from 0; h·i denotes the expectationwith respect to ρ , while h·i σ Bn is the conditional average with the law ρ ( ·| σ B n ). Note that the second term in C n actually vanishes thanks to the invariance of ρ under the spin-flip transformation. The function C n can be interpretedas a measure of the correlation between the variable at the point 0 and those in the set B n , which explains itsname. Because the interactions in the biased measure couple spins belonging to neighboring hyperedges it is notenough to take for B n the set of variables at distance exactly n from the root; it is however equivalent to includein B n all the variables are distance at least n , or at distances n and n + 1. The dynamic transition separates anunderconstrained (Replica Symmetric, RS) regime in which C n → n → ∞ , and an overconstrained (ReplicaSymmetry Breaking, RSB) one in which C n remains strictly positive at large distances. To compute C n we firstremark that the local neighborhood of the vertex 0, up to any finite distance, converges when N → ∞ to a regulartree structure, as represented in Fig. 2. Moreover the marginal law of ρ on any finite neighborhood of G converges,within the hypothesis of the RS solution described in Sec. III B, to a measure that admits an explicit description interms of a broadcast process. Generating a configuration with the law of ρ in a finite neighborhood of a root vertex0 amounts indeed to (see Fig. 2 for a graphical representation): • choose σ = ± / • draw the l + 1 variables v = ( σ , w ) , . . . , v l +1 = ( σ l +1 , w l +1 ) adjacent to the root with the probability p ( v , . . . , v l +1 | σ ) = ψ (cid:18) l +1 P i =1 w i (cid:19) l +1 Q i =1 b η ∗ ( w i ) P w ′ ,...w ′ l +1 ψ (cid:18) l +1 P i =1 w ′ i (cid:19) l +1 Q i =1 b η ∗ ( w ′ i ) l +1 Y i =1 I [ σ i = σ ] . (26) PSfrag replacements σ p b p p FIG. 2: The tree structure considered for the computation of C n , represented here for k = l + 1 = 3. The generation of aconfiguration from the law ρ is performed in a broadcast fashion, the root σ being ± /
2, this informationis then propagated down the tree with transmission channels p , b p and p . • consider each of the v , . . . , v l +1 variables of the first generation as the root of the subtree lying below it, anddraw the value of the descendents v , . . . v k − of a variable equal to v from the conditional law b p ( v , . . . , v k − | v ) = ω ( σ, σ . . . σ k − ) I [ w = I [ { σ i } a.e.]] k − Q i =1 η ∗ ( w i ) I [ w i = I [ σ, { σ j } j = i a.e.]] P { σ ′ i ,w ′ i } ω ( σ, σ ′ . . . σ ′ k − ) I [ w = I [ { σ ′ i } a.e.]] k − Q i =1 η ∗ ( w ′ i ) I [ w ′ i = I [ σ, { σ ′ j } j = i a.e.]] . (27) • consider again the variables of the second generation as roots, and extract the value of their descendents fromthe conditional law p ( v , . . . , v l | v ) = ψ (cid:18) w + l P i =1 w i (cid:19) l Q i =1 b η ∗ ( w i ) P w ′ ,...w ′ l ψ (cid:18) w + l P i =1 w ′ i (cid:19) l Q i =1 b η ∗ ( w ′ i ) l Y i =1 I [ σ i = σ ] . (28) • iterate the last two steps until all the variables in the target neighborhood have been assigned.This broadcast procedure, that must be performed on the v variables and not only on the σ ’s to preserve theMarkov structure of the tree, can be interpreted as the transmission of an information (the value σ at the root)through noisy channels (the conditional laws p , b p and p defined in (26,27,28)). The question raised in the treereconstruction problem [22] is whether the variables σ B n , in the large n limit, contains some information on the valueof σ , in the sense that the observation of σ B n allows to infer the value of the root σ with a success probabilitylarger than the one expected from a random guess. In this Bayesian setting the optimal inference strategy is tocompute the posterior probability of σ given σ B n , which for Ising variables is completely described by the conditionalmagnetization h σ i σ Bn . The correlation function C n is a possible way of quantifying this amount of information, thetree reconstruction problem being solvable if and only if C n remains strictly positive in the large n limit.To complete the computation of C n we shall exploit again the recursive nature of the tree, but now in the oppositedirection with respect to the broadcast, namely from the variables at distance n towards the root. Indeed the measure ρ ( ·| σ B n ) is exactly described in terms of the solution of the BP equations (12), supplemented with the boundarycondition b η i → a ( v ) = δ v,v ( i,a ) on the edges at distance larger than n of the root, v ( i,a ) being the value taken by thevariable during the broadcast. These BP messages, directed towards the root, are thus random variables because ofthe randomness in the boundary condition σ B n ; one can nevertheless write recursion equations on their distributions,their law depending only on their distance from the boundary. We shall denote P v,n ( η ) the law of the message η on anedge at distance n from the boundary, conditional on the value of the variable on this edge being v in the broadcast,and similarly b P v,n ( b η ) for the law of the b η messages. Putting together all the above observations leads to the following0recursion equations: P v,n +1 ( η ) = X v ,...,v l p ( v , . . . , v l | v ) Z l Y i =1 d b P v i ,n +1 ( b η i ) δ ( η − f ( b η , . . . , b η l )) , (29) b P v,n +1 ( b η ) = X v ,...,v k − b p ( v , . . . , v k − | v ) Z k − Y i =1 d P v i ,n ( η i ) δ ( b η − g ( η , . . . , η k − )) , (30)where f , g , b p and p have been defined in (13,14,27,28), respectively, and with the initial condition for n = 0: b P v, ( b η ) = δ ( b η ( · ) − δ v, · ) . (31)The point-to-set correlation function is then computed as C n = 12 X σ X v ,...,v l +1 p ( v , . . . , v l +1 | σ ) Z l +1 Y i =1 d b P v i ,n ( b η i ) σ m ( b η , . . . , b η l +1 ) , (32)where p is the law defined in (26), and with the expression m ( b η , . . . , b η l +1 ) = m + ( b η , . . . , b η l +1 ) − m − ( b η , . . . , b η l +1 ) m + ( b η , . . . , b η l +1 ) + m − ( b η , . . . , b η l +1 ) , (33) m σ ( b η , . . . , b η l +1 ) = X w ,...,w l +1 ψ l +1 X i =1 w i ! l +1 Y i =1 b η i ( σ, w i ) , (34)for the conditional magnetization of the root.The recursion equations (29,30), which are equivalent to the 1RSB equations with Parisi breaking parameter equalto 1 [23], always admit the trivial solution P v ( η ) = δ ( η − η ∗ ), b P v ( b η ) = δ ( b η − b η ∗ ) as a stationary fixed point. In thenon-reconstructible (RS) phase this is the limit reached by P v,n and b P v,n in the large n limit, and then C n →
0. Onthe contrary in the reconstructible (RSB) phase the limit of P v,n and b P v,n is a non-trivial fixed point, and C n remainsstrictly positive. For a given choice of the parameters k and ψ we define the dynamic transition l d as the thresholdseparating these two behaviors. As l is here an integer parameter we will say more precisely that l < l d is the RSphase, l ≥ l d the RSB phase, i.e. l d ( k, ψ ) is the smallest integer value of l such that RSB occurs. B. Simplifications and symmetries
The recursion equations (29,30) bear, for each value of n , on eight distributions P v,n , b P v,n , as the variable v = ( σ, w )takes four different values. This number can however be divided by two thanks to the invariance of the problem underthe spin-flip symmetry σ → − σ . To state its consequences let us define the flip transformation η → η f betweenmessages, according to η f ( σ, w ) = η ( − σ, w ) (and similarly b η f ( σ, w ) = b η ( − σ, w )). The channels p and b p beinginvariant under a global spin-flip, one can check that P ( − ,w ) ,n ( η ) = P (+ ,w ) ,n ( η f ) , b P ( − ,w ) ,n ( b η ) = b P (+ ,w ) ,n ( b η f ) , (35)which allows to close (29,30) on the four distributions { P (+ ,w ) ,n , b P (+ ,w ) ,n } w =0 , , that we shall denote for simplicity { P w,n , b P w,n } w =0 , . Using this property, as well as the invariance of f , g under a permutation of their arguments anda more explicit version of the expressions (27,28) of b p and p , one can simplify (29,30) into: P w,n +1 ( η ) = l X p =0 (cid:0) lp (cid:1) ψ ( p + w ) b y − pl P p ′ =0 (cid:0) lp ′ (cid:1) ψ ( p ′ + w ) b y − p ′ Z p Y i =1 d b P ,n +1 ( b η i ) l Y i = p +1 d b P ,n +1 ( b η i ) δ ( η − f ( b η , . . . , b η l )) , (36)1 b P ,n +1 ( b η ) = Z k − Y i =1 d P ,n ( η i ) δ ( b η − g ( η f , . . . , η fk − )) , (37) b P ,n +1 ( b η ) = k − y b y Z d P ,n ( η ) k − Y i =2 d P ,n ( η i ) δ ( b η − g ( η f , η , . . . , η k − )) (38)+ 1 b y k − X t =2 (cid:18) k − t (cid:19) Z k − Y i =1 d P ,n ( η i ) δ ( b η − g ( η f . . . η ft , η t +1 , . . . , η k − )) . The expression (32) of the correlation function C n can similarly be rewritten as: C n = l +1 X p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − pl +1 P p ′ =0 (cid:0) l +1 p ′ (cid:1) ψ ( p ′ ) b y − p ′ Z p Y i =1 d b P ,n ( b η i ) l +1 Y i = p +1 d b P ,n ( b η i ) m ( b η , . . . , b η l +1 ) . (39)A further symmetry constrains the distributions P v,n ; to unveil it let us call P n ( η ) the distribution of η in a broadcastprocess wich is not conditioned on the value of the root, i.e.: P n ( η ) = X v =( σ,w ) η ∗ ( w ) P v,n ( η ) , (40)where η ∗ is normalized in such a way that η ∗ (0) + η ∗ (1) = 1 /
2. Applying Bayes theorem to express the joint law ofthe variable at the root and those at the boundary one obtains [23] P v,n ( η ) = η ( v ) η ∗ ( w ) P n ( η ) . (41)This yields a relation between P w,n = P (+ ,w ) ,n for the two values of w , namely P ,n ( η ) = y η (+ , η (+ , P ,n ( η ) , (42)where we recall that y = η ∗ (0) /η ∗ (1) was defined in Eq. (20). Moreover the spin-flip symmetry implies the invarianceof P n , i.e. P n ( η ) = P n ( η f ). This property, combined with (41), allows to relate P w,n in η and η f through a change ofdensity, namely P ,n ( η f ) = η ( − , η (+ , P ,n ( η ) , P ,n ( η f ) = η ( − , η (+ , P ,n ( η ) . (43)These symmetry relations, as well as the similar ones that hold for b P w,n modulo the replacement of y by b y in (42),will be particularly useful in the treatment of the large k limit presented in Sec. VI. They imply a variety of identitiesbetween average observables, and in particular they can be used to rewrite the correlation function as C n = l +1 X p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − pl +1 P p ′ =0 (cid:0) l +1 p ′ (cid:1) ψ ( p ′ ) b y − p ′ Z p Y i =1 d b P ,n ( b η i ) l +1 Y i = p +1 d b P ,n ( b η i ) m ( b η , . . . , b η l +1 ) , (44)which obviously shows that C n ≥
0. This alternative form of C n can be derived by first checking that l +1 X p =0 (cid:18) l + 1 p (cid:19) ψ ( p ) b y − p Z p Y i =1 d b P ,n ( b η i ) l +1 Y i = p +1 d b P ,n ( b η i ) A ( b η , . . . , b η l +1 ) (45)= l +1 X p =0 (cid:18) l + 1 p (cid:19) ψ ( p ) b y − p Z p Y i =1 d b P ,n ( b η i ) l +1 Y i = p +1 d b P ,n ( b η i ) m − ( b η , . . . , b η l +1 ) m + ( b η , . . . , b η l +1 ) A ( b η f , . . . , b η fl +1 ) (46)for an arbitrary function A which is invariant under the permutation of its arguments, and such that the integrals arewell-defined, and then applying this identity with the test function A = m (1 − m ).2 C. Hard fields
The tree reconstruction problem considered above asks whether the observation of σ B n gives some information onthe value of the root σ , as quantified by the correlation function C n ; answering this question requires to solve thefunctional recursion relations (36-38). We shall now consider a more drastic question, namely whether σ B n allows toinfer σ with perfect certainty, and call H n the probability of this event. It turns out that H n is much simpler tocompute than C n , with scalar recursions instead of functional ones, and that H n is a lower bound for C n ; this lastfact is quite intuitive, if σ B n implies the value of σ it certainly conveys information about it.To explain the computation of H n let us first remark that σ is implied by σ B n if and only if all the proper bicoloringsof the tree that coincides with σ B n on the boundary take the same value at the root; by definition we only considerbiased measures that do not strictly forbid any solution (here ψ ( p ) > p ), hence the certain determination of σ can only arise from the bicoloring constraints acting on the spin variables. This observation can be turned into analgorithm, called the naive reconstruction procedure: consider all the hyperedges at the boundary, and declare them“forcing to the value σ ” if their k − n from the root are all equal to − σ , and “not forcing”otherwise. Now the variables at distance n − σ if at least one of their incident boundaryhyperedge is forcing to this value (by construction of the broadcast process there cannot be conflicting forcings to +and − on the same variable), and a “white” value 0 if all the hyperedges are not forcing. This process can be iteratedfrom the boundary towards the root, hyperedges being forcing if and only if k − − H n is thus the probability that this successive forcing mechanism percolates fromthe boundary to the root, with at least one of its incident hyperedge forcing it.To embed the analysis of this naive reconstruction algorithm into the formalism defined above we first introducesome terminology to classify the messages η, b η ; we will say that • η is forcing to σ = ± η ( − σ,
0) = η ( − σ,
1) = 0, η ( σ, > η ( σ, > • η is non-forcing iff η ( σ, w ) > σ and w . • b η is forcing to σ = ± b η ( σ,
1) = 1, b η ( σ,
0) = b η ( − σ,
0) = b η ( − σ,
1) = 0; we write then b η = b η σ . • b η is non-forcing iff b η (+ ,
0) + b η (+ , > b η ( − ,
0) + b η ( − , > η = f ( b η , . . . , b η l )is forcing to σ iff at least one b η i is forcing to σ and none forcing to − σ , η is non-forcing otherwise. Similarly b η = g ( η , . . . , η k − ) is forcing to σ iff all the η i are forcing to − σ , and non-forcing otherwise.We decompose now the distributions P w,n , b P w,n between the contributions of the hard and of the soft fields, defining P w,n ( η ) = h w,n R w,n ( η ) + (1 − h w,n ) Q w,n ( η ) , (47) b P w,n ( b η ) = b h w,n δ ( b η − b η + ) + (1 − b h w,n ) b Q w,n ( b η ) , (48)where h, b h ∈ [0 ,
1] are the total weights of hard fields in the corresponding distributions, the R are normalizeddistributions on η ’s forcing to +, and Q and b Q are probability laws supported on non-forcing messages. By constructionthere are no messages forcing to − in P w,n = P (+ ,w ) ,n .Inserting these decompositions in the recursion equations (36-38) we see that the evolution of the hard fields weightsdecouple; in particular from (37) we obtain b h ,n +1 = ( h ,n ) k − and from (38) b h ,n +1 = 0, we shall thus write moresimply b h n instead of b h ,n . The equation (36) yields h w,n +1 = 1 − l P p =0 (cid:0) lp (cid:1) ψ ( p + w ) b y − p (1 − b h n +1 ) pl P p =0 (cid:0) lp (cid:1) ψ ( p + w ) b y − p , (49)the recursion can thus be closed on h ,n and b h n , and solved starting from the initial condition h , = 1. Finally H n can be read off from the expression (39) of C n by isolating the contribution with at least one forcing field b η aroundthe root, which gives H n = 1 − l +1 P p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − p (1 − b h n ) pl +1 P p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − p . (50)3Depending on the choice of the parameters ( l, k, ψ ) the sequence h ,n (or equivalently H n ) either decays to 0 or toa strictly positive fixed point. The so-called rigidity threshold l r ( k, ψ ) separates these two behaviors, we define it insuch a way that H n → l < l r whereas it remains strictly positive in the large n limit for l ≥ l r . In this lattercase there is a positive probability for the observation of a far away boundary to completely determine the root (thenaive reconstruction problem is solvable), hence there is certainly information about the value of the root (the usualreconstruction problem is also solvable). This observation shows that l r is an upperbound for the dynamic threshold, l d ≤ l r .For future use we also give here the recursion equation for the soft fields distributions, obtained by inserting thedecompositions (47,48) into (36-38): Q w,n +1 ( η ) = l X p =0 (cid:0) lp (cid:1) ψ ( p + w ) b y − p (1 − b h n +1 ) pl P p ′ =0 (cid:0) lp ′ (cid:1) ψ ( p ′ + w ) b y − p ′ (1 − b h n +1 ) p ′ Z p Y i =1 d b Q ,n +1 ( b η i ) l Y i = p +1 d b Q ,n +1 ( b η i ) δ ( η − f ( b η , . . . , b η l )) , (51) b Q ,n +1 ( b η ) = k − X u =1 (cid:0) k − u (cid:1) ( h ,n ) k − − u (1 − h ,n ) u − ( h ,n ) k − Z u Y i =1 d Q ,n ( η i ) k − Y i = u +1 d R ,n ( η i ) δ ( b η − g ( η f , . . . , η fk − )) (52) b Q ,n +1 ( b η ) = b P ,n +1 ( b η ) . (53)The last equation comes from the absence of hard fields in b P , one can thus take the expression (38) and insert in itsright hand side the decomposition (47) for P and P to have an equation involving only the soft fields distributions;this is relatively cumbersome notationally in general, we shall only write the corresponding equation in a special caselater on. The correlation function C n can also be decomposed from (39) as the sum of H n and a soft contribution: C n = H n + l +1 X p =0 (cid:0) l +1 p (cid:1) ψ ( p ) b y − p (1 − b h n ) pl +1 P p ′ =0 (cid:0) l +1 p ′ (cid:1) ψ ( p ′ ) b y − p ′ Z p Y i =1 d b Q ,n ( b η i ) l +1 Y i = p +1 d b Q ,n ( b η i ) m ( b η , . . . , b η l +1 ) . (54)Exploiting the symmetry relations (42,43) one can rewrite the second term in this equation with the integrand squared,exactly as we did in the expression (44) of C n , which proves the bound C n ≥ H n and confirms the intuition that thereconstruction problem is solvable if the naive reconstruction is. D. The Kesten-Stigum bound
The study of the naive reconstruction procedure presented above has yielded the upperbound l d ≤ l r for thethreshold of the dynamic transition. We state here another upperbound, l d ≤ l KS , that is obtained by analyzing thestability of the trivial fixed point P v ( η ) = δ ( η − η ∗ ), b P v ( b η ) = δ ( b η − b η ∗ ) under the iterations of Eqs. (29,30). Thethreshold l KS is defined in such a way that for l < l KS (resp. l > l KS ) a small perturbation around the fixed point isattenuated (resp. amplified) by the iterations; for l > l KS the sequence P v,n thus converges towards a non-trivial fixedpoint, which justifies the bound l d ≤ l KS . In the tree reconstruction literature this is known as the Kesten-Stigumtransition [22, 53], while equivalent computations have been performed in statistical physics under the name of localinstability of the RS solution towards RSB [54]. We shall only state the expression of l KS and refer the reader tothe literature for more details on its derivation, see for instance [55] for an in-depth study of this transition. Let usdefine M v,v ′ as the probability that in a broadcast from the channel p defined in (28) one of the descendent variablestakes the value v ′ if the parent is equal to v ; in formula, M v,v ′ = P v ,...,v l p ( v ′ , v , . . . , v l | v ). We define in a similar way c M using instead the conditional law b p of (27). Writing M and c M as matrices, ordering their rows and columns as v = (+ , , (+ , , ( − , , ( − , M = α − α − α α α − α − α α , c M = − b α − b α b α b α b α b α − b α − b α
01 0 0 0 , (55)4where the matrix elements have the following expressions, α = l − P p =0 (cid:0) l − p (cid:1) ψ ( p ) b y − pl − P p =0 (cid:0) l − p (cid:1) ψ ( p ) b y − p + l − P p =0 (cid:0) l − p (cid:1) ψ ( p + 1) b y − p − , α = l − P p =0 (cid:0) l − p (cid:1) ψ ( p + 2) b y − p − l − P p =0 (cid:0) l − p (cid:1) ψ ( p + 1) b y − p + l − P p =0 (cid:0) l − p (cid:1) ψ ( p + 2) b y − p − , (56) b α = 2 k − − b y , b α = 1 y b y . (57)We then call θ the second largest, in modulus, eigenvalue of the matrix product M c M (the largest one being the Perroneigenvalue 1 as the rows are probability vectors), and define l KS through the relation l KS ( k − θ = 1. V. FINITE k NUMERICAL RESULTS
For a given choice of the parameters ( k, l, ψ ) the model is either in a non-reconstructible, Replica Symmetric (RS)phase if the point-to-set correlation function C n decays to 0 as n → ∞ , or in a reconstructible, Replica SymmetryBreaking (RSB) phase if C n remains strictly positive in this large distance limit. We recall that we defined l d ( k, ψ ) asthe smallest integer value of l which leads to RSB, and that our goal is to find a bias ψ that pushes l d to the largestpossible value. We have seen several upperbounds on l d that can be easily computed analytically: if the entropy (24)computed in the RS ansatz is negative this is certainly an evidence for the RSB phenomenon; the existence of hardfields, i.e. the possibility of naive reconstruction, implies the reconstructibility, hence the rigidity bound l d ≤ l r ; finallythe Kesten-Stigum bound l d ≤ l KS follows from the instability of the trivial fixed point of the reconstruction recursiveequations. Nevertheless there are no lowerbounds on l d that are simple to compute, hence an explicit determination ofthis threshold requires a numerical resolution of the equations (36-38). This type of Recursive Distributional Equation(RDE) admits a natural numerical procedure to solve them, called population dynamics algorithm [25, 56], in whicha distribution, say P ,n , is approximated by the empirical distribution over a sample of N ≫ P ,n ( η ) ≈ N N X i =1 δ ( η − η i ) . (58)With 2 N fields η i one can encode the distributions P w,n at distance n for w = 0 ,
1, from which the representantsof b P w,n +1 are generated stochastically according to (37,38), and in turn the populations representing P w,n +1 canbe obtained from (36). At each step of this iterative procedure one computes the correlation function C n from (39),interpreting the average over b P w,n as an uniform sampling of an element of the corresponding population. An exampleof the results thus obtained is presented in Fig. 3, where one sees, depending on the choices of parameters, RS caseswith C n vanishing at large n , and RSB situations where C n remains positive. The results presented in the rest of thissection have been obtained with populations of size N = 10 ; we considered that C n → C n , for large enough values of n such that stationarity was reached within our numerical accuracy, dropped belowa small threshold value (we used 0 .
005 in the figures below).The function ψ ( p ) contains a large number ( l + 1) of free parameters, some choices must hence be made on itsspecific form. We first considered the case where ψ ( p ) = (1 − ǫ ) p , with a single free parameter ǫ <
1. As explainedin Sec. II A this corresponds to a bias that factorizes over the hyperedges of the bicoloring problem, that we studiedin [39] for Erd˝os-R´enyi (ER) random hypergraphs in which the degree of a vertex has a Poisson distribution of average αk . The phase diagrams we obtained numerically for the regular case considered in this paper are presented in the( l, ǫ ) parameter plane for k = 5 and k = 6 in Fig. 4. They are qualitatively similar to the results obtained in [39]for the ER case, and quantitatively close with the correspondence αk = l + 1 between the average degree of the ERensemble and the one fixed here. The important point we want to emphasize here is the fact that a suitable choice of ǫ allows to increase l d with respect to its value for the uniform measure ( ǫ = 0). For instance for k = 5 and l = 47,the RSB phase at ǫ = 0 is turned into a RS phase when ǫ = 0 .
04. Similarly for k = 6 the dynamic transition l d = 108of the uniform measure can be pushed to l d = 113 for a well-chosen value of ǫ .The natural question that arises at this point is whether the more generic bias introduced in this manuscript, i.e.the additional degrees of freedom in the choice of ψ ( p ), allows to further increase the dynamic transition threshold l d . To investigate this point without introducing too large a space of parameters, that would be impossible to exploresystematically, we considered the following function ψ : ψ (0) = 1 , ψ (1) = b , ψ ( p ) = b (1 − ǫ ) p for p ≥ , (59)5 b = 1 . b = 1 . b = 1 . nC n FIG. 3: An example of the shape of the correlation function C n as a function of n , here for k = 6, l = 114 and the bias function ψ ( p ) defined in Eq. (59), with b = 1 . ǫ = 0 . b . lǫ lǫ FIG. 4: Phase diagrams in the ( l, ǫ ) plane for the bias function ψ ( p ) = (1 − ǫ ) p , for k = 5 (left panel) and k = 6 (right panel).The points are the clustering threshold l d ( ǫ ): for a given value of ǫ the RS phase corresponds to l < l d , the RSB phase to l ≥ l d . The three continuous lines are upperbounds of l d , the area on their right is necessarily in a RSB phase; the solid one isthe Kesten-Stigum bound explained in Sec. IV D, the dashed line marks the rigidity threshold l r defined in Sec. IV C, and thedot-dashed line corresponds to the vanishing of the RS entropy (24). with the three free parameters ( b , b , ǫ ). We then solved numerically the RDEs with parameters close to the optimalvalues found previously in the restricted case with b = 1 − ǫ and b = 1. The results are shown in Fig. 5 in theparameter plane ( b , b ), for fixed values of k , l and ǫ , with squares (resp. crosses) marking RS (resp. RSB) phases.The left panel shows the existence of a RS phase at k = 5, l = 48, whereas all values of ǫ led to RSB at this valueof l for the factorized bias ψ ( p ) = (1 − ǫ ) p . We did not find any choice of parameters ( b , b , ǫ ) with a RS phase for l = 49. Similarly the right panel shows, for k = 6, the largest value of l , l = 114, for which we found a RS phase forwell-chosen parameters (see also the drop of C n to 0 in Fig. 3 for ǫ = 0 . b = 1 . b = 1 . . l d of the uniformmeasure, the second column gives the result obtained with a bias of the form (8), when optimizing on the choice of ǫ to increase as much as possible l d , and the third column gives the result obtained with a bias of the form (59), forwell-chosen values of ( b , b , ǫ ). We can see that we were able to further improve the value of l d for k = 5 and k = 6 byusing a bias that introduces interactions between variables of different hyperedges, with respect to the factorized bias.An even further improvement might be achieved by a more systematic exploration of the parameter space ( b , b , ǫ ),or by using even more general bias functions ψ ( p ), at the price of a large computational cost due to the increaseddimensionality of the parameter space. For comparison we give in the last column the satisfiability threshold, i.e. thesmallest value of l such that the typical hypergraphs have no proper bicolorings, computed within the 1RSB ansatz(see [21, 35] for details), that is obviously an upperbound for l d , independently of the bias.6 b b b b FIG. 5: The phase diagrams in the ( b , b ) parameter plane for the bias function of Eq. (59). The left panel is for k = 5, l = 48 and ǫ = 0 .
06, the right panel for k = 6, l = 114, ǫ = 0 . l < l d ( k, b , b , ǫ )), the crosses to a RSB phase ( l ≥ l d ( k, b , b , ǫ )). The line is the Kesten-Stigum bound, the area below it isRSB. k uniform ǫ ( b , b , ǫ ) l sat l d for the uniform measure, for the bias of equation (8) that factorizes on the hyperedges, with theoptimal value of ǫ , and for the bias of equation (59) for well-chosen parameters ( b , b , ǫ ). The last column is the satisfiabilitythreshold. VI. LARGE k ASYMPTOTICS
The rest of the paper will be devoted to an asymptotic expansion of the clustering threshold l d when k → ∞ , forthe biased measure; we have just seen that for finite k the latter has a larger l d with respect to the uniform one, andthat the inclusion of interactions between variables at larger distance brings a further improvement compared to abiasing function factorized over the hyperedges. It is thus natural to investigate this phenomenon in the large k limit,that allows for some analytical simplifications, and where the algorithmic gap discussed in the introduction is mostclearly seen. One would like in particular to understand at which order of the asymptotic expansion of l d the effectof the bias does appear.This Section being rather long and technical we give here, for the convenience of the reader, the main ideas andexplain the organization of the forecoming computation, which is the generalization of the one we presented in [44]for the uniform measure. We will focus on the particular form of the function ψ ( p ) defined in (6), with the twoparameters b and ǫ , and start in Sec. VI A by summarizing the main equations derived above, for arbitrary k , in thisspecial case. In order to take the k → ∞ limit we must specify how the degree l and the parameters ( b, ǫ ) behavewith k ; we will set l = 2 k − (ln k + ln ln k + γ ) , ǫ = e ǫ r k ln k , (60)where γ and e ǫ are constants independent of k that parametrize the degree and the bias in this limit (the factor √ b will be independent of k . This specific choice for the scaling of b and ǫ will be justified later in this section. We will find that both l d and the rigidity threshold l r have asymptotic expansionsof the form (60), our goal being to determine the corresponding rescaled thresholds γ d and γ r , as a function of theparameters ( b, e ǫ ).To do so we shall first expand the correlation function C n and its hard-fields contribution H n , for a finite distance n , and find that both go to their maximal value 1, with the correction term scaling as C n = 1 − e C n k ln k + o (cid:18) k ln k (cid:19) , H n = 1 − e H n k ln k + o (cid:18) k ln k (cid:19) , (61)7where e C n and e H n are independent of k . These sequences depend on the rescaled parameters γ , b and e ǫ , and wepresent in Sec. VI B recursion equations that allow to compute them (in Sec. VI B 1 for e H n and in Sec. VI B 2 for e C n ).The tresholds l d and l r have been defined for finite k according to the positivity of the large n limit of the sequences C n and H n , respectively. Their asymptotic expansion should thus be performed by taking the large k limit after thelarge n one; however, under the natural hypothesis (that can be checked explicitly for H n ) that the large n limit of C n and H n is either strictly vanishing or scales with k as in (61), one can determine γ d and γ r by reversing the orderof the limits and studying whether e C n and e H n remain bounded or not in the large n limit. The large n limit of e C n isthus discussed in Sec. VI C. Additional difficulties need to be overcome in the intermediate regime γ d < γ < γ r wherereconstruction is possible but naive reconstruction is not: even if strictly hard fields are not present here the scaling(61) reveals that the soft fields are actually quasi-hard, the correlation function tending to one. We thus reformulatein Sec. VI C 2 the recursion of VI B 2 and put it in a form for which the large n limit can be performed in a numericallytractable way. Finally our explicit results for γ d are presented in VI D. A. A specialization of some formulas
Let us first specialize some of the formulas we wrote previously for a generic ψ ( p ) to the case defined in equation(6) with the two parameters b and ǫ . The BP equation (13) for the function η = f ( b η , . . . , b η l ) becomes η ( σ,
1) = 1 z b (1 − ǫ ) l Y i =1 ( b η i ( σ,
0) + (1 − ǫ ) b η i ( σ, , (62) η ( σ,
0) = 1 z " (1 − b ) l Y i =1 b η i ( σ,
0) + b l Y i =1 ( b η i ( σ,
0) + (1 − ǫ ) b η i ( σ, , for σ = ±
1. The equation (21) for the factorized RS solution reads y = 11 − ǫ (cid:18) − bb (1 + (1 − ǫ ) b y − ) − l (cid:19) , b y = 2 k − − k − k − y . (63)The evolution equations (49,50) for the hard fields become h ,n +1 = 1 − − b + b (1 + (1 − b h n +1 )(1 − ǫ ) b y − ) l − b + b (1 + (1 − ǫ ) b y − ) l , (64) h ,n +1 = 1 − (1 + (1 − b h n +1 )(1 − ǫ ) b y − ) l (1 + (1 − ǫ ) b y − ) l , (65) H n = 1 − − b + b (1 + (1 − b h n )(1 − ǫ ) b y − ) l +1 − b + b (1 + (1 − ǫ ) b y − ) l +1 , (66)where we recall the initial condition h ,n =0 = 1 and the fact that b h n +1 = ( h ,n ) k − . We can thus write a closedequation on h ,n : h ,n +1 = 1 − − b + b (1 + (1 − ( h ,n ) k − )(1 − ǫ ) b y − ) l − b + b (1 + (1 − ǫ ) b y − ) l . (67)One can check numerically that this equation undergoes a discontinuous bifurcation when l increases above the rigiditythreshold l r . Here all the formulas depend analytically on l , we can thus consider it as a real parameter, even if theoriginal model is only defined for integer l . The fixed point h = lim n →∞ h ,n jumps abruptly from 0 to a strictlypositive value when l is increased above l r . We can determine the location of this threshold by noting that at sucha bifurcation the function that maps h ,n to h ,n +1 is tangent with the diagonal, hence l r and the bifurcating fixedpoint h , r are solutions of h , r = 1 − − b + b (1 + (1 − ( h , r ) k − )(1 − ǫ ) b y − ) l r − b + b (1 + (1 − ǫ ) b y − ) l r , (68)1 = l r ( k − h , r ) k − (1 − ǫ ) b y − b (1 + (1 − ( h , r ) k − )(1 − ǫ ) b y − ) l r − − b + b (1 + (1 − ǫ ) b y − ) l r . (69)8For a generic bias ψ ( p ) the distribution R w,n of the hard fields introduced in (47) is a priori non-trivial, but for theparticular choice of ψ defined in (6) it simplifies into R w,n ( η ) = δ ( η − η + ) , where η + ( σ,
0) = 12 − ǫ δ σ, + , η + ( σ,
1) = 1 − ǫ − ǫ δ σ, + , (70)for all w and n . We will also denote η − = ( η + ) f the message forcing to − . This allows to simplify the equation (52)on the soft fields distribution, which reads now: b Q ,n +1 ( b η ) = k − X u =1 (cid:0) k − u (cid:1) ( h ,n ) k − − u (1 − h ,n ) u − ( h ,n ) k − Z u Y i =1 d Q ,n ( η i ) δ ( b η − g ( η f , . . . , η fu , η − , . . . , η − )) . (71)It will be useful in the following to encode in a compact way the value of g ( η , . . . , η k − ) when all, or almost all, thearguments of g are forcing messages. We shall hence define, for a real number α , the message b η = g ( α ) as b η ( σ, w ) = δ w, σ tanh( α )2 ; (72)the value w is thus fixed to 0, while σ can be seen as an Ising spin submitted to an effective magnetic field α . Onethen founds that the values of g when all its arguments are forcing are: • g ( η − , . . . , η − ) = b η + and g ( η + , . . . , η + ) = b η − , the usual combination rule to obtain a forcing message b η ; • g ( η + , η − , . . . , η − ) = g ( ǫ ′ ) and g ( η − , η + , . . . , η + ) = g ( − ǫ ′ ), with ǫ ′ = − ln(1 − ǫ ), when all the messages exceptone are forcing in the same direction, the last one in the opposite direction; • g ( η + , . . . , η + , η − , . . . , η − ) = g (0) when there are at least two forcing fields in each direction.We will also introduce two functions g + and g − that gives the value of g when all its arguments are forcing in the samedirection, except one which is arbitrary, namely g + ( η ) = g ( η, η − , . . . , η − ) and g − ( η ) = g ( η, η + , . . . , η + ). Explicitly, b η = g σ ( η ) means b η ( σ,
1) = 1 b z η ( − σ, , b η ( − σ,
1) = 0 , b η ( σ,
0) = 1 b z η ( σ, , b η ( − σ,
0) = 1 b z η ( σ, , (73)with b z normalizing this distribution. Note that the two functions g + and g − are linked by the spin-flip operationaccording to g + ( η f ) = ( g − ( η )) f . B. The large k limit for finite n
1. Evolution of the hard fields
We start our large k asymptotic expansion, using the scaling of the parameters defined in (60), by considering thesolution (63) of the translationally invariant RS equation; its leading order behavior is easily found to be y = 1 + e ǫ r k ln k + o (cid:18) √ k ln k (cid:19) , b y = 2 k − (cid:18) O (cid:18) k k − (cid:19)(cid:19) . (74)Turning to the sequences h w,n for the weights of the hard fields, solutions of the recursion equations (64,65), onerealizes easily that, for n finite in the large k limit with the scaling of the parameters stated above, h ,n = 1 − x ,n k ln k + o (cid:18) k ln k (cid:19) , h ,n = 1 − x ,n k ln k + o (cid:18) k ln k (cid:19) , b h n +1 = 1 − x ,n ln k + o (cid:18) k (cid:19) , (75)where x ,n and x ,n are independent of k and solutions of the recursion relations: x ,n +1 = B e − γ + e − γ + x ,n , (76) x ,n +1 = e − γ + x ,n = x ,n +1 − B e − γ . (77)9 e − γ ( B + e x ) x x nx ,n FIG. 6: Left panel: the functions x and B e − γ + e − γ + x as a function of x , with b = 0 . γ = 1 . , . , . γ r (0 . ≈ . x ,n as a function of n ,for b = 0 . γ = 1 . , . , . Here and sometimes in the following it is more convenient to use the notation B = 1 − bb (78)as a parameter equivalent to b . The recursion above is closed on x ,n , and satisfies the initial condition x ,n =0 = 0,that follows immediately from h ,n =0 = 1. Note that for b = 1 (i.e. B = 0) one recovers the result of equation (33) in[44] for x n = x ,n = x ,n , as it should in the uniform case. One also finds by expanding (66) that H n , the hard fieldscontribution to the correlation function, is indeed given by the asymptotic expansion stated in (61), with e H n = x ,n .The behavior of the sequence x ,n solution of (76) is easily determined by plotting the shape of the function x B e − γ + e − γ + x , see the left panel of Fig. 6 for an example. For a given value of b (hence of B ) there exists acritical value γ r ( b ) such that this function remains strictly above the diagonal when γ < γ r ( b ), while it intersects itfor γ > γ r ( b ). As a consequence in the former case the sequence x ,n diverges (very rapidly, as iterated exponentials)with n , whereas in the latter it converges to the smallest fixed point; these behaviors are illustrated in the right panelof Fig. 6. The divergence of x ,n = e H n corresponds, in the large k limit, to the vanishing of H n at finite k (recall thedefinition (61)), i.e. to the impossibility of naive reconstruction. The value of γ r ( b ) can be obtained by noticing thatat this bifurcation the function x B e − γ + e − γ + x is tangent with the diagonal at their unique intersection point x r ( b ), hence that ( x r ( b ) , γ r ( b )) are solution of ( x = B e − γ + e − γ + x e − γ + x ⇒ ( x = γγ = 1 + B e − γ . (79)As b > B > −
1, this equation admits a unique solution with γ > x ,n being positive this is alsothe case for the fixed point x , and hence also of γ at the bifurcation), which can be expressed as γ r ( b ) = 1 + W (cid:18) Be (cid:19) , (80)where W ( z ) is the Lambert function, i.e. the principal solution of the equation z = W e W . Note that this resultcoincides with the asymptotic expansion of l r one obtains from (68), which shows the commutativity of the limits n → ∞ and k → ∞ for the determination of the rigidity transition. The function γ r ( b ) is plotted in the figure 10(right panel, upper curve): it is a decreasing function of b , with γ r (1) = 1 for the uniform measure. An example forthe values of the fixed point reached by x ,n for γ > γ r ( b ) can be found in the right panel of Fig. 7.
2. Evolution of the soft fields distribution
We shall now study the large k limit of the soft fields distributions Q w,n , b Q w,n . The crucial point we shall exploitto simplify them is the fact that the hard fields weights h w,n are very close to 1 according to the scaling (75), hence0the dominant contributions to b Q w,n will arise when the incoming messages are almost all forcing. To put this remarkon a quantitative ground we start with the equation (71) on the distribution b Q ,n +1 . The integer u that appears inthis equation is a random number drawn from the binomial distribution Bin( k − , − h ,n ), conditioned to be strictlypositive. In the large k limit, using the scaling behavior (75) of h ,n , one sees that the average ( k − − h ,n ) of thebinomial distribution vanishes as O (1 / ln k ), hence the main contribution in (71) arises from the smallest value u = 1appearing in the sum. We thus obtain at the leading order: b Q ,n +1 ( b η ) = Z d Q ,n ( η ) δ ( b η − g + ( η f )) , (81)where the function g + was defined in (73). Consider now the equation (38) for b Q ,n +1 = b P ,n +1 . When all the η i ’sare extracted from the hard part of P ,n and P ,n the arguments of g are all forcing, with the two possible directionsrepresented; in most of these terms there are at least two messages in each direction, except for the term in the firstline of (38), and the term with t = k − g (0) and g ( ± ǫ ′ ), where the function g was defined in (72). When exactly one of the η i is asoft field, we obtain a contribution g − ( η f ) from the first line if it is η which is soft, a contribution g + ( η k − ) from theterm t = k − η k − is the unique soft field, all other cases leading to subdominant contributionsof the form g ( α ) with α of order ǫ ′ . Collecting these various contributions we thus obtain at the leading order: b Q ,n +1 ( b η ) = x ,n k − ln k Z d Q ,n ( η ) δ ( b η − g − ( η f )) + x ,n k − ln k Z d Q ,n ( η ) δ ( b η − g + ( η )) (82)+ β + δ ( b η − g ( ǫ ′ )) + β − δ ( b η − g ( − ǫ ′ )) + (cid:18) − x ,n + x ,n k − ln k − β + − β − (cid:19) δ ( b η − g (0)) , with β + = k − b y ( h ,n ) k − , β − = k − y b y h ,n ( h ,n ) k − . (83)We turn now our attention to the equation (51) for Q w,n +1 ; in the limit we are considering the non-vanishingcontributions are found to arise only for values of p that remain finite, the law of p becoming P w,n ( p ), with P ,n ( p ) = Po( p ; x ,n ) , P ,n ( p ) = 11 − b + b e x ,n × ( p = 0 b ( x ,n ) p p ! if p > , (84)where we have introduced the notation Po( p ; λ ) = e − λ λ p p ! for the Poisson law of parameter λ ; the P w,n are indeedwell-normalized probability distributions. In the right hand side of (51) a (finite) number p of messages b η i are thusdrawn from b Q ,n +1 , for which we can use the limit form (81), while the others l − p ∼ k − ln k are drawn from b Q ,n +1 . Observing the form of Eq. (82) one realizes that the number of times the first two terms of b Q ,n +1 will bepicked become Poissonian random variables of parameter x ,n and x ,n , respectively. All the other terms are of theform g ( α ), which will be dealt with thanks to the simple exact identity: f ( b η , . . . , b η s , g ( α s +1 ) , . . . , g ( α l )) = f ( b η , . . . , b η s , g ( α s +1 + · · · + α l )) . (85)The sum α of the arguments of g is thus α = ǫ ′ ( a + − a − ), where a + , a − are a pair of integers drawn from themultinomial distribution of parameters ( l ; β + , β − ). One can thus compute the first two cumulants of α as E [ α ] = ǫ ′ l ( β + − β − ) , Var[ α ] = ( ǫ ′ ) l ( β + + β − − ( β + − β − ) ) . (86)In the limit we are considering one finds that these two quantities converge to e ǫ , while the cumulants of higher ordervanish, which show that α tends to a Gaussian distributed random variable with mean and variance both equal to e ǫ ;we will denote the corresponding probability density as D e ǫ α = d α √ π e ǫ e − e ǫ ( α − e ǫ ) . Note that this result justifies thechoice for the scaling of ǫ made in (60), because it leads to a finite contribution of the random variable α , while another scaling would have led to a trivial contribution (with mean and variance either going to 0 or diverging with k ).Collecting all these facts yields Q w,n +1 ( η ) = ∞ X p,q,r =0 P w,n ( p )Po( q ; x ,n )Po( r ; x ,n ) Z D e ǫ α p Y i =1 d Q ,n ( η i ) p + q Y i = p +1 d Q ,n ( η i ) p + q + r Y i = p + q +1 d Q ,n ( η i ) (87) δ ( η − f ( g ( α ) , g + ( η f ) , . . . , g + ( η fp ) , g + ( η p +1 ) f , . . . , g + ( η p + q ) f , g + ( η p + q +1 ) , . . . , g + ( η p + q + r ))) , g − ( η f ) = g + ( η ) f to transform q of the arguments of f . In this equation the function f isthe one defined in Eq. (62), in which one can take ǫ = 0 at this leading order; explicitly, η = f ( b η , b η , . . . ) means η ( σ,
1) = 1 z b Y i ( b η i ( σ,
0) + b η i ( σ, , (88) η ( σ,
0) = 1 z " (1 − b ) Y i b η i ( σ,
0) + b Y i ( b η i ( σ,
0) + b η i ( σ, , with σ = ± Q w,n =1 can be deduced after a short computation from the one on b P v, given in (31): Q w, ( η ) = ∞ X q,r =0 Po( q ; e − γ )Po (cid:18) r ; e − γ b (cid:19) Z D e ǫ α δ (cid:18) η − f (cid:18) g (cid:18) α + q − r b (cid:19)(cid:19)(cid:19) , (89)for both values w = 0 ,
1. The explicit value of η for a given choice of α , q and r reads η (+ ,
0) = 1 z b q e α , η (+ ,
1) = 1 z b q +1 e α , η ( − ,
0) = 1 z b r e − α , η ( − ,
1) = 1 z b r +1 e − α , (90)with z normalizing this distribution.The recursion relation (87) bears on the two sequences of distributions Q ,n and Q ,n ; however the two sequencesare not independent, and obey some symmetry properties, that follow from the equations (42,43). In the large k limitthese relations translate into Q w,n ( η f ) = η ( − , w ) η (+ , w ) Q w,n ( η ) , or equivalently Z d Q w,n ( η ) A ( η ) = Z d Q w,n ( η ) A ( η f ) η ( − , w ) η (+ , w ) , (91)and x ,n Q ,n ( η ) = η (+ , η (+ , x ,n Q ,n ( η ) , x ,n Z d Q ,n ( η ) A ( η ) = x ,n Z d Q ,n ( η ) A ( η ) η (+ , η (+ , , (92)for any function A such that the integrals exist. One can check by induction on n that the sequences Q ,n and Q ,n solution of (87) with the initial condition (89) do indeed satisfy these identities.We can finally establish the scaling stated in (61) for the correlation function C n , by simplifying the expression (54)in the large k limit. Observing in particular that the probability law for p is the same in this equation and in (51)with w = 0 (modulo the shift l → l + 1 which is irrelevant in the limit), one finds after after a short computation theexpression e C n = x ,n Z d Q ,n ( η )(1 − e m ( η )) , (93)for the reduced correlation function e C n , where we defined e m ( η ) = η (+ , − η ( − , η (+ ,
0) + η ( − , . (94)Note that e C n satisfies the inequalities 0 ≤ e C n ≤ x ,n = e H n , which are immediate consequences of the bounds H n ≤ C n ≤ k and of the definitions in (61). As a consistency check one can also derive thebounds on e C n directly in the large k formalism; one of them is obvious from the observation that e m ( η ) ≤ η ,the other one follows from the identity Z d Q ,n ( η ) e m ( η ) = Z d Q ,n ( η ) e m ( η ) ≥ , (95)which can be proven from the Bayes symmetry expressed in (91), using the test function A ( η ) = e m ( η )(1 − e m ( η )).2 C. The large n limit Let us summarize what we have just achieved and underline the main equations that will be used in the following.We have obtained recursive equations, in which the parameter k has disappeared, that allow to compute the reducedcorrelation function e C n and its hard-fields contribution e H n introduced in (61). The latter can be obtained from thescalar recursion (76), it depends on γ and b , and the asymptotic expansion of the rigidity threshold is of the form (60)with a constant γ r ( b ) easily determined from the large n behavior of e H n : for γ < γ r ( b ) one has e H n → ∞ as n → ∞ ,while e H n remains bounded for γ ≥ γ r ( b ). The computation of the reduced correlation function e C n requires insteadthe resolution of the functional recursion equation (87) on the distributions of the soft-fields Q w,n , supplemented bythe initial condition (89), from which e C n is computed using the equation (93). The sequence e C n depends on theparameters γ , b and e ǫ , and the constant γ d ( b, e ǫ ) in the asymptotic expansion of the dynamic threshold l d is deducedfrom the large n asymptotics of e C n (if γ < γ d ( b, e ǫ ) then e C n → ∞ , while it remains bounded for γ > γ d ( b, e ǫ )). We shallnow discuss the computation of e C n in the large n limit, as the final step to complete the determination of γ d ( b, e ǫ ).
1. For γ > γ r ( b ) The most natural way to solve numerically the functional recursion equation (87) on Q w,n is to use the populationdynamics algorithm already explained in Sec. V, that consists in approximating Q w,n by the empirical distribution overa sample of N representative elements { η , . . . , η N } . An iteration step n → n + 1 amounts to update the populationsby drawing the integers p , q , and r from their respective laws, extracting the η i ’s from the current populations, andcreating an η of the new population according to the argument of the Dirac delta in (87). When γ > γ r ( b ) thisprocedure can be performed without difficulty for arbitrarily large distances n , as the sequences x ,n , x ,n remainbounded for all n .The figure 7 presents numerical results obtained in this way for b = 0 . e ǫ = 0. We have plotted on the leftpanel e C n as a function of n for some values of γ above the rigidity threshold γ r ( b ) ≈ . e C n converges at large n to a finite limit e C , that we have plotted as a function of γ in the right panel, along with the limit e H of e H n = x ,n . As we mentioned before the reduced overlap satisfy the bounds 0 ≤ e C n ≤ e H n , hence in the large n limit one has 0 ≤ e C ≤ e H , that is indeed verified in the right panel of the figure 7. This implies that e C n remainsbounded for γ > γ r ( b ), hence the expected inequality γ d ( b, e ǫ ) ≤ γ r ( b ). The observation of the right panel of figure 7suggests the less obvious fact that this inequality is strict; indeed e H has a square root singularity when γ → γ +r , as aconsequence of the bifurcation it undergoes, while e C seems pefectly smooth in this limit, suggesting that it remainsfinite down to a critical value γ d < γ r .Unfortunately the most interesting regime γ d < γ < γ r cannot be studied with the simple numerical procedure wejust described: when γ < γ r ( b ) the sequences x ,n and x ,n diverge, hence the random numbers p, q, r of fields η i thatmust be manipulated to implement (87) become very quickly too large for any practical purpose. We shall thus devisein the next subsection an alternative formulation to circumvent this difficulty, that was used in particular to obtainthe points of the curve e C below the rigidity threshold in the right panel of figure 7.In order to give an intuition on how this reformulation should be performed we first present in figure 8 the resultsof the simple procedure for γ slightly below γ r , and distances n not too large. One sees clearly in this plot that e H n = x ,n diverges, while e C n seems to remain bounded; the expression (93) of e C n reveals that such a situation ispossible if Q ,n concentrates on fields η with e m ( η ) very close to 1. By inspection of the populations in our numericalsimulations we have checked that this is indeed the case, and more precisely that both Q ,n and Q ,n tend to a Diracpeak on the hard-field η + . The finite value of e C n in the large n limit of the intermediate regime γ d ( b, e ǫ ) < γ < γ r ( b )which is reconstructible without strictly hard-fields arises thus from a delicate compensation in the multiplication ofthe diverging factor x ,n and of the vanishing integral R d Q ,n ( η )(1 − e m ( η )). The relevant contribution of the latterarises from atypical values of η for which Q ,n is of order 1 /x ,n , the typical values of η ≈ η + having 1 − e m ( η ) ≈
2. A reweighting scheme
To handle the difficulty that arises in the intermediate regime γ d < γ < γ r we will adapt the approach we developedin [44] for the uniform measure, introducing a reweighted probability distribution µ n ( η ) that gives less importance to3 γ = 1 . γ = 1 . γ = 1 . γ = 1 . n e C n γ . . . . . . . . . . . . . . . . . FIG. 7: Left: the reduced correlation function e C n as a function of the distance n for b = 0 . e ǫ = 0, and several values of γ larger than the rigidity threshold γ r (0 . ≈ . e C (points) and its hard-field contribution e H (solid line) as a function of γ for b = 0 . e ǫ = 0. The points of e C below the rigidity threshold have been obtained with thereweighted algorithm presented in Sec. VI C 2. e C n e H n n . . . . FIG. 8: The reduced correlation function e C n and its hard-field contribution e H n as a function of the distance n for b = 0 . e ǫ = 0 and γ = 1 .
35, slightly below the rigidity threshold. the typical quasi-hard fields which do not contribute to e C n . We define it as µ n ( η ) = x ,n Q ,n ( η ) s η ( − , η (+ , , (96)the reweighting factor proportional to q η ( − , η (+ , indeed vanishes when η is a hard-field η + . This choice also ensuresthe invariance of µ n under a spin-flip transformation, µ n ( η f ) = µ n ( η ), as can be easily seen from the first equality in(91) with w = 1. Note that µ n is a positive measure, but not a normalized probability measure anymore. One cannevertheless check that its total mass, that we shall denote m n , is finite for all finite n . One has indeed, inverting therelation (96) and exploiting the normalization of Q ,n , x ,n = Z d µ n ( η ) s η (+ , η ( − ,
1) = Z d µ n ( η ) 12 s η (+ , η ( − ,
1) + s η ( − , η (+ , ! = Z d µ n ( η ) η (+ ,
1) + η ( − , p η (+ , η ( − , , (97)where we used the invariance µ n ( η f ) = µ n ( η ) to symmetrize the integrand. Thanks to the inequality betweenarithmetic and geometric means of positive numbers the last integral is larger than m n , which implies finally m n ≤ x ,n .We can thus define a probability distribution ν n by dividing µ n by its total mass, ν n ( η ) = µ n ( η ) /m n . The problem4at hand is equivalently described in terms of the { x w,n , Q w,n } , of µ n , or of the pair ( ν n , m n ); for instance the reducedoverlap can be expressed as e C n = m n Z d ν n ( η ) C ( η ) , C ( η ) = 1 p η (+ , η ( − ,
1) 2 η (+ , η ( − , η (+ ,
0) + η ( − , , (98)where we used (92) to transform the integral over Q ,n as one over Q ,n . As we shall see the reweighted formulationis however much more convenient to study the large n limit, as it avoids the direct manipulation of the divergingquantities x w,n .We will now derive recursion relations for ( ν n , m n ); to do so it will be convenient to first introduce a differentparametrization of the messages η . These are normalized probability distributions on a space of four states ( σ, w ),they can be thus encoded with three real numbers, that we shall choose as u (1) = s η (+ , η ( − , , u (2) = s η (+ , − η (+ , η ( − , − η ( − , , u (3) = s ( η (+ , − η (+ , η ( − , − η ( − , B η (+ , η ( − , . (99)We will group them as a row vector with three columns, u = ( u (1) , u (2) , u (3) ), and define for later use the associatedcanonical basis e (1) = (1 , , e (2) = (0 , , e (3) = (0 , , η = f ( b η , b η , . . . ) definedin (88); it becomes in terms of this parametrization u (1) = Y i s b η i (+ ,
0) + b η i (+ , b η i ( − ,
0) + b η i ( − , , u (2) = Y i s b η i (+ , b η i ( − , , (100) u (3) = Y i s b η i (+ , b η i ( − , b η i (+ ,
0) + b η i (+ , b η i ( − ,
0) + b η i ( − , . This shows that the arguments of the square roots in (99) are non-negative numbers, as they should for the definitionof u to be meaningful. Moreover this expression reveals the motivation for this peculiar choice of parametrization: theBP equation η = f ( b η , b η , . . . ) becomes multiplicative with respect to its arguments when η is expressed in terms of u . We will also use the notation e u = ( e u (1) , e u (2) , e u (3) ) with u = ( e e u (1) , e e u (2) , e e u (3) ); as the components of u are positivethose of e u are real numbers, and the BP equation becomes additive in terms of e u . It will also be useful to define thespin-flip operation on the triplets u and e u ; as η f ( σ, w ) = η ( − σ, w ) one deduces easily from (99) the correspondingtransformations: u f = (cid:18) u (1) , u (2) , u (3) (cid:19) , e u f = ( − e u (1) , − e u (2) , e u (3) ) . (101)In the following we will take the liberty to use the three equivalent parametrizations η , u and e u according to whichone is the most convenient, keeping implicit the relationships between them that we have just defined.Let us now rewrite (87) by translating the image η of the function f in the e u parametrization: Q w,n +1 ( η ) = ∞ X p,q,r =0 P w,n ( p )Po( q ; x ,n )Po( r ; x ,n ) Z D e ǫ α p Y i =1 d Q ,n ( η i ) p + q Y i = p +1 d Q ,n ( η i ) p + q + r Y i = p + q +1 d Q ,n ( η i ) (102) δ e u − V ( α ) − p X i =1 V + ( η fi ) − p + q X i = p +1 V − ( η i ) − p + q + r X i = p + q +1 V + ( η i ) , where we defined V ( α ) = ( α, α, , (103) V + ( η ) = (cid:18)
12 ln (cid:18) η (+ ,
0) + η ( − , η (+ , (cid:19) ,
12 ln (cid:18) η (+ , η (+ , (cid:19) ,
12 ln (cid:18) η (+ , η (+ ,
0) + η ( − , (cid:19)(cid:19) , (104) V − ( η ) = V + ( η ) f = (cid:18) −
12 ln (cid:18) η (+ ,
0) + η ( − , η (+ , (cid:19) , −
12 ln (cid:18) η (+ , η (+ , (cid:19) ,
12 ln (cid:18) η (+ , η (+ ,
0) + η ( − , (cid:19)(cid:19) . (105)5For completeness we also state the expression of V + with its argument translated in the u parametrization, namely V (1)+ ( η ) = 12 ln (cid:18) B u (2) u (3) u (1) + 1( u (1) ) + B u (3) u (1) u (2) (cid:19) , (106) V (2)+ ( η ) = 12 ln (cid:18) B u (2) u (3) u (1) (cid:19) , (107) V (3)+ ( η ) = V (2)+ ( η ) − V (1)+ ( η ) , (108)(109)Because of the additivity property of the parametrizations in terms of e u it is easier to describe Q w,n in terms of itscharacteristic function, that we define as Ξ w,n ( z ) = Z d Q w,n ( η ) e iz · e u , (110)where z = ( z (1) , z (2) , z (3) ) and we denoted the standard scalar product z · e u = z (1) e u (1) + z (2) e u (2) + z (3) e u (3) . Indeedthe equation (102) translates intoΞ w,n +1 ( z ) = e e ǫ ( i ( z (1) + z (2) ) − ( z (1) + z (2) ) ) ∞ X p,q,r =0 P w,n ( p )Po( q ; x ,n )Po( r ; x ,n ) (cid:18)Z d Q ,n ( η ) e iz · V + ( η f ) (cid:19) p (111) (cid:18)Z d Q ,n ( η ) e iz · V − ( η ) (cid:19) q (cid:18)Z d Q ,n ( η ) e iz · V + ( η ) (cid:19) r , (112)where the first factor comes from the Gaussian integration on α . For w = 1 the three integers p, q, r have Poissondistributions, the sums can then be easily performed to obtainΞ ,n +1 ( z ) = exp (cid:20)e ǫ ( i ( z (1) + z (2) ) −
12 ( z (1) + z (2) ) ) − x ,n − x ,n + x ,n Z d Q ,n ( η ) e iz · V − ( η ) + x ,n Z d Q ,n ( η ) (cid:16) e iz · V + ( η ) + e iz · V + ( η f ) (cid:17)(cid:21) . (113)We can now come back to the reweighted measure µ n we introduced in (96), and its normalized version ν n , forwhich we define the characteristic functions similarly b µ n ( z ) = Z d µ n ( η ) e iz · e u , b ν n ( z ) = Z d ν n ( η ) e iz · e u = 1 m n b µ n ( z ) . (114)The reweighting factor q η ( − , η (+ , between µ n and Q ,n can be expressed as e − e u (1) , the characteristic functions of thesetwo measures are thus linked by a simple shift of their arguments: µ n ( η ) = x ,n Q ,n ( η ) e − e u (1) ⇐⇒ b µ n ( z ) = x ,n Ξ ,n ( z + ie (1) ) . (115)Using this shift of argument in (113), and recalling from (77) that x ,n +1 = e − γ + x ,n we obtain: b µ n +1 ( z ) = exp " − γ − e ǫ − e ǫ z (1) + z (2) ) − x ,n − x ,n + x ,n Z d Q ,n ( η ) s η (+ ,
0) + η ( − , η (+ , e iz · V − ( η ) + x ,n Z d Q ,n ( η ) s η (+ , η (+ ,
0) + η ( − , e iz · V + ( η ) + s η ( − , η (+ ,
0) + η ( − , e iz · V + ( η f ) ! . (116)We will now trade the integrations over Q ,n and Q ,n for integrals over µ n , thanks to the change of densities expressedin (92) and (96). We will also write x ,n + x ,n = 2 x ,n + Be − γ according to (77), and write 2 x ,n as an integral over6 µ n following (97). This yields b µ n +1 ( z ) = exp " − γ − B e − γ − e ǫ − e ǫ z (1) + z (2) ) + Z d µ n ( η ) s η (+ ,
0) + η ( − , η ( − , e iz · V − ( η ) + Z d µ n ( η ) η (+ , p η ( − , η (+ ,
0) + η ( − , e iz · V + ( η ) + η (+ , p η (+ , η (+ ,
0) + η ( − , e iz · V + ( η f ) ! − Z d µ n ( η ) η (+ ,
1) + η ( − , p η (+ , η ( − , . (117)Using the invariance under spin-flip of µ n one can regroup the two terms in the second line of this equation; simplifyingthe prefactors one obtains b µ n +1 ( z ) = exp (cid:20) − γ − B e − γ − e ǫ − e ǫ z (1) + z (2) ) + Z d µ n ( η ) s η (+ ,
0) + η ( − , η ( − , (cid:16) e iz · V + ( η ) + e iz · V − ( η ) (cid:17) − η (+ ,
1) + η ( − , p η (+ , η ( − , ! . (118)This is a recursion equation for the reweighted measure µ n (and its characteristic function b µ n ). It will be moreconvenient in the following to work with the pair ( ν n , m n ); the mass m n of µ n can be expressed as b µ n (0), we thusobtain m n +1 = exp (cid:20) − γ − B e − γ − e ǫ m n Z d ν n ( η ) M ( η ) (cid:21) , (119) b ν n +1 ( z ) = exp (cid:20) − e ǫ z (1) + z (2) ) + m n Z d ν n ( η ) L ( η ) (cid:16) e iz · V + ( η ) + e iz · V − ( η ) − (cid:17)(cid:21) , (120)where we introduced the functions M ( η ) = p η (+ ,
0) + η ( − , p η (+ ,
1) + p η ( − , − η (+ , − η ( − , p η (+ , η ( − , , (121) L ( η ) = s η (+ ,
0) + η ( − , η ( − , . (122)The initial condition for the recursion on ( ν n , m n ) is obtained from the one on Q , given in (89): m = exp (cid:20) − γ + e − γ (cid:18) √ b − b − (cid:19) − e ǫ (cid:21) , (123) ν ( η ) = ∞ X q,r =0 Po (cid:18) q ; e − γ √ b (cid:19) Po (cid:18) r ; e − γ √ b (cid:19) Z d α √ π e ǫ e − α e ǫ δ (cid:18)e u − V (cid:18) α + q − r b (cid:19)(cid:19) . (124)For completeness we give here the expressions of the functions we introduced in terms of the u -parametrization: L ( η ) = r B u (1) u (3) u (2) + ( u (1) ) + Bu (1) u (2) u (3) , (125) M ( η ) = s B u (2) u (3) u (1) + 1( u (1) ) + B u (3) u (1) u (2) − u (1) (126)+ r B u (1) u (3) u (2) + ( u (1) ) + Bu (1) u (2) u (3) − u (1) , (127) C ( η ) = 2 (cid:16) B u (2) u (3) u (1) (cid:17) (cid:16) B u (1) u (3) u (2) (cid:17) u (1) + Bu (2) u (3) + u (1) + B u (3) u (2) . (128)7
3. A Gaussian approximation for the quasi-hard fields
We have obtained above the recursion equations (119,120) for the scalar m n and the probability distribution ν n ,complemented by the initial conditions (123,124). We will now discuss the possibility to solve numerically thisrecursion with a population representation of ν n , and its advantages with respect to the direct resolution in terms of Q w,n . To do so let us first rewrite the recursion equation (120) on ν n as b ν n +1 ( z ) = exp (cid:20) − e ǫ z (1) + z (2) ) (cid:21) exp (cid:20)Z d π n ( e u )( e iz · e u − (cid:21) , (129)where we have introduced a measure π n of total mass we shall denote λ n , according to π n ( e u ) = m n Z d ν n ( η ) L ( η ) ( δ ( e u − V + ( η )) + δ ( e u − V − ( η ))) , (130) λ n = Z d π n ( e u ) = 2 m n Z d ν n ( η ) L ( η ) = m n Z d ν n ( η ) s η (+ ,
0) + η ( − , η (+ , η ( − ,
1) ( p η (+ ,
1) + p η ( − , , (131)where the last expression of λ n has been obtained by symmetrizing the integrand.According to the equation (129) a random variable e u drawn from ν n +1 can be decomposed as the sum of tworandom variables, one Gaussian distributed and the other with a compound Poisson distribution. More explicitly onehas the following equality in distribution, e u d = α ( e (1) + e (2) ) + P pi =1 e u i where α is a Gaussian with zero mean andvariance e ǫ , p is extracted from a Poisson law of mean λ n , and the e u i ’s are i.i.d. copies extracted from π n /λ n . If ν n is known as an empirical distribution over a sample then it is possible to draw e u i from the probability law π n /λ n by extracting a field η in the population representing ν n with a probability proportional to L ( η ), and then setting e u i = V δ ( η ) with δ = ± with equal probability 1 /
2. It seems then possible to use this distributional interpretationto solve numerically the recursion on ( m n , ν n ). However this is doable in practice only if λ n remains bounded when n grows, otherwise one falls back on the problem we wanted to avoid of having to manipulate a diverging numberof summands. As a matter of fact the reweighting has not offered a free lunch from this point of view: it turns outthat λ n diverges if and only if x ,n does, in other words if and only if γ < γ r ( b ). This statement is a consequenceof the bounds c − ( b ) x ,n ≤ λ n ≤ c + ( b ) x ,n , where c ± ( b ) are positive constants, the proof of which we defer to theAppendix B for the sake of readability.Fortunately the reweighting procedure we followed will help us to handle the divergence of λ n more easily than theone of x ,n in the direct recursion. Indeed the divergence of λ n comes from the contributions of fields for which L ( η )becomes very large; the crucial point is that these η yield very small values of V ± ( η ), we can thus make a Gaussianapproximation for this sum of a very large number of very small random variables. To put this idea at work we rewrite(129) by decomposing it as b ν n +1 ( z ) = exp (cid:20) − e ǫ z (1) + z (2) ) (cid:21) b ν ( ≤ ) n +1 ( z ) b ν ( > ) n +1 ( z ) , (132)with b ν ( ≤ ) n +1 ( z ) = exp (cid:20) m n Z d ν n ( η ) L ( η ) (cid:16) e iz · V + ( η ) + e iz · V − ( η ) − (cid:17) I [ L ( η ) ≤ ξ n ] (cid:21) , (133) b ν ( > ) n +1 ( z ) = exp (cid:20) m n Z d ν n ( η ) L ( η ) (cid:16) e iz · V + ( η ) + e iz · V − ( η ) − (cid:17) I [ L ( η ) > ξ n ] (cid:21) , (134)where ξ n is a threshold that is arbitrary for the moment, we shall specify it later on. The decomposition (132) meansthat under the law ν n +1 the random variable e u is the sum of the Gaussian random variable described previously andof two random variables, one with the law ν ( ≤ ) n +1 , the other with the law ν ( > ) n +1 .We describe the distribution ν ( ≤ ) n +1 using the interpretation explained above, defining π ( ≤ ) n ( e u ) = m n Z d ν n ( η ) L ( η ) ( δ ( e u − V + ( η )) + δ ( e u − V − ( η ))) I [ L ( η ) ≤ ξ n ] , (135) λ ( ≤ ) n = Z d π ( ≤ ) n ( e u ) = 2 m n Z d ν n ( η ) L ( η ) I [ L ( η ) ≤ ξ n ] . (136)8Under the law ν ( ≤ ) n +1 the variable e u obeys the distributional equality e u = P pi =1 e u i where p is a Poisson variable of mean λ ( ≤ ) n , and the e u i ’s are i.i.d copies extracted from π ( ≤ ) n /λ ( ≤ ) n .The contribution ν ( > ) n +1 is instead approximated by a multivariate Gaussian G ( V n , Σ n ) with V n and Σ n the meanand the covariance matrix of ν ( > ) n +1 , computed by taking derivatives of ln b ν ( > ) n +1 with respect to z : V n = m n Z d ν n ( η ) L ( η ) I [ L ( η ) > ξ n ] ( V + ( η ) + V − ( η )) , (137)Σ ( a ) , ( b ) n = m n Z d ν n ( η ) L ( η ) I [ L ( η ) > ξ n ] ( V + ( η ) ( a ) V + ( η ) ( b ) + V − ( η ) ( a ) V − ( η ) ( b ) ) , (138)for a, b ∈ { , , } . As V − ( η ) = ( V + ( η )) f several components of V n and Σ n vanish, namely V (1) n = V (2) n = Σ (1) , (3) n =Σ (2) , (3) n = 0.Replacing ν ( > ) n +1 by a Gaussian is an approximation, that amounts to neglect the cumulants of order larger than 2,the accuracy of which is controlled by the cutoff ξ n . The larger is ξ n the better the truncation is, because a smallerpart of the full law ν n +1 is treated approximatively, but the price to pay is a simultaneous increase of λ ( ≤ ) n , the averagenumber of fields that must be summed in the description of ν ( ≤ ) n +1 . A compromise needs thus to be found betweenthese two effects, we explain below how we fixed ξ n in practice.
4. Algorithmic implementation
We now give an explicit description of the algorithm we implemented to solve the recursion equations (119,120) for m n and ν n . Suppose that at the n -th step of the iteration we have an estimation of m n and of ν n , with ν n representedas a population of fields: ν n ( η ) ≃ N N X i =1 δ ( η − η i ) . (139)One can evaluate the average of an arbitrary function A with respect to ν n as Z d ν n ( η ) A ( η ) ≃ N N X i =1 A ( η i ) , (140)and in particular compute in this way m n +1 from (119). We further assume that the fields η i have been sorted byincreasing values of L ( η ), and translate the cutoff ξ n by defining the index N n such that L ( η N n ) ≤ ξ n < L ( η N n +1 ).The integrals where the indicator function I [ L ( η ) ≤ ξ n ] (resp. I [ L ( η ) > ξ n ]) can thus be translated as sums over thepopulation elements from 1 to N n (resp. from N n + 1 to N ), which allows to compute easily λ ( ≤ ) n from (136), and V n and Σ n from (137) and (138).Each of the N elements of the new population representing ν n +1 is then generated independently of the others, bytranslating equation (132) as follows: • draw an integer p from the Poisson law of mean λ ( ≤ ) n . • extract i , . . . , i p independently in { , . . . , N n } with probability proportional to L ( η i ) (this can be done efficientlyby precomputing a cumulative table). • insert in the new population e u = V δ ( η i ) + · · · + V δ p ( η i p ) + α ( e (1) + e (2) ) + V n + g where the δ i ’s are ± withequal probability, α is a centered Gaussian random variable of variance e ǫ , and g is a centered three-dimensionalGaussian vector with covariance matrix Σ n .We can then compute the reduced overlap e C n +1 from (98), and sort the elements of the new population according totheir values of L ( η ).In practice we chose the threshold ξ n (or equivalently N n ) in an adaptive way: for each iteration step we took thelargest N n ≤ N that gave λ ( ≤ ) n ≤ λ , where λ is a parameter fixed beforehand. The accuracy of this numerical procedureis thus controlled by N , the approximation in (139) being better and better as N grows, and by λ , the Gaussiantruncation being more precise when λ is larger. Obviously the memory and time requirements of the procedure alsoincrease with N and λ ; the numerical results presented below have been obtained with population sizes N between10 and 10 , and λ around 20, we checked that the conclusions were not modified, within our numerical accuracy, bymodifying these values in a reasonable range.9 n e C n . . . . . . n /m n . . . . . . . . FIG. 9: The reduced correlation function e C n (left panel) and the inverse of the mass m n of µ n (right panel) for b = 0 . e ǫ = 0,and from left to right in both panels, γ = 0 .
95, 0 .
96, 0 .
97, 0 .
98, 0 .
99, 1.
D. Results
In Fig. 9 we complete our study of the case b = 0 . e ǫ = 0 that was started in Figs. 7 and 8. The reweightingprocedure allows now to investigate the regime γ < γ r ≈ . e C remains bounded for values of γ down to 0 .
98, these results being reported as a function of γ in theright panel of Fig. 7. Further simulations allowed us to pinpoint more precisely γ d ( b = 0 . , e ǫ = 0) ≈ . γ for which e C n diverges. As shown in the right panel of Fig. 9 the condition of divergence of e C n coincides, within our numerical accuracy, with the divergence of m n . The value of the large n limit of m n is seen tobe close to 1 when γ reaches γ d from above (see the plateau in the right panel of Fig. 9), an observation that we alsomade for the other values of b we investigated. We have given analytical arguments in [44] that indeed the plateauvalue of m n is exactly equal to 1 at γ d for the uniform measure, our numerical results suggest that this remains truewhen ( b, e ǫ ) = (1 , γ d ( b, e ǫ ) for various values of b and e ǫ , and we present now thephase diagrams obtained in this way. Consider first the left panel of Fig. 10, which deals with the case b = 1, i.e. thebias factorized over the hyperedges considered in [39]. We see that for all values of e ǫ = 0 one has γ d (1 , e ǫ ) < γ d (1 ,
0) = γ d , u , i.e. this bias has, in the large k limit, a detrimental effect on the dynamic phase transition that is pushed tolower values with respect to the one of the uniform measure. In the right panel of Fig. 10 we have plotted insteadthe threshold γ d as a function of b for e ǫ = 0; one sees now that decreasing b below 1 (that corresponds to the uniformmeasure and is marked as an horizontal dashed line on the figure) has a beneficial effect with an increase of γ d . Thelargest value we could reach was for b = 0 .
4, decreasing b further below reduces again γ d . The lowest value of b wecould investigate was b = 0 .
3, for b < . ν n exhibitingstrong fluctuations that prevented an accurate representation as a population. Finally in Fig. 11 we have checked thatthe parameter e ǫ has a detrimental effect also for values of b = 1, we found indeed that γ d ( b, e ǫ ) < γ d ( b,
0) when e ǫ = 0,for all the values of b we considered. This leads us to the conclusion that, within the biasing strategy we consideredin the large k limit, the optimal choice of parameters is e ǫ = 0 and b ≈ .
4, yielding a constant γ d ≈ . γ d , u ≈ . b . Since we obtain an optimum for the dynamicalthreshold when b is finite, we expect to have a smaller dynamical threshold if we choose a different scaling (i.e b goingto 0 or to + ∞ in the large k limit). VII. CONCLUSIONS
We have performed in this article a quantitative study of the biasing strategy for random constraint satisfactionproblems, focusing on the k -hypergraph bicoloring problem and a bias coupling variables at distance 1 on the hyper-graph. We have determined the dynamic transition both for finite k via a numerical resolution of the 1RSB equations,and in the large k limit through a partly analytic asymptotic expansion. We have shown that the increased range ofsoft interactions with respect to the ones factorized over the hyperedges enhance the efficiency of the bias by furtherpushing the dynamic transition to higher constraint densities, and in particular in the large k limit we have achieved0 γ e ǫ γ r γ d γb . . . . . . . . . FIG. 10: Left panel: γ d as a function of e ǫ for b = 1, the line being a guide to the eye. Right panel: γ d as a function of b for e ǫ = 0, the solid line corresponding to the rigidity upperbound γ r ( b ). b = 3 . b = 1 . b = 1 . b = 0 . b = 0 . γ e ǫ FIG. 11: γ d as a function of e ǫ for various b . a constant γ d ≈ .
977 in the asymptotic expansion α = 2 k − (ln k + ln ln k + γ ), strictly greater than the one obtainedin the absence of bias, γ d , u ≈ . γ d remaining itself smaller than the rigidity threshold γ r , u = 1of the uniform measure. However, the conceptual link between this transition and the important algorithmic gapexplained in the introduction justifies for us the efforts devoted to achieve this improvement, and calls for furtherinvestigations. The natural question that arises is to determine the optimal asymptotic scaling of α d that can beachieved for a biased measure incorporating soft interactions of an arbitrary but finite range, and in particular atwhich order of the asymptotic expansion does α d exceeds α d , u . This seems quite a challenge in this fully general form,but partial results could certainly be obtained by extending our study, for instance considering more general formsof ψ ( p ) than the one of equation (6), retaining more information on the local configuration than just the number offorcing clauses around one variable, or increasing the range of interactions to distance 2. Both positive or negativeresults, i.e. the impossibility to increase α d beyond the third order term of the asymptotic expansion, would shed lighton the intrisic difficulty of random CSPs, putting barriers for larger and larger families of algorithms, in the spiritof [31–33]. Finally it would also be interesting to extend this study to other CSPs, in particular the k -satisfiabilityand the q -coloring problems.1 Acknowledgments
We thank Federico Ricci-Tersenghi and Lenka Zdeborova for useful discussions. GS is part of the PAIL grant ofthe French Agence Nationale de la Recherche, ANR-17-CE23-0023-01.
Appendix A: Existence and uniqueness of the RS solution
In this appendix we shall show that the translationally invariant RS equation (21) admits a unique solution for allchoices of the bias function ψ that is strictly positive, ψ ( p ) > ∀ p ∈ { , . . . l + 1 } .We first remark that in the uniform case, where ψ ( p ) is a positive constant independent of p , the equation (21)obviously admits a unique solution (with y = 1, b y = 2 k − − ψ varies in its allowed domain. To achieve this we first rewrite (21) in the equivalent form G ( x ; ψ ) = l +1 X p =0 ψ ( p ) X p ( x ) = 0 , (A1)where for simplicity we denoted x = 1 / b y and where the coefficients X p ( x ) are: X p ( x ) = x p (cid:20) (2 k − − k − (cid:18) lp (cid:19) x + ( k − (cid:18) lp − (cid:19) − (cid:18) lp (cid:19)(cid:21) (A2)= x p l + 1 (cid:18) l + 1 p (cid:19) (cid:2) (2 k − − k − l + 1 − p ) x − ( l + 1 − kp ) (cid:3) ; (A3)in the first line we used the convention (cid:0) ll +1 (cid:1) = (cid:0) l − (cid:1) = 0.The function G ( x ; ψ ) introduced in (A1) depends smoothly on its two arguments (polynomially in x , and linearlyin ψ ), the number of solutions x ( ψ ) of the equation G = 0 can thus only change at a bifurcation point, i.e. a pair( x ; ψ ) such that G ( x ; ψ ) = ∂ x G ( x ; ψ ) = 0, otherwise the implicit function theorem allows to smoothly continue anybranch of solution. As we remarked above the solution is unique when ψ is independent of p , the uniqueness for all ψ will then follow if we show the absence of solution to the bifurcation equation: l +1 X p =0 ψ ( p ) X p ( x ) = 0 , l +1 X p =0 ψ ( p ) Y p ( x ) = 0 , ψ ( p ) ≥ ∀ p , (A4)where Y p ( x ) = ∂X p ∂x = x p − l + 1 (cid:18) l + 1 p (cid:19) (cid:2) (2 k − − k − l + 1 − p )( p + 1) x − ( l + 1 − kp ) p (cid:3) . (A5)This equation being invariant under the multiplication of ψ by a positive constant we can further assume the normal-ization condition l +1 X p =0 ψ ( p ) = 1 . (A6)For a given value of x , the existence of a ψ satisfying (A4,A6) is equivalent to the origin of R being in the convexhull of the l + 2 points of coordinates X p ( x ) Y p ( x ) ! for p = 0 , . . . , l + 1. We can then invoke the Caratheodory theoremthat states that any point of the convex hull of a set A ⊂ R d can be written as the convex combination of d + 1points of A . Here d = 2, so the absence of solutions of (A4,A6) follows from the impossibility to satisfy, for any p, q, r ∈ { , . . . , l + 1 } , the system α p X p ( x ) + α q X q ( x ) + α r X r ( x ) = 0 , (A7) α p Y p ( x ) + α q Y q ( x ) + α r Y r ( x ) = 0 , (A8) α p ≥ , α q ≥ , α r ≥ , α p + α q + α r > . (A9)This is equivalent to the three quantities X p ( x ) Y q ( x ) − X q ( x ) Y p ( x ), X q ( x ) Y r ( x ) − X r ( x ) Y q ( x ) and X r ( x ) Y p ( x ) − X p ( x ) Y r ( x ) being of the same sign; using the expressions (A3,A5) of X and Y we have checked the impossibility ofthis condition, for all x > p, q, r , which concludes the reasoning.2 Appendix B: An inequality
We provide in this Appendix a proof of the bounds c − ( b ) x ,n ≤ λ n ≤ c + ( b ) x ,n that we used in Sec. VI C 3. Westart by stating some inequalities that are fulfilled by the messages η in the support of ν n , and that are consequencesof the BP equation (88). They are more compactly stated in terms of the u -parametrization; from (100) one obtainsindeed u (2) u (3) u (1) = Y i b η i (+ , b η i (+ ,
0) + b η i (+ , , u (1) u (3) u (2) = Y i b η i ( − , b η i ( − ,
0) + b η i ( − , , (B1)which allows to conclude that u (2) u (3) u (1) ≤ u (1) u (3) u (2) ≤
1, for all the η ’s in the support of ν n .Consider now the expressions (97) for x ,n and (131) for λ n ; the ratio of the integrands in these two equations reads2 p η (+ ,
0) + η ( − , p η (+ ,
1) + p η ( − , η (+ ,
1) + η ( − ,
1) = s η (+ ,
0) + η ( − , η (+ ,
1) + η ( − , s η (+ , η (+ ,
1) + η ( − ,
1) + 2 s η ( − , η (+ ,
1) + η ( − , ! . The parenthesis in the right hand side of this equation is of the form 2( √ x + √ − x ) for some x in [0 , , √ u -parametrization, s
11 + ( u (1) ) (cid:18) B u (1) u (2) u (3) + ( u (1) ) + B u (1) u (3) u (2) (cid:19) . (B2)Consider first the case b ≤
1, i.e. B ≥
0; as the components of u are non-negative the expression in (B2) iscertainly lower bounded by 1. Moreover the bounds u (2) u (3) u (1) ≤ u (1) u (3) u (2) ≤ √ B = 1 / √ b . The case b ≥ B ≤ / √ b, c − ( b ) x ,n ≤ λ n ≤ c + ( b ) x ,n , with for b ≤ c − ( b ) = 2, c + ( b ) = 2 p /b , and for b ≥ c − ( b ) = 2 / √ b , c + ( b ) = 2 √ [1] M. Garey and D. Johnson. Computers and intractability: A guide to the theory of NP-completeness . Freeman, SanFrancisco, 1979.[2] C. H. Papadimitriou.
Computational complexity . Addison-Wesley, 1994.[3] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, and L. Troyansky. 2+p-SAT: Relation of typical-case complexity tothe nature of the phase transition.
Random Structures and Algorithms , , 414 (1999).[4] G. Biroli, R. Monasson, and M. Weigt. A variational description of the ground state structure in random satisfiabilityproblems. Eur. Phys. J. B , , 551 (2000).[5] M. M´ezard, G. Parisi, and R. Zecchina. Analytic and Algorithmic Solution of Random Satisfiability Problems. Science , , 812–815 (2002).[6] S. Mertens, M. M´ezard, and R. Zecchina. Threshold values of random K-SAT from the cavity method. Random Struct.Algorithms , (3), 340–373 (2006).[7] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborova. Gibbs states and the set of solutions ofrandom constraint satisfaction problems. Proceedings of the National Academy of Sciences , (25), 10318–10323 (2007).[8] D. Achlioptas and F. Ricci-Tersenghi. On the solution-space geometry of random constraint satisfaction problems. In Proc. of 38th STOC , pages 130–139, New York, NY, USA, 2006. ACM.[9] D. Achlioptas and A. Coja-Oghlan. Algorithmic barriers from phase transitions. In
Proceedings of FOCS 2008 , page 793,2008.[10] M. Molloy. The freezing threshold for k-colourings of a random graph. In
Proceedings of the 44th symposium on Theory ofComputing , page 921. ACM, 2012.[11] J. Ding, A. Sly, and N. Sun. Proof of the Satisfiability Conjecture for Large K. In
Proceedings of the Forty-seventh AnnualACM Symposium on Theory of Computing , STOC ’15, pages 59–68, 2015.[12] E. Friedgut. Sharp Thresholds of Graph Properties, and the k -sat Problem. Journal of the AMS , , 1017–1054 (1999).[13] J. Franco. Results related to threshold phenomena research in satisfiability: lower bounds. Theor. Comput. Sci. , , 147(2001).[14] D. Achlioptas. Lower bounds for random 3-SAT via differential equations. Theor. Comput. Sci. , (1-2), 159–185 (2001).[15] O. Dubois. Upper bounds on the satisfiability threshold. Theor. Comput. Sci. , , 187 (2001). [16] D. Achlioptas and C. Moore. The Asymptotic Order of the Random K -SAT Threshold. In Proceedings of the 43rdSymposium on Foundations of Computer Science , FOCS ’02, pages 779–788, Washington, DC, USA, 2002. IEEE ComputerSociety.[17] D. Achlioptas and Y. Peres. The threshold for random k -SAT is 2 k log 2 − O ( k ). Journal of the AMS , , 947–973 (2004).[18] M. M´ezard, G. Parisi, and M. A. Virasoro. Spin-Glass Theory and Beyond , volume 9 of
Lecture Notes in Physics . WorldScientific, Singapore, 1987.[19] M. M´ezard and A. Montanari.
Information, Physics, and Computation . Oxford Press, Oxford, 2009.[20] F. Krzakala, A. Pagnani, and M. Weigt. Threshold values, stability analysis and high- q asymptotics for the coloringproblem on random graphs. Phys. Rev. E , , 046705 (2004).[21] J. Ding, A. Sly, and N. Sun. Satisfiability Threshold for Random Regular nae-sat. Communications in MathematicalPhysics , (2), 435–489 (2016).[22] E. Mossel and Y. Peres. Information flow on trees. Ann. Appl. Probab. , (3), 817–844 (2003).[23] M. M´ezard and A. Montanari. Reconstruction on Trees and Spin Glass Transition. J. Stat. Phys. , , 1317–1350 (2006).[24] A. Montanari and G. Semerjian. Rigorous Inequalities between Length and Time Scales in Glassy Systems. J. Stat. Phys. , , 23 (2006).[25] M. M´ezard and G. Parisi. The Bethe lattice spin glass revisited. Eur. Phys. J. B , , 217 (2001).[26] B. Selman, H. A. Kautz, and B. Cohen. Noise strategies for improving local search. In Proc. 12th AAAI , pages 337–343,Menlo Park, CA, USA, 1994. AAAI Press.[27] J. Ardelius and E. Aurell. Behavior of heuristics on large and hard satisfiability problems.
Phys. Rev. E , , 037702 (2006).[28] M. Alava, J. Ardelius, E. Aurell, P. Kaski, S. Krishnamurthy, P. Orponen, and S. Seitz. Circumspect descent prevails insolving random constraint satisfaction problems. Proceedings of the National Academy of Sciences , (40), 15253–15257(2008).[29] R. Marino, G. Parisi, and F. Ricci-Tersenghi. The Backtracking Survey Propagation Algorithm for Solving Random K-SATProblems. Nature Communications , , 12996 (2016).[30] A. Coja-Oghlan. A Better Algorithm for Random k-SAT. SIAM Journal on Computing , (7), 2823–2864 (2010).[31] D. Gamarnik and M. Sudan. Performance of Sequential Local Algorithms for the Random NAE- K -SAT Problem. SIAMJournal on Computing , (2), 590–619 (2017).[32] A. Coja-Oghlan, A. Haqshenas, and S. Hetterich. Walksat Stalls Well Below Satisfiability. SIAM Journal on DiscreteMathematics , (2), 1160–1173 (2017).[33] S. Hetterich. Analysing Survey Propagation Guided Decimation on Random Formulas. arXiv preprint arXiv:1602.08519 ,(2016).[34] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi. Optimization by Simulated Annealing. Science , , 671–680 (1983).[35] A. Braunstein, L. Dall’Asta, G. Semerjian, and L. Zdeborova. The large deviations of the whitening process in randomconstraint satisfaction problems. Journal of Statistical Mechanics: Theory and Experiment , (5), 053401 (2016).[36] C. Baldassi, A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina. Local entropy as a measure for sampling solutions inconstraint satisfaction problems. Journal of Statistical Mechanics: Theory and Experiment , (2), 023301 (2016).[37] C. Baldassi, C. Borgs, J. T. Chayes, A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina. Unreasonable effectivenessof learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes. Proceedings of theNational Academy of Sciences , (48), E7655–E7662 (2016).[38] T. Maimbourg, M. Sellitto, G. Semerjian, and F. Zamponi. Generating dense packings of hard spheres by soft interactiondesign. SciPost Phys. , , 39 (2018).[39] L. Budzynski, F. Ricci-Tersenghi, and G. Semerjian. Biased landscapes for random constraint satisfaction problems. Journal of Statistical Mechanics: Theory and Experiment , (2), 023302 (2019).[40] H. Zhao and H.-J. Zhou. Maximally flexible solutions of a random K -satisfiability formula. arXiv preprint arXiv:2006.07023 ,(2020).[41] A. Sly. Reconstruction of Random Colourings. Communications in Mathematical Physics , (3), 943–961 (2009).[42] A. Montanari, R. Restrepo, and P. Tetali. Reconstruction and Clustering in Random Constraint Satisfaction Problems. SIAM Journal on Discrete Mathematics , (2), 771–808 (2011).[43] A. Sly and Y. Zhang. Reconstruction of colourings without freezing. arXiv preprint arXiv:1610.02770 , (2016).[44] L. Budzynski and G. Semerjian. The asymptotics of the clustering transition for random constraint satisfaction problems. arXiv preprint arXiv:1911.09377 , (2019).[45] A. Braunstein, M. M´ezard, and R. Zecchina. Survey propagation: An algorithm for satisfiability. Random Struct. Algo-rithms , (2), 201–226 (2005).[46] A. Braunstein, M. M´ezard, M. Weigt, and R. Zecchina. Constraint Satisfaction by Survey Propagation. In A. Percus,G. Istrate, and C. Moore, editors, Computational Complexity and Statistical Physics , page 107. Oxford University Press,2003.[47] G. Parisi. On local equilibrium equations for clustering states. arXiv:cs.CC/0212047, 2002.[48] A. Braunstein and R. Zecchina. Survey propagation as local equilibrium equations.
J. Stat. Mech. , P06007 (2004).[49] E. N. Maneva, E. Mossel, and M. J. Wainwright. A new look at survey propagation and its generalizations.
J. ACM , (4)(2007).[50] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference . Morgan Kaufmann Publishers Inc.,San Francisco, CA, USA, 1988.[51] F. R. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm.
IEEE Trans. Inform.Theory , (2), 498–519 (2001). [52] J. Yedidia, W. Freeman, and Y. Weiss. Understanding Belief Propagation and Its Generalizations. In Exploring ArtificialIntelligence in the New Millennium , pages 239–236. Science & Technology Books, 2003.[53] H. Kesten and B. P. Stigum. Additional Limit Theorems for Indecomposable Multidimensional Galton-Watson Processes.
The Annals of Mathematical Statistics , , 1463 (1966).[54] J. R. L. de Almeida and D. J. Thouless. Stability of the Sherrington-Kirkpatrick Solution of a Spin-Glass Model. J. Phys.A , , 983–990 (1978).[55] F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborov´a. Typology of phase transitions in Bayesian inference problems. Phys.Rev. E , , 042109 (2019).[56] R. Abou-Chacra, D. J. Thouless, and P. W. Anderson. A selfconsistent theory of localization. Journal of Physics C: SolidState Physics ,6