Circumventing spin glass traps by microcanonical spontaneous symmetry breaking
arXiv [cond-mat.dis-nn], Jul
Hai-Jun Zhou
CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China
School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
Synergetic Innovation Center for Quantum Effects and Applications, Hunan Normal University, Changsha 410081, China
E-mail: [email protected]
Abstract.
The planted $p$-spin interaction model is a paradigm of random-graph systems possessing both a ferromagnetic ground state and an intermediate spin glass phase. Conventional simulated annealing and message-passing algorithms cannot reach the planted ground state; they are trapped by an exponential number of spin glass states. Here we propose discontinuous microcanonical spontaneous symmetry breaking (MSSB) as a simple mechanism to circumvent all the spin glass traps. The existence of a discontinuous MSSB phase transition is confirmed by microcanonical Monte Carlo simulations. We conjecture that the planted ground state could be retrieved in polynomial time by applying machine-learning methods (such as perceptron learning) to microcanonically sampled independent configurations. Three candidate algorithms are proposed.
1. Introduction
The planted $p$-spin interaction model on a finite-connectivity random graph is a representative ferromagnetic system with an intermediate spin glass phase. This model system has played an important role in understanding the physics of structural glasses [1, 2]. It is also relevant and challenging to the field of statistical inference [3, 4]; it is equivalent to the generalized Sourlas code problem in information science [5, 6] and the planted maximum XORSAT problem in computer science [7].
When the environmental temperature slowly decreases, as occurs in simulated annealing (SA) dynamics [8], the system is predicted to experience an equilibrium phase transition from the disordered paramagnetic phase to the ordered ferromagnetic phase. But this transition never occurs in practice when the interactions are many-body in nature ($p \geq$ [...] $N$ and the crystal-nucleation mechanism then simply fails. The system instead remains in the paramagnetic phase as temperature further decreases [1, 2], until finally it is frozen in one of the exponentially many disordered [...] $p$-spin system but is a common property of many large inference problems such as the planted satisfiability and coloring problems [12, 13, 14]. When the ground state is completely masked by an exponential number of disordered configurations (for example through quiet planting as in the $p$-spin system and the coloring problem [14]), and SA and message-passing processes are trapped by spin glass states, it seems the only resort is brute-force enumeration, which is of course infeasible even for moderate system sizes $N$.
In this work we point out the existence of an alternative route to the planted ground state, namely the route of microcanonical spontaneous symmetry breaking (MSSB). This is a discontinuous transition from the paramagnetic (or disordered symmetric, DS) phase to the microcanonical polarized (MP) phase.
The microcanonically stable MP phase has so far been largely overlooked in the literature, except for a recent detailed analysis concerning the Potts model [15]. Because the energy density of the system is fixed along the whole DS-to-MP transition trajectory, the difficulty of climbing huge energy barriers is completely avoided (Fig. 1).
The DS and MP phases are connected by many transition trajectories at the given fixed energy density, and therefore ergodicity between these two phases is preserved. The MP phase is entropically favored over the DS phase because it contains exponentially [...]
[Figure 1 graphic: vertical axis "Energy"; labels "Ocean of Configurations", "MSSB", "SA"]
Figure 1.
An illustration of the energy landscape of a planted $p$-body interaction spin system ($p \geq$ [...]
[...] $p = 3$ (each interaction involving three vertices), which have the property that their ground state is unique. Very interestingly, we find that these DS configurations actually contain information about the unique planted ground state. It may then be possible to infer the planted solution from the sampled DS configurations through machine-learning techniques. We propose three different inference strategies, based respectively on the ideas of perceptron learning, hyperplane optimization, and curve fitting, to solve the planted $p$-spin model. By a simple scaling analysis we conjecture that the planted ground state can be inferred in polynomial time. We hope to confirm this conjecture in the near future by extensive numerical simulations, and then to extend this work to other hard planted systems. Our work also calls for a theoretical understanding of the interesting asymmetric behavior shown in Fig. 4.
2. Theoretical predictions
Consider a planted $p$-spin interaction system in which each vertex $j \in \{1, \ldots, N\}$ participates in $K$ interactions (clauses) and each clause involves $p$ randomly chosen vertices (we set $p = 3$ in all the following numerical computations). The total number of clauses is $M = KN/p$. The energy of a generic configuration $\vec{\sigma} \equiv (\sigma_1, \ldots, \sigma_N)$ is
$$E(\vec{\sigma}) = -\sum_{a=1}^{M} J_a \prod_{j \in \partial a} \sigma_j , \tag{1}$$
where $\sigma_j \in \{\pm 1\}$ is the spin of vertex $j$ and $\partial a$ denotes the set of vertices constrained by clause $a$. We denote by $u$ the energy density of the system, namely
$$u = \frac{E(\vec{\sigma})}{N} . \tag{2}$$
There is a planted spin configuration $\vec{\sigma}^{0} \equiv (\sigma^{0}_1, \ldots, \sigma^{0}_N)$ dictating the coupling constant of clause $a$, such that
$$J_a = \begin{cases} +\prod_{j \in \partial a} \sigma^{0}_j & (\text{probability } 1 - \varepsilon) , \\ -\prod_{j \in \partial a} \sigma^{0}_j & (\text{probability } \varepsilon) . \end{cases} \tag{3}$$
The parameter $\varepsilon$ is the noise level of the above planting rule. When $\varepsilon >$ [...] $p$-body interactions [5, 6]. When $\varepsilon$ is [...] $\vec{\sigma}^{0}$ is a ground state of Eq. (1) and it may lie extensively below all the other minimum-energy configurations (Fig. 1). We define the overlap (magnetization) $m$ of configuration $\vec{\sigma}$ with respect to $\vec{\sigma}^{0}$ as
$$m = \frac{1}{N} \sum_{j=1}^{N} \sigma_j \sigma^{0}_j . \tag{4}$$
This random-graph system has been intensively studied by the mean-field cavity method of statistical mechanics [16, 17, 10, 18]. The mean-field results obtained at $\varepsilon = 0$ and $K = 10$ are shown in Fig. 2, and the qualitatively identical results obtained for $K = 4$ are shown in Fig. A1 (Appendix A). Some aspects of these theoretical results are similar to what was found for the Potts model [15], but there is a key qualitative difference to be discussed at the end of this section. First, let us interpret these [...]
[Figure 2 graphic: (a) $f$ versus $\beta$; (b) $s$ versus $u$; (c) $\beta$ versus $u$; (d) $m$ versus $u$; curves DS, MP, CP and MMC data points; marked values $\beta_f$, $\beta_d$, $u_d$, $u_{\rm mic}$]
Figure 2.
The DS (solid lines), MP (dotted lines), and CP (long dashed lines) fixed points of the mean-field theory for the planted 3-body model on random graphs of degree $K = 10$ at noise $\varepsilon = 0$. (a) Free energy density $f$ versus canonical inverse temperature $\beta$. (b) Entropy density $s$ versus energy density $u$. (c) Microcanonical inverse temperature $\beta$ versus $u$. (d) Mean overlap $m$ versus $u$. $\beta_d$ and $u_d$: critical inverse temperature and energy density at the dynamical SG phase transition; $u_{\rm mic}$: critical energy density at the MSSB phase transition; $\beta_f$: critical inverse temperature at the canonical ferromagnetic phase transition. The predicted MSSB phase transition is confirmed by MMC simulations (circles) on a single problem instance of size $N = 960$.
[...] $u$ decreases to the spin glass (SG) dynamical transition point $u_d$ this DS phase suddenly divides into an exponential number of ergodicity-broken macro-states, and the system then enters the SG phase [1]. The value of $u_d$ is independent of $\varepsilon$ because of gauge symmetry [10] and can be determined by the tree-reconstruction method [17] (see also textbooks [4, 18]). Because of planting, a stable canonical polarized (CP) phase containing configurations similar to $\vec{\sigma}^{0}$ exists in the configuration space at sufficiently low energies. This CP phase is simply the conventional ferromagnetic phase.
The microcanonical polarized (MP) phase, corresponding to the unstable fixed point of the mean-field theory with higher free energy density [Fig. 2(a)], serves as the watershed between the DS and CP phases [Fig. 2(c)]. Configurations of the MP phase have positive magnetizations and therefore are similar to $\vec{\sigma}^{0}$, but they are unstable in the canonical ensemble. The MP phase can only be explored by fixing the energy density $u$.
The entropy density $s$ of the MP phase as a function of energy density $u$ has two branches [Fig. 2(b)].
The higher-entropy branch corresponds to the MP configurations that are stable at fixed $u$, and its entropy starts to surpass that of the DS phase at the critical energy density $u_{\rm mic}$, indicating a discontinuous microcanonical spontaneous symmetry-breaking phase transition. This transition leads to a jump in the mean magnetization and a drop in the microcanonical inverse temperature $\beta$, which are verified by our simulation results [Figs. 2(c) and 2(d)]. The lower-entropy MP branch, on the other hand, always has lower entropy than the DS phase; it marks the watershed between the MP and DS phases in the microcanonical ensemble.
We observe that the MSSB transition energy density $u_{\rm mic}$ is located above the SG transition point $u_d$ when the noise level $\varepsilon$ is low enough (Table 1). In principle it is then possible to reach the MP phase by inducing an MSSB transition at a fixed value of $u \in (u_d, u_{\rm mic})$, avoiding the spin glass traps of the canonical ensemble. We discuss the practical feasibility of this proposal in the next two sections. A relevant observation is that the DS fixed point of the $p$-spin model is always locally [...]
Table 1.
Magnetization jump $\Delta m$, microcanonical inverse temperature drop $\Delta \beta$, and critical energy density $u_{\rm mic}$ at the MSSB phase transition of the planted 3-body model on random graphs of degree $K$ (noise $\varepsilon = 0$). The critical energy densities $u_d$ are also listed (based on Table 4.2 of [18]).
$K$              [...]    [...]    [...]    [...]    [...]    [...]    [...]
$\Delta m$       0.906    0.821    0.758    0.712    0.676    0.647    0.[...]
$\Delta \beta$   -[...]   -[...]   -[...]   -[...]   -[...]   -[...]   -[...]
$u_{\rm mic}$    -[...]   -[...]   -[...]   -[...]   -[...]   -[...]   -[...]
$u_d$            -[...]   -[...]   -[...]   -[...]   -[...]   -[...]   -[...]
[...] $m$ always exists between the DS and MP phases down to this minimal energy density [Fig. 2(d)]. This feature is significantly different from what was observed in the Potts model (Fig. 2(d) of Ref. [15]). The DS phase of the Potts model becomes unstable below a certain threshold energy density. By fixing the energy density below this threshold value, the system will then evolve from the DS phase to the MP phase gradually. Such a gradual evolution is not possible for the $p$-spin model because of the gap in the order parameter $m$.
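The model of Eqs. (1)-(4) can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names (`make_planted_instance`, `energy`, `overlap`) are hypothetical, and the random-graph construction (shuffling $K$ copies of every vertex label, cutting the list into $p$-tuples, and retrying until no clause repeats a vertex) is one simple way to obtain a degree-$K$ instance, not necessarily the construction used for the results reported here.

```python
import numpy as np

def make_planted_instance(N, K, p, eps, rng, max_tries=10000):
    """Return (clauses, J, sigma0) for a planted p-spin instance (hypothetical helper).

    Every vertex appears in exactly K clauses; each clause contains p distinct vertices.
    """
    if (K * N) % p != 0:
        raise ValueError("K*N must be divisible by p")
    M = K * N // p                                   # number of clauses, M = KN/p
    stubs = np.repeat(np.arange(N), K)               # K copies of every vertex label
    for _ in range(max_tries):
        clauses = rng.permutation(stubs).reshape(M, p)
        s = np.sort(clauses, axis=1)
        if np.all(s[:, 1:] != s[:, :-1]):            # no vertex repeated inside a clause
            break
    else:
        raise RuntimeError("could not build a valid clause set")
    sigma0 = rng.choice([-1, 1], size=N)             # planted configuration
    J = np.prod(sigma0[clauses], axis=1)             # noiseless couplings, Eq. (3)
    J[rng.random(M) < eps] *= -1                     # flip each coupling with probability eps
    return clauses, J, sigma0

def energy(sigma, clauses, J):
    """Total energy of Eq. (1)."""
    return -int(np.sum(J * np.prod(sigma[clauses], axis=1)))

def overlap(sigma, sigma0):
    """Overlap (magnetization) of Eq. (4)."""
    return float(np.mean(sigma * sigma0))

rng = np.random.default_rng(0)
clauses, J, sigma0 = make_planted_instance(N=30, K=4, p=3, eps=0.0, rng=rng)
E_planted = energy(sigma0, clauses, J)               # equals -M at zero noise
E_flipped = energy(-sigma0, clauses, J)              # equals +M because p is odd
u_planted = E_planted / 30                           # energy density, Eq. (2)
```

At zero noise every clause is satisfied by the planted configuration, so its energy is $-M$, while the globally flipped configuration has energy $+M$ for odd $p$.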
3. Microcanonical Monte Carlo
We implement a simple MMC dynamics to explore configurations in the vicinity of an objective energy $E_o \equiv N u$ with $u \in (u_d, u_{\rm mic})$. After the system reaches $E_o$ from a random initial configuration through SA or irreversible energy diving, we keep updating its configuration $\vec{\sigma}$ by single-spin flips. An elementary MMC trial consists of picking a vertex $j$ uniformly at random, proposing a flip $\sigma_j \to -\sigma_j$, and accepting this flip if the new configuration energy does not exceed $E_o$. We sample configurations at fixed intervals and record their overlap values and energies. The microcanonical inverse temperature is estimated through
$$\beta = \frac{1}{4} \ln \Bigl( 1 + \frac{4}{E_o - \langle E(\vec{\sigma}) \rangle} \Bigr) , \tag{5}$$
where $\langle E(\vec{\sigma}) \rangle$ is the averaged energy of the configuration samples [20].
Our MMC dynamics is an unbiased and ergodic random walk within the microcanonical configuration space. As the evolution time goes to infinity, every configuration of this space is visited with equal frequency, and the system will then certainly be in the MP phase, as this phase is exponentially dominant in statistical weight. Empirically we have observed that this MMC dynamics achieves MSSB in small-sized systems [Fig. 3(b)]. It is easier for the MSSB transition to occur in networks of larger degree $K$. This is consistent with the mean-field theoretical results, which reveal that the entropy barrier and the gap of the order parameter $m$ both decrease with $K$ (Table 1). The hardest regular random (RR) problem instances are those with degree $K = 4$.
But when $N$ becomes larger ($N > 350$ for $K = 4$, $N >$ [...] for $K = 10$, and $N >$ [...] for $K = 20$) the waiting time needed to observe an MSSB transition starts to increase exponentially with $N$. This slowing down is due to an entropy-barrier effect [Fig. 3(a)].
On the coarse-grained level the MMC dynamics is a one-dimensional random walk under a potential field $-s_u(m)$, where $s_u(m)$ is the entropy density of configurations at energy density $u$ having overlap $m$ with the planted configuration $\vec{\sigma}^{0}$. At each elementary trial, $m$ may change to $m \pm 2/N$ with probability proportional to $(1 \mp g_m)/2$. The bias ratio is
$$g_m = \frac{1 - e^{2 s'_u(m)}}{1 + e^{2 s'_u(m)}} , \tag{6}$$
with $s'_u(m) \equiv {\rm d} s_u(m) / {\rm d} m$ being the slope of $s_u(m)$ at $m$. Because $s'_u(m) < 0$ for $m \in (0, m^{*})$, with $m^{*}$ being the watershed point of $s_u(m)$, the occurrence of MSSB is an exponentially rare first-passage event characterized by a waiting time of order $O(e^{N [s_u(0) - s_u(m^{*})]})$ [21, 22].
[Figure 3 graphic: (a) schematic $s_u(m)$ versus $m$ with the biased random-walk steps $m \to m \pm 2/N$ of weights $(1 \mp g_m)/2$, and an example trajectory of $m$ versus MMC time $t$; (b) waiting time $W$ versus $N$ for $K = 4$, $8$, $10$, $20$]
Figure 3.
The time complexity of observing an MSSB transition. (a) A schematic curve of the entropy density $s_u(m)$ of the planted 3-body model at fixed energy density $u$ and noise $\varepsilon = 0$, showing a local maximum at overlap $m \approx 0$, a global maximum at $m$, and a minimum at $m^{*}$. The MMC dynamics is equivalent to a one-dimensional biased random walk with bias ratio $g_m$. The example evolution trajectory shows an MSSB event at $u = -$[...]$.61$ in a random graph of degree $K = 10$ and size $N = 960$ (one unit of MMC time $t$ means $N$ spin-flip trials). (b) The simulated waiting time $W$ for single problem instances. Each point is obtained by simulating 24 independent MMC evolution trajectories on a single network and then setting $W$ to be the length of the shortest trajectory: $K = 4$, $u = -$[...]$.133$ (circles); $K = 8$, $u = -$[...]; $K = 10$, $u = -$[...]$.61$ (pluses); $K = 20$, $u = -$[...]. [...] $c N^{b}$ reference relations.
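The MMC rule described in this section (propose a single-spin flip, accept it whenever the new energy does not exceed $E_o$) can be sketched in a few lines of Python. This is an illustrative sketch, not the simulation code behind the figures. The names `random_regular_clauses`, `State`, `dive`, `mmc`, and `creutz_beta` are hypothetical. The temperature estimator is the standard demon formula of Creutz [20], written under the assumption that single-flip energy changes are multiples of 4 (true for even $K$), and the demo aligns $E_o$ with the reachable energy levels for that reason.

```python
import numpy as np

def random_regular_clauses(N, K, p, rng, max_tries=10000):
    """Hypothetical helper: each vertex in exactly K clauses, p distinct vertices per clause."""
    stubs = np.repeat(np.arange(N), K)
    for _ in range(max_tries):
        clauses = rng.permutation(stubs).reshape(-1, p)
        s = np.sort(clauses, axis=1)
        if np.all(s[:, 1:] != s[:, :-1]):
            return clauses
    raise RuntimeError("no valid clause set found")

class State:
    """Spin configuration with incrementally updated clause terms J_a * prod(sigma_j)."""
    def __init__(self, clauses, J, sigma):
        self.sigma = sigma.copy()
        self.terms = J * np.prod(sigma[clauses], axis=1)
        self.E = -int(self.terms.sum())
        self.vclauses = [np.flatnonzero((clauses == j).any(axis=1))
                         for j in range(len(sigma))]

    def delta(self, j):
        # Flipping sigma_j negates every clause term containing j, so dE = 2 * sum(terms).
        return 2 * int(self.terms[self.vclauses[j]].sum())

    def flip(self, j):
        self.E += self.delta(j)
        self.terms[self.vclauses[j]] *= -1
        self.sigma[j] *= -1

def dive(state, E_o, rng, max_trials=10**6):
    """Irreversible energy diving: accept only non-increasing flips until E <= E_o."""
    N, trials = len(state.sigma), 0
    while state.E > E_o:
        if trials >= max_trials:
            raise RuntimeError("diving did not reach E_o")
        j = rng.integers(N)
        if state.delta(j) <= 0:
            state.flip(j)
        trials += 1

def mmc(state, E_o, n_sweeps, rng):
    """Microcanonical Monte Carlo; one sweep is N single-spin-flip trials."""
    N = len(state.sigma)
    energies = np.empty(n_sweeps, dtype=int)
    for t in range(n_sweeps):
        for _ in range(N):
            j = rng.integers(N)
            if state.E + state.delta(j) <= E_o:      # energy ceiling, no Boltzmann factor
                state.flip(j)
        energies[t] = state.E
    return energies

def creutz_beta(E_o, energies):
    """Demon estimator, assuming demon energies E_o - E in {0, 4, 8, ...}."""
    return 0.25 * np.log1p(4.0 / (E_o - np.mean(energies)))

rng = np.random.default_rng(1)
N, K, p = 60, 4, 3
clauses = random_regular_clauses(N, K, p, rng)
sigma0 = rng.choice([-1, 1], size=N)
J = np.prod(sigma0[clauses], axis=1)                 # noise level eps = 0
state = State(clauses, J, rng.choice([-1, 1], size=N))
target = -30                                         # roughly u = -0.5
E_o = target - ((target - state.E) % 4)              # align ceiling with reachable levels
dive(state, E_o, rng)
energies = mmc(state, E_o, n_sweeps=200, rng=rng)
beta = creutz_beta(E_o, energies)
```

The demo uses a small system and a fairly high energy ceiling so that it runs in a moment; observing an actual MSSB event requires $u \in (u_d, u_{\rm mic})$ and much longer runs, as discussed above.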
4. Asymmetry of overlap distribution
When $p$ is odd, so that each clause of the energy (1) involves an odd number of vertices, the planted configuration $\vec{\sigma}^{0}$ is a ground state of the system but the globally flipped one ($-\vec{\sigma}^{0}$) is not. Furthermore, $\vec{\sigma}^{0}$ will be the unique ground state if $K \geq$ [...] $M$ is not too small. We conceive that since the entropy density $s_u(m)$ is "M"-shaped with a minor peak at $m \approx$ [...] not be strictly symmetric in the vicinity of $m = 0$ when $N$ is finite. Instead, for relatively large overlap magnitudes, $s_u(+|m|)$ may slightly exceed $s_u(-|m|)$. As the probability $P(m)$ of observing an overlap value $m$ is related to $s_u(m)$ by $P(m) \propto e^{N s_u(m)}$, we can quantify the magnitude of entropy asymmetry by examining the $P(m)$ profile. Some representative simulation results for $p = 3$ are shown in Fig. 4 and in Fig. A3 (Appendix A).
Consistent with our expectations, the overlap $m$ is more likely to be positive when $|m| >$ [...] $N^{-0.5}$. On the other hand, we were initially quite puzzled to notice that $m$ is significantly biased toward negative values for smaller values of $|m|$, especially at $|m| \approx$ [...] $N^{-0.5}$. After some detailed analysis we now understand that this peculiar behavior is caused by the fact that the most probable overlap value (denoted as $m_0$) of the distribution $P(m)$ is not located exactly at zero but is slightly negative ($m_0 < 0$) [...] $P(m)$ is systematically biased more towards the positive side of $m_0$ than towards the negative side [Fig. 4(b)].
[Figure 4 graphic: (a) $N [P(+m) - P(-m)]$ versus $\sqrt{N} m$, with inset $\sqrt{N} P(m)$ versus $\sqrt{N} m$; (b) $P(m)$ versus $|m - m_0|$ for the branches $m > m_0$ and $m < m_0$, with $m_0 = -0.000205$]
Figure 4.
Asymmetry of the overlap distribution $P(m)$ for the planted 3-body model at noise $\varepsilon = 0$. (a) Rescaled probability difference $N [P(+|m|) - P(-|m|)]$ versus rescaled overlap $\sqrt{N} m$ for a single random-graph system of degree $K = 4$ at $u = -$[...]: size $N = 2004$ (pluses), 4008 (crosses), 8016 (circles), and 16032 (squares). The inset shows $\sqrt{N} P(m)$ versus $\sqrt{N} m$. (b) The two branches of $P(m)$ at $m > m_0$ and $m < m_0$ for the system of size $N = 16032$, with $m_0 = -0.000205$.
The numerical results of Fig. 4(a) and Fig. A3 also reveal that the histograms for different system sizes roughly superimpose onto each other after rescaling $P(m)$ by $N^{-[...]}$ and $m$ by $N^{-[...]}$. From this scaling behavior we infer that the probabilities of the overlap $m$ being non-positive ($p_{\leq}$) and non-negative ($p_{\geq}$) change with $N$ as
$$p_{\leq} = \frac{1}{2} + \gamma N^{-1/2} , \qquad p_{\geq} = \frac{1}{2} - \gamma N^{-1/2} , \tag{7}$$
and the mean and mean-squared overlaps decay with $N$ according to
$$\langle m \rangle = \mu N^{-1} , \qquad \langle m^{2} \rangle = N^{-1} . \tag{8}$$
The values of the asymmetry coefficients $\gamma$ and $\mu$ are determined by two competing effects, namely that the most probable overlap value $m_0$ is negative and that $P(m_0 + |\Delta m|) > P(m_0 - |\Delta m|)$ for any deviation $\Delta m$ of the overlap from $m_0$. We expect that these two coefficients are distinct from zero in general. Given a problem instance, we can estimate $\gamma$ and $\mu$ by MMC assuming $\vec{\sigma}^{0} = (1, \ldots$ [...] $\gamma = 0.$[...] $\pm$ [...], $\mu = 0.$[...] $\pm$ [...] [...] $K = 4$ and size $N = 4008$ at $u = -1.135$ of Fig. 4(a), by averaging over $7.$[...] $\times$ [...] configuration samples through the bootstrap method (Appendix B).
[...] $P(m)$ persists even when the energy density $u$ is higher than the MSSB phase transition value $u_{\rm mic}$. This fact is demonstrated in Fig. A2 (Appendix A) by MMC simulation results obtained at different energy densities $u$ on a single RR network instance of size $N = 4008$ and degree $K = 4$. Therefore the condition $u < u_{\rm mic}$ is not strictly necessary to explore the asymmetry of $P(m)$. Of course the extent of asymmetry in $P(m)$ decreases as $u$ increases, and $P(m)$ should be perfectly symmetric at $u = 0$.
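As an illustration of how the asymmetry coefficients could be extracted from a list of sampled overlaps, the following Python sketch may help. It rests on assumptions stated here rather than in the text above: the finite-size scalings are taken to be $p_{\leq} - p_{\geq} = 2 \gamma N^{-1/2}$ and $\langle m \rangle = \mu N^{-1}$, the error bars use the standard error of an empirical mean as in Appendix B, and the function names are hypothetical.

```python
import numpy as np

def sd_of_mean(x):
    """Standard error of the empirical mean: sqrt((mean(x^2) - mean(x)^2) / n_samples)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt((np.mean(x ** 2) - np.mean(x) ** 2) / x.size))

def asymmetry_coefficients(overlaps, N):
    """Estimate (gamma, sd_gamma, mu, sd_mu) from overlaps sampled at system size N."""
    m = np.asarray(overlaps, dtype=float)
    # x_gamma is +sqrt(N), 0, or -sqrt(N) for m < 0, m = 0, and m > 0, respectively,
    # so its mean is sqrt(N) * (p_le - p_ge) = 2 * gamma under the assumed scaling.
    x_gamma = np.sqrt(N) * ((m <= 0).astype(float) - (m >= 0).astype(float))
    x_mu = N * m                                     # sum_i sigma_i * sigma0_i
    gamma = 0.5 * float(np.mean(x_gamma))
    mu = float(np.mean(x_mu))
    return gamma, 0.5 * sd_of_mean(x_gamma), mu, sd_of_mean(x_mu)

# Tiny hand-made example (not simulation data), just to show the call.
gamma, sd_gamma, mu, sd_mu = asymmetry_coefficients([-0.02, 0.0, 0.02, 0.04], N=100)
```

In practice the overlaps would come from a long MMC run, and the error bars shrink as the inverse square root of the number of independent samples.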
5. Inference on microcanonical configuration samples
Equation (7) indicates that a single configuration sample $\vec{\sigma}$ contains $O(N^{-[...]})$ bits of information about the planted configuration $\vec{\sigma}^{0}$. It may then be possible to infer $\vec{\sigma}^{0}$ directly from a sufficiently large number of configurations sampled at fixed energy density $u$, without waiting for a rare MSSB transition event. Here we discuss three different inference strategies.
Assuming $\mu$ of Eq. (8) to be non-zero, we may construct a perceptron-learning problem to infer $\vec{\sigma}^{0}$ (strategy A). The recipe is quite straightforward. We sample $Q$ independent configurations $\vec{\sigma}(t)$ at equal time intervals, where $t$ is the index of a sampled configuration, and then add them together to form a composite vector $\vec{r} \equiv (r_1, \ldots, r_N) = \sum_{t=1}^{Q} \vec{\sigma}(t)$. Define the alignment $L$ between $\vec{r}$ and $\vec{\sigma}^{0}$ as $L \equiv \sum_{i=1}^{N} r_i \sigma^{0}_i$. Because of Eq. (8) the random variable $L$ follows approximately a Gaussian distribution with mean $\mu Q$ and standard deviation $\sqrt{N Q}$. Therefore, if $Q$ is set proportional to $N$ with $Q \geq (5/\mu)^{2} N$, so that $|\mu| Q \geq 5 \sqrt{N Q}$, then the signs of $L$ and $\mu$ will very likely be the same. As an example, this number is $Q \approx$ [...] for the problem instance of size $N = 4008$ and degree $K = 4$ in Fig. 4(a). After getting a large number $X$ of such independent composite vectors by parallel computing [28], we can feed them to a perceptron to infer $\vec{\sigma}^{0}$.
The inferred probability $P_A(\vec{\sigma})$ of the planted configuration being $\vec{\sigma}$ in this perceptron-learning task is
$$P_A(\vec{\sigma}) \propto \prod_{\ell=1}^{X} \Theta \Bigl( \theta_\mu \sum_{i=1}^{N} r^{(\ell)}_i \sigma_i \Bigr) , \tag{9}$$
where $\Theta(x)$ is the Heaviside function with $\Theta(x) = 1$ for $x > 0$ and $\Theta(x) = 0$ for $x \leq 0$, and $r^{(\ell)}_i$ is the $i$-th entry of the $\ell$-th composite sample vector [23, 24, 25, 26, 27]. The parameter $\theta_\mu \in \{\pm 1\}$ is the sign of $\mu$, which only affects the global sign of the inferred configuration $\vec{\sigma}$. We may simply assume $\theta_\mu = 1$ and later fix the optimal sign of $\vec{\sigma}$ by energy comparison.
We should be able to achieve almost perfect inference of $\vec{\sigma}^{0}$ by setting $X \geq N$ [27]. The total number of sampled independent configurations is then of order $O(N^{2})$. It may be reasonable to assume that one needs $O(N)$ spin-flip trials to pick one independent [...] $O(N^{[...]})$, much shorter than the exponential time complexity of the naive random-walk strategy.
Perfect inference is actually not necessary. We could compute the mean values of the planted spins $\sigma^{0}_i$ through Eq. (9) and then use them to guide the MMC dynamics through the entropy valley around $m^{*}$ [Fig. 3(a)].
Given $\mathcal{N}$ independent configuration samples, the hyperplane defined by the planted configuration $\vec{\sigma}^{0}$ will split these configurations into two groups such that one group contains order $O(\mathcal{N} / \sqrt{N})$ more configurations than the other group. The hyperplane perpendicular to a random spin configuration will also split these configurations into two groups, but their size difference will only be of order $O(\sqrt{\mathcal{N}})$. We expect that when $\mathcal{N}$ reaches order $O(N^{[...]})$ the hyperplane of the planted configuration will be the unique optimal choice to separate the $\mathcal{N}$ configuration samples in the most uneven way. The corresponding optimization problem has the following cost function
$$C(\vec{\sigma}) = \sum_{\ell=1}^{\mathcal{N}} {\rm Sign} \Bigl( \sum_{i=1}^{N} \sigma^{(\ell)}_i \sigma_i \Bigr) , \tag{10}$$
where ${\rm Sign}(x)$ is the sign function defined by ${\rm Sign}(x) = 1$ for $x > 0$, ${\rm Sign}(x) = -1$ for $x < 0$, and ${\rm Sign}(x) = 0$ for $x = 0$, and $\sigma^{(\ell)}_i$ is the $i$-th entry of the $\ell$-th sampled configuration. The minimum-cost solution of Eq. (10) may be reachable by simulated annealing dynamics or by message-passing algorithms. The method of principal component analysis might also be helpful [29].
Both this optimization strategy (strategy B) and the perceptron-learning strategy A have the advantage that one does not need to know the value of the hyper-parameter $\varepsilon$. A conceptual disadvantage is that they are not applicable to problem instances with even $p$-values, because the overlap distribution $P(m)$ must be symmetric when every interaction involves an even number of vertices. This disadvantage might be overcome by the strategy of the next subsection.
The distribution $P(m)$ of overlaps $m$ with respect to the planted configuration $\vec{\sigma}^{0}$, viewed as a function of $m$, should not depend on the details of $\vec{\sigma}^{0}$. This means that we could estimate $P(m)$ without knowing $\vec{\sigma}^{0}$ simply by setting the coupling constants $J_a$ of Eq. (1) independently as $-1$ and $+1$ with probabilities $\varepsilon$ and $1 - \varepsilon$, respectively. On the other hand, we can sample a large number of independent configurations $\vec{\sigma}^{(\ell)} \equiv (\sigma^{(\ell)}_1, \ldots, \sigma^{(\ell)}_N)$, with indices $\ell = 1, \ldots, \mathcal{N}$, for the planted $p$-spin model (1) with the original set of coupling constants.
Then the following inference problem (strategy C) concerning the probability $P_C(\vec{\sigma})$ of the unknown planted configuration $\vec{\sigma}^{0}$ arises:
$$P(m) = \frac{1}{\mathcal{N}} \sum_{\ell=1}^{\mathcal{N}} \sum_{\vec{\sigma}} P_C(\vec{\sigma}) \, \delta \Bigl( m - \frac{1}{N} \sum_{i=1}^{N} \sigma^{(\ell)}_i \sigma_i \Bigr) , \tag{11}$$
where $\delta(x)$ is the Dirac delta function. We may assume $P_C(\vec{\sigma})$ to be factorized such that
$$P_C(\vec{\sigma}) = \prod_{i=1}^{N} \Bigl[ \frac{1 + \nu_i \sigma_i}{2} \Bigr] , \tag{12}$$
where $\nu_i$ is the inferred mean value of the $i$-th planted entry $\sigma^{0}_i$. Then Eq. (11) can be approximated as a sum of many Gaussian distributions [27]
$$P(m) = \frac{1}{\mathcal{N}} \sum_{\ell=1}^{\mathcal{N}} \frac{1}{\sqrt{2 \pi \Lambda}} \exp \Bigl( - \frac{\bigl( m - \frac{1}{N} \sum_{i=1}^{N} \sigma^{(\ell)}_i \nu_i \bigr)^{2}}{2 \Lambda} \Bigr) , \tag{13}$$
with $\Lambda = \sum_{i=1}^{N} (1 - \nu_i^{2}) / N^{2}$ being a variance parameter.
Equation (13) is essentially a curve-fitting problem with $N$ adjustable real parameters $\nu_i$. This problem may be solvable by various algorithms developed in the machine-learning community. We expect that when the number $\mathcal{N}$ of configuration samples becomes sufficiently large, the inferred first moments $\nu_i$ will offer a highly faithful prediction of the true planted configuration $\vec{\sigma}^{0}$.
An advantage of this inference strategy C is that it may also be applicable when the hyper-parameter $p$ of model (1) is even-valued. For even $p$, of course, $P(m)$ will be a symmetric function of $m$, but because configurations with larger overlap magnitudes $|m|$ are sampled more frequently in a planted system, the form of $P(m)$ may deviate significantly from a Gaussian, making it useful for the inference task (13).
As mentioned in the preceding subsection, to evaluate the overlap distribution $P(m)$ we need to discard the original coupling constants $J_a$ of the planted model (1) and assign $\pm$ [...] $\varepsilon$. But the value of this hyper-parameter $\varepsilon$ may often be unknown.
This lack of information should not cause a fundamental difficulty. We may simply set $\varepsilon$ to a set of different values between 0 and [...]. For each fixed value of $\varepsilon$ we can get the corresponding $P(m)$ function and use it as input to the inference problem (13), while the set of configuration samples $\vec{\sigma}^{(\ell)}$ is the same. We may be able to achieve the best fitting performance when the assigned $\varepsilon$ is close to the true value of this noise level.
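The bookkeeping behind strategies A and B can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a tested inference pipeline. The sample-size rule assumes the five-standard-deviation criterion $|\mu| Q \geq 5 \sqrt{N Q}$, which follows from the Gaussian form of the alignment $L$ (mean $\mu Q$, standard deviation $\sqrt{N Q}$). All function names are hypothetical, and the small arrays in the demo are hand-made rather than MMC samples.

```python
import math
import numpy as np

def required_Q(mu, N, n_sigma=5.0):
    """Smallest Q with |mu| * Q >= n_sigma * sqrt(N * Q), i.e. Q >= (n_sigma / mu)^2 * N."""
    return math.ceil((n_sigma / mu) ** 2 * N)

def composite_vectors(samples, Q):
    """Sum consecutive groups of Q sampled configurations into composite vectors r."""
    samples = np.asarray(samples)
    X = samples.shape[0] // Q                        # number of complete groups
    return samples[: X * Q].reshape(X, Q, -1).sum(axis=1)

def is_consistent(sigma, R, theta_mu=1):
    """Indicator of Eq. (9): every composite vector has positive alignment with sigma."""
    return bool(np.all(theta_mu * (R @ sigma) > 0))

def split_cost(sigma, samples):
    """Cost function of Eq. (10): signed imbalance of the split induced by sigma."""
    return int(np.sign(np.asarray(samples) @ sigma).sum())

# Hand-made toy data: four configurations of N = 3 spins.
samples = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [-1, 1, 1]])
R = composite_vectors(samples, Q=2)                  # two composite vectors
candidate = np.array([1, 1, 1])
ok = is_consistent(candidate, R)
cost = split_cost(candidate, samples)
Q_needed = required_Q(mu=0.5, N=100)
```

A perceptron-learning or message-passing solver would search over candidate configurations for one satisfying `is_consistent` on all composite vectors, while strategy B would instead optimize `split_cost` over candidates; neither search is implemented here.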
6. Summary
In summary, we picked the $p$-spin interaction model as an example to demonstrate the potential of microcanonical spontaneous symmetry breaking for solving hard inference [...] $O(c N^{[...]})$ elementary localized updates, possibly with a large prefactor $c$. The finite-size scaling behaviors (7) and (8) are quite fascinating and they call for a thorough theoretical understanding. Numerical implementation of the three machine-learning strategies of Section 5 will be needed to check the validity of our conjecture. We are also starting to work on other challenging planted ensembles of optimization problems, such as $K$-satisfiability and $Q$-coloring [12, 14, 13], to gain more insight into the MSSB mechanism.
Acknowledgments
The author thanks Yuliang Jin and Pan Zhang for helpful conversations. This work was supported by the National Natural Science Foundation of China Grants No. 11975295 and No. 11947302, and the Chinese Academy of Sciences Grant No. QYZDJ-SSW-SYS018. Numerical simulations were carried out at the Tianwen clusters of ITP-CAS and the Tianhe-2 platform of the National Supercomputer Center in Guangzhou.
Appendix A. Additional numerical results
The mean-field theoretical results obtained for the planted 3-body spin model on a regular random network of degree $K = 4$ are summarized in Fig. A1. Qualitatively speaking, these theoretical results are identical to the results shown in Fig. 2 for $K = 10$. The circles in Fig. A1(c) are simulation results obtained on a single network instance of size $N = 10002$. Because of the huge entropy barrier and the large gap of the order parameter $m$ between the DS and MP phases, the waiting time to observe an MSSB transition is much longer than the simulation time, and consequently such a transition does not occur in our simulations.
Figure A2 shows how the asymmetry of the probability distribution $P(m)$ changes with the energy density $u$, for a single random-network instance of degree $K = 4$ and size $N = 4008$. From the fact that the different curves superimpose onto each other after rescaling the probability distributions by $|u|$ [Fig. A2(b)], we infer that the asymmetry of $P(m)$ decreases linearly with $|u|$ and vanishes as $u$ approaches zero.
We show in Fig. A3 the probability profiles of the overlap $m$ in the vicinity of $m = 0$ for single RR network instances of degree $K = 10$. These results are obtained by sampling a huge number of spin configurations at fixed energy density $u = -$[...]$.61$ through the MMC evolution dynamics. This figure confirms that the asymmetric features revealed in Fig. 4(a) are quite general for RR networks of different degrees $K$. For the system of size $N = 15360$, we estimate the asymmetry parameters to be $\mu = 0.$[...] $\pm$ $0.006$ and $\lambda = 1.$[...] $\pm$ [...] [...] $m = 0.$[...] [...] $\mathcal{N} = 4.$[...] $\times$ [...] for this problem instance).
[Figure A1 graphic: (a) $f$ versus $\beta$; (b) $s$ versus $u$; (c) $\beta$ versus $u$; (d) $m$ versus $u$; curves DS, MP, CP and MMC data points; marked values $\beta_f$, $\beta_d$, $u_d$, $u_{\rm mic}$]
Figure A1.
The DS (solid lines), MP (dotted lines), and CP (long dashed lines) fixed points of the mean-field theory for the planted 3-body model on random graphs of degree $K = 4$ (noise $\varepsilon \to 0$). (a) Free energy density $f$ versus canonical inverse temperature $\beta$. (b) Entropy density $s$ versus energy density $u$. (c) Microcanonical inverse temperature $\beta$ versus $u$. Circular symbols are MMC simulation results obtained on a single problem instance of size $N = 10002$. (d) Mean overlap $m$ versus $u$. $\beta_d$ and $u_d$: critical inverse temperature and energy density at the dynamical SG phase transition; $u_{\rm mic}$: critical energy density at the MSSB phase transition; $\beta_f$: critical inverse temperature at the canonical ferromagnetic phase transition.
Appendix B. Bootstrap statistics analysis
We estimate the statistical errors of the scaling coefficients $\lambda$ and $\mu$ of Eqs. (7) and (8) by the bootstrap method [18, 30].
Consider a random variable $x$ which can take $m$ different values $x_1, \ldots, x_m$. We are given a large number $\mathcal{N}$ of independent samples of this random variable, among which $n_i$ samples take the value $x_i$ ($n_i \geq 0$ for $i = 1, \ldots, m$ and $\sum_{i=1}^{m} n_i = \mathcal{N}$). An empirical mean of this random variable is then constructed as
$$\overline{x} = \frac{1}{\mathcal{N}} \sum_{i=1}^{m} n_i x_i . \tag{B.1}$$
Because the sample set size $\mathcal{N}$ is finite, the empirical mean $\overline{x}$ may deviate from the true mean of the random variable $x$.
[Figure A2 graphic: (a) $[P(+m) - P(-m)]$ (unit $10^{-[...]}$) versus $m$; (b) $[P(+m) - P(-m)] / |u|$ (unit $10^{-[...]}$) versus $m$; curves for $u = -0.10$, $-0.54$, $-0.98$, $-1.135$]
Figure A2.
Probability difference (unit $10^{-[...]}$) between positive and negative overlap values $m$ in a single random-network system of $K = 4$ and $N = 4008$. The energy density is $u = -0.10$ (pluses), $-0.54$ (crosses), $-0.984$ (circles), and $-1.135$ (diamonds). Panel (b) is a rescaled plot of panel (a), with the vertical axis being the ratio between the probability difference and $|u|$.
[Figure A3 graphic: (a) $N [P(+m) - P(-m)]$ versus $\sqrt{N} m$, with inset $\sqrt{N} P(m)$ versus $\sqrt{N} m$; (b) $P(m)$ versus $|m - m_0|$ for the branches $m > m_0$ and $m < m_0$, with $m_0 = -0.000282$]
Figure A3.
Asymmetry of the overlap distribution $P(m)$ for the planted 3-body model on single random graphs of degree $K = 10$ at noise $\varepsilon = 0$. (a) Rescaled probability difference $N [P(+|m|) - P(-|m|)]$ versus rescaled overlap $\sqrt{N} m$ at $u = -$[...]$.61$: size $N = 3840$ (pluses), 7680 (crosses), 15360 (circles). The inset shows $\sqrt{N} P(m)$ versus $\sqrt{N} m$. (b) The two branches of $P(m)$ at $m > m_0$ and $m < m_0$ for the system of size $N = 15360$, with $m_0 = -0.000282$.
To estimate the standard error of the empirical mean, we assume that the true probability distribution of $x$ is identical to the empirical distribution, namely the probability of $x = x_i$ is $p_i = n_i / \mathcal{N}$. It then follows that the joint probability $P(n_1, \ldots, n_m)$ that, among $\mathcal{N}$ independent samples, $n_i$ of them are equal to $x_i$ is
$$P(n_1, \ldots, n_m) = \frac{\mathcal{N}!}{n_1! \cdots n_m!} \, p_1^{n_1} \cdots p_m^{n_m} . \tag{B.2}$$
According to this probability distribution, the first moments $\langle n_i \rangle$ and the correlations $\langle n_i n_j \rangle$ of these random counting numbers $n_i$ are simply
$$\langle n_i \rangle = \mathcal{N} p_i , \qquad \langle n_i n_j \rangle = \mathcal{N} (\mathcal{N} - 1) p_i p_j + \mathcal{N} p_i \delta_i^j , \tag{B.3}$$
where $\delta_i^j = 0$ if $i \neq j$ and $\delta_i^j = 1$ if $i = j$. Then the standard deviation of the empirical mean $\overline{x}$ is
$${\rm SD}(\overline{x}) \equiv \sqrt{\langle (\overline{x})^{2} \rangle - \langle \overline{x} \rangle^{2}} = \Bigl( \frac{\overline{x^{2}} - (\overline{x})^{2}}{\mathcal{N}} \Bigr)^{1/2} , \tag{B.4}$$
where $\overline{x^{2}}$ is the empirical second moment of $x$:
$$\overline{x^{2}} = \frac{1}{\mathcal{N}} \sum_{i=1}^{m} n_i x_i^{2} . \tag{B.5}$$
Applying the above statistical theory to the parameter $\lambda$, we notice that the random variable in this case is $x \in \{ -\sqrt{N}, 0, +\sqrt{N} \}$, corresponding to the overlap value $m$ being positive, zero, and negative, respectively. Then the standard deviation of $\lambda$ is
$${\rm SD}(\lambda) \approx \sqrt{\frac{N}{\mathcal{N}}} . \tag{B.6}$$
For the parameter $\mu$, the corresponding random variable is $x = \sum_{i=1}^{N} \sigma_i \sigma^{0}_i$, where $\sigma_i$ is the spin of vertex $i$ in the sampled configuration $\vec{\sigma}$. Then the standard deviation of $\mu$ is
$${\rm SD}(\mu) \approx \sqrt{\frac{N}{\mathcal{N}}} . \tag{B.7}$$
We see that the standard errors of $\lambda$ and $\mu$ are both approximately $\sqrt{N / \mathcal{N}}$.
References
[1] S. Franz, M. Mézard, F. Ricci-Tersenghi, M. Weigt, and R. Zecchina. A ferromagnet with a glass transition. Europhys. Lett., 55:465–471, 2001.
[2] A. Montanari and F. Ricci-Tersenghi. Cooling-schedule dependence of the dynamics of mean-field glasses. Phys. Rev. B, 70:134406, 2004.
[3] L. Zdeborová and F. Krzakala. Statistical physics of inference: thresholds and algorithms. Adv. Phys., 65:453–552, 2016.
[4] M. Mézard and A. Montanari. Information, Physics, and Computation. Oxford Univ. Press, New York, 2009.
[5] N. Sourlas. Spin-glass models as error-correcting codes. Nature, 339:693–695, 1989.
[6] H. Huang and H. J. Zhou. Cavity approach to the Sourlas code system. Phys. Rev. E, 80:056113, 2009.
[7] O. Watanabe. Message passing algorithms for MLS-3LIN problems. Algorithmica, 66:848–868, 2013.
[8] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi. Optimization by simulated annealing. Science, 220:671–680, 1983.
[9] Y.-Z. Xu, C. H. Yeung, H.-J. Zhou, and D. Saad. Entropy inflection and invisible low-energy states: Defensive alliance example. Phys. Rev. Lett., 121:210602, 2018.
[10] Y. Matsuda, H. Nishimori, L. Zdeborová, and F. Krzakala. Random-field p-spin-glass model on regular random graphs. J. Phys. A: Math. Theor., 44:185002, 2011.
[11] A. S. Bandeira, A. Perry, and A. S. Wein. Notes on computational-to-statistical gaps: Predictions using statistical physics. Portugaliae Mathematica, 75:159–186, 2018.
[12] K. Li, H. Ma, and H. J. Zhou. From one solution of a 3-satisfiability formula to a solution cluster: Frozen variables and entropy. Phys. Rev. E, 79:031102, 2009.
[13] F. Krzakala, M. Mézard, and L. Zdeborová. Reweighted belief propagation and quiet planting for random k-sat. Journal on Satisfiability, Boolean Modeling and Computation, 8:149–171, 2014.
[14] F. Krzakala and L. Zdeborová. Hiding quiet solutions in random constraint satisfaction problems. Phys. Rev. Lett., 102:238701, 2009.
[15] H.-J. Zhou. Kinked entropy and discontinuous microcanonical spontaneous symmetry breaking. Phys. Rev. Lett., 122:160601, 2019.
[16] M. Mézard and G. Parisi. The Bethe lattice spin glass revisited. Eur. Phys. J. B, 20:217–233, 2001.
[17] M. Mézard and A. Montanari. Reconstruction on trees and spin glass transition. J. Stat. Phys., 124:1317–1350, 2006.
[18] H.-J. Zhou. Spin Glass and Message Passing. Science Press, Beijing, China, 2015.
[19] D. H. E. Gross. Microcanonical Thermodynamics: Phase Transitions in "Small" Systems. World Scientific, Singapore, 2001.
[20] M. Creutz. Microcanonical Monte Carlo simulation. Phys. Rev. Lett., 50:1411–1414, 1983.
[21] G. H. Weiss. First passage time problems for one-dimensional random walks. J. Stat. Phys., 24:587–594, 1981.
[22] M. Khantha and V. Balakrishnan. First passage time distribution for finite one-dimensional random walks. Pramana, 21:111–122, 1983.
[23] A. Engel and C. Van den Broeck. Statistical Mechanics of Learning. Cambridge University Press, Cambridge, UK, 2001.
[24] Y. Kabashima and S. Uda. A BP-based algorithm for performing Bayesian inference in large perceptron-type networks. Lect. Notes Artif. Intellig., 3244:479–493, 2004.
[25] A. Braunstein and R. Zecchina. Learning by message passing in networks of discrete synapses. Phys. Rev. Lett., 96:030201, 2006.
[26] H. Huang and T. Toyoizumi. Advanced mean-field theory of the restricted Boltzmann machine. Phys. Rev. E, 91:050101(R), 2015.
[27] H.-J. Zhou. Active online learning in the binary perceptron problem. Commun. Theor. Phys., 71:243–252, 2019.
[28] H. Bauke and S. Mertens. Random numbers for large-scale distributed Monte Carlo simulations. Phys. Rev. E, 75:066701, 2007.
[29] G.-K. Hu, T. Liu, M.-X. Liu, W. Chen, and X.-S. Chen. Condensation of eigen microstate in statistical ensemble and phase transition. Science China Phys. Mech. Astron., 62:990511, 2019.
[30] B. Efron. Computers and the theory of statistics: Thinking the unthinkable.