Circumventing spin glass traps by microcanonical spontaneous symmetry breaking

Abstract

The planted p-spin interaction model is a paradigm of random-graph systems possessing both a ferromagnetic phase and a disordered phase, with the latter splitting into many spin glass states at low temperatures. Conventional simulated annealing dynamics is easily blocked by these low-energy spin glass states. Here we demonstrate that this planted system is actually exponentially dominated by a microcanonical polarized phase at intermediate energy densities. There is a discontinuous microcanonical spontaneous symmetry breaking transition from the paramagnetic phase to the microcanonical polarized phase. This transition can serve as a mechanism to avoid all the spin glass traps, and it is accelerated by the restart strategy of microcanonical random walk. We also propose an unsupervised learning problem on microcanonically sampled configurations for inferring the planted ground state.


arXiv [cond-mat.dis-nn], Jul

Hai-Jun Zhou

CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China
School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
Synergetic Innovation Center for Quantum Effects and Applications, Hunan Normal University, Changsha 410081, China
E-mail: [email protected]

Abstract.

The planted $p$-spin interaction model is a paradigm of random-graph systems possessing both a ferromagnetic ground state and an intermediate spin glass phase. Conventional simulated annealing and message-passing algorithms cannot reach the planted ground state but are trapped by an exponential number of spin glass states. Here we propose discontinuous microcanonical spontaneous symmetry breaking (MSSB) as a simple mechanism to circumvent all the spin glass traps. The existence of a discontinuous MSSB phase transition is confirmed by microcanonical Monte Carlo simulations. We conjecture that the planted ground state could be retrieved in polynomial time by applying machine-learning methods (such as perceptron learning) to microcanonically sampled independent configurations. Three candidate algorithms are proposed.

1. Introduction

The planted $p$-spin interaction model on a finite-connectivity random graph is a representative ferromagnetic system with an intermediate spin glass phase. This model system has played an important role in understanding the physics of structural glasses [1, 2]. It is also quite relevant and challenging to the field of statistical inference [3, 4]; it is equivalent to the generalized Sourlas code problem in information science [5, 6] and to the planted maximum XORSAT problem in computer science [7].

When the environmental temperature slowly decreases, as occurs in simulated annealing (SA) dynamics [8], the system is predicted to experience an equilibrium phase transition from the disordered paramagnetic phase to the ordered ferromagnetic phase. But this transition never occurs in practice when the interactions are many-body in nature ($p \geq 3$). [...] $N$ and the crystal-nucleation mechanism then simply fails. The system instead remains in the paramagnetic phase as the temperature further decreases [1, 2], until finally it is frozen in one of the exponentially many disordered [...] $p$-spin system but is a common property of many large inference problems such as the planted satisfiability and coloring problems [12, 13, 14]. When the ground state is completely masked by an exponential number of disordered configurations (for example through quiet planting as in the $p$-spin system and the coloring problem [14]), and SA and message-passing processes are trapped by spin glass states, it seems the only resort is brute-force enumeration, which is of course infeasible even for moderate system sizes $N$.

In this work we point out the existence of an alternative route to the planted ground state, namely the route of microcanonical spontaneous symmetry breaking (MSSB). This is a discontinuous transition from the paramagnetic (or disordered symmetric, DS) phase to the microcanonical polarized (MP) phase. The microcanonically stable MP phase has so far been largely overlooked in the literature except for a recent detailed analysis concerning the Potts model [15]. Because the energy density of the system is fixed along the whole DS-to-MP transition trajectory, the difficulty of climbing huge energy barriers is completely avoided (Fig. 1).

The DS and MP phases are connected by many transition trajectories at the given fixed energy density, and therefore ergodicity between these two phases is preserved. The MP phase is entropically favored over the DS phase because it contains exponentially [...]

[Figure 1 labels: vertical axis "Energy"; "Ocean of Configurations"; paths "MSSB" and "SA".]

Figure 1.

An illustration of the energy landscape of a planted $p$-body interaction spin system ($p \geq 3$) [...]

[...] $p = 3$ (each interaction involving three vertices) which have the property that their ground state is unique. Very interestingly, we find that these DS configurations actually contain information about the unique planted ground state. It may then be possible to infer the planted solution from the sampled DS configurations through machine-learning techniques. We propose three different inference strategies, based respectively on the ideas of perceptron learning, hyperplane optimization, and curve fitting, to solve the planted $p$-spin model. By some simple scaling analysis we conjecture that the planted ground state can be inferred in polynomial time. We hope to be able to confirm this conjecture in the near future by extensive numerical simulations, and then to extend this work to other hard planted systems. Our work also calls for a theoretical understanding of the interesting asymmetric behavior shown in Fig. 4.

2. Theoretical predictions

Consider a planted $p$-spin interaction system in which each vertex $j \in \{1, \ldots, N\}$ participates in $K$ interactions (clauses) and each clause involves $p$ randomly chosen vertices (we set $p = 3$ in all the following numerical computations). The total number of clauses is $M = KN/p$. The energy of a generic configuration $\vec{\sigma} \equiv (\sigma_1, \ldots, \sigma_N)$ is

$$E(\vec{\sigma}) = -\sum_{a=1}^{M} J_a \prod_{j \in \partial a} \sigma_j , \tag{1}$$

where $\sigma_j \in \{\pm 1\}$ is the spin of vertex $j$ and $\partial a$ denotes the set of vertices constrained by clause $a$. We denote by $u$ the energy density of the system, namely

$$u = \frac{E(\vec{\sigma})}{N} . \tag{2}$$

There is a planted spin configuration $\vec{\sigma}^0 \equiv (\sigma_1^0, \ldots, \sigma_N^0)$ dictating the coupling constant of clause $a$, such that

$$J_a = \begin{cases} +\prod_{j \in \partial a} \sigma_j^0 & (\text{probability } 1 - \varepsilon) , \\ -\prod_{j \in \partial a} \sigma_j^0 & (\text{probability } \varepsilon) . \end{cases} \tag{3}$$

The parameter $\varepsilon$ is the noise level of the above planting rule. When $\varepsilon >$ [...] $p$-body interactions [5, 6]. When $\varepsilon$ is [...] $\vec{\sigma}^0$ is a ground state of Eq. (1) and it may lie extensively below all the other minimum-energy configurations (Fig. 1). We define the overlap (magnetization) $m$ of configuration $\vec{\sigma}$ with respect to $\vec{\sigma}^0$ as

$$m = \frac{1}{N} \sum_{j=1}^{N} \sigma_j \sigma_j^0 . \tag{4}$$

This random-graph system has been intensively studied by the mean-field cavity method of statistical mechanics [16, 17, 10, 18]. The mean-field results obtained at $\varepsilon = 0$ and $K = 10$ are shown in Fig. 2, and the qualitatively identical results obtained for $K = 4$ are shown in Fig. A1 (Appendix A). Some aspects of these theoretical results are similar to what was found for the Potts model [15], but there is a key qualitative difference to be discussed at the end of this section. First, let us interpret these [...]

Figure 2.

The DS (solid lines), MP (dotted lines), and CP (long dashed lines) fixed points of the mean-field theory for the planted 3-body model on random graphs of degree $K = 10$ at noise $\varepsilon = 0$. (a) Free energy density $f$ versus canonical inverse temperature $\beta$. (b) Entropy density $s$ versus energy density $u$. (c) Microcanonical inverse temperature $\beta$ versus $u$. (d) Mean overlap $m$ versus $u$. $\beta_d$ and $u_d$: critical inverse temperature and energy density at the dynamical SG phase transition; $u_{\rm mic}$: critical energy density at the MSSB phase transition; $\beta_f$: critical inverse temperature at the canonical ferromagnetic phase transition. The predicted MSSB phase transition is confirmed by MMC simulations (circles) on a single problem instance of size $N = 960$.

[...] $u$ decreases to the spin glass (SG) dynamical transition point $u_d$ this DS phase suddenly divides into an exponential number of ergodicity-broken macro-states, and the system then enters the SG phase [1]. The value of $u_d$ is independent of $\varepsilon$ because of gauge symmetry [10] and can be determined by the tree-reconstruction method [17] (see also textbooks [4, 18]). Because of planting, a stable canonical polarized (CP) phase containing configurations similar to $\vec{\sigma}^0$ exists in the configuration space at sufficiently low energies. This CP phase is simply the conventional ferromagnetic phase.

The microcanonical polarized phase, corresponding to the unstable fixed point of the mean-field theory with higher free energy density [Fig. 2(a)], serves as the watershed between the DS and CP phases [Fig. 2(c)]. Configurations of the MP phase have positive magnetizations and therefore are similar to $\vec{\sigma}^0$, but they are unstable in the canonical ensemble. The MP phase can only be explored by fixing the energy density $u$.

The entropy density $s$ of the MP phase as a function of energy density $u$ has two branches [Fig. 2(b)].
The higher-entropy branch corresponds to the MP configurations that are stable at fixed $u$, and its entropy starts to surpass that of the DS phase at the critical energy density $u_{\rm mic}$, indicating a discontinuous microcanonical spontaneous symmetry-breaking phase transition. This transition leads to a jump in the mean magnetization and a drop in the microcanonical inverse temperature $\beta$, which are verified by our simulation results [Figs. 2(c) and 2(d)]. The lower-entropy MP branch, on the other hand, always has lower entropy than the DS phase; it marks the watershed between the MP and DS phases in the microcanonical ensemble.

We observe that the MSSB transition energy density $u_{\rm mic}$ is located above the SG transition point $u_d$ when the noise level $\varepsilon$ is low enough (Table 1). In principle it is then possible to reach the MP phase by inducing an MSSB transition at a fixed value of $u \in (u_d, u_{\rm mic})$, avoiding the spin glass traps of the canonical ensemble. We will continue to discuss the practical feasibility of this proposal in the next two sections. A relevant observation on this issue is that the DS fixed point of the $p$-spin model is always locally [...]

Table 1. Magnetization jump $\Delta m$, microcanonical inverse temperature drop $\Delta\beta$, and critical energy density $u_{\rm mic}$ at the MSSB phase transition of the planted 3-body model on random graphs of degree $K$ (noise $\varepsilon = 0$). The critical energy densities $u_d$ are also listed (based on Table 4.2 of [18]).

$K$: [...]
$\Delta m$: 0.906, 0.821, 0.758, 0.712, 0.676, 0.647, 0.[...]
$\Delta\beta$: [...]
$u_{\rm mic}$: [...]
$u_d$: [...]

[...] $m$ always exists between the DS and MP phases down to this minimal energy density [Fig. 2(d)]. This feature is significantly different from what was observed in the Potts model (Fig. 2(d) of Ref. [15]). The DS phase of the Potts model becomes unstable below a certain threshold energy density. By fixing the energy density below this threshold value, the system will then evolve from the DS phase to the MP phase gradually. Such a gradual evolution will not be possible for the $p$-spin model because of the gap in the order parameter $m$.
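The model defined by Eqs. (1)-(4) can be illustrated with a short Python sketch. This is not the author's code: the function names (`make_planted_instance`, `energy`, `overlap`) are hypothetical, and the graph construction (shuffling vertex "stubs" and rejecting any shuffle that puts one vertex twice into a clause) is an assumed, simple way of giving every vertex exactly $K$ clauses. The rejection step is only practical for small $K$.

```python
import random


def make_planted_instance(n, k, p=3, eps=0.0, seed=0):
    """Return (planted, clauses, couplings) for a planted p-spin instance.

    Assumption: every vertex gets exactly k clauses via stub shuffling;
    shuffles that produce a clause with a repeated vertex are rejected.
    """
    if (n * k) % p != 0:
        raise ValueError("n * k must be divisible by p")
    rng = random.Random(seed)
    planted = [rng.choice((-1, 1)) for _ in range(n)]
    stubs = [j for j in range(n) for _ in range(k)]
    while True:
        rng.shuffle(stubs)
        clauses = [tuple(stubs[i:i + p]) for i in range(0, n * k, p)]
        if all(len(set(c)) == p for c in clauses):
            break
    couplings = []
    for clause in clauses:
        sign = 1
        for j in clause:
            sign *= planted[j]
        # Eq. (3): keep the planted sign, or flip it with probability eps.
        couplings.append(-sign if rng.random() < eps else sign)
    return planted, clauses, couplings


def energy(sigma, clauses, couplings):
    """Eq. (1): E = -sum_a J_a * prod_{j in clause a} sigma_j."""
    total = 0
    for clause, coupling in zip(clauses, couplings):
        term = coupling
        for j in clause:
            term *= sigma[j]
        total -= term
    return total


def overlap(sigma, planted):
    """Eq. (4): m = (1/N) * sum_j sigma_j * planted_j."""
    return sum(a * b for a, b in zip(sigma, planted)) / len(sigma)
```

At zero noise the planted configuration satisfies every clause, so its energy is $-M$; for odd $p$ the globally flipped configuration has energy $+M$.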

3. Microcanonical Monte Carlo

We implement a simple MMC dynamics to explore configurations in the vicinity of an objective energy $E_o \equiv N u$ with $u \in (u_d, u_{\rm mic})$. After the system reaches $E_o$ from a random initial configuration through SA or irreversible energy diving, we keep updating its configuration $\vec{\sigma}$ by single-spin flips. An elementary MMC trial consists of picking a vertex $j$ uniformly at random, proposing a flip $\sigma_j \to -\sigma_j$, and accepting this flip if the new configuration energy does not exceed $E_o$. We sample configurations at fixed intervals and record their overlap values and energies. The microcanonical inverse temperature is estimated through

$$\beta = \frac{1}{4} \ln\Bigl( 1 + \frac{4}{E_o - \langle E(\vec{\sigma}) \rangle} \Bigr) , \tag{5}$$

where $\langle E(\vec{\sigma}) \rangle$ is the averaged energy of the configuration samples [20].

Our MMC dynamics is an unbiased and ergodic random walk within the microcanonical configuration space. As the evolution time goes to infinity, every configuration of this space is visited with equal frequency, and the system will then certainly be in the MP phase, as this phase is exponentially dominant in statistical weight. Empirically we have observed that this MMC dynamics achieves MSSB in small-sized systems [Fig. 3(b)]. It is easier for the MSSB transition to occur in networks of larger degrees $K$. This is consistent with the mean-field theoretical results, which reveal that the entropy barrier and the gap of the order parameter $m$ both decrease with $K$ (Table 1). The hardest RR problem instances are those with degree $K = 4$.

But when $N$ becomes larger ($N > 350$ for $K = 4$, $N >$ [...] for $K = 10$, and $N >$ [...] for $K = 20$) the waiting time needed to observe an MSSB transition starts to increase exponentially with $N$. This slowing down is due to an entropy-barrier effect [Fig. 3(a)]. On the coarse-grained level the MMC dynamics is a one-dimensional random walk under a potential field $-s_u(m)$, where $s_u(m)$ is the entropy density of configurations at energy density $u$ having overlap $m$ with the planted configuration $\vec{\sigma}^0$. At each elementary trial, $m$ may change to $m \pm 2/N$ with probability proportional to $(1 \mp g_m)/2$. The bias ratio is

$$g_m = \frac{1 - e^{2 s'_u(m)}}{1 + e^{2 s'_u(m)}} , \tag{6}$$

Figure 3.

The time complexity of observing an MSSB transition. (a) A schematic curve of the entropy density $s_u(m)$ of the planted 3-body model at fixed energy density $u$ and noise $\varepsilon = 0$, showing a local maximum at overlap $m \approx 0$, a global maximum at a larger overlap value, and a minimum at $m^*$. The MMC dynamics is equivalent to a one-dimensional biased random walk with bias ratio $g_m$. The example evolution trajectory shows an MSSB event at $u = -$[...].61 in a random graph of degree $K = 10$ and size $N = 960$ (one unit of MMC time $t$ means $N$ spin-flip trials). (b) The simulated waiting time $W$ for single problem instances. Each point is obtained by simulating 24 independent MMC evolution trajectories on a single network and then setting $W$ to be the length of the shortest trajectory: $K = 4$, $u = -$[...].133 (circles); $K = 8$, $u = -$[...]; $K = 10$, $u = -$[...].61 (pluses); $K = 20$, $u = -$[...]. [...] $c N^b$ reference relations.

with $s'_u(m) \equiv \mathrm{d} s_u(m) / \mathrm{d} m$ being the slope of $s_u(m)$ at $m$. Because $s'_u(m) < 0$ for $m \in (0, m^*)$, with $m^*$ being the watershed point of $s_u(m)$, the occurrence of MSSB is an exponentially rare first-passage event characterized by a waiting time of order $O\bigl(e^{N [s_u(0) - s_u(m^*)]}\bigr)$ [21, 22].
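The elementary MMC trial described in this section (pick a vertex uniformly at random, propose a flip, accept it if the new energy does not exceed $E_o$) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: the names `total_energy` and `mmc_run` are hypothetical, each vertex is assumed to appear at most once per clause, and the initial configuration is assumed to already satisfy $E \leq E_o$.

```python
import random


def total_energy(sigma, clauses, couplings):
    """Energy of Eq. (1) for spins sigma (entries +1 or -1)."""
    total = 0
    for clause, coupling in zip(clauses, couplings):
        term = coupling
        for j in clause:
            term *= sigma[j]
        total -= term
    return total


def mmc_run(sigma, clauses, couplings, e_obj, sweeps, seed=0):
    """Run microcanonical single-spin-flip dynamics in place.

    One sweep is N elementary trials. Returns the energy after each sweep.
    """
    rng = random.Random(seed)
    n = len(sigma)
    # Clause indices attached to each vertex, for local energy updates.
    incident = [[] for _ in range(n)]
    for a, clause in enumerate(clauses):
        for j in clause:
            incident[j].append(a)
    e = total_energy(sigma, clauses, couplings)
    if e > e_obj:
        raise ValueError("initial energy must not exceed the objective energy")
    trace = []
    for _ in range(sweeps):
        for _ in range(n):
            j = rng.randrange(n)
            # Flipping sigma_j negates every clause term that contains j,
            # so each such term -J*prod changes by -2 times its old value.
            local = 0
            for a in incident[j]:
                term = couplings[a]
                for v in clauses[a]:
                    term *= sigma[v]
                local -= term
            delta = -2 * local
            if e + delta <= e_obj:  # microcanonical acceptance rule
                sigma[j] = -sigma[j]
                e += delta
        trace.append(e)
    return trace
```

Recording the overlap of Eq. (4) alongside the energy after each sweep would give the kind of trajectory used to detect an MSSB event.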

4. Asymmetry of overlap distribution

When $p$ is odd, so that each clause of the energy (1) involves an odd number of vertices, the planted configuration $\vec{\sigma}^0$ is a ground state of the system but the globally flipped one ($-\vec{\sigma}^0$) is not. Furthermore, $\vec{\sigma}^0$ will be the unique ground state if $K \geq$ [...] $M$ is not too small. We conceive that since the entropy density $s_u(m)$ is "M"-shaped with a minor peak at $m \approx 0$ [...] not be strictly symmetric in the vicinity of $m = 0$ when $N$ is finite. Instead, for relatively large overlap magnitudes, $s_u(+|m|)$ may slightly exceed $s_u(-|m|)$. As the probability $P(m)$ of observing an overlap value $m$ is related to $s_u(m)$ by $P(m) \propto e^{N s_u(m)}$, we can quantify the magnitude of the entropy asymmetry by examining the $P(m)$ profile. Some representative simulation results for $p = 3$ are shown in Fig. 4 and in Fig. A3 (Appendix A).

Consistent with our expectations, the overlap $m$ is more likely to be positive when $|m| >$ [...] $N^{-1/2}$. On the other hand, we were initially quite puzzled to notice that $m$ is significantly biased toward negative values for smaller values of $|m|$, especially at $|m| \approx$ [...] $N^{-1/2}$. After some detailed analysis we now understand that this peculiar behavior is caused by the fact that the most probable overlap value (denoted as $m_0$) of the distribution $P(m)$ is not located exactly at zero but is slightly negative ($m_0 < 0$) [...] $P(m)$ is systematically biased more towards the positive side of $m_0$ than towards the negative side [Fig. 4(b)].

Figure 4. Asymmetry of the overlap distribution $P(m)$ for the planted 3-body model at noise $\varepsilon = 0$. (a) Rescaled probability difference $N [P(+|m|) - P(-|m|)]$ versus rescaled overlap $\sqrt{N} m$ for a single random-graph system of degree $K = 4$ at $u = -1.135$: size $N = 2004$ (pluses), 4008 (crosses), 8016 (circles), and 16032 (squares). The inset shows $\sqrt{N} P(m)$ versus $\sqrt{N} m$. (b) The two branches of $P(m)$ at $m > m_0$ and $m < m_0$ for the system of size $N = 16032$, with $m_0 = -0.000205$.

The numerical results of Fig. 4(a) and Fig. A3 also reveal that the histograms for different system sizes roughly superimpose onto each other after rescaling $P(m)$ and $m$ by suitable powers of $N$ [see the axes of Fig. 4(a)]. From this scaling behavior we infer that the probabilities of the overlap $m$ being non-positive ($p_{\leq}$) and non-negative ($p_{\geq}$) change with $N$ as

$$p_{\leq} = \frac{1}{2} + \gamma N^{-1/2} , \qquad p_{\geq} = \frac{1}{2} - \gamma N^{-1/2} , \tag{7}$$

and the mean and squared-mean overlaps decay with $N$ according to

$$\langle m \rangle = \mu N^{-1} , \qquad \langle m^2 \rangle = N^{-1} . \tag{8}$$

The values of the asymmetry coefficients $\gamma$ and $\mu$ are determined by two competing effects, namely that the most probable overlap value $m_0$ is negative and that $P(m_0 + |\Delta m|) > P(m_0 - |\Delta m|)$ for any deviation $\Delta m$ of the overlap from $m_0$. We expect that these two coefficients $\gamma$ and $\mu$ are distinct from zero in general. Given a problem instance, we can estimate $\gamma$ and $\mu$ by MMC assuming $\vec{\sigma}^0 = (1, \ldots, 1)$. [...] $\gamma = 0.$[...] $\pm$ [...] and $\mu = 0.$[...] $\pm$ [...] [...] $K = 4$ and size $N = 4008$ at $u = -1.135$ of Fig. 4(a), by averaging over $7.$[...] $\times$ [...] configuration samples through the bootstrap method (Appendix B).

The asymmetry of $P(m)$ persists even when the energy density $u$ is higher than the MSSB phase transition value $u_{\rm mic}$. This fact is demonstrated in Fig. A2 (Appendix A) by MMC simulation results obtained at different energy densities $u$ on a single RR network instance of size $N = 4008$ and degree $K = 4$. Therefore the condition $u < u_{\rm mic}$ is not strictly necessary to explore the asymmetry of $P(m)$. Of course the extent of asymmetry in $P(m)$ decreases as $u$ increases, and $P(m)$ should be perfectly symmetric at $u = 0$.
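Given a list of sampled overlap values, the coefficients of Eqs. (7) and (8) can be estimated by simple counting and averaging. The sketch below is a hedged illustration only: the function name `asymmetry_coefficients` is hypothetical, the estimator for $\gamma$ is obtained by subtracting the two relations of Eq. (7), and the quoted standard error for $\mu$ is the approximate expression of Appendix B, which assumes independent samples.

```python
import math


def asymmetry_coefficients(overlaps, n):
    """Estimate (gamma, mu, mu_std_err) from sampled overlaps at size n.

    gamma: from Eq. (7), p_le - p_ge = 2 * gamma / sqrt(n).
    mu:    from Eq. (8), <m> = mu / n.
    mu_std_err: approximate standard error sqrt(n / sample_count),
                valid only if the samples are independent (assumption).
    """
    count = len(overlaps)
    if count == 0:
        raise ValueError("need at least one overlap sample")
    p_le = sum(1 for m in overlaps if m <= 0) / count
    p_ge = sum(1 for m in overlaps if m >= 0) / count
    gamma = 0.5 * math.sqrt(n) * (p_le - p_ge)
    mu = n * sum(overlaps) / count
    mu_std_err = math.sqrt(n / count)
    return gamma, mu, mu_std_err
```

Because the standard error decays only as the inverse square root of the sample count, resolving a coefficient of order one at large $N$ requires a number of independent samples well above $N$.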

5. Inference on microcanonical conﬁguration samples

Equation (7) indicates that a single configuration sample $\vec{\sigma}$ contains $O(N^{-[\ldots]})$ bits of information about the planted configuration $\vec{\sigma}^0$. It may then be possible to infer $\vec{\sigma}^0$ directly from a sufficiently large number of configurations sampled at fixed energy density $u$, without waiting for a rare MSSB transition event. Here we discuss three different inference strategies.

Assuming $\mu$ of Eq. (8) to be non-zero, we may construct a perceptron-learning problem to infer $\vec{\sigma}^0$ (strategy A). The recipe is quite straightforward. We sample $Q$ independent configurations $\vec{\sigma}(t)$ at equal time intervals, where $t$ is the index of a sampled configuration, and then add them together to form a composite vector $\vec{r} \equiv (r_1, \ldots, r_N) = \sum_{t=1}^{Q} \vec{\sigma}(t)$. Define the alignment $L$ between $\vec{r}$ and $\vec{\sigma}^0$ as $L \equiv \sum_{i=1}^{N} r_i \sigma_i^0$. Because of Eq. (8) the random variable $L$ follows approximately a Gaussian distribution with mean $\mu Q$ and standard deviation $\sqrt{N Q}$. Therefore, if $Q$ is set proportional to $N$ with $Q \geq (5/\mu)^2 N$, the mean of $L$ is at least five standard deviations away from zero and the signs of $L$ and $\mu$ will very likely be the same. As an example, this number is $Q \approx$ [...] for the problem instance of size $N = 4008$ and degree $K = 4$ in Fig. 4(a). After getting a large number $X$ of such independent composite vectors by parallel computing [28], we can feed them to a perceptron to infer $\vec{\sigma}^0$.

The inferred probability $P_A(\vec{\sigma}^0)$ of the planted configuration being $\vec{\sigma}^0$ in this perceptron-learning task is

$$P_A(\vec{\sigma}^0) \propto \prod_{\ell=1}^{X} \Theta\Bigl( \theta_\mu \sum_{i=1}^{N} r_i^{(\ell)} \sigma_i^0 \Bigr) , \tag{9}$$

where $\Theta(x)$ is the Heaviside function with $\Theta(x) = 1$ for $x > 0$ and $\Theta(x) = 0$ for $x \leq 0$, and $r_i^{(\ell)}$ is the $i$-th entry of the $\ell$-th composite sample vector [23, 24, 25, 26, 27]. The parameter $\theta_\mu \in \{\pm 1\}$ is the sign of $\mu$, which only affects the global sign of the inferred configuration $\vec{\sigma}^0$. We may simply assume $\theta_\mu = 1$ and later fix the optimal sign of $\vec{\sigma}^0$ by energy comparison.

We should be able to achieve almost perfect inference of $\vec{\sigma}^0$ by setting $X \geq N$ [27]. The total number of sampled independent configurations, $X Q$, is then of order $O(N^2)$. It may be reasonable to assume that one needs $O(N)$ spin-flip trials to pick one independent [...] $O(N^{[\ldots]})$, much shorter than the exponential time complexity of the naive random-walk strategy. Perfect inference is actually not necessary. We could compute the mean values of the planted spins $\sigma_i^0$ through Eq. (9) and then use them to guide the MMC dynamics through the entropy valley around $m^*$ [Fig. 3(a)].

Given $\mathcal{N}$ independent configuration samples, the hyperplane defined by the planted configuration $\vec{\sigma}^0$ will split these configurations into two groups such that one group contains order $O(\mathcal{N}/\sqrt{N})$ more configurations than the other group. The hyperplane perpendicular to a random spin configuration will also split these configurations into two groups, but their size difference will only be of order $O(\sqrt{\mathcal{N}})$. We expect that when $\mathcal{N}$ reaches order $O(N^{[\ldots]})$ the hyperplane of the planted configuration will be the unique optimal choice to separate the $\mathcal{N}$ configuration samples in the most uneven way. The corresponding optimization problem has the following cost function

$$C(\vec{\sigma}^0) = \sum_{\ell=1}^{\mathcal{N}} \mathrm{Sign}\Bigl( \sum_{i=1}^{N} \sigma_i^{(\ell)} \sigma_i^0 \Bigr) , \tag{10}$$

where $\mathrm{Sign}(x)$ is the sign function defined by $\mathrm{Sign}(x) = 1$ for $x > 0$, $\mathrm{Sign}(x) = -1$ for $x < 0$, and $\mathrm{Sign}(x) = 0$ for $x = 0$, and $\sigma_i^{(\ell)}$ is the $i$-th entry of the $\ell$-th sampled configuration. The minimum-cost solution of Eq. (10) may be reachable by simulated annealing dynamics or by message-passing algorithms. The method of principal component analysis might also be helpful [29].

Both this optimization strategy (strategy B) and the perceptron-learning strategy A have the advantage that one does not need to know the value of the hyper-parameter $\varepsilon$. A conceptual disadvantage is that they are not applicable to problem instances with even $p$-values, because the overlap distribution $P(m)$ must be symmetric when every interaction involves an even number of vertices. This disadvantage might be overcome by the strategy of the next subsection.

The distribution $P(m)$ of overlaps $m$ with respect to the planted configuration $\vec{\sigma}^0$, viewed as a function of $m$, should not depend on the details of $\vec{\sigma}^0$. This means that we could estimate $P(m)$ without knowing $\vec{\sigma}^0$ simply by setting the coupling constants $J_a$ of Eq. (1) independently as $-1$ and $+1$ with probabilities $\varepsilon$ and $1 - \varepsilon$, respectively. On the other hand, we can sample a large number of independent configurations $\vec{\sigma}^{(\ell)} \equiv (\sigma_1^{(\ell)}, \ldots, \sigma_N^{(\ell)})$, with indices $\ell = 1, \ldots, \mathcal{N}$, for the planted $p$-spin model (1) with the original set of coupling constants.

Then the following inference problem (strategy C) concerning the probability [...] $P_C(\vec{\sigma}^0)$ of the unknown planted configuration $\vec{\sigma}^0$ arises:

$$P(m) = \frac{1}{\mathcal{N}} \sum_{\ell=1}^{\mathcal{N}} \sum_{\vec{\sigma}^0} P_C(\vec{\sigma}^0) \, \delta\Bigl( m - \frac{1}{N} \sum_{i=1}^{N} \sigma_i^{(\ell)} \sigma_i^0 \Bigr) , \tag{11}$$

where $\delta(x)$ is the Dirac delta function. We may assume $P_C(\vec{\sigma}^0)$ to be factorized such that

$$P_C(\vec{\sigma}^0) = \prod_{i=1}^{N} \Bigl[ \frac{1 + \nu_i \sigma_i^0}{2} \Bigr] , \tag{12}$$

where $\nu_i$ is the inferred mean value of the $i$-th planted entry $\sigma_i^0$. Then Eq. (11) can be approximated as a sum of many Gaussian distributions [27]

$$P(m) = \frac{1}{\mathcal{N}} \sum_{\ell=1}^{\mathcal{N}} \frac{1}{\sqrt{2 \pi \Lambda}} \exp\Bigl( - \frac{\bigl( m - \frac{1}{N} \sum_{i=1}^{N} \sigma_i^{(\ell)} \nu_i \bigr)^2}{2 \Lambda} \Bigr) , \tag{13}$$

with $\Lambda = \sum_{i=1}^{N} (1 - \nu_i^2) / N^2$ being a variance parameter.

Equation (13) is essentially a curve-fitting problem with $N$ adjustable real parameters $\nu_i$. This problem may be solvable by various algorithms developed in the machine-learning community. We expect that when the number $\mathcal{N}$ of configuration samples becomes sufficiently large, the inferred first moments $\nu_i$ will offer a highly faithful prediction of the true planted configuration $\vec{\sigma}^0$.

An advantage of this inference strategy C is that it may also be applicable when the hyper-parameter $p$ of model (1) is even-valued. For even $p$, of course, $P(m)$ will be a symmetric function of $m$, but because configurations with larger overlap magnitudes $|m|$ are sampled more frequently in a planted system, the form of $P(m)$ may significantly deviate from being Gaussian, making it useful for the inference task (13).

As mentioned in the preceding subsection, to evaluate the overlap distribution $P(m)$ we need to discard the original coupling constants $J_a$ of the planted model (1) and assign $\pm 1$ [...] $\varepsilon$. But the value of this hyper-parameter $\varepsilon$ may often be unknown. This lack of information should not cause a fundamental difficulty. We may simply set $\varepsilon$ to a set of different values between $0$ and $\frac{1}{2}$. For each fixed value of $\varepsilon$ we can get the corresponding $P(m)$ function and use it as input to the inference problem (13), while the set of configuration samples $\vec{\sigma}^{(\ell)}$ is the same. We may be able to achieve the best fitting performance when the assigned $\varepsilon$ is close to the true value of this noise level.
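The basic quantities entering strategies A and B can be written down in a few lines. The following Python sketch is illustrative only and does not implement the perceptron learning or the optimization itself: all function names are hypothetical, and the helper `min_samples_per_composite` simply evaluates the five-standard-deviation condition $\mu Q \geq 5 \sqrt{N Q}$ implied by the Gaussian statistics of the alignment $L$.

```python
import math


def composite_vector(samples):
    """Sum Q sampled configurations entry by entry (vector r of strategy A)."""
    n = len(samples[0])
    return [sum(s[i] for s in samples) for i in range(n)]


def alignment(r, candidate):
    """Alignment L between a composite vector r and a candidate configuration."""
    return sum(ri * ci for ri, ci in zip(r, candidate))


def min_samples_per_composite(mu, n, z=5.0):
    """Smallest Q with |mu| * Q >= z * sqrt(n * Q), i.e. Q >= (z / mu)**2 * n."""
    if mu == 0:
        raise ValueError("mu must be non-zero for strategy A")
    return math.ceil((z / mu) ** 2 * n)


def hyperplane_cost(samples, candidate):
    """Cost of Eq. (10): sum over samples of Sign(sample . candidate)."""
    cost = 0
    for s in samples:
        dot = sum(a * b for a, b in zip(s, candidate))
        cost += (dot > 0) - (dot < 0)  # sign function with Sign(0) = 0
    return cost
```

Flipping the candidate globally flips the sign of both the alignment and the cost, which is why the overall sign of the inferred configuration has to be fixed separately, for example by energy comparison.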

6. Summary

In summary, we picked the $p$-spin interaction model as an example to demonstrate the potential of microcanonical spontaneous symmetry breaking for solving hard inference [...] $O(c N^{[\ldots]})$ elementary localized updates, possibly with a large prefactor $c$. The finite-size scaling behaviors (7) and (8) are quite fascinating and they call for a thorough theoretical understanding. Numerical implementation of the three machine-learning strategies of Section 5 will be needed to check the validity of our conjecture. We are also starting to work on other challenging planted ensembles of optimization problems, such as $K$-satisfiability and $Q$-coloring [12, 14, 13], to gain more insight into the MSSB mechanism.

Acknowledgments

The author thanks Yuliang Jin and Pan Zhang for helpful conversations. This work was supported by the National Natural Science Foundation of China Grants No. 11975295 and No. 11947302, and the Chinese Academy of Sciences Grant No. QYZDJ-SSW-SYS018. Numerical simulations were carried out at the Tianwen clusters of ITP-CAS and the Tianhe-2 platform of the National Supercomputer Center in Guangzhou.

Appendix A. Additional numerical results

The mean-field theoretical results obtained for the planted 3-body spin model on a regular random network of degree $K = 4$ are summarized in Fig. A1. Qualitatively speaking, these theoretical results are identical to the results shown in Fig. 2 for $K = 10$. The circles in Fig. A1(c) are simulation results obtained on a single network instance of size $N = 10002$. Because of the huge entropy barrier and the large gap of the order parameter $m$ between the DS and MP phases, the waiting time to observe an MSSB transition is much longer than the simulation time, and consequently such a transition does not occur in our simulations.

Figure A2 shows how the asymmetry of the probability distribution $P(m)$ changes with the energy density $u$, for a single random-network instance of degree $K = 4$ and size $N = 4008$. From the fact that the different curves superimpose onto each other after rescaling the probability distributions by $|u|$ [Fig. A2(b)], we know that the asymmetry of $P(m)$ decreases linearly with $|u|$ and vanishes as $u$ approaches zero.

We show in Fig. A3 the probability profiles of the overlap $m$ in the vicinity of $m = 0$ for single RR network instances of degree $K = 10$. These results are obtained by sampling a huge number of spin configurations at fixed energy density $u = -$[...].61 through the MMC evolution dynamics. This figure confirms that the asymmetric features revealed in Fig. 4(a) are quite general for RR networks of different degrees $K$. For the system of size $N = 15360$, we estimate the asymmetry parameters to be $\mu = 0.$[...] $\pm$ [...].006 and $\lambda = 1.$[...] $\pm$ [...] [...] ($\mathcal{N} = 4.$[...] $\times$ [...] for this problem instance).

Figure A1.

The DS (solid lines), MP (dotted lines), and CP (long dashed lines) fixed points of the mean-field theory for the planted 3-body model on random graphs of degree $K = 4$ (noise $\varepsilon \to 0$). (a) Free energy density $f$ versus canonical inverse temperature $\beta$. (b) Entropy density $s$ versus energy density $u$. (c) Microcanonical inverse temperature $\beta$ versus $u$. Circular symbols are MMC simulation results obtained on a single problem instance of size $N = 10002$. (d) Mean overlap $m$ versus $u$. $\beta_d$ and $u_d$: critical inverse temperature and energy density at the dynamical SG phase transition; $u_{\rm mic}$: critical energy density at the MSSB phase transition; $\beta_f$: critical inverse temperature at the canonical ferromagnetic phase transition.

Appendix B. Bootstrap statistics analysis

We estimate the statistical errors of the scaling coeﬃcients λ and µ of Eqs. (7) and (8)by the bootstrap method [18, 30].Consider a random variable x which can take m diﬀerent values x , . . . , x m . We aregiven a large number N of independent samples of this random number, among which n i samples are the value x i ( n i ≥ i = 1 , . . . , m and P mi =1 n i = N ). An empiricalmean of this random variable is then constructed as x = 1 N m X i =1 n i x i . (B.1)Because the sample set size N is ﬁnite, the empirical mean x may deviate from the truemean of the random variable x . ircumventing spin glass traps by MSSB -6-300 0.03 0.06 [ P ( + m ) - P (- m ) ] ( - ) m u = -0.10 u = -0.54 u = -0.98 u = -1.135 (a) -6-300 0.03 0.06 [ P ( + m )- P (- m ) ]/ | u | ( - ) m u = -0.10 u = -0.54 u = -0.98 u = -1.135 (b) Figure A2.

Probability diﬀerence (unit 10 − ) between positive and negative overlapvalues m in a single random-network system of K = 4 and N = 4008. The energy densityis u = − .

10 (pluses), − .

54 (crosses), − .

984 (circles), and − .

135 (diamonds). Panel(b) is rescaled plot of panel (a) with the vertical axis being the ratio between theprobability diﬀerence and | u | . -4-3-2-1010 1 2 3 4 5 N [ P ( + m )- P (- m ) ] N m -6 -4 -2 -4 -2 0 2 4 6 N . P ( m ) (a) -8 -6 -4 -2 P ( m ) |m - m |m = -0.000282 m > m m < m (b) Figure A3.

Asymmetry of the overlap distribution P ( m ) for the planted 3-bodymodel on single random graphs of degree K = 10 at noise ε = 0. (a) Rescaled probabilitydiﬀerence N [ P (+ | m | ) − P ( −| m | )] versus rescaled overlap √ Nm at u = − .

61: size N = 3840 (pluses), 7680 (crosses), 15360 (circles). The inset shows √ NP ( m ) versus √ N m . (b) The two branches of P ( m ) at m > m and m < m for the system of size N = 15360, with m = − . To estimate the standard error of the empirical mean, we assume that the trueprobability distribution of x is identical to the empirical distribution, namely theprobability of x = x i is p i = n i / N . It then follows that the joint probability P ( n , . . . , n m ) that, among N independent samples, n i of them equal to x i is P ( n , . . . , n m ) = N ! n ! . . . n m ! p n . . . p n m m . (B.2)According to this probability distribution, the ﬁrst moments h n i i and the correlations h n i n j i of these random counting numbers n i are simply h n i i = N p i , h n i n j i = N ( N − p i p j + N p i δ ji , (B.3) ircumventing spin glass traps by MSSB δ ji = 0 if i = j and δ ji = 1 if i = j . Then, the standard deviation of the empiricalmean x is SD( x ) ≡ p h ( x ) i − h x i = (cid:16) x − ( x ) N (cid:17) , (B.4)where x is the empirical second moment of x : x = 1 N m X i =1 n i x i . (B.5)Applying the above statistical theory to the parameter λ , we notice that the randomvariable in this case is x ∈ {−√ N , , + √ N } corresponding to the overlap value m beingpositive, zero, and negative, respectively. Then the standard deviation of λ is SD ( λ ) ≈ r N N . (B.6)For the parameter µ , the corresponding random variable is x = P Ni =1 σ i σ i where σ i isthe spin of vertex i in the sampled conﬁguration ~σ . Then the standard deviation of µ is SD ( µ ) ≈ r N N . (B.7)We see that the standard errors of λ and µ both are approximately p N/ N . References [1] S. Franz, M. M´ezard, F. Ricci-Tersenghi, M. Weigt, and R. Zecchina. A ferromagnet with a glasstransition.
