The SK model is infinite step replica symmetry breaking at zero temperature

Abstract

We prove that the Parisi measure of the mixed p-spin model at zero temperature has infinitely many points in its support. This establishes Parisi's prediction that the functional order parameter of the Sherrington-Kirkpatrick model is not a step function at zero temperature. As a consequence, we show that the number of levels of broken replica symmetry in the Parisi formula of the free energy diverges as the temperature goes to zero.


The SK model is Full-step Replica Symmetry Breaking at zero temperature

Antonio Auffinger ∗ Northwestern University

Wei-Kuo Chen † University of Minnesota

Qiang Zeng ‡ Northwestern University

March 21, 2017

∗ Department of Mathematics, Northwestern University, [email protected]. Research partially supported by NSF Grant CAREER DMS-1653552 and NSF Grant DMS-1517894.
† School of Mathematics, University of Minnesota. Email: [email protected]. Research partially supported by NSF grant DMS-1642207 and Hong Kong Research Grants Council GRF-14302515.
‡ Department of Mathematics, Northwestern University, [email protected].

The study of glass and mean field or realistic spin glass models is a very rich and important part of theoretical physics [13, 14, 23]. For mathematicians, it is a challenging program [17, 27, 29]. Roughly speaking, the main goal is to study the global maxima or, more generally, the "largest individuals" of a stochastic process with "high-dimensional" correlation structure.

The classic example of such a process is the mixed $p$-spin model. Its Hamiltonian (or energy) $H_N$ is defined on the spin configuration space $\Sigma_N = \{-1, 1\}^N$ by
\[ H_N(\sigma) = X_N(\sigma) + h \sum_{i=1}^N \sigma_i. \]
Here, $h \in \mathbb{R}$ denotes the strength of the external field and $X_N$ is a centered Gaussian process with covariance
\[ \mathbb{E} X_N(\sigma^1) X_N(\sigma^2) = N \xi(R_{1,2}), \]
where
\[ \xi(s) := \sum_{p \ge 2} c_p^2 s^p \]
for some real sequence $(c_p)_{p \ge 2}$ with $\sum_{p \ge 2} 2^p c_p^2 < \infty$ and
\[ R_{1,2} = R(\sigma^1, \sigma^2) := \frac{1}{N} \sum_{i=1}^N \sigma_i^1 \sigma_i^2 \]
is the normalized inner product between $\sigma^1$ and $\sigma^2$, known as the overlap. The covariance structure of $X_N$ is as rich as the structure of the metric space $(\Sigma_N, d)$, where $d$ is the Hamming distance on $\Sigma_N$,
\[ d(\sigma^1, \sigma^2) = \frac{1 - R(\sigma^1, \sigma^2)}{2}. \]

The problem of computing the maximum energy (or the ground state energy) of $H_N$ as $N$ diverges is a rather nontrivial task. Standard statistical mechanics deals with this problem by considering the Gibbs measure
\[ G_{N,\beta}(\sigma) = \frac{1}{Z_{N,\beta}} e^{\beta H_N(\sigma)} \]
and the free energy
\[ F_{N,\beta} = \frac{1}{\beta N} \log Z_{N,\beta}, \]
where $Z_{N,\beta}$ is the partition function of $H_N$ defined as
\[ Z_{N,\beta} = \sum_{\sigma \in \Sigma_N} e^{\beta H_N(\sigma)}. \]
The parameter $\beta = 1/(kT) > 0$ is the inverse temperature, where $k$ is the Boltzmann constant and $T$ is the absolute temperature. The main goal in this approach is to describe the large $N$ limit of the sequences of the free energies $F_{N,\beta}$ and the Gibbs measures $G_{N,\beta}$. When the temperature $T$ decreases, large values of $H_N$ become more important (to both the partition function $Z_{N,\beta}$ and the Gibbs measure $G_{N,\beta}$) and they prevail over the more numerous smaller values. Since $H_N$ is a high-dimensional correlated field with a large number of points near its global maximum, this question becomes very challenging, especially for small values of $T$.

When $\xi(s) = s^2/2$ and $h = 0$, the model above is the famous Sherrington-Kirkpatrick (SK) model introduced in [25], as a mean field modification of the Edwards-Anderson model [10]. Using a non-rigorous replica trick and a replica symmetric hypothesis, Sherrington and Kirkpatrick [25] proposed a solution to the limiting free energy of the SK model.
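The definitions above can be made concrete with a small brute-force experiment. The following is a minimal sketch, not part of the original paper: it assumes the representation $X_N(\sigma) = (2N)^{-1/2} \sum_{i,j} g_{ij} \sigma_i \sigma_j$ with i.i.d. standard Gaussian $g_{ij}$, chosen because its covariance is exactly $N \xi(R_{1,2})$ with $\xi(s) = s^2/2$. The function names, the size `N = 8`, and the seed are illustrative choices.

```python
import itertools
import math
import random


def sample_couplings(n, rng):
    """i.i.d. standard Gaussian couplings g[i][j] (illustrative representation)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]


def hamiltonian(sigma, g, h=0.0):
    """H_N(sigma) = X_N(sigma) + h * sum_i sigma_i, assuming xi(s) = s^2 / 2."""
    n = len(sigma)
    x = sum(g[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))
    return x / math.sqrt(2.0 * n) + h * sum(sigma)


def overlap(s1, s2):
    """Normalized inner product R(s1, s2)."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)


def hamming(s1, s2):
    """Normalized Hamming distance: fraction of sites where s1 and s2 differ."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def free_energy(energies, beta):
    """F_{N,beta} = log(Z_{N,beta}) / (beta * N), via a stable log-sum-exp.

    Assumes `energies` lists H_N over all 2^N configurations.
    """
    n = round(math.log2(len(energies)))
    top = max(energies)
    log_z = beta * top + math.log(sum(math.exp(beta * (e - top)) for e in energies))
    return log_z / (beta * n)


rng = random.Random(0)
N = 8
g = sample_couplings(N, rng)
configs = list(itertools.product((-1, 1), repeat=N))
energies = [hamiltonian(s, g) for s in configs]
max_energy_density = max(energies) / N
```

Since $e^{\beta \max H_N} \le Z_{N,\beta} \le 2^N e^{\beta \max H_N}$, the free energy is sandwiched as $\max_\sigma H_N(\sigma)/N \le F_{N,\beta} \le \max_\sigma H_N(\sigma)/N + (\log 2)/\beta$ for every fixed $N$, which is the elementary reason why large $\beta$ probes the maximum energy.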
Their solution, however, was incomplete; an alternative solution was proposed in 1979 in a series of ground-breaking articles by Giorgio Parisi [19–22], where it was foreseen that:

(i) The limiting free energy is given by a variational principle, known as the Parisi formula,
(ii) The Gibbs measures are asymptotically ultrametric,
(iii) At low temperature, the symmetry of replicas is broken infinitely many times.

The first two predictions were confirmed in the past decade. Following the beautiful discovery of Guerra's broken replica symmetry scheme [12], the Parisi formula was proved in the seminal work of Talagrand [28] in 2006 under the convexity assumption of $\xi$. Later, in 2012, the ultrametricity conjecture was established by Panchenko [16] assuming the validity of the extended Ghirlanda-Guerra identities [11]. These identities are known to be valid for the SK model under an asymptotically vanishing perturbation term to the Hamiltonian, and for generic models without any perturbation. As a consequence of ultrametricity, the Parisi formula was further extended to generic models by Panchenko [18] utilizing the Aizenman-Sims-Starr scheme [6]. Our main result in this paper confirms the third prediction at zero temperature, $T = 0$.

More precisely, the Parisi formula is stated as follows. Denote by $\mathcal{M}$ the collection of all cumulative distribution functions $\alpha$ on $[0,1]$ and by $\alpha(ds)$ the probability induced by $\alpha$. For $\alpha \in \mathcal{M}$, define
\[ \mathcal{P}_\beta(\alpha) = \frac{\log 2}{\beta} + \Psi_{\alpha,\beta}(0,h) - \frac{1}{2} \int_0^1 \beta \alpha(s) s \xi''(s)\, ds, \tag{1} \]
where $\Psi_{\alpha,\beta}(t,x)$ is the weak solution to the following nonlinear parabolic PDE,
\[ \partial_t \Psi_{\alpha,\beta}(t,x) = -\frac{\xi''(t)}{2} \Bigl( \partial_{xx} \Psi_{\alpha,\beta}(t,x) + \beta \alpha(t) \bigl( \partial_x \Psi_{\alpha,\beta}(t,x) \bigr)^2 \Bigr) \]
for $(t,x) \in [0,1) \times \mathbb{R}$ with boundary condition
\[ \Psi_{\alpha,\beta}(1,x) = \frac{\log \cosh \beta x}{\beta}. \]
For the existence and regularity of $\Psi_{\alpha,\beta}$, we refer the readers to [3, 15]. The Parisi formula [28] states that
\[ F_\beta := \lim_{N \to \infty} F_{N,\beta} = \inf_{\alpha \in \mathcal{M}} \mathcal{P}_\beta(\alpha) \quad \text{a.s.} \tag{2} \]
The infinite dimensional variational problem on the right side of (2) has a unique minimizer [3], denoted by $\alpha_{P,\beta}$. The measure $\alpha_{P,\beta}(dt)$ induced by $\alpha_{P,\beta}$ is known as the Parisi measure [13].¹ Its physical relevance is described by the facts that it is the limiting distribution of the overlap $R(\sigma^1, \sigma^2)$ under the measure $\mathbb{E} G_{N,\beta}^{\otimes 2}$ and, more importantly, that it determines the ultrametric description of the asymptotic Gibbs measure. For instance, the number of points in the support of the Parisi measure corresponds to the number of levels in the tree structure induced by the ultrametricity of the asymptotic Gibbs measure. See [13, 17] for a detailed discussion.

¹ The Parisi measure is the inverse of the functional order parameter $q(x)$ in [20], sometimes written as $x(q)$.

The importance of the Parisi measure leads to the following classification. If a Parisi measure $\alpha_{P,\beta}(dt)$ is a Dirac measure, we say that the model is replica symmetric (RS). For $k \ge 1$, we say that the model has $k$ levels of replica symmetry breaking ($k$-RSB) if the Parisi measure is atomic and has exactly $k + 1$ jumps. If the Parisi measure is neither RS nor $k$-RSB for some $k \ge 1$, then the model has full-step replica symmetry breaking (FRSB). We will also say that the model is at least $k$-RSB if the Parisi measure contains at least $k + 1$ distinct values in its support.

The FRSB prediction in (iii) above plays an inevitable role in Parisi's original solution of the SK model; see [9] for a historical account. It can be written as:

Prediction (Parisi). For any $\xi$ and $h$, there exists a critical inverse temperature $\beta_c > 0$ such that for any $\beta > \beta_c$, the mixed $p$-spin model is FRSB.

In this paper, we establish this prediction at zero temperature. To prepare for the statement of our main result, we recall the Parisi formula for the ground state energy of $H_N$ as follows. First of all, the Parisi formula allows us to compute the ground state energy of the model by sending the temperature $T$ to zero,
\[ GSE := \lim_{N \to \infty} \max_{\sigma \in \Sigma_N} \frac{H_N(\sigma)}{N} = \lim_{\beta \to \infty} F_\beta = \lim_{\beta \to \infty} \inf_{\alpha \in \mathcal{M}} \mathcal{P}_\beta(\alpha), \tag{3} \]
where the validity of the first equality can be found, for instance, in Panchenko's book [17, Chapter 1]. Recently, the analysis of the $\beta$-limit of the second equality was carried out in Auffinger-Chen [4]. Let $\mathcal{U}$ denote the collection of all cumulative distribution functions $\gamma$ on $[0,1)$ induced by any measures on $[0,1)$ and satisfying $\int_0^1 \gamma(t)\, dt < \infty$. Denote by $\gamma(dt)$ the measure that induces $\gamma$ and endow $\mathcal{U}$ with the $L^1(dt)$-distance. For each $\gamma \in \mathcal{U}$, consider the weak solution to the Parisi PDE,
\[ \partial_t \Psi_\gamma(t,x) = -\frac{\xi''(t)}{2} \Bigl( \partial_{xx} \Psi_\gamma(t,x) + \gamma(t) \bigl( \partial_x \Psi_\gamma(t,x) \bigr)^2 \Bigr) \]
for $(t,x) \in [0,1) \times \mathbb{R}$ with boundary condition
\[ \Psi_\gamma(1,x) = |x|. \]
One may find the existence and regularity properties of this PDE solution in [7]. The Parisi functional at zero temperature is given by
\[ \mathcal{P}(\gamma) = \Psi_\gamma(0,h) - \frac{1}{2} \int_0^1 t \xi''(t) \gamma(t)\, dt. \tag{4} \]
Auffinger and Chen [4] proved that the maximum energy can be computed through
\[ GSE = \inf_{\gamma \in \mathcal{U}} \mathcal{P}(\gamma) \quad \text{a.s.} \tag{5} \]
We call this variational representation the Parisi formula at zero temperature. It was proved in [7] that this formula has a unique minimizer, denoted by $\gamma_P$. We call $\gamma_P(dt)$ the Parisi measure at zero temperature. We say that the model is FRSB at zero temperature if $\gamma_P(dt)$ contains infinitely many points in its support. Our first main result is a proof of Parisi's FRSB prediction at zero temperature.

Theorem 1.

For any $\xi$ and $h$, the mixed $p$-spin model at zero temperature is FRSB.

Similar to the role that the Parisi measure at positive temperature plays in describing the behavior of the model, the Parisi measure at zero temperature also has its own relevance in understanding the energy landscape of the Hamiltonian around the maximum energy. Indeed, consider the mixed even $p$-spin model, i.e., $c_p = 0$ for all odd $p \ge 3$. It can be shown that for any $\varepsilon, \eta > 0$ and $u$ in the support of $\gamma_P(dt)$, there exists some constant $K > 0$ independent of $N$ such that
\[ \mathbb{P}\Bigl( \exists\, \sigma^1, \sigma^2 \text{ such that } R_{1,2} \in (u - \varepsilon, u + \varepsilon) \text{ and } \frac{H_N(\sigma^1)}{N}, \frac{H_N(\sigma^2)}{N} \ge GSE - \eta \Bigr) \ge 1 - K e^{-N/K} \tag{6} \]
for all $N \ge 1$. This means that for any $u \in \operatorname{supp} \gamma_P$, one can always find, with overwhelming probability, two spin configurations around the maximum energy whose overlap is near $u$. The bound (6) can be obtained by means of the Guerra-Talagrand replica symmetry breaking bound for the maximum coupled energy with overlap constraint (see [7, Subsection 3.1] and [5]). Knowing that the model is FRSB by Theorem 1 indicates that the spin configurations around the maximum energy are not simply clustered into equidistant groups. This is in sharp contrast to the energy landscape of the spherical version of the mixed $p$-spin model, where in the pure $p$-spin model, i.e., $\xi(t) = t^p$ for $p \ge 3$, it was shown by Subag [26] that around the maximum energy, the spin configurations are essentially orthogonally structured. This structure was also presented in more general mixtures of the spherical model in the recent work of Auffinger and Chen [5].

Remark 1. The problem of computing the maximum energy is also generally known as Dean's problem and is frequently used to motivate the theory of mean field spin glasses, see [13, 17]. More recently, the formula (3) has appeared in other optimization problems related to theoretical computer science such as extremal cuts on sparse random graphs, see [8, 24] and the references therein.

We now return to the positive temperature case. Recall the Parisi measure $\alpha_{P,\beta}$ introduced in (2). Our second main result, as a consequence of Theorem 1, shows that for any mixture parameter $\xi$ and external field $h$, the number of levels of replica symmetry breaking must diverge as $\beta$ goes to infinity.

Theorem 2.

Let $k \ge 1$. For any $\xi$ and $h$, there exists $\beta_k$ such that the mixed $p$-spin model is at least $k$-RSB for all $\beta > \beta_k$.

We finish this section with some historical remarks and a description of the main novelty of our approach. For the SK model without external field, the Parisi measure was shown to be RS in the high temperature regime $\beta <$ […] $\beta > 1$. The whole region $\beta >$ […] $p$-spin model with $h = 0$ is at least 2-RSB. It is also believed that the functional order parameter $\alpha_{P,\beta}$ is not only FRSB at low temperature, but also has an absolutely continuous part [13, Chapter III]. Regularity properties of Parisi measures can be found in [2].

The main novelty of our approach to Theorem 1 is to explore the Parisi formula for the ground state energy (5) by considering a perturbation around the point 1. In short, we show that it is always possible to lower the value of the Parisi functional of any atomic measure with finitely many atoms by adding a large enough jump near 1. At finite temperature, since the Parisi measure is a probability measure, the idea of adding a large jump is not feasible. As the reader will see, some miraculous cancellations (see Lemma 2 and Proposition 2 among others) occur during the proof. These cancellations mostly come from exact computations that use the fact that the boundary condition of $\Psi_\gamma$ at 1 is $|x|$. Theorem 2 follows from Theorem 1 after some weak convergence considerations.

In this section, we show that for any atomic $\gamma(ds)$ with finitely many jumps, one can always lower the value of the Parisi functional by a perturbation of $\gamma$ around 1. Let $\gamma \in \mathcal{U}$ be fixed. Suppose that $\gamma(dt)$ is atomic and consists of finitely many jumps, that is,
\[ \gamma(t) = \sum_{i=0}^{n-1} m_i \mathbf{1}_{[q_i, q_{i+1})}(t) + m_n \mathbf{1}_{[q_n, 1)}(t), \]
where $(q_i)_{0 \le i \le n}$ and $(m_i)_{0 \le i \le n}$ satisfy
\[ 0 = q_0 < q_1 < q_2 < \cdots < q_n < 1, \qquad 0 \le m_0 < m_1 < m_2 < \cdots < m_n < \infty. \tag{7} \]
Here and in what follows, $\mathbf{1}_B(t) = \mathbf{1}[t \in B]$ is the indicator function of the set $B \subset \mathbb{R}$. Let $m_{n+1}$ be any number greater than $m_n$. For any $q \in (q_n, 1)$, consider a perturbation of $\gamma$ by
\[ \gamma_q(t) = \sum_{i=0}^{n-1} m_i \mathbf{1}_{[q_i, q_{i+1})}(t) + m_n \mathbf{1}_{[q_n, q)}(t) + m_{n+1} \mathbf{1}_{[q, 1)}(t). \tag{8} \]
In other words, we add a jump to the top of $\gamma$. Our main result is the following theorem.
It says that if $m_{n+1}$ is large enough, then the Parisi functional evaluated at the perturbed measure $\gamma_q(dt)$ has a smaller value than $\mathcal{P}(\gamma)$ locally for $q$ near 1.

Theorem 3. There exist $m_{n+1} > m_n$ and $\eta \in (q_n, 1)$ such that
\[ \mathcal{P}(\gamma_q) < \mathcal{P}(\gamma) \]
for all $\eta \le q < 1$.

The following three subsections are devoted to the proof of Theorem 3.

We start by observing that the Parisi functional at $\gamma$ admits a probabilistic expression by an application of the Cole-Hopf transformation to the Parisi PDE. Indeed, let $z_0, \ldots, z_n$ be i.i.d. standard Gaussian random variables. Denote
\[ J = h + \sum_{i=0}^{n-1} z_i \sqrt{\xi'(q_{i+1}) - \xi'(q_i)} + z_n \sqrt{\xi'(1) - \xi'(q_n)}. \]
Set $X_{n+1} = |J|$. Define iteratively, for $0 \le i \le n$,
\[ X_i = \frac{1}{m_i} \log \mathbb{E}_{z_i} \exp(m_i X_{i+1}), \]
where $\mathbb{E}_{z_i}$ stands for the expectation with respect to $z_i$. Here $X_0$ is defined as $\mathbb{E}_{z_0} X_1$ if $m_0 = 0$. Then $\Psi_\gamma(0,h) = X_0$ and thus,
\[ \mathcal{P}(\gamma) = X_0 - \frac{1}{2} \sum_{i=0}^{n-1} m_i \int_{q_i}^{q_{i+1}} t \xi''(t)\, dt - \frac{1}{2} m_n \int_{q_n}^{1} t \xi''(t)\, dt. \]
Recall the perturbation $\gamma_q$ from (8). Clearly $\gamma_q = \gamma$ on $[0, q)$ for all $q_n < q < 1$. For notational convenience, we denote
\[ q_{n+1} = q, \qquad q_{n+2} = 1. \tag{9} \]
In a similar manner, by applying the Cole-Hopf transformation, we can express $\Psi_{\gamma_q}(0,h)$ as follows. Let $z_{n+1}$ be a standard Gaussian random variable independent of $z_0, \ldots, z_n$. Define
\[ Y_{n+2} = \Bigl| h + \sum_{j=0}^{n+1} z_j \sqrt{\xi'(q_{j+1}) - \xi'(q_j)} \Bigr| \]
and iteratively, for $0 \le i \le n + 1$,
\[ Y_i = \frac{1}{m_i} \log \mathbb{E}_{z_i} \exp(m_i Y_{i+1}). \]
Here again we let $Y_0 = \mathbb{E}_{z_0} Y_1$ whenever $m_0 = 0$. Thus, $\Psi_{\gamma_q}(0,h) = Y_0$ for any $q \in (q_n, 1)$. As a result,
\[ \mathcal{P}(\gamma_q) = Y_0 - \frac{1}{2} \sum_{i=0}^{n-1} m_i \int_{q_i}^{q_{i+1}} t \xi''(t)\, dt - \frac{1}{2} m_n \int_{q_n}^{q} t \xi''(t)\, dt - \frac{1}{2} m_{n+1} \int_{q}^{1} t \xi''(t)\, dt. \tag{10} \]
In particular, we have $\lim_{q \to 1-} \Psi_{\gamma_q}(0,h) = \Psi_\gamma(0,h)$ and $\lim_{q \to 1-} \mathcal{P}(\gamma_q) = \mathcal{P}(\gamma)$.

2.2 Some auxiliary lemmas

We state two propositions that will be heavily used in our main proof in the next subsection. Let $0 \le a < t < b$ and $0 < m < m'$.
Denote by $z$ a standard normal random variable. Define
\[
\begin{aligned}
A(t,x) &= \frac{1}{m'} \log \mathbb{E} \exp\bigl( m' \bigl| x + z\sqrt{b - t} \bigr| \bigr), \\
B(t,x) &= \frac{1}{m} \log \mathbb{E} \exp\bigl( m A(t, x + z\sqrt{t - a}) \bigr), \\
C(t,x) &= \mathbb{E} A_x(t, x + z\sqrt{t - a})^2\, V(t, x, x + z\sqrt{t - a}),
\end{aligned}
\tag{11}
\]
where
\[ V(t,x,y) = e^{m (A(t,y) - B(t,x))}. \]
Here $A_x(t,x)$ is the partial derivative of $A(t,x)$ in $x$. In what follows, we will adopt the same notation for $A_{xx}(t,x)$, $A_t(t,x)$, $B_t(t,x)$, $C_t(t,x)$, etc. for the partial derivatives with respect to the subscripts. We will also consider these functions applied to random variables. Using again $z$ to denote a standard Gaussian, we set $V = V(t, x, x + z\sqrt{t - a})$, $A_x = A_x(t, x + z\sqrt{t - a})$, etc. The main results of this subsection are the following two propositions.

Proposition 1. For any $(t,x) \in [a,b) \times \mathbb{R}$, we have that
\[ B_t(t,x) = \frac{m - m'}{2}\, C(t,x) \tag{12} \]
and
\[ C_t(t,x) = \mathbb{E}\bigl( A_{xx}^2 + 2(m - m') A_{xx} A_x^2 \bigr) V + \frac{(m - m')\, m}{2} \Bigl( \mathbb{E} A_x^4 V - \bigl( \mathbb{E} A_x^2 V \bigr)^2 \Bigr). \tag{13} \]

Remark 2. The functions (11) and the formula (12) also appeared in [29, Section 14.7] in a similar manner, where in the exponent of $A$, the author used the random variable $\beta^{-1} \log \cosh(\beta(x + z\sqrt{b - t}))$ instead of $|x + z\sqrt{b - t}|$.

Proposition 2. We have that
\[ \lim_{t \to b-} C(t,x) = 1 \tag{14} \]
and
\[ \liminf_{t \to b-} C_t(t,x) \ge \frac{2(m + m')}{3}\, \Delta(x), \]
where
\[ \Delta(x) = \frac{2}{\sqrt{2\pi (b - a)}} \cdot \frac{e^{-\frac{x^2}{2(b - a)}}}{\mathbb{E} e^{m |x + z\sqrt{b - a}|}}. \tag{15} \]

Before we turn to the proof of Propositions 1 and 2, we first gather some fundamental properties of the function $A$.

Lemma 1. $A$ is the classical solution to the following PDE with boundary condition $A(b,x) = |x|$,
\[ A_t(t,x) = -\frac{1}{2} \bigl( A_{xx}(t,x) + m' A_x(t,x)^2 \bigr) \tag{16} \]
for $(t,x) \in [a,b) \times \mathbb{R}$. In addition,
\[ |A_x(t,x)| \le 1, \quad (t,x) \in [a,b) \times \mathbb{R}, \tag{17} \]
\[ \lim_{t \to b-} A_x(t,x) = \operatorname{sign}(x), \quad \forall x \in \mathbb{R} \setminus \{0\}, \tag{18} \]
\[ \lim_{t \to b-} \mathbb{E} A_x^{2k} V = 1, \quad \forall k \ge 1,\ \forall\, 0 < m < m', \tag{19} \]
where $\operatorname{sign}(x) = 1$ if $x > 0$ and $= -1$ if $x < 0$.

Proof.

Define
\[ g(t,x) = e^{\frac{(b - t) m'^2}{2} + m' x}\, \Phi\Bigl( m' \sqrt{b - t} + \frac{x}{\sqrt{b - t}} \Bigr), \]
where $\Phi$ is the cumulative distribution function of a standard normal random variable. Note that a direct computation gives
\[ \mathbb{E} e^{m' |x + z\sqrt{b - t}|} = g(t,x) + g(t,-x). \]
Thus,
\[ A(t,x) = \frac{1}{m'} \log \bigl( g(t,x) + g(t,-x) \bigr). \]
From this expression, we can compute that
\[
\begin{aligned}
A_x(t,x) &= \frac{g(t,x) - g(t,-x)}{g(t,x) + g(t,-x)}, \\
A_{xx}(t,x) &= m' \Bigl( 1 - \Bigl( \frac{g(t,x) - g(t,-x)}{g(t,x) + g(t,-x)} \Bigr)^2 \Bigr) + 2\Gamma(t,x), \\
A_t(t,x) &= -\frac{m'}{2} - \Gamma(t,x),
\end{aligned}
\tag{20}
\]
where
\[ \Gamma(t,x) := \frac{1}{\sqrt{2\pi (b - t)}} \cdot \frac{e^{-\frac{x^2}{2(b - t)}}}{g(t,x) + g(t,-x)}. \]
Therefore, these equations together validate (16). From the first equation, we can also conclude (17) and (18). Note that $\lim_{t \to b-} V(t,x,y) = V(b,x,y)$ and $\ln V(t, \cdot, \cdot)$ is at most of linear growth. From (17) and (18), the dominated convergence theorem implies (19).

Proof of Proposition 1. Recall that the Gaussian integration by parts states that for a standard normal random variable $z$, $\mathbb{E} z f(z) = \mathbb{E} f'(z)$ for all absolutely continuous functions $f$ such that $\ln |f|$ is at most of linear growth at infinity. From this formula and the PDE (16), the partial derivative of $B$ in $t$ is given by
\[
\begin{aligned}
B_t(t,x) &= \mathbb{E}\Bigl( A_t + \frac{z}{2\sqrt{t - a}} A_x \Bigr) V \\
&= \mathbb{E}\Bigl( -\frac{1}{2} \bigl( A_{xx} + m' A_x^2 \bigr) + \frac{1}{2} \bigl( A_{xx} + m A_x^2 \bigr) \Bigr) V \\
&= \frac{m - m'}{2}\, \mathbb{E} A_x^2 V,
\end{aligned}
\]
which gives (12). To compute the partial derivative of $C$ in $t$, write $C_t = I + II$, where
\[
\begin{aligned}
I &:= 2\, \mathbb{E}\Bigl( A_{tx} + \frac{z}{2\sqrt{t - a}} A_{xx} \Bigr) A_x V, \\
II &:= m\, \mathbb{E} A_x^2 \Bigl( A_t + \frac{z}{2\sqrt{t - a}} A_x - B_t(t,x) \Bigr) V.
\end{aligned}
\]
Here, from (16), since
\[ A_{tx} = -\frac{1}{2} \bigl( A_{xxx} + 2 m' A_{xx} A_x \bigr), \]
using the Gaussian integration by parts again gives
\[
\begin{aligned}
I &= 2\, \mathbb{E}\Bigl( A_{tx} A_x + \frac{1}{2} \bigl( A_{xxx} A_x + A_{xx}^2 + m A_{xx} A_x^2 \bigr) \Bigr) V \\
&= \mathbb{E}\Bigl( -A_x \bigl( A_{xxx} + 2 m' A_{xx} A_x \bigr) + \bigl( A_{xxx} A_x + A_{xx}^2 + m A_{xx} A_x^2 \bigr) \Bigr) V \\
&= \mathbb{E}\bigl( A_{xx}^2 + (m - 2m') A_{xx} A_x^2 \bigr) V.
\end{aligned}
\]
In addition, from (16),
\[
\begin{aligned}
II &= m\, \mathbb{E}\Bigl( -\frac{1}{2} \bigl( A_{xx} A_x^2 + m' A_x^4 \bigr) + \frac{1}{2} \bigl( 3 A_{xx} A_x^2 + m A_x^4 \bigr) - A_x^2 B_t(t,x) \Bigr) V \\
&= m\, \mathbb{E} A_{xx} A_x^2 V + \frac{(m - m')\, m}{2} \Bigl( \mathbb{E} A_x^4 V - \bigl( \mathbb{E} A_x^2 V \bigr)^2 \Bigr).
\end{aligned}
\]
From these, (13) follows.

To handle the limits in Proposition 2, we need two lemmas.
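The closed form for $A$ from the proof of Lemma 1 makes it possible to sanity-check the PDE (16) and the identity (12) numerically. The following is a minimal sketch, not taken from the paper: the parameter values for $a, b, m, m'$, the evaluation point, the truncation of Gaussian expectations to $[-12, 12]$, and the helper names (`simpson`, `gauss_mean`) are all illustrative assumptions, and derivatives are approximated by central finite differences.

```python
import math


def Phi(u):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0


def gauss_mean(f):
    """E f(z) for a standard normal z, truncated to [-12, 12] (assumption)."""
    dens = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return simpson(lambda z: f(z) * dens(z), -12.0, 12.0)


A_LO, B_HI, M, MP = 0.2, 1.0, 1.0, 3.0  # a, b, m, m' (illustrative values)


def g(t, x):
    s = math.sqrt(B_HI - t)
    return math.exp(0.5 * (B_HI - t) * MP ** 2 + MP * x) * Phi(MP * s + x / s)


def A(t, x):
    """A(t, x) = (1/m') log(g(t, x) + g(t, -x))."""
    return math.log(g(t, x) + g(t, -x)) / MP


def A_x(t, x):
    """First line of (20)."""
    return (g(t, x) - g(t, -x)) / (g(t, x) + g(t, -x))


def B(t, x):
    r = math.sqrt(t - A_LO)
    return math.log(gauss_mean(lambda z: math.exp(M * A(t, x + z * r)))) / M


def C(t, x):
    r = math.sqrt(t - A_LO)
    b_val = B(t, x)
    return gauss_mean(
        lambda z: A_x(t, x + z * r) ** 2 * math.exp(M * (A(t, x + z * r) - b_val))
    )


t0, x0, eps = 0.6, 0.3, 1e-4
A_t_fd = (A(t0 + eps, x0) - A(t0 - eps, x0)) / (2 * eps)
A_x_fd = (A(t0, x0 + eps) - A(t0, x0 - eps)) / (2 * eps)
A_xx_fd = (A(t0, x0 + eps) - 2 * A(t0, x0) + A(t0, x0 - eps)) / eps ** 2
# Residual of the PDE (16): A_t + (A_xx + m' A_x^2) / 2 should vanish.
pde_residual = A_t_fd + 0.5 * (A_xx_fd + MP * A_x_fd ** 2)
# Residual of (12): B_t - (m - m') C / 2 should vanish.
B_t_fd = (B(t0 + eps, x0) - B(t0 - eps, x0)) / (2 * eps)
prop1_residual = B_t_fd - 0.5 * (M - MP) * C(t0, x0)
```

Both residuals should be zero up to discretization error. The check is only performed at one point away from $t = b$, where $A$ is smooth; it says nothing about the singular limit $t \to b-$ treated in Proposition 2.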

Lemma 2. For any odd $k \ge 1$, there exists a constant $K$ independent of $t$ such that
\[ \mathbb{E} A_x^{k-1} A_{xx} V \le \frac{K e^{m |x|}}{\sqrt{t - a}} \tag{21} \]
for all $t \in [a,b)$ and $x \in \mathbb{R}$. Moreover,
\[ \lim_{t \to b-} \mathbb{E} A_x^{k-1} A_{xx} V = \frac{1}{k}\, \Delta(x), \tag{22} \]
where $\Delta(x)$ is defined in Proposition 2.

Proof. Define
\[ D(t,x) = \mathbb{E}\, z A_x^k(t, x + z\sqrt{t - a})\, V(t, x, x + z\sqrt{t - a}). \]
Note that $|A_x(t,x)| \le 1$ and $B(t,x) \ge 0$. We have
\[ V(t,x,y) = e^{m (A(t,y) - B(t,x))} \le e^{m A(t,0) + m |y|}. \]
Using the Gaussian integration by parts, we can write
\[ D(t,x) = \sqrt{t - a}\, \mathbb{E}\bigl( k A_x^{k-1} A_{xx} + m A_x^{k+1} \bigr) V. \tag{23} \]
This and the previous inequality together imply (21) since
\[ k \sqrt{t - a}\, \mathbb{E} A_x^{k-1} A_{xx} V \le D(t,x) \le e^{m A(a,0) + m |x|}\, \mathbb{E} |z| e^{m |z| \sqrt{b - a}}, \]
where the first inequality used the fact that $k + 1$ is even. Next, we verify (22). Note that from (17) and (18), the dominated convergence theorem implies
\[ \lim_{t \to b-} D(t,x) = \frac{\mathbb{E}\, z \operatorname{sign}(x + z\sqrt{b - a})\, e^{m |x + z\sqrt{b - a}|}}{\mathbb{E} e^{m |x + z\sqrt{b - a}|}} = \sqrt{b - a}\, \bigl( \Delta(x) + m \bigr), \]
where the second equation used the fact that
\[ \mathbb{E}\, z \operatorname{sign}(x + z\sqrt{b - a})\, e^{m |x + z\sqrt{b - a}|} = \frac{2}{\sqrt{2\pi}} e^{-\frac{x^2}{2(b - a)}} + m \sqrt{b - a}\, \mathbb{E} e^{m |x + z\sqrt{b - a}|}. \tag{24} \]
See the verification of this equation in the appendix. In addition, since $k + 1$ is even, (19) yields
\[ \lim_{t \to b-} \mathbb{E} A_x^{k+1} V = 1. \]
Thus, from (23) and the last two limits,
\[ \Delta(x) \sqrt{b - a} + m \sqrt{b - a} = \lim_{t \to b-} D(t,x) = \sqrt{b - a}\, \Bigl( k \lim_{t \to b-} \mathbb{E} A_x^{k-1} A_{xx} V + m \Bigr), \]
from which (22) follows.

Lemma 3. We have that
\[ \liminf_{t \to b-} \mathbb{E} A_{xx}^2 V \ge \frac{4 m'}{3}\, \Delta(x). \]

Proof. Recall the middle equation of (20). We see that
\[ A_{xx} = m' (1 - A_x^2) + 2\Gamma \tag{25} \]
on $[a,b) \times \mathbb{R}$. Here $\Gamma = \Gamma(t, x + z\sqrt{t - a})$ as usual. Using (19), (25), and Lemma 2 with $k = 1$ gives
\[ \lim_{t \to b-} \mathbb{E}\, \Gamma V = \frac{1}{2}\, \Delta(x). \]
Multiplying (25) by $A_x^2$ and applying (19) and Lemma 2 with $k = 3$ yield
\[ \lim_{t \to b-} \mathbb{E} A_x^2 \Gamma V = \frac{1}{6}\, \Delta(x). \]
From (25), since
\[ A_{xx}^2 = \bigl( m' (1 - A_x^2) + 2\Gamma \bigr)^2 \ge \bigl[ m' (1 - A_x^2) \bigr]^2 + 4 m' (1 - A_x^2) \Gamma, \]
the announced result follows by the last two limits.

Proof of Proposition 2.

The statement (14) follows from (11) and (19). From (13), Lemma 2, and Lemma 3,
\[
\begin{aligned}
\liminf_{t \to b-} C_t(t,x) &\ge \liminf_{t \to b-} \mathbb{E}\bigl( A_{xx}^2 + 2(m - m') A_{xx} A_x^2 \bigr) V \\
&\ge \liminf_{t \to b-} \mathbb{E} A_{xx}^2 V + 2(m - m') \lim_{t \to b-} \mathbb{E} A_{xx} A_x^2 V \\
&\ge \frac{4 m'}{3}\, \Delta(x) + \frac{2(m - m')}{3}\, \Delta(x) \\
&= \frac{2(m + m')}{3}\, \Delta(x),
\end{aligned}
\]
where the first inequality used (19).

Recall the sequences $(q_i)_{0 \le i \le n+2}$ and $(m_i)_{0 \le i \le n+1}$ from (7) and (9). Recall the quantities $m, m', a, b$ and the functions $A, B, C, V$ from (11). From now on, we take
\[ m = m_n, \quad m' = m_{n+1}, \quad a = \xi'(q_n), \quad b = \xi'(1), \]
and let
\[ \hat A(q,x) = A(\xi'(q), x), \quad \hat B(q,x) = B(\xi'(q), x), \quad \hat C(q,x) = C(\xi'(q), x), \quad \hat V(q,x,y) = V(\xi'(q), x, y). \]
For $0 \le i \le n$, set
\[ W_i = \exp m_i (Y_{i+1} - Y_i). \]
Denote
\[ Z = h + \sum_{j=0}^{n-1} z_j \sqrt{\xi'(q_{j+1}) - \xi'(q_j)}, \]
and
\[ \phi(q) = \mathbb{E} W_0 \cdots W_{n-1} \hat C(q, Z). \]

Lemma 4. We have that
\[ \partial_q \mathcal{P}(\gamma_q) = \frac{\xi''(q)}{2} (m_{n+1} - m_n) \bigl( q - \phi(q) \bigr) \tag{26} \]
and
\[
\begin{aligned}
\phi'(q) ={}& \frac{\xi''(q)(m_n - m_{n+1})}{2} \sum_{i=0}^{n-1} m_i\, \mathbb{E}\Bigl[ W_0 \cdots W_i \bigl( \mathbb{E}_{i+1}\bigl[ W_{i+1} \cdots W_{n-1} \hat C(q,Z) \bigr] \bigr)^2 \Bigr] \\
&- \frac{\xi''(q)(m_n - m_{n+1})}{2} \sum_{i=0}^{n-1} m_i\, \mathbb{E}\Bigl[ W_0 \cdots W_{i-1} \bigl( \mathbb{E}_{z_i}\bigl[ W_i\, \mathbb{E}_{i+1}\bigl[ W_{i+1} \cdots W_{n-1} \hat C(q,Z) \bigr] \bigr] \bigr)^2 \Bigr] \\
&+ \mathbb{E} W_0 \cdots W_{n-1} \hat C_q(q,Z),
\end{aligned}
\tag{27}
\]
where $\mathbb{E}_i$ is the expectation with respect to $z_i, \ldots, z_{n-1}$ and $\hat C_q$ is the partial derivative with respect to $q$.

Proof. Observe that for $0 \le i \le n - 1$,
\[ \partial_q Y_i = \mathbb{E}_{z_i} W_i\, \partial_q Y_{i+1}. \]
An induction argument yields
\[ \partial_q Y_i = \mathbb{E}_i W_i \cdots W_{n-1}\, \partial_q Y_n \]
for $0 \le i \le n - 1$. Since $Y_n = \hat B(q, Z)$, the equation (12) leads to
\[ \partial_q Y_i = \frac{\xi''(q)(m_n - m_{n+1})}{2}\, \mathbb{E}_i W_i \cdots W_{n-1} \hat C(q, Z). \tag{28} \]
From (10), since
\[ \partial_q \mathcal{P}(\gamma_q) = \partial_q Y_0 + \frac{q \xi''(q)}{2} (m_{n+1} - m_n), \]
this and (28) with $i = 0$ yield (26). On the other hand, for $0 \le i \le n - 1$, from (28),
\[
\begin{aligned}
\partial_q W_i &= m_i \bigl( \partial_q Y_{i+1} - \partial_q Y_i \bigr) W_i \\
&= \frac{\xi''(q)}{2} (m_n - m_{n+1})\, m_i W_i \bigl( \mathbb{E}_{i+1} W_{i+1} \cdots W_{n-1} \hat C(q,Z) - \mathbb{E}_i W_i \cdots W_{n-1} \hat C(q,Z) \bigr),
\end{aligned}
\]
where $\mathbb{E}_{i+1} W_{i+1} \cdots W_{n-1} \hat C(q,Z) = \hat C(q,Z)$ if $i = n - 1$. Finally, since
\[ \phi'(q) = \sum_{i=0}^{n-1} \mathbb{E} W_0 \cdots W_{i-1} (\partial_q W_i) W_{i+1} \cdots W_{n-1} \hat C(q,Z) + \mathbb{E} W_0 \cdots W_{n-1} \hat C_q(q,Z), \]
plugging the last equation into this derivative yields (27).

Proof of Theorem 3. Recall $\phi'(q)$ from (27). Let $W_0', \ldots, W_{n-1}'$ be $W_0, \ldots, W_{n-1}$ evaluated at $q = 1$. Note that $\mathbb{E}_{z_i} W_i = 1$ for all $0 \le i \le n - 1$ and $|\hat C| \le 1$. Since $\lim_{q \to 1-} \hat C(q,Z) = 1$ by (14), the two sums in (27) cancel in the limit, and Fatou's lemma gives
\[
\begin{aligned}
\liminf_{q \to 1-} \phi'(q) &= \liminf_{q \to 1-} \mathbb{E} W_0 \cdots W_{n-1} \hat C_q(q,Z) \\
&\ge \mathbb{E} W_0' \cdots W_{n-1}' \liminf_{q \to 1-} \hat C_q(q,Z) \\
&\ge \frac{2 \xi''(1)(m_n + m_{n+1})}{3}\, \mathbb{E} W_0' \cdots W_{n-1}' \Delta(Z),
\end{aligned}
\tag{29}
\]
where $\Delta(Z)$ is defined through (15) with $a = \xi'(q_n)$, $b = \xi'(1)$, and $m = m_n$. We emphasize that although we do not know whether $\hat C_q$ is nonnegative (see (13)), the use of Fatou's lemma remains justifiable. Indeed, note that $|\hat A_x| \le 1$, $\mathbb{E}_{z_n} \hat V = 1$, and by (21),
\[ 0 \le \mathbb{E}_{z_n} \hat A_{xx}(q,Z) \hat A_x(q,Z)^2 \hat V\bigl( q, Z, Z + z_n \sqrt{\xi'(q) - \xi'(q_n)} \bigr) \le \frac{K e^{m_n |Z|}}{\sqrt{\xi'(q) - \xi'(q_n)}}, \]
where $K$ is a constant independent of $q$. From (13),
\[ \hat C_q(q,Z) \ge -(m_{n+1} - m_n)\, \xi''(q) \Bigl( \frac{2 K e^{m_n |Z|}}{\sqrt{\xi'(q) - \xi'(q_n)}} + \frac{m_n}{2} \Bigr). \]
In addition, it can be shown that each $\ln(W_0 W_1 \cdots W_{n-1})$ is at most of linear growth in $z_0, \ldots, z_{n-1}$, following from the fact that each $Y_i$ is uniformly Lipschitz in the variable $z_i$ for all $q \in [q_n, 1]$. This and the last inequality together validate (29).

Next, from (29), we can choose $m_{n+1}$ large enough at the beginning such that
\[ \liminf_{q \to 1-} \phi'(q) > 1. \]
Note that $\lim_{q \to 1-} \phi(q) = 1$. Hence $\phi(q) < q$ for $q$ sufficiently close to 1, and from (26) this implies that $\partial_q \mathcal{P}(\gamma_q) > 0$ for this $m_{n+1}$ as long as $q$ is sufficiently close to 1. Therefore $\mathcal{P}(\gamma_q)$ is strictly increasing in $q$ near 1, and since $\lim_{q \to 1-} \mathcal{P}(\gamma_q) = \mathcal{P}(\gamma)$, we obtain $\mathcal{P}(\gamma_q) < \mathcal{P}(\gamma)$ for such $q$. This completes our proof.
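The Cole-Hopf recursion used above can be evaluated numerically in the simplest case $n = 0$, which may help in exploring Theorem 3. The following is a minimal sketch under explicit assumptions that are not part of the paper: the SK choice $\xi(s) = s^2/2$ (so $\xi'(s) = s$ and $\xi'' \equiv 1$), a single-step $\gamma = m_0 \mathbf{1}_{[0,1)}$ with $m_0 > 0$, Gaussian expectations truncated to $[-12, 12]$, and illustrative function names.

```python
import math


def Phi(u):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0


def log_mgf_abs(m, var, y):
    """(1/m) log E exp(m |y + z sqrt(var)|) in closed form, for m > 0 and var > 0."""
    s = math.sqrt(var)

    def half(u):
        return math.exp(0.5 * var * m * m + m * u) * Phi(m * s + u / s)

    return math.log(half(y) + half(-y)) / m


def parisi_one_step(m0, h):
    """P(gamma) for gamma = m0 on [0, 1), assuming xi(s) = s^2 / 2.

    X_0 = (1/m0) log E exp(m0 |h + z|) and the integral term equals m0 / 4.
    """
    return log_mgf_abs(m0, 1.0, h) - m0 / 4.0


def parisi_perturbed(m0, m1, q, h):
    """P(gamma_q) for gamma_q = m0 on [0, q) and m1 on [q, 1), as in (10)."""
    dens = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Y_1 is available in closed form; Y_0 needs one Gaussian integral.
    inner = lambda z: math.exp(m0 * log_mgf_abs(m1, 1.0 - q, h + z * math.sqrt(q)))
    y0 = math.log(simpson(lambda z: inner(z) * dens(z), -12.0, 12.0)) / m0
    return y0 - m0 * q * q / 4.0 - m1 * (1.0 - q * q) / 4.0
```

When $m_1 = m_0$ the perturbation is trivial and the two functions must agree, which gives a simple consistency check of the recursion. Theorem 3 asserts that `parisi_perturbed(m0, m1, q, h)` is strictly smaller than `parisi_one_step(m0, h)` once `m1` is large enough and `q` is close enough to 1; how large and how close depends on the parameters, and no specific numerical outcome is claimed here.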

Remark 3. The validity of (29) and Theorem 3 relies on the positive lower bound of $C_t$ coming from Proposition 2. When one looks at (13) together with the fact $\lim_{t \to b-} A_x^2 = 1$, it is tempting to think that $C_t$ is actually negative since $m' = m_{n+1}$ is taken to be large. As a result, Proposition 2 may look counterintuitive. The resolution of this puzzle is the fact that $A_{xx}$ is singular in the limit $t \to b-$ and the dominated convergence theorem does not apply. These "singular expectations" are one of the major difficulties in proving Theorem 3. They are handled by the exact computations coming from Lemmas 2 and 3.

Proof of Theorem 1. We prove Theorem 1 by contradiction. First, note that it is known by [7, Theorem 6] that the Parisi measure $\gamma_P$ is not constantly zero. Suppose that the support of $\gamma_P$ consists of only $n \ge 1$ points. Then, by Theorem 3, the value of the Parisi functional can be lowered by the perturbation of $\gamma_P$ at 1 defined in (8). This contradicts the minimality of $\mathcal{P}(\gamma_P)$. Hence, the support of $\gamma_P$ must contain infinitely many points.

Remark 4. The statement of Theorem 1 can be strengthened to the fact that the Parisi measure $\gamma_P$ cannot be "flat" near 1, i.e., $\gamma_P(t) < \gamma_P(1-)$ for any $0 < t < 1$. In fact, if this is not true, then $\gamma_P$ is a constant function on $[a, 1)$ for some $a$. One can then apply essentially the same argument as in Theorem 3 to lower the Parisi functional. The only difference is that since $\gamma_P$ is not necessarily a step function on $[0, a)$, the term $W_0 \cdots W_{n-1}$ in Lemma 4 has to be replaced by a continuous modification using the optimal stochastic control representation for $\Psi_\gamma$ in [7]. We omit the details of the argument.

Remark 5. Our argument for Theorem 1 does not rely on the uniqueness of the Parisi measure. All we need is the existence of a Parisi measure, which was proved in [4].

Proof of Theorem 2. Recall the Parisi measure $\alpha_{P,\beta}$ for the free energy from (2). We first claim that $(\beta \alpha_{P,\beta})_{\beta > 0}$ converges to $\gamma_P$ vaguely on $[0,1)$. Suppose there exists an infinite sequence $(\beta_l)_{l \ge 1}$ such that $(\beta_l \alpha_{P,\beta_l})_{l \ge 1}$ does not converge to $\gamma_P$ vaguely on $[0,1)$. By an argument identical to [4, Equation (16)], we can further pass to a subsequence of $(\beta_l \alpha_{P,\beta_l})_{l \ge 1}$ such that it vaguely converges to some $\gamma$ on $[0,1)$. To ease our notation, we use $(\beta_l \alpha_{P,\beta_l})_{l \ge 1}$ to stand for this subsequence. It was established in [4, Lemma 3] that $\lim_{l \to \infty} F_{\beta_l} \ge \mathcal{P}(\gamma)$. From this,
\[ \mathcal{P}(\gamma_P) = \lim_{\beta \to \infty} F_\beta \ge \mathcal{P}(\gamma). \]
From the uniqueness of $\gamma_P$ established in [7, Theorem 4], it follows that $\gamma_P = \gamma$, a contradiction. Thus, $(\beta \alpha_{P,\beta})_{\beta > 0}$ converges to $\gamma_P$ vaguely on $[0,1)$. This completes the proof of our claim.

Next, if Theorem 2 does not hold, then from the above claim, there exists some $k \ge 1$ such that the support of $\alpha_{P,\beta}$ contains at most $k$ points for all sufficiently large $\beta$. This implies that the support of $\gamma_P$ contains at most $k$ points. This contradicts Theorem 1.

Appendix

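Denote by $\Phi(x)$ the c.d.f. of the standard normal distribution. The following lemma follows from a standard Gaussian computation and is used in (24).

Lemma 5. Suppose that $z$ is a standard normal random variable. For any $x, m \in \mathbb{R}$ and $a > 0$,
\[ \mathbb{E}\, z e^{m |x + a z|} \operatorname{sign}(x + a z) = \sqrt{\frac{2}{\pi}}\, e^{-\frac{x^2}{2 a^2}} + m a\, \mathbb{E} e^{m |x + a z|}. \]

Proof. Define
\[ f(y) = e^{m y}\, \Phi\Bigl( \frac{y}{a} + m a \Bigr) \]
for $y \in \mathbb{R}$. Note that
\[ a m z - \frac{z^2}{2} = -\frac{1}{2} (z - m a)^2 + \frac{m^2 a^2}{2}. \]
Therefore,
\[
\begin{aligned}
\mathbb{E}\, z e^{m (x + a z)} \mathbf{1}_{\{x + a z > 0\}} &= \frac{e^{m x}}{\sqrt{2\pi}} \int_{-x/a}^{\infty} z e^{a m z - \frac{z^2}{2}}\, dz \\
&= \frac{e^{m x + \frac{m^2 a^2}{2}}}{\sqrt{2\pi}} \int_{-x/a}^{\infty} (z - m a) e^{-\frac{(z - m a)^2}{2}}\, dz + \frac{m a\, e^{m x + \frac{m^2 a^2}{2}}}{\sqrt{2\pi}} \int_{-x/a}^{\infty} e^{-\frac{(z - m a)^2}{2}}\, dz \\
&= \frac{e^{-\frac{x^2}{2 a^2}}}{\sqrt{2\pi}} + m a\, e^{\frac{m^2 a^2}{2}} f(x).
\end{aligned}
\]
On the other hand, since $z' := -z$ is a standard Gaussian random variable, we may apply the above formula to obtain
\[
\begin{aligned}
\mathbb{E}\, z e^{-m (x + a z)} \mathbf{1}_{\{x + a z < 0\}} &= -\mathbb{E}\, (-z) e^{m ((-x) + a (-z))} \mathbf{1}_{\{-x + a (-z) > 0\}} \\
&= -\mathbb{E}\, z' e^{m ((-x) + a z')} \mathbf{1}_{\{-x + a z' > 0\}} \\
&= -\frac{e^{-\frac{x^2}{2 a^2}}}{\sqrt{2\pi}} - m a\, e^{\frac{m^2 a^2}{2}} f(-x).
\end{aligned}
\]
Combining these two equations together leads to
\[ \mathbb{E}\, z e^{m (x + a z)} \mathbf{1}_{\{x + a z > 0\}} - \mathbb{E}\, z e^{-m (x + a z)} \mathbf{1}_{\{x + a z < 0\}} = \sqrt{\frac{2}{\pi}}\, e^{-\frac{x^2}{2 a^2}} + m a\, e^{\frac{m^2 a^2}{2}} \bigl( f(x) + f(-x) \bigr). \]
Here, note that
\[ \mathbb{E} e^{m |x + a z|} = e^{\frac{m^2 a^2}{2} + x m}\, \Phi\Bigl( \frac{x}{a} + a m \Bigr) + e^{\frac{m^2 a^2}{2} - x m}\, \Phi\Bigl( -\frac{x}{a} + a m \Bigr) = e^{\frac{m^2 a^2}{2}} \bigl( f(x) + f(-x) \bigr). \]
This and the last equation together imply the announced result.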
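The identity in Lemma 5 is easy to check by direct numerical integration. The following is a minimal sketch, not part of the paper: the left side is computed by Simpson quadrature on each side of the sign change at $z = -x/a$, and the right side uses the closed form for $\mathbb{E} e^{m|x + az|}$ from the proof. The truncation `cutoff = 12.0` and the helper names are illustrative assumptions, and the sketch requires $|x/a|$ to be smaller than the cutoff.

```python
import math


def Phi(u):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0


def lemma5_sides(x, m, a, cutoff=12.0):
    """Return (lhs, rhs) of Lemma 5 for given x, m and a > 0."""
    dens = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    z_star = -x / a  # x + a z changes sign here
    # Left side: E z exp(m |x + a z|) sign(x + a z), split into smooth pieces.
    neg = simpson(lambda z: -z * math.exp(-m * (x + a * z)) * dens(z), -cutoff, z_star)
    pos = simpson(lambda z: z * math.exp(m * (x + a * z)) * dens(z), z_star, cutoff)
    lhs = neg + pos
    # Right side, with E exp(m |x + a z|) = exp(m^2 a^2 / 2) (f(x) + f(-x)).
    f = lambda y: math.exp(m * y) * Phi(y / a + m * a)
    mgf = math.exp(0.5 * m * m * a * a) * (f(x) + f(-x))
    rhs = math.sqrt(2.0 / math.pi) * math.exp(-x * x / (2.0 * a * a)) + m * a * mgf
    return lhs, rhs
```

Calling `lemma5_sides(0.4, 1.5, 0.8)` returns the two sides of the identity, which should agree up to quadrature error. The first term on the right is the contribution of the jump of $\operatorname{sign}(x + az)$, which is exactly the "singular" part that the dominated convergence theorem cannot see in Lemma 2.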

References

[1] Aizenman, M., Lebowitz, J. L., Ruelle, D.: Some rigorous results on the Sherrington-Kirkpatrick spin glass model. Comm. Math. Phys., 3–20 (1987).
[2] Auffinger, A., Chen, W.-K.: On properties of Parisi measures. Probab. Theory Related Fields, no. 3, 817–850 (2015).
[3] Auffinger, A., Chen, W.-K.: The Parisi formula has a unique minimizer. Comm. Math. Phys., no. 3, 1429–1444 (2015).
[4] Auffinger, A., Chen, W.-K.: Parisi formula for the ground state energy in the mixed p-spin model. arXiv:1606.05335 (2016).
[5] Auffinger, A., Chen, W.-K.: On the energy landscape near the maximum energy in the spherical mixed even p-spin model. arXiv:1702.08906 (2017).
[6] Aizenman, M., Sims, R., Starr, S. L.: An extended variational principle for the SK spin-glass model. Phys. Rev. B, 214403 (2003).
[7] Chen, W.-K., Handschy, M., Lerman, G.: On the energy landscape of the mixed even p-spin model. arXiv:1609.04368 (2016).
[8] Dembo, A., Montanari, A., Sen, S.: Extremal cuts of sparse random graphs. arXiv:1503.03923 (2015).
[9] Derrida, B.: The discovery of the broken symmetry of replicas. Journal of Physics A: Mathematical and General, (45), 451001 (2016).
[10] Edwards, S. F., Anderson, P. W.: Theory of spin glasses. Journal of Physics F: Metal Physics, (5):965 (1975).
[11] Ghirlanda, S., Guerra, F.: General properties of overlap probability distributions in disordered spin systems. Towards Parisi ultrametricity. J. Phys. A, no. 46, 9149–9155 (1998).
[12] Guerra, F.: Broken replica symmetry bounds in the mean field spin glass model. Comm. Math. Phys., no. 1 (2003).
[13] Mézard, M., Parisi, G., Virasoro, M.: Spin glass theory and beyond. World Scientific, Singapore (2004).
[14] Mézard, M., Montanari, A.: Information, physics, and computation. Oxford University Press (2009).
[15] Jagannath, A., Tobasco, I.: A dynamic programming approach to the Parisi functional. Proc. Amer. Math. Soc., 3135–3150 (2016).
[16] Panchenko, D.: The Parisi ultrametricity conjecture. Ann. of Math. (2), 383–393 (2012).
[17] Panchenko, D.: The Sherrington-Kirkpatrick model. Springer Monographs in Mathematics. Springer, New York (2013).
[18] Panchenko, D.: The Parisi formula for mixed p-spin models. Ann. Probab., no. 3, 946–958 (2014).
[19] Parisi, G.: Infinite number of order parameters for spin-glasses. Phys. Rev. Lett., 1754–1756 (1979).
[20] Parisi, G.: A sequence of approximate solutions to the SK model for spin glasses. J. Phys. A, (4):L–115 (1980).
[21] Parisi, G.: The order parameter for spin glasses: A function on the interval 0-1. Journal of Physics A: Mathematical and General, (3), 1101 (1980).
[22] Parisi, G.: Order parameter for spin-glasses. Phys. Rev. Lett., (24), 1946 (1983).
[23] Parisi, G.: Field Theory, Disorder and Simulation. World Scientific Publishing Company (1992).
[24] Sen, S.: Optimization on sparse random hypergraphs and spin glasses. arXiv:1606.02365 (2016).
[25] Sherrington, D., Kirkpatrick, S.: Solvable model of a spin-glass. Phys. Rev. Lett., (26):1792–1796 (1975).
[26] Subag, E.: The complexity of spherical p-spin models - a second moment approach. arXiv:1504.02251 (2015).
[27] Talagrand, M.: Spin Glasses: A Challenge for Mathematicians: Cavity and Mean Field Models. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics, Springer-Verlag, Berlin (2003).
[28] Talagrand, M.: The Parisi formula. Ann. of Math. (2), no. 1, 221–263 (2006).
[29] Talagrand, M.: Mean field models for spin glasses. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics, Springer-Verlag, Berlin (2011).
[30] Toninelli, F.: About the Almeida-Thouless transition line in the Sherrington-Kirkpatrick mean field spin glass model. Europhysics Letters, 60