Effect of energy degeneracy on the transition time for a series of metastable states: application to Probabilistic Cellular Automata

Abstract

We consider the problem of metastability for stochastic reversible dynamics with exponentially small transition probabilities. We generalize previous results in several directions. We give an estimate of the spectral gap of the transition matrix and of the mixing time of the associated dynamics in terms of the maximal stability level. These model-independent results hold in particular for a large class of Probabilistic Cellular Automata (PCA), which we then focus on. We consider the PCA in a finite volume, at small and fixed magnetic field, and in the limit of vanishing temperature. This model is peculiar because of the presence of three metastable states, two of which are degenerate with respect to their energy. We identify rigorously the metastable states by giving explicit upper bounds on the stability level of every other configuration. We rely on these estimates to prove a recurrence property of the dynamics, which is a cornerstone of the pathwise approach to metastability. Further, we also identify the metastable states according to the potential-theoretic approach to metastability, and this allows us to give precise asymptotics for the expected transition time from any such metastable state to the stable state.



Gianmarco Bet, Vanessa Jacquier†, and Francesca R. Nardi

Università degli Studi di Firenze; Eindhoven University of Technology

† Corresponding author

July 17, 2020

To our friend and colleague Carlo Casolo


Keywords:

Stochastic dynamics, probabilistic cellular automata, metastability, potential theory, low temperature dynamics, mixing times.

MSC2020: secondary : 60J10, 60J45, 82C22, 82C26.

Acknowledgment:

The research of Francesca R. Nardi was partially supported by the NWO Gravitation Grant 024.002.003–NETWORKS and by the PRIN Grant 20155PAWZB “Large Scale Random Structures”.

Metastability is a phenomenon that occurs when a physical system is close to a first-order phase transition. Classical examples include supersaturated vapors and ferromagnetic materials in a hysteresis loop [51]. Metastability occurs only for some values of the thermodynamical parameters, when the system is trapped for a long time in a state different from the stable state. This is the so-called metastable state. While the system is trapped, it behaves as if it were in equilibrium, except that at a certain time it makes a sudden transition from the metastable state to the stable state. Metastability occurs in several physical situations, and this has led to the formulation of numerous models for metastable behavior. In each case, however, three issues are typically investigated. The first is the study of the transition time from any metastable state to any stable state. The fluctuations of the dynamics should facilitate the transition, but these are very unlikely, so the system is typically stuck in the metastable state for an exponentially long time. The second issue is the identification of certain configurations, the so-called critical configurations, that trigger the transition. The system fluctuates in a neighborhood of the metastable state until it visits the set of critical configurations during the last excursion. After this, the system relaxes to equilibrium. The third and last issue is the study of the typical path that the system follows during the transition from the metastable state to the stable state, the so-called tube of typical trajectories. This issue is especially interesting from a physics point of view.

The goal of this paper is twofold. First, we consider general dynamics with exponentially small transition probabilities and we give an estimate of the mixing time and of the spectral gap of the transition matrix in terms of the maximal stability level.
Second, we focus on a specific Probabilistic Cellular Automaton in a finite volume, at small and fixed magnetic field, in the limit of vanishing temperature, and we prove some results describing the metastable behaviour of the system.

Let us now discuss the two goals in detail, starting with a comparison between our estimates for the mixing time and the spectral gap and the literature on the topic. Similar results on the estimate of the spectral gap have been proved for the model of simulated annealing in [38]. The authors use Sobolev inequalities to study the simulated annealing algorithm and they demonstrate that this approach gives detailed information about the rate at which the process tends to its ground state. Thanks to this result the mixing time is estimated for Metropolis dynamics. Our model-independent theorems are a generalization of the result in [45, Proposition 3.24] to reversible dynamics with exponentially small transition probabilities in finite volume. The analysis of the spectral gap between the zero eigenvalue and the next-smallest eigenvalue of the generator is of particular interest for Markov processes, since it is useful to control convergence to equilibrium. In [10] the authors focus on the connection between metastability and spectral theory for the so-called generic Markov chains under the assumption of non-degeneracy. In particular, they use spectral information to derive sharp estimates on the transition times. We refer also to [7, Chapter 8 and 16], where the authors incorporate all the previous results about the study of metastability through spectral data. In particular, they show that the spectrum of the generator decomposes into a cluster of very small real eigenvalues that are separated by a gap from the rest of the spectrum. In order to study our PCA, we extend their estimates of the spectral gap to the case of metastable states that are degenerate in energy.
The states $\sigma$ and $\eta$ are degenerate metastable states if they have the same energy and the energy barrier between them is smaller than the energy barrier between a metastable state and the stable state (see Condition 2.4 for a precise formulation and [7, Chapter 16.5 point 3] for a discussion). To suit our purposes, we express these estimates as functions of the virtual energy instead of the Hamiltonian function; see Equation (2.5) for the specific definition and [14], [21].

Regarding the expected transition time, in [25] the authors consider series of two metastable states with decreasing energy in the framework of reversible finite state space Markov chains with exponentially small transition probabilities. Under certain assumptions, not only do they find the (exponential) order of magnitude of the transition time from the first metastable state to the stable state, but they also give an addition rule to compute the prefactor. We generalize their results on the mean transition time and their addition rule to a setting with several degenerate metastable states, see Section 2.4 for details.

The second goal concerns a particular Probabilistic Cellular Automaton (PCA). Cellular Automata (CA) are discrete-time dynamical systems on a spatially extended discrete space and are used in a wide range of applications, for example to model natural and social phenomena. Probabilistic Cellular Automata (PCA) are the stochastic version of Cellular Automata, where the updating rules are random, i.e., the configurations are chosen according to probability distributions determined by the neighborhood of each site. Mathematically, we consider PCA with parallel (synchronous) dynamics, i.e., systems of finite-state Markov chains whose distribution at time $n$ depends only on the states in a neighboring set at time $n-1$.
PCA are characterized by a matrix of transition probabilities from any configuration $\sigma$ to any other configuration $\eta$, defined as a product of local transition probabilities:

$$p(\sigma,\eta) := \prod_{i\in\Lambda} p_{i,\sigma}(\eta(i)), \qquad \sigma,\eta\in\mathcal{X},$$

where $\Lambda\subset\mathbb{Z}^2$ is a finite box with periodic boundary conditions and $\mathcal{X} = \{-1,+1\}^{\Lambda}$ is the set of all configurations. Here we consider a specific PCA in the class introduced by Derrida [30], where the local transition probability is a certain function of the sum of neighboring spins $S_\sigma(\cdot)$, see (2.30), and of the external magnetic field $h$:

$$p_{i,\sigma}(a) := \frac{1}{1+\exp\{-2\beta a(S_\sigma(i)+h)\}} = \frac{1}{2}\left[1 + a\tanh\beta(S_\sigma(i)+h)\right].$$

We obtain our PCA by summing only over the nearest neighbor sites, see (3.1) and Figure 3.1. When the sum is carried out over a symmetric set, the resulting dynamics is reversible with respect to a suitable Gibbs-like measure $\mu$ defined via a translation invariant multi-body potential, see (2.28). This measure depends on a parameter $\beta$ which can be thought of as the inverse of the temperature of the system. For small values of the temperature, the PCA is likely to be found in the local minima of the Hamiltonian associated to $\mu$. The metastable behavior of this model has been investigated on heuristic and numerical grounds in [6]. A key quantity in the study of metastability is the energy barrier from one of the metastable states to the stable state. This is the minimum, over all paths connecting the metastable to the stable state, of the maximal transition energy along each path, minus the energy of the starting configuration (see (2.8)-(2.9)). Intuitively, the energy barrier from $\sigma$ to $\eta$ is the energy that the system must overcome to reach $\eta$ starting from $\sigma$.

For our choice of parameters, our PCA has one stable state $+1$ and, peculiarly, three metastable states, which we identify rigorously as $\{-1, c^e, c^o\}$.
To prove this, we will construct for each configuration $\sigma\notin\{-1, c^e, c^o, +1\}$ a path starting from $\sigma$ and ending in a lower energy state, such that the maximal energy along the path is lower than the energy barrier from $-1$ to $+1$. This leads to an explicit upper bound $V^*$ for the stability level of every configuration except $\{-1, c^e, c^o, +1\}$, in Lemma 3.1, which we will refer to as our main technical tool. We rely on this estimate to prove two recurrence properties. The first is that, starting from any configuration, the system reaches the set $\{-1, c^e, c^o, +1\}$ in a time smaller than $e^{\beta V^*}$ with probability exponentially close to one. The second is that, starting from any configuration, the system reaches $+1$ in a time smaller than $e^{\beta\Gamma^{\mathrm{PCA}}}$. To prove the second recurrence property, we combine our main tool with the computation of the energy barrier $\Gamma^{\mathrm{PCA}}$ in [19]. We remark that $c^e$ and $c^o$ are two degenerate metastable states, since they have the same energy and the energy barrier between them is zero. Hence, we will use the shorthand $c = \{c^e, c^o\}$.

In order to find sharp estimates of the transition time from $-1$ to $+1$, we extend in Section 2.4, and then verify, the three model-dependent conditions given in [25]. These are, respectively, our main technical tool, the property that starting from $-1$ the system visits the chessboard $c$ before $+1$ with high probability [19], and the computation of the constants $k_1$ and $k_2$ in [24]. In fact, the sharp estimates on the transition time which we give here were already stated in [24], but the proof there missed some key steps, which we provide here. Specifically, our Lemma 3.1 was assumed to hold without proof, and the generalizations given in Theorems 2.8, 2.9, 2.10, 2.11 and 2.12 were not carried out explicitly.
To prove these last statements, we use the model-independent theorems discussed earlier and model-dependent inputs such as the energy barrier.

Regarding the model-dependent results, [19] focuses on the transition from the metastable states to the stable state. In particular, the authors describe the tube of typical trajectories and they also estimate the transition time. To do this, they analyze the geometrical conditions for the shrinking or the growing of a cluster. Furthermore, they characterize the local minima of the energy and the so-called traps for the PCA dynamics. Building on this, we construct a specific path from any cluster to the stable state that the system follows with probability tending to one. Our estimates of the stability levels in Lemma 3.1 are based on these characterizations.

The authors in [23] consider a reversible PCA model with self-interactions. In particular, they prove the recurrence to the set $\{-1, +1\}$ and that $-1$ is the unique metastable state. They estimate the transition time in probability, in $L^1$ and in law. Moreover, they characterize the critical droplet that is visited by the system with probability tending to one during its excursion from the metastable to the stable state. Furthermore, in [44] sharp estimates for the expected transition time are proved by computing the prefactor explicitly.

State of the art.

A first mathematical description of metastability [51] was inspired by Gibbsian Equilibrium Statistical Mechanics and was based on the computation of the expected values with respect to restricted equilibrium states. The first dynamical approach, known as the pathwise approach, was initiated in [13] and developed in [48, 49, 54], see also [50]. This approach derives large deviation estimates of the first hitting time and of the tube of typical trajectories. It is based on the notions of cycles and cycle paths and it hinges on a detailed knowledge of the energy landscape. Independently, similar results based on a graphical definition of cycles were derived in [15, 14] and applied to reversible Metropolis dynamics and to simulated annealing in [16, 55]. The pathwise approach was further developed in [40, 20, 21] to disentangle the study of the transition time from that of the typical trajectories. This method was applied in [1, 18, 26, 37, 28, 36, 39, 43, 46, 47, 50] for Metropolis dynamics and in [19, 23, 22] for parallel dynamics.

The potential-theoretic approach is based on the study of the hitting time through the use of the Dirichlet form and spectral properties of the transition matrix. One of the advantages of this method is that it provides an estimate of the expected value of the transition time including the prefactor, by exploiting a detailed knowledge of the critical configurations, see [11, 7]. This method was applied in [2, 12, 25, 8, 29] for Metropolis dynamics and in [44] for parallel dynamics. Recently, other approaches have been described in [3, 4, 33] and in [5].

The more involved infinite volume limit, at low temperature or vanishing magnetic field, was studied for Metropolis dynamics via large deviation techniques in [17, 27, 41, 42, 52, 53] and via the potential-theoretic approach in [9, 32, 35, 37, 34].

Outline.

The paper is organized as follows. In Section 2 we define a general setup and we present the main model-independent results with some applications to concrete models. In Section 3 we describe the reversible PCA model that we consider and we present the main model-dependent results. In Section 4 we carry out the proof of the model-independent results, and in Section 5 we carry out the proof of the model-dependent results. Finally, in Appendix A we recall some results and give explicit computations that are used in the paper, and in Appendix B we prove the theorems stated in Section 2.4.

Model-independent results

Let $\mathcal{X}$ be a finite set, which we refer to as state space, and let $\Delta : \mathcal{X}\times\mathcal{X}\to\mathbb{R}^+\cup\{\infty\}$ be a function, which we call rate function. $\Delta$ is said to be irreducible if for every $x, y\in\mathcal{X}$ there exists a path $\omega = (\omega_1,\dots,\omega_n)\in\mathcal{X}^n$ with $\omega_1 = x$, $\omega_n = y$ and $\Delta(\omega_i,\omega_{i+1}) < \infty$ for every $1\le i\le n-1$, where $n$ is a positive integer. A family of time-homogeneous Markov chains $(X_n)_{n\in\mathbb{N}}$ on $\mathcal{X}$ with transition probabilities $P_\beta$ indexed by a positive parameter $\beta$ is said to have rare transitions with rate function $\Delta$ when

$$\lim_{\beta\to\infty} -\frac{\log P_\beta(x,y)}{\beta} =: \Delta(x,y), \tag{2.1}$$

for any $x, y\in\mathcal{X}$. Intuitively, $\Delta(x,y) = +\infty$ should be understood as the fact that, when $\beta$ is large, there is no possible transition between states $x$ and $y$. We also note that condition (2.1) is sometimes written more explicitly as [21, Equation (2.2)]: for any $\gamma > 0$, there exists $\beta_0 > 0$ such that

$$e^{-\beta[\Delta(x,y)+\gamma]} \le P_\beta(x,y) \le e^{-\beta[\Delta(x,y)-\gamma]}, \tag{2.2}$$

for any $\beta > \beta_0$ and any $x, y\in\mathcal{X}$, where the parameter $\gamma$ is a function of $\beta$ that vanishes for $\beta\to\infty$. Because of this, we also refer to the function $\Delta(x,y)$ as the energy cost of the transition from $x$ to $y$.

We assume that the Markov chain $(X_n)_n$ satisfies the detailed balance property

$$P_\beta(x,y)\,e^{-\beta G(x)} = P_\beta(y,x)\,e^{-\beta G(y)}, \tag{2.3}$$

for any $x, y\in\mathcal{X}$, where $G : \mathcal{X}\to\mathbb{R}$ is the so-called Hamiltonian function. Equivalently, the Markov chain is reversible with respect to the Gibbs measure

$$\mu(x) := \frac{e^{-\beta G(x)}}{\sum_{y\in\mathcal{X}} e^{-\beta G(y)}}. \tag{2.4}$$

This implies that the measure $\mu$ is stationary, that is, $\sum_{x\in\mathcal{X}}\mu(x)P_\beta(x,y) = \mu(y)$. Next, we define the virtual energy as

$$H(x) := \lim_{\beta\to\infty} G(x). \tag{2.5}$$

Definition (2.5) is well-posed, since for large $\beta$ the Markov chain $(X_n)_n$ is irreducible and its invariant probability distribution $\mu$ in (2.4) is such that for any $x\in\mathcal{X}$ the limit $\lim_{\beta\to\infty} -\frac{1}{\beta}\log\mu(x)$ exists and is a positive real number [21, Prop. 2.1].
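To make conditions (2.1), (2.3) and (2.4) concrete, the following is a minimal Python sketch on a hypothetical three-state landscape. The states, energies, proposal probability `Q` and the Metropolis-type form of the chain are illustrative assumptions, not objects taken from the paper; here $G = H$ does not depend on $\beta$ and $\Delta(x,y) = \max(H(y)-H(x), 0)$ on allowed moves.

```python
import math

# Hypothetical toy landscape (an assumption for illustration only).
H = {"a": 0.0, "b": 2.0, "c": 1.0}   # energies; here G = H for every beta
EDGES = [("a", "b"), ("b", "c")]      # allowed moves (used in both directions)
Q = 0.5                               # proposal probability of each allowed move


def transition_matrix(beta):
    """Metropolis-type chain: P(x, y) = Q * exp(-beta * max(H(y) - H(x), 0))."""
    states = list(H)
    P = {x: {y: 0.0 for y in states} for x in states}
    for x, y in EDGES + [(y, x) for x, y in EDGES]:
        P[x][y] = Q * math.exp(-beta * max(H[y] - H[x], 0.0))
    for x in states:
        # The remaining mass stays on the diagonal so that each row sums to one.
        P[x][x] = 1.0 - sum(P[x][y] for y in states if y != x)
    return P


def gibbs(beta):
    """Gibbs measure (2.4) associated with the energies H."""
    weights = {x: math.exp(-beta * H[x]) for x in H}
    Z = sum(weights.values())
    return {x: w / Z for x, w in weights.items()}


def empirical_cost(beta, x, y):
    """-log P_beta(x, y) / beta, the finite-beta version of the limit in (2.1)."""
    return -math.log(transition_matrix(beta)[x][y]) / beta
```

In this sketch `empirical_cost(beta, "a", "b")` equals $2 + \log 2/\beta$, so it approaches $\Delta(a,b) = 2$ as $\beta$ grows, and detailed balance with respect to `gibbs(beta)` can be checked entry by entry.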
Applying $-\frac{1}{\beta}\log$ to both sides of (2.3) and taking the limit $\beta\to\infty$, by (2.1) and (2.5), yields

$$H(x) + \Delta(x,y) = H(y) + \Delta(y,x). \tag{2.6}$$

This motivates the following definition of transition energy

$$H(x,y) := H(x) + \Delta(x,y), \tag{2.7}$$

where $x, y$ are configurations in $\mathcal{X}$. The definition of transition energy is needed to define the height along a path $\omega$ in the general setting. Indeed, there may not exist a configuration whose energy is equal to the energy of the maximum along the path. The transition energy between two configurations is defined as the sum of the virtual energy of the first configuration and the energy cost of the transition between the two configurations. This is unlike the Metropolis dynamics case [45], where the transition energy between two configurations is the virtual energy of some state along the path between the two.

Let $\omega = \{\omega_1,\dots,\omega_n\}$ be a finite sequence of configurations. We call $\omega$ a path with starting configuration $\omega_1$ and final configuration $\omega_n$. We denote the length of $\omega$ as $|\omega| = n$. We define the height along $\omega$ as

$$\Phi_\omega := \begin{cases} H(\omega_1) & \text{if } |\omega| = 1,\\ \max_{i=1,\dots,|\omega|-1} H(\omega_i,\omega_{i+1}) & \text{if } |\omega| > 1. \end{cases} \tag{2.8}$$

Let $x, y\in\mathcal{X}$ be two configurations. The communication height between $x$ and $y$ is defined as

$$\Phi(x,y) := \min_{\omega\in\Theta(x,y)} \Phi_\omega, \tag{2.9}$$

where $\Theta(x,y)$ is the set of all the paths $\omega$ starting from $x$ and ending in $y$. Similarly, we also define the communication height between two sets $A, B\subset\mathcal{X}$ as

$$\Phi(A,B) := \min_{x\in A,\, y\in B} \Phi(x,y). \tag{2.10}$$

Figure 1: Example of a path $\omega$ between $x$ and $y$ with $|\omega| = 5$.

Figure 2: There are three paths in $\Theta(x,y)$. The red mark represents the communication height between $x$ and $y$.

The first hitting time of $A\subset\mathcal{X}$ starting from $x\in\mathcal{X}$ is defined as

$$\tau^x_A := \inf\{t > 0 \mid X_t\in A\}. \tag{2.11}$$

Whenever possible we shall drop from the notation the superscript denoting the starting point. For any $x\in\mathcal{X}$, let $I_x$ be the set of configurations with energy strictly lower than $H(x)$, i.e.,

$$I_x := \{y\in\mathcal{X} \mid H(y) < H(x)\}. \tag{2.12}$$

The stability level $V_x$ of $x$ is the energy barrier that, starting from $x$, must be overcome to reach the set $I_x$, i.e.,

$$V_x := \Phi(x, I_x) - H(x). \tag{2.13}$$

If $I_x$ is empty, then we let $V_x = \infty$. We denote by $\mathcal{X}^s$ the set of global minima of the energy, and we refer to these as ground states. The metastable states are those states that attain the maximal stability level $\Gamma_m < \infty$, that is

$$\Gamma_m := \max_{x\in\mathcal{X}\setminus\mathcal{X}^s} V_x, \tag{2.14}$$

$$\mathcal{X}^m := \{y\in\mathcal{X} \mid V_y = \Gamma_m\}. \tag{2.15}$$

Since the metastable states are defined in terms of their stability level, a crucial role in our proofs is played by the set of all configurations with stability level strictly greater than $V$, that is

$$\mathcal{X}_V := \{x\in\mathcal{X} \mid V_x > V\}. \tag{2.16}$$

We frame the problem of metastability as the identification of metastable states and the computation of transition times from the metastable states to the stable configurations. In summary, from the mathematical point of view, the metastability phenomenon for a given system is described in terms of $\mathcal{X}^s$, $\Gamma_m$ and $\mathcal{X}^m$. Now we define formally the energy barrier $\Gamma$ as

$$\Gamma := \Phi(y^m, y^s) - H(y^m), \tag{2.17}$$

where $y^m\in\mathcal{X}^m$ and $y^s\in\mathcal{X}^s$. Note that $\Gamma$ does not depend on the specific choice of $y^m$, $y^s$. The energy barrier is the minimum energy necessary to trigger the nucleation. The energy $\Gamma$ turns out to be equal to $\Gamma_m$ under specific assumptions [20, Theorem 2.4].

A different notion of metastable states is given in [10], within the framework of the potential-theoretic approach. The Dirichlet form associated with our reversible Markov chain is the functional

$$D_\beta[f] := \frac{1}{2}\sum_{y,z\in\mathcal{X}} \mu_\beta(y)\,p_\beta(y,z)\,[f(y)-f(z)]^2, \tag{2.18}$$

where $f : \mathcal{X}\to\mathbb{R}$ is a function. Thus, given two non-empty disjoint sets $Y, Z\subset\mathcal{X}$, the capacity of the pair $Y$ and $Z$ is defined as

$$\mathrm{cap}_\beta(Y,Z) := \min_{\substack{f:\mathcal{X}\to[0,1]\\ f|_Y = 1,\ f|_Z = 0}} D_\beta[f]. \tag{2.19}$$

Note that the capacity is a symmetric function of the sets $Y$ and $Z$.
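The landscape quantities in (2.8)-(2.15) can be computed on small examples. The sketch below is only an illustration under stated assumptions: the five-state line landscape `H`, the Metropolis-type costs in `DELTA` and all function names are hypothetical, and the communication height is obtained with a minimax variant of Dijkstra's algorithm, which is one possible way to evaluate (2.9) and is not a method taken from the paper.

```python
import heapq

# Hypothetical landscape: states 0..4 on a line, moves between nearest neighbors.
H = {0: 0.0, 1: 3.0, 2: 1.0, 3: 2.5, 4: 1.0}
DELTA = {}
for u in range(4):
    for a, b in ((u, u + 1), (u + 1, u)):
        DELTA[(a, b)] = max(H[b] - H[a], 0.0)  # assumed Metropolis-type energy cost


def communication_height(H, delta, x, y):
    """Phi(x, y) of (2.9): minimum over paths of the maximal transition energy."""
    best = {x: H[x]}          # a path of length one has height H(x), see (2.8)
    heap = [(H[x], x)]
    while heap:
        height, u = heapq.heappop(heap)
        if u == y:
            return height
        if height > best.get(u, float("inf")):
            continue          # stale heap entry
        for (a, b), cost in delta.items():
            if a != u:
                continue
            new_height = max(height, H[a] + cost)  # transition energy (2.7)
            if new_height < best.get(b, float("inf")):
                best[b] = new_height
                heapq.heappush(heap, (new_height, b))
    return float("inf")


def stability_level(H, delta, x):
    """V_x of (2.13); infinite when no state has strictly lower energy."""
    lower = [y for y in H if H[y] < H[x]]
    if not lower:
        return float("inf")
    return min(communication_height(H, delta, x, y) for y in lower) - H[x]


def metastable_states(H, delta):
    """Return (Gamma_m, X^m) as in (2.14)-(2.15)."""
    ground = min(H.values())
    levels = {x: stability_level(H, delta, x) for x in H if H[x] > ground}
    gamma_m = max(levels.values())
    return gamma_m, {x for x, v in levels.items() if v == gamma_m}
```

On this toy landscape, states 2 and 4 have the same energy and the same stability level, while the barrier between them, `communication_height(H, DELTA, 2, 4) - H[2]`, is smaller than the maximal stability level. This mimics, in a purely illustrative way, the degenerate situation studied in the paper.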
It can be proven that the right hand side of (2.19) has a unique minimizer, called the equilibrium potential of the pair $Y$ and $Z$. There is a nice interpretation of the equilibrium potential in terms of hitting times. For any $x\in\mathcal{X}$, we denote by $\mathbb{P}_x(\cdot)$ and $\mathbb{E}_x[\cdot]$ respectively the probability and the average along the trajectories of the process started at $x$. Then, it can be proven that the equilibrium potential of the pair $Y$ and $Z$ is equal to the function $h_{Y,Z}$ defined as follows

$$h_{Y,Z}(x) := \begin{cases} \mathbb{P}_x(\tau_Y < \tau_Z) & \text{for } x\in\mathcal{X}\setminus(Y\cup Z),\\ 1 & \text{for } x\in Y,\\ 0 & \text{for } x\in Z, \end{cases} \tag{2.20}$$

where $\tau_Y$ and $\tau_Z$ are, respectively, the first hitting times of $Y$ and $Z$ for the chain started at $x$. It can also be proven that, for any $Y\subset\mathcal{X}$ and $z\in\mathcal{X}\setminus Y$,

$$\mathrm{cap}_\beta(z,Y) = \mu_\beta(z)\,\mathbb{P}_z(\tau_Y < \tau_z), \tag{2.21}$$

see [7, equation (7.1.16)].

Definition 2.1. According to the potential-theoretic approach, a set $M\subset\mathcal{X}$ is said to be metastable if

$$\lim_{\beta\to\infty} \frac{\max_{x\notin M}\mu_\beta(x)\,[\mathrm{cap}_\beta(x,M)]^{-1}}{\min_{x\in M}\mu_\beta(x)\,[\mathrm{cap}_\beta(x,M\setminus\{x\})]^{-1}} = 0. \tag{2.22}$$

In order to avoid confusion, we will denote the states that satisfy (2.22) as p.t.a.-metastable. The physical meaning of the above definition can be understood once one remarks that the quantity $\mu_\beta(x)/\mathrm{cap}_\beta(x,y)$, for any $x, y\in\mathcal{X}$, is closely related to the communication cost between the states $x$ and $y$, see Proposition B.5 for details. Thus, condition (2.22) ensures that the communication cost between any state outside $M$ and $M$ itself is smaller than the communication cost between any two states in $M$.

The following theorems give estimates of the mixing time and the spectral gap in the general setting.

Theorem 2.2.

Let $(P_\beta(x,y))_{x,y\in\mathcal{X}}$ be the transition matrix of a Markov chain. Assume there exists at least one stable state $s$ such that

$$\lim_{\beta\to\infty} -\frac{1}{\beta}\log P_\beta(s,s) = 0. \tag{2.23}$$

Then, for any $0 < \epsilon < 1$ we have

$$\lim_{\beta\to\infty} \frac{1}{\beta}\log t^{\mathrm{mix}}_\beta(\epsilon) = \Gamma_m, \tag{2.24}$$

where $t^{\mathrm{mix}}_\beta(\epsilon) := \min\{n\ge 0 \mid \max_{x\in\mathcal{X}} \|P^n_\beta(x,\cdot) - \mu(\cdot)\|_{TV} \le \epsilon\}$ and $\|\nu - \nu'\|_{TV} = \sum_{x\in\mathcal{X}} |\nu(x) - \nu'(x)|$ for every pair $\nu, \nu'$ of probability distributions on $\mathcal{X}$.

Theorem 2.3.

Let $(P_\beta(x,y))_{x,y\in\mathcal{X}}$ be a reversible transition matrix. Let $\rho_\beta = 1 - a^{(2)}_\beta$ be the spectral gap, where $a^{(2)}_\beta$ is the second eigenvalue of the transition matrix, with the eigenvalues ordered such that $a^{(1)}_\beta > a^{(2)}_\beta \ge \dots \ge a^{(|\mathcal{X}|)}_\beta \ge -1$. Then there exist two constants $0 < c_1 < c_2 < \infty$ independent of $\beta$ such that for every $\beta > 0$,

$$c_1\,e^{-\beta(\Gamma_m+\gamma_1)} \le \rho_\beta \le c_2\,e^{-\beta(\Gamma_m-\gamma_2)}, \tag{2.25}$$

where $\gamma_1$, $\gamma_2$ are functions of $\beta$ that vanish for $\beta\to\infty$.

In this section we show that several well-known models in statistical mechanics satisfy the assumption (2.23) of Theorem 2.2. In particular, we are able to get precise asymptotics for the mixing time of these models. Throughout this section we denote by $\Lambda$ a finite subset of $\mathbb{Z}^2$, by $\mathcal{X}$ the configuration space and by $s$ a stable state.

Metropolis algorithm.

The Hamiltonian function for this model coincides with the virtual energy and is given by

$$H(\sigma) := -J\sum_{\substack{i,j\in\Lambda\\ |i-j|=1}} \sigma(i)\sigma(j) - h\sum_{i\in\Lambda}\sigma(i), \qquad \sigma\in\mathcal{X}. \tag{2.26}$$

The transition probabilities are given by

$$P_\beta(\sigma,\eta) := q(\sigma,\eta)\exp\{-\beta[H(\eta)-H(\sigma)]_+\}, \qquad \sigma,\eta\in\mathcal{X}, \tag{2.27}$$

where $[\cdot]_+$ denotes the positive part,

$$q(\sigma,\eta) := \begin{cases} \frac{1}{|\Lambda|} & \text{if } \exists\, i\in\Lambda : \sigma^i = \eta,\\ 0 & \text{otherwise}, \end{cases} \qquad\text{and}\qquad \sigma^i(j) := \begin{cases} \sigma(j) & \text{if } j\ne i,\\ -\sigma(j) & \text{if } j = i. \end{cases}$$

In this case the assumption (2.23) is shown to hold in [21, Prop. 3.24]. Note that Kawasaki dynamics is a type of Metropolis dynamics, so it falls into this case.
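The Metropolis rule (2.26)-(2.27) can be checked numerically on a very small system. The following Python sketch makes several simplifying assumptions that are not in the paper: it uses a one-dimensional ring of four spins instead of a box in $\mathbb{Z}^2$, the values of `J` and `FIELD` are arbitrary, the sum in (2.26) is read as running over ordered pairs, and the diagonal entry is set to the leftover probability mass.

```python
import itertools
import math

N_SITES = 4    # hypothetical ring size (the paper works on a box in Z^2)
J = 1.0        # hypothetical coupling constant
FIELD = 0.3    # hypothetical magnetic field h


def energy(sigma):
    """H(sigma) as in (2.26), summing over ordered nearest-neighbor pairs of the ring."""
    pair_sum = sum(
        sigma[i] * sigma[j]
        for i in range(N_SITES)
        for j in ((i - 1) % N_SITES, (i + 1) % N_SITES)
    )
    return -J * pair_sum - FIELD * sum(sigma)


def flip(sigma, i):
    """The configuration sigma^i obtained by flipping the spin at site i."""
    return tuple(-s if j == i else s for j, s in enumerate(sigma))


def metropolis_probability(sigma, eta, beta):
    """P_beta(sigma, eta) of (2.27), with q = 1/|Lambda| on single spin flips."""
    differing = [i for i in range(N_SITES) if sigma[i] != eta[i]]
    if len(differing) == 1:
        cost = max(energy(eta) - energy(sigma), 0.0)   # positive part
        return math.exp(-beta * cost) / N_SITES
    if not differing:
        # Assumed diagonal entry: the probability mass not used by the flips.
        return 1.0 - sum(
            metropolis_probability(sigma, flip(sigma, i), beta)
            for i in range(N_SITES)
        )
    return 0.0
```

With these definitions one can enumerate all $2^4$ configurations, verify that every row sums to one and that detailed balance (2.3) holds with $G = H$, and observe that $-\frac{1}{\beta}\log P_\beta(s,s)$ is close to zero for the all-plus configuration, in line with assumption (2.23).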

Reversible PCA model for Spin Systems.

For this model, the Hamiltonian function is given by

$$G(\sigma) := -h\sum_{i\in\Lambda}\sigma(i) - \frac{1}{\beta}\sum_{i\in\Lambda}\log\cosh[\beta(S_\sigma(i)+h)], \tag{2.28}$$

and the virtual energy is obtained by (2.5):

$$H(\sigma) = -h\sum_{i\in\Lambda}\sigma(i) - \sum_{i\in\Lambda}|S_\sigma(i)+h|. \tag{2.29}$$

Here

$$S_\sigma(i) := \sum_{j\in U_i} K(i-j)\,\sigma(j), \tag{2.30}$$

where $K(i-j)\ne 0$ for $j\in U_i$, a neighborhood of $i$. Different choices of $K(\cdot)$ and $U_i$ yield different PCA. It can be shown that, if $U_i$ is symmetric, then the Markov chain is reversible. The transition probabilities are given by

$$p(\sigma,\eta) := \prod_{i\in\Lambda} p_{i,\sigma}(\eta(i)), \qquad \sigma,\eta\in\mathcal{X}, \tag{2.31}$$

where, for $i\in\Lambda$ and $\sigma\in\mathcal{X}$, $p_{i,\sigma}(\cdot)$ is the probability measure on $\{-1,+1\}$ defined as

$$p_{i,\sigma}(a) := \frac{1}{1+\exp\{-2\beta a(S_\sigma(i)+h)\}} = \frac{1}{2}\left[1 + a\tanh\beta(S_\sigma(i)+h)\right], \tag{2.32}$$

with $a\in\{-1,+1\}$. We have

$$\begin{aligned}
\lim_{\beta\to\infty} -\frac{1}{\beta}\log p(s,s) &= \lim_{\beta\to\infty} -\frac{1}{\beta}\log\prod_{i\in\Lambda}\frac{1}{1+\exp\{-2\beta s(i)(S_s(i)+h)\}}\\
&= \lim_{\beta\to\infty}\sum_{i\in\Lambda}\log\Big(\big(1+\exp\{-2\beta s(i)(S_s(i)+h)\}\big)^{\frac{1}{\beta}}\Big)\\
&\le \lim_{\beta\to\infty}\sum_{i\in\Lambda}\log\Big(1+\frac{1}{\beta}\exp\{-2\beta s(i)(S_s(i)+h)\}\Big),
\end{aligned} \tag{2.33}$$

where we used the inequality $(1+x)^\alpha \le 1+\alpha x$ with $\alpha\in(0,1)$. In this model the unique stable state is $s = +1$, so we conclude in the following way:

$$\begin{aligned}
\lim_{\beta\to\infty}\sum_{i\in\Lambda}\log\Big(1+\frac{1}{\beta}\exp\{-2\beta(S_s(i)+h)\}\Big) &= \lim_{\beta\to\infty}\sum_{i\in\Lambda}\log\Big(1+\frac{1}{\beta}\exp\{-2\beta(|U_i|+h)\}\Big)\\
&= \lim_{\beta\to\infty}|\Lambda|\log\Big(1+\frac{1}{\beta}\exp\{-2\beta(|U_i|+h)\}\Big) = 0,
\end{aligned} \tag{2.34}$$

where in the last equality we used that $h\ge 0$ and $|U_i|$ is the same for all $i\in\Lambda$.

Irreversible PCA model.

The Hamiltonian function of the irreversible PCA model is given by

$$G(\sigma,\tau) := -\sum_{k\in\Lambda_N}\left[\sigma_k(\tau_{k^u}+\tau_{k^r}) + h\,\sigma_k\tau_k\right], \qquad \sigma,\tau\in\mathcal{X}, \tag{2.35}$$

with $k^u := (i,j+1)$, $k^r := (i+1,j)$ for $k = (i,j)\in\Lambda_N$. The transition probabilities are given by

$$P_\beta(\sigma,\eta) := \frac{e^{-\beta G(\sigma,\eta)}}{\sum_{\tau\in\mathcal{X}} e^{-\beta G(\sigma,\tau)}}. \tag{2.36}$$

Note that the subset $\mathcal{X}\setminus\mathcal{X}^s$ is not empty since $G$ is not constant. We compute

$$\lim_{\beta\to\infty} -\frac{1}{\beta}\log P_\beta(s,s) = \lim_{\beta\to\infty} -\frac{1}{\beta}\log\Big(\frac{e^{-\beta G(s,s)}}{\sum_{\tau\in\mathcal{X}} e^{-\beta G(s,\tau)}}\Big) = H(s,s) + \lim_{\beta\to\infty}\frac{1}{\beta}\log\Big(\sum_{\tau\in\mathcal{X}} e^{-\beta G(s,\tau)}\Big). \tag{2.37}$$

Take $\bar\tau\in\mathcal{X}$ such that $G(s,\bar\tau) = \min_\tau G(s,\tau)$. We get

$$H(s,s) + \lim_{\beta\to\infty}\frac{1}{\beta}\log\Big(\sum_{\tau\in\mathcal{X}} e^{-\beta G(s,\tau)}\Big) \le H(s,s) + \lim_{\beta\to\infty}\frac{1}{\beta}\log\Big(2^{N} e^{-\beta G(s,\bar\tau)}\Big) = H(s,s) - H(s,\bar\tau) + \lim_{\beta\to\infty}\frac{1}{\beta}\log(2^{N}). \tag{2.38}$$

The last term goes to zero since $N$ is finite. Since in this model $s = +1$, we have $H(+1,+1) = -N(2+h)$, $H(+1,\bar\tau) = -N(2+h)$ and the conclusion follows.

The structure of the energy landscape that we analyze for our reversible PCA model in Section 3.1 is such that the system has three metastable states: one metastable state that is non-degenerate in energy and two degenerate metastable states. Moreover, the system started at the metastable state with higher energy must necessarily visit the second one before relaxing to the stable state. In this section we generalize the results in [25, Section 2.5, 2.6] to this degenerate context. In particular, we shall prove the addition rule for the exit times from the metastable states.

Condition 2.4.

We assume that the energy landscape $(\mathcal{X}, Q, H, \Delta)$ is such that there exist four or more states $x_0, x_1^1, x_1^2, \dots, x_1^n$ and $x_2$ such that $\mathcal{X}^s = \{x_0\}$, $\mathcal{X}^m = \{x_1^1,\dots,x_1^n, x_2\}$, and

$$H(x_2) > H(x_1^r), \qquad H(x_1^r) = H(x_1^q), \qquad \Phi(x_1^r, x_1^q) - H(x_1^r) < \Gamma_m$$

for every $r, q = 1,\dots,n$, with $n\in\mathbb{N}$.

Recalling the definition of the set of ground states $\mathcal{X}^s$, we immediately have

$$H(x_1^r) > H(x_0) \quad\text{for every } r = 1,\dots,n. \tag{2.39}$$

Moreover, from the definition (2.13) of maximal stability level it follows that (see [20, Theorem 2.3]) the communication cost from $x_2$ to $x_0$ is equal to the communication cost from $x_1^r$ to $x_0$ for every $r = 1,\dots,n$, that is

$$\Phi(x_2, x_0) - H(x_2) = \Phi(x_1^r, x_0) - H(x_1^r) = \Gamma_m. \tag{2.40}$$

Note that, since $x_2$ is a metastable state, its stability level cannot be lower than $\Gamma_m$. Then, recalling that $H(x_2) > H(x_1^r)$ for every $r = 1,\dots,n$, one has that $\Phi(x_2, x_1^r) - H(x_2) \ge \Gamma_m$. On the other hand, (2.40) implies that there exists a path $\omega\in\Theta(x_2, x_1^r)$ such that $\Phi_\omega = H(x_2) + \Gamma_m$ and, hence, $\Phi(x_2, x_1^r) - H(x_2) \le \Gamma_m$ for every $r = 1,\dots,n$. The two bounds finally imply that

$$\Phi(x_2, x_1^r) - H(x_2) = \Gamma_m. \tag{2.41}$$

Note that the communication cost from $x_0$ to $x_2$ and that from $x_1^r$ to $x_2$ are larger than $\Gamma_m$, i.e.,

$$\Phi(x_0, x_2) - H(x_0) > \Gamma_m \quad\text{and}\quad \Phi(x_1^r, x_2) - H(x_1^r) > \Gamma_m, \quad\text{for every } r = 1,\dots,n. \tag{2.42}$$

Indeed, recalling the reversibility property (2.6), we have

$$\Phi(x_1^r, x_2) - H(x_1^r) = \Phi(x_2, x_1^r) - H(x_2) + H(x_2) - H(x_1^r) = \Gamma_m + H(x_2) - H(x_1^r) > \Gamma_m,$$

where in the last two steps we have used (2.41) and Condition 2.4, which proves the second of the two inequalities (2.42). The first of them can be proved similarly. When the system is started at $x_2$, with high probability it will visit $x_1^r$ before $x_0$ for every $r = 1,\dots,n$. For this reason we shall assume the following condition.

Condition 2.5.

Condition 2.4 is satisfied and

$$\lim_{\beta\to\infty} \mathbb{P}_{x_2}(\tau_{x_0} < \tau_{x_1^r}) = 0, \quad\text{for every } r = 1,\dots,n. \tag{2.43}$$

We remark that Condition 2.5 is in fact a condition on the equilibrium potential $h_{x_0, x_1^r}$ evaluated at $x_2$, for every $r = 1,\dots,n$.

One of the important goals of this paper is to prove an addition rule for the mean hitting time of $+1$ starting at $-1$, using Theorem 2.12 for the expectation of the transition time $\tau_{x_0}$ for the chain started at $x_2$. Such an expectation, hence, will be of order $\exp(\beta\Gamma_m)$ and the prefactor will be that given in (2.52).

We can thus formulate the further assumptions that we shall need in the sequel.

Condition 2.6.

Condition 2.4 is satisfied and there exist two positive constants $k_1, k_2 < \infty$ such that

$$\frac{\mu_\beta(x_2)}{\mathrm{cap}_\beta(x_2, \{x_1^1,\dots,x_1^n, x_0\})} = \frac{1}{k_1}\,e^{\beta\Gamma_m}[1+o(1)], \qquad \frac{\mu_\beta(\{x_1^1,\dots,x_1^n\})}{\mathrm{cap}_\beta(\{x_1^1,\dots,x_1^n\}, x_0)} = \frac{1}{k_2}\,e^{\beta\Gamma_m}[1+o(1)], \tag{2.44}$$

where $o(1)$ denotes a function tending to zero in the limit $\beta\to\infty$.

Condition 2.7. Condition 2.4 is satisfied and there exist $n$ positive constants $c_1, c_2, \dots, c_n < \infty$ such that

$$\frac{\mu_\beta(x_1^r)}{\mathrm{cap}_\beta(x_1^r, x_0)} = \frac{1}{c_r}\,e^{\beta\Gamma_m}[1+o(1)], \quad\text{for every } r = 1,\dots,n, \tag{2.45}$$

where $o(1)$ denotes a function tending to zero in the limit $\beta\to\infty$.

The following theorems generalize, respectively, Theorem 1, Theorem 2, Theorem 3 and Theorem 4 in [25]. We prove them in Appendix B.

Theorem 2.8.

Assume Condition 2.4 is satisfied. Then for every $r = 1, \dots, n$ the set $\{x_0, x_1^r, x_2\} \subset \mathcal{X}$ is a p.t.a.-metastable set.

Theorem 2.9.

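Assume Condition 2.4 is satisfied. Then
$$\mathbb{E}_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] = \frac{\mu_\beta(x_2)}{\mathrm{cap}_\beta(x_2, \{x_1^1, \dots, x_1^n, x_0\})} [1 + o(1)], \tag{2.46}$$
$$\mathbb{E}_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] = \frac{\mu_\beta(\{x_1^1, \dots, x_1^n\})}{\mathrm{cap}_\beta(\{x_1^1, \dots, x_1^n\}, x_0)} [1 + o(1)], \tag{2.47}$$
$$\mathbb{E}_{x_1^r}[\tau_{x_0}] = \frac{n \mu_\beta(x_1^r)}{\mathrm{cap}_\beta(x_1^r, x_0)} [1 + o(1)], \quad \text{for every } r = 1, \dots, n. \tag{2.48}$$

Theorem 2.10.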

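Assume Condition 2.4 and Condition 2.6 are satisfied. Then
$$\mathbb{E}_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] = e^{\beta \Gamma_m} \frac{1}{k_1} [1 + o(1)], \tag{2.49}$$
$$\mathbb{E}_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] = e^{\beta \Gamma_m} \frac{1}{k_2} [1 + o(1)]. \tag{2.50}$$

Theorem 2.11.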

Assume Condition 2.4 and Condition 2.7 are satisfied. Then
$$\mathbb{E}_{x_1^r}[\tau_{x_0}] = e^{\beta \Gamma_m} \frac{n}{c_r} [1 + o(1)], \quad \text{for every } r = 1, \dots, n. \tag{2.51}$$

Theorem 2.12.

Assume Condition 2.4, Condition 2.5, and Condition 2.6 are satisfied. Then
$$\mathbb{E}_{x_2}[\tau_{x_0}] = e^{\beta \Gamma_m} \Big( \frac{1}{k_1} + \frac{1}{k_2} \Big) [1 + o(1)]. \tag{2.52}$$

We remark that Theorem 2.12 gives an addition formula for the mean hitting time of $x_0$ starting at $x_2$. Neglecting terms of order $o(1)$, such a mean time can be written as the sum of the mean hitting time of the subset $\{x_1^1, \dots, x_1^n, x_0\}$ starting at $x_2$ and of the mean hitting time of $x_0$ starting from any state in $\{x_1^1, \dots, x_1^n\}$. It is interesting to note that in this decomposition no role is played by the mean hitting time of $\{x_1^1, \dots, x_1^n\}$ starting at $x_2$.

We consider the reversible PCA model for spin systems introduced by Derrida in [30], see also [19]. In the second example of Section 2.3 we considered a general PCA, but from now on we restrict ourselves to a specific nearest-neighbor interaction, see Figure 3. Consider the two-dimensional torus $\Lambda_L := \{0, \dots, L-1\}^2$, with $L$ even, endowed with the Euclidean metric.

Figure 3: In black are highlighted the sites $j$ such that $K(i - j) \neq 0$ in the reversible PCA model for spin systems.

To each site $i \in \Lambda$ we associate a variable $\sigma(i) \in \{-1, +1\}$. $\Lambda_L$ represents a system of interacting particles, each characterized by its spin, and we interpret $\sigma(i) = +1$ (respectively $\sigma(i) = -1$) as indicating that the spin at site $i$ is pointing upwards (respectively downwards). Let $\mathcal{X} := \{-1, +1\}^\Lambda$ be the configuration space, and let $\beta := 1/T > 0$, where $T$ is thought of as the temperature. Let $h \in (0, 1)$ be a parameter representing the external ferromagnetic field. We do not consider the case $h > 1$, because in that case there is no metastable behavior. The dynamics of the system is modelled as a Markov chain $(\sigma_n)_{n \in \mathbb{N}}$ on $\mathcal{X}$ with transition matrix defined in (2.30), (2.31). In the rest of the paper, we will choose
$$K(i - j) := \begin{cases} 1 & \text{if } |i - j| = 1, \\ 0 & \text{otherwise.} \end{cases}$$
(3.1)

Note that the transition probability $p_{i,\sigma}(s)$ for the spin $\sigma(i)$ given in (2.32) depends only on the values of the adjacent spins.

The system evolves in discrete time steps, where at each step all the spins are updated simultaneously according to the probability distribution (2.32). Intuitively, the value of the spin is likely to align with the local effective field $S_\sigma(i) + h$. Here $S_\sigma(i)$ represents a ferromagnetic interaction among spins.

The Markov chain $\sigma_n$ satisfies the detailed balance property (2.3), where $G(\cdot)$ in (2.28) is the Hamiltonian function. Equivalently, the Markov chain is reversible with respect to the Gibbs measure (2.4), and this implies that the measure $\mu$ is stationary. Finally, given $\sigma, \eta \in \mathcal{X}$, we define the energy cost of the transition from $\sigma$ to $\eta$ for our specific PCA as
$$\Delta(\sigma, \eta) := -\lim_{\beta \to \infty} \frac{\log p(\sigma, \eta)}{\beta} = \sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i) + h) < 0} 2 |S_\sigma(i) + h|. \tag{3.2}$$
Note that $\Delta(\sigma, \eta) \geq 0$ and, perhaps surprisingly, $\Delta(\sigma, \eta)$ is not necessarily equal to $\Delta(\eta, \sigma)$. We also note that condition (3.2) is sometimes written more explicitly as in (2.2). The last equality in (3.2) is obtained as follows (for more details, see Appendix A):
$$-\lim_{\beta \to \infty} \frac{\log p(\sigma, \eta)}{\beta} = \sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i) + h) < 0} \lim_{\beta \to \infty} \frac{\log(1 + \exp\{2\beta |S_\sigma(i) + h|\})}{\beta} = \sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i) + h) < 0} 2 |S_\sigma(i) + h|.$$

Let us fix the notation of some important states as follows:
• $\mathbf{+1}$ is the configuration such that $\mathbf{+1}(i) = +1$ for every $i \in \Lambda$;
• $\mathbf{-1}$ is the configuration such that $\mathbf{-1}(i) = -1$ for every $i \in \Lambda$;
• $c^e$ and $c^o$ are the configurations such that $c^e(i) = (-1)^{i_1 + i_2}$ and $c^o(i) = (-1)^{i_1 + i_2 + 1}$ for every $i = (i_1, i_2) \in \Lambda$. These configurations are called chessboard configurations.

Next we define the virtual energy as the limit
$$H(\sigma) := \lim_{\beta \to \infty} G(\sigma) = -h \sum_{i \in \Lambda} \sigma(i) - \sum_{i \in \Lambda} |S_\sigma(i) + h|. \tag{3.3}$$
We distinguish two cases.
• Case $h = 0$. In this case $H(\sigma) = -\sum_{i \in \Lambda} |S_\sigma(i)|$, so there exist four minima of $H$, given by the configurations $\mathbf{+1}$, $\mathbf{-1}$ and the two chessboard configurations. The configurations $\mathbf{+1}$, $\mathbf{-1}$ and $c$ are ground states, and each of their sites contributes $-4$ to the total energy.
• Case $h > 0$. In this case $\mathbf{+1}$ is the unique ground state. The energy of this state is $(-h - (4 + h)) |\Lambda|$, so each site contributes $-h - (4 + h)$ to the total energy.

From now on we assume $h > 0$, fixed and small. Under periodic boundary conditions, the energies of these configurations are, respectively,
• $H(\mathbf{+1}) = -L^2 (4 + 2h)$,
• $H(\mathbf{-1}) = -L^2 (4 - 2h)$,
• $H(c^e) = H(c^o) = -4 L^2$.

Since $H(c^e) = H(c^o)$ and $\Delta(c^e, c^o) = \Delta(c^o, c^e) = 0$, from now on we will indicate either element of the set $\{c^e, c^o\}$ as $c$; this is an example of a stable pair (see Definition 5.2). Therefore, $H(\mathbf{-1}) > H(c) > H(\mathbf{+1})$ for $0 < h < 1$. Our first goal is to show that $\{\mathbf{-1}, c\}$ is the set of metastable states and $\mathbf{+1}$ is the global minimum (or ground state).

In the setup introduced in [40], the minimal description of the metastability phenomenon is given in terms of $\mathcal{X}^s$, $\mathcal{X}^m$ and $\Gamma_m$, so we concentrate our attention on these. In particular, we determine the metastable and stable states and we show that the maximal stability level $\Gamma_m$ is equal to the energy barrier $\Gamma_{\mathrm{PCA}}$, defined as [19, (3.29)]
$$\Gamma \equiv \Gamma_{\mathrm{PCA}} = -2h\lambda^2 + 2\lambda(4 + h) - 2h, \tag{3.4}$$
where $\lambda$ is the critical length computed in [19, (3.24)] and defined as
$$\lambda := \Big[ \frac{2}{h} \Big] + 1, \tag{3.5}$$
where $[\,\cdot\,]$ is the integer part. Assuming that the system is prepared in the state $\sigma = \mathbf{-1}$, with probability tending to one as $\beta \to \infty$ the system visits the chessboard $c$ before relaxing to the stable state $\mathbf{+1}$. Moreover, by [19, Theorem 3.11, Theorem 3.13], along the tube of paths from $\mathbf{-1}$ to $c$ the system visits a certain set of configurations called critical droplets from $\mathbf{-1}$ to $c$. These critical droplets are all those configurations that have a single chessboard droplet of a specific size in a sea of minuses. Along the tube of paths from $c$ to $\mathbf{+1}$, instead, the system visits a set of configurations also called critical droplets from $c$ to $\mathbf{+1}$, but in this case these are all those configurations that have a single plus droplet of a specific size in a chessboard. The droplet size, in both cases, is the so-called critical length $\lambda$. We then say that a rectangle is supercritical (resp. subcritical) if the side of the rectangle is greater than $\lambda$ (resp. smaller than $\lambda$).
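The closed-form energies and the barrier (3.4) can be checked numerically. The following Python sketch evaluates the virtual energy (3.3) on a small torus with the nearest-neighbor kernel (3.1) and computes $\lambda$ and $\Gamma_{\mathrm{PCA}}$ from (3.4) and (3.5). The function names and the values $L = 6$, $h = 0.3$ are illustrative assumptions, not part of the original analysis.

```python
from math import floor


def local_field(sigma, i, j):
    """S_sigma(i): sum of the four nearest-neighbour spins on the L x L torus."""
    L = len(sigma)
    return (sigma[(i + 1) % L][j] + sigma[(i - 1) % L][j]
            + sigma[i][(j + 1) % L] + sigma[i][(j - 1) % L])


def virtual_energy(sigma, h):
    """H(sigma) = -h * sum_i sigma(i) - sum_i |S_sigma(i) + h|, as in (3.3)."""
    L = len(sigma)
    return sum(-h * sigma[i][j] - abs(local_field(sigma, i, j) + h)
               for i in range(L) for j in range(L))


def constant(L, spin):
    """Homogeneous configuration (+1 or -1 everywhere)."""
    return [[spin] * L for _ in range(L)]


def chessboard(L, parity=0):
    """c^e (parity=0) or c^o (parity=1): spin (-1)^(i1 + i2 + parity)."""
    return [[(-1) ** (i + j + parity) for j in range(L)] for i in range(L)]


def critical_length(h):
    """lambda = [2/h] + 1, as in (3.5)."""
    return floor(2 / h) + 1


def gamma_pca(h):
    """Gamma_PCA = -2 h lambda^2 + 2 lambda (4 + h) - 2 h, as in (3.4)."""
    lam = critical_length(h)
    return -2 * h * lam ** 2 + 2 * lam * (4 + h) - 2 * h


# Illustrative parameters (assumed, not taken from the paper).
L, h = 6, 0.3
print("H(+1) =", virtual_energy(constant(L, +1), h), "vs", -L**2 * (4 + 2 * h))
print("H(-1) =", virtual_energy(constant(L, -1), h), "vs", -L**2 * (4 - 2 * h))
print("H(c)  =", virtual_energy(chessboard(L), h), "vs", -4 * L**2)
print("lambda =", critical_length(h), " Gamma_PCA =", gamma_pca(h))
```

The lattice size must be even so that the chessboard is compatible with the periodic boundary; the sketch does not enforce this.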
Formally, the chessboard droplet is a supercritical rectangle with a one-by-one protuberance attached to one of the two longest sides and with the spin plus in this protuberance. Note that starting from different initial configurations yields different kinds of droplets.

We are finally ready to present our model-dependent results. In Lemma 3.1 we show that all states different from $\mathbf{+1}$, $\mathbf{-1}$, $c$ have a strictly lower stability level than $\Gamma_{\mathrm{PCA}}$. Using this lemma and [19, Lemma 3.4, Lemma 4.1], we show that $\Gamma_{\mathrm{PCA}} = \Gamma_m$, allowing us to conclude in Theorem 3.2 that the only metastable states are indeed $\mathbf{-1}$ and $c$.

Lemma 3.1 (Estimate of stability levels). For every $\eta \in \mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$, there exists $V^*$ such that $V_\eta \leq V^* < \Gamma_{\mathrm{PCA}}$.

Theorem 3.2 (Identification of metastable states). For the reversible PCA model (3.1) we have $\Gamma_m = \Gamma_{\mathrm{PCA}}$ and thus $\mathcal{X}^m = \{\mathbf{-1}, c\}$.

Theorem 3.3 below implies that the system visits a metastable state or a ground state in a time shorter than $e^{\beta(V^* + \epsilon)}$ and visits a stable state in a time shorter than $e^{\beta(\Gamma_m + \epsilon)}$, uniformly in the starting state, for any $\epsilon > 0$. We say that a function $\beta \mapsto f(\beta)$ is super exponentially small (SES) if
$$\lim_{\beta \to \infty} \frac{1}{\beta} \log f(\beta) = -\infty.$$

Theorem 3.3 (Recurrence property). For any $\epsilon > 0$, the functions
$$\beta \mapsto \sup_{\eta \in \mathcal{X}} \mathbb{P}_\eta\big(\tau_{\{\mathbf{+1}, c, \mathbf{-1}\}} > e^{\beta(V^* + \epsilon)}\big), \qquad \beta \mapsto \sup_{\eta \in \mathcal{X}} \mathbb{P}_\eta\big(\tau_{\mathbf{+1}} > e^{\beta(\Gamma_{\mathrm{PCA}} + \epsilon)}\big) \tag{3.6}$$
are SES.

Equation (3.7) in the next theorem already appeared in [24, Theorem 3.1]; however, the proof there was incomplete. Thanks to the previous theorems we are able to prove it rigorously here. The second part of the next theorem is an application of Theorem 2.2 to the reversible PCA model by Derrida.

Theorem 3.4.

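For $\beta$ large enough, we have
$$\mathbb{E}_{\mathbf{-1}}[\tau_{\mathbf{+1}}] = \Big( \frac{1}{k_1} + \frac{1}{k_2} \Big) e^{\beta \Gamma_{\mathrm{PCA}}} (1 + o(1)), \tag{3.7}$$
where $k_1 = k_2 = 8 \lambda |\Lambda|$. Moreover, for any $0 < \epsilon < 1$ we have
$$\lim_{\beta \to \infty} \frac{1}{\beta} \log t^{\mathrm{mix}}_\beta(\epsilon) = \Gamma_{\mathrm{PCA}}, \tag{3.8}$$
and there exist two constants $0 < c_1 < c_2 < \infty$ independent of $\beta$ such that for every $\beta > 0$
$$c_1 e^{-\beta(\Gamma_{\mathrm{PCA}} + \gamma_1)} \leq \rho_\beta \leq c_2 e^{-\beta(\Gamma_{\mathrm{PCA}} - \gamma_2)}, \tag{3.9}$$
where $\gamma_1, \gamma_2$ are functions of $\beta$ that vanish for $\beta \to \infty$, and $\rho_\beta$ is the spectral gap.

The first term $\frac{1}{k_1} e^{\beta \Gamma_{\mathrm{PCA}}}$ represents the contribution of the mean hitting time $\mathbb{E}_{\mathbf{-1}}[\tau_c \mathbb{1}_{\{\tau_c < \tau_{\mathbf{+1}}\}}]$, while the second term $\frac{1}{k_2} e^{\beta \Gamma_{\mathrm{PCA}}}$ represents the contribution of $\mathbb{E}_c[\tau_{\mathbf{+1}}]$.

Before we prove Theorem 2.2, let us recall some important definitions.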

Definition 4.1 (Cycle, [21, Def. 2.3], [14, Def. 4.2]). Let $(X_n)_n$ be a Markov chain. A nonempty set $C \subset \mathcal{X}$ is a cycle if it is either a singleton or, for any $x, y \in C$ such that $x \neq y$,
$$\lim_{\beta \to \infty} -\frac{1}{\beta} \log \mathbb{P}\big( X_{\tau_{(\mathcal{X} \setminus C) \cup \{y\}}} \neq y \,\big|\, X_0 = x \big) > 0. \tag{4.1}$$
In other words, a nonempty set $C \subset \mathcal{X}$ is a cycle if it is either a singleton or if, for any $x \in C$, the probability for the process starting from $x$ to leave $C$ without first visiting all the other elements of $C$ is exponentially small. We denote by $\mathcal{C}(\mathcal{X})$ the set of cycles of $\mathcal{X}$.

Definition 4.2 (Energy Cycle, [21, (2.17)], [21, Def. 3.5]). A nonempty set $A \subset \mathcal{X}$ is an energy-cycle if and only if it is either a singleton or it verifies the relation
$$\max_{x, y \in A} \Phi(x, y) < \Phi(A, \mathcal{X} \setminus A). \tag{4.2}$$

Definition 4.3.

Given a cycle $C \subset \mathcal{X}$, we denote by $\mathcal{F}(C)$ the set of the minima of the energy in $C$, namely
$$\mathcal{F}(C) := \{ x \in C \mid \min_{y \in C} H(y) = H(x) \}. \tag{4.3}$$
Proposition [21, Prop. 3.10] establishes the equivalence between cycles and energy-cycles, and allows us to use the equivalence between the approach in [38, 16, 15] and the pathwise approaches [19, 45, 21, 40, 48, 49, 50] that use energy-cycles. Next we define the collection of maximal cycles.

Definition 4.4 ([45, Def. 20], [21, Def. 2.4]). Given a nonempty subset $A \subset \mathcal{X}$, we denote by $\mathcal{M}(A)$ the collection of maximal cycles that partitions $A$, that is
$$\mathcal{M}(A) := \{ C \in \mathcal{C}(\mathcal{X}) \mid C \text{ maximal by inclusion under the constraint } C \subseteq A \}. \tag{4.4}$$
Moreover, we extend to the general setting the definition of the maximal depth given in [45, Def. 21] for the setting of Metropolis dynamics.

Definition 4.5.

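The maximal depth $\tilde{\Gamma}(A)$ of a nonempty subset $A \subset \mathcal{X}$ is the maximal depth of a cycle contained in $A$, that is
$$\tilde{\Gamma}(A) := \max_{C \in \mathcal{M}(A)} \Gamma(C). \tag{4.5}$$
Trivially $\tilde{\Gamma}(C) = \Gamma(C)$ if $C \in \mathcal{C}(\mathcal{X})$.

Proof of Theorem 2.2. We prove (2.24) by generalizing [45, Prop. 3.24]. To do this, we show that $\tilde{\Gamma}(\mathcal{X} \setminus \{s\})$ is equal to $\Gamma_m$. Recall definition (2.14),
$$\Gamma_m := \max_{x \in \mathcal{X} \setminus \{s\}} \big( \Phi(x, I_x) - H(x) \big).$$
Since $\Phi(x, I_x) \leq \Phi(x, s)$, we have that $\Gamma_m \leq \tilde{\Gamma}(\mathcal{X} \setminus \{s\})$. To prove the reverse inequality $\Gamma_m \geq \tilde{\Gamma}(\mathcal{X} \setminus \{s\})$, we consider $R_D(x)$, the union of $\{x\}$ and of the points in $\mathcal{X}$ which can be reached by means of paths starting from $x$ with height smaller than the height that is necessary to escape from $D \subset \mathcal{X}$ starting from $x$ [21, (3.58)]. We consider
$$R_{\mathcal{X} \setminus \{s\}}(x) = \{x\} \cup \{ y \in \mathcal{X} \mid \Phi(x, y) < \Phi(x, s) \}. \tag{4.6}$$
We partition $\mathcal{X}$ into the set of local minima $\mathcal{X}_0$ (i.e., $\mathcal{X}_V$ with $V = 0$) and its complement, as $\mathcal{X} = \mathcal{X}_0 \cup (\mathcal{X} \setminus \mathcal{X}_0)$, so that $\mathcal{X} \setminus \{s\} = (\mathcal{X}_0 \cup (\mathcal{X} \setminus \mathcal{X}_0)) \setminus \{s\} = (\mathcal{X}_0 \setminus \{s\}) \cup (\mathcal{X} \setminus \mathcal{X}_0)$. Then,
$$\tilde{\Gamma}(\mathcal{X} \setminus \{s\}) = \max_{x \in \mathcal{X} \setminus \{s\}} \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) = \max \Big\{ \max_{x \in \mathcal{X} \setminus \mathcal{X}_0} \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)), \; \max_{x \in \mathcal{X}_0 \setminus \{s\}} \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) \Big\}. \tag{4.7}$$
Let us analyze the two terms on the right separately.

• If $x \in \mathcal{X}_0 \setminus \{s\}$, then $R_{\mathcal{X} \setminus \{s\}}(x) = \{ y \in \mathcal{X} \mid \Phi(x, y) < \Phi(x, s) \}$ is a non-trivial cycle. Using [21, Prop. 3.17]:
  i) If $x \in \mathcal{F}(R_{\mathcal{X} \setminus \{s\}}(x))$, then $\Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) \leq V_x$, by [21, Prop. 3.17 (3)].
  ii) Suppose that $x \notin \mathcal{F}(R_{\mathcal{X} \setminus \{s\}}(x))$. Consider $\tilde{x} = \mathrm{argmin}_{x' \in R_{\mathcal{X} \setminus \{s\}}(x)} H(x')$; then $\tilde{x} \in \mathcal{F}(R_{\mathcal{X} \setminus \{s\}}(x))$ and by [21, Prop. 3.17 (2), (3)] we have $V_x < \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) = \Gamma(R_{\mathcal{X} \setminus \{s\}}(\tilde{x})) = V_{\tilde{x}}$. So
  $$\max_{y \in R_{\mathcal{X} \setminus \{s\}}(x)} V_y = V_{\tilde{x}} = \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)). \tag{4.8}$$
  From this it follows that
  $$\max_{x \in \mathcal{X}_0 \setminus \{s\}} \Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) = \max_{x \in \mathcal{X}_0 \setminus \{s\}} \max_{y \in R_{\mathcal{X} \setminus \{s\}}(x)} V_y \leq \Gamma_m. \tag{4.9}$$
• If $x \in \mathcal{X} \setminus \mathcal{X}_0$, we proceed as follows.
  I) If $\Phi(x, s) = H(x)$, then $R_{\mathcal{X} \setminus \{s\}}(x) = \{x\}$ because $\{ y \in \mathcal{X} \mid \Phi(x, y) < H(x) \}$ is empty. Indeed, $\Phi(x, y)$ is always greater than or equal to $H(x)$. So $\Gamma(R_{\mathcal{X} \setminus \{s\}}(x)) = \Gamma(\{x\}) = 0$.
  II) If $\Phi(x, s) > H(x)$, we choose $\tilde{x} = \mathrm{argmin}_{x' \in R_{\mathcal{X} \setminus \{s\}}(x)} H(x')$, so $\tilde{x} \in \mathcal{X}_0 \setminus \{s\}$ and $\Phi(x, s) = \Phi(\tilde{x}, s)$. Then $\{ y \in \mathcal{X} \mid \Phi(x, y) < \Phi(x, s) \} \subseteq R_{\mathcal{X} \setminus \{s\}}(\tilde{x})$ and we refer to the previous case $x \in \mathcal{X}_0$, since $\tilde{x} \in \mathcal{X}_0 \setminus \{s\}$.

This concludes the proof that $\Gamma_m \geq \tilde{\Gamma}(\mathcal{X} \setminus \{s\})$ and hence that $\Gamma_m = \tilde{\Gamma}(\mathcal{X} \setminus \{s\})$.

The key step in [45, Prop. 3.24] was to show that $H_2 = H_3$, where $H_2$ is defined as [14, Theorem 5.1]
$$H_2 := \tilde{\Gamma}(\mathcal{X} \setminus \{x\}), \quad x \in \mathrm{argmin}_{x \in \mathcal{X}} G(x). \tag{4.10}$$
The critical depth $H_3$ is defined as [14, Theorem 5.1]
$$H_3 := \tilde{\Gamma}(\mathcal{X} \times \mathcal{X} \setminus F), \tag{4.11}$$
where $F = \{ (x, x) \mid x \in \mathcal{X} \}$, $\tilde{\Gamma}(\mathcal{X} \times \mathcal{X} \setminus F) = \max_{C \in \mathcal{M}(\mathcal{X} \times \mathcal{X} \setminus F)} \Gamma(C)$ and $\mathcal{M}(\mathcal{X} \times \mathcal{X} \setminus F) = \{ C \in \mathcal{C}(\mathcal{X}) \mid C \text{ maximal cycle by inclusion under the constraint } C \subseteq \mathcal{X} \times \mathcal{X} \}$. Through the equivalence of the two definitions of cycles, given by [21, Prop. 3.10], the critical depth $H_2$ is equal to $\tilde{\Gamma}(\mathcal{X} \setminus \{s\})$. This quantity is well defined because its value is independent of the choice of $s$ [14, Theorem 5.1]. Now we consider two independent Markov chains, $X_t$ and $Y_t$, on the same energy landscape and with the same inverse temperature $\beta$. We define the two-dimensional Markov chain $\{(X_t, Y_t)\}$ on $\mathcal{X} \times \mathcal{X}$ with transition probabilities $P^{\otimes 2}_\beta$ given by
$$P^{\otimes 2}_\beta\big( (x, y), (\tilde{x}, \tilde{y}) \big) = P_\beta(x, \tilde{x}) P_\beta(y, \tilde{y}) \quad \forall\, (x, y), (\tilde{x}, \tilde{y}) \in \mathcal{X} \times \mathcal{X}. \tag{4.12}$$
So, using [14, Theorem 5.1] and the assumption (2.23), the proof is concluded.

Before proving the bounds (2.25),
$$c_1 e^{-\beta(\Gamma_m + \gamma_1)} \leq \rho_\beta \leq c_2 e^{-\beta(\Gamma_m - \gamma_2)},$$
we recall Definition 2.18 and we define the generator of a Markov process.

Definition 4.6.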

For any function $f : \mathcal{X} \to \mathbb{R}$, $L_\beta f$ is the function defined as
$$L_\beta f(x) := \sum_{y \in \mathcal{X}} P_\beta(x, y) [f(y) - f(x)]. \tag{4.13}$$
The result (2.25) is an immediate consequence of the next two lemmas, and it is obtained by generalizing [38, Theorem 2.1, Lemma 2.3, Lemma 2.7].

Lemma 4.7.

There exists a constant $C < \infty$ such that for all $\beta \geq 0$,
$$\rho_\beta \leq C e^{-\beta(\Gamma_m - \gamma_2)}, \tag{4.14}$$
where $\gamma_2$ is a function of $\beta$ that vanishes for $\beta \to \infty$.

Proof. We first observe that by assumption $\Gamma_m > 0$. Without loss of generality, we may assume that $x_0 \in \mathcal{X}^m$, $y_0 \in \mathcal{X}^s$ and $H(y_0) = 0$. Therefore $\Gamma_m = \Phi(x_0, y_0) - H(x_0)$ since $\mathcal{X}$ is finite. We write the spectral gap $\rho_\beta$ as
$$\rho_\beta = \inf_{f \in L^2(\mu)} \frac{-\sum_{x \in \mathcal{X}} f(x) L_\beta f(x) \mu(x)}{\mathrm{Var}_\beta(f)}, \tag{4.15}$$
where $\mathrm{Var}_\beta(f) := \sum_{x \in \mathcal{X}} f(x)^2 \mu(x) - \big( \sum_{x \in \mathcal{X}} f(x) \mu(x) \big)^2$, and $L^2(\mu)$ is the space of functions with finite second moment under the measure $\mu$. We will find a function $F$ and a constant $C < \infty$ such that
$$\frac{-\sum_{x \in \mathcal{X}} F(x) L_\beta F(x) \mu(x)}{\mathrm{Var}_\beta(F)} \leq C e^{-\beta(\Gamma_m - \gamma_2)}. \tag{4.16}$$
Let $x_0 \in \mathcal{X}$ and $y_0 \in I_{x_0}$ be two points for which $\Phi(x_0, y_0) - H(x_0) = \Gamma_m$, and let us consider the set $R_{\mathcal{X} \setminus \{x_0\}}(y_0) = \{y_0\} \cup \{ x \in \mathcal{X} \mid \Phi(y_0, x) < \Phi(y_0, x_0) \}$. Note that $x_0 \notin R_{\mathcal{X} \setminus \{x_0\}}(y_0)$ and $y_0 \in R_{\mathcal{X} \setminus \{x_0\}}(y_0)$. Moreover, if $x \in R_{\mathcal{X} \setminus \{x_0\}}(y_0)$ and $y \notin R_{\mathcal{X} \setminus \{x_0\}}(y_0)$, then
$$H(y) + \Delta(y, x) \geq \Phi(y_0, x_0). \tag{4.17}$$

Figure 4: In this figure we draw an example energy landscape, compatible with the assumptions on $x_0$, $y_0$ and $x$. We also draw four $y_i \notin R_{\mathcal{X} \setminus \{x_0\}}(y_0)$, $i = 1, 2, 3, 4$, for which (4.17) is valid.

For any $x \in R_{\mathcal{X} \setminus \{x_0\}}(y_0)$ and $y \notin R_{\mathcal{X} \setminus \{x_0\}}(y_0)$, by reversibility we have
$$P_\beta(x, y) \mu(x) = P_\beta(y, x) \mu(y) = e^{-\beta \left( -\frac{\log P_\beta(y, x)}{\beta} - \frac{\log \mu(y)}{\beta} \right)} \leq e^{-\beta(\Delta(y, x) + H(y) - \gamma^*)}, \tag{4.18}$$
where, to obtain the inequality, the first term is estimated by (2.1) and [21, Equation (2.2)], i.e.,
$$-\frac{\log P_\beta(y, x)}{\beta} \geq \Delta(y, x) - \tilde{\gamma}_1. \tag{4.19}$$
The second term in (4.18) is estimated by (2.5) and (2.4), that is
$$-\frac{\log \mu(y)}{\beta} \geq H(y) - \tilde{\gamma}_2, \tag{4.20}$$
where $\tilde{\gamma}_1$, $\tilde{\gamma}_2$ and $\gamma^* = \tilde{\gamma}_1 + \tilde{\gamma}_2$ are functions of $\beta$ that vanish for $\beta \to \infty$. Then using (4.17) we get
$$e^{-\beta(\Delta(y, x) + H(y) - \gamma^*)} \leq e^{-\beta \Phi(x_0, y_0)} e^{\beta \gamma^*}. \tag{4.21}$$
Let $F(x) = \mathbb{1}_{R_{\mathcal{X} \setminus \{x_0\}}(y_0)}(x)$; then
$$-\sum_{x \in \mathcal{X}} F(x) L_\beta F(x) \mu(x) = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \mu(x) P_\beta(x, y) [F(x) - F(y)]^2 \leq \sum_{\substack{x \in R_{\mathcal{X} \setminus \{x_0\}}(y_0) \\ y \notin R_{\mathcal{X} \setminus \{x_0\}}(y_0)}} e^{-\beta \Phi(x_0, y_0)} e^{\beta \gamma^*}. \tag{4.22}$$
On the other hand,
$$\mathrm{Var}_\beta(F) = \mu(R_{\mathcal{X} \setminus \{x_0\}}(y_0)) \, \mu(R_{\mathcal{X} \setminus \{x_0\}}(y_0)^c) \geq \frac{e^{-\beta G(y_0)}}{Z} \frac{e^{-\beta G(x_0)}}{Z} \geq e^{-\beta(H(y_0) + \tilde{\gamma}_2)} e^{-\beta(H(x_0) + \tilde{\gamma}_2)} = e^{-\beta(H(x_0) + 2\tilde{\gamma}_2)}, \tag{4.23}$$
where the last inequality is obtained by (4.20), and the final equality by our assumption $H(y_0) = 0$. We conclude that $\rho_\beta \leq C e^{-\beta(\Gamma_m - \gamma_2)}$, where $C$ is a constant and $\gamma_2 = \gamma^* + 2\tilde{\gamma}_2$.

Lemma 4.8.

There exists a constant $C > 0$ such that for all $\beta \geq 0$,
$$\rho_\beta \geq C e^{-\beta(\Gamma_m + \gamma_1)}, \tag{4.24}$$
where $\gamma_1$ is a function of $\beta$ that vanishes for $\beta \to \infty$.

Proof. It will be enough to find a constant $C > 0$ such that for every $\beta \geq 0$ and every $f \in L^2(\mu)$,
$$\frac{-\sum_{x \in \mathcal{X}} f(x) L_\beta f(x) \mu(x)}{\mathrm{Var}_\beta(f)} \geq C e^{-\beta(\Gamma_m + \gamma_1)}. \tag{4.25}$$
We consider $x, y \in \mathcal{X}$ and $\omega \in \Theta(x, y)$ with length $|\omega| = n(x, y)$, and define
$$N := \max_{x, y \in \mathcal{X}} n(x, y). \tag{4.26}$$
For $z \in \mathcal{X}$, $w \in I_z$, we define the function $F_{(z, w)} : \Theta(x, y) \to \{0, 1\}$ as
$$F_{(z, w)}(\omega) := \begin{cases} 1 & \text{if } \omega_i = z \text{ and } \omega_{i+1} = w \text{ for some } 0 \leq i < n(x, y), \\ 0 & \text{otherwise.} \end{cases} \tag{4.27}$$
Then,
$$\mathrm{Var}_\beta(f) = \frac{1}{2} \sum_{x, y \in \mathcal{X}} (f(y) - f(x))^2 \mu(y) \mu(x) = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \Big( \sum_{i=1}^{n(x, y)} f(\omega_i) - f(\omega_{i-1}) \Big)^2 \mu(y) \mu(x),$$
where in the last equality we use that $\omega \in \Theta(x, y)$ with $|\omega| = n(x, y)$ and we wrote $f(y) - f(x)$ as a telescopic sum. Using (4.26) and (4.27), we get the following inequalities:
$$\sum_{x, y \in \mathcal{X}} \Big( \sum_{i=1}^{n(x, y)} f(\omega_i) - f(\omega_{i-1}) \Big)^2 \mu(x) \mu(y) \leq \sum_{x, y \in \mathcal{X}} n(x, y) \sum_{i=1}^{n(x, y)} (f(\omega_i) - f(\omega_{i-1}))^2 \mu(x) \mu(y) \leq N \sum_{x, y \in \mathcal{X}} \sum_{z, w \in \mathcal{X}} F_{(z, w)}(\omega) (f(w) - f(z))^2 \mu(x) \mu(y). \tag{4.28}$$
We estimate $\mu(x) \mu(y)$ as in (4.20),
$$\mu(x) \mu(y) = e^{-\beta \left( -\frac{\log \mu(x)}{\beta} - \frac{\log \mu(y)}{\beta} \right)} \leq e^{-\beta(H(x) + H(y) - 2\tilde{\gamma}_2)}. \tag{4.29}$$
Then we have
$$N \sum_{x, y \in \mathcal{X}} \sum_{z, w \in \mathcal{X}} F_{(z, w)}(\omega) (f(w) - f(z))^2 \mu(x) \mu(y) \leq N \sum_{x, y \in \mathcal{X}} \sum_{z, w \in \mathcal{X}} F_{(z, w)}(\omega) (f(w) - f(z))^2 e^{-\beta \Phi(z, w)} \frac{e^{-\beta(H(x) + H(y) - 2\tilde{\gamma}_2)}}{e^{-\beta \Phi(z, w)}} \leq N \Big( \max_{z, w} \sum_{x, y \in \mathcal{X}} F_{(z, w)}(\omega) \frac{e^{-\beta(H(x) + H(y) - 2\tilde{\gamma}_2)}}{e^{-\beta \Phi(z, w)}} \Big) \sum_{u, v \in \mathcal{X}} (f(v) - f(u))^2 e^{-\beta \Phi(u, v)}. \tag{4.30}$$
Moreover,
$$F_{(z, w)}(\omega) \frac{e^{-\beta(H(x) + H(y) - 2\tilde{\gamma}_2)}}{e^{-\beta \Phi(z, w)}} = F_{(z, w)}(\omega) e^{\beta(\Phi(z, w) - H(x) - H(y) + 2\tilde{\gamma}_2)} \leq F_{(z, w)}(\omega) e^{\beta(\Phi(x, y) - H(x) - H(y) + 2\tilde{\gamma}_2)} \leq F_{(z, w)}(\omega) e^{\beta(\Gamma_m + 2\tilde{\gamma}_2)}. \tag{4.31}$$
The result (4.24) follows from (4.28), (4.30), (4.31).

In Section 5.1 we prove the main model-dependent results except for Lemma 3.1, which we postpone to Section 5.2.

Note that our PCA verifies [20, Definition 2.1]. In order to prove Theorem 3.2 we will rely on [20, Theorem 2.4] (see Appendix A). Roughly speaking, if we have an ansatz for the set of metastable configurations and one for the communication height, and we show that these verify two conditions, then [20, Theorem 2.4] guarantees that both ansatzes are correct.

Proof of Theorem 3.2 (Identiﬁcation of metastable states).

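In [19] the authors computed the value of $\Gamma$ to be $\Gamma_{\mathrm{PCA}} = -2h\lambda^2 + 2\lambda(4 + h) - 2h$. There, it was also proven that
$$\Phi(\mathbf{-1}, \mathbf{+1}) - H(\mathbf{-1}) = \Gamma_{\mathrm{PCA}}, \tag{5.1}$$
$$\Phi(c, \mathbf{+1}) - H(c) = \Gamma_{\mathrm{PCA}}. \tag{5.2}$$
By [19, Lemma 3.4, Lemma 4.1] we have that $\Phi(\mathbf{-1}, c) = \Gamma_{\mathrm{PCA}} + H(\mathbf{-1})$, that is, $\Gamma_{\mathrm{PCA}} + H(\mathbf{-1})$ is the minmax between $\mathbf{-1}$ and $c$. The first assumption of [20, Theorem 2.4] is satisfied for $A = \{\mathbf{-1}, c\}$ and $a = \Gamma_{\mathrm{PCA}}$ thanks to [19, Theorem 3.11, Lemma 3.4, Lemma 4.1], hence
$$\Phi(\sigma, \mathcal{X}^s) - H(\sigma) = \Gamma_{\mathrm{PCA}} \quad \text{for all } \sigma \in \{\mathbf{-1}, c\}. \tag{5.3}$$
Moreover, the second assumption of [20, Theorem 2.4] is satisfied because by Lemma 3.1 either $\mathcal{X} \setminus (\{\mathbf{-1}, c\} \cup \mathcal{X}^s) = \emptyset$ or
$$V_\sigma < \Gamma_{\mathrm{PCA}} \quad \text{for all } \sigma \in \mathcal{X} \setminus (\{\mathbf{-1}, c\} \cup \mathcal{X}^s). \tag{5.4}$$
Finally, by applying [20, Theorem 2.4], we conclude that $\Gamma_m = \Gamma_{\mathrm{PCA}}$ and $\mathcal{X}^m = \{\mathbf{-1}, c\}$.

Proof of Theorem 3.3 (Recurrence property).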

In Lemma 3.1 we compute $V^* = 2(2 - h)$. Recall the definition of $\mathcal{X}_V$ in (2.16) and apply [21, Prop. 2.8] with $a = V^*$, $\mathcal{X}_{V^*} = \{\mathbf{-1}, c, \mathbf{+1}\}$. We get that
$$\beta \mapsto \sup_{\eta \in \mathcal{X}} \mathbb{P}_\eta\big( \tau_{\mathcal{X}^m \cup \mathcal{X}^s} > e^{\beta(V^* + \epsilon)} \big) \text{ is SES.} \tag{5.5}$$
With a similar reasoning with $a = \Gamma_m$, $\mathcal{X}_{\Gamma_m} = \mathcal{X}^s$, we get that
$$\beta \mapsto \sup_{\eta \in \mathcal{X}} \mathbb{P}_\eta\big( \tau_{\mathcal{X}^s} > e^{\beta(\Gamma_m + \epsilon)} \big) \text{ is SES.} \tag{5.6}$$

Proof of Theorem 3.4. In [24] the proof of [24, Theorem 3.1] was only sketched in Section 4. Recall Theorem 2.12: Condition 2.4 is satisfied thanks to our Theorem 3.2, Condition 2.5 is satisfied thanks to [24, Lemma 3.3, Lemma 3.4], and Condition 2.6 is satisfied thanks to [24, Lemma 3.5]. Thus, applying Theorem 2.12 concludes the rigorous proof of (3.7). In the second example of Section 2.3 we verify the assumptions of Theorem 2.2 and Theorem 2.3 for the general reversible PCA model in order to get (3.8) and (3.9).
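The two recurrence statements are only informative because $V^* < \Gamma_{\mathrm{PCA}}$, so the first time scale is exponentially shorter than the second. A minimal Python sketch that checks this strict inequality for a few sample fields is given below. It uses $V^* = 2(2 - h)$ from the proof above together with (3.4) and (3.5); the function names and the sample values of $h$ are illustrative assumptions.

```python
from math import floor


def critical_length(h):
    """lambda = [2/h] + 1, as in (3.5)."""
    return floor(2 / h) + 1


def gamma_pca(h):
    """Gamma_PCA = -2 h lambda^2 + 2 lambda (4 + h) - 2 h, as in (3.4)."""
    lam = critical_length(h)
    return -2 * h * lam ** 2 + 2 * lam * (4 + h) - 2 * h


def v_star(h):
    """Upper bound on the stability level of non-metastable states, V* = 2(2 - h)."""
    return 2 * (2 - h)


# Sample small fields (assumed for illustration; 2/h is not an integer here).
for h in (0.15, 0.3, 0.7):
    print(f"h={h}: lambda={critical_length(h)}, "
          f"V*={v_star(h):.3f}, Gamma_PCA={gamma_pca(h):.3f}")
```

For these values the gap between the two exponents grows as $h$ decreases, since $\Gamma_{\mathrm{PCA}}$ grows with the critical length while $V^*$ stays below 4.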

Deﬁnition 5.1.

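We call stable configurations those configurations $\sigma \in \mathcal{X}$ such that $p(\sigma, \sigma) \to 1$ in the limit $\beta \to \infty$. Equivalently, $\sigma \in \mathcal{X}$ is a stable configuration if and only if $p(\sigma, \eta) \to 0$ in the limit $\beta \to \infty$ for all $\eta \in \mathcal{X} \setminus \{\sigma\}$.

For any $\sigma \in \mathcal{X}$ there exists a unique configuration $\eta \in \mathcal{X}$ such that the transition $\sigma \to \eta$ happens with high probability as $\beta \to \infty$, that is, $p(\sigma, \eta) \to 1$ as $\beta \to \infty$. So let $\eta$ and $\sigma$ be two configurations in $\mathcal{X}$ such that $\eta = T\sigma$, where $T : \mathcal{X} \to \mathcal{X}$, $\sigma \mapsto T\sigma$, is the map such that for each $x \in \Lambda$
$$T\sigma(x) = \begin{cases} \sigma^x(x) & \text{if } p_x(\sigma^x(x) \mid \sigma) \to 1 \text{ as } \beta \to \infty, \\ \sigma(x) & \text{if } p_x(\sigma^x(x) \mid \sigma) \to 0 \text{ as } \beta \to \infty. \end{cases}$$

Definition 5.2.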

Let $\sigma, \eta \in \mathcal{X}$ be two different configurations. We say that $\sigma$ and $\eta$ form a stable pair if and only if $\eta = T\sigma$ and $T\eta = \sigma$. Moreover, we say that $\sigma \in \mathcal{X}$ is a trap if either $\sigma$ is a stable configuration or the pair $(\sigma, T\sigma)$ is a stable pair. We denote by $\mathcal{T} \subset \mathcal{X}$ the collection of all traps.
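The definitions of stable configuration, stable pair and trap can be illustrated with a small Python sketch. It assumes that, in the limit $\beta \to \infty$, the map $T$ aligns every spin with the sign of the local field $S_\sigma(i) + h$ (which never vanishes when $h$ is not an even integer), and it uses the cost $\Delta$ from (3.2). All function names and the parameters $L = 6$, $h = 0.3$ are hypothetical choices made for illustration.

```python
def local_field(sigma, i, j):
    """S_sigma(i): sum of the four nearest-neighbour spins on the L x L torus."""
    L = len(sigma)
    return (sigma[(i + 1) % L][j] + sigma[(i - 1) % L][j]
            + sigma[i][(j + 1) % L] + sigma[i][(j - 1) % L])


def zero_temperature_map(sigma, h):
    """T sigma: every spin aligns with the sign of S_sigma(i) + h (assumed form)."""
    L = len(sigma)
    return [[1 if local_field(sigma, i, j) + h > 0 else -1 for j in range(L)]
            for i in range(L)]


def delta(sigma, eta, h):
    """Cost (3.2): sum of 2|S_sigma(i) + h| over sites where eta opposes the field."""
    L = len(sigma)
    cost = 0.0
    for i in range(L):
        for j in range(L):
            field = local_field(sigma, i, j) + h
            if eta[i][j] * field < 0:
                cost += 2 * abs(field)
    return cost


def is_stable(sigma, h):
    return zero_temperature_map(sigma, h) == sigma


def is_stable_pair(sigma, eta, h):
    return (sigma != eta
            and zero_temperature_map(sigma, h) == eta
            and zero_temperature_map(eta, h) == sigma)


def is_trap(sigma, h):
    return is_stable(sigma, h) or is_stable_pair(sigma, zero_temperature_map(sigma, h), h)


def chessboard(L, parity=0):
    return [[(-1) ** (i + j + parity) for j in range(L)] for i in range(L)]


# Illustrative parameters (assumed).
L, h = 6, 0.3
c_even, c_odd = chessboard(L, 0), chessboard(L, 1)
minus = [[-1] * L for _ in range(L)]
droplet = [row[:] for row in minus]
droplet[0][0] = 1  # a single plus spin in a sea of minuses

print("(c^e, c^o) stable pair:", is_stable_pair(c_even, c_odd, h))
print("-1 stable:", is_stable(minus, h), " single-plus droplet is a trap:", is_trap(droplet, h))
print("Delta(-1 -> droplet) =", delta(minus, droplet, h),
      " Delta(droplet -> -1) =", delta(droplet, minus, h))
```

The last line also shows the asymmetry of $\Delta$: creating the isolated plus spin has a positive cost, while removing it is free, because $T$ maps the droplet back to $\mathbf{-1}$.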

We deﬁne two further maps, that will be useful later on. For any given j ∈ Λ , T Fj ( σ ) = T ( σ ) except in the site j , where T Fj ( σ ) = σ ( j ) . Formally, T Fj σ ( i ) =  σ i X \{ j } ( i ) if p i ( σ i ( i ) | σ ) β →∞ −→ ,σ X \{ j } ( i ) if p i ( σ i ( i ) | σ ) β →∞ −→ ,σ ( j ) if i = j. (5.7)For any given j ∈ Λ , T Cj ( σ ) = T ( σ ) except in the site j , where T Cj ( σ ) = − σ ( j ) . Formally, T Cj σ ( i ) =  σ i X \{ j } ( i ) if p i ( σ i ( i ) | σ ) β →∞ −→ ,σ X \{ j } ( i ) if p i ( σ i ( i ) | σ ) β →∞ −→ , − σ ( j ) if i = j. (5.8)The two maps are similar to T ( σ ) , the only diﬀerence being that T Fj ( σ ) ﬁxes the value of thespin in j and T Cj ( σ ) changes the value of the spin in j .We say that x, y ∈ Λ are nearest neighbors if and only if the lattice distance d between x, y isone, i.e., d ( x, y ) = 1 . We indicate by R l,m ⊆ Λ the rectangle with sides l and m , ≤ l ≤ m andwe call non-interacting rectangles two rectangles R l,m and R l (cid:48) ,m (cid:48) such that any of the followingconditions hold: 22 d ( R l,m , R l (cid:48) ,m (cid:48) ) ≥ , if σ R l,m = c oR l,m and σ R l (cid:48) ,m (cid:48) = c oR l (cid:48) ,m (cid:48) ; • d ( R l,m , R l (cid:48) ,m (cid:48) ) ≥ , if σ R l,m = c eR l,m and σ R l (cid:48) ,m (cid:48) = c eR l (cid:48) ,m (cid:48) ; • d ( R l,m , R l (cid:48) ,m (cid:48) ) ≥ , if σ R l,m = +1 R l,m and σ R l (cid:48) ,m (cid:48) = +1 R l (cid:48) ,m (cid:48) ; • d ( R l,m , R l (cid:48) ,m (cid:48) ) = 1 , if σ R l,m = c oR l,m and σ R l (cid:48) ,m (cid:48) = c eR l (cid:48) ,m (cid:48) ; • d ( R l,m , R l (cid:48) ,m (cid:48) ) = 1 , if σ R l,m = c R l,m , σ R l (cid:48) ,m (cid:48) = +1 R l (cid:48) ,m (cid:48) and the sides on the interface areof the same length.Whenever two rectangles are not non-interacting , we call them interacting . Proof of Lemma 3.1.

We begin by giving a rough sketch of the proof. Without loss of generality, we consider only configurations in $U := \mathcal{X}_0 \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$, since the configurations in $\mathcal{X} \setminus \mathcal{X}_0$ have stability level zero. Indeed, if $\sigma \in \mathcal{X} \setminus \mathcal{X}_0$, we construct the path $\omega = (\sigma, T(\sigma))$, so that $T(\sigma) \in I_\sigma$ and $V_\sigma = 0$, where $I_\sigma$ was defined in (2.12). We will partition $\mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$ into several subsets $A, B, D, E$ and for each of these we will construct a path $\omega \in \Theta(\sigma, I_\sigma \cap \mathcal{X}_0)$. Denote by $\sigma_{\Lambda'}$ the restriction of a configuration $\sigma$ to $\Lambda' \subseteq \Lambda$. We will find an explicit upper bound $V^*_\sigma$ on the transition energy along $\omega$ as
$$\max_{k = 1, \dots, |\omega| - 1} H(\omega_k, \omega_{k+1}) - H(\sigma) \leq V^*_\sigma. \tag{5.9}$$
We define
$$V^*_S = \max_{\sigma \in S} V^*_\sigma, \quad S \in \{A, B, D, E\}, \tag{5.10}$$
and since
$$\max_{S \in \{A, B, D, E\}} V^*_S < \Gamma_{\mathrm{PCA}}, \tag{5.11}$$
from (5.9) and (5.10) it follows that, for any $\sigma \in \mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$,
$$\Phi(\sigma, I_\sigma) - H(\sigma) = \min_{\omega \in \Theta(\sigma, \eta)} \max_{i = 1, \dots, |\omega| - 1} H(\omega_i, \omega_{i+1}) - H(\sigma) < \Gamma_{\mathrm{PCA}}. \tag{5.12}$$
This means that all configurations in $\mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$ have a lower stability level than $\Gamma_{\mathrm{PCA}}$.

We now proceed with the detailed proof. We partition the set $\mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\}$ into four subsets as $\mathcal{X} \setminus \{\mathbf{-1}, c, \mathbf{+1}\} = A \cup B \cup D \cup E$ [19, Prop. 3.3]. For each set $A, B, D, E$, we first describe it in words and then give its formal definition.

We define the set $A$ to be the set of configurations consisting of a single rectangle containing either $c$ or $\mathbf{+1}$, and surrounded by either $c$ or $\mathbf{-1}$, see Figure 5. More precisely, $A = A_1 \cup A_2 \cup A_3 \cup A_4 \cup A_5 \cup A_6$, where:
• $A_1$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l < \lambda$, $\sigma_{R_{l,m}} = c_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = \mathbf{-1}_{\Lambda \setminus R_{l,m}}$;
• $A_2$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l \geq \lambda$, $\sigma_{R_{l,m}} = c_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = \mathbf{-1}_{\Lambda \setminus R_{l,m}}$;
• $A_3$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l < \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = c_{\Lambda \setminus R_{l,m}}$;
• $A_4$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l \geq \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = c_{\Lambda \setminus R_{l,m}}$;
• $A_5$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l < \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = \mathbf{-1}_{\Lambda \setminus R_{l,m}}$;
• $A_6$ is the collection of configurations such that $\exists!\, R_{l,m} \subset \Lambda$ with $l \geq \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus R_{l,m}} = \mathbf{-1}_{\Lambda \setminus R_{l,m}}$.

Figure 5: Examples of configurations in $A$ (panels $A_1, A_2$; $A_3, A_4$; $A_5, A_6$).

Configurations in the set $B$ consist of a single chessboard rectangle which may contain an island of $\mathbf{+1}$, surrounded by $\mathbf{-1}$, see Figure 6. More precisely, $B = B_1 \cup B_2 \cup B_3$, where:
• $B_1$ is the collection of configurations such that $\exists!\, R_{l,m}$ with $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\exists!\, R_{l',m'} \supsetneq R_{l,m}$ with $l' < \lambda$, $\sigma_{R_{l',m'} \setminus R_{l,m}} = c_{R_{l',m'} \setminus R_{l,m}}$, $\sigma_{\Lambda \setminus R_{l',m'}} = \mathbf{-1}_{\Lambda \setminus R_{l',m'}}$;
• $B_2$ is the collection of configurations such that $\exists!\, R_{l,m}$ with $l \geq \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\exists!\, R_{l',m'} \supsetneq R_{l,m}$ such that $\sigma_{R_{l',m'} \setminus R_{l,m}} = c_{R_{l',m'} \setminus R_{l,m}}$, $\sigma_{\Lambda \setminus R_{l',m'}} = \mathbf{-1}_{\Lambda \setminus R_{l',m'}}$;
• $B_3$ is the collection of configurations such that $\exists!\, R_{l,m}$ with $l < \lambda$, $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\exists!\, R_{l',m'} \supsetneq R_{l,m}$ with $l' \geq \lambda$ such that $\sigma_{R_{l',m'} \setminus R_{l,m}} = c_{R_{l',m'} \setminus R_{l,m}}$, $\sigma_{\Lambda \setminus R_{l',m'}} = \mathbf{-1}_{\Lambda \setminus R_{l',m'}}$.
Figure 6: Examples of configurations in $B$ (panel $B_1, B_2, B_3$).

The set $D$ contains all configurations with more than one rectangle, see Figure 7. More precisely, $D = D_1 \cup D_2 \cup D_3 \cup D_4 \cup D_5$, where:
• $D_1$ is the collection of configurations such that there exist subcritical non-interacting rectangles $\mathcal{R} := (R_{l,m})_{l,m}$ such that $\sigma_{\Lambda \setminus \mathcal{R}} = \mathbf{-1}_{\Lambda \setminus \mathcal{R}}$, and any rectangle of chessboard may contain one or more non-interacting rectangles of pluses;
• $D_2$ is the collection of configurations such that there exist non-interacting rectangles $\mathcal{R} := (R_{l,m})_{l,m}$ where at least one of them is supercritical and such that $\sigma_{\Lambda \setminus \mathcal{R}} = \mathbf{-1}_{\Lambda \setminus \mathcal{R}}$. Moreover, any rectangle of chessboard may contain one or more non-interacting rectangles of pluses;
• $D_3$ is the collection of configurations consisting of interacting rectangles $\mathcal{R} := (R_{l,m})_{l,m}$ with $l < \lambda$ and such that any rectangle of chessboard may contain one or more non-interacting rectangles of pluses;
• $D_4$ is the collection of configurations consisting of non-interacting rectangles $\mathcal{R} := (R_{l,m})_{l,m}$ with $l < \lambda$ such that $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus \mathcal{R}} = c_{\Lambda \setminus \mathcal{R}}$;
• $D_5$ is the collection of configurations consisting of rectangles $\mathcal{R} := (R_{l,m})_{l,m}$ where at least one has $l \geq \lambda$ and such that $\sigma_{R_{l,m}} = \mathbf{+1}_{R_{l,m}}$ and $\sigma_{\Lambda \setminus \mathcal{R}} = c_{\Lambda \setminus \mathcal{R}}$.

Figure 7: Examples of configurations in $D$ (panels $D_1, D_2$; $D_3$; $D_4, D_5$).

The set $E$ contains all possible strips, that is, rectangles winding around the torus, see Figure 8. More precisely, $E = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 \cup E_6 \cup E_7$, where:
• $E_1$ is the collection of configurations containing strips of $c$ of width one surrounded by $\mathbf{-1}$, and possibly rectangles of $\mathbf{+1}$ and $c$;
• $E_2$ is the collection of configurations containing strips of $\mathbf{+1}$ of width one surrounded by $c$, and possibly rectangles of $\mathbf{+1}$;
• $E_3$ is the collection of configurations containing strips of $\mathbf{+1}$ of width one surrounded by $\mathbf{-1}$, and possibly rectangles of $\mathbf{+1}$ and $c$;
• $E_4$ is the collection of configurations containing pairs of adjacent strips of $c$ and $\mathbf{-1}$. For at least one of these pairs, both strips have width greater than one. Furthermore, there may be rectangles of $c$ and $\mathbf{+1}$ surrounded by $\mathbf{-1}$, and rectangles of $\mathbf{+1}$ surrounded by $c$;
• $E_5$ is the collection of configurations containing pairs of adjacent strips of $c$ and $\mathbf{+1}$. For at least one of these pairs, both strips have width greater than one. Furthermore, there may be rectangles of $\mathbf{+1}$ surrounded by $c$;
• $E_6$ is the collection of configurations containing pairs of adjacent strips of $\mathbf{+1}$ and $\mathbf{-1}$. For at least one of these pairs, both strips have width greater than one. Furthermore, there may be rectangles of $c$ and $\mathbf{+1}$ surrounded by $\mathbf{-1}$;
• $E_7$ is the collection of configurations containing strips of $c$, $\mathbf{-1}$ and $\mathbf{+1}$ with at least one width greater than one, and possibly rectangles of $c$ and $\mathbf{+1}$ in $\mathbf{-1}$, and possibly rectangles of $\mathbf{+1}$ in $c$.

We begin by considering the set $A$. Consider first the set $A_1$.

Case $A_1$. For any configuration $\sigma \in A_1$ we construct a path that begins in $\sigma$ and ends in a configuration in $A_1 \cup \{\mathbf{-1}\}$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap (A_1 \cup \{\mathbf{-1}\}))$. We now fix $\sigma \equiv \omega_1 \in A_1$ and we begin by defining $\omega_2$.
If there is a minus corner in $\sigma_{R_{l,m}}$, say in $j_1$, then $\sigma(j_1)$ is kept fixed and all other spins in the rectangle switch sign, i.e., $\omega_2 := T^F_{j_1}(\omega_1)$. On the other hand, if there is no minus corner in $\sigma_{R_{l,m}}$, then we call the next configuration in the path $\omega'_1$ and we define it as $\omega'_1 := T(\omega_1)$, i.e., all the spins in the rectangle switch sign. After this step, $\omega'_1$ has a minus corner, so we can proceed as above and define $\omega_2 := T^F_{j_1}(\omega'_1)$. Note that in $\omega_2$ there are two minus corners in the rectangle that are nearest neighbors of $j_1$. For the next step, keep fixed the minus corner that is contained in a side of length $l$, say in $j_2$, and define $\omega_3 := T^F_{j_2}(\omega_2)$. By iterating this procedure, a full slice of the droplet is erased and we obtain the configuration $\eta \equiv \omega_l$ such that $\eta_{R_{l,m-1}} = c$ and $\eta_{\Lambda \setminus R_{l,m-1}} = -1$. In order to determine where the maximum of the transition energy is attained, we rewrite, for $k = 1, \dots, l-1$,
\[
H(\omega_k, \omega_{k+1}) - H(\omega_1) = H(\omega_k) + \Delta(\omega_k, \omega_{k+1}) - H(\omega_1) = \sum_{m=1}^{k-1} \big( H(\omega_{m+1}) - H(\omega_m) \big) + \Delta(\omega_k, \omega_{k+1}), \tag{5.13}
\]
with the convention that a sum over an empty set is equal to zero. From the reversibility property of the dynamics it follows that
\[
H(\omega_k) + \Delta(\omega_k, \omega_{k+1}) = H(\omega_{k+1}) + \Delta(\omega_{k+1}, \omega_k), \tag{5.14}
\]
and since $\Delta(\omega_{k+1}, \omega_k) = 0$ for $k = 1, \dots, l-2$, for the path $\omega$,
\[
H(\omega_k, \omega_{k+1}) - H(\omega_1) =
\begin{cases}
\sum_{m=1}^{k} \big( H(\omega_{m+1}) - H(\omega_m) \big) & \text{if } k = 1, \dots, l-2, \\
\sum_{m=1}^{l-2} \big( H(\omega_{m+1}) - H(\omega_m) \big) + \Delta(\omega_{l-1}, \omega_l) & \text{if } k = l-1.
\end{cases} \tag{5.15}
\]

[Figure 8: Examples of configurations in $E$.]

It can be shown that $H(\omega_{m+1}) - H(\omega_m) = 2h > 0$ for $m = 1, \dots, l-2$ and $\Delta(\omega_{l-1}, \omega_l) = 2h$ [19, Tab. 1], so the maximum is attained in the pair of configurations $(\omega_{l-1}, \omega_l)$. Hence,
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \sum_{m=1}^{l-2} \big( H(\omega_{m+1}) - H(\omega_m) \big) + \Delta(\omega_{l-1}, \omega_l) = 2h(l-2) + 2h = 2h(l-1) := V^*_\sigma. \tag{5.16}
\]
Since $V^*_\sigma$ depends only on the length $l$, we find $V^*_{A_1} = \max_{\sigma \in A_1} V^*_\sigma$ by taking the maximum over $l$. Since $l < \lambda$, we have
\[
V^*_{A_1} < 2(2-h). \tag{5.17}
\]
Finally, let us check that $\omega_l \in I_\sigma \cap (A_1 \cup \{-1\})$. Using (5.14), (5.16) and [19, Tab. 1], we get
\[
H(\omega_1) + 2h(l-1) = H(\omega_l) + 2(2-h). \tag{5.18}
\]
The rectangle $R_{l,m}$ is subcritical if and only if $l < 2/h$, and so
\[
H(\omega_1) - H(\omega_l) = 4 - 2hl > 0, \tag{5.19}
\]
which concludes the proof for $A_1$.

Case $A_2$. For any configuration $\sigma \in A_2$ we construct a path that begins in $\sigma$ and ends in a configuration in $A_2 \cup \{c\}$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap (A_2 \cup \{c\}))$. We now fix $\sigma \equiv \omega_1 \in A_2$ and we begin by defining $\omega_2$. We call $j_1 \in R_{l,m}$ a site in one of the sides of length $l$ and such that $\sigma(j_1) = +1$. Furthermore, we call $j_2 \in \Lambda \setminus R_{l,m}$ the nearest neighbor of $j_1$ such that (necessarily) $\sigma(j_2) = -1$ and we define $\omega_2 := T^C_{j_2}(\omega_1)$, i.e., $\sigma(j_2)$ switches sign and the signs of all other sites in $\sigma_{\Lambda \setminus R_{l,m}}$ remain fixed. We define $\omega_3 := T(\omega_2)$, $\omega_4 := T(\omega_3) = T^2(\omega_2)$ and so on until a new slice is filled with chessboard. We obtain the configuration $\eta$ such that $\eta_{R_{l,m+1}} = c$ and $\eta_{\Lambda \setminus R_{l,m+1}} = -1$. Note that at the first step of the dynamics either one or two nearest neighbors of $j_2$ in the external side of the rectangle switch sign when $T$ is applied. Analogously, at each subsequent application of $T$, either one or two further sites in the external side of the rectangle switch sign. Therefore, the maximum number of iterations of the map $T$ is $l-1$. In order to determine where the maximum of the transition energy is attained, we rewrite the energy difference as in (5.13). Using (5.14) and since $\Delta(\omega_k, \omega_{k+1}) = 0$ for $k = 2, \dots, l-1$, for the path $\omega$,
\[
H(\omega_k, \omega_{k+1}) - H(\omega_1) =
\begin{cases}
\Delta(\omega_1, \omega_2) + \sum_{m=2}^{k} \big( H(\omega_{m+1}) - H(\omega_m) \big) & \text{if } k = 2, \dots, l-1, \\
\Delta(\omega_1, \omega_2) & \text{if } k = 1.
\end{cases} \tag{5.20}
\]
It can be shown that $H(\omega_{m+1}) - H(\omega_m) = -\Delta(\omega_{m+1}, \omega_m) = -2h < 0$ for $m = 2, \dots, l$ [19, Tab. 1], so the maximum is attained in the pair of configurations $(\omega_1, \omega_2)$, hence
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \Delta(\omega_1, \omega_2) = 2(2-h) := V^*_\sigma. \tag{5.21}
\]
Since $V^*_\sigma$ is the same for all configurations in $A_2$, $V^*_{A_2} = \max_{\sigma \in A_2} V^*_\sigma = 2(2-h)$.
Finally, let us check that $\omega_l \in I_\sigma \cap (A_2 \cup \{c\})$. Using (5.14), (5.21) and [19, Tab. 1], we get
\[
H(\omega_1) + 2(2-h) = H(\omega_l) + 2h(l-1). \tag{5.22}
\]
The rectangle $R_{l,m}$ is supercritical if and only if $l > 2/h$, and so
\[
H(\omega_1) - H(\omega_l) = 2hl - 4 > 0, \tag{5.23}
\]
which concludes the proof for $A_2$.

Case $A_3$. For any configuration $\sigma \in A_3$ we construct a path that begins in $\sigma$ and ends in a configuration in $A_3 \cup \{c\}$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap (A_3 \cup \{c\}))$. We now fix $\sigma \equiv \omega_1 \in A_3$ and we begin by defining $\omega_2$. If in $\sigma_{R_{l,m}}$ there is a plus corner surrounded by two minuses, say in $j_1$, then $\sigma(j_1)$ switches sign and the signs of all other spins in the rectangle remain fixed, i.e., $\omega_2 := T^C_{j_1}(\omega_1)$. On the other hand, if in $\sigma_{R_{l,m}}$ there are no plus corners surrounded by minuses, then we call the next configuration in the path $\omega'_1$ and we define it as $\omega'_1 := T(\omega_1)$, i.e., all the spins in $\sigma_{\Lambda \setminus R_{l,m}}$ switch sign. After this step, $\omega'_1$ has a plus corner surrounded by two minuses, so we can proceed as above and define $\omega_2 := T^C_{j_1}(\omega'_1)$. Note that in $\omega_2$ there are two plus corners in the rectangle that are nearest neighbors of $j_1$. For the next step, the plus corner, say in $j_2$, that is contained in a side of length $l$, switches sign, i.e., $\omega_3 := T^C_{j_2}(\omega_2)$. By iterating this step, a full slice of the droplet is erased and we obtain the configuration $\eta \equiv \omega_l$ such that $\eta_{R_{l,m-1}} = +1$ and $\eta_{\Lambda \setminus R_{l,m-1}} = c$. In order to determine where the maximum of the transition energy is attained, we rewrite the energy difference as in (5.13). Using (5.14), we obtain the same result as in (5.15). Hence,
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \sum_{m=1}^{l-2} \big( H(\omega_{m+1}) - H(\omega_m) \big) + \Delta(\omega_{l-1}, \omega_l) = 2h(l-2) + 2h = 2h(l-1) := V^*_\sigma. \tag{5.24}
\]
Since $V^*_\sigma$ depends only on the length $l$, we find $V^*_{A_3} = \max_{\sigma \in A_3} V^*_\sigma$ by taking the maximum over $l$. Since $l < \lambda$, we have
\[
V^*_{A_3} < 2(2-h). \tag{5.25}
\]
Finally, let us check that $\omega_l \in I_\sigma \cap (A_3 \cup \{c\})$. Using (5.14), (5.24) and [19, Tab. 1], we get
\[
H(\omega_1) + 2h(l-1) = H(\omega_l) + 2(2-h). \tag{5.26}
\]
The rectangle $R_{l,m}$ is subcritical if and only if $l < 2/h$, and so
\[
H(\omega_1) - H(\omega_l) = 4 - 2hl > 0, \tag{5.27}
\]
which concludes the proof for $A_3$.

Case $A_4$. For any configuration $\sigma \in A_4$ we construct a path that begins in $\sigma$ and ends in a configuration in $A_4 \cup \{+1\}$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap (A_4 \cup \{+1\}))$. We now fix $\sigma \equiv \omega_1 \in A_4$ and we begin by defining $\omega_2$. Pick any site $j_1 \in R_{l,m}$ in one of the sides of length $l$, such that its nearest neighbor $j_2 \in \Lambda \setminus R_{l,m}$ is such that $\sigma(j_2) = +1$. We define $\omega_2 := T^F_{j_2}(\omega_1)$, i.e., $\sigma(j_2)$ is kept fixed and all the spins in $\sigma_{\Lambda \setminus R_{l,m}}$ switch sign. We define $\omega_3 := T(\omega_2)$, $\omega_4 := T(\omega_3) = T^2(\omega_2)$ and so on until a new slice is filled with $+1$. We obtain the configuration $\eta$ such that $\eta_{R_{l,m+1}} = +1$ and $\eta_{\Lambda \setminus R_{l,m+1}} = c$. Note that at the first step of the dynamics either one or two nearest neighbors of $j_2$ in the external side of the rectangle switch sign when $T$ is applied. Analogously, at each subsequent application of $T$, either one or two further sites in the external side of the rectangle switch sign. Therefore, the maximum number of iterations of the map $T$ is $l-1$. In order to determine where the maximum of the transition energy is attained, we rewrite the energy difference as in (5.13). Using (5.14), we obtain the same result as in (5.20). Hence,
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \Delta(\omega_1, \omega_2) = 2(2-h) := V^*_\sigma. \tag{5.28}
\]
Since $V^*_\sigma$ is the same for all configurations in $A_4$, $V^*_{A_4} = \max_{\sigma \in A_4} V^*_\sigma = 2(2-h)$. Finally, let us check that $\omega_l \in I_\sigma \cap (A_4 \cup \{+1\})$. Using (5.14), (5.28) and [19, Tab. 1], we get
\[
H(\omega_1) + 2(2-h) = H(\omega_l) + 2h(l-1). \tag{5.29}
\]
The rectangle $R_{l,m}$ is supercritical if and only if $l > 2/h$, and so
\[
H(\omega_1) - H(\omega_l) = 2hl - 4 > 0, \tag{5.30}
\]
which concludes the proof for $A_4$.

Case $A_5$. For any configuration $\sigma \in A_5$ we construct a path that begins in $\sigma$ and ends in a configuration in $D$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap D)$.
We now fix $\sigma \equiv \omega_1 \in A_5$ and we begin by defining $\omega_2$. We call $j_1$ a corner in $R_{l,m}$ such that (necessarily) $\sigma(j_1) = +1$ and we define $\omega_2 := T^C_{j_1}(\omega_1)$, i.e., $\sigma(j_1)$ switches sign and the signs of all other spins in the rectangle remain fixed. Note that in $\omega_2$ there are two plus corners in the rectangle that are nearest neighbors of $j_1$. For the next step, the plus corner, say in $j_2$, that is contained in a side of length $l$ switches sign, i.e., $\omega_3 := T^C_{j_2}(\omega_2)$. After this, the spin of the nearest neighbor of $j_2$ along the same side of $R_{l,m}$ and different from $j_1$, say in $j_3$, switches sign, i.e., $\omega_4 := T^C_{j_3}(\omega_3)$. By iterating this step, a full slice of the droplet is erased and we obtain the configuration $\omega_l \equiv \eta$ such that $\eta_{R_{l,m-1}} = +1$, $\eta_{R_{l,1}} = c$, $\eta_{\Lambda \setminus R_{l,m}} = -1$. The configuration $\eta$ is a configuration in $D$. In order to determine where the maximum of the transition energy is attained, we rewrite the energy difference as in (5.13). Using (5.14), we obtain the same result as in (5.15). Hence,
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \sum_{m=1}^{l-2} \big( H(\omega_{m+1}) - H(\omega_m) \big) + \Delta(\omega_{l-1}, \omega_l) = 2h(l-2) + 2h = 2h(l-1) := V^*_\sigma. \tag{5.31}
\]
Since $V^*_\sigma$ depends only on the length $l$, we find $V^*_{A_5} = \max_{\sigma \in A_5} V^*_\sigma$ by taking the maximum over $l$. Since $l < \lambda$, we have
\[
V^*_{A_5} < 2(2-h). \tag{5.32}
\]
Finally, let us check that $\omega_l \in I_\sigma \cap D$. Using (5.14), (5.31) and [19, Tab. 1], we get
\[
H(\omega_1) + 2h(l-1) = H(\omega_l) + 2(2-h). \tag{5.33}
\]
The rectangle $R_{l,m}$ is subcritical if and only if $l < 2/h$, and so
\[
H(\omega_1) - H(\omega_l) = 4 - 2hl > 0, \tag{5.34}
\]
which concludes the proof for $A_5$.

Case $A_6$. For any configuration $\sigma \in A_6$ we construct a path that begins in $\sigma$ and ends in a configuration in $D$ with lower energy than $\sigma$, i.e., $\omega \in \Theta(\sigma, I_\sigma \cap D)$. We now fix $\sigma \equiv \omega_1 \in A_6$ and we begin by defining $\omega_2$. We call $j_1 \in R_{l,m}$ a site in a side of $R_{l,m}$, and note that (necessarily) $\sigma(j_1) = +1$. Without loss of generality, we choose a side of length $l$. Furthermore, we call $j_2 \in \Lambda \setminus R_{l,m}$ the nearest neighbor of $j_1$ contained in the external side with length $l$ such that (necessarily) $\sigma(j_2) = -1$. We define $\omega_2 := T^C_{j_2}(\omega_1)$, i.e., $\sigma(j_2)$ switches sign and the signs of all other spins in $\sigma_{\Lambda \setminus R_{l,m}}$ remain fixed. We define $\omega_3 := T(\omega_2)$, $\omega_4 := T(\omega_3) = T^2(\omega_2)$ and so on until a new slice is filled with $c$, so we obtain the configuration $\eta$ such that $\eta_{R_{l,m}} = +1$, $\eta_{R_{l,1}} = c$ and $\eta_{\Lambda \setminus R_{l,m+1}} = -1$. Note that at the first step of the dynamics either one or two nearest neighbors of $j_2$ in the external side of the rectangle switch sign when $T$ is applied. Analogously, at each subsequent application of $T$, either one or two further sites in the external side of the rectangle switch sign. Therefore, the maximum number of iterations of the map $T$ is $l-1$. The configuration $\eta$ is a configuration in $D$. In order to determine where the maximum of the transition energy is attained, we rewrite the energy difference as in (5.13). Using (5.14), we obtain the same result as in (5.20). Hence,
\[
\max_{\omega_k, \omega_{k+1} \in \omega} H(\omega_k, \omega_{k+1}) - H(\omega_1) = \Delta(\omega_1, \omega_2) = 2(2-h) := V^*_\sigma. \tag{5.35}
\]
Since $V^*_\sigma$ is the same for all configurations in $A_6$, $V^*_{A_6} = \max_{\sigma \in A_6} V^*_\sigma = 2(2-h)$. Finally, let us check that $\omega_l \in I_\sigma \cap D$. Using (5.14), (5.35) and [19, Tab. 1], we get
\[
H(\omega_1) + 2(2-h) = H(\omega_l) + 2h(l-1).
\]
(5.36)The rectangle R l,m is supercritical if and only if l > /h , and so H ( ω ) − H ( ω l ) = 2 hl − > , (5.37)which concludes the proof for A . In conclusion, V ∗ A := max i =1 ,..., V ∗ A i = 2(2 − h ) (5.38)Next we consider the set B . 30 ase B . For every conﬁguration in B , both rectangles are subcritical. Following a path thatchanges a slice of +1 into a slice of c , analogously as was done for A , we get a conﬁguration in I σ ∩ ( B ∪ A ) . We have V ∗ B = V ∗ A < − h ) . (5.39) Case B . For every conﬁguration in B , both rectangles are supercritical. Following a paththat adds a slice of c , analogously as was done for A , we get a conﬁguration in I σ ∩ ( B ∪ A ) .We have V ∗ B = V ∗ A = 2(2 − h ) . (5.40) Case B . For every conﬁguration in B , the external rectangle is supercritical and the internalrectangle is subcritical. Following a path that adds a slice of c , analogously as was done for A ,we get a conﬁguration in I σ ∩ ( B ∪ A ) . We have V ∗ B = V ∗ A = 2(2 − h ) . (5.41)We conclude that V ∗ B = max { V ∗ B , V ∗ B , V ∗ B } = V ∗ A . Next we consider the set D . Case D . For every conﬁguration σ in D , all rectangles are subcritical and non-interacting.If σ contains at least one rectangle of +1 surrounded by c , we take our path to be the path thatcuts a slice of +1 , analogously as was done for A . We get a conﬁguration in I σ ∩ D . Otherwise,if σ contains at least one rectangle of +1 surrounded by − , we take our path to be the path thatchanges a slice of +1 into a slice of c , analogously as was done for A . We get a conﬁguration in I σ ∩ D . Finally, we consider all remaining conﬁgurations, namely chessboard rectangles in a seaof minus. We take our path to be the path that cuts a slice of c , analogous to the one describedin A . We get a conﬁguration in I σ ∩ ( D ∪ A ) . So, we have V ∗ D = max { V ∗ A , V ∗ A , V ∗ A } < − h ) . (5.42) Case D . For every conﬁguration σ in D , there exists at least one supercritical rectangle. 
Ifthis is a chessboard rectangle, then we take the path that makes the rectangle grow a slice of c ,analogously as was done for A . We get a conﬁguration in I σ ∩ ( A ∪ A ∪ D ∪ D ∪ D ∪ E ∪{ c } ) .Otherwise, if this supercritical rectangle contains +1 , we take the path that makes the rectanglegrow a slice of c , analogously as was done for A . We get a conﬁguration in I σ ∩ ( D ∪ D ∪ D ) .So, we have V ∗ D = max { V ∗ A , V ∗ A } = 2(2 − h ) . (5.43) Case D . For every conﬁguration σ in D , all rectangles are subcritical and non-interacting.If σ contains at least one rectangle of +1 surrounded by c , we take our path to be the path thatcuts a slice of +1 , analogously as was done for A . We get a conﬁguration in I σ ∩ D . Otherwise,if σ contains at least one rectangle of +1 at lattice distance one from a rectangle of c , we take thepath that changes a slice of +1 into a slice of c along the interface between the two rectangles,analogously as was done for A . We get a conﬁguration in I σ ∩ ( A ∪ D ∪ D ) . In the remainingcases, σ contains at least two rectangles of diﬀerent chessboard parity at lattice distance one.We take our path to be a path that changes a slice of c , analogously as was done for A . We geta conﬁguration in I σ ∩ ( A ∪ D ∪ D ) . So, we have V ∗ D = max { V ∗ A , V ∗ A } < − h ) . (5.44)31 ase D . For every conﬁguration σ in D , all rectangles of +1 surrounded by c are subcriticaland non-interacting. We take our path to be a path that cuts a slice of +1 , analogously as wasdone for A . We get a conﬁguration in I σ ∩ ( D ∪ A ) . So, we have V ∗ D = V ∗ A < − h ) . (5.45) Case D . For every conﬁguration σ in D , there exists at least a supercritical rectangle of +1 surrounded c . We consider this rectangle and we take the path that makes the rectangle grow aslice of +1 , analogously as was done for A . We get a conﬁguration in I σ ∩ ( D ∪ A ∪ E ) . So,we have V ∗ D = V ∗ A = 2(2 − h ) . 
(5.46)In conclusion, V ∗ D = max { V ∗ D , V ∗ D , V ∗ D , V ∗ D , V ∗ D } = V ∗ A . The last set E is composed of strips. Case E . A conﬁguration σ ≡ ω in E has at least a strip of c of width one. Pick a site j in the strip such that σ ( j ) = − and deﬁne ω = T Fj ( ω ) , i.e., σ ( j ) is kept ﬁxed. The energydiﬀerence is H ( ω ) − H ( ω ) = 2 h [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) andso on until we obtain a conﬁguration in I σ ∩ ( E ∪ D ∪ D ∪ D ∪ A ∪ A ∪ A ∪ A ∪ B ∪ {− } ) .So, we have V ∗ E = 2 h. (5.47) Case E . A conﬁguration σ ≡ ω in E contains at least a strip of +1 of width one. Let σ ( j ) be a plus in the strip surrounded by one or two minuses. We deﬁne ω = T Cj ( ω ) , i.e., σ ( j ) switches sign. The maximum energy diﬀerence is H ( ω ) − H ( ω ) = 2(2 − h ) [19, Tab.1].We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so on until we obtain a conﬁguration in I σ ∩ ( E ∪ E ∪ { c } ) . So, we have V ∗ E = max { V ∗ E , − h ) } = 2(2 − h ) . (5.48) Case E . A conﬁguration σ ≡ ω in E has at least a strip of +1 of width one. If in σ there is a strip of +1 surrounded by two chessboards with the same parity, then pick a plus σ ( j ) in the strip and deﬁne ω = T Cj ( ω ) , i.e., σ ( j ) switches sign. The energy diﬀerence is H ( ω ) − H ( ω ) = 2 h [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so on untilwe obtain a conﬁguration in I σ ∩ ( E ∪ E ) . Instead, if in σ there is a strip of +1 surrounded bytwo chessboards with diﬀerent parity, then pick a plus σ ( j ) in a chessboard at lattice distanceone from the strip and deﬁne ω = T Fj ( ω ) , i.e., σ ( j ) is kept ﬁxed. The energy diﬀerence is H ( ω ) − H ( ω ) = 2(2 − h ) [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so onuntil we obtain a conﬁguration in I σ ∩ E . So, we have V ∗ E = max { h, − h ) } = 2(2 − h ) . (5.49) Case E . We consider a conﬁguration σ ≡ ω in E and pick a plus on the interface between c and − , and call j the site of this plus. 
We call j the nearest neighbor of j in − and we deﬁne ω = T Cj ( ω ) , i.e., σ ( j ) switches sign. The energy diﬀerence is H ( ω ) − H ( ω ) = 2(2 − h ) [19,Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so on until we obtain a conﬁgurationin I σ ∩ ( E ∪ D ∪ D ∪ E ∪ { c } ) . So, we have V ∗ E = 2(2 − h ) . (5.50)32 ase E . We consider a conﬁguration σ ≡ ω in E and pick a plus in c on the interfacebetween c and +1 , and call j the site of this plus. We deﬁne ω = T Fj ( ω ) , i.e., σ ( j ) is keptﬁxed. The energy diﬀerence is H ( ω ) − H ( ω ) = 2(2 − h ) [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so on until we obtain a conﬁguration in I σ ∩ ( E ∪ { +1 } ) . So, wehave V ∗ E = 2(2 − h ) . (5.51) Case E . We consider a conﬁguration σ ≡ ω in E and pick a minus on the interface between − and +1 , and call j the site of this minus. We deﬁne ω = T Cj ( ω ) , i.e., σ ( j ) switchessign. The energy diﬀerence is H ( ω ) − H ( ω ) = 2(2 − h ) [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and so on until we obtain a conﬁguration in I σ ∩ E . So, we have V ∗ E = 2(2 − h ) . (5.52) Case E . If the conﬁguration σ ≡ ω in E contains a strip of − adjacent to a strip of +1 and both have width greater then one, then we pick a minus one on the interface between − and +1 and we take a path analogously as was done for E . We get a conﬁguration in I σ ∩ ( E ∪ E ) . Otherwise, E contains a strip of c adjacent to a strip of − , both with widthgreater then one. Then, we pick a plus one, say in j , in the strip of c . We call j the nearestneighbor of j in − and we deﬁne ω = T Cj ( ω ) , i.e., σ ( j ) switches sign. The energy diﬀerenceis H ( ω ) − H ( ω ) = 2(2 − h ) [19, Tab.1]. We deﬁne ω := T ( ω ) , ω := T ( ω ) = T ( ω ) and soon until we obtain a conﬁguration in I σ ∩ ( E ∪ E ) . So, we have V ∗ E = max { V ∗ E , − h ) } = 2(2 − h ) . (5.53)Then V ∗ E = max { V ∗ E , V ∗ E , V ∗ E , V ∗ E , V ∗ E , V ∗ E , V ∗ E } = V ∗ A . 
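The two barrier values that recur in the cases above, $2h(l-1)$ for erasing a slice of a subcritical rectangle and $2(2-h)$ for adding a slice to a supercritical one, can be checked numerically against the final comparison with $\Gamma_{\mathrm{PCA}}$. The following is only a sketch under stated assumptions: the critical length is taken to be $\lambda = \lfloor 2/h \rfloor + 1$ with $2/h$ not an integer, and $\Gamma_{\mathrm{PCA}} = -2h\lambda^2 + 2\lambda(4+h) - 2h$; neither formula is spelled out in this section, and all function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def critical_length(h):
    """Assumed critical length: lambda = floor(2/h) + 1 (2/h not an integer)."""
    return floor(2 / h) + 1


def shrink_barrier(h, l):
    """Barrier 2h(l-1) met while erasing a slice of length l (subcritical cases)."""
    return 2 * h * (l - 1)


def grow_barrier(h):
    """Barrier 2(2-h) met when a new slice is started (supercritical cases)."""
    return 2 * (2 - h)


def gamma_pca(h):
    """Assumed expression for Gamma_PCA; not taken from this section."""
    lam = critical_length(h)
    return -2 * h * lam**2 + 2 * lam * (4 + h) - 2 * h


def check(h):
    """True if every subcritical barrier is below 2(2-h) and 2(2-h) < Gamma_PCA."""
    lam = critical_length(h)
    v_star = grow_barrier(h)
    subcritical_ok = all(shrink_barrier(h, l) < v_star for l in range(1, lam))
    return subcritical_ok and gamma_pca(h) > v_star


if __name__ == "__main__":
    for h in (Fraction(3, 10), Fraction(7, 10)):
        print(h, critical_length(h), grow_barrier(h), gamma_pca(h), check(h))
```

Exact rational arithmetic (`Fraction`) is used so that the strict inequalities are not affected by floating-point rounding.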
To conclude the proof, we compare the value of $V^* = \max\{V^*_A, V^*_B, V^*_D, V^*_E\} = 2(2-h)$ with $\Gamma_{\mathrm{PCA}}$, and we get
\[
\Gamma_{\mathrm{PCA}} \equiv -2h\lambda^2 + 2\lambda(4+h) - 2h > 2(2-h) = V^*. \tag{5.54}
\]

A Appendix

In this Appendix, we recall some results and give explicit computations that are used in the paper. Equation (3.2) is obtained as follows:
\begin{align*}
-\lim_{\beta \to \infty} \frac{\log p(\sigma, \eta)}{\beta}
&= -\lim_{\beta \to \infty} \frac{\log \big( \prod_{i \in \Lambda} p_{i,\sigma}(\eta(i)) \big)}{\beta} \\
&= -\lim_{\beta \to \infty} \frac{\sum_{i \in \Lambda} \log \Big( \frac{1}{1 + \exp\{-2\beta \eta(i)(S_\sigma(i) + h)\}} \Big)}{\beta} \\
&= \lim_{\beta \to \infty} \frac{\sum_{i \in \Lambda} \log \big( 1 + \exp\{-2\beta \eta(i)(S_\sigma(i) + h)\} \big)}{\beta} \\
&= \lim_{\beta \to \infty} \frac{\sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i)+h) > 0} \log \big( 1 + \exp\{-2\beta \eta(i)(S_\sigma(i) + h)\} \big)}{\beta} \\
&\quad + \lim_{\beta \to \infty} \frac{\sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i)+h) < 0} \log \big( 1 + \exp\{-2\beta \eta(i)(S_\sigma(i) + h)\} \big)}{\beta} \\
&= \sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i)+h) < 0} \lim_{\beta \to \infty} \frac{\log \big( 1 + \exp\{2\beta |S_\sigma(i) + h|\} \big)}{\beta} \\
&= \sum_{i \in \Lambda :\, \eta(i)(S_\sigma(i)+h) < 0} 2 |S_\sigma(i) + h|.
\end{align*}

Definition A.1. [20, Definition 2.1] (Energy landscape) An energy landscape is a quadruplet $(\mathcal{X}, Q, H, \Delta)$ where the finite non-empty sets $\mathcal{X}$, $Q \subset \mathcal{X} \times \mathcal{X}$, and the maps $H : \mathcal{X} \to \mathbb{R}$, $\Delta : Q \to \mathbb{R}_+$ are called respectively state space, connectivity relation, energy, and energy cost, and for any $\sigma, \eta \in \mathcal{X}$ there exist an integer $n \geq 2$ and $\omega_1, \dots, \omega_n \in \mathcal{X}$ such that $\omega_1 = \sigma$, $\omega_n = \eta$, and $(\omega_i, \omega_{i+1}) \in Q$ for any $i = 1, \dots, n-1$. An energy landscape $(\mathcal{X}, Q, H, \Delta)$ is called reversible if and only if $Q$ is symmetric, that is, if $(\sigma, \eta) \in Q$ then $(\eta, \sigma) \in Q$, and $H(\sigma) + \Delta(\sigma, \eta) = \Delta(\eta, \sigma) + H(\eta)$ for all $(\sigma, \eta) \in Q$.

Theorem A.2. [20, Theorem 2.4] Consider a reversible energy landscape $(\mathcal{X}, Q, H, \Delta)$. Let $\mathcal{X}_s$ be the set of stable states and assume $\mathcal{X} \setminus \mathcal{X}_s \neq \emptyset$. If there exist $A \subset \mathcal{X} \setminus \mathcal{X}_s$ and $a \in \mathbb{R}_+$ such that
1. $\Phi(\sigma, \mathcal{X}_s) - H(\sigma) = a$ for all $\sigma \in A$;
2. either $\mathcal{X} \setminus (A \cup \mathcal{X}_s) = \emptyset$ or $V_\sigma < a$ for all $\sigma \in \mathcal{X} \setminus (A \cup \mathcal{X}_s)$;
then $\Gamma_m = a$ and $\mathcal{X}_m = A$.

Proposition A.3. [19, Proposition 3.1] A configuration $\sigma \in S_{-1}$ is stable for PCA if and only if $\sigma(x) = +1$ for all sites $x$ inside a collection of pairwise non-interacting rectangles of minimal side length $l \geq 2$ and $\sigma(x) = -1$ elsewhere.
A conﬁguration σ ∈ S +1 is stable if and only if σ = +1 . There is no stable conﬁguration σ ∈ S c . Proposition A.4. [19, Proposition 3.3]i) For any σ ∈ S +1 \ { +1 } , the pair ( σ, T σ ) is not a stable pair.ii) Given C ∈ { c o , c e } and σ ∈ S c the pair ( σ, T σ ) is a stable pair if and only if there exist k ≥ pairwise non-interacting rectangles R l ,m , R l ,m , ..., R l k ,m k such that ≤ l i ≤ m i ≤ L − for any i = 1 , ..., k , σ R = +1 R ( σ coincides with +1 inside the rectangles) and σ Λ \R = c Λ \R ( σ coincides with the chessboard C outside the rectangles), where R = (cid:83) ki =1 R l i ,m i .iii) Given σ ∈ S − the pair ( σ, T σ ) is a stable pair if and only if there exist k ≥ rectangles R l ,m , R l ,m , ..., R l k ,m k with ≤ l i ≤ m i ≤ L − for any i = 1 , ..., k , and there exists aninteger s ∈ { , ..., k } such that the following conditions are fulﬁlled:1. R l i ,m i ∩ R l j ,m j = ∅ and l i ≥ for any i, j ∈ { , ..., k } ;2. or any j ∈ { , ..., s } the family R l j ,m j , R l s +1 ,m s +1 , ..., R l k ,m k is a family of pairwisenon-interacting rectangles;3. σ Λ \R = − Λ \R ( σ coincides with − outside the rectangles); . for any j ∈ { s + 1 , ..., k }

5. 5.1. R (cid:48) l (cid:48) i ,m (cid:48) i ⊂ R l j ,m j for any i ∈ { , .., k (cid:48) } ;5.2. for any j = 1 , ..., s the family { R (cid:48) l (cid:48) i ,m (cid:48) i : i = 1 , ..., k (cid:48) } (recall R (cid:48) l (cid:48) i ,m (cid:48) i = R (cid:48) l (cid:48) i ,m (cid:48) i ( j ) forany i = 1 , ..., k (cid:48) = k (cid:48) ( j ) is a family of pairwise non-interacting rectangles;5.3. σ R (cid:48) = +1 R (cid:48) where R (cid:48) ≡ R (cid:48) ( j ) := (cid:84) ki =1 R (cid:48) l (cid:48) i ,m (cid:48) i ;5.4. either σ R lj,mj \R (cid:48) = C olj,mj \R (cid:48) or σ R lj,mj \R (cid:48) = C elj,mj \R (cid:48) ;6. or any i, j ∈ { , ..., s } the two rectangles R l j ,m j R l i ,m i must be non-interacting if σ R lj,mj \R (cid:48) ( j ) = σ R lj,mj \R (cid:48) ( i ) B Appendix

In this section we prove the theorems stated in Section 2.4.

Proof of Theorem 2.8.

Recall the equivalence relation introduced above Theorem 3.6 in [20]: for $x, y \in \mathcal{X}$,
\[
x \sim y \quad \text{if and only if} \quad \Phi(x, y) - H(x) < \Gamma_m \ \text{ and } \ \Phi(y, x) - H(y) < \Gamma_m. \tag{B.1}
\]
The configurations $x_1^1, \dots, x_1^n$ are in the same equivalence class. Thus, the theorem follows immediately from Condition 2.4, (2.41), and [20, Theorem 3.6].

Before giving the proof of Theorem 2.9, we state two useful lemmas. In the first of the two lemmas we collect two bounds on the energy cost to go from any state $x \neq x_1^r$ to $x_1^r$ or to $x_0$, for $r = 1, \dots, n$. The second lemma is similar.

Lemma B.1.

Assume Condition 2.4 is satisfied. For any $x \in \mathcal{X}$ with $x \neq x_1^r$ for every $r = 1, \dots, n$, if $H(x) \leq H(x_1^r)$, then
\[
\Phi(x, x_0) - H(x) < \Gamma_m \quad \text{and} \quad \Phi(x, x_1^r) - H(x_1^r) \geq \Gamma_m \quad \text{for every } r = 1, \dots, n. \tag{B.2}
\]

Proof.

Let us prove the first inequality. By Theorem 2.3 in [20] we have that $\Phi(x, x_0) \leq \Gamma_m + H(x)$. If, by contradiction, $\Phi(x, x_0) = \Gamma_m + H(x)$ then, by the same Theorem 2.3 in [20], $x \in \mathcal{X}_m$, which contradicts Condition 2.4. Next we turn to the proof of the second inequality and we distinguish two cases. If $H(x) < H(x_1^r)$, then we have that $x \in I_{x_1^r}$. By (2.3) and by (2.13), we get
\[
\Phi(x_1^r, x) \geq \Phi(x_1^r, I_{x_1^r}) = \Gamma_m + H(x_1^r),
\]
which proves the inequality. If $H(x) = H(x_1^r)$, then let us define the set
\[
C := \{ y \in \mathcal{X} : \Phi(y, x_1^r) < H(x_1^r) + \Gamma_m \}.
\]
We will show that $x \notin C$. Since $H(x) = H(x_1^r)$, the identity $I_x = I_{x_1^r}$ follows. Furthermore, since $x_1^r \in \mathcal{X}_m$, we have $C \cap I_{x_1^r} = \emptyset$; hence, $C \cap I_x = \emptyset$ as well. Moreover, if $x \in C$ then $V_x = \Phi(x, I_x) - H(x) \geq H(x_1^r) + \Gamma_m - H(x) = \Gamma_m$. By (2.3), $x$ would be a metastable state, in contradiction with Condition 2.4. Hence, since $x \notin C$, we have that $\Phi(x, x_1^r) \geq \Gamma_m + H(x_1^r)$. This proves the inequality for every $r = 1, \dots, n$.

Lemma B.2. Assume Condition 2.4 is satisfied. For any $x \in \mathcal{X}$ with $x \notin \{x_2, x_1^1, \dots, x_1^n, x_0\}$, if $H(x) \leq H(x_2)$, then
\[
\Phi(x, \{x_1^1, \dots, x_1^n, x_0\}) - H(x) < \Gamma_m \quad \text{and} \quad \Phi(x, x_2) - H(x_2) \geq \Gamma_m. \tag{B.3}
\]

Proof.

Let us prove the first inequality. By Theorem 2.3 in [20] we have $\Phi(x, \{x_1^1, \dots, x_1^n, x_0\}) \leq \Phi(x, x_0) \leq \Gamma_m + H(x)$. We proceed by contradiction and assume that $\Phi(x, x_0) = \Gamma_m + H(x)$. By [20, Theorem 2.3], $x \in \mathcal{X}_m$, which contradicts Condition 2.4. Next we turn to the proof of the second inequality and we distinguish two cases. If $H(x) < H(x_2)$, then we have that $x \in I_{x_2}$. By the definition (2.3) of metastable state and by (2.13), we get
\[
\Phi(x_2, x) \geq \Phi(x_2, I_{x_2}) = \Gamma_m + H(x_2).
\]
This proves the inequality. If $H(x) = H(x_2)$, then let us define the set
\[
C := \{ y \in \mathcal{X} : \Phi(y, x_2) < H(x_2) + \Gamma_m \}.
\]
We will show that $x \notin C$. Since $H(x) = H(x_2)$, the identity $I_x = I_{x_2}$ follows. Furthermore, since $x_2 \in \mathcal{X}_m$, we have $C \cap I_{x_2} = \emptyset$; hence, $C \cap I_x = \emptyset$ as well. Moreover, if $x \in C$ then $V_x = \Phi(x, I_x) - H(x) \geq H(x_2) + \Gamma_m - H(x) = \Gamma_m$. By (2.3), $x$ would be a metastable state, in contradiction with Condition 2.4. Hence, since $x \notin C$, we have that $\Phi(x, x_2) \geq \Gamma_m + H(x_2)$. This proves the inequality.
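The quantities used in the two lemmas (the communication height $\Phi$, the stability level $V_x = \Phi(x, I_x) - H(x)$ and the maximal stability level $\Gamma_m$) can be computed explicitly on a small landscape. The sketch below is an illustration only, not the PCA of the paper: the seven states, their energies and the Metropolis-type cost $\Delta(x, y) = \max\{H(y) - H(x), 0\}$ are all assumptions chosen so that the landscape is reversible and has one metastable state above two degenerate ones.

```python
import heapq

INF = float("inf")


def communication_height(H, delta, src, targets):
    """Phi(src, targets): min over paths of the max of H(x) + delta[(x, y)]."""
    targets = set(targets)
    best = {src: H[src]}
    heap = [(H[src], src)]
    while heap:
        height, x = heapq.heappop(heap)
        if x in targets:
            return height
        if height > best.get(x, INF):
            continue  # stale heap entry
        for (a, b), cost in delta.items():
            if a != x:
                continue
            new_height = max(height, H[a] + cost)
            if new_height < best.get(b, INF):
                best[b] = new_height
                heapq.heappush(heap, (new_height, b))
    return INF


def stability_level(H, delta, x):
    """V_x = Phi(x, I_x) - H(x), with I_x the states of strictly lower energy."""
    lower = [y for y in H if H[y] < H[x]]
    if not lower:
        return INF
    return communication_height(H, delta, x, lower) - H[x]


# Hypothetical chain x2 - a - x1a - m - x1b - b - x0 (energies are made up).
H = {"x2": 2, "a": 5, "x1a": 1, "m": 2, "x1b": 1, "b": 4, "x0": 0}
chain = ["x2", "a", "x1a", "m", "x1b", "b", "x0"]
delta = {}
for u, v in zip(chain, chain[1:]):
    delta[(u, v)] = max(H[v] - H[u], 0)  # assumed Metropolis-type cost
    delta[(v, u)] = max(H[u] - H[v], 0)

levels = {x: stability_level(H, delta, x) for x in H}
stable = {x for x in H if levels[x] == INF}
gamma_m = max(v for x, v in levels.items() if x not in stable)
metastable = {x for x in H if x not in stable and levels[x] == gamma_m}

if __name__ == "__main__":
    print(levels, gamma_m, sorted(metastable))
```

In this toy landscape the two degenerate states `x1a` and `x1b` are separated by a barrier smaller than $\Gamma_m$, so they fall in the same equivalence class in the sense of (B.1).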

Proof of Theorem 2.9.

We begin by proving Equation (2.46). The proof is based on Lemma B.1 and Lemma B.2. In the proof we only use the representation of the expected mean time in terms of the Green function [11, Corollary 3.3], see also [31, Eq. (4.29)]. Indeed, recalling (2.21) above, we rewrite the expected value in terms of the capacity as
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] = \frac{1}{\mathrm{cap}_\beta(x_2, \{x_1^1, \dots, x_1^n, x_0\})} \sum_{x \in \mathcal{X}} \mu_\beta(x)\, h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x). \tag{B.4}
\]
Since $h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x_2) = 1$, we get the following lower bound:
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] \geq \frac{1}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})}\, \mu_\beta(x_2)\, h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x_2) = \frac{\mu_\beta(x_2)}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})}. \tag{B.5}
\]
In order to give an upper bound, we first use the boundary conditions in (2.20) to rewrite (B.4) as follows:
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] = \frac{1}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus \{x_1^1, \dots, x_1^n, x_0\}, \\ H(x) \leq H(x_2)}} \mu_\beta(x)\, h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x) + \sum_{\substack{x \in \mathcal{X} \setminus \{x_1^1, \dots, x_1^n, x_0\}, \\ H(x) > H(x_2)}} \mu_\beta(x)\, h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x) \Big].
\]
Next we bound $\mu_\beta(x)$ as $\mu_\beta(x) \leq \mu_\beta(x_2) \exp(-\beta\delta)$ for some positive $\delta = \min_x \{H(x) - H(x_2)\}$ and for any $x \in \mathcal{X}$ such that $H(x) > H(x_2)$. We get
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] \simeq \frac{1}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus (\mathcal{X}_m \cup \mathcal{X}_s), \\ H(x) \leq H(x_2)}} \mu_\beta(x)\, h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x) + \mu_\beta(x_2)[1 + o(1)] \Big]. \tag{B.6}
\]
Next we upper bound the equilibrium potential $h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x)$ by applying Proposition B.4 with $x = x$, $Y = \{x_2\}$, $Z = \{x_1^1, \dots, x_1^n, x_0\}$, as
\[
h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x) \leq \frac{\mathrm{cap}(x, x_2)}{\mathrm{cap}(x, \{x_1^1, \dots, x_1^n, x_0\})}.
\]
Furthermore, if $H(x) \leq H(x_2)$ and $x \notin \mathcal{X}_m \cup \mathcal{X}_s$, then
\[
h_{x_2, \{x_1^1, \dots, x_1^n, x_0\}}(x) \leq C\, \frac{e^{-\beta \Phi(x, x_2)}}{e^{-\beta \Phi(x, \{x_1^1, \dots, x_1^n, x_0\})}} \leq C\, \frac{e^{-\beta(\Gamma_m + H(x_2))}}{e^{-\beta(\Gamma_m + H(x) - \delta)}} = C e^{-\beta\delta}\, \frac{\mu_\beta(x_2)}{\mu_\beta(x)},
\]
where $C$, $\delta$ are suitable positive constants.
In the first inequality we used Proposition B.5, in the second we used Lemma B.1 and Lemma B.2. By using (B.6) we get
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] \leq \frac{1}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus (\mathcal{X}_m \cup \mathcal{X}_s), \\ H(x) \leq H(x_2)}} C \mu_\beta(x) e^{-\beta\delta}\, \frac{\mu_\beta(x_2)}{\mu_\beta(x)} + \mu_\beta(x_2)[1 + o(1)] \Big],
\]
which implies
\[
E_{x_2}[\tau_{\{x_1^1, \dots, x_1^n, x_0\}}] \leq \frac{\mu_\beta(x_2)}{\mathrm{cap}(x_2, \{x_1^1, \dots, x_1^n, x_0\})} [1 + o(1)], \tag{B.7}
\]
where we have used that the configuration space is finite. Equation (2.46) finally follows by (B.5) and (B.7).

Next we prove Equation (2.47). Recalling (2.21) above, we rewrite the expected value in terms of the capacity as
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] = \frac{1}{\mathrm{cap}_\beta(\{x_1^1, \dots, x_1^n\}, x_0)} \sum_{x \in \mathcal{X}} \mu_\beta(x)\, h_{\{x_1^1, \dots, x_1^n\}, x_0}(x). \tag{B.8}
\]
Considering the contribution of $x_1^r$ for every $r = 1, \dots, n$ in the sum and observing that $h_{\{x_1^1, \dots, x_1^n\}, x_0}(x_1^q) = 1$ for every $q = 1, \dots, n$, we get the following lower bound:
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] \geq \frac{1}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} \sum_{q=1}^{n} \mu_\beta(x_1^q)\, h_{\{x_1^1, \dots, x_1^n\}, x_0}(x_1^q) = \frac{1}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} \sum_{q=1}^{n} \mu_\beta(x_1^q) = \frac{\mu_\beta(\{x_1^1, \dots, x_1^n\})}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)}, \tag{B.9}
\]
where the last equality follows from the definition of Gibbs measure and $H(x_1^r) = H(x_1^q)$ for every $r, q = 1, \dots, n$. In order to give an upper bound, we first use the boundary conditions in (2.20) to rewrite (B.8) as follows:
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] = \frac{1}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus \{x_0\}, \\ H(x) \leq H(x_1^r)}} \mu_\beta(x)\, h_{\{x_1^1, \dots, x_1^n\}, x_0}(x) + \sum_{\substack{x \in \mathcal{X} \setminus \{x_0\}, \\ H(x) > H(x_1^r)}} \mu_\beta(x)\, h_{\{x_1^1, \dots, x_1^n\}, x_0}(x) \Big]. \tag{B.10}
\]
Next we bound $\mu_\beta(x)$ as $\mu_\beta(x) \leq \mu_\beta(x_1^r) \exp(-\beta\delta)$ for some positive $\delta = \min_x \{H(x) - H(x_1^r)\}$ and for any $x \in \mathcal{X}$ such that $H(x) > H(x_1^r)$. We get
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] = \frac{1}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus \{x_1^1, \dots, x_1^n, x_0\}, \\ H(x) \leq H(x_1^r)}} \mu_\beta(x)\, h_{\{x_1^1, \dots, x_1^n\}, x_0}(x) + n \mu_\beta(x_1^r)[1 + o(1)] \Big]. \tag{B.11}
\]
Next we upper bound the equilibrium potential $h_{\{x_1^1, \dots, x_1^n\}, x_0}(x)$ by applying Proposition B.4 with $x = x$, $Y = \{x_1^1, \dots, x_1^n\}$ and $Z = \{x_0\}$:
\[
h_{\{x_1^1, \dots, x_1^n\}, x_0}(x) \leq \frac{\mathrm{cap}(x, \{x_1^1, \dots, x_1^n\})}{\mathrm{cap}(x, x_0)}.
\]
Furthermore, if $H(x) \leq H(x_1^r)$ and $x \notin \{x_1^1, \dots, x_1^n, x_0\}$, then
\[
h_{\{x_1^1, \dots, x_1^n\}, x_0}(x) \leq C\, \frac{e^{-\beta \Phi(x, \{x_1^1, \dots, x_1^n\})}}{e^{-\beta \Phi(x, x_0)}} \leq C\, \frac{e^{-\beta(\Gamma_m + H(x_1^r))}}{e^{-\beta(\Gamma_m + H(x) - \delta)}} = C e^{-\beta\delta}\, \frac{\mu_\beta(x_1^r)}{\mu_\beta(x)},
\]
where $C$, $\delta$ are suitable positive constants. In the first inequality we used Proposition B.5, in the second we used Lemma B.1 and Lemma B.2. By using (B.11) we get
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] \leq \frac{1}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} \Big[ \sum_{\substack{x \in \mathcal{X} \setminus \{x_1^1, \dots, x_1^n, x_0\}, \\ H(x) \leq H(x_1^r)}} C \mu_\beta(x) e^{-\beta\delta}\, \frac{\mu_\beta(x_1^r)}{\mu_\beta(x)} + n \mu_\beta(x_1^r)[1 + o(1)] \Big],
\]
which implies
\[
E_{\{x_1^1, \dots, x_1^n\}}[\tau_{x_0}] \leq \frac{n \mu_\beta(x_1^r)}{\mathrm{cap}(\{x_1^1, \dots, x_1^n\}, x_0)} [1 + o(1)], \tag{B.12}
\]
where we have used that the configuration space is finite. Equation (2.47) finally follows recalling $n \mu_\beta(x_1^r) = \mu_\beta(\{x_1^1, \dots, x_1^n\})$ and by (B.9) and (B.12).

Next we prove Equation (2.48). Recalling (2.21) above, we rewrite the expected value in terms of the capacity as
\[
E_{x_1^r}[\tau_{x_0}] = \frac{1}{\mathrm{cap}_\beta(x_1^r, x_0)} \sum_{x \in \mathcal{X}} \mu_\beta(x)\, h_{x_1^r, x_0}(x) \quad \text{for every } r = 1, \dots, n. \tag{B.13}
\]
Considering the contribution of every $x_1^r$ in the sum and observing that $h_{x_1^r, x_0}(x_1^r) = 1$ and $h_{x_1^r, x_0}(x_1^q) \simeq 1$ for every $q = 1, \dots, n$, we get the following lower bound:
\[
E_{x_1^r}[\tau_{x_0}] \geq \frac{1}{\mathrm{cap}(x_1^r, x_0)}\, \mu_\beta(x_1^r)\, h_{x_1^r, x_0}(x_1^r) + \sum_{q=1, q \neq r}^{n} \frac{1}{\mathrm{cap}(x_1^r, x_0)}\, \mu_\beta(x_1^q)\, h_{x_1^r, x_0}(x_1^q) \simeq \frac{1}{\mathrm{cap}(x_1^r, x_0)} \sum_{q=1}^{n} \mu_\beta(x_1^q) = \frac{n \mu_\beta(x_1^r)}{\mathrm{cap}(x_1^r, x_0)}, \tag{B.14}
\]
where the last equality follows from the definition of Gibbs measure and $H(x_1^r) = H(x_1^q)$ for every $q = 1, \dots, n$.
In order to give an upper bound, we first use the boundary conditions in (2.20) to rewrite (B.13) as follows:

$$E_{x_1^r}[\tau_{x_0}] = \frac{1}{\mathrm{cap}_\beta(x_1^r,x_0)}\Big[\sum_{\substack{x\in\mathcal{X}\setminus\{x_0\},\\ H(x)\le H(x_1^r)}}\mu_\beta(x)\,h_{x_1^r,x_0}(x) + \sum_{\substack{x\in\mathcal{X}\setminus\{x_0\},\\ H(x)>H(x_1^r)}}\mu_\beta(x)\,h_{x_1^r,x_0}(x)\Big].$$

Next we bound $\mu_\beta(x)$ as $\mu_\beta(x)\le\mu_\beta(x_1^r)\exp(-\beta\delta)$ for some positive $\delta=\min_x\{H(x)-H(x_1^r)\}$ and for any $x\in\mathcal{X}$ such that $H(x)>H(x_1^r)$. Recalling that $h_{x_1^r,x_0}(x_1^r)=1$ and $h_{x_1^r,x_0}(x_1^q)=1+o(1)$ for every $q=1,\dots,n$ with $q\ne r$, we get

$$E_{x_1^r}[\tau_{x_0}] \simeq \frac{1}{\mathrm{cap}_\beta(x_1^r,x_0)}\Big[\sum_{\substack{x\in\mathcal{X}\setminus\{x_1^1,\dots,x_1^n,x_0\},\\ H(x)\le H(x_1^r)}}\mu_\beta(x)\,h_{x_1^r,x_0}(x) + \sum_{q=1}^{n}\mu_\beta(x_1^q)[1+o(1)]\Big]. \tag{B.15}$$

Next we upper bound the equilibrium potential $h_{x_1^r,x_0}(x)$ by applying Proposition B.4 with $x=x$, $Z=\{x_0\}$ and $Y=\{x_1^r\}$ for every $r=1,\dots,n$:

$$h_{x_1^r,x_0}(x) \le \frac{\mathrm{cap}_\beta(x,x_1^r)}{\mathrm{cap}_\beta(x,x_0)}.$$

Furthermore, if $H(x)\le H(x_1^r)$ and $x\ne x_1^q$ for every $q=1,\dots,n$, then

$$h_{x_1^r,x_0}(x) \le C\,\frac{e^{-\beta\Phi(x,x_1^r)}}{e^{-\beta\Phi(x,x_0)}} \le C\,\frac{e^{-\beta(\Gamma_m+H(x_1^r))}}{e^{-\beta(\Gamma_m+H(x)-\delta)}} = C\,e^{-\beta\delta}\,\frac{\mu_\beta(x_1^r)}{\mu_\beta(x)},$$

where $C$, $\delta$ are suitable positive constants. In the first inequality we used Proposition B.5, in the second we used Lemma B.1 and Lemma B.2. By using (B.15) we get

$$E_{x_1^r}[\tau_{x_0}] \le \frac{1}{\mathrm{cap}_\beta(x_1^r,x_0)}\Big[\sum_{\substack{x\in\mathcal{X}\setminus\{x_1^1,\dots,x_1^n,x_0\},\\ H(x)\le H(x_1^r)}} C\,\mu_\beta(x)\,e^{-\beta\delta}\,\frac{\mu_\beta(x_1^r)}{\mu_\beta(x)} + \sum_{q=1}^{n}\mu_\beta(x_1^q)[1+o(1)]\Big],$$

which implies

$$E_{x_1^r}[\tau_{x_0}] \le \frac{\sum_{q=1}^{n}\mu_\beta(x_1^q)}{\mathrm{cap}_\beta(x_1^r,x_0)}\,[1+o(1)] = \frac{n\,\mu_\beta(x_1^r)}{\mathrm{cap}_\beta(x_1^r,x_0)}\,[1+o(1)], \tag{B.16}$$

where we have used that the configuration space is finite and $H(x_1^r)=H(x_1^q)$ for every $q=1,\dots,n$.

Proof of Theorem 2.10 and Theorem 2.11.

The two theorems follow immediately by exploiting Condition 2.6 and applying Theorem 2.9.

The proof of Theorem 2.12 is based on the following lemma.

Lemma B.3.

Given three or more pairwise different states $y, w_1, \dots, w_n, z \in \mathcal{X}$, the following holds:

$$E_y[\tau_z] = E_y[\tau_{\{w_1,\dots,w_n,z\}}] + E_{\{w_1,\dots,w_n\}}[\tau_z]\,P_y(\tau_{\{w_1,\dots,w_n\}} < \tau_z). \tag{B.17}$$

Proof.

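First of all we note that

$$E_y[\tau_z] = E_y[\tau_z \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}}] + E_y[\tau_z \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} \ge \tau_z\}}].$$

We now rewrite the first term as follows:

$$\begin{aligned}
E_y[\tau_z \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}}] &= E_y\big[E_y[\tau_z \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}} \mid \mathcal{F}_{\tau_{\{w_1,\dots,w_n\}}}]\big] \\
&= E_y\big[\mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}}\,(\tau_{\{w_1,\dots,w_n\}} + E_{\{w_1,\dots,w_n\}}[\tau_z])\big] \\
&= E_y[\tau_{\{w_1,\dots,w_n\}} \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}}] + P_y(\tau_{\{w_1,\dots,w_n\}} < \tau_z)\,E_{\{w_1,\dots,w_n\}}[\tau_z],
\end{aligned}$$

where we have used the fact that $\tau_{\{w_1,\dots,w_n\}} = \min\{\tau_{w_1},\dots,\tau_{w_n}\}$ is a stopping time, that $\mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}}$ is measurable with respect to the pre-$\tau_{\{w_1,\dots,w_n\}}$ $\sigma$-algebra $\mathcal{F}_{\tau_{\{w_1,\dots,w_n\}}}$, and the strong Markov property, which gives $E_y[\tau_z \mid \mathcal{F}_{\tau_{\{w_1,\dots,w_n\}}}] = \tau_{\{w_1,\dots,w_n\}} + E_{\{w_1,\dots,w_n\}}[\tau_z]$ on the event $\{\tau_{\{w_1,\dots,w_n\}} \le \tau_z\}$. Since

$$\tau_{\{w_1,\dots,w_n\}} \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} < \tau_z\}} + \tau_z \mathbb{1}_{\{\tau_{\{w_1,\dots,w_n\}} \ge \tau_z\}} = \tau_{\{w_1,\dots,w_n,z\}},$$

(B.17) follows.

Proof of Theorem 2.12.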

By (B.17) we have that

$$E_{x_2}[\tau_{x_0}] = E_{x_2}[\tau_{\{x_1^1,\dots,x_1^n,x_0\}}] + E_{\{x_1^1,\dots,x_1^n\}}[\tau_{x_0}]\,P_{x_2}(\tau_{\{x_1^1,\dots,x_1^n\}} < \tau_{x_0}).$$

By Theorem 2.10 and Condition 2.5 it follows that

$$E_{x_2}[\tau_{x_0}] = e^{\beta\Gamma_m}\left(\frac{1}{k_1} + \frac{1}{k_2}\right)[1+o(1)],$$

which concludes the proof.

Proposition B.4.

Consider the Markov chain defined in Section 2.1. We have that

$$P_x(\tau_Y < \tau_Z) \le \frac{\mathrm{cap}_\beta(x,Y)}{\mathrm{cap}_\beta(x,Z)} \tag{B.18}$$

for any $Y = \{y_1,\dots,y_t\} \subset \mathcal{X}$ with $t \in \mathbb{N}$, $Z = \{z_1,\dots,z_{t'}\} \subset \mathcal{X}$ with $t' \in \mathbb{N}$, $Y \cap Z = \emptyset$, and $x \in \mathcal{X} \setminus (Y \cup Z)$.

Proof. Given $Y, Z \subset \mathcal{X}$ such that $Y \cap Z = \emptyset$ and $x \in \mathcal{X} \setminus (Y \cup Z)$, a renewal argument and the strong Markov property yield

$$\begin{aligned}
P_x(\tau_Y < \tau_Z) &= P_x(\tau_Y < \tau_Z,\ \tau_{Y\cup Z} > \tau_x) + P_x(\tau_Y < \tau_Z,\ \tau_{Y\cup Z} < \tau_x) \\
&= P_x(\tau_Y < \tau_Z \mid \tau_{Y\cup Z} > \tau_x)\,P_x(\tau_{Y\cup Z} > \tau_x) + P_x(\tau_Y < \tau_Z,\ \tau_{Y\cup Z} < \tau_x) \\
&= P_x(\tau_Y < \tau_Z)\,P_x(\tau_{Y\cup Z} > \tau_x) + P_x(\tau_Y < \tau_Z,\ \tau_Y < \tau_x) \\
&= P_x(\tau_Y < \tau_Z)\,P_x(\tau_{Y\cup Z} > \tau_x) + P_x(\tau_Y < \tau_{Z\cup\{x\}}).
\end{aligned}$$

Solving for $P_x(\tau_Y < \tau_Z)$, we obtain

$$P_x(\tau_Y < \tau_Z) = \frac{P_x(\tau_Y < \tau_{Z\cup\{x\}})}{1 - P_x(\tau_{Y\cup Z} > \tau_x)} = \frac{P_x(\tau_Y < \tau_{Z\cup\{x\}})}{P_x(\tau_{Y\cup Z} \le \tau_x)} \le \frac{P_x(\tau_Y < \tau_x)}{P_x(\tau_Z < \tau_x)}.$$

Recalling (2.21), we can rewrite the ratio in terms of a ratio of capacities:

$$\frac{P_x(\tau_Y < \tau_x)}{P_x(\tau_Z < \tau_x)} = \frac{\mathrm{cap}_\beta(x,Y)}{\mathrm{cap}_\beta(x,Z)}.$$

Hence, we get Equation (B.18).
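The renewal identity and the bound (B.18) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses a toy 5-state reversible chain built from symmetric edge weights (not the PCA of this paper), it reads $\tau_x$ in the proof as the first return time to $x$ after time 0, and it takes $\mathrm{cap}(x,A) = \mu(x)\,P_x(\tau_A < \tau_x)$, which may differ from the normalization in (2.21) by a constant factor that cancels in the ratio. The helper names `solve`, `hit_before`, `escape` and `chain_from_conductances` are hypothetical and introduced here only for the example.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def hit_before(P, A, B):
    """Equilibrium potential h(y) = P_y(hit A before B); h = 1 on A, 0 on B."""
    n = len(P)
    free = [i for i in range(n) if i not in A and i not in B]
    # On the free states h is harmonic: (I - P_free) h = P[., A] 1.
    M = [[(1 if i == j else 0) - P[i][j] for j in free] for i in free]
    rhs = [sum(P[i][a] for a in A) for i in free]
    sol = solve(M, rhs) if free else []
    h = [Fraction(0)] * n
    for a in A:
        h[a] = Fraction(1)
    for k, s in enumerate(free):
        h[s] = sol[k]
    return h

def escape(P, x, A):
    """P_x(hit A before returning to x), with x not in A."""
    h = hit_before(P, A, {x})
    return sum(P[x][y] * h[y] for y in range(len(P)))

def chain_from_conductances(n, c):
    """Reversible chain P[i][j] = c_ij / c_i with stationary mu(i) ~ c_i."""
    w = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), cij in c.items():
        w[i][j] = w[j][i] = Fraction(cij)
    tot = [sum(row) for row in w]
    P = [[w[i][j] / tot[i] for j in range(n)] for i in range(n)]
    mu = [t / sum(tot) for t in tot]
    return P, mu

# Toy example (assumed data): 5 states, symmetric edge weights.
P, mu = chain_from_conductances(
    5, {(0, 1): 1, (1, 2): 2, (2, 3): 1, (3, 4): 3, (0, 4): 1, (1, 3): 1}
)
x, Y, Z = 2, {0}, {4}

lhs = hit_before(P, Y, Z)[x]                 # P_x(tau_Y < tau_Z)
cap_xY = mu[x] * escape(P, x, Y)             # assumed normalization of cap
cap_xZ = mu[x] * escape(P, x, Z)
bound = cap_xY / cap_xZ                      # right-hand side of (B.18)

# Renewal identity from the proof: numerator over P_x(tau_{Y u Z} < tau_x).
h_num = hit_before(P, Y, Z | {x})
renewal = sum(P[x][y] * h_num[y] for y in range(5)) / escape(P, x, Y | Z)

print("P_x(tau_Y < tau_Z) =", lhs, " bound =", bound)
```

Exact rational arithmetic is used so that the renewal identity can be compared with equality rather than up to a floating-point tolerance; the inequality (B.18) itself does not depend on that choice.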

Proposition B.5. [8, Lemma 3.1.1] Consider the Markov chain defined in Section 2.1. For every pair of non-empty disjoint sets $Y, Z \subset \mathcal{X}$ there exist constants $0 < C_1 < C_2 < \infty$ such that

$$C_1 \le e^{\beta\Phi(Y,Z)}\,Z_\beta\,\mathrm{cap}_\beta(Y,Z) \le C_2, \tag{B.19}$$

for all $\beta$ large enough.

References

[1] G. Ben Arous and R. Cerf. Metastability of the three dimensional Ising model on a torus at very low temperatures. Electronic Journal of Probability, 1, 1996.
[2] K. Bashiri. A note on the metastability in three modifications of the standard Ising model. arXiv preprint arXiv:1705.07012, 2017.
[3] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains. Journal of Statistical Physics, 140(6):1065–1114, 2010.
[4] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains II, the nonreversible case. Journal of Statistical Physics, 149(4):598–618, 2012.
[5] A. Bianchi and A. Gaudillière. Metastable states, quasi-stationary distributions and soft measures. Stochastic Processes and their Applications, 126(6):1622–1680, 2016.
[6] S. Bigelis, E. N. Cirillo, J. L. Lebowitz, and E. R. Speer. Critical droplets in metastable states of probabilistic cellular automata. Physical Review E, 59(4):3935, 1999.
[7] A. Bovier and F. den Hollander. Metastability: a potential-theoretic approach, volume 351. Springer, 2016.
[8] A. Bovier, F. den Hollander, and F. R. Nardi. Sharp asymptotics for Kawasaki dynamics on a finite box with open boundary. Probability Theory and Related Fields, 135(2):265–310, 2006.
[9] A. Bovier, F. den Hollander, C. Spitoni, et al. Homogeneous nucleation for Glauber and Kawasaki dynamics in large volumes at low temperatures. The Annals of Probability, 38(2):661–713, 2010.
[10] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein. Metastability and low lying spectra in reversible Markov chains. Communications in Mathematical Physics, 228(2):219–255, 2002.
[11] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein. Metastability in reversible diffusion processes I. Sharp asymptotics for capacities and exit times. 2004.
[12] A. Bovier and F. Manzo. Metastability in Glauber dynamics in the low-temperature limit: beyond exponential asymptotics. Journal of Statistical Physics, 107(3-4):757–779, 2002.
[13] M. Cassandro, A. Galves, E. Olivieri, and M. E. Vares. Metastable behavior of stochastic dynamics: a pathwise approach. Journal of Statistical Physics, 35(5-6):603–634, 1984.
[14] O. Catoni. Simulated annealing algorithms and Markov chains with rare transitions. In Séminaire de probabilités XXXIII, pages 69–119. Springer, 1999.
[15] O. Catoni and R. Cerf. The exit path of a Markov chain with rare transitions. ESAIM: Probability and Statistics, 1:95–144, 1997.
[16] O. Catoni and A. Trouvé. Parallel annealing by multiple trials: a mathematical study. Simulated annealing, pages 129–143, 1992.
[17] R. Cerf and F. Manzo. Nucleation and growth for the Ising model in d dimensions at very low temperatures. The Annals of Probability, 41(6):3697–3785, 2013.
[18] E. N. M. Cirillo and J. L. Lebowitz. Metastability in the two-dimensional Ising model with free boundary conditions. Journal of Statistical Physics, 90(1-2):211–226, 1998.
[19] E. N. M. Cirillo and F. R. Nardi. Metastability for a stochastic dynamics with a parallel heat bath updating rule. Journal of Statistical Physics, 110(1-2):183–217, 2003.
[20] E. N. M. Cirillo and F. R. Nardi. Relaxation height in energy landscapes: an application to multiple metastable states. Journal of Statistical Physics, 150(6):1080–1114, 2013.
[21] E. N. M. Cirillo, F. R. Nardi, and J. Sohier. Metastability for general dynamics with rare transitions: escape time and critical configurations. Journal of Statistical Physics, 161(2):365–403, 2015.
[22] E. N. M. Cirillo, F. R. Nardi, and C. Spitoni. Competitive nucleation in reversible Probabilistic Cellular Automata. Physical Review E, 78(4):040601, 2008.
[23] E. N. M. Cirillo, F. R. Nardi, and C. Spitoni. Metastability for reversible Probabilistic Cellular Automata with self-interaction. Journal of Statistical Physics, 132(3):431–471, 2008.
[24] E. N. M. Cirillo, F. R. Nardi, and C. Spitoni. Sum of exit times in series of metastable states in Probabilistic Cellular Automata. In International Workshop on Cellular Automata and Discrete Complex Systems, pages 105–119. Springer, 2016.
[25] E. N. M. Cirillo, F. R. Nardi, and C. Spitoni. Sum of exit times in a series of two metastable states. The European Physical Journal Special Topics, 226(10):2421–2438, 2017.
[26] E. N. M. Cirillo and E. Olivieri. Metastability and nucleation for the Blume-Capel model. Different mechanisms of transition. Journal of Statistical Physics, 83(3-4):473–554, 1996.
[27] P. Dehghanpour and R. H. Schonmann. Metropolis dynamics relaxation via nucleation and growth. Communications in Mathematical Physics, 188(1):89–119, 1997.
[28] F. den Hollander, F. R. Nardi, E. Olivieri, and E. Scoppola. Droplet growth for three-dimensional Kawasaki dynamics. Probability Theory and Related Fields, 125(2):153–194, 2003.
[29] F. den Hollander, F. R. Nardi, and A. Troiani. Metastability for low-temperature Kawasaki dynamics with two types of particles. Electronic Journal of Probability, 17:1–26, 2012.
[30] B. Derrida. Dynamical phase transitions in spin models and automata. Technical report, CEA Centre d'Etudes Nucleaires de Saclay, 1989.
[31] A. Gaudillière. Condenser physics applied to Markov chains. Lecture Notes for the 12th Brazilian School of Probability, 2009.
[32] A. Gaudillière, F. den Hollander, F. R. Nardi, E. Olivieri, and E. Scoppola. Ideal gas approximation for a two-dimensional rarefied gas under Kawasaki dynamics. Stochastic Processes and their Applications, 119(3):737–774, 2009.
[33] A. Gaudillière and C. Landim. A Dirichlet principle for non reversible Markov chains and some recurrence theorems. Probability Theory and Related Fields, 158:55–89, 2014.
[34] A. Gaudillière, P. Milanesi, and M. E. Vares. Asymptotic exponential law for the transition time to equilibrium of the metastable kinetic Ising model with vanishing magnetic field. Journal of Statistical Physics, pages 1–46, 2020.
[35] A. Gaudillière and F. R. Nardi. An upper bound for front propagation velocities inside moving populations. Brazilian Journal of Probability and Statistics, 24(2):256–278, 2010.
[36] A. Gaudillière, E. Olivieri, and E. Scoppola. Nucleation pattern at low temperature for local Kawasaki dynamics in two dimensions. Markov Processes Relat. Fields.
[37] F. den Hollander, E. Olivieri, and E. Scoppola. Metastability and nucleation for conservative dynamics. Journal of Mathematical Physics, 41(3):1424–1498, 2000.
[38] R. Holley and D. Stroock. Simulated annealing via Sobolev inequalities. Communications in Mathematical Physics, 115(4):553–569, 1988.
[39] R. Kotecký and E. Olivieri. Shapes of growing droplets - a model of escape from a metastable phase. Journal of Statistical Physics, 75(3-4):409–506, 1994.
[40] F. Manzo, F. R. Nardi, E. Olivieri, and E. Scoppola. On the essential features of metastability: tunnelling time and critical configurations. Journal of Statistical Physics, 115(1-2):591–642, 2004.
[41] F. Manzo and E. Olivieri. Relaxation patterns for competing metastable states: a nucleation and growth model. In Markov Proc. Relat. Fields, volume 4, pages 549–570, 1998.
[42] F. Manzo and E. Olivieri. Dynamical Blume–Capel model: competing metastable states at infinite volume. Journal of Statistical Physics, 104(5-6):1029–1090, 2001.
[43] F. R. Nardi and E. Olivieri. Low temperature stochastic dynamics for an Ising model with alternating field. In Markov Proc. Relat. Fields, volume 2, pages 117–166, 1996.
[44] F. R. Nardi and C. Spitoni. Sharp asymptotics for stochastic dynamics with parallel updating rule. Journal of Statistical Physics, 146(4):701–718, 2012.
[45] F. R. Nardi, A. Zocca, and S. C. Borst. Hitting time asymptotics for hard-core interactions on grids. Journal of Statistical Physics, 162(2):522–576, 2016.
[46] E. J. Neves and R. H. Schonmann. Critical droplets and metastability for a Glauber dynamics at very low temperatures. Communications in Mathematical Physics, 137(2):209–230, 1991.
[47] E. J. Neves and R. H. Schonmann. Behavior of droplets for a class of Glauber dynamics at very low temperature. Probability Theory and Related Fields, 91(3-4):331–354, 1992.
[48] E. Olivieri and E. Scoppola. Markov chains with exponentially small transition probabilities: first exit problem from a general domain I. The reversible case. Journal of Statistical Physics, 79(3-4):613–647, 1995.
[49] E. Olivieri and E. Scoppola. Markov chains with exponentially small transition probabilities: first exit problem from a general domain. II. The general case. Journal of Statistical Physics, 84(5-6):987–1041, 1996.
[50] E. Olivieri and M. E. Vares. Large deviations and metastability, volume 100. Cambridge University Press, 2005.
[51] O. Penrose and J. L. Lebowitz. Rigorous treatment of metastable states in the Van der Waals-Maxwell theory. Journal of Statistical Physics, 3(2):211–236, 1971.
[52] R. H. Schonmann. Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Communications in Mathematical Physics, 161(1):1–49, 1994.
[53] R. H. Schonmann and S. B. Shlosman. Wulff droplets and the metastable relaxation of kinetic Ising models. Communications in Mathematical Physics, 194(2):389–462, 1998.
[54] E. Scoppola. Metastability for Markov chains: a general procedure based on renormalization group ideas. In Probability and Phase Transition, pages 303–322. Springer, 1994.
[55] A. Trouvé. Rough large deviation estimates for the optimal convergence speed exponent of generalized simulated annealing algorithms. In