Metastability of finite state Markov chains: a recursive procedure to identify slow variables for model reduction
C. LANDIM, T. XU
Abstract.
Consider a sequence $(\eta^N(t) : t \ge 0)$ of continuous-time, irreducible Markov chains evolving on a fixed finite set $E$, indexed by a parameter $N$. Denote by $R_N(\eta,\xi)$ the jump rates of the Markov chain $\eta^N_t$, and assume that for any pair of bonds $(\eta,\xi)$, $(\eta',\xi')$, $\arctan\{R_N(\eta,\xi)/R_N(\eta',\xi')\}$ converges as $N \uparrow \infty$. Under a slightly more restrictive hypothesis (cf. (2.6) below), we present a recursive procedure which provides a sequence of increasing time-scales $\theta^1_N, \dots, \theta^p_N$, $\theta^j_N \ll \theta^{j+1}_N$, and of coarsening partitions $\{\mathcal{E}^j_1, \dots, \mathcal{E}^j_{n_j}, \Delta_j\}$, $1 \le j \le p$, of the set $E$. Let $\phi_j : E \to \{0, 1, \dots, n_j\}$ be the projection defined by $\phi_j(\eta) = \sum_{x=1}^{n_j} x\, \mathbf{1}\{\eta \in \mathcal{E}^j_x\}$. For each $1 \le j \le p$, we prove that the hidden Markov chain $X^j_N(t) = \phi_j(\eta^N(t\theta^j_N))$ converges to a Markov chain on $\{1, \dots, n_j\}$.

1. Introduction
This article has two motivations. On the one hand, the metastable behavior of non-reversible Markovian dynamics has attracted much attention recently [24, 23, 6, 20, 25, 11, 7, 10, 14, 13]. On the other hand, the emergence of large complex networks gives particular importance to the problem of data and model reduction [21, 9, 1]. This issue arises in contexts as diverse as meteorology, genetic networks and protein folding, and is very closely related to the identification of slow variables, a fundamental tool in decreasing the degrees of freedom [30].

Not long ago, Beltrán and one of the authors introduced a general approach to derive the metastable behavior of continuous-time Markov chains, particularly convenient in the presence of several valleys with the same depth [2, 4, 5]. In the context of finite state Markov chains [3], it permits one to identify the slow variables and to reduce the model and the state space.

More precisely, denote by $E$ a finite set, by $\eta^N_t$ a sequence of $E$-valued continuous-time, irreducible Markov chains, and by $\mathcal{E}_1, \dots, \mathcal{E}_n, \Delta$ a partition of the set $E$. Let $\mathcal{E} = \cup_{1 \le x \le n} \mathcal{E}_x$ and let $\phi_{\mathcal{E}} : E \to \{0, 1, \dots, n\}$ be the projection defined by
$$ \phi_{\mathcal{E}}(\eta) = \sum_{x=1}^{n} x\, \mathbf{1}\{\eta \in \mathcal{E}_x\} . $$
In general, $X_N(t) = \phi_{\mathcal{E}}(\eta^N_t)$ is not a Markov chain, but only a hidden Markov chain. We say that $\phi_{\mathcal{E}}$ is a slow variable if there exists a time-scale $\theta_N$ for which the dynamics of $X_N(t\theta_N)$ is asymptotically Markovian.

Key words and phrases. Metastability, Markov chains, slow variables, model reduction.
The set $\Delta$ plays a special role in the partition, separating the sets $\mathcal{E}_1, \dots, \mathcal{E}_n$, called here valleys. The chain remains a negligible amount of time in the set $\Delta$ in the time-scale $\theta_N$ at which the slow variable evolves.

Slow variables provide an efficient mechanism to contract the state space and to reduce the model in complex networks, as they allow one to represent the original evolution through a simple Markov chain $X_N(t)$ which takes values in a much smaller set, without losing the essential features of the dynamics. They may also reveal aspects of the dynamics which are not apparent at first sight.

When the number of sets in the partition is reduced to 2, $n = 2$, and the Markov chain which describes the asymptotic behavior of the slow variable has one absorbing point and one transient point, the chain presents a metastable behavior. In a certain time-scale, it remains for an exponential time on a subset of the state space, after which it jumps to another set where it remains forever. By extension, and maybe inappropriately, we say that the chain $\eta^N_t$ exhibits a metastable behavior among the valleys $\mathcal{E}_1, \dots, \mathcal{E}_n$ in the time-scale $\theta_N$ whenever we prove the existence of a slow variable.

We present in this article a recursive procedure which permits one to determine all slow variables of the chain. It provides a sequence of time-scales $\theta^1_N, \dots, \theta^p_N$ and of partitions $\{\mathcal{E}^j_1, \dots, \mathcal{E}^j_{n_j}, \Delta_j\}$, $1 \le j \le p$, of the set $E$ with the following properties.
• The time-scales are increasing: $\lim_{N\to\infty} \theta^j_N/\theta^{j+1}_N = 0$ for $1 \le j < p$. This relation is represented as $\theta^j_N \ll \theta^{j+1}_N$.
• The partitions are coarser. Each set of the $(j+1)$-th partition is obtained as a union of sets in the $j$-th partition. Thus $n_{j+1} < n_j$ and for each $a$ in $\{1, \dots, n_{j+1}\}$, $\mathcal{E}^{j+1}_a = \cup_{x \in A} \mathcal{E}^j_x$ for some subset $A$ of $\{1, \dots, n_j\}$.
• The sets $\Delta_j$, which separate the valleys, increase: $\Delta_j \subset \Delta_{j+1}$. Actually, $\Delta_{j+1} = \Delta_j \cup \bigcup_{x \in B} \mathcal{E}^j_x$ for some subset $B$ of $\{1, \dots, n_j\}$.
• The projection $\Psi^j_N(\eta) = \sum_{1 \le x \le n_j} x\, \mathbf{1}\{\eta \in \mathcal{E}^j_x\} + N\, \mathbf{1}\{\eta \in \Delta_j\}$ is a slow variable which evolves in the time-scale $\theta^j_N$.

We prove three further properties of the partitions $\{\mathcal{E}^j_1, \dots, \mathcal{E}^j_{n_j}, \Delta_j\}$.
• As mentioned above, the time the chain remains in the set $\Delta_j$ in the time-scale $\theta^j_N$ is negligible. We refer to condition (H3) below for a mathematical formulation of this assertion.
• Starting from any configuration in $\mathcal{E}^j_x$, the chain $\eta^N_t$ attains the set $\cup_{y \neq x} \mathcal{E}^j_y$ at a time which is asymptotically exponential in the time-scale $\theta^j_N$ (cf. Remark 2.4).
• With a probability asymptotically equal to 1, the chain $\eta^N_t$ visits all points of the set $\mathcal{E}^j_x$ before hitting another set $\mathcal{E}^j_y$ of the partition. In the terminology of Freidlin and Wentzell [15], the sets of the first partition, denoted by $\mathcal{E}^1_x$, are cycles while the sets of the following partitions are cycles of cycles.

These results have been proved in [3] for finite state reversible Markovian dynamics. We remove in this article the assumption of reversibility and we simplify some proofs.

In contrast with other approaches [22, 28, 12, 24, 11, 14, 13], we do not describe the tube of typical trajectories in a transition between two valleys, nor do we identify the critical configurations which are visited with high probability in such transitions.
The arguments presented here have been designed for sequences of Markov chains. The examples we have in mind are zero-temperature limits of non-reversible dynamics in a finite state space. It is not clear whether the analysis can be adapted to handle the case of a single fixed dynamics as in [9, 21, 1].

The approach presented in this article is based on a multiscale analysis. The sequence of increasing time-scales is defined in terms of the depth of the different valleys. In this sense, the method is similar to the one proposed by Scoppola in [29], and developed by Olivieri and Scoppola [26, 27], but it does not require the valleys to have exponential depth, nor the jump rates to be expressed in terms of exponentials. Actually, one of its main merits is that it relies on a minimal hypothesis, presented in (2.6) below, which is very easy to check since it is formulated only in terms of the jump rates.

The article is organized as follows. In Section 2 we state the main results. In the following three sections we introduce the tools needed to prove these results, and the proofs are carried out in the last three sections.

2. Notation and main results
Consider a finite set $E$. The elements of $E$ are called configurations and are denoted by the Greek letters $\eta$, $\xi$, $\zeta$. Consider a sequence of continuous-time, $E$-valued, irreducible Markov chains $\{\eta^N_t : t \ge 0\}$. Denote the jump rates of $\eta^N_t$ by $R_N(\eta,\xi)$, and by $\mu_N$ the unique invariant probability measure.

Denote by $D(\mathbb{R}_+, E)$ the space of right-continuous functions $x : \mathbb{R}_+ \to E$ with left-limits endowed with the Skorohod topology, and by $\mathbf{P}_\eta = \mathbf{P}^N_\eta$, $\eta \in E$, the probability measure on the path space $D(\mathbb{R}_+, E)$ induced by the Markov chain $\eta^N_t$ starting from $\eta$. Expectation with respect to $\mathbf{P}_\eta$ is represented by $\mathbf{E}_\eta$.

Denote by $H_A$, $H^+_A$, $A \subset E$, the hitting time and the time of the first return to $A$:
$$ H_A = \inf\{ t > 0 : \eta^N_t \in A \} , \qquad H^+_A = \inf\{ t > \tau_1 : \eta^N_t \in A \} , \tag{2.1} $$
where $\tau_1$ represents the time of the first jump of the chain $\eta^N_t$: $\tau_1 = \inf\{ t > 0 : \eta^N_t \neq \eta^N_0 \}$.

Denote by $\lambda_N(\eta)$, $\eta \in E$, the holding rates of the Markov chain $\eta^N_t$ and by $p_N(\eta,\xi)$, $\eta, \xi \in E$, the jump probabilities, so that $R_N(\eta,\xi) = \lambda_N(\eta)\, p_N(\eta,\xi)$. For two disjoint subsets $A$, $B$ of $E$, denote by $\operatorname{cap}_N(A,B)$ the capacity between $A$ and $B$:
$$ \operatorname{cap}_N(A,B) = \sum_{\eta \in A} \mu_N(\eta)\, \lambda_N(\eta)\, \mathbf{P}_\eta[ H_B < H^+_A ] . \tag{2.2} $$

Consider a partition $\mathcal{E}_1, \dots, \mathcal{E}_n, \Delta$ of the set $E$, which does not depend on the parameter $N$ and such that $n \ge 2$. Fix two sequences of positive real numbers $\alpha_N$, $\theta_N$ such that $\alpha_N \ll \theta_N$, where this notation stands for $\lim_{N\to\infty} \alpha_N/\theta_N = 0$.

Let $\mathcal{E} = \cup_{x \in S} \mathcal{E}_x$, where $S = \{1, \dots, n\}$. Denote by $\{\eta^{\mathcal{E}}_t : t \ge 0\}$ the trace of $\{\eta^N_t : t \ge 0\}$ on $\mathcal{E}$, and by $R^{\mathcal{E}}_N : \mathcal{E} \times \mathcal{E} \to \mathbb{R}_+$ the jump rates of the trace process $\eta^{\mathcal{E}}_t$. We refer to Section 6 of [2] for a definition of the trace process. Denote by $r^{\mathcal{E}}_N(\mathcal{E}_x, \mathcal{E}_y)$ the mean rate at which the trace process jumps from $\mathcal{E}_x$ to $\mathcal{E}_y$:
$$ r^{\mathcal{E}}_N(\mathcal{E}_x, \mathcal{E}_y) = \frac{1}{\mu_N(\mathcal{E}_x)} \sum_{\eta \in \mathcal{E}_x} \mu_N(\eta) \sum_{\xi \in \mathcal{E}_y} R^{\mathcal{E}}_N(\eta,\xi) . \tag{2.3} $$
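As an illustration of the trace process and of the mean rates (2.3), the following minimal Python sketch computes trace jump rates on a small chain by removing states one at a time with the one-point formula $R^F(\eta,\xi) = R(\eta,\xi) + R(\eta,\zeta)\,p(\zeta,\xi)$ recalled in (3.1) below, and recovers the invariant measure from two-point traces as in the proof of Lemma 3.1. The dictionary representation, the function names and the three-state example are assumptions made for this sketch; they do not appear in the article.

```python
from fractions import Fraction

# rates[a][b] is the jump rate from a to b (b != a); every state must be a key.

def remove_state(rates, zeta):
    """Jump rates of the trace on E minus {zeta}, via the one-point formula."""
    lam = sum(rates[zeta].values())  # holding rate of zeta
    traced = {}
    for a, row in rates.items():
        if a == zeta:
            continue
        out = {b: Fraction(r) for b, r in row.items() if b != zeta}
        r_az = row.get(zeta, 0)
        if r_az:
            for b, r_zb in rates[zeta].items():
                if b != a:  # a return to a is not a jump of the trace
                    out[b] = out.get(b, 0) + Fraction(r_az) * r_zb / lam
        traced[a] = out
    return traced

def trace_rates(rates, keep):
    """Remove, one by one, every state outside `keep`."""
    for zeta in [s for s in rates if s not in keep]:
        rates = remove_state(rates, zeta)
    return rates

def invariant_measure(rates):
    """Uses mu(s)/mu(ref) = R^F(ref, s) / R^F(s, ref) for the trace on F = {ref, s}.

    Assumes the chain is irreducible, so both two-point rates are positive.
    """
    states = list(rates)
    ref = states[0]
    weights = {ref: Fraction(1)}
    for s in states[1:]:
        two = trace_rates(rates, {ref, s})
        weights[s] = Fraction(two[ref][s]) / Fraction(two[s][ref])
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def mean_rate(rates, mu, valleys, x, y):
    """Mean rate (2.3) at which the trace on the union of the valleys jumps from x to y."""
    union = set().union(*valleys.values())
    tr = trace_rates(rates, union)
    mass = sum(mu[s] for s in valleys[x])
    flow = sum(mu[s] * tr[s].get(t, 0) for s in valleys[x] for t in valleys[y])
    return flow / mass

# Hypothetical example: two one-point valleys {a}, {b} separated by Delta = {d}.
rates = {"a": {"d": 1}, "b": {"d": 1}, "d": {"a": 2, "b": 2}}
mu = invariant_measure(rates)
valleys = {1: {"a"}, 2: {"b"}}
print(mu, mean_rate(rates, mu, valleys, 1, 2))
```

Exact rational arithmetic is used so that the balance relations can be checked without rounding error; for large state spaces a linear-algebra formulation would be more practical.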
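Assume that for every $x \neq y \in S$,
$$ r_{\mathcal{E}}(x,y) := \lim_{N\to\infty} \theta_N\, r^{\mathcal{E}}_N(\mathcal{E}_x, \mathcal{E}_y) \in \mathbb{R}_+ , \quad \text{and that} \quad \sum_{x \in S} \sum_{y \neq x} r_{\mathcal{E}}(x,y) > 0 . \tag{H1} $$
The symbol := in the first part of the previous displayed equation means that the limit exists, that it is denoted by $r_{\mathcal{E}}(x,y)$, and that it belongs to $\mathbb{R}_+$. This convention is used throughout the article.

Assume that for every $x \in S$ for which $\mathcal{E}_x$ is not a singleton and for all $\eta \neq \xi \in \mathcal{E}_x$,
$$ \liminf_{N\to\infty} \alpha_N\, \frac{\operatorname{cap}_N(\eta,\xi)}{\mu_N(\mathcal{E}_x)} > 0 . \tag{H2} $$

Finally, assume that in the time scale $\theta_N$ the chain remains a negligible amount of time outside the set $\mathcal{E}$: For every $t > 0$,
$$ \lim_{N\to\infty} \max_{\eta \in E} \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s\theta_N} \in \Delta \}\, ds \Big] = 0 . \tag{H3} $$

Denote by $\Psi_N : E \to \{1, \dots, n, N\}$ the projection defined by $\Psi_N(\eta) = x$ if $\eta \in \mathcal{E}_x$, $\Psi_N(\eta) = N$, otherwise:
$$ \Psi_N(\eta) = \sum_{x \in S} x\, \mathbf{1}\{\eta \in \mathcal{E}_x\} + N\, \mathbf{1}\{\eta \in \Delta\} . $$
Recall from [19] the definition of the soft topology.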
Theorem 2.1. Assume that conditions (H1)–(H3) are in force. Fix $x \in S$ and a configuration $\eta \in \mathcal{E}_x$. Starting from $\eta$, the speeded-up, hidden Markov chain $X_N(t) = \Psi_N\big(\eta^N(\theta_N t)\big)$ converges in the soft topology to the continuous-time Markov chain $X_{\mathcal{E}}(t)$ on $\{1, \dots, n\}$ whose jump rates are given by $r_{\mathcal{E}}(x,y)$ and which starts from $x$.

This theorem is a straightforward consequence of known results. We state it here for the sake of completeness and because the whole analysis of the metastable behavior of $\eta^N_t$ relies on it.

Remark 2.2. Theorem 2.1 states that in the time scale $\theta_N$, if we just keep track of the set $\mathcal{E}_x$ where $\eta^N_t$ is and not of the specific location of the chain, we observe an evolution on the set $S$ close to the one of a continuous-time Markov chain which jumps from $x$ to $y$ at rate $r_{\mathcal{E}}(x,y)$.

Remark 2.3. The function $\Psi_N$ represents a slow variable of the chain. Indeed, we will see below that the sequence $\alpha_N^{-1}$ stands for the order of magnitude of the jump rates of the chain. Theorem 2.1 states that on the time scale $\theta_N$, which is much longer than $\alpha_N$, the variable $\Psi_N(\eta^N_t)$ evolves as a Markov chain. In other words, under conditions (H1)–(H3), one still observes a Markovian dynamics after a contraction of the configuration space through the projection $\Psi_N$. Theorem 2.1 therefore provides a mechanism for reducing the degrees of freedom of the system while keeping the essential features of the dynamics, such as the ergodic properties.

Remark 2.4. It also follows from assumptions (H1)–(H3) that the exit time from a set $\mathcal{E}_x$ is asymptotically exponential. More precisely, let $\breve{\mathcal{E}}_x$, $x \in S$, be the union of all sets $\mathcal{E}_y$ except $\mathcal{E}_x$:
$$ \breve{\mathcal{E}}_x = \bigcup_{y \neq x} \mathcal{E}_y . \tag{2.4} $$
For every $x \in S$ and $\eta \in \mathcal{E}_x$, under $\mathbf{P}_\eta$ the distribution of $H_{\breve{\mathcal{E}}_x}/\theta_N$ converges to an exponential distribution.

Remark 2.5. Under the assumptions (H1)–(H3), the sets $\mathcal{E}_x$ are cycles in the sense of [15]. More precisely, for every $x \in S$ for which $\mathcal{E}_x$ is not a singleton, and for all $\eta \neq \xi \in \mathcal{E}_x$,
$$ \lim_{N\to\infty} \mathbf{P}_\eta\big[ H_\xi < H_{\breve{\mathcal{E}}_x} \big] = 1 . $$
This means that starting from $\eta \in \mathcal{E}_x$, the chain visits all configurations in $\mathcal{E}_x$ before hitting the set $\breve{\mathcal{E}}_x$.

The main assumption.
We present in this subsection the main, and only, hypothesis made on the sequence of Markov chains $\eta^N_t$. Fix two configurations $\eta \neq \xi \in E$. We assume that the jump rate from $\eta$ to $\xi$ is either constant equal to 0 or is always strictly positive:
$$ R_N(\eta,\xi) = 0 \ \text{ for all } N \ge 1 \quad \text{or} \quad R_N(\eta,\xi) > 0 \ \text{ for all } N \ge 1 . $$
This assumption permits one to define the set of ordered bonds of $E$, denoted by $B$, as the set of ordered pairs $(\eta,\xi)$ such that $R_N(\eta,\xi) > 0$:
$$ B = \big\{ (\eta,\xi) \in E \times E : \eta \neq \xi ,\ R_N(\eta,\xi) > 0 \big\} . $$
Note that the set $B$ does not depend on $N$.

Our analysis of the metastable behavior of the sequence of Markov chains $\eta^N_t$ relies on the assumption that the set of ordered bonds can be divided into equivalence classes in such a way that all jump rates in the same equivalence class are of the same order, while the ratio between two jump rates in different classes either vanishes in the limit or tends to $+\infty$. Some terminology is necessary to make this notion precise.

Ordered sequences: Consider a set of sequences $(a^r_N : N \ge 1)$ of nonnegative real numbers indexed by some finite set $r \in \mathfrak{R}$. The set $\mathfrak{R}$ is said to be ordered if for all $r \neq s \in \mathfrak{R}$ the sequence $\arctan\{a^r_N/a^s_N\}$ converges as $N \uparrow \infty$.

In the examples below the set $\mathfrak{R}$ will be the set of configurations $E$ or the set of bonds $B$. Let $\mathbb{Z}_+ = \{0, 1, 2, \dots\}$, and let $\mathfrak{A}_m$, $m \ge 1$, be the set of functions $k : B \to \mathbb{Z}_+$ such that $\sum_{(\eta,\xi) \in B} k(\eta,\xi) = m$.

Assumption 2.6. We assume that for every $m \ge 1$ the set of sequences
$$ \Big\{ \prod_{(\eta,\xi) \in B} R_N(\eta,\xi)^{k(\eta,\xi)} : N \ge 1 \Big\} , \quad k \in \mathfrak{A}_m , $$
is ordered.

We assume from now on that the sequence of Markov chains $\eta^N_t$ fulfills Assumption 2.6. In particular, the sequences $\{R_N(\eta,\xi) : N \ge 1\}$, $(\eta,\xi) \in B$, are ordered.
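To make the notion of ordered sequences concrete, here is a small worked example. The specific sequences are an illustration chosen for this note, not taken from the article. Let $a_N = N^{-1}$, $b_N = 2N^{-1}$ and $c_N = N^{-2}$. Then
$$ \arctan\{a_N/b_N\} = \arctan(1/2) , \qquad \arctan\{c_N/a_N\} = \arctan(1/N) \to 0 , \qquad \arctan\{a_N/c_N\} = \arctan(N) \to \pi/2 , $$
so the set $\{a, b, c\}$ is ordered: $a$ and $b$ are of the same order, while $c$ is of strictly smaller order. In contrast, for $d_N = (2 + (-1)^N) N^{-1}$ the ratio $d_N/a_N$ alternates between 1 and 3, the sequence $\arctan\{d_N/a_N\}$ does not converge, and the set $\{a, d\}$ is not ordered.

More generally, if every positive rate has the form $R_N(\eta,\xi) = \kappa(\eta,\xi)\, N^{-\gamma(\eta,\xi)}$ with $\kappa(\eta,\xi) > 0$ and $\gamma(\eta,\xi) \in \mathbb{R}$ (a hypothetical parametrization), any product of $m$ rates is again of the form $\kappa N^{-\gamma}$, and the ratio of two such products is $(\kappa/\kappa')\, N^{\gamma' - \gamma}$, which converges in $[0, +\infty]$. Rates of this form therefore satisfy Assumption 2.6.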
The shallowest valleys, the fastest slow variable. We identify in this subsection the shortest time-scale at which a metastable behavior is observed, we introduce the shallowest valleys, and we prove that these valleys form a partition which fulfills conditions (H1)–(H3).

We first identify the valleys. Let
$$ \frac{1}{\alpha_N} = \sum_{\eta \in E} \sum_{\xi : \xi \neq \eta} R_N(\eta,\xi) . $$
We could also have defined $\alpha_N^{-1}$ as $\max\{R_N(\eta,\xi) : (\eta,\xi) \in B\}$. By Assumption 2.6, for every $\eta \neq \xi \in E$, $\alpha_N R_N(\eta,\xi) \to R(\eta,\xi) \in [0,1]$. Let $\lambda(\eta) = \sum_{\xi \neq \eta} R(\eta,\xi) \in \mathbb{R}_+$, and denote by $E_0$ the subset of points of $E$ such that $\lambda(\eta) > 0$. For all $\eta \in E_0$ let $p(\eta,\xi) = R(\eta,\xi)/\lambda(\eta)$. It is clear that for all $\eta$, $\zeta$ in $E$, $\xi \in E_0$,
$$ \lim_{N\to\infty} \alpha_N \lambda_N(\eta) = \lambda(\eta) , \qquad \lim_{N\to\infty} p_N(\xi,\zeta) = p(\xi,\zeta) . \tag{2.5} $$

Denote by $X_R(t)$ the $E$-valued Markov chain whose jump rates are given by $R(\eta,\xi)$. Note that this Markov chain might not be irreducible. However, by definition of $\alpha_N$, there is at least one bond $(\eta,\xi) \in B$ such that $R(\eta,\xi) > 0$.

Denote by $\mathcal{E}_1, \mathcal{E}_2, \dots, \mathcal{E}_n$ the recurrent classes of the Markov chain $X_R(t)$, and by $\Delta$ the set of transient points, so that $\{\mathcal{E}_1, \dots, \mathcal{E}_n, \Delta\}$ forms a partition of $E$:
$$ E = \mathcal{E} \sqcup \Delta , \qquad \mathcal{E} = \mathcal{E}_1 \sqcup \cdots \sqcup \mathcal{E}_n . \tag{2.6} $$
Here and below we use the notation $A \sqcup B$ to represent the union of two disjoint sets $A$, $B$: $A \sqcup B = A \cup B$, and $A \cap B = \varnothing$.

Note that the sets $\mathcal{E}_x$, $x \in S = \{1, \dots, n\}$, do not depend on $N$. If $n = 1$, the chain does not possess valleys. This is the case, for instance, if the rates $R_N(x,y)$ are independent of $N$. Assume, therefore, and up to the end of this subsection, that $n \ge 2$, and let $\theta_N$ be defined by
$$ \frac{1}{\theta_N} = \sum_{x \in S} \frac{\operatorname{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x)}{\mu_N(\mathcal{E}_x)} . \tag{2.7} $$

Theorem 2.7. The partition $\mathcal{E}_1, \dots, \mathcal{E}_n, \Delta$ and the time scales $\alpha_N$, $\theta_N$ fulfill the conditions (H1)–(H3). Moreover, for every $x \in S$ and every $\eta \in \mathcal{E}_x$, there exists $m_x(\eta) \in (0,1]$ such that
$$ \lim_{N\to\infty} \frac{\mu_N(\eta)}{\mu_N(\mathcal{E}_x)} = m_x(\eta) . \tag{H0} $$

Remark 2.8. The jump rates $r_{\mathcal{E}}(x,y)$ which appear in condition (H1) are introduced in Lemma 7.1. It follows from Theorems 2.1 and 2.7 that in the time-scale $\theta_N$ the chain $\eta^N_t$ evolves among the sets $\mathcal{E}_x$, $x \in S$, as a Markov chain which jumps from $x$ to $y$ at rate $r_{\mathcal{E}}(x,y)$.

In the next three remarks we present some consequences of Theorems 2.1 and 2.7 for the evolution of the chain $\eta^N_t$ in a time-scale longer than $\theta_N$. These remarks anticipate the recursive procedure of the next subsection.

Remark 2.9.
The jump rates $r_{\mathcal{E}}(x,y)$ define a Markov chain on $S$, represented by $X_{\mathcal{E}}(t)$. Denote by $T$ the set of transient points of this chain and assume that $T \neq \varnothing$. It follows from Theorem 2.1 that in the time-scale $\theta_N$, starting from a set $\mathcal{E}_x$, $x \in T$, the chain $\eta^N_t$ leaves the set $\mathcal{E}_x$ at an asymptotically exponential time, and never returns to $\mathcal{E}_x$ after a finite number of visits to this set. In particular, if we observe the chain $\eta^N_t$ in a longer time-scale than $\theta_N$, starting from $\mathcal{E}_x$ the chain remains only a negligible amount of time at $\mathcal{E}_x$.

Remark 2.10. Denote by $A$ the set of absorbing points of $X_{\mathcal{E}}(t)$, and assume that $A \neq \varnothing$. In this case, in the time-scale $\theta_N$, starting from a set $\mathcal{E}_x$, $x \in A$, the chain $\eta^N_t$ never leaves the set $\mathcal{E}_x$. To observe a non-trivial behavior starting from this set one has to consider longer time-scales.

Remark 2.11. Finally, denote by $C_1, \dots, C_p$ the equivalence classes of $X_{\mathcal{E}}(t)$. Suppose that there is a class, say $C_1$, of recurrent points which is not a singleton. In this case, starting from a set $\mathcal{E}_x$, $x \in C_1$, in the time-scale $\theta_N$, the chain $\eta^N_t$ leaves the set $\mathcal{E}_x$ at an asymptotically exponential time, and returns to $\mathcal{E}_x$ infinitely many times.

Suppose now that there are at least two classes, say $C_1$ and $C_2$, of recurrent points. This means that in the time-scale $\theta_N$, starting from a set $\mathcal{E}_x$, $x \in C_1$, the process never visits a set $\mathcal{E}_y$ for $y \in C_2$. For this to occur one has to observe the chain $\eta^N_t$ in a longer time-scale.

Denote by $\mathfrak{R}_1, \dots, \mathfrak{R}_m$ the recurrent classes of $X_{\mathcal{E}}(t)$. In the next subsection, we derive a new time-scale at which one observes jumps from sets of the form $\mathcal{F}_a = \cup_{x \in \mathfrak{R}_a} \mathcal{E}_x$ to sets of the form $\mathcal{F}_b = \cup_{x \in \mathfrak{R}_b} \mathcal{E}_x$.

All deep valleys and slow variables.
We obtained in the previous subsection two time-scales $\alpha_N$, $\theta_N$, and a partition $\mathcal{E}_1, \dots, \mathcal{E}_n, \Delta$ of the state space $E$ which satisfy conditions (H0)–(H3). We present in this subsection a recursive procedure. Starting from two time-scales $\beta^-_N$, $\beta_N$, and a partition $\mathcal{F}_1, \dots, \mathcal{F}_p, \Delta_{\mathcal{F}}$ of the state space $E$ satisfying the assumptions (H0)–(H3) and such that $p \ge 2$, it provides a longer time-scale $\beta^+_N$ and a coarser partition $\mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}}$ which fulfills conditions (H0)–(H3) with respect to the sequences $\beta_N$, $\beta^+_N$.

Consider a partition $\mathcal{F}_1, \dots, \mathcal{F}_p, \Delta_{\mathcal{F}}$ of the set $E$ and two sequences $\beta^-_N$, $\beta_N$ such that $\beta^-_N/\beta_N \to 0$. Assume that $p \ge 2$ and that the partition and the sequences $\beta^-_N$, $\beta_N$ satisfy conditions (H0)–(H3). Denote by $r_{\mathcal{F}}(x,y)$ the jump rates appearing in assumption (H1).

The coarser partition. Let $P = \{1, \dots, p\}$ and let $X_{\mathcal{F}}(t)$ be the $P$-valued Markov chain whose jump rates are given by $r_{\mathcal{F}}(x,y)$.

Denote by $\mathfrak{G}_1, \mathfrak{G}_2, \dots, \mathfrak{G}_q$ the recurrent classes of the chain $X_{\mathcal{F}}(t)$, and by $\mathfrak{G}_{q+1}$ the set of transient points. The sets $\mathfrak{G}_1, \dots, \mathfrak{G}_{q+1}$ form a partition of $P$. We claim that $q < p$. Fix $x \in P$ such that $\sum_{y \neq x} r_{\mathcal{F}}(x,y) > 0$, whose existence is guaranteed by hypothesis (H1). Suppose that the point $x$ is transient. In this case the number of recurrent classes must be smaller than $p$. If, on the other hand, $x$ is recurrent, the recurrent class which contains $x$ must have at least two elements, and the number of recurrent classes must again be smaller than $p$.

Let $Q = \{1, \dots, q\}$,
$$ \mathcal{G}_a = \bigcup_{x \in \mathfrak{G}_a} \mathcal{F}_x , \qquad \Delta_* = \bigcup_{x \in \mathfrak{G}_{q+1}} \mathcal{F}_x , \qquad \Delta_{\mathcal{G}} = \Delta_{\mathcal{F}} \cup \Delta_* , \qquad a \in Q . \tag{2.8} $$
Since, by (2.6), $\{\mathcal{F}_1, \dots, \mathcal{F}_p, \Delta_{\mathcal{F}}\}$ forms a partition of $E$, $\{\mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}}\}$ also forms a partition of $E$:
$$ E = \mathcal{G} \sqcup \Delta_{\mathcal{G}} , \qquad \mathcal{G} = \mathcal{G}_1 \sqcup \cdots \sqcup \mathcal{G}_q . \tag{2.9} $$
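The coarsening step (2.8) is purely combinatorial once the limiting rates are known. The Python sketch below classifies the points of the index set into recurrent classes and transient points, then merges the corresponding sets. The dictionary encoding, the function names `recurrent_classes` and `coarsen`, and the example data are assumptions made for illustration; the limiting rates themselves are simply taken as given input here.

```python
def recurrent_classes(rates):
    """Split the states of a finite chain into recurrent classes and transient points.

    rates[x][y] is the jump rate from x to y; every state must be a key.
    """
    reach = {}
    for s in rates:
        # Depth-first search over the bonds with positive rate.
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v, r in rates[u].items():
                if r > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        reach[s] = seen
    classes, transient = [], set()
    for s in rates:
        # s is recurrent iff every state reachable from s leads back to s.
        if all(s in reach[t] for t in reach[s]):
            cls = frozenset(reach[s])
            if cls not in classes:
                classes.append(cls)
        else:
            transient.add(s)
    return classes, transient

def coarsen(valleys, delta, limit_rates):
    """One step of (2.8): merge valleys along recurrent classes, enlarge delta."""
    classes, transient = recurrent_classes(limit_rates)
    new_valleys = [frozenset().union(*(valleys[x] for x in c)) for c in classes]
    new_delta = set(delta).union(*(valleys[x] for x in transient))
    return new_valleys, new_delta

# Hypothetical example with p = 3: points 1 and 2 communicate, point 3 is transient.
valleys = {1: {"a"}, 2: {"b", "c"}, 3: {"d"}}
limit_rates = {1: {2: 1.0}, 2: {1: 0.5}, 3: {1: 2.0}}
new_valleys, new_delta = coarsen(valleys, {"e"}, limit_rates)
print(new_valleys, new_delta)
```

In this example a single valley remains, so $q = 1 < p = 3$, in line with the claim $q < p$.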
The longer time-scale. For $a \in Q = \{1, \dots, q\}$, let $\breve{\mathcal{G}}_a$ be the union of all leaves except $\mathcal{G}_a$:
$$ \breve{\mathcal{G}}_a = \bigcup_{b \neq a} \mathcal{G}_b . $$
Assume that $q > 1$, and let $\beta^+_N$ be given by
$$ \frac{1}{\beta^+_N} = \sum_{a \in Q} \frac{\operatorname{cap}_N(\mathcal{G}_a, \breve{\mathcal{G}}_a)}{\mu_N(\mathcal{G}_a)} . \tag{2.10} $$

Theorem 2.12. The partition $\mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}}$ and the time scales $(\beta_N, \beta^+_N)$ satisfy conditions (H0)–(H3).

Remark 2.13. It follows from Theorems 2.1 and 2.12 that the chain $\eta^N_t$ exhibits a metastable behavior in the time-scale $\beta^+_N$ if $q > 1$. We refer to Remarks 2.2, 2.3, 2.4 and 2.5.

Remark 2.14. As $q < p$ and as we need $p$ to be greater than or equal to 2 to apply the iterative procedure, this recursive algorithm ends after a finite number of steps. If $q = 1$, $\beta_N$ is the longest time-scale at which a metastable behavior is observed. In this time-scale, the chain $\eta^N_t$ jumps among the sets $\mathcal{F}_x$ as does the chain $X_{\mathcal{F}}(t)$ until it reaches the set $\mathcal{G}_1 = \cup_{x \in \mathfrak{G}_1} \mathcal{F}_x$. Once in this set, it remains there forever, jumping among the sets $\mathcal{F}_x$, $x \in \mathfrak{G}_1$, as the Markov chain $X_{\mathcal{F}}(t)$, which restricted to $\mathfrak{G}_1$ is an irreducible Markov chain.

The successive valleys: Observe that the valleys $\mathcal{G}_a$ were obtained as the recurrent classes of the Markov chain $X_{\mathcal{F}}(t)$: $\mathcal{G}_a = \cup_{x \in \mathfrak{G}_a} \mathcal{F}_x$, where $\mathfrak{G}_a$ is a recurrent class of $X_{\mathcal{F}}(t)$. In particular, at any time-scale the valleys are formed by unions of the valleys obtained in the first step of the recursive argument, which were denoted by $\mathcal{E}_x$ in the previous subsection. Moreover, by (H0), all configurations in $\mathcal{G}_a$ have measures of the same order.

Conclusion: We presented an iterative method which provides a finite sequence of time-scales and of partitions of the set $E$ satisfying conditions (H0)–(H3). At each step, the time scales become longer and the partitions coarser. By Theorem 2.1, to each pair of time-scale and partition corresponds a metastable behavior of the chain $\eta^N_t$. This recursive algorithm provides all time-scales at which a metastable behavior of the chain $\eta^N_t$ is observed, and all slow variables which keep a Markovian dynamics.

3. What do we learn from Assumption 2.6?
We prove in this section that the jump rates of the trace processes satisfy Assumption 2.6, and that some sequences, such as the one formed by the measures of the configurations, are ordered.

Assertion 3.A. Let $F$ be a proper subset of $E$ and denote by $R^F_N(\eta,\xi)$, $\eta \neq \xi \in F$, the jump rates of the trace of $\eta^N_t$ on $F$. The jump rates $R^F_N(\eta,\xi)$ satisfy Assumption 2.6.

Proof. We prove this assertion by removing one by one the elements of $E \setminus F$. Assume that $F = E \setminus \{\zeta\}$ for some $\zeta \in E$. By Corollary 6.2 in [2] and by the equation following the proof of this corollary, for $\eta \neq \xi \in F$, $R^F_N(\eta,\xi) = R_N(\eta,\xi) + R_N(\eta,\zeta)\, p_N(\zeta,\xi)$. Hence,
$$ R^F_N(\eta,\xi) = \frac{\sum_{w \in E} R_N(\eta,\xi)\, R_N(\zeta,w) + R_N(\eta,\zeta)\, R_N(\zeta,\xi)}{\sum_{w \in E} R_N(\zeta,w)} \,\cdot \tag{3.1} $$
It is easy to check from this identity that Assumption 2.6 holds for the jump rates $R^F_N$. It remains to proceed recursively to complete the proof. □

Lemma 3.1. The sequences $\{\mu_N(\eta) : N \ge 1\}$, $\eta \in E$, are ordered.

Proof. Fix $\eta \neq \xi \in E$ and let $F = \{\eta, \xi\}$. By [2, Proposition 6.3], the stationary state of the trace of $\eta^N_t$ on $F$, denoted by $\mu^F_N$, is given by $\mu^F_N(\eta) = \mu_N(\eta)/\mu_N(F)$. As $\mu^F_N$ is the invariant probability measure, $\mu^F_N(\eta)\, R^F_N(\eta,\xi) = \mu^F_N(\xi)\, R^F_N(\xi,\eta)$. Therefore, $\mu_N(\eta)/\mu_N(\xi) = \mu^F_N(\eta)/\mu^F_N(\xi) = R^F_N(\xi,\eta)/R^F_N(\eta,\xi)$. By Assertion 3.A, the sequences $\{R^F_N(a,b) : N \ge 1\}$, $a \neq b \in \{\eta, \xi\}$, are ordered. This completes the proof of the lemma. □

The previous lemma permits one to divide the configurations of $E$ into equivalence classes by declaring $\eta$ equivalent to $\eta'$, $\eta \sim \eta'$, if $\mu_N(\eta)/\mu_N(\eta')$ converges to a real number belonging to $(0, \infty)$.

Assertion 3.B. Let $F$ be a proper subset of $E$. For every bond $(\eta', \xi') \in B$ and every $m \ge 1$ the set of sequences
$$ \Big\{ \prod_{(\eta,\xi) \in B} R^F_N(\eta,\xi)^{k(\eta,\xi)}\, R_N(\eta',\xi') : N \ge 1 \Big\} , \quad k \in \mathfrak{A}_m , $$
is ordered.

Proof. We proceed as in the proof of Assertion 3.A, by removing one by one the elements of $E \setminus F$. Fix $\zeta \in E \setminus F$. It follows from (3.1) and from Assumption 2.6 that the claim of the assertion holds for $F' = E \setminus \{\zeta\}$.

Fix $\zeta' \in E \setminus F$, $\zeta' \neq \zeta$. By using formula (3.1) to express the rates $R^{E \setminus \{\zeta, \zeta'\}}$ in terms of the rates $R^{E \setminus \{\zeta\}}$, and the statement of this assertion for $F' = E \setminus \{\zeta\}$, we prove that this assertion also holds for $F' = E \setminus \{\zeta, \zeta'\}$. Iterating this algorithm we complete the proof of the assertion. □

Denote by $c_N(\eta,\xi) = \mu_N(\eta)\, R_N(\eta,\xi)$, $(\eta,\xi) \in B$, the (generally asymmetric) conductances.

Lemma 3.2. The conductances $\{c_N(\eta,\xi) : N \ge 1\}$, $(\eta,\xi) \in B$, are ordered.

Proof. Consider two bonds $(\eta,\xi)$, $(\eta',\xi')$ in $B$. As in the proof of Lemma 3.1, we may express the ratio of the conductances as
$$ \frac{c_N(\eta,\xi)}{c_N(\eta',\xi')} = \frac{\mu_N(\eta)\, R_N(\eta,\xi)}{\mu_N(\eta')\, R_N(\eta',\xi')} = \frac{R^F_N(\eta',\eta)\, R_N(\eta,\xi)}{R^F_N(\eta,\eta')\, R_N(\eta',\xi')} , $$
where $F = \{\eta, \eta'\}$. It remains to recall the statement of Assertion 3.B to complete the proof of the lemma. □

Denote by $B^s$ the symmetrization of the set $B$, that is, the set of bonds $(\eta,\xi)$ such that $(\eta,\xi)$ or $(\xi,\eta)$ belongs to $B$:
$$ B^s = \big\{ (\eta,\xi) \in E \times E : \eta \neq \xi ,\ (\eta,\xi) \in B \text{ or } (\xi,\eta) \in B \big\} . $$
Denote by $c^s_N(\eta,\xi)$, $(\eta,\xi) \in B^s$, the symmetric part of the conductance:
$$ c^s_N(\eta,\xi) = \frac{1}{2} \big\{ c_N(\eta,\xi) + c_N(\xi,\eta) \big\} . \tag{3.2} $$
The next result is a straightforward consequence of the previous lemma.

Corollary 3.3. The symmetric conductances $\{c^s_N(\eta,\xi) : N \ge 1\}$, $(\eta,\xi) \in B^s$, are ordered.

As in Lemma 3.1, the previous corollary permits one to divide the set $B^s$ into equivalence classes by declaring $(\eta,\xi)$ equivalent to $(\eta',\xi')$, $(\eta,\xi) \sim (\eta',\xi')$, if $c^s_N(\eta,\xi)/c^s_N(\eta',\xi')$ converges to a constant in $(0, \infty)$.

It is possible to deduce from Assumption 2.6 that many other sequences are ordered. We do not present these results here as we do not use them below.

4. Cycles, sector condition and capacities
We prove in this section that the generator of a Markov chain on a finite set can be decomposed as the sum of cycle generators and that it satisfies a sector condition. This last bound permits one to estimate the capacity between two sets by the capacity between the same sets for the reversible process.

Throughout this section, $E$ is a fixed finite set and $\mathcal{L}$ represents the generator of an $E$-valued, continuous-time Markov chain. We adopt all notation introduced in Section 2, removing the index $N$ since the chain is fixed. We start with some definitions.

In a finite set, the decomposition of a generator into cycle generators is very simple. The problem for infinite sets is much more delicate. We refer to [16] for a discussion of the question.

Cycle: A cycle is a sequence of distinct configurations $(\eta_0, \eta_1, \dots, \eta_{n-1}, \eta_n = \eta_0)$ whose initial and final configurations coincide: $\eta_i \neq \eta_j \in E$, $i \neq j \in \{0, \dots, n-1\}$. The number $n$ is called the length of the cycle.

Cycle generator: A generator $\mathcal{L}$ of an $E$-valued Markov chain, whose jump rates are denoted by $R(\eta,\xi)$, is said to be a cycle generator associated to the cycle $c = (\eta_0, \eta_1, \dots, \eta_{n-1}, \eta_n = \eta_0)$ if there exist reals $r_i > 0$, $0 \le i < n$, such that
$$ R(\eta,\xi) = \begin{cases} r_i & \text{if } \eta = \eta_i \text{ and } \xi = \eta_{i+1} \text{ for some } 0 \le i < n , \\ 0 & \text{otherwise} . \end{cases} $$
We denote this cycle generator by $\mathcal{L}_c$. Note that
$$ (\mathcal{L}_c f)(\eta) = \sum_{i=0}^{n-1} \mathbf{1}\{\eta = \eta_i\}\, r_i\, [ f(\eta_{i+1}) - f(\eta_i) ] . $$

Sector condition: A generator $\mathcal{L}$ of an $E$-valued, irreducible Markov chain, whose unique invariant probability measure is denoted by $\mu$, is said to satisfy a sector condition if there exists a constant $C_0 < \infty$ such that for all functions $f$, $g : E \to \mathbb{R}$,
$$ \langle \mathcal{L} f, g \rangle_\mu^2 \le C_0\, \langle (-\mathcal{L} f), f \rangle_\mu\, \langle (-\mathcal{L} g), g \rangle_\mu . $$
In this formula, $\langle f, g \rangle_\mu$ represents the scalar product in $L^2(\mu)$:
$$ \langle f, g \rangle_\mu = \sum_{\eta \in E} f(\eta)\, g(\eta)\, \mu(\eta) . $$
We claim that every cycle generator satisfies a sector condition and that every generator $\mathcal{L}$ of an $E$-valued Markov chain, stationary with respect to a probability measure $\mu$, can be decomposed as the sum of cycle generators which are stationary with respect to $\mu$.

Assertion 4.A. Consider a cycle $c = (\eta_0, \eta_1, \dots, \eta_{n-1}, \eta_n = \eta_0)$ of length $n \ge 2$ and let $\mathcal{L}$ be a cycle generator associated to $c$. Denote the jump rates of $\mathcal{L}$ by $R(\eta_i, \eta_{i+1})$. A measure $\mu$ is stationary for $\mathcal{L}$ if and only if
$$ \mu(\eta_i)\, R(\eta_i, \eta_{i+1}) \ \text{ is constant} . \tag{4.1} $$
The proof of the previous assertion is elementary and left to the reader. The proof of the next one can be found in [18, Lemma 5.5.8].

Assertion 4.B. Let $\mathcal{L}$ be a cycle generator associated to a cycle $c$ of length $n$. Then, $\mathcal{L}$ satisfies a sector condition with constant $2n$: For all $f$, $g : E \to \mathbb{R}$,
$$ \langle \mathcal{L} f, g \rangle_\mu^2 \le 2n\, \langle (-\mathcal{L} f), f \rangle_\mu\, \langle (-\mathcal{L} g), g \rangle_\mu . $$

Lemma 4.1.
Let $\mathcal{L}$ be a generator of an $E$-valued, irreducible Markov chain. Denote by $\mu$ the unique invariant probability measure. Then, there exist cycles $c_1, \dots, c_p$ such that
$$ \mathcal{L} = \sum_{j=1}^{p} \mathcal{L}_{c_j} , $$
where $\mathcal{L}_{c_j}$ are cycle generators associated to $c_j$ which are stationary with respect to $\mu$.

Proof. The proof consists in eliminating successively all 2-cycles (cycles of length 2), then all 3-cycles and so on up to the $|E|$-cycle if there is one left. Denote by $R(\eta,\xi)$ the jump rates of the generator $\mathcal{L}$ and by $\mathfrak{C}_2$ the set of all 2-cycles $(\eta, \xi, \eta)$ such that $R(\eta,\xi)\, R(\xi,\eta) > 0$. Note that the cycle $(\eta, \xi, \eta)$ coincides with the cycle $(\xi, \eta, \xi)$.

Fix a cycle $c = (\eta, \xi, \eta) \in \mathfrak{C}_2$. Let $\bar{c}(\eta,\xi) = \min\{ \mu(\eta) R(\eta,\xi) , \mu(\xi) R(\xi,\eta) \}$ be the minimal conductance of the edge $(\eta,\xi)$, and let $R_c$ be the jump rates given by $R_c(\eta,\xi) = \bar{c}(\eta,\xi)/\mu(\eta)$, $R_c(\xi,\eta) = \bar{c}(\eta,\xi)/\mu(\xi)$, all other rates being equal to 0. Observe that $R_c(\zeta,\zeta') \le R(\zeta,\zeta')$ for all $(\zeta,\zeta')$, and that $R_c(\xi,\eta) = R(\xi,\eta)$ or $R_c(\eta,\xi) = R(\eta,\xi)$.

Denote by $\mathcal{L}_c$ the generator associated to the jump rates $R_c$. Since $\mu(\eta) R_c(\eta,\xi) = \bar{c}(\eta,\xi) = \mu(\xi) R_c(\xi,\eta)$, by (4.1), $\mu$ is a stationary state for $\mathcal{L}_c$ (actually, reversible). Let $\mathcal{L}_1 = \mathcal{L} - \mathcal{L}_c$ so that
$$ \mathcal{L} = \mathcal{L}_1 + \mathcal{L}_c . $$
As $R_c(\zeta,\zeta') \le R(\zeta,\zeta')$, $\mathcal{L}_1$ is the generator of a Markov chain. Since $\mu$ is stationary for both $\mathcal{L}$ and $\mathcal{L}_c$, it is also stationary for $\mathcal{L}_1$. Finally, if we draw an arrow from $\zeta$ to $\zeta'$ whenever the jump rate from $\zeta$ to $\zeta'$ is strictly positive, the number of arrows for the generator $\mathcal{L}_1$ is equal to the number of arrows for the generator $\mathcal{L}$ minus 1 or 2. This procedure has therefore strictly decreased the number of arrows of $\mathcal{L}$.

We may repeat the previous algorithm for $\mathcal{L}_1$ to remove from $\mathcal{L}$ all 2-cycles $(\eta, \xi, \eta)$ such that $R(\eta,\xi)\, R(\xi,\eta) > 0$. Once this has been accomplished, we may remove all 3-cycles $(\eta_0, \eta_1, \eta_2, \eta_3 = \eta_0)$ such that $\prod_{0 \le i < 3} R(\eta_i, \eta_{i+1}) > 0$. At each step at least one arrow is removed from the generator, which implies that after a finite number of steps all 3-cycles are removed.

Once all $k$-cycles have been removed, $2 \le k < |E|$, we have obtained a decomposition of $\mathcal{L}$ as
$$ \mathcal{L} = \sum_{k=2}^{|E|-1} \mathcal{L}_k + \hat{\mathcal{L}} , $$
where $\mathcal{L}_k$ is the sum of $k$-cycle generators and is stationary with respect to $\mu$, and $\hat{\mathcal{L}}$ is a generator, stationary with respect to $\mu$, and with no $k$-cycles, $2 \le k < |E|$. If $\hat{\mathcal{L}}$ has an arrow, as it is stationary with respect to $\mu$ and has no $k$-cycles, $\hat{\mathcal{L}}$ must be an $|E|$-cycle generator, providing the decomposition stated in the lemma. □

Remark 4.2.
Observe that a generator L is reversible with respect to µ if and onlyif it has a decomposition in -cycles. Given a measure µ on a finite state space, forexample the Gibbs measure associated to a Hamiltonian at a fixed temperature, byintroducing k -cycles satisfying (4.1) it is possible to define non-reversible dynamicswhich are stationary with respect to µ . The previous lemma asserts that this is theonly way to define such dynamics. Corollary 4.3.
The generator L satisfies a sector condition with constant boundedby | E | : For all f , g : E → R , hL f, g i µ ≤ | E | h ( −L f ) , f i µ h ( −L g ) , g i µ . Proof.
Fix f and g : E → R . By Lemma 4.1, hL f, g i µ = (cid:16) p X j =1 hL c j f, g i µ (cid:17) , where L c j is a cycle generator, stationary with respect to µ , associated to the cycle c j . By Assertion 4.B and by Schwarz inequality, since all cycles have length at most | E | , the previous sum is bounded by2 | E | p X j =1 h ( −L c j f ) , f i µ p X k =1 h ( −L c k g ) , g i µ = 2 | E | h ( −L f ) , f i µ h ( −L g ) , g i µ , as claimed (cid:3) Denote by R s ( η, ξ ) the symmetric part of the jump rates R s ( η, ξ ): R s ( η, ξ ) = 12 n R ( η, ξ ) + µ ( ξ ) µ ( η ) R ( ξ, η ) o . (4.2)Denote by η st the E -valued Markov chain whose jump rates are given by R s . Thechain η st is called the reversible chain.For two disjoint subsets A , B of E , denote by cap( A, B ) (resp. cap s ( A, B ))the capacity between A and B (for the reversible chain). Next result follows fromCorollary 4.3 and Lemmas 2.5 and 2.6 in [17] Corollary 4.4.
Fix two disjoint subsets $A$, $B$ of $E$. Then,
\[
\mathrm{cap}^s(A,B) \;\le\; \mathrm{cap}(A,B) \;\le\; 2\,|E| \, \mathrm{cap}^s(A,B) .
\]

We conclude the section with an identity and an inequality which will be used several times in this article. Let $A$ and $B$ be two disjoint subsets of $E$. By definition of the capacity,
\[
\mathrm{cap}(A,B) \;=\; \sum_{\eta \in A} \mu(\eta) \lambda(\eta) \, \mathbf{P}_\eta \big[ H_B < H^+_A \big]
\;=\; \sum_{\eta \in A} \mu(\eta) \lambda(\eta) \sum_{\xi \in B} \mathbf{P}_\eta \big[ H_\xi = H^+_{A \cup B} \big] .
\]
Therefore, if we denote by $R^{A \cup B}(\eta,\xi)$, $\eta \neq \xi \in A \cup B$, the jump rates of the trace of the chain $\eta_t$ on the set $A \cup B$, by [2, Proposition 6.1],
\[
\mathrm{cap}(A,B) \;=\; \sum_{\eta \in A} \mu(\eta) \sum_{\xi \in B} R^{A \cup B}(\eta,\xi) . \tag{4.3}
\]

Let $A$ be a non-empty subset of $E$ and denote by $R^A(\eta,\xi)$ the jump rates of the trace of $\eta_t$ on $A$. We claim that for all $\eta \neq \xi \in A$,
\[
\mu(\eta) \, R^A(\eta,\xi) \;\le\; \mathrm{cap}(\eta,\xi) . \tag{4.4}
\]
Denote by $\lambda_A(\zeta)$ the holding rates of the trace process on $A$ and by $p_A(\zeta,\zeta')$ the jump probabilities. By definition,
\[
R^A(\eta,\xi) \;=\; \lambda_A(\eta) \, p_A(\eta,\xi) \;=\; \lambda_A(\eta) \, \mathbf{P}_\eta \big[ H_\xi = H^+_A \big] \;\le\; \lambda_A(\eta) \, \mathbf{P}_\eta \big[ H_\xi < H^+_\eta \big] .
\]
Multiplying both sides of this inequality by $\mu_A(\eta) = \mu(\eta)/\mu(A)$, by definition of the capacity we obtain that
\[
\mu_A(\eta) \, R^A(\eta,\xi) \;\le\; \mathrm{cap}_A(\eta,\xi) ,
\]
where $\mathrm{cap}_A(\eta,\xi)$ stands for the capacity with respect to the trace process on $A$. To complete the proof of (4.4), it remains to recall formula (A.10) in [4].

5. Reversible chains and capacities
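Before turning to the estimates, identity (4.3) and the two-sided bound of Corollary 4.4 can be checked numerically on a small example. The following Python sketch is only an illustration under stated assumptions: the four-state rate matrix, the sets $A = \{0\}$, $B = \{2\}$ and all function names are hypothetical choices made here, the trace generator is computed through the usual Schur-complement formula, and the upper bound is tested with the constant $2|E|$ that appears in the proof of Corollary 4.3.

```python
import numpy as np

def generator(rates):
    """Turn a matrix of off-diagonal jump rates into a generator (rows sum to 0)."""
    L = np.array(rates, dtype=float)
    np.fill_diagonal(L, 0.0)
    np.fill_diagonal(L, -L.sum(axis=1))
    return L

def stationary(L):
    """Solve mu L = 0, sum(mu) = 1 (unique solution for an irreducible chain)."""
    n = L.shape[0]
    M = np.vstack([L.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(M, b, rcond=None)
    return mu

def symmetrize(L, mu):
    """Generator with the rates R^s of (4.2): half of L plus its adjoint in L^2(mu)."""
    adjoint = L.T * mu[None, :] / mu[:, None]
    return 0.5 * (L + adjoint)

def capacity(L, mu, A, B):
    """cap(A,B) = sum over eta in A of mu(eta) lambda(eta) P_eta[H_B < H_A^+]."""
    n = L.shape[0]
    A, B = list(A), list(B)
    D = [z for z in range(n) if z not in A + B]
    h = np.zeros(n)                 # h(zeta) = P_zeta[H_B < H_A]
    h[B] = 1.0
    if D:                           # h is harmonic outside A and B
        rhs = -L[np.ix_(D, B)].sum(axis=1)
        h[D] = np.linalg.solve(L[np.ix_(D, D)], rhs)
    return float(sum(mu[a] * (L[a] @ h) for a in A))

def trace_generator(L, F):
    """Generator of the trace on F (Schur complement of the block outside F)."""
    n = L.shape[0]
    F = list(F)
    D = [z for z in range(n) if z not in F]
    LFF = L[np.ix_(F, F)]
    if not D:
        return LFF
    return LFF - L[np.ix_(F, D)] @ np.linalg.solve(L[np.ix_(D, D)], L[np.ix_(D, F)])

# Hypothetical non-reversible, irreducible chain on E = {0, 1, 2, 3}.
rates = [[0.0, 1.0, 0.0, 0.3],
         [0.2, 0.0, 1.0, 0.0],
         [0.0, 0.5, 0.0, 1.0],
         [1.0, 0.0, 0.1, 0.0]]
L = generator(rates)
mu = stationary(L)
A, B = [0], [2]
F = A + B

cap = capacity(L, mu, A, B)
LF = trace_generator(L, F)
# Right hand side of (4.3): sum of mu(eta) R^{A u B}(eta, xi) over eta in A, xi in B.
cap_trace = sum(mu[a] * LF[F.index(a), F.index(b)] for a in A for b in B)
cap_s = capacity(symmetrize(L, mu), mu, A, B)
print(cap, cap_trace, cap_s)
```

On this example the first two printed values should coincide up to rounding, and the third should lie between `cap / (2 * 4)` and `cap`.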
We present in this section some estimates for the capacity of reversible, finite state Markov chains obtained in [3]. They are useful below since we proved in Corollary 4.4 that the capacity between two disjoint subsets $A$, $B$ of $E$ is of the same order as the capacity with respect to the reversible chain.

Recall from (3.2) that we denote by $c^s_N(\eta,\xi)$ the symmetric conductance of the bond $(\eta,\xi)$. Fix two disjoint subsets $A$, $B$ of $E$. A self-avoiding path $\gamma$ from $A$ to $B$ is a sequence of configurations $(\eta_0, \eta_1, \dots, \eta_n)$ such that $\eta_0 \in A$, $\eta_n \in B$, $\eta_i \neq \eta_j$, $i \neq j$, $c^s_N(\eta_i, \eta_{i+1}) >$
$0$, $0 \le i < n$. Denote by $\Gamma_{A,B}$ the set of self-avoiding paths from $A$ to $B$ and let $c^s_N(\gamma) = \min_{0 \le i \,\dots}$ […]

In view of Theorem 5.1 in [19], Theorem 2.1 follows from condition (H3) and from Proposition 6.1 below. Denote by $\psi_{\mathcal{E}} : \mathcal{E} \to \{1, \dots, n\}$ the projection defined by $\psi_{\mathcal{E}}(\eta) = x$ if $\eta \in \mathcal{E}_x$:
\[
\psi_{\mathcal{E}}(\eta) \;=\; \sum_{x \in S} x \, \mathbf{1}\{ \eta \in \mathcal{E}_x \} .
\]

Proposition 6.1. Fix $x \in S$ and a configuration $\eta \in \mathcal{E}_x$. Starting from $\eta$, the speeded-up, hidden Markov chain $X_N(t) = \psi_{\mathcal{E}} \big( \eta^{\mathcal{E}}(\theta_N t) \big)$ converges in the Skorohod topology to the continuous-time Markov chain $X_{\mathcal{E}}(t)$, introduced in Theorem 2.1, which starts from $x$.

Lemma 6.2. For every $x \in S$ for which $\mathcal{E}_x$ is not a singleton and for all $\eta \neq \xi \in \mathcal{E}_x$,
\[
\lim_{N \to \infty} \frac{\mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x)}{\mathrm{cap}_N(\eta,\xi)} \;=\; 0 .
\]

Proof. Fix $x \in S$. By (4.3), applied to $A = \mathcal{E}_x$, $B = \breve{\mathcal{E}}_x$, and by assumption (H1),
\[
\lim_{N \to \infty} \theta_N \, \frac{\mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x)}{\mu_N(\mathcal{E}_x)} \;=\; \sum_{y \neq x} r_{\mathcal{E}}(x,y) \;\in\; \mathbb{R}_+ .
\]
The claim of the lemma follows from this equation, from assumption (H2) and from the fact that $\alpha_N / \theta_N \to 0$. $\square$

Proof of Proposition 6.1. In view of Theorem 2.1 in [4], the claim of the proposition follows from condition (H1) and from Lemma 6.2. $\square$

7. Proof of Theorem 2.7

The proof of Theorem 2.7 is divided in several steps.

1. The measure of the metastable sets. We start by proving that condition (H0) is in force. Recall from Section 2 that we denote by $X_R(t)$ the $E$-valued chain which jumps from $\eta$ to $\xi$ at rate $R(\eta,\xi)$. Denote by $\mathcal{C}_1, \dots, \mathcal{C}_m$ the equivalence classes of the chain $X_R(t)$.

Assertion 7.A. For all $1 \le j \le m$, and for all $\eta \neq \xi \in \mathcal{C}_j$, there exists $m(\eta,\xi) \in (0,\infty)$ such that
\[
\lim_{N \to \infty} \frac{\mu_N(\eta)}{\mu_N(\xi)} \;=\; m(\eta,\xi) .
\]

Proof. Fix $1 \le j \le m$ and $\eta \neq \xi \in \mathcal{C}_j$. By assumption, there exists a path $(\eta = \eta_0, \dots, \eta_n = \xi)$ such that $R(\eta_i, \eta_{i+1}) > 0$, $0 \le i < n$.
On the other hand, since $\mu_N$ is an invariant probability measure,
\[
\lambda_N(\xi) \mu_N(\xi) \;=\; \sum_{\zeta_0, \zeta_1, \dots, \zeta_{n-1} \in E} \mu_N(\zeta_0) \lambda_N(\zeta_0) \, p_N(\zeta_0, \zeta_1) \cdots p_N(\zeta_{n-1}, \xi)
\;\ge\; \mu_N(\eta) \lambda_N(\eta) \, p_N(\eta, \eta_1) \cdots p_N(\eta_{n-1}, \xi) .
\]
Therefore,
\[
\frac{\mu_N(\xi)}{\mu_N(\eta)} \;\ge\; \frac{\lambda_N(\eta)}{\lambda_N(\xi)} \, p_N(\eta, \eta_1) \cdots p_N(\eta_{n-1}, \xi) .
\]
Since $R(\eta_i, \eta_{i+1}) > 0$, $0 \le i < n$, by (2.5), $p_N(\eta_i, \eta_{i+1})$ converges to $p(\eta_i, \eta_{i+1}) > 0$. For the same reason, $\alpha_N \lambda_N(\eta)$ converges to $\lambda(\eta) \in (0,\infty)$. Finally, as $\xi$ and $\eta$ belong to the same equivalence class, there exists a path from $\xi$ to $\eta$ with properties similar to the one from $\eta$ to $\xi$, so that $\alpha_N \lambda_N(\xi)$ converges to $\lambda(\xi) \in (0,\infty)$. In conclusion,
\[
\liminf_{N \to \infty} \frac{\mu_N(\xi)}{\mu_N(\eta)} \;>\; 0 .
\]
Exchanging the roles of $\eta$ and $\xi$ we obtain that $\liminf_N \mu_N(\eta)/\mu_N(\xi) > 0$. Since by Lemma 3.1 the sequences $\{ \mu_N(\zeta) : N \ge 1 \}$, $\zeta \in E$, are ordered, $\mu_N(\eta)/\mu_N(\xi)$ must converge to some value in $(0,\infty)$. $\square$

By the previous assertion, for every $x \in S$ and $\eta \in \mathcal{E}_x$,
\[
m_x(\eta) \;:=\; \lim_{N \to \infty} \frac{\mu_N(\eta)}{\mu_N(\mathcal{E}_x)} \;\in\; (0,1] , \tag{7.1}
\]
where we adopted the convention established in condition (H1) of Section 2.

2. The time-scale. In this subsection, we introduce a time-scale $\gamma_N$, and we prove that it is much longer than $\alpha_N$ and that it is of the same order as $\theta_N$. In particular, the requirement $\alpha_N / \theta_N \to 0$ is fulfilled.

Denote by $\{ \eta^{\mathcal{E}}_t : t \ge 0 \}$ the trace of $\eta^N_t$ on the set $\mathcal{E}$, and by $R^{\mathcal{E}}_N : \mathcal{E} \times \mathcal{E} \to \mathbb{R}_+$ the jump rates of $\eta^{\mathcal{E}}_t$. Let
\[
\frac{1}{\gamma_N} \;=\; \sum_{x \in S} \sum_{\eta \in \mathcal{E}_x} \sum_{\xi \in \breve{\mathcal{E}}_x} R^{\mathcal{E}}_N(\eta,\xi) , \tag{7.2}
\]
where $\breve{\mathcal{E}}_x$ has been introduced in (2.4). The sequence $\gamma_N$ represents the time needed to reach the set $\breve{\mathcal{E}}_x$ starting from $\mathcal{E}_x$ for some $x \in S$. This time scale might be longer for other sets $\mathcal{E}_y$, $y \neq x$, but it is of the order $\gamma_N$ for at least one $x \in S$. We could as well have defined $\gamma_N^{-1}$ as $\max_{x \in S} \max_{\eta \in \mathcal{E}_x} \max_{\xi \in \breve{\mathcal{E}}_x} R^{\mathcal{E}}_N(\eta,\xi)$.

Assertion 7.B. The time scale $\gamma_N$ is much longer than the time-scale $\alpha_N$:
\[
\lim_{N \to \infty} \frac{\alpha_N}{\gamma_N} \;=\; 0 .
\]

Proof.
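We have to show that $\alpha_N R^{\mathcal{E}}_N(\eta,\xi)$ converges to $0$ as $N \uparrow \infty$, for all $\eta \in \mathcal{E}_x$, $\xi \in \mathcal{E}_y$, $x \neq y \in S$. Fix $x \neq y \in S$, $\eta \in \mathcal{E}_x$, $\xi \in \mathcal{E}_y$. Since $\mathcal{E}_x$ is a recurrent class, $R(\eta,\zeta) = 0$ for all $\zeta \notin \mathcal{E}_x$. On the other hand, by [2, Proposition 6.1] and by the strong Markov property,
\[
R^{\mathcal{E}}_N(\eta,\xi) \;=\; \lambda_N(\eta) \, \mathbf{P}_\eta \big[ H_\xi = H^+_{\mathcal{E}} \big] \;=\; R_N(\eta,\xi) + \sum_{\zeta \notin \mathcal{E}} R_N(\eta,\zeta) \, \mathbf{P}_\zeta \big[ H_\xi = H_{\mathcal{E}} \big] .
\]
Since $R(\eta,\zeta) = 0$ for all $\zeta \notin \mathcal{E}_x$, it follows from the previous identity and from the definition of $R(\eta,\zeta)$ that $\alpha_N R^{\mathcal{E}}_N(\eta,\xi) \to 0$, as claimed. $\square$

By Assertion 3.A, for all $x \in S$, $\eta \in \mathcal{E}_x$, $\xi \in \breve{\mathcal{E}}_x$, with the convention adopted in condition (H1) of Section 2,
\[
r_{\mathcal{E}}(\eta,\xi) \;:=\; \lim_{N \to \infty} \gamma_N R^{\mathcal{E}}_N(\eta,\xi) \;\in\; [0,1] . \tag{7.3}
\]

Assertion 7.C. For all $x \in S$,
\[
\ell_x \;:=\; \lim_{N \to \infty} \gamma_N \, \frac{\mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x)}{\mu_N(\mathcal{E}_x)} \;\in\; \mathbb{R}_+ .
\]
Moreover, $\ell = \sum_{x \in S} \ell_x > 0$.

Proof. By (4.3), applied to $A = \mathcal{E}_x$, $B = \breve{\mathcal{E}}_x$, by (7.1) and by (7.3),
\[
\lim_{N \to \infty} \gamma_N \, \frac{\mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x)}{\mu_N(\mathcal{E}_x)} \;=\; \sum_{\eta \in \mathcal{E}_x} m_x(\eta) \sum_{\xi \in \breve{\mathcal{E}}_x} r_{\mathcal{E}}(\eta,\xi) \;\in\; \mathbb{R}_+ ,
\]
which completes the proof of the first claim of the assertion.

By (7.2) and by definition of $r_{\mathcal{E}}(\eta,\xi)$,
\[
\sum_{x \in S} \sum_{\eta \in \mathcal{E}_x} \sum_{\xi \in \breve{\mathcal{E}}_x} r_{\mathcal{E}}(\eta,\xi) \;=\; 1 ,
\]
so that
\[
\ell \;=\; \sum_{x \in S} \ell_x \;=\; \sum_{x \in S} \sum_{\eta \in \mathcal{E}_x} m_x(\eta) \sum_{\xi \in \breve{\mathcal{E}}_x} r_{\mathcal{E}}(\eta,\xi) \;\ge\; \min_{x \in S} \min_{\eta \in \mathcal{E}_x} m_x(\eta) \;>\; 0 ,
\]
which is the second claim of the assertion. $\square$

It follows from Assertion 7.C that the time-scale $\gamma_N$ is of the same order as $\theta_N$, in the sense that $\gamma_N / \theta_N$ converges as $N \uparrow \infty$:
\[
\lim_{N \to \infty} \frac{\gamma_N}{\theta_N} \;=\; \ell \;\in\; (0,\infty) . \tag{7.4}
\]

3. The average jump rate, condition (H1). Denote by $r_N(\mathcal{E}_x, \mathcal{E}_y)$ the mean rate at which the trace process jumps from $\mathcal{E}_x$ to $\mathcal{E}_y$:
\[
r_N(\mathcal{E}_x, \mathcal{E}_y) \;=\; \frac{1}{\mu_N(\mathcal{E}_x)} \sum_{\eta \in \mathcal{E}_x} \mu_N(\eta) \sum_{\xi \in \mathcal{E}_y} R^{\mathcal{E}}_N(\eta,\xi) . \tag{7.5}
\]
The next lemma follows from (7.1), (7.3) and (7.4).

Lemma 7.1. For every $x \neq y \in S$,
\[
r_{\mathcal{E}}(x,y) \;:=\; \lim_{N \to \infty} \theta_N \, r_N(\mathcal{E}_x, \mathcal{E}_y) \;=\; \frac{1}{\ell} \sum_{\eta \in \mathcal{E}_x} m_x(\eta) \sum_{\xi \in \mathcal{E}_y} r_{\mathcal{E}}(\eta,\xi) \;\in\; \mathbb{R}_+ .
\]

4. Inside the metastable sets, condition (H2).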
The next assertion shows that condition (H2) is in force.

Assertion 7.D. For every $x \in S$ for which $\mathcal{E}_x$ is not a singleton and for all $\eta \neq \xi \in \mathcal{E}_x$, there exist constants $0 < c < C < \infty$ such that
\[
c \;\le\; \liminf_{N \to \infty} \alpha_N \, \frac{\mathrm{cap}_N(\eta,\xi)}{\mu_N(\mathcal{E}_x)} \;\le\; \limsup_{N \to \infty} \alpha_N \, \frac{\mathrm{cap}_N(\eta,\xi)}{\mu_N(\mathcal{E}_x)} \;\le\; C .
\]

Proof. Fix $x \in S$ for which $\mathcal{E}_x$ is not a singleton, and $\eta \neq \xi \in \mathcal{E}_x$. On the one hand, by definition of the capacity,
\[
\alpha_N \, \frac{\mathrm{cap}_N(\eta,\xi)}{\mu_N(\mathcal{E}_x)} \;\le\; \frac{\mu_N(\eta)}{\mu_N(\mathcal{E}_x)} \, \alpha_N \lambda_N(\eta) .
\]
By (2.5) and (7.1), the right hand side converges to $\lambda(\eta) m_x(\eta) < \infty$, which proves one of the inequalities.

On the other hand, as $\mathcal{E}_x$ is an equivalence class which is not a singleton, $\lambda(\zeta) > 0$ for all $\zeta \in \mathcal{E}_x$, or, in other words, $\mathcal{E}_x \subset E$. Since $\eta \sim \xi$, there exists a path $(\eta = \eta_0, \dots, \eta_n = \xi)$ such that $R(\eta_i, \eta_{i+1}) > 0$, $0 \le i < n$. Since
\[
\mathbf{P}_\eta \big[ H_\xi < H^+_\eta \big] \;\ge\; p_N(\eta, \eta_1) \cdots p_N(\eta_{n-1}, \xi) ,
\]
in view of the formula (2.2) for the capacity, we have that
\[
\alpha_N \, \frac{\mathrm{cap}_N(\eta,\xi)}{\mu_N(\mathcal{E}_x)} \;\ge\; \frac{\mu_N(\eta)}{\mu_N(\mathcal{E}_x)} \, \alpha_N \lambda_N(\eta) \, p_N(\eta, \eta_1) \cdots p_N(\eta_{n-1}, \xi) .
\]
The right hand side converges to $m_x(\eta) \lambda(\eta) \, p(\eta, \eta_1) \cdots p(\eta_{n-1}, \xi) > 0$, which completes the proof of the assertion. $\square$

5. Condition (H3) holds. To complete the proof of Theorem 2.7 it remains to show that the chain $\eta^N_t$ spends a negligible amount of time on the set $\Delta$ in the time scale $\theta_N$.

Lemma 7.2. For every $t > 0$,
\[
\lim_{N \to \infty} \max_{\eta \in E} \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \theta_N} \in \Delta \} \, ds \Big] \;=\; 0 .
\]

Proof. Since $\alpha_N / \theta_N \to 0$, a change of variables in the time integral and the Markov property show that for every $\eta \in E$, for every $T > 0$ and for every $N$ large enough,
\[
\mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \theta_N} \in \Delta \} \, ds \Big] \;\le\; \frac{2t}{T} \max_{\xi \in E} \mathbf{E}_\xi \Big[ \int_0^T \mathbf{1}\{ \eta^N_{s \alpha_N} \in \Delta \} \, ds \Big] .
\]
Note that the process on the right hand side is speeded up by $\alpha_N$ instead of $\theta_N$.

We estimate the expression on the right hand side of the previous formula. We may, of course, restrict the maximum to $\Delta$.
Let $T_1$ be the first time the chain $\eta^N_t$ hits $\mathcal{E}$ and let $T_2$ be the time it takes for the process to return to $\Delta$ after $T_1$:
\[
T_1 \;=\; H_{\mathcal{E}} , \qquad T_2 \;=\; \inf \big\{ s > 0 : \eta^N_{T_1 + s} \in \Delta \big\} .
\]
Fix $\eta \in \Delta$ and note that
\[
\mathbf{E}_\eta \Big[ \frac{1}{T} \int_0^T \mathbf{1}\{ \eta^N_{s \alpha_N} \in \Delta \} \, ds \Big] \;\le\; \mathbf{P}_\eta \big[ T_1 > t_0 \alpha_N \big] + \mathbf{P}_\eta \big[ T_2 < T \alpha_N \big] + \frac{t_0}{T} \tag{7.6}
\]
for all $t_0 > 0$, because on the set $\{ T_1 \le t_0 \alpha_N \} \cap \{ T_2 \ge T \alpha_N \}$ the time average is bounded by $t_0 / T$. By Assertion 7.E below, the first term on the right hand side vanishes as $N \uparrow \infty$ and then $t_0 \uparrow \infty$. On the other hand, by the strong Markov property, the second term is bounded by $\max_{\xi \in \mathcal{E}} \mathbf{P}_\xi [ H_\Delta \le T \alpha_N ]$. By definition of the set $\mathcal{E}$, for every $\eta \in \mathcal{E}$ and every $\xi \in \Delta$, $\alpha_N R_N(\eta,\xi) \to 0$ as $N \uparrow \infty$. This shows that for every $T > 0$ the second term on the right hand side of (7.6) vanishes as $N \uparrow \infty$, which completes the proof of the lemma. $\square$

Assertion 7.E. For every $\eta \in \Delta$,
\[
\lim_{t \to \infty} \limsup_{N \to \infty} \mathbf{P}_\eta \big[ H_{\mathcal{E}} \ge t \alpha_N \big] \;=\; 0 .
\]

Proof. Recall that we denote by $X_R(t)$ the continuous-time Markov chain on $E$ which jumps from $\eta$ to $\xi$ at rate $R(\eta,\xi) = \lim_N \alpha_N R_N(\eta,\xi)$. Note that the set $\mathcal{E}$ consists of recurrent points for the chain $X_R(t)$, while points in $\Delta$ are transient. Since the jump rates converge, the chain $\eta^N_{t \alpha_N}$ converges in the Skorohod topology to $X_R(t)$. Therefore, for all $t > 0$ and $\eta \in \Delta$,
\[
\limsup_{N \to \infty} \mathbf{P}_\eta \big[ H_{\mathcal{E}} \ge t \alpha_N \big] \;\le\; \mathbb{P}_\eta \big[ H_{\mathcal{E}} \ge t \big] ,
\]
where $\mathbb{P}_\eta$ stands for the law of the chain $X_R(t)$ starting from $\eta$. Since the set of recurrent points for $X_R(t)$ is equal to $\mathcal{E} = \Delta^c$, the previous probability vanishes as $t \uparrow \infty$. $\square$

We conclude this section with an observation concerning the capacities of the metastable sets $\mathcal{E}_x$.

Assertion 7.F. The sequences $\{ \mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x) / \mu_N(\mathcal{E}_x) : N \ge 1 \}$, $x \in S$, are ordered.

Proof. Fix $x \in S$. By (4.3) applied to $A = \mathcal{E}_x$, $B = \breve{\mathcal{E}}_x$,
\[
\mathrm{cap}_N(\mathcal{E}_x, \breve{\mathcal{E}}_x) \;=\; \sum_{\eta \in \mathcal{E}_x} \mu_N(\eta) \sum_{\xi \in \breve{\mathcal{E}}_x} R^{\mathcal{E}}_N(\eta,\xi) .
\]
The claim of the assertion follows from this identity, from Assertion 3.A and from (7.1). $\square$

8. Proof of Theorem 2.12

Theorem 2.12 is proved in several steps.

1.
The measure of configurations in $\mathcal{G}_a$. We assumed in (H0) that all configurations in a set $\mathcal{F}_x$ have measure of the same order. We prove below in Assertion 8.A that a similar property holds for the sets $\mathcal{G}_a$.

Let
\[
\lambda^{\mathcal{F}}_N(\mathcal{F}_x) \;=\; \sum_{y : y \neq x} r^{\mathcal{F}}_N(\mathcal{F}_x, \mathcal{F}_y) , \qquad
p^{\mathcal{F}}_N(\mathcal{F}_x, \mathcal{F}_y) \;=\; \frac{r^{\mathcal{F}}_N(\mathcal{F}_x, \mathcal{F}_y)}{\lambda^{\mathcal{F}}_N(\mathcal{F}_x)} \quad \text{if } \lambda^{\mathcal{F}}_N(\mathcal{F}_x) > 0 .
\]
Denote by $P_1$ the subset of points in $P$ such that $\lambda_{\mathcal{F}}(x) = \sum_{y \neq x} r_{\mathcal{F}}(x,y) > 0$. For all $x \in P_1$ let $p_{\mathcal{F}}(x,y) = r_{\mathcal{F}}(x,y) / \lambda_{\mathcal{F}}(x)$. It follows from assumption (H1) that for all $x$, $z$ in $P$, $y \in P_1$,
\[
\lim_{N \to \infty} \beta_N \lambda^{\mathcal{F}}_N(\mathcal{F}_x) \;=\; \lambda_{\mathcal{F}}(x) , \qquad
\lim_{N \to \infty} p^{\mathcal{F}}_N(\mathcal{F}_y, \mathcal{F}_z) \;=\; p_{\mathcal{F}}(y,z) . \tag{8.1}
\]

Recall that $X_{\mathcal{F}}(t)$ is the $P$-valued Markov chain which jumps from $x$ to $y$ at rate $r_{\mathcal{F}}(x,y)$. Denote by $C_a$, $a \in \mathfrak{P} = \{1, \dots, \mathfrak{q}\}$, the equivalence classes of the Markov chain $X_{\mathcal{F}}(t)$, and let $\mathcal{C}_a = \cup_{x \in C_a} \mathcal{F}_x$. All configurations in a set $\mathcal{C}_a$ have probability of the same order.

Assertion 8.A. For all equivalence classes $C_a$, $a \in \mathfrak{P}$, and for all $\eta \neq \xi \in \mathcal{C}_a$, there exists $m(\eta,\xi) \in (0,\infty)$ such that
\[
\lim_{N \to \infty} \frac{\mu_N(\eta)}{\mu_N(\xi)} \;=\; m(\eta,\xi) .
\]

Proof. The argument is very close to the one of Assertion 7.A. Denote by $\bar{X}_N(t)$ the chain $\eta^{\mathcal{F}}(t)$ in which each set $\mathcal{F}_x$ has been collapsed to a point. We refer to Section 3 of [17] for a precise definition of the collapsed chain and for the proof of the results used below.

The chain $\bar{X}_N(t)$ takes values in the set $P$, its jump rate from $x$ to $y$, denoted by $\bar{r}_N(x,y)$, is equal to $r^{\mathcal{F}}_N(\mathcal{F}_x, \mathcal{F}_y)$ introduced in (2.3), and its unique invariant probability measure, denoted by $\bar{\mu}_N(x)$, is given by $\bar{\mu}_N(x) = \mu_N(\mathcal{F}_x) / \mu_N(\mathcal{F})$.

Fix an equivalence class $C_a$ and $\eta \neq \xi \in \mathcal{C}_a$. If $\eta$ and $\xi$ belong to the same set $\mathcal{F}_x$, the claim follows from Assumption (H0). Suppose that $\eta \in \mathcal{F}_x$, $\xi \in \mathcal{F}_y$ for some $x \neq y \in C_a$. By assumption, there exists a path $(x = x_0, \dots$
$, x_n = y)$ such that $r_{\mathcal{F}}(x_i, x_{i+1}) > 0$, $0 \le i < n$.

Denote by $\bar{\lambda}_N(x)$, $x \in P$, the holding rates of the collapsed chain $\bar{X}_N(t)$, and by $\bar{p}_N(x,y)$, $x \neq y \in P$, the jump probabilities. Since $\bar{\mu}_N$ is the invariant probability measure for the collapsed chain,
\[
\bar{\lambda}_N(y) \bar{\mu}_N(y) \;=\; \sum_{z_0, z_1, \dots, z_{n-1} \in P} \bar{\mu}_N(z_0) \bar{\lambda}_N(z_0) \, \bar{p}_N(z_0, z_1) \cdots \bar{p}_N(z_{n-1}, y)
\;\ge\; \bar{\mu}_N(x) \bar{\lambda}_N(x) \, \bar{p}_N(x, x_1) \cdots \bar{p}_N(x_{n-1}, y) .
\]
Therefore,
\[
\frac{\bar{\mu}_N(y)}{\bar{\mu}_N(x)} \;\ge\; \frac{\bar{\lambda}_N(x)}{\bar{\lambda}_N(y)} \, \bar{p}_N(x, x_1) \cdots \bar{p}_N(x_{n-1}, y) .
\]
Since $r_{\mathcal{F}}(x_i, x_{i+1}) > 0$, $0 \le i < n$, by (8.1), $\bar{p}_N(x_i, x_{i+1})$ converges to $p_{\mathcal{F}}(x_i, x_{i+1}) > 0$. For the same reason, $\beta_N \bar{\lambda}_N(x) = \beta_N \lambda^{\mathcal{F}}_N(\mathcal{F}_x)$ converges to $\lambda_{\mathcal{F}}(x) \in (0,\infty)$. As $y$ and $x$ share the same properties, inverting their roles we obtain that $\beta_N \bar{\lambda}_N(y)$ converges to $\lambda_{\mathcal{F}}(y) \in (0,\infty)$. In conclusion,
\[
\liminf_{N \to \infty} \frac{\bar{\mu}_N(y)}{\bar{\mu}_N(x)} \;>\; 0 .
\]
Exchanging the roles of $x$ and $y$ we obtain that $\liminf_N \bar{\mu}_N(x) / \bar{\mu}_N(y) > 0$. By [17], $\bar{\mu}_N(z) = \mu_N(\mathcal{F}_z) / \mu_N(\mathcal{F})$, $z \in P$. To complete the proof it remains to recall the statement of Lemma 3.1 and Assumption (H0). $\square$

By the previous assertion, for every $a \in Q$ and $\eta \in \mathcal{G}_a$,
\[
m^*_a(\eta) \;:=\; \lim_{N \to \infty} \frac{\mu_N(\eta)}{\mu_N(\mathcal{G}_a)} \;\in\; (0,1] . \tag{8.2}
\]
Thus, assumption (H0) holds for the partition $\{ \mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}} \}$.

2. The time scale. We prove in this subsection that the time-scale $\beta^+_N$ introduced in (2.10) is much longer than $\beta_N$.

Assertion 8.B. We have that
\[
\lim_{N \to \infty} \frac{\beta_N}{\beta^+_N} \;=\; 0 .
\]

Proof. We have to show that
\[
\lim_{N \to \infty} \beta_N \, \frac{\mathrm{cap}_N(\mathcal{G}_a, \breve{\mathcal{G}}_a)}{\mu_N(\mathcal{G}_a)} \;=\; 0
\]
for each $a \in Q$. Fix $a \in Q$ and recall from (2.8) the definition of the set $\mathcal{G}_a$.
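Since $G_a$ is a recurrent class for the chain $X_{\mathcal{F}}(t)$, $r_{\mathcal{F}}(x,y) = 0$ for all $x \in G_a$, $y \in P \setminus G_a$. By definition of the capacity,
\[
\frac{\mathrm{cap}_N(\mathcal{G}_a, \breve{\mathcal{G}}_a)}{\mu_N(\mathcal{G}_a)}
\;=\; \sum_{\eta \in \mathcal{G}_a} \frac{\mu_N(\eta)}{\mu_N(\mathcal{G}_a)} \, \lambda_N(\eta) \, \mathbf{P}_\eta \big[ H_{\breve{\mathcal{G}}_a} < H^+_{\mathcal{G}_a} \big]
\;\le\; \sum_{\eta \in \mathcal{G}_a} \frac{\mu_N(\eta)}{\mu_N(\mathcal{G}_a)} \, \lambda_N(\eta) \, \mathbf{P}_\eta \big[ H_{\mathcal{F} \setminus \mathcal{G}_a} < H^+_{\mathcal{G}_a} \big] .
\]
By [2, Proposition 6.1], this sum is equal to
\[
\sum_{\eta \in \mathcal{G}_a} \frac{\mu_N(\eta)}{\mu_N(\mathcal{G}_a)} \sum_{\xi \in \mathcal{F} \setminus \mathcal{G}_a} R^{\mathcal{F}}_N(\eta,\xi)
\;=\; \sum_{x \in G_a} \frac{\mu_N(\mathcal{F}_x)}{\mu_N(\mathcal{G}_a)} \sum_{y \in P \setminus G_a} r^{\mathcal{F}}_N(x,y) .
\]
Since $r_{\mathcal{F}}(x,y) = 0$ for all $x \in G_a$, $y \in P \setminus G_a$, by assumption (H1) the previous sum multiplied by $\beta_N$ converges to $0$ as $N \uparrow \infty$. $\square$

3. Condition (H1) is fulfilled by the partition $\{ \mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}} \}$. We first obtain an alternative formula for the time-scale $\beta^+_N$. The arguments and the ideas are very similar to the ones presented in the previous section. Let
\[
\frac{1}{\gamma_N} \;=\; \sum_{a \in Q} \sum_{\eta \in \mathcal{G}_a} \sum_{\xi \in \breve{\mathcal{G}}_a} R^{\mathcal{G}}_N(\eta,\xi) .
\]
By Assertion 3.A, for all $a \in Q$, $\eta \in \mathcal{G}_a$, $\xi \in \breve{\mathcal{G}}_a$, with the convention adopted in condition (H1) of Section 2,
\[
r_{\mathcal{G}}(\eta,\xi) \;:=\; \lim_{N \to \infty} \gamma_N R^{\mathcal{G}}_N(\eta,\xi) \;\in\; [0,1] . \tag{8.3}
\]

Assertion 8.C. For all $a \in Q$,
\[
\hat{\lambda}_{\mathcal{G}}(a) \;:=\; \lim_{N \to \infty} \gamma_N \, \frac{\mathrm{cap}_N(\mathcal{G}_a, \breve{\mathcal{G}}_a)}{\mu_N(\mathcal{G}_a)} \;\in\; \mathbb{R}_+ .
\]
Moreover, $\hat{\lambda}_{\mathcal{G}} = \sum_{a \in Q} \hat{\lambda}_{\mathcal{G}}(a) > 0$.

Proof. Fix $a \in Q$. By (4.3), applied to $A = \mathcal{G}_a$, $B = \breve{\mathcal{G}}_a$, by (8.2) and by (8.3),
\[
\lim_{N \to \infty} \gamma_N \, \frac{\mathrm{cap}_N(\mathcal{G}_a, \breve{\mathcal{G}}_a)}{\mu_N(\mathcal{G}_a)} \;=\; \sum_{\eta \in \mathcal{G}_a} m^*_a(\eta) \sum_{\xi \in \breve{\mathcal{G}}_a} r_{\mathcal{G}}(\eta,\xi) \;\in\; \mathbb{R}_+ ,
\]
which completes the proof of the first claim of the assertion.

By definition of $\gamma_N$ and by definition of $r_{\mathcal{G}}(\eta,\xi)$,
\[
\sum_{a \in Q} \sum_{\eta \in \mathcal{G}_a} \sum_{\xi \in \breve{\mathcal{G}}_a} r_{\mathcal{G}}(\eta,\xi) \;=\; 1 ,
\]
so that
\[
\hat{\lambda}_{\mathcal{G}} \;=\; \sum_{a \in Q} \hat{\lambda}_{\mathcal{G}}(a) \;=\; \sum_{a \in Q} \sum_{\eta \in \mathcal{G}_a} m^*_a(\eta) \sum_{\xi \in \breve{\mathcal{G}}_a} r_{\mathcal{G}}(\eta,\xi) \;\ge\; \min_{a \in Q} \min_{\eta \in \mathcal{G}_a} m^*_a(\eta) \;>\; 0 ,
\]
which is the second claim of the assertion. $\square$

It follows from the previous assertion that the time-scale $\gamma_N$ is of the same order as $\beta^+_N$:
\[
\lim_{N \to \infty} \frac{\gamma_N}{\beta^+_N} \;=\; \hat{\lambda}_{\mathcal{G}} \;\in\; (0,\infty) .
\]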
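(8.4)

Denote by $r^{\mathcal{G}}_N(\mathcal{G}_a, \mathcal{G}_b)$ the mean rate at which the trace process jumps from $\mathcal{G}_a$ to $\mathcal{G}_b$:
\[
r^{\mathcal{G}}_N(\mathcal{G}_a, \mathcal{G}_b) \;:=\; \frac{1}{\mu_N(\mathcal{G}_a)} \sum_{\eta \in \mathcal{G}_a} \mu_N(\eta) \sum_{\xi \in \mathcal{G}_b} R^{\mathcal{G}}_N(\eta,\xi) . \tag{8.5}
\]

Lemma 8.1. For every $a \neq b \in Q$,
\[
r_{\mathcal{G}}(a,b) \;:=\; \lim_{N \to \infty} \beta^+_N \, r^{\mathcal{G}}_N(\mathcal{G}_a, \mathcal{G}_b) \;=\; \frac{1}{\hat{\lambda}_{\mathcal{G}}} \sum_{\eta \in \mathcal{G}_a} m^*_a(\eta) \sum_{\xi \in \mathcal{G}_b} r_{\mathcal{G}}(\eta,\xi) \;\in\; \mathbb{R}_+ .
\]
Moreover,
\[
\sum_{a \in Q} \sum_{b : b \neq a} r_{\mathcal{G}}(a,b) \;=\; 1 .
\]

Proof. The first claim of this lemma follows from (8.2), (8.3) and (8.4). On the other hand, by the explicit formula for $r_{\mathcal{G}}(a,b)$ and by the formula for $\hat{\lambda}_{\mathcal{G}}(a)$ obtained in the previous assertion,
\[
\sum_{a \in Q} \sum_{b : b \neq a} r_{\mathcal{G}}(a,b) \;=\; \frac{1}{\hat{\lambda}_{\mathcal{G}}} \sum_{a \in Q} \sum_{\eta \in \mathcal{G}_a} m^*_a(\eta) \sum_{b : b \neq a} \sum_{\xi \in \mathcal{G}_b} r_{\mathcal{G}}(\eta,\xi) \;=\; \frac{1}{\hat{\lambda}_{\mathcal{G}}} \sum_{a \in Q} \hat{\lambda}_{\mathcal{G}}(a) .
\]
This expression is equal to $1$ by definition of $\hat{\lambda}_{\mathcal{G}}$. $\square$

4. Condition (H2) is fulfilled by the partition $\{ \mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}} \}$. The proof of condition (H2) is based on the next assertion.

Assertion 8.D. For every $a \in Q$ for which $\mathcal{G}_a$ is not a singleton and for all $\eta \neq \xi \in \mathcal{G}_a$,
\[
\liminf_{N \to \infty} \beta_N \, \frac{\mathrm{cap}_N(\eta,\xi)}{\mu_N(\mathcal{G}_a)} \;>\; 0 .
\]

Proof. Throughout this proof $c_0$ represents a positive real number independent of $N$ and which may change from line to line. Fix $a \in Q$ for which $\mathcal{G}_a$ is not a singleton, and $\eta \neq \xi \in \mathcal{G}_a$. By definition, $\mathcal{G}_a = \cup_{x \in G_a} \mathcal{F}_x$. If $\eta$ and $\xi$ belong to the same $\mathcal{F}_x$, the result follows from assumption (H2) and from Assertion 8.A.

Fix $\eta \in \mathcal{F}_x$ and $\xi \in \mathcal{F}_y$ for some $x \neq y$, $\mathcal{F}_x \cup \mathcal{F}_y \subset \mathcal{G}_a$. Recall that we denote by $\mathrm{cap}^s_N(A,B)$ the capacity between two disjoint subsets $A$, $B$ of $E$ with respect to the reversible chain introduced in (4.2).

Since $G_a$ is a recurrent class for the chain $X_{\mathcal{F}}(t)$, there exists a sequence $(x = x_0, x_1, \dots, x_n = y)$ such that $r_{\mathcal{F}}(x_i, x_{i+1}) > 0$, $0 \le i < n$.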
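In view of assumptions (H0) and (H1), there exist $\xi_i \in \mathcal{F}_{x_i}$, $\eta_{i+1} \in \mathcal{F}_{x_{i+1}}$ such that $\beta_N R^{\mathcal{F}}_N(\xi_i, \eta_{i+1}) \ge c_0$. Therefore, by Corollary 4.4 and (4.4),
\[
\beta_N \, \mathrm{cap}^s_N(\xi_i, \eta_{i+1}) \;\ge\; \frac{\beta_N}{2\,|E|} \, \mathrm{cap}_N(\xi_i, \eta_{i+1}) \;\ge\; c_0 \, \mu_N(\xi_i) , \tag{8.6}
\]
so that, by (5.4), $\beta_N c^s_N(\xi_i, \eta_{i+1}) \ge c_0 \, \mu_N(\xi_i)$.

Since the configurations $\eta$ and $\xi_0$ belong to the same set $\mathcal{F}_x$, by assumption (H2), $\beta^-_N \mathrm{cap}_N(\eta, \xi_0) / \mu_N(\mathcal{F}_x) \ge c_0$. A similar assertion holds for the pair of configurations $\eta_i$, $\xi_i$, $1 \le i < n$, and for the pair $\eta_n$, $\xi$. Hence, if we set $\eta_0 = \eta$, $\xi_n = \xi$, by Corollary 4.4 and (5.4), we have that
\[
\beta^-_N \, c^s_N(\eta_i, \xi_i) \;\ge\; c_0 \, \mu_N(\mathcal{F}_{x_i}) .
\]
By (8.2), we may replace $\mu_N(\mathcal{F}_{x_i})$ by $\mu_N(\mathcal{G}_a)$ in the previous inequality, and $\mu_N(\xi_i)$ by $\mu_N(\mathcal{G}_a)$ in (8.6). By (5.3), $c^s_N(\eta,\xi) \ge \min_{0 \le i \,\dots}$ […]

5. Condition (H3) is fulfilled by the partition $\{ \mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}} \}$. Lemma 8.2 shows that it is enough to prove condition (H3) for the trace process $\eta^{\mathcal{F}}(t)$.

Lemma 8.2. Assume that
\[
\lim_{N \to \infty} \max_{\eta \in \mathcal{F}} \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta^+_N} \in \Delta^* \} \, ds \Big] \;=\; 0 ,
\]
where $\Delta^* = \cup_{x \in G_{q+1}} \mathcal{F}_x$ has been introduced in (2.8). Then,
\[
\lim_{N \to \infty} \max_{\eta \in E} \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \beta^+_N} \in \Delta_{\mathcal{G}} \} \, ds \Big] \;=\; 0 .
\]

Proof. Fix $\eta \in E$. Since $\Delta_{\mathcal{G}} = \Delta^* \cup \Delta_{\mathcal{F}}$,
\[
\mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \beta^+_N} \in \Delta_{\mathcal{F}} \cup \Delta^* \} \, ds \Big]
\;\le\; \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \beta^+_N} \in \Delta_{\mathcal{F}} \} \, ds \Big]
+ \max_{\xi \in \mathcal{F}} \mathbf{E}_\xi \Big[ \int_0^t \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta^+_N} \in \Delta^* \} \, ds \Big] .
\]
The second term vanishes as $N \uparrow \infty$ by assumption. The first one is bounded by
\[
\frac{\beta_N}{\beta^+_N} \sum_{n=0}^{[\beta^+_N / \beta_N]} \mathbf{E}_\eta \Big[ \int_{nt}^{(n+1)t} \mathbf{1}\{ \eta^N_{s \beta_N} \in \Delta_{\mathcal{F}} \} \, ds \Big] ,
\]
where $[r]$ stands for the integer part of $r$. By the Markov property, this expression is bounded above by
\[
2 \max_{\xi \in E} \mathbf{E}_\xi \Big[ \int_0^t \mathbf{1}\{ \eta^N_{s \beta_N} \in \Delta_{\mathcal{F}} \} \, ds \Big] ,
\]
which vanishes as $N \uparrow \infty$ by assumption (H3). $\square$

To prove that condition (H3) is fulfilled by the partition $\{ \mathcal{G}_1, \dots, \mathcal{G}_q, \Delta_{\mathcal{G}} \}$ it remains to show that the assumption of the previous lemma is in force. The proof of this claim relies on the next assertion.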
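For orientation, the split of $P$ into the recurrent classes $G_1, \dots, G_q$ and the transient set $G_{q+1}$ of the limiting chain $X_{\mathcal{F}}(t)$, on which $\Delta^*$ and the next assertion rest, can be computed mechanically from the limiting rates $r_{\mathcal{F}}(x,y)$. The Python sketch below is a minimal illustration, not part of the argument: the function name `recurrent_classes`, the 0-based labels and the example rate matrix are assumptions made here, and the classes are found by a plain transitive-closure computation.

```python
def recurrent_classes(r):
    """Split {0, ..., p-1} into closed communicating classes and transient points.

    r[x][y] > 0 means that the limiting chain jumps from x to y at a positive rate.
    """
    p = len(r)
    # reach[x][y]: y can be reached from x (Warshall transitive closure).
    reach = [[x == y or r[x][y] > 0 for y in range(p)] for x in range(p)]
    for k in range(p):
        for x in range(p):
            if reach[x][k]:
                for y in range(p):
                    if reach[k][y]:
                        reach[x][y] = True

    classes, transient, seen = [], [], set()
    for x in range(p):
        if x in seen:
            continue
        cls = [y for y in range(p) if reach[x][y] and reach[y][x]]
        seen.update(cls)
        # The class of x is recurrent iff every point reachable from x leads back to x.
        if all(reach[y][x] for y in range(p) if reach[x][y]):
            classes.append(cls)
        else:
            transient.extend(cls)
    return classes, transient

# Hypothetical limiting rates on five points: {0, 1} and {3, 4} are closed,
# while point 2 leaks into both of them.
r = [[0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 1, 0],
     [0, 0, 0, 0, 2],
     [0, 0, 0, 1, 0]]
classes, transient = recurrent_classes(r)
print(classes, transient)  # [[0, 1], [3, 4]] [2]
```

In the notation of this section, the lists in `classes` play the role of $G_1, \dots, G_q$ and `transient` plays the role of $G_{q+1}$, so that $\Delta^*$ would be the union of the sets $\mathcal{F}_x$ over the transient labels.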
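Denote by $\mathbf{P}^{\mathcal{F}}_\eta$ the probability measure on $D(\mathbb{R}_+, \mathcal{F})$ induced by the trace chain $\eta^{\mathcal{F}}_t$ starting from $\eta$.

Assertion 8.E. For every $\eta \in \Delta^*$,
\[
\lim_{t \to \infty} \limsup_{N \to \infty} \mathbf{P}^{\mathcal{F}}_\eta \big[ H_{\mathcal{G}} \ge t \beta_N \big] \;=\; 0 .
\]

Proof. Fix $\eta \in \mathcal{F}_x \subset \Delta^*$. Since the partition $\mathcal{F}_1, \dots, \mathcal{F}_p, \Delta_{\mathcal{F}}$ satisfies the conditions (H1)–(H3), by Proposition 6.1, starting from $\eta$ the process $X_N(t) = \psi_{\mathcal{F}}(\eta^{\mathcal{F}}_{t \beta_N})$ converges in the Skorohod topology to the Markov chain $X_{\mathcal{F}}(t)$ on $P = \{1, \dots, p\}$ which starts from $x$ and which jumps from $y$ to $z$ at rate $r_{\mathcal{F}}(y,z)$. Therefore,
\[
\limsup_{N \to \infty} \mathbf{P}^{\mathcal{F}}_\eta \big[ H_{\mathcal{G}} \ge t \beta_N \big] \;\le\; \mathbb{P}_x \big[ H_R \ge t \big] ,
\]
where $\mathbb{P}_x$ represents the distribution of the chain $X_{\mathcal{F}}(t)$ starting from $x$ and $R = \cup_{1 \le a \le q} G_a$. Since $R$ corresponds to the set of recurrent points of the chain $X_{\mathcal{F}}(t)$, the previous expression vanishes as $t \uparrow \infty$. $\square$

Lemma 8.3. For all $t > 0$,
\[
\lim_{N \to \infty} \max_{\eta \in \mathcal{F}} \mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta^+_N} \in \Delta^* \} \, ds \Big] \;=\; 0 .
\]

Proof. Since $\beta_N / \beta^+_N \to 0$, a change of variables in the time integral, similar to the one performed in the proof of Lemma 8.2, and the Markov property show that for every $\eta \in \mathcal{F}$, every $T > 0$ and every $N$ large enough,
\[
\mathbf{E}_\eta \Big[ \int_0^t \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta^+_N} \in \Delta^* \} \, ds \Big] \;\le\; \frac{2t}{T} \max_{\xi \in \mathcal{F}} \mathbf{E}_\xi \Big[ \int_0^T \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta_N} \in \Delta^* \} \, ds \Big] .
\]
Note that the process on the right hand side is speeded up by $\beta_N$ instead of $\beta^+_N$.

We estimate the expression on the right hand side of the previous formula. We may, of course, restrict the maximum to $\Delta^*$. Let $T_1$ be the first time the trace process $\eta^{\mathcal{F}}_t$ hits $\mathcal{G}$ and let $T_2$ be the time it takes for the process to return to $\Delta^*$ after $T_1$:
\[
T_1 \;=\; H_{\mathcal{G}} , \qquad T_2 \;=\; \inf \big\{ s > 0 : \eta^{\mathcal{F}}_{T_1 + s} \in \Delta^* \big\} .
\]
Fix $\eta \in \Delta^*$ and note that
\[
\mathbf{E}_\eta \Big[ \frac{1}{T} \int_0^T \mathbf{1}\{ \eta^{\mathcal{F}}_{s \beta_N} \in \Delta^* \} \, ds \Big] \;\le\; \mathbf{P}^{\mathcal{F}}_\eta \big[ T_1 > t_0 \beta_N \big] + \mathbf{P}^{\mathcal{F}}_\eta \big[ T_2 \le T \beta_N \big] + \frac{t_0}{T}
\]
for all $t_0 > 0$. By Assertion 8.E, the first term on the right hand side vanishes as $N \uparrow \infty$ and then $t_0 \uparrow \infty$.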
On the other hand, by the strong Markov property, the second term is bounded by $\max_{\xi \in \mathcal{G}} \mathbf{P}^{\mathcal{F}}_\xi [ H_{\Delta^*} \le T \beta_N ]$. Since, by Proposition 6.1, the process $\psi_{\mathcal{F}}(\eta^{\mathcal{F}}_{t \beta_N})$ converges in the Skorohod topology to the Markov chain $X_{\mathcal{F}}(t)$,
\[
\limsup_{N \to \infty} \max_{\xi \in \mathcal{G}} \mathbf{P}^{\mathcal{F}}_\xi \big[ H_{\Delta^*} \le T \beta_N \big] \;\le\; \max_{1 \le a \le q} \max_{x \in G_a} \mathbb{P}_x \big[ H_{G_{q+1}} \le T \big] ,
\]
where, as in the proof of the previous assertion, $\mathbb{P}_x$ represents the distribution of the chain $X_{\mathcal{F}}(t)$ starting from $x$. Since the sets $G_a$ are recurrent classes for the chain $X_{\mathcal{F}}(t)$, $r_{\mathcal{F}}(x,y) = 0$ for all $x \in \cup_{1 \le a \le q} G_a$, $y \in G_{q+1}$. Therefore, the previous probability is equal to $0$ for all $T > 0$, which completes the proof of the lemma. $\square$

References

[1] L. Avena, A. Gaudillière: On some random forests with determinantal roots. arXiv:1310.1723 (2013)
[2] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains. J. Stat. Phys., 1065–1114 (2010)
[3] J. Beltrán, C. Landim: Metastability of reversible finite state Markov processes. Stoch. Proc. Appl., 1633–1677 (2011)
[4] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains II. J. Stat. Phys., 598–618 (2012)
[5] J. Beltrán, C. Landim: A Martingale approach to metastability. Probab. Th. Rel. Fields, 267–307 (2015)
[6] O. Benois, C. Landim, C. Mourragui: Hitting Times of Rare Events in Markov Chains. J. Stat. Phys., 967–990 (2013)
[7] A. Bianchi, A. Gaudillière: Metastable states, quasi-stationary distributions and soft measures. To appear in Stoch. Proc. Appl. (2016)
[8] A. Bovier, F. den Hollander: Metastability: a potential-theoretic approach. Grundlehren der mathematischen Wissenschaften, Springer, Berlin, 2015.
[9] M. Cameron, E. Vanden-Eijnden: Flows in Complex Networks: Theory, Algorithms, and Application to Lennard–Jones Cluster Rearrangement. J. Stat. Phys., 427–454 (2014)
[10] P. Chleboun, S. Grosskinsky: A dynamical transition and metastability in a size-dependent zero-range process. J. Phys. A: Math.
Theor., 055001 (2015)
[11] E. Cirillo, F. Nardi, J. Sohier: Metastability for general dynamics with rare transitions: escape time and critical configurations. arXiv:1412.7923 (2014)
[12] W. E, E. Vanden-Eijnden: Towards a theory of transition paths. J. Stat. Phys., 503–523 (2006)
[13] R. Fernandez, F. Manzo, F. Nardi, E. Scoppola, J. Sohier: Conditioned, quasi-stationary, restricted measures and metastability. Ann. Appl. Probab. (2015)
[14] R. Fernandez, F. Manzo, F. Nardi, E. Scoppola: Asymptotically exponential hitting times and metastability: a pathwise approach without reversibility. Electron. J. Probab. (2015)
[15] M. I. Freidlin, A. D. Wentzell: Random perturbations of dynamical systems. Translated from the 1979 Russian original by Joseph Szücs. Second edition. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 260. Springer-Verlag, New York, 1998.
[16] D. Gabrielli, C. Valente: Which random walks are cyclic? ALEA, Lat. Am. J. Probab. Math. Stat., 231–267 (2012)
[17] A. Gaudillière, C. Landim: A Dirichlet principle for non reversible Markov chains and some recurrence theorems. Probab. Theory Related Fields, 55–89 (2014)
[18] T. Komorowski, C. Landim, S. Olla: Fluctuations in Markov Processes, Time Symmetry and Martingale Approximation. Grundlehren der mathematischen Wissenschaften, Springer-Verlag, Berlin, New York (2012)
[19] C. Landim: A topology for limits of Markov chains. Stoch. Proc. Appl., 1058–1098 (2014)
[20] C. Landim: Metastability for a non-reversible dynamics: the evolution of the condensate in totally asymmetric zero range processes. Commun. Math. Phys., 1–32 (2014)
[21] J. Lu, E. Vanden-Eijnden: Exact dynamical coarse-graining without time-scale separation. J. Chem. Phys., 044109 (2014)
[22] F. Manzo, F. Nardi, E. Olivieri, E. Scoppola: On the essential features of metastability: tunnelling time and critical configurations. J. Stat. Phys., 591–642 (2004)
[23] C. Maes, W.
O'Kelly de Galway: A Low Temperature Analysis of the Boundary Driven Kawasaki Process. J. Stat. Phys., 991–1007 (2013)
[24] P. Metzner, Ch. Schuette, E. Vanden-Eijnden: Transition path theory for Markov jump processes. SIAM Multiscale Model. Simul., 1192–1219 (2009)
[25] R. Misturini: Evolution of the ABC model among the segregated configurations in the zero-temperature limit. arXiv:1403.4981 (2014)
[26] E. Olivieri, E. Scoppola: Markov Chains with Exponentially Small Transition Probabilities: First Exit Problem from a General Domain. I. The Reversible Case. J. Stat. Phys., 613–647 (1995)
[27] E. Olivieri, E. Scoppola: Markov Chains with Exponentially Small Transition Probabilities: First Exit Problem from a General Domain. II. The General Case. J. Stat. Phys., 987–1041 (1996)
[28] E. Olivieri, M. E. Vares: Large deviations and metastability. Encyclopedia of Mathematics and its Applications, vol. 100. Cambridge University Press, Cambridge, 2005.
[29] E. Scoppola: Renormalization group for Markov chains and application to metastability. J. Stat. Phys., 83–121 (1993)
[30] A. Singer, R. Erban, I. G. Kevrekidis, R. R. Coifman: Detecting intrinsic slow variables in stochastic dynamical systems by anisotropic diffusion maps. Proc. Natl. Acad. Sci. USA, 16090–16095 (2009)

IMPA, Estrada Dona Castorina 110, CEP 22460-320 Rio de Janeiro, Brasil.
CNRS UMR 6085, Université de Rouen, Avenue de l'Université, BP.12, Technopôle du Madrillet, F76801 Saint-Étienne-du-Rouvray, France.
e-mail: [email protected]