Analysis of estimators for adaptive Kinetic Monte Carlo
D. ARISTOFF, S. CHILL, AND G. SIMPSON
Abstract.
Adaptive Kinetic Monte Carlo combines the simplicity of Kinetic Monte Carlo (KMC) with a Molecular Dynamics (MD) based saddle point search algorithm in order to simulate metastable systems. Key to making Adaptive KMC effective is a stopping criterion for the saddle point search. In this work, we examine a criterion, recently appearing in [6], that is based on the fraction of total reaction rate found instead of the fraction of observed saddles. The criterion uses the Eyring-Kramers law to estimate the reaction rate at the MD search temperature. We also consider a related criterion that remains valid when the Eyring-Kramers law is not. We examine the mathematical properties of both estimators and prove their mean square errors are well behaved, vanishing as the simulation continues to run.

1. Introduction
An outstanding problem in theoretical materials science and chemistry is how to reach laboratory time scales of microseconds (10^{-6} s) and longer using Molecular Dynamics (MD) based models which resolve the atomistic time scale of femtoseconds (10^{-15} s). Much of this scale separation is due to the presence of metastable regions in the configuration space of the system. In such regions, often defined by local minima of an energy landscape, the system stays close to a particular configuration, such as a local minimum, before crossing into some other metastable region associated with a different configuration. Consequently, during much of a direct MD simulation, the system is close to one metastable region or another. It exhibits dynamics akin to a continuous time random walk on the set of metastable states, with comparatively long waiting times.

Since much of the physical significance of these systems is characterized by the sequence of visited metastable states and the time spent in each, there have been a variety of efforts to systematically coarse grain the MD trajectory into a more computationally efficient continuous time random walk. A.F. Voter has proposed three methods, Parallel Replica Dynamics, Hyperdynamics, and Temperature Accelerated Dynamics, which can overcome metastability through intelligent usage of the primitive Langevin dynamics [14, 16]. In recent years, significant effort has been made to understand and quantify the approximations in these methods and extend their applicability [1-3, 5, 11, 12, 15].

Date: June 19, 2018. GS was supported by US Department of Energy Award de-sc0012733. GS also thanks P. Hitczenko for helpful discussions.

Another approach to the problem is Kinetic Monte Carlo (KMC), and this will be the focus of this work. Let us assume our system is governed by a potential energy V(x), x ∈ R^d, at inverse temperature β. Furthermore, we assume that we have
partitioned configuration space into an at most countable set of metastable states, Ω_i, associated with local minima m_i of V. The system can go from metastable state i to metastable state j if there is a saddle point, s_ij, of V(x) joining Ω_i and Ω_j. For conciseness, we will assume there is a single saddle point joining two given adjacent metastable states, though, in general, there may be multiple pathways.

In traditional KMC, before a simulation is run, one must identify the metastable states, their connectivity (i.e., which ones are joined by saddle points), and the reaction rates of each such connection. Given all of this information, KMC is very cheap to simulate. A single random number is generated and used to select one of the possible reactions, the system migrates into the new metastable region, and the algorithm repeats.

Unfortunately, such complete details of the metastable states and their connectivity are, a priori, unavailable in all but the simplest low dimensional systems. This has motivated the development of Adaptive Kinetic Monte Carlo (AKMC) [6, 17, 18]. In AKMC, the system starts in some metastable region Ω_i. Saddle points associated with Ω_i are then sought via a saddle point search algorithm that successively finds s_ij. Reaction rates for each such saddle can be estimated by the Eyring-Kramers law [8]:

(1.1)  k_ij = g_ij exp[−β(V(s_ij) − V(m_i))],

where, writing λ for the sole negative eigenvalue of ∇²V(s_ij),

g_ij = (|λ|/(2π)) √| det ∇²V(m_i) / det ∇²V(s_ij) |.

Once a sufficient number of saddles associated with Ω_i have been identified, the problem is treated by using traditional KMC with the thus far identified reactions and their rates; this process then repeats in the next metastable region.
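Where the data in (1.1) are available, the rate is a direct computation. The following is a hedged illustration: the potential, the points m and s, and the value of β below are toy choices of ours, not from this paper. We take V(x, y) = (x² − 1)² + y², with minimum m = (1, 0) and saddle s = (0, 0):

```python
import numpy as np

def V(p):
    x, y = p
    return (x**2 - 1)**2 + y**2

def hessian(p):
    # Analytic Hessian of the toy potential above.
    x, _ = p
    return np.array([[12 * x**2 - 4, 0.0], [0.0, 2.0]])

def eyring_kramers_rate(m, s, beta):
    """Evaluate (1.1): k = g * exp(-beta (V(s) - V(m))), with prefactor
    g = (|lambda| / (2 pi)) * sqrt(|det Hess V(m) / det Hess V(s)|)."""
    Hm, Hs = hessian(m), hessian(s)
    eigs = np.linalg.eigvalsh(Hs)
    lam = eigs[eigs < 0][0]  # the sole negative eigenvalue at the saddle
    g = abs(lam) / (2 * np.pi) * np.sqrt(abs(np.linalg.det(Hm) / np.linalg.det(Hs)))
    return g * np.exp(-beta * (V(s) - V(m)))

m, s = np.array([1.0, 0.0]), np.array([0.0, 0.0])
print(eyring_kramers_rate(m, s, beta=5.0))  # barrier V(s) - V(m) = 1
```

In AKMC this evaluation is performed once per newly found saddle, with β = β_lo.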
Two things are needed to proceed with AKMC:
(1) A saddle point search algorithm;
(2) A stopping criterion.
In this work, we will consider the question of the stopping criterion, provided our saddle point search algorithm satisfies certain assumptions. Our analysis will focus on estimators similar to the one introduced by Chill & Henkelman in [6]. We call these Chill type estimators.

In [6], the authors searched for saddle points out of each metastable state using high temperature MD. For concreteness, consider the Brownian dynamics in R^d:

(1.2)  dX_t = −∇V(X_t) dt + √(2 β^{-1}) dW_t.

The aim is to model the dynamics at low temperature β = β_lo. Starting at X_0 ∈ Ω_i, integrate (1.2) at a higher temperature β = β_hi (i.e., β_lo > β_hi) until the trajectory leaves Ω_i. Using the higher temperature β_hi allows an escape to occur more quickly. After the trajectory leaves Ω_i, one of the saddle points s_ij is identified with this pathway using, for instance, the nudged elastic band method [9, 10], and the low temperature reaction rate is computed using (1.1) with β = β_lo. This is then repeated, with a new initial condition chosen in Ω_i. Throughout, the cumulative simulation time is recorded.

Other saddle point search algorithms have been proposed, including the Dimer method and the string method [7, 13]. In our analysis, the key property that we need to hold true for all of our search methods is the following. Let

(1.3)  N_ij(t) = Number of times saddle s_ij has been found by time t.

Then for fixed i, during a saddle point search, the N_ij(t) are, with respect to j, independent Poisson processes. We prove below that this holds for a carefully performed saddle point search via integration of (1.2).

This article is organized as follows. We describe the saddle point search in detail in Section 2 below, and prove some of its properties, including the above condition on N_ij(t), in Section 3 below.
In Section 4 we introduce stopping criteria for the saddle point search, and in Section 5 we analyze these criteria. Section 6 contains proofs of some of the estimates in Section 5. In Section 7 we make some concluding remarks.

2. Notation and saddle point search algorithm
Here and throughout, (X_t) is Brownian dynamics, that is, a stochastic process satisfying (1.2). For simplicity we fix a single metastable set Ω ≡ Ω_i and suppress the index i in all of our notations from the Introduction. For our purposes, V is smooth, and Ω is an (open) basin of attraction of V with respect to the gradient dynamics dy/dt = −∇V(y). We assume that ∂Ω is partitioned into finitely many disjoint (measurable) subsets, called pathways and labeled 1, 2, ..., N, such that each pathway j contains a unique saddle point s_j of V. When (X_t) leaves Ω, it must exit through one of the pathways 1, 2, ..., N.

The algorithm, as well as our analysis, depends heavily on the quasistationary distribution (QSD) for (X_t) in Ω, which we denote by ν. The QSD ν is a probability measure that is locally invariant for (X_t), in the sense that it is invariant conditionally on the event that (X_t) remains in Ω:

Definition 2.1.
The QSD for (X_t) in Ω is a probability measure ν supported in Ω such that for all t > 0,

ν(·) = P(X_t ∈ · | X_0 ∼ ν, X_s ∈ Ω for all s ∈ [0, t]).

Of course ν depends on Ω, but for simplicity we do not indicate this explicitly. It has been shown [11] that ν exists, is unique, and satisfies

(2.1)  ν(A) = lim_{t→∞} P(X_t ∈ A | X_s ∈ Ω for s ∈ [0, t]),  for all A ⊂ Ω.

Moreover this convergence is exponentially fast, uniformly in A. Equation (2.1) leads to simple algorithms for sampling ν, based on the idea that a sample can be obtained from the endpoint of a trajectory of (X_t) that has remained in Ω for a sufficiently long time; see [5] for details.

We are now ready to state the high temperature saddle point search algorithm. Versions of this algorithm have been used previously; see for instance [6] and references therein. The search runs at a user-specified "high" (inverse) temperature β_hi. Below we write ν for the QSD in Ω at temperature β = β_hi. We also write H(t) for the Heaviside function, H(t) = 0 for t < 0 and H(t) = 1 for t ≥ 0.
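Equation (2.1) suggests a simple way to draw approximate samples of ν in practice: integrate a discretization of (1.2) and keep the endpoint of a trajectory that stayed in Ω for a sufficiently long time, rejecting trajectories that leave. The sketch below is a minimal illustration; the 1D double well, the choice Ω = (−2, 0), and all numerical parameters are assumptions of ours, not from this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_V(x):
    return 4.0 * x * (x**2 - 1.0)  # V(x) = (x^2 - 1)^2; Omega's minimum is at x = -1

def sample_qsd(x0, beta, T, dt, omega=(-2.0, 0.0)):
    """Euler-Maruyama for (1.2); return the endpoint if the path stayed in
    omega up to time T (an approximate QSD sample), else None (rejected)."""
    a, b = omega
    x = x0
    for _ in range(int(T / dt)):
        x += -grad_V(x) * dt + np.sqrt(2.0 * dt / beta) * rng.normal()
        if not a < x < b:
            return None  # left Omega: reject and start over
    return x

samples = []
while len(samples) < 100:
    x = sample_qsd(x0=-1.0, beta=4.0, T=1.0, dt=1e-3)
    if x is not None:
        samples.append(x)
print(np.mean(samples))  # concentrates near the minimum at -1
```

Step 1 of Algorithm 2.2 below can reuse such a sampler at β = β_hi; see [5] for quantitative versions of this idea.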
Algorithm 2.2.
Set N_j(t) ≡ 0 for t ≥ 0 and j = 1, ..., N. Let M be the current cycle of the algorithm, and t_sim the simulation clock. Initialize M = 1 and t_sim = 0, and iterate the following:

(1) Generate a sample x_M from ν. The simulation clock t_sim is stopped during this step.
(2) Starting at X_0 = x_M, evolve (X_t) at β = β_hi until it first leaves Ω, say at time t = τ^(M), through pathway I^(M). The simulation clock t_sim is running during this step, and the stopping criterion is continuously checked. If at some time t_sim the criterion is met, the algorithm stops.
(3) If I^(M) = j, update N_j(t) = N_j(t) + H(t − t_sim) for t ≥ 0 and record the saddle point s_j. Then update M = M + 1. The simulation clock t_sim is stopped during this step.

It is not necessary to know N, and the pathways can be given labels according to the order in which they are found. The simulation clock is cumulative, and it only increases in Step 2. In particular, during the M-th cycle of the algorithm, t_sim increases by τ^(M). The stopping criterion will be described in Section 4. Below we write t_sim for the final value of the simulation clock in the algorithm, that is, its value when the simulation is stopped. To refer to a generic simulation clock time we write t. Thus, 0 ≤ t ≤ t_sim, and N_j(t) is the number of times an exit through pathway j has been observed by time t. We will also use the following notations:

(2.2)  χ_j(t) = 1{N_j(t) ≥ 1},   N(t) = Σ_{j=1}^N N_j(t).

That is, χ_j(t) = 1 if an exit through pathway j has been observed at least once by time t, and is 0 otherwise; N(t) is the total number of exits observed by time t.

3. Properties of the saddle point search
Our first result follows immediately from properties of the QSD established in [11].
Theorem 3.1.
Suppose that in Step 1 in the M-th cycle of Algorithm 2.2, x_M is a random variable with distribution ν. Then:
(i) τ^(M) is exponentially distributed with mean κ^{-1}: P(τ^(M) > t) = exp(−κt);
(ii) τ^(M) and I^(M) are independent.

Theorem 3.1 then leads to the following.
Theorem 3.2.
Suppose that in Step 1 of Algorithm 2.2, x_1, x_2, ... are iid with common distribution ν. Then:
(i) {N(t)}_{0≤t≤t_sim} is a Poisson process with parameter κ;
(ii) {N_j(t)}_{0≤t≤t_sim}, j = 1, ..., N, are independent Poisson processes with parameters

(3.1)  κ_j := κ p_j,   p_j := P(I^(1) = j).

Proof.
Let (Ñ(s))_{s≥0} be a Poisson process with parameter κ, which we denote by Ñ(s) for brevity. Label each arrival time of Ñ(s) with a pathway j according to the distribution p_j, independently of the other arrival times, and let Ñ_j(s) be the process with arrivals labeled by j. Then for r, s ≥ 0 and m_1, ..., m_N ≥ 0,

(3.2)
P( ∩_{j=1}^N { Ñ_j(r+s) − Ñ_j(r) = m_j } )
= P( Ñ(r+s) − Ñ(r) = Σ_{j=1}^N m_j ) · ( (m_1 + ... + m_N)! / (m_1! ... m_N!) ) Π_{j=1}^N p_j^{m_j}
= Π_{j=1}^N e^{−κ p_j s} (κ p_j s)^{m_j} / m_j!.

By summing over all m_i ≥ 0, i ≠ j, in the last expression above, we see that for fixed r, s ≥ 0, the increment Ñ_j(r+s) − Ñ_j(r) is Poisson distributed with mean κ p_j s. Ñ_j(s) also inherits independent increments from Ñ(s). This shows that Ñ_j(s) is a Poisson process with parameter κ_j = κ p_j. Moreover, (3.2) shows that Ñ_j(s), j = 1, ..., N, are independent.

Let us now relate (Ñ(s))_{s≥0} with (N(s))_{0≤s≤t_sim}. For fixed s ∈ [0, t_sim], the time marginal N(s) is the largest m such that τ^(1) + ... + τ^(m) ≤ s. Together with part (i) of Theorem 3.1, this shows that on [0, t_sim], (N(s))_{0≤s≤t_sim} and (Ñ(s))_{s≥0} are Poisson processes with the same law. By part (ii) of Theorem 3.1, it follows that the multivariate processes (N_j(s))_{j=1,...,N; 0≤s≤t_sim} and (Ñ_j(s))_{j=1,...,N; 0≤s≤t_sim} have the same law. This establishes the result. □

4. Chill type estimators and stopping criteria
The purpose of the high temperature saddle point search (Algorithm 2.2) is to locate "enough" of the low-temperature rate corresponding to the metastable set Ω. More precisely, at a low temperature corresponding to β = β_lo, the first exit time of X_t from Ω is approximately exponentially distributed with mean (k_1 + ... + k_N)^{-1}, where k_j = k_j^lo is given by the Eyring-Kramers law (1.1) at β = β_lo (recall the subscript i has been suppressed). See [4] and references therein for rigorous results in this direction. The k_j's are then exponential rates associated with leaving Ω through pathway j at low temperature β_lo. The proportion of low temperature rate found by time t in Algorithm 2.2 is

(4.1)  R(t) := Σ_{j=1}^N χ_j(t) k_j / Σ_{j=1}^N k_j.

The expected value of R(t) is

(4.2)  E[R(t)] = R̄(t) := Σ_{j=1}^N p_j(t) k_j / Σ_{j=1}^N k_j,

where

(4.3)  p_j(t) := E[χ_j(t)] = 1 − exp(−κ_j t).

Here κ_j is defined as in Theorem 3.2 at temperature β = β_hi. The idea behind Chill type estimators is that when R(t) is sufficiently close to 1, the high temperature saddle point search can stop. There are two obstacles to this idea.
The first is that, at any time during Algorithm 2.2, it is unlikely that all saddle points have been found. This problem is remedied by replacing k_j in (4.1) with χ_j(t) k_j, which is computable once pathway j has been found during the simulation. The second obstacle is that an exact formula for p_j(t) := E[χ_j(t)] will not be known in practice. Chill type estimators overcome the latter obstacle by using one of the following approximations:

(4.4)  p̃_j(t) := 1 − exp[−k_j^hi t],  with k_j^hi given by the Eyring-Kramers law at β = β_hi,
       p̂_j(t) := 1 − exp[−N̂_j(t)],  where N̂_j(t) := N_j(t) if N_j(t) ≥ 2, and 0 else.

We have used the superscript hi to indicate that the rate in (4.4) is computed at temperature β_hi (whereas k_j is computed at low temperature β_lo). Also note that p̃_j(t) is a physical estimate of E[χ_j(t)] based on Eyring-Kramers, while p̂_j(t) is a (biased) Monte Carlo estimator. From (4.4) we obtain the following estimators for R(t):

(4.5)  R̃(t) := Σ_{j=1}^N p̃_j(t) χ_j(t) k_j / Σ_{j=1}^N χ_j(t) k_j,   R̂(t) := Σ_{j=1}^N p̂_j(t) χ_j(t) k_j / Σ_{j=1}^N χ_j(t) k_j.

R(t), R̃(t), and R̂(t) are all random, while R̄(t) is deterministic. Both R̃(t) and R̂(t) are explicitly computable at time t during the saddle point search. See [6] for further discussion of R̃(t). To our knowledge R̂(t) has not appeared before in the literature. We emphasize that R̂(t) may be used at any temperature β_hi, while R̃(t) is limited by the fact that it gives reasonable estimates of R(t) only at (relatively low) temperatures where the Eyring-Kramers law holds.

After choosing R̃(t) (resp. R̂(t)) as the preferred estimator, the stopping criterion can now be defined as follows: for a user-specified parameter ε > 0, stop Algorithm 2.2 in Step 3 if and only if

(4.6)  R̃(t) > 1 − ε  (resp. R̂(t) > 1 − ε).

In Section 5 we give rigorous estimates of the bias and variance of the estimators R̃(t) and R̂(t). Such estimates will show that, when the algorithm stops, on average at least a fraction 1 − ε of the low temperature rate has been found.

5. Analysis
The approximation p̃_j(t) of p_j(t) is usually considered valid when β_hi^{-1} ≪ V(s_j) − V(m), with m the minimizer of V in Ω. To the authors' knowledge, rigorous results are scarce except when s_j = argmin_{s_1,...,s_N} (V(s_j) − V(m)); see [4] and references therein. However, the following is a consequence of results in [2]:

Theorem 5.1.
Suppose
Ω = (a, b) is an interval and V is a Morse potential. Then for each t > 0,

(5.1)  (1 − p̃_j(t)) / (1 − p_j(t)) = 1 + O(1/β_hi)  as β_hi → ∞,  j = 1, 2.

Proof.
An examination of the proof of Theorem 4.1 of [2] shows that for j = 1, 2,

k_j^hi / κ_j = 1 + O(1/β_hi)  as β_hi → ∞,

where k_j^hi is as in (4.4), and κ_j is as in Theorem 3.2 at temperature β = β_hi. The result follows. □

We next examine the approximation p̂_j(t) of p_j(t).

Theorem 5.2.
Conditionally on N_j(t) ≥ 1, N̂_j(t) is an unbiased estimator of κ_j t:

(5.2)  E[N̂_j(t) | N_j(t) ≥ 1] = κ_j t.

Also conditionally on N_j(t) ≥ 1, p̂_j(t) is a conservative estimate of p_j(t):

(5.3)  E[p̂_j(t) | N_j(t) ≥ 1] ≤ p_j(t).

Proof.
Recall that N_j(t) is a Poisson process with parameter κ_j. Thus,

E[N̂_j(t) | N_j(t) ≥ 1] = (1 − e^{−κ_j t})^{−1} Σ_{n=2}^∞ n (κ_j t)^n e^{−κ_j t} / n!
= (1 − e^{−κ_j t})^{−1} ( κ_j t − κ_j t e^{−κ_j t} )
= κ_j t,

where we used Σ_{n=1}^∞ n (κ_j t)^n e^{−κ_j t}/n! = κ_j t and then subtracted the n = 1 term. Since x ↦ 1 − e^{−x} is a concave function, the second statement of the theorem follows from Jensen's inequality. □

The reason that we consider conditional expectations in Theorem 5.2 is that Algorithm 2.2 cannot stop before N(t) ≥ 1. Thus, we want estimates conditioned on that event. We call p̂_j(t) a conservative estimate for p_j(t) because it is a lower bound on average, so that using p̂_j(t) in place of p_j(t) leads to a larger average stopping time for Algorithm 2.2.

Before proceeding we define, for real-valued random variables X and Y,

(5.4)  Bias(X, Y) := E[X − Y],   MSE(X, Y) := Bias(X, Y)² + Var(X).

Observe that the mean square error is not symmetric in its arguments.
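Both statements of Theorem 5.2 are easy to probe numerically. A hedged sketch, where the value of κ_j t and the sample size are arbitrary illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
kappa_t = 1.3                             # an assumed value of kappa_j * t
n = rng.poisson(kappa_t, size=2_000_000)  # draws of N_j(t)
hat_n = np.where(n >= 2, n, 0)            # hatN_j(t) from (4.4)
cond = n >= 1                             # condition on the event {N_j(t) >= 1}

# (5.2): conditional unbiasedness, E[hatN_j(t) | N_j(t) >= 1] = kappa_j t
print(hat_n[cond].mean())                 # close to 1.3

# (5.3): conservativeness, E[1 - exp(-hatN_j(t)) | N_j(t) >= 1] <= p_j(t)
p = 1.0 - np.exp(-kappa_t)
print((1.0 - np.exp(-hat_n[cond])).mean() <= p)  # True
```

The gap in the second comparison is exactly the (conservative) bias that enlarges the average stopping time of Algorithm 2.2.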
Theorem 5.3.
Write q_j(t) = 1 − p_j(t) = exp[−κ_j t] and K = k_1 + ... + k_N. For the estimator R̃(t),

(5.5)
|Bias(R̃(t), R(t))| ≤ N max_j |Bias(p̃_j(t), p_j(t))| + (K/min_j k_j) R̄(t) max_j q_j(t),
Var(R̃(t)) ≤ 5N (K/min_j k_j) R̄(t) max_j q_j(t),
MSE(R̃(t), R(t)) ≤ 2N² max_j MSE(p̃_j(t), p_j(t)) + (K/min_j k_j) ( 5N + 2 (K/min_j k_j) R̄(t) max_j q_j(t) ) R̄(t) max_j q_j(t).
For the estimator R̂(t),

(5.6)
|Bias(R̂(t), R(t))| ≤ N max_j |Bias(p̂_j(t), p_j(t))| + (K/min_j k_j) R̄(t) max_j q_j(t),
Var(R̂(t)) ≤ 4N (K/min_j k_j) R̄(t) max_j q_j(t) + ( 2 + 4 (K/min_j k_j)² max_j q_j(t) ) max_j Var(p̂_j(t)),
MSE(R̂(t), R(t)) ≤ ( 2N² + 2 + 4 (K/min_j k_j)² max_j q_j(t) ) max_j MSE(p̂_j(t), p_j(t)) + 2 (K/min_j k_j) ( 2N + (K/min_j k_j) R̄(t) max_j q_j(t) ) R̄(t) max_j q_j(t).

Here, all maxima and minima are taken over j ∈ {1, ..., N}.

Proof. We give proofs in Section 6 below. □
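To make the estimators concrete, the sketch below evaluates (4.1) and (4.5) on synthetic data. All rates are invented for illustration, and, per Theorem 3.2, the exit counts N_j(t) are drawn directly as independent Poisson variables rather than produced by an actual saddle point search:

```python
import numpy as np

rng = np.random.default_rng(3)
k_hi = np.array([0.9, 0.5, 0.2, 0.05])   # assumed high temperature rates kappa_j
k_lo = np.array([0.6, 0.3, 0.08, 0.01])  # assumed low temperature rates k_j
t = 10.0

n = rng.poisson(k_hi * t)                # N_j(t), one draw per pathway
chi = (n >= 1).astype(float)             # chi_j(t): pathway j seen at least once
n_hat = np.where(n >= 2, n, 0)           # hatN_j(t) from (4.4)

R = (chi * k_lo).sum() / k_lo.sum()      # true fraction of rate found, (4.1)
p_tilde = 1.0 - np.exp(-k_hi * t)        # Eyring-Kramers estimate of p_j(t)
p_hat = 1.0 - np.exp(-n_hat)             # Monte Carlo estimate of p_j(t)
denom = (chi * k_lo).sum()
R_tilde = (p_tilde * chi * k_lo).sum() / denom  # (4.5)
R_hat = (p_hat * chi * k_lo).sum() / denom      # (4.5)
print(R, R_tilde, R_hat)
```

In Algorithm 2.2 the search would stop once R̃(t) (or R̂(t)) exceeds 1 − ε, per (4.6).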
We note that some of the bounds in Theorem 5.3 have been loosened so that simpler expressions are obtained. This will become clear in the derivation of the bounds in Section 6 below. We highlight that the bias is bounded by the bias of the estimate of p_j(t), together with another term representing an "inherent" bias associated with R̄(t). This second term may be approximated by noting that |R̄(t)| ≤ 1 and, due to Theorem 5.1, we expect q_j(t) can be estimated by the known quantities 1 − p̃_j(t) or 1 − p̂_j(t).

6. Estimates
In this section we give a proof of Theorem 5.3. Recall that q_j(t) := 1 − p_j(t) and K := Σ_{j=1}^N k_j is the total reaction rate. For brevity, we will sometimes suppress the t dependence in our expressions. Also, all sums are over 1, ..., N unless otherwise indicated.

6.1. Preliminary Calculations.
Observe that

Bias(R̃(t), R(t)) = Bias(R̃(t), R̄(t)),   MSE(R̃(t), R(t)) = MSE(R̃(t), R̄(t)),

and similarly for R̂(t); this fact will be used below without comment. There are a few expressions that will show up repeatedly in the analyses of both R̃ and R̂. We analyze them here for simplicity. Let

(6.1)  ξ_i = k_i + Σ_{m≠i} k_m χ_m.

We make the following calculations:

(6.2a)  k_i ≤ ξ_i ≤ K,
(6.2b)  E[ξ_i] = k_i + Σ_{m≠i} p_m k_m = K − Σ_{m≠i} q_m k_m.

A lower bound on E[ξ_i^{-1}] can be obtained from Jensen's inequality,

(6.3)  E[ξ_i^{-1}] ≥ E[ξ_i]^{-1} = 1/(K − Σ_{m≠i} q_m k_m) ≥ 1/K + (1/K²) Σ_{m≠i} k_m q_m,

while an upper bound can be obtained from the Edmundson-Madansky inequality,

(6.4)  E[ξ_i^{-1}] ≤ (1/k_i)·(K − E[ξ_i])/(K − k_i) + (1/K)·(E[ξ_i] − k_i)/(K − k_i) = 1/K + (1/(k_i K)) Σ_{m≠i} k_m q_m.

In the same way,

(6.5)  E[ξ_i^{-2}] ≥ E[ξ_i]^{-2} = (K − Σ_{m≠i} q_m k_m)^{-2} ≥ 1/K² + (2/K³) Σ_{m≠i} q_m k_m,

and

(6.6)  E[ξ_i^{-2}] ≤ (1/k_i²)·(K − E[ξ_i])/(K − k_i) + (1/K²)·(E[ξ_i] − k_i)/(K − k_i) = 1/K² + ((K + k_i)/(k_i² K²)) Σ_{m≠i} q_m k_m.

Therefore,

(6.7)  Var(ξ_i^{-1}) ≤ ( (K + k_i)/(k_i² K²) − 2/K³ ) Σ_{m≠i} q_m k_m ≤ (2/(K k_i²)) Σ_{m≠i} q_m(t) k_m,

where we have lost some of the estimate in the last inequality for the sake of conciseness.

6.2. Estimates for R̃. Below it is useful to notice that

(6.8)  R̃(t) = Σ_{i=1}^N p̃_i(t) χ_i(t) k_i / ( k_i + Σ_{m≠i} χ_m(t) k_m ) = Σ_i p̃_i χ_i k_i/ξ_i.

6.2.1. Bias.
We begin with the direct calculation

E[R̃ − R̄] = Σ_{i=1}^N E[ χ_i p̃_i k_i/ξ_i − χ_i k_i/K ]
= Σ_{i=1}^N (p̃_i − p_i) E[χ_i k_i/ξ_i] + Σ_{i=1}^N E[ χ_i p_i k_i/ξ_i − χ_i k_i/K ]
= Σ_{i=1}^N (p̃_i − p_i) E[χ_i k_i/ξ_i] + Σ_{i=1}^N b_i p_i k_i/K,   b_i := E[K p_i/ξ_i] − 1,

where in the last step we used the independence of χ_i and ξ_i (Theorem 3.2) together with E[χ_i] = p_i.
Using (6.3) and (6.4),

(p_i/K) Σ_{m≠i} k_m q_m − q_i ≤ b_i ≤ (p_i/k_i) Σ_{m≠i} k_m q_m − q_i,

so that |b_i| ≤ (1/k_i) Σ_{j=1}^N k_j q_j. Thus,

| Σ_{i=1}^N b_i p_i k_i/K | ≤ Σ_{i=1}^N Σ_{j=1}^N (k_j/k_i) q_j p_i k_i/K ≤ (K/min_j k_j) R̄(t) max_j q_j(t).

Since k_i/ξ_i ≤ 1 and E[χ_i] ≤ 1, the first sum in the bias calculation is bounded by N max_i |p̃_i − p_i|. Combining the above expressions gives

(6.9)  |Bias(R̃(t), R̄(t))| ≤ N max_i |p̃_i(t) − p_i(t)| + (K/min_i k_i) R̄(t) max_i q_i(t).

6.2.2. Variance.
For the variance, we first write

(6.10)  R̃ − E[R̃] = Σ_{i=1}^N ( χ_i/ξ_i − E[χ_i/ξ_i] ) p̃_i k_i.

Hence,

(6.11)  Var(R̃(t)) = Σ_{i,j=1}^N k_i k_j p̃_i p̃_j v_ij,   v_ij := Cov(χ_i/ξ_i, χ_j/ξ_j).

Since v_ij ≤ √v_ii √v_jj, it will be sufficient for us to analyze the diagonal terms. By Theorem 3.2, χ_i and ξ_i are independent. Thus

(6.12)  v_ii = E[χ_i]² Var(ξ_i^{-1}) + E[ξ_i^{-1}]² Var(χ_i) + Var(ξ_i^{-1}) Var(χ_i) = p_i² Var(ξ_i^{-1}) + p_i q_i E[ξ_i^{-2}].

Using (6.6) and (6.7),

(6.13)  v_ii ≤ (2 p_i²/(K k_i²)) Σ_{m≠i} q_m k_m + p_i q_i ( 1/K² + ((K + k_i)/(k_i² K²)) Σ_{m≠i} q_m k_m )
≤ p_i q_i/K² + (4 p_i/(K k_i²)) Σ_{m≠i} q_m k_m
≤ (5 p_i/k_i²) max_j q_j(t).

We have made some sacrifices in the last inequalities in order to obtain a more concise expression. Consequently, using √v_ii ≤ √5 √(p_i max_j q_j)/k_i together with the Cauchy-Schwarz bound (Σ_i √p_i)² ≤ N Σ_i p_i ≤ N (K/min_i k_i) R̄(t),

(6.14)  Var(R̃(t)) ≤ Σ_{i,j=1}^N k_i k_j p̃_i(t) p̃_j(t) √v_ii √v_jj = ( Σ_{i=1}^N k_i p̃_i(t) √v_ii )² ≤ 5N (K/min_i k_i) R̄(t) max_i q_i(t).

6.2.3. MSE.
Combining (6.9) and (6.14), we then obtain

(6.15)  MSE(R̃(t), R̄(t)) ≤ 2N² max_i MSE(p̃_i(t), p_i(t)) + (K/min_i k_i) ( 5N + 2 (K/min_i k_i) R̄(t) max_i q_i(t) ) R̄(t) max_i q_i(t).

In this calculation, we see that the mean square error may ultimately be dominated by how well the p̃_i approximate the p_i.

6.3. Estimates for R̂. We begin by noting that, since p̂_j(t) = 0 if χ_j(t) = 0,

(6.16)  R̂(t) = Σ_j p̂_j(t) k_j / ( k_j + Σ_{m≠j} χ_m(t) k_m ) = Σ_j p̂_j k_j/ξ_j.
6.3.1. Bias.
We begin by writing

(6.17)  R̂ − R̄ = Σ_{i=1}^N (p̂_i − p_i) k_i/ξ_i + Σ_{i=1}^N ( k_i p_i/ξ_i − k_i p_i/K ),

so that, after taking an expectation,

(6.18)  E[R̂ − R̄] = Σ_{i=1}^N E[ (p̂_i − p_i) k_i/ξ_i ] + Σ_{i=1}^N ( E[K/ξ_i] − 1 ) k_i p_i/K.
Hence, using the independence of p̂_i and ξ_i (Theorem 3.2), k_i/ξ_i ≤ 1, and the bounds (6.3) and (6.4) as in Section 6.2.1,

(6.19)  |Bias(R̂(t), R̄(t))| ≤ N max_i |Bias(p̂_i(t), p_i(t))| + (K/min_i k_i) R̄(t) max_i q_i(t),

and we see that the observed bias is controlled by the biases of the approximate probabilities, p̂_i, and the inherent bias of the Chill type estimators.

6.3.2. Variance.
For the variance, we have

(6.20)  Var(R̂) = Σ_{i,j=1}^N k_i k_j v̂_ij,   v̂_ij := Cov(p̂_i/ξ_i, p̂_j/ξ_j).

As before, we only need to study the diagonal entries, and use Theorem 3.2 to obtain

(6.21)  v̂_ii = E[p̂_i]² Var(ξ_i^{-1}) + E[ξ_i^{-1}]² Var(p̂_i) + Var(p̂_i) Var(ξ_i^{-1})
= E[p̂_i]² Var(ξ_i^{-1}) + E[ξ_i^{-2}] Var(p̂_i)
≤ (2 p_i/k_i²) max_i q_i + ( 1/K² + (2/min_i k_i²) max_i q_i ) Var(p̂_i),

where we used E[p̂_i] ≤ p_i together with (6.6) and (6.7). We note that these estimates require full independence of the N_j(t) for j = 1, ..., N, not just independence of the χ_j(t). Now, using √(a + b) ≤ √a + √b and (x + y)² ≤ 2x² + 2y² as in Section 6.2.2,

(6.22)  Var(R̂(t)) ≤ 4N (K/min_i k_i) R̄(t) max_i q_i(t) + ( 2 + 4 (K/min_i k_i)² max_i q_i(t) ) max_i Var(p̂_i(t)).

6.3.3. MSE.
We can therefore express the mean square error of the estimator R̂ as

(6.23)  MSE(R̂(t), R̄(t)) ≤ ( 2N² + 2 + 4 (K/min_i k_i)² max_i q_i(t) ) max_i MSE(p̂_i(t), p_i(t)) + 2 (K/min_i k_i) ( 2N + (K/min_i k_i) R̄(t) max_i q_i(t) ) R̄(t) max_i q_i(t).

Figure 1.
Comparison of the Chill type estimators R̃(t) and R̂(t) to the true expected proportion of the low temperature rate found, R(t), on a test system that can deviate from the Eyring-Kramers law.

Figure 2.
Comparison of the expected value of the Chill type estimators R̃(t) and R̂(t) to the true expected proportion of the low temperature rate found, R(t), on a test system that can deviate from the Eyring-Kramers law.

7. Discussion
We have considered two Chill type estimators and shown them to be consistent. Their biases are small, relative to their variances, and thus we have good estimators of R(t), the true fraction of the rate in the system that has been observed. They represent a significant improvement over the original AKMC stopping criterion presented in [17]. Indeed, these prior approaches attempted to estimate the fraction of the saddles observed when, in fact, it is the fraction of the observed rate that is of fundamental importance.

As an example, we will compare the accuracy of both estimators using a test system that consists of saddle points s_j corresponding to potential energy barriers V(s_j) − V(m) = 1 + j, for j = 0, ..., 19. The test system has rates that obey a modified Arrhenius equation with the form:

(7.1)  k̃_j^hi = (β_lo/β_hi)^n g_j exp[−β_hi (V(s_j) − V(m))].

Compare to equation (1.1) (recall the subscript i has been suppressed). The variable n controls how the rates deviate from an unmodified Arrhenius rate law. When n = 0 the modified rates k̃_j^hi are equal to the unmodified rates k_j^hi, while when β_hi < β_lo, the modified rates are larger (resp. smaller) than the unmodified rates if n > 0 (resp. n < 0). This means (N_j(t))_{j=1,...,N; 0≤t≤t_sim} are independent Poisson processes with parameters k̃_j^hi. To compute R(t), we use (4.1) and sample χ_j(t) via (2.2). To compute R̃(t) we use the unmodified Arrhenius rates k_j^hi in equation (4.4). For each of R(t), R̃(t) and R̂(t), the low temperature rates k_j = k_j^lo used in equations (4.1) and (4.5) are the same. We take g_j = 1 for all j, with β_hi = 2.5 and β_lo = 10.0. The variable n was varied to compare the cases where the Eyring-Kramers rates k_j^hi underestimate (n > 0), overestimate (n < 0), and exactly match (n = 0) the modified rates k̃_j^hi. Results are shown in Figures 1 and 2. The test system shows that R̃(t) can overestimate R(t) if the Eyring-Kramers rate deviates from the true rate at β_hi, while R̂(t) tends to provide a conservative estimate of R(t).

References

[1] D. Aristoff. The parallel replica method for computing equilibrium averages of Markov chains. arXiv.org, February 2015.
[2] D. Aristoff and T. Lelièvre. Mathematical analysis of temperature accelerated dynamics. MMS, 12(1):290-317, 2014.
[3] D. Aristoff, T. Lelièvre, and G. Simpson. The parallel replica method for simulating long trajectories of Markov chains. Appl. Math. Res. Express, 2014:332-352, 2014.
[4] N. Berglund and S. Dutercq. The Eyring-Kramers law for Markovian jump processes with symmetries. Journal of Theoretical Probability, 2015.
[5] A. Binder, T. Lelièvre, and G. Simpson. A generalized parallel replica dynamics. Journal of Computational Physics, 284(C):595-616, March 2015.
[6] S.T. Chill and G. Henkelman. Molecular dynamics saddle search adaptive kinetic Monte Carlo. The Journal of Chemical Physics, 140(21):214110, June 2014.
[7] W. E, W. Ren, and E. Vanden-Eijnden. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. The Journal of Chemical Physics, 126(16):164103, 2007.
[8] P. Hanggi, P. Talkner, and M. Borkovec. Reaction-rate theory: fifty years after Kramers. Reviews of Modern Physics, 62(2):251-341, 1990.
[9] G. Henkelman and H. Jónsson. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. Journal of Chemical Physics, 113(22):9978-9985, 2000.
[10] G. Henkelman, B.P. Uberuaga, and H. Jónsson. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. Journal of Chemical Physics, 113(22):9901-9904, 2000.
[11] C. Le Bris, T. Lelièvre, M. Luskin, and D. Perez. A mathematical formalization of the parallel replica dynamics. Monte Carlo Meth. Appl., 18(2):119-146, 2012.
[12] T. Lelièvre and F. Nier. Low temperature asymptotics for quasi-stationary distributions in a bounded domain. arXiv.org, September 2013.
[13] R.A. Olsen, G.J. Kroes, G. Henkelman, A. Arnaldsson, and H. Jónsson. Comparison of methods for finding saddle points without knowledge of the final states. The Journal of Chemical Physics, 121(20):9776, 2004.
[14] D. Perez, B.P. Uberuaga, Y. Shim, J.G. Amar, and A.F. Voter. Accelerated molecular dynamics methods: introduction and recent developments. Ann. Rep. Comp. Chem., 5:79-98, 2009.
[15] G. Simpson and M. Luskin. Numerical analysis of parallel replica dynamics. ESAIM: Mathematical Modelling and Numerical Analysis, 47(5):1287-1314, July 2013.
[16] A.F. Voter, F. Montalenti, and T.C. Germann. Extending the time scale in atomistic simulation of materials. Ann. Rev. Mater. Sci., 32:321-346, 2002.
[17] L. Xu and G. Henkelman. Adaptive kinetic Monte Carlo for first-principles accelerated dynamics. The Journal of Chemical Physics, 129(11), 2008.
[18] L. Xu, D. Mei, and G. Henkelman. Adaptive kinetic Monte Carlo simulation of methanol decomposition on Cu(100).