Advancements in Milestoning II: Calculating Autocorrelation from Milestoning Data Using Stochastic Path Integrals in Milestone Space
AAdvancements in Milestoning II: Calculating Autocorrelation from Milestoning DataUsing Stochastic Path Integrals in Milestone Space
Gianmarc Grazioli and Ioan Andricioaei ∗ Department of Chemistry, University of California, Irvine, CA 92697 (Dated: November 6, 2018)The Milestoning method has achieved great success in the calculation of equilibrium kinetic proper-ties such as rate constants from molecular dynamics simulations. The goal of this work is to advanceMilestoning into the realm of non-equilibrium statistical mechanics, in particular, the calculationof time correlation functions. In order to accomplish this, we introduce a novel methodology forobtaining flux through a given milestone configuration as a function of both time and initial config-uration, and build upon it with a novel formalism describing autocorrelation for Brownian motionin a discrete configuration space. The method is then applied to three different test systems: aharmonic oscillator, which we solve analytically, a two well potential, which is solved numerically,and an atomistic molecular dynamics simulation of alanine dipeptide.
PACS numbers: Valid PACS appear here
I. INTRODUCTION
The calculation of time correlation functions from timeseries measurements made along molecular dynamics tra-jectories plays the same central role in kinetics as cal-culating partition functions from sets of molecular con-figurations and their respective energies in the realm ofthermodynamics. To put the magnitude of this task intoperspective, consider a simple system where 100 differ-ent configurations are possible, and a transition betweenany pair of these configurations is possible. In this sim-ple system, there are over 1 . × different 10 steptrajectories possible (100 choose 10) without even con-sidering the fact that the same series of 10 configurationscan occur with different transition times which makes thenumber of possible trajectories proliferate even further!All important experimental properties can be calculatedfrom time correlation functions measured from moleculardynamics simulations, but these effects are typically onlymeasurable on timescales which are out of reach for bruteforce molecular dynamics. An example would be cal-culating RDCs (Residual Dipole Couplings) from NMRexperiments from bond vector time correlation functions.The challenge of and demand for calculating kinetic prop-erties from molecular dynamics simulations have causedit to become a major growth area in chemical physics [1][2], leading to the development of several methods, span-ning from early treatments using transition state theory(TST) [3] [4], to more recently, transition path sampling(TPS) [5], transition path theory (TPT) [6], and tran-sition interface sampling (TiS) [7]. A common strategyin measuring kinetics in molecular dynamics simulationsis the measurement of fluxes of trajectories through hy-perplanes in phase space or configuration space [8] [9] .More recently, the use of the hyperplanes in the Mileston-ing method has been generalized to subdividing phase ∗ Electronic address: [email protected] space into Voronoi cells, where the milestones exist asthe interfaces between cells [10]. Thus far, Mileston-ing has been used to calculate many useful properties,such as equilibrium flux values through the set of mile-stones, rate constants [11], and other equilibrium prop-erties such as mean first passage times between states[12], but the method has never before been used to cal-culate non-equilibrium dynamical objects such as timecorrelation functions. In our first paper, Advancementsin Milestoning I, we introduced a methodology for rapidcalculation of transition time density functions betweenmilestone hyperplanes, the central objects of milestoningcalculations, by artificially pushing the system towardthe target milestone and then re-weighting the distribu-tion to recover the true transition time distribution [13].In this paper, we venture into this realm by introduc-ing a method for calculating time correlation functionsfrom milestoning data. In order to calculate autocorrela-tion from milestoning data, not only must we know theequilibrium flux values through each interface, we mustalso know the flux through each interface as a functionof time and initial configuration. For this reason, it wasnecessary that we also introduce our stochastic path in-tegral approach to calculating the time-dependent fluxes,in addition to the methodology for calculating time cor-relation functions from these time-dependent fluxes.
II. THEORYMilestoning Theory
A more in-depth overview of milestoning theory can befound in our first paper [13], or in [11], but let us review afew of the key premises upon which our method for calcu-lating time correlation functions hinge. The quantity ofmost fundamental importance in milestoning is the fluxthrough a given milestone, for which the equation is [8]: a r X i v : . [ c ond - m a t . s t a t - m ec h ] O c t P s ( t ) = (cid:90) t Q s ( t (cid:48) ) (cid:34) − (cid:90) t − t (cid:48) K s ( τ ) dτ (cid:35) dt (cid:48) ,Q s ( t ) = 2 δ ( t ) P s (0) + (cid:90) t Q s ± ( t (cid:48)(cid:48) ) K ∓ s ± ( t − t (cid:48)(cid:48) ) dt (cid:48)(cid:48) (1)where P s ( t ) is the probability of being at milestone s attime t , (or, more specifically, arriving at time t (cid:48) and notleaving before time t [8]), and Q s ( t ) is the probabilityof a transition to milestone s at time t. K s ( τ ) indicatesthe probability of transitioning out of milestone s givenan incubation time of τ , thus (cid:82) t − t (cid:48) K s ( τ ) dτ is the prob-ability of an exit from milestone s anytime between 0and t − t (cid:48) , which makes 1 − (cid:82) t − t (cid:48) K s ( τ ) dτ the probabil-ity of there not being an exit from milestone s over thatsame time period. Since the probability of two indepen-dent events happening concurrently is the product of thetwo events, the equation for P s ( t ) is simply integratingthe concurrent probabilities of arriving at milestone s and not leaving over the time frame from time 0 to t .Turning our attention towards the meaning of the firstterm, Q s ( t ), 2 δ ( t ) P s (0), simply represents the probabil-ity that the system is already occupying milestone s attime t = 0, where the factor of 2 is present since the δ -function is centered at zero, meaning only half of itsarea would be counted without this factor. Q s ± ( t (cid:48)(cid:48) ) isthe probability that the system transitioned into one ofthe two milestones adjacent to s at an earlier time t (cid:48)(cid:48) . K ∓ s ± ( t − t (cid:48)(cid:48) ) is the probability of a transition from mile-stones s ± s . Thus the second term of thesecond line of equation 14 is another concurrent proba-bility: the probability of the system entering an adjacentmilestone at an earlier time, and then transitioning intomilestone s between time t and 0. It is important to notethat all functions P s ( t ) and Q s ( t ) are calculated using therespective values of K s ( τ ) between adjacent milestones,thus the set of K s ( τ ) between all milestones of interestcontains all the information needed to calculate kineticsusing the milestoning method. It is also important tonote that a K function between two milestones x = A and x = B , K AB ( τ ), is simply a probability distribu-tion representing the lifetime for the system remainingin state A before transitioning to state B . Time Correlation from Milestoning Data
This approach aims to glean the time correlation func-tion C ( t ) of an observable from Milestoning data. Thekey insight into this method is the approximation of thecontinuous configuration space, which we define as x , asa discrete space of milestone configurations. Althoughthe formalism presented below requires that the equi-librium distribution of configurations occupied, f ( x ), isknown, any successful Milestoning simulation yields the equilibrium flux through the set of milestones, and sothis set of fluxes will serve as the equilibrium distribu-tion of configurations in our discrete space. For the sakeof clarity of notation, we will be limiting our derivationto observables which are a function of configuration x ,but it should be noted that all developments presentedherein can be easily generalized to observables which area function of both position and velocity by consideringour variable x as a phase space coordinate. We beginwith the usual definition for a time correlation functionfor time-ordered measurements of an observable that is afunction of configuration, A ( x ; t ), arising from the equi-librium distribution of configurations, f ( x ): C ( t ) = (cid:104) A ( x, A ( x, t ) (cid:105) = (cid:90) A ( x , A ( x, t ) f ( x ) dx (2)where time t is the lag time between measurements.For time t = 0 the time correlation function hasthe lower limit C (0) = (cid:82) A ( x , A ( x , f ( x ) dx = (cid:104) A (cid:105) , the variance. On the opposite extreme, givenan infinite relaxation time, the mean value of x attime t will be equivalent to the mean at equilib-rium, lim t →∞ (cid:104) A ( x, t ) (cid:105) = (cid:82) A ( x ) f ( x ) dx , which im-plies: lim t →∞ C ( t ) = (cid:82) A ( x ) (cid:0)(cid:82) A ( x ) f ( x ) dx (cid:1) f ( x ) dx = (cid:82) A ( x ) f ( x ) dx (cid:82) A ( x ) f ( x ) dx = (cid:104) A (cid:105) So far, we have only discussed equilibrium probabilitydistributions in configuration space, which we defined as f ( x ), but let us now consider a time-dependent probabil-ity density function of configuration, which is a functionof initial configuration x (0). Keep in mind that time-dependent probability density functions such as these arethe solutions to Fokker-Planck equations. Let us definethis probability density function as g ( x, t ; x , (cid:104) x ( t, x ) (cid:105) , in the following manner: (cid:104) x ( t, x ) (cid:105) = (cid:90) xg ( x, t ; x , dx (3)Following suit, the expectation value of our observable A as a function of time can be written as: (cid:104) A ( x, t ; x , (cid:105) = (cid:90) A ( x ) g ( x, t ; x , dx (4)We can now substitute (cid:104) A ( x, t ; x , (cid:105) for A ( x, t ) in thedefinition of a time correlation function: C ( t ) = (cid:90) A ( x ) (cid:18)(cid:90) A ( x ) g ( x, t ; x , dx (cid:19) f ( x ) dx (5)As stated earlier in this section, our aim is to coarsegrain the continuous configuration space of x into a dis-crete space of milestone configurations, from which wecan calculate a time correlation function. Our first stepin constructing this model will be to approximate theoutermost integral in x with a sum over a discrete set ofconfigurations { x i } multiplied by the equilibrium prob-ability of finding the system in the configuration i . Ifwe define the probability of the system being in config-uration x i at time t given an initial configuration x as P i ( t ; x ), then given that our system will reach equilib-rium at infinite time regardless of initial configuration,the equilibrium probability can be expressed as P i ( ∞ ).Thus we arrive at our first discrete approximation of timecorrelation: C ( t ) ≈ (cid:88) i A ( x i ) P i ( ∞ ) (cid:18)(cid:90) A ( x ) g ( x, x (0) , t ) dx (cid:19) (6)Our next task is to approximate the remaining integral inthe equation with a sum over milestone states. Equation4 gives us an expression for the mean value of A ( x ) ina continuous space, given an amount of time elapsed t and an initial configuration x . Now consider the casewhere x can only occupy discrete values from the set { x s } . In this case, the integral in equation 4 is replacedby a sum in a weighted average expression where eachdiscrete value of x i multiplied by its statistical weight asa function of time: (cid:90) A ( x ) g ( x, x (0) , t ) dx ≈ (cid:88) s A ( x s ) P s ( t | x ) (7)Next, we substitute this weighted sum approximationinto equation 6: C ( t ) = (cid:88) i (cid:32) A ( x i ) P i ( ∞ ) (cid:88) s A ( x s ) P s ( t | x ) (cid:33) (8)Note that we have now arrived at a complete expres-sion for a discrete approximation of time correlation, withthe assumption that P s ( t | x ) and P i ( ∞ ) can be obtainedfrom milestoning calculations. Since the set of equilib-rium fluxes, P i ( ∞ ), have been calculated from mileston-ing simulations since the beginning, and we will intro-duce a novel method for calculating P s ( t | x ) from mile-stoning simulations in the Random Walk / Path IntegralMethodology subsection later in the article, we are ableto demonstrate that time correlation can indeed be cal-culated from Milestoning simulations. III. ANALYTICAL SOLUTION FOR 1DHARMONIC OSCILLATOR
In this section, we demonstrate the effectiveness ofequation 8 in approximating the time correlation func-tion for diffusion in a harmonic potential, for which thereis an analytical solution. Our potential is defined as V ( x ) = kx , and it’s equilibrium distribution in x isthe Boltzmann distribution, f ( x ) = e − βV ( x ) . The closed form expression for the time-dependent probability dis-tribution for diffusion in a harmonic well is [14]: p ( x, t | x ,
0) = 1 (cid:112) πk B T S ( t ) /k exp (cid:34) − (cid:0) x − x e − t/ ¯ τ (cid:1) k B T S ( t ) /k (cid:35) (9)where S ( t ) = 1 − e − t/ ¯ τ and ¯ τ = 2 k B T /kD .Given this analytical expression for p ( x, t, | x , C ( t ) by substi-tuting p ( x, t, | x ,
0) into equation 5 for g ( x, x i (0) , t ) andintegrating. This yields the exact time correlation func-tion C ( t ) for diffusion in a harmonic potential: C ( t ) = 2 √ πe − t ¯ τ (cid:16) kk B T (cid:17) / (cid:114) k B T (cid:16) − e − t ¯ τ (cid:17) k (cid:114) k ( coth ( t ¯ τ ) +1 ) k B T (10)Alternatively, we can apply equation 8, and obtain a gen-eral closed form expression for approximating C ( t ) bysumming over a discrete configuration space of N mile-stones rather than integrating over a continuous one: C ( t ) = 1 (cid:114) πk B T (cid:16) − e − t ¯ τ (cid:17) kN (cid:88) i =1 x i P i ( ∞ )∆ x N (cid:88) j =1 ( x j Q ji ( t )∆ x + x i Q ii ( t )∆ x ) (11)where Q ji ( t ) = exp − k (cid:0) coth (cid:0) t ¯ τ (cid:1) − (cid:1) (cid:16) x i − x j e t ¯ τ (cid:17) k B T Q ii ( t ) = exp (cid:32) − x i k tanh (cid:0) t ¯ τ (cid:1) k B T (cid:33) (12)and ∆ x is the distance between the evenly spaced mile-stones. Q ji ( t ) represents the discrete time-dependentprobability density as a function of time that our systemis in configuration x i at time t , given that the systemwas in state x j at time t = 0. Likewise, Q ii ( t ) is the dis-crete probability density as a function of time that oursystem is still in configuration x i at time t if it started inconfiguration x i at time t = 0. Thinking in terms of theassumption of Markov statistics for transitions betweenmilestones inherent to the Milestoning method, it makessense that these probabilities are added given that we areinterested in the outcome of finding our system in con-figuration x i whether it was already there, or it arrivedthere from another configuration.The most straightforward and intuitive way to com-pare equations 10 and 11 is to plot them. In figure1, we can compare the exact time correlation functionfor diffusion in a harmonic potential (with parameters β = . , k = 5 , and D = . C ( t ) generated using equation 11. Discretizing the spaceto three milestones is clearly too coarse of an approxima-tion, but the gain in accuracy in going from 6 to 9 mile-stones is quite modest. As one might expect, the discreteapproximation of the time correlation function is most ac-curate for long times and least accurate for C (0). It turnsout that this sacrifice in accuracy is a meager one because C (0) is always available from Milestoning data because itis equivalent to the sum approximation of the variance inconfiguration space at equilibrium, (cid:80) Ni =1 x i P i ( ∞ ). Thiswill be leveraged to our advantage in the following sec-tion. (cid:72) ps (cid:76) (cid:72) t (cid:76) (cid:72) t (cid:76) FIG. 1: This figure shows the approximate time correlationfunctions calculated using equation 8 for 3, 6, and 9 milestonesoverlaid on top of the exact analytical function C ( t ). IV. NUMERICAL DEMONSTRATION1D Fokker-Planck Diffusion on a Bistable Potential
In order to further validate the approach of calculatingtime correlation functions using the nested sum in equa-tion 8 in a discrete configuration space to approximateintegrating equation 2 in continuous conformation space,the method was applied to a simple two well potential ofequation y = ( x − ( x + 1) , where the time evolutionof the probability density function in configuration spacewas calculated using a Fokker-Planck formalism: ∂ρ ( x, t ) ∂t = D ∂ ρ ( x, t ) ∂x + Dk B T ∂∂x ( ρ ( x, t ) ∂V∂x ) (13)By repeatedly solving equation 13 numerically with theusing the Mathematica software package [15], using anormalized Gaussian distribution centered at the vari-ous x i (0) values as the initial condition, the manifolds g ( x, x i (0) , t ) were obtained for each of the 10 milestoneconfigurations x i in the set {− , − . , ..., . , } . These manifolds were then used to find C ( t ) using both the in-termediate method described by equation 6 (shown asred circles in figure 2) as well as our fully developed dis-crete method described by equation 8 (shown as bluecircles in figure 2). In the case of the equation 6, theintegral (cid:82) xg ( x, x (0) , t ) dx was numerically integrated di-rectly, while in the case of equation 8, the manifold g ( x, x (0) , t ) was used to obtain values of P i ( x (0) , t ) bymultiplying g ( x, x (0) , t )∆ x , similar to the transformationfrom equation 6 to equation 8, but in reverse. The resultsare shown superimposed over a plot of the time correla-tion function for the system obtained in the traditionalmanner by running 10 steps of langevin dynamics andthen calculating the time correlation function over thisone long trajectory using the equation: C ( t ) = 1 n − t n − t (cid:88) i =1 x i x t + i (14)We would like to point out that, as we alluded to in theprevious section, the data point for C (0) is the only por-tion of the time correlation function approximated usingequation 8 with any appreciable error. In practice, thedata point for C (0) can always be replaced with the valueobtained from the sum C (0) = (cid:80) i x i P i ( ∞ ) (shown asthe green ring in figure 2), due to the fact that the set ofequilibrium probabilities, P i ( ∞ ) are always known fromMilestoning simulations. (cid:72) ps (cid:76) (cid:72) t (cid:76) IntegralSumEq. Variance
FIG. 2: This plot demonstrates a successful implementationof our method for approximating time correlation functions incontinuous space by summing over time dependent joint prob-abilities of transitions between discrete states, as obtainedin Milestoning simulations. The red rings mark the datapoints from implementing equation 6, the blue data pointsindicate the positions where the full nested sum approxima-tion of equation 8 was implemented, and the green ring is thedata point for C (0) calculated from equilibrium probabilitieswhich is used to replace the value of C (0) generated usingequation 8. The data is shown superimposed over the timecorrelation function C ( t ), represented by a solid black line,calculated using the traditional method of equation 14. Random Walk / Path Integral Methodology
In order to make use of the formalism for obtainingautocorrelation in a discrete configuration space, as in-troduced in the Theory section, we require an expressionfor P s ( t | x i (0)), i.e. the probability that our system is inconfiguration s at time t , given that it was in configu-ration i at time t = 0. Since previous implementationsof the milestoning method have been “based on iterativedetermination of stationary flux vectors at milestones”[12], and not the determination of non-equilibrium timedependent fluxes given some initial configuration, it wasnecessary to devise a methodology for obtaining the func-tion P s ( t | x r (0)) from milestoning data. In the case ofdiffusive systems which can be described using a Fokker-Planck formalism (eq. 13), the Fokker-Planck equationcan be solved for a manifold ρ ( x, t ) which represents aprobability density of configurations evolving in time,where the distribution at time t = 0 is the distributiondictated by the initial condition and the distribution as t → ∞ is equivalent to the equilibrium distribution inx. While this Fokker-Planck description can be directlysolved for the time evolution of a probability densityfunction of configurations (when tractable, as in figure2), it is also possible to obtain the manifold ρ ( x, t ) via apath integral approach using a large ensemble of trajecto-ries generated using stochastic models such as Langevindynamics. This equivalence was the inspiration behindthe random walk / path integral method introduced inthis section. There are some differences however, for ex-ample, instead of Langevin trajectories, we use randomwalks along the given set of milestones. Very long ran-dom walks, orders of magnitude longer than time scalesaccessible to molecular dynamics, can be quickly gener-ated with minimal computational cost by taking advan-tage of two data sets which are already known in anymilestoning calculation: the transition matrix K (essen-tially a Markov matrix) and the set of all K AB ( τ ) func-tions, which are the probability density functions of tran-sition times between milestone A and milestone B . The K AB ( τ ) functions are obtained by histogramming tran-sition times between milestones, and each element K ij of the matrix K is obtained by integrating the distribu-tions of transition times, k ij ( τ ), over all time τ and thennormalizing each row to impose the constraint that thesystem at state i has probability 1 of transitioning to oneof the states to which it is coupled ( j ). Since the matrix K gives the equilibrium transition probabilities betweenmilestones, and the k ij functions are probability densityfunctions for the transition time between connected mile-stones, these two pieces of information can be used toconstruct time-dependent random walks along a set ofmilestones. Each step taken from some current config-uration i is chosen by selecting between each possiblecoupled state j , weighted by the transition probabilitiesfrom K , next, the amount of time each selected transi-tion from state i to j took is selected randomly from thedistribution defined by k ij ( τ ). In this manner, trajec- tories of arbitrary length in this discrete space can bevery quickly generated in only the amount of CPU timenecessary to select 2 N random numbers, where N is thedesired number of steps in the random walk. Once a largeset of these random walks is generated, they can be usedto calculate discrete versions of the same ρ ( x, t ) mani-folds which would be obtained as the solutions to theFokker-Planck equation (see figure 3). To elaborate onthis, consider a single random walk along the milestoneconfigurations. If, at each time step, we histogram thefrequency with which our system has visited each mile-stone configuration up to that point in time into a nor-malized distribution, then we have constructed a discretemanifold in configuration space x and time t which repre-sents the time evolution of the probability distribution offinding our system in a particular configuration for thisparticular realization of a random walk in our discreteconfiguration space. From here, it only remains to av-erage the set of probability distributions generated fromnumerous manifestations of the random walk. An alter-native approach to calculating time correlation functionsfrom these random walks would be to “connect the dots”along the random walk using an interpolation method,and then use the traditional approach to numerically cal-culating time correlation, shown in equation 14, from theresulting continuous function, as shown in figure 5. V. APPLICATION TO CALCULATINGLONG-TIME RDCS IN ATOMISTICSIMULATIONSApplication of Discrete Space Time CorrelationMethodology to the Alanine Dipeptide Bond Vector
In this section, we describe an application of ourmethodology to a molecular system. Shown in figure 6is the molecular structure of our system, alanine dipep-tide. After constraining the nitrogen and carbon atomslabeled in yellow to remain fixed at their initial positions,Langevin dynamics at T = 300 K was run for 4 × timesteps with a time step size of 0.001 ps for a total of 40nanoseconds using the CHARMM molecular dynamicssoftware package. As the molecular dynamics simulationran, the orientation of the bond vector extending fromthe center of the labeled nitrogen atom to the center ofthe hydrogen atom indicated by the purple arrow in fig-ure 6 was recorded. Although this bond vector possessesthree spatial degrees of freedom, it’s orientation could bewell approximated by a single rotational degree of free-dom, as shown in figure 7. By counting the number oftime steps between transitions from one milestone stateto the next (shown graphically as the four colored planesin figure 7) over the course of the 40 nanosecond trajec-tory, probability distribution functions for the transitiontimes between neighboring pairs were constructed as his-tograms to obtain the set of k ij ( τ ) functions for eachpair of neighboring milestone states. These k ij ( τ ) func- time (ps)probability densityspatial coordinate (x)A)B) P S (t)time (ps)milestone position (S) FIG. 3: This figure shows a graphical comparison betweenthe time evolution of a discrete probability distribution for aset of 5 milestone configurations subjected to the two well 1Dpotential found in the Numerical Demonstration section usingour random walk / path integral methodology (part A), andthe manifold representing the time evolution of a continuousprobability density function of configurations for the sametwo well system subjected to Fokker-Planck diffusion (partB). Part A is the set of probabilities as a function of time forthe system being found at each milestone configuration, giventhat the system was in configuration x = − t = 0,and part B shows Fokker-Planck diffusion on the same twowell system. Note that the random walk in part A began atthe milestone located at x = −
12, thus we see a decay from { P (0) = 0 , P (0) = 1 , P (0) = 0 , P (0) = 0 , P (0) } to theequilibrium distribution, the same way our initial continuousdistribution, a normalized Gaussian centered at −
1, decaysto the equilibrium probability distribution predicted by theBolzmann distribution for the two well potential, and bothevolve in time on about the same time scale. tions were then used as the basis for the random walk /path integral approach described in the previous section.Thusly, the P s ( t | x ) functions necessary to calculate thetime correlation function using equation 16 were calcu-lated by averaging 75,000 different time-dependent prob-ability distribution functions which each resulted fromsome particular manifestation of the random walk. Thetime correlation functions of interest for this system arethose which can be calculated using the Lipari-Szabo for-malism [16], as implemented by Xing and Andricioaei[17], using the equation: C ( t ) = (cid:104) L ( u (0) u ( t )) (cid:105) (15) True C ( t ) ( t ) ( t ) ( t )
10 20 30 40 50 t ( ps ) ( t ) FIG. 4: Shown here are time correlation functions calculatedusing equation 8, where the conditional probability as a func-tions of time, P s ( t | x (0)), are calculated using our randomwalk / path integral methodology, represented graphically infigure 3A. True C ( t ) ( t ) ( t ) ( t )
10 20 30 40 50 t ( ps ) ( t ) FIG. 5: Shown here are time correlation functions which werecalculated by first generating one long random walk usingthe method introduced in this article, then linking each pointin the trajectory using linear interpolation, and finally usingequation 14 to calculate C ( t ). where L ( u (0) u ( t )) refers to plugging the scalar resultingfrom the dot product of time series measurements of thebond vector u into the second order Legendre polyno-mial. This motif of measuring the autocorrelation of thisvalue is then applied to equation 8 to yield the discretespace time correlation function relationship: C ( t ) = (cid:88) i L (cid:34)(cid:88) s ( u i (0) · u s ) P s ( t | u i (0)) (cid:35) P i ( ∞ ) (16)where the vectors u i represent the different possible val-ues for the bond vector, given the coarse graining of thebond vector into a discrete space. The oscillatory andslower decay in correlation for the 4 milestone case is aneffect of coarse graining the space. This is due to a loss inentropy in going from the continuous space to the discreteone, i.e. if only four possibilities exist for the position ofthe bond vector, the probability of pointing in the samedirection as that of a previous time step increases com-pared to a system where 8 or more configurations arepossible.Notably, the oscillatory and slower decay in correlationfor the 4 milestone case is an effect of coarse graining thespace (the oscillations are reproduceable). This is dueto a loss in entropy in going from the continuous spaceto the discrete one, i.e. if only four possibilities existfor the position of the bond vector, the probability ofpointing in the same direction as that of a previous timestep increases compared to a system where 8 or moreconfigurations are possible. FIG. 6: Shown in this figure is the alanine dipeptide moleculeused as our model system. The two atoms shown in yellowwere held fixed in space while the rest of the molecule wassubjected to Langevin dynamics. The purple arrow gives theorientation of the bond vector which served as the measurablein our time correlation function calculations.
VI. CONCLUDING DISCUSSION
We have demonstrated for the first time that time cor-relation functions for continuous processes can be approx-imated using equation 8 to coarse grain the configura-tion space to a discrete one. Additionally, we have in-troduced a novel method for extending milestoning intonon-equilibrium regimes by numerically calculating thetime-dependent fluxes P s ( t | x i (0)). The method consistsof constructing random walks in the discrete configura-tion space, defined by a set of milestone configurations,from transition time probability density functions k ij ( τ )obtained using the milestoning method, followed by cal-culating time-dependent histograms of milestone statesoccupied using the stochastic path integral method de-scribed in the Random Walk / Path Integral Methodol-ogy section.The time correlation function for the harmonic os-cillator calculated analytically using our discretizationmethod showed excellent agreement with the true timecorrelation function C ( t ), also obtained analytically, fora harmonic oscillator. There was also an excellent agree- FIG. 7: Shown here is a graphical representation of the fourmilestone configuration for measuring the time correlationfunction of the alanine dipeptide bond vector. Although thebond vector, shown as many thin, purple arrows, posses threedegrees of freedom as it fluctuates in time, we are able tochoose a frame of reference where the bulk of the motion istaking place as a rotation about the z-axis, shown as a thickgreen arrow. Using the four milestones, shown as the red,green, yellow, and blue planes, we can calculate transitiontime probability distributions between each pair of adjacentmilestones. ment between the C ( t ) calculated for a discrete configu-ration space for a bistable potential and the true autocor-relation function, where P s ( t | x i (0)) was obtained by nu-merically solving a Fokker-Planck equation. We also ob-tained a promising result from applying the discretizationmethod of equation 16 in conjunction with the stochas-tic path integral method to an atomistic system. Theautocorrelation function C ( t ) for the bond vector calcu-lated using the methods introduced herein showed a niceagreement with the true C ( t ) calculated using equation15. The limitations to the methods we have introducedappear to be limited to the challenges inherent to imple-mentation of the milestoning method. A key advantageof our method is that the random walks between dis-crete configurations can be constructed at trivial compu-tational cost, allowing for us to make predictions well intotime regimes inaccessible to molecular dynamics simula-tions. We would like to note that, although the calcula-tions described in this article were performed on systemswhere the observable of interest was constant along eachmilestone hyperplane, the method can easily be general-ized for systems where the observable varies along eachmilestone hyperplane. In order to account for such ob- FIG. 8: This plot gives the probability of finding our system ineach of the four milestone configurations as a function time,given that we began the simulation with our system in theconfiguration shown as the blue plane, using the same colorscheme as in figure 7. The probability of being found in theblue milestone is equal to 1 at time t = 0 of course, but theplot range stops shy of P s ( t ) = 1 in order to provide a moredetailed view. Note that the probability of the system beingin any of the other three milestone configurations is equalto zero at time t = 0, as expected. These functions werecalculated using the methodology described in the RandomWalk / Path Integral Methodology section. These functionscontributed to the calculation of C ( t ) shown in figure 9. Notethat the probabilities converge to their equilibrium values onroughly the same timescale that C ( t ) converges to its longtime value. servables, one must simply construct equilibrium proba-bility distributions of the observable on each hyperplane,then select from this distribution at each time step ofthe random walk along the milestones. In other words,at each step, the algorithm must first choose the nextstep to take using the transition matrix, then select thetransition time from the appropriate transition time dis-tribution function, then select the value of the observablefrom the probability distribution describing the observ-able along that hyperplane. We feel that the methods in-troduced in this paper have the potential to allow for thecalculation of experimental observables from moleculardynamics simulations that are currently unattainable bybrute force long time simulations. The method presentedherein could also be further enhanced by combining itwith the enhanced sampling methodology introduced inthe companion article to this paper, also found withinthis publication [13]. VII. ACKNOWLEDGMENTS
IA acknowledges funds from an NSF CAREER award(CHE-0548047).
FIG. 9: This figure shows the approximate time correlationfunctions calculated using equation 8 superimposed over thetrue time correlation function, calculated using equation 14.The 4 milestone C ( t ) function was calculated with the mile-stones placed 90 degrees apart as illustrated in figure 7, whilethe 8 milestone configuration was the same motif, only with8 planes placed 45 degrees apart. [1] Christoph Dellago, Peter G. Bolhuis, Flix S. Csajka, andDavid Chandler. Transition path sampling and the cal-culation of rate constants. The Journal of ChemicalPhysics , 108(5):1964–1977, 1998.[2] Christoph Dellago, Peter G. Bolhuis, and David Chan-dler. On the calculation of reaction rate constants inthe transition path ensemble.
The Journal of ChemicalPhysics , 110(14):6617–6625, 1999.[3] Henry Eyring. The activated complex in chemical reac-tions.
The Journal of Chemical Physics , 3(2):107–115,1935.[4] E. Wigner. The transition state method.
Trans. FaradaySoc. , 34:29–41, 1938.[5] P. Bolhuis, D. Chandler, C. Dellago, and P. Geissler.Transition path sampling: Throwing ropes over roughmountain passes, in the dark.
Annu. Rev. Phys. Chem. ,53:291–318, 2002.[6] P. Metzner, C. Schtte, and E. Vanden-Eijnden. Transi-tion path theory for markov jump processes.
MultiscaleModeling and Simulation , 7(3):1192–1219, 2009.[7] Titus S. van Erp, Daniele Moroni, and Peter G. Bol-huis. A novel path sampling method for the calcula-tion of rate constants.
The Journal of Chemical Physics ,118(17):7762–7774, 2003.[8] Ron Elber and Anton Faradjian. Computing time scalesfrom reaction coordinates by milestoning.
Journal ofChemical Physics , 120(23):10880–9, March 2004.[9] Titus S. van Erp and Peter G. Bolhuis. Elaborating tran-sition interface sampling methods.
Journal of Computa- tional Physics , 205(1):157 – 181, 2005.[10] Eric Vanden-Eijnden and Maddalena Venturoli. Marko-vian milestoning with voronoi tessellations.
The Journalof Chemical Physics , 130(19):–, 2009.[11] Anthony M. A. West, Ron Elber, and David Shalloway.Extending molecular dynamics time scales with mileston-ing: Example of complex kinetics in a solvated peptide.
The Journal of Chemical Physics , 126(14):–, 2007.[12] Juan M. Bello-Rivas and Ron Elber. Exact milestoning.
The Journal of Chemical Physics , 142(9):–, 2015.[13] Gianmarc Grazioli and Ioan Andricioaei. Advancementsin milestoning I: Accelerated milestoning via wind as-sisted re-weighted milestoning (WARM). arXiv , 2015.[14] Phys 498 lecture notes, university of illinois urbana-champaign. . Accessed:2015-08-24.[15] Inc. Wolfram Research.
Mathematica . Wolfram Research,Inc., Champaign, Illinois, version 10.0 edition, 2014.[16] Giovanni Lipari and Attila Szabo. Model-free approachto the interpretation of nuclear magnetic resonance relax-ation in macromolecules. 1. theory and range of validity.
Journal of the American Chemical Society , 104(17):4546–4559, 1982.[17] Chenyue Xing and Ioan Andricioaei. On the calculationof time correlation functions by potential scaling.