Local and Global Perspectives on Diffusion Maps in the Analysis of Molecular Systems
Z. Trstanova, B. Leimkuhler and T. Lelièvre
School of Mathematics, University of Edinburgh, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK
CERMICS (ENPC), INRIA, 77455 Marne-la-Vallée, France
November 26, 2019
Abstract
Diffusion maps approximate the generator of Langevin dynamics from simulation data. They afford a means of identifying the slowly-evolving principal modes of high-dimensional molecular systems. When combined with a biasing mechanism, diffusion maps can accelerate the sampling of the stationary Boltzmann-Gibbs distribution. In this work, we contrast the local and global perspectives on diffusion maps, based on whether or not the data distribution has been fully explored. In the global setting, we use diffusion maps to identify metastable sets and to approximate the corresponding committor functions of transitions between them. We also discuss the use of diffusion maps within the metastable sets, formalising the locality via the concept of the quasi-stationary distribution and justifying the convergence of diffusion maps within a local equilibrium. This perspective allows us to propose an enhanced sampling algorithm. We demonstrate the practical relevance of these approaches both for simple models and for molecular dynamics problems (alanine dipeptide and deca-alanine).
The calculation of thermodynamic averages for complex models remains a fundamental challenge in computational chemistry [2], materials modelling [6] and biology [54]. In typical situations, the potential energy $U$ of the system is known, and the states are assumed to be distributed according to the Boltzmann-Gibbs distribution with density $\rho_\beta = Z^{-1} e^{-\beta U}$. The difficulty arises due to the high-dimensional, multimodal nature of the target distribution in combination with limited computational resources. Because the target distribution is multimodal, the high-likelihood conformational states are separated by low-probability transition regions. In such a setting, the transitions between critical states become "rare events", meaning that naive computational approaches converge slowly (if at all) or produce inaccurate results. Moreover, standard MD simulations of large proteins (>500 residues) run on regular CPUs allow one to simulate dynamics over hundreds of ns (or a few µs at most) in a reasonable amount of time. This time scale is often not enough to sample relevant conformational transitions, which may occur on the ms time scale.

As an illustration, consider the problem of exploring all folded configurations of a deca-alanine molecule in vacuum at a specific temperature (300 K). This short peptide is a common model for illustrating the difficulties of folding due to its very complicated free energy landscape [14]. Three typical states are shown in Figure 1. The many interactions between atoms of the molecule mean that changes in structure are the result of a sequence of coordinated moves. Because the transitions between the states occur infrequently (they are 'rare events'), the results of a short Langevin dynamics simulation will typically constitute a highly localised exploration of the region surrounding one particular conformational minimum.
Figure 1: Folding of deca-alanine: three metastable conformations, (a) A, (b) B1, (c) B2.

The goal of enhanced sampling strategies is to dramatically expand the range of observed states by modifying in some way the dynamical model. Typical schemes [57, 51, 38, 16, 31, 20, 21] rely on the definition of collective variables in order to drive the enhanced sampling of the system only in relevant low-dimensional coordinates describing the slowest time scales.

In this article, we discuss the automatic identification of collective variables (CVs) for the purpose of enhancing sampling in applications such as molecular dynamics. In some cases, the natural collective variables relate to underlying physical processes and can be chosen using scientific intuition, but in many cases this is far from straightforward, as there may be competing molecular mechanisms underpinning a given conformational change. Methods capable of automatically detecting CVs have, moreover, much wider potential for application. Many Bayesian statistical inference calculations arising in clustering and classifying data sets, and in the training of artificial neural networks, reduce to sampling a smooth probability distribution in high dimension and are frequently treated using the techniques of statistical physics [41, 15]. In such systems, a priori knowledge of the collective variables is typically not available, so methods that can automatically determine collective variables are of high potential value.

Diffusion maps [17, 18] provide a dimensionality reduction technique which yields a parameterized description of the underlying low-dimensional manifold by computing an approximation of a Fokker-Planck operator on the trajectory point-cloud sampled from a probability distribution (typically the Boltzmann-Gibbs distribution corresponding to a prescribed temperature). The construction is based on a normalized graph Laplacian matrix.
In an appropriate limit (discussed below), the matrix converges (point-wise) to the generator of overdamped Langevin dynamics. The spectral decomposition of the diffusion map matrix thus yields an approximation of the continuous spectral problem on the point-cloud [42] and leads to natural collective variables. Since the first appearance of diffusion maps [17], several improvements have been proposed, including local scaling [50], variable bandwidth kernels [7] and target measure diffusion maps (TMDmap) [4]. The latter scheme extends diffusion maps to point-clouds obtained from a surrogate distribution, ideally one that is easier to sample from. Based on the idea of importance sampling, it can be used on biased trajectories, and improves the accuracy and application of diffusion maps in high dimensions [4]. Diffusion maps can be combined with the variational approach to conformation dynamics in order to improve the approximation of the eigenfunctions and provide eigenvalues that relate directly to physical relaxation timescales [10].

Diffusion maps underpin a number of algorithms that have been designed to learn the collective variable adaptively and thus enhance the dynamics in the learned slowest degrees of freedom [62, 47, 13, 12]. These methods are based on iterative procedures whereby diffusion maps are employed as a tool to gradually uncover the intrinsic geometry of the local states and drive the sampling toward unexplored domains of the state space, either by sequentially restarting [62] or pushing [13] the trajectory from the border of the point-cloud in the direction given by the reduced coordinates. In [37], time-structure-based independent component analysis was performed iteratively to identify the slowest degrees of freedom, which were then sampled directly using metadynamics [31]. All these methods try to gather local information about the metastable states to drive global sampling, using ad hoc principles.
In this paper, we provide a rigorous perspective on the construction of diffusion maps within a metastable state by formalizing the concept of a local equilibrium based on the quasi-stationary distribution [19]. Moreover, we provide the analytic form of the operator which is obtained when metastable trajectories are used in computing diffusion maps.

Diffusion maps can also be used to compute committor functions [30], which play a central role in transition path theory [59]. The committor is a function which provides dynamical information about the connection between two metastable states and can thus be used as a reaction coordinate (an importance sampling function for biasing or splitting methods, for example). Committors, or 'commitment probabilities', were first introduced as the 'splitting probability for ion-pair recombination' by Onsager [44] and appear, for example, in the definition of $p_{\mathrm{fold}}$, the probability of protein folding [25, 45]. Markov state models can in principle be used to compute committor probabilities [48], but high dimensionality makes grid-based methods intractable. The finite temperature string method [26] approximates the committor on a quasi-one-dimensional reaction tube, which is possible under the assumption that the transition paths lie in regions of small measure compared to the whole sampled state space. The advantage of diffusion maps is that the approximation of the Fokker-Planck operator holds on the whole space, and therefore we can compute the committor outside the reaction tube. Diffusion maps have already been used to approximate committor functions in [30, 56]. In [30], in order to improve the approximation quality, a new method based on a point-cloud discretization of Fokker-Planck operators is introduced (however, without any convergence result). A more recent work [56] uses diffusion maps for committor computations.
Finally, we mention that artificial neural networks were used to solve for the committor in [29], although the approach is quite different from that considered here.

The main conceptual novelty of this article lies in the insight on the local versus global perspective provided by the quasi-stationary distribution. Thus, to be precise, compared to the work of [30], we clarify the procedure for going from the local perspective to designing an enhanced sampling method, and we apply our methods to much more complicated systems (only toy models are treated in [30]). The second, more practical, contribution is the demonstration of the use of diffusion maps to identify the metastable states and to directly compute the committor function. We consider, for example, the use of the diffusion map as a diagnostic tool for transitions out of a metastable state. A third contribution lies in drafting an enhanced sampling algorithm based on the quasi-stationary distribution and diffusion maps. A careful implementation and application to small biomolecules shows the relevance and potential of this methodology for practical applications. Algorithm 1, detailed in Section 5, will provide a starting point for further investigations. The current article serves to bridge works in the mathematical and computational science literatures, and thus helps to establish foundations for future rigorously based sampling frameworks.

This paper is organized as follows: in Section 2, we start with the mathematical description of overdamped Langevin dynamics and diffusion maps. In Section 3, we formalize the application of diffusion maps to a local state using the quasi-stationary distribution. We present several examples illustrating the theoretical findings. In Section 4, we define committor probabilities and the diffusion map-based algorithm to compute them.
We also apply our methodology to compute collective variables, metastable states and committors for various molecules, including the alanine dipeptide and a deca-alanine system. In Section 5, we show how the quasi-stationary distribution can be used to reveal transitions between molecular conformations. Finally, we conclude by taking up the question of how the quasi-stationary distribution can be used as a tool for the enhanced sampling of large-scale molecular models, paving the way for a full implementation of the described methodology in a software framework.

We begin with the mathematical description of overdamped Langevin dynamics, which is used to generate samples from the Boltzmann distribution. By introducing the generator of the Langevin process, we make a connection to diffusion maps, which is formalized in the following section. We review the construction of the original diffusion maps and define the target measure diffusion map, which removes some of its limitations.
Langevin dynamics and the Boltzmann distribution
We denote the configuration of the system by $x \in D$, where, depending on the application, $D = \mathbb{R}^d$, $D$ is a subset of $\mathbb{R}^d$, or $D = (\mathbb{R}/\mathbb{Z})^d$ for systems with periodic boundary conditions. Overdamped Langevin dynamics is defined by the stochastic differential equation
$$dx_t = -\nabla V(x_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t, \qquad (1)$$
where $W_t$ is a standard $d$-dimensional Wiener process, $\beta > 0$ is proportional to the inverse temperature, and $V(x)$ is the potential energy driving the diffusion process. The Boltzmann-Gibbs measure is invariant under this dynamics:
$$\mu(dx) = Z^{-1} e^{-\beta V(x)}\,dx, \qquad Z = \int_D e^{-\beta V(x)}\,dx. \qquad (2)$$
The ergodicity property is characterized by the almost sure (a.s.) convergence of the trajectory average of a smooth observable $A$ to the average over the phase space with respect to a probability measure, in this case $\mu$:
$$\lim_{t \to \infty} \widehat{A}_t = \mathbb{E}_\mu(A) \quad \text{a.s.}, \qquad \widehat{A}_t := \frac{1}{t}\int_0^t A(x_s)\,ds. \qquad (3)$$
The infinitesimal generator of the Markov process $(x_t)_{t \ge 0}$, a solution of (1), is the differential operator
$$\mathcal{L}_\beta = -\nabla V \cdot \nabla + \beta^{-1}\Delta, \qquad (4)$$
defined for example on the set of $C^\infty$ functions with compact support. The fact that $\mathcal{L}_\beta$ is the infinitesimal generator of $(x_t)_{t \ge 0}$ means that (see for example [35]):
$$\forall x, \quad \frac{d}{dt}\Big[\mathbb{E}\big(A(x_t)\,\big|\,x_0 = x\big)\Big]\Big|_{t=0} = \mathcal{L}_\beta A(x),$$
where $A$ is a $C^\infty$ compactly supported function and $x_0 = x$ is the initial condition at time $t = 0$. Another way to make a link between the differential operator (4) and the stochastic differential equation (1) is to consider the law of the process $x_t$. Let us denote by $\psi(t,x)$ the density of $x_t$ at time $t$. Then $\psi$ is a solution of the Fokker-Planck equation
$$\partial_t \psi = \mathcal{L}_\beta^* \psi, \qquad \psi(0) = \psi_0,$$
where $\psi_0$ is the density of $x_0$ and $\mathcal{L}_\beta^*$ is the $L^2$ adjoint of $\mathcal{L}_\beta$. Under assumptions on the smoothness of the potential and the compactness of the domain $D$, the solution can be written as
$$\psi(t,x) = \sum_{k=0}^{\infty} c_k e^{\lambda_k t} \phi_k(x), \qquad (5)$$
with eigenvalues $\{\lambda_k\}_{k=0}^\infty$ satisfying $\lambda_0 = 0 > \lambda_1 \ge \lambda_2 \ge \dots$ and eigenfunctions $\{\phi_k\}_{k=0}^\infty$ of $\mathcal{L}_\beta^*$.
The eigenfunctions are smooth functions and the sum (5) converges uniformly in $x$ for all times $t > t_0 > 0$. Since $\lambda_0 = 0$ and all other eigenvalues are negative, $\psi(t,x) \to c_0\phi_0(x)$ as $t \to \infty$, and therefore the first term $c_0\phi_0$ in the sum (5) is equal to the density of $\mu$. The convergence rate is determined by the next dominant eigenfunctions and eigenvalues. A $k$-dimensional diffusion map at time $t$ is a lower-dimensional representation of the system defined as a nonlinear mapping of the state space to the Euclidean space with coordinates given by the first $k$ eigenfunctions:
$$G_k(t,x) := \big(e^{\lambda_1 t}\phi_1(x), \dots, e^{\lambda_k t}\phi_k(x)\big).$$
The diffusion distance is then the Euclidean distance between the diffusion coordinates (DCs). Finally, we would like to stress that although the eigenfunctions of the kernel matrix do approximate point-wise the spatial eigenfunctions of the Fokker-Planck operator, there is no relationship between the diffusion map matrix eigenvalues and the exact propagator eigenvalues, as shown in [10]. This connection would be provided by constructing a diffusion map based approximation of the generator of underdamped Langevin dynamics (with finite friction).
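As a concrete illustration of the dynamics (1) and the ergodic average (3), the SDE can be discretized with the simple Euler–Maruyama scheme and a time average accumulated along the trajectory. The sketch below is illustrative only (the paper itself uses a second-order scheme [33]); the harmonic potential $V(x) = x^2/2$, $\beta = 1$, step size and trajectory length are arbitrary toy choices.

```python
import numpy as np

def euler_maruyama(grad_V, x0, beta, dt, n_steps, rng):
    """Discretize dx_t = -grad V(x_t) dt + sqrt(2/beta) dW_t (Euler-Maruyama)."""
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps,) + x.shape)
    noise_scale = np.sqrt(2.0 * dt / beta)
    for n in range(n_steps):
        x = x - grad_V(x) * dt + noise_scale * rng.standard_normal(x.shape)
        traj[n] = x
    return traj

rng = np.random.default_rng(0)
# Harmonic potential V(x) = x^2/2, so grad V(x) = x; the invariant
# Boltzmann-Gibbs measure (2) is then the standard Gaussian N(0, 1/beta).
traj = euler_maruyama(lambda x: x, x0=[0.0], beta=1.0, dt=0.01,
                      n_steps=200_000, rng=rng)
# Ergodic average (3): the time average of A(x) = x^2 should approach
# E_mu[x^2] = 1/beta = 1, up to statistical and discretization error.
print(np.mean(traj**2))
```

The trajectory point-cloud produced this way is exactly the kind of data the diffusion map construction of the next section takes as input.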
Diffusion maps
The diffusion map [17] reveals the geometric structure of a manifold $\mathcal{M}$ from given data $D^{(m)}$ by constructing an $m \times m$ matrix that approximates a differential operator. The relevant geometric features of $\mathcal{M}$ are expressed in terms of the dominant eigenfunctions of this operator. The construction requires that a set of points $D^{(m)} := \{x_1, x_2, \dots, x_m\} \subset \mathbb{R}^N$ ($N > 0$), which have been sampled from a distribution $\pi(x)$, lie on a compact $d$-dimensional differentiable submanifold $\mathcal{M} \subset \mathbb{R}^N$ with dimension $d < N$. The original diffusion maps introduced in [17, 55] are based on the isotropic kernel $h_\varepsilon(x,y) = h(\|x-y\|^2/\varepsilon)$, where $h$ is an exponentially decaying function, $\varepsilon > 0$ is a scale parameter, and $\|\cdot\|$ is a norm in $\mathbb{R}^N$ (for example the Euclidean norm, or the RMSD, which is the most commonly used norm in molecular simulations [10]). A typical choice is
$$h_\varepsilon(x,y) = \exp\big(-(4\varepsilon)^{-1}\|x-y\|^2\big). \qquad (6)$$
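As an aside, the Gaussian kernel (6) is straightforward to evaluate on a point cloud; a minimal numpy sketch using the Euclidean norm follows (the sample data and the value of $\varepsilon$ are arbitrary toy choices, not taken from the paper).

```python
import numpy as np

def kernel_matrix(X, eps):
    """Gaussian kernel (6): K_ij = exp(-||x_i - x_j||^2 / (4*eps)).

    X   : (m, N) array of m data points in R^N (Euclidean norm assumed).
    eps : bandwidth parameter epsilon > 0.
    """
    sq = np.sum(X**2, axis=1)
    # Pairwise squared distances via ||x-y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against negative values from round-off
    return np.exp(-d2 / (4.0 * eps))

X = np.random.default_rng(1).normal(size=(100, 3))
K = kernel_matrix(X, eps=0.5)
# K is symmetric with unit diagonal and entries in (0, 1]
print(K.shape)
```

The broadcast-based distance computation avoids an explicit double loop, which matters once $m$ reaches the sizes typical of trajectory data.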
In the next step, an $m \times m$ kernel matrix $K_\varepsilon$ is built by the evaluation of $h_\varepsilon$ on the set $D^{(m)}$. This matrix is then normalized several times to give a matrix $P_\varepsilon$ that can be interpreted as the generator of a Markov chain on the data. To be precise, the kernel matrix $K_\varepsilon$ is normalized using the power $\alpha \in [0,1]$ of the estimate $q$ of the density $\pi$, usually obtained from the kernel density estimate $q_i = \sum_{j=1}^m (K_\varepsilon)_{ij}$ as the row sum of $K_\varepsilon$. In some cases the analytic expression of the density $\pi(x)$ is known and we can set directly $q(x) = \pi(x)$. After obtaining the transition matrix
$$P_\varepsilon = D^{-\alpha} K_\varepsilon, \qquad (7)$$
where $D^{-\alpha} = \mathrm{diag}(q_1^{-\alpha}, \dots, q_m^{-\alpha})$, we compute in the last step the normalized graph Laplacian matrix
$$L_\varepsilon = \varepsilon^{-1}(P_\varepsilon - I). \qquad (8)$$
As reviewed in [42], for given $\alpha$ and sufficiently smooth functions $f$, the matrix $L_\varepsilon$ converges in the limit $m \to \infty$ point-wise to an integral operator describing a random walk in continuous space and discrete time, which in turn converges in the limit $\varepsilon \to 0$ to a differential operator. Using the notation $[f] = (f(x_1), \dots, f(x_m))^T$ for representing functions evaluated on the data set $D^{(m)}$ as vectors, such that $[f]_i = f(x_i)$, we formally write the point-wise convergence: for $\alpha \in [0,1]$, as $m \to \infty$ and $\varepsilon \to 0$,
$$(L_\varepsilon[f])_j \to \mathcal{L}f(x_j), \quad \text{for all } x_j \in D^{(m)},$$
where $\mathcal{L}$ is a Fokker-Planck operator
$$\mathcal{L}f = \Delta f + (2 - 2\alpha)\,\nabla f \cdot \frac{\nabla \pi}{\pi}, \qquad (9)$$
where $\Delta$ is the Laplace-Beltrami operator on $\mathcal{M}$ and $\nabla$ is the gradient operator on $\mathcal{M}$. Note that in the special case $\alpha = 1/2$, and for the choice $\pi = Z^{-1}e^{-\beta V}$ (Boltzmann-Gibbs), the approximated operator corresponds to the generator of overdamped Langevin dynamics (4), in the sense that
$$\mathcal{L} = \beta \mathcal{L}_\beta. \qquad (10)$$
For this reason, we focus on the choice $\alpha = 1/2$, for which the eigenvectors of $L_\varepsilon$ approximate discretized eigenfunctions of $\mathcal{L}$. That is, solutions of $L_\varepsilon[\psi] = \lambda[\psi]$ approximate solutions of the eigenproblem associated to $\mathcal{L}$:
$$\mathcal{L}\psi(x) = \lambda\psi(x), \quad \forall x \in \mathcal{M} \subset \mathrm{supp}(\pi).$$
The spectral decomposition of $L_\varepsilon$ provides real, non-positive eigenvalues $0 = \lambda_0 > \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m$ sorted in decreasing order. The dominant eigenfunctions allow for a structure-preserving embedding $\Psi$ of $D^{(m)}$ into a lower-dimensional space and hence reveal the geometry of the data. Singer [55] showed that for uniform density $\pi$, the approximation error for fixed $\varepsilon$ and $m$ is
$$(L_\varepsilon[f])_j = \mathcal{L}f(x_j) + O\big(\varepsilon,\ m^{-1/2}\varepsilon^{-1/2 - d/4}\big). \qquad (11)$$

Remark 2.1 (Infinite friction limit). In molecular dynamics, trajectories are usually obtained by discretizing [34] the (underdamped) Langevin dynamics:
$$dx_t = M^{-1}p_t\,dt, \qquad dp_t = -\nabla V(x_t)\,dt - \gamma M^{-1}p_t\,dt + \sqrt{2\gamma\beta^{-1}}\,dW_t, \qquad (12)$$
where $x \in \mathbb{R}^d$ are positions, $p \in \mathbb{R}^d$ momenta, $W_t$ is a standard $d$-dimensional Wiener process, $\beta > 0$ is proportional to the inverse temperature, $M = \mathrm{diag}(m_1, \dots, m_d)$ is the diagonal matrix of masses, and $\gamma > 0$ is the friction constant. The generator of this process is
$$\mathcal{L}_\gamma = M^{-1}p \cdot \nabla_x - \nabla V(x) \cdot \nabla_p + \gamma\Big(-M^{-1}p \cdot \nabla_p + \frac{1}{\beta}\Delta_p\Big).$$
Recall that diffusion maps approximate the operator (9), which is the generator of the overdamped Langevin dynamics, the infinite-friction limit of the Langevin dynamics (12). Diffusion maps therefore provide the dynamical information in the large friction limit $\gamma \to +\infty$, rescaling time as $\gamma t$, or in the small mass limit ($M \to 0$).

Remark 2.2.
Note that diffusion maps require the data to be distributed with respect to $\pi(x)\,dx$, which appears in the limiting operator (9). This implies that even though diffusion maps eventually provide an approximation of the generator of overdamped Langevin dynamics (10), one can use trajectories from any ergodic discretization of underdamped Langevin dynamics, for example the BAOAB integrator [33], to approximate the configurational marginal $\pi(x)\,dx$.

The target measure diffusion map
If $\pi$ is particularly difficult to sample, it might be desirable to use points which are not necessarily distributed with respect to $\pi$ in order to compute an approximation to the operator
$$\mathcal{L} = \nabla\log(\pi)\cdot\nabla + \Delta. \qquad (13)$$
For this purpose, the target measure diffusion map was recently introduced [4]. The main advantage is that it allows construction of an approximation of $\mathcal{L}$ even if the data points are distributed with respect to some distribution $\mu$ such that $\mathrm{supp}\,\pi \subset \mathrm{supp}\,\mu$. The main idea is to use the previously introduced kernel density estimator $q_\varepsilon$, already obtained as a row sum of the kernel matrix $K_\varepsilon$. Since we know the density of the target distribution $\pi$, the matrix normalization step can be done by re-weighting $q$ with respect to the target density, which allows for the matrix normalization in an importance sampling sense. More precisely, the target measure diffusion map (TMDmap) is constructed as follows. The construction begins with the $m \times m$ kernel matrix $K_\varepsilon$ with components $(K_\varepsilon)_{ij} = k_\varepsilon(x_i, x_j)$ and the kernel density estimate $q_\varepsilon(x_i) = \sum_{j=1}^m (K_\varepsilon)_{ij}$. Then the diagonal matrix $D_{\varepsilon,\pi}$ with components $(D_{\varepsilon,\pi})_{ii} = \pi^{1/2}(x_i)\,q_\varepsilon^{-1}(x_i)$ is formed, and the kernel matrix is right-normalized with $D_{\varepsilon,\pi}$: $K_{\varepsilon,\pi} = K_\varepsilon D_{\varepsilon,\pi}$. Let us define $\widetilde{D}_{\varepsilon,\pi}$ as the diagonal matrix of row sums of $K_{\varepsilon,\pi}$, that is,
$$(\widetilde{D}_{\varepsilon,\pi})_{ii} = (K_{\varepsilon,\pi}\,\mathbb{1})_i = \sum_{j=1}^m (K_{\varepsilon,\pi})_{ij}.$$
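Collecting these steps (kernel matrix, kernel density estimate, right-normalization by $\pi^{1/2}/q_\varepsilon$, row sums, and the final normalization (14)), a minimal numpy sketch of the TMDmap construction might look as follows; the Euclidean kernel, the Gaussian toy data and target, and the parameter values are illustrative assumptions, not choices from the paper.

```python
import numpy as np

def tmdmap_generator(X, eps, target_density):
    """Sketch of the TMDmap matrix (14) built from a point cloud.

    X              : (m, N) array of m data points.
    eps            : kernel bandwidth epsilon > 0.
    target_density : callable giving pi(x) (up to a constant) at a point x.
    """
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    K = np.exp(-d2 / (4.0 * eps))                 # kernel matrix (6)
    q = K.sum(axis=1)                              # kernel density estimate q_eps
    pi = np.asarray([target_density(x) for x in X])
    D = np.sqrt(pi) / q                            # diagonal of D_{eps,pi}
    K_pi = K * D[None, :]                          # right normalization K_eps D_{eps,pi}
    row = K_pi.sum(axis=1)                         # diagonal of tilde-D_{eps,pi}
    L = (K_pi / row[:, None] - np.eye(len(X))) / eps  # final normalization (14)
    return L

# Toy usage: points from a standard Gaussian, targeting pi proportional
# to exp(-|x|^2/2) in one dimension.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 1))
L = tmdmap_generator(X, eps=0.2,
                     target_density=lambda x: np.exp(-0.5 * (x @ x)))
# Each row of the resulting generator matrix sums to zero by construction.
print(L.shape)
```

The dominant eigenvectors of `L` (sorted by eigenvalue, largest first) would then supply the diffusion coordinates used throughout this section.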
Figure 2: (a) Eigenvectors obtained from diffusion maps. (b) Mean absolute error on the first five eigenfunctions over the number of samples.

Finally, the TMDmap matrix is built as
$$L_{\varepsilon,\pi} = \varepsilon^{-1}\big(\widetilde{D}_{\varepsilon,\pi}^{-1} K_{\varepsilon,\pi} - I\big). \qquad (14)$$
In [4], it is shown that $L_{\varepsilon,\pi}$ converges point-wise to the operator (13).

Remark 2.3 (Nyström extension of the eigenvectors [5]). Note that the $j$-th eigenvector $[\psi_j]$ of the matrix $L_\varepsilon$ can be extended to $x \in \mathcal{M}$ as
$$\psi_j(x) = \frac{1}{\lambda_j}\sum_{i=1}^m \frac{h_\varepsilon(x, x_i)}{D(x)}\,[\psi_j]_i,$$
where $\lambda_j \ne 0$ is the corresponding eigenvalue, $D(x) = \sum_{i=1}^m h_\varepsilon(x, x_i)$ and $[\psi_j]_i = \psi_j(x_i)$.

Dirichlet boundary problems
Diffusion maps provide a matrix $L_\varepsilon$ which converges point-wise to the generator $\mathcal{L}$ defined in (9). This method can be used to solve the following eigenvalue problem with homogeneous Dirichlet boundary conditions: find $(\lambda, f)$ such that
$$\mathcal{L}f = \lambda f \ \text{ in } \Omega, \qquad f = 0 \ \text{ on } \partial\Omega.$$
In the following example, we solve a linear eigenvalue problem with Dirichlet boundary conditions. In the first step, we construct $L_\varepsilon$ as the diffusion map approximation of $\mathcal{L}$ on the point-cloud $\{x_i\}_{i=1}^m$. In order to express the Dirichlet boundary condition, we identify the points outside the domain $\Omega$, which we collect in $C := \{x_j\}_{j\in J}$, where the set of indices is $J := \{j : x_j \notin \Omega\}$. Finally, we solve the eigenvalue problem with the matrix $L_\varepsilon$ in which the rows with indices in $J$ have been set to zero.

Let us illustrate this on the following one-dimensional eigenvalue problem:
$$\mathcal{L}v = \lambda v, \quad x \in (-1,1), \qquad v(x) = 0, \quad x \in \{-1,1\}, \qquad (15)$$
with $\mathcal{L} = -x\partial_x + \partial_x^2$. The eigenfunctions are $\psi_k(x) = \sin(\pi(x+1)(k+1))$. To approximate the solution of (15) using diffusion maps with $\alpha = 1/2$, we generated points from a discretized overdamped Langevin trajectory with potential $V(x) = x^2/2$, using a second-order numerical scheme [33]. In Figure 2(a) we show the diffusion map approximation of the eigenfunctions $\psi_k/\|\psi_k\|$. From Figure 2(b) we observe that the decay of the mean absolute error of the normalized eigenfunctions is asymptotically proportional to $N^{-1/2}$, where $N$ is the number of samples; different values of $N$ were obtained by sub-sampling the trajectory.

Defining a 'local' perspective in diffusion-map analysis
We now concentrate on the case when diffusion maps are built using trajectories of the Langevin dynamics. As we have reviewed in the introduction, many iterative methods aim at gradually uncovering the intrinsic geometry of the local states. The information obtained from the metastable states can then be used, for example, to accelerate the sampling towards unexplored domains of the state space.

The approximation error of diffusion maps (11) scales as $O(m^{-1/2})$, $m$ being the number of samples. In order to provide an approximation of the Fokker-Planck operator (4) using a point-cloud obtained from the process $x_t$, a solution of (a discretized version of) (1), the time averages should have converged with a sufficiently small statistical error. In this section, using the notion of the quasi-stationary distribution, we explain to which operator a diffusion map approximation constructed on samples from a metastable subset of the state space converges, and why it is possible to obtain convergence of the approximation in this setup.

Quasi-stationary distribution
The quasi-stationary distribution (QSD) is a local equilibrium macro-state describing a process trapped in a metastable state $\Omega \subset \mathbb{R}^d$ (see for example [19]). The QSD $\nu$ can be defined as follows: for all smooth test functions $A : \mathbb{R}^d \to \mathbb{R}$,
$$\forall X_0 \in \Omega, \quad \lim_{t\to\infty} \mathbb{E}\big(A(X_t)\,\big|\,\tau > t\big) = \int_\Omega A\,d\nu,$$
where $\Omega$ is a smooth bounded domain in $\mathbb{R}^d$ which is the support of $\nu$, and the first exit time $\tau$ from $\Omega$ for $X_t$ is defined by $\tau = \inf\{t > 0 : X_t \notin \Omega\}$.

The following mathematical properties of the QSD were proved in [32]. The probability distribution $\nu$ has a density $v$ with respect to the Boltzmann-Gibbs measure $\pi(dx) = Z^{-1}e^{-\beta V(x)}\,dx$. The density $v$ is the first eigenfunction of the infinitesimal generator $\mathcal{L}$ of the process $X_t$, with Dirichlet boundary conditions on $\partial\Omega$:
$$\mathcal{L}v = -\lambda v \ \text{ in } \Omega, \qquad v = 0 \ \text{ on } \partial\Omega, \qquad (16)$$
where $-\lambda < 0$ is the first eigenvalue. The density of $\nu$ with respect to the Lebesgue measure is thus
$$\forall x \in \Omega, \quad \nu(x) = \frac{v(x)\,e^{-\beta V(x)}}{\int_\Omega v(x)\,e^{-\beta V(x)}\,dx}. \qquad (17)$$
Let us consider the situation when $\Omega$ is a metastable state for the dynamics $(X_t)$, and $X_0 \in \Omega$. Then, for a long time, the process remains in $\Omega$. If the diffusion map is built using those samples in $\Omega$, it then provides an approximation of the Kolmogorov operator (9) where $\pi$ is replaced by the quasi-stationary distribution $\nu$, namely
$$\mathcal{L}_\Omega f := \Delta f + (2-2\alpha)\,\nabla f \cdot \frac{\nabla\nu}{\nu} = \Delta f + (2-2\alpha)\,\nabla f \cdot \big(\nabla\ln(v) - \beta\nabla V\big) \ \text{ on } \Omega. \qquad (18)$$
Notice that if $\Omega = D$, then $\nu = \pi$ and we recover the operator (9) with respect to the distribution $\pi$. In the case when $\Omega$ is in the basin of attraction of a local minimum $x_0$ of $V$ for the dynamics $\dot{x} = -\nabla V(x)$, the quasi-stationary distribution with density $\nu$ defined by (17) is exponentially close to $\pi(x) = Z^{-1}e^{-\beta V(x)}$ on any compact subset of $\Omega$: the two distributions differ essentially on the boundary $\partial\Omega$. More precisely, as proved in [24, Lemma 23, Lemma 85] for example (see also [28, Theorem 3.2.3]), for any compact subset $K$ of $\Omega$, there exists $c > 0$ such that, as $\beta \to \infty$,
$$\left\| \frac{\mathbb{1}_K \exp(-\beta V)}{\int_K \exp(-\beta V)} - \nu \right\|_{L^1(\Omega)} = O\big(\exp(-\beta c)\big).$$

Figure 3: Quadratic potential in 1D. (left) The approximation of the first eigenfunction $v$ of the Dirichlet boundary eigenvalue problem (16). (right) The quasi-stationary distribution $\nu$.

In order to illustrate these ideas, we compare the diffusion map constructed from a trajectory in a metastable state with one constructed from points of a trajectory which has covered the whole support of the underlying distribution. As explained above, the distribution of the samples in the metastable state is the quasi-stationary distribution, and diffusion maps provide an approximation of the operator (18).

Let us first illustrate the quasi-stationary distribution and explain how it differs from the stationary distribution on a simple one-dimensional example. The density of the quasi-stationary distribution (17) can be obtained using an accurate numerical approximation of the solution $v$ of the Dirichlet problem (16) by a pseudo-spectral Chebyshev method [58, 43] on the interval $[-L, L]$ with $L > 0$, with the grid chosen fine enough to provide sufficient numerical accuracy. We consider a simple double-well potential $V$. To illustrate the convergence of the quasi-stationary distribution $\nu = Z_\nu^{-1}\,v\,e^{-V}$ to $\pi = Z^{-1}e^{-V}$, we increase the interval size by plotting the approximation for increasing values of $L$. In Figure 3 (left), we plot the approximation of the solution $v$ of the Dirichlet eigenvalue problem (16) on $[-L, L]$ for several values of $L > 0$. As expected, we observe that $v(x) \to 1$ as $L \to \infty$. In Figure 3 (right), we plot the corresponding quasi-stationary densities. As the size of the domain increases, the quasi-stationary distribution converges to $\pi$.

In the next example, we sample from the Boltzmann distribution with a two-dimensional double-well potential [35, Section 1.3.3.1]:
$$V_{DW}(x,y) = \frac{1}{6}\Big(4\big(w - x^2 - y^2\big)^2 + 2h\big(x^2 - 2\big)^2 + \big((x+y)^2 - w\big)^2 + \big((x-y)^2 - w\big)^2\Big), \qquad (19)$$
with $h = 2$ and $w = 1$. We employ a second-order discretization of Langevin dynamics (1) at low temperature ($\beta = 10$). Due to this low temperature, the samples are trapped in the first well long enough to locally equilibrate to the quasi-stationary distribution. We compute the statistical error of averages of various observables, such as the configurational and the kinetic temperatures, because of the knowledge of the exact expected values of these observables. We also track the first dominant eigenvalues from the diffusion map, in order to detect when the sampling has reached a local equilibrium. From this short trajectory, the diffusion map uncovers the geometry of the local state: the dominant eigenvector clearly parameterizes the $y$-coordinate, i.e. the slowest coordinate within the metastable state, see Figure 4. On the other hand, when the trajectory explores also the second well and hence covers the support of the distribution, the diffusion map parameterizes $x$ as the slowest coordinate, see Figure 5.

Figure 4: Local geometry. (a) The sampling of the metastable state. (b) The dominant eigenvector parameterizes the $y$-coordinate.

Figure 5: Global geometry. (a) The samples cover the whole support of the distribution. (b) The dominant eigenvector correlates with the $x$-coordinate.

These examples demonstrate that diffusion maps can be constructed using points of the quasi-stationary distribution to uncover the slowest local modes.
As we will show in Section 5, this property will eventually allow us to define local collective variables which can guide the construction of sampling paths that exit the metastable state.

(The following equalities hold, respectively, for the kinetic and configurational temperatures: $d\,k_B T = \mathbb{E}_\pi[p\cdot\nabla U(p)]$ and $k_B T = \mathbb{E}_\pi\big[|\nabla V(x)|^2\big]/\mathbb{E}_\pi[\Delta V(x)]$, where $k_B$ is the Boltzmann constant, $T$ is the temperature and $U(p) = \frac{1}{2}p^{T}M^{-1}p$ is the kinetic energy.)

Global perspective: Identification of metastable states and committors
We illustrate in this section how diffusion maps applied to global sampling can be used to approximate the generator of Langevin dynamics and committor probabilities between metastable states in the infinite-friction limit. We also use diffusion maps to automatically identify metastable sets in high-dimensional systems.

The committor function is the central object of transition path theory [59, 40]. It provides a dynamical reaction coordinate between two metastable states $A \subset D$ and $B \subset D$. The committor function is the probability that the trajectory starting from $x_0 \notin A \cup B$ reaches $B$ before $A$:
$$q(x) = \mathbb{P}(\tau_B < \tau_A \mid x_0 = x).$$
The committor $q$ is also the solution of the following system of equations:
$$\mathcal{L}q = 0 \ \text{ in } D \setminus (A \cup B), \qquad q = 0 \ \text{ in } A, \qquad q = 1 \ \text{ in } B. \qquad (20)$$
From the committor function, one can compute the reaction rates, density and current of transition paths [36].

In the spirit of [30], we use diffusion maps to compute committors from the trajectory data. Given a point-cloud on $D$, diffusion maps provide an approximation of the operator $\mathcal{L}$ and can be used to compute $q$. After computing the graph Laplacian matrix (8) (choosing again $\alpha = 1/2$), we solve the linear system
$$L_\varepsilon[c,c]\,q[c] = -L_\varepsilon[c,b]\,q[b], \qquad (21)$$
where $c$ and $b$ denote the indices of points belonging to the sets $C = D \setminus (A \cup B)$ and $B$ respectively, and $L_\varepsilon[I,J]$ is the projection of the matrix (or vector) on the sets of indices $I = \{I_k\}$ and $J = \{J_k\}$.

One-dimensional Double-well.
First, we compute the committor function for a one-dimensional double-well potential V(x) = (x² − 1)². We use a second-order discretized Langevin scheme to generate the sample points and compute the TMDmap with ε = 0.1. We fix the sets A and B as small intervals around the two wells, with A ending at x = −1 and B starting at x = 1. (We are aware that diffusion maps do not provide access to dynamical properties. However, a better reaction coordinate such as the committor can be used to obtain dynamical properties, for example in combination with Adaptive Multilevel Splitting [11].)

Figure 6: Double-well in 1D. (a) A plot of the potential showing the three sets A, B and C = D \ (A ∪ B). (b) The committor function approximation by diffusion maps (DFM) and an accurate pseudospectral method.

The most commonly used method for identifying metastable subsets is the Perron-cluster cluster analysis (PCCA), which exploits the structure of the eigenvectors [52, 53, 49, 23]. In this work, we use the eigenvectors provided by diffusion maps in order to automatically identify the metastable subsets. The main idea is to compute the dominant eigenvectors of the transfer operator P, which are those with eigenvalues close to 1, excluding the first eigenvalue. We approximate P by P_ε defined in (7). The metastable states can be clustered according to the 'sign' structure of the first dominant eigenvector, which is moreover constant on these states. More precisely, we find the maximal and minimal points of the first eigenvector, which define the centers of the two sets. In the next step, we 'grow' the sets by including points whose Euclidean distance in diffusion space is smaller than a fixed threshold. See for example Figure 5(b) for an illustration in the case of the two-dimensional double-well potential.

Two-dimensional Double-well
In the second example, we consider a two-dimensional problem with potential (19) with h = 2 and w = 1. We generate samples using discretized Langevin dynamics with timestep ∆t = 0.1, and use points obtained by sub-sampling the trajectory for the diffusion map, which is computed with kernel (6) and ε = 0.1. We compute the first two dominant eigenvectors and define the metastable subsets A and B using the first dominant eigenvector as described above. (The results suggest that the method is robust with respect to variations of the definition domains, under the assumption that the domains A and B are within the metastable states.) In Figure 7(a) we see the sampling and the chosen sets. We compute the committor by solving the linear system (21) and extrapolate the solution linearly onto a two-dimensional grid in x and y; this is illustrated in Figure 7(b). The representation of the committor in the diffusion coordinates is depicted in Figure 8(b).

Figure 7: Two-dimensional double-well potential: committor approximations on the automatically identified metastable sets. (a) The sampled trajectory with the automatically chosen sets A (blue) and B (red). (b) The committor extended on a grid. Note that the value 0.5 is attained at x = 0, as expected.

Figure 8: (a) The flux streamlines coloured by the probability ρ_AB. (b) The diffusion map embedding and the chosen sets A, B (coloured blue and red).

We now use diffusion maps to compute dominant eigenfunctions of the transition operator, identify metastable sets and approximate the committors of small molecules: alanine dipeptide and deca-alanine. We discuss the local and global perspectives by computing diffusion maps inside the metastable states. For the example of deca-alanine, we use a trajectory from biased dynamics and use the TMDmap to approximate the committor, comparing the dynamics at various temperatures.

Figure 9: Alanine dipeptide. (a) The two metastable states uncovered by the first diffusion coordinate. (b) The committor function over sets A (red) and B (blue). The first eigenvector has a very high correlation with the committor function.
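Concretely, once the graph Laplacian matrix has been assembled and the index sets identified, the committor solve in (21) reduces to a single restricted linear system. The following is a minimal dense-matrix sketch of this step (the function name and interface are illustrative, not taken from any released code):

```python
import numpy as np

def committor_from_laplacian(L_eps, idx_A, idx_B):
    """Solve the restricted system L[c,c] q[c] = -L[c,b] q[b],
    with boundary conditions q = 0 on A and q = 1 on B."""
    n = L_eps.shape[0]
    idx_A = np.asarray(idx_A)
    idx_B = np.asarray(idx_B)
    # c indexes the transition region C = D \ (A u B)
    idx_C = np.setdiff1d(np.arange(n), np.concatenate([idx_A, idx_B]))

    q = np.zeros(n)
    q[idx_B] = 1.0  # boundary condition on B; q is already 0 on A

    # Restrict the discrete generator to C and solve the linear system (21).
    L_cc = L_eps[np.ix_(idx_C, idx_C)]
    L_cb = L_eps[np.ix_(idx_C, idx_B)]
    q[idx_C] = np.linalg.solve(L_cc, -L_cb @ q[idx_B])
    return q
```

For large point clouds one would of course store L_ε as a sparse matrix and use an iterative solver, but the structure of the computation is the same.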
Alanine dipeptide in vacuum
Alanine dipeptide (CH3−CO−NH−CαH(CH3)−CO−NH−CH3) is commonly used as a toy problem for sampling studies since it has two well-defined metastable states which can be parameterized by the dihedral angles φ, defined between C, N, Cα, C, and ψ, defined between N, Cα, C, N.

We simulate a 20 nanosecond Langevin dynamics trajectory (using the BAOAB integrator [34]) at temperature 300 K with a 2 fs stepsize, friction γ = 1 ps⁻¹ and periodic boundary conditions, using the openmmtools [1] library, where alanine dipeptide in vacuum is provided by the AMBER ff96 force field. We sub-sample and RMSD-align the configurations with respect to a reference one, leaving a small sub-sampled set of points for the diffusion map analysis. We use kernel (6) with ε = 1 and the Euclidean metric.

In Figure 9(a) we show the first diffusion coordinate, which has opposite signs on the two metastable states. Since the transition is very rare, there are only very few points in the vicinity of the saddle point; the dominant eigenvector nevertheless clearly parameterizes the dihedral angles and separates the two metastable sets. In order to define the reactive sets, we use the first diffusion coordinate as described in Section 4. Figure 9(a) shows the dominant diffusion coordinate, whereas Figure 9(b) shows the committor function. We observe that the committor strongly correlates with the first eigenvector, which is expected since the metastable states are defined by the dominant eigenvector. Figure 10 shows the free energy profile of the committor function, which was extended from the sub-sampled points to the trajectory of length 20 nanoseconds using nonlinear regression (a multi-layer perceptron).

In the previous example, we have used a globally converged trajectory. The diffusion map analysis therefore describes the dynamics from the global perspective.
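The basic diffusion-map pipeline used in this kind of analysis (Gaussian kernel on aligned configurations, density normalization, leading non-trivial eigenvector) can be sketched as follows; the kernel scaling and the choice α = 1 in this sketch are illustrative and may differ in detail from kernel (6) and the normalization used above:

```python
import numpy as np

def first_diffusion_coordinate(X, eps):
    """Sketch of a diffusion map: Gaussian kernel on configurations X
    (an (m, d) array), alpha = 1 density normalization, then the first
    non-trivial eigenvector of the resulting Markov matrix."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * eps))
    q = K.sum(axis=1)
    K_tilde = K / np.outer(q, q)                       # remove sampling-density effects
    P = K_tilde / K_tilde.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    evals, evecs = np.linalg.eig(P)
    order = np.argsort(-evals.real)
    return evecs[:, order[1]].real                     # skip the constant eigenvector
```

On two well-separated clusters of configurations, the returned coordinate takes opposite signs on the two clusters, which is exactly the sign structure used above to identify the metastable sets. (For production use, a library such as pydiffmap provides sparse kernels and bandwidth selection.)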
In order to illustrate the local perspective, we compute the diffusion maps and the committor approximation from a trajectory which has not left the first metastable state. The first eigenvector parameterizes the two wells of this metastable state, and we define the reactive sets as before using the first DC. The resulting committor approximation, computed with the automatically chosen sets, is shown in Figure 11(b). (More precisely, for the nonlinear regression of Figure 10, the sub-sampled set served as a training set and the fitted model was evaluated on the full trajectory. We performed cross-validation to find the right parameters of the neural network.)

Figure 11: The first metastable state of alanine dipeptide. (a) The dominant eigenvector parameterizes the two wells of the first metastable state. (b) The committor approximation with sets A, B in blue and red.

Figure 12 shows the corresponding diffusion map embedding coloured with respect to the committor values. Note that the diffusion coordinates computed on the samples from this metastable state, whose distribution is the QSD, correlate with the dihedral angles. This observation suggests that alanine dipeptide in vacuum is a trivial example for testing enhanced dynamics methods using collective variables learned on-the-fly, because it is likely that the slow dynamics within the metastable state are similar to the global slow dynamics, which might not be the case in more complicated molecules.
Deca-alanine
In the following example, we use a long trajectory of the deca-alanine molecule. This molecule has 132 atoms and a very complex free-energy landscape. A kinetic analysis was done in [60], showing the force-field-dependent slowest dynamical processes. Representations of the molecular conformations associated with the two main metastable states are shown in Figure 1. The system is highly metastable: a standard simulation at 300 K requires at least 5 µs to converge [60]. Therefore, the trajectory studied here was obtained using the infinite-swap simulated tempering (ISST) [39] method with temperatures in the range from 300 K to 500 K (nominal stepsize 2 fs and nominal simulation length 2 µs). This method is incorporated into the MIST [8] library, which is coupled to the GROMACS and Amber96 force fields.

Figure 12: The embedding in the first two diffusion coordinates, coloured by the committor values. The sets A, B are shown in blue and red respectively.

ISST provides the weights which are necessary to recover the Boltzmann distribution at a given temperature. We use the weights within the TMDmap, allowing us to efficiently compute the committor iso-surfaces at various temperatures. We extend the committors in the root-mean-square deviation (RMSD) and the radius of gyration (Rg) for better visualisation in Figure 13 (note, however, that the diffusion maps were applied to all degrees of freedom). We automatically identify the metastable sets using the dominant eigenvector. Note that at the lower temperatures of 300 K and 354 K, the two states are A and B1, while at higher temperatures they are A and B2. See Figure 1 for the corresponding conformations. The reason is that at the lower temperatures the dominant barrier is enthalpic, while at higher temperatures it is rather entropic, suggesting that the slowest transition is then between the states A and B2, where B2 is a state which is not very probable at low temperatures. The different dynamical behaviours can also be seen in the varying committor isosurfaces (Figure 13).

In this section, we first illustrate how the spectrum computed from diffusion maps converges within the quasi-stationary distribution. Next, we use diffusion coordinates built from samples within the metastable state to identify local CVs, which are physically interpretable and valid over the whole state space. Finally, we demonstrate that when used in combination with enhanced dynamics such as metadynamics, these CVs lead to more efficient global exploration. The combination of these three procedures defines an algorithm for enhanced sampling which we formalize and briefly illustrate.
The local perspective introduced in Section 3 allows us to define a metastable state as an ensemble of configurations (snapshots) along a trajectory for which the diffusion map spectrum converges. The idea is that, when trapped in a metastable state, one can compute the spectrum of the infinitesimal generator associated with the QSD, which will typically change when the process moves to a new metastable state.
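A schematic reading of this exit-detection idea: recompute the leading eigenvalues of the discrete generator on successive windows of configurations and flag an exit when they move appreciably. The function names, the dense-kernel construction and the tolerance below are illustrative assumptions, not the implementation used in the experiments:

```python
import numpy as np

def leading_generator_eigenvalues(X, eps, k=4):
    """Leading eigenvalues of the discrete generator L_eps = (P - I)/eps
    built from a window of configurations X (an (m, d) array)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * eps))
    q = K.sum(axis=1)
    K_tilde = K / np.outer(q, q)
    P = K_tilde / K_tilde.sum(axis=1, keepdims=True)
    evals = np.sort(np.linalg.eigvals(P).real)[::-1]
    return (evals[:k] - 1.0) / eps     # generator eigenvalues (<= 0)

def exited_state(prev_eigs, new_eigs, tol):
    """Flag an exit from the metastable state when the dominant part of
    the spectrum moves by more than tol between successive windows."""
    diff = np.abs(np.asarray(new_eigs) - np.asarray(prev_eigs))
    return bool(diff.max() > tol)
```

While the walker stays within one metastable state, successive windows sample the same QSD and the eigenvalues stabilise; after an exit, they drift towards the spectrum of the operator on the larger explored domain.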
Figure 13: Committor of deca-alanine. Sets A (blue) and B1 (red, top) and B2 (red, bottom) were obtained from the first DC. The transition region is shown by committor isosurface lines, at low temperatures (300 K and 354 K) and high temperatures (413 K and 485 K). Panels: (a) 300 K, (b) 354 K, (c) 413 K, (d) 485 K.

Figure 14: Committor of deca-alanine in diffusion coordinates. Panels: (a) 300 K, (b) 354 K, (c) 413 K, (d) 485 K.

Figure 15: The first dominant eigenvalues of the matrix L_ε with a fixed number of points from an alanine dipeptide trajectory. The bottom right plot shows the values of the φ-angle over the simulation steps, with the colors corresponding to the iterations after which the spectrum is recomputed. Above, we plot the two functions of the dominant eigenvalues, computed over each iteration, which indicate the exit from the state as the eigenvalues suddenly change. Before the first transition to the second state, the spectrum is the one of L_Ω (see (18) with α = 1/2), where Ω is the metastable state where the process is trapped. After the transition, the spectrum evolves towards the spectrum of the operator L_{R^d} = −β∇V·∇ + ∆ on the whole domain.

To illustrate this, we analyze an alanine dipeptide trajectory. Every 4000 steps, we compute the first dominant eigenvalues of the diffusion map matrix L_ε (with α = 1/2) by sampling 2000 points from the trajectory {x_n}_{n=0}^m up to the end of the last iteration τ = m∆t. We observe in Figure 15(left) that the sampling has locally equilibrated during the first 3 iterations within the metastable state, and the eigenvalues have converged. The transition occurs after the fourth iteration, as can be seen in Figure 15(bottom right), which shows the values of the dihedral angle φ over the simulation steps. In the left figure, we clearly observe a change in the spectrum at this point, with an increase in the spectral gap. After the trajectory has exited the metastable state, the eigenvalues begin evolving to new values, corresponding to the spectrum of the operator on the whole domain. The change in the spectrum allows us to detect the exit from the metastable state. Instead of tracking each of the first eigenvalues separately, we compute the average and the maximal difference of the dominant eigenvalues, which we plot in Figure 15(top right).

This example illustrates the local perspective on diffusion maps: their application to a partially explored distribution provides an approximation of a different operator, since the samples used are distributed with respect to a quasi-stationary distribution. It is therefore possible to use samples from the local equilibrium to learn the slowest dynamics within the metastable state. This observation can be used together with biasing techniques to improve sampling.

We now describe an outline of an algorithm for enhanced sampling. Sampling from the QSD allows us to build high-quality local CVs (within the metastable state) by looking for the physical CVs most correlated with the DCs. In typical practice the CVs are chosen from a list of physically relevant candidates, for example the backbone dihedral angles or atom-atom contacts. There are several advantages to using physical coordinates instead of the abstract DCs. First, the artificial DCs are only defined on the visited states, i.e. inside the metastable state. Extrapolated DCs outside the visited state lose their validity further from the points used for the computation. By contrast, the physical coordinates can typically be defined over the entire state space. Second, the slowest physical coordinates might provide more understanding of the metastable state.

Once the best local CVs have been identified, we can use metadynamics to enhance the sampling, effectively driving the dynamics to exit the metastable state. In the next iteration, we suggest using the TMDmap to unbias the influence of metadynamics on the newly generated trajectory. The above strategy can be summarized in the following algorithm (enhanced sampling algorithm based on local-global analysis):

1. Run molecular dynamics until the spectrum has converged (the reference walker is trapped).
2. [Optional] Refine the convergence of the spectrum and the DCs by using a Fleming-Viot [9] process within the metastable state (each replica being initialized with the whole initial trajectory of the reference dynamics, for the diffusion map computation).
3. Identify the CVs which are most correlated with the DCs.
4. Build an effective bias using, for example, metadynamics based on these CVs.
5. When an exit is observed for the reference walker (through a change in the spectrum), restart the procedure from step 1, keeping in memory the bias built using the TMDmap.

As a proof of concept, we illustrate this method in a simplified setup based on metadynamics in the case of alanine dipeptide. We stress that the construction of sampling procedures for general macromolecules is a complex undertaking, and many challenges still need to be addressed to implement the full proposed methodology for larger systems; this is the subject of current study by the authors.

We first run a short trajectory to obtain samples from the first metastable state. (The 'first' state corresponds to the top left double well in Figure 9.) This sampling is done by a Fleming-Viot process [9] with a boundary defined by the converged spectrum, which we regularly recompute during the sampling. (Alternatively, one can define the boundary by the free energy in the φ, ψ angles. We also found that resampling using Fleming-Viot improves the quality of the diffusion maps.) We compute the two dominant DCs and use them to select the two most correlated coordinates from a list of physical candidates (a similar approach was presented in [27]). In this case, we looked at the collection of all possible dihedral angles over all atoms except the hydrogens. In order to select the two most correlated CVs we employed the Pearson correlation coefficient,

ρ(X, Y) = cov(X, Y) / (σ_X σ_Y),

where σ_Z is the standard deviation of a random variable Z. We found the dihedral angles ACE1C-ALA2C-ALA2C-NME3N and ACE1C-ACE1O-ALA2N-NME3N to be the best candidates, with the highest correlations with the first and the second eigenvector respectively. Note, however, that there were also several other CVs with high correlations, among them the φ and ψ angles. In the next step, we use the most correlated physical CVs within metadynamics and track the φ-angle as a function of time. Moreover, we run metadynamics with a priori chosen CVs based on expert knowledge, specifically the φ, ψ angles. As shown in Figure 16, there is no transition for the standard dynamics (top). On the other hand, metadynamics with the learned CVs exhibits several transitions, see Figure 16(bottom), similarly to the knowledge-based CVs (middle); metadynamics with the φ, ψ angles (middle) is also more efficient than standard Langevin dynamics. We also compared the least correlated CVs for metadynamics, and this approach was much worse than Langevin dynamics: we did not observe any exit during the expected Langevin exit time. These numerical experiments are strongly suggestive of the relevance of the CVs most correlated with the dominant DCs. In conclusion, learning the slowest CVs from a local state provides important information allowing the dynamics to escape the metastable state and hence enhance the sampling. Once the process leaves the first visited metastable state, one keeps the bias, and new CVs are computed to update the bias once trapped elsewhere.
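Step 3 of the algorithm, matching candidate physical CVs to the dominant DCs via the Pearson coefficient, can be sketched as follows (the data layout and names are illustrative; in practice the candidate columns would be dihedral-angle time series such as those listed above):

```python
import numpy as np

def most_correlated_cvs(dcs, candidates, names):
    """For each dominant diffusion coordinate (columns of dcs, shape (m, k)),
    return the name of the candidate physical CV (columns of candidates,
    shape (m, n)) with the largest |Pearson correlation|."""
    picks = []
    for j in range(dcs.shape[1]):
        corrs = [abs(np.corrcoef(dcs[:, j], candidates[:, i])[0, 1])
                 for i in range(candidates.shape[1])]
        picks.append(names[int(np.argmax(corrs))])
    return picks
```

The selected names are then handed to the biasing method (here, metadynamics) as its collective variables.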
This process can be iterated, and the weights from metadynamics can subsequently be unbiased by the TMDmap to compute the relevant physical coordinates at every iteration. (In this example, we did not compute the full expected exit times, since the difference was already significant over several realizations. To estimate the actual speed-up of the sampling method, one would need to run thousands of trajectories and perform a detailed statistical analysis of the results, which is proposed for future study.)

We have thus demonstrated, at least in this specific case, how the local perspective can be used together with biasing techniques (metadynamics) to obtain more rapid sampling of the target distribution.

Diffusion maps are an effective tool for uncovering a natural collective variable as needed for enhanced sampling. In this work, we have formalized the use of diffusion maps within a metastable state, which provides insight into diffusion-map-driven sampling based on iterative procedures.
Data availability. The data is accessible as: Martinsson, Anton; Trstanova, Zofia (2019). ISST Deca-Alanine. University of Edinburgh, School of Mathematics. https://doi.org/10.7488/ds/2491. The following libraries were used:
- pydiffmap: https://github.com/DiffusionMapsAcademics/pyDiffMap
- OpenMM: Eastman, Peter, et al. "OpenMM 7: Rapid development of high-performance algorithms for molecular dynamics." PLoS Computational Biology 13.7 (2017): e1005659.
B.L. and Z.T. were supported by EPSRC Grant EP/P006175/1. B.L. was further supported by the Alan Turing Institute (EPSRC EP/N510129/1) as a Turing Fellow. T.L. is supported by the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement number 614492.

The authors would like to thank Ben Goddard and Antonia Mey (both at the University of Edinburgh) for helpful discussions.
References

[1] OpenMMTools.
[2] Advanced Sampling in Molecular Simulation. J. Comput. Chem., 37(6):543–622, 2016.
[3] Rosalind J. Allen, Chantal Valeriani, and Pieter Rein ten Wolde. Forward flux sampling for rare event simulations. Journal of Physics: Condensed Matter, 21(46):463102, 2009.
[4] Ralf Banisch, Zofia Trstanova, Andreas Bittracher, Stefan Klus, and Péter Koltai. Diffusion maps tailored to arbitrary non-degenerate Itô processes. Appl. Comput. Harmon. Anal., 2018.
[5] Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux, Jean-François Paiement, Pascal Vincent, and Marie Ouimet. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 16(10):2197–2219, 2004.
[6] Tristan Bereau, Denis Andrienko, and Kurt Kremer. Research Update: Computational materials discovery in soft matter. APL Mater., 4(5):53101, 2016.
[7] Tyrus Berry and John Harlim. Variable bandwidth diffusion kernels. Appl. Comput. Harmon. Anal., 40(1):68–96, 2016.
[8] Iain Bethune, Ralf Banisch, Elena Breitmoser, Antonia B. K. Collis, Gordon Gibb, Gianpaolo Gobbo, Charles Matthews, Graeme J. Ackland, and Benedict J. Leimkuhler. MIST: A simple and efficient molecular dynamics abstraction library for integrator development. Comp. Phys. Comm., 236:224–236, 2019.
[9] A. Binder, G. Simpson, and T. Lelièvre. A generalized parallel replica dynamics. J. Comput. Phys., 284:595–616, 2015.
[10] Lorenzo Boninsegna, Gianpaolo Gobbo, Frank Noé, and Cecilia Clementi. Investigating molecular kinetics by variationally optimized diffusion maps. J. Chem. Theory Comput., 11(12):5947–5960, 2015.
[11] Frédéric Cérou and Arnaud Guyader. Adaptive multilevel splitting for rare event analysis. Stoch. Anal. Appl., 25(2):417–443, 2007.
[12] Wei Chen and Andrew L. Ferguson. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J. Comput. Chem., 39(25):2079–2102, 2018.
[13] Eliodoro Chiavazzo, Roberto Covino, Ronald R. Coifman, C. William Gear, Anastasia S. Georgiou, Gerhard Hummer, and Ioannis G. Kevrekidis. Intrinsic map dynamics exploration for uncharted effective free-energy landscapes. Proc. Natl. Acad. Sci., 114(28):E5494–E5503, 2017.
[14] Christophe Chipot and Jérôme Hénin. Exploring the free-energy landscape of a short peptide using an average force. J. Chem. Phys., 123(24):244906, 2005.
[15] Nicolas Chopin, Tony Lelièvre, and Gabriel Stoltz. Free energy methods for Bayesian inference: efficient exploration of univariate Gaussian mixture posteriors. Stat. Comput., 22(4):897–916, 2012.
[16] Giovanni Ciccotti, Raymond Kapral, and Eric Vanden-Eijnden. Blue moon sampling, vectorial reaction coordinates, and unbiased constrained dynamics. ChemPhysChem, 6(9):1809–1814, 2005.
[17] R. Coifman and S. Lafon. Diffusion maps. Appl. Comput. Harmon. Anal., 21(1):5–30, 2006.
[18] Ronald R. Coifman, Ioannis G. Kevrekidis, Stéphane Lafon, Mauro Maggioni, and Boaz Nadler. Diffusion maps, reduction coordinates, and low dimensional representation of stochastic systems. Multiscale Model. Simul., 7(2):842–864, 2008.
[19] Pierre Collet, Servet Martinez, and Jaime San Martin. Quasi-Stationary Distributions: Markov Chains, Diffusions and Dynamical Systems. Springer Science & Business Media, 2012.
[20] Jeffrey Comer, James C. Gumbart, Jérôme Hénin, Tony Lelièvre, Andrew Pohorille, and Christophe Chipot. The adaptive biasing force method: Everything you always wanted to know but were afraid to ask. J. Phys. Chem. B, 119(3):1129–1151, 2014.
[21] Eric Darve and Andrew Pohorille. Calculating free energies using average force. J. Chem. Phys., 115(20):9169–9183, 2001.
[22] Eric Darve, David Rodríguez-Gómez, and Andrew Pohorille. Adaptive biasing force method for scalar and vector free energy calculations. J. Chem. Phys., 128(14), 2008.
[23] Peter Deuflhard and Marcus Weber. Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl., 398:161–184, 2005.
[24] G. Di Gesu, T. Lelièvre, D. Le Peutrec, and B. Nectoux. Sharp asymptotics of the first exit point density. arXiv:1706.08728v1, 2017.
[25] Rose Du, Vijay S. Pande, Alexander Yu. Grosberg, Toyoichi Tanaka, and Eugene S. Shakhnovich. On the transition coordinate for protein folding. J. Chem. Phys., 108(1):334–350, 1998.
[26] Weinan E, Weiqing Ren, and Eric Vanden-Eijnden. Finite temperature string method for the study of rare events. J. Phys. Chem. B, 109(14):6688–6693, 2005.
[27] Andrew L. Ferguson, Athanassios Z. Panagiotopoulos, Pablo G. Debenedetti, and Ioannis G. Kevrekidis. Integrating diffusion maps with umbrella sampling: Application to alanine dipeptide. J. Chem. Phys., 134(13):04B606, 2011.
[28] Bernard Helffer and Francis Nier. Quantitative Analysis of Metastability in Reversible Diffusion Processes via a Witten Complex Approach: The Case with Boundary. Société Mathématique de France, 2006.
[29] Yuehaw Khoo, Jianfeng Lu, and Lexing Ying. Solving for high dimensional committor functions using artificial neural networks. arXiv preprint arXiv:1802.10275, 2018.
[30] Rongjie Lai and Jianfeng Lu. Point cloud discretization of Fokker–Planck operators for committor functions. Multiscale Model. Simul., 16(2):710–726, 2018.
[31] Alessandro Laio and Michele Parrinello. Escaping free-energy minima. Proc. Natl. Acad. Sci., 99(20):12562–12566, 2002.
[32] Claude Le Bris, Tony Lelièvre, Mitchell Luskin, and Danny Perez. A mathematical formalization of the parallel replica dynamics. Monte Carlo Methods Appl., 18(2):119–146, 2012.
[33] Benedict Leimkuhler and Charles Matthews. Robust and efficient configurational molecular sampling via Langevin dynamics. J. Chem. Phys., 138(17):174102, 2013.
[34] Benedict Leimkuhler, Charles Matthews, and Gabriel Stoltz. The computation of averages from equilibrium and nonequilibrium Langevin molecular dynamics. IMA J. Numer. Anal., 36(1):13–79, 2015.
[35] T. Lelièvre, M. Rousset, and G. Stoltz. Free Energy Computations: A Mathematical Perspective. Imperial College Press, 2010.
[36] Jianfeng Lu and James Nolen. Reactive trajectories and the transition path process. Probab. Theory Relat. Fields, 161(1-2):195–244, 2015.
[37] Mohammad M. Sultan and Vijay S. Pande. tICA-metadynamics: accelerating metadynamics by using kinetically selected collective variables. J. Chem. Theory Comput., 13(6):2440–2447, 2017.
[38] Luca Maragliano and Eric Vanden-Eijnden. A temperature accelerated method for sampling free energy and determining reaction pathways in rare events simulations. Chem. Phys. Lett., 426(1-3):168–175, 2006.
[39] Anton Martinsson, Jianfeng Lu, Benedict Leimkuhler, and Eric Vanden-Eijnden. Simulated tempering method in the infinite switch limit with adaptive weight learning. arXiv preprint arXiv:1809.05066, 2018.
[40] Philipp Metzner, Christof Schütte, and Eric Vanden-Eijnden. Illustration of transition path theory on a collection of simple examples. J. Chem. Phys., 125(8):84110, 2006.
[41] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge, MA, 2013.
[42] Boaz Nadler, Stéphane Lafon, Ronald Coifman, and Ioannis G. Kevrekidis. Diffusion maps - a probabilistic interpretation for spectral embedding and clustering algorithms. In Alexander N. Gorban, Balázs Kégl, Donald C. Wunsch, and Andrei Y. Zinovyev, editors, Principal Manifolds for Data Visualization and Dimension Reduction, pages 238–260. Springer, Berlin, Heidelberg, 2008.
[43] Andreas Nold, Benjamin D. Goddard, Peter Yatsyshin, Nikos Savva, and Serafim Kalliadasis. Pseudospectral methods for density functional theory in bounded and unbounded domains. J. Comput. Phys., 334:639–664, 2017.
[44] Lars Onsager. Initial recombination of ions. Phys. Rev., 54(8):554, 1938.
[45] Vijay S. Pande, Alexander Yu. Grosberg, Toyoichi Tanaka, and Daniel S. Rokhsar. Pathways for protein folding: is a new view needed? Curr. Opin. Struct. Biol., 8(1):68–79, 1998.
[46] D. Perez, B. P. Uberuaga, and A. F. Voter. The parallel replica dynamics method - coming of age. Comput. Mater. Sci., 100:90–103, 2015.
[47] Jordane Preto and Cecilia Clementi. Fast recovery of free energy landscapes via diffusion-map-directed molecular dynamics. Phys. Chem. Chem. Phys., 16(36):19181–19191, 2014.
[48] Jan-Hendrik Prinz, Martin Held, Jeremy C. Smith, and Frank Noé. Efficient computation, sensitivity, and error analysis of committor probabilities for complex dynamical processes. Multiscale Model. Simul., 9(2):545–567, 2011.
[49] Jan-Hendrik Prinz, Bettina Keller, and Frank Noé. Probing molecular kinetics with Markov models: metastable states, transition pathways and spectroscopic observables. Phys. Chem. Chem. Phys., 13(38):16912–16927, 2011.
[50] Mary A. Rohrdanz, Wenwei Zheng, Mauro Maggioni, and Cecilia Clementi. Determination of reaction coordinates via locally scaled diffusion map. J. Chem. Phys., 134(12), 2011.
[51] Lula Rosso, Peter Mináry, Zhongwei Zhu, and Mark E. Tuckerman. On the use of the adiabatic molecular dynamics technique in the calculation of free energy profiles. J. Chem. Phys., 116(11):4389–4402, 2002.
[52] Ch. Schütte, Alexander Fischer, Wilhelm Huisinga, and Peter Deuflhard. A direct approach to conformational dynamics based on hybrid Monte Carlo. J. Comput. Phys., 151(1):146–168, 1999.
[53] Christof Schütte and Wilhelm Huisinga. Biomolecular conformations can be identified as metastable sets of molecular dynamics. Handb. Numer. Anal., 10:699–744, 2003.
[54] Patrick Shaffer, Omar Valsson, and Michele Parrinello. Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin. Proc. Natl. Acad. Sci., 113(5):1150–1155, 2016.
[55] Amit Singer. From graph to manifold Laplacian: The convergence rate. Appl. Comput. Harmon. Anal., 21(1):128–134, 2006.
[56] Erik H. Thiede, Dimitrios Giannakis, Aaron R. Dinner, and Jonathan Weare. Galerkin approximation of dynamical quantities using trajectory data. arXiv preprint arXiv:1810.01841, 2018.
[57] Glenn M. Torrie and John P. Valleau. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys., 23(2):187–199, 1977.
[58] Lloyd N. Trefethen. Spectral Methods in MATLAB, volume 10. SIAM, 2000.
[59] Eric Vanden-Eijnden. Transition path theory. In Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology, Volume 1, pages 453–493. Springer, 2006.
[60] Francesca Vitalini, Antonia S. J. S. Mey, Frank Noé, and Bettina G. Keller. Dynamic properties of force fields. J. Chem. Phys., 142(8):02B611, 2015.
[61] A. F. Voter. Parallel replica method for dynamics of infrequent events. Phys. Rev. B, 57(22):R13985, 1998.
[62] Wenwei Zheng, Mary A. Rohrdanz, and Cecilia Clementi. Rapid exploration of configuration space with diffusion-map-directed molecular dynamics.