[PDF] Pseudo generators of spatial transfer operators

Abstract

Metastable behavior in dynamical systems may be a significant challenge for a simulation based analysis. In recent years, transfer operator based approaches to problems exhibiting metastability have matured. In order to make these approaches computationally feasible for larger systems, various reduction techniques have been proposed: For example, Schütte introduced a spatial transfer operator which acts on densities on configuration space, while Weber proposed to avoid trajectory simulation (like Froyland et al.) by considering a discrete generator. In this manuscript, we show that even though the family of spatial transfer operators is not a semigroup, it possesses a well defined generating structure. What is more, the pseudo generators up to order 4 in the Taylor expansion of this family have particularly simple, explicit expressions involving no momentum averaging. This makes collocation methods particularly easy to implement and computationally efficient, which in turn may open the door for further efficiency improvements in, e.g., the computational treatment of conformation dynamics. We experimentally verify the predicted properties of these pseudo generators by means of two academic examples.

Full PDF

PPseudo generators of spatial transfer operators

Andreas Bittracher ˚ P´eter Koltai : Oliver Junge ˚ October 9, 2018

Abstract

Metastable behavior in dynamical systems may be a signiﬁcant challenge for a simulationbased analysis. In recent years, transfer operator based approaches to problems exhibitingmetastability have matured. In order to make these approaches computationally feasiblefor larger systems, various reduction techniques have been proposed: For example, Sch¨utteintroduced a spatial transfer operator which acts on densities on conﬁguration space, whileWeber proposed to avoid trajectory simulation (like Froyland et al.) by considering a discretegenerator.In this manuscript, we show that even though the family of spatial transfer operators isnot a semigroup, it possesses a well deﬁned generating structure. What is more, the pseudogenerators up to order 4 in the Taylor expansion of this family have particularly simple,explicit expressions involving no momentum averaging. This makes collocation methodsparticularly easy to implement and computationally eﬃcient, which in turn may open thedoor for further eﬃciency improvements in, e.g., the computational treatment of conforma-tion dynamics. We experimentally verify the predicted properties of these pseudo generatorsby means of two academic examples.

Conformations of molecular systems

The properties of many biomolecular systems suchas proteins or enzymes depend heavily on their molecular conﬁguration, i.e. the position of singleatoms relative to each other. It is often observed that the system tends to ”cluster” aroundcertain key conﬁgurations. Transitions between these so-called conformations can be consideredrare events, as the time scale on which they occur and the characteristic dynamic time scaleof the atoms in the molecule typically lie 10 ´

15 orders of magnitude apart. Nevertheless,these transitions play an essential role for the biological function of these molecules [15, 50, 16,35, 33]. The reliable identiﬁcation of these conformations and the probabilities (and rates) oftransitions between them via direct numerical simulation is computationally very demanding ifnot infeasible for larger molecules.

Transfer operator based methods

Molecular systems as described above are typicallymodeled as Hamiltonian systems, possibly including stochastic perturbations. Conformationsthen are almost-invariant (metastable) subsets of position space, corresponding to local minimaof the potential energy surface. The ultimate goal of conformation analysis is to obtain a reducedmodel of the given system which accurately depicts these sets and the proper statistics of thetransitions between them. This ﬁeld of research, also called

Markov state modeling , attracted alot of interest in the last decade [4, 5, 21, 36, 39, 43, 48]. ˚ Center for Mathematics, Technische Universit¨at M¨unchen : Mathematics Institute, Freie Universit¨at Berlin a r X i v : . [ m a t h . D S ] M a y ioneering work of Deuﬂhard, Dellnitz et al [10, 42] exploits that these almost invariant setscan in principle be identiﬁed through eigenfunctions of a certain linear operator, the transferoperator , which describes the evolution of distributions under the dynamics.A direct application of this approach considers the operator acting on densities on the entirestate space (i.e. position and momentum), while the conformational changes of interest are onlyobserved in the position coordinate. Moreover, the approach is subject to the curse of dimensionfor all but the smallest systems, since a discretization of state space has to be constructed.To remedy this, one might consider the “overdampled” or Smoluchowski dynamics [22], whichacts on position space only, but this is a physically acceptable model of molecular motiononly in the case that random collisions with the solvent overwhelm the eﬀect of inertia in themolecule of interest. Sch¨utte [41] came up with a physically justiﬁable solution as he introducedthe so-called spatial dynamics , whose metastable sets still bear the interpretation of moleculardynamical conformations. The associated spatial transfer operator acts on densities on positionspace and can be seen as a momentum-averaged version of the full Hamiltonian or Langevindynamics.Commonly, the transfer operator is ﬁnitely approximated by a stochastic matrix, whose entriescan be interpreted as transition probabilities of single system instances in the canonical ensemblefrom some subset (in state or position space) to another. These transition probabilities in turnare computed by short time integration of a number of trajectories starting in each subset.Thus, the computation of a few long simulations is replaced by the computation of many shorttrajectories for an ensemble of adequately distributed initial conditions. Still, the momentumaveraging has to be done explicitely by additionally sampling the momentum space for each ofthese initial conditions.Over the years, diﬀerent techniques for a ﬁnite approximation of transfer operators have beenproposed, we refer to [9, 7] and the references there. More recently, Weber [47] used meshfreeapproximation techniques and showed that for a given approximation error, the number of basiselements scales with the number of metastable sets, not necessarily with the dimension of thesytem. In [25], an approximation by a sparse Haar space was proposed in order to mollify thecurse of dimension, while in [17] a tensor-product construction was used in combination witha mean ﬁeld- approach. A novel approach to coarse-grain a multi-scale system by discretizingits transfer operator without using a full partition of the phase space [43] excels especially ineﬃciently reproducing the dominant time scales of the original system, but relies heavily onlong trajectory simulations.Elaborate schemes to extract the metastability information from the eigenvectors of the (ap-proximate) transfer operator were developed in [22, 24, 12, 23]. Simulation-free and generator-based methods

All these methods rely on the numericalintegration of trajectories. Only recently, methods have been proposed that require no timeintegration, albeit imposing further requirements on the system [27, 48, 18]: Under these con-ditions and provided that the system’s transfer operator forms a continuous time semigroup,one can exploit that the eigenvalues and -functions are the same as those of the semigroup’s inﬁnitesimal generator . Discretizing this generator requires no time-integration and is thuscomputationally considerably cheaper than classical methods.

This manuscript

Unfortunately, Sch¨utte’s spatial transfer operator is not a time semigroup.In this contribution, we deﬁne suitable pseudo generators of non-semigroup families of oper-ators which inherit desirable properties of the spatial transfer operator as well as “restored”operators which approximate the spatial operator at least for small times. The appeal of theseconstructions from a numerical perspective is threefold:21) no numerical time integration is needed,(2) momentum averaging is accomplished analytically, i.e. momentum sampling is completelyavoided,(3) the pseudo generators can be discretized by collocation methods, avoiding costly boundaryintegrals.We establish theoretical asymptotic estimates on the error of density propagation, and validatethem numerically. The numerical experiments indicate that the information on metastablesets (i.e. conformations) gained from the restored operators remains close to the “original” onegained from the exact spatial transfer operator, even for times beyond those guaranteed by ourestimates. A quantitative understanding of this phenomenon (also observed by Sch¨utte [41]) isstill lacking; some steps towards a theoretical explanation have been made in [3].The manuscript is structured as follows. In Section 2, we introduce the basic dynamical modelswe are working with and describe their action on propagating (probability) densities by transferoperators. This necessitates the discussion of operator semigroups, given at the end of the sec-tion. Section 3 is concerned with ﬂuctuations in the spatial distribution of the system governedby its dynamics, leading to the concept of spatial transfer operators as well as metastability inposition space. In Section 4, we introduce the concept of pseudo generators, the correspondingrestored operators, and give asymptotic error estimates on their approximation quality. Sec-tion 5 includes numerical experiments. We conclude our work in Section 6, and discuss futuredirections to make the method applicable for realistic (bio-)molecular systems. Three appen-dices are given: Appendix A gives a detailed derivation of the pseudo generators up to order 3;Appendix B gives a complete and self-contained proof of the applicability of Huisinga’s the-ory [22, 24] on the quantitavive identiﬁcation of metastable components from spectral analysisof transfer operators for the spatial transfer operator based on Langevin dynamics (especiallyreversibility and ergodicity of the spatial dynamics, which are probably known or at least an-ticipated, however we could not ﬁnd neither a statement, nor even a partial derivation of theseproperties); in Appendix C, we show that the eigenfunctions of the spatial transfer operator aresmooth, i.e. inﬁnitely diﬀerentiable, if the potential is a smooth function.

In this section we introduce the dynamical systems of interest as well as the concept of transferoperators for describing statistical transport under these dynamics.

Broadly speaking, we will be studying continuous time stochastic dynamical systems on a phasespace Ω Ă R d . They will be described by Ω-valued random variables x t , t ě

0, following an Itˆodiﬀusion equation, i.e. a stochastic diﬀerential equation of the form B t x t “ b p x t q ` Σ p x t q w t . (1)Here, B t denotes the diﬀerentiation with respect to t , b : Ω Ñ R d , Σ : Ω Ñ R d ˆ d and w t is a R d -valued “white noise” term, see e.g. [34, p. 61]. The functions b and Σ are assumed to beglobally Lipschitz and growing at most linearly at inﬁnity, such that (1) has unique solutions(cf. [34, Theorem 5.2.1]). Here and in the following boldface lower case letters denote randomvariables. 3 olecular dynamics Consider a molecular system described by d P N positional degrees offreedom. Typically, these correspond to either internal coordinates or particle positions in R n , n being the number of particles. Let Q Ă R d denote the conﬁguration space.Let a potential V : Q Ñ R , describing the relative energy of a given conﬁguration q P Q , becontinously diﬀerentiable . Under the assumption of strict total energy, the movement of thesystem is then described by classical deterministic Hamiltonian dynamics: B t q “ p B t p “ ´ ∇ V p q q . (2)The phase space is thus Ω “ Q ˆ P , with q P Q the position and p P P “ R d the momentumcoordinate. For simplicity, we set the mass matrix M to be the identity, otherwise the ﬁrstequation in (2) would be B t q “ M ´ p .Equation (2) models a system “in vacuo”, independent from external inﬂuence. Of more physicalrelevance, however, are systems which are stochastically coupled to their surroundings, phys-ically motivated by the presence of a heat bath or implicit solvent not modeled explicitly. Aprominent way of doing this is via a drift-diﬀusion perturbation of (2), known as the Langevinequations, which can be formally derived using averaging techniques from the Mori-Zwanzigformalism [51], B t q t “ p t B t p t “ ´ ∇ V p q t q ´ γ p t ` σ w t . (3)These can be written in the form of (1) with b p q, p q “ ˆ p ´ ∇ V p q q ´ γp ˙ and Σ p q, p q “ ˆ σ ˙ . The term ´ γ p t mimics the drag through the implicitly present solvent, σ w t accounts for randomcollisions with the solvent particles. To balance damping and excitation and to keep the systemat a constant average internal energy β , we set σ “ a γ { β . The choice of γ is problemdependent, and mimics the viscosity of the aforementioned implicit solvent. For further detailson the modeling see [6].An even further model reduction leads to the so-called Smoluchowski dynamics. In the second-order form of (3), B t q t “ ´ ∇ V p q t q ´ γ B t q t ` σ w t , (4)we consider a high-friction situation γ Ñ 8 . After appropriate rescaling of the time, τ “ γ ´ t (in order to be able to observe movement under the now extremely slow dynamics), (4) becomes γ ´ B τ q τ “ ´ ∇ V p q τ q ´ B τ q τ ` c β w τ . In the limit γ Ñ 8 , this yields the

Smoluchowski equation B τ q τ “ ´ ∇ V p q τ q ` c β w τ . (5)This conceptual derivation can be made precise by considering stochastic convergence notions.The interested reader is referred either to [32], where the physical intuition and mathematical Later on, we will impose stronger assumptions on V . Actually, β is the inverse temperature, β “ {p k B T q , with Boltzmann’s constant k B and T the systemtemperature. , or to [37], where homogenization techniques for the transferoperators of the underlying equations are exploited. Next, we will consider this latter, operator-based characterization of stochastic processes. We shall now examine how phase space density functions evolve under the dynamics inducedby (1). That is, given a probability density at time t “

0, what is the probability to ﬁnd thesystem in a certain region at time t ą t ? More precisely, given x „ f “ f (a random variable x distributed according to the density f ), ﬁnd f t with x t „ f t , t ě

0, where the evolution of x t is governed by (1).To this end, let p Ω , B , µ q be a probability space with B denoting the Borel σ -algebra, andconsider the stochastic transition function p : R ě ˆ Ω ˆ B Ñ r , s , p p t, x, B q “ Prob r x t P B | x “ x with probability 1 s , and denote by p µ p t, A, B q : “ Prob µ r x t P B | x P A s (6)the transition probabilities between A P B and B P B , where Prob µ indicates that x „ µ ; i.e.the initial condition is distributed according to µ . For the long term macroscopic behavior ofthe system, sets A Ă Ω play an important role for which p µ p t, A, A q « µ and times t ą x „ f “ : f P L µ p Ω q is given. We then have that x t „ f t with ż B f t p x q dµ p x q “ ż Ω f p x q p p t, x, B q dµ p x q , @ B P B . (7)Under mild conditions , satisﬁed by the systems considered here, f t is uniquely deﬁned by (7).This yields the transfer operator with lag time t P t : L µ p Ω q Ñ L µ p Ω q via P t f p x q : “ f t p x q where we extend the deﬁnition of P t from densities to arbitrary integrable functions usinglinearity. In the deterministic case, this is the so-called Perron–Frobenius operator .For some of the following results, it will be necessary to distinguish between the transfer opera-tors of the general Itˆo diﬀusion (1), and the special cases of the Langevin (3) and Smoluchowskidynamics (5). We will then refer to them as P t , P t Lan and P t Smol , respectively. Of course,statements concerning P t hold for P t Lan and P t Smol as well.

Some properties of P t P t can be considered a time-parametrized family of linear operators,which then possesses the Chapman–Kolmogorov (or semigroup) property:(i) lim t Ñ P t f “ f ,(ii) P t ` s f “ P t ` P s f ˘ for all s, t ě A reader with a physicists view might be irritated by the fact that the terms with ∇ V and w t act as forces (oraccelerations) in (3) and as velocities in (5). By eliminating the damping coeﬃcient γ , the physical dimensionsof the terms changed. Since the mathematical statement is not aﬀected by this, we shall not go into details. In the literature, L p sometimes denotes the “pre-Lebesgue space”, i.e. the Lebesgue space before equivalenceclass formation, and L p usually denotes the actual Lebesgue space. Due to clash of notation, however, we callthe actual Lebesgue space L p and use } ¨ } k,µ to denote the standard norm. See e.g. [29]. P t to the spaces L kµ p Ω q , 1 ď k ď 8 , is well deﬁned for proper choices of µ (see Corollary2.1). We thus have that } P t } k,µ ď P t f ě ď f P L kµ p Ω q .Using the standard scalar product on L µ p Ω q , for µ p A q ą

0, the transition probabilities can beexpressed via the transfer operator: p µ p t, A, B q “ µ p A q ż B P t χ A dµ “ µ p A q ż Ω P t χ A χ B dµ “ x P t χ A , χ B y ,µ x χ A , χ A y ,µ with χ being the indicator function. The semigroup property basically means that P t is “memoryless” (in other terms, (1) generatesa Markov process). The identity P t “ ` P t { n ˘ n suggests that all information about the densitytransport is contained in P τ for arbitrarily small τ .This is formalized by looking at an operator L : D p L q Ñ L kµ p Ω q given by Lf “ lim τ Ñ P τ f ´ fτ , (8)where D p L q Ă L kµ p Ω q is the linear subspace of L kµ p Ω q where the above limit exists. L iscalled the inﬁnitesimal generator of the semigroup P t , and the ﬁeld of operator semigrouptheory [38] answers the question in which sense P t is a solution operator to the Cauchy problem B t f t “ Lf t . Essentially, the power of the inﬁnitesimal generator lies in the fact that all therelevant information about P t for all times t ě L . We will discuss thisbelow. Invariant density

Having its interpretation in mind, it is not surprising that the inﬁnitesimalgenerator is exactly the right hand side of the parabolic partial diﬀerential equation describingthe ﬂow of suﬃciently regular densities, the

Kolmogorov forward equation or Fokker–Planckequation , see [26, p. 282]: B t f t p x q “ d ÿ i “ d ÿ k “ B B x i B x k ` Σ ik p x q f t p x q ˘ ´ d ÿ i “ BB x i ` b i p x q f t p x q ˘loooooooooooooooooooooooooooooooooomoooooooooooooooooooooooooooooooooon “ : Lf t p x q . (9)In the Langevin case, this simpliﬁes to B t f t p q, p q “ ´ γβ ∆ p ´ p ¨ ∇ q ` ∇ q V p q q ¨ ∇ p ` γp ¨ ∇ p ` dγ ¯looooooooooooooooooooooooooooooomooooooooooooooooooooooooooooooon “ : L Lan f t p q, p q , (10)where the dot denotes the Euclidean inner product, ∇ x and ∆ x are the gradient and Laplaceoperators with respect to x , respectively. For Smoluchowski, it is B t f t p q q “ ´ β ∆ q ` ∇ q V ¨ ∇ q ` ∆ q V ¯looooooooooooooooomooooooooooooooooon “ : L Smol f t p q q . (11)Densities which are invariant under the dynamics play a naturally prominent role. Since theyare ﬁxed under P t for any t ě

0, by (8) they lie in the kernel of L ; see also Corollary 2.3 below.6or the stochastic processes considered here the invariant density can be shown to be unique(cf. [34]), and—using the term from statistical mechanics—we call it the canonical density , f Ω .For the Langevin dynamics f Ω p q, p q “ f Q p q q ¨ f P p p q , (12)where f Q p q q : “ Z Q exp ` ´ βV p q q ˘ , f P p p q : “ Z P exp ` ´ β p ¨ p ˘ , with Z P “ ş P exp ` ´ β p ¨ p ˘ dp and Z Q “ ş P exp ` ´ βV p q q ˘ dq . For the Smoluchowski dynamics, f Ω p q q “ f Q p q q is the canonical density. f Ω is a density with respect to the Lebesgue measure m ,and its existence requires the integrability of exp p´ βV q ; which is from now on assumed to hold.To understand their relevance, note that the canonical density not just describes the statisticalequilibrium of the system, but the system also tends to this equilibrium as time grows, i.e.according to whatever f the system is distributed initially, f t Ñ f Ω as t Ñ 8 in L m p Ω q . Domain and Spectral properties

The results of this paragraph hold true for the transferoperators of both the Langevin and the Smoluchowski dynamics, by taking the correspondingphase space and invariant measure (i.e. the measure having the canonical density as Radon–Nikod´ym derivative with respect to the Lebesgue measure).Let µ Ω be the invariant measure of P t . Note that for arbitrary 1 ď k ď 8 , P t can be deﬁnedon L kµ Ω p Ω q due to the following corollary: Corollary 2.1 ([2, Corollary to Lemma 1]) . Let P t be a transfer operator associated with atransition function having the invariant measure µ Ω . Then P t is a well-deﬁned contraction on L kµ Ω p Ω q for every ď k ď 8 . For our purposes the main connection between a semigroup of operators and their generator isgiven by the following

Theorem 2.2 (Spectral mapping theorem [38]) . Let X be a Banach space, T t : X Ñ X , t ě ,a C semigroup of bounded linear operators (i.e. T t f Ñ f as t Ñ for every f P X , and T t bounded for every t ), and let A be its inﬁnitesimal generator. Then e tσ p p A q Ă σ p p T t q Ă e tσ p p A q Y t u , with σ p denoting the point spectrum. The corresponding eigenvectors are identical. We can immediately deduce the following statements.

Corollary 2.3.

A function f is an invariant density of P t for all t ě , if and only if Lf “ . Corollary 2.4.

Since P t is a contraction in L kµ Ω p Ω q , the eigenvalues of L lie in the left complexhalf-plane. Theorem 2.2 suggests that P t “ e tL “ ř k “ t k k ! L k . This intuition is false in general, as L maybe unbounded and Ş k “ D ` L k ˘ ‰ L pµ Ω p Ω q for any p . However, P t can be approximated by atruncated “Taylor series”, at least pointwise, in the function space V N p Ω q : “ (cid:32) f P C N p Ω q ˇˇ L n f P L µ Ω p Ω q @ n “ , . . . , N ( . (13)We require f and V to be 2 N -times diﬀerentiable, as this is the highest derivative occuringin L N , cf. (9).The following convergence result also holds true if choosing L kµ Ω p Ω q instead of L µ Ω p Ω q in thedeﬁnition of V N p Ω q , and correspondingly regarding the norm } ¨ } k,µ Ω . However, we state it for L µ Ω p Ω q , as this is the space we are ultimately operating in.7 roposition 2.5. Let f P V N ` p Ω q . Then ››› P t f ´ N ÿ n “ t n n ! L n f ››› ,µ Ω “ O p t N ` q for t Ñ . Proof.

Let f P V N ` p Ω q . Then, P t f : t ÞÑ L µ Ω p Ω q is N ` t because B k P t ˇˇ t “ f “ L k f, k ď N ` f . The Taylor series expansion for Banachspace valued linear operators can for example be found in [49, Section 4.5]. Application to P t yields P t f “ N ÿ n “ L n f t n n ! ` ´ ż N ! p ´ s q N B N ` s P st f ds ¯ t N ` . We estimate the remainder: ››› P t f ´ N ÿ n “ t n n ! L n f ››› ,µ Ω “ ››› ż N ! p ´ s q N B N ` s P st f ds ››› ,µ Ω t N ` . As we are interested in the limit t Ñ t ă

1. In that case, st ă s , and therefore ď t N ` N ! sup s Pr , s ›› B N ` s P s f ›› ,µ Ω .P t f is the solution of the Fokker–Planck equation (9), and thus B s P s f “ LP s f and, by exten-sion, B N ` s P s f “ L N ` P s f . Moreover, due to Pazy [38, Corollary 1.4], the transfer operatorand generator commute: LP s f “ P s Lf . Therefore, “ t N ` N ! sup s Pr , s ›› P s L N ` f ›› ,µ Ω ď t N ` N ! sup s Pr , s ›› P s ›› ,µ Ω looomooon ď ›› L N ` f ›› ,µ Ω . In the last line, ›› P s ›› ,µ Ω ď P t is a contraction (Corollary 2.1). As L N ` f P L µ Ω p Ω q by the choice of f , it is bounded. This completes the proof. In order to analyze the behaviour of molecular systems in regard of conﬁgurational stability, wehave to restrict our view to the dynamics on position space Q . For this purpose, Sch¨utte in [41]proposed a reduction of the classical Hamiltonian dynamics, called Hamiltonian dynamics withrandomized momenta, while Weber [48] proposed the corresponding generalized version for astochastic evolution. Following Sch¨utte and Weber, we formulate the extension to Langevindynamics and state the appropriate deﬁnition of metastability. Consider an inﬁnitely large number of identical systems of form (3) in thermodynamic equi-librium, i.e. identically and independently distributed according to f Ω (called an ensemble inclassical statistical mechanics literature). To determine which portion of these systems undergo8 certain conﬁgurational change, i.e. leave a subset A Ă Q , we have to track the evolution of allthese systems starting from A . Due to the product structure (12) of f Ω , their momenta are stilldistributed according to f P and so the whole coordinates are initially distributed according to χ A f Ω , with χ A the indicator function of A on Q . This phase space density now evolves under P t Lan , but as we are only interested in the position portion of the evolving density, we form themarginal distribution with respect to q . The resulting spatial transfer operator on L kµ Q p Q q with dµ Q : “ f Q dm is S t χ A p q q : “ f Q p q q ż P P t Lan ` χ A p q q f Ω p q, p q ˘ dp, (14)cf. Corollary 2.1 applied to the operator P t Lan and the invariant measure µ Ω .Intuitively, one can think of S t u with normalized u P L kµ Q p Q q as transporting a positionalportion of the canonical density. Langevin dynamics with randomized momenta

Considering S t as an operator on L µ Q p Q q and using the standard associated scalar product x u, v y ,µ Q gives us access to certain transitionprobabilities on Q , which ﬁt our intuition of metastability. For A Ă Q we callΓ p A q : “ (cid:32) p q, p q P Ω | q P A ( the ”slice“ of phase space corresponding to A . It represents a sub-ensemble in position spaceassociated to all possible momenta. It is easy to see that the transition probabilities betweenslices Γ p A q and Γ p B q can be expressed in terms of S t : p p t, Γ p A q , Γ p B qq “ x S t χ A , χ B y ,µ Q x χ A , χ A y ,µ Q , (15)where p is the stochastic transition function under Langevin dynamics with respect to theLebesgue measure. We now call a disjoint decomposition A Y . . . Y A n “ Q of position space metastable if p p t, Γ p A j q , Γ p A j qq « , j “ , . . . , n. The meaning of “ «

1” will become apparent later.The connection between eigenvalues close to one of some transfer operator and metastable setswas ﬁrst observed in [9] and applied in conformation dynamics in [10]. An extension to a broaderclass of transfer operators (satisfying an assumption related to self-adjointness) was providedby Huisinga and Schmidt [24]. Our S t falls into that class, which is shown in Appendix B. Theorem 3.1 (Application of [24, Theorem 2]) . Let σ p S t q Ă r a, s with a ą ´ and λ n ď . . . ď λ ă λ “ be the n largest eigenvalues of S t , with eigenvectors v n , . . . , v . Let t A , . . . , A n u be a measurable decomposition of Q and Π : L µ Q p Q q Ñ L µ Q p Q q be the orthogonalprojection onto span p χ A , . . . , χ A n q , i.e. Π v “ n ÿ j “ x v, χ A j y ,µ Q x χ A j , χ A j y ,µ Q χ A j . The metastability of the decomposition can then be bounded from above by p p t, Γ p A q , Γ p A qq ` . . . ` p p t, Γ p A n q , Γ p A n qq ď ` λ ` . . . ` λ n , while it is bounded from below by ` ρ λ ` . . . ` ρ n λ n ` c ď p p t, Γ p A q , Γ p A qq ` . . . p p t, Γ p A n q , Γ p A n qq where ρ j “ } Π v j } ,µ Q P r , s and c “ a p ´ ρ ` . . . ` ´ ρ n q . v j , the better the lower bound matches the upperbound. We choose A , . . . , A n in accordance to the the sign structure of v , . . . , v n as a heuristicto the optimal decomposition (i.e. we treat the eigenfunctions as one-step functions of the form χ A ´ χ B ).However, more sophisticated strategies for extracting metastable sets are available, most notablythe linear optimization-based PCCA-algorithm and its extensions, developed by Deuﬂhard et.al. ([11], [40]). Let it be noted that it is applicable to all the operators developed herein, asPCCA does not depend on the underlying dynamical model. Smoluchowski dynamics

As described in section 2.1, another way to restrict the moleculardynamics to position space is via the high-friction limit and the arising transition from Langevinto Smoluchowski dynamics. As this limit may represent a considerable deviation from physicalreality, it is initially unclear how metastability in system (5) can be interpreted in the context ofthe original system. As the transition also involves a rescaling of time, especially the transitionprobabilities have to be treated with caution.Nevertheless, metastability under Smoluchowski dynamics can formally be deﬁned as above,and it holds p p t, A, B q “ x P t Smol χ A , χ B y ,µ Q x χ A , χ A y ,µ Q . Here, p is the stochastic transition function with respect to Smoluchowski dynamics. Using thesame reasoning as in the previous paragraph, we seek eigenpairs p λ, u q of P t Smol with λ « P t Smol near 1 coincide with eigenvaluesof the inﬁnitesimal generator L Smol near 0, and the associated eigenvectors are identical. Aneﬃcient method for metastability analysis based on L Smol was developed in [18]. There it isshown that for an eigenvalue λ ă L Smol , the corresponding eigenvector u and the sets A ` “ t u ą u , A ´ “ t u ă u holds p p t, A ` , A ` q ` p p t, A ´ , A ´ q “ ` exp p tλ q ` O p t q . Unfortunately, S t lacks the semi-group property, and so cannot be the solution operator of anautonomous PDE, such as the Fokker–Planck equation. Equivalently, spatial dynamics is notinduced by an Itˆo diﬀusion process, and thus has no inﬁnitesimal generator in the sense of (11). Formally, the time-derivatives of S t can still be deﬁned, in analogy to (8). We will see in thefollowing how the resulting operators can play the role of the inﬁnitesimal generator in thecontext of metastability analysis. We ﬁrst deﬁne these time derivatives for general time-parameterized operators:

Deﬁnition 4.1.

Let X be a Banach space, T t : X Ñ X , t ą B t T t : D ` B t T t ˘ Ñ X by B t T t f “ lim h Ñ T t ` h f ´ T t fh time-derivative of T t . D ` B t T t ˘ here is the subspace of X where the abovelimit exists. Iteratively, we deﬁne by B nt T t : “ B t ` B n ´ t T t ˘ the n -th time-derivative on D ` B nt T t ˘ .Finally, G n : “ B nt T t ˇˇ t “ is called the n -th pseudo generator of T t .For T t “ P t , the transfer operator of an Itˆo process, the pseudo generators are the iteratedinﬁnitesimal generators: Proposition 4.2. On D p L n q , the n -th pseudo generator G n of P t takes the form G n “ L n , with L the inﬁnitesimal generator of the respective dynamics. With this, the pseudo generators of the spatial transfer operator S t can be expressed by thegenerator L Lan of the full Langevin transfer operator:

Lemma 4.3. On D p L n Lan q , the n -th pseudo generator G n of S t takes the form G n u p q q “ f Q p q q ż P L n Lan ` u p q q f Ω p q, p q ˘ dp. Proof.

The ﬁrst time-derivative of S t is B t S t u p q q “ B t ´ f Q p q q ż P P t Lan ` u p q q f Ω p q, p q ˘ dp ¯ “ f Q p q q ż P B t P t Lan ` u p q q f Ω p q, p q ˘ dp “ p q f Q p q q ż P ´ L Lan P t Lan ` u p q q f Ω p q, p q ˘¯ dp. Inductively, this gives the n -th time derivative B nt S t “ f Q p q q ż P L n Lan P t Lan ` u p q q f Ω p q, p q ˘ dp. As P “ id, the n -th pseudo generator is G n u p q q “ ´ f Q p q q ż P L n Lan P t Lan ` u p q q f Ω p q, p q ˘ dp ¯ˇˇˇ t “ “ f Q p q q ż P L n Lan ` u p q q f Ω p q, p q ˘ dp. From now on, when speaking of pseudo generators, we always mean pseudo generators of S t .Note that, in general, G n is not simply a power of G , as the integral and the power of L n Lan donot commute. We thus take a closer look at the ﬁrst few G n : Proposition 4.4.

Let S t be the spatial transfer operator for the Langevin dynamical process.On their respective domain, its ﬁrst three pseudo generators take the form1. G “ ,2. G “ β ∆ ´ ∇ V ¨ ∇ . Notably, G is independent of γ .3. G “ ´ γG . The proof can be found in Appendix A. 11 onnection to Smoluchowski dynamics

We can draw a perhaps surprising connectionbetween the second pseudo generator and Smoluchowski dynamics. Recall that (11) and thus P t Smol deals with densities with respect to Lebesgue measure m . However, to compare it to S t ,we have to track the transport of densities with respect to µ Q . This transport is described by B t u t p q q “ ˆ β ∆ ´ ∇ V ¨ ∇ ˙loooooooooomoooooooooon “ : G Smol u t p q q . (16)Note, however, that the transition from m to µ Q is merely a basis transformation: Let, for abrief moment, P t Smol ,m and P t Smol ,µ Q be the transfer operators generated by L Smol and G Smol ,respectively. Then, P t Smol ,m p uf Q q “ ´ P t Smol ,µ Q p u q ¯ f Q . (17) P t Smol ,µ Q exists on every L kµ Q p Q q , 1 ď k ď 8 (choosing µ “ f Q dm in Corollary 2.1), and thusbecause of (17), P t Smol ,m exists on L km p Q q , ď k ď 8 . As we only work on f Q weighted spaces,we drop the second subscript from now on and set P t Smol : “ P t Smol ,µ Q . Comparing equation (16) to Proposition 4.4, we immediately see

Corollary 4.5.

The pseudo generator G (of the spatial transfer operator) is the inﬁnitesimalgenerator of the Smoluchowski dynamics: G “ G Smol . In conclusion, the density transport under spatial dynamics is similar to that under Smolu-chowski dynamics, but on diﬀerent timescales: The Taylor expansion of P t Smol (in spirit ofProposition 2.5) gives P t Smol u “ u ` tG u ` t G u ` . . . , (18)while that of S t gives S t u “ u ` t G u ` t G u ` . . . . (19)Thus formally rescaling t ÞÑ t in (18) equals (19) up to second order terms in t . We will makethis rigorous in the following section. In this section we aim to approximate S t in a way that is suitable for subsequent numericalmetastability analysis.To ensure that the various operators constructed with the pseudo generators are well-deﬁned,we introduce the spaces W NK p Q q : “ (cid:32) u P C KN p Q q | p G k q n u P L µ Q p Q q @ n “ , . . . , N, @ k “ , . . . , K ( . (20)Note that, despite a similar notation, W NK p Q q is not the usual Sobolev space. 2 KN is thehighest derivative appearing in G NK , so we require the corresponding diﬀerentiability. The choice12f L µ Q p Q q in W NK p Q q is motivated by the deﬁnition of transition probabilities via the scalarproduct on L µ Q p Q q , (15). However, all error estimates in this section hold for L kµ Q p Q q , ď k ď 8 as well, if the deﬁnition of V N p Ω q is also changed accordingly. The requirement that p G k q n u P L µ Q p Q q is mostly technical in nature. Arbitrary K and N will only appear in Theorem4.6 and Lemma 4.10. Later on, only the space W p Q q will be of interest, due to the simplestructure of G and G . We will see later (Section 5.1) that W p Q q in fact contains our objectsof interest, namely the eigenvectors of our approximations to S t . Moreover, it is big enough toallow for a sensible discretization basis (Section 5.2). Taylor reconstruction

Combining Proposition 2.5 and Lemma 4.3 gives the following nat-ural Taylor reconstruction of S t : Theorem 4.6.

Let u P W K p Q q . Then, ››› S t u ´ K ÿ k “ t k k ! G k u ››› ,µ Q “ O p t K ` q , p t Ñ q . Proof.

By deﬁnition of S t and Lemma 4.3, we can write ››› S t u ´ K ÿ k “ t k k ! G k u ››› ,µ Q “ ››› f Q p q q ż P P t Lan ` u p q q f Ω p q, p q ˘ dp ´ K ÿ k “ t k k ! 1 f Q p q q ż P L k ` u p q q f Ω p q, p q ˘ dp ››› ,µ Q ď f Q p q q ż P ››› P t Lan ` u p q q f Ω p q, p q ˘ ´ K ÿ k “ t k k ! L k ` u p q q f Ω p q, p q ˘››› ,µ Q dp. However, the integrand is of order O p t K ` q by Proposition 2.5.Unfortunately, the G k for k ą V appear, whose analytic or numerical evaluation can be costly(they are k -dimensional tensors).In practice, however, the gradient ∇ V (the ”force ﬁeld“) typically is available, as it would beneeded for numerical simulation of the system anyway. If we thus truncate the Taylor-likesum from Theorem 4.6 after the third term, higher derivatives of V are avoided, as in thecomputation of G and G only ∇ V occurs. We call R t u : “ ´ id ` t G ` t G ¯ u “ u ` ` t ´ γ t ˘´ β ∆ u ´ ∇ u ¨ ∇ V ¯ (21)the of S t . This yields the convergence result Corollary 4.7.

Let u P W p Q q . Then ›› S t u ´ R t u ›› ,µ Q “ O p t q , p t Ñ q . As we never work with higher orders, we refer to R t simply as “the Taylor approximation” from now on. xponential reconstruction We expect that R t approximates S t well (for t Ñ

0) and canbe computed cheaply, provided ∇ V and ∆ u are available. However, unlike S t , R t is not norm-preserving and positive for densities with respect to f Ω , i.e. } R t u } ,µ Q ‰ } u } ,µ Q for u ě u , we lose the interpretation of ` R t u ˘ f Ω as a physical density.Moreover, for t suﬃciently large, R t u is not even a contraction on W p Q q . With λ P σ p G q , λ ‰ ˇˇ ` t λ ´ γt λ loooooooomoooooooon P σ p R t q ˇˇ Ñ 8 , p t Ñ 8q , and so } R t } ,µ Q Ñ 8 , p t Ñ 8q . We will see in the numerical experiments that this quickly(i.e. already for small to moderate t ) destroys the interpretation of the eigenvalues of R t asmetastability quantiﬁers.Therefore we mainly use an alternative approximation to S t , called the exponential approx-imation E t , which is L µ Q -norm-preserving and positive for densities, further contractive on W p Q q . One has to be careful with notation, however. As G is an unbounded operator on C p Q q X L µ Q p Q q , an operator exponential of form e G cannot be deﬁned by an inﬁnite se-ries. However, considerations of e.g. Pazy [38] allow us to deﬁne E t over a bounded operatorapproximating G , the so-called Yosida approximation : G λ : “ λG p λI ´ G q ´ for λ P R ě . Lemma 4.8. G λ is a bounded linear operator on W p Q q , and lim λ Ñ8 G λ “ G . Proof.

As the inﬁnitesimal generator of the Smoluchowski dynamics, G fullﬁlls the assumptionsof [38, Theorem 3.1]. Thus, the statement holds due to [38, Lemma 3.3].With this, we deﬁne E t u : “ exp ˆ t G ˙ u : “ lim λ Ñ8 exp ˆ t G λ ˙ u, which has the desired properties: Proposition 4.9.

Let u P L µ Q p Q q , u ě . Then E t u ě and } E t u } ,µ Q “ } u } ,µ Q . Moreover, E t is a contraction on L kµ Q p Q q .Proof. Due to Corollary 4.5 and Lemma 4.8, E t is simply a time-scaled version of the transferoperator of the Smoluchowski dynamics: E t “ P t { . As such, it inherits the desired properties from P t Smol .The following statements describe the approximation quality of E t for small t , analogous to theTaylor approximation: Lemma 4.10.

Let u P W N p Q q . Then for t Ñ , ε p t q : “ ›››› E t u ´ N ÿ n “ ` t G ˘ n n ! u ›››› ,µ Q “ O p t N ` q . Corollary 4.11.

Let u P W p Q q . Then ›› E t u ´ S t u ›› ,µ Q “ O p t q for t Ñ .Proof. ›› E t u ´ S t u ›› ,µ Q ď ››› E t u ´ ` ÿ n “ t n n ! G k ˘ u ››› ,µ Q ` ›››` ÿ n “ t n n ! G n ˘ u ´ S t u ››› ,µ Q “ ››› E t u ´ ÿ n “ ` t G ˘ n n ! u ››› ,µ Q ` ›››` ÿ n “ t n n ! G n ˘ u ´ S t u ››› ,µ Q . Both summands are O p t q , the ﬁrst due to Lemma 4.10, the second due to Theorem 4.6. Remark 4.12.

1. An approximation order of O p t q can be achieved by including G into E t , i.e. setting E t : “ exp ˆ t G ` t G ˙ : “ lim λ Ñ8 exp ˆˆ t ´ γ t ˙ G λ ˙ .E t again is a time-rescaled transfer operator of the Smoluchowski dynamics. However,it is not contractive for all t as σ p G q Ă p´8 , s and t ´ γ t Ñ ´8 for t Ñ 8 . Wethus stick to the lower-order approximation to retain the correct qualitative behavior for t Ñ 8 .2. In contrast to the operator R t , which is only deﬁned on the domain of the associatedpseudo generators, the operator E t can be deﬁned for every u P L kµ Q p Q q . We conjecturethat Corollary 4.11 holds also for this class of functions, although our proof is not extend-able to this case—it uses the Taylor reconstruction to estimate the error. More advancedtechniques from semigroup theory are needed, hence this will be subject of future studies. Reconstruction of eigenspaces

The error asymptotics carries over to the spectrum andeigenvectors of S t , R t and E t in the following way: Corollary 4.13.

Let u be an eigenvector of R t or E t to eigenvalue λ . Then u P W p Q q and ›› S t u ´ λu ›› ,µ Q “ O p t q for R t , ›› S t u ´ λu ›› ,µ Q “ O p t q for E t . Thus, for small t we interpret the eigenpair p u, λ q of R t or E t as a good approximation to aneigenpair of S t . We now show how to numerically exploit the newly-developed approximations to S t on twosimple examples. We will see that the simple structure of our pseudogenerators (up to G )allows for cheap evaluation if the right discretization techniques are used. For small lag times15 , the approximated operators can be used for accurate metastability analysis and the expectedconvergence rates hold.In this setting, the discretization technique of choice are spectral collocation methods, sincethey are known to converge faster than any polynomial order, provided the underlying objectsare (inﬁnitely) smooth [45]. Under suﬃcient regularity assumptions on the Langevin SDE, the eigenfunctions of the asso-ciated spatial transfer operator S t and of the associated Smoluchowski dynamics P t Smol (henceof G too) can be shown to be C , i.e. smooth. This allows for an extremely eﬃcient approxi-mation by spectral methods. All one needs is the following assumption. Assumption 5.1.

Let the potential V : R d Ñ R be C and let all its derivatives of ordergreater or equal two be bounded.The proof of smoothness touches upon techniques beyond the scope of this manuscript, hencewe deferred it to Appendix C. All our operators will be discretized and compared using spectral collocation methods. We referto [18], where these methods have been applied to transfer operator semigroups, i.e. the casewhere a “true” inﬁnitesimal generator exists. Note that in contrast to [48], we do not need tocompute boundary integrals on high dimensional domains.To avoid dealing with boundary conditions, we restrict ourselves to periodic position spaces.For a more general introduction see, for example, [45].For Q “ T , the 1-dimensional unit circle, deﬁne for n odd the ﬁnite dimensional approximationspace U n of trigonometric polynomials with basis of Fourier modes (cid:32) φ k ( ´ n ´ ď k ď n ´ , φ k p q q “ e iπkq , (22)as well as the corresponding collocation nodes Q n : “ t , { n, . . . , p n ´ q{ n u .with ﬁtting collocation nodes Q n : “ (cid:32) cos ` p k ´ q π {p n q ˘ , k “ , . . . , n ´ ( .Multi-dimensional phase spaces, consisting of cartesian products of T , can be discretized usinga corresponding product basis.Let I n : U Ñ U n be the corresponding projection from some function space U (in our case L µ Q p Q q ) to U n . For an operator A : U Ñ U , we then deﬁne the discretized operator A n : U n Ñ U n by A n : U n Ñ U n , A n f : “ I n Af. (23)The operator A n has a matrix representation which, for the sake of notational simplicity, is alsodenoted by A n : A n “ ` Aφ i p q j q ˘ ij , We are interested in parts of the spectrum and the associated eigenspaces of A , for A “ S t , R t , E t or G . So instead of solving Av “ λv on U , we solve A n v “ v on U n . In matrix form, using thecollocation matrix M n “ ` φ i p q j q ˘ ij , this becomes a generalized eigenvalue problem on C n : A n c “ λM n c or p A n ´ λM n q c “ , (24)16here ř nk “ c k φ k then is the approximative eigenfunction of A to eigenvalue λ in Laplace space.Under mild smoothness conditions on the potential V , readily fulﬁlled if Assumption 5.1 issatisﬁed, the Fourier and Chebyshev bases fulﬁll the requirements of Corollaries 4.7 and 4.11,i.e. φ k P W p Q q , k “ , . . . , n . Since even for our simple academic examples, S t cannot be computed analytically, a referencemethod is needed with which we can compute the eigenvalues and -vectors of S t . For this, wehere employ Ulam’s method (see e.g. [8]).Let t A , . . . , A n u be a partition of Q into subsets of positive Lebesgue measure . Deﬁne theapproximation space U n “ span p χ A , . . . , χ A n q . Now, the discretization of S t , denoted by S tn : U n Ñ U n , is the Galerkin projection of S t onto U n . It has the matrix representation ` S tn ˘ i,j “ µ p A j q ż A j S t χ A i dµ Q “ µ p A j q ż Γ p A i q f Ω p q, p q p ` t, p q, p q , Γ p A j q ˘ d p q, p q , with p p¨ , ¨ , ¨q the stochastic transition function of Langevin dynamics (6).This integral can be computed numerically by Monte Carlo quadrature, which involves sam-pling Γ p A i q , numerically integrating the Langevin equations for time t , and counting transitionsto Γ p A j q . To accurately test the approximation quality of R t and E t to S t , we ﬁrst analyze a simple one-dimensional Langevin system on the unit circle, which can be discretized to high resolution. Ithas the periodic potential V p q q “ ` p πq q ` p πq q ´ cos p πq q . . . . . q V p q q . . . . t q Figure 1: The two wells of the periodic double well potential indicate two metastable regionsin conﬁguration space. A trajectory of the according Langevin dynamics with appropriatetemperature shows the characteristic jumping pattern between wells. Typical choices are (uniform) hyperrectangles or Voronoi cells.

17e consider the system at inverse temperature β “ γ “ S t with a large number of discretizationboxes N and sampling points M , denoted by S tN . Spectra and eigenvectors of S tN then serveas a reference point for the error analysis. A resolution of N “ and M “ samplingpoints produce suﬃciently accurate spectral data, as a further increase does not alter the resultsconsiderably.As our goal is metastability analysis, we analyze the error in the portion of the spectrum andthe eigenvectors that are necessary to identify almost invariant sets. Table 2 shows a distinct spectral gap after the second eigenvalue, so analysing λ , λ and v , v should reveal the principalmetastable sets. EV t “ . t “ S tN for diﬀerent lag times t . Note the spectral gap after λ .For the discretized approximative operators G ,n , R tn and E tn , we use n “

33 approximationfunctions of form (22) and the same number of collocation points. The fourier modes haveinherent periodic boundary conditions.

Eigenvalue comparison

The absolute error in the relevant eigenvalues can be measured by ε R p t q : “ ˇˇ λ p S tN q ´ λ p R tn q ˇˇ ` ˇˇ λ p S tN q ´ λ p R tn q ˇˇ ε E p t q : “ ˇˇ λ p S tN q ´ λ p E tn q ˇˇ ` ˇˇ λ p S tN q ´ λ p E tn q ˇˇ , where λ k p S tN q , λ k p R tn q , λ k p E tn q is the k -the eigenvalue of S N , R n , E n , respectively.10 ´ ´ ´ ´ tε R p t q ε E p t q t Figure 3: Eigenvalue errors for the Taylor and Exponential approximation for small t . ε R p t q is consistent with the estimated convergence rate O p t q . ε E p t q even (visually) exceeds thepredicted convergence rate of O p t q .In Figure 4 the 8 largest eigenvalues for increasing lag times are shown. We see that for small t , adecent approximation of the eigenvalues of S tN can be expected from both R tn and E tn . However,for bigger t , the third-order Taylor approximation R t becomes worse, while E t at least shows theproper qualitative behavior. Added for comparison, the Smoluchowski spectrum, representinga completely diﬀerent dynamics, does not resemble the spectrum of S tN .18 0 . . S tN t . . R tn t . . E tn t . . P t Smol ,N t Figure 4: The solid lines show the course of the 8 biggest eigenvalues of the various discretizedoperators. For small lag times, both the spectra of R t and E t show the right asymptotics, whilefor t Ñ 8 , E t at least has the qualitative behavior. The spectrum of the Smoluchowski transferoperator compares poorly to that of S tN . Eigenvector comparison

Before comparing to the approximated operators, observe thatthe eigenvectors of S t (and S tN ) are time-dependent. This is contrastive to any semi-grouptransfer operator, whose eigenvectors coincide with those of its inﬁnitesimal generator for alltimes. While the ﬁrst eigenvector remains constant v ”

1, the second eigenvector starts outas almost a step function, but gets more concentrated in the potential wells for increased lagtimes (Figure 5).The eigenvectors of R t and E t , however, are time-invariant, and coincide with those of G , byconstruction. For small lag times, the second eigenvector w of G ,n (and thus R tn and E tn )compares well to the second eigenvector v of S tN (see Figure 3). For larger t , when the v becomes more and more concentrated at the potential wells, the diﬀerence is more noticable.19 0 . v S tN , unweighted t “ . t “ . t “ . w G ,n , unweighted.0 0 . q v f Q S tN ,weighted with f Q t “ . t “ . t “ . q w f Q G ,n weighted with f Q .Figure 5: Visual comparison of the second largest eigenvectors of S tN for short and intermediatelag times t (left column), and of G ,n (right column). For S tN , we see strong time-dependencyin the unweighted case.As we consider all spatial operators on L µ Q p Q q , their physical interpretation is to transportdensities with respect to f Q (see also the next paragraph). It is therefore appropriate to weighttheir eigenvectors with f Q , as this gives their representation with respect to the Lebesguemeasure. In the weighted space, the time-dependence of v becomes insigniﬁcant, as can beseen in Figure 5. Consequently, our restoration provides a very good approximation (Figure 5).The sign structure of w now identiﬁes the pair of metastable sets A “ t w ą u “ p , . q , A “ t w ă u “ p . , q . Transition probabilities

Theorem 3.1 also provides estimates for the transition probabilitiesbetween those sets: the combined “degree of metastability” p p t, A , A q ` p p t, A , A q can bebounded from above and below by functions of the second largest eigenvalue λ p S t q . To verifythis numerically, and to examine the approximation quality of bounds based on the eigenvalue λ p E t q , we ﬁrst estimate p p t, A i , A i q by Monte Carlo integration:1. Sample two sets of starting points, with density χ A f Q and χ A f Q .2. Integrate those samples numerically for some ﬁxed time t under the Langevin dynamics. The approximation of the bounds based on R t is omitted here, as the only diﬀerence to E t is the (alreadydemonstrated) rate of decay for t Ñ 8 .

20. Count the portion of points that remained in A and A , respectively.For suﬃciently many samples, this provides an accurate estimate for p p t, A , A q and p p t, A , A q .Figure 7 (left) conﬁrms the bounds based on λ p S t q . They do, however, provide a relativelylarge margin of error, and diverge for increasing lag time . Figure 7 (right) now shows that forsmall lag times , the bounds based on λ p E t q also contain p p t, A , A q ` p p t, A , A q , and thusare an accurate approximation to the bounds based on λ p S t q in this time region.0 0.25 0.50 q χ A f Q A q χ A f Q A Figure 6: The invariant sets with the corresponding portion of the canonical density.0 0.25 0.511 . S tN t ` λ ` ρ λ ` cp p t, A , A q ` p p t, A , A q . E tn t ` λ ` ρ λ ` cp p t, A , A q ` p p t, A , A q Figure 7: Metastability of the partition A , A and comparison to the bounds of Theorem 3.1,calculated based on the second largest eigenvalue of S t (left) and E t (right).A recent continuation of the pseudo generator theory [3] shows that it may be possible toextend the time scales for which meaningful information can be extracted from the reconstructedoperator, i.e. for which λ p E t q resembles λ p S t q . It has been discussed that S t exhibits a kind of”almost-Markovianity”, which results in a basically exponential decay of λ p S t q for high enoughdamping γ and large enough t . Transferring this decay rate to λ p E t q , it has been demonstratedthat λ p S k ¨ τ q « ` λ p E τ q ˘ k , for the right choice of τ and k large enough. The discretization and restoration method performs similarly on higher-dimensional domains.Using grid-based spectral collocation, we can reconstruct the spatial transfer operator for a It can be argued that weighting the initial distribution of points by | w | gives a “more physical” portion ofthe canonical density, i.e. χ A i | w | f Q . The corresponding degree of metastability does, in fact, coincide muchbetter with the upper bound. V p q , q q “ ` p πq q ` p πq q ´ cos p πq q` ` p πq q ` p πq q ´ cos p πq q ` cos p πq ´ π q . (25)Figure 8: The four well potential in the region r , s .This potential is of interest, as the four local minima of diﬀerent depth form multiple hierarchiesof metastable sets. A similar (albeit non-periodic) potential was considered in [10]. Again, thepotential is periodic on T , so for discretization we use (products of) Fourier modes. As forthis example we do not perform rigorous error analysis, a resolution of 33 basis functions andcollocation points per dimension is suﬃcient for both the Ulam and collocation discretizations,resulting in a total of 961 basis functions. We use heat and damping parameters β “ , γ “ t “ . t “ S tn for short and intermediate lag times t . Note thespectral gap after λ .For the signiﬁcant eigenvectors v , v , v of S tn and their approximations w , w , w via G ,n , weobserve a similar behavior as in the one-dimensional case: For longer lag times, the relevanteigenvectors get more and more concentrated in the regions of the potential wells. Again, thesign structure is largely identical (Figure 10). Transition probabilities and hierarchies of metastability

The sign structure of thethree isolated eigenvectors partitions Q into three pairs of metastable sets, each with a diﬀerent“degree of metastability” (Figure 11). The portion of f Q resembling the respective metastablesets now is f A i “ χ A i f Q , f B i “ χ B i f Q , f C i “ χ C i f Q , i “ , . of G ,n . . . . . . . . q v of G ,n . . . . . . . . v of G ,n . . . . . . . . ´ ´ . ´ ´ . . . v of S tn , t “ .

10 0 . . . . . . . . q v of S tn , t “ .

10 0 . . . . . . . . v of S tn , t “ .

10 0 . . . . . . . . v of S tn , t “

10 0 . . . . . . . . q q v of S tn , t “

10 0 . . . . . . . . q v of S tn , t “

10 0 . . . . . . . . q Figure 10: The three most signiﬁcant eigenvectors of G ,n (top row) and S tn for diﬀerent lagtimes (second and third row), unweighted.Again sampling these densities, integrating and counting the points remaining in the respectivesets shows the connection to the dominant eigenvalues of S t as of Theorem 3.1. Again, for shorttimes, the bounds based on the spectrum of E tn captures the decay of metastability quite well(Figure 12). 23 0 . . . . . . . . q q A (blue), A (red) 0 0 . . . . . . . . q B (blue), B (red) 0 0 . . . . . . . . q C (blue), C (red)Figure 11: The three tiers of invariant sets, identiﬁed via G ,n . The artifacts on the left borderof C , C can be attributed to the ill-conditioned sign structure analysis.0 0.25 0.511 . S tN ` λ ` ρ λ ` cp p t, A , A q ` p p t, A , A q . S tN ` λ ` ρ λ ` cp p t, B , B q ` p p t, B , B q . S tN ` λ ` ρ λ ` cp p t, C , C q ` p p t, C , C q . E tN t ` λ ` ρ λ ` cp p t, A , A q ` p p t, A , A q . E tN t ` λ ` ρ λ ` cp p t, B , B q ` p p t, B , B q . E tN t ` λ ` ρ λ ` cp p t, C , C q ` p p t, C , C q Figure 12: Combined metastability of the identiﬁed sets and comparison to the bounds ofTheorem 3.1 for both S t and E t . We have considered the dynamics of the position coordinate for a molecular dynamics sys-tem given by the Langevin process in thermal equilibrium. Following the aim of an eﬃcient,trajectory-free evaluation of the associated spatial transfer operator , we have found that thespatial dynamics behaves up to third order in time as a t ÞÑ t scaled Smoluchowski dynamics(for t Ñ ‚ Corollary 4.11 tells us that a time-scaled Smoluchowski dynamics approximates the spatialdynamics of a Langevin process up to third order in time – independently of the Langevindamping coeﬃcient γ . Of course, a quantitative estimate would depend on the damping, andit is important to understand how the approximation quality behaves for varying damping andlarger lag times. ‚ It is shown in Appendix B that the spatial dynamics is ergodic, which suggests an exponentialdecay of the eigenvalues of the spatial transfer operator as t Ñ 8 . Incorporating this qualitativeproperty into the structure of the reconstruction (i.e. the approximation of the spatial transferoperator from the pseudo generators) is key to be able to leap to larger lag times and have thetime scales of the original system approximated properly. ‚ The potential, governing the self-dynamics of a molecule, can be formulated in terms of internalcoordinates of the molecule: bond lengths, bond angles, and dihedral angles. Describing thedynamics of the system in these coordinates maintains its essential dynamical properties, butreduces its dimension. We shall examine how the pseudo generator approach can be transferredinto these coordinates, and whether the dimensional reduction pays oﬀ against the imposedadditional diﬃculties (e.g. the mass matrix, which is a constant diagonal matrix in Cartesiancoordinates—taken to be the identity in (2) and (3)—, is in general a position-dependent fullmatrix in the internal coordinates [17]). ‚ The internal coordinates above often play an even more prominent role: in many cases, thedominant conformational transitions can be described by only a few of them; e.g. the α é β transition for alanin dipeptide occurs in the φ { ψ coordinate plane. Evidently, if a faithfulprojection of the full Langevin dynamics to these essential coordinates can be carried out, thisyields a massive dimensionality reduction. Certainly, however, through this projection (which isadditional to the already imposed averaging over the momenta) dynamical information is lost,and we expect the approximation quality by the associated pseudo generators to deteriorate. ‚ Especially in the latter case, the use of higher order pseudo generators could be advantageous.It shall be investigated how to incorporate them into the reconstruction, and assessed whetherthe increase in accuracy due to increased approximation order is worth the eﬀort to computehigher order derivatives of the potential—as they will probably enter the expressions. However,we have seen in Proposition 4.4 that G “ ´ γG , hence no additional potential evaluationsare needed. Whether this is a generic pattern for pseudo generators or a singular ﬂuke will beexamined in future studies. ‚ Cases will appear where despite all dimension and model reduction techniques we have todeal with the discretization of operators on function spaces over medium- to high-dimensionalspatial domains, which are subject to the curse of dimension. Following [48], we intend to usemeshfree approximation methods where the nodes are distributed according to prior knowledgegained from trajectory data. The faith in this approach resides in recent results of transitionpath theory [13, 14, 46], which shows that the vast majority of conformational transitions occuralong a few dominant, low dimensional transition pathways.

Acknowledgments

The authors would like to thank Carsten Hartmann for helpful discussions.25

Derivation of pseudo generators by vector calculus

Proof of Proposition 4.4.

By Lemma 4.3, we obtain G n by calculating L n Lan , n

P t , , u andapplying the momentum integral afterwards. For readability, we use the shorthand L insteadof L Lan for the remainder of this proof. Let u P L µ Q p Q q X C p Q q .1. L is the diﬀerential operator of the Fokker–Planck equation (10), and therefore L ` u p q q f Ω p q, p q ˘ “ ˆ γβ ∆ p ´ p ¨ ∇ q ` ∇ q V ¨ ∇ p ` γp ¨ ∇ p ` dγ ˙ ` u p q q f Ω p q, p q ˘ “ γβ u p q q ∆ p f Ω p q, p q´ p ¨ ` ∇ q u p q q f p q, p q ` u p q q ∇ q f Ω p q, p q ˘ ` ∇ q V p q q ¨ ` u p q q ∇ p f Ω p q, p q ˘ ` γu p q q p ¨ ∇ p f Ω p q, p q` dγu p q q f Ω p q, p q . Setting f Ω p q, p q “ Z e ´ β p p ¨ p ` V p q q q , this becomes “ Z e ´ β p p ¨ p ` V p q q q ” γβ p β p ¨ p ´ dβ q u p q q ´ p p ¨ ∇ q u p q q ´ βp ¨ ∇ q V p q q u p q qq´ βp ¨ ∇ q V p q q u p q q ´ βγp ¨ pu p q q ` dγu p q q ı “ ´ Z e ´ β p p ¨ p ` V p q q q p ¨ ∇ q u p q q . Applying the integral (and normalizing) in (14), we get G u p q q “ f Q p q q ż P B t T t ˇˇˇ t “ ` u p q q f Ω p q, p q ˘ dp “ f Q p q q ż P ´ Z e ´ β p p ¨ p ` V p q q q p ¨ ∇ q u p q q dp “ ´ f Q p q q Z e ´ βV p q q „ż P e ´ β p ¨ p p p ¨ ∇ q u p q qq dp  “ ´ f Q p q q Z e ´ βV p q q d ÿ k “ „ B q k u p q q ż P p k e ´ β p ¨ p dp  . With ˜ p k “ p p , . . . , p k ´ , p k ` , . . . , p d q (cid:124) , P k “ R and ˜ P k “ R d ´ this becomes “ ´ f Q p q q Z e ´ βV p q q d ÿ k “ „ B q k u p q q ż ˜ P k e ´ β ˜ pk ¨ ˜ pk d ˜ p k ż P k e ´ β p k p k dp k loooooooomoooooooon “  . The last integral is 0 due to symmetry, and so G u p q q “ , @ u, q.

2. Having already derived (see above) L ` u p q q f Ω p q, p q ˘ “ ´ Z e ´ β p p ¨ p ` V p q q q p ¨ ∇ q u p q q ,

26e apply L a second time: L ` u p q q f Ω p q, p q ˘ “ L ˆ ´ Z e ´ β p p ¨ p ` V p q q q p ¨ ∇ q u p q q ˙ “ ˆ γβ ∆ p ´ p ¨ ∇ q ` ∇ q V ¨ ∇ p ` γp ¨ ∇ p ` dγ ˙ ˆ ´ Z e ´ β p p ¨ p ` V p q q q p ¨ ∇ q u p q q ˙ “ γβ Z e ´ β p p ¨ p ` V p q q q γβ ” p ¨ ∇ q u p q qp´ β p d ` q ` β p ¨ p q´ ` ´ βp ¨ ∇ q V p q q p ¨ ∇ u p q q ´ βp ¨ H q u p q q ¨ p ˘ ` ` ∇ q V p q q ¨ ∇ q u p q q ´ β p ∇ q V p q q ¨ p qp p ¨ ∇ q u p q qq ˘ ` γ ` p ¨ ∇ q u p q q ´ β p p ¨ p qp p ¨ ∇ q u p q qq ˘ ` dγp ¨ ∇ q u p q q ı “ Z e ´ β p p ¨ p ` V p q q q ` γp ¨ ∇ q u p q q ` p ¨ H q u p q q ¨ p ´ ∇ q V p q q ¨ ∇ q u p q q ˘ , with H q u the Hessian of u . Applying the integral in (14) and using the deﬁntion of f Q and Z gives G u p q q “ ´ż P e ´ β p ¨ p dp ¯ ´ ż P e ´ β p ¨ p ` γp ¨ ∇ q u p q q ` βp ¨ H q u p q q ¨ p ´ ∇ q V p q q ¨ ∇ q u p q q ˘ dp “ ´ β π ¯ d ” ż P γe ´ β p ¨ p p ¨ ∇ q u p q q dp loooooooooooooomoooooooooooooon “ ` ż P e ´ β p ¨ p p ¨ H q u p q q ¨ p dp looooooooooooooomooooooooooooooon “ β ´ πβ ¯ d ∆ q u p q q ´ ż P e ´ β p ¨ p ∇ q V p q q ¨ ∇ q u p q q dp loooooooooooooooooomoooooooooooooooooon “ ´ πβ ¯ ∇ q u p q q¨ ∇ q V p q q ı The ﬁrst integral is 0 due to symmetry. Expanding the second integral, all but the “diagonal”summands are 0 due to symmetry, so only ∆ q remains as integral operator. So this ﬁnallybecomes “ β ∆ q u p q q ´ ∇ q u p q q ¨ ∇ q V p q q .

3. We apply L a third time: L ` u p q q f Ω p q, p q ˘ “ L ´ L ` u p q q f Ω p q, p q ˘¯ , (26)with L ` u p q q f Ω p q, p q ˘ “ f Ω p q, p q ` γp ¨ ∇ q u p q q ` p ¨ H q u p q q ¨ p ´ ∇ q V p q q ¨ ∇ q u p q q ˘ . (27)After simplifying, with partial help of a computer algebra system, this becomes L ` u p q q f Ω p q, p q ˘ “ β f Ω p q, p q ¨ “ p ¨ H q u p q q ¨ ∇ q V p q q ` βp ¨ H q V p q q ¨ ∇ q u p q q` γ ∆ q u p q q ` βγ ∇ q u p q q ¨ ∇ q V p q q´ βγ p ¨ ∇ q u p q q ´ βγp ¨ H q u p q q ¨ p ´ βp ¨ ∇ q p p ¨ H q u p q q ¨ p q ‰ .

27e now take the integral over p . First note, that ż P f Ω p q, p q p ¨ H q u p q q ¨ ∇ q V p q q dp “ ż P f Ω p q, p q p ¨ H q V p q q ¨ ∇ q u p q q dp “ ż P f Ω p q, p q p ¨ ∇ q u p q q dp “ ż P f Ω p q, p q p ¨ ∇ q p p ¨ H q u p q, p q ¨ p q dp “ f Q p q q ż P L ` u p q q f Ω p q, p q ˘ dp ““ ´ β ż P e ´ β p ¨ p dp ¯ ´ ż P e ´ β p ¨ p “ γ ∆ q u p q q ` βγ ∇ q u p q q ¨ ∇ q V p q q ´ βγp ¨ H q u p q q ¨ p ‰ dp “ β ´ πβ ¯ ´ d { ” γ ∆ q u p q q ´ πβ ¯ d { ` βγ ∇ q u p q q ¨ ∇ q V p q q ´ πβ ¯ d { ´ γ ∆ q u p q q ´ πβ ¯ d { ı “ ´ γβ ∆ q u p q q ` γ ∇ q u ¨ ∇ q V p q q“ ´ γ ´ β ∆ q u p q q ´ ∇ q u p q q ¨ ∇ q V p q q ¯ . Proof of Lemma 4.10. E t u is N ` t . Thus we can apply the Taylorexpansion for Banach space valued functions to E t (see [49, Section 4.5]): E t u “ N ÿ n “ t n n ! pB ns E s ˇˇ s “ q u ` ´ ż p N q ! p ´ s q N B N ` s E st u ds ¯ t N ` . The n -th derivative of E s is B ns E s “ E s t n u ÿ k “ n ! s n ´ k k k ! p n ´ k q ! G n ´ k . For future reference, we deﬁne the operator A n : “ t n u ÿ k “ n ! s n ´ k k k ! p n ´ k q ! G n ´ k .Evaluation at s “ B ns E s ˇˇ s “ “ $&% , n odd n !2 n p n q ! G n , n even , and so N ÿ n “ t n n ! pB ns E s ˇˇ s “ q u “ N ÿ n “ t n p n q ! ´ p n q !2 n n ! G n ¯ u “ N ÿ n “ t n n n ! G n u. ›››› E t u ´ N ÿ n “ ` t G ˘ n n ! u ›››› ,µ Q “ ››› t N ` ż p N q ! p ´ s q N B N ` E st u ds ››› ,µ Q ď t N ` p N q ! sup s Pr , s ›› B N ` E s u ›› ,µ Q “ t N ` p N q ! sup s Pr , s ›› p E s A N ` q u ›› ,µ Q ď t N ` p N q ! sup s Pr , s ›› E s ›› ,µ Q looomooon ď } A N ` u } ,µ Q . Due to the choice of u , } A N ` u } ,µ Q ă 8 . This completes the proof. B Spectral properties of the spatial transfer operator

To extract quantitative metastability properties of the spatial dynamics from its transfer op-erator, we will use the results of Huisinga [22]. More precisely, we will prove that the spatialtransfer operator S t we consider satisﬁes the conditions under which Theorem 3.1 above isvalid. Recall that we model molecular dynamics by the Langevin equation on the canonicalstate space with position coordinates q and momenta p , such that the unique invariant densityof the system is the canonical density f Ω p q, p q “ f Q p q q f P p p q with f Q p q q9 exp p´ βV p q qq and f P p p q9 exp p´ p ¨ p q being called the spatial and momentum distributions , respectively.Opposed to the density-based statistical description of the dynamics used in the main text, wewill work with a slightly greater generality here. To this end let p tLan denote the stochastic tran-sition function of the Langevin process, i.e. if p q t , p t q is a Langevin process with deterministicinitial condition q “ q and p “ p (almost surely), then p tLan pp q, p q , A q “ Prob pp q t , p t q P A q , @ measurable A Ă Q ˆ P . Now we can express spatial ﬂuctuations in the canonical density even for initial distributionsnot absolutely continuous to f Q , such as the Dirac measure δ q ˚ centered in some q ˚ P Q . Inaccordance with (14) we have for a measurable A Ă Q that p tS p q ˚ , A q : “ S t δ q ˚ p A q “ f Q p q ˚ q ż P f Ω p q ˚ , p q p tLan ` p q ˚ , p q , A ˆ P ˘ dp . (28)Below we will consider transition probabilities of the spatial dynamics (with lag time t ) betweentwo measurable subsets A, B Ă Q of the conﬁguration space, supposed that the initial conditionis distributed with respect to the invariant density f Q . Let us denote these probabilities by p tS p A, B q . We have that p tS p A, B q “ ş A f Q p q q dq ż A f Q p q q p tS p q, B q dq “ ş A f Q p q q dq ż A ˆ P f Q p q q f P p p q p tLan ` p q, p q , B ˆ P ˘ d p q, p q . Reversibility of the spatial dynamics

First, we will show that S t is self-adjoint in theweighted space L µ Q p Q q , hence its spectrum is purely real. Self-adjointness of the transferoperator is equivalent with reversibility of the corresponding process: the following result isfrom [22, Proposition 1.1], re-stated for our purposes.29 roposition B.1. Fix t ą . Let S t : L µ Q p Q q Ă L µ Q p Q q Ñ L µ Q p Q q denote the transferoperator of the spatial dynamics for lag time t . Let the associated (discrete time) Markov processbe denoted by q n , n P N . Then S t is self-adjoint with respect to the scalar product x¨ , ¨y ,µ Q , i.e. x S t u, v y ,µ Q “ x u, S t v y ,µ Q for all u, v P L µ Q p Q q , if and only if q n is reversible. Reversibility in this case is equivalent with p tS p A, B q “ p tS p B, A q for any measurable A, B Ă Q .Indeed, one way to deﬁne the reversed process is by setting p tS,rev p A, B q : “ p tS p B, A q for anymeasurable A, B Ă Q . In order to show reversibility let us start with a property of the Langevinprocess. Lemma B.2.

Let p tLan,rev denote the transition function of the reversed Langevin process, andlet A Ă Q ˆ P be a measurable set which is symmetric in the momentum coordinate, i.e. A “ (cid:32) p q, ´ p q ˇˇ p q, p q P A ( . Then p tLan ` p q, p q , A ˘ “ p tLan,rev ` p q, ´ p q , A ˘ for any q P Q , p P P .Proof. Recall the Langevin SDE (3): B q t “ p t B p t “ ´ ∇ V p q t q ´ γ p t ` σ w t . The reversed Langevin process is also an Itˆo diﬀusion [19] governed by the SDE B q t “ ´ p t B p t “ ∇ V p q t q ´ γ p t ` σ w t . Applying the substitution ˜ p “ ´ p for the Langevin equations in forward time, and using thefact that w t and ´ w t are stochastically equivalent in the sense that their distributions coincide,we obtain B q t “ ´ ˜ p t B ˜ p t “ ∇ V p q t q ´ γ ˜ p t ` σ w t . Note that this is the same SDE as for the reversed process. Thus, the reversed process startingat p q, ´ p q has the same distribution as p q t , ´ p t q , where p q t , p t q is the forward time processstarting at p q, p q .To show reversibility of the spatial dynamics, we rewrite p tS p A, B q in equivalent terms. Let usalso introduce the shorthand notation F A “ µ Q p A q “ ş A f Q p q q dq . p tS p A, B q “ F ´ A ż A ˆ P f Q p q q f P p p q p tLan ` p q, p q , B ˆ P ˘ d p q, p q“ p´ q d F ´ A ż A ˆ´ P f Q p q q f P p´ ˜ p q p tLan ` p q, ´ ˜ p ˘ , B ˆ P q d p q, ˜ p q“ F ´ A ż A ˆ P f Q p q q f P p´ ˜ p q p tLan ` p q, ´ ˜ p q , B ˆ P ˘ d p q, ˜ p q , where we used the integral substitution ˜ p “ ´ p , then the symmetry of P , such that ﬂipping theintegration bounds only introduces change of sign. From this and Lemma B.2 we obtain p tS p A, B q “ F ´ A ż A ˆ P f Q p q q f P p ˜ p q p tLan,rev ` p q, ˜ p q , B ˆ P ˘ d p q, ˜ p q , (29) The property described in Lemma B.2 is also known as extended detailed balance condition , see [44,Lemma 4.10].

30y exploiting that f P p´ ˜ p q “ f P p ˜ p q . In the next lemma we establish that the right hand sideof (29) in fact expresses the transition probability from A to B for the reversed spatial process p tS,rev , and hence p tS p A, B q “ p tS,rev p A, B q “ p tS p B, A q . This concludes the proof of reversibilityfor the spatial process. Lemma B.3.

It holds p tS,rev p A, B q “ F ´ A ş A ˆ P f Q p q q f P p p q p tLan,rev ` p q, p q , B ˆ P ˘ d p q, p q for anymeasurable A, B Ă Q .Proof. The transition probability for the reversed system can be obtained from Bayes formula: p tS,rev p A, B q “ p tS p B, A q ş B f Q p q q dq ş A f Q p q q dq “ F ´ A ż B f Q p q q p tS p q, A q dq “ F ´ A ż q P B ż ˜ q P A f Q p q q p tS p q, d ˜ q q dq “ . . . where, by writing ş q P B we would like to indicate which variable is integrated over which set, tomaintain a good readability. With (28) we can expand the term on the right hand side further: . . . “ F ´ A ż q P B ż p P P ż ˜ q P A ż ˜ p P P f Q p q q f P p p q p tLan ` p q, p q , d p ˜ q, ˜ p q ˘ dpdq looooooooooooooooooooooomooooooooooooooooooooooon def “ f Q p ˜ q q f P p ˜ p q p tLan,rev ` p ˜ q, ˜ p q ,d p q,p q ˘ d ˜ qd ˜ p “ . . . where, for the underbraced term, we use the inﬁnitesimal version of Bayes formula to relate thetransition functions of the forward and backward time Langevin processes. Rearranging theintegration order yields the claim . . . “ F ´ A ż ˜ q P A ż ˜ p P P f Q p ˜ q q f P p ˜ p q p tLan,rev ` p ˜ q, ˜ p q , B ˆ P ˘ d ˜ pd ˜ q . Geometric ergodicity of the spatial dynamicsDeﬁnition B.4.

Let x t , where t denotes either discrete or continuous time, be a Markov processwith transition function p t and unique invariant measure µ . Then x t is called geometricallyergodic if for every state x P X and time t } p t p x, ¨q ´ µ } TV ď M p x q ρ t holds for some M P L µ and ρ ă

1. Here, } ¨ } TV denotes the total variation norm for signedmeasures.Following [22, Proposition 6.3] (see also [30] and [31]) we can establish the geometric ergodicityof the Langevin process. Here and in the following µ Ω denotes the canonical measure of theLangevin process, i.e. dµ Ω “ f Ω dm . Proposition B.5.

Let x t denote the Langevin process. Fix some lag time t ą , and let z n : “ x nt be the sampled time process. In either of the following cases, z n has the uniqueinvariant measure µ Ω and is geometrically ergodic. i) The state space X Ă R d is periodic and the potential V : X Ñ R is smooth.(ii) The state space X “ R d , the potential V : X Ñ R ě is smooth and V p x q is growing atinﬁnity as } x } l for some positive integer l . We assume from now on that either condition (i) or (ii) of Proposition B.5 is satisﬁed. Thismeans, in particular, that for a ﬁxed t ą M P L µ Ω p Q ˆ P q and a constant ρ ă q P Q and p P P holds } p ntLan pp q, p q , ¨q ´ µ Ω } TV ď M p q, p q ρ n . (30)To show geometric ergodicty of the spatial dynamics, recall its transition function from (28), andthat µ Q given by dµ Q p q q : “ f Q p q q dq is its unique invariant measure. Note that by construction µ Q “ µ Ω p¨ ˆ P q . We have ›› p ntS pp q, p q , ¨q ´ µ Q ›› TV “ ››› ż P f Ω p q,p q f Q p q q p ntLan ` p q, p q , ¨ ˆ P ˘ dp ´ µ Ω p¨ ˆ P q ››› TV “ ››› ż P f Ω p q,p q f Q p q q “ p ntLan ` p q, p q , ¨ ˆ P ˘ ´ µ Ω p¨ ˆ P q ‰ dp ››› TV ď ż P f Ω p q,p q f Q p q q ›› p ntLan ` p q, p q , ¨ ˆ P ˘ ´ µ Ω p¨ ˆ P q ›› TV dp ď ρ n ż P f Ω p q,p q f Q p q q M p q, p q dp looooooooooomooooooooooon “ : ˜ M p q q , where the second line follows from ş P f Ω p q,p q f Q p q q dp “

1, the third comes from pulling the norminside the integral, while the last inequality is a consequence of (30), and of the total variationnorm being deﬁned by a supremum over all possible partitions of state space (more precisely:restricting the partitions to sets of the form ¨ ˆ P results in a total variation not greater thanin the non-restricted case). M P L µ Ω p Q ˆ P q implies ˜ M P L µ Q p Q q . Thus, the spatial process isgeometrically ergodic. The spectrum of the spatial transfer operator

Dynamical properties of the transitionfunction p tS —namely reversibility and geometric ergodicity—imply some desirable spectral prop-erties of the associated transfer operator S t : L µ Q p Q q Ñ L µ Q p Q q .Consider the two properties of some transfer operator P :(P1) The essential spectral radius of P is less than one, i.e. r ess p P q ă λ “ P is simple and dominant.[22, Theorem 4.31] states: Theorem B.6.

Let P : L µ Ñ L µ be a transfer operator associated with the reversible stochastictransition function p . Then P satisﬁes properties (P1) and (P2) in L µ , if and only if p is µ -irreducible and ( µ -a.e.) geometrically ergodic. The latter two conditions on p are satisﬁed, inparticular, if p is geometrically ergodic. Thus, we have shown

Corollary B.7.

If the potential V satisﬁes either conditions in Proposition B.5, then the spatialtransfer operator S t : L µ Q p Q q Ñ L µ Q p Q q is self-adjoint and has the properties (P1) and (P2).These are exactly the conditions under which Theorem 3.1 holds. As shown by Baxter and Rosenthal [2, Corollary to Lemma 1], the transfer operator associated with atransition function having the invariant measure µ is a well-deﬁned contraction on every L rµ p Q q , 1 ď r ď 8 . Seealso Corollary 2.1 in this manuscript. Smoothness of eigenfunctions

We are going to show in this section that if Assumption 5.1 holds for the potential, then theeigenfunctions of the(i) spatial transfer operator S t associated with the Langevin process (3), and of the(ii) generator G of the Smoluchowski process (5)are smooth, i.e. C functions. Qualitative results of this kind go back to H¨ormander [20], wherehypoelliptic diﬀusions have been considered. Indeed, what is known as H¨ormander’s condition ,is equivalent to W (cid:96) p x q ą (cid:96) ě

1, the function W (cid:96) being deﬁned below. Since thespatial transfer operator involves an averaging over the momenta, we will require quantitativeestimates on the smoothness (cf. (32)) of transition density functions in order to carry over thesmoothness to the spatial transition density function, and to the eigenfunctions of S t . For thiswe will use the results based on Malliavin calculus. The general theory

Consider the stochastic diﬀerential equation B t x t “ b p x t q ` m ÿ i “ b i p x t q w it , (31)where the b , b , . . . , b m are vector ﬁelds over R d , and the w it are independent standard one-dimensional “white noise” processes. Note that both the Langevin and the Smoluchowski dif-ferential equations we consider here are special cases of this, e.g. the latter with m “ d and b i ” const ¨ e i , i “ , , . . . , d , with a suitable constant, and e i P R d being the canonical unitvectors.For two diﬀerentiable vector ﬁelds v , v : R d Ñ R d deﬁne their Lie bracket by r v , v sp x q “ Dv p x q v p x q ´ Dv p x q v p x q , where Dv is the derivative of v with entries p Dv p x qq ij “ B v i B x j p x q .Further, for a multi-index α “ p α , . . . , α k q P t , , . . . , m u k Y tHu deﬁne b αi , 1 ď i ď m , byinduction: b H i “ b i , 1 ď i ď m , and b p α,j q i “ r b j , b αi s , 0 ď j ď m , where p α, j q “ p α , . . . , α k , j q .Also, let pH , j q “ j . For a multi-index α deﬁne | α | “ " , if α “ H (cid:96), if α P t , . . . , m u (cid:96) } α } “ " , if α “ H| α | ` card t j | α j “ u , if | α | ě (cid:96) P N and x, η P R d let W (cid:96) p x, η q “ m ÿ i “ ÿ } α }ď (cid:96) ´ p η ¨ b αi p x qq ,W (cid:96) p x q “ ^ inf } η } “ W (cid:96) p x, η q : “ min t , inf } η } “ W (cid:96) p x, η qu , where the dot denotes the usual scalar product and } ¨ } the Euclidean norm. Kusuoka andStroock prove the following result; see also in [1]. Theorem C.1 ([28, Corollary 3.25]) . Let b , . . . , b m be C with all derivatives of order greateror equal one be bounded, and ﬁx some (cid:96) P N . Then for every x P A (cid:96) : “ t x P R d | W (cid:96) p x q ą u , t ą , the stochastic transition function p t of (31) has a smooth (i.e. C in every variable) ransition density function k with respect to the Lebesgue measure, i.e. p t p x, dy q “ k p t, x, y q dy .Moreover, for n, n x , n y P N ě , multi-indices α P t , , . . . , n x u d , β P t , , . . . , n y u d , and some T ą , there are constants c , c , r ą , independent of x and y , such that ˇˇ B nt B αx B βy k p t, x, y q ˇˇ ď c t r W (cid:96) p x q exp ˆ ´ c } x ´ y } t ˙ @ x P A (cid:96) , y P R d , ă t ď T. (32) Smoothness of the Smoluchowski eigenfunctions

Consider the Smoluchowski equation B t q t “ ´ ∇ V p q t q ` σ w t , with σ ą

0. Note that if V satisﬁes the conditions of Assumption 5.1, then the SmoluchowskiSDE automatically satisﬁes the ﬁrst condition of Theorem C.1.We have that b H i ” σe i , the i th column of the d ˆ d matrix σI (here, I denotes the identitymatrix). Hence, with B “ σI , W p q q “ ^ inf } η } “ d ÿ i “ p η ¨ σe i q “ ^ inf } η } “ } η T B } “ ^ σ ą , @ q P R d . (33)In particular, by Theorem C.1, for all q P R d the stochastic transition function of the aboveSmoluchowski equation has a smooth transition density function k p t, q, ˆ q q with |B αy k p t, q, ˆ q q| ď ˜ c exp p´ ˜ c } q ´ ˆ q } q , @ q, ˆ q P R d , (34)for a ﬁxed t ą

0, with some constants ˜ c , ˜ c independent of q and ˆ q .Let G and P t Smol denote the generator and the transfer operator (with respect to the Lebesguemeasure) of the Smoluchowski process, respectively. Now, by the Spectral Mapping Theorem(Theorem 2.2) Gu “ λu if and only if P t Smol u “ e λt u for any t ě

0, i.e. e λt u p ˆ q q “ ż R d u p q q k p t, q, ˆ q q dq. By Lebesgue’s theorem on diﬀerentiating parameter-dependent integrals, the right hand side ofthis equation is continuously diﬀerentiable (with respect to ˆ q ) because for any i P t , . . . , d u , | u p q qB ˆ q i k p t, q, ˆ q q| ď | u p q q| ˜ c exp ` ´ ˜ c } q ´ ˆ q } ˘ ď ˜ c | u p q q| , which is an integrable function. This argument can be iterated for derivatives of any order,showing that u p q q , the eigenfunction of G , is smooth. Remark C.2.

We have also considered the generator and semigroup with respect to the canon-ical measure with density f Q p q q9 exp p´ βV p q qq . As noted in the discussion preceding Corol-lary 4.5, the densities of the transfer operator with respect to the Lebesgue and the canonicalmeasure diﬀer only up to a factor f Q p q q , which is a nowhere zero smooth function. Thus theSmoluchowski eigenfunctions are smooth, no matter with respect to which of these both measureswe consider them. Smoothness of the spatial eigenfunctions

Consider the Langevin equation, as in (3), B t q t “ p t B t p t “ ´ ∇ V p q t q ´ γ p t ` σ w t , with γ, σ ą

0. 34n order to show the smoothness of the (time-dependent) spatial eigenfunctions satisfying S t u t “ λ t u t , our strategy is the same as for the Smoluchowski case. First, we show that W (cid:96) is uniformlybounded away from zero as in (33), then apply Theorem C.1 to imply a bound on the derivativesof the transition density function k , as in (34). From this, using Lebesgue’s theorem, we showthe smoothness of the right hand side of the eigenvalue equation λ t u p ˆ q q “ S t u t p ˆ q q “ f Q p q q ż R d ĳ R d u t p q q f Ω p q, p q k ` t, p q, p q , p ˆ q, ˆ p q ˘ d p q, p q d ˆ p . Apart from bounding W (cid:96) , every step is analogous as in the Smoluchowski case, hence we willomit them.As for W (cid:96) , note that for the Langevin equation b p q, p q “ ˆ p ´ γp ´ ∇ V p q q ˙ , b i p q, p q “ ˆ σe i . ˙ , i P t , . . . , d u . It follows that (see also [31, Theorem 3.2]) b H i ” ˆ σe i . ˙ , b i ” ˆ σe i ´ γσe i ˙ , i P t , . . . , d u . Note that these latter 2 d vectors span R d , hence the 2 d ˆ d matrix B with columns b H i and b i is non-singular. Recalling the deﬁnition of W (cid:96) , we have that W ` p q, p q ˘ ě ^ inf } η } “ } η T B } ą s @ q, p P R d , where s min ą B . Thus, W is uniformly bounded away fromzero, concluding the proof of smoothness of u t for any t ą Proposition C.3.

If the potential satisﬁes Assumption 5.1, then the eigenfunctions of thespatial transfer operator S t are smooth for every t ą . From R d to periodic systems Since in molecular dynamics often periodic angular coordi-nates are used, it is interesting to see whether the smoothness results from above apply also forperiodic systems on a torus.To this end, let X “ T d be the unit torus, which we identify with r , q d Ă R d . We consider anItˆo diﬀusion of the form (1) on X . Let b : T d Ñ R d and Σ : T d Ñ R d ˆ d be smooth functions.Set p b and p Σ to be the periodic extensions of b and Σ to R d , respectively, i.e. p b p x ` r q “ b p x q for all x P r , q d and r P Z d , and analogously for Σ. Assume that b and Σ are such that p b and p Σ satisfy all conditions of Theorem C.1, and that there is an (cid:96) ě W (cid:96) is uniformlybounded away from zero (as it is the case for the Smoluchowski and Langevin dynamics). Thenthe associated SDE has a smooth transition density function p k such that its derivatives of anyﬁxed order show an exponential decay: ˇˇˇ B αy p k p t, x, y ` r q ˇˇˇ “ O ` exp p´ c } r } q ˘ as } r } Ñ 8 , for x, y P r , q d and some c ą x, y . An analogous bound holds for derivativeswith respect to t as well.Note that for a ﬁxed x , the function p t, y q ÞÑ p k p t, x, y q solves the Fokker–Planck equation (9)on R ` ˆ R d . Thus, since this partial diﬀerential equation is linear, and p b and p Σ are periodic,the transition density function k p t, x, y q : “ ÿ r P Z d p k p t, x, y ` r q (35)35ormally solves the Fokker–Plack equation on T d associated with b and Σ. For this statement tobe rigorous, we have to show that k p t, x, y q , deﬁned as here, is continuously diﬀerentiable in t andtwice continuously diﬀerentiable in y . This is achieved by showing the uniform convergence ofthe summand-wise diﬀerentiated sum in (35). Setting ρ “ } r } , observe that there are O p ρ d ´ q many sets of the form r , q d ` r , r P Z d , intersected by a sphere of radius ρ . Hence, we canestimate ÿ r P Z d ˇˇˇ B αy p k p t, x, y ` r q ˇˇˇ ď C ÿ r P Z d exp ` ´ c } r } ˘ ď ˜ C ż ρ d ´ e ´ cρ dρ ă 8 , with some constants c, C, ˜ C ą

0, independent of y . This shows uniform convergence of the sum,and the smoothness of k p t, x, y q in y . An analogous computation can be done for t . Thus, k is smooth and solves the Fokker–Plack equation on T d . Since p k p t, x, ¨q converges (weakly) tothe Dirac distribution centered in x as t Ñ

0, so does k p t, x, ¨q . These last two properties showthat k is the (unique) transition density function associated with the Itˆo diﬀusion on the torus;and we have shown that it is smooth.Note that if the potential V : T d Ñ R is smooth, then p V p x ` r q : “ V p x q , r P Z d , readily satisﬁesAssumption 5.1. Thus we have Proposition C.4.

Let the potential V : T d Ñ R be smooth. Then the eigenfunctions of theassociated spatial transfer operator S t are smooth for every t ą . References [1]

V. Bally and D. Talay , The law of the Euler scheme for stochastic diﬀerential equations:II. convergence rate of the density , Probability theory and related ﬁelds, 104 (1996), pp. 43–60.[2]

J. R. Baxter and J. S. Rosenthal , Rates of convergence for everywhere-positive markovchains , Statistics & probability letters, 22 (1995), pp. 333–338.[3]

A. Bittracher, C. Hartmann, O. Junge, and P. Koltai , Pseudo generators forunder-resolved molecular dynamics , The European Physical Journal ST, forthcoming(2015).[4]

G. R. Bowman, X. Huang, and V. S. Pande , Using generalized ensemble simulationsand markov state models to identify conformational states , Methods, 49 (2009), pp. 197–201.[5]

J. D. Chodera, N. Singhal, V. S. Pande, K. A. Dill, and W. C. Swope , Automaticdiscovery of metastable states for the construction of markov models of macromolecularconformational dynamics , The Journal of chemical physics, 126 (2007), p. 155101.[6]

C. Cramer , Essentials of Computational Chemistry , Wiley, 2004.[7]

M. Dellnitz, G. Froyland, and O. Junge , The algorithms behind GAIO – set ori-ented numerical methods for dynamical systems , in Ergodic theory, analysis, and eﬃcientsimulation of dynamical systems, Springer Berlin Heidelberg, 2001, pp. 145–174.[8] ,

The algorithms behind Gaio-set oriented numerical methods for dynamical systems ,in Ergodic theory, analysis, and eﬃcient simulation of dynamical systems, Springer, 2001,pp. 145–174. 369]

M. Dellnitz and O. Junge , On the approximation of complicated dynamical behaviour ,SIAM J. Num. Anal., 36 (1999).[10]

P. Deuflhard, M. Dellnitz, O. Junge, and C. Sch¨utte , Computation of essen-tial molecular dynamics by subdivision techniques , in Computational molecular dynamics:challenges, methods, ideas, Springer, 1999, pp. 98–115.[11]

P. Deuflhard, W. Huisinga, A. Fischer, and C. Sch¨utte , Identiﬁcation of almostinvariant aggregates in reversible nearly uncoupled markov chains , Linear Algebra and itsApplications, 315 (2000), pp. 39–59.[12]

P. Deuflhard and M. Weber , Robust perron cluster analysis in conformation dynamics ,Linear algebra and its applications, 398 (2005), pp. 161–184.[13]

W. E and E. Vanden-Eijnden , Metastability, conformation dynamics, and transitionpathways in complex systems , in Multiscale modelling and simulation, Springer, 2004,pp. 35–68.[14] ,

Towards a theory of transition paths , Journal of statistical physics, 123 (2006),pp. 503–523.[15]

R. Elber and M. Karplus , Multiple conformational states of proteins: a moleculardynamics analysis of myoglobin , Science, 235 (1987), pp. 318–321.[16]

H. Frauenfelder and B. McMahon , Energy landscape and ﬂuctuations in proteins ,Annalen der Physik, 9 (2000), pp. 655–667.[17]

G. Friesecke, O. Junge, and P. Koltai , Mean ﬁeld approximation in conformationdynamics , Multiscale Modeling & Simulation, 8 (2009), pp. 254–268.[18]

G. Froyland, O. Junge, and P. Koltai , Estimating long-term behavior of ﬂows withouttrajectory integration: the inﬁnitesimal generator approach , SIAM Journal on NumericalAnalysis, 51 (2013), pp. 223–247.[19]

U. G. Haussmann and E. Pardoux , Time reversal of diﬀusions , The Annals of Proba-bility, (1986), pp. 1188–1205.[20]

L. H¨ormander , Hypoelliptic second order diﬀerential equations , Acta Mathematica, 119(1967), pp. 147–171.[21]

X. Huang, Y. Yao, G. R. Bowman, J. Sun, L. J. Guibas, G. E. Carlsson, andV. S. Pande , Constructing multi-resolution markov state models (msms) to elucidate rnahairpin folding mechanisms. , in Paciﬁc Symposium on Biocomputing, vol. 15, World Sci-entiﬁc, 2010, pp. 228–239.[22]

W. Huisinga , Metastability of Markovian systems , PhD thesis, Freie Universit¨at Berlin,2001.[23]

W. Huisinga, S. Meyn, and C. Sch¨utte , Phase transitions and metastability in marko-vian and molecular systems , Annals of Applied Probability, (2004), pp. 419–458.[24]

W. Huisinga and B. Schmidt , Metastability and dominant eigenvalues of transfer op-erators , in New Algorithms for Macromolecular Simulation, B. Leimkuhler, C. Chipot,R. Elber, A. Laaksonen, A. Mark, T. Schlick, C. Sch¨utte, and R. Skeel, eds., vol. 49 of Lec-ture Notes in Computational Science and Engineering, Springer Berlin Heidelberg, 2006,pp. 167–182. 3725]

O. Junge and P. Koltai , Discretization of the Frobenius–Perron operator using a sparsehaar tensor basis: the sparse Ulam method , SIAM Journal on Numerical Analysis, 47(2009), pp. 3464–3485.[26]

I. Karatzas , Brownian motion and stochastic calculus , vol. 113, springer, 1991.[27]

P. Koltai , Eﬃcient approximation methods for the global long-term behavior of dynamicalsystems: theory, algorithms and examples , PhD thesis, TU M¨unchen, 2010.[28]

S. Kusuoka and D. W. Stroock , Applications of the Malliavin calculus, part ii. , J.Fac. Sci. Univ. Tokyo, Sect. IA, Math., 32 (1985), pp. 1–76.[29]

A. Lasota and M. C. Mackey , Chaos, fractals, and noise: stochastic aspects of dynam-ics , vol. 97, Springer, 1994.[30]

J. C. Mattingly and A. M. Stuart , Geometric ergodicity of some hypo-elliptic diﬀu-sions for particle motions , Markov Process. Related Fields, 8 (2002), pp. 199–214.[31]

J. C. Mattingly, A. M. Stuart, and D. J. Higham , Ergodicity for sdes and approxi-mations: locally lipschitz vector ﬁelds and degenerate noise , Stochastic processes and theirapplications, 101 (2002), pp. 185–232.[32]

E. Nelson, E. Nelson, E. Nelson, and E. Nelson , Dynamical theories of Brownianmotion , vol. 17, Princeton university press Princeton, 1967.[33]

F. No´e, D. Krachtus, J. C. Smith, and S. Fischer , Transition networks for thecomprehensive characterization of complex conformational change in proteins , Journal ofChemical Theory and Computation, 2 (2006), pp. 840–857.[34]

B. Øksendal , Stochastic diﬀerential equations , Springer, 2003.[35]

A. Ostermann, R. Waschipky, F. G. Parak, and G. U. Nienhaus , Ligand bindingand conformational motions in myoglobin , Nature, 404 (2000), pp. 205–208.[36]

V. S. Pande, K. Beauchamp, and G. R. Bowman , Everything you wanted to knowabout markov state models but were afraid to ask , Methods, 52 (2010), pp. 99–105.[37]

G. Pavliotis and A. Stuart , Multiscale methods: averaging and homogenization , vol. 53,Springer, 2008.[38]

A. Pazy , Semigroups of linear operators and applications to partial diﬀerential equations ,Springer-Verlag, New York, 1983.[39]

J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera,C. Sch¨utte, and F. No´e , Markov models of molecular kinetics: Generation and valida-tion , The Journal of chemical physics, 134 (2011), p. 174105.[40]

S. R¨oblitz and M. Weber , Fuzzy spectral clustering by pcca+: application to markovstate models and data classiﬁcation , Advances in Data Analysis and Classiﬁcation, 7 (2013),pp. 147–179.[41]

C. Sch¨utte , Conformational dynamics: Modelling, theory, algorithm, and application tobiomolecules , 1999. Habilitation Thesis.[42]

C. Sch¨utte, W. Huisinga, and P. Deuflhard , Transfer operator approach to confor-mational dynamics in biomolecular systems , in Ergodic Theory, Analysis, and Eﬃcient Sim-ulation of Dynamical Systems, B. Fiedler, ed., Springer Berlin Heidelberg, 2001, pp. 191–223. 3843]

C. Sch¨utte, F. No´e, J. Lu, M. Sarich, and E. Vanden-Eijnden , Markov statemodels based on milestoning , The Journal of chemical physics, 134 (2011), p. 204105.[44]

C. Sch¨utte and M. Sarich , Metastability and Markov State Models in Molecular Dy-namics. , Courant Lecture Notes in Mathematics, 2013.[45]

L. N. Trefethen , Spectral methods in MATLAB , vol. 10, Siam, 2000.[46]

E. Vanden-Eijnden , Transition-path theory and path-ﬁnding algorithms for the study ofrare events , Annual review of physical chemistry, 61 (2010), pp. 391–420.[47]

M. Weber , Meshless Methods in Conformation Dynamics , PhD thesis, FU Berlin, 2006.[48] ,

A subspace approach to molecular Markov state models via a new inﬁnitesimal gen-erator. , 2012. Habilitation thesis.[49]

E. Zeidler , Applied functional analysis , vol. 108, Springer, 1995.[50]

H.-X. Zhou, S. T. Wlodek, and J. A. McCammon , Conformation gating as a mech-anism for enzyme speciﬁcity , Proceedings of the National Academy of Sciences, 95 (1998),pp. 9280–9283.[51]