A Toolbox for Quantifying Memory in Dynamics Along Reaction Coordinates
AA Toolbox for Quantifying Memory in Dynamics Along Reaction Coordinates
Alessio Lapolla and Aljaˇz Godec ∗ Mathematical bioPhysics Group, Max Planck Institute for Biophysical Chemistry, G¨ottingen 37077, Germany (Dated: January 14, 2021)Memory effects in time-series of experimental observables are ubiquitous, have important con-sequences for the interpretation of kinetic data, and may even affect the function of biomolecularnanomachines such as enzymes. Here we propose a set of complementary methods for quantifyingconclusively the magnitude and duration of memory in a time series of a reaction coordinate. Thetoolbox is general, robust, easy to use, and does not rely on any underlying microscopic model. Asa proof of concept we apply it to the analysis of memory in the dynamics of the end-to-end distanceof the analytically solvable Rouse-polymer model and an experimental time-series of extensions ofa single DNA hairpin measured in an optical tweezers experiment.
The dynamics of complex, high-dimensional physicalsystems such as complex biomolecules is frequently de-scribed by means of memory-less, Markovian diffusionalong a low-dimensional reaction coordinate [1–9]. Suchsimplified models often accurately describe selected ob-servations in experiments [10–14] and computer simula-tions [1, 6, 7]. However, as soon as latent, hidden de-grees of freedom that become projected out do not relaxinstantaneously on the time scale we observe the reac-tion coordinate [15], or the reaction coordinate does notlocally equilibrate in long-lived meta-stable meso-states[16], almost any projection of high-dimensional dynamicsonto a lower dimensional coordinate introduces memory[15–24].Memory effects can have intriguing manifestations inthe evolution of both, ensemble- [15, 25–28] and time-average observables [15, 29], and are often particularlywell-pronounced in observations that reflect, or coupleto, intra-molecular distances in conformationally flexiblebiomolecules [16, 18, 20–22, 25, 30–38]. Moreover, if thedynamics is ergodic in the sense that the system relaxesto a unique equilibrium probability density function fromany initial condition (i.e. the reaction coordinate has aunique free energy landscape) then the memory is neces-sarily transient [15]. Whether or not memory is in factrelevant depends on how its extent compares to the re-laxation time and whether or not the latter is reached inan experiment. If the extent of memory is comparableto, or longer than, the time-scale on which biomoleculesoperate, such e.g. enzymes catalyzing chemical reactions[39, 40], non-Markovian effects shape biological function.It is therefore important to asses the presence andduration of memory effects in the dynamics along re-action coordinates. An elegant “test of Markovianity”of a reaction coordinate has recently been proposed byBerezhkovskii and Makarov, who considered the behav-ior of transition paths [41]. The authors provide a pair ofinequalities whose violation conclusively reflects that thedynamics is non-Markovian. However, memory-effectsare typically transient [15] although their extent mayexceed the duration of experimental observations [33].There is thus a need to determine not only the presence of memory in a time-series of a reaction coordinate butalso its extent and attenuation at different time-scales.Here, we fill this gap by providing a toolbox for quan-tifying memory in a time-series of a low-dimensional re-action coordinate. We propose a set of model-free com-plementary methods that are easy to use and are suitedto treat reaction coordinates with arbitrary dimensional-ity. As a proof of concept we apply these methods to theanalysis of an experimental time-series of the extensionof a DNA-hairpin measured in an optical tweezers exper-iment and the exactly-solvable Rouse model of polymerchain.
Theory.—
Our approach rests on a comparison of thetrue time-evolution with a pair reference processes: oneconstructed as a fictitious, Markovian time-series and theother testing for the semi-group property. Let q t with0 ≤ t ≤ T denote the time-series of the reaction coor-dinate and q M t the fictitious Markovian series. Withoutany loss of generality we assume that the reaction co-ordinate is one-dimensional – the generalization to mul-tiple dimensions is straightforward. We assume q t and q M t to be ergodic with an equilibrium probability den-sity p eq ( q ) that is by construction identical for both pro-cesses. Let G ( q, t | q ) = (cid:104) δ ( q t − q ) (cid:105) q denote the proba-bility density that the reaction coordinate evolving from q t =0 = q is found at time t to have a value in an infinites-imal neighborhood of q and G M ( q, t | q ) = (cid:104) δ ( q M t − q ) (cid:105) q the corresponding Markovian counterpart, where δ ( x ) de-notes Dirac’s delta function and the angular brackets (cid:104)·(cid:105) q the average over all realizations of q t evolving from q .We then have lim t →∞ G ( q, t | q ) = lim t →∞ G M ( q, t | q ) = p eq ( q ) as a result of ergodicity. In practice the limits areachieved as soon as t becomes sufficiently larger than therelaxation time t rel , i.e. t (cid:38) t rel , which may or may notbe reached in an experiment. Note that the relaxationtimes of the true and Markovian reference process aretypically not equal.We use two descriptors, the Kullback-Leibler diver-gence between the transition probabilities of the true and a r X i v : . [ c ond - m a t . s t a t - m ec h ] J a n a fictitious reference process defined as [42] D a q ( t ) ≡ (cid:90) dqG ( q, t | q ) ln[ G ( q, t | q ) /G a ( q, t | q )] , (1)where a = M,CK denotes the particular kind of referenceprocess that we detail below. By construction D a q ( t ) (cid:54) = 0if and only if G ( q, t | q ) (cid:54) = G a ( q, t | q ) and thus non-zerovalues of D a q ( t ) reflect memory in the dynamics of q t .When q t reaches equilibrium in the coarse of the exper-iment we may also consider the normalized equilibriumautocorrelation function defied as C ,M ( t ) ≡ (cid:104) q t q (cid:105) ,M − (cid:104) q (cid:105) (cid:104) q (cid:105) eq − (cid:104) q (cid:105) (2)where we have introduced (cid:104) q t q (cid:105) ,M ≡ (cid:90) (cid:90) qq G ,M ( q, t | q ) p eq ( q ) dqdq T (cid:29) t rel = ( T − t ) − (cid:90) T − t q τ + t q τ dτ, t (cid:28) T, (cid:104) q n (cid:105) eq ≡ (cid:90) q n p eq ( q ) dq, for n = 1 , T (cid:29) t rel = T − (cid:90) T q nτ dτ (3)where the respective definitions in terms of time-averageshold if the trajectory is much longer than the relaxationtime, i.e. T (cid:29) t rel . The absence of an index refers to thetrue process and the index M to the fictitious Markoviancounterpart.In the comparison we consider two distinct referenceprocesses. The first one builds on the semi-group prop-erty of Markov processes, i.e. the so-called Chapman-Kolmogorov equation that we may write as G CK τ ( q, t | q ) ≡ (cid:90) G ( q, t − τ | q (cid:48) ) G ( q (cid:48) , τ | q ) dq (cid:48) (4)because the Green’s function of a time-homogeneousMarkov process is time-translation invariant, G ( q, t − τ | q (cid:48) ) = G ( q, t | q (cid:48) , τ ) [43]. The physical interpretation ofthe construction is that we observe the true dynamics q t until time τ and then instantaneously reset the memory(if any) to zero.If q t is indeed memoryless we have identically G CK τ ( q, t | q ) = G ( q, t | q ) for any τ and thus D CK τ,q ( t ) =0 for any t and τ . If G CK τ ( q, t | q ) (cid:54) = G ( q, t | q ) forsome t and τ then q t is conclusively non-Markovian and D CK τ,q ( t ) > q t to reach equilibrium during an exper-iment and requires only G ( q, t | q ) that is straightforwardto determine from a time series q t . However, if equilib-rium is reached then lim t (cid:29) t rel D CK τ,q ( t ) = 0 for any q . By analyzing D CK τ,q ( t ) we can quantify the degree of mem-ory and its range as a function of τ and q which wedemonstrate below.In the second method we construct from q t a cor-responding Markovian time-series q M t evolving underthe influence of the potential of mean force w ( q ) ≡− k B T ln p eq ( q ) according to the Itˆo Langevin equation ddt q M t = Df ( q M t ) /k B T + ξ t (5)where f ( q M t ) ≡ − k B T ∂ q ln p eq ( q ) | q = q M t and ξ t denoteszero mean Gaussian white noise with variance (cid:104) ξ t ξ t (cid:48) (cid:105) =2 Dδ ( t − t (cid:48) ) and D is the diffusion coefficient. This methodassumes that we are able to determine the equilibriumprobability density p eq ( q ) and thus requires q t to reachequilibrium. In the simplest model, which we adopt here,it is assumed that the diffusion coefficient does not de-pend on q . In this case inferring D is straightforward.However, we note that the best possible Markovian ap-proximation would allow that the diffusion coefficient de-pends on q , i.e. D → D ( q ) [45] and efficient methodshave been developed to treat this case as well [46, 47].In this case Eq. (5) is to be interpreted in the thermo-dynamically consistent anti-Itˆo or post-point convention[16].On the level of the probability density function Eq. (5)corresponds to the Fokker-Planck equation ∂ t G M ( q, t | q ) = ∂ q D [ ∂ q − ( k B T ) − f ( q )] G M ( q, t | q ) (6)with initial condition G M ( q, t | q ) = δ ( q − q ) andnatural boundary conditions imposed by the under-lying physics. Depending on the specific problem G M ( q, t | q ) can be found by a numerical integration ofthe Langevin equation and subsequent histogram analy-sis, i.e. G M ( q, t | q ) = (cid:104) δ ( q M t − q ) (cid:105) q , or directly by solvingEq. (6) as e.g. [15, 48]. In the following we illustrate bothapproaches. End-to-end distance of a Rouse polymer.—
As a firstexample we consider a Rouse polymer chain with N + 1-beads ( N -bonds) in absence of hydrodynamic interac-tions [49, 50] and focus on the end-to-end distance as thereaction coordinate, i.e. q t ≡ | r − r N +1 | . An importantfeature of this model is that it is exactly solvable. We ex-press time in units of t Kuhn , the characteristic diffusiontime of a Kuhn-segment, i.e. t Kuhn = b /D , where b isthe Kuhn-length and D the diffusion coefficient of a bead.The probability density function for the positions of allbeads { r i } is well-known [49, 50] and allows us to de-termine exactly the probability density of the end-to-enddistance. Introducing ν k ≡ kπ/ N + 1), α k = 4 sin ( ν k )as well as Q ik ≡ (cid:112) / ( N + 1) cos( ν k [2 i − η t ≡ (cid:80) Nk =1 ( Q k − Q k ) e − α k t / α k the equilibrium probabil-ity density of q is given by p eq ( q ) = q e − q / η / √ πη / for q ∈ [0 , ∞ ) with the mean extension (cid:104) d (cid:105) = 4 (cid:112) η /π and mean square extension (cid:104) d (cid:105) = 6 η . The probabilitydensity function of q reads exactly (for a derivation seeRef. [26]) G R ( q, t | q ) = qq e − η ( q + q ) / η − η t ) p eq ( q )2 πη t (cid:112) η − η t sinh (cid:18) η t qq η − η t ) (cid:19) . (7)The exact autocorrelation function in Eq. (2) is in turnobtained in the form C ( t ) = 6 (cid:112) η − η t (3 π − η + 4( η + η t ) arctan (cid:16) η t / (cid:112) η − η t (cid:17) (3 π − η η t (8)The Fokker-Planck equation in the Markovian approx-imation to the evolution of q for the Rouse polymer(i.e. Eq. (6)) can be obtained exactly in the formof a spectral expansion [48] and reads G M R ( q, t | q ) = (cid:80) ∞ k =0 ψ Rk ( q ) ψ Lk ( q )e − kt/η where ψ Lk ( x ) ≡ (cid:115) k ! √ π / k ) L / k (cid:18) x η (cid:19) (9)where Γ( x ) denotes the Gamma-function and L / k ( x ) thegeneralized Laguerre polynomial of degree k with pa-rameter 1 / ψ Rk ( x ) = p eq ( x ) ψ Lk ( x ). Here from it isstraightforward to obtain the autocorrelation function inthe Markovian approximation that reads C M ( t ) = √ π − /π ∞ (cid:88) k =1 e − kt/η [ k !Γ(3 / − k ) Γ(3 / k )] − . (10)A comparison of the true autocorrelation function C ( t )given by Eq. (8) and the Markovian approximation C M ( t )given by Eq. (10) is shown in Fig. 1a, with the inset de-picting the corresponding equilibrium probability densityfunction p eq ( q ). Note that when the free energy land-scape w ( q ) overestimates the confining effect of hiddendegrees of freedom on q t the approximate Markovian evo-lution always overestimates the relaxation rate (e.g. [15];see also SM). Namely, the Markovian evolution assumesthe hidden degrees of freedom to remain at equilibrium atall times whereas the instantaneous fluctuating restoringforce on q t is in this case smaller than the force arisingfrom the free energy landscape that corresponds to theaverage of the fluctuating force.The Chapman-Kolmogorov-construct for the Rousepolymer, G CK τ ( q, t | q ), is somewhat lengthy and is the-fore given explicitly in the Supplementary Material, SM. G CK τ ( q, t | q ) differs from Eq. (7) for all expect large val-ues of t − τ . A quantification of the discrepancy betweenthe true and “Chapman-Kolmogorov” evolution of theend-to-end distance of the Rouse-polymer in terms of theKullback-Leibler divergence (1) is shown in Fig. 2a.A typical time evolution of D CK τ,q ( t ) gradually increases . . . .
81 10 a) C ( t ) t TrueMarkovian . . . .
81 10 − − b) C ( t ) t [ ms ] TrueMarkovian p e q ( q ) q p e q ( q ) q [nm] Figure 1. Comparison of the true autocorrelation func-tion C ( t ) (orange) with the Markovian approximation C M ( t )(blue) for a) a Rouse-polymer with 1000 monomers with timeexpressed in units of the diffusion time of a Kuhn-segment t Kuhn ; b) the extension of a DNA-hairpin. The insets depictthe respective equilibrium probability density function p eq ( q ).The dashed lines depict the initial conditions we consider inFig. 3. . . . . .
025 10 a) D C K τ , q ( t ) tq = 40 ,τ = 10040 , , , . . . . . .
06 10 b) D C K τ , q ( t ) t [ms] q = 664 nm ,τ = 0 . ms , . , . , . Figure 2. Kullback-Leibler divergence defined in Eq. (1)between the true Green’s function G ( q, t | q ) and the ap-proximate Green’s function constructed from the Chapman-Kolmogorov equation (4) as a function of time t for a) theRouse-polymer with 1000 beads and b) the extension of aDNA-hairpin. Several values of τ are considered in the con-struction of G CK τ ( q, t | q ) as well as a pair of different initialconditions q . Note that due to the particular construction ofEq. (4) times shorter than depicted are not accessible due tonumerical instability and poor statistics. from zero, reaches a maximum and afterwards returnsback to 0, which reflects the gradual build-up and atten-uation of memory because q t “remembers” the initial con-dition of the hidden degrees of freedom [15]. As a result,the Chapman-Kolmogorov Green’s function G CK τ ( q, t | q )fails to predict the true evolution of q t , and D CK τ,q ( t ) con-structed this way depends on both, τ and initial condition q . For the Rouse-polymer with 1000 beads D CK τ,q ( t ) (cid:54) = 0at least up to t ∼ × t Kuhn .Next we examine D M q ( t ), the Kullback-Leibler diver-gence (1) between the true Green’s function G ( q, t | q )and the Markovian approximation corresponding to thewhite-noise Markovian diffusion in the exact free energy . . . . a) D M q ( t ) t q = 504060 00 . . . . . − − b) D M q ( t ) t [ms] q = 668 [nm] Figure 3. Kullback-Leibler divergence D M t defined in Eq. (1)between the true Green’s function G ( q, t | q ) and the Marko-vian approximation to the Green’s function G M ( q, t | q ) corre-sponding to the evolution equation (5) as a function of time t for a) the Rouse-polymer with 1000 beads evolving from q = 40 (orange), q = 50 (black), q = 60 (blue) and b) theextension of a DNA-hairpin evolving from a distance withina bin of thickness 1nm centered at q = 664 nm (orange), q = 668 (black) and q = 671 nm (blue). landscape (i.e. Eq. (5)). The results are shown in Fig. 3a.The qualitative features of the time-dependence of D M q ( t ) are similar to those observed in Fig. 2 – mem-ory builds up in a finite interval and smoothly returnsback to zero from the attained maximum. The intuitionbehind this result is that it takes a finite time to allowfor distinct evolutions of hidden degrees of freedom thatintroduce memory in the dynamics of the reaction coor-dinate q t . At long times memory is progressively lost asa result of the gradual relaxation of the hidden degrees offreedom to their respective equilibrium that in turn ren-ders the dynamics of the reaction coordinate effectivelymemory-less and correspondingly D M q ( t ) vanishes. Single-molecule experiments on a DNA hairpin.—
As asecond example we consider an experimental time-seriesof the end-to-end distance of a single-strand DNA hairpinmeasured in an optical tweezers experiment performedby the Woodside group [53]. The data-set contains 11million measurements of the extension of the DNA hair-pin 30R50T4 held in a pair of optical traps with stiffness0 .
63 pN/nm and 1 . . µ s temporal resolution. It has been shown that thistime-series is non-Markovian [38]. We find the lengthof the time-series to be much larger that the relaxationtime (see Fig. 1b) and therefore slice it into several piecesthat are statistically independent. More precisely, we usea conservative estimate, namely the time-scale t cut wherethe autocorrelation function of the extension, C ( t ), fallsto below (cid:39) t cut (cid:29) t rel andyields an ensemble of 50 statistically independent trajec-tories.We determine the equilibrium probability density p eq ( q ) (see inset of Fig. 1b) and two-point joint prob-ability density p ( q, t, q ,
0) = p ( q, t + t, q , t ) by per-forming a standard histogram analysis with a bin-size of l bin =0.35 nm, such that q refers to a bin of width l bin centered at q . The Greens function is thereupon ob-tained by the law of conditional probability, G ( q, t | q ) = p ( q, t, q , /p eq ( q ) while the autocorrelation function inEq. (2) is determined directly from the respective secondlines of Eq. (3).The Chapman-Kolmogorov construct is determinedfrom G ( q, t | q ) by direct integration of Eq. (4) and isused to determine D CK τ,q ( t ), while the corresponding fic-titious Markovian process evolves as Markovian diffusionin a free energy landscape w ( q ) ≡ − k B T ln p eq ( q ) witha diffusion coefficient D that we parameterize as fol-lows. We first determine the first and second momentof the displacement from each bin-point q l after a singletime-step δt = 2 . µ s, i.e. (cid:104) δq δt ( l ) (cid:105) and (cid:104) δq δt ( l ) (cid:105) where δq δt ( l ) = q t + δt − q t | q t = q l . We consider two bin-sizes, l D =0.01 nm and l D =0.001 nm, and find the result tobe essentially independent on the precise value of l D wechoose (see SM). Moreover, the results are also ratherindependent of the location of the bin q l (see SM), im-plying that to a good approximation D may indeed betaken as being constant. It is of course possible to extendthe analysis to include a coordinate dependent diffusioncoefficient D ( q ) (see e.g. [47]), which, however, is beyondthe scope of this proof-of-concept paper.From the first and second local moments of q the dif-fusivity D is in turn determined according to D = N b (cid:88) l =1 (cid:104) δq δt ( l ) (cid:105) − (cid:104) δq δt ( l ) (cid:105) N b δt , (11)where the brackets (cid:104)·(cid:105) here denote the average over alldisplacements from this bin observed during the entiretime-series. The analysis yields D = 447 ± / ms for l D =0.001 nm and D = 448 ± / ms for l D =0.01nm, respectively. Using the values for and D obtainedin this way we generate the Markovian time-series q M t byintegrating the Itˆo Langevin equation (5) using the Euler-Mayurama scheme, and determine D M t ( q ) in Eq. (1) and C M ( t ) in Eq. (2), respectively.In contrast to the Rouse-polymer the DNA hairpin ex-ists in two characteristic conformational states – foldedand unfolded. As a result, the equilibrium probabilitydensity function p eq ( q ) is bimodal and the dynamics of q t displays signatures of metastability [53]. However, sincethe two peaks corresponding to the two sub-populationsare not separated (see inset of Fig.1b) the potential ofmean force w ( q ) is expected to underestimate the freeenergy barrier and therefore the Markovian evolution islikely to overestimate the relaxation rate. In completeagreement Fig.1b displays an overestimation of the rateof decay of autocorrelations in the Markovian approxi-mation by two orders of magnitude in time. Moreover, along-lived plateau is observed in the true C ( t ) spanningmore than an order of magnitude in time.In order to assess whether the mismatch between trueand Markovian time evolution is predominantly due toan underestimation of the free energy barrier betweenfolded and unfolded states of the hairpin we inspect heKullback-Leibler divergence (1) between the true and“Chapman-Kolmogorov evolution” of the extension ofthe hairpin is shown in Fig. 2b. The result clearly showspronounced signatures of memory in the evolution of theend-to-end distance of the hairpin that extend over morethan ∼
10 ms. Note that the “Chapman-Kolmogorov evo-lution” describes the correct evolution until time t = τ whereupon memory is reset to zero. Therefore a non-zero D Ck τ,q ( t ) is a clear signature of memory arising fromthe dynamical coupling of q t to hidden degrees of free-dom. Similar to the Rouse-polymer D CK τ,q ( t ) depends onthe initial condition q .A build-up and decay of memory similar to the Rouse-polymer is also observed in the time evolution of D M q ( t ),the Kullback-Leibler divergence between the Green’sfunction of the true evolution and the white-noise Marko-vian diffusion in the exact free energy landscape shown inFig. 3b. Notably, Fig. 2b and Fig. 3b display essentiallythe same extent of memory (though the peak is attainedsooner in the white-noise Markovian diffusion), demon-strating that metastability does not necessarily destroynor dominate memory in the evolution of reaction coordi-nates. Note that the presence of memory in metastablesystems is not unusual (see e.g. [20, 21] and [28]). Intotal, the analysis conclusively identifies the presence ofextended memory effects in the dynamics of the extensionof the hairpin.It is important to note that the extent of memory (ofthe order of ∼ q t and the initial conditions of thehidden degrees of freedom [15]. The information encodedin C ( t ) and D M , CK ( t ) is therefore different – D M , CK ( t ) isa genuine measure of the extent and duration of memory. Conclusion.—
We presented a set of complementarymethods to quantify conclusively the degree and dura-tion of memory in a time series of a reaction coordinate q t . The proposed toolbox does not assume any partic-ular physical model. Instead it exploits the semi-groupproperty of Markov processes and constructs a fictitiousMarkovian diffusion process in the free energy landscapeof q t , and compares the artificially constructed transi-tion probability density with the observed probabilitydensity. The analysis not only determines whether thedynamics of q t has memory but also quantifies the mag-nitude and duration of memory and thus complementsthe recently proposed “test for Markovianity” based ontransition paths [41]. Whereas in our examples we con-sidered only one-dimensional coordinates and constant diffusion coefficients the toolbox generalizes straightfor-wardly to higher-dimensional reaction coordinates anddiffusion landscapes D ( q ). The method is general, ro-bust, and easy to use. We therefore hope that it will findnumerous applications involving time-series derived fromexperiments and computer simulations. Acknowledgments
We thank Krishna Neupane andMichael T. Woodside for providing unlimited access totheir DNA-hairpin data. The financial support from theGerman Research Foundation (DFG) through the EmmyNoether Program GO 2762/1-1 to AG is gratefully ac-knowledged. ∗ [email protected][1] R. B. Best and G. Hummer, Reaction coordinates andrates from transition paths, Proc. Natl. Acad. Sci. ,6732–6737 (2005).[2] J. J. Portman, S. Takada, and P. G. Wolynes, Microscopictheory of protein folding rates. ii. local reaction coordi-nates and chain dynamics, J. Chem. Phys. , 5082–5096(2001).[3] B. Peters, P. G. Bolhuis, R. G. Mullen, and J.-E. Shea, Re-action coordinates, one-dimensional smoluchowski equa-tions, and a test for dynamical self-consistency, J. Chem.Phys. , 054106 (2013).[4] A. K. Faradjian and R. Elber, Computing time scales fromreaction coordinates by milestoning, J. Chem. Phys. ,10880–10889 (2004).[5] A. Berezhkovskii and A. Szabo, One-dimensional reactioncoordinates for diffusive activated rate processes in manydimensions, J. Chem. Phys. , 014503 (2005).[6] G. Hummer, Position-dependent diffusion coefficients andfree energies from bayesian analysis of equilibrium andreplica molecular dynamics simulations, New J. Phys. ,34–34 (2005).[7] R. B. Best and G. Hummer, Coordinate-dependent dif-fusion in protein folding, Proc. Natl. Acad. Sci. ,1088–1093 (2009).[8] W. Zhang, C. Hartmann, and C. Sch¨utte, Effective dy-namics along given reaction coordinates, and reaction ratetheory, Faraday Discussions , 365–394 (2016).[9] A. M. Berezhkovskii and D. E. Makarov, Communication:Coordinate-dependent diffusivity from single molecule tra-jectories, J. Chem. Phys. , 201102 (2017).[10] O. K. Dudko, G. Hummer, and A. Szabo, The-ory, analysis, and interpretation of single-molecule forcespectroscopy experiments, Proceedings of the NationalAcademy of Sciences , 15755–15760 (2008).[11] K. Neupane, A. P. Manuel, and M. T. Woodside, Pro-tein folding trajectories can be described quantitativelyby one-dimensional diffusion over measured energy land-scapes, Nat. Phys. , 700 (2016).[12] K. Neupane, D. A. N. Foster, D. R. Dee, H. Yu, F. Wang,and M. T. Woodside, Direct observation of transitionpaths during the folding of proteins and nucleic acids, Sci-ence , 239 (2016).[13] J. Gladrow, M. Ribezzi-Crivellari, F. Ritort, and U. F.Keyser, Experimental evidence of symmetry breaking oftransition-path times, Nat. Commun. , 10.1038/s41467- , 10.1126/sciadv.aaz4642 (2020).[15] A. Lapolla and A. Godec, Manifestations of Projection-Induced Memory: General Theory and the Tilted SingleFile, Front. Phys. , 10.3389/fphy.2019.00182 (2019).[16] D. Hartich and A. Godec, Emergent memory and kinetichysteresis in strongly driven networks, arXiv:2011.04628(2020), arXiv:2011.04628 [cond-mat.stat-mech].[17] N. van Kampen, Remarks on Non-Markov Processes,Brazilian J. Phys. , 10.1590/S0103-97331998000200003(1998).[18] S. S. Plotkin and P. G. Wolynes, Non-markovian con-figurational diffusion and reaction coordinates for proteinfolding, Phys. Rev. Lett. , 5015 (1998).[19] R. Zwanzig, Nonequilibrium statistical mechanics (Ox-ford Univ. Press, 2010).[20] D. E. Makarov, Interplay of non-markov and internal fric-tion effects in the barrier crossing kinetics of biopolymers:Insights from an analytically solvable model, J. Chem.Phys. , 014102 (2013).[21] M. Ozmaian and D. E. Makarov, Transition path dynam-ics in the binding of intrinsically disordered proteins: Asimulation study, J. Chem. Phys. , 235101 (2019).[22] H. Meyer, P. Pelagejcev, and T. Schilling, Non-markovianout-of-equilibrium dynamics: A general numerical pro-cedure to construct time-dependent memory kernels forcoarse-grained observables, EPL (Europhys. Lett.) ,40001 (2020).[23] E. Herrera-Delgado, J. Briscoe, and P. Sollich, Tractablenonlinear memory functions as a tool to capture and ex-plain dynamical behaviors, Phys. Rev. Research , 043069(2020).[24] F. M¨uller, U. Basu, P. Sollich, and M. Kr¨uger, Coarse-grained second-order response theory, Phys. Rev. Research , 043123 (2020).[25] W. Min, G. Luo, B. J. Cherayil, S. C. Kou, and X. S.Xie, Observation of a power-law memory kernel for fluctu-ations within a single protein molecule, Phys. Rev. Lett. , 198302 (2005).[26] A. Lapolla and A. Godec, Faster uphill relaxation in ther-modynamically equidistant temperature quenches, Phys.Rev. Lett. , 10.1103/physrevlett.125.110602 (2020).[27] A. Lapolla and A. Godec, Bethesf: Efficient computationof the exact tagged-particle propagator in single-file sys-tems via the bethe eigenspectrum, Comput. Phys. Com-mun. , 107569 (2021).[28] A. Lapolla and A. Godec, Single-file diffusion in a bi-stable potential: Signatures of memory in the barrier-crossing of a tagged-particle, J. Chem. Phys. , 194104(2020).[29] A. Lapolla and A. Godec, Unfolding tagged particle his-tories in single-file diffusion: exact single- and two-tag lo-cal times beyond large deviation theory, New J. Phys. ,113021 (2018).[30] S. C. Kou and X. S. Xie, Generalized langevin equationwith fractional gaussian noise: Subdiffusion within a singleprotein molecule, Phys. Rev. Lett. , 180603 (2004).[31] T. Neusius, I. Daidone, I. M. Sokolov, and J. C. Smith,Subdiffusion in peptides originates from the fractal-likestructure of configuration space, Phys. Rev. Lett. ,188103 (2008). [32] S. Press´e, J. Peterson, J. Lee, P. Elms, J. L. MacCallum,S. Marqusee, C. Bustamante, and K. Dill, Single moleculeconformational memory extraction: P5ab rna hairpin, J.Phys. Chem. B , 6597–6603 (2014).[33] X. Hu, L. Hong, M. Dean Smith, T. Neusius, X. Cheng,and J. Smith, The dynamics of single protein molecules isnon-equilibrium and self-similar over thirteen decades intime, Nature Physics , 171–174 (2015).[34] A. K. Sangha and T. Keyes, Proteins Fold by Subdiffu-sion of the Order Parameter, J. Phys. Chem. B , 15886(2009).[35] S. M. Avdoshenko, A. Das, R. Satija, G. A. Papoian,and D. E. Makarov, Theoretical and computational vali-dation of the Kuhn barrier friction mechanism in unfoldedproteins, Sci. Rep. , 269 (2017).[36] Y. Cote, P. Senet, P. Delarue, G. G. Maisuradze, andH. A. Scheraga, Anomalous diffusion and dynamical cor-relation between the side chains and the main chain ofproteins in their native state, Proc. Natl. Acad. Sci. ,10346 (2012).[37] I. Grossman-Haham, G. Rosenblum, T. Namani, andH. Hofmann, Slow domain reconfiguration causes power-law kinetics in a two-state enzyme, Proc. Natl. Acad. Sci. , 513 (2018).[38] A. G. T. Pyo and M. T. Woodside, Memory effectsin single-molecule force spectroscopy measurements ofbiomolecular folding, Phys. Chem. Chem. Phys. , 24527(2019).[39] H. P. Lu, Single-molecule enzymatic dynamics, Science , 1877–1882 (1998).[40] B. P. English, W. Min, A. M. van Oijen, K. T. Lee,G. Luo, H. Sun, B. J. Cherayil, S. C. Kou, and X. S.Xie, Ever-fluctuating single enzyme molecules: Michaelis-menten equation revisited, Nat. Chem. Biol. , 87–94(2005).[41] A. M. Berezhkovskii and D. E. Makarov, Single-MoleculeTest for Markovianity of the Dynamics along a ReactionCoordinate, J. Phys. Chem. Lett. , 2190 (2018).[42] S. Kullback and R. Leibler, On information and suffi-ciency, Ann. Math. Statist , 79 (1951).[43] Gardiner, C.W., Handbook of Stochastic Methods forPhysics, Chemistry and Natural Sciences , 2nd ed.(Springer-Verlag, 1985).[44] W. Feller, Non-Markovian Processes with the SemigroupProperty, The Annals of Mathematical Statistics , 1252(1959).[45] A. Berezhkovskii and A. Szabo, Time scale separationleads to position-dependent diffusion along a slow coordi-nate, J. Chem. Phys. , 074108 (2011).[46] B. J. Berne, M. Borkovec, and J. E. Straub, Classical andmodern methods in reaction rate theory, J. Phys. Chem. , 3711–3725 (1988).[47] G. Hummer, Position-dependent diffusion coefficientsand free energies from bayesian analysis of equilibrium andreplica molecular dynamics simulations, New J. Phys. ,34 (2005).[48] S. Sunagawa and M. Doi, Theory of Diffusion-ControlledIntrachain Reactions of Polymers, Polymer J. , 604(1975).[49] P. E. Rouse, A Theory of the Linear Viscoelastic Prop-erties of Dilute Solutions of Coiling Polymers, J. Chem.Phys. , 1272 (1953).[50] K. H. Ahn, J. L. Schrag, and S. J. Lee, Bead-spring chainmodel for the dynamics of dilute polymer solutions, J. Non-Newton Fluid Mech. , 349 (1993).[51] Abramowitz, Milton and Stegun, Irene A., Handbookof Mathematical Functions with Formulas, Graphs, andMathematical Tables, in Handbook of Mathematical Func-tions with Formulas, Graphs, and Mathematical Tables (Dover, New York, 1964) ninth dover printing, tenth gpoprinting ed.[52] F. Johansson, Arb: Efficient Arbitrary-PrecisionMidpoint-Radius Interval Arithmetic, IEEE Transactionson Computers , 1281 (2017).[53] K. Neupane, A. P. Manuel, J. Lambert, and M. T. Wood-side, Transition-Path Probability as a Test of Reaction-Coordinate Quality Reveals DNA Hairpin Folding Is aOne-Dimensional Diffusive Process, J. Phys. Chem. Lett. , 1005 (2015). Supplementary Material for:A Toolbox for Quantifying Memory in Dynamics Along Reaction Coordinates
Alessio Lapolla and Aljaˇz Godec
Mathematical bioPhysics Group, Max Planck Institute for Biophysical Chemistry, 37077 G¨ottingen, Germany
Abstract
In this Supplementary Material (SM) we present the exact result for the “Champman-Kolmogorov”construction of the Green’s function of the end-to-end distance and other internal coordinates of the Rousepolymer defined in Eq. (4) in the manuscript as well as a local error analysis of the inferred diffusioncoefficient D entering the Markovian approximation of the end-to-end diffusion of the DNA hairpin. Inaddition, supplementary figures are included showing the various Green’s functions for the Rouse polymerand DNA hairpin. DETAILS OF THE PROJECTION AFFECT THE RELAXATION TIME AND EXTENT OF MEMORY
In the main text we consider Rouse polymer chain composed of 1000 beads and we focus on the autocorrelationfunction of its end-to-end distance as the reaction coordinate q t . We find that the fictitious Markovian reference processdescribing Brownian diffusion in the free energy landscape overestimates the relaxation rate; a similar observation isalso made in the case of the experimental hairpin data. However this difference in the rate of relaxation is non-uniqueand in fact depends on the observable, i.e. on details of the projection. . . . .
81 10 − − C ( t ) t True, 1-2Markovian, 1-2True, 1-1000Markovian, 1-1000
Figure S4. The autocorrelation function of the distance between the first and the second bead (dashed lines) and first and lastbead (full lines) of a Rouse chain composed by 1000 according to the true (orange) and fictitious Markovian evolution (blue).Note the the free energy landscape for both orange-blue pairs is by construction identical. The continuous lines are those shownin the main text.
For example we demonstrate in Fig. S4 the opposite trend that arises when we observe the autocorrelation functionof the distance between the first and the second bead of the same Rouse Chain (see dashed lines).In addition, is worth to note that if the Green’s function G describing the full many-dimensional system is diago-nalizable (like in the Rouse chain case [1] or any Markovian dynamics obeying detailed balance), it can be writtenas G ( x , t | x ) = (cid:88) k ψ Rk ( x ) ψ Lk ( x )e − λ k t , (S1)where ψ Rk and ψ Lk are respectively the right and left eigenfunctions of the underlying Fokker-Planck-Smoluchowskioperator, while λ k denotes the eigenvalues. Then the Green’s function of the projected observable – the reactioncoordinate q = Γ( x ) – can be written in full generality [2] as G ( q, t | q ) = (cid:88) k V Rk ( q ; Γ) V Lk ( q ; Γ)e − λ k t , (S2)where the elements V Rk and V Lk depend both on ψ Rk and ψ Lk , and on the projection Γ( x ). In turn the autocorrelationfunction can be easily computed as: C ( t ) = (cid:88) k ( (cid:90) qV Rk ( q ; Γ) dq )( (cid:90) q V Lk ( q ; Γ) dq )e − λ k t ≡ (cid:88) k a R, Γ k b L, Γ k e − λ k t , (S3)and one can show that for systems obeying detailed balance a R, Π k b L, Π k ≥ ∼ while in the first-to-second distance is ∼ . This disparity is simply a result of the projection that determinesthe relative contribution of different eigenfunctions. THE CHAPMAN-KOLMOGOROV CONSTRUCTION
In the case of the Rouse chain the integral defined in Eq. (4) in the main text can be solved analytically via astraightforward but tedious calculation using Eq. (7) in the main text. The result of the integral reads exactly G CKt ( q, t | q ) = η / qq η t − τ η τ (cid:112) π Ξ τ,t − τ sinh (cid:18) qq η η τ η t − τ τ,t − τ (cid:19) × exp (cid:34) − q η − t − τ (cid:18) − η t − τ Ω − τ τ,t − τ (cid:19) − q − τ (cid:32) η Ω − τ − η t − τ Ω + τ Ω − t − τ − η η τ Ω − t − τ Ξ τ,t − τ (cid:33)(cid:35) (S4)having defined Ω ± t = η ± η t , Ξ τ,t − τ = 4 η − Ω + τ Ω + t − τ . (S5)Notably, the structure of Eq. (S4) is identical to the structure of the plain Green’s function (Eq. (7) in the main text)but here the temporal dependence is obviously different.Note that in when the observation time is much larger than the relaxation time of the observable t rel , we findfor t − τ > t rel that G CK τ ( q, t | q ) (cid:39) p eq ( q ) (cid:82) dq (cid:48) G ( q (cid:48) , t | q ) = p eq ( q ). Therefore, since lim t →∞ G ( q, t | q (cid:48) ) = p eq ( q ), thedefinition of G CK τ ( q, t | q ) (Eq. (4) in the main text) by construction ensures lim t →∞ D CK τ,q ( t ) = 0. In Fig. S5 weexplicitly show the Green’s function that is required for the computation of the Kullback-Liebler divergence. ESTIMATION OF THE DIFFUSION COEFFICIENT
Eq. (12) in the main text demonstrates how one can estimate the ( q -independent) diffusion coefficient D from atime-series using the first two moments of the local displacement.The moments are in turn determined as follows: if q t is found during an epoch in the bin centered around q and ofsize l D its displacement from the previous epoch is determined and its square and both are averaged over all epochsin the time-series yielding (cid:104) δq δt ( l ) (cid:105) and (cid:104) δq δt ( l ) (cid:105) respectively.This procedure is repeated for different positions q on the support of the equilibrium distribution. We find that theresulting diffusion coefficient changes little over the entire support even for different bin-sizes l D (see Fig.S6). Thereforewe treat it, as a first approximation, as being independent of q and average over all bins to obtain D (cid:39) D = 448nm / ms. Using this value of D we perform Brownian Dynamics simulations of the fictitious Markovian processaccording to the Euler-Mayurama scheme.0 . . .
12 0 20 40 60 80 100 120 140 a) G ( q , t | q ) q t = 10100100000 . . . . . c) G M ( q , t | q ) q t = 101001000 00 . . . .
16 650 655 660 665 670 675 680 b) G ( q , t | q ) q [nm] t = 0 . ms . . . . . .
16 650 655 660 665 670 675 680 d) G M ( q , t | q ) q [nm] t = 0 . ms . . Figure S5. Green’s function at different times for both considered systems. a) and b) depict the true Green’s function for theend-to-end distance of the Rouse chain and of the DNA hairpin respectively. Panels c) and d) show the Green’s function oftheir respective fictitious Markovian processes at the same times. The initial conditions are q = 60 for the Rouse chain and q = 671 nm for the hairpin. − . − . − . . . . . ( D − D ) / D q [nm] ∆ l = 0 . nm ∆ l = 0 . nm Figure S6. Relative deviation of the local diffusion coefficient in a given bin D l from the average value D = N − (cid:80) Nl =1 D l asa function of the position of the bin. In the case of ∆ l = 0.001 nm we find D = 447 nm / ms with a deviation ± / msand for ∆ l = 0.01 nm we find D = 448 nm / ms with a deviation ± / ms. In a first approximation the values of D l areindependent of l and we thus set D ≈ D (cid:39)
448 nm / ms. ∗ [email protected][1] G. Wilemski and M. Fixman, Diffusion-controlled intrachain reactions of polymers. I Theory, J. Chem. Phys. , 866 (1974).[2] A. Lapolla and A. Godec, Manifestations of Projection-Induced Memory: General Theory and the Tilted Single File, Front.Phys.7