Effective Mori-Zwanzig equation for the reduced-order modeling of stochastic systems
EEffective Mori-Zwanzig equation for the reduced-order modeling ofstochastic systems
Yuanran Zhu a, ∗ , Huan Lei b a Department of Applied Mathematics, University of California, MercedMerced (CA) 95343 b Department of Computational Mathematics, Michigan State UniversityEast Lansing (MI) 48824
Abstract
Built upon the hypoelliptic analysis of the effective Mori-Zwanzig (EMZ) equation for observables of stochas-tic dynamical systems, we show that the obtained semigroup estimates for the EMZ equation can be usedto drive prior estimates of the observable statistics for system in the equilibrium and nonequilibrium state.In addition, we introduce both first-principle and data-driven methods to approximate the EMZ memorykernel, and prove the convergence of the data-driven parametrization schemes using the regularity estimateof the memory kernel. The analysis results are validated numerically via the Monte-Carlo simulation ofthe Langevin dynamics for a Fermi-Pasta-Ulam chain model. With the same example, we also show theeffectiveness of the proposed memory kernel approximation methods.
Keywords:
Mori-Zwanzig equation, Stochastic differential equation, Reduced-order modeling.
MSC:
1. Introduction
The projection operator method, which is also known as the Mori-Zwanzig (MZ) formulation [23, 36], isa widely used dimension-reduction framework in statistical mechanics. The key features of such formulationis that it allows us to formally derive the generalized Langevin equations (GLEs) [37, 3, 30, 14] for coarse-grained quantities of interest based on microscopic equations of motion. Such GLEs can be found in a varietyof applications, including molecular dynamics [19, 31, 11, 10], fluid mechanics [24, 12], and, more generally,systems described by nonlinear partial differential equations (PDEs) [29, 27, 22, 21]. Although being usedin the physics and applied mathematics communities for a rather long time, a systematic study of the MZequation within an rigorous analytical framework is still lacking. This is closely related to the well-knowndifficulty on the quantification of the orthogonal dynamics in the MZ equation. Being a high-dimensionalflow which is generated by an integro-differential operator, the mathematical properties such as the regularityand ergodicity of the orthogonal dynamics are not well understood. Hence from a theoretical point of view,there is no available prior estimates which help to determine the properties of the MZ memory integral andthe fluctuation force. As a result, the numerical approximations of these terms has to be done in an rather ad hoc manner.Some recent works have shed light on this direction. In particular, Kupferman, Givon and Hald proved[5] the existence and uniqueness of the orthogonal dynamics for a classical dynamical system with Mori’sprojection operator. More recently, Zhu and Venturi [32] were able to get the uniform boundedness of theorthogonal dynamics propagator for Hamiltonian systems using semigroup estimates [32]. The theoreticalresult obtained therein was later extended and greatly improved for the analysis of the effective Mori-Zwanzig(EMZ) equation corresponding to stochastic differential equations (SDEs) [35]. In particular, they developeda thorough mathematical analysis of the EMZ equation using the hypoelliptic technique developed mainlyby H´erau, Nier, Eckmann, Hairer and Helffer [16, 8, 15]. The key finding is that the ergodicity and regularity ∗ Corresponding author
Email address: [email protected] (Yuanran Zhu)
Preprint submitted to arXiv February 3, 2021 a r X i v : . [ m a t h - ph ] F e b f the stochastic flow generated by the Markovian semigroup e − t K , where K is the Kolmogorov operatorcorresponding to the SDE, implies the ergodicity and regularity of the stochastic flow generated by the EMZ orthogonal semigroup e − t QKQ , provided that P = I − Q is a Mori-type projection operator. This connectionenables us to get a clear understanding on the dynamical properties of the orthogonal dynamics generatedby e − t QKQ .In this work, we continue Zhu and Venturi’s hypoelliptic study of the EMZ equation for stochasticdynamical systems. The main objective of the paper is twofold. First, we apply the semigroup estimateobtained in [35] to different stochastic systems and show that it enables us to derive useful prior estimatesfor the statistics of observables. In particular, we prove that for some commonly used stochastic modelsin physics, any reduced-order system observable has exponentially decaying time autocorrelation functionand EMZ memory kernel. This fact verified the frequently used exponentially decaying assumption for thememory kernel from a theoretical point of view. Secondly, we will demonstrate the effectiveness of the seriesexpansion approximation method for the memory kernel reconstruction of the EMZ equation. To this end,we will focus on a generalized first-principle parametrization method [34], and various data-driven methods[1, 2, 4, 20] developed over the years. For the numerical example we considered, both the first-principle andthe data-driven method are proven to yield accurate simulation result within the range of their applicability.Moreover, we will prove the convergence of the commonly used data-driven method using the regularityestimate for the orthogonal dynamics. For the reduced-order modeling problem of a large-scale stochasticsystem, the proposed analysis for the EMZ equation shows the potential usage of the hypoelliptic methodin understanding the dynamical behavior of the reduced-order model, and the numerical methodology weprovide is shown to be a practical way to solve it.This paper is organized as follows. In Section 2, we briefly review the derivation of the effective MoriZwanzig (EMZ) equation for the stochastic dynamical system driven by white noise. In Section 3, we focus onthe equilibrium and nonequilibrium dynamics of the interacting anharmonic chains and derive prior estimatesfor various observable statistics such as the time autocorrelation function, the nonequilibrium mean, the EMZmemory kernel and the fluctuation force. In Section 4, we introduce different parametrization methods toapproximate the EMZ memory kernel and prove their convergence. All these theoretical results are verifiednumerically in Section 5 via the simulation of the Langevin dynamics for a Fermi-Pasta-Ulam chain model.The main findings of this paper are summarized in Section 6.
2. Effective Mori-Zwanzig equation for stochastic system
Let us consider a d -dimensional stochastic differential equation in R d : d x ( t ) dt = F ( x ( t )) + σ ( x ( t )) ξ ( t ) , x (0) = x ∼ ρ ( x ) , (1)where F : R d → R d and σ : R d → R d × m are smooth functions. ξ ( t ) is a m -dimensional Gaussian whitenoise with independent components, and x is a random initial state characterized in terms of a probabilitydensity function ρ ( x ). It is well known that the system of SDEs (1) induces a d -dimensional Markovianprocess in R d . This allows us to define a composition operator M ( t,
0) that pushes forward in time theaverage of the observable u ( t ) = u ( x ( t )) over the noise, i.e., E ξ ( t ) [ u ( x ( t )) | x ] = M ( t, u ( x ) = e t K u ( x ) , (2)where M ( t,
0) is a Markovian semigroup generated by the following (backward) Kolmogorov operator [26, 17]: K ( x ) = d (cid:88) k =1 F k ( x ) ∂∂x k + 12 m (cid:88) j =1 d (cid:88) i,k =1 σ ij ( x ) σ kj ( x ) ∂∂x i ∂x k . (3)With the evolution operator M ( t,
0) available, we can now derive the Mori-Zwanzig equation for noise-averaged u ( x ( t )). To this end, we introduce a projection operator P and the complementary projection Q = I − P . By differentiating Dyson’s identity [34, 33, 6] for the Markovian semigroup M ( t, ∂∂t e t K u (0) = e t K PK u (0) + e t QKQ QK u (0) + (cid:90) t e s K PK e ( t − s ) QKQ QK u (0) ds, (4)2here u (0) = u ( x ). The three terms on the right hand side of (4) are called streaming term, fluctuation(or noise) term, and memory term respectively. It is often more useful to compute the evolution of theobservable u ( t ) within a closed linear space, e.g., the image of the projection operator P . Hence we applythe projection operator P to (4) and get the projected equation: ∂∂t P e t K u (0) = P e t K PK u (0) + (cid:90) t P e s K PK e ( t − s ) QKQ QK u (0) ds. (5)Due to the fact that equation (4) and its projected form (5) only describe the dynamics of noise averageof observable u ( x ), they were called as the effective Mori-Zwanzig (EMZ) equation for stochastic systems.The EMZ equation and the classical MZ equation for deterministic (autonomous) systems [34, 32, 33] share asame structure. The only difference is that the Liouville operator L is replaced by a Kolmogorov operator K .Depending on the choice of the projection operator, the EMZ equation (4) and (5) yield evolution equationsfor different statistical quantities. In this paper, we mainly focus on the EMZ equation correspondingto Mori-type linear projection operator. To drive such equation, we consider the weighted Hilbert space H = L ( R d , ρ ), where ρ is a positive weight function in R d . Let (cid:104) h, g (cid:105) ρ = 1 (cid:82) ρd x (cid:90) h ( x ) g ( x ) ρ ( x ) d x h, g ∈ H (6)be the inner product in H . We introduce the following projection operator P h = M (cid:88) i,j =1 G − ij (cid:104) u i (0) , h (cid:105) ρ u j (0) , h ∈ H. (7)where G ij = (cid:104) u i (0) , u j (0) (cid:105) ρ and u i (0) = u i ( x ) ( i = 1 , ..., M ) are M linearly independent functions withrespect to inner product (cid:104)· , ·(cid:105) ρ . With P defined as (7), we can rewrite the EMZ equation (4), and itsprojected version (5), as d u ( t ) dt = Ω u ( t ) + (cid:90) t K ( t − s ) u ( s ) ds + f ( t ) , (8) ddt P u ( t ) = Ω P u ( t ) + (cid:90) t K ( t − s ) P u ( s ) ds. (9)where u ( t ) = [ u ( t ) , . . . , u M ( t )] T and G ij = (cid:104) u i (0) , u j (0) (cid:105) ρ (Gram matrix) , (10a)Ω ij = M (cid:88) k =1 G − jk (cid:104) u k (0) , K u i (0) (cid:105) ρ (streaming matrix) , (10b) K ij ( t ) = M (cid:88) k =1 G − jk (cid:104) u k (0) , K e t QKQ QK u i (0) (cid:105) ρ (memory kernel) , (10c) f i ( t ) = e t QKQ QK u i (0) (fluctuation term) . (10d)In statistical mechanics, the EMZ equation (8) and (9) are often called as the generalized Langevin equations.We note that the classical second fluctuation-dissipation theorem [34] for the EMZ equation is violated dueto non-skew-adjointness of the Kolmogorov operator K .
3. Applications of the hypoelliptic analysis for the EMZ equation
In the previous section, we demonstrate how the EMZ equation is derived from the evolution operator e t K and the orthogonal e t QKQ . In this section, we focus on the prior estimation of these two semigroupsand apply the established analytical results to various physical models. To be consistent with the literatureon hypoelliptic analysis, we will use the negative of K and QKQ as semigroup generators and write the3emigroups appearing in EMZ equation (4) as e − t K and e − t QKQ . Moreover, all estimates are obtained firstin an regular Hilbert space L ( R d ) and then transformed back in weighted Hilbert space L ( R d ; ρ ) (seeexplanations in [35]). Throughout the paper, we denote the standard L ( R d ) norm as (cid:107) · (cid:107) . The innerproduct in L ( R d ; ρ ) is defined as (6) with the induced weighted norm (cid:107) · (cid:107) L ρ given by (cid:107) · (cid:107) L ρ = (cid:20) (cid:82) ρd x (cid:90) ( · ) ρd x (cid:21) . Unless otherwise stated we only consider scalar quantities of interest. We begin our discussion by reviewingsome results proved in [35]:
Proposition 1 (Zhu and Venturi [35]) . Assuming a Kolmogorov operator ˜ K of the form (3) satisfies somesuitable conditions (see Theorem 2 [35] for details). If the spectrum of ˜ K in L ( R n ) is such that σ ( ˜ K ) ∩ i R = { } , then there exits positive constants α and C = C ( α ) such that (cid:107) e − t ˜ K ˜ u − ˜ π ˜ u (cid:107) ≤ Ce − αt (cid:107) ˜ u (cid:107) , (11) for ˜ u ∈ L ( R n ) and all t > , where ˜ π is the spectral projection onto the kernel of ˜ K . Moreover, the n -thorder derivatives of the semigroup e − t ˜ K satisfies (cid:107) e − t ˜ K ˜ K n (cid:107) ≤ (cid:18) B ˜ K (cid:18) tn (cid:19) + (cid:107) ˜ π ˜ K(cid:107) (cid:19) n , B ( t ) = Ce − αt (cid:20) t + · · · + 1 t M (cid:21) , (12) for some positive constant C and M . In [35], it is further shown that the similar semigroup estimates hold for orthogonal semigroup e t ˜ Q ˜ K ˜ Q if˜ P = I − ˜ Q is finite-rank, symmetric projection operator such as Mori’s projection. In particular, we have Proposition 2 (Zhu and Venturi [35]) . Assume that ˜ K satisfies all conditions listed in Proposition 1. If ˜ P : L ( R n ) → L ( R n ) is a symmetric, finite-rank projection operator, and the spectrum of ˜ Q ˜ K ˜ Q in L ( R n ) is such that σ ( ˜ Q ˜ K ˜ Q ) ∩ i R = { } , then there exits positive constants α ˜ Q and C = C ( α ˜ Q ) such that (cid:107) e − t ˜ Q ˜ K ˜ Q ˜ u − ˜ π ˜ Q ˜ u (cid:107) ≤ Ce − α ˜ Q t (cid:107) ˜ u (cid:107) (13) for all ˜ u ∈ L ( R n ) and t > , where ˜ π ˜ Q is the spectral projection onto the kernel of ˜ Q ˜ K ˜ Q . Moreover, the n -th order derivatives of the semigroup e − t ˜ Q ˜ K ˜ Q satisfies (cid:107) e − t ˜ Q ˜ K ˜ Q ( ˜ Q ˜ K ˜ Q ) n (cid:107) ≤ (cid:18) B ˜ Q (cid:18) tn (cid:19) + (cid:107) ˜ π ˜ Q ( ˜ Q ˜ K ˜ Q ) (cid:107) (cid:19) n , B ˜ Q ( t ) = Ce − α Q t (cid:20) t + · · · + 1 t M Q (cid:21) , (14) for some positive constant C and M Q .3.1. Application to Langevin dynamics Consider the Langevin dynamics of an interactive particle system, described by the following system ofSDEs in R d : d q dt = 1 m p d p dt = −∇ V ( q ) − γm p + σ ξ ( t ) . (15)In eqn (15), m is the mass of each particle, V ( q ) is the interaction potential and ξ ( t ) is a d -dimensionalGaussian white noise process modeling physical Brownian motion. The parameters γ and σ are linked by thefluctuation-dissipation relation σ = (2 γ/β ) / , where β is proportional to the inverse of the thermodynamictemperature. The (negative) Kolmogorov operator (3) associated with the SDE (15) is given by K = − p m · ∇ q + ∇ q V ( q ) · ∇ p + γ (cid:18) p m · ∇ p − β ∆ p (cid:19) , (16)4here “ · ” denotes the standard dot product. If the interaction potential V ( q ) is strictly positive at infinityand satisfies the weak ellipticity assumption (Hypothesis 1 in [35]), then the Langevin equation (15) admits anunique invariant Gibbs distribution given by ρ eq ( p , q ) = e − β H /Z , where H = (cid:107) p (cid:107) m + V ( q ) is the Hamiltonianand Z is the partition function. In [35], it is further proved that Proposition 1 holds for any ˜ u ∈ L ( R d )and t > π ( · ) = (cid:104) ( · ) , e − β H / (cid:105) e − β H / . Now we choose ρ eq as the weight of Hilbert space L ( R d ; ρ eq ),then the L -estimation (11) can be unitarily transformed [35] into the semigroup estimate in L ( R d ; ρ eq )as: (cid:107) e − t K u − π u (cid:107) L eq ≤ Ce − αt (cid:107) u (cid:107) L eq , (17)where π ( · ) = (cid:104) ( · ) (cid:105) eq = E [( · )]. Similarly, for orthogonal semigroup e t QKQ we have: (cid:107) e − t QKQ u − π Q u (cid:107) L eq ≤ Ce − α Q t (cid:107) u (cid:107) L eq . (18)Different from the estimate for e − t K , the explicit expression of the spectral projection operator π Q dependson the projection operator P . For Mori-type projection operator P we considered, if there exists uniqueobservable set { w j } mj =1 such that (cid:104) w j , u i (cid:105) eq = 0 and K w j = K ∗ w j = u j , then π Q admits analytical form π Q ( · ) = π ( · ) + P ( · ) + m (cid:88) i =1 (cid:104) ( · ) , w i (cid:105) eq w i . (19)Otherwise π Q ( · ) = π + P . With semigroup estimates (17) and (18), we can drive prior estimations fordifferent observable statistics. Equilibrium state . The equilibrium Langevin dynamics was studied thoroughly in [35]. Here we onlyreview the key estimation result while the derivation is omitted. If the initial condition of the Langevindynamics is (15) set to be ρ = ρ ( t = 0) = ρ eq , then the system is in a statistical equilibrium state, thecorresponding dynamics is called the equilibirum Langevin dynamics . For equilibrium system, the timeautocorrelation function C ( t ) of a scalar observable u ( x ( t )) = u ( p ( t ) , q ( t )) is stationary quantity satisfying C ( t, s ) = C ( | t − s | , C ( t ) := E x (0) [ E ξ ( t ) [ u ( t ) u (0) | x (0)]] = (cid:104) e t K u , u (cid:105) eq . (20)Using Cauchy-Schwarz inequality and the semigroup estimate (17), for u ∈ L ( R d ; ρ eq ) it is easy to getthe asymptotic estimate for C ( t ): | C ( t ) − (cid:104) u (cid:105) eq | = |(cid:104) e − t K u , u (cid:105) eq − (cid:104) u (cid:105) eq | = |(cid:104) e − t K u − (cid:104) u (cid:105) eq , u (cid:105) eq |≤ (cid:107) e − t K u − (cid:104) u (cid:105) eq (cid:107) L eq (cid:107) u (cid:107) L eq ≤ Ce − αt (cid:107) u (cid:107) L eq . (21)This implies the equilibrium correlation function C ( t ) approaches to the equilibrium value (cid:104) u (cid:105) eq = E [ u ]exponentially fast. To get the EMZ equation for observable u ( t ), we introduce Mori-type projection P = (cid:104)· , u (cid:105) eq u . Substituting this into EMZ equation (8) and (9) yields: ddt ˜ u ( t ) = Ω˜ u ( t ) + (cid:90) t K ( t − s )˜ u ( s ) ds + f ( t ) , (22) ddt C ( t ) = Ω C ( t ) + (cid:90) t K ( t − s ) C ( s ) ds, (23)where ˜ u ( t ) = E ξ ( t ) [ u ( x ( t )) | x ] and Ω = (cid:104) u , K u (cid:105) eq / (cid:104) u (cid:105) eq . If we further have (cid:104) u (cid:105) eq = 1, (cid:104) u (cid:105) eq = (cid:104) u , w (cid:105) eq = 0 and K w = K ∗ w = u , by using Cauchy-Schwarz inequality and the semigroup estimate(18), we can get the exponential convergence estimate for the EMZ memory kernel K ( t ) and the fluctuationforce f ( t ): | K ( t ) − ( (cid:104)K u (cid:105) eq (cid:104)QK ∗ u (cid:105) eq + (cid:104)QK ∗ u , w (cid:105) eq (cid:104)K u , w (cid:105) eq ) | ≤ Ce − α Q t , (24) (cid:13)(cid:13)(cid:13) f ( t ) − (cid:16) (cid:104)QK u (cid:105) eq + (cid:104)QK u , w ) (cid:105) eq (cid:17)(cid:13)(cid:13)(cid:13) L eq ≤ Ce − α Q t (cid:107)QK u (cid:107) L eq . (25)5n most cases, such w does not exists, then estimates (24) and (25) reduce to | K ( t ) − (cid:104)K u (cid:105) eq (cid:104)QK ∗ u (cid:105) eq | ≤ Ce − α Q t , (26) (cid:13)(cid:13)(cid:13) f ( t ) − (cid:104)QK u (cid:105) eq (cid:13)(cid:13)(cid:13) L eq ≤ Ce − α Q t (cid:107)QK u (cid:107) L eq . (27) Nonequilibrium nonsteady state.
Semigroup estimate (17) can also be used to get prior estimates fornonequilibrium Langevin dynamics. If the initial condition of (15) is set to be ρ = ρ ( t = 0) (cid:54) = ρ eq , thesystem evolves from a nonequilibrium nonsteady state. We now study the dynamics of the nonequilibirummean function M ( t ) defined as: M ( t ) := E x (0) [ E ξ ( t ) [ u ( t ) | x (0)]] = (cid:104) e t K u (cid:105) ρ .M ( t ) encodes the statistical moment information for a scalar observable u ( x ( t )). Using the Cauchy-Schwarzinequality, the substitution ρ = ρ √ ρ eq / √ ρ eq and the estimate (17), we obtain the asymptotic estimate for M ( t ): | M ( t ) − (cid:104) u (cid:105) eq | = |(cid:104) e − t K u (cid:105) ρ − (cid:104)(cid:104) u (cid:105) eq (cid:105) ρ | = |(cid:104) e − t K u − (cid:104) u (cid:105) eq (cid:105) ρ |≤ (cid:107) e − t K u − (cid:104) u (0) (cid:105) eq (cid:107) L eq (cid:13)(cid:13) ρ /ρ eq (cid:13)(cid:13) ≤ Ce − αt (cid:107) u (cid:107) L eq (cid:13)(cid:13) ρ /ρ eq (cid:13)(cid:13) . Different from the equilibrium case, the convergence of the nonequilibrium mean M ( t ) requires the finitenessof the L ( R d ) norm (cid:107) ρ /ρ eq (cid:107) , which imposes additional constraint on the initial probability distribution ρ .For instance, if the initial probability density is set to be the Gibbs distribution ρ = e − β H / /Z β/ at hightemperature T ∝ /β , we have (cid:107) ρ /ρ eq (cid:107) = + ∞ therefore the above estimate is not sufficient to guaranteethe exponential convergence of M ( t ) towards the equilibrium value (cid:104) u (cid:105) eq . Similar conclusion can be drawnfrom the return to equilibrium estimate for the probability density function ρ ( t, p , q ) (see [15], Section 6.5): (cid:90) R d (cid:12)(cid:12)(cid:12)(cid:12) ρ ( t, p , q ) − Z e − β H (cid:12)(cid:12)(cid:12)(cid:12) e β H d p d q ≤ Ce − α t . The above estimate is a dual of (11) and holds only for ρ = ρ (0 , p , q ) ∈ e − β H / S (cid:48) ( R d ), where S (cid:48) ( R d ) is thespace of tempered distributions. Obviously when ρ = e − β H / /Z β/ / ∈ e − β H / S (cid:48) ( R d ), there is no theoreticalguarantee that the marginal distribution ρ u ( t ) would converges to the equilibrium marginal distribution. Consider a chain of nearest-neighbor interacting anharmonic oscillators coupled to two heat bath at endof the chain. Without adding external forces, the chain dynamics is determined by the system Hamiltonian: H S ( p , q ) = N (cid:88) i =0 (cid:18) p i V ( q i ) (cid:19) + N (cid:88) i =1 V ( q i − q i − ) . Now we attach the boundary oscillators to two thermostats with temperature T L and T R , then the dynamicsof the resulting heat conduction model [9, 7, 8] is described by the system of stochastic differential equations: dq i = p i dt, i = 0 , · · · N,dp = − V (cid:48) ( q ) + V (cid:48) ( q − q ) dt + r L dtdp N = − V (cid:48) ( q N ) + V (cid:48) ( q N − q N − ) dt + r R dtdp j = − V (cid:48) ( q j ) dt − V (cid:48) ( q j − q j − ) dt + V (cid:48) ( q j +1 − q j ) dt, j = 1 , · · · N − dr L = − γ L r L dt + λ L γ L q dt − λ L (cid:112) γ L T L ξ L ( t ) dtdr R = − γ R r R dt + λ R γ R q N dt − λ R (cid:112) γ R T R ξ R ( t ) dt (28)6here λ L , λ R are the coupling constants between the boundary oscillators and the heat bath. ξ L ( t ) and ξ R ( t ) are the standard Gaussian white noise. The Kolmogorov backward operator K corresponding to thesystem of SDEs (28) is given by: K = λ L γ L T L ∂ r L + λ R γ R T R ∂ r R − γ L ( r L − λ L q ) ∂ r L − γ R ( r R − λ R q N ) ∂ r R + r L ∂ p + r R ∂ p N + N (cid:88) i =0 ( p i ∂ q i − V (cid:48) ( q i ) ∂ p i ) − N (cid:88) i =1 V (cid:48) ( q i − q i − )( ∂ p i − ∂ p i − ) . (29) Equilibrium state.
When T L = T R = T , the system admits an invariant probability density which isgiven by the extended Gibbs distribution ρ eq = e − β G ( p , q , r ) /Z , where β = 1 /T and G ( p , q , r ) is the effectiveenergy corresponding to the chain+heat bath system, defined as G ( p , q , r ) = H S ( p , q ) + r L λ L + r R λ R − q r L − q N r R . (30)The analysis for the equilibrium heat conduction model is exactly the same as the one for the Langevindynamics. For potential energy V and V satisfying suitable conditions listed in [8], Eckmann and Hairerproved that the spectrum of the transformed Kolomogorov operator ˜ K in L ( R N +4 ) is discrete and has thecusp-shape spectrum S ˜ K . Therefore according to Proposition 1, if ˜ K has no purely imaginary eigenvalue in L ( R N +4 ), then we have the exponentially decay estimate for scalar observable u ( x ( t )): (cid:107) e − t K u − π u (cid:107) L eq ≤ Ce − αt (cid:107) u (cid:107) L eq , (31) (cid:107) e − t QKQ u − π Q u (cid:107) L eq ≤ Ce − α Q t (cid:107) u (cid:107) L eq , (32)where weighted Hilbert space L eq = L ( R d ; ρ eq ). Following the procedure we outlined in Section 3.1, it iseasy to obtain corresponding exponentially decaying estimates for the equilibrium correlation function C ( t ),EMZ memory kernel K ( t ) and the fluctuation force f ( t ). For the sake of brevity, the derivation details areomitted. Nonquilibrium steady state.
When T L (cid:54) = T R , it is proved in [7] that the system admits an unique invariantmeasure µ . Its density ρ S is an smooth function on R N +4 and can be represented as ρ S = ˜ h ( p , q , r ) e − β G ( p , q , r ) . (33)In (33), β < min { β L , β R } , ˜ h ( p , q , r ) ∈ (cid:84) γ> L ( R N +4 ; G γ ( p , q , r )) is a function decays faster than anypolynomial as x → ∞ . ρ S characterises a nonequilibrium steady state of the system. General speaking, it ishard to get an explicit expression of ˜ h ( p , q , r ), hence of the probability density (33). However, we can still useGibbs form equilibrium probability density e − β G /Z as a reference state to derive prior estimates. To this end,we consider a weighted Hilbert space L ( R N +4 ; ρ r ), where ρ r = e − β G /Z and 1 /β = T > max { T L , T R } .For the nonequilibrium case, the spectrum estimate obtained by Eckmann et al in [8] still hold, which impliesthe following exponentially decay estimate for scalar observable u ( x ( t )): (cid:107) e − t K u − π u (cid:107) L r ≤ Ce − αt (cid:107) u (cid:107) L r . (34)At the steady state, the correlation function C ( t ) is stationary which can be defined as (20) if the initialcondition of (28) satisfies ρ (0) = ρ S . Using Cauchy-Schwarz inequality and the formal expression of thesteady state density (33), we obtain | C ( t ) − (cid:104) u (cid:105) ρ S | = |(cid:104) e − t K u , u (cid:105) ρ S − (cid:104) u (cid:105) ρ S , u (cid:105) ρ S | = |(cid:104) e − t K u − (cid:104) u (cid:105) ρ S , u (cid:105) ρ S |≤ (cid:107) e − t K u − (cid:104) u (cid:105) ρ S (cid:107) L r (cid:107) ˜ h ( p , q , r ) u (cid:107) ≤ Ce − αt (cid:107) ˜ h ( p , q , x ) u (cid:107)(cid:107) u (cid:107) L r Since ˜ h ( p , q , r ) ∈ (cid:84) γ> L ( R N +4 ; G γ ( p , q , r )), for any observable u ( x ( t )) ∈ S (cid:48) ( R N +4 ), e.g. polynomialfunctions, we have (cid:107) ˜ h ( p , q , x ) u (cid:107)(cid:107) u (cid:107) L r < + ∞ . The above estimate implies the steady state correlationfunction C ( t ) decays to (cid:104) u (cid:105) ρ S exponentially fast. We emphasize that all estimates in Section 3 can be readilygeneralized to the N -dimensional EMZ equation (8) and (9) where the observable u ( x ( t )) is a N -dimensionalvector [35]. 7 . Parametrization of the EMZ memory kernel From the previous discussion, we see that the prior estimation for the EMZ equation memory kernelimplies that K ( t ) is bounded by an exponentially decaying function. However, it does not answer what K ( t ) exactly is, which is an important problem for the application the EMZ equation. In this section,we turn to focus on the numerical approximation of the EMZ memory kernel. The main method we areconcerned with is the series expansion method. Over the years, various basis functions have been used toconstruct approximation schemes of the classical system Mori-Zwanzig memory kernel [1, 2, 19, 20, 18, 4, 33],where the expansion coefficients (parameters) are obtained through first-principle or data-driven methods.We will show that for the EMZ equation corresponding to the SDEs, similar approaches can be used toparametrize the memory kernel. In particular, we will prove that many commonly used date-driven methodsare convergent due to the regularity of the orthogonal flow. A first-pinciple method to approximate the memory kernel was considered in [33, 34]. It is shown that aseries expansion of the memory kernel can be derived exactly from the semigroup expansion of orthogonalsemigroup e t QKQ . Following the derivation given therein, we consider the series expansion of the orthogonalsemigroup: e t QKQ = ∞ (cid:88) n =0 g n ( t )Φ n ( QKQ ) , (35)where Φ n ( QKQ ) is the n -th order polynomial function of operator QKQ and g n ( t − τ ) the correspondingtemporal basis. The simplest choice is the Taylor expansion where Φ n and g n are Φ n ( QKQ ) = (
QKQ ) n and g n ( t ) = t n /n !. Other possible choice of Φ n ( n = 0 , . . . , N ) can be, e.g. the Faber polynomials [33], and thecorresponding g n ( t ) = e − at J n ( bt ), where J n ( bt ) is the Bessel function of the first kind. Semigroup expansion(35) leads to function series expansions of the memory kernel. For a one dimensional EMZ equation, asubstitution of (35) into (10c) yields K ( t ) = ∞ (cid:88) n =0 g n ( t ) (cid:104)K Φ n ( QKQ ) QK u , u (cid:105) ρ (cid:104) u , u (cid:105) ρ = ∞ (cid:88) n =0 k n g n ( t ) , (36)where k n is the n -th expansion coefficient which can be understood as the operator cumulant averagedwith respect to the probability density ρ . Naturally, a truncation of the expansion series (36) yields anapproximation of the exact memory kernel. We emphasize that from a theoretical point of view, it is hardto prove the convergence of expansion (36) for nonlinear SDEs due to unboundedness of the operator QKQ .However, the validity of this approximation method has been verified numerically for linear and nonlinearHamiltonian system in the statistical equilibrium [33, 34].
First-principle method to calculate k n . The first-principle method calculate k n via the evaluation of theoperator cumulants in (36). This can be realized using a recursive scheme and the associated combinatorialalgorithm introduced in [34]. The original method is developed for the MZ equation of deterministic Hamil-tonian systems. However, it can be readily generalized to stochastic system EMZ equation with some slightmodifications of the derivation. Here we only briefly review the main idea of the algorithm and refer to [34]for detailed explanations. Without loss of generality, it is convenient to consider a one-dimensional Mori’sprojection: P f = (cid:104) f, u (cid:105) ρ (cid:104) u , u (cid:105) ρ , (37)and introduce the following notation µ i = (cid:104)K ( QK ) i − u , u (cid:105) ρ (cid:104) u , u (cid:105) ρ , γ i = (cid:104)K i u , u (cid:105) ρ (cid:104) u (0) , u (0) (cid:105) ρ . (38)Clearly, if we are given { µ , . . . , µ n +2 } , then we can easily compute { k , . . . , k n } in (36), and therefore the n -th order approximation of the memory kernel K ( t ) for any given polynomial function Φ n . For example,8f Φ n ( QKQ ) = (
QKQ ) n then k q = µ q +2 /q ! ( q = 0 , . . . , n ). Directly evaluating µ i is a daunting task since itinvolves taking operator powers and averaging of operator QKQ which is a integral-differential operator bydefinition. However, the following recursive formula indicates that µ i can be constructed iteratively from γ i : µ = γ , µ = γ − µ γ , · · · , µ n = γ n − n − (cid:88) j =1 µ n − j γ j . (39)The proof of (39) is provided in Appendix A. Recurrence relation (39) shifts the problem of computing { µ , . . . , µ n } to the problem of evaluating the coefficients { γ , . . . , γ n } defined in (38). This can be doneiteratively using the enumerative combinatorial algorithm introduced in [34], with the the Livouille operator L used therein replaced by the Kolomogorov operator K . For the sake of brevity, we omit technical detailswhich can be found in [34]. In Appendix B, we provide the derivation of the combinatrorial algorithm forthe Langevin dynamics (15) of the Fermi-Pasta-Ulam (FPU) chain. Different from the first-principle method, there are many established data-driven methods which can beused to parametrize the memory kernel. Generally speaking, these methods use data collected by simulatingstochastic dynamics (1) to approximate the expansion coefficients k n . The expansion series can be formulatedin the temporal space as well as the frequency space [18]. In this section, we are only concerned with thefirst case and use the following ansatz to approximate K ( t ): K ( t ) ≈ N (cid:88) n − ˆ k n φ n ( t ) , where ˆ k n = (cid:104) K ( t ) , φ n ( t ) (cid:105) ω (cid:104) φ n ( t ) , φ n ( t ) (cid:105) ω . (40)In (40), { φ n ( t ) } is the basis function defined in some open interval I ⊂ R + . The common choice of whichare the orthogonal functions in a weighted Hilbert space L ( I, ω ). Under this setting, (40) becomes ageneralized Fourier series. Hence, we can apply established results in approximation theory, say [13], toprove the convergence of the series expansion (40) as N → ∞ . As a preparation, we first use the Cauchy-Schwartz inequality and semigroup estimate (14) to obtain the upper bounds of the n -th order derivative ofthe memory kernel: | K ( n ) ( t ) | = |(cid:104)K ∗ e t QKQ ( QKQ ) n QK u , u (cid:105) ρ | = |(cid:104) e t QKQ ( QKQ ) n QK u , K ∗ ρ u (cid:105) ρ |≤ C (cid:107)QK u (cid:107) L ρ (cid:107)K ∗ ρ u (cid:107) L ρ (cid:18) B Q (cid:18) tn (cid:19) + (cid:107) π Q ( QKQ ) (cid:107) (cid:19) n , t > . (41)According to the definition of B Q ( t ) in (14), estimate (41) implies that K ( n ) ( t ) is bounded by a continuousfunction of time in domain I = ( T , T ), where 0 ≤ T ≤ T < + ∞ . Hence for suitable weight function ω ,we have K ( t ) | I ∈ ∩ ∞ k =1 H kω ( I ), where K ( t ) | I is the restriction of K ( t ) in the open interval I and H kω ( I ) isthe weighted Sobolev space defined in I . This regularity provides sufficient conditions for the convergenceof expansion (40). If { φ n ( t ) } is chosen to be, say the shifted Jacobi-type polynomials defined in I , thenaccording to Theorem 6.2.4 in [13], the following convergence estimate holds for any 0 < m < N : (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) K ( t ) | I − N (cid:88) n =0 ˆ k n φ n ( t ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L ω ( I ) ≤ CN m (cid:13)(cid:13)(cid:13) (1 − t ) m/ K ( m ) ( t ) (cid:13)(cid:13)(cid:13) L ω ( I ) , t ∈ I = ( T , T ) . (42)Since K ( t ) is naturally defined in domain I = (0 , + ∞ ), we can also set { φ n ( t ) } to be the standard Laguerrepolynomial with the weight function ω = e − t/ . For fixed n ∈ N + , using (41) we can getlim t → + ∞ | K ( n ) ( t ) | ≤ C (cid:107)QK u (cid:107) L ρ (cid:107)K ∗ ρ u (cid:107) L ρ (cid:107) π Q ( QKQ ) (cid:107) n , (43)9 = 1 β = 20 Figure 1: Sample path of the tagged oscillator momentum p ( t ). We display the result for the stochastic FPU system (48)with weak ( θ = 0 .
1) and strong nonlinearity ( θ = 1) at high ( β = 1) and low ( β = 20) temperature. which yields | K ( m ) ( t ) | t m/ ∈ L ω ( I ). According to Theorem 6.2.5 in [13], this leads to the following conver-gence estimate for any 0 < m < N : (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) K ( t ) − N (cid:88) n =0 ˆ k n φ n ( t ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L ω ( I ) ≤ C ( √ N ) m (cid:13)(cid:13)(cid:13) t m/ K ( m ) ( t ) (cid:13)(cid:13)(cid:13) L ω ( I ) , t ∈ I = (0 , + ∞ ) . (44)The semigroup estimate (14) we used in the above derivation holds in the uniform topology for any scalarobservable u ∈ L ( R n ; ρ ). As a consequence, the convergence rate we obtained on (42) and (44) are notoptimal. But we already know the convergence is spectral , i.e. faster than any polynomials. As far as we areconcerned, this is the first convergence result for data-driven methods used in the Mori-Zwanzig framework.We also note that error estimate (42) only implies the approximation of K ( t ) within ( T , T ) is accurate. Inorder to maintain low extrapolation error, in applications we will use basis functions defined in I = (0 , + ∞ )to approximate the memory kernel. Data-driven method to calculate k n . A substitution of the truncated expansion (40) into the projectedEMZ equation (5) yields ddt P u ( t ) ≈ Ω P u ( t ) + N (cid:88) n =0 k n (cid:90) t φ n ( s ) P u ( t − s ) ds. The projected observable P u ( t ) trajectories within a certain period [0 , T ] can be obtained via numericalsimulation of the SDE (1), then the expansion coefficients k n can be obtained by solving numerically thefollowing regression problem:min { k n } Nn =1 (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ddt P u ( t ) − Ω P u ( t ) − N (cid:88) n =0 k n (cid:90) t φ n ( s ) P u ( t − s ) ds (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L ω ( I ) . (45)In Section 5, we will use the LASSO regression [28] to solve (45) and get the approximated parameter set { k n } Nn =1 . When compared with the first-principle method, the data-driven method in general has widerrange of applicability but also demands more computational power.
5. Applications
In this section, we will use the Langevin dynamics of a Fermi-Pasta-Ulam (FPU) chain model to numer-ically verify the theoretical results obtained in previous sections and validate the parametrization method of10 = 1 β = 20 -2 -0.500.51 -5 -4 -3 -2 -1 -2 -0.500.51 -5 -4 -3 -2 -1 Figure 2: Temporal auto-correlation function of the tagged oscillator momentum p j ( t ) for weakly nonlinear FPU system atdifferent temperature T ∝ /β . We compare results we obtained by calculating the EMZ memory from first principles using14th order Faber polynomials with results from MC simulation (10 sample paths). In the subplots, we display | C ( t ) /C (0) | and the exponentially decaying upper bound ce − αt with an estimated decaying rate α . the EMZ memory kernel. To this end, we consider the Hamiltonian of the FPU chain: H ( p , q ) = N − (cid:88) j =0 p j m + N − (cid:88) j =0 V ( q j +1 − q j ) , (46)where the potential energy is given by V ( q j +1 − q j ) = α q j +1 − q j ) + θ q j +1 − q j ) . (47) { q j , p j } are, respectively, the generalized coordinate and momentum of the j -th oscillator. In addition, theperiodic boundary condition q = q N and p = p N is imposed, and the total number of oscillators is set tobe N = 100. For such system, it is convenient to work on a new, non-canonical coordinate { r , p } where r j = q j − q j − is the distance between two neighboring oscillators. In the new coordinate, the Langevindynamics (15) for the stochastic FPU model is given by ddt r j = 1 m ( p i − p i − ) ,ddt p j = ∂V ( r j +1 ) ∂r j +1 − ∂V ( r j ) ∂r j − γm p j + σξ ( t ) . (48)The corresponding Kolmogorov operator is K = L ( p , r ) + S ( p )= N − (cid:88) i =1 (cid:20)(cid:18) ∂V ( r i +1 ) ∂r i +1 − ∂V ( r i ) ∂r i (cid:19) ∂∂p i + 1 m ( p i − p i − ) ∂∂r i (cid:21) − N − (cid:88) i =1 γ (cid:18) p i m ∂∂p i − β ∂ ∂p i (cid:19) , (49)where L ( p , r ) is the Liouville operator in the new coordinate { r , p } , S ( p ) is the advection-diffusion operatorinvolving p . The sample paths for the momentum of a tagged oscillator p ( t ) are shown in Figure 1for system with different parameters. Stochastic FPU chain with weak nonlinearity
We first consider theequilibrium dynamics of the FPU chain with weak nonlinearity. To this end, we set modeling parameter α = 1, γ = 1 and θ = 0 .
1. The initial condition of (15) is set to be x (0) ∼ ρ eq where ρ eq = e − βH is theequilibrium Gibbs distribution. For the weakly nonlinear system, we aim to verify the following claims:i) The observable statistics, in particular the auto-correlation function C ( t ) and the corresponding mem-ory kernel K ( t ) defined in the projected EMZ (23), decays exponentially to its equilibrium value.11 = 1 β = 20 -2 -0.200.20.40.60.8 -20 -10 -2 -0.200.20.40.60.81 -30 -20 -10 Figure 3: Approximated EMZ memory kernel corresponding to the tagged particle momentum correlation function C ( t ). Thesubplots display | K ( t ) /K (0) | and the exponentially decaying upper bound c Q e − α Q t with an estimated decaying rate α Q . Othersetting is same as Figure 2. ii) The first-principle method introduced in Section 4.1 yields accurate approximation to the memorykernel K ( t ), therefore of C ( t ).For claim i), we note that the FPU potential energy defined as (47) satisfies the weak ellipticity condi-tion (Hypothesis 1) in [35]. Therefore the theoretical results in Section 3.1 hold for any polynomial-typeobservable. Now we choose the momentum p j ( t ) of a tagged oscillator as quantity of interest and use Mori’sprojection P ( · ) = (cid:104) ( · ) , p j (0) (cid:105) eq / (cid:104) p j (0) (cid:105) eq to derive the projected EMZ equation (23). Some simple calculationimplies Ω = −
1, hence the projected EMZ equation for the momentum yields the evolution equation for thetime correlation function: dC ( t ) dt = − C ( t ) + (cid:90) t K ( t − s ) C ( s ) ds, (50)where C ( t ) = (cid:104) p j ( t ) , p j (0) (cid:105) eq . According to estimate (21), the auto-correlation function C ( t ) decays to 0exponentially fast with the rate α . In the non-canonical coordinate { p j , r j } , we have K q j = K ∗ eq q j = p j .Moreover, since the periodic boundary condition is posed q j cannot be written as a function of { p j , r j } (thelinear transformation { q j } Nj =1 → { r j } Nj =1 is not invertible). Hence the kernel projection operator π Q of QKQ admits the explicit form: π Q ( · ) = E [( · )] + P ( · ) , and the memory kernel estimate is given by (26). In Figure 2, we plot the auto-correlation function C ( t )obtained by Monte-Carlo (MC) simulation (10 sample paths) for FPU systems with mild nonlinearities( θ = 0 .
1) at different temperatures ( β = 1 and β = 20). The corresponding memory kernel K ( t ) is shownin Figure 3 which is obtained by the first-principle parametrization method. In both plots, we can see that C ( t ) and K ( t ) approaches to the predicted asymptotic C ( t = ∞ ) = 0 and K ( t = ∞ ) = 0 exponentially fast.For claim ii), we adopt the MZ-Faber expansion of e − t QKQ [33] to approximate the memory kernel,where e − at J n ( bt ) is the basis function. The simulation reult is displayed in Figure 2. It can be seen thatthe MZ-Faber approximation of the EMZ memory kernel yields relatively accurate results for FPU systemswith mild nonlinearties at both low ( β = 20) and high temperature ( β = 1). Stochastic FPU chain with strong nonlinearity.
When the modeling parameter of the stochastic FPUchain is set to be α = 1 and θ = 1, we get a strong nonlinear FPU chain. We can similarly derive theprojected EMZ equation for the time correlation function of the momentum of a tagged particle. The formof which will be the same as (50).The first principle method introduced in Section 4.1 still can be applied here to approximate the EMZmemory kernel. However, large θ will lead to significant numerical instabilities at large t when calculating K ( t ) and C ( t ) [34]. Hence, we will adopt the data-driven method to approximate the memory kernel. To12 = 1 β = 20 -2 -1 -0.500.51 -5 -4 -3 -2 -1 -2 -1 -0.500.51 -6 -4 -2 Figure 4: Temporal auto-correlation function of the tagged oscillator momentum p j ( t ) for strongly nonlinear FPU system atdifferent temperature T ∝ /β . The MC simulation results (10 sample paths) of the correlation function are compared withthe one obtained by the data-driven memory kernel using Faber series (20th order) and the standard Laguerre polynomials(20th order). In the subplots, we display | C ( t ) /C (0) | and the exponentially decaying upper bound ce − αt with an estimateddecaying rate α . this end, we use the standard Laguerre polynomial [13] and Faber series [33] to construct the data-drivenapproximation scheme of the EMZ memory kernel. In particular, the LASSO regression is used to solve (45)numerically to get the approximated parameter { k n } Nn =1 . We also note that since the EMZ equation (23) isa Volterra-type integro-differential equation, constructing the memory kernel K ( t ) from C ( t ) is an inverseproblem which is ill-posed. Hence by solving numerically the regression problem (45), we can only get an effective memory kernel which may yield accurate predication of C ( t ) or ˜ u ( t ) while cannot be used to justifythe exponential decaying estimate of K ( t ). Considering this fact, we only use the data-driven method toverify the following claims:iii) The auto-correlation function C ( t ) defined in the projected EMZ (23) decays exponentially to itsequilibrium value.iv) The data-driven method introduced in Section 4.1 yields effective approximations to the memorykernel K ( t ), therefore of C ( t ).To demonstrate iii), we use MC simulation (10 sample paths) to calculate the momentum auto-correlationfunction C ( t ). It is shown in the subplots of Figure 4 that C ( t ) defined in the projected EMZ (23) decays0 exponentially fast. To validate iv), we adopt the Faber series and the standard Laguerre polynomials asthe basis function to construct the data-driven approximation schemes for K ( t ). These calculation results,along with the one obtained by the established rational approximation method [18], are presented in Figure4. We can see that the data-driven method leads to accurate predication of C ( t ).
6. Summary
In this paper, we mainly focus on the application of the effective Mori-Zwanzig (EMZ) equation onthe reduced-order modeling of stochastic systems. In particular, we showed that the semigroup estimatesfor e − t K and e − t QKQ can be used to derive the exponentially decay upper bounds for various observablestatistics associated with the EMZ equation, including the auto-correlation function C ( t ), the EMZ memorykernel K ( t ) and the fluctuation force. The results are presented for the Langevin dynamics of anharmonicoscillator chain and the heat conduction model in and out of statistical equilibrium. For each system,we have shown how the obtained prior estimates describe the dynamical properties of the reduced-orderequation for observables. In addition, we introduced both the first-principle and data-driven methods toparametrize the EMZ memory kernel, and demonstrated that the regularity of K ( t ) enables us to prove theconvergence of frequently used data-driven approximation schemes. As far as we are concerned, this is thefirst theoretical convergence result regarding the approximation of the memory kernel. All these theoreticalfindings are verified numerically by simulating the Langevin dynamics for a Fermi-Pasta-Ulam (FPU) chain13odel. With the same example, we also proved the effectiveness of the numerical methods within their rangeof applicability. We conclude by emphasizing that analytical results obtained in this paper can be generalizedand applied to the EMZ equation of other hypoelliptic stochastic systems. The numerical methodology weconsidered can be used to build effective reduced-order models for large-scale stochastic dynamical systems. Acknowledgements
This research was partially supported by the Air Force Office of Scientific Research(AFOSR) grant FA9550-16-586-1-0092. The first author would like to thank Prof. D. Venturi and A. Iserlesfor stimulating discussions.
Appendix A. Proof of recurrence relation (39)The proof of the recurrence relation (39) for equilibrium Hamiltonian system is given by Chu and Li in[4]. Here we provide a more general proof and show that such relation holds for any linear operator K andfinite-rank projection operator P . (39) is a direct consequence of the following operator polynomial identity: PK ( QK ) n = PK ( n +1) − n (cid:88) i =1 PK ( QK ) ( i − PK ( n − i +1) , ≤ n ∈ N (A.1)We can prove (A.1) by induction. For n = 0, we have identity PK = PK . Suppose for n = k , we have PK ( QK ) k = PK ( k +1) − k (cid:88) i =1 PK ( QK ) i − PK ( k − i +1) Then for n = k + 1, we have PK ( QK ) k +1 = PK ( QK ) k K − PK ( QK ) k PK = PK ( k +1) K − k (cid:88) i =1 PK ( QK ) ( i − PK ( k − i +1) K − PK ( QK ) k PK = PK ( k +2) − k +1 (cid:88) i =1 PK ( QK ) ( i − PK ( k − i +2) . By mathematical induction, the statement (A.1) holds for all 0 ≤ n ∈ N . Applying operator identity (A.1) onthe observable u (0) and then using the definition (37) and (38), we can get the recurrence relation (39). For M -dimensional finite-rank projection (7), using the same trick we get the following matrix form recurrencerelation: M n = Γ n − n − (cid:88) i =1 Γ n − i M i (A.2)where M n , Γ n are M × M dimensional matrix, defined as PK ( QK ) ( n − u (0) = M n u (0) , PK n u (0) = Γ n u (0) . where u (0) = [ u (0) , u (0) , · · · , u M (0)] T is the (initial) vector of quantities of interest and Ran( P ) = { u i (0) } Mi =1 . Appendix B. First-principle algorithm to calculate γ n for stochastic FPU chain The notaion used in this section follows exactly from [34]. We first note that for a system with potentialenergy given by a polynomial function, the action of the n -th operator power K n on a polynomial observable u ( x ) yields a polynomial function. Take u ( x ( t )) = x j as an example, this implies K n x j = (cid:88) b i ∈ B ( n ) a ( n ) b i x m ( i ) k k · · · x m ( i ) kr k r ⇒ K n +1 x j = KK n x j = (cid:88) b i ∈ B ( n +1) a ( n +1) b i x m ( i ) k k · · · x m ( i ) kr k r . (B.1)14here { a ( n ) b i } are polynomial coefficients and { m ( i ) k j } are polynomial exponents. At this point, it is con-venient to define the set of polynomial exponents B ( n ) = { b , b , · · · } , the set polynomial coefficients A ( n ) = { a ( n ) b , a ( n ) b , · · · } , and the combined index set I ( n ) = { A ( n ) , B ( n ) } . Clearly, I ( n ) identifies uniquelythe polynomial (B.1), i.e., there is a one-to-one correspondence between I ( n ) and K n x j . If we can computethe mapping I ( n ) K −→ I ( n +1) , induced by the action of the Kolmogorov operator K to the polynomial (B.1)(represented by I ( n ) ), then we can compute the exact series expansion of K n x j for arbitrary n . The wholeprocess can be represented as u ( x ) → K u ( x ) → K u ( x ) → · · · → K n u ( x ) ⇐⇒ I (0) K −→ I (1) K −→ I (2) K −→ · · · K −→ I ( n ) where ⇐⇒ represents the translation between the action of K on the observables and its action on theindex set I ( n ) . It is left to determine the updating rule of I ( n ) for the Langevin dynamics of the FPUchain. Suppose we are interested in the distance between the oscillators j and j −
1, i.e., in the polynomialobservable u ( p , r ) = r j . Using the formal definition of the Kolmogorov operator (49), the action of K n on r j can be explicitly written as K n r j = (cid:88) b i ∈ B ( n ) a ( n ) b i r m ( i ) k k · · · r m ( i ) ku k u p s ( i ) l l · · · p s ( i ) lv l v , (B.2)where { k , . . . , k u } and { l , . . . , l v } are the relevant degrees of freedom for r and p at iteration n . We canexplicitly compute the sets of such relevant degrees of freedom as K r ( n, j ) = (cid:110) j − (cid:106) n (cid:107) , . . . , j + (cid:106) n (cid:107)(cid:111) L p ( n, j ) = (cid:26) j − (cid:22) n + 12 (cid:23) , . . . , j + (cid:22) n − (cid:23)(cid:27) , (B.3)The action of the Kolmogorov operator on each monomial appearing in (B.2) can be written as K r m ( i ) ku k r m ( i ) ku k u p s ( i ) l l · · · p s ( i ) lv l v = (cid:88) v ∈ K r ( n,j ) (cid:88) h ∈ L p ( n,j ) ( L r v + L p h + S p h ) r m ( i ) k k · · · r m ( i ) ku k u p s ( i ) l l · · · p s ( i ) lv l v , (B.4)where L r v = 1 m ( p v − p v − ) ∂∂r v L p h = (cid:2) α ( r h +1 − r h ) + θ (cid:0) r h +1 − r h (cid:1)(cid:3) ∂∂p h S p h = γp h m ∂∂p h − γβ ∂ ∂p h . The action of L r v , L p h and S p h on the monomial r m ( i ) k k · · · r m ( i ) ku k u p s ( i ) l l · · · s ( i ) lv l v can be explicitly computed. Thisyields explicit linear maps of the polynomial exponents b i = [ m ( i ) , s ( i ) ] , m ( i ) = [ m ( i ) k , . . . , m ( i ) k u ] , s ( i ) = [ s ( i ) l , . . . , s ( i ) l v ] , (B.5)and polynomial coefficients a ( n ) b i . With such maps available, we can transform the combined index set I ( n ) (representing K n r j ) to I ( n +1) (representing K n +1 r j ). Specifically, we obtain I ( n +1) = I ( n +1) L r (cid:93) I ( n +1) L p (cid:93) I ( n +1) S p , I ( n +1) L r = (cid:93) v ∈ K r ( n,j ) B ( n ) (cid:93) i =1 1 (cid:93) k =0 (cid:110) m ( i ) v ( − k a ( n ) b i , [ m ( i ) − e v , s ( i ) + e v − k ] (cid:111) , I ( n +1) S p = (cid:93) h ∈ L p ( n,j ) B ( n ) (cid:93) i =1 (cid:26) { γs ( i ) h a ( n ) b i , − γβ s ( i ) h ( s ( i ) h − a ( n ) b i } , { [ m ( i ) , s ( i ) ] , [ m ( i ) , s ( i ) − e h ] } (cid:27) , I ( n +1) L p = (cid:93) h ∈ L p ( n,j ) B ( n ) (cid:93) i =1 1 (cid:93) k =0 (cid:110) { s ( i ) h ( − k +1 αa ( n ) b i , s ( i ) h ( − k +1 γa ( n ) b i } , { [ m ( i ) + e h + k , s ( i ) − e h ] , [ m ( i ) + 3 e h + k , s ( i ) − e h ] } (cid:111) . (B.6)On the other hand, since K ∗ eq = −L ( p , r ) + S ( p ). It is easy to obtain the updating rule for the correspondingindex set I ∗ ( n ) from the formal expression (B.6). With these results available, we can immediately determinethe coefficients γ j in (38) by averaging over the probability density ρ eq as γ n = (cid:104)K n r j , r j (cid:105) eq (cid:104) r j , r j (cid:105) eq = (cid:104)K n r j , K ∗ n eq r j (cid:105) eq (cid:104) r j , r j (cid:105) eq , n is even (cid:104)K n +12 r j , K ∗ n − eq r j (cid:105) eq (cid:104) r j , r j (cid:105) eq , n is odd (B.7)Using formula (38), (39) and the exact expression of the polynomial Φ n ( QKQ ), we can get the expansioncoefficient k n in (36). References [1] A. D. Baczewski and S. D. Bond. Numerical integration of the extended variable generalized Langevinequation with a positive Prony representable memory kernel.
J. Chem. Phys , 139(4):044107, 2013.[2] M. Berkowitz, J. D. Morgan, D. J. Kouri, and J. A. McCammon. Memory kernels from moleculardynamics.
J. Chem. Phys , 75(5):2462–2463, 1981.[3] A. J. Chorin, O. H. Hald, and R. Kupferman. Optimal prediction and the Mori-Zwanzig representationof irreversible processes.
Proc. Natl. Acad. Sci. USA , 97(7):2968–2973, 2000.[4] W. Chu and X. Li. The Mori-Zwanzig formalism for the derivation of a fluctuating heat conductionmodel from molecular dynamics. arXiv:1709.05928 , 2017.[5] R. Kupferman D. Givon and O. H. Hald. Existence proof for orthogonal dynamics and the Mori-Zwanzigformalism.
Isr. J. Math. , 145(1):221–241, 2005.[6] J. M. Dominy and D. Venturi. Duality and conditional expectations in the Nakajima-Mori-Zwanzigformulation.
J. Math. Phys , 58(8):082701, 2017.[7] J. P. Eckmann and M. Hairer. Non-equilibrium statistical mechanics of strongly anharmonic chains ofoscillators.
Commun. Math. Phys. , 212(1):105–164, 2000.[8] J. P. Eckmann and M. Hairer. Spectral properties of hypoelliptic operators.
Commun. Math. Phys. ,235(2):233–253, 2003.[9] J. P. Eckmann, C. A. Pillet, and L. Rey-Bellet. Non-equilibrium statistical mechanics of anharmonicchains coupled to two heat baths at different temperatures.
Commun. Math. Phys. , 201(3):657–697,1999.[10] P. Espa˜nol. Hydrodynamics from dissipative particle dynamics.
Phys. Rev. E , 52(2):1734, 1995.1611] P. Espa˜nol and P. Warren. Statistical mechanics of dissipative particle dynamics.
EPL , 30(4):191, 1995.[12] S. K. J. Falkena, C. Quinn, J. Sieber, J. Frank, and H. A. Dijkstra. Derivation of delay equation climatemodels using the Mori- Zwanzig formalism.
Proc. R. Soc. A , 475, 2019.[13] D. Funaro.
Polynomial approximation of differential equations , volume 8. Springer Science & BusinessMedia, 2008.[14] F. Grogan, H. Lei, X. Li, and N. A. Baker. Data-driven molecular modeling with the generalizedLangevin equation.
J. Comput. Phys. , 418:109633–109641, 2020.[15] B. Helffer and F. Nier.
Hypoelliptic estimates and spectral theory for Fokker-Planck operators and WittenLaplacians . Springer, 2005.[16] F. H´erau and F. Nier. Isotropic hypoellipticity and trend to equilibrium for the Fokker-Planck equationwith a high-degree potential.
Arch. Ration. Mech. Anal , 171(2):151–218, 2004.[17] P. E. Kloeden and E. Platen.
Numerical solution of stochastic differential equations , volume 23. SpringerScience & Business Media, 2013.[18] H. Lei, N.A. Baker, and X. Li. Data-driven parameterization of the generalized Langevin equation.
Proc. Natl. Acad. Sci. , 113(50):14183–14188, 2016.[19] Z. Li, , X. Bian, X. Li, and G. E. Karniadakis. Incorporation of memory effects in coarse-grainedmodeling via the Mori-Zwanzig formalism.
J. Chem. Phys , 143:243128, 2015.[20] Z. Li, H. S. Lee, E. Darve, and G. E. Karniadakis. Computing the non-Markovian coarse-grainedinteractions derived from the Mori-Zwanzig formalism in molecular systems: Application to polymermelts.
J. Chem. Phys , 146:014104, 2017.[21] K. K. Lin and F. Lu. Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism. arXiv preprint arXiv:1908.07725 , 2019.[22] F. Lu, K. K. Lin, and A. J. Chorin. Data-based stochastic model reduction for the Kuramoto–Sivashinskyequation.
Physica D , 340:46–57, 2017.[23] H. Mori. Transport, collective motion, and Brownian motion.
Prog. Theor. Phys. , 33(3):423–455, 1965.[24] E. J. Parish and K. Duraisamy. Non-Markovian closure models for large eddy simulations using theMori-Zwanzig formalism.
Phys. Rev. Fluids , 2(1):014604, 2017.[25] G. A. Pavliotis.
Stochastic processes and applications: diffusion processes, the Fokker-Planck andLangevin equations , volume 60. Springer, 2014.[26] H. Risken.
The Fokker-Planck equation: methods of solution and applications . Springer-Verlag, secondedition, 1989. Mathematics in science and engineering, vol. 60.[27] P. Stinis. Stochastic optimal prediction for the Kuramoto–Sivashinsky equation.
Multiscale Modeling& Simulation , 2(4):580–612, 2004.[28] R. Tibshirani. Regression shrinkage and selection via the Lasso.
J. Royal Stat. Soc , 58(1):267–288,1996.[29] D. Venturi and G. E. Karniadakis. Convolutionless Nakajima-Zwanzig equations for stochastic analysisin nonlinear dynamical systems.
Proc. R. Soc. A , 470(2166):1–20, 2014.[30] D. Venturi, T. P. Sapsis, H. Cho, and G. E. Karniadakis. A computable evolution equation for thejoint response-excitation probability density function of stochastic dynamical systems.
Proc. R. Soc. A ,468(2139):759–783, 2012. 1731] Y. Yoshimoto, I. Kinefuchi, T. Mima, A. Fukushima, T. Tokumasu, and S. Takagi. Bottom-up construc-tion of interaction models of non-Markovian dissipative particle dynamics.
Phys. Rev. E , 88(4):043305,2013.[32] Y. Zhu, J. M. Dominy, and D. Venturi. On the estimation of the Mori-Zwanzig memory integral.
J.Math. Phys , 59(10):103501, 2018.[33] Y. Zhu and D. Venturi. Faber approximation of the Mori-Zwanzig equation.
J. Comp. Phys. , (372):694–718, 2018.[34] Y. Zhu and D. Venturi. Generalized langevin equations for systems with local interactions.
J. Stat.Phys , pages 1–31, 2020.[35] Y. Zhu and D. Venturi. Hypoellipticity and the Mori-Zwanzig formulation of stochastic differentialequations. arXiv preprint arXiv:2001.04565 , 2020.[36] R. Zwanzig. Memory effects in irreversible thermodynamics.
Phys. Rev , 124(4):983, 1961.[37] R. Zwanzig. Nonlinear generalized Langevin equations.