Edgeworth expansions for slow-fast systems with finite time scale separation
rrspa.royalsocietypublishing.org
Research
Article submitted to journal
Subject Areas:
Keywords: multi-scale systems, homogenization,Edgeworth expansion, stochastic limitsystems
Author for correspondence:
Jeroen Wouterse-mail: [email protected]
Edgeworth expansions forslow-fast systems with finitetime scale separation
Jeroen Wouters , Georg A. Gottwald Department of Mathematics and Statistics, Universityof Reading, Reading, United Kingdom Niels Bohr Institute, University of Copenhagen,Copenhagen, Denmark School of Mathematics and Statistics, University ofSydney, NSW 2006, Australia
We derive Edgeworth expansions that describecorrections to the Gaussian limiting behaviour ofslow-fast systems. The Edgeworth expansion isachieved using a semi-group formalism for thetransfer operator, where a Duhamel-Dyson seriesis used to asymptotically determine the correctionsat any desired order of the time scale parameter ε . The corrections involve integrals over higher-order auto-correlation functions. We develop adiagrammatic representation of the series to controlthe combinatorial wealth of the asymptotic expansionin ε and provide explicit expressions for the first twoorders. At a formal level, the expressions derivedare valid in the case when the fast dynamics isstochastic as well as when the fast dynamics isentirely deterministic. We corroborate our analyticalresults with numerical simulations and show that ourmethod provides an improvement on the classicalhomogenization limit which is restricted to the limitof infinite time scale separation.
1. Introduction
Many systems in the natural sciences feature a timescale separation between slowly and rapidly evolvingvariables. Examples range from molecular drug designwhere a protein interacts with a target molecule inthe presence of a rapidly fluctuating environment [1],to climate dynamics where the slowly evolving oceandynamics is driven by rapidly evolving atmosphericweather systems [2].Such systems can often be modeled by multi-scalesystems of the form © The Author(s) Published by the Royal Society. All rights reserved. a r X i v : . [ n li n . C D ] F e b r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... d x = 1 ε f ( x, y )d t + f ( x, y )d t (1.1) d y = 1 ε g ( y )d t + 1 ε β ( y )d W t + 1 ε g ( x, y )d t. (1.2)with x ∈ R d , y ∈ R m and β ∈ R m × l and l -dimensional Wiener process W t . The fast dynamics canbe stochastic with β (cid:54) = 0 or deterministic with β ≡ . In the stochastic case we assume that thefast dynamics d y = g d t + β d W t is an ergodic process with an absolutely continuous measure µ and the full system (1.1)-(1.2) admits an absolutely continuous measure µ ( ε ) . In the purelydeterministic case β ≡ we assume that the fast dynamics d y = g d t admits a unique invariantphysical measure µ on R m and the full system (1.1)-(1.2) admits a unique invariant physicalmeasure µ ( ε ) on R d + m . Here ε (cid:28) denotes the degree of time scale separation between theslow and fast variables, x and y , respectively. Often only the slow variables are of interest in suchsystems, and one seeks reduced equations for the slow dynamics only.In the limit of infinite time scale separation ε → , closed diffusion equations for the slowvariables can be obtained by the method of homogenization [3–9]. The diffusive behaviouremerges as the integrated effect of the fast dynamics, reminiscent of the summation of randomvariables in the central limit theorem (CLT). This method applies to slow-fast systems, wherethe fast dynamics can be either stochastic or deterministic with β ≡ . In the deterministic casethe fast dynamics needs to be sufficiently chaotic. If the leading order slow vector field averagesto zero, i.e. if (cid:82) f ( x, y ) µ (d y ) = 0 , the slow dynamics is approximated on time scales of O (1) bya stochastic differential equation; see [5–7] for the stochastic case and [10–12] and [13–15] forthe deterministic case. Homogenization has been used to design efficient numerical multi-scaleintegrators such as equation-free projection [16,17] and the heterogeneous multi-scale method[18,19], and has been used for stochastic parameterization in the climate sciences [20–26].In realistic physical systems, however, the time scale separation is always finite. In systemswithout a clear time scale separation classical homogenization theory may fail and may notbe able to reliably approximate the stochastic long-time behaviour of the slow dynamics. Forfinite values of ε there is an intricate feedback between the evolution of the slow x variablesand the fast y variables which prevents the approximation of the integrated fast dynamics byBrownian motion. Similar issues arise in the CLT, where the Gaussian distribution is only anaccurate approximation for sums with a sufficiently large number of summands. For the CLTtechniques exist to obtain a more accurate description of the distribution of finite sums thanprovided by the limiting Gaussian. A classical technique is the Edgeworth expansion, whichprovides an expansion of the distributions of sums, asymptotic in / √ n , where n is the lengthof the sum [27,28].In the case of a multi-scale system such as (1.1)-(1.2) the small parameter controlling the limitis now ε , instead of / √ n , and the random variables converging to a Gaussian are the incrementsover time of the slow variable x . The multi-scale system (1.1)-(1.2) features three distinct timescales [29]: the fast time scale of O ( ε ) , an intermediate time scale of O ( ε ) on which the slowdynamics is trivial but the fast dynamics has equilibrated, and the long diffusive time scale of O (1) on which the slow dynamics exhibits nontrivial diffusive behaviour. The corrections toGaussianity occur on the intermediate time scale; it is sufficiently long for the integrated noise on x to become nearly Gaussian, but not long enough for the slow dynamics to dominate. To focuson the statistical behaviour on the intermediate time scale we consider the transition probabilities An ergodic measure is called physical if for a set of initial conditions of nonzero Lebesgue measure the temporal average ofa continuous observable converges to the spatial average over this measure. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... of the slow variable xπ ε (x , t, x ) = P (cid:18) x ( t ) − x (0) √ t ∈ (x , x + dx) (cid:12)(cid:12)(cid:12)(cid:12) x (0) = x , y (0) ∼ µ ( ε ) x (cid:19) , where µ ( ε ) x is the invariant measure of (1.1)-(1.2) conditioned on x = x and x ( t ) the slow variableof a solution of the multi-scale system (1.1)-(1.2).Homogenization dictates that on the intermediate time scale and in the limit ε → of infinitetime scale separation, π ε ( t = ε ) becomes Gaussian. We will refer to this convergence as the CLTin the context of slow-fast systems. For small but finite ε , the deviations from Gaussianity of π ε will be small. We can therefore expand π ε (x , ε, x ) in √ ε . This expansion is the equivalent of theclassical Edgeworth expansion for slow-fast systems.Whereas the limiting Gaussian probability, implied by homogenization theory, onlyinvolves the two-time statistics, higher-order Edgeworth expansions involve higher-order timecorrelations, containing more information about the dynamics. This constitutes our main result: (a) Main result We assume that the fast dynamics of the multi-scale system (1.1)-(1.2) decorrelates sufficientlyrapidly, such that higher-order correlation functions are integrable and all expectation valuesappearing in the formulae below exist. Furthermore we assume that the conditional invariantmeasure µ ( ε ) x obeys linear response w.r.t ε , and (cid:82) f µ ( dy ) = 0 . Then the expansion is given up to O ( ε ) in the limit ε → with t = ε (cid:28) by π ε (x , t = ε, x ) = n ,σ (x) √ ε c (1) σ H (cid:16) x σ (cid:17) + c (3) σ H (cid:16) x σ (cid:17) (1.3) + ε c (2)1 + c (1) σ H (cid:16) x σ (cid:17) + c (4)1 + 4 c (1) c (3) σ H (cid:16) x σ (cid:17) + c (3) σ ) H (cid:16) x σ (cid:17) + O ( ε ) , with c ( p ) = c ( p )0 , + c ( p )1 , − and c ( p )1 = c ( p )0 , + c ( p )1 , + c ( p )2 , − , with expressions given in equations (4.9),(4.11), (4.12), (4.13), (4.15), (4.16), (4.18), (4.19) and (4.20), and where H n (x) = (x − ddx ) n areHermite polynomials of degree n .Establishing this expansion involves expanding the first four cumulants c ( p ) of ( x ( t ) − x (0)) / √ t with p ≤ , resulting in their expansion coefficients c ( p ) k with k ∈ { , } (see Section4). These coefficients only involve the leading order measure µ (0) x = µ and, in particular, do notinvolve the linear response correction to µ ( ε ) x . It should be noted that the expressions for thecumulant expansion coefficients c ( p )( i,j ) as derived below determine the functional form of theexpansion, but are not sufficient to show that an Edgeworth expansion holds for a given classof dynamical systems. (b) Plan of the paper The paper is organized as follows. In Section 2 we briefly review homogenization theoryhighlighting the rôle of infinite time scale separation in convergence of transition probabilities toGaussian distributions. Section 3 reviews Edgeworth expansions for discrete stochastic systems.Section 4 introduces Edgeworth expansions and finite time scale separation corrections to theCLT for dynamical multi-scale systems and presents the explicit expression of the Edgeworth r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... expansion; the lengthy and involved derivation is provided in the Appendix A. In Section 5we present a numerical example corroborating our analytical main result for the transitionprobability and show that the Edgeworth expansion improves on the CLT. We conclude witha summary and an outlook in Section 6.
2. Homogenization
Homogenization describes the integrated effect of fast (either stochastic or deterministic chaotic)dynamics on slow variables as noise. Initially developed for stochastic multi-scale systems [5–7],homogenization has been extended recently to deterministic multi-scale systems when the fastdynamics evolves on a compact attractor Λ ⊂ R m with an ergodic invariant measure µ . It wasshown rigorously that for sufficiently chaotic fast dynamics the emergent stochastic long timebehaviour of the slow dynamics is given by stochastic differential equations driven by Brownianmotion [10–12,30]. While homogenization for stochastic systems has been proven in a very generalsetting, the results in the deterministic case are so far limited to the skew product case with g = 0 . The assumptions on the chaoticity of the fast subsystem are mild, including Axiom Adiffeomorphisms and flows, Hénon-like attractors and Lorenz attractors.The following heuristic argument serves to show how the diffusive behaviour of the slowdynamics is linked to the CLT for time-integrated stationary processes in the case when thedynamical system is entirely deterministic. Consider the simplified version of the multi-scalesystem (1.1)-(1.2) with x ∈ R and y ∈ R m ˙ x = 1 ε f ( y )˙ y = 1 ε g ( y ) . Integrating the slow dynamics leads to x ( t ) = x (0) + 1 ε (cid:90) t f ( y ( s )) d s. Transforming the integrand to the fast time scale τ = s/ε we obtain x ( t ) = x (0) + W ε ( t ) , where we introduced W ε ( t ) = ε (cid:90) tε f ( y ( τ )) d τ, (2.1)with rescaled fast dynamics ˙ y = g ( y ) . For (cid:82) f ( y ) µ (d y ) = 0 the integral is collecting weaklydependent variables with mean zero, provided the fast dynamics is sufficiently chaotic. Theintegral term in (2.1) is formally of the form of the CLT where n = 1 /ε terms are integratedand then scaled by / √ n = ε , and, assuming the CLT holds for y , converges weakly to Brownianmotion W t on the long diffusive time scale t = O (1) . The slow dynamics is approximated by d X = σ d W t . By explicitly calculating lim ε → (cid:82) Λ µ (d y ) W ε ( t ) , we obtain a Green-Kubo formula for thevariance with σ = (cid:90) ∞ d s (cid:90) Λ µ (d y ) f ( y ) f ( ϕ s y ) , where ϕ s denotes the flow map of the fast dynamics ˙ y = g ( y ) . Note that in the deterministiccase, randomness is only introduced through the choice of the initial condition y (0) .This heuristic argument can be made rigorous. It can be shown that a functional central limittheorem exists and solutions of the multi-scale system (1.1)-(1.2) converge weakly to solutions the r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... homogenized Itô stochastic differential equation dX = F ( X )d t + σ ( X ) d W t , (2.2)where W t denotes l -dimensional Brownian motion. The drift coefficient F : R d → R d is given by F ( x ) = (cid:90) f ( x, y ) µ (d y ) + (cid:90) ∞ d s (cid:90) E (cid:104) f ( x, y ) · ∂ x f ( x, ϕ t y ) (cid:105) µ (d y )+ (cid:90) ∞ d s (cid:90) E (cid:104) g ( x, y ) · ∂ y (cid:16) f ( x, ϕ t y ) (cid:17)(cid:105) µ (d y ) , (2.3)with µ the ergodic invariant measure corresponding to d y = g ( y )d t + β ( y )d W t and E theexpectation value w.r.t. the Wiener measure on W t in the stochastically driven case ( β (cid:54) = 0 ). Thediffusion coefficient σ : R d → R d × l is given by the Green-Kubo formula σ ( x ) σ T ( x ) = (cid:90) ∞ d s (cid:90) E (cid:16) f ( x, y ) ⊗ f ( x, ϕ t y ) + f ( x, ϕ t y ) ⊗ f ( x, y ) (cid:17) µ (d y ) , (2.4)where the outer product between two vectors is defined as ( a ⊗ b ) ij = a i b j . These expressionsfor the drift and diffusion can be derived formally by an asymptotic expansion of the backwardKolmogorov equation . We remark that one can add a stochastic driver to the slow dynamicsin (1.1) which would lead to an additively increased diffusion (2.4) (cf. [52]). For simplicity ofexposition we do not consider this case here.In the deterministic case β = 0 , homogenization results [10–12] assure that the family ofsolutions of the original multi-scale dynamical system converge weakly in the sup-norm topologyto the unique solution X of the reduced stochastic differential equation (2.2) as ε → . At the heartof diffusive limits of deterministic dynamical multi-scale systems lies a functional CLT whichassures that W ε ( t ) → w W ( t ) in C ([0 , ∞ ) , R d ) in the limit ε → . The functional CLT implies theCLT but not vice versa. We are, however, only aware of a single example where a judiciouslychosen unbounded observable of a deterministic dynamical system satisfies a CLT but not theassociated functional CLT [31].In the following we evaluate corrections to the CLT for increments of x by probing the finite ε corrections of the transition probability in an Edgeworth expansion.
3. The Edgeworth expansion and corrections to the central limittheorem
Before introducing Edgeworth expansions for continuous time multi-scale systems, it isinstructive to briefly review the case of Edgeworth expansions in the discrete time stochastic case.The central limit theorem describes when appropriately scaled sums S n = 1 √ n n (cid:88) i =1 ( y i − m ) of variables y i with mean m and variance σ converge in distribution to a normal distribution inthe sense that P ( a ≤ S n ≤ b ) → √ πσ (cid:90) ba e − s σ d s as n → ∞ [28,32]. It is valid for i.i.d. random variables, weakly dependent random variables [28,32], as well as for a large class of dynamical systems [13,33–36]. Edgeworth expansions describedeviations from the CLT for finite n . We briefly review in the next subsection the well studied caseof Edgeworth expansions for stochastic random variables. Strictly speaking, these formulae are only valid for correlation functions which are slightly more than integrable. For fastsystems with decaying autocorrelation functions which are only integrable, one can find expressions for the drift and diffusioncoefficients which are, however, more complicated; see [12] for details. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... (a) Edgeworth expansions for stochastic random variables In the stochastic context Edgeworth expansions are usually derived in the spectral framework bystudying the characteristic function χ n ( ξ ) = E [exp( iξS n )] of the random variable S n ( E denotesin this subsection the expectation with respect to µ , the distribution of y ). The characteristicfunction χ n is related to the cumulants c ( j ) n of S n by χ n ( ξ ) = exp( (cid:80) ∞ j =1 ( iξ ) j j ! c ( j ) n ) . Assuming that c (1) n = E [ S n ] = 0 , we have for weakly dependent random variables with uniform or strong mixingproperties that c (2) n = c (2) ∞ + n δc (2) + o ( n − ) , c (3) n = √ n c (3) ∞ + o ( n − ) and c (4) n = n c (4) ∞ + o ( n − ) ,while higher cumulants are of higher order in / √ n . Specifically, up to order o ( n − / ) theexpansions of the cumulants are related to the correlation functions of the parent randomvariables y i by c (2) n = σ − n ∞ (cid:88) j =1 j E [ y y j +1 ] √ nc (3) n = E [ y ] + 3 ∞ (cid:88) j =1 (cid:16) E [ y y j +1 ] + E [ y y j +1 ] (cid:17) + 6 ∞ (cid:88) i,j =1 E [ y y i y i + j ] with σ = E [ y ] + 2 (cid:80) ∞ j =1 E [ y y j ] given by the Green-Kubo formula [37]. The above equationsimplicitly define the terms c (2) ∞ , δc (2) and c (3) ∞ . Inserting the expansions of the cumulants into thecharacteristic function, we obtain χ n ( ξ ) = exp (cid:18) c (2) n ( iξ )
2! + c (3) n ( iξ )
3! + c (4) n ( iξ )
4! + . . . (cid:19) = exp (cid:18) √ n c (3) ∞ ( iξ )
3! + 1 n (cid:18) δc (2) ( iξ )
2! + c (4) ∞ ( iξ ) (cid:19) + O (cid:18) n / (cid:19)(cid:19) × exp (cid:18) − c (2) ∞ ξ (cid:19) = (cid:18) √ n c (3) ∞ ( iξ )
3! + 1 n (cid:18) δc (2) ( iξ )
2! + c (4) ∞ ( iξ ) (cid:19) + 12 (cid:18) √ n c (3) ∞ ( iξ ) (cid:19) + O (cid:18) n / (cid:19)(cid:33) × exp (cid:18) − c (2) ∞ ξ (cid:19) . The characteristic function therefore converges pointwise to exp( − c (2) ∞ ξ / , which is thecharacteristic function of a Gaussian with variance c (2) ∞ = σ . Since, by Lévy’s continuity theorem,pointwise convergence of the characteristic functions is equivalent to convergence in distribution,the CLT follows. An expansion in √ n of the probability density function (pdf) of S n is obtainedby the inverse Fourier transform of the characteristic function χ n ( ξ ) . Under the inverse Fouriertransform the terms ( iξ ) k in the expansion become k -th derivatives of the normal distribution, i.e.Hermite polynomials. Let ρ n be the pdf of the normalized Birkhoff sum S n . Then, the first twoEdgeworth approximations ρ (1) n and ρ (2) n of the probability density ρ n of S n are ρ (1) n (x) = n ,σ (x) (cid:32) c (3) ∞ σ √ n H (cid:16) x σ (cid:17)(cid:33) , (3.1)and ρ (2) n (x) = n ,σ (x) c (3) ∞ σ √ n H (cid:16) x σ (cid:17) + δc (2) σ n H (cid:16) x σ (cid:17) + c (4) ∞ σ n H (cid:16) x σ (cid:17) + c (3) ∞ σ n H (cid:16) x σ (cid:17) , where H k is the k -th Hermite polynomial. We have ρ n (x) = ρ (1) n (x) + o (cid:16) √ n (cid:17) and ρ n (x) = ρ (2) n (x) + o (cid:0) n (cid:1) uniformly in x [28]. The Edgeworth approximation generally yields an improvedapproximation of the pdf around the mean of the distribution [38]. Note that ρ ( p ) n is nolonger nonnegative or normalized and therefore the approximation ρ ( p ) n is no longer a r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... probability density function. In contrast to the Gaussian distribution n ,σ (x) , the first Edgeworthapproximation may have a non-zero third moment c (3) ∞ / √ n , which vanishes as n → ∞ . Higherorder approximations ρ ( p ) n can be derived by increasing the order of the Taylor series.Interestingly, the functional form of the expansion is universal, in the sense that the parentprocess y i only enters through the asymptotic cumulants c (2) ∞ = σ , δc (2) , c (3) ∞ and c (4) ∞ .In the stochastic context, Edgeworth expansions have been obtained for time series such asARMA processes [39], continuous-time diffusions [40] and, employing the Nagaev-Guivarc’hmethod for the characteristic function, for ergodic Markov chains [41]. Edgeworth expansionshave, to the best of our knowledge, not been explored in the multi-scale context. The next sectionsaim at filling this gap.
4. Edgeworth expansions for continuous-time multi-scale systems
We now determine Edgeworth corrections for multi-scale systems. The formulae will be given interms of the generator of the fast process L = g ( y ) · ∂ y + 12 β ( y ) β T ( y ) : ∂ y · ∂ y (4.1)with L -adjoint L ∗ which acts as L ∗ ν = − ∂ y · ( g ( y ) ν ) + 12 ∂ y · ∂ y · (cid:16) β ( y ) β T ( y ) ν (cid:17) . (4.2)As described in the introduction, multi-scale systems (1.1)-(1.2) are characterized by three distincttime scales: the fast time scale of O ( ε ) , an intermediate time scale of O ( ε ) on which the slowdynamics is trivial but the fast dynamics has equilibrated, and the long diffusive time scale of O (1) on which the slow dynamics exhibits nontrivial diffusive behaviour. The particular rôleof the intermediate time scale to control the normality of the noise is formally reflected in thehomogenized stochastic limit system (2.2) which evolves on the diffusive time scale of O (1) ; since d W t scales like √ t the Brownian motion is dominant on time scales of O ( ε ) . On this time scale,the transition probability π h of the diffusive homogenized limit system (2.2) satisfies the CLT andconverges according to π h (x , t, x ) = P (cid:18) X ( t ) − x √ t ∈ (x , x + dx) (cid:12)(cid:12)(cid:12)(cid:12) X (0) = x (cid:19) → n ,σ (x) , where we let t → since t is of O ( ε ) and where X solves the homogenized stochastic differentialequation (2.2). To study deviations from the Gaussian behaviour of the limit ε → , we develop inAppendix A a semi-group formalism to calculate the Edgeworth expansion of the intermediate-time transition probabilities of the slow variable x of the multi-scale system (1.1)-(1.2) π ε (x , t, x ) = P (cid:18) x ( t ) − x √ t ∈ (x , x + dx) (cid:12)(cid:12)(cid:12)(cid:12) x (0) = x , y (0) ∼ µ ( ε ) x (cid:19) , for t = ε , and expand π ε (x , ε, x ) in √ ε . Here µ ( ε ) x ( y ) is the invariant measure µ ( ε ) ( x, y ) of (1.1)-(1.2) conditioned on x = x . By the Rokhlin disintegration theorem [42] the conditional measureis essentially unique and we furthermore assume that µ ( ε ) x obeys linear response w.r.t. ε , suchthat the conditional measure can be expanded in ε around µ . Whereas the coefficients in thehomogenized equation only involve the two-time statistics, higher-order Edgeworth expansionsinvolve higher-order time correlations, containing more information about the dynamics.As in the case of random variables described in the previous section, the Edgeworth expansionof the intermediate-time transition probability π ε can be calculated by the asymptotic expansionof its associated characteristic function which is entirely determined by the cumulants. Wetherefore set out to asymptotically calculate the p th -moments of the slow variables ( x ( t ) − x ) / √ t r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... in orders of √ ε m ( p ) = E x ,µ (cid:20)(cid:18) ˆ x ( t ) √ t (cid:19) p (cid:21) , (4.3)with ˆ x ( t ) = x ( t ) − x on the intermediate time scale t = ε and the corresponding cumulants c ( p ) .The conditional average E x ,µ is with respect to the product measure µ ( ε ) ( dx, dy ) = (cid:16) δ x × µ ( ε ) x (cid:17) ( dx, dy ) . (4.4)The measure µ ( ε ) x depends on ε for non-skew product systems where the fast dynamics dependson the slow dynamics. Assuming that µ ( ε ) x obeys linear response w.r.t. ε we expand the measure µ ( ε ) x as µ ( ε ) x = µ (0) x + εµ (1) x + O ( ε ) , (4.5)with µ (0) x = µ the invariant measure of the fast process d y = g ( y )d t + β ( y )d W t . We note that,while linear response has been proven for a wide class of deterministic and stochastic systemsand has been used for model reduction [43–45], counter-examples do exist [46–51].We seek expansions of the scaled moments m ( p ) ( t ) in ε and in t . We take the limit t = ε (cid:28) and t/ε → ∞ as ε → . In this limit the random variable ˆ x ( t ) / √ t converges to a normal distributionand the coefficients of the different powers of √ ε appearing in the expansions of its cumulantswill provide the desired Edgeworth corrections coefficients. We expand the p th moments as m ( p ) = m ( p )0 + (cid:88) | α | > ε α ε t α t m ( p ) α , (4.6)where we use multi-index notation to denote the expansions in ε and t with α = ( α ε , α t ) and with | α | = α ε + α t being the combined order of the contribution. We similarly expand the cumulant as c ( p ) = c ( p )0 + (cid:88) | α | > ε α ε t α t c ( p ) α . (4.7)Note that α ε,t < and α ε,t half-integer is allowed.Our expansion of the transition probability π ε (x , t = ε, x ) (1.3) involves the first four cumulants.The derivation of those cumulant expansions can be found in Appendix A. Here we only state theresulting formulae. The first cumulant is given up to order O ( ε ) by c (1) = √ t c (1)0 , ( x ) + R (1) ε , (4.8)where the remainders R ( j ) ε = (cid:80) | α | > ε α ε t α t c ( j ) α for j = 1 , · · · , p consist of higher order terms and c (1)0 , = F ( x ) = (cid:104) f (cid:105) − (cid:104) f L − ⊥ ∂ x f (cid:105) − (cid:104) ( g ∂ y ) L − ⊥ f (cid:105) , (4.9)recovering the drift coefficient F ( x ) of the homogenized equation (2.2) (cf. [3,52]). The angularbrackets denote the conditional average with respect to µ (0) x = µ , i.e. (cid:104) A ( x, y ) (cid:105) = (cid:82) A ( x, y ) µ (d y ) .The second cumulant and moment is given up to order O ( ε ) by c (2) = m (2) = c (2)0 + t c (2)0 , + ε t c (2)2 , − + ε c (2)1 , + R (2) ε . (4.10)The O (1) contribution is given by the homogenized Green-Kubo formula (2.4) c (2)0 = σ = − (cid:104) f L − ⊥ f (cid:105) r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... and higher-order contributions are given by c (2)0 , = 12 σ (cid:18) ∂σ∂x (cid:19) + 12 σ ∂ σ∂x + σ ∂F∂x + F σ ∂σ∂x + F (4.11) c (2)2 , − = − (cid:104) f L − ⊥ f (cid:105) (4.12) c (2)1 , = − (cid:104) f L − ⊥ f (cid:105) − (cid:104) f L − ⊥ f (cid:105) + 2 (cid:104) f L − ⊥ ∂ x f L − ⊥ f (cid:105) + 4 (cid:104) f L − ⊥ f L − ⊥ ∂ x f (cid:105) + 2 (cid:104) f L − ⊥ ( g ∂ y ) L − ⊥ f (cid:105) + 2 (cid:104) ( g ∂ y ) L − ⊥ f L − ⊥ f (cid:105) . (4.13)Here L − ⊥ denotes the invertible operator whose inverse is the restriction of L to the spaceorthogonal to the projection onto the invariant measure µ (0) x (see Section (i) in the Appendix formore details; cf. (A 9)). Recall that L , associated with the ergodic fast dynamics, has a nontrivialkernel, namely functions which are constant in y , and as such is noninvertible.The third moment and its cumulant are given up to order O ( ε ) by c (3) = m (3) = √ t c (3)0 , + ε √ t c (3)1 , − + R (3) ε (4.14)with c (3)0 , = 6 (cid:104) f L − ⊥ f (cid:105) ∂∂x (cid:104) f L − ⊥ f (cid:105) (4.15) c (3)1 , − = 6 (cid:68) f L − ⊥ f L − ⊥ f (cid:69) (4.16)and the fourth cumulant is given up to order O ( ε ) by c (4) = t c (4)0 , + ε c (4)1 , + ε t c (4)2 , − + R (4) ε , (4.17)with c (4)0 , = − (cid:104) f L − ⊥ f (cid:105) (cid:18) ∂∂x (cid:104) f L − ⊥ f (cid:105) (cid:19) − (cid:104) f L − ⊥ f (cid:105) ∂ ∂x (cid:104) f L − ⊥ f (cid:105) (4.18) c (4)1 , = − (cid:104) ∂∂x f L − ⊥ f L − ⊥ f (cid:105)(cid:104) f L − ⊥ f (cid:105) − (cid:104) f L − ⊥ f L − ⊥ f (cid:105) ∂∂x (cid:104) f L − ⊥ f (cid:105) (4.19) c (4)2 , − = 24 (cid:16) (cid:104) f L − ⊥ f (cid:105)(cid:104) f L − ⊥ f (cid:105) − (cid:104) f L − ⊥ f L − ⊥ f L − ⊥ f (cid:105) (cid:17) . (4.20)Higher cumulants give rise to terms of at least O ( ε ) . As expected from the CLT, the only O (1) contribution to the cumulants (4.8)-(4.17) appears in the second cumulant (4.10). The higher-orderterms determine the Edgeworth corrections. At O ( √ ε ) – describing the lowest order correctionto the CLT – the first and third cumulant (4.14) feature, and, if non-zero, give rise to skewness.Corrections to the variance and the fourth-order cumulant c (4) start to contribute at order O ( ε ) .Note that the feedback of the slow dynamics on the approach to Gaussianity via vector field f only appears in the corrections to the first moment.In this formulation of the cumulant expansion we have not yet substituted t = ε . This allowsus to separate the contributions which pertain in the limit ε → and encode the Edgeworthcorrections to the transition probabilities of the homogenized slow diffusion equation (2.2),namely those with α ε = 0 (see also [40]). r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Corollary 1.
For the transition probabilities π h (x , t, x ) of the homogenized system, we have the followingexpansion for small tπ h (x , t, x ) = n ,σ (x) √ t c (1)0 , σ H (cid:16) x σ (cid:17) + c (3)0 , σ H (cid:16) x σ (cid:17) (4.21) + t c (2)0 , + c (1)0 , σ H (cid:16) x σ (cid:17) + c (4)0 , + 4 c (1)0 , c (3)0 , σ H (cid:16) x σ (cid:17) + c (3)0 , σ ) H (cid:16) x σ (cid:17) + O ( t ) , where H n (x) = (x − ddx ) n are Hermite polynomials of degree n . The remaining contributions, (4.12),(4.13),(4.16),(4.19) and (4.20) involve intricate correlationsbetween the slow and the fast dynamics which do not vanish on the intermediate time scale for ε → and t (cid:54) = 0 . In particular the homogenized limit ε → does not involve the slow vector field f .In homogenization the knowledge of the drift F and the Green-Kubo diffusion σ is sufficientto determine the effective reduced stochastic slow dynamics (cf. (2.2)). Going beyond the CLTrequires higher-order Edgeworth corrections involving indefinite integrals over multi-pointcorrelation functions. It is instructive to note that the vector field f ( x, y ) nontrivially enters thecorrections to the variance (4.10). The detailed proof of our main result (1.3) can be found inAppendix A.
5. Numerical results
We now corroborate our main result on the explicit formula (1.3) for the transition probability π ε (x , t = ε, x ) with numerical simulations. We show that the Edgeworth approximation ofthe intermediate-time transition probability provides a much better approximation to the truetransition probability of the multi-scale system with finite time scale separation than theGaussian limiting distribution implied by the assumption of infinite-time-scale separation andhomogenization theory.We consider here as an example of the general multi-scale system (1.1)-(1.2), the followingskew-product slow-fast system for a one-dimensional slow variable ˙ x = 1 ε f ( y ) − V (cid:48) ( x ) (5.1)driven by a fast Lorenz system ˙ y = s l ε ( y − y ) (5.2) ˙ y = 1 ε ( y ( r l − y ) − y ) (5.3) ˙ y = 1 ε ( y y − b l y ) , (5.4)where we choose the classical parameters s l = 10 , r l = 28 and b l = 8 / [53]. The Lorenz system israpidly mixing with exponentially decaying correlation [54]. The fast variables drive the slowvariable x via a function f ( y ) = cos( y /
2) exp( y / − ¯ f , generating skewed deterministicnoise. Here ¯ f is a constant chosen such that the expectation value of f under the fast dynamicsis zero and the system (5.1)–(5.4) satisfies the centering condition. We consider here a harmonicpotential V ( x ) = αx (i.e. f = V (cid:48) ( x ) ). In the absence of coupling to the fast Lorenz systemthe slow dynamics settles to the stable fixed point x = 0 . With nontrivial coupling f , the fastdynamics induces fluctuations around the stable fixed point. An example of a slow trajectory r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... is given in Figure 1. On large enough time scales we observe seemingly stochastic behaviour(left-hand figure), similar to that of the limiting homogenized system. On shorter time scales, thesmoothness of the noise is however still discernible (right-hand figure). t − x .
00 0 .
05 0 .
10 0 .
15 0 .
20 0 . t − . − . − . . . x Figure 1.
Trajectory of the slow variable x of the slow-fast system (5.1) - (5.4) with α = 0 . and ε = 0 . . To numerically estimate the transition probability π ε (x , t = ε, x ) we perform ensembleaverages of the slow variable x ( t = ε ) using M = 10 long time simulations of × time unitsusing a Dormand-Prince order / Runge-Kutta method [55,56]. The simulations have the sameinitial condition x (0) = 1 of the slow variable but differ in the initial conditions for the fast Lorenzsystem which are chosen randomly from the Lorenz attractor (this is assured by letting randominitial conditions settle on the attractor after a transient period of time units).The limiting Gaussian transition probability π h can be analytically determined from thehomogenized limiting equation d X = − αX d t + σ d W which is an Ornstein-Uhlenbeck processwhere the diffusion coefficient σ is given by the Green-Kubo formula. The limiting Gaussiantransition probability has mean x exp( − αt ) and variance by (1 − exp( − αt )) σ / (2 α ) . TheEdgeworth corrections to this limiting distribution are given by (1.3) and the first four cumulants(4.8), (4.10), (4.14) and (4.17). The terms c (2)2 , (4.12), c (3)1 , (4.16) and c (4)2 , − (4.20) are estimatednumerically by setting the gradient force V (cid:48) ( x ) ≡ in the long time integrations. Setting f = V (cid:48) ( x ) ≡ implies that the Edgeworth expansion only involves (4.12), (4.16) and (4.20) whichin turn can be accurately estimated without being numerically dominated by the other termsinvolving f . For details on the calculations of these coefficients see [57].Figure 2 shows a comparison of the transition probabilities of the full deterministic multi-scalesystem with ε = 0 . , the limiting Gaussian implied by the homogenized limit and the transitionprobability given by the Edgeworth expansion. It is clearly seen that the the homogenized limitsystem is not able to capture the inherent skewness of the slow dynamics, whereas the Edgeworthapproximation capture the transition probability remarkably well. We also see that the Edgeworthapproximation is only a good approximation of the transition probability for the slow variablenear the mode and exhibits deviations for x -values far from the mean with unphysical negativevalues.
6. Summary and outlook
In this article we derived Edgeworth expansions that describe corrections to the Gaussian limitingbehaviour of slow-fast systems for finite time scale separation. The Edgeworth expansion isachieved using a semi-group formalism for the transfer operator, where a Duhamel-Dysonseries is used to asymptotically determine the corrections at any desired order of the time scaleparameter ε . The corrections appear on the intermediate time scale O ( ε ) and the asymptoticsrequires the limit t = ε (cid:28) and t/ε → ∞ . We developed a diagrammatic representation of r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... − . . . . . π ε multiscale LorenzhomogenizedEdgeworth Figure 2.
Transition probability π ε (x , t, x ) of the slow variable for the full multi-scale Lorenz system (5.1) - (5.4) with α = 0 . and ε = 0 . (blue solid), transition probability π h (x , t, x ) of the corresponding homogenized system (orangedashed) and π ε (x , t, x ) of the Edgeworth expansion (1.3) (green dotted). The transition probabilities are evaluated at t = 1 and were initiated at x = 1 . higher-order correlation integrals to control the combinatorial wealth of their asymptoticexpansion in ε . To obtain our explicit formula for the transition probability (1.3) we requiredmixing assumptions on the integrability of higher-order autocorrelation functions. It is pertinentto mention that homogenization theory does not rely on mixing and the diffusive limit equationsexist for non-mixing systems (albeit not with a Green-Kubo formula (2.4) for the diffusion),and similarly there are mixing systems which do not allow for a diffusive limit. We expect thatsimilarly one can derive an Edgeworth expansion without these strong mixing assumptions, atthe cost of not having compact explicit expressions for the drift and diffusion.Our work points to several applications and directions, planned for further research. We derivehere the Edgeworth expansion for continuous-time multi-scale systems in the case where atleading order the slow dynamics does not couple back into the fast dynamics, i.e. g = g ( y ) . Wealso expect a similar expansion to hold when g = g ( x, y ) and the slow dynamics couples backinto the fast dynamics at leading order. A complicating issue here is the potential breakdown oflinear response when the fast invariant measure does not depend smoothly on the slow variables.Edgeworth corrections to the CLT are not restricted to systems where the noise originates asthe accumulative effect of rapidly decorrelating fast variables, as we have described here for slow-fast systems. Sums of uncorrelated deterministic variables also appear in weak coupling limitswhere a distinguished degree of freedom is weakly coupled to a bath of N degrees of freedom.Here stochastic limit systems arise in the limit of an infinitely large bath with N → ∞ [3]. Again,corrections for finite values of N can be studied using an Edgeworth expansion.The universal form of the deviations from Gaussianity given by the Edgeworth expansionsuggests that one can devise stochastic parametrizations and effective diffusive dynamics for theslow variables for finite time scale separation by substituting the fast dynamics with a surrogatesystem with the same Edgeworth coefficients, improving on the classical homogenizationlimit SDEs. In particular, the universal character allows for a data-driven approach where thecumulants are numerically estimated to build a stochastic model for the observed variables.Data Accessibility. This article has no additional data.
Authors’ Contributions.
JW designed the research and performed the numerical experiments. All authorscontributed to the research and the writing of the paper.
Competing Interests.
There are no competing interests. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Funding.
The research leading to these results has received funding from the European Community’sSeventh Framework Programme (FP7/2007-2013) under grant agreement n° PIOF-GA-2013-626210. GAG ispartially supported by the ARC grant DP180101385.
Acknowledgements.
We thank Ben Goldys and Françoise Pène for enlightening discussions andcomments.
A. Appendix: Derivation of the cumulant expansion
Outline of the derivation:
The Edgeworth expansion involves an expansion of the cumulantsin orders of ε . The derivation of the Edgeworth expansion proceeds in a number of steps.We first consider the Laplace transform of the moments, when expressed in terms of theKoopman operator, and perform a subsequent expansion using eigenfunctions of the fastgenerator L (Section i). Introducing a two-dimensional diagrammatic representation we performa combinatorial accounting of the sequences appearing in the expansion of the Laplace transform(Section ii). In particular, this allows us to identify those sequences which vanish and thosewhich will give non-trivial contributions. This will allow us to provide the explicit expressionsof the cumulants c ( p ) with p ≤ stated in (4.8), (4.10), (4.14) and (4.17) to capture the Edgeworthcorrections up to O ( ε ) (Section iii). We then show that it is sufficient for the estimation of thecumulants up to O ( ε ) to perform averages with respect to the fast measure µ (0) x only, discardingcontributions from the linear response term µ (1) x (Lemma 1). We conclude with Lemma 2 showingthat higher cumulants c ( p ) with p ≥ do not contribute at order O ( ε ) to the Edgeworthexpansion and the transition probability π ε (x , t = ε, x ) . Derivation of the cumulant expansion:
We express the conditional average in (4.3) in terms ofthe Koopman operator e L t associated with the multi-scale system (1.1)-(1.2) with the generator L = 1 ε L + 1 ε L + L , (A 1)where L = g ( y ) · ∂ y + 12 β ( y ) β T ( y ) : ∂ y · ∂ y , L = f ( x, y ) · ∂ x + g ( x, y ) · ∂ y , L = f ( x, y ) · ∂ x . (A 2)We seek expansions in t and ε of the moments m ( p ) = 1 t p/ E x ,µ (cid:20) e t (cid:16) L ε + L ε (cid:17) z ( p ) ( x ) (cid:21) = 1 t p/ (cid:90) (cid:90) e t (cid:16) L ε + L ε (cid:17) z ( p ) ( x ) (cid:16) δ x × µ ( ε ) x (cid:17) (d x, d y ) , (A 3)where z ( p ) ( x ) = ( x − x ) p and where we introduce for convenience L = L + ε L . Using (4.5),we split the conditional average over the invariant measure µ ( ε ) (see (4.4)) according to m ( p ) = 1 t p/ δ x (cid:16) ( µ (0) x + εµ (1) x + O ( ε )) e t L z ( p ) ( x ) (cid:17) = 1 t p/ δ x (cid:16) A ( p ) + ε B ( p ) + O ( ε ) (cid:17) , (A 4)where we define A ( p ) ( t ) = (cid:28) e t (cid:16) L ε + L ε (cid:17) z ( p ) ( x ) (cid:29) and B ( p ) ( t ) = µ (1) x (cid:18) e t (cid:16) L ε + L ε (cid:17) z ( p ) ( x ) (cid:19) . As before, angular brackets denote the average with respect to µ (0) x .We first calculate the first term in (A 4) associated with the average over µ (0) x before provingthat averages with respect to µ (1) x do not contribute in Lemma 1. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... (i) Laplace transform of the moments and expansion using eigenfunctions of L We assume that the spectrum of the generator L is contained in the left-half complex plane, { z ∈ C : Re( z ) ≤ } (see [58] for Anosov flows). We can then take the Laplace transform of A ( p ) ( t ) L {A ( p ) } ( s ) = (cid:42)(cid:18) s − L ε − L ε (cid:19) − z ( p ) ( x ) (cid:43) (A 5)for Re( s ) > .Using the operator identity ( C − D ) − = C − + C − D ( C − D ) − = C − + C − DC − + C − DC − DC − + . . . (A 6)with C = s − L ε and D = L ε we have L {A ( p ) } ( s ) = (cid:42)(cid:32)(cid:18) s − L ε (cid:19) − + (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − + (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − + . . . (cid:33) z ( p ) (cid:43) = z ( p ) s + 1 s (cid:42)(cid:32) L ε + L ε (cid:18) s − L ε (cid:19) − L ε (A 7) + L ε (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − L ε + . . . (cid:33) z ( p ) (cid:43) , where we used in the last equality that L A ( x ) = L ∗ µ (0) x = 0 .We assume here that the resolvent is compact. This is the case on L p -spaces if the generatorincludes diffusion. For the purely advective case of general deterministic multi-scale systemsunder consideration here, there is currently no theory available on what function spaces theresolvent is compact . Under this assumption let us decompose s − L ε into projectors p i on theeigenfunctions of L , where p i • = ( r i , • ) l i and p • = ( µ (0) , • )1 , such that L = (cid:80) ∞ i =0 λ i p i . Thepairing ( a, b ) is defined as ( a, b ) = (cid:82) a ( dy ) b . The eigenfunctions l i and r i satisfy ( r i , l j ) = δ i,j andcorrespond to the eigenvalue λ i with λ = 0 [60, Chapter 7]. Then s − L ε = ∞ (cid:88) i =0 (cid:18) s − λ i ε (cid:19) p i = sp + (cid:18) s − λ ε (cid:19) p + (cid:18) s − λ ε (cid:19) p + . . . and therefore (cid:18) s − L ε (cid:19) − = 1 s p + (cid:18) s − λ ε (cid:19) − p + (cid:18) s − λ ε (cid:19) − p + . . . = 1 s p + (cid:18) s ⊥ − L ⊥ ε (cid:19) − = 1 s p − ε ( L ⊥ ) − − sε ( L ⊥ ) − − s ε ( L ⊥ ) − − . . . . (A 8)where we define the operators restricted to the space orthogonal to the projection onto theinvariant measure µ (0) x as L ⊥ = λ p + λ p + . . . and ⊥ = p + p + . . . . We note that formally L − , ⊥ = − (cid:90) ∞ d τ (cid:16) e τ L − p (cid:17) (A 9)and therefore formally (cid:104) A L − ⊥ f (cid:105) = (cid:104) A L − f (cid:105) and (cid:104) f L − ⊥ A (cid:105) = (cid:104) f L − A (cid:105) for any A since p f = (cid:104) f (cid:105) = 0 .The expansion (A 7) contains terms with both positive and negative powers of s . According tothe residue theorem, only the residues contribute to the inverse Laplace transform. By the way itis defined in Eq. (A 7), the function L {A ( p ) } ( s ) can only have poles at zero or at λ i ε . The poles at For hyperbolic maps one has good statistical properties when considering anisotropic Banach spaces (see [59] and referencestherein). r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... λ i ε decay exponentially upon inverse Laplace transform (recall that t/ε → ∞ ), and we thereforeonly need to consider the poles at zero. (ii) Combinatorial analysis of the expansion at each order in ε and t The expansion of the Laplace transform (A 5) contains powers of ε and of s . To facilitate thecombinatorial problem of accounting for the non-zero contributions at a specified order of ε and s (or t , respectively) we introduce the following diagrammatic representation. We classify all termsin the expansion (A 7) as sequences S ( a, b ) with a i ∈ N , b j ∈ { , } of the form S ( a, b ) := 1 s (cid:104) B b A a B b A a . . . B b l A a l B b l +1 z ( p ) (cid:105) , (A 10)where A = s p , A = − ε ( L ⊥ ) − , A = − sε ( L ⊥ ) − , etc. represent the terms in the series(A 8), and B = L /ε and B = L . We introduce the shorthand notation for a sequence oflength l , (cid:32) a . . . a l b . . . b l b l +1 (cid:33) for S ( a, b ) . We also introduce a product ⊗ of twosubsequences, with (cid:32) a . . . a l b . . . b l +1 (cid:33) ⊗ (cid:32) c . . . c l d . . . d l +1 (cid:33) = (cid:32) a . . . a l c . . . c l b . . . b l +1 d . . . d l +1 (cid:33) . For example s (cid:68) L p s L ( − ε ( L ⊥ ) − ) L z ( p ) (cid:69) = S ((0 , , (2 , , (cid:32) (cid:33) = (cid:32) (cid:33) ⊗ (cid:32)
11 1 (cid:33) . The order of ε of a given sequence is given by O ε = 2 l (cid:88) i =1 a i + l +1 (cid:88) j =1 ( b j −
2) = l (cid:88) i =1 a i + l +1 (cid:88) j =1 b j − l + 1) , and the order of s is given by O s = − l (cid:88) i =1 ( a i − , or, equivalently, the order of t in the inverse Laplace transform is given by O t = 1 − l (cid:88) i =1 ( a i −
1) = 1 + l − l (cid:88) i =1 a i . The combined order in t and ε of the sequence is O ε,t = l (cid:88) i =1 a i + l +1 (cid:88) j =1 b j − l − . (A 11)Note that the combined order of a sequence does not change upon commutation of thesubsequences in a product. Finally, the sign of the sequence is given by ( − θ where θ = (cid:80) li =1 δ ,a i is the number of non-zero elements among the a i where we used the Kronecker δ .We can readily identify certain sequences which do not contribute. The centering condition (cid:104)L (cid:105) = 0 implies that sequences containing a subsequence (cid:32) · · · · · · (cid:33) vanish. For r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... the same reason, sequences starting with (cid:32) · · · (cid:33) or ending in (cid:32) · · · (cid:33) vanish.Furthermore, sequences containing an insufficient number of derivatives with respect to x do notcontribute; if the number of derivatives is less than the order p , the sequence S ( a, b ) averages tozero since δ x ( ∂ nx ( x − x ) p ) = 0 for n < p . Hence only those sequences with l + 1 (cid:62) p contributesince the operators L and L contain at most one x -derivative. This implies that for fixed momentorder p we have l ≥ l min = p − (A 12)Recall that moments are rescaled by t − p (cf. (4.3)). We therefore require O ε,t − (cid:98) p (cid:99) ≤ (A 13)to retain only terms that contribute to first order after normalization. This implies l (cid:88) i =1 a i + l +1 (cid:88) j =1 b j − l − (cid:54) q + 1 . (A 14)We now identify all those sequences which are compatible with the constraints (A 14) and (A 12).For sake of exposition, we treat even and odd moments separately. We first determine the possiblesequences of different lengths l for even moments m (2 q ) where l (cid:62) q − . Case l = 2 q − Equation (A 14) implies (cid:80) li =1 a i + (cid:80) l +1 j =1 b j (cid:54) q + 1 . The sequences of length q − have in total l + 1 = 4 q − elements. Each sequence must therefore contain at least q − zero elements. Since sequences with subsequence (cid:32) · · · · · · (cid:33) average to zero, there are6 possible types of sequence that fulfill the inequality, namely (cid:32)
11 1 (cid:33) ⊗ q permutations: O ε,t = q (A 15) (cid:32) (cid:33) ⊗ ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q +12 (cid:1) O ε,t = q + 1 (A 16) (cid:32) (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q − (cid:1) O ε,t = q + 1 (A 17) (cid:32)
12 1 (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q (cid:1) O ε,t = q + 1 (A 18) (cid:32)
21 1 (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q (cid:1) O ε,t = q + 1 (A 19) (cid:32)
11 2 (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q (cid:1) O ε,t = q + 1 (A 20)with all possible permutations of subsequences. For q = 2 , we can verify by simple enumerationthat these are all the sequences possible if we restrict to b j < . All other possible nonzero termswith b j > can be obtained by substituting a b j in one of these sequences, but this operationincreases the order. For q > the only way to extend a series without increasing the order is toappend an element (cid:32)
11 1 (cid:33) . r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Case l = 2 q Equation (A 14) implies (cid:80) li =1 a i + (cid:80) l +1 j =1 b j (cid:54) q + 2 . The sequences of length q have in total l + 1 = 4 q + 1 elements. Each sequence must therefore contain at least q − zeroelements. There is only 1 possible type of sequence that fulfills the inequality, namely (cid:32) (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ ( q − permutations: (cid:0) q (cid:1) O ε,t = q + 1 (A 21)with all possible permutations of subsequences. Case l = 2 q + 1 Equation (A 14) implies (cid:80) li =1 a i + (cid:80) l +1 j =1 b j (cid:54) q + 3 . The sequences of length q + 1 have in total l + 1 = 4 q + 3 elements. Each sequence must therefore contain at least q zeroelements. There is only 1 possible type of sequence that fulfills the inequality, namely (cid:32)
11 1 (cid:33) ⊗ ( q + 1) permutations: O ε,t = q + 1 (A 22)with all possible permutations of subsequences. Case l (cid:62) q + 2 Nonzero sequences of length q + 2 are of at least order q + 2 , so they do notcontribute.For odd moments m (2 q +1) , the inequality (A 14) together with the condition on the number ofderivatives l (cid:62) q implies that only sequences (A 21)-(A 22) contribute. (iii) Calculation of the moments m ( p ) We have now derived all nontrivial sequences contributing to the moments m ( p ) . We can obtainexplicit expressions for the moments by calculating for each sequence the expectation value asprescribed in Eq. (A 10). In the process, subsequences become functions of x due to the projection p . Finally, we work out explicitly the x -derivatives in the sequence. For example, for the sequence S ((0 , , (2 , , with p = 3 , we have that, as described above, L − S ((0 , , (2 , , − ε t (cid:68) L p L L − ⊥ L ( x − x ) (cid:69) = − ε t (cid:104) f ( x, y ) (cid:105) (cid:68) f ( x, y ) L − ⊥ f ( x, y ) (cid:69) . Recalling the multi-index notation to denote the expansions in ε and t with α = ( α ε , α t ) andwith | α | = α ε + α t being the combined order of the contribution, we obtain, in this way, fromsequence (A 15) the leading homogenized term ( | α | = 0 ) in the q -th scaled moment m (2 q ) m (2 q )0 = 1 t q t q q ! (2 q )! (cid:18) σ (cid:19) q , (A 23)where σ = − (cid:104) f L − ⊥ f (cid:105) is the homogenized diffusion coefficient (2.4). The first factor /t q is thenormalization factor of the moment, the term t q /q ! comes from the inverse Laplace transform of /s q +1 and (2 q )! comes from the q -th derivative of ( x − x ) q . Finally, since only terms with q derivatives ∂ x yield nonzero contributions, only the term with f in L contributes, resulting in (cid:68) f e t L f (cid:69) q = (cid:16) σ / (cid:17) q . r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Similar combinatorial accounting distills from the sequences (A 16) and (A 22) thehomogenized first correction term ( α ε = 0 , | α | = 1 ) in the q -th scaled moment m (2 q ) t m (2 q )0 , = t (2 q )!( p + 1)! (cid:32)(cid:32) q + 12 (cid:33) F (cid:18) σ (cid:19) q − + F (cid:18) σ (cid:19) ( q − σ∂ x σ (cid:18) q + q − q (cid:19) + q ( q + 1) (cid:18) σ (cid:19) q ∂ x F + σ (2 q +1) ( q +1) ∂ x σ (cid:18) q + q − q (cid:19) + σ q q +1 ( ∂ x σ ) q q + 1)(6 q − q − (cid:19) , where F = (cid:104) f (cid:105) − (cid:104) ( g ∂ y ) L − ⊥ f (cid:105) − (cid:104) f L − ⊥ ∂ x f (cid:105) is the homogenized drift coefficient (2.3).From sequences (A 17)-(A 20) we obtain the non-homogenized corrections ( α ε > , | α | = 1 ) tothe q -th scaled moment with q x -derivatives ε ˜ m (2 q )1 , + ε t m (2 q )2 , − = (2 q )! t q (cid:18) εt q q ! q ( q − (cid:104) f L − ⊥ f L − ⊥ f (cid:105)(cid:104) f (cid:105) (cid:16) − (cid:68) f L − ⊥ f (cid:69)(cid:17) ( q − + εt q q ! q (cid:16) −(cid:104) f L − ⊥ f (cid:105) (cid:17) ( −(cid:104) f L − ⊥ f (cid:105) ) q − + εt q q ! q (cid:16) − (cid:68) f L − ⊥ f (cid:69)(cid:17) ( −(cid:104) f L − ⊥ f (cid:105) ) ( q − + ε t q − ( q − q (cid:16) − (cid:68) f L − ⊥ f (cid:69)(cid:17) ( −(cid:104) f L − ⊥ f (cid:105) ) q − + ε t q − ( q − q − (cid:16) − (cid:68) f L − ⊥ f L − ⊥ f L − ⊥ f (cid:69)(cid:17) ( −(cid:104) f L − ⊥ f (cid:105) ) q − + ε t q − ( q − (cid:32) q − (cid:33) (cid:68) f L − ⊥ f L − ⊥ f (cid:69) ( −(cid:104) f L − ⊥ f (cid:105) ) q − (cid:33) . From sequence (A 21) we obtain the non-homogenized corrections ( α ε > , | α | = 1 ) to the q -thmoment with q + 1 x -derivatives ε ˜˜ m (2 q )1 , = (2 q )! εq ! (cid:16) q (cid:16) (cid:104) ( g ∂ y ) L − ⊥ f L − ⊥ f (cid:105) ( −(cid:104) f L − ⊥ f (cid:105) ) q − + (cid:104) f L − ⊥ ( g ∂ y ) L − ⊥ f (cid:105) ( −(cid:104) f L − ⊥ f (cid:105) ) q − +( q − (cid:68) f L − ⊥ f L − ⊥ f (cid:69) ( −(cid:104) ( g ∂ y ) L − ⊥ f (cid:105) ( −(cid:104) f L − ⊥ f (cid:105) ) q − ) (cid:17) + q ( q − (cid:16) − (cid:68) f L − ⊥ f (cid:69)(cid:17) q − (cid:104) ∂ x f L − ⊥ f L − ⊥ f (cid:105) + q (cid:16) −(cid:104) f L − ⊥ f (cid:105) (cid:17) q − (cid:104) f L − ⊥ ∂ x f L − ⊥ f (cid:105) + q ( q + 1)( −(cid:104) f L − ⊥ f (cid:105) ) q − (cid:68) f L − ⊥ f L − ⊥ ∂ x f (cid:69) + q (cid:18) q − q + 12 (cid:19) ( −(cid:104) ∂ x f L − ⊥ f (cid:105) ) (cid:16) − (cid:68) f L − ⊥ f (cid:69)(cid:17) q − (cid:104) f L − ⊥ f L − ⊥ f (cid:105) + q (cid:18) q − q − (cid:19) (cid:16) − (cid:68) f L − ⊥ ∂ x f (cid:69)(cid:17) ( −(cid:104) f L − ⊥ f (cid:105) ) q − (cid:104) f L − ⊥ f L − ⊥ f (cid:105) (cid:19) , where m (2 q )1 , = ˜ m (2 q )1 , + ˜˜ m (2 q )1 , . From sequence (A 22) we obtain the leading homogenized terms( α ε = 0 , | α | = ) in the (2 q + 1) -th moment √ t m (2 q +1)0 , = √ t (2 q + 1)! q ! 12 q ( F σ q + qσ q +1 ∂ x σ ) . r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... From sequence (A 21) we obtain the non-homogenized correction terms ( α ε > , | α | = ) in the (2 q + 1) -th moment ε √ t m (2 q +1)1 , − = (2 q + 1)! t (2 q +1) / ε t q q ! q (cid:104) f L − ⊥ f L − ⊥ f (cid:105) ( −(cid:104) f L − ⊥ f (cid:105) ) q − . Summarizing for q = 0 and q = 1 we obtain the desired expressions for the first four cumulants(4.8), (4.10), (4.14) and (4.17). We have until now only performed the average with respect to theleading order contribution µ (0) x . The following Lemma shows that this is sufficient Lemma 1.
The asymptotic series of cumulants c ( p ) in ε and t up to the combined order O ε,t = involvesaverages over µ (0) x only and the linear response term µ (1) x does not contribute.Proof. Analogously to (A 5), the Laplace transform of B ( p ) ( t ) = εµ (1) x (cid:18) e t (cid:16) L ε + L ε (cid:17) z ( p ) ( x ) (cid:19) can be expanded as L {B ( p ) } ( s ) = εµ (1) x (cid:32)(cid:32)(cid:18) s − L ε (cid:19) − + (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − + (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − L ε (cid:18) s − L ε (cid:19) − + . . . (cid:33) z ( p ) (cid:33) . Using our diagrammatic representation, the sequences are encoded as S ( a, b ) = (cid:32) a . . . a l b . . . b l − (cid:33) . The order of ε of a given sequence is now O ε = 2 (cid:80) li =1 a i + (cid:80) l − j =1 ( b j −
2) + 1 = (cid:16) (cid:80) li =1 a i + (cid:80) l − j =1 b j (cid:17) − l −
1) + 1 and the order of t is O t = − − (cid:80) li =1 ( a i −
1) = − l − (cid:80) li =1 a i . The combinedorder in t and ε of the sequence is then O ε,t = (cid:80) li =1 a i + (cid:80) l +1 j =1 b j − l + 2 . The only sequencethat can contribute in this case is (cid:32) (cid:33) ⊗ (cid:32)
11 1 (cid:33) ⊗ q − ⊗ (cid:32) (cid:33) at order O ε,t = q + 1 . Because of the left-most projection p in the sequence, such termsare functions of x only. Since µ (1) x = lim ε → ( µ ( ε ) − µ (0) x ) /ε is a difference of two normalizedmeasures, applying µ (1) x to a constant in y yields . Therefore the cumulants up to O ( ε ) onlycontain averages over the fast invariant measure µ (0) x and the average with respect to the linearresponse µ (1) x – the second term in (A 4) – vanishes. Remark 1.
The linear response term µ (1) x may become relevant for higher orders O ε,t (cid:62) . Finally, we prove that higher order cumulants c ( p ) with p (cid:62) do not contribute at O ( ε ) , andwe can be content finding expressions for the first 4 cumulants to determine the Edgeworthexpression. Lemma 2. c ( p ) = O ( ε ) for p (cid:62) . r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Proof.
Using the recursion formula for cumulants c ( p ) = m ( p ) − p − (cid:88) m =1 (cid:32) p − m − (cid:33) m ( p − m ) c ( m ) , we have by recursion at order O (1) for q (cid:62) that c (2 q ) = m (2 q )0 − (cid:32) q − (cid:33) m (2 q − c (2)0 + O ( ε ) = c (2 q )0 + O ( ε ) , where, due to (A 23), c (2 q )0 = (2 q )! q ! σ q q − (2 q − q − q − q − σ q − q − σ = 0 . Similarly, for the combined order contribution to the even cumulant δ (1) c (2 q ) = c (2 q )1 , + c (2 q )0 , + c (2 q )2 , − , we find that for q (cid:62) δ (1) c (2 q ) = δ (1) m (2 q ) − (cid:32) q − (cid:33) (cid:16) m (2 q − , + m (2 q − , − (cid:17) c (1)0 − (cid:32) q − (cid:33) ( m (2 q − δ (1) c (2) + δ (1) m (2 q − c (2)0 ) − (cid:32) q − (cid:33) (cid:16) m (2 q − , + m (2 q − , − (cid:17) c (3)0 − (cid:32) q − (cid:33) m (2 q − δ (1) c (4) = 0 , (A 24)where δ (1) m (2 q ) = m (2 q )1 , + m (2 q )0 , + m (2 q )2 , − . For the odd cumulants c (2 q +1) we find recursivelythat the O ( √ ε ) contribution for q (cid:62) is c (2 q +1)0 , + c (2 q +1)1 , − = (cid:16) m q +10 , + m q +11 , − (cid:17) − m q c (1)0 , − (2 q ) (cid:16) m (2 q +1)0 , + m (2 q +1)1 , − (cid:17) c (2)0 − (cid:32) q (cid:33) m (2 q )0 (cid:16) c (3)0 , + c (3)1 , − (cid:17) = 0 . References
1. Kamerlin SCL, Vicatos S, Dryga A, Warshel A. 2011 Coarse-Grained (Multiscale) Simulationsin Studies of Biophysical and Chemical Systems.
Annual Review of Physical Chemistry , 41–64.2. Imkeller P, von Storch JS. 2001 Stochastic Climate Models . Birkhäuser.3. Givon D, Kupferman R, Stuart A. 2004 Extracting macroscopic dynamics: Model problemsand algorithms.
Nonlinearity , R55–127.4. Pavliotis G, Stuart A. 2008 Multiscale Methods Averaging and Homogenization . Texts in AppliedMathematics , Springer.5. Khasminsky RZ. 1966 On stochastic processes defined by differential equations with a smallparameter. Theory of Probability and its Applications , 211–228.6. Kurtz TG. 1973 A limit theorem for perturbed operator semigroups with applications torandom evolutions. Journal of Functional Analysis , 55–67.7. Papanicolaou GC. 1976 Some probabilistic problems and methods in singular perturbations. Rocky Mountain Journal of Mathematics , 653–674.8. Beck C. 1990 Brownian motion from deterministic dynamics. Phys. A , 324–336.9. Just W, Kantz H, Rödenbeck C, Helm M. 2001 Stochastic modelling: replacing fast degrees offreedom by noise.
J. Phys. A , 3199–3213.10. Melbourne I, Stuart A. 2011 A note on diffusion limits of chaotic skew-product flows. Nonlinearity , 1361–1367.11. Gottwald GA, Melbourne I. 2013 Homogenization for deterministic maps and multiplicativenoise. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science . r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A ..........................................................
12. Kelly D, Melbourne I. 2017 Deterministic homogenization for fast–slow systems with chaoticnoise.
Journal of Functional Analysis , 4063–4102.13. Dolgopyat D. 2004 Limit theorems for partially hyperbolic systems.
Trans. Amer. Math. Soc. , 1637–1689).14. De Simoi J, Liverani C. 2015 The Martingale Approach after Varadhan and Dolgopyat. In
Hyperbolic dynamics, fluctuations and large deviations vol. 89
Proc. Sympos. Pure Math. pp. 311–339.Amer. Math. Soc., Providence, RI.15. De Simoi J, Liverani C. 2016 Statistical properties of mostly contracting fast-slow partiallyhyperbolic systems.
Invent. Math. , 147–227.16. Gear C, Kevrekidis I. 2003 Projective methods for differential equations.
SIAM J. Sci. Comp. ,1091–1106.17. Kevrekidis IG, Gear CW, Hyman JM, Panagiotis GK, Runborg O, Theodoropoulos C. 2003Equation-free, coarse-grained multiscale computation: Enabling microscopic simulators toperform system-level analysis. Comm. Math. Sci. , 715–762.18. E W. 2003 Analysis of the heterogeneous multiscale method for ordinary differentialequations. Comm. Math. Sci. , 423–436.19. E W, Engquist B, Li X, Ren W, Vanden-Eijnden E. 2007 Heterogeneous multiscale methods: Areview. Comm. Comp. Phys. , 367–450.20. Majda AJ, Timofeyev I, Vanden-Eijnden E. 1999 Models for stochastic climate prediction. Proceedings of the National Academy of Sciences , 14687–14691.21. Majda AJ, Timofeyev I, Vanden-Eijnden E. 2001 A mathematical framework for stochasticclimate models. Communications on Pure and Applied Mathematics , 891–974.22. Majda AJ, Timofeyev I, Vanden-Eijnden E. 2002 A priori tests of a stochastic mode reductionstrategy. Phys. D , 206–252.23. Majda AJ, Timofeyev I, Vanden-Eijnden E. 2003 Systematic strategies for stochastic modereduction in climate.
Journal of the Atmospheric Sciences , 1705–1722.24. Monahan AH, Culina J. 2011 Stochastic Averaging of Idealized Climate Models. Journal ofClimate , 3068–3088.25. Culina J, Kravtsov S, Monahan AH. 2011 Stochastic Parameterization Schemes for Use inRealistic Climate Models. Journal of the Atmospheric Sciences , 284–299.26. Gottwald G, Crommelin D, Franzke C. 2017 Ensemble-based Atmospheric Data Assimilation.In Franzke CLE, O’Kane TJ, editors, Nonlinear and Stochastic Climate Dynamics pp. 209–240.Cambridge: Cambridge University Press.27. Bhattacharya RN, Rao RR. 2010
Normal Approximation and Asymptotic Expansions vol. 64
Classicsin Applied Mathematics . Society for Industrial and Applied Mathematics (SIAM), Philadelphia,PA.28. Feller W. 1957
An Introduction to Probability Theory and Its Applications . New York: Wiley 2d ededition.29. de Simoi J, Liverani C, Poquet C, Volk D. 2017 Fast–Slow Partially Hyperbolic Systems VersusFreidlin–Wentzell Random Systems.
Journal of Statistical Physics , 650–679.30. Dolgopyat D. 2005 Averaging and invariant measures.
Mosc. Math. J. , 537–576, 742.31. Gouëzel S. 2004 Central limit theorem and stable laws for intermittent maps. Probability Theoryand Related Fields , 82–122.32. Çınlar E. 2011
Probability and Stochastics . Number 261 in Graduate Texts in Mathematics. NewYork ; London: Springer.33. Melbourne I, Nicol M. 2005 Almost sure invariance principle for nonuniformly hyperbolicsystems.
Commun. Math. Phys. , 131–146.34. Melbourne I, Nicol M. 2008 Large deviations for nonuniformly hyperbolic systems.
Trans.Amer. Math. Soc. , 6661–6676.35. Melbourne I, Nicol M. 2009 A vector-valued almost sure invariance principle for hyperbolicdynamical systems.
Annals of Probability , 478–505.36. Gouëzel S. 2010 Almost sure invariance principle for dynamical systems by spectral methods. Ann. Probability , 1639–1671.37. Götze F, Hipp C. 1983 Asymptotic Expansions for Sums of Weakly Dependent RandomVectors. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete , 211–239.38. Field C, Ronchetti E. 1990 Small Sample Asymptotics . Institute of Mathematical Statistics.39. Götze F, Hipp C. 1994 Asymptotic distribution of statistics in time series.
Ann. Statist. ,2062–2088. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A ..........................................................
40. Aït-Sahalia Y. 2002 Maximum Likelihood Estimation of Discretely Sampled Diffusions: AClosed-Form Approximation Approach.
Econometrica , 223–262.41. Hervé L, Pène F. 2010 The Nagaev-Guivarc’h method via the Keller-Liverani theorem. Bull.Soc. Math. France , 415–489.42. Rokhlin VA. 1949 On the fundamental ideas of measure theory.
Matematicheskii Sbornik ,107–150.43. Wouters J, Lucarini V. 2013 Multi-level Dynamical Systems: Connecting the Ruelle ResponseTheory and the Mori-Zwanzig Approach. Journal of Statistical Physics , 850–860.44. Wouters J, Lucarini V. 2012 Disentangling multi-level systems: averaging, correlations andmemory.
Journal of Statistical Mechanics: Theory and Experiment , P03003.45. Wouters J, Dolaptchiev SI, Lucarini V, Achatz U. 2016 Parameterization of stochasticmultiscale triads.
Nonlin. Processes Geophys. , 435–445.46. Baladi V, Smania D. 2008 Linear response formula for piecewise expanding unimodal maps. Nonlinearity , 677.47. Baladi V, Smania D. 2010 Alternative proofs of linear response for piecewise expandingunimodal maps. Ergodic Theory and Dynamical Systems , 1–20.48. Baladi V. 2014 Linear Response, or else. In ICM Seoul 2014, Proceedings, Volume III pp. 525–545.49. Baladi V, Benedicks M, Schnellmann D. 2015 Whitney-Hölder continuity of the SRB measurefor transversal families of smooth unimodal maps.
Invent. Math. , 773–844.50. de Lima A, Smania D. 2016 Central limit theorem for the modulus of continuity of averagesof observables on transversal families of piecewise expanding unimodal maps.
Journal of theInstitute of Mathematics of Jussieu pp. 1–61.51. Gottwald GA, Wormell JP, Wouters J. 2016 On spurious detection of linear response andmisuse of the fluctuation-dissipation theorem in finite time series.
Phys. D , 89–101.52. Pavliotis GA, Stuart AM. 2008
Multiscale Methods: Averaging and Homogenization . New York:Springer.53. Lorenz EN. 1963 Deterministic nonperiodic flow.
Journal of the Atmospheric Sciences , 130–141.54. Araújo V, Melbourne I, Varandas P. 2015 Rapid Mixing for the Lorenz Attractor and StatisticalLimit Laws for Their Time-1 Maps. Communications in Mathematical Physics , 901–938.55. ODE.jl v2.1 commit 8d4827b93609118633478acf09a74247f47cd97e. https://github.com/JuliaDiffEq/ODE.jl .56. Dormand J, Prince P. 1980 A Family of Embedded Runge-Kutta Formulae.
Journal ofComputational and Applied Mathematics , 19–26.57. Wouters J, Gottwald GA. 2018 Stochastic model reduction for slow-fast systems withmoderate time-scale separation. arXiv:1804.09537 [cond-mat, physics:nlin] . arXiv: 1804.09537.58. Butterley O, Liverani C. 2007 Smooth Anosov flows: correlation spectra and stability. Journalof Modern Dynamics , 301–322.59. Baladi V. 2017 The quest for the ultimate anisotropic Banach space. J. Stat. Phys. , 525–557.60. Lasota A, Mackey MC. 1985