[PDF] Higher order phase averaging for highly oscillatory systems

Abstract

We introduce a higher order phase averaging method for nonlinear oscillatory systems. Phase averaging is a technique to filter fast motions from the dynamics whilst still accounting for their effect on the slow dynamics. Phase averaging is useful for deriving reduced models that can be solved numerically with more efficiency, since larger timesteps can be taken. Recently, Haut and Wingate (2014) introduced the idea of computing finite window numerical phase averages in parallel as the basis for a coarse propagator for a parallel-in-time algorithm. In this contribution, we provide a framework for higher order phase averages that aims to better approximate the unaveraged system whilst still filtering fast motions. Whilst the basic phase average assumes that the solution is independent of changes of phase, the higher order method expands the phase dependency in a basis which the equations are projected onto. We illustrate the properties of this method on an ODE that describes the dynamics of a swinging spring due to Lynch (2002). Although idealized, this model shows an interesting analogy to geophysical flows as it exhibits a slow dynamics that arises through the resonance between fast oscillations. On this example, we show convergence to the non-averaged (exact) solution with increasing approximation order also for finite averaging windows. At zeroth order, our method coincides with a standard phase average, but at higher order it is more accurate in the sense that solutions of the phase averaged model track the solutions of the unaveraged equations more accurately.

Full PDF

HHigher order phase averaging for highly oscillatorysystems

Werner Bauer, Colin J. Cotter, Beth WingateFebruary 24, 2021

Abstract

We introduce a higher order phase averaging method for nonlinear oscillatory sys-tems. Phase averaging is a technique to ﬁlter fast motions from the dynamics whilst stillaccounting for their eﬀect on the slow dynamics. Phase averaging is useful for derivingreduced models that can be solved numerically with more eﬃciency, since larger timestepscan be taken. Recently, Haut and Wingate (2014) introduced the idea of computing ﬁnitewindow numerical phase averages in parallel as the basis for a coarse propagator for aparallel-in-time algorithm. In this contribution, we provide a framework for higher orderphase averages that aims to better approximate the unaveraged system whilst still ﬁlteringfast motions. Whilst the basic phase average assumes that the solution is independentof changes of phase, the higher order method expands the phase dependency in a basiswhich the equations are projected onto. We illustrate the properties of this method on anODE that describes the dynamics of a swinging spring due to Lynch (2002). Althoughidealized, this model shows an interesting analogy to geophysical ﬂows as it exhibits a slowdynamics that arises through the resonance between fast oscillations. On this example,we show convergence to the non-averaged (exact) solution with increasing approximationorder also for ﬁnite averaging windows. At zeroth order, our method coincides with astandard phase average, but at higher order it is more accurate in the sense that solu-tions of the phase averaged model track the solutions of the unaveraged equations moreaccurately.

Phase averaging is a technique for approximating highly oscillatory systems so that the solutionstill captures the long time trends in the dynamics. In the context of highly oscillatory nonlinearPDEs, the combination of phase averaging and asymptotics has been used to derive slow PDEsthat do this (Schochet, 1994; Majda and Embid, 1998; Wingate et al., 2011, for example),providing rigorous derivations of slow geophysical models such as the quasigeostrophic model.Slow models are more analytically tractable and easier to explain, and they are more eﬃcientto solve numerically. This is because they have had the fast frequencies ﬁltered from them, andso much larger timesteps can be taken without excessively compromising stability or accuracy.For example, the ﬁrst numerical weather prediction exercise by Charney, Fjortoft and vonNeumann was based upon the barotropic vorticity equation, a model that does not supportthe full gravity and pressure waves present in more detailed models. One problem with these1 a r X i v : . [ m a t h . D S ] F e b symptotic models is that they are not uniformly valid, which is one reason why they are notused in today’s weather prediction systems as a predictive tool.This paper is not about numerical algorithms, but it is highly motivated by them. Hautand Wingate (2014) investigated the possibility of using phase averaging without asymptotics,by performing phase averaging over a ﬁnite window, and the phase average integral with nu-merical quadrature. Although this results in a computationally intensive model, the terms inthe quadrature sum can be evaluated independently, exposing possibilities for parallel com-puting. The phase averaged formulation also involves the application of matrix exponentials,which can be computed by Krylov methods or other techniques such as the parallel rationalapproximation approach of Haut et al. (2016); Schreiber et al. (2018), particularly suited tooscillatory operators. This produces a variant of heterogeneous multiscale methods (Abdulleet al., 2012). The accuracy and stability of the ﬁnite window phase averaging technique wasstudied in Peddle et al. (2019), and applied to proving convergence of the parareal iteration.If the results of the averaging method are insuﬃciently accurate, then it can be used asthe predictor in a predictor-corrector framework. Many of these frameworks provide additionalopportunities for parallel computing. With this in mind, Haut and Wingate (2014) introducedan asymptotic parareal method using the ﬁnite window averaging as a coarse propagator, andusing a standard time integration method applied to the original unaveraged system as a ﬁnepropagator. In this paper we lay the foundations for an alternative approach, where the accuracyof the ﬁnite window averaged model is increased “from the top down” by extending the phaseaverage into a polynomial expansion of the solution in the phase variable. The goal is to reducethe error in the slow component of the solution whilst still ﬁltering fast oscillations. Thisincreases the model complexity but it may be interesting to consider how to hide higher ordercorrections to the solution behind other computations in parallel, as occurs in the PFASSTframework (Minion, 2011), for example.In this paper, we introduce and explore the higher order averaging technique applied toODEs (although we have highly oscillatory PDEs in mind when we do this). We also avoidnumerical aspects , by making phase averages exact, and by using an adaptive timestepper tocompute numerical solutions that should well approximate the exact solutions over ﬁnite times.We ﬁnd that when applied to a swinging spring model introduced by Lynch (Holm and Lynch,2002), the higher order averaging does indeed produce a more accurate solution than the basicﬁnite window averaging, and that this error decreases as we increase the polynomial degree ofthe expansion in the phase variable.The rest of the paper is organised as follows. In Section 2 we introduce the higher orderaveraging technique in the most general form, before specialising to a polynomial basis for theexpansion in the phase variable as well as selecting a Gaussian averaging weight function sothat the phase averaging can be computed analytically. This is not critical to the frameworkbut was done to expediate our numerical experiments and to expose understanding of the eﬀectof the higher order averaging. We then discuss asymptotic limits for the averaging window.In Section 3 we develop further formulae for Lynch’s swinging spring model, a speciﬁc lowdimensional example. In Section 4 we present numerical integrations for the swinging springexample, calculating errors between the higher order average models and the exact model.Finally, in Section 5 we provide a summary and outlook.2 Higher order averaging: formulation

We consider initial value problems written in the following abstract form ∂∂t U = L U + N ( U , U ) , (2.1)where L is a linear operator (with purely imaginary eigenvalues) and N is a nonlinear operator.Restricting ourselves to quadratic nonlinearities, we assume N to be a bilinear form (bilinearityindicated by the two slots). To develop phase averaged models, it is useful to reformulate (2.1)by introducing the factorization V ( t ) = e − tL U ( t ) , (2.2)where the modulated variable V ( t ) satisﬁes the equation ∂∂t V ( t ) = e − tL N (cid:0) e tL V ( t ) , e tL V ( t ) (cid:1) . (2.3)We refer to this form as the modulated equation . It is straightforward to show that bothformulations are equivalent: when multiplying both sides of (2.1) with the integration factor e − tL and noting that e − tL ( ∂ t U − L U ) = ∂ t ( e − tL U ), we recover Eqn. (2.3) via e − tL ( ∂ t U − tL U ) = e − tL N ( U , U ) ⇔ ∂ t ( e − tL U ) = e − tL N ( U , U ) (2.4) ⇔ ∂ t V = e − tL N ( e tL V , e tL V ) . (2.5)Note that this equation is valid for cases with or without scale separation (as discussed below).To apply phase averaging, we add an additional phase parameter s ∈ R over which weintegrate and hence average the solution. As a ﬁrst step, we assume a factorization V ( t, s ) = e − ( t + s ) L U ( t ) (2.6)where, besides the time parameter t , we include a phase parameter s ∈ R in the exponentsuch that the solution in V depends on both t and s . The phase parameter s corresponds toa change in the point where the exponential exp(( t + s ) L ) is equal to the identity, i.e. a phaseshift.As a calculation similar to above shows, this factorization yields a form of (2.3) that includesthe parameter s , i.e. ∂∂t V ( t, s ) = e − ( t + s ) L N (cid:0) e ( t + s ) L V ( t, s ) , e ( t + s ) L V ( t, s ) (cid:1) . (2.7)This gives us a one parameter family of solutions V ( t, s ) that depend on the phase parameter s with the property that for s = 0 we recover the solution of (2.3). We can then obtain a phaseaveraged model, ∂∂t ¯ V ( t ) = lim T →∞ T (cid:90) T − T e − ( t + s ) L N (cid:0) e ( t + s ) L ¯ V ( t, s ) , e ( t + s ) L ¯ V ( t, s ) (cid:1) d s. (2.8)Phase averaging for (2.7) was analysed by Schochet (1994) in the limit (cid:15) →

0, leading toslow equations, that approximate the solution of the unaveraged model in this limit in someappropriate sense. In particular, Majda and Embid (1998) showed that this averaging and (cid:15) → T ﬁnite and introducing a C ∞ weight function ρ with ρ ≥ (cid:107) ρ (cid:107) L = 1, to obtain the averagedequation ∂∂t ¯ V ( t ) = (cid:90) ∞−∞ ρ T ( s ) e − ( t + s ) L N (cid:0) e ( t + s ) L ¯ V ( t, s ) , e ( t + s ) L ¯ V ( t, s ) (cid:1) d s, (2.9)where ρ T ( s ) = ρ ( s/T ) /T . They further proposed to replace the integral via quadrature (choos-ing ρ to have compact support), with the evaluation of the integrand at each quadrature pointbeing computed independently in parallel; this formulation provided the coarse propagator fora parallel-in-time algorithm.Note that depending on the value of T , formulation (2.9) recovers an averaged version, anasymptotic approximation, or the full equations (2.3): • for T →

0, we recover Eqn. (2.3) because in the limit T → ρ converges formally to the Dirac measure δ ( s ). • for T → ∞ , we recover (2.9), because when integrating the resulting integral value isinvariant under shifts in t ; hence for t = 0 we recover the asymptotic approximationintroduced in Majda and Embid (1998) after also taking (cid:15) → • for 0 << T << ∞ , formulation (2.9) has a smoothing eﬀect on fast oscillations with timeperiod less than T .The optimum choice of T as a function of (cid:15) to minimise the error due to averaging whencombined with a numerical time integration method to solve 2.9 was investigated in Peddleet al. (2019).An interpretation of (2.9) is that we have projected V ( t, s ) onto the space of functionsindependent of s using the ρ -weighted L inner product. In the next subsection, our higherorder averaging method for ﬁnite averaging window T will be based on projecting the oneparameter family of solutions on to higher order polynomials in s . After solving this system ofcoupled diﬀerential equations, we set s = 0 because we are interested only in the time evolutionof the solution with this phase value. Hence, we only present solutions at s = 0 in the numericalresults section. We abbreviate the RHS of (2.3) as general bilinear function F ( V ( t, s ) , V ( t, s ); t + s ) that hasquadratic dependency on V ( t, s ), i.e. ∂∂t V ( t, s ) = F (cid:0) V ( t, s ) , V ( t, s ); t + s (cid:1) . (2.10)According to (2.6), the solution vector V depends on both t and s and we assume that it canbe represented by the factorization V ( t, s ) = p (cid:88) k =0 V k ( t ) φ k ( s ) , (2.11)4here φ k ( s ) are basis vectors depending on s that span Q , some chosen ( p + 1)-dimensionalfunction space.We project (2.10) onto Q to obtain (cid:90) ∞−∞ ρ T ( s ) W ( s ) · ∂∂t V ( t, s ) ds = (cid:90) ∞−∞ ρ T ( s ) W ( s ) · F (cid:0) V ( t, s ) , V ( t, s ); t + s (cid:1) ds ∀ W ( s ) ∈ Q, (2.12)where ρ T ( s ) = ρ ( s/T ) /T for our chosen weight function ρ .Also expanding the test function W in the basis, W ( s ) = p (cid:88) j =0 W j φ j ( s ) , (2.13)we get p (cid:88) j =0 p (cid:88) k =0 W j (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) φ k ( s ) ds ∂∂t V k ( t ) = p (cid:88) j =0 W j (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) F (cid:0) p (cid:88) k =0 V k ( t ) φ k ( s ) , p (cid:88) l =0 V l ( t ) φ l ( s ); t + s (cid:1) ds, ∀ W , . . . , W p . (2.14)As the coeﬃcients W j are independent from each other, we obtain p (cid:88) k =0 (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) φ k ( s ) ds ∂∂t V k ( t ) = (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) F (cid:0) p (cid:88) k =0 V k ( t ) φ k ( s ) , p (cid:88) l =0 V l ( t ) φ l ( s ); t + s (cid:1) ds. (2.15)This deﬁnes our averaged model for the components V k , in the form of a matrix-vector equationfor ∂∂t V k . To use it, we set V ( t, s = 0) = V (0), the initial condition for V . Then we solvethe equations forward in time (perhaps approximating the phase integral as a sum), and use¯ V ( t, s = 0) as our estimate of V ( t ).To understand more about what this approximation does, we further specify the nonlinearfunction F . As it is bilinear, we may pull out the double sum over the basis φ , i.e. F (cid:32) p (cid:88) k =0 V k ( t ) φ k ( s ) , p (cid:88) l =0 V l ( t ) φ l ( s ); t + s (cid:33) = p (cid:88) k =0 p (cid:88) l =0 φ k ( s ) φ l ( s ) F (cid:0) V k ( t ) , V l ( t ); t + s (cid:1) . (2.16)Moreover, we assume that we can factor out the time dependency of F on t + s so that F isrepresented as F (cid:0) V k ( t ) , V l ( t ); t + s (cid:1) = (cid:88) m F m (cid:0) V k ( t ) , V l ( t ) (cid:1) e i ( t + s ) c m (2.17)where the sum in m is over diﬀerent terms that are quadratic in V (for diﬀerent polynomial order k, l ) with coeﬃcients c m and bilinear functions F m and that depend on the choice of N . In thecase of a PDE, this sum over m will be replaced by an inﬁnite sum over frequencies. Examples of5hese coeﬃcients and their relations are shown further below in Table 3.1 for the swinging springmodel. To ease notation, we will adopt the short hand notation F m,k,l ( t ) := F m (cid:0) V k ( t ) , V l ( t ) (cid:1) .In summary, we end up with a system of equations (for j = 0 , . . . p ) that reads p (cid:88) k =0 (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) φ k ( s ) ds ∂∂t V k ( t ) = (cid:88) m p (cid:88) k =0 p (cid:88) l =0 F m,k,l ( t ) (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) φ j ( s ) φ k ( s ) φ l ( s ) e i ( t + s ) c m ds ∀ φ j ( s ) . (2.18)This formulation provides a general framework of higher order averages for various choices of Q and weight function ρ . In this section we make some speciﬁc calculations using the choice of Q = P p , the degree-p polynomials, and a particular choice of weight function that allows us to make progressanalytically, namely ρ = exp( − s / / √ π , recalling that (cid:90) ∞−∞ T ρ (cid:16) sT (cid:17) = (cid:90) ∞−∞ √ πT e − s T ds = 1 . (2.19)This allows us to integrate over the whole R while guaranteeing a bounded integral, but otherchoices are possible too.As a basis φ , we choose the standard monomials φ j ( s ) = s j , j = 0 , . . . , p. (2.20)This is a poorly-conditioned basis, and better choices might be made for large values of p (forthe weight function we are using the Hermite polynomials would be ideal since they diagonalisethe mass matrix). However, the monomial basis simpliﬁes the calculations below. Also, underthis basis, ¯ V ( t, s = 0) = ¯ V , so to initialise the system we can just set ¯ V (0) = V (0 , s ) and¯ V k (0) = 0 for 1 ≤ k ≤ p .Substituting this basis into (2.18), we arrive at p (cid:88) k =0 √ πT (cid:90) ∞−∞ e − s T s j s k ds ∂∂t V k ( t ) = (cid:88) m p (cid:88) k =0 p (cid:88) l =0 F m,k,l ( t ) e ic m t √ πT (cid:90) ∞−∞ e − s T e ic m s s j s k s l ds ∀ j. (2.21)The latter equation can be formulated more concisely as p (cid:88) k =0 M jk ∂∂t V k ( t ) = (cid:88) m p (cid:88) k =0 p (cid:88) l =0 F m,k,l ( t ) e ic m t ˜R mjkl ∀ j, (2.22)in which we denote the integral on the left hand side (LHS) of (2.21) as a mass matrixM jk = 1 √ πT (cid:90) ∞−∞ e − s T s j s k ds, (2.23)6nd that on the RHS of (2.21) as˜R mjkl = 1 √ πT (cid:90) ∞−∞ e − s T e ic m s s j s k s l ds. (2.24)To evaluate these integrals, it will be useful to introduce the notation I α := 1 √ πT (cid:90) ∞−∞ e − s T s α ds (2.25)with I = 1 which follows immediately from (2.19). Then, for instance, for α = i + j we have M ij = I α . With the help of this concise notation, we formulate the following two propositionsthat allow us to evaluate the integrals in (2.23) and (2.24). Proposition 2.1.

For averaging window T and α = j + k with j, k = 0 , . . . , p , the integral I α in (2.25) can be represented in the recursive form I α = T ( α − I α − with I = 1 (2.26) In particular, I α = 0 for all α odd.Proof. This recursive formula follows directly from integration by parts. Using the normaliza-tion factor γ = √ πT , there follows I α = γ (cid:90) ∞−∞ dds e − s T − T s s α ds = γ (cid:20) − e − s T T s α − (cid:21) ∞−∞ + T ( α − γ (cid:90) ∞−∞ e − s T s α − ds = T ( α − I α − . (2.27)For α = 0, we have I = 1 since (cid:82) ∞−∞ e − s T = √ πT which follows from (2.19). For odd α , I α is an integral of a product of an even and odd function over entire R , hence zero. Proposition 2.2.

Using the recursive formula of Proposition 2.1 and for averaging window T and α = j + k + l with j, k, l = 0 , . . . , p , the integral (2.24) can be represented for all m as ˜R mα = e − c mT R mα with R mα = α (cid:88) β =0 (cid:18) αβ (cid:19) I α − β · ( ic m T ) β (2.28) with the property that for α = 0 we have R m = 1 for all m .Proof. First, we rewrite (2.24) using α = j + k + l and by completing the square ( − s T + ic m s = − T ( s − ic m T ) − c m T ) we arrive at˜R mα = e − c mT R mα with R mα := 1 √ πT (cid:90) ∞−∞ e − T ( s − ic m T ) s α ds. (2.29)Next, we use the substitution s (cid:48) = s − ic m T with ds (cid:48) = ds :R mα = 1 √ πT (cid:90) ∞− ic m T −∞− ic m T e − s (cid:48) T ( s (cid:48) + ic m T ) α ds (cid:48) . (2.30)7ecause the latter is a contour integral in the complex domain in which one term is integratedalong the real axis and another one along a semi circle at inﬁnity, the shift of the real axis inthe complex direction by − ic m T does not change the integral value as long as there are no, orthe same number of singularities enclosed by the contour. Therefore, we can equivalently writeR mα = 1 √ πT (cid:90) ∞−∞ e − s (cid:48) T ( s (cid:48) + ic m T ) α ds (cid:48) (2.31)while for α = 0 there followsR m = 1 √ πT (cid:90) ∞− ic m T −∞− ic m T e − s (cid:48) T ds (cid:48) = 1 √ πT (cid:90) ∞−∞ e − s (cid:48) T ds (cid:48) = 1 . (2.32)For the ease of notation, we will skip (cid:48) in the following. Using the binomial expansion ( x + y ) α = (cid:80) αβ =0 (cid:0) αβ (cid:1) x α − β y β , Eqn. (2.31) becomesR mα = α (cid:88) β =0 (cid:18) αβ (cid:19) (cid:18) √ πT (cid:90) ∞−∞ e − s T s α − β ds (cid:19) ( ic m T ) β . (2.33)The statement follows by using expression (2.26) for the term in brackets.Based on these results, we can further expand (2.22) to get p (cid:88) k =0 M jk ∂∂t V k ( t ) = (cid:88) m p (cid:88) k =0 p (cid:88) l =0 F m,k,l ( t )R mjkl e ic m t e − c mT for j = 0 , , , (2.34)where M jk = M α for α = j + k and R mjkl = R mα for α = j + k + l . Note that F m,k,l = F m,l,k forall m, k, l .Equation (2.34) will serve us in Section (3) to develop a higher order averaging method foran ODE describing a swinging spring. But before we discuss this example, let us ﬁrst presenta explicit representation of (2.34) and how the equation behaves in the zero and inﬁnity limitsof T . p = 2For the sake of clarity, we present an explicit representation of Eqn. (2.34) for p = 2, i.e. j, k, l = 0 , ,

2. Proposition 2.1 leads to a mass matrix M with coeﬃcients (in terms of M α ):M = 1 , M = 0 , M = T , M = 0 , M = 3 T ; (2.35)and we can explicitly write the inverse of M asM − =  T T T T  − =  − T T − T T  . (2.36)Moreover, Proposition 2.2 leads to the coeﬃcients (in terms of R mα ):R m = 1 , R m = ic m T , R m = T − c m T , R m = i c m T − ic m T , R m = 3 T − c m T + c m T , R m = i cmT − i c m T + ic m T , R m = 15 T − c m T + 15 c m T − c m T . (2.37)8ith these coeﬃcients, the explicit form of Eqn. (2.34) reads  M M M M M   ˙ V ( t )˙ V ( t )˙ V ( t )  = (cid:88) m (cid:88) k =0 2 (cid:88) l =0 e ic m t e − c mT F m,k,l  R m kl R m kl R m kl  = (cid:88) m e ic m t e − c mT × (cid:104) F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  + F m, ,  R m R m R m  (cid:105) . (2.38)Bringing M to the LHS, we result in  ˙ V ( t )˙ V ( t )˙ V ( t )  = (cid:88) m (cid:88) k =0 2 (cid:88) l =0 e ic m t e − c mT F m,k,l  − T  R m kl +  T  R m kl +  − T T  R m kl  . (2.39)As it is useful for the asymptotic study further below, we present a fully explicit version of (2.39)in terms of tendencies for V , V , and V in Appendix A. This explicit representation illustratesclearly the coupling between the principal V term and the higher order ones V i , i = 1 , . . . p and allows us to visualize the expansion’s asymptotic behavior with respect to the averagingwindow T . Here we explicitly see the smoothing eﬀect for large averaging window T , as all ofthe terms are exponentially small in T except for resonant cases where c m is small. T We illustrate the asymptotic behavior of the expansion (2.39) on the example presented abovefor p = 2, but the results hold for general p . In particular, we investigate its behaviour forlarge and small T , which can be inferred from an explicit representation of (2.39), as shownin Appendix A. Here, we present a general discussion while showing a concrete example inSection 3. Limit T → : We study equation (2.39) in the limit T →

0. In this limit, many terms vanish(cf. Appendix A) and we arrive at  ˙ V ( t )˙ V ( t )˙ V ( t )  = (cid:88) m e ic m t ×  F m, , ic m F m, , + 2 F m, , − c m F m, , + F m, , + 2 ic m F m, , + 2 F m, ,  . (2.40)In the limit T → V , i.e. V does not depend on the higher order terms. When considering only V , which isall that we are interested in, we recover in fact the non-averaged equation (2.3).9 emark 2.3. Considering the full system (2.40), higher order terms do not vanish and in thenumerical algorithm below, these terms will be calculated too. However, for T very small, thecondition number of matrix M is very large leading to increased numerical errors (cf. discussionto Figure 4.4). In practice this can be dealt with by choosing a better conditioned basis suchas an orthogonal basis with respect to the ρ -weighted inner product (which would be Hermitepolynomials in the examples we have computed here). However, this only becomes an issuein the small T limit, and we are interested in larger T values to facilitate larger timesteps innumerical integrations. Limit T → ∞ : We study expansion (2.39) for T → ∞ . To this end, we have to distinguishbetween the terms in the sum on the RHS of (2.39) with coeﬃcients (i) c m (cid:54) = 0 and (ii) c m = 0(see e.g. the coeﬃcients in Table 3.1).Case 1: for c m (cid:54) = 0 there follows e ic m t e − c mT = 0 for T → ∞ such that the correspondingterms on the RHS of (2.39) are zero (cf. Appendix A).Case 2: for c m = 0 there follows e ic m t e − c mT = 1 for any T . Hence, not all terms on theRHS vanish (cf. Appendix A) and we arrive at the following equation:  ˙ V ( t )˙ V ( t )˙ V ( t )  = (cid:88) m  F m, , − T F m, , F m, , + 6 T F m, , F m, , + 2 F m, , + 6 T F m, ,  . (2.41)Similar to the zero limit (2.40), there is a decoupling of the higher order modes from theprinciple one: the F m, , term vanishes in the higher order modes. Therefore, as all V i , i = 1 , ..p are initialized with zero, they remain zero including the F m, , that would contribute otherwiseto V . Therefore, we recover indeed in this limit an asymptotic approximation of (2.3) in formof Majda and Embid (1998). We illustrate the properties of the higher order averaging method introduced above on an ODEthat describes the dynamics of a swinging spring, a model due to Peter Lynch (Holm and Lynch,2002). Although idealized, this model shows an interesting analogy to geophysical ﬂows as itexhibits a high sensitivity of small scale oscillation on the large scale dynamics.

For the swinging spring (expanding pendulum) model of Holm and Lynch (Holm and Lynch,2002), we consider the following equations of motion¨ x ( t ) + ω R x ( t ) = λx ( t ) z ( t ) , ¨ y ( t ) + ω R y ( t ) = λy ( t ) z ( t ) , ¨ z ( t ) + ω Z z ( t ) = 0 . λ ( x ( t ) + y ( t ) ) , (3.1)where x ( t ) , y ( t ) , z ( t ) are (time dependent) Cartesian coordinates centered at the point of equi-librium, ω R = (cid:112) g/l is the frequency of linear pendular motion, and ω Z = (cid:112) k/m u is the10requency of elastic oscillation with spring constant k , unit mass m u and gravitational constant g . l denotes the unstretched length while l is the length at equilibrium such that λ = l ω Z /l .The dot denotes the time derivative ddt .To bring Eqn. (3.1) into the abstract form (2.1), we ﬁrst reformulate the former into theﬁrst order system ˙ x ( t ) − ω R p x ( t ) = 0 , ˙ p x ( t ) + ω R x ( t ) = λω R x ( t ) z ( t ) , ˙ y ( t ) − ω R p y ( t ) = 0 , ˙ p y ( t ) + ω R y ( t ) = λω R y ( t ) z ( t ) , ˙ z ( t ) − ω Z p z ( t ) = 0 , ˙ p z ( t ) + ω Z z ( t ) = λω Z ( x ( t ) + y ( t ) ) , (3.2)with p x := ∂p∂x and so forth. Then, by introducing the complex numbers X ( t ) = x ( t ) + ip x ( t ) , Y ( t ) = y ( t ) + ip y ( t ) , Z ( t ) = z ( t ) + ip z ( t ) , we can rewrite Eqn. (3.2) as˙ X ( t ) + iω R X ( t ) = i λω R Re ( X ( t )) Re ( Z ( t )) , (3.3)˙ Y ( t ) + iω R Y ( t ) = i λω R Re ( Y ( t )) Re ( Z ( t )) , (3.4)˙ Z ( t ) + iω Z Z ( t ) = i λ ω Z ( Re ( X ( t )) + Re ( Y ( t )) ) . (3.5)This form of the equations can be expressed in the abstract form (2.1) by deﬁning the velocityas U ( t ) =  X ( t ) Y ( t ) Z ( t )  =  x ( t ) + ip x ( t ) y ( t ) + ip y ( t ) z ( t ) + ip z ( t )  , (3.6)the linear operator as L =  − iω R − iω R − iω R ρ  , (3.7)and the nonlinear (quadratic) operator as N ( U ( t ) , U ( t )) = i λω R  Re ( X ( t )) Re ( Z ( t )) Re ( Y ( t )) Re ( Z ( t )) ρ ( Re ( X ( t )) + Re ( Y ( t )) )  . (3.8)Using these deﬁnitions, Eqn. (3.1) can also be written in the modulated form (2.3). Thatis, the time evolution of the velocity V ( t ) with transformation V ( t ) = e − tL U ( t ) , V ( t ) =  ˆ X ( t )ˆ Y ( t )ˆ Z ( t )  , (3.9)11nder the linear operator L of (3.7) is given by Eqn. (2.3) with nonlinear operator N of (3.8).Explicitly, we arrive at the component equations:˙ˆ X = c xy ˆ X ˆ Ze − iω R ρt + c xy ˆ X ˆ Ze iω R ρt + c xy ˆ Z ˆ Xe − iω R ρt +2 iω R t + c xy ˆ X ˆ Ze iω R ρt +2 iω R t (3.10)˙ˆ Y = c xy ˆ Y ˆ Ze − iω R ρt + c xy ˆ Y ˆ Ze iω R ρt + c xy ˆ Z ˆ Y e − iω R ρt +2 iω R t + c xy ˆ Y ˆ Ze iω R ρt +2 iω R t (3.11)˙ˆ Z = c z ( ˆ X + ˆ Y ) e iω R ρt − iω R t + 2 c z ( ˆ X ˆ X + ˆ Y ˆ Y ) e iω R ρt + c z ( ˆ X + ˆ Y ) e iω R ρt +2 iω R t (3.12)with coeﬃcients c xy = 3 iω R ρ /

16 and c z = 3 iω R ρ/

32. Overline notation denotes the conjugate.For ease of notation, we omitted here to include the time dependency of the transformedcoordinates ˆ X, ˆ Y , ˆ Z . After solving for V , we have to transform back to U with U ( t ) = e tL V ( t )to ﬁnd solutions of (2.1).The explicit formulation in (3.10)–(3.12) allows us to identify the problem (or ODE) depen-dent coeﬃcient c m and F m,k,l = ( F xm , F ym , F zm ) ∀ k, l of the expansion (2.34). Note that the lattercoeﬃcients change with m while they are the same for all k, l . The following table summarizesthese coeﬃcients: c m F xm F ym F zm c := ( ρ − ω R F z = c z ( ˆ X ˆ X + ˆ Y ˆ Y ) c := ρω R F x = c xy ˆ X ˆ Z F y = c xy ˆ Y ˆ Z F z = 2 c z ( ˆ X ˆ X + ˆ Y ˆ Y ) c := ( ρ + 2) ω R F x = c xy ˆ X ˆ Z F y = c xy ˆ Y ˆ Z F z = c z ( ˆ X ˆ X + ˆ Y ˆ Y ) c := − ρω R F x = c xy ˆ X ˆ Z F y = c xy ˆ Y ˆ Z c := − ( ρ + 2) ω R F x = c xy ˆ X ˆ Z F y = c xy ˆ Y ˆ Z Limits for T : Similarly to Section 2 but here for concrete model equations, i.e. for theODE (3.1) of the swinging spring, we present the equations describing the time evolution of V resulting from taking the limits of (2.34) in T , i.e. for (i) T → T → ∞ .Case (i): as discussed in (2.40), in the T to zero limit the higher order terms decouplefrom the principle mode V such that for the time evolution of the latter only the F m, , termscontribute. In this limit and using the coeﬃcients of Table 3.1, the resulting set of equationsfor V coincides with equations (3.10)–(3.12) of the exact model.Case (ii): as shown in (2.41), all terms on the RHS of the expansion (2.34) vanish in the T → ∞ limit except of those with c m = 0 (i.e. here for ρ = 2). Considering further that higherorder terms in V (i.e. for k > l >

0) are initialized with zero, these terms, and inparticular the F m, , terms in the ﬁrst line in (2.41), remain zero. Then, for any approximationorder p , the resulting set of equations for V are given by˙ˆ X = c xy ˆ Z ˆ X, ˙ˆ Y = c xy ˆ Z ˆ Y , ˙ˆ Z = c z ( ˆ X + ˆ Y ) . (3.13)This is exactly the phase-averaged model that was derived by Holm and Lynch (2002) usingWitham averaging. 12 Numerical results

We investigate accuracy and stability of the higher order phase averaging method on numericalsimulations of the swinging spring model as introduced in the previous section. We comparesolutions obtained from the approximated models with those from the non-averaged (exact)equations (2.3). In particular, the convergence behavior of solutions obtained from averagingmodels with increasing approximation order to exact solutions will be investigated for ﬁniteaveraging windows. Besides showing that higher order phase averaging indeed leads to a betterapproximation of the fast oscillations around the slow manifold than standard (ﬁrst order)averaging, we illustrate that this also reduces the drift from averaged solutions away from theexact solutions, which we experienced in our simulations.Analogously to Holm and Lynch (2002), we choose for the numerical simulations the fol-lowing parameters for the ODE in equation 3.1: m u = 1 kg, l = 1 m, g = π m s − , k = 4 π kg s − and hence ω R = π and ω Z = 2 π with resonance condition ω Z = 2 ω R . The unstretchedlength l = 0 .

75 m follows from the balance k ( l − l ) = g m u allowing us to determine λ in (3.1).Finally, the equations are initialized with( X (0) , Y (0) , Z (0)) = (0 . , , . , ( ˙ X (0) , ˙ Y (0) , ˙ Z (0)) = (0 , . , , (4.1)for the principle mode V while values for V (cid:54) =0 and ˙ V (cid:54) =0 are all initialized with zero and resetback to zero after an integration window ∆ T . This resetting avoids potential instabilities inthe higher order modes in V i , i = 1 , ...p (cf. discussion to Figure 4.2). If not stated otherwise,we use a resetting window of ∆ T = 100 while other choices are also possible because this choicedoes not signiﬁcantly impact on the accuracy of the solutions, as shown below in Figure 4.6.Note that with this parameter choice, the fast and slow motions occurring in the simulationshave a scale separation at the order of about (cid:15) ≈ .

01 (in good agreement with values forgeophysical ﬂows).We integrate the ODEs in time using the adaptive RK4/5 explicit integrator implementedthrough the Python Scipy package. We integrate up to 1000 seconds (corresponding to 1000 ver-tical oscillations) using adaptive relative and absolute tolerance of 1 . e − unless otherwisestated. First, we verify that the solutions of our phase averaged models (including the ﬁrst order version)stay close to the corresponding exact solutions and capture the same characteristics typical tothe swinging spring model (given, for instance, by the stepwise precision of the pendulum inthe ( x, y )-plane). To this end, we compare for a long term simulation of 1000 s the solutions in U of the exact model, the averaged one with p = 0 that corresponds to Peddle et al. (2019)and a higher order (HO) version with, e.g., p = 6. In order to guarantee stable simulations, wereset the higher order values in U (or V ) to zero, here after a resetting time of ∆ T = 1. Thesize of this resetting window ∆ T does only marginally impact the accuracy (cf. Figure 4.6) butresetting prevents the averaged solutions to become unstable, in particular for large p and longsimulations (cf. Figure 4.2). Large scale dynamics.

In Figure 4.1 (left) we see the projection of the pendulum movementonto the horizontal ( x, y )-plane. Both, the solutions for the averaged and HO averaged models13 .02 0.01 0.00 0.01 0.02X0.020.010.000.010.02 Y exactavg, T=0.5HO avg, p=6, T=0.5 0 25 50 75 100 125 150 175time t/s0.0100.0050.0000.0050.010 Z exactavg, T=0.5HO avg, p=6, T=0.5 Figure 4.1: Left: horizontal projection of pendulum movement (i.e.

X, Y coordinates) for 1000sfor the exact, averaged and higher order averaged models. Right: Vertical projection onto Z coordinate of pendulum movement over time t for 167s (corresponding approximately the 1stof 6 modulation cycles).are very close to the exact solutions, almost indistinguishable in the given ﬁgures that showthe large scale dynamics. Here, we used an averaging window of T = 0 . T the accuracy does increase (cf. e.g. Figure 4.5). In particular, the solutions of the averagedmodels capture very well the stepwise precision of the swinging spring model. In the right panel,the movement of the pendulum in the Z axis versus time is shown. Also here, both averagedand HO averaged solutions are very close to the exact one. The ﬁgure illustrates nicely fastoscillations with a slowly varying amplitude envelope. This typical behavior of the swingingspring model makes it thus a perfect test problem for our averaging method developed above. Smoothing properties.

Next, we illustrate how the proposed averaging algorithm ﬁlters fastmotions from the dynamics of the swinging spring model while still accounting for their eﬀect onthe slow dynamics. Given the factorization assumption (2.2) that underlies equations (2.3), itis useful to study in more detail solutions in terms of the velocity V because the latter providesolutions of the slow dynamics without the fast oscillations intrinsic to U . As such, V (inparticular V ) allows us to clearly illustrate how the averaging method ﬁlters fast oscillationsand how the higher order methods better approximate small scale oscillations around the slowdynamics.Figure 4.2 shows the solutions for V (i.e. ˆ X, ˆ Y , ˆ Z ) for the exact, the averaged, and theHO averaged models without resetting and with resetting at ∆ T = 100. We use an averagingwindow of T = 0 . p = 8 for the HO method as this provides suﬃciently accuratesolutions (cf. Figure 4.5). The ﬁgure illustrates also a solutions of the HO model for p = 8 ( ˆ X nores, ˆ Y nores, ˆ Z nores) that becomes slightly unstable after about 800 s in case no resetting isused. This instability, however, is easily controlled by simple resetting the higher order velocitycomponent at each ∆ T to zero, resulting in the stable solutions ˆ X HO, ˆ Y HO, ˆ Z HO.Next, we compare the exact solutions in all three coordinates with those of the averagedmodels. We realize that for longer simulation times, the averaged solutions slightly drift awayfrom the exact ones, as shown on the right panels in Figure 4.3. This is particularly obviousin the Z coordinate. Further in these ﬁgure, we see clearly that with increasing approximation14

200 400 600 800 1000time t/s0.0150.0100.0050.0000.0050.0100.0150.0200.025 X , Y , Z X modX avgX HO Y modY avgY HO Z modZ avgZ HO X noresY noresZ nores

Figure 4.2: V solutions of the exact, averaged (avg with p = 0) and higher order (HO) models(with p = 8) and T = 0 .

2; reset every ∆ T = 100 or with no resetting (nores). Here, no resettingleads to instabilities for high p ≥ p the solutions’ drift reduces signiﬁcantly whilst the small scale oscillations are betterapproximated. Note that the standard averaging method with p = 0 corresponds to the originalversion as introduced in Haut and Wingate (2014). As discussed therein, the averaging leadsto some drift away from the exact solutions. As the ﬁgure illustrates clearly, including thehigher averaging terms signiﬁcantly reduces this drift while providing solutions much closer tothe exact ones.When considering short integration times, as shown on the left panels in Figure 4.3, it isparticularly obvious that the higher order method more accurately approximates small scaleoscillations than the standard method. While both approximate well the slower dynamics,the standard averaging method almost entirely smooths out the faster oscillations while withincreasing polynomial approximation order, these fast oscillations are better and better capturedby by the solutions of the HO averaging models. Here, we study the accuracy of solutions obtained from our higher order averaging methodcompared to exact solutions, obtained from numerical simulations of (2.3), by performing aconvergence analysis of the solutions and their dependencies on the averaging window T andthe polynomial degree p . For the chosen T from above, we already veriﬁed that with increasing p the error in the averaged solutions decreases signiﬁcantly. However, these error depend cruciallyon the choice of averaging window T . For instance, in case T is too large, even high polynomialapproximations will have an signiﬁcant error.In the following, we will therefore study solution errors with respect to both p and T andprovide a 2 dimensional (2D) error map. To determine these 2D error maps, we deﬁne the15 .1 0.0 0.1 0.2 0.3 0.4 0.5 0.6time t/s0.0059700.0059750.0059800.0059850.0059900.0059950.0060000.006005 X X modX avgX HO p=1 X HO p=2 X HO p=3 X HO p=4 X HO p=5 X HO p=6 X HO p=7 X HO p=8 X HO p=9 X HO p=10 998.0 998.5 999.0 999.5 1000.0 1000.5 1001.0time t/s0.005350.005400.005450.005500.005550.00560 X X modX avgX HO p=1 X HO p=2 X HO p=3 X HO p=4 X HO p=5 X HO p=6 X HO p=7 X HO p=8 X HO p=9 X HO p=10 0.2 0.0 0.2 0.4 0.6 0.8 1.0time t/s0.000010.000000.000010.000020.000030.000040.00005 Y Y Y modY avgY HO p=1 Y HO p=2 Y HO p=3 Y HO p=4 Y HO p=5 Y HO p=6 Y HO p=7 Y HO p=8 Y HO p=9 Y HO p=10 998.0 998.5 999.0 999.5 1000.0 1000.5 1001.0time t/s0.0011000.0010750.0010500.0010250.0010000.0009750.0009500.0009250.000900 Y Y Y modY avgY HO p=1 Y HO p=2 Y HO p=3 Y HO p=4 Y HO p=5 Y HO p=6 Y HO p=7 Y HO p=8 Y HO p=9 Y HO p=10 0.2 0.0 0.2 0.4 0.6 0.8 1.0time t/s0.0119800.0119850.0119900.0119950.0120000.012005 Z Z Z modZ avgZ HO p=1 Z HO p=2 Z HO p=3 Z HO p=4 Z HO p=5 Z HO p=6 Z HO p=7 Z HO p=8 Z HO p=9 Z HO p=10 998.0 998.5 999.0 999.5 1000.0 1000.5 1001.0time t/s0.004500.004750.005000.005250.005500.005750.006000.006250.00650 Z Z Z modZ avgZ HO p=1 Z HO p=2 Z HO p=3 Z HO p=4 Z HO p=5 Z HO p=6 Z HO p=7 Z HO p=8 Z HO p=9 Z HO p=10

Figure 4.3: ˆ X , ˆ Y , ˆ Z solutions of the exact, averaged (avg with p = 0) and higher order (HO)models with resetting (∆ T = 100) and for an averaging window of T = 0 . L error measure for solution ˆ X ( t ) (and similarly for ˆ Y ( t ) , ˆ Z ( t ))L ˆ X ( t ) = (cid:113)(cid:82) tf ( ˆ X ( t ) − ˆ X e ( t )) dt (cid:113)(cid:82) tf ˆ X e ( t ) dt (4.2)where ˆ X ( t ) and ˆ X e ( t ) is the x-coordinate of V of the averaged and exact solutions, respectively(and similarly for ˆ Y , ˆ Z ), and where tf is the ﬁnal time.We will consider three regimes of averaging window size T , namely (i) T →

0, (ii) T ∈ [0 . , . T → ∞ , whilst studying the impact of higher order polynomial approxima-tion in the solver on the solutions’ accuracy. In practice, we achieve with our implementationand, e.g., p = 5 values for T of about 0 . e + 12 for case (iii), or evensmaller respectively larger T for smaller p . We will consider an integration time of tf = 167 s16 T X T Y T Z Figure 4.4: 2D error plots ( T − p ) for solutions V in the limit T →

0. Error values arecalculated using the default solver tolerance of 1 . e − .here (corresponding to the ﬁrst cycle) and a time step interval of ∆ t = 0 .

001 s and we use thedefault adaptive timestepping error tolerance of 1 . e − or otherwise stated.In Figure 4.4 we presented results for case (i) for small T ∈ [0 . , . , ... . T ≈ .

005 we have minimum error values in ˆ X , ˆ Y , ˆ Z of about 10 − for p = 5. With these values we reach the default solver tolerance ( ≈ − ), while with asolver tolerance of about 10 − , we reach for the same p and T values minimum errors of about10 − that show a similar error pattern to Figure 4.4. With even smaller tolerances the solutionsbecome even more accurate (not shown). For each tolerance studied, we experience for values of T smaller than about 0 . T values in Figure 4.4).This increase is related to the fact that for T ≤ . M gets huge, leading to increased numerical errors in the numerical solver of the ODE. Theseresults conﬁrm the discussion of Section 2.2.2.Next, we investigate case (ii). As shown in Figure 4.5, for averaging window sizes smallerthan T ≤ .

2, the higher order method signiﬁcantly improves the accuracy (cf. also Figures 4.2and 4.3). In particular for T = 0 .

05 and p = 10 we reach solver tolerance for all three coordi-nates, but also for other values with T ≤ .

2, our method improves the accuracy signiﬁcantly.For values 0 . ≤ T ≤ .

5, the HO method still provides improvements but with error values for p = 10 of 10 − and 10 − for ˆ X , ˆ Y and ˆ Z , respectively. For even larger averaging windowsof T ≥ . p (notshown). This is because in this case, the exponential function dominates the factors in theexpansion in (2.34) and, therefore, the results correspond to the original averaging method ofPeddle et al. (2019). Also these results agree with the derivations of Section 2.2.2.As indicated above, the resetting after the time interval ∆ T of the higher order velocitycomponents prevents the HO models from blowups. The latter are due to instabilities in thehigher order terms which, in turn, are caused by a growing discrepancy between principaland higher order modes in V that might happen after long simulation times, cf. Figure 4.2.In Figure 4.6, we present error plots for solutions obtained with the higher order averagingapproach for two diﬀerent resetting values: ∆ T = 0 . T = 100 (right). Thesimilarity of both error maps indicates that a change in the resetting window size ∆ T hasalmost no impact on the accuracy of the solutions.17 T X T Y T Z Figure 4.5: 2D error plots ( T − p ) for V for intermediate values of T . Error values calculatedusing the default solver tolerance of 1 . e − . T X, dT = 10 T X, dT = 10000 Figure 4.6: 2D error plots ( T − p ) for V for diﬀerent resetting windows ∆ T . In this paper we introduced a higher order ﬁnite window averaging technique and investigated itby specialising to Gaussian weight functions to allow for explicit computation of averages. Thisenabled us to eﬃciently investigate the higher order technique when applied to highly oscillatoryODEs, in particular Lynch’s swinging spring model. We found that higher order corrections canstrongly increase the accuracy of the averaged model solutions, but only when the averagingwindow is close to the time period of the fast frequency (this model only has two fast frequencies,in 2:1 resonance). We expect the situation to become more complicated when there are aspectrum of fast frequencies. In this case, it is known that very high frequencies do not changethe slow components of the solution much, but moderate fast frequencies can resonate in thenonlinearity (just like in the swinging spring), altering the slow dynamics. In this case onewould want to select an averaging window that removes as much fast dynamics as possible,whilst preserving accuracy of the slow dynamics arising from near-resonant interactions, andthe higher order model could prove useful in reducing the impact of the averaging on this slowdynamics.In this paper we have also avoided numerical aspects, such as the eﬀects of approximateaverages by numerical quadrature, and the interaction of numerical time integratores with theaveraging. An analysis using the ideas of Peddle et al. (2019) would also be useful here. Lookingfurther ahead, if the higher order averaging model is shown to improve the accuracy of averaging18ethods whilst still allowing larger time steps in numerical integrations, this would motivatethe investigation of multilevel schemes such as PFASST, where the low order averaging is usedas a coarse propagator and higher order averaging models provide higher order corrections,some of which can be hidden in parallel behind ﬁrst iterations in later timesteps.