Ensemble Dynamics and Bred Vectors
Nusret Balci (Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN 55455, U.S.A.)
Anna L. Mazzucato (Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, U.S.A.)
Juan M. Restrepo∗ (Department of Mathematics and Department of Physics, University of Arizona, Tucson, AZ 85721, U.S.A.)
George R. Sell (School of Mathematics, University of Minnesota, Minneapolis, MN 55455, U.S.A.)
September 3, 2018
Abstract
We introduce a new concept, the Ensemble Bred Vector (EBV), to assess the sensitivity of model outputs to changes in initial conditions for weather forecasting. The new algorithm is based on collective dynamics in essential ways. As such, it keeps important geometric features which are lost in the earlier bred vector (BV) algorithm. By construction, the EBV algorithm produces one or more dominant vectors and is less prone to spurious results than the BV algorithm. It retains the attractive features of the BV with regard to being able to handle legacy codes, with minimal additional coding.

We investigate the performance of the EBV, comparing it to the BV algorithm as well as to the finite-time Lyapunov vectors. With the help of a continuous-time adaptation of these algorithms, we give a theoretical justification of the observed fact that the vectors produced by BV, EBV, and the finite-time Lyapunov vectors are similar for small amplitudes. The continuum theory establishes the relationship between the two algorithms and general directional derivatives.

Numerical comparisons of BV and EBV for the 3-equation Lorenz model, and for a forced, dissipative partial differential equation of Cahn-Hilliard type that arises in modeling the thermohaline circulation, demonstrate that the EBV yields a size-ordered description of the perturbation field and is more robust than the BV in the highly nonlinear regime. The EBV yields insight into the fractal structure of the Lorenz attractor, and of the inertial manifold for the Cahn-Hilliard-type partial differential equation.

Keywords: Bred vectors, Lyapunov vectors, sensitivity, dynamic stability, Cahn-Hilliard, Lorenz.

∗ Corresponding Author: [email protected]
PACS 94.05sx, 92.60.Ry, 92.60.Wc, 93.65.+e, 92.70.-j, 02.50.-r
Central to weather prediction is the analysis of the sensitivity of a physical or computer-coded model to initial conditions. Model sensitivity to parameters is also important in model inter-comparison. One studies such sensitivity in order to obtain a better understanding of the role played by these parameters in model outcomes.

Sensitivity and predictability are often intertwined in the context of weather prediction and have been the subject of extensive research (see Buizza et al. (1993) and references contained therein). These are not exclusively weather-related issues, and thus geophysical fluid dynamics will often mine other physical, computational, and mathematical disciplines for ideas with which to assess dynamic sensitivity. Practical sensitivity methodologies must contend with the evolution and dynamics of highly coupled, complex, high-dimensional systems, riddled with subscale parameterizations and empirical relations, which are the norm in large-scale climate and meteorology models.

A tool used in sensitivity analysis is the Bred Vector (BV) algorithm, which has been proposed for use in forward sensitivity studies of weather and climate models. In Subsection 2.1 we present a brief survey of some of the applications of this algorithm in various sensitivity analyses; in this article, we focus on the issue of the maximal growth of errors due to small changes in the initial conditions.

The concept of the BV algorithm we use is based on the theory first introduced in Toth and Kalnay (1993). In addition to the BV notion, we present here a new variant, which we call the
Ensemble Bred Vector (EBV) algorithm. The definitions of both the BV and EBV algorithms are presented in Section 2.

In the BV algorithm, one follows an initial condition of the time-discrete nonlinear system, along with a cloud, which describes a family of nearby solutions. (Since this algorithm is used to sample the error space, an ensemble of initial perturbations is bred simultaneously.) The perturbations at the initial time are fixed with a common small amplitude ε. After each cycle, the outcome of the perturbations is rescaled to the same amplitude ε. For the BV algorithm, the rescaling of each perturbation is independent of the others, and there is no mechanism to use the rescaling to compare the dynamics of nearby perturbations.

The new variation that we propose here, the EBV algorithm, differs from the BV algorithm in the rescaling rule. In particular, for the EBV, those perturbations that are not the same size as the largest perturbation play a reduced role after the rescaling. Thus the rescaling used in the EBV algorithm serves us better in separating various levels of the dominant dynamics. In short, the EBV algorithm offers better insight into the relative behavior of nearby trajectories. Therefore, even when the initial perturbations of the two algorithms are based on the same cloud, the EBV algorithm is linked to the ensemble dynamics of the underlying nonlinear model, hence its name.

We will make use of both the BV and the EBV algorithms. In fact, one of our major goals is to present an in-depth comparison of the two algorithms, as they are used, or can be used, in describing the underlying dynamics of the model.

We will show that in some metrics the BV and EBV algorithms are comparable, with the EBV being more accurate and faster; see Table 1 and Figures 6, 7, 8, for example. For other issues, especially those involving the longtime dynamics within the global attractor, the BV algorithm has a shortcoming, which limits its use (see Section 4).
The EBV, instead, leads to useful and interesting insight into the dynamics of the model, as is shown in Figures 4, 1, and 2.

This brings up the question: for a given model of sensitivity with respect to initial conditions, how does one determine the direction vector that results in the maximal increase in the error due to a small perturbation in the given direction initially? This is where the Lyapunov vectors enter the scene. What one needs is a red vector, which is the Lyapunov vector x with ‖x‖ = 1 and with the property that the corresponding strong Lyapunov exponent λ is the maximal Lyapunov exponent for the model. (See Subsection 2.5 for the definition and more details. One should note that the Lyapunov exponents for the model require integration over 0 ≤ t < ∞, or over the real line R.)

We will use either the EBV or the BV algorithm to approximate the solutions of the tangent linear equation. In Section 3 we show that either algorithm is a good approximation. For these algorithms one can integrate only over a finite interval 0 ≤ t ≤ T, where 0 < T < ∞. However, it is only under exceptional circumstances, e.g., time-periodic or autonomous problems, that a finite-time integration will approximate well an infinite-time average. For any hope of success in using a finite-time integration, we require that the model problem satisfy two properties: (1) there is an attractor A for the solutions, and (2) the initial condition is chosen to be on, or near, A.

What can one expect with such a finite-time approximation, when the initial conditions are near the attractor A? In terms of the calculated time, one expects first to be in a transient state.
Then, after a while, one hopes to get some meaningful information about the longtime dynamics of the model. We include in this manuscript several studies of such approximations.

It is very important to note that, in order to better understand the maximal growth of errors due to small changes in the initial conditions, one needs to exploit the dynamical information contained in the attractor of the model. In particular, one needs to complete two steps:

• Step 1: Locate the red vector x and the associated Lyapunov exponent λ. This then determines the red vector solution R(x, t), for −∞ < t < ∞, for the model equation.

• Step 2: Find a good approximation of the time evolution R(x, t) on an appropriate finite-time interval, 0 ≤ t ≤ T < ∞.

Once the red vector x is known, then εx is the initial condition for the bred vector sequence; see Subsection 2.2. Due to the results derived here in Section 3, either EBV(t) or BV(t) is a good finite-time approximation of R(x, t), for 0 ≤ t ≤ T < ∞. This takes care of Step 2. Consequently, the problem boils down to a search for the red vector, which is addressed below in Subsection 2.5. Fortunately for us, there is a wealth of related mathematical information in the 1987 manuscript of R. A. Johnson, K. J. Palmer, and G. R. Sell; see Johnson et al. (1987). (We will refer to this paper as "JPS87" in the sequel.) As we shall show, by using JPS87 we are able to describe the mathematical process of finding the red vector. Combining this with the theory of the EBV(t) algorithm leads to a good solution for our sensitivity problem.

It was observed by Toth & Kalnay, op. cit. (see also Toth and Kalnay (1997)), in several experimental runs, that BVs resemble the leading finite-time Lyapunov vectors.
In order to make a more quantitative comparison between Lyapunov vectors and BVs, in Section 3 we study the continuum limits (as the basic time-step size for rescaling goes to 0) of the BV and the EBV algorithms and show direct connections between these limits and specific solutions of the continuous-time tangent linear equations. Section 3 also contains a further discussion of some of the desirable and interesting features of the new EBV. For instance, there is a natural ordering of the ensemble members of an EBV, and, as we will show, it is possible to observe perturbations with smaller sizes than the dominant one, but with very strong growth; see the spear-like behavior in Figure 4.

We consider two models to exemplify the features of the new EBV. The first is the familiar Lorenz63 model introduced by Lorenz (1963). It has a well-known global attractor. The second is a nonlinear forced and dissipative partial differential equation of Cahn-Hilliard type. This equation is a variation of a model proposed by Cessi and Young (1992) of the oceanic thermohaline circulation. We will denote the equation associated with this model as the CY92 in this study. (In fact, we impose periodic boundary conditions instead of the more physical zero-flux, zero-stress conditions at the poles.)

It turns out that the CY92 is a good example of typical climate-related model dynamics. However, the CY92 is special, since it has an inertial manifold. Consequently, the longtime dynamics of this partial differential equation is completely contained in the attractor of a finite-dimensional ordinary differential equation. In applications, the BV algorithm and its variants are in fact discrete-time algorithms based on finite-dimensional approximations of weather models, obtained either by mode projection, as in the Lorenz (1963) system, or by spatial discretization of partial differential equations (PDEs), as in the Cessi-Young (CY92) model, which will be described below.
As we will show, the EBV yields insights into the structure of the attractor of the Lorenz63, and of the inertial manifold of the CY92.

Our numerical examples also highlight that the BV algorithm is sensitive to the amplitude and frequency content of the initial perturbation. In contrast, the outcome of the EBV algorithm shows a clear hierarchy among its members, and the first few members already generate an unambiguous characterization of the perturbation field at both large and small amplitudes. In the nonlinear regime, moreover, the EBV will be shown in Section 4 to be less likely to produce spurious results than the BV. In Section 5 we will address implementation issues of the EBV.

In this section we present the definitions and methodology for computing the two bred vector algorithms, the BV and the EBV. We also review the basic theory of Lyapunov vectors, Lyapunov exponents, and their finite-time counterparts. We will compare these different tools below. However, before doing so, we include here a brief survey of some of the applications and theoretical issues that have been noted in the implementation of the BV algorithm.
For the BV algorithm, we use the one originally proposed by Toth and Kalnay (1993). The BV is purely algorithmic. It is "equation-free," and thus, with minimal additional computer coding, it can handle legacy code representing even extremely complex models. Most alternatives for obtaining estimates of forward sensitivity involve non-trivial additional coding. For example, in order to obtain the finite-time Lyapunov vectors and exponents, one needs to derive and make use of the tangent linear model. (From the beginning, it was realized that there existed some close connections between the BV algorithm and the tangent linear equations. As we will show in Section 3, there is a rigorous mathematical foundation for these connections.) Singular value decomposition methods, which can offer complementary information to Lyapunov-vector-inspired methods, also require a tangent linear model. Deremble et al. (2009) use this approach to study regime predictability in some reduced weather models. Other examples are Buizza et al. (1993) and Palmer et al. (1998). In another direction, Wolfe and Samelson (2007) propose the use of the MET (the Multiplicative Ergodic Theorem) and the finite-time singular vectors to approximate the Lyapunov vectors.

The BV algorithm is a finite-time, forward sensitivity methodology which, in addition to being useful in characterizing model sensitivity to initial conditions, has been proposed as a means to produce a reduced-rank representation of the background error in data assimilation and forecast error-covariance approximations (see Corazza et al. (2003), for example).

Several articles in the literature have addressed applications of the BV algorithm in weather modeling. See, in particular, Toth and Kalnay (1993), Toth and Kalnay (1997), Kalnay (2003), and references therein. For a comparison of the BV algorithm and other methods, such as Monte-Carlo perturbed observations, we refer, for example, to Cheung (2001); Gneiting and Raftery (2005); Hansen and Smith (2000); Wei and Toth (2003).
For an application of the BV algorithm to ensemble Kalman filters, see Wang and Bishop (2003). Primo et al. (2008) and Hallerberg et al. (2010) have treated applications based on variations of the algorithm, where, for example, the rescaling is done by using a geometric mean.

As already discussed, several alternatives to the BV algorithm have been proposed and employed in the literature. BVs have been viewed as nonlinear analogs of finite-time Lyapunov vectors. Similarly, nonlinear analogs of singular vectors have been proposed, for instance the conditional nonlinear optimal perturbations proposed by Mu and Jiang (2008) and the nonlinear singular vectors proposed by Rivière et al. (2008), although they entail a computationally expensive optimization.

Both the BV and EBV algorithms arise in the time-discretization of a continuous-time dynamical system. We consider the following initial value problem:

dy/dt = G(y),   t > 0,
y(0) = y_0,   (1)

where t represents time and G = G(y) is a map that has at least a bounded gradient. Since our main applications involve autonomous differential equations, we assume that G does not explicitly depend on time. The basic theory we present here has a routine extension to non-autonomous problems.

The solution vector y = y(t) can live in a finite- or infinite-dimensional normed linear space. In the former case, (1) is an (autonomous) system of ordinary differential equations, while in the latter case, (1) is an (autonomous) system of partial differential equations, modeling a time-dependent, spatially extended system. For systems of partial differential equations, we assume that either periodic boundary conditions or no-flux boundary conditions are prescribed. See, for example, Sell and You (2002).
The use of other boundary conditions may lead to a related theory, but we do not address the issue here.

If the system contains evolution partial differential equations, the system, along with the boundary conditions, is discretized in space or projected onto a finite-dimensional space compatible with the boundary conditions. Consequently, we usually assume that (1) is a system of ordinary differential equations of dimension K, which may be large. Since most large-scale weather and climate circulation models presently use explicit-in-time integrators, we will focus on numerical models of this type.

We let y = y(t) denote a given (continuous-time) solution of (1). We then turn to an approximate solution Y_n = Y(t_n), which is defined on the time grid t_n, for n = 0, 1, 2, ..., where t_{n+1} = t_n + δt_n. We set t_0 = 0. We assume that δt_n = δt is positive and small, and that it does not depend on n, for n ≥ 0:

Y_{n+1} = M(Y_n, δt),   n = 0, 1, 2, ...,
Y_0 = Y(0),   (2)

where Y(t_n) is a solution of the discrete problem (2), and it may be viewed as an approximation of the continuous-time solution y(t) at t = t_n. Likewise, the initial condition Y_0 is an approximation of the initial condition y_0. The points Y_n lie in the K-dimensional Euclidean space R^K, and M = M(Y, δt) is the discrete-time solution operator on R^K generated by the ordinary differential equation (1). As noted above, we assume that the discrete-time problem (2) has an attractor A, and that Y_0 is on, or near, A.

At this point it is convenient to introduce the related concepts of a Cloud (at t = 0) and a Family of Initial Perturbations. A Cloud (at t = 0) is a family of tangent vectors δY_0(ι) to Y_0 in R^K that depends on ι, where ι ∈ I and I is a finite index set. The main requirement we impose is that

‖δY_0(ι)‖ = ε,   for all ι ∈ I,

where ε is small, positive, and fixed.
The collection of all terms (Y_0, δY_0(ι)) in {Y_0} × R^K, for ι ∈ I, is called a Family of Initial Perturbations. Notice that this collection lies on a sphere of radius ε in {Y_0} × R^K with center at (Y_0, 0).

For n ≥ 0, we assume that the base point Y_n and the perturbation vector δY_n(ι) are known. For the (n+1)st step we use:

1. Y_{n+1} denotes the (n+1)st base point, and it is determined by (2);

2. δY_{n+1}(ι), the (n+1)st perturbation vector, is given by

δŶ_{n+1}(ι) = M(Y_n + δY_n(ι), δt) − M(Y_n, δt),   (3)
δY_{n+1}(ι) = R_{n+1} δŶ_{n+1}(ι),   (4)

where δŶ_{n+1}(ι) denotes the perturbation before rescaling and R_{n+1} is a rescaling rule.

The time evolution of the Cloud is BV(t_n) = δY_n(ι), where ι ∈ I and n ≥ 0. The rescaling rule of Toth and Kalnay, op. cit., consists of rescaling the new perturbation vector to the size of the previous one by

‖δY_{n+1}(ι)‖ = R_{n+1} ‖δŶ_{n+1}(ι)‖ = ‖δY_0(ι)‖ = ε,   for ι ∈ I, n ≥ 0.   (5)

Equivalently,

R_{n+1}(Y_0) := ‖δY_n(ι)‖ / ‖δŶ_{n+1}(ι)‖ = ‖δY_0(ι)‖ / ‖δŶ_{n+1}(ι)‖,   (6)

so that R_{n+1} = R_{n+1}(Y_0) depends on the initial base point Y_0, as well as on the initial perturbation vector δY_0(ι). An alternate rescaling rule consists in rescaling periodically, at times t_{mk}, where k is an integer, k ≥ 2, and m = 1, 2, .... In this case, one uses the rule (6) when n = mk, and

R_{n+1}(Y_0) = 1,   for mk < n < (m+1)k.   (7)

We do not use this alternate rule in this paper. (For a discussion of rescaling time and regime predictability in some reduced models, see Deremble et al. (2009) and references therein.)

For the Ensemble Bred Vector algorithm, instead of using R_{n+1} as in equation (4), we use a uniform scaling R^min_{n+1}, which is the same for all ι ∈ I. In particular, we replace equations (3)-(4) with

δŶ_{n+1}(ι) = M(Y_n + δY_n(ι), δt) − M(Y_n, δt),
δY_{n+1}(ι) = R^min_{n+1} δŶ_{n+1}(ι),   (8)

for all ι ∈ I, where

R^min_{n+1} = ε [ max_{ι ∈ I} ‖δŶ_{n+1}(ι)‖ ]^{−1}.   (9)

Similarly, when (8) and (9) hold, we use EBV(t_n) = δY_n(ι), for ι ∈ I and n ≥ 0, to denote the time evolution of the Cloud. (Alternatively, a periodic rescaling rule utilizing (7) above can be employed.) The time step used to compute the base trajectory and the time intervals between normalizations need not be the same. This is the case for both the BV algorithm and the EBV.

The crucial difference between the BV and EBV algorithms is that, even when the BV is run concurrently over an ensemble of initial data, the outcome of the algorithm for each given datum does not depend on the other members of the ensemble. In contrast, the evolution of the ensemble members is interdependent in the EBV. Nevertheless, by construction, the EBV should exhibit similar behavior to the BV at small amplitudes. Indeed, one of the design principles for the EBV algorithm is to reinforce this aspect by providing it with a built-in acceleration mechanism.

Both the BV and EBV outcomes, on the other hand, depend on the choice of the vector norm used to define the rescaling rule in either (6) or (9). While all norms are equivalent in the (finite) K-dimensional space where we seek solutions, in practice the constants appearing in the equivalence between different finite-dimensional norms generally depend strongly on the dimension K and eventually blow up as K becomes infinite. Hence, the choice of norm can have an impact on the implementability and performance of these algorithms. In Rivière et al. (2008), it was suggested that the outcome of the BV algorithm may depend strongly on the choice of norm. This is actually a consequence of the non-negligible nonlinear effects in the system. In Section 5, we will elaborate further on the issue of norm dependence.

In addition to the "renormalization time-step" δt, there is another time step, the "integration time-step," which we will denote by ∆t. For example, one encounters the new time step when moving from the continuous-time problem, equation (1), to the discrete-time problem, equation (2).
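To make the two rescaling rules concrete, here is a minimal sketch of one breeding run, written by us for illustration (it is not the authors' implementation). It applies the algorithms to the Lorenz63 system used in Section 4, with a forward-Euler step standing in for the discrete solution operator M(·, δt); the function names and step sizes are our own choices. The only difference between BV and EBV is the rescaling: each member by its own norm, following (5)-(6), versus all members by the norm of the largest member, following (9).

```python
import numpy as np

def lorenz63(y, sigma=10.0, rho=28.0, beta=8.0/3.0):
    """Right-hand side G(y) of the Lorenz63 system."""
    x1, x2, x3 = y
    return np.array([sigma * (x2 - x1),
                     x1 * (rho - x3) - x2,
                     x1 * x2 - beta * x3])

def M(Y, dt):
    """Discrete solution operator: one explicit (forward-Euler) step."""
    return Y + dt * lorenz63(Y)

def breed(Y0, cloud, n_cycles, dt, eps, ensemble=False):
    """Evolve a cloud of perturbations; ensemble=False gives BV, True gives EBV."""
    Y = np.array(Y0, dtype=float)
    dY = [d.copy() for d in cloud]
    for _ in range(n_cycles):
        Y_next = M(Y, dt)
        # nonlinear perturbation update, as in (3)/(8)
        raw = [M(Y + d, dt) - Y_next for d in dY]
        if ensemble:
            # EBV: one uniform factor set by the largest member, as in (9)
            R = eps / max(np.linalg.norm(r) for r in raw)
            dY = [R * r for r in raw]
        else:
            # BV: each member rescaled back to amplitude eps, as in (5)-(6)
            dY = [eps * r / np.linalg.norm(r) for r in raw]
        Y = Y_next
    return dY
```

After breeding, every BV member has norm ε, whereas in the EBV only the dominant member does; the remaining members carry norms at most ε, which is the size-ordering of the cloud discussed above.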
For the most part, we will treat ∆t and δt as being equal in the calculations described in this article. However, we always require that ∆t ≤ δt.

In Section 4 we will be applying the BV and EBV algorithms to two models consisting of systems of (nonlinear) autonomous ODEs of the form (1). (The second model arises from the discretization of a PDE.) In each model, there is a compact, global attractor A, which is a subset of R^K and is invariant for the time evolution of the system. (We refer the reader to Chapter 2 in Sell and You (2002) for more information on attractors and global attractors.) We will let θ denote a typical point in the attractor A, and we will let θ·t = y(t) denote the unique solution of (1) that satisfies y(0) = θ. Since A is invariant, one has θ·t ∈ A, for all t ∈ R.

In order to study the sensitivity with respect to initial conditions on the attractor, the Tangent Linear Model is used, which is defined as

∂_t x = A(θ·t) x,   (10a)
x(0) = x_0 ∈ R^K,   (10b)

where A(y) = DG(y) is the Jacobian matrix of G. Hence A = A(θ·t) is the linearization of (1) along the solution θ·t. We observe that, even if G does not explicitly depend on time, (10a) is generally non-autonomous, since θ·t changes with time.

We let U(θ, t) denote the solution operator of (10a), which takes the initial data to the solution at time t, so that U(θ, t) x_0 is the solution of the initial value problem (10). Such an operator is well defined by the uniqueness of solutions to the problem (10). Uniqueness of solutions also readily implies the cocycle identity:

U(θ, τ + t) = U(θ·τ, t) U(θ, τ),   for all θ ∈ A and all τ, t ∈ R.   (11)

Next we consider a family of mappings Π = Π(t), which are defined for t ∈ R by the relation

Π(t)(θ, x) := (θ·t, U(θ, t) x),   for (θ, x) ∈ A × R^K.
(12)

We note that Π(t) maps A × R^K into itself, for each t ∈ R; it is jointly continuous in (t, θ, x); it satisfies Π(0)(θ, x) = (θ, x) (i.e., Π(0) = I, the identity operator); and it satisfies the evolution property:

Π(τ + t) = Π(t) Π(τ),   for all τ, t ∈ R.   (13)

By using the discrete-time dynamics, where t and τ are restricted to satisfy t = n·δt and τ = m·δt, the notation and the theory of dynamical systems extend readily to the discrete-time problems of interest herein. Note that the θ-component of Π does not depend on the x-component. Thus Π is called a skew product flow. Since Π is linear in x, it is sometimes called a linear skew product flow. In summary, the Tangent Linear Equation over the attractor A generates a linear skew product flow. (For more information on the theory of skew product flows in the context of non-autonomous dynamics, see the multiple works of Sacker and Sell, for example Sacker and Sell (1977), Sacker and Sell (1978), and Sacker and Sell (1980).) The dynamics of Π are crucial for understanding the sensitivity and predictability of the underlying model.

In his opus magnum, which was published in Ukraine in 1892, Lyapunov presented his theory of stability for finite-dimensional ordinary differential equations. This work includes his study of the non-autonomous linear problem (10); see Lyapunov (1992) (yes, 100 years later). One of Lyapunov's goals was to develop an analogue of the well-known eigenvalue-eigenvector theory for the solutions of the autonomous problem, in the study of solutions of the general non-autonomous equation (10).

The approach developed by Lyapunov begins with the 4 Lyapunov Relations of exponential growth:

lim sup_{t→±∞} (1/t) log(‖U(θ, t) x‖)   and   lim inf_{t→±∞} (1/t) log(‖U(θ, t) x‖),

where x ≠ 0.
Lyapunov was interested, as are we, in the case where these four limits are equal:

λ(θ, x) := lim_{t→−∞} (1/t) log(‖U(θ, t) x‖) = lim_{t→∞} (1/t) log(‖U(θ, t) x‖),   (14)

where x ≠ 0. The linearity of U(θ, t) x implies that λ(θ, x) = λ(θ, s x), for s ≠ 0. Hence one can assume, as we do, that x is a unit vector, i.e., ‖x‖ = 1. When (14) holds, then λ(θ, x) is a strong Lyapunov exponent, and the unit vector x is an (associated) Lyapunov vector.

The Lyapunov spectrum, LYΣ, is the collection of all such λ(θ, x), with θ ∈ A and ‖x‖ = 1. For example, if A(θ·t) = A is an autonomous (constant) matrix, then the Lyapunov spectrum consists of all real numbers λ that satisfy λ = Re ν, where ν is an eigenvalue of A.

A Lyapunov vector is not an isolated vector; rather, it spawns a line of Lyapunov vectors (through the origin) in R^K. That is to say, a Lyapunov vector is a point in P^{K−1}, the (K−1)-dimensional projective space. For v ≠ 0 in R^K, we will use [v] ∈ P^{K−1} to denote the unique line in R^K that contains v. (Note that [v] = [−v].) Conversely, when one maps a line [u] in P^{K−1} to a vector u ∈ R^K, we require that the pre-image u lie on the line [u] and that ‖u‖ = 1.

One should note that P^{K−1} is a metric space. The projective metric d_pr(u_1, u_2) is defined for nonzero vectors u_1 and u_2 in R^K by

d_pr(u_1, u_2) = min_{s_1, s_2} ‖s_1 u_1 ± s_2 u_2‖,   (15)

where s_1 and s_2 are real numbers that satisfy ‖s_1 u_1‖ = ‖s_2 u_2‖ = 1. This metric is used for measuring the distance between the lines [u_1] and [u_2] in P^{K−1}.

Since the solution operator U(θ, t) of the linear problem (10) maps lines in R^K onto lines, one can use this operator to define a related projective flow Σ(θ, t) on P^{K−1} by means of the relation

Σ(θ, t)[u] = [U(θ, t) u],   for [u] ∈ P^{K−1}.

Using this, one obtains an equivalent flow [Π] on P^{K−1} × A, where

[Π](t)(θ, [x]) = (θ·t, Σ(θ, t)[x]),   for (θ, [x]) ∈ A × P^{K−1};   (16)

compare with (12). One obtains additional information about the dynamics of the projective flow [Π(t)] by using the Lyapunov vectors, as is noted below.

Next we turn our attention to the question of finding good finite-time approximations of these Lyapunov vectors. We begin by constructing a piecewise autonomous approximation of the Tangent Linear Equation for (1).
To this end, we replace (10) by

∂_t Y(t) = A_n Y(t),   for t_n < t ≤ t_{n+1},   (17a)
Y(0) = Y_0,   (17b)
t_n = n δt,   n = 0, 1, 2, ..., N − 1,   (17c)

where A_n = A(y(t_n)) := ∂G/∂y |_{y = y(t_n)} and y(t_n) is the value of an exact solution of (1) at t = t_n. The solution of this system at the grid points t_n is, explicitly and recursively, given by

Y_{n+1} = e^{δt A_n} Y_n,   for n = 0, 1, 2, ..., N − 1.   (18)

Consequently, Y_N, the solution at time T = N δt, is given by

Y(T) = Y_N = Z(T) Y_0,   (19)

where

Z(T) := e^{δt A_{N−1}} e^{δt A_{N−2}} ··· e^{δt A_1} e^{δt A_0}.

We now define the approximation of the Lyapunov vector associated with the largest Lyapunov exponent λ, at the time t = T (the finite-time Lyapunov vector), as the direction of steepest ascent for the matrix Z(T). For example, if one had used an explicit Euler scheme, then W(T) would be an Eulerian approximation of Z(T), where

W(T) = (I + δt A_{N−1}) ··· (I + δt A_1)(I + δt A_0).

The finite-time Lyapunov vector is the singular vector corresponding to the largest singular value of Z(T).

In practice (see, e.g., Section 4.2 for the case of the CY92 model), the finite-time LV will be computed by directly solving a discrete approximation of the TLM and rescaling the output (the rescaling can be done at arbitrary intervals of time, since the problem is linear).

One of the main contributions found in the JPS87 manuscript is an in-depth study of the interactions between two major theories of the longtime dynamics of nonautonomous, linear differential systems:

• Exponential Dichotomies (and Continuous Foliations), and
• the MET (Multiplicative Ergodic Theorem) and Ergodic Measures.

We view the JPS87 manuscript as a toolkit to be used in the analysis of the dynamics of related linear systems: it is this united theory, as we now show, that forms the mathematical foundations of the theory and applications of bred vectors.
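As an illustration of the finite-time construction above (a sketch under our own choices of model, scheme, and step sizes, not the authors' code), one can accumulate the Euler product W(T) along a Lorenz63 trajectory and extract the singular vector for the largest singular value; log of that singular value, divided by T, is the corresponding finite-time estimate of the maximal Lyapunov exponent. For long horizons T one would instead rescale periodically, as noted above, since the raw product eventually overflows.

```python
import numpy as np

def lorenz63(y, sigma=10.0, rho=28.0, beta=8.0/3.0):
    """Right-hand side G(y) of the Lorenz63 system."""
    x1, x2, x3 = y
    return np.array([sigma * (x2 - x1),
                     x1 * (rho - x3) - x2,
                     x1 * x2 - beta * x3])

def jacobian(y, sigma=10.0, rho=28.0, beta=8.0/3.0):
    """A(y) = DG(y), the Jacobian matrix of the Lorenz63 right-hand side."""
    x1, x2, x3 = y
    return np.array([[-sigma,    sigma,  0.0],
                     [rho - x3, -1.0,   -x1],
                     [x2,        x1,    -beta]])

def finite_time_lv(y0, dt, N):
    """Form W(T) = (I + dt A_{N-1}) ... (I + dt A_0) along the trajectory.
    The right singular vector for the largest singular value is the
    finite-time Lyapunov vector (the initial direction of steepest ascent)."""
    y = np.array(y0, dtype=float)
    W = np.eye(3)
    for _ in range(N):
        W = (np.eye(3) + dt * jacobian(y)) @ W   # append (I + dt A_n) on the left
        y = y + dt * lorenz63(y)                 # advance the base trajectory (Euler)
    U, S, Vt = np.linalg.svd(W)
    lam = np.log(S[0]) / (N * dt)                # finite-time exponent estimate
    return Vt[0], lam
```

Here the base trajectory and the tangent linear product use the same explicit Euler step, so ∆t = δt, consistent with the convention adopted above.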
We begin with the first aspect: Exponential Dichotomies and Continuous Foliations. Consider the family of shifted semiflows

U_λ(θ, t) := e^{−λt} U(θ, t),   for λ ∈ R,

and the associated skew-product flows

Π_λ(t)(θ, x) := (θ·t, U_λ(θ, t) x).

Let M be a compact invariant set in A. As noted in Sacker and Sell (1978), the skew-product flow Π_λ is said to have an exponential dichotomy over M, provided that there exist projectors P_λ and Q_λ and constants K ≥ 1 and α > 0, such that P_λ(θ) + Q_λ(θ) = I, and such that, for all θ ∈ M and u ∈ R^K, one has

‖U_λ(θ, t) Q_λ(θ) u‖ ≤ K e^{−αt} ‖u‖,   for t ≥ 0,
‖U_λ(θ, t) P_λ(θ) u‖ ≤ K e^{αt} ‖u‖,   for t ≤ 0.   (2.5)

When there is an exponential dichotomy and (2.5) is valid, then the stable and unstable linear spaces, S_λ and U_λ (which are, respectively, the ranges of the projectors Q_λ and P_λ), satisfy important dynamical properties that describe the exponential growth rate of selected solutions. For example, the system (2.5) is equivalent to:

‖U_λ(θ, t) u‖ ≤ K e^{−αt} ‖u‖,   for t ≥ 0, u ∈ R(Q_λ(θ)),
‖U_λ(θ, t) u‖ ≤ K e^{αt} ‖u‖,   for t ≤ 0, u ∈ R(P_λ(θ)).   (20)

Let SSΣ denote the SS Spectrum (aka Sacker-Sell Spectrum) for Π, which is defined as the collection of all λ ∈ R such that Π_λ does not have an exponential dichotomy over M.

The Spectral Theorem in Sacker and Sell (1978) describes the continuous foliation, and other properties of the flow Π over M. More precisely, there is an integer ℓ, where 1 ≤ ℓ ≤ K, such that SSΣ is the union of ℓ closed, bounded intervals, that is,

SSΣ = ∪_{i=0}^{ℓ−1} [a_{ℓ−i}, b_{ℓ−i}],   with a_{ℓ−i} ≤ b_{ℓ−i} < a_{ℓ−i−1}.

Also there is a continuous foliation

R^K = ⊕_{i=0}^{ℓ−1} V_{ℓ−i}(θ),   θ ∈ M,

where {V_ℓ(θ), ..., V_1(θ)} is a linearly independent, continuous family of subspaces of R^K with ∑_{i=0}^{ℓ−1} dim V_{ℓ−i}(θ) = K, for θ ∈ M.
As is shown below, the right-most interval [a_1, b_1] plays a special role in the study of bred vectors. For 1 ≤ k ≤ ℓ, we let [a_k, b_k] denote the k-th spectral interval and let V_k(θ) denote the corresponding subspace given by the continuous foliation. We next explore the important connections between the exponential growth rates of solutions with initial conditions in V_k(θ) and the k-th interval [a_k, b_k]. Among other things, we will encounter the Monotonicity Property and the Strictly Monotone Property.

Let ν and λ be real numbers that satisfy ν < λ, and assume that both U_ν(θ, t) and U_λ(θ, t) have exponential dichotomies over M. Then the Monotonicity Property holds:

  S_ν ⊆ S_λ,  while  U_λ ⊆ U_ν.    (21)

The Strictly Monotone Property is a consequence of the observation that the following three statements are equivalent:

• One has S_ν ⊄ S_λ.
• One has U_λ ⊄ U_ν.
• There is a spectral interval [a_k, b_k] in the interval (ν, λ).

This brings us to a basic property. Let b_{ℓ+1} and a_0 satisfy b_{ℓ+1} < a_ℓ and b_1 < a_0. Next we fix ν and λ so that

  b_{k+1} < ν < a_k ≤ b_k < λ < a_{k−1}.

Then neither ν nor λ lies in the Sacker-Sell spectrum SSΣ. Furthermore, the interval (ν, λ) contains the spectral interval [a_k, b_k]. By the Strictly Monotone Property, the space S_λ is larger than S_ν, while U_ν is larger than U_λ. Moreover, as is shown in Sacker and Sell (1978), one has:

  V_k(θ) = U_ν(θ) ∩ S_λ(θ),  for all θ ∈ M.    (22)

It is a consequence of the relations (22) and (20) that if the initial condition (θ, x) satisfies x ∈ V_k(θ), then the solution U(θ, t)x satisfies

  ‖U(θ, t) x‖ ≤ K e^{(λ−α)t} ‖x‖,  for t ≥ 0,
  ‖U(θ, t) x‖ ≤ K e^{(ν+α)t} ‖x‖,  for t ≤ 0.

(Note that K depends on the choice of ν and λ.) It should be noted that all the terms used above, including P_λ(θ) and Q_λ(θ), vary continuously in θ.
Furthermore, the exponential dichotomy is robust, in the sense that it varies continuously under small perturbations. Small changes in the model result in a related exponential dichotomy with small changes in P_λ, Q_λ, K, and α; see Pliss and Sell (1999).

Moreover, it is shown in Sacker and Sell (1976) that if λ ∈ SSΣ, then there is a (θ, x) ∈ M × R^K, with ‖x‖ ≠ 0, such that the solution U_λ(θ, t)x satisfies:

  sup_{t ∈ R} ‖U_λ(θ, t) x‖ < ∞.    (23)

What will become apparent shortly is that the red vector must be in the space V_1(θ). Furthermore, since λ = b_1 is in SSΣ, the pair (θ, x) that arises in (23) for this choice of λ is a candidate for the "red vector" designation. A red vector must satisfy (23), but the converse need not be true. More on this later.

As noted above, the second tool to be used in the theory of Lyapunov exponents/vectors is the Multiplicative Ergodic Theorem (MET) and the ergodic measures on M. One finds in JPS87 a study of the links between the LYΣ and the SSΣ. While the SSΣ leads to a continuous foliation, as noted above, the MET leads to a "measurable" refinement of this continuous splitting, as is noted in Remark 4.2.9 on pages 177-179 in Arnold (1998). The latter reference is noteworthy because it contains various extensions of the MET to problems not originally envisioned in the pioneering works of an earlier generation.

The Multiplicative Ergodic Theorem: Let M be a (non-empty) compact, invariant set on the attractor A, and let µ be an ergodic measure on M with µ(M) = 1. Then there is an invariant set M_µ in M, with µ(M_µ) = 1, and there is a k, with ℓ ≤ k ≤ K, such that the following hold:

1. There is a measurable foliation

  R^K = ⊕_{j=0}^{k−1} W_{k−j}(θ),  for θ ∈ M_µ,

where {W_k(θ), ..., W_1(θ)} is a linearly independent, measurable family of subspaces of R^K with dim W_j(θ) = m_j ≥ 1, for 1 ≤ j ≤ k, (m_1 + ... + m_k) = K, and all θ ∈ M_µ.

2. There are real numbers λ_j, for 1 ≤ j ≤ k, that are the strong Lyapunov exponents: λ(θ, x) = λ_j, for all (θ, x) ∈ M_µ × W_j(θ) with ‖x‖ = 1, and one has λ_k < ... < λ_1.

3. The Ergodic Spectrum LYΣ(µ), which depends on the ergodic measure µ, is the collection {λ_1, ..., λ_k}. The Lyapunov Spectrum LYΣ, which is the union LYΣ := ∪_µ LYΣ(µ) over all ergodic measures µ, is used below.

4. For each j, the measurable vector bundle M_µ × W_j(θ) = {(θ, x) : θ ∈ M_µ, x ∈ W_j(θ)} is an invariant set for the projective flow. Furthermore, because of the exponential separation between these vector bundles, the bundle M_µ × W_j(θ), with j = 1, is an attractor for the projective flow.

Connections between the SSΣ and LYΣ: As is shown in JPS87, the following relations hold:

• One has LYΣ ⊂ SSΣ.
• For each i with 1 ≤ i ≤ ℓ, there is an ergodic measure µ with support in the spectral interval [a_i, b_i].
• Assume that the invariant set M is "dynamically connected", that is, M cannot be written as the union of two disjoint, nonempty, closed, invariant sets. Then the following holds: for b_1, the largest value in SSΣ, there is an ergodic measure µ with the property that λ_1 = b_1 is a strong Lyapunov exponent. It follows that λ_1 = b_1 is a Lyapunov exponent for some ergodic measure µ on M, and consequently there is a unit vector x in W_1(θ) with the property that x is a red vector.
• The previous item is valid for any of the endpoints {a_1, ..., a_ℓ; b_1, ..., b_ℓ}, but the related ergodic measures may differ.

It can happen, as in the case of the CY92 model, that there is a unique red vector. Furthermore, in this case, as is noted in Section 4, one has dim V_1(θ) = 1. Hence one has W_1(θ) = V_1(θ), and there is a unique red vector in the projective flow.
Moreover, due to the exponential dichotomies occurring in the CY92 model, the red vector is robust, and it varies continuously with small changes in the model. On the other hand, when dim W_1(θ) = 1, there is always a unique red vector. If, in addition, one has dim V_1(θ) ≥ 2, then the red vector may be only measurable and not continuous. In short, the red vector need not be robust.
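The strong Lyapunov exponent in item 2 above can be estimated by the same propagate-and-renormalize loop that underlies breeding. The following is a hedged sketch on a constant-coefficient linear system of our own choosing, where the exponent is known in closed form, so the loop can be checked against the answer:

```python
import numpy as np

# Our own toy example (not from the paper): for a constant-coefficient
# linear system, the strong Lyapunov exponent of a generic vector equals
# the largest real part of the eigenvalues of A, and the renormalized
# vector aligns with the corresponding eigendirection (the analogue of
# the one-dimensional bundle W_1(theta) discussed above).

A = np.array([[0.0, 1.0],
              [2.0, -1.0]])        # eigenvalues 1 and -2

def propagate(v, dt=0.01, steps=100):
    # crude forward-Euler flow map over one rescaling cycle of length 1
    for _ in range(steps):
        v = v + dt * (A @ v)
    return v

v = np.array([1.0, 0.3])
log_growth, cycles, T_cycle = 0.0, 40, 1.0
for _ in range(cycles):
    v = propagate(v)
    r = np.linalg.norm(v)
    log_growth += np.log(r)
    v = v / r                      # rescale to unit length, as in breeding
lyap = log_growth / (cycles * T_cycle)
print(lyap)                        # approaches the top eigenvalue, 1
```

The small discrepancy from 1 is the first-order Euler error; with the RK4 scheme used later in the paper it would be far smaller.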
We now elucidate further the relationships between the BV and EBV algorithms and the dynamics of the underlying system. In particular, we now formalize the connections between these algorithms and the solutions of the linear tangent equation, i.e., the finite-time Lyapunov vectors. We are interested in the behavior as the step size δt goes to 0. To accomplish these aims we revert to a continuum formulation and, for simplicity, assume that the vector y(t) in (1) is a member of the Euclidean space R^K. Since energy norms are used in many geophysical fluid mechanics problems, we will take the usual ℓ₂-norm. We denote the norm and inner product by ‖·‖ and (·,·), respectively, with ‖·‖² = (·,·).

The notation in this chapter differs from what has been set in the rest of the paper in minor ways, such as the letters representing the functions. This is to emphasize that, unlike what one computes in practice, the dynamical systems here are continuous in time. However, everything is clearly explained to avoid ambiguities, without cluttering the presentation with technical details.

We stress that the term continuum limit refers to the rescaling time (sometimes called a cycle), not to the numerical integration time step. In a numerical context, this corresponds to a strategy where both the integration time steps and the rescaling times are small. While this has no importance for linear systems, it leads to different outcomes when applied to nonlinear systems, even for quite small perturbation amplitudes.

We first recall the system (1):

  dy/dt = G(y),  t > t₀,
  y(t₀) = y₀.    (24)

Our first goal below is to obtain the formula

  W(T; ε) = W(t₀; ε) + ∫_{t₀}^{T} [ G_y^ε(W(t; ε), t) − ( G_y^ε(W(t; ε), t), W(t; ε) ) W(t; ε) ] dt,    (25)

where W(t₀; ε) = W₀ with ‖W₀‖ = 1 and, for all t ≥ t₀,

  G_y^ε(W(t; ε), t) = (1/ε) G_y(εW(t; ε), t),  with  G_y(δy, t) = G(y + δy) − G(y),

in the limiting case δt → 0, t₀ ≤ t ≤ T. The vector W(t; ε) corresponds to a continuously rescaled bred vector V(t) at amplitude ε and with initial perturbation V(t₀) = εW₀. (W₀ has the same direction as V₀, but it has amplitude 1.) We will streamline the presentation by skipping some of the technical details in the derivation, and we will assume at the outset that G in (24) has the necessary differentiability properties to make all the mathematical steps rigorous.

Let y and y + δy be two solutions of (24) that satisfy the initial conditions y(t₀) = y₀ and (y + δy)(t₀) = y₀ + δy₀. Then δy(t) is a solution of

  (d/dt) δy = [G(y + δy) − G(y)],  δy(t₀) = δy₀.    (26)

Even if G is autonomous, the resulting equation (26) for δy is non-autonomous. Integrating (26) for t ≥ t₀, we obtain

  δy(t + δt) = ( [y(t) + δy(t)] + ∫_t^{t+δt} G(y + δy) dt ) − ( y(t) + ∫_t^{t+δt} G(y) dt ).    (27)

We next define ε = ‖δy(t₀)‖, and a family of vectors v(t) on a sphere of radius ε centered at zero in R^K by

  v(t) = δy(t) ‖δy(t₀)‖ / ‖δy(t)‖.
Assuming that δy is bounded away from 0, v(t) obeys the evolution equation:

  (1/‖δy(t₀)‖) dv/dt = (1/‖δy(t)‖²) [ (d/dt δy) ‖δy(t)‖ − δy(t) (d/dt)‖δy(t)‖ ]
                     = (1/‖δy(t)‖) [ G_y(δy, t) − (δy(t)/‖δy(t)‖) (d/dt)‖δy‖ ],    (28)

which makes explicit the fundamental dependence of the time evolution of v on the norm. Taking the scalar product of (26) with δy(t), we also have

  (1/2) (d/dt) ‖δy‖² = ( G_y(δy, t), δy(t) ).

Substituting this relation in (28) gives, after some simplifications,

  (1/‖δy(t₀)‖) dv/dt = G_y(δy, t)/‖δy(t)‖ − ( G_y(δy, t)/‖δy(t)‖, v(t)/‖δy(t₀)‖ ) v(t)/‖δy(t₀)‖.    (29)

We integrate (29) independently on successive intervals

  [t₀, t₀ + δt), [t₀ + δt, t₀ + 2δt), ..., [t₀ + (N−1)δt, T),

where T = t₀ + N δt, and denote the solution of (29) on the interval I_k = [t₀ + (k−1)δt, t₀ + k δt) by v_k(t). This is a system of N integral equations with N free parameters (the integration constants) v_k⁰, k = 1, ..., N:

  v_k(t) = v_k⁰ + ‖v_k⁰‖ ∫_{t₀+(k−1)δt}^{t} [ G_y(δy_k, τ)/‖δy_k(τ)‖ − ( G_y(δy_k, τ)/‖δy_k(τ)‖, v_k(τ)/‖v_k⁰‖ ) v_k(τ)/‖v_k⁰‖ ] dτ,
    t₀ + (k−1)δt ≤ t < t₀ + k δt,  k = 1, ..., N,    (30)

where δy_k satisfies (27) with the initial condition v_k⁰ at t = t₀ + (k−1)δt, i.e., at the left-hand boundary of the interval I_k. We observe that, at t = t₀ + (k−1)δt, v_k(t) is an approximation of the discrete bred vector with short rescaling time step δt. Hence, we take it as the basic approximation for the continuously rescaled bred vector we seek.
Note that the integrand in (30) is simply the component of the vector G_y(δy_k, t)/‖δy_k(t)‖ perpendicular to v_k(t) in R^K at time t.

We take

  v_1⁰ = δy(t₀),  v_k⁰ = lim_{t → (t₀+(k−1)δt)⁻} v_{k−1}(t),  2 ≤ k ≤ N,    (31)

assuming the limits exist, and extend the domain of definition of v_k by setting v_k(t) = 0 if t ∉ I_k. Similarly, we extend the domain of definition of δy_k. We then let V_N = v_1 + ... + v_N, and note that V_N is continuous in time on [t₀, t₀ + Nδt] (extended to t = t₀ + Nδt by continuity) and piecewise differentiable. We define δY_N analogously as the sum of the extended δy_k's. Then

  V = lim_{N→∞} V_N = lim_{δt→0} V_N

exists and is continuous in time, for instance, when G has the necessary differentiability properties so that the sequence V_N, possibly after passing to a subsequence, converges uniformly on compact time intervals, as N → ∞ (or, equivalently, as δt → 0). Since V_N coincides with the right-hand-side limit of δY_N at every grid point, the difference between V_N and δY_N goes to zero as N → ∞, in an appropriate norm that is allowed by how smooth G is. When G is continuously differentiable, for example, this convergence would be uniform on compact time intervals. Thus, by passing to the limit N → ∞ in (30), we obtain the following representation formula for V:

  V(T) = V(t₀) + ∫_{t₀}^{T} [ G_y(V(t), t) − ( G_y(V(t), t), V(t)/‖V(t)‖ ) V(t)/‖V(t)‖ ] dt.    (32)

Under the regularity assumptions above on G, the integrand is continuous, and hence V is a mild solution of an associated differential equation. Therefore, we can bootstrap and prove further regularity of V. Note that, by construction, ε = ‖V(t₀)‖ = ‖V(t)‖. Therefore, Equation (32) is equivalent to (25), if we take δy₀ = εW₀. An immediate consequence of (25) is the following.
When G is sufficiently regular, W(t; ε) converges uniformly on [t₀, T] as ε → 0. Let us denote the pointwise limit by W(t). Then, by (25), and the observation

  lim_{ε→0} G_y^ε(W(t), t) = lim_{ε→0} [G(y + εW(t)) − G(y)]/ε = D_{W(t)} G(y),

where the right-hand side is the directional derivative of G in the W(t)-direction, it follows that W(t) satisfies the differential equation

  dW/dt = A(t) W − ( A(t)W, W ) W,    (33a)

where A(t) = D_y G|_{y=y(t)}, and

  W(t₀) = W₀,  W₀ ∈ R^K.    (33b)

We recall the Linear Tangent Equation for the problem (24):

  dX/dt = A(t) X,  A(t) = D_y G|_{y=y(t)},    (34a)
  X(t₀) = W₀,  X ∈ R^K,    (34b)

where D_y G is the Jacobian matrix of the map G(y), and y(t) is the solution of (24). As in Section 2, U(t, s) denotes the solution operator of (34), taking the solution at time s to the solution at time t. With a slight abuse of notation, we write U(t) := U(t, t₀), so that, in particular, the solution X(t) of (34) at time t is simply U(t) W₀. We recall also that U is a semiflow, i.e.,

  U(t + s) = U(t + s, t) U(t),  for all t, s ∈ R.    (35)

What is the relation between (33) and (34)? It can easily be seen that rescaling X(t) to unit length and restarting the integration of (34) with X(t_m)/‖X(t_m)‖ at any t_m ∈ [t₀, T] does not alter the value of X(t)/‖X(t)‖ for t ≥ t_m. By induction, this holds for any finite number of rescaling-restarting cycles. This is because of the linearity of (34). We leave the details to the reader. By a direct approximation argument, or by an argument similar to the one that led to (33), we obtain

  X(t)/‖X(t)‖ = W(t),  t ∈ [t₀, T].

Thus, we have just established that the continuously rescaled bred vector W(t; ε), i.e., the solution of (25), converges, as ε → 0 and uniformly on [t₀, T], to the corresponding solution (rescaled to size 1) of the tangent linear equation (34) with the same initial data, i.e., to the solution of (33), assuming that G is smooth enough.
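This convergence can be checked numerically. The sketch below is a minimal illustration under our own choices (a plain Euler discretization, our own initial data, and the classical Lorenz equations as a convenient right-hand side; the paper's experiments use RK4): it compares the normalized bred vector for a large and a small amplitude ε against the normalized tangent linear solution.

```python
import numpy as np

# Hedged sketch: the normalized bred vector converges to the normalized
# tangent linear solution as eps -> 0.  All settings below are our own
# illustrative assumptions, not the paper's configuration.

sigma, r, b = 10.0, 28.0, 8.0/3.0
def G(y):
    return np.array([sigma*(y[1]-y[0]), r*y[0]-y[1]-y[0]*y[2], y[0]*y[1]-b*y[2]])
def jac(y):
    return np.array([[-sigma, sigma, 0.0],
                     [r-y[2], -1.0, -y[0]],
                     [y[1], y[0], -b]])

def bred_direction(eps, y0, w0, steps=2000, dt=1e-3):
    """One bred vector with amplitude eps, rescaled at every step."""
    y, dy = y0.copy(), eps*w0
    for _ in range(steps):
        ynew = y + dt*G(y)
        dy = (y + dy) + dt*G(y + dy) - ynew      # perturbed minus base state
        dy *= eps/np.linalg.norm(dy)             # BV rescaling rule
        y = ynew
    return dy/np.linalg.norm(dy)

def tangent_direction(y0, w0, steps=2000, dt=1e-3):
    y, x = y0.copy(), w0.copy()
    for _ in range(steps):
        x = x + dt*(jac(y) @ x)
        x /= np.linalg.norm(x)                   # harmless, by linearity of (34)
        y = y + dt*G(y)
    return x

y0 = np.array([1.0, 1.0, 20.0])
w0 = np.array([1.0, 0.0, 0.0])
xdir = tangent_direction(y0, w0)
gaps = [np.linalg.norm(bred_direction(e, y0, w0) - xdir) for e in (1.0, 1e-3)]
print(gaps)   # the small-amplitude bred vector tracks the tangent solution far more closely
```

The larger amplitude keeps the nonlinear terms active, so its bred direction drifts away from the tangent linear direction, exactly the finite-amplitude effect discussed below.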
Since the solutions of (33) have constant magnitude 1, (33) is the linear analog of (25) in terms of continuous rescaling. While continuous rescaling is inconsequential for linear equations, this is not the case for nonlinear ones. However, when the initial amplitude is very small, the results above indicate that one might be able to rescale less often and still get comparable results. This is simply because it takes longer for nonlinear effects to start to dominate the picture.

3.1 Continuum Limit for EBV

We discuss the continuum limit for the EBV algorithm briefly, confining ourselves to a short, heuristic description. Writing a system of differential equations for the EBV algorithm is not as straightforward as for the BV algorithm. The perturbations δy(ι) in the ensemble that grow fastest at the largest amplitude (before rescaling) at time t will obey equation (25), as long as they are the top contenders. All the other perturbations will satisfy the following differential equation:

  d(δy(ι))/dt = G_y(δy(ι), t) − (δy(ι)/ε) dh(t)/dt,    (36)

where ι ∈ I, and dh(t)/dt is the maximum growth rate of the norm among all ensemble members with the maximum amplitude ε at time t. Assuming enough smoothness of h(t), (36) can be heuristically obtained by taking the right time derivative of the vector δy(ι)(t) ε/h(t), where h(τ), τ ≥ t, is the maximum norm of the ensemble members at time τ in the absence of any rescaling after time t (that is, if Equation (26) were in charge). As ε approaches zero, for fixed time t, all ensemble members point in directions along which (36) approaches the linear tangent equation, since dh(t)/dt → 0 and G_y(δy, t) is differentiable at zero, with G_y(0, t) = 0. A detailed treatment, complete with the technical aspects, will be presented elsewhere.

We note that the convergence to the linear tangent equation is expected to be quicker for the EBV compared to the BV, for most of the ensemble members.
This is manifested in the equations, and later verified numerically in the next chapter. The acceleration is due to the inherent size ordering in the EBV algorithm: information on the scales where the linear tangent equation dominates tends to be preserved in a manner that is robust against changes in the parameter ε. On the other hand, at finite amplitudes away from 0, the rescaling rule in the BV algorithm might create, or prolong the life of, certain instabilities due to nonlinear effects. The algorithm output then needs to be examined independently to check whether these vectors are in fact relevant to the dynamics or just artefacts of the rescaling strategy. We believe that the EBV algorithm is more resistant and robust in this regard.

The parameter ε is tuned under different considerations in the EBV and BV algorithms. In nonlinear systems, it is possible that a perturbation will not grow very rapidly in the zone of perturbations with size near ε but that, instead, a multiple of this perturbation can be dominant among the smaller perturbations and might grow quite fast there. It is also possible that the dominant vectors in different amplitude zones will be close in structure, but very different in growth characteristics. These theoretical considerations are realized quite strikingly in the Lorenz63 system. See Figure 4 and its discussion, especially the recurring patterns of "spears" therein. This can be seen very easily from the following relation:

  ‖V(t_n)‖ / ‖V(t_{n−1})‖ = R_n^{min} ‖V̄(t_n)‖ / ‖V(t_{n−1})‖,

where R_n^{min} is defined similarly to (9), V(t) denotes an EBV(t) member, and V̄(t) denotes the same vector right before the rescaling. In a nonlinear system, this ratio can be larger than 1 for some range of perturbation sizes smaller than ε.
This is the main mechanism that creates the vector zones of different magnitude with zonal growth characteristics, or a separation of scales.

Another case in point will be presented in Section 4 for the CY92 model. Even at late times and small amplitudes, we can see perturbations surviving the rescaling strategy, and they resemble very closely what one would get from the usual BV algorithm at those sizes (and the finite-time Lyapunov vectors), whereas the dominating perturbation of size ε resembles a particular BV of size ε. This example actually illustrates more: there are members of the ensemble at small magnitudes that are slow in aligning with the dominant directions.

We will be comparing the BV and the EBV on two problems: the Lorenz equations, or Lorenz63 (see Lorenz (1963)), and a dissipative, forced, nonlinear partial differential equation that arises in modeling the thermohaline circulation (see Cessi and Young (1992)). The latter will be denoted as CY92. It is a Cahn-Hilliard equation, and it will be shown to have an inertial manifold. We will also have occasion to compare the BV and EBV results to the finite-time Lyapunov vector outcomes. Throughout, we will use an explicit fourth-order Runge-Kutta time-marching scheme for the calculation of the base solutions, as well as for the calculations of the BV, EBV, and finite-time Lyapunov vectors.
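The size-ordering mechanism described above can be sketched on a linear toy flow. In the sketch below, which is our own reading of the two rescaling rules rather than the paper's code, BV renormalizes each member back to amplitude ε individually, while EBV multiplies the whole ensemble by the single factor that brings its largest member back to amplitude ε; all numbers are illustrative assumptions.

```python
import numpy as np

# Heuristic sketch of the two rescaling rules as we read them in the text:
# BV rescales every ensemble member to amplitude eps individually; EBV
# applies one collective factor (the analogue of R_n_min) to the whole
# ensemble.  On a linear toy flow this already yields the size ordering.

A = np.diag([0.9, 0.1, -0.5])      # three distinct growth rates
eps, dt, steps = 1e-2, 0.1, 200
seeds = [np.array([1.0, 0.5, 0.2]), np.array([0.5, 1.0, 0.0]),
         np.array([0.1, 1.0, 1.0]), np.array([0.8, -0.3, 0.4])]
ens_bv = [eps*v/np.linalg.norm(v) for v in seeds]
ens_ebv = [v.copy() for v in ens_bv]

for _ in range(steps):
    ens_bv = [v + dt*(A @ v) for v in ens_bv]            # forward Euler step
    ens_ebv = [v + dt*(A @ v) for v in ens_ebv]
    ens_bv = [eps*v/np.linalg.norm(v) for v in ens_bv]   # BV: member by member
    R = eps/max(np.linalg.norm(v) for v in ens_ebv)      # EBV: one factor for all
    ens_ebv = [R*v for v in ens_ebv]

bv_norms = sorted(np.linalg.norm(v) for v in ens_bv)
ebv_norms = sorted(np.linalg.norm(v) for v in ens_ebv)
print(bv_norms[-1] - bv_norms[0])    # BV: every member pinned at eps
print(ebv_norms[-1] / ebv_norms[0])  # EBV: a spread of size-ordered members
```

The BV ensemble ends with all members at the same amplitude, while the EBV ensemble retains a spread of amplitudes ordered by how strongly each member projects onto the dominant direction, a caricature of the size zones discussed above.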
The finite-dimensional, nonlinear Lorenz63 model has often been used as a benchmark for testing sensitivity and, in particular, as a test problem for BV (see, for example, Evans et al. (2004)). Let X = (X₁, X₂, X₃) ∈ R³ and let

  A = ( −σ  σ  0 ;  r  −1  0 ;  0  0  −b ),  N(X) = ( 0, −X₁X₃, X₁X₂ ),  DN(X) = (∂/∂X) N(X).    (37)

DN(X) is a 3 × 3 matrix. The Lorenz63 model is then

  ∂_t X = AX + N(X).    (38)

The associated Tangent Linear Model is the skew product system

  ∂_t X = AX + N(X),
  ∂_t U = ( A + DN(X) ) U,    (39)

where U = (U₁, U₂, U₃) ∈ R³. The U-equation in (39), which is a linear equation, is of special interest to us. We will use the notation for the Tangent Linear Model introduced in Section 2.3. With the initial condition θ = X₀ ∈ R³, we let θ·t = S(θ, t) = X(t) denote the solution of the nonlinear equations (38) that satisfies S(θ, 0) = θ. In this study we set r = 28, b = 8/3, σ = 10. It is well known that for these values of the parameters the Lorenz63 model has a chaotic global attractor. For the non-autonomous U-equation, ∂_t U = (A + DN(θ·t)) U, we let

  U(t) = U(θ, t) U₀ denote the solution operator,    (40)

where U₀ ∈ R³ and U(0) = U₀.

We will use the term "attractor" to refer to a compact, invariant set A that attracts a neighborhood of itself. As a result, A is Lyapunov stable, as a set. Consequently, for 0 < ε < 1, there is a family of ε-neighborhoods, N_ε, of A, where each neighborhood is positively invariant and A = ∩_{ε>0} N_ε. When we write that θ is near A, we mean that θ ∈ N_ε, for some small ε > 0.

It has been observed that BV(t) is a good approximation of the tangent linear solution U(t), over bounded time-intervals. As a consequence of the continuum limit theory in Section 3, we see that this perceptive observation has a solid mathematical basis, provided that the time-step δt is small and the perturbation amplitude is small as well. In order to find a numerical validation of the continuum limit theory described in Section 3, we will calculate the distance

  D(t, δt) := d_pr([BV(t)], [U(t)]),    (41)

where d_pr is the projective metric on A × P^{K−1} (see (15)), and BV(t) is a BV at time t. For this calculation, we assume that both BV(t) and U(t) satisfy the same initial condition (y₀, δY₀) at t = 0, and that y₀ = θ is near the attractor A. We also use identical ensembles of perturbation vectors for both the BV and the EBV algorithms. We then fix T > 0, and we examine the distances D(T, δt) (in the projective space) for different choices of δt. Our goal is to show that D(T, δt) becomes smaller as δt gets smaller. The max and min values, for both the BV and the EBV algorithms, are reported in Table 1.

For the calculations used to generate the data in Table 1, we set the initial conditions for both BV(t) and U(t), at t = 0, so that X = 0. , Y = 0. , Z = 0. , built the perturbations δY₀ from the product set {−1, −0. , −0. , −0. , 0, 0. , 0. , 0. , 1}³ in R³, and projected this product onto the sphere of radius 0.1, centered at the origin, in R³. By eliminating the repetitions introduced with this projection, one obtains an ensemble of 584 distinct points in R³. (Note that the initial conditions are "near" the Lorenz attractor.) The trajectory for the nonlinear problem was computed with a step size dt = 0. . We fixed T = 2 and made two choices for δt, namely δt = 0.004 and 0.001, for both the BV and the EBV algorithms. As noted in Section 3, both algorithms, EBV(t) and BV(t), approximate solutions of the tangent linear equation as δt goes to 0. By using the projective metric, one can compare the rates of convergence in terms of this metric.

The maximal values of D(T, δt), for both the BV and EBV algorithms, are essentially the same for both choices of δt, and the minimal values decrease in going from δt = 0.004 to δt = 0.001. By taking the minimum of D(T, δt) for each of the algorithms, one arrives at the best approximation given by the given algorithm. Clearly, the min BV and min EBV columns in Table 1 show that the EBV algorithm yields a better approximation than the BV algorithm.

One can, of course, use different initial conditions for the two solutions BV(t) and U(t), and/or larger perturbations. However, one cannot expect to replicate the results seen in Table 1 in that case. First, there is a transient phase, which ends when the two solutions are "near" the attractor. Even if this transient phase is short, one still has a problem: while the attractor is stable as a set, the flow on the attractor is generally not Lyapunov stable. Essentially all pairs of nearby orbits may diverge over long time intervals.

Table 1: Maximum and minimum values of D(T, δt), with T = 2, for two choices of δt. See (41).

  δt       max BV     min BV     max EBV    min EBV
  0.004    1. ×10⁻     . ×10⁻     . ×10⁻     . ×10⁻
  0.001     . ×10⁻     . ×10⁻     . ×10⁻     . ×10⁻

An interesting outcome of the use of the EBV algorithm on Lorenz63 is that it exposes the fractal behavior of the Lorenz attractor. We computed a 584-member EBV, using the same parameter values, initial conditions, and base trajectory as were used in generating Table 1, for the δt = 0.001 case. We constructed plots by merging all the vectors between time 24 and 30, at intervals of dt = 0.1. In that time span, at any given instant, one only observes a couple of members reaching the highest amplitude. By rotating the same plot about the X = Z axis, one obtains the three views of the EBVs shown in Figure 1. Zooming into Figure 1a by a factor of 8, 32, and 60, respectively, we obtain Figure 2. The fact that one observes similar patterns at several levels of magnification meets the requirement of "fractal behavior" that was introduced by Mandelbrot (1977).

Figure 3 shows the time evolution of a sample BV and the corresponding finite-time Lyapunov vector with the same initial perturbation. The results were obtained with a time step of δt = 0. ; at t = 0, X = 0. , Y = 6. , Z = 1. , and δY₀ was [1, , ]. Figure 4 shows EBV(t) alone, for the same case. The figure depicts a rescaled time evolution of the norm of 98 distinct ensemble members in the EBV calculation. The rescaling is done with respect to the usual Euclidean ℓ₂-norm. The initial perturbations are made to sample a perturbation sphere of amplitude 1 about the initial conditions. The figure highlights the rapid decay of many of the vectors and the eventual size-ordering that is inherent in the EBV algorithm. For 0 < t ≤ 13, the results in Figure 4 describe the transient behavior and are rather chaotic. However, for t ≥ 13 a very interesting pattern evolves: we see that the largest member of the ensemble takes on the value 1 for all such t, which is expected. (This corresponds to the Lyapunov vector with the largest exponent.) From the totality of the information on the
Figure 1:
Viewed from three different angles, (a)-(c): EBVs, for times 24 through 30, taken at 0.1 time intervals, for Lorenz63. Parameters and conditions are those used to generate Table 1.

Figure 2: Close-up views of Figure 1a: (a) zoomed in 8 times; (b) zoomed in 32 times; (c) zoomed in 60 times. Note the preservation of structure even as we zoom in.
Figure 3:
The three components of BV and of the finite-time Lyapunov vectors, as a function of time, for the Lorenz63 system. (For parameters and initial conditions, see text.) The vectors are nearly coincidental over the whole time span. The perturbation size is the same in all of the components. For our choice of initial conditions, the vectors exhibit more temporal regularity at earlier times. This transient behavior disappears at t = 13, approximately.

ensemble, it is possible to extract the next two Lyapunov vectors, where the exponents satisfy λ₃ ≤ λ₂ ≤ λ₁. However, this would require a deeper analysis, and it is deferred to future work. In fact, viewed as a whole, the graphs corresponding to the members smaller than 1 carry very useful structural information: notice the recurrent "spear-like" pattern, which occurs after t = 17. What is happening is that a group of "small" vectors in the attractor grow rapidly as they go around the horn of the Lorenz attractor. We propose that the rationale for this behavior is that the nonlinear equations of motion temporarily overwhelm the uniform rescaling rule for these small vectors. For t ≥ 17, we are beyond the transient zone, and we would not expect such behavior to occur if the linear term AX strongly dominated the nonlinear term N(X) in the vicinity of the attractor. This is an excellent illustration of the fact that the EBV algorithm preserves the role of the nonlinear terms in the equations of motion.

This feature of the EBV cannot be replicated by BV, even a BV with an ensemble of perturbations. The reason for this is that the rescaling rule for the BV algorithm forces all the perturbations to have the same norm, for each t ≥ 0. Thus, the Figure 4 for the EBV would be replaced, for the BV, by a figure where all of the perturbations are plotted on the top line only.
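For reference, the projective diagnostic (41) used above for Table 1 can be sketched as follows. The metric (15) itself is defined earlier in the paper and is not reproduced in this chunk; the sketch substitutes the common choice of distance on projective space that identifies a vector v with −v, which is enough to illustrate how two directions are compared.

```python
import numpy as np

# Sketch of the diagnostic (41): a distance between the line spanned by a
# bred vector and the line spanned by the tangent linear solution.  The
# particular metric below (minimum over the two sign choices) is a common
# choice on projective space, assumed here for illustration.

def d_pr(u, v):
    """Distance between the lines [u] and [v] in projective space."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return min(np.linalg.norm(u - v), np.linalg.norm(u + v))

u = np.array([1.0, 2.0, -0.5])
same_line = d_pr(u, -3.0*u)                  # [u] and [-3u] are the same point in P^2
other = d_pr(u, np.array([0.0, 1.0, 0.0]))   # a genuinely different direction
print(same_line, other)
```

Because only directions matter for BV(t) and U(t), such a sign-insensitive distance is the natural way to measure D(T, δt).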
Figure 4:
Plot depicting the evolution of the 2-norm of an ensemble of 98 EBV(t). Same parameters and initial conditions as in Figure 3. Initially, in the interval [0, 13], we note a very fast decay of some of the ensemble members, leading to a sorting in size beyond that time. As described in the text, this outcome is one of the most distinguishing features of the EBVs, when compared to an ensemble of individual BV outcomes.

4.2 A Cahn-Hilliard Equation

In their work on thermohaline dynamics, Cessi and Young (1992) proposed a coupled model for the circulation, salinity, temperature, and density of the oceans, with atmospheric forcing. The crux of the model is the partial differential equation for salinity: the slow-time dynamics of the ocean salinity S(x, t), zonally averaged, and as a function of latitude x ∈ [−π, π] and time t ≥ 0, is described by

  ∂S/∂t = α (∂²/∂x²) [ f(x) + µ S (S − sin(x))² + S − γ ∂²S/∂x² ],  t > 0,
  S(x, 0) = S₀(x).    (42)

The equation is subject to zero-flux and zero-stress boundary conditions at the poles; however, we will be considering periodic boundary conditions (for steady and periodic forcing f(x), as well as for the equilibrium solution sin(x), the conclusions that follow apply to the zero-flux case). The positive parameter α affects the strength of the linear stability of the model. We fix α = 3. ×10⁻ , γ = 0. , and µ = √10. The forcing f(x) is a prescribed function that reflects balances of evaporation and precipitation of freshwater; it can be symmetric about the Equator (x = 0) but is, more typically, asymmetric (see Eyink (2005)). The second derivative of the forcing function will be chosen to be

  ∂_xx f(x) = (36 − µ cos x − 81 cos x + 8µ + 25µ cos x) sin x,  for x ∈ [−π, 0],
  ∂_xx f(x) = −(3µ cos x − µ) sin x,  for x ∈ (0, π].    (43)

It is shown in Figure 5a. The initial condition chosen for this computation is S₀(x) = cos(x). The base solution obtained numerically is displayed in Figure 5b.

Figure 5: (a) Forcing function f_xx, (43), used in the CY92 model; the numerical solution to CY92 is shown in (b). See Figure 6 for the BV algorithm and finite-time Lyapunov vector algorithm outcomes.

A great deal is known, and can be said, about the mathematical structure of CY92 and its solutions. By letting S = ∂_x u and f = ∂_x F, the CY92 can be related to the Cahn-Hilliard Equation (CHE) with forcing. The general CHE, for u above, is

  ∂_t u + ν Δ² u = Δ( g(u) ) + F,  for x ∈ Ω and t ≥ 0,    (44)

where u = u(t, x) is a scalar field, ν > 0, Δ is the Laplacian operator, Δ² is the bi-harmonic operator, and g is a polynomial of degree 3: g(u) = Σ_{j=1}^{3} a_j u^j, with a₃ > 0. The term F = F(x) is the forcing function. In general, the domain Ω may be an open bounded domain in the Euclidean space R^m, with 1 ≤ m ≤ 3. However, we restrict our attention to the case where m = 1 and Ω is the interval Ω = (−π, π), with boundary Γ = {±π}. As we will see, the multiplicity of the largest Lyapunov exponent for this problem is 1.

For the analysis of the solutions of the CHE, one will use the standard Sobolev spaces H_j = H_j(Ω), where j ≥ 0, with H₀ = L²(Ω). As usual, the inner product and norm on H₀ are denoted by ⟨·,·⟩ = ⟨·,·⟩₀ and ‖·‖ = ‖·‖₀. For j ≥ 0,

  ‖u‖²_{j+1} = ‖u‖²_j + Σ_{|α|=j+1} ∫_Ω |D^α u|² dx,  where D^α = ∂^α/∂x^α,  α ≥ 0,

and

  ⟨u, v⟩_{j+1} = ⟨u, v⟩_j + Σ_{|α|=j+1} ∫_Ω (D^α u)(D^α v) dx.

The finite-time Lyapunov vector algorithm for the CY92 model is derived by first rewriting the equation as

  ∂_t S = α ∂_xx [ µ S (S − η)² − r f(x) + S − γ ∂_xx S ] =: F(S).
We opt here to first linearize and then discretize. To obtain the tangent linear equation, we need to find the linear map L such that

    F(S + δS) − F(S) = L(δS) + o(‖δS‖),  as ‖δS‖ → 0,

where ‖·‖ is a norm. Typically, this will be the norm of the space where solutions live, such as a Sobolev space for S, or a norm dictated by physical considerations. In this infinite-dimensional model, different norms are not necessarily equivalent.

First, we note that

    (S + δS)((S + δS) − η)² − S(S − η)² = (S − η)(3S − η) δS + o(‖δS‖).

We denote by L_h a finite-difference approximation to L. The discrete tangent linear operator becomes the product of two matrices, L_h = AB, where B is the matrix whose entries are found by discretizing the operator

    µ (S − η)(3S − η) + Id − γ ∂_xx,

where Id is the identity matrix, and A is the matrix corresponding to the discretization of α ∂_xx. The finite-time Lyapunov algorithm is obtained by solving

    ∂_t δΘ = L_h δΘ = AB δΘ = A(B(δΘ)),

for t ∈ [0, T], subject to some initial vector perturbation δΘ(0) = s_0, where δΘ is the discretization of the perturbation δS. The perturbation is normalized to the norm of the initial perturbation at each δt, the time step of the explicit Runge-Kutta 4 time integration scheme employed here. T = 70. In the simulations, δt = 0.01. We applied second-order centered finite differences in space, and used 121 grid points, −π = x_0, x_1, ..., x_120 = π.

By construction, the finite-time Lyapunov vectors are not amplitude-sensitive; the BVs, however, are: in a general nonlinear problem, different perturbation amplitudes will yield different BVs. This outcome has important practical implications if one would like to use the BV either to infer structure in the field, or the degree of sensitivity of the outcomes to perturbations in initial conditions. A challenging problem could thus arise in the context of large-scale simulations: what is considered a large structural change or a highly sensitive outcome is physics-dependent, and perhaps difficult to quantify; the physics in question may not be fully understood, so that a reasonable perturbation amplitude is simply guessed.

We ran the finite-time Lyapunov vector and the BV algorithms, using the same initial condition, forcing, and perturbation. First, we examine the effect of the size of the amplitude of the perturbation on the outcomes: we do so by keeping the shape of the perturbation fixed, changing only the overall amplitude. Figures 6a and 6b are plots of the final BVs and finite-time Lyapunov vectors, corresponding to perturbations of 2-norm 0.25 and 0.025, respectively. When the perturbations are small, the BVs and the finite-time Lyapunov vectors are qualitatively, and nearly quantitatively, consistent. The outcomes shown here are typical of the general case; that is, the qualitative and quantitative disagreement grows with an increase in the amplitude of the perturbations.
Figure 6: Effect of the perturbation amplitude on the CY92 model: comparison of the finite-time Lyapunov vectors and the BVs at t = 70, with initial condition Y = cos(x) and initial perturbation ε sin(x). (a) corresponds to ε = 0.25; (b) to ε = 0.025.

The shape, or spectral content, of the perturbation mattered as well. The spectrum of the perturbation is clearly important when projecting onto spectral bases for a reduced representation. To illustrate this, we use the CY92 model, with the same forcing and initial condition as before. We examine monochromatic sine wave perturbations with wavenumber j = 1, 2, ..., 6:

    δY = ε_j sin[ jx + 13(j − 1) exp(1) ],   (45)

where all of the perturbation amplitudes remain the same, ε_j = 0.25. The choice of the phase is inconsequential: to show this, we chose non-commensurate phases among the sine wave components. In Figure 7a we show the BVs associated with each of the perturbations (j = 1, ..., 6) at t = 70. Figure 7b shows the EBVs at t = 70. Even at t = 5, shown in Figure 7c, we already see an amplitude-ordered structure in the EBV.

The space-time plot of all EBVs appears in Figure 7d; we note the very short transient phase, lasting until about t = 4, followed by a structure which is clearly dominated by the largest member of the EBV. The EBV calculation was run with parameters identical to those used in Figure 7a, with the ensemble consisting of the same initial perturbations in (45). In the BV case, the outcomes do not have the reductive appearance of the EBV. In the EBV case, on the other hand, there is no ambiguity about the prevailing ensemble member; this member has a clear correspondence in structure to the path and its perturbation field. The BVs should nevertheless eventually agree with the EBVs; it is safe to assume that this will happen only after a very long time, longer than the time interval that might be suggested by the base solution.

We also compared the BV, for each of the perturbations in (45), to the outcomes of the finite-time Lyapunov vector calculation. Figure 8 compares the BVs and finite-time Lyapunov vectors for each wavenumber-j perturbation. The amplitudes are set to ε_j = 0.25, for j = 1, 2, ..., 6. The structure of the finite-time Lyapunov vectors, to within a sign, is qualitatively the same regardless of the wavenumber of the perturbation. The BVs are not. If the perturbation were made considerably smaller, the differences between the BVs and the finite-time Lyapunov vectors would become small, as expected: as shown in Section 3, the BVs and the finite-time Lyapunov vectors must be similar to each other, provided the perturbations are small enough. For larger amplitudes, the BVs might show more structure, since nonlinear effects could play a role. Apparently, ε = 0.25 is already in the range of large perturbations of CY92 about the chosen solution. Whether a chosen perturbation is large or small was not clear from the CY92 model itself: it became clear to us only after comparing the outcomes of the finite-time Lyapunov case and the BV case.
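The breeding cycle compared above with the finite-time Lyapunov vectors can be sketched in a few lines. The sketch below uses the Lorenz-63 system with a forward-Euler step as a convenient stand-in for CY92 (an assumption for brevity), and rescales after every time step, as in the text; the names `breed` and `lorenz_step` and all parameter values are illustrative, not the paper's code.

```python
import numpy as np

def breed(step, y0, dy0, n_steps):
    """One BV run: propagate base and perturbed states with the full nonlinear
    model, difference them, and rescale the difference to its initial 2-norm
    after every time step."""
    eps = np.linalg.norm(dy0)
    y, dy = np.asarray(y0, float), np.asarray(dy0, float)
    for _ in range(n_steps):
        y_next = step(y)
        dy = step(y + dy) - y_next        # nonlinear perturbation growth
        dy *= eps / np.linalg.norm(dy)    # renormalize to the initial size
        y = y_next
    return dy

def lorenz_step(y, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    """A single forward-Euler step of Lorenz-63 (illustrative stand-in model)."""
    x1, x2, x3 = y
    return y + dt * np.array([s * (x2 - x1),
                              x1 * (r - x3) - x2,
                              x1 * x2 - b * x3])

bv = breed(lorenz_step, y0=[1.0, 1.0, 1.0], dy0=[1e-3, 0.0, 0.0], n_steps=500)
```

By construction, ‖bv‖ equals the initial amplitude 1e-3 after every cycle; only the direction of the bred vector carries information, which is precisely why the amplitude-sensitivity discussed above enters through the nonlinear difference, not through the rescaling.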
Figure 7: (a) At t = 70, the superposition of 6 BV simulations, the j-th one corresponding to a sine wave perturbation, as per (45). (b) Cross section of the EBVs, at t = 70, run with an ensemble of perturbations (see (45)). We note that, unlike the BV case in (a), the EBV has settled into a structure that clearly reflects the dominant EBV ensemble member. (c) Cross section of the same EBVs at an earlier time, t = 5. Even for short times, the EBV already shows an amplitude-ordering of its vectors. (d) Superposition of all EBVs, as a function of time and space.

Figure 8: Wavenumber dependence of BVs and finite-time Lyapunov vectors. CY92 model, cosine initial condition and non-symmetric forcing (see Figure 5). Comparison of the finite-time Lyapunov vectors and the BVs for sine wave perturbations of size ε_j = 0.25. The perturbations are the individual sine waves in (45): (a) j = 1, (b) j = 2, ..., (f) j = 6.

The Bi-harmonic Operator

The bi-harmonic operator B u = Δ² u on Ω and the linear bi-harmonic equation

    ∂_t u + ν Δ² u = 0   (46)

play a basic role in the study of the CHE. We assume for now that the operator B satisfies either the no-flux boundary conditions

    ∂u/∂n = 0 and ∂(Δu)/∂n = 0, on Γ,   (47)
or the periodic boundary conditions; see Sell and You (2002). In the sequel, we will assume that the forcing function F in (44) satisfies F ∈ Ḣ^j, for some j ≥ 0; that is to say, F ∈ H^j with

    ∫_Ω F dx = 0.

Thus, by integrating (44), one observes that the spatial mean of any solution u = u(t, x) is conserved:

    ū_0 := (1/|Ω|) ∫_Ω u_0(x) dx = (1/|Ω|) ∫_Ω u(t, x) dx = ū(t),  for t ≥ 0.   (48)

One seeks solutions of the CHE (44) in the Sobolev space V. A mild solution u = u(t) of the initial value problem is given by

    u(t) = e^{−νBt} u_0 + ∫_0^t e^{−νB(t−s)} [ Δ(g(u(s))) + F ] ds.   (49)

We will denote the maximally defined solutions of (49) by u(t) = S(t)u_0. With u_0 ∈ V, this mild solution is uniquely determined, with S(t)u_0 ∈ V, for 0 ≤ t < T(u_0), where 0 < T(u_0) ≤ ∞.

Because of (48), we see that, for j ≥ 2, the spaces

    Ḣ^j := { φ ∈ H^j : φ̄ = 0 }   (50)

are positively invariant for the solutions of (44). Thus we will focus only on solutions S(t)u_0 that satisfy ū_0 = 0. We also denote the collection of stationary solutions of the CHE by

    Q := { u ∈ H : S(t)u = u for all t ∈ R },

and Q_0 = Q ∩ Ḣ = Q ∩ V.

The case F ≡ 0: The basic problem with forcing F ≡ 0 is

    ∂_t u + ν Δ² u = Δ(g(u)),  for x ∈ Ω and t ≥ 0.   (51)

To study the solutions of (51), one uses the Landau-Ginsburg functional

    J(u) := ∫_Ω [ (ν/2)|∇u|² + G(u) ] dx,  where G(z) = ∫_0^z g(s) ds.   (52)

In addition, the Landau-Ginsburg functional satisfies

    ∂_t J(S(t)u_0) = −‖∇K(S(t)u_0)‖²,  for 0 < t < T(u_0),   (53)

where K(u) = −νΔu + g(u). Furthermore, there exist positive constants f_0 and C_0 such that

    −f_0 ≤ G(s) ≤ C_0 s⁴ + f_0,  and  (ν/2)‖∇S(t)u_0‖² − f_0 |Ω| ≤ J(u_0).   (54)

Since J(u) is bounded below, see (54), and decreasing along orbits, see (53), it follows that J(S(t)u_0) is a Lyapunov function; see LaSalle and Lefschetz (1961). This implies that the mild solution S(t)u_0 is defined for all t > 0, i.e., T(u_0) = ∞, for every u_0 ∈ V. In fact, whenever F ≡ 0 and ū_0 = 0, then ω(u_0), the omega limit set of the solution S(t)u_0, is a nonempty, compact, connected invariant set in Q_0.

Let us now return to the CHE with forcing (44). By integrating equation (44), with u_0 ∈ Ḣ and F ∈ Ḣ as well, one finds that S(t)u_0 ∈ Ḣ, for all t ≥ 0. Assume that B satisfies the boundary conditions BC, i.e., either the no-flux conditions (47) hold, or the periodic boundary conditions hold. Let S(t) be the semiflow generated by the solution operator on V. Then the following hold.

1. S(t) has a nonempty, compact global attractor A in V, and A attracts all bounded sets in V. The attractor A depends continuously on the forcing function F ∈ H.

2. When F ∈ H, the attractor A is a compact, invariant set in V^r, for each r with 0 ≤ r ≤ 1.

3. The set Q_0 of stationary solutions is nonempty, compact, and invariant, with Q_0 ⊂ A.

4. Lastly, there is an inertial manifold for the solutions of the infinite-dimensional system (44); see Foias et al. (1988) and Sell and You (2002). One finds this manifold by using the orthogonal projection P_N onto the lowest N modes, that is to say, onto Span{ e_k : k ≤ N }, where { e_k } are the eigenfunctions of the bi-harmonic operator B. One then makes a change of variables u = v + w, where v = P_N u, w = Q_N u, and Q_N = I − P_N. The (v, w) system

       ∂_t v + Bv = P_N ( Δ(g(v + w)) + F ),
       ∂_t w + Bw = Q_N ( Δ(g(v + w)) + F ),   (55)

   is equivalent to the CHE (44). One then shows that, for N large, the variable w is enslaved to the variable v in some neighborhood of A; that is to say, w = Φ(v), for a suitable function Φ. It turns out that the longtime dynamics of (44) is equivalent to the longtime dynamics of the finite-dimensional ordinary differential equation

       ∂_t v + Bv = P_N ( Δ(g(v + Φ(v))) + F ).   (56)

   See Sell and You (2002), and the references contained therein, for more details.

For sufficiently small perturbations and short renormalizing time intervals, the finite-time Lyapunov vector, the BV, and the EBV algorithms yield similar outcomes. As we depart from these conditions, we see significant differences among the three methods. The BV outcomes are most sensitive to the amplitude and the frequency of the perturbations. Regardless of the amplitude of the perturbations, the EBV outcomes are structurally unambiguous and robust.

We revisit the CY92 simulations of BV and EBV, using the same forcing and initial conditions used in Section 4.2, but set the time scale parameter α = 0.01, increase the spatial resolution to 320 points, and set the integration time step δt accordingly. Running with increasing perturbation amplitudes ε_j (the largest two being 0.8 and 1.2), for j = 1, 2, ..., 6 in (45), we obtain Figure 9, which shows all BVs at t ≈ 19; Figures 10 and 11 show the corresponding EBV outcomes for comparison.

The sensitivity of the outcomes to the size of the initial perturbations is significantly different in the EBV and the BV outcomes. There are shape variations among the BVs as the amplitude of the perturbation is increased, as evidenced in Figure 9.

Figure 9:
BVs at t ≈ 19, corresponding to ε_j = ε for all j: (a) the smallest amplitude, (b) ε = 0.8, (c) ε = 1.2. The outcomes are very sensitive to nonlinearity; there is a resulting ambiguity in the structure of the perturbation field.

Figure 10:
EBV with (a) the smallest amplitude, (b) ε = 0.8, (c) ε = 1.2. Compare the outcomes to Figure 9. The vectors are shown in their original scales.

Figure 11:
EBV outcomes shown in Figure 10, with each vector rescaled to an L²-norm of 1 to aid in visual comparison. Perturbation amplitudes as in Figure 10: (a) the smallest, (b) ε = 0.8, (c) ε = 1.2. Note that as the perturbation amplitude increases, the resemblance between the BV and the EBV outcomes is lost, except for one of the vectors.
This degree of sensitivity is not as prevalent in the EBV outcomes, shown in Figure 11. The BV algorithm does not distinguish between different perturbation scales; a large-perturbation BV calculation will thus not yield information concerning smaller scales. In contrast, in the EBV case small-scale information is evident: see Figure 11c (see also Figure 10c).
Like the BV algorithm, the EBV algorithm is capable of dealing with legacy code. The important implementation difference between an ensemble of BV runs and the EBV is that, while the former can be run concurrently and independently, in the EBV the ensemble members must be normalized relative to each other. In terms of coding, this is a minor issue if the state-variable dimension of the underlying model is moderate. For very large problems, communication becomes an issue, but one not at all unfamiliar in concurrent or hybrid computing. Any additional computational cost borne by the EBV is well outweighed by its higher informational content over the BV. Moreover, the EBV is much more robust under nonlinear effects than the BV, as illustrated in Section 4. This last aspect is very important, and it should be studied in the context of more chaotic equations in fluid mechanics, meteorology, and geophysics.

The numerical outcome of the BV and EBV algorithms depends on the choice of norm used for rescaling. This is alluded to in Rivière et al. (2008), but the reason for this dependence turns out to be easily explained, and the dependence can be significant if nonlinear effects are not negligible. It lies in the fact that it is not possible, in the general case, to scale the norm out of the algorithm if the underlying dynamics are nonlinear. To illustrate the norm dependence, we consider a simple 2-dimensional system

    dX₁/dt = X₂,
    dX₂/dt = −sin(8X₁),  t > 0,   (57)

subject to the same initial conditions (X₁(0), X₂(0)) in all cases, with X₂(0) = −0.15. The norms we employ are all standard (finite) L^p norms. The resulting BVs reflect the character of these norms. That different outcomes are obtained is not surprising: the shape of the unit sphere changes depending on which L^p norm is used.
Figure 12: (a) L¹-norm, (b) L²-norm, and (c) L∞-norm of a sample BV as a function of time, for the system in (57). The initial conditions and the perturbation amplitude are the same in all cases.
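The norm dependence just described can be reproduced with a short breeding loop in which only the rescaling norm changes. This is a sketch under stated assumptions: the system below has the form of (57) where the source is legible (the sin(8X₁) coupling), the value X₁(0) = 0.1 is a placeholder, and forward Euler is used for brevity.

```python
import numpy as np

def rhs(X):
    # 2-D test system of the form (57); sin(8*X1) follows the text, the
    # remaining details are illustrative assumptions
    return np.array([X[1], -np.sin(8.0 * X[0])])

def bred_vector_lp(p, X0, dX0, dt=0.01, n_steps=1500):
    """Breed a single vector, rescaling in the L^p norm after every Euler step."""
    eps = np.linalg.norm(dX0, ord=p)
    X, dX = np.asarray(X0, float), np.asarray(dX0, float)
    for _ in range(n_steps):
        X_next = X + dt * rhs(X)
        dX = (X + dX) + dt * rhs(X + dX) - X_next   # nonlinear difference
        dX *= eps / np.linalg.norm(dX, ord=p)       # L^p rescaling
        X = X_next
    return dX

X0 = [0.1, -0.15]     # X2(0) = -0.15 from the text; X1(0) is a placeholder
dX0 = [1e-2, 1e-2]
bv_l1  = bred_vector_lp(1, X0, dX0)
bv_inf = bred_vector_lp(np.inf, X0, dX0)
```

Each run preserves a different norm of the perturbation (‖·‖₁ and ‖·‖∞, respectively), and because the rescaling factor enters the nonlinear difference at every step, the resulting directions generally differ as well; this is the mechanism behind Figure 12.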
A more weighty consideration related to norms concerns physical and theoretical constraints. Conservation laws provide guidance on the most appropriate norms for the given dynamics (for instance, in Rayleigh-Bénard convection, the temperature enjoys a maximum principle, while the velocity does not), but other considerations will play a role (e.g., a sup-norm may be an obvious choice in determining the location of severe weather events). It will thus not be uncommon that a mixture of norms is necessary in multi-physics problems; the fact that the choice of norm in the EBV/BV affects the results is in fact a good thing, not a bad one.

To end this section, we wish to highlight a practical consideration that may not be familiar to practitioners implementing BV or EBV algorithms computationally. The specific issue is the impact of finite-precision computing on the outcomes. It is easy to show that these algorithms have high numerical sensitivity. To illustrate how this plays out, we consider the problem of calculating the vectors associated with

    dX/dt = AX,  t > 0,  X(0) = X_0.   (58)

Here A is a square matrix of constants. For linear problems, the BV, EBV (and the finite-time Lyapunov) algorithms must yield the same vectors. We choose to illustrate numerical ill-conditioning on a problem that exhibits transient growth due to the non-normal structure of the matrix A. We emphasize, however, that the numerical sensitivity highlighted in this example does not hinge on the nature of the dynamics, but rather on the algorithmic form of the BV and EBV themselves.

We take A to be an upper-triangular Jordan-block matrix of dimension 5, with a single eigenvalue λ repeated on the main diagonal, while the diagonal directly above the main diagonal has non-zero entries. For simplicity we assume these to have the same value ρ. We assume further that λ = −1 and ρ = 1. The general solution operator contains "generalized" eigen-solutions with growth rates t^k e^{−t}, where 0 ≤ k ≤ 4. (In this case, the eigenvalue has algebraic multiplicity 5.) For this linear problem we expect to see, provided we run for a time t sufficiently long to forget the transient, a very trivial outcome for the BV or EBV.

In Figure 13 we summarize the results of the BV calculation on this system. We employed an integration time step of 0.01 and performed the calculation in double precision, using an explicit Runge-Kutta 4 scheme, but paid little attention to how numerical sensitivity was handled. In Figure 13a we show the BV, i.e., Y, as a function of time. The calculation of the BV, using (3), clearly diverged shortly after about t = 30 (and is thus not shown in Figure 13a). However, there were indications that something was not right even before it became obvious that the solution was wrong: in Figure 13b we show the 2-norm of M(Y_n + δY_n, δt) − M(Y_n, δt). It monotonically increases, even though all the factors t^k e^{−t}, for 0 ≤ k ≤ 4, eventually decay for large t. One could monitor M(Y_n + δY_n, δt) − M(Y_n, δt) directly; however, in very large scale problems this might not be practical or even possible. Figure 13c shows the 2-norm of M(Y_n + δY_n, δt) − M(Y_n, δt), using a numerically well-conditioned implementation of the BV algorithm. The strategy used to obtain a well-conditioned outcome was to normalize the elements in the required subtraction before computing their difference. The well-conditioned calculation was capable of qualitatively good results. However, the strategy adopted here to increase numerical stability is by no means generally applicable to all problems, nor is it optimal.

The main thrust of our work is to propose an ensemble-based vector breeding algorithm, the Ensemble Bred Vector (EBV) algorithm. It is based on the Bred Vector (BV) algorithm introduced by Toth and Kalnay (1993, 1997). We compare the EBV to the BV algorithm and to the finite-time Lyapunov vector algorithm. In the EBV, an ensemble of initial perturbations is bred concurrently and then rescaled by the size of the largest member of the bred ensemble.
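The concurrent breeding and common rescaling just described can be sketched as follows. The toy model (Lorenz-63 with a forward-Euler step) and all names are illustrative assumptions, not the paper's implementation; the point is the single shared scale factor.

```python
import numpy as np

def ebv_cycle(step, y0, perts, n_steps):
    """EBV sketch: breed all perturbations along the same base trajectory and,
    after every step, rescale the WHOLE ensemble by one common factor, chosen
    so that the largest member returns to the initial target amplitude."""
    eps = max(np.linalg.norm(p) for p in perts)     # target amplitude
    y = np.asarray(y0, float)
    dys = [np.asarray(p, float) for p in perts]
    for _ in range(n_steps):
        y_next = step(y)
        dys = [step(y + dy) - y_next for dy in dys]
        scale = eps / max(np.linalg.norm(dy) for dy in dys)
        dys = [scale * dy for dy in dys]            # relative sizes are preserved
        y = y_next
    return dys

def lorenz_step(y, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    # single forward-Euler step of Lorenz-63 (illustrative stand-in model)
    x1, x2, x3 = y
    return y + dt * np.array([s * (x2 - x1),
                              x1 * (r - x3) - x2,
                              x1 * x2 - b * x3])

perts = [1e-3 * np.eye(3)[k] for k in range(3)]
ens = ebv_cycle(lorenz_step, [1.0, 1.0, 1.0], perts, n_steps=500)
norms = [np.linalg.norm(dy) for dy in ens]
```

Unlike a collection of independent BV runs, members that grow more slowly than the leader remain smaller after rescaling, which is what produces the size-ordered perturbation field reported in Section 4.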
The uniform normalization of all the ensemble members after each cycle is the distinctive trait of the EBV algorithm, and it leads to some profound differences when compared with the BV algorithm.

As expected, when the initial perturbations are sufficiently small, the EBV, the BV, and the finite-time Lyapunov vector algorithms lead to similar results. We gave a theoretical justification of this phenomenon by examining the corresponding time-continuum analogues of the BV and EBV algorithms. We rescale frequently: the algorithms and results are formulated, for simplicity, assuming that the rescaling is done after every discrete time step.

In Section 3, we develop a solid mathematical basis for both the BV and EBV algorithms and show that each algorithm results in good approximations of the solutions of the tangent linear model, when the step size δt is small. As is seen in Table 1, the EBV algorithm has a substantial advantage over the (classical) BV algorithm, in the sense that the drop in the minimal EBV error is substantially better than the corresponding drop in the minimal BV error.

Figure 13: (a) A sample BV, as a function of time. The calculation diverges and eventually fails (not shown); (b) 2-norm of the numerically-approximated M(Y_n + δY_n, δt) − M(Y_n, δt) as a function of time, for the linear system dX/dt = AX. Here, A is an upper-triangular Jordan-block matrix of dimension 5, with a single eigenvalue −1 repeated on the main diagonal, while the superdiagonal has entries 1. (c) 2-norm of the numerically-approximated M(Y_n + δY_n, δt) − M(Y_n, δt), in the well-conditioned case.

In the study of the Lorenz attractor, the classical BV algorithm has a shortcoming which limits one's ability to use it to study the dynamics inside the attractor. In particular, equation (5) implies that ‖δY_{n+1}(ι)‖ = ε, for all n ≥ 0 and ι ∈ I. The BV algorithm maps the cloud onto a 2-dimensional sphere; the third dimension is lost, and with it the fractal patterns seen in Figures 4 and 1. The many patterns seen in Figure 2 are also lost, because the BV alternative would map everything onto the single line at height ε.

Lastly, in Figures 9, 10, and 11, we examine a series of related test problems that depart from the tangent linear model. The point here is to compare the performance of the two algorithms as one moves further into the nonlinear regime. Once again, one sees a significant advantage of the EBV over the BV.

The theoretical aspects of the EBV and BV algorithms are presented in Subsection 2.5. As is noted there, our application to the basic issue of sensitivity with respect to errors in the initial conditions relies heavily on the Johnson, Palmer, and Sell (1987) manuscript, JPS87.

In conclusion, for the applications described in this article, we believe that the new EBV algorithm has been shown to be superior to the traditional BV algorithm. Finally, we ask: is the EBV algorithm the "last" word on modifications of the classical BV? Probably not. However, we do expect that the EBV will serve as a good starting point for new theories of bred vectors.

Acknowledgements
JMR was partially supported by NSF grant DMS-0335360. ALM was partially supported by NSF grants DMS-0708902, DMS-1009713, and DMS-1009714. NB, ALM, JMR, and GRS wish to thank the Institute for Mathematics and its Applications (IMA) for its support and hospitality. Research at the IMA is supported by the National Science Foundation and the University of Minnesota. GRS acknowledges his appreciation of Robert Sacker for related suggestions made at an early stage in the development of this article. We express our sincere gratitude to the referees, who made an in-depth reading of the original manuscript and offered several helpful suggestions, which led to improvements in the paper.

References
L. Arnold. Random Dynamical Systems. Springer, New York, 1998.
R. Buizza, J. Tribbia, F. Molteni, and T. Palmer. Computation of optimal unstable structures for numerical weather prediction models. Tellus, 45A:388–407, 1993.
P. Cessi and W. R. Young. Multiple equilibria in two-dimensional thermohaline circulation. Journal of Fluid Mechanics, 241:291–309, 1992.
K. K. W. Cheung. Ensemble forecasting of tropical cyclone motion: comparison between regional bred modes and random perturbations. Meteorology and Atmospheric Physics, 78:23–35, 2001.
M. Corazza, E. Kalnay, D. J. Patil, S.-C. Yang, R. Morss, M. Cai, I. Szunyogh, B. R. Hunt, and J. A. Yorke. Use of the breeding technique to estimate the structure of the analysis errors of the day. Nonlinear Processes in Geophysics, 10:233–243, 2003.
B. Deremble, F. D'Andrea, and M. Ghil. Fixed points, stable manifolds, weather regimes, and their predictability. Chaos, 19:043109, 2009.
E. Evans, N. Bhatti, J. Kinney, L. Oann, M. Peña, S. Yang, and E. Kalnay. RISE undergraduates find that regime changes in Lorenz model are predictable. Bulletin of the American Meteorological Society, 85:520–524, 2004.
G. Eyink. Statistical hydrodynamics of the thermohaline circulation in a two-dimensional model. Tellus A, 57:100–115, 2005.
C. Foias, G. R. Sell, and R. Temam. Inertial manifolds for nonlinear evolutionary equations. Journal of Differential Equations, 73:309–353, 1988.
T. Gneiting and A. E. Raftery. Atmospheric science: weather forecasting with ensemble methods. Science, 310:248–249, 2005.
S. Hallerberg, D. Pazo, J. M. Lopez, and M. A. Rodriguez. Logarithmic bred vectors in spatiotemporal chaos: structure and growth. Physical Review E, 81, 2010.
J. A. Hansen and L. A. Smith. The role of operational constraints in selecting supplementary observations. Journal of the Atmospheric Sciences, 57:2859–2971, 2000.
R. A. Johnson, K. J. Palmer, and G. R. Sell. Ergodic properties of linear dynamical systems. SIAM Journal on Mathematical Analysis, 18:1–33, 1987.
E. Kalnay. Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, Cambridge, 2003.
J. P. LaSalle and S. Lefschetz. Stability by Lyapunov's Direct Method with Applications. Academic Press, New York, 1961.
E. N. Lorenz. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20:130–141, 1963.
A. M. Lyapunov. The general problem of the stability of motion. International Journal of Control, 55(3):521–790, 1992. Translated by A. T. Fuller from Édouard Davaux's French translation (1907) of the 1892 Russian original, with a historical introduction by Fuller.
B. B. Mandelbrot. Fractals: Form, Chance, and Dimension. W. H. Freeman and Co., San Francisco, revised edition, 1977. Translated from the French.
M. Mu and Z. N. Jiang. A new approach to the generation of initial perturbations for ensemble prediction: conditional nonlinear optimal perturbation. Chinese Science Bulletin, 53:2062–2068, 2008.
T. N. Palmer, R. Gelaro, J. Barkmeijer, and R. Buizza. Singular vectors, metrics and adaptive observations. Journal of the Atmospheric Sciences, 55:633–653, 1998.
V. A. Pliss and G. R. Sell. Robustness of exponential dichotomies in infinite-dimensional dynamical systems. Journal of Dynamics and Differential Equations, 11:471–513, 1999.
C. Primo, M. A. Rodriguez, and J. M. Gutierrez. Logarithmic bred vectors: a new ensemble method with adjustable spread and calibration time. Journal of Geophysical Research - Atmospheres, page D05116, 2008.
O. Rivière, G. Lapeyre, and O. Talagrand. Nonlinear generalization of singular vectors: behavior in a baroclinic unstable flow. Journal of the Atmospheric Sciences, 65:1896–1911, 2008.
R. J. Sacker and G. R. Sell. Existence of dichotomies and invariant splittings for linear differential systems. II. Journal of Differential Equations, 22(2):478–496, 1976.
R. J. Sacker and G. R. Sell. Lifting properties in skew-product flows with applications to differential equations. Memoirs of the American Mathematical Society, 11(190):iv+67, 1977.
R. J. Sacker and G. R. Sell. A spectral theory for linear differential systems. Journal of Differential Equations, 27(3):320–358, 1978.
R. J. Sacker and G. R. Sell. The spectrum of an invariant submanifold. Journal of Differential Equations, 38(2):135–160, 1980.
G. R. Sell and Y. You. Dynamics of Evolutionary Equations. Springer, New York, 2002.
Z. Toth and E. Kalnay. Ensemble forecasting at NCEP: the generation of perturbations. Bulletin of the American Meteorological Society, 74:2317–2330, 1993.
Z. Toth and E. Kalnay. Ensemble forecasting at NCEP: the breeding method. Monthly Weather Review, 125:3297–3318, 1997.
X. G. Wang and C. H. Bishop. A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. Journal of the Atmospheric Sciences, 60:1140–1158, 2003.
M. Z. Wei and Z. Toth. A new measure of ensemble performance: perturbation versus error correlation analysis (PECA). Monthly Weather Review, 131:1549–1565, 2003.
C. L. Wolfe and R. M. Samelson. An efficient method for recovering Lyapunov vectors from singular vectors.