Data assimilation for chaotic dynamics
Alberto Carrassi, Marc Bocquet, Jonathan Demaeyer, Colin Grudzien, Patrick Raanes, Stephane Vannitsem
DData assimilation for chaotic dynamics
Alberto Carrassi, Marc Bocquet, Jonathan Demaeyer, Colin Grudzien, PatrickRaanes and Stephane Vannitsem
Abstract
Chaos is ubiquitous in physical systems. The associated sensitivity to initialconditions is a significant obstacle in forecasting the weather and other geophysi-cal fluid flows. Data assimilation is the process whereby the uncertainty in initialconditions is reduced by the astute combination of model predictions and real-timedata. This chapter reviews recent findings from investigations on the impact of chaoson data assimilation methods: for the Kalman filter and smoother in linear systems,analytic results are derived; for their ensemble-based versions and nonlinear dynam-ics, numerical results provide insights. The focus is on characterizing the asymptoticstatistics of the Bayesian posterior in terms of the dynamical instabilities, differen-tiating between deterministic and stochastic dynamics. We also present two novelresults. Firstly, we study the functioning of the ensemble Kalman filter (EnKF) in thecontext of a chaotic, coupled, atmosphere-ocean model with a quasi-degenerate spec-trum of Lyapunov exponents, showing the importance of having sufficient ensemble
Alberto CarrassiDept. of Meteorology and National Centre for Earth Observation, University of Reading, Reading,UK,Mathematical Institute, University of Utrecht, Netherlands e-mail: [email protected]
Marc BocquetCEREA, joint laboratory École des Ponts ParisTech and EDF R&D, Universitè Paris-Est, Francee-mail: [email protected]
Jonathan DemaeyerRoyal Meteorological Institute of Belgium, Brussels, Belgium e-mail: [email protected]
Colin GrudzienDepartment of Mathematics and Statistics, University of Nevada, Reno, Reno, Nevada, USA e-mail: [email protected]
Patrick RaanesNORCE, Bergen, Norway e-mail: [email protected]
Stephane VannitsemRoyal Meteorological Institute of Belgium, Brussels, Belgium e-mail: [email protected] a r X i v : . [ phy s i c s . d a t a - a n ] N ov Authors Suppressed Due to Excessive Length members to track all of the near-null modes. Secondly, for the fully non-Gaussianmethod of the particle filter, numerical experiments are conducted to test whether thecurse of dimensionality can be mitigated by discarding observations in the directionsof little dynamical growth of uncertainty. The results refute this option, most likelybecause the particles already embody this information on the chaotic system. Theresults also suggest that it is the rank of the unstable-neutral subspace of the dynam-ics, and not that of the observation operator, that determines the required numberof particles. We finally discuss how knowledge of the random attractor can play arole in the development of future data assimilation schemes for chaotic multiscalesystems with large scale separation.
This chapter attempts a unified and comprehensive discussion of a number of studiesthat in about a decade have contributed to shape our nowadays’s understanding ofthe implications, impacts and consequences for data assimilation when is applied tochaotic dynamics. The chapter presents a review of the essential results appeared indifferent studies, but for the first time together in a coherent treatment. In additionwe present new original findings addressing two key aspects that were not coveredin previous studies. The first treats the impact of data assimilation on a chaoticand multiscale system, the second concerns the consequences for nonlinear, non-Gaussian, data assimilation (a particle filter) in face of the chaotic nature of theunderlying dynamics.The exposition is organised as follows. We first discuss, in Sect. 2, the chaoticcharacter of atmospheric and oceanic flows, and provide a treatment of the keymathematical concepts and tools that allow for characterising chaos and to “measure”the degree of instabilities. In doing so, we review classical results from dynamicalsystem theory, including the multiplicative ergodic theorem and associated definitionof Lyapunov (forward, backward and covariant) vectors and exponents.Section 3 analyses how a chaotic dynamics impacts data assimilation. Our focusis on Kalman filter (KF) and smoother (KS) in Sect. 3.1, and on their ensemble-based formulations, the ensemble Kalman filter (EnKF) and smoother (EnKS), inSect. 3.2. Section 3.1 contains primarily analytic results and treats KF and KS inlinear systems, either purely deterministic (Sect. 3.1.1) or with stochastic additivenoise (Sect. 3.1.2). Section 3.2 is dedicated to nonlinear systems (deterministic, inSect. 3.2.1 and stochastic in Sect. 3.2.2) and on how the EnKF (Evensen, 2009a)and the extended Kalman filter (EKF, Ghil and Malanotte-Rizzoli (1991)) works inthis scenario. In Sect. 3.2.1 we present original results on the impact of chaos on theperformance of the EnKF in a coupled atmosphere-ocean model which possesses adegenerate-like spectrum of Lyapunov exponents, disentangling on the role of thequasi-null exponents.Section 4 reverses the perspective and instead of studying the effect of chaos ondata assimilation, reports on how properties of the dynamics have been used to devise hort Title 3 adaptive observation strategies and ad-hoc data assimilation methods. In particular,in Sect. 4.2, we succinctly review the assimilation in the unstable subspace (AUS,Palatella et al. (2013)), a known approach that exploits the unstable-neutral subspaceof the dynamics to perform the analysis. Remarkably, AUS was conceived beforethe findings reviewed in Sect. 3. The latter work was inspired by these early studies,and the need to provide mathematical rigor to clarify the mechanisms that madethese early studies successful. In turn, the later work has furthermore provided theframework to generalize these early ideas to a variety of other types of dynamics.Section 5 presents what we consider two main areas of future developments. InSect. 5 the paradigm of AUS is incorporated within a fully nonlinear data assimilationscheme, the particle filter ( ? ). It is shown that observing in the directions of instabil-ities is effective but also that it is not deleterious to observe the stable directions. Asopposed to the data or model sizes, our results suggest that the number of particlesrequired to achieve good results scales with the size of the unstable-neutral subspace.Section 5.2 discusses how the concept of random attractor can offer novel ways tohandle the data assimilation problem in stochastic chaotic multi-scale systems withlarge scale separation. It also treats the implications of the numerical scheme on theoutput of the data assimilation cycle and ensemble-based forecasts, as well as posessome key questions for future studies.Final conclusions and a summary are drawn in Sect. 6. The atmosphere and the ocean are fluids that are described by the set of classicalconservation laws of hydrodynamics, including the conservation of mass, momentumand energy (Vallis, 2017). For the atmosphere, these are often complemented by theconservation of moisture present in the air. These laws lead to a set of local dynamicalequations describing the motion of each parcel of fluid. Given that these equations arenonlinear with complex interactions with the boundaries, realistic solutions cannotbe obtained analytically and one must rely on numerical simulations starting fromappropriate initial conditions.Numerical simulations are based on discrete approximations in space and timeof the dynamical equations, and are often accompanied with simplifications ofthe equations in order to describe appropriately the scales of interest. One of themost famous approximations is the geostrophic approximation which assumes abalance between the horizontal velocity field and the horizontal pressure gradient.Geostrophic balance is a good approximation for both the ocean and the atmospherealbeit at different spatial scales. The geostrophic approximation is at the basis of themodels that will be used later in Sect. 3.2.1Whatever the scale at which the atmospheric or the ocean fluids are observed,they display an apparently erratic evolution. This erratic behavior is also presentin atmospheric, ocean and climate models when appropriate forcing are imposed.This feature should not be confused with randomness as most of the models used
Authors Suppressed Due to Excessive Length since the start of numerical modelling were purely deterministic. Lorenz (1963)showed in a simple low-order model that this erratic behavior is concomitant withthe property of sensitivity to initial conditions, by which whatever small an errorin the initial condition is, it will increase rapidly in time. This property impliesthat given the inevitable error in the initial conditions, any forecast will ultimatelybecome useless as the error finally reaches an amplitude of the same order as thenatural variability of the variable considered. The sensitivity to initial conditionsand the consequent erratic-like evolution are the key properties of deterministicdynamical systems displaying chaos .This behaviour has been found in many atmospheric, oceanic and climate models(see e.g.,
Vannitsem, 2017, for a review). In particular, a detailed investigation ofthe chaotic nature of the coupled ocean-atmosphere system that will be used inSect. 3.2.1 has been performed by Vannitsem et al. (2015).
The notion of sensitivity to initial conditions of dynamical solutions was alreadydiscovered and studied in a mathematical context by Poincaré (1899). In the secondhalf of the 20th century, this property was discovered in models of atmosphericand climate relevance (Thompson, 1957; Lorenz, 1963). Its important practicalimplications drove the regain of interest in developing the appropriate mathematicsin support of its description and understanding. These efforts culminated with thedevelopment of the ergodic theory of deterministic dynamical systems and chaostheory. In the following we shall briefly summarise what we consider to be the keydevelopments that led to the definition of Lyapunov exponents and vectors.The importance of these mathematical objects stands on their ability to “mea-sure” the degree of instabilities and thus to quantify the aforementioned sensitivity toinitial conditions. While more rigorous mathematical treatments can be found in ap-propriate mathematical literature (see e.g. , Pikovsky and Politi, 2016, and referencestherein), and we will invoke that rigour to a certain extent in the following sections,here we approach the discussion with a physical and intuitive angle. Further detailscan also be found in Legras and Vautard (1996); Barreira and Pesin (2002); Kuptsovand Parlitz (2012).Let us write the evolution laws of a deterministic dynamical system in the formof a set of ordinary differential equations (ODEs),d x d 𝑡 = f ( x , 𝝈 ) , (1)where x is a vector containing the entire set of relevant variables x = ( 𝑥 , ..., 𝑥 𝑛 ) and 𝝈 represents a set of parameters. The discussion that follows holds also when system(1) explicitly depends on time, thus being non-autonomous, provided it remainsergodic. hort Title 5 Since the process of measurement is always subject to finite precision, the initialstate is never known exactly. To study the evolution and the implications of such anerror, let us consider an initial state displaced slightly from x by an initial error 𝛿 x .The perturbed initial state, x + 𝛿 x , generates a new trajectory in phase space andone can define the instantaneous error vector as the vector joining the representativepoints of the reference trajectory and the perturbed one at a given time, 𝛿 x ( 𝑡 ) .Provided that this perturbation is sufficiently small and smooth, its dynamics can bedescribed by the linearized equation,d 𝛿 x d 𝑡 ≈ 𝜕 F 𝜕 x | x ( 𝑡 ) 𝛿 x , (2)with a formal solution, 𝛿 x ( 𝑡 ) ≈ M ( 𝑡, x ( 𝑡 )) 𝛿 x ( 𝑡 ) . (3)The matrix M is referred to as the “fundamental matrix” and is the resolvent of Eq. (2), i.e. M ( 𝑡, x ( 𝑡 )) = 𝑒 ∫ 𝑡 𝑡 𝜕 F 𝜕 x | x ( 𝑡 ) d 𝑡 . The fundamental matrix contains the information onthe amplification of infinitesimally small perturbations.In the context of the ergodic theory of deterministic dynamical systems, theOseledets theorem (Kuptsov and Parlitz, 2012) shows that the limit of the matrix ( M (cid:62) M ) /[ ( 𝑡 − 𝑡 ) ] , for time going to infinity, exists; let us refer to this limitingmatrix as S . The logarithm of its eigenvalues are called the Lyapunov exponents (LEs), whereas the full set of LEs is called the Lyapunov spectrum, and is usuallyrepresented in decreasing order. The eigenvectors of S , which are local properties ofthe flow (they change along the trajectory thus being time-dependent) and dependon the initial time 𝑡 , are called the forward Lyapunov vectors (FLVs) (Legras andVautard, 1996).The LEs are a powerful tool to “measure” chaos and to characterise the degreeof instability of a system. For instance, LEs are averaged (asymptotic) indicatorsof exponential growth (LE>0) or decay (LE<0) of perturbations under the tangent-linear model. A deterministic chaotic system is uniquely characterised by having atleast its leading LE larger than zero, i.e. LE >
0. The sum of the LEs is equal tothe average divergence of flow generated by Eq. (1) (see e.g.
Pikovsky and Politi(2016)). This means that in dissipative (conservative, e.g.
Hamiltonian) systemsthe sum of the LEs is negative (zero): volumes in the phase space of a dissipative(conservative) dynamics reduce (are conserved) on average with time. Furthermore,autonomous continuous-in-time systems generally possess at least one LE=0, unlessthey converge to a motion-less state. This null exponent is related to perturbationsaligned to the system’s velocity vector ( i.e. 𝛿 x = f ), so although they shall fluctuatedepending on the local flow, they will not on average decay nor growth. These lasttwo properties can also be used as a check for numerical accuracy when computingthe LEs.The multiplicative ergodic theorem (MET) (Barreira and Pesin, 2002, Theo-rem 2.1.2) guarantees that, under general hypotheses, the eigenvalues of the matrix ( M (cid:62) M ) /[ ( 𝑡 − 𝑡 ) ] obtained for 𝑡 → ∞ are equivalent to the ones of the matrix S (cid:48) = ( MM (cid:62) ) /[ ( 𝑡 − 𝑡 ) ] when 𝑡 → −∞ . The equivalence of the spectrum of S and S (cid:48) Authors Suppressed Due to Excessive Length is not generically true for the fundamental matrix of an arbitrary, linear dynamicalsystem; however, the MET guarantees that this is a fairly generic property of theresolvent of the tangent-linear model of a nonlinear dynamical system . If the flowof the time derivative f is a C diffeomorphism of a compact, smooth, Riemannianmanifold 𝑀 , then the MET assures that the LEs are defined equivalently by thelog-eigenvalues of S or S (cid:48) for any initial condition x ( 𝑡 ) and that the LEs are uniqueon a subset of 𝑀 of full measure with respect to any ergodic invariant measure 𝜇 of the flow. Note that, without the condition of ergodicity of 𝜇 , the LEs may bewell-defined point-wise, but the specific values and their multiplicity may dependon the initial condition x ( 𝑡 ) .The matrices S and S (cid:48) are symmetric. However, contrary to the eigenvalues, theeigenvectors of these two matrices are not equivalent due to the asymmetric char-acter of the fundamental matrix M in forward- and reverse-time. The eigenvectorsof the latter are called the backward Lyapunov vectors (BLVs). Theoretically, eachmatrix can be evaluated at the same place along the reference trajectory x ( 𝑡 ) andtheir orthogonal eigenvectors can be computed as L f ,𝑖𝑡 and L b ,𝑖𝑡 for S and S (cid:48) , respec-tively, where it is understood that the time-dependence on 𝑡 is with respect to thelinearization of the dynamics at x ( 𝑡 ) . There exist Oseledet subspaces 𝑊 𝑖𝑡 , 𝑊 𝑖𝑡 = L b , 𝑡 ⊕ ... ⊕ L b ,𝑖𝑡 ∩ L f ,𝑖𝑡 ⊕ ... ⊕ L f ,𝑁𝑡 , (4)with ⊕ being the direct product (Ruelle, 1979), that have the important properties ofbeing invariant under the effect of the fundamental matrix, such that M ( 𝜏, x ( 𝑡 )) 𝑊 𝑖𝑡 = 𝑊 𝑖𝜏 . (5)Due to their orthogonal nature, the FLVs and the BLVs require by definition thechoice of a norm and of an inner product. Nevertheless, the Oseledet subspacesthemselves do not have this dependence; in this way, they can be considered to embedmore invariant information about the dynamics. The decomposition of the tangent-linear space into these covariant subspaces is commonly known as the “Oseledetsplitting” or decomposition. The classical form of the MET thus guarantees that theOseledet splitting is well-defined and consistent with probability one over all initialconditions of the attractor, with respect to the invariant, ergodic measure. Othercovariant splittings of the tangent-linear model, such as by exponential dichotomy,exist under more general forms of the MET (Froyland et al., 2013).When the Lyapunov spectrum is non-degenerate, one can define a time-varyingbasis, subordinate to the Oseledet spaces, 𝑊 𝑖𝑡 = span (cid:8) L c ,𝑖𝑡 (cid:9) , such that M ( 𝜏, x ( 𝑡 )) L c ,𝑖𝑡 = 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) L c ,𝑖𝜏 , (6)where (cid:107) L c ,𝑖𝑡 (cid:107) = (cid:107) L c ,𝑖𝜏 (cid:107) , and 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) ∈ R describes an amplification factor. Thevectors L c ,𝑖𝑡 are known as the covariant Lyapunov vectors (CLVs). In the long-timelimit, the amplifications 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) can be associated to the LEs as, hort Title 7 ± 𝜆 𝑖 = lim 𝜏 →±∞ 𝜏 𝑙𝑛 | 𝛼 𝑖 ( 𝜏, x ( 𝑡 ))| = lim 𝜏 →±∞ 𝜆 𝜏𝑖 ( x ( 𝑡 )) (7)where we indicate by 𝜆 𝜏𝑖 ( x ( 𝑡 )) the average of the growth rate taken over a timewindow 𝜏 starting at 𝑡 at position x ( 𝑡 )) , and are commonly known as the localLyapunov exponents (LLEs) (Pikovsky and Politi, 2016, see chapter 5).Throughout the chapter we will denote the matrix with columns correspondingto the full tangent linear space ordered basis of BLVs/FLVs/CLVs at time 𝑡 as L e 𝑡 for e ∈ { b , f , c } respectively. A sub-slice of this matrix of Lyapunov vectorscorresponding, inclusively, to columns 𝑖 through 𝑗 will be denoted L e ,𝑖 : 𝑗𝑡 . In thefollowing, the relationships between these Lyapunov vectors and their asymptoticdynamics will be key to understanding the predictability of chaotic systems. For thispurpose, we will largely use the BLVs and the CLVs, and to a lesser extent the FLVs.Note that the basis (cid:8) L c ,𝑖𝑡 (cid:9) 𝑁 𝑥 𝑖 = is not orthogonal in general and, in fact, the anglesbetween any two Oseledec subspaces 𝑊 𝑖 and 𝑊 𝑗 are not in general bounded awayfrom zero in their limits in forward- or reverse-time. For this reason, coordinatetransformations to covariant Oseledet bases are not generally numerically well-conditioned asymptotically, and therefore hard to compute. The property of integralseparation (Dieci and Van Vleck, 2002), describing the stability of the Lyapunovspectrum under bounded perturbations of the tangent-linear equations, ensures thatthe angles between the covariant subspaces will remain bounded away from zero,but this is a strong condition and it is not as generic of a property as the existence ofthe Oseledet decomposition under the MET. However, if a dynamical system is inte-grally separated as above, there exists a well-defined, numerically well-conditionedtransformation of coordinates of the tangent-linear model for which the action of theresolvent M can be expressed as a block-diagonal matrix with each block describingthe invariant dynamics of a single Oseledet space. For degenerate spectrum, theseblocks may be upper-triangular, but for non-degenerate spectrum this representationof the resovlent becomes a strictly diagonal matrix; see Theorem 5.4.9 of Adrianova(1995) for the classical result, or Theorem 5.1 of Dieci and Van Vleck (2007) andFroyland et al. (2013) for more recent extensions. The high sensitivity of a chaotic dynamical system to the initial condition makes ithard to forecast it, even when their evolution equations are perfectly known. Indeedthe typical error grows exponentially over time with a rate given by the largestpositive LE. With a view to forecasting, one has no choice but to regularly correctits trajectory using information on the system state obtained through observations.This is the primary goal of data assimilation (DA) and has been key to the successof numerical weather forecasting. We refer to Kalnay (2003); Asch et al. (2016) andreferences therein for textbooks and reviews on geophysical DA.
Authors Suppressed Due to Excessive Length
When framed in a Bayesian formalism, the goal of DA is to estimate the condi-tional probability density function (pdf) of the system state knowing observationsof that system. For high-dimensional systems, only approximations of this pdf canbe obtained, among which the Gaussian approximation is the most practical andcommon (Carrassi et al., 2018). In the case of Gaussian statistics of the error andlinear dynamics, the conditional pdf can be obtained analytically: it is Gaussian, andcan be sequentially computed using the Kalman filter (KF) (Kalman, 1960). Directlyinspired from the Kalman filter, the extended Kalman filter (EKF) offers an approx-imate DA scheme when the operators are nonlinear (Ghil and Malanotte-Rizzoli,1991). However, it can hardly be used in the context of high-dimensional systems,where it has to be replaced with the ensemble Kalman filter (EnKF) (Evensen,2009b). In the EnKF, the covariance matrices are represented by state perturbationswhich are representative of the errors and, together with the state mean, form alimited-size ensemble of state vectors.Understanding how the ensemble of the EnKF evolves under the action of theDA scheme and the forecast model dynamics is important. Several numerical resultssuggest that the skills of ensemble-based DA methods in chaotic systems are related tothe instabilities of the underlying dynamics (Ng et al., 2011). Numerical evidencesexist that some asymptotic properties of the ensemble-based covariance matrices(rank, span, range) relate to the unstable modes of the dynamics (Sakov and Oke,2008a; Carrassi et al., 2009). Nevertheless, a better, more profound, understandingof these results was needed to aim at designing reduced rank, computationally cheap,formulations of the filters.Analytic results that have shed lights on the behaviour of filters and smootherson chaotic dynamics, and explained the numerically observed properties, have beenobtained for linear dynamics. Section 3.1.1 reviews those findings in the case ofdeterministic systems without model error, based on the work by Gurumoorthy et al.(2017) and Bocquet et al. (2017). Section 3.1.2 treats their extension to the case ofstochastic systems, following the work by Grudzien et al. (2018a,b). In the case ofnonlinear dynamics, robust numerical evidences will be described by the originalexperiments in Sect. 3.2, that extent the previous findings by Bocquet and Carrassi(2017).
At time 𝑡 𝑘 , let x 𝑘 ∈ R 𝑁 x and y 𝑘 ∈ R 𝑁 y be the state and observation vector, respec-tively. Let us assume linear evolution model dynamics M 𝑘 and observation model H 𝑘 such that hort Title 9 x 𝑘 = M 𝑘 x 𝑘 − + w 𝑘 , (8a) y 𝑘 = H 𝑘 x 𝑘 + v 𝑘 . (8b)The model and observation noises, w 𝑘 and v 𝑘 , are assumed mutually independent,zero-mean Gaussian white sequences with statisticsE [ v 𝑘 v (cid:62) 𝑙 ] = 𝛿 𝑘,𝑙 R 𝑘 , E [ w 𝑘 w (cid:62) 𝑙 ] = 𝛿 𝑘,𝑙 Q 𝑘 , E [ v 𝑘 w (cid:62) 𝑙 ] = . (9)In the Kalman filter, which yields the optimal DA solution with such assumptions,the forecast error covariance matrix P 𝑘 satisfies the recurrence P 𝑘 + = M 𝑘 + ( I + P 𝑘 𝛀 𝑘 ) − P 𝑘 M (cid:62) 𝑘 + + Q 𝑘 + , (10)where 𝛀 𝑘 ≡ H (cid:62) 𝑘 R − 𝑘 H 𝑘 , (11)is the precision matrix of the observations mapped into state space. In the absenceof model error, i.e. Q 𝑘 ≡ , Gurumoorthy et al. (2017) proved rigorously that in thefull-rank KF (in particular P is full rank), P 𝑘 collapses onto the unstable-neutralsubspace.We shall summarise in the following some key results that apply to the full-rankbut also to the degenerate case, i.e. even if P is of arbitrary rank and the initial errorsof the DA scheme only lie in a subspace of R 𝑁 x . In particular they apply to the EnKFif the dynamical and observations operators are both linear. We recall them here, butreaders can find full details in Bocquet et al. (2017); Bocquet and Carrassi (2017). Result 1 : Bound on the free forecast error covariance matrixLet us define the resolvent of the dynamics from 𝑡 𝑙 to 𝑡 𝑘 as M 𝑘 : 𝑙 = M 𝑘 M 𝑘 − · · · M 𝑙 + ,with the convention that M 𝑘,𝑘 = I . The first key result is the following inequality inthe set of the semi-definite symmetric matrices: P 𝑘 ≤ M 𝑘 :0 P M (cid:62) 𝑘 :0 + 𝚵 𝑘 (cid:67) P free 𝑘 , (12)where 𝚵 ≡ and for 𝑘 ≥ 𝚵 𝑘 ≡ 𝑘 ∑︁ 𝑙 = M 𝑘 : 𝑙 Q 𝑙 M (cid:62) 𝑘 : 𝑙 , (13)is known as the controllability matrix (Jazwinski, 1970), and the suffix “free” is usedto refer to the forecast error of the system unconstrained by data. In the absence ofmodel noise ( Q 𝑘 ≡ for the rest of this section), it reads P 𝑘 ≤ M 𝑘 :0 P M (cid:62) 𝑘 :0 . (14)Assuming the dynamics to be non-singular, the column subspace of the forecast errorcovariance matrix satisfies Im ( P 𝑘 ) = M 𝑘 :0 ( Im ( P )) . (15)If 𝑛 is the dimension of the unstable-neutral subspace of the dynamics, it can furtherbe shown that lim 𝑘 →∞ rank ( P 𝑘 ) ≤ min { rank ( P ) , 𝑛 } , (16)which is a first proof of the collapse of the error covariance matrix (actually itscolumn space) onto the unstable and neutral subspace of the dynamics. Result 2 : Collapse onto the unstable subspaceLet 𝜎 𝑘𝑖 , for 𝑖 = , . . . , 𝑁 x denote the eigenvalues of P 𝑘 ordered as 𝜎 𝑘 ≥ 𝜎 𝑘 · · · ≥ 𝜎 𝑘𝑁 x . It was shown that 𝜎 𝑘𝑖 ≤ 𝛽 𝑖 exp (cid:16) 𝜆 𝑘𝑖 𝑘 (cid:17) , (17)for some 𝛽 𝑖 >
0, where 𝜆 𝑘𝑖 is a log-singular value of M 𝑘 :0 that converges to the LE 𝜆 𝑖 . This gives an upper bound for all eigenvalues of P 𝑘 and a rate of convergence forthe 𝑁 x − 𝑛 smallest ones. Moreover, if P 𝑘 is uniformly bounded, it can further beshown that the stable subspace of the dynamics is asymptotically in the null spaceof P 𝑘 , i.e. , lim 𝑘 →∞ (cid:13)(cid:13)(cid:13) P 𝑘 L b ,𝑖𝑘 (cid:13)(cid:13)(cid:13) = 𝑖 > 𝑛 ; this extends to any norm and linear combination of these vectors. Result 3 : Explicit dependence of P 𝑘 on P To study the dependence of P 𝑘 on P , it has been shown that the forecast errorcovariance matrix can be written as P 𝑘 = M 𝑘 :0 P M (cid:62) 𝑘 :0 (cid:0) I + 𝚪 𝑘 M 𝑘 :0 P M (cid:62) 𝑘 :0 (cid:1) − , (19)where the matrix 𝚪 𝑘 ≡ 𝑘 − ∑︁ 𝑙 = M −(cid:62) 𝑘 : 𝑙 𝛀 𝑙 M − 𝑘 : 𝑙 , (20)is known as the information matrix and it measures the observability of the systemby propagating the precision matrices 𝛀 𝑙 up to 𝑡 𝑘 .An alternative formulation of Eq. (19) is P 𝑘 = M 𝑘 :0 P [ I + 𝚯 𝑘 P ] − M (cid:62) 𝑘 :0 , (21)where hort Title 11 𝚯 𝑘 ≡ M (cid:62) 𝑘 :0 𝚪 𝑘 M 𝑘 :0 = 𝑘 − ∑︁ 𝑙 = M (cid:62) 𝑙 :0 𝛀 𝑙 M 𝑙 :0 , (22)is also an information matrix, directly related to the observability of the DA system,but pulled back at the initial time 𝑡 . Equation (21) is of key importance because itallows to study the asymptotic behaviour of P 𝑘 , i.e. the filter “believed” error, usingthe asymptotic properties of the dynamics. Result 4 : Asymptotics of P 𝑘 For P 𝑘 to forget about P as 𝑘 tends to infinity, it was shown that one can impose thefollowing sufficient conditions: • Condition 1:
Recall that the FLVs at 𝑡 associated to the unstable and neutralexponents are the columns of L f , 𝑛 ∈ R 𝑁 x × 𝑛 . Moreover, let define the anomalymatrix X ∈ R 𝑁 x × 𝑛 such that P = XX T . The condition readsrank (cid:16) X (cid:62) L f , 𝑛 (cid:17) = 𝑛 . (23)In practice the initial ensemble anomalies X projects onto the first 𝑛 FLVs at 𝑡 . • Condition 2:
The model is sufficiently observed so that the unstable and neutraldirections remain under control, i.e. , there exits 𝜀 > (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 > 𝜀 I . (24) • Condition 3:
Furthermore, for any neutral BLV we havelim 𝑘 →∞ (cid:16) L b ,𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b ,𝑛 𝑘 = ∞ , (25)implying that the neutral direction is well observed and controlled.Under these three conditions, we obtainlim 𝑘 →∞ (cid:40) P 𝑘 − L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) (cid:41) = . (26)Hence, the asymptotic sequence does not depend on P , but only 𝚪 𝑘 , i.e. the dynam-ics. It can also be shown that the neutral modes have a peculiar role and a long lastinginfluence: their influence on the current estimate decreases sub-exponentially. Result 5 : From the degenerate KF to the square-root EnKF and EnKSThe recurrence Eq. (21) can be reformulated in a factorised form which is suited tothe square-root EnKF. The standard perturbation decomposition of the forecast error covariance is P 𝑘 = X 𝑘 X (cid:62) 𝑘 , (27)where X 𝑘 is the matrix of centred perturbations (the anomaly matrix aforemen-tioned). A right-transform update formula can then be obtained from (21): X 𝑘 = M 𝑘 :0 X (cid:2) I + X (cid:62) 𝚯 𝑘 X (cid:3) − / 𝚼 𝑘 , (28)where 𝚼 𝑘 is an arbitrary orthogonal matrix such that 𝚼 𝑘 = , where is the vector [ . . . ] (cid:62) defined in the ensemble subspace. It is equivalent to the left-transformupdate formula X 𝑘 = (cid:2) I + M 𝑘 :0 P M (cid:62) 𝑘 :0 𝚪 𝑘 (cid:3) − / M 𝑘 :0 X 𝚼 𝑘 . (29)The importance of Eqs. (28) and (29) stands on the fact that with linear models,Gaussian observation and initial errors, the (square-root) degenerate KF (with X ∈ R 𝑁 𝑥 × 𝑛 and 𝑛 < 𝑁 𝑥 ) is equivalent to the square-root EnKF and can serve as a proxyto the EnKF applied to nonlinear models.All of these results can be generalised to linear smoothers. In particular, thesmoother forecast error covariance matrix is similar to that of the filter, i.e. given by(19) (or (21)) but with the following modified information matrix: (cid:98) 𝚪 𝑘 = 𝚪 𝑘 + 𝑘 + 𝐿 − 𝑆 ∑︁ 𝑙 = 𝑘 M − T 𝑘 : 𝑙 𝛀 𝑘 M − 𝑘 : 𝑙 , (30)where 𝐿 is the lag of the smoother (how far in the past observations are accountedfor) and 𝑆 tells by how many time steps the smoother’s window is shifted betweentwo consecutive updates. Note that (cid:98) 𝚪 𝑘 ≥ 𝚪 𝑘 (using the Loewner order on the set ofsemi-definite positive matrices) reflecting the general higher amount of informationincorporated within a smoother update relative to a filter. Therefore the asymptoticsequences for the filter (right-hand side) and smoother (left-hand-side) follow theinequality: L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) (cid:98) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) ≤ L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) . (31)The linear smoothers can serve as a proxy to the ensemble Kalman smoother(Evensen, 2009b) or the iterative ensemble Kalman smoother (IEnKS) (Bocquet andSakov, 2014) applied to nonlinear models (Bocquet and Carrassi, 2017). With theIEnKS, where the true state trajectory is even better estimated and the errors arereduced, the collapse of the perturbations onto the unstable and neutral subspace isexpected to be even faster. hort Title 13 The above perfect-deterministic model configuration demonstrates how chaos shapesthe inferences of the posterior. Asymptotic statistics are determined by the ability ofthe filter to control the growth of initial error in the unstable-neutral subspace withrespect to its sensitivity to observations therein. However, in realistic DA, additionalforecast errors are introduced throughout the forecast cycle due to the inadequacyof numerical models in representing reality. One classical approach to treat thesemodel errors is to represent them as additive or multiplicative noise (Jazwinski,1970). Oseledet’s theorem and the Lyapunov spectrum are also formulated for suchsystems of stochastic differential equations (SDEs) and discrete maps.Suppose { v 𝑖 } 𝑑𝑖 = is a collection of C vector fields on 𝑀 ⊂ R 𝑁 𝑥 , a smooth manifoldwithout boundary. Define { 𝑊 𝑖𝑡 } 𝑑𝑖 = where each 𝑊 𝑖𝑡 is an independent Wiener processdefined on the probability space ( Ω , F , 𝑃 ) . This describes a generic StratonovichSDE, d x d 𝑡 = f ( x , 𝝈 , 𝜔 ) (cid:44) v ( x , 𝝈 ) + 𝑑 ∑︁ 𝑖 = v 𝑖 ( x , 𝝈 ) ◦ 𝑊 𝑖𝑡 ( 𝜔 ) . (32)For fixed 𝝈 , if { v 𝑖 } 𝑑𝑖 = span the tangent space 𝑇 x 𝑀 for each x ∈ 𝑀 , then the systemof SDEs gives rise to a unique probability measure 𝜇 on 𝑀 that is invariant withrespect to the random flow induced by the system of SDEs. For 𝜇 × 𝑃 almost every ( x , 𝜔 ) ∈ 𝑀 × Ω , the Lyapunov exponents and their multiplicity are defined andonly depend on x (Liu and Qian, 2006)[theorem 2.1]. For the SDE, f ( x , 𝝈 , 𝜔 ) , thetangent linear model is once again defined as in Eq. (2) from which the exponentsand vectors can be computed as usual (Pikovsky and Politi, 2016)[chapters 2 and 8].The analysis from perfect-deterministic models thus extends to stochasticallyforced models as in Eq. 8a, but with key differences. Firstly, the forecast error co-variance can generally be considered to be of full rank due to the injection of thestochastic forcing w 𝑘 into arbitrary subspaces; the standard controllability assump-tion actually guarantees that it is of full rank after a sufficient lead time (Jazwinski,1970)[lemma 7.3]. Stochastic perturbations w 𝑘 are, moreover, subject to growth anddecay rates of the LLEs, 𝜆 𝜏𝑖 of Eq. (7). These LLEs are distributed about the LE suchthat even asymptotically stable modes may have transient periods of rapid growth.To make the analysis tractable, assume that the dynamics are stationary in thesense that the recursive QR algorithm (Shimada and Nagashima, 1979; Benettin et al.,1980) converges uniformly to the theoretical LEs uniformly in number of iterationsfrom any initial time. While the forecast error covariance is not generally reduced-rank, the stable dynamics is actually sufficient to uniformly bound the forecast errorvariances in the stable subspace without any assimilation . Result 6 : Uniform bounds on the variance of errors in the stable subspaceRecall that L b ,𝑖𝑘 is the 𝑖 -th BLV and that P free 𝑘 is the free forecast error covariance( e.g. without DA). For 𝜆 𝑖 <
0, define 𝜖 > { 𝜆 𝑖 + 𝜖 } <
1. By thestationarity assumed above, there exists 𝑁 𝜖 such that if 𝑘 − 𝑙 > 𝑁 𝜖 , then − 𝜖 < 𝑘 − 𝑙 log (cid:16) (cid:107) M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 (cid:107) (cid:17) − 𝜆 𝑖 < 𝜖 . (33)Assuming that the system is uniformly completely controllable and Q 𝑘 ≤ 𝑞 sup I 𝑛 < ∞ for all 𝑘 , the controllability matrix, Eq. (13), is uniformly bounded above by 𝐶 𝑁 𝜖 I 𝑛 .A variation on Eq. (12) can be used to obtain a uniform bound on the 𝑖 -th varianceof P free 𝑘 in the basis of the BLVs aslim sup 𝑘 →∞ (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) P free 𝑘 L b ,𝑖𝑘 ≤ 𝐶 𝑁 𝜖 + 𝑞 sup exp { ( 𝜆 𝑖 + 𝜖 ) 𝑁 𝜖 + } − exp { ( 𝜆 𝑖 + 𝜖 )} . (34)This bound represents the competing forces of the transient growth rates of re-cently introduced perturbations in the controllability matrix, and perturbations thatadhere to their aysmptotic log-average rate of decay in the stable subspace withina margin of 𝜖 (Grudzien et al., 2018a)[proposition 2 and corollary 3]. This demon-strates that, if assimilation prevents error growth in the span of the unstable-neutralBLVs, errors in the span of the stable BLVs can be neglected without relinquishingfilter boundedness. However, this does not state whether the error in the span of thestable subspace will remain within tolerable bounds.Redefine 𝑞 sup such that P free0 , Q 𝑘 ≤ 𝑞 sup I 𝑛 for all 𝑘 . The variance of the freeforecast in the 𝑖 -th BLV is bounded directly as (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) P free 𝑘 L b ,𝑖𝑘 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 :0 P free0 M (cid:62) 𝑘 :0 L b ,𝑖𝑘 + 𝑘 ∑︁ 𝑙 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 : 𝑙 Q 𝑙 M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 ≤ 𝑞 sup 𝑘 ∑︁ 𝑙 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 : 𝑙 M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 = 𝑞 sup 𝑘 ∑︁ 𝑙 = (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) , (35)where (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) is the norm of the 𝑖 -th row of T 𝑘 : 𝑙 in the recursive QR decompo-sition of the propagator, M 𝑘 : 𝑙 = L b 𝑘 T 𝑘 : 𝑙 (cid:16) L b 𝑙 (cid:17) (cid:62) . The sum Ψ 𝑖𝑘 (cid:44) 𝑘 ∑︁ 𝑙 = (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) (36)describes the invariant evolution for the 𝑖 -th variance of the free forecast errorcovariance matrix in the basis of BLVs. In the case that Q 𝑘 = 𝑞 I 𝑛 for all 𝑘 , Eq. (35)becomes an equality and Ψ 𝑖𝑘 can be interpreted as the evolution of the 𝑖 -th variance hort Title 15 when 𝑞 =
1. As 𝑘 − 𝑙 grows (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) converges exponentially to zero while for 𝑘 − 𝑙 close to one this describes transient dynamics in the basis of the BLVs. Although Ψ 𝑖𝑘 is guaranteed to be uniformly bounded in 𝑘 (Grudzien et al., 2018a)[corollary 3],numerical simulations demonstrate how this uniform bound can be extremely large.Figure 1, presents an example from Grudzien et al. (2018a) of the free forecasterror variance, Ψ 𝑖𝑘 , over 10 forecast cycles where the model propagator M 𝑘 is definedby the evolution of the tangent linear model of the Lorenz-96 system (Lorenz, 1996)in 𝑁 𝑥 =
10 dimensions, with an interval between observations of 0 .
1, and Q 𝑘 (cid:44) I 𝑁 𝑥 .Although this model is generated from the underlying nonlinear Lorenz system, thestate model for the experiment is treated as a discrete linear model using only thetangen-linear resolvent as above. This model has three unstable, one neutral andsix stable LEs. While 𝜆 and 𝜆 are negative, L b , 𝑘 and L b , 𝑘 experience frequenttransient instabilities in the timeseries of their LLEs (see bottom row in Fig. 1). TheLLEs of L b , 𝑘 have more intense growth, reflected in the differences between Ψ 𝑘 and Ψ 𝑘 (top row): the maximum of Ψ 𝑘 is on the order of O ( ) and the mean isapproximately 28; for Ψ 𝑘 the max is of O (cid:0) (cid:1) and the mean is approximately 808.This suggests that, as opposed to the case of deterministic dynamics, for successfulDA in stochastic systems with additive model error, it is necessary to control thegrowth of forecast errors also in the span of weakly stable BLVs. While errors inthis span will not grow indefinitely and remain bounded, if those directions are leftuncontrolled their error bounds can be practically too large for any meaningful stateestimation purposes. Fig. 1 Upper: time series of Ψ 𝑘 and Ψ 𝑘 as defined in Eq. (36). Lower:
LLEs of L b , and L b , .Adapted from Grudzien et al. (2018a)6 Authors Suppressed Due to Excessive Length Result 7 : The unfiltered-to-filtered error upwell and the need for inflationMotivated by the results above, suppose that an approximate, reduced-rank Kalmanestimator is defined such that the resulting forecast error covariance and the Kalmangain have image (column) spaces constrained to the span of the leading 𝑟 ≥ 𝑛 BLVs, L b , 𝑟𝑘 . For perfect and deterministic dynamics, this estimator is asymptoticallyequivalent to the optimal Kalman filter by the results of Sect. 3.1.1. On the otherhand, Result 6 establishes that, in the presence of additive model errors, the varianceof the unfiltered error in the span of the trailing 𝑁 𝑥 − 𝑛 BLVs will remain finite, albeitbounded as per Eq. (35). When 𝑟 > 𝑛 this reduced-rank Kalman filter will correctthe 𝑟 − 𝑛 stable modes in addition to unstable-neutral modes, thereby reducing thevariance of the free forecast errors below the bounds pictured in Fig. (1).Grudzien et al. (2018b) derive the full forecast error covariance dynamics for thereduced-rank Kalman estimator described above. Note that, as opposed to the freeforecast error covariance described in relation to Result 6, the discussion pertains nowto the forecast error covariance cycled within the KF. While the standard reduced-rank KF formalism would write the recursion for the forecast error covarianceentirely within the span of L b , 𝑟 , it is proven in Grudzien et al. (2018b) that forecasterrors in the column span of trailing L b ,𝑟 + 𝑁 𝑥 𝑘 (those left “uncorrected” by DA) aretransmitted into the column span of L b , 𝑟𝑘 . This “error upwell” is a consequence ofthe KF rank-reduction within the first 𝑟 BLVs and is driven by the upper triangulardynamics of the BLVs in the recursive QR algorithm. Therefore, neglecting thecontribution of the “upwelling” of error from the trailing to the leading BLVs, asin the standard recursion, leads to a systematic underestimation of the true forecasterror in the presence of additive noise. Furthermore, because the leading 𝑟 BLVsshare the same span as the leading 𝑟 Oseledet spaces, Eq. (4), the upwelling of errorsfrom the span of the trailing BLVs to the leading BLVs holds for any estimator thatis restricted to the span of the leading 𝑟 covariant subspaces.Figure 2 presents an example from Grudzien et al. (2018b), using the sametangent linear model from the 10-dimensional Lorenz-96 system as in Fig. 1, andfixing each of H 𝑘 = R 𝑘 = Q 𝑘 = I . In each window, the eigenvalues of the forecasterror covariance matrix of the optimal full-rank KF (yellow) and of the “exact”,reduced-rank estimator (red) are averaged over 10 analysis cycles and plotted withtriangles. By exact it is meant here that the full covariance equation is evolvedanalytically, including the covariance within the unfiltered trailing BLVs and theircross covariances with the leading filtered modes. The rank, 𝑟 , of the reduced-rankestimator is varied to examine the differences between the forecast error covariancesarising from the optimal KF and the reduced-rank one when only the unstable-neutralsubspace is corrected by the gain ( 𝑟 = 𝑛 = 𝑟 =
5) and so on. As suggested by Fig. 1, correcting the first stable modereduces the leading eigenvalue of reduced-rank estimator’s forecast error covarianceby an order of magnitude versus the case when only the unstable-neutral subspacehas been corrected ( cf the red curves between the two top windows).In each window the projection coefficients of the reduced-rank estimator’s forecasterror covariance into the basis of BLVs are also plotted (green line). For the full- hort Title 17 Fig. 2
Eigenvalues of the KF and the reduced-rank estimator covariances plotted with triangles.Projection coefficients of the reduced-rank estimator covariance plotted with X’s. rank optimal KF, the projection coefficients closely follow the eigenvalues and thisis not shown due to redundancy. By contrast, it is clear that for the reduced-rankestimator the leading eigenvector is typically close to the first BLV that is containedin the null space of the reduced-rank gain, i.e. the first among the unconstraineddirections. Figure 2 demonstrates thus a fundamental difference between the perfectmodel and stochastically forced models configurations: in stochastic systems, usinga reduced-rank 𝑛 ≤ 𝑟 < 𝑁 𝑥 Kalman estimator, the leading order forecast errorscan actually lie in the stable, unfiltered modes. Those modes must be taken thus intoaccount in the DA procedure by, for instance, appropriately enlarging the ensemblesize or otherwise augmenting the span of the ensemble-based gain.The exact recursion for the reduced-rank gain (red lines in Fig. 2) requires resolv-ing the full error covariance dynamics and therefore cannot be used practically forDA. Nevertheless, it was used here to render apparent, the upwelling phenomena: afundamental source of uncertainty that is not captured by the standard reduced-rankKF (and thus almost all EnKF) recursion. This mechanism is ubiquitous wheneverone solves for a reduced rank estimator and is present whenever the forecast errorevolution can be well approximated by the tangent-linear dynamics (see Sect. 3.2.2).Notably, it also provides one basic, mathematically grounded, justification for us-ing covariance inflation (Grudzien et al., 2018b), a powerful common fix used inensemble-based DA (that are commonly rank-deficient by construction) to mitigatefor sampling, and sometimes model, error (see e.g.
Carrassi et al., 2018, their Sec-tion 4.4.2 and references therein). Finally, the exact recursion also demonstrates theasymptotic characteristics of the forecast error covariance when using a reduced-rankgain, in the absence of sampling error. It is extremely important to note that in theexact error dynamics for the reduced rank estimator, the leading order forecast errorslie in the directions that are asymptotically stable but unfiltered. This also highlights the importance of localization (Sakov and Bertino, 2011) and gain-hybridization(Penny, 2017) as effective means for preventing the growth of forecast errors that lieoutside of the ensemble span in the EnKF.
The performance and the functioning mechanisms of the EnKF in nonlinear systemsare studied with the aid of numerical simulations performed using the qgs codeplatform (Demaeyer et al., Submitted; Demaeyer and De Cruz, 2020).We consider first a spectral 2-layer channel quasi-geostrophic atmospheric model.The Fourier modes decomposition is truncated at wavenumber 2 in both meridionaland zonal directions on a beta-plane, leading to a set of 20 ODEs for the time evolutionof the first 10 components of the atmospheric streamfunction 𝝍 , and temperature 𝜽 (Reinhold and Pierrehumbert, 1982). The model dimension is 𝑁 𝑥 =
20 and itsspectrum of LEs includes 3 positive and one neutral exponents, so that 𝑛 = , 𝑡 𝑘 , each separated by a time interval Δ 𝑡 , 𝑡 𝑘 + = 𝑡 𝑘 + Δ 𝑡 . The observations are then obtained by adding a zero mean Gaussian randomerror sampled from N ( , R ) , with R being the (assumed to be known) observationerror covariance matrix. It is assumed that we observe the spectral componentsdirectly and that the full system is observed, implying that the observation operatoris the identity matrix, H = I 𝑁 𝑥 ∈ R 𝑁 𝑥 × 𝑁 𝑥 . Although the former hypothesis cannotbe met in practice (instrumental devices do not observe the spectral modes) and thelatter rarely holds in high-dimensional applications, they are done here for the sake ofclarity and will facilitate the study of the dynamical behaviour we intend to discuss.Furthermore, observational error is supposed to be spatially (in spectral space)uncorrelated with an amplitude proportional to the corresponding model variable’sstandard deviation, 𝜎 𝑖 md , 𝑖 = , . . . , 𝑁 𝑥 . These imply R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) Data assimilation is performed using the EnKF-N, hereafter simply referred toas EnKF (Bocquet, 2011). This EnKF belongs to the family of deterministic filtersbut it furthermore possesses the appealing property that the ensemble covariancemultiplicative inflation is computed automatically as part of the DA process. Infla-tion is one of the unavoidable feature making the EnKF methods suitable for highdimensional problems (Carrassi et al., 2018). It comes under two ways known asmultiplicative and additive inflation, that are often used together. We shall use mul-tiplicative inflation alone because, as opposed to the additive version, it does notchange the rank and span of the ensemble error covariance but it only inflates the hort Title 19 matrix entries amplitude. This allows us to study the effect (if any) on the ensemblesubspace (reflected into the ensemble covariance rank and span) that comes fromthe dynamics, without artifacts from the DA procedure. At each analysis step, theforecast anomaly matrix is inflated as 𝛼 X f ← X f , 𝛼 ≥ 𝑡 𝑘 , the alignment between the ensemble and the unstable-neutralsubspaces, U 𝑘 , is computed ascos ( 𝜃 𝑖𝑘 ) = 𝑛 ∑︁ 𝑝 = cos ( 𝜃 𝑖, 𝑝𝑘 ) = 𝑛 ∑︁ 𝑝 = (cid:8) ( u 𝑝𝑘 ) T v 𝑖𝑘 (cid:9) (cid:13)(cid:13) v 𝑖𝑘 (cid:13)(cid:13) ≤ 𝑖 ≤ 𝑁, (37)where 𝜃 𝑖𝑘 ∈ [ , 𝜋 ] is the angle between the anomaly v 𝑖𝑘 and U 𝑘 , u 𝑝𝑘 is the 𝑝 -th BLV,and 𝑁 is the number of ensemble members.The RMSE of the EnKF analyses and the angle 𝜃 𝑖𝑘 are shown on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) in Fig. 3. Values are averaged over 4 years of simulated time afterdiscarding the first 200 analyses. The RMSE of the analyses is also averaged over allmodel variables, once the error on each variable is normalised with respect to thecorresponding standard deviation 𝜎 md . The number of ensemble members is set to 𝑁 = t (hours) % Angle t (hours) % RMSE Fig. 3
Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle, from Eq. (37)(in degree), between the anomalies of the EnKF and the unstable–neutral subspace (left panel,shadow colours, in degree), and time averaged normalised RMSE of the EnKF analysis (rightpanel, shadow colours), both on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) . The set-up is H = I 𝑑 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) and 𝑁 =
10. Note that the logarithmic scale is used on the right panel.
Figure 3 immediately shows the strong resemblance of patterns between the angleand the RMSE. In practice, whenever the observation time interval Δ 𝑡 and error 𝜎 % are small enough then the angle between the two subspaces get smaller and the RMSE decreases. Similarly to what is discussed in Bocquet and Carrassi (2017),we also see here how increasing observation frequency is more effective in reducingthe RMSE than reducing the observation error (see discussion in the Appendix ofBocquet and Carrassi (2017)). This result also suggests that the use of a large numberand frequent observations, like the ones produced by non-conventional systems ( e.g. crowdsourcing) has the potential for improving analyses, despite that observationalerrors may be larger than in conventional measurement systems (Nipen et al., 2019).A complementary picture of the relation between the filter skill and the subspacesalignment is provided in Fig. 4, that displays the angle (left y-axis) and RMSE ofthe EnKF analysis (right y-axis) both against the ensemble size 𝑁 . The remainingexperimental set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .
08 and Δ 𝑡 = .
11 hours.Recalling that the unstable-neutral subspace has dimension 𝑛 =
4, Fig. 4 shows thatas soon as 𝑁 ≥ 𝑛 + i.e. the ensemble fully spans the unstable-neutral subspace,the RMSE suddenly reduces to very small values and it does not further decreasewhen 𝑁 is increased beyond 𝑛 +
1. The fact that the convergence occurs for 𝑛 + 𝑛 is due to the ensemble anomalies subspace to be at most of rank 𝑁 − 𝑁 = 𝑛 + Ensemble size N M e a n a n g l e w i t h t h e un s t a b l e - n e u t r a l s u b s p a c e Angle 0.00.20.40.60.81.0 R M S E RMSE
Fig. 4
Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle, Eq. (37) (indegree), between the ensemble anomalies of the EnKF and the unstable-neutral subspace as afunction of the ensemble size 𝑁 (left 𝑦 axis), and the corresponding time-averaged RMSE of theEnKF (right 𝑦 axis). The set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .
08 and Δ 𝑡 = .
11 hours.
At the convergence, the angle between the ensemble and unstable-neutral subspace(of dimension 𝑛 =
4) is about 10 degrees (see Fig. 4): a remaining small portion ofthe ensemble subspace is projecting outside the unstable-neutral space. To investigate hort Title 21 how large is such a portion, we compute the angle between an ensemble subspace withfixed 𝑁 =
10 and a Lyapunov/Oseledets subspace of increasing dimension beyond 𝑛 +
1; results are shown in Fig. 5. The angle between the two subspaces decreasesmonotonically with the size of the subspace, eventually reaching zero once the fullphase space ( 𝑛 =
20) is considered. Interestingly, the rate of convergence is initiallyfaster until approximately dimension 10, and slower afterwards. This indicates thatthe additional projection beyond the asymptotic unstable-neutral subspace is largelyconfined to the less stable ( i.e. close to neutral) directions, those that can often belocally unstable. This mechanism is a reminiscent of what was shown by (Grudzienet al., 2018b) for stochastic systems and reviewed in Section 3.1.2. It suggests that theaddition of few ensemble members beyond 𝑛 + Dimension of subspace M e a n a n g l e w i t h s u b s p a c e Fig. 5
Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle (in degree)between the ensemble anomalies of the EnKF with 𝑁 =
10 members, and subspaces spanned bythe BLVs of increasing dimensions. These subspaces are constructed by starting from the unstable-neutral subspace ( 𝑛 =
4) and adding one by one stable directions ordered decreasingly by theirLyapunov exponent. The set-up is H = I 𝑑 , R = 𝝈𝝈 T with 𝜎 % = .
08 and Δ 𝑡 = .
11 hours.
To discuss the impact of multiple timescales on the ensemble Kalman filtering, weconsider now the addition of a shallow-water ocean component to the previous atmo-spheric model. The atmospheric and oceanic models are coupled together throughwind and radiative forcing as well as through heat exchanges, yielding the Modu-lar Arbitrary-Order Ocean-Atmosphere Model MAOOAM (De Cruz et al., 2016).MAOOAM has the same 20 ODEs of the atmospheric quasi-geostrophic model, andadditional 16 ODEs for the ocean. Amongst these 16 equations, the first 8 governthe time evolution of the first eights components of the oceanic streamfunction 𝝍 o ,while last 8 are relative to the first eights components of the oceanic temperatureanomaly 𝛿 𝑻 o . The model dimension is thus 𝑁 𝑥 =
36 and its parameters are cho- sen such that a decadal low-frequency variability appears as a consequence of thecoupling (Vannitsem et al., 2015; De Cruz et al., 2016).The model is integrated with a time step of approximately 15 minutes and isspun-up for as many as 18 ,
500 years to ensure the solution has reached the modelattractor, given the low-frequency ( i.e. slow time-scale) of the model. From this spun-up trajectory we start DA experiments using the EnKF-N with the same protocolused for the atmospheric model detailed above. In this case the true trajectorylasts 185 years, and it is again assumed that we observe the spectral componentsdirectly and completely. The EnKF is used in a strongly-coupled DA mode (see e.g.
Penny and Hamill (2017)) so that, at analysis steps, atmospheric data can impact theocean and vice-versa. MAOOAM has been already used as a prototypical coupledmodel to study coupled DA by Penny et al. (2019); Tondeur et al. (2020). Again,observational error is assumed to be spatially (in spectral space) uncorrelated withan amplitude proportional to the corresponding model variable’s standard deviation, 𝜎 𝑖 md , 𝑖 = , . . . , 𝑁 𝑥 . These imply R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ).The spectrum of LEs of MAOOAM is displayed in Fig. 6 and presents some keyfeatures. L y a p un o v e x p o n e n t Fig. 6
MAOOAM coupled Ocean-Atmosphere model. Time-averaged Lyapunov exponents indays − computed along the true trajectory of the DA experiments. With some arbitrariness, it can be decomposed into three subsets. A first subsetof LEs corresponds to the unstable ( 𝜆 𝑖 >
0) and neutral ( 𝜆 𝑛 =
0) directions.The dimension of the subspace spanned by these directions is 𝑛 =
6. A secondsubset of small negative LEs ( 𝜆 𝑖 ∈ [− × − , [ day − ) corresponds to nearlyneutral directions . The subspace they span has dimension 𝑛 =
10 and we hereafterdefine the unstable-near–neutral subspace as the one spanned by the first 𝑛 + 𝑛 directions. The presence of these nearly-neutral directions amounts as a form ofdegeneracy of the neutral direction. Although the model is theoretically possessingonly one single neutral mode ( cf Sect. 2), it is computationally extremely difficult todistinguish it from other nearly neutral ones, and the model can be said to degeneratein the neutral direction in any practical sense. We shall see how this feature hasimportant consequences on the functioning and performance of DA. Finally, aftera clear gap, the nearly neutral Lyapunov exponents are followed by the remainingstable directions which form the last subset of the spectrum. hort Title 23
Similarly to Fig. 3, we study the filter performance along with its ensemblesubspace alignment to the unstable subspace in Fig. 7. The figure shows the RMSE ofthe EnKF analyses and the angle 𝜃 𝑖𝑘 given by Eq. (37), between the ensemble subspaceand the unstable-neutral (left) and unstable-near–neutral (middle) subspace. Valuesare averaged over 185 years of simulated time after discarding the first 100 analyses.RMSE of the analysis is also averaged over all model variables, once the error oneach variable is normalised with respect to the corresponding standard deviation 𝜎 md . The number of ensemble members is set to 𝑁 = t (days)0.0020.0080.0350.1440.600 % Angle t (days)0.0020.0080.0350.1440.600 % Angle t (days)0.0020.0080.0350.1440.600 % RMSE
Fig. 7
MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle givenby Eq. (37) between the anomalies of the EnKF and the unstable–neutral subspace (left panel,shadow colours, in degree) or the unstable-near–neutral subspace (mid panel, shadow colours, indegree). The time averaged normalised RMSE of the EnKF analysis is also depicted (right panel,shadow colours). All the figures are on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) . The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) and 𝑁 =
20. The unstable-near–neutral subspace is defined asfollows: it includes the subspace spanned by the unstable and neutral 𝑛 = 𝑛 =
10 stable but near–neutral directions with LEs 𝜆 𝑖 ∈ [− × − , [ day − . In contrast to what was observed in Fig. 3, we see here that the pattern of theRMSE does not longer resemble the pattern of the angle between the ensemble andunstable-neutral subspace ( cf left and right panels in Fig. 7). Nevertheless, it bearsgreat similarities with the pattern of the angle between the ensemble and unstable-near–neutral subspace ( cf mid and right panels in Fig. 7), suggesting that it is nowthis larger subspace (that includes the 𝑛 weakly stable modes) that contains most ofthe error.This is further confirmed by looking at Fig. 8 that, similarly to Fig. 4, showsthe angle (left y-axis) and RMSE of the analysis (right y-axis) against the ensemblesize 𝑁 . The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) with 𝜎 % = .
08 and Δ 𝑡 = .
68 days.The critical role of the 𝑛 weakly stable modes appears now evident when lookingat Fig. 8 ( cf green and blue lines). As opposed to the behaviour of the EnKF applied tothe atmospheric model alone, or to the single scale Lorenz96 system used by Bocquetand Carrassi (2017), we see that the ensemble subspace does not longer project muchonto the unstable-neutral subspace (blue line with solid circles markers): when 𝑁 = 𝑛 + Ensemble size N M e a n a n g l e w i t h s u b s p a c e Angle with unstable-neutralAngle with unstable-near-neutral 0.00.10.20.30.40.50.6 R M S E RMSE
Fig. 8
MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle (indegree) between the ensemble anomalies of the EnKF and the unstable-neutral subspace orthe unstable-near–neutral subspace as a function of the ensemble size 𝑁 (left 𝑦 axis), andthe corresponding time-averaged RMSE of the EnKF (right 𝑦 axis). The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) with 𝜎 % = .
08 and Δ 𝑡 = .
68 days. when
𝑁 > 𝑛 +
1, indicating the presence of non-negligible projections outsidethe 𝑛 = 𝑁 = 𝑛 +
1: the first unstable-neutral 𝑛 = 𝑁 =
15 and stays almost constant afterwards.This result demonstrates undoubtedly the importance of the 𝑛 near–neutral modes.Although possibly asymptotically weakly stable, these directions span a small portionof the filter error that, if included in the ensemble subspace (by properly enlargingthe ensemble size) leads to further improvement of the filter performance. This isfinally emphasized in Fig. 9, which shows the decrease of the ensemble-averagedangle between an ensemble of 𝑁 =
10 members and the subspaces of increasingdimension beyond the unstable-neutral one. In contrast to Fig. 5, it indicates that theaddition of 𝑛 extra ensemble members may lead to RMSE improvement that is farfrom negligible. Also, the gap observed around the value 𝑛 + 𝑛 + =
17 clearlyshows that adding more members beyond the range of the near-neutral stability doesnot bring any benefit. This is a strong indication in favor of a cautious assessment ofthe ensemble size when working with coupled multi-scale dynamics and performingstrongly-coupled DA. Furthermore, as elucidate by Vannitsem and Lucarini (2016)and Tondeur et al. (2020), the near-neutral part of the spectrum in MAOOAM isdirectly connected to the coupling: including those directions within the ensemble hort Title 25 subspace is paramount to propagate the data information content between ocean andatmosphere. This subspace also proved to be key in producing reliable ensembleforecasts in coupled ocean-atmosphere systems Vannitsem and Duan (2020).
Dimension of subspace M e a n a n g l e w i t h s u b s p a c e Fig. 9
MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle (indegree) between the ensemble anomalies of the EnKF with 𝑁 =
10 members, and subspacesspanned by the BLVs of increasing dimensions. These subspaces are constructed by starting fromthe unstable-neutral subspace ( 𝑛 =
6) and adding one by one stable directions ordered decreasinglyby their LE. The set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .
08 and Δ 𝑡 = .
68 days.
Results in Sect. 3.1.1 show that the optimal KF and the reduced-rank KF withgain restricted to the leading BLVs are asymptotically equivalent for linear, perfect-deterministic models. Moreover, results in Sect. 3.2.1 demonstrate that in weaklynonlinear error dynamics, with a perfect-deterministic model, these conclusionslargely extend to the EnKF. Yet the exact reduced-rank recursion introduced inSect. 3.1.2 evidenced important differences between the optimal and the reduced-rank formulations in the presence of model error for linear models. In this case,stochastic noise injected in asymptotically stable directions is not entirely dampedout and remains finitely bounded. In addition, the upwelling process, albeit inherentto the reduced-rank formulation and not a consequence of model error, will movethe constantly injected model noise from unfiltered to filtered directions.It is of interest thus to compare the differences between the full rank Kalmanestimator, and both the standard and the exact reduced-rank KF recursions in astochastically forced, nonlinear models. As a prototype, we use the model definedby the nonlinear flow of the Lorenz-96 (Lorenz, 1996) system with additive noise, and 𝑁 𝑥 =
40. Suppose that 𝑡 𝑘 + = 𝑡 𝑘 + Δ 𝑡 , so that the flow map taking all initialconditions to time + Δ 𝑡 is defined 𝜙 Δ 𝑡 ( x 𝑘 ) = x 𝑘 + . If x t 𝑘 represents the true physicalstate at time 𝑡 𝑘 , we define the dynamical model analogously to Eq. (8a) as a discrete,nonlinear map, x t 𝑘 + = 𝜙 Δ 𝑡 (cid:16) x t 𝑘 (cid:17) + w 𝑘 . To avoid the interplay and superposition between sampling and model errors, andto be able to focus on the latter alone, instead of the EnKF we use here the EKF (seeSect. 3 and Jazwinski (1970)). The EKF estimates the forecast distribution for x t 𝑘 via the equation, x f 𝑘 + = 𝜙 Δ 𝑡 (cid:16) x a 𝑘 (cid:17) , taking the analysis mean at time 𝑡 𝑘 to the forecastmean at time 𝑡 𝑘 + , and by the linearized forecast equation for the covariance. In thefollowing, the EKF propagates the full-rank covariance equation via the tangent-linear model defined along the mean equation and assimilates observations in allstate components. On the other hand, both the standard and the exact reduced-rankformulations restrict the assimilation so that the image space of the gain is equalto the span of the leading 𝑟 BLVs defined along the tangent-linear model of themean equation. The difference between the standard and the exact formulation isas follows: the standard formulation only estimates the error covariance in the spanof the leading 𝑟 BLVs while the exact formulation simulates the entire covarianceequation, estimating the free forecast covariance in the trailing modes and includingits upwelling in the estimate of the uncertainty in the span of leading BLVs (Grudzienet al., 2018b, see proposition 1)Figure 10, adapted from Grudzien et al. (2018b), illustrates the differences inperformance between the above described schemes. On the left, the average RMSEof the (i) full-rank , the (ii) standard reduced-rank and the exact reduced-rank EKF formulations are plotted versus the rank of the reduced-rank gain over 10 analy-ses. The model possesses 13 positive LEs, therefore the reduced-rank EKFs underconsideration have at least 𝑟 ≥ 𝑛 + =
14. The system is fully observed with R = . I 𝑁 𝑥 . As 𝑟 approaches 𝑁 𝑥 =
40, the two reduced rank formulations con-verge to the full rank EKF, with performance considered optimal for a filter in thissystem. However, for correction rank 𝑟 ≤
20, there are substantial differences inperformance between the standard formulation and that which includes the effect ofthe upwelling of error from the trailing BLVs: the exact formulation reaches bothadequate and near-optimal performance with smaller 𝑟 than the standard recursion.On the right, Fig. 10 demonstrates the effect of multiplicative inflation on thestandard recursion when the correction rank is fixed at 𝑟 =
17. While multiplicativeinflation greatly improves the performance of the standard recursion, this perfor-mance is actually bounded below by the RMSE of the exact formulation. This ex-ample shows that, in addition to sampling error, multiplicative inflation can be usedto remedy the inadequacies of the standard reduced-rank formalism which neglectsthe effects of dynamic upwelling. This dynamic upwelling is a direct byproduct ofthe estimator being restricted to the span of the leading BLVs. Such an estimatorarises when, e.g. , the ensemble span aligns with the span of the leading BLVs inthe EnKF, as demonstrated in Sect. 3.2.1. The efficacy of dynamic, multiplicativecovariance inflation for treating the effect of model error separately from samplingerror has also been demonstrated with statistical and optimization methods by, e.g. , hort Title 27 Mitchell and Carrassi (2015); Raanes et al. (2015, 2019); Sakov et al. (2018); Fillionet al. (2020). The dynamical upwelling derived in Grudzien et al. (2018b) providesan explanation of one of the mechanisms responsible for the need for covarianceinflation.
Fig. 10 Left:
RMSE of the full-rank, standard reduced-rank and exact reduced-rank EKF. Thecorrection rank of the reduced-rank estimators is varied in the horizontal axis.
Right:
RMSE ofthe full-rank, standard reduced-rank and exact reduced-rank EKF. Rank is fixed at 𝑟 =
17 andmultiplicative inflation in the standard reduced-rank recursion is varied in the horizontal axis
We present here two areas where the knowledge about the chaotic nature and prop-erties of the model dynamics, have been used pro-actively to achieve a better trackof the system of interest and to reduce the forecast error. There have been two wayshow this has been accomplished. The former has to do with the design of methods toinform where and when additional fewer data would lead to a major improvement inthe analysis and forecast skill. This is usually referred to as “adaptive observation”or “target observation”, and is the content of Sect. 4.1. The second has to do withincorporating, within the DA process itself, the information about the unstable sub-space, with the goal of achieving a computational economy while maximising theerror reduction. This gave raise to a family of DA methods known as “assimilationin the unstable subspace”, summarised in Sect. 4.2.Our treatments here are intentionally succinct and have the character of a review,but interested readers will find all the appropriate references to the original works.
Historically, one very important development in this type of dynamical analysis ofthe climate predictability / DA problem arose from the question of how to generateadaptive observation systems. By “adaptive observations” is meant here observationswhose locations, time and type is chosen such that their impact on the state estimateor on the forecast skill is the largest. It was well understood at the time that the growthof forecast errors was confined to a lower dimensional subspace of rapidly growingperturbations (Trevisan and Legnani, 1995). The Fronts and Atlantic Storm Track(FASTEX) program (Snyder, 1996), a multinational collaboration to investigatethe growth and development of frontal cyclones, in particular was motivated bythese dynamical approaches to generate adaptive observation schemes that wouldtarget regions of rapid forecast error growth. Other similar international effortshave followed, some of them including actual field campaign of measurements: theTHORPEX (Fourrié et al., 2006) and the Winter Storm Reconnaissance programs(Szunyogh et al., 2002; Hamill et al., 2013). Two main approaches were considered,the forced singular vectors (Buizza et al., 1993; Palmer et al., 1998), and the bredvectors (Toth and Kalnay, 1993, 1997), to identify the sensitivity areas of highforecast uncertainty.Forced singular vectors are generated by the right singular vectors of the forward-in-time, tangent-linear model resolvent M . It can be seen from the earlier discussionsin Sect. 2 that the singular vectors can be interpreted as a finite-time approximationof the FLVs along a model forecast. They therefore indicate region where error willgrow. On the other hand, the bred vectors are an ensemble-based approach to identifysensitivity regions. Particularly, the breeding scheme simulates how the modes offast growing error are maintained and propagated through the successive use of shortrange forecasts in weather prediction. The bred vectors are formed by initializingsmall perturbations of a control trajectory and forecasting these in parallel along thecontrol. By successive rescaling of the perturbations amplitude back to a small value,this mimics the evolution of small perturbations under the tangent-linear model, andthe span of these perturbations generically converges to the leading BLVs. Both ofthese approaches represent an early and intuitive way to utilizing the ergodic theoryof chaotic dynamical systems in designing an effective adaptive observation schemeby targeting an unstable subspace in some form.Trevisan and Uboldi (2004a); Uboldi and Trevisan (2006) utilized the bred vectors/ BLV analysis of Toth and Kalnay (1993, 1997) and explicitly linked the methodologyto Lyapunov stability theory. Importantly it was recognised that, instead of mimickingthe error growth of the unforced (free) forecast, it was important to track the errorsthat develop within the DA cycle. This led to a modified version of the breedingapproach known as Breeding on the Data Assimilation System (BDAS) Uboldi et al.(2005); Carrassi et al. (2007). The BDAS scheme for adaptive observations is basedon the principle of the support of the forecast error lying primarily in the span ofthe BLVs, with the bred vectors acting as a proxy for the explicit decomposition. Inpractice the locations of few adaptive observations were selected to be in the areaswhere the leading BDAS modes attained their local maxima. In experiments with hort Title 29 an atmospheric quasi-geostrophic model, BDAS was used successfully to locate oneadditional observation at each analysis time of a 3DVar cycle, leading to a dramaticimprovement of the analysis skill, compared to cases where either a fixed or arandomly located adaptive observation was assimilated Carrassi et al. (2007). Data assimilation has long been studied with the trade-off between accuracy andnumerical efficiency as a goal. To this end a natural choice has been that of devisingreduced-order schemes that focus the observational constraint on smaller, albeitcrucial, part of the full dynamics. One of the most celebrated among those approachesis the assimilation in the unstable subspace (AUS) where the DA procedure isexplicitly designed to track and control the unstable manifold of the dynamics,usually of much smaller dimension of the full phase space, thus aiming to a reductionin computational cost (Palatella et al., 2013).We have seen in Sect. 2 that the full phase space of a chaotic dissipative dynamicalsystem can be seen as split in a (usually much smaller) unstable–neutral subspaceand a stable one (Kuptsov and Parlitz, 2012). For instance, Carrassi et al. (2007) haveshown how a quasi-geostrophic atmospheric model of
O ( ) degrees of freedompossesses an unstable–neutral subspace of dimension as small as 𝑛 = considered. This improved the performance of the standard EKF-AUS particularly inregimes of increasing nonlinearities. It remains however to be seen to which extentAUS concept could be used within fully nonlinear DA schemes. This is the subjectof Sect. 5.1 and of the references therein. Although AUS has been largely used in aperfect model scenario, Palatella and Grasso (2018) proposed a suitable modificationthat allows for incorporating parametric model errors. As proved in Grudzien et al.(2018b) and detailed in Sect. 3.1.2 and 3.2.2 the use of AUS in chaotic systems forcedby additive stochastic noise would necessarily require the additional inclusion of theasymptotically weakly stable modes.A key caveat in all of the aforementioned applications of AUS is that one needsto compute in real time the BLVs to be used in the analysis. Therefore, whileAUS proved capable to improve accuracy, it did not accomplishes a computationaleconomy, unless the BLVs were all pre-computed and stored. Note however thatthe latter is not just a technological challenge given that the BLVs depends on thesystem’s state and vice-versa if BLVs are to be used in the analysis update. Thus it isnot straightforward to decouple their online estimation and use within the analysis.At the same time however, as we have seen in Sect. 3, AUS concept proved to be verypowerful to understand, design and interpret the functioning of the KF and EnKF inchaotic systems. In this subsection we attempt to improve the performance of the (bootstrap) particlefilter (PF, Farchi and Bocquet, 2018) by AUS. The underlying hypothesis is thatobservational components in the stable subspace contribute little in the way ofprecision (since nearby orbits within the stable subspace converge), but a lot ofnoise. Therefore, the investigation will explore whether discarding observationalinformation outside of the unstable subspace can mitigate the acute collapse ofweights experienced by PFs in high-dimensional systems, manifesting the “curse ofdimensionality”. If so, this could be used to reduce the number of particles required,which scales exponentially with some measure of the system size (Snyder et al.,2008). A secondary objective is to investigate the effectiveness of a few differentmethods of targeting observing systems to the unstable subspace, potentially alsoreducing costs.We perform synthetic DA experiments with the Lorenz-96 system. The statesize is set to 𝑁 𝑥 =
10 and there is no dynamical noise ( Q = ). Four differentobservation configurations targeting the unstable subspace are tested. For each ofthem, observations are taken 0 . .
5. Each experiment lasts for 10 analysis cycles. The RMSE averages of eachmethod are tabulated for a range of ensemble sizes and observation operator ranks, 𝑁 𝑦 , and plotted as curves in Fig. 11. The plotted scores represent the lowest obtained hort Title 31 among a large number of tuning settings, selected for optimality at each point. Forthe PF the tuning parameters are: (i) the threshold for resampling, which is triggeredif the threshold is larger than the effective ensemble size, (cid:107) w (cid:107) − , where w is thevector of weights, and (ii) the bandwidth (scaling) of the regularizing post-resamplejitter, whose covariance is computed from the weighted ensemble. For comparisonthe (symmetric square-root) EnKF (Hunt et al., 2004) is also tested. Its tuningparameters are (i) the post-analysis inflation factor and (ii) whether or not to applyrandom, covariance-preserving rotations (Sakov and Oke, 2008b).The top-left panel Fig. 11 shows that the error decreases monotonically in 𝑁 𝑦 when the observation operator are rows of the identity. The top-right panel shows,by contrast, that when observing the BLVs, L b 𝑘 , the PF with largest ensemble levelsoff at near-optimal performance with as few as 𝑁 𝑦 = 𝑛 = 𝑁 , albeitless clearly. Also recall that a similar “levelling off” of RMSE around 𝑛 occurredin Fig. 4 of Sect. 3.2.1, whose x-axis is 𝑁 (rather than 𝑁 𝑦 , as here). This resultdemonstrates that targeting observations to the directions of dynamical growth ofthe uncertainty is efficacious. Interestingly, the performance of any given DA methodis nearly independent of the observing system ( i.e. panel) for 𝑁 𝑦 =
10. This makessense considering that all of I , L b 𝑘 , V 𝑘 + , and U 𝑘 are orthogonal, i.e. equal up to arotation.In all panels of Fig. 11, the RMSE score of the PF (as well as that of the EnKF), forany ensemble size, never degrades by the inclusion of more observations (except forwhat we adjudge to be noise). This feature was also found in experiments with evensmaller ensembles than shown, and experiments with larger observation errors. Thusit seems that the curse of dimensionality is not mitigated by discarding observations.Hence there is little to gain by eliminating observation components outside of theunstable-neutral subspace, apart for the potential computational efficiency in the casethat unstable-neutral has been pre-computed.It should be noted that this finding runs counter to that of Maclean and Vleck(2019); Beeson and Namachchivaya (2020); Potthast et al. (2019), all of whichreport success in mitigating weight collapse by reducing the observations to theleading components of L b 𝑘 or some related matrix. Beeson and Namachchivaya(2020) also tested observation operators given by time-local backward and forwardLyapunov vectors, defined through the singular value decomposition of the resolvent: M 𝑘 = U 𝑘 𝚺 𝑘 V (cid:62) 𝑘 . They reported better targeting results with V 𝑘 + than with U 𝑘 ,which in turn was more effective than L b 𝑘 . Our results similarly show U 𝑘 (bottom-right panel) to be slightly more effective for targeting observations than L b 𝑘 (top-rightpanel), albeit only for intermediate ensemble sizes. Both, however, are more effectivethan V 𝑘 + (bottom-left). Moreover, also for V 𝑘 + and U 𝑘 , we never observe lowerRMSE scores when using fewer observation components. There are some differencesin the experimental setting; notably our system is deterministic, and the jitter we applyis restricted to the ensemble subspace. It is unclear if these differences can accountfor the disparity in conclusions.Why does PF performance not suffer from the inclusion of further, visibly redun-dant, observations (contrary to our initial expectation)? Consider the likelihood ( i.e. H k = ([ L b k ] ) : N y H k = ( V k +1 ) : N y H k = ( U k ) : N y H k = I : N y EnKFPF T i m e - a v e r a g e d , f il t e r - a n a l y s i s ( R M S E rr o r ) Observation length ( N y ) Fig. 11
Benchmarks of filter accuracy (RMSE) from synthetic DA experiments on the Lorenz-96 system, plotted as functions of the observation dimension ( 𝑁 𝑦 ). Specifically, the observationoperator consists of the 𝑁 𝑦 leading rows of the transpose of the identity matrix ( top-left ), the BLVmatrix, L b 𝑘 , computed by recursive QR decompositions ( top-right ), the 1-cycle forward singularvectors, V 𝑘 + ( bottom-left ), and 1-cycle backward singular vectors, U 𝑘 ( bottom-right ), all of whichare defined via the fundamental matrix of the orbit of the (supposedly unknown) truth. The numberof members/particles used for the ensemble Kalman filter (EnKF, dashed ) and Particle filter (PF, solid ) is tagged above each line. No further improvement is obtained by increasing the ensemblesize beyond 𝑁 =
30 for the EnKF and 𝑁 =
800 for the PF, for which the PF achieves betteraccuracy than the EnKF, as expected for nonlinear systems. The greyscale, dotted lines, includedfor context, show the performance of baseline methods, whose analysis estimates, x a , are given by x for Climatology ( black ), x + K ( C ) [ y − x ] for Optimal interp. ( dark grey ), x f + K ( 𝑐 I ) [ y − x f ] for3D-Var ( light grey ). Here, x and C are the mean and covariance of the (invariant measure of the)system dynamics, K ( C ) = CH (cid:62) ( HCH (cid:62) + R ) − is a gain matrix, x f is the forecast of the previous x a , and 𝑐 is a scaling factor subject to tuning. The plots show that targeting observations to thedirections of dynamical growth of the uncertainty is efficacious and that, for this task, L b 𝑘 and U 𝑘 are similarly effective, and superior to V 𝑘 + . Moreover, the RMSE performance of any method,also for the PF (which is our focus), never degrades with the inclusion of more observations. weighting) of particle 𝑛 ∈ [ , . . . , 𝑁 ] , at a given (implicit) time: 𝑝 ( y | x 𝑛 ) = 𝜙 (cid:0) (cid:107) y − Hx 𝑛 (cid:107) R (cid:1) , (38)where the norm is defined as (cid:107) y (cid:107) R = y (cid:62) R − y , and 𝜙 is a radial density such thatEq. (38) represents an elliptical distribution, for example a Gaussian. Suppose for hort Title 33 the sake of simplicity that both the observation operator and its error covarianceare identity, i.e. H = R = I , and let 𝚷 be the matrix of orthogonal projectiononto the unstable-neutral subspace. Now, assuming the PF controls the error in theunstable-neutral subspace, it seems reasonable to assume, given the results for linearsystems of Sect. 3, that the particles will converge onto the unstable-neutral subspace: ( I − 𝚷 )( x 𝑛 − x ) → , at least as a first-order approximation. Supposing this, and bydecomposing the norm into two orthogonal components, it can be shown that (cid:107) y − Hx 𝑛 (cid:107) = (cid:107) 𝚷 ( y − Hx 𝑛 )(cid:107) + (cid:107)( I − 𝚷 ) v (cid:107) , (39)where v is the observation noise. Thus, projecting the observations onto the unstable-neutral subspace, which would eliminate the last term of Eq. (39), merely reducesthe data mismatch by a constant (with respect to 𝑛 , the particle index). Thus 𝑝 ( 𝚷 y | x 𝑛 ) = 𝑐 𝑝 ( y | x 𝑛 ) , (40)for some 𝑐 > i.e. uncertainty, in the stable subspace, and therefore ignores observational components, including the noise , in that subspace. This should be contrasted with the situation fordynamical noise (model error), which is “up-welled” from the stable to the unstablesubspace, as per Sect. 3.1.2. This distinction exemplifies the difference betweenuncertainty addition (by dynamical noise) and subtraction (by likelihood updates).For nonlinear dynamical systems, as illustrated in Sect. 3.2, the particles do notneatly align with the unstable-neutral subspace. Capturing this nonlinear aspect ofthe flow is generally seen as an advantage of the PF. It might be argued, though, thatthe observation noise is likely large compared to the spread of the prior particles inthe stable subspace, and therefore the observations should be reduced by discardingthe corresponding components. However, as highlighted above, the observationalnoise is constant in the particle index, and therefore its amplitude does not constitutea particular source of weight variability. Instead, the weight variability originates inthe prior, which was assumed to have low variability in this scenario.The analysis resulting in Eq. (40) assumed that H = R = I followed by H = 𝚷 .In the case of the observation operator consisting of the leading 𝑁 𝑦 ≥ 𝑛 rows ofthe transposed BLV matrix, H 𝑘 = ( L b 𝑘 ) (cid:62) 𝑁 𝑦 , the same conclusion can be derivedalong similar lines. In the general case, for any H , it is not obvious how to reducethe observations so as to only measure the unstable-neutral subspace. To accomplishthis, both Maclean and Vleck (2019) and Beeson and Namachchivaya (2020) applythe pseudo-inverse H + before their reduction. This can be costly if H is large. A morepractical approach is to reduce the observations as ˆy = ( H ˆQ ) + y , or ˆy = 𝚷 y with 𝚷 = ( H ˆQ )( H ˆQ ) + , with ˆQ = ( L b 𝑘 ) (cid:62) 𝑁 𝑦 In any case, re-doing the same derivation,Eq. (40) again follows, including the same implications for the weights.
In summary, discarding observational information outside of the unstable sub-space does not yield improvements in the PF because it already embodies this“flow-dependent” information. A similar conclusion was also drawn for the iterativeensemble Kalman smoother by Bocquet and Carrassi (2017). Thus, while AUS isa powerful explanatory and diagnostics tool, it is not obvious if it can be used tocombat the curse of dimensionality for PFs.Lastly, this section adds some clarification to the influential paper by Snyder et al.(2008), whose conclusion is sometimes taken to be that the required ensemble sizefor PFs scales exponentially with observation size. Our results rather indicate thatthe performance of a well-tuned PF will not deteriorate with the inclusion of moreobservations (even if they are redundant). In other words, that the required ensemblesize depends on the rank of the state space, or more precisely for chaotic dynamics,the rank of the unstable-neutral subspace. Snyder et al. (2015) points out that the“effective dimension” may be limited by the observation size if this is smaller thanthe state size. However, in case 𝑁 𝑦 < 𝑛 , no filtering system using flow-dependentpriors will be able to achieve satisfactory performance, because the system is notsufficiently observed. Of course, the question of observability is complicated byconsidering time-dependent observation networks, while localization can also beapplied to alleviate the curse of dimensionality (Farchi and Bocquet, 2018). AUS and its theoretical extensions provide a framework to interpret the asymptoticinferences of the EnKF. Provided that the forecast anomalies can be considered to beperturbations of the true physical state, and if their evolution can be approximatedby the tangent-linear model along the true trajectory, the dynamics of the EnKFanomalies can be decomposed along the Oseledec spaces of the true physical state.The stability and accuracy of the EnKF is largely determined by the ability ofthe ensemble-based gain to correct the growth of forecast errors in the unstable-neutral subspace, with respect to the uniform-complete observability of these modes.Additive noise complicates the description of the asymptotic forecast uncertaintyas uncorrected forecast errors in the span of the stable BLVs may be bounded butimpractically large. The upwelling of such errors into the ensemble span furthermorenecessitates covariance inflation to rectify the systematic underestimation of theforecast uncertainty in the standard KF-AUS recursion.Model stochasticity arises systematically in multiscale climate dynamics wherethere are large scale-separations between resolved and unresolved dynamic processes.In the asymptotic limit of scale-separation, unresolved dynamics can be reduced toadditive Gaussian noise due to the Central Limit Theorem; finite scale-separationin reality leads to non-Markovian memory terms in addition to additive stochasticforcing in the exact model reduction of a multiscale model, as in Mori-Zwansigformalism (Gottwald et al., 2015). Several mathematically rigorous frameworkshave been developed to model and simulate the effect of small-scale dynamics hort Title 35 on the large-scale dynamics with stochastic parameterization, including averagingmethods, perturbation methods and combinations of the two — see, e.g. , the surveyof approaches by Demaeyer and Vannitsem (2018).The theory of random dynamical systems offers a novel means of analysis of theDA cycle for multi-scale chaotic systems with large scale separations and modelreduction error. Characterizing the asymptotic Bayesian posterior in terms of theproperties of a random, nonlinear and ergodic attractor is a natural step forwardin the philosopy of AUS. Recent work suggests that the support of the posteriormeasure of the DA cycle can be asymptotically bounded by the support of theSRB measure in deterministic, nonlinear dynamical systems (Oljača et al., 2018).While this is an intuitively appealing result, the existence of an SRB measure indeterministic dynamics usually requires a hyperbolicity assumption which may notbe appropriate in weather and climate, e.g. (Vannitsem and Lucarini, 2016). However,many of the theoretical challenges in showing the existence of SRB measures indeterministic dynamics are relaxed in a random dynamical systems setting. Indeed,the Pesin entropy formula holds under very general assumptions for stochastic flowsof diffeomorphisms (Liu and Qian, 2006)[theorem 3.1 and discussion on page 127],establishing the link once again between the observed instability in the dynamicsand the statistical properties of the invariant measure.For such a dynamical interpretation of the DA cycle to be credible, the correctspecification of random models in stochastic-physical systems is a primary concern;stochastically reduced models should be specified to preserve conservation lawsand the original model’s dynamics (Cotter et al., 2019). In addition to the correctstochastic model specification, important differences in the statistical properties ofmodel forecasts of stochastic dynamical systems have been observed due to thediscretization errors of certain low-order schemes. For example, Frank and Gottwald(2018) develop an order 2.0 Taylor scheme to correct the bias in the drift term inducedby the Euler-Maruyama scheme in their study system. Grudzien et al. (2020) likewisefind that the bias due to discritization error of the Euler-Maruyama scheme can besufficient to cause filter divergence of the EnKF in the stochastically forced Lorenz-96 model. Grudzien et al. (2020) emphasize the important role of efficient, weaknumerical schemes for the simulation of ensemble-based forecasts. Unlike strongconvergent schemes, numerical schemes that converge in the weak sense can makereductions in the complexity of simulation by emphasizing the accuracy of theconvergence of the ensemble to the forecast distribution rather than the accuracy ofany individual ensemble member.
Chaos is ubiquitous in natural, physical and laboratory systems. Scientists have longcoped with this whenever attempting to model, predict or control such systems.Combining and confronting models with data is common in science and data as-similation (DA) is the term coined in the context of numerical weather prediction science to encompass the methods that perform such a combination. The outputs ofDA is the improved representation of the system under study, and an estimate of theassociated uncertainty.Inevitably, the sensitivity to initial conditions of chaotic systems, including state-dependence of the directions of error growth, is a challenge for DA. On the otherhand, the energy dissipation typical of real systems implies a “dimension reduction”in that errors are confined within a subspace of the full system’s phase space.The DA process requires furnishing a prior distribution, whose specification isa severe difficulty in high dimension. Gaussian methods reduce the complexity tothat of estimating the prior mean and covariance. Yet, with the exceptionally high-dimensions of geophysical problems, a proper estimate, and storage of the priorcovariance matrices is still challenging. Thus, suitable reduced-rank formulationsshould be used to lower the computational load while maintaining a good descriptionof the errors about the full system.In chaotic dissipative systems this goal can be achieved by monitoring theunstable-neutral subspace of the model dynamics and performing DA only withinthat subspace. This is the idea at the basis of the class of methods known as assim-ilation in the unstable subspace (Palatella et al., 2013), that were developed mainlyin the years between 2004 to 2015; we reviewed them in Sect. 4. Despite reducingthe problem size to that of the unstable-neutral subspace (of dimension 𝑛 (cid:28) 𝑁 𝑥 ),AUS methods proved aptly skillful, very close to those of their more costly full-rankcompetitors.However, the reduction of cost achieved at the analysis steps is offset by the addi-tional cost of monitoring and tracking the unstable-neutral subspace. This requirescomputing the Lyapunov vectors (usually the backward Lyapunov vectors), whichimplies computing the tangent linear model (or alternatively evolving an ensembleof bred perturbations mimicking the evolution under the tangent linear model) and arepeated QR matrix decomposition. For specific purposes, one can opt for trackingonly a few dominant unstable modes. This was the case for early adaptive (targeted)observation results (Carrassi et al., 2007) or when AUS was used to complement aclassical DA method (Carrassi et al., 2008b).In parallel to the early developments of AUS, several studies with ensemble DAmethods in chaotic dissipative systems were suggesting that a number of their keyproperties were related to the model instabilities, including the rank and span ofthe ensemble-based forecast error covariance, as well as the skill of the analysis(Sakov and Oke, 2008b; Carrassi et al., 2009; Ng et al., 2011; Bocquet and Sakov,2014). These results indicate that the ensemble anomalies automatically align withthe unstable-neutral subspace, thus resulting in the analysis to be confined to it.To put this mechanism on a more rigorous theoretical footing, a stream of workshave studied the relation between the unstable-neutral subspace in linear systemsusing the Kalman filter (KF) and Kalman smoother (KS) (Gurumoorthy et al., 2017;Bocquet et al., 2017; Bocquet and Carrassi, 2017). These works, reviewed in Sect. 3.1,have provided analytic proofs that the span of the error covariance matrices of theKF and KS tends asymptotically to the unstable–neutral subspace, independent ofthe initial condition ( i.e. no matter the number of the ensemble members, provided hort Title 37 it exceeds the size of the unstable-neutral subspace). For stochastic systems withadditive noise it was proved that asymptotically weakly stable modes, that one mightdiscard in deterministic systems, must be included and analytic bounds for the errorwere provided (Grudzien et al., 2018a).How do these results hold for nonlinear systems? In the case of chaotic deter-ministic systems, this was studied in Bocquet and Carrassi (2017) and further inSect. 3.2.1 of this chapter. It was numerically showed that an ensemble comprisingat least as many members as the size of the unstable-neutral subspace plus one( 𝑁 ≥ 𝑛 +
1) is needed to achieve satisfactorily skill with the ensemble Kalman filter(EnKF). Section 3.2.1 also considered the case of a coupled multiscale system witha quasi-degenerate spectrum of Lyapunov exponents. This originates in the pres-ence of many close-to-zero exponents that are related to the coupling mechanisms(Vannitsem and Lucarini, 2016). It is shown that their full inclusion in the ensembledesign is needed to reduce the EnKF analysis error to a satisfactorily low level.The case of nonlinear stochastic chaotic systems with additive noise was studiedin Grudzien et al. (2018b) and reviewed in Sect. 3.2.2. The section explains the roleof the weakly stable modes already identified in linear systems, but also discoveredthe upwelling mechanism for which uncertainty is upwelled from unfiltered (stable)modes to filtered (unstable) ones. The upwelling phenomenon is not exclusive ofnonlinear systems and it is in fact present in linear systems too. It provides an addi-tional rationale to the use of multiplicative inflation, otherwise known by numericallyevidence to be needed for a proper functioning of reduced-rank filters even in perfectmodel scenarios (see e.g.
Raanes et al., 2019).An outlook at how this research may evolve is given in Sect. 5. In particular inSect. 5.1 we provide original results on the use of the AUS approach, i.e. exploiting theunstable-neutral subspace, in particle filters (PFs), a fully non-Gaussian DA method.Results indicate that targeting observations within the unstable-neutral subspaceis very effective. However, by analogy with what was proved for the EnKF inSect. 3.2.1, adding observations along the stable modes does not deteriorate theanalysis. In the particle filter too, the particles automatically align along the unstable-neutral subspace so that the contribution from observations in its complement stablesubspace is negligible. Our results shed also new insight on the scaling of theparticle numbers needed to reach convergence. It is shown to depend on the sizeof the unstable-neutral subspace rather than the observation vector size. In Sect.5.2we surveyed how the novel concept of random attractors could offer new ways tofurther exploit the idea behind AUS on stochastic multi-scale systems with largescale separation. Moreover, we also described how to amend numerical integrationschemes when doing ensemble DA on such systems, as a trade-off between accuracyand computational cost.It is important to recall that all of the results with the EnKF and PF that wehave presented are obtained without the use of localization (see e.g.
Carrassi et al.(2018), their chapter 4.4, and Farchi and Bocquet (2018) for localisation in the EnKFand PF, respectively). While we are well aware of the dramatic positive impactof localization on the filters’ skill, we intentionally did not use it as it artificially changes the ensemble-covariance rank and span, thus making it impossible to linkthem exclusively to the model instabilities.The use of the time-dependent unstable-neutral subspace to represent uncertaintyin dynamical systems is still very appealing and potentially prone to success in awider area than explored so far. For instance, in a recent work by Bocquet et al.(2020) model error arise from parameter mispecification and the EnKF is appliedto estimate simultaneously the (chaotic) model state and 𝑁 𝑝 parameters. The EnKFis used in the state-augmentation formulation and the standard persistence model isadopted for the parameter dynamics. It was shown that the bound for the necessaryensemble size becomes 𝑁 ≥ 𝑛 + 𝑁 𝑝 + 𝑁 𝑝 additional members are required toinfer the 𝑁 𝑝 parameters. While the linear one-to-one relation between the numberof parameters and that of the additional members is a consequence of the choice ofa persistence model and will change if a different parameters dynamics is in place,this result further highlights how much the design of the EnKF is, and can be tied tothe properties of the dynamical model.Future developments along these lines will unavoidably have to tackle the obstacleof the computing cost of the unstable-neutral subspace. We speculate that recentprogress in the area of machine learning (Goodfellow et al., 2016) may help. Neuralnetwork surrogate models of chaotic systems have shown capabilities to reproducethe spectrum of the asymptotic Lyapunov exponents fairly well (Pathak et al., 2017;Brajard et al., 2020). It is matter of future investigations to explore the possibilities ofmachine learning algorithms that learn about the time-dependent instabilities fromoffline long model simulations and then assist the model in the prediction mode byproviding a proxy of the unstable-subspace at each analysis time. Acknowledgements
AC has been funded by the UK Natural Environment Research Council awardNCEO02004. CEREA is member of Institut Pierre–Simon Laplace (IPSL). PNR has been partlyfunded by DIGIRES, a project sponsored by industry partners and the PETROMAKS2 programmeof the Research Council of Norway.
References
L. Y. Adrianova.
Introduction to linear systems of differential equations . AmericanMathematical Soc., 1995.M. Asch, M. Bocquet, and M. Nodet.
Data Assimilation: Methods, Algorithms, andApplications . SIAM, 2016.L. Barreira and Y. B. Pesin.
Lyapunov Exponents and Smooth Ergodic Theory .Student Mathematical Library. American Mathematical Society, 2002. ISBN9780821829219.R. Beeson and N. S. Namachchivaya. Particle filtering for chaotic dynamical systemsusing future right-singular vectors.
Nonlinear Dynamics , June 2020. doi: 10.1007/s11071-020-05727-y. hort Title 39
G. Benettin, L. Galgani, A. Giorgilli, and J. Strelcyn. Lyapunov characteristicexponents for smooth dynamical systems and for Hamiltonian systems; a methodfor computing all of them. Part 1: Theory.
Meccanica , 15(1):9–20, 1980.M. Bocquet. Ensemble Kalman filtering without the intrinsic need for inflation.
Nonlinear Processes in Geophysics , 18(5):735–750, 2011.M. Bocquet and A. Carrassi. Four-dimensional ensemble variational data assimila-tion and the unstable subspace.
Tellus A , 69(1):1304504, 2017.M. Bocquet and P. Sakov. An iterative ensemble Kalman smoother.
Q. J. R. Meteorol.Soc. , 140:1521–1535, 2014.M. Bocquet, K. S. Gurumoorthy, A. Apte, A. Carrassi, C. Grudzien, and C. K. R. T.Jones. Degenerate Kalman filter error covariances and their convergence ontothe unstable subspace.
SIAM/ASA Journal on Uncertainty Quantification , 5(1):304–333, 2017.M. Bocquet, A. Farchi, and Q. Malartic. Online learning of both state and dynamicsusing ensemble kalman filters.
Foundation of Data Science , 0:00–00, 2020. doi:10.3934/fods.2020015. Accepted for publication.J. Brajard, A. Carrassi, M. Bocquet, and L. Bertino. Combining data assimilationand machine learning to emulate a dynamical model from sparse and noisy ob-servations: A case study with the lorenz 96 model.
Journal of ComputationalScience , 44:101171, 2020.R. Buizza, J. Tribbia, F. Molteni, and T. N. Palmer. Computation of optimal unstablestructures for a numerical weather prediction model.
Tellus A , 45(5):388–407,1993.A. Carrassi, A. Trevisan, and F. Uboldi. Adaptive observations and assimilation inthe unstable subspace by breeding on the data-assimilation system.
Tellus A , 59(1):101–113, 2007.A. Carrassi, M. Ghil, A. Trevisan, and F. Uboldi. Data assimilation as a nonlinear dy-namical systems problem: Stability and convergence of the prediction-assimilationsystem.
Chaos , 18:023112, 2008a.A. Carrassi, A. Trevisan, L. Descamps, O. Talagrand, and F. Uboldi. Controllinginstabilities along a 3DVar analysis cycle by assimilating in the unstable subspace:a comparison with the EnKF.
Nonlinear Processes in Geophysics , 15:503–521,2008b.A. Carrassi, S. Vannitsem, D. Zupanski, and M. Zupanski. The maximum likelihoodensemble filter performances in chaotic systems.
Tellus A , 61:587–600, 2009.A. Carrassi, M. Bocquet, L. Bertino, and G. Evensen. Data assimilation in thegeosciences-an overview on methods, issues and perspectives.
WIREs ClimChange , e535., 2018. doi: 10.1002/wcc.535.C. Cotter, D. Crisan, D. D. Holm, W. Pan, and I. Shevchenko. Numerically modelingstochastic lie transport in fluid dynamics.
Multiscale Modeling & Simulation , 17(1):192–232, 2019.L. De Cruz, J. Demaeyer, and S. Vannitsem. The modular arbitrary-order ocean-atmosphere model: maooam v1.0.
Geoscientific Model Development , 9(8):2793–2808, 2016. doi: 10.5194/gmd-9-2793-2016.
J. Demaeyer and L. De Cruz. Climdyn/qgs: qgs version 0.2.0 release, July 2020.URL https://doi.org/10.5281/zenodo.3941877 .J. Demaeyer and S. Vannitsem. Stochastic parameterization of subgrid-scale pro-cesses: A review of recent physically based approaches. In
Advances in NonlinearGeosciences , pages 55–85. Springer, 2018.J. Demaeyer, L. De Cruz, and S. Vannitsem. qgs: A flexible python frameworkof reduced-order multiscale climate models.
Journal of Open Source Software ,Submitted. URL https://raw.githubusercontent.com/openjournals/joss-papers/joss.02597/joss.02597/10.21105.joss.02597.pdf .L. Dieci and E. S. Van Vleck. Lyapunov spectral intervals: theory and computation.
SIAM Journal on Numerical Analysis , 40(2):516–542, 2002.L. Dieci and E. S. Van Vleck. Lyapunov and sacker–sell spectral intervals.
Journalof dynamics and differential equations , 19(2):265–293, 2007.G. Evensen.
Data Assimilation: The Ensemble Kalman Filter . Springer-VerlagBerlin Heildelberg, second edition, 2009a.G. Evensen.
Data assimilation: the ensemble Kalman filter . Springer Science &Business Media, 2009b.A. Farchi and M. Bocquet. Review article: Comparison of local particle filters andnew implementations.
Nonlinear Processes in Geophysics , 25(4):765–807, Nov.2018. doi: 10.5194/npg-25-765-2018.A. Fillion, M. Bocquet, S. Gratton, S. Gürol, and P. Sakov. An iterative ensemblekalman smoother in presence of additive model error.
SIAM/ASA Journal onUncertainty Quantification , 8(1):198–228, 2020. doi: 10.1137/19M1244147.N. Fourrié, D. Marchal, F. Rabier, B. Chapnik, and G. Desroziers. Impact studyof the 2003 north atlantic thorpex regional campaign.
Quarterly Journal of theRoyal Meteorological Society , 132(615):275–295, 2006.J. Frank and G. A. Gottwald. A note on statistical consistency of numerical integratorsfor multiscale dynamics.
Multiscale Modeling & Simulation , 16(2):1017–1033,2018.G. Froyland, T. Hüls, G. P. Morriss, and T. M. Watson. Computing covariantlyapunov vectors, oseledets vectors, and dichotomy projectors: A comparativenumerical study.
Physica D: Nonlinear Phenomena , 247(1):18–39, 2013.M. Ghil and P. Malanotte-Rizzoli. Data assimilation in meteorology and oceanog-raphy. In
Advances in geophysics , volume 33, pages 141–266. Elsevier, 1991.I. Goodfellow, Y. Bengio, and A. Courville.
Deep learning . MIT press, 2016.G. A. Gottwald, D. Crommelin, and C. Franzke. Stochastic climate theory.
Nonlinearand Stochastic Climate Dynamics , pages 209–240, 2015.C. Grudzien, A. Carrassi, and M. Bocquet. Asymptotic forecast uncertainty and theunstable subspace in the presence of additive model error.
SIAM/ASA Journal onUncertainty Quantification , 6(4):1335–1363, 2018a.C. Grudzien, A. Carrassi, and M. Bocquet. Chaotic dynamics and the role ofcovariance inflation for reduced rank Kalman filters with model error.
NonlinearProcesses in Geophysics , 25(3):633–648, 2018b. hort Title 41
C. Grudzien, M. Bocquet, and A. Carrassi. On the numerical integration of thelorenz-96 model, with scalar additive noise, for benchmark twin experiments.
Geoscientific Model Development , 13(4):1903–1924, 2020.K. S. Gurumoorthy, C. Grudzien, A. Apte, A. Carrassi, and C. K. R. T. Jones. Rankdeficiency of Kalman error covariance matrices in linear time-varying systemwith deterministic evolution.
SIAM Journal on Control and Optimization , 55(2):741–759, 2017.T. M. Hamill, F. Yang, C. Cardinali, and S. J. Majumdar. Impact of targeted win-ter storm reconnaissance dropwindsonde data on midlatitude numerical weatherpredictions.
Monthly weather review , 141(6):2058–2065, 2013.B. R. Hunt, E. Kalnay, E. J. Kostelich, E. Ott, D. J. Patil, T. Sauer, I. Szunyogh, J. A.Yorke, and A. V. Zimin. Four-dimensional ensemble Kalman filtering.
Tellus A ,56(4):273–277, 2004.A. H. Jazwinski.
Stochastic Processes and Filtering Theory . Academic Press,New-York, 1970.R. E. Kalman. A new approach to linear filtering and prediction problems.
Journalof Fluids Engineering , 82:35–45, 1960.E. Kalnay.
Atmospheric modeling, data assimilation and predictability . CambridgeUniversity Press, 2003.P. V. Kuptsov and U. Parlitz. Theory and computation of covariant lyapunov vectors.
Journal of Nonlinear Science , 22(5):727–762, 2012.B. Legras and R. Vautard. A guide to lyapunov vectors. In
ECMWF Workshop onPredictability , pages 135–146, Reading, United-Kingdom, 1996. ECMWF.P. D. Liu and M. Qian.
Smooth ergodic theory of random dynamical systems .Springer, 2006.E. N. Lorenz. Deterministic nonperiodic flow.
Journal of the atmospheric sciences ,20(2):130–141, 1963.E. N. Lorenz. Predictability: A problem partly solved. In
Proc. Seminar on pre-dictability , volume 1, 1996.J. Maclean and E. S. V. Vleck. Particle filters for data assimilation based on reducedorder data models, 2019.L. Mitchell and A. Carrassi. Accounting for model error due to unresolved scaleswithin ensemble Kalman filtering.
Quarterly Journal of the Royal MeteorologicalSociety , 141(689):1417–1428, 2015.G. H. C. Ng, D. McLaughlin, D. Entekhabi, and A. Ahanin. The role of modeldynamics in ensemble Kalman filter performance for chaotic systems.
Tellus A ,63(5):958–977, 2011.T. N. Nipen, I. A. Seierstad, C. Lussana, J. Kristiansen, and O. Hov. AdoptingCitizen Observations in Operational Weather Prediction.
Bulletin of the AmericanMeteorological Society , 101(1):E43–E57, 10 2019. ISSN 0003-0007. doi: 10.1175/BAMS-D-18-0237.1.L. Oljača, J. Bröcker, and T. Kuna. Almost sure error bounds for data assimilation indissipative systems with unbounded observation noise.
SIAM Journal on AppliedDynamical Systems , 17(4):2882–2914, 2018. doi: 10.1137/17M1162305.
L. Palatella and F. Grasso. The ekf-aus-nl algorithm implemented without the lineartangent model and in presence of parametric model error.
SoftwareX , 7:28 – 33,2018. ISSN 2352-7110.L. Palatella and A. Trevisan. Interaction of Lyapunov vectors in the formulation ofthe nonlinear extension of the Kalman filter.
Physical Review E , 91:042905, 2015.L. Palatella, A. Carrassi, and A. Trevisan. Lyapunov vectors and assimilation in theunstable subspace: theory and applications.
Journal of Physics A: Mathematicaland Theoretical , 46:254020, 2013.T. Palmer, R. Gelaro, J. Barkmeijer, and R. Buizza. Singular vectors, metrics, andadaptive observations.
Journal of the Atmospheric Sciences , 55(4):633–653, 1998.J. Pathak, Z. Lu, B. R. Hunt, M. Girvan, and E. Ott. Using machine learning toreplicate chaotic attractors and calculate lyapunov exponents from data.
Chaos:An Interdisciplinary Journal of Nonlinear Science , 27(12):121102, 2017.S. Penny, E. Bach, K. Bhargava, C.-C. Chang, C. Da, L. Sun, and T. Yoshida.Strongly coupled data assimilation in multiscale media: Experiments using aquasi-geostrophic coupled model.
Journal of Advances in Modeling Earth Sys-tems , 11(6):1803–1829, 2019.S. G. Penny. Mathematical foundations of hybrid data assimilation from a synchro-nization perspective.
Chaos: An Interdisciplinary Journal of Nonlinear Science ,27(12):126801, 2017.S. G. Penny and T. M. Hamill. Coupled data assimilation for integrated earth systemanalysis and prediction.
Bulletin of the American Meteorological Society , 98(7):ES169–ES172, 2017.A. Pikovsky and A. Politi.
Lyapunov exponents: a tool to explore complex dynamics .Cambridge University Press, 2016.H. Poincaré.
Les méthodes nouvelles de la mécanique céleste. Tome III . GAUTHIER-VILLARS, 1899.R. Potthast, A. Walter, and A. Rhodin. A localized adaptive particle filter within anoperational NWP framework.
Monthly Weather Review , 147(1):345–362, 2019.P. N. Raanes, A. Carrassi, and L. Bertino. Extending the square root method toaccount for additive forecast noise in ensemble methods.
Monthly Weather Review ,143(10):3857–3873, 2015.P. N. Raanes, M. Bocquet, and A. Carrassi. Adaptive covariance inflation in theensemble kalman filter by gaussian scale mixtures.
Quarterly Journal of theRoyal Meteorological Society , 145(718):53–75, 2019.B. B. Reinhold and R. T. Pierrehumbert. Dynamics of weather regimes: Quasi-stationary waves and blocking.
Monthly Weather Review , 110(9):1105–1145,1982.D. Ruelle. Ergodic theory of differentiable dynamical systems.
Inst. Hautes ÉtudesSci. Publ. Math. , 50(50):27–58, 1979. ISSN 0073-8301.P. Sakov and L. Bertino. Relation between two common localisation methods forthe enkf.
Computational Geosciences , 15(2):225–237, 2011.P. Sakov and P. R. Oke. A deterministic formulation of the ensemble Kalman filter:an alternative to ensemble square root filters.
Tellus A , 60(2):361–371, 2008a. hort Title 43
P. Sakov and P. R. Oke. Implications of the form of the ensemble transformationin the ensemble square root filters.
Monthly Weather Review , 136(3):1042–1053,2008b.P. Sakov, J. M. Haussaire, and M. Bocquet. An iterative ensemble Kalman filter inpresence of additive model error.
Quarterly Journal of the Royal MeteorologicalSociety , 2018.I. Shimada and T. Nagashima. A numerical approach to ergodic problem of dis-sipative dynamical systems.
Progress of Theoretical Physics , 61(6):1605–1616,1979.C. Snyder. Summary of an informal workshop on adaptive observations and fastex.
Bulletin of the American Meteorological Society , 77(5):953–961, 1996.C. Snyder, T. Bengtsson, P. Bickel, and J. Anderson. Obstacles to high-dimensionalparticle filtering.
Monthly Weather Review , 136(12):4629–4640, 2008.C. Snyder, T. Bengtsson, and M. Morzfeld. Performance bounds for particle filtersusing the optimal proposal.
Monthly Weather Review , 143(11):4750–4761, 2015.I. Szunyogh, Z. Toth, A. V. Zimin, S. J. Majumdar, and A. Persson. Propagation of theeffect of targeted observations: The 2000 winter storm reconnaissance program.
Monthly weather review , 130(5):1144–1165, 2002.P. D. Thompson. Uncertainty of initial state as a factor in the predictability oflarge scale atmospheric flow patterns.
Tellus , 9(3):275–295, 1957. doi: 10.1111/j.2153-3490.1957.tb01885.x.M. Tondeur, A. Carrassi, S. Vannitsem, and M. Bocquet. On temporal scale sepa-ration in coupled data assimilation with the ensemble kalman filter.
Journal ofStatistical Physics , 179:1161–1185, 2020.Z. Toth and E. Kalnay. Ensemble forecasting at nmc: The generation of perturbations.
Bulletin of the american meteorological society , 74(12):2317–2330, 1993.Z. Toth and E. Kalnay. Ensemble forecasting at NCEP and the breeding method.
Monthly Weather Review , 125(12):3297–3319, 1997.A. Trevisan and R. Legnani. Transient error growth and local predictability: A studyin the lorenz system.
Tellus A , 47(1):103–117, 1995.A. Trevisan and L. Palatella. On the Kalman filter error covariance collapse into theunstable subspace.
Nonlinear Processes in Geophysics , 18(2):243–250, 2011.A. Trevisan and F. Uboldi. Assimilation of standard and targeted observations withinthe unstable subspace of the observation-analysis-forecast cycle.
Journal of theAtmospheric Sciences , 61:103–113, 2004a.A. Trevisan and F. Uboldi. Assimilation of standard and targeted observations withinthe unstable subspace of the observation–analysis–forecast cycle system.
Journalof the Atmospheric Sciences , 61(1):103–113, 2004b.A. Trevisan, M. D’Isidoro, and O. Talagrand. Four-dimensional variational assim-ilation in the unstable subspace and the optimal subspace dimension.
Q. J. R.Meteorol. Soc. , 136:487–496, 2010.F. Uboldi and A. Trevisan. Detecting unstable structures and controlling error growthby assimilation of standard and adaptive observations in a primitive equation oceanmodel.
Nonlinear Processes in Geophysics , 16:67–81, 2006.
F. Uboldi, A. Trevisan, and A. Carrassi. Developing a dynamically based assim-ilation method for targeted and standard observations.
Nonlinear Processes inGeophysics , 12(1):149–156, 2005. doi: 10.5194/npg-12-149-2005.G. K. Vallis.
Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Circulation . Cambridge University Press, 2 edition, 2017. doi: 10.1017/9781107588417.S. Vannitsem. Predictability of large-scale atmospheric motions: Lyapunov expo-nents and error dynamics.
Chaos: An Interdisciplinary Journal of NonlinearScience , 27(3):032101, 2017.S. Vannitsem and W. Duan. On the use of near-neutral backward lyapunov vectorsto get reliable ensemble forecasts in coupled ocean–atmosphere systems.
ClimateDynamics , 55:1125–1139, 2020. doi: 10.1007/s00382-020-05313-3.S. Vannitsem and V. Lucarini. Statistical and dynamical properties of covariantlyapunov vectors in a coupled atmosphere-ocean model—multiscale effects, geo-metric degeneracy, and error dynamics.
Journal of Physics A: Mathematical andTheoretical , 49(22):224001, 2016.S. Vannitsem, J. Demaeyer, L. De Cruz, and M. Ghil. Low-frequency variabilityand heat transport in a low-order nonlinear coupled ocean–atmosphere model.