[PDF] Data assimilation for chaotic dynamics

Abstract

Chaos is ubiquitous in physical systems. The associated sensitivity to initial conditions is a significant obstacle in forecasting the weather and other geophysical fluid flows. Data assimilation is the process whereby the uncertainty in initial conditions is reduced by the astute combination of model predictions and real-time data. This chapter reviews recent findings from investigations on the impact of chaos on data assimilation methods: for the Kalman filter and smoother in linear systems, analytic results are derived; for their ensemble-based versions and nonlinear dynamics, numerical results provide insights. The focus is on characterising the asymptotic statistics of the Bayesian posterior in terms of the dynamical instabilities, differentiating between deterministic and stochastic dynamics. We also present two novel results. Firstly, we study the functioning of the ensemble Kalman filter in the context of a chaotic, coupled, atmosphere-ocean model with a quasi-degenerate spectrum of Lyapunov exponents, showing the importance of having sufficient ensemble members to track all of the near-null modes. Secondly, for the fully non-Gaussian method of the particle filter, numerical experiments are conducted to test whether the curse of dimensionality can be mitigated by discarding observations in the directions of little dynamical growth of uncertainty. The results refute this option, most likely because the particles already embody this information on the chaotic system. The results also suggest that it is the rank of the unstable-neutral subspace of the dynamics, and not that of the observation operator, that determines the required number of particles. We finally discuss how knowledge of the random attractor can play a role in the development of future data assimilation schemes for chaotic multiscale systems with large scale separation.

Full PDF

DData assimilation for chaotic dynamics

Alberto Carrassi, Marc Bocquet, Jonathan Demaeyer, Colin Grudzien, PatrickRaanes and Stephane Vannitsem

Abstract

Chaos is ubiquitous in physical systems. The associated sensitivity to initialconditions is a signiﬁcant obstacle in forecasting the weather and other geophysi-cal ﬂuid ﬂows. Data assimilation is the process whereby the uncertainty in initialconditions is reduced by the astute combination of model predictions and real-timedata. This chapter reviews recent ﬁndings from investigations on the impact of chaoson data assimilation methods: for the Kalman ﬁlter and smoother in linear systems,analytic results are derived; for their ensemble-based versions and nonlinear dynam-ics, numerical results provide insights. The focus is on characterizing the asymptoticstatistics of the Bayesian posterior in terms of the dynamical instabilities, diﬀeren-tiating between deterministic and stochastic dynamics. We also present two novelresults. Firstly, we study the functioning of the ensemble Kalman ﬁlter (EnKF) in thecontext of a chaotic, coupled, atmosphere-ocean model with a quasi-degenerate spec-trum of Lyapunov exponents, showing the importance of having suﬃcient ensemble

Alberto CarrassiDept. of Meteorology and National Centre for Earth Observation, University of Reading, Reading,UK,Mathematical Institute, University of Utrecht, Netherlands e-mail: [email protected]

Marc BocquetCEREA, joint laboratory École des Ponts ParisTech and EDF R&D, Universitè Paris-Est, Francee-mail: [email protected]

Jonathan DemaeyerRoyal Meteorological Institute of Belgium, Brussels, Belgium e-mail: [email protected]

Colin GrudzienDepartment of Mathematics and Statistics, University of Nevada, Reno, Reno, Nevada, USA e-mail: [email protected]

Patrick RaanesNORCE, Bergen, Norway e-mail: [email protected]

Stephane VannitsemRoyal Meteorological Institute of Belgium, Brussels, Belgium e-mail: [email protected] a r X i v : . [ phy s i c s . d a t a - a n ] N ov Authors Suppressed Due to Excessive Length members to track all of the near-null modes. Secondly, for the fully non-Gaussianmethod of the particle ﬁlter, numerical experiments are conducted to test whether thecurse of dimensionality can be mitigated by discarding observations in the directionsof little dynamical growth of uncertainty. The results refute this option, most likelybecause the particles already embody this information on the chaotic system. Theresults also suggest that it is the rank of the unstable-neutral subspace of the dynam-ics, and not that of the observation operator, that determines the required numberof particles. We ﬁnally discuss how knowledge of the random attractor can play arole in the development of future data assimilation schemes for chaotic multiscalesystems with large scale separation.

This chapter attempts a uniﬁed and comprehensive discussion of a number of studiesthat in about a decade have contributed to shape our nowadays’s understanding ofthe implications, impacts and consequences for data assimilation when is applied tochaotic dynamics. The chapter presents a review of the essential results appeared indiﬀerent studies, but for the ﬁrst time together in a coherent treatment. In additionwe present new original ﬁndings addressing two key aspects that were not coveredin previous studies. The ﬁrst treats the impact of data assimilation on a chaoticand multiscale system, the second concerns the consequences for nonlinear, non-Gaussian, data assimilation (a particle ﬁlter) in face of the chaotic nature of theunderlying dynamics.The exposition is organised as follows. We ﬁrst discuss, in Sect. 2, the chaoticcharacter of atmospheric and oceanic ﬂows, and provide a treatment of the keymathematical concepts and tools that allow for characterising chaos and to “measure”the degree of instabilities. In doing so, we review classical results from dynamicalsystem theory, including the multiplicative ergodic theorem and associated deﬁnitionof Lyapunov (forward, backward and covariant) vectors and exponents.Section 3 analyses how a chaotic dynamics impacts data assimilation. Our focusis on Kalman ﬁlter (KF) and smoother (KS) in Sect. 3.1, and on their ensemble-based formulations, the ensemble Kalman ﬁlter (EnKF) and smoother (EnKS), inSect. 3.2. Section 3.1 contains primarily analytic results and treats KF and KS inlinear systems, either purely deterministic (Sect. 3.1.1) or with stochastic additivenoise (Sect. 3.1.2). Section 3.2 is dedicated to nonlinear systems (deterministic, inSect. 3.2.1 and stochastic in Sect. 3.2.2) and on how the EnKF (Evensen, 2009a)and the extended Kalman ﬁlter (EKF, Ghil and Malanotte-Rizzoli (1991)) works inthis scenario. In Sect. 3.2.1 we present original results on the impact of chaos on theperformance of the EnKF in a coupled atmosphere-ocean model which possesses adegenerate-like spectrum of Lyapunov exponents, disentangling on the role of thequasi-null exponents.Section 4 reverses the perspective and instead of studying the eﬀect of chaos ondata assimilation, reports on how properties of the dynamics have been used to devise hort Title 3 adaptive observation strategies and ad-hoc data assimilation methods. In particular,in Sect. 4.2, we succinctly review the assimilation in the unstable subspace (AUS,Palatella et al. (2013)), a known approach that exploits the unstable-neutral subspaceof the dynamics to perform the analysis. Remarkably, AUS was conceived beforethe ﬁndings reviewed in Sect. 3. The latter work was inspired by these early studies,and the need to provide mathematical rigor to clarify the mechanisms that madethese early studies successful. In turn, the later work has furthermore provided theframework to generalize these early ideas to a variety of other types of dynamics.Section 5 presents what we consider two main areas of future developments. InSect. 5 the paradigm of AUS is incorporated within a fully nonlinear data assimilationscheme, the particle ﬁlter ( ? ). It is shown that observing in the directions of instabil-ities is eﬀective but also that it is not deleterious to observe the stable directions. Asopposed to the data or model sizes, our results suggest that the number of particlesrequired to achieve good results scales with the size of the unstable-neutral subspace.Section 5.2 discusses how the concept of random attractor can oﬀer novel ways tohandle the data assimilation problem in stochastic chaotic multi-scale systems withlarge scale separation. It also treats the implications of the numerical scheme on theoutput of the data assimilation cycle and ensemble-based forecasts, as well as posessome key questions for future studies.Final conclusions and a summary are drawn in Sect. 6. The atmosphere and the ocean are ﬂuids that are described by the set of classicalconservation laws of hydrodynamics, including the conservation of mass, momentumand energy (Vallis, 2017). For the atmosphere, these are often complemented by theconservation of moisture present in the air. These laws lead to a set of local dynamicalequations describing the motion of each parcel of ﬂuid. Given that these equations arenonlinear with complex interactions with the boundaries, realistic solutions cannotbe obtained analytically and one must rely on numerical simulations starting fromappropriate initial conditions.Numerical simulations are based on discrete approximations in space and timeof the dynamical equations, and are often accompanied with simpliﬁcations ofthe equations in order to describe appropriately the scales of interest. One of themost famous approximations is the geostrophic approximation which assumes abalance between the horizontal velocity ﬁeld and the horizontal pressure gradient.Geostrophic balance is a good approximation for both the ocean and the atmospherealbeit at diﬀerent spatial scales. The geostrophic approximation is at the basis of themodels that will be used later in Sect. 3.2.1Whatever the scale at which the atmospheric or the ocean ﬂuids are observed,they display an apparently erratic evolution. This erratic behavior is also presentin atmospheric, ocean and climate models when appropriate forcing are imposed.This feature should not be confused with randomness as most of the models used

Authors Suppressed Due to Excessive Length since the start of numerical modelling were purely deterministic. Lorenz (1963)showed in a simple low-order model that this erratic behavior is concomitant withthe property of sensitivity to initial conditions, by which whatever small an errorin the initial condition is, it will increase rapidly in time. This property impliesthat given the inevitable error in the initial conditions, any forecast will ultimatelybecome useless as the error ﬁnally reaches an amplitude of the same order as thenatural variability of the variable considered. The sensitivity to initial conditionsand the consequent erratic-like evolution are the key properties of deterministicdynamical systems displaying chaos .This behaviour has been found in many atmospheric, oceanic and climate models(see e.g.,

Vannitsem, 2017, for a review). In particular, a detailed investigation ofthe chaotic nature of the coupled ocean-atmosphere system that will be used inSect. 3.2.1 has been performed by Vannitsem et al. (2015).

The notion of sensitivity to initial conditions of dynamical solutions was alreadydiscovered and studied in a mathematical context by Poincaré (1899). In the secondhalf of the 20th century, this property was discovered in models of atmosphericand climate relevance (Thompson, 1957; Lorenz, 1963). Its important practicalimplications drove the regain of interest in developing the appropriate mathematicsin support of its description and understanding. These eﬀorts culminated with thedevelopment of the ergodic theory of deterministic dynamical systems and chaostheory. In the following we shall brieﬂy summarise what we consider to be the keydevelopments that led to the deﬁnition of Lyapunov exponents and vectors.The importance of these mathematical objects stands on their ability to “mea-sure” the degree of instabilities and thus to quantify the aforementioned sensitivity toinitial conditions. While more rigorous mathematical treatments can be found in ap-propriate mathematical literature (see e.g. , Pikovsky and Politi, 2016, and referencestherein), and we will invoke that rigour to a certain extent in the following sections,here we approach the discussion with a physical and intuitive angle. Further detailscan also be found in Legras and Vautard (1996); Barreira and Pesin (2002); Kuptsovand Parlitz (2012).Let us write the evolution laws of a deterministic dynamical system in the formof a set of ordinary diﬀerential equations (ODEs),d x d 𝑡 = f ( x , 𝝈 ) , (1)where x is a vector containing the entire set of relevant variables x = ( 𝑥 , ..., 𝑥 𝑛 ) and 𝝈 represents a set of parameters. The discussion that follows holds also when system(1) explicitly depends on time, thus being non-autonomous, provided it remainsergodic. hort Title 5 Since the process of measurement is always subject to ﬁnite precision, the initialstate is never known exactly. To study the evolution and the implications of such anerror, let us consider an initial state displaced slightly from x by an initial error 𝛿 x .The perturbed initial state, x + 𝛿 x , generates a new trajectory in phase space andone can deﬁne the instantaneous error vector as the vector joining the representativepoints of the reference trajectory and the perturbed one at a given time, 𝛿 x ( 𝑡 ) .Provided that this perturbation is suﬃciently small and smooth, its dynamics can bedescribed by the linearized equation,d 𝛿 x d 𝑡 ≈ 𝜕 F 𝜕 x | x ( 𝑡 ) 𝛿 x , (2)with a formal solution, 𝛿 x ( 𝑡 ) ≈ M ( 𝑡, x ( 𝑡 )) 𝛿 x ( 𝑡 ) . (3)The matrix M is referred to as the “fundamental matrix” and is the resolvent of Eq. (2), i.e. M ( 𝑡, x ( 𝑡 )) = 𝑒 ∫ 𝑡 𝑡 𝜕 F 𝜕 x | x ( 𝑡 ) d 𝑡 . The fundamental matrix contains the information onthe ampliﬁcation of inﬁnitesimally small perturbations.In the context of the ergodic theory of deterministic dynamical systems, theOseledets theorem (Kuptsov and Parlitz, 2012) shows that the limit of the matrix ( M (cid:62) M ) /[ ( 𝑡 − 𝑡 ) ] , for time going to inﬁnity, exists; let us refer to this limitingmatrix as S . The logarithm of its eigenvalues are called the Lyapunov exponents (LEs), whereas the full set of LEs is called the Lyapunov spectrum, and is usuallyrepresented in decreasing order. The eigenvectors of S , which are local properties ofthe ﬂow (they change along the trajectory thus being time-dependent) and dependon the initial time 𝑡 , are called the forward Lyapunov vectors (FLVs) (Legras andVautard, 1996).The LEs are a powerful tool to “measure” chaos and to characterise the degreeof instability of a system. For instance, LEs are averaged (asymptotic) indicatorsof exponential growth (LE>0) or decay (LE<0) of perturbations under the tangent-linear model. A deterministic chaotic system is uniquely characterised by having atleast its leading LE larger than zero, i.e. LE >

0. The sum of the LEs is equal tothe average divergence of ﬂow generated by Eq. (1) (see e.g.

Pikovsky and Politi(2016)). This means that in dissipative (conservative, e.g.

Hamiltonian) systemsthe sum of the LEs is negative (zero): volumes in the phase space of a dissipative(conservative) dynamics reduce (are conserved) on average with time. Furthermore,autonomous continuous-in-time systems generally possess at least one LE=0, unlessthey converge to a motion-less state. This null exponent is related to perturbationsaligned to the system’s velocity vector ( i.e. 𝛿 x = f ), so although they shall ﬂuctuatedepending on the local ﬂow, they will not on average decay nor growth. These lasttwo properties can also be used as a check for numerical accuracy when computingthe LEs.The multiplicative ergodic theorem (MET) (Barreira and Pesin, 2002, Theo-rem 2.1.2) guarantees that, under general hypotheses, the eigenvalues of the matrix ( M (cid:62) M ) /[ ( 𝑡 − 𝑡 ) ] obtained for 𝑡 → ∞ are equivalent to the ones of the matrix S (cid:48) = ( MM (cid:62) ) /[ ( 𝑡 − 𝑡 ) ] when 𝑡 → −∞ . The equivalence of the spectrum of S and S (cid:48) Authors Suppressed Due to Excessive Length is not generically true for the fundamental matrix of an arbitrary, linear dynamicalsystem; however, the MET guarantees that this is a fairly generic property of theresolvent of the tangent-linear model of a nonlinear dynamical system . If the ﬂowof the time derivative f is a C diﬀeomorphism of a compact, smooth, Riemannianmanifold 𝑀 , then the MET assures that the LEs are deﬁned equivalently by thelog-eigenvalues of S or S (cid:48) for any initial condition x ( 𝑡 ) and that the LEs are uniqueon a subset of 𝑀 of full measure with respect to any ergodic invariant measure 𝜇 of the ﬂow. Note that, without the condition of ergodicity of 𝜇 , the LEs may bewell-deﬁned point-wise, but the speciﬁc values and their multiplicity may dependon the initial condition x ( 𝑡 ) .The matrices S and S (cid:48) are symmetric. However, contrary to the eigenvalues, theeigenvectors of these two matrices are not equivalent due to the asymmetric char-acter of the fundamental matrix M in forward- and reverse-time. The eigenvectorsof the latter are called the backward Lyapunov vectors (BLVs). Theoretically, eachmatrix can be evaluated at the same place along the reference trajectory x ( 𝑡 ) andtheir orthogonal eigenvectors can be computed as L f ,𝑖𝑡 and L b ,𝑖𝑡 for S and S (cid:48) , respec-tively, where it is understood that the time-dependence on 𝑡 is with respect to thelinearization of the dynamics at x ( 𝑡 ) . There exist Oseledet subspaces 𝑊 𝑖𝑡 , 𝑊 𝑖𝑡 = L b , 𝑡 ⊕ ... ⊕ L b ,𝑖𝑡 ∩ L f ,𝑖𝑡 ⊕ ... ⊕ L f ,𝑁𝑡 , (4)with ⊕ being the direct product (Ruelle, 1979), that have the important properties ofbeing invariant under the eﬀect of the fundamental matrix, such that M ( 𝜏, x ( 𝑡 )) 𝑊 𝑖𝑡 = 𝑊 𝑖𝜏 . (5)Due to their orthogonal nature, the FLVs and the BLVs require by deﬁnition thechoice of a norm and of an inner product. Nevertheless, the Oseledet subspacesthemselves do not have this dependence; in this way, they can be considered to embedmore invariant information about the dynamics. The decomposition of the tangent-linear space into these covariant subspaces is commonly known as the “Oseledetsplitting” or decomposition. The classical form of the MET thus guarantees that theOseledet splitting is well-deﬁned and consistent with probability one over all initialconditions of the attractor, with respect to the invariant, ergodic measure. Othercovariant splittings of the tangent-linear model, such as by exponential dichotomy,exist under more general forms of the MET (Froyland et al., 2013).When the Lyapunov spectrum is non-degenerate, one can deﬁne a time-varyingbasis, subordinate to the Oseledet spaces, 𝑊 𝑖𝑡 = span (cid:8) L c ,𝑖𝑡 (cid:9) , such that M ( 𝜏, x ( 𝑡 )) L c ,𝑖𝑡 = 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) L c ,𝑖𝜏 , (6)where (cid:107) L c ,𝑖𝑡 (cid:107) = (cid:107) L c ,𝑖𝜏 (cid:107) , and 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) ∈ R describes an ampliﬁcation factor. Thevectors L c ,𝑖𝑡 are known as the covariant Lyapunov vectors (CLVs). In the long-timelimit, the ampliﬁcations 𝛼 𝑖 ( 𝜏, x ( 𝑡 )) can be associated to the LEs as, hort Title 7 ± 𝜆 𝑖 = lim 𝜏 →±∞ 𝜏 𝑙𝑛 | 𝛼 𝑖 ( 𝜏, x ( 𝑡 ))| = lim 𝜏 →±∞ 𝜆 𝜏𝑖 ( x ( 𝑡 )) (7)where we indicate by 𝜆 𝜏𝑖 ( x ( 𝑡 )) the average of the growth rate taken over a timewindow 𝜏 starting at 𝑡 at position x ( 𝑡 )) , and are commonly known as the localLyapunov exponents (LLEs) (Pikovsky and Politi, 2016, see chapter 5).Throughout the chapter we will denote the matrix with columns correspondingto the full tangent linear space ordered basis of BLVs/FLVs/CLVs at time 𝑡 as L e 𝑡 for e ∈ { b , f , c } respectively. A sub-slice of this matrix of Lyapunov vectorscorresponding, inclusively, to columns 𝑖 through 𝑗 will be denoted L e ,𝑖 : 𝑗𝑡 . In thefollowing, the relationships between these Lyapunov vectors and their asymptoticdynamics will be key to understanding the predictability of chaotic systems. For thispurpose, we will largely use the BLVs and the CLVs, and to a lesser extent the FLVs.Note that the basis (cid:8) L c ,𝑖𝑡 (cid:9) 𝑁 𝑥 𝑖 = is not orthogonal in general and, in fact, the anglesbetween any two Oseledec subspaces 𝑊 𝑖 and 𝑊 𝑗 are not in general bounded awayfrom zero in their limits in forward- or reverse-time. For this reason, coordinatetransformations to covariant Oseledet bases are not generally numerically well-conditioned asymptotically, and therefore hard to compute. The property of integralseparation (Dieci and Van Vleck, 2002), describing the stability of the Lyapunovspectrum under bounded perturbations of the tangent-linear equations, ensures thatthe angles between the covariant subspaces will remain bounded away from zero,but this is a strong condition and it is not as generic of a property as the existence ofthe Oseledet decomposition under the MET. However, if a dynamical system is inte-grally separated as above, there exists a well-deﬁned, numerically well-conditionedtransformation of coordinates of the tangent-linear model for which the action of theresolvent M can be expressed as a block-diagonal matrix with each block describingthe invariant dynamics of a single Oseledet space. For degenerate spectrum, theseblocks may be upper-triangular, but for non-degenerate spectrum this representationof the resovlent becomes a strictly diagonal matrix; see Theorem 5.4.9 of Adrianova(1995) for the classical result, or Theorem 5.1 of Dieci and Van Vleck (2007) andFroyland et al. (2013) for more recent extensions. The high sensitivity of a chaotic dynamical system to the initial condition makes ithard to forecast it, even when their evolution equations are perfectly known. Indeedthe typical error grows exponentially over time with a rate given by the largestpositive LE. With a view to forecasting, one has no choice but to regularly correctits trajectory using information on the system state obtained through observations.This is the primary goal of data assimilation (DA) and has been key to the successof numerical weather forecasting. We refer to Kalnay (2003); Asch et al. (2016) andreferences therein for textbooks and reviews on geophysical DA.

Authors Suppressed Due to Excessive Length

When framed in a Bayesian formalism, the goal of DA is to estimate the condi-tional probability density function (pdf) of the system state knowing observationsof that system. For high-dimensional systems, only approximations of this pdf canbe obtained, among which the Gaussian approximation is the most practical andcommon (Carrassi et al., 2018). In the case of Gaussian statistics of the error andlinear dynamics, the conditional pdf can be obtained analytically: it is Gaussian, andcan be sequentially computed using the Kalman ﬁlter (KF) (Kalman, 1960). Directlyinspired from the Kalman ﬁlter, the extended Kalman ﬁlter (EKF) oﬀers an approx-imate DA scheme when the operators are nonlinear (Ghil and Malanotte-Rizzoli,1991). However, it can hardly be used in the context of high-dimensional systems,where it has to be replaced with the ensemble Kalman ﬁlter (EnKF) (Evensen,2009b). In the EnKF, the covariance matrices are represented by state perturbationswhich are representative of the errors and, together with the state mean, form alimited-size ensemble of state vectors.Understanding how the ensemble of the EnKF evolves under the action of theDA scheme and the forecast model dynamics is important. Several numerical resultssuggest that the skills of ensemble-based DA methods in chaotic systems are related tothe instabilities of the underlying dynamics (Ng et al., 2011). Numerical evidencesexist that some asymptotic properties of the ensemble-based covariance matrices(rank, span, range) relate to the unstable modes of the dynamics (Sakov and Oke,2008a; Carrassi et al., 2009). Nevertheless, a better, more profound, understandingof these results was needed to aim at designing reduced rank, computationally cheap,formulations of the ﬁlters.Analytic results that have shed lights on the behaviour of ﬁlters and smootherson chaotic dynamics, and explained the numerically observed properties, have beenobtained for linear dynamics. Section 3.1.1 reviews those ﬁndings in the case ofdeterministic systems without model error, based on the work by Gurumoorthy et al.(2017) and Bocquet et al. (2017). Section 3.1.2 treats their extension to the case ofstochastic systems, following the work by Grudzien et al. (2018a,b). In the case ofnonlinear dynamics, robust numerical evidences will be described by the originalexperiments in Sect. 3.2, that extent the previous ﬁndings by Bocquet and Carrassi(2017).

At time 𝑡 𝑘 , let x 𝑘 ∈ R 𝑁 x and y 𝑘 ∈ R 𝑁 y be the state and observation vector, respec-tively. Let us assume linear evolution model dynamics M 𝑘 and observation model H 𝑘 such that hort Title 9 x 𝑘 = M 𝑘 x 𝑘 − + w 𝑘 , (8a) y 𝑘 = H 𝑘 x 𝑘 + v 𝑘 . (8b)The model and observation noises, w 𝑘 and v 𝑘 , are assumed mutually independent,zero-mean Gaussian white sequences with statisticsE [ v 𝑘 v (cid:62) 𝑙 ] = 𝛿 𝑘,𝑙 R 𝑘 , E [ w 𝑘 w (cid:62) 𝑙 ] = 𝛿 𝑘,𝑙 Q 𝑘 , E [ v 𝑘 w (cid:62) 𝑙 ] = . (9)In the Kalman ﬁlter, which yields the optimal DA solution with such assumptions,the forecast error covariance matrix P 𝑘 satisﬁes the recurrence P 𝑘 + = M 𝑘 + ( I + P 𝑘 𝛀 𝑘 ) − P 𝑘 M (cid:62) 𝑘 + + Q 𝑘 + , (10)where 𝛀 𝑘 ≡ H (cid:62) 𝑘 R − 𝑘 H 𝑘 , (11)is the precision matrix of the observations mapped into state space. In the absenceof model error, i.e. Q 𝑘 ≡ , Gurumoorthy et al. (2017) proved rigorously that in thefull-rank KF (in particular P is full rank), P 𝑘 collapses onto the unstable-neutralsubspace.We shall summarise in the following some key results that apply to the full-rankbut also to the degenerate case, i.e. even if P is of arbitrary rank and the initial errorsof the DA scheme only lie in a subspace of R 𝑁 x . In particular they apply to the EnKFif the dynamical and observations operators are both linear. We recall them here, butreaders can ﬁnd full details in Bocquet et al. (2017); Bocquet and Carrassi (2017). Result 1 : Bound on the free forecast error covariance matrixLet us deﬁne the resolvent of the dynamics from 𝑡 𝑙 to 𝑡 𝑘 as M 𝑘 : 𝑙 = M 𝑘 M 𝑘 − · · · M 𝑙 + ,with the convention that M 𝑘,𝑘 = I . The ﬁrst key result is the following inequality inthe set of the semi-deﬁnite symmetric matrices: P 𝑘 ≤ M 𝑘 :0 P M (cid:62) 𝑘 :0 + 𝚵 𝑘 (cid:67) P free 𝑘 , (12)where 𝚵 ≡ and for 𝑘 ≥ 𝚵 𝑘 ≡ 𝑘 ∑︁ 𝑙 = M 𝑘 : 𝑙 Q 𝑙 M (cid:62) 𝑘 : 𝑙 , (13)is known as the controllability matrix (Jazwinski, 1970), and the suﬃx “free” is usedto refer to the forecast error of the system unconstrained by data. In the absence ofmodel noise ( Q 𝑘 ≡ for the rest of this section), it reads P 𝑘 ≤ M 𝑘 :0 P M (cid:62) 𝑘 :0 . (14)Assuming the dynamics to be non-singular, the column subspace of the forecast errorcovariance matrix satisﬁes Im ( P 𝑘 ) = M 𝑘 :0 ( Im ( P )) . (15)If 𝑛 is the dimension of the unstable-neutral subspace of the dynamics, it can furtherbe shown that lim 𝑘 →∞ rank ( P 𝑘 ) ≤ min { rank ( P ) , 𝑛 } , (16)which is a ﬁrst proof of the collapse of the error covariance matrix (actually itscolumn space) onto the unstable and neutral subspace of the dynamics. Result 2 : Collapse onto the unstable subspaceLet 𝜎 𝑘𝑖 , for 𝑖 = , . . . , 𝑁 x denote the eigenvalues of P 𝑘 ordered as 𝜎 𝑘 ≥ 𝜎 𝑘 · · · ≥ 𝜎 𝑘𝑁 x . It was shown that 𝜎 𝑘𝑖 ≤ 𝛽 𝑖 exp (cid:16) 𝜆 𝑘𝑖 𝑘 (cid:17) , (17)for some 𝛽 𝑖 >

0, where 𝜆 𝑘𝑖 is a log-singular value of M 𝑘 :0 that converges to the LE 𝜆 𝑖 . This gives an upper bound for all eigenvalues of P 𝑘 and a rate of convergence forthe 𝑁 x − 𝑛 smallest ones. Moreover, if P 𝑘 is uniformly bounded, it can further beshown that the stable subspace of the dynamics is asymptotically in the null spaceof P 𝑘 , i.e. , lim 𝑘 →∞ (cid:13)(cid:13)(cid:13) P 𝑘 L b ,𝑖𝑘 (cid:13)(cid:13)(cid:13) = 𝑖 > 𝑛 ; this extends to any norm and linear combination of these vectors. Result 3 : Explicit dependence of P 𝑘 on P To study the dependence of P 𝑘 on P , it has been shown that the forecast errorcovariance matrix can be written as P 𝑘 = M 𝑘 :0 P M (cid:62) 𝑘 :0 (cid:0) I + 𝚪 𝑘 M 𝑘 :0 P M (cid:62) 𝑘 :0 (cid:1) − , (19)where the matrix 𝚪 𝑘 ≡ 𝑘 − ∑︁ 𝑙 = M −(cid:62) 𝑘 : 𝑙 𝛀 𝑙 M − 𝑘 : 𝑙 , (20)is known as the information matrix and it measures the observability of the systemby propagating the precision matrices 𝛀 𝑙 up to 𝑡 𝑘 .An alternative formulation of Eq. (19) is P 𝑘 = M 𝑘 :0 P [ I + 𝚯 𝑘 P ] − M (cid:62) 𝑘 :0 , (21)where hort Title 11 𝚯 𝑘 ≡ M (cid:62) 𝑘 :0 𝚪 𝑘 M 𝑘 :0 = 𝑘 − ∑︁ 𝑙 = M (cid:62) 𝑙 :0 𝛀 𝑙 M 𝑙 :0 , (22)is also an information matrix, directly related to the observability of the DA system,but pulled back at the initial time 𝑡 . Equation (21) is of key importance because itallows to study the asymptotic behaviour of P 𝑘 , i.e. the ﬁlter “believed” error, usingthe asymptotic properties of the dynamics. Result 4 : Asymptotics of P 𝑘 For P 𝑘 to forget about P as 𝑘 tends to inﬁnity, it was shown that one can impose thefollowing suﬃcient conditions: • Condition 1:

Recall that the FLVs at 𝑡 associated to the unstable and neutralexponents are the columns of L f , 𝑛 ∈ R 𝑁 x × 𝑛 . Moreover, let deﬁne the anomalymatrix X ∈ R 𝑁 x × 𝑛 such that P = XX T . The condition readsrank (cid:16) X (cid:62) L f , 𝑛 (cid:17) = 𝑛 . (23)In practice the initial ensemble anomalies X projects onto the ﬁrst 𝑛 FLVs at 𝑡 . • Condition 2:

The model is suﬃciently observed so that the unstable and neutraldirections remain under control, i.e. , there exits 𝜀 > (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 > 𝜀 I . (24) • Condition 3:

Furthermore, for any neutral BLV we havelim 𝑘 →∞ (cid:16) L b ,𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b ,𝑛 𝑘 = ∞ , (25)implying that the neutral direction is well observed and controlled.Under these three conditions, we obtainlim 𝑘 →∞ (cid:40) P 𝑘 − L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) (cid:41) = . (26)Hence, the asymptotic sequence does not depend on P , but only 𝚪 𝑘 , i.e. the dynam-ics. It can also be shown that the neutral modes have a peculiar role and a long lastinginﬂuence: their inﬂuence on the current estimate decreases sub-exponentially. Result 5 : From the degenerate KF to the square-root EnKF and EnKSThe recurrence Eq. (21) can be reformulated in a factorised form which is suited tothe square-root EnKF. The standard perturbation decomposition of the forecast error covariance is P 𝑘 = X 𝑘 X (cid:62) 𝑘 , (27)where X 𝑘 is the matrix of centred perturbations (the anomaly matrix aforemen-tioned). A right-transform update formula can then be obtained from (21): X 𝑘 = M 𝑘 :0 X (cid:2) I + X (cid:62) 𝚯 𝑘 X (cid:3) − / 𝚼 𝑘 , (28)where 𝚼 𝑘 is an arbitrary orthogonal matrix such that 𝚼 𝑘 = , where is the vector [ . . . ] (cid:62) deﬁned in the ensemble subspace. It is equivalent to the left-transformupdate formula X 𝑘 = (cid:2) I + M 𝑘 :0 P M (cid:62) 𝑘 :0 𝚪 𝑘 (cid:3) − / M 𝑘 :0 X 𝚼 𝑘 . (29)The importance of Eqs. (28) and (29) stands on the fact that with linear models,Gaussian observation and initial errors, the (square-root) degenerate KF (with X ∈ R 𝑁 𝑥 × 𝑛 and 𝑛 < 𝑁 𝑥 ) is equivalent to the square-root EnKF and can serve as a proxyto the EnKF applied to nonlinear models.All of these results can be generalised to linear smoothers. In particular, thesmoother forecast error covariance matrix is similar to that of the ﬁlter, i.e. given by(19) (or (21)) but with the following modiﬁed information matrix: (cid:98) 𝚪 𝑘 = 𝚪 𝑘 + 𝑘 + 𝐿 − 𝑆 ∑︁ 𝑙 = 𝑘 M − T 𝑘 : 𝑙 𝛀 𝑘 M − 𝑘 : 𝑙 , (30)where 𝐿 is the lag of the smoother (how far in the past observations are accountedfor) and 𝑆 tells by how many time steps the smoother’s window is shifted betweentwo consecutive updates. Note that (cid:98) 𝚪 𝑘 ≥ 𝚪 𝑘 (using the Loewner order on the set ofsemi-deﬁnite positive matrices) reﬂecting the general higher amount of informationincorporated within a smoother update relative to a ﬁlter. Therefore the asymptoticsequences for the ﬁlter (right-hand side) and smoother (left-hand-side) follow theinequality: L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) (cid:98) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) ≤ L b , 𝑛 𝑘 (cid:20)(cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) 𝚪 𝑘 L b , 𝑛 𝑘 (cid:21) − (cid:16) L b , 𝑛 𝑘 (cid:17) (cid:62) . (31)The linear smoothers can serve as a proxy to the ensemble Kalman smoother(Evensen, 2009b) or the iterative ensemble Kalman smoother (IEnKS) (Bocquet andSakov, 2014) applied to nonlinear models (Bocquet and Carrassi, 2017). With theIEnKS, where the true state trajectory is even better estimated and the errors arereduced, the collapse of the perturbations onto the unstable and neutral subspace isexpected to be even faster. hort Title 13 The above perfect-deterministic model conﬁguration demonstrates how chaos shapesthe inferences of the posterior. Asymptotic statistics are determined by the ability ofthe ﬁlter to control the growth of initial error in the unstable-neutral subspace withrespect to its sensitivity to observations therein. However, in realistic DA, additionalforecast errors are introduced throughout the forecast cycle due to the inadequacyof numerical models in representing reality. One classical approach to treat thesemodel errors is to represent them as additive or multiplicative noise (Jazwinski,1970). Oseledet’s theorem and the Lyapunov spectrum are also formulated for suchsystems of stochastic diﬀerential equations (SDEs) and discrete maps.Suppose { v 𝑖 } 𝑑𝑖 = is a collection of C vector ﬁelds on 𝑀 ⊂ R 𝑁 𝑥 , a smooth manifoldwithout boundary. Deﬁne { 𝑊 𝑖𝑡 } 𝑑𝑖 = where each 𝑊 𝑖𝑡 is an independent Wiener processdeﬁned on the probability space ( Ω , F , 𝑃 ) . This describes a generic StratonovichSDE, d x d 𝑡 = f ( x , 𝝈 , 𝜔 ) (cid:44) v ( x , 𝝈 ) + 𝑑 ∑︁ 𝑖 = v 𝑖 ( x , 𝝈 ) ◦ 𝑊 𝑖𝑡 ( 𝜔 ) . (32)For ﬁxed 𝝈 , if { v 𝑖 } 𝑑𝑖 = span the tangent space 𝑇 x 𝑀 for each x ∈ 𝑀 , then the systemof SDEs gives rise to a unique probability measure 𝜇 on 𝑀 that is invariant withrespect to the random ﬂow induced by the system of SDEs. For 𝜇 × 𝑃 almost every ( x , 𝜔 ) ∈ 𝑀 × Ω , the Lyapunov exponents and their multiplicity are deﬁned andonly depend on x (Liu and Qian, 2006)[theorem 2.1]. For the SDE, f ( x , 𝝈 , 𝜔 ) , thetangent linear model is once again deﬁned as in Eq. (2) from which the exponentsand vectors can be computed as usual (Pikovsky and Politi, 2016)[chapters 2 and 8].The analysis from perfect-deterministic models thus extends to stochasticallyforced models as in Eq. 8a, but with key diﬀerences. Firstly, the forecast error co-variance can generally be considered to be of full rank due to the injection of thestochastic forcing w 𝑘 into arbitrary subspaces; the standard controllability assump-tion actually guarantees that it is of full rank after a suﬃcient lead time (Jazwinski,1970)[lemma 7.3]. Stochastic perturbations w 𝑘 are, moreover, subject to growth anddecay rates of the LLEs, 𝜆 𝜏𝑖 of Eq. (7). These LLEs are distributed about the LE suchthat even asymptotically stable modes may have transient periods of rapid growth.To make the analysis tractable, assume that the dynamics are stationary in thesense that the recursive QR algorithm (Shimada and Nagashima, 1979; Benettin et al.,1980) converges uniformly to the theoretical LEs uniformly in number of iterationsfrom any initial time. While the forecast error covariance is not generally reduced-rank, the stable dynamics is actually suﬃcient to uniformly bound the forecast errorvariances in the stable subspace without any assimilation . Result 6 : Uniform bounds on the variance of errors in the stable subspaceRecall that L b ,𝑖𝑘 is the 𝑖 -th BLV and that P free 𝑘 is the free forecast error covariance( e.g. without DA). For 𝜆 𝑖 <

0, deﬁne 𝜖 > { 𝜆 𝑖 + 𝜖 } <

1. By thestationarity assumed above, there exists 𝑁 𝜖 such that if 𝑘 − 𝑙 > 𝑁 𝜖 , then − 𝜖 < 𝑘 − 𝑙 log (cid:16) (cid:107) M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 (cid:107) (cid:17) − 𝜆 𝑖 < 𝜖 . (33)Assuming that the system is uniformly completely controllable and Q 𝑘 ≤ 𝑞 sup I 𝑛 < ∞ for all 𝑘 , the controllability matrix, Eq. (13), is uniformly bounded above by 𝐶 𝑁 𝜖 I 𝑛 .A variation on Eq. (12) can be used to obtain a uniform bound on the 𝑖 -th varianceof P free 𝑘 in the basis of the BLVs aslim sup 𝑘 →∞ (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) P free 𝑘 L b ,𝑖𝑘 ≤ 𝐶 𝑁 𝜖 + 𝑞 sup exp { ( 𝜆 𝑖 + 𝜖 ) 𝑁 𝜖 + } − exp { ( 𝜆 𝑖 + 𝜖 )} . (34)This bound represents the competing forces of the transient growth rates of re-cently introduced perturbations in the controllability matrix, and perturbations thatadhere to their aysmptotic log-average rate of decay in the stable subspace withina margin of 𝜖 (Grudzien et al., 2018a)[proposition 2 and corollary 3]. This demon-strates that, if assimilation prevents error growth in the span of the unstable-neutralBLVs, errors in the span of the stable BLVs can be neglected without relinquishingﬁlter boundedness. However, this does not state whether the error in the span of thestable subspace will remain within tolerable bounds.Redeﬁne 𝑞 sup such that P free0 , Q 𝑘 ≤ 𝑞 sup I 𝑛 for all 𝑘 . The variance of the freeforecast in the 𝑖 -th BLV is bounded directly as (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) P free 𝑘 L b ,𝑖𝑘 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 :0 P free0 M (cid:62) 𝑘 :0 L b ,𝑖𝑘 + 𝑘 ∑︁ 𝑙 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 : 𝑙 Q 𝑙 M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 ≤ 𝑞 sup 𝑘 ∑︁ 𝑙 = (cid:16) L b ,𝑖𝑘 (cid:17) (cid:62) M 𝑘 : 𝑙 M (cid:62) 𝑘 : 𝑙 L b ,𝑖𝑘 = 𝑞 sup 𝑘 ∑︁ 𝑙 = (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) , (35)where (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) is the norm of the 𝑖 -th row of T 𝑘 : 𝑙 in the recursive QR decompo-sition of the propagator, M 𝑘 : 𝑙 = L b 𝑘 T 𝑘 : 𝑙 (cid:16) L b 𝑙 (cid:17) (cid:62) . The sum Ψ 𝑖𝑘 (cid:44) 𝑘 ∑︁ 𝑙 = (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) (36)describes the invariant evolution for the 𝑖 -th variance of the free forecast errorcovariance matrix in the basis of BLVs. In the case that Q 𝑘 = 𝑞 I 𝑛 for all 𝑘 , Eq. (35)becomes an equality and Ψ 𝑖𝑘 can be interpreted as the evolution of the 𝑖 -th variance hort Title 15 when 𝑞 =

1. As 𝑘 − 𝑙 grows (cid:107) (cid:0) T (cid:62) 𝑘 : 𝑙 (cid:1) 𝑖 (cid:107) converges exponentially to zero while for 𝑘 − 𝑙 close to one this describes transient dynamics in the basis of the BLVs. Although Ψ 𝑖𝑘 is guaranteed to be uniformly bounded in 𝑘 (Grudzien et al., 2018a)[corollary 3],numerical simulations demonstrate how this uniform bound can be extremely large.Figure 1, presents an example from Grudzien et al. (2018a) of the free forecasterror variance, Ψ 𝑖𝑘 , over 10 forecast cycles where the model propagator M 𝑘 is deﬁnedby the evolution of the tangent linear model of the Lorenz-96 system (Lorenz, 1996)in 𝑁 𝑥 =

10 dimensions, with an interval between observations of 0 .

1, and Q 𝑘 (cid:44) I 𝑁 𝑥 .Although this model is generated from the underlying nonlinear Lorenz system, thestate model for the experiment is treated as a discrete linear model using only thetangen-linear resolvent as above. This model has three unstable, one neutral andsix stable LEs. While 𝜆 and 𝜆 are negative, L b , 𝑘 and L b , 𝑘 experience frequenttransient instabilities in the timeseries of their LLEs (see bottom row in Fig. 1). TheLLEs of L b , 𝑘 have more intense growth, reﬂected in the diﬀerences between Ψ 𝑘 and Ψ 𝑘 (top row): the maximum of Ψ 𝑘 is on the order of O ( ) and the mean isapproximately 28; for Ψ 𝑘 the max is of O (cid:0) (cid:1) and the mean is approximately 808.This suggests that, as opposed to the case of deterministic dynamics, for successfulDA in stochastic systems with additive model error, it is necessary to control thegrowth of forecast errors also in the span of weakly stable BLVs. While errors inthis span will not grow indeﬁnitely and remain bounded, if those directions are leftuncontrolled their error bounds can be practically too large for any meaningful stateestimation purposes. Fig. 1 Upper: time series of Ψ 𝑘 and Ψ 𝑘 as deﬁned in Eq. (36). Lower:

LLEs of L b , and L b , .Adapted from Grudzien et al. (2018a)6 Authors Suppressed Due to Excessive Length Result 7 : The unﬁltered-to-ﬁltered error upwell and the need for inﬂationMotivated by the results above, suppose that an approximate, reduced-rank Kalmanestimator is deﬁned such that the resulting forecast error covariance and the Kalmangain have image (column) spaces constrained to the span of the leading 𝑟 ≥ 𝑛 BLVs, L b , 𝑟𝑘 . For perfect and deterministic dynamics, this estimator is asymptoticallyequivalent to the optimal Kalman ﬁlter by the results of Sect. 3.1.1. On the otherhand, Result 6 establishes that, in the presence of additive model errors, the varianceof the unﬁltered error in the span of the trailing 𝑁 𝑥 − 𝑛 BLVs will remain ﬁnite, albeitbounded as per Eq. (35). When 𝑟 > 𝑛 this reduced-rank Kalman ﬁlter will correctthe 𝑟 − 𝑛 stable modes in addition to unstable-neutral modes, thereby reducing thevariance of the free forecast errors below the bounds pictured in Fig. (1).Grudzien et al. (2018b) derive the full forecast error covariance dynamics for thereduced-rank Kalman estimator described above. Note that, as opposed to the freeforecast error covariance described in relation to Result 6, the discussion pertains nowto the forecast error covariance cycled within the KF. While the standard reduced-rank KF formalism would write the recursion for the forecast error covarianceentirely within the span of L b , 𝑟 , it is proven in Grudzien et al. (2018b) that forecasterrors in the column span of trailing L b ,𝑟 + 𝑁 𝑥 𝑘 (those left “uncorrected” by DA) aretransmitted into the column span of L b , 𝑟𝑘 . This “error upwell” is a consequence ofthe KF rank-reduction within the ﬁrst 𝑟 BLVs and is driven by the upper triangulardynamics of the BLVs in the recursive QR algorithm. Therefore, neglecting thecontribution of the “upwelling” of error from the trailing to the leading BLVs, asin the standard recursion, leads to a systematic underestimation of the true forecasterror in the presence of additive noise. Furthermore, because the leading 𝑟 BLVsshare the same span as the leading 𝑟 Oseledet spaces, Eq. (4), the upwelling of errorsfrom the span of the trailing BLVs to the leading BLVs holds for any estimator thatis restricted to the span of the leading 𝑟 covariant subspaces.Figure 2 presents an example from Grudzien et al. (2018b), using the sametangent linear model from the 10-dimensional Lorenz-96 system as in Fig. 1, andﬁxing each of H 𝑘 = R 𝑘 = Q 𝑘 = I . In each window, the eigenvalues of the forecasterror covariance matrix of the optimal full-rank KF (yellow) and of the “exact”,reduced-rank estimator (red) are averaged over 10 analysis cycles and plotted withtriangles. By exact it is meant here that the full covariance equation is evolvedanalytically, including the covariance within the unﬁltered trailing BLVs and theircross covariances with the leading ﬁltered modes. The rank, 𝑟 , of the reduced-rankestimator is varied to examine the diﬀerences between the forecast error covariancesarising from the optimal KF and the reduced-rank one when only the unstable-neutralsubspace is corrected by the gain ( 𝑟 = 𝑛 = 𝑟 =

5) and so on. As suggested by Fig. 1, correcting the ﬁrst stable modereduces the leading eigenvalue of reduced-rank estimator’s forecast error covarianceby an order of magnitude versus the case when only the unstable-neutral subspacehas been corrected ( cf the red curves between the two top windows).In each window the projection coeﬃcients of the reduced-rank estimator’s forecasterror covariance into the basis of BLVs are also plotted (green line). For the full- hort Title 17 Fig. 2

Eigenvalues of the KF and the reduced-rank estimator covariances plotted with triangles.Projection coeﬃcients of the reduced-rank estimator covariance plotted with X’s. rank optimal KF, the projection coeﬃcients closely follow the eigenvalues and thisis not shown due to redundancy. By contrast, it is clear that for the reduced-rankestimator the leading eigenvector is typically close to the ﬁrst BLV that is containedin the null space of the reduced-rank gain, i.e. the ﬁrst among the unconstraineddirections. Figure 2 demonstrates thus a fundamental diﬀerence between the perfectmodel and stochastically forced models conﬁgurations: in stochastic systems, usinga reduced-rank 𝑛 ≤ 𝑟 < 𝑁 𝑥 Kalman estimator, the leading order forecast errorscan actually lie in the stable, unﬁltered modes. Those modes must be taken thus intoaccount in the DA procedure by, for instance, appropriately enlarging the ensemblesize or otherwise augmenting the span of the ensemble-based gain.The exact recursion for the reduced-rank gain (red lines in Fig. 2) requires resolv-ing the full error covariance dynamics and therefore cannot be used practically forDA. Nevertheless, it was used here to render apparent, the upwelling phenomena: afundamental source of uncertainty that is not captured by the standard reduced-rankKF (and thus almost all EnKF) recursion. This mechanism is ubiquitous wheneverone solves for a reduced rank estimator and is present whenever the forecast errorevolution can be well approximated by the tangent-linear dynamics (see Sect. 3.2.2).Notably, it also provides one basic, mathematically grounded, justiﬁcation for us-ing covariance inﬂation (Grudzien et al., 2018b), a powerful common ﬁx used inensemble-based DA (that are commonly rank-deﬁcient by construction) to mitigatefor sampling, and sometimes model, error (see e.g.

Carrassi et al., 2018, their Sec-tion 4.4.2 and references therein). Finally, the exact recursion also demonstrates theasymptotic characteristics of the forecast error covariance when using a reduced-rankgain, in the absence of sampling error. It is extremely important to note that in theexact error dynamics for the reduced rank estimator, the leading order forecast errorslie in the directions that are asymptotically stable but unﬁltered. This also highlights the importance of localization (Sakov and Bertino, 2011) and gain-hybridization(Penny, 2017) as eﬀective means for preventing the growth of forecast errors that lieoutside of the ensemble span in the EnKF.

The performance and the functioning mechanisms of the EnKF in nonlinear systemsare studied with the aid of numerical simulations performed using the qgs codeplatform (Demaeyer et al., Submitted; Demaeyer and De Cruz, 2020).We consider ﬁrst a spectral 2-layer channel quasi-geostrophic atmospheric model.The Fourier modes decomposition is truncated at wavenumber 2 in both meridionaland zonal directions on a beta-plane, leading to a set of 20 ODEs for the time evolutionof the ﬁrst 10 components of the atmospheric streamfunction 𝝍 , and temperature 𝜽 (Reinhold and Pierrehumbert, 1982). The model dimension is 𝑁 𝑥 =

20 and itsspectrum of LEs includes 3 positive and one neutral exponents, so that 𝑛 = , 𝑡 𝑘 , each separated by a time interval Δ 𝑡 , 𝑡 𝑘 + = 𝑡 𝑘 + Δ 𝑡 . The observations are then obtained by adding a zero mean Gaussian randomerror sampled from N ( , R ) , with R being the (assumed to be known) observationerror covariance matrix. It is assumed that we observe the spectral componentsdirectly and that the full system is observed, implying that the observation operatoris the identity matrix, H = I 𝑁 𝑥 ∈ R 𝑁 𝑥 × 𝑁 𝑥 . Although the former hypothesis cannotbe met in practice (instrumental devices do not observe the spectral modes) and thelatter rarely holds in high-dimensional applications, they are done here for the sake ofclarity and will facilitate the study of the dynamical behaviour we intend to discuss.Furthermore, observational error is supposed to be spatially (in spectral space)uncorrelated with an amplitude proportional to the corresponding model variable’sstandard deviation, 𝜎 𝑖 md , 𝑖 = , . . . , 𝑁 𝑥 . These imply R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) Data assimilation is performed using the EnKF-N, hereafter simply referred toas EnKF (Bocquet, 2011). This EnKF belongs to the family of deterministic ﬁltersbut it furthermore possesses the appealing property that the ensemble covariancemultiplicative inﬂation is computed automatically as part of the DA process. Inﬂa-tion is one of the unavoidable feature making the EnKF methods suitable for highdimensional problems (Carrassi et al., 2018). It comes under two ways known asmultiplicative and additive inﬂation, that are often used together. We shall use mul-tiplicative inﬂation alone because, as opposed to the additive version, it does notchange the rank and span of the ensemble error covariance but it only inﬂates the hort Title 19 matrix entries amplitude. This allows us to study the eﬀect (if any) on the ensemblesubspace (reﬂected into the ensemble covariance rank and span) that comes fromthe dynamics, without artifacts from the DA procedure. At each analysis step, theforecast anomaly matrix is inﬂated as 𝛼 X f ← X f , 𝛼 ≥ 𝑡 𝑘 , the alignment between the ensemble and the unstable-neutralsubspaces, U 𝑘 , is computed ascos ( 𝜃 𝑖𝑘 ) = 𝑛 ∑︁ 𝑝 = cos ( 𝜃 𝑖, 𝑝𝑘 ) = 𝑛 ∑︁ 𝑝 = (cid:8) ( u 𝑝𝑘 ) T v 𝑖𝑘 (cid:9) (cid:13)(cid:13) v 𝑖𝑘 (cid:13)(cid:13) ≤ 𝑖 ≤ 𝑁, (37)where 𝜃 𝑖𝑘 ∈ [ , 𝜋 ] is the angle between the anomaly v 𝑖𝑘 and U 𝑘 , u 𝑝𝑘 is the 𝑝 -th BLV,and 𝑁 is the number of ensemble members.The RMSE of the EnKF analyses and the angle 𝜃 𝑖𝑘 are shown on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) in Fig. 3. Values are averaged over 4 years of simulated time afterdiscarding the ﬁrst 200 analyses. The RMSE of the analyses is also averaged over allmodel variables, once the error on each variable is normalised with respect to thecorresponding standard deviation 𝜎 md . The number of ensemble members is set to 𝑁 = t (hours) % Angle t (hours) % RMSE Fig. 3

Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle, from Eq. (37)(in degree), between the anomalies of the EnKF and the unstable–neutral subspace (left panel,shadow colours, in degree), and time averaged normalised RMSE of the EnKF analysis (rightpanel, shadow colours), both on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) . The set-up is H = I 𝑑 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) and 𝑁 =

10. Note that the logarithmic scale is used on the right panel.

Figure 3 immediately shows the strong resemblance of patterns between the angleand the RMSE. In practice, whenever the observation time interval Δ 𝑡 and error 𝜎 % are small enough then the angle between the two subspaces get smaller and the RMSE decreases. Similarly to what is discussed in Bocquet and Carrassi (2017),we also see here how increasing observation frequency is more eﬀective in reducingthe RMSE than reducing the observation error (see discussion in the Appendix ofBocquet and Carrassi (2017)). This result also suggests that the use of a large numberand frequent observations, like the ones produced by non-conventional systems ( e.g. crowdsourcing) has the potential for improving analyses, despite that observationalerrors may be larger than in conventional measurement systems (Nipen et al., 2019).A complementary picture of the relation between the ﬁlter skill and the subspacesalignment is provided in Fig. 4, that displays the angle (left y-axis) and RMSE ofthe EnKF analysis (right y-axis) both against the ensemble size 𝑁 . The remainingexperimental set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .

08 and Δ 𝑡 = .

11 hours.Recalling that the unstable-neutral subspace has dimension 𝑛 =

4, Fig. 4 shows thatas soon as 𝑁 ≥ 𝑛 + i.e. the ensemble fully spans the unstable-neutral subspace,the RMSE suddenly reduces to very small values and it does not further decreasewhen 𝑁 is increased beyond 𝑛 +

1. The fact that the convergence occurs for 𝑛 + 𝑛 is due to the ensemble anomalies subspace to be at most of rank 𝑁 − 𝑁 = 𝑛 + Ensemble size N M e a n a n g l e w i t h t h e un s t a b l e - n e u t r a l s u b s p a c e Angle 0.00.20.40.60.81.0 R M S E RMSE

Fig. 4

Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle, Eq. (37) (indegree), between the ensemble anomalies of the EnKF and the unstable-neutral subspace as afunction of the ensemble size 𝑁 (left 𝑦 axis), and the corresponding time-averaged RMSE of theEnKF (right 𝑦 axis). The set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .

08 and Δ 𝑡 = .

11 hours.

At the convergence, the angle between the ensemble and unstable-neutral subspace(of dimension 𝑛 =

4) is about 10 degrees (see Fig. 4): a remaining small portion ofthe ensemble subspace is projecting outside the unstable-neutral space. To investigate hort Title 21 how large is such a portion, we compute the angle between an ensemble subspace withﬁxed 𝑁 =

10 and a Lyapunov/Oseledets subspace of increasing dimension beyond 𝑛 +

1; results are shown in Fig. 5. The angle between the two subspaces decreasesmonotonically with the size of the subspace, eventually reaching zero once the fullphase space ( 𝑛 =

20) is considered. Interestingly, the rate of convergence is initiallyfaster until approximately dimension 10, and slower afterwards. This indicates thatthe additional projection beyond the asymptotic unstable-neutral subspace is largelyconﬁned to the less stable ( i.e. close to neutral) directions, those that can often belocally unstable. This mechanism is a reminiscent of what was shown by (Grudzienet al., 2018b) for stochastic systems and reviewed in Section 3.1.2. It suggests that theaddition of few ensemble members beyond 𝑛 + Dimension of subspace M e a n a n g l e w i t h s u b s p a c e Fig. 5

Atmospheric quasi-geostrophic model. Time- and ensemble-averaged angle (in degree)between the ensemble anomalies of the EnKF with 𝑁 =

10 members, and subspaces spanned bythe BLVs of increasing dimensions. These subspaces are constructed by starting from the unstable-neutral subspace ( 𝑛 =

4) and adding one by one stable directions ordered decreasingly by theirLyapunov exponent. The set-up is H = I 𝑑 , R = 𝝈𝝈 T with 𝜎 % = .

08 and Δ 𝑡 = .

11 hours.

To discuss the impact of multiple timescales on the ensemble Kalman ﬁltering, weconsider now the addition of a shallow-water ocean component to the previous atmo-spheric model. The atmospheric and oceanic models are coupled together throughwind and radiative forcing as well as through heat exchanges, yielding the Modu-lar Arbitrary-Order Ocean-Atmosphere Model MAOOAM (De Cruz et al., 2016).MAOOAM has the same 20 ODEs of the atmospheric quasi-geostrophic model, andadditional 16 ODEs for the ocean. Amongst these 16 equations, the ﬁrst 8 governthe time evolution of the ﬁrst eights components of the oceanic streamfunction 𝝍 o ,while last 8 are relative to the ﬁrst eights components of the oceanic temperatureanomaly 𝛿 𝑻 o . The model dimension is thus 𝑁 𝑥 =

36 and its parameters are cho- sen such that a decadal low-frequency variability appears as a consequence of thecoupling (Vannitsem et al., 2015; De Cruz et al., 2016).The model is integrated with a time step of approximately 15 minutes and isspun-up for as many as 18 ,

500 years to ensure the solution has reached the modelattractor, given the low-frequency ( i.e. slow time-scale) of the model. From this spun-up trajectory we start DA experiments using the EnKF-N with the same protocolused for the atmospheric model detailed above. In this case the true trajectorylasts 185 years, and it is again assumed that we observe the spectral componentsdirectly and completely. The EnKF is used in a strongly-coupled DA mode (see e.g.

Penny and Hamill (2017)) so that, at analysis steps, atmospheric data can impact theocean and vice-versa. MAOOAM has been already used as a prototypical coupledmodel to study coupled DA by Penny et al. (2019); Tondeur et al. (2020). Again,observational error is assumed to be spatially (in spectral space) uncorrelated withan amplitude proportional to the corresponding model variable’s standard deviation, 𝜎 𝑖 md , 𝑖 = , . . . , 𝑁 𝑥 . These imply R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ).The spectrum of LEs of MAOOAM is displayed in Fig. 6 and presents some keyfeatures. L y a p un o v e x p o n e n t Fig. 6

MAOOAM coupled Ocean-Atmosphere model. Time-averaged Lyapunov exponents indays − computed along the true trajectory of the DA experiments. With some arbitrariness, it can be decomposed into three subsets. A ﬁrst subsetof LEs corresponds to the unstable ( 𝜆 𝑖 >

0) and neutral ( 𝜆 𝑛 =

0) directions.The dimension of the subspace spanned by these directions is 𝑛 =

6. A secondsubset of small negative LEs ( 𝜆 𝑖 ∈ [− × − , [ day − ) corresponds to nearlyneutral directions . The subspace they span has dimension 𝑛 =

10 and we hereafterdeﬁne the unstable-near–neutral subspace as the one spanned by the ﬁrst 𝑛 + 𝑛 directions. The presence of these nearly-neutral directions amounts as a form ofdegeneracy of the neutral direction. Although the model is theoretically possessingonly one single neutral mode ( cf Sect. 2), it is computationally extremely diﬃcult todistinguish it from other nearly neutral ones, and the model can be said to degeneratein the neutral direction in any practical sense. We shall see how this feature hasimportant consequences on the functioning and performance of DA. Finally, aftera clear gap, the nearly neutral Lyapunov exponents are followed by the remainingstable directions which form the last subset of the spectrum. hort Title 23

Similarly to Fig. 3, we study the ﬁlter performance along with its ensemblesubspace alignment to the unstable subspace in Fig. 7. The ﬁgure shows the RMSE ofthe EnKF analyses and the angle 𝜃 𝑖𝑘 given by Eq. (37), between the ensemble subspaceand the unstable-neutral (left) and unstable-near–neutral (middle) subspace. Valuesare averaged over 185 years of simulated time after discarding the ﬁrst 100 analyses.RMSE of the analysis is also averaged over all model variables, once the error oneach variable is normalised with respect to the corresponding standard deviation 𝜎 md . The number of ensemble members is set to 𝑁 = t (days)0.0020.0080.0350.1440.600 % Angle t (days)0.0020.0080.0350.1440.600 % Angle t (days)0.0020.0080.0350.1440.600 % RMSE

Fig. 7

MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle givenby Eq. (37) between the anomalies of the EnKF and the unstable–neutral subspace (left panel,shadow colours, in degree) or the unstable-near–neutral subspace (mid panel, shadow colours, indegree). The time averaged normalised RMSE of the EnKF analysis is also depicted (right panel,shadow colours). All the ﬁgures are on the plane ( 𝑥, 𝑦 ) = ( Δ 𝑡, 𝜎 % ) . The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) and 𝑁 =

20. The unstable-near–neutral subspace is deﬁned asfollows: it includes the subspace spanned by the unstable and neutral 𝑛 = 𝑛 =

10 stable but near–neutral directions with LEs 𝜆 𝑖 ∈ [− × − , [ day − . In contrast to what was observed in Fig. 3, we see here that the pattern of theRMSE does not longer resemble the pattern of the angle between the ensemble andunstable-neutral subspace ( cf left and right panels in Fig. 7). Nevertheless, it bearsgreat similarities with the pattern of the angle between the ensemble and unstable-near–neutral subspace ( cf mid and right panels in Fig. 7), suggesting that it is nowthis larger subspace (that includes the 𝑛 weakly stable modes) that contains most ofthe error.This is further conﬁrmed by looking at Fig. 8 that, similarly to Fig. 4, showsthe angle (left y-axis) and RMSE of the analysis (right y-axis) against the ensemblesize 𝑁 . The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) with 𝜎 % = .

08 and Δ 𝑡 = .

68 days.The critical role of the 𝑛 weakly stable modes appears now evident when lookingat Fig. 8 ( cf green and blue lines). As opposed to the behaviour of the EnKF applied tothe atmospheric model alone, or to the single scale Lorenz96 system used by Bocquetand Carrassi (2017), we see that the ensemble subspace does not longer project muchonto the unstable-neutral subspace (blue line with solid circles markers): when 𝑁 = 𝑛 + Ensemble size N M e a n a n g l e w i t h s u b s p a c e Angle with unstable-neutralAngle with unstable-near-neutral 0.00.10.20.30.40.50.6 R M S E RMSE

Fig. 8

MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle (indegree) between the ensemble anomalies of the EnKF and the unstable-neutral subspace orthe unstable-near–neutral subspace as a function of the ensemble size 𝑁 (left 𝑦 axis), andthe corresponding time-averaged RMSE of the EnKF (right 𝑦 axis). The set-up is H = I 𝑁 𝑥 , R = 𝜎 % diag ( 𝜎 , . . . , 𝜎 𝑁 𝑥 md ) with 𝜎 % = .

08 and Δ 𝑡 = .

68 days. when

𝑁 > 𝑛 +

1, indicating the presence of non-negligible projections outsidethe 𝑛 = 𝑁 = 𝑛 +

1: the ﬁrst unstable-neutral 𝑛 = 𝑁 =

15 and stays almost constant afterwards.This result demonstrates undoubtedly the importance of the 𝑛 near–neutral modes.Although possibly asymptotically weakly stable, these directions span a small portionof the ﬁlter error that, if included in the ensemble subspace (by properly enlargingthe ensemble size) leads to further improvement of the ﬁlter performance. This isﬁnally emphasized in Fig. 9, which shows the decrease of the ensemble-averagedangle between an ensemble of 𝑁 =

10 members and the subspaces of increasingdimension beyond the unstable-neutral one. In contrast to Fig. 5, it indicates that theaddition of 𝑛 extra ensemble members may lead to RMSE improvement that is farfrom negligible. Also, the gap observed around the value 𝑛 + 𝑛 + =

17 clearlyshows that adding more members beyond the range of the near-neutral stability doesnot bring any beneﬁt. This is a strong indication in favor of a cautious assessment ofthe ensemble size when working with coupled multi-scale dynamics and performingstrongly-coupled DA. Furthermore, as elucidate by Vannitsem and Lucarini (2016)and Tondeur et al. (2020), the near-neutral part of the spectrum in MAOOAM isdirectly connected to the coupling: including those directions within the ensemble hort Title 25 subspace is paramount to propagate the data information content between ocean andatmosphere. This subspace also proved to be key in producing reliable ensembleforecasts in coupled ocean-atmosphere systems Vannitsem and Duan (2020).

Dimension of subspace M e a n a n g l e w i t h s u b s p a c e Fig. 9

MAOOAM coupled Ocean-Atmosphere model. Time- and ensemble-averaged angle (indegree) between the ensemble anomalies of the EnKF with 𝑁 =

10 members, and subspacesspanned by the BLVs of increasing dimensions. These subspaces are constructed by starting fromthe unstable-neutral subspace ( 𝑛 =

6) and adding one by one stable directions ordered decreasinglyby their LE. The set-up is H = I 𝑁 𝑥 , R = 𝝈𝝈 T with 𝜎 % = .

08 and Δ 𝑡 = .

68 days.

Results in Sect. 3.1.1 show that the optimal KF and the reduced-rank KF withgain restricted to the leading BLVs are asymptotically equivalent for linear, perfect-deterministic models. Moreover, results in Sect. 3.2.1 demonstrate that in weaklynonlinear error dynamics, with a perfect-deterministic model, these conclusionslargely extend to the EnKF. Yet the exact reduced-rank recursion introduced inSect. 3.1.2 evidenced important diﬀerences between the optimal and the reduced-rank formulations in the presence of model error for linear models. In this case,stochastic noise injected in asymptotically stable directions is not entirely dampedout and remains ﬁnitely bounded. In addition, the upwelling process, albeit inherentto the reduced-rank formulation and not a consequence of model error, will movethe constantly injected model noise from unﬁltered to ﬁltered directions.It is of interest thus to compare the diﬀerences between the full rank Kalmanestimator, and both the standard and the exact reduced-rank KF recursions in astochastically forced, nonlinear models. As a prototype, we use the model deﬁnedby the nonlinear ﬂow of the Lorenz-96 (Lorenz, 1996) system with additive noise, and 𝑁 𝑥 =

40. Suppose that 𝑡 𝑘 + = 𝑡 𝑘 + Δ 𝑡 , so that the ﬂow map taking all initialconditions to time + Δ 𝑡 is deﬁned 𝜙 Δ 𝑡 ( x 𝑘 ) = x 𝑘 + . If x t 𝑘 represents the true physicalstate at time 𝑡 𝑘 , we deﬁne the dynamical model analogously to Eq. (8a) as a discrete,nonlinear map, x t 𝑘 + = 𝜙 Δ 𝑡 (cid:16) x t 𝑘 (cid:17) + w 𝑘 . To avoid the interplay and superposition between sampling and model errors, andto be able to focus on the latter alone, instead of the EnKF we use here the EKF (seeSect. 3 and Jazwinski (1970)). The EKF estimates the forecast distribution for x t 𝑘 via the equation, x f 𝑘 + = 𝜙 Δ 𝑡 (cid:16) x a 𝑘 (cid:17) , taking the analysis mean at time 𝑡 𝑘 to the forecastmean at time 𝑡 𝑘 + , and by the linearized forecast equation for the covariance. In thefollowing, the EKF propagates the full-rank covariance equation via the tangent-linear model deﬁned along the mean equation and assimilates observations in allstate components. On the other hand, both the standard and the exact reduced-rankformulations restrict the assimilation so that the image space of the gain is equalto the span of the leading 𝑟 BLVs deﬁned along the tangent-linear model of themean equation. The diﬀerence between the standard and the exact formulation isas follows: the standard formulation only estimates the error covariance in the spanof the leading 𝑟 BLVs while the exact formulation simulates the entire covarianceequation, estimating the free forecast covariance in the trailing modes and includingits upwelling in the estimate of the uncertainty in the span of leading BLVs (Grudzienet al., 2018b, see proposition 1)Figure 10, adapted from Grudzien et al. (2018b), illustrates the diﬀerences inperformance between the above described schemes. On the left, the average RMSEof the (i) full-rank , the (ii) standard reduced-rank and the exact reduced-rank EKF formulations are plotted versus the rank of the reduced-rank gain over 10 analy-ses. The model possesses 13 positive LEs, therefore the reduced-rank EKFs underconsideration have at least 𝑟 ≥ 𝑛 + =

14. The system is fully observed with R = . I 𝑁 𝑥 . As 𝑟 approaches 𝑁 𝑥 =

40, the two reduced rank formulations con-verge to the full rank EKF, with performance considered optimal for a ﬁlter in thissystem. However, for correction rank 𝑟 ≤

20, there are substantial diﬀerences inperformance between the standard formulation and that which includes the eﬀect ofthe upwelling of error from the trailing BLVs: the exact formulation reaches bothadequate and near-optimal performance with smaller 𝑟 than the standard recursion.On the right, Fig. 10 demonstrates the eﬀect of multiplicative inﬂation on thestandard recursion when the correction rank is ﬁxed at 𝑟 =

17. While multiplicativeinﬂation greatly improves the performance of the standard recursion, this perfor-mance is actually bounded below by the RMSE of the exact formulation. This ex-ample shows that, in addition to sampling error, multiplicative inﬂation can be usedto remedy the inadequacies of the standard reduced-rank formalism which neglectsthe eﬀects of dynamic upwelling. This dynamic upwelling is a direct byproduct ofthe estimator being restricted to the span of the leading BLVs. Such an estimatorarises when, e.g. , the ensemble span aligns with the span of the leading BLVs inthe EnKF, as demonstrated in Sect. 3.2.1. The eﬃcacy of dynamic, multiplicativecovariance inﬂation for treating the eﬀect of model error separately from samplingerror has also been demonstrated with statistical and optimization methods by, e.g. , hort Title 27 Mitchell and Carrassi (2015); Raanes et al. (2015, 2019); Sakov et al. (2018); Fillionet al. (2020). The dynamical upwelling derived in Grudzien et al. (2018b) providesan explanation of one of the mechanisms responsible for the need for covarianceinﬂation.

Fig. 10 Left:

RMSE of the full-rank, standard reduced-rank and exact reduced-rank EKF. Thecorrection rank of the reduced-rank estimators is varied in the horizontal axis.

Right:

RMSE ofthe full-rank, standard reduced-rank and exact reduced-rank EKF. Rank is ﬁxed at 𝑟 =

17 andmultiplicative inﬂation in the standard reduced-rank recursion is varied in the horizontal axis

We present here two areas where the knowledge about the chaotic nature and prop-erties of the model dynamics, have been used pro-actively to achieve a better trackof the system of interest and to reduce the forecast error. There have been two wayshow this has been accomplished. The former has to do with the design of methods toinform where and when additional fewer data would lead to a major improvement inthe analysis and forecast skill. This is usually referred to as “adaptive observation”or “target observation”, and is the content of Sect. 4.1. The second has to do withincorporating, within the DA process itself, the information about the unstable sub-space, with the goal of achieving a computational economy while maximising theerror reduction. This gave raise to a family of DA methods known as “assimilationin the unstable subspace”, summarised in Sect. 4.2.Our treatments here are intentionally succinct and have the character of a review,but interested readers will ﬁnd all the appropriate references to the original works.

Historically, one very important development in this type of dynamical analysis ofthe climate predictability / DA problem arose from the question of how to generateadaptive observation systems. By “adaptive observations” is meant here observationswhose locations, time and type is chosen such that their impact on the state estimateor on the forecast skill is the largest. It was well understood at the time that the growthof forecast errors was conﬁned to a lower dimensional subspace of rapidly growingperturbations (Trevisan and Legnani, 1995). The Fronts and Atlantic Storm Track(FASTEX) program (Snyder, 1996), a multinational collaboration to investigatethe growth and development of frontal cyclones, in particular was motivated bythese dynamical approaches to generate adaptive observation schemes that wouldtarget regions of rapid forecast error growth. Other similar international eﬀortshave followed, some of them including actual ﬁeld campaign of measurements: theTHORPEX (Fourrié et al., 2006) and the Winter Storm Reconnaissance programs(Szunyogh et al., 2002; Hamill et al., 2013). Two main approaches were considered,the forced singular vectors (Buizza et al., 1993; Palmer et al., 1998), and the bredvectors (Toth and Kalnay, 1993, 1997), to identify the sensitivity areas of highforecast uncertainty.Forced singular vectors are generated by the right singular vectors of the forward-in-time, tangent-linear model resolvent M . It can be seen from the earlier discussionsin Sect. 2 that the singular vectors can be interpreted as a ﬁnite-time approximationof the FLVs along a model forecast. They therefore indicate region where error willgrow. On the other hand, the bred vectors are an ensemble-based approach to identifysensitivity regions. Particularly, the breeding scheme simulates how the modes offast growing error are maintained and propagated through the successive use of shortrange forecasts in weather prediction. The bred vectors are formed by initializingsmall perturbations of a control trajectory and forecasting these in parallel along thecontrol. By successive rescaling of the perturbations amplitude back to a small value,this mimics the evolution of small perturbations under the tangent-linear model, andthe span of these perturbations generically converges to the leading BLVs. Both ofthese approaches represent an early and intuitive way to utilizing the ergodic theoryof chaotic dynamical systems in designing an eﬀective adaptive observation schemeby targeting an unstable subspace in some form.Trevisan and Uboldi (2004a); Uboldi and Trevisan (2006) utilized the bred vectors/ BLV analysis of Toth and Kalnay (1993, 1997) and explicitly linked the methodologyto Lyapunov stability theory. Importantly it was recognised that, instead of mimickingthe error growth of the unforced (free) forecast, it was important to track the errorsthat develop within the DA cycle. This led to a modiﬁed version of the breedingapproach known as Breeding on the Data Assimilation System (BDAS) Uboldi et al.(2005); Carrassi et al. (2007). The BDAS scheme for adaptive observations is basedon the principle of the support of the forecast error lying primarily in the span ofthe BLVs, with the bred vectors acting as a proxy for the explicit decomposition. Inpractice the locations of few adaptive observations were selected to be in the areaswhere the leading BDAS modes attained their local maxima. In experiments with hort Title 29 an atmospheric quasi-geostrophic model, BDAS was used successfully to locate oneadditional observation at each analysis time of a 3DVar cycle, leading to a dramaticimprovement of the analysis skill, compared to cases where either a ﬁxed or arandomly located adaptive observation was assimilated Carrassi et al. (2007). Data assimilation has long been studied with the trade-oﬀ between accuracy andnumerical eﬃciency as a goal. To this end a natural choice has been that of devisingreduced-order schemes that focus the observational constraint on smaller, albeitcrucial, part of the full dynamics. One of the most celebrated among those approachesis the assimilation in the unstable subspace (AUS) where the DA procedure isexplicitly designed to track and control the unstable manifold of the dynamics,usually of much smaller dimension of the full phase space, thus aiming to a reductionin computational cost (Palatella et al., 2013).We have seen in Sect. 2 that the full phase space of a chaotic dissipative dynamicalsystem can be seen as split in a (usually much smaller) unstable–neutral subspaceand a stable one (Kuptsov and Parlitz, 2012). For instance, Carrassi et al. (2007) haveshown how a quasi-geostrophic atmospheric model of

O ( ) degrees of freedompossesses an unstable–neutral subspace of dimension as small as 𝑛 = considered. This improved the performance of the standard EKF-AUS particularly inregimes of increasing nonlinearities. It remains however to be seen to which extentAUS concept could be used within fully nonlinear DA schemes. This is the subjectof Sect. 5.1 and of the references therein. Although AUS has been largely used in aperfect model scenario, Palatella and Grasso (2018) proposed a suitable modiﬁcationthat allows for incorporating parametric model errors. As proved in Grudzien et al.(2018b) and detailed in Sect. 3.1.2 and 3.2.2 the use of AUS in chaotic systems forcedby additive stochastic noise would necessarily require the additional inclusion of theasymptotically weakly stable modes.A key caveat in all of the aforementioned applications of AUS is that one needsto compute in real time the BLVs to be used in the analysis. Therefore, whileAUS proved capable to improve accuracy, it did not accomplishes a computationaleconomy, unless the BLVs were all pre-computed and stored. Note however thatthe latter is not just a technological challenge given that the BLVs depends on thesystem’s state and vice-versa if BLVs are to be used in the analysis update. Thus it isnot straightforward to decouple their online estimation and use within the analysis.At the same time however, as we have seen in Sect. 3, AUS concept proved to be verypowerful to understand, design and interpret the functioning of the KF and EnKF inchaotic systems. In this subsection we attempt to improve the performance of the (bootstrap) particleﬁlter (PF, Farchi and Bocquet, 2018) by AUS. The underlying hypothesis is thatobservational components in the stable subspace contribute little in the way ofprecision (since nearby orbits within the stable subspace converge), but a lot ofnoise. Therefore, the investigation will explore whether discarding observationalinformation outside of the unstable subspace can mitigate the acute collapse ofweights experienced by PFs in high-dimensional systems, manifesting the “curse ofdimensionality”. If so, this could be used to reduce the number of particles required,which scales exponentially with some measure of the system size (Snyder et al.,2008). A secondary objective is to investigate the eﬀectiveness of a few diﬀerentmethods of targeting observing systems to the unstable subspace, potentially alsoreducing costs.We perform synthetic DA experiments with the Lorenz-96 system. The statesize is set to 𝑁 𝑥 =

10 and there is no dynamical noise ( Q = ). Four diﬀerentobservation conﬁgurations targeting the unstable subspace are tested. For each ofthem, observations are taken 0 . .

5. Each experiment lasts for 10 analysis cycles. The RMSE averages of eachmethod are tabulated for a range of ensemble sizes and observation operator ranks, 𝑁 𝑦 , and plotted as curves in Fig. 11. The plotted scores represent the lowest obtained hort Title 31 among a large number of tuning settings, selected for optimality at each point. Forthe PF the tuning parameters are: (i) the threshold for resampling, which is triggeredif the threshold is larger than the eﬀective ensemble size, (cid:107) w (cid:107) − , where w is thevector of weights, and (ii) the bandwidth (scaling) of the regularizing post-resamplejitter, whose covariance is computed from the weighted ensemble. For comparisonthe (symmetric square-root) EnKF (Hunt et al., 2004) is also tested. Its tuningparameters are (i) the post-analysis inﬂation factor and (ii) whether or not to applyrandom, covariance-preserving rotations (Sakov and Oke, 2008b).The top-left panel Fig. 11 shows that the error decreases monotonically in 𝑁 𝑦 when the observation operator are rows of the identity. The top-right panel shows,by contrast, that when observing the BLVs, L b 𝑘 , the PF with largest ensemble levelsoﬀ at near-optimal performance with as few as 𝑁 𝑦 = 𝑛 = 𝑁 , albeitless clearly. Also recall that a similar “levelling oﬀ” of RMSE around 𝑛 occurredin Fig. 4 of Sect. 3.2.1, whose x-axis is 𝑁 (rather than 𝑁 𝑦 , as here). This resultdemonstrates that targeting observations to the directions of dynamical growth ofthe uncertainty is eﬃcacious. Interestingly, the performance of any given DA methodis nearly independent of the observing system ( i.e. panel) for 𝑁 𝑦 =

10. This makessense considering that all of I , L b 𝑘 , V 𝑘 + , and U 𝑘 are orthogonal, i.e. equal up to arotation.In all panels of Fig. 11, the RMSE score of the PF (as well as that of the EnKF), forany ensemble size, never degrades by the inclusion of more observations (except forwhat we adjudge to be noise). This feature was also found in experiments with evensmaller ensembles than shown, and experiments with larger observation errors. Thusit seems that the curse of dimensionality is not mitigated by discarding observations.Hence there is little to gain by eliminating observation components outside of theunstable-neutral subspace, apart for the potential computational eﬃciency in the casethat unstable-neutral has been pre-computed.It should be noted that this ﬁnding runs counter to that of Maclean and Vleck(2019); Beeson and Namachchivaya (2020); Potthast et al. (2019), all of whichreport success in mitigating weight collapse by reducing the observations to theleading components of L b 𝑘 or some related matrix. Beeson and Namachchivaya(2020) also tested observation operators given by time-local backward and forwardLyapunov vectors, deﬁned through the singular value decomposition of the resolvent: M 𝑘 = U 𝑘 𝚺 𝑘 V (cid:62) 𝑘 . They reported better targeting results with V 𝑘 + than with U 𝑘 ,which in turn was more eﬀective than L b 𝑘 . Our results similarly show U 𝑘 (bottom-right panel) to be slightly more eﬀective for targeting observations than L b 𝑘 (top-rightpanel), albeit only for intermediate ensemble sizes. Both, however, are more eﬀectivethan V 𝑘 + (bottom-left). Moreover, also for V 𝑘 + and U 𝑘 , we never observe lowerRMSE scores when using fewer observation components. There are some diﬀerencesin the experimental setting; notably our system is deterministic, and the jitter we applyis restricted to the ensemble subspace. It is unclear if these diﬀerences can accountfor the disparity in conclusions.Why does PF performance not suﬀer from the inclusion of further, visibly redun-dant, observations (contrary to our initial expectation)? Consider the likelihood ( i.e. H k = ([ L b k ] ) : N y H k = ( V k +1 ) : N y H k = ( U k ) : N y H k = I : N y EnKFPF T i m e - a v e r a g e d , f il t e r - a n a l y s i s ( R M S E rr o r ) Observation length ( N y ) Fig. 11

Benchmarks of ﬁlter accuracy (RMSE) from synthetic DA experiments on the Lorenz-96 system, plotted as functions of the observation dimension ( 𝑁 𝑦 ). Speciﬁcally, the observationoperator consists of the 𝑁 𝑦 leading rows of the transpose of the identity matrix ( top-left ), the BLVmatrix, L b 𝑘 , computed by recursive QR decompositions ( top-right ), the 1-cycle forward singularvectors, V 𝑘 + ( bottom-left ), and 1-cycle backward singular vectors, U 𝑘 ( bottom-right ), all of whichare deﬁned via the fundamental matrix of the orbit of the (supposedly unknown) truth. The numberof members/particles used for the ensemble Kalman ﬁlter (EnKF, dashed ) and Particle ﬁlter (PF, solid ) is tagged above each line. No further improvement is obtained by increasing the ensemblesize beyond 𝑁 =

30 for the EnKF and 𝑁 =

800 for the PF, for which the PF achieves betteraccuracy than the EnKF, as expected for nonlinear systems. The greyscale, dotted lines, includedfor context, show the performance of baseline methods, whose analysis estimates, x a , are given by x for Climatology ( black ), x + K ( C ) [ y − x ] for Optimal interp. ( dark grey ), x f + K ( 𝑐 I ) [ y − x f ] for3D-Var ( light grey ). Here, x and C are the mean and covariance of the (invariant measure of the)system dynamics, K ( C ) = CH (cid:62) ( HCH (cid:62) + R ) − is a gain matrix, x f is the forecast of the previous x a , and 𝑐 is a scaling factor subject to tuning. The plots show that targeting observations to thedirections of dynamical growth of the uncertainty is eﬃcacious and that, for this task, L b 𝑘 and U 𝑘 are similarly eﬀective, and superior to V 𝑘 + . Moreover, the RMSE performance of any method,also for the PF (which is our focus), never degrades with the inclusion of more observations. weighting) of particle 𝑛 ∈ [ , . . . , 𝑁 ] , at a given (implicit) time: 𝑝 ( y | x 𝑛 ) = 𝜙 (cid:0) (cid:107) y − Hx 𝑛 (cid:107) R (cid:1) , (38)where the norm is deﬁned as (cid:107) y (cid:107) R = y (cid:62) R − y , and 𝜙 is a radial density such thatEq. (38) represents an elliptical distribution, for example a Gaussian. Suppose for hort Title 33 the sake of simplicity that both the observation operator and its error covarianceare identity, i.e. H = R = I , and let 𝚷 be the matrix of orthogonal projectiononto the unstable-neutral subspace. Now, assuming the PF controls the error in theunstable-neutral subspace, it seems reasonable to assume, given the results for linearsystems of Sect. 3, that the particles will converge onto the unstable-neutral subspace: ( I − 𝚷 )( x 𝑛 − x ) → , at least as a ﬁrst-order approximation. Supposing this, and bydecomposing the norm into two orthogonal components, it can be shown that (cid:107) y − Hx 𝑛 (cid:107) = (cid:107) 𝚷 ( y − Hx 𝑛 )(cid:107) + (cid:107)( I − 𝚷 ) v (cid:107) , (39)where v is the observation noise. Thus, projecting the observations onto the unstable-neutral subspace, which would eliminate the last term of Eq. (39), merely reducesthe data mismatch by a constant (with respect to 𝑛 , the particle index). Thus 𝑝 ( 𝚷 y | x 𝑛 ) = 𝑐 𝑝 ( y | x 𝑛 ) , (40)for some 𝑐 > i.e. uncertainty, in the stable subspace, and therefore ignores observational components, including the noise , in that subspace. This should be contrasted with the situation fordynamical noise (model error), which is “up-welled” from the stable to the unstablesubspace, as per Sect. 3.1.2. This distinction exempliﬁes the diﬀerence betweenuncertainty addition (by dynamical noise) and subtraction (by likelihood updates).For nonlinear dynamical systems, as illustrated in Sect. 3.2, the particles do notneatly align with the unstable-neutral subspace. Capturing this nonlinear aspect ofthe ﬂow is generally seen as an advantage of the PF. It might be argued, though, thatthe observation noise is likely large compared to the spread of the prior particles inthe stable subspace, and therefore the observations should be reduced by discardingthe corresponding components. However, as highlighted above, the observationalnoise is constant in the particle index, and therefore its amplitude does not constitutea particular source of weight variability. Instead, the weight variability originates inthe prior, which was assumed to have low variability in this scenario.The analysis resulting in Eq. (40) assumed that H = R = I followed by H = 𝚷 .In the case of the observation operator consisting of the leading 𝑁 𝑦 ≥ 𝑛 rows ofthe transposed BLV matrix, H 𝑘 = ( L b 𝑘 ) (cid:62) 𝑁 𝑦 , the same conclusion can be derivedalong similar lines. In the general case, for any H , it is not obvious how to reducethe observations so as to only measure the unstable-neutral subspace. To accomplishthis, both Maclean and Vleck (2019) and Beeson and Namachchivaya (2020) applythe pseudo-inverse H + before their reduction. This can be costly if H is large. A morepractical approach is to reduce the observations as ˆy = ( H ˆQ ) + y , or ˆy = 𝚷 y with 𝚷 = ( H ˆQ )( H ˆQ ) + , with ˆQ = ( L b 𝑘 ) (cid:62) 𝑁 𝑦 In any case, re-doing the same derivation,Eq. (40) again follows, including the same implications for the weights.

In summary, discarding observational information outside of the unstable sub-space does not yield improvements in the PF because it already embodies this“ﬂow-dependent” information. A similar conclusion was also drawn for the iterativeensemble Kalman smoother by Bocquet and Carrassi (2017). Thus, while AUS isa powerful explanatory and diagnostics tool, it is not obvious if it can be used tocombat the curse of dimensionality for PFs.Lastly, this section adds some clariﬁcation to the inﬂuential paper by Snyder et al.(2008), whose conclusion is sometimes taken to be that the required ensemble sizefor PFs scales exponentially with observation size. Our results rather indicate thatthe performance of a well-tuned PF will not deteriorate with the inclusion of moreobservations (even if they are redundant). In other words, that the required ensemblesize depends on the rank of the state space, or more precisely for chaotic dynamics,the rank of the unstable-neutral subspace. Snyder et al. (2015) points out that the“eﬀective dimension” may be limited by the observation size if this is smaller thanthe state size. However, in case 𝑁 𝑦 < 𝑛 , no ﬁltering system using ﬂow-dependentpriors will be able to achieve satisfactory performance, because the system is notsuﬃciently observed. Of course, the question of observability is complicated byconsidering time-dependent observation networks, while localization can also beapplied to alleviate the curse of dimensionality (Farchi and Bocquet, 2018). AUS and its theoretical extensions provide a framework to interpret the asymptoticinferences of the EnKF. Provided that the forecast anomalies can be considered to beperturbations of the true physical state, and if their evolution can be approximatedby the tangent-linear model along the true trajectory, the dynamics of the EnKFanomalies can be decomposed along the Oseledec spaces of the true physical state.The stability and accuracy of the EnKF is largely determined by the ability ofthe ensemble-based gain to correct the growth of forecast errors in the unstable-neutral subspace, with respect to the uniform-complete observability of these modes.Additive noise complicates the description of the asymptotic forecast uncertaintyas uncorrected forecast errors in the span of the stable BLVs may be bounded butimpractically large. The upwelling of such errors into the ensemble span furthermorenecessitates covariance inﬂation to rectify the systematic underestimation of theforecast uncertainty in the standard KF-AUS recursion.Model stochasticity arises systematically in multiscale climate dynamics wherethere are large scale-separations between resolved and unresolved dynamic processes.In the asymptotic limit of scale-separation, unresolved dynamics can be reduced toadditive Gaussian noise due to the Central Limit Theorem; ﬁnite scale-separationin reality leads to non-Markovian memory terms in addition to additive stochasticforcing in the exact model reduction of a multiscale model, as in Mori-Zwansigformalism (Gottwald et al., 2015). Several mathematically rigorous frameworkshave been developed to model and simulate the eﬀect of small-scale dynamics hort Title 35 on the large-scale dynamics with stochastic parameterization, including averagingmethods, perturbation methods and combinations of the two — see, e.g. , the surveyof approaches by Demaeyer and Vannitsem (2018).The theory of random dynamical systems oﬀers a novel means of analysis of theDA cycle for multi-scale chaotic systems with large scale separations and modelreduction error. Characterizing the asymptotic Bayesian posterior in terms of theproperties of a random, nonlinear and ergodic attractor is a natural step forwardin the philosopy of AUS. Recent work suggests that the support of the posteriormeasure of the DA cycle can be asymptotically bounded by the support of theSRB measure in deterministic, nonlinear dynamical systems (Oljača et al., 2018).While this is an intuitively appealing result, the existence of an SRB measure indeterministic dynamics usually requires a hyperbolicity assumption which may notbe appropriate in weather and climate, e.g. (Vannitsem and Lucarini, 2016). However,many of the theoretical challenges in showing the existence of SRB measures indeterministic dynamics are relaxed in a random dynamical systems setting. Indeed,the Pesin entropy formula holds under very general assumptions for stochastic ﬂowsof diﬀeomorphisms (Liu and Qian, 2006)[theorem 3.1 and discussion on page 127],establishing the link once again between the observed instability in the dynamicsand the statistical properties of the invariant measure.For such a dynamical interpretation of the DA cycle to be credible, the correctspeciﬁcation of random models in stochastic-physical systems is a primary concern;stochastically reduced models should be speciﬁed to preserve conservation lawsand the original model’s dynamics (Cotter et al., 2019). In addition to the correctstochastic model speciﬁcation, important diﬀerences in the statistical properties ofmodel forecasts of stochastic dynamical systems have been observed due to thediscretization errors of certain low-order schemes. For example, Frank and Gottwald(2018) develop an order 2.0 Taylor scheme to correct the bias in the drift term inducedby the Euler-Maruyama scheme in their study system. Grudzien et al. (2020) likewiseﬁnd that the bias due to discritization error of the Euler-Maruyama scheme can besuﬃcient to cause ﬁlter divergence of the EnKF in the stochastically forced Lorenz-96 model. Grudzien et al. (2020) emphasize the important role of eﬃcient, weaknumerical schemes for the simulation of ensemble-based forecasts. Unlike strongconvergent schemes, numerical schemes that converge in the weak sense can makereductions in the complexity of simulation by emphasizing the accuracy of theconvergence of the ensemble to the forecast distribution rather than the accuracy ofany individual ensemble member.

Chaos is ubiquitous in natural, physical and laboratory systems. Scientists have longcoped with this whenever attempting to model, predict or control such systems.Combining and confronting models with data is common in science and data as-similation (DA) is the term coined in the context of numerical weather prediction science to encompass the methods that perform such a combination. The outputs ofDA is the improved representation of the system under study, and an estimate of theassociated uncertainty.Inevitably, the sensitivity to initial conditions of chaotic systems, including state-dependence of the directions of error growth, is a challenge for DA. On the otherhand, the energy dissipation typical of real systems implies a “dimension reduction”in that errors are conﬁned within a subspace of the full system’s phase space.The DA process requires furnishing a prior distribution, whose speciﬁcation isa severe diﬃculty in high dimension. Gaussian methods reduce the complexity tothat of estimating the prior mean and covariance. Yet, with the exceptionally high-dimensions of geophysical problems, a proper estimate, and storage of the priorcovariance matrices is still challenging. Thus, suitable reduced-rank formulationsshould be used to lower the computational load while maintaining a good descriptionof the errors about the full system.In chaotic dissipative systems this goal can be achieved by monitoring theunstable-neutral subspace of the model dynamics and performing DA only withinthat subspace. This is the idea at the basis of the class of methods known as assim-ilation in the unstable subspace (Palatella et al., 2013), that were developed mainlyin the years between 2004 to 2015; we reviewed them in Sect. 4. Despite reducingthe problem size to that of the unstable-neutral subspace (of dimension 𝑛 (cid:28) 𝑁 𝑥 ),AUS methods proved aptly skillful, very close to those of their more costly full-rankcompetitors.However, the reduction of cost achieved at the analysis steps is oﬀset by the addi-tional cost of monitoring and tracking the unstable-neutral subspace. This requirescomputing the Lyapunov vectors (usually the backward Lyapunov vectors), whichimplies computing the tangent linear model (or alternatively evolving an ensembleof bred perturbations mimicking the evolution under the tangent linear model) and arepeated QR matrix decomposition. For speciﬁc purposes, one can opt for trackingonly a few dominant unstable modes. This was the case for early adaptive (targeted)observation results (Carrassi et al., 2007) or when AUS was used to complement aclassical DA method (Carrassi et al., 2008b).In parallel to the early developments of AUS, several studies with ensemble DAmethods in chaotic dissipative systems were suggesting that a number of their keyproperties were related to the model instabilities, including the rank and span ofthe ensemble-based forecast error covariance, as well as the skill of the analysis(Sakov and Oke, 2008b; Carrassi et al., 2009; Ng et al., 2011; Bocquet and Sakov,2014). These results indicate that the ensemble anomalies automatically align withthe unstable-neutral subspace, thus resulting in the analysis to be conﬁned to it.To put this mechanism on a more rigorous theoretical footing, a stream of workshave studied the relation between the unstable-neutral subspace in linear systemsusing the Kalman ﬁlter (KF) and Kalman smoother (KS) (Gurumoorthy et al., 2017;Bocquet et al., 2017; Bocquet and Carrassi, 2017). These works, reviewed in Sect. 3.1,have provided analytic proofs that the span of the error covariance matrices of theKF and KS tends asymptotically to the unstable–neutral subspace, independent ofthe initial condition ( i.e. no matter the number of the ensemble members, provided hort Title 37 it exceeds the size of the unstable-neutral subspace). For stochastic systems withadditive noise it was proved that asymptotically weakly stable modes, that one mightdiscard in deterministic systems, must be included and analytic bounds for the errorwere provided (Grudzien et al., 2018a).How do these results hold for nonlinear systems? In the case of chaotic deter-ministic systems, this was studied in Bocquet and Carrassi (2017) and further inSect. 3.2.1 of this chapter. It was numerically showed that an ensemble comprisingat least as many members as the size of the unstable-neutral subspace plus one( 𝑁 ≥ 𝑛 +

1) is needed to achieve satisfactorily skill with the ensemble Kalman ﬁlter(EnKF). Section 3.2.1 also considered the case of a coupled multiscale system witha quasi-degenerate spectrum of Lyapunov exponents. This originates in the pres-ence of many close-to-zero exponents that are related to the coupling mechanisms(Vannitsem and Lucarini, 2016). It is shown that their full inclusion in the ensembledesign is needed to reduce the EnKF analysis error to a satisfactorily low level.The case of nonlinear stochastic chaotic systems with additive noise was studiedin Grudzien et al. (2018b) and reviewed in Sect. 3.2.2. The section explains the roleof the weakly stable modes already identiﬁed in linear systems, but also discoveredthe upwelling mechanism for which uncertainty is upwelled from unﬁltered (stable)modes to ﬁltered (unstable) ones. The upwelling phenomenon is not exclusive ofnonlinear systems and it is in fact present in linear systems too. It provides an addi-tional rationale to the use of multiplicative inﬂation, otherwise known by numericallyevidence to be needed for a proper functioning of reduced-rank ﬁlters even in perfectmodel scenarios (see e.g.

Raanes et al., 2019).An outlook at how this research may evolve is given in Sect. 5. In particular inSect. 5.1 we provide original results on the use of the AUS approach, i.e. exploiting theunstable-neutral subspace, in particle ﬁlters (PFs), a fully non-Gaussian DA method.Results indicate that targeting observations within the unstable-neutral subspaceis very eﬀective. However, by analogy with what was proved for the EnKF inSect. 3.2.1, adding observations along the stable modes does not deteriorate theanalysis. In the particle ﬁlter too, the particles automatically align along the unstable-neutral subspace so that the contribution from observations in its complement stablesubspace is negligible. Our results shed also new insight on the scaling of theparticle numbers needed to reach convergence. It is shown to depend on the sizeof the unstable-neutral subspace rather than the observation vector size. In Sect.5.2we surveyed how the novel concept of random attractors could oﬀer new ways tofurther exploit the idea behind AUS on stochastic multi-scale systems with largescale separation. Moreover, we also described how to amend numerical integrationschemes when doing ensemble DA on such systems, as a trade-oﬀ between accuracyand computational cost.It is important to recall that all of the results with the EnKF and PF that wehave presented are obtained without the use of localization (see e.g.

Carrassi et al.(2018), their chapter 4.4, and Farchi and Bocquet (2018) for localisation in the EnKFand PF, respectively). While we are well aware of the dramatic positive impactof localization on the ﬁlters’ skill, we intentionally did not use it as it artiﬁcially changes the ensemble-covariance rank and span, thus making it impossible to linkthem exclusively to the model instabilities.The use of the time-dependent unstable-neutral subspace to represent uncertaintyin dynamical systems is still very appealing and potentially prone to success in awider area than explored so far. For instance, in a recent work by Bocquet et al.(2020) model error arise from parameter mispeciﬁcation and the EnKF is appliedto estimate simultaneously the (chaotic) model state and 𝑁 𝑝 parameters. The EnKFis used in the state-augmentation formulation and the standard persistence model isadopted for the parameter dynamics. It was shown that the bound for the necessaryensemble size becomes 𝑁 ≥ 𝑛 + 𝑁 𝑝 + 𝑁 𝑝 additional members are required toinfer the 𝑁 𝑝 parameters. While the linear one-to-one relation between the numberof parameters and that of the additional members is a consequence of the choice ofa persistence model and will change if a diﬀerent parameters dynamics is in place,this result further highlights how much the design of the EnKF is, and can be tied tothe properties of the dynamical model.Future developments along these lines will unavoidably have to tackle the obstacleof the computing cost of the unstable-neutral subspace. We speculate that recentprogress in the area of machine learning (Goodfellow et al., 2016) may help. Neuralnetwork surrogate models of chaotic systems have shown capabilities to reproducethe spectrum of the asymptotic Lyapunov exponents fairly well (Pathak et al., 2017;Brajard et al., 2020). It is matter of future investigations to explore the possibilities ofmachine learning algorithms that learn about the time-dependent instabilities fromoﬄine long model simulations and then assist the model in the prediction mode byproviding a proxy of the unstable-subspace at each analysis time. Acknowledgements

AC has been funded by the UK Natural Environment Research Council awardNCEO02004. CEREA is member of Institut Pierre–Simon Laplace (IPSL). PNR has been partlyfunded by DIGIRES, a project sponsored by industry partners and the PETROMAKS2 programmeof the Research Council of Norway.

References

L. Y. Adrianova.

Introduction to linear systems of diﬀerential equations . AmericanMathematical Soc., 1995.M. Asch, M. Bocquet, and M. Nodet.

Data Assimilation: Methods, Algorithms, andApplications . SIAM, 2016.L. Barreira and Y. B. Pesin.

Lyapunov Exponents and Smooth Ergodic Theory .Student Mathematical Library. American Mathematical Society, 2002. ISBN9780821829219.R. Beeson and N. S. Namachchivaya. Particle ﬁltering for chaotic dynamical systemsusing future right-singular vectors.

Nonlinear Dynamics , June 2020. doi: 10.1007/s11071-020-05727-y. hort Title 39

G. Benettin, L. Galgani, A. Giorgilli, and J. Strelcyn. Lyapunov characteristicexponents for smooth dynamical systems and for Hamiltonian systems; a methodfor computing all of them. Part 1: Theory.

Meccanica , 15(1):9–20, 1980.M. Bocquet. Ensemble Kalman ﬁltering without the intrinsic need for inﬂation.

Nonlinear Processes in Geophysics , 18(5):735–750, 2011.M. Bocquet and A. Carrassi. Four-dimensional ensemble variational data assimila-tion and the unstable subspace.

Tellus A , 69(1):1304504, 2017.M. Bocquet and P. Sakov. An iterative ensemble Kalman smoother.

Q. J. R. Meteorol.Soc. , 140:1521–1535, 2014.M. Bocquet, K. S. Gurumoorthy, A. Apte, A. Carrassi, C. Grudzien, and C. K. R. T.Jones. Degenerate Kalman ﬁlter error covariances and their convergence ontothe unstable subspace.

SIAM/ASA Journal on Uncertainty Quantiﬁcation , 5(1):304–333, 2017.M. Bocquet, A. Farchi, and Q. Malartic. Online learning of both state and dynamicsusing ensemble kalman ﬁlters.

Foundation of Data Science , 0:00–00, 2020. doi:10.3934/fods.2020015. Accepted for publication.J. Brajard, A. Carrassi, M. Bocquet, and L. Bertino. Combining data assimilationand machine learning to emulate a dynamical model from sparse and noisy ob-servations: A case study with the lorenz 96 model.

Journal of ComputationalScience , 44:101171, 2020.R. Buizza, J. Tribbia, F. Molteni, and T. N. Palmer. Computation of optimal unstablestructures for a numerical weather prediction model.

Tellus A , 45(5):388–407,1993.A. Carrassi, A. Trevisan, and F. Uboldi. Adaptive observations and assimilation inthe unstable subspace by breeding on the data-assimilation system.

Tellus A , 59(1):101–113, 2007.A. Carrassi, M. Ghil, A. Trevisan, and F. Uboldi. Data assimilation as a nonlinear dy-namical systems problem: Stability and convergence of the prediction-assimilationsystem.

Chaos , 18:023112, 2008a.A. Carrassi, A. Trevisan, L. Descamps, O. Talagrand, and F. Uboldi. Controllinginstabilities along a 3DVar analysis cycle by assimilating in the unstable subspace:a comparison with the EnKF.

Nonlinear Processes in Geophysics , 15:503–521,2008b.A. Carrassi, S. Vannitsem, D. Zupanski, and M. Zupanski. The maximum likelihoodensemble ﬁlter performances in chaotic systems.

Tellus A , 61:587–600, 2009.A. Carrassi, M. Bocquet, L. Bertino, and G. Evensen. Data assimilation in thegeosciences-an overview on methods, issues and perspectives.

WIREs ClimChange , e535., 2018. doi: 10.1002/wcc.535.C. Cotter, D. Crisan, D. D. Holm, W. Pan, and I. Shevchenko. Numerically modelingstochastic lie transport in ﬂuid dynamics.

Multiscale Modeling & Simulation , 17(1):192–232, 2019.L. De Cruz, J. Demaeyer, and S. Vannitsem. The modular arbitrary-order ocean-atmosphere model: maooam v1.0.

Geoscientiﬁc Model Development , 9(8):2793–2808, 2016. doi: 10.5194/gmd-9-2793-2016.

J. Demaeyer and L. De Cruz. Climdyn/qgs: qgs version 0.2.0 release, July 2020.URL https://doi.org/10.5281/zenodo.3941877 .J. Demaeyer and S. Vannitsem. Stochastic parameterization of subgrid-scale pro-cesses: A review of recent physically based approaches. In

Advances in NonlinearGeosciences , pages 55–85. Springer, 2018.J. Demaeyer, L. De Cruz, and S. Vannitsem. qgs: A ﬂexible python frameworkof reduced-order multiscale climate models.

Journal of Open Source Software ,Submitted. URL https://raw.githubusercontent.com/openjournals/joss-papers/joss.02597/joss.02597/10.21105.joss.02597.pdf .L. Dieci and E. S. Van Vleck. Lyapunov spectral intervals: theory and computation.

SIAM Journal on Numerical Analysis , 40(2):516–542, 2002.L. Dieci and E. S. Van Vleck. Lyapunov and sacker–sell spectral intervals.

Journalof dynamics and diﬀerential equations , 19(2):265–293, 2007.G. Evensen.

Data Assimilation: The Ensemble Kalman Filter . Springer-VerlagBerlin Heildelberg, second edition, 2009a.G. Evensen.

Data assimilation: the ensemble Kalman ﬁlter . Springer Science &Business Media, 2009b.A. Farchi and M. Bocquet. Review article: Comparison of local particle ﬁlters andnew implementations.

Nonlinear Processes in Geophysics , 25(4):765–807, Nov.2018. doi: 10.5194/npg-25-765-2018.A. Fillion, M. Bocquet, S. Gratton, S. Gürol, and P. Sakov. An iterative ensemblekalman smoother in presence of additive model error.

SIAM/ASA Journal onUncertainty Quantiﬁcation , 8(1):198–228, 2020. doi: 10.1137/19M1244147.N. Fourrié, D. Marchal, F. Rabier, B. Chapnik, and G. Desroziers. Impact studyof the 2003 north atlantic thorpex regional campaign.

Quarterly Journal of theRoyal Meteorological Society , 132(615):275–295, 2006.J. Frank and G. A. Gottwald. A note on statistical consistency of numerical integratorsfor multiscale dynamics.

Multiscale Modeling & Simulation , 16(2):1017–1033,2018.G. Froyland, T. Hüls, G. P. Morriss, and T. M. Watson. Computing covariantlyapunov vectors, oseledets vectors, and dichotomy projectors: A comparativenumerical study.

Physica D: Nonlinear Phenomena , 247(1):18–39, 2013.M. Ghil and P. Malanotte-Rizzoli. Data assimilation in meteorology and oceanog-raphy. In

Advances in geophysics , volume 33, pages 141–266. Elsevier, 1991.I. Goodfellow, Y. Bengio, and A. Courville.

Deep learning . MIT press, 2016.G. A. Gottwald, D. Crommelin, and C. Franzke. Stochastic climate theory.

Nonlinearand Stochastic Climate Dynamics , pages 209–240, 2015.C. Grudzien, A. Carrassi, and M. Bocquet. Asymptotic forecast uncertainty and theunstable subspace in the presence of additive model error.

SIAM/ASA Journal onUncertainty Quantiﬁcation , 6(4):1335–1363, 2018a.C. Grudzien, A. Carrassi, and M. Bocquet. Chaotic dynamics and the role ofcovariance inﬂation for reduced rank Kalman ﬁlters with model error.

NonlinearProcesses in Geophysics , 25(3):633–648, 2018b. hort Title 41

C. Grudzien, M. Bocquet, and A. Carrassi. On the numerical integration of thelorenz-96 model, with scalar additive noise, for benchmark twin experiments.

Geoscientiﬁc Model Development , 13(4):1903–1924, 2020.K. S. Gurumoorthy, C. Grudzien, A. Apte, A. Carrassi, and C. K. R. T. Jones. Rankdeﬁciency of Kalman error covariance matrices in linear time-varying systemwith deterministic evolution.

SIAM Journal on Control and Optimization , 55(2):741–759, 2017.T. M. Hamill, F. Yang, C. Cardinali, and S. J. Majumdar. Impact of targeted win-ter storm reconnaissance dropwindsonde data on midlatitude numerical weatherpredictions.

Monthly weather review , 141(6):2058–2065, 2013.B. R. Hunt, E. Kalnay, E. J. Kostelich, E. Ott, D. J. Patil, T. Sauer, I. Szunyogh, J. A.Yorke, and A. V. Zimin. Four-dimensional ensemble Kalman ﬁltering.

Tellus A ,56(4):273–277, 2004.A. H. Jazwinski.

Stochastic Processes and Filtering Theory . Academic Press,New-York, 1970.R. E. Kalman. A new approach to linear ﬁltering and prediction problems.

Journalof Fluids Engineering , 82:35–45, 1960.E. Kalnay.

Atmospheric modeling, data assimilation and predictability . CambridgeUniversity Press, 2003.P. V. Kuptsov and U. Parlitz. Theory and computation of covariant lyapunov vectors.

Journal of Nonlinear Science , 22(5):727–762, 2012.B. Legras and R. Vautard. A guide to lyapunov vectors. In

ECMWF Workshop onPredictability , pages 135–146, Reading, United-Kingdom, 1996. ECMWF.P. D. Liu and M. Qian.

Smooth ergodic theory of random dynamical systems .Springer, 2006.E. N. Lorenz. Deterministic nonperiodic ﬂow.

Journal of the atmospheric sciences ,20(2):130–141, 1963.E. N. Lorenz. Predictability: A problem partly solved. In

Proc. Seminar on pre-dictability , volume 1, 1996.J. Maclean and E. S. V. Vleck. Particle ﬁlters for data assimilation based on reducedorder data models, 2019.L. Mitchell and A. Carrassi. Accounting for model error due to unresolved scaleswithin ensemble Kalman ﬁltering.

Quarterly Journal of the Royal MeteorologicalSociety , 141(689):1417–1428, 2015.G. H. C. Ng, D. McLaughlin, D. Entekhabi, and A. Ahanin. The role of modeldynamics in ensemble Kalman ﬁlter performance for chaotic systems.

Tellus A ,63(5):958–977, 2011.T. N. Nipen, I. A. Seierstad, C. Lussana, J. Kristiansen, and O. Hov. AdoptingCitizen Observations in Operational Weather Prediction.

Bulletin of the AmericanMeteorological Society , 101(1):E43–E57, 10 2019. ISSN 0003-0007. doi: 10.1175/BAMS-D-18-0237.1.L. Oljača, J. Bröcker, and T. Kuna. Almost sure error bounds for data assimilation indissipative systems with unbounded observation noise.

SIAM Journal on AppliedDynamical Systems , 17(4):2882–2914, 2018. doi: 10.1137/17M1162305.

L. Palatella and F. Grasso. The ekf-aus-nl algorithm implemented without the lineartangent model and in presence of parametric model error.

SoftwareX , 7:28 – 33,2018. ISSN 2352-7110.L. Palatella and A. Trevisan. Interaction of Lyapunov vectors in the formulation ofthe nonlinear extension of the Kalman ﬁlter.

Physical Review E , 91:042905, 2015.L. Palatella, A. Carrassi, and A. Trevisan. Lyapunov vectors and assimilation in theunstable subspace: theory and applications.

Journal of Physics A: Mathematicaland Theoretical , 46:254020, 2013.T. Palmer, R. Gelaro, J. Barkmeĳer, and R. Buizza. Singular vectors, metrics, andadaptive observations.

Journal of the Atmospheric Sciences , 55(4):633–653, 1998.J. Pathak, Z. Lu, B. R. Hunt, M. Girvan, and E. Ott. Using machine learning toreplicate chaotic attractors and calculate lyapunov exponents from data.

Chaos:An Interdisciplinary Journal of Nonlinear Science , 27(12):121102, 2017.S. Penny, E. Bach, K. Bhargava, C.-C. Chang, C. Da, L. Sun, and T. Yoshida.Strongly coupled data assimilation in multiscale media: Experiments using aquasi-geostrophic coupled model.

Journal of Advances in Modeling Earth Sys-tems , 11(6):1803–1829, 2019.S. G. Penny. Mathematical foundations of hybrid data assimilation from a synchro-nization perspective.

Chaos: An Interdisciplinary Journal of Nonlinear Science ,27(12):126801, 2017.S. G. Penny and T. M. Hamill. Coupled data assimilation for integrated earth systemanalysis and prediction.

Bulletin of the American Meteorological Society , 98(7):ES169–ES172, 2017.A. Pikovsky and A. Politi.

Lyapunov exponents: a tool to explore complex dynamics .Cambridge University Press, 2016.H. Poincaré.

Les méthodes nouvelles de la mécanique céleste. Tome III . GAUTHIER-VILLARS, 1899.R. Potthast, A. Walter, and A. Rhodin. A localized adaptive particle ﬁlter within anoperational NWP framework.

Monthly Weather Review , 147(1):345–362, 2019.P. N. Raanes, A. Carrassi, and L. Bertino. Extending the square root method toaccount for additive forecast noise in ensemble methods.

Monthly Weather Review ,143(10):3857–3873, 2015.P. N. Raanes, M. Bocquet, and A. Carrassi. Adaptive covariance inﬂation in theensemble kalman ﬁlter by gaussian scale mixtures.

Quarterly Journal of theRoyal Meteorological Society , 145(718):53–75, 2019.B. B. Reinhold and R. T. Pierrehumbert. Dynamics of weather regimes: Quasi-stationary waves and blocking.

Monthly Weather Review , 110(9):1105–1145,1982.D. Ruelle. Ergodic theory of diﬀerentiable dynamical systems.

Inst. Hautes ÉtudesSci. Publ. Math. , 50(50):27–58, 1979. ISSN 0073-8301.P. Sakov and L. Bertino. Relation between two common localisation methods forthe enkf.

Computational Geosciences , 15(2):225–237, 2011.P. Sakov and P. R. Oke. A deterministic formulation of the ensemble Kalman ﬁlter:an alternative to ensemble square root ﬁlters.

Tellus A , 60(2):361–371, 2008a. hort Title 43

P. Sakov and P. R. Oke. Implications of the form of the ensemble transformationin the ensemble square root ﬁlters.

Monthly Weather Review , 136(3):1042–1053,2008b.P. Sakov, J. M. Haussaire, and M. Bocquet. An iterative ensemble Kalman ﬁlter inpresence of additive model error.

Quarterly Journal of the Royal MeteorologicalSociety , 2018.I. Shimada and T. Nagashima. A numerical approach to ergodic problem of dis-sipative dynamical systems.

Progress of Theoretical Physics , 61(6):1605–1616,1979.C. Snyder. Summary of an informal workshop on adaptive observations and fastex.

Bulletin of the American Meteorological Society , 77(5):953–961, 1996.C. Snyder, T. Bengtsson, P. Bickel, and J. Anderson. Obstacles to high-dimensionalparticle ﬁltering.

Monthly Weather Review , 136(12):4629–4640, 2008.C. Snyder, T. Bengtsson, and M. Morzfeld. Performance bounds for particle ﬁltersusing the optimal proposal.

Monthly Weather Review , 143(11):4750–4761, 2015.I. Szunyogh, Z. Toth, A. V. Zimin, S. J. Majumdar, and A. Persson. Propagation of theeﬀect of targeted observations: The 2000 winter storm reconnaissance program.

Monthly weather review , 130(5):1144–1165, 2002.P. D. Thompson. Uncertainty of initial state as a factor in the predictability oflarge scale atmospheric ﬂow patterns.

Tellus , 9(3):275–295, 1957. doi: 10.1111/j.2153-3490.1957.tb01885.x.M. Tondeur, A. Carrassi, S. Vannitsem, and M. Bocquet. On temporal scale sepa-ration in coupled data assimilation with the ensemble kalman ﬁlter.

Journal ofStatistical Physics , 179:1161–1185, 2020.Z. Toth and E. Kalnay. Ensemble forecasting at nmc: The generation of perturbations.

Bulletin of the american meteorological society , 74(12):2317–2330, 1993.Z. Toth and E. Kalnay. Ensemble forecasting at NCEP and the breeding method.

Monthly Weather Review , 125(12):3297–3319, 1997.A. Trevisan and R. Legnani. Transient error growth and local predictability: A studyin the lorenz system.

Tellus A , 47(1):103–117, 1995.A. Trevisan and L. Palatella. On the Kalman ﬁlter error covariance collapse into theunstable subspace.

Nonlinear Processes in Geophysics , 18(2):243–250, 2011.A. Trevisan and F. Uboldi. Assimilation of standard and targeted observations withinthe unstable subspace of the observation-analysis-forecast cycle.

Journal of theAtmospheric Sciences , 61:103–113, 2004a.A. Trevisan and F. Uboldi. Assimilation of standard and targeted observations withinthe unstable subspace of the observation–analysis–forecast cycle system.

Journalof the Atmospheric Sciences , 61(1):103–113, 2004b.A. Trevisan, M. D’Isidoro, and O. Talagrand. Four-dimensional variational assim-ilation in the unstable subspace and the optimal subspace dimension.

Q. J. R.Meteorol. Soc. , 136:487–496, 2010.F. Uboldi and A. Trevisan. Detecting unstable structures and controlling error growthby assimilation of standard and adaptive observations in a primitive equation oceanmodel.

Nonlinear Processes in Geophysics , 16:67–81, 2006.

F. Uboldi, A. Trevisan, and A. Carrassi. Developing a dynamically based assim-ilation method for targeted and standard observations.

Nonlinear Processes inGeophysics , 12(1):149–156, 2005. doi: 10.5194/npg-12-149-2005.G. K. Vallis.

Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Circulation . Cambridge University Press, 2 edition, 2017. doi: 10.1017/9781107588417.S. Vannitsem. Predictability of large-scale atmospheric motions: Lyapunov expo-nents and error dynamics.

Chaos: An Interdisciplinary Journal of NonlinearScience , 27(3):032101, 2017.S. Vannitsem and W. Duan. On the use of near-neutral backward lyapunov vectorsto get reliable ensemble forecasts in coupled ocean–atmosphere systems.

ClimateDynamics , 55:1125–1139, 2020. doi: 10.1007/s00382-020-05313-3.S. Vannitsem and V. Lucarini. Statistical and dynamical properties of covariantlyapunov vectors in a coupled atmosphere-ocean model—multiscale eﬀects, geo-metric degeneracy, and error dynamics.

Journal of Physics A: Mathematical andTheoretical , 49(22):224001, 2016.S. Vannitsem, J. Demaeyer, L. De Cruz, and M. Ghil. Low-frequency variabilityand heat transport in a low-order nonlinear coupled ocean–atmosphere model.