[PDF] On the Mathematical Theory of Ensemble (Linear-Gaussian) Kalman-Bucy Filtering

Abstract

The purpose of this review is to present a comprehensive overview of the theory of ensemble Kalman-Bucy filtering for linear-Gaussian signal models. We present a system of equations that describe the flow of individual particles and the flow of the sample covariance and the sample mean in continuous-time ensemble filtering. We consider these equations and their characteristics in a number of popular ensemble Kalman filtering variants. Given these equations, we study their asymptotic convergence to the optimal Bayesian filter. We also study in detail some non-asymptotic time-uniform fluctuation, stability, and contraction results on the sample covariance and sample mean (or sample error track). We focus on testable signal/observation model conditions, and we accommodate fully unstable (latent) signal models. We discuss the relevance and importance of these results in characterising the filter's behaviour, e.g. it's signal tracking performance, and we contrast these results with those in classical studies of stability in Kalman-Bucy filtering. We provide intuition for how these results extend to nonlinear signal models and comment on their consequence on some typical filter behaviours seen in practice, e.g. catastrophic divergence.

Full PDF

OOn the Mathematical Theory of Ensemble (Linear-Gaussian)Kalman-Bucy Filtering

Adrian N. Bishop and Pierre Del Moral CSIRO; and University of Technology Sydney (UTS), Australia INRIA, Bordeaux Research Center, France

Abstract

The purpose of this review is to present a comprehensive overview of the theory ofensemble Kalman-Bucy ﬁltering for linear-Gaussian signal models. We present a system ofequations that describe the ﬂow of individual particles and the ﬂow of the sample covarianceand the sample mean in continuous-time ensemble ﬁltering. We consider these equationsand their characteristics in a number of popular ensemble Kalman ﬁltering variants. Giventhese equations, we study their asymptotic convergence to the optimal Bayesian ﬁlter. Wealso study in detail some non-asymptotic time-uniform ﬂuctuation, stability, and contractionresults on the sample covariance and sample mean (or sample error track). We focus ontestable signal/observation model conditions, and we accommodate fully unstable (latent)signal models. We discuss the relevance and importance of these results in characterisingthe ﬁlter’s behaviour, e.g. it’s signal tracking performance, and we contrast these resultswith those in classical studies of stability in Kalman-Bucy ﬁltering. We provide intuitionfor how these results extend to nonlinear signal models and comment on their consequenceon some typical ﬁlter behaviours seen in practice, e.g. catastrophic divergence.

Contents a r X i v : . [ m a t h . S T ] J un Introduction

Consider a time-invariant, continuous-time, signal and observation model of the form, d X t = a ( X t ) dt + R / d V t d Y t = h ( X t ) dt + R / d W t (1.1)where X t is the underlying signal (latent) process, Y t is the observation signal, a ( · ) and h ( · ) are the signal and sensor model functions, and V t and W t are continuous-time Brownian motion(noise) signals. The ﬁltering problem [2, 6] is concerned with estimating some statistic(s) of thesignal X t conditioned on the observations Y s , ≤ s ≤ t . For example, one may want to charac-terise fully the distribution of X t given Y t , or one may seek some moments of this distribution.The conditional distribution of X t given Y s , ≤ s ≤ t is called the (optimal, Bayesian) ﬁlter-ing distribution. When the model functions a ( · ) , h ( · ) are linear, the exact (optimal, Bayesian)solution to this problem is completely characterised by the ﬁrst two moments of the ﬁlteringdistribution and these moments are given by the celebrated Kalman-Bucy ﬁlter [43, 2, 8].Apart from the most pathological model cases [6], in the nonlinear model setting, there isin general no ﬁnite dimensional optimal ﬁlter. In practice, typically some ﬁlter approximationsare needed. For example, one may consider a type of “extended” Kalman ﬁlter [2] based onlinearisation of the nonlinear model and application of the classical Kalman-Bucy ﬁlter. Thismethod works well in suitably regular, and suﬃciently close to linear problems. This methoddoes not handle multiple modes in the true conditional ﬁltering distribution well. So-calledGaussian sum ﬁlters are another Kalman-ﬁlter-type approximation designed to handle in somesense multiple modes in the ﬁltering distribution [2]. More recently there has been some focuson Monte Carlo integration methods for approximating the optimal Bayesian ﬁlter [24, 6]. Suchmethods, termed particle ﬁlters or sequential Monte Carlo ﬁlters/methods [34, 29, 28], havethe advantage of not being subject to the assumption of linearity or Gaussianity in the model.These particle ﬁlters are consistent in the number of Monte Carlo samples, i.e. with inﬁnitecomputational power these methods converge to the optimal nonlinear ﬁlter. However, typicalparticle ﬁltering algorithms exhibit high computational costs with approximation errors thatgrow (with a ﬁxed sample size) with the signal/observation dimensions [24, 68]. These methodsin practice are not scalable to the high-dimensional ﬁltering or state estimation problems foundin the geosciences and other areas [32, 44, 53, 70].The ensemble Kalman-Bucy ﬁlter (generally abbreviated EnKF ) [31, 32] is a type of MonteCarlo sampling approximation of the Kalman ﬁlter. In particular, the

EnKF is a recursive algo-rithm for propagating and updating the sample mean and sample covariance of an approximatedBayesian ﬁlter [32]. The ﬁlter works via the evolution of a collection (i.e. an ensemble) of samples(or ensemble members, or particles) that each satisﬁes a type of Kalman-Bucy update equation.In classical Kalman-Bucy ﬁltering [43, 2, 8], a gain function weights a predicted state estimatewith the signal observations dependent on the ﬁltering error covariance matrix, see [32, 8]. Inthe

EnKF , the error covariance is replaced by a type of sample covariance in the gain function.The result is a system of interacting particles in the spirit of a mean-ﬁeld approximation of acertain McKean-Vlasov-type diﬀusion equation [61, 26]. We may reﬁne this discussion by givingthe relevant equations for a most basic form of

EnKF . Let ( V it , W it , X i ) with ≤ i ≤ N + 1 be ( N + 1) independent copies of ( V t , W t , X ) . The most basic ensemble Kalman ﬁlter, originallydue to Evensen [18, 31, 32], is deﬁned by, d X it = a ( X it ) dt + R / d V it + (cid:98) P ht R − (cid:104) d Y t − (cid:16) h ( X it ) dt + R / d W it (cid:17)(cid:105) (1.2)with ≤ i ≤ N + 1 and the (particle) sample mean and the sample cross-covariance deﬁning the2o-called Kalman gain matrix given by, (cid:98) X t := 1 N + 1 N +1 (cid:88) i =1 X it and (cid:98) P ht := 1 N N +1 (cid:88) i =1 (cid:104) X it − (cid:98) X t (cid:105) (cid:34) h ( X it ) − N + 1 N +1 (cid:88) i =1 h ( X it ) (cid:35) (cid:48) (1.3)and we may also write the standard sample covariance, (cid:98) P t := 1 N N +1 (cid:88) i =1 (cid:104) X it − (cid:98) X t (cid:105) (cid:104) X it − (cid:98) X t (cid:105) (cid:48) (1.4)In this work we study this most basic ensemble Kalman ﬁlter as described above, and also moresophisticated variants, including the method of Sakov and Oke [72], that exhibit less ﬂuctuationdue to sampling noise. Readers familiar with the Kalman ﬁlter will recognise immediately somestructural similarities as discussed above. However, there is no evolution equation given abovefor the covariance as in the Kalman ﬁlter (e.g. no Riccati-type matrix ﬂow equation). Instead,we replace the relevant covariance matrices with their sample-based counterparts.Importantly, if the underlying model is linear and Gaussian, then the optimal Bayesian ﬁlter-ing distribution is Gaussian, and the EnKF propagates exactly the sample mean and covarianceof the optimal Bayesian ﬁlter and is provably consistent. If the model is nonlinear and/or non-Gaussian, then a standard implementation of the

EnKF propagates a sample-based estimate ofthe ﬁltering mean and covariance (but not the true posterior sample mean or covariance andwith no results on consistency). In the context of estimation theory, we may contrast the notionof a state estimator (or observer) with the notion of a Bayesian ﬁlter. The goal of the formeris to design an observer that tracks in some suitable (typically point-wise) sense the underlyingsignal and perhaps provides some usable measure of uncertainty on this estimate. The goalof the latter is to compute or approximate the true (Bayesian) ﬁltering distribution (or somerelated statistics). In the nonlinear setting, even with inﬁnite computational power, the

EnKF methods do not converge to the optimal nonlinear ﬁlter and indeed their limiting objects arenot well understood in this setting. As discussed more technically later, ensemble Kalman ﬁltersare probably best viewed in practice as a type of (random) sample-based observer or state esti-mator for nonlinear signal/observation models. However, in the special case of linear signal andobservation models they are indeed provably consistent approximations of the optimal Bayesianﬁlter.In practice, the ensemble Kalman ﬁltering methodology is applied in high-dimensional, non-linear state-space models, e.g. see [31, 32] and the application references listed later in thisintroduction. Empirically, this method has shown good tracking performance in these applica-tions, see [32] and the application references listed later. This tracking behaviour of the

EnKF when applied to practical models may be explainable as a consequence of viewing the

EnKF inthe sense of a dynamic state estimator, cf. the preceding discussion on ﬁltering versus stateestimators/observers. The ﬂuctuation, stability and contraction properties of the

EnKF exploredin this article (albeit mainly for linear-Gaussian models) may be viewed in this context also andprovide some insight into the state estimate tracking behaviour seen in practice.

The purpose of this review is to present a comprehensive overview of the theory of ensembleKalman-Bucy ﬁltering with an emphasis on rigorous results and behavioural characterisations forlinear-Gaussian signal/observation models. We present a system of equations that describe theﬂow of individual particles, the ﬂow of the sample covariance, and the ﬂow of the sample meanin continuous-time ensemble ﬁltering. We consider these equations and their characteristics ina number of popular

EnKF varieties. Given these equations, we study in detail some ﬂuctuation,stability, and error contraction results for the relevant ensemble Kalman ﬁltering equations .3e discuss the relevance and importance of these results in terms of characterising the

EnKF behaviour, and we contrast these results with those considered in classical studies of stability inKalman-Bucy ﬁltering.We remark that classical studies of stability in (traditional, non-ensemble-type) Kalman-Bucy ﬁltering are important because they rigorously establish the type of “tracking” propertiesdesired in a ﬁltering or estimation problem; and they establish intuitive, testable, model-basedconditions (e.g. so-called model observability) for achieving these convergence properties. Clas-sical results in Kalman-Bucy ﬁltering also establish the (exponential) convergence of the errorcovariance to a ﬁxed steady-state value computable from the model parameters. See the review[8, 10] for detailed results in the classical context and historical remarks. The results in thiswork seek to characterise in an analogous manner the practical performance and behaviour ofensemble Kalman ﬁltering, and these results then provide guidance and intuition on the tracking,approximation error, and other properties of these practical methods.

The

EnKF is a key numerical method for solving high-dimensional forecasting and data assimi-lation problems; see, e.g., [31, 32]. In particular, applications have been motivated by inferenceproblems in ocean and atmosphere sciences [56, 58, 44, 65], weather forecasting [3, 4, 18, 38],environmental and ecological statistics [1, 42], as well as in oil reservoir simulations [33, 64, 76],and many others. This list is by no means exhaustive, nor the cited articles fully representa-tive of the respective applications. We refer to (some of) the seminal methodology papers in[30, 38, 18, 5, 39, 36, 3, 4, 79, 72, 69, 83]. This long list is not exhaustive; see also [32, 44, 53, 70]for more background, and the detailed chronological list of references in Evensen’s text [32]. En-semble Kalman methods for inverse problems have also been considered in the literature [41, 21]with some related analysis [74, 75]. See these references for further details on this topic.In continuous-time, we may broadly break down the class of

EnKF methods into three distincttypes; distinguished by the level of ﬂuctuation added via sampling noise needed to ensure thatthe

EnKF sample mean and covariance are consistent in the linear-Gaussian setting. The originalform of the

EnKF is the so-called ‘vanilla’

EnKF of Evensen [18, 32]; and this method exhibitsthe most ﬂuctuations due to sampling noise. The next class is the so-called ‘deterministic’

EnKF of Sakov and Oke [72], see also [69]; which exhibits (considerably) less ﬂuctuation. In thecontinuous-time linear-Gaussian setting this class is representative of the so-called square-root

EnKF methods [79] (which diﬀer somewhat in discrete-time). Finally, there has been recentinterest in so-called transport-inspired

EnKF methods [69, 78]; which apart from initialisationnoise/randomisation are completely deterministic and whose analysis (in the linear setting)follows that of the classical Kalman-Bucy ﬁlter (cf. [8]). These classes do not distinguishthe totality of

EnKF methodology (especially in nonlinear or non-Gaussian models); which mayfurther consist of so-called covariance regularisation methods [5, 39, 36, 62, 31], etc. However,in the linear-Gaussian setting, these three classes broadly capture the fundamentals. We willreturn to these three classes, and some regularisation methods, in the development of this article.Convergence and large-sample asymptotics of the discrete-time

EnKF has been studied in [55,60, 51], in the sense of taking the number of particles to inﬁnity. The discrete-time square rootform of the

EnKF is accommodated in [51], and nonlinear state-space models are accommodatedin [55]. In [54, 22] the authors consider continuous-time (non-Gaussian) state-space models(e.g. certain nonlinear diﬀusion models) and the convergence (with sample size) of particular(somewhat less standard)

EnKF methods. In the continuous-time, linear-Gaussian, setting, theconvergence (in sample size) of the three broad classes of

EnKF to the true Kalman-Bucy ﬁlteris immediate and follows, e.g., from the sample mean and sample covariance matrices evolutionequations in [26, 11]. In this latter sense, we recover the fact that the

EnKF is a consistentapproximation of the optimal, Bayesian ﬁlter (i.e. the classical Kalman-Bucy ﬁlter) in thelinear-Gaussian setting as discussed earlier. 4n practice, one is typically interested in the non-asymptotic (in terms of ensemble size)ﬂuctuation properties as well as the long time/stability behaviour of the particle-type ﬁlteringapproximations.The ﬂuctuation analysis of ensemble Kalman-type ﬁltering is studied in detail in the linear-Gaussian setting in [14, 13, 11]. In [14] a complete Taylor-type stochastic expansion of thesample covariance is given at any order with bounded remainder terms and estimates. Bothnon-asymptotic and asymptotic bias and variance estimates for the

EnKF sample covariance andsample mean are given explicitly in [14]. These latter expansions directly imply an almost surestrong form of a central limit-type result on the sample covariance and sample mean at anytime. The analysis in [14] is considered over the entire path space of the matrix-valued Riccatistochastic diﬀerential equation that describes the ﬂow of the sample covariance. However, mostof the non-asymptotic time-uniform results in [14] hold only when the underlying signal is stable.In [13, 11] we consider the case in which the underlying signal may be unstable, and we providetime-uniform, non-asymptotic moment estimates and time-uniform control over the ﬂuctuationof the sample covariance and mean about their limiting Riccati and Kalman-Bucy ﬁltering terms.The emphasis of time-uniformity on the moment bounds and on the ﬂuctuation boundson the sample mean and sample covariance (about the true optimal Bayesian ﬁltering meanand covariance) is important. If these bounds are allowed to grow in time, e.g. typically in thisanalysis one can easily obtain bounds that grow exponentially in time, then these bounds quicklybecome useless for any practical numerical application; e.g. an exponent > may induce anexceedingly pessimistic bound greater than the estimated number of particles of matter in thevisible universe. We remark also that our emphasis on accommodating unstable (latent) signalmodels is important because time-uniform ﬂuctuation results in such cases (which are of realpractical importance) are signiﬁcantly more diﬃcult to obtain under testable and realistic modelassumptions (like the classical observability and controllability model assumptions in the controland ﬁltering literature [2, 8]).In [26], stability of the EnKF in continuous-time linear-Gaussian models is considered underthe assumption that the underlying signal model is also stable. This latter assumption is incontrast with classical Kalman-Bucy ﬁlter stability results, which hold in the linear-Gaussiansetting under the much weaker (and more natural) condition of signal detectability [82, 8, 10].The classical Kalman-Bucy ﬁlter is stable as a result of the closed-loop stabilising propertiesof the so-called Kalman gain matrix, which is intimately related to the ﬂow of the ﬁlter errorcovariance described by a Riccati diﬀerential equation. The

EnKF analogue, in linear-Gaussiansettings, is the sample covariance, and its random ﬂuctuation properties (noted in the precedingparagraph) are the main source of diﬃculty in establishing the closed-loop ﬁlter stability inthose models in which the underlying signal itself is not stable.In [80], the authors analyse the long-time behaviour of the (discrete-time)

EnKF in a classof nonlinear systems, with ﬁnite ensemble size, using Foster-Lyapunov techniques. Applyingthe results of [80] to the basic linear-Gaussian ﬁltering problem, the analysis and assumptionsin [80] then also require stability of the underlying signal model. In a traditional sense, theconditions needed in [80] are hard to check, e.g. as compared to the classical observability orcontrollability-type model conditions in Kalman ﬁltering analysis; but a range of examples aregiven in [80]. In [47], the long-time behaviour of the

EnKF is analysed in both discrete andcontinuous time settings with similar conditions on the model as in [80]; and which again iflinearised equates to a form of stability on the signal model.We emphasise again that the type of analysis in [47, 80, 26] cannot handle unstable, ortransient, signal models; i.e. signals with sample paths with at least one coordinate that maygrow unbounded. In the context studied in [47, 80, 26] dealing with stable or bounded latentsignal processes (e.g. the Lorenz-class of signal models [47, 80]), the important question on theﬁlter stability or ﬁltering error estimates relies on obtaining meaningful quantitative ﬂuctuationconstants decreasing with the number of ensemble members to achieve a desired performance.5f course, time uniformity of these bounds follows trivially in this setting from the boundednessproperties of the latent signal process.Covariance inﬂation is a mechanism used in practical methods to increase the positive-deﬁniteness of the sample covariance matrix and essentially amplify its eﬀect on the stabilisationproperties of the Kalman gain matrix. In [47] time-uniform

EnKF error boundedness results followunder a true signal stability condition and given a suﬃciently large variance inﬂation regime.See also [81, 59] for related stability analysis in the presence of adaptive covariance inﬂation andprojection techniques. In [11] in the continuous-time linear-Gaussian setting, the mechanism bywhich covariance inﬂation acts to stabilise the ensemble ﬁlter is exempliﬁed, see also [16].In the continuous-time, and linear-Gaussian setting, the ﬁrst work to relax the assumptionof underlying signal stability for the

EnKF is in [14, 13, 11]. In those articles, signals withsample paths that may grow unbounded (to inﬁnity exponentially fast) are accommodated.That work is based on both a ﬂuctuation analysis of the sample covariance and the samplemean [14, 13, 11], followed by studies on the long-time behaviour, e.g. stability properties,of both the sample covariance and mean [13, 11]. Time-uniform ﬂuctuation properties aregiven under a type of (strong) signal observability condition. In this setting time-uniformity ofthese results is non-trivial. This assumption is in keeping with classical Kalman-Bucy ﬁlteringand Riccati equation results; and does not require any form of underlying signal stability. Asthe authors of [80] note in their stability analysis, they use “few properties of the forecast[predicted] covariance matrix other than positivity”. As noted in [80], this lends generality totheir results, but conversely places the burden back on the signal model assumptions (includingthose assumptions of true signal stability). Contrast this with the work in [14, 13, 11] whereemphasis is placed on the ﬂuctuation analysis of the sample covariance, with a primary aim ofremoving the stability assumptions needed on the underlying signal model. The time-uniformﬂuctuation and stochastic perturbation contributions in [14, 13, 11], were discussed earlier.Given this ﬂuctuation analysis, the stability of the ﬁlter sample mean and sample covarianceand their (time) asymptotic properties are studied in [14, 13, 11] without stability assumptions onthe underlying signal model. These results rigorously establish the type of “tracking” propertiesdesired by a ﬁltering or estimation solution.Although of lesser practical use in ensemble ﬁltering applications, strong results in the one-dimensional setting are also derived in [13] that converge, e.g. in the limit with the ensemblesize, to those properties of the classical Kalman-Bucy ﬁlter. For example, we can recover theoptimal exponential contraction and ﬁlter stability rates, etc. In the multidimensional setting,the decay rates to equilibrium are not sharp, and the stationary measures are not given in closedform.

The main goal of this article is to: 1). present a novel formulation for ensemble ﬁltering inlinear-Gaussian, continuous-time, systems that lends itself naturally to analysis; 2). study thestability of the resulting stochastic Riccati diﬀerential equation that describes the ﬂow of thesample covariance; 3). study the stability of the continuous-time ensemble Kalman-Bucy updateequation that is coupled to this stochastic Riccati equation, and which describes the ﬂow of thesample mean (or the sample mean minus the true signal, i.e. the sample error signal); 4). providedetailed ﬂuctuation analysis of the ensemble Kalman-Bucy ﬂow, the sample mean, and thestochastic Riccati equation describing the sample covariance. This article is primarily a review ofthe literature and results in these directions. The prime focal point of this review are the articles[26, 14, 13, 11], which focus heavily on the linear-Gaussian model setting. In those articles, an emphasis is placed on deriving time-uniform ﬂuctuation, stability, and contraction resultsunder testable model conditions reminiscent of those classical observability and controllability-type model assumptions. Importantly, we do not generally assume the true underlying signal isstable in this review.

EnKF methodsconsidered is always under-biased when compared to the true covariance matrix. This may mo-tivate, from a pure uncertainty quantiﬁcation viewpoint, some form of covariance regularisation[5, 39, 36, 62, 31]. We provide detailed analysis illustrating the eﬀect of inﬂation regularisation onstability (similarly to [47, 81, 59]). As another example, we provide strong intuition for so-calledcatastrophic ﬁlter divergence (studied previously in [37, 35, 48]) based on rigorous (heavy-tailed)ﬂuctuation properties inherent to the relevant sample covariance matrices and their invariantdistributions. We contrast the so-called ‘vanilla’

EnKF of [18, 32] with the ‘deterministic’

EnKF ofSakov and Oke [72] in terms of their ﬂuctuation and sample noise characteristics and we showhow this eﬀects their respective sample behaviour and stability properties.As with classical (non-ensemble) Kalman ﬁltering, the importance of the results reviewedis in rigorously establishing the type of tracking and stability behaviour desired in ﬁlteringapplications [2, 24, 6, 8]. For example, our results imply conditions under which the initialestimation errors are forgotten, and that the ﬂow of the sample mean converges to the trueKalman ﬁltering mean estimate (and thus the true underlying signal) in the average. In thiscase, there must be some emphasis on the stochastic behaviour of the ensemble (Monte Carlo)mean and covariance in order to establish ﬁlter stability. We also provide the analogue of theerror covariance ﬁxed point in classical Kalman ﬁltering [2, 8]; whereby we state results thatensure the sample covariance matrix converges to an invariant, steady-state, distribution. Wecharacterise the properties of this invariant distribution and relate this to the sample behaviourof the ‘vanilla’

EnKF [18, 32] and the ‘deterministic’

EnKF [72].We focus on the linear, continuous-in-time, Gaussian setting in this review and note that inthis case the sample mean and sample covariance are consistent approximations of the optimalBayesian ﬁltering mean and covariance. We emphasise that even in the linear-Gaussian case,the samples themselves are not in general independent. The analysis even in the linear settingis highly technical [26, 14, 13, 11] and the results presented in this case are aimed as a step inthe progression to more applied results and intuition in nonlinear model settings. There is someprecedent for studying the relative properties, behaviour, or performance of ensemble Kalmanﬁltering ﬁrstly with linear-Gaussian signal models [31]. For example, the seminal article [18]illustrated that a perturbation of the observations in the ensemble Kalman ﬁlter was necessary torecover a consistent covariance limit (to the true Kalman ﬁlter for linear-Gaussian systems); or toachieve the standard Monte Carlo error rate with a ﬁnite size set of particles. The analysis (andeven derivation) of ensemble square root ﬁlters for linear-Gaussian system models is standard[73, 57], etc. Convergence of the ensemble Kalman ﬁlter in inverse problems is studied in [74]in the linear setting. We discuss connections and extensions of the results in this article to thenonlinear model setting toward the end.Note that the analysis and proofs in [26, 14, 13, 11], while motivated originally by en-semble Kalman-type ﬁltering methods, are largely presented as independent technical resultson certain general classes of matrix-valued Riccati diﬀusion equations and associated linearstochastic diﬀerential equations with random coeﬃcients. In this review we emphasise the workin [26, 14, 13, 11] via a series of results directly and solely stated in the context of ensembleKalman-type ﬁltering. Throughout we relate our results to the broader technical literature onensemble Kalman ﬁltering and we emphasise the practical signiﬁcance of these results, e.g. viathe tracking property of the ﬁlter, its stability, or via their error ﬂuctuation or catastrophicdivergence behaviour, among other topics. We also contrast the behaviour of the various classesof continuous-time

EnKF methods. 7 .4 Notation

We remark ﬁrstly that some care must be taken throughout to keep track of the font stylings; e.g.upright vs. calligraphic vs. script, etc. There is typically a relationship between like symbolsappearing with diﬀerent stylings.Hatted terms (cid:98) · := · N should be viewed as terms indexed to the ensemble (particle) size N ≥ .Time is indexed variously by s, t, u, τ ∈ [0 , ∞ [ . We write c, c n , c τ , c n,τ , c n,τ ( Q ) , c n,τ ( z, Q ) . . . forsome positive constants whose values may vary from result to result, and which only dependon some parameters n, τ, z, Q , etc, as well as implicitly on the model parameters ( A, H, R, R ) introduced later. Importantly, these constants do not depend on the time horizon t , nor on thenumber of ensemble particles N .Let M d be the set of ( d × d ) real matrices with r ≥ and M d ,d the set of ( d × d ) realmatrices. Let S d ⊂ M d be the subset of symmetric matrices, and S d , and S + d the subsets ofpositive semi-deﬁnite and deﬁnite matrices respectively. We write A ≥ B when A − B ∈ S d ; and A > B when A − B ∈ S + d . We denote by and I the null and identity matrices, for any d ≥ .Given R ∈ ∂ S + d := S d − S + d we denote by R / a (non-unique) symmetric square root of R . When R ∈ S + d we choose the unique symmetric square root. We write A (cid:48) the transpose of A , and A sym =( A + A (cid:48) ) / its symmetric part. We denote by Absc( A ) := max { Re ( λ ) : λ ∈ Spec( A ) } its spectralabscissa. We also denote by Tr( A ) the trace. When A ∈ S d we let λ ( A ) ≥ . . . ≥ λ d ( A ) denotethe ordered eigenvalues of A . We equip M d with the spectral norm (cid:107) A (cid:107) = (cid:107) A (cid:107) = (cid:112) λ ( AA (cid:48) ) or the Frobenius norm (cid:107) A (cid:107) = (cid:107) A (cid:107) Frob = (cid:112) Tr( AA (cid:48) ) . Let µ ( A ) denote the matrix logarithmic“norm” (which can be < ) [77]. For example, the (2-)logarithmic “norm” is µ ( A ) = λ ( A sym ) .We have µ ( · ) ≥ Absc( · ) .Given a suitably regular matrix-valued stochastic process t (cid:55)→ Q t ∈ M d , for any t ≥ and n ≥ , we set ||| Q t ||| n = E [ (cid:107) Q t (cid:107) n ] /n where E [ · ] denotes the expectation. Consider a time-invariant linear-Gaussian ﬁltering model of the following form, d X t = A X t dt + R / d V t d Y t = H X t dt + R / d W t (2.1)where A ∈ M d and H ∈ M d y ,d are the signal and sensor model matrices respectively, and R ∈ S d and R ∈ S + d y are the respective signal and sensor noise covariance matrices. The noise inputs V t and W t are d and d y -dimensional Brownian motions, and X is an d -dimensional Gaussianrandom variable (independent of ( V t , W t ) ) with mean E ( X ) and covariance P ∈ S d .We let Y = 0 and Y t = σ ( Y s , s ≤ t ) be the σ -algebra generated by the observations. Theconditional distribution η t := Law ( X t | Y t ) of the signal states X t given Y t is Gaussian with aconditional mean and covariance given by X t := E ( X t | Y t ) and P t := E (cid:0) [ X t − X t ] [ X t − X t ] (cid:48) | Y t (cid:1) . The mean and the covariance obey the Kalman-Bucy and the Riccati equations dX t = A X t dt + P t H (cid:48) R − ( d Y t − HX t dt ) (2.2) ∂ t P t = Ricc( P t ) (2.3)with the Riccati drift function from S d into S r deﬁned for any Q ∈ S d by Ricc( Q ) := AQ + QA (cid:48) − QSQ + R (2.4)8nd with, S := H (cid:48) R − H (2.5)Importantly, the covariance of the conditional distribution Law ( X t | Y t ) in this case does notdepend on the observations Y t . The error Z t := ( X t − X t ) satisﬁes dZ t = ( A − P t S ) Z t dt + P t H (cid:48) R − / d W t − R / d V tlaw = ( A − P t S ) Z t dt + ( P t S P t + R ) / d B t (2.6)where B t is some independent d -dimensional Brownian motion. Here we make use of a martin-gale representation theorem, e.g. [45, Theorem 4.2], see also [27].Let φ t ( Q ) := P t denote the ﬂow of the matrix diﬀerential equation (2.3) with P = Q ∈ S d .Let ψ t ( z, Q ) := Z t denote the ﬂow of the stochastic error (2.6) with Z = z = ( x − X ) ∈ R d and P t = φ t ( Q ) . Finally, we denote the ﬂow of the Kalman-Bucy update (2.2) with X = x ∈ R d by χ t ( x, Q ) := X t . This notation allows us to reference the ﬂows ψ t ( z, Q ) , φ t ( Q ) , χ t ( x, Q ) withrespect to their initialisation at t = 0 which is useful when we compare ﬂows and study stability.Throughout the remainder, we assume that ( A, R / ) and ( A, H ) are controllable and ob-servable pairs in the sense that (cid:104) R / , AR / . . . , A r − R / (cid:105) and  HHA ... HA r −  have rank d. (2.7)Note that if R ∈ S + d is positive deﬁnite it follows the controllability holds trivially. We considerthe observability and controllability Gramians ( O t , C t ( O )) and ( C t , O t ( C )) associated with thetriplet ( A, R, S ) and deﬁned by O t := (cid:90) t e − A (cid:48) s S e − As ds and C t ( O ) := O − t (cid:20)(cid:90) t e − ( t − s ) A (cid:48) O s R O s e − ( t − s ) A ds (cid:21) O − t (2.8) C t := (cid:90) t e As R e A (cid:48) s ds and O t ( C ) := C − t (cid:20)(cid:90) t e ( t − s ) A C s S C s e ( t − s ) A ds (cid:21) C − t . Given (2.7), there exists some parameters τ, (cid:36) o,c ± , (cid:36) c ± ( O ) , (cid:36) o ± ( C ) > such that (cid:36) c − ≤ (cid:107)C τ (cid:107) ≤ (cid:36) c + and (cid:36) o − ≤ (cid:107)O τ (cid:107) ≤ (cid:36) o + (2.9) (cid:36) c − ( O ) ≤ (cid:107)C τ ( O ) (cid:107) ≤ (cid:36) c + ( O ) and (cid:36) o − ( C ) ≤ (cid:107)O τ ( C ) (cid:107) ≤ (cid:36) o + ( C ) . (2.10)The parameter τ is often called the interval of observability-controllability.These rank conditions (2.7) ensure the existence and the uniqueness of a positive deﬁniteﬁxed-point matrix P ∞ solving the so-called algebraic Riccati equation Ricc( P ∞ ) := AP ∞ + P ∞ A (cid:48) − P ∞ SP ∞ + R = 0 . (2.11)Importantly, in this case, the matrix diﬀerence ( A − P ∞ S ) is asymptotically stable (Hurwitzstable, i.e. Absc( A − P ∞ S ) < ) even when the signal matrix A is unstable [52, Theorems9.12, 9.15]. More relaxed conditions (i.e. detectability and stabilisability) for a stabilisingsolution (perhaps only positive semi-deﬁnite) to exist are discussed widely in the literature; see[49, 63, 52] and the convergence results in [50, 19]. A (marginally) stable solution to (2.11) existsunder a detectability condition, and convergence to this solution is given under mild additional9onditions in [67, 20, 66]. In [82], given only detectability, the stability of a time-varying “closedloop”, e.g. ( A − φ t ( Q ) S ) , is proven even when ( A − P ∞ S ) , is only marginally stable.For any s ≤ t and Q ∈ S d we deﬁne the state-transition matrix, E s,t ( Q ) := exp (cid:18)(cid:73) ts ( A − φ u ( Q ) S ) du (cid:19) ⇐⇒ ∂ t E s,t ( Q ) = ( A − φ u ( Q ) S ) E s,t ( Q ) . (2.12)When s = 0 we often write E t ( Q ) instead of E ,t ( Q ) . The matrix E t ( Q ) is the fundamental matrix.We have E s,t ( Q ) = E t ( Q ) E s ( Q ) − . The following convergence estimates follow from [8, 10]: Forany Q, Q , Q ∈ S d and any t ≥ we have the local contraction inequalities (cid:107)E t ( Q ) (cid:107) ≤ c (1 + (cid:107) Q (cid:107) ) (cid:107)E t ( P ∞ ) (cid:107) and (cid:107)E t ( P ∞ ) (cid:107) = (cid:107) e t ( A − P ∞ S ) (cid:107) ≤ c e − α t (2.13)for some ﬁnite α, c > and with P ∞ solving (2.11) and (cid:107)E t ( Q ) − E t ( Q ) (cid:107) ≤ c ( Q , Q ) e − α t (cid:107) Q − Q (cid:107) (2.14)for some ﬁnite constant c ( Q , Q ) > . In addition, there exists some parameter τ > suchthat for any s ≥ and any t ≥ τ > we have the uniform estimates, (cid:107)E s,s + t ( Q ) (cid:107) ≤ c τ (cid:107)E t ( P ∞ ) (cid:107) (2.15)Note it is desirable to relate the decay of E s,s + t ( Q ) to the decay at the ﬁxed point (cid:107)E t ( P ∞ ) (cid:107) = (cid:107) e t ( A − P ∞ S ) (cid:107) ≤ c e − α t (since as t → ∞ it is clear that we cannot do better). See [10] for anexplicit Floquet-type expression of E t ( Q ) in terms of E t ( P ∞ ) .The convergence and stability properties of the Kalman-Bucy ﬁlter and the associated Riccatiequation are directly related to the contraction properties of the state-transition matrix E s,t ( Q ) .To get some intuition for this we note, ψ t ( z, Q ) = E s,t ( Q ) ψ s ( z, Q ) + (cid:90) ts E u,t ( Q ) ( φ u ( Q ) S φ u ( Q ) + R ) / d B u (2.16)and φ t ( Q ) = E s,t ( Q ) φ s ( Q ) E s,t ( Q ) (cid:48) + (cid:90) t E s,t ( Q ) ( φ s ( Q ) S φ s ( Q ) + R ) E s,t ( Q ) (cid:48) ds (2.17)for any s ≤ t .From [8], for any t ≥ τ > and any Q ∈ S d we have the uniform estimates (cid:0) O τ ( C ) + C − τ (cid:1) − ≤ φ t ( Q ) ≤ O − τ + C τ ( O ) . (2.18)We also have ≤ φ t ( Q ) ≤ P ∞ + e ( A − P ∞ S ) t ( R − P ∞ ) e ( A − P ∞ S ) (cid:48) t (2.19)The following stability result follows from [8, 10]: For any Q , Q ∈ S d and for any t ≥ , (cid:107) φ t ( Q ) − φ t ( Q ) (cid:107) ≤ c (1 + (cid:107) Q (cid:107) + (cid:107) Q (cid:107) ) (cid:107)E t ( P ∞ ) (cid:107) (cid:107) Q − Q (cid:107) (2.20)and recall the exponential contraction estimate on (cid:107)E t ( P ∞ ) (cid:107) in (2.13). Similarly, using (2.15),for any s ≥ and any t ≥ τ > , we have (cid:107) φ s,s + t ( Q ) − φ s,s + t ( Q ) (cid:107) ≤ c τ (cid:107)E t ( P ∞ ) (cid:107) (cid:107) Q − Q (cid:107) (2.21)Note that both (2.20) and (2.21) imply immediately that φ t ( Q ) → t →∞ P ∞ exponentially fastfor any Q ∈ S d ; e.g. by letting Q = P ∞ . 10ote that the uniform estimates independent of the initial condition stated throughout,involve some arbitrarily small, positive time parameter τ , which is related to the notion of aso-called observability/controllability interval; for further details on this topic we refer to [8].Results (e.g. bounds and convergence results) on the ﬂow of the inverse of the solution ofthe Riccati equation are considered in [8] and are relevant for proving results on the ﬂow of theRiccati equation itself; e.g. upper bounds on the ﬂow of the inverse solution help to lower boundsolutions of the Riccati ﬂow. The ﬂow of the inverse Riccati solution may also be of interest onits own as it relates to the ﬂow of “information” (as the inverse of covariance).Given the contraction properties on E s,t ( Q ) it is often said the “deterministic part” of theﬁlter error ∂ t Z t = ( A − P t S ) Z t is stable. From [8] we can be more explicit if desired, forexample, for any t ≥ τ we have the uniform estimate, sup Q ∈ S d (cid:107) E ( ψ t ( z, Q ) | X ) (cid:107) ≤ c e − α t (cid:107) z − X (cid:107) (2.22)for some rate α > and some ﬁnite constant c > . Moreover, the conditional probability ofthe following event (cid:107) ψ t ( z, Q ) (cid:107) ≤ c ( Q ) (cid:18) e − αt (cid:107) z − X (cid:107) + e √ (cid:20)

12 + (cid:16) δ + √ δ (cid:17)(cid:21)(cid:19) (2.23)given the state variable X is greater than − e − δ , for any δ ≥ . And, for any t ≥ , z , z ∈ R d , Q , Q ∈ S d and any n ≥ we have the almost sure local contraction estimate E ( (cid:107) ψ t ( z , Q ) − ψ t ( z , Q ) (cid:107) n | X ) n ≤ c ( Q , Q ) e − αt (cid:107) z − z (cid:107) + c n ( Q , Q ) e − αt (1 + (cid:107) z − X (cid:107) ) (cid:107) Q − Q (cid:107) (2.24)with some rate α > and the ﬁnite constants c ( Q , Q ) , c n ( Q , Q ) > . For any probability measure η on R d we let P η denote the η -covariance η (cid:55)→ P η := η (cid:0) [ ι − η ( ι )][ ι − η ( ι )] (cid:48) (cid:1) (3.1)with the identity function ι ( x ) := x and the column vector η ( f ) := (cid:82) f dη for some measurablefunction f : R d → R d .We now consider three diﬀerent cases of a conditional nonlinear McKean-Vlasov-type diﬀu-sion process, ( F1 ) d X t = A X t dt + R / d V t + P η t H (cid:48) R − (cid:104) d Y t − (cid:16) H X t dt + R / d W t (cid:17)(cid:105) ( F2 ) d X t = A X t dt + R / d V t + P η t H (cid:48) R − (cid:20) d Y t − H (cid:18) X t + η t ( ι )2 (cid:19) dt (cid:21) (3.2) ( F3 ) d X t = A X t dt + R P − η t ( X t − η t ( ι )) dt + P η t H (cid:48) R − (cid:20) d Y t − H (cid:18) X t + η t ( ι )2 (cid:19) dt (cid:21) where η t := Law ( X t | Y t ) (3.3)and thus the diﬀusions in (3.2) depend in some nonlinear fashion on the conditional law of the dif-fusion process itself. In all three cases ( V t , W t , X ) are independent copies of ( V t , W t , X ) . Thesediﬀusions are time-varying Ornstein-Uhlenbeck processes [26] and consequently η t is Gaussian;see also [8]. These Gaussian distributions have the same conditional mean η t ( ι ) and conditionalcovariance P η t . 11 roposition 3.1 ([26, 8]) . We have η t := Law ( X t | Y t ) = Law ( X t | Y t ) =: η t (3.4) and X t := η t ( ι ) = η t ( ι ) and P t = P η t = P η t where X t and P t correspond to the Kalman-Bucyﬁlter update and Riccati equations in (2.2) and (2.3). We may refer to this speciﬁc class (3.2) of McKean-Vlasov-type diﬀusion as a Kalman-Bucydiﬀusion process [8]. The case ( F1 ) corresponds to the limiting object that is sampled in thecontinuous-time version of the ‘vanilla’ EnKF [32]; while ( F2 ) is the continuous-time limitingobject that is sampled in the ‘deterministic’ EnKF of [72], see also [69]; and ( F3 ) is a fullydeterministic transport-inspired equation [69, 78]. Note that in this case ( F3 ) the existence ofthe inverse of P η t is given by the positive-deﬁniteness properties of the solution of the Riccatiequation in (2.3). In the next section we detail the Monte-Carlo ensemble ﬁlters derived fromthese Kalman-Bucy diﬀusion processes.Note we may deﬁne a generalised version of case ( F3 ) by, ( F3 (cid:48) ) d X t = A X t dt + R P − η t ( X t − η t ( ι )) dt + P η t H (cid:48) R − (cid:20) d Y t − H (cid:18) X t + η t ( ι )2 (cid:19) dt (cid:21) + G t P − η t ( X t − η t ( ι )) dt (3.5)for any skew symmetric matrix G (cid:48) t = − G t that may also depend η t . This added tuning parametermay be related to an optimality metric, when deriving this transport equation from an optimaltransport beginning. We may also write similar generalised versions ( F1 (cid:48) ) and ( F2 (cid:48) ) by adding G t P − η t ( X t − η t ( ι )) to ( F1 ) and ( F2 ). Ensemble Kalman-Bucy ﬁlters (

EnKF ) coincide with the mean-ﬁeld particle interpretation of thenonlinear diﬀusion processes deﬁned in (3.2).Let ( V it , W it , X i ) with ≤ i ≤ N + 1 be ( N + 1) independent copies of ( V t , W t , X ) . Again,we consider three diﬀerent cases of Kalman-Bucy-type interacting diﬀusion process, ( F1 ) d X it = A X it dt + R / d V it + (cid:98) P t H (cid:48) R − (cid:104) d Y t − (cid:16) H X it dt + R / d W it (cid:17)(cid:105) ( F2 ) d X it = A X it dt + R / d V it + (cid:98) P t H (cid:48) R − (cid:34) d Y t − H (cid:32) X it + (cid:98) X t (cid:33) dt (cid:35) (4.1) ( F3 ) d X it = A X it dt + R (cid:98) P − t (cid:16) X it − (cid:98) X t (cid:17) dt + (cid:98) P t H (cid:48) R − (cid:34) d Y t − H (cid:32) X it + (cid:98) X t (cid:33) dt (cid:35) with ≤ i ≤ N + 1 and the rescaled (particle) sample mean and covariance (cid:98) η t := η N t = 1 N + 1 N +1 (cid:88) i =1 δ X it = ⇒ (cid:98) X t := X N t = 1 N + 1 N +1 (cid:88) i =1 X it and (cid:98) P t := P N t = N + 1 N P (cid:98) η t (4.2)In cases ( F1 ) and ( F2 ) we have N ≥ and in case ( F3 ) we require N ≥ d for the almost sureinvertibility of (cid:98) P t (although in case ( F3 ) one may substitute a pseudo-inverse of (cid:98) P t withoutchanging the mathematical analysis). A sampled version of case ( F3 (cid:48) ) may also be derived inthe same way. 12 .1 Vanilla Ensemble Kalman-Bucy Filter The vanilla

EnKF , denoted by

VEnKF , is associated with the ﬁrst case ( F1 ) of nonlinear process X t in (3.2) and is deﬁned by the Kalman-Bucy-type interacting diﬀusion process ( F1 ) in (4.1).We then have the following key result. Proposition 4.1 ([26]) . Let N ≥ . The stochastic ﬂow of the sample mean satisﬁes, d (cid:98) X t law = (cid:16) A − (cid:98) P t S (cid:17) (cid:98) X t dt + (cid:98) P t H (cid:48) R − d Y t + 1 √ N + 1 (cid:16) R + (cid:98) P t S (cid:98) P t (cid:17) / d B t (4.3) where B t is an independent d -dimensional Brownian motion.The sample covariance evolves according to a so-called matrix-valued Riccati diﬀusion processof the form, d (cid:98) P t law = Ricc( (cid:98) P t ) dt + 2 √ N (cid:20) (cid:98) P / t d M t (cid:16) R + (cid:98) P t S (cid:98) P t (cid:17) / (cid:21) sym (4.4) where M t is a ( d × d ) -matrix with independent Brownian entries (also independent of B t ). We see that for the vanilla

EnKF , the convergence of (cid:98) X t → X t and (cid:98) P t → P t as N → ∞ followsimmediately. This result follows via the martingale representation theorem, e.g. Theorem 4.2in [45], see also [27]. The ‘deterministic’

EnKF , denoted

DEnKF , is associated with the second case ( F2 ) of nonlinearprocess X t in (3.2), and is deﬁned by the Kalman-Bucy-type interacting diﬀusion process ( F2 ) in(4.1). This ‘deterministic’ epithet in the DEnKF follows because the update ‘part’ of the particleﬂow is deterministic and does not rely on the stochastic perturbations by W it appearing in the VEnKF . This name and idea was proposed in [72]. We have the following key result.

Proposition 4.2 ([13, 11]) . Let N ≥ . The stochastic ﬂow of the sample mean satisﬁes, d (cid:98) X t law = (cid:16) A − (cid:98) P t S (cid:17) (cid:98) X t dt + (cid:98) P t H (cid:48) R − d Y t + 1 √ N + 1 R / d B t (4.5) where B t is an independent d -dimensional Brownian motion.The sample covariance evolves according to a so-called matrix-valued Riccati diﬀusion processof the form, d (cid:98) P t law = Ricc( (cid:98) P t ) dt + 2 √ N (cid:104) (cid:98) P / t d M t R / (cid:105) sym (4.6) where M t is a ( d × d ) -matrix with independent Brownian entries (also independent of B t ). Again, for the

DEnKF , the convergence of (cid:98) X t → X t and (cid:98) P t → P t as N → ∞ follows imme-diately. Note the simpliﬁed diﬀusion weighting(s) in the case of the DEnKF , as compared to the

VEnKF . The fully deterministic ensemble transport ﬁlter

DEnTF is associated with the third case ( F3 ) ;deﬁned by the Kalman-Bucy-type interacting diﬀusion process ( F3 ) in (4.1). In this case, wehave the special result. Proposition 4.3 ([69, 78]) . Let N ≥ . The ﬂow of sample mean is given by, d (cid:98) X t = (cid:16) A − (cid:98) P t S (cid:17) (cid:98) X t dt + (cid:98) P t H (cid:48) R − d Y t , (cid:98) X := 1 N + 1 N +1 (cid:88) i =1 X i (4.7)13 he sample covariance evolves according to the deterministic Riccati equation, d (cid:98) P t = Ricc( (cid:98) P t ) dt, (cid:98) P := N + 1 N P (cid:98) η (4.8)Note that the particle mean (cid:98) X t and the particle covariance (cid:98) P t associated with the particleinterpretation ( F3 ) discussed in (4.1) satisfy exactly the equations of the Kalman-Bucy ﬁlterwith the associated deterministic Riccati equation.The “randomness” in this case only comes from the initial conditions. The stability analysisof this class of DEnTF model resumes to the one of the Kalman-Bucy ﬁlter and the associatedRiccati equation. Thus, the results, e.g., in (2.20), (2.22), (2.23) and (2.24) hold immediately;see also [8] in the linear-Gaussian setting. In [22, 23] this ﬁlter is analysed in the case of anonlinear signal, but fully observed (linear observation) model. The ﬂuctuation analysis in thiscase can also be developed easily by combining these stability results w.r.t. the initial state (see[8]) with conventional sample estimates based on independent copies of the initial states, see forinstance [15] for estimates associated with classical sample covariance estimates. Consequently,we do not consider this class of model going forward.

In practice, the ensemble Kalman ﬁltering methodology is applied in high-dimensional, nonlinearstate-space models, e.g. see [31, 32] and the application references listed in the introduction.It is rather straightforward to extend the algorithmic particle methods in (4.1) to nonlinearsystems as we now outline. Consider a time-invariant nonlinear diﬀusion model of the form, d X t = a ( X t ) dt + R / d V t d Y t = h ( X t ) dt + R / d W t (4.9)where a : R d → R d and h : R d → R d y are the nonlinear signal and sensor model functions ofsome suﬃcient regularity.Let ( V it , W it , X i ) with ≤ i ≤ N + 1 be ( N + 1) independent copies of ( V t , W t , X ) . Weconsider the three EnKF variants as before and deﬁne the ﬂow of particles by, ( NF1 ) d X it = a ( X it ) dt + R / d V it + (cid:98) P ht R − (cid:104) d Y t − (cid:16) h ( X it ) dt + R / d W it (cid:17)(cid:105) ( NF2 ) d X it = a ( X it ) dt + R / d V it + (cid:98) P ht R − (cid:34) d Y t − (cid:32) h ( X it ) + (cid:98) h t (cid:33) dt (cid:35) (4.10) ( NF3 ) d X it = a ( X it ) dt + R (cid:98) P − t (cid:16) X it − (cid:98) X t (cid:17) dt + (cid:98) P ht R − (cid:34) d Y t − (cid:32) h ( X it ) + (cid:98) h t )2 (cid:33) dt (cid:35) with ≤ i ≤ N + 1 and the (particle) sample mean (cid:98) X t and sample covariance (cid:98) P t deﬁned asusual, e.g. see (4.2), and with the observation function sample mean and sample cross-covariancedeﬁned as, (cid:98) h t := 1 N + 1 N +1 (cid:88) i =1 h ( X it ) and (cid:98) P h := 1 N N +1 (cid:88) i =1 (cid:104) X it − (cid:98) X t (cid:105) (cid:104) h ( X it ) − (cid:98) h t (cid:105) (cid:48) (4.11)The EnKF implementation for nonlinear signal and observation models is thus a natural progres-sion of the ensemble ﬁlters in (4.1), with the appropriate nonlinear replacements of the drift andsensor functions, e.g. A X it with a ( X it ) , etc. Except in the fully deterministic case ( NF3 ), we arenot aware of any results that rigorously establish a limiting ( N → ∞ ) evolution equation, say X t .We may conjecture that such a limit, X t , is the natural analogue of the McKean-Vlasov equa-tions in (3.2), but this remains to be proven. Similarly, except in case ( NF3 ), the limiting mean14nd sample covariance equations and their interpretation remains a topic of open investigation.For example, in any case, it is clear that the ﬂow of the sample covariance of the particles in(4.10) will not satisfy the type of Riccati diﬀusion we see under linear models.It is worth noting again that in the linear model setting, the

EnKF algorithms in (4.1) arederived as sampled versions of the McKean-Vlasov equations (3.2). This is a natural derivationof a particle-type approximation of the optimal Bayes ﬁlter (i.e. an ensemble ﬁlter). This followsbecause the (conditional) distribution of the state in (3.2) is exactly the optimal Bayes ﬁlter(which is Gaussian with mean and covariance given by the classical Kalman ﬁlter). The

EnKF inthat linear setting is shown to be consistent, i.e. with inﬁnite computational power one recoverstruly the optimal Bayesian ﬁlter.In the nonlinear model setting, the

EnKF algorithms in (4.10) are most naturally derived asheuristic extensions of (4.1) so as to accommodate the nonlinearities in the model. For example,swapping A X it with a ( X it ) is a type of heuristic extension moving from (4.1) to (4.10); albeitthis is a very natural heuristic step also. This diﬀerence in derivation is exempliﬁed by thefact that, except perhaps in case ( NF3 ), the limiting ( N → ∞ ) object to which each ensemblemember converges has not been rigorously established. And it is exempliﬁed further by notingthat whatever that limiting object may look like, it is certainly not a diﬀusion equation withdistribution equal to the optimal Bayes ﬁlter. That is,Law ( X t | Y t ) (cid:54) = Law ( X t | Y t ) =: η t (4.12)in the nonlinear model setting. Said diﬀerently, even with inﬁnite computational power, the EnKF methods as applied in this nonlinear model setting do not converge to the optimal nonlinearBayes ﬁlter. As discussed previously and again later, the

EnKF in this nonlinear model setting isprobably best viewed in practice as a type of (random) sample-based observer or (point-valued)state estimator (and not as an approximation of the optimal Bayesian ﬁlter).We discuss connections and extensions of our results to the nonlinear model setting in a latersection (at the end of this review article).

Going forward, we consider only the

VEnKF (case ( F1 )) and DEnKF (case ( F2 )) since as noted thetheory of the DEnTF in the linear-Gaussian setting reverts to that of the standard Kalman-Bucyﬁlter as detailed in [8]. The parameter κ ∈ { , } will distinguish the two cases ( κ = 1 in case( F1 ), and κ = 0 in case ( F2 )) throughout.We may unify the analysis via the following representation, d (cid:98) X t = ( A − (cid:98) P t S ) (cid:98) X t dt + (cid:98) P t H (cid:48) R − d Y t + 1 √ N + 1 Σ / κ ( (cid:98) P t ) d B t (5.1) d (cid:98) P t = Ricc( (cid:98) P t ) dt + 2 √ N (cid:104) (cid:98) P / t d M t Σ / κ ( (cid:98) P t ) (cid:105) sym (5.2)with the mapping, Σ κ ( Q ) := R + κ QSQ with κ = (cid:26) in case ( F1 ) in case ( F2 ) (5.3)Let (cid:98) Z t := ( (cid:98) X t − X t ) and observe that d (cid:98) Z t = ( A − (cid:98) P t S ) (cid:98) Z t dt + (cid:98) P t H (cid:48) R − / d V t − R / d W t + 1 √ N + 1 Σ / κ ( (cid:98) P t ) d B tlaw = ( A − (cid:98) P t S ) (cid:98) Z t dt + Ω / κ ( (cid:98) P t ) d (cid:98) B t (5.4)15or some independent d -dimensional Wiener process (cid:98) B t and with, Ω κ := Σ + 1 N + 1 Σ κ (5.5)Note we often refer to the ﬂows (cid:98) Z t or Z t as error ﬂows.We also underline that (cid:98) Z t − Z t = ( (cid:98) X t − X t ) − ( X t − X t ) = (cid:98) X t − X t (5.6)so that the diﬀerence between the noisy error ﬂow (cid:98) Z t and the classical Kalman-Bucy error ﬂow Z t is equal to the diﬀerence between the EnKF (sample mean) state estimate and the classicalKalman-Bucy state estimate.Let (cid:98) φ t ( Q ) := (cid:98) P t denote the ﬂow of the Riccati diﬀusion equation in (5.2) with (cid:98) P = Q ∈ S d .Let (cid:98) ψ t ( z, Q ) := (cid:98) Z t denote the ﬂow of the stochastic error (5.4) with (cid:98) Z = z = ( x − X ) ∈ R d and (cid:98) P t = (cid:98) φ t ( Q ) . Finally, we denote the ﬂow of the sample mean in (5.1) with (cid:98) X = x ∈ R d by (cid:98) χ t ( x, Q ) := (cid:98) X t .We underline further that the diﬀerence between two error ﬂows satisﬁes, (cid:98) ψ t k ( z , Q ) − (cid:98) ψ t k ( z , Q ) = (cid:98) χ t ( x , Q ) − (cid:98) χ t ( x , Q ) (5.7)and is thus equal to the diﬀerence between the two corresponding sample means (with compatiblestarting points). Studying the diﬀerence between two error ﬂows ( (cid:98) ψ t k ( z , Q ) − ψ t k ( z , Q )) subsumes the study of something like ( (cid:98) χ t ( x , Q ) − χ t ( x , Q )) which is the diﬀerence betweenthe EnKF (sample mean) state estimate and the classical Kalman-Bucy state estimate (withdiﬀerent initial conditions).For any s ≤ t and Q ∈ S d we deﬁne the stochastic state-transition matrix, (cid:98) E s,t ( Q ) := exp (cid:20)(cid:73) ts (cid:16) A − (cid:98) φ u ( Q ) S (cid:17) du (cid:21) ⇐⇒ ∂ t (cid:98) E s,t ( Q ) = (cid:16) A − (cid:98) φ u ( Q ) S (cid:17) (cid:98) E s,t ( Q ) (5.8)As with the classical Kalman-Bucy ﬁlter, e.g. see (2.16) and (2.17), the convergence andstability properties of the ensemble Kalman-Bucy ﬁlter and the associated Riccati diﬀusionequation are directly related to the contraction properties of the stochastic state-transitionmatrix (cid:98) E s,t ( Q ) . For example, the ﬂow of the stochastic error equation (5.4) is given by, (cid:98) ψ t ( z, Q ) = (cid:98) E s,t ( Q ) (cid:98) ψ s ( z, Q ) + (cid:90) ts (cid:98) E u,t ( Q ) Ω / κ ( (cid:98) φ u ( Q )) d (cid:98) B u (5.9)and the stochastic ﬂow of the matrix Riccati diﬀusion (5.2) is given implicitly by (cid:98) φ t ( Q ) = (cid:98) E s,t ( Q ) (cid:98) φ s ( Q ) (cid:98) E s,t ( Q ) (cid:48) + (cid:90) ts (cid:98) E u,t ( Q ) Σ ( (cid:98) φ u ( Q )) (cid:98) E u,t ( Q ) (cid:48) du + 2 √ N (cid:90) ts (cid:98) E u,t ( Q ) (cid:104) (cid:98) φ / u ( Q ) d M u Σ / κ ( (cid:98) φ u ( Q )) (cid:105) sym (cid:98) E u,t ( Q ) (cid:48) (5.10)for any s ≤ t . We denote by (cid:98) Π t the Markov semigroup of (cid:98) φ t ( Q ) deﬁned for any boundedmeasurable function F on S d and any Q ∈ S d with the property that, (cid:98) Π t ( F )( Q ) := E (cid:104) F ( (cid:98) φ t ( Q )) (cid:105) (5.11)We have the ﬁrst result concerning the quadratic, matrix-valued, Riccati diﬀusion process (5.10). Theorem 5.1.

For any N ≥ the Riccati diﬀusion (5.10) has a unique weak solution on S d .For N ≥ d + 1 there exists a unique strong solution on S + d . Moreover, (cid:98) Π t ( P, dQ ) is a stronglyFeller and irreducible semigroup with a unique invariant probability measure (cid:98) Γ ∞ on S + d . Thismeasure admits a positive density with respect to the natural Lebesgue measure on S d . (cid:98) X t in(5.1) or a solution (cid:98) Z t in (5.4) exists and is unique. This result is proven in [11, Theorem 2.1].Once the problem of existence and uniqueness is tackled, one major problem in this equationis the behavior at inﬁnity: existence of a stationary measure and speed of convergence towardsthis stationary measure or even distance between two solutions starting at diﬀerent points. In this section we consider the ﬂuctuation of (cid:98) φ t ( Q ) about φ t ( Q ) and of (cid:98) ψ t ( z, Q ) about ψ t ( z, Q ) .The ﬂuctuation properties and moment boundedness properties of (cid:98) φ t ( Q ) and (cid:98) ψ t ( z, Q ) dependnaturally on the size on the ﬂuctuation as determined by N .Typically, we will write either of the the following expressions in stating our results, (cid:48)(cid:48) N is suﬃciently large (cid:48)(cid:48) or (cid:48)(cid:48) N ≥ (cid:48)(cid:48) (5.12)In case ( F1 ) with κ = 1 there is often a minimum threshold on N needed to prove the results.In case ( F1 ) this lower threshold on N may be large. In case ( F2 ) with κ = 0 , these same resultstypically hold; but moreover, we can often reﬁne the relevant results and at the same time relaxthe conditions on N , often needing just N ≥ . This is a signiﬁcant analytical advantage of the DEnKF over the

VEnKF . In some cases, this advantage is practically realised and provable (and notjust a by-product of analysis methods). For example, we will show later that some moments ofthe

VEnKF sample covariance in one-dimension provably do not exist in the steady-state withouta suﬃcient number of particles (whereas in the

DEnKF these moments always exist with N ≥ ).In some cases, the results stated in this work are only known for the DEnKF . If we do not specifya particular case, or a value for κ ∈ { , } , then the stated results may be assumed to hold forboth the VEnKF and the

DEnKF .We start with the following under-bias estimate on the sample covariance which holds forboth the

VEnKF and the

DEnKF . Theorem 5.2.

In all cases, for any t ≥ , any Q ∈ S d , and any N ≥ , we have the uniformunder-bias estimate, E (cid:104) (cid:98) φ t ( Q ) (cid:105) ≤ φ t ( Q ) ≤ c (1 + (cid:107) Q (cid:107) ) I (5.13) for a ﬁnite constant c > that doesn’t depend on the time horizon. We may reﬁne this under-bias result as is done in [11]. For example, if we assume furtherthat S ∈ S + d , then for any time horizon t ≥ we also have the reﬁned bias estimates, ≤ φ t ( Q ) − E (cid:104) (cid:98) φ t ( Q ) (cid:105) ≤ c ( Q ) 1 N I (5.14)when N is suﬃciently large (in case ( F1 ) with κ = 1 ) or for any N ≥ (in case ( F2 ) with κ = 0 ).The proof of this reﬁnement, and details on the constant c ( Q ) , is in [11, Theorem 2.3] and in[14].We will see subsequtently that the condition S ∈ S + d ensures that for any n ≥ , the n -th moments of the trace of the sample covariance are uniformly bounded w.r.t. the timehorizon (when the ﬂuctuation parameter is small enough) even when the matrix A is unstable .This condition S ∈ S + d may be thought of as a strengthening of a detectability/observabilitycondition. We discuss this condition more later as it (re-)appears throughout our presentation.The next theorem concerns these time-uniform moment estimates on the stochastic Riccatiﬂow in (4.4), i.e. on the ﬂow of the sample covariance matrix. Theorem 5.3.

Assume that S ∈ S + d . For any n ≥ , t ≥ , any Q ∈ S d , and any N suﬃcientlylarge, we have the uniform estimate, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n (1 + (cid:107) Q (cid:107) ) (5.15)17 urthermore, for any time horizon t ≥ τ > we also have the uniform estimates (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n,τ (5.16) In addition, in case ( F2 ), for any N ≥ , any n ≥ , Q ∈ S d , t ≥ and any s ≥ τ > we havethe reﬁned estimates, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c (1 + (cid:107) Q (cid:107) ) (1 + (cid:114) n N ) and (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ s ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c τ (1 + (cid:114) n N ) (5.17)The proof of the above Theorem is provided in [11, Theorem 2.2] where a precise descriptionof the (ﬁnite) parameters c n , c n,τ , c, c τ > are also provided. The ﬁrst estimate in (5.15) alsoholds if S = 0 when µ ( A ) < . The proof of this Theorem is based on a reduction of (4.4)to a scalar Riccati diﬀusion, a novel representation of its n -th powers, and a comparison of itsmoments to a judiciously designed deterministic scalar Riccati equation. We note the proof isconservative by nature (due to the scalar reduction and comparison).Now we turn to quantifying the ﬂuctuations of the matrix Riccati diﬀusions around theirlimiting (deterministic) values as found when N tends to ∞ . That is, we quantify the ﬂuctuationof the EnKF sample covariance about the limiting covariance of the classical Kalman-Bucy ﬁlter.

Theorem 5.4.

Assume that S ∈ S + d . For any n ≥ , t ≥ , any Q ∈ S d , and any N suﬃcientlylarge we have the uniform estimates, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) − φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n √ N (1 + (cid:107) Q (cid:107) ) (5.18) In case ( F2 ), for any N ≥ , any n ≥ , t ≥ , and any Q ∈ S d , we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) − φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c √ N (1 + (cid:107) Q (cid:107) ) (cid:18) (cid:114) n N (cid:19) (5.19)The estimates in Theorem 5.4 do not depend on Q ∈ S d when t ≥ τ for any τ > and with c n , c replaced with c n,τ , c τ ; e.g. similarly to (5.16) in Theorem 5.3.The proof of the preceding Theorem is provided in [11, Theorem 2.3] and in [14]. The prooffollows from a second-order expansion of the stochastic ﬂow (cid:98) φ t about the deterministic ﬂow φ t and then an appropriate bounding of the ﬁrst and second order stochastic terms. See [14] fordetails on the decomposition of (cid:98) φ t in terms of φ t plus stochastic terms of any order (in theﬂuctuation N ). We explore this expansion in the scalar case in more detail in a later section.The under bias result (5.13) holds with any N ≥ in both the VEnKF of case ( F1 ), and in the DEnKF of case ( F2 ). This under-bias is a motivation for so-called sample covariance regularisationin practice; e.g. so-called sample covariance inﬂation or localisation methods [5, 39, 36, 62, 31].Later we discuss the eﬀects of inﬂation in particular.As with the deterministic Riccati equation, we may bound the moments of the inverse of thestochastic Riccati ﬂow (cid:98) φ t ( Q ) under stronger conditions on the number of particles N required;e.g. see [11]. It follows that with Q ∈ S + d and with additional conditions on N , that for t ≥ τ > there exists a uniform positive deﬁnite lower bound on E [ (cid:98) φ t ( Q )] .A number of basic corollaries follow the proofs in [14, 11], for instance, we have the monotoneproperty, S d (cid:51) Q ≤ Q = ⇒ E (cid:104) (cid:98) φ t ( Q ) (cid:105) ≤ E (cid:104) (cid:98) φ t ( Q ) (cid:105) (5.20)and, for any Q ∈ S d , the ﬁxed upper bound, E (cid:104) (cid:98) φ t ( Q ) (cid:105) ≤ P ∞ + E t ( P ∞ ) ( Q − P ∞ ) E t ( P ∞ ) (cid:48) (5.21)18everal spectral estimates can be deduced from the estimates (5.14), (5.18) and (5.19). Forexample, in case ( F2 ), with κ = 0 and N ≥ then combining (5.19) with the n -version of theHoﬀman-Wielandt inequality we have the uniform estimate, sup ≤ i ≤ r (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) λ i (cid:16) (cid:98) φ t ( Q ) (cid:17) − λ i ( φ t ( Q )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( Q ) 1 √ N (5.22)In [14] we go further with regards to this ﬂuctuation analysis on the EnKF sample covariance.In [14] we present sharp and non-asymptotic expansions of the matrix moments of the matrixRiccati diﬀusion with respect to the parameter N . These results follow from a Taylor-typeperturbation expansion of the form, (cid:98) φ t = φ t + (cid:88) ≤ k

Consider only case ( F2 ). If µ ( A − P ∞ S ) < and S ∈ S + d , then for any n ≥ , z ∈ R d , Q ∈ S d , there exists some time (cid:98) t n −→ N →∞ ∞ such that for any ≤ t ≤ (cid:98) t n we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) ψ t ( z, Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( z, Q ) (5.25) See [9] for details on the time parameter (cid:98) t n −→ N →∞ ∞ . Unlike Theorem 5.3, the proof of this result requires contraction (or non-explosiveness) to beestablished a priori for the transition semigroup (cid:98) E s,t ( Q ) deﬁned in (5.8). Moreover, this result isknown only for the DEnKF with κ = 0 in the general multi-dimensional setting without stabilityassumptions on A itself.Now we turn to quantifying the ﬂuctuations of the sample mean around their limiting valuesas found when N tends to ∞ . That is, we quantify the ﬂuctuation of the EnKF sample meanabout the limiting conditional mean estimate from the classical Kalman-Bucy ﬁlter.

Theorem 5.6.

Consider only case ( F2 ). If µ ( A − P ∞ S ) < and S ∈ S + d , then for any n ≥ , z ∈ R d , Q ∈ S d , there exists some time (cid:98) t n −→ N →∞ ∞ such that for any ≤ t ≤ (cid:98) t n we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) ψ t ( z, Q ) − ψ t ( z, Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( z, Q ) 1 √ N (5.26) See [9] for details on the time parameter (cid:98) t n −→ N →∞ ∞ . (cid:98) E s,t ( Q ) . Also, this result is known only for the DEnKF with κ = 0 in thegeneral multi-dimensional setting without stability assumptions on A itself.In [26], the preceding two results, Theorem 5.5 and Theorem 5.6, are stated under thestrong stability assumption µ ( A ) < . That assumption ensures the contractive property holdsfor (cid:98) E s,t ( Q ) and consequently that (cid:98) ψ t ( z, Q ) is stable. In that case, with µ ( A ) < , these resultshold for both the VEnKF and the

DEnKF . The stability of (cid:98) ψ t ( z, Q ) more generally (without askingfor A to be stable) is considered in the following subsection and Theorem 5.5 and Theorem 5.6should be read in conjunction with those results.Asking for µ ( A − P ∞ S ) < in Theorems 5.5 and 5.6 is much weaker than asking for µ ( A ) < (e.g. we allow unstable A ). Note that the related condition Absc( A − P ∞ S ) < follows directlyfrom the classical observability and controllability assumptions on the signal-observation model(2.1), see (2.11). Model conditions for µ ( A − P ∞ S ) < may be thought of as a strengtheningof observability and controllability and is discussed later.The proof of Theorem 5.5 and Theorem 5.6 is provided in [13] in the one-dimensional settingwhere a detailed description of the (ﬁnite) parameters c n ( z, Q ) > are provided. The multi-dimensional result follows using similar proof methods to those used in [13] in combination withthe contraction properties of the transition matrix (cid:98) E s,t ( Q ) detailed in the next subsection. Inthe one-dimensional setting studied in [13], contraction of (cid:98) E s,t ( Q ) is given under very generalmodel conditions which also accommodate both the VEnKF and the

DEnKF . Consequently, in onedimension both Theorem 5.5 and Theorem 5.6 hold on an inﬁnite time horizon for any t ≥ and with any κ ∈ { , } .One may consider a perturbation expansion of the sample mean ﬂow as (cid:98) ψ t = ψ t + (cid:88) ≤ k induces an exceedinglypessimistic estimate larger than the estimated number of elementary particles of matter in thevisible universe. In this sense, non-time-uniform bounds of this form are clearly impracticalfrom a numerical user-case perspective. 20 .2 Stability Results and Long Time Behavioural Properties With Q ∈ S + d we set Λ( Q ) := (cid:107) Q (cid:107) + (cid:107) Q − (cid:107) and we consider the collection of Λ -norms on theset of probability measures Γ , Γ on S + d , indexed by (cid:126) > , and deﬁned by, (cid:107) Γ − Γ (cid:107) (cid:126) , Λ := sup (cid:40) | Γ ( F ) − Γ ( F ) | : (cid:107) F (cid:107) Λ := sup Q ∈ S + d | F ( Q ) | (cid:126) Λ( Q ) ≤ (cid:41) (5.30)for any bounded measurable function F on S d .It is known that the deterministic Riccati equation that describes the ﬂow of the covariancematrix in classical Kalman-Bucy ﬁltering tends to a ﬁxed point P ∞ for any initial point Q ∈ S d when the (time-invariant) model (2.1) is detectable and stabilisable; e.g. see (2.20) and [8]. Thenext result is the analogue of this idea in the EnKF setting and describes the stability of the ﬂowof the sample covariance.

Theorem 5.7.

Assume the ﬂuctuation parameter N is suﬃciently large such that E [ (cid:107) (cid:98) φ t ( Q ) (cid:107) ] and E [ (cid:107) (cid:98) φ − t ( Q ) (cid:107) ] are uniformly bounded (e.g. as in Theorem 5.3 for bounds on E [ (cid:107) (cid:98) φ t ( Q ) (cid:107) ] ).Then, there exists some ﬁnite constants c, α, (cid:126) > such that for any t ≥ and probabilitymeasures Γ , Γ on S + d , we have the Λ -norm contraction inequality (cid:107) Γ (cid:98) Π t − Γ (cid:98) Π t (cid:107) (cid:126) , Λ ≤ c e − α t (cid:107) Γ − Γ (cid:107) (cid:126) , Λ (5.31)Of course, setting Γ = (cid:98) Γ ∞ where (cid:98) Γ ∞ is the unique invariant probability measure describedin Theorem 5.1 implies that for any initial probability measure Q ∼ Γ on S + d we have that (cid:98) φ t ( Q ) tends to be distributed according to (cid:98) Γ ∞ . The proof of the above theorem is provided in [11,Theorem 2.4] and is based on matrix-valued Lyapunov and minorisation conditions (choosingthe Lyapunov candidate, Λ( · ) ).For one-dimensional models, the article [13] provides explicit analytical expressions for thereversible measure of (cid:98) P t in terms of the model parameters. As expected, heavy tailed reversiblemeasures arise when κ = 1 , and weighted Gaussian distributions when κ = 0 ; see the illustrativeexamples in a later section in this article. The article [13] also provides sharp exponential decayrates to equilibrium, in the sense that the decay rates tend to those of the limiting deterministicRiccati equation when N tends to ∞ . Recall that the stability properties of the deterministic ( N = ∞ ) semigroups E s,t ( Q ) associatedwith the classical Kalman-Bucy ﬁlter are rather well understood; e.g. see (2.13), (2.15), and(2.14) and also [8, 10].We emphasise that in the deterministic case ( N = ∞ ), stability of the matrix-valued Riccatidiﬀerential equation, e.g. as in (2.20), follows from the contraction properties of E s,t ( Q ) in(2.13); see [8, 10] for the derivation. Some intuition for this follows from the implicit form forthe solution in (2.17). Similarly, in classical Kalman-Bucy ﬁlter, the stability properties of theerror ﬂow (2.6) are related to the contraction properties of the state-transition matrix E s,t ( Q ) .The intuition for this follows again from the solution form in (2.16). The stability properties ofthe classical Kalman-Bucy error ﬂow are given in, e.g., (2.22) and (2.24); see [8].We come now to the contractive properties of (cid:98) E s,t ( Q ) deﬁned in (5.8). Firstly, we remarkthat if S ∈ S + d , then up to a change of basis we can always assume that S = I . Moreover, forany s, t ∈ [0 , ∞ [ we immediately have the rather crude almost sure estimate µ ( A ) < ⇒ (cid:107) (cid:98) E s,s + t ( Q ) (cid:107) ≤ exp [ t µ ( A ))] −→ t →∞ (5.32)In general, asking for A to be stable in this form is a very strong and restrictive condition.We typically seek contraction results on (cid:98) E s,t ( Q ) that accomodate arbitrary A ∈ M d matrices;21n particular, we seek to accommodate unstable signal matrices A , i.e. matrices with (some)non-negative eigenvalues. To this end, ﬁx Q ∈ S d and consider the process (cid:98) A deﬁned by (cid:98) A : t ∈ [0 , ∞ [ (cid:55)→ (cid:98) A t := A − (cid:98) φ t ( Q ) S (5.33)We write A for the analogous process driven by φ t ( Q ) , i.e. with N = ∞ .In this notation, for example when κ = 0 , combining (5.17) (5.19) and (2.15) with Krause’sinequality for any nd ≥ we also have the uniform estimate, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) SpecDist (cid:16) A t , (cid:98) A t (cid:17) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) nd ≤ c n ( Q ) 1 √ N (5.34)where we deﬁne the optimal matching distance between the spectrum of matrices A, B ∈ M d by SpecDist (

A, B ) = min perm( · ) max ≤ i ≤ d | λ i ( A ) − λ perm( i ) ( B ) | (5.35)where the minimum is taken over the set of d ! permutations of { , . . . , d } . In addition, for any t ≥ τ > , using the Lipschitz estimates discussed above we also have A ∞ := A − P ∞ S = ⇒ SpecDist ( A t , A ∞ ) ≤ c exp [ − α t/d ] (cid:107) Q − P ∞ (cid:107) /d (5.36)with the rate parameter α coming from (2.13). These spectral estimates are of interest on theirown, but are not immediately usable for controlling the contraction properties of the exponentialsemigroups.By Theorem 5.3 and Theorem 5.4, with S ∈ S + d , the collection of processes ( A , (cid:98) A ) introducedin (5.33) satisfy the following regularity properties: • Case κ ∈ { , } : For any n ≥ , t ≥ , Q ∈ S d , and any N suﬃciently large we have theuniform estimates √ N (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) A t − (cid:98) A t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n (1 + (cid:107) Q (cid:107) ) and (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) A t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n (1 + (cid:107) Q (cid:107) ) • Case κ = 0 :For any n ≥ , t ≥ , Q ∈ S d , and any N ≥ we have the uniform estimates √ N (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) A t − (cid:98) A t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c (1 + (cid:107) Q (cid:107) ) (1 + √ n √ N ) and (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) A t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c (1 + (cid:107) Q (cid:107) ) (1 + √ n √ N ) The stability properties of stochastic semigroups associated with a collection of stochasticﬂows ( A , (cid:98) A ) satisfying the above properties have been developed in our prior work [9]. Severallocal-type contraction estimates can be derived. The following results broadly imply the semi-group (cid:98) E s,t ( Q ) is asymptotically stable in probability when µ ( A ∞ ) < and a suﬃcient numberof particles are employed. Theorem 5.8.

Let κ ∈ { , } and S ∈ S + d . If µ ( A ∞ ) < , then for any increasing sequence ≤ s ≤ t k ↑ k →∞ ∞ , and for any Q ∈ S d , the probability of the following event lim sup k →∞ t k log (cid:107) (cid:98) E s,t k ( Q ) (cid:107) < µ ( A − P ∞ S ) is greater than − ν (5.37) for any ν ∈ ]0 , , as soon as N is suﬃciently large (as a function of ν ∈ ]0 , ). This log-Lyapunov estimate (5.37) immediately implies the semigroup (cid:98) E s,t k ( Q ) may be ex-ponentially contracting with a high probability; given strong observability and controllabilityconditions that imply µ ( A − P ∞ S ) < . A number of reformulations of this result that shedinsight individually are worth highlighting: 22 Let κ ∈ { , } . For any ≤ s ≤ t k ↑ k →∞ ∞ , there exists a sequence N := N k ↑ k →∞ ∞ such that we have the almost sure Lyapunov estimate lim sup k →∞ lim sup k →∞ t k log (cid:107) (cid:98) E s,s + t k ( Q )) (cid:107) < µ ( A − P ∞ S ) (5.38) • Let κ ∈ { , } . Then, for any increasing sequence of times ≤ s ≤ t k ↑ k →∞ ∞ , theprobability of the following event,  ∀ < ν ≤ ∃ l ≥ such that ∀ k ≥ l it holds that t k log (cid:107) (cid:98) E s,t k ( Q ) (cid:107) ≤

12 (1 − ν ) µ ( A − P ∞ S )  (5.39)is greater than − ν , for any ν ∈ ]0 , , as soon as N is suﬃciently large (as a function of n ≥ and ν ∈ ]0 , ). • Let κ ∈ { , } . Consider any s ≥ , any increasing sequence of time horizons t k ↑ k →∞ ∞ ,and any sequence N := N k ,n ↑ k →∞ ∞ such that (cid:80) k ≥ / (cid:112) N k ,n < ∞ for some n ≥ .Then, we have the almost sure Lyapunov estimate,  ∀ < ν ≤ ∃ l , l ≥ such that ∀ k ≥ l , ∀ k ≥ l it holds that t k log (cid:107) (cid:98) E s,s + t k ( Q ) (cid:107) ≤

12 (1 − ν ) µ ( A − P ∞ S )  (5.40)The ﬁrst dot-point result captured by (5.38) is derived from (5.37) in Theorem 5.8 viathe Borel-Cantelli lemma. The next two dot-point results provide some reformulation of thesupremum limit estimates (5.37) and (5.38) in terms of random relaxation time horizons andrandom relaxation-type ﬂuctuation parameters. The last reformulation in (5.40) underlines thefact that after some random time (i.e. determined by l ), and given some randomly suﬃcientlylarge number of particles (determined by l ) the semigroup (cid:98) E s,t ( Q ) is exponentially contractive.We have no direct control over the parameters l and l in (5.40) which depend on the randomnessin any realisation.Stronger results are applicable if we restrict κ = 0 . We have the following immediate corollaryof our prior work in [9]: Theorem 5.9.

Assume κ = 0 and S ∈ S + d . If µ ( A ∞ ) < , then the semigroup (cid:98) E s,t ( Q ) isasymptotically L n -stable for any n ≥ over time horizons with lengths controlled by N . Morespeciﬁcally, for any n ≥ , s ≥ , Q ∈ S d , there exists some time horizons t n < (cid:98) t n −→ N →∞ ∞ such that for any t n ≤ t ≤ (cid:98) t n we have t log E (cid:104) (cid:107) (cid:98) E s,s + t ( Q ) (cid:107) n (cid:105) ≤ n µ ( A − P ∞ S ) (5.41) whenever N is suﬃciently large such that (cid:98) t n > t n ; see [9] for details on these time parameters. Importantly, in this last result we have (cid:98) t n −→ N →∞ ∞ and thus we can control the horizonon which the semigroup (cid:98) E s,t ( Q ) is asymptotically L n -stable for any n ≥ when κ = 0 . In otherwords, the estimate (5.41) ensures that the stochastic semigroup (cid:98) E s,t ( Q ) is stable on arbitrarylong ﬁnite time horizons , as soon as κ = 0 , and when the perturbation parameter is chosensuﬃciently large. We have the following fact immediate from Theorem 5.9:23 Assume κ = 0 . For any n ≥ , s ≥ , we have lim sup N →∞ (cid:98) t n log E (cid:104) (cid:107) (cid:98) E s,s + (cid:98) t n ( Q ) (cid:107) n (cid:105) ≤ n µ ( A − P ∞ S ) Combining Theorem 5.8 and Theorem 5.9 we may draw the basic (qualitative) conclusionthat, after some initial time period, and given enough particles, the (noisy) exponential semi-groups (cid:98) E s,t ( Q ) are exponentially contractive (in some sense, e.g. almost-sure or L n -type) at arate related to µ ( A − P ∞ S ) . We note that the condition µ ( A − P ∞ S ) < is however strongerthan asking for Absc( A − P ∞ S ) < ; the latter of which follows immediately from classical ob-servability/controllability conditions on the model (or even weaker detectability/stabilisabilityconditions); see (2.11).Finally, we also have the following new result which extends the exponential decay resultsfor one-dimensional models presented in [13] to the determinant of the matrix-valued Riccatidiﬀusions considered herein. This is a type of stochastic Liouville formula. Theorem 5.10.

For any n ≥ , t ≥ , any Q ∈ S + d , and N suﬃciently large we have theexponential decay estimate E (cid:104) det ( (cid:98) E t ( Q )) n (cid:105) /n = E (cid:20) exp (cid:18) n (cid:90) t Tr( A − (cid:98) φ s ( Q ) S ) ds (cid:19)(cid:21) /n ≤ c n ( Q ) exp (cid:32) − t (cid:114) Tr (cid:16) (cid:98) R n (cid:98) S n (cid:17)(cid:33) (5.42) with (cid:98) R n := R − N (2 n + d + 1) R > and (cid:98) S n := S − N (2 n + d + 1) κ S > (5.43) In addition, the exists some function (cid:98) ν n with lim N →∞ (cid:98) ν n = 0 such that E (cid:104) det ( (cid:98) E t ( Q )) n (cid:105) /n ≤ c n ( Q ) exp (cid:16) − t (1 − (cid:98) ν n ) (cid:112) Tr( A ) + Tr( RS ) (cid:17) (5.44)The proof of this theorem is in [11, Theorem 2.7]. In the one-dimensional case, d = 1 , thisresult collapses to capture the strong exponential contraction results presented in [13]. Indeedin one dimension, Theorem 5.10 can be seen as a signiﬁcant improvement over both Theorem5.8 and Theorem 5.9 in both theoretical development and practical usability.In the scalar case, strong stability results on the stochastic Riccati ﬂow (cid:98) φ t analogous to thedeterministic setting, e.g. (2.20), also follow from Theorem 5.10; see also [13] and the illustrativeexamples in a later section in this article. In this section we consider the stability of the error ﬂow of (cid:98) ψ t ( z, Q ) := (cid:98) Z t = ( (cid:98) X t − X t ) in bothcase ( F1 ) and case ( F2 ) EnKF models with (cid:98) Z = ( (cid:98) X − X ) = z ∈ R d and (cid:98) P = Q ∈ S d . Thisﬂow may be related to the Ornstein-Uhlenbeck process (5.4) and whose solution can be writtenmore generally as in (5.9).Now from preceding results on the contraction of (cid:98) E s,t in Section 5.2.1, we may study thestability of the error ﬂow (cid:98) ψ t ( z, Q ) and its contraction properties.The following uniform error contraction estimate follows from (5.9) and Theorem 5.9, sup Q ∈ S d (cid:13)(cid:13)(cid:13) E (cid:16) (cid:98) ψ t ( z, Q ) | X (cid:17) (cid:13)(cid:13)(cid:13) ≤ c exp [ t α µ ( A − P ∞ S )] (cid:107) z − X (cid:107) (5.45)and holds for the DEnKF , with κ = 0 , for some α, c > , and under conditions compatiblewith the conditions in Theorem 5.9. This contraction result is analogous to (2.22) for the24lassical Kalman-Bucy ﬁlter; but under stronger conditions dictated by the available results onthe contraction properties of (cid:98) E t stated in Theorem 5.9. In particular, our methods prove thiscontraction (5.45) only in the case of the DEnKF , with κ = 0 , with N suﬃciently large, and ontime horizons compatible with those detailed in Theorem 5.9.The next results on the stability of (cid:98) ψ t ( z, Q ) similarly follow immediately from those stabilityresults in the preceding Section 5.2.1 but are stated at the level of the process (cid:98) ψ t ( z, Q ) itself,rather than the stochastic exponential semigroup (cid:98) E t . Theorem 5.11.

Let κ ∈ { , } and S ∈ S + d . If µ ( A ∞ ) < , then for any increasing sequence oftimes t k ↑ k →∞ ∞ and any z (cid:54) = z and any Q ∈ S d , the probability of the following event lim sup k →∞ t k log (cid:107) (cid:98) ψ t k ( z , Q ) − (cid:98) ψ t k ( z , Q ) (cid:107) < µ ( A − P ∞ S ) is greater than − ν (5.46) for any ν ∈ ]0 , , as soon as N is suﬃciently large (as a function of ν ∈ ]0 , ). Two reformulations of this result may shed insight individually and are worth highlighting: • Let κ ∈ { , } . For ≤ t k ↑ k →∞ ∞ , there exists a sequence N := N k ↑ k →∞ ∞ suchthat we have the almost sure Lyapunov estimate lim sup k →∞ lim sup k →∞ t k log (cid:107) (cid:98) ψ t k ( z , Q ) − (cid:98) ψ t k ( z , Q ) (cid:107) < µ ( A − P ∞ S ) (5.47) • Let κ ∈ { , } . Consider any increasing sequence of time horizons t k ↑ k →∞ ∞ , and anysequence N := N k ,n ↑ k →∞ ∞ such that (cid:80) k ≥ / (cid:112) N k ,n < ∞ for some n ≥ . Then, wehave the almost sure Lyapunov estimate,  ∀ < ν ≤ ∃ l , l ≥ such that ∀ k ≥ l , ∀ k ≥ l it holds that t k log (cid:107) (cid:98) ψ t k ( z , Q ) − (cid:98) ψ t k ( z , Q ) (cid:107) ≤

12 (1 − ν ) µ ( A − P ∞ S )  (5.48)Again we emphasise that the last reformulation in (5.48) highlights that after some randomtime (i.e. determined by l ), and given some randomly suﬃciently large number of particles(determined by l ) the diﬀerence of error ﬂows (or sample means; see (5.7)) is exponentiallystable.We have stronger L n -type stability results in case ( F2 ), i.e. when κ = 0 , as illustrated next. Theorem 5.12.

Assume κ = 0 and S ∈ S + d . If µ ( A ∞ ) < , then for any n ≥ , Q ∈ S d , thereexists some time horizons t n < (cid:98) t n −→ N →∞ ∞ such that for any t n ≤ t ≤ (cid:98) t n we have the stabilityestimate, E (cid:104) (cid:107) (cid:98) ψ t ( z , Q ) − (cid:98) ψ t ( z , Q ) (cid:107) n (cid:105) /n ≤ exp (cid:18) t µ ( A − P ∞ S ) (cid:19) (cid:107) z − z (cid:107) (5.49) whenever N is suﬃciently large such that (cid:98) t n > t n ; see [9] for details on these time parameters. We emphasise again that (cid:98) t n −→ N →∞ ∞ . With regards to qualitative reasoning, we maycombine Theorem 5.11 and Theorem 5.12 and draw the basic (qualitative) conclusion that,after some initial time period, and given enough particles, the diﬀerence in (noisy) error ﬂows ( (cid:98) ψ t ( z , Q ) − (cid:98) ψ t ( z , Q )) , or the diﬀerence in sample means ( (cid:98) χ t ( x , Q ) − (cid:98) χ t ( x , Q )) , is exponentiallystable (in some sense) with a rate related to µ ( A − P ∞ S ) . We note again that µ ( A − P ∞ S ) < is more restrictive than Absc( A − P ∞ S ) < ; the latter of which follows from classical observ-ability/controllability conditions, see (2.11). We have stronger stability results in the case κ = 0 L n -type stability results for any n ≥ ; but on ﬁnite horizons thatcan be made arbitrarily long.In the scalar case d = 1 , stronger stability results on the error ﬂow (cid:98) ψ t ( z, Q ) follow fromthe contraction properties in Theorem 5.10 under weaker model and ensemble (particle) sizeassumptions. The strong L n -type stability results in the scalar d = 1 case are quantitative withexponential rates, on inﬁnite horizons, that collapse to the optimal deterministic rates (explicitlycomputable when d = 1 ) as N → ∞ . See [13] and the illustrative examples in the next sectionin this article. Throughout this section we let d = 1 and R ∧ S > . The latter condition R ∧ S > is bothnecessary and suﬃcient for observability and controllability to hold in one dimension.When P ∈ [0 , ∞ [ , the deterministic Riccati equation deﬁned on [0 , ∞ [ , in the classicalKalman-Bucy ﬁlter, satisﬁes the quadratic diﬀerential equation (2.3) which may be written alsoas, ∂ t P t = Ricc( P t ) = − S ( P t − (cid:37) + ) ( P t − (cid:37) − ) , (6.1)with the equilibrium states ( (cid:37) − , (cid:37) + ) deﬁned by S (cid:37) − := A − (cid:112) A + RS < < S (cid:37) + := A + (cid:112) A + RS (6.2)With P ∈ [0 , ∞ [ , we have P t → t →∞ P ∞ = (cid:37) + . It follows that, A − P ∞ S = − (cid:112) A + RS (6.3)and thus simplifying, e.g. (2.13), we have the equality, E t ( Q ) = c t ( Q ) E t ( P ∞ ) ≤ c ( Q ) E t ( P ∞ ) and E t ( P ∞ ) = e − t √ A + RS (6.4)where −√ A + RS may be viewed explicitly as the optimal semigroup contraction rate in thescalar case. The explicit form of the constants c t ( Q ) , c ( Q ) is also available in the scalar case,see [13] and also the general Floquet-type multivariate result in [10].The Riccati drift function Ricc( · ) is also the derivative of the double-well drift function F ( Q ) = − S Q ( Q − ζ − ) ( Q − ζ + ) (6.5)with the roots ζ − := 3 A S − (cid:34)(cid:18) A S (cid:19) + 3 RS (cid:35) / < < ζ + := 3 A S + (cid:34)(cid:18) A S (cid:19) + 3 RS (cid:35) / (6.6)In this situation, the general Riccati diﬀusion (5.2) describing the ﬂow of the sample covariancein both case ( F1 ) and case ( F2 ) resumes to the Langevin-Riccati drift-type diﬀusion process, d (cid:98) P t = ∂F ( (cid:98) P t ) dt + 1 √ N (cid:98) P / t Σ / κ ( (cid:98) P t ) d M t (6.7)with the mapping Σ κ deﬁned in (5.3). Recall that case ( F1 ) corresponds to the vanilla EnKF ,denoted by

VEnKF , and case ( F2 ) corresponds to the ‘deterministic’ EnKF , denoted by

DEnKF .Also observe that ∂F > on the open interval ]0 , ζ + [ and ∂F (0) = R > σ (0) so that theorigin is repellent and instantaneously reﬂecting.26t any time t ≥ we may comment on the boundedness of certain moments of the sam-ple variance and the ﬂuctuation of the sample variance and sample mean about their limiting(classical Kalman-Bucy variance and mean) values.For example, taking Theorem 5.3 and the detailed scalar exposition in [13], we have for any n ≥ , t ≥ , Q ∈ [0 , ∞ [ , and any N ≥ ∨ κ ( n − , the uniform estimates, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( Q ) (6.8)For any t ≥ , Q ∈ [0 , ∞ [ , and any N ≥ , we have the uniform under bias estimate, E (cid:104) (cid:98) φ t ( Q ) (cid:105) ≤ φ t (6.9)and we have explicit bounds on φ t in the scalar case in [13]. This under bias result holds for any N ≥ in both the VEnKF of case ( F1 ), and in the DEnKF of case ( F2 ). This under bias motivates so-called variance/covariance regularisation methods in practice; e.g. so-called sample covarianceinﬂation or localisation methods. Later we discuss the eﬀects of inﬂation in particular.We have bounds on the inverse Riccati ﬂow (leading to lower bounds on the sample covari-ance) under stronger conditions on N ; see [13].From Theorem 5.5 and the detailed scalar exposition in [13] we have for any n ≥ , t ≥ , Q ∈ [0 , ∞ [ , and any N > n + 1)(1 + 4 κ ) , the uniform estimates, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) ψ t ( z, Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( z, Q ) (6.10)From Theorem 5.4, Theorem 5.6 and the scalar exposition in [13], we have for any n ≥ , t ≥ , Q ∈ [0 , ∞ [ , and any N ≥ ∨ κ ( n − , the uniform ﬂuctuation estimate, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) φ t ( Q ) − φ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( Q ) √ N (6.11)and for any N > n + 1)(1 + 4 κ ) we have the uniform ﬂuctuation estimate, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) ψ t ( Q ) − ψ t ( Q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n ≤ c n ( z, Q ) √ N (6.12)Note that the constants (e.g. c n ( Q ) ) above in (6.8) and (6.11) are examined in [13] in thescalar case d = 1 in more detail.The beneﬁt and real interest in the scalar case is the ability to explicate the convergencerates, e.g. as in (6.4) and as shown subsequently for the contraction properties of the EnKF sample means and variances.Note that we may expand the stochastic ﬂow of the sample variance as in (5.23). Exploringthis idea further in the scalar case, we may write the ﬁrst and second-order ﬂuctuations as, (cid:98) ϕ t := √ N [ (cid:98) φ t − φ t ] , (6.13) (cid:98) ϕ (2) t := √ N [ (cid:98) ϕ t − ϕ t ] (6.14)where in the second line we emphasise the superscript · (2) is an order index (not a power) andwhere, ϕ t ( Q ) := (cid:90) t ( ∂φ t − s ) ( φ s ( Q )) Σ / κ ( φ s ( Q )) d M s (6.15)and the derivatives ∂ k φ t of any order are explicitly given in [13]. In this case, ∂φ t ( Q ) = E t ( Q ) (where the superscript here is now a power). We then have, (cid:98) φ t = φ t + 1 √ N ϕ t + 1 N (cid:98) ϕ (2) t (6.16)27he natural central limit theorem follows, √ N (cid:104) (cid:98) φ t − φ t (cid:105) −→ N →∞ ϕ t (6.17)The (non-)asymptotic variance is estimated in [13, 14].The expansion (6.16) allows ones to better understand the bias properties of the samplecovariance (cid:98) φ t . Writing the third-order ﬂuctuation as, (cid:98) ϕ (3) t := √ N [ (cid:98) ϕ (2) t − ϕ (2) t / (6.18)and expanding and taking expectations, E (cid:104) (cid:98) φ t (cid:105) = φ t + 12 N E (cid:104) ϕ (2) t (cid:105) + 1 N / E (cid:104) (cid:98) ϕ (3) t (cid:105) (6.19)and limits we have the dominating ( N -order-asymptotic) bias is given by N (cid:16) E (cid:104) (cid:98) φ t (cid:105) − φ t (cid:17) −→ N →∞ (cid:90) t (cid:0) ∂ φ t − s (cid:1) ( φ s ( Q )) Σ κ ( φ s ( Q )) ds < (6.20)which is negative (agreeing with our general under bias results, e.g. (6.9)).The expansion (5.27) of the sample mean (error) may be explored similarly to the aboveexpansion of the sample covariance. The ﬁrst order terms ϑ t in (5.27) related to the centrallimit theorem are studied in [14, Section 1.3].The inﬁnitesimal generator of the diﬀusion (6.7) on ]0 , ∞ [ is given in Sturm-Liouville formby the equation L ( f ) = 2 N ι Σ κ e V ∂ (cid:0) e − V ∂f (cid:1) with V ( · ) = − N (cid:90) · δ ∂F ( x ) ι − ( x ) Σ − κ ( x ) dx (6.21)for any δ > and where we recall the identity function ι ( x ) := x . This implies that a reversiblemeasure of the Riccati diﬀusion (5.2) in the scalar d = 1 case is given by the formula (cid:98) Γ ∞ ( dx ) ∝ ]0 , ∞ [ ( x ) N ι ( x ) Σ κ ( x ) exp ( − V ( x )) dx (6.22)In case ( F1 ) corresponding to the VEnKF , we have that L is reversible w.r.t. the probabilitymeasure (cid:98) Γ ∞ on ]0 , ∞ [ deﬁned by, (cid:98) Γ ∞ ( dx ) ∝ ]0 , ∞ [ ( x ) exp (cid:32) N A √ RS tan − (cid:32) x (cid:114) SR (cid:33)(cid:33) (cid:18) xR + Sx (cid:19) N x ( R + Sx ) dx (6.23)See also [13] for alternate derivations/forms of this heavy tailed invariant measure. The heavytailed nature of the stationary measure implies that for the n -th moment to exist one requires N > ∨ n − . As expected this condition on N is generally weaker than that required for n -th moment boundedness at any time t ≥ in (6.8). In Figure 1 we plot the line deﬁned by (2 n − / N for various N values. With any N ≥ , we have existence of the ﬁrst two moments.Higher-order moments even in one dimension are still troublesome (for the VEnKF , κ = 1 ).In fact, the diﬀusion (cid:98) P t for the sample variance in case ( F1 ) does not have any exponentialmoments in the stationary regime for any ﬁnite N ≥ . That is, for any t ≥ and any ﬁnite α > we have Law ( Q ) = (cid:98) Γ ∞ = ⇒ E (cid:104) exp (cid:16) α (cid:107) (cid:98) φ t ( Q ) (cid:107) (cid:17)(cid:105) = ∞ for any N ≥ .We also remark that the heavy tailed nature of this stationary distribution implies thatnumerical stability in practice may be worrisome. In the stationary regime, it is realistic to28 Plot of (2n-4)/N...... with N moving from 1 to 50 fanning left to right

Figure 1: Existence of moments for the EnKF. Each line corresponds to some number N with N moving from to fanning left to right. The ‘x-axis’ corresponds to moment orders n anda moment n exists whenever the line (2 n − / N is strictly less than one.expect samples from the tails in this case, and these may be large enough and/or frequent enoughto cause numerical divergence. This property may lead to so-called catastrophic divergence asstudied in, e.g., [48]. In [37, 35, 48] mechanisms for catastrophic divergence are studied incomplex nonlinear systems. Here we argue that even in linear systems, the heavy-tailed natureof the invariant measure of the sample covariance may lead to samples numerically large enoughto cause numerical catastrophe in any practical computing system.In case ( F2 ) corresponding to the DEnKF , we have that L is reversible w.r.t. the probabilitymeasure (cid:98) Γ ∞ on ]0 , ∞ [ deﬁned by, (cid:98) Γ ∞ ( dx ) ∝ ]0 , ∞ [ ( x ) x N − exp (cid:32) − S N R (cid:18) x − AS (cid:19) (cid:33) dx (6.24)Note this measure has Gaussian tails, and we contrast this with the heavy tailed nature of (6.23).This is signiﬁcant, since it implies that the sample variance (and mean) of this DEnKF will exhibitsmaller ﬂuctuations than the

VEnKF , and that all moments (including exponential moments) existin this case for any choice of N ≥ . This latter result is consistent with Theorem 5.3 at any time t ≥ (and in the general multivariate setting). We can also expect better numerical stability(e.g. less outliers); including better time-discretisation properties [40] in case ( F2 ). These betterﬂuctuation properties are already apparent in the preceding results (e.g. see Theorem 5.3, 5.4and 5.12) in the full multi-dimensional setting.As an illustrative example, take A = 20 (i.e. the underlying signal model is highly unstable), R = S = 1 and N = 6 . In Figure 2 we compare the invariant measure for the ﬂow of the samplevariance in each case.We see in Figure 2 the heavy tails of the invariant measure (6.23) for the vanilla EnKF samplevariance, and conversely the Gaussian-type tails in the case (6.24) of the ‘deterministic’

EnKF .Note also the positioning of the mode/mean in each case. In case ( F1 ) of the VEnKF , n -th ordermoments exist only when (2 n − / N is strictly less than one (in this case for n < ); while allmoments exist in case ( F2 ) for the DEnKF .We ﬁnally tun to the convergence/stability properties of the sample variance and samplemean. In the case of the sample variance, we know from Theorem 5.7 that convergence of (cid:98) φ t toits invariant measure (cid:98) Γ ∞ (e.g. as depicted in Figure 2 and described by (6.23) or (6.24)) holdsif N > κ S ) / (2 R ) . Proof of this condition on N follows from Theorem 5.7, the originalmultivariate statement of the same result in [11, Theorem 2.4] and bounds on the mean of thesample variance ﬂow and its inverse [13, 11]. In [13] we also consider contraction and stability29 Sample Variance Invariant Measures 'Vanilla' EnKF'Deterministic' EnKF

Figure 2: The invariant measure of the sample variance of the ‘vanilla’

EnKF in case ( F1 ), versusthat of the ‘deterministic’ EnKF in case ( F2 ).properties of the distribution of the sample covariance with respect to a particular Wassersteinmetric; as opposed to the Λ -norm contraction used in Theorem 5.7. An interesting result from[13] is that when κ = 0 , and for stable signal models (i.e. A < ), the Riccati diﬀusion (6.7)(describing the ﬂow of the sample covariance) may converge faster to its invariant measure in(6.24), than the deterministic Riccati (6.1) does to its ﬁxed point in (6.2).In one-dimensional ( d = 1 ) settings we may say more on the (stochastic) stability of the EnKF sample covariance φ t and sample mean ψ t based on the contraction properties of thestochastic transition matrix (cid:98) E t ( Q ) deﬁned in (5.8). It follows from Theorem 5.10 that we havethe exponential decay estimate with N > ∨ (4 n − κ which comes from [11, Theorem 2.7], E (cid:104) (cid:98) E t ( Q ) n (cid:105) /n = E (cid:20) exp (cid:18) n (cid:90) t ( A − (cid:98) φ s ( Q ) S ) ds (cid:19)(cid:21) /n ≤ c n ( Q ) exp (cid:18) − t (cid:113) (cid:98) R n (cid:98) S n (cid:19) (6.25)In addition, the exists some function (of N ) lim N →∞ (cid:98) ν n = 0 such that E (cid:104) (cid:98) E t ( Q ) n (cid:105) /n = c n ( Q ) exp (cid:16) − t (1 − (cid:98) ν n ) (cid:112) A + RS (cid:17) (6.26)which we may relate (or contrast) with the exact contraction rate of the exponential semigroupassociated with the deterministic Riccati equation in (6.4) describing the true ﬁlter variancein the classical Kalman-Bucy ﬁlter. The rate parameter (cid:98) ν n is diﬀerent between the VEnKF and

DEnKF . Details on the parameter (cid:98) ν n are given in [13] but importantly for both κ ∈ { , } werecover naturally the convergence rate of the deterministic Riccati ﬂow in (6.4).The exponential decay of the exponential semigroup (cid:98) E t ( Q ) plays a central role in the stabilityof the pair of processes ( (cid:98) φ t , (cid:98) ψ t ) . For large time horizons the Lyapunov exponent can be estimatedby the formula t log (cid:98) E t ( Q ) = 1 t (cid:90) t ( A − (cid:98) φ s ( Q ) S ) ds −→ t →∞ A − (cid:98) Γ ∞ ( ι ) S, (6.27)where (cid:98) Γ ∞ denotes the reversible measure (6.23) or (6.24). We also have the following estimatesof the Lyapunov exponent (6.27) from [13], and that relate also to the under-bias result in (6.9).Let κ = 0 and let Law ( Q ) = (cid:98) Γ ∞ be the reversible probability measure deﬁned in (6.24). Then,for any t ≥ , we have N > ⇒ − (cid:112) A + RS ≤ A − E ( (cid:98) φ t ( Q )) S ≤ − (cid:115) A + RS (cid:18) − N (cid:19) < (6.28)30imilarly assuming κ = 1 with Law ( Q ) = (cid:98) Γ ∞ and (cid:98) Γ ∞ deﬁned in (6.23) we have for any t ≥ , N > ⇒ − (cid:112) A + RS ≤ A − E ( (cid:98) φ t ( Q )) S ≤ − (cid:113) A + RS ( − (4 /N ) ) − A/N /N < (6.29)As noted, the left hand inequalities in the preceding two equations follows immediately from theunder-bias result in (6.9).From the contraction properties on E [ (cid:98) E t ( Q ) n ] we may deduce, in the scalar setting, strongstability results on the stochastic Riccati ﬂow (cid:98) φ t analogous to the deterministic setting, e.g.(2.20). Similarly, strong stability results on the error ﬂow (cid:98) ψ t follow from the contraction proper-ties of E [ (cid:98) E t ( Q ) n ] . Importantly, in the scalar d = 1 case of (cid:98) ψ t ( z, Q ) we may relax the multivariateresults like Theorem 5.11 and Theorem 5.12 which require more restrictive model (e.g. the strongobservability/stability µ ( A − P ∞ S ) < condition) and ensemble (particle) size assumptions.From [13, Theorem 5.10] we have that for any N > ∨ κ ( n − , E (cid:104) (cid:107) (cid:98) φ t ( Q ) − (cid:98) φ t ( Q ) (cid:107) n (cid:105) /n ≤ c n (cid:107) Q − Q (cid:107) exp (cid:16) − t (1 − (cid:98) ν n ) (cid:112) A + RS (cid:17) (6.30)for some function (of N ) lim N →∞ (cid:98) ν n = 0 . Note we have found no analogue of this result in themultivariate setting.From [13, Theorem 6.1] we have that for any N > n + 1)(1 + 4 κ ) , E (cid:104) (cid:107) (cid:98) ψ t ( z , Q ) − (cid:98) ψ t ( z , Q ) (cid:107) n (cid:105) /n ≤ c n ( z , z , Q , Q ) ( (cid:107) z − z (cid:107) + (cid:107) Q − Q (cid:107) ) exp (cid:16) − t (1 − (cid:98) ν n ) (cid:112) A + RS (cid:17) (6.31)for some function (of N ) lim N →∞ (cid:98) ν n = 0 . We may contrast this result with the more restrictiveTheorem 5.12 in the multivariate setting. Note in the scalar setting we accommodate diﬀerentinitial variance positions, and we recover, over fully inﬁnite horizons, a continuous relationshipwith the optimal stability rates of (6.4).The constants in (6.26), (6.30), and (6.31) are given explicitly in terms of the model param-eters in [13]. We remark that across these three stability results, the details of (cid:98) ν n vary [13], butimportantly we recover the optimal (classical Kalman-Bucy) rates lim N →∞ (cid:98) ν n = 0 .We consider an illustration of the ﬂuctuation and stability properties of the sample variancein the diﬀerent EnKF variants. Consider again the model leading to Figure 2, and let (cid:98) P = 0 .The deterministic Riccati ﬂow ( N = ∞ , in (6.1)) of the classical Kalman-Bucy ﬁlter and withthe chosen model parameters is given in Figure 3, along with N = 100 sample paths of thesample variances for both the VEnKF and the

DEnKF . Flow of the deterministic Riccati equation

Sample path trajectories ('vanilla' EnKF sample variance)

Figure 3: Flow of the deterministic Riccati equation, and N = 100 sample paths of the VEnKF sample variance of case ( F1 ), and N = 100 paths of the DEnKF sample variance of case ( F2 ).31ote in Figure 3 the drastically reduced ﬂuctuations in the ‘deterministic’ EnKF samplevariance sample paths. At equilibrium, these ﬂuctuations are related to the invariant measuresof the two

EnKF varieties in (6.23) and (6.24).In Figure 4 we plot the ﬂow of the ﬁrst two central moments and the rd through the thstandardised central moments for both the VEnKF and

DEnKF sample variance distribution. Recallthat N = 6 in this example, and we expect moments of the VEnKF sample variance in case ( F1 )to exist up to n = 4 with n = 5 the boundary case; while all moments exist for the DEnKF ofcase ( F2 ). Vanilla EnKF: Sample Variance Moments

Mean

Variance × × × Deterministic EnKF: Sample Variance Moments

Mean

Variance

Figure 4: Flow of the sample variance moments for the

VEnKF and the

DEnKF .We note in Figure 4 that the sample variance moments for the

VEnKF in case ( F1 ) beginto destabilize around the th/ th moments as expected. Importantly, the mean of the samplevariance for the VEnKF is very negatively biased in this case, while the mean of the

DEnKF in case( F2 ) is quite accurate. Note also the very large variance in the sample variance for the VEnKF .Lastly, observe that (6.7) has non-globally Lipschitz coeﬃcients in case ( F1 ). The drift isquadratic, while the diﬀusion has a polynomial growth of order / . It follows by [40] that thenaive Euler-type time-discretization may blow up, regardless of the boundedness properties ofthe limiting (continuous-time) diﬀusion. Let ( V it , W it , X i ) with ≤ i ≤ N + 1 be ( N + 1) independent copies of ( V t , W t , X ) . Now considera modiﬁcation of the individual particle update equations in the two cases of interest, ( F1 ) d X i,ξt = A X i,ξt dt + R / d V it + (cid:16) (cid:98) P ξt + ξ T (cid:17) H (cid:48) R − (cid:104) d Y t − (cid:16) H X i,ξt dt + R / d W it (cid:17)(cid:105) (7.1) ( F2 ) d X i,ξt = A X i,ξt dt + R / d V it + (cid:16) (cid:98) P ξt + ξ T (cid:17) H (cid:48) R − (cid:34) d Y t − H (cid:32) X i,ξt + (cid:98) X ξt (cid:33) dt (cid:35) ξ ∈ [0 , ∞ [ , and T ∈ S r is some given reference matrix. Here, (cid:98) P ξt = (cid:98) φ ξt ( Q ) denotes thesample covariance-type function of X i,ξt given by, (cid:98) η ξt := η N ,ξt = 1 N + 1 N +1 (cid:88) i =1 δ X i,ξt = ⇒ (cid:98) X ξt := X N ,ξt = 1 N + 1 N +1 (cid:88) i =1 X i,ξt and (cid:98) P ξt := P N ,ξt = N + 1 N P (cid:98) η ξt (7.2)Recall the uniﬁed representation (for both the VEnKF and the

DEnKF ) for the ﬂow of thesample mean, sample covariance, and the sample error ﬂow in equations (5.1) through to (5.4).Now consider the modiﬁcation of the state estimator (sample mean) update equation result-ing from the ( ξ T ) -modiﬁcation to the particle updates, d (cid:98) X ξt = ( A − ( (cid:98) P ξt + ξ T ) S ) (cid:98) X ξt dt + ( (cid:98) P ξt + ξ T ) H (cid:48) R − d Y t + 1 √ N + 1 Σ / κ,ξ ( (cid:98) P ξt ) d B t (7.3)with the mapping, Σ κ,ξ ( Q ) := R + κ ( Q + ξ T ) S ( Q + ξ T ) with κ = (cid:26) in case ( F1 ) in case ( F2 ) (7.4)With (cid:98) Z ξt := ( (cid:98) X ξt − X t ) we then also have that, d (cid:98) Z ξt = ( A − ( (cid:98) P ξt + ξ T ) S ) (cid:98) Z ξt dt + ( (cid:98) P ξt + ξ T ) H (cid:48) R − d V t − R / d W t + 1 √ N + 1 Σ / κ,ξ ( (cid:98) P ξt ) d B tlaw = ( A − ( (cid:98) P ξt + ξ T ) S ) (cid:98) Z ξt dt + Ω / κ,ξ ( (cid:98) P ξt ) d (cid:98) B t (7.5)with, Ω κ,ξ := Σ ,ξ + 1 N + 1 Σ κ,ξ (7.6)In un-regularised ensemble Kalman ﬁltering, we approximate φ t by the sample covariance (cid:98) φ t since the dimension of (cid:98) X t may be in the many millions; see [31]. However, when computingthe sample covariance in high-dimensions, rank deﬁcient estimation is common due to a lack ofenough samples. Covariance inﬂation, leading to an approximation of the form ( (cid:98) φ ξt ( Q ) + ξ T ) , inthe update equation (e.g. (7.1)), is a common, simple means of addressing this rank deﬁciency[31].Note that the perturbation in the resulting ﬂow of (cid:98) φ ξt ( Q ) comes from a (rather delicate)feedback loop adding ξ T to the covariance of the signal (cid:98) X ξt at each instant. The ﬂow of (cid:98) φ ξt ( Q ) is given by, d (cid:98) P ξt = (cid:20) ( A − (1 − κ )2 ξ T S − (cid:98) P ξt S ) (cid:98) P ξt + (cid:98) P ξt ( A − (1 − κ )2 ξ T S − (cid:98) P ξt S ) (cid:48) + R + κ ξ T S T (cid:21) dt + 2 √ N (cid:20) (cid:98) P ξt / d M t Σ / κ,ξ ( (cid:98) P ξt ) (cid:21) sym (7.7)In the limit N → ∞ we recover a perturbed, deterministic, Riccati equation that describesthe ﬂow of the limiting covariance. This perturbed Riccati equation is studied in [16, 12]. Forany size (cid:107) ξT (cid:107) < ∞ , the perturbed Riccati ﬂow qualitatively retains all the stability propertiesof the nominal Riccati ﬂow (e.g. (2.20), but with a diﬀerent steady state value), and the size ofthe error between the two grows in a well-quantiﬁed continuous way.33n the limiting case, we have via [12, Theorem 2.1] that φ t ( Q ) ≤ φ ξt ( Q ) in case ( F1 ). In case( F2 ) we have that φ ξt ( Q ) ≤ φ t ( Q ) in the limit N → ∞ .For any s ≤ t and Q ∈ S d we deﬁne the stochastic state-transition matrix, (cid:98) E ξs,t ( Q ) := exp (cid:20)(cid:73) ts (cid:16) A − ξ T S − (cid:98) φ ξu ( Q ) S (cid:17) du (cid:21) ⇔ ∂ t (cid:98) E ξs,t ( Q ) = (cid:16) A − ξ T S − (cid:98) φ ξu ( Q ) S (cid:17) (cid:98) E ξs,t ( Q ) (7.8)Note that this semigroup (cid:98) E ξs,t ( Q ) is associated with the evolution of the (inﬂation) regularisedsample mean in (7.3) or the error ﬂow (7.5) in both case ( F1 ) and ( F2 ). Unlike the un-regularisedsetting, this same semigroup is not directly related to the evolution of the sample covariance,in (7.7); for example, in case ( F1 ) the semigroup associated with the evolution of the samplecovariance is just (cid:98) E s,t ( Q ) as given in (5.8) and studied throughout the preceding section.We can comment on the eﬀect of inﬂation regularisation on the contraction properties of (cid:98) E ξs,t ( Q ) , as compared e.g. to (cid:98) E s,t ( Q ) . Firstly, it is worth noting, given the contraction estimatesin Section 5.2.1, that, µ (( A − ξ T S ) − P S ) ≤ µ ( A − P S ) (7.9)for any ﬁxed matrix P ∈ S d and S ∈ S d . Arguing as in (5.32), when S ∈ S + d , then up to achange of basis we can always assume that S = I . We then have, µ ( A ) < (cid:107) ξ T (cid:107) = ⇒ (cid:107) (cid:98) E ξs,t ( Q ) (cid:107) ≤ exp ( µ ( A − ξ T )( t − s )) −→ ( t − s ) →∞ (7.10)which illustrates the added stabilising eﬀects of ξ T in the extreme case in which (cid:98) P ξt S hasno stabilising eﬀect at all. Contrast this with (5.32). Then one interpretation of the precedingrelationship is that ξ T extends the set of signal matrices A ∈ M d for which one may immediatelyachieve stabilisation (regardless of the eﬀect of (cid:98) P ξt S ). In practice, (cid:98) P ξt S will also act to stabilisethe ﬁlter, see e.g. (5.49). Indeed, in the classical Kalman ﬁltering setting (2.2), (2.3) with ξ = 0 ,the time-varying matrix ( A − P t S ) is stabilising [8] for any A ∈ M d , even A unstable. In the EnKF , we know that (cid:98) P t will ﬂuctuate about P t , e.g. see Theorem 5.4. Therefore, the stabilisationproperties of ( A − (cid:98) P t S ) are unclear; indeed the study of (cid:98) E s,t ( Q ) in the preceding Section 5.2.1is concerned with precisely this issue. The above implies that the addition of ξ T can act tocounter the negative eﬀects of this ﬂuctuation (and directly add a stabilising eﬀect on the stateestimation error).Finally, we have φ t ( Q ) ≤ φ ξt ( Q ) in case ( F1 ) and φ ξt ( Q ) ≤ φ t ( Q ) in case ( F2 ). The semigroupassociated with the error ﬂow (7.5) in both cases is the same. The inequality φ t ( Q ) ≤ φ ξt ( Q ) in case ( F1 ) suggests that the diﬀusion ﬂuctuation in (7.3) or (7.7) will increase. However, weconversely expect that with S ∈ S + d we have µ (( A − ξ T S ) − φ ξt ( Q ) S ) ≤ µ (( A − ξ T S ) − φ t ( Q ) S ) and thus we gain a type of stabilising eﬀect. Inﬂation in case ( F1 ) is then a delicate balancingtradeoﬀ between adding noise to the diﬀusion coeﬃcients (which may kill the existence of samplecovariance moments, for example), and adding a stabilising eﬀect on the sample mean error ﬂow.When ξ > is large enough we can achieve added stabilisation in case ( F2 ), as compared to thenon inﬂated case. This is not automatic as in case ( F1 ) because φ ξt ( Q ) ≤ φ t ( Q ) . However, theﬂuctuations are (further) decreased with inﬂation in case ( F2 ). In places, we switch between rather quantitative estimates to those more qualitative in nature.In part this is to simplify presentation, or when the details are (likely) not tight and thus perhapsof little quantitative interest. In some in places it is because we did not obtain more precisedescriptions of the estimates involved. Reﬁning these estimates may be of practical interest insome cases; e.g. when deriving estimates on the required number of particles N for stability ofthe sample covariance (or convergence to its invariant measure).34he results presented thus far consisted of constants, e.g. c , c n , c τ , etc, that depend onthe model parameters ( A, R, S ) , but importantly not on the ensemble size ( N + 1) or the timehorizon t ∈ [0 , ∞ [ . Due to the dependence on the model (e.g. ( A, R, S ) ), these constants dependimplicitly (via the matrix norms used) on the underlying signal dimension d . It would be ofinterest to pull this dependence out more explicitly depending on the matrix norm we are using,so as to quantify, at least in some general sense, the tradeoﬀ between N and d . For example, inTheorem 5.4 or Theorem 5.6 detailing the ﬂuctuation of the sample covariance and sample meanabout their limiting covariance and (Kalman-Bucy) state estimate values, it would be of interestto know how this ﬂuctuation scales with dimension d , say e.g. with ﬁxed N . Unfortunately, theproof tools used in the development of this work does not lend itself naturally to this analysis.The matrix S := H (cid:48) R − H plays a critical role throughout with regards to obtaining time-uniform ﬂuctuation and then subsequently stability/convergence results. In particular, the as-sumption that S ∈ S + d is strictly positive-deﬁnite is needed in numerous places; speciﬁcally whenthe signal model A is unstable. This assumption amounts to a type of strong observability result;e.g. a requirement on the “fullness” of the observations and the size and rank of the observationmatrix H . It is worth emphasising that this assumption appears in many technical articlesdiscussing the performance properties of the ensemble Kalman ﬁlter; e.g. [47, 26, 80, 25, 22, 23].Typically, the tools used in the proofs in [26, 14, 13, 11] are not sophisticated enough to accom-modate zero eigenvalues of S . A basic example of this deﬁciency is in the proof of time-uniformmoment boundedness of (cid:98) P t , stated in Theorem 5.3. In that proof, we resort to taking trace oreigenvalue-type reductions of the matrix-valued Riccati diﬀusion and studying a scalar compari-son Riccati equation. This scalar reduction means that we must look at the minimum eigenvalueof S (because it appears with a minus sign in the Riccati equation) and thus we cannot allowthis value to be zero (because we would lose this term completely in the scalar comparison). Toobtain uniform-in-time bounds, one needs the stabilising eﬀect of this non-zero S in the scalarcomparison. See the proof in [11, Theorem 2.2] for this very transparent example. In this exam-ple, one may relax the condition on S to S ∈ S d at the expense of time exponentially growingbounds. Related diﬃculties in allowing S ∈ S d instead of S ∈ S + d arise in numerous other places(and as noted in other related works [47, 26, 80, 25, 22, 23]). One diﬃculty is related to stabilityof the (time-varying) matrix ( A − (cid:98) P t S ) and the positive-deﬁniteness properties of product (cid:98) P t S as discussed subsequently.We have focused signiﬁcant eﬀort on relaxing the assumption that the underlying signal isstable. Note that if A is stable, e.g. strongly stable in the sense that µ ( A ) < , then the stabilityof µ ( A − (cid:98) P t S ) may be trivially inherited whenever S ∈ S + d via a change of coordinates, see [26].We see here again the use of S ∈ S + d as it pertains to the product (cid:98) P t S . If S ∈ S d is only positivesemi-deﬁnite, then one can construct counterexamples such that even if (cid:98) φ t = ( φ t + (cid:98) ϕ t / √ N ) ∈ S + d is positive deﬁnite, there exists ﬂows (cid:98) ϕ t such that µ ( A − (cid:98) P t S ) = µ ( A − φ t S − (cid:98) ϕ t S/ √ N ) > .The assumption µ ( A ) < is made in [26] in the linear-Gaussian setting and follow also in, e.g.,[47, 80, 25, 23] when reducing those studies to the linear-Gaussian setting. If A is allowed tobe unstable, then the asymptotic (time-varying) stability of ( A − P t S ) in the classical Kalmanﬁlter follows under so-called detectability (or observability) conditions [50, 66, 82]. Detectabilityintricately relates the relevant rank deﬁcient directions in P t and S in terms of the unstabledirections in A (i.e. it basically ensures those directions of A that are unstable are observed(as captured by S ) and non-zero weighted in the update Kalman gain via P t ). The rank of thesample covariance (cid:98) P t is at most N ≥ . If N < d , then (cid:98) P t is almost surely rank deﬁcient andthus has zero eigenvalues in some directions. In general, we cannot control the directions inwhich the random, sub-rank, (cid:98) P t has zero eigenvalues (e.g. to play nicely with S in the sense ofdetectability). If A is unstable in those directions, the ﬁlter is consequently unstable in thosedirections. Thus, there is a basic, unavoidable, but also transparent tradeoﬀ in requiring eitherstability of µ ( A ) < or suﬃciently large ensemble sizes N ≥ d in the derivation of uniform-in-time stability results for the EnKF . In the stability results stated in this work, we emphasised35nstable models A but required suﬃciently large ensemble sizes N ≥ d . Nevertheless, moststability results stated in this work with the hypothesis that “ N is suﬃciently large” may berestated with this condition replaced with “ N ≥ and µ ( A ) < ”. In [14, 13, 11] the detailson “ N is suﬃciently large” are given more explicitly. Note some results that do not consider orrely on the long time stability behaviour of the samples, e.g. the ﬂuctuation size of the samplecovariance about its true value, hold with N ≥ and any matrix A , e.g. this is true for the DEnKF in Theorem 5.4.We remark that the assumption that the true Kalman-Bucy ﬁlter is stable in the sense µ ( A − P ∞ S ) < is used in a number of the ﬂuctuation (on the sample mean) and long-time behavioural results in this article. This is stronger than the more natural assumption that Absc( A − P ∞ S ) < . The latter follows from the very natural model assumptions of detectabilityand stabilisability, see (2.11) and the discussion following that equation, e.g. [52, Theorems 9.12,9.15]. In this sense, asking for µ ( A − P ∞ S ) < on the true ﬁlter may be viewed as asking for atype of strong observability and controllability. This is similarly in line with asking for S ∈ S + d which is itself a type of strong observability assumption.It is worth noting again that all moment boundedness and ﬂuctuation results stated inthis work hold with any N ≥ and without further assumptions if one replaces the constants c, c n , c n ( Q ) , c n ( z, Q ) . . . with functions that depend on (and grow with) the time horizon t ≥ . The focus of this article is ensemble ﬁltering in the linear-Gaussian (continuous-time) setting.The results surveyed herein portray a rather detailed theory of ﬂuctuation and stability results inthat case. In practice, the ensemble Kalman ﬁltering methodology is applied in high-dimensional,nonlinear state-space models, e.g. see [31, 32]. The evolution equations for each ensemblemember in the case of nonlinear state-space models are given in (4.10).In [83, 78] a novel McKean-Vlasov-type diﬀusion is studied which has conditional distributionequal to the true Bayesian ﬁlter. The mean-ﬁeld approximation of this McKean-Vlasov-typediﬀusion in [83, 78] resembles somewhat superﬁcially the particle ﬁlters in (4.10). However, theanalogue of the gain function in (4.10) in the ﬁlter of [83, 78] is derived as the solution of acertain Poisson-type partial diﬀerential equation. In the linear-Gaussian case, the ﬁlter of [83]coincides with the

DEnKF .In the nonlinear model setting, the ensemble ﬁlters in (4.10) are not derived as sampledversions of an equation whose (conditional) distribution is equal to the Bayesian ﬁlter. That is,these ﬁlters are not derived as sampled versions of the McKean-Vlasov-type diﬀusion in [83, 78].Conversely, in the limit ( N → ∞ ) the ensemble ﬁlters in (4.10) do not converge to an object withdistribution equal to the optimal Bayes ﬁlter. In fact, the object these ﬁlters converge to hasnot been rigorously established in general and its properties, as compared to the true Bayesianﬁlter, remain an open topic. Thus, in the nonlinear setting, the ensemble ﬁlters discussed inthis work, see (4.10), may be viewed as approximations of the ﬁlter in [83, 78] only in somevery weak sense (despite any superﬁcial resemblance to the contrary). Indeed, the gain functionapproximation in (4.10) is likely a very poor approximation of the solution of the Poisson-typepartial diﬀerential equation in [83]; except of course in linear-Gaussian models. Rather, we mayargue, as we have earlier in this article, that the ensemble ﬁlters in (4.10) should be viewed in thecontext of so-called observer theory, and related not to Bayesian ﬁltering but rather to the moregeneral topic of (dynamic) state estimation [2, 7]. The goal of state estimation in this contextis to design an observer that tracks in some suitable (typically point-wise) sense the underlyingsignal and perhaps provides some usable measure of uncertainty on this estimate. The goal isnot to develop an approximation (at each time) of the true conditional (Bayesian) distributionof the signal given the observations. The latter contains signiﬁcantly more information than isperhaps needed in many practical applications. Nevertheless, we also argue that the ﬁltering36deas in [83, 78], and suitable approximations thereof, are in need of further investigation.In [25] a class of so-called ensemble extended Kalman ﬁlters ( En-EKF ) is developed that isbased on a type of particle approximation of the linearisation-based extended Kalman ﬁlter,see [2]. This ensemble ﬁlter is interesting because the sample mean is shown to converge (in N → ∞ ) to the extended Kalman ﬁlter state estimate. This extended Kalman state estimatorhas been widely studied in nonlinear ﬁltering and control theory [2, 7, 71, 17, 46], and may beviewed more as a type of nonlinear state estimator rather than a Bayesian ﬁlter [7, 71].When considering nonlinear signal models, the long time behavioural analysis of various EnKF methods in [47, 80, 25] assumes a strong type of stability property on the signal (whichin the linear case would reduce to assuming that A is Hurwitz stable in our model (2.1)). Thisstability assumption on the true signal is precisely what we aim to relax in our work; albeitlimited in our study to linear models. Filter stability without assumptions on the stability ofthe true signal will ultimately require some control of the ﬂuctuation properties of the sampledobserver, e.g. see the discussion in the preceding section on this topic (in the linear-Gaussianmodel setting). This ﬂuctuation analysis is lacking somewhat in the nonlinear model setting. Itis complicated in that case by the absence of any closed-form evolution equations for the samplemean and sample covariance.Viewing, or even designing, an ensemble ﬁlter (or its sample mean for example) as a (dy-namic) state estimator (or observer) may have some beneﬁts. In particular, stability may be alarger design consideration if starting from this viewpoint rather than seeking Bayesian proba-bilistic properties. It may be possible to then also exploit the properties of existing nonlinearstate estimators which have traditionally been rigorously analysed, e.g. [7, 71, 17, 46].This is exempliﬁed in the ( En-EKF ) in [25] that converges to the extended Kalman ﬁlterin the limit N → ∞ . The stability of the extended Kalman ﬁlter as a nonlinear observer hasbeen widely studied, e.g. see [7, 71, 17, 46]. Although strong signal stability assumptions aretaken in [25], it would be natural to consider the ( En-EKF ) in [25] without the underlying signalstability assumption and look at developing the ﬂuctuation type analysis considered herein inthe linear-Gaussian setting. We may then also exploit the stability analysis that already exists[7, 71, 17, 46] for the limiting extended Kalman state estimator. This is analogous in manyways to the stability properties and observability/controllability properties used herein in thelinear-Gaussian setting.Inﬂation is used in [47, 81] in the nonlinear model setting to aid in stability. This is similarto the study considered herein on stability under inﬂation in linear-Gaussian models. It seemsnatural that added inﬂation acts to stabilise the various ensemble ﬁlters. In the context of thepreceding discussion, inﬂation-based state estimators may also be viewed in the context of stablenonlinear observers, rather than heuristic adaptions of approximate Bayesian ﬁlters.Finally, we remark that the transport-based ensemble ﬁlter

DEnTF , see case (

NF3 ) in (4.10),is studied in [22] in a particular nonlinear setting. In that case again, it is typical to assumea square observation matrix, or in other words to take the observations as linear and assumetheir exists a change of coordinate so that H = I . Note this assumption is made also in[47, 80, 25, 23] which otherwise consider certain classes of nonlinear signals. Thus, this strong(and linear) observability assumption seems key to analysis in the ensemble ﬁltering literatureeven when moving away from the linear signal model. References [1] J.I. Allen, M. Eknes and G. Evensen. An ensemble Kalman ﬁlter with a complex marineecosystem model: Hindcasting phytoplankton in the Cretan Sea. Annales Geophysicae. vol.21. pp. 399–411 (2003).[2] B.D.O. Anderson and J.B. Moore. Optimal Filtering. Dover Publications (1979).373] J.L. Anderson. An ensemble adjustment Kalman ﬁlter for data assimilation. MonthlyWeather Review. vol. 129, no. 12. pp. 2884–2903 (2001).[4] J.L. Anderson. A local least squares framework for ensemble ﬁltering. Monthly WeatherReview. vol. 131, no. 4. pp 634–642 (2003).[5] J.L. Anderson and S.L. Anderson. A Monte Carlo Implementation of the Nonlinear FilteringProblem to Produce Ensemble Assimilations and Forecasts. Monthly Weather Review. vol.127, no. 12. pp. 2741–2758 (1999).[6] A. Bain and D. Crisan. Fundamentals of Stochastic Filtering. Springer (2009).[7] J.S. Baras, A. Bensoussan, and M.R. James. Dynamic observers as asymptotic limits ofrecursive ﬁlters: Special cases. SIAM Journal on Applied Mathematics. vol. 48, no. 5. pp.1147–1158 (1988).[8] A.N. Bishop and P. Del Moral. On the Stability of Kalman-Bucy Diﬀusion Processes. SIAMJournal on Control and Optimization. vol. 55, no. 6. pp 4015–4047 (2017); arxiv e-print arXiv:1610.04686 updated.[9] A.N. Bishop and P. Del Moral. Stability Properties of Systems of Linear Stochastic Diﬀer-ential Equations with Random Coeﬃcients. SIAM Journal on Control and Optimization.vol. 57, no. 2. pp. 1023–1042 (2019); arXiv e-print, arXiv:1804.09349 .[10] A.N. Bishop and P. Del Moral. An explicit Floquet-type representation of Riccati aperiodicexponential semigroups. arXiv e-print, arXiv:1805.02127 (2018).[11] A.N. Bishop and P. Del Moral. On the stability of matrix-valued Riccati diﬀusions. Elec-tronic Journal of Probability. vol. 24, paper no. 24 (2019); arXiv e-print, arXiv:1808.00235 .[12] A.N. Bishop and P. Del Moral. On the robustness of Riccati ﬂows to complete modelmisspeciﬁcation. Journal of the Franklin Institute. vol. 355, no. 15. pp 7178–7200 (2018).[13] A.N. Bishop, P. Del Moral, K. Kamatani, and B. Remillard. On one-dimensional Riccatidiﬀusions. Annals of Applied Probability. vol. 29, no. 2. pp. 1127–1187 (2019); arXiv e-print, arXiv:1711.10065 .[14] A.N. Bishop, P. Del Moral, and A. Niclas. A perturbation analysis of stochastic matrixRiccati diﬀusions. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques. vol.56, no. 2. pp. 884–916 (2020); arXiv e-print, arXiv:1709.05071 .[15] A.N. Bishop, P. Del Moral and A. Niclas. An introduction to Wishart matrix moments.Foundations and Trends in Machine Learning. vol. 11, no. 2. pp. 97–218 (2018); arXive-print, arXiv:1710.10864 .[16] A.N. Bishop, P. Del Moral and S. Pathiraja. Perturbations and Projections of Kalman-BucySemigroups. Stochastic Processes and their Applications. vol. 128, no. 9. pp. 2857–2904.(2018).[17] A.N. Bishop and P. Jensfelt. A stochastically stable solution to the problem of robocen-tric mapping. In Proceedings of the 2009 IEEE International Conference on Robotics andAutomation, Kobe, Japan (May, 2009).[18] G. Burgers, P.J. van Leeuwen, and G. Evensen. Analysis Scheme in the Ensemble KalmanFilter. Monthly Weather Review, vol. 126, no. 6. pp. 1719–1724 (1998).3819] F.M. Callier and J.L. Willems. Criterion for the Convergence of the Solution of the RiccatiDiﬀerential Equation. IEEE Transactions on Automatic Control. vol. 26, no. 6. pp. 1232–1242 (1981).[20] F.M. Callier and J. Winkin. Convergence of the Time-Invariant Riccati Diﬀerential Equa-tion towards Its Strong Solution for Stabilizable Systems. Journal of Mathematical Analysisand Applications. vol. 192, no. 1. pp. 230–257 (1995).[21] N.K. Chada, M.A. Iglesias, L. Roininen, and A.M. Stuart. Parameterizations for ensembleKalman inversion. Inverse Problems. vol. 34, no. 5. (2018).[22] J. de Wiljes, S. Reich, and W. Stannat. Long-Time Stability and Accuracy of the EnsembleKalman-Bucy Filter for Fully Observed Processes and Small Measurement Noise. SIAMJournal on Applied Dynamical Systems. vol. 17, no. 2. pp. 1152–1181 (2018).[23] J. de Wiljes and X.T. Tong. Analysis of a localised nonlinear Ensemble Kalman Bucy Filterwith complete and accurate observations. arXiv e-print, arXiv:1908.10580 (2019).[24] P. Del Moral. Feynman-Kac Formulae. Springer (2004).[25] P. Del Moral, A. Kurtzmann, and J. Tugaut. On the Stability and the Uniform Propagationof Chaos of a Class of Extended Ensemble Kalman–Bucy Filters. SIAM Journal on Controland Optimization. vol. 55, no.1. pp. 119–155 (2017).[26] P. Del Moral and J. Tugaut. On the stability and the uniform propagation of chaos prop-erties of ensemble Kalman-Bucy ﬁlters. Annals of Applied Probability. vol. 28, no. 2. pp790–850 (2018).[27] J.L. Doob. Stochastic Processes. John Wiley & Sons, New York (1953).[28] A. Doucet, N. de Freitas and N.J. Gordon (editors). Sequential Monte Carlo Methods inPractice. Springer (2001).[29] A. Doucet, S. Godsill and C. Andrieu. On sequential Monte Carlo sampling methods forBayesian ﬁltering. Statistics and Computing. vol. 10, no. 3. pp. 197–208 (2000).[30] G. Evensen. Sequential data assimilation with a nonlinear quasi-geostrophic model usingMonte Carlo methods to forecast error statistics. Journal of Geophysical Research: Oceans.vol. 99, no. C5. pp. 10143–10162 (1994).[31] G. Evensen. The Ensemble Kalman Filter: Theoretical Formulation and Practical Imple-mentation. Ocean Dynamics. vol. 53, no. 4. pp. 343–367 (2003).[32] G. Evensen. Data Assimilation: The Ensemble Kalman Filter. Springer Science & BusinessMedia, 2nd edition, (2009).[33] G. Evensen, J. Hove, H.C. Meisingset, E. Reiso, K.S. Seim. Using the EnKF for assistedhistory matching of a North Sea Reservoir Model. In Proceedings of the 2007 SPE ReservoirSimulation Symposium, Houston, Texas (February, 2007).[34] N. Gordon, J. Salmond and A. Smith. A novel approach to non-linear/non-GaussianBayesian state estimation. IEE Proceedings on Radar and Signal Processing. vol. 140, no.2. pp. 107–113 (1993).[35] G.A. Gottwald and A.J. Majda. A mechanism for catastrophic ﬁlter divergence in dataassimilation for sparse observation networks. Nonlinear Processes in Geophysics. vol. 20,no. 5. pp. 705–712 (2013). 3936] T.M. Hamill, J.S. Whitaker, and C. Snyder. Distance-Dependent Filtering of BackgroundError Covariance Estimates in an Ensemble Kalman Filter. Monthly Weather Review. vol.129, no. 11. pp. 2776–2790 (2001).[37] J. Harlim and A.J. Majda. Catastrophic Filter Divergence in Filtering Nonlinear DissipativeSystems. Communications in Mathematical Sciences. vol. 8, no. 1. pp. 27–43 (2010).[38] P.L. Houtekamer and H.L. Mitchell. Data assimilation using an ensemble Kalman ﬁltertechnique. Monthly Weather Review. vol. 126, no. 3. pp. 796–811 (1998).[39] P.L. Houtekamer and H.L. Mitchell. A Sequential Ensemble Kalman Filter for AtmosphericData Assimilation. Monthly Weather Review. vol. 129, no. 1. pp. 123–137 (2001).[40] M. Hutzenthaler, A. Jentzen and P.E. Kloeden. Strong and weak divergence in ﬁnite time ofEuler’s method for stochastic diﬀerential equations with non-globally Lipschitz continuouscoeﬃcients. Proceedings of the Royal Society of London A: Mathematical, Physical andEngineering Sciences. vol. 467, no. 2130. pp. 1563–1576 (2011).[41] M.A. Iglesias, K.J.H. Law, and A.M. Stuart. Ensemble Kalman methods for inverse prob-lems. Inverse Problems. vol. 29, no. 4. (2013).[42] C.J. Johns and J. Mandel. A two-stage ensemble Kalman ﬁlter for smooth data assimilation.Environmental and Ecological Statistics. vol. 15, no. 1. pp. 101–110 (2008).[43] R.E. Kalman and R.S. Bucy. New Results in Linear Filtering and Prediction Theory. Journalof Basic Engineering. vol. 83, no. 1. pp. 95–108 (1961).[44] E. Kalnay. Atmospheric Modelling, Data Assimilation and Predictability. Cambridge Uni-versity Press (2003).[45] I. Karatzas and S.E. Shreve. Brownian Motion and Stochastic Calculus. Springer (1996).[46] T. Karvonen, S. Bonnabel, E. Moulines and S. Särkkä. On stability of a class of ﬁlters fornon-linear stochastic systems. arXiv e-print, arXiv:1809.05667 (2018).[47] D. Kelly, K.H.J. Law, and A.M. Stuart. Well-posedness and accuracy of the ensembleKalman ﬁlter in discrete and continuous time. Nonlinearity. vol. 27, no. 10. pp. 2579–2603(2014).[48] D. Kelly, A.J. Majda, and X.T. Tong. Concrete ensemble Kalman ﬁlters with rigorous catas-trophic ﬁlter divergence. Proceedings of the National Academy of Sciences of the UnitedStates of America. vol 112, no. 34. pp. 10589–10594 (2015).[49] V. Kucera. A Contribution to Matrix Quadratic Equations. IEEE Transactions on Auto-matic Control. vol. 17, no. 3. pp. 344–347 (1972).[50] H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. Wiley-Interscience (1972).[51] E. Kwiatkowski and J. Mandel. Convergence of the Square Root Ensemble Kalman Filterin the Large Ensemble Limit. SIAM/ASA Journal on Uncertainty Quantiﬁcation. vol. 3,no. 1. pp. 1–17 (2015).[52] P. Lancaster and L. Rodman. Algebraic Riccati Equations. Oxford University Press (1995).[53] K.J.H. Law, A.M. Stuart and K. Zygalakis. Data Assimilation: A Mathematical Introduc-tion. Springer (2015). 4054] K.J.H. Law, H. Tembine and R. Tempone. Deterministic Mean-Field Ensemble KalmanFiltering. SIAM Journal on Scientiﬁc Computing. vol. 38, no. 3. pp. A1251-A1279 (2016).[55] F. Le Gland, V. Monbet and V.D. Tran. Large sample asymptotics for the ensemble Kalmanﬁlter. Chapter 22 in The Oxford Handbook of Nonlinear Filtering. pp. 598–631 (2011).[56] K.A. Lisaeter, J. Rosanova and G. Evensen. Assimilation of ice concentration in a coupledice-ocean model using the Ensemble Kalman Filter. Ocean Dynamics. vol. 53, no. 4. pp.368–388 (2003).[57] D.M. Livings, S.L. Dance and N.K. Nichols. Unbiased Ensemble Square Root Filters. Phys-ica D: Nonlinear Phenomena. vol. 237, no. 8. pp. 1021–1028 (2008).[58] A.J. Majda and J. Harlim. Filtering Complex Turbulent Systems. Cambridge UniversityPress (2012).[59] A.J. Majda and X.T. Tong. Performance of Ensemble Kalman Filters in Large Dimensions.Communications in Mathematical Sciences. vol. 71, no. 5. pp. 892–937 (2018).[60] J. Mandel, L. Cobb, and J.D. Beezley. On the convergence of the ensemble Kalman ﬁlter.Applications of Mathematics. vol. 56, no. 6. pp. 533–541 (2011).[61] H.P. McKean. A class of Markov processes associated with nonlinear parabolic equations.Proceedings of the National Academy of Sciences. vol. 56, no. 6. pp.1907–1911 (1966).[62] H.L. Mitchell, P.L. Houtekamer and G. Pellerin. Ensemble size, balance, and model-errorrepresentation in an ensemble Kalman ﬁlter. Monthly Weather Review. vol. 130, no. 11.pp. 2791–2808 (2002).[63] B.P. Molinari. The time-invariant linear-quadratic optimal control problem. Automatica.vol. 13, no. 4. pp. 347–357 (1977).[64] G. Naevdal, L.M. Johnsen, S.I. Aanonsen and E.H. Vefring. Reservoir monitoring andcontinuous model updating using ensemble Kalman ﬁlter. In Proceedings of the 2003 SPEAnnual Technical Conference and Exhibition, Denver, Colorado (October, 2003).[65] E. Ott, B.R. Hunt, I. Szunyogh, A.V. Zimin, E.J. Kostelich, M. Corazza, E. Kalnay, D.Patil, and J.A. Yorke. A local ensemble Kalman ﬁlter for atmospheric data assimilation.Tellus A. vol. 56, no. 5. pp. 415–428 (2004).[66] P. Park and T. Kailath. Convergence of the DRE solution to the ARE strong solution.IEEE Transactions on Automatic Control. vol. 42, no. 4. 573–578 (1997).[67] M-A. Poubelle, I.R. Petersen, M.R. Gevers, and R.R. Bitmead. A Miscellany of Results onan Equation of Count J. F. Riccati. IEEE Transactions on Automatic Control. vol. 31, no.7. pp. 651–654 (1986).[68] P. Rebeschini and R. Van Handel. Can local particle ﬁlters beat the curse of dimensionality?The Annals of Applied Probability. vol. 25, no. 5. pp. 2809–2866 (2015).[69] S. Reich and C.J. Cotter. Ensemble ﬁlter techniques for intermittent data assimilation.Large Scale Inverse Problems: Computational Methods and Applications in the EarthSciences (eds: M. Cullen, M.A. Freitag, S. Kindermann, R. Scheichl), pages 91–134. DeGruyter Publishers (2013). See also: arXiv e-print, arXiv:1208.6572arXiv:1208.6572