A multigrid/ensemble Kalman Filter strategy for assimilation of unsteady flows
Gabriel Moldovan, Guillaume Lehnasch, Laurent Cordier, Marcello Meldi
Institut Pprime, CNRS - ISAE-ENSMA - Université de Poitiers, 11 Bd. Marie et Pierre Curie, Site du Futuroscope, TSA 41123, 86073 Poitiers Cedex 9, France
Abstract
A sequential estimator based on the Ensemble Kalman Filter for Data Assimilation of fluid flows is presented in this research work. The main feature of this estimator is that the Kalman filter update, which relies on the determination of the Kalman gain, is performed by exploiting the algorithmic features of the numerical solver employed as a model. More precisely, the multilevel resolution associated with the multigrid iterative approach for time advancement is used to generate several low-resolution numerical simulations. These results are used as ensemble members to determine the correction via the Kalman filter, which is then projected on the high-resolution grid to correct a single simulation corresponding to the numerical model. The assessment of the method is performed via the analysis of one-dimensional and two-dimensional test cases, using different dynamic equations. The results show an efficient trade-off in terms of accuracy and required computational costs. In addition, a physical regularization of the flow, which is not granted by classical KF approaches, is naturally obtained owing to the multigrid iterative calculations. The algorithm is also well suited for the analysis of unsteady phenomena and, in particular, for potential application to in-streaming Data Assimilation techniques.
Keywords:
Kalman Filter, Data Assimilation, compressible flows

∗ Corresponding author, [email protected]

Preprint submitted to Journal of Computational Physics, Monday 21st December 2020

1. Introduction

In Computational Fluid Dynamics (CFD), newly developed numerical methods are generally assessed in terms of accuracy via comparison with experimental data [1]. In practice, this validation step is far from trivial since many sources of error are inevitably introduced in the simulations. First, the partial differential equations used to derive the numerical scheme may be restricted to oversimplified physical models, such as the Boussinesq approximation applied in thermal convection or the incompressibility condition. Second, the discretization process and the use of iterative numerical methods introduce computational errors in the representation of the flow features [2]. Third, boundary and initial conditions are usually very sophisticated in complex applications, but detailed a priori knowledge is insufficiently available. Last, for very high Reynolds number configurations, turbulence/subgrid-scale modelling must be included in order to reduce the required computational costs [3]. All of these sources of error exhibit complex interactions owing to the non-linear nature of the dynamical models used in numerical applications, such as the Navier-Stokes equations.

The experimental results are also affected by uncertainties and biases. In many cases, the set-up of the problem can be controlled only up to a finite precision (alignment between the flow and the wind tunnel or immersed bodies, mass flow rate, ...). This kind of uncertainty, which clearly affects every physical system but cannot be exactly quantified, is usually referred to as epistemic uncertainty. In addition, experimental results are also affected by the precision (or bias) of the acquisition and measurement system.
Thus, the main difficulty in the comparison between numerical and experimental results is understanding how much of the observed difference is due to an actual lack of precision of the numerical model, and how much is instead associated with differences in the set-up of investigation.

One possible strategy to obtain an optimal combination of the knowledge coming from simulations and experiments is to derive a state estimation which complies with both sources of information. The degree of precision of such an estimation is connected with the confidence in the sources of information. This has the advantage of naturally incorporating in the analysis the uncertainty and bias present in the sources of information. Tools providing such state estimation are found in different disciplines: control theory for state observers [4], and Data Assimilation (DA) [5, 6, 7] for weather prediction, ocean modelling and, more recently, mechanical engineering problems. Essentially, DA methods combine information issued from two sources: i) a model, which provides a dynamical description of the phenomenon in the physical domain; ii) a set of observations, which are usually sparse and/or local in time and space. These methods are classified in different families according to the way the state estimation is performed. One classical criterion of classification deals with the operative strategy used to derive the state estimation.
Variational approaches solve a constrained optimization problem over the parametric space characterizing the model (usually coefficients defining boundary conditions or physical models). The solution of the variational problem minimizes prescribed error norms so that the assimilated model complies with the observations provided over a specified time window. Methods from this family, which include 3D-Var and 4D-Var, usually exhibit very high accuracy [8, 9, 10, 11]. However, they are also affected by several drawbacks. First, the formulation of the adjoint problem that is introduced to perform the parametric optimization can be difficult, if not impossible, when automatic differentiation is not employed. Second, the adjoint problem is defined backward in time, which may lead to numerical difficulties of resolution related to the unaffordable data storage that is needed, and to the amplification of the adjoint solution that frequently happens when multi-scale interactions are dominant [5, 8, 12]. Third, standard variational data assimilation methods are intrusive from a computational point of view, requiring the development of an adjoint calculation code, or the availability of the source code when the use of automatic differentiation is planned. In most cases, modifications are required in order to maintain the code or extend the applications (a change of the definition of the errors, for instance). For commercial software without an open-source licence, these modifications are expensive.
Non-intrusive methods that require no modification of the original calculation code are therefore preferable in data assimilation.

Another family of DA methods is represented by the sequential approaches. These methods, mostly based on Bayes' theorem, provide a probabilistic description of the state estimation. A well-known approach is the Kalman Filter (KF) [13]. Extensions and generalizations of this method have also been developed, such as the Extended Kalman Filter (EKF) [14], which is tailored for non-linear systems, and the Ensemble Kalman Filter (EnKF) [6]. This class of methods solves the state estimation problem by transporting the error covariance matrices of the model and the observations. These methods are usually more flexible than variational approaches (no computation of first-order sensitivities is required), but the time advancement and update of the covariance matrices are prohibitively expensive for the large-scale problems encountered in practical applications [15]. One possible strategy consists in reducing the order of the Kalman filter [16] or filtering the error covariance matrix. Inspired by a domain localization procedure, Meldi & Poux [17, 18] proposed a strategy based on an explicit filter of the error covariance matrix. The application of this estimator to different turbulent flows exhibited encouraging results considering the relatively small increase in computational resources. A more popular strategy for data assimilation in engineering applications is the Ensemble Kalman Filter [5, 6, 19], which relies on a Monte Carlo implementation of the Bayesian update problem. The EnKF (and follow-up models) was introduced as an extension of the original Kalman filter for high-dimensional systems for which transporting the covariance matrix is not computationally feasible. The EnKF replaces the covariance matrix by the sample covariance matrix computed from an ensemble of state vectors.
The main advantage of the EnKF is that advancing a high-dimensional covariance matrix is achieved by simply advancing each member of the ensemble. Several research works applying it in fluid mechanics have been reported in the literature in recent years [20, 21, 22]. Despite the interest of this non-intrusive technique, and the possibility to perform efficient parametric inference, the computational costs can still be prohibitive for realistic applications. Statistical convergence is usually obtained for a typical ensemble size going from 60 to 100 members [5].

In addition, the state estimation obtained via sequential tools does not necessarily comply with a model solution, i.e. the conservativity of the dynamic equations of the model is violated. This aspect is a potentially critical issue in fluid mechanics studies. Violation of conservativity may result in the loss of conservation of some physical properties of the flow (such as mass or momentum conservation) as well as in the emergence of non-physical discontinuities in the flow quantities. The aforementioned issues significantly affect the precision of the prediction of the flow and may eventually produce irreversible instabilities in the time advancement of the dynamical model. A number of works in the literature have provided advancements in the form of additional constraints to be included in the state estimation process. Meldi & Poux [17] used a recursive procedure and a Lagrangian multiplier (the pressure field) to impose the zero-divergence condition of the velocity field for incompressible flows. Other proposals deal with imposing hard constraints in the framework of an optimization problem [23], ad-hoc augmented observations [24] and generalized regularization [25].
These approaches are responsible for a significant increase in the computational resources required, which is due to the augmentation in size of the state estimation problem or to the optimization process, which usually needs the calculation of gradients of a cost function.

The investigation of physically constrained sequential state estimation is here performed using an advanced estimator strategy, which combines an EnKF approach and a multigrid method. For this reason, we refer to our algorithm as the Multigrid Ensemble Kalman Filter (MEnKF). Multigrid methods [26, 2] are a family of tools which employ multi-level techniques to obtain the time advancement of the flow. In particular, the geometric multigrid [27] uses different levels of resolution in the computational grid to obtain the final state. The method proposed here exploits algorithmic features of iterative solvers used in practical CFD applications. The EnKF error covariance matrix reconstruction is performed using information from a number of ensemble members which are generated over a coarse mesh level of a multigrid approach. This procedure is reminiscent of reduced-order / multilevel applications of the EnKF strategy reported in the literature [28, 29, 30, 31]. However, the state estimation obtained at the coarse level is used to obtain a single solution calculated on a high-resolution mesh, similarly to the work by Debreu et al. [32] for variational DA. Because of the algorithmic structure of the problem, all of the simulations on the fine and coarse levels can be run simultaneously in parallel calculations, providing a tool able to perform in-streaming DA for unsteady flow problems. The strategy is tested on different configurations, using several physical models represented by the Burgers' equation and the compressible Euler and Navier-Stokes equations for one-dimensional and two-dimensional test cases.

The article is structured as follows. In Sec.
2, the sequential DA procedure is detailed, including descriptions of the classical KF and EnKF methods. The numerical discretization and the multigrid strategy are also presented. In Sec. 3, the algorithm called Multigrid Ensemble Kalman Filter (MEnKF) is discussed. In Sec. 4, the MEnKF is used to investigate a one-dimensional case using the Burgers' equation. In Sec. 5, a second one-dimensional case is investigated, but in this case the dynamical model is represented by the Euler equations. In Sec. 6, we investigate the case of the two-dimensional compressible Navier-Stokes equations, namely the spatially evolving mixing layer. Finally, in Sec. 7 concluding remarks are drawn.
2. Sequential data assimilation in fluid dynamics
In Sec. 2.1, we introduce sequential data assimilation methods starting with the Kalman filter. The objective is to gradually arrive at the Dual Ensemble Kalman Filter methodology (Dual EnKF), which is an essential ingredient of the estimator proposed in this work. In Sec. 2.2, the steps that are necessary to transform a general transport equation into a discretized model usable in sequential DA are described. A brief description of the multigrid approach employed is also provided.

2.1. Sequential data assimilation

2.1.1. Kalman filter
The role of the Kalman filter (KF) is to provide an estimate of the state of a physical system at time k ($x_k$), given the initial estimate $x_0$, a set of measurements or observations, and the information of a dynamical model (e.g., first-principle equations):

$$x_k = \mathcal{M}_{k:k-1}(x_{k-1}, \theta_k) + \eta_k \qquad (1)$$

where $\mathcal{M}_{k:k-1}$ is a non-linear function acting as state-transition model and $\theta_k$ contains the parameters that affect the state transition. The term $\eta_k$ is associated with uncertainties in the model prediction which, as discussed before, could emerge for example from incomplete knowledge of initial / boundary conditions. In the framework of KF applications, these uncertainties are usually modelled as a zero-mean Gaussian distribution characterized by a covariance $Q_k$, i.e. $\eta_k \sim \mathcal{N}(0, Q_k)$.

Indirect observations of $x_k$ are available in the components of the observation vector $y^o_k$. These two variables are related by:

$$y^o_k = \mathcal{H}_k(x_k) + \epsilon^o_k \qquad (2)$$

where $\mathcal{H}_k$ is the non-linear observation operator which maps the model state space to the observed space. The available measurements are also affected by uncertainties, which are assumed to follow a zero-mean Gaussian distribution characterized by a covariance $R_k$, i.e. $\epsilon^o_k \sim \mathcal{N}(0, R_k)$.

The model and observation errors being Gaussian, we can show that the solution is characterized entirely by the first two moments of the state. Following the notation generally used in the DA literature, the forecast / analysis states and error covariances are indicated as $x^{f/a}_k$ and $P^{f/a}_k$, respectively. The error covariance matrix is defined as $P^{f/a}_k = E\big[\big(x^{f/a}_k - E(x^{f/a}_k)\big)\big(x^{f/a}_k - E(x^{f/a}_k)\big)^\top\big]$. In the case of a linear dynamical model ($\mathcal{M}_{k:k-1} \equiv M_{k:k-1}$) and a linear observation model ($\mathcal{H}_k \equiv H_k$), the estimated state is obtained via the following recursive procedure:

1. A predictor (forecast) phase, where the analysed state of the system at the previous time step is used to obtain an a priori estimation of the state at the current instant. This prediction, which is obtained relying on the model only, is not conditioned by the observation at time k:

$$x^f_k = M_{k:k-1}\, x^a_{k-1} \qquad (3)$$
$$P^f_k = M_{k:k-1}\, P^a_{k-1}\, M^\top_{k:k-1} + Q_k \qquad (4)$$

2. An update (analysis) step, where the state estimation is updated accounting for the observation at time k:

$$K_k = P^f_k H^\top_k \big(H_k P^f_k H^\top_k + R_k\big)^{-1} \qquad (5)$$
$$x^a_k = x^f_k + K_k \big(y^o_k - H_k x^f_k\big) \qquad (6)$$
$$P^a_k = (I - K_k H_k)\, P^f_k \qquad (7)$$

The optimal prediction of the state ($x^a_k$) is obtained by adding to the predictor estimation ($x^f_k$) a correction term determined via the so-called Kalman gain $K_k$. The classical KF algorithm is not suited for direct application to the analysis of complex flows. First of all, the classical KF formulation is developed for linear systems. Applications to non-linear systems can be performed using more advanced techniques such as the extended Kalman filter [14] or exploiting features of the numerical algorithms used for numerical discretization [17]. The canonical Kalman filter is difficult to implement with realistic engineering models because of i) non-linearities in the models and observation operators, ii) poorly known error statistics and iii) a prohibitive computational cost. Considering point ii), the matrices $Q_k$ and $R_k$ are usually unknown and their behaviour must be modelled. One simple, classical simplification is to consider that the errors for each component are completely uncorrelated in space and from the other components, i.e. $Q_k$ and $R_k$ are considered to be diagonal [33, 34]. Point iii) is easily explained by the fact that in engineering applications, the dimension of the physical state $x_k$ is huge.
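The predict/update cycle (3)-(7) can be written in a few lines. The following NumPy sketch applies it to a toy constant-velocity system of our own choosing; the matrices M, H, Q, R below are illustrative and are not taken from the paper.

```python
import numpy as np

def kf_predict(x_a, P_a, M, Q):
    """Forecast step, Eqs. (3)-(4)."""
    x_f = M @ x_a
    P_f = M @ P_a @ M.T + Q
    return x_f, P_f

def kf_update(x_f, P_f, y_o, H, R):
    """Analysis step, Eqs. (5)-(7)."""
    S = H @ P_f @ H.T + R                     # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)          # Kalman gain, Eq. (5)
    x_a = x_f + K @ (y_o - H @ x_f)           # state update, Eq. (6)
    P_a = (np.eye(len(x_f)) - K @ H) @ P_f    # covariance update, Eq. (7)
    return x_a, P_a

# Toy example (ours): constant-velocity model with the position observed.
dt = 0.1
M = np.array([[1.0, dt], [0.0, 1.0]])         # linear state-transition model
H = np.array([[1.0, 0.0]])                    # linear observation operator
Q, R = 1e-4 * np.eye(2), np.array([[1e-2]])

x_a, P_a = np.zeros(2), np.eye(2)
x_f, P_f = kf_predict(x_a, P_a, M, Q)
x_a, P_a = kf_update(x_f, P_f, np.array([0.05]), H, R)
```

Note how the analysis contracts the forecast covariance: the trace of $P^a_k$ is always smaller than that of $P^f_k$ whenever an observation is assimilated.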
By examining the equations of the canonical Kalman filter, it is immediate to see that the procedure relies on the transport of a very large error covariance matrix $P_k$. It is therefore necessary not only to store it, but also to invert very large matrices (see (5)). To bypass this computational cost, one popular strategy is to rely on a statistical description obtained via an ensemble of trajectories of the model dynamics.

2.1.2. Ensemble Kalman filter

The Ensemble Kalman Filter (EnKF) [6], introduced by Evensen in 1994 [35], relies on the estimation of $P_k$ by means of an ensemble. More precisely, the error covariance matrix is approximated using a finite ensemble of model states of size $N_e$. If the ensemble members are generated using stochastic Monte Carlo sampling, the error in the approximation decreases at a rate of $1/\sqrt{N_e}$. In the following, we describe the stochastic flavour of the EnKF, for which random sampling noise is introduced in the analysis [5].

Given an ensemble of forecast/analysed states at a certain instant k, the ensemble matrix is defined as:

$$E^{f/a}_k = \big[x^{f/a,(1)}_k, \cdots, x^{f/a,(N_e)}_k\big] \in \mathbb{R}^{N_x \times N_e} \qquad (8)$$

To reduce the numerical cost of implementation, the normalized ensemble anomaly matrix is then specified as:

$$X^{f/a}_k = \frac{1}{\sqrt{N_e - 1}}\big[x^{f/a,(1)}_k - \overline{x}^{f/a}_k, \cdots, x^{f/a,(N_e)}_k - \overline{x}^{f/a}_k\big] \in \mathbb{R}^{N_x \times N_e}, \qquad (9)$$

where the ensemble mean $\overline{x}^{f/a}_k$ is obtained as:

$$\overline{x}^{f/a}_k = \frac{1}{N_e} \sum_{i=1}^{N_e} x^{f/a,(i)}_k \qquad (10)$$

The error covariance matrix $P^{f/a}_k$ can thus be estimated via the information derived from the ensemble. This estimation, hereafter denoted with the superscript e, can be factorized into:

$$P^{f/a,e}_k = X^{f/a}_k \big(X^{f/a}_k\big)^\top \in \mathbb{R}^{N_x \times N_x} \qquad (11)$$

The goal of the EnKF is to mimic the BLUE (Best Linear Unbiased Estimator) analysis of the Kalman filter. For this, Burgers et al.
[36] showed that the observation must be considered as a random variable with an average corresponding to the observed value and a covariance $R_k$ (the so-called data randomization trick). Therefore, given the discrete observation vector $y^o_k \in \mathbb{R}^{N_y}$ at an instant k, the ensemble of perturbed observations is defined as:

$$y^{o,(i)}_k = y^o_k + \epsilon^{o,(i)}_k, \quad \text{with } i = 1, \cdots, N_e \text{ and } \epsilon^{o,(i)}_k \sim \mathcal{N}(0, R_k). \qquad (12)$$

In a similar way to what was done for the ensemble matrices of the forecast/analysed states, we define the normalized anomaly matrix of the observation errors as

$$E^o_k = \frac{1}{\sqrt{N_e - 1}}\big[\epsilon^{o,(1)}_k - \overline{\epsilon}^o_k, \; \epsilon^{o,(2)}_k - \overline{\epsilon}^o_k, \cdots, \epsilon^{o,(N_e)}_k - \overline{\epsilon}^o_k\big] \in \mathbb{R}^{N_y \times N_e} \qquad (13)$$

where $\overline{\epsilon}^o_k = \frac{1}{N_e} \sum_{i=1}^{N_e} \epsilon^{o,(i)}_k$. The covariance matrix of the measurement error can then be estimated as

$$R^e_k = E^o_k (E^o_k)^\top \in \mathbb{R}^{N_y \times N_y}. \qquad (14)$$

By combining the previous results, we obtain (see [5]) the standard stochastic EnKF algorithm. The corresponding analysis step consists of updates performed on each of the ensemble members, as given by

$$x^{a,(i)}_k = x^{f,(i)}_k + K^e_k \Big(y^{o,(i)}_k - \mathcal{H}_k\big(x^{f,(i)}_k\big)\Big) \qquad (15)$$

The expression of the Kalman gain is

$$K^e_k = X^f_k \big(Y^f_k\big)^\top \Big(Y^f_k \big(Y^f_k\big)^\top + E^o_k (E^o_k)^\top\Big)^{-1} \qquad (16)$$

where $Y^f_k = H_k X^f_k$. Interestingly, it can be shown that the computation of the tangent linear operator $H_k$ can be avoided. The details can be found in [5, Sec. 6.3.3]. A version of the Ensemble Kalman filter algorithm using the previously defined anomaly matrices is given in Appendix A.2. This is the version we use in our applications.

State-of-the-art approaches based on the EnKF are arguably the most advanced forms of state estimation available in the field of DA methods.
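Equations (9) and (12)-(16) map directly onto array operations. Below is a minimal NumPy sketch of the stochastic analysis step, assuming a linear observation operator H; the synthetic truth/observation set-up and all variable names are our own illustration, not the paper's implementation.

```python
import numpy as np

def enkf_analysis(E_f, y_o, H, R, rng):
    """Stochastic EnKF analysis, Eqs. (9) and (12)-(16), for a linear H."""
    N_x, N_e = E_f.shape
    # Normalized forecast anomalies, Eq. (9).
    X_f = (E_f - E_f.mean(axis=1, keepdims=True)) / np.sqrt(N_e - 1)
    # Perturbed observations, Eq. (12).
    eps = rng.multivariate_normal(np.zeros(len(y_o)), R, size=N_e).T
    Y_pert = y_o[:, None] + eps
    # Normalized anomalies of the observation errors, Eq. (13).
    E_o = (eps - eps.mean(axis=1, keepdims=True)) / np.sqrt(N_e - 1)
    # Ensemble Kalman gain, Eq. (16), with Y^f = H X^f.
    Y_f = H @ X_f
    K = X_f @ Y_f.T @ np.linalg.inv(Y_f @ Y_f.T + E_o @ E_o.T)
    # Member-wise analysis update, Eq. (15).
    return E_f + K @ (Y_pert - H @ E_f)

# Toy set-up (ours): 4 state components, the first two observed.
rng = np.random.default_rng(1)
N_x, N_e = 4, 50
H = np.eye(2, N_x)
R = 0.01 * np.eye(2)
truth = np.ones(N_x)
E_f = truth[:, None] + rng.normal(scale=0.5, size=(N_x, N_e))
E_a = enkf_analysis(E_f, H @ truth, H, R, rng)
```

The anomaly factorization of Eq. (11), $X^f_k (X^f_k)^\top$, coincides with the unbiased sample covariance of the ensemble members, which is what makes this formulation cheap: the $N_x \times N_x$ covariance is never formed explicitly.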
These techniques have been extensively applied in the last decade in meteorology and geoscience [5]. Applications in mechanics and engineering are much more recent, despite a rapid increase in the number of applications in the literature. Among those, studies dealing with wildfire propagation [22], combustion [37], turbulence modeling [21] and hybrid variational-EnKF methods [10] have been reported. These applications reinforce the idea that approaches based on the EnKF have a high investigative potential despite the highly non-linear, multiscale features of the flows studied by the fluid mechanics community.

2.1.3. Dual Ensemble Kalman filter

In this section, we extend the classical EnKF framework presented in Sec. 2.1.2 by considering the case of a parameterized model such as (1). The objective is to enable the model to generate accurate forecasts. For this, we need to determine good estimates of both the model state variables $x_k$ and the parameters $\theta_k$, given erroneous observations $y^o_k$. One approach is provided by joint estimation, where state and parameter vectors are concatenated into a single joint state vector (state augmentation). Following [38], the drawback of such a strategy is that, by increasing the number of unknown model states and parameters, the number of degrees of freedom in the system increases, which makes the estimation unstable and intractable, especially for non-linear dynamical models. For this reason, we follow the procedure developed by [38], called dual estimation.
The principle is to apply successively two interacting filters, one for the estimation of the parameters from a guessed state solution, the other for the update of the state variables from the previously estimated parameters.

In the first step of the algorithm, the ensemble of the analysed parameters is updated following the classical KF equation:

$$\theta^{a,(i)}_k = \theta^{f,(i)}_k + K^{\theta,e}_k \big(y^{o,(i)}_k - y^{f,(i)}_k\big) \quad \text{with } i = 1, \cdots, N_e \qquad (17)$$

where $y^{f,(i)}_k = \mathcal{H}_k\big(x^{f,(i)}_k\big)$. The Kalman gain responsible for correcting the parameter trajectories in the ensemble is obtained as follows:

$$K^{\theta,e}_k = \Theta^f_k \big(Y^f_k\big)^\top \Big(Y^f_k \big(Y^f_k\big)^\top + E^o_k (E^o_k)^\top\Big)^{-1}, \qquad (18)$$

where the variable $\Theta^f_k$ plays the same role for the parameters as the variable $X^f_k$ defined in (9) for the states. We then have:

$$\Theta^{f/a}_k = \frac{1}{\sqrt{N_e - 1}}\big[\theta^{f/a,(1)}_k - \overline{\theta}^{f/a}_k, \cdots, \theta^{f/a,(N_e)}_k - \overline{\theta}^{f/a}_k\big] \in \mathbb{R}^{N_\theta \times N_e} \qquad (19)$$

with

$$\overline{\theta}^{f/a}_k = \frac{1}{N_e} \sum_{i=1}^{N_e} \theta^{f/a,(i)}_k \qquad (20)$$

Knowing a better approximation of the model's parameters, we can then update the state by the EnKF (see Sec. 2.1.2). The Dual Ensemble Kalman Filter thus performs a recursive parametric inference / state estimation using the information from the ensemble members. The algorithm that we use is given in Appendix A.3.

2.2. From transport equation to multigrid resolution

The general expression for a conservation equation in local formulation over a continuous physical domain reads as:

$$\frac{Dx}{Dt} = \frac{1}{\rho} \nabla \cdot \sigma + f \qquad (21)$$

where $D/Dt$ is the total (or material) derivative of the physical quantity of investigation x and $\rho$ is the flow density. The divergence operator is indicated as $\nabla\cdot$ while $\sigma$ is the stress tensor.
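The parameter analysis (17)-(19) can be sketched as follows, again assuming a linear observation operator; the function signature and the scalar toy problem used to exercise it are our own illustration.

```python
import numpy as np

def dual_enkf_param_update(Theta_f, Y_members, Y_pert, E_o):
    """Parameter analysis step, Eqs. (17)-(19).

    Theta_f   : (N_theta, N_e) forecast parameter ensemble
    Y_members : (N_y, N_e) predicted observations H_k(x_k^{f,(i)})
    Y_pert    : (N_y, N_e) perturbed observations y_k^{o,(i)}
    E_o       : (N_y, N_e) normalized observation-error anomalies
    """
    N_e = Theta_f.shape[1]
    # Normalized anomalies Theta^f_k, Eq. (19), and Y^f_k.
    Theta = (Theta_f - Theta_f.mean(1, keepdims=True)) / np.sqrt(N_e - 1)
    Y = (Y_members - Y_members.mean(1, keepdims=True)) / np.sqrt(N_e - 1)
    # Parameter Kalman gain, Eq. (18).
    K_theta = Theta @ Y.T @ np.linalg.inv(Y @ Y.T + E_o @ E_o.T)
    # Member-wise parameter correction, Eq. (17).
    return Theta_f + K_theta @ (Y_pert - Y_members)
```

After this parameter update, each member's state is corrected with the standard EnKF analysis of Sec. 2.1.2, which closes one dual-estimation cycle.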
Finally, f represents the effects of volume forces. The evolution of the flow is obtained via time advancement of the discretized solution, which is performed in a physical domain where initial and boundary conditions are provided. A general expression of the discretized form of (21) for the time advancement from step k-1 to step k is given by:

$$x_k = \Phi_k x_{k-1} + B_k b_k \qquad (22)$$

where $\Phi_k$ is the state-transition model which includes the discretized information of (21). In the case of non-linear dynamics described by (21), the state-of-the-art algorithms used for the discretization process are able to preserve the non-linear information in the product $\Phi_k x_{k-1}$, up to a discretization error which is usually proportional to the size of the time step. The term $b_k$ represents the control vector reflecting, for instance, the effect of the boundary conditions. $B_k$ is the control input model which is applied to the control vector $b_k$. Equation (22) is consistent with an explicit time discretization of (21). It is well known that this class of methods, despite its very high accuracy, may exhibit some unfavorable characteristics for the simulation of complex flows, such as limitations on the time step according to the Courant-Friedrichs-Lewy (CFL) condition [2]. To bypass this limitation, one possible alternative consists in using implicit schemes for time discretization. In this case, the general structure of the discretized problem is usually cast in the form:

$$\Psi_k x_k = \widetilde{\Psi}_k x_{k-1} + \widetilde{B}_k b_k = c_k \qquad (23)$$

where $\Psi_k$, $\widetilde{\Psi}_k$ and $\widetilde{B}_k$ are matrices obtained via the discretization process. Obviously, considering $\Phi_k = \Psi_k^{-1} \widetilde{\Psi}_k$, we retrieve (22). However, this manipulation is in practice not performed, due to the prohibitive costs associated with large-scale matrix inversions at each time step. Instead, an iterative procedure is used until the residual $\delta^n$, determined at the n-th iteration, falls below a pre-selected threshold value $\varepsilon$.
In other words, the procedure is stopped when $\|\delta^n\| = \|\Psi_k x^n_k - c_k\| < \varepsilon$. Among the various iterative methods proposed in the literature, multigrid approaches are extensively used in CFD applications [39, 2]. The solution is found on the computational grid by updating an initial guess via multiple estimations obtained on a hierarchy of discretizations. Two well-known families of multigrid approaches exist, namely the algebraic multigrid method and the geometric multigrid method. With algebraic multigrid methods, a hierarchy of operators is directly constructed from the state-transition model $\Psi$. On the other hand, the geometric multigrid obtains the solution via a set of operations performed on two (or more) meshes. In this paper, we consider the simplified case of two grids. Hereafter, the variables defined on the fine grid are denoted with the superscript F ($x^F$ for instance), and those defined on the coarse grid with the superscript C ($x^C$ for instance). The coarse-level representation $x^C$ is usually obtained by suppressing multiple mesh elements from the initial fine-level one $x^F$. This operation may be defined by a coarsening-ratio parameter $r_C$, which indicates the total number of elements on the fine grid over the number of elements conserved in the coarse grid. Among the numerous algorithms proposed for geometric multigrid, we use the Full Approximation Scheme (FAS), which is a well-documented strategy [26, 27]. A general formulation for a two-grid algorithm is now provided. The time subscript k is excluded for clarity. The superscript n represents the iteration step of the procedure.

1. Starting from an initial solution on the fine grid $(x^0)^F$ (which is usually equal to x at the previous time step k-1), a first iterative procedure on the fine grid provides an intermediate solution $(x^1)^F$. A residual $(\delta^1)^F = c^F - \Psi^F (x^1)^F$ is calculated.

2. $(x^1)^F$ and $(\delta^1)^F$ are projected from the fine-grid space to the coarse-grid space via a projection operator $\Pi^C$, so that $(x^1)^C$ and $(\delta^1)^C$ are obtained. Similarly, the state-transition model $\Psi^F$ is projected on the coarse grid (that is, re-estimated based on the projection of the fine-grid solution onto the coarse grid) to obtain $\Psi^C$. Finally, we evaluate $c^C = \Psi^C (x^1)^C + (\delta^1)^C$.

3. An iterative procedure is employed to obtain $(x^2)^C$ on the coarse grid, using $(x^1)^C$ as the initial solution.

4. The updated variable on the fine grid is obtained as $(x^2)^F = (x^1)^F + \Pi^F \big((x^2)^C - (x^1)^C\big)$, where $\Pi^F$ is a projection operator from the coarse grid to the fine grid.

5. At last, the final solution $(x^3)^F$ is obtained via a second iterative procedure on the fine grid, starting from the intermediate solution $(x^2)^F$.

This procedure can be repeated multiple times imposing $(x^0)^F = (x^3)^F$ at the beginning of each cycle. When convergence is reached, the fine-grid solution at time instant k is equal to $(x^3)^F$. Performing part of the calculations on a coarse grid level provides two main advantages [2]. First, a significant reduction in the computational resources is obtained, since the calculations performed on the coarse grid are usually much less expensive than a full set of iterations on the fine grid. Second, spurious high-frequency numerical oscillations, due to the discretization error in $x^F$, are naturally filtered via the operators $\Pi^C$ and $\Pi^F$, which globally improves the quality of the numerical solution. In this paper, the two projections ($\Pi^F$ and $\Pi^C$) are defined as 4th-order Lagrange interpolators.
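For a linear problem, the five steps above reduce to the classical two-grid correction scheme. The following self-contained sketch applies them to a 1D Poisson problem; the discretization, the damped-Jacobi smoother and the grid sizes are our own choices for illustration, and we use full-weighting restriction with linear interpolation rather than the paper's 4th-order Lagrange projections.

```python
import numpy as np

def poisson_matrix(n, h):
    """1D operator -u'' on n interior points, homogeneous Dirichlet BCs."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def jacobi(A, x, c, iters, omega=0.8):
    """Damped-Jacobi smoothing iterations."""
    D = np.diag(A)
    for _ in range(iters):
        x = x + omega * (c - A @ x) / D
    return x

def fas_two_grid_cycle(A_f, A_c, c_f, x0, restrict, prolong, nu=5):
    """One FAS two-grid cycle following steps 1-5 of the text."""
    x1 = jacobi(A_f, x0, c_f, nu)            # step 1: pre-smoothing
    d1 = c_f - A_f @ x1                      # step 1: fine-grid residual
    x1_c, d1_c = restrict(x1), restrict(d1)  # step 2: projection Pi^C
    c_c = A_c @ x1_c + d1_c                  # step 2: coarse right-hand side
    x2_c = np.linalg.solve(A_c, c_c)         # step 3: coarse-grid solve
    x2 = x1 + prolong(x2_c - x1_c)           # step 4: correction via Pi^F
    return jacobi(A_f, x2, c_f, nu)          # step 5: post-smoothing

n_f = 63                                     # fine-grid interior points
h_f = 1.0 / (n_f + 1)
n_c = (n_f - 1) // 2                         # coarsening ratio r_C = 2
A_f, A_c = poisson_matrix(n_f, h_f), poisson_matrix(n_c, 2 * h_f)

restrict = lambda v: 0.25 * v[:-2:2] + 0.5 * v[1::2] + 0.25 * v[2::2]
xs_f = np.arange(1, n_f + 1) * h_f
xs_c = np.arange(0, n_c + 2) * 2 * h_f
prolong = lambda v: np.interp(xs_f, xs_c, np.concatenate(([0.0], v, [0.0])))

c_f = np.ones(n_f)
x = np.zeros(n_f)
for _ in range(10):
    x = fas_two_grid_cycle(A_f, A_c, c_f, x, restrict, prolong)
```

A handful of cycles drives the residual down by several orders of magnitude, illustrating the first advantage mentioned above: most of the error reduction is obtained from the cheap coarse-grid solve.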
3. Multigrid Ensemble KF method (MEnKF)
Despite the game-changing advantage that the EnKF offers for the analysis of large-scale dynamical systems, the use of a sufficiently large ensemble (usually 60 to 100 members are required for convergence [5]) may still be too prohibitive for advanced applications. In the following, we present a Kalman filter strategy which relies on the use of a coarse mesh for the ensemble Kalman filter step. These computations on the coarse mesh are jointly run with a single high-refinement simulation, which is updated using the coarse-mesh assimilation results. For this reason, the clock time required for the time advancement of the ensemble members and the memory storage of the physical variables are dramatically reduced. The multigrid-ensemble algorithm works through the steps described below. In the following description, the notation $\Psi$ may hold for both $\Psi$ and $\widetilde{\Psi}$ introduced in (23), depending on the choice of the time-integration strategy. For its part, the state $(x^C_k)^*$ represents the projection on the coarse grid of $(x^F_k)^f$, the fine-grid solution. Finally, $(x^C_k)'$ denotes the coarse-grid state obtained by an ensemble Kalman filtering process applied on the coarse grid. In the FAS multigrid method described in Sec. 2.2, $(x^C_k)^*$ and $(x^C_k)'$ correspond to $(x^1)^C$ and $(x^2)^C$, respectively.

1. First iteration on the fine grid. Starting from an initial solution on the fine grid $(x^F_{k-1})^a$, a forecast state $(x^F_k)^f$ is obtained by using $\theta^a_k$ as parameters for the model $\Psi^F$.

2. Projection on the coarse grid. $(x^F_k)^f$ is projected on the coarse-grid space via a projection operator $\Pi^C$, so that $(x^C_k)^*$ is obtained. Similarly, the state matrix $\Psi^F$ is projected on the coarse grid to obtain $\Psi^C$. This critical passage will be discussed in detail in the following.

3. Time advancement of the ensemble members used in the Dual EnKF.
For each member i of the ensemble, the state matrix (Ψ^C)^(i) used for the advancement in time on the coarse grid is determined. The ensemble forecast (x_k^C)^{f,(i)} is corrected with the standard Dual EnKF procedure to obtain (x_k^C)^{a,(i)} as well as the parameters θ_k^{a,(i)}. While the advancement model is unique (the Navier-Stokes equations, for example), the discretized operator contained in Ψ is specific to each member of the ensemble; to distinguish them, the superscript i is introduced in the notation.
4. Determination of the state variables on the coarse grid. If observations are not available, the state (x_k^C)' is obtained using the classical multigrid procedure. On the other hand, if observations are available, the ensemble error covariance matrix (P_k^C)^{f,e} is used to determine the coarse-grid solution (x_k^C)' through a Kalman filter estimation.
5. Final iteration on the fine grid. A first estimation (x_k^F)' of the fine-grid state is determined using the results obtained on the coarse space: (x_k^F)' = (x_k^F)^f + Π_F ((x_k^C)' − (x_k^C)^*). The state (x_k^F)^a is obtained from a final iterative procedure starting from (x_k^F)'.
Figure 1: Schematic representation of the Multigrid Ensemble Kalman Filter (MEnKF). Two different levels of representation (fine and coarse grids) are used to obtain a data-driven fine-grid estimation. The Dual Ensemble Kalman Filter procedure is performed on the coarse grid. The full algorithm is given in Appendix A.
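The five steps above can be sketched in a few lines of NumPy. All operators below (models, projections, observation operator, noise levels) are illustrative placeholders, and two ingredients of the full algorithm are deliberately omitted: the Dual EnKF parameter update and the final iterative smoothing on the fine grid.

```python
import numpy as np

rng = np.random.default_rng(0)
Ne, nF, rC = 20, 64, 4            # ensemble size, fine-grid size, coarsening ratio
nC = nF // rC

# Placeholder operators (assumptions, not the paper's solver):
def psi_F(x, theta):              # fine-grid model, one time step
    return np.roll(x, 1) * (1.0 - 0.01 * theta)

def psi_C(x, theta):              # coarse-grid model
    return np.roll(x, 1) * (1.0 - 0.01 * theta)

def proj_C(xF):                   # Pi_C: injection onto the coarse grid
    return xF[::rC]

def proj_F(xC):                   # Pi_F: linear prolongation (placeholder)
    return np.interp(np.arange(nF), np.arange(0, nF, rC), xC)

H = np.eye(nC)[:8]                # observe the first 8 coarse nodes
R = 0.01 * np.eye(8)

def menkf_cycle(xF, ensC, theta, y):
    xF_f = psi_F(xF, theta)                      # step 1: fine-grid forecast
    xC_star = proj_C(xF_f)                       # step 2: projection
    ensC = np.array([psi_C(m, theta) for m in ensC])   # step 3: ensemble advance
    ensC += 0.01 * rng.standard_normal(ensC.shape)     # (model-noise perturbation)
    Xm = ensC.mean(axis=0)                       # step 4: EnKF analysis
    A = (ensC - Xm).T                            # anomalies, shape (nC, Ne)
    Pf = A @ A.T / (Ne - 1)                      # ensemble covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    ensC = ensC + (K @ (y[:, None] - H @ ensC.T)).T
    xC_prime = ensC.mean(axis=0)
    xF_a = xF_f + proj_F(xC_prime - xC_star)     # step 5 (smoothing omitted)
    return xF_a, ensC

xF = np.sin(2 * np.pi * np.arange(nF) / nF)
ensC = proj_C(xF) + 0.1 * rng.standard_normal((Ne, nC))
y = H @ proj_C(psi_F(xF, 0.2))                   # synthetic observation
xF_a, ensC = menkf_cycle(xF, ensC, 0.2, y)
print(xF_a.shape)  # (64,)
```

The final multigrid smoothing of step 5, which provides the regularization discussed below, would replace the last line of `menkf_cycle` with an iterative solve started from the corrected state.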
An overview of the assimilation cycle described in the previous algorithm is presented in Fig. 1. Two important aspects need to be discussed:
- As previously stated, the determination of Ψ^C and its role in the construction of the matrices (Ψ^C)^(i) is an essential step of the MEnKF strategy. In the non-linear problems of interest in fluid mechanics, the state-transition matrix Ψ includes information about the multi-scale interactions that are specific to every case investigated. The simplest possible choice, which is the one adopted in this work, is to calculate the coefficients of the matrices Ψ^C and (Ψ^C)^(i) separately for each simulated state. Thus, the similarities between the employed state matrices are limited to the use of the same discretization schemes / structure of Ψ. However, one can envision using the non-linear information conserved in Ψ^C, which is supposedly accurate, to improve the accuracy of the prediction of the ensemble members. This aspect is discussed in the perspectives included in Sec. 7.
- The recursive structure of the algorithm allows for the integration of iterative corrections for non-linear systems [40], as well as of hard constraints (see the discussion in the introduction of [23, 24, 25]) to respect the conservativity of the model equations. However, these corrections may result in an increase of the computational resources required. Here, the multigrid algorithm itself is used for regularization of the flow, i.e. for smoothing the discontinuities in the physical variables produced by the update via Kalman filter. In fact, if an intentionally reduced tolerance is imposed in the iterative steps 4 and 5, the final solution will keep memory of the features of the state estimation produced in step 3. However, the iterative resolution will smooth the estimation via the state-transition model Ψ, which will perform a natural regularization of the flow.
Clearly, if a reduced tolerance is imposed, the final solution will not necessarily respect the conservativity constraints of the model equations. However, one can argue that complete conservativity is not an optimal objective in this case, if the model state at the beginning of the time step is not accurate.
The advantages of our strategy with respect to classical approaches based on the EnKF may be summarized in the following points:
- The RAM requirement necessary to store the N_e ensemble members during the assimilation is usually moderate. The reduction in computational costs is driven by N_e and by the size of the coarse variables. To illustrate, let us consider the case of a simple two-level geometric multigrid approach for a 3D test case with a constant coarsening ratio r_C = 4 and an ensemble size N_e = 100. Each ensemble member is then described by 4³ = 64 times fewer mesh elements than the single simulation on the fine grid. If one considers that one main simulation and 100 ensemble members are run simultaneously, and if the RAM requirement is normalized by that of the main simulation, this implies that R_RAM, the non-dimensional RAM requirement, is equal to 1 + 100/64 ≈ 2.56. In other words, the total cost in RAM is increased to just 2.56 times the cost of the simulation without EnKF. For r_C = 8, the normalized RAM requirement is R_RAM = 1 + 100/512 ≈ 1.2. These figures must be compared with a classical single-level EnKF with N_e = 100, since in that case R_RAM = N_e = 100.
- Considering that the ensemble members on the coarse grid and the simulation over the fine grid are run simultaneously, communication times are optimized.
- Owing to the iterative procedures of steps 4 and 5, regularization of the final solution is naturally obtained.
- The algorithm is here described and tested in the framework of geometric multigrid, but it can actually be integrated within other algorithmic structures. For iterative methods, the only essential operations to be performed are the determination of the state-transition model Ψ^C and of the projections Π_C and Π_F. This implies that the method can easily be extended to other popular procedures, such as algebraic multigrid. If the multigrid operations are removed, the method becomes a classical multilevel EnKF method, which relies on just one level of resolution for the ensemble members. However, in this case, no regularization is obtained unless specific corrections are included.
This general algorithm may easily be tailored to the complexity of the test case investigated, in particular to the requirements of the iterative loops on both the coarse-grid level and the fine-grid level. The algorithm that we used to validate our approach is described in Appendix A.4.
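The normalized RAM estimate above is simple arithmetic and can be sketched as follows (the function name is ours, for illustration):

```python
def r_ram(n_ensemble, coarsening_ratio, dim=3):
    """Non-dimensional RAM requirement of the MEnKF: one fine-grid
    simulation plus n_ensemble coarse members, each described by
    coarsening_ratio**dim times fewer elements than the fine grid."""
    return 1.0 + n_ensemble / coarsening_ratio**dim

print(round(r_ram(100, 4), 2))   # 2.56
print(round(r_ram(100, 8), 2))   # 1.2
print(r_ram(100, 1))             # 101.0 -> comparable to a single-level EnKF
```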
4. Application: one-dimensional Burgers’ equation
The MEnKF method introduced in Sec. 3 is now applied to the analysis of different test cases. Several dynamical systems of increasing complexity were chosen in order to highlight different properties of the algorithm. Also, a set of different tests is performed in order to obtain a comprehensive validation of the method. At first, let us consider the 1D Burgers' equation:

∂u/∂t + u ∂u/∂x = (1/Re) ∂²u/∂x²   (24)

where x is the spatial coordinate, u the velocity and Re the Reynolds number. Equation (24) is non-dimensionalized with a reference velocity u₀ and a reference length λ. This equation is solved with a second-order centered finite-difference scheme for the space derivatives and a first-order scheme for the time integration, to obtain the general form of discretized representation given by (23). A time-varying Dirichlet condition is imposed at the inlet:

u(x = 0, t) = u₀ (1 + θ₁ sin(ωt + θ₂))   (25)

where u₀ = 1 and ω = 2π f_c. The characteristic frequency f_c is set to 1 and the characteristic length λ is defined as λ = u₀/f_c. The characteristic time is defined as t_c = λ/u₀. This implies that, over a complete period of oscillation 1/f_c, the solution is advected over a length λ at the velocity u₀. θ₁ and θ₂ represent the amplitude and the phase of the sinusoidal signal, respectively. Note that, from now on, the units are the characteristic magnitudes. The outlet boundary condition is extrapolated from the points nearest to the outlet. The initial condition imposed at t = 0 is u(x, t = 0) = u₀ everywhere in the physical domain (θ₂ = 0 for the reference simulation). The value of the Reynolds number is Re = 200. The time-advancement step Δt is constant and the mesh spacing is Δx = 0.0125 λ, i.e. each characteristic length λ is discretized using 80 mesh elements. This also implies that the total number of nodes employed to perform the calculation is N_x = 800. A reference simulation is run on the fine grid with the values θ₁ = 0.2 for the amplitude and θ₂ = 0 for the phase. Thereafter, the solution obtained by this reference simulation is called the true state, or truth.
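As a minimal illustration of the discretization just described, the following sketch advances Eq. (24) with second-order centred differences in space and first-order explicit (Euler) time integration; the time-step value and the explicit treatment are assumptions for illustration, not necessarily the paper's solver settings.

```python
import numpy as np

Re, u0, fc = 200.0, 1.0, 1.0
lam = u0 / fc                     # characteristic length
Lx, nper = 10.0, 80               # domain size, elements per lambda
nx = int(Lx / lam * nper)         # 800 nodes
dx = lam / nper
dt = 1e-3                         # assumed time step

def step(u, t, theta1=0.2, theta2=0.0):
    un = u.copy()
    # interior nodes: u_t + u u_x = (1/Re) u_xx, centred in space
    adv = un[1:-1] * (un[2:] - un[:-2]) / (2 * dx)
    dif = (un[2:] - 2 * un[1:-1] + un[:-2]) / dx**2 / Re
    u[1:-1] = un[1:-1] + dt * (dif - adv)
    # time-varying Dirichlet inlet, Eq. (25)
    u[0] = u0 * (1.0 + theta1 * np.sin(2 * np.pi * fc * t + theta2))
    # outlet extrapolated from the nearest interior point
    u[-1] = u[-2]
    return u

u, t = np.full(nx, u0), 0.0
for _ in range(1000):             # advance one characteristic time t_c
    t += dt
    u = step(u, t)
print(u.shape, np.isfinite(u).all())
```

With these values the explicit scheme is comfortably within its stability limits (advective CFL ≈ 0.1, diffusive number ≈ 0.03).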
A flow visualization at t = 10 is shown in Fig. 2. For the investigated value of the Reynolds number, non-linear effects and viscous mechanisms can both be clearly identified.

Figure 2: Solution of the 1D Burgers' equation at t = 10 for θ₁ = 0.2 and θ₂ = 0 (true state).

Data assimilation is performed in the following conditions:
- The observations are sampled every 30 time steps of the reference simulation on the space domain [0, 1] (80 sensors) and on the time window [10, 29]. They are artificially perturbed with a Gaussian noise of variance R.
- The model is chosen to be the discretized version of (24). The numerical test consists of one main simulation, which is run on the fine grid previously introduced, and an ensemble of N_e = 100 coarse simulations used for assimilation purposes. The coefficients θ₁ and θ₂ are initially assumed to be described by Gaussian distributions, so that θ₁ ∼ N(0, Q_θ1) and θ₂ ∼ N(0.3, Q_θ2). The initial value of the covariance is chosen to be the same for the two parameters, Q_θ1(t = 0) = Q_θ2(t = 0), with initial mean values θ₁ = 0 and θ₂ = 0.3. Random values for the parameters are imposed at the inlet for each ensemble member on the coarse-grid level. The initial mean values of the parameters are significantly different from the values prescribed in the reference simulation, which are θ₁ = 0.2 and θ₂ = 0. This choice allows us to analyze the rate of convergence of the optimization procedure. Also, the initial condition u(x, 0) = u₀ is imposed for every simulation.
The data estimation is run for a time window of T_DA = 19 characteristic times, which encompasses roughly 3000 DA analysis phases. The sensitivity of the parametric-inference procedure to the resolution of the coarse simulations is investigated considering several coarsening ratios r_C = 1, 2, 4, 8, 16. The fine grid is unchanged, so that the MEnKF is performed using information from progressively coarser grids as r_C increases. The interest of this test is to analyse the loss of accuracy of the estimator as r_C increases and to ascertain the potential for an efficient trade-off between accuracy and the computational resources required by the estimation process.
The time evolution of the estimation of θ₁ is shown in Fig. 3. Very rapid convergence (less than 2 t_c) is observed for r_C ≤
4. In addition, the parameter estimation is extremely precise (a discrepancy lower than 0.01% for r_C = 1 and lower than 2% for r_C = 4). For higher values of the parameter r_C, the estimation of θ₁ becomes progressively more degraded. For the case r_C = 8, θ₁ is initially overestimated and it finally converges towards a value slightly different from the truth. Larger discrepancies in θ₁ are observed for r_C = 16. In this case, the optimized amplitude parameter is θ₁ = 0.27, which is 35% larger than the true value.

Figure 3: Values of the parameter θ₁ for different coarsening ratios r_C = 1, 2, 4, 8, 16. In the zoomed region, the shaded area represents the 95% credible interval for the shown cases.

Similar considerations can be drawn from the analysis of the optimization of the parameter θ₂, which is shown in Fig. 4. For r_C = 1, 2,
4, we obtain an accurate prediction of the parameter, while a loss of accuracy is observed for the cases r_C = 8, 16. This observation can be justified by considering the number of mesh elements representing one characteristic length λ for these cases, which is 10 and 5 for r_C = 8 and 16, respectively. If one considers that the frequency ω is set so that one complete oscillatory cycle is performed on average over a characteristic length λ, this means that the average phase angle between mesh elements is equal to 0.63 radians for r_C = 8 and 1.26 radians for r_C = 16. Thus, the values observed in the optimization of θ₂ for these two cases, which are around θ₂ ≈ 0.15, are significantly lower than the uncertainty due to the coarse-level resolution.

Figure 4: Values of the parameter θ₂ for different coarsening ratios r_C = 1, 2, 4, 8,
16. In thezoomed region, the shaded area represents the 95% credible interval for the shown cases. r C = 1 are shown in Fig. 5. This case,which is performed using the same grid for the coarse and fine mesh level, isequivalent to a standard Dual EnKF. However, owing to the final multigriditerative loop, the final solution is naturally regularized. The results, which areshown for t = 1 , . , .
60, show that the estimator successfully representsthe behaviour of the dynamical system. A full domain advective time ( i.e. r C . Results for r C = 8are shown in Fig. 6. Minor differences between the state estimation and thetrue state can be observed in this case. This discrepancy is due to the lack ofresolution of the ensemble members. In fact, the resolution in this case is of 10mesh elements per characteristic length. This number of points is arguably notenough to provide an accurate representation of the sinusoidal waves which areimposed at the inlet. However, one can see that no spurious numerical effectsare observed as the estimator provides a smooth, continuous prediction of thevelocity. The discrepancy between the true state and the state estimation ismainly associated with an erroneous calculation of the Kalman gain due to theunder-resolution of the ensemble, which also affects the parameter estimation.A combined analysis of Fig. 3 and 6 shows that, due to the lack of accuracyin the estimation of θ , the variable u is over-predicted for t < t c while it isslightly under-predicted for t > t c .At last, the case for r C = 16 is shown in Fig. 7. In this case, the meshelements are only 5 times smaller than the characteristic length λ . Despite theimportant under-resolution of the ensemble members, which severely affects theestimation of θ and θ , the state estimation still adequately represents the mainfeatures of the dynamical system. 24 a) t = 1(b) t = 3 . t = 10 . Figure 5: Estimations obtained by MEnKF for r C = 1 at t = 1 (a), t = 3 .
88 (b) and t = 10 . t c units. The grey shaded area corresponds to the observation window. a) t = 1(b) t = 3 . t = 10 . Figure 6: Estimations obtained by MEnKF for r C = 8 at t = 1 (a), t = 3 .
88 (b) and t = 10 . t c units. The grey shaded area corresponds to the observation window. a) t = 1(b) t = 3 . t = 10 . Figure 7: Estimations obtained by MEnKF for r C = 16 at t = 1 (a), t = 3 .
88 (b) and t = 10.60 (c). Times are given in t_c units. The grey shaded area corresponds to the observation window.

The accuracy of the state estimation is now quantified via the relative Root Mean Square Error (RMSE), i.e.

RMSE(k) = sqrt( ∫_x [ (u_k^F)^a(x) − (u_k^F)^True(x) ]² dx / ∫_x [ (u_k^F)^True(x) ]² dx )   (26)

The results are shown in Fig. 8 for different values of the coarsening ratio r_C. One can see that the error decreases slowly for t < 10 t_c. This threshold time corresponds to a complete advective cycle in the physical domain. After this transient, the error may rapidly decrease before reaching a quasi-asymptotic behaviour.

Figure 8: Time evolution of the RMS error of u for r_C = 1, 2, 4, 8, 16.

One can also see that, once convergence is reached, the asymptotic RMSE value decreases with lower r_C values, as expected. However, lower r_C values are also associated with larger computational costs, so that a trade-off between accuracy and required resources must be found. This aspect is further investigated considering the computational resources required to perform a full assimilation window for each r_C value. In Fig. 9, the results are shown normalized over the case r_C = 1. One can see that the computational resources required rapidly decrease with increasing r_C, even for this simple one-dimensional test case. For the large values r_C = 8, 16, one can see that the computational resources reach a plateau. Here, the computational time to perform the DA procedures, which is the same for every r_C, is of a similar order of magnitude to the calculation for the time advancement of the ensemble members.

Figure 9: Computational time required to perform a full assimilation cycle for r_C = 1, 2, 4, 8,
16. Results are normalized over the computational time required for r_C = 1.

In summary, the present analysis assesses the performance of the MEnKF tool for varying mesh resolution of the ensemble members. As expected, the accuracy of the state and parameter estimations diminishes for increasing r_C, but so does the computational cost. In addition, it was observed that the accuracy drops significantly when the mesh resolution is not able to provide a suitable description of the main scales characterizing the flow. Such a significant discrepancy for a relatively simple test case stresses that a minimal resolution threshold must be achieved in order to capture the essential physical features and to obtain a successful state estimation. In conclusion, the results of the parametric optimization may become unreliable with extreme under-resolution of the ensemble members. The dispersive and diffusive effects of the coarse grid on the state prediction deserve an in-depth analysis, which is out of the scope of this work. Here, the discretization schemes used are the same for the fine and the coarse grids. However, one could argue that choosing schemes adapted to progressively coarser grids could improve the performance of the estimator.

5. Acoustic propagation of a sinusoidal wave

The MEnKF strategy is now applied to a more complex physical system, namely the inviscid one-dimensional Euler equations:

∂ρ/∂t + ∂(ρu)/∂x = 0   (27)
∂(ρu)/∂t + ∂((ρu)u)/∂x + ∂p/∂x = 0   (28)
∂(ρE)/∂t + ∂((ρE)u)/∂x + ∂(pu)/∂x = 0   (29)

where ρ is the density, u the velocity, p the pressure and E the total energy per unit mass. In this case, viscous effects are absent, but acoustic propagation affects the evolution of the flow. The equations are discretized using the finite-difference method. Second-order centered schemes are used for the space derivatives and a first-order scheme for the time integration, to obtain the general form of discretized representation given by (23).
A centered sixth-order numerical filter is included to damp spurious numerical oscillations [41]. The numerical algorithm is used to analyse the acoustic propagation of a sinusoidal wave with a time-varying amplitude. To do so, a time-varying Dirichlet condition is imposed at the inlet for the velocity:

u(x = 0, t) = u₀ (1 + θ(t) sin(ωt))   (30)

u₀ is set in order to impose an inlet Mach number M = u₀/a = 0.4, where a is the speed of sound. The amplitude of the variation in θ is set at a sufficiently low value to allow a flow evolution mainly driven by acoustic phenomena. The inlet velocity perturbation creates an acoustic wave that is transported along the domain with a speed of u₀ + a. The characteristic velocity and length of the test case are u_c = u₀ + a and λ. Similarly to the analysis in Sec. 4, λ represents the wavelength of the signal imposed at the inlet. The characteristic time of the system is defined as t_c = λ/u_c.
The sinusoidal behaviour of the velocity at the inlet is characterized by a constant frequency f_c = 1/t_c, with ω = 2π f_c. However, the time-varying amplitude of the sinusoidal wave is driven by the parameter θ(t) = θ₀ (1 + sin(ω_θ t)), where θ₀, ω_θ = 2π f_c / b and b are constants. The density at the inlet is fixed, ρ(x = 0, t) = ρ₀, as well as the total energy per unit mass, E(x = 0, t) = E₀, which is calculated as E₀ = e₀ + 0.5 u₀². The outlet boundary condition is extrapolated from the points nearest to the outlet. The initial condition imposed at t = 0 is u(x, t = 0) = u₀, ρ(x, t = 0) = ρ₀ and E(x, t = 0) = E₀ everywhere in the physical domain. The fluid is an ideal gas with γ = 1.4, ρ₀ = 1.17 and T₀ = 300 in S.I. units. Note that, from now on, the units of measure are the characteristic magnitudes.
The computational domain has been set to a size of L_x = 10. A uniform mesh distribution is used for every calculation. Similarly to the analysis in Sec. 4, 80 mesh elements are used to discretize a characteristic length λ, for a total of N_x = 800 elements in the domain. Finally, the normalized time step Δt is set to a constant value, with θ₀ = 0.015 (true value) and b = 10.
A flow visualization of the wave patterns is shown in Fig. 10 at t = 17.3. The fully developed state obtained for t = 10 is used to initialize a new simulation from t = 0. This simulation is run for a total time of T_ref = 110. As in Sec. 4, this simulation is used as the true state and it is sampled every 30 time steps to obtain observations in a space region close to the inlet, which are artificially perturbed with a Gaussian noise of variance R. The estimation relies on N_e = 100 simulations on a coarse-grid level. In this section, only one coarsening ratio is considered (r_C = 4). The estimator is used to dynamically track the value of the parameter θ, which evolves in time. No a priori knowledge about the behaviour of the parameter is used. This time optimization is much

Figure 10: Field ρu of the one-dimensional inviscid Euler equations at t = 17.3 for θ₀ = 0.015 and ω_θ = ω/
10 (true state).

more challenging when compared with the inference of a constant parameter, as done in Sec. 4. A similar analysis using a classical Kalman smoother was recently proposed by Mons et al. [20].
For each coarse-grid simulation of the estimator, θ is initially assumed to be a random Gaussian phenomenon, θ ∼ N(0, Q_θ). The initial value of the covariance Q_θ(t = 0) is prescribed to a small value. As in the previous case, the value initially imposed at the inlet of the fine-grid simulation is θ = 0, while random values are selected for each ensemble member on the coarse-grid level, following the normal distribution introduced above. The variance of the parameter θ for the ensemble members is artificially increased, as in the classical Dual EnKF algorithm. As described in Appendix A.4, we add to the estimated parameter of each member of the ensemble a Gaussian noise of diagonal covariance Σ_θk. Extensive numerical tests have been performed, and the results show the importance of Σ_θk. The value chosen for Σ_θk is arbitrary and works only for this test case. In general, the estimation of Σ_θk is challenging, since a priori information about the dynamic behaviour of θ is required. Nonetheless, Σ_θk can be determined heuristically: the artificial variance added to the parameters should be of the same order of magnitude as the true rate of change of θ over a single assimilation cycle. When Σ_θk is underpredicted, the variance in θ is too small for the Dual EnKF to perform well, while an oscillatory behaviour appears in the parameter estimation if Σ_θk is overpredicted.
This study is performed for three different values of f_a, the normalized analysis frequency, defined as f_a = t_c/t_a, where t_a is the period between successive assimilations. This quantity corresponds to the number of analysis phases per characteristic time of simulation t_c.
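The parameter-inflation heuristic just described can be made concrete with a toy sketch; the variance value and the ensemble initialization below are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb_parameters(theta_ens, sigma_theta):
    """Dual EnKF parameter-inflation step (sketch): before each analysis,
    the estimated parameter of every ensemble member receives a Gaussian
    random-walk increment of variance sigma_theta, so that the ensemble
    spread can track a time-varying parameter."""
    return theta_ens + rng.normal(0.0, np.sqrt(sigma_theta),
                                  size=theta_ens.shape)

theta = np.full(100, 0.015)            # Ne = 100 members, illustrative value
theta = perturb_parameters(theta, 1e-6)
print(theta.std() > 0)                 # the ensemble has regained spread
```

The heuristic in the text then amounts to choosing `sigma_theta` of the same order as the expected change of θ over one assimilation cycle.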
Three different values of f_a are investigated: f_a = 2, 10 and 55. The estimation is run over the whole time window T_ref = 110, which encompasses 220 to 6000 DA analysis phases, depending on the value of f_a. At the end of each analysis, the mean value and the variance of the amplitude θ are updated following the Dual EnKF technique [38], similarly to what was done in Sec. 4.
The results for the estimation of the time-varying parameter θ are reported in Fig. 11. The time evolution of θ is correctly estimated for the three values of f_a. This is an important result, considering that no a priori information was provided for the evolution of this parameter. A more detailed analysis reveals a lag in the parameter estimation. The application of a simple Kalman filter seems to be responsible for this result, while a Kalman Smoother (KS) should have been used to obtain a better synchronization. However, considering that the implementation of a KS is straightforward in this case and that observation is always provided close to the inlet, we considered that the increase in the computational resources required by the KS was not warranted. We find that the lag increases when a relatively small number of DA analyses is performed. One can see that the prediction is significantly degraded for f_a = 2, while similar results are obtained for f_a = 10, 55. In addition, θ tends to be underestimated (by around 10%) when it reaches its maximum value. This result is arguably associated with the under-resolution of the coarse level of the grid, where the gradients of the physical variables are calculated with lower accuracy.
Now, the results dealing with the state estimation are discussed. The predicted physical variable ρu, normalized over the initial value ρ₀u₀, is shown in Figs. 12, 13 and 14 for f_a = 2, 10 and 55, respectively. For f_a = 2, the state estimation is significantly distant from the truth. It appears that the field correction applied via the Kalman gain is not able to compensate for the poor estimation of θ.

Figure 11: Time estimation of the parameter θ driving the amplitude of the sinusoidal acoustic wave. Results are shown for f_a = 2, 10, 55 and compared to the true value of θ. In the top image, θ is shown over the whole assimilation window t ∈ [0, 110], while in the bottom image a zoom over t ∈ [0, 15] is shown. The shaded area represents the 95% credible interval for the case f_a = 10.

However, accurate results are observed for f_a = 10 and 55. Even though the estimation of θ is not exact, the total state estimation including the correction via the Kalman gain is very precise. For the case f_a = 55, almost no discernible difference is observed between the state estimation and the truth.
At last, the relative Root Mean Square Error (RMSE), defined as

RMSE(k) = sqrt( ∫_x [ ((ρu)_k^F)^a(x) − ((ρu)_k^F)^True(x) ]² dx / ∫_x [ ((ρu)_k^F)^True(x) ]² dx )   (31)

is shown in Fig. 15. The error achieves a quasi-constant asymptotic behaviour after a complete propagation of the signal in the physical domain (t ≈ 10 t_c). As expected, a low global error is obtained for the cases f_a = 10 and f_a = 55. On the other hand, the error for the f_a = 2 case is significantly larger. The very similar behaviour observed for f_a = 10 and f_a = 55 indicates that, once a minimal threshold in the assimilation period t_a is reached, the prediction of the estimator exhibits a robust convergence. Similar results were previously observed in three-dimensional simulations [18].
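The relative RMSE of Eqs. (26) and (31) is the L2 norm of the estimation error normalized by the L2 norm of the true field; a minimal sketch, where the rectangle-rule quadrature is an implementation choice of ours:

```python
import numpy as np

def relative_rmse(u_est, u_true, dx):
    """Relative RMSE in the sense of Eqs. (26)/(31): sqrt of the integral
    of the squared error divided by the integral of the squared truth."""
    num = np.sum((u_est - u_true) ** 2) * dx
    den = np.sum(u_true ** 2) * dx
    return np.sqrt(num / den)

# sanity check: a uniform bias of 0.01 on a unit-amplitude sine gives
# an error of 0.01 / sqrt(mean(sin^2)) = 0.01 / sqrt(0.5) ~ 0.0141
x = np.linspace(0.0, 10.0, 801)
u_true = np.sin(2 * np.pi * x)
u_est = u_true + 0.01
err = relative_rmse(u_est, u_true, dx=x[1] - x[0])
print(abs(err - 0.01 / np.sqrt(0.5)) < 1e-3)  # True
```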
6. Spatially evolving compressible mixing layer
In this section, we consider the compressible Navier-Stokes equations in a two-dimensional physical domain:

∂ρ/∂t + div(ρ u) = 0   (32)
∂(ρ u)/∂t + div(ρ u ⊗ u) = − grad p + div τ   (33)
∂(ρE)/∂t + div(ρE u) = − div(p u) + div(τ u) + div(λ(T) grad T)   (34)

where ρ is the density, u is the velocity (with components u in the streamwise direction and v in the normal direction), p is the pressure, E is the total energy per unit mass, τ is the viscous stress tensor and T is the temperature. To obtain the representation given by (23), the equations are discretized using the finite-difference method. Second-order centered schemes are used for the derivatives in space and a first-order scheme for the time integration. A centered sixth-order numerical filter is included to damp spurious numerical oscillations [41].

Figure 12: Estimations by MEnKF of the momentum ρu normalized by ρ₀u₀ for f_a = 2 at t = 1.23 (a), t = 8.32 (b) and t = 16.30 (c). Times are given in t_c units. The grey shaded area corresponds to the observation window.

Figure 13: Estimations by MEnKF of the momentum ρu normalized by ρ₀u₀ for f_a = 10 at t = 1.23 (a), t = 8.32 (b) and t = 16.30 (c). Times are given in t_c units. The grey shaded area corresponds to the observation window.

Figure 14: Estimations by MEnKF of the momentum ρu normalized by ρ₀u₀ for f_a = 55 at t = 1.23 (a), t = 8.32 (b) and t = 16.30 (c). Times are given in t_c units. The grey shaded area corresponds to the observation window.

Figure 15: Time evolution of the RMS error of ρu for f_a = 2, 10, 55.

As flow configuration, we consider the two-dimensional spatially evolving mixing layer at Re = 100. For this value of the Reynolds number, the flow exhibits unsteady features. It can be shown [42, 43] that the characteristics of the mixing layer are strongly affected by the inlet and, in particular, by imposed ad-hoc time perturbations. The computational domain has been set to a size of 14Λ ×
6Λ in the streamwise direction x and the normal direction y, respectively. The characteristic length Λ, which is taken as the reference length from now on, is given by Λ = Aδ₀, where δ₀ is the initial vorticity thickness imposed at the inlet. The value of the parameter A is determined from the most unstable wavelength predicted by Linear Stability Theory (LST). At Re = 100, we have A ≈ 14. A uniform mesh is used for x ≤ 10, where the size of the elements is Δx = δ₀. For x ≥ 10, a sponge zone is established, with a coarsening ratio between successive elements which increases from 1.025 to 1.04. The resolution in the normal direction is constant and equal to Δy = δ₀ in a central band around the centerline. Outside this zone, the mesh elements increase in size moving away from the centerline with a constant coarsening ratio. The Reynolds number is defined as Re = (U₁ − U₂)δ₀/ν = 100, with asymptotic velocities set to U₁ = 173.61 and U₂ = 104.17. These values correspond to Mach numbers Ma = 0.5 and 0.3 for each stream, respectively. The kinematic viscosity and the thermal diffusivity of the flow are considered to be constant, both of the order of 10⁻⁵. All these quantities are expressed in S.I. units.
The inlet boundary condition is taken from [42]. For the velocity field, one has:

U_in = (U₁ + U₂)/2 + ((U₁ − U₂)/2) tanh(2y/δ₀) + U_pert   (35)
V_in = 0   (36)

where U_in is the streamwise velocity at the inlet and V_in is the normal velocity. U_in is estimated as a classical hyperbolic-tangent profile plus a time-varying perturbation component:

U_pert = Σ_{i=1}^{N_in} ε_i ((U₁ + U₂)/2) f_i(y) sin(ω_i t)   (37)

where N_in is the total number of perturbation modes and ε_i quantifies the magnitude of each mode. The function f_i(y) = cos(4 n_i y/δ₀) h(y) controls the shape of the perturbation of the inlet velocity profile in the normal direction. The role of h(y) = 1 − tanh²(y/δ₀) is to damp the perturbation component moving away from the centerline. The wavelength parameters n_i are tuned according to the LST results. In the following, we consider N_in = 1, i.e. the inlet perturbation consists of a single mode. The inlet density is set to be constant, ρ_in = 1.17, and the temperature to T = 300 in S.I. units.
In this section, the parameter θ of the model corresponds to ε₁, the variable governing the amplitude of the perturbation. Two different scenarios are studied. In the first one, the parameter ε₁ is constant. In the second one, it is a time-dependent coefficient. In the first case, the reference simulation is performed using a constant single mode for the inlet perturbation, ε₁ = ε₀ = cte. In the second case, ε₁ varies in time following a sinusoidal form: ε₁ = ε₀ (1 + sin(ω_ε t)). The values of the numerical parameters characterizing the perturbation are ε₀ = 0.15, with n₁ set to a fraction of π, ω₁ = 1/t_c and ω_ε a fraction of ω₁, where t_c = 2Λ/(U₁ + U₂) is the average advection time.
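The perturbed inlet profile of Eqs. (35)-(37) can be sketched directly; the wavelength parameter n₁, the evaluation time and the normalization of y by δ₀ below are assumptions for illustration, not the paper's tuned values.

```python
import numpy as np

U1, U2, delta0, eps0 = 173.61, 104.17, 1.0, 0.15
n_modes = 1                                   # N_in = 1, single mode
n_i, omega_i = [0.4 * np.pi], [1.0]           # n1, omega1: assumed values

def h(y):
    # damping of the perturbation away from the centerline, Eq.-style 1 - tanh^2
    return 1.0 - np.tanh(y / delta0) ** 2

def u_inlet(y, t):
    # hyperbolic-tangent base profile plus the time-varying perturbation
    base = 0.5 * (U1 + U2) + 0.5 * (U1 - U2) * np.tanh(2 * y / delta0)
    pert = sum(eps0 * 0.5 * (U1 + U2) * np.cos(4 * n_i[i] * y / delta0)
               * h(y) * np.sin(omega_i[i] * t)
               for i in range(n_modes))
    return base + pert

y = np.linspace(-3.0, 3.0, 241)   # normal coordinate, in delta0 units
u = u_inlet(y, t=0.25)
print(u.shape)                    # (241,)
```

Far from the centerline the damping function suppresses the perturbation, so the profile tends to the asymptotic velocities U₂ and U₁.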
A flow visualization of ρv at t = 10 is shown in Fig. 16 for the two cases. In both cases, one can clearly observe the emergence of coherent structures. When a constant value of ε is used, the coherent structures are virtually aligned. When ε is time-varying, one can observe the emergence of more complex pairing patterns.

Figure 16: Visualization of the normal momentum ρv (S.I. units) for the 2D compressible Navier-Stokes equations. Reference simulation at t = 10 for a constant value of ε (a), and a time-varying value (b).

The DA procedure is performed using the following elements:

- The model is the discretized version of the system given by (32) - (34). The features of the fine mesh level were previously introduced. For the coarse grid level, a homogeneous coarsening ratio r_C = 4 is employed. The simulation is initialized in the physical domain using a simple hyperbolic tangent profile with no perturbation (see (35)). We consider that no prior information is available on the time evolution of ε. At t = 0, this coefficient is modelled as a random Gaussian variable ε ∼ N(0, Q_a), where the initial value of the variance is Q_a(t = 0) = 0. The value imposed on the fine grid at t = 0 is ε = 0, while random values are imposed for each ensemble member on the coarse grid level. The size of the ensemble is N_e = 100.

- The observation is sampled from the reference simulations shown in Fig. 16, which are run for a total simulation time of T_ref = 40 in t_c units. A fully developed state obtained from a prior simulation at t = 10 is used to initialize the simulations at t = 0. Data are sampled every 30 time steps in the region x ∈ [0, 0.55] and y ∈ [−0., 0.]. The observed quantities are ρu and ρv. The data used as observation are artificially perturbed using a Gaussian noise of variance R = 1.

The estimation algorithm is run over a time window equal to T_DA = 40, which encompasses roughly 3200 DA analysis phases. This value corresponds to four complete advections in the whole physical domain. At the end of each analysis, the mean value and the variance of the coefficient ε are updated following the Dual EnKF technique [38], similarly to what was done in Sec. 4.

First, results dealing with the constant value of ε are discussed. The parameter estimation for this case is almost exact, as observed in Fig. 17. The convergence towards the exact value of ε is obtained in roughly 0. time units. The momentum ρv predicted at the centerline of the mixing layer is shown in Fig. 18. At t = 1, the flow is strongly affected by the KF corrections.
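The exchange between the fine and coarse grid levels (coarsening ratio r_C = 4) relies on restriction and prolongation operators. A minimal sketch of one possible pair of such operators, assuming block-averaging restriction and piecewise-constant prolongation; the actual Π_C and Π_F operators of the paper are not specified here, so these choices are illustrative:

```python
import numpy as np

def restrict(fine, rC=4):
    """Project a 2-D fine-grid field onto the coarse grid by block averaging
    (one possible choice for a restriction operator Pi_C)."""
    ny, nx = fine.shape
    return fine.reshape(ny // rC, rC, nx // rC, rC).mean(axis=(1, 3))

def prolong(coarse, rC=4):
    """Inject each coarse value into its rC x rC fine-grid block
    (a piecewise-constant choice for a prolongation operator Pi_F)."""
    return np.kron(coarse, np.ones((rC, rC)))
```

With these choices, restriction after prolongation is the identity on the coarse grid, a convenient consistency property for multigrid transfers.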
Figure 17: Time evolution of the inferred values of ε. The truth corresponds to ε = cte = 0.15. In the zoomed region, the shaded area represents the 95% credible interval for the estimated parameter.
After five characteristic times, the flow perfectly matches the true state upstream. Finally, after the initial transient is dissipated, we observe that the state estimation almost perfectly matches the true state in the whole physical domain.

Figure 18: Estimations obtained by MEnKF of the momentum ρv (S.I. units) at the centerline y = 0 of the mixing layer. Results at t = 1 (a), t = 5 (b) and t = 30 (c) for the case ε = cte = 0.15.

The results dealing with the time-varying parameter ε are now discussed. The time evolution of the estimated value of ε is reported in Fig. 19. The overall sinusoidal trend is generally respected, although a relatively small phase lag is visible. This lag does not appear to be larger than the one previously observed for the one-dimensional case based on the Euler equation. The presence of this delay probably has the same causes as in Sec. 5. However, in this case, some overprediction of the parameter is locally observed in time, which was not obtained for the wave propagation test case.

Figure 19: Time evolution of the inferred values of ε for the time-varying reference case. (a) Large time window. (b) Zoomed region. The shaded area represents the 95% credible interval for the estimated parameter.

The results obtained for the prediction of the normal momentum ρv are shown in Fig. 20. One can see that the combination of parameter and state estimation produces an accurate prediction of the flow. Minor differences with the true state are observed. In particular, the momentum ρv does not exhibit spurious oscillations which could stem from the field correction determined via the Kalman gain.

Figure 20: Estimations obtained by MEnKF of the momentum ρv (S.I. units) at the centerline y = 0 of the mixing layer. Results at t = 1 (a), t = 5 (b) and t = 30 (c) for the time-varying ε.

In order to evaluate the respective influence of the parameter estimation step and of the state estimation phase, a test case is run in which only the parameter estimation is performed (Fig. 21). The accuracy of the prediction is quantified via a normalized RMS error of the transversal momentum:

RMS(k) = sqrt{ ∫_x [ ((ρv)^F_k)^a(x) − ((ρv)^F_k)^True(x) ]² dx / ∫_x [ ((ρv)^F_k)^True(x) ]² dx }    (38)

Figure 21: Estimations obtained by MEnKF of the momentum ρv (S.I. units) at the centerline y = 0 of the mixing layer. Here, MEnKF is only used to provide the estimation of ε. Results at t = 1 (a), t = 5 (b) and t = 30 (c) for the time-varying ε.

The results, which are shown in Fig. 22, indicate that the accuracy of the complete algorithm is higher when compared to the case in which only the parameter estimation is performed. Therefore, the two operations concurrently provide an improvement in the prediction of the flow.

Figure 22: Time evolution of the RMS error of ρv for the case of a time-varying inlet parameter ε. The symbol P.E. corresponds to the case where MEnKF is only used for the estimation of ε. The notation MEnKF corresponds to the standard version of the algorithm, including parameter estimation and physical state correction via Kalman gain.

At last, an analysis of the conservativity of the algorithm is performed. As previously discussed, the state estimation obtained via EnKF does not necessarily comply with the dynamical equations of the model. This drawback can be responsible for discontinuities in the physical field, which can significantly affect the accuracy and stability of the global algorithm. The analysis is performed considering an indicator Γ^F_k which measures the conservation of the transversal momentum equation (33) in discretized form:

[ ((ρv)^F_k)^a − ((ρv)^F_{k−1})^a ] / Δt − F_ρv(ρ_k, u_k, p_k, τ_k) = Γ^F_k,    (39)

where F_ρv represents the spatial discretization terms in the transversal momentum equation. In the forecast step performed via the model, Γ^F_k = 0 down to a prescribed convergence rate δ. However, the value of Γ^F_k at the end of a time step where a forecast-analysis is performed is strictly connected with the computational strategy employed. Here, three cases are considered for ε = cte = 0.15:

- A classical Dual EnKF is performed on the coarse grid and a fine-grid correction is obtained through the ensemble statistics. In this scenario, the state estimation obtained in step 4 of the MEnKF algorithm presented in Sec. 3 is directly projected in the fine mesh space and used as the final solution. Step 5 of the algorithm is not performed.

- A standard MEnKF algorithm, as described in Sec. 3.

- A MEnKF algorithm where the ensemble prediction is only used to estimate the unknown parameter of the system. No update of the physical solution is performed using the correction via Kalman gain.

The results are shown in Fig. 23 after the first forecast / analysis step. For clarity, we introduce a normalized criterion (Γ^F_k)* = Γ^F_k / C, where C is defined as max_k | [ ((ρv)^F_k)^a − ((ρv)^F_{k−1})^a ] / Δt |. As expected, Γ^F_k = 0 everywhere when MEnKF is only used for the parameter estimation.
Here, the time advancement of the solution is performed using the model only, which exactly complies with the discretized equation and respects conservativity (up to a negligible convergence error). On the other hand, the results in Fig. 23 (a) show some lack of conservativity in the physical domain for the first scenario. This is also expected, since no constraint is imposed to force the Kalman gain correction to comply with the discretized equations.

Figure 23: Analysis of the conservativity of the dynamical model via the normalized quantity (Γ^F_k)*. Results are shown for three scenarios: (a) the classical Dual EnKF, (b) the classical MEnKF algorithm and (c) the MEnKF only used for parameter estimation.

For the standard MEnKF, shown in Fig. 23 (b), the field (Γ^F_k)* is very similar to the results observed for the first scenario. However, one can clearly see that this field is sensibly smoothed out by the multigrid iterative procedures in steps 4 and 5 of the MEnKF algorithm. As previously discussed, complete conservativity starting from an erroneous state at k − 1 is not granted; the multigrid iterations, however, provide a natural regularization of the corrected field.
7. Conclusions
A sequential estimator based on a Kalman filter approach for Data Assimilation of fluid flows is presented in this research work. This estimator exploits iterative features which are employed in several CFD codes for the resolution of complex applications in fluid mechanics. More precisely, the multilevel resolution associated with the multigrid iterative approach for time advancement is used to generate several low-resolution numerical simulations. These results are then employed as ensemble members to determine i) the correction via Kalman filter, which is then projected on the high-resolution grid to correct a single simulation corresponding to the numerical model, and ii) an optimization of the free parameters driving the simulation. One of the main advantages of the model is that, owing to the iterative procedure for the calculation of the flow variables, the final solution is regularized.

The method, which is referred to as Multigrid Ensemble Kalman Filter (MEnKF), is assessed via the analysis of one-dimensional and two-dimensional test cases, using different dynamic equations. First, the one-dimensional Burgers equation for Re = 200 is analyzed. Here, the performance of MEnKF is assessed considering several coarsening ratios r_C, which determine the difference in resolution between the main simulation and the ensemble members. The results for r_C = 1 (i.e. a method equivalent to an EnKF) indicate that the State Estimation and the parametric optimization of the inlet provide very high accuracy in the results. With increasing coarsening ratios the quality of the results is progressively degraded, but the main features of the flow are obtained even for very under-resolved ensemble members.
In addition, higher r_C values are associated with significantly decreased computational costs, so that this method exhibits potential for an efficient trade-off between accuracy and required resources.

Then, MEnKF is used to track the time evolution of a free parameter for the case of a wave propagation, using a one-dimensional Euler model. Three cases are investigated here, varying the time window between successive assimilations. The estimator can efficiently represent the evolution in time of the parameter, as well as provide an accurate state estimation. However, the global prediction is significantly degraded if the assimilation window is larger than a threshold value, which is arguably connected to the physical features of the flow.

At last, the analysis of the two-dimensional spatially evolving mixing layer for Re = 100 is performed. The algorithm appears to be well suited for the analysis of unsteady phenomena, in particular for the analysis of time-varying free parameters of the simulation. These features are promising for potential application to in-streaming Data Assimilation techniques.

Future research on this method will target the improvement of the projection operators between the coarse and fine grid levels, in particular for the state matrix Ψ. Preliminary tests have shown that, in the case of non-linear models, information from the main, refined simulation can be projected on the coarse mesh level to improve the predictive capabilities of the ensemble members. This implies a more accurate prediction of the state estimation and of the parameter optimization on the coarse level, from which the main refined simulation will benefit. This loop has the potential to further improve the performance of the MEnKF model, and strategies for efficient application are currently under investigation. Moreover, we plan to apply the MEnKF algorithm to more complex configurations involving complex geometries and more challenging parametric optimization problems.
The compressibility effects are relatively weak in the tests performed so far. Further research is planned on test cases where compressibility effects are more pronounced.

Acknowledgements: Our research activities are supported by the Direction Générale de l'Armement (DGA) and the Région Nouvelle-Aquitaine. Prof. Heng Xiao is warmly acknowledged for valuable discussions on the subject.

Conflict of interest: The Authors have no conflict of interest.

Appendix A. Data Assimilation algorithms
Appendix A.1. Kalman filter algorithm
The Kalman filter algorithm given in Sec. 2.1.1 corresponds to Fig. A.24.
Appendix A.2. Ensemble Kalman filter algorithm
An efficient implementation of the EnKF relying on anomaly matrices is given in Algo. 1. We have used the secant method described in [5] to change the definition of the variable Y^f_k.

Figure A.24: Kalman Filter algorithm. The initialization is made with the analysed state (k = 0: x^a_0, P^a_0).
  Forecast/Prediction:
    x^f_{k+1} = M_{k+1:k} x^a_k
    P^f_{k+1} = M_{k+1:k} P^a_k M^T_{k+1:k} + Q_{k+1}
  Analysis/Correction (k ← k + 1):
    K_k = P^f_k H^T_k (H_k P^f_k H^T_k + R_k)^{−1}
    x^a_k = x^f_k + K_k (y^o_k − H_k x^f_k)
    P^a_k = (I − K_k H_k) P^f_k
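As an illustration of the cycle in Fig. A.24, one forecast/analysis step of the linear Kalman filter can be condensed into a short NumPy sketch (function name and argument layout are illustrative):

```python
import numpy as np

def kf_step(x_a, P_a, M, Q, H, R, y_o):
    """One forecast/analysis cycle of the linear Kalman filter (Fig. A.24)."""
    # Forecast: propagate state and covariance with the linear model M
    x_f = M @ x_a
    P_f = M @ P_a @ M.T + Q
    # Analysis: Kalman gain, state update, covariance update
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
    x_a_new = x_f + K @ (y_o - H @ x_f)
    P_a_new = (np.eye(len(x_f)) - K @ H) @ P_f
    return x_a_new, P_a_new
```

For a scalar state with identity model, P^a = Q = 0 absent, and equal prior and observation variances, the analysis splits the innovation evenly, a standard sanity check.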
Appendix A.3. Dual Ensemble Kalman filter algorithm
An efficient implementation of the Dual EnKF relying on anomaly matrices is given in Algo. 2. We have slightly adapted this algorithm from [38].
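Both the parameter and state updates of the Dual EnKF use the same gain construction from normalized anomaly matrices. A minimal sketch of this shared building block (names are illustrative; A stands for either the parameter anomalies Θ^f_k or the state anomalies X^f_k):

```python
import numpy as np

def normalized_anomalies(V):
    """Columns of V are ensemble members; returns (V - mean) / sqrt(Ne - 1)."""
    Ne = V.shape[1]
    return (V - V.mean(axis=1, keepdims=True)) / np.sqrt(Ne - 1)

def anomaly_gain(A, Y, E):
    """Ensemble gain K = A Y^T (Y Y^T + E E^T)^{-1}, built from normalized
    anomaly matrices: A (parameter or state anomalies), Y (anomalies of the
    observation counterparts), E (observation perturbation anomalies)."""
    return A @ Y.T @ np.linalg.inv(Y @ Y.T + E @ E.T)
```

The same routine is called twice per cycle: once with the parameter anomalies for the θ update, once with the state anomalies for the x update.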
Appendix A.4. Multigrid Ensemble Kalman filter algorithm
Algorithm 4 represents a simplified, ready-to-use application of the conceptual methodology presented in Sec. 3. This algorithm was tailored to the relatively simple physical models used in this work. While it may not be suited for complex three-dimensional applications, it provided an optimal trade-off between accuracy and computational resources for the present analysis.

First of all, when observation is not available, the two main forecast operations (fine-grid forecast and coarse-grid ensemble forecast) are performed using explicit time advancement schemes. This choice reduces the computational costs. However, when observation is available, the following strategies are employed:

1. The two forecast operations (main simulation and ensemble members) are performed using an implicit matrix-splitting iterative procedure, using a single iteration. As stated in Sec. 3, the state transition model for each ensemble member is determined independently, but using the same structure and discretization schemes as the main simulation.

2. The number of iterative solutions for the main simulation on the coarse-grid level is equal to zero. That is, the solution from the first forecast is projected on the coarse grid, and the difference between the KF state estimation and this forecast is re-projected onto the fine grid.

3. In the final iteration on the fine grid, an implicit matrix-splitting iterative procedure is employed, using a single iteration and a relaxation coefficient α = 0.5. This choice, which provides the best compromise between accuracy and regularization, has been identified after extensive tests for the configurations investigated.

Algorithm 1: Stochastic Ensemble Kalman Filter (slightly adapted from [5]). Use of anomaly matrices with Y^f_k = H_k X^f_k.
  Input: for k = 0, ..., K: the forward models M_{k:k−1}, the observation models H_k, the observation error covariance matrices R_k.
  Output: {x^{a,(i)}_k}; k = 0, ..., K; i = 1, ..., N_e.
  Initialize the ensemble of forecasts {x^{f,(i)}_0}; i = 1, ..., N_e.
  for k = 0, ..., K:
    Draw a statistically consistent observation set: y^{o,(i)}_k = y^o_k + ε^{o,(i)}_k, with ε^{o,(i)}_k ∼ N(0, R_k).
    Compute the model counterparts of the observation set: y^{f,(i)}_k = H_k(x^{f,(i)}_k).
    Compute the ensemble means: x̄^f_k = (1/N_e) Σ_i x^{f,(i)}_k; ȳ^f_k = (1/N_e) Σ_i y^{f,(i)}_k; ε̄^o_k = (1/N_e) Σ_i ε^{o,(i)}_k.
    Compute the normalized anomalies:
      [X^f_k]_{:,i} = (x^{f,(i)}_k − x̄^f_k)/√(N_e − 1); [Y^f_k]_{:,i} = (y^{f,(i)}_k − ȳ^f_k)/√(N_e − 1); [E^o_k]_{:,i} = (ε^{o,(i)}_k − ε̄^o_k)/√(N_e − 1).
    Compute the Kalman gain: K^e_k = X^f_k (Y^f_k)^T [Y^f_k (Y^f_k)^T + E^o_k (E^o_k)^T]^{−1}.
    Update the ensemble: x^{a,(i)}_k = x^{f,(i)}_k + K^e_k (y^{o,(i)}_k − y^{f,(i)}_k).
    Compute the ensemble forecast: x^{f,(i)}_{k+1} = M_{k+1:k}(x^{a,(i)}_k).

Algorithm 2: Dual Ensemble Kalman Filter (slightly adapted from [38]). Use of anomaly matrices with Y^f_k = H_k X^f_k. We have i = 1, ..., N_e.
  Input: for k = 1, ..., K: the forward models M_{k:k−1}, the observation models H_k, the observation error covariance matrices R_k.
  Output: {θ^{a,(i)}_k} and {x^{a,(i)}_k}; k = 0, ..., K.
  Initialize {θ^{a,(i)}_0} and {x^{a,(i)}_0}.
  for k = 1, ..., K:
    Observation ensemble: y^{o,(i)}_k = y^o_k + ε^{o,(i)}_k with ε^{o,(i)}_k ∼ N(0, R_k);
      R^e_k = 1/(N_e − 1) Σ_i ε^{o,(i)}_k (ε^{o,(i)}_k)^T.
    Parameter forecast: θ^{f,(i)}_k = θ^{a,(i)}_{k−1} + τ^{(i)}_k with τ^{(i)}_k ∼ N(0, Σ^θ_k);
      x^{f,(i)}_k = M_{k:k−1}(x^{a,(i)}_{k−1}, θ^{f,(i)}_k); y^{f,(i)}_k = H_k(x^{f,(i)}_k).
    Compute the normalized anomalies:
      [Θ^f_k]_{:,i} = (θ^{f,(i)}_k − θ̄^f_k)/√(N_e − 1); [Y^f_k]_{:,i} = (y^{f,(i)}_k − ȳ^f_k)/√(N_e − 1); [E^o_k]_{:,i} = (ε^{o,(i)}_k − ε̄^o_k)/√(N_e − 1).
    Parameter update: K^{θ,e}_k = Θ^f_k (Y^f_k)^T [Y^f_k (Y^f_k)^T + E^o_k (E^o_k)^T]^{−1};
      θ^{a,(i)}_k = θ^{f,(i)}_k + K^{θ,e}_k (y^{o,(i)}_k − y^{f,(i)}_k).
    State forecast: x^{f,(i)}_k = M_{k:k−1}(x^{a,(i)}_{k−1}, θ^{a,(i)}_k); y^{f,(i)}_k = H_k(x^{f,(i)}_k).
    Compute the normalized anomalies:
      [X^f_k]_{:,i} = (x^{f,(i)}_k − x̄^f_k)/√(N_e − 1); [Y^f_k]_{:,i} = (y^{f,(i)}_k − ȳ^f_k)/√(N_e − 1); [E^o_k]_{:,i} = (ε^{o,(i)}_k − ε̄^o_k)/√(N_e − 1).
    State update: K^{x,e}_k = X^f_k (Y^f_k)^T [Y^f_k (Y^f_k)^T + E^o_k (E^o_k)^T]^{−1};
      x^{a,(i)}_k = x^{f,(i)}_k + K^{x,e}_k (y^{o,(i)}_k − y^{f,(i)}_k).

Algorithm 3: Dual Ensemble Kalman filter of Algo. 2 applied on the coarse mesh. We have i = 1, ..., N_e.
  Input: for k = 1, ..., K: the forward models M^C_{k:k−1}, the observation models H^C_k, the observation error covariance matrices R^C_k.
  Output: {θ^{a,(i)}_k} and {(x^C_k)^{a,(i)}}; k = 0, ..., K.
  Initialize {θ^{a,(i)}_0} and {(x^C_0)^{a,(i)}}.
  for k = 1, ..., K:
    Parameter forecast: θ^{f,(i)}_k = θ^{a,(i)}_{k−1} + τ^{(i)}_k with τ^{(i)}_k ∼ N(0, Σ^θ_k);
      (x^C_k)^{f,(i)} = M^C_{k:k−1}((x^C_{k−1})^{a,(i)}, θ^{f,(i)}_k).
    if observation available:
      (y^C_k)^{f,(i)} = H^C_k((x^C_k)^{f,(i)}).
      Observation ensemble: (y^C_k)^{o,(i)} = (y^C_k)^o + (ε^C_k)^{o,(i)} with (ε^C_k)^{o,(i)} ∼ N(0, R^C_k);
        (R^C_k)^e = 1/(N_e − 1) Σ_i (ε^C_k)^{o,(i)} ((ε^C_k)^{o,(i)})^T.
      Compute the normalized anomalies Θ^f_k, Y^f_k and E^o_k as in Algo. 2.
      Parameter update: (K^C_k)^{θ,e} = Θ^f_k (Y^f_k)^T [Y^f_k (Y^f_k)^T + E^o_k (E^o_k)^T]^{−1};
        θ^{a,(i)}_k = θ^{f,(i)}_k + (K^C_k)^{θ,e} ((y^C_k)^{o,(i)} − (y^C_k)^{f,(i)}).
      State forecast: (x^C_k)^{f,(i)} = M^C_{k:k−1}((x^C_{k−1})^{a,(i)}, θ^{a,(i)}_k); (y^C_k)^{f,(i)} = H^C_k((x^C_k)^{f,(i)}).
      Compute the normalized anomalies X^f_k, Y^f_k and E^o_k as in Algo. 2.
      State update: (K^C_k)^{x,e} = X^f_k (Y^f_k)^T [Y^f_k (Y^f_k)^T + E^o_k (E^o_k)^T]^{−1};
        (x^C_k)^{a,(i)} = (x^C_k)^{f,(i)} + (K^C_k)^{x,e} ((y^C_k)^{o,(i)} − (y^C_k)^{f,(i)}).

Algorithm 4: Multigrid EnKF algorithm. We have i = 1, ..., N_e.
  Initialize {(x^F_0)^a, θ^a_0, θ^{a,(i)}_0, (x^C_0)^{a,(i)}}.
  for k = 1, ..., K:
    Fine grid forecast: (x^F_k)^f = M^F_{k:k−1}((x^F_{k−1})^a, θ^a_k).
    Dual EnKF on coarse mesh: apply Algo. 3.
    if observation available:
      Projection on the coarse grid: (x^C_k)* = Π_C((x^F_k)^f).
      Fine grid state correction using the ensemble statistics:
        (x^C_k)' = (x^C_k)* + (K^C_k)^{x,e} [(y^C_k)^o − H^C_k((x^C_k)*)];
        (x^F_k)^a = (x^F_k)^f + Π_F((x^C_k)' − (x^C_k)*).
      Matrix-splitting iterative procedure on the final solution starting from (x^F_k)^a.
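The analysis step of Algo. 1 (perturbed observations, anomaly matrices, ensemble gain, member-wise update) can be condensed into a short NumPy sketch, assuming a linear observation operator H; function and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(Xf, H, R, y_o):
    """Stochastic EnKF analysis step using anomaly matrices (cf. Algo. 1).
    Xf: (n, Ne) forecast ensemble; H: (p, n) linear observation operator;
    R: (p, p) observation error covariance; y_o: (p,) observation."""
    n, Ne = Xf.shape
    # Perturbed observations y^{o,(i)} = y^o + eps^{o,(i)}, eps ~ N(0, R)
    Eps = rng.multivariate_normal(np.zeros(len(y_o)), R, size=Ne).T  # (p, Ne)
    Yo = y_o[:, None] + Eps
    Yf = H @ Xf
    # Normalized anomalies (columns divided by sqrt(Ne - 1))
    Xa = (Xf - Xf.mean(axis=1, keepdims=True)) / np.sqrt(Ne - 1)
    Ya = (Yf - Yf.mean(axis=1, keepdims=True)) / np.sqrt(Ne - 1)
    Ea = (Eps - Eps.mean(axis=1, keepdims=True)) / np.sqrt(Ne - 1)
    # Ensemble Kalman gain K = X Y^T (Y Y^T + E E^T)^{-1}
    K = Xa @ Ya.T @ np.linalg.inv(Ya @ Ya.T + Ea @ Ea.T)
    # Member-wise update
    return Xf + K @ (Yo - Yf)
```

With a small observation error covariance and a fully observed state, the analysis ensemble mean is drawn close to the observation, up to sampling noise.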
References

[1] W. L. Oberkampf, T. G. Trucano, Verification and validation in computational fluid dynamics, Progress in Aerospace Sciences 38 (3) (2002) 209–272. doi:10.1016/S0376-0421(02)00005-2.
[2] J. H. Ferziger, M. Peric, Computational Methods for Fluid Dynamics, Springer, 2002.
[3] S. B. Pope, Turbulent Flows, Cambridge University Press, 2000.
[4] K. Zhou, J. C. Doyle, K. Glover, Robust and Optimal Control, Prentice Hall, 1996.
[5] M. Asch, M. Bocquet, M. Nodet, Data Assimilation: Methods, Algorithms, and Applications, SIAM, 2016.
[6] G. Evensen, Data Assimilation: The Ensemble Kalman Filter, Springer-Verlag, Berlin/Heidelberg, 2009.
[7] R. Daley, Atmospheric Data Analysis, Cambridge University Press, 1991.
[8] A. Onder, J. Meyers, Optimal control of a transitional jet using a continuous adjoint method, Computers & Fluids 126 (2016) 12–24.
[9] D. P. G. Foures, N. Dovetta, D. Sipp, P. J. Schmid, A data-assimilation method for Reynolds-averaged Navier-Stokes-driven mean flow reconstruction, Journal of Fluid Mechanics 759 (2014) 404–431.
[10] V. Mons, Q. Wang, T. A. Zaki, Kriging-enhanced ensemble variational data assimilation for scalar-source identification in turbulent environments, Journal of Computational Physics 398 (2019) 108856.
[11] P. Chandramouli, E. Memin, D. Heitz, 4D large scale variational data assimilation of a turbulent flow with a dynamics error model, Journal of Computational Physics 412 (2020) 109446. doi:10.1016/j.jcp.2020.109446.
[12] Z. Sirkes, E. Tziperman, Finite Difference of Adjoint or Adjoint of Finite Difference?, Monthly Weather Review 125 (1997) 3373–3378.
[13] R. E. Kalman, A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82 (1960) 35–45.
[14] G. Welch, G. Bishop, An Introduction to the Kalman Filter, Technical Report TR 95-041, University of North Carolina at Chapel Hill (2006).
[15] D. Rozier, F. Birol, E. Cosme, P. Brasseur, J.-M. Brankart, J. Verron, A Reduced-Order Kalman Filter for Data Assimilation in Physical Oceanography, SIAM Review 49 (3) (2007) 449–465. doi:10.1137/050635717.
[16] T. Suzuki, Reduced-order Kalman-filtered hybrid simulation combining particle tracking velocimetry and direct numerical simulation, Journal of Fluid Mechanics 709 (2012) 249–288.
[17] M. Meldi, A. Poux, A reduced order model based on Kalman filtering for sequential Data Assimilation of turbulent flows, Journal of Computational Physics 347 (2017) 207–234.
[18] M. Meldi, Augmented Prediction of Turbulent Flows via Sequential Estimators: Sensitivity of State Estimation to Density of Time Sampling for Available Observation, Flow, Turbulence and Combustion 101 (2018) 389–412.
[19] G. Evensen, The ensemble Kalman filter for combined state and parameter estimation: Monte Carlo techniques for data assimilation in large systems, IEEE Control Systems 29 (2009) 83–104.
[20] V. Mons, J. C. Chassaing, T. Gomez, P. Sagaut, Reconstruction of unsteady viscous flows using data assimilation schemes, Journal of Computational Physics 316 (2016) 255–280.
[21] H. Xiao, J. L. Wu, J. X. Wang, R. Sun, C. Roy, Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier-Stokes simulations: A data-driven, physics-informed Bayesian approach, Journal of Computational Physics 324 (2016) 115–136.
[22] M. C. Rochoux, S. Ricci, D. Lucor, B. Cuenot, A. Trouve, Towards predictive data-driven simulations of wildfire spread - Part I: Reduced-cost Ensemble Kalman Filter based on a Polynomial Chaos surrogate model for parameter estimation, Natural Hazards and Earth System Sciences 14 (2015) 2951–2973.
[23] D. Simon, T. L. Chia, Kalman filtering with state equality constraints, IEEE Transactions on Aerospace and Electronic Systems 38 (2002) 128–136.
[24] N. Gupta, Kalman filtering in the presence of state space equality constraints, IEEE Chinese Control Conference (2007) 107–133.
[25] X. L. Zhang, C. M. Ströfer, H. Xiao, Regularized ensemble Kalman methods for inverse problems, Journal of Computational Physics 416 (2020) 109517.
[26] A. Brandt, Multi-level adaptive solutions to boundary-value problems, Mathematics of Computation 31 (1977) 333–390.
[27] P. Wesseling, C. W. Oosterlee, Geometric multigrid with applications to computational fluid dynamics, Journal of Computational and Applied Mathematics 128 (2001) 311–334.
[28] H. Hoel, K. J. H. Law, R. Tempone, Multilevel ensemble Kalman filtering, SIAM Journal on Numerical Analysis 54 (3) (2016) 1813–1839.
[29] A. Siripatana, L. Giraldi, O. P. Le Maître, O. M. Knio, I. Hoteit, Combining ensemble Kalman filter and multiresolution analysis for efficient assimilation into adaptive mesh models, Computational Geosciences 23 (2019) 1259–1276.
[30] K. Fossum, T. Mannseth, A. S. Stordal, Assessment of multilevel ensemble-based data assimilation for reservoir history matching, Computational Geosciences 24 (2020) 217–239.
[31] J. Brajard, A. Carrassi, M. Bocquet, L. Bertino, Combining data assimilation and machine learning to infer unresolved scale parametrisation, arXiv:2009.04318v1 (2020) 1–16.
[32] L. Debreu, E. Neveu, E. Simon, F.-X. Le Dimet, A. Vidard, Multigrid solvers and multigrid preconditioners for the solution of variational data assimilation problems, Quarterly Journal of the Royal Meteorological Society 142 (2015) 515–528.
[33] G. Stonebridge, Diagonal Approximations to the Observation Error Covariance Matrix in Sea Ice Thickness Data Assimilation, Master of Science thesis, University of Waterloo (2017).
[34] S. L. Brunton, B. R. Noack, Closed-Loop Turbulence Control: Progress and Challenges, Applied Mechanics Reviews 67 (2015) 050801.
[35] G. Evensen, Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics, Journal of Geophysical Research 99 (1994) 10143–10162. doi:10.1029/94JC00572.
[36] G. Burgers, P. J. van Leeuwen, G. Evensen, Analysis scheme in the ensemble Kalman filter, Monthly Weather Review 126 (1998). doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.
[37] J. W. Labahn, H. Wu, B. Coriton, J. H. Frank, M. Ihme, Data assimilation using high-speed measurements and LES to examine local extinction events in turbulent flames, Proceedings of the Combustion Institute 37 (2019) 2259–2266.
[38] M. Hamid, S. Sorooshian, H. Gupta, P. Houser, Dual state-parameter estimation of hydrological models using ensemble Kalman filter, Advances in Water Resources 28 (2005) 135–147. doi:10.1016/j.advwatres.2004.09.002.
[39] W. Hackbusch, Multi-Grid Methods and Applications, Springer, Berlin, 1985.
[40] P. Sakov, D. S. Oliver, L. Bertino, An iterative EnKF for strongly nonlinear systems, Monthly Weather Review 140 (2012) 1988–2004.
[41] C. Bogey, C. Bailly, A family of low dispersive and low dissipative explicit schemes for flow and noise computations, Journal of Computational Physics 194 (2004) 194–214. doi:10.1016/j.jcp.2003.09.003.