[PDF] Data-Driven Optimization Approach for Inverse Problems : Application to Turbulent Mixed-Convection Flows

Abstract

Optimal control of turbulent mixed-convection flows has attracted considerable attention from researchers. Numerical algorithms such as Genetic Algorithms (GAs) are powerful tools that allow to perform global optimization. These algorithms are particularly of great interest in complex optimization problems where cost functionals may lack smoothness and regularity. In turbulent flow optimization, the hybridization of GA with high fidelity Computational Fluid Dynamics (CFD) is extremely demanding in terms of computational time and memory storage. Thus, alternative approaches aiming to alleviate these requirements are of great interest. Nowadays, data driven approaches gained attention due to their potential in predicting flow solutions based only on preexisting data. In the present paper, we propose a near-real time data-driven genetic algorithm (DDGA) for inverse parameter identification problems involving turbulent flows. In this optimization framework, the parametrized flow data are used in their reduced form obtained by the POD (Proper Orthogonal Decomposition) and solutions prediction is made by interpolating the temporal and the spatial POD subspaces through a recently developed Riemannian barycentric interpolation. The validation of the proposed optimization approach is carried out in the parameter identification problem of the turbulent mixed-convection flow in a cavity. The objective is to determine the inflow temperature and inflow velocity corresponding to a given temperature distribution in a restricted area of the spatial domain. The results show that the proposed genetic programming optimization framework is able to deliver good approximations of the optimal solutions within less than two minutes.

Full PDF

DData-Driven Optimization Approach for Inverse Problems :Application to Turbulent Mixed-Convection Flows

M. Oulghelou , C. Beghein , C. Allery LaSIE, UMR-7356-CNRS, Universit´e de La Rochelle Pˆole Science et Technologie, Avenue MichelCr´epeau, 17042 La Rochelle Cedex 1, France.

Abstract

Optimal control of turbulent mixed-convection ﬂows has attracted considerable attentionfrom researchers. Numerical algorithms such as Genetic Algorithms (GAs) are powerfultools that allow to perform global optimization. These algorithms are particularly of greatinterest in complex optimization problems where cost functionals may lack smoothness andregularity. In turbulent ﬂow optimization, the hybridization of GA with high ﬁdelity Compu-tational Fluid Dynamics (CFD) is extremely demanding in terms of computational time andmemory storage. Thus, alternative approaches aiming to alleviate these requirements are ofgreat interest. Nowadays, data driven approaches gained attention due to their potential inpredicting ﬂow solutions based only on preexisting data. In the present paper, we proposea near-real time data-driven genetic algorithm (DDGA) for inverse parameter identiﬁcationproblems involving turbulent ﬂows. In this optimization framework, the parametrized ﬂowdata are used in their reduced form obtained by the POD (Proper Orthogonal Decomposi-tion) and solutions prediction is made by interpolating the temporal and the spatial PODsubspaces through a recently developed Riemannian barycentric interpolation. The valida-tion of the proposed optimization approach is carried out in the parameter identiﬁcationproblem of the turbulent mixed-convection ﬂow in a cavity. The objective is to determinethe inﬂow temperature and inﬂow velocity corresponding to a given temperature distribu-tion in a restricted area of the spatial domain. The results show that the proposed geneticprogramming optimization framework is able to deliver good approximations of the optimalsolutions within less than two minutes.

Keywords:

Flow inverse problem, optimal control, Data-Driven optimization, indoorﬂows, heat problems, Genetic Algorithm, Proper Orthogonal Decomposition. [email protected] [email protected] [email protected] a r X i v : . [ c s . C E ] S e p . Introduction Decreasing energy consumption of buildings is an important aspect of the reducing ofglobal warming. However, the energy reduction has to be compromised with the quality ofthermal comfort inside buildings. To achieve that, optimization applied to indoor airﬂows,which is aimed at determining optimal ﬂow values for some well chosen parameters are ofgreat interest. The optimization objective can be expressed in the whole or a part of thedomain, in terms of ﬁeld variables such as inlet velocity, wall temperature, heat source,etc. For ﬂows in buildings, which are mostly mixed convection turbulent ﬂows, high ﬁ-delity solvers are privileged for parameter identiﬁcation problems. A usual class of ﬂowoptimization algorithms consists in standard gradient descent algorithms using high ﬁdelityadjoint equations. The search direction is computed as the functional cost sensitivity overthe design variables and the solution is moved along until an optimal solution is reached.This approach was used for instance by Liu et al. to ﬁnd optimal thermo-ﬂuid boundaryconditions in a two-dimensional cavity [1] and to optimize the air supply location, size, andparameters in a two dimensional non isothermal ventilated cavity [2]. It was also used tooptimize buoyancy-driven ventilation ﬂows governed by Boussinesq equations [3, 4]. A ma-jor limitation of high ﬁdelity adjoint-based algorithms is that they are more likely to stuckin local optima. To overcome this issue, a global optimization search can be carried out byGenetic Algorithms (GAs) [5]. In the context of mixed-convection ﬂows, high ﬁdelity solverscombined with GA have been investigated and validated in [6, 7]. Compared to high ﬁdelityadjoint based optimization approach, high ﬁdelity based GA is more eﬃcient in terms ofﬁnding global optimal solutions, yet it requires a tremendous computing load, leading toturn the attention to techniques of model reduction.Reduced-order models have been extensively used in ﬂuid dynamics in order to reduce thecomputational burden in optimization and control applications. Recently, POD/Galerkinreduced order models were successfully combined with optimization approaches allowing adrastic alleviation of the optimization computational eﬀort. A standard approach devel-oped by Tallet et al. [8] and Bergmann et al. [9] consists in using high ﬁdelity simulationsto extract a POD basis representing the main structures of a set of snapshots sampled atdiﬀerent parameter values. The temporal dynamics is afterwards calculated by solving anordinary system of diﬀerential equations resulting from Galerkin projection of the governingequations onto the global POD basis. By considering the global POD/Galerkin ROM asthe state equations, a reduced scale optimization problem can be formulated and solvedin near-real time. However, in many physical cases, the global POD/Galerkin ROM mayexperience issues of accuracy due to the overload of information in the global POD basis.Sophisticated subspace interpolation techniques such as the ITSGM (Interpolation on theTangent Space of the Grassmann Manifold) proposed by Amsallem et al. [10] is an eﬃcientlocal method meant to restrict the ROM predictions to the wanted physical regime. In thecontext of the adjoint-based optimal control, the ITSGM/Galerkin ROM was successfullyembedded in a suboptimal control strategy to achieve a near-real time optimal control oftransfer phenomena [11].In the last two decades, interest in data driven model reduction techniques for ﬂow problemsis increasingly growing. Interestingly, the power of these methods is their dispense on theunderlying mathematical model. Instead, they explore and learn the dynamics from preex-isting data and deliver approximations that are expected to capture most of the dynamicsof the ﬂow. Numerous attempts have been carried out in this subject. Namely, one canrefer to [12] where a modiﬁed version of the ITSGM referred to as Bi-CITSGM designed fornon-linear data interpolation is proposed, and to [13, 14, 15, 16, 17] where Artiﬁcial NeuralNetworks (ANN) were used with model reduction for the prediction of ﬂow solutions. Inthe context of data driven optimization, the Bi-CITSGM has been used successfully in con-junction with GA to control the ﬂow past a circular cylinder and the ﬂow in a lid drivencavity [12]. In the same spirit, ANN are used in conjunction with micro genetic algorithm2MGAs) for the optimization of the location of multiple discrete heat sources in a ventilatedcavity [18]. In both cases, the Bi-ITSGM or ANN combined with GA demonstrated theirability to reach good suboptimal solutions within a near-real computational time.In this paper, we formulate a new Data Driven Genetic Algorithm (DDGA) based on theRiemannian Barycentric interpolation of subspaces. This interpolation method is basedupon the geometry of the manifold of ﬁxed rank matrices studied in details in [19]. It wasinitially used to interpolate low-rank solutions of the Luyapunov equations resulting fromparametric linear input-output reduced order system [20], and recently adapted to interpo-late the parametric Navier-Stokes Galerkin/ROM [21]. In contrast to the Bi-CITSGM whichneeds a calibration phase for the interpolated POD subspaces, the barycentric interpolationnaturally results in modes that are arranged according the POD energetic content. Thisproperty allows to interpolate the time and space quantities separately and eventually formthe set of untrained solutions by simply combining them. The aim of the following studyis to use a preexisting ﬂow database to solve the inverse parameter identiﬁcation probleminvolving the turbulent mixed-convection ﬂow in a cavity. The optimization objective is todetermine the inlet velocity and temperature that optimize the cost functional related tomaintaining a desired temperature distribution inside a part of the spatial domain.The remainder of this article is organized as follows: First, the studied Mixed convectioninverse problem is presented in section 2. In section 3, the barycentric interpolation usedfor nonlinear parametrized data prediction is detailed. Next, the proposed data driven Ge-netic Algorithm is outlined in section 4. In section 5, numerical experiments assessing thepotential of this approach are carried out on the inverse problem involving the turbulentmixed-convection ﬂow in a cavity. Finally, conclusions are drawn in section 6.

2. Mixed convection inverse problem

This study focuses on the inverse problem of temperature distribution in a two-dimensionalventilated cavity, whose dimensions are 1 . m × . m , and which is shown in ﬁgure 1. Thetemperature θ hot of the bottom wall of the cavity is higher than the temperature θ cold ofthe other walls: θ hot = 35 o C and θ cold = 15 o C (1)The air inlet (resp. outlet) is located at the top left (resp. bottom right) corner of the Figure 1: Description of the studied mixed-convection ﬂow cavity. The vertical dimension of the air inlet (resp. outlet) is 0 .

018 m (resp. 0 .

024 m). Atthe inlet, the air velocity is denoted U , and the air temperature is θ . The turbulent air ﬂowin the cavity is governed by the equations of mass conservation, momentum conservation,3nergy conservation, of an incompressible Newtonian ﬂuid with Boussinesq’s assumption  ∇ · v = 0 ρ∂ t v + ρ v · ∇ v = −∇ p + µ ∆ v + ρ g β (Θ − Θ ) e y + ∇ σ t ρ c p ∂ t Θ + ρ c p v · ∇ Θ = λ ∆Θ + ∇ q t (2)where v , Θ, p are the time averaged velocity , temperature and pressure obtained with anUnsteady Reynolds Averaged Navier-Stokes (URANS) turbulence model. ρ , µ , C p , λ are thedensity, dynamic viscosity, heat capacity and heat conductivity of the ﬂuid at the referencetemperature Θ , g is the gravitational acceleration, β is the thermal expansion coeﬃcient. σ t and q t are the turbulent Reynolds stress and the turbulent heat ﬂux given by σ t ij = − ρ v (cid:48) i v (cid:48) j q t i = − ρ c p v (cid:48) i Θ (cid:48) where v (cid:48) and Θ (cid:48) stand for the temporal mean values of the ﬂuctuating velocity and tem-perature. The aim of the following study is to solve the constrained nonlinear optimizationproblem min δ J ( y ) subject to N ( y, δ ) = 0 (3)where J is the functional describing the cost to minimize, N the non-isothermal Navier-Stokes equations (2) and y ( δ ) the state variable which might be represented for exampleby the velocity ﬁeld v or the temperature Θ. In the present article, since the turbulentmixed-convection ﬂow is strongly inﬂuenced by the inlet temperature θ and velocity U , weuse them as optimization variables. For a given temperature distribution ˆΘ, the goal is torecover the inlet velocity U and inlet temperature θ that minimize the objective functional J (Θ) = (cid:90) t f (cid:90) Ω int (Θ − ˆΘ) dx dt (4)where [0 , t f ] is the time frame of simulation and Ω int the restricted occupied zone of thespatial domain depicted in ﬁgure 1. Two cases of optimization are studied. The ﬁrst caseconsists in maintaining the inlet temperature θ constant and considering the optimizationvariable to be the inlet velocity δ = U ; and the second case by ﬁxing the inlet velocity U and optimizing on the inlet temperature δ = θ . It is worth mentioning that one could alsothink about optimizing on diﬀerent parameters, such as the coordinates or the intensity ofa heat source in the domain Ω. But, since a GA strategy is to be used, these parameterscan directly be incorporated into the cost functional without inducing any modiﬁcation inthe optimization process. The general idea of GA is illustrated in the ﬂowchart 2. GA consists in starting from arandomly generated set (of size N chrom ) of chromosomes δ , δ , . . . , δ N chrom , forming a pop-ulation. The size of populations is unchanged and ﬁxed to N chrom . In each population, aﬁtness value [22] is assigned to each chromosome δ j . Virtually, any ﬁtness function can bechosen given that no requirement for continuity in the derivatives is needed. Some examplesof the choice of ﬁtness functions can be found in [23, 24, 25, 26]. In the present paper, theﬁtness function f is chosen as the inverse of the objective function, i.e, the ﬁtness of the j th chromosome is calculated as follows f ( y j ) = 1 J ( y j , δ j )where y j is obtained by solving the constraint problem N ( y j , δ j ) = 0. In order to evolvepopulations, three main genetic operators [27] modeled on the Darwinian concepts of naturalselection and evolution are used. These are : In these equations, v , Θ and p should have been written v , Θ and p . To alleviate the notations in thereminder of the paper, the time averaged notation will not be used. igure 2: Outline of the Genetic Algorithm. Selection.

There are several methods for selecting the best chromosomes and their transferto the next generation. In general, a new population of chromosomes is chosen to survivebased on their ﬁtness values. That means that a chromosome δ j with a large ﬁtness valuehas higher probability of being reproduced and passed down into the next generation. Theprobability of reproduction can be calculated as follows P js = f ( y j ) (cid:44) N chrom (cid:88) i =1 f ( y i )Using this reproduction probability, N chrom solutions from the current generation are selectedby the roulette rule [28] to survive for the next generation. These reproduced solutions areafterwards modulated by the crossover and mutation operators [27] . Crossover.

The crossover is the operation wherein genes are exchanged between two chro-mosomes. In particular, all the surviving chromosomes by the roulette selection rule arerandomly paired. Precisely, two individuals are randomly selected as parent individuals,then arbitrary positions on both individuals are chosen for crossing locations where ex-change of genes takes place. In practice, a random number ranging from 0 to 1 is generated.If the random number is greater than P c , the two chromosomes in the original pair remaininto the next generation. Otherwise, the crossover takes place, and two new chromosomesare created to replace the parent chromosomes. Mutation.

The mutation operator is responsible for bringing new information to the pop-ulation. With a probability P m ranging from 0 to 1, the mutation operator accidentallychanges one of the resulted genes.The above genetic operations are repeated for a predetermined number of generations ar-bitrarily set by the user. The best chromosome of the ﬁnal generation is declared as theglobal optimized solution. Despite their superiority with respect to other optimization ap-proaches, a serious weakness of high ﬁdelity based GAs is their considerable requirementsin computational eﬀort and memory storage [29]. In fact, GA needs to perform high ﬁ-delity simulations many times for each evolved population (iteration). With the increasein the number of generations, the populations and their required crossovers and mutations5ill increase. These, in turn increase the time complexity of GA, making unfeasible theirapplication in near-real time. In order to tackle this issue, an interpolation strategy suitedfor non-linear parameterized data and intended to replace the high ﬁdelity solver in the GAis proposed in the next section.

3. Barycentric interpolation for nonlinear parametrized data

Consider a set of parametrized matrices { Y k ∈ R N x × N s , k = 1 , . . . , N p } formed from thediscrete solutions y ( δ k ) of a transient non-linear ﬂow problem. i.e, Y k =  y ( t , x , δ k ) y ( t , x , δ k )... . . . y ( t , x N x , δ k ) y ( t Ns , x N x , δ k )  In practice, δ refers to a parameter of the ﬂow problem, N x the number of spatial degreesof freedom and N s the number of time steps, where it is assumed that N x exceeds N s byseveral orders of magnitude. The aim of the following is to extract a set of reduced matrices Y k that describes the dynamics of the full order matrices Y k . To this end, assume that eachmatrix Y k is approximated in a POD basis of dimension q as follows Y k ≈ Φ k Λ kT (5)where Φ i ∈ R N x × q and Λ i ∈ R N s × q are respectively the spatial and temporal bases. Now,consider the POD respectively of orders r and s , r, s ≤ qN p , of the column block matrices (cid:2) Φ Φ · · · Φ N p (cid:3) = Φ ϕ T and (cid:2) Λ Λ · · · Λ N p (cid:3) = Λ α T where Φ ∈ R N x × r , ϕ ∈ R qN p × r , Λ ∈ R N s × s , α ∈ R qN p × s . Let ϕ i ∈ R q × r and α i ∈ R q × s , i = 1 , . . . , N p , be the column block matrices of ϕ T and α T such as ϕ T = (cid:2) ϕ ϕ · · · ϕ N p (cid:3) and α T = (cid:2) α α · · · α N p (cid:3) It yields that the full order snapshots matrix associated to the parameter δ k can be writtenas Y k ≈ Φ ϕ k α Tk Λ T (6)It is important to note that in the above expression, the change with respect to parameter δ k occurs only on the nested matrices Y k = ϕ k β Tk of signiﬁcantly reduced size r × s , r, s (cid:28) N x .In parametric studies such as optimization, rather than using the full order matrices Y k , itis more convenient to manipulate the corresponding nested reduced matrices Y k in order toachieve low cost calculations. The interpolation strategy of the matrices Y k is detailed inthe next subsection. In our case, the solutions y correspond to the turbulent mixed convection temperature distribution Θ,and the parameter δ to the inlet velocity U or inlet temperature θ . The POD bases are constructed such that they verify optimality with respect to the Euclidean innerproduct. In this case, the POD is nothing but the Singular Value Decomposition (SVD). However, otherinner products such as L or H can be used. More details about the POD approach can be found in [30]. .2. Data interpolation In the following, the interpolation approach is ﬁrst presented for two data samples. Thegeneralization to an arbitrary number of data samples is given afterwards. Let Y and Y be two parametrized compressed matrices associated respectively to δ and δ , such that Y = ϕ α T Y = ϕ α T where ϕ k and α k are rank- q parameterized matrices resulted from the data compressionprocedure. By using the above representations, the goal is to predict the matrix ˜ Y asso-ciated to a new parameter value ˜ δ diﬀerent from δ and δ . To this end, the barycentricinterpolation proposed in [21] for subspaces interpolation is used. For the sake of simplicity,we restrict ourselves to the univariate case and use Lagrange functions to generate interpo-lation weights. The Lagrange functions constructed by using two points δ and δ are givenby ω (˜ δ ) = ˜ δ − δ δ − δ ω (˜ δ ) = ˜ δ − δ δ − δ During the interpolation process, two sorts of subspaces have to be distinguished. The spa-tial subspaces span ( ϕ ) and span ( ϕ ), and the temporal subspaces span ( α ) and span ( α ).The proposed data interpolation technique suggests to predict the new matrix ˜ Y by applyingthe barycentric interpolation strategy to the spatial and temporal subspaces separately, i.e,it consists in solving the ﬁxed point problems( P x )  Find ˜ ϕ such that :˜ ϕ T ϕ = ξ Σ η T and ˜ ϕ T ϕ = ξ Σ η T ˜ ϕ = ˜ δ − δ δ − δ ϕ ˜ Q + ˜ δ − δ δ − δ ϕ ˜ Q where ˜ Q = η ξ T and ˜ Q = η ξ T ( P t )  Find ˜ α such that :˜ α T α = ζ Υ τ T and ˜ α T α = ζ Υ τ T ˜ α = ˜ δ − δ δ − δ α ˜ K + ˜ δ − δ δ − δ α ˜ K where ˜ K = τ ζ T and ˜ K = τ ζ T The iterative process to solve the problem ( P x ) is described by the following ﬁxed pointsequence( P x )  ˜ ϕ (0) given, for n ≥ ϕ ( n ) T ϕ = ξ ( n )1 Σ ( n )1 η ( n ) T then set ˜ Q ( n )1 = η ( n )1 ξ ( n ) T Perform the SVD of ˜ ϕ ( n ) T ϕ = ξ ( n )2 Σ ( n )2 η ( n ) T then set ˜ Q ( n )2 = η ( n )2 ξ ( n ) T Update the interpolant as ˜ ϕ ( n +1) = ˜ δ − δ δ − δ ϕ ˜ Q ( n )1 + ˜ δ − δ δ − δ ϕ ˜ Q ( n )2 The same strategy applies for the resolution of problem ( P t ). Now, once the solutions ˜ ϕ and˜ α respectively, of the ﬁxed points problems ( P x ) and ( P t ) are found, the reduced snapshotmatrix ˜ Y can be formed as˜ Y = ˜ ϕ ˜ α T = (cid:32) ˜ δ − δ δ − δ (cid:33) ϕ ˜ Q ˜ K T α T + (cid:32) ˜ δ − δ δ − δ (cid:33) ϕ ˜ Q ˜ K T α T + (˜ δ − δ )( δ − ˜ δ )( δ − δ ) (cid:16) ϕ ˜ Q ˜ K T α T + ϕ ˜ Q ˜ K T α T (cid:17) A very interesting property of the above formula is that even though space and time reducedbases { ϕ , ϕ } and { α , α } are separately interpolated, the calibration between the columns7f ˜ ϕ and ˜ α is naturally ensured by the barycentric interpolation, unlike the Bi-CITSGM[12] where the calibration is lost by the Grassmannian interpolation.Let’s now state the general framework of the data interpolation approach. To do so, con-sider a set of parametrized data matrices Y , · · · , Y Np associated to the parameter values δ , δ , . . . , δ N p , such that Y k = ϕ k α Tk , k = 1 . . . , N p The approximate matrix ˜ Y for a new untrained value ˜ δ (cid:54) = δ k obtained by solving the followingﬁxed point problems( P x )  Find ˜ ϕ such that :˜ ϕ T ϕ k SVD = ξ k Σ k η Tk , k = 1 , . . . , N p ˜ ϕ = N p (cid:88) k =1 ω k (˜ δ ) ϕ k ˜ Q k where ˜ Q k = η k ξ Tk ( P t )  Find ˜ α such that :˜ α T α k SVD = ζ k Υ k τ Tk k = 1 , . . . , N p ˜ α = N p (cid:88) k =1 κ k (˜ δ ) α k ˜ K k where ˜ K k = τ k ζ Tk The solution is then constructed as follows˜ Y = N p (cid:88) k,h =1 ω k (˜ δ ) κ h (˜ δ ) ϕ k ˜ Q k ˜ K Th α Th where ˜ Q k and ˜ K h are orthogonal matrices and ω k and κ h are some interpolation functionsof sum equal to 1, verifying ω k ( δ i ) = κ k ( δ i ) = δ ki , with δ ki the piecewise Kronecker deltafunction which value is 1 if k equals i and 0 otherwise. The interpolation procedure ofnonlinear parametrized data is summarized in algorithm 1.In order to tackle the severe computational eﬀort of Genetic algorithms, an optimizationprocedure is proposed in the next section, where algorithm 1 is used as solution predictorinstead of the high ﬁdelity solver. Algorithm 1:

Non-linear data interpolation strategy

Oﬄine :

Use the POD to compress the trained parametrized data matrices Y k such as Y k ≈ Φ Y k Λ T where Y k = ϕ k α Tk Online :

Give a value of ˜ δ (chosen by the user) and calculate the weights ω k (˜ δ ) and κ h (˜ δ )Set ˜ Y (0) = ˜ ϕ (0) k ˜ α k (0) T arbitrary, for example choose a point Y k from the sampling while Error > ε dofor k ∈ { , . . . , N p } do Calculate the matrix ˜ Q ( n ) k = η ( n ) k ξ ( n ) T k where ˜ ϕ ( n ) T ϕ k SVD = ξ ( n ) k Σ ( n ) k η ( n ) T k Calculate the matrix ˜ K ( n ) k = τ ( n ) k ζ ( n ) T k where ˜ α ( n ) T α k SVD = ζ ( n ) k Υ ( n ) k τ ( n ) T k Update the reduced matrix : ˜ Y ( n +1) = N p (cid:88) k,h =1 ω k (˜ δ ) κ h (˜ δ ) ϕ k ˜ Q ( n ) k ˜ K ( n ) T h α Th Evaluate the error :

Error = N p (cid:88) k =1 N p (cid:88) h =1 || ˜ Q ( n ) k ˜ K ( n ) T h − ˜ Q ( n − k ˜ K ( n − T h || F where || · || denotes the Frobenius norm. 8 . Data-Driven Reduced Genetic Algorithm Basically, the proposed DDGA is a genetic algorithm strategy to solve inverse problemsby means of available precomputed parametrized ﬂow data. The major advantage of thisapproach is that the relationship between the state variable y and the optimization variable δ , earlier established through the mapping N , is now replaced by the cheap explicit formulaof the barycentric interpolation y ( t l , x j , ˜ δ ) ≈ Φ ( x j ) ˜ Y Λ T ( t l ) (7)where Φ ( x j ) and Λ ( t l ) denote respectively the j th and l th rows of the matrices Φ and Λ ,and ˜ Y the reduced snapshots matrix to be found by algorithm 1.In order to make sure that DDGA performs in an optimal manner, the chromosomes areenriched by virtual genes. These genes are the order of POD truncation q and the number ofspatial and temporal interpolation neighbors denoted respectively ne x and ne t . To illustratethis, let Y , . . . , Y be four reduced matrices associated to the parameter values δ < δ <δ < δ respectively such that Y k = ϕ k α Tk , k = 1 , . . . , ϕ k and α k are rank- q matrices. Suppose that we want to ﬁnd an approximation ofthe reduced matrix ˜ Y for an untrained value ˜ δ ∈ ] δ , δ [ by using an order of POD truncation m < q , three neighbors for spatial interpolation ( ne x = 3) and two neighbors for temporalinterpolation ( ne t = 2). Then the untrained reduced matrix is approximated as˜ Y = (cid:88) k =1 2 (cid:88) h =1 ω k (˜ δ ) κ h (˜ δ ) ϕ k ˜ Q k ˜ K Th β Th where the columns of ϕ k and β h are truncated up to the order m and ω k (˜ δ ) = (cid:89) i =1 i (cid:54) = k ˜ δ − δ i δ k − δ i and κ h (˜ δ ) = (cid:89) i =1 i (cid:54) = h ˜ δ − δ i δ h − δ i In the proposed genetic algorithm strategy, the j th chromosome is then the candidate¯ δ j = { δ j , ne t , ne x , m } where δ j , ne t , ne x and q are its genes. Accordingly, the originaloptimization problem (3) is modiﬁed yielding tomin ¯ δ J ( ˜ Y , ¯ δ ) such that ˜ Y is the output of algorithm 1In the next section, the potential of this approach is assessed on the inverse parameteridentiﬁcation problem involving a turbulent mixed convection ﬂow.

5. Numerical experiments

In this section, the CFD model used to solve the mixed-convection problem is ﬁrstvalidated with respect to the benchmark experimental data. Then a set of solutions sampledin diﬀerent time instants and diﬀerent trained parameters are created and eventually usedto assess the eﬃciency of the proposed DDGA.

This series of numerical computations was based on the experiment carried out by Blayet al. [31], where a turbulent mixed convection ﬂow was generated in a ventilated cavitywith dimensions 1 . × . × . m . In this experiment, a two-dimensional ﬂow wasgenerated in the enclosure shown in ﬁgure 1, which was surrounded by two guard cavities.9he reference temperature Θ was the average temperature in the cavity. The Rayleighnumber of this conﬁguration, based on the cavity height and on the temperature diﬀerencebetween the heated ﬂoor ( θ hot = 35 . o C ) and the other walls and the inlet ( θ cold = θ =15 o C ), was 2 . × . The Reynolds number based on the air velocity at inlet U =0 . m/s and on the inlet height was 654. The two-dimensional turbulent ﬂow was modeledwith the RNG k-epsilon model [32]. To compute this ﬂow, and to generate all input datanecessary for the study presented in this paper, the ﬁnite volume code OpenFOAM [33] wasused. The computational domain was discretized into a non uniform grid made of 120000hexaedral cells, which was very tight close to the walls, in order to properly discretize theboundary layer. The non-isothermal ﬂow described by equations (2) was calculated withthe buoyantBoussinesqPimpleFoam solver. At the inlet, the velocity boundary conditionswere u = 0 .

57 m/s and v = 0 m/s, the temperature was Θ = 15 o C , and the turbulentboundary conditions were k = 1 . × − m /s and (cid:15) = 5 . × − m /s . On the walls,no-slip boundary conditions were applied for the velocity components, the temperature wasequal to 35 o C on the ﬂoor, and to 15 o C on the other walls. At the outlet, zero gradientboundary conditions were applied for the temperature, the velocity components and theturbulent variables. The steady ﬂow presented in this paragraph was reached by computingan unsteady ﬂow, starting at t = 0 s from Θ ini = θ cold for the temperature, and u ini = v ini = 0 for the velocity components. The convection terms were discretized with the Gausslinear Upwind scheme, and the laplacian terms were approximated with the Gauss linearcorrected scheme. With this non uniform mesh, the average y + value was equal to 1.1,and the maximum value was 3 .

3. In ﬁgure 3, the temperature proﬁles at x = 0 .

52 and at y = 0 .

52 are shown such that, Θ ∗ = Θ − Θ θ hot − θ cold , x ∗ = x/H and y ∗ = y/H where H is thecavity height. A satisfactory agreement can be noticed. (a) Θ ∗ at x ∗ = 0 . ∗ at x ∗ = 0 . Figure 3: Comparison between the numerical results (CFD) and the experimental results (Exp Blay et al.)of Θ ∗ at x ∗ = 0 . y ∗ = 0 . As claimed in the earlier section, a data driven approach (algorithm 1) is to be embeddedwithin the GA in order to tackle the severe computational eﬀort due to high ﬁdelity simula-tions. Thereby, the time required for evaluating the ﬁtness of one chromosome passes fromseveral hours to real time, and thus, drastically reducing the time needed for optimization.By using a set of parametrized ﬂow solutions, the goal is to act on the inlet velocity ortemperature in order to minimize the discrete cost functional J (Θ) = 1 N s N s (cid:88) n =1 (cid:90) Ω int (Θ n − ˆΘ n ) dx (8)10here the interior subdomain represented in ﬁgure 1 is considered such that Ω int = [0 . , . × [0 . , . n refers to the time instant, Θ n the calculated temperature andˆΘ n the target temperature.A set of training simulations, based on the conﬁguration presented in ﬁgure 1, for diﬀerentvalues of inlet velocity U and inlet temperature θ were performed with OpenFOAM over thetime interval [0 , t f ]. For all the cases considered in this paper, at t = 0 s, the temperaturein the cavity is equal to θ cold , and the velocity to 0. The ﬁnal time instant t f was chosenin such a way that the temporal evolution of the temperature in the center of the cavitydid not vary according to time. For all simulations, t f = 1250 s was a suﬃciently longtime interval. 1000 snapshots uniformly spaced in the time interval [0 , t f ] are then used tobuild the temperature POD decompositions, where the maximal POD truncation order q isinitially set to 60. Two series of tests are carried out : Test Series 1 : the optimization is performed by ﬁxing the inlet temperature θ = 15 ◦ C andvarying the inlet velocity U . The following three values of U are considered for the trainingphase : 0 . m/s , 0 . m/s and 0 . m/s . Knowing the temperature ˆΘ in the subdomainΩ int , the aim is to determine by applying DDGA, the corresponding inlet velocity ˆ U withvalues : 0 . m/s , 0 . m/s , 0 . m/s , 0 . m/s , 0 . m/s and 0 . m/s . Recall thatbesides the inlet velocity U , the space of search by DDGA is enriched by the order of trun-cation of the POD decompositions m , and the number of temporal and spatial neighboringsubspaces ne t and ne x , selected to perform the barycentric interpolation. For this case, theDDGA is allowed to search in the following space K = (cid:8) ( U, ne t , ne x , m ) ∈ R + × N , . ≤ U ≤ . ≤ ne t , ne x ≤ ≤ m ≤ q (cid:9) In order to analyze the performance of the method proposed in this paper, it is interestingto have a look at the isovalues of temperature and velocity magnitude in the cavity for thethree training values of inlet velocity (see ﬁgures 4). At the beginning of all simulationspresented in this paper, the air in the vicinity of the hot ﬂoor is warmed by thermal diﬀusion,and it is then lifted by natural convection along the hot ﬂoor (one can notice small thermalplumes at the beginning of all simulations). For an inlet velocity between 0.51 and 0.798m/s, and an inlet temperature value of 15 o C , a clockwise recirculation region is generatedby the combined eﬀects of the forced convection induced by the air injection, and of thenatural convection which occurs along the hot ﬂoor. Test Series 2 : in this case, the optimization is performed by acting on the inlet temperature θ while the inlet velocity is set to the ﬁxed value 0 . m/s . The considered training injectiontemperature values are : 5 ◦ C , 10 ◦ C , 15 ◦ C , 20 ◦ C and 25 ◦ C . As in test series 1, the aim isto use DDGA to approximate the optimal inlet temperature ˆ θ with values : 7 . ◦ C , 12 . ◦ C ,17 . ◦ C and 22 . ◦ C associated to the known temperature distribution ˆΘ. The space of searchby DDGA in this case is given by K = (cid:8) ( θ, ne t , ne x , m ) ∈ R + × N , ≤ θ ≤

25; 2 ≤ ne t , ne x ≤ ≤ m ≤ q (cid:9) For this test series, let us have a look at the isovalues of temperature and velocity magnitudeobtained for the inlet temperatures of 5 o C , 15 o C and 25 o C (see ﬁgures 5). For smallinlet temperatures ( θ = 5 o C ), the air in the upper left part of the cavity, which is toocold, falls along the left wall. It is then warmed by the hot ﬂoor, and lifted by naturalconvection with a counterclockwise motion along the hot ﬂoor. For higher inlet temperatures( θ = 15 o C ), the air in the upper part of the cavity is warm and the clockwise motion of a largerecirculation region induced by the combined eﬀects of the forced convection phenomenonand the natural convection phenomenon along the hot ﬂoor can be seen. For the highesttemperature velocities ( θ = 25 o C ), the injected air is hot, it remains in a large region alongthe ceiling, it falls along the left and right cold walls, and is lifted along the heated ﬂoor,inducing two recirculation regions, a clockwise one in the right part of the cavity, and acoutnterclockwise one in the left part of the cavity. For this second series of training tests,11t can be concluded that for various inlet velocities, the ﬂow regimes are diﬀerent from eachother.In the numerical experiments of DDGA, a population of 20 chromosomes formed by 4 genesrandomly generated in K is used as initial guess to run the DDGA. The algorithm is allowedto run until a maximum number of iterations predetermined by the user is reached. Themaximum number of iterations here is set to 30. Figure 4: Temperature distribution at three time instants t = 8 . s (left), t = 55 s (middle) and t = 1250 s (right) for the case of variable inlet temperature In the following, the results of the inverse parameter identiﬁcation problem involvingthe turbulent mixed convection ﬂow are presented and analyzed. The decay of the averagedfunctional for the cases of variable inlet temperature and variable inlet velocity is plotted inﬁgure 6. It shows that after successive generations, the averaged cost decreases and tendsto stagnate, meaning that the populations contain a chromosome of high recurrence. Thischromosome is eventually considered as the best individual that approximates the soughtoptimum of the inverse problem. The outputs of this best chromosome from the last gener-ation are listed in Table 1 and Table 2. It can be seen that the DDGA succeeded to recoverapproximations ˜ U and ˜ θ of the sought optimal inlet values ˆ U and ˆ θ with good accuracies.Moreover, The L percentage of error over the simulation time interval between the targettemperature and the solution obtained by DDGA for all the cases was less than 0 .

8% (seeﬁgure 7). Figures 8 and 9 show the target temperature solutions side by side with the recon-structed temperature solutions obtained at the end of DDGA. The odd columns show theﬁrst appearance of the thermal plumes that emerge from the heated bottom wall of the cav-ity, while the even columns represent the temperature distribution in its established regime.From a visual perspective, it can be seen that the approached solutions by DDGA are in12 igure 5: Temperature distribution at three time instants t = 8 . s (left), t = 55 s (middle) and t = 1250 s (right) for the case of variable inlet velocity. good agreement with the target high ﬁdelity solutions. The converged DDGA-solution suc-ceeded to track the provided target temperature catching by that the most of the dynamicsfeatures present in the temperature along the simulation time interval and all over the do-main Ω. More particularly, for the ﬁrst test series which led to similar features but diﬀerentvalues of velocity and temperature, the velocity and temperature values in the cavity areproperly recovered by the method proposed in this paper. It can also be pointed out thatfor the second test series which involved various ﬂow regimes and which was much morecomplex than the ﬁrst case, the new method presented here provided results that showeda good accuracy. Here, the attention of the reader is bounced back to the fact that thePOD truncation order m as well as the neighbors number ne t and ne x , are extremely im-portant parameters of DDGA. These parameters are essentially meant to ensure the goodperformance of the barycentric interpolation inside the DDGA. By analyzing the resultsof tables 1 and 2, we observe that these quantities vary from a test case to another, i.e,variable neighbors number with less than 12 modes were needed to represent the DDGA-optimal ﬂow for the case of inlet velocity, while the case of variable inlet temperature hasmore complicated dynamics and needed at least 25 modes to represent the solution. Thisconﬁrms that besides the ability to locate a global optimum of the inlet problem, the DDGAhas the feature to eliminate the noise that might intervene from further data samples andfrom lower frequency POD modes. Finally, in terms of computational eﬀort, DDGA is veryeﬃcient and performs in near-real time. The overall computational time needed to perform30 generations in a single cluster was less than two minutes. In inverse problems of turbulentﬂows, this represents a tremendous gain in CPU time compared to traditionally used highﬁdelity approaches. 13 igure 6: Evolution of the averaged functional over generations of DDGA for the cases of variable inletvelocity and variable inlet temperature.Figure 7: Percentage of error of the converged temperature solution by DDGA. Sought optimal values Approximated value ne t ne x trunc. order m ˆ U = 0 .

54 ˜ U = 0 .

544 2 2 8ˆ U = 0 .

57 ˜ U = 0 .

578 2 2 10ˆ U = 0 . U = 0 .

59 2 2 10ˆ U = 0 .

67 ˜ U = 0 .

66 3 3 10ˆ U = 0 . U = 0 .

706 2 3 7ˆ U = 0 .

755 ˜ U = 0 .

755 3 3 12

Table 1: Outputs of the optimal control by using DDGA for the case of variable inlet velocity.

Sought optimal values Approximated value ne t ne x trunc. order m ˆ θ = 7 .

50 ˜ θ = 7 .

92 4 4 25ˆ θ = 12 . θ = 12 .

10 2 2 31ˆ θ = 17 . θ = 17 .

85 2 2 38ˆ θ = 22 . θ = 22 .

50 3 4 30

Table 2: Outputs of the optimal control by using DDGA for the case of variable inlet temperature. igure 8: Comparison of the high ﬁdelity and DDGA temperature solutions at two diﬀerent instants of theﬂow, for the case of variable inlet velocity. The odd columns describe the ﬁrst appearance of thermal plumesat t = 8 . s , and the even columns the established regime of the temperature at t = 1250 s .

6. Conclusions

In this paper, we have proposed the data driven optimization approach DDGA by com-bining genetic algorithms and the barycentric interpolation. The barycentric interpolationis presented here as an equation-free approach that allows to learn from trained data so-lutions and predict the evolution of new untrained solutions without any knowledge of thephysics hidden behind. The numerical assessments of DDGA are performed on the inverseproblem involving a turbulent mixed convection problem, where the variation is carried outon the inlet velocity and then on the inlet temperature. We notice that DDGA succeededto track the optimal solutions and to deliver satisfying approximations in less than twominutes. This signiﬁcant gain endorses the great potential of this approach compared to ahigh ﬁdelity based GA that could last for many hours or days.

Acknowledgement

This material is based upon work ﬁnancially supported by CPER BATIMENT DURABLE- Axe 3 ”Qualit´e des Environnement Int´erieurs (QEI)” (P-2017-BAFE-102) and FrenchAstrid ANR MODULO’PI (ANR-16-ASTR-0018 MODUL’O Π).

References

References [1] W. Liu and Q. Chen, “Optimal air distribution design in enclosed spaces using anadjoint method,”

Inverse Problems in Science and Engineering , vol. 23, no. 5, pp. 760–779, 2015.[2] W. Liu, M. Jin, C. Chen, and Q. Chen, “Optimization of air supply location, size,and parameters in enclosed environments using a computational ﬂuid dynamics-basedadjoint method,”

Journal of Building Performance Simulation , vol. 9, no. 2, pp. 149–161, 2016.[3] S. Nabi, P. Grover, and C. Caulﬁeld, “Adjoint-based optimization of displacementventilation ﬂow,”

Building and Environment , vol. 124, pp. 342 – 356, 2017.15 igure 9: Comparison of the high ﬁdelity and DDGA temperature solutions at two diﬀerent instants of theﬂow, for the case of variable inlet temperature. The odd columns describe the ﬁrst appearance of thermalplumes at t = 8 . s , and the even columns the established regime of temperature at t = 1250 s . Computers & Fluids , vol. 194,p. 104313, 2019.[5] J. H. Holland,

Adaptation in Natural and Artiﬁcial Systems . Ann Arbor, MI: Universityof Michigan Press, 1975. second edition, 1992.[6] Y. Xue, Z. J. Zhai, and Q. Chen, “Inverse prediction and optimization of ﬂow controlconditions for conﬁned spaces using a CFD-based genetic algorithm,”

Building andEnvironment , vol. 64, pp. 77 – 84, 2013.[7] T. Dias and L. F. Milanez, “Optimal location of heat sources on a vertical wall with nat-ural convection through genetic algorithms,”

International Journal of Heat and MassTransfer , vol. 49, no. 13, pp. 2090 – 2096, 2006.[8] A. Tallet, C. Allery, and C. Leblond, “Optimal ﬂow control using a POD basedReduced-Order Model,”

Numerical Heat Transfer, Part B , vol. 170, 2016.[9] M. Bergmann, L. Cordier, and J.-P. Brancher, “Optimal rotary control of the cylinderwake using proper orthogonal decomposition reduced-order model,”

Physics of Fluids ,vol. 17, no. 9, pp. 97–101, 2005.[10] D. Amsallem and C. Farhat, “An interpolation method for adapting reduced-ordermodels and application to aeroelasticity,”

AIAA Journal , pp. 1803–1813, 2008.[11] M. Oulghelou and C. Allery, “A fast and robust sub-optimal control approach usingreduced order model adaptation techniques,”

Applied Mathematics and Computation ,vol. 333, pp. 416 – 434, 2018.[12] M. Oulghelou and C. Allery, “Non-intrusive reduced genetic algorithm for near-realtime ﬂow optimal control,”

International Journal for Numerical Methods in Fluids ,https://doi.org/10.1002/ﬂd.4820, 2020.[13] M. Cheng, F. Fang, C. Pain, and I. Navon, “Data-driven modelling of nonlinear spatio-temporal ﬂuid ﬂows using a deep convolutional generative adversarial network,”

Com-puter Methods in Applied Mechanics and Engineering , vol. 365, 2020.[14] D. Xiao, C. Heaney, L. Mottet, F. Fang, W. Lin, I. Navon, Y. Guo, O. Matar, A. Robins,and C. Pain, “A reduced order model for turbulent ﬂows in the urban environment usingmachine learning,”

Building and Environment , vol. 148, pp. 323–337, 2019.[15] O. San, R. Maulik, and M. Ahmed, “An artiﬁcial neural network framework for re-duced order modeling of transient ﬂows,”

Communications in Nonlinear Science andNumerical Simulation , vol. 77, pp. 271–287, 2019.[16] J. Yu, C. Yan, and M. Guo, “Non-intrusive reduced-order modeling for ﬂuid problems:A brief review,”

Proceedings of the Institution of Mechanical Engineers, Part G: Journalof Aerospace Engineering , vol. 233, no. 16, pp. 5896–5912, 2019.[17] S. Ahmed, S. Rahman, O. San, A. Rasheed, and I. Navon, “Memory embedded non-intrusive reduced order modeling of non-ergodic ﬂows,”

Physics of Fluids , vol. 31,no. 12, 2019.[18] R. R. Madadi and C. Balaji, “Optimization of the location of multiple discrete heatsources in a ventilated cavity using artiﬁcial neural networks and micro genetic algo-rithm,”

International Journal of Heat and Mass Transfer , vol. 51, no. 9, pp. 2299 –2312, 2008. 1719] E. Massart, P.-Y. Gousenbourger, N. Son, T. Stykel, and P.-A. Absil, “Interpolationon the manifold of ﬁxed-rank positive-semideﬁnite matrices for parametric model orderreduction: preliminary results,” 08 2019.[20] E. Massart and P.-A. Absil, “Quotient geometry with simple geodesics for the manifoldof ﬁxed-rank positive-semideﬁnite matrices,”

SIAM Journal on Matrix Analysis andApplications , vol. 41, no. 1, pp. 171–198, 2020.[21] M. Oulghelou and C. Allery, “A Riemannian barycentric interpolation : Derivation ofthe parametric unsteady navier-stokes reduced order model,” arXiv:2009.11231 , 2020.[22] V. Kozeny, “Genetic algorithms for credit scoring: Alternative ﬁtness function perfor-mance comparison,”

Expert Systems with Applications , vol. 42, no. 6, pp. 2998 – 3004,2015.[23] S. T. Selvi, S. Baskar, and S. Rajasekar, “Chapter 17 - application of evolutionaryalgorithm for multiobjective transformer design optimization,” in

Classical and Re-cent Aspects of Power System Optimization (A. F. Zobaa, S. H. A. Aleem, and A. Y.Abdelaziz, eds.), pp. 463 – 504, Academic Press, 2018.[24] J. S. Arora, “Chapter 17 - nature-inspired search methods,” in

Introduction to OptimumDesign (Fourth Edition) (J. S. Arora, ed.), pp. 739 – 769, Boston: Academic Press,fourth edition ed., 2017.[25] S. Ali, H. Lu, S. Wang, T. Yue, and M. Zhang, “Chapter two - uncertainty-wise testingof cyber-physical systems,” vol. 107 of

Advances in Computers , pp. 23 – 94, Elsevier,2017.[26] B. V. Kumar, G. Karpagam, and Y. Zhao, “Chapter 9 - evolutionary algorithm withmemetic search capability for optic disc localization in retinal fundus images,” in

Intel-ligent Data Analysis for Biomedical Applications (D. J. Hemanth, D. Gupta, and V. E.Balas, eds.), Intelligent Data-Centric Systems, pp. 191 – 207, Academic Press, 2019.[27] K. Khoo and P. Suganthan, “Evaluation of genetic operators and solution representa-tions for shape recognition by genetic algorithms,”

Pattern Recognition Letters , vol. 23,no. 13, pp. 1589 – 1597, 2002.[28] D. E. Goldberg,

Genetic Algorithms in Search, Optimization and Machine Learning .Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1st ed., 1989.[29] X. Yang, Z. Yang, G. hua Lu, and J. Li, “A gray-encoded, hybrid-accelerated, ge-netic algorithm for global optimizations in dynamical systems,”

Communications inNonlinear Science and Numerical Simulation , vol. 10, no. 4, pp. 355 – 363, 2005.[30] L. Sirovich, “Turbulence and the dynamics of coherent structures : Part I, II and III,”

Quarterly of Applied Mathematics , pp. 461–590, 1987.[31] D. Blay, S. Mergui, and C. Niculae, “Conﬁned turbulent mixed convection in the pres-ence of a horizontal buoyant wall jet,”

ASME Heat Transfer Division , vol. 213, pp. 65–72, 1992.[32] V. Yakhot, V. Orszag, S. Thangam, T. B. Gatski, and C. G. Speziale, “Development ofturbulence models for shear ﬂows by a double expansion technique,”

Physics of FluidsA: Fluid Dynamics , vol. 4, no. 7, pp. 1510–1520, 1992.[33]