Assessment of unsteady flow predictions using hybrid deep learning based reduced order models
Sandeep Reddy Bukka, Rachit Gupta, Allan Ross Magee, Rajeev Kumar Jaiman
Sandeep Reddy Bukka ∗ and Allan Ross Magee † Technology Center for Offshore and Marine, Singapore (TCOMS)
Rachit Gupta ‡ and Rajeev Kumar Jaiman § University of British Columbia, Vancouver
(Dated: September 23, 2020)

In this paper, we present two deep learning-based hybrid data-driven reduced order models for the prediction of unsteady fluid flows. These hybrid models rely on recurrent neural networks (RNNs) to evolve low-dimensional states of unsteady flow. The first model projects the high-fidelity time series data from a finite element Navier-Stokes solver to a low-dimensional subspace via proper orthogonal decomposition (POD). The time-dependent coefficients in the POD subspace are propagated by the recurrent net (closed-loop encoder-decoder updates) and mapped to a high-dimensional state via the mean flow field and the POD basis vectors. This model is referred to as POD-RNN. The second model, referred to as the convolutional recurrent autoencoder network (CRAN), employs convolutional neural networks (CNNs), instead of POD, as layers of linear kernels with nonlinear activations to extract low-dimensional features from flow field snapshots. The flattened features are advanced by a recurrent net (in a closed-loop manner) and gradually up-sampled (via transpose convolutions) to high-dimensional snapshots. Two benchmark problems, the flow past a cylinder and the flow past side-by-side cylinders, are selected as test problems to assess the efficacy of these models. For the problem of flow past a single cylinder, the performance of both models is satisfactory, with CRAN being a bit of an overkill. However, CRAN completely outperforms the POD-RNN model for the more complicated problem of flow past side-by-side cylinders. Owing to the scalability of CRAN, we briefly introduce an observer-corrector method for the calculation of integrated pressure force coefficients on the fluid-solid boundary on a reference grid. This reference grid, typically a structured and uniform grid, is used to interpolate scattered high-dimensional field data as snapshot images. These input images are convenient for training CRAN. This motivates us to further explore the application of CRAN models for the prediction of fluid flows.
Keywords: POD; RNN; CNN; Deep learning; ROM; Autoencoders; Prediction
I. INTRODUCTION
The fluid flows described by the Navier-Stokes partial differential equations are highly nonlinear and multiscale in nature. While commercial and open-source solvers do a great job in the accurate prediction of fluid flows via full-order models (FOMs) based on the partial differential equations, the computational costs associated with parametric sweeps and complex problems can lead to unrealistic computation times. This has in turn led to the development of several low-cost data-driven reduced order models (ROMs), which attempt to replace the high-dimensional (i.e., high-fidelity) solution from the FOM with a low-dimensional (i.e., low-fidelity) solution manifold. The purpose of these low-cost models is to accommodate realistic time scales in computations involving design optimization and real-time control. In this work, we are interested in the development of hybrid data-driven ROMs that can learn the dynamical system well enough to efficiently predict the time series of flow fields using the full-order data.

A large class of ROMs are projection-based: they seek low-dimensional (coarse-grained) representations of an underlying high-dimensional system. The basic assumption behind these models is that a lower-order representation of the higher-dimensional flow field can exist, and that this representation can be identified efficiently within reasonable accuracy. Proper orthogonal decomposition (POD) is one such widely used projection-based technique for reduced order modeling of fluid flows [1], the reason being that the lower-order representations obtained by POD are mathematically optimal for a given dataset. More extensive work on the projection-based approach for building ROMs has been reported in [2]. The idea behind projection-based model reduction is that the Navier-Stokes equations are projected onto the modes obtained primarily by POD, and time integration is performed on the ODEs obtained by projection instead of on the original PDEs arising from the conservation laws. Although the projection-based approach is computationally efficient, it does not account for local spatial variations in the flow and is known to become unstable under certain conditions even for canonical cases. It is also intrusive in nature, requiring extensive modifications to state-of-the-art CFD codes. Projection-based methods face severe computational difficulties when dealing with problems involving dominant flow structures and patterns that propagate over time, such as transport and wave-type phenomena or convection-dominated flows [3]. Several researchers have also explored non-intrusive approaches based on data-driven techniques. The eigensystem realization algorithm (ERA) is one such method, which is purely data-driven and is primarily used for the stability analysis of dynamical systems [4–6].

The ubiquity of machine learning (ML) techniques motivates several researchers in fluid mechanics to implement and adapt them in various applications [7–9]. Deep learning is a subset of ML which refers to the use of highly multilayered neural networks to understand a complex dataset, with the intention of predicting certain features of the dataset. The tremendous success of deep learning [10] in various fields such as imaging, finance and robotics has prompted its potential application to fluid mechanics problems [11, 12]. Recently, convolutional neural networks (CNNs) [13, 14] were used to develop a novel nonlinear modal decomposition method, which performed superior to the traditional POD.
Another aspect is to directly learn the governing equations from higher-order data. This can be of great advantage where knowledge of the physical laws is hidden. There have been several instances in the literature where ML is applied to learn the hidden governing equations of fluid flow directly from field data [15–21]. A relatively new alternative to traditional neural networks, where the derivative of the hidden state is parametrized via a neural network instead of specifying a discrete sequence of hidden layers, is presented in [22]. This new family of deep neural networks was quickly adopted by the research community as a whole, and in fluid mechanics in particular in [23].

There exist many variants of deep architectures in the literature that have been successful in addressing respective flow problems. Deep nets are also a popular choice for real-time fluid simulations in interactive computer graphics; [24] describes one such initial attempt, where regression forests are employed to approximate the fluid flow equations in the Lagrangian description to predict the position and velocity of the particles. Neural networks are employed in the liquid splash problem by learning the regression of splash formation in [25]. A complete data-driven approach based on LSTM networks is proposed in [26] to infer the temporal evolution of pressure fields coming from a Navier-Stokes solver. [27] proposed a generative adversarial network (GAN) for the problem of reconstructing super-resolution smoke flows. Another application of GANs is presented in [28], where the authors developed a novel generative model to synthesize fluid flows from a set of reduced parameters. By the introduction of a novel deformation-aware loss function, the complex behavior of liquids over a parameter range is mapped onto a reduced representation based on space-time deformations in [29].
The accuracy of deep learning models for the inference of RANS solutions is investigated via the state-of-the-art U-net architecture with a large number of trained network architectures in [30]. Bhatnagar et al. [31] used CNNs to predict the velocity and pressure fields in unseen flow conditions and geometries, for a given pixelated shape of the object. Physics-informed neural networks are proposed in [32] by considering the residual of a PDE as the loss function in the training algorithm. An attempt to develop hybrid models combining the capabilities of POD with deep learning models is also described in [11]. Carlberg et al. [33] used autoencoders for reconstructing missing CFD data by using the data saved at a few time instances of high-fidelity simulations.

In addition to applications specific to fluid dynamics, the more general idea of ROMs in nonlinear data-driven dynamics can be understood using ERA, dynamic mode decomposition (DMD) and ML algorithms. In that regard, Koopman theory has gained a lot of traction. It is mainly used for linear control [34, 35] and modal decomposition [36, 37]. An excellent analogy between DMD and Koopman theory is presented in [38]. Marko et al. [39] surveyed the applications of Koopman theory to nonlinear dynamical systems. Building on the mathematical framework of Koopman theory, several other works deploy deep learning-based models for nonlinear dynamical systems [40–44].

Traditional model reduction in fluid mechanics is hence not very different from ML. The motive in either case is to build low-dimensional models. The latter has grown from the computer science community with a focus on creating low-dimensional models from black-box data streams. Owing to their non-intrusive nature, such models have advantages in adaptability and scalability.
Sensors in smart autonomous systems (SASs) act as black-box data streams and provide huge volumes of data. Therefore, the marriage of ML with traditional model reduction techniques such as POD seems like a natural evolutionary process. The advantages of both ML and POD, such as scalability and physics-based modeling, can be harnessed via such hybrid ROMs. The role of POD in such a hybrid model is to calculate low-dimensional projections/features, which can be evolved in time via RNNs. It is common knowledge now that CNNs are very good at extracting features. In the recent work of Miyanawala et al. [45], the authors showcased the advantages of a CNN (as a data-driven ROM) for feature extraction in unsteady wake flow patterns. In the quest for a more accurate ROM, the concept of autoencoders, which have attained great success in image tracking and scene labeling, is explored in a series of papers [46, 47]. A more complex variant of the autoencoder is the convolutional recurrent autoencoder network (CRAN), which is obtained by the integration of a CNN with an RNN. Recently, such models have been applied to the benchmark fluid flow problem of the lid-driven cavity [48]. However, the applicability of hybrid ROMs such as POD-RNN and CRAN to the flow past bluff bodies with massively separated flows is yet to be tested, and there is no literature providing such analysis to the best of the authors' knowledge.

The target of reduced-order models is to process the data coming from sensors and make relevant predictions as required by a particular physical system. To be relevant and useful for the current automation needs of the industry, further developments of ROMs must combine the strength of large volumes of data with the latest state-of-the-art ML algorithms.
The current work is focused on that goal, namely the development of efficient and reliable ROMs for unsteady fluid flow and fluid-structure interaction. There appears to be a major gap in the literature in terms of the broad applicability of hybrid data-driven ROMs such as POD-RNN and CRAN to fluid-structure interaction problems. The novelty of this work lies in testing the ability of hybrid data-driven reduced-order models, namely POD-RNN and convolutional recurrent autoencoder networks (CRANs), on the nonlinear unsteady problems of the flow past bluff bodies. We present comprehensive procedures for the network architectures and training of POD-RNN and CRAN. For the CRAN, the LSTM is trained to construct the reduced dynamics through the autoencoder process. Both DL-based ROMs are purely data-driven and rely on a set of FOM snapshots of the Navier-Stokes equations, instead of attempting to replace the high-fidelity FOM via deep neural networks.

The article is organized as follows. Section II deals with the formulation of the high-dimensional model that is used to generate the data, followed by a mathematical depiction of the forward vs inverse problem approach in section III. The methodology of model reduction via POD-RNN is presented in section IV along with the idea of recurrent neural networks. The mathematical equivalence of autoencoders and POD is discussed in section V. Next, the convolutional recurrent autoencoder models are outlined in section VI with the unsupervised training strategy. The applications of these hybrid deep learning models to the flow past a cylinder and the flow past side-by-side cylinders are discussed in sections VII and VIII, respectively. Finally, concluding remarks are provided in section IX.

II. HIGH-DIMENSIONAL MODEL
A numerical scheme implementing Petrov-Galerkin finite elements and semi-discrete time stepping is adopted in the present work [49, 50]. Let Ω^f(t) ⊂ R^d be an Eulerian fluid domain at time t, where d is the space dimension. The motion of the incompressible viscous fluid in Ω^f(t) is governed by the incompressible Navier-Stokes equations

ρ^f (∂u^f/∂t + u^f · ∇u^f) = ∇ · σ^f + b^f on Ω^f(t), (1)
∇ · u^f = 0 on Ω^f(t), (2)

with the boundary conditions

u^f = u^f_D ∀ x^f ∈ Γ^f_D(t), (3)
σ^f · n^f = h^f ∀ x^f ∈ Γ^f_N(t), (4)
u^f = u^f_0 on Ω^f(0), (5)

where u^f = u^f(x, t) represents the fluid velocity, and u^f_D and h^f are the Dirichlet and Neumann boundary conditions on Γ^f_D(t) and Γ^f_N(t), respectively. Note that n^f is the unit normal on Γ^f_N(t) and u^f_0 is the initial condition. In Eq. (1) the partial time derivative is taken with respect to the Eulerian referential coordinate system. Here b^f represents the body force per unit mass and σ^f is the Cauchy stress tensor for a Newtonian fluid, which is defined as

σ^f = −p I + μ^f (∇u^f + (∇u^f)^T), (6)

where p, μ^f and I are the hydrodynamic pressure, the dynamic viscosity of the fluid, and the identity tensor, respectively.

A rigid-body structure submerged in the fluid experiences unsteady fluid forces and consequently may undergo flow-induced vibrations. In this study, we focus on static bodies and hence do not model the solid movement. The fluid-solid boundary Γ^fs(t) is modeled as a no-slip Dirichlet boundary condition, Eq. (3), with u^f_D = 0. The fluid force along the fluid-solid boundary is computed by integrating the surface traction, from the Cauchy stress tensor, over the first boundary-layer elements on the fluid-solid surface. At a time instant, the drag and lift force coefficients C_D and C_L are given by

C_D = (1 / (½ ρ^f U∞² D)) ∫_Γfs (σ^f · n) · n_x dΓ,
C_L = (1 / (½ ρ^f U∞² D)) ∫_Γfs (σ^f · n) · n_y dΓ, (7)

where ρ^f, U∞ and D are the fluid density, reference velocity and reference length, respectively. For the unit normal n of the surface, n_x and n_y are its x and y Cartesian components, respectively.

The weak variational form of Eq. (1) is discretized in space using P1/P1 isoparametric finite elements for the fluid velocity and pressure. The second-order backward-differencing scheme is used for the time discretization of the Navier-Stokes system [51]. A partitioned staggered scheme is considered for the full-order simulations of fluid-structure interaction [52]. The above coupled variational formulation completes the presentation of the full-order model for high-fidelity simulation. The employed in-house FSI solver has been extensively validated in [53, 54].

III. FORWARD VS INVERSE PROBLEM
In the previous section, the numerical methodology for the full-order model is outlined. The nonlinear partial differential equations are derived from the principles of conservation, and the numerical discretizations are carried out via a finite element procedure. The above approach can also be termed the forward problem for the solution of a physical system. Given the model differential equations with appropriate initial/boundary conditions, the forward problem solved by numerical PDE schemes can provide the physical predictions.

The forward problem for the general nonlinear dynamics of a system can be written in an abstract form as

dS/dt = F(S, U),    Y = G(S, U), (8)

where S is the state of the system, which can be the velocity and pressure for the fluid flow problem, F contains the information on the evolution of the state, U is the input variable to the system, Y represents the observable quantities of interest such as forces, and G is a nonlinear function which maps the state and input of the system to the quantities of interest. The above equation results in a nonlinear partial differential equation for the fluid flow system described in the previous section. The goal of the forward problem is to compute the functions F and G via the principles of conservation and numerical discretizations. The discrete form of the above equation can be written as

S^{n+1} = F(S^n, U^n),    Y^n = G(S^n, U^n). (9)

An alternative to the above approach is to build the functions F and G via projection-based model reduction, system identification and machine learning methods. This approach is termed the inverse problem. Inverse problems begin with the available data and aim at estimating the parameters of the model. The inference in this inverse problem is purely based on the data, and in such methods the principles of conservation may be completely ignored. The inverse problem aims to construct close approximations F̃, G̃ to the functions F, G for a given dataset U = {U^1, . . . , U^n}, S = {S^1, . . . , S^n} and Y = {Y^1, . . . , Y^n}. The general philosophy of the inverse problem is to obtain the functions F̃, G̃ by minimizing the loss between the true data and the data predicted by the model:

‖F(S, U) − F̃(S, U)‖ → 0,    ‖G(S, U) − G̃(S, U)‖ → 0, (10)

where the data coming from F, G are considered as the ground truth. In practice, the ground truth data are obtained via the forward problem (numerical simulations), experiments and field measurements. The ground truth data are obtained from numerical simulations in the current work. The functions F̃ and G̃ represent the approximate (surrogate) reduced models which generate the predicted data. In this paper, we investigate the application of hybrid inverse-problem models such as POD-RNN and CRAN on the canonical problems of the flow past a plain cylinder and the flow past side-by-side cylinders. In the upcoming sections, more mathematical details on the methodologies of the inverse problem are presented.

IV. MODEL REDUCTION VIA POD-RNN
In the current section, we present a data-driven approach to construct a reduced-order model (ROM) for unsteady flow field prediction and fluid-structure interaction. The application of machine learning in reduced-order modeling of the dynamics of fluid flow has garnered great interest in the research community. Machine learning methods are likewise utilized to model the temporal evolution of the low-dimensional projections obtained from a subspace estimation of the full state. Lately, such techniques have been used to build end-to-end models that learn both the low-dimensional projections and their temporal evolution. The current work was motivated by, and therefore has significant similarities to, this past work in terms of learning the low-dimensional representations as well as their temporal evolution. (The majority of the content in this section has been presented in our previous work [55, 56]; it is outlined here again for the sake of completeness.)

The main idea in these relatively new approaches, and in traditional projection-based model reduction in general, is the following dual-step process:

• The calculation of a low-dimensional subspace Φ embedded in R^N on which the majority of the data is represented. This yields, in some sense, an ideal low-dimensional representation A = f(S) of the data S in terms of the intrinsic coordinates on Φ, and

• The identification of a transient model which effectively evolves the low-dimensional representation A on the manifold Φ.

Fig. 1 depicts the overall framework of the data-driven reduced-order model based on the above process. The customary methodology of model reduction by means of projection-based techniques falls short in situations where the underlying equations do not exist. The approach of directly modeling the temporal evolution of the low-dimensional representations of the state variable has a great advantage in such situations.
A straightforward yet resourceful model for such a methodology is to describe the evolution of the low-dimensional representations, acquired for instance through POD, using an RNN:

A^{n+1} = H(A^n), (11)

where the representation vector A ∈ R^{N_A} is of much smaller size than the state from which it is learned. This technique has previously been investigated in the context of low-dimensional modeling where A is obtained through POD [11, 57], and in the more general case where Eq. (11) may model the dynamic behavior of chaotic [58] or turbulent systems [59]. In the current section, we first explore the POD-RNN approach to model reduction. This proposed approach relies on two components:

• A projection of the high-dimensional data from the Navier-Stokes equations to a low-dimensional subspace using the proper orthogonal decomposition (POD), and

• Integration of the low-dimensional model with recurrent neural networks for the temporal evolution.

For the hybrid ROM formulation, we consider the long short-term memory network, which is a special variant of recurrent neural networks. The mathematical structure of recurrent neural networks embodies a nonlinear state-space form of the underlying dynamical behavior. This particular attribute of an RNN makes it suitable for nonlinear unsteady flow problems. In the proposed hybrid RNN method, the spatial and temporal features of the unsteady flow system are captured separately. Time-invariant modes obtained by the low-order projection embody the spatial features of the flow field, while the temporal behavior of the corresponding modal coefficients is learned via recurrent neural networks.

The purpose of the reduced-order model is to predict the unknown future flow fields efficiently without solving the governing differential equations. At first, proper orthogonal decomposition is used for estimating the low-dimensional representation of the flow field. Time-invariant spatial modes obtained from the POD capture the dominant flow patterns.
The temporal evolution of these modes is contained in the modal coefficients. Recurrent neural networks are then employed to learn the temporal behavior of the dominant modes obtained from POD. The coupling of the traditional POD and the state-of-the-art RNN makes this approach very effective.

FIG. 1: Data-driven reduced order model for prediction of flow field
A. Proposed architecture
The essential and initial component of the POD-RNN model is a high-fidelity series of snapshots of the flow field, which can be obtained from full-order simulations, experiments or field measurements. Snapshot POD is used to extract the most significant modes and their corresponding POD coefficients.

Let S = {S^1 S^2 . . . S^n} ∈ R^{m×n} be the flow field dataset, where S^i ∈ R^m is the flow field snapshot at time t^i and n is the number of snapshots. Here m is the number of data points, for example the number of probes in an experiment or field measurement, or the number of mesh nodes in a numerical simulation. The target is to predict the future flow fields S^{n+1}, S^{n+2}, . . . using the dataset S. The proposed POD-RNN technique can be divided into two main steps, which are as follows:

• Generate the proper orthogonal decomposition (POD) basis for the dataset S

Using the n snapshots given by S, the mean field (S̄ ∈ R^m), the POD modes (Φ ∈ R^{m×k}) and the time-dependent POD coefficients can be calculated such that

S^i ≈ S̄ + Φ A^i, i = 1, 2, . . . , n, (12)
S̃^i ≈ Φ A^i, i = 1, 2, . . . , n, (13)

where S̃ is the fluctuation matrix, obtained by removing the mean S̄ from the instantaneous values S^i. The POD modes are extracted using the k largest eigenvalues Λ = diag[λ_1, λ_2, . . . , λ_k] and the corresponding eigenvectors W = [w_1, w_2, . . . , w_k] of the covariance matrix S̃^T S̃ ∈ R^{n×n}, given by

S̃^T S̃ w_j = λ_j w_j. (14)

FIG. 2: Illustration of POD: Mode extraction of high-dimensional data

Here, the POD modes Φ = [v_1, v_2, . . . , v_k] are related to Λ and W by

Φ = S̃ W Λ^{−1/2}, (15)

where A^i = [a^i_1 a^i_2 . . . a^i_k] ∈ R^k are the time coefficients of the first k significant modes at time t^i. The temporal coefficients of the linear combination are determined by the L² inner product between the fluctuation matrix and the POD modes as follows:

A^i = ⟨S̃^i, Φ⟩. (16)

The order k can be estimated via the mode energy distribution.
The order of the flow field is thus reduced from m to k via POD. The next step is to predict the matrix A^{n+1}. Fig. 2 illustrates the mode extraction from a set of snapshots, resulting in modes and their corresponding coefficients with temporal behaviour.

• Train the recurrent neural network
Recurrent neural networks (RNNs) are employed to learn the temporal behaviour of the POD modes. They are generally used in sequence-to-sequence prediction problems. The architecture of RNNs makes it feasible to predict multiple future time steps at a time. More details on recurrent neural networks are outlined in the upcoming discussions in this section. The purpose of an RNN is to learn a nonlinear function H which maps the evolution of the modal coefficients, i.e.,

[A^{n+s}, A^{n+s−1}, . . . , A^{n+1}] = H(A^n, A^{n−1}, . . . , A^{n−p}), (17)

where s and p are the output and input sizes of the recurrent neural network. Fig. 3 illustrates the prediction of the modal coefficients.

Combining Eqs. (13) and (17), we obtain the relationship of a hybrid POD-RNN model in an abstract form:

S̃^{i+1} = Φ H(A^i). (18)

FIG. 3: Illustration of prediction of modal coefficients

The time series of POD modal coefficients is divided into training and testing sets. The network coefficients are obtained by following the standard learning procedure of training and testing.

The problem of predicting the evolution of the flow field is now translated into the problem of tracking the temporal evolution of the POD modal coefficients. This can be categorized under sequence modeling problems in machine learning. Persistence and retention of information are required of neural networks dealing with sequence prediction problems, which therefore differ from other canonical learning problems in machine learning. Traditional neural networks do not have a mechanism for persistence and retention of information. Recurrent neural networks are specifically designed to address the issue of retention of information while training and making physical predictions. The internal state of the model is preserved and propagated through the addition of a new dimension in recurrent neural networks. Although recurrent neural networks have been successful, several stability issues have been reported.
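The POD construction in Eqs. (12)-(16) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name `snapshot_pod` and the synthetic two-mode snapshot matrix are our own stand-ins for full-order data.

```python
import numpy as np

def snapshot_pod(S, k):
    """Snapshot POD of a data matrix S (m x n): mean field, k modes, coefficients."""
    S_mean = S.mean(axis=1, keepdims=True)        # mean field, Eq. (12)
    S_tilde = S - S_mean                          # fluctuation matrix
    C = S_tilde.T @ S_tilde                       # n x n covariance matrix
    lam, W = np.linalg.eigh(C)                    # eigenpairs in ascending order
    lam, W = lam[::-1][:k], W[:, ::-1][:, :k]     # keep k largest, Eq. (14)
    Phi = S_tilde @ W / np.sqrt(lam)              # modes Phi = S~ W Lam^(-1/2), Eq. (15)
    A = Phi.T @ S_tilde                           # modal coefficients, Eq. (16)
    return S_mean, Phi, A

# synthetic snapshots: two travelling spatial structures on a 1-D "mesh"
x = np.linspace(0.0, 2.0 * np.pi, 200)[:, None]
t = np.linspace(0.0, 10.0, 80)[None, :]
S = np.sin(x) * np.cos(2 * t) + 0.3 * np.cos(3 * x) * np.sin(5 * t)

S_mean, Phi, A = snapshot_pod(S, k=2)
S_rec = S_mean + Phi @ A                          # reconstruction as in Eq. (12)
print(np.allclose(S, S_rec, atol=1e-8))           # True: two modes suffice here
```

Because the synthetic field contains exactly two spatial structures, k = 2 reconstructs it to numerical precision; for real flow data, k is chosen from the mode energy distribution as described above.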
The vanishing gradient is the most common among them; such networks also cannot learn long-term dependencies in the data. To address the issue of long-term dependencies in the data, long short-term memory (LSTM) networks are utilized in the present work. The LSTM networks are designed with the default characteristic of retaining information for longer periods and being able to use it for prediction depending on its relevance and context. For example, they can be shown one observation at a time from a sequence and can learn which of the previously seen observations are relevant, and how they can be used to make a prediction [12]. The fundamental architecture of the LSTM network is now illustrated. LSTM networks differ from other deep learning architectures such as convolutional neural networks (CNNs) in that the typical LSTM cell contains three gates to control the flow of information. The LSTM manages the flow of information through these gates by selectively adding information, eliminating it, or letting it through to the following cell.

The input gate is represented by i, the output gate by o and the forget gate by f. The cell state is represented as X and the cell output is given by Y, while the cell input is denoted as A. The equations to compute its gates and states are as follows:

f_t = σ(W_f · [Y_{t−1}, A_t] + b_f),
i_t = σ(W_i · [Y_{t−1}, A_t] + b_i),
X̃_t = tanh(W_c · [Y_{t−1}, A_t] + b_c),
X_t = f_t ∗ X_{t−1} + i_t ∗ X̃_t,
o_t = σ(W_o · [Y_{t−1}, A_t] + b_o),
Y_t = o_t ∗ tanh(X_t), (19)

where the W are the weights for each of the gates and X̃ is the updated (candidate) cell state. These states are propagated through the model and the parameters are updated by backpropagation through time. The forget gate plays a critical part in dampening over-fitting by not retaining all the information from past time steps.

FIG. 4: Cell structure of long short-term memory network (LSTM)

Fig. 4 portrays the schematic of an LSTM cell.
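The gate equations (19) map directly to code. Below is a minimal NumPy sketch of a single LSTM cell rolled over a short sequence; the weights are random and untrained, and the sizes `n_a`, `n_h` are illustrative assumptions of ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(A_t, Y_prev, X_prev, W, b):
    """One LSTM step following Eq. (19); W, b hold the four gate parameters."""
    z = np.concatenate([Y_prev, A_t])            # concatenated [Y_{t-1}, A_t]
    f = sigmoid(W["f"] @ z + b["f"])             # forget gate
    i = sigmoid(W["i"] @ z + b["i"])             # input gate
    X_tilde = np.tanh(W["c"] @ z + b["c"])       # candidate cell state
    X = f * X_prev + i * X_tilde                 # cell-state update
    o = sigmoid(W["o"] @ z + b["o"])             # output gate
    Y = o * np.tanh(X)                           # cell output
    return Y, X

rng = np.random.default_rng(0)
n_a, n_h = 4, 8                                  # modal-coefficient and hidden sizes
W = {g: 0.1 * rng.standard_normal((n_h, n_h + n_a)) for g in "fico"}
b = {g: np.zeros(n_h) for g in "fico"}

Y, X = np.zeros(n_h), np.zeros(n_h)
for A_t in 0.1 * rng.standard_normal((5, n_a)):  # roll the cell over 5 time steps
    Y, X = lstm_cell(A_t, Y, X, W, b)
print(Y.shape)                                   # (8,)
```

In practice the weights W and biases b are learned by backpropagation through time, as noted above.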
This design of gates and selective information control is also the key reason why LSTMs do not suffer from the vanishing gradient issue which plagued traditional RNNs. Thus, LSTMs are a powerful tool for modeling non-stationary datasets. Further details regarding the LSTM can be found in [60].

The nonlinear state-space form of the LSTM cell for the current problem can be written as

X(k+1) = F(X(k), A(k)),
A(k+1) = G(X(k+1)), (20)

where X is the cell state and A is the matrix of modal coefficients whose time history has to be predicted. F and G represent the nonlinear functions of the LSTM cell given in Eq. (19) in a mathematically compact form. Eq. (20) shares a similar structure with Eq. (9), which is obtained as the discrete form of the nonlinear dynamical system. The only difference between them is the output of the model, which in the current case is the future of the variable A that is to be predicted. In the current work, we explore two variants of recurrent neural networks:

• Closed-loop recurrent neural network
• Encoder-decoder recurrent neural network

FIG. 5: Schematic of closed-loop recurrent neural network

A schematic of a closed-loop recurrent neural network is depicted in Fig. 5. It takes one single input, predicts the output, and this value is taken as the input for the second prediction. In this manner, the architecture forecasts an entire sequence of desirable length from one single input. This approach has a great advantage since it does not require knowledge of the true data for online prediction. However, it is limited to simple problems such as the flow past a plain cylinder; it is ineffective for the more complex problem of the flow past side-by-side cylinders.

In the quest for more efficient recurrent neural networks which can model the dynamics of a complex system such as the flow past side-by-side cylinders, the encoder-decoder networks are found to be suitable.
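The closed-loop update of Eq. (20) and Fig. 5 can be sketched as a rollout that feeds each prediction back as the next input. The trained LSTM is replaced here by a hypothetical linear rotation, a stand-in for periodic modal dynamics, and `closed_loop_rollout` is our naming, not the paper's.

```python
import numpy as np

def closed_loop_rollout(step, A0, X0, n_steps):
    """Feed each prediction back as the next input (Fig. 5 / Eq. (20))."""
    A, X, history = A0, X0, []
    for _ in range(n_steps):
        X = step["F"](X, A)      # cell-state update  X(k+1) = F(X(k), A(k))
        A = step["G"](X)         # output map         A(k+1) = G(X(k+1))
        history.append(A)
    return np.stack(history)

# stand-in dynamics in place of a trained LSTM: a plane rotation of the
# two "modal coefficients" (purely illustrative, not the trained model)
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
step = {"F": lambda X, A: R @ A, "G": lambda X: X}

traj = closed_loop_rollout(step, A0=np.array([1.0, 0.0]), X0=None, n_steps=64)
print(traj.shape)                # (64, 2): a periodic orbit from one seed input
```

For limit-cycle wake flows the modal coefficients are nearly periodic, which is why a single seed input can be enough for this architecture.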
The encoder-decoder architectures with long short-term memory networks are widely adopted for both neural machine translation (NMT) and sequence-to-sequence (seq2seq) prediction problems. The primary advantage of the encoder-decoder architecture is the ability to handle input and output sequences of text with variable lengths. A single end-to-end model can also be trained directly on the source and target sentences.

FIG. 6: Encoder-decoder network for neural machine translation task

A schematic of the encoder-decoder network for a machine translation example is given in Fig. 6. It takes an input sequence of variable length and encodes the whole information through a series of LSTM cells. There is no direct output in the encoding process. The output is generated via a decoder network through a series of LSTM cells. The input and output time series can have multiple features: the time history of a few dominant modes can be used together as multi-featured data to train the network. In the decoding process, the state and output generated by the previous cell are taken as input for the LSTM cell.

In the present work, we use this architecture for the time series prediction of the POD modal coefficients (in the side-by-side cylinders case), as shown in Fig. 7, where the input sequence is the known time history of the modal coefficients. The unknown time history of the modal coefficients in the future is generated as the output sequence. The output generated by the decoder network can be used as input in a moving-window fashion for the next encoding procedure. In this way, a self-sufficient learning model can be generated which predicts the time history of the modal coefficients for a given known time sequence, and thereby the prediction of the entire flow field can be achieved. The process of encoding and decoding can be represented mathematically in a compact form using Eqs. (20)-(22).

The encoding process can be represented as

X(1) = F(X(0), A(1)),
X(2) = F(X(1), A(2)),
...
X(p) = F(X(p−1), A(p)),
X_enc = X(p), (21)

where p is the length of the input time history and X_enc is the encoded state which contains the information pertaining to the input sequence. The decoding process can be represented as

X(p+1) = F(X_enc, A(p)),
A(p+1) = G(X(p+1)),
X(p+2) = F(X(p+1), A(p+1)),
A(p+2) = G(X(p+2)),
...
X(p+s) = F(X(p+s−1), A(p+s−1)),
A(p+s) = G(X(p+s)), (22)

where s is the length of the output sequence to be predicted.

FIG. 7: Encoder-decoder network for time series prediction

V. AUTOENCODERS AND POD
In this section, we attempt to establish a mathematical equivalence between the idea of autoencoders and proper orthogonal decomposition. This particular similarity between the two methods motivated us to choose convolutional recurrent autoencoder networks as the next natural avenue for building our data-driven reduced-order model.

The manifold hypothesis states that any set of real-world high-dimensional data is spanned by an optimal subspace Φ which is embedded in R^N, where N is large. This hypothesis forms the core of data-driven science in terms of dimensionality reduction, where an approximate low-dimensional representation is constructed to describe the dynamics of high-dimensional data. This particular aspect has contributed to the widespread application of data-driven methods across multiple areas. POD is one of the major dimensionality reduction techniques in this field due to its simplicity and trivial computational effort. However, it has the significant drawback of constructing only an optimal linear subspace. This is critical since data sampled from complex, real-world systems are, as a rule, strongly nonlinear. A wide variety of approaches for more accurate modeling of Φ have been developed over the years, most involving a patchwork of local subspaces obtained through linearizations or higher-order approximations of the state-space.

An under-complete autoencoder can be viewed as a nonlinear generalization of POD. It consists of a multi-layer encoder network

A = E(S; θ_E), (23)

where S ∈ R^N is the system state, A ∈ R^{N_A} is the low-dimensional embedding, and N_A < N. A decoder network is then used to reproduce S by

Ŝ = D(A; θ_D). (24)

The parameters of the above network are obtained by a training procedure which minimizes the expected reconstruction error over all training examples:
θ*_E, θ*_D = arg min_{θ_E, θ_D} [L(Ŝ, S)], (25)

where L(Ŝ, S) is some measure of discrepancy between S and its reconstruction Ŝ. The autoencoder is prevented from learning the identity function by the restriction N_A < N.

The choices of E, D and L(Ŝ, S) depend significantly on the problem at hand. Indeed, if one chooses a single-layer linear encoder and decoder of the form

A = W_E S,  Ŝ = W_D A, (26)

where W_E ∈ R^{N_A × N} and W_D ∈ R^{N × N_A}, then with a squared reconstruction error

L(Ŝ, S) = ‖S − Ŝ‖² = ‖S − W W^T S‖², (27)

the autoencoder will obtain the same embedding as the one represented by the first N_A POD modes if W = W_D = W_E^T. However, without the additional constraint on W, i.e., W^T W = I_{N_A × N_A}, the columns of W do not form an orthonormal basis.

VI. MODEL REDUCTION VIA CRAN
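The POD-autoencoder connection of Eqs. (26)-(27) can be verified numerically: projecting snapshots onto the first N_A left singular vectors attains the optimal reconstruction error, which is the content of the Eckart-Young theorem. The snapshot matrix below is synthetic, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(30, 200))  # snapshot matrix: N = 30 states, 200 samples

# POD modes: left singular vectors of the snapshot matrix
U, sv, _ = np.linalg.svd(S, full_matrices=False)
W = U[:, :4]                    # W = W_D = W_E^T with N_A = 4 (Eq. 26)

# Tied-weight linear autoencoder: S_hat = W W^T S
S_hat = W @ (W.T @ S)

# Squared reconstruction error (Eq. 27) equals the energy in the
# truncated singular values, i.e. the optimal rank-N_A error
err = np.linalg.norm(S - S_hat) ** 2
opt = np.sum(sv[4:] ** 2)
```

Any other choice of a rank-4 linear encoder/decoder pair can only match, never beat, this error.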
In the previous section, we introduced an overall framework for data-driven reduced-order modeling. While POD-based model reduction is optimal and can generate physically interpretable modes, it is not adequate for practical problems. In such cases, the number of prominent modes increases and their computation can pose serious challenges. As will be shown in the application sections, 27 modes are required to capture the dominant energy of the side-by-side cylinders, whereas only 4 modes were required for the flow past a plain cylinder. Similarly, it is shown in [62] that 123 modes were required for the flow past a square cylinder at a higher Reynolds number. (The majority of the content in this section has been presented in our previous work [56, 61]; it is outlined here again for the sake of completeness.)

FIG. 8: Schematic of convolutional recurrent autoencoder network (CRAN)

In this framework, a nonlinear encoder network first compresses the high-dimensional state,

A = E(S), (28)

where S ∈ R^N is the high-dimensional state of the system, A ∈ R^{N_A} with N_A < N, and E is the nonlinear encoder network. A linear recurrent model K is used to evolve the low-dimensional features,

A^{n+1} = K A^n, (29)

where K ∈ R^{N_A × N_A}. This approach was first introduced in the context of learning a dictionary of functions used in extended dynamic mode decomposition to approximate the Koopman operator of a nonlinear system [63].

Fully-connected autoencoders are used in many applications for dimensionality reduction. However, the high dimensionality of DNS-level input data can act as a major bottleneck for their application. The high-dimensional input data must be flattened into a 1D array for a fully-connected autoencoder, which can destroy the local spatial relations in the data. In other words, the dominant flow features often found in FSI simulations might be ignored or might not be captured efficiently. Therefore, the direct application of fully-connected autoencoders on DNS-level input data is generally prohibitive. The current model seeks to extract the dominant flow features present in most fluid flow and fluid-structure interaction data through the use of CNNs. Specifically, instead of applying a fully-connected autoencoder directly on the high-dimensional numerical or physical data, it is applied on a low-dimensional vector obtained from the feature extraction of a deep CNN acting directly on the high-dimensional data.

A. Proposed architecture
A 12-layer convolutional autoencoder model is constructed as shown in Fig. 9. The first 4 layers together form the convolution part of the model. The two-dimensional input S ∈ R^{N_x × N_y}, with N_x = N_y = 64, is first passed through this convolutional block. All the layers in the convolution block employ a filter bank, with the number of filters increasing from 4 in the first layer to 32 in the fourth layer. A stride of 2 and a padding of 1 are maintained across all layers.

FIG. 9: Schematic of the 12-layer convolutional recurrent autoencoder model

TABLE I: Filter sizes and strides of the layers in the convolutional autoencoder network. Layers 1 to 4 correspond to the encoder part and layers 9 to 12 correspond to the decoder part

The process of transpose convolution is carried out in the last four layers of the network. The transpose convolution is not exactly the reverse of a regular convolution, but it up-samples the low-dimensional representations gradually across four layers to reconstruct the high-dimensional state. The output size of a transpose convolutional layer is exactly the inverse of that of a regular convolutional layer and is given by N_o = (N_i − 1)s + f − 2P, where s is the stride, f is the filter size, P is the padding, N_i is the input size and N_o is the output size. Table I outlines the architecture of the convolutional encoder and decoder subgraphs. In this work, we consider the sigmoid activation function σ(s) = 1/(1 + e^{−s}) for each layer of the CRAN model.

A 4-layer fully-connected autoencoder with a plain vanilla feed-forward neural network is also used in the architecture. The first two layers perform the encoding by taking the vectorized form of the 32 feature maps from the final layer of the convolution block and return a one-dimensional vector A ∈ R^{N_A}. The last two layers decode the evolved feature vector into a higher-dimensional vector which is reshaped into feature maps and passed into the transpose convolution block. Fig. 9 details the whole architecture with the changes in the size of the data as it flows through the entire model.

The key advantage of the convolutional autoencoder is its ability to scale to problems with larger dimensions. The local coherent spatial structures of the problem at hand are utilized for reducing the model in a nonlinear fashion.

B. Evolution of features
Now we proceed to model the evolution of the low-dimensional features A in a computationally efficient manner. From an analysis viewpoint, it is beneficial to identify the linear dynamics of A; however, we consider the general case of learning the feature dynamics in a nonlinear setting.

Consider a set of observations {s^n}, n = 1, . . . , m, with s^n ∈ R^N, obtained from a CFD simulation or through experimental sampling. An optimal low-dimensional representation A^n ∈ R^{N_A}, where N_A ≪ N, is obtained for each observation. These low-dimensional representations are obtained via the convolutional autoencoder described in the previous section. The model for the temporal evolution of these features is constructed in a completely data-driven setting.

LSTM networks are employed to model the evolution of A. The details of the LSTM networks are outlined in section IV. In a typical machine translation task, the size of the hidden states and the number of layers are large when compared to the size of the feature vectors. In the current work, a single-layer LSTM network is found to be sufficient for evolving the feature vectors A^n. Note that the evolution of A does not require data from the full state S and thereby avoids an expensive reconstruction at every step.

Initializing with a pre-computed low-dimensional representation A^0, one obtains a prediction for the subsequent steps by applying the following iteratively:

Â^{n+1} = H(Â^n), n = 1, 2, 3, . . . (30)

where Â^1 = H(A^0) and H represents the action of Eq. (19) and its subcomponents. A graphical representation of this model is outlined in Fig. 8.

C. Unsupervised training strategy
The unsupervised training approach is the key component of the current model. The weights of both the convolutional autoencoder network and the LSTM recurrent neural network are adjusted in a joint fashion. The main challenge here is to prevent overfitting of the CNN and RNN portions of the model. The construction of the training dataset as well as the training and evaluation schematics are presented in this section.

Consider a dataset {s^1, . . . , s^m}, where s^n ∈ R^{N_x × N_y} is a 2D snapshot of some dynamical system (e.g., a pressure field defined on a 2D grid). This dataset is broken up into a set of N_s finite-time training sequences {S^1, . . . , S^{N_s}}, where each training sequence S^i ∈ R^{N_x × N_y × N_t} consists of N_t snapshots.

FIG. 10: Schematic of the training process of the convolutional recurrent autoencoder network (CRAN)

The fluctuations around the temporal mean are considered, as is general practice in the data science community for improving the training:

s'^n = s^n − s̄, (31)

where s̄ = (1/m) Σ_{n=1}^{m} s^n is the temporal mean over the complete dataset and s' are the perturbations around this average. Next, the scaling of the above data is conducted as

s'^n_s = (s'^n − s'_min) / (s'_max − s'_min), (32)

where each s'^n_s ∈ [0, 1]^{N_x × N_y}. The scaling is carried out to ensure that the data lie in the interval [0, 1]. This requirement stems from the fact that a sigmoid activation function is used in all the layers of the convolutional autoencoder. The training dataset with the above modifications has the form

S = {S'^1_s, . . . , S'^{N_s}_s} ∈ [0, 1]^{N_x × N_y × N_t × N_s}, (33)

where each training sample S'^i_s = [s'^1_{s,i}, . . . , s'^{N_t}_{s,i}] is a matrix consisting of the feature-scaled fluctuations.

The process of training primarily entails two stages.
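The mean-removal and min-max scaling of Eqs. (31)-(32) can be sketched as follows; the array shapes and helper names are illustrative:

```python
import numpy as np

def scale_snapshots(S):
    """Mean-removal and min-max scaling of snapshot data (Eqs. 31-32).

    S: array of shape (m, nx, ny) -- m snapshots on an nx x ny grid.
    Returns snapshots scaled into [0, 1] plus the statistics needed to invert."""
    s_bar = S.mean(axis=0)                  # temporal mean field (Eq. 31)
    fluct = S - s_bar                       # fluctuations s' around the mean
    s_min, s_max = fluct.min(), fluct.max()
    scaled = (fluct - s_min) / (s_max - s_min)   # Eq. (32)
    return scaled, (s_bar, s_min, s_max)

def unscale_snapshots(scaled, stats):
    """Invert the scaling to recover physical snapshots."""
    s_bar, s_min, s_max = stats
    return scaled * (s_max - s_min) + s_min + s_bar
```

Keeping the mean and min/max statistics is what allows the network's [0, 1] outputs to be mapped back to physical field values at prediction time.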
In the first stage, the convolutional autoencoder takes an N_b-sized batch of the training data S_b ⊂ S, where S_b ∈ [0, 1]^{N_x × N_y × N_t × N_b}, and outputs the current N_b-sized batch of low-dimensional representations of the training sequences,

A_b = {A_1, . . . , A_{N_b}} ∈ R^{N_A × N_t × N_b}, (34)

where A_i = [A^1_i, . . . , A^{N_t}_i] ∈ R^{N_A × N_t}, together with a reconstruction Ŝ_b of the original input training batch. This is represented by the purple arrow in Fig. 10. The second stage is indicated by the blue arrows in the same figure: the first feature vector of each sequence is used to initialize and iteratively update Eq. (30) to obtain a reconstruction Â_b of the low-dimensional representations of the training batch A_b.

A loss function is constructed that equally weights the error in the full-state reconstruction and the evolution of the low-dimensional representations. The target of the training is to find the model parameters θ such that, for any sequence S'_s = [S'^1_s, . . . , S'^{N_t}_s] and its corresponding low-dimensional representation A = [A^1, . . . , A^{N_t}], the following error between the truth and the predictions is minimized:

J(θ) = E[L(Ŝ'_s, S'_s, Â, A)]
     = E[ (α/N_t) Σ_{n=1}^{N_t} ‖S'^n_s − Ŝ'^n_s‖² / ‖S'^n_s‖² + (β/N_t) Σ_{n=1}^{N_t} ‖A^n − Â^n‖² / ‖A^n‖² ], (35)

where α = β = 0.5. The values of α and β are chosen in order to give equal weightage to the errors in the full-state reconstruction and the feature prediction respectively. The predictions are denoted by Ŝ'_s. In practice, the error is approximated by averaging L(Ŝ'_s, S'_s, Â, A) over all samples in a training batch during each backward pass. The convolutional autoencoder performs a regular forward pass and also constructs the low-dimensional representation simultaneously at every training step. These low-dimensional representations are used to train the RNN.

The ADAM optimizer [64] is used for updating the weights during the training process. It is an updated version of the canonical SGD method which uses estimates of the first and second moments of the gradients for an adaptive computation of learning rates. Fig. 10 depicts the overall schematic of the training process. TensorFlow is used for building this model and the main source code is adapted from [48].

FIG. 11: Illustration of the prediction process of the convolutional recurrent autoencoder network (CRAN)

The online prediction is straightforward and is depicted in Fig. 11. First, a low-dimensional representation A^0 ∈ R^{N_A} is constructed using the encoder network for a given initial condition S^0 ∈ [0, 1]^{N_x × N_y} and a set of trained parameters θ*. Eq. (30) is then applied iteratively for N_t steps with A^0 ∈ R^{N_A} as the initial solution to obtain the predictions of the low-dimensional representations. Finally, the full-dimensional state Ŝ^n is reconstructed from the low-dimensional representation Â^n at every time step or at any specific instance.

D. Force calculation on the interface
Since the full-order data from the fluid solver is highly unstructured, these high-dimensional nodal data are first projected as snapshot images onto a uniform (reference) grid. This is an important data processing step, before proceeding with training or testing, as it brings spatial uniformity to the CNN input in the CRAN model.

FIG. 12: Representative unstructured and reference grids for the side-by-side case. Every nodal point, in both grids, has an associated field value along with a 2-d spatial coordinate

FIG. 13: Illustration of the interpolation process for the reference (uniform) grid of the DL-ROM procedure. Three types of cells are shown: solid cells (red), interface cells that contain the exact physical boundary (yellow), and fluid cells (white)

For the present problems, we select a uniform 64 × 64 grid as a fixed reference grid for both cases. The interpolation of the 2-d scattered data onto this grid is implemented using SciPy's scipy.interpolate.griddata function [65]. The reference grid dimension is selected to sufficiently enclose the exact fluid-solid interface (FSI) present in the unstructured grid. Fig. 12 depicts both the unstructured grid (used in the full-order simulations) and the reference grid (used as an input to the CRAN model) for a side-by-side case. Note that the presence of the rigid body is enforced by masking the entire spatial region covered by the cylinder with a function which zeroes out the predictions made by the model at the points inside the cylinder.

Next, we briefly introduce the idea of calculating the integrated pressure force on the exact solid boundary from a fairly coarse (reference) grid. This is done via a two-step process. The first step is to identify all the interface cells for a given solid boundary (see Fig. 13) and to sum the force experienced by these cells. We call the summed force the total discrete force (F_B). The process of calculating the total discrete force is convenient, as one can loop over all the cells in the reference grid and mark all those cells that have any one of the cell nodal values equal to zero. The individual cell forces are then summed over all these marked cells. This includes the contribution from the cells in the interior of the solid domain (with zero force) and the interface cells (non-zero force), giving the discrete force. The second step is a reconstruction mapping (ψ) of this discrete force onto the exact solid boundary. The reconstruction mapping accounts for the missing physical data that is lost in the interpolation from the high-resolution (full-order) grid to the (neural-network) reference grid. This mapping is observed from the training data and corrected in the prediction.
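The projection of scattered nodal data onto the uniform reference grid can be sketched with SciPy's griddata; the scattered points and the field below are synthetic stand-ins for the solver output:

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered full-order data: node coordinates (x, y) and a field value
rng = np.random.default_rng(0)
pts = rng.uniform(-2.0, 2.0, size=(500, 2))   # unstructured node coordinates
u = np.sin(pts[:, 0]) * np.cos(pts[:, 1])     # nodal field values

# Uniform 64 x 64 reference grid enclosing the region of interest
gx, gy = np.meshgrid(np.linspace(-2, 2, 64), np.linspace(-2, 2, 64))

# Project the scattered data onto the reference grid (snapshot image for the CNN)
snapshot = griddata(pts, u, (gx, gy), method="linear")

# Points outside the convex hull of the scattered data come back as NaN;
# fill them with a nearest-neighbour pass before feeding the CNN
mask = np.isnan(snapshot)
snapshot[mask] = griddata(pts, u, (gx[mask], gy[mask]), method="nearest")
```

The masking of the solid region (zeroing the points inside the cylinder) would then be applied on top of this snapshot image.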
We call this entire process an observer-corrector methodology to calculate the integrated force.

Let f^n_{(i,j)} be the force in any interface cell (i, j) at a time-step n. The discrete force in this cell is calculated using linear interpolation via finite differences. The local face numbering (a−d−b−c) and the node numbering are shown in Fig. 13. The pressure and velocity values are approximated at the mid-point of each face linearly, by simply taking the average of its end nodal values. This can be further clarified from Fig. 14 and the following equation:

p^n_{f;(i,j)} = (1/2) (p^n_{f,1;(i,j)} + p^n_{f,2;(i,j)}), f ∈ {a, b, c, d}, (36)

where p^n_{f;(i,j)} represents the pressure at the mid-point of face f of the cell at time-step n, and f,1 and f,2 denote the two end nodes of that face. This process can similarly be repeated for the individual velocity components.

The discrete force f^n_k at a time-step n in a cell k (or (i, j) from Fig. 14) for a Newtonian fluid can be written as

f^n_k = (σ^n_{a;k} − σ^n_{b;k}) · n_x Δy + (σ^n_{c;k} − σ^n_{d;k}) · n_y Δx. (37)

The Cauchy stress tensor is σ^n_{*;k} = −p^n_{*;k} I + μ E^n_{*;k} for any point inside the cell k at instance n. Here p^n_{*;k}, I, μ and E^n_{*;k} denote the fluid pressure, the identity tensor, the fluid dynamic viscosity and the fluid strain-rate tensor respectively. n_x and n_y are the unit vectors in the x and y directions, while Δx = Δy is the cell size. For the present problems, we only consider the pressure component of the Cauchy stress tensor. The viscous component involves the calculation of gradients and can be computed using linear interpolation in the five-cell stencil shown in Fig. 14; this part is left for future work. With the above considerations and using Eqs. (36) and (37), the expression for the discrete force reduces to

f^n_k = −(p^n_{a;k} − p^n_{b;k}) n_x Δy − (p^n_{c;k} − p^n_{d;k}) n_y Δx. (38)

FIG. 14: A five-cell stencil for linear interpolation of gradients using finite differences

Next, a few notes on the reconstruction mapping ψ. The coarsening effect (projection) of the full-order data onto the reference grid brings spatial uniformity to the input state. However, this coarsening can also lead to a loss of accuracy in the forces on the physical FSI, even in the training data. This can be attributed to the fact that the boundary layer is not properly resolved on the reference grid. Hence, the function of ψ is to account for the discrepancy between the full-order force and the discrete force observed in the training data. After the prediction of the field on the reference grid by the CRAN, this calculated ψ is used to reconstruct the full-order predicted force. This will be further clarified in the example problems of a single cylinder and side-by-side cylinders in the subsequent sections.

In summary, the overall process that is followed to predict the integrated value of the force at a time-step n on the fluid-solid interface is:

• Calculate the discrete force f^n_k in each interface cell k using Eq. (37).

• Let N_F be the total number of such cells. Obtain the total discrete force F^n_B as

F^n_B = Σ_{k=1}^{N_F} f^n_k. (39)

• Correct F^n_B by the mapping ψ observed from the training data forces to predict the full-order force F^n_{Γfs} as

F^n_{Γfs} ≈ ψ F^n_B. (40)

FIG. 15: The flow past a cylinder: (a) schematic of the problem set-up, (b) full-domain computational mesh view and (c) close-up computational mesh view
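The observer-corrector summary above can be sketched as follows. The face orientations assigned to a, b, c, d and the treatment of ψ as a scalar factor are assumptions for illustration:

```python
import numpy as np

def discrete_pressure_force(p, dx, dy):
    """Total discrete force F_B on a uniform grid (pressure only, Eqs. 38-39).

    p: (ny, nx) nodal pressure snapshot, already masked to zero inside the solid.
    A cell is marked if any of its four nodes is zero (solid and interface cells)."""
    fx = fy = 0.0
    ny, nx = p.shape
    for j in range(ny - 1):
        for i in range(nx - 1):
            nodes = p[j:j+2, i:i+2]
            if np.any(nodes == 0.0):                   # marked cell
                # face mid-point pressures by averaging end nodes (Eq. 36)
                pa = 0.5 * (p[j, i+1] + p[j+1, i+1])   # right face (assumed a)
                pb = 0.5 * (p[j, i] + p[j+1, i])       # left face (assumed b)
                pc = 0.5 * (p[j+1, i] + p[j+1, i+1])   # top face (assumed c)
                pd = 0.5 * (p[j, i] + p[j, i+1])       # bottom face (assumed d)
                fx += -(pa - pb) * dy                  # Eq. (38), x-component
                fy += -(pc - pd) * dx                  # Eq. (38), y-component
    return fx, fy

def corrected_force(F_B, psi):
    """Observer-corrector step (Eq. 40): map the discrete force to full-order."""
    return psi * F_B
```

Interior solid cells contribute zero because all their face mid-point pressures vanish, so the loop over marked cells recovers exactly the interface-cell sum of Eq. (39).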
VII. APPLICATION I: FLOW PAST A CYLINDER
As a first demonstration, we apply both the POD-RNN and CRAN models to the simplified problem of flow past a cylinder. This includes the prediction of the flow field and the integrated force coefficients on the cylinder for each case.
Problem objective: The objective is to sufficiently learn the phenomenon of vortex shedding for the flow past a stationary cylinder on a given time-set (training) and extrapolate the behavior based on the learned parameters (prediction).
Full-order data: The schematic of the problem set-up is depicted in Fig. 15(a), where all the information about the domain boundaries and the boundary conditions is clearly outlined. The final mesh, obtained after following standard rules of mesh convergence, is presented in Fig. 15(b),(c); it contains a total of 25916 quadrilateral elements with 26114 nodes. The numerical simulation is carried out via the finite element Navier-Stokes solver to generate the training and test data. The Reynolds number of the problem is set at Re = 100, with a non-dimensional time-step of 0.25 tU∞/D. A total of 4500 snapshots of the simulation are collected at every 0.25 tU∞/D for the pressure field (P) and the x-velocity (U). Of those 4500 snapshots, 3000 (from 501 to 3500 steps) are used for training and 1000 (from 3501 to 4500 steps) are kept for testing. Thus, the total number of steps used in the analysis is N = 4000.

A. POD-RNN model
The POD-RNN model is applied to the flow past a plain cylinder as follows:

1. The POD algorithm discussed in section IV A is applied on the time snapshots S = {S^1 S^2 . . . S^N} ∈ R^{m×N} (here, m = 26114 and N = 4000) to obtain the mean field S̄ ∈ R^m, the fluctuation matrix S̃ ∈ R^{m×N} and the N spatially invariant POD modes Φ ∈ R^{m×N}. Note that S can be either the pressure field P or the x-velocity U snapshots.

2. Next, extract the eigenvalues Λ ∈ R^{N×N} of the covariance matrix S̃^T S̃ ∈ R^{N×N}. The absolute values of these eigenvalues are a measure of the energy of the respective POD modes. The cumulative energy and the percentage of the total modal energies are plotted in Fig. 16 for both the pressure and x-velocity fields. It is evident from Fig. 16 that most of the system energy is concentrated in the first four to five POD modes (in fact, the first two modes contribute > 95% of the total energy) for both the pressure and x-velocity data.

3. In order to construct the reduced-order system dynamics, select the first few energetic POD modes (here, k = 5) in the analysis. The N POD modes are now simply approximated with the first k modes (k ≪ N), reducing the POD modes to Φ ∈ R^{m×k}. The first four (most energetic) POD modes for the pressure and x-velocity snapshots are depicted in Fig. 18.

4. The temporal variations of these modes are obtained by A = Φ^T S̃. Here, A ∈ R^{k×N}, Φ ∈ R^{m×k} and S̃ ∈ R^{m×N}. For instance, the time history of these five modes for the pressure and x-velocity fields from 501 to 1500 time-steps (125 to 375 tU∞/D) is depicted in Fig. 17. These N temporal modes are divided into training (n_tr = 3000) and testing (n_ts = 1000) sets.

5. A closed-loop recurrent neural network is constructed with input and output dimension k (here, k = 5), so that all k modes are predicted in a single step by utilizing the multi-featured setting of the recurrent neural network. Another advantage of this input dimension is that it can also learn any dependencies that exist between the modes; as we already know, the modes are not completely independent of each other. The network and parameter details are listed in Table II.

6. The predicted temporal evolution of the modal coefficients is shown in Fig. 19. The k modal coefficients at time-step 3500 are used to make predictions for the next 1000 steps (from 3501 till 4500). The root mean squared error between the true and predicted (normalised) modal coefficients, RMSE = sqrt( (1/T) Σ_{n=1}^{T} (A^n − Â^n)² ), is tabulated in Table III. Here, A^n and Â^n are the true and predicted (normalised) modal coefficients respectively, and T is the number of test time-steps (here, T = 1000).

FIG. 16: The flow past a cylinder: Cumulative and percentage of modal energies. (a)-(b) for pressure field P, (c)-(d) for x-velocity field U

Parameter | Value
--- | ---
Input dimension | 5
Output dimension | 5
Hidden dimension | 512
RNN cell | LSTM
Number of time-steps | 3500
Number of batches | 1
Initial learning rate | 0.005
Learning rate drop period | 125 epochs
Learning rate drop factor | 0.2
Iterations | 3000 epochs
Optimizer | ADAM

TABLE II: The flow past a cylinder: Network and parameter details for the closed-loop recurrent neural network

FIG. 17: The flow past a cylinder: Time history of modal coefficients (shown from 125 till 375 tU∞/D) for (a) pressure field P and (b) x-velocity field U

FIG. 18: The flow past a cylinder: First four most energetic time-invariant spatial modes obtained from POD along with % of total energy. (a)-(b)-(c)-(d) for pressure field P and (e)-(f)-(g)-(h) for x-velocity field U

FIG. 19: The flow past a cylinder: Temporal evolution (prediction) of modal coefficients (shown from 875 till 1125 tU∞/D) for (a) pressure field P and (b) x-velocity field U

TABLE III: The flow past a cylinder: RMSE for the predicted against true (normalised) modal coefficients for (a) pressure (b) x-velocity
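Steps 1-4 above can be sketched with an SVD-based POD; the snapshot matrix here is a small synthetic stand-in, not the solver data:

```python
import numpy as np

def pod_basis(S, k):
    """POD of a snapshot matrix S (m x N): mean field, first k modes, coefficients."""
    S_mean = S.mean(axis=1, keepdims=True)    # mean field
    S_tilde = S - S_mean                      # fluctuation matrix
    # SVD of the fluctuations: columns of U are the spatial POD modes
    U, sigma, _ = np.linalg.svd(S_tilde, full_matrices=False)
    energy = sigma**2 / np.sum(sigma**2)      # modal energy fractions
    Phi = U[:, :k]                            # retained modes (m x k)
    A = Phi.T @ S_tilde                       # temporal coefficients (k x N)
    return S_mean, Phi, A, energy

def pod_reconstruct(S_mean, Phi, A_n):
    """High-dimensional state from coefficients: S^n ≈ S̄ + Φ A^n."""
    return S_mean[:, 0] + Phi @ A_n
```

The same `pod_reconstruct` map is what turns the RNN-predicted coefficients back into full flow fields in the field-prediction step below.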
Field prediction: The predicted modal coefficients at any time-step, Â^n ∈ R^k, can simply be reconstructed back to the high-dimensional state Ŝ^n using the mean field S̄ ∈ R^m and the k spatial POD modes Φ ∈ R^{m×k} as Ŝ^n ≈ S̄ + Φ Â^n. Figs. 21 and 22 depict the comparison of the predicted and true values for the P and U fields respectively at time-steps 3700 (925 tU∞/D), 4000 (1000 tU∞/D) and 4300 (1075 tU∞/D). The normalized reconstruction error E^n is constructed by taking the absolute value of the difference between the true S^n and the predicted Ŝ^n for all points k in the space (x, y) and is given by

E^n = |S^n − Ŝ^n| / ‖S^n‖_{2,k}. (41)

Force coefficient prediction: The high-dimensional field prediction on all the nodal points allows the numerical integration to be carried out directly over the fluid-solid boundary on the cylinder. The Cauchy stress tensor σ^f is constructed with the pressure field data, and Eq. (7) in section II is used to integrate it over the fluid-solid boundary to get C_{D,p} and C_{L,p}. Fig. 20 depicts the predicted and actual pressure force coefficients from time-steps 3501 to 3900 (875 to 975 tU∞/D). The reference length is the cylinder diameter (D) and the reference velocity is the inlet velocity (U∞); t denotes the actual simulation time.

FIG. 20: The flow past a cylinder: Predicted and actual (POD-RNN model) (a) drag and (b) lift force coefficients due to the pressure field (P) on the cylinder (shown from 875 till 975 tU∞/D)

B. CRAN model
The end-to-end nonlinear model order reduction tool based on CRAN is now applied to the same problem of flow past a cylinder. For uniformity, the same sets of training (time-steps 501 to 3500) and testing (time-steps 3501 to 4500) data are employed:

1. SciPy's scipy.interpolate.griddata function [65] is used to map the m-dimensional unstructured field data S = {S^1 S^2 . . . S^N} ∈ R^{m×N} onto a 2-d reference grid of size N_x × N_y (here, N_x = N_y = 64) covering a square region with the single cylinder positioned at the center. The snapshot data-set hence obtained is S = {s^1 s^2 . . . s^N} ∈ R^{N_x × N_y × N}.

2. The N 2-d snapshots are divided into training (n_tr = 3000) and testing (n_ts = 1000) sets. The n_tr training data are broken into N_s batches, each of finite time-step size N_t, so that n_tr = N_s N_t. The procedure described in section VI C is followed to generate the feature-scaled data for training,

S = {S'^1_s, . . . , S'^{N_s}_s} ∈ [0, 1]^{N_x × N_y × N_t × N_s}, (42)

where each training sample S'^i_s = [s'^1_{s;i}, . . . , s'^{N_t}_{s;i}] can be pressure or velocity data.

3. Train a range of convolutional recurrent autoencoder networks (CRANs) using the data set described above with the low-dimensional evolver LSTM state A of different sizes (refer to Figs. 8 and 9). Since A is a latent abstract feature space, we experiment with predictions for different sizes of the low-dimensional state, N_A = 2, 4, 8, 16, 32, 64, 128, 256, for both pressure P and x-velocity U. All the models are trained on a single Intel E5-2690v3 (2.60 GHz, 12 cores) CPU node. We took sufficiently many training iterations to account for the non-convex optimisation, although we did observe the objective function (Eq. (35)) to reach a nearly steady value in less than 100,000 iterations. It should be noted that we are using a closed-loop recurrent neural network in the CRAN model (see Fig. 5), i.e., the prediction from the previous step is used as the input for the next prediction. The main advantage of such a model is that there is no requirement for true data to be fed into the network in a time-delayed fashion. One can predict a finite time horizon from a single true sample (infer data) which is used for the initialization of the network.

4. The online prediction is carried out for n_ts = 1000 test time-steps for both P and U. The prediction quality is assessed with the mean normalized squared error

Ēf = (1/n_ts) Σ_n ‖s^n − ŝ^n‖²_{2,k} / ( ‖s^n‖²_{2,k} + ε ), (43)

where n denotes a time instance from 3501 to 4500 time-steps and k denotes a point (x, y) in space. ‖·‖_{2,k} denotes the L² norm over the spatial dimension, and s^n and ŝ^n are respectively the true and predicted fields. The tuning of the hyper-parameter N_A reveals that there exists an optimal value of N_A for all input data. For the pressure data P, as shown in Fig. 23(a), there exists approximately a two-order-of-magnitude difference in error between the best choices of N_A and the rest; a similar optimum is observed at N_A = 32 for the x-velocity U. Hence, the CRAN is tuned for its best performance at N_A = 32 for both P and U.

FIG. 21: The flow past a cylinder: Comparison of predicted and true fields (POD-RNN model) along with normalized reconstruction error E^n at tU∞/D = (a) 925, (b) 1000, (c) 1075 for pressure field (P)

FIG. 22: The flow past a cylinder: Comparison of predicted and true fields (POD-RNN model) along with normalized reconstruction error E^n at tU∞/D = (a) 925, (b) 1000, (c) 1075 for x-velocity field (U)

FIG. 23: The flow past a cylinder: Mean normalized squared error (Ēf) for predicted time-steps with respect to the size of the low-dimensional space N_A for (a) pressure field P and (b) x-velocity field U

Field prediction: Here, the N_A = 32 trained CRAN model is employed to demonstrate the field predictions for both pressure and x-velocity using just one time-step (3500) for initialization. Figs. 26 and 27 depict the comparison of the predicted and true values for the P and U fields respectively at time-steps 3700 (925 tU∞/D), 4000 (1000 tU∞/D) and 4300 (1075 tU∞/D). The normalized reconstruction error E^n is, similarly, constructed by taking the absolute value of the difference between the true s^n and the predicted ŝ^n and normalizing it with the L² norm of the true data:

E^n = |s^n − ŝ^n| / ‖s^n‖_{2,k}. (44)

Force coefficient prediction: Now, we demonstrate the methodology highlighted in section VI D to predict the pressure force coefficients on the cylinder boundary. Here, C_{D,p} and C_{L,p} denote the drag and lift forces experienced by the stationary cylinder in a pressure field, normalized by 0.5 ρ D U∞². Using Eqs. (38) and (39), we first calculate the discrete drag and lift forces from the pressure training data (3000 snapshots from time 125 tU∞/D to 875 tU∞/D). Figs. 24(a) and (b) depict the discrete drag and lift force coefficients (calculated on the low-resolution reference grid) with respect to the full-order values (calculated on the high-resolution unstructured grid). For visualisation convenience, the plot from 125 tU∞/D till 225 tU∞/D is shown. Clearly, there is a mean shift and/or higher amplitudes observed in the discrete force compared to the full-order force.
This is attributed to the interpolation error (coarsening effect) introduced by the reference grid. The discrete force nevertheless captures the trend of the force propagation, albeit with a mean shift and/or amplitude difference. This peculiarity is observed in the training data itself and needs to be corrected. This is done via the reconstruction mapping ψ, which is calculated from the discrete and full-order forces in the training data. Figs. 24 (c) and (d) show the reconstructed discrete force together with the full-order coefficients on the same training data from 125 tU∞/D to 225 tU∞/D using the corrector mapping ψ. Obtaining the correct force prediction thus reduces to the task of obtaining the discrete force from the predicted fields. The ψ calculated from the training data is then used to reconstruct the predicted discrete force to full order (Eq. (40)). The results are presented in Fig. 25.

FIG. 24: The flow past a cylinder: Comparison of discrete force and mapped discrete force with respect to full-order force coefficients on the training data due to pressure. (a),(b) represent the discrete drag and lift coefficients respectively; (c),(d) represent the reconstructed (mapped) discrete drag and lift coefficients respectively
FIG. 25: The flow past a cylinder: Predicted and actual (CRAN model) (a) drag and (b) lift force coefficients due to the pressure field (P) on the cylinder (shown from 875 to 975 tU∞/D)

C. Discussion
Table IV lists the differences in the attributes of both models for the flow past a cylinder.

FIG. 26: The flow past a cylinder: Comparison of predicted and true fields (CRAN model) along with normalized reconstruction error E^n at tU∞/D = (a) 925, (b) 1000, (c) 1075 for pressure field (P)

We have employed a closed-loop recurrent network in both models to evolve the low-dimensional subspace. Consequently, only one input time-step is sufficient to predict a long range of future time-states (here, 1000) for both hybrid models. This is expected since the problem reaches a periodic behavior; the compounding effect of the errors is thus negligible in time. The POD-RNN model provides selective control on the number of modes (features) in the low-dimensional space based on the energy distribution, together with a simple algebraic reconstruction using the time-advanced coefficients, modes and mean field. This bypasses the costly encoder and decoder network parameter space utilised in the CRAN model to achieve a similar process.

FIG. 27: The flow past a cylinder: Comparison of predicted and true fields (CRAN model) along with normalized reconstruction error E^n at tU∞/D = (a) 925, (b) 1000, (c) 1075 for x-velocity field (U)

In that sense, one can argue that the CRAN model does not provide any significant improvement over the POD-RNN model for this problem. However, this is expected, since the problem at hand is one of the most simplified fluid flow problems and any standard prediction tool should perform with comparable capability.
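As a concrete check of the error metrics used in this comparison, the spatial error of Eq. (44) and its temporal mean of Eq. (45) can be sketched in a few lines of NumPy; the snapshot counts and grid sizes below are hypothetical toy values, not the actual data dimensions:

```python
import numpy as np

def normalized_error(s_true, s_pred):
    """Pointwise error |s^n - s_hat^n| scaled by the L2 norm of the
    true snapshot over space (Eq. (44))."""
    return np.abs(s_true - s_pred) / np.linalg.norm(s_true)

# Hypothetical data: 5 predicted snapshots with 100 spatial points each
rng = np.random.default_rng(0)
true = rng.normal(size=(5, 100))
pred = true + 0.01 * rng.normal(size=(5, 100))

E_n = np.array([normalized_error(t, p) for t, p in zip(true, pred)])
E_s = E_n.mean(axis=0)  # temporal mean over the predicted steps (Eq. (45))
print(E_n.shape, E_s.shape)  # (5, 100) (100,)
```

Plotting `E_s` over the spatial points is exactly what Fig. 28 visualises for the two models.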
                         POD-RNN          CRAN
Encoder/decoder space    POD              trainable parameters θ
Evolver features A       k                N_A
(I, P)                   I: 1, P: 1000    I: 1, P: 1000

I: input time-steps, P: predicted time-steps

TABLE IV: The flow past a cylinder: Comparison of POD-RNN with the convolutional recurrent autoencoder network (CRAN)

The temporal mean of the normalized reconstruction error over the predicted time-steps is given by

$E_s = \dfrac{\sum_n E^n}{n_{ts}},$  (45)

where E^n is the spatial reconstruction error for a predicted time-step n and n_ts refers to the number of test time-steps. Fig. 28 depicts this temporal mean error E_s for the predicted fields using the POD-RNN and CRAN models. For both models, the majority of the errors are concentrated in the nonlinear wake region of the cylinder. For POD-RNN, E_s is of the order of 10− for both field variables; the errors in CRAN are slightly higher than in the POD-RNN case for both pressure and x-velocity. It is worth mentioning that for the flow past side-by-side cylinders a closed-loop recurrent architecture can be inefficient for very long time-series prediction due to the compounding effect of errors, since that problem is bi-stable. In that case, we employ an encoder-decoder type network in the POD-RNN and ground-truth data as a demonstrator in the CRAN to achieve longer time-series prediction. This application is systematically explored in the next section.

FIG. 28: The flow past a cylinder: Comparison between the temporal mean error E_s for the POD-RNN and CRAN models. (a),(b) for pressure and (c),(d) for x-velocity

VIII. APPLICATION II: FLOW PAST SIDE-BY-SIDE CYLINDERS
The canonical problem of flow past side-by-side cylinders can act as a representative test case for many multi-body systems. The dynamics associated with this problem are still a popular research topic in the fluid mechanics community. Hence, as a second application, we apply both the POD-RNN and CRAN models to this bi-stable problem to test the capabilities of the prediction tools. This includes the prediction of the flow field and of the integrated pressure force coefficients on the cylinders.
FIG. 29: The flow past side-by-side cylinders: (a) Schematicof the problem set-up, (b) full-domain computational meshview and (c) close-up computational mesh view
Problem objective: As in the case of a single cylinder, the objective here is to sufficiently learn the phenomenon of vortex shedding for flow past side-by-side cylinders on a given time-set (training) and extrapolate the behavior based on the learned parameters (prediction). However, flow past side-by-side cylinders is known to be a more complex (chaotic) problem, with a bi-stable solution, a flip-flopping regime and gap flow [53]. The idea is to study the performance of both types of neural architectures on this problem.
Full-order data: The schematic of the problem set-up is given in Fig. 29 (a), where the domain boundaries and the boundary conditions are outlined. The final mesh, obtained after following standard rules of mesh convergence, is presented in Figs. 29 (b) and (c); it contains a total of 49762 triangular elements with 25034 nodes. The numerical simulation is carried out via the finite element Navier-Stokes solver to generate the train and test data. The Reynolds number of the problem is set at Re = 100, and a total of 3800 snapshots of the simulation are collected at every 0.25 tU∞/D for the pressure field P and the x-velocity U. Of those 3800 snapshots, 3200 (steps 301 to 3500) are used for training and 300 (steps 3501 to 3800) are kept for testing, so the total number of train and test snapshots is N = 3500.

A. POD-RNN model
The POD-RNN model is applied to the flow past side-by-side cylinders as follows:

1. The POD algorithm (section IV A) is applied on all the train and test time snapshots of the field S = {S¹, S², ..., S^N} ∈ R^{m×N} (here, m = 25034, N = 3500) to obtain the mean field S̄ ∈ R^m, the fluctuation matrix S̃ ∈ R^{m×N} and the N spatially invariant POD modes Φ ∈ R^{m×N}.

2. Get the eigenvalues Λ ∈ R^{N×N} (energy of the POD modes) of the covariance matrix S̃ᵀS̃ ∈ R^{N×N}. The POD energy spectrum for both the pressure and x-velocity fields is plotted in Fig. 30. From the spectrum, it is observed that around 95% of the energy is concentrated in the first 21 modes for pressure and the first 19 modes for x-velocity.

FIG. 30: The flow past side-by-side cylinders: Cumulative and percentage modal energies. (a)-(b) for pressure field P, (c)-(d) for x-velocity field U
3. Construct the reduced-order system dynamics by selecting the first few energetic POD modes in the analysis (here, k = 25 for both pressure and x-velocity). The N POD modes are now simply approximated with these k modes (k ≪ N), reducing the POD system to Φ ∈ R^{m×k}. For instance, the first four spatial POD modes for the pressure and x-velocity fields, along with their % of total energy, are depicted in Fig. 31.

4. The temporal variations of these k modes are obtained by A = ΦᵀS̃. Here, A ∈ R^{k×N}, Φ ∈ R^{m×k} and S̃ ∈ R^{m×N}. These temporal modes are divided into a training part (n_tr steps) and a testing part (n_ts = N − n_tr steps). Here, n_tr = 3200 and n_ts = 300. The time history of the modal coefficients over 75−325 tU∞/D is depicted in Fig. 32.

5. The temporal behavior of these modal coefficients (for both pressure and x-velocity) is learned via an encoder-decoder type recurrent neural network (see Fig. 7), as it is found to work better than a closed-loop type. Note that before proceeding with training, the modal coefficients are normalized between -1 and 1 to prevent vanishing/exploding gradients. The primary feature of this recurrent architecture is to encode a finite sequence of time-steps (of length, say, n_i) and decode it to predict the next set of time-steps (of length n_o). For example, time-steps 1 to 25 can be encoded to predict steps 26 until 30, and so on. The training data consists of finite input-output time sequences of the k modes generated in random batches of size n_b in every epoch (here, n_b = 50, n_i = 25, n_o = 25, k = 25). The network for P and U is outlined in Table V. The 300 test time-steps are organized into six equal sets of input-output pairs; the size of the input (n_i) and output (n_o) time-window is 25 here.

Parameter                   Value
Input features              k = 25 (P), k = 25 (U)
n_i                         25
n_o                         25
n_b                         50 (random)
Initial learning rate       0.0005
L2 regularization factor    0.003
Iterations                  15000 epochs
Optimizer                   ADAM
TABLE V: The flow past side-by-side cylinders: Network and parameter details for the encoder-decoder type recurrent network

6. The result of the temporal modal predictions from the encoder-decoder type recurrent network is plotted in Fig. 34 for (a) pressure and (b) x-velocity for the test time-steps from 3501 till 3800 (875−950 tU∞/D). For visualisation convenience, only the first 8 modal coefficients are shown, although the network predicts the temporal trend in all k = 25 input modes with reasonable accuracy. The network input, prediction and true values are depicted by green, red and black lines, respectively, in Fig. 34. In spite of the drastic differences between the POD modes, one trained encoder-decoder network suffices to capture such chaotic behavior in all the modes for finite time-steps.

FIG. 31: The flow past side-by-side cylinders: First four most energetic time-invariant spatial modes obtained from POD along with % of total energy. (a)-(b)-(c)-(d) for pressure field P and (e)-(f)-(g)-(h) for x-velocity field U

FIG. 32: The flow past side-by-side cylinders: Time history of modal coefficients (from 75 to 325 tU∞/D): (a) pressure field P and (b) x-velocity field U

The normalised root mean square error
$\mathrm{RMSE} = \sqrt{\dfrac{1}{T} \sum_{n=1}^{T} \left( A^n - \hat{A}^n \right)^2}$

is plotted in Fig. 33 for all the 25 predicted modes. Here, A^n and Â^n are the true and predicted modal coefficients, respectively. T is the total length of the output time-sequences (here, T = n_o = 25).

FIG. 33: The flow past side-by-side cylinders: RMSE of the predicted against true (normalised) modal coefficients for (a) pressure and (b) x-velocity
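Steps 1-4 above (mean removal, covariance eigen-decomposition, energy-based mode truncation, temporal coefficients and the algebraic reconstruction used later for field prediction) can be sketched with the snapshot (covariance) method of POD. The matrix below is a synthetic low-rank stand-in for the actual snapshot data, with toy sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 400, 60                                         # toy spatial points / snapshots
S = rng.normal(size=(m, 3)) @ rng.normal(size=(3, N))  # synthetic rank-3 "field"

S_mean = S.mean(axis=1, keepdims=True)          # mean field
S_tilde = S - S_mean                            # fluctuation matrix

# Eigen-decomposition of the N x N covariance matrix S~^T S~
lam, W = np.linalg.eigh(S_tilde.T @ S_tilde)
lam, W = lam[::-1], W[:, ::-1]                  # sort by decreasing energy

# Smallest k capturing ~95% of the modal energy
energy = np.cumsum(lam) / np.sum(lam)
k = int(np.argmax(energy >= 0.95)) + 1

Phi = S_tilde @ W[:, :k] / np.sqrt(lam[:k])     # k spatial POD modes (m x k)
A = Phi.T @ S_tilde                             # temporal coefficients (k x N)
rel_err = np.linalg.norm(S_tilde - Phi @ A) / np.linalg.norm(S_tilde)
print(k, rel_err)  # rel_err^2 <= 0.05 by construction of k
```

Evolving the rows of `A` in time is exactly the job handed to the recurrent network in step 5, after which `S_mean + Phi @ A_pred` recovers the high-dimensional field.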
FIG. 34: The flow past side-by-side cylinders: Temporal evolution (prediction) of modal coefficients (from 875 till 950 tU∞/D) for (a) pressure field P and (b) x-velocity field U

Field prediction: The predicted modal coefficients Â^n ∈ R^k can simply be reconstructed back to the high-dimensional state Ŝ^n ∈ R^m using the mean field S̄ ∈ R^m and the k spatial POD modes Φ ∈ R^{m×k} as Ŝ^n ≈ S̄ + ΦÂ^n. Figs. 35 and 36 depict the comparison of predicted and true values for P and U, respectively, at time-steps 3592 (898 tU∞/D), 3692 (923 tU∞/D) and 3792 (948 tU∞/D). The normalized reconstruction error E^n is constructed by taking the absolute value of the difference between the true field S^n and the predicted field Ŝ^n at any time-step n and normalizing it with the L2 norm of the true data:

$E^n = \dfrac{|S^n - \hat{S}^n|}{\| S^n \|_{2,k}}.$  (46)

Force coefficient prediction: The high-dimensional field prediction on all the nodal points allows the numerical integration to be carried out directly over the fluid-solid boundary of the cylinders. The Cauchy stress tensor σ^f is constructed from the pressure field data, and Eq. (7) in section II is used to integrate it over the fluid-solid boundary to obtain C_{D,p} and C_{L,p}. Fig. 37 depicts the predicted and actual pressure force coefficients from time-steps 3501−3800. The reference length is the cylinder diameter (D = 1) and the reference velocity is the inlet velocity (U∞ = 1); t denotes the actual simulation time, advanced at a time-step of 0.25 s.

B. CRAN model
The end-to-end nonlinear model order reduction tool based on the CRAN is now applied to the same problem of flow past side-by-side cylinders. The same sets of training (time-steps 301−3500) and testing (time-steps 3501−3800) data are used:

1. Consider the field data S = {S¹, S², ..., S^N} ∈ R^{m×N} (here, m = 25034, N = 3500). The scipy “griddata” function [65] is used to map the m-dimensional unstructured data onto a 2-D reference grid of size N_x × N_y (here, N_x = N_y = 64), yielding the snapshot images S = {s¹, s², ..., s^N} ∈ R^{N_x×N_y×N}. These N 2-D snapshots are divided into a training set (n_tr = 3200) and a testing set (n_ts = N − n_tr = 300).

2. The n_tr training data is broken into N_s batches, each of finite time-step size N_t; note that n_tr = N_s N_t (here, N_t = 25 and N_s = 128). The procedure described in section VI C is followed to generate the feature-scaled data for training,

$\mathcal{S}' = \{ S'_{1s}, \ldots, S'_{N_s s} \} \in \mathbb{R}^{N_x \times N_y \times N_t \times N_s},$  (47)

where each training sample $S'_{is} = [ s'^{1}_{s;i}, \ldots, s'^{N_t}_{s;i} ]$ with N_x, N_y = 64. S'_{is} can be pressure or velocity data.

3. Convolutional recurrent autoencoder networks (CRANs) are trained using the data set described above and the low-dimensional evolver LSTM state A of different sizes. Predictions are experimented with for different sizes N_A = 2, 4, 8, 16, 32, 64, 128 and 256 for both pressure P and x-velocity U. All the models are trained on a single Intel E5-2690v3 (2.60 GHz, 12 cores) CPU node for N_train epochs. The objective function (Eq. (35)) was observed to reach a nearly steady value of approximately 10− at the end of the iterations. It should be noted that we employ a closed-loop recurrent neural network in the CRAN model for this problem (see Fig. 5), i.e., the prediction from the previous step is used as the input for the next prediction.

4. The closed-loop recurrent net uses the prediction from the previous time-step to generate a new prediction. Due to the bi-stable nature of the problem, the errors compound at every time-step, causing the prediction trajectory to deviate after a few finite time-steps (≈ N_t = 25 time-steps). To address this issue, ground truth data is fed after every N_t predicted time-steps as a demonstrator for the net to follow the actual trajectory and avoid divergence. We follow the 1-to-N_t rule for prediction: any one ground truth snapshot predicts the next N_t time-steps from the trained model. Thus, in testing, we input ≈ n_ts/N_t ground truth snapshots to predict n_ts time-steps. For instance, time-step 3500 predicts the sequence 3501−3525. Fig. 38 depicts the mean normalized squared error Ē_f over the first N_t predicted time-steps for P and U with respect to the size of the low-dimensional feature N_A:

$\bar{E}_f = \dfrac{1}{N_t} \sum_{n} \dfrac{\| s^n - \hat{s}^n \|_{2,k}}{\| s^n \|_{2,k} + \epsilon},$  (48)

where n denotes a time instance from time-steps 3501 to 3525 and k denotes a point (x, y) in space; ‖·‖_{2,k} denotes the L2 norm over the spatial dimension, and s^n and ŝ^n are respectively the true and predicted fields. The tuning of the hyper-parameter N_A reveals the existence of an optimal value for the field data. Referring to Fig. 38, the variation of Ē_f vs N_A follows a nearly convex optimisation curve, with the minimum at N_A = 32 for pressure and N_A = 16 for x-velocity. Based on this finding, we utilise the N_A = 32 (pressure) and N_A = 16 (x-velocity) trained CRAN models to predict all the n_ts = 300 test time-steps.
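The unstructured-to-grid mapping in step 1 uses the standard scipy.interpolate.griddata call; below is a minimal sketch for one snapshot on a synthetic scattered mesh (all node positions, field values and sizes are toy stand-ins, not the actual mesh):

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(3)
nodes = rng.uniform(0.0, 1.0, size=(2000, 2))              # scattered node (x, y)
field = np.sin(4 * nodes[:, 0]) * np.cos(4 * nodes[:, 1])  # nodal field values

# Uniform 64 x 64 reference grid spanning the (toy) domain
gx, gy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
snapshot = griddata(nodes, field, (gx, gy), method='linear')

# Grid points outside the convex hull of the nodes come back as NaN;
# fill them by nearest-neighbour interpolation to complete the image
mask = np.isnan(snapshot)
snapshot[mask] = griddata(nodes, field, (gx[mask], gy[mask]), method='nearest')
print(snapshot.shape)  # (64, 64)
```

Stacking such images over the time-steps produces the N_x × N_y × N snapshot tensor that the CRAN consumes.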
Field prediction: Figs. 40 and 41 depict the comparison of predicted and true values for the pressure and x-velocity fields, respectively, at time-steps 3592 (898 tU∞/D), 3692 (923 tU∞/D) and 3792 (948 tU∞/D). The normalized reconstruction error E^n is constructed by taking the absolute value of the difference between the true and predicted values and normalizing it with the L2 norm of the true data:

$E^n = \dfrac{|s^n - \hat{s}^n|}{\| s^n \|_{2,k}}.$  (49)

FIG. 35: The flow past side-by-side cylinders: Comparison of predicted and true fields (POD-RNN model) along with normalized reconstruction error E^n at (a) tU∞/D = 898, (b) tU∞/D = 923, (c) tU∞/D = 948 for pressure field (P)

FIG. 36: The flow past side-by-side cylinders: Comparison of predicted and true fields (POD-RNN model) along with normalized reconstruction error E^n at (a) tU∞/D = 898, (b) tU∞/D = 923, (c) tU∞/D = 948 for x-velocity field (U)

Force coefficient prediction: We use the methodology highlighted in section VI D to predict the pressure force coefficients on the boundary of the cylinders. Here, C_{D,p} and C_{L,p} denote the drag and lift force coefficients experienced by the stationary cylinders in a pressure field. Using Eqs. (38) and (39), we first calculate the discrete drag and lift force coefficients from the pressure training data (3200 snapshots from time 75s to 875s). Figs. 39 (a) and (b) depict the discrete lift and drag force coefficients on Cylinder 1, and the effect of the reconstruction ψ on these data is shown in Figs. 39 (c) and (d). Similarly, the discrete and mapped-discrete force coefficients on Cylinder 2 are shown in Figs. 39 (e),(f),(g),(h). These plots are shown from 75 tU∞/D till 200 tU∞/D for visualisation purposes. The reconstruction accuracy is nearly 99%. The ψ calculated from the training data recovers the missing force data in prediction (Eq. (40)). The predicted force coefficients are shown in Fig. 42 for both cylinders for all the test time-steps (3501 till 3800).

FIG. 37: The flow past side-by-side cylinders: Predicted and actual (POD-RNN model) pressure force coefficients. (a) Drag (Cylinder 1), (b) lift (Cylinder 1), (c) drag (Cylinder 2), (d) lift (Cylinder 2)
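The 1-to-N_t demonstrator rollout that produced the predictions above can be sketched independently of the network; `one_step` below is a hypothetical stand-in for the trained CRAN's single-step prediction, and the scalar "field" is a toy example:

```python
import numpy as np

def rollout_with_demonstrators(one_step, truth, N_t=25):
    """Closed-loop rollout in which every N_t predictions the state is
    reset to the ground-truth snapshot (the demonstrator), limiting the
    compounding of errors. `truth` holds snapshots from the last training
    step onward; returns len(truth) - 1 predictions."""
    preds = []
    state = truth[0]
    for n in range(1, len(truth)):
        if (n - 1) % N_t == 0:
            state = truth[n - 1]      # feed ground truth as a demonstrator
        state = one_step(state)
        preds.append(state)
    return np.array(preds)

# Toy case: the true dynamics is constant while the model slowly decays,
# so a pure closed-loop rollout would drift; demonstrators cap the error
truth = np.ones(301)                  # step "3500" plus 300 test steps (toy)
preds = rollout_with_demonstrators(lambda s: 0.99 * s, truth, N_t=25)
print(len(preds))                     # 300 predictions
```

With N_t = 25 and n_ts = 300 this consumes exactly the ≈ n_ts/N_t = 12 ground-truth demonstrators quoted in the text.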
C. Discussion
It is noted that the POD-RNN model with the closed-loop recurrent neural network is inefficient in predicting the flow past side-by-side cylinders. For that reason, we employ an encoder-decoder type recurrent net instead of a closed-loop type. The convolutional recurrent autoencoder network (CRAN) model, on the other hand, can predict the flow fields in a closed-loop fashion for longer time-steps with limited ground data as a demonstrator. In the case of the POD-RNN model, the spatial and temporal parts of the problem are dealt with independently, which may work for simple problems such as the flow past a cylinder. However, such linear decomposition is less effective for highly nonlinear problems such as side-by-side cylinders. The improvement over the POD-RNN model can be attributed to the low-dimensional features obtained by the CNN of the CRAN model. The nonlinear kernels in the CNN are very powerful in identifying dominant local flow features [14]. Moreover, the complete end-to-end architecture of the network, which integrates the encoding, evolution and decoding in a fully nonlinear fashion, is the primary reason this model works for the problem. This is a very promising result and motivates us to take forward this concept of convolutional recurrent autoencoder models and its variants (such as variational autoencoders and registration/interpretable autoencoders [66, 67]) for unsteady fluid flow problems with strong convection effects and moving boundaries.

Table VI lists the differences in the attributes of both models for the flow past side-by-side cylinders for both fields. It is worth mentioning that the CRAN outperforms the POD-RNN in terms of longer time-series prediction. In this analysis, the POD-RNN requires ≈ n_ts/2 = 150 test time-steps as input for predicting the remaining 150 time-steps, while the CRAN utilises ≈ n_ts/N_t = 12 ground-truth demonstrators to predict all n_ts time-steps. Note that n_ts = 300 denotes the test time-steps.
FIG. 38: The flow past side-by-side cylinders: Mean normalized squared error (Ē_f) for the first N_t predicted time-steps with respect to the size of the low-dimensional space N_A for (a) pressure field P and (b) x-velocity field U

                          POD-RNN         CRAN
Encoder/decoder space     POD             trainable parameters θ
Evolver features A (P, U) k = 25, 25      N_A = 32, 16
(I, P)                    I: 25, P: 25    I: 1, P: 25

P: pressure field, U: x-velocity field; I: input time-steps, P: predicted time-steps

TABLE VI: The flow past side-by-side cylinders: Comparison of POD-RNN with the convolutional recurrent autoencoder network (CRAN)

Fig. 43 depicts the temporal mean of the reconstruction error E_s for both models over the test time-steps. The temporal mean error E_s is quite low, of the order of 10−, for both predictions. Similar to the plain cylinder, the majority of the errors are concentrated in the near-wake region, where strong nonlinearity is present.

IX. CONCLUSIONS
We have presented two data-driven reduced-order models for the prediction of nonlinear fluid flow past bluff bodies. Both methods share a common data-driven framework: first obtain a set of low-dimensional features of the high-dimensional data coming from the fluid flow problem, and then evolve these low-dimensional features in time via recurrent neural networks. In the first method, proper orthogonal decomposition is used to obtain the low-dimensional features (the POD modes). We have shown a mathematical equivalence of POD with autoencoders and then proceeded to present a more general method where the low-dimensional features are obtained by convolutional neural networks. We have selected the flow past a plain cylinder and the flow past side-by-side cylinders as candidate test cases to evaluate the above methods. The POD-RNN seems to
FIG. 39: The flow past side-by-side cylinders: Comparison of discrete force, mapped discrete force and full-order force coefficients on the training data due to pressure for Cylinder 1 ((a)-(d)) and Cylinder 2 ((e)-(h))
FIG. 40: The flow past side-by-side cylinders: Comparison of predicted and true fields (CRAN model) along with normalized reconstruction error E^n at (a) tU∞/D = 898, (b) tU∞/D = 923, (c) tU∞/D = 948 for pressure field (P)

be more efficient in terms of computational cost and accuracy of prediction for the problem of flow past a plain cylinder, for which the CRAN is a bit of an overkill. However, the CRAN performs extremely well and completely outperforms the POD-RNN in terms of accuracy for the complicated problem of flow past side-by-side cylinders. This study shows that convolutional recurrent autoencoders can encompass a broader range of nonlinear fluid flow problems and can be pursued further for more complicated problems. We intend to extend this method to predict nonlinear fluid flow past moving bodies, with the aim of predicting the motions of the bodies interacting with the fluid flow.

FIG. 41: The flow past side-by-side cylinders: Comparison of predicted and true fields (CRAN model) along with normalized reconstruction error E^n at (a) tU∞/D = 898, (b) tU∞/D = 923, (c) tU∞/D = 948 for x-velocity field (U)

ACKNOWLEDGEMENTS
A part of this work was done at the Keppel-NUS Corporate Laboratory, which was supported by the National Research Foundation, Keppel Offshore and Marine Technology Centre (KOMtech) and the National University of Singapore (NUS). The conclusions put forward reflect the views of the authors alone and not necessarily those of the institutions within the Corporate Laboratory. The second and fourth authors would like to acknowledge the support from the University of British Columbia (UBC) and the Natural Sciences and Engineering Research Council of Canada (NSERC). The computational work for this article was performed on resources of the National Supercomputing Centre, Singapore, and on Advanced Computing Resources of ARC at UBC. Some part of the data presented in this work has been taken from our previous publications [55, 56, 61].

FIG. 42: The flow past side-by-side cylinders: Predicted and actual (CRAN model) pressure force coefficients for all test time-steps: (a) Drag (Cylinder 1), (b) lift (Cylinder 1), (c) drag (Cylinder 2), (d) lift (Cylinder 2). Note that at a time, one input time-step (blue dot) is used to predict the next sequence of N_t = 25 steps, until a new input is fed. This helps in reducing the compounding effect of the errors while still achieving predictions in a closed-loop recurrent fashion
DATA AVAILABILITY
The data that support the findings of this study are openly available at https://github.com/rachit1307-code/Assessment-of-hybrid-DLROM
CONFLICT OF INTEREST
The authors declare that they have no conflict of interest. [1] G. Berkooz, P. Holmes, and J. L. Lumley, Annual review offluid mechanics , 539 (1993).[2] J. Burkardt, M. Gunzburger, and H.-C. Lee, Computer Methodsin Applied Mechanics and Engineering , 337 (2006). [3] S. Fresca, L. Dede, and A. Manzoni, arXiv preprintarXiv:2001.04001 (2020).[4] W. Yao and R. Jaiman, Journal of Fluid Mechanics , 357(2017). (a) (b)(c) (d) FIG. 43: The flow past side-by-side cylinders: Comparisonbetween the temporal mean error E s for POD-RNN andCRAN model. (a),(b) for pressure and (c),(d) for x-velocity. [5] S. B. Reddy, A. R. Magee, and R. K. Jaiman, in ASME 201837th International Conference on Ocean, Offshore and Arc-tic Engineering (American Society of Mechanical Engineers,2018) pp. V002T08A004–V002T08A004.[6] S. Bukka, A. Magee, and R. Jaiman, Journal of Fluid Mechan-ics (2020).[7] K. Duraisamy, Z. J. Zhang, and A. P. Singh, in (2015) p. 1284.[8] J.-X. Wang, J.-L. Wu, and H. Xiao, arXiv preprintarXiv:1606.07987 (2016).[9] T. Miyanawala and R. Jaiman, arXiv preprint (2017).[10] Y. LeCun, Y. Bengio, and G. Hinton, nature , 436 (2015).[11] Z. Wang, D. Xiao, F. Fang, R. Govindan, C. C. Pain, andY. Guo, International Journal for Numerical Methods in Fluids , 255 (2018).[12] A. T. Mohan and D. V. Gaitonde, arXiv preprintarXiv:1804.09269 (2018).[13] T. Murata, K. Fukami, and K. Fukagata, arXiv preprintarXiv:1906.04029 (2019).[14] T. Miyanawala and R. Jaiman, arXiv preprint (2018).[15] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Proceedings of theNational Academy of Sciences , 3932 (2016).[16] S. Rudy, A. Alla, S. L. Brunton, and J. N. Kutz, SIAM Journalon Applied Dynamical Systems , 643 (2019).[17] M. Raissi and G. E. Karniadakis, Journal of ComputationalPhysics , 125 (2018).[18] K. Champion, B. Lusch, J. N. Kutz, and S. L. Brunton, arXivpreprint arXiv:1904.02107 (2019).[19] P. Choudhury, R. Allen, and M. Endres, Available at SSRN3251077 (2018).[20] Z. Long, Y. Lu, and B. 
Dong, arXiv preprint arXiv:1812.04426(2018). [21] S. Pan and K. Duraisamy, SIAM Journal on Applied DynamicalSystems , 2381 (2018).[22] T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud,in Advances in neural information processing systems (2018)pp. 6571–6583.[23] R. Maulik, A. Mohan, B. Lusch, S. Madireddy, and P. Bal-aprakash, arXiv preprint arXiv:1906.07815 (2019).[24] S. Jeong, B. Solenthaler, M. Pollefeys, M. Gross, et al. , ACMTransactions on Graphics (TOG) , 199 (2015).[25] K. Um, X. Hu, and N. Thuerey, in Computer Graphics Forum ,Vol. 37 (Wiley Online Library, 2018) pp. 171–182.[26] S. Wiewel, M. Becher, and N. Thuerey, in
Computer GraphicsForum , Vol. 38 (Wiley Online Library, 2019) pp. 71–82.[27] Y. Xie, E. Franz, M. Chu, and N. Thuerey, ACM Transactionson Graphics (TOG) , 95 (2018).[28] B. Kim, V. C. Azevedo, N. Thuerey, T. Kim, M. Gross, andB. Solenthaler, in Computer Graphics Forum , Vol. 38 (WileyOnline Library, 2019) pp. 59–70.[29] B. Bonev, L. Prantl, and N. Thuerey, arXiv preprintarXiv:1704.07854 (2017).[30] H. M. N. Thuerey, N. Mainali, K. Weissenow, L. Prantl, andX. Hu, Technical University of Munich.[31] S. Bhatnagar, Y. Afshar, S. Pan, K. Duraisamy, and S. Kaushik,Computational Mechanics , 1 (2019).[32] M. Raissi, Z. Wang, M. S. Triantafyllou, and G. E. Karniadakis,Journal of Fluid Mechanics , 119 (2019).[33] K. T. Carlberg, A. Jameson, M. J. Kochenderfer, J. Mor-ton, L. Peng, and F. D. Witherden, Journal of ComputationalPhysics (2019).[34] M. Korda and I. Mezi´c, Automatica , 149 (2018).[35] S. Peitz and S. Klus, Automatica , 184 (2019).[36] Z. Liu, S. Kundu, L. Chen, and E. Yeung, in (IEEE, 2018) pp. 4811–4818.[37] I. Mezi´c, Annual Review of Fluid Mechanics , 357 (2013).[38] M. Korda and I. Mezi´c, Journal of Nonlinear Science , 687(2018).[39] M. Budisic, M. Ryan, and I. Mezi´c, Chaos: An Interdisci-plinary Journal of Nonlinear Science , 047510 (2012).[40] S. Pan and K. Duraisamy, arXiv preprint arXiv:1906.03663(2019).[41] S. Pan and K. Duraisamy, Complexity (2018).[42] K. Yeo and I. Melnyk, Journal of Computational Physics ,1212 (2019).[43] S. E. Otto and C. W. Rowley, SIAM Journal on Applied Dy-namical Systems , 558 (2019).[44] B. Lusch, J. N. Kutz, and S. L. Brunton, Nature communica-tions , 4950 (2018).[45] T. Miyanawala and R. Jaiman, arXiv preprint arXiv:1807.09591(2018).[46] K. Lee and K. Carlberg, arXiv preprint arXiv:1812.08373(2018).[47] N. B. Erichson, M. Muehlebach, and M. W. Mahoney, arXivpreprint arXiv:1905.10866 (2019).[48] F. J. Gonzalez and M. Balajewicz, arXiv preprintarXiv:1808.01346 (2018).[49] R. Jaiman, M. Guan, and T. 
Miyanawala, Computers & Fluids , 68 (2016).[50] R. Jaiman, N. Pillalamarri, and M. Guan, Computer Methodsin Applied Mechanics and Engineering , 187 (2016).[51] J. Liu, R. K. Jaiman, and P. S. Gurugubelli, Journal of Compu-tational Physics , 687 (2014).[52] R. Jaiman, P. Geubelle, E. Loth, and X. Jiao, Computers &Fluids , 120 (2011). [53] B. Liu and R. K. Jaiman, Physics of Fluids , 127103 (2016).[54] R. C. Mysa, A. Kaboudian, and R. K. Jaiman, Journal of Fluidsand Structures , 76 (2016).[55] S. B. Reddy, A. R. Magee, and R. K. Jaiman, in ASME 201938th International Conference on Ocean, Offshore and Arc-tic Engineering (American Society of Mechanical Engineers,2019).[56] S. R. BUKKA,
Data-Driven Computing for the Stability Analysis and Prediction of Fluid-Structure Interaction, Ph.D. thesis (2019). [57] J. N. Kani and A. H. Elsheikh, arXiv preprint arXiv:1709.00939 (2017). [58] O. Ogunmolu, X. Gu, S. Jiang, and N. Gans, arXiv preprint arXiv:1610.01439 (2016). [59] K. Yeo, arXiv preprint arXiv:1710.01693 (2017). [60] C. Olah, (2015). [61] S. R. Bukka, A. R. Magee, and R. K. Jaiman, arXiv preprint arXiv:2003.12147 (2020). [62] T. P. Miyanawala and R. K. Jaiman, Journal of Fluid Mechanics, 723 (2019). [63] Q. Li, F. Dietrich, E. M. Bollt, and I. G. Kevrekidis, Chaos: An Interdisciplinary Journal of Nonlinear Science, 103111 (2017). [64] D. P. Kingma and J. Ba, arXiv preprint arXiv:1412.6980 (2014). [65] https://docs.scipy.org/doc/scipy-1.4.1/scipy-ref-1.4.1.pdf (2019). [66] R. Mojgani and M. Balajewicz, arXiv preprint arXiv:2006.15655 (2020). [67] T. Taddei, SIAM Journal on Scientific Computing 42.