An Efficient Deep Learning Technique for the Navier-Stokes Equations: Application to Unsteady Wake Flow Dynamics
T. P. Miyanawala a , R. K. Jaiman a, ∗ a Department of Mechanical Engineering, National University of Singapore, Singapore 119077
Abstract
We present an efficient deep learning technique for the model reduction of the Navier-Stokes equations for unsteady flow problems. The proposed technique relies on the Convolutional Neural Network (CNN) and the stochastic gradient descent method. Of particular interest is to predict the unsteady fluid forces for different bluff body shapes at low Reynolds number. The discrete convolution process with a nonlinear rectification is employed to approximate the mapping between the bluff-body shape and the fluid forces. The deep neural network is fed by the Euclidean distance function as the input and the target data generated by the full-order Navier-Stokes computations for primitive bluff body shapes. The convolutional networks are iteratively trained using the stochastic gradient descent method with the momentum term to predict the fluid force coefficients of different geometries, and the results are compared with the full-order computations. We attempt to provide a physical analogy of the stochastic gradient method with the momentum term with the simplified form of the incompressible Navier-Stokes momentum equation. We also construct a direct relationship between the CNN-based deep learning and the Mori-Zwanzig formalism for the model reduction of a fluid dynamical system. A systematic convergence and sensitivity study is performed to identify the effective dimensions of the deep-learned CNN process such as the convolution kernel size, the number of kernels and the convolution layers. Within the error threshold, the prediction based on our deep convolutional network has a speed-up of nearly four orders of magnitude compared to the full-order results and consumes an insignificant fraction of computational resources. The proposed CNN-based approximation procedure has a profound impact on the parametric design of bluff bodies and the feedback control of separated flows.

∗ Corresponding author. Email address: [email protected] (R. K. Jaiman)

Preprint submitted to arXiv August 16, 2018

Keywords:
Deep learning, Convolutional neural networks, Distance function, Stochastic gradient descent, Navier-Stokes equations, Unsteady wake dynamics
1. Introduction
Unsteady separated flow behind a bluff body causes fluctuating drag and transverse forces on the body, which is of great significance in engineering applications. “After over a century of effort by researchers and engineers, the problem of bluff body flow remains almost entirely in the empirical, descriptive realm of knowledge,” as famously stated by [1]. Recent advances in computational and experimental techniques in the past decades have only strengthened this statement. Nonlinear unsteady flow separation and the dynamics of vortex formation in the near wake region make the bluff-body flow an extremely complex phenomenon to tackle through rigorous analytical tools. While physical experiments and computational techniques provide high-fidelity data, they are generally time-consuming and expensive for design space exploration and flow control in a practical engineering application. Furthermore, the enormous amount of generated high-fidelity data is often under-utilized, and the significant patterns identified and learned in one case are rarely reused effectively in subsequent simulations. For example, consider the flow past a stationary rounded-corner square cylinder, where the engineering objective is to determine a direct relationship between the fluid forces and the rounding angle of the square cylinder. Traditionally, we need to perform computationally expensive transient Navier-Stokes simulations for each rounding-angle value for a considerable number of time steps, albeit the vortex formation and the shedding process do not differ much between the different configurations. Here, we focus on a data-driven computing method to predict the key flow parameters of bluff-body flows via available high-fidelity data.
Of particular interest is to have an efficient alternative to the brute-force full-order model (FOM) and to obtain the fluid forces in nearly real time within an acceptable error threshold while utilizing a fraction of the computational effort. Data-driven methods using low-dimensional representations via proper orthogonal decomposition (POD) [2], the eigensystem realization algorithm (ERA) [3] and dynamic mode decomposition (DMD) [4, 5] have been used in computational fluid dynamics (CFD) for model order reduction. These methods generate physically interpretable spatial and spatio-temporal modes related to dominant features of the bluff body flow such as the shear layers and the Karman vortex street. However, these techniques may fail to capture nonlinear transients and multi-scale phenomena [6] in bluff-body flows, which motivates the need for an improved technique to extract invariant dominant features to predict the important flow dynamical parameters. In such learning algorithms, the essential idea is to construct a generalized kernel mapping function from a finite set of output data and to postulate a priori relationship between the input-output dynamics. In that regard, biologically-inspired learning (i.e., neuron-based learning) techniques have some attractive approximation properties for constructing the input-output dynamics through two main phases, namely training and prediction. In the training phase, high-fidelity data from the FOM are fed to a learning algorithm as pairs of input and output. The target of neural network learning is then to find the functional relationship between the input (bluff body geometry) and the output (wake dynamics) using the provided initial data, such that the force coefficient for a new bluff body geometry can be determined in real time without using the full-order simulation. For this process, the learning algorithm utilizes a set of weighting/functional steps, also called layers, to connect the input geometry matrix and the force coefficient.
These hidden layers, the set of input geometry matrices and the output force coefficient layer form a neural network. After the input data is fed to the neural network, it initializes the training with guessed weights for the layers between the input and output, and iteratively back-propagates the deviations and corrects the assumed weights. In [7], a modeling paradigm was proposed using field inversion and machine learning for fluid mechanics problems. However, in these traditional machine learning algorithms, neural network layers utilize matrix multiplication by considering a matrix of parameters with another parameter, and each data point in one layer is connected to each data point of the next layer via a weight function. These early supervised learning methods, also referred to as shallow learning, perform reasonably well if the system is governed by a few dominant features, but become extremely inefficient in capturing multiple local flow features such as flow separation, the shear layer, the vortex formation region, and the vortex street. Deep learning addresses the deficiency of traditional learning algorithms by building complex features from simpler nested functions via input-output relationships. For example, [8] proposed a novel Reynolds-averaged turbulence model based on deep learning. Another developing technique is machine learning based on Gaussian process (GP) regression, whereby the interpolations are governed by prior covariances as opposed to smooth curve-fitting. The GP-based regression network replaces the weight connections in a Bayesian neural network with Gaussian processes, which allows the model input-dependent correlations between multiple tasks. Such GP-based models are generally task-specific and may require sophisticated approximate Bayesian inference, which is much more computationally expensive in contrast to deep learning models.
Recently, the GP-based machine learning method was introduced for linear differential equation systems by [9] and for nonlinear partial differential equation (PDE) systems by [10]. Also, [11] introduced a multi-fidelity GP for the prediction of random fields. While neural networks employ a considerable number of highly adaptive basis functions, Gaussian processes generally rely on a multitude of fixed basis functions. Another main difference of the CNN-based technique from the GP-based regression method is that it allows a parametric learning process through highly adaptive basis functions. Neural networks have some inherent ability to discover meaningful high-dimensional data through learning multiple layers via highly adaptive basis functions [12, 13]. In particular, deep neural networks can provide an effective process for generating adaptive basis functions to develop meaningful kernel functions, which can allow discovering structure in high-dimensional data without human intervention. Owing to these attractive properties of neural networks, we consider CNN-based deep learning to develop a novel computational framework to predict the wake flow dynamics using high-fidelity computational fluid dynamics (CFD) data. The present work aims to develop an efficient model reduction technique for the Navier-Stokes equations via deep neural networks. In a deep convolutional network framework [14], the output data point of a layer is connected only to a very small portion of the input data points, and the learning system provides sparse interactions (sparse weights) and parameter sharing (tied weights) for working with inputs of variable size. Due to this local connectivity between the adjacent layers, a particular learned weight function can properly capture the essential features with much-reduced memory requirements and computing operations. While the CNN applies a discrete convolution operation, it is local in extent and performs superior to other deep learning methods.
Hence, we consider the convolutional neural network as the deep-learning technique to learn and predict the wake flow parameters. Our main contribution is to develop the feedforward CNN-based prediction model and to assess its accuracy and performance against the traditional Navier-Stokes simulation for unsteady bluff-body separated flow. Another key contribution is to establish a relationship of the deep kernel learning method with the simplified form of the incompressible Navier-Stokes equations, which essentially provides a physical justification for the success of the proposed neural networks during the functional mapping between the input and the output response. We also connect the convolution process in the deep learning procedure with the well-known Mori-Zwanzig formalism for the dimensionality reduction of a dynamical system. For a demonstration of the proposed data-driven technique, we consider a simple variation of bluff body geometry as a parametric input and perform a series of experiments to demonstrate the accuracy and the efficiency of the technique. The proposed novel model order reduction (MOR) technique based on deep learning has direct engineering applications in the design of offshore, civil and aeronautical structures.

The paper is organized as follows. We begin by reviewing the full-order model based on the variational formulation of the unsteady incompressible flow equations in Section 2. Section 3 describes the background material and the formulation of our CNN-based deep learning procedure, and covers the relationship of deep-learning-based MOR with the analytical form of the full-order Navier-Stokes system. In Section 4, we present a systematic sensitivity and performance analysis of the proposed CNN-based technique in predicting the wake dynamics of various bluff bodies of different geometries. Concluding remarks are presented in Section 5.
2. Full-Order Variational Model
For the sake of completeness, we briefly summarize the Navier-Stokes solver used in this numerical study for our high-dimensional full-order computations. We assume the fluid flow to be Newtonian, isothermal and incompressible for the present study. A Petrov-Galerkin finite-element method and semi-discrete time stepping are adopted for the full-order modeling to investigate the interaction of an incompressible viscous flow with a stationary body [15, 16]. For the spatial domain $\Omega$ and the time domain $(0, T)$, the incompressible Navier-Stokes equations for the momentum and the continuity are
$$\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = \nabla\cdot\boldsymbol{\sigma} + \mathbf{b}, \qquad \nabla\cdot\mathbf{u} = 0 \quad \text{on } \Omega\times(0,T), \tag{1}$$
where $\mathbf{u} = \mathbf{u}(\mathbf{x}, t)$ represents the fluid velocity defined for each spatial point $\mathbf{x}\in\Omega$, $\mathbf{b}$ is the body force applied on the fluid, and $\boldsymbol{\sigma}$ is the Cauchy stress tensor for a Newtonian fluid, written as $\boldsymbol{\sigma} = -p\mathbf{I} + \mu\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)$, where $p$ denotes the fluid pressure and $\mu$ is the dynamic viscosity of the fluid. The appropriate conditions are specified along the Dirichlet $\Gamma_g$ and Neumann $\Gamma_h$ boundaries of the fluid domain. We next present the discretization using a stabilized variational procedure with equal-order interpolations for velocity and pressure.

By means of the finite element method, the fluid spatial domain $\Omega$ is discretized into several non-overlapping finite elements $\Omega^e$, $e = 1, 2, \ldots, n_{el}$, where $n_{el}$ is the total number of elements. In this paper, we adopt a generalized-$\alpha$ method to integrate in time $t\in[t^n, t^{n+1}]$, which can be unconditionally stable as well as second-order accurate for linear problems simultaneously via a single parameter termed the spectral radius $\rho_\infty$.
With the aid of the generalized-$\alpha$ parameters $(\alpha, \alpha_m)$, the expressions employed in the variational form for the flow equation are given as [17]:
$$\mathbf{u}^{n+1}_h = \mathbf{u}^n_h + \Delta t\,\frac{\partial \mathbf{u}^n_h}{\partial t} + \gamma\Delta t\left(\frac{\partial \mathbf{u}^{n+1}_h}{\partial t} - \frac{\partial \mathbf{u}^n_h}{\partial t}\right), \tag{2}$$
$$\mathbf{u}^{n+\alpha}_h = \mathbf{u}^n_h + \alpha\left(\mathbf{u}^{n+1}_h - \mathbf{u}^n_h\right), \tag{3}$$
$$\frac{\partial \mathbf{u}^{n+\alpha_m}_h}{\partial t} = \frac{\partial \mathbf{u}^n_h}{\partial t} + \alpha_m\left(\frac{\partial \mathbf{u}^{n+1}_h}{\partial t} - \frac{\partial \mathbf{u}^n_h}{\partial t}\right), \tag{4}$$
where $\alpha_m = \frac{1}{2}\left(\frac{3 - \rho_\infty}{1 + \rho_\infty}\right)$, $\alpha = \frac{1}{1 + \rho_\infty}$ and $\gamma = \frac{1}{2} + \alpha_m - \alpha$. Let the space of the trial solutions be denoted by $\mathcal{S}^h$ and the space of test functions be $\mathcal{V}^h$. The variational form of the flow equations can be written as: find $[\mathbf{u}^{n+\alpha}_h, p^{n+1}_h]\in\mathcal{S}^h$ such that $\forall\,[\boldsymbol{\phi}^f, q]\in\mathcal{V}^h$:
$$\begin{aligned}
&\int_{\Omega^h} \rho\left(\frac{\partial \mathbf{u}^{n+\alpha_m}_h}{\partial t} + \mathbf{u}^{n+\alpha}_h\cdot\nabla\mathbf{u}^{n+\alpha}_h\right)\cdot\boldsymbol{\phi}^f\,d\Omega + \int_{\Omega^h} \boldsymbol{\sigma}^{n+\alpha}_h : \nabla\boldsymbol{\phi}^f\,d\Omega - \int_{\Omega^h} \nabla\cdot\mathbf{u}^{n+\alpha}_h\, q\,d\Omega\\
&\quad+ \sum_{e=1}^{n_{el}}\int_{\Omega^e} \tau_m\left(\rho\,\mathbf{u}^{n+\alpha}_h\cdot\nabla\boldsymbol{\phi}^f + \nabla q\right)\cdot\left(\rho\frac{\partial \mathbf{u}^{n+\alpha_m}_h}{\partial t} + \rho\,\mathbf{u}^{n+\alpha}_h\cdot\nabla\mathbf{u}^{n+\alpha}_h - \nabla\cdot\boldsymbol{\sigma}^{n+\alpha}_h - \mathbf{b}\right)d\Omega^e\\
&\quad+ \sum_{e=1}^{n_{el}}\int_{\Omega^e} \nabla\cdot\boldsymbol{\phi}^f\,\tau_c\,\rho\,\nabla\cdot\mathbf{u}^{n+\alpha}_h\,d\Omega^e = \int_{\Omega^h} \mathbf{b}\cdot\boldsymbol{\phi}^f\,d\Omega + \int_{\Gamma^h} \mathbf{h}\cdot\boldsymbol{\phi}^f\,d\Gamma,
\end{aligned} \tag{5}$$
where the 4th and 5th terms on the left-hand side represent the stabilization terms applied on each element locally. All the other integrals constitute the Galerkin terms, and they are evaluated at $t^{n+1}$. The stabilization parameters $\tau_m$ and $\tau_c$ appearing in the element-level integrals are the least-squares metrics, which are added to the fully discretized formulation [18]. Refer to [15, 16] for the detailed derivation of Eq. (5).

Linearization of the variational form is carried out by the standard Newton-Raphson technique. Let $\Delta\mathbf{u}$ and $\Delta p$ denote the increments in the velocity and pressure variables. The linearized matrix form of Eq. (5) is written as:
$$\mathbf{M}\Delta\mathbf{u} + \theta\mathbf{K}_d\Delta\mathbf{u} + \theta\mathbf{N}\Delta\mathbf{u} + \theta\mathbf{G}\Delta p = \mathbf{R}_m, \tag{6}$$
$$-(\mathbf{G}_M)^T\Delta\mathbf{u} - (\mathbf{G}_K)^T\Delta\mathbf{u} + \mathbf{C}\Delta p = \mathbf{R}_c, \tag{7}$$
where $\mathbf{M}$ is the mass matrix, $\mathbf{K}_d$ is the diffusion matrix, $\mathbf{N}$ is the convection matrix, and $\mathbf{G}$ is the pressure gradient operator.
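As a quick numerical check of the single-parameter family above, the generalized-$\alpha$ constants can be evaluated directly from $\rho_\infty$; a minimal sketch (assuming the relations quoted above; the helper name is ours):

```python
def gen_alpha_params(rho_inf):
    """Generalized-alpha parameters as functions of the spectral radius
    rho_inf (a sketch of the relations quoted in the text)."""
    alpha_m = 0.5 * (3.0 - rho_inf) / (1.0 + rho_inf)
    alpha = 1.0 / (1.0 + rho_inf)
    gamma = 0.5 + alpha_m - alpha  # second-order accuracy condition
    return alpha_m, alpha, gamma
```

For $\rho_\infty = 1$ this reduces to the midpoint rule ($\alpha_m = \alpha = \gamma = 1/2$), while $\rho_\infty = 0$ gives maximal high-frequency damping.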
$\mathbf{G}_M$, $\mathbf{G}_K$ and $\mathbf{C}$ are the contributions of the mass matrix, stiffness matrix and pressure matrix to the continuity equation, respectively. The parameter $\theta = 2\Delta t(1+\rho_\infty)/(3-\rho_\infty)$ is a scalar, whereas $\rho_\infty$ denotes the spectral radius which acts as a parameter to control the high-frequency damping [16]. $\mathbf{R}_m$ and $\mathbf{R}_c$ are the right-hand-side residual vectors in the linearized form of the momentum and continuity equations, respectively.

We solve the Navier-Stokes equations at discrete time steps to capture the transient flow characteristics, which leads to a sequence of linear systems of equations via Newton-Raphson type iterations. Using Eqs. (6) and (7), the linearized form of the Navier-Stokes equations can be written as the coupled system of equations:
$$\begin{bmatrix} \mathbf{M} + \theta(\mathbf{K}_d + \mathbf{N}) & \theta\mathbf{G} \\ -(\mathbf{G}_M^T + \mathbf{G}_K^T) & \mathbf{C} \end{bmatrix}\begin{Bmatrix} \Delta\mathbf{u} \\ \Delta p \end{Bmatrix} = \begin{Bmatrix} \mathbf{R}_m \\ \mathbf{R}_c \end{Bmatrix}. \tag{8}$$
This coupled matrix form is complicated to solve since there are two sets of pressure and velocity modes, whereby the velocity modes must be captured accurately before evaluating the pressure modes. Linear solvers are needed for the solution of the momentum and pressure equations. We first solve the symmetric matrix of the pressure projection using the Conjugate Gradient (CG) method with a diagonal preconditioner [19]. This provides a set of projection vectors for the low-pressure modes of the coupled Navier-Stokes system. Note that to solve this equation we do not need the left-hand-side matrix in explicit form; instead, we perform the required matrix-vector products. Without disturbing the projected low-pressure modes, we then use the standard Generalized Minimal Residual (GMRES) solver based on the modified Gram-Schmidt orthogonalization [20] for the velocity and pressure increments using Eq. (8) within the Newton-Raphson loop. The GMRES is an iterative Krylov-subspace method, which minimizes the residual norm over all vectors in the Krylov subspace.
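The preconditioned pressure-projection solve described above can be illustrated with a Jacobi-preconditioned conjugate gradient iteration. The following NumPy sketch shows only the algorithm class, not the authors' solver; in particular, a dense matrix stands in here for the matrix-free matrix-vector products mentioned in the text:

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a diagonal (Jacobi) preconditioner.
    Illustrative sketch; A must be symmetric positive definite."""
    x = np.zeros_like(b)
    r = b - A @ x
    Minv = 1.0 / np.diag(A)          # diagonal preconditioner
    z = Minv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) < tol:  # converged
            break
        z = Minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

Replacing the products `A @ p` with a user-supplied operator recovers the matrix-free variant used in the full-order solver.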
The Arnoldi recurrence generates a sequence of orthogonal vectors that span the Krylov space [20], and the solution to the least-squares system provides an approximate solution by orthogonal projection. A Krylov cycle involves an equal number of steps of the Arnoldi recurrence and matrix-vector products for the Krylov space. Each new Arnoldi vector is explicitly orthogonalized against all previous vectors, so the storage increases linearly with the iteration count until convergence and the work increases quadratically with the iteration count. In the above full-order iterative solution, the linear solution via the Krylov subspace has a robust performance and convergence rate relative to the steepest descent. However, such matrix-free Krylov methods for the full-order model require much more storage of intermediate results, and they are generally non-local. Furthermore, the formation of the finite element terms is also a computationally expensive process for element-by-element integration in the full-order analysis using Eq. (5) at each iteration, which can be replaced by an equation-free approximation via a deep neural network. We next present our data-driven method based on deep learning and the stochastic gradient descent approach.
3. Model Reduction via Convolutional Neural Network
Deep learning utilizes multiple layers of representation to learn relevant features automatically from training data via deep neural networks. These deep neural networks can be considered as a multi-iterative coarse-graining scheme, which provides a successive process to learn relevant dynamics from the data by combining low-level features with higher-level features. In this study, we consider the convolutional neural network for our deep learning technique to extract relevant features from fluid dynamics data and then predict the fluid forces. With this objective, we hypothesize that a properly trained CNN-based deep-learning technique can find the mapping between flow parameters (e.g., force coefficients, Strouhal number) and the design parameters (e.g., aspect ratio, rounding angle, angle of attack) of interest and hence can be used for predictions. The training phase of the CNN comprises the input function, the feed-forward process, and the back-propagation process. We feed the input function as a matrix of values derived from a unique mapping of the design parameter. The feed-forward process then initially assumes the features of the wake flow and performs four operations on this input matrix: (i) convolution: a linear process which extracts the local features of the flow using small matrices called kernels or filters and sliding-window operations; (ii) nonlinear rectification: adds nonlinearity to the outcome of the convolution; (iii) down-sampling (if required): reduces the memory consumption of the process at the expense of data dropout; and (iv) fully connected layers: connect all the extracted features to output the wake flow parameter. In a deep neural network, input features are mapped or transformed into hidden units to form new features to feed to the next layer.
While the terminology of a feature map is quite broad in the field of machine learning, a feature map is essentially a function which maps the input data vector to feature space and provides the encoding of features and their locations for the convolutional layers. Using the discrepancy between the FOM and the predicted result, the back-propagation process corrects the assumed convolution kernel values and the weights of the fully connected layer. By successively performing feature extraction, the networks learn to emphasize dynamically relevant features while having a diminishing effect on the irrelevant features. After a prescribed number of these iterations, the CNN-based learning is deemed to be properly trained and ready to predict the force coefficients of a new geometry set. The components of the iterative training are elaborated as follows for our CNN-based deep learning.
The CNN starts with an input function which maps the macroscopic parameter of concern with the independent variable(s). For the sake of explanation, let the input function be a matrix $D(\mathbf{x}) = \{D_{ij}\}$ of size $\mathbb{R}^{m\times n}$. It is critical to define the input function for a large enough domain with a proper refinement. However, we show in Section 4.5 that the refinement required for the CNN is much less than for the finite-element Navier-Stokes solvers. After generating the input function, the feed-forward process is applied to it to extract the dominant features of the fluid flow.

The feed-forward process applies the operations convolution, rectification, and down-sampling in sequence, first on the input matrix, and then on the output obtained from the preceding operations. We refer to each of these operations as a layer and the combination of adjacent convolution, rectification and down-sampling layers as a composite layer. We denote the outputs of the layers of the $l$th composite layer as $Y^{Cl}$, $Y^{Rl}$ and $Y^{Dl}$, respectively. The total number of composite layers, denoted by $N$, is a critical factor influencing the accuracy of the entire convolutional network. Such system variables are referred to as hyper-parameters, and their values need to be determined for optimal performance. The feed-forward process takes a set of 2D vectors as input and applies the discrete convolution operation with a number of 2D kernels. For the first convolution operation, the input is the input function $D(\mathbf{x})$. Let us denote the total number of kernels in the $l$th composite layer by $k_l$ and group all those kernels of size $\delta\times\xi$ into the tensor $K_l\in\mathbb{R}^{\delta\times\xi\times k_l}$. When properly trained, each of these kernels represents a feature of the system contributing to the variation of the unknown output parameter due to the change in the design parameter. The convolution kernel size determines the input neighborhood size which represents a particular feature.
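The distance-function input named in the abstract can be sketched concretely: each entry $D_{ij}$ is the Euclidean distance from a grid node to the nearest sampled point of the bluff-body boundary. The following is a brute-force illustration; the function and argument names are ours, not the authors' implementation:

```python
import numpy as np

def distance_input(xs, ys, boundary_pts):
    """Build the m-by-n input matrix D: the Euclidean distance from every
    grid node (xs[i], ys[j]) to the nearest sampled boundary point."""
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    D = np.full(X.shape, np.inf)
    for (bx, by) in boundary_pts:
        D = np.minimum(D, np.hypot(X - bx, Y - by))
    return D
```

For a finely sampled boundary, this approaches the true distance function; a KD-tree would replace the inner loop for large grids.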
When we apply the first layer of 2D convolution on the input matrix, i.e. $l = 1$, it outputs the 3D tensor $Y^{C1} = \{Y^{C1}_{ijk}\}$ [21]:
$$Y^{C1}_{ijk} = D_{ij} \star K_{ijk} = \sum_{b=1}^{n}\sum_{a=1}^{m} D_{ab}\, K_{(i-a+1)(j-b+1)k}, \qquad k = 1, 2, \ldots, k_1, \tag{9}$$
where the symbol $\star$ denotes the convolution operation, which allows extracting local features with some prior knowledge. Eq. (9) changes slightly if the convolution blocks are skipped over more than one element of the input matrix in any direction. The skipping lengths along the two directions of the input, termed the stride $s_l = [s_{l1}\ s_{l2}]$, is also an important hyper-parameter. Except for the first composite layer, the input to the convolution layer is a 3D tensor. The $l$th layer takes each 2D slice of the tensor $Y^{D(l-1)}$ and convolves it with all the kernels, which creates a 3D tensor $Y^{Cl}$ with a depth of $k_{(l-1)}\times k_l$. These additional convolutions increase the neighborhood size of a particular feature and may cause some errors as well.

Since convolution is a linear operation, it is required to introduce a nonlinear rectification process to capture the complex nonlinear flow features such as the bluff body wake. We employ the common rectified linear unit (reLU) for this purpose, which yields $Y^{Rl} = \{Y^{Rl}_{ijk}\}$ as:
$$Y^{Rl}_{ijk} = \max(Y^{Cl}_{ijk}, 0). \tag{10}$$
The reLU is generally observed to be a faster-learning nonlinearization process than the other commonly used $\tanh(x)$ and sigmoid $(1+e^{-x})^{-1}$ functions [22]. By generalizing the convolution for the $l$th layer, the size $v_l$ of $Y^{Cl}$ will be $v_l = \left\lceil m/\prod s_{l1}\right\rceil \times \left\lceil n/\prod s_{l2}\right\rceil \times \prod k_l$, which gets larger with smaller strides and a large number of kernels and convolution layers. A down-sampling/pooling layer between convolutions can be used to reduce the size of the CNN matrices at the cost of losing some data.
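Eqs. (9)-(10) can be sketched directly in NumPy for a single kernel with unit stride. Note that Eq. (9) is a true convolution (flipped kernel), whereas many frameworks implement cross-correlation; for symmetric kernels the two coincide. This is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def conv2d(D, K):
    """'Valid' 2D discrete convolution of input D with kernel K,
    with the kernel flipped as in Eq. (9); stride 1 for simplicity."""
    m, n = D.shape
    p, q = K.shape
    Kf = K[::-1, ::-1]  # flip for true convolution
    out = np.empty((m - p + 1, n - q + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(D[i:i + p, j:j + q] * Kf)
    return out

def relu(Y):
    """Rectifier of Eq. (10): elementwise max(Y, 0)."""
    return np.maximum(Y, 0.0)
```

Stacking `relu(conv2d(...))` over multiple kernels reproduces one composite layer (before pooling).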
The commonly used pooling techniques are the maximum, average and minimum, where the respective function (max, avg or min) is applied to small windows of the rectified output, generally with strides equal to the pooling window size. It reduces the size of the feature map and also drops some data. After the $N$ composite layers, the feed-forward process ends with an additional 1D vector called a fully connected layer, which has the size $\mathbb{R}^{\beta\times 1}$ where $\beta$ is the stacked size of $v_N$. Figure 1 summarizes a general architecture of the convolutional neural network (input layer, convolutional layer + rectifier layer, pooling layer, a further convolutional layer + rectifier layer and pooling layer, a fully connected layer, and the output layer), illustrating the alternating sequential stages between convolutional and pooling layers. While the convolution step is denoted in dashed lines, the pooling-layer window is denoted in dotted lines. Units in a convolutional layer are feature maps, and each unit is connected to local patches in the feature maps of the previous layer through a set of weights. In the figure, a convolutional layer represents two sub-layers, namely convolution and rectifier. The feature maps of the final down-sampling layer are fed into the fully connected layers for the feed-forward process. While the convolutional layer plays a role in detecting local connections of features from the previous layer, the pooling layer collapses similar features into one. The final set of predictions for the force coefficient of the feed-forward process, $C^F_{CNN}$, for a training set with $S$ inputs is given by:
$$[C^F_{CNN}]_{S\times 1} = [w]_{S\times\beta}\,[Y^N]_{\beta\times 1}, \tag{11}$$
where $w$ is the weight matrix of the fully connected layer. The values of the kernels and these weights are the adjustable parameters of the system; we denote them as a set $W$ for ease of explanation.
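The remaining two feed-forward operations can be sketched in the same minimal style: non-overlapping max pooling (stride equal to the window size, as stated above) and the fully connected read-out of Eq. (11) for a single input. Names are ours; this is an illustration, not the authors' code:

```python
import numpy as np

def max_pool(Y, w):
    """Non-overlapping max pooling with window w (stride equal to the
    window size); trailing rows/columns that do not fill a window are
    cropped for simplicity."""
    m, n = Y.shape
    Yc = Y[:m - m % w, :n - n % w]
    return Yc.reshape(m // w, w, n // w, w).max(axis=(1, 3))

def fully_connected(Y_N, weights):
    """Eq. (11) for one input: stack the final feature map into a
    beta-vector and take the weighted sum to get the predicted
    force coefficient."""
    return weights @ Y_N.ravel()
```

Average or minimum pooling follows by swapping `max` for `mean` or `min` in the axis reduction.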
$C^F_{CNN}$ is compared with the full-order result ($C^F_{FOM}$) and the weights are then adjusted to minimize the error using the back-propagation process. Being a universal black-box approximation, the above neural network with convolution layers can be effective in dealing with the local interaction and multiscale nature of the Navier-Stokes PDE system. In subsection 3.4, we provide a connection between the convolution process and the memory kernels in the Mori-Zwanzig formalism used for the model reduction of a full-order hydrodynamic system.
The role of back-propagation is to train a multi-layered neural network to learn the relevant features from any arbitrary input-output dataset. The training phase is an iterative process, which continuously minimizes the error between the predicted and target outputs. It compares the output of the feed-forward pass with the full-order result and corrects the set of weights $W$ to minimize the error. While the network is initialized with randomly chosen weights, the back-propagation process attempts to find a local minimum of the error function. Let us represent the entire feed-forward process for the $p$th input matrix $d_p$ with the function $H(d_p, W)$. We define a cost function $G$ to measure the discrepancy between the feed-forward prediction and the full-order fluid coefficient:
$$E_p = G\left(C^F_{FOM}, H(d_p, W)\right). \tag{12}$$
In this study, the cost function $G$ is the root-mean-square error ($L_2$) function. Now the target is to update the weight set $W$ to minimize the error $E_p$ using a standard gradient descent back-propagation method.

For clarity, we denote the layer number in superscripts and the iteration number in subscripts unless otherwise mentioned. For simplicity, let us denote the output of the $l$th composite layer of the feed-forward process as $Y^l$. It can be related to the previous layer output $Y^{(l-1)}$ and the weights of the $l$th layer $W^l$:
$$Y^l = F\left(W^l, Y^{(l-1)}\right), \tag{13}$$
where $F$ represents a single pass of convolution, rectification, and down-sampling. Note that $Y^0 = D$ and $Y^{(N+1)} = C^F_{CNN}$. The back-propagation process starts at this predicted value, where the error gradient of the fully connected layer can be determined first. We next calculate the error gradients with respect to the weights and the previous layer output recurrently in the backward direction. When the error gradient of the $l$th output layer, $\frac{\partial E_p}{\partial Y^l}$, is known, we can get the error gradients using the chain rule:
$$\frac{\partial E_p}{\partial W^l} = \frac{\partial F}{\partial W}\left(W^l, Y^{(l-1)}\right)\frac{\partial E_p}{\partial Y^l}, \qquad \frac{\partial E_p}{\partial Y^{(l-1)}} = \frac{\partial F}{\partial Y}\left(W^l, Y^{(l-1)}\right)\frac{\partial E_p}{\partial Y^l}, \tag{14}$$
where $\frac{\partial F}{\partial W}$ and $\frac{\partial F}{\partial Y}$ are the Jacobians of $F$ relative to $W$ and $Y$, evaluated at the $l$th layer. The successive gradients during the back-propagation can also be linked with the variational least-squares minimization. After the evaluation of all the error gradients $\frac{\partial E_p}{\partial W}$, we use the stochastic gradient descent method with momentum (SGDM) [23] to adjust the parameter set for the $T$th iteration:
$$W_T = W_{T-1} - \frac{\gamma}{S}\sum_{p=1}^{S}\frac{\partial E_p}{\partial W} + \eta\left(W_{T-1} - W_{T-2}\right), \tag{15}$$
where $\gamma > 0$ is the learning rate and $\eta\in[0,1]$ is called the momentum, the hyper-parameter which determines the contribution from the previous gradient correction. The gradient in the SGDM is an expectation, which may be approximately estimated using a small set of samples. Refer to [24] for the proof of convergence and the derivation of the SGDM method. In the next section, we provide the link between the deep kernel learning with the momentum term and the simplified analytical form of the Navier-Stokes momentum equation. We have outlined the detailed formulation of our CNN-based deep learning algorithm via stochastic gradient descent. The present approach is solely data-driven and does not rely explicitly on the underlying governing equations of the dynamical system.
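One SGDM update of Eq. (15) can be sketched as follows; this illustrates the update rule only, with variable names of our choosing:

```python
import numpy as np

def sgdm_step(W_curr, W_prev, grads, lr, momentum):
    """Eq. (15): average the per-sample gradients dE_p/dW over the
    mini-batch of size S, then add the momentum term
    eta * (W_{T-1} - W_{T-2})."""
    g = np.mean(grads, axis=0)  # (1/S) * sum_p dE_p/dW
    return W_curr - lr * g + momentum * (W_curr - W_prev)
```

Iterating this two-term recurrence damps oscillatory gradient directions while accelerating persistent ones, which is the practical benefit of the momentum term.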
Both flow physics and deep learning rely on many degrees of freedom, which interact in a nonlinear and multiscale manner. Here, we attempt to demonstrate a physical analogy for deep learning in the context of the Navier-Stokes equations for fluid flow modeling. This analogy will provide a fundamental basis and a conceptual underpinning to understand how deep learning can provide a useful reduced model for flow dynamics. The deep learning based on convolutional neural networks can be considered as a black-box mapping between two functions [9]:
$$y(\mathbf{x}) = L_\phi F(\mathbf{x}), \tag{16}$$
where $y$ and $F$ are the mappings of the input and output with the independent variable $\mathbf{x}$. The aim of the CNN process is to determine the unknown mapping $L$ by extracting its features $\phi$ using available data. Moreover, the CNN transforms this problem to determine some unknown kernels $\kappa$ in the equation:
$$L_\phi F(\mathbf{x}) = (\kappa \star F) = \int \kappa(\mathbf{x} - \mathbf{x}')\,F(\mathbf{x}')\,d\mathbf{x}', \tag{17}$$
where $\mathbf{x}'$ is a dummy variable. The task of the CNN is now to estimate these kernels (memory effect) by a learning process based on available input-output combinations obtained by full-order methods. Essentially, the CNN-based process facilitates the systematic approximation of the memory effects during the model reduction. We will justify this approximation property of deep learning by linking it with the Mori-Zwanzig formalism. Before we provide an analogy of the convolution effect and the stochastic iterative correction process, we transform the incompressible Navier-Stokes momentum equation into a simplified integral-differential form. To begin, let $\phi$ be the divergence-free velocity potential function in Eq. (1), such that $\mathbf{u} = \nabla\phi$. Substituting this for a very small volumetric fluid region $\Delta V$ gives:
$$\frac{\partial\nabla\phi}{\partial t} + (\nabla\phi\cdot\nabla)\nabla\phi = -\frac{1}{\rho}\nabla p + \nu\nabla^2\nabla\phi, \tag{18}$$
which can be reduced to the reaction-diffusion equation:
$$\frac{\partial\psi}{\partial t} - \nu\nabla^2\psi = \frac{\Delta p}{\mu}\psi, \tag{19}$$
where $\psi = -\nu\ln\phi$.
Detailed steps of the transformation of the Navier-Stokes momentum equation into the reaction-diffusion equation are presented in Appendix A. From a statistical viewpoint, Eq. (19) is a particular case of the Einstein-Kolmogorov differential equation:

∂Υ/∂t = −∂(AΥ)/∂x + ∂²(BΥ)/∂x²,  (20)

where Υ(x, t; x₀, t₀) is the concentration at spatial point x at time t of particles which started undergoing Brownian motion at time t₀ from point x₀. Hence Υ(x, t; x₀, t₀) · ΔV is the probability of finding a particle undergoing Brownian motion. The connection between the partial differential equations and the Brownian motion can be constructed by the basic differentiation of the distribution function of the underlying Gaussian stochastic process. A statistical-mechanical theory of a many-fluid-particle system can be employed to link the collective transport and the Brownian motion. The following equation, the so-called Einstein-Kolmogorov equation, relates the particle concentration with the initial concentration:

Υ(x, t; x₀, t₀) = ∫ Υ(x, t; P, θ) Υ(P, θ; x₀, t₀) dV_P,  (21)

where t₀ < θ < t and the integral is taken over the entire spatial domain. This equation leads to Eq. (20) as shown in [25]. Eq. (20) relates the stochastic process of the Brownian motion to a continuum partial differential equation on concentration.

Next, we can formulate a general solution for the reaction-diffusion equation, Eq. (19), using the Green's function. Let L be the operator L = ∂/∂t − ν∇². The differential equations for ψ(x, t) and the Green's function G(x, t; ξ, τ), for x, ξ ∈ D and t, τ ≥ 0, are:

L ψ(x, t) = (Δp/μ) ψ(x, t),  (22)

L G(x, t; ξ, τ) = δ(x − ξ) δ(t − τ),  (23)

where D is the fluid domain of interest and ξ and τ are dummy variables. This leads to the integral form of the Green's-function-based general solution for the non-homogeneous diffusion equation [26] (see Appendix B):

ψ(x, t) = ∫₀^∞ ∫_V G (Δp/μ) ψ(x, t) dV_ξ dτ + ∫_V G ψ(x, 0) dV_ξ + ν ∫₀^∞ ∮_S [G ∇_ξ ψ + ψ ∇_ξ G] · n dS dτ.  (24)

Note that the second and third terms represent the initial and boundary conditions. Considering only the first term, the Green's function is the unknown kernel analogous to the kernels of Eq. (17). Then the black-box process L_φ in Eq. (16) is analogous to the fixed-point iteration of the nonlinear functional map of Eq. (24) given by F:

ψ^{m+1}(x, t) = F(ψ^m(x, t)),  (25)

where m is the iteration number. The above steps provide a physical process to construct the direct mapping function from the Navier-Stokes momentum equation for the hydrodynamic variable of interest. This integral transformation of the variable of interest can also be realized using the deep architecture, where the weights between the successive layers are connected through a mapping function. We next present the Mori-Zwanzig formalism to establish a connection with the reduced system obtained from the Navier-Stokes equation.
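The fixed-point iteration of Eq. (25) can be illustrated with any contractive map. The scalar map below is a stand-in for the functional F; it is not derived from Eq. (24):

```python
import math

def fixed_point(F, x0, tol=1e-10, max_iter=500):
    """Iterate x_{m+1} = F(x_m) until successive iterates agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_new = F(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# cos(x) is contractive near its fixed point, so the iteration converges
x_star = fixed_point(math.cos, 1.0)
```

Convergence requires the map to be a contraction near the fixed point, which is the implicit assumption behind iterating Eq. (25).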
During the coarse-grained representation via the Mori-Zwanzig formalism, the complexity of the large degrees of freedom in the full-order model is replaced by the convolution (memory) kernels. In the deep learning, the CNN process provides systematic approximations for the memory effects of the original dynamics.

3.4.2. Mori-Zwanzig formalism and convolution integral

We briefly discuss the Mori-Zwanzig formalism [27, 28] in the context of modeling memory effects during the CNN-based deep learning. This formalism provides a strategy for integrating out a subset of variables in a generic dynamical system through the coarse-graining process. Let the coarse-grained modes of the function ψ in Eq. (19) be χ such that ψ(χ, t) = g(χ), where the initial condition χ(0) = χ₀ with χ₀ ∈ L². Consider the decomposition of ψ into the slow (resolved) modes ψ̂ = ψ(χ̂, t) and the fast (unresolved) modes ψ̃ = ψ(χ̃, t). As presented in Appendix C, the Mori-Zwanzig equation for the resolved variable of interest ψ̂ can be derived as:

∂ψ̂/∂t = Θψ̂ + ∫₀ᵗ K(t₀) ψ̂(t − t₀) dt₀ + F(χ̂, t),  (26)

where the three terms on the right-hand side are the Markovian, convolution (memory) and fluctuation terms, respectively. The first term is a Markovian term given by Θψ̂ = e^{Lt} P L ψ̂, where P is a projection of ψ onto the slow variables ψ̂ and the differential operator L is given by L = (Δp/μ + ν∇²). The Markovian term depends only on the value of ψ̂ at time t. The second term depends on every value of ψ during the interval [0, t]; hence it represents the memory effect via the convolution process. The convolution (memory) kernel is given by K(t)ψ = P L e^{QLt} Q L ψ, where Q = I − P is an orthogonal subspace of P and I denotes the identity operator. The projection operators P and Q are self-adjoint and they are orthogonal to each other. The third term F(χ̂, t) = e^{QLt} Q L ψ̂ represents the fluctuation (noise) and it satisfies the orthogonal dynamics equation. In a standard stochastic differential equation, the time evolution of the resolved variable is considered as the combination of the real-time Markovian term and the fluctuation. However, the Mori-Zwanzig formalism contains the convolution kernel with memory effect, which is not correlated with the fluctuation. The CNN-based learning process is specifically trained to extract the memory-dependent dynamics via data-driven modeling. In a broader sense, for fluid dynamics, we can apply this dimensionality reduction approach to build subgrid models for turbulence via data-driven approximations. Next, we show a link between the stochastic gradient descent with the momentum and the discrete counterpart of the reaction-diffusion system.

By considering the time-discretized form of the reaction-diffusion equation, Eq. (19), at two adjacent time steps, we get:

(ψ^T − ψ^{T−1})/Δt = (Δp/μ) ψ^T + ν∇²ψ^{T−1},  (27)

(ψ^{T−1} − ψ^{T−2})/Δt = (Δp/μ) ψ^{T−1} + ν∇²ψ^{T−2}.  (28)

Subtracting the equations and rearranging the terms, we obtain:

ψ^T = ψ^{T−1} + [2Δt μν/(μ − Δp Δt)] (∇²ψ^{T−1} − ∇²ψ^{T−2}) + [2μ/(μ − Δp Δt)] (ψ^{T−1} − ψ^{T−2}).  (29)

This is analogous to the stochastic gradient descent method with momentum given in Eq. (15). The second term on the right-hand side is equivalent to the error gradient and the third term represents the momentum term. The above semi-discrete form of the simplified Navier-Stokes momentum equation provides a connection with the discrete kernel learning (i.e., black-box integrator) from the stochastic gradient with momentum. We next present the sensitivity and convergence study of the proposed CNN-based stochastic gradient method with the momentum term.
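The two-step structure of Eq. (29), a gradient-difference term plus a momentum-like term, can be sketched for a 1D periodic field. The coefficients a and b stand for the prefactors of Eq. (29); their values here are arbitrary illustrative choices, not derived from the flow parameters:

```python
import numpy as np

def laplacian_1d(psi, dx):
    """Second-order central Laplacian on a periodic 1D grid."""
    return (np.roll(psi, -1) - 2.0 * psi + np.roll(psi, 1)) / dx**2

def two_step_update(psi1, psi0, dx, a=1e-3, b=0.5):
    """psi^T = psi^{T-1} + a*(lap psi^{T-1} - lap psi^{T-2})
                         + b*(psi^{T-1} - psi^{T-2})   (cf. Eq. 29)"""
    return psi1 + a * (laplacian_1d(psi1, dx) - laplacian_1d(psi0, dx)) \
                + b * (psi1 - psi0)

x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
psi0, psi1 = np.sin(x), 1.01 * np.sin(x)
psi2 = two_step_update(psi1, psi0, dx=x[1] - x[0])
```

The structural correspondence with SGDM is direct: the Laplacian difference plays the role of the error-gradient correction and the (psi1 - psi0) term plays the role of the momentum correction.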
4. Results
In this work, for the first time, we apply a deep convolutional network to the fundamental fluid dynamics problem of the flow past bluff bodies. In particular, we quantify the variation of the force coefficients, e.g., lift (C_L) and drag (C_D), as a function of the bluff body geometry. Without loss of generality, we consider two of the most widely studied bluff body geometry variations: the aspect ratio (AR) of an elliptical body and the rounding angle (ϕ) of a square-shaped body. Using a trained deep neural net, we will examine how well the deep nets can predict unseen configurations (i.e., the generalization property of deep nets).

Figure 2: Problem definition and parameter space for the application of the deep convolutional network: (a) schematics of bluff-body geometries with relevant dimensions: aspect ratio (AR) of ellipses and rounding angle (ϕ) of squares are shown, (b) known training geometry set, (c) vortex street and time histories of the force coefficients for a representative case: circular cylinder (AR = 1 and ϕ = 90°) and (d) unknown geometries to be predicted by using the FOM data of the training set.

Figure 3 (a) shows the two-dimensional problem domain and the boundary conditions. A Dirichlet boundary condition (u = U, v = 0) is employed at the inlet and a Neumann boundary condition (∂u/∂x = 0, ∂v/∂x = 0, p = 0) is applied at the outlet. The top and bottom boundaries are applied with a symmetric boundary condition (∂u/∂y = 0, v = 0) to simulate the far-field conditions. A no-slip boundary condition is applied on the bluff body surface. The flow domain consists of a collection of non-overlapping finite elements with a body-fitted formulation with finer grid resolution in the regions of high gradients. While Fig. 2 (a) shows the bluff body shapes immersed in a uniform flow of speed U, Figs. 2 (b) and (d) illustrate the training cases and prediction cases, respectively.
Even though we define two different configurations of bluff bodies, they share a common element: the circular bluff body can be considered as an ellipse with aspect ratio AR = 1 and as a rounded square with a rounding angle ϕ = 90°. Hence, we assume that the high-fidelity data of one configuration will be useful in predicting the wake flow characteristics of the other configuration. We perform the full-order simulation over a small subset of the infinite set of problem geometries, shown in Fig. 2 (b), for a laminar flow at Re = 100 (based on the diameter D) to extract the velocity and pressure fields for 20 cycles of vortex shedding. Using the flow field data, we first calculate the time histories of the lift and drag coefficients and then the mean drag (C̄_D) and the root mean square (rms) of the lift (C_L^rms) for each training case. Figure 2 (c) shows the vortex shedding pattern, the time histories of the force coefficients and the extracted time-averaged statistics for the representative case of the circular shape. We feed these values as the output layer of the CNN, while a distance function based on the bluff body geometry is fed as the input to the CNN. The deep neural network is trained to obtain the functional relationship between the bluff body geometry and the force dynamics such that it can predict the force coefficients for any perturbed geometry of the training set. As a demonstration, we estimate the force coefficients for all geometries in the set shown in Fig. 2 (d). The fluid loading is computed by integrating the surface traction considering the first layer of elements located on the cylinder surface.

Figure 3: Full-order model based on the Navier-Stokes equations: (a) problem domain (upstream length L_u = 20D, downstream length L_d = 40D, with the boundary conditions described above) and (b) unstructured 2D mesh.

The instantaneous lift and drag force coefficients are defined as

C_L = (1 / (½ρU²D)) ∫_Γ (σ · n) · n_y dΓ,  (30)

C_D = (1 / (½ρU²D)) ∫_Γ (σ · n) · n_x dΓ,  (31)

where D is the cylinder dimension perpendicular to the flow direction and n_x and n_y are the Cartesian components of the unit normal n. The domain is discretized using an unstructured finite-element mesh shown in Fig. 3 (b). There is a boundary layer mesh surrounding the bluff body and a triangular mesh outside the boundary layer region. This mesh is obtained by the convergence studies conducted in [16, 29]. The mesh selected for the circular cylinder contains 86,638 elements. This element count slightly differs for each bluff-body geometry. Based on our previous temporal convergence studies in [16, 29], we employ Δt = 0.025 (D/U_∞).

We define an input function which represents the bluff body geometry and is independent of any other variable. We use the Euclidean distance from the bluff body boundary given by d(x, y) = {d_ij} ∈ R^{m×n} for a Cartesian XY-coordinate frame (x ∈ X, y ∈ Y) centered at the bluff body center such that:

d_ij = (min_Γ ‖r(x_i, y_j) − r(x_Γ, y_Γ)‖) b,  i = 1, 2, ..., m,  j = 1, 2, ..., n,  (32)

where r is the distance to the domain point from the bluff-body center and Γ represents the independent bluff body boundary points. The binary condition b = 0 on and inside the boundary and b = 1 otherwise. The integers m and n are the matrix dimensions in the x and y directions, which depend on the underlying fluid domain discretization. In all cases, the input discrete representation of the geometry is evaluated on a coarse uniform grid (i.e., x_i, y_j) on a rectangular domain extending 20D upstream and 40D downstream of the body, which results in a 2D matrix of size m × n with m = 201. Here (x_Γ, y_Γ) represents the (x, y) coordinates of 512 points on the bluff body boundary, which are independent of the main coarse grid. This grid resolution and the fine boundary discretization properly capture the geometry variations among different bluff bodies. The feed-forward process is then applied to the input geometry matrix to extract the dominant features of the flow. As mentioned in Section 3.2, we need to tune the algorithm by optimizing a few hyper-parameters. The next section describes the hyper-parameter design performed for this particular case.

Extreme refinement and overuse of the convolution layers may cause the CNN to overfit the training dataset and make it incapable of predicting the forces for perturbed geometries of the training dataset. However, the under-utilization of convolution will increase the prediction error. Here, we present an empirical sensitivity study to establish the hyper-parameter values for the best performance of the CNN-based learning.
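The distance-function input of Eq. (32) can be sketched directly. The example below uses a circular body; the grid extents, body radius, and helper names are illustrative choices, not the paper's exact values:

```python
import numpy as np

def distance_input(xs, ys, bx, by, inside):
    """d_ij = min over boundary points of ||r(x_i,y_j) - r(x_G,y_G)|| * b,
    with b = 0 on/inside the body and b = 1 outside (Eq. 32)."""
    X, Y = np.meshgrid(xs, ys, indexing="ij")                # (m, n) grid
    d = np.sqrt((X[..., None] - bx) ** 2
                + (Y[..., None] - by) ** 2).min(axis=-1)     # min over boundary
    return d * (~inside(X, Y))

# circular cylinder of radius R sampled at 512 boundary points
R = 0.5
theta = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
bx, by = R * np.cos(theta), R * np.sin(theta)
xs = np.linspace(-2.0, 2.0, 41)
ys = np.linspace(-2.0, 2.0, 41)
d = distance_input(xs, ys, bx, by, inside=lambda X, Y: X**2 + Y**2 <= R**2)
```

For a non-analytic body shape the `inside` test would come from a point-in-polygon check on the same 512 boundary points.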
Specifically, we address some of the general issues raised by [6] for deep learning methods in fluid flows, e.g., the number of convolution layers, the number of filters per layer, the size of the filters and the learning rate.

The main benchmark considered here is the maximum relative error of the drag coefficient prediction, which is required to be below the 5% threshold for an acceptable hyper-parameter set. We evaluate this error for the training set and also select a subset of 10 target geometries (five ellipse aspect ratios and five rounding angles), which we denote as the test set. We start with the hypothesis that the CNN process with the least overall size is optimal for the predictions, since it has the smallest fluid domain neighborhood representing a flow feature of dynamical importance. Our first trial is to increase the stride, which causes the results to deteriorate since it increases the fluid neighborhood size of a feature and generates erroneous local connectivities between farther points in the fluid domain. Hence, for all convolution layers, we use a stride of s_l = [1 1], i.e., the convolution kernel block is moved adjacently in both directions of the 2D matrix. As shown in Fig. 4 (a), the least error is observed when 50 kernels are used. When fewer kernels are used, the network misses some flow features required to predict C_D from the bluff body geometry. If we increase the number too much, the CNN introduces erroneous features, increasing the prediction error. Figure 4 (b) shows that the prediction error for the training set is minimal when 8 × 8 kernels are used, whereas the 4 × 4 kernels better predict C_D for new geometries via the generalization property of deep nets. Also, the 8 × 8 kernels overfit the training data, deviating the C_D prediction from the actual value. At a learning rate of 0.01, we obtain the best predictions, as shown in Fig. 4 (c). Smaller learning rates do not provide an acceptable accuracy within a prescribed number of iterations. If the learning rate is too large, the weights of the CNN oscillate between the same values without converging to the minimum error. Next, we conduct the error analysis with a different number of layers and kernels, and with and without a down-sampling layer, as shown in Fig. 4 (d). We find that the CNN performs best when a single composite layer with
Figure 4: Hyper-parameter analysis: maximum error variation with (a) number of kernels, (b) kernel size, (c) learning rate and (d) number of layers. The solid lines denote the error threshold of 5%. The layer configurations in (d) are: A - 1 layer, 50 kernels, w/o pooling; B - 1 layer, 50 kernels, w/ pooling; C - 2 layers, 7 kernels, w/o pooling; D - 2 layers, 50 kernels, w/o pooling; E - 2 layers, 28 kernels, w/ pooling; F - 2 layers, 50 kernels, w/ pooling.
50 kernels is used without the down-sampling layer. When we use more layers, the fluid neighborhood size representing a local feature increases, creating invalid local connectivities. Down-sampling drops many points of the feature map obtained by a kernel, which causes a much larger fluid neighborhood to be represented by a single value. This adversely affects the wake flow predictions, where the features are highly local. To summarize, we use a CNN with one convolution layer consisting of 50 kernels of size 4 × 4, followed by a ReLU rectification layer and an SGDM back-propagation with a learning rate of 0.01. Some sample data and codes used in this work are publicly available on GitHub at: https://github.com/TharinduMiyanawala/CNNforCFD .

4.4. Check for overfitting

When tuning the hyper-parameters of the CNN, we face the risk of overfitting the algorithm to the training data. However, it is possible to check the developed CNN algorithm for overfitting without running a validation check with FOM for the predictions. We first test the algorithm using the standard 1-fold exclusion test. Here we exclude one element at a time from the training set and predict it using the CNN trained by the other inputs. If the system is overfitting, the exclusion of one element causes the system to fail to reach the required accuracy. However, the full-set predictions will generally be more accurate than the 1-fold excluded predictions. We also introduce a class-wise training where we only employ the common element (the circle), one ellipse and one rounded square as the training set. If the prediction error of this test falls below the required threshold, it confirms that the algorithm is not overfitting. However, if the error of this test is higher, it does not give a conclusion on overfitting. According to Fig. 5, the relative error is below the 5% threshold for both tests, which confirms that the CNN-based model does not overfit to the training data. In the next section, we predict the force coefficients for new bluff body geometries using the designed CNN.
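The 1-fold exclusion test described above amounts to leave-one-out validation. A generic sketch follows, with a simple least-squares fit standing in for the CNN; the data, model, and helper names are illustrative only:

```python
import numpy as np

def one_fold_exclusion_errors(X, y, fit, predict):
    """Relative prediction error on each case when it is excluded from training."""
    errs = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # exclude case i
        model = fit(X[keep], y[keep])          # train on the rest
        errs.append(abs(predict(model, X[i]) - y[i]) / abs(y[i]))
    return np.array(errs)

# stand-in for the CNN: a linear least-squares fit
fit = lambda X, y: np.polyfit(X, y, deg=1)
predict = lambda coef, x: np.polyval(coef, x)

AR = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 10.0])   # illustrative inputs
cd = 1.2 + 0.1 * AR                              # illustrative targets
errs = one_fold_exclusion_errors(AR, cd, fit, predict)
```

If every excluded-case error stays below the chosen threshold, the model is not merely memorizing the training set.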
Figure 5: (a) Assessment of CNN-based C_D prediction for the training set using full-set, 1-fold exclusion and class-wise training. (b) Relative percentage error for each case, with the 5% threshold indicated.
Figure 6: Prediction of the force coefficients: (a) mean drag and (b) rms of the lift. The results are compared with the simulation results of [30] and [31].
We use the convolutional neural network designed in Section 4.3 to determine the force coefficients for the perturbed geometries shown in Fig. 2 (d). We determine the error of prediction and then compare the computational cost of the full-order model and the CNN-based approximation. We feed the CNN model with the new geometry input function generated for the bluff body configurations in Fig. 2 (d) and obtain the predictions of the force coefficients. Figure 6 compares the results of the CNN-based prediction with the full-order Navier-Stokes results. The CNN successfully predicts the force coefficients within 5% error, which clearly demonstrates the generalization of our optimized convolutional nets. A direct functional relationship between the fluid forces and the input geometric parameters is established for unsteady wake flow via the proposed deep learning method.
Table 1 describes the computational resources and the elapsed time for the full-order model when running on a multicore workstation for high-performance computing (HPC) and on a single-core personal computer (PC); the CNN-based computation is performed on the same PC.

Table 1: Summary of computational resources used, speed-up and error between FOM and CNN data. While the FOM is performed on a multi-core workstation (HPC) and a single-core personal computer (PC), the CNN-based computation is done on a PC.

                               FOM-HPC*    FOM-PC*     CNN**
No. of Processors              24          1           1
Processor Speed (GHz)          2.60        2.60        2.60
RAM (GB)                       256         16          16
Elapsed time (online) (sec)    2659.44     63548.21    69.41
Elapsed time (offline) (sec)   -           -           287.24
Online time per case (sec)     2659.44     63548.21    3.31
Total time per case (sec)      2659.44     63548.21    16.98
Speed-up factor (online)       -           -           803†, 19198††
Speed-up (offline + online)    -           -           156†, 3742††
Maximum Error                  -           -           < 5%†

* per simulation, ** for an input set of 21 cases, † relative to FOM-HPC, †† relative to FOM-PC.

While the parallel simulations utilize 24 processors and 256 GB RAM for each case, the serial PC simulations are performed on a single processor with 16 GB RAM. Note that each FOM case is a standalone and completely online computation. To obtain the force coefficients of one case, we run the FOM for 20 shedding cycles, which minimizes the numerical errors introduced by the initial transient response. Hence every simulation has to run until tU/D ≈ 250, i.e., T = 10⁴ time steps at Δt = 0.025 (D/U_∞). For a mesh of M elements, the GMRES method performs ∼(2M² + NM) floating-point operations for the matrix-vector products and additions in the N-th iteration for the N-th Krylov subspace vector. Hence, for a single time step, it requires ∼(2M²N + N²M) operations. Since N ≪ M, the total number of operations performed in the FOM solver for T time steps is ∼O(M²T). In this study, M ∼ O(10⁵) and T ∼ O(10⁴); hence the number of operations is ∼O(10¹⁴) per case. Unlike the FOM computation via the finite-element GMRES approach, the CNN-based deep learning prediction has an offline training process, after which the force predictions are obtained for the required inputs. This offline process gives a library of kernels and weights which are used for all predictions, whereas in the case of full-order simulations the high-fidelity data of one FOM simulation is generally not utilized for the next FOM simulation.
Figure 7: Comparison of the processes of extracting the force coefficients from the FOM and the CNN for the flow past a bluff body. Here σ(u, p) is the Cauchy stress tensor and C_X = [C̄_D, C_L^rms]. The convolution layer has l kernels of size p × p, where kr_i is the i-th component of the r-th layer. FC_i is the i-th element of the fully connected layer.

Figure 7 illustrates a comparison of the online processes of the FOM and the CNN. The online prediction phase of the CNN is an extremely fast process, since it requires a far smaller number of floating-point operations than the FOM. For a CNN consisting of an input matrix with M elements and l kernels of size p × p, the convolution process requires only ∼2Mp²l operations. The nonlinear rectification and the fully-connected layer mapping require a further ∼3Ml operations, which estimates the total number of operations as ∼Ml(2p² + 3) for a CNN consisting of a single convolution, rectifier, and fully-connected layer. In this study M ∼ O(10⁴), l = 50 and p = 4, which assesses the number of operations per case at ∼O(10⁷). Theoretically, the online process of the CNN is therefore ∼O(10⁶) times faster than the FOM. Due to the high efficiency of the CNN-based deep learning, the force predictions for all 21 cases are computed at once. Using the serial computation power of a PC, it gives ∼800 speed-up in comparison to the FOM simulations on an HPC which has 24 times the processing power and 16 times the RAM of the single-processor PC. Even with the offline training process, the CNN-based prediction is ∼150 times faster than the HPC-FOM. When the CNN-based deep learning computation and the FOM simulation are performed on the same PC, the CNN is ∼19198 times faster online and ∼3742 times faster including the offline training, with a total computation time per case of ∼O(10) seconds.

In summary, the CNN-based deep learning model predicts the force coefficients within 5% accuracy at up to ∼O(10³) speed-up per single new geometry, using a small fraction of the computational resources compared to its FOM counterpart. This allows the maximum utilization of the high-fidelity data obtained from the full-order methods. The CNN-based prediction has a very attractive speed-versus-accuracy trade-off for the design optimization process and the feedback control of bluff body flows. Furthermore, the CNN significantly reduces the computational resource requirement, enabling the design space exploration of wake flow problems on personal computers and devices.
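The operation-count comparison above can be reproduced arithmetically. The formulas below follow the estimates quoted in this section (∼O(M²T) for the FOM and ∼Ml(2p² + 3) for the single-layer CNN); a square 201 × 201 input is assumed for illustration, and the counts are order-of-magnitude estimates rather than measured flops:

```python
# FOM estimate: O(M^2 * T) operations for M mesh elements over T time steps
M_fom, T = 86_638, 10_000
fom_ops = M_fom ** 2 * T

# CNN online estimate: M*l*(2*p^2 + 3) operations for an M-element input,
# l kernels of size p x p (single convolution + rectifier + fully-connected)
M_cnn, l, p = 201 * 201, 50, 4
cnn_ops = M_cnn * l * (2 * p ** 2 + 3)

speedup = fom_ops / cnn_ops   # theoretical online speed-up
```

The resulting ratio is of order 10⁶, consistent with the theoretical estimate quoted above; the measured speed-ups in Table 1 are smaller because they include I/O and framework overheads.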
5. Discussion
Through our CNN-based deep learning results, we are able to identify the variation of the force coefficients with the two representative geometric perturbations: the aspect ratio of an ellipse and the rounding angle of a rounded square. In the present study, Eq. (31) links the instantaneous force coefficients with the pressure (p) and velocity (u) fields, which are governed by the Navier-Stokes equations as functions of physical and geometric parameters, e.g., Re and the cylinder geometry. In a broader sense, the full-order model provides a functional map between the geometric perturbations and the force coefficients, i.e., for the case of ellipses:

C̄_D(x) = f(AR(x)),  (33)

C_L^rms(x) = g(AR(x)).  (34)

The functions f and g comprise the force contributions from the flow features such as the shear layer, the near-wake, the vortex street, the time and space discretizations, the surface integration of Eq. (31) and the time averaging of the instantaneous forces. Each of the input-output pairs generated by the FOM analysis is independent, contains a large amount of high-fidelity data and consumes significant computational effort. While the FOM provides a great insight into the flow physics, it is generally not suitable for iterative optimization and efficient engineering design study. We have shown that the CNN-based deep kernel learning provides a functional relationship between the flow parameter and the design parameter, and that it can also capture dynamically important patterns. In this section, we further elaborate our results on the force prediction for varying aspect ratios. In particular, the variation of the lift coefficient with the aspect ratio (AR) of ellipses (Fig. 6 (b)) shows an interesting pattern: it has a maximum in the vicinity of AR = 2.5. Here we briefly examine the reasons for this behavior and investigate the capability of CNN-based deep learning to capture it.
The lift force of a bluff body is affected by many flow features. Here we discuss two such features, namely the near-wake circulation and the recirculation length, and their variations with the aspect ratio of the elliptical bluff body. According to the Kutta-Joukowski theorem for an inviscid flow around a streamlined body, the lift force per unit span can be expressed as F_L = −ρ_f U Γ, where Γ is the circulation given by the line integral ∮_S u_f · dS evaluated on a closed contour S enclosing the streamlined body. Although the viscous flow past a bluff body does not satisfy some of the underlying conditions, the theorem suggests an important relationship: the lift force is roughly proportional to the circulation of the vortical wake. Here, we observe the vortex patterns and calculate the total circulation of the wake to describe the lift variation observed due to the perturbation of the ellipse geometry. Figure 8 (a) shows the vorticity patterns of the wake of different ellipses. When the aspect ratio is changed from the circular cylinder (AR = 1.0) to AR = 10.0, the vortex pattern changes from a standard Karman street to a two-layered vortical wake pattern. This behavior is also observed for Re = 100 in the recent study by [32]. It is clear that the vortex intensity and the recirculation length differ according to the aspect ratio of an elliptical bluff body. We quantify the total circulation of the near wake by considering a closed rectangular contour enclosing the near wake. Apart from the downstream line segment, all the other line segments pass through irrotational regions of the flow. The circulation is taken positive in the anti-clockwise sense. Figure 8 (b) shows the time traces of the near-wake circulation and the lift coefficient. As expected, the circulation and the lift signals have the same frequency as the Strouhal frequency, but with a phase difference. Unlike for a streamlined body, the lift force on a bluff body also depends on the distance to the vortices in the wake. We present the circulation and the recirculation length variation with respect to the aspect ratio in Fig. 8 (c). Note that the recirculation length is calculated as the distance to the closest vortex core just after the end of its shedding from the bluff body. The near-wake circulation has a maximum around AR = 4 and decreases thereafter with AR. The predictions based on our CNN-based deep learning rely on extracting feature patterns as described here. Due to this, unlike the FOM analysis, it utilizes the full-order results of all input simulations to determine the required flow dynamic property. We further discuss the interpretation of the learned kernels in terms of the feature extraction.
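The near-wake circulation Γ = ∮ u · dl on a rectangular contour can be evaluated numerically as four line integrals. A sketch follows, with an analytic solid-body-rotation field standing in for the computed wake velocities and arbitrary contour coordinates (the paper's exact contour bounds are not reproduced here):

```python
import numpy as np

def circulation(u, v, x0, x1, y0, y1, n=400):
    """Counter-clockwise contour integral of (u, v) around a rectangle."""
    xs, ys = np.linspace(x0, x1, n), np.linspace(y0, y1, n)
    gamma = np.trapz(u(xs, np.full(n, y0)), xs)     # bottom edge, +x direction
    gamma += np.trapz(v(np.full(n, x1), ys), ys)    # right edge,  +y direction
    gamma -= np.trapz(u(xs, np.full(n, y1)), xs)    # top edge,    -x direction
    gamma -= np.trapz(v(np.full(n, x0), ys), ys)    # left edge,   -y direction
    return gamma

# solid-body rotation (vorticity = 2): circulation equals 2 * enclosed area
u = lambda x, y: -y
v = lambda x, y: x
gamma = circulation(u, v, x0=-1.0, x1=3.5, y0=-1.5, y1=1.5)
```

In the wake computation the same four integrals would be taken over the velocity field interpolated from the finite-element solution.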
Figure 8: Analysis of flow past elliptical cylinders of varying aspect ratios. Flow features affecting the lift variation with aspect ratio: (a) development of vortex shedding patterns observed for different aspect ratios, (b) time traces of near-wake circulation and lift coefficient for AR = 1.0, and (c) variation of near-wake circulation and recirculation length with AR. Note that the Γ and L_r values presented here are non-dimensionalized by UD and D, respectively.

5.2. Convolution kernels

Figure 9: Feature maps obtained by critical filters of the convolutional neural network for the flow past a circular cylinder. The maps are presented in the near-cylinder 5D × 5D fluid domain. Note that the scales are not uniform due to the different kernel weight ranges retrieved by the adaptive training process.

We have demonstrated that the CNN process is efficient and accurate for the prediction of unsteady forces acting on a bluff body due to the vortex shedding process. The success of CNN-based deep learning lies in the reasonable capturing of the flow features affecting the forces on the bluff body through the learning process. The main learning operation occurs at the adaptation of the convolution kernels using the stochastic gradient descent method. The learned discrete kernel is a small matrix of numbers, which makes it hard to visualize and interpret. However, applying the convolution operation on the input matrix with each kernel separately produces the individual feature map of that kernel. For example, Fig. 9 shows some of the critical feature maps extracted at the convolution layer for the circular cylinder. These feature maps provide the encoding of features and their locations for the convolutional layers. While the location of a feature explicitly represents where it is, the response of a feature (i.e., what it is) is encoded implicitly. Carefully designed deep networks based on the convolutional feature maps can extract the flow dynamical features. It is worth noting that the feature maps will not necessarily display commonly seen flow features such as the shear layer and the Karman vortex street. In most cases, these features are reconstructed by a nonlinear combination of many kernels.
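Producing a feature map from a single learned kernel, as described above, is a plain valid-mode 2D convolution of the input matrix. A minimal sketch follows; the random input and kernels are stand-ins for the distance-function input and the learned kernels:

```python
import numpy as np

def feature_maps(inp, kernels):
    """Valid-mode 2D cross-correlation of `inp` with each kernel separately."""
    p = kernels.shape[-1]
    # all p x p sliding windows of the input, as a zero-copy view
    windows = np.lib.stride_tricks.sliding_window_view(inp, (p, p))
    # one feature map per kernel: sum of elementwise products per window
    return np.einsum("ijab,kab->kij", windows, kernels)

rng = np.random.default_rng(0)
inp = rng.standard_normal((16, 16))       # stand-in for the distance input
kernels = rng.standard_normal((4, 3, 3))  # four 3x3 stand-in kernels
maps = feature_maps(inp, kernels)         # shape (4, 14, 14)
```

Plotting each slice of `maps` over the near-body window gives visualizations analogous to Fig. 9.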
Hence, the learning process is not limited to the learning of kernels but also includes learning the proper combination of kernels for the prediction process. Next, we investigate whether the CNN-based learning is capable of predicting the lift force variation generated by the FOM even when it is fed by different input cases.
Figure 10: Lift force prediction using different training sets. The legend states the cases used as the training set; † denotes that only ellipses are used for training. All other cases are trained together with the set of rounded-square cases ϕ = {10°, …, 60°}. The inset shows the maximum error of each prediction.

The predictions presented in Section 4.5 are generated by a small subset of input-output combinations obtained from the full-order Navier-Stokes simulations. However, the CNN-based deep learning does not operate as a traditional curve-fitting tool, whereby single-valued input-output combinations are fitted to match a predetermined general curve formula. The input to the CNN-based deep learning is not a single value but a matrix of function values derived from the single-value input. This makes the CNN capable of extracting the flow features and ultimately much more data-efficient than standard curve-fitting and regression tools.

Figure 10 shows the performance of CNN-based learning when fed with different combinations of inputs. We find that the predictions perform best when the extremes of the geometric variations (e.g., in the case of ellipses, AR = 1 and 10) are included. Furthermore, when we include the AR = 3 case, all the predictions are accurate within a 5% error margin of the FOM results. The performance of CNN-based deep learning becomes slightly better when the AR = 2 case is also added to the training set. The interesting observation is that the predictions are significantly improved when four cases from the rounding-angle problem (ϕ = 10°, …, 60°) are included in the training process. This emphasizes another advantage of CNN-based prediction: we may use the data of a similar problem to enhance the accuracy of the data-driven predictions.
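Since the network input is a field rather than a scalar, each geometry is first rasterized and mapped to a matrix of Euclidean distance-function values on a fixed grid. The brute-force sketch below is illustrative only: the `distance_input` helper, the grid size and the ellipse parameterization are our own choices, not the paper's implementation.

```python
import numpy as np

def distance_input(solid_mask, dx):
    """Brute-force Euclidean distance from every grid point to the nearest
    solid (body) point; this matrix plays the role of the CNN input."""
    H, W = solid_mask.shape
    ys, xs = np.nonzero(solid_mask)
    body = np.stack([ys, xs], axis=1) * dx      # body point coordinates
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            p = np.array([i * dx, j * dx])
            out[i, j] = np.min(np.linalg.norm(body - p, axis=1))
    return out

# Hypothetical ellipse (aspect ratio AR) rasterized on a coarse grid
AR, n = 2.0, 32
ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
mask = ((ii - n / 2) / (n / 8)) ** 2 + ((jj - n / 2) / (AR * n / 8)) ** 2 <= 1.0
phi = distance_input(mask, dx=1.0 / n)
print(phi.min(), phi.max() > 0.0)  # 0.0 True
```

The same grid and normalization must be reused for every training and prediction case, so that the convolution kernels see geometrically comparable inputs across all bluff-body shapes.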
The most significant result is that the CNN has accurately captured the maximum lift in the vicinity of AR = 2. We also treat the mean drag prediction as an extrapolation problem using two training sets: one containing the AR = 1, … ellipses and the other the AR = 2, … ellipses, each combined with the ϕ = 10°, …, 60° rounded-square cases. In both cases, the drag coefficient of ellipse geometries with aspect ratios outside the training set is predicted reasonably well. Figure 11 illustrates the predictions of these cases with an error of about 5%. It is worth noting that the interpolation approach is slightly more accurate than the extrapolation cases. Relevant codes and data for this particular example can be found at the GitHub link: https://github.com/TharinduMiyanawala/CNNforCFD.

Figure 11: Prediction of mean drag as an extrapolation problem. Extrapolation set 1: AR = 1, …, ϕ = 10°, …; extrapolation set 2: AR = 2, …, ϕ = 10°, ….

Overall, the above investigation suggests that we can achieve an improved performance by allocating the high-performance computing resources to the FOM cases at the extreme values of the input parameter. Next, we attempt to generalize the force coefficient predictions to other bluff-body shapes with sharp corners. In particular, we utilize the same trained CNN presented in Section 4.4 to predict the drag coefficient for the bluff-body geometries shown in Fig. 12, i.e., a symmetric trapezium, a regular hexagon, a regular octagon, a basic square and a rectangular geometry with an aspect ratio of 2. The force predictions and the associated errors are presented in Table 2 in ascending order of the drag coefficient. The CNN forecasts the drag coefficients with more than 92.5% accuracy. As expected, the shapes closer to the circle and the rounded square (i.e., the sharp square, octagon and hexagon) have higher accuracy than the trapezium and the rectangle geometries. These results demonstrate a broader prospect of data-driven flow dynamics predictions via trained deep nets. Using the available full-order or experimental results, one can create a common database for flow dynamic coefficients and use it to train a generalized CNN to predict the flow dynamic parameters of any complex shape for a wide range of flow conditions.
Figure 12: Vorticity fields for sharp-cornered bluff bodies: (a) trapezium, (b) regular hexagon, (c) regular octagon, (d) square and (e) rectangle with aspect ratio of 2 at tU∞/D = 245. Flow is from left to right.

Efficient prediction of force coefficients via our data-driven framework has a profound impact on offshore, civil and aeronautical engineering applications. In particular, for practical offshore structures, the hydrodynamic coefficients required for the well-known force decomposition model of [33] can be obtained efficiently from the present CNN-based model, whereas they are otherwise expensive and slow to obtain via CFD-based simulations or physical experiments.

Table 2: Generalization of drag force predictions for the sharp-cornered bluff bodies shown in Fig. 12.

Bluff body    C_D (FOM)   C_D (CNN)   Error (%)
Trapezium     1.3584      1.2597      7.20
Hexagon       1.3787      1.3741      0.33
Octagon       1.4196      1.3973      1.57
Square        1.5095      1.4291      5.30
Rectangle     1.8341      1.7186      6.29

Further extensions and demonstrations are underway for high Reynolds numbers and practical geometries.
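The error percentages in Table 2 follow from the relative difference between the FOM and CNN drag coefficients. A quick check of the tabulated values (small deviations from the printed percentages can arise from rounding of the reported coefficients):

```python
# Drag coefficients reported in Table 2 (FOM vs. CNN)
fom = {"Trapezium": 1.3584, "Hexagon": 1.3787, "Octagon": 1.4196,
       "Square": 1.5095, "Rectangle": 1.8341}
cnn = {"Trapezium": 1.2597, "Hexagon": 1.3741, "Octagon": 1.3973,
       "Square": 1.4291, "Rectangle": 1.7186}

# Relative error with respect to the full-order result
errors = {s: 100.0 * abs(fom[s] - cnn[s]) / fom[s] for s in fom}
for s, e in sorted(errors.items(), key=lambda kv: fom[kv[0]]):
    print(f"{s:<9s} {e:5.2f}%")

# "More than 92.5% accuracy" corresponds to a worst-case error below 7.5%
print(max(errors.values()) < 7.5)  # True
```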
6. Conclusions
We have presented a general and efficient data-driven model reduction method based on deep-learning convolutional neural nets and the stochastic gradient descent method with the momentum term. We successfully demonstrated the predictive capability of the proposed deep learning technique for the force coefficients of a set of bluff-body geometries. We provided a physical analogy of the stochastic gradient descent method and the iterative correction process of the CNN-based learning with the simplified form of the Navier-Stokes equation. We illustrated the connection of the convolution process in the deep neural networks with the Mori-Zwanzig formalism. During the training process, we feed this CNN-based model with the force coefficients derived for seven bluff bodies using a full-order Navier-Stokes solver. The CNN is trained to output the force coefficients for any perturbed bluff-body geometry input. We successfully tuned the neural network parameters for the best prediction performance, with a maximum error of order 10⁻² for the considered set of cases. We have shown that the CNN-based learning accurately captures the flow features related to the force predictions with relatively few input-output data combinations. The CNN-based predictions are accurate for both interpolation and extrapolation from the geometric parameter range of the training set. Finally, we demonstrated that the trained CNN can be used to predict the force coefficients of many different geometries. The efficient prediction of force coefficients using the present data-driven method has a great value for the iterative engineering design and the feedback flow control [34]. We hope that this work will help to expand the use of deep neural networks in a wider range of CFD applications.

Appendix A: Transformation of the NS momentum equation to a reaction-diffusion equation
We briefly present the transformation of the NS momentum equation into a reaction-diffusion problem by rewriting the convective and diffusion terms and eliminating the convective operator. Let φ be the velocity potential function with u = ∇φ; then from the incompressible Navier-Stokes momentum equation (Eq. (1)), we get

∂(∇φ)/∂t + (∇φ · ∇)∇φ = −(1/ρ)∇p + ν∇²(∇φ).  (A.1)

Using the vector identity ∇(a · a) = 2(a × (∇ × a) + (a · ∇)a) and ∇ × ∇φ = 0, we get (∇φ · ∇)∇φ = (1/2)∇(∇φ · ∇φ), and then substituting into Eq. (A.1), one can obtain

∇(∂φ/∂t + (1/2)∇φ · ∇φ) = ∇(−Δp/ρ + ν∇²φ),  (A.2)

which can be further simplified as

∂φ/∂t + (1/2)∇φ · ∇φ = −Δp/ρ + ν∇²φ,  (A.3)

where Δp is the pressure difference with respect to some reference pressure. By introducing the variable transformation φ = −2ν ln ψ in Eq. (A.3), we obtain

(1/ψ)∂ψ/∂t − (ν/ψ²)∇ψ · ∇ψ = Δp/(2μ) + ν∇²(ln ψ).  (A.4)

By substituting the relation ∇²(ln ψ) = −(1/ψ²)(∇ψ · ∇ψ) + (1/ψ)∇²ψ, we construct the reaction-diffusion equation in terms of the scalar function ψ of a moving fluid element from the Navier-Stokes momentum equation:

∂ψ/∂t − ν∇²ψ = (Δp/2μ)ψ.  (A.5)

The above differential form can be recast as the Liouville equation, whose characteristics are integral curves of the nonlinear ordinary differential equation. In Appendix B, we present an integral solution to the above reaction-diffusion equation for constructing the moment map.

Appendix B: An explicit integral solution of the reaction-diffusion equation
We can construct a general integral solution for the reaction-diffusion system Eq. (A.5) via the Green's function. Let L be the operator L = ∂/∂t − ν∇². The differential equations for ψ(x, t) and the Green's function G(x, t; ξ, τ), for x, ξ ∈ D and t, τ ≥ 0, are

Lψ(x, t) = (Δp/2μ) ψ(x, t),  (B.1)
LG(x, t; ξ, τ) = δ(x − ξ) δ(t − τ),  (B.2)

where D is the fluid domain of interest and ξ and τ are the dummy variables. Multiplying Eq. (B.1) by G and Eq. (B.2) by ψ, and subtracting Eq. (B.2) from Eq. (B.1), we get

G Lψ − ψ LG = G (Δp/2μ) ψ − δ(x − ξ) δ(t − τ) ψ.  (B.3)

Integrating with respect to x and t, the space-time integral form is constructed as follows:

∫₀^∞ ∫_V [G Lψ − ψ LG] dV dt = ∫₀^∞ ∫_V G (Δp/2μ) ψ dV dt − ψ(ξ, τ).  (B.4)

The left-hand side can be further simplified using Green's second identity:

∫₀^∞ ∫_V [G Lψ − ψ LG] dV dt
  = ∫_V ∫₀^∞ [G ∂ψ/∂t − ψ ∂G/∂t] dt dV − ν ∫₀^∞ ∫_V [G ∇²ψ − ψ ∇²G] dV dt
  = −∫_V G ψ(x, 0) dV − 2 ∫₀^∞ ∫_V ψ ∂G/∂t dV dt − ν ∫₀^∞ ∮_S [G ∇ψ − ψ ∇G] · n dS dt.  (B.5)

By equating Eq. (B.4) and Eq. (B.5), we obtain the explicit expression for the transformed scalar variable ψ(ξ, τ):

ψ(ξ, τ) = ∫₀^∞ ∫_V G (Δp/2μ) ψ dV dt + ∫_V G ψ(x, 0) dV + 2 ∫₀^∞ ∫_V ψ ∂G/∂t dV dt + ν ∫₀^∞ ∮_S [G ∇ψ − ψ ∇G] · n dS dt.  (B.6)

By exchanging (ξ, τ) with (x, t) and setting the third term on the right-hand side to zero, we get the general solution of the non-homogeneous diffusion equation [26]:

ψ(x, t) = ∫₀^∞ ∫_V G (Δp/2μ) ψ(ξ, τ) dV_ξ dτ + ∫_V G ψ(ξ, 0) dV_ξ + ν ∫₀^∞ ∮_S [G ∇_ξ ψ − ψ ∇_ξ G] · n dS dτ.  (B.7)

By observing the above equation, one can construct an input-output mapping for the sets of scalar functions and the Green's function.
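As a one-dimensional sanity check on this integral representation: for the homogeneous free-space case (Δp = 0, no boundaries), only the initial-condition term survives and the solution is the superposition of the initial data with the heat kernel. The `heat_kernel` helper and all parameter values below are our own illustrative choices.

```python
import numpy as np

def heat_kernel(x, t, nu):
    """Free-space Green's function of L = d/dt - nu d^2/dx^2 in one dimension."""
    return np.exp(-x ** 2 / (4.0 * nu * t)) / np.sqrt(4.0 * np.pi * nu * t)

nu, t = 0.1, 0.5
xi = np.linspace(-20.0, 20.0, 4001)   # quadrature grid for the dummy variable
dxi = xi[1] - xi[0]
psi0 = np.exp(-xi ** 2)               # arbitrary Gaussian initial condition

# Superposition integral: psi(x, t) = int G(x - xi, t) psi(xi, 0) dxi
x = np.linspace(-3.0, 3.0, 61)
psi = np.array([np.sum(heat_kernel(xp - xi, t, nu) * psi0) * dxi for xp in x])

# Exact spreading solution for a Gaussian initial condition
exact = np.exp(-x ** 2 / (1.0 + 4.0 * nu * t)) / np.sqrt(1.0 + 4.0 * nu * t)
print(np.max(np.abs(psi - exact)) < 1e-8)  # True
```

With a constant reaction coefficient the same construction holds after multiplying by the exponential growth factor of the source term, which is the structure the CNN-based mapping is intended to approximate.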
Deep neural networks can help to construct these functions hierarchically, in which higher-level feature functions can be formed as a multilayer stack of simple lower-level ones.

Appendix C: The Mori-Zwanzig projection operators
By considering the operator L = Δp/(2μ) + ν∇², the reaction-diffusion Eq. (A.5) derived from the Navier-Stokes momentum equation can be written in the Liouville form

∂ψ/∂t = Lψ,  (C.1)

whose characteristics satisfy the N-dimensional semi-discrete nonlinear ordinary differential equation. Let ψ(χ, t) = g(χ), where χ ∈ R^N denotes the modes of ψ comprising the slow (retained) degrees of freedom χ̂ and the fast (eliminated) degrees of freedom χ̃. We denote the component of ψ corresponding to the resolved modes by ψ̂(t) = ψ(χ̂, t) = g(χ̂). The solution of Eq. (C.1) for the resolved modes is given by ψ̂(t) = e^{tL} ψ̂(0), where e^{tL} is the evolution operator. The standard semigroup notation is used to construct the flow map, i.e., ψ̂(χ̂, t) = e^{tL} g(χ̂). For brevity, we denote ψ̂(0) by ψ̂ hereafter. Let P be a projection of ψ in the direction of ψ̂ such that

Pψ ≡ [(ψ̂, ψ)/(ψ̂, ψ̂)] ψ̂,  (C.2)

and also let Q = I − P be another projector onto the orthogonal subspace. The projection operators P and Q follow the self-adjoint condition and satisfy the orthogonality condition QP = 0. While P projects the particular variable onto the slow variables of the system, Q projects the variable onto the fast variables of the dynamical system. Using Dyson's formula [35]

e^{tL} = e^{tQL} + ∫₀^t e^{(t−s)L} P L e^{sQL} ds,  (C.3)

and substituting it into Eq. (C.1), we obtain

∂ψ̂/∂t = e^{tL} L ψ̂ = e^{tL} (P + Q) L ψ̂ = e^{tL} P L ψ̂ + e^{tL} Q L ψ̂.  (C.4)

Using Eq. (C.2) and Eq. (C.3), the projection terms are

e^{tL} P L ψ̂ = [(Lψ̂, ψ̂)/(ψ̂, ψ̂)] e^{tL} ψ̂ = Θ ψ̂,  (C.5)
e^{tL} Q L ψ̂ = e^{tQL} Q L ψ̂ + ∫₀^t e^{(t−s)L} P L e^{sQL} Q L ψ̂ ds.  (C.6)

By defining F(χ̂, t) ≡ e^{tQL} Q L ψ̂ and P L e^{sQL} Q L ψ̂ ≡ K(s) ψ̂, we can write Eq. (C.4) as

∂ψ̂/∂t = Θψ̂ [Markovian] + ∫₀^t K(s) ψ̂(t − s) ds [Convolution (memory)] + F(χ̂, t) [Fluctuation],  (C.7)

which is referred to as the Mori-Zwanzig equation for the coarse-graining of a general dynamical system [27, 28, 36]. While the first term on the right-hand side is a function of the instantaneous value of ψ̂, the second term depends on the time history of ψ̂ over the range [0, t] via the convolution (memory) effect. The third term represents the effect of fluctuations and, by construction, it satisfies the orthogonal dynamics equation ∂F(χ̂, t)/∂t = QL F(χ̂, t). This formalism provides a theoretical framework to make systematic approximations of the memory kernel effects during the CNN-based learning.

Acknowledgements

The first author thanks the National Research Foundation and Keppel Corporation, Singapore for supporting the work done in the Keppel-NUS Corporate Laboratory.
References

[1] A. Roshko, Perspectives on bluff body aerodynamics, Journal of Wind Engineering and Industrial Aerodynamics 49 (1-3) (1993) 79–100.
[2] P. Holmes, Turbulence, coherent structures, dynamical systems and symmetry, Cambridge University Press, 2012.
[3] W. Yao, R. K. Jaiman, Model reduction and mechanism for the vortex-induced vibrations of bluff bodies, Journal of Fluid Mechanics 827 (2017) 357–393.
[4] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, Journal of Fluid Mechanics 656 (2010) 5–28.
[5] P. J. Schmid, L. Li, M. P. Juniper, O. Pust, Applications of the dynamic mode decomposition, Theoretical and Computational Fluid Dynamics 25 (1-4) (2011) 249–259.
[6] J. N. Kutz, Deep learning in fluid dynamics, Journal of Fluid Mechanics 814 (2017) 1–4.
[7] E. J. Parish, K. Duraisamy, A paradigm for data-driven predictive modeling using field inversion and machine learning, Journal of Computational Physics 305 (2016) 758–774.
[8] J. Ling, A. Kurzawski, J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, Journal of Fluid Mechanics 807 (2016) 155–166.
[9] M. Raissi, P. Perdikaris, G. E. Karniadakis, Machine learning of linear differential equations using Gaussian processes, Journal of Computational Physics 348 (2017) 683–693.
[10] M. Raissi, G. E. Karniadakis, Hidden physics models: Machine learning of nonlinear partial differential equations, arXiv preprint arXiv:1708.00588.
[11] L. Parussini, D. Venturi, P. Perdikaris, G. E. Karniadakis, Multi-fidelity Gaussian process regression for prediction of random fields, Journal of Computational Physics 336 (2017) 36–50.
[12] G. Hinton, S. Osindero, Y. W. Teh, A fast learning algorithm for deep belief nets, Neural Computation 18 (7) (2006) 1527–1554.
[13] Y. Bengio, Learning deep architectures for AI, Foundations and Trends® in Machine Learning