Variational Multi-scale Super-resolution : A data-driven approach for reconstruction and predictive modeling of unresolved physics
Aniruddhe Pradhan, Karthik Duraisamy
Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Abstract
The variational multiscale (VMS) formulation formally segregates the evolution of the coarse scales from that of the fine scales. VMS modeling requires an approximation of the impact of the fine scales in terms of the coarse scales. For this purpose, this work introduces a VMS framework with a special neural-network (N-N) structure, which we call the variational super-resolution N-N (VSRNN). The VSRNN constructs a super-resolved model of the unresolved scales as a sum of products of individual functions of the coarse scales and physics-informed parameters. Combined with a set of locally non-dimensional features obtained by normalizing the input coarse-scale and output sub-scale basis coefficients, the VSRNN provides a general framework for the discovery of closures for both continuous and discontinuous Galerkin discretizations. By training this model on a sequence of $L^2$-projected data and using the super-resolved state to compute the discontinuous Galerkin fluxes, we improve the optimality and the accuracy of the method for both the linear advection problem and turbulent channel flow. Finally, we demonstrate that - in the investigated examples - the present model generalizes to out-of-sample initial conditions and Reynolds numbers. Perspectives are provided on data-driven closure modeling, limitations of the present approach, and opportunities for improvement.

Keywords: Variational Multiscale Method, Super-resolution, Physics-informed Deep Learning, Coarse-grained Modeling, Discontinuous Galerkin
1. Introduction
Multi-scale problems are ubiquitous in science and engineering, and many practical applications are characterized not just by a disparity of scales, but also by a complex interplay between scales. The coarse scales that are resolved in a multi-scale computation constitute the macro-scale level of description. The finer, unresolved scales constitute the micro-scale level of description, and their effect on the coarse scales needs to be modeled not just for accuracy, but also for stability [1]. A classical application of multi-scale modeling is turbulent fluid flow at high Reynolds numbers. In so-called Large Eddy Simulations (LES) of turbulent flow, the energy-containing large scales are resolved and the effect of the small scales, also known as sub-scales, is modeled [2-14].

Email addresses: [email protected] (Aniruddhe Pradhan), [email protected] (Karthik Duraisamy)
Preprint submitted to Journal of Computational Physics, January 26, 2021

Over the past few decades, several modeling approaches have been pursued to improve the performance of coarse-grained simulations of multi-scale PDEs. These techniques can be broadly classified into two categories: (i) physics-inspired approaches that employ phenomenological assumptions; and (ii) structural formulations that attempt to derive sub-grid models from the structure of the governing PDE. Since it is not possible to review all such approaches, we highlight representative examples in the area of turbulence simulations, and use the discussion to set the context of the present work.

Physics-based models use arguments such as the balance of energy transfer from large to small scales, or scale similarity (e.g. the Smagorinsky model [15] in LES and its adaptive variants [2, 4, 16]). Additional functional forms were proposed by Vreman [6], Nicoud et al. [5, 7], and Lévêque et al. [17] to correct asymptotic near-wall behavior. Another class of sub-grid models, based on the self-similarity idea, was first put forward by Bardina et al. [18] in its original form, and later explored by other researchers in mixed form [18-22]. The self-similarity model in its original form was found to have a high spatial correlation with the actual sub-grid stress, but lacked enough dissipation to ensure the stability of the approach.

Langford et al. [23], in their work on optimal LES, showed that it is possible to construct an abstract sub-grid model that obtains correct single-time, multi-point statistics and generates minimal error in the instantaneous dynamics. This is achieved by minimizing the root mean square error in the time derivative of the resolved quantities. Here, the LES model is written in terms of a conditional average, an average over all the instantaneous fields that correspond to the same LES solution when filtered.
As a result, the modeled sub-grid stresses do not necessarily have to show a high spatial correlation with the true sub-grid stresses. In addition to these modeling approaches, the interaction of the numerical method with the underlying sub-grid model also poses several challenges. The operating filter size in all these models is, in general, very close to the grid size. Consequently, the truncation error of the numerical method can be of a similar magnitude to the sub-grid model term. To decouple them, various explicit filtering approaches [24-26] have been explored in the literature. The use of dissipative numerical methods alone to perform LES, without any explicit sub-grid model, has also gained popularity [27-31] recently in the form of implicit LES.

Another category of sub-grid models is based on the mathematical structure of the PDEs. One such mathematical formalism is the variational multiscale (VMS) method [1, 32], which formally separates the evolution equation of the coarse scales from the evolution equation of the fine scales. The final VMS modeling step involves writing an explicit expression for the fine scales, approximately, in terms of the coarse-scale residual. These methods [1, 33-36] were first developed to stabilize finite element methods for linear problems involving advection, but were later extended to non-linear problems [8-13] as sub-grid models. Analogous model forms for the sub-scales in terms of the coarse-scale residual can also be derived for non-linear problems if the Mori-Zwanzig (M-Z) formalism is used together with the VMS method [14, 37-40]. In this work, we attempt to discover such VMS model forms directly from data using deep learning tools, and use them for predictive modeling.

An alternate class of methods in structural sub-grid modeling is based on the approximate deconvolution method (ADM) [41, 42].
In the ADM approach, the filtered state variables are approximately deconvolved using ADM operators, approximated by a truncated series expansion of the inverse filter. The non-linear terms in the governing equations are then computed using these deconvolved state variables.

The availability of high-resolution data from numerical simulations and experiments in the past decade has led to an interest in data-based modeling. Machine learning augmentations have been used in RANS (e.g. [40, 43-45]) and LES (e.g. [46-49]). On the LES front, Maulik et al. [50] used machine learning to classify and blend different LES models to select the most accurate model at run-time. Yang et al. [51] used physics-informed features to improve the performance of equilibrium LES wall models in non-equilibrium cases. Similarly, many other notable attempts to improve LES models using data have been made by Maulik et al. [50, 52, 53], Beck et al. [49], Sarghini et al. [46], Gamahara and Hattori [47], Wang et al. [54] and many more [22, 48, 55]. These data-driven techniques have also found application in developing closures for reduced-order models (ROMs) [56-59].

Very recently, super-resolution of turbulent flow fields has been pursued using neural networks [60-65]. Xie et al. [60] and Fukami et al. [63] appear to be the first to introduce this idea in fluid dynamics applications. They were followed by Deng et al. [61], who improved traditional GAN performance by augmenting the model architecture. Improvements in flow-field reconstruction were shown by Liu et al. [62], using both spatial and temporal information. Fukami et al. [65] performed super-resolution and in-betweening to reconstruct a highly resolved space-time solution using two low-resolution snapshots taken at the start $t$ and the end $t + \Delta t$ of an interval.
These models have demonstrated an ability to reconstruct fine scales from highly coarse-grained data for either the same or a similar data-set to the one on which they were trained, in both supervised and unsupervised settings [64]. However, applying the trained models to super-resolve coarse flow fields at different Reynolds numbers, or in another part of the flow, is relatively unexplored. The idea of using the super-resolved field to compute the closure terms is similar to the ADM approach. However, in contrast to the approximately deconvolved solution, which lies on the same mesh as the filtered solution, the super-resolved solution lies on a higher-resolution mesh. A super-resolution model capable of reconstructing fine-space data for a case where the fine-space data already exists, and where the same information is used for training, has no use in a predictive setting. This paper attempts to improve predictive capabilities by bringing in generalizable model forms and features. In addition to the generalizability of these models, the definition of a coarse-space solution is ambiguous. This ambiguity arises because a variety of low-fidelity data qualify as coarse solutions, including LES solutions obtained using the finite difference method (FDM) or the finite volume method (FVM), spectrally filtered DNS solutions, and stabilized finite element (FE) solutions on a coarse grid. The nodal or modal values in each of these methods represent different quantities. For example, Fukami et al. [63] used the max-pooling operation to obtain coarse data. Consequently, the trained model depends on the type of method or filter used to generate the data, and the mapping learned by the network has no formal basis. To resolve this ambiguity, we define both our coarse space and our fine space in terms of the $L^2$ projection of the DNS solution on low- and high-order polynomial basis functions, in a similar spirit to the Variational Multiscale Method.
As a result, the trained model will approximate the function that maps the $L^2$ projection of the DNS data between the two sub-spaces. Additionally, the model should preferably be compact and applied patch-wise rather than to the entire flow. This is because there is no guarantee that the coarse data that needs to be super-resolved has the same size as the input layer, and interpolating it back to the network size defeats the purpose of super-resolution. In this work, we develop N-N closures that: (i) are capable of extrapolating to unseen flow conditions and resolutions; (ii) use non-dimensional rather than dimensional features, for better generalizability; and (iii) can be applied patch-wise rather than to the entire field.

The outline of the paper is as follows: We introduce the VMS methodology in section 2. In section 3, we derive VMS-consistent features. In section 4, we propose a model form and a new network architecture for learning it. We describe the procedure for generating training data in section 5. In section 6, we apply our approach to the linear advection problem in both an online (numerical method) and an offline (super-resolution) setting. In section 7, we evaluate the performance of the modeling approach on turbulent channel flow. Perspectives on the broader challenge of data-driven modeling, and on the present work, are shared in section 8. Finally, we summarize our work in section 9.
2. The Variational Multiscale (VMS) Method
This section summarizes the Variational Multiscale Method (VMS), originally presented by Hughes et al. [1]. As discussed previously, this method has been extensively used for developing closures for both linear [1, 33-36] and non-linear PDEs [8-13]. This section discusses it only in the context of a linear problem, and uses it as a guiding principle for feature selection and for shaping the network architecture. As discussed by Hughes et al. [1], the development of VMS closures can be broadly categorized into two different cases based on the type of basis functions used, which are detailed below, along with a context for super-resolution.

In the 'smooth case', the basis functions are sufficiently smooth so that the distributional effects may be ignored [1]. Both the Fourier basis and the Chebyshev spectral basis qualify as smooth bases. For this case, consider the following PDE on an open and bounded domain $\Omega \subset \mathbb{R}^d$, where $d \geq 1$, with boundary $\Gamma = \partial\Omega$:

$$\mathcal{L}(u) = f \;\text{ in } \Omega, \qquad u = g \;\text{ on } \Gamma, \tag{1}$$

where the operator $\mathcal{L}$ can be linear or non-linear, and the functions $f : \Omega \to \mathbb{R}$ and $g : \Gamma \to \mathbb{R}$ are given. The variational form of the above PDE is given by

$$(\mathcal{L}(u), w) = (f, w), \tag{2}$$

such that $u \in \mathcal{V}$ for all $w \in \mathcal{V}$, where $(\cdot\,,\cdot)$ denotes the $L^2(\Omega)$ inner product, and $\mathcal{V} \equiv H^1_0(\Omega)$ is the Sobolev space. The solution and weighting spaces are decomposed as follows:

$$\mathcal{V} = \mathcal{V}^h \oplus \mathcal{V}', \tag{3}$$

where $\oplus$ represents a direct sum of $\mathcal{V}^h$ and $\mathcal{V}'$. Applying the VMS decomposition, we have

$$(\mathcal{L}(u^h + u'), w^h) + (\mathcal{L}(u^h + u'), w') = (f, w^h) + (f, w'). \tag{4}$$

While the above equation is valid for both non-linear and linear equations, further simplifications can be made if the differential operator is assumed to be linear.
To this end, using the linear independence of $w^h$ and $w'$, and taking the differential operator to be linear, we obtain the coarse- and fine-scale equations:

$$(\mathcal{L}(u^h), w^h) + (\mathcal{L}(u'), w^h) = (f, w^h), \tag{5}$$
$$(\mathcal{L}(u'), w') = -(\mathcal{L}u^h - f, w'). \tag{6}$$

The coarse- and fine-scale equations can be re-written as:

$$(\mathcal{L}(u^h) - f, w^h) = -(\mathcal{L}(u'), w^h), \tag{7}$$
$$\Pi' \mathcal{L}(u') = -\Pi'(\mathcal{L}u^h - f), \tag{8}$$

where $\Pi'$ is the $L^2$-projector onto the fine-scale basis functions. The Green's function corresponding to the adjoint of the fine-scale problem is found by solving the following equations:

$$\Pi' \mathcal{L}^*(g'(x, y)) = \Pi'(\delta(x - y)) \;\;\forall x \in \Omega; \qquad g'(x, y) = 0 \;\;\forall x \in \Gamma. \tag{9}$$

The fine scales can be obtained by superposition as follows:

$$u'(y) = -\int_\Omega g'(x, y)\,(\mathcal{L}u^h - f)(x)\, d\Omega_x. \tag{10}$$

In the current super-resolution approach, we do not seek $u'$. Rather, we seek $u'_f$, the optimal projection of $u'$ on the finer space $w_f$. The space of functions $w_f$ is finer in comparison to $w^h$, or represents a different kind of space. To this end, the optimal projection of $u'$ on $w_f$ is given by:

$$u'_f(y) = \Pi_f u'(y) = -\int_\Omega \Pi_f(g'(x, y))\,(\mathcal{L}u^h - f)(x)\, d\Omega_x, \tag{11}$$

where $\Pi_f(g'(x, y))$ is the $L^2$-projection of $g'(x, y)$ on $w_f(y)$. For this case, the fine space can be constructed using higher-wavenumber Fourier modes or higher-order Chebyshev polynomials.

In the 'rough case', the lack of continuity of derivatives at element interfaces requires us to account for the distributional effects explicitly [1]. This case is typical of the finite element method, where piece-wise continuous polynomial functions are used within each element. Similar to the 'smooth case', Hughes et al.
[1] showed that the exact form of the sub-scales in the case of finite elements is given by:

$$u'(y) = -\sum_e \left( \int_{\Omega_e} g'(x, y)\,(\mathcal{L}u^h - f)(x)\, d\Omega_x + \int_{\Gamma_e} g'(x, y)\,(b u^h)(x)\, d\Gamma_x \right). \tag{12}$$

The sub-scale solution depends on the coarse scales inside the element and its neighbors. Applying the projection operator on both sides of equation (12), we obtain

$$u'_f(y) = \Pi_f u'(y) = -\sum_e \left( \int_{\Omega_e} \Pi_f(g'(x, y))\,(\mathcal{L}u^h - f)(x)\, d\Omega_x + \int_{\Gamma_e} \Pi_f(g'(x, y))\,(b u^h)(x)\, d\Gamma_x \right). \tag{13}$$

An approximation to the above equation is given in the form of compact bubble functions which vanish at the element boundaries [66-69]. For 1-D linear problems, solving for the element Green's function leads to the coarse-scale solution being the endpoint interpolant of the actual solution [1]. Assuming that the coarse scale is given in the form of the endpoint interpolant of the true solution, the fine-scale solution is given by

$$u'_f(y) = \Pi_f u'(y) = \int_{\Omega_e} \Pi_f(g'(x, y))\,(\mathcal{L}u^h - f)(x)\, d\Omega_x. \tag{14}$$

A point to note is that application of the projection operator $\Pi_f(y)$ on the element Green's function $g'(x, y)$ leads to a reduction of dimension only in $y$, i.e.

$$g'_f(x, y) = \Pi_f\, g'(x, y) = \sum_i g'_{x,i}(x)\, \psi_i(y), \tag{15}$$

where the basis coefficients $g'_{x,i}(x)$ are functions of $x$, which are not necessarily polynomials. However, if the polynomial order $p_c$ of the coarse scale is given, the coarse-scale residual for (e.g.) the convection-diffusion equation will be of polynomial order $p_c - 1$. Thus, projecting $\sum_i g'_{x,i}(x)$ onto the space of polynomials of order $p_c - 1$, i.e. projecting $g'(x, y)$ onto tensor-product basis functions in $x$ and $y$, is sufficient. The polynomial order in $y$ is determined by the polynomial order of the fine scales, whereas the polynomial order in $x$ is decided by the maximum polynomial order arising in the coarse-scale residual. For the convection-diffusion problem, the element Green's function is given by:

$$g'(x, y) = \begin{cases} C_1(y)\left(1 - e^{-2\alpha x/h}\right), & x \leq y, \\ C_2(y)\left(e^{-2\alpha x/h} - e^{-2\alpha}\right), & x \geq y, \end{cases}$$

where $\alpha$ is the cell Peclet number, given by:

$$\alpha = \frac{a h}{2\kappa}. \tag{16}$$

The element Green's function approximated using different-order tensor-product basis functions in $x$ and $y$ is shown in Figure 1. The basis functions used here to approximate the sub-scale do not necessarily vanish at the element boundary. However, one can also select them to ensure that the sub-scales vanish at the element boundary, similar to a bubble function, as shown in Figure 2. Hence, when the input and output orders are fixed, the Green's function for the fine scales can be represented in a finite number of dimensions. This makes it easier to learn the mapping between the coarse-scale and fine-scale solutions.
3. VMS-inspired feature selection
In this section, we will derive an appropriate feature set and the network architecture for our model. To demonstrate the action of the super-resolution operator, we assume the coarse space to be composed of piece-wise linear polynomials, and the governing PDE to be the linear convection-diffusion equation given by the following differential operator and boundary condition:

$$\mathcal{L} \triangleq a\frac{d}{dx} - \kappa\frac{d^2}{dx^2} \;\text{ in } \Omega = [0, L], \qquad u = 0 \;\text{ on } \Gamma = \{0, L\}. \tag{17}$$

Figure 1: $L^2$-optimal approximation of the fine-scale Green's function on various tensor-product polynomial basis functions $g'$ for different cell Peclet numbers $\alpha$.

Figure 2: $L^2$-optimal approximation of the fine-scale Green's function on a basis chosen such that $u'_f(y)$ is zero on the boundaries.

For an element of size $h$ existing between $0 \leq x \leq h$ in its local co-ordinates, the coarse solution $u^h$ in terms of the end-point values $u(0)$ and $u(h)$ is given by:

$$u^h(x) = u(0)(1 - x/h) + u(h)(x/h). \tag{18}$$

The space in which the fine scales are approximated can be a discontinuous finite element space or a bubble-function space. The $L^2$-optimal approximation of the fine scales $u'$ on any polynomial space existing inside an element is given by the projection $\Pi_f$:

$$u'_f(y) = \Pi_f u'(y) = -\int_{\Omega_e} \Pi_f(g'(x, y))\; a\left(\frac{u(h) - u(0)}{h}\right) d\Omega_x. \tag{19}$$

The coarse solution considered here is the endpoint interpolant of the true solution. In that case, one can determine the exact form of the sub-scale inside the element by taking the endpoint values as Dirichlet boundary conditions and solving the equation inside the element:

$$u'(y) = (u(h) - u(0))\left(\frac{1 - e^{2\alpha y/h}}{1 - e^{2\alpha}} - \frac{y}{h}\right). \tag{20}$$

The simplest approximation of the bubble function is obtained by projecting it onto a $p = 0$ discontinuous space inside the element, i.e.
$$u'_0 = \frac{1}{h}\int_0^h (u(h) - u(0))\left(\frac{1 - e^{2\alpha y/h}}{1 - e^{2\alpha}} - \frac{y}{h}\right) dy = -\,a\,\frac{u(h) - u(0)}{h}\,\cdot\,\frac{h}{2a}\left(\coth(\alpha) - \frac{1}{\alpha}\right), \tag{21}$$

where the first factor is the residual when linear basis functions are used, and the second factor is the form of the stabilization parameter $\tau$ commonly used in stabilized methods. Next, we define the mean and r.m.s. quantities of the coarse solution in an element as follows:

$$u_m = \frac{\int_0^h u^h\, d\Omega}{h}; \qquad u_{rms} = \sqrt{\frac{\int_0^h (u^h - u_m)^2\, d\Omega}{h}}. \tag{22}$$

An important observation is that the sub-scale solution is independent of the mean $u_m$:

$$u'(y) = \big((u(h) - u_m) - (u(0) - u_m)\big)\left(\frac{1 - e^{2\alpha y/h}}{1 - e^{2\alpha}} - \frac{y}{h}\right). \tag{23}$$

If our approximation space is linear, then $u_m$ and $u_{rms}$ are given by:

$$u_{m,1} \triangleq \frac{u(0) + u(h)}{2}, \qquad u_{rms,1} \triangleq \frac{|u(h) - u(0)|}{2\sqrt{3}}. \tag{24}$$

Re-arranging this form, we get:

$$\frac{u'_0}{u_{rms,1}} = \sqrt{3}\left(\frac{1}{\alpha} - \coth(\alpha)\right)\mathrm{sgn}(u(h) - u(0)), \tag{25}$$

which is the functional form to be learned, where $\mathrm{sgn}$ denotes the sign function. The above equation can be written piecewise as follows:

$$\frac{u'_0}{u_{rms,1}} = \begin{cases} \sqrt{3}\left(\frac{1}{\alpha} - \coth(\alpha)\right), & \dfrac{u(h) - u(0)}{|u(h) - u(0)|} \geq 0, \\[6pt] -\sqrt{3}\left(\frac{1}{\alpha} - \coth(\alpha)\right), & \dfrac{u(h) - u(0)}{|u(h) - u(0)|} \leq 0. \end{cases} \tag{26}$$

If we take the basis coefficients of the coarse solution $u^h$ and re-scale them with the r.m.s. $u_{rms,1}$, we get:

$$\frac{u(0) - u_m}{u_{rms,1}} = \sqrt{3}\,\frac{u(0) - u(h)}{|u(h) - u(0)|}; \qquad \frac{u(h) - u_m}{u_{rms,1}} = \sqrt{3}\,\frac{u(h) - u(0)}{|u(h) - u(0)|}. \tag{27}$$

These parameters determine the sign of the sub-scale in equation (26). Hence, the magnitude of the sub-scale is fully determined by the physics-informed parameter $\alpha$, and its sign (phase) is determined by these parameters. Consequently, an appropriate choice of features will be:

$$\frac{u'_0(y)}{u_{rms,1}} = f\left(\alpha,\; \frac{u(0) - u_m}{u_{rms,1}},\; \frac{u(h) - u_m}{u_{rms,1}}\right). \tag{28}$$

The two extra parameters, in this case, are redundant, because they are the same in magnitude and opposite in sign. Hence, only one parameter can be used:

$$\frac{u'_0(y)}{u_{rms,1}} = f\left(\alpha,\; \frac{u(h) - u(0)}{u_{rms,1}}\right). \tag{29}$$

The above functional form is derived for the advection-diffusion problem. However, this structure will serve as the inspiration for applying this method to other non-linear problems, as detailed in the following section.
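The relations above lend themselves to a quick numerical check. The sketch below (ours, illustrative, not the authors' code) evaluates the exact bubble of equation (20) on a unit element with a unit jump, confirms that it vanishes at the element endpoints, and verifies that its normalized element average reproduces the $\sqrt{3}(1/\alpha - \coth\alpha)$ factor of equation (25):

```python
import numpy as np

# Illustrative check of eqs. (20) and (25) on a unit element (h = 1, du = 1),
# with alpha = a*h/(2*kappa) the cell Peclet number.
def exact_subscale(y, alpha):
    return (1.0 - np.exp(2.0 * alpha * y)) / (1.0 - np.exp(2.0 * alpha)) - y

def normalized_mean_subscale(alpha, n=20001):
    y = np.linspace(0.0, 1.0, n)
    b = exact_subscale(y, alpha)
    mean = 0.5 * (y[1] - y[0]) * np.sum(b[:-1] + b[1:])   # trapezoidal rule
    u_rms1 = 1.0 / (2.0 * np.sqrt(3.0))                   # |du| / (2 sqrt(3)), eq. (24)
    return mean / u_rms1

def analytic(alpha):
    # sqrt(3) * (1/alpha - coth(alpha)): the magnitude law of eq. (25)
    return np.sqrt(3.0) * (1.0 / alpha - 1.0 / np.tanh(alpha))
```

The bubble vanishes at $y = 0$ and $y = 1$, and as $\alpha \to \infty$ the normalized magnitude saturates at $\sqrt{3}$, which is precisely what keeps the learned map well-behaved at high Peclet number.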
4. Learning VMS-consistent closures
The next step is to learn the mapping presented in equation (29) and compare it to the analytical solution. Data is first obtained by solving the equation at different Peclet numbers Pe on very fine meshes (we refer to this as the DNS) for training the network. Similarly, we can also generate data by dividing a single high-Pe DNS case into multiple cases with different element sizes, i.e., different cell Peclet numbers. The coarse scale is obtained in the form of the end-point interpolant, and the sub-scale is then computed numerically for each such element. A small fully-connected network is used to learn the mapping for different output polynomial orders $p$ in which $u'$ is approximated. In the limit $p \to \infty$, the approximate sub-scale should approach $u'(y)$. However, the order of the polynomial $p$ required to learn the function increases when Pe is increased. By limiting the output order $p$ to the order of the super-resolution, we reduce the complexity and size of the network, as discussed in the last section. This is because, when $\alpha \to \infty$, $f(\alpha) = |u'_0|/u_{rms,1} = \sqrt{3}\left(\coth(\alpha) - \frac{1}{\alpha}\right)$ is well-behaved, whereas if $u'(y)$ given by equation (20) is learned directly as a function of $y$ and $\alpha$, the function becomes steep at $y = h$ for $\alpha \to \infty$. A solution to this problem is to use features such as $\frac{1 - e^{2\alpha y/h}}{1 - e^{2\alpha}}$ as inputs. However, this restricts the method to one kind of problem. Similarly, the optimal form of the sub-scale on a discontinuous $p = 1$ space is:

$$\frac{u'(y)}{u_{rms,1}} = C'_1(\alpha)\, \psi_1(y/h) + C'_2(\alpha)\, \psi_2(y/h). \tag{30}$$
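The coefficients in equation (30) can be tabulated directly from the exact bubble. The sketch below (ours, illustrative) $L^2$-projects the bubble of equation (20) onto an orthonormal $p = 1$ Legendre basis on the unit element; the basis and normalization are our choices, not necessarily those of the paper:

```python
import numpy as np

# Illustrative: recover C'_1(alpha), C'_2(alpha) of eq. (30) by projecting the
# exact bubble (eq. 20, with du = 1, h = 1) onto an orthonormal p = 1 basis.
def subscale_coeffs(alpha, n=40001):
    y = np.linspace(0.0, 1.0, n)
    bubble = (1.0 - np.exp(2.0 * alpha * y)) / (1.0 - np.exp(2.0 * alpha)) - y
    psi = [np.ones_like(y), np.sqrt(3.0) * (2.0 * y - 1.0)]  # orthonormal on [0, 1]
    u_rms1 = 1.0 / (2.0 * np.sqrt(3.0))                      # eq. (24), du = 1
    dy = y[1] - y[0]
    trap = lambda f: 0.5 * dy * np.sum(f[:-1] + f[1:])       # trapezoidal rule
    return [trap(bubble * p) / u_rms1 for p in psi]

c1, c2 = subscale_coeffs(1.0)
```

Because $\psi_1$ is the constant mode, $C'_1(\alpha)$ coincides with the normalized element average of equation (25), i.e. $\sqrt{3}(1/\alpha - \coth\alpha)$ at the chosen $\alpha$.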
Figure 3: Comparison of sub-scale magnitude as a function of cell Peclet number $\alpha$ obtained analytically vs. that learned from data using a N-N.

The form of the function to be learned for this case is:

$$\left[C'_1, C'_2\right] = f(\alpha). \tag{31}$$

The above analysis was performed for the continuous Galerkin (CG) method, but these ideas can be extended to the discontinuous Galerkin (DG) method. Retaining the same structure, we extend the technique to non-linear/linear problems for both CG/DG types of basis functions as follows:

$$\left[C'_{1,p_s}, C'_{2,p_s}, \ldots, C'_{(p_s+1)^d, p_s}\right] = f\left(\alpha,\; \left[\tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{(p_c+1)^d, p_c}\right]_1, \ldots,\; \left[\tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{(p_c+1)^d, p_c}\right]_m, \ldots\right), \tag{32}$$

where $p_s$ and $p_c$ are the polynomial orders of the spaces in which the sub-scale and the coarse scale are optimally represented, and $d$ denotes the dimension of the problem. This function, apart from $\alpha$ (equivalent to a cell Reynolds number), also contains the basis coefficients of the current element and its neighbors. The term $[\tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{(p_c+1)^d, p_c}]_m$ with subscript $m$ denotes the mean-subtracted, normalized basis coefficients of the $m$-th neighbour. The neighbour information is critical when a discontinuous basis is used, when bubble-function approximations are not employed in CG, or when non-local transfer of information happens from outside the element. These coefficients are first shifted by the coarse-scale mean $u_m$ and then normalized with the coarse-scale r.m.s. $u_{rms}$, as done previously. The output of the function, $[C'_{1,p_s}, C'_{2,p_s}, \ldots, C'_{(p_s+1)^d, p_s}]$, denotes the basis coefficients of the sub-scale, normalized with $u_{rms}$ only, i.e.
$$\left[\tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{(p_c+1)^d, p_c}\right]_m = \left[C_{1,p_c} - u_m,\; C_{2,p_c} - u_m,\; \ldots,\; C_{(p_c+1)^d, p_c} - u_m\right]_m \big/\, u_{rms}, \tag{33}$$

where $C_{i,j}$ denotes the actual basis coefficients. The quantities used for shifting and non-dimensionalizing the input parameters, i.e., $u_m$ and $u_{rms}$ respectively, and for non-dimensionalizing the output parameters, $u_{rms}$, are calculated using the coarse-scale solution in the center element only. As will be seen later in this paper, the non-dimensionalization process is critical for the N-N model to generalize. The output coefficients are finally re-scaled with $u_{rms}$ and added to the coarse scale to obtain the super-resolved solution as follows:

$$u_{sr} = u_{p_c} + u'_{p_s} = u_{p_c} + u_{rms} \sum_{i=1}^{(p_s+1)^d} C'_{i,p_s}\, \psi_{i,p_s}, \tag{34}$$

where $\psi_{i,p_s}$ denotes the basis function corresponding to the $i$-th node or mode. Division by $u_{rms}$ in equation (33) is a problem when $u_{rms}$ is exactly zero. However, adding a small positive number $\epsilon$ to $u_{rms}$ while dividing was sufficient for all the cases presented below.

In addition to the model features, the model architecture can be made consistent with the VMS formulation. As proposed in equation (32), the inputs to the model are the physics-informed parameters, such as the cell Peclet number $\alpha$, along with the normalized, mean-subtracted coarse-scale basis coefficients of an element and its neighbors. The outputs of the network are the normalized sub-scale basis coefficients in that particular element. The physics-informed parameter can also be another non-dimensional number, such as the CFL number or the cell Reynolds number $Re_\Delta$, specific to the problem. Given these sets of input and output features, many possible ways of embedding them into the model exist.
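The normalization and reconstruction steps of equations (33)-(34) can be sketched in a few lines. In this illustrative sketch (function names and toy values are ours, not the paper's), `psi` holds the sub-scale basis functions sampled at a few points:

```python
import numpy as np

# Illustrative sketch of eqs. (33)-(34): normalize coarse basis coefficients,
# then rebuild the super-resolved field from predicted sub-scale coefficients.
def normalize_coeffs(coeffs, u_m, u_rms, eps=1e-12):
    """Eq. (33): shift by the center-element mean, scale by its r.m.s."""
    return (np.asarray(coeffs, dtype=float) - u_m) / (u_rms + eps)

def super_resolve(u_coarse, c_sub_hat, psi, u_rms):
    """Eq. (34): u_sr = u_pc + u_rms * sum_i C'_i psi_i.
    psi has shape (n_points, n_subscale_coeffs)."""
    return u_coarse + u_rms * psi @ np.asarray(c_sub_hat, dtype=float)

# toy usage: constant coarse state plus two sub-scale modes at three points
psi = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]])
u_sr = super_resolve(np.full(3, 2.0), [0.5, 0.25], psi, u_rms=0.2)
```

The small `eps` guards against the degenerate case $u_{rms} = 0$ mentioned above.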
Figure 4 shows two different kinds of network architectures to achieve this. The traditional approach is based on the idea of training a single fully connected N-N with both the normalized coarse-scale basis coefficients and the physics-informed parameter as inputs. If the normalized sub-scale basis coefficients are denoted by $u'$, the normalized input coarse basis coefficients of the element and its neighbors by $u_c$, and the physics-informed parameter by $\alpha$, then the traditional model is given by:

$$u' = f_{FNN}(\alpha, u_c), \tag{35}$$

where $f_{FNN}$ denotes a fully-connected neural network (FNN). The other approach, which we call the variational super-resolution N-N (VSRNN), is based on a multiplicative strategy in which the fine scales are approximated by a sum of products of individual functions of the parameters and the coarse scales. The model form can be summarized as follows:

$$u' = f_{FNN}(g_\alpha \odot g_u), \qquad g_\alpha = h_{FNN}(\alpha), \qquad g_u = k_{FNN}(u_c), \tag{36}$$

where $f_{FNN}$, $h_{FNN}$ and $k_{FNN}$ denote three different FNNs. The symbol $\odot$ denotes element-wise multiplication between two vectors of the same size. This architecture is inspired by the analytical solution for the sub-scale provided in equation (25). In this case, $g_\alpha$ (Part B) learns the dependence on $\alpha$, i.e. $\sqrt{3}\left(\frac{1}{\alpha} - \coth(\alpha)\right)$, and $g_u$ (Part A) learns the dependence on the normalized coarse-scale basis coefficients, i.e., $\mathrm{sgn}(u(h) - u(0))$. In this paper, we will use the VSRNN model for all simulations.

Figure 4: VMS-consistent architecture and features are used for learning the mapping in VSRNN.

5. Data Generation

The generation of proper training and testing data is as critical as the model architecture and features. For example, low-resolution data can be obtained from a variety of high-fidelity sources (simulations and experiments) and can be coarse-grained.
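The multiplicative structure of equation (36) can be made concrete with a minimal forward-pass sketch. Layer sizes, the tanh activation, and the feature count (six coarse coefficients) below are our assumptions for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal numpy sketch of the VSRNN form of eq. (36); sizes are illustrative.
def init(sizes):
    """Random weights/zero biases for a fully-connected net."""
    return [(0.1 * rng.standard_normal((m, n)), np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def fnn(params, x):
    for W, b in params[:-1]:
        x = np.tanh(W @ x + b)        # tanh hidden layers (our choice)
    W, b = params[-1]
    return W @ x + b                  # linear output layer

h_params = init([1, 8, 8])            # g_alpha = h_FNN(alpha)
k_params = init([6, 8, 8])            # g_u = k_FNN(u_c), 6 features assumed
f_params = init([8, 8, 2])            # two normalized sub-scale coefficients

def vsrnn(alpha, u_c):
    """Eq. (36): u' = f_FNN(g_alpha ⊙ g_u)."""
    g_alpha = fnn(h_params, np.atleast_1d(float(alpha)))
    g_u = fnn(k_params, np.asarray(u_c, dtype=float))
    return fnn(f_params, g_alpha * g_u)

out = vsrnn(2.5, np.ones(6))
```

The element-wise product is the key design choice: it factorizes the closure into a physics-parameter branch and a coarse-state branch, mirroring how equation (25) factors into a magnitude in $\alpha$ and a sign from the coarse coefficients.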
There is no guarantee that a model trained to perform super-resolution of the filtered solution will be useful unless the filtering operation is consistent with the underlying numerics. To this end, consider the VMS decomposition of the full-order solution $u$ as follows:

$u = u_h + u'$, (37)

where $u_h \in V_h$ and $u' \in V'$. The vector space of functions $V \equiv H^1(\Omega)$ is a Sobolev space in which the functions and their derivatives are square-integrable. This space is decomposed as follows:

$V = V_h \oplus V'$, (38)

where $\oplus$ denotes the direct sum of $V_h$ and $V'$. Let us also define $\mathcal{T}_h$ to be a tessellation of the domain $\Omega$ into a set of non-overlapping elements, $K$, each having a sub-domain $\Omega_k$ and boundary $\Gamma_k$. $V_h$ is now defined as:

$V_h \triangleq \{\, u \in L^2(\Omega) : u|_T \in P^k(T),\ T \in \mathcal{T}_h \,\}$, (39)

where $P^k$ denotes the space of polynomials up to degree $k$. Defining $V_h$ in this manner allows discontinuities in the solution across element boundaries. Given $u$ from the high-fidelity simulation, our goal is to find the optimal representation of $u$ in the coarse sub-space $V_h$. In our case, we use the $L^2$ projection to obtain the $u_h$ that minimizes $\|u - u_h\|$. This is equivalent to finding $u_h \in V_h$ such that

$(u, w_h) = (u_h, w_h) \quad \forall\, w_h \in V_h$. (40)

For example, to generate the training data for section 7, we use direct numerical simulation (DNS) results for a channel flow at $Re_\tau \approx 950$ [70]. The 3-D data is sliced into many 2-D planes at different $y^+$ locations, and the projection is performed in 2-D for simplicity. The polynomial orders of the fine and coarse spaces are chosen to be 3 and 1, respectively, i.e., we are super-resolving from $p = 1$ to $p = 3$. Although $u$ has so far been treated as if it lived in an infinite-dimensional space, in reality it does not: the Kolmogorov scale $\eta$ dictates the size of the smallest eddy and hence the dimension of $u$. Although the dimension of $u$ is finite, it is enormous compared to that of $u_{h,k}$ and $u_{h,m}$, because the size $h$ of our finite element grid is much greater than $\eta$. Hence, to compute these terms accurately, the DNS data is first interpolated using cubic splines onto a much finer grid of $O(\eta)$, and the inner products with the coarse finite element basis functions (having support of $O(h)$) are then computed on this fine mesh using Simpson's rule. The interpolation of the DNS data ensures that $u_{h,k}$ and $u_{h,m}$ do not change significantly with the numerical integration scheme. Sample $L^2$-projected snapshots of the DNS data on elements of different sizes and orders are shown in figure 5.

Figure 5: Example $L^2$-projected snapshots of the DNS data for channel flow at $Re_\tau \approx 950$ at a fixed wall-normal distance $y^+$.
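The elementwise $L^2$ projection of equation (40) is straightforward to implement on a single element at a time. The following is a minimal 1-D sketch, not the authors' code: the function name, the choice of a Legendre basis (orthogonal on each element, so the projection reduces to weighted inner products), and the quadrature settings are ours, for illustration only.

```python
import numpy as np

def l2_project(u, n_elements, degree, domain=(0.0, 2 * np.pi), n_quad=8):
    """Per-element Legendre coefficients of the L2 projection of u.

    On each element, the coefficient of basis function phi_i is
    (u, phi_i) / (phi_i, phi_i), evaluated with Gauss-Legendre quadrature
    on the reference element [-1, 1].
    """
    xi, w = np.polynomial.legendre.leggauss(n_quad)
    a, b = domain
    h = (b - a) / n_elements
    centers = a + (np.arange(n_elements) + 0.5) * h
    x = centers[:, None] + 0.5 * h * xi[None, :]   # physical quadrature points
    coeffs = np.empty((n_elements, degree + 1))
    for i in range(degree + 1):
        phi = np.polynomial.legendre.Legendre.basis(i)(xi)
        # (phi_i, phi_i) = 2/(2i+1) on the reference element
        coeffs[:, i] = (u(x) * phi * w).sum(axis=1) / (2.0 / (2 * i + 1))
    return coeffs

# The projection reproduces any function already in V_h exactly: for a
# linear u, all coefficients beyond the first two vanish.
c = l2_project(lambda x: 1.0 + 2.0 * x, n_elements=4, degree=2)
```

Because the basis is orthogonal, the zeroth coefficient is simply the element mean, which makes the normalization of equations (42)-(43) cheap to evaluate from the same data.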
6. Application to Linear Advection
In this section, we apply our super-resolution methodology to the linear advection equation in the domain $\Omega \subset \mathbb{R}$ with boundary $\Gamma = \partial\Omega$:

$\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0$, (41)

with time $t \in (0, T]$ and spatially periodic boundary conditions on $\Gamma$. The parameter $a$ denotes the advection velocity. The required training data is generated by $L^2$-projecting the true solution onto the coarse and fine spaces. Unlike traditional methods, in which only the spatial term in the PDE is discretized using finite elements, we consider 2-D space-time finite elements. The super-resolution model is used in two different settings: first, as a model to super-resolve coarse low-order finite element data to finer high-order finite element data; second, as a method to improve the existing finite element method in a predictive setting, on a problem with a very different initial condition from the training data.

| N_x | N_t | Δx | Δt | CFL | ‖u₂−u₁‖ | ‖u₂,NN−u₁‖ | ‖u₂,NN−u₂‖ | ‖u₂−u₁‖/‖u₂‖ | ‖u₂,NN−u₁‖/‖u₂‖ | ‖u₂,NN−u₂‖/‖u₂‖ |
| 16 | 32 | 0.3927 | 0.19635 | 0.5 | 0.41684 | 0.41684 | 0.0018975 | 0.066344 | 0.066345 | 0.00030201 |
| 16 | 16 | 0.3927 | 0.3927 | 1 | 0.56984 | 0.57053 | 0.0044046 | 0.090699 | 0.090809 | 0.00070106 |
| 16 | 8 | 0.3927 | 0.7854 | 2 | 1.4709 | 1.4681 | 0.0066213 | 0.23456 | 0.23411 | 0.0010559 |
| 16 | 4 | 0.3927 | 1.5708 | 4 | 3.3222 | 3.3223 | 0.016548 | 0.56392 | 0.56394 | 0.0028089 |
| 16 | 2 | 0.3927 | 3.1416 | 8 | 3.115 | 2.9975 | 1.7061 | 0.75135 | 0.723 | 0.41152 |
| 32 | 64 | 0.19635 | 0.098175 | 0.5 | 0.1075 | 0.10755 | 0.00075342 | 0.01711 | 0.017117 | 0.00011991 |
| 32 | 32 | 0.19635 | 0.19635 | 1 | 0.14736 | 0.14738 | 0.0020468 | 0.023453 | 0.023457 | 0.00032576 |
| 32 | 16 | 0.19635 | 0.3927 | 2 | 0.41684 | 0.4157 | 0.005024 | 0.066344 | 0.066163 | 0.00079962 |
| 32 | 8 | 0.19635 | 0.7854 | 4 | 1.4235 | 1.4245 | 0.009266 | 0.22699 | 0.22716 | 0.0014776 |
| 32 | 4 | 0.19635 | 1.5708 | 8 | 3.3146 | 2.723 | 1.3326 | 0.5626 | 0.46219 | 0.2262 |

Table 1: Reconstruction error when super-resolved from p = 1 to p = 2 at different CFL numbers.

| N_x | N_t | Δx | Δt | CFL | ‖u₃−u₁‖ | ‖u₃,NN−u₁‖ | ‖u₃,NN−u₃‖ | ‖u₃−u₁‖/‖u₃‖ | ‖u₃,NN−u₁‖/‖u₃‖ | ‖u₃,NN−u₃‖/‖u₃‖ |
| 16 | 64 | 0.3927 | 0.098175 | 0.25 | 0.40799 | 0.40956 | 0.029322 | 0.064934 | 0.065183 | 0.0046667 |
| 16 | 32 | 0.3927 | 0.19635 | 0.5 | 0.42022 | 0.42039 | 0.004307 | 0.066879 | 0.066907 | 0.00068548 |
| 16 | 16 | 0.3927 | 0.3927 | 1 | 0.57471 | 0.57558 | 0.0043912 | 0.091468 | 0.091606 | 0.00069888 |
| 16 | 8 | 0.3927 | 0.7854 | 2 | 1.5204 | 1.5228 | 0.0087391 | 0.242 | 0.24238 | 0.001391 |
| 16 | 4 | 0.3927 | 1.5708 | 4 | 3.8684 | 3.8747 | 0.018305 | 0.62236 | 0.62338 | 0.0029449 |
| 16 | 2 | 0.3927 | 3.1416 | 8 | 3.8513 | 3.6725 | 3.079 | 0.81524 | 0.77739 | 0.65176 |
| 32 | 128 | 0.19635 | 0.049087 | 0.25 | 0.108 | 0.10626 | 0.0096799 | 0.017188 | 0.016912 | 0.0015406 |
| 32 | 64 | 0.19635 | 0.098175 | 0.5 | 0.10772 | 0.10788 | 0.0011823 | 0.017145 | 0.017169 | 0.00018816 |
| 32 | 32 | 0.19635 | 0.19635 | 1 | 0.14764 | 0.14791 | 0.0015759 | 0.023498 | 0.02354 | 0.00025081 |
| 32 | 16 | 0.19635 | 0.3927 | 2 | 0.42022 | 0.42123 | 0.0037693 | 0.066879 | 0.06704 | 0.0005999 |
| 32 | 8 | 0.19635 | 0.7854 | 4 | 1.4737 | 1.4799 | 0.011139 | 0.23456 | 0.23554 | 0.0017729 |
| 32 | 4 | 0.19635 | 1.5708 | 8 | 3.8616 | 3.0788 | 1.266 | 0.62126 | 0.49532 | 0.20368 |

Table 2: Reconstruction error when super-resolved from p = 1 to p = 3 at different CFL numbers.

Figure 6: The high-resolution linear advection solution on a 512x512 mesh is $L^2$-projected onto different finite element meshes at different polynomial orders p.

To train the super-resolution model, we generate the true solution on a fine grid. The initial condition for this case is $\sin(x) + \sin(2x) + \sin(4x)$, and the size of the grid is taken to be $N_x \times N_t$: 512 x 512. The true solution is evaluated on all the grid points to obtain the DNS solution. The next step is to obtain the coarse- and fine-space $L^2$-projected solutions for meshes having spatial and temporal element sizes $\Delta x$ and $\Delta t$, respectively, as shown in figure 6. A non-dimensional parameter arising naturally in this case is the CFL number, $CFL = a\Delta t/\Delta x$, which plays a role similar to that of the Peclet number in the 1-D convection-diffusion problem. The solution is projected onto grids of various sizes corresponding to CFL numbers of 0.25, 0.5, 1.0, 2, 4 and 8. These CFL numbers correspond to three sets of grid sizes: (i.) $N_x \times N_t$: 32 x 64, 32 x 32, 32 x 16 and 32 x 8; (ii.) $N_x \times N_t$: 16 x 64, 16 x 32, 16 x 16, 16 x 8 and 16 x 4; and (iii.) $N_x \times N_t$: 24 x 96, 24 x 48, 24 x 24, 24 x 12 and 24 x 6. For each element in these grids, the basis coefficients corresponding to the coarse space are extracted along with those of its neighbors, excluding neighbors belonging to the future time step; i.e., only the space-time elements to the south, south-east, south-west, east, and west of the central coarse element are used. The corresponding fine-space basis coefficients are also extracted for the central element.

Figure 7: Comparison of the true $L^2$-projected fine solution and the solution super-resolved from the $L^2$-projected coarse solution.

As a first step towards normalization, a mean value $u_m$ is computed inside each element as follows:

$u_m = \dfrac{\int\!\!\int_{\Omega_e} u(x,t)\, dx\, dt}{\int\!\!\int_{\Omega_e} dx\, dt}$. (42)

Similarly, an R.M.S. value is also computed inside the element:

$u_{rms} = \sqrt{ \dfrac{\int\!\!\int_{\Omega_e} \left( u(x,t) - u_m \right)^2 dx\, dt}{\int\!\!\int_{\Omega_e} dx\, dt} }$. (43)

Figure 8: Comparison of the solution obtained using the traditional space-time method and the super-resolved method with the projected DNS solution for initial condition $u(x,0) = \sin(x) + \sin(3x)$.

A model is then sought in the following form:

$\left[ C'_{1,p_s}, C'_{2,p_s}, \ldots, C'_{p_s+1,p_s} \right] = f\!\left( \log(CFL),\ \left[ \tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{p_c+1,p_c} \right]_1, \ldots, \left[ \tilde{C}_{1,p_c}, \tilde{C}_{2,p_c}, \ldots, \tilde{C}_{p_c+1,p_c} \right]_m \right)$, (44)

where $C'_{i,j}$ and $\tilde{C}_{i,j}$ are defined in terms of $u_m$ and $u_{rms}$ similarly to equation (33). For learning this function, the VSRNN architecture is adopted with sizes 24 x 12 for part A, 1 x 12 for part B, and a 12-wide post-multiplication part. The trained model is then evaluated on a different set of initial conditions, as shown in figure 7. It can be observed that, unless an extremely coarse mesh was used, the model could efficiently super-resolve the unseen coarse solution to its fine solution with minimal reconstruction error. The reconstruction error for these cases is reported in table 1. Except when the CFL number was as high as 8, the reconstruction error $\|u_{2,NN} - u_2\|$ is orders of magnitude smaller than the magnitude of the sub-scale $\|u_2 - u_1\|$. Hence, the super-resolution model is very efficient at reconstructing the fine solution as long as the CFL number is not large. A separate model for super-resolving from p = 1 to p = 3 was also trained, and efficient reconstruction was likewise obtained at the lower CFL values (table 2). Irrespective of the large reconstruction error at CFL values of 8 and above, the magnitude of $\|u_2 - u_1\|$ remains very close to $\|u_{2,NN} - u_1\|$ for all CFL numbers. This indicates that the super-resolution model can also act as an efficient error indicator for mesh adaptation.

Figure 9: Comparison of the solution obtained using the traditional space-time method and the super-resolved method with the projected DNS solution for initial condition $u(x,0) = 2.0\, e^{-(x-\pi)^2}$.

In the previous section, we showed that the neural network can efficiently predict the sub-scales as long as the grid is not highly under-resolved. In this section, we use the trained model from the previous section to improve the existing space-time numerical method. To this end, we start with the linear advection equation in the domain $\Omega \subset \mathbb{R}$ with boundary $\Gamma = \partial\Omega$:

$\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0$, (45)
Figure 10: Evolution of the energy $E(t) = \int u^2(x,t)\, d\Omega_x$ as a function of time for the traditional space-time method vs. the super-resolution method for initial condition $u(x,0) = \sin(x) + \sin(3x)$.

with a periodic boundary condition on $\Gamma$ and time $t \in (0, T]$. The weak form of the above equation is obtained by multiplying it by a test function $w$ and integrating over the space-time element:

$\int_{\Omega_e} \left( \frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} \right) w\, d\Omega = 0$, (46)

such that $u \in V$ for all $w \in V$. Let us also define $\mathcal{T}_h$ as a tessellation of the domain $\Omega$ into a set of non-overlapping elements, $K$, each having a sub-domain $\Omega_e$ and boundary $\Gamma_e$. The vector space of functions $V \equiv H^1(\mathcal{T}_h)$ is a broken Sobolev space in which the functions and their derivatives are square-integrable inside each element. Rewriting the previous equation, we obtain:

$\int_{\Omega_e} \left( \frac{\partial (uw)}{\partial t} + \frac{\partial (auw)}{\partial x} \right) d\Omega - \int_{\Omega_e} \left( u \frac{\partial w}{\partial t} + au \frac{\partial w}{\partial x} \right) d\Omega = 0$. (47)

Application of the divergence theorem leads to:

$\int_{\Gamma_e} ( auw\, \hat{i} + uw\, \hat{j} ) \cdot ( n_x \hat{i} + n_t \hat{j} )\, d\Gamma - \int_{\Omega_e} \left( u \frac{\partial w}{\partial t} + au \frac{\partial w}{\partial x} \right) d\Omega = 0$, (48)

where $n_x$ and $n_t$ denote the components of the outward normal on the element surface along the space and time axes, respectively. In the DG method, the first term is replaced with a numerical flux:

$\int_{\Gamma_e} \left( F^*_x(au, au^-)\, \hat{i} + F^*_t(u, u^-)\, \hat{j} \right) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w\, d\Gamma - \int_{\Omega_e} \left( u \frac{\partial w}{\partial t} + au \frac{\partial w}{\partial x} \right) d\Omega = 0$. (49)

The traditional space-time DG method is obtained by applying the Galerkin approximation to the previous equation:

$\int_{\Gamma_e} ( F^*_x \hat{i} + F^*_t \hat{j} ) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w_h\, d\Gamma - \int_{\Omega_e} \left( u_h \frac{\partial w_h}{\partial t} + au_h \frac{\partial w_h}{\partial x} \right) d\Omega = 0$, (50)

Figure 11: Evolution of the energy $E(t) = \int u^2(x,t)\, d\Omega_x$ as a function of time for the traditional space-time method vs. the super-resolution method for initial condition $u(x,0) = 2.0\, e^{-(x-\pi)^2}$.

where $F^*_x = a\tilde{u}^-$ when $an_x < 0$ and $a\tilde{u}$ when $an_x > 0$. The temporal flux is based on the previous space-time slab, i.e., $F^*_t = \tilde{u}^-$. The effect of the numerical fluxes is similar to that of a closure, which is dissipative in action due to the jump term, ensuring the stability of the method. The numerical fluxes were originally developed to solve the exact 1-D problem at the interface, and they are generally applied to the DG method along the normal direction of the interface. However, this might not be the most optimal choice of flux. To this end, we revisit the strong form of the DG method, i.e., equation (49), through the VMS approach. The coarse-scale equation corresponding to equation (49) is given by:

$\int_{\Gamma_e} \left( F^*_x(a(u_h + u'), a(u_h^- + u'^-))\, \hat{i} + F^*_t(u_h + u', u_h^- + u'^-)\, \hat{j} \right) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w_h\, d\Gamma - \int_{\Omega_e} \left( (u_h + u') \frac{\partial w_h}{\partial t} + a(u_h + u') \frac{\partial w_h}{\partial x} \right) d\Omega = 0$. (51)

If $V_h \perp V'$, the effect of the sub-scale on the interior term is negligible, i.e.,

$\int_{\Gamma_e} \left( F^*_x(a(u_h + u'), a(u_h^- + u'^-))\, \hat{i} + F^*_t(u_h + u', u_h^- + u'^-)\, \hat{j} \right) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w_h\, d\Gamma - \int_{\Omega_e} \left( u_h \frac{\partial w_h}{\partial t} + au_h \frac{\partial w_h}{\partial x} \right) d\Omega = 0$, (52)

and the effect of the unresolved sub-scales enters only through the flux. The true solution $u_h + u'$ is infinite-dimensional. However, only a few of its moments are required, in the form of inner products with low-order basis functions on element faces. In this limit, we assume that the following approximation can be made: $u_s \approx u_h + u'$, where $u_s$ denotes the super-resolved counterpart of $u_h$, i.e.,

$\int_{\Gamma_e} \left( F^*_x(a(u_h + u'), a(u_h^- + u'^-))\, \hat{i} + F^*_t(u_h + u', u_h^- + u'^-)\, \hat{j} \right) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w_h\, d\Gamma \approx \int_{\Gamma_e} \left( F^*_x(au_s, au_s^-)\, \hat{i} + F^*_t(u_s, u_s^-)\, \hat{j} \right) \cdot ( n_x \hat{i} + n_t \hat{j} )\, w_h\, d\Gamma$, (53)

where $F^*_x$ and $F^*_t$ are the traditional upwind numerical fluxes, but computed using the super-resolved state $u_s$ instead of $u_h$. In this paper, we choose the spaces of $u_h$ and $u_s$ to be p = 1 and p = 2, respectively.
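The flux substitution of equation (53) reuses the standard upwind flux and changes only the traces fed into it. The sketch below (function name and trace values are ours, purely illustrative) makes this explicit for the spatial flux: the same routine is called with either the coarse or the super-resolved traces.

```python
def upwind_flux_x(a, u_in, u_out, n_x):
    """Upwind spatial flux a*u_tilde through a face with outward normal
    component n_x: use the interior trace when the flow leaves the element
    (a*n_x > 0) and the exterior trace when it enters (a*n_x < 0)."""
    return a * u_in if a * n_x > 0 else a * u_out

# Same flux function, different traces (the essence of eq. 53).
a, n_x = 1.0, 1.0
u_h_in, u_h_out = 0.80, 1.20   # hypothetical coarse traces
u_s_in, u_s_out = 0.85, 1.15   # hypothetical super-resolved traces
F_coarse = upwind_flux_x(a, u_h_in, u_h_out, n_x)  # interior trace: 0.8
F_sr = upwind_flux_x(a, u_s_in, u_s_out, n_x)      # interior trace: 0.85
```

Only the interface evaluation changes; the volume terms of the DG scheme are untouched, which is why the closure acts purely through the flux.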
Figure 12: Different initial conditions used for training, offline reconstruction, and online evaluation of the model.

To obtain the super-resolved state, we re-use the super-resolution network trained in the previous sub-section. Two different initial conditions are chosen: (i.) Case A with initial condition $\sin(x) + \sin(3x)$; (ii.) Case B with initial condition $2.0\, e^{-(x-\pi)^2}$. These initial conditions are different from those used for training and testing. A comparison of the initial conditions used in training, reconstruction, and online evaluation is summarized in figure 12. The space-time slab is discretized into 32 elements in the spatial direction, and the CFL value is taken to be 1.0. Figures 8 and 9 show the results obtained for the super-resolution model and the traditional model compared to the optimal solution obtained by $L^2$-projection of the DNS solution for the two initial conditions, respectively. The super-resolution model is far more accurate than the traditional method, and the sub-scales are recovered with a high level of accuracy in both cases. In the case of the traditional method, the extrema decrease considerably with time compared to the super-resolution space-time method, in both figure 8 and figure 9. This demonstrates the higher dissipation of the traditional numerical method relative to the super-resolution method. It can also be seen quantitatively in figures 10(a) and 11(a), where the red line denoting the time evolution of the energy, $E(t) = \int_{\Omega_x} u_h^2\, d\Omega_x$, decreases in time for the traditional approach in comparison to the optimal p = 1 projection. The super-resolution model can also be applied offline when the coarse solution is made available. As shown in figures 10(a) and 11(a), application of the super-resolution in an offline stage does not improve the results. On the other hand, when the super-resolved states are used online to compute the flux in the numerical method, a high level of $L^2$-optimality is also obtained in the coarse solution, as shown in figures 10(b) and 11(b). Consequently, the corresponding super-resolved solution is also accurate and close to the projected p = 2 truth. The offline case is different: in the evaluation stage, the model encounters input parameters that it has not seen during training and outputs an incorrect super-resolved solution. The blue line represents another optimal representation of the DNS solution on the coarse solution space, and not the $L^2$-optimal solution for which the method has been trained. However, when the super-resolved state is used to compute the fluxes, the coarse solution is forced towards $L^2$-optimality, because the method has been formulated using the VMS approach, in which the coarse and fine spaces are formally defined. As shown in the second part of figure 13, consistent numerical methods are required for the super-resolution models to work correctly. The VMS method is an ideal candidate for achieving this consistency.

Figure 13: Sources of errors in offline and online super-resolution.
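The energy diagnostic $E(t) = \int_{\Omega_x} u_h^2\, d\Omega_x$ used in figures 10 and 11 can be evaluated directly from the DG solution by quadrature. A minimal numpy sketch (our own helper, assuming a nodal representation of $u_h$ at Gauss points in each element):

```python
import numpy as np

def dg_energy(u_at_gauss, element_size, gauss_weights):
    """E = sum over elements of the integral of u^2, via Gauss-Legendre
    quadrature. u_at_gauss has shape (n_elements, n_quad)."""
    jacobian = element_size / 2.0  # maps the reference element [-1, 1]
    return float(np.sum(u_at_gauss ** 2 * gauss_weights) * jacobian)

# Example: E for u(x) = sin(x) on [0, 2*pi) with 32 elements; the exact
# value is pi.
n_el, n_q = 32, 4
xi, w = np.polynomial.legendre.leggauss(n_q)
h = 2 * np.pi / n_el
centers = (np.arange(n_el) + 0.5) * h
x = centers[:, None] + 0.5 * h * xi[None, :]
E = dg_energy(np.sin(x), h, w)
```

Tracking this scalar over the space-time slabs is what exposes the excess dissipation of the baseline flux relative to the super-resolved flux.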
7. p-Super-resolution of turbulent channel flow
The assessment of super-resolution models for turbulent flows poses a stern challenge, because, given a coarse-scale solution, there are infinitely many possible fine-scale solutions. This is especially true for filtering with the sharp spectral filter, where the fine-scale solution is lost after filtering and it is impossible to recover the original field from the filtered data. The exact fine-scales are functions of both the coarse scales and their time-history [14, 37-39, 71]. As shown by Langford and Moser [23] in their work on optimal LES, the exact fine-scales might not be required to compute the correct single-time, multi-point statistical quantities of the large-scale field. The Smagorinsky and VMS models [1, 8-11, 13, 39], which perform well online, are well known to perform poorly in an a priori setting. Similarly, the N-N generated super-resolved field is not point-wise exact; rather, it is an optimal representation of the fine-space solution that generates correct single-time, multi-point statistics.

Since our model is compact, the $L^2$ error is computed only locally, within a single element. The training data consists of data from several elements, drawn from a 2-D DNS slice with homogeneity in the stream-wise and span-wise directions. During the optimization, these elementwise errors are averaged. As a result, the model output is, in some sense, an optimal representation of the fine-scales over all possible realizations. We will therefore use one-dimensional energy spectra, averaged over the homogeneous directions, as the measure of model accuracy, in place of the $L^2$ norm of the full field. Contours of the reconstructed fine-space solution will be presented for qualitative purposes only.

The first step is to generate data for training the model. As described in section 4, a single 2-D DNS snapshot is extracted at a wall-normal height of $y^+ \approx 850$ and is $L^2$-projected onto discontinuous polynomial spaces of orders p = 1 and p = 3 on meshes of different sizes. In this case, we project the DNS solution onto meshes with $N_x \times N_y$: 8 x 4, 16 x 8, 32 x 16 and 64 x 32 elements in the x and y directions, respectively. Once the p = 1 and p = 3 solutions are obtained, the p = 1 basis coefficients are extracted for both the element and its immediate neighbors, along with the p = 3 basis coefficients of the central element. The next step is to evaluate the normalizing parameter for each element, i.e., $u_{rms} = \sqrt{ \int_{\Omega_e} (u - u_m)^2\, d\Omega_e / \int_{\Omega_e} d\Omega_e }$, where the mean velocity $u_m$ inside an element is given by $u_m = \int_{\Omega_e} u\, d\Omega_e / \int_{\Omega_e} d\Omega_e$. A functional form similar to equation (32) is then assumed, except that the parameter $\alpha$ is replaced with the logarithm of the cell Reynolds number, $\log(Re_\Delta)$. The physics-informed feature $\log(Re_\Delta)$ ensures that the different orders of magnitude of $Re_\Delta$ in the training data are accounted for. Finally, the normalized input and output basis coefficients and $\log(Re_\Delta)$ for each element are assembled into a single table as the training data-set.

A VSRNN architecture for the N-N model is then assumed, with sizes 37 x ... x 32 for part A and 16 x ... x 16 for the post-multiplication part, respectively. The model performance is evaluated in figures 14 and 15 by comparing the stream-wise and span-wise energy spectra obtained for the super-resolved p = 3 solution and the $L^2$-optimal p = 3 solution at different wall-normal heights $y^+$. The spectra are computed on a $(p+1)N_x \times (p+1)N_y$ grid, where the factor $(p+1)$ accounts for the effective grid size at higher orders. This also prevents over-sampling of the data. As can be observed in figures 14 and 15, the network successfully recreates the correct energy distribution across the different wave-numbers in both the stream-wise and span-wise directions. This is true both for the plane at $y^+ \approx 850$ used for training and for unseen planes, including $y^+ \approx 500$. A comparison of the $L^2$-projected p = 1 coarse solution, the N-N super-resolved p = 3 solution, and the $L^2$-optimally projected p = 3 solution for different grid sizes at a wall-normal distance of $y^+ \approx 500$ is shown in figure 16. It can be observed that the super-resolved solution, like the optimal p = 3 solution, contains finer scales than the coarse solution across the different mesh resolutions.

The generalizability of the model trained on a single snapshot of DNS data stems from the fact that, when the DNS image is projected onto several finite element meshes with different element sizes, the average value of the cell Reynolds number changes. As a result, the training data contains an extensive range of cell Reynolds numbers. When the trained model is evaluated at different wall-normal distances with the same grid size, the cell Reynolds number changes due to changes in $u_{rms}$. However, this new cell Reynolds number could also have been obtained at the original wall-normal height by changing the grid size alone. This can also be observed in equation (26) for the normalized sub-scales: the normalized sub-scales depend only on the cell Peclet number $\alpha$ and the non-dimensionalized inputs, rather than on the grid size or the diffusion coefficient separately.

Figure 14: Stream-wise energy spectra obtained for the $L^2$-projected stream-wise velocity on p = 1, the $L^2$-projected stream-wise velocity on p = 3, and the N-N super-resolved p = 3 solution at different wall-normal heights $y^+$ and mesh resolutions.

Figure 15: Span-wise energy spectra obtained for the $L^2$-projected stream-wise velocity on p = 1, the $L^2$-projected stream-wise velocity on p = 3, and the N-N super-resolved p = 3 solution at different wall-normal heights $y^+$ and mesh resolutions.

The previous sub-section showed that the super-resolution model accurately reconstructs the high-order optimal solution from the low-order optimal solution. We now consider LES of the compressible Navier-Stokes equations; for simplicity of presentation, we detail the development only for the inviscid terms. The domain $\Omega \subset \mathbb{R}^d$ is used with boundary $\Gamma = \partial\Omega$, where $d$ is the spatial dimension, and the conservation law is

$\frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot \mathbf{F}(\mathbf{u}) = 0$, (54)

with appropriate boundary conditions on $\Gamma$ and time $t \in (0, T]$. The state vector $\mathbf{u}$ is given by:

$\mathbf{u} = [\rho, \rho u, \rho v, \rho w, \rho E]^T$, (55)

Figure 16: Stream-wise velocity contours of the coarse solution, the super-resolved fine solution, and the corresponding optimal fine solution.

and the flux matrix $\mathbf{F}(\mathbf{u})$ is given by:

$\mathbf{F}(\mathbf{u}) = \begin{bmatrix} \rho u & \rho v & \rho w \\ \rho u^2 + p & \rho u v & \rho u w \\ \rho v u & \rho v^2 + p & \rho v w \\ \rho w u & \rho w v & \rho w^2 + p \\ \rho u H & \rho v H & \rho w H \end{bmatrix}$. (56)

Let us also define $\mathcal{T}_h$ as a tessellation of the domain $\Omega$ into a set of non-overlapping elements, $K$, each having a sub-domain $\Omega_k$ and boundary $\partial\Omega_k$. The DG weak form is obtained by multiplying by weighting functions $\mathbf{w}$ and integrating over an element:

$\int_{\Omega_k} \left( \frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot \mathbf{F}(\mathbf{u}) \right) \cdot \mathbf{w}\, d\Omega = 0$. (57)

After application of integration by parts, we obtain:

$\int_{\Omega_k} \frac{\partial \mathbf{u}}{\partial t} \cdot \mathbf{w}\, d\Omega - \int_{\Omega_k} \nabla \mathbf{w} : \mathbf{F}(\mathbf{u})\, d\Omega + \int_{\partial\Omega_k} \mathbf{w} \cdot \mathbf{F}^*(\mathbf{u}, \mathbf{u}^-)\, d\Gamma = 0$. (58)

Using the Galerkin approximation, we have:

$\int_{\Omega_k} \frac{\partial \mathbf{u}_h}{\partial t} \cdot \mathbf{w}_h\, d\Omega - \int_{\Omega_k} \nabla \mathbf{w}_h : \mathbf{F}(\mathbf{u}_h)\, d\Omega + \int_{\partial\Omega_k} \mathbf{w}_h \cdot \mathbf{F}^*(\mathbf{u}_h, \mathbf{u}_h^-)\, d\Gamma = 0$, (59)

where the components of $\mathbf{u}_h$, i.e., $u_{h,i} \in V^3$, and $V^3$ is the space of p = 3 polynomials:

$V^3 \triangleq \{\, u \in L^2(\Omega) : u|_T \in P^3(T),\ T \in \mathcal{T}_h \,\}$. (60)

The numerical flux $\mathbf{F}^*$ is taken to be the Roe flux, and an under-resolved simulation results in a sub-optimal solution $\mathbf{u}_h$, i.e.,

$\mathbf{u}_h \neq \mathbf{u}^3 = \Pi^3 \mathbf{u}$, (61)

where $\Pi^3$ denotes the $L^2$-projection onto $V^3$.
Similarly, the large scales in $\mathbf{u}_h$ will also be inconsistent, i.e.,

$\mathbf{u}_{h,1} = \Pi^1 \mathbf{u}_h \neq \mathbf{u}^1 = \Pi^1 \mathbf{u}^3 = \Pi^1 \mathbf{u}$, (62)

where $\Pi^1$ projects onto the coarse space formed by the tensor product of p = 1 polynomials. However, the coarse-scale component obtained after projection, i.e., $\mathbf{u}_{h,1} = \Pi^1 \mathbf{u}_h$, is better resolved in the coarse space $V^1$ than $\mathbf{u}_h$ is in the fine space $V^3$. This is because the numerical dissipation due to the standard numerical flux is more likely to corrupt the smaller scales than the larger, resolved scales. We can now use our model to super-resolve $\mathbf{u}_{h,1}$ back to $\mathbf{u}_s \in V^3$, as follows:

$\mathbf{u}_s = f_{NN}(\mathbf{u}_{h,1}, \ldots)$, (63)

where $\mathbf{u}_s$ is the super-resolved state in an element when the coarse-scale solution in the element and its neighbors is given by $\{\mathbf{u}_{h,1}, \ldots\}$. The super-resolution of each state is performed independently on 2-D planes. The size of the network used for super-resolution is reduced to 36 x ... x 16 to ensure computational efficiency. Finally, in an approach similar to section 6.2, the super-resolved state is used to compute the flux terms in the DG formulation:

$\int_{\Omega_k} \frac{\partial \mathbf{u}_h}{\partial t} \cdot \mathbf{w}_h\, d\Omega - \int_{\Omega_k} \nabla \mathbf{w}_h : \mathbf{F}(\mathbf{u}_h)\, d\Omega + \int_{\partial\Omega_k} \mathbf{w}_h \cdot \mathbf{F}^*(\mathbf{u}_s, \mathbf{u}_s^-)\, d\Gamma = 0$. (64)

Using the super-resolved state $\mathbf{u}_s$ directly in the boundary flux term makes the scheme unstable when used with explicit R-K type time-stepping methods. To stabilize this approach (SR-LES), the super-resolution is relaxed as follows:

$\int_{\Omega_k} \frac{\partial \mathbf{u}_h}{\partial t} \cdot \mathbf{w}_h\, d\Omega - \int_{\Omega_k} \nabla \mathbf{w}_h : \mathbf{F}(\mathbf{u}_h)\, d\Omega + \int_{\partial\Omega_k} \mathbf{w}_h \cdot \mathbf{F}^*\big( (1-\lambda)\mathbf{u}_h + \lambda\mathbf{u}_s,\ (1-\lambda)\mathbf{u}_h^- + \lambda\mathbf{u}_s^- \big)\, d\Gamma = 0$. (65)

A value of $\lambda = 0.1-0.2$ is used in this work; although a higher $\lambda$ is desirable, stability generally demands $\lambda \le 0.2$. The discretization of the viscous terms in the compressible Navier-Stokes equations is performed using the second form of the Bassi and Rebay [72] scheme. The boundary terms arising from the viscous fluxes are also evaluated using the under-relaxed super-resolved state, in the same way as the inviscid fluxes.

To compare the performance of the different models, we perform LES of channel flow. The numbers of elements in the stream-wise (x) and span-wise (y) directions are $N_x = N_y = 12$, and the domain dimensions are $[L_x, L_y, L_z] : [2\pi\delta, \pi\delta, \delta]$, respectively. The element sizes in the wall-normal direction vary from $\Delta z^+_{min}$ near the wall to $\Delta z^+_{max}$ at the center of the channel. Since high-order basis functions are used, the effective grid size can be approximated by $\Delta_{eff} \approx \Delta/p$, where $\Delta$ is the element size and p = 3 is the polynomial order. Time marching was performed using the explicit RK3-TVD scheme for all cases. Figure 17 shows the velocity statistics obtained for the channel flow problem using ILES, SR-LES and DNS. The performance improves as $\lambda$ is increased to 0.2. This is observed both in the mean velocity profile and in the stream-wise root-mean-square (RMS) velocity profile. The RMS peak of the stream-wise velocity is improved for both $\lambda = 0.1$ and $\lambda = 0.2$; stability, however, precludes the higher values of $\lambda$ at which the model is expected to work better. One of the reasons for the constraint on $\lambda$ is the explicit time-stepping scheme that was used: no such factor was needed in the linear advection case of the previous section, where an implicit space-time method was employed. As discussed further in the perspectives section below, additional challenges have to be addressed to ensure the success of super-resolution networks for predictive modeling of turbulent flows.
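The relaxation in equation (65) is a simple convex blend of the coarse and super-resolved interface traces before the Riemann flux is evaluated. A minimal sketch (our own helper and hypothetical trace values, not the authors' code):

```python
import numpy as np

def relaxed_trace(u_h, u_s, lam):
    """Blended trace of eq. (65): (1 - lam)*u_h + lam*u_s.
    lam = 0 recovers the baseline DG flux; with explicit Runge-Kutta
    time stepping, stability restricts lam to roughly <= 0.2."""
    return (1.0 - lam) * u_h + lam * u_s

# Hypothetical interior/exterior traces of one state component at a face:
u_h_int, u_h_ext = np.array([1.00]), np.array([1.10])
u_s_int, u_s_ext = np.array([1.04]), np.array([1.06])
lam = 0.2
blended = (relaxed_trace(u_h_int, u_s_int, lam),
           relaxed_trace(u_h_ext, u_s_ext, lam))
# The blended pair is then passed to the Riemann solver (Roe flux) in
# place of (u_h_int, u_h_ext).
```

Because the blend is applied only at element faces, the cost over the baseline scheme is one network evaluation per element plus a few axpy operations per face.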
8. Perspectives
Inspired by successes in the machine vision community, there has recently been considerable interest in the use of super-resolution in the physical sciences. Much of the existing literature has, however, focused on reconstruction performance rather than on predictive modeling. Truly predictive models should not be restricted to a single mesh or flow configuration, and should generalize to a class of flows. Despite the success in the canonical problem of section 6, the results in section 7 suggest that there is much to be done before a truly predictive capability can be realized for a problem as challenging as turbulent flow. We view our work as a first step towards a predictive LES capability. Along these lines, we outline the following ingredients for the discovery of sub-grid closures:

1. The model should be constructed using features that lend themselves to generalization.
2. The structure of the learning model should allow one to efficiently embed physics-informed parameters.
3. The closure model should be intimately linked to the underlying numerical discretization.
4. The training should be performed in a manner such that the super-resolution is consistent with the coarse scales during prediction.

In our work, we addressed the first three points. Regarding the fourth: during training, the model is used as $u' = f(u_h^{DNS})$, whereas in the on-line prediction stage it is used as $u' = f(u_h^{LES})$. As the error between the coarse scales in the LES and the DNS grows, the super-resolution becomes less accurate. In other words, the parameters of the learning model have not been inferred for on-line performance. While model-consistent training has been successfully demonstrated for RANS closures [40, 43, 74], the authors are aware of only one such attempt in the context of LES [75]. However, as mentioned above, and in more detail in Ref. [75], implicitly filtered approaches are associated with other challenges. The VMS approach, on the other hand, allows for both numerics-consistent and model-consistent training, but the implementation of such a capability is a major undertaking that is yet to be pursued by the authors in an LES context.

As a final point, while the appeal of VMS is the segregation of scales and the prospect of deriving closures with few phenomenological assumptions, structural models (e.g. [39]) generally perform poorly when the simulation is severely under-resolved. Several attempts [76, 77] have been made to combine traditional VMS approaches with phenomenological models like Smagorinsky in the form of mixed models. The use of data-driven techniques potentially allows us to account for the phenomenological relationships present in the data directly in the VMS model, thus bridging the gap between phenomenological and structural modeling.
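The distinction between offline and model-consistent training discussed above can be made precise. A schematic formulation (the notation is ours, intended only to summarize the two objectives):

```latex
% Offline (a priori) training: fit the closure to L2-projected DNS data,
% with the network evaluated on DNS coarse scales.
\theta^{\mathrm{off}} = \arg\min_{\theta}
  \sum_{n} \left\| f_{\theta}\!\left( \Pi u^{\mathrm{DNS}}(t_n) \right)
  - u'(t_n) \right\|_{L^2}^{2}

% Model-consistent (a posteriori) training: the loss is measured on the
% coarse solution produced by the closed LES model itself, so the network
% sees the same inputs during training and prediction.
\theta^{\mathrm{mc}} = \arg\min_{\theta}
  \sum_{n} \left\| u_h^{\mathrm{LES}}(t_n; \theta)
  - \Pi u^{\mathrm{DNS}}(t_n) \right\|_{L^2}^{2},
\qquad \text{where } u_h^{\mathrm{LES}}
  \text{ solves the DG system closed with } f_{\theta}.
```

The second objective requires differentiating through (or otherwise optimizing over) the forward solve, which is the major undertaking referred to above.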
9. Conclusions
We proposed a strategy for multi-scale modeling in which the coarse and fine scales are defined in terms of projections onto their respective finite element spaces, and segregated using a variational multiscale formulation. Existing variational multiscale formulations provide guiding principles for the construction of consistent features and a network architecture to define a super-resolution model of the fine scales. In particular, we define an architecture - called the variational super-resolution neural network (VSRNN) - which approximates the sub-scales as a sum of products of individual functions of the coarse scales and of physics-informed parameters. This model form and network structure are inspired by the analytical expression for the sub-scales given by the convection-diffusion equation. It is emphasized that traditional architectures - such as a fully connected neural network taking the raw basis coefficients and physics-informed parameters as inputs - are not ideal for this task.

Figure 17: Velocity statistics for channel flow using ILES and SR-LES ($\lambda$ = 0.1, 0.2) compared with DNS: (a) stream-wise mean velocity profile $U^+$ vs. wall-normal height $z^+$ in wall units; (b) root-mean-square velocity components $\sqrt{\langle u'u' \rangle^+}$, $\sqrt{\langle v'v' \rangle^+}$, $\sqrt{\langle w'w' \rangle^+}$ vs. $z^+$; (c) resolved turbulent shear stress $-\langle uw \rangle^+$ vs. $z/\delta$, normalized with the semi-channel height.

The input features and output quantities are obtained by appropriately non-dimensionalizing the coarse-scale and sub-scale basis coefficients. By applying the super-resolved state to compute the discontinuous Galerkin (DG) fluxes, we ensure that the online coarse-scale solution is forced towards its $L^2$-optimal state. We verify that, when applied to the convection-diffusion problem, the present approach learns the analytical solution to a high degree of accuracy. Similarly, for the 1-D linear-advection space-time problem, the model accurately super-resolves low-order coarse solutions to high-order fine solutions. For the turbulent channel case, the network also reproduces super-resolved velocity fields with the proper energy distribution across wave-numbers in both the stream-wise and span-wise directions.

Next, we assessed the predictive capability of these models. Super-resolution was used to determine the DG fluxes for the linear-advection problem, and was shown to result in higher accuracy and optimality than traditional space-time methods for the same number of degrees of freedom. When applied to LES of turbulent channel flow, the approach led to a more modest performance improvement. This improvement stems from the fact that the present model has been trained using $L^2$-optimal fine and coarse solutions, leading to sub-grid models that are consistent with the type of optimality sought. The present method was found to generalize to out-of-sample initial conditions and Reynolds numbers for both the linear advection and the turbulent channel flow cases.

Perspectives were provided on data-driven closure modeling in general, and particularly on how model-consistent training could improve the prospects of developing truly predictive models. In addition to reconstruction and sub-grid modeling, the super-resolution model can be used as an error indicator for adaptive grid refinement: regions in which the predicted sub-scales have a high magnitude can be flagged as under-resolved. Finally, the authors would like to point out that effective implementation of this approach in practical solvers requires the development of efficient non-linear solvers and preconditioners to handle the additional non-linearity and stiffness due to the model.

Acknowledgement
This research was funded by NASA under the project "Scale-resolving turbulence simulations through adaptive high-order discretizations and data-enabled model refinements", grant number 80NSSC18M0149 (Technical monitor: Gary Coleman). We gratefully acknowledge Dr. Krzysztof Fidkowski for valuable discussions.