An Uncertainty-Quantification Framework for Assessing Accuracy, Sensitivity, and Robustness in Computational Fluid Dynamics
AAn Uncertainty-Quantification Framework for Assessing Accuracy,Sensitivity, and Robustness in Computational Fluid Dynamics
S. Rezaeiravesh a,b, ∗ , R. Vinuesa a,b, ∗∗ , P. Schlatter a,b, ∗∗ a SimEx/FLOW, Engineering Mechanics, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden b Swedish e-Science Research Centre (SeRC), Stockholm, Sweden
Abstract
A framework is developed based on different uncertainty quantification (UQ) techniques in order to assessvalidation and verification (V&V) metrics in computational physics problems, in general, and computationalfluid dynamics (CFD), in particular. The metrics include accuracy, sensitivity and robustness of the simula-tor’s outputs with respect to uncertain inputs and computational parameters. These parameters are dividedinto two groups: based on the variation of the first group, a computer experiment is designed, the data ofwhich may become uncertain due to the parameters of the second group. To construct a surrogate modelbased on uncertain data, Gaussian process regression (GPR) with observation-dependent (heteroscedastic)noise structure is used. To estimate the propagated uncertainties in the simulator’s outputs from first andalso the combination of first and second groups of parameters, standard and probabilistic polynomial chaosexpansions (PCE) are employed, respectively. Global sensitivity analysis based on Sobol decomposition isperformed in connection with the computer experiment to rank the parameters based on their influence onthe simulator’s output. To illustrate its capabilities, the framework is applied to the scale-resolving simu-lations of turbulent channel flow using the open-source CFD solver Nek5000. Due to the high-order natureof Nek5000 a thorough assessment of the results’ accuracy and reliability is crucial, as the code is aimedat high-fidelity simulations. The detailed analyses and the resulting conclusions can enhance our insightinto the influence of different factors on physics simulations, in particular the simulations of wall-boundedturbulence.
Keywords:
Uncertainty quantification, Validation and verification, Computational fluid dynamics,Polynomial chaos expansion, Gaussian process regression, Wall turbulence.
1. Introduction
In any scientific and engineering field, the ultimate aim of applying analytical, numerical, and experi-mental approaches to fluid dynamics problems is to obtain high-quality results. In a broad classical sense,quality can be exclusively interpreted as accuracy . But, as explained in the landmark studies by Roache [55],and Oberkampf and Trucano [39], reliability and robustness should be also included in the interpretationof the quality. In Refs. [39, 38], a thorough review is given on approaches based on verification and valida-tion (V&V) which have a key importance to assess accuracy and reliability of computational simulations.Following the terminologies introduced by Schlesinger [63] and extended in Refs. [55, 39, 38], verificationin computational simulations means investigating the errors in the computational model and its solution.On the other hand, validation aims at quantifying how accurate the solutions of a computational model arecompared to reference experimental data. ∗ Principal Corresponding Author ∗∗ Corresponding Author
Email addresses: [email protected] (S. Rezaeiravesh), [email protected] (R. Vinuesa), [email protected] (P. Schlatter)
Preprint submitted to Elsevier July 15, 2020 a r X i v : . [ phy s i c s . c o m p - ph ] J u l he present study aims at introducing a framework based on uncertainty quantification (UQ) techniquesin order to quantify accuracy, robustness and sensitivity of computational simulations. The three mentionedconcepts are closely related to V&V. It is also noteworthy that V&V and UQ together build the foundationsof predictive computational sciences, see Oden et al. [40]. The framework developed in this study is general innature and hence applicable to any computational physics problem. Nevertheless, we provide the examplesand supporting discussions with regard to CFD (computational fluid dynamics) problems and with emphasison the simulation of turbulent wall-bounded flows. These types of flows are complex and appear in manyphysics and engineering applications.In order to put the work in the context, let us start by providing the necessary terminology and definitions.In an ideal environment, the true value of the quantities of interest (QoIs) of a physical system such as a flowcould be measured in a highly-accurate experiment or a high-fidelity numerical simulation. Borrowing thedefinition from Rabinovich [49], the true value is “the value of a quantity that, were it known, would ideallyreflect the property of an object with respect to the purpose of the measurement”. Clearly, in practice,through repeated realizations in which the possible errors and uncertainties are kept at a minimum, only a reference true value would be possibly achieved. The deviation of the QoIs of a simulation from a referencetrue value is denoted as the error in the QoIs. Naturally, a lower error means higher accuracy and vice versa.According to Refs. [39, 38], in the verification step the accuracy of a computational model is assessed withrespect to analytical and highly-accurate numerical data. Such actions require software testing. Conversely,in the validation step, the results of the computational model are compared to experimental data whichwithin some confidence reflect the true reality.Another point of view for the definition of error is as follows: According to the AIAA (American Instituteof Aeronautics and Astronautics) Guide on V&V [1], an error is defined as “a recognizable deficiency in anyphase or activity of modeling and simulations that is not due to lack of knowledge”. According to thesame guide, two types of errors can be considered. The unacknowledged errors are due to mistakes, bugsin the codes, and compiler errors, for instance. In contrast, the acknowledged errors are identified andquantified as the deviation from a reference true value. This is aligned with the definition of error givenabove. Referring to the same Guide [1], uncertainty is defined as “a potential deficiency in any phase oractivity of the modeling process that is due to lack of knowledge”. As compared to the definition of error,the key distinguishing characteristics of uncertainty are that it is, first, a potential deficiency, and second, itis due to the lack of knowledge. The UQ literature, for instance Ghanem et al. [17] and Smith [65], definesuncertainty as the lack of certainty in a model output which may stem from incomplete knowledge or lackof knowledge, incomplete modeling or data, uncertain or variable factors in the model, etc. Hereafter, weuse the word factor for combination of parameters and inputs. In a broad view, the uncertainties can bedivided into two main groups: epistemic (bias) and aleatoric (random) uncertainties. The treatment of eachgroup requires employing appropriate sets of mathematical and statistical tools, see e.g. [17, 65]. A commonstrategy in the UQ framework is, however, to consider the uncertainties to be random (even if they are notreally so), in order to make probabilistic methods applicable, see e.g. [65].Given the above general definitions, we provide some examples related to CFD and turbulence simulationwhich help understanding the connection between UQ and V&V, and also motivate the aim and strategyof the present study. The general definitions for the error can also be adopted for numerical and associatedcomputational models developed for solving the continuous form of a system of PDEs (partial differentialequations) which for CFD applications are the Navier–Stokes equations. What is known as the numericalerror includes discretization and projection (truncation) components, and also numerical parameters such astolerances for iterative solvers. Computational errors may originate from, for instance, bugs in the computercodes, parallelization algorithms, compilers, and perhaps hardware. Clearly, due to unavailability of ana-lytical general solutions for the Navier–Stokes equations, except for very few idealized flows, the referencetrue solution in the verification step is chosen from the most accurate available numerical simulations andexperimental data. However, the reference data can also be uncertain, see e.g. [44, 32, 52, 53].The simulation of the turbulent flows is more subtle, due to the additional types of errors and uncer-tainties, and at the same time, the small margin of tolerance allowed in engineering designs relying onthese flows. The use of the DNS (direct numerical simulation) approach, which would provide the highestfidelity results, at high Reynolds numbers that are relevant to engineering applications is computationally2nfeasible. The computational cost can be reduced using the LES (large-eddy simulation) approach whichdirectly resolves scales larger than a characteristic length and models the remaining subgrid scales (SGS),see for instance the textbook by Sagaut [59]. However, the use of SGS models to compensate for the effectof unresolved scales adds more complexity. In fact, not only the numerical and modeling errors may becomeintertwined, but also the structure and calibration parameters of the SGS models introduce new sourcesof uncertainties. As shown in Refs. [27, 76], the contribution and effect of the components of numericalerrors vary for different terms in the Navier-Stokes equations and also depend on the order of the numericalscheme used for discretizing the equations. For high-order methods, the aliasing error has been found to bedominant, whereas in low-order methods the truncation error may be so high that even exceeds the SGSeffects. The uncertainties arising from turbulence modeling become more dominant when moving towardlower-fidelity approaches such as hybrid LES/RANS (Reynolds-averaged Navier–Stokes), wall-modelled LESand RANS, see Ref. [60]. In addition to all these, when using scale-resolving simulations such as DNS andLES, extra uncertainties in the averaged QoIs due to finite time-averaging appear. These errors could inprinciple be reduced by running longer, at an obvious computational cost.To summarize up to this point, the reported QoIs of any CFD simulation are uncertain , at least up tosome extent. The uncertainty in the QoIs is driven by many sources of uncertainties and errors which can,however, be intertwined and interconnected. As a result, tracing the uncertainties by analytical approachesis usually very hard, if not impossible, and in any case the outcome could be inaccurate and unreliable.A remedy can be found in the field of UQ. In particular, the UQ forward problem is considered, whichaims at estimating the propagation of uncertainties in a simulator ’s (i.e. the computational code) outputsdue to the uncertainty in different factors (inputs and parameters). If the factor under investigation is notuncertain or random by nature, the impact of its variability on the outputs is evaluated. Considering the costsinvolved in running a simulator, a main part of the process is focused on building surrogate models to expressapproximate relationship between the simulator’s inputs and outputs. Higher uncertainties estimated in themodel outputs indicate low robustness of the model to a particular set of uncertain factors. The forwardproblem is complemented by sensitivity analysis through which the contribution of each of the uncertainfactors in the resulting uncertainty in the model output is evaluated. The robustness and sensitivity canbe estimated for primary QoIs as well as their errors. Therefore, depending on the reference data usedto evaluate the error, both verification and validation of the computational models can be conducted in aprobabilistic framework.The non-intrusive implementation of the UQ forward problem can be carried out using computer ex-periments , see Refs. [58, 61]. As will be detailed in Section 2, the implementation includes the followingsteps: (i) parameterization of the factors, (ii) assuming the parameters to be random variables with knowndistributions over given spaces, and (iii) running the simulator at a limited number of samples taken forthe uncertain parameters. In the rest of this paper, the factors which are considered in the describedmethodology are referred to as Category-I parameters.Several studies have investigated the impact of variation of Category-I parameters on the outputs ofnumerical simulations of turbulent flows. A short survey of the relevant factors and corresponding examplestudies is given below. Our focus is on scale-resolving approaches such as DNS and LES which are compu-tationally expensive but capable of revealing more physics. For RANS simulations, the reader is referred tothe review paper [77] and the references therein. The impact of combined numerical errors which includespatial discretization and time integration on scale-resolving simulations have been investigated for differ-ent flows. Meyers and Sagaut [35] and Rezaeiravesh and Liefvendahl [52] studied turbulent channel flow,Meldi et al. [33] considered spatially-evolving mixing layers, and Mariotti et al. [32] investigated the flowover a rectangular cylinder. The influence of the coefficients appearing in the SGS models in LES have beenstudied by Lucor et al.[31] and Meldi et al. [33]. Congedo et al. [6] investigated the sensitivity of LES ofturbulent pipe flow with respect to the inflow condition. For the wall-modeled LES, the influence of differentfactors on the approximate wall boundary condition and consequently on the whole flow field is studied byRezaeiravesh et al. [54].A computer experiment relies on several realizations of the simulator’s outputs or QoIs, where thereexists a one-to-one correspondence between a realization and a sample taken from the space of Category-I parameters. The QoIs in each of such realizations can be uncertain due to what is hereafter referred to as3 ategory-II uncertainties. An example is the uncertainty in the time-averaged QoIs of turbulent flows as aresult of using finite number of time samples. The way of incorporating the known
Category-II uncertaintiesin the design of experiments for Category-I uncertainties constitutes another focus of the present study. Itis stressed that the evaluation of Category-II uncertainties is beyond the scope of the present paper. Inparticular, for assessment of uncertainty due to lack of time samples, the reader is referred to Oliver etal. [44] and Russo and Luchini [57].The unified framework of the present study is developed based on different UQ techniques. To estimateuncertainties in, and equivalently the robustness of, the outputs of a simulator with respect to the factorsfalling under Category-I, the non-intrusive polynomial chaos expansion (PCE) [80, 78, 18, 48] techniqueis used. Different types of construction of PCE as well as the impact of the number of samples on theaccuracy of the estimations are discussed. A novel aspect of this study is to show how to incorporatethe already-known uncertainties of Category-II when aiming for quantifying the uncertainties from theCategory-I factors. This is obtained through employing a more general form of observation noise in theGaussian process regression (GPR), [51, 20]. To estimate uncertainties in the QoIs in these situations,probabilistic polynomial chaos expansion (PPCE) is an appropriate tool. The formulation and use of thismethod within the present unified framework is another novelty of this study. In addition to these, toperform global sensitivity analysis [17, 65, 61], Sobol indices relying on the ANOVA (analysis of variance)technique [68, 67] are estimated.The remainder of this paper is organized as follow. Section 2 provides the details of the UQ techniqueswhich form the framework of the present study. Moreover, a short description of the numerical imple-mentation of these approaches is given. The connection between the UQ techniques and the evaluation ofaccuracy, robustness, and sensitivity in CFD simulations is explained in Section 3. These are followed bySection 4, where it is shown how the framework can be employed for a practical scale-resolving simulationof a canonical wall-bounded turbulent flow. Finally, conclusions and plans for extending the current studyare provided in Section 5.
2. UQ Techniques
In a UQ forward problem, the uncertainties in a model inputs and parameters are propagated into themodel outputs, and the resulting response surface and statistical moments of the outputs are constructed.In Section 2.1, theoretical aspects of a UQ forward problem are discussed. First, the concept of a surrogate which yields a computationally inexpensive relationship between inputs and outputs of a computationalmodel, is reviewed. Then the standard polynomial chaos expansion (PCE) is discussed, which is a powerfultechnique to estimate statistical moments of QoIs due to the variability of Category-I parameters in acomputer experiment. The treatment of the known uncertainties of the factors belonging to Category-II comes next via reviewing Gaussian process regression (GPR). We then investigate the properties ofprobabilistic PCE which is a combination of non-intrusive PCE and GPR. As a complement to the UQforward problem, sensitivity analysis is performed, which is the focus of Section 2.2. Finally, in Section 2.3it is briefly explained how the described UQ techniques are implemented.The UQ techniques employed here are chosen to comply with two basic requirements: first, reliableUQ estimates must be achievable through a limited number of realizations. This is essential consideringthe large computational cost of running the simulator in many computational physics problems, includingCFD. Second, the resulting UQ framework should be non-intrusively linked to the simulator. This providesa great flexibility in practice, as not specialised codes, potentially with involved numerical algorithms andparallelization approaches, need to be developed, optimized, debugged and maintained.
The uncertain factors belonging to Category-I are denoted by q , where q = { q , q , · · · , q p } . As aconvention throughout this paper, all bold-face letters are used for representing tensors of order greaterthan zero. The parameters q vary over the p -dimensional admissible space Q , i.e. q ∈ Q ⊂ R p . Besides4 , a simulator and its outputs can also be dependent on the controlled parameters, denoted by χ . From asingle run of the deterministic simulator a realization for the quantities of interest or model outputs r areobtained. Without loss of generality, we only consider univariate outputs r ∈ R . In general, these QoIs canbe contaminated by Category-II uncertainties.In the best of worlds, we could construct a functional f ( χ, q ) to describe the relationship between theinputs and the simulations outputs. Considering the observation or measurement errors ε and adopting anadditive error model, we could write r = f ( χ, q ) + ε . (1)In general, r = r ( χ, q ), and ε can be observation-dependent and contains generally both bias and randomeffects, but here we assume it to be only representing random noise. Nevertheless, constructing an exactfunction for the simulator f ( χ, q ) is not feasible in practice, considering the prohibitively large number ofsimulations required. Instead, a surrogate ˜ f ( χ, q ) can be constructed using a limited number of simulations ina computer experiment with outputs R = { r ( i ) } ni =1 , corresponding one-to-one to the samples Q = { q ( i ) } ni =1 drawn from Q . We refer to D = { ( q ( i ) , r ( i ) ) | i = 1 , , . . . , n } as the training data for constructing thesurrogate. The constructed surrogate can then be used to approximately predict values of r over the entire Q and also be employed for estimating the statistical moments of r due to the variation of q .There are different methods for constructing surrogates, see e.g. [17, 20]. A short review is given belowon the PCE and GPR methods, which are appropriate for the purpose of the present study. The strongpoint about PCE is that estimating statistical moments of ˜ f ( χ, q ) is a natural outcome of constructing thesurrogate. However, in the standard use, incorporating the observation uncertainties into the surrogate isnot trivial. In contrast, such a capability is the main feature of GPR. Consequently, the predictions of r byGPR in Q are accompanied by uncertainties. Consider the factors q to be mutually-independent. This property either holds by the problem definitionor is achieved through applying a transformation such as Rosenblatt’s [56] to the originally dependent param-eters. Consequently, the joint probability density function (PDF) of these parameters is ρ ( q ) = (cid:81) pi =1 ρ i ( q i ).Let each q i be mapped to a standard random variable ξ i , see below, over range Γ i . Consequently, q ∈ Q iseventually mapped to ξ ∈ Γ, where Γ = (cid:78) pi =1 Γ i and (cid:78) denotes tensor product. Using polynomial chaosexpansion (PCE), the surrogate ˜ f ( χ, ξ ) is written as a truncated expansion [18, 79, 48],˜ f ( χ, ξ ) ≈ K (cid:88) k=0 ˆ f k ( χ )Ψ k ( ξ ) . (2)Here k is a unique re-index corresponding to the multi-index k = ( k , k , . . . , k p ), and the bases are Ψ k ( ξ ) = (cid:81) pi =1 ψ k i ( ξ i ). The PCE surrogates are considered to be of parametric type, since through the bases a pre-defined structure for expressing the uncertain behavior of the model output is considered. In the frameworkof generalized PCE (gPCE), see Refs. [81, 10], for the standard distribution of each univariate random vari-able ξ i , where i = 1 , , · · · , p , a set of polynomial bases { ψ k i ( ξ i ) } exist which are orthogonal with respect tothe PDF ρ i ( ξ i ). This can be expressed as below introducing the weighted inner-product (cid:104)· , ·(cid:105) ρ i , (cid:104) ψ k i ( ξ i ) , ψ m i ( ξ i ) (cid:105) ρ i = (cid:90) Γ i ψ k i ( ξ i ) ψ m i ( ξ i ) ρ i ( ξ i )d ξ i = γ k i δ k i m i , (3)where, γ k i = (cid:104) ψ k i ( ξ i ) ψ k i ( ξ i ) (cid:105) ρ i and δ k i m i is the Kronecker delta. This also makes the multi-variate basesorthogonal with respect to ρ ( ξ ), (cid:104) Ψ k ( ξ ) , Ψ m ( ξ ) (cid:105) ρ = γ k δ km . (4)The inner-products (3) and (4) can be equivalently interpreted as expectations E q i [ ψ k i ( ξ i ) ψ m i ( ξ i )] and E q [Ψ k ( ξ )Ψ m ( ξ )], respectively. To construct the PCE (2), we need to: (i) choose a rule to construct themulti-variate bases Ψ k ( ξ ) and accordingly truncate (2) at K , (ii) specify the nodal set { ξ ( i ) } ni =1 , and (iii)adopt a method to estimate coefficients ˆ f k ( χ ) given training data D .5he set of samples { ξ ( i ) } ni =1 in Γ can be chosen to make a structured grid. Assume in the i -th dimensionof the p -dimensional parameter space Γ, we have chosen n i sampling nodes. Then, the index set Λ k for thetensor product method (TPM) reads as,Λ TPM k = { k ∈ Z p ≥ : max ≤ i ≤ p k i ≤ ( n i − } , (5)where Z ≥ denotes the non-negative integers. The TPM rule leads to K + 1 = (cid:81) pi =1 n i . Therefore, thenumber of terms in expansion (2) grows exponentially with p , an issue that is referred to as the curse ofdimensionality. Another choice for the nodal set is to adopt total order method (TOM), for which,Λ TOM k = { k ∈ Z p ≥ : | k | ≤ L } , (6)where, L is the maximum polynomial order in each dimension of the parameter space and | k | = (cid:80) pi =1 k i isthe magnitude of the multi-index k . For TOM, we have K + 1 = ( L + p )! /L ! p !, see Refs. [10, 65]. It is notedthat TOM can be considered as a special case of the more general hyperbolic truncation scheme [3].For obtaining the coefficients ˆ f k ( χ ) in (2), different techniques can be employed. Applying a projectionmethod (denoted as pseudo-spectral method in [78]) leads to the following integral,ˆ f k ( χ ) = E q [ f ( χ, ξ )Ψ k ( ξ )] E q [Ψ k ( ξ )Ψ k ( ξ )] = 1 γ k (cid:90) Γ f ( χ, ξ )Ψ k ( ξ ) ρ ( ξ )d ξ , (7)where, γ k is defined in (4). For the numerical approximation of the integral, an optimal choice is to use thestandard Gaussian quadrature rule which results in,ˆ f k ( χ ) ≈ γ k n (cid:88) j =1 r ( j ) Ψ k ( ξ ( j ) ) ρ ( ξ ( j ) )w ( j ) . (8)In this construction, ξ ( j ) = ( ξ ( j )1 , · · · , ξ ( j ) p ) and w ( j ) denote the j -th quadrature nodes and weight, respec-tively. In the framework of generalized PCE [81, 10], the n i nodes in i -th dimension are the zeros ofpolynomial basis ψ n i ( ξ i ). Moreover, the nodal set in the multi-dimensional parameter space are then con-structed using TPM, which may lead to the curse of dimensionality. A remedy to this, while still using theprojection method to compute the coefficients ˆ f k ( χ ) from (7), is to use sparse grid quadrature rules (forboth nested and non-nested nodes), see the details in Refs. [66, 16]. It is noteworthy that in projectionmethods, the constructed surrogate at the sampling nodes necessarily possesses the exact training values,i.e. ˜ f ( ξ ( j ) ) = r ( j ) for j = 1 , , · · · , n . This clearly proves the connection between Lagrange interpolationbased on stochastic collocation points (which are the parameter samples) and PCE (2), see Refs. [10, 78].Another approach for estimating coefficients ˆ f k ( χ ) in (2) is to use regression through which a linear setof equations, A ˆ f = R T , (9)is solved for ˆ f = [ ˆ f , ˆ f , . . . , ˆ f K ] T via, (cid:107) A ˆ f − R T (cid:107) l ≤ tol . (10)Here, tol. is a small preset tolerance, A is a n × ( K + 1) matrix with elements A i (k+1) = Ψ k ( ξ ( i ) ) and R =[ r (1) , r (2) , . . . , r ( n ) ]. When using the regression method, the samples { ξ ( i ) } ni =1 in the multi-dimensionalspace are not required to be necessarily structured. However, the sampling strategy significantly controlsthe accuracy of the constructed PCE (2), see e.g. [21]. The linear system (9) becomes under- and over-determined if n is smaller and bigger than ( K + 1), respectively. For cases where n (cid:28) ( K + 1), which mayoccur due to the high computational costs involved in making realizations of r , an extra constraint on ˆ f isneeded. In this case, instead of solving (10), we seek for an optimal ˆ f opt where,ˆ f opt = min (cid:107) ˆ f (cid:107) l such that (cid:107) A ˆ f − R T (cid:107) l ≤ tol . (11)6his method, which is called compressed sensing [4, 36], results in a unique sparse solution for ˆ f . Contrary tothe projection method, the PCE (2) determined by the regression approach does not necessarily match thetraining data { ( ξ ( i ) , r ( i ) ) } ni =1 . However, ε in (1) is more a residual of fitting than an observation uncertainty.Flexibility in sampling from the parameter space and also more numerical stability when handling noisysimulator outputs are advantages of the regression methods over the projection approach.The expansion (2) with determined coefficients as a surrogate interpolates values of r over the wholeparameter space Q . As a strong point of PCE, it can be shown (see e.g. [17, 65]) that the mean andvariance of ˜ f ( χ, q ) (and hence approximately f ( χ, q )) due to the variability of q can be estimated from thefollowing expressions, E q [ ˜ f ( χ, q )] = ˆ f ( χ ) , (12) V q [ ˜ f ( χ, q )] = K (cid:88) k=1 ˆ f ( χ ) γ k . (13)If the model response f ( χ, q ) is sufficiently smooth in the parameter space Q , then the rate of convergenceof the above stochastic moments with respect to the number of terms K in the expansion (2) can be high.In general, it is necessary to assess the quality by which a surrogate constructed in a computer experimentcan predict the simulator outputs (QoIs). For this purpose, different approaches have been developed,see Refs. [46, 45, 64] and the references therein, which can be categorized into two major strategies. In thefirst strategy the predictions by a surrogate are validated using either a new set of data or subsets of thetraining data. The second strategy is based on using a set of diagnostic tools to investigate whether thecoefficients or norm of the terms included in PCE (2), for instance, converge (up to a certain threshold)with the number of training samples and truncation K . The PCE (2) is a parametric surrogate which can describe the global behavior of f ( χ, q ) over Q throughestimating moments (12) and (13). In contrast, non-parametric surrogates for f ( χ, q ) can be consideredwith more flexibility in prediction over the input space. In particular, here we consider Gaussian processregression (GPR) [51, 20]. According to Rasmussen and Williams [51], “A Gaussian process ( GP ) is acollection of random variables, any finite number of which have a joint Gaussian distribution.” The naturalpossibility of using Bayesian formalism besides prediction of an output along with associated uncertainty,makes the GPR a suitable tool for UQ studies.The GPR assumes the surrogate ˜ f ( χ, q ) for the simulator in (1) to be random, over which a priordistribution in form of a Gaussian process is assumed. A Gaussian process is fully specified by its mean andcovariance function as, see Refs. [51, 20],˜ f ( q ) ∼ GP ( m ( q ) , c ( q , q (cid:48) ; Θ ε , β )) , (14)where, m ( q ) = E [ ˜ f ( q )] , (15) c ( q , q (cid:48) ; Θ ε , β ) = E [( ˜ f ( q ) − m ( q ))( ˜ f ( q (cid:48) ) − m ( q (cid:48) ))] . (16)In this definition, Θ ε denote the parameters specifying the structure of the observation noise and β arethe hyperparameters in the kernel function constructed in the space of inputs q to represent the covarianceof ˜ f ( · ). There are different options for such kernel function, see e.g. [51]. Note that in this section we drop χ from ˜ f ( χ, q ) to ease the notation. Also, as it is convenient in the context of GPR, we may refer to q asinputs.For the so-called universal GPR, the input space is first projected into a high-dimensional space viaintroducing bases φ ( q ). A particular choice would be using a set of monomials as bases in each of the p dimensions of the input space, i.e. φ ( q ) = { , q, q , . . . , q d − } , where d ∈ N . As a result of adopting theweight-space view for the GPR [51] and using the training data, Eq. (1) is written as, R = Φ ( Q ) Θ + ε , (17)7here weights Θ are of size pd and Φ ( Q ) is a n × pd matrix whose i -th row is,[ Φ ( Q )] i : = [1 , q ( i )1 , q ( i ) , . . . , q ( i ) d − , . . . , , q ( i ) p , q ( i ) p , . . . , q ( i ) d − p ] , (18)for i = 1 , , . . . , n . For these particular settings, the GPR would become essentially similar to PCE (2) withseveral extra features. Given the training data set D = { ( q ( i ) , r ( i ) ) } ni =1 with corresponding observation error ε obtained through a computer experiment, the posterior distribution for ˜ f ( q ), or equivalently for Θ is inferredconditioned on D and Θ ε . Simultaneously, the kernel’s hyperparameters β are optimized. Employing theposterior distribution, the posterior predictive at test points Q ∗ in the input space can be obtained, [51, 20].It is essential to remark that GPR is an interpolant, so it provides higher accuracy within the input admissiblespace Q . To evaluate the quality of a GPR constructed in a computer experiment, different approaches exist,see Refs. [45, 46]. If only the mean predictions by GPR are of interest, then the techniques mentioned atthe end of Section 2.1.2 for PCE can be used. For a general case, a set of measures are introduced in [2].A strong point of using GPR surrogates is the possibility of easily incorporating the uncertainties intraining data D into the surrogate construction and hence in the predictions. Commonly in the formulationfor GPR, the noise ε i for all the n training data samples are assumed to be independent and identicallydistributed (iid) by Gaussian distribution N ∼ (0 , σ d ) where σ d is fixed, see Refs. [51, 20]. This type of noiseis referred as homoscedastic. However, for the purpose of the present study we need to relax the assumptionof having identical noise levels (yet keeping the independence assumption) and consider heteroscedastic noisestructures which allows the noise levels to be input-dependent. In particular, the focus is on ε i ∼ N (0 , σ d i )for i = 1 , , . . . , n , where σ d i are known and constant. In the context of the present study, the value of σ d i represents the magnitude of uncertainty of Category-II which exists in the QoIs of the i -th simulation in acomputer experiment comprising n independent simulations.As proposed by Goldberg et al. [19] for a GPR with input-dependent noise ε i ∼ N (0 , Θ ε i ) where Θ ε i = σ d i , the predicted R ∗ at new input samples (test samples) Q ∗ = { q ∗ ( i ) } n ∗ i =1 has a multivariateGaussian distribution N ( m ( R ∗ |D , Θ ε , Q ∗ ) , v ( R ∗ |D , Θ ε , Q ∗ )), in which, m ( R ∗ |D , Θ ε , Q ∗ ) = C ( Q ∗ , Q )( C ( Q , Q ) + C N ) − R T , (19) v ( R ∗ |D , Θ ε , Q ∗ ) = C ( Q ∗ , Q ∗ ) − C ( Q ∗ , Q )( C ( Q , Q ) + C N ) − C ( Q , Q ∗ ) + e ( Q ∗ ) . (20)Here, without loss of generality, mean of ˜ f ( q ) in (14) is assumed to be zero, R = [ r (1) , r (2) , · · · , r ( n ) ], C ( Q , Q (cid:48) )is a n × n (cid:48) matrix for which, [ C ( Q , Q (cid:48) )] ij = c ( q ( i ) , q (cid:48) ( j ) ) where c ( · , · ) is a kernel function dependent on hy-perparameters β . Moreover, C N is the covariance matrix of the noise in the training data which is equal todiag( e ( Q )), where [ e ( Q )] i = σ d i for i = 1 , , . . . , n . The noise level at the test inputs, i.e. e ( Q ∗ ) is unknown.Following Goldberg et al. [19] a GP prior is assumed over the log of the noise variances. The kernel functionfor this GP has the same structure as that of the main GP, however with different values of hyperparameters.As it is discussed in the next section, the mean predictions by Eq. (19) are more important in the frameworkof the present study, and in fact, the confidence intervals (CIs) for those predictions are computed fromEq. (20). Here, we are seeking for a combination of GPR and non-intrusive PCE which takes advantage of thecharacteristics of both methods. For such a combination, at least two views can be found in the literature.In the approach of Sch¨obi et al. [64], the mean trend in the universal GPR (17) is constructed by PCE andthe error part is modeled by Gaussian processes. In contrast, in the probabilistic PCE (PPCE) approach,Owen [45] proposed to compute the coefficients in the PCE (2) using the Bayesian quadrature technique [42].What is described in this section is in essence similar to the latter, but is more general because of handlingobservation-dependent noise and also considering more standard distributions for q in PCE.As discussed above, when we have uncertain factors q of Category-I varying over Q and our aim isto estimate the resulting statistical moments of a simulator output f ( q ), the PCE approach is employed. For GPR with more general heteroscedastic noise structure, see Refs. [7, 14].
8n the non-intrusive way, associated to each sample drawn from Q , a CFD simulation is performed and arealization of the flow QoIs is obtained. The output of any of these simulations can be uncertain due to theparameters falling under Category-II. If these uncertainties are known and can be expressed as Gaussiannoise levels, then they, by the use of GPR, can propagate and affect the statistical moments (12) and (13)estimated by the standard PCE approach.To this end, first a GPR surrogate is constructed using the noisy data. Then, independent samples fromthe resulting predictive distribution with mean and variance (19) and (20) are drawn. Any of such samplesof the GPR surrogate is denoted by ˜ f ( j ) ( χ, q ∗ ) = ˜ f ( χ, q ∗ ( j ) ), for which a PCE can be constructed by (2)over Q . To compute the PCE coefficients, projection or regression methods with any of the truncationschemes (5) or (6) can be employed. However, to maximize the convergence of the PCE as well as takingadvantage of the framework of gPCE [81], it is recommended to use tensor-product truncation (5) alongwith the Gauss quadrature rule (8). In this case, the samples ξ ∗ i ∈ Γ i corresponding to q ∗ i ∈ Q i , for i = 1 , , · · · , p are taken as the zeros of polynomial bases ψ n i ( ξ i ) chosen based on the Wiener-Askey rule [81]in accordance with PDF ρ i ( q i ). Since the evaluation of the surrogate ˜ f ( j ) ( χ, q ∗ ) corresponding to thesesamples is computationally inexpensive, an arbitrary number of samples ξ ∗ i over the input space Q i can beconsidered. Having PCE coefficients ˆ f ( j ) estimated for the j -th sample from the GPR surrogate, an estimatefor statistical moments (12) and (13) can be determined. Having a sufficient number of samples, say n s ,taken from the GPR predictive distribution, estimates for the sample mean and variance of the statisticalmoments (12) and (13) are obtained. For instance for E q [ ˜ f ( χ, q )] we get,ˆ E [ E q [ ˜ f ( χ, q )]] = 1 n s n s (cid:88) j =1 E q [ ˜ f ( j ) ( χ, q )] = 1 n s n s (cid:88) j =1 ˆ f ( j )0 ( χ ) , (21)ˆ V [ E q [ ˜ f ( χ, q )]] = 1 n s n s (cid:88) j =1 (cid:16) E q [ ˜ f ( j ) ( χ, q )] − ˆ E [ E q [ ˜ f ( χ, q )]] (cid:17) . (22)Similar expression can be written for V q [ ˜ f ( χ, q )] given by (13). The estimators ˆ E [ · ] and n s ˆ V [ · ] will respec-tively converge to the population values E [ · ] and V [ · ] as n s → ∞ , see e.g. [50]. So far, appropriate tools for UQ forward problems have been discussed, so that the uncertainty prop-agated into the simulator outputs from a p -dimensional uncertain q can be estimated. As a complementto this, tools from global sensitivity analysis (GSA) are utilized to rank the p components of q based ontheir influence on the output r . Contrary to the local sensitivity analysis, where the sensitivity of r to smallperturbations of q is estimated, in GSA all q are let to simultaneously vary over their admissible space. Forconducting GSA, there are different approaches, see e.g. [17, 65]. Here, a short description for computingSobol sensitivity indices is provided. Once again, we drop χ from f ( χ, q ) for simplifying the notation.By analysis of variance (ANOVA) or Sobol decomposition [68, 67], a model function is decomposed as, f ( q ) = f + p (cid:88) i =1 f i ( q i ) + (cid:88) ≤ i GPyTorch [15] in which the optimization of the kernels’ hyperparameters are performedusing the ADAM algorithm [25]. 3. Accuracy, Robustness, and Sensitivity Analysis In this section, it is shortly explained how the UQ tools introduced in Section 2 can be employed whenassessing accuracy, robustness and sensitivity of the CFD simulations. To measure the robustness of a QoIin a computer model (1), we need to estimate the variance V q [ f ( χ, q )] in a UQ forward problem where theparameters q vary over Q with a given distribution. This can be achieved by non-intrusive application ofthe standard or probabilistic PCE methods reviewed in Sections 2.1.2 and 2.1.4, respectively. As a resultof using standard PCE, E q [ f ( χ, q )] ± t α (cid:112) V q [ f ( χ, q )] is obtained, in which t α is a constant value that islooked up from the t-table [50] for a desired confidence level. In this study, we take α = 95% which leadsto t = 1 . 96. In case of using probabilistic PCE, different combinations of moments due to the variationof q (Category-I parameters) as well as uncertainties of Category-II are achieved. Comparable to the resultof standard PCE, we get ˆ E [ E q [ f ( χ, q )]] ± t α (cid:113) ˆ E [ V q [ f ( χ, q )]]. The uncertainty term (under the square root)in this expression is itself uncertain for which we have, ˆ E [ V q [ f ( χ, q )]] ± t α (cid:113) ˆ V [ V q [ f ( χ, q )]]. Besides these,the uncertainty in the mean expected value ˆ E [ E q [ f ( χ, q )]] is represented by ± t α (cid:113) ˆ V [ E q [ f ( χ, q )]].As discussed in Section 1, the accuracy is defined as the deviation of the QoIs (simulator’s outputs)from what is considered as the reference true value. Such a value would be obtained from a high-fidelitysimulation or experiment for verification and validation purposes, respectively. To measure the a-posteriorierror as an indicator of the accuracy for a QoI r ( χ, q ), an option is, (cid:15) [ r ( χ, q )] = (cid:107) r ( χ, q ) − r ◦ ( χ ) (cid:107) / (cid:107) r ◦ ( χ ) (cid:107) , (26)in which, the symbol ◦ specifies the reference true value. In particular, we use the l ∞ norm, since it ismore restrictive than the l and l norms, see e.g. [28]. As formulated here, the error in a QoI can vary dueto both Category-I and Category-II uncertainties. Therefore, (cid:15) [ r ( χ, q )] can be treated as another outputto which different UQ techniques can be applied. Of particular importance is the construction of error10urfaces in the admissible space of q . As pointed out in Refs. [35, 34, 52], the resulting error portraits arecapable of indicating the complexities in the variation of errors in QoIs of turbulent flows. This confirmsthe shortcoming of the classical numerical analysis techniques in predicting the errors in the averaged QoIsobtained by scale-resolving simulations of turbulent flows. 4. Illustrative Example The focus of this section is on illustrating how the tools introduced in the previous sections can beutilized for the purpose of uncertainty quantification of a standard CFD simulation. As a showcase, thewall-resolved simulation of fully-developed turbulent channel flow is considered. This canonical case hasbeen extensively used in the literature for studying the physics of wall-bounded turbulence and also fordeveloping and testing numerical algorithms.The channel comprises of two parallel walls at which no-slip and no-penetration boundary conditions forvelocity are imposed. The channel half-height is denoted by δ and the normalized wall-normal coordinate y is measured from the bottom wall and varies between 0 and 2. The flow is periodic in the streamwise andspanwise directions which are associated with coordinates x and z , respectively. The flow is driven by fixedmass flux, which reduces to constant bulk velocity U b for constant density flows. Dimensional analysis showsthat only one parameter is needed to fully describe turbulent channel flow, which can be chosen as the bulkReynolds number Re b = U b δ/ν , with the kinematic viscosity ν .The channel flow simulations are carried out using the high-order spectral-element [8] open-sourcesolver Nek5000 [13], see Section 4.2. Starting from initial conditions consisting of a laminar velocity profilewith random perturbations, the flow undergoes transition and eventually becomes fully turbulent. Quanti-ties of interest of the channel flow simulation can be any functional of the simulation output. As detailedin Section 4.3, the QoIs are taken to be the flow quantities averaged over time and periodic directions, afterthe fully-developed turbulent state is established. The numerical code Nek5000 used in this study was developed by Fischer et al. [13]. This code is basedon the spectral-element method (SEM), which was originally proposed by Patera [47], in which so-calledspectral elements are used to decompose the domain. The incompressible Navier–Stokes equations are cast inweak form, and the velocity and pressure fields are expressed in terms of Lagrange interpolants of Legendrepolynomials, at the Gauss–Lobatto–Legendre (GLL) quadrature points within the elements. Note that themethod allows flexible distributions of the spectral elements, a fact that enables highly accurate simulationsof turbulent flows in moderately complex geometries (see for instance Refs. [74, 72]), while the GLL grid-point distributions within the elements are prescribed by the polynomial order N . All the simulationsdiscussed here were conducted using the P N − P N formulation, i.e. polynomials of order N are used bothfor the velocity and the pressure. In Nek5000, the nonlinear terms are treated explicitly by third-orderextrapolation and the viscous terms are treated implicitly by third-order backward differentiation. Velocityand pressure are decoupled and solved iteratively, the former through conjugate gradients and the latterby means of the generalized minimal residual (GMRES) method. Jacobi preconditioning is used for thevelocity, and the additive overlapping Schwarz method is used to build a pressure preconditioner [11]. Twomethods are considered for the coarse-grid solve from the latter: the first one (more adequate for smallerproblems) is based on a Cholesky factorization and it is denoted as XXT [69], and the second one is basedon an algebraic multigrid (AMG) solver [30]. Additional details on the way in which the governing equationsare solved in Nek5000 can be found in Ref. [41]. Furthermore, and although in earlier versions of Nek5000 anexplicit-filtering approach was adopted [12], in this work we consider the approach proposed by Schlatter etal. [62], in which the Navier–Stokes equations are supplemented by a dissipative term based on a high-passspectral filter (additional details are provided in Refs. [37]). Note that Nek5000 is written in Fortran77/C,and that the message-passing interface (MPI) is used for parallelism.11 .3. Computer Experiment In a computer experiment, the effect of inputs q with a joint PDF ρ ( q ) varying over Q is evaluated ona set of simulation outputs. Here, we consider numerical parameters q = (∆ x + , ∆ z + ) where ∆ x + and ∆ z + are the average spacing between the GLL points in the streamwise and spanwise directions of a channelflow, respectively. The superscript + denotes normalization with respect to the turbulent viscous lengthscale δ ν = ν/u ◦ τ , where u ◦ τ is the true nominal friction velocity. From a reference simulation it is possible toknow a-priori the friction-based Reynolds number Re ◦ τ = u ◦ τ δ/ν corresponding to a Re b employed to setup thechannel flow. In the illustrative example, we consider a channel flow at Re b = 5019 . 55 with Re ◦ τ = 279 . b , the DNS data from other resources such as Refs. [22, 29] couldalternatively be used. In any case, as it is inferred from its definition in Section 1, the accuracy of a QoIgenerally depends on the chosen reference data.In a channel flow simulation, the size of the grid elements in both streamwise and spanwise directionsare fixed. Therefore, ∆ x + = l x /δE x N Re ◦ τ , (27)where, l x and E x specify domain length and number of elements in the streamwise direction, respectively,and N is the polynomial order in the spectral-element method which is taken to be 7 for the present simu-lations. A similar expression can be written for ∆ z + . In the computer experiments to obtain different ∆ x + and ∆ z + for given l x and l z , the number of elements E x and E z are changed. To ensure the insensitivityof the analysis with respect to l x and l z , we consider a large domain with l x /δ = 16 and l z /δ = 9 aftertrying different lengths. Small modifications of the chosen lengths to setup simulations associated with somevalues of ∆ x + -∆ z + samples may have been applied. In the wall-normal direction, the element size decreasestowards the wall, in such a way that the first off-wall GLL point has ∆ y + w = 0 . 45 in all simulations. Thisensures an adequate resolution in the near-wall turbulent field and hence removes the potential uncertaintydue to the no-slip boundary condition at the wall.Two admissible spaces Q = Q ∆ x + ⊗ Q ∆ z + for q are considered: Q = [5 , ⊗ [5 , 70] and Q =[5 , ⊗ [5 , Q . Then, having asurrogate constructed on Q , the QoIs can be interpolated to any arbitrary number of samples in Q ⊂ Q .The QoIs are taken to be the averaged wall friction velocity (cid:104) u τ (cid:105) , averaged velocity profile (cid:104) u i (cid:105) andsecond-order moments of velocity, (cid:104) u (cid:48) i u (cid:48) j (cid:105) (so called Reynolds stresses), where u (cid:48) i = u i − (cid:104) u i (cid:105) and i, j = 1 , , u (cid:48) rms i = (cid:112) (cid:104) u (cid:48) i u (cid:48) i (cid:105) for i = 1 , , 3. The averaged profiles are only dependent on the wall-normal coordinate, y . Note that asa property of the spectral-element method, the values of quantities can be interpolated at any arbitrarynumber of points in an element using Lagrange interpolation constructed from the original GLL points inthat element. Using this feature in post-processing the simulation results, the values of the discrete profilesare obtained at N g equi-spaced points in the wall-normal direction.Assessment of the uncertainties in the above QoIs due to time-averaging using the methods in Refs. [57,44, 71] is left to be thoroughly studied in a future study. In this regard, computing (cid:104)·(cid:105) is performed over a longtime span so that the associated uncertainties become negligible and do not hinder quantifying uncertaintiesdue to the variation of q . Nevertheless, with the use of GPR and PPCE, it is shown how to combine theuncertainties due to time averaging or other Category-II factors with the Category-I uncertainties. Following the discussion in Section 3, the error in the QoIs of channel flow simulations are defined as anormalized deviation from a reference true data (specified by ◦ ), which are here taken to be the DNS dataof Iwamoto et al. [23]. For a scalar QoI, such as (cid:104) u τ (cid:105) , the error is defined as, (cid:15) [ (cid:104) u τ (cid:105) ] = ( (cid:104) u τ (cid:105) − (cid:104) u τ (cid:105) ◦ ) / (cid:104) u τ (cid:105) ◦ . (28)12 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . -5.00 - . - . - . - . - . - . - . - . - . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . (a) (b) 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . . (c) (d) Figure 1: Isolines of (cid:15) [ (cid:104) u τ (cid:105) ]% in Q = [5 , (cid:78) [5 , x + (cid:78) ∆ z + . The surfaces are constructedvia (2) using n = 4 (a), 9 (b), 13 (c), and 25 (d) training samples specified by red dots. The uncertainty in the training datadue to time-averaging is assumed to be zero. The thick line represents zero error in averaged friction velocity (cid:104) u τ (cid:105) . Note that u τ is a very relevant quantity in the context of wall-bounded turbulence, and that errors in u τ may lead to inadequate interpretations of the data [70]. For a QoI that is averaged over both time andperiodic directions, the profile depends on the wall-normal coordinate y and the definition (26) is adopted.This means in practise that the wall-normal maximum of the deviation is considered (maximum norm).Figure 1 represents the isolines of (cid:15) [ (cid:104) u τ (cid:105) ]% in the ∆ x + -∆ z + plane (i.e. the parameter space). In eachplot, a different number of parameter samples is used to construct the response surface using standard PCE.The total-order truncation scheme (6) with L = 12 is considered to ensure that all significant contributionsin PCE (2) are captured. Since the number of samples are less than the chosen K = 90, the coefficients in (2)are computed using the compressed sensing method (11). For all cases except Figure 1(c), tensor-producttruncation (5) can alternatively be used. It is also emphasized that the same response surfaces could beobtained using surrogates constructed by GPR or Lagrange interpolation [65]. The latter is restricted tothe cases with tensor-product samples in the parameter space.As pointed out in Section 2.1.2, it is important to examine the quality of a surrogate in a computerexperiment. In this regard, Figure 2 shows the convergence of the PCE associated to the plots in Figures 1(c)and (d). The contribution of the terms in expansion (2) has been quantified by evaluating the followingdiagnostic, ϑ k := | ˆ f k |(cid:107) Ψ k ( ξ ) (cid:107) / | ˆ f | . (29)As a prerequisite for ensuring that the predictions by non-intrusive PCE are valid, the values of ϑ k shouldexhibit a decreasing trend with | k | . This holds for both n = 13 and 25 cases, as shown in Figure 2.However, when using n = 25 samples, there are relatively significant contributions from | k | ≥ n = 13 samples. Further investigations (not shown here) indicate that addingsamples beyond n = 25 does not significantly modify the contours in Figure 1(d). In the case of n = 25samples, there is a distinct division between the values of ϑ k : those with the values higher than 10 − whichare shown in Figure 2(b), and the rest which are less than 10 − (not shown here). The ϑ k illustrated in13 | k | | f k ||| k () || /| f | | k | | f k ||| k () || /| f | (a) (b) Figure 2: Convergence of ϑ k defined by (29) for the PCE constructed for (cid:15) [ (cid:104) u τ (cid:105) ] using total-order truncation with L = 12 and n = 13 (a) and n = 25 (b) training samples. Note that the plots (a) and (b) correspond to Figure 1(c) and (d), respectively.The horizontal axis is | k | = k + k . In (a), the data with values smaller than 10 − are not shown. Figure 2(b) are the same if the tensor-product truncation scheme (5) with the projection method (7) isalternatively used. Therefore, for the n = 25 case, the use of higher number of terms in expansion (2) doesnot have a substantial effect.Returning to Figure 1, the use of higher number of training samples clearly leads to revelation of moredetails of the response surfaces, although in practice there should be a balance between the number ofsamples and the cost of running the simulator. In particular, observe the variation of the black curveswhich specify infinite combinations of resolutions resulting in (cid:15) [ (cid:104) u τ (cid:105) ] = 0. The existence of these curves,which can be due to error cancellations in the CFD simulator, besides the complexity of the error isolineswhich are solver-dependent, see e.g. [35, 52], prove the challenges involved in evaluation of the accuracy ofCFD simulations. These also cast doubt on the legitimacy of using techniques such as classical Richardsonextrapolation for the purpose of error estimation in turbulent flows, as it has been done for instance in [5, 26]in case of LES. Seeking for the alternatives has, for instance, led to the development of a Bayesian extensionof the Richardson extrapolation technique by Oliver et al. [44]. It is also important to note that the patternand values of the error isolines can be dependent on the reference data with respect to which the accuracyin the QoIs is evaluated.In producing Figure 1, the Category-II uncertainty in each of the training simulations is considered tobe zero. Such uncertainties could be for instance due to the time-averaging. To account for Category-II un-certainties when studying the uncertainty due to q = (∆ x + , ∆ z + ), we can use the GPR technique reviewedin Section 2.1.3. For illustration purposes, two scenarios for the observation Gaussian noise ε i ∼ N (0 , σ d i ),where i = 1 , , . . . , n , are assumed in connection with the generic model (1). In the case with homoscedasticnoise (same variance of all observations), σ d = 0 . 5% is assumed. In contrast, in heteroscedastic noise, wherethe noise level is observation-dependent, we assume σ d i = 0 . − ξ ∆ z + , i )%, where ξ ∆ z + , i ∈ [ − , 1] iscorresponding to ∆ z + i ∈ Q ∆ z + . This construction imitates a practical situation where simulations with finerresolution in the spanwise direction are more expensive and hence may be averaged over shorter time inter-vals. Consequently their QoIs may become more uncertain. The plots in Figure 3 represent these syntheticnoise levels (shown by confidence intervals) added to the mean observations of (cid:15) [ (cid:104) u τ (cid:105) ]% belonging to the n = 25 case in Figure 1(d). For these noisy data, surfaces of mean (cid:15) [ (cid:104) u τ (cid:105) ]% and associated 95% confidenceintervals are constructed in the ∆ x + -∆ z + plane in Figure 4. For this purpose, posterior predictive meanand variance given by Eqs. (19) and (20) are used.As a result of adding the observation noise, the isolines of the mean (cid:15) [ (cid:104) u τ (cid:105) ] may look different, compareFigures 4(a) and (d) with Figure 1(d). Such difference is more clear in the case of heteroscedastic noise. Theconfidence in predicting these surfaces is reflected in Figures 4(b) and (c) for the homoscedastic case, and inFigures 4(e) and (f) for the heteroscedastic case. It is emphasized that the exaggerated noise levels are usedhere to show how the framework works, otherwise, based on the authors’ experience, see Ref. [52, 71], theuncertainties in QoIs due to finite time-averaging intervals are smaller meaning that they may not majorlyalter the plots like those in Figure 1 or other inferences due to the variation of Category-I parameters.14 . . . . . . . . . . . . . . . . . . . . . . . . . x + [ u ] % Mean Observation95% CI z + . . . . . . . . . . . . . . . . . . . . . . . . . x + [ u ] % Mean Observation95% CI z + (a) (b) Figure 3: Synthetic uncertainties (95% CI) added to the mean observed values of (cid:15) [ (cid:104) u τ (cid:105) ]%. The mean values are taken fromthe 25 simulations in Figure 1(d) (corresponding to the red dots) in the computer experiment with q = { ∆ x + , ∆ z + } . Theadded uncertainties are synthetic Gaussian noise which imitate the Category-II uncertainties. The assumed noise is N (0 , σ d i )with i = 1 , , . . . , 25 where σ d i = 0 . σ d i = 0 . − ξ ∆ z + ,i )%, heteroscedastic noise (b). Inthe latter, ξ ∆ z + ,i ∈ [ − , 1] corresponds to the ∆ z + of the sample simulation. These synthetic data are used in Figure 4. Mean Observation Upper 95% CI Lower 95% CI 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . . . . . . . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . (a) (b) (c) 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . . . . . . . . . . . . 20 40 60 80 100 120 x + z + - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . - . (d) (e) (f) Figure 4: Surfaces of the posterior predictive mean (left column), and 95% CI (middle and right columns) of (cid:15) [ (cid:104) u τ (cid:105) ]% inthe ∆ x + -∆ z + plane. These predictions are made by GPR, through Eqs. (19) and (20), using observations with synthetichomoscedastic noise shown in Figure 3(a) (top row) and observations with synthetic heteroscedastic noise shown in Figure 3(b)(bottom row). The 25 training samples are specified by red dots. Using GPR, the Category-II uncertainty in the training datacan be considered when constructing the surrogates in computer experiments for Category-I uncertainties. u ] [ u ] [ u rms ] [ v rms ] [ w rms ] [ u v ] [ ] M a i n S o b o l I n d i c e s x + z + x + z + [ u ] [ u ] [ u rms ] [ v rms ] [ w rms ] [ u v ] [ ] M a i n S o b o l I n d i c e s x + z + x + z + Figure 5: Main Sobol indices (25) for different error measures when the admissible space for ∆ x + (cid:78) ∆ z + is Q =[5 , (cid:78) [5 , 70] (top), and Q = [5 , (cid:78) [5 , 30] (bottom). The surrogate constructed by n = 25 training samples over Q , seeFigure 1(d), is employed to interpolate QoIs over Q . Accurate quantification of the uncertainties due to finite time-averaging using methods similar to thoseinvestigated in [57, 44, 71] will be considered in the feature.Applying the global sensitivity analysis described in Section 2.2, the Sobol indices indicating the sensi-tivity of the error responses with respect to variation of ∆ x + and ∆ z + are obtained, and they are shownin Figure 5. It is recalled that the Sobol indices represent the global behavior of the error responses, incontrast to the error portraits, which reflect the local behavior. Note that in Figure 5, the uncertaintydue to time-averaging is neglected. The QoIs whose errors are considered in the sensitivity analysis in thisfigure are averaged friction velocity (cid:104) u τ (cid:105) , mean streamwise velocity (cid:104) u (cid:105) , rms velocity fluctuations u (cid:48) rms , v (cid:48) rms ,and w (cid:48) rms in the streamwise, wall-normal and spanwise directions, respectively, Reynolds shear stress (cid:104) u (cid:48) v (cid:48) (cid:105) ,and turbulent kinetic energy K = ( u (cid:48) + v (cid:48) + w (cid:48) ) / 2. Clearly over a given admissible range, the Sobolindices showing sensitivity with respect to ∆ x + and ∆ z + change between different error measures. Withthe exceptions (cid:15) ∞ [ u (cid:48) rms ] on Q = [5 , ⊗ [5 , 70] and (cid:15) ∞ [ (cid:104) u (cid:105) ] on Q = [5 , ⊗ [5 , z + than ∆ x + , regardless of the admissible range ofthese parameters. This behavior is not surprising, although computing Sobol indices provides a quantitativeway to study it. In fact, the relevant near-wall structures in turbulent flows have smaller length-scales inthe spanwise ( z ) direction than in the streamwise ( x ) direction, see e.g. [24]. Thus, employing coarse ∆ z + eliminates the possibility of correctly resolving such structures which in turn largely impacts the numericalmodeling of the flow physics. Another interesting observation is that changing the admissible range, Q , itis only for the streamwise velocity component (mean or rms fluctuation) that the sensitivity with respectto ∆ x + may become larger than that of ∆ z + . In particular, ∆ x + is the most influential factor on (cid:15) ∞ [ u (cid:48) rms ]over Q and on (cid:15) ∞ [ (cid:104) u (cid:105) ] over Q . In a computer experiment, the profiles of flow QoIs are affected by variation of q = (∆ x + , ∆ z + ) over Q .In this regard, the focus of this section is on three important and insightful analyses which are complementaryto each other. These analyses are novel and can be applicable to any CFD practice. First, by evaluating therobustness, an estimate of the amount of uncertainty propagated to each point of the profile of QoIs due tothe variation of q , is obtained. Second, the contribution of each parameter (among q ) in such propagated16ncertainty is measured by computing the Sobol sensitivity indices. Finally, it is explained how to computethe probability of observing a particular value for the QoIs conditioned on letting the parameters q vary. q [ y * ] u * q [ u * ]Ref. DNS95% CI for u * q [ y * ] u * r m s q [ u * rms ]Ref. DNS95% CI for u * rms q [ y * ] u * q [ u * ]Ref. DNS95% CI for u * q [ y * ] u * r m s q [ u * rms ]Ref. DNS95% CI for u * rms Figure 6: Expected value of (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms profiles along with the associated 95% confidence intervals for the variation of ∆ x + and ∆ z + over Q = [5 , (cid:78) [5 , 70] (top), and Q = [5 , (cid:78) [5 , 30] (bottom). For both cases, n = 25 training samples areused. The top plots correspond to the case shown in Figure 1(d). Any Category-II uncertainty, e.g. due to time-averaging, inthe training data is neglected. The DNS data of Iwamoto et al. [23] are provided as reference. Note that the expected values E q [ (cid:104) u (cid:105) ∗ ] and E q [ u (cid:48)∗ rms ] (the blue lines) do not show the best results. The discrete profile of a time-averaged QoI is represented by f i ( q ) = f ( y i , q ) for i = 1 , , · · · , N g .Following Section 3, confidence intervals for the expected value E q [ f i ( q )] are constructed for q ∈ Q . As aresult, we obtain the plots in Figure 6 showing the uncertainty in the inner-scaled profiles of mean and rmsfluctuations of the streamwise velocity component, (cid:104) u (cid:105) ∗ = (cid:104) u (cid:105) / (cid:104) u τ (cid:105) and u (cid:48)∗ rms = u (cid:48) rms / (cid:104) u τ (cid:105) , as a result ofvariation of ∆ x + and ∆ z + over Q and Q . For the sake of brevity, we do not present the results for profilesof Reynolds shear stress (cid:104) u (cid:48) v (cid:48) (cid:105) and other components of the rms fluctuation velocity. Note that since thecomputed (cid:104) u τ (cid:105) (not the reference value) are used for scaling the quantities of the corresponding simulation inthe computer experiment, we have used the superscript ∗ instead of + . Moreover, the horizontal axis in theplots represents E q [ y ∗ ] which is the expected value of y ∗ over the simulations in the experiment. Therefore,the illustrated confidence intervals only reflect the variation in the QoI which is on the vertical axis. Severalinteresting insights can be gained from Figure 6. First, as expected, the way in which the uncertaintydue to the variation of q is propagated depends on the QoI. However, for any QoI the robustness of theprofile varies with the distance from the wall. In the near-wall region, profiles of first- and second-ordervelocity moments are found to be more robust compared to the outer part of the boundary layer. Maximumuncertainty in (cid:104) u (cid:105) ∗ is observed close to the channel center line, while for u (cid:48)∗ rms , it is around the peak wherethe highest influence of the variation of ∆ x + and ∆ z + is observed. Moreover, comparison between the plotsin the top and bottom rows of Figure 6 reveals that reducing the range of variation of q may result insmaller propagation of uncertainty in QoIs. More notably, the (cid:104) u (cid:105) ∗ is found to be very robust when ∆ x + and ∆ z + change over [5 , 50] and [5 , E q [ · ], the expectedvalue of the QoIs in a computer experiment based on variation of q which is specified by the blue line inFigure 6, does not reflect the most accurate value of the QoIs. It however, provides a baseline around which17 q [ y * ] E rr o r i n q [ u * ] n = 4 n = 9 n = 13 10 q [ y * ] E rr o r i n q [ u * r m s ] n = 4 n = 9 n = 13 (a) (b) Figure 7: Normalized relative error in the estimated V q [ (cid:104) u (cid:105) ∗ ] (a) and V q [ u (cid:48)∗ rms ] (b) using n = 4 , , 25 training samples. Thereference values for computing the error and normalizing it are provided by the PCE with n = 25 samples. In all cases theTOM scheme (6) with L = 12 is used for truncation of the PCE. q [ y * ] u * [ q [ u * ]]Ref. DNS95% CI for q [ u * ]95% CI for sdev[ u * ]95% CI for u * q [ y * ] u * r m s (a) (b) Figure 8: Propagation of uncertainty into (cid:104) u (cid:105) ∗ (a) and u (cid:48)∗ rms (b) profiles due to the variation of ∆ x + and ∆ z + over Q whenthe simulator’s data are uncertain with a Gaussian noise with synthetic standard deviations (30). The experiment with 25simulations as in Figure 1(d) and Figure 6 is considered for the expected value of the simulated QoIs. The shaded regions showthe followings: dark blue: ˆ E [ E q [ f ( y ∗ , q )]] ± t (cid:113) ˆ V [ E q [ f ( y ∗ , q )]], light blue: ˆ E [ E q [ f ( y ∗ , q )]] ± t (cid:113) ˆ E [ V q [ f ( y ∗ , q )]], pink:ˆ E [ V q [ f ( χ, q )]] ± t (cid:113) ˆ V [ V q [ f ( χ, q )]], see Section 3. the confidence interval based on associated V q [ · ] is built.The top plots in Figure 6 correspond to the computer experiment with the 25 training samples shown bydots in Figure 1(d). We can investigate how much error would be involved in the confidence intervals if lowernumber of simulations were considered in the computer experiment. To this end, we compute the relativeerror between V q [ f ( y, q )] estimated by n = 4 , , 13 samples and the reference value given by n = 25 case.These samples are represented by the red dots over Q in Figure 1. The resulting errors for (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms are plotted in Figure 7. Clearly, using 4 and 9 simulations would be considerably erroneous. But, byincluding n = 13 simulations, the error in the variance of (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms is less than approximately 6% and 4%,respectively. Therefore, in a computer experiment fewer samples may be required to assess robustness ofthe profiles compared to what is needed to get an accurate error portrait, see Figure 1. It is noted thatbesides the above error analysis, the non-intrusive PCEs in each experiment can be examined by evaluatingthe norm (29) at each point on the profile of a QoI.In the above analysis of robustness against variation of q = (∆ x + , ∆ z + ), Figure 6, the training values ofQoIs (i.e. (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms ) were assumed to have negligible uncertainties. Such uncertainty could, for instance,have been due to insufficient averaging over time. For the purpose of illustrating the novel functionality ofthe PPCE method of Section 2.1.4, synthetic yet reasonable uncertainties are considered to be involved in18he training QoIs. According to a preliminary analysis (not shown here), the time-averaging uncertaintyin the profile of a channel flow QoI can be a function of wall distance. Analyzing the time-series of a setof channel flow simulations at different ∆ x + and ∆ z + values, approximate curves for g ( y ∗ ) are obtained,where (cid:112) V t [ r ( y ∗ )] / E t [ r ( y ∗ )] = c g ( y ∗ ). Note that g ( y ∗ ) has the the value one at the first off-wall gridpoint and takes different forms for different quantities r ( y ∗ ). In this expression, c is a constant value, and V t [ · ] and E t [ · ] are estimated variance and expectation of a time-averaged quantity. We repeat the computerexperiment designed for investigating the impact of q = (∆ x + , ∆ z + ), with the results shown in Figure 6(top),considering Gaussian noise ε i ∼ N (0 , σ d i ) in (1) where i = 1 , , · · · , 25. The observation-dependent noiselevels are constructed as, σ d i ( y ∗ ) = c g ( y ∗ ) E t [ r ( i ) ( y ∗ )] , (30)where, E t [ r ( i ) ( y ∗ )] is substituted by (cid:104) u (cid:105) ∗ ( i ) and u ∗ ( i ) rms obtained by averaging over long enough time (whatused in Figures 1, 6, for instance) and the value of c is assumed to be 0 . 01. This means that if forinstance g ( y ∗ ) = 1, then 1% of the value of the QoIs at each wall-normal location in the channel is uncertaindue to time-averaging. Using the PPCE method of Section 2.1.4, the plots in Figure 8 are obtained. Threeshaded areas representing different confidence intervals can be observed which are constructed following thediscussion in Section 3. The dark-blue area represents ˆ E [ E q [ f ( y ∗ , q )]] ± t (cid:113) ˆ V [ E q [ f ( y ∗ , q )]]. This region isvery small (almost invisible) meaning a high confidence in estimating the ˆ E [ E q [ f ( y ∗ , q )]] profile (solid blueline). Also, note that this profile remains unaffected in the plots in Figure 8 and Figure 6(top). The lightblue shaded region with dash-dotted boundaries shows ˆ E [ E q [ f ( y ∗ , q )]] ± t (cid:113) ˆ E [ V q [ f ( y ∗ , q )]]. This regioncan be compared to the corresponding CI for the variation of q in the plots in Figure 6(top). As a result ofincluding noise (30), the robustness of (cid:104) u (cid:105) ∗ with respect to variation of q remains almost unchanged, exceptat small y ∗ (near the wall). On the other hand, for u (cid:48)∗ rms , the changes are dramatic. This clearly indicatesthe importance of accounting for uncertainties in the training data, upon their existence, in a computerexperiment. Finally, the pink area in Figure 8 reflects the uncertainty involved in the estimated light-blueconfidence regions. q [ y * ] S o b o l I n d i c e s x + z + x + z + q [ y * ] S o b o l I n d i c e s x + z + x + z + q [ y * ] S o b o l I n d i c e s q [ y * ] S o b o l I n d i c e s Figure 9: Main Sobol indices for (cid:104) u (cid:105) ∗ (left column) and u (cid:48)∗ rms (right column) profiles showing the sensitivity with respect to ∆ x + and ∆ z + as they vary over [5 , , 70] (top row) and [5 , 50] and [5 , 30] (bottom row), respectively. q [ y * ] u * . . . . . . . . . . . . . q [ u * ]Ref. DNS u * q [ y * ]=3 0 2PDF8.08.59.0 q [ y * ]=10 0.0 0.5 1.0PDF11.512.012.513.0 q [ y * ]=20 0.00 0.25 0.50PDF1415161718 q [ y * ]=50 0.0 0.2 0.4PDF161820 q [ y * ]=100 0.0 0.5PDF182022 q [ y * ]=250 Figure 10: PDF contours (top) and histograms at specific wall-normal distances (bottom) of (cid:104) u (cid:105) ∗ due to the variation of ∆ x + and ∆ z + over [5 , , Henceforth, we do not consider the synthetic uncertainties in the training data in the computer experi-ments for q . In order to specify the contribution of each component of q compared to the others in drivingthe uncertainty in the profiles of QoIs, the Sobol sensitivity indices are computed. Figure 9 shows the profilesof these indices for (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms belonging to the cases represented in Figure 6. The information deducedfrom the plots in these two figures are complementary to each other. For instance, the highest uncertaintyin the u (cid:48)∗ rms profile that is near the peak turns out to be mostly driven by the variation in ∆ z + . Further,at different wall distances, the relative influence of a parameter on the uncertainty of the QoI can change.However, near the wall, i.e. for E q [ y ∗ ] (cid:46) 7, it is ∆ z + which dominates. It is emphasized that this type ofinterpretation and conclusions are neither trivial nor achievable without using the current UQ framework.In particular, note the fact that the admissible space of ∆ x + is taken to be always larger than the spaceof ∆ z + .We move on to the third type of analysis mentioned at the beginning of this section. The idea is toestimate how probable is to observe a particular value of the QoIs as a result of variation of q over Q . To thisend, probability density functions (PDF) of the QoIs are constructed through evaluating the surrogate ˜ f ( q )at enough number of samples of q . The surrogate can be constructed for instance by PCE (2) or GPR (14)or any other method, see e.g. [17, 65, 20].Shown in the top plots of Figures 10 and 11 are the isolines of PDFs of (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms constructed alongthe associated profiles. The bottom plots represent the PDFs at a few off-the-wall locations extracted fromthe top plots. Clearly, the PDF of the QoI varies with the wall distance and is not the same as any standarddistributions. Moreover, at a fixed E q [ y ∗ ], there might be more than one value for the QoIs which have highprobability to be observed. In other words, the PDF of the QoI can be multi-modal. As an extension of thecurrent analysis, one can estimate the distribution of the combinations of the parameters q which lead to aspecific value for the QoIs. This can be achieved through conducting a UQ inverse problem in a Bayesianframework, see e.g. [17, 65], where the surrogate in model (1) specifies the relationship between the inputparameters and outputs. 20 q [ y * ] u * r m s . . . . . . . . . . . . . . . . . . . . . . q [ u * rms ]Ref. DNS u * r m s q [ y * ]=3 0 2PDF2.502.753.003.253.50 q [ y * ]=10 0.0 0.5 1.0PDF2.53.03.54.0 q [ y * ]=20 0 1 2PDF2.02.5 q [ y * ]=50 0 2 4PDF1.41.61.8 q [ y * ]=100 0 5PDF0.80.91.0 q [ y * ]=250 Figure 11: PDF contours (top) and histograms at specific wall-normal distances (bottom) of u (cid:48)∗ rms due to the variation of ∆ x + and ∆ z + over [5 , , 5. Summary and Conclusions Validation and verification (V&V) aim at assessing the accuracy and reliability of the outputs of com-putational simulations of different physics problems. A computational model depends on a set of inputsand parameters (together called factors) which can influence the simulator’s outputs. Tracing such impactsthrough closed mathematical relations is not usually feasible, considering the complexities in computational,numerical and mathematical models. This is particularly challenging in the case of the Navier–Stokes equa-tions being numerically solved for studying flow turbulence. Therefore, designing appropriate computerexperiments is inevitable for the purpose of V&V. The present study develops a unified framework by com-bining different UQ techniques and computer experiments in order to quantify accuracy, robustness, andsensitivity in computational physics. Although some of the UQ techniques may be well-established, utilizingthem for the purpose of this study is novel. As a UQ way of thinking, all the simulator’s factors and outputsare considered to be random, and thus probabilistic approaches can be applied. According to this view,the concepts of accuracy, robustness and sensitivity of the outputs with respect to variation of the factorsare defined. Depending on the reference true data used for assessment of accuracy, both validation andverification can be addressed.The framework is general and flexible so it can be applied to any computational physics problem. How-ever, the examples and discussions in this manuscript are provided with focus on CFD, in general, andhigh-fidelity simulations of wall turbulence, in particular. In this regard, two sets of factors for the com-puter simulators, e.g. CFD solvers, are considered. Category-I parameters are either uncertain by definitionor assumed to be so. Hence they are allowed to vary according to a given probability distribution over anadmissible space. Based on a limited number of (training) samples for these factors, a surrogate for theactual simulator is constructed in a computer experiment. To estimate variation in the outputs due to theCategory-I parameters, different constructions for non-intrusive PCE are discussed. To assess sensitivityof QoIs with respect to the uncertain factors, Sobol indices based on analysis of variance (ANOVA) arecomputed. An overview on Gaussian process regression is given, which is a powerful method to includeuncertainties due to Category-II parameters which affect the simulator’s outputs in the computer exper-21ment designed based on Category-I parameters. As a novel contribution, it is shown how to use GPRwith heteroscedastic noise structure which enables us to incorporate observation-dependent uncertainties.It is recalled that usually GPR is used with homoscedastic noise levels. Another novelty of this study is tocombine standard PCE with uncertain predictions by GPR to derive probabilistic PCE for the purpose ofestimating “uncertain propagation of uncertainties” in the simulator’s outputs due to the combination ofCategory-I and Category-II parameters.Different capabilities of the developed framework are illustrated by applying it to a computer experimentbased on scale-resolving simulations of turbulent channel flow. The CFD simulations are carried out usingthe open-source spectral-element solver Nek5000 [13]. The inner-scaled grid resolutions ∆ x + and ∆ z + inthe wall-parallel directions of the channel are the Category-I parameters. The quantities of interest are theaveraged friction velocity (cid:104) u τ (cid:105) and the profiles of mean and rms fluctuation of velocity, i.e. (cid:104) u i (cid:105) and (cid:104) u (cid:48) i u (cid:48) j (cid:105) for i, j = 1 , , 3, respectively. The Category-II factors may, for instance, be the insufficiency of sampleswhen computing time-averaged QoIs. Although, the values of such an uncertainty are negligible in the dataused in the present study, we employ synthetic values for this uncertainty to illustrate the features of theframework.Through the illustrative example, it is shown how the UQ framework facilitates finding quantitativeestimates to the following questions which are relevant to any CFD simulation: i) How much do the QoIsvary as a result of variation of different factors? Through this uncertainty-propagation problem, robustnessof the QoIs with respect to the input variations is measured. ii) In a given variation in a QoI, what is thecontribution of each factor? This is answered by global sensitivity analysis. iii) What is the probability ofobserving a particular value of the QoIs when inputs vary? Any of these analyses can also be applied to theaccuracy of a QoI using experimental or high-fidelity reference data.For the case of turbulent channel flow, the portrait of the error in (cid:104) u τ (cid:105) in the ∆ x + -∆ z + space, Figure 1,gives an understanding of the complexity in the variation of error with grid resolution and hence clarifies thedifficulty that analytical error estimators have to deal with. It should be highlighted that our approach is notintended to find the best combination of the factors to be closest to the reference data, but rather quantifiesthe sensitivity, robustness and interdependence of the various factors. Considering Category-II uncertainties,the isolines are slightly affected, see Figure 4. The largest variation is observed for the isoline of zero (cid:15) [ (cid:104) u τ (cid:105) ].The Sobol indices computed for error in different QoIs (see Figure 5) are informative, since they specify whichfactor has to be altered in order to have the highest impact on a specific error measure. A discussion is madeon the number of samples from the uncertain factors which have to be included in a computer experiment.With regard to the non-intrusive PCE, the use of the diagnosis tool (29) is represented in Figure 2. When thepurpose is estimating the propagated uncertainty into the inner-scaled profiles of mean and rms fluctuationof streamwise velocity, (cid:104) u (cid:105) ∗ = (cid:104) u (cid:105) / (cid:104) u τ (cid:105) and u (cid:48)∗ rms = u (cid:48) rms / (cid:104) u τ (cid:105) (see Figure 7), the inclusion of 13 samples isfound to be insignificantly erroneous compared to the case with almost double number of samples. However,in any case it is important to use a space-filling sampling method.The propagated uncertainties in (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms profiles due to the variation of ∆ x + and ∆ z + are repre-sented in Figure 6. Interestingly, in the immediate near-wall region with E q [ y ∗ ] (cid:46) 5, the profiles are veryrobust against changing the resolution. The most influenced region is near the channel center for (cid:104) u (cid:105) ∗ andnear the peak for u (cid:48)∗ rms . Moreover, the admissible space of the uncertain parameters is shown to influence thepropagated uncertainty. Such influence varies between the QoIs. The information from these UQ forwardproblem should be combined with the results of global sensitivity analysis. The plots in Figure 9 show howmuch influence each factor has on the QoIs at different distances from the wall. The profiles of Sobol indicesvary between the QoIs and also with the admissible space. However, the most influential factor at very closedistances to the wall is always observed to be ∆ z + , i.e. the grid resolution in the spanwise direction. Asthe third type of information which could be extracted from the framework, the PDF of profiles of (cid:104) u (cid:105) ∗ and u (cid:48)∗ rms are plotted in Figure 10 and Figure 11, respectively. At any wall-distance, the PDFs of the QoIs aremultimodal, even though the factors ∆ x + and ∆ z + were assumed to be uniform random variables.Finally, we have shown how to combine the variation of grid resolutions with uncertainty from othersources which affect the simulator’s outputs, for instance, insufficient averaging in time. The novel plots ofFigure 8 are obtained as a result of applying the probabilistic PCE of Section 2.1.4. No need to emphasizethat many of the resulting interpretations and conclusions could not be achieved without using the relevant22echniques of the introduced framework.The present study can be applied and extended in various ways, including the followings. The frameworkcan be applied to compare performance of different simulators for benchmark problems. The uncertaintiesdue to insufficient time-averaging in turbulent simulations are needed to be accurately quantified and then beused in the present framework. A UQ inverse problem can be designed and solved to estimate with confidencethe grid resolutions which could result in specific PDFs of QoIs similar to what is shown in Figure 10 andFigure 11. Another relevant application area of the present framework is the study of sensitivity withinturbulence modelling (LES and RANS), or the consideration of physical factors such as surface roughnessand/or geometrical uncertainties. Acknowledgments This work has been supported by the EXCELLERAT project which has received funding from theEuropean Union’s Horizon 2020 research and innovation programme under grant agreement No 823691.Financial support by the Linn´e FLOW Centre at KTH for SR is gratefully acknowledged. RV acknowledgesthe financial support from the Swedish Research Council (VR), and PS funding by the Knut and AliceWallenberg (KAW) foundation as part of the Wallenberg Academy Fellow programme. The channel flowsimulations in Section 4 were performed on the resources provided by the Swedish National Infrastructurefor Computing (SNIC) at PDC (KTH Royal Institute of Technology), HPC2N (Ume˚a University), and NSC(Link¨oping University), Sweden. References [1] Guide for the Verification and Validation of Computational Fluid Dynamics Simulations. In AIAA G-077-1998(2002) .American Institute of Aeronautics and Astronautics, Inc., 1998. doi: 10.2514/4.472855.001. URL https://doi.org/10.2514/4.472855.001 .[2] L. S. Bastos and A. OHagan. Diagnostics for gaussian process emulators. Technometrics , 51(4):425–438, 2009. doi:10.1198/TECH.2009.08019. URL https://doi.org/10.1198/TECH.2009.08019 .[3] G. Blatman. Adaptive sparse polynomial chaos expansions for uncertainty propagation and sensitivity analysis . PhDthesis, Universit´e Blaise Pascal, Clermont-Ferrand, France, 2009.[4] E. J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incompletefrequency information. IEEE Transactions on Information Theory , 52(2):489–509, 2006.[5] I. Celik, Z. N. Cehreli, and I. I. Yavuz. Index of resolution quality for large eddy simulations. ASME. J. Fluids Eng. , 127(5):949–958, 2005. doi: 10.1115/1.1990201.[6] P. M. Congedo, C. Duprat, G. Balarac, and C. Corre. Numerical prediction of turbulent flows using Reynolds-averagedNavier-Stokes and large-eddy simulation with uncertain inflow conditions. International Journal for Numerical Methodsin Fluids , 72(3):341–358, 2012. doi: 10.1002/fld.3743. URL https://doi.org/10.1002/fld.3743 .[7] S. Conti and A. OHagan. Bayesian emulation of complex multi-output and dynamic computer models. Journal of StatisticalPlanning and Inference , 140(3):640 – 651, 2010. ISSN 0378-3758. doi: https://doi.org/10.1016/j.jspi.2009.08.006. URL .[8] M. O. Deville, P. F. Fischer, and E. H. Mund. High-Order Methods for Incompressible Fluid Flow . Cambridge Monographson Applied and Computational Mathematics. Cambridge University Press, 2002. doi: 10.1017/CBO9780511546792.[9] S. Diamond and S. Boyd. CVXPY: A Python-embedded modeling language for convex optimization. Journal of MachineLearning Research , 17(83):1–5, 2016.[10] M. Eldred and J. Burkardt. Comparison of non-intrusive polynomial chaos and stochastic collocation methods for un-certainty quantification. In . American Institute of Aeronautics and Astronautics, Jan. 2009. doi: 10.2514/6.2009-976. URL https://doi.org/10.2514/6.2009-976 .[11] P. F. Fischer. An overlapping schwarz method for spectral element solution of the incompressible Navier–Stokes equations. J. Comput. Phys. , 133:84–101, 1997.[12] P. F. Fischer and J. Mullen. Filter-based stabilization of spectral element methods. C. R. Acad. Sci. Ser. I Math. , 332:265, 2001.[13] P. F. Fischer, J. W. Lottes, and S. G. Kerkemeier. NEK5000: Open source spectral element CFD solver. Available at: http://nek5000.mcs.anl.gov , 2008. URL http://nek5000.mcs.anl.gov .[14] T. E. Fricker, J. E. Oakley, and N. M. Urban. Multivariate gaussian process emulators with nonseparable covariancestructures. Technometrics , 55(1):47–56, 2013. doi: 10.1080/00401706.2012.715835. URL https://doi.org/10.1080/00401706.2012.715835 .[15] J. R. Gardner, G. Pleiss, D. Bindel, K. Q. Weinberger, and A. G. Wilson. Gpytorch: Blackbox matrix-matrix gaussianprocess inference with GPU acceleration. CoRR , abs/1809.11165, 2018. URL http://arxiv.org/abs/1809.11165 . 16] T. Gerstner and M. Griebel. Numerical integration using sparse grids. Numerical Algorithms , 18(3/4):209–232, 1998. doi:10.1023/a:1019129717644. URL https://doi.org/10.1023/a:1019129717644 .[17] R. Ghanem, D. Higdon, and H. Owhadi, editors. Handbook of Uncertainty Quantification . Springer International Pub-lishing, 2017. doi: 10.1007/978-3-319-12385-1. URL https://doi.org/10.1007/978-3-319-12385-1 .[18] R. G. Ghanem and P. D. Spanos. Stochastic Finite Elements: A Spectral Approach . Springer-Verlag New York, Inc., NewYork, NY, USA, 1991. ISBN 0-387-97456-3.[19] P. W. Goldberg, C. K. I. Williams, and C. M. Bishop. Regression with input-dependent noise: A gaussian processtreatment. In Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10 , NIPS 97,page 493499, Cambridge, MA, USA, 1998. MIT Press. ISBN 0262100762.[20] R. B. Gramacy. Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences . ChapmanHall/CRC, Boca Raton, Florida, 2020. URL http://bobby.gramacy.com/surrogates/ .[21] S. Hosder, R. Walters, and M. Balch. Efficient sampling for non-intrusive polynomial chaos applications with multipleuncertain input variables. In , pages 1–16, 2007. doi: 10.2514/6.2007-1939. URL https://arc.aiaa.org/doi/abs/10.2514/6.2007-1939 .[22] S. Hoyas and J. Jim´enez. Reynolds number effects on the Reynolds-stress budgets in turbulent channels. Physics of Fluids ,20(10):101511, 2008. doi: 10.1063/1.3005862. URL https://doi.org/10.1063/1.3005862 .[23] K. Iwamoto, Y. Suzuki, and N. Kasagi. Reynolds number effect on wall turbulence: toward effective feedback control. International Journal of Heat and Fluid Flow , 23(5):678–689, 2002. ISSN 0142-727X. doi: http://doi.org/10.1016/S0142-727X(02)00164-9. URL .[24] J. Jim´enez. Near-wall turbulence. Physics of Fluids , 25(10):101302, 2013. doi: 10.1063/1.4824988. URL http://dx.doi.org/10.1063/1.4824988 .[25] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR , abs/1412.6980, 2014.[26] M. Klein. An attempt to assess the quality of large eddy simulations in the context of implicit filtering. Flow, Turbulenceand Combustion , 75(1):131–147, 2005. ISSN 1573-1987.[27] A. Kravchenko and P. Moin. On the effect of numerical errors in large eddy simulations of turbulent flows. Journalof Computational Physics , 131(2):310 – 322, 1997. ISSN 0021-9991. doi: https://doi.org/10.1006/jcph.1996.5597. URL .[28] E. Kreyszig. Introductory Functional Analysis With Applications . Wiley Classics Library. John Wiley & Sons, 1978. ISBN9780471507314. URL https://books.google.se/books?id=Va8rAAAAYAAJ .[29] M. Lee and R. D. Moser. Direct numerical simulation of turbulent channel flow up to Re τ ≈ Journal of FluidMechanics , 774:395–415, 2015. doi: 10.1017/jfm.2015.268.[30] J. Lottes. Independent quality measures for symmetric algebraic multigrid components. Argonne National Laboratory,Mathematics & Computer Science Division , 2005.[31] D. Lucor, J. Meyers, and P. Sagaut. Sensitivity analysis of large-eddy simulations to subgrid-scale-model parametricuncertainty using polynomial chaos. Journal of Fluid Mechanics , 585:255–279, 2007. doi: 10.1017/S0022112007006751.[32] A. Mariotti, L. Siconolfi, and M. Salvetti. Stochastic sensitivity analysis of large-eddy simulation predictions of theflow around a 5:1 rectangular cylinder. European Journal of Mechanics - B/Fluids , 62:149–165, 2017. ISSN 0997-7546. doi: https://doi.org/10.1016/j.euromechflu.2016.12.008. URL .[33] M. Meldi, M. V. Salvetti, and P. Sagaut. Quantification of errors in large-eddy simulations of a spatially evolvingmixing layer using polynomial chaos. Physics of Fluids , 24(3):035101, 2012. doi: 10.1063/1.3688135. URL https://doi.org/10.1063/1.3688135 .[34] J. Meyers. Error-landscape assessment of large-eddy simulations: A review of the methodology. Journal of Scientific Com-puting , 49(1):65–77, Dec. 2010. doi: 10.1007/s10915-010-9449-z. URL https://doi.org/10.1007/s10915-010-9449-z .[35] J. Meyers and P. Sagaut. Is plane-channel flow a friendly case for the testing of large-eddy simulation subgrid-scale models? Physics of Fluids , 19(4):048105, 2007.[36] B. Moore and B. Natarajan. A general framework for robust compressive sensing based nonlinear regression. In , pages 225–228, 2012.[37] P. Negi, P. Schlatter, and D. Henningson. A re-examination of filter-based stabilization for spectral-element methods.Technical report, KTH, Engineering Mechanics, 2017. QC 20171121.[38] W. L. Oberkampf and C. J. Roy. Verification and Validation in Scientific Computing . Cambridge University Press, 2010.doi: 10.1017/CBO9780511760396.[39] W. L. Oberkampf and T. G. Trucano. Verification and validation in computational fluid dynamics. Progress in AerospaceSciences , 38(3):209 – 272, 2002. ISSN 0376-0421. doi: https://doi.org/10.1016/S0376-0421(02)00005-2. URL .[40] T. Oden, R. Moser, , and O. Ghattas. Computer predictions with quantified uncertainty, Part I. SIAM News , 43(9), 2010.[41] N. Offermans, O. Marin, M. Schanen, J. Gong, P. F. Fischer, P. Schlatter, A. Obabko, A. Peplinski, M. Hutchinson, andE. Merzari. On the strong scaling of the spectral element solver Nek5000 on petascale systems. In Proceedings of theExascale Applications and Software Conference , 2016.[42] A. O’Hagan. Bayeshermite quadrature. Journal of Statistical Planning and Inference , 29(3):245 – 260, 1991. ISSN 0378-3758. doi: https://doi.org/10.1016/0378-3758(91)90002-V. URL .[43] T. Oliphant. NumPy: A guide to NumPy. Trelgol Publishing USA, 2006. URL .[44] T. A. Oliver, N. Malaya, R. Ulerich, and R. D. Moser. Estimating uncertainties in statistics computed from direct numericalsimulation. Physics of Fluids , 26(3):035101, 2014. doi: 10.1063/1.4866813. URL https://doi.org/10.1063/1.4866813 . 45] N. E. Owen. A comparison of polynomial chaos and Gaussian process emulation for uncertainty quantification in computerexperiments . PhD thesis, University of Exeter, UK, 2017.[46] N. E. Owen, P. Challenor, P. P. Menon, and S. Bennani. Comparison of surrogate-based uncertainty quantification methodsfor computationally expensive simulators. SIAM/ASA Journal on Uncertainty Quantification , 5(1):403–435, 2017. doi:10.1137/15M1046812. URL https://doi.org/10.1137/15M1046812 .[47] A. T. Patera. A spectral element method for fluid dynamics: laminar flow in a channel expansion. J. Comput. Phys. , 54:468–488, 1984.[48] M. P. Pettersson, G. Iaccarino, and J. Nordstr¨om. Polynomial Chaos Methods for Hyperbolic Partial Differential Equa-tions . Springer International Publishing, 2015. doi: 10.1007/978-3-319-10714-1. URL https://doi.org/10.1007/978-3-319-10714-1 .[49] S. G. Rabinovich. Evaluating Measurement Accuracy . Springer International Publishing, 2017. doi: 10.1007/978-3-319-60125-0. URL https://doi.org/10.1007/978-3-319-60125-0 .[50] K. Ramachandran and C. Tsokos. Mathematical Statistics with Applications . Elsevier Science, 2009. ISBN 9780080951706.URL https://books.google.se/books?id=YFyhXk-ONWwC .[51] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning (Adaptive Computation and MachineLearning) . The MIT Press, 2005. ISBN 026218253X.[52] S. Rezaeiravesh and M. Liefvendahl. Effect of grid resolution on large eddy simulation of wall-bounded turbulence. Physicsof Fluids , 30(5):055106, May 2018. doi: 10.1063/1.5025131. URL https://doi.org/10.1063/1.5025131 .[53] S. Rezaeiravesh, R. Vinuesa, M. Liefvendahl, and P. Schlatter. Assessment of uncertainties in hot-wire anemometry andoil-film interferometry measurements for wall-bounded turbulent flows. European Journal of Mechanics - B/Fluids , 72:57–73, 2018. ISSN 0997-7546. doi: https://doi.org/10.1016/j.euromechflu.2018.04.012. URL .[54] S. Rezaeiravesh, T. Mukha, and M. Liefvendahl. Systematic study of accuracy of wall-modeled large eddy simulationusing uncertainty quantification techniques. Computers & Fluids , 185:34–58, 2019. ISSN 0045-7930. doi: https://doi.org/10.1016/j.compfluid.2019.03.025. URL .[55] P. J. Roache. Quantification of uncertainty in computational fluid dynamics. Annual Review of Fluid Mechanics , 29(1):123–160, 1997. doi: 10.1146/annurev.fluid.29.1.123. URL https://doi.org/10.1146/annurev.fluid.29.1.123 .[56] M. Rosenblatt. Remarks on a multivariate transformation. The Annals of Mathematical Statistics , 23(3):470–472, 1952.doi: 10.1214/aoms/1177729394. URL https://doi.org/10.1214/aoms/1177729394 .[57] S. Russo and P. Luchini. A fast algorithm for the estimation of statistical error in DNS (or experimental) time averages. Journal of Computational Physics , 347:328 – 340, 2017. ISSN 0021-9991. doi: https://doi.org/10.1016/j.jcp.2017.07.005.URL .[58] J. Sacks, W. J. Welch, T. J. Mitchell, and H. P. Wynn. Design and analysis of computer experiments. Statistical Science ,4(4):409–423, 1989. ISSN 08834237. URL .[59] P. Sagaut. Large Eddy Simulation for Incompressible Flows, An Introduction . Springer, 3rd edition, 2006.[60] P. Sagaut, S. Deck, and M. Terracol. Multiscale and Multiresolution Approaches in Turbulence: LES, DES and HybridRANS/LES Methods : Applications and Guidelines . Imperial College Press, 2013. ISBN 9781848169876. URL https://books.google.se/books?id=MzW6CgAAQBAJ .[61] T. J. Santner, B. J. Williams, and W. I. Notz. The Design and Analysis of Computer Experiments . Springer New York,2003. doi: 10.1007/978-1-4757-3799-8. URL https://doi.org/10.1007/978-1-4757-3799-8 .[62] P. Schlatter, S. Stolz, and L. Kleiser. LES of transitional flows using the approximate deconvolution model. Int. J. HeatFluid Flow , 25:549–558, 2004.[63] S. Schlesinger. Terminology for model credibility. SIMULATION , 32(3):103–104, 1979. doi: 10.1177/003754977903200304.URL https://doi.org/10.1177/003754977903200304 .[64] R. Sch¨obi, B. Sudret, and J. Wiart. Polynomial-chaos-based Kriging. International Journal for Uncertainty Quantifica-tion , 5(2):171–193, 2015. doi: 10.1615/int.j.uncertaintyquantification.2015012467. URL https://doi.org/10.1615/int.j.uncertaintyquantification.2015012467 .[65] R. C. Smith. Uncertainty Quantification: Theory, Implementation, and Applications . Society for Industrial and AppliedMathematics, Philadelphia, PA, USA, 2013. ISBN 161197321X, 9781611973211.[66] S. A. Smolyak. Quadrature and interpolation formulas for tensor products of certain classes of functions. Dokl. Akad.Nauk SSSR , 148(5):1042–1045, 1963. ISSN 0021-9991. URL .[67] I. Sobol. Global sensitivity indices for nonlinear mathematical models and their monte carlo estimates. Mathematics andComputers in Simulation , 55(1):271 – 280, 2001. ISSN 0378-4754. doi: https://doi.org/10.1016/S0378-4754(00)00270-6.URL . The Second IMACS Seminar on MonteCarlo Methods.[68] I. M. Sobol. On sensitivity estimation for nonlinear mathematical models. Mat. Model. , 2(1):112–118, 1990. ISSN0234-0879.[69] H. Tufo and P. F. Fischer. Fast parallel direct solvers for coarse grid problems. J. Parallel Distr. Com. , 61:151–177, 2001.[70] R. Vinuesa, P. Schlatter, and H. M. Nagib. Role of data uncertainties in identifying the logarithmic region of turbulentboundary layers. Exp. Fluids , 55:1751, 2014.[71] R. Vinuesa, C. Prus, P. Schlatter, and H. M. Nagib. Convergence of numerical simulations of turbulent wall-bounded flows and mean cross-flow structure of rectangular ducts. Meccanica , 51(12):3025–3042, Oct. 2016. doi:10.1007/s11012-016-0558-0. URL https://doi.org/10.1007/s11012-016-0558-0 .[72] R. Vinuesa, P. Negi, M. Atzori, A. Hanifi, D. Henningson, and P. Schlatter. Turbulent boundary layers around wingsections up to Re c = 1 , , Int. J. Heat Fluid Flow , 72:86–99, 2018. 73] R. Vinuesa, A. Peplinski, M. Atzori, L. Fick, O. Marin, E. Merzari, P. Negi, A. Tanarro, and P. Schlatter. Turbulencestatistics in a spectral-element code:a toolbox for high-fidelity simulations. Technical Report 2018:010, KTH, Turbulence,2018. QC 20181219.[74] R. Vinuesa, P. Schlatter, and H. M. Nagib. Secondary flow in turbulent ducts with increasing aspect ratio. Phys. Rev.Fluids , 3:054606, 2018.[75] P. Virtanen, R. Gommers, T. E. Oliphant, and Contributors. SciPy 1.0: Fundamental Algorithms for Scientific Computingin Python. Nature Methods , 17:261–272, 2020. doi: https://doi.org/10.1038/s41592-019-0686-2.[76] B. Vreman, B. Geurts, and H. Kuerten. Comparision of numerical schemes in large-eddy simulation of the tem-poral mixing layer. International Journal for Numerical Methods in Fluids , 22(4):297–311, 1996. doi: 10.1002/(SICI)1097-0363(19960229)22:4 (cid:104) (cid:105) https://onlinelibrary.wiley.com/doi/abs/10.1002/%28SICI%291097-0363%2819960229%2922%3A4%3C297%3A%3AAID-FLD361%3E3.0.CO%3B2-X .[77] H. Xiao and P. Cinnella. Quantification of model uncertainty in RANS simulations: A review. Progress in AerospaceSciences , 108:1 – 31, 2019. ISSN 0376-0421. doi: https://doi.org/10.1016/j.paerosci.2018.10.001. URL .[78] D. Xiu. Efficient collocational approach for parametric uncertainty analysis. Communications in Computational Physics ,2(2):293–309, 2007. ISSN 1991-7120. doi: https://doi.org/. URL http://global-sci.org/intro/article_detail/cicp/7907.html .[79] D. Xiu. Numerical Methods for Stochastic Computations: A Spectral Method Approach . Princeton University Press,Princeton, NJ, USA, 2010. ISBN 9781400835348.[80] D. Xiu and J. S. Hesthaven. High-order collocation methods for differential equations with random inputs. SIAM Journalon Scientific Computing , 27(3):1118–1139, 2005. doi: 10.1137/040615201. URL https://doi.org/10.1137/040615201 .[81] D. Xiu and G. E. Karniadakis. The Wiener–Askey polynomial chaos for stochastic differential equations. SIAM Journalon Scientific Computing , 24(2):619–644, 2002., 24(2):619–644, 2002.