The ROMES method for statistical modeling of reduced-order-model error
MARTIN DROHMANN AND KEVIN CARLBERG
Abstract.
This work presents a technique for statistically modeling errors introduced by reduced-order models. The method employs Gaussian-process regression to construct a mapping from a small number of computationally inexpensive 'error indicators' to a distribution over the true error. The variance of this distribution can be interpreted as the (epistemic) uncertainty introduced by the reduced-order model. To model normed errors, the method employs existing rigorous error bounds and residual norms as indicators; numerical experiments show that the method leads to a near-optimal expected effectivity, in contrast to typical error bounds. To model errors in general outputs, the method uses dual-weighted residuals (which are amenable to uncertainty control) as indicators. Experiments illustrate that correcting the reduced-order-model output with this surrogate can improve prediction accuracy by an order of magnitude; this contrasts with existing 'multifidelity correction' approaches, which often fail for reduced-order models and suffer from the curse of dimensionality. The proposed error surrogates also lead to a notion of 'probabilistic rigor', i.e., the surrogate bounds the error with specified probability.
Primary 65G99, Secondary 65Y20, 62M86.
1. Introduction
As computing power increases, computational models of engineered systems are being employed to answer increasingly complex questions that guide decision making, often in time-critical scenarios. It is becoming essential to rigorously quantify and account for both aleatory and epistemic uncertainties in these analyses. Typically, the high-fidelity computational model can be viewed as providing a (costly-to-evaluate) mapping between system inputs (e.g., uncertain parameters, decision variables) and system outputs (e.g., outcomes, measurable quantities). For example, data assimilation employs collected sensor data (outputs) to update the distribution of uncertain parameters (inputs) of the model; doing so via Bayesian inference requires sampling from the posterior distribution, which can entail thousands of forward-model simulations. The computational resources (e.g., weeks on a supercomputer) required for large-scale simulations preclude such high-fidelity models from being feasibly deployed in such scenarios.

To avoid this bottleneck, analysts have turned to surrogate models that approximate the input–output map of the high-fidelity model, yet incur a fraction of its computational cost. However, to be rigorously incorporated in uncertainty-quantification (UQ) contexts, it is critical to quantify the additional uncertainty introduced by such an approximation. For example, Bayesian inference aims to sample from the posterior distribution

(1) P[µ | s̄] ∝ P[µ] P[s̄ | µ],

where µ ∈ P ⊂ R^{n_µ} denotes the system inputs, s̄ ∈ R denotes the measured output (this work considers one output for notational simplicity; all concepts can be straightforwardly extended to multiple outputs, which the numerical experiments treat), P[µ] represents the prior, and P[s̄ | µ] denotes the likelihood function. Typically, the measured output is modeled as s̄ = s(µ) + ε, where s : P → R denotes the output predicted by the high-fidelity model for inputs µ, and ε is a random variable representing measurement noise. Sampling from this posterior distribution (e.g., via Markov chain Monte Carlo or importance sampling) is costly, as each sample requires at least one evaluation of the high-fidelity input–output map µ ↦ s that appears in the likelihood function.

When a surrogate model is employed, the measured output becomes s̄ = s_surr(µ) + δ_s(µ) + ε, where s_surr : P → R denotes the output predicted by the surrogate model, and δ_s : P → R represents the surrogate-model output error or bias. In this case, posterior sampling requires only evaluations of the surrogate-model input–output map µ ↦ s_surr, which is computationally inexpensive, as well as evaluation of the surrogate-model error δ_s(µ), which is not precisely known in practice. As such, this error can be considered a source of epistemic uncertainty, as it can be reduced in principle by employing the original high-fidelity model (or a higher-fidelity surrogate model). The goal of this work is to construct a statistical model δ̃_s(µ) of this surrogate-model error that 1) is cheaply computable, 2) exhibits low variance (i.e., introduces minimal epistemic uncertainty), and 3) has a distribution that can be numerically validated.

Various approaches have been developed for different surrogate models to quantify the surrogate error δ_s(µ). Surrogate models can be placed into three categories [19]: 1) data fits, 2) lower-fidelity models, and 3) reduced-order models. Data fits employ supervised machine-learning methods (e.g., Gaussian processes, polynomial interpolation [21]) to directly model the high-fidelity input–output map.
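To make the role of the error model concrete, the following minimal sketch shows how a statistical surrogate-error model would enter the likelihood in Eq. (1). It is an illustration, not the paper's implementation: the callables s_surr and error_model are hypothetical stand-ins for a surrogate and for the error surrogate constructed later, and a Gaussian error model is assumed so that it can be marginalized analytically.

```python
import numpy as np
from scipy.stats import norm

def log_likelihood(mu, s_bar, s_surr, error_model, sigma_eps):
    """Gaussian log-likelihood of a measured output s_bar given inputs mu.

    s_surr(mu)      -> cheap surrogate prediction (hypothetical callable)
    error_model(mu) -> (mean, variance) of a statistical model of the
                       surrogate error delta_s(mu) (hypothetical callable)

    With s_bar = s_surr(mu) + delta_s(mu) + eps and a Gaussian error model
    independent of the noise eps ~ N(0, sigma_eps**2), the error is
    marginalized analytically: the variances simply add.
    """
    mean_err, var_err = error_model(mu)
    return norm.logpdf(s_bar, loc=s_surr(mu) + mean_err,
                       scale=np.sqrt(var_err + sigma_eps**2))
```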
Within this class of surrogates, it is possible to statistically model the error for stochastic-process data fits, as a prediction for inputs µ yields a mean s_surr(µ) and a mean-zero distribution for the error δ_s(µ) that can be associated with epistemic uncertainty. While such models are (unbeatably) fast to query and non-intrusive to implement (their construction requires only black-box evaluations of the input–output map of the high-fidelity model), they suffer from the curse of dimensionality and lack access to the underlying model's physics, which can hinder predictive robustness.

Lower-fidelity models simply replace the high-fidelity model with a 'coarsened' model obtained, for example, by neglecting physics, coarsening the mesh, or employing lower-order finite elements. While such models remain physics based, they often realize only modest computational savings. For such problems, 'multifidelity correction' methods have been developed, primarily in the optimization context. These techniques model the mapping µ ↦ δ_s using a data-fit surrogate; they either enforce 'global' zeroth-order consistency between the corrected surrogate prediction and the high-fidelity prediction at training points [23, 29, 33, 39, 35], or 'local' first- or second-order consistency at trust-region centers [2, 19]. Such approaches tend to work well when the surrogate-model error exhibits a lower variance than the high-fidelity response [35] and the input-space dimension is small.

Reduced-order models (ROMs) employ a projection process to reduce the state-space dimensionality of the high-fidelity computational model. Although intrusive to implement, such physics-based surrogates often lead to more significant computational gains than lower-fidelity models, and to higher robustness than data fits.
Table 1. Features of different surrogate models.

|                         | ROM  | data fits | multifidelity correction | ROM + ROMES |
| non-intrusive           | ×    | ✓         | ✓                        | ×           |
| output-error correction | ×    | N/A       | ✓                        | ✓           |
| rigorous error bounds   | ✓    | ×         | ×                        | (✓)*        |
| tight error bounds      | (✓)† | ×         | ×                        | ✓           |

* probabilistically rigorous
† good effectivity can only be obtained with very intrusive methods
For such models, error analysis has been limited primarily to computing rigorous a posteriori error bounds ∆_s(µ) satisfying |δ_s(µ)| ≤ ∆_s(µ) [10, 26, 42]. Especially for nonlinear problems, however, these error bounds are often highly ineffective, i.e., they overestimate the actual error by orders of magnitude [17]. To overcome this shortcoming and obtain tighter bounds, the ROM must be equipped with complex machinery that both increases the computational burden [47, 30] and is intrusive to implement (e.g., it reformulates the discretization of the high-fidelity model [45, 48]). Further, rigorous bounds are not directly useful for uncertainty-quantification (UQ) problems, where a statistical error model that is unbiased, has low variance, and is stochastic is more useful. Recent work [35, Section IV.D] has applied multifidelity correction to ROMs. However, the method did not succeed, because the ROM error is often a highly oscillatory function of the inputs and therefore typically exhibits a higher variance than the high-fidelity response.

In this paper, we introduce the ROM Error Surrogates (ROMES) method, which aims to combine the utility of multifidelity correction with the computational efficiency and robustness of reduced-order modeling. Table 1 compares the proposed approach with existing surrogate-modeling techniques. Similar to the multifidelity-correction approach, we aim to model the ROM error δ_s using a data-fit surrogate. However, as directly approximating the mapping µ ↦ δ_s is ineffective for ROMs, we instead exploit the following key observation: ROMs often generate a small number of physics-based, cheaply computable error indicators ρ : P → R^q that correlate strongly with the true error δ_s(µ). Examples of indicators include the residual norm, dual-weighted residuals, and the rigorous error bounds discussed above. To this end, ROMES approximates the low-dimensional, well-behaved mapping ρ(µ) ↦ δ_s(µ) using Gaussian-process regression, which is a stochastic-process data-fit method. Note that ROMES constitutes a generalization of the multifidelity-correction approach, as the inputs (or features) of the error model can be any user-defined error indicators; they need not be the system inputs µ. Figure 1 depicts the propagation of information for the proposed method. In addition to constructing an error surrogate for the system outputs, ROMES can also be used to construct a statistical model for the norm of the error in the system state. Further, ROMES can be used to generate error bounds with 'probabilistic rigor', i.e., an error bound that overestimates the error with a specified probability.

Next, Section 2 introduces the problem formulation and provides a general but brief introduction to model reduction. In Section 3, we introduce the ROMES method, including its objectives, ingredients, and some choices of these ingredients for particular errors. Section 4 briefly summarizes the Gaussian-process kernel method [41] and the relevance vector machine
[44], which are the two machine-learning algorithms we employ to construct the ROMES surrogates. However, the ROMES methodology does not rely on these two techniques: any supervised machine-learning algorithm that generates a stochastic process can be used, as long as it generates a statistical model that meets the conditions described in Section 3.1. Section 5 analyzes the performance of the method when applied with the reduced-basis method to solve Poisson's equation in two dimensions using nine system inputs. The method is also compared with existing rigorous error bounds for normed errors, and with the multifidelity-correction approach for errors in general system outputs.

For additional information on the reduced-basis method, including the algorithms to generate the reduced-basis spaces and the computation of error bounds, we refer to the supplementary Section A.

[Figure 1. ROMES method: the input µ feeds the reduced-order model, which produces the output s_red(µ) := g(u_red(µ)) and the error indicators ρ(µ); ROMES maps the indicators to error surrogates for the normed state error or the output error δ̃_s(ρ(µ)). The output quantities of interest can be 'corrected' by adding the ROM error surrogate to the ROM output prediction, i.e., s(µ) ≈ s_corr(µ) := s_red(µ) + δ̃_s(ρ(µ)).]

2. Problem formulation
This section details aspects of the high-fidelity and reduced-order models that are important for the ROMES surrogates. We begin with a formulation of the high-fidelity model in Section 2.1 and the reduced-order model in Section 2.2. Finally, we elaborate on the errors introduced by the model-reduction process, and possible problems with their quantification, in Section 2.3.
2.1. High-fidelity model.
Consider solving a parameterized system of equations

(2) r(u; µ) = 0,

where u : P → R^n denotes the state implicitly defined by Eq. (2), µ ∈ P ⊂ R^{n_µ} denotes the system inputs, and r : R^n × R^{n_µ} → R^n denotes the residual operator. This model is appropriate for stationary problems, e.g., those arising from the finite-element discretization of elliptic PDEs. For simplicity, assume we are interested in computing a single output

(3) s(µ) := g(u(µ))

with s : R^{n_µ} → R and g : R^n → R.

When the dimension n of the high-fidelity model is 'large', computing the system output s by first solving Eq. (2) and subsequently applying Eq. (3) can be prohibitively expensive. This is particularly true for many-query problems arising in UQ such as Bayesian inference, which may require thousands of input–output map evaluations µ ↦ s.
2.2. Reduced-order model.
Model-reduction techniques aim to reduce the burden of solving Eq. (2) by employing a projection process. First, they execute a computationally expensive offline stage (e.g., solving Eq. (2) for a training set µ ∈ P_train ⊂ P) to construct 1) a low-dimensional trial basis (in matrix form) V ∈ R^{n×p} with p ≪ n that (hopefully) captures the behavior of the state u throughout the parameter domain P, and 2) an associated test basis W ∈ R^{n×p}. Then, during the computationally inexpensive online stage, these methods approximately solve Eq. (2) for arbitrary µ ∈ P by searching for solutions in the trial subspace range(V) ⊂ R^n and enforcing orthogonality of the residual r to the test subspace range(W) ⊂ R^n:

(4) Wᵀ r(V û; µ) = 0.

Here, the state is approximated as u_red(µ) := V û(µ), and the reduced state û(µ) ∈ R^p is implicitly defined by Eq. (4). The ROM-predicted output is then s_red(µ) := g(u_red(µ)). When the residual operator is nonlinear in the state or non-affine in the inputs, additional complexity-reduction approximations such as empirical interpolation [5, 24], collocation [31, 3, 43], discrete empirical interpolation [15, 22, 18], or gappy proper orthogonal decomposition (POD) [13, 14] are required to ensure that computing the low-dimensional residual Wᵀr incurs an n-independent operation count. In this case, the residual is approximated as r̃ ≈ r and the reduced-order equations become

(5) Wᵀ r̃(V û; µ) = 0.

When the output operator is nonlinear and the vector ∂g/∂u is dense, approximations in the output calculation are also required to ensure an n-independent operation count.

Section A describes in detail the construction of a reduced-order model using the reduced-basis method applied to a parametrically coercive, affine, linear, elliptic PDE.
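The following minimal sketch illustrates the online projection step of Eq. (4) via Newton's method on the p-dimensional reduced system. It assumes hypothetical callables res and jac for the high-fidelity residual and its state Jacobian, and it omits the hyper-reduction of Eq. (5), so the residual evaluation shown here is not n-independent.

```python
import numpy as np

def solve_rom(res, jac, V, W, mu, n_iter=20, tol=1e-10):
    """Petrov-Galerkin ROM solve of Eq. (4): W^T r(V u_hat; mu) = 0.

    res(u, mu) -> residual vector in R^n; jac(u, mu) -> Jacobian dr/du in
    R^{n x n} (hypothetical stand-ins for the high-fidelity operators).
    V, W are the n x p trial and test bases."""
    p = V.shape[1]
    u_hat = np.zeros(p)
    for _ in range(n_iter):
        r_red = W.T @ res(V @ u_hat, mu)        # reduced residual W^T r
        if np.linalg.norm(r_red) < tol:
            break
        J_red = W.T @ jac(V @ u_hat, mu) @ V    # reduced Jacobian W^T (dr/du) V
        u_hat -= np.linalg.solve(J_red, r_red)  # Newton update in R^p
    return V @ u_hat                            # u_red(mu) = V u_hat(mu)
```

For a linear residual operator, the loop terminates after a single Newton step, recovering the usual reduced linear solve.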
2.3. Reduced-order-model error bounds. One is typically interested in quantifying two types of error incurred by model reduction: the state-space error δ_u(µ) := u(µ) − u_red(µ) ∈ R^n and the output error δ_s(µ) := s(µ) − s_red(µ) ∈ R. In particular, many ROMs are equipped with computable, rigorous error bounds for these quantities [37, 25, 36, 11, 17]:

(6) ∆_u(µ) ≥ ‖δ_u(µ)‖,  ∆_s(µ) ≥ |δ_s(µ)|.

In cases where the norm of the residual operator can be estimated tightly, lower bounds also exist:

(7) ∆_u^LB(µ) ≤ ‖δ_u(µ)‖,  ∆_s^LB(µ) ≤ |δ_s(µ)|.

The performance of an upper bound is usually quantified by its effectivity, i.e., the factor by which it overestimates the true error:

(8) η_u(µ) := ∆_u(µ)/‖δ_u(µ)‖ ≥ 1,  η_s(µ) := ∆_s(µ)/|δ_s(µ)| ≥ 1.

The closer these values are to 1, the 'tighter' the bound. For coercive PDEs, these effectivities can be controlled by choosing a tight lower bound of the coercivity constant. While this can be easily accomplished for stationary, linear problems, it is difficult to find tight lower bounds in almost all other cases. In fact, the resulting bounds often overestimate the error by orders of magnitude [42, 17]. Because effectivity is critically important in practice, various efforts have been undertaken to improve the tightness of the bounds.

[Figure 2. Relationship between RB error bounds ∆_u, residual norms ‖r(V û; µ)‖, and the true state-space errors |||δ_u|||, visualized by evaluating 200 random sample points in the input space. Here, |||·||| denotes the energy norm defined in Section 5.1.]
Huynh et al. [30] developed the successive constraint method for this purpose; the method approximates the coercivity-constant lower bounds by solving small linear programs online, which depend on additional expensive offline computations. Alternatively, Refs. [45, 48] reformulate the entire discretization of time-dependent problems using a space–time method that improves the error bounds by incorporating solutions to dual problems. Another approach [47] aims to approximate the coercivity constant by eigenvalue analysis of the reduced system matrices. These methods all inflate the offline and online computation times and often incur intrusive changes to the high-fidelity-model implementation.

Regardless of effectivity, rigorous bounds satisfying inequalities (6) are not directly useful for quantifying the epistemic uncertainty incurred by employing the ROM. Rather, a statistical model that reflects our knowledge of these errors would be more appropriate. For such a model, the mean of the distribution would provide an expected error; the variance would provide a notion of epistemic uncertainty. The most straightforward way to achieve this would be to model the error as a uniform probability distribution on an interval whose boundaries correspond to the lower and upper bounds. Unfortunately, such an approach leads to wide, uninformative intervals when the bounds suffer from poor effectivity; this will be demonstrated in the numerical experiments.

Instead, we exploit the following observation: error bounds tend to be strongly correlated with the true error. Figure 2 depicts this observed structure for a reduced-basis ROM applied to an elliptic PDE (see Section 5.1 for details). On a logarithmic scale, the true error exhibits a roughly linear dependence on both the bound and the residual norm, and the variance of the data is fairly constant. As will be shown in Section 5.3, employing a multifidelity-correction approach wherein the error is modeled as a function of the inputs µ does not work well for this example, both because the input-space dimension is large (nine) and because the error is a highly oscillatory function of these inputs.
Therefore, we propose constructing a stochastic process that maps such error indicators to a random variable for the error. For this purpose, we employ Gaussian-process regression. The approach leverages one strength of ROMs compared to other surrogate models: ROMs generate strong 'physics-based' error indicators (e.g., error bounds) in addition to output predictions. The next section describes the proposed method.
3. The ROMES method
The objective of the ROMES method is to construct a statistical model of the deterministic, but generally unknown, ROM error δ(µ), with δ : P → R denoting an R-valued error that may represent, for example, the norm of the state-space error ‖δ_u‖, the output error δ_s, or its absolute value |δ_s|. The distribution of the random variable representing the error should reflect our (epistemic) uncertainty about its value. We assume that we can employ a set of training points δ(µ_n), n = 1, ..., N, to construct this model.
3.1. Statistical model. Define a probability space (Ω, F, P). We seek to approximate the deterministic mapping m : µ ↦ d(δ(µ)) by a stochastic mapping m̃ : ρ(µ) ↦ d̃, with d̃ : Ω → R a real-valued random variable. Here, d : R → R is an invertible transformation function (e.g., the logarithm) that can be specified to facilitate construction of the statistical model. We can then interpret the statistical model of the error as a random variable δ̃ : ρ ↦ d⁻¹(m̃(ρ)).

Three ingredients must be selected to construct this mapping m̃: 1) the error indicators ρ, 2) the transformation function d, and 3) the methodology for constructing the statistical model from the training data. We will make these choices such that the stochastic mapping satisfies the following conditions:

(1) The indicators ρ(µ) are cheaply computable and low dimensional for any µ ∈ P. In practice, they should also incur a reasonably small implementation effort, e.g., not require modifying the underlying high-fidelity model.

(2) The mapping m̃ exhibits low variance, i.e., E[(m̃(ρ(µ)) − E[m̃(ρ(µ))])²] is 'small' for all µ ∈ P. This ensures that little additional epistemic uncertainty is introduced.

(3) The mapping m̃ is validated:

(9) ω_validation(ω) ≈ ω, ∀ω ∈ [0, 1],

where ω_validation(ω) is the frequency with which validation data lie in the ω-confidence interval predicted by the statistical model:

(10) ω_validation(ω) := card({µ ∈ P_validation | d(δ(µ)) ∈ C_ω(µ)}) / card(P_validation).

Here, the validation set P_validation ⊂ P should not include any of the points µ_n, n = 1, ..., N, employed to train the error surrogate, and the confidence interval C_ω(µ) ⊂ R, which is centered at the mean of m̃(ρ(µ)), is defined for all µ ∈ P such that

(11) P[m̃(ρ(µ)) ∈ C_ω(µ)] = ω.

In essence, validation assesses whether or not the data do indeed behave as random variables with the probability distributions predicted by the statistical model; a sketch of this check appears after this list.
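A minimal sketch of the validation check in condition 3, assuming the statistical model returns Gaussian predictions N(ν(ρ(µ)), σ²(ρ(µ))) at the validation points, as the methods of Section 4 do; the array names are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def validation_frequency(d_true, nu, sigma, omega):
    """Empirical estimate of Eq. (10): the fraction of validation points whose
    transformed error d(delta(mu)) falls inside the central omega-confidence
    interval C_omega(mu) of the Gaussian prediction N(nu, sigma**2).

    d_true, nu, sigma: arrays over the validation set."""
    z = norm.ppf(0.5 * (1.0 + omega))      # half-width in standard deviations
    inside = np.abs(d_true - nu) <= z * sigma
    return inside.mean()

# A validated model satisfies validation_frequency(...) ≈ omega for all omega,
# which is condition (9).
```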
The next section describes the proposed methodology for selecting the indicators ρ and the transformation function d. For constructing the mapping m̃ from training points, we will employ the two supervised machine-learning algorithms described in Section 4: the Gaussian-process (GP) kernel method and the relevance vector machine (RVM). Note that these are merely guidelines for model construction, as there are usually no strong analytical tools to prove that the mapping behaves according to a certain probability distribution. Therefore, any choice must be computationally validated according to condition 3 above.

3.2. Choosing indicators and transformation function.
The class of multifidelity-correction algorithms can be cast within the framework proposed in Section 3.1. In particular, when a stochastic process is used to model additive error, these methods are equivalent to the proposed construction with ingredients δ = δ_s, ρ = µ, and d = id_R, where id_R(x) = x, ∀x ∈ R, is the identity function over R. However, as previously discussed, the mapping µ ↦ δ_s can be highly oscillatory and non-smooth for reduced-order models; further, this approach is infeasible for high-dimensional input spaces, i.e., for n_µ large. This was shown by Ng and Eldred [35, Section IV.D]; we also demonstrate it in the numerical experiments of Section 5.3.

Note that all indicators and errors proposed in this section should be scaled (e.g., via linear transformations) in practice such that they exhibit roughly the same range. This 'feature scaling' task is common in machine learning and is particularly important when the ROMES surrogate employs multiple indicators.

3.2.1. Normed and compliant outputs.
As discussed in Section 2.3, many ROMs are equipped with bounds for normed errors. Further, there is often a strong, well-behaved relationship between such error bounds and the normed error (see Figure 2). In the case of compliant outputs, the error is always non-negative, i.e., δ_s = |δ_s| (see Section A.3), so we can treat this error as a normed error.

To this end, we propose employing error bounds as indicators for the errors in the compliant output |δ_s| and in the state ‖δ_u‖. However, because the bound effectivity often lies in a small range (even for a large range of errors) [42], employing a logarithmic transformation is appropriate. To see this, consider a case where the effectivity η of the error bound, defined as

(12) η(µ) := ∆(µ)/δ(µ) ≥ 1, ∀µ ∈ P,

lies within a range η_min ≤ η(µ) ≤ η_max, ∀µ ∈ P. Then the relationship between the error bound and the true error is

(13) ∆(µ)/η_min ≥ δ(µ) ≥ ∆(µ)/η_max,
(14) log ∆(µ) − log η_min ≥ log δ(µ) ≥ log ∆(µ) − log η_max

for all µ ∈ P. In this case, one would expect an affine model mapping log ∆(µ) to log δ(µ) with constant Gaussian noise to accurately capture the relationship. That is, employing a logarithmic transformation permits the use of simpler surrogates that assume a constant variance in the error variable. Therefore, we propose employing ρ = log ∆ and d = log for statistical modeling of normed errors and compliant outputs.
A less computationally expensive candidate for an indicator is simply the logarithm of the residual norm, ρ = log r, where r is the Euclidean norm of the residual vector

(15) r(µ) := ‖r(V û(µ); µ)‖.

For more information on the efficient computation of (15), we refer to Section A.3.1. One would expect this choice of indicator to produce a model similar to that produced by the logarithm of the error bound: the error bound is often equal to the residual norm divided by a (costly-to-compute) coercivity-constant lower bound (see Section A.3). Further, employing the residual norm leads to a model that is less sensitive to strong variations in this approximated lower bound.

Returning to the example depicted in Figure 2, the relationship between the error bound and the energy norm of the state error in log-log space is roughly linear, and the variance is relatively small. The same is true of the relationship between the (computationally cheaper) residual norm and the true error. As expected, these relationships can be accurately modeled as a stochastic process with a linear mean function and constant variance (more details in Section 5.1). Here, strong candidates for ROMES error indicators include ρ(µ) := r(µ), ρ(µ) := ∆(µ), and ρ(µ) := (r(µ), ∆(µ)). In the experiments in Section 5, we will consider the first choice, which is the least expensive and intrusive option, yet leads to excellent results. For cases where the data are less well behaved, more error indicators can be included, e.g., linear combinations of inputs or the output prediction.

Unfortunately, this set of ROMES ingredients is not applicable to errors in general outputs of interest, because the logarithmic transformation function assumes strictly positive errors. The next section presents a strategy for handling this general case.

Remark 3.1 (Log-normal distribution). In the case where d = log, the error models δ̃(µ), µ ∈ P, are random variables with a log-normal distribution. If one is interested in the most probable error, one might think to use the expected value of δ̃. However, the maximum of the probability density function of a log-normally distributed random variable is attained at its mode, which is less than the expected value. We therefore use mode(δ̃) if scalar values for the estimation of the output error or the reduced-state error are required.
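A short illustration of Remark 3.1, assuming a fitted model m̃(ρ) ∼ N(ν, σ²) on the log scale (d = log); the formulas for the mode and mean of a log-normal variable are standard.

```python
import numpy as np

def lognormal_point_estimates(nu, sigma2):
    """Point estimates of the error model delta_tilde = exp(m_tilde) when
    m_tilde ~ N(nu, sigma2), i.e., when d = log. The mode exp(nu - sigma2)
    maximizes the log-normal density and is smaller than the mean
    exp(nu + sigma2 / 2); Remark 3.1 advocates using the mode."""
    mode = np.exp(nu - sigma2)
    mean = np.exp(nu + 0.5 * sigma2)
    return mode, mean
```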
3.2.2. General outputs. This section describes the ROMES ingredients we propose for modeling the error δ_s in a general output s(µ) := g(u(µ)). Dual-weighted-residual approaches are commonly adopted for approximating general output errors in the context of a posteriori adaptivity [20, 4, 7, 40, 46, 32], model-reduction adaptivity [12], and model-reduction error estimation [38, 11, 45, 34]. The latter references compute adjoint solutions in order to improve the accuracy of ROM output-error bounds. The computation of these adjoint solutions entails a low-dimensional linear solve; thus, they are efficiently computable and can potentially serve as error indicators for the ROMES method.

The main idea of dual-weighted-residual error estimation is to approximate the output error to first order using the solution to a dual problem. For notational simplicity in this section, we drop dependence on the inputs µ.

To begin, we approximate the output arising from the (unknown) high-fidelity state u to first order about the ROM-approximated state V û:

(16) g(u) ≈ g(V û) + (∂g/∂u)(V û)(u − V û)

with g : R^n → R and ∂g/∂u : R^n → R^{1×n}. Similarly, we can approximate the residual to first order about the approximated state as

(17) 0 = r(u) ≈ r(V û) + (∂r/∂u)(V û)(u − V û),

where r : R^n → R^n and ∂r/∂u : R^n → R^{n×n}. Solving for the error yields

(18) (u − V û) ≈ −[(∂r/∂u)(V û)]⁻¹ r(V û).

Substituting (18) in (16) leads to

(19) g(u) − g(V û) ≈ yᵀ r(V û),

where the dual solution y ∈ R^n satisfies

(20) (∂r/∂u)ᵀ(V û) y = −(∂g/∂u)ᵀ(V û).

Approximation (19) is first-order accurate; therefore, it is exact in the case of linear outputs and a linear residual operator. In the general nonlinear case, this approximation is accurate in a neighborhood of the ROM-approximated state V û.

Because we would like to avoid high-dimensional solves, we approximate y by the reduced-order dual solution y_red := Y ŷ ∈ R^n, where ŷ satisfies

(21) Yᵀ (∂r/∂u)ᵀ(V û) Y ŷ = −Yᵀ (∂g/∂u)ᵀ(V û),

and Y ∈ R^{n×p_y} with p_y ≪ n is a reduced basis (in matrix form) for the dual system. Section A.3.2 provides details on the construction of Y for elliptic PDEs. Substituting the approximation y_red for y in (19) yields a cheaply computable error estimate

(22) g(u) − g(V û) ≈ y_redᵀ r(V û).

This relationship implies that one can construct an accurate, cheaply computable ROMES model for the general-output error δ = δ_s = g(u) − g(V û) by employing the indicator ρ = y_redᵀ r(V û) and the transformation function d = id_R, the identity function over R.

Remark 3.2 (Uncertainty control for dual-weighted-residual error indicators). The accuracy of the reduced-order dual solution can be controlled by changing p_y, the dimension of the dual basis Y. In general, one would expect an increase in p_y to lead to a lower-variance ROMES surrogate at the expense of a higher-dimensional dual problem (21). The experiments in Section 5.6 highlight this uncertainty-control attribute.
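A minimal sketch of computing the dual-weighted-residual indicator of Eqs. (21)–(22); res, jac_u, and dg_du are hypothetical callables for the high-fidelity residual, its state Jacobian, and the output gradient. A practical implementation would precompute the reduced operators to avoid the n-dimensional products shown here.

```python
import numpy as np

def dwr_indicator(res, jac_u, dg_du, V, Y, u_hat, mu):
    """Dual-weighted-residual error indicator rho = y_red^T r(V u_hat; mu).

    V: n x p primal reduced basis; Y: n x p_y dual reduced basis;
    u_hat: reduced state from the ROM solve of Eq. (4)."""
    u = V @ u_hat
    A = jac_u(u, mu)                             # dr/du at the ROM state
    rhs = -Y.T @ dg_du(u)                        # reduced dual right-hand side
    y_hat = np.linalg.solve(Y.T @ A.T @ Y, rhs)  # Eq. (21): reduced dual solve
    y_red = Y @ y_hat
    return y_red @ res(u, mu)                    # Eq. (22): first-order estimate
```

Increasing the number of columns p_y of Y tightens the dual approximation, which is exactly the uncertainty-control knob described in Remark 3.2.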
Clearly, the ROMES surrogate does notstrictly bound the error, even when error bounds are used as indicators. That is, the meanprobability of overestimation is generally less than one(23) c := 1 |P| (cid:90) µ ∈P P[ ˜ m ( ρ ( µ )) > d ( δ ( µ ))] d µ < .
This frequency of overestimation depends on the probability distribution of the random variable m̃(ρ). Using the machine-learning methods proposed in the next section, we infer normally distributed random variables

(24) m̃(ρ) ∼ N(ν(ρ), σ²(ρ))

with mean ν(ρ) and variance σ²(ρ). If the model is perfectly validated, then the mean probability of overestimation is c = 0.5.
However, knowledge about the distribution of the random variable can be used to control the overestimation frequency. In particular, the modified surrogate

(25) m̃_c(ρ) := m̃(ρ) + m_LB(ρ, c)

enables probabilistic rigor: it bounds the error with mean specified probability c, assuming the model is perfectly validated. Here, m_LB fulfills

(26) P[X ≤ m_LB(ρ, c)] = c, for X ∼ N(0, σ²(ρ)).

This value can be computed as

(27) m_LB(ρ, c) = √(2σ²(ρ)) erf⁻¹(2c − 1),

where erf⁻¹ is the inverse of the error function.
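A sketch of evaluating the offset (27), under the sign convention adopted above (P[X ≤ m_LB] = c); it uses SciPy's inverse error function.

```python
import numpy as np
from scipy.special import erfinv

def rigor_offset(sigma2, c):
    """Offset m_LB(rho, c) of Eq. (27): for X ~ N(0, sigma2),
    P[X <= m_LB] = c, so shifting the surrogate mean by m_LB makes it
    overestimate a perfectly validated error with probability c."""
    return np.sqrt(2.0 * sigma2) * erfinv(2.0 * c - 1.0)

# Example: c = 0.5 gives a zero offset (the unmodified surrogate), while
# c = 0.9 shifts the mean up by about 1.28 standard deviations.
```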
4. Gaussian processes

This section describes the two methods we employ to construct the stochastic mapping m̃ : ρ(µ) ↦ d̃:
• Gaussian-process kernel regression (i.e., kriging) [41], and
• the relevance vector machine (RVM) [44].

Both methods are examples of supervised learning methods that generate a stochastic process from a set of N training points for the independent variables x := (x_n)_{n=1}^N and the dependent variable y := (y_n)_{n=1}^N. Using these training data, the methods generate predictions ỹ(x*_m; θ_ML), m = 1, ..., M, associated with a set of M prediction points x* := (x*_m)_{m=1}^M. Here, θ_ML denotes hyperparameters that are inferred using a Bayesian approach; the predictions are random variables with a multivariate normal distribution.

In the context of ROMES, the independent variables correspond to error indicators, x_n = ρ(µ_n) with µ_n ∈ P, n = 1, ..., N, and the dependent variable corresponds to the (transformed) reduced-order-model error, y_n = d(δ(µ_n)), n = 1, ..., N. To make this paper as self-contained as possible, the following sections briefly present and compare the two approaches.

4.1. GP kernel method.
A Gaussian process is defined as a collection of random variables such that any finite number of them has a joint Gaussian distribution. The GP kernel method constructs this Gaussian process via Bayesian inference using the training data and a specified kernel function. To begin, the prior distribution is set to

(28) ỹ_prior(x̄) ∼ N(0, K(x̄, x̄) + σ_ε² I_{N+M})

with x̄ := (x̄_i)_{i=1}^{N+M} = (x, x*). Here, the GP kernel method assumes that the covariance between any two points can be described analytically by a kernel k with additive noise ε ∼ N(0, σ_ε² I_{N+M}) such that

(29) K(x̄, x̄) = (k(x̄_i, x̄_j))_{1 ≤ i,j ≤ N+M}.

In this work, we employ the most commonly used squared-exponential covariance kernel

(30) k(x_i, x_j) = exp(−‖x_i − x_j‖² / (2l²)),

which induces high correlation between geometrically nearby points. Here, l ∈ R is the 'width' hyperparameter.

Assuming the predictions are generated as independent samples from the stochastic process (typically in the GP literature, predictions at all points x* are generated simultaneously as a single sample from the Gaussian process; in this work, we treat all predictions as arising from independent samples of the GP), the GP kernel method then generates predictions for each point x* ∈ x*. These predictions correspond to random variables with posterior distributions ỹ(x*; θ) ∼ N(ν(x*), σ²(x*)) with

(31) ν(x*) = K(x*, x)(K(x, x) + σ_ε² I_N)⁻¹ y,
(32) σ²(x*) = Σ(x*) + σ_ε²,
(33) Σ(x*) = K(x*, x*) − K(x*, x)(K(x, x) + σ_ε² I_N)⁻¹ K(x, x*).

More details on the derivation of these expressions can be found in Ref. [41, Ch. 2.2].

The hyperparameters θ := (l, σ_ε) can be set to the maximum-likelihood values θ_ML computed as the solution to the optimization problem

(34) θ_ML = arg max_θ L(θ)

with the log-likelihood function defined as

(35) L(l, σ_ε) = −(1/2) yᵀ (K(x, x; l) + σ_ε² I_N)⁻¹ y − (1/2) log|K(x, x; l) + σ_ε² I_N| − (N/2) log 2π.

For details on the derivation of the log-likelihood function and problem (34), we refer to Ref. [41, Ch. 5.4].
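A minimal sketch of the posterior formulas (31)–(33) for scalar indicators, assuming the squared-exponential kernel (30) and fixed hyperparameters; in practice these would be set by maximizing (35) with a generic optimizer.

```python
import numpy as np

def gp_predict(x_train, y_train, x_star, ell, sigma_eps2):
    """Posterior mean nu(x*) and variance sigma^2(x*) of Eqs. (31)-(33).

    x_train, x_star: 1-D arrays of indicator values; ell is the kernel
    width l; sigma_eps2 is the noise variance."""
    def k(a, b):                                  # squared-exponential kernel (30)
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * ell ** 2))
    K = k(x_train, x_train) + sigma_eps2 * np.eye(len(x_train))
    K_star = k(x_star, x_train)
    nu = K_star @ np.linalg.solve(K, y_train)     # Eq. (31)
    Sigma = (k(x_star, x_star).diagonal()         # Eq. (33), diagonal only
             - np.einsum('ij,ij->i', K_star,
                         np.linalg.solve(K, K_star.T).T))
    return nu, Sigma + sigma_eps2                 # Eq. (32)
```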
Remark 4.1.
The noise component σ_ε² of the posterior variance σ²(x*) accounts for uncertainty in the assumed GP structure. It plays a crucial role for the ROMES method: it accounts for the non-uniqueness of the mapping ρ ↦ δ, as it is possible that δ(µ_i) ≠ δ(µ_j) even if ρ(µ_i) = ρ(µ_j). In particular, this noise component represents the 'information loss' incurred by employing the error indicators in lieu of the system inputs as independent variables in the GP. Therefore, this component can be interpreted as the inherent uncertainty in the error due to the non-uniqueness of the mapping ρ ↦ δ.

On the other hand, the remaining term Σ(x*) of the posterior variance quantifies the uncertainty in the mean prediction. This decreases as the number of training points increases. Therefore, Σ(x*) can be interpreted as the uncertainty due to a lack of training data.

For example, the multifidelity-correction approach employs ρ = µ and therefore should be characterized by σ_ε² = 0, as the mapping µ ↦ δ is unique. However, due to the high-dimensional nature of the system-input space P in many problems, the uncertainty due to lack
of training data Σ(x*) can be very large unless many training points are employed. On the other hand, the ROMES method aims to significantly reduce Σ(x*) by employing a small number of indicators, albeit at the cost of a nonzero σ_ε².

In light of this remark, we will employ two different types of ROMES models: one that includes the uncertainty due to a lack of training data (i.e., variance σ²(x*)), and one that neglects this uncertainty (i.e., variance σ_ε²).

4.2. Relevance vector machine (RVM) method.
The RVM [44] is based on a parameterized discretization of the predictive random variable

(36) ỹ(x) = Σ_{k=1}^K w_k φ_k(x) + ε = φ(x)ᵀ w + ε,

with specified basis functions φ(x) := [φ_1(x) ··· φ_K(x)]ᵀ ∈ R^K, a corresponding set of random variables w := [w_1 ··· w_K]ᵀ ∈ R^K with w_k ∼ N(0, β_k) for k = 1, ..., K, and noise ε ∼ N(0, σ_ε²). The hyperparameters β = [β_1 ··· β_K]ᵀ ∈ R^K define the prior probability distribution and are usually chosen by likelihood maximization over the training samples.

Radial basis functions

(37) φ_k^RBF(x) = exp(−‖x̄_k − x‖² / (2r²)), k = 1, ..., K,

constitute the most common choice of basis functions. For the ROMES method, we often expect a linear relationship between the indicators and true errors, likely with a small-magnitude, high-order polynomial deviation. Therefore, we also consider Legendre polynomials [1, Ch. 8]

(38) φ_k^Leg(x) = P_k(x), k = 1, ..., K.

Note that both sets of basis functions depend on the training data: while the centering points x̄_k, k = 1, ..., K, in the radial basis functions can be chosen arbitrarily, they are typically chosen to be equal to the training points. The domain of the Legendre polynomials, on the other hand, is nominally [−1, 1], so the data must be rescaled such that the range of (x, x*) is included in this interval.

The RVM method also employs a Bayesian approach to construct the model from training data. In particular, the vector of hyperparameters β affects the variance of the Gaussian random variables w. If these hyperparameters are computed by a maximum-likelihood or similar optimization algorithm, large values for these hyperparameters identify insignificant components that can be removed. Therefore, in the ROMES context, the RVM can be used to filter out the least significant error indicators. Apart from this detail, the RVM can be considered a special case of the GP kernel method with kernel

(39) k(x_i, x_j) = Σ_{k=1}^K β_k φ_k(x_i) φ_k(x_j).
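The following sketch implements the predictive model (36) with Legendre basis functions (38) for fixed hyperparameters β and σ_ε²; it uses the standard conjugate Bayesian linear-regression formulas and omits the evidence-maximization loop that gives the RVM its pruning behavior. Inputs are assumed already rescaled to [−1, 1].

```python
import numpy as np
from numpy.polynomial import legendre

def rvm_predict(x_train, y_train, x_star, beta, sigma_eps2):
    """Predictive mean and variance for the model (36) with Legendre basis.

    beta: prior variances of the weights w_k (fixed here; the RVM would tune
    them by evidence maximization, pruning insignificant basis functions)."""
    K = len(beta)
    def design(x):                            # Phi[n, k] = P_k(x_n), Eq. (38)
        return np.stack([legendre.legval(x, np.eye(K)[k]) for k in range(K)],
                        axis=1)
    Phi, Phi_s = design(x_train), design(x_star)
    # Posterior over w (standard conjugate result): covariance S, mean m.
    S = np.linalg.inv(np.diag(1.0 / np.asarray(beta))
                      + Phi.T @ Phi / sigma_eps2)
    m = S @ Phi.T @ y_train / sigma_eps2
    nu = Phi_s @ m                                                 # mean
    var = sigma_eps2 + np.einsum('ij,jk,ik->i', Phi_s, S, Phi_s)   # variance
    return nu, var
```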
[Figure 3. Domain (with Dirichlet boundary Γ_D and Neumann boundaries Γ_N0, Γ_N1) and sample solution u(µ) for the thermal-block problem.]

5. Numerical experiments
This section analyzes the performance of the ROMES method on Poisson's equation with nine system inputs, using the reduced-basis method to generate the reduced-order model. First, Section 5.1 introduces the test problem. Section 5.2 discusses implementation and validation of the ROMES models. Section 5.3 compares the ROMES method to the multifidelity-correction approach, characterized by employing the model inputs as error indicators. Section 5.4 compares the ROMES stochastic error estimate to the error bound given by the reduced-basis method. Section 5.5 compares the performance of the two machine-learning algorithms: the Gaussian-process kernel method and the relevance vector machine. Finally, Section 5.6 considers non-compliant and multiple output functionals, which ROMES handles via dual-weighted-residual error indicators.
5.1. Problem setup.
Consider a finite-element model of heat transport on a square domain Ω := ∪_{i=1}^9 Ω_i composed of nine parameterized materials. The block is cooled along the top boundary to a reference temperature of zero, a nonzero heat flux is specified on the bottom boundary, and the leftmost boundary is adiabatic. The compliant output functional for this problem is defined as the integral over the Neumann domain Γ_N1:

(40) ḡ(u(µ)) = ∫_{Γ_N1} u(µ) dx, µ ∈ P,

where the parameter domain P is a nine-dimensional box of positive parameter values, and u is the continuous representation of the finite-element solution. The state variable u(µ) ∈ X = H¹_{Γ_D} := {w ∈ H¹(Ω) | w|_{Γ_D} = 0} fulfills the weak form of the parameterized Poisson equation: find u(µ) ∈ X such that

(41) a(u(µ), v; µ) = f(v) for all v ∈ X.

Here, the bilinear form a : X × X × P → R and the functional f : X → R are defined as

(42) a(u, v; µ) := ∫_Ω b(x; µ) ∇u · ∇v dx,  f(v) := ∫_{Γ_N1} v dx,

which correspond to the boundary conditions

(43) b(x; µ) ∇u(µ) · n = 0 on Γ_N0,  b(x; µ) ∇u(µ) · n = 1 on Γ_N1.
We define the coefficient function b : Ω × P → R as

(44) b(x; µ) = Σ_{i=1}^9 µ_i 1_{Ω_i}(x),

where µ_i denotes the i-th component of the parameter vector µ, and the indicator function 1_A(x) = 1 if x ∈ A and is zero otherwise. Figure 3 depicts the composition of the domain and the location of the boundary conditions.

By replacing the infinite-dimensional function space X with the (finite) n-dimensional finite-element space X_h ⊂ X in problem (41), one can compute the parameter-dependent state function u_h(µ) ∈ X_h, represented by vectors containing the function's degrees of freedom u(µ) ∈ R^n (see Section A.1). In the experiments, the domain is discretized by triangular finite elements, which results in a high-dimensional finite-element space X_h. The high-fidelity output (in the notation of Section 2.1) is then g(u(µ)) := ḡ(u_h(µ)), µ ∈ P.

As described in Section A.4, we employ a greedy algorithm to generate a reduced-basis space X_red ⊂ X_h of dimension p ≪ n. The algorithm employs a training set of 100 randomly selected points (i.e., card(P_greedy) = 100 in the notation of Section A.4) and iterates until the maximum computed error bound over the training set is less than 1; it stops after p = 11 iterations.

Replacing X_h with X_red in Eq. (41) leads to reduced state functions u_red(µ) ∈ X_red for all µ ∈ P. As before, these solutions can be represented by vectors u_red(µ) := V û(µ) ∈ R^n, where V ∈ R^{n×p} is the discrete representation of a basis for the function space X_red.

In the following, we analyze two types of error: (i) the energy norm of the state-space error |||δ_u||| = |||u − u_red||| := √(a(u_h − u_red, u_h − u_red; µ)), and (ii) the output error δ_s = g(u_h) − g(u_red). Because the output functional in this case is compliant (i.e., g = f and a is symmetric), the output error is always non-negative; see Eq. (70) of Section A.3. For more details regarding the finite-element discretization, the reduced-order-model generation, and the error bounds, consult Section A of the Supplementary Materials.

5.2. ROMES implementation and validation.
We first compute ROMES surrogates for the two errors |||δ_u||| and (compliant) δ_s. As proposed in Section 3.2.1, the three ROMES ingredients we employ are: 1) the log-residual-norm error indicator ρ(µ) = log(r(µ)), 2) a logarithmic transformation function d = log, and 3) both the GP kernel and the RVM supervised machine-learning methods. To train the surrogates, we compute |||δ_u(µ)|||, δ_s(µ), and ρ(µ) for µ ∈ P̄ ⊂ P with card(P̄) = 2000; all reduced-basis computations are conducted with the reduced-basis library RBMatlab. The first N = 100 points comprise the ROMES training set {µ_1, ..., µ_N} =: P_learn ⊂ P̄, and the following 1900 points define a validation set P_validation ⊂ P̄; note that the validation set was not used to construct the error surrogates. Reported results relate to statistics computed over this validation set.

For the kernel method, we employ the squared-exponential covariance kernel (30). For the RVM method, we choose Legendre polynomials P_k as basis functions, as we expect a linear relationship between the indicators and true errors (see Section 4.2). Because Legendre polynomials are defined on the interval [−1, 1],
the indicator data are linearly rescaled such that [−1, 1] corresponds to the interval determined by the smallest and largest indicator values:

(45) [ρ_min − 0.1(ρ_max − ρ_min), ρ_max + 0.1(ρ_max − ρ_min)],

where ρ_min = min_{µ ∈ P_learn} ρ(µ) and ρ_max = max_{µ ∈ P_learn} ρ(µ). We include Legendre polynomials of orders 0 to 4; however, the RVM method typically discards the higher-order polynomials due to the near-linear relation between indicators and errors.

[Figure 4. Visualization of ROMES surrogates (δ = |||δ_u||| and ‖δ_u‖_X, ρ = log r, d = log), computed using N = 100 training points and the (i) GP kernel method and (ii) RVM. Each plot shows the training points, the mean prediction, and the 95% confidence intervals of N(ν, σ_ε²) and N(ν, σ²).]

Figure 4 depicts the ROMES surrogate |||δ̃_u||| generated by both machine-learning methods using all 100 training points. For comparison, we also create ROMES surrogates ‖δ̃_u‖_X for errors in the parameter-independent norm ‖·‖_X of the state space X = H¹_{Γ_D}. In addition to the expected mean of the inferred surrogate, the figure displays two 95%-confidence intervals for the prediction (see Remark 4.1):

• The darker shaded interval corresponds to the confidence interval arising from the inherent uncertainty in the error due to the non-uniqueness of the mapping ρ ↦ |||δ_u|||, i.e., the inferred variance σ_ε² of Eq. (32).
• The lighter shaded interval also includes the 'uncertainty in the mean' due to a lack of training data, i.e., Σ of Eq. (32). With an increasing number of training points, this area should become indistinguishable from the darker one.
All ROMES models find a linear trend between the indicators and the errors, where the variance is slightly larger for the parameter-independent norm. This larger variance can be attributed to the larger range of the coercivity constants associated with the parameter-independent norm (see Section A.3). For this example, however, both ROMES surrogates are functional. In the following examples, we focus on the energy norm only.

Note that the 'uncertainty in the mean' is dominant for the RVM surrogate. This can be explained as follows: the high-order polynomials have values close to zero near the mean of the data. As such, the training data are not very informative for the coefficients of these polynomials, which results in a large inferred variance for those coefficients. Section 5.5 further compares the two machine-learning methods; due to its superior performance, we now proceed with the kernel method.

We now assess the validity of the Gaussian-process assumptions underlying the ROMES surrogates |||δ̃_u||| and δ̃_s, i.e., condition 3 of Section 3.1. From the discussion in Remark 4.1, we know that if the underlying GP model form is correct, then, as the number of training points increases, the uncertainty about the mean decreases and the set {D(µ) | µ ∈ P_validation} with

(46) D(µ) := d(|||δ_u(µ)|||) − E[d(|||δ̃_u|||(ρ(µ)))] = d(|||δ_u(µ)|||) − ν(ρ(µ))

should behave like samples from the distribution N(0, σ_ε²). Figure 5 reports this validation test and verifies that this condition does indeed hold for a sufficiently large number of training points.

Further, we can validate the inferred confidence intervals as proposed in Eq. (9). The table within Figure 5 reports ω_validation(ω) (see Eq. (10)), which represents the frequency of observed predictions in the validation set that lie within the inferred confidence interval C_ω. We declare the ROMES model to be validated, as ω_validation(ω) ≈ ω for several values of ω as the number of training points increases.

The results for the ROMES surrogate δ̃_s are very similar to those presented in Figure 5 and will be further discussed in Section 5.3. Note that the inferred Gaussian process is well converged with a moderately sized training set consisting of only N = 35 points.
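Putting the ingredients together, a minimal end-to-end sketch of the surrogate used in this section (δ = |||δ_u|||, ρ = log r, d = log), reusing the hypothetical gp_predict sketch from Section 4.1; the residual-norm and error arrays are assumed to come from offline ROM solves.

```python
import numpy as np

def train_and_correct(res_train, err_train, res_valid, ell, sigma_eps2):
    """Fit the ROMES map (rho = log r) -> (d = log error) with a GP and
    return a scalar error estimate at the validation points."""
    rho_train = np.log(res_train)    # indicators rho = log r
    d_train = np.log(err_train)      # transformed errors d = log(delta)
    nu, var = gp_predict(rho_train, d_train, np.log(res_valid),
                         ell, sigma_eps2)
    return np.exp(nu - var)          # mode of the log-normal model (Remark 3.1)
```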
5.3. Output error: comparison with multifidelity correction. As discussed in Section 3.2, multifidelity-correction methods construct a surrogate δ̃_s,MF of the output error using the system inputs as error-surrogate inputs, i.e., δ = δ_s, ρ = µ, and d = id_R. In this section, we construct this multifidelity-correction surrogate using the same GP kernel method as ROMES. Ref. [35] demonstrated that this error surrogate fails to improve the 'corrected output' when the low-fidelity model corresponds to a reduced-order model. We now verify this result and show that, in contrast to the multifidelity-correction approach, the ROMES surrogate δ̃_s constructed via the GP kernel method with δ = δ_s, ρ = log r, and d = log yields impressive results: on average, the output 'corrected' by the ROMES surrogate reduces the error by an order of magnitude, and the Gaussian-process assumptions can be validated. The validation quality improves as the number of training points increases, but a moderately sized set of only N = 20 training points leads to a converged surrogate.

The reason multifidelity correction fails for most reduced-order models is twofold. First, the mapping µ ↦ δ_s can be highly oscillatory in the input space. This behavior arises from the fact that the reduced-order-model error is zero at the (greedily chosen) ROM training points but grows (and can grow quickly) away from these points.
[Figure 5. Gaussian-process validation for the ROMES surrogate (GP kernel, δ = |||δ_u|||, ρ = log r, d = log) with a varying number of training points N (N = 10, 35, 65, 95). Each histogram corresponds to samples of D(µ) and the red curve depicts the probability density function N(0, σ_ε²). The table reports how often the actual error lies in the inferred confidence intervals.

Validation frequency ω_validation(ω):

| predicted ω | N = 10 | N = 35 | N = 65 | N = 95 |
| 0.80        | 0.49   | 0.71   | 0.76   | –      |
| 0.90        | 0.59   | 0.82   | 0.87   | –      |
| 0.95        | 0.68   | 0.89   | 0.92   | –      |
| 0.98        | 0.76   | 0.93   | 0.95   | –      |
| 0.99        | 0.80   | 0.94   | 0.96   | –      |]
Such complex behavior requires a large number of error-surrogate training points to capture accurately. In addition, the number of system inputs is often large (nine in this case); this introduces curse-of-dimensionality difficulties in modeling the error. Figure 6(ii) visualizes this problem: the depicted mapping between the first two parameter components µ_1, µ_2 and the output error δ_s(µ) displays no structured behavior. As a result, there is no measurable improvement of the corrected output s_red + δ̃_s,MF over the evaluation of the ROM output s_red alone.

[Figure 6. Relationship between (i) the ROMES error indicators (residual norm r and error bound ∆_s) and the compliant-output error and (ii) the first two parameter components and the (compliant) output error, visualized by evaluating 200 random sample points in the input space. Clearly, the observed structure in the former relationship is more amenable to constructing a Gaussian process.]

In order to quantify the performance of the error surrogates, we introduce a normalized expected improvement

(47) I(δ̃, µ) := |(δ_s(µ) − mode(δ̃(ρ(µ)))) / δ_s(µ)|.

If this value is less than one, then the expected corrected output s_red + δ̃ is more accurate than the ROM output s_red itself at the point µ ∈ P, i.e., the additive error surrogate improves the prediction of the ROM.
On the other hand, values above one indicate that the error surrogate worsens the ROM prediction.

Figure 7 reports the mean, median, standard deviation, and extrema of the expected improvement (47) evaluated at all validation points P_validation for a varying number of training points. Here, we also compare with the performance of the error surrogate δ̃_uni, which is defined as a uniform distribution on the interval [∆_s^LB(µ), ∆_s(µ)], where ∆_s^LB(µ) and ∆_s(µ) are the lower and upper bounds for the output error, respectively. Note that δ̃_uni does not require training data, as it is based purely on error bounds.

The expected improvement for the ROMES output-error surrogate I(δ̃_s, µ), as depicted in Figure 7(i), is approximately 0.1 on average, i.e., the correction reduces the output error by roughly an order of magnitude. In contrast, the expected improvement for δ̃_uni is always greater than one, which means that its correction always increases the error. This arises from the fact that the center of the interval [∆_s^LB(µ), ∆_s(µ)] is a poor approximation of the true error.

In addition, Figure 7(ii) shows that the expected improvement produced by the multifidelity-correction surrogate I(δ̃_s,MF, µ) is often far greater than one. This shows that the multifidelity-correction approach is not well suited to this problem. Presumably, with (far) more training points, these results would improve.

Again, we can validate the Gaussian-process assumptions underlying the error surrogates. For N = 100 training points, Figure 8 compares a histogram of the deviation of the true error from the surrogate mean to the inferred probability density function. The associated table reports how often the validation data lie in the inferred confidence intervals.
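A minimal sketch of evaluating the normalized expected improvement (47) over a validation set; delta_mode would be the mode of the error surrogate (Remark 3.1), and the array names are hypothetical.

```python
import numpy as np

def expected_improvement(delta_true, delta_mode):
    """Normalized expected improvement of Eq. (47) per validation point:
    |(delta_s - mode(delta_tilde)) / delta_s|. Values below one mean the
    corrected output s_red + delta_tilde is more accurate than s_red alone."""
    I = np.abs((delta_true - delta_mode) / delta_true)
    return {'mean': I.mean(), 'median': np.median(I),
            'min': I.min(), 'max': I.max()}
```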
[Figure 7. Expected improvement I(δ̃, µ) for a varying number of training points N: (i) ROMES (GP kernel, compliant δ = δ_s, ρ = log r, d = log), together with the uniform distribution based on reduced-basis error bounds δ̃_uni, and (ii) multifidelity correction (GP kernel, compliant δ = δ_s, ρ = µ, d = id_R). Shown are the mean ± standard deviation, median, minimum, and maximum (1: no improvement; > 1: error worsened; < 1: error improved).]

[Figure 8. Gaussian-process validation for the ROMES surrogate (GP kernel, compliant δ = δ_s, ρ = log r, d = log) and the multifidelity-correction surrogate (GP kernel, compliant δ = δ_s, ρ = µ, d = id_R) using N = 100 training points. Each histogram corresponds to samples of D(µ) and the red curve depicts the probability density function N(0, σ_ε²). The table reports how often the actual error lies in the inferred confidence intervals; clearly, this validation test fails for the multifidelity-correction surrogate.

Validation frequency ω_validation(ω):

| predicted ω | (i) ROMES | (ii) multifidelity correction |
| 0.80        | 0.78      | –                             |
| 0.90        | 0.89      | 1.00                          |
| 0.95        | 0.93      | 1.00                          |
| 0.98        | 0.96      | 1.00                          |
| 0.99        | 0.97      | 1.00                          |]
We observe that the confidence intervals are valid for the ROMES surrogate but not for the multifidelity-correction surrogate, as the bins do not align with the inferred distribution. Figure 9 depicts the convergence of these confidence-interval validation metrics as the number of training points increases. As expected (see Remark 4.1), the ROMES observed confidence intervals more closely align with the confidence intervals arising from the inherent uncertainty (i.e., σ_ε²) as the number of training points increases, as this effectively decreases the uncertainty due to a lack of training data. In addition, only a moderate number of training points (around 20) is required to generate a reasonably converged ROMES surrogate. On the other hand, the multifidelity-correction surrogate exhibits no such convergence when fewer than 100 training points are used.
Figure 9. Gaussian-process validation for the ROMES surrogate (GP kernel, compliant δ = δ_s, ρ = log r, d = log) and the multifidelity-correction surrogate (GP kernel, compliant δ = δ_s, ρ = µ, and d = id_ℝ) for a varying number of training points N. The plots depict how often the actual error lies in the inferred confidence intervals (levels 80%, 90%, 95%, 98%, and 99%) for (i) ROMES and (ii) multifidelity correction.

5.4. Reduced-basis error bounds.
In this section, we compare the reduced-basis error bound ∆^µ_u (64) with the probabilistically rigorous ROMES surrogates |||δ_u|||~_c with rigor values of c = 0.5 and c = 0.9. The ROMES surrogate is constructed with the GP kernel method and ingredients δ = |||δ_u|||, ρ = log r, and d = log. As discussed in Section 2.3, the error-bound effectivity (8) is important for quantifying the performance of these bounds; a value of 1 is optimal, as it implies no overestimation. Note that c = 0.5 recovers the standard ROMES surrogate, i.e., |||δ_u|||~_0.5 = |||δ_u|||~ (see Eqs. (25)–(27)).
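As a hedged sketch of how such a c-rigorous surrogate can be evaluated (the precise definition is in Eqs. (25)–(27), which appear earlier in the paper), one can shift the inferred Gaussian mean of the log-error by the c-quantile of N(0, σ̃²) before mapping back through d⁻¹ = exp; the assumed form and data are illustrative only.

```python
import numpy as np
from scipy import stats

def rigorous_surrogate(nu, sigma, c):
    """c-rigorous error estimate (assumed form): shift the inferred mean of
    the log-error by the c-quantile of N(0, sigma^2), then undo d = log.
    c = 0.5 recovers the plain ROMES median prediction."""
    return np.exp(nu + stats.norm.ppf(c) * sigma)

nu, sigma = -4.0, 0.2        # hypothetical GP prediction at rho = log r(mu)
print(rigorous_surrogate(nu, sigma, 0.5))   # median estimate of |||delta_u|||
print(rigorous_surrogate(nu, sigma, 0.9))   # 90%-rigorous estimate
```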
Figure 10.
Validation of the probabilistically rigorous ROMES surrogates |||δ_u|||~_c (GP kernel, δ = |||δ_u|||, ρ = log r, d = log) and comparison with the RB error upper bound ∆^µ_u and the uniform distribution based on reduced-basis error bounds δ̃_uni. The top plots compare statistics of the effectivities η(c, µ) with c = 0.5 and c = 0.9; the bottom plots compare the observed frequency of error overestimation c_validation with the desired value c (red line).

As the probabilistically rigorous ROMES surrogates |||δ_u|||~_c are stochastic processes, we can measure their (most common) effectivity as

(48) η(c, µ) := mode(|||δ_u|||~_c(ρ(µ))) / |||δ_u(µ)|||.

The top plots of Figure 10 report the mean, median, standard deviation, and extrema of the effectivities η(0.5, µ) and η(0.9, µ) for all validation points µ ∈ P_validation. Again, we compare with δ̃_uni, which is a uniform distribution on an interval whose endpoints correspond to the lower and upper bounds for the error |||δ_u(µ)|||. We also compare with the corresponding statistics for the effectivity of the RB error bound ∆^µ_u. The lower bound for the coercivity constant needed in the RB error bound ∆^µ_u is chosen as the minimum over all parameter components, α_LB(µ) = min_i µ_i. This simple choice is effective because the example is affinely parameter dependent and linear [37, Ch. 4.2].
We observe that the ROMES surrogate yields better results than both the error bound ∆^µ_u (which produces effectivities roughly between one and eight) and the uniform distribution δ̃_uni (which produces mode effectivities roughly between one and four). The 50%-rigorous ROMES surrogate has an almost perfect mean effectivity of 1, as desired. The 90%-rigorous surrogate has a higher mean effectivity, as expected; however, it is only slightly higher. Furthermore, the effectivities of the ROMES surrogate exhibit a much smaller variance than both ∆^µ_u and δ̃_uni. Finally, a moderate number (around 20) of training samples is sufficient to obtain well-converged surrogates.

The bottom plots of Figure 10 report the frequency of error overestimation

(49) c_validation := card({µ ∈ P_validation | median(m̃(ρ(µ))) > d(δ(µ))}) / card(P_validation)

for the probabilistically rigorous ROMES surrogates (i.e., m̃ = |||δ_u|||~_c) as the number of training points increases, in order to show that c_validation ≈ c, which validates the rate of error overestimation (see Eq. (23)). The overestimation frequency c_validation indeed converges to its predicted value c, which demonstrates that the rigor of the ROMES estimates can in fact be controlled. Both diagnostics reduce to a few lines of array arithmetic; see the sketch below.
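The sketch below computes the effectivity of a point estimate (Eq. (48) evaluates this with the mode of the stochastic surrogate) and the empirical overestimation frequency of Eq. (49); the validation data are hypothetical.

```python
import numpy as np

def effectivity(estimates, true_errors):
    """Effectivity eta = estimate / true error; 1 is optimal."""
    return np.asarray(estimates) / np.asarray(true_errors)

def overestimation_frequency(estimates, true_errors):
    """Empirical counterpart of Eq. (49): fraction of validation points at
    which the estimate exceeds the true error."""
    return float(np.mean(np.asarray(estimates) > np.asarray(true_errors)))

# Hypothetical validation of a 90%-rigorous surrogate.
rng = np.random.default_rng(2)
true_err = np.exp(rng.normal(-4.0, 0.5, size=400))
est = true_err * np.exp(rng.normal(0.26, 0.2, size=400))  # mild overestimation
print(np.mean(effectivity(est, true_err)))      # mean effectivity near one
print(overestimation_frequency(est, true_err))  # should sit near c = 0.9
```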
5.5. Comparison of machine-learning algorithms. This section compares in detail the ROMES surrogates generated using the two machine-learning methods introduced in Section 4. Recall that Figure 4 visualizes both surrogates. We observe that both methods work well overall and generate well-converged surrogates with a modest number of training samples. As previously mentioned, the GP kernel method leads to a smaller inferred variance due to more accurate and localized estimates of the mean. The RVM is characterized by global basis functions that preclude it from accurately resolving localized features of the mean and lead to large uncertainty about the mean (see Figure 4). On the other hand, the confidence intervals computed with the RVM are (slightly) better validated.

Figures 11(i) and (ii) report the validation test, i.e., the frequency of deviations from the inferred mean D(µ) (46), with a training sample containing N = 80 training points. We observe a smaller inferred variance σ̃² for the GP kernel method, which implies that the mean is identified more accurately. In both cases, the validation samples align well with the probability density function of the inferred distribution N(0, σ̃²).

The confidence intervals of this inferred distribution can be validated, and they turn out to be (slightly) more realistic for the RVM method. The table within Figure 11 shows that the kernel-method results are usually optimistic, i.e., the actual confidence intervals are smaller than predicted. This effect can be corrected, however, by adding Σ(x*) as an indicator of the uncertainty of the mean, as discussed in Remark 4.1. Doing so for the RVM prediction, however, yields extremely wide confidence intervals due to the significant uncertainty about the RVM mean (see Figure 4).

As the inferred variance is larger for the relevance vector machine, this also affects the effectivity and improvement measures for the error surrogates. Figure 12 depicts statistics of η(0.5, µ) computed with both methods, and we observe that all statistical measures are significantly better for the kernel-method estimate. (The higher variance apparent between 45 and 53 training points can be explained by the fact that the minimization algorithm for the log-likelihood function stops once it exceeds the maximum number of iterations.)
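For concreteness, here is a minimal Gaussian-process regression sketch for the ROMES map ρ ↦ d(δ) with a squared-exponential kernel. This is not the paper's exact implementation: the kernel choice and fixed hyperparameters are assumptions, and in practice the hyperparameters would be found by maximizing the log marginal likelihood (cf. the remark above about the maximum number of iterations); the training data are hypothetical.

```python
import numpy as np

def gp_fit_predict(rho_train, d_train, rho_test, ell=1.0, sig_f=1.0, sig_n=1e-2):
    """Minimal GP regression for the ROMES map rho -> d(delta): zero prior
    mean, squared-exponential kernel with fixed hyperparameters."""
    def k(a, b):
        return sig_f**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)
    K = k(rho_train, rho_train) + sig_n**2 * np.eye(len(rho_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, d_train))
    Ks = k(rho_test, rho_train)
    mean = Ks @ alpha                           # inferred mean nu(rho)
    v = np.linalg.solve(L, Ks.T)
    var = sig_f**2 - np.sum(v**2, axis=0)       # inferred variance
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical training data: indicators rho = log r(mu), responses log|||delta_u|||.
rng = np.random.default_rng(3)
rho = np.sort(rng.uniform(-8, -2, size=50))
d = 1.1 * rho + 0.3 + 0.1 * rng.normal(size=50)   # near-linear structure
nu, sigma = gp_fit_predict(rho, d, np.array([-5.0]))
print(np.exp(nu), sigma)   # ROMES error prediction at one test point
```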
[Figure 11: histograms of the normalized rate of occurrence for (i) the kernel method and (ii) the RVM, with the following validation table; the left pair of columns uses the inferred variance σ̃², the right pair the variance augmented by the mean uncertainty, σ̃² + Σ̃ (cf. Remark 4.1).]

Validation frequency ω_validation(ω)
                 σ̃²                σ̃² + Σ̃
predicted ω    (i)      (ii)      (i)      (ii)
0.80           0.78     0.78      0.84     –
0.90           0.88     0.89      0.93     –
0.95           0.93     0.95      0.95     1.00
0.98           0.96     0.97      0.97     1.00
0.99           0.97     0.99      0.99     1.00

Figure 11.
Gaussian-process validation for the ROMES surrogate using both the GP kernel method and the RVM (δ = |||δ_u|||, ρ = log r, d = log) with N = 80 training points. The histograms correspond to samples of D(µ) and the red curves depict the probability density function of N(0, σ̃²). The table reports how often the actual error lies in the inferred confidence intervals.
Figure 12.
Comparison of the effectivity η(0.5, µ) of ROMES surrogates (δ = |||δ_u|||, ρ = log r, d = log) with the GP computed via (i) the GP kernel method and (ii) the relevance-vector-machine method.

We conclude that while both methods produce feasible ROMES surrogates, the GP kernel method produces consistently better results. In particular, its lower inferred variance implies that less epistemic uncertainty is introduced by the error surrogate (see condition 2 from Section 3.1). This is critically important for many UQ tasks. Therefore, we recommend the GP kernel method for constructing ROMES surrogates.
5.6. Dependence on reduced-basis size.
To assess the generalizability of the ROMES method, we apply it to a ROM of higher fidelity, i.e., larger p. We construct two ROMES surrogates: one for the state-space error |||δ_u|||~ (GP kernel, δ = |||δ_u|||, ρ = log r, d = log) and one for the compliant-output error δ̃_s (GP kernel, δ = δ_s, ρ = log r, d = log).

We increase the reduced-basis size by decreasing the maximum allowed error over the training set by several orders of magnitude and applying the greedy method; the resulting reduced-basis dimension is p = 62. Figure 13 reports the error data and ROMES surrogates. Comparing the leftmost plot of Figure 13 with Figure 4(i), and the rightmost plot of Figure 13 with Figure 6(i), reveals that while the errors are several orders of magnitude smaller for the current experiments, the data (and the resulting ROMES surrogates) exhibit roughly the same structure as before.

Figure 13.
Visualization of ROMES surrogates (δ = |||δ_u||| and δ_s, ρ = log r, d = log), computed using N = 100 training points and the GP kernel method, for a higher dimensional ROM with p = 62 (left: energy-norm state error |||δ_u||| versus residual r; right: output error |s − s_red| versus residual r).

Figure 14 reports the performance of these surrogates. Comparing the leftmost plot of Figure 14 with Figure 12(i) shows that the state-space error surrogate exhibits nearly identical convergence for the larger- and smaller-dimension reduced-order models. As in the experiments of Section 5.4, the value of mode(η(0.5, µ)) is close to 1 in the mean. Comparing the rightmost plot of Figure 14 with Figure 7(i) shows that the expected improvement for the output-error correction with the surrogate δ̃_s is again around 0.1.

5.7. Multiple and non-compliant outputs.
Finally, we assess the performance of ROMES on a model with multiple and non-compliant output functionals, as discussed in Section 3.2.2.
Figure 14.
Effectivity η(0.5, µ) of the ROMES surrogate (δ = |||δ_u|||, ρ = log r, d = log) and expected improvement I(δ̃, µ) of the ROMES surrogate (GP kernel, compliant δ = δ_s, ρ = log r, d = log) for a higher dimensional ROM with p = 62 and a varying number of training points N.

For this experiment, we set the two outputs to be temperature measurements at points x_1 and x_2:

(50) s_i(µ) := g_i(u(µ)) := ḡ_i(u_h(µ)) = ∫_Ω δ_Dirac(x − x_i) u(x; µ) dx = u(x_i; µ), i = 1, 2,

where δ_Dirac denotes the Dirac delta function. In this case, we construct a separate ROMES surrogate for each of the output errors δ̃_s,1 and δ̃_s,2. As previously discussed, we use dual-weighted residuals as indicators, ρ_i(µ) = y_red,i(µ)^t r(u_red; µ), i = 1, 2, with d ≡ id_ℝ.

This necessitates the computation of approximate dual solutions, for which dual reduced-basis spaces must be generated in the offline stage. The corresponding finite-element problem can be found in Eq. (79), where Eq. (50) above provides the right-hand sides. The algebraic problems can be inferred from Eq. (80), where the discrete right-hand sides are canonical unit vectors because the points x_1 and x_2 coincide with nodes of the finite-element mesh. Like the primal reduced basis, the dual counterpart can be generated with a greedy algorithm that minimizes the approximation error for the reduced dual solutions.

To assess the ability to control uncertainty with the dual-weighted-residual indicators (see Remark 3.2), we generate three pairs of dual reduced bases of increasing fidelity: 1) an error tolerance of 1 (basis sizes p_y of 10 and 11), 2) an intermediate error tolerance (basis sizes p_y of 15 and 17), and 3) the tightest error tolerance (basis sizes p_y of 20 and 23).

To train the surrogates, we compute δ_s,1(µ), δ_s,2(µ), ρ_1(µ) (of varying fidelity), and ρ_2(µ) (of varying fidelity) for µ ∈ P̄ ⊂ P with card(P̄) = 500. The first T = 100 points define the training set P_learn ⊂ P̄ and the remaining 400 points constitute the validation set P_validation ⊂ P̄.

Figure 15 depicts the observed relationship between the indicators ρ_1(µ) (of different fidelity) and the error in the first output δ_s,1(µ). Note that as the dual-basis size p_y increases, the output error exhibits a nearly exact linear dependence on the dual-weighted residuals.
Figure 15. Relationship between the dual-weighted-residual indicators ρ_1 = y_red,1(µ)^t r(u_red; µ) and errors in the (non-compliant) first output δ_s,1, for dual reduced bases of size p_y = 10, 15, and 20.

This linear dependence is expected, as the residual operator is linear in the state. Accordingly, the RVM with a linear polynomial basis produces the best (i.e., minimum-variance) results for the ROMES surrogates in this case.

Figure 16 reflects the necessity of employing a sufficiently large dual reduced basis to compute the dual-weighted-residual error indicators. For a small dual reduced basis, there is almost no improvement in the mean and only a slight improvement in the median; in some cases, the 'corrected' outputs are actually less accurate. However, the most accurate dual solutions yield a mean and median error improvement of two orders of magnitude. This illustrates the ability and utility of uncertainty control when dual-weighted residuals are used as error indicators.

Figure 17 reports validation results for the inferred confidence intervals. While the validation results are quite good (and appear to be converging to the correct values), they are not as accurate as those obtained for the compliant output. The indicator computation itself is summarized in the sketch below.
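To make the indicator computation concrete, the following sketch assembles ρ(µ) = y_red(µ)^t r(u_red; µ) for a hypothetical small system, using the primal Galerkin solve of Eq. (57) and the reduced dual solve of Eq. (80) from the appendix; all matrices here are random stand-ins for an assembled finite-element system.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, p_y = 50, 5, 6
M = rng.normal(size=(n, n))
A_mu = M @ M.T + n * np.eye(n)          # SPD operator, as for the elliptic problem
f_mu = rng.normal(size=n)
V = np.linalg.qr(rng.normal(size=(n, p)))[0]     # primal reduced basis
Y = np.linalg.qr(rng.normal(size=(n, p_y)))[0]   # dual reduced basis

# Primal Galerkin ROM solve (Eq. (57)) and reduced dual solve (Eq. (80);
# a is symmetric here, so A appears in both).
u_hat = np.linalg.solve(V.T @ A_mu @ V, V.T @ f_mu)
g = np.zeros(n); g[7] = 1.0             # point-value output: canonical unit vector
y_hat = np.linalg.solve(Y.T @ A_mu @ Y, -(Y.T @ g))

# Dual-weighted-residual indicator rho(mu) = y_red(mu)^t r(u_red; mu).
r = A_mu @ (V @ u_hat) - f_mu           # primal residual at the ROM solution
rho = (Y @ y_hat) @ r
print(rho)
```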
6. Conclusions and outlook

This work presented the ROMES method for statistically modeling reduced-order-model errors. In contrast to rigorous error bounds, such statistical models are useful for tasks in uncertainty quantification. The method employs supervised machine-learning methods to construct a mapping from existing, cheaply computable ROM error indicators to a distribution over the true error. This distribution reflects the epistemic uncertainty introduced by the ROM. We proposed ROMES ingredients (supervised-learning method, error indicators, and transformation function) that yield low-variance, numerically validated models for different types of ROM errors.

For normed outputs, the ROMES surrogates led to effectivities with low variance and means close to the optimal value of one, as well as a notion of probabilistic rigor. This is in contrast to existing ROM error bounds, which exhibited mean effectivities close to ten; this improvement will likely be more pronounced for more complex (e.g., nonlinear, time-dependent) problems. Further, the ROMES surrogates were computationally less expensive than the error bounds, as the coercivity-constant lower bound was not required.

For general outputs, the ROMES surrogate allowed the ROM output to be corrected, which yielded a near 10x accuracy improvement. Further, the uncertainty in this error could be controlled by modifying the dimension of the dual reduced basis.
Figure 16. Expected improvement I(δ̃, µ) for the ROMES surrogate (RVM, δ = δ_s,i, ρ_i = y_red,i(µ)^t r(u_red; µ), i = 1, 2, d = id_ℝ) for a varying number of training points T and different dual reduced-basis-space dimensions: (i)–(iii) first output with p_y = 10, 15, and 20; (iv)–(vi) second output with p_y = 11, 17, and 23. The plots report the mean ± standard deviation, median, minimum, and maximum. Compare with Figure 7. (1: no improvement; >1: error worsened; <1: error improved.)
Validation frequency ω_validation(ω)

                        first output                          second output
predicted ω    N = 29   N = 53   N = 76   N = 100    N = 29   N = 53   N = 76   N = 100
0.80           1.00     0.83     0.82     0.82       1.00     0.82     0.87     –
0.90           1.00     0.87     0.86     0.86       1.00     0.87     0.91     –
0.95           1.00     0.89     0.89     0.89       1.00     0.89     0.93     –
0.98           1.00     0.90     0.91     0.90       1.00     0.92     0.94     –
0.99           1.00     0.92     0.92     0.91       1.00     0.94     0.95     –

Figure 17.
Gaussian-process validation for the ROMES surrogates (RVM, δ = δ_s,i, ρ_i = y_red,i(µ)^t r(u_red; µ), i = 1, 2, d = id_ℝ). The table reports how often the actual error lies in the inferred confidence intervals. The largest dual reduced-basis-space dimensions (p_y = 20 and p_y = 23) are used to compute the error indicators.

On the other hand, existing approaches (i.e., multifidelity correction) that employ system inputs (rather than error indicators) as inputs to the error model did not lead to improved output predictions. This demonstrated the ability of ROMES to mitigate the curse of dimensionality: although the problem was characterized by nine system inputs, only one error indicator was necessary to construct a low-variance, validated ROMES surrogate.

We foresee the combination of ROMs with ROMES error surrogates to be powerful in UQ applications, especially when the number of system inputs is large. Future work entails integrating and analyzing the ROM/ROMES combination for specific UQ problems, e.g., Bayesian inference, as well as automating the procedure for selecting ROMES ingredients for different problems. Future work will also involve integrating the ROMES surrogates into the greedy algorithm for reduced-basis-space selection; this has the potential to improve ROM quality due to the near-optimal expected effectivities of the error surrogates.
Acknowledgments
We thank Khachik Sargsyan for his support in the understanding, selection, and development of the machine-learning algorithms. This research was supported in part by an appointment to the Sandia National Laboratories Truman Fellowship in National Security Science and Engineering, sponsored by Sandia Corporation (a wholly owned subsidiary of Lockheed Martin Corporation) as Operator of Sandia National Laboratories under its U.S. Department of Energy Contract No. DE-AC04-94AL85000. We also acknowledge support by the Department of Energy Office of Advanced Scientific Computing Research under contract 10-014804.
Appendix A. Reduced-basis method for parametric elliptic PDEs with affine parameter dependence
This section summarizes the reduced-basis method for generating a reduced-order model for parametric elliptic PDEs with affine parameter dependence. See Ref. [37] for additional details.
A.1. Parametric elliptic problem.
We start with the discretized version of the elliptic problem, where the state resides in a function space X_h ⊂ X of finite dimension n := dim(X_h), e.g., a finite-element space. For details on the derivation of such a finite-element discretization from the analytical formulation of an elliptic PDE, and on its convergence properties, we refer to Ref. [9].

The problem reads as follows: for any point in the input space µ ∈ P, compute the quantity of interest

(51) s(µ) := ḡ(u_h(µ))

with u_h ∈ X_h fulfilling

(52) a(u_h(µ), v_h; µ) = f(v_h; µ) for all v_h ∈ X_h.

Here, for all inputs µ ∈ P, a(·, ·; µ) : X_h × X_h → ℝ is a bilinear form on the Hilbert space X_h that is symmetric, continuous, and coercive, fulfilling

(53) sup_{u∈X_h} sup_{v∈X_h} a(u, v; µ)/(‖u‖_X ‖v‖_X) < γ(µ) < ∞ and inf_{u∈X_h, ‖u‖_X=1} a(u, u; µ) > α(µ)

for constants γ(µ) and α(µ) >
0. In addition, we have f(·; µ) ∈ X*_h.

Given a basis {ϕ_i}_{i=1}^n spanning X_h, the solution can be expressed as a linear combination u_h(µ) = Σ_{i=1}^n u_i(µ) ϕ_i ∈ X_h with the state vector u := [u_1 ··· u_n]^t : P → ℝ^n containing the solution degrees of freedom. Then, problem (52) can be solved algebraically by solving the system of linear equations

(54) r(u; µ) := A(µ) u(µ) − f(µ) = 0,

with A : P → ℝ^(n×n) given by A_ij(µ) = a(ϕ_i, ϕ_j; µ) and f : P → ℝ^n given by f_i(µ) = f(ϕ_i; µ). Similarly, the output can be expressed as a function of the state degrees of freedom:

(55) g(u(µ)) := ḡ(Σ_{i=1}^n u_i(µ) ϕ_i).

The key concept underlying projection-based model-reduction techniques is to find a lower dimensional subspace X_red ⊂ X_h of dimension p := dim(X_red) ≪ n, such that (54) can be solved more efficiently. For the reduced-basis method, this problem-dependent reduced-basis space is obtained by identifying a few training points P_train := {µ̃_i}_{i=1}^p ⊂ P for which the full-order equations (54) are solved. These solutions span the reduced-basis space X_red = span{u(µ̃_i)}_{i=1}^p, from which we generate an orthonormal basis {ψ_i}_{i=1}^p. We refer to Section A.4 for more details on the selection of these parameters. These basis vectors can be expressed in terms of the original finite-element degrees of freedom, i.e.,

(56) ψ_j = Σ_{i=1}^n ϕ_i V_ij, j = 1, …, p.

This transformation provides the entries of the trial basis V. As we consider only Galerkin projection (due to the symmetric and coercive nature of the PDE), we set W = V, which leads to a reduced-order problem of the form (4):

(57) V^t r(V û; µ) = V^t A(µ) V û(µ) − V^t f(µ) = 0.

The reduced quantity of interest is then given by s_red = g(V û).
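In matrix terms, the Galerkin reduced-order solve of Eq. (57) is a projection followed by a small dense solve. A minimal sketch, where the system and basis below are hypothetical stand-ins for an assembled problem:

```python
import numpy as np

def galerkin_rom_solve(A_mu, f_mu, V):
    """Galerkin reduced-order solve of Eq. (57): project the full-order
    operator and right-hand side onto the trial basis V and solve the
    resulting p x p system (V would come from the greedy method)."""
    A_red = V.T @ A_mu @ V          # p x p reduced operator
    f_red = V.T @ f_mu              # p-dimensional reduced right-hand side
    u_hat = np.linalg.solve(A_red, f_red)
    return u_hat, V @ u_hat         # reduced coordinates and reconstructed state

# Hypothetical SPD full-order system and orthonormal basis.
rng = np.random.default_rng(5)
n, p = 200, 8
M = rng.normal(size=(n, n))
A_mu = M @ M.T + n * np.eye(n)      # symmetric positive definite
f_mu = rng.normal(size=n)
V = np.linalg.qr(rng.normal(size=(n, p)))[0]
u_hat, u_red = galerkin_rom_solve(A_mu, f_mu, V)
```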
A.2. Offline–online decomposition.
The low-dimensional operators in Eq. (57) can be computed efficiently assuming that the bilinear form a and the functional f are affinely parameter dependent, i.e., they can be written as

(58) a(·, ·; µ) = Σ_{q=1}^{Q_a} σ_a^q(µ) a^q(·, ·) and f(·; µ) = Σ_{q=1}^{Q_f} σ_f^q(µ) f^q(·),

with parameter-independent bilinear forms a^q : X_h × X_h → ℝ, q = 1, …, Q_a, and functionals f^q : X_h → ℝ, q = 1, …, Q_f, and coefficient functions σ_a^q : P → ℝ, q = 1, …, Q_a, and σ_f^q : P → ℝ, q = 1, …, Q_f.
In this affinely parameter dependent case, the parameter-dependent reduced quantities V^t A(µ) V and V^t f(µ) can be quickly assembled via linear combinations of their parameter-independent components:

(59) V^t A(µ) V = Σ_{q=1}^{Q_a} σ_a^q(µ) V^t A^q V,
(60) V^t f(µ) = Σ_{q=1}^{Q_f} σ_f^q(µ) V^t f^q.

Here, the matrices A^q ∈ ℝ^(n×n), q = 1, …, Q_a, are given by A^q_ij = a^q(ϕ_i, ϕ_j) and the vectors f^q ∈ ℝ^n, q = 1, …, Q_f, are given by f^q_i = f^q(ϕ_i).

The parameter-independent components

(61) V^t A^q V ∈ ℝ^(p×p), q = 1, …, Q_a, and V^t f^q ∈ ℝ^p, q = 1, …, Q_f,

must be computed only once, during the so-called offline stage. All parameter-dependent computations are carried out in the online stage in an efficient manner, i.e., with computational complexities that depend only on the reduced-basis dimension p. We refer to Ref. [37] for more details on the offline/online decomposition. Note that in the non-affine case, low-dimensional operators can be approximated using the empirical interpolation method [6, 28, 16].
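A sketch of this offline/online split for a hypothetical affine problem with Q_a = 2 and Q_f = 1; the coefficient functions and operators are assumed for illustration only.

```python
import numpy as np

def offline_project(A_qs, f_qs, V):
    """Offline stage: project each parameter-independent component once (Eq. (61))."""
    return [V.T @ Aq @ V for Aq in A_qs], [V.T @ fq for fq in f_qs]

def online_assemble(A_red_qs, f_red_qs, sigma_a, sigma_f):
    """Online stage: assemble the reduced operator and right-hand side as
    linear combinations (Eqs. (59)-(60)); cost depends only on p, Q_a, Q_f."""
    A_red = sum(s * Aq for s, Aq in zip(sigma_a, A_red_qs))
    f_red = sum(s * fq for s, fq in zip(sigma_f, f_red_qs))
    return A_red, f_red

# Hypothetical affine components and coefficients sigma(mu).
rng = np.random.default_rng(6)
n, p = 100, 6
A_qs = [np.eye(n), rng.normal(size=(n, n))]
f_qs = [rng.normal(size=n)]
V = np.linalg.qr(rng.normal(size=(n, p)))[0]
A_red_qs, f_red_qs = offline_project(A_qs, f_qs, V)
mu = 0.7
A_red, f_red = online_assemble(A_red_qs, f_red_qs, [1.0, mu], [1.0])
u_hat = np.linalg.solve(A_red, f_red)
```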
A.3. Error bounds. The state error can be measured either in the problem-dependent energy norm or in the norm of the underlying Hilbert space X_h. The energy norm for the above problem is defined as

(62) |||u|||² := a(u, u; µ)

and is equivalent to the X_h-norm:

(63) √α ‖u‖_{X_h} ≤ |||u||| ≤ √γ ‖u‖_{X_h}.

The bounds ∆^µ_u and ∆_u for the state errors |||δ_u||| and ‖δ_u‖_{X_h} are defined as follows:

(64) ∆^µ_u(µ) := r(µ) / √(α_LB(µ)) ≥ |||δ_u(µ)|||, and
(65) ∆_u(µ) := r(µ) / α_LB(µ) ≥ ‖δ_u(µ)‖_{X_h},

where α_LB(µ) ≤ α(µ) is a lower bound for the coercivity constant. This lower bound can easily be computed if the bilinear form a is symmetric and parametrically coercive as defined in [37, Ch. 4.2], i.e., there is a point µ̄ ∈ P such that

(66) (u, v)_{X_h} = a(u, v; µ̄),

the parameter-independent bilinear forms are coercive,

(67) a^q(w, w) ≥ 0 for all w ∈ X_h, q = 1, …, Q_a,

and their coefficients are strictly positive,

(68) σ_a^q(µ) > 0 for all µ ∈ P, q = 1, …, Q_a.

In this special case, the coercivity-constant lower bound can be chosen as

(69) α_LB(µ) = min_{q=1,…,Q_a} σ_a^q(µ) / σ_a^q(µ̄).

A similar bound can be generated for errors in compliant output functionals (ḡ = f and a symmetric):

(70) ∆_s(µ) := r(µ)² / α_LB(µ) ≥ δ_s(µ) := s(µ) − s_red(µ) > 0.

Analogously to the upper bounds, lower bounds for the errors can be computed as
(71) ∆^{LB,µ}_u(µ) := r(µ) / √(γ_UB(µ)) ≤ |||δ_u(µ)|||,
(72) ∆^LB_u(µ) := r(µ) / γ_UB(µ) ≤ ‖δ_u(µ)‖_{X_h}, and
(73) ∆^LB_s(µ) := r(µ)² / γ_UB(µ) ≤ δ_s(µ),

where γ_UB ≥ γ is an upper bound for the continuity constant γ.

A.3.1. Offline–online decomposition of the residual norm.
In order to compute the error bounds efficiently, the residual norm

(74) r(µ) := ‖f(·; µ) − a(u_red(µ), ·; µ)‖_{X*_h}

must itself be computed through an offline/online decomposition. Here, the reduced solution u_red(µ) is given by Σ_{i=1}^p û_i ψ_i ∈ X_red ⊂ X_h with the reduced state vector û = [û_1 ··· û_p]^t : P → ℝ^p containing the reduced solution degrees of freedom. Algebraically, the residual norm can be computed as

(75) r(µ)² = (K⁻¹ r(V û(µ); µ))^t r(V û(µ); µ)
= Σ_{1≤q,q′≤Q_a} σ_a^q(µ) σ_a^{q′}(µ) û(µ)^t ((K⁻¹ A^q V)^t A^{q′} V) û(µ)
− 2 Σ_{q=1}^{Q_f} Σ_{q′=1}^{Q_a} σ_f^q(µ) σ_a^{q′}(µ) ((K⁻¹ f^q)^t A^{q′} V) û(µ)
+ Σ_{1≤q,q′≤Q_f} σ_f^q(µ) σ_f^{q′}(µ) (K⁻¹ f^q)^t f^{q′},

where the inner-product matrix K ∈ ℝ^(n×n) is given by K_ij = (ϕ_i, ϕ_j)_{X_h}; it is used to identify the Riesz representation of the residual f(·; µ) − a(u_red(µ), ·; µ) ∈ X*_h.
All low-dimensional and parameter-independent quantities

(76) (K⁻¹ A^q V)^t A^{q′} V ∈ ℝ^(p×p), 1 ≤ q, q′ ≤ Q_a,
(77) (K⁻¹ f^q)^t A^{q′} V ∈ ℝ^p, 1 ≤ q ≤ Q_f, 1 ≤ q′ ≤ Q_a,
(78) (K⁻¹ f^q)^t f^{q′} ∈ ℝ, 1 ≤ q, q′ ≤ Q_f,

can be precomputed during the offline stage.
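In code, the offline stage forms the quantities (76)–(78) once, and the online stage evaluates Eq. (75) by looping over the affine terms. The sketch below follows the reconstruction above (including the factor of two on the cross terms); in practice one would factorize K once rather than inverting it.

```python
import numpy as np

def residual_norm_offline(A_qs, f_qs, V, K):
    """Offline stage: form the parameter-independent terms (76)-(78)."""
    Kinv = np.linalg.inv(K)   # illustration only; prefer a factorization of K
    AA = [[(Kinv @ Aq @ V).T @ (Aqp @ V) for Aqp in A_qs] for Aq in A_qs]
    FA = [[(Kinv @ fq) @ (Aqp @ V) for Aqp in A_qs] for fq in f_qs]
    FF = [[(Kinv @ fq) @ fqp for fqp in f_qs] for fq in f_qs]
    return AA, FA, FF

def residual_norm_online(u_hat, sig_a, sig_f, AA, FA, FF):
    """Online stage: evaluate Eq. (75); the cost is independent of n."""
    r2 = sum(sa * sap * (u_hat @ AA[q][qp] @ u_hat)
             for q, sa in enumerate(sig_a) for qp, sap in enumerate(sig_a))
    r2 -= 2.0 * sum(sf * sap * (FA[q][qp] @ u_hat)
                    for q, sf in enumerate(sig_f) for qp, sap in enumerate(sig_a))
    r2 += sum(sf * sfp * FF[q][qp]
              for q, sf in enumerate(sig_f) for qp, sfp in enumerate(sig_f))
    return np.sqrt(max(r2, 0.0))
```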
A.3.2. Dual-weighted error estimates. As discussed in Section 3.2.2, the ROMES surrogate for modeling errors in general (non-compliant) outputs ḡ requires a dual solution y(µ) ∈ ℝ^n. In the present context, the associated problem is: find y_h(µ) ∈ X_h fulfilling

(79) a(v_h, y_h(µ); µ) = −ḡ(v_h) for all v_h ∈ X_h.

Analogously to the primal problem, we require a reduced-basis space X^ḡ_red ⊂ X_h for this dual problem. Algebraically, this leads to a reduced-basis matrix Y ∈ ℝ^(n×p_y) and an associated reduced dual solution ŷ ∈ ℝ^{p_y} such that

(80) Y^t r_g(Y ŷ(µ); µ) = 0, with r_g(y; µ) := A(µ) y + g.

Here, g ∈ ℝ^n is given by (g)_i = ḡ(ϕ_i). The output error can then be bounded as

(81) ∆_s(µ) := r(µ) r_g(µ) / α_LB(µ) ≥ |δ_s(µ)|,

with the dual residual norm r_g(µ) := ‖r_g(Y ŷ(µ); µ)‖.

A.4. Basis generation with greedy algorithm.
We have so far withheld the mechanism by which the reduced-basis spaces X_red, X^ḡ_red ⊂ X_h are constructed. This section provides a brief overview of this topic; to ease exposition, we focus on the generation of the primal reduced-basis space only. As previously mentioned, the reduced-basis space is constructed from p chosen input-space points P_train ⊂ P with card(P_train) = p. As the reduced-basis solutions should provide accurate approximations for all points µ ∈ P, the ultimate goal is to find a reduced-basis space of low dimension p ≪ n that minimizes some distance measure between itself and the manifold M := {u(µ) | µ ∈ P}. Two candidates for such a distance measure are the maximum projection error

(82) dist(X_red, M) := max_{u∈M} inf_{v∈X_red} ‖u − v‖

and the maximum reduced state error

(83) dist(X_red, M) := max_{u(µ)∈M} ‖u(µ) − u_red(µ)‖.

The optimally achievable distance of an N-dimensional reduced-basis space from the manifold M is known as the Kolmogorov N-width and is given by

(84) d_N(M) := inf_{X̃⊂X, dim(X̃)=N} dist(X̃, M).

While the Kolmogorov N-width is a theoretical measure and is known only for a few simple manifolds, it is possible to construct a reduced-basis space with a myopic greedy algorithm that often comes close to this optimal value. The greedy algorithm works as follows. First, the manifold of 'interesting solutions' is discretized by defining a finite subset M_greedy := {u(µ) | µ ∈ P_greedy} ⊂ M with card(P_greedy) ≥ p, for which the distance measures (82) and (83) can in principle be computed. However, as computing these distance measures exactly requires knowledge of the full-order solution u, the method substitutes the error bound max_{µ∈P_greedy} ∆_u(µ). This allows for the construction of a sequence of reduced-basis spaces X_1 ⊂ X_2 ⊂ ··· ⊂ X_p =: X_red by choosing the first subspace as

(85) X_1 = span{u(µ_1)}

with a randomly chosen parameter µ_1 ∈ P. The following spaces are then chosen iteratively by computing the parameter that maximizes the error bound,

(86) µ^k_max := arg max_{µ∈P_greedy} ∆^k_u(µ),

and adding the corresponding full-order solution to the reduced-basis space,

(87) X_{k+1} = X_k ⊕ span{u(µ^k_max)},

where ∆^k_u(µ) denotes the bound for the model-reduction error ‖u(µ) − u_red(µ)‖ computed using the reduced-basis space X_k. This algorithm works well in practice and has recently been supported by theoretical convergence results: Refs. [8, 27] proved that the distances of these heuristically constructed reduced spaces converge at a rate similar to that of the theoretical Kolmogorov N-width if this N-width converges algebraically or exponentially fast.

In practice, the performance of this greedy algorithm depends strongly on both the effectivity and the computational cost of the error bounds, as less expensive error bounds allow for a larger number of points in M_greedy. Therefore, we expect that employing the ROMES surrogates in lieu of the rigorous bounds may lead to performance gains for the greedy algorithm; this constitutes an interesting area of future investigation. A sketch of the basic greedy loop, with the error estimate as a pluggable callback, follows.
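A minimal sketch of the greedy loop of Eqs. (85)–(87); solve_full and error_bound are assumed callbacks, where the latter could evaluate ∆_u(µ) or, as suggested above, a ROMES surrogate.

```python
import numpy as np

def greedy_basis(mus, solve_full, error_bound, tol, max_dim):
    """Greedy reduced-basis construction (Eqs. (85)-(87)): repeatedly add the
    snapshot at the training point that maximizes the current error estimate.
    solve_full(mu) returns a full-order solution vector; error_bound(V, mu)
    is a cheap estimate of the reduction error for the current basis V."""
    V = np.linalg.qr(solve_full(mus[0])[:, None])[0]   # arbitrary first snapshot
    while V.shape[1] < max_dim:
        bounds = [error_bound(V, mu) for mu in mus]
        k = int(np.argmax(bounds))
        if bounds[k] <= tol:
            break
        u_new = solve_full(mus[k])
        u_new = u_new - V @ (V.T @ u_new)   # Gram-Schmidt against current basis
        V = np.hstack([V, (u_new / np.linalg.norm(u_new))[:, None]])
    return V
```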
References

[1] M. Abramowitz and I.A. Stegun, editors. Handbook of Mathematical Functions, volume 55 of National Bureau of Standards Applied Mathematics Series. 1972.
[2] N.M. Alexandrov, R.M. Lewis, C.R. Gumbert, L.L. Green, and P.A. Newman. Approximation and model management in aerodynamic optimization with variable-fidelity models. AIAA Journal of Aircraft, 38(6):1093–1101, 2001.
[3] P. Astrid, S. Weiland, K. Willcox, and T. Backx. Missing point estimation in models described by proper orthogonal decomposition. IEEE Transactions on Automatic Control, 53(10):2237–2251, 2008.
[4] I. Babuška and A. Miller. The post-processing approach in the finite element method—part 1: calculation of displacements, stresses and other higher derivatives of the displacements. International Journal for Numerical Methods in Engineering, 20(6):1085–1109, 1984.
[5] M. Barrault, Y. Maday, N.C. Nguyen, and A.T. Patera. An 'empirical interpolation' method: application to efficient reduced-basis discretization of partial differential equations. Comptes Rendus Mathématique Académie des Sciences, 339(9):667–672, 2004.
[6] M. Barrault, Y. Maday, N.C. Nguyen, and A.T. Patera. An 'empirical interpolation' method: application to efficient reduced-basis discretization of partial differential equations. C. R. Math. Acad. Sci. Paris Series I, 339:667–672, 2004.
[7] R. Becker and R. Rannacher. Weighted a posteriori error control in finite element methods, preprint no. 96-1. Universität Heidelberg, 1996.
[8] P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk. Convergence rates for greedy algorithms in reduced basis methods.
SIAM J. Math. Anal., 43(3):1457–1472, 2011.
[9] D. Braess. Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics. Cambridge University Press, 2007.
[10] A. Buffa, Y. Maday, A.T. Patera, C. Prud'homme, and G. Turinici. A priori convergence of the greedy algorithm for the parametrized reduced basis. ESAIM Math. Model. Numer. Anal., 46(3):595–603, May 2012. Special issue in honor of David Gottlieb.
[11] C. Canuto, T. Tonn, and K. Urban. A-posteriori error analysis of the reduced basis method for non-affine parameterized nonlinear PDEs. SIAM J. Numer. Anal., 47:2001–2022, 2009.
[12] K. Carlberg. Adaptive h-refinement for reduced-order models. arXiv preprint 1404.0442, 2014.
[13] K. Carlberg, C. Bou-Mosleh, and C. Farhat. Efficient non-linear model reduction via a least-squares Petrov–Galerkin projection and compressive tensor approximations. International Journal for Numerical Methods in Engineering, 86(2):155–181, April 2011.
[14] K. Carlberg, C. Farhat, J. Cortial, and D. Amsallem. The GNAT method for nonlinear model reduction: effective implementation and application to computational fluid dynamics and turbulent flows. Journal of Computational Physics, 242:623–647, 2013.
[15] S. Chaturantabut and D.C. Sorensen. Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing, 32(5):2737–2764, 2010.
[16] S. Chaturantabut and D.C. Sorensen. Discrete empirical interpolation for nonlinear model reduction. SIAM J. Sci. Comp., 32(5):2737–2764, 2010.
[17] M. Drohmann, B. Haasdonk, and M. Ohlberger. Reduced basis approximation for nonlinear parametrized evolution equations based on empirical operator interpolation. SIAM J. Sci. Comp., 34:A937–A969, 2012.
[18] M. Drohmann, B. Haasdonk, and M. Ohlberger. Reduced basis approximation for nonlinear parametrized evolution equations based on empirical operator interpolation. SIAM Journal on Scientific Computing, 34(2):A937–A969, 2012.
[19] M.S. Eldred, A.A. Giunta, S.S. Collis, N.A. Alexandrov, and R.M. Lewis. Second-order corrections for surrogate-based optimization with model hierarchies. In Proceedings of the 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Albany, NY, August 2004.
[20] D. Estep. A posteriori error bounds and global error control for approximation of ordinary differential equations. SIAM Journal on Numerical Analysis, 32(1):1–48, 1995.
[21] A.I.J. Forrester, A. Sobester, and A.J. Keane. Engineering Design via Surrogate Modelling: A Practical Guide. Wiley, 2008.
[22] D. Galbally, K. Fidkowski, K. Willcox, and O. Ghattas. Non-linear model reduction for uncertainty quantification in large-scale inverse problems. International Journal for Numerical Methods in Engineering, 81(12):1581–1608, published online September 2009.
[23] S.E. Gano, J.E. Renaud, and B. Sanders. Hybrid variable fidelity optimization by using a kriging-based scaling function. AIAA Journal, 43(11):2422–2433, 2005.
[24] M.A. Grepl, Y. Maday, N.C. Nguyen, and A.T. Patera. Efficient reduced-basis treatment of nonaffine and nonlinear partial differential equations. ESAIM-Mathematical Modelling and Numerical Analysis (M2AN), 41(3):575–605, 2007.
[25] M.A. Grepl, Y. Maday, N.C. Nguyen, and A.T. Patera. Efficient reduced-basis treatment of nonaffine and nonlinear partial differential equations. M2AN, Math. Model. Numer. Anal., 41(3):575–605, 2007.
[26] M.A. Grepl and A.T. Patera. A posteriori error bounds for reduced-basis approximations of parametrized parabolic partial differential equations. M2AN, Math. Model. Numer. Anal., 39(1):157–181, 2005.
[27] B. Haasdonk. Convergence rates of the POD-greedy method. Technical Report 23, SimTech Preprint 2011, University of Stuttgart, 2011.
[28] B. Haasdonk, M. Ohlberger, and G. Rozza. A reduced basis method for evolution schemes with parameter-dependent explicit operators. Electronic Transactions on Numerical Analysis, 32:145–161, 2008.
[29] D. Huang, T.T. Allen, W.I. Notz, and R.A. Miller. Sequential kriging optimization using multiple-fidelity evaluations. Structural and Multidisciplinary Optimization, 32(5):369–382, 2006.
[30] D.B.P. Huynh, G. Rozza, S. Sen, and A.T. Patera. A successive constraint linear optimization method for lower bounds of parametric coercivity and inf-sup stability constants. C. R. Math. Acad. Sci. Paris Series I, 345:473–478, 2007.
[31] P.A. LeGresley. Application of Proper Orthogonal Decomposition (POD) to Design Decomposition Methods. PhD thesis, Stanford University, 2006.
[32] J.C.-C. Lu. An a posteriori error control framework for adaptive precision optimization using discontinuous Galerkin finite element method. PhD thesis, Massachusetts Institute of Technology, 2005.
[33] A. March and K. Willcox. Provably convergent multifidelity optimization algorithm not requiring high-fidelity derivatives. AIAA Journal, 50(5):1079–1089, 2012.
[34] M. Meyer and H.G. Matthies. Efficient model reduction in non-linear dynamics using the Karhunen-Loève expansion and dual-weighted-residual methods. Computational Mechanics, 31(1):179–191, 2003.
[35] L.W.-T. Ng and M. Eldred. Multifidelity uncertainty quantification using non-intrusive polynomial chaos and stochastic collocation. In Structures, Structural Dynamics, and Materials and Co-located Conferences. American Institute of Aeronautics and Astronautics, April 2012.
[36] N.C. Nguyen, G. Rozza, and A.T. Patera. Reduced basis approximation and a posteriori error estimation for the time-dependent viscous Burgers' equation. Calcolo, 46(3):157–185, 2009.
[37] A.T. Patera and G. Rozza. Reduced Basis Approximation and a Posteriori Error Estimation for Parametrized Partial Differential Equations. MIT, 2007. Version 1.0, Copyright MIT 2006–2007, to appear in (tentative rubric) MIT Pappalardo Graduate Monographs in Mechanical Engineering.
[38] C. Prud'homme, D.V. Rovas, K. Veroy, L. Machiels, Y. Maday, A.T. Patera, and G. Turinici. Reliable real-time solution of parametrized partial differential equations: reduced-basis output bound methods. J. Fluids Engineering, 124:70–80, 2002.
[39] D. Rajnarayan, A. Haas, and I. Kroo. A multifidelity gradient-free optimization method and application to aerodynamic design. In , volume 6020, 2008.
[40] R. Rannacher. The dual-weighted-residual method for error control and mesh adaptation in finite element methods. MAFELAP, 99:97–115, 1999.
[41] C.E. Rasmussen and C.K.I. Williams. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning series. University Press Group Limited, 2006.
[42] G. Rozza, D.B.P. Huynh, and A.T. Patera. Reduced basis approximation and a posteriori error estimation for affinely parametrized elliptic coercive partial differential equations. Arch. Comput. Meth. Eng., 15(3):229–275, 2007.
[43] D. Ryckelynck. A priori hyperreduction method: an adaptive approach. Journal of Computational Physics, 202(1):346–366, 2005.
[44] M.E. Tipping. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res., 1:211–244, September 2001.
[45] K. Urban and A.T. Patera. An improved error bound for reduced basis approximation of linear parabolic problems. Math. Comp., 2013. In press.
[46] D.A. Venditti and D.L. Darmofal. Adjoint error estimation and grid adaptation for functional outputs: application to quasi-one-dimensional flow. Journal of Computational Physics, 164(1):204–227, 2000.
[47] D. Wirtz, D. Sorensen, and B. Haasdonk. A posteriori error estimation for DEIM reduced nonlinear dynamical systems. SIAM Journal on Scientific Computing, 36(2):A311–A338, 2014.
[48] M. Yano and A.T. Patera. A spacetime variational approach to hydrodynamic stability theory. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science, 469(2155), 2013.

(Martin Drohmann)
Sandia National Laboratories, 7100 East Ave, Livermore, California 94550
E-mail address, M. Drohmann: mdrohma@sandia.gov

(Kevin Carlberg)
Sandia National Laboratories, 7100 East Ave, Livermore, California 94550
E-mail address, K. Carlberg: ktcarlb@sandia.gov