To be or not to be intrusive? The solution of parametric and stochastic equations - the "plain vanilla" Galerkin case
Loïc Giraldi, Alexander Litvinenko, Dishi Liu, Hermann G. Matthies, Anthony Nouy
a Institute of Scientific Computing, Technische Universität Braunschweig
b C²A²S²E, German Aerospace Center (DLR), Braunschweig
c École Centrale de Nantes, GeM, UMR CNRS 6183, LUNAM Université

2nd October 2018
Abstract
In parametric equations—stochastic equations are a special case—one may want to approximate the solution such that it is easy to evaluate its dependence on the parameters. Interpolation in the parameters is an obvious possibility, in this context often labelled as a collocation method. In the frequent situation where one has a "solver" for the equation for a given parameter value—this may be a software component or a program—it is evident that this can independently solve for the parameter values to be interpolated. Such uncoupled methods which allow the use of the original solver are classed as "non-intrusive". By extension, all other methods which produce some kind of coupled system are often—in our view prematurely—classed as "intrusive". We show for simple Galerkin formulations of the parametric problem—which generally produce coupled systems—how one may compute the approximation in a non-intrusive way.
Keywords: parametric problems, stochastic equations, uncertainty quantification, Galerkin approximation, coupled system, non-intrusive computation
Classification (MSC): 65J15, 60H25, 65D30, 34B08, 35B30

∗ Work partly supported by the Deutsche Forschungsgemeinschaft (DFG) through the SFB 880.
† Corresponding author: TU Braunschweig, D-38092 Braunschweig, Germany, e-mail: [email protected]

Introduction
Many problems depend on parameters, which may be a finite set of numerical values, or mathematically more complicated objects like for example processes or fields. We address the situation where we have an equation which depends on parameters; stochastic equations are a special case of such parametric problems where the parameters are elements of a probability space. One common way to represent this dependence on parameters is by evaluating the state (or solution) of the system under investigation for different values of the parameters. Particularly in the stochastic context this "sampling" is a common procedure. But often one wants to evaluate the solution quickly for a new set of parameters where it has not been sampled. In this situation it may be advantageous to express the parameter-dependent solution with an approximation which allows for rapid evaluation of the solution or functionals thereof—so-called quantities of interest (QoI)—in dependence on the parameters. Such approximations are also called proxy or surrogate models, response functions, or emulators. This last term was chosen so as to contrast with simulator, which is the original solver for the full equation. Such approximations are used in several fields, notably optimisation and uncertainty quantification, where in the last case the parameters are random variables and one deals with stochastic equations. All these methods may be seen as functional approximations—representations of the solution by an "easily computable" function of the parameters, as opposed to pure samples.

The most obvious methods of approximation used are based on interpolation, in this context often labelled as collocation methods. In this case it is usually assumed that the parameters are in some sub-domain of a manifold, usually simply just in some finite-dimensional vector space, and the interpolation is often on sparse grids [4, 2, 23, 49].
This process normally gives the approximation (interpolant) as a finite linear combination of some basis functions used for the interpolation, often global multi-variate polynomials [51], or piecewise polynomials [3, 48], or methods based on radial basis functions, kriging, or neural networks.

Another approach is to again choose a similar finite set of basis functions, but rather than interpolation use some other projection onto the subspace spanned by these functions. Usually this will involve minimising some norm of the difference between the true parametric solution and the approximation, and in many cases this norm will be induced by an inner product, often in the form of an integral w.r.t. some measure—in the case of stochastic equations this will be the underlying probability measure. These integrals in turn may be numerically evaluated through quadrature formulas—often again on sparse Smolyak or adaptive grids [42, 25, 43, 29, 24]—which need evaluations of the integrand—part of which is the parametric solution—at a finite number of parameter values. Such methods are sometimes called pseudo-spectral projections, or regression solutions, or discrete projections [11, 14, 39, 10, 44, 6, 28, 5, 45].

In the frequent situation where one has a "solver" for the equation for a given parameter value, i.e. a software component or a program, it is evident that this can be used to independently—i.e. if desired in parallel—solve for all the parameter values, which subsequently may be used either for the interpolation or in the quadrature for the projection. Such methods are therefore uncoupled for each parameter value, and obviously allow the use of the original solver. Therefore, they additionally often carry the label "non-intrusive". Without much argument all other methods—which produce a coupled system of equations—are almost always labelled as "intrusive", meaning that one cannot use the original solver, e.g. [18, 27, 46, 19, 17].
We want to show here that this is not necessarily the case.

Like most methods which are based on the solution at discrete parameter values, the non-intrusive methods mentioned above "forget" the original equation, i.e. the fact that the approximation has to satisfy the parametric equation. This is generally the state of affairs when using the proxy model in the domain of optimisation. On the other hand, methods which try to ensure that the approximation satisfies the parametric equation as well as possible are often based on a Rayleigh-Ritz or Galerkin type of "ansatz", which leads to a coupled system for the unknown coefficients [26, 33, 50, 3, 34, 22, 47, 32, 12]. This is often taken as an indication that the original solver cannot be used, i.e. that these methods are "intrusive". But in many circumstances these methods may as well be used in a non-intrusive fashion. Although there are some publications concerning special cases of non-intrusive Galerkin-like methods [1, 13, 15], this has not been widely recognised as a general possibility. A kind of in-between possibility is the so-called reduced basis method, see [7, 8] for recent expositions. Here a new basis for the parametric solution is built from solves at particular parameter values, but the "interpolation" is achieved by a Galerkin projection onto the spanned subspace. This method also establishes a connection between proxy models and reduced order models, something we will not pursue further here.

Recent developments for low-rank separated approximations [21, 9, 30, 40, 16, 41, 20] of parametric or stochastic equations are based on the minimisation of a least squares or similar functional, and naturally lead to Galerkin-type equations. Although it is important to show that these can also be dealt with in a non-intrusive manner, here we concentrate on the "plain vanilla", i.e. standard, Galerkin case.
Non-intrusive computation of separated approximations will be investigated elsewhere.

Most of the literature cited so far is concerned with the case of stochastic equations, and although these are a special case of parametric equations, the methods and techniques used there may be used in the wider context of general parametric equations; see [38] for a synopsis of these connections of such parametric problems.

The question whether a method is intrusive or not is often very important in practice. The "solver" (for a single parameter value) may contain much specialised knowledge, and may therefore represent quite a valuable investment of effort. In case the method is labelled intrusive, it may seem like the whole—often very domain-specific—process and effort of producing a solver, now for the coupled Galerkin system, would have to be repeated again. Therefore, in many cases the wish to re-use existing software guides the choice of method. But as already mentioned, some very effective new methods based on low-rank approximations fall in the class of "not obviously non-intrusive" methods; hence as a first step we show here that the simple "plain vanilla" coupled Galerkin method may be computed non-intrusively; the low-rank approximation case will be treated elsewhere.

A method for a parametric problem will here be considered intrusive if one has to modify the original software to solve the parametric problem. Thus it turns out that the question of whether a method is intrusive or not hinges on what kind of interface one has to the software, and is thus a software-engineering question. Most often it is possible to not only compute the solution for a certain parameter value—the solver being usually iterative—but also the residuum or a "preconditioned residuum" given a "trial solution". This usually means—for the preconditioned residuum—doing just one iteration with the solver instead of iterating all the way to convergence.
This is the kind of interface which will be assumed here, and we shall show that it can be used without any change to solve the Galerkin equation.

The plan for the rest of the paper is as follows: In the following Section 2 we introduce the notation and assumptions for the parametric problem. In Section 3 we introduce the Galerkin approximation, describe alternative formulations, and prove the convergence and speed of a basic block-Jacobi algorithm for the coupled Bubnov-Galerkin system. In Section 4 it is shown how the residual in the iteration may be computed non-intrusively, mainly via numerical integration. The behaviour of the modified iterates is analysed, and it is shown that they accumulate in the vicinity of the solution. A small numerical example is investigated in Section 5; it shows how the non-intrusive computation works, and it confirms the theoretical predictions.
To be more specific, let us consider the following situation: we are investigating some physical system which is modelled by an equation for its state u ∈ U — a Hilbert space for the sake of simplicity,

A(p; u) = f(p) ,   (1)

where A is an operator modelling the physics of the system, and f ∈ U∗ is some external influence (action / excitation / loading). The model depends on some parameters p ∈ P. In many cases Eq. (1) is the abstract formulation of a partial differential equation. But for the sake of simplicity we shall assume here that we are dealing with a model on a finite-dimensional space U with N := dim U, e.g. a partial differential equation after discretisation. For simplicity we will identify U and U∗, and if needed we will use an orthonormal basis {v_n}_{n=1}^N, i.e. span{v_n}_{n=1}^N = U and ⟨v_n, v_m⟩_U = δ_{n,m}, the Kronecker-δ.

Assume that for all p ∈ P, Eq. (1) is a well-posed problem. This means that A as a mapping u ↦ A(p; u) for a fixed p is bijective and continuously invertible, i.e. for each p and f it has a unique solution, which will be denoted by u∗(p), such that for all p: A(p; u∗(p)) = f(p).

Although this will not be needed here, let us remark that if the map A were also differentiable w.r.t. u, well-posedness implies that this partial derivative D_u A is non-singular and also continuously invertible. Now — if the set P has a differentiable structure, e.g. if it is a differentiable manifold or even a vector space — one may invoke a version of the implicit function theorem, which, given the partial derivatives D_p A and D_p f, provides the derivative of the state u w.r.t. p as D_p u = [D_u A]^{-1}(D_p f − D_p A). This — and higher derivatives — may be directly used in the approximation of u∗(p), as well as for a priori bounds for some approximations.
These topics will not be pursued further here.

Furthermore assume that we are also given an iterative solver — convergent for all values of p — which generates successive iterates for k = 0, 1, . . . :

u^(k+1)(p) = S(p; u^(k)(p), R(p; u^(k)(p))) , with u^(k)(p) → u∗(p) ,   (2)

where S is one cycle of the solver, which may also depend on the iteration counter k, u^(0) is some starting vector, and R(p; u^(k)(p)) is the residuum of Eq. (1):

R(u^(k)) := R(p; u^(k)(p)) := f(p) − A(p; u^(k)) .   (3)

Obviously, when the residuum vanishes — R(p; u∗(p)) = 0 — the mapping S has a fixed point u∗(p) = S(p; u∗(p), 0). The mapping S is the mathematical formalisation of the software interface we will be assuming in order to derive a non-intrusive Galerkin method, i.e. we will assume that the mapping S is applied to its inputs with one invocation of the "solver".

In the iteration in Eq. (2) we may set u^(k+1) = u^(k) + ∆u^(k) with

∆u^(k) := S(p; u^(k), R(p; u^(k))) − u^(k) , and usually   (4)

P(∆u^(k)) = R(p; u^(k)) ,   (5)

so that in Eq. (2): S(p; u^(k)) = u^(k) + P^{-1}(R(p; u^(k))), where by slight abuse of notation we have shortened the list of arguments. Here P is some preconditioner, which may depend on p, the iteration counter k, and on the current iterate u^(k); e.g. in Newton's method P = D_u A(p; u^(k)). In any case, we assume that for all arguments the map P is linear in ∆u and non-singular. The iteration corresponding to a normal solve for a particular value of p is then given in Algorithm 2.1.

Algorithm 2.1
Iteration of Eq. (2)
  Start with some initial guess u^(0)
  k ← 0
  while no convergence do        ⊲ the global iteration loop
    Compute ∆u^(k) according to Eq. (4) or Eq. (5)
    u^(k+1) ← u^(k) + ∆u^(k)
    k ← k + 1
  end while

We will assume additionally that the iteration converges at least linearly, i.e. one has ‖∆u^(k+1)(p)‖_U ≤ ̺(p) ‖∆u^(k)(p)‖_U , with ̺(p) <
1. For the convergence analysis to follow later we will assume that the convergence factors or Lipschitz constants ̺(p) are uniformly bounded for all values of p ∈ P by a constant strictly less than unity, i.e. ̺(p) ≤ ̺∗ < 1. This means that for all u, v ∈ U and p ∈ P the iterator S in Eq. (2) is uniformly Lipschitz continuous with Lipschitz constant ̺∗ <
1, i.e. a strict contraction:

‖S(p; u(p), R(p; u(p))) − S(p; v(p), R(p; v(p)))‖_U ≤ ̺∗ ‖u(p) − v(p)‖_U .   (6)

One may recall from Banach's fixed point theorem that this provides us with the a posteriori error bound

‖u∗(p) − u^(k+1)(p)‖_U ≤ (̺∗ / (1 − ̺∗)) ‖∆u^(k)(p)‖_U ,   (7)

while the satisfaction of the equation may be gauged by ‖R(p; u^(k))‖_U.

To describe the dependence of u on the parameters p one would like to approximate u∗(p) in the following fashion:

u∗(p) ≈ u_I(p) = Σ_{α∈I} u_α ψ_α(p) ,   (8)

where u_α ∈ U are vector coefficients to be determined, and ψ_α are some linearly independent functions, whose linear combinations Q_I := span{ψ_α}_{α∈I} ⊂ R^P form the Galerkin subspace of parametric "ansatz" functions, and I is some finite set of (multi-)indices of cardinality M := |I|. Often the set I has no canonical order, but for the purpose of computation later we will assume that some particular ordering has been chosen.

If we take the ansatz Eq. (8) and insert it into Eq. (1), the residuum Eq. (3) will usually not vanish for all p, as the finite set of functions {ψ_α}_{α∈I} cannot match all possible parametric variations of u(p). The
Galerkin method — also called the method of weighted residuals — determines the unknown coefficients u_α in Eq. (8) by requiring that for all β ∈ I

G_Q(ϕ_β(·) R(·; u_I)) = 0 ,   (9)

where {ϕ_β}_{β∈I} is a set of linearly independent functions of p. The residuum R(p; u_I(p)) in the argument of the linear Galerkin projector G_Q is a parametric function, and such functions may be represented by a sum R(p; u_I(p)) = Σ_n φ_n(p) v_n with φ_n ∈ R^P. Hence the projector may be defined by requiring that for scalar functions ψ, φ ∈ Q ⊆ R^P and a vector v ∈ U one has

G_Q(φ(·) ψ(·) v) = ⟨φ, ψ⟩_Q v ,   (10)

where ⟨·,·⟩_Q is some duality pairing or inner product on a subspace Q of the scalar functions, and from this G_Q can be extended by linearity:

G_Q(ϕ_β R(·; u_I)) = G_Q(ϕ_β Σ_n φ_n v_n) = Σ_n G_Q(ϕ_β φ_n v_n) = Σ_n ⟨ϕ_β, φ_n⟩_Q v_n .
It is easy to see that this definition is independent of the particular representation of the parametric function.

In case P is a measure space with measure µ, that pairing often is ⟨φ, ψ⟩_Q = ∫_P φ(p) ψ(p) µ(dp), and if µ(P) = 1, such that P may be considered as a probability space with expectation operator E(φ) = ∫_P φ(p) µ(dp), then ⟨φ, ψ⟩_Q = E(φψ). Observe that a sum like ⟨φ, ψ⟩_Q = Σ_j w_j φ(p_j) ψ(p_j) with positive weights w_j is a special form of such an integral. A bit more general would be to allow ⟨φ, ψ⟩_Q = ∫∫_{P×P} κ(p, q) φ(p) ψ(q) µ(dp) µ(dq), where κ is a symmetric positive definite kernel. What is important for what is to follow, and what we want to assume from now on, is that the pairing is given by some integral, and we will assume the form ∫_P φ(p) ψ(p) µ(dp) for the sake of simplicity.

The set {ψ_α}_{α∈I} determines the Galerkin subspace Q_I := span{ψ_α}_{α∈I} ⊆ Q, which is responsible for the approximation properties, whereas the set {ϕ_β}_{β∈I} determines the projection onto that subspace, which is important for the stability of the procedure, as the projection is orthogonal to Q̂_I := span{ϕ_β}_{β∈I}. Often one takes ϕ_β = ψ_β and hence Q̂_I = Q_I, and this is then commonly called the Bubnov-Galerkin method, whereas in the general case Q̂_I ≠ Q_I one speaks of the Petrov-Galerkin method.

Explicitly writing down Eq. (9), one obtains for all β

G_Q(ϕ_β(·)(f(p) − A(p; Σ_{α∈I} u_α ψ_α(p)))) = 0 .   (11)

It is important to recognise that Eq. (11) is a — usually coupled — system of equations for the unknown vectors u_α, of total size M × N, as M = dim Q_I and N = dim U. These equations look sufficiently different from Eq. (1), so that the common wisdom is that the solution of Eq. (11) requires new software and new methods, and that the solver Eq. (2) is of no use here.
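As a side illustration (not part of the original text), the integral pairing just described is exactly what a quadrature rule realizes in practice. A minimal numerical check, assuming P = [−1, 1] with the uniform probability measure and normalized Legendre polynomials as basis (illustrative choices, not the paper's example), might look like:

```python
import numpy as np
from numpy.polynomial import legendre

# With mu the uniform probability measure on [-1, 1] and psi_k the normalized
# Legendre polynomials, a Gauss-Legendre rule realizes <phi, psi>_Q = E(phi psi)
# and reproduces an orthonormal Gram matrix M_{beta,alpha} = delta_{beta,alpha}.
M, Z = 4, 8
p_z, w_z = legendre.leggauss(Z)
w_z = w_z / 2.0                  # total weight 2 -> probability measure

def psi(p):
    # psi_k = sqrt(2k+1) * P_k is orthonormal w.r.t. the uniform measure on [-1, 1]
    return np.array([np.sqrt(2*k + 1) * legendre.Legendre.basis(k)(p) for k in range(M)])

# Gram matrix via the quadrature rule; exact here, since the rule integrates
# polynomials up to degree 2Z-1 exactly
gram = sum(w_z[z] * np.outer(psi(p_z[z]), psi(p_z[z])) for z in range(Z))
```

The resulting `gram` is numerically the identity, which is the orthonormality assumption used repeatedly below.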
As a change or re-write of the existing software seems to be necessary, the resulting methods are often labelled "intrusive".

As a remark, observe that if one chooses ϕ_β(p) = δ_β(p) = δ(p − p_β) — the delta-"function" associated to the duality pairing ⟨·,·⟩_Q (i.e. ⟨δ_β, φ⟩_Q = φ(p_β)) — where the p_β are distinct points in P, then Eq. (9) becomes for all β:

0 = G_Q(δ_β R(·; u_I)) = R(p_β; u_I(p_β)) = f(p_β) − A(p_β; Σ_{α∈I} u_α ψ_α(p_β)) = f(p_β) − A(p_β; u_β) ,   (12)

where the last of these equalities holds only in case the basis {ψ_α} satisfies the Kronecker-δ property ψ_α(p_β) = δ_{α,β}, as then u_β = u_I(p_β). In this latter case these are M uncoupled equations, each of size N, and they have for each p_β the form of Eq. (1) — we have recovered the collocation method, which independently for each p_β computes u_β, using the solver Eq. (2). Such a method then obviously is non-intrusive, as the original software may be used. Thus this is often the method of choice, as often there is considerable investment in the software which performs Eq. (2), which one would like to re-use. Unfortunately this choice is very rigid as regards the subspace Q_I and the projection orthogonal to Q̂_I.

We believe that this is a false alternative, and that the distinction is not between intrusive or non-intrusive, but between coupled or uncoupled. Furthermore, and more importantly, we want to show that also in the more general case of a coupled system like in Eq. (11) the original solver Eq. (2) may be put to good use. This will be achieved by making Eq. (2) the starting point, instead of Eq. (1) or Eq. (3). Such coupled iterations also arise for example from multi-physics problems, and there these coupled iterations can also be solved by what is called a partitioned approach, see e.g. [35], which is the equivalent of non-intrusive here.
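The collocation route of Eq. (12) is easy to sketch in code. The following is a minimal illustration on an assumed toy problem A(p; u) = (K0 + p K1) u = f with preconditioner P = K0; all names are hypothetical, not from the paper:

```python
import numpy as np

# Toy parametric model (illustrative): A(p; u) = (K0 + p*K1) u = f
K0 = np.diag([2.0, 3.0, 4.0])
K1 = 0.1 * np.ones((3, 3))
f = np.array([1.0, 2.0, 3.0])

def solve_at(p, tol=1e-12, max_iter=200):
    """The original iterative solver Eq. (2) at one parameter value p."""
    u = np.zeros(3)
    for _ in range(max_iter):
        r = f - (K0 + p * K1) @ u          # residuum R(p; u), Eq. (3)
        if np.linalg.norm(r) <= tol:
            break
        u = u + np.linalg.solve(K0, r)     # u <- u + P^{-1} R(p; u)
    return u

p_beta = [-0.8, 0.0, 0.8]                  # distinct collocation points
u_beta = [solve_at(p) for p in p_beta]     # uncoupled, trivially parallelisable
```

Each `solve_at` call is an unmodified run of the single-parameter solver, which is exactly why this approach is uncontroversially "non-intrusive".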
Quite a number of different variants of global partitioned iterations are possible [35]; we only look at some of the simplest variants, as the point here is only to dispel the myth about intrusiveness.

Whatever the starting point, we would still like to achieve the same result. So before continuing, let us show
Proposition 1.
Projecting the fixed point equation attached to the iteration Eq. (2), namely u_I = u_I + P^{-1}(R(u_I)), is equivalent to projecting the preconditioned residual P^{-1}(R(u_I)); that means for all β ∈ I

G_Q(ϕ_β(·) P^{-1}(·)(R(·; u_I(·)))) = 0 .   (13)

Moreover, if the preconditioner P in Eq. (5) depends neither on p nor on u, then it is equivalent to projecting the residual R(u_I) from Eq. (9); that means for all β ∈ I

G_Q(ϕ_β(·) R(·; u_I(·))) = 0 .   (14)

Proof.
Eq. (13) follows simply from the linearity of G_Q. Furthermore, in case P depends neither on p nor on u, for Eq. (14) we have from Eq. (13) for any β ∈ I

G_Q(ϕ_β(·) P^{-1}(R(·; u_I(·)))) = P^{-1} G_Q(ϕ_β R(u_I)) ,   (15)

on noting that for any linear map L one has G_Q(ϕ(·) L(φ(·) v)) = ⟨ϕ, φ⟩_Q L v = L G_Q(ϕ(·) φ(·) v); as P^{-1} is non-singular, the left-hand side of Eq. (15) vanishes if and only if G_Q(ϕ_β R(u_I)) = 0.

This means that instead of the residual Eq. (3) we may just as well project the iteration Eq. (5): with the abbreviation R^(k)(·) := R(·; u^(k)(·)) we have for all β ∈ I

G_Q(ϕ_β(·) u^(k+1)) = G_Q(ϕ_β (u^(k) + ∆u^(k))) = G_Q(ϕ_β (u^(k) + P^{-1} R^(k))) .   (16)

Expanding u^(k)(p) = Σ_α u_α^(k) ψ_α(p) in Eq. (16), this becomes a coupled iteration equation for the u_α:

∀β : G_Q(ϕ_β(·) Σ_α u_α^(k+1) ψ_α(·)) = G_Q(ϕ_β(·) (Σ_α u_α^(k) ψ_α(·) + P^{-1} R^(k)(·))) ,   (17)

which may now be written as

∀β : Σ_α M_{β,α} u_α^(k+1) = Σ_α M_{β,α} u_α^(k) + G_Q(ϕ_β P^{-1} R^(k)) ,   (18)

where M_{β,α} := ⟨ϕ_β, ψ_α⟩_Q. If the coefficients u_α^(k) ∈ U are arranged column-wise in an N × M matrix u^(k) = [. . . , u_α^(k), . . . ] ∈ U^I, and similarly G_Q(P^{-1} R^(k)) = [. . . , G_Q(ϕ_α P^{-1} R^(k)), . . . ], and the M_{β,α} are viewed as entries of an M × M matrix M ∈ R^{I×I}, Eq. (18) may be compactly written as

u^(k+1) M^T = u^(k) M^T + G_Q(P^{-1} R^(k)) , or as   (19)

u^(k+1) = u^(k) + ∆_Q(u^(k)) =: S_Q(u^(k)) ,   (20)

where we have defined the two new functions ∆_Q(u^(k)) := [G_Q(P^{-1} R^(k))] M^{-T} and S_Q(u^(k)) := u^(k) + ∆_Q(u^(k)), which will be needed later for the convergence analysis in Subsection 3.3.

It is apparent that the computation will be much simplified if the ansatz functions {ψ_α}_{α∈I} and the test functions for the projection {ϕ_β}_{β∈I} are chosen bi-orthogonal, i.e. if one has for all α, β ∈ I that M_{β,α} = δ_{β,α}, i.e.
M = I, which shall be assumed from now on. Hence now

∆_Q(u^(k)) = G_Q(P^{-1} R^(k)) = [. . . , G_Q(ϕ_α P^{-1} R^(k)), . . . ] .   (21)

Eq. (20) is already a possible way of performing the iteration. The practical, non-intrusive computation of the terms in Eq. (20) still has to be considered, but we may formulate the corresponding Algorithm 3.1 and investigate its convergence beforehand. The reader who is only interested in the computational description of the non-intrusive algorithm may jump directly to Section 4.

Algorithm 3.1
Block Jacobi iteration of Eq. (20)
  Start with some initial guess u^(0)
  k ← 0
  while no convergence do        ⊲ the global iteration loop
    Compute ∆_Q(u^(k)) according to Eq. (21)
    u^(k+1) ← u^(k) + ∆_Q(u^(k)) [= S_Q(u^(k))]
    k ← k + 1
  end while

Although the underlying iteration Eq. (2) in Algorithm 2.1 may be of any kind — e.g. Newton's method — when one views Eq. (20) with regard to the block structure imposed by u = [. . . , u_β, . . . ], Algorithm 3.1 is a — maybe nonlinear — block Jacobi iteration.

Here we want to show that the map S_Q in Eq. (20) satisfies a Lipschitz condition with the same constant as in Eq. (6). This will need some more theoretical considerations. For the sake of simplicity we will assume that ⟨·,·⟩_Q is actually an inner product on the Hilbert space Q ⊆ R^P, such that Q_I ⊆ Q. The contraction condition for S_Q with contraction factor (Lipschitz constant) less than or equal to ̺∗ will only hold if the Galerkin projection is orthogonal, i.e. we have to take ϕ_α = ψ_α, which means Q̂_I = Q_I. Our previous assumption that M = I — which is now the Gram matrix of the basis {ψ_α}_{α∈I} — now means that this basis is actually orthonormal.

Parametric elements like P ∋ p ↦ u(p) ∈ U are formally in the Hilbert tensor product space of sums like Σ_n φ_n(p) v_n =: Σ_n φ_n ⊗ v_n ∈ Q ⊗ U, with the inner product of two elementary tensors φ_j ⊗ w_j ∈ Q ⊗ U (j = 1, 2) given by ⟨φ_1 ⊗ w_1, φ_2 ⊗ w_2⟩_{Q⊗U} := ⟨φ_1, φ_2⟩_Q ⟨w_1, w_2⟩_U, and then extended by bi-linearity. In the space Q, the subspace Q_I, which is finite-dimensional and hence closed, leads to the orthogonal direct sum decomposition Q = Q_I ⊕ Q_I^⊥, and hence to the orthogonal direct sum decomposition Q ⊗ U = (Q_I ⊗ U) ⊕ (Q_I^⊥ ⊗ U).

The mapping J : U^I ∋ u = [. . . , u_α, . . .
] ↦ Σ_α ψ_α(·) u_α ∈ Q_I ⊗ U ⊆ Q ⊗ U is by design bijective onto Q_I ⊗ U and may thus be used to induce a norm and inner product on U^I via ‖u‖_{U^I} := ‖J u‖_{Q⊗U} = ‖Σ_α ψ_α(·) u_α‖_{Q⊗U} = (Σ_α ‖u_α‖_U²)^{1/2}, making it a unitary map, hence ‖J‖ = 1. When viewed as a mapping into the larger space Q ⊗ U, where it is extended by slight abuse of notation by the inclusion, it remains an isometry.
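For concreteness, the arrangement of coefficients and the map J can be rendered numerically. The following illustration (hypothetical names, not from the paper) assumes a normalized Legendre basis on [−1, 1] with the uniform probability measure, and checks the isometry property by quadrature:

```python
import numpy as np
from numpy.polynomial import legendre

# The coefficients u_alpha are the columns of an N x M array; J evaluates the
# parametric function u(p) = sum_alpha u_alpha * psi_alpha(p).
N, M, Z = 3, 4, 8

def psi(p):
    # normalized (orthonormal) Legendre polynomials on [-1, 1], uniform measure
    return np.array([np.sqrt(2*k + 1) * legendre.Legendre.basis(k)(p) for k in range(M)])

rng = np.random.default_rng(0)
U = rng.standard_normal((N, M))          # coefficient block [..., u_alpha, ...]
u_at = lambda p: U @ psi(p)              # J(U), evaluated at parameter p

# isometry check: the Q (x) U norm of J(U), computed by quadrature for the
# uniform probability measure, equals the Frobenius norm of the coefficients
p_z, w_z = legendre.leggauss(Z)
w_z = w_z / 2.0
norm_sq = sum(w_z[z] * np.dot(u_at(p_z[z]), u_at(p_z[z])) for z in range(Z))
```

Here `norm_sq` agrees with Σ_α ‖u_α‖² = ‖U‖_F², reflecting that J is unitary onto its range when the basis is orthonormal.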
Lemma 2.
The maps G_Q : Q ⊗ U → U^I and J are adjoints of each other, G_Q^∗ = J, and G_Q is non-expansive, ‖G_Q‖ = ‖G_Q^∗‖ = 1.

Proof. For all v ∈ U^I and φ ⊗ w ∈ Q ⊗ U:

⟨G_Q(φ ⊗ w), v⟩_{U^I} = ⟨[. . . , ⟨ψ_α, φ⟩_Q w, . . . ], [. . . , v_α, . . . ]⟩_{U^I} = Σ_α ⟨ψ_α, φ⟩_Q ⟨w, v_α⟩_U = ⟨φ ⊗ w, Σ_α ψ_α ⊗ v_α⟩_{Q⊗U} = ⟨φ ⊗ w, J v⟩_{Q⊗U} ,   (22)

and hence G_Q^∗ = J. But J is an isometry, so that one has ‖G_Q^∗‖ = ‖J‖ = 1. As ‖G_Q‖ = ‖G_Q^∗‖, we are finished.

With the observation that

G_Q(S(·; u^(k)(·), R^(k)(·))) = G_Q(u^(k)(·) + P^{-1} R^(k)(·)) = u^(k) + G_Q(P^{-1} R^(k)) = u^(k) + ∆_Q(u^(k)) = S_Q(u^(k)) ,   (23)

the map S_Q : U^I → U^I in Eq. (20) may be factored in the following way:

S_Q : U^I −J→ Q ⊗ U −S̃→ Q ⊗ U −G_Q→ U^I ,   (24)

S_Q = G_Q ∘ S̃ ∘ J = G_Q ∘ S̃ ∘ G_Q^∗ ,   (25)

where S̃ is defined via the solver map S in Eq. (2) by

S̃ : Q ⊗ U ∋ u(·) ↦ S(·; u(·), R(·; u(·))) ∈ Q ⊗ U .   (26)

For this mapping we have the following result:

Proposition 3. In Eq. (26), the map S̃ has the same Lipschitz constant ̺∗ as the map S in Eq. (2), cf. Eq. (6); i.e. S̃ is a contraction with contraction factor ̺∗ < 1.

Proof. We now use the assumption that the inner product on Q is given by an integral, ⟨ϕ, φ⟩_Q = ∫_P ϕ(p) φ(p) µ(dp). In that case Q = L₂(P, µ; R), and the Hilbert tensor product Q ⊗ U is isometrically isomorphic to L₂(P, µ; U). Hence with Eq. (6) for all u(·), v(·) ∈ L₂(P, µ; U):

‖S̃(u(·)) − S̃(v(·))‖²_{L₂(P,µ;U)} = ∫_P ‖S(p; u(p), R(p; u(p))) − S(p; v(p), R(p; v(p)))‖²_U µ(dp) ≤ (̺∗)² ∫_P ‖u(p) − v(p)‖²_U µ(dp) = (̺∗)² ‖u(·) − v(·)‖²_{L₂(P,µ;U)} ,

and the proof is concluded by taking square roots.

This immediately leads to

Corollary 4.
The map S_Q from Eq. (20) is a contraction with contraction factor ̺∗ < 1 (see Eq. (6)):

∀ u, v ∈ U^I : ‖S_Q(u) − S_Q(v)‖_{U^I} ≤ ̺∗ ‖u − v‖_{U^I} ,   (27)

and hence the Galerkin equations have a unique solution u∗ ∈ U^I.

Proof. This follows from the decomposition Eq. (25), Lemma 2, and Proposition 3: the Lipschitz constant of S_Q = G_Q ∘ S̃ ∘ G_Q^∗ is bounded by ‖G_Q‖ · ̺∗ · ‖G_Q^∗‖ ≤ ̺∗, and existence and uniqueness of the solution follow from Banach's contraction mapping theorem.

Now we may state the main result about the convergence of the simple block-Jacobi Algorithm 3.1 for the coupled Galerkin system:
Theorem 5.
As the map S_Q from Eq. (20) has Lipschitz constant ̺∗ < 1, and is thus a contraction with the same factor as the solver S in Eq. (2), Algorithm 3.1 converges to the unique solution u∗ ∈ U^I with the same linear speed of convergence as Algorithm 2.1. Additionally, we have the a posteriori error estimate (see Eq. (7))

‖u∗ − u^(k+1)‖_{U^I} ≤ (̺∗ / (1 − ̺∗)) ‖∆_Q(u^(k))‖_{U^I} .   (28)

The satisfaction of the parametric equation may be gauged by ‖R^(k)‖_{Q⊗U} = ‖R(·; u^(k))‖_{Q⊗U}.

Proof.
Everything simply follows from Corollary 4, Banach's contraction mapping theorem, and the fact that R^(k)(·) is the residuum at iteration k before any preconditioning or projection.

Observe that this only holds for the linear convergence speed; in case Algorithm 2.1 has super-linear convergence, this cannot necessarily be matched by Algorithm 3.1; for this, more sophisticated algorithms for the coupled equations are necessary, see e.g. [35].

Non-intrusive residual
Here we want to look in more detail at the actual computation of the right-hand side of Eq. (20), in the form Eq. (21). One may observe that the term G_Q(ϕ_α P^{-1} R^(k)) in Eq. (21) is the Galerkin projection of the preconditioned residual for that iteration. Let us recall that the Galerkin projector was defined by Eq. (10) as G_Q(φ(·) ψ(·) v) = ⟨φ, ψ⟩_Q v.

In some cases [38, 37], notably when the preconditioner P depends neither on p nor on u, or when the operator A is linear or polynomial in u and linear in the parameters p, it may be possible to actually represent P^{-1} R^(k), not just in principle, but actually non-intrusively through the use of the solver S in Eq. (2) as

P^{-1} R^(k)(p) = Σ_n ρ_n(p) v_n = Σ_{n,β} ρ_{n,β} ψ_β(p) v_n ,   (29)

where ρ_{n,β} = ⟨ψ_β, ρ_n⟩_Q — remembering that we chose the ϕ_α = ψ_α orthonormal. The Galerkin projection of this then gives

G_Q(ψ_α P^{-1} R^(k)) = G_Q(ψ_α Σ_{n,β} ρ_{n,β} ψ_β v_n) = Σ_{n,β} ρ_{n,β} G_Q(ψ_α ψ_β v_n) = Σ_{n,β} ρ_{n,β} ⟨ψ_α, ψ_β⟩_Q v_n = Σ_n ρ_{n,α} v_n ,   (30)

using the linearity of G_Q and the orthonormality of the basis {ψ_α}_{α∈I}. This means that for the right-hand side of Eq. (20) in the form Eq. (21), given the representation Eq. (29), each term may be computed through the simple linear algebra operations of Eq. (30). This expression may be directly used in the block-Jacobi Algorithm 3.1 for ∆_Q(u^(k)) in the form Eq. (21), and the description of the algorithm is complete. Let us remark finally that if the solver actually returns S(p; u^(k)(p), R^(k)(p)) instead of the increment P^{-1} R^(k)(p), Algorithm 3.1 is easily adapted by computing S_Q(u^(k)) completely analogously.

The following idea to obtain a non-intrusive computation of the right-hand side of Eq. (20) in the form Eq.
(21) is more general, but involves a further approximation, namely numerical integration.

Remembering that it was assumed that the duality pairing on the scalar functions is given by an integral with measure µ,

⟨ϕ, φ⟩_Q = ∫_P ϕ(p) φ(p) µ(dp) ,   (31)

we now assume that this integral has some approximate numerical quadrature formula

∫_P φ(p) µ(dp) ≈ Σ_{z=1}^Z w_z φ(p_z) ,   (32)

where the integrand is evaluated at the quadrature points p_z and the w_z are appropriate weights.

With this approximation the term G_Q(ψ_β P^{-1} R^(k)) in Eq. (18) becomes practically computable without any further assumptions on the operator A, giving

G_Q(ψ_β P^{-1} R^(k)) ≈ ∆_{Z,β} u^(k) := Σ_z w_z ψ_β(p_z) ∆u_z^(k) , where   (33)

∆u_z^(k) := P^{-1}(p_z) R(p_z; u^(k)(p_z)) = P^{-1}(p_z) (f(p_z) − A(p_z; u^(k)(p_z))) , or   (34)

= S(p_z; u^(k)(p_z), R(p_z; u^(k)(p_z))) − u^(k)(p_z)   (35)

is the preconditioned residuum evaluated at p_z, and u^(k)(p_z) = Σ_α u_α^(k) ψ_α(p_z). This is indeed the only interface needed to the original equation, something which can be easily evaluated non-intrusively as the iteration increment ∆u_z^(k) in Eq. (34), in case the current state is given as u^(k)(p_z) for the parameter value p_z. An alternative form is given in Eq. (35), which is one iteration of the solver, starting at u^(k)(p_z) for the parameter p_z. This variant is for the case when the solver actually returns S(p; u^(k)(p), R^(k)(p)) instead of the increment P^{-1} R^(k)(p).

The term in Eq. (20) in the form of Eq. (21) has to be computed non-intrusively. Following Subsection 4.2 about numerical integration of the terms — if applicable, the analytic counterpart from Subsection 4.1 can easily be substituted — we formulate the approximation

∆_Q(u^(k)) = [. . . , G_Q(ϕ_α P^{-1} R^(k)), . . . ] ≈ ∆_Z(u^(k)) = [. . . , ∆_{Z,α} u^(k), . . . ]   (36)

of Eq. (21) in Algorithm 3.1 in Algorithm 4.1, using Eq. (33) and Eq. (34):

Algorithm 4.1
Non-intrusive computation of Eq. (21) in the form of Eq. (36)
  for α ∈ I do
    Δ_{Z,α} u^{(k)} ← 0
  end for
  for z ← 1, …, Z do                ⊲ the loop over integration points
    Compute Δu^{(k)}_z from Eq. (34)
    r_z ← w_z Δu^{(k)}_z
    for α ∈ I do
      Δ_{Z,α} u^{(k)} ← Δ_{Z,α} u^{(k)} + ψ_α(p_z) r_z
    end for
  end for

The result of this algorithm is $\Delta_Z(u^{(k)})$, the approximation of $\Delta_Q(u^{(k)})$ by numerical integration. With Algorithm 4.1 it is now possible to formulate a non-intrusive version of Algorithm 3.1, the block Jacobi iteration, in Algorithm 4.2.

Algorithm 4.2 Non-intrusive block Jacobi iteration of Eq. (20)
  Start with some initial guess ũ^{(0)} = […, ũ^{(0)}_α, …]
  k ← 0
  while no convergence do           ⊲ the global iteration loop
    Compute Δ_Z(ũ^{(k)}) = […, Δ_{Z,α} ũ^{(k)}, …] according to Algorithm 4.1
    ũ^{(k+1)} ← ũ^{(k)} + Δ_Z(ũ^{(k)})
    k ← k + 1
  end while

The sequence generated by Algorithm 4.2 has been labelled with a tilde, $\{\tilde{\mathbf u}^{(k)}\}_k$, to distinguish it from the exact sequence $\{\mathbf u^{(k)}\}_k$ generated by Algorithm 3.1. The question arises how well the original sequence $\{\mathbf u^{(k)}\}_k$ is approximated by the one produced non-intrusively by numerical integration, $\{\tilde{\mathbf u}^{(k)}\}_k$, and what its convergence behaviour is. To that effect we partially cite and conclude from Theorem 4.1 in [36]:

Theorem 6.
Assume that the numerical integration in Algorithm 4.1 is performed such that
$$
\big\| G_Q(\psi_\alpha P^{-1}R(\cdot;\tilde u^{(k)})) - \Delta_{Z,\alpha}\,\tilde u^{(k)} \big\|_{\mathcal U} \le \varepsilon/\sqrt{M};
$$
then the error in Eq. (36) is estimated by
$$
\big\| \Delta_Q(\tilde{\mathbf u}^{(k)}) - \Delta_Z(\tilde{\mathbf u}^{(k)}) \big\|_{\mathcal U^{\mathcal I}} \le \varepsilon, \qquad(37)
$$
and we have the following a posteriori error estimate for the iterates:
$$
\big\| \mathbf u^* - \tilde{\mathbf u}^{(k+1)} \big\|_{\mathcal U^{\mathcal I}} \le \frac{\varrho^*}{1-\varrho^*}\,\big\| \Delta_Z(\tilde{\mathbf u}^{(k)}) \big\|_{\mathcal U^{\mathcal I}} + \frac{\varepsilon}{1-\varrho^*}. \qquad(38)
$$
In addition, we have that
$$
\limsup_{k\to\infty}\; \big\| \mathbf u^* - \tilde{\mathbf u}^{(k)} \big\|_{\mathcal U^{\mathcal I}} \le \frac{\varepsilon}{1-\varrho^*}. \qquad(39)
$$
The satisfaction of the parametric equation may be gauged by $\| R(\cdot;\tilde u^{(k)}) \|_{\mathcal Q\otimes\mathcal U}$.

Proof.
Eq. (37) is a simple consequence of the assumption: squaring and summing $M = |\mathcal I|$ terms of size less than $\varepsilon/\sqrt{M}$, and then taking the square root. Everything else consists of statements of Theorem 4.1 in [36].

Eq. (38) shows that the modified sequence $\{\tilde{\mathbf u}^{(k)}\}_k$ will not necessarily converge to $\mathbf u^*$, even if $\Delta_Z(\tilde{\mathbf u}^{(k)}) \to 0$ as $k \to \infty$, but Eq. (39) shows that it clusters around $\mathbf u^*$ in a small neighbourhood.

To assess the effort involved in a computational procedure, and hence its efficiency, is always difficult, not least because it is not always clear how to measure computational effort. Here we take the view that the effort is counted only in solver calls, i.e. invocations of $S(p; u, R(p;u))$, Eq. (2), or equivalently of $P^{-1}R(p;u)$, Eq. (5). This means that the additional linear algebra and the computation of the $\psi_\alpha(p_z)$ involved in Algorithms 4.1 and 4.2 are considered insignificant in comparison to an invocation of the solver $S$.

The main contender for the Galerkin procedure outlined so far is what was called in the introduction a pseudo-spectral or discrete projection, or a regression. This can be described very quickly. With $\{\psi_\alpha\}_{\alpha\in\mathcal I}$ orthonormal, the coefficients in the projection $u_I = \sum_{\alpha\in\mathcal I} u_\alpha\psi_\alpha$ can simply be computed by inner products:
$$
u_\alpha = \langle\psi_\alpha, u^*\rangle_Q = \int_{\mathcal P}\psi_\alpha(p)\,u^*(p)\,\mu(\mathrm dp) \approx \sum_{z=1}^{Z} w_z\,\psi_\alpha(p_z)\,u^*(p_z). \qquad(40)
$$
One may remind oneself that this — being the orthogonal projection onto the subspace $Q_{\mathcal I} \subseteq Q$ — has the smallest error to $u^*(p)$ in the norm $\|\cdot\|_Q$, but it does not at all take into account the parametric equation. The Galerkin projection, on the other hand, will produce an approximation which is optimal in minimising the residuum. The approximation in Algorithm 4.3 to Eq. (40) is computed very similarly to Algorithm 4.1.

Algorithm 4.3
Discrete projection according to Eq. (40)
  for α ∈ I do
    u_α ← 0
  end for
  for z ← 1, …, Z do                ⊲ the loop over integration points
    Compute u(p_z) according to Algorithm 2.1
    r_z ← w_z u(p_z)
    for α ∈ I do
      u_α ← u_α + ψ_α(p_z) r_z
    end for
  end for

The iterations from Eq. (2) in Algorithm 2.1, with one solver call per iteration in Algorithm 4.3, are assumed to be contractions with contraction factor at most $\varrho^*$. Say that an iteration with contraction factor $\varrho^*$ needs $L$ iterations to converge to the desired accuracy. The discrete projection then needs $L$ solver calls at each of the $Z$ integration points, i.e. $L \times Z$ solver calls in total.

The block Jacobi variant of the coupled Galerkin system in Algorithm 4.2 needs one solver call at each of the $Z$ integration points per global iteration. But as it also converges with contraction factor $\varrho^*$ — see Corollary 4 — it needs $L$ iterations as well, i.e. in total also $L \times Z$ solver calls.

We see that in this measure of effort — solver calls — the discrete projection and the block Jacobi iteration of the Galerkin system need the same effort for comparable accuracy; something that is borne out also in the numerical example in Section 5. In case the iteration in Eq. (2) is quadratically convergent, e.g. if it is Newton's method, this cannot be matched by the block Jacobi method, which will usually have only linear convergence. Looking at the computations other than solver calls, in both algorithms integrals have to be approximated by quadrature formulas; in the discrete projection this happens only once, whereas in block Jacobi it is done in every global iteration.

Block Jacobi is probably the simplest method for coupled systems; however, it can be considerably accelerated [31, 35], ranging from simple Aitken acceleration over block Gauss-Seidel to quasi-Newton methods. In case the iterations from Eq. (2) converge only linearly, these extensions can produce an advantage for the Galerkin solution and may need considerably fewer than $L$ iterations, as "convergence information" is shared between different values of $p$ resp. $\alpha$, something which will not happen in the decoupled discrete projection. Even Newton's method [31] can be emulated on the global Galerkin system, where the action of the inverse of the derivative on a vector is approximated by finite differences and non-intrusive solver calls. This last procedure is even able to maintain quadratic convergence in case the iterations from Eq. (2) are themselves quadratically convergent. These issues will be taken up and published elsewhere.

Another area where considerable saving of work is possible in the Galerkin procedure is sparse or low-rank approximation. This comes about when viewing the solution — and also other parametric elements — as tensors, which may be used computationally in low-rank representations/approximations, see for example [36]. Again this is beyond the scope of the present paper and will be published elsewhere. Using such low-rank representations in the originally uncoupled discrete projection produces a coupled system, which then does not differ substantially from the Galerkin system. Other ways of building a low-rank representation were already discussed in the introduction and will be the subject of a future paper.
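To make the preceding discussion concrete, here is a minimal runnable sketch of Algorithms 4.1 and 4.2 on a scalar toy problem of our own choosing (the equation $10u + u^3 = 2 + p$, the names `increment` and `psi`, and all numerical values are illustrative assumptions, not taken from this paper); the "solver" is touched only through its increment $P^{-1}R(p;u)$, as in Eq. (34):

```python
import numpy as np

# Toy parametric problem (an assumption for illustration, not the paper's example):
#   A(p; u) = a*u + u**3 = f(p),  f(p) = 2 + p,  p uniform on [-1, 1].
# The preconditioner is P = a = D_u A(p; 0), so one "solver" increment is
#   Delta u = P^{-1} R(p; u) = (f(p) - a*u - u**3) / a,  cf. Eq. (34).
a = 10.0

def increment(p, u):
    """One non-intrusive call to the black-box solver: P^{-1} R(p; u)."""
    return ((2.0 + p) - a * u - u**3) / a

# Quadrature rule, Eq. (32), for the uniform measure on [-1, 1],
# and orthonormal Legendre basis functions psi_alpha.
Z = 5
pts, wts = np.polynomial.legendre.leggauss(Z)
wts = wts / 2.0
m = 4  # polynomial degree of the surrogate

def psi(alpha, p):
    """Legendre polynomial of degree alpha, orthonormal w.r.t. the uniform measure."""
    coef = np.zeros(alpha + 1)
    coef[alpha] = 1.0
    return np.polynomial.legendre.legval(p, coef) * np.sqrt(2 * alpha + 1)

# Algorithm 4.2: block Jacobi iteration on the coefficients u_alpha, with the
# preconditioned residual projected by quadrature as in Algorithm 4.1 / Eq. (33).
u_coef = np.zeros(m + 1)
for k in range(50):
    u_at_pts = sum(u_coef[al] * psi(al, pts) for al in range(m + 1))
    delta = np.array([wts[z] * increment(pts[z], u_at_pts[z]) for z in range(Z)])
    upd = np.array([np.dot(psi(al, pts), delta) for al in range(m + 1)])
    u_coef = u_coef + upd  # one global Jacobi sweep
    if np.max(np.abs(upd)) < 1e-12:
        break

# At convergence the surrogate satisfies a*u + u^3 = 2 + p at the quadrature points.
u_check = sum(u_coef[al] * psi(al, pts) for al in range(m + 1))
resid = np.max(np.abs(a * u_check + u_check**3 - (2.0 + pts)))
print("max residual at quadrature points:", resid)
```

Since the number of basis functions equals the number of quadrature points here, the converged quadrature projection of the increment vanishing forces the residual to (nearly) zero at every integration point.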
Here we want to demonstrate the procedures discussed on a tiny example which nonetheless is representative of parametric problems. It is so simple that it may be programmed in a few lines of code. The computational example derives from a little electrical resistor network with a global non-linearity. The particular resistor network we use is shown in Fig. 1.

Kirchhoff's and Ohm's laws lead to the following linear relation between the voltages $u$ and the currents $j$ fed into the nodes, where the numbering of the nodes corresponds to the equations — node 6 is grounded ($u_6 = 0$) and so needs no equation, hence $u \in \mathcal U = \mathbb R^5$, $K \in \mathbb R^{5\times 5}$:
$$
K u = j, \qquad(41)
$$
Figure 1: Electrical resistor circuit.
where the entries of $K$, given in Eq. (42), follow from the network in Fig. 1 via Kirchhoff's and Ohm's laws, and where we take all resistors to have the value $R = 100$.

To make this toy system non-linear, we add a global cubic non-linearity with an uncertain coefficient, $(p_2+2)(u^{\mathsf T}u)\,u$. We also make the feed-in current $f = (p_1+25)\hat f$ uncertain, so that we are eventually left to solve the following system for $u$ (a concrete example of Eq. (1)):
$$
A(p; u) := K u + (p_2+2)(u^{\mathsf T}u)\,u = (p_1+25)\,\hat f =: f(p), \qquad(43)
$$
$$
\hat f := [1, 0, 0, 0, 0]^{\mathsf T}, \qquad(44)
$$
where the random parameters $p = (p_1, p_2)$ are assumed uniformly and independently distributed in $[-1,1]$. The residual is
$$
R(p; u) = (p_1+25)\,\hat f - \big(K u + (p_2+2)(u^{\mathsf T}u)\,u\big), \qquad(45)
$$
and for the preconditioner we simply take $P = K = D_u A(p; 0)$.

The system can be solved in an iterative way as formulated in Eq. (2), with effectively
$$
u^{(k+1)} = S(p; u^{(k)}, R(p; u^{(k)})) = u^{(k)} + P^{-1}R(p; u^{(k)}) = K^{-1}\Big((p_1+25)\,\hat f - (p_2+2)\,\big((u^{(k)})^{\mathsf T}u^{(k)}\big)\,u^{(k)}\Big). \qquad(46)
$$
The simple iteration Eq. (46) — indeed a linearly convergent modified Newton method — converges quite well for the chosen parameters.

For the ansatz functions we take tensor products of Legendre polynomials, as they are orthogonal for the uniform measure, i.e.
we take $\psi_\alpha(p) = \tilde L_\alpha(p) = L_\alpha(p)/\|L_\alpha\|$, the multi-variate normalised Legendre polynomials, with $L_\alpha(p) = \prod_{i=1}^{2}\ell_{\alpha_i}(p_i)$, where the $\ell_j$ are the usual univariate Legendre polynomials, $\|L_\alpha\|^2 = 4\,(2\alpha_1+1)^{-1}(2\alpha_2+1)^{-1}$, and $\alpha \in (\mathbb N_0)^2$:
$$
u(p) \approx \sum_{|\alpha| \le m} u_\alpha\,\tilde L_\alpha(p) =: u_I(p), \qquad(47)
$$
with $u_\alpha = [u_{\alpha,1},\dots,u_{\alpha,5}]^{\mathsf T} \in \mathcal U = \mathbb R^5$ and $\mathcal I = \{\alpha = (\alpha_1,\alpha_2) : |\alpha| = \alpha_1+\alpha_2 \le m\} \subset (\mathbb N_0)^2$, $m \in \mathbb N$; hence for different $m \in \mathbb N$ we will have different approximation orders by polynomials of total degree $m$.

For the purpose of comparison we use two approaches to determine the coefficients $u_\alpha$ in Eq. (47): the Galerkin approach according to Algorithm 4.2, with the residuum numerically integrated according to Algorithm 4.1, giving $u_G(p)$; and collocation, or more specifically discrete projection with numerical integration according to Algorithm 4.3, giving $u_C(p)$ — both with the same integration rule. We choose here — as we are only in two dimensions — a tensor-product Gauss-Legendre quadrature. The quadrature order was always taken such that products of test- and ansatz-functions $\psi_\alpha\psi_\beta$ were integrated exactly for the chosen total polynomial degree $m$ in Eq. (47).

First we computed an $N = 1000$ sample Monte Carlo solution at random points $p_n \in \mathcal P = [-1,1]^2$, $n = 1,\dots,N$, to high accuracy by setting the convergence criterion in Algorithm 2.1 to machine tolerance. These results were taken as the reference solution for the following error estimation. We computed the root-mean-square error (RMSE) — effectively the $L_2$-norm in $\mathcal Q \otimes \mathcal U \cong L_2([-1,1]^2; \mathbb R^5)$ — as
$$
\epsilon_F = \left(\frac{1}{N}\sum_{n=1}^{N} \| u(p_n) - u_F(p_n) \|^2\right)^{1/2}, \qquad(48)
$$
where the functional approximation method $F$ is either $G$ for the Galerkin method or $C$ for the collocation method.

The two approaches were carried out to compute the coefficients $u_\alpha$ in Eq. (47). The convergence criterion for the iterative solvers was that the increment of $u$ resp. $u_\alpha$ be smaller than $\epsilon_{\mathrm{tol}}$.
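The basis and quadrature just described are easy to set up and check. The following sketch (variable names and the degree $m = 3$ are our own choices) builds the normalised tensor-product Legendre polynomials $\tilde L_\alpha$ for $|\alpha| \le m$ and verifies numerically, with a tensor-product Gauss-Legendre rule that integrates the products $\psi_\alpha\psi_\beta$ exactly, that their Gram matrix is the identity; orthonormality is taken here with respect to the unnormalised measure on $[-1,1]^2$, matching the norm $\|L_\alpha\|^2 = 4(2\alpha_1+1)^{-1}(2\alpha_2+1)^{-1}$:

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

m = 3  # total polynomial degree, as in Eq. (47)
index_set = [al for al in itertools.product(range(m + 1), repeat=2) if sum(al) <= m]

def ell(n, x):
    """Univariate Legendre polynomial of degree n."""
    coef = np.zeros(n + 1)
    coef[n] = 1.0
    return legendre.legval(x, coef)

def L_tilde(alpha, p1, p2):
    """Normalised tensor-product Legendre polynomial L_alpha / ||L_alpha||."""
    norm = 2.0 / np.sqrt((2 * alpha[0] + 1) * (2 * alpha[1] + 1))
    return ell(alpha[0], p1) * ell(alpha[1], p2) / norm

# Tensor-product Gauss-Legendre rule with m+1 points per direction: exact for
# polynomials of degree up to 2m+1 per direction, hence for all products
# psi_alpha * psi_beta with |alpha|, |beta| <= m.
x, w = legendre.leggauss(m + 1)
P1, P2 = np.meshgrid(x, x)
W = np.outer(w, w)

gram = np.array([[np.sum(W * L_tilde(a, P1, P2) * L_tilde(b, P1, P2))
                  for b in index_set] for a in index_set])
print("orthonormal:", np.allclose(gram, np.eye(len(index_set))))
```

The same `leggauss` nodes and weights, suitably scaled, serve as the quadrature rule $\{(p_z, w_z)\}$ for Algorithms 4.1 and 4.3.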
Tabulated in Table 1 are the $\epsilon_{\mathrm{tol}}$ values, obtained in a sensitivity-range investigation such that further reduction of these values would not improve accuracy, depending on the total polynomial degree $m$. The errors $\epsilon_F$ of each approach were estimated as in Eq. (48) and are displayed in Table 1, together with the number of solver ($S(p,u)$) evaluations, for total polynomial degrees $m = 2, 3, 4, 5$.

          ǫ_tol     solver evaluations         ǫ_F from Eq. (48)
                    F = C        F = G         F = C          F = G
                    Collocation  Galerkin      Collocation    Galerkin
  m = 2   10^{−}    73           81            8.×10^{−}      .×10^{−}
  m = 3   10^{−}    151          160           6.×10^{−}      .×10^{−}
  m = 4   10^{−}    268          300           4.×10^{−}      .×10^{−}
  m = 5   10^{−}    430          468           3.×10^{−}      .×10^{−}

Table 1: RMSE and number of solver evaluations of the collocation and Galerkin approaches

One may observe that the Galerkin approach needs slightly more evaluations of the solver. This comes about because the same convergence tolerance is used for different equations in the two approaches. The results essentially confirm the theoretical analysis in Section 3; for the same accuracy both approaches need about the same number of solver calls, i.e. the simple block Jacobi iteration of the Galerkin system converges at essentially the same speed as the original iteration.
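The solver-call accounting behind Table 1 can be reproduced qualitatively on a scalar stand-in for the network (our own illustration: the equation $10u + u^3 = 2 + p$, the tolerance, and all names are assumptions, not the data of Table 1). Discrete projection (Algorithm 4.3) converges the solver at each quadrature point separately, while the non-intrusive block Jacobi Galerkin iteration (Algorithm 4.2) performs one solver call per point per global sweep; both use the same Gauss-Legendre rule, and every solver call is counted:

```python
import numpy as np

# Scalar stand-in for the network example (our own toy, not Eq. (43)):
#   A(p; u) = 10*u + u**3 = 2 + p,  p uniform on [-1, 1],  preconditioner P = 10.
calls = {"C": 0, "G": 0}

def increment(tag, p, u):
    """Black-box solver increment P^{-1} R(p; u); every call is counted."""
    calls[tag] += 1
    return ((2.0 + p) - 10.0 * u - u**3) / 10.0

m = 4                 # total polynomial degree of the surrogate
Z = m + 1             # number of quadrature points
pts, wts = np.polynomial.legendre.leggauss(Z)
wts = wts / 2.0       # weights for the uniform measure on [-1, 1]
# psi[al, z] = value of the al-th orthonormal Legendre polynomial at p_z
psi = np.array([np.polynomial.legendre.legval(pts, np.eye(al + 1)[al])
                * np.sqrt(2 * al + 1) for al in range(m + 1)])
tol = 1e-10

# Algorithm 4.3, discrete projection: converge at every point, then project, Eq. (40).
u_pt = np.zeros(Z)
for z in range(Z):
    d = np.inf
    while abs(d) > tol:
        d = increment("C", pts[z], u_pt[z])
        u_pt[z] += d
u_C = psi @ (wts * u_pt)

# Algorithm 4.2, non-intrusive block Jacobi: one solver sweep per global iteration.
u_G = np.zeros(m + 1)
while True:
    vals = psi.T @ u_G                       # evaluate the iterate at the p_z
    d = np.array([increment("G", pts[z], vals[z]) for z in range(Z)])
    upd = psi @ (wts * d)                    # projected increment, Eq. (33)
    u_G = u_G + upd
    if np.max(np.abs(upd)) <= tol:
        break

print("solver calls:", calls,
      " max coefficient difference:", np.max(np.abs(u_C - u_G)))
```

On this toy problem the two call counts come out comparable, mirroring the $L \times Z$ accounting of Section 4 and the trend of Table 1.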
After reviewing the literature on numerical methods for parametric equations, with special emphasis on the subclass of stochastic equations, we have introduced a general methodology to formulate numerical methods relying on functional or spectral approximations. We have shown that the Galerkin orthogonality conditions for the residuum and the iteration equation are equivalent under certain conditions, and that the simplest iterative scheme for the coupled Galerkin system, the block Jacobi method, converges essentially at the same speed as the original solver for a single parameter value.

In the main part, for this "plain vanilla" Galerkin formulation, we have shown how to approximate the preconditioned residuum in the Galerkin equation through numerical integration, and the effects of this on the iteration sequence. These explicit non-intrusive Galerkin algorithms have then been compared on one simple, easy to understand example. The computations validated the theoretical analysis and showed that even in the simplest case of block Jacobi the Galerkin formulation is competitive with collocation. We finally recall once more the discussion in Section 4 on possibilities to accelerate the coupled Galerkin solution, something that is not possible for the decoupled collocation approach.

References

[1] S. Acharjee and N. Zabaras,
A non-intrusive stochastic Galerkin approach for modeling uncertainty propagation in deformation processes, Computers & Structures (2007), 244–254, doi:10.1016/j.compstruc.2006.10.004.

[2] I. Babuška, F. Nobile, and R. Tempone, A stochastic collocation method for elliptic partial differential equations with random input data, SIAM J. Numer. Anal. (2007), 1005–1034, doi:10.1137/050645142.

[3] I. Babuška, R. Tempone, and G. E. Zouraris, Galerkin finite element approximations of stochastic elliptic partial differential equations, SIAM J. Numer. Anal. (2004), 800–825, doi:10.1137/S0036142902418680.

[4] V. Barthelmann, E. Novak, and K. Ritter, High dimensional polynomial interpolation on sparse grids, Advances in Computational Mathematics (1999), 237–288.

[5] M. Berveiller, B. Sudret, and M. Lemaire, Stochastic finite element: a non intrusive approach by regression, European Journal of Computational Mechanics/Revue Européenne de Mécanique Numérique (2006), 81–92, doi:10.3166/remn.15.81-92.

[6] G. Blatman and B. Sudret, Adaptive sparse polynomial chaos expansion based on least angle regression, J. Comput. Phys. (2011), 2345–2367, doi:10.1016/j.jcp.2010.12.021.

[7] S. Boyaval, C. Le Bris, T. Lelièvre, Y. Maday, N. C. Nguyen, and A. T. Patera,
Reduced basis techniques for stochastic problems, Archives of Computational Methods in Engineering (2010), 435–454, doi:10.1007/s11831-010-9056-z.

[8] A. Buffa, Y. Maday, A. T. Patera, C. Prud'homme, and G. Turinici, A priori convergence of the greedy algorithm for the parametrized reduced basis method, ESAIM: Mathematical Modelling and Numerical Analysis (2012), 595–603, doi:10.1051/m2an/2011056.

[9] F. Chinesta, P. Ladevèze, and E. Cueto, A short review on model order reduction based on proper generalized decomposition, Arch. Comput. Methods Eng. (2011), 395–404, Available from: http://amb.unizar.es/PDFs/ChinestaLadevezeCueto.pdf, doi:10.1007/s11831-011-9064-7.

[10] S. Clenet, O. Moreau, V. Costan, Y. Le Menach, and A. Benabou, Adaptive method for non-intrusive spectral projection—application on a stochastic eddy current NDT problem, IEEE Transactions on Magnetics (2012), 759–762, doi:10.1109/TMAG.2011.2175204.

[11] A. Cohen, M. A. Davenport, and D. Levitan, On the stability and accuracy of least squares approximations, Foundations of Computational Mathematics (2013), Available online 01 March 2013, doi:10.1007/s10208-013-9142-3.

[12] A. Cohen, R. Devore, and C. Schwab,
Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE's, Anal. Appl. (2011), 11–47, doi:10.1142/S0219530511001728.

[13] P. G. Constantine, A. Doostan, and G. Iaccarino, A hybrid collocation/Galerkin scheme for convective heat transfer problems with stochastic boundary conditions, Int. J. Numer. Meth. Engng. (2009), 868–880, doi:10.1002/nme.2564.

[14] P. G. Constantine, M. S. Eldred, and E. T. Phipps, Sparse pseudospectral approximation method, Computer Methods in Applied Mechanics and Engineering (2012), 1–12, doi:10.1016/j.cma.2012.03.019.

[15] P. G. Constantine, D. F. Gleich, and G. Iaccarino, Spectral methods for parameterized matrix equations, SIAM Journal on Matrix Analysis and Applications (2010), no. 5, 2681–2699, doi:10.1137/090755965.

[16] A. Doostan and G. Iaccarino, A least-squares approximation of partial differential equations with high-dimensional random inputs, Journal of Computational Physics (2009), 4332–4345, doi:10.1016/j.jcp.2009.03.006.

[17] A. Doostan, A. A. Validi, and G. Iaccarino, Non-intrusive low-rank separated approximation of high-dimensional stochastic models, Computer Methods in Applied Mechanics and Engineering (2013), 42–55, doi:10.1016/j.cma.2013.04.003.

[18] M. S. Eldred, C. G. Webster, and P. G. Constantine, Evaluation of non-intrusive approaches for Wiener-Askey generalized polynomial chaos, Proceedings of the 10th AIAA Non-Deterministic Approaches Conference, number AIAA-2008-1892, Schaumburg, IL, vol. 117, 2008, p. 189, doi:10.2514/6.2008-1892.

[19] M. S. Eldred, Design under uncertainty employing stochastic expansion methods, International Journal for Uncertainty Quantification (2011), doi:10.1615/IntJUncertaintyQuantification.v1.i2.20.

[20] M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies, and E. Zander, Efficient analysis of high dimensional data in tensor formats, Sparse Grids and Applications (J. Garcke and M. Griebel, eds.), Lecture Notes in Computational Science and Engineering, vol. 88, Springer, Berlin, 2013, pp. 31–56, doi:10.1007/978-3-642-31703-3_2.

[21] A. Falcó and A. Nouy,
Proper generalized decomposition for nonlinear convex problems in tensor Banach spaces, Numerische Mathematik (2012), 503–530, doi:10.1007/s00211-011-0437-5.

[22] P. Frauenfelder, C. Schwab, and R. A. Todor, Finite elements for elliptic problems with stochastic coefficients, Computer Methods in Applied Mechanics and Engineering (2005), 205–228.

[23] B. Ganapathysubramanian and N. Zabaras, Sparse grid collocation schemes for stochastic natural convection problems, J. Comput. Phys. (2007), 652–685, doi:10.1016/j.jcp.2006.12.014.

[24] T. Gerstner and M. Griebel, Dimension-adaptive tensor-product quadrature, Computing (2003), 65–87, doi:10.1007/s00607-003-0015-5.

[25] T. Gerstner and M. Griebel, Numerical integration using sparse grids, Numerical Algorithms (1998), 209–232.

[26] R. Ghanem and P. D. Spanos, Stochastic finite elements—a spectral approach, Springer, Berlin, 1991.

[27] M. Herzog, A. Gilg, M. Paffrath, P. Rentrop, and U. Wever, Intrusive versus non-intrusive methods for stochastic finite elements, From Nano to Space, Springer, Berlin, 2008, pp. 161–174, doi:10.1007/978-3-540-74238-8_13.

[28] S. Hosder, R. W. Walters, and M. Balch, Point-collocation nonintrusive polynomial chaos method for stochastic computational fluid dynamics, AIAA Journal (2010), 2721–2730, doi:10.2514/1.39389.

[29] A. Keese and H. G. Matthies, Numerical methods and Smolyak quadrature for nonlinear stochastic partial differential equations, Informatikbericht 2003-5, Technische Universität Braunschweig, Brunswick, 2003, Available from: http://digibib.tu-bs.de/?docid=00001471.

[30] B. N. Khoromskij and C. Schwab, Tensor-structured Galerkin approximation of parametric and stochastic PDEs, SIAM Journal on Scientific Computing (2011), no. 1, 364–385, doi:10.1137/100785715.

[31] W. Mackens, J. Menck, and H. Voss, Coupling iterative subsystem solvers, Scientific Computing in Chemical Engineering (Berlin) (F. Keil, W. Mackens, H. Voss, and J. Werther, eds.), Springer, 1999, pp. 184–191.

[32] H. G. Matthies,
Stochastic finite elements: Computational approaches to stochastic partial differential equations, Zeitschrift für angewandte Mathematik und Mechanik (2008), no. 11, 849–873.

[33] H. G. Matthies and C. Bucher, Finite elements for stochastic media problems, Computer Methods in Applied Mechanics and Engineering (1999), no. 1–4, 3–17.

[34] H. G. Matthies and A. Keese, Galerkin methods for linear and nonlinear elliptic stochastic partial differential equations, Computer Methods in Applied Mechanics and Engineering (2005), no. 12–16, 1295–1331. MR MR2121216 (2005j:65146)

[35] H. G. Matthies, R. Niekamp, and J. Steindorf, Algorithms for strong coupling procedures, Computer Methods in Applied Mechanics and Engineering (2006), no. 17–18, 2028–2049, doi:10.1016/j.cma.2004.11.032. MR 2202913 (2006h:74023)

[36] H. G. Matthies and E. Zander, Solving stochastic systems with low-rank tensor compression, Linear Algebra and its Applications (2012), 3819–3838, doi:10.1016/j.laa.2011.04.017.

[37] H. G. Matthies, White noise analysis for stochastic partial differential equations, Numerical and Symbolic Scientific Computing (Berlin) (U. Langer and P. Paule, eds.), Texts and Monographs in Symbolic Computation, vol. 1, 2012, pp. 157–174, doi:10.1007/978-3-7091-0794-2_8.

[38] H. G. Matthies, A. Litvinenko, O. Pajonk, B. V. Rosić, and E. Zander, Parametric and uncertainty computations with tensor product representations, Uncertainty Quantification in Scientific Computing (Berlin) (A. Dienstfrey and R. Boisvert, eds.), IFIP Advances in Information and Communication Technology, vol. 377, Springer, 2012, pp. 139–150, doi:10.1007/978-3-642-32677-6.

[39] G. Migliorati, F. Nobile, E. von Schwerin, and R. Tempone, Analysis of the discrete L² projection on polynomial spaces with random evaluations, MATHICSE Technical Report 29, École Polytechnique Fédérale de Lausanne, Lausanne, 2011, Available from: http://jahia-prod.epfl.ch/files/content/sites/csqi/files/Reports/29.2011_GM-FN-ES-RT.pdf.

[40] A. Nouy, Proper generalized decompositions and separated representations for the numerical solution of high dimensional stochastic problems, Archives of Computational Methods in Engineering (2010), 403–434, doi:10.1007/s11831-010-9054-1.

[41] A. Nouy and O. P. Le Maître, Generalized spectral decomposition for stochastic nonlinear problems, Journal of Computational Physics (2009), no. 1, 202–235, doi:10.1016/j.jcp.2008.09.010.

[42] E. Novak and K. Ritter, High dimensional integration of smooth functions over cubes, Numerische Mathematik (1996), 79–97.

[43] K. Petras, Fast calculation of coefficients in the Smolyak algorithm, Numerical Algorithms (2001), 93–109.

[44] G. Poëtte and D. Lucor, Non intrusive iterative stochastic spectral representation with application to compressible gas dynamics, J. Comput. Phys. (2012), 3587–3609, doi:10.1016/j.jcp.2011.12.038.

[45] M. T. Reagan, H. N. Najm, R. G. Ghanem, and O. M. Knio,
Uncertainty quantification in reacting-flow simulations through non-intrusive spectral projection, Combustion and Flame (2003), 545–555, doi:10.1016/S0010-2180(02)00503-5.

[46] G. Stefanou, The stochastic finite element method: past, present and future, Computer Methods in Applied Mechanics and Engineering (2009), 1031–1051, doi:10.1016/j.cma.2008.11.007.

[47] R. A. Todor and C. Schwab, Convergence rates for sparse chaos approximations of elliptic problems with stochastic coefficients, IMA J. Numer. Anal. (2007), no. 2, 232–261. MR MR2317004

[48] X. Wan and G. Karniadakis, An adaptive multi-element generalized polynomial chaos method for stochastic differential equations, J. Comput. Phys. (2005), 617–642, doi:10.1016/j.jcp.2005.03.023.

[49] D. Xiu and J. S. Hesthaven, High-order collocation methods for differential equations with random inputs, SIAM J. Sci. Comput. (2005), 1118–1139, doi:10.1137/040615201.

[50] D. Xiu and G. E. Karniadakis, Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos, Computer Methods in Applied Mechanics and Engineering (2002), 4927–4948.

[51] D. Xiu and G. E. Karniadakis, The Wiener-Askey polynomial chaos for stochastic differential equations, SIAM Journal on Scientific Computing (2002), 619–644.