Bayesian Numerical Homogenization

Houman Owhadi∗

March 15, 2018
Abstract
Numerical homogenization, i.e., the finite-dimensional approximation of solution spaces of PDEs with arbitrary rough coefficients, requires the identification of accurate basis elements. These basis elements are oftentimes found after a laborious process of scientific investigation and plain guesswork. Can this identification problem be facilitated? Is there a general recipe/decision framework for guiding the design of basis elements? We suggest that the answer to the above questions could be positive based on the reformulation of numerical homogenization as a Bayesian inference problem in which a given PDE with rough coefficients (or multi-scale operator) is excited with noise (random right hand side/source term) and one tries to estimate the value of the solution at a given point based on a finite number of observations. We apply this reformulation to the identification of bases for the numerical homogenization of arbitrary integro-differential equations and show that these bases have optimal recovery properties. In particular we show how Rough Polyharmonic Splines can be re-discovered as the optimal solution of a Gaussian filtering problem.
This paper is inspired by a curious (and, perhaps, overlooked) link between Bayesian Inference and Numerical Analysis [20], known as Bayesian Numerical Analysis [20, 63, 49, 50], which can be traced back to Poincaré's course on Probability Theory [62]. We will recall Diaconis' compelling example [20] as an illustration of this link.

Let $f : [0,1] \to \mathbb{R}$ be a given function and assume that we are interested in the numerical approximation of $\int_0^1 f(t)\,dt$. The Bayesian approach to this quadrature problem is to (1) put a prior (probability distribution) on the continuous functions $C[0,1]$, (2) calculate $f$ at $x_1, x_2, \dots, x_n$ (to obtain the data $(f(x_1),\dots,f(x_n))$), (3) compute a posterior, and (4) estimate $\int_0^1 f(t)\,dt$ by the Bayes rule.

If the prior on $C[0,1]$ is that of a Brownian motion (i.e., $f(t) = B_t$ where $B_t$ is a Brownian motion and $B_0$ is normal), then $\mathbb{E}\big[f(x)\,\big|\,f(x_1),\dots,f(x_n)\big]$ is the piecewise linear interpolation of $f$ between the points $x_1,\dots,x_n$ and one re-discovers the trapezoidal quadrature rule.

∗California Institute of Technology, Computing & Mathematical Sciences, MC 9-94, Pasadena, CA 91125, [email protected]
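To make Diaconis' example concrete, here is a minimal numerical sketch (the integrand, observation points and evaluation grid are illustrative choices, not taken from this paper; this sketch pins $B_0 = 0$): under the Brownian-motion prior, the posterior mean given point values is the piecewise linear interpolant, and integrating it recovers the trapezoidal rule.

```python
import numpy as np

def bm_cov(s, t):
    # Brownian motion covariance: E[B_s B_t] = min(s, t)
    return np.minimum.outer(s, t)

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

f = lambda t: np.sin(2 * np.pi * t) + t**2   # illustrative integrand
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0])     # observation points (B_0 = 0 is pinned by the prior)
y = f(x)

t = np.linspace(0.0, 1.0, 2001)
# Posterior mean E[f(t) | f(x_1), ..., f(x_n)] under the Brownian-motion prior:
m = bm_cov(t, x) @ np.linalg.solve(bm_cov(x, x), y)

# The posterior mean is the piecewise linear interpolant through (0,0), (x_i, y_i) ...
assert np.allclose(m, np.interp(t, np.r_[0.0, x], np.r_[0.0, y]), atol=1e-8)

# ... so the Bayes estimate of the integral is the trapezoidal rule on those nodes:
print(trapezoid(m, t), trapezoid(np.r_[0.0, y], np.r_[0.0, x]))
```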
If the prior on $C[0,1]$ is that of the first integral of a Brownian motion (i.e., $f(t) \sim \int_0^t B_s\,ds$), then the posterior $\mathbb{E}\big[f(x)\,\big|\,f(x_1),\dots,f(x_n)\big]$ is the cubic spline interpolant, and integrating $k$ times yields splines of order $2k+1$. Although this link has led to the identification of new quadrature rules for numerical integration [49], it appears to have remained little known, and our paper is prompted by the question of the existence of a similar link between Bayesian Inference and Numerical Homogenization.

As a prototypical example, consider the numerical homogenization of the PDE
$$
\begin{cases}
-\operatorname{div}\big(a(x)\nabla u(x)\big) = g(x), & x \in \Omega,\ g \in L^2(\Omega),\\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{1.1}
$$
where $\Omega$ is a bounded subset of $\mathbb{R}^d$ with piecewise Lipschitz boundary and $a$ is a symmetric, uniformly elliptic $d \times d$ matrix on $\Omega$ with entries in $L^\infty(\Omega)$.

Recall that numerical homogenization concerns the approximation of the solution space of (1.1) with a finite-dimensional space. Although classical homogenization concepts [12, 46, 64, 36, 61, 44] might be present in some instances of this problem [43, 42, 3, 25, 2, 37, 15, 16, 31, 11, 32, 1], one of the main objectives of numerical homogenization is to achieve a numerical approximation of the solution space of (1.1) with arbitrary rough coefficients, i.e., in particular, without the assumptions found in classical homogenization, such as scale separation, ergodicity at fine scales and $\epsilon$-sequences of operators. In this situation, piecewise linear finite elements can perform arbitrarily badly [10] and the numerical approximation of the solution space involves the identification of accurate basis elements adapted to the microstructure $a(x)$ [70, 9, 6, 57, 58, 56, 28, 27, 5, 4, 48, 18, 17, 19, 13, 7, 59, 26, 45, 39].

As for the identification of quadrature rules in numerical analysis, the identification of accurate basis elements in numerical homogenization has been based on a difficult process of scientific investigation. Let us now turn our attention to the Bayesian approach to this problem. An immediate question is: where do we place the prior? (1) If the prior is placed on $u$ then posterior values do not see (depend on) the microstructure. (2) If the prior is placed on $a$ then the microstructure becomes random, whereas our purpose is the numerical homogenization of a given deterministic microstructure. Let us also note that the randomization of the microstructure, as investigated by Polynomial Chaos Approximation/Stochastic Expansion methods [35, 34, 72, 8, 30, 21], does not lead to the simplification seen after homogenization but to increased complexity with the dimension of input stochastic variables [66, 14] (although Stochastic Expansion methods have been used successfully to beat Monte Carlo sampling, they do not lead to the averaging results seen in homogenization). (3) If the prior is placed on $g$ then the noise propagates through the microstructure and the posterior value of $u$ contains that information.

This observation motivates us to place the prior on the source term $g$ in (1.1), e.g., replace it by white noise (i.e., a centered Gaussian field $\xi(x)$ on $\Omega$ with covariance function $\delta(x-y)$) and consider the stochastic PDE
$$
\begin{cases}
-\operatorname{div}\big(a(x)\nabla u(x)\big) = \xi(x), & x \in \Omega,\\
u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{1.2}
$$
Observe that the solution of (1.2) at the point $x$, $u(x)$, is a random variable and its best (mean squared) approximation given $u(x_1),\dots,u(x_N)$ (the values of the solution of (1.2) at the points $x_1,\dots,x_N$ form the data) is its conditional expectation $\mathbb{E}\big[u(x)\,\big|\,u(x_1),\dots,u(x_N)\big]$.
One result of this paper is that
$$
\mathbb{E}\big[u(x)\,\big|\,u(x_1),\dots,u(x_N)\big] = \sum_{i=1}^N u(x_i)\,\phi_i(x),
\tag{1.3}
$$
where the functions $\phi_i$ are the Rough Polyharmonic Splines (RPS) of [60], which have been identified as accurate basis elements for the numerical homogenization of (1.1), with noteworthy variational, optimal recovery and localization properties. The discovery of these Rough Polyharmonic Splines required a significant amount of work and trial and error, but here they are identified after a single step of Bayesian conditioning.

This observation motivates us to investigate what the same process of Bayesian conditioning would give under different priors and under other observations than the values of $u$ at individual points (we will consider data formed by the values of a finite number of linear functions of $u$). In particular, we will use this link between Bayesian Inference and Numerical Homogenization to identify bases for the numerical homogenization of arbitrary linear integro-differential equations. Our purpose is to show that this link is generic and could in principle be used, beyond numerical homogenization, as a guiding principle for the coarse-graining of multi-scale systems. The Bayesian approach to this problem is to (1) put a prior on the degrees of freedom of the system, (2) select a finite number of coarse variables, and (3) compute the posterior value of the state of the system conditioned on the coarse variables.

Let $\mathcal{L}$ and $\mathcal{B}$ be linear integro-differential operators on $\Omega$ and $\partial\Omega$ such that (1) $(\mathcal{L},\mathcal{B}) : H(\Omega) \to H_{\mathcal{L}}(\Omega) \times H_{\mathcal{B}}(\partial\Omega)$, where $H(\Omega)$, $H_{\mathcal{L}}(\Omega)$ and $H_{\mathcal{B}}(\partial\Omega)$ are Hilbert spaces of generalized functions on $\Omega$ and $\partial\Omega$, and (2) $H_{\mathcal{L}}(\Omega)$ contains $L^2(\Omega)$ and $H(\Omega)$ is contained in $L^2(\Omega)$.

Consider the integro-differential equation
$$
\begin{cases}
\mathcal{L} u(x) = g(x), & x \in \Omega,\\
\mathcal{B} u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{2.1}
$$
As with (1.1), the numerical homogenization of (2.1) will require the assumption that $g$ belongs to a strict subspace of $H_{\mathcal{L}}(\Omega)$.

We will assume that $\mathcal{L}$ and $\mathcal{B}$ are such that (2.1) (1) admits a unique solution in $H(\Omega)$ and (2) admits a Green's function $G$. Recall that $G$ is defined as the solution of
$$
\begin{cases}
\mathcal{L} G(x,y) = \delta(x-y), & x \in \Omega,\\
\mathcal{B} G(x,y) = 0 & \text{for } x \in \partial\Omega,
\end{cases}
\tag{2.2}
$$
where $\delta(\cdot - y)$ is the Dirac delta mass at the point $y$.

Example 2.1. Note that for the prototypical example (1.1) we have
$$
\mathcal{L} u(x) := -\operatorname{div}\big(a(x)\nabla u(x)\big) \quad \text{and} \quad \mathcal{B} u(x) = u(x).
\tag{2.3}
$$

Our purpose is to identify a good basis for the numerical homogenization or coarse-graining of (2.1). Our Bayesian approach to the numerical homogenization of (2.1) is to replace the source term $g$ by a Gaussian field $\xi$. More precisely, we introduce $\xi$, a centered Gaussian field on $\Omega$ with covariance function
$$
\Lambda(x,y) := \mathbb{E}\big[\xi(x)\xi(y)\big],
\tag{3.1}
$$
and consider the stochastic integro-differential equation
$$
\begin{cases}
\mathcal{L} u(x) = \xi(x), & x \in \Omega,\\
\mathcal{B} u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{3.2}
$$

Proposition 3.1.
The solution of (3.2) is a Gaussian field on $\Omega$ whose covariance function $\Gamma(x,y) := \mathbb{E}\big[u(x)u(y)\big]$ is
$$
\Gamma(x,y) = \int_{\Omega^2} G(x,z)\,\Lambda(z,z')\,G(y,z')\,dz\,dz'.
\tag{3.3}
$$

Remark 3.2.
Write $(\mathcal{L}^*, \mathcal{B}^*)$ for the adjoint of $(\mathcal{L}, \mathcal{B})$ with respect to the (scalar) product defined on $H(\Omega)$ by $\langle u, v\rangle_{L^2} := \int_\Omega u(x)v(x)\,dx$. Observe that $G(y,x)$ (the transpose of $G(x,y)$ with respect to the scalar product $\langle\cdot,\cdot\rangle_{L^2}$) is the Green's function of $(\mathcal{L}^*, \mathcal{B}^*)$ (the complex conjugation of the Green's function is not required to define its adjoint because the scalar product is bilinear and not sesquilinear). Observe that if $\xi$ is white noise (i.e., $\Lambda(x,y) = \delta(x-y)$) then
$$
\Gamma(x,y) = \int_\Omega G(x,z)G(y,z)\,dz,
\tag{3.4}
$$
which is the kernel of $(\mathcal{L}^*\mathcal{L})^{-1}$, i.e., $\mathcal{L}^*\mathcal{L}\,\Gamma(x,y) = \delta(x-y)$.

Proof. Since $\mathcal{L}$ and $\mathcal{B}$ are linear operators, $u$ is a linear function of $\xi$ and is therefore a Gaussian field. Moreover, its covariance function is given by
$$
\Gamma(x,y) = \mathbb{E}\big[u(x)u(y)\big] = \mathbb{E}\Big[\int_{\Omega^2} G(x,z)\xi(z)G(y,z')\xi(z')\,dz\,dz'\Big] = \int_{\Omega^2} G(x,z)G(y,z')\,\mathbb{E}\big[\xi(z)\xi(z')\big]\,dz\,dz',
\tag{3.5}
$$
which finishes the proof.

Remark 3.3. Beyond Bayesian homogenization, equations with random right hand side can also be of interest in practical applications, for instance in the modeling of the electrostatics in nanoscale field-effect sensors, where fluctuations arise from random charge concentrations [41].
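As a numerical sanity check of Proposition 3.1 and (3.4), here is a minimal sketch under a standard 1D finite-difference discretization (the grid, the rough coefficient and the sample size are illustrative assumptions): discrete white noise on a grid of mesh size $h$ has covariance $I/h$, so the covariance of $u = L_h^{-1}\xi$ is $L_h^{-1}L_h^{-T}/h$, the discrete analogue of $\int_\Omega G(x,z)G(y,z)\,dz$.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D finite-difference discretization of L = -d/dx(a(x) d/dx) on (0,1), u(0)=u(1)=0.
n, h = 100, 1.0 / 101
a = np.exp(rng.uniform(-1.0, 1.0, n + 1))      # rough coefficient, one value per cell
L = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:n], 1) - np.diag(a[1:n], -1)) / h**2

# Discrete analogue of (3.4): white noise on the grid has covariance I/h, so
# Cov(u) = L^{-1} (I/h) L^{-T}, the counterpart of int G(x,z) G(y,z) dz.
Ginv = np.linalg.inv(L)
Gamma = Ginv @ Ginv.T / h

# Monte Carlo check of Proposition 3.1.
m = 20000
xi = rng.standard_normal((m, n)) / np.sqrt(h)  # discrete white noise samples
u = xi @ Ginv.T                                # u = L^{-1} xi, one sample per row
err = np.abs(u.T @ u / m - Gamma).max() / np.abs(Gamma).max()
print(f"relative Monte Carlo error: {err:.3g}")  # decays like m^{-1/2}
```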
We will show that the choice of the noise $\Lambda$ can be determined by the regularity of the source term $g$ in the right hand side of (2.1). More precisely, if $\xi$ is white noise ($\Lambda(x,y) = \delta(x-y)$) then the resulting accuracy estimates will be obtained under the assumption that $g \in L^2(\Omega)$ and as a function of $\|g\|_{L^2(\Omega)}$.

If $\xi$ is not white noise (i.e., if its covariance function is not $\delta(x-y)$) then we assume that there exist two linear integro-differential operators $\mathcal{L}_\Lambda$ and $\mathcal{B}_\Lambda$ such that $\xi$ is the stochastic solution of the following equation with white noise $\xi'$ as the source term:
$$
\begin{cases}
\mathcal{L}_\Lambda \xi(x) = \xi'(x), & x \in \Omega,\\
\mathcal{B}_\Lambda \xi = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{3.6}
$$
In what follows, if $\xi$ is not white noise then we assume it to be obtained as in (3.6), and the resulting accuracy estimates will be obtained under the assumption that $\mathcal{L}_\Lambda g \in L^2(\Omega)$ and as a function of $\|\mathcal{L}_\Lambda g\|_{L^2(\Omega)}$. A prototypical example corresponds to the situation where $\xi$ is obtained as the regularization of white noise via a power of the Laplace–Dirichlet operator on $\Omega$; this allows us to identify optimal recovery bases under the assumption that $g \in H^s(\Omega)$ with $s \geq 0$.

Let $N$ be a strictly positive integer. Our Bayesian approach is based on the conditioning of the solution of (3.2) posterior to the observation of $N$ linear functions of $u(x)$, expressed as
$$
\int_\Omega u(x)\psi_i(x)\,dx, \qquad i \in \{1,\dots,N\},
\tag{3.7}
$$
where $\psi_1,\dots,\psi_N$ are $N$ linearly independent generalized functions (distributions) on $\Omega$ such that for all $i$
$$
\int_{\Omega^2} \psi_i(x)\Gamma(x,y)\psi_i(y)\,dx\,dy < \infty.
\tag{3.8}
$$
Examples of $\psi_i$ include Dirac masses ($\psi_i(x) = \delta(x - x_i)$), indicator functions of subsets of $\Omega$, and elements of $L^2(\Omega)$. Let $\Theta$ be the $N \times N$ symmetric matrix defined by
$$
\Theta_{i,j} := \int_{\Omega^2} \psi_i(x)\Gamma(x,y)\psi_j(y)\,dx\,dy.
\tag{3.9}
$$
Note that (3.8) implies that if $u$ is the solution of (3.2) then
$$
\Psi := \Big(\int_\Omega u(x)\psi_1(x)\,dx, \dots, \int_\Omega u(x)\psi_N(x)\,dx\Big)
\tag{3.10}
$$
is a well defined centered Gaussian random vector with covariance matrix $\Theta$.

We will from now on assume that the covariance function (3.1) is not degenerate, in the sense that for $f \in H(\Omega)$,
$$
\|f\|_\Lambda^2 := \int_{\Omega^2} f(x)\Lambda(x,y)f(y)\,dx\,dy
\tag{3.11}
$$
is zero if and only if $f$ is the null function. Note that if $\xi$ is obtained via (3.6) then $\|f\|_\Lambda^2 = \|\mathcal{L}_\Lambda^{-1} f\|_{L^2(\Omega)}^2$ (writing $\mathcal{L}_\Lambda^{-1} f$ for the solution of $\mathcal{L}_\Lambda u = f$ in $\Omega$ with $\mathcal{B}_\Lambda u = 0$ on $\partial\Omega$), and the non-degeneracy of $\Lambda$ is equivalent to that of the operator $\mathcal{L}_\Lambda$.

Lemma 3.4.
The $N \times N$ matrix $\Theta$ is symmetric positive definite. Furthermore, for all $l \in \mathbb{R}^N$,
$$
l^T \Theta\, l = \|v\|_\Lambda^2,
\tag{3.12}
$$
where $v$ is the solution of
$$
\begin{cases}
\mathcal{L}^* v(x) = \sum_{j=1}^N l_j \psi_j(x) & \text{for } x \in \Omega,\\
\mathcal{B}^* v(x) = 0 & \text{for } x \in \partial\Omega.
\end{cases}
\tag{3.13}
$$

Proof.
We obtain from (3.3) that for $l \in \mathbb{R}^N$,
$$
l^T \Theta\, l = \int_{\Omega^2} \Big(\int_\Omega \sum_{i=1}^N l_i\psi_i(x)\, G(x,z)\,dx\Big)\,\Lambda(z,z')\,\Big(\int_\Omega \sum_{j=1}^N l_j\psi_j(y)\, G(y,z')\,dy\Big)\,dz\,dz'.
\tag{3.14}
$$
Write
$$
v(x) := \sum_{i=1}^N l_i \int_\Omega G(y,x)\psi_i(y)\,dy.
\tag{3.15}
$$
Since $G(\cdot,x)$ is the Green's function of the adjoint operator (Remark 3.2), it follows that $v$ is the solution of (3.13) and $\|v\|_\Lambda^2 = l^T\Theta\, l$, which implies that $\Theta$ is symmetric positive definite. Indeed, if $\Theta$ were not positive definite, there would exist a non-zero vector $l \in \mathbb{R}^N$ such that $l^T\Theta\, l = 0$. This would imply $\|v\|_\Lambda = 0$, which contradicts the non-degeneracy of $\Lambda$ since equation (3.13) has a non-zero solution (because $l \neq 0$ and the $\psi_i$ are linearly independent).

Our motivation for using Gaussian noise in (3.2) lies in the fact that for Gaussian fields, conditional expected values can be computed via linear projection. Hence our approach is also akin to Gaussian filtering for numerical homogenization, and the following theorem shows that this approach allows for the identification of a (projection) basis $\phi_i$.
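As a preview of the theorem that follows, here is a discrete sketch of this linear projection in the same illustrative 1D setup as above (grid, coefficient and measurement locations are again assumptions): with Dirac-mass measurements, conditioning reduces to the familiar linear algebra of Gaussian conditioning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative 1D setup as above.
n, h = 100, 1.0 / 101
a = np.exp(rng.uniform(-1.0, 1.0, n + 1))
L = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:n], 1) - np.diag(a[1:n], -1)) / h**2
Ginv = np.linalg.inv(L)
Gamma = Ginv @ Ginv.T / h                      # covariance of u = L^{-1}(white noise)

# Dirac-mass measurements psi_i = delta(. - x_i): the data is u at N grid points.
idx = np.array([10, 30, 50, 70, 90])
Theta = Gamma[np.ix_(idx, idx)]                # matrix (3.9)
phi = Gamma[:, idx] @ np.linalg.inv(Theta)     # basis (3.18), one column per phi_i

# phi_i(x_j) = delta_ij, so the conditional mean sum_i u(x_i) phi_i interpolates:
assert np.allclose(phi[idx, :], np.eye(len(idx)), atol=1e-8)

# Conditional variance (3.19): nonnegative, and zero at the measurement points.
sigma2 = np.diag(Gamma) - np.einsum('ij,ij->i', phi, Gamma[:, idx])
assert sigma2.min() > -1e-8 and np.allclose(sigma2[idx], 0.0, atol=1e-8)
print("max conditional std:", np.sqrt(sigma2.max()))
```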
Let u be the solution of (3.2) and Ψ defined by (3.10) , then E (cid:2) u ( x ) (cid:12)(cid:12) Ψ (cid:3) = N X i =1 Ψ i φ i ( x ) , (3.16)6 ith Ψ i := Z Ω u ( y ) ψ i ( y ) dy, (3.17) and φ i ( x ) := N X j =1 Θ − i,j Z Ω Γ( x, y ) ψ j ( y ) dy. (3.18) Furthermore, u ( x ) conditioned on the value of Ψ is a Gaussian random variable withmean (3.16) and variance σ ( x ) = Γ( x, x ) − N X i,j =1 Θ − i,j Z Ω Γ( x, y ) ψ j ( y ) dy Z Ω Γ( x, y ) ψ i ( y ) dy. (3.19) Proof.
Let
$$
u_\Psi(x) := \mathbb{E}\big[u(x)\,\big|\,\Psi\big].
\tag{3.20}
$$
Since $u$ and $\Psi$ belong to the same Gaussian space, it follows that $u_\Psi$ is a linear function of $\Psi$, obtained by minimizing the mean squared error
$$
\mathbb{E}\Big[\big(u(x) - c\cdot\Psi\big)^2\Big] = \Gamma(x,x) - 2\sum_{i=1}^N c_i \int_\Omega \Gamma(x,y)\psi_i(y)\,dy + \sum_{i,j=1}^N c_i c_j \Theta_{i,j}
\tag{3.21}
$$
with respect to $c \in \mathbb{R}^N$, where $\Theta$ is defined by (3.9). We conclude the proof by identifying the minimizer in $c$, using Lemma 3.4 for the invertibility of $\Theta$, and noting that (3.19) is simply (3.21) at the minimum in $c$.

Example 3.1. If $\mathcal{L}$ and $\mathcal{B}$ correspond to the prototypical example (1.1) (see also Example 2.1), if $\xi$ is white noise (i.e., if its covariance function is $\Lambda(x,y) = \delta(x-y)$), and if the observable functions are Dirac masses at points $x_i \in \Omega$ (and $d \leq 3$, which is required for (3.8)), then Theorem 3.5 implies (1.3) and the basis elements $\phi_i$ are the RPS elements of [60], which are a generalization of polyharmonic splines to PDEs with rough coefficients. Recall that polyharmonic splines can be traced back to the seminal work of Harder and Desmarais [40] and Duchon [22, 23, 24].

Note also that, according to Theorem 3.5, the process of Bayesian conditioning gives us the whole posterior distribution of $u(x)$ and not only its (conditional) expected value. In particular, the distribution of $u(x)$ conditioned on $u(x_1),\dots,u(x_N)$ is a Gaussian random variable with mean (1.3) and variance
$$
\sigma^2(x) = \Gamma(x,x) - \sum_{i,j=1}^N \Theta^{-1}_{i,j}\,\Gamma(x,x_j)\Gamma(x,x_i),
\tag{3.22}
$$
and this observation can be used to compute the probability of deviation of the RPS interpolation from $u(x)$ by a given margin and to guide the addition of interpolation points (note that $\sigma(x) = 0$ at the interpolation points $x_1,\dots,x_N$).

Remark 3.6. We will show in Theorem 5.1 that $\sigma(x)$ also controls the pointwise error between the solution of the original integro-differential equation (2.1) and the approximation $\sum_{i=1}^N \phi_i(x)\int_\Omega u(y)\psi_i(y)\,dy$.

In this section we will show that, as for RPS [60], the basis elements $\phi_i$ obtained from Bayesian inference have remarkable variational and optimal recovery properties that can be used (1) for their practical computation and (2) for the derivation of accuracy estimates.

In this subsection we will assume that $\xi$ is white noise (i.e., $\Lambda(x,y) = \delta(x-y)$). Define
$$
V := \big\{\phi \in H(\Omega)\ \big|\ \mathcal{L}\phi \in L^2(\Omega) \text{ and } \mathcal{B}\phi = 0 \text{ on } \partial\Omega\big\},
\tag{4.1}
$$
and let $\langle\cdot,\cdot\rangle$ be the (scalar) product on $V$ defined, for $u, v \in V$, by
$$
\langle u, v\rangle := \int_\Omega \big(\mathcal{L}u(x)\big)\big(\mathcal{L}v(x)\big)\,dx.
\tag{4.2}
$$
Note in particular that $\langle v, v\rangle = 0$ if and only if $v = 0$, and we write
$$
\|v\|_V := \langle v, v\rangle^{1/2}
\tag{4.3}
$$
for the corresponding norm (note that $\|\cdot\|_V$ is a norm on $V$ because $\|v\|_V = 0$ and $v \in V$ imply $\mathcal{L}v = 0$ in $\Omega$ and $\mathcal{B}v = 0$ on $\partial\Omega$, which leads to $v = 0$ by the non-degeneracy of the operator $\mathcal{L}$).

Theorem 4.1. If $\Gamma(x,x) < \infty$ then for $v \in V$ and $x \in \Omega$
$$
\big|v(x)\big| \leq \big(\Gamma(x,x)\big)^{1/2}\,\|v\|_V,
\tag{4.4}
$$
and the space $V$ with the reproducing kernel $\Gamma(x,y)$ forms a Reproducing Kernel Hilbert Space. In particular, for all $v \in V$,
$$
\langle v, \Gamma(\cdot,x)\rangle = v(x).
\tag{4.5}
$$

Proof.
Theorem 4.1 is a direct consequence of the fact that
$$
\Big\langle v, \int_\Omega \Gamma(\cdot,y)f(y)\,dy\Big\rangle = \int_\Omega v(y)f(y)\,dy,
\tag{4.6}
$$
and (by the Cauchy–Schwarz inequality and $\langle \Gamma(\cdot,x), \Gamma(\cdot,x)\rangle = \Gamma(x,x)$)
$$
\langle v, \Gamma(\cdot,x)\rangle \leq \langle v, v\rangle^{1/2}\big(\Gamma(x,x)\big)^{1/2}.
\tag{4.7}
$$

Define
$$
V_i := \Big\{\phi \in V\ \Big|\ \int_\Omega \phi(x)\psi_i(x)\,dx = 1 \text{ and } \int_\Omega \phi(x)\psi_j(x)\,dx = 0 \text{ for } j \in \{1,\dots,N\} \text{ such that } j \neq i\Big\},
\tag{4.8}
$$
and consider the following optimization problem over $V_i$:
$$
\begin{cases}
\text{Minimize } \langle\phi,\phi\rangle\\
\text{Subject to } \phi \in V_i.
\end{cases}
\tag{4.9}
$$
Proposition 4.2. $V_i$ is a non-empty closed affine subspace of $V$. Problem (4.9) is a strictly convex quadratic optimization problem over $V_i$, and the unique minimizer of (4.9) is $\phi_i$ as defined by (3.18).

Proof. Let us first prove that $\phi_i \in V_i$. Let
$$
\theta_i(x) := \int_\Omega \Gamma(x,y)\psi_i(y)\,dy.
\tag{4.10}
$$
First observe that for all $i \in \{1,\dots,N\}$,
$$
\mathcal{L}\theta_i(x) = \int_\Omega G(y,x)\psi_i(y)\,dy,
\tag{4.11}
$$
and $\mathcal{B}\theta_i(x) = 0$ on $\partial\Omega$. Noting that $\|\mathcal{L}\theta_i\|_{L^2(\Omega)}^2 = \Theta_{i,i}$, we deduce from (3.8) that $\theta_i \in V$. We conclude from (3.18) and Lemma 3.4 that $\phi_i \in V$. Now observe that (3.9) implies that
$$
\int_\Omega \phi_i(x)\psi_j(x)\,dx = (\Theta^{-1}\cdot\Theta)_{i,j} = \delta_{i,j},
\tag{4.12}
$$
where $\delta_{i,i} = 1$ and $\delta_{i,j} = 0$ for $j \neq i$. We conclude that $\phi_i \in V_i$, which implies that $V_i$ is non-empty (it is easy to check that it is a closed affine subspace of $V$).

Now let us prove that problem (4.9) is a strictly convex optimization problem over $V_i$. Let $v, w \in V_i$ with $v \neq w$. Write, for $\lambda \in [0,1]$,
$$
f(\lambda) := \big\langle v + \lambda(w-v),\ v + \lambda(w-v)\big\rangle;
\tag{4.13}
$$
we need to show that $f(\lambda)$ is a strictly convex function. Observing that
$$
f(\lambda) = \langle v, v\rangle + 2\lambda\langle v, w-v\rangle + \lambda^2\langle v-w, v-w\rangle,
\tag{4.14}
$$
and noting that $\langle v-w, v-w\rangle > 0$ (since $v \neq w$), we deduce that $f$ is strictly convex in $\lambda$. We conclude (see, for example, [29, p. 35, Proposition 1.2]) that Problem (4.9) is a strictly convex optimization problem over $V_i$ and that it admits a unique minimizer in $V_i$. We will postpone the proof of the fact that $\phi_i$ is the minimizer of (4.9) to the proof of Theorem 4.6.

Remark 4.3. It is important to note that in practical (numerical) applications each element $\phi_i$ would be obtained by solving the quadratic optimization problem (4.9) rather than through the representation formula (3.18), because the identification of $\Gamma$ in (3.18) is more expensive than solving the linear systems associated with (4.9) (inverting a matrix is more expensive than solving a linear system). Note also that, if $u$ is the (stochastic) solution of (3.2), then $\phi_i$ is also equal to the expected value of $u(x)$ conditioned on $\int_\Omega u(x)\psi_i(x)\,dx = 1$ and $\int_\Omega u(x)\psi_j(x)\,dx = 0$ for $j \neq i$, i.e.,
$$
\phi_i(x) = \mathbb{E}\Big[u(x)\ \Big|\ \int_\Omega u(x)\psi_i(x)\,dx = 1 \text{ and } \int_\Omega u(x)\psi_j(x)\,dx = 0 \text{ for } j \neq i\Big].
\tag{4.15}
$$
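A discrete illustration of the practical route of Remark 4.3 (same illustrative 1D setup; the constrained quadratic problem (4.9) is solved through a KKT/Lagrange-multiplier system, an assumption about the discretization, not a prescription of the paper), cross-checked against the Bayesian formula (3.18):

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative setup; compute phi_i by the variational problem (4.9).
n, h = 100, 1.0 / 101
a = np.exp(rng.uniform(-1.0, 1.0, n + 1))
L = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:n], 1) - np.diag(a[1:n], -1)) / h**2
idx = np.array([10, 30, 50, 70, 90]); N = len(idx)

A = h * (L.T @ L)                                  # discrete <u,v> = int (Lu)(Lv) dx
P = np.zeros((N, n)); P[np.arange(N), idx] = 1.0   # Dirac-mass constraints

# KKT system: minimize phi^T A phi subject to P phi = e_i (all i at once).
KKT = np.block([[A, P.T], [P, np.zeros((N, N))]])
rhs = np.vstack([np.zeros((n, N)), np.eye(N)])
phi_var = np.linalg.solve(KKT, rhs)[:n, :]

# Bayesian formula (3.18): phi = Gamma P^T Theta^{-1}.
Ginv = np.linalg.inv(L)
Gamma = Ginv @ Ginv.T / h
phi_bayes = Gamma[:, idx] @ np.linalg.inv(Gamma[np.ix_(idx, idx)])

# Agreement up to the conditioning of the dense solves:
assert np.allclose(phi_var, phi_bayes, atol=1e-5)
```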
Remark 4.4. A simple calculation shows that $\phi_i$ is also the solution of the following nested equations:
$$
\begin{cases}
\mathcal{L}\phi_i(x) = \chi_i(x), & x \in \Omega,\\
\mathcal{B}\phi_i = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.16}
$$
$$
\begin{cases}
\mathcal{L}^*\chi_i(x) = \sum_{j=1}^N \Theta^{-1}_{i,j}\psi_j(x), & x \in \Omega,\\
\mathcal{B}^*\chi_i(x) = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{4.17}
$$
Remark 4.5. Another simple calculation shows that $\phi_i$ is also the solution of the following nested equations:
$$
\begin{cases}
\mathcal{L}\phi_i(x) = \chi_i(x), & x \in \Omega,\\
\mathcal{B}\phi_i = 0 & \text{on } \partial\Omega,\\
\int_\Omega \phi_i(x)\psi_j(x)\,dx = \delta_{i,j} & \text{for } j \in \{1,\dots,N\},
\end{cases}
\tag{4.18}
$$
$$
\begin{cases}
\mathcal{L}^*\chi_i(x) = \sum_{j=1}^N c_j\psi_j(x), & x \in \Omega,\\
\mathcal{B}^*\chi_i(x) = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.19}
$$
where $c \in \mathbb{R}^N$ is an unknown vector determined by the third equation in (4.18).

Write $V_0$ for the subset of $V$ defined by
$$
V_0 := \Big\{v \in V : \int_\Omega v(x)\psi_i(x)\,dx = 0,\ \forall i \in \{1,\dots,N\}\Big\}.
\tag{4.20}
$$

Theorem 4.6.
It holds true that:

• The basis $\phi_i$ is orthogonal to $V_0$ with respect to the product $\langle\cdot,\cdot\rangle$, i.e.,
$$
\langle\phi_i, v\rangle = 0, \quad \forall i \in \{1,\dots,N\} \text{ and } \forall v \in V_0.
\tag{4.21}
$$

• $\sum_{i=1}^N w_i\phi_i$ is the unique minimizer of $\langle v, v\rangle$ over all $v \in V$ such that $\int_\Omega v(x)\psi_i(x)\,dx = w_i$.

• For all $i \in \{1,\dots,N\}$ and for all $v \in V$,
$$
\langle\phi_i, v\rangle = \sum_{j=1}^N \Theta^{-1}_{i,j} \int_\Omega v(x)\psi_j(x)\,dx.
\tag{4.22}
$$

• For all $i, j \in \{1,\dots,N\}$,
$$
\langle\phi_i, \phi_j\rangle = \Theta^{-1}_{i,j}.
\tag{4.23}
$$
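A discrete sanity check of (4.21) and (4.23), reusing the illustrative 1D finite-difference setup of the earlier sketches (all sizes and measurement locations are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Same 1D discretization as in the previous sketches.
n, h = 100, 1.0 / 101
a = np.exp(rng.uniform(-1.0, 1.0, n + 1))
L = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:n], 1) - np.diag(a[1:n], -1)) / h**2
idx = np.array([10, 30, 50, 70, 90])

Ginv = np.linalg.inv(L)
Gamma = Ginv @ Ginv.T / h                       # covariance of u (white-noise case)
Theta = Gamma[np.ix_(idx, idx)]
phi = Gamma[:, idx] @ np.linalg.inv(Theta)      # basis elements (3.18), one per column

A = h * (L.T @ L)                               # Gram matrix of <.,.> in (4.2)

# (4.23): <phi_i, phi_j> = (Theta^{-1})_{ij}
assert np.allclose(phi.T @ A @ phi, np.linalg.inv(Theta), rtol=1e-3, atol=1e-8)

# (4.21): <phi_i, v> = 0 for every v in V_0, i.e. v vanishing at the x_i
v = rng.standard_normal(n); v[idx] = 0.0
print(np.abs(phi.T @ A @ v).max())              # ~ 0 up to roundoff in the stiff algebra
```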
Remark 4.7. Theorem 4.6 and its proof are analogous to the optimality property of strictly conditionally positive definite kernels [69] when used as interpolant solutions of the optimal recovery problem [38].

Proof.
We have, using (4.10), (3.18) and (4.11),
$$
\langle\phi_i, v\rangle = \sum_{j=1}^N \Theta^{-1}_{i,j} \int_\Omega \mathcal{L}\theta_j(x)\,\mathcal{L}v(x)\,dx = \sum_{j=1}^N \Theta^{-1}_{i,j} \int_\Omega \psi_j(y)v(y)\,dy,
\tag{4.24}
$$
which implies (4.21) (the right hand side vanishes for $v \in V_0$), (4.22) and (4.23).

Let $w \in \mathbb{R}^N$ and $\phi_w := \sum_{i=1}^N w_i\phi_i$. Let $v \in V$ be such that $\int_\Omega v(x)\psi_i(x)\,dx = w_i$ for all $i \in \{1,\dots,N\}$. Since $\phi_w - v \in V_0$, it follows that
$$
\langle v, v\rangle = \langle\phi_w, \phi_w\rangle + \langle v - \phi_w, v - \phi_w\rangle.
\tag{4.25}
$$
It follows that $\sum_{i=1}^N w_i\phi_i$ is the unique minimizer of $\langle v, v\rangle$ over all $v \in V$ such that $\int_\Omega v(x)\psi_i(x)\,dx = w_i$. Note that this also implies that $\phi_i$ is the minimizer of (4.9).

If $\xi$ is not white noise (i.e., $\Lambda(x,y) \neq \delta(x-y)$) then Theorem 4.1, Theorem 4.6 and Proposition 4.2 remain true provided that the definitions of the space $V$ and of the scalar product $\langle\cdot,\cdot\rangle$ are changed to
$$
V := \big\{\phi \in H(\Omega)\ \big|\ \mathcal{L}_\Lambda\mathcal{L}\phi \in L^2(\Omega),\ \mathcal{B}\phi = 0 \text{ and } \mathcal{B}_\Lambda\mathcal{L}\phi = 0 \text{ on } \partial\Omega\big\},
\tag{4.26}
$$
$$
\langle u, v\rangle := \int_\Omega \big(\mathcal{L}_\Lambda\mathcal{L}u(x)\big)\big(\mathcal{L}_\Lambda\mathcal{L}v(x)\big)\,dx,
\tag{4.27}
$$
where $\mathcal{L}_\Lambda$ and $\mathcal{B}_\Lambda$ are defined in (3.6).

Let $\|v\|_V$ be defined as in (4.3).

Theorem 5.1. Assume that $\Gamma(x,x) < \infty$ and let $v \in V$. It holds true that for $x \in \Omega$
$$
\Big|v(x) - \sum_{i=1}^N \phi_i(x)\Big(\int_\Omega v(y)\psi_i(y)\,dy\Big)\Big| \leq \sigma(x)\,\|v\|_V,
\tag{5.1}
$$
where $\sigma^2(x)$ is the variance of $u(x)$ (the solution of (3.2)) conditioned on $\int_\Omega u(y)\psi_1(y)\,dy, \dots, \int_\Omega u(y)\psi_N(y)\,dy$, as defined by (3.19). In particular, if $u$ is the solution of the original integro-differential equation (2.1), then
$$
\Big|u(x) - \sum_{i=1}^N \phi_i(x)\Big(\int_\Omega u(y)\psi_i(y)\,dy\Big)\Big| \leq \sigma(x)\,\|g\|_{L^2(\Omega)}
\tag{5.2}
$$
if $\phi_i, \sigma$ are derived from white noise, and
$$
\Big|u(x) - \sum_{i=1}^N \phi_i(x)\Big(\int_\Omega u(y)\psi_i(y)\,dy\Big)\Big| \leq \sigma(x)\,\|\mathcal{L}_\Lambda g\|_{L^2(\Omega)}
\tag{5.3}
$$
if $\phi_i, \sigma$ are derived from the noise with covariance function $\Lambda$ described in (3.6).

Proof. Let $v \in V$ and $x \in \Omega$. Using the reproducing kernel property of Theorem 4.1 we obtain that
$$
\Big|v(x) - \sum_{i=1}^N \phi_i(x)\int_\Omega v(y)\psi_i(y)\,dy\Big| = \Big|\Big\langle v,\ \Gamma(\cdot,x) - \sum_{i=1}^N \phi_i(x)\int_\Omega \Gamma(\cdot,y)\psi_i(y)\,dy\Big\rangle\Big|.
\tag{5.4}
$$
Therefore, using the Cauchy–Schwarz inequality,
$$
\Big|v(x) - \sum_{i=1}^N \phi_i(x)\int_\Omega v(y)\psi_i(y)\,dy\Big| \leq \|v\|_V\,\Big\|\Gamma(\cdot,x) - \sum_{i=1}^N \phi_i(x)\int_\Omega \Gamma(\cdot,y)\psi_i(y)\,dy\Big\|_V.
\tag{5.5}
$$
We conclude by expanding the right hand side of (5.5) and using the definition $\phi_i(x) = \sum_{j=1}^N \Theta^{-1}_{i,j}\int_\Omega \Gamma(x,y)\psi_j(y)\,dy$.

Remark 5.2. $\sigma(x)$ is also known as the power function in radial basis function interpolation [69, 33]. The proof of Theorem 5.1 is similar to the one used to derive local error estimates for radial basis function interpolation of scattered data (see [71], in which $\sigma(x)$ was referred to as the Kriging function, a terminology coming from geostatistics [47]).

$H(\Omega)$-norm estimates. Let $V_0$ be the subset of $V$ defined by (4.20). Write
$$
\rho(V_0) := \sup_{v \in V_0} \frac{\|v\|_{H(\Omega)}}{\|v\|_V},
\tag{5.6}
$$
where $\|\cdot\|_{H(\Omega)}$ is the natural norm associated with the space on which the operator $\mathcal{L}$ is defined.
We have for all v ∈ V (cid:13)(cid:13)(cid:13) v − N X i =1 φ i (cid:0) Z Ω v ( y ) ψ i ( y ) dy (cid:1)(cid:13)(cid:13)(cid:13) H (Ω) ≤ ρ ( V ) k v k V , (5.7) and ρ ( V ) is the smallest constant for which (5.7) holds for all v ∈ V .Proof. Write v Ψ ( x ) := P Ni =1 φ i ( x ) (cid:0) R Ω v ( y ) ψ i ( y ) dy (cid:1) Observing that v − v Ψ belongs to V implies that k v − v Ψ k H (Ω) ≤ ρ ( V ) (cid:10) v − v Ψ , v − v Ψ (cid:11) . (5.8)Theorem 4.6 implies that (cid:10) v, v (cid:11) = (cid:10) v Ψ , v Ψ (cid:11) + (cid:10) v − v Ψ , v − v Ψ (cid:11) , (5.9)which leads to (cid:10) v − v Ψ , v − v Ψ (cid:11) = (cid:10) v, v (cid:11) − (cid:10) v Ψ , v Ψ (cid:11) ≤ (cid:10) v, v (cid:11) , (5.10)which concludes the proof. Remark 5.4.
Remark 5.4. Observe that Theorem 5.3 implies that if $u$ is the solution of the original integro-differential equation (2.1) and $\phi_i, \sigma$ are derived from white noise, then
$$
\Big\|u - \sum_{i=1}^N \phi_i\Big(\int_\Omega u(y)\psi_i(y)\,dy\Big)\Big\|_{H(\Omega)} \leq \rho(V_0)\,\|g\|_{L^2(\Omega)}.
\tag{5.11}
$$
Similarly, if $\phi_i, \sigma$ are derived from the noise with covariance function $\Lambda$ described in (3.6), then
$$
\Big\|u - \sum_{i=1}^N \phi_i\Big(\int_\Omega u(y)\psi_i(y)\,dy\Big)\Big\|_{H(\Omega)} \leq \rho(V_0)\,\|\mathcal{L}_\Lambda g\|_{L^2(\Omega)}.
\tag{5.12}
$$

Example 5.1. If $\mathcal{L}$ and $\mathcal{B}$ correspond to the prototypical example (1.1) (Example 2.1), if $\xi$ is white noise, and if the observable functions are Dirac masses at points $x_i \in \Omega$ (and $d \leq 3$), then [60]
$$
\rho(V_0) \leq CH,
\tag{5.13}
$$
where $C$ depends only on $\lambda_{\min}(a)$ and $\lambda_{\max}(a)$, with $\lambda_{\max}(a) := \sup_{x\in\Omega,\,l\neq 0} l^T a(x) l/|l|^2$ and $\lambda_{\min}(a) := \inf_{x\in\Omega,\,l\neq 0} l^T a(x) l/|l|^2$, and where $H$ is the mesh norm
$$
H := \sup_{x\in\Omega}\min_i \|x - x_i\|,
\tag{5.14}
$$
and
$$
\Big\|u - \sum_{i=1}^N \phi_i\, u(x_i)\Big\|_{H(\Omega)} \leq CH\,\big\|\operatorname{div}(a\nabla u)\big\|_{L^2(\Omega)}.
\tag{5.15}
$$
Let us also recall that the proof of (5.13) is based on the following Poincaré inequality (Lemma 3.1 of [60]).

Lemma 5.5. ([60, Lemma 3.1]) Let $d \leq 3$ and let $B$ be the open ball of center $0$ and radius $1$. There exists a finite strictly positive constant $C_{\lambda_{\min}(a),\lambda_{\max}(a)}$ such that for all $v \in H^1(B)$ with $\operatorname{div}(a\nabla v) \in L^2(B)$ it holds true that
$$
\|v - v(0)\|_{L^2(B)} \leq C_{\lambda_{\min}(a),\lambda_{\max}(a)}\Big(\|\nabla v\|_{L^2(B)} + \big\|\operatorname{div}(a\nabla v)\big\|_{L^2(B)}\Big).
\tag{5.16}
$$

Proof.
We will recall the proof of this lemma (as presented in [60, Lemma 3.1]) for the sake of completeness. The proof is per absurdum. Note that since $d \leq 3$, $v \in H^1(B)$ and $\operatorname{div}(a\nabla v) \in L^2(B)$ imply the Hölder continuity of $v$ in $B$. Assume that (5.16) does not hold. Then there exist a sequence $v_n$ and a sequence $a'_n$, whose minimum and maximum eigenvalues are uniformly bounded by $\lambda_{\min}(a)$ and $\lambda_{\max}(a)$ (we need to introduce that sequence because we want the constant in (5.16) to depend only on $d$, $\lambda_{\min}(a)$, $\lambda_{\max}(a)$), such that
$$
\|v_n - v_n(0)\|_{L^2(B)} > n\Big(\|\nabla v_n\|_{L^2(B)} + \big\|\operatorname{div}(a'_n\nabla v_n)\big\|_{L^2(B)}\Big).
\tag{5.17}
$$
Letting $w_n = \frac{v_n - v_n(0)}{\|v_n - v_n(0)\|_{L^2(B)}}$, we obtain that $w_n(0) = 0$, $\|w_n\|_{L^2(B)} = 1$ and
$$
\|\nabla w_n\|_{L^2(B)} + \big\|\operatorname{div}(a'_n\nabla w_n)\big\|_{L^2(B)} < \frac{1}{n}.
\tag{5.18}
$$
Since $\|w_n\|_{H^1(B)}$ is uniformly bounded, there exist a subsequence $w_{n_j}$ and a $w \in H^1(B)$ such that
$$
w_{n_j} \rightharpoonup w \ \text{weakly in } H^1(B) \quad \text{and} \quad \nabla w_{n_j} \rightharpoonup \nabla w \ \text{weakly in } L^2(B).
\tag{5.19}
$$
Using $\|\nabla w_n\|_{L^2(B)} \leq 1/n$ we deduce that $\nabla w = 0$, which implies that $w$ is a constant in $B$. Since by the Rellich–Kondrachov theorem the embedding $H^1(B) \subset L^2(B)$ is compact, it follows from (5.19) that $w_{n_j} \to w$ strongly in $L^2(B)$, which (using $\|w_n\|_{L^2(B)} = 1$) implies that $\|w\|_{L^2(B)} = 1$. Now (5.19), together with the fact that $\|\operatorname{div}(a'_n\nabla w_n)\|_{L^2(B)}$ is uniformly bounded and that $d \leq 3$, implies that $w_n$ is uniformly Hölder continuous on $B(0,1/2)$ (see for instance [65]). This implies that $w$ is continuous in $B(0,1/2)$ and that $w(0) = 0$. This contradicts the fact that $w$ is a constant in $B$ with $\|w\|_{L^2(B)} = 1$.

Example 5.2. If $\mathcal{L}$ and $\mathcal{B}$ correspond to the prototypical example (1.1) (Example 2.1), if $\xi$ is white noise, and if the observable functions are indicator functions of Voronoi cells around points $x_i \in \Omega$ or of tetrahedra of a regular tessellation of the points $x_i \in \Omega$, then (5.13) remains valid as a simple consequence of localized Poincaré inequalities. Indeed, for $v \in V_0$, writing $C_i$ for the Voronoi cells at the points $x_i \in \Omega$, we have (assuming $\Omega$ is the union of those Voronoi cells)
$$
\|v\|_{H(\Omega)}^2 = \int_\Omega v(x)\big(-\operatorname{div}(a(x)\nabla v(x))\big)\,dx \leq \|v\|_{L^2(\Omega)}\,\big\|\operatorname{div}(a\nabla v)\big\|_{L^2(\Omega)},
\tag{5.20}
$$
and we conclude by applying Poincaré's inequality to the $L^2$-norm of $v$ within each cell $C_i$, i.e.,
$$
\|v\|_{L^2(\Omega)}^2 = \sum_i \|v\|_{L^2(C_i)}^2 \leq CH^2 \sum_i \|\nabla v\|_{L^2(C_i)}^2 = CH^2\,\|\nabla v\|_{L^2(\Omega)}^2.
\tag{5.21}
$$
We will give the last example as a theorem.

Theorem 5.6.
Let $\mathcal{L}$ and $\mathcal{B}$ be as in the prototypical example (1.1) (Example 2.1) and let $\xi$ be white noise. Let $\psi_1,\dots,\psi_N$ be linearly independent generalized probability densities on $\Omega$ with (possibly overlapping) supports $\operatorname{support}(\psi_i)$. Define
$$
H := \sup_{x\in\Omega}\min_i\sup_{y\in\operatorname{support}(\psi_i)} \|x - y\|.
\tag{5.22}
$$
Then it holds true that
$$
\rho(V_0) \leq CH,
\tag{5.23}
$$
where $C$ depends only on $\lambda_{\min}(a)$ and $\lambda_{\max}(a)$. Henceforth, for $u \in V$,
$$
\Big\|u - \sum_{i=1}^N \phi_i \int_\Omega u(y)\psi_i(y)\,dy\Big\|_{H(\Omega)} \leq CH\,\big\|\operatorname{div}(a\nabla u)\big\|_{L^2(\Omega)}.
\tag{5.24}
$$

Remark 5.7.
Observe that if for all $i$ the support of $\psi_i$ is contained in a ball of center $x_i$ and radius $H'$, then
$$
H \leq H' + \sup_{x\in\Omega}\min_i \|x - x_i\|;
\tag{5.25}
$$
in particular, if the points $x_i$ have mesh norm $H''$ (see (5.14)) then $H \leq H' + H''$.

Proof. The proof of (5.23) is simply based on the observation that if $v \in V_0$ then (since $\int_\Omega v(x)\psi_i(x)\,dx = 0$) there exist $N$ points $y_1,\dots,y_N$ such that $v(y_i) = 0$ and the mesh norm of those points is bounded by $H$. Therefore we can apply the result of Example 5.1.

A simple pseudo-algorithmic description of the proposed framework for the numerical homogenization of (2.1) is as follows (a compact end-to-end sketch of these steps, in a discretized setting, is given at the end of this section):

1. Select $N$ linearly independent (measurement) functions $\psi_1,\dots,\psi_N$ in $L^2(\Omega)$.

2. Let $\xi$ in (3.2) be a Gaussian field of mean $0$ and covariance function $\Lambda(x,y)$ (assumed to be non-degenerate, i.e., such that there exists an inverse covariance function $\Lambda^{-1}(x,y)$ with $\int_\Omega \Lambda(x,y)\Lambda^{-1}(y,z)\,dy = \delta(x-z)$).

3. The basis functions $\phi_1,\dots,\phi_N$ for the numerical homogenization of (2.1) are identified as (writing $u$ the solution of (3.2), and $\delta_{i,j} = 1$ if $i = j$ and $\delta_{i,j} = 0$ if $i \neq j$) the deterministic functions
$$
\phi_i(x) = \mathbb{E}\Big[u(x)\ \Big|\ \int_\Omega u(x)\psi_j(x)\,dx = \delta_{i,j} \text{ for } j = 1,\dots,N\Big].
\tag{6.1}
$$

4. Each $\phi_i$ can also be identified as the unique minimizer of
$$
\begin{cases}
\text{Minimize } \int_{\Omega^2} \big(\mathcal{L}\phi(x)\big)\Lambda^{-1}(x,y)\big(\mathcal{L}\phi(y)\big)\,dx\,dy\\
\text{Subject to } \phi \in H(\Omega) \text{ and } \int_\Omega \phi(x)\psi_j(x)\,dx = \delta_{i,j} \text{ for } j = 1,\dots,N.
\end{cases}
\tag{6.2}
$$

5. Under an appropriate choice of the measurement functions $\psi_i$ and of the covariance function $\Lambda(x,y)$, the basis functions $\phi_i$ can be computed by localizing the optimization problems (6.2) to subdomains of $\Omega$.

Another motivation for exploring Bayesian approximations of the solution space lies in the decision theory/game theory approach to numerical homogenization. In this approach one looks at the numerical homogenization problem (1.1) as a repeated game where player B chooses a function $\theta$ of the linear measurements (data) $\int_\Omega u(x)\psi_1(x)\,dx, \dots, \int_\Omega u(x)\psi_N(x)\,dx$ and player A chooses a source term $g$ in the unit ball of $L^2(\Omega)$. These two choices combine to form an error term
$$
\mathcal{E}(\theta, g) = \Big\|u - \theta\Big(\int_\Omega u(x)\psi_1(x)\,dx, \dots, \int_\Omega u(x)\psi_N(x)\,dx\Big)\Big\|_{L^2(\Omega)}.
\tag{7.1}
$$
Player B's objective is to minimize the error (7.1) while player A's objective is to maximize it. A surprising result stemming from a generalization [51] of Wald's Decision Theory [68] and Von Neumann's Game Theory [67] is that, although such games are deterministic, under weak regularity conditions the optimal strategy for player A is to play at random by placing an optimal probability distribution $\pi_A$ on the set of candidates for $g$; similarly, the best strategy for player B is to assume that player A is playing at random and to use a function $\theta$ living in the Bayesian class (obtained by placing a prior $\pi_B$ on the set of candidates for $g$ and conditioning with respect to the measurements $\int_\Omega u(x)\psi_i(x)\,dx$).

Although the estimator employed by player B may be called Bayesian, the game described here is not (i.e., the choice of player A might be distinct from that of player B), and player B must solve a minimax optimization problem over $\pi_A$ and $\pi_B$ to identify an optimal prior distribution for the Bayesian estimator (a careful choice of the prior also appears to be important due to the possible high sensitivity of posterior distributions [55, 54, 53, 52]).
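Returning to the pseudo-algorithm above, here is a compact end-to-end sketch of steps 1–4 (step 5, localization, is omitted). Everything here is an illustrative assumption of the sketch, not a prescription of the paper: a 1D finite-difference discretization, cell-average measurement functions, and a smoothing operator $\mathcal{L}_\Lambda = I - \epsilon\,\Delta$ defining a colored noise as in (3.6).

```python
import numpy as np

rng = np.random.default_rng(0)

# End-to-end sketch of the pseudo-algorithm (1D finite differences).
n, h = 100, 1.0 / 101
a = np.exp(rng.uniform(-1.0, 1.0, n + 1))
L = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:n], 1) - np.diag(a[1:n], -1)) / h**2

# Step 1: measurement functions = normalized indicators of N cells (local averages).
N = 5
P = np.zeros((N, n))
for j, cell in enumerate(np.array_split(np.arange(n), N)):
    P[j, cell] = 1.0 / len(cell)

# Step 2: colored noise xi defined by L_Lambda xi = white noise (cf. (3.6)).
L0 = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
S = np.eye(n) + 1e-4 * L0                   # discrete L_Lambda = I - eps*Laplacian

# Steps 3-4: phi_i as the minimizer of (6.2); discretely the objective is
# phi^T A phi with A = h (L_Lambda L)^T (L_Lambda L), cf. (4.27).
A = h * (S @ L).T @ (S @ L)
KKT = np.block([[A, P.T], [P, np.zeros((N, N))]])
rhs = np.vstack([np.zeros((n, N)), np.eye(N)])
phi = np.linalg.solve(KKT, rhs)[:n, :]

# Consistency with the conditional-expectation route (6.1): on this grid the
# covariance of u = L^{-1} xi is exactly A^{-1}, so (3.18) gives the same basis.
Gamma_u = np.linalg.inv(A)
phi_bayes = Gamma_u @ P.T @ np.linalg.inv(P @ Gamma_u @ P.T)
assert np.allclose(phi, phi_bayes, atol=1e-4)
print(np.round(P @ phi, 6))                 # the N x N identity: measurements of phi_i
```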
We refer to [51] for (1) the complete description of the generalization of the Bayesian framework described here to the decision theory/information game formulation (described above), and (2) practical (including numerical) applications of that generalized framework to the problems of finding numerical homogenization bases and fast solvers for (1.1). In that generalization, optimal numerical homogenization basis functions are obtained by selecting the prior distribution of $\xi$ (in (1.2)) to be that of a Gaussian field with mean zero and covariance operator given by the operator of (1.1) (i.e., such that for $f \in H(\Omega)$, $\int_\Omega f(x)\xi(x)\,dx$ is a Gaussian random variable of mean zero and variance $\int_\Omega (\nabla f(x))^T a(x)\nabla f(x)\,dx$). In particular, [51] shows how the identification of an optimal distribution for $\xi$ (in the Gaussian class) leads to the (automated) discovery of multigrid and multiresolution solvers for PDEs with rough coefficients.
Acknowledgements. The author gratefully acknowledges support for this work from the Air Force Office of Scientific Research under Award Number FA9550-12-1-0389 (Scientific Computation of Optimal Statistical Estimators) and from the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, through the Exascale Co-Design Center for Materials in Extreme Environments (ExMatEx, LANL Contract No. DE-AC52-06NA25396, Caltech Subcontract Number 273448). The author also thanks Dongbin Xiu, Lei Zhang and Guillaume Bal for stimulating discussions, Leonid Berlyand for comments on the manuscript, and two anonymous referees for valuable comments and suggestions.
References

[1] Assyr Abdulle and Marcus J. Grote. Finite element heterogeneous multiscale method for the wave equation. Multiscale Model. Simul., 9(2):766–792, 2011.

[2] Assyr Abdulle and Christoph Schwab. Heterogeneous multiscale FEM for diffusion problems on rough surfaces. Multiscale Model. Simul., 3(1):195–220 (electronic), 2004/05.

[3] G. Allaire and R. Brizzi. A multiscale finite element method for numerical homogenization. Multiscale Model. Simul., 4(3):790–812 (electronic), 2005.

[4] T. Arbogast and K. J. Boyd. Subgrid upscaling and mixed multiscale finite elements. SIAM J. Numer. Anal., 44(3):1150–1171 (electronic), 2006.

[5] T. Arbogast, C.-S. Huang, and S.-M. Yang. Improved accuracy for alternating-direction methods for parabolic equations based on regular and mixed finite elements. Math. Models Methods Appl. Sci., 17(8):1279–1305, 2007.

[6] I. Babuška, G. Caloz, and J. E. Osborn. Special finite element methods for a class of second order elliptic problems with rough coefficients. SIAM J. Numer. Anal., 31(4):945–981, 1994.

[7] I. Babuška and R. Lipton. Optimal local approximation spaces for generalized finite element methods with application to multiscale problems. Multiscale Model. Simul., 9:373–406, 2011.

[8] I. Babuška, F. Nobile, and R. Tempone. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Rev., 52(2):317–355, 2010.

[9] I. Babuška and J. E. Osborn. Generalized finite element methods: their performance and their relation to mixed methods. SIAM J. Numer. Anal., 20(3):510–536, 1983.

[10] I. Babuška and J. E. Osborn. Can a finite element method perform arbitrarily badly? Math. Comp., 69(230):443–462, 2000.

[11] Guillaume Bal and Wenjia Jing. Corrector theory for MSFEM and HMM in random media. Multiscale Model. Simul., 9(4):1549–1587, 2011.

[12] A. Bensoussan, J. L. Lions, and G. Papanicolaou. Asymptotic Analysis for Periodic Structures. North Holland, Amsterdam, 1978.

[13] L. Berlyand and H. Owhadi. Flux norm approach to finite dimensional homogenization approximations with non-separated scales and high contrast. Archive for Rational Mechanics and Analysis, 198(2):677–721, 2010.

[14] M. Bieri and C. Schwab. Sparse high order FEM for elliptic sPDEs. Comput. Methods Appl. Mech. Engrg., 198(13-14):1149–1170, 2009.

[15] X. Blanc, C. Le Bris, and P.-L. Lions. Une variante de la théorie de l'homogénéisation stochastique des opérateurs elliptiques. C. R. Math. Acad. Sci. Paris, 343(11-12):717–724, 2006.

[16] X. Blanc, C. Le Bris, and P.-L. Lions. Stochastic homogenization and random lattices. J. Math. Pures Appl. (9), 88(1):34–63, 2007.

[17] L. V. Branets, S. S. Ghai, S. L. Lyons, and X.-H. Wu. Challenges and technologies in reservoir modeling. Commun. Comput. Phys., 6(1):1–23, 2009.

[18] L. A. Caffarelli and P. E. Souganidis. A rate of convergence for monotone finite difference approximations to fully nonlinear, uniformly elliptic PDEs. Comm. Pure Appl. Math., 61(1):1–17, 2008.

[19] C.-C. Chu, I. G. Graham, and T. Y. Hou. A new multiscale finite element method for high-contrast elliptic interface problems. Math. Comp., 79:1915–1955, 2010.

[20] P. Diaconis. Bayesian numerical analysis. In Statistical Decision Theory and Related Topics, IV, Vol. 1 (West Lafayette, Ind., 1986), pages 163–175. Springer, New York, 1988.

[21] A. Doostan and H. Owhadi. A non-adapted sparse approximation of PDEs with stochastic inputs. J. Comput. Phys., 230(8):3015–3034, 2011.

[22] Jean Duchon. Interpolation des fonctions de deux variables suivant le principe de la flexion des plaques minces. Rev. Française Automat. Informat. Recherche Opérationnelle, Sér. RAIRO Analyse Numérique, 10(R-3):5–12, 1976.

[23] Jean Duchon. Splines minimizing rotation-invariant semi-norms in Sobolev spaces. In Constructive Theory of Functions of Several Variables (Proc. Conf., Math. Res. Inst., Oberwolfach, 1976), pages 85–100. Lecture Notes in Math., Vol. 571. Springer, Berlin, 1977.

[24] Jean Duchon. Sur l'erreur d'interpolation des fonctions de plusieurs variables par les D^m-splines. RAIRO Anal. Numér., 12(4):325–334, 1978.

[25] Weinan E and Bjorn Engquist. The heterogeneous multiscale methods. Commun. Math. Sci., 1(1):87–132, 2003.

[26] Y. Efendiev, J. Galvis, and X. Wu. Multiscale finite element and domain decomposition methods for high-contrast problems using local spectral basis functions. Journal of Computational Physics, 230(4):937–955, 2011.

[27] Y. Efendiev, V. Ginting, T. Hou, and R. Ewing. Accurate multiscale finite element methods for two-phase flow simulations. J. Comput. Phys., 220(1):155–174, 2006.

[28] Y. Efendiev and T. Hou. Multiscale finite element methods for porous media flows and their applications. Appl. Numer. Math., 57(5-7):577–596, 2007.

[29] I. Ekeland and R. Temam. Convex Analysis and Variational Problems, volume 28 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics, 1987.

[30] M. S. Eldred, C. G. Webster, and P. G. Constantine. Design under uncertainty employing stochastic expansion methods. American Institute of Aeronautics and Astronautics Paper 2008–6001, 2008.

[31] B. Engquist and P. E. Souganidis. Asymptotic and numerical homogenization. Acta Numerica, 17:147–190, 2008.

[32] Björn Engquist, Henrik Holst, and Olof Runborg. Multi-scale methods for wave propagation in heterogeneous media. Commun. Math. Sci., 9(1):33–56, 2011.

[33] G. E. Fasshauer. Meshfree methods. In Handbook of Theoretical and Computational Nanotechnology. American Scientific Publishers, 2005.

[34] R. Ghanem. Ingredients for a general purpose stochastic finite elements implementation. Comput. Methods Appl. Mech. Engrg., 168(1-4):19–34, 1999.

[35] R. Ghanem and S. Dham. Stochastic finite element analysis for multiphase flow in heterogeneous porous media. Transp. Porous Media, 32(3):239–262, 1998.

[36] E. De Giorgi. Sulla convergenza di alcune successioni di integrali del tipo dell'area. Rend. Mat., 8:277–294, 1975.

[37] A. Gloria. Analytical framework for the numerical homogenization of elliptic monotone operators and quasiconvex energies. Multiscale Model. Simul., 5(3):996–1043, 2006.

[38] M. Golomb and H. F. Weinberger. Optimal approximation and error bounds. In On Numerical Approximation. Proceedings of a Symposium, Madison, April 21–23, 1958, edited by R. E. Langer, Publication No. 1 of the Mathematics Research Center, U.S. Army, the University of Wisconsin, pages 117–190. The University of Wisconsin Press, Madison, Wis., 1959.

[39] L. Grasedyck, I. Greff, and S. Sauter. The AL basis for the solution of elliptic problems in heterogeneous media. Multiscale Modeling & Simulation, 10(1):245–258, 2012.

[40] R. L. Harder and R. N. Desmarais. Interpolation using surface splines. J. Aircraft, 9:189–191, 1972.

[41] C. Heitzinger and C. Ringhofer. Multiscale modeling of fluctuations in stochastic elliptic PDE models of nanosensors. Commun. Math. Sci., 12(3):401–421, 2014.

[42] T. Y. Hou, X.-H. Wu, and Z. Cai. Convergence of a multiscale finite element method for elliptic problems with rapidly oscillating coefficients. Math. Comp., 68(227):913–943, 1999.

[43] T. Y. Hou and X. H. Wu. A multiscale finite element method for elliptic problems in composite materials and porous media. J. Comput. Phys., 134(1):169–189, 1997.

[44] S. M. Kozlov. The averaging of random operators. Mat. Sb. (N.S.), 109(151)(2):188–202, 327, 1979.

[45] A. Målqvist and D. Peterseim. Localization of elliptic multiscale problems. Mathematics of Computation, 2014.

[46] F. Murat and L. Tartar. H-convergence. Séminaire d'Analyse Fonctionnelle et Numérique de l'Université d'Alger, 1978.

[47] D. E. Myers. Kriging, co-Kriging, radial basis functions and the role of positive definiteness. Comput. Math. Appl., 24(12):139–148, 1992.

[48] J. Nolen, G. Papanicolaou, and O. Pironneau. A framework for adaptive multiscale methods for elliptic problems. Multiscale Model. Simul., 7(1):171–196, 2008.

[49] A. O'Hagan. Bayes–Hermite quadrature. J. Statist. Plann. Inference, 29(3):245–260, 1991.

[50] A. O'Hagan. Some Bayesian numerical analysis. In Bayesian Statistics, 4 (Peñíscola, 1991), pages 345–363. Oxford Univ. Press, New York, 1992.

[51] H. Owhadi. Multigrid with rough coefficients and multiresolution operator decomposition from hierarchical information games. 2015. arXiv:1503.03467.

[52] H. Owhadi and C. Scovel. Qualitative robustness in Bayesian inference. 2014. arXiv:1411.3984.

[53] H. Owhadi and C. Scovel. Brittleness of Bayesian inference and new Selberg formulas. Communications in Mathematical Sciences, 2015. arXiv:1304.7046.

[54] H. Owhadi, C. Scovel, and T. J. Sullivan. Brittleness of Bayesian inference under finite information in a continuous world. Electronic Journal of Statistics, 9:1–79, 2015. arXiv:1304.6772.

[55] H. Owhadi, C. Scovel, and T. J. Sullivan. On the brittleness of Bayesian inference. SIAM Review (Research Spotlights), 2015.

[56] H. Owhadi and L. Zhang. Homogenization of parabolic equations with a continuum of space and time scales. SIAM Journal on Numerical Analysis, 46(1):1–36, 2007.

[57] H. Owhadi and L. Zhang. Metric-based upscaling. Comm. Pure Appl. Math., 60(5):675–723, 2007.

[58] H. Owhadi and L. Zhang. Homogenization of the acoustic wave equation with a continuum of scales. Computer Methods in Applied Mechanics and Engineering, 198(3–4):397–406, 2008.

[59] H. Owhadi and L. Zhang. Localized bases for finite dimensional homogenization approximations with non-separated scales and high-contrast. Multiscale Modeling & Simulation, 9:1373–1398, 2011. arXiv:1011.0986.

[60] Houman Owhadi, Lei Zhang, and Leonid Berlyand. Polyharmonic homogenization, rough polyharmonic splines and sparse super-localization. ESAIM Math. Model. Numer. Anal., 48(2):517–552, 2014.

[61] G. C. Papanicolaou and S. R. S. Varadhan. Boundary value problems with rapidly oscillating random coefficients. In Random Fields, Vol. I, II (Esztergom, 1979), volume 27 of Colloq. Math. Soc. János Bolyai, pages 835–873. North-Holland, Amsterdam, 1981.

[62] Henri Poincaré. Calcul des probabilités. Georges Carré, Paris, 1896.

[63] J. E. H. Shaw. A quasirandom approach to integration in Bayesian statistics. Ann. Statist., 16(2):895–914, 1988.

[64] S. Spagnolo. Sulla convergenza di soluzioni di equazioni paraboliche ed ellittiche. Ann. Scuola Norm. Sup. Pisa (3), 22:571–597, 1968; errata, ibid., 22:673, 1968.

[65] Guido Stampacchia. Équations elliptiques du second ordre à coefficients discontinus. Séminaire Jean Leray, no. 3 (1963–1964). Numdam, 1964.

[66] R. A. Todor and C. Schwab. Convergence rates for sparse chaos approximations of elliptic problems with stochastic coefficients. IMA J. Numer. Anal., 27(2):232–261, 2007.

[67] John von Neumann and Oskar Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, New Jersey, 1944.

[68] Abraham Wald. Statistical decision functions which minimize the maximum risk. Ann. of Math. (2), 46:265–280, 1945.

[69] Holger Wendland. Scattered Data Approximation, volume 17 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2005.

[70] C. D. White and R. N. Horne. Computing absolute transmissibility in the presence of fine-scale heterogeneity. SPE Symposium on Reservoir Simulation, SPE 16011, 1987.

[71] Z. M. Wu and R. Schaback. Local error estimates for radial basis function interpolation of scattered data. IMA J. Numer. Anal., 13(1):13–27, 1993.

[72] D. Xiu. Fast numerical methods for stochastic computations: a review. Commun. Comput. Phys., 5(2-4):242–272, 2009.