Data-driven polynomial chaos expansions: a weighted least-square approximation
LING GUO†, YONGLE LIU‡, AND TAO ZHOU§

Abstract.
In this work, we combine the idea of data-driven polynomial chaos expansions with the weighted least-squares approach to solve uncertainty quantification (UQ) problems. The idea of data-driven polynomial chaos is to use statistical moments of the input random variables to develop an arbitrary polynomial chaos expansion, and then to use such data-driven bases to perform UQ computations. Here we adopt the basis construction procedure of [1], where the bases are computed via matrix operations on the Hankel matrix of moments. Different from previous works, in the postprocessing part we propose a weighted least-squares approach to solve UQ problems. This approach consists of a sampling strategy and a least-squares solver. The main features of our approach are twofold. On the one hand, our sampling strategy is independent of the random input: we propose to sample from the equilibrium measure, which is also independent of the data-driven bases, and thus this procedure can be done a priori (in an offline manner). On the other hand, we propose to solve a Christoffel-function-weighted least-squares problem, and this strategy is quasi-linearly stable: the required number of PDE solves depends linearly (up to a logarithmic factor) on the number of (data-driven) bases. This new approach is thus promising for a class of problems with epistemic uncertainties. Several numerical tests are presented to show the effectiveness of our approach.
Key words.
Uncertainty quantification, data-driven polynomial chaos expansions, weighted least-squares, equilibrium measure
1. Introduction.
Uncertainty quantification (UQ) has been an active research topic in recent years. The aim of UQ is to quantify the impact of the stochastic inputs on the stochastic response, and thus a fundamental problem in UQ is to approximate a potentially high-dimensional parametric function f(ξ_1, ξ_2, . . . , ξ_d) : R^d → R, d ≥ 1. One popular way to perform UQ analysis is to assume that the distributions of the input parameters {ξ_k}_{k=1}^d are known a priori; this is known as the aleatory-type uncertainty model. Among others, the generalized polynomial chaos (gPC) method [29], based on the Wiener-Askey scheme and an extension of the original work of Wiener [25], is a popular approach for aleatory-type uncertainty analysis. The idea is to approximate the parametric function f with polynomial bases that are orthogonal with respect to the input density of the parameters. The unknown expansion coefficients can then be computed by performing, for example, a Galerkin projection onto a finite polynomial space. Notice that, in general, one needs to solve a coupled Galerkin system that is much more complicated than the original model; this is the so-called intrusive approach. Another popular approach, termed stochastic collocation, has gained much attention due to its efficiency and its non-intrusive nature. The idea of stochastic collocation is to use efficient sample solutions to construct global polynomial approximations. For recent developments of stochastic collocation methods, one can refer to [15, 16, 21, 23] and the references therein.

∗The first author is supported by the NSF of China (No. 11671265) and the Program for Outstanding Academic Leaders in Shanghai (No. 151503100). The last author is partially supported by the NSF of China (under grant numbers 11688101, 91630312, 91630203, 11571351, and 11731006), the Science Challenge Project (No. TZ2018001), NCMIS, and the Youth Innovation Promotion Association (CAS).
†Department of Mathematics, Shanghai Normal University, Shanghai, China. Email: [email protected].
‡Department of Mathematics, Southern University of Science and Technology, Shenzhen, China. Email: [email protected].
§LSEC, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. Email: [email protected].

In recent years, there has been a growing demand to extend the gPC method to more general input distributions (beyond the Wiener-Askey scheme). One of the first attempts is the multi-element generalized polynomial chaos (ME-gPC) [19, 24], where the random space is divided into small elements and local polynomial expansions are constructed via the Gram-Schmidt procedure. While this approach gains advantages when dealing with discontinuities in model responses, its computational cost can grow dramatically. A multi-element probabilistic collocation method was also developed along this direction [30]. Global polynomial expansions for arbitrary distributions have also been investigated based on Gram-Schmidt orthogonalisation [27, 26]. However, such approaches still rely on the availability of the input density function. More recently, Oladyshkin and Nowak [18] proposed a moment-matching method to deal with arbitrary distributions (termed aPC), and this approach is promising when one has incomplete input information, such as when only sample locations are given. The idea in [18] is to set up the moment-matching equations and then solve for the unknown polynomial coefficients. The aPC thus offers the possibility to propagate only the given information without making additional assumptions. As shown by Oladyshkin and Nowak in [17], only moments are propagated in all PC approaches, and thus the aPC offers the most reliable results with limited input data.
Although the aPC construction approach is straightforward to implement, it is well known that the coefficient matrix of the moment equations may become ill-conditioned when the polynomial order is large. Recently, a promising alternative way to compute the aPC was proposed in [1], where the authors propose an algorithm in which all the required quantities are calculated directly using only matrix operations performed on the Hankel matrix of moments. A sparse grid approach based on Smolyak's algorithm was then proposed in [1], where the collocation points are generated from the constructed bases, again by matrix operations.

Unlike traditional gPC methods, where one performs UQ computations directly with well-known polynomial bases chosen according to the Wiener-Askey scheme, the aPC approach can normally be divided into the following two steps:
• Basis construction. One uses the input information (moments, sample locations, etc.) to construct the so-called arbitrary polynomial bases (data-driven bases). Notice that this procedure is model-independent; only input information is used.
• UQ computations. One adopts the data-driven bases to perform UQ computations. This procedure is obviously model-dependent, and one could consider a stochastic Galerkin approach, or a sparse grid stochastic collocation approach as in [18] (where collocation points are generated using the arbitrary polynomial bases).

In this work, the only information we need is a set of sample locations (the density of the input is unknown). We shall then adopt the aPC construction procedure of [1]. However, in the second (postprocessing) step, we propose a weighted least-squares approach to obtain the aPC expansion coefficients. This approach consists of a sampling strategy and a least-squares solver. We propose to sample from the equilibrium measure, which is independent of the data-driven bases (and of the input information).
Thus, this procedure can be done a priori (in an offline manner). We then propose to solve a Christoffel-function-weighted least-squares problem, and in many cases of interest this approach is linearly stable: the number of samples (the number of PDE solves) depends linearly on the number of (data-driven) bases. We shall present theoretical motivations and several numerical tests to support these statements.

The rest of this paper is organized as follows. In Section 2, we introduce the traditional gPC approach. The construction procedure for the data-driven polynomial bases is introduced in Section 3. In Section 4, we present a weighted least-squares approach to perform UQ computations. Numerical experiments are then presented in Section 5 to demonstrate the applicability and effectiveness of our approach. Finally, we give some concluding remarks in Section 6.
2. Generalized polynomial chaos.
In parametric uncertainty quantification studies, the main goal is to trace the effect of the random inputs, here denoted by ξ = (ξ_1, ξ_2, . . . , ξ_d), through the model and to quantify their effect on the model output (prediction) f(ξ) : R^d → R. This is frequently done via generalized polynomial chaos expansions. Concretely, we assume that the components of the random input ξ = (ξ_1, ξ_2, . . . , ξ_d) are mutually independent, and that each ξ_i ∈ Γ_i ⊂ R admits a marginal probability density ρ_i. Then the joint density function of ξ is ρ(ξ) = ∏_{i=1}^d ρ_i(ξ_i) : Γ → R_+ with Γ := ∏_{i=1}^d Γ_i ⊂ R^d. The gPC approach seeks to construct a polynomial approximation of f(ξ) as follows:

f(ξ) ≈ ∑_{α∈Λ} c_α Φ_α(ξ), (2.1)

where α = (α_1, α_2, . . . , α_d) is a multi-index, Λ is a finite multi-index set, and the Φ_α are multivariate polynomials that are orthogonal with respect to the density ρ(ξ), i.e.,

∫_Γ ρ(ξ) Φ_α(ξ) Φ_β(ξ) dξ = δ_{α,β},  α, β ∈ Λ. (2.2)

Notice that the polynomials are defined as tensor products of the univariate orthogonal polynomials in each direction, i.e.,

Φ_α(ξ) = ∏_{i=1}^d φ^i_{α_i}(ξ_i)  with  ∫_{Γ_i} φ^i_{α_k}(ξ_i) φ^i_{α_l}(ξ_i) ρ_i(ξ_i) dξ_i = δ_{k,l}.

In this work we focus on the total-degree polynomial space, defined as

P(Λ) = span{ Φ_α | α ∈ Λ_k^TD },  with  Λ_k^TD := { α : |α| = ∑_{i=1}^d α_i ≤ k }. (2.3)

It is usually more convenient to use a single index instead of the multi-index; to this end, one can place an order on the multi-indices, i.e.,

{ α | α ∈ Λ } ←→ { 1, . . . , N }. (2.4)

Thus we have

{ Φ_α(ξ) }_{α∈Λ} ⇔ { Φ_j(ξ) }_{j=1}^{N},  N = dim(P(Λ)). (2.5)

Hereafter, for simplicity, we will use the single index j = 1, 2, . . . , N. Therefore, the gPC approximation (2.1) can be written as

f(ξ) ≈ f_N(ξ) = ∑_{j=1}^N c_j Φ_j(ξ)
(2.6)

The main purpose now is to estimate the coefficients {c_j}_{j=1}^N in an efficient way. Many numerical techniques for obtaining the polynomial coefficients in UQ problems have been developed in recent years, such as the intrusive stochastic Galerkin methods [22, 12, 29] and the non-intrusive collocation methods [5, 16, 31, 28, 32, 11, 7, 9].
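For concreteness, the total-degree set Λ_k^TD of (2.3) and a single-index ordering as in (2.4) can be enumerated in a few lines; a minimal sketch (the particular ordering chosen here is just one admissible convention):

```python
from itertools import product
from math import comb

def total_degree_indices(d, k):
    """Enumerate Lambda_k^TD = { alpha in N_0^d : |alpha| <= k } and
    fix a single-index ordering by sorting on the total degree."""
    indices = [alpha for alpha in product(range(k + 1), repeat=d)
               if sum(alpha) <= k]
    indices.sort(key=lambda a: (sum(a), a))
    return indices

d, k = 2, 3
Lam = total_degree_indices(d, k)
N = len(Lam)
print(N == comb(k + d, d))   # dim P(Lambda) = C(k+d, d); here N = 10
```

The cardinality N = C(k+d, d) of the total-degree space grows much more slowly than the full tensor-product space of size (k+1)^d, which is why this space is the common choice in moderate dimensions.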
3. Data-driven polynomial chaos: a moment-based approach .
The gPC methods discussed above assume exact knowledge of the involved probability density functions. However, the distribution information of the random input is very limited in many engineering applications, and sometimes the only information available is a set of sample locations. This is also known as epistemic uncertainty. To deal with these situations, more general types of polynomial chaos expansions have been investigated in the past few years, see e.g. [27, 26, 6, 20, 18, 1]. Here we shall review the idea of the arbitrary polynomial chaos (aPC for short) approach developed in [18, 1]. Such an approach can handle the situation when one only has sample locations (or when only moment information is available for the random input).
In this section, we review the basic idea of [18, 1]. We suppose that we are given moment information for the random input (while the associated distributions are unknown). Notice that this approach provides the possibility to propagate continuous or discrete probability density functions, and also histograms (data sets), as long as their moments exist and the determinant of the moment matrix is strictly positive (see details below). The aim is to construct a set of polynomial bases {Φ_j} that admit a good approximation of the underlying parametric problem. This will be done by using moment-matching methods. We first present the idea in the one-dimensional setting.

Suppose that the density function of a continuous random variable η ∈ I is ρ(η); then the k-th raw moment µ_k is defined by

µ_k = ∫_I η^k ρ(η) dη,  k = 0, 1, . . . . (3.1)

Similarly, if the random variable η is of discrete type, η ∈ Î, then its k-th moment is defined as

µ_k = ∑_{η∈Î} η^k ρ(η),  k = 0, 1, . . . . (3.2)

Finally, if a random variable is only presented as a set of M sample locations {η_1, η_2, . . . , η_M} (the setting in this work), the k-th moment µ_k can be calculated approximately by

µ_k = (1/M) ∑_{m=1}^M η_m^k,  k = 0, 1, . . . . (3.3)

Suppose we know the moments of η up to index 2K. Then we can consider constructing a set of orthogonal polynomial bases {φ_k(η)}_{k=0}^K of the general form

φ_k(η) = ∑_{j=0}^k β_j η^j,  k = 0, . . . , K.

By matching the moment information (orthogonality to all lower-order monomials, together with a normalization of the leading coefficient), we obtain for each degree k the linear system

\begin{pmatrix} µ_0 & µ_1 & \cdots & µ_k \\ µ_1 & µ_2 & \cdots & µ_{k+1} \\ \vdots & \vdots & & \vdots \\ µ_{k-1} & µ_k & \cdots & µ_{2k-1} \\ 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} β_0 \\ β_1 \\ \vdots \\ β_{k-1} \\ β_k \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}. (3.4)

Thus one can obtain the polynomial coefficients by inverting the above moment matrix. However, this matrix may become very ill-conditioned when k becomes large. An alternative approach, based on matrix operations on the Hankel matrix of moments, was proposed in [1].
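As an illustration, the empirical moments (3.3) and the degree-k moment-matching system (3.4) can be sketched as follows (assuming NumPy; the uniform samples are only a toy input standing in for data of unknown distribution):

```python
import numpy as np

def raw_moments(samples, order):
    """Empirical raw moments mu_0, ..., mu_order, as in (3.3)."""
    return np.array([np.mean(samples**j) for j in range(order + 1)])

def apc_coeffs(mu, k):
    """Coefficients (beta_0, ..., beta_k) of the degree-k aPC polynomial
    from the moment-matching system (3.4), normalized so beta_k = 1."""
    A = np.zeros((k + 1, k + 1))
    for row in range(k):              # k orthogonality conditions
        A[row, :] = mu[row:row + k + 1]
    A[k, k] = 1.0                     # normalization row
    rhs = np.zeros(k + 1)
    rhs[k] = 1.0
    return np.linalg.solve(A, rhs)

rng = np.random.default_rng(0)
eta = rng.uniform(-1.0, 1.0, 200000)          # samples, density "unknown"
beta = apc_coeffs(raw_moments(eta, 4), 2)
# for U[-1,1] the degree-2 polynomial should be close to x^2 - 1/3
print(np.round(beta, 3))
```

Already at moderate degrees the matrix in (3.4) becomes badly conditioned, which is the ill-conditioning issue noted above.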
To introduce the idea, we first define the Hankel matrix of moments as

H = \begin{pmatrix} µ_0 & µ_1 & \cdots & µ_k \\ µ_1 & µ_2 & \cdots & µ_{k+1} \\ \vdots & \vdots & & \vdots \\ µ_k & µ_{k+1} & \cdots & µ_{2k} \end{pmatrix}. (3.5)

If the moments are given by samples (3.3), we require that the set of M samples be determinate in the Hamburger sense, meaning that all the corresponding quadratic forms are strictly positive, that is, det(H) > 0. Given the above Hankel matrix of moments, we first perform the Cholesky decomposition to obtain H = R^T R with the upper triangular factor

R = \begin{pmatrix} r_{1,1} & r_{1,2} & \cdots & r_{1,k+1} \\ & r_{2,2} & \cdots & r_{2,k+1} \\ & & \ddots & \vdots \\ & & & r_{k+1,k+1} \end{pmatrix}. (3.6)

Then, the Mysovskih theorem [13] states that the entries of the matrix R define an orthonormal system of polynomials. Moreover, explicit analytic formulas for the polynomial coefficients are available through the three-term recurrence [8]:

η φ_{j−1}(η) = b_{j−1} φ_{j−2}(η) + a_j φ_{j−1}(η) + b_j φ_j(η),  j = 1, . . . , k. (3.7)

Here a_j and b_j can be computed from the entries of the matrix R:

a_j = r_{j,j+1}/r_{j,j} − r_{j−1,j}/r_{j−1,j−1},  b_j = r_{j+1,j+1}/r_{j,j}, (3.8)

where r_{0,0} = 1 and r_{0,1} = 0.

Remark 3.1.
In the above discussion, we have only presented the one-dimensional case. For high-dimensional cases, one can simply perform the same procedure as above in each direction, and then obtain the multivariate bases by using the tensor-product rule. Given such data-driven (or moment-driven) polynomial bases, one can then perform UQ computations for the underlying models. For example, a sparse grid method was proposed in [1], where the stochastic collocation points are generated, again by matrix operations, based on the data-driven bases discussed above.
Remark 3.2.
We remark again that the above aPC approach provides the possibility to propagate continuous or discrete probability density functions, and also data sets, as long as their moments exist and the determinant of the moment matrix is strictly positive. The expansion bases here are fully data-driven, and we do not require any distribution information. For cases with limited data, such an approach can avoid bias and fitting errors caused by wrong assumptions.
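The Hankel/Cholesky construction (3.5)-(3.8) amounts to a few lines of linear algebra; a sketch assuming NumPy, with the standard Gaussian moments used only as a sanity check (its recurrence coefficients are known in closed form):

```python
import numpy as np

def recurrence_from_moments(mu, k):
    """Recurrence coefficients (a_j, b_j), j = 1..k, of the orthonormal
    polynomials, via Cholesky of the Hankel moment matrix (3.5)-(3.8)."""
    H = np.array([[mu[i + j] for j in range(k + 1)] for i in range(k + 1)])
    R = np.linalg.cholesky(H).T          # H = R^T R with R upper triangular
    a = np.zeros(k)
    b = np.zeros(k)
    for j in range(1, k + 1):            # 1-based indices as in (3.8)
        a[j - 1] = R[j - 1, j] / R[j - 1, j - 1]
        if j >= 2:                       # r_{0,1} = 0 covers the j = 1 case
            a[j - 1] -= R[j - 2, j - 1] / R[j - 2, j - 2]
        b[j - 1] = R[j, j] / R[j - 1, j - 1]
    return a, b

# raw moments of the standard Gaussian: 1, 0, 1, 0, 3, 0, 15, 0, 105
mu = [1, 0, 1, 0, 3, 0, 15, 0, 105]
a, b = recurrence_from_moments(mu, 4)
print(a)   # Gaussian: all a_j = 0
print(b)   # Gaussian: b_j = sqrt(j)
```

The Cholesky factorization fails exactly when H is not positive definite, so it also serves as a practical check of the determinacy requirement det(H) > 0.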
We have reviewed the moment-matching approach for constructing data-driven bases for UQ studies, which requires that the moment problem be uniquely solvable. Following closely [6], we now provide some mild conditions that guarantee this requirement. Our basic assumptions are as follows:
• Assumption 1: each basic random variable η possesses finite moments of all orders.
• Assumption 2: the associated distribution functions F_η(x) := P(η ≤ x) of the basic random variables are continuous.
Notice that these assumptions are needed only for the theoretical analysis; the approach above can still be used even if the probability distributions are of discrete type, as long as their moments exist and the determinant of the moment matrix is strictly positive. In other words, the following theorem applies only when the input random variables satisfy the above two assumptions; for more general settings, the relevant theoretical foundation is still open.

Theorem 3.1 ([6]). If one of the following conditions holds, then the moment problem is uniquely solvable, and therefore the set of polynomials in the random variable η (constructed by the moment-matching approach) is dense in the space L²(Ω, σ(η), P), where Ω is the abstract set of elementary events, σ(η) is a σ-algebra of subsets of Ω, and P is a probability measure on σ(η).
1. The distribution F_η has compact support, i.e., there exists a compact interval [a, b], a, b ∈ R, such that P(η ∈ [a, b]) = 1.
2. The moment sequence {µ_k}_{k∈N} of the distribution satisfies lim inf_{k→∞} µ_{2k}^{1/(2k)}/(2k) < ∞.
3. The random variable η is exponentially integrable, i.e., there holds

E[exp(a|η|)] = ∫_R exp(a|x|) F_η(dx) < ∞

for some strictly positive number a. An equivalent condition is the existence of a finite moment-generating function in a neighbourhood of the origin.
4. (Carleman's condition) The moment sequence {µ_k}_{k∈N} of the distribution satisfies

∑_{k=1}^∞ µ_{2k}^{−1/(2k)} = ∞.
5. (Lin's condition) The distribution has a symmetric, differentiable and strictly positive density f_η, and for some real number x_0 > 0 there holds

∫_{−∞}^{∞} (−log f_η(x))/(1 + x²) dx = ∞  and  −x f′_η(x)/f_η(x) ↗ ∞  (x → ∞, x ≥ x_0).

The theorem above states that the orthogonal polynomials form a complete basis of L²(Ω, σ(η), P), and thus one can expect good approximation properties when using such bases.
4. Weighted least-squares for postprocessing.
As mentioned above, once we have the data-driven bases, we can perform UQ studies based on them. A sparse grid method was proposed in [1], where the collocation points are generated based on the data-driven bases. In this section, we propose instead to use a least-squares approach for the postprocessing computations. Our approach admits several advantages. First of all, we simply sample from a known measure (the equilibrium measure) to generate collocation points; this sampling strategy is very cheap, and no matrix operations are needed, in contrast to the sparse grid approach of [1]. Secondly, our sampling strategy is independent of the data-driven bases, and thus this procedure can be done in advance. Finally, our least-squares solver is linearly stable in many cases of interest. Details of our approach are presented in the following subsections.
We now introduce the weighted least-squares procedure for computing the expansion coefficients {c_j}_{j=1}^N in the expansion

f(ξ) ≈ ∑_{j=1}^N c_j Φ_j(ξ),

with {Φ_j(ξ)}_{j=1}^N being the data-driven orthogonal bases constructed in Section 3; we denote the associated polynomial space by

P_N := span{ Φ_j(ξ), 1 ≤ j ≤ N }.

We recall that the polynomial space considered in this work is of total-degree type (2.3), and the associated maximum polynomial order is denoted by k. The weighted least-squares approach computes the coefficients via sample evaluations. To this end, suppose we have the evaluations {f(z_m)}_{m=1}^M at some properly chosen samples {z_m}_{m=1}^M. Then we seek the weighted discrete least-squares approximation f_N ∈ P_N defined by

f_N := P_N^M f = argmin_{p∈P_N} (1/M) ∑_{m=1}^M w_m ( p(z_m) − f(z_m) )². (4.1)

Here {w_m}_{m=1}^M are properly designed weights. An equivalent algebraic formulation of the above problem is

c = argmin_{c∈R^N} ‖ W^{1/2} A c − W^{1/2} f ‖₂, (4.2)

where f = (f(z_1), . . . , f(z_M))^T, A = [Φ_j(z_m)] ∈ R^{M×N}, j = 1, . . . , N, m = 1, . . . , M, and W = diag(w_1, . . . , w_M) is the preconditioning matrix. Notice that in the above approach, the sampling strategy and the preconditioner are the two key points. Here we adopt the strategy of [14]: Christoffel-function-weighted least-squares. To this end, we define the associated (scaled) Christoffel-type function of P_N by

K(ξ) = N / ∑_{j=1}^N Φ_j²(ξ). (4.3)

The entries of the preconditioning matrix W in our weighted least-squares are evaluations of this (scaled) Christoffel function, i.e.,

w_m = N / ∑_{j=1}^N Φ_j²(z_m),  m = 1, . . . , M.
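The weighted solve (4.2)-(4.3) can be sketched in a few lines, assuming NumPy; the basis and model used here (an orthonormal Legendre basis and a toy f with Chebyshev-distributed samples) are placeholders for the data-driven quantities:

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def christoffel_weighted_lsq(A, f_vals):
    """Solve (4.2): c = argmin_c || W^{1/2} (A c - f) ||_2, with the
    Christoffel weights w_m = N / sum_j Phi_j(z_m)^2 from (4.3)."""
    M, N = A.shape
    w = N / np.sum(A**2, axis=1)
    sw = np.sqrt(w)
    c, *_ = np.linalg.lstsq(sw[:, None] * A, sw * f_vals, rcond=None)
    return c

# toy 1-d illustration
rng = np.random.default_rng(1)
M, N = 200, 10
z = np.cos(np.pi * rng.uniform(size=M))                  # Chebyshev samples
A = legvander(z, N - 1) * np.sqrt(2 * np.arange(N) + 1)  # orthonormal w.r.t. U[-1,1]
f_vals = z**3                                            # model evaluations f(z_m)
c = christoffel_weighted_lsq(A, f_vals)
# x^3 lies in the space, so only the degree-1 and degree-3 coefficients survive
print(np.round(c, 6))
```

Since the weights enter both sides of (4.2), an exact representation in P_N is recovered exactly; the weighting matters for stability when f is not polynomial and M is close to N.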
We are now ready to summarize the procedure of our Christoffel-function-weighted least-squares method (more detailed discussions of the sampling strategy and the theoretical motivations will be given later):
• Sample with respect to the probability density ρ̂ of an equilibrium measure, which depends on the input density ρ. When ξ is a random vector with unbounded state space, ρ̂ also depends on k, the maximum polynomial degree of the total-degree polynomial space P_N; in this case, we denote the sampling measure by ρ̂_k.
• Evaluate the function f(ξ) (the underlying model) at the selected samples {z_m}_{m=1}^M.
• Form the M × N Vandermonde-like matrix A with entries Φ_n(z_m).
• Form the diagonal preconditioning matrix W using evaluations of the (scaled) Christoffel function.
• Solve the preconditioned least-squares problem (4.2) to approximate the expansion coefficients {c_j}_{j=1}^N.
Notice that the main feature of our weighted least-squares approach is that the sampling strategy is independent of the data-driven bases; thus this procedure, and the associated model simulations, can be done a priori. Moreover, we shall show in the following that the sampling strategies are straightforward.

We first consider the bounded case, where we assume (without loss of generality) that the computational domain of ξ is [−1, 1]^d. In this case, our sampling measure is always the tensor-product Chebyshev measure, i.e.,

ρ̂(ξ) = (1/π^d) ∏_{k=1}^d 1/√(1 − ξ_k²),

regardless of the underlying measure (if it exists, yet unknown) of the random vector ξ. In other words, the only information we require is that the random variable is located in a bounded domain. Notice that the equilibrium measure for a bounded domain with any admissible input density is the Chebyshev measure; in other words, the Chebyshev measure is universal in the bounded setting.
Notice that sampling from the Chebyshev measure is straightforward: one can simply generate uniformly distributed samples {u_m}_{m=1}^M on [0, π] and then set

z_m = cos(u_m),  m = 1, . . . , M.

We now consider the unbounded case. We remark that very few results are known for the equilibrium measure on unbounded domains. Thus, the results in what follows are our conjectures, whose effectiveness has been well studied numerically in [14].
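The bounded-case sampler just described takes only a couple of lines; the comparison against the arcsine CDF at the end is a quick sanity check:

```python
import numpy as np

def chebyshev_samples(M, d, rng):
    """M draws from the tensor-product Chebyshev (equilibrium) measure
    on [-1,1]^d, via z = cos(u) with u ~ U[0, pi] in each coordinate."""
    return np.cos(rng.uniform(0.0, np.pi, size=(M, d)))

rng = np.random.default_rng(0)
z = chebyshev_samples(100000, 1, rng)
# empirical CDF vs. the arcsine law F(t) = 1 - arccos(t)/pi
t = 0.5
print(np.mean(z <= t), 1 - np.arccos(t) / np.pi)
```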
The domain R^d with Gaussian density. We consider the domain R^d with a Gaussian-type input N(µ, σ) (where the parameters µ, σ can be arbitrary/unknown). Since in our setting we only have sample locations, we first compute an approximate pair (µ̂, σ̂) of the input. Then a simple linear transformation ξ̂ = (ξ − µ̂)/σ̂ can be used to make sure that the input has (approximately) the standard normal distribution N(0, 1). Our conjectured equilibrium density for N(0, 1) is given by

ρ̂(ξ) = C (1 − ‖ξ‖²/2)^{d/2},

with C a normalization constant. Furthermore, we expand the associated samples (generated from the above measure) by the square root of the maximum polynomial degree k. The following is a concrete way to sample from this expanded density:
1. Compute k, the maximum polynomial degree of P_N.
2. Generate a vector y = (y_1, . . . , y_d) of d independent standard normal random variables.
3. Draw a scalar sample ν from the Beta distribution on [0, 1] with parameters α = d/2 and β = d/2 + 1.
4. Set z = (y/‖y‖) √(2kν).
The above procedure generates samples in the Euclidean ball of radius √(2k) in R^d. We emphasize that our methodology samples from a density that is only a conjecture for the correct equilibrium measure. We also remark that this approach introduces a density error, as the mean and variance are computed approximately. How to quantify and control such errors will be the subject of future work.

The domain R^d_+ with exponential density. Let ξ take values in R^d_+ with an associated exponential-type probability density (again, the associated parameters can be arbitrary). As before, we first compute an approximate mean value so that we can work with the standard exponential-type probability density. In this case we sample from the following density function:

ρ̂(ξ) = C √( (1 − ∑_{i=1}^d ξ_i)^d / ∏_{i=1}^d ξ_i ).

As conjectured in [14, 11], this is the equilibrium measure associated with this choice of ρ. We shall also expand the samples by the maximum polynomial degree k. The following is a concrete way to sample from this expanded density:
1. Compute k, the maximum polynomial degree of the polynomial space.
2. Generate a (d + 1)-dimensional Dirichlet random vector y with parameters (1/2, 1/2, . . . , 1/2, d/2 + 1).
3. Truncate the last ((d + 1)-th) entry of y.
4. Set z = 4k y.

Remark 4.1.
In the above, we have only discussed the two most commonly used densities on unbounded domains, i.e., the Gaussian density and the exponential density. For more general unbounded densities, less is known about the equilibrium measure (even at the level of conjectures). A possible way to handle such situations is to truncate the domain into a finite one, and then perform the Chebyshev sampling in the finite domain. However, this is non-trivial due to the truncation error, and we leave such cases for future studies.

In this section, we provide some motivation for our Christoffel-weighted least-squares. We shall only present the motivation in the bounded-domain setting; one can refer to [14] for the motivation in the unbounded cases. To begin, we first present the following fundamental result on least-squares stability [3]:
Theorem 4.1.
For a d-dimensional function f(ξ), consider its approximation in a finite orthogonal basis space P_N = span{Φ_j(ξ), 1 ≤ j ≤ N} with associated orthogonality density ρ(ξ). Suppose the samples {z_m}_{m=1}^M are generated with respect to ρ(ξ), and consider the following least-squares approach:

c = argmin_{c∈R^N} ‖ Ac − f ‖₂. (4.4)

Then the above algorithm is stable in the following sense:

Pr{ ‖ (1/M) A^T A − I ‖ ≥ 1/2 } ≤ 2M^{−r},

provided that

κ(N) := max_ξ ∑_{j=1}^N Φ_j²(ξ) ≤ δ M/log M,  with δ = (1 − log 2)/(2 + 2r).

Here I is the identity matrix.

The above theorem states that, to make the algorithm stable, it is essential to control the quantity κ(N), as one requires approximately M ≳ κ(N) (up to a logarithmic factor). However, in many cases the quantity κ(N) behaves super-linearly in N, leading to an overly demanding condition on the sample size M to guarantee stability. For example, the most commonly used Legendre polynomials give κ(N) ∼ N², meaning that one requires M ≥ CN², which is not satisfactory.

The above observations motivate us to use a weighted version of least-squares. In our approach, by introducing the preconditioner W, we are in fact working with a scaled basis set (see (4.3) for the definition of K(ξ)):

P̂_N = span{ Φ̂_j = Φ_j √(K(ξ)) | 1 ≤ j ≤ N }. (4.5)

It is easy to show that for the new bases Φ̂_j it holds that

κ̂(N) := max_ξ ∑_{j=1}^N Φ̂_j²(ξ) ≡ N. (4.6)

This means that we have optimal control of the associated quantity κ̂(N). However, to obtain optimal stability from Theorem 4.1 (which uses samples drawn from the orthogonality measure), we have to sample from the transformed measure

ρ̃(ξ) ∼ K(ξ) ρ(ξ) = N ρ(ξ) / ∑_{j=1}^N Φ_j²(ξ), (4.7)

as our new bases are orthogonal with respect to ρ̃(ξ).
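As a side check, the super-linear growth κ(N) ∼ N² for Legendre bases, in contrast to the identity (4.6) for the scaled bases, is easy to observe numerically (orthonormal Legendre in one dimension, a stand-in for the data-driven bases):

```python
import numpy as np
from numpy.polynomial.legendre import legvander

x = np.linspace(-1.0, 1.0, 4001)
kappas = {}
for N in (4, 8, 16, 32):
    # orthonormal Legendre basis w.r.t. the uniform density on [-1,1]
    Phi = legvander(x, N - 1) * np.sqrt(2 * np.arange(N) + 1)
    S = np.sum(Phi**2, axis=1)       # sum_j Phi_j(x)^2, maximized at x = +/-1
    kappas[N] = np.max(S)            # equals sum_j (2j+1) = N^2
    # the scaled bases of (4.5) would instead give K(x) * S(x) = N identically
print(kappas)
```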
Notice that ρ̃(ξ) depends on the polynomial space, and furthermore, sampling from ρ̃(ξ) seems to be non-trivial. Nevertheless, we learn from potential theory in the bounded setting that [14]

ρ̃(ξ) → ρ̂(ξ),  as N → ∞. (4.8)

This result motivates us to sample from ρ̂(ξ), the equilibrium measure. In this way, we obtain a stable approach in the asymptotic sense (N → ∞). Furthermore, our sampling strategy is now independent of the polynomial space, which is advantageous for adaptive computations where the polynomial spaces are constructed adaptively. In the bounded setting, for any admissible input density, the equilibrium measure is simply the Chebyshev density, and this is exactly the motivation for us to introduce the Christoffel-weighted least-squares.
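The two unbounded-case samplers described in the previous subsection can be sketched as follows (a sketch only: the densities involved are conjectured equilibrium measures, and the Beta/Dirichlet parameters follow the numbered steps above):

```python
import numpy as np

def gaussian_equilibrium_samples(M, d, k, rng):
    """Steps 1-4 for the Gaussian case: samples in the ball of
    radius sqrt(2k) in R^d."""
    y = rng.standard_normal((M, d))
    nu = rng.beta(d / 2, d / 2 + 1, size=M)
    return y / np.linalg.norm(y, axis=1, keepdims=True) * np.sqrt(2 * k * nu)[:, None]

def exponential_equilibrium_samples(M, d, k, rng):
    """Steps 1-4 for the exponential case: Dirichlet samples on the
    simplex, truncated and scaled by 4k."""
    alpha = np.append(0.5 * np.ones(d), d / 2 + 1)
    y = rng.dirichlet(alpha, size=M)[:, :d]   # drop the (d+1)-th entry
    return 4 * k * y

rng = np.random.default_rng(0)
zg = gaussian_equilibrium_samples(1000, 3, 5, rng)
ze = exponential_equilibrium_samples(1000, 3, 5, rng)
print(np.max(np.linalg.norm(zg, axis=1)))   # bounded by sqrt(2k)
print(np.max(ze.sum(axis=1)))               # bounded by 4k
```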
5. Numerical experiments.
In this section, we present several numerical examples to show the effectiveness of our Christoffel-weighted least-squares for data-driven polynomial approximations. We are primarily interested in investigating how the sampling rate between M and N affects stability and accuracy. Due to the probabilistic nature of the random sampling method, all reported results are averaged over 100 independent tests to reduce the statistical oscillations. In all our figures and numerical tests, we show the performance with a linear and a log-linear dependence between M and N, namely, M = CN and M = CN log N. The following stochastic input distributions will be considered:
• Discrete binomial distribution Bino(n, p), mapped to [−1, 1]:

f(k; n, p) = P(ξ = 2k/n − 1) = n!/(k!(n − k)!) p^k (1 − p)^{n−k},  k = 0, 1, . . . , n;

• Discrete Poisson distribution Pois(λ):

f(k | λ) = λ^k exp(−λ)/k!;

• Uniform distribution U[a, b]:

f(x) = 1/(b − a) for x ∈ [a, b], and 0 otherwise;

• Exponential distribution Exp(µ) on (0, ∞) with parameter µ:

f(x | µ) = (1/µ) exp(−x/µ);

• Normal distribution N(µ, σ) on (−∞, ∞) with parameters µ, σ:

f(x | µ, σ) = (1/(√(2π) σ)) exp(−(x − µ)²/(2σ²)).

We first test the condition number of the design matrix,

Cond(Â) = λ_max(Â)/λ_min(Â)  with  Â = W^{1/2} A,

as a function of M/N. Notice that this quantity measures the sensitivity of the solution of a system of linear equations to errors in the data; that is, it directly reflects the stability of the method. In all the examples that follow, we perform 100 trials of each procedure and report the mean condition number along with the 20% and 80% quantiles.

Table 1: Test examples for the two-dimensional case.
Type  Parametric distributions
1  ξ_1 ∼ Bino(20, 1/2), ξ_2 ∼ U[−0.6, 0.6]
2  ξ_1 ∼ U[−0.8, 0.8], ξ_2 ∼ U[−1, 1]
3  ξ_1 ∼ Bino(20, 1/2), ξ_2 ∼ Pois(10)
4  ξ_1 ∼ U[−0.6, 0.6], ξ_2 ∼ N(0.1, 1.2)
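For intuition, this condition-number experiment can be reproduced in miniature with a Legendre stand-in for the data-driven bases (Chebyshev sampling plus Christoffel weighting, as described in Section 4):

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def design_condition(M, N, rng):
    """Cond(W^{1/2} A) for Chebyshev samples and an orthonormal
    Legendre basis (a toy stand-in for the data-driven bases)."""
    z = np.cos(np.pi * rng.uniform(size=M))
    A = legvander(z, N - 1) * np.sqrt(2 * np.arange(N) + 1)
    w = N / np.sum(A**2, axis=1)              # Christoffel weights (4.3)
    return np.linalg.cond(np.sqrt(w)[:, None] * A)

rng = np.random.default_rng(0)
N = 20
conds = {M: design_condition(M, N, rng) for M in (2 * N, int(N * np.log(N)))}
print(conds)   # moderate condition numbers for both sampling rates
```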
Fig. 1: Condition number with respect to the polynomial degree in the 2-dimensional case (Table 1) with different sampling rates.

We first consider the two-dimensional tests. Four different test cases are given in Table 1, where uniform distributions with different parameters in each dimension, as well as mixtures of distributions (including the binomial, Poisson and normal distributions), are taken into account. Notice that the fourth test case includes both bounded and unbounded distributions, and thus in this test we sample from a different equilibrium measure in each dimension. Here we use the associated moments directly (so that the numerical error of computing the moments from samples can be neglected) to construct the data-driven bases (by the moment-matching method). We then sample from the equilibrium measure and construct the associated design matrix. The condition numbers of the design matrix for the four test cases are presented in Fig. 1. Different sampling rates are reported, i.e., M = 1.5N, M = 2
N, M = N log N, and M = 1 . N log N. We notice that the log-linear sampling rate produces more sta-ble results – the condition number is bounded above for the first three test cases.However, for the fourth test case, we still observed a slightly growing trend, and thisis due to the involved unbounded random variable.We next consider a synthetic example for an empirical data distribution. Thesimulation data set is generated as the superposition of uniform, normal and log-normal distributions (with sample size M = 10000), see Fig. 2 (Left). Here weconstruct the data-driven polynomial bases based on the moments that are computedby those samples. The corresponding condition number for this test case is shown inFig. 2 (Right). Again, we observe that the log-linear sampling rate provides morestable result. polynomial order i c ond t i on nu m be r d=2, "Histogram" CLS M=1.5NCLS M=2NCLS M=NlogNCLS M=1.5NlogN
Fig. 2: Left: Data distribution shown as a histogram. Right: Condition number with respect to the polynomial degree in the 2-dimensional polynomial spaces.

Finally, we test the stability in the five dimensional case. The input random parameters are listed in Table 2, and the corresponding condition numbers for the different test cases are reported in Fig. 3. For all test cases, the design matrix is more stable under the log-linear sampling rates. However, for cases that involve unbounded parameters, we can still observe a slightly growing trend.
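The moment-based basis construction used throughout these tests can be sketched in one dimension as follows. This is our own minimal illustration of the moment-matching idea (function and variable names are ours, not from [1]): the Hankel matrix of raw moments is the Gram matrix of the monomials, and a Cholesky factorization of it yields the orthonormal polynomial coefficients.

```python
import numpy as np

def apc_basis_from_moments(moments):
    """Coefficients of orthonormal polynomials p_0, ..., p_n from raw moments.

    moments[k] = E[xi^k] for k = 0, ..., 2n.  Returns a lower-triangular
    matrix C with p_i(x) = sum_j C[i, j] * x**j.  Requires the Hankel
    moment matrix to be positive definite.
    """
    m = np.asarray(moments, dtype=float)
    n = (len(m) - 1) // 2
    # Gram (Hankel) matrix of the monomial basis: G[i, j] = E[xi^(i+j)].
    G = np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)])
    L = np.linalg.cholesky(G)          # G = L L^T
    return np.linalg.inv(L)            # C G C^T = I, so the rows are orthonormal

# Sanity check: the uniform density on [-1, 1] has moments E[x^k] = 1/(k+1)
# for even k and 0 for odd k; the result should match the normalized
# Legendre polynomials, e.g. p_1(x) = sqrt(3) * x.
mom = np.array([1.0 / (k + 1) if k % 2 == 0 else 0.0 for k in range(7)])
C = apc_basis_from_moments(mom)
```

With empirical data, `moments` would simply be replaced by sample moments, which is how the histogram tests above are set up.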
We now test the approximation accuracy of the data-driven bases with the Christoffel least-squares post-processing. We use the discrete ℓ² error to measure the performance of the approximation: for a given function f(ξ) and a given set of random samples {z_l}_{l=1}^{L} in the state space, we evaluate the numerical error via

ε = ( (1/L) Σ_{l=1}^{L} |f_N(z_l) − f(z_l)|² )^{1/2},

where f_N is the least-squares solution using the data-driven bases.

Table 2: Test examples for the five dimensional case.
Type  Parametric Distributions
1     ξ_i ∼ U[a_i, b_i], a = [−0.1, −0.5, −0.8, −1, −1.2], b = −a
2     ξ_i ∼ N(µ_i, σ_i), µ = [0, 0.1, −0.1, 0.2, −0.2], σ = [1, 1.1, 1.2, 1, 0.9]
3     ξ_1, ξ_2 ∼ U[−1, 1], ξ_3, ξ_4 ∼ Bino(20, 1/2), ξ_5 ∼ Pois(10)
4     ξ_1 ∼ U[−0.1, 0.1], ξ_2 ∼ U[−1, 1], ξ_3 ∼ N(0, 1), ξ_4 ∼ N(0.1, 1.5), ξ_5 ∼ N(0.2, 2)
Fig. 3: Condition number with respect to the polynomial degree for the five-dimensional tests in Table 2.
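To make the weighted design matrix in these stability experiments concrete, the sketch below assembles Â = √W · A and its condition number for a one-dimensional toy setup. It is our own illustration: the orthonormal Legendre basis stands in for a data-driven basis (only the construction of A would change), and the arcsine density is the equilibrium measure of [−1, 1].

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def christoffel_design(samples, N):
    """Weighted design matrix sqrt(W) * A for the first N orthonormal
    Legendre polynomials (a stand-in for a data-driven basis)."""
    A = np.stack([np.sqrt(2 * j + 1) * Legendre.basis(j)(samples)
                  for j in range(N)], axis=1)
    w = N / np.sum(A * A, axis=1)            # Christoffel weights
    return np.sqrt(w)[:, None] * A

rng = np.random.default_rng(0)
N = 21                                       # basis size (degree 20, d = 1)
M = int(np.ceil(1.5 * N * np.log(N)))        # the log-linear sampling rate
x = np.cos(np.pi * rng.random(M))            # arcsine (equilibrium) samples
cond = np.linalg.cond(christoffel_design(x, N))
```

At the log-linear rate the weighted Gram matrix concentrates near the identity, which is why the condition numbers in Figs. 1 and 3 stay bounded for bounded inputs.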
We first consider the following test functions:

f_1(ξ) = exp( Σ_{k=1}^{d} ξ_k ),
f_2(ξ) = Σ_{k=1}^{d} [ a (ξ_k − w) + sin(ξ_k − w) ],
f_3(ξ) = exp( − Σ_{k=1}^{d} c_k (ξ_k − w)² ),  c_k = exp(−k/d),
f_4(ξ) = sin( Σ_{k=1}^{d} ξ_k ),

where a and w are fixed constants. The distribution information for the above parameters coincides with Table 1. The convergence rates of our approach in the two-dimensional case are presented in Fig. 4. It is clearly shown that the Christoffel least-squares approach provides very stable and accurate approximation results. In Fig. 5, we also test the five dimensional cases with parameters defined in Table 2 (types 1 and 4) for two of the above test functions. Again, our approach admits very stable approximation results.

Finally, we consider tests with histogram data for both the two and five dimensional cases. We consider two sets of data generated as superpositions of uniform, normal, and log-normal distributions. Results of these approximations are given in Figs. 6 and 7.

[Fig. 4 panels: d = 2 with Pois(10) bounded in [−1, 1]; Bino(20, 1/2); N(0, 1) and N(0.1, 1.2); U[−0.6, 0.6] and N(0.1, 1.2); legend: CLS with M = 1.5N, 2N, N log N, 1.5N log N.]
Fig. 4: Approximation error against polynomial degree for the two dimensional case. polynomial order i -4 -3 -2 -1 D i sc r e t e L e rr o r d=5, mixed Uniform distribution CLS M=1.5NCLS M=2NCLS M=NlogNCLS M=1.5NlogN polynomial order i -12 -10 -8 -6 -4 -2 D i sc r e t e L e rr o r d=5, mixed Uniform and Normal distribution CLS M=1.5NCLS M=2NCLS M=NlogNCLS M=1.5NlogN
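A minimal sketch of the fit-and-validate loop behind these error plots, again in one dimension with an orthonormal Legendre stand-in basis on [−1, 1] (all names are ours; a data-driven basis would replace the Legendre one). Training samples follow the arcsine equilibrium measure with Christoffel weights; validation samples follow the (uniform) input density, and the discrete ℓ² error defined earlier is returned.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def fit_and_error(f, n, M, L, rng):
    """Christoffel least-squares fit of degree n and discrete l2 error."""
    N = n + 1
    def design(x):
        # Orthonormal Legendre design matrix: column j holds p_j(x).
        return np.stack([np.sqrt(2 * j + 1) * Legendre.basis(j)(x)
                         for j in range(N)], axis=1)
    x = np.cos(np.pi * rng.random(M))          # equilibrium-measure samples
    A = design(x)
    s = np.sqrt(N / np.sum(A * A, axis=1))     # sqrt of Christoffel weights
    c, *_ = np.linalg.lstsq(s[:, None] * A, s * f(x), rcond=None)
    z = rng.uniform(-1.0, 1.0, L)              # validation samples
    return np.sqrt(np.mean((design(z) @ c - f(z)) ** 2))

rng = np.random.default_rng(1)
err = fit_and_error(np.sin, n=10, M=200, L=1000, rng=rng)
```

For a smooth target such as sin(ξ), the error decays spectrally in the degree, matching the straight lines (on a log scale) seen in Fig. 4.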
Fig. 5: Approximation error against polynomial degree for the five dimensional case.
Fig. 6: Approximation error against polynomial degree for the two dimensional case.
Fig. 7: Approximation error against polynomial degree for the five dimensional case.
We now consider the electrical resistor network shown in Fig. 8. The network comprises d = 2p resistors {R_i}_{i=1}^{d} of uncertain ohmage, and it is driven by a voltage source providing a known potential V_0 = 1. We are interested in determining the voltage V, which depends on all d = 2p resistances. We treat the resistances as random parameters and consider the cases d = 2 and d = 4. To be concrete, in the two dimensional case the parameters are uniformly distributed, ξ_i ∼ U[10, 100], while in the four dimensional case we set ξ_1 ∼ Exp(0.9), ξ_2 ∼ Exp(1.1), ξ_3 ∼ Exp(0.8), and ξ_4 ∼ Exp(1.0). The corresponding approximation errors are reported in Fig. 9.
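Fig. 8 itself is not reproduced here, so the sketch below uses generic nodal analysis: given any list of resistors as graph edges, it solves Kirchhoff's equations for the node voltages. The two-resistor divider at the end is a hypothetical stand-in for the actual network; all names are ours.

```python
import numpy as np

def node_voltages(branches, n_nodes, source=0, V0=1.0, ground=None):
    """Nodal analysis: solve G v = b with v[source] = V0 and v[ground] = 0.

    branches: list of (i, j, R) resistors between nodes i and j.
    """
    G = np.zeros((n_nodes, n_nodes))
    for i, j, R in branches:
        g = 1.0 / R                       # conductance of this branch
        G[i, i] += g; G[j, j] += g
        G[i, j] -= g; G[j, i] -= g
    b = np.zeros(n_nodes)
    fixed = {source: V0}
    if ground is not None:
        fixed[ground] = 0.0
    for k, val in fixed.items():          # impose Dirichlet rows
        G[k, :] = 0.0; G[k, k] = 1.0; b[k] = val
    return np.linalg.solve(G, b)

# Illustrative d = 2 case: a divider V0 -- R1 -- V -- R2 -- ground,
# for which V = V0 * R2 / (R1 + R2).
v = node_voltages([(0, 1, 10.0), (1, 2, 30.0)], n_nodes=3, source=0, ground=2)
```

Each least-squares sample in this example corresponds to one such solve with the resistances drawn from the input distributions.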
Fig. 9: Approximation error against polynomial degree k. Left: the two-dimensional isotropic uniform distribution. Right: the four-dimensional anisotropic exponential distribution.

We finally consider the following stochastic elliptic PDE:

−∇ · ( a(y, ω) ∇u(y, ω) ) = f(y, ω) in D × Ω,   u(y, ω) = 0 on ∂D × Ω,   (5.1)

with spatial domain D = [0, 1]². We set a deterministic load f(y, ω) = cos(y_1) sin(y_2) for these numerical examples. The random diffusion coefficient a_N(y, ω) is chosen as in [2]:

log( a_N(y, ω) − 0.5 ) = 1 + ξ_1(ω) ( √π L / 2 )^{1/2} + Σ_{i=2}^{d} ζ_i g_i(y) ξ_i(ω),

where

ζ_i := ( √π L )^{1/2} exp( −( ⌊i/2⌋ π L )² / 8 ),  for i > 1,

and

g_i(y) := sin( ⌊i/2⌋ π y_1 / L_p ) if i is even,  cos( ⌊i/2⌋ π y_1 / L_p ) if i is odd.

Here {ξ_i}_{i=1}^{d} are independent random variables. For y_1 ∈ [0, 1], let L_c = 1/12 be the desired physical correlation length for a(y, ω). Then the parameters L_p and L are L_p = max{1, 2L_c} and L = L_c/L_p, respectively. In our numerical tests, for each sample, the deterministic elliptic equation is solved by a standard finite element method on a fine mesh. The quantity of interest is the solution value u(0.5, 0.5, ξ). We set the parametric densities as ξ_1, ξ_2 ∼ Bino(20, 0.5) and as ξ_1 ∼ N(0, 1), ξ_2 ∼ N(0.1, 1.2), respectively; the corresponding approximation errors are shown in Fig. 10.
Fig. 10: Approximation error against polynomial degree of the parametric PDE.9
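For reference, evaluating the truncated log-diffusion expansion above for a given parameter vector can be sketched as follows (the function name and the spot-check values are ours; the FEM solve itself is omitted).

```python
import numpy as np

def diffusion_log_kl(y1, xi, Lc=1.0 / 12.0):
    """Evaluate log(a_N(y) - 0.5) for the truncated KL-type field of [2].

    y1: spatial coordinate in [0, 1]; xi: array of d independent parameters.
    """
    d = len(xi)
    Lp = max(1.0, 2.0 * Lc)
    L = Lc / Lp
    val = 1.0 + xi[0] * np.sqrt(np.sqrt(np.pi) * L / 2.0)
    for i in range(2, d + 1):
        zeta = np.sqrt(np.sqrt(np.pi) * L) * np.exp(-((i // 2) * np.pi * L) ** 2 / 8.0)
        if i % 2 == 0:
            g = np.sin((i // 2) * np.pi * y1 / Lp)
        else:
            g = np.cos((i // 2) * np.pi * y1 / Lp)
        val += zeta * g * xi[i - 1]
    return val

# Spot check: at xi = 0 every stochastic term vanishes, so a = 0.5 + e^1.
a = 0.5 + np.exp(diffusion_log_kl(0.5, np.zeros(4)))
```

The shift by 0.5 keeps the coefficient uniformly positive, so each sampled diffusion field yields a well-posed elliptic solve.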
6. Conclusions.
We have combined the idea of data-driven polynomial chaos expansions with the weighted least-squares approach to solve UQ problems. We adopt the basis construction procedure of [1] and then propose a weighted least-squares approach for the UQ computations. Our sampling strategy is independent of the random input: we propose to sample from the equilibrium measure, and this measure is also independent of the data-driven bases. Thus, the procedure can be done a priori (in an offline manner). Moreover, the proposed Christoffel function weighted least-squares problem is linearly stable in many cases of interest: the required number of PDE solves depends linearly on the number of bases.

There are, however, many unsolved problems related to this topic:
• Theoretical foundation. As discussed in Section 3.2, the assumption is that the probability density functions are continuous. However, the approach also works well for densities of discrete type, as long as their moments exist and the determinant of the moment matrix is strictly positive (see more numerical examples in [18]). The relevant theory for these cases is still open.
• Density error. We have assumed that only sample locations are given and that all the moments are computed from these finite samples; this necessarily introduces a density error. How to quantify (theoretically) and control this error is of great importance. This is also related to the density sensitivity of the underlying model.
• Unbounded domains. We have presented two simple cases in the unbounded domain setting. However, unlike the bounded case, for unbounded domains we need to assume that the type of the density is known (while the associated parameters can be unknown). This is obviously unsatisfactory. Another possible approach for such situations is to truncate the domain into a (potentially large) bounded one and perform the computation there.
However, this again introduces a truncation error.

We finally close this work by remarking that our strategy can also be used in the compressed sampling setting (i.e., the ℓ₁ approach) [4, 11, 10], and we shall report this in our future studies.

REFERENCES

[1] R. Ahlfeld, B. Belkouchi, and F. Montomoli. SAMBA: Sparse approximation of moment-based arbitrary polynomial chaos. J. Comput. Phys., 320:1–16, 2016.
[2] I. Babuska, F. Nobile, and R. Tempone. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Review, 52(2):317–355, 2010.
[3] A. Cohen, M. A. Davenport, and D. Leviatan. On the stability and accuracy of least squares approximations. Found. Comput. Math., 13:819–834, 2013.
[4] A. Doostan and H. Owhadi. A non-adapted sparse approximation of PDEs with stochastic inputs. J. Comput. Phys., 230:3015–3034, 2011.
[5] M. Eldred. Recent advances in non-intrusive polynomial chaos and stochastic collocation methods for uncertainty analysis and design. AIAA Paper 2009-2249, 2009.
[6] O. G. Ernst, A. Mugler, H.-J. Starkloff, and E. Ullmann. On the convergence of generalized polynomial chaos expansions. ESAIM Math. Model. Numer. Anal., 46(2):317–339, 2012.
[7] Z. Gao and T. Zhou. Choice of nodal sets for least square polynomial chaos method with application to uncertainty quantification. Commun. Comput. Phys., 16:365–381, 2014.
[8] G. H. Golub and J. H. Welsch. Calculation of Gauss quadrature rules. Math. Comput., 23:221–230, 1969.
[9] L. Guo, A. Narayan, L. Yan, and T. Zhou. Weighted approximate Fekete points: sampling for least-squares polynomial approximation. SIAM J. Sci. Comput., 40:A366–A387, 2018.
[10] L. Guo, A. Narayan, T. Zhou, and Y. Chen. Stochastic collocation methods via ℓ₁ minimization using randomized quadratures. SIAM J. Sci. Comput., 39(1):A333–A359, 2017.
[11] J. D. Jakeman, A. Narayan, and T. Zhou. A generalized sampling and preconditioning scheme for sparse approximation of polynomial chaos expansions. SIAM J. Sci. Comput., 39:A1114–A1144, 2017.
[12] E. Musharbash, F. Nobile, and T. Zhou. Error analysis of the dynamically orthogonal approximation of time dependent random PDEs. SIAM J. Sci. Comput., 37:A776–A810, 2015.
[13] I. P. Mysovskikh. On the construction of cubature formulas with fewest nodes. Dokl. Akad. Nauk SSSR, 178:1252–1254, 1968.
[14] A. Narayan, J. D. Jakeman, and T. Zhou. A Christoffel function weighted least squares algorithm for collocation approximations. Math. Comput., 86(306):1913–1947, 2017.
[15] A. Narayan and T. Zhou. Stochastic collocation on unstructured multivariate meshes. Commun. Comput. Phys., 18(1):1–36, 2015.
[16] F. Nobile, R. Tempone, and C. G. Webster. An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal., 46(5):2411–2442, 2008.
[17] S. Oladyshkin, H. Class, R. Helmig, and W. Nowak. A concept for data-driven uncertainty quantification and its application to carbon dioxide storage in geological formations. Adv. Water Resour., 34:1508–1518, 2011.
[18] S. Oladyshkin and W. Nowak. Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliab. Eng. Syst. Saf., 106:179–190, 2012.
[19] P. Prempraneerach, F. S. Hover, M. S. Triantafyllou, and G. E. Karniadakis. Uncertainty quantification in simulations of power systems: Multi-element polynomial chaos methods. Reliab. Eng. Syst. Saf., 95:632–646, 2010.
[20] C. Soize and R. Ghanem. Physical systems with random uncertainties: chaos representations with arbitrary probability measure. SIAM J. Sci. Comput., 26:395–410, 2004.
[21] T. Tang and T. Zhou. Convergence analysis for stochastic collocation methods to scalar hyperbolic equations with a random wave speed. Commun. Comput. Phys., 8:226–248, 2010.
[22] T. Tang and T. Zhou. Galerkin methods for stochastic hyperbolic problems using bi-orthogonal polynomials. J. Sci. Comput., 51:274–292, 2012.
[23] T. Tang and T. Zhou. Recent developments in high order numerical methods for uncertainty quantification. Sci. Sin. Math., 45:891–928, 2015.
[24] X. Wan and G. E. Karniadakis. Multi-element generalized polynomial chaos for arbitrary probability measures. SIAM J. Sci. Comput., 28(3):901–928, 2006.
[25] N. Wiener. The homogeneous chaos. Amer. J. Math., 60(4):897–936, 1938.
[26] J. A. S. Witteveen and H. Bijl. Modeling arbitrary uncertainties using Gram–Schmidt polynomial chaos. AIAA Paper 2006-896, 2006.
[27] J. A. S. Witteveen, S. Sarkar, and H. Bijl. Modeling physical uncertainties in dynamic stall induced fluid–structure interaction of turbine blades using arbitrary polynomial chaos. Comput. Struct., 85:866–878, 2007.
[28] D. Xiu and J. S. Hesthaven. High-order collocation methods for differential equations with random inputs. SIAM J. Sci. Comput., 27(3):1118–1139, 2005.
[29] D. Xiu and G. E. Karniadakis. The Wiener–Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput., 24:619–644, 2002.
[30] M. D. Zheng, X. Wan, and G. E. Karniadakis. Adaptive multi-element polynomial chaos with discrete measure: Algorithms and application to SPDEs. Appl. Numer. Math., 90:91–110, 2015.
[31] T. Zhou, A. Narayan, and D. Xiu. Weighted discrete least-squares polynomial approximation using randomized quadratures. J. Comput. Phys., 298:787–800, 2015.
[32] T. Zhou, A. Narayan, and Z. Xu. Multivariate discrete least-squares approximations with a new type of collocation grid. SIAM J. Sci. Comput., 36(5):A2401–A2422, 2014.