Quadrature Strategies for Constructing Polynomial Approximations
Pranay Seshadri, Gianluca Iaccarino, Tiziano Ghisu
Abstract
Finding suitable points for multivariate polynomial interpolation and approximation is a challenging task. Yet, despite this challenge, there has been tremendous research dedicated to this singular cause. In this paper, we begin by reviewing classical methods for finding suitable quadrature points for polynomial approximation in both the univariate and multivariate setting. Then, we categorize recent advances into those that propose a new sampling approach and those centered on an optimization strategy. The sampling approaches yield a favorable discretization of the domain, while the optimization methods pick a subset of the discretized samples that minimize certain objectives. While not all strategies follow this two-stage approach, most do. Sampling techniques covered include subsampling quadratures, Christoffel, induced and Monte Carlo methods. Optimization methods discussed range from linear programming ideas and Newton's method to greedy procedures from numerical linear algebra. Our exposition is aided by examples that implement some of the aforementioned strategies.
Pranay Seshadri
Department of Engineering, University of Cambridge, Cambridge, U. K., e-mail: [email protected]

Gianluca Iaccarino
Department of Mechanical Engineering and Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California, U. S. A., e-mail: [email protected]

Tiziano Ghisu
Department of Mechanical, Chemical and Materials Engineering, Università di Cagliari, Cagliari, Sardinia, Italy, e-mail: [email protected]

Over the past few years there has been a renewed research effort aimed at constructing stable polynomial approximations: understanding their theoretical properties [60, 71, 59, 55, 15, 44, 17, 35, 1] and extending their practical use. The latter spans from applications in spectral methods [65], uncertainty quantification and the related sensitivity analysis [51, 57, 30, 39] to dimension reduction [58] and design under uncertainty [54]. One topic germane to the field has been to find suitable points for multivariate polynomials using as few points as possible. The application-centric, high-level idea can be described as follows:

select a polynomial basis
⇓
evaluate model at quadrature points
⇓
estimate polynomial coefficients
⇓
compute moments, sensitivities and probability density functions

Motivated by reducing the number of samples, and by providing scalable computational methods for doing so, numerous deterministic and randomized sampling schemes have been proposed. Before we delve into recent strategies built on perspectives from least squares, a brief overview of some of the more fundamental concepts is in order.

Quadrature rules given by $\{(\boldsymbol{\zeta}_i, \omega_i)\}_{i=1}^{m}$ seek to find $d$-dimensional points $\boldsymbol{\zeta}_i \in \mathbb{R}^d$ and weights $\omega_i \in \mathbb{R}$ such that the integral of a function $f$ may be expressed as

$$\int_{\mathbb{R}^d} f(\boldsymbol{\zeta})\, \rho(\boldsymbol{\zeta})\, d\boldsymbol{\zeta} \approx \sum_{i=1}^{m} f(\boldsymbol{\zeta}_i)\, \omega_i, \quad (1)$$

where $\rho(\boldsymbol{\zeta})$ is a known multivariate weight function (e.g., Gaussian, uniform, Cauchy, or a product thereof). These $d$-dimensional points are sampled over the support of $\boldsymbol{\zeta} = (\zeta^{(1)}, \ldots, \zeta^{(d)})$, mutually independent random variables under the joint density $\rho(\boldsymbol{\zeta})$.
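To make (1) concrete in the univariate setting, the sketch below (Python with NumPy; the integrand $f(\zeta) = e^{\zeta}$ and the five-point rule are illustrative choices of ours, not fixed by the text) approximates a weighted integral under the uniform density $\rho = 1/2$ on $[-1, 1]$.

```python
import numpy as np

# Quadrature rule of the form (1) for d = 1, with the uniform density
# rho(zeta) = 1/2 on [-1, 1].
m = 5
points, raw_weights = np.polynomial.legendre.leggauss(m)
weights = 0.5 * raw_weights          # fold rho into the weights; they now sum to one

# f is a stand-in integrand; its exact weighted integral is sinh(1).
approx = np.sum(np.exp(points) * weights)
exact = np.sinh(1.0)

print(approx, exact)                 # the two agree to many digits for m = 5
```

Even five points suffice here because the rule integrates all polynomials up to degree nine exactly, and the exponential is well approximated by such polynomials on $[-1, 1]$.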
By construction, this quadrature rule must also satisfy

$$\sum_{i=1}^{m} \psi_{p}(\boldsymbol{\zeta}_i)\, \psi_{q}(\boldsymbol{\zeta}_i)\, \omega_i \approx \int_{\mathbb{R}^d} \psi_{p}(\boldsymbol{\zeta})\, \psi_{q}(\boldsymbol{\zeta})\, \rho(\boldsymbol{\zeta})\, d\boldsymbol{\zeta} = \delta_{pq}, \quad (2)$$

where $\psi_{p}(\boldsymbol{\zeta})$ is a multivariate polynomial, $L^2$-orthogonal on $\mathbb{R}^d$ when weighted by the joint density $\rho(\boldsymbol{\zeta})$. Here $\delta_{pq}$ denotes the Kronecker delta; subscripts $p$ and $q$ are multi-indices that denote the order of $\psi$ and its composite univariate polynomials $\psi_j$ via

$$\psi_{p}(\boldsymbol{\zeta}) = \prod_{k=1}^{d} \psi^{(k)}_{p_k}\!\left(\zeta^{(k)}\right), \quad \text{where} \quad p = (p_1, \ldots, p_d) \in \mathbb{N}^d. \quad (3)$$

The finite set of indices in $p$ is said to belong to a multi-index set $I$. The number of multi-indices present in $p, q \in I$ is set by choosing elements in $I$ to follow certain rules, e.g., the sum of all univariate polynomial orders must satisfy $\sum_{i=1}^{d} p_i \leq k$, yielding a total order index set [70]. A tensor order index set, which scales exponentially in $d$, is governed by the rule $\max_k p_k \leq k$. Other well-known multi-index sets include hyperbolic cross [71] and hyperbolic index sets [4]. We shall denote the number of elements in each index set by $n = \mathrm{card}(I)$.

From (1) and (2) it should be clear that one would ideally like to minimize $m$ and yet achieve equality in the two expressions. Furthermore, the degree of exactness associated with evaluating the integral of the function in (1) will depend on the highest order polynomial (in each of the $d$ dimensions) that yields equality in (2). Prior to elaborating further on multivariate quadrature rules, it will be useful to detail key ideas that underpin univariate quadrature rules. It is important to note that there is an intimate relationship between polynomials and quadrature rules; much of the development of classical quadrature techniques is due to the foundational work of Gauss, Christoffel and Jacobi.
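As a small illustration of these index sets, the following sketch (Python; the function names are our own) enumerates a total order and a tensor order multi-index set and compares their cardinalities, showing the exponential growth of the latter in $d$.

```python
import itertools

def total_order_indices(d, k):
    # Multi-indices p in N^d with p_1 + ... + p_d <= k (total order index set).
    return [p for p in itertools.product(range(k + 1), repeat=d) if sum(p) <= k]

def tensor_order_indices(d, k):
    # Multi-indices with max_i p_i <= k; cardinality (k + 1)^d, exponential in d.
    return list(itertools.product(range(k + 1), repeat=d))

d, k = 3, 4
n_total = len(total_order_indices(d, k))    # binomial(k + d, d) = 35
n_tensor = len(tensor_order_indices(d, k))  # (k + 1)**d = 125
print(n_total, n_tensor)
```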
In 1814 Gauss originally developed his quadrature formulas, leveraging his theory of continued fractions in conjunction with hypergeometric series, using what are today known as Legendre polynomials. Jacobi's contribution was the connection to orthogonality, while Christoffel is credited with the generalization of quadrature rules to non-negative, integrable weight functions $\rho(\zeta) \geq 0$. Gauss-Christoffel quadrature broadly refers to all rules of the form (1) that have a degree of exactness of $(2m-1)$ for $m$ points when $d = 1$. The points and weights of Gauss rules may be computed either from the moments of a weight function or via the three-term recurrence formula associated with the orthogonal polynomials. In most codes today the latter approach, i.e., that of Golub and Welsch [32], is adopted. It involves computing the eigenvalues of a tridiagonal matrix (known colloquially as the Jacobi matrix) of the recurrence coefficients associated with a given orthogonal polynomial, incurring a complexity of $O(m^2)$. For uniform weight functions $\rho(\zeta)$ these recurrence coefficients are associated with Legendre polynomials, and the resulting quadrature rule is termed Gauss-Legendre. When using Hermite polynomials, orthogonal with respect to the Gaussian distribution, the resulting quadrature is known as Gauss-Hermite. In applications where the weight functions are arbitrary, or data-driven, the discretized Stieltjes procedure (see section 5 in [25]) may be used for computing the recurrence coefficients; in the data-driven case, kernel density estimation or even a maximum likelihood estimation may be required to obtain a probability distribution that can be used by the discretized Stieltjes procedure.

Degree of exactness: The notion of degree of exactness dates back to Radau [24]; all univariate quadrature rules are associated with a degree of exactness, referring to the highest degree polynomial that can be integrated exactly for a given number of points $m$. For instance, Gauss-Legendre quadrature has a degree of exactness of $(2m-1)$. Gauss-Lobatto quadrature rules, on the other hand, have a degree of exactness of $(2m-3)$, while Clenshaw-Curtis rules (see [27]) have a degree of exactness of $(m-1)$, although in practice accuracy comparable to Gauss quadrature rules can be obtained (see Trefethen's monograph [66] and Davis and Rabinowitz [20]). One way to interpret this degree of exactness is to inspect the elements of a Gram matrix $G = A^T A$, where
$A$ is formed by evaluating the orthogonal polynomials at the quadrature points
$$A(i, j) = \psi_j(\zeta_i)\sqrt{\omega_i}, \quad \text{where} \quad A \in \mathbb{R}^{m \times n}, \quad (4)$$

with $m$ quadrature points and the first $n$ polynomials. Thus, each element of the Gram matrix seeks to approximate

$$G(p, q) = \int \psi_p(\zeta)\, \psi_q(\zeta)\, \rho(\zeta)\, d\zeta \approx \sum_{i=1}^{m} \psi_p(\zeta_i)\, \psi_q(\zeta_i)\, \omega_i = \delta_{pq}. \quad (5)$$

To clarify this, consider the example case of a Gauss-Legendre rule with $m = 5$. The highest polynomial degree that this rule can integrate exactly is 9, implying that the first 5-by-5 submatrix of $G$ will be the identity matrix, as the combined polynomial degree of the terms inside the integral in (5) is at most 8. This is illustrated in Figure 1(a). In (b) and (c) similar results are shown for Gauss-Lobatto (the highest degree being 7) and Clenshaw-Curtis (which integrates higher than degree 5). For all the aforementioned subfigures, element-wise deviations from the identity can be interpreted as the internal aliasing errors associated with each quadrature rule.
Fig. 1 The Gram matrix $G$ for three quadrature rules showing their practical degrees of exactness: (a) Gauss-Legendre; (b) Gauss-Lobatto; (c) Clenshaw-Curtis. The vertical axis in the figures shows the polynomial degree and the colorbar shows the values of the entries of these matrices.
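Both the Golub-Welsch procedure and this Gram-matrix diagnostic can be sketched in a few lines (Python with NumPy; the normalisations assume the uniform density $\rho = 1/2$ on $[-1, 1]$, and the helper names are ours):

```python
import numpy as np

def gauss_legendre(m):
    # Golub-Welsch: eigenvalues of the tridiagonal Jacobi matrix built from the
    # Legendre three-term recurrence give the nodes; the squared first components
    # of the eigenvectors give the weights (mu_0 = 1 for the uniform density).
    k = np.arange(1, m)
    beta = k**2 / (4.0 * k**2 - 1.0)
    J = np.diag(np.sqrt(beta), 1) + np.diag(np.sqrt(beta), -1)
    nodes, vecs = np.linalg.eigh(J)
    return nodes, vecs[0, :]**2

def psi(j, z):
    # Legendre polynomial of degree j, orthonormal under rho = 1/2 on [-1, 1].
    c = np.zeros(j + 1); c[j] = 1.0
    return np.sqrt(2 * j + 1) * np.polynomial.legendre.legval(z, c)

m, n = 5, 6                          # 5 points; polynomials of degree 0, ..., 5
z, w = gauss_legendre(m)
A = np.array([[psi(j, zi) * np.sqrt(wi) for j in range(n)]
              for zi, wi in zip(z, w)])        # A(i, j) as in (4)
G = A.T @ A

# Degree of exactness 2m - 1 = 9: the leading 5-by-5 block of G is the identity.
print(np.max(np.abs(G[:5, :5] - np.eye(5))))   # ~ machine precision
# The (5, 5) entry aliases: psi_5 is proportional to P_5, which vanishes at the
# m = 5 Gauss nodes, so the rule returns 0 instead of 1 for its squared integral.
print(G[5, 5])
```

The last entry is a concrete instance of the internal aliasing error visible in Figure 1(a).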
Classical extensions of Gauss quadrature include:

• Gauss-Radau and Gauss-Lobatto: the addition of end-points $(a, b)$ in quadrature rules; one end-point in Gauss-Radau, i.e., $(a, b] = \{\zeta \mid a < \zeta \leq b\}$, and both in Gauss-Lobatto, i.e., $[a, b] = \{\zeta \mid a \leq \zeta \leq b\}$ [24].
• Gauss-Kronrod and Gauss-Patterson: practical extensions of $m$-point Gauss rules by adaptively adding $m+1$ points, yielding a combined degree of exactness of $(3m+1)$ or $(3m+2)$ depending on whether $m$ is even or odd, respectively; based on the work of Kronrod [42]. Patterson [50] rules are based on optimizing the degree of exactness of Kronrod points, with extensions to Lobatto rules. The intent behind these rules is that they admit certain favorable nesting properties.
• Gauss-Turán: the addition of derivative values (in conjunction with function values) within the integrand (see [26]).
Multivariate extensions of univariate quadrature rules exist in the form of tensor product grids, assuming one is ready to evaluate a function at $m^d$ points. They are expressed as

$$\int_{\mathbb{R}^d} f(\boldsymbol{\zeta})\, \rho(\boldsymbol{\zeta})\, d\boldsymbol{\zeta} \approx \left( Q^{m_1}_{q} \otimes \cdots \otimes Q^{m_d}_{q} \right)(f) = \sum_{j_1=1}^{m_1} \cdots \sum_{j_d=1}^{m_d} f\!\left( \zeta_{j_1}, \ldots, \zeta_{j_d} \right) \omega_{j_1} \cdots \omega_{j_d} = \sum_{j=1}^{m} f\!\left( \boldsymbol{\zeta}_j \right) \omega_j. \quad (6)$$

The notation $Q^{m_i}_{q}$ denotes a linear operation of the univariate quadrature rule applied to $f$ along direction $i$; the subscript $q$ stands for quadrature.

In the broader context of approximation, one is often interested in the projection of $f$ onto $\psi_p(\boldsymbol{\zeta})$. This pseudospectral approximation is given by

$$f(\boldsymbol{\zeta}) \approx \sum_{i=1}^{n} x_i\, \psi_i(\boldsymbol{\zeta}), \quad \text{where} \quad x_i = \sum_{j=1}^{m} f\!\left(\boldsymbol{\zeta}_j\right) \psi_i\!\left(\boldsymbol{\zeta}_j\right) \omega_j. \quad (7)$$

Inspecting the decay of these pseudospectral coefficients, i.e., $x = (x_1, \ldots, x_n)^T$, is useful for analyzing the quality of the overall approximation to $f(\boldsymbol{\zeta})$, but more specifically for gaining insight into which directions the function varies the greatest, and to what extent. We remark here that for simulation-based problems where an approximation to $f(\boldsymbol{\zeta})$ may be required, it may be unfeasible to evaluate a model at a tensor grid of quadrature points, particularly if $d$ is large. Sparse grids [28, 62] offer moderate computational attenuation to this problem. One can think of a sparse grid as a linear combination of select anisotropic tensor grids

$$\int_{\mathbb{R}^d} f(\boldsymbol{\zeta})\, \rho(\boldsymbol{\zeta})\, d\boldsymbol{\zeta} \approx \sum_{r \in K} \alpha(r) \left( Q^{m_1}_{q} \otimes \cdots \otimes Q^{m_d}_{q} \right)(f), \quad (8)$$

where, for a given level $l$ (a variable that controls the density of points and the highest order of the polynomial in each direction), the multi-index set $K$ and coefficients $\alpha(r)$ are given by

$$K = \left\{ r \in \mathbb{N}^d : l + 1 \leq |r| \leq l + d \right\} \quad \text{and} \quad \alpha(r) = (-1)^{l + d - |r|} \binom{d - 1}{l + d - |r|}. \quad (9)$$

Formulas for the pseudospectral approximations via sparse grid integration can then be written down; we omit these in this exposition for brevity.

Approximation via sparse grids:
To motivate the use of sparse grids, and to realize their limitations, consider the bi-variate function

$$f(\boldsymbol{\zeta}) = \exp(\zeta_1 + \zeta_2), \quad \text{where} \quad \boldsymbol{\zeta} \in [-1, 1]^2, \quad (10)$$

and $\rho(\boldsymbol{\zeta})$ is the uniform distribution over the domain. Figure 2(a) plots the points in the parameter space for an isotropic tensor product grid using tensor products of order 35 Gauss-Legendre quadrature points, while (b) plots estimates of the pseudospectral coefficients using a tensor product Legendre polynomial basis in $\psi_p$. A total of 1296 function evaluations were required for these results.

Upon inspecting (b) it is apparent that coefficients with a total order of 12 and greater can be set to zero with near negligible change in the function approximation. In Figure 2(c, d) and (e, f) we present two solutions that achieve this. Figure 2(c, d) plots the results for a sparse grid with a linear growth rule (essentially emulating a total order index set) with $l = 12$, requiring 1015 function evaluations, while Figure 2(e, f) shows the results for an exponential growth rule with $l = 5$, requiring 667 function evaluations. Clearly, using sparse grid quadrature rules can reduce the number of model evaluations compared to tensor product grids.

While sparse grids and their adaptive variants [19, 52] are more computationally tractable than tensor product grids, they are still restricted to very specific index sets, even with linear, exponential and slow exponential [11] growth rules. Furthermore, when simulation-based function evaluations at the quadrature points fail (or are corrupted), one may have to resort to interpolation heuristics. In comparison, least squares methods offer far more flexibility, both in terms of a choice of the basis and in negotiating failed simulations.

Fig. 2 Parameter space points and pseudospectral coefficients (shown on a log base 10 scale) obtained via a (a, b) tensor grid with a maximum univariate order of 35; (c, d) sparse grid with a linear growth rule and $l = 12$; (e, f) sparse grid with an exponential growth rule and $l = 5$.

Our specific goal in this paper is to use ideas from polynomial least squares to generate quadrature rules. Without loss of generality, these quadrature rules will be used to estimate the pseudospectral coefficients $x$ by solving

$$\underset{x \in \mathbb{R}^n}{\text{minimize}} \;\; \| A x - b \|_2, \quad (11)$$

where we define the elements of $A \in \mathbb{R}^{m \times n}$ as per (4). For completeness, we restate this definition, but now in the multivariate setting, i.e.,

$$A(i, j) = \psi_{j}(\boldsymbol{\zeta}_i)\sqrt{\omega_i}, \quad j \in I, \quad \mathrm{card}(I) = n, \quad i = 1, \ldots, m. \quad (12)$$

Typically $I$ will be either a total order or a hyperbolic basis. The entries of the vector $b \in \mathbb{R}^m$ comprise weighted model evaluations at the $m$ quadrature points; individual entries are given by $b(i) = \sqrt{\omega_i} f(\boldsymbol{\zeta}_i)$. Assuming $A$ and $b$ are known, solving (11) is trivial; the standard QR factorization can be used. However, the true challenge lies in selecting multivariate quadrature points and weights $\{(\boldsymbol{\zeta}_i, \omega_i)\}_{i=1}^{m}$ such that:

• for a given choice of $n$, the number of quadrature points $m$ can be minimized and yield a rule that has a high degree of exactness;
• the least squares approximation in (11) is accurate;
• the least squares solution is stable with respect to perturbations in $b$.

We will explore different strategies for generating $A$ and techniques for subsampling it, even introducing a new approach in section 3.2. Broadly speaking, strategies for computing multivariate quadrature points and weights via least squares involve two key decisions:

1.
Selecting a sampling strategy: a suitable discretization of the domain from which quadrature points need to be computed;
2.
Formulating an optimization problem: the strategy for subselecting points from this sampling (if required) via the optimization of a suitable objective.

Our goal in this manuscript is to describe the various techniques for generating these quadrature points and to make precise statements (where permissible) on their computational complexity. We substantiate our detailed review of the literature with examples using the open-source code Effective Quadratures [56]. The codes to replicate the figures in this paper can be found online.
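A minimal sketch of the least squares pipeline in (11)-(12), written in plain Python with NumPy rather than Effective Quadratures (the Monte Carlo sampling, basis order and sample count are illustrative choices of ours; the test function is taken from (10)):

```python
import numpy as np

rng = np.random.default_rng(0)

def psi(j, z):
    # Legendre polynomial of degree j, orthonormal under the uniform density.
    c = np.zeros(j + 1); c[j] = 1.0
    return np.sqrt(2 * j + 1) * np.polynomial.legendre.legval(z, c)

# Total order index set of degree 6 in d = 2 (28 basis terms), Monte Carlo points.
k = 6
index_set = [(p, q) for p in range(k + 1) for q in range(k + 1) if p + q <= k]
n, m = len(index_set), 500

Z = rng.uniform(-1.0, 1.0, size=(m, 2))
w = np.full(m, 1.0 / m)                        # equal Monte Carlo weights

A = np.column_stack([psi(p, Z[:, 0]) * psi(q, Z[:, 1]) for p, q in index_set])
A *= np.sqrt(w)[:, None]                       # A(i, j) = psi_j(zeta_i) sqrt(w_i), as in (12)
b = np.sqrt(w) * np.exp(Z[:, 0] + Z[:, 1])     # b(i) = sqrt(w_i) f(zeta_i), f from (10)

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x[0])    # coefficient of the constant term: close to E[f] = sinh(1)^2
```

Because $\psi_{(0,0)} = 1$, the first coefficient estimates the mean of $f$ under $\rho$, which for this function is $\sinh^2(1) \approx 1.381$.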
In this section we present methods for discretizing the domain based on the support of the parameters and their distributions. The survey builds on a recent review of sampling techniques for polynomial least squares [34] and on more recent texts including Narayan [48, 49] and Jakeman and Narayan [37].

Intuitively, the simplest sampling strategy for polynomial least squares is to generate random Monte-Carlo-type samples based on the joint density $\rho(\boldsymbol{\zeta})$. Migliorati et al. [45] provide a practical sampling heuristic for obtaining well-conditioned matrices $A \in \mathbb{R}^{m \times n}$; they suggest sampling with $m$ on the order of $n^2$ for total degree and hyperbolic cross spaces. Furthermore, they prove that as the number of samples $m \to \infty$, the condition number $\kappa(A^T A) \to 1$.

The Christoffel sampling recipe of Narayan and co-authors [49] significantly outperforms the Monte Carlo approach in most problems. To understand their strategy, it will be useful to define the diagonal of the orthogonal projection kernel

$$K_n(\boldsymbol{\zeta}) := \sum_{j=1}^{n} \psi_{j}^2(\boldsymbol{\zeta}), \quad (13)$$

where, as before, the subscript $n$ denotes $\mathrm{card}(I)$. We associate the above term with a constant

$$\|K\|_{\infty} \triangleq \sup_{\boldsymbol{\zeta} \in \mathbb{R}^d} \left( K_n(\boldsymbol{\zeta}) \right), \quad (14)$$

which has a lower bound of $n$. Cohen et al. [17] prove that if the number of samples $m$ satisfies the bound

$$\frac{m}{\log(m)} \gtrsim \chi \|K\|_{\infty}, \quad (15)$$

for some constant $\chi$, then the subsequent least squares problem is both stable and accurate with high probability. The key ingredient to this bound, the quantity $\|K\|_{\infty}$, tends to be rather small for Chebyshev polynomials, scaling linearly with $n$ in the univariate case, but will be large for other polynomials, such as Legendre polynomials, where it scales quadratically with $n$. Multivariate examples are given in [15] and lead to computationally demanding restrictions on the number of samples required to satisfy (15) [49, 18].

Working with this bound, Narayan et al. provide a sampling strategy that leverages the fact that the asymptotic characteristics for total order polynomials can be determined. Thus, if the domain is bounded with a continuous $\rho$, then the limit of $n / K_n$, known as the Christoffel function, is given by

$$\lim_{n \to \infty} \frac{n}{K_n(\boldsymbol{\zeta})} = \frac{\rho(\boldsymbol{\zeta})}{\nu(\boldsymbol{\zeta})}, \quad (16)$$

where the above statement holds weakly, and where $\nu(\boldsymbol{\zeta})$ is the weighted pluripotential equilibrium measure (see section 2.3 in [49] for its definition and significance). This implies that the functions $\phi = \psi \sqrt{n / K_n}$ form an orthogonal basis with respect to a modified weight function $\rho K_n / n$. This in turn yields a modified reproducing diagonal kernel

$$\hat{K}_n = \sum_{j=1}^{n} \phi_{j}^2 = \frac{n}{K_n} K_n = n, \quad (17)$$

which attains the optimal value of $n$. The essence of the sampling strategy is to sample from $\nu$ (performing a Monte Carlo on the basis $\phi$), which should in theory reduce the sample count as dictated by (15).

The challenge, however, is developing computational algorithms for generating samples according to $\nu(\boldsymbol{\zeta})$, since precise analytical forms are only known for a few domains. For instance, when $D = [-1, 1]^d$, the measure $\nu(\boldsymbol{\zeta})$ is the Chebyshev (arcsine) distribution. Formulations for other distributions, such as the Gaussian and exponential distributions, are in general more complex and can be found in section 6 of [49].

Comparing condition numbers:
But how much lower are the condition numbers when we compare standard Monte Carlo with Christoffel sampling? Figure 3 plots the mean condition numbers (averaged over 10 trials) for Legendre polynomials for different dimensions $d$ and sampling ratios $m/n$. For the Christoffel results, the samples are generated from the Chebyshev distribution.

It is apparent from these figures that the Christoffel sampling strategy does produce more well-conditioned matrices on average. We make two additional remarks here. The first concerns the choice of the weights. In both sampling strategies the quadrature weights $\omega_i$ are computed via

$$\omega_i = \frac{\tilde{\omega}_i}{\sum_{k=1}^{m} \tilde{\omega}_k}, \quad \text{where} \quad \tilde{\omega}_i = \frac{n}{m \sum_{j \in I} \psi_{j}^2(\boldsymbol{\zeta}_i)}, \quad (18)$$

ensuring that the weights sum up to unity. The second concerns the Gram matrix $A^T A$, shown in Figure 4 for the Christoffel case with a maximum order of 3 and $d = 4$.
Fig. 3 A comparison of condition numbers for $A$ when constructing the matrix using multivariate Legendre polynomials evaluated at either Monte Carlo or Christoffel samples, for different dimensions $d$ and sampling ratios $m/n$. Plotted are the mean values of 10 random trials.

The idea of constructing
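A univariate sketch of this comparison (the text's experiments are multivariate; the helper names, sizes and seed here are our own choices) draws Monte Carlo samples from $\rho$ and Christoffel samples from the arcsine distribution, weights both according to (18), and averages the resulting condition numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
n_basis, m, trials = 20, 40, 10           # n polynomials, m = 2n samples per trial

def legendre_matrix(Z):
    # Columns: Legendre polynomials, orthonormal under the uniform density.
    return np.column_stack([np.sqrt(2 * j + 1) *
                            np.polynomial.legendre.legval(Z, np.eye(n_basis)[j])
                            for j in range(n_basis)])

def condition_number(Z):
    P = legendre_matrix(Z)
    w_tilde = n_basis / (m * np.sum(P**2, axis=1))   # weights as in (18)
    w = w_tilde / w_tilde.sum()
    return np.linalg.cond(P * np.sqrt(w)[:, None])   # kappa of A from (12)

mc = np.mean([condition_number(rng.uniform(-1, 1, m)) for _ in range(trials)])
ch = np.mean([condition_number(np.cos(np.pi * rng.uniform(0, 1, m)))
              for _ in range(trials)])
print(mc, ch)    # Christoffel (arcsine) sampling is markedly better conditioned
```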
$A$ using samples from a tensor grid and then subsampling its rows has been explored from both compressed sensing [64] (by solving the basis pursuit and de-noising problem) and least squares [71, 55] perspectives. In Zhou et al. [71] the authors randomly subsample the rows and demonstrate stability in the least squares problem with $m$ scaling linearly with $n$. In Seshadri et al. [55], a greedy optimization strategy is used to determine which subsamples to select; they report small condition numbers even when $n = m$ for problems where $d \leq 15$. Details of their optimization strategy are presented in section 3. One drawback of their technique is the requirement that the full $A$ matrix must be stored and passed to the optimizer, a requirement that can be circumvented in the randomized approach, albeit at the cost of $m$ being (typically) greater than $n$. A representative comparison of the condition numbers is shown in Figure 5. In these results, the isotropic tensor grid from which the subsamples are computed corresponds to the highest total order of the polynomial basis (shown on the horizontal axis).

Fig. 4 The Gram matrix $G = A^T A$ for one of the trials for the case where $d = 4$ and $m = 2.0n$, showing its deviation from the identity.

Building on some of the ideas discussed previously in 2.1, one can detail an importance sampling strategy that yields stable least squares estimates for $m \gtrsim n$, up to log factors. The essence of the idea is to define a new distribution $\mu$ such that

$$\mu = \frac{K_n}{n} \rho = \frac{1}{n} \left( \sum_{j=1}^{n} \psi_{j}^2(\boldsymbol{\zeta}) \right) \rho. \quad (19)$$

It should be noted that while $\rho$ is a product measure and therefore easy to sample from, $\mu$ is not, and thus requires either techniques based on Markov chain Monte Carlo (MCMC) or conditional sampling [18]. More specifically, $\mu$ depends on $n$, so if one enriches the space with more samples or changes the basis, then $\mu$ will change. In [35], the authors devise an MCMC strategy which seeks to find $\mu$, thereby explicitly minimizing their coherence parameter (a weighted form of $\|K\|_{\infty}$). In [48], formulations for computing $\mu$ via induced distributions are detailed. In our numerical experiments investigating the condition numbers of matrices obtained via such induced samples (in a similar vein to Figure 3), the condition numbers were found to be comparable to those from Christoffel sampling.

Fig. 5 A comparison of condition numbers for $A$ when constructing the matrix using multivariate Legendre polynomials evaluated at either randomized or effectively subsampled quadrature points. For the randomized technique, we plot the mean values of 10 random trials.

In the case of Christoffel, Monte Carlo and even subsampling-based techniques, a reduction in $m$ can be achieved, facilitating near quadrature-like degree of exactness properties. In this section we discuss various optimization techniques for achieving this. It will be convenient to interpret our objective as identifying a suitable submatrix of $A \in \mathbb{R}^{m \times n}$ by only selecting $k \ll m$ rows. Ideally, we would like $k$ to be as close to $n$ as possible. Formally, we write this submatrix as

$$\hat{A}_z = \tau \sum_{i=1}^{m} z_i\, e_i\, a_i^T, \quad \text{where} \quad \mathbf{1}^T z = k, \quad \tau = \left( \sum_{i=1}^{m} z_i \omega_i \right)^{-1}, \quad z_i \in \{0, 1\}, \quad (20)$$

where $a_i^T$ represents the rows of $A$, $e_i$ denotes a canonical unit vector, and $\mathbf{1} \in \mathbb{R}^m$ is a vector of ones. The notation above indicates that $z \in \mathbb{R}^m$ is a boolean vector. The normalization factor $\tau$ is introduced to ensure the weights associated with the subselected quadrature rule sum up to one.

Table 1 highlights some of the various optimization objectives that may be pursued for finding $z$ such that the resulting $\hat{A}_z$ yields accurate and stable least squares solutions.

Table 1
Sample objectives for optimization

Case | Objective
P1 | $\underset{z}{\text{minimize}} \;\; \sigma_{\max}\left( \hat{A}_z \right)$
P2 | $\underset{z}{\text{maximize}} \;\; \sigma_{\min}\left( \hat{A}_z \right)$
P3 | $\underset{z}{\text{minimize}} \;\; \kappa\left( \hat{A}_z \right)$
P4 | $\underset{z}{\text{maximize}} \;\; \mathrm{vol}\left( \hat{A}_z \right) = \prod_{i=1}^{k} \sigma_i\left( \hat{A}_z \right)$
P5 | $\underset{z}{\text{minimize}} \;\; \left\| \hat{A}_z^T \hat{A}_z - I \right\|$

We remark here that objectives P1 to P4 are proven NP-hard problems (see Theorem 4 in Civril and Magdon-Ismail [16]); moreover, it is readily apparent that all the objectives require evaluating $\binom{m}{k}$ possible choices, a computationally unwieldy task for large values of $m$ and $k$. Thus some regularization or relaxation is necessary for tractable optimization strategies.

We begin by focusing on P4. In the case where $k = n$, one can express the volume maximization objective as a determinant maximization problem. Points that in theory maximize this determinant, i.e., the Fekete points, yield a Lebesgue constant that typically grows logarithmically (at most linearly) in $n$ [6]. So how do we find a determinant-maximizing submatrix of $A$ by selecting $k$ of its rows? In Guo et al. [33] the authors show that if there exists a point stencil that indeed maximizes the determinant, then a greedy optimization can recover the global optimum (see Theorem 3.2 in Guo et al.). Furthermore, the authors prove that either maximizing the determinant or minimizing the condition number (P3) is likely to yield equivalent solutions. This explains their use of the pivoted QR factorization for finding such points; its factorization is given by

$$A^T P = Q \begin{pmatrix} R_1 & R_2 \end{pmatrix}, \quad (21)$$

where $Q \in \mathbb{R}^{n \times n}$ is an orthogonal matrix, $R_1 \in \mathbb{R}^{n \times n}$ is an upper triangular matrix that is invertible, and $R_2 \in \mathbb{R}^{n \times (m - n)}$. The permutation matrix $P \in \mathbb{R}^{m \times m}$ is based on a pivoting strategy that seeks to either maximize $\sigma_{\min}(R_1)$ or minimize $\sigma_{\max}(R_2)$. As Björck notes (see section 2.4.3 in [3]), both these strategies are in some cases equivalent; they greedily maximize the diagonal entries of $R_1$ and therefore serve as heuristics for achieving all the objectives in Table 1. Comprehensive analyses of pivoted QR factorizations can be found in [14, 13, 21]. Techniques for the closely related subset selection problem can also be adopted and build on ideas based on random matrix theory (see Deshpande and Rademacher [22] and references therein). The monograph by Miller [46], which also provides a thorough survey of techniques, emphasizes the advantages of methods based on QR factorizations.

Optimizing to find Gauss points:
In the univariate case, it is known that Gauss-Legendre quadrature points are optimal with respect to a uniform measure. In an effort to gauge the efficacy of some of the optimization strategies discussed in this paper, we carry out a simple numerical experiment. Let $A \in \mathbb{R}^{101 \times K}$ be formed by evaluating the first $K$ Legendre polynomials at a 101-point Gauss-Legendre quadrature rule. Each of the aforementioned greedy strategies is tasked with finding a suitable submatrix $\hat{A}_z \in \mathbb{R}^{K \times K}$. The quadrature points embedded in $z$ for two values of $K$ are shown in Figure 6. But what happens as $d$ increases? Figure 7(a) plots the subsampled points obtained from the three linear algebra optimization strategies when subsampling from a tensor grid with order 50 in both the $\zeta_1$ and $\zeta_2$ directions. The basis used (i.e., the columns of $A$) is an isotropic tensor product with an order of 3. As expected, the obtained points are very similar to Gauss-Legendre tensor product points with order 3. In Figure 7(b), instead of subsampling from a higher order tensor grid, we subsample from Christoffel samples, which in the uniform case correspond to the Chebyshev distribution; the basis is the same as in (a). In this case it is interesting to note that the LU and QR pivoting approaches, which are similar, yield slightly different results when compared to the SVD approach.

The use of such rank revealing factorizations for minimizing the condition number of a Vandermonde-type matrix has been previously investigated in [7, 55], albeit on different meshes. When pruning rows from $A$, the condition number of $\hat{A}_z$ can be bounded by
Fig. 6 Approximation to the Gauss-Legendre quadrature points obtained by subsampling from a grid of order 100 using various optimization strategies with an (a) order 4; (b) order 8 Legendre polynomial basis.

Fig. 7
Approximation to the tensor-product Gauss-Legendre quadrature points obtained by subsampling from a grid with 2601 samples using various optimization strategies: (a) subsampling from a tensor grid of order [50, 50]; (b) subsampling from 2601 randomly distributed Chebyshev points.

$$\kappa\left( \hat{A}_z \right) \leq \kappa(A) \sqrt{1 + s^2 n (m - n)}, \quad (22)$$

where $s \geq 1$ is a constant associated with the pivoting strategy. As $m$ and $n$ get progressively larger, the costs associated with computing both the SVD, $O(mn^2)$, and a Householder (or modified Gram-Schmidt) based rank revealing QR factorization should be monitored. Other rank revealing techniques, such as strong LU factorizations [47] and randomized QR with column pivoting [43, 23], may also be opted for.

To speed up the optimization process (a consideration that may be ignored when model evaluations themselves take hours), other techniques have been proposed. Shin and Xiu [59, 60] optimize a metric similar to P4,

$$\underset{z}{\text{maximize}} \;\; \frac{\sqrt{\det\left( \hat{A}_z^T \hat{A}_z \right)}}{\left( \prod_{i=1}^{k} \left\| \hat{A}_z^{(i)} \right\| \right)^{1/k}}, \quad (23)$$

where $\hat{A}_z^{(i)}$ denotes the $i$-th column of $\hat{A}_z$. To ascertain which rows $z$ to use, the authors outline a greedy strategy that begins with a few initially selected rows and then adds rows based on whether they increase the determinant of the Gramian $\hat{A}_z^T \hat{A}_z$. To abate the computational cost of computing the determinant for each candidate, they devise a rank-2 update to the determinant (of the Gramian) based on Sylvester's determinant formula and the Sherman-Morrison formula. In a similar vein, Ghili and Iaccarino [29] lay out a series of arguments, motivated by reducing the operational complexity of the determinant-based optimization, for optimizing the trace of the design matrix instead.
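A sketch of the greedy subselection via a pivoted QR factorization, as in (21), is given below (Python with NumPy/SciPy; the 101-point candidate grid and the small basis mirror the boxed experiment above, but the code itself is ours):

```python
import numpy as np
from scipy.linalg import qr

# Candidate discretisation: a fine 101-point Gauss-Legendre rule. Basis: the
# first 5 Legendre polynomials (orthonormal under the uniform density), so the
# weighted Vandermonde matrix A of (12) has exactly orthonormal columns.
m_fine, n_basis = 101, 5
zc, wc = np.polynomial.legendre.leggauss(m_fine)
wc = 0.5 * wc                                  # weights for the uniform density

A = np.column_stack([np.sqrt(2 * j + 1) *
                     np.polynomial.legendre.legval(zc, np.eye(n_basis)[j])
                     for j in range(n_basis)]) * np.sqrt(wc)[:, None]

# Greedy row selection: the first n column pivots of a QR factorisation of A^T
# identify the k = n rows (quadrature points) to retain.
_, _, piv = qr(A.T, pivoting=True)
rows = piv[:n_basis]
A_hat = A[rows, :]

print(np.sort(zc[rows]))       # subselected points, cf. Figure 6
print(np.linalg.cond(A_hat))   # small condition number for the square submatrix
```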
They observe that since the Frobenius norm is an upper bound on the spectral norm, and because all the singular values contribute to the aliasing error, optimizing over this norm will yield a small spectral norm. The authors formulate a coordinate descent optimization approach to find suitable points and weights.

In general, the aforementioned objectives are non-convex. We discuss an idea that permits these objectives to be recast as a convex optimization problem. Our idea originates from the sensor selection problem [36, 41]: given a collection of m sensor measurements, where each measurement is a vector, select k measurements such that the error covariance of the resulting ellipsoid is minimized. We remark here that while a generalized variant of the sensor selection problem is NP-hard (see [2]), the one we describe below has not yet been proven to be NP-hard. Furthermore, by selecting the optimization variable to be a boolean vector z that restricts the measurements selected, this problem can be cast as a determinant maximization problem [68], where the objective is a concave function in z with binary constraints on its entries. Upon relaxing the binary constraints, this problem can be solved via interior point methods with complexity O(m³). Joshi and Boyd [38] provide a formulation for a relaxed sensor selection problem that can be readily solved by Newton's method, where the binary constraint is substituted with a penalty term added to the objective function. By replacing their sensor measurements with rows from our Vandermonde-type matrix, one arrives at the following maximum volume problem

minimize_{z ∈ R^M}   −log det( Σ_{i=1}^{M} z_i a_i a_i^T ) − λ Σ_{i=1}^{M} ( log(z_i) + log(1 − z_i) )
subject to   1^T z = K,   0 ≤ z_i ≤ 1,   i = 1, …, M,   (24)

where the positive constant λ is used to control the quality of the approximation. Newton's approach has complexity O(m³) per iteration, with the greatest cost arising from the Cholesky factorization when computing the inverse of the Hessian [38].
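The relaxed problem (24) can be prototyped with a generic constrained solver before committing to a tailored Newton implementation. The sketch below is only a stand-in: it uses SciPy's SLSQP solver rather than the Newton-KKT method of [38], and the helper name `relaxed_subsample` and the barrier weight `lam` are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def relaxed_subsample(A, K, lam=1e-2):
    """Relaxed sensor-selection heuristic in the spirit of (24).

    Maximizes log det(sum_i z_i a_i a_i^T) with a log-barrier keeping each
    z_i inside (0, 1), subject to 1^T z = K; the K largest entries of the
    relaxed solution z indicate the rows to keep.
    """
    m, _ = A.shape

    def objective(z):
        G = A.T @ (z[:, None] * A)          # sum_i z_i a_i a_i^T
        sign, logdet = np.linalg.slogdet(G)
        if sign <= 0:                       # should not occur in the interior
            return 1e12
        return -logdet - lam * np.sum(np.log(z) + np.log(1.0 - z))

    z0 = np.full(m, K / m)                  # feasible interior starting point
    res = minimize(objective, z0, method="SLSQP",
                   bounds=[(1e-6, 1.0 - 1e-6)] * m,
                   constraints={"type": "eq", "fun": lambda z: z.sum() - K})
    return np.argsort(res.x)[-K:]

# Pick 7 of 20 candidate points for an order-4 Legendre basis.
pts = np.linspace(-1, 1, 20)
A = np.polynomial.legendre.legvander(pts, 4)
rows = relaxed_subsample(A, 7)
```

A dedicated Newton implementation, as in [38], exploits the structure of the Hessian and scales far better; the generic solver above is only meant to make the formulation concrete.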
The linear constraint is handled using the standard KKT conditions, as detailed on page 525 of [10]. In practice this algorithm requires roughly 10-20 iterations and yields surprisingly good results for finding suitable quadrature points.

Padua points via convex optimization:
It is difficult to select points suitable for interpolation with total order polynomials that are unisolvent, a condition where the points guarantee a unique interpolant [67]. One such group of points that does admit unisolvency are the famous Padua points. For a positive integer N, these points are given by (ζ_m^(1), ζ_m^(2)) ∈ [−1, 1]², where

ζ_m^(1) = cos( (m − 1)π / N ),
ζ_m^(2) = cos( (2k − 2)π / (N + 1) )  for m odd,   cos( (2k − 1)π / (N + 1) )  for m even,   (25)

and 1 ≤ m ≤ N + 1, 1 ≤ k ≤ ⌊N/2⌋ + 1. These points exhibit minimal growth, O((log N)²), of the Lebesgue constant, far lower than that of the Morrow-Patterson or the extended Morrow-Patterson points [5]. Two other characterizations may also be used to determine these points: they are formed by the intersection of certain Lissajous curves with the boundary of [−1, 1]², or, alternatively, they are every other point of an (N + 1) × (N + 2) tensor product Chebyshev grid [6, 8].

As an example, consider the case where N = 4, with Gram matrix G = A^T A shown in Figure 8(a). As expected, the quadrature rule can integrate all polynomials except for the one corresponding to the last row, i.e., ψ_(4,0), whose integrand has an order of 8 along the first direction ζ^(1), beyond the degree of exactness of the 5 points along that direction. What is fascinating is that when these points are subsampled, one can get the subsampled Gram matrix Â_z^T Â_z to have the same degree of exactness, as illustrated in Figure 8(b). To obtain this result, the aforementioned convex optimization via Newton's method was applied to A, yielding the Padua points, i.e., every alternate point from the 30-point Chebyshev tensor grid.
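The construction above can be sketched in a few lines. The snippet below uses the every-other-point-of-a-Chebyshev-grid characterization rather than (25) directly; the parity convention chosen here selects one of the (reflection-equivalent) Padua families, and the helper name `padua_points` is illustrative.

```python
import numpy as np

def padua_points(N):
    """One family of Padua points for degree N, via the characterization as
    every other point of the (N+1) x (N+2) tensor product Chebyshev grid
    (grid indices with matching parity). Returns (N+1)(N+2)/2 points."""
    x = np.cos(np.arange(N + 1) * np.pi / N)        # N+1 Chebyshev points
    y = np.cos(np.arange(N + 2) * np.pi / (N + 1))  # N+2 Chebyshev points
    return np.array([(x[j], y[k])
                     for j in range(N + 1)
                     for k in range(N + 2)
                     if (j + k) % 2 == 0])

# Degree 4: 15 points, matching the 15-term total order basis of Figure 8.
P = padua_points(4)

# Unisolvency check: the total degree 4 Chebyshev Vandermonde is square
# and nonsingular at these points.
V = np.polynomial.chebyshev.chebvander2d(P[:, 0], P[:, 1], [4, 4])
M = V[:, [5 * i + j for i in range(5) for j in range(5) if i + j <= 4]]
```

Since M is 15 × 15 and of full rank, any bivariate polynomial of total degree 4 is uniquely determined by its values at these points.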
Fig. 8
The Gram matrix associated with a Chebyshev tensor product grid with 5 × 6 = 30 points and a total order basis with maximum degree of 4 in (a); the subsampled Gram matrix via convex optimization in (b); and a comparison of the samples and subsamples in (c).
Comparing execution time:
In our numerical experiments on some of the aforementioned optimization strategies, we have alluded to the operational complexity. We make an obvious remark here: as both the polynomial degree and the dimension increase, the computing time rises. Nonlinear approaches, such as that of [29] and the Newton approach presented here, have other tunable parameters that can either add to, or reduce, the computational run time.

Figure 9 plots the running times for d = 3 and m = n when subsampling from a grid of Chebyshev points, where the number of points is given by (maximum total order + 1)^d, using the different optimization strategies. These experiments were carried out on a 3.1 GHz i7 MacBook Pro with 16 GB of RAM. For completeness, a summary of the operational complexity of these techniques is provided in Section 3.2. The condition numbers of the matrices obtained from the QR, LU and SVD approaches were generally lower than those obtained from the other two techniques, though of a similar order of magnitude.
Fig. 9
Representative computing times for the different optimization strategies for m = n and d = 3

Table 2
Summary of the operational complexity of some of the optimization strategies.
Optimization strategy                                      Complexity
SVD-based subset selection                                 O(nm) + O(nm)
QR column pivoting                                         O(nm)
LU row pivoting                                            O(m)
Newton's method                                            O(n)
Frobenius norm optimization (Ghili and Iaccarino 2017)     O(dnm)

To motivate this section, consider the integration property of orthogonal polynomials, highlighted in (5). In the univariate case, one can write
P w = e,  where  P(i, j) = ψ_i(ζ_j),  w(i) = ω_i,  and  e^T = [1, 0, …, 0].   (26)

Assuming P ∈ R^{m×m}, where one uses the first m Gauss-Legendre points for the ζ_j and the Legendre polynomials up to order (m − 1) for the ψ_i, the solution w of the above linear system yields the first m Gauss-Legendre weights. In the case where the points are not from a known quadrature rule, i.e., if they are drawn from a random distribution, then one has to ensure that the weights are positive via a constraint.
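The univariate statement around (26) can be verified in a few lines: with m Gauss-Legendre points, a Legendre basis orthonormal with respect to the uniform density ρ = 1/2 on [−1, 1], and e = [1, 0, …, 0]^T, solving Pw = e recovers the Gauss-Legendre weights (up to the normalization by 2).

```python
import numpy as np

m = 5
# Gauss-Legendre points and weights for the weight function 1 on [-1, 1];
# dividing the weights by 2 normalizes them to the uniform density rho = 1/2.
pts, w_gauss = np.polynomial.legendre.leggauss(m)

# P(i, j) = psi_i(zeta_j), where psi_i = sqrt(2i + 1) P_i are the Legendre
# polynomials orthonormal under rho.
P = np.array([np.sqrt(2 * i + 1) * np.polynomial.legendre.Legendre.basis(i)(pts)
              for i in range(m)])

# Integration property: the integral of psi_i against rho is delta_{i0}.
e = np.zeros(m)
e[0] = 1.0

w = np.linalg.solve(P, e)   # recovers the (normalized) Gauss-Legendre weights
```

Because the m-point Gauss rule integrates each ψ_i (of degree at most m − 1) exactly, and P is nonsingular, the solve reproduces the Gauss weights to machine precision.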
Instead of working with matrices and optimizing for the objectives in Table 3.1, one can frame an optimization problem centered on the computation of moments, where one optimizes over the space of (non-negative) quadrature weights. This moment-based approach for finding quadrature points and weights is utilized in Keshavarzzadeh et al. [40], where the authors solve a relaxed version of this constrained linear system by minimizing ‖Pw − e‖. In Ryu and Boyd [53], the authors present numerical quadrature as the solution of the infinite-dimensional linear program

minimize_{v ∈ R^m}  f^T v
subject to  p_j^T v = c_j,  j = 1, …, n,  v ≥ 0.   (27)

Here the components of f ∈ R^m are given by f(i) = f(ζ_i), and p_j ∈ R^m has components p_j(i) = ψ_j(ζ_i); the optimization variable v(i) = ω_i represents the quadrature weights. The constants c_j in the equality constraints are determined analytically, as they involve integrating a known polynomial over the support of ρ. The problem can be interpreted as a weighted ℓ0 optimization problem, as we require v to have as many zeros as possible and yet satisfy the above constraints. As this problem is NP-hard, Ryu and Boyd propose a two-stage approach to solve it: one stage for generating an initial condition and another for optimizing over v. Their approach has recently been adapted by Jakeman and Narayan [37], who propose the least absolute shrinkage and selection operator (LASSO) for finding the initial condition. They then solve the optimization using a gradient-based nonlinear least squares optimizer. Their results are extremely promising: numerical experiments using their technique show orders-of-magnitude improvements in convergence compared to tensor and sparse grid rules.

In this paper, we provided an overview of strategies for finding quadrature points using ideas from polynomial least squares.
Although we have sought to keep our review as detailed as possible, readers will forgive us for omitting various techniques. For example, we did not discuss optimal design of experiment based sampling, although we point the interested reader to the review offered in [34]; for convex relaxations of the various optimal design of experiment problems we refer the reader to Section 7.5.2 of Boyd and Vandenberghe [10]. In addition, we have also highlighted a new convex optimization strategy that uses Newton's method for finding the best subsamples by maximizing the volume of the confidence ellipsoid associated with the Vandermonde-type matrix.

So what's next? Based on our review, we offer a glimpse of potential future areas of research that could prove fruitful:
1. Randomized greedy linear algebra approaches for finding suitable quadrature samples. Existing approaches are tailored for finding pivot columns of tall matrices; for our problem, we require these techniques to be applicable to fat matrices.
2. Large scale (and distributed) variants of the convex optimization strategies detailed here, including an alternating direction method of multipliers (ADMM) [9] formulation of the Newton's method technique presented in this paper.
3. Heuristics for optimizing the weights when the joint density of the samples is not known, a problem that arises in data science; typically in uncertainty quantification the joint density ρ is assumed to be known.
4. The development of an open-source repository of near-optimal points for the case where m = n, for different total order bases and across different d.
5. Building on (1), and following the recent work by Shin and Xiu [61] and Wu et al. [69], the use of approaches such as the randomized Kaczmarz algorithm [63] for solving the least squares problem in (11). The essence of the idea is that as d and the highest multivariate polynomial degree grow, the matrix Â_z cannot be stored in memory, a requirement for standard linear algebra approaches. Techniques such as the Kaczmarz algorithm, which solve for x by iteratively accessing rows of Â_z and elements of b, are therefore useful.

Acknowledgements
This work was carried out while PS was visiting the Department of Mechanical, Chemical and Materials Engineering at Università di Cagliari in Cagliari, Sardinia; the financial support of the University's Visiting Professor Program is gratefully acknowledged. The authors are also grateful to Akil Narayan for numerous discussions on polynomial approximations and quadratures.
References
1. Ben Adcock and Daan Huybrechs. Approximating smooth, multivariate functions on irregular domains. arXiv preprint arXiv:1802.00602, 2018.
2. Fang Bian, David Kempe, and Ramesh Govindan. Utility based sensor selection. In Proceedings of the 5th International Conference on Information Processing in Sensor Networks, pages 11–18. ACM, 2006.
3. Åke Björck. Numerical Methods in Matrix Computations. Springer.
4. Géraud Blatman and Bruno Sudret. Adaptive sparse polynomial chaos expansion based on least angle regression. Journal of Computational Physics, 230(6):2345–2367, 2011.
5. Len Bos, Marco Caliari, Stefano De Marchi, and Marco Vianello. Bivariate interpolation at Xu points: results, extensions and applications. Electron. Trans. Numer. Anal., 25:1–16, 2006.
6. Len Bos, Marco Caliari, Stefano De Marchi, Marco Vianello, and Yuan Xu. Bivariate Lagrange interpolation at the Padua points: the generating curve approach. Journal of Approximation Theory, 143(1):15–25, 2006.
7. Len Bos, Stefano De Marchi, Alvise Sommariva, and Marco Vianello. Computing multivariate Fekete and Leja points by numerical linear algebra. SIAM Journal on Numerical Analysis, 48(5):1984–1999, 2010.
8. Len Bos, Stefano De Marchi, Marco Vianello, and Yuan Xu. Bivariate Lagrange interpolation at the Padua points: the ideal theory approach. Numerische Mathematik, 108(1):43–57, 2007.
9. Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, Jonathan Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
10. Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
11. John Burkardt. Slow exponential growth for Clenshaw-Curtis sparse grids, 2014.
12. Marco Caliari, Stefano De Marchi, Alvise Sommariva, and Marco Vianello. Padua2DM: fast interpolation and cubature at the Padua points in MATLAB/Octave. Numerical Algorithms, 56(1):45–60, 2011.
13. Tony F. Chan and Per Christian Hansen. Low-rank revealing QR factorizations. Numerical Linear Algebra with Applications, 1(1):33–44, 1994.
14. Shivkumar Chandrasekaran and Ilse C. F. Ipsen. On rank-revealing factorisations. SIAM Journal on Matrix Analysis and Applications, 15(2):592–622, 1994.
15. Abdellah Chkifa, Albert Cohen, Giovanni Migliorati, Fabio Nobile, and Raul Tempone. Discrete least squares polynomial approximation with random evaluations: application to parametric and stochastic elliptic PDEs. ESAIM: Mathematical Modelling and Numerical Analysis, 49(3):815–837, 2015.
16. Ali Çivril and Malik Magdon-Ismail. On selecting a maximum volume sub-matrix of a matrix and related problems. Theoretical Computer Science, 410(47-49):4801–4811, 2009.
17. Albert Cohen, Mark A. Davenport, and Dany Leviatan. On the stability and accuracy of least squares approximations. Foundations of Computational Mathematics, 13(5):819–834, 2013.
18. Albert Cohen and Giovanni Migliorati. Optimal weighted least-squares methods. SMAI Journal of Computational Mathematics, 3:181–203, 2017.
19. Patrick R. Conrad and Youssef M. Marzouk. Adaptive Smolyak pseudospectral approximations. SIAM Journal on Scientific Computing, 35(6):A2643–A2670, 2013.
20. Philip J. Davis and Philip Rabinowitz. Methods of Numerical Integration. Courier Corporation, 2007.
21. Achiya Dax. A modified Gram–Schmidt algorithm with iterative orthogonalization and column pivoting. Linear Algebra and its Applications, 310(1-3):25–42, 2000.
22. Amit Deshpande and Luis Rademacher. Efficient volume sampling for row/column subset selection. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 329–338. IEEE, 2010.
23. Jed A. Duersch and Ming Gu. Randomized QR with column pivoting. SIAM Journal on Scientific Computing, 39(4):C263–C291, 2017.
24. Walter Gautschi. A survey of Gauss-Christoffel quadrature formulae. In E. B. Christoffel, pages 72–147. Springer, 1981.
25. Walter Gautschi. Orthogonal polynomials: constructive theory and applications. Journal of Computational and Applied Mathematics, 12:61–76, 1985.
26. Walter Gautschi. Orthogonal Polynomials: Computation and Approximation. Oxford University Press, 2004.
27. W. Morven Gentleman. Implementing Clenshaw-Curtis quadrature, I: Methodology and experience. Communications of the ACM, 15(5):337–342, 1972.
28. Thomas Gerstner and Michael Griebel. Numerical integration using sparse grids. Numerical Algorithms, 18(3):209–232, 1998.
29. Saman Ghili and Gianluca Iaccarino. Least squares approximation of polynomial chaos expansions with optimized grid points. SIAM Journal on Scientific Computing, 39(5):A1991–A2019, 2017.
30. T. Ghisu and S. Shahpar. Toward affordable uncertainty quantification for industrial problems: Part II—turbomachinery application. ASME Turbo Expo 2017, GT2017-64845, 2017.
31. Gene H. Golub and Charles F. Van Loan. Matrix Computations, volume 4. JHU Press, 2012.
32. Gene H. Golub and John H. Welsch. Calculation of Gauss quadrature rules. Mathematics of Computation, 23(106):221–230, 1969.
33. Ling Guo, Akil Narayan, Liang Yan, and Tao Zhou. Weighted approximate Fekete points: sampling for least-squares polynomial approximation. SIAM Journal on Scientific Computing, 40(1):A366–A387, 2018.
34. Mohammad Hadigol and Alireza Doostan. Least squares polynomial chaos expansion: A review of sampling strategies. Computer Methods in Applied Mechanics and Engineering, 2017.
35. Jerrad Hampton and Alireza Doostan. Coherence motivated sampling and convergence analysis of least squares polynomial chaos regression. Computer Methods in Applied Mechanics and Engineering, 290:73–97, 2015.
36. Geir E. Hovland and Brenan J. McCarragher. Dynamic sensor selection for robotic systems. In Robotics and Automation, 1997. Proceedings., 1997 IEEE International Conference on, volume 1, pages 272–277. IEEE, 1997.
37. John D. Jakeman and Akil Narayan. Generation and application of multivariate polynomial quadrature rules. arXiv preprint arXiv:1711.00506, 2017.
38. Siddharth Joshi and Stephen Boyd. Sensor selection via convex optimization. IEEE Transactions on Signal Processing, 57(2):451–462, 2009.
39. T. S. Kalra, A. Aretxabaleta, P. Seshadri, N. K. Ganju, and A. Beudin. Sensitivity analysis of a coupled hydrodynamic-vegetation model using the effectively subsampled quadratures method. Geoscientific Model Development Discussions, 2017:1–28, 2017.
40. Vahid Keshavarzzadeh, Robert M. Kirby, and Akil Narayan. Numerical integration in multiple dimensions with designed quadrature. arXiv preprint arXiv:1804.06501, 2018.
41. Rex K. Kincaid and Sharon L. Padula. D-optimal designs for sensor and actuator locations. Computers & Operations Research, 29(6):701–713, 2002.
42. Dirk Laurie. Calculation of Gauss-Kronrod quadrature rules. Mathematics of Computation of the American Mathematical Society, 66(219):1133–1145, 1997.
43. Per-Gunnar Martinsson, Gregorio Quintana Ortí, Nathan Heavner, and Robert van de Geijn. Householder QR factorization with randomization for column pivoting (HQRRP). SIAM Journal on Scientific Computing, 39(2):C96–C115, 2017.
44. Giovanni Migliorati and Fabio Nobile. Analysis of discrete least squares on multivariate polynomial spaces with evaluations at low-discrepancy point sets. Journal of Complexity, 31(4):517–542, 2015.
45. Giovanni Migliorati, Fabio Nobile, Erik von Schwerin, and Raúl Tempone. Analysis of discrete L2 projection on polynomial spaces with random evaluations. Foundations of Computational Mathematics, 14(3):419–456, 2014.
46. Alan Miller. Subset Selection in Regression. CRC Press, 2002.
47. L. Miranian and Ming Gu. Strong rank revealing LU factorizations. Linear Algebra and its Applications, 367:1–16, 2003.
48. Akil Narayan. Computation of induced orthogonal polynomial distributions. arXiv preprint arXiv:1704.08465, 2017.
49. Akil Narayan, John Jakeman, and Tao Zhou. A Christoffel function weighted least squares algorithm for collocation approximations. Mathematics of Computation, 86(306):1913–1947, 2017.
50. T. N. L. Patterson. The optimum addition of points to quadrature formulae. Mathematics of Computation, 22(104):847–856, 1968.
51. M. Per Pettersson, Gianluca Iaccarino, and J. Nordström. Polynomial Chaos Methods for Hyperbolic Partial Differential Equations. Mathematical Engineering. Springer, 2015.
52. Dirk Pflüger, Benjamin Peherstorfer, and Hans-Joachim Bungartz. Spatially adaptive sparse grids for high-dimensional data-driven problems. Journal of Complexity, 26(5):508–522, 2010.
53. Ernest K. Ryu and Stephen P. Boyd. Extensions of Gauss quadrature via linear programming. Foundations of Computational Mathematics, 15(4):953–971, 2015.
54. Pranay Seshadri, Paul Constantine, Gianluca Iaccarino, and Geoffrey Parks. A density-matching approach for optimization under uncertainty. Computer Methods in Applied Mechanics and Engineering, 305:562–578, 2016.
55. Pranay Seshadri, Akil Narayan, and Sankaran Mahadevan. Effectively subsampled quadratures for least squares polynomial approximations. SIAM/ASA Journal on Uncertainty Quantification, 5(1):1003–1023, 2017.
56. Pranay Seshadri and Geoffrey Parks. Effective-quadratures (EQ): Polynomials for computational engineering studies. The Journal of Open Source Software, 2:166–166, 2017.
57. Pranay Seshadri, Geoffrey T. Parks, and Shahrokh Shahpar. Leakage uncertainties in compressors: The case of rotor 37. Journal of Propulsion and Power, 31(1):456–466, 2014.
58. Pranay Seshadri, Shahrokh Shahpar, Paul Constantine, Geoffrey Parks, and Mike Adams. Turbomachinery active subspace performance maps. Journal of Turbomachinery, 140(4):041003, 2018.
59. Yeonjong Shin and Dongbin Xiu. Nonadaptive quasi-optimal points selection for least squares linear regression. SIAM Journal on Scientific Computing, 38(1):A385–A411, 2016.
60. Yeonjong Shin and Dongbin Xiu. On a near optimal sampling strategy for least squares polynomial regression. Journal of Computational Physics, 326:931–946, 2016.
61. Yeonjong Shin and Dongbin Xiu. A randomized algorithm for multivariate function approximation. SIAM Journal on Scientific Computing, 39(3):A983–A1002, 2017.
62. Sergey A. Smolyak. Quadrature and interpolation formulas for tensor products of certain classes of functions. In Dokl. Akad. Nauk SSSR, volume 4, page 123, 1963.
63. Thomas Strohmer and Roman Vershynin. A randomized Kaczmarz algorithm with exponential convergence. Journal of Fourier Analysis and Applications, 15(2):262, 2009.
64. Gary Tang and Gianluca Iaccarino. Subsampled Gauss quadrature nodes for estimating polynomial chaos expansions. SIAM/ASA Journal on Uncertainty Quantification, 2(1):423–443, 2014.
65. Lloyd N. Trefethen. Spectral Methods in MATLAB. SIAM, 2000.
66. Lloyd N. Trefethen. Is Gauss quadrature better than Clenshaw–Curtis? SIAM Review, 50(1):67–87, 2008.
67. Lloyd N. Trefethen. Cubature, approximation, and isotropy in the hypercube. SIAM Review, 59(3):469–491, 2017.
68. Lieven Vandenberghe, Stephen Boyd, and Shao-Po Wu. Determinant maximization with linear matrix inequality constraints. SIAM Journal on Matrix Analysis and Applications, 19(2):499–533, 1998.
69. Kailiang Wu, Yeonjong Shin, and Dongbin Xiu. A randomized tensor quadrature method for high dimensional polynomial approximation. SIAM Journal on Scientific Computing, 39(5):A1811–A1833, 2017.
70. Dongbin Xiu. Numerical Methods for Stochastic Computations: A Spectral Method Approach. Princeton University Press, 2010.
71. Tao Zhou, Akil Narayan, and Dongbin Xiu. Weighted discrete least-squares polynomial approximation using randomized quadratures.