Stable high-order randomized cubature formulae in arbitrary dimension
Giovanni Migliorati ∗† Fabio Nobile ‡ December 4, 2020
Abstract
We propose and analyse randomized cubature formulae for the numerical integration of functions with respect to a given probability measure $\mu$ defined on a domain $\Gamma \subseteq \mathbb{R}^d$, in any dimension $d$. Each cubature formula is exact on a given finite-dimensional subspace $V_n \subset L^2(\Gamma,\mu)$ of dimension $n$, and uses pointwise evaluations of the integrand function $\phi : \Gamma \to \mathbb{R}$ at $m > n$ independent random points. These points are drawn from a suitable auxiliary probability measure that depends on $V_n$. We show that, up to a logarithmic factor, a linear proportionality between $m$ and $n$ with dimension-independent constant ensures stability of the cubature formula with high probability. We also prove error estimates in probability and in expectation for any $n \ge 1$ and $m > n$, thus covering both preasymptotic and asymptotic regimes. Our analysis shows that the expected cubature error decays as $\sqrt{n/m}$ times the $L^2(\Gamma,\mu)$-best approximation error of $\phi$ in $V_n$. On the one hand, for fixed $n$ and $m \to \infty$ our cubature formula can be seen as a variance reduction technique for a Monte Carlo estimator, and can lead to enormous variance reduction for smooth integrand functions and subspaces $V_n$ with spectral approximation properties. On the other hand, when $n, m \to \infty$, our cubature becomes of high order with spectral convergence. As a further contribution, we analyse also another cubature whose expected error decays as $\sqrt{1/m}$ times the $L^2(\Gamma,\mu)$-best approximation error of $\phi$ in $V_n$, which is asymptotically optimal but with constants that can be larger in the preasymptotic regime. Finally we show that, under a more demanding (at least quadratic) proportionality between $m$ and $n$, all the weights of the cubature are strictly positive with high probability. As an example of application, we discuss the case where the domain $\Gamma$ has the structure of a Cartesian product, $\mu$ is a product measure on $\Gamma$ and $V_n$ contains algebraic multivariate polynomials.

AMS classification numbers:
Keywords: approximation theory, multivariate integration, cubature formula, error analysis, convergence rate, randomized linear algebra.
1 Introduction

Let $\Gamma \subseteq \mathbb{R}^d$ be a Borel set, $\mu$ be a Borel probability measure on $\Gamma$ absolutely continuous with respect to the Lebesgue measure $\lambda$, and denote by $\rho := d\mu/d\lambda : \Gamma \to \mathbb{R}$ its probability density. Given a function $\phi : \Gamma \to \mathbb{R}$ in some smoothness class, we consider the problem of integrating $\phi$ with respect to $\mu$ over $\Gamma$:
$$ I(\phi) := \int_\Gamma \phi(y)\, d\mu(y) = \int_\Gamma \phi(y)\, \rho(y)\, d\lambda(y). \qquad (1.1) $$
When the expression of $\phi$ or the geometric shape of the domain $\Gamma$ is complicated, the exact calculation of $I(\phi)$ might be too difficult, or not possible at all, for example if the function $\phi$ is not available in explicit form but can only be evaluated at any point $y \in \Gamma$ at a certain (possibly high) cost, so that the number of evaluations should be limited as much as possible. Hence one resorts to the numerical approximation of the integral (1.1), see e.g. [7, 29]; this is known as the problem of numerical quadrature when $d = 1$ and of numerical cubature when $d \ge 2$, and it can become a challenging task as $d$ increases due to the curse of dimensionality. In any dimension $d \ge 1$ and for any $m \ge 1$, we consider the $m$-point quadrature/cubature formula
$$ I_m(\phi) := \sum_{i=1}^m \alpha_i\, \phi(y_i), \qquad (1.2) $$
where $y_1, \ldots, y_m \in \Gamma$ are the nodes and $\alpha_1, \ldots, \alpha_m \in \mathbb{R}$ are the weights.

∗ Laboratoire Jacques-Louis Lions, Sorbonne Université, Paris 75005, France. email: [email protected]
† corresponding author
‡ MATHICSE-CSQI, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland. email: [email protected]

The nodes and weights should be chosen such that
$$ I_m(\phi) \approx I(\phi). \qquad (1.3) $$
One approach to develop quadrature/cubature formulae imposes that (1.2) be exact on some given finite-dimensional linear function space $V_n$ over $\Gamma$, where $n := \dim(V_n)$. In principle one would like to have a formula that exactly integrates any function in $V_n$, i.e.
$$ I_m(v) = I(v), \quad \forall v \in V_n. $$
When $d = 1$ and $V_n$ is a polynomial space, the existence of such quadrature formulae has been first discussed in [5] with general $\rho$, extending earlier results in [11] with $\rho \equiv 1$. An example of quadrature is the Gauss-Hermite quadrature formula, which exactly integrates any univariate algebraic polynomial up to degree $2m-1$ using $m$ points, where integration is intended with respect to the Gaussian probability measure on $\mathbb{R}$.

In general, quadrature/cubature formulae of the form (1.2) might not be provably stable to perturbations in the evaluations of $\phi$ at the nodes. Denoting by $\eta_i$ the perturbation of $\phi(y_i)$, for the formula (1.2) it holds
$$ \left| \sum_{i=1}^m \alpha_i (\phi(y_i) + \eta_i) - \sum_{i=1}^m \alpha_i\, \phi(y_i) \right| \le \max_{j=1,\ldots,m} |\eta_j| \sum_{i=1}^m |\alpha_i|. \qquad (1.4) $$
As long as $V_n$ contains the functions that are constant over $\Gamma$, exactness of (1.2) over $V_n$ implies
$$ \sum_{i=1}^m \alpha_i = 1. \qquad (1.5) $$
On the one hand, in the presence of negative weights the summation of the $|\alpha_i|$ in (1.4) can become larger than one, thus serving as an amplifying factor for the perturbations. On the other hand, if the weights are positive, (1.4) and (1.5) give
$$ \left| \sum_{i=1}^m \alpha_i (\phi(y_i) + \eta_i) - \sum_{i=1}^m \alpha_i\, \phi(y_i) \right| \le \max_{i=1,\ldots,m} |\eta_i|, \qquad (1.6) $$
thus ensuring the stability of the formula (1.2) to perturbations. In one dimension, the weights of Gaussian quadratures are strictly positive. In higher dimension, cubatures with positive weights are difficult to construct, above all on general domains or with general densities $\rho$. A remarkable result on the existence of stable quadrature/cubature formulae is the next theorem from [30].

Theorem 1.
Let $\Gamma \subset \mathbb{R}^d$ be a compact set, and consider the integral (1.1) with $\rho$ strictly positive over $\Gamma$. Given $n$ real functions $f_1, \ldots, f_n$ that are continuous on $\Gamma$, linearly independent, and such that at least one is nonzero everywhere in $\Gamma$, there exist $(y_i)_{i=1}^m \subset \Gamma$ and nonnegative reals $(\alpha_i)_{i=1}^m$ with $m \le n$ such that the formula (1.2) for (1.1) is exact on $\mathrm{span}\{f_1, \ldots, f_n\}$.

The result in Theorem 1 gives an upper bound for the number of nodes, and can be further generalized to unbounded domains, see e.g. [25, 1]. With classical spaces of algebraic polynomials over the hypercube $\Gamma = [-1,1]^d$, there are known lower bounds on the number of nodes required by any cubature formula for exactness, e.g. in [28] and in [19] for polynomial spaces supported on isotropic tensor product or total degree index sets. In such specific settings, it is possible to construct exact cubature formulae whose number of nodes matches those lower bounds, the so-called minimal cubature formulae, see e.g. [3] for an overview. With more general spaces of algebraic polynomials, minimal cubature formulae are not known on a systematic basis. In low dimension $d \le 3$, heuristic numerical methods based on optimization, such as those proposed in [26], allow the direct construction of (almost) minimal cubature formulae. In high dimension it is difficult to find minimal cubature formulae, and it is difficult also to find cubature formulae whose number of nodes matches the upper bound in Theorem 1, above all with general polynomial spaces and with general domains. It is worth noticing that the proof of Theorem 1 from [30] is not constructive, and finding the nodes and weights of such remarkable formulae remains an open problem.

We now briefly recall other approaches to multivariate integration from the literature, that use a cubature formula of the form (1.2) with suitable choices for the nodes and weights. A first approach, which requires a product domain and a separable density $\rho$, has been developed starting from [27], and nowadays goes under the name of Smolyak cubature or sparse grid quadrature/cubature, see e.g. [2] and references therein. In general, a sparse grid cubature formula is exact on a chosen polynomial space, see for example those presented in [22]. In low dimension and using polynomials with moderately high (total) degree, another type of (symmetric) cubature formulae has been presented in [15, 12], and these cubatures become unstable as the degree of the polynomials increases.

Other approaches to multivariate integration are Monte Carlo and quasi-Monte Carlo methods, see [24, 8] and references therein. These approaches do not impose exactness of the cubature formula on a given space. Monte Carlo and quasi-Monte Carlo methods share the same weights, all equal to $m^{-1}$ with $m$ being the number of nodes. The difference between the two methods is in the choice of the nodes. In the Monte Carlo method the nodes are realizations of independent and identically distributed copies of a random variable distributed as $\mu$.
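As an illustration of the plain Monte Carlo estimator just recalled, the following minimal Python sketch approximates $I(\phi)$ with equal weights $1/m$; the integrand $y \mapsto y^2$, the uniform measure on $[0,1]$ and the sample size are illustrative choices, not taken from the paper:

```python
import numpy as np

def monte_carlo(phi, sample_mu, m, rng):
    """Plain Monte Carlo cubature: i.i.d. nodes from mu, all weights equal to 1/m."""
    y = sample_mu(m, rng)            # nodes y_1, ..., y_m ~ mu
    return np.mean(phi(y))           # (1/m) * sum_i phi(y_i)

rng = np.random.default_rng(0)
# Illustrative choice: Gamma = [0, 1], mu = uniform measure, phi(y) = y^2, so I(phi) = 1/3.
est = monte_carlo(lambda y: y ** 2, lambda m, r: r.random(m), 10 ** 5, rng)
```

With $m = 10^5$ samples the estimate is within a few standard errors of the exact value $1/3$, consistently with the half-order convergence rate.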
In the quasi-Monte Carlo method the nodes are judiciously chosen according to specific deterministic rules, and a tensor product domain again is required. The well-known half-order convergence rate of the Monte Carlo method is immune to the curse of dimensionality. The convergence rate of quasi-Monte Carlo depends on the structure and smoothness of the integrand function, and on the low-discrepancy properties of the point set containing the nodes.

In the present article we develop cubature formulae of the form (1.2) that are exact on $V_n$, for general domains $\Gamma \subseteq \mathbb{R}^d$ provided an $L^2(\Gamma,\mu)$ orthonormal basis is available, and whose weights and nodes can be explicitly calculated. To pursue such an objective, we replace the integrand function by its weighted least-squares approximation onto $V_n$. Our cubature formulae use randomized nodes, and their exactness on $V_n$ occurs with a quantified large probability. More precisely, see Theorem 3, if $m$ is linearly proportional to $n$, up to a logarithmic factor, then the formula is exact on $V_n$ with large probability. This shows that exact cubature formulae can be constructed in the general setting of Theorem 1 with $m$ of the order of $n$, albeit not necessarily with positive weights.

Given a function $\phi$ in some smoothness class, for the proposed cubatures we provide convergence estimates in probability and in expectation for the integration error
$$ |I(\phi) - I_m(\phi)| \qquad (1.7) $$
depending on $m$, $n$, and on the best approximation error of $\phi$ in $V_n$ in some norm. In particular, we show that the mean error satisfies an estimate of the type
$$ \mathbb{E}\left(|I(\phi) - I_m(\phi)|\right) \lesssim \sqrt{\frac{n}{m}}\, \inf_{v \in V_n} \|\phi - v\|_{L^2(\Gamma,\mu)} + m^{-r}, \qquad (1.8) $$
provided $m/\ln m \gtrsim (1+r)\, n$, where $r > 0$.
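The near-linear sampling condition $m/\ln m \gtrsim (1+r)\,n$ can be explored numerically. In the sketch below, the helper `min_sample_size` and the constant `c` (a stand-in for the proportionality factor made precise later in the paper) are illustrative, not part of the paper's algorithms:

```python
import math

def min_sample_size(n, r, c=1.0):
    """Smallest m >= 2 with m / ln(m) >= c * (1 + r) * n.
    Here c is a placeholder for the dimension-independent constant of the paper."""
    m = max(2, n)
    while m / math.log(m) < c * (1 + r) * n:
        m += 1
    return m

# The oversampling ratio m/n grows only logarithmically with n:
sizes = {n: min_sample_size(n, r=1.0) for n in (10, 100, 1000)}
```

For instance with $r = 1$ the required $m$ grows like $2 n \ln m$, i.e. linearly in $n$ up to the logarithmic factor announced above.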
Similar estimates are proved for the mean squared error and in probability, and use recent results from [6] on the analysis of the stability and accuracy of weighted least-squares approximation methods. The previous estimate shows that if $n$ is kept fixed and $m \to \infty$, the cubature formula has a Monte Carlo type convergence rate, however with a substantially reduced variance with respect to a standard Monte Carlo cubature formula, proportional to the $L^2(\Gamma,\mu)$ projection error of $\phi$ onto $V_n$. In this respect, our cubature formula can be seen as a very efficient variance reduction technique in a Monte Carlo context. Moreover, if both $n, m \to \infty$, still under the condition $m/\ln m \gtrsim n$, then the mean cubature error is of the same order as the $L^2(\Gamma,\mu)$-best approximation error and features a fast decay if the integrand function is smooth and the sequence of finite-dimensional spaces $V_n$ has good approximation properties. In this respect our cubature formula is of high order. We remark once more that all results and hidden constants do not depend on the dimension $d$.

As a further contribution, we construct another cubature formula for $I(\phi)$ consisting of $I_m(\phi)$ plus a Monte Carlo correction estimator. This cubature satisfies an error estimate like (1.8) without the term $\sqrt{n}$, and hence is asymptotically optimal in $n$, but with different constants that can be larger in the preasymptotic regime. Such a result is stated in Theorem 5, and is made possible by using $2m$ non-identically distributed nodes rather than $m$ identically distributed nodes.

An often desirable property of the quadrature/cubature formula (1.2) concerns the positivity of all the weights, which ensures stability as shown in (1.6). This property is not fulfilled by sparse grid cubature formulae, which may contain negative weights, unless tensor grids are used and the underlying quadrature has positive weights (e.g. Gaussian quadratures).
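For instance, the one-dimensional Gauss-Hermite rule recalled in the introduction has strictly positive weights and is exact up to degree $2m-1$. A minimal sketch using NumPy's `hermgauss`, which targets the weight $e^{-x^2}$, so nodes and weights must be rescaled to the standard Gaussian measure (the rescaling below is a standard change of variables, not taken from the paper):

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

m = 5
x, a = hermgauss(m)               # rule for the weight exp(-x^2) on R; sum(a) = sqrt(pi)
nodes = np.sqrt(2.0) * x          # rescale to the standard Gaussian measure N(0, 1)
weights = a / np.sqrt(np.pi)      # strictly positive weights summing to one, cf. (1.5)

# Exact for polynomials up to degree 2m - 1 = 9, e.g. E[y^4] = 3 and E[y^6] = 15:
ey4 = np.dot(weights, nodes ** 4)
ey6 = np.dot(weights, nodes ** 6)
```

All $m$ weights are strictly positive, so the stability bound (1.6) applies.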
In Theorem 6 we show that cubature formulae exact on $V_n$ and with strictly positive weights can be constructed, again in the same general setting of Theorem 1, but with $m$ being at least quadratically proportional to $n$, up to a logarithmic factor. At the present time we are not aware of the existence in the literature of stable cubature formulae on general domains which have simultaneously a high degree of exactness and strictly positive weights, and the present paper provides a first analysis of cubature formulae of this type.

Cubature formulae exact on a given space and with nodes being random variables have been studied earlier in [9, 10], but the weights are not necessarily positive. Error estimates for these stochastic cubature formulae can be found in [20, Lemma 1], which can be seen as a stochastic counterpart of Theorem 1. The stochastic cubatures from [9, 10] use $n$ cubature nodes whose probability distribution contains determinants of $n$-by-$n$ matrices. For this reason, the generation of the nodes of such cubatures becomes computationally intractable already for moderate values of $n$ and $d$.
Compared to [9, 10], the stochastic cubature developed in the present paper uses a different distribution of the nodes, which can be sampled far more efficiently. Under very mild requirements, for example if an $L^2(\Gamma,\mu)$-orthonormal basis of $V_n$ is known in explicit form, efficient algorithms have been developed in [6] for the generation of the nodes of our stochastic cubatures. If the elements of the orthonormal basis have product form, then the computational complexity of such algorithms for the generation of $m$ nodes is provably linear in $m$ and $d$.

More recently, a cubature formula based on least-squares approximants for periodic functions has been proposed in [23], with different weights, a different distribution of the nodes and a different error analysis. The cubature from [23] has been later refined in [14].

The outline of the article is the following: in Section 2 we recall some useful results on the analysis of discrete least squares. In Section 3 we present the cubature formulae, and provide conditions which ensure exactness and positive weights, together with convergence estimates. Section 4 addresses the case where $V_n$ is chosen as a multivariate polynomial space. The proofs are collected in Section 5. In Section 6 we draw some conclusions.

2 Weighted discrete least squares

In this section we introduce the weighted discrete least-squares method with evaluations at random point sets, and recall the main results achieved in [6] for the analysis of its stability and accuracy. Given a Borel probability measure $\mu$ on $\Gamma$, we introduce the $L^2(\Gamma,\mu)$ inner product
$$ \langle f_1, f_2 \rangle := \int_\Gamma f_1(y)\, f_2(y)\, \rho(y)\, d\lambda(y), \qquad (2.1) $$
and the norm $\|f\| := \langle f, f \rangle^{1/2}$. We work under the following assumption.

Assumption 1.
There exists an $L^2(\Gamma,\mu)$ orthonormal basis $(\psi_j)_{j \ge 1}$, and this basis contains the constant function over $\Gamma$.

Using the orthonormal basis $(\psi_j)_{j \ge 1}$ we define the approximation space as
$$ V_n := \mathrm{span}\{\psi_1, \ldots, \psi_n\}, \qquad (2.2) $$
and set $n := \dim(V_n)$. Without loss of generality we suppose that $\psi_1 \equiv 1$, and therefore $\psi_1 \in V_n$ for any $n \ge 1$, as stated in the next assumption.

Assumption 2. For any $n \ge 1$ the space $V_n$ contains $\psi_1 \equiv 1$.

For any given space $V_n$ we define the functions $\kappa : \Gamma \to \mathbb{R}$ and $w : \Gamma \to \mathbb{R}$ as
$$ \kappa(y) := \left( \sum_{i=1}^n |\psi_i(y)|^2 \right)^{-1} \quad \text{and} \quad w(y) := n\, \kappa(y), \qquad (2.3) $$
whose denominators do not vanish thanks to Assumption 2. For any space $V_n$ and any $n \ge 1$, Assumption 2 ensures the upper bound
$$ w(y) \le n, \quad y \in \Gamma. \qquad (2.4) $$
The function $\kappa$ is strictly positive over $\Gamma$. Sharper lower bounds (uniformly over $\Gamma$) can be obtained by exploiting the structure of the space $V_n$: for example with polynomial spaces such lower bounds are shown in Remark 7. When $V_n$ is the space of algebraic polynomials of total degree $n - 1$, the function $\kappa$ is known as the Christoffel function. Using such functions, we define on $\Gamma$ the probability measure
$$ d\sigma := w^{-1}\, d\mu = \frac{\sum_{i=1}^n |\psi_i(y)|^2}{n}\, d\mu = \frac{\sum_{i=1}^n |\psi_i(y)|^2}{n}\, \rho(y)\, d\lambda. \qquad (2.5) $$
In general $\sigma$ is not a product measure, even if $\mu$ is a product measure. Next, we introduce the weighted discrete inner product
$$ \langle f_1, f_2 \rangle_m := \frac{1}{m} \sum_{i=1}^m w(y_i)\, f_1(y_i)\, f_2(y_i), \qquad (2.6) $$
where the functions $w, f_1, f_2$ are evaluated at $m$ points $y_1, \ldots, y_m \in \Gamma$ that are independent and identically distributed according to $\sigma$. The discrete inner product is associated with the discrete seminorm $\|f\|_m := \langle f, f \rangle_m^{1/2}$ for any $f \in L^2(\Gamma,\mu)$. In the forthcoming sections, the random points $y_1, \ldots, y_m \in \Gamma$ play the role of nodes for the quadrature/cubature formula (1.2).

For any function $\phi : \Gamma \to \mathbb{R}$, we define its $L^2(\Gamma,\mu)$ projection onto $V_n$ as
$$ \Pi_n \phi := \arg\min_{v \in V_n} \|v - \phi\|. \qquad (2.7) $$
In many applications we do not have an explicit expression of the function $\phi$, and can only evaluate its value $\phi(y)$ at a given parameter $y \in \Gamma$. In such a situation, the projection (2.7) cannot be computed, since it would require the explicit knowledge of the function $\phi$. Hence, one can resort to the discrete least-squares approximation of $\phi$ in $V_n$, defined as
$$ \Pi_n^m \phi := \arg\min_{v \in V_n} \|v - \phi\|_m, \qquad (2.8) $$
where the minimization of the $L^2(\Gamma,\mu)$ norm has been replaced by the minimization of the discrete seminorm. Since the discrete seminorm uses pointwise evaluations of $\phi$, throughout the paper we further assume that $\phi(y)$ is well defined at any $y \in \Gamma$. The expansion of $\Pi_n^m \phi$ over the orthonormal basis reads
$$ \Pi_n^m \phi = \sum_{j=1}^n \beta_j\, \psi_j, \qquad (2.9) $$
with $\beta := (\beta_1, \ldots, \beta_n)^\top \in \mathbb{R}^n$ being the vector of the coefficients in the above expansion. Denote by $D$ the design matrix, whose elements are defined as
$$ D_{ij} := \sqrt{w(y_i)}\, \psi_j(y_i), \quad i = 1, \ldots, m, \ j = 1, \ldots, n, \qquad (2.10) $$
and define the Gramian matrix $G$ as $G := \frac{1}{m} D^\top D$.
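The objects introduced so far can be sketched numerically under illustrative choices not prescribed by the paper: $\Gamma = [-1,1]$ with $\mu$ uniform, $V_n$ spanned by normalized Legendre polynomials, nodes drawn from $\sigma$ in (2.5) by simple rejection sampling (the paper's sampling algorithms from [6] are more efficient), and the Gramian $G = \frac{1}{m} D^\top D$ compared with the identity:

```python
import numpy as np
from numpy.polynomial.legendre import legvander

n = 5                                    # dim(V_n)
scale = np.sqrt(2 * np.arange(n) + 1)    # psi_{j+1} = sqrt(2j+1) P_j, orthonormal for uniform mu on [-1,1]

def psi(y):
    """Rows: points; columns: values of psi_1, ..., psi_n."""
    return legvander(np.atleast_1d(y), n - 1) * scale

def sample_sigma(m, rng):
    """Rejection sampling from d(sigma) = (sum_j psi_j^2 / n) d(mu).
    The density of sigma w.r.t. mu is bounded by n, since sum_j psi_j^2 <= n^2 here."""
    out = np.empty(0)
    while out.size < m:
        y = rng.uniform(-1.0, 1.0, 4 * n * m)           # proposals from mu
        accept = rng.random(y.size) < (psi(y) ** 2).sum(axis=1) / n ** 2
        out = np.concatenate([out, y[accept]])
    return out[:m]

rng = np.random.default_rng(0)
m = 5000
y = sample_sigma(m, rng)
P = psi(y)
w = n / (P ** 2).sum(axis=1)             # w(y) = n / sum_j |psi_j(y)|^2, so w <= n as in (2.4)
D = np.sqrt(w)[:, None] * P              # design matrix (2.10)
G = D.T @ D / m                          # Gramian; E[G] = I when the y_i are drawn from sigma
dev = np.linalg.norm(G - np.eye(n), 2)   # spectral deviation |||G - I|||
```

With this oversampling the observed deviation $\||G - I\||$ is small, in line with the concentration results recalled next.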
Moreover, we denote the Hadamard product of two vectors $p, q \in \mathbb{R}^m$ as $p \odot q := (p_1, \ldots, p_m) \odot (q_1, \ldots, q_m) = (p_1 q_1, \ldots, p_m q_m) \in \mathbb{R}^m$, and use this product to define
$$ b := \sqrt{w} \odot \Phi = \left( \sqrt{w(y_1)}\, \phi(y_1), \ldots, \sqrt{w(y_m)}\, \phi(y_m) \right)^\top, $$
where $\Phi := (\phi(y_1), \ldots, \phi(y_m))^\top$ contains the evaluations of $\phi$ at the nodes $(y_i)_{1 \le i \le m}$. The projection $\Pi_n^m \phi$ of $\phi$ in (2.8) can be computed by solving the linear system
$$ G\beta = \frac{1}{m} D^\top b. \qquad (2.11) $$
If the matrix $G$ is nonsingular then the solution to the linear system (2.11) is
$$ \beta = \frac{1}{m} G^{-1} D^\top b, \qquad (2.12) $$
thus defining a unique discrete least-squares approximation of $\phi$ through (2.9). For any integer $n \ge 1$, we say that a point set $y_1, \ldots, y_m \in \Gamma$ with $m \ge n$ is unisolvent for a given space $V_n$ if
$$ \det(G) \ne 0. \qquad (2.13) $$
A unisolvent point set ensures that the operator $\Pi_n^m$ is well defined and uniquely associated to the space $V_n$. When $m < n$ the matrix $G$ is rank deficient, and condition (2.13) cannot be fulfilled. Condition (2.13) does not depend on the choice of the basis of $V_n$: using any other basis $(\widetilde\psi_1, \ldots, \widetilde\psi_n)^\top = M (\psi_1, \ldots, \psi_n)^\top$ of $V_n$, related to $\psi_1, \ldots, \psi_n$ by means of a suitable nonsingular matrix $M$, yields $\det(M^\top G M) \ne 0 \iff \det(G) \ne 0$.

Given any matrix $A \in \mathbb{R}^{m \times n}$, for any $1 \le p \le +\infty$ we introduce the operator norm
$$ \||A\||_{\ell^p} := \sup_{x \in \mathbb{R}^n,\, x \ne 0} \frac{\|Ax\|_{\ell^p}}{\|x\|_{\ell^p}}, $$
and define $\||\cdot\|| := \||\cdot\||_{\ell^p}$ when $p = 2$. Condition (2.13) does not take into account situations where $G$ is nonsingular but still very ill-conditioned, which might occur even when $m \ge n$. From a numerical standpoint, it is desirable that $G$ is also well conditioned. A natural vehicle for quantifying the ill-conditioning of $G$ is to look at how much it deviates from the identity matrix $I$ of compatible size,
$$ \||G - I\|| \le \delta, \qquad (2.14) $$
for some $\delta > 0$. When $\delta \in (0,1)$, condition (2.14) can be rewritten as the norm equivalence
$$ (1-\delta)\, \|v\|^2 \le \|v\|_m^2 \le (1+\delta)\, \|v\|^2, \quad v \in V_n, \qquad (2.15) $$
or as
$$ 1 - \delta \le \||G\|| \le 1 + \delta. \qquad (2.16) $$
We now recall the main results achieved in [6] concerning the analysis of the stability and accuracy of weighted discrete least-squares approximation with evaluations at random points. For any $\phi \in L^2(\Gamma,\mu)$ we define its best approximation error in the $L^2(\Gamma,\mu)$ norm as
$$ e(\phi) := \|\phi - \Pi_n \phi\| = \min_{v \in V_n} \|\phi - v\|, $$
and its weighted $L^\infty$ best approximation error as
$$ e_{\infty,w}(\phi) := \inf_{v \in V_n} \sup_{y \in \Gamma} \sqrt{w(y)}\, |\phi(y) - v(y)|. $$
Given any $\delta \in (0,1)$ we define the quantity
$$ \xi(\delta) := (1+\delta)\ln(1+\delta) - \delta > 0, \qquad (2.17) $$
which satisfies the upper and lower bounds in (5.23) and (5.24). Also recall the conditioned weighted least-squares estimator introduced in [6], defined as
$$ \widetilde\phi := \begin{cases} \Pi_n^m \phi, & \text{if } \||G - I\|| < \delta, \\ 0, & \text{otherwise.} \end{cases} $$
Next we quote from [6, Corollary 1] the following result.

Theorem 2.
In any dimension $d$, for any real $r > 0$, any $\delta \in (0,1)$ and any $n \ge 1$, if the $m$ i.i.d. points $y_1, \ldots, y_m$ are drawn from $\sigma$ defined in (2.5) and
$$ \frac{m}{\ln m} \ge \frac{1+r}{\xi(\delta)}\, n, \qquad (2.18) $$
then the following holds:

(i) the matrix $G$ satisfies
$$ \Pr\left( \||G - I\|| \le \delta \right) > 1 - 2n\, m^{-(r+1)} \ge 1 - 2 m^{-r}; \qquad (2.19) $$

(ii) for all $\phi$ such that $\sup_{y \in \Gamma} \sqrt{w(y)}\, |\phi(y)| < +\infty$ the weighted least-squares estimator satisfies
$$ \Pr\left( \|\phi - \Pi_n^m \phi\| \le \left(1 + \sqrt{2}\right) e_{\infty,w}(\phi) \right) > 1 - 2n\, m^{-(r+1)} \ge 1 - 2 m^{-r}; \qquad (2.20) $$

(iii) if $\phi \in L^2(\Gamma,\mu)$ then the conditioned estimator satisfies
$$ \mathbb{E}\left( \|\phi - \widetilde\phi\|^2 \right) \le (1 + \varepsilon(m))\, (e(\phi))^2 + 2\|\phi\|^2\, m^{-r}, \qquad (2.21) $$
where $\varepsilon(m) := \frac{4\,\xi(\delta)}{(1+r)\ln m}$ decreases monotonically to zero as $m$ increases.

The bound (2.14) for $G$ implies a bound of the same type for its inverse: for any $\delta \in (0,1)$ and $m \ge n$ it holds that
$$ \||G - I\|| \le \delta \implies \||G^{-1} - I\|| \le \frac{\delta}{1-\delta}, \qquad (2.22) $$
since $\||G^{-1} - I\|| = \||G^{-1}(I - G)\|| \le \||G^{-1}\||\, \||G - I\||$, and (2.14) also implies $\||G\|| \le 1 + \delta$ and $\||G^{-1}\|| \le (1-\delta)^{-1}$. Another implication of (2.14) is that
$$ \||G - I\|| \le \delta \implies \mathrm{cond}(G) \le \frac{1+\delta}{1-\delta}. \qquad (2.23) $$
From (2.19) together with (2.22) we have the following corollary of Theorem 2.

Corollary 1.
In any dimension $d$, for any real $r > 0$, any $\delta \in (0,1)$ and any $n \ge 1$, if the $m$ i.i.d. points $y_1, \ldots, y_m$ are drawn from $\sigma$ defined in (2.5) and condition (2.18) holds true, then
$$ \Pr\left( \||G^{-1} - I\|| \le \frac{\delta}{1-\delta} \right) > 1 - 2 m^{-r}. \qquad (2.24) $$
Since (2.14) implies that $G$ is nonsingular, another consequence of Theorem 2 is the next corollary.

Corollary 2.
In any dimension $d$, for any real $r > 0$, any $\delta \in (0,1)$ and any $n \ge 1$, if $m$ satisfies (2.18) then the random point set $y_1, \ldots, y_m$ with $m$ i.i.d. points drawn from $\sigma$ is unisolvent for $V_n$ with probability larger than $1 - 2m^{-r}$.

Sampling algorithms for the generation of the random point set $y_1, \ldots, y_m$ from (2.5) have been developed in [6] when $\Gamma$ is a Cartesian domain and $\mu$ is a product measure. The computational cost of these algorithms scales linearly with respect to the dimension $d$ and to $m$.

3 Randomized cubature formulae

In this section we construct cubature formulae of the form (1.2), to approximate the multivariate integral (1.1) of a smooth function $\phi : \Gamma \to \mathbb{R}$ with respect to a given probability measure $\mu$. The construction of such cubature formulae uses the discrete least-squares approximation (2.8) of the integrand function $\phi$. As in the previous sections, $y_1, \ldots, y_m$ and $\alpha_1, \ldots, \alpha_m$ denote its nodes and weights, respectively. The first step in the development of our cubature formula consists in the evaluations $\phi(y_1), \ldots, \phi(y_m)$ of the integrand function $\phi$ at the nodes $y_1, \ldots, y_m$. The second step is the choice of the weights. Let $W$ be the diagonal matrix defined element-wise as $W_{ii} := w(y_i)$ for $i = 1, \ldots, m$ and $W_{ij} = 0$ for $i, j = 1, \ldots, m$ with $i \ne j$. We consider weights $\alpha = (\alpha_1, \ldots, \alpha_m)^\top$ of the form
$$ \alpha := \frac{1}{m}\left( \sqrt{w(y_1)}, \ldots, \sqrt{w(y_m)} \right)^\top \odot D G^{-1} e_1 = \frac{1}{m}\, W^{1/2} D G^{-1} e_1, \qquad (3.1) $$
where $e_1 := (1, 0, \ldots, 0)^\top \in \mathbb{R}^n$ denotes the vector with all components except the first being equal to zero, and the first component being equal to one. The components of $\alpha$ are given by
$$ \alpha_i := \frac{1}{m}\left( W^{1/2} D G^{-1} e_1 \right)_i = \frac{1}{m}\, \sqrt{w(y_i)}\, D_i\, G^{-1} e_1, \quad i = 1, \ldots, m, \qquad (3.2) $$
with $D_i := \sqrt{w(y_i)}\, (\psi_1(y_i), \ldots, \psi_n(y_i))$ the $i$th row of $D$. In the next lemma we prove conditions on the nodes and weights such that the cubature (1.2) is exact on $V_n$, see Section 5 for the proof.

Lemma 1.
In any dimension $d \ge 1$ and for any $n \ge 1$, let the $m \ge n$ nodes $y_1, \ldots, y_m \in \Gamma$ be a unisolvent point set for the space $V_n$. If the weights $\alpha_1, \ldots, \alpha_m$ are chosen as in (3.1) then the formula (1.2) satisfies
$$ I_m(\phi) = I(\Pi_n^m \phi), \quad \text{for any } \phi \in L^2(\Gamma,\mu), \qquad (3.3) $$
and
$$ I_m(v) = I(v), \quad \text{for any } v \in V_n. \qquad (3.4) $$

Remark 1.
The weights in equation (3.1) can be calculated as $\alpha = \frac{1}{m} W^{1/2} D h$, where $h$ is the solution to the linear system $Gh = e_1$. Note that, from Corollary 1, the matrix $G$ is well conditioned under condition (2.18).

For any $n \ge 1$, the projection $\Pi_n \phi$ of $\phi$ onto $V_n$ satisfies
$$ I(\Pi_n \phi) = I(\phi), \quad \phi \in L^2(\Gamma,\mu), \qquad (3.5) $$
whose proof is identical to the proof of (3.3). Using (3.4) together with (3.5) it follows that
$$ I_m(\Pi_n \phi) = I(\phi), \quad \phi \in L^2(\Gamma,\mu). \qquad (3.6) $$
Set $h_n := \phi - \Pi_n \phi$, and define the vector $g \in \mathbb{R}^m$ whose components are given by $g_i = h_n(y_i)$ for any $i = 1, \ldots, m$. We can decompose the integration error of $I_m$ into two terms, named $S$ and $B$ respectively:
$$ I_m(\phi) - I(\phi) = \int_\Gamma (\Pi_n^m \phi - \Pi_n \phi)\, d\mu = \int_\Gamma \Pi_n^m (\phi - \Pi_n \phi)\, d\mu = \frac{1}{m}\, e_1^\top G^{-1} D^\top W^{1/2} g \qquad (3.7) $$
$$ = \frac{1}{m}\, e_1^\top D^\top W^{1/2} g + \frac{1}{m}\, e_1^\top (G^{-1} - I) D^\top W^{1/2} g \qquad (3.8) $$
$$ = \underbrace{\frac{1}{m} \sum_{i=1}^m w(y_i)\, h_n(y_i)}_{=:\,S} \;+\; \underbrace{\frac{1}{m} \sum_{i=1}^m \left( e_1^\top G^{-1} (I - G)\, D^\top W^{1/2} \right)_i h_n(y_i)}_{=:\,B}, \qquad (3.9) $$
where the first equality uses (3.5) and (3.3), the second equality follows from properties of projection operators, and the third equality uses (2.12) together with the definitions of $I_m$ and $g$. Notice that (3.5) implies
$$ \mathbb{E}(S) = \mathbb{E}(w\phi) - \mathbb{E}(w\, \Pi_n \phi) = 0, \qquad (3.10) $$
but there is no reason for $B$ to have zero mean, and in general $\mathbb{E}(B) = \mathbb{E}(I_m(\phi)) - I(\phi) \ne 0$. Usually $\bar B := \mathbb{E}(B)$ is called the bias of $I_m$.

By conditioning on the value of $\||G - I\||$, for any $\delta \in (0,1)$ we can define another cubature formula
$$ \widetilde I_m(\phi) := \begin{cases} I_m(\phi), & \text{if } \||G - I\|| < \delta, \\ 0, & \text{otherwise,} \end{cases} \qquad (3.11) $$
that uses the cubature formula $I_m(\phi)$ defined in (1.2) with weights (3.1). Notice that $\widetilde I_m(\phi)$ can also be written as a cubature formula of the form (1.2), with the same nodes as $I_m(\phi)$ but with weights given by
$$ \alpha = \begin{cases} \frac{1}{m}\, W^{1/2} D G^{-1} e_1, & \text{if } \||G - I\|| < \delta, \\ (0, \ldots, 0)^\top, & \text{otherwise.} \end{cases} \qquad (3.12) $$
A consequence of (3.3) is that $\widetilde I_m(\phi) = I(\widetilde\phi)$ for any $\phi \in L^2(\Gamma,\mu)$, on both events $\{\||G - I\|| < \delta\}$ and $\{\||G - I\|| \ge \delta\}$. However, (3.4) with $I_m(v)$ replaced by $\widetilde I_m(v)$ remains true only on the event $\{\||G - I\|| < \delta\}$, because $\widetilde I_m(v) = 0$ for $v \in V_n$ on the event $\{\||G - I\|| \ge \delta\}$.
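The weights (3.1) and the exactness property of Lemma 1 can be verified numerically. In the sketch below the setup is illustrative (normalized Legendre basis on $\Gamma = [-1,1]$ with $\mu$ uniform), and the nodes are drawn uniformly rather than from $\sigma$: this suffices for exactness, since Lemma 1 only requires unisolvency, although not for the stability results of Theorem 3:

```python
import numpy as np
from numpy.polynomial.legendre import legvander

n, m = 5, 50
rng = np.random.default_rng(1)
scale = np.sqrt(2 * np.arange(n) + 1)       # normalized Legendre basis, uniform mu on [-1,1]
y = rng.uniform(-1.0, 1.0, m)               # any unisolvent node set works for exactness
P = legvander(y, n - 1) * scale             # P[i, k] = psi_{k+1}(y_i); psi_1 = 1
w = n / (P ** 2).sum(axis=1)                # w(y) = n / sum_k psi_k(y)^2
D = np.sqrt(w)[:, None] * P                 # design matrix (2.10)
G = D.T @ D / m                             # Gramian; nonsingular iff the nodes are unisolvent, (2.13)
e1 = np.eye(n)[0]
h = np.linalg.solve(G, e1)                  # G h = e_1, as in Remark 1
alpha = np.sqrt(w) * (D @ h) / m            # cubature weights (3.1)

# Exactness on V_n (Lemma 1): I_m(psi_k) should equal I(psi_k) = delta_{k,1}
moments = alpha @ P                         # (I_m(psi_1), ..., I_m(psi_n))
```

The computed moments reproduce $\int_\Gamma \psi_k\, d\mu = \delta_{k,1}$ up to rounding; note that the $\alpha_i$ need not be positive, which is the point of the stronger sampling rate required by Theorem 6.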
On the event $\{\||G - I\|| < \delta\}$, the integration error of $\widetilde I_m$ can be decomposed as
$$ \widetilde I_m(\phi) - I(\phi) = S + B, \qquad (3.13) $$
where the terms $S$ and $B$ are the same that appear in (3.9). The bias of $\widetilde I_m$ is $\widetilde B := \mathbb{E}(\widetilde I_m(\phi)) - I(\phi)$. The expectation in the definitions of $\bar B$ and $\widetilde B$ is over both events $\{\||G - I\|| < \delta\}$ and $\{\||G - I\|| \ge \delta\}$. In contrast to $\bar B$, for $\widetilde B$ such an expectation is always finite. The term $\widetilde B$ asymptotically vanishes as $m \to +\infty$, and the proof of this fact is postponed to the end of this section. However, $\bar B$ and $\widetilde B$ do not vanish, in general, when $m$ is finite.

The next theorem quantifies the integration error of the formula (1.2) in probability, and the integration error of the formula (3.11) in expectation. Its proof is postponed to Section 5.

Theorem 3.
In any dimension $d$, for any real $r > 0$ and any $\delta \in (0,1)$, if the $m$ i.i.d. points $y_1, \ldots, y_m$ are drawn from $\sigma$ defined in (2.5) and condition (2.18) holds true, then:

i) the cubature formula (1.2) with weights chosen as in (3.1) satisfies (3.3)–(3.4) with probability larger than $1 - 2m^{-r}$;

ii) for all $\phi$ such that $\sup_{y} \sqrt{w(y)}\, |\phi(y)| < +\infty$, the integration error of the cubature formula (1.2) with weights (3.1) satisfies
$$ \Pr\left( |I(\phi) - I_m(\phi)| \le \left(1 + \sqrt{2}\right) e_{\infty,w}(\phi) \right) \ge 1 - 2m^{-r}; \qquad (3.14) $$

iii) for any $\phi \in L^2(\Gamma,\mu)$ the integration error of the cubature formula (3.11) satisfies
$$ \mathbb{E}\left( |I(\phi) - \widetilde I_m(\phi)|^2 \right) \le (1 + \varepsilon(m))\, (e(\phi))^2 + 2\, |I(\phi)|^2\, m^{-r}, \qquad (3.15) $$
with $\varepsilon(m)$ as in Theorem 2, and also satisfies
$$ \mathbb{E}\left( |I(\phi) - \widetilde I_m(\phi)| \right) \le \sqrt{\frac{n}{m}} \left( 1 + \frac{\varepsilon(m,n)}{1-\delta} \right) e(\phi) + 2\, |I(\phi)|\, m^{-r}, \qquad (3.16) $$
with
$$ \varepsilon(m,n) := \sqrt{2(1 + \lceil \ln n \rceil)} \left( 1 + \sqrt{2(1 + \lceil \ln n \rceil)}\, \sqrt{\frac{n}{m}} \right) \le 2(1 + \lceil \ln n \rceil) \left( 1 + \sqrt{\frac{n}{m}} \right). $$

Before closing the section, we compare our randomized cubature formulae with the Monte Carlo method and with Importance Sampling Monte Carlo, hereafter shortened to Importance Sampling. With all three methods the integral $I(\phi)$ in (1.1) is approximated using a cubature formula $I_m(\phi)$ as in (1.2), but with different choices for the nodes and weights, which amount to different estimates for the associated integration error (1.7).

With Monte Carlo the $m$ nodes $y_1, \ldots, y_m$ are independent and identically distributed according to $\mu$, and the weights $\alpha_1, \ldots, \alpha_m$ are all set equal to $1/m$. The mean squared integration error of Monte Carlo is given by
$$ \mathbb{E}\left( |I(\phi) - I_m(\phi)|^2 \right) = \frac{\mathrm{Var}(\phi)}{m}. \qquad (3.17) $$
With Importance Sampling, the $m$ nodes $y_1, \ldots, y_m$ are independent and can be chosen identically distributed according to $\sigma$, and the weights can be chosen as $\alpha_i = w(y_i)/m$ for $i = 1, \ldots, m$. The corresponding mean squared integration error is given by
$$ \mathbb{E}\left( |I(\phi) - I_m(\phi)|^2 \right) = \frac{\mathrm{Var}(w\phi)}{m}. \qquad (3.18) $$
In our cubature formulae the $m$ nodes $y_1, \ldots, y_m$ are independent and identically distributed from $\sigma$, and the weights are either chosen as in (3.1) or as in (3.12). Concerning the error estimates, (3.15) proves convergence in expectation of $|I(\phi) - \widetilde I_m(\phi)|$ and (3.14) proves a probabilistic estimate for the error $|I(\phi) - I_m(\phi)|$.

Our cubature formulae and Importance Sampling both rely on the change of measure (2.5), which is determined by the choice of the function $w$. In the present paper we have chosen $w$ as in (2.3), but any other nonnegative function $w : \Gamma \to \mathbb{R}$ such that $\int_\Gamma w^{-1}\, d\mu = 1$ could be used; a good choice of $w$ should hopefully make the variance in (3.18) smaller than the variance in (3.17) of Monte Carlo. In our cubature formula, taking $w$ as in (2.3) ensures the stability of the projector $\Pi_n^m$ as granted by Theorem 2.

For any fixed $n$ and any $i = 1, \ldots, m$, the weights (3.2) of the randomized cubature formula converge almost surely to the weights of Importance Sampling as $m \to +\infty$. Before presenting the proof of this result, we introduce the following notation: for any $m \ge 1$ consider the point set $(y_1, \ldots, y_m) \subset \Gamma$ and define the matrix $D^m \in \mathbb{R}^{m \times n}$, whose elements are $D^m_{ik} := \sqrt{w(y_i)}\, \psi_k(y_i)$, and
$$ \alpha_{i,m} := \frac{1}{m}\, \sqrt{w(y_i)}\, D^m_i\, (G^m)^{-1} e_1, \quad G^m := \frac{1}{m}\, (D^m)^\top D^m, \quad i = 1, \ldots, m, $$
where $D^m_i$ is the $i$th row of $D^m$. The weights defined above are the same as in (3.2), and the purpose of the new notation is solely to emphasize the dependence on $m$ in the design matrix $D$ and in the cubature weights.

Theorem 4.
Let $m$ satisfy (2.18) with some $\delta \in (0,1)$, $n \ge 1$, $r > 0$. For any $i \in \mathbb{N}$ the sequence of random variables $(m\, \alpha_{i,m})_{m \ge \max\{i, \bar m\}}$, with $\bar m$ the smallest integer satisfying (2.18), converges almost surely to $w(y_i)$ as $m \to +\infty$.

Proof. For any $\delta \in (0,1)$, define the sequence $\delta_k := \delta\, 2^{-k}$, $k \ge 0$, and denote by $m_k$ the number of samples that satisfies (2.18) with $\delta_k$, $n$ and $r$. For any $i \in \mathbb{N}$, define the events
$$ A_{ik} := \left\{ |m_k\, \alpha_{i,m_k} - w(y_i)| > \frac{\delta_k \sqrt{w(y_i)\, n}}{1 - \delta_k} \right\}, \quad k \ge 0. $$
The left-hand side of the above inequality can be bounded as
$$ |m_k\, \alpha_{i,m_k} - w(y_i)| = \left| \sqrt{w(y_i)}\, D^{m_k}_i \left( (G^{m_k})^{-1} - I \right) e_1 \right| \le \sqrt{w(y_i)}\, \|D^{m_k}_i\|_{\ell^2}\, \||(G^{m_k})^{-1} - I\||\, \|e_1\|_{\ell^2}. $$
For any $i = 1, \ldots, m_k$ we have $\|D^{m_k}_i\|_{\ell^2}^2 = w(y_i) \sum_{j=1}^n \psi_j(y_i)^2 = n$. Using (2.24), under condition (2.18) with $m_k$, $\delta_k$, $n$ and $r$, we have that
$$ \||(G^{m_k})^{-1} - I\|| \le \frac{\delta_k}{1 - \delta_k}, $$
with probability at least $1 - 2n\, m_k^{-(r+1)}$. As a consequence $\Pr(A_{ik}) \le 2n\, m_k^{-(r+1)}$ for any $k \ge 0$. Hence, using (5.23), we have that
$$ \sum_{k \ge 0} \Pr(A_{ik}) \le 2\, n^{-r} (1+r)^{-(r+1)} \sum_{k \ge 0} \left( \frac{\xi(\delta_k)}{\ln m_k} \right)^{r+1} \le 2\, n^{-r} \left( \frac{\delta}{1+r} \right)^{r+1} \sum_{k \ge 0} 2^{-k(r+1)} = 2\, n^{-r} \left( \frac{\delta}{1+r} \right)^{r+1} \frac{1}{1 - 2^{-(r+1)}} < +\infty, $$
and the claim follows from the Borel-Cantelli lemma.

In the limit $m \to +\infty$, the matrix $G$ tends almost surely to the $n$-by-$n$ identity matrix, see e.g. [16, Theorem 1] for a proof. From the strong law of large numbers we have
$$ \frac{1}{m}\, \Phi^\top W^{1/2} D \xrightarrow[m \to +\infty]{} \left( \int_\Gamma w \phi \psi_1\, d\sigma, \ldots, \int_\Gamma w \phi \psi_n\, d\sigma \right)^\top, \quad \text{almost surely}, $$
and therefore
$$ I_m(\phi) = \sum_{i=1}^m \alpha_i\, \phi(y_i) = \Phi^\top \alpha = \frac{1}{m}\, \Phi^\top W^{1/2} D G^{-1} e_1 \xrightarrow[m \to +\infty]{} \int_\Gamma w(y)\, \phi(y)\, d\sigma = \mathbb{E}(w\phi) = I(\phi), \quad \phi \in L^2(\Gamma,\mu). \qquad (3.19) $$
Using (3.19), for any $\phi \in L^2(\Gamma,\mu)$ we have $\mathbb{E}(\widetilde I_m(\phi)) - I(\phi) = \widetilde B \xrightarrow[m \to +\infty]{} 0$.

One can actually quantify more precisely the decay of $\widetilde B$ w.r.t. $m$. Inspection of the proof of (3.16), more precisely using (5.6), (5.7) and the notation from there, shows that there exist positive constants $C_1, C_2$ such that
$$ |\widetilde B| \le \int_{\{\||G-I\|| \le \delta\}} |B|\, d\mu^m + \int_{\{\||G-I\|| > \delta\}} |B|\, d\mu^m \le \frac{n}{m}\, \frac{C_1 + C_2 \ln n}{1 - \delta}\, e(\phi) + 2\, |I(\phi)|\, m^{-r}. $$
Hence $\widetilde B = \mathcal{O}(1/m)$, showing that the bias term $\widetilde B$ decays faster w.r.t. $m$ than the mean integration error in (3.16).

The error estimate (3.15) shows that, if $r$ is large enough, then the root mean squared error decays at least as fast as the best approximation error. However, the rate of convergence of this estimate does not catch up with those of Monte Carlo (3.17) and Importance Sampling (3.18), due to the missing decay with respect to $m$. Here the error estimate (3.16) comes in handy, since it recovers the same convergence rate as Monte Carlo and Importance Sampling with respect to $m$. The estimate (3.16) shows the main advantage of randomized cubatures over Monte Carlo and Importance Sampling: the decay of the error (3.16) is determined by the decay of the term $m^{-1/2}$ and the decay of the best approximation error, in contrast to (3.17) and (3.18) that only decay with respect to $m$, at the same rate, i.e. $m^{-1/2}$ for the root mean squared error.

In Theorem 3 we have written the estimate (3.16) using the upper bound in Lemma 3, which relates to the best approximation error in $L^2(\Gamma,\mu)$. The same estimate can be slightly improved using the equality (5.14), which relates to the weighted best approximation error of $\phi$ in $L^2(\Gamma,\mu)$. Since $w^{-1}\, d\mu$ in (2.5) is a probability measure and $w$ satisfies (2.4), such an error can be sandwiched, for any $n \ge 1$, as
$$ \|\phi - \Pi_n \phi\|_{L^1(\Gamma,\mu)} \le \|\sqrt{w}\,(\phi - \Pi_n \phi)\| \le \sqrt{n}\, \|\phi - \Pi_n \phi\|, \quad \phi \in L^2(\Gamma,\mu), $$
or as
$$ n^{-1}\, \mathrm{Var}(w(\phi - \Pi_n \phi)) \le \|\phi - \Pi_n \phi\|^2 \le \|w^{-1}\|_{L^\infty(\Gamma,\mu)}\, \mathrm{Var}(w(\phi - \Pi_n \phi)), \quad \phi \in L^2(\Gamma,\mu), $$
thanks to $\mathrm{Var}(w(\phi - \Pi_n \phi)) = \|w(\phi - \Pi_n \phi)\|^2_{L^2(\Gamma,\sigma)}$ from (3.10). For some polynomial approximation spaces, see the forthcoming Remark 7, lower bounds for $w$ of the form (4.7) are available. Using such a lower bound and taking $n =$
1, the variance in the above formulas connects with the variance of importance sampling (3.18). In contrast to our cubature formulas, Importance Sampling and Monte Carlo are not exact cubature formulas on V_n.

The more advantageous error estimate of randomized cubatures comes at the price of two additional computational tasks: the calculation of the cubature weights, which requires the solution of a linear system whose matrix G has size n × n, see Remark 1, and the generation of the random samples from σ. The cost for solving the linear system does not depend on m and d, and the cost for assembling G scales linearly in m. The cost for the generation of the samples is provably linear with respect to d and m when Γ is a product domain.

Remark 2 (Adaptive randomized cubatures for nested sequences of approximation spaces). The results in Theorem 3 are proven using identically distributed random samples, and apply to any given approximation space V_n. The same result as Theorem 3 can be proven for another type of (nonidentically distributed) random samples, following the lines of the proof of [17, Theorem 2]. These random samples allow the sequential construction of weighted least-squares estimators on any nested sequence of approximation spaces (V_{n_k})_k with dimension n_k := dim(V_{n_k}), using an overall number of samples that remains linearly proportional to n_k, up to logarithmic terms. Using such a type of random samples and [17, Theorem 3], the whole analysis of randomized cubatures from this article carries over using nested sequences of approximation spaces rather than a single space.

Remark 3.
The error estimates (3.14), (3.15), (3.16) have been presented for real-valued functions, but they extend to complex-valued functions with essentially the same proofs, by identifying C with R².

In this section we analyse another cubature formula, obtained by adding a correction term to the cubature Ĩ_m(φ) defined in (3.11), using control variates [13, 21]. Define two mutually independent sets of random samples: ỹ_1, . . . , ỹ_m i.i.d. from µ, and y_1, . . . , y_m i.i.d. from σ. We consider the following cubature formula, which uses the above 2m random samples as cubature nodes:

Î_m(φ) := Ĩ_m(φ) + (1/m) Σ_{i=1}^m (φ − φ̃)(ỹ_i). (3.20)

The nodes y_1, . . . , y_m, ỹ_1, . . . , ỹ_m are not identically distributed. The m random samples y_1, . . . , y_m are used to compute the weighted least-squares estimator φ̃ of φ and the cubature Ĩ_m(φ) defined in (3.11), and then the m random samples ỹ_1, . . . , ỹ_m are used in the Monte Carlo estimator of φ − φ̃. By construction Î_m(φ) ≈ I(φ), because

I(φ) − I(φ̃) ≈ (1/m) Σ_{i=1}^m (φ − φ̃)(ỹ_i),   and   Ĩ_m(φ) = I(φ̃) for any φ ∈ L²(Γ, µ),

as a consequence of (3.3). The error of the cubature Î_m(φ) satisfies the following theorem, whose proof is postponed to Section 5.

Theorem 5.
In any dimension d , for any real r ą and any δ P p , q , if condition (2.18) holds true, and r y , . . . , r y m are i.i.d. from µ , and y , . . . , y m are i.i.d. from σ defined in (2.5) , and r y , . . . , r y m , y , . . . , y m are mutually independent, then for any φ P L p Γ , µ q it holds that E ˆˇˇˇ I p φ q ´ p I m p φ q ˇˇˇ ˙ ď m ` p ` ε p m qq p e p φ qq ` } φ } m ´ r ˘ , (3.21) with ε p m q as in Theorem 2. From the estimate in (3.21), using Jensen inequality, we have that E p| I p φ q ´ p I m p φ q|q ď ? m ´a ` ε p m q e p φ q ` ? } φ } m ´ r { ¯ . (3.22)In (3.16) when using 2 m points we can choose an approximation space of dimension n ˚ such that2 m ln 2 m ě p ` r q ξ p δ q n ˚ . (3.23)We compare now the estimate (3.16) for the cubature r I m p φ q on V n ˚ with 2 m points and the estimate(3.22) for p I m p φ qq on V n that also uses 2 m points. Suppose that min v P V n } φ ´ v } ď Cn ´ s for some s, C ą
0. For any m ě r ą φ P L p Γ , µ q , it holds that 2 | I p φ q|p m q ´ r ď ? } φ } m ´p r ` q{ .Both terms m ´p r ` q{ and p m q ´ r always decay sufficiently fast for r large enough, i.e. r ě s ´
1, andtherefore, when comparing the convergence rates w.r.t. n of (3.22) and (3.16), we can focus only on theterm containing the best approximation error. Denote C (3.22) : “ a ` ε p m q and C (3.16) : “ ` ε p m, n q ´ δ ,that satisfy C (3.22) ď C (3.16) for any δ ď n ě m ě
1. From (2.18) and (3.23) n ˚ ă n , and if s ě then n ą s ùñ C (3.16) c n ˚ m p n ˚ q ´ s ą C (3.16) c n m p n q ´ s ą C (3.22) ? m n ´ s .
On the one hand, this shows that the error estimate (3.22) has a better convergence rate w.r.t. n than (3.16) when n ě s , s ě and r ě s ´
1. On the other hand, the estimate (3.16) gives a better upperbound for the error when n falls in the preasymptotic range, in particular when s is large or r ă s ´ φ since the term | I p φ q| can be much smaller than } φ } .The cubature (3.11) is exact on V n , but this property does not hold anymore for the cubature (3.20). Remark 4 (Convergence rates of stochastic cubatures) . From (3.22) , we obtain explicit convergence ratesof the cubature p I m p φ q , assuming an algebraic decay n ´ s for some s ą of the L p Γ , µ q best approximationerror. For any r ą , n ě and any m satisfying (2.18) , for any φ P L p Γ , µ q : e p φ q À n ´ s ùñ E ´ˇˇˇ I p φ q ´ p I m p φ q ˇˇˇ¯ À ? m ´ n ´ s ` m ´ r { ¯ , where p I m p φ q uses m evaluations of the function φ . The above convergence rate can be made explicitw.r.t. m as ? m ´ n ´ s ` m ´ r { ¯ À m ´ s ´ { p log m q s p ` r q s ` m ´ r { ´ { . Taking r “ s yields the convergence rate m ´ s ´ { p log m q s up to a constant independent of m and n .Since condition (2.18) implies m Á p ` r q n log n , it can also be rewritten w.r.t. n as ? m ´ n ´ s ` m ´ r { ¯ À n ´ s ´ { ˆ log n ` r ˙ ´ { ` n ´ r { ´ { ˆ log n ` r ˙ ´ r { ´ { and taking r “ s the leading term is n ´ s ´ { p log n q ´ { up to a constant independent of m and n . In this section we construct cubature formulae of the form (1.2) with the weights (3.1), enforcing theadditional property that all the weights α , . . . , α m are strictly positive. More precisely we prove that, if m is sufficiently larger than n , then α i ě w p y i q{ m for all i “ , . . . , m with high probability. Strict positivityof w p y q for any y P Γ therefore implies that the weights α , . . . , α m are all strictly positive. Define w min : “ min y P Γ w p y q ď min i “ ,...,m w p y i q “ n ˜ max i “ ,...,m n ÿ j “ | ψ j p y i q| ¸ ´ , which is independent of m and depends on V n . Of course w min ď
1. The next theorem establishes how much the cubature weights deviate from the weights of importance sampling in the nonasymptotic regime, when w_min > 0. The proof of this theorem is postponed to Section 5.
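The quantity w_min can be computed numerically for concrete bases. The following is a minimal illustrative sketch (not code from the paper), assuming the univariate Legendre basis, orthonormal with respect to the uniform probability measure on Γ = [−1, 1], and the choice (2.3) of the weight, i.e. w(y) = n (Σ_{j=1}^n ψ_j(y)²)^{−1}; under these assumptions the sum of squares peaks at y = ±1 with value n², so that w_min = 1/n:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

n = 8                                    # dimension of V_n (polynomials of degree < n)
ygrid = np.linspace(-1.0, 1.0, 20001)    # fine grid over Gamma = [-1, 1]

# orthonormal Legendre basis w.r.t. the uniform probability measure dmu = dy/2:
# psi_k = sqrt(2k+1) * P_k, since int_{-1}^{1} P_k(y)^2 dy/2 = 1/(2k+1)
psi = np.stack([np.sqrt(2*k + 1) * Legendre.basis(k)(ygrid) for k in range(n)])

w = n / np.sum(psi**2, axis=0)   # weight function w(y) = n / sum_j psi_j(y)^2
w_min = w.min()

print(w_min)   # equals 1/n here: the minimum of w is attained at y = +-1
```

The value 1/n is consistent with the lower bound (4.7) discussed in the polynomial setting, where the Legendre case gives w(y) ≥ n^{1−2B(0,0)} = 1/n.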
Theorem 6.
In any dimension d, for any real r > 0 and any n ≥ 1, if the m i.i.d. nodes y_1, . . . , y_m are drawn from σ defined in (2.5) and m satisfies

m ln m ě p ` r q n p p { q ´ q w min (3.24)

then i) the weights α_1, . . . , α_m given by (3.1) satisfy

Pr( ∩_{i=1}^m { 0 < (w(y_i) − w_min/2)/m ≤ α_i ≤ (w(y_i) + w_min/2)/m } ) > 1 − m^{−r}; (3.25)

ii) the cubature formula (1.2) with weights (3.1) satisfies items i), ii) and iii) of Theorem 3 with ε(m) replaced by

ε p m q “ ξ ˆ ? w min ? n ˙ p ` r q ln m ď p ` r q ln m , (3.26)

in (3.15), and with the factor 1/(1 − δ) in (3.16) adjusted accordingly.

Remark 5. Since the following trivial inclusions between sets of random events hold true,

∩_{i=1}^m { |α_i − w(y_i)/m| ≤ w_min/(2m) } ⊆ ∩_{i=1}^m { |α_i − w(y_i)/m| ≤ w(y_i)/(2m) } ⊂ ∩_{i=1}^m { α_i ≥ w(y_i)/(2m) } ⊂ ∩_{i=1}^m { α_i > 0 },

from (3.25) the event in the above right-hand side holds true with an even larger probability than 1 − m^{−r}. Therefore, since from (3.25) all the weights α_1, . . . , α_m are sandwiched between two strictly positive bounds, they are in particular strictly positive with an even larger probability than 1 − m^{−r}.

It is worth noticing that the weights of the cubature (3.20) are not necessarily strictly positive. Using the expression of φ̃, the cubature (3.20) can be written in the form (1.2) as

Î_m(φ) = Σ_{i=1}^m ( ᾱ_i φ(ỹ_i) + α̃_i φ(y_i) ),

with nodes y_1, . . . , y_m, ỹ_1, . . . , ỹ_m and weights ᾱ_i = 1/m and

r α i “ $&% ´ m ˆř nj “ ˆ m ř mℓ “ ψ j p r y ℓ q ˙ W { D J G ´ e j ˙ i , if ~ G ´ I ~ ď δ, , otherwise , i “ , . . . , m.

From the above, the weights α̃_i might be negative. Proceeding as in the proof of (3.25), one can obtain conditions on m ensuring that with large probability Σ_{i=1}^m |α̃_i| ≲ log m, see Remark 8, which is a classical way to quantify the stability of a cubature in the presence of negative weights.
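The construction can be illustrated end to end. The following sketch is illustrative only (not code from the paper): it assumes the univariate Legendre setting on Γ = [−1, 1] with uniform µ, draws the nodes i.i.d. from σ = w^{−1} µ by simple rejection sampling, assembles the design matrix D and Gramian G, computes the weights α_i = (√(w(y_i))/m) D_i G^{−1} e_1, and then checks exactness on V_n, i.e. Σ_i α_i ψ_k(y_i) = ∫_Γ ψ_k dµ for every basis function, so that in particular the weights sum to one:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

rng = np.random.default_rng(0)
n, m = 6, 600              # dim(V_n) and number of nodes; m well above n^2 / w_min = n^3

def psi(y):
    # orthonormal Legendre basis w.r.t. the uniform probability measure dmu = dy/2
    y = np.atleast_1d(y)
    return np.stack([np.sqrt(2*k + 1) * Legendre.basis(k)(y) for k in range(n)], axis=-1)

def w(y):
    # weight function of the change of measure: w = n / sum_j psi_j^2
    return n / np.sum(psi(y)**2, axis=-1)

# draw m i.i.d. nodes from sigma = w^{-1} dmu; its Lebesgue density 1/(2 w(y)) is
# bounded by n/2, so rejection sampling from a uniform proposal on [-1, 1] applies
nodes = []
while len(nodes) < m:
    t = rng.uniform(-1.0, 1.0)
    if rng.uniform(0.0, n / 2) < 1.0 / (2.0 * w(t)[0]):
        nodes.append(t)
y = np.array(nodes)

D = np.sqrt(w(y))[:, None] * psi(y)   # design matrix, D_ik = sqrt(w(y_i)) psi_k(y_i)
G = D.T @ D / m                       # Gramian, close to the identity for large m
alpha = (np.sqrt(w(y)) / m) * (D @ np.linalg.solve(G, np.eye(n)[:, 0]))  # cubature weights

# exactness on V_n: sum_i alpha_i psi_k(y_i) must equal int psi_k dmu = (1, 0, ..., 0)
moments = alpha @ psi(y)
print(np.allclose(moments, np.eye(n)[0]))   # True, up to round-off
print(alpha.min())                          # typically > 0 for m this large
```

The exactness check holds up to round-off for any realization of the nodes, by the algebraic identity Σ_i α_i ψ_k(y_i) = (G G^{−1} e_1)_k. With m well above n²/w_min, in line with the at least quadratic proportionality between m and n required by Theorem 6, the computed weights are also typically all strictly positive, which printing alpha.min() lets one check on a given draw.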
In this section we assume that the domain Γ ⊆ R^d has a Cartesian product structure,

Γ := ×_{q=1}^d Γ_q, (4.1)

where the Γ_q ⊆ R are bounded or unbounded intervals. We further assume that dµ = ⊗_{q=1}^d dµ_q, where each µ_q is a probability measure on Γ_q. For convenience we take Γ_q = Γ_1 and µ_q = µ_1 for any q = 1, . . . , d. Assume the existence of a family (ϕ_j)_j of univariate orthogonal polynomials, complete and orthonormal in L²(Γ_1, µ_1). For any ν ∈ N^d we define the multivariate polynomials

ψ_ν(y) := ∏_{q=1}^d ϕ_{ν_q}(y^{(q)}),   y = (y^{(1)}, . . . , y^{(d)}) ∈ Γ, (4.2)

with y^{(q)} being the qth coordinate of y. The set (ψ_ν)_{ν∈N^d} is a complete orthonormal basis in L²(Γ, µ). Consider any finite d-dimensional multi-index set Λ ⊂ N^d, and denote its cardinality by #(Λ). We denote the polynomial space P_Λ = P_Λ(Γ) associated with Λ as

P_Λ := span{ψ_ν : ν ∈ Λ}. (4.3)

The results from the previous sections apply to the polynomial setting by taking V_n = P_Λ with n = #(Λ). A remarkable class of index sets is that of downward closed index sets.

Definition 1 (Downward closed multi-index set Λ). In any dimension d, a finite multi-index set Λ ⊂ N^d is downward closed if ν ∈ Λ implies ν̃ ∈ Λ for any ν̃ ≤ ν, where ν̃ ≤ ν is meant component-wise, i.e. ν̃_q ≤ ν_q for any q = 1, . . . , d.

A relevant setting in which this type of index sets arises is Gaussian integration, where µ is the Gaussian measure on Γ = R^d, and the ψ_ν are tensorized Hermite polynomials. On the one hand, tensorization of univariate Gaussian quadratures becomes prohibitive as the dimension d increases, due to the exponential growth in d of the number of nodes. On the other hand, the use of downward closed sets allows one to tune the polynomial space and allocate only the most effective degrees of freedom, depending on the importance of each coordinate in the approximation of the target function.

Remark 6 (Inverse inequalities for polynomials supported on downward closed index sets).
Given two parameters (θ_1, θ_2) ∈ N × N ∪ {(−1/2, −1/2)}, let µ be the probability measure on Γ = [−1, 1]^d defined as dµ = ⊗_{q=1}^d dJ, with

dJ := C (1 − t)^{θ_1} (1 + t)^{θ_2} dλ(t),   t ∈ [−1, 1],   and C s.t. ∫_{−1}^{+1} dJ(t) = 1.

Consider the tensorized Jacobi polynomials (J^{θ_1,θ_2}_ν)_{ν∈N^d} constructed by (4.2) when taking (ϕ_k)_k as the sequence of univariate Jacobi polynomials orthonormal in L²([−1, 1], dJ). The (J^{θ_1,θ_2}_ν)_ν correspond to tensorized Legendre polynomials when θ_1 = θ_2 = 0, and to tensorized Chebyshev polynomials when θ_1 = θ_2 = −1/2. Choosing the orthonormal basis (ψ_ν)_ν as (J^{θ_1,θ_2}_ν)_ν and given any downward closed index set Λ ⊂ N^d with cardinality equal to n, see Definition 1, define V_n using (4.3) as the space generated by (J^{θ_1,θ_2}_ν)_{ν∈Λ}. In the setting described above, the following inverse inequalities are proven in [4, 18]: for all v ∈ V_n it holds that

‖v‖_{L^∞} ≤ n^{B(θ_1,θ_2)} ‖v‖, (4.4)

where

B(θ_1, θ_2) := max{θ_1, θ_2} + 1, if (θ_1, θ_2) ∈ N × N, and B(θ_1, θ_2) := ln 3/(2 ln 2), if (θ_1, θ_2) = (−1/2, −1/2). (4.5)

One step that leads to the proof of such inequalities, see [4, 18], is the estimate

max_{y∈Γ} Σ_{k=1}^n |ψ_k(y)|² ≤ n^{2B(θ_1,θ_2)}. (4.6)

Remark 7 (Lower bound on w for polynomial spaces). From a numerical standpoint, it is desirable that the weights are not only strictly positive but also bounded away from zero. When V_n is a polynomial space on [−1, 1]^d generated by the (J^{θ_1,θ_2}_ν)_{ν∈Λ} with Λ downward closed, strictly positive lower bounds for the weights can be derived by using the estimate (4.6). Using (4.6) we obtain the following lower bound for w, uniformly over Γ:

w(y) ≥ n^{1−2B(θ_1,θ_2)},   y ∈ Γ, (4.7)

with θ_1, θ_2 being the same parameters that appear in Remark 6.

The next result is a corollary of Theorem 6 in the particular case that V_n is a polynomial space, and is obtained by using the lower and upper bounds (4.7) and (2.4).

Corollary 3.
In any dimension d with Γ = [−1, 1]^d, let µ be the Jacobi probability measure on Γ and let V_n be any downward closed polynomial space generated by tensorized Jacobi polynomials. For any real r > 0 and n ≥ 1, if the m i.i.d. nodes y_1, . . . , y_m are drawn from σ defined in (2.5) and m satisfies

m ln m ě p ` r q p { q ´ n B p θ ,θ q` , p θ , θ q P N ˆ N Y tp´ { , ´ { qu , (4.8)

then i) the weights α_1, . . . , α_m given by (3.1) satisfy

Pr ˜ m č i “ " mn B p θ ,θ q ď α i ď ? n m *¸ ą ´ m ´ r ;

ii) the cubature formula (1.2) with weights (3.1) satisfies items i), ii) and iii) of Theorem 3 with ε(m) replaced by

ε p m q “ ξ ˆ ? w min ? n ˙ p ` r q ln m ď p ` r q ln m ,

in (3.15), and with the factor 1/(1 − δ) in (3.16) adjusted accordingly.

We consider next the case w ≡
1, which corresponds to using standard least squares with random samples distributed as µ. Corollary 4.
In any dimension d with Γ “ r´ , s d , let µ be the Jacobi probability measure on Γ , and V n be any downward closed polynomial space generated by tensorized Jacobi polynomials. For any real r ą and n ě , if the m i.i.d. nodes y , . . . , y m are drawn from µ and m sastisfies m ln m ě p ` r q p { q ´ n B p θ ,θ q , p θ , θ q P N ˆ N Y "ˆ ´ , ´ ˙* , (4.9) theni) the weights α , . . . , α m given by (3.1) satisfy Pr ˜ m č i “ " m ď α i ď m *¸ ą ´ m ´ r . ii) the cubature formula (1.2) with weights (3.1) satisfies items i), ii) and iii) of Theorem 3 with ε p m q replaced by ε p m q “ ξ ˆ n B p θ ,θ q ˙ p ` r q ln m ď n B p θ ,θ q p ` r q ln m in (3.15) , and with {p ´ δ q replaced by {p ´ δ q in (3.16) . The exponent of n in condition (4.8) with weighted least squares is always smaller than that in condition(4.9) with standard least squares. In the Legendre and Chebyshev cases, Corollary 3 ensures positivity ofthe weights with large probability if m ln m ě p ` r q p { q ´ n s , with s “ B p , q ` “ , in the Legendre case, and s “ B ˆ ´ , ´ ˙ ` “ ln 3ln 2 ` « . , in the Chebyshev case . In this section we use the notation dµ m : “ b m dµ . Proof of Lemma 1.
Proof of (3.3). On the one hand, using in sequence ψ_1 ≡ 1, the orthogonality property of the basis functions and ‖ψ_1‖ = 1, we have that

I(Π^m_n φ) = ∫_Γ Π^m_n φ dµ (5.1) = ∫_Γ Σ_{j=1}^n β_j ψ_j dµ = β_1, (5.2)

with β_1 being the coefficient associated to ψ_1 in the expansion (2.9). On the other hand, the left-hand side in (3.3) is the cubature formula (1.2), which can be read (up to a multiplicative factor m^{−1}) as the scalar product in R^m between the vector α containing the weights and the vector Φ containing the evaluations of the function φ at the nodes, and from (2.12) we have

I_m(φ) = (1/m) α^T Φ = (1/m)(D G^{−1} e_1)^T b = (1/m) e_1^T G^{−1} D^T b = e_1^T β = β_1, (5.3)

which proves the equality (3.3).

Proof of (3.4). We notice that, since Π^m_n is a projection onto V_n, it holds Π^m_n v ≡ v for any v ∈ V_n, and we obtain (3.4) from (3.3).

Proof of Theorem 3. The proof of i) is an immediate consequence of Lemma 1 and Corollary 2. For proving ii), using point i) we bound the integration error as

|I(φ) − I_m(φ)| = |∫_Γ (φ − Π^m_n φ) dµ| ≤ ∫_Γ |φ − Π^m_n φ| dµ ≤ ‖φ − Π^m_n φ‖, (5.4)

and combining with (2.20) we obtain (3.14).

Proof of iii), estimate (3.15). We start by splitting the expectation in (3.15) over the events {|||G − I||| ≤ δ} and {|||G − I||| > δ}. Since on the event {|||G − I||| ≤ δ} the cubature Ĩ_m(φ) equals I_m(φ), we obtain

E(|I(φ) − Ĩ_m(φ)|²) = ∫_{|||G−I|||≤δ} |I(φ) − I_m(φ)|² dµ_m + ∫_{|||G−I|||>δ} |I(φ) − Ĩ_m(φ)|² dµ_m.

Then we use the upper bound (5.4) and, proceeding as in the proof of (2.21) in [6], we obtain

∫_{|||G−I|||≤δ} |I(φ) − I_m(φ)|² dµ_m ≤ ∫_{|||G−I|||≤δ} ‖φ − Π^m_n φ‖² dµ_m ≤ (1 + ε(m))(e(φ))². (5.5)

The last term in the right-hand side of (3.15) is an upper bound for the integral on the event {|||G − I||| > δ}, where Ĩ_m(φ) is set to zero.

Proof of iii), estimate (3.16).
Splitting the expectation in (3.16) over the events {|||G − I||| ≤ δ} and {|||G − I||| > δ}, using (3.11) and (3.7) we obtain

E(|I(φ) − Ĩ_m(φ)|) = ∫_{|||G−I|||≤δ} |(1/m) e_1^T G^{−1} D^T W^{1/2} g| dµ_m + ∫_{|||G−I|||>δ} |I(φ)| dµ_m =: A + B. (5.6)

Term B can be controlled as

B = ∫_{|||G−I|||>δ} |I(φ)| dµ_m ≤ |I(φ)| m^{−r}. (5.7)

For term A, using (3.8), the triangular inequality, the sub-multiplicative property of the operator norm, (2.22) and the Cauchy–Schwarz inequality, we obtain

A ≤ ∫_{|||G−I|||≤δ} |(1/m) e_1^T D^T W^{1/2} g| dµ_m + ∫_{|||G−I|||≤δ} |(1/m) e_1^T (G^{−1} − I) D^T W^{1/2} g| dµ_m
≤ E(|(1/m) e_1^T D^T W^{1/2} g|) + ∫_{|||G−I|||≤δ} |||G^{−1} − I||| ‖(1/m) D^T W^{1/2} g‖_{ℓ²} dµ_m
≤ (E(|(1/m) e_1^T D^T W^{1/2} g|²))^{1/2} + (1/(1 − δ)) ∫_{|||G−I|||≤δ} |||G − I||| ‖(1/m) D^T W^{1/2} g‖_{ℓ²} dµ_m
≤ (E(|(1/m) e_1^T D^T W^{1/2} g|²))^{1/2} + (1/(1 − δ)) (E(|||G − I|||²))^{1/2} (E(‖(1/m) D^T W^{1/2} g‖²_{ℓ²}))^{1/2}. (5.8)

Using Lemmas 2, 3 and 4 to bound the expectations in (5.8) we obtain (3.16).

One of the results used in the proof of (3.16) is the following upper bound on the spectral norm of a sum of independent random matrices, see for example [31, Theorem 4.1], which we rewrite here in the Hermitian case. We denote by 0_n the null n-by-n matrix.

Theorem 7. Consider a family (Q_i)_{1≤i≤m} of independent random matrices in R^{n×n} such that E(Q_i) = 0_n for all i, and define Z = Σ_{i=1}^m Q_i. Then

(E(|||Z|||²))^{1/2} ≤ √(C(n)) |||E(Z^T Z)|||^{1/2} + C(n) (E(max_{i=1,...,m} |||Q_i|||²))^{1/2},

with C(n) := (1 + ⌈ln n⌉).

Lemma 2.

(E(|||G − I|||²))^{1/2} ≤ √(1 + ⌈ln n⌉) √((n − 1)/m) (1 + √(1 + ⌈ln n⌉) √(n/m)). (5.9)

Proof.
Using the random variable y distributed as σ , we introduce the n -by- n real random matrices X “ X p y q and Q “ Q p y q , whose elements are defined as X pq p y q : “ w p y q m ψ p p y q ψ q p y q , Q pq p y q : “ X pq p y q ´ δ pq m , p, q “ , . . . , n. By construction, E p X q “ m I and therefore E p Q q “ n . Denote by p X i q ď i ď m a family of m independentcopies of X , and by p Q i q ď i ď m a family of m independent copies of Q .The matrix X i has rank one, and Q i has full rank. Nonetheless, we can compute an upper bound for ~ Q i ~ as follows: ~ Q i ~ ď ~ Q i ~ F “ trace ´ p Q i q J Q i ¯ “ trace ´ p X i ´ m ´ I q J p X i ´ m ´ I q ¯ “ n ÿ p “ ˜ n ÿ q “ X ipq X iqp ´ X ipp m ` δ pp m ¸ “ n ÿ p “ n ÿ q “ ´ X ipq ¯ ´ m n ÿ p “ X ipp ` nm “ | w p y i q| m n ÿ p “ n ÿ q “ | ψ p p y i q| | ψ q p y i q| ´ w p y i q m n ÿ p “ | ψ p p y i q| ` nm “ n p n ´ q m , (5.10)and the trace has been rewritten using ´ p X i ´ m ´ I q J p X i ´ m ´ I q ¯ pq “ n ÿ j “ p X ipj ´ m ´ δ pj qp X ijq ´ m ´ δ jq q “ n ÿ j “ X ipj X ijq ´ m X ipq ` δ pq m , and n ÿ j “ δ pj “ δ pp , n ÿ j “ δ pj δ jq “ δ pq . (5.11)The bound (5.10) holds uniformly for all i “ , . . . , m , and therefore E ˆ max i “ ,...,m ~ Q i ~ ˙ ď n p n ´ q m . (5.12)Define Z : “ ř mi “ Q i “ G ´ I and let us compute E p Z J Z q . The components of the matrix Z can bewritten as Z pq “ x ψ p , ψ q y m ´ δ pq , and therefore p Z J Z q pq “ n ÿ k “ px ψ p , ψ k y m ´ δ pk q px ψ k , ψ q y m ´ δ kq q“ n ÿ k “ x ψ p , ψ k y m x ψ k , ψ q y m ´ x ψ p , ψ q y m ` δ pq , E pp Z J Z q pq q “ n ÿ k “ E px ψ p , ψ k y m x ψ k , ψ q y m q loooooooooooooomoooooooooooooon “ : T kpq ´ δ pq . 
Using the independence of the random samples, the term T kpq can be rewritten as T kpq “ m E ˜˜ m ÿ i “ w p y i q ψ p p y i q ψ k p y i q ¸ ˜ m ÿ j “ w p y j q ψ k p y j q ψ q p y j q ¸¸ “ m ¨˚˝ E ¨˚˝ m ÿ i “ w p y i q ψ p p y i q ψ k p y i q m ÿ j “ j ‰ i w p y j q ψ q p y j q ψ k p y j q ˛‹‚ ` E ˜ m ÿ i “ p w p y i qq ψ p p y i q ψ q p y i qp ψ k p y i qq ¸˛‹‚ , and using linearity of expectation, the definition of w and (5.11) we obtain n ÿ k “ T kpq “ m ˜ n ÿ k “ p mδ pk p m ´ q δ qk qq ` nmδ pq ¸ “ m ` n ´ m δ pq . Hence we have finally E p Z J Z qq “ n ´ m I , and ~ E p Z J Z qq~ “ p n ´ q m . (5.13)We now apply Theorem 7 to the family of random matrices Q , . . . , Q m , using the bounds (5.12) and(5.13), and finally obtain (5.9). Lemma 3. E ˜ˇˇˇˇ m e J D J W { g ˇˇˇˇ ¸ ď nm p e p φ qq . Proof.
Since from (3.10) E p wg q “
0, it holds that E ˜ˇˇˇˇ m e J D J W { g ˇˇˇˇ ¸ “ E ˜˜ m m ÿ i “ w p y i q g p y i q ¸ ¸ “ m m ÿ i,j “ E p w p y i q g p y i q w p y j q g p y j qq“ m m ÿ i “ E ` p w p y i q g p y i qq ˘ “ m ż Γ w p y qp φ p y q ´ Π n φ p y qq dµ (5.14) ď nm p e p φ qq , and at the last step we have used (2.4). Lemma 4. E ˜›››› m D J W { g ›››› ℓ ¸ “ nm p e p φ qq . roof. Using the independence of the random samples E ˜›››› m D J W { g ›››› ℓ ¸ “ m E ˜ n ÿ k “ ˜ m ÿ i “ D ik a w p y i q g p y i q ¸ ¸ “ m n ÿ k “ m ÿ i,j “ E p ψ k p y i q w p y i q g p y i q ψ k p y j q w p y j q g p y j qq“ m n ÿ k “ ¨˚˝ m ÿ i “ E ` p ψ k p y i q w p y i q g p y i qq ˘ ` m ÿ i,j “ i ‰ j E p ψ k p y i q w p y i q g p y i q ψ k p y j q w p y j q g p y j qq ˛‹‚ “ m m ÿ i “ E ˜ p w p y i q g p y i qq n ÿ k “ p ψ k p y i qq ¸ ` m p m ´ q m n ÿ k “ p E p ψ k wg qq “ nm ż Γ g dµ ` m p m ´ q m n ÿ k “ ˆż Γ ψ k g dµ ˙ “ nm p e p φ qq , where at the last but one step we have used the definition of w , and at the last step we have used theorthogonality of g to ψ k for all k “ , . . . , n . Proof of Theorem 5.
Denote by E_{ỹ_1,...,ỹ_m} the expectation over the random samples ỹ_1, . . . , ỹ_m. For given y_1, . . . , y_m, using the mutual independence between ỹ_1, . . . , ỹ_m and y_1, . . . , y_m, it holds that

E_{ỹ_1,...,ỹ_m}( |I(φ − φ̃) − (1/m) Σ_{i=1}^m (φ − φ̃)(ỹ_i)|² ) = (1/m) Var_{ỹ∼µ}(φ − φ̃), (5.15)

where the conditional variance on the right-hand side is defined as

Var_{ỹ∼µ}(φ(ỹ) − φ̃(ỹ)) := E_{ỹ∼µ}( |φ(ỹ) − φ̃(ỹ) − E_{ỹ∼µ}(φ(ỹ) − φ̃(ỹ))|² ),

using the following conditional expectation for the given y_1, . . . , y_m:

E_{ỹ∼µ}(φ(ỹ) − φ̃(ỹ)) := ∫_Γ (φ(ỹ) − φ̃(ỹ)) dµ(ỹ).

For any given y_1, . . . , y_m, an upper bound for the variance is

Var_{ỹ∼µ}(φ(ỹ) − φ̃(ỹ)) ≤ ∫_Γ (φ(ỹ) − φ̃(ỹ))² dµ(ỹ) = ‖φ − φ̃‖². (5.16)

Using the law of total expectation, (5.15), the upper bound (5.16) and (2.21), we have that

E(|I(φ) − Î_m(φ)|²) = E_{y_1,...,y_m}( E_{ỹ_1,...,ỹ_m}( |I(φ − φ̃) − (1/m) Σ_{i=1}^m (φ − φ̃)(ỹ_i)|² ) ) ≤ (1/m) E_{y_1,...,y_m}( ‖φ − φ̃‖² ) ≤ (1/m)( (1 + ε(m)) min_{v∈V_n} ‖φ − v‖² + ‖φ‖² m^{−r} ).

Proof of Theorem 6. Proof of i). For any i = 1, . . . , m, using the sub-multiplicative property of the operator norm, we obtain that

|α_i − w(y_i)/m| = |(√(w(y_i))/m) D_i (G^{−1} − I) e_1| ≤ (√(w(y_i))/m) ‖D_i‖_{ℓ²} |||G^{−1} − I||| ‖e_1‖_{ℓ²}. (5.17)

Now we estimate each term on the right-hand side of (5.17). For the second term, the definitions of D in (2.10) and w in (2.3) ensure that, for any i = 1, . . . , m, it holds ‖D_i‖²_{ℓ²} = w(y_i) Σ_{j=1}^n ψ_j(y_i)² = n. For the third term, using (2.24) we have that, under condition (2.18), |||G^{−1} − I||| ≤ δ/(1 − δ), with probability at least 1 − m^{−r}. For the fourth term, ‖e_1‖_{ℓ²} =
1. We now observe that the restrictionof δ to the interval 0 ă δ ď ? w min ? n ` ? w min , n ě , (5.18)and strict monotonicity of δ ÞÑ δ p ´ δ q ´ on such an interval ensure that the left-hand side in (5.17)satisfies the following upper bound, uniformly for all i “ , . . . , m : ˇˇˇˇ α i ´ w p y i q m ˇˇˇˇ ď ? w min m δ ? n ´ δ ď w min m . (5.19)Since w min ď
1, choosing δ “ ? w min ? n ď ? w min ? n ` ? w min , n ě , (5.20)and thanks to (5.24), we can enforce condition (2.18) as m ln m ě p ` r q n p p { q ´ q w min “ p ` r q n p p { q ´ q δ ě p ` r q nξ p δ q , (5.21)and obtain condition (3.24). Condition (3.24) ensures that (5.19) holds with probability at least 1 ´ m ´ r and simultaneously for all i “ , . . . , m , that is the claim (3.25).Proof of ii). From (5.21), condition (3.24) requires more points than (2.18). Hence any cubatureformula whose nodes are drawn from (2.5) and satisfy (3.24), yields an integration error which obeys tothe convergence estimates in Theorem 3 but with δ chosen as in (5.20). Using the upper bounds (2.4) and(5.23) one obtains (3.26) in (3.15). Since δ ď for any n ě
1, (3.16) holds true with an additional factor3 { p ´ δ q ´ . Remark 8.
With a similar proof as for (3.25) but using } G ´ } ď p ´ δ q ´ , under condition (2.18) forany δ P p , q it holds that Pr ˜ m č i “ " | r α i | ď p n ´ q? n p ´ δ q ? w min m min ˆ ? ` δ, δ ˙*¸ ą ´ m ´ r , (5.22) because from the polarization identity and ψ ” we have for any j ą that x ψ j , ψ y m “ } ψ j ` ψ } m ´ } ψ j } m ´ ď $&% } ψ j } m ď ? ` δ, ` δ } ψ j ` } ´ } ψ j } m ´ “ ` δ p} ψ j } ` q ´ } ψ j } m ´ ď δ , nd therefore Pr ˜ n č j “ " |x ψ j , ψ y m | ď min ˆ ? ` δ, δ ˙*¸ ą ´ m ´ r . In (5.22) , choosing δ depending on n and proceeding similarly as in the proof of (3.25) , one can obtainconditions on m ensuring that with large probability ř mi “ | r α i | ď C log m for some constant C . Lemma 5. ξ p δ q ď δ , δ P r , s . (5.23) Proof.
Define U(δ) := δ²/2 and D(δ) := U(δ) − ξ(δ). The function D(δ) is continuously differentiable at any δ ≥ 0, and strictly increasing, since dD(δ)/dδ = δ − ln(1 + δ) > 0 for any δ > 0. Since D(0) =
0, wehave D p δ q ě δ P r , s , and hence (5.23). Lemma 6. p p { q ´ q δ ď ξ p δ q , δ P „ , . (5.24) Proof.
Define ω : “ p p { q´ q « . L p δ q : “ ωδ and D p δ q : “ ξ p δ q´ L p δ q . The function D p δ q is twicecontinuously differentiable at any δ ě dD p δ q{ dδ “ ln p ` δ q ´ ωδ , and d D p δ q{ dδ “ p ` δ q ´ ´ ω .Hence the function D p δ q is convex on r , p ω q ´ ´ s , concave on rp ω q ´ ´ , { s , and since D p q “ dD p q{ dδ “ D p { q “ r , s , that gives (5.24). In any domain Γ Ď R d with any dimension d P N , we have constructed randomized cubature formulaethat are stable and exact on a given space V n on Γ with dimension n , under the assumption that an L orthonormal basis of V n is available in explicit form. In Theorem 3 we have proven that the integrationerror of these cubature formulae satisfies convergence estimates in probability (3.14) and in expectation(3.15)–(3.16), under condition (2.18) on the required number of nodes. Such a condition imposes a numberof nodes only linearly proportional to n , up to a logarithmic term, thus approaching the number of nodesin Tchakaloff’s theorem (see Theorem 1), in the same general setting of arbitrary domain Γ and arbitrarydimension d . If the number of nodes satisfies the more demanding condition (3.24), where m is at leastquadratically proportional to n up to a logarithmic term, then the proposed randomized cubature formulaehave strictly positive weights with high probability. Both conditions (2.18) and (3.24) are immune to thecurse of dimensionality: the required number of nodes only depends on n , and does not depend on thedimension d . The rate of convergence with respect to m for the error in (3.16) catches up with theconvergence rate m ´ { of Monte Carlo, but the multiplicative constant in (3.16) can be much smallerthanks to the additional decay of the best approximation error in V n of φ . 
As a consequence, the proposed randomized cubatures provably outperform Monte Carlo whenever the best approximation error in V_n of φ decays faster than n^{−1/2}.

As a further contribution we have also constructed a cubature formula that is asymptotically optimal, but with error bounds that can be larger in the preasymptotic regime, see Theorem 5 and Remark 4. A point that has not been addressed in the present article is the choice of the space V_n. Such a choice depends indeed on the function φ, or on its smoothness class. In many applications, for example in the analysis of partial differential equations with parametric or stochastic data, a priori analyses provide good approximation spaces V_n with proven convergence rates n^{−s} with s >
0. Whenever this is not the case, one can resort to an adaptive construction of the approximation space, see Remark 2.

The results on randomized cubatures in this article have been presented always using m identically distributed random samples from σ for the construction of the weighted least-squares estimator of the integrand function, for both cubatures in Theorem 3 and Theorem 5. The cubature in Theorem 5 uses in addition m random samples from µ, but not for the construction of the weighted least-squares estimator. The whole analysis in this paper applies tout court to other types of (nonidentically distributed) random samples from other distributions than σ, e.g. the distribution used in [17, Theorem 2], and extends to the adaptive setting by exploiting recent advances on adaptive weighted least-squares estimators for approximating the integrand function.

References

[1] C.Bayer, J.Teichmann: The proof of Tchakaloff’s theorem, Proc. Amer. Math. Soc., 134:3035–3040, 2006. [2] H.Bungartz, M.Griebel:
Sparse grids , Acta Numer., 13:147–269, 2004.[3] R.Cools:
Constructing cubature formulae: the science behind the art , Acta Numer., 1–54, 1997.[4] A.Chkifa, A.Cohen, G.Migliorati, F.Nobile, R.Tempone:
Discrete least squares polynomial approxima-tion with random evaluations - application to parametric and stochastic elliptic PDEs , ESAIM Math.Model. Numer. Anal., 49(3):815–837, 2015.[5] E.B.Christoffel:
Sur une classe particuli`ere de fonctions enti`eres et de fractions continues ,Ann.Mat.Pura Appl. 2:1–10, 1877.[6] A.Cohen, G.Migliorati:
Optimal weighted least-squares methods , SMAI-JCM, 3:181–203, 2017.[7] P.Davis, P.Rabinowitz:
Methods of numerical integration . Second edition, Academic Press, 1984.[8] J.Dick, F.Y.Kuo, I.H.Sloan:
High-dimensional integration: the quasi-Monte Carlo way , Acta Numer.,22:133–288, 2013.[9] S.M.Ermakov, V.G.Zolotukhin:
Polynomial approximations and the Monte Carlo method , Theor.Prob. Appl. 5:428–431, 1960.[10] S.M.Ermakov:
Die Monte-Carlo-Methode und verwandte Fragen , Oldenbourg Verlag, Mu¨nchen, 1975.[11] C.F.Gauss:
Methodus nova integralium valores per approximationem inveniendi , Comment.Soc.RegiaeSci.Gottingensis Recentiores, 3:163–196, 1816.[12] A.Genz, B.D.Keister:
Fully symmetric interpolatory rules for multiple integrals over infinite regionswith Gaussian weight , J.Comp.Appl.Math., 71:299–309, 1996.[13] S.Heinrich:
Random approximation in numerical analysis , in Proc. of the Functional Analysis Conf.,Essen 1991, Lecture Notes in Pure and Applied Math. vol 150, Marcel Dekker, New York, pp.123–171.[14] L.Kammerer, T.Ullrich, T.Volkmer:
Worst-case recovery guarantees for least squares approximationusing random samples , arXiv 1911.10111, November 2019.[15] J.Mc Namee, F.Stenger:
Construction of fully symmetric numerical integration formulas , Nu-mer.Math. 10:327–344, 1967.[16] G.Migliorati, F.Nobile, E.von Schwerin, R.Tempone:
Analysis of discrete L projection on polynomialspaces with random evaluations , Found. Comput. Math., 14:419–456, 2014.[17] G.Migliorati: Adaptive approximation by optimal weighted least squares methods , SIAMJ.Numer.Anal., 57(5):2217–2245, 2019.[18] G.Migliorati:
Multivariate Markov-type and Nikolskii-type inequalities for polynomials associated withdownward closed multi-index sets , J. Approx. Theory, 189:137–159, 2015.[19] H.M.M¨oller:
Lower bounds for the number of nodes in cubature formulae , Numerische Integration,45:221–230, 1979.[20] E.Novak:
Deterministic and stochastic error bounds in numerical analysis , Springer, 1988.[21] E.Novak:
Some Results on the Complexity of Numerical Integration , In: Cools R., Nuyens D. (eds)Monte Carlo and Quasi-Monte Carlo Methods. Springer Proceedings in Mathematics & Statistics, vol163. Springer, Cham., 2016. 2322] E.Novak, K.Ritter:
Simple cubature formulas with high polynomial exactness , Constr.Approx. 15:499-522, 1999.[23] J.Oettershagen:
Construction of optimal cubature algorithms with applications to econometrics anduncertainty quantification , Phd thesis, 2017.[24] C.P.Robert, G.Casella:
Monte Carlo statistical methods , Springer, 2013.[25] M.Putinar:
A note on Tchakaloff’s theorem , Proceedings of the American Mathematical Society,125(8):2409–2414, 1997.[26] E.K.Ryu, S.P.Boyd:
Extensions of Gauss quadrature via linear programming , Found. Comput. Math.15:953–971, 2014.[27] S.Smolyak:
Quadrature and interpolation formulas for tensor products of certain classes of functions ,Dokl. Akad. Nauk SSSR, 4:240–243, 1963.[28] A.H.Stroud:
Quadrature methods for functions of more than one variable , New York Acad.Sci. 86:776–791, 1960.[29] A.H.Stroud:
Approximate Calculation of Multiple Integrals , Prentice-Hall, 1971.[30] V.Tchakaloff:
Formules de cubature m´echaniques `a coefficients non n´egatifs , Bull. Sci. Math., 81:123–134, 1957.[31] J.A.Tropp: