Component-by-component digit-by-digit construction of good polynomial lattice rules in weighted Walsh spaces
Adrian Ebert, Peter Kritzer, Onyekachi Osisiogu, Tetiana Stepaniuk
aa r X i v : . [ m a t h . NA ] A ug Component-by-component digit-by-digit construction of goodpolynomial lattice rules in weighted Walsh spaces
Adrian Ebert ∗ , Peter Kritzer, Onyekachi Osisiogu, Tetiana Stepaniuk † August 21, 2020
Abstract
We consider the efficient construction of polynomial lattice rules, which are special casesof so-called quasi-Monte Carlo (QMC) rules. These are of particular interest for the approxi-mate computation of multivariate integrals where the dimension d may be in the hundreds orthousands. We study a construction method that assembles the generating vector, which isin this case a vector of polynomials over a finite field, of the polynomial lattice rule in a digit-by-digit (or, equivalently, coefficient-by-coefficient) fashion. As we will show, the integrationerror of the corresponding QMC rules achieves excellent convergence order, and, under suit-able conditions, we can vanquish the curse of dimensionality by considering function spacesequipped with coordinate weights. The construction algorithm is based on a quality measurethat is independent of the underlying smoothness of the function space and can be imple-mented in a fast manner (without the use of fast Fourier transformations). Furthermore, weillustrate our findings with extensive numerical results. Keywords:
Numerical integration; polynomial lattice points; quasi-Monte Carlo methods;weighted function spaces; digit-by-digit construction; component-by-component construction;fast implementations.
In this article we study the problem of multivariate numerical integration for a subclass of square-integrable functions f ∈ L ([0 , d ). We consider special instances of so-called quasi-Monte Carlo(QMC) rules, which are methods to approximate integrals I d ( f ) = Z [0 , d f ( x ) d x by equal-weight quadrature rules, Q N,d ( f ) = 1 N N − X n =0 f ( x n ) , where the integration nodes x , x , . . . , x N − are deterministically chosen in [0 , d . This is incontrast to Monte Carlo rules, where the integration nodes are chosen randomly; with QMCrules, we try to make a deliberate and sophisticated choice of the points x n with the aim ofobtaining better error bounds than for Monte Carlo. The crucial challenge is to find integration ∗ A. Ebert, P. Kritzer, and O. Osisiogu are supported by the Austrian Science Fund (FWF): Project F5506,which is part of the Special Research Program “Quasi-Monte Carlo Methods: Theory and Applications”. † T. Stepaniuk is supported by the Alexander von Humboldt Foundation. t, m, d )-nets and ( t, d )-sequences, asintroduced by Niederreiter, building up on ideas by Sobol’ and Faure (see [14, 16]). A specialcase of ( t, m, d )-nets, namely so-called polynomial lattice point sets, is the focus of the presentpaper. These point sets were introduced in [15], and have their name since their structure canbe viewed as analogous to (ordinary) lattice point sets.While the construction principle of lattice point sets is based on integer arithmetic, polyno-mial lattice point sets are based on polynomial arithmetic over finite fields. To be more precise,we will fix a prime b , and consider the finite field F b with b elements. A polynomial lattice pointset with b m points in [0 , d is constructed by means of a modulus p ∈ F b [ x ] with deg( p ) = m ,and a generating vector g ∈ ( F b [ x ]) d (we refer to Section 2.2 for the precise definition). TheQMC rule using the polynomial lattice point set as integration nodes is then called a polyno-mial lattice rule. It will be convenient in this paper to assume that the modulus has the form p m ( x ) = x m . However, it is crucial to note that not every choice of the generating vector g yieldsa polynomial lattice point set that has good properties, in the sense that the integration error ofthe corresponding polynomial lattice rule is sufficiently low. On the contrary, it is usually highlynon-trivial to find good generating vectors of polynomial lattice rules, and there are (exceptfor special cases) no explicit constructions of such good generating vectors known. Hence, onehas to resort to computer search algorithms for finding generating vectors of polynomial latticepoint sets of high quality. Regarding the error measure, we consider in this paper the worst-casesetting, i.e., we consider a particular normed function space and the supremum of the integrationerror over the unit ball of the space.It is known that (ordinary) lattice rules are well suited for the numerical integration offunctions with pointwise convergent Fourier series (see again, e.g., [16] or [21]). On the otherhand, polynomial lattice rules are usually applied for the numerical integration of functions thatcan be represented by Walsh series (cf. [2, 4, 5]). We will therefore define a reproducing kernelHilbert space based on Walsh functions in Section 2.1, which will be considered throughout thepaper. The function space under consideration will be characterized by a smoothness parameter α (in some publications this parameter is also referred to as “digital smoothness parameter” inthe context of Walsh series). Indeed, the parameter α is linked to the speed of decay of theWalsh coefficients of the functions in our space, but there is also a connection to the number ofderivatives that exist for the elements of the space (we refer to [5] and the references therein fordetails).The function space considered here is closely related to other function spaces considered inthe literature, such as in [2, 4, 5]; indeed, results that we show for the space considered in thepresent paper immediately imply corresponding results for some of the Walsh spaces consideredin these references. Furthermore, our Hilbert space will be a “weighted” function space inthe sense of Sloan and Woźniakowski (cf. [23]). This means that we assign non-negative realnumbers (weights) to the coordinates, or groups of coordinates, of the integration problem, inorder to model the different influence of the coordinates on the problem. As pointed out in [23]and numerous other papers, this method is justified by practical high-dimensional problems inwhich different coordinates may indeed have a very different degree of influence on the value ofan integral. The weights will be incorporated in the inner product and norm of the functionspace in a suitable way. Using this setting, it is plausible that a nominally very high-dimensionalproblem may have a rather low “effective dimension”, i.e., only a certain, possibly small, part2f the components has a significant influence on the integration problem and the error made byapproximative algorithms. This may then yield situations where a curse of dimensionality canbe avoided.In the present paper, we will restrict ourselves, for technical reasons, to considering the mostcommon choice of weights, so-called product weights, but we suspect that the construction ofQMC rules presented here could also work for other choices of weights. We refer to Section 3.2for further comments on this question.The first efficient construction of good generating vectors of polynomial lattice point setswas done in [2]. In that paper, the authors considered the so-called component-by-component(CBC) approach, which is a greedy algorithm to construct one component of the generatingvector at a time. CBC algorithms were first considered for ordinary lattice point sets, withthe first examples in the literature going back to Korobov (cf. [11]), and later a rediscovery bySloan and Reztsov (cf. [22]). The fast CBC construction, which is due to Cools and Nuyens(see, e.g., [17–19]), makes the CBC construction computationally competitive and is currentlythe standard method to construct high-dimensional lattice point sets of good quality.It is wellknown (see, e.g., [2] and again [17]) that CBC constructions also work for the efficient searchfor generating vectors of polynomial lattice point sets; and also in this case, a fast algorithm isavailable.In the present paper, we present another, different algorithm to construct generating vec-tors of polynomial lattice point sets in an efficient way. This construction is also based on acomponent-by-component approach. However, as opposed to the CBC algorithms for polyno-mial lattice point sets currently available in the literature, our new approach constructs thesingle components of the generating vector g “digit-by-digit” and the used search criterion isindependent of the smoothness parameter α . Actually, the term “digit-by-digit” is based on asimilar approach that exists for ordinary lattice point sets (see [12, 13], and for similar results ina more up-to-date setting, [6]). In the context of polynomial lattice point sets, the generatingvector g consists of polynomials, so it would be more appropriate to speak of a “coefficient-by-coefficient” instead of a “digit-by-digit” construction. However, to stay consistent regarding thename of the method, and to avoid confusion with the “component-by-component” approach,we keep the name “digit-by-digit” construction also for polynomial lattice rules. In fact, thealgorithm which we will present in Section 3.2 contains two loops. An outer loop in which thedifferent components are constructed, and an inner loop in which the coefficients (digits) of eachcomponent of the generating vector are constructed. Both loops can be regarded as greedy, i.e.,choices that have been made in previous steps are kept fixed.We will show that the polynomial lattice rules obtained by our new construction method sat-isfy upper error bounds that are arbitrarily close to the optimal convergence rate. Furthermore,under suitable conditions on the coordinate weights, we can vanquish the curse of dimensionality,i.e., avoid exponential dependence of the error on the dimension d of the integration problem,or even obtain error bounds that are independent of the dimension.The rest of the paper is structured as follows. In Section 2, we introduce the functionspace setting as well as polynomial lattice rules, and analyze the corresponding worst-case errorexpression. In Section 3, we derive the component-by-component digit-by-digit (or, for short,CBC-DBD) construction algorithm for polynomial lattice rules and study the worst-case errorbehavior of the resulting integration rules. In Section 4, we show that the introduced construc-tion method can be implemented in a fast manner, competitive with state-of-the-art constructionalgorithms. Finally, the article is concluded in Section 5, where we illustrate our main resultsby numerical experiments.To conclude this introductory section, we fix some notation. In what follows, we denote theset of positive integers by N and the set of non-negative integers by N . To denote subsets ofcomponents, we use fraktur font, e.g., u ⊂ N and additionally write shorthand 1: d } := { , . . . , d } .For the projection of a vector x ∈ [0 , d or k ∈ N d onto the components in a set u ⊆ { d }
3e write x u = ( x j ) j ∈ u or k u = ( k j ) j ∈ u , respectively. With a slight abuse of notation, we willfrequently identify elements of the finite field F b of prime cardinality b with elements of thegroup of integers modulo b denoted by Z b . In this article we consider numerical integration of a sub-class of the square-integrable functions f ∈ L ([0 , d ) which can be represented in terms of their Walsh series. This particular seriesrepresentation of a function is based on the so-called Walsh functions, which are defined asfollows. Definition 1.
Let b ≥ k , we define the k -th Walshfunction b wal k : [0 , → C by b wal k ( x ) := e π i( κ ξ + κ ξ + ··· + κ a − ξ a ) /b with x ∈ [0 ,
1) and base b representations k = κ + κ b + · · · κ a − b a − and x = ξ b − + ξ b − + · · · (unique in the sense that infinitely many of the ξ i must be different from b −
1) with coefficients κ i , ξ i ∈ { , , . . . , b − } .For d ∈ N , an integer vector k = ( k , . . . , k d ) ∈ N d and x = ( x , . . . , x d ) ∈ [0 , d , we definethe k -th ( d -variate) Walsh function b wal k : [0 , d → C by b wal k ( x ) := d Y j =1 b wal k j ( x j ) . In the following, we will consider the base b ≥ b is prime), and then simply write wal k or wal k instead of b wal k or b wal k ,respectively. It is known (see, e.g., [5]) that the Walsh functions in any fixed base b form anorthonormal basis of L ([0 , d ).As indicated, we consider a class of square-integrable functions that can be represented interms of their Walsh series, that is, f ( x ) = X k ∈ N d ˆ f ( k ) wal k ( x ) with ˆ f ( k ) := Z [0 , d f ( x ) wal k ( x ) d x , (1)where we call ˆ f ( k ) the k -th Walsh coefficient of f .It is known from the literature on QMC methods in the past decades that it is advantageousto choose the integration nodes of a QMC rule such that there exists an efficient way of expressingthe integration error for elements in the function class under consideration. In the case wherethe integrand f can be represented in terms of Walsh series as in (1), it is common to considerquasi-Monte Carlo rules which are based on so-called digital nets and sequences. Digital ( t, m, d )-nets are point sets consisting of b m elements in [0 , d that satisfy certain regular distributionproperties, and were in their most general form introduced in [15] (see also [16]). These point setsare generated by using d generating matrices C , . . . , C d over a finite field or ring. In particular,for a digital ( t, m, d )-net P = { x , . . . , x b m − } ⊂ [0 , d constructed over Z b = { , , . . . , b − } with generating matrices C , . . . , C d ∈ Z m × mb the integration error of a QMC rule based on P takes a special form. It is commonly known, see, e.g., [3, Theorem 6.4], that approximating theintegral I d ( f ) of a d -variate function f using a QMC rule Q b m ,d ( f ; P ), that is, Q b m ,d ( f ) = Q b m ,d ( f ; P ) := 1 b m b m − X n =0 f ( x n ) ≈ Z [0 , d f ( x ) d x =: I d ( f ) , Q b m ,d ( f ; P ) − I d ( f ) = X = k ∈D ˆ f ( k ) (2)with the dual net D = D ( C , . . . , C d ) := { k ∈ N d | C ⊤ e tr m ( ~k ) + · · · + C ⊤ d e tr m ( ~k d ) = } , wherefor k ∈ N with base b expansion k = κ + κ b + · · · + κ a b a we define the vector e tr m ( ~k ) =( κ , κ , . . . , κ m − ) ∈ Z mb , and where we denote by the zero vector in Z mb . Equation (2) is aconsequence of the following character property of Walsh functions,1 b m b m − X n =0 wal k ( x n ) = ( , if C ⊤ e tr m ( ~k ) + · · · + C ⊤ d e tr m ( ~k d ) = , , otherwise.We will also use this property in the subsequent analysis. Based on the decay of the Walsh coefficients ˆ f ( k ) in (1) we will define a function space forthe integrands considered in this paper. As mentioned in the introduction, this space willbe equipped with weights to model the varying influence of the coordinates. To this end, let γ = ( γ j ) j ≥ be a non-increasing sequence of positive real numbers. The weights γ j will appearin the definition of the inner product and norm of the function space defined below. Intuitively,we can think of the weight γ j describing the degree of influence of the j -th variable on theintegration problem. Hence, we assume (w.l.o.g.) that the coordinates are ordered according totheir influence. It will also be convenient to define γ u := Y j ∈ u γ j for a subset u ⊆ { d } , and to additionally set γ ∅ to equal 1. The weights γ u are (for obviousreasons) called product weights. In the recent literature on QMC rules, also other types ofweights have been considered, but we will restrict ourselves to product weights here. We referto [3] for further information on this subject.For prime base b ≥ α >
1, we set ψ b ( k ) := ⌊ log b ( k ) ⌋ for k ∈ N and define the decay function r α : N → R by r α ( k ) = r α ( b, k ) := ( , if k = 0 ,b αψ b ( k ) , if k = 0 , with k ∈ N . It is also convenient to define the quantity µ b ( α ) := ∞ X k =1 ( r α ( k )) − = ∞ X a =0 b aα b a +1 − X k = b a ∞ X a =0 ( b − b a b aα = b α ( b − b α − b . For the multivariate case with dimension d ∈ N , integer vector k = ( k , . . . , k d ) ∈ N d , and asequence of weights γ = ( γ j ) j ≥ , we define the weighted decay functions r α ( k ) := d Y j =1 r α ( k j ) and r α, γ ( k ) := γ − k ) r α ( k ) = γ − k ) Y j ∈ supp( k ) b αψ b ( k j ) with supp( k ) := { j ∈ { d } | k j = 0 } . 5sing this decay function, we can estimate the integration error obtained in (2) by | Q b m ,d ( f ; P ) − I d ( f ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X = k ∈D ˆ f ( k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X = k ∈ N d ˆ f ( k ) r α, γ ( k ) ( r α, γ ( k )) − D ( k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ sup k ∈ N d | ˆ f ( k ) | r α, γ ( k ) X = k ∈D ( r α, γ ( k )) − (3)with D denoting the indicator function of the dual lattice D . Based on this estimate, we define,for real α > γ = ( γ j ) j ≥ , the weighted Walsh spaceas W αd, γ := { f ∈ L ([0 , d ) | k f k W αd, γ < ∞} with corresponding norm k·k W αd, γ given by k f k W αd, γ := sup k ∈ N d | ˆ f ( k ) | r α, γ ( k ) . (4) Remark . We remark that the definition of the norm implies that functions in W αd, γ have anabsolutely convergent Walsh series which converges pointwise (see, e.g., [5]). Remark . We would like to note here that in many recent papers (e.g., [2,4]), a slightly differentfunction space f W αd, γ based on Walsh functions has been studied. In f W αd, γ the norm is not givenas an ∞ -norm as in (4), but in the L -sense, i.e., k f k e W αd, γ := X k ∈ N d | ˆ f ( k ) | r α, γ ( k ) . This definition of the norm corresponds to alternatively applying Hölder’s inequality with p = q = 2 in the bound on the integration error that led to (3). As we will see below, the worst-caseerror expressions for W αd, γ and f W αd, γ are closely related to each other.In order to assess the quality of the QMC methods constructed later on, we will use theworst-case error in the weighted Walsh space as the error criterion. Indeed, the worst-case errorfor the QMC rule Q b m ,d ( · ; P ) in the space W αd, γ is defined as e b m ,d,α, γ ( P ) := sup f ∈ W αd, γ k f k Wαd, γ ≤ | I d ( f ) − Q b m ,d ( f ; P ) | . A useful formula for the worst-case error for ( t, m, d )-nets in the function space W αd, γ is given inthe following theorem. Theorem 1.
Let m, d ∈ N , α > , b ≥ , and a sequence of positive weights γ = ( γ j ) j ≥ begiven. Then the worst-case error e b m ,d,α, γ ( P ) of the QMC rule Q b m ,d ( · ; P ) based on the digital ( t, m, d ) -net P = { x , . . . , x b m − } with generating matrices C , . . . , C d in the space W αd, γ satisfies e b m ,d,α, γ ( P ) = X = k ∈D ( r α, γ ( k )) − . (5) Proof.
Recalling the definition of the worst-case error of the QMC rule Q b m ,d ( · ; P ), the combi-nation of (3) and the definition of k · k W αd, γ leads to the estimate e b m ,d,α, γ ( P ) ≤ sup f ∈ W αd, γ k f k Wαd, γ ≤ k f k W αd, γ X = k ∈D ( r α, γ ( k )) − ≤ X = k ∈D ( r α, γ ( k )) − . f with Walsh coefficients ˆ f ( k ) = ( r α, γ ( k )) − has norm k f k W αd, γ =1 and that its integration error equals Q b m ,d ( f , P ) − I d ( f ) = X = k ∈D ( r α, γ ( k )) − , we obtain that the previous upper bound is attained such that the claimed identity follows. Remark . Returning to the alternative Walsh space f W αd, γ once again, it is known from [4] thatthe worst-case error in this space equals X = k ∈D ( r α, γ ( k )) − / , which is just the square root of the worst-case error in W αd, γ , as outlined in Theorem 1. Therefore,we see that the worst-case errors in these Walsh spaces are intimately related to each other, andall results shown here for W αd, γ immediately yield corresponding results for f W αd, γ . While Theorem 1 is a very useful result, the question of how to find and construct ( t, m, d )-netswith a low integration error for practical purposes remains. One of the most powerful waysof obtaining nets is to consider a special case, namely so-called polynomial lattice point sets,as introduced by Niederreiter in [15]. The name “polynomial lattice point sets” is due to thefact that the structure of polynomial lattice point sets is similar to that of ordinary latticepoint sets as introduced by Korobov [10] and Hlawka [9]. However, while lattice point setsare based on integer arithmetic, polynomial lattice point sets are obtained by using polynomialarithmetic over finite fields. We also point out that there are nowadays variants of polynomiallattice point sets which are especially suited for integrating functions with higher smoothness(see, e.g., [5]). However, we will not consider higher order polynomial lattices here, but restrictourselves to the more classical construction scheme. We point out that polynomial lattice pointsets are actually a special case of so-called digital ( t, m, d )-nets, which can be constructed usinggenerating matrices C , . . . , C d over a finite field. For our purposes, though, it is more convenientto define these point sets in an alternative way. Before we give the precise definition, we needto introduce some notation.Let F b (( x − )) be the field of formal Laurent series over F b with elements of the form L = ∞ X ℓ = w t ℓ x − ℓ , where w is an arbitrary integer and all t ℓ ∈ F b . We further denote by F b [ x ] the set of allpolynomials over F b and define the map v m : F b (( x − )) → [0 ,
1) by v m ∞ X ℓ = w t ℓ x − ℓ ! = m X ℓ =max(1 ,w ) t ℓ b − ℓ . There is a close connection between the base b expansions of natural numbers and the polynomialring F b [ x ]. For n ∈ N with base b expansion n = n + n b + · · · + n a b a , we associate n with thepolynomial n ( x ) := a X k =0 n k x k ∈ F b [ x ] . The definition of a polynomial lattice point set is then given as follows. We note that here andin the following we consider the zero polynomial to have degree −∞ , hence the case n = 0 isincluded in the following definition. 7 efinition 2 (Polynomial lattice) . Let b be prime and let m, d ∈ N be given. Furthermore,choose p ∈ F b [ x ] with deg( p ) = m , and let g , . . . , g d ∈ F b [ x ]. Then the point set P ( g , p ), definedas the collection of the b m points x n := (cid:18) v m (cid:18) n ( x ) g ( x ) p ( x ) (cid:19) , . . . , v m (cid:18) n ( x ) g d ( x ) p ( x ) (cid:19)(cid:19) ∈ [0 , d for n ∈ F b [ x ] with deg( n ) < m , is called a polynomial lattice point set (we sometimes also referto the point set as polynomial lattice for short), where the vector g = ( g , . . . , g d ) ∈ ( F b [ x ]) d iscalled the generating vector.As pointed out above, due to the construction principle and the similarities to the construc-tion of (rank-1) lattices, P ( g , p ) is often called a (rank-1) polynomial lattice and a QMC ruleusing the point set P ( g , p ) is referred to as a polynomial lattice rule (modulo p ). Furthermore,note that one can restrict the choice of the components g j of g to the sets G b,m := { g ∈ F b [ x ] | deg( g ) < m } or G ∗ b,m := { g ∈ F b [ x ] \ { } | deg( g ) < m } . We also add that it is known from the literature on polynomial lattice point sets that it isdesirable to have gcd( g j , p ) = 1 for the components g j of g , as this guarantees certain regularityproperties. For prime b , the generating matrices C , . . . , C d ∈ F m × mb of a polynomial latticepoint set P ( g , p ) can be obtained from the generating vector g and p , cf. [5, Theorem 10.5]. Itthen follows that the dual net D ( g , p ) of a polynomial lattice with generating vector g , modulus p with deg( p ) = m , and generating matrices C , . . . , C d equals (see, e.g., [16, Lemma 4.40]) D ( g , p ) = { k ∈ N d | C ⊤ e tr m ( ~k ) + · · · + C ⊤ d e tr m ( ~k d ) = } = { k ∈ N d | tr m ( k ) · g ≡ p ) } , where for two vectors u , v ∈ ( F b [ x ]) d we define the vector dot product u · v = P dj =1 u j v j .Furthermore, for k ∈ N with b -adic expansion k = κ + κ b + · · · + κ a − b a − , we define thetruncation map tr m : N → G b,m viatr m ( k ) := κ + κ x + · · · + κ m − x m − , where we consider κ j as 0 if j ≥ a . If we apply tr m to a d -dimensional vector, we define its d -variate generalization tr m ( k ) to be applied componentwise. Furthermore, for a subset u ⊆ { d } we introduce the notation D u = D u ( g , p ) = D u ( g u ) := { k u ∈ N | u | | tr m ( k u ) · g u ≡ p ) } . Due to the obtained equivalence for the dual net of a polynomial lattice, the result in Theorem1 also applies to polynomial lattice rules with D ( C , . . . , C d ) replaced by D ( g , p ). Furthermore,we will henceforth denote the worst-case error of a QMC rule based on the polynomial latticepoint set P ( g , p ) in the space W αd, γ by e b m ,d,α, γ ( g ). In this section we introduce an alternative quality measure which, opposed to the worst-caseerror expression e b m ,d,α, γ in (5) is independent of the parameter α .For α ≥
1, given weight sequence γ = ( γ j ) j ≥ , m ∈ N , modulus p ∈ F b [ x ] with deg( p ) = m ,and g ∈ ( F b [ x ]) d , we define the quantities T γ ( g , p ) := X = k ∈ A p ( g ) ( r , γ ( k )) − , T α, γ ( g , p ) := X = k ∈ A p ( g ) ( r α, γ ( k )) − (6)with index set given by A p ( g ) := { k ∈ { , , . . . , b m − } d | k ∈ D ( g , p ) } . ∅ 6 = u ⊆ { d } , we introduce the sets A u = A p, u ( g u ) = A p, u ( g ) := { k u ∈ { , , . . . , b m − } | u | | k u ∈ D u ( g , p ) } ,A ∗ u = A ∗ p, u ( g u ) = A ∗ p, u ( g ) := { k u ∈ { , . . . , b m − } | u | | k u ∈ D u ( g , p ) } , and for a polynomial p ∈ F b [ x ] define the indicator function δ p : F b [ x ] → { , } by δ p ( q ) := ( , if q ≡ p ) , , if q p ) . In the following proposition we estimate the difference between the worst-case error e b m ,d,α, γ ( g )and the truncated quality measure T α, γ ( g , p ) of a polynomial lattice rule with generator g andmodulus p ∈ F b [ x ] with deg( p ) = m . Proposition 1.
Let γ = ( γ j ) j ≥ be a sequence of positive weights, let p ∈ F b [ x ] with deg( p ) = m ,and let g = ( g , . . . , g d ) ∈ G db,m such that gcd( g j , p ) = 1 for all j = 1 , . . . , d . Then, for any α > and N = b m , we have e b m ,d,α, γ ( g ) − T α, γ ( g , p ) ≤ N α X ∅6 = u ⊆{ d } γ u (2 µ b ( α )) | u | . Proof.
For a non-empty subset ∅ 6 = u ⊆ { d } and i ∈ { d } , we write for short k u \{ i } ∈ N | u |− and g u \{ i } ∈ G | u |− b,m to denote the projections on the components in u \ { i } . The difference canthen be rewritten as e b m ,d,α, γ ( g ) − T α, γ ( g , p ) = X ∅6 = u ⊆{ d } X k u ∈D u ( g u ) ( r α, γ ( k u )) − − X k u ∈ A ∗ p, u ( g u ) ( r α, γ ( k u )) − , motivating us to define the quantity S α, γ , u := X k u ∈D u ( g u ) ( r α, γ ( k u )) − − X k u ∈ A ∗ p, u ( g u ) ( r α, γ ( k u )) − for ∅ 6 = u ⊆ { d } . In the following we distinguish two cases. Case 1:
Suppose that | u | = 1 such that u = { j } for some j ∈ { d } . Then, we have S α, γ , { j } = X k ∈ N tr m ( k ) g j ≡ p ) ( r α,γ j ( k )) − − X k ∈{ ,...,b m − } tr m ( k ) g j ≡ p ) ( r α,γ j ( k )) − = X k ≥ b m tr m ( k ) g j ≡ p ) ( r α,γ j ( k )) − . Note that tr m ( k ) g j ≡ p ) if and only if there is a c ∈ F b [ x ] such that tr m ( k ) g j = cp andthus, since gcd( g j , p ) = 1, we have that tr m ( k ) = ap for some a ∈ F b [ x ]. But deg(tr m ( k )) < m while deg( p ) = m , which implies that tr m ( k ) = 0 and thus k = t b m for some t ∈ N . This yields S α, γ , { j } = ∞ X t =1 ( r α,γ j ( t b m )) − = γ j ∞ X t =1 b − α ⌊ log b t b m ⌋ = γ j ∞ X t =1 b − α ⌊ m +log b t ⌋ = γ j ∞ X t =1 b − αm b − α ⌊ log b t ⌋ = γ j b αm ∞ X t =1 b − αψ b ( t ) = γ j µ b ( α ) b αm . ase 2: Suppose that | u | ≥
2. In this case, we find that S α, γ , u ≤ X i ∈ u X k u \{ i } ∈ N | u |− X k i ≥ b m δ p (tr m ( k i ) g i + tr m ( k u \{ i } ) · g u \{ i } ) r α, γ ( k u ) . Then, for k u \{ i } ∈ N | u |− , we write q = tr m ( k u \{ i } ) · g u \{ i } , and estimate the expression X k i ≥ b m δ p (tr m ( k i ) g i + q ) r α, γ ( k u ) = γ u X k i ≥ b m δ p (tr m ( k i ) g i + q ) Q j ∈ u b α ⌊ log b k j ⌋ = γ u Y j ∈ u j = i b − α ⌊ log b k j ⌋ X k i ≥ b m δ p (tr m ( k i ) g i + q ) b α ⌊ log b k i ⌋ = γ u Y j ∈ u j = i b − α ⌊ log b k j ⌋ ∞ X t =1 ( t +1) b m − X k i = tb m δ p (tr m ( k i ) g i + q ) b α ⌊ log b k i ⌋ ≤ γ u Y j ∈ u j = i b − α ⌊ log b k j ⌋ ∞ X t =1 b − α ⌊ log b tb m ⌋ ( t +1) b m − X k i = tb m δ p (tr m ( k i ) g i + q ) | {z } =1 = γ u Y j ∈ u j = i b − α ⌊ log b k j ⌋ ∞ X t =1 b − α ⌊ m +log b t ⌋ = γ u µ b ( α ) b αm Y j ∈ u j = i b − α ⌊ log b k j ⌋ , where the penultimate equality follows since if gcd( g i , p ) = 1 then for each t and each q ∈ F b [ x ]there exists exactly one k ∈ { tb m , . . . , ( t + 1) b m − } such that tr m ( k ) g i + q ≡ p ).Hence, we can estimate S α, γ , u , for | u | ≥
2, by S α, γ , u ≤ X i ∈ u X k u \{ i } ∈ N | u |− γ u µ b ( α ) b αm Y j ∈ u j = i b − α ⌊ log b k j ⌋ = γ u µ b ( α ) b αm X i ∈ u ∞ X k =1 b − α ⌊ log b k ⌋ ! | u |− = γ u µ b ( α ) b αm X i ∈ u µ b ( α ) | u |− = γ u µ b ( α ) | u | b αm | u | ≤ γ u N α (2 µ b ( α )) | u | . In summary, we obtain, using the results for both cases from above, X ∅6 = u ⊆{ d } S α, γ , u ≤ N α X ∅6 = u ⊆{ d } γ u (2 µ b ( α )) | u | , which is the claimed upper estimate.Based on the previous result, it is straightforward to show the existence of good polynomiallattice rules with respect to the worst-case error in the weighted Walsh space, if one assumesthe modulus p to be irreducible. We omit the proof, which uses standard methods. Theorem 2.
Let p ∈ F b [ x ] be an irreducible polynomial with deg( p ) = m , let N = b m , and let γ = ( γ j ) j ≥ be positive weights. Then there exists a g ∈ G db,m such that, for all α > , theworst-case error e b m ,d,α, γ ( g ) satisfies e b m ,d,α, γ ( g ) ≤ N α X ∅6 = u ⊆{ d } γ u (2 µ b ( α )) | u | + X ∅6 = u ⊆{ d } γ /α u ( m ( b − | u | α . p , we will assume that p has the special form p ( x ) = p m ( x ) = x m , and showa constructive approach to find generating vectors of good polynomial lattice rules. This will bethe main result of our paper, which is stated in Theorem 7. In this section, we formulate and analyze a method for the construction of good polynomiallattice rules. In contrast to the existence result in Theorem 2, our construction method yieldspolynomial lattice rules with modulus p ( x ) = x m . At first, we prove some auxiliary statementswhich will be needed in the further analysis. We consider the following Walsh series for x ∈ (0 , r , ∞ X k =0 wal k ( x ) r ( k ) = 1 + ∞ X k =1 e π i( κ ξ + κ ξ + ··· ) /b b ⌊ log b ( k ) ⌋ , which, as we will see, is closely related to our quality criterion T γ introduced in (6). To this end,we define, for n ∈ N , the n -th Walsh–Dirichlet kernel by D n ( x ) = n − X k =0 wal k ( x ) . From [5, Lemma A.17] it then follows that, for x ∈ (0 , D b t ( x ) = ( b t , if x ∈ (0 , b t ) , , if x ∈ [ b t , . (7)We can then prove the following identity. Lemma 1.
For base b ≥ , the Walsh series of − ( b − ⌊ log b ( x ) ⌋ + 1) equals, pointwise for x ∈ (0 , , − ( b − ⌊ log b ( x ) ⌋ + 1) = 1 + ∞ X k =1 e π i( κ ξ + κ ξ + ··· ) /b b ⌊ log b ( k ) ⌋ = ∞ X k =0 wal k ( x ) r ( k ) . Proof.
Using the definition of the Walsh-Dirichlet kernel, we obtain ∞ X k =1 wal k ( x ) b ⌊ log b ( k ) ⌋ = ∞ X t =1 b t − X k = b t − wal k ( x ) b t − = ∞ X t =1 D b t ( x ) − D b t − ( x ) b t − , and from (7) we find that for t ≥ D b t ( x ) − D b t − ( x ) = ( b − b t − , if x ∈ (cid:16) , b t (cid:17) , − b t − , if x ∈ h b t , b t − (cid:17) , , if x ∈ h b t − , (cid:17) . (8)11pplied inductively, the relation in (8) yields that for x ∈ [ b t , b t − ) we have ∞ X ℓ =1 D b ℓ ( x ) − D b ℓ − ( x ) b ℓ − = t − X ℓ =1 ( b − b ℓ − b ℓ − − b t − b t − = ( t − b − − t ≥
1, which is equivalent to1 + ∞ X k =1 wal k ( x ) b ⌊ log b ( k ) ⌋ = ( b − t −
1) = − ( b − − t + 1) = − ( b − ⌊ log b ( x ) ⌋ + 1)for x ∈ [ b t , b t − ) and for all t ∈ N . This proves the claimed identity.Based on the previous result in Lemma 1, we show that the function − ( b − ⌊ log b ( x ) ⌋ + 1)can be written in terms of its truncated Walsh series with uniformly bounded remainder term. Lemma 2.
Let N = b m with m ∈ N and base b ≥ . Then for any x ∈ (0 , there exists a τ = τ ( x ) ∈ R with | τ ( x ) | < bb − such that − ( b − ⌊ log b ( x ) ⌋ + 1) = N − X k =0 wal k ( x ) r ( k ) + τ ( x ) N x . (9)
Proof.
The expansion in Lemma 1 allows us to write − ( b − ⌊ log b ( x ) ⌋ + 1) = N − X k =0 wal k ( x ) r ( k ) + R N ( x ) , where the remainder R N ( x ) has the form R N ( x ) = ∞ X k = b m wal k ( x ) r ( k ) = ∞ X k = b m wal k ( x ) b ⌊ log b ( k ) ⌋ = ∞ X t = m D b t +1 ( x ) − D b t ( x ) b t . From (8) we then see that the following inequality holds, | D b t +1 ( x ) − D b t ( x ) | < x , t ∈ N , x ∈ (0 , , and thus we obtain | R N ( x ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∞ X t = m D b t +1 ( x ) − D b t ( x ) b t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < x ∞ X t = m b t = b ( b − b m x = b ( b − N x , which implies the existence of a τ ( x ) ∈ R with | τ ( x ) | < bb − such that the identity (9) holds. Remark . Using a more involved argument, the result in Lemma 2 can also be extended togeneral N ∈ N . In particular, we obtain that for any x ∈ (0 ,
1) there exists a τ = τ ( x ) ∈ R suchthat − ( b − ⌊ log b ( x ) ⌋ + 1) = N − X k =0 wal k ( x ) r ( k ) + τN x with | τ | < b (cid:16) b − + 2 (cid:17) for b = 2 and with | τ | < b (cid:16) b − + 2 b (cid:17) for b > emma 3. For j ∈ { d } , let u j , v j , and ρ j be real numbers satisfying ( a ) u j = v j + ρ j , ( b ) | u j | ≤ ¯ u j , ( c ) ¯ u j ≥ , for all j ∈ { d } . Then, for any subset ∅ 6 = u ⊆ { d } there exists a θ u with | θ u | ≤ such that Y j ∈ u u j = Y j ∈ u v j + θ u Y j ∈ u (¯ u j + | ρ j | ) X j ∈ u | ρ j | . Furthermore, we recall the character property of Walsh functions for polynomial lattice ruleswith prime base b . Let P ( g , p ) = { x , . . . , x b m − } be a polynomial lattice with generating vector g ∈ ( F b [ x ]) d and modulus p ∈ F b [ x ] with deg( p ) = m . Then, for any integer vector k ∈ N d thefollowing identity holds,1 b m b m − X n =0 wal k ( x n ) = δ p (tr m ( k ) · g ) = ( , if tr m ( k ) · g ≡ p ) , , otherwise . (10)We remark that an analogous result to (10) also holds if we only consider projections of thepolynomial lattice and the generating vector onto a non-empty subset of { d } , as also theprojection of a polynomial lattice is a polynomial lattice that is generated by the correspondingprojection of the generating vector.We now state an auxiliary result that will be useful at several instances in this paper. Lemma 4.
Let P ( g , p ) be a polynomial lattice with modulus p ∈ F b [ x ] with deg( p ) = m andgenerating vector g = ( g , . . . , g d ) ∈ ( F b [ x ]) d such that gcd( g j , p ) = 1 for ≤ j ≤ d . Then eachone-dimensional projection of P ( g , p ) is the full grid (cid:26) , b m , . . . , b m − b m (cid:27) , and in particular the projection of the point with index is always .Proof. The result follows from Definition 2 and [5, Remark 10.3].Additionally, we will need the following result.
Lemma 5.
Let P ( g , p ) = { x , . . . , x b m − } be a polynomial lattice point set with modulus p ∈ F b [ x ] with deg( p ) = m and generating vector g ∈ ( F b [ x ]) d such that gcd( g j , p ) = 1 for ≤ j ≤ d .Furthermore, let m ≥ . For a point x n with n ∈ { , , . . . , b m − } , we denote its coordinatesvia x n = ( x n, , . . . , x n,d ) . Then, for any j ∈ { , . . . , d } , it is true that b m b m − X n =1 x n,j < m ln b ≤ m ( b − . Proof.
We recall that the point set P ( g , p ) is defined as the collection of the b m points of theform x n = (cid:18) v m (cid:18) n ( x ) g ( x ) p ( x ) (cid:19) , . . . , v m (cid:18) n ( x ) g d ( x ) p ( x ) (cid:19)(cid:19) for n ∈ F b [ x ] with deg( n ) < m . Due to Lemma 4 we know that { x ,j , . . . , x b m − ,j } equals theset n b m , . . . , b m − b m o for each j ∈ { , . . . , d } . Thus we can estimate1 b m b m − X n =1 x n,j = 1 b m b m − X n =1 b m n = b m − X n =1 n ≤ Z b m − x d x = 1 + ln( b m − < b m ) = 1 + m ln b ≤ m ( b − , which yields the claimed result, where the last estimate follows from the assumption m ≥ .2 The CBC-DBD construction algorithm We are now ready to study the component-by-component digit-by-digit (CBC-DBD) construc-tion for polynomial lattice rules, see also [6], where such an algorithm was analyzed for ordinarylattice rules. In particular, we will assume throughout this section that our modulus polynomialis of the form p m ( x ) = x m for m ∈ N .Concerning the weights, the algorithm can, as indicated in our main result (Theorem 7), berun with respect to the weights γ /α = ( γ /αj ) j ≥ to obtain a polynomial lattice rule that yieldsa low worst-case error in the Walsh space W αd, γ , or, alternatively, with respect to the weights γ to obtain good polynomial lattice rules in the space W αd, γ α . In the latter case, the constructionalgorithm is independent of the smoothness parameter α and we obtain worst-case error boundsthat hold for all α > η insteadof γ and outline the algorithm based on η . In Theorem 7, we will then choose η equal to γ /α or γ , respectively. For technical reasons, it will be necessary to assume that the positive weights η are of product structure, that is, η u = Y j ∈ u η j for u ⊆ { d } , with a sequence of positive reals ( η j ) j ≥ . However, we point out that the followingtheorem, which is crucial for the proposed construction method, also holds for general weights η = ( η u ) u ⊆{ d } . Theorem 3.
Let b be prime, let m, d ∈ N with m ≥ , let p m ( x ) = x m ∈ F b [ x ] , and let η = ( η u ) u ⊆{ d } be positive weights with η ∅ = 1 . Furthermore, let g = ( g , . . . , g d ) ∈ ( F b [ x ]) d with deg( g j ) < m and gcd( g j , p m ) = 1 for ≤ j ≤ d . Then, T η ( g , p m ) ≤ b m H d,m, η ( g ) − X ∅6 = u ⊆{ d } η u + X ∅6 = u ⊆{ d } η u b m (( b − m + 1) | u | + X ∅6 = u ⊆{ d } η u b m ( b m | u | ) (cid:18) ( b − m + bb − (cid:19) | u | , where we define the function H d,m, η : ( F b [ x ]) d → R as H d,m, η ( g ) := X ∅6 = u ⊆{ d } η u (1 − b ) | u | b m − X n =1 Y j ∈ u (cid:18)(cid:22) log b (cid:18) v m (cid:18) n ( x ) g j ( x ) x m (cid:19)(cid:19)(cid:23) + 1 (cid:19) . (11) Proof.
We use the character property of Walsh functions in (10) to rewrite T η ( g , p m ) with thehelp of the identity in Lemma 2. First, we recall that for k ∈ N we have r ( k ) = r ( b, k ) = ( , for k = 0 ,b ⌊ log b ( k ) ⌋ , for k = 0 . Using this definition, we obtain that T η ( g , p m ) = X ∅6 = u ⊆{ d } η u X k u ∈{ ,...,b m − } | u | δ p m (tr m ( k u ) · g u ) Q j ∈ u r ( k j ) ≤ X ∅6 = u ⊆{ d } η u X = k u ∈{ ,...,b m − } | u | δ p m (tr m ( k u ) · g u ) Q j ∈ u r ( k j )= X ∅6 = u ⊆{ d } η u b m b m − X n =0 X k u ∈{ , ,...,b m − } | u | wal k u ( x n, u ) Q j ∈ u r ( k j ) − X ∅6 = u ⊆{ d } η u b m X k u ∈{ ,...,b m − } | u | Q j ∈ u r ( k j ) + b m − X n =1 Y j ∈ u b m − X k =1 wal k ( x n,j ) b ⌊ log b ( k ) ⌋ ! − X ∅6 = u ⊆{ d } η u = X ∅6 = u ⊆{ d } η u b m X k u ∈{ ,...,b m − } | u | Q j ∈ u r ( k j ) + b m − X n =1 Y j ∈ u v j ( n ) − Y j ∈ u u j ( n ) + Y j ∈ u u j ( n ) − X ∅6 = u ⊆{ d } η u = X ∅6 = u ⊆{ d } η u b m X k u ∈{ , ,...,b m − } | u | Q j ∈ u r ( k j ) + X ∅6 = u ⊆{ d } η u b m b m − X n =1 Y j ∈ u u j ( n ) − X ∅6 = u ⊆{ d } η u b m b m − X n =1 θ u ( n ) Y j ∈ u (¯ u j + | ρ j ( n ) | ) X j ∈ u | ρ j ( n ) | − X ∅6 = u ⊆{ d } η u , (12)where we used Lemma 3 with u j = u j ( n ) := − ( b − ⌊ log b ( x n,j ) ⌋ + 1) , ¯ u j = ¯ u j ( n ) := ( b − m,v j = v j ( n ) := 1 + b m − X k =1 wal k ( x n,j ) b ⌊ log b ( k ) ⌋ , ρ j = ρ j ( n ) := τ j ( n ) x n,j b m , and all | θ u ( n ) | ≤ | τ j ( n ) | < bb − . Due to Lemma 2, Condition (a) of Lemma 3 is fulfilled.Furthermore, we see that for p m ( x ) = x m we have for each j ∈ { d } that x n,j = v m (cid:18) n ( x ) g j ( x ) p m ( x ) (cid:19) ≥ v m (cid:18) x m (cid:19) = b − m for every 1 ≤ n < b m , and so | u j ( n ) | ≤ − ( b − (cid:4) log b ( b − m ) (cid:5) + 1) = − ( b − − m + 1) < ( b − m = ¯ u j with ¯ u j ≥ X ∅6 = u ⊆{ d } η u b m X k u ∈{ , ,...,b m − } | u | Q j ∈ u r ( k j ) = 1 b m X ∅6 = u ⊆{ d } η u b m − X k =1 b ⌊ log b ( k ) ⌋ ! | u | = 1 b m X ∅6 = u ⊆{ d } η u m − X t =0 b t +1 − X k = b t b ⌊ log b ( k ) ⌋ | u | = 1 b m X ∅6 = u ⊆{ d } η u (( b − m + 1) | u | , while the third sum in (12) can be bounded by − X ∅6 = u ⊆{ d } η u b m b m − X n =1 θ u ( n ) Y j ∈ u (¯ u j + | ρ j ( n ) | ) X j ∈ u | ρ j ( n ) | = − X ∅6 = u ⊆{ d } η u b m b m − X n =1 θ u ( n ) Y j ∈ u ( b − m + | τ j ( n ) | x n,j b m ! X j ∈ u | τ j ( n ) | x n,j b m ≤ X ∅6 = u ⊆{ d } η u b m b m − X n =1 | θ u ( n ) | Y j ∈ u (cid:18) ( b − m + bb − (cid:19) X j ∈ u | τ j ( n ) | x n,j b m ≤ X ∅6 = u ⊆{ d } η u b m Y j ∈ u (cid:18) ( b − m + bb − (cid:19) X j ∈ u b m − X n =1 b ( b − b m x n,j X ∅6 = u ⊆{ d } η u b m Y j ∈ u (cid:18) ( b − m + bb − (cid:19) bb − X j ∈ u m ( b − b m X ∅6 = u ⊆{ d } η u ( b m | u | ) (cid:18) ( b − m + bb − (cid:19) | u | , where we used Lemma 5 and the fact that x n,j ≥ b − m for each j and all 1 ≤ n < b m . Combiningthese results with (12) yields the claimed result.Theorem 3 implies that it essentially suffices to find a generating vector g ∈ ( F b [ x ]) d suchthat H d,m, η ( g ) is small, which then implies that also a good bound on T η ( g , p m ) holds. We willtherefore consider the quantity H d,m, η as a search criterion for good generating vectors.At first, we prove the following result which will be needed in the further analysis and remindthe reader that by p m we denote the polynomial p m ∈ F b [ x ] with p m ( x ) = x m for m ∈ N . Lemma 6.
Let a prime b , an integer t ≥ , and polynomials ℓ, q ∈ F b [ x ] with gcd( ℓ, p ) =gcd( q, p ) = 1 be given. Then the following identity holds: X g ∈ F b $ log b v t ℓ ( x ) ( q ( x ) + x t − g ) x t !!% + 1 ! = (cid:22) log b (cid:18) v t − (cid:18) ℓ ( x ) q ( x ) x t − (cid:19)(cid:19)(cid:23) . Proof.
Assume that the product of the polynomials ℓ and q is given by ℓ ( x ) q ( x ) = r X i =0 a i x i with a , a r = 0 . Let, furthermore, ℓ ( x ) = v X k =0 ℓ k x k , where we note that v ≤ r . Hence, we obtain that for g ∈ F b ℓ ( x ) ( q ( x ) + x t − g ) x t = r X i = t a i x i − t + v X k =1 ℓ k gx k − + ( a t − + ℓ g ) x − + t − X i =0 a i x i − t and thus we have that if a t − + ℓ g b ), then $ log b v t ℓ ( x ) ( q ( x ) + x t − g ) x t !!% + 1 = $ log b a t − + ℓ gb + t − X i =0 a i b i − t !% + 1 = − . Otherwise, if a t − + ℓ g ≡ b ), then v t ℓ ( x ) ( q ( x ) + x t − g ) x t ! = t − X i =0 a i b i − t = t X i =2 a t − i b − i = 1 b t − X i =1 a t − i − b − i ! = 1 b v t − (cid:18) ℓ ( x ) q ( x ) x t − (cid:19) , and therefore $ log b v t ℓ ( x ) ( q ( x ) + x t − g ) x t !!% + 1 = (cid:22) log b (cid:18) b v t − (cid:18) ℓ ( x ) q ( x ) x t − (cid:19)(cid:19)(cid:23) + 1= (cid:22) log b (cid:18) v t − (cid:18) ℓ ( x ) q ( x ) x t − (cid:19)(cid:19)(cid:23) . Observing that there exists exactly one g ∈ F b for which a t − + ℓ g ≡ b ) and combiningthe two cases considered, we immediately obtain the claimed identity.16ith the help of Lemma 6 we can prove the following result which motivates the choice ofour quality function for Algorithm 1. Lemma 7.
For integers m ∈ N and w ∈ { , . . . , m } , let b be prime, g ∈ F b , and g ∈ ( F b [ x ]) d with gcd( g j , p ) = 1 for all ≤ j ≤ d , where g d ∈ G b,w − , and let η = ( η u ) u ⊆{ d } be positiveweights with η ∅ = 1 . Then the average of H d,m, η with respect to the choices for extending thedegree of g d + g p w − up to m equals b m − w X ¯ g ∈ G b,m − w H d,m, η ( g , . . . , g d − , g d + g p w − + ¯ g p w )= m X t = w b t − X ℓ =1 ℓ b ) η d (1 − b ) b w − t $ log b v w ℓ ( x ) ( g d ( x ) + g x w − ) x w !!% + 1 ! ×× d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) + S m,w, η ( g ) − ( b m − , (13) where the term S m,w, η ( g ) , which does not depend on g and ¯ g , is given by S m,w, η ( g ) = w − X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) + m X t = w b t − X ℓ =1 ℓ b ) d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) (cid:16) η d (1 − b w − t ) (cid:17) . Proof.
For product weights η u = Q j ∈ u η j and e g = ( e g , . . . , e g d ) ∈ ( F b [ x ]) d , the quantity H d,m, η ( e g )defined in (11) equals H d,m, η ( e g ) = b m − X n =1 d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v m (cid:18) n ( x ) e g j ( x ) x m (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − . We define ¯ H d,m, η ( e g ) := H d,m, η ( e g ) + ( b m −
1) which in turn can be rewritten as¯ H d,m, η ( e g ) = m X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) . Setting e g d = g d + g p w − + ¯ g p w with ¯ g ∈ G b,m − w and e g j = g j for j ∈ { d − } , we can write1 b m − w X ¯ g ∈ G b,m − w H d,m, η ( g , . . . , g d − , g d + g p w − + ¯ g p w ) = 1 b m − w X ¯ g ∈ G b,m − w ¯ H d,m, η ( e g ) − ( b m − b m − w X ¯ g ∈ G b,m − w m X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − b m − w X ¯ g ∈ G b,m − w w − X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − b m − w X ¯ g ∈ G b,m − w m X t = w b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) . − ( b m −
1) in (13) is therefore accounted for. What is more, by the definition of v t wehave for any q ∈ F b [ x ] that v t (cid:18) q ( x ) x t (cid:19) = v t q ( x ) mod x t x t ! , (14)and hence 1 b m − w X ¯ g ∈ G b,m − w w − X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) = w − X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) , which is the first sum in S m,w, η , and, in particular, is independent of g and all ¯ g ∈ G b,m − w .The second sum in S m,w, η and all remaining terms in identity (13) are obtained by considering1 b m − w X ¯ g ∈ G b,m − w m X t = w b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) e g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) = m X t = w b t − X ℓ =1 ℓ b ) d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) ×× η d (1 − b ) 1 b m − w X ¯ g ∈ G b,m − w $ log b v t ℓ ( x ) ( g d ( x ) + g x w − + ¯ g ( x ) x w ) x t !!% + 1 ! , (15)such that, with the help of (14) and under the repeated use of Lemma 6, we obtain for each t ∈ { w + 1 , . . . , m } that X ¯ g ∈ G b,m − w $ log b v t ℓ ( x ) ( g d ( x ) + g x w − + ¯ g ( x ) x w ) x t !!% + 1 ! = b m − t X ¯ g ∈ G b,t − w $ log b v t ℓ ( x ) ( g d ( x ) + g x w − + ¯ g ( x ) x w ) x t !!% + 1 ! = b m − t X ¯ g ∈ G b,t − w − $ log b v t − ℓ ( x ) ( g d ( x ) + g x w − + ¯ g ( x ) x w ) x t − !!% + 1 − ! = b m − t $ log b v w ℓ ( x ) ( g d ( x ) + g x w − ) x w !!% + 1 ! − b m t X r = w +1 b − r = b m − w b w − t $ log b v w ℓ ( x ) ( g d ( x ) + g x w − ) x w !!% + 1 ! − b m − w − b w − t b − ! . Combining this with the identity in (15) yields the remaining term of S m,w, η and the first termin (13) such that the claimed result is proved.We note that only the first term of (13) in Lemma 7 depends on the ( w − gx w − of g d . Therefore, we can introduce the quality function for our algorithm which is basedon the first term of (13), yet slightly adjusted by an additional summand that is independent of g and ¯ g . 18 efinition 3. (Digit-wise quality function) Let q ∈ F b [ x ], with prime b , let m, d ∈ N , andlet η = ( η u ) u ⊆{ d } , where η u = Q j ∈ u η j with positive reals ( η j ) j ≥ , be product weights. Forintegers w ∈ { m } , r ∈ { d } , and polynomials g , . . . , g r − ∈ F b [ x ] with gcd( g j , p ) = 1 for j = 1 , . . . , r −
1, we define the quality function h r,w,m, η : F b [ x ] → R as h r,w,m, η ( q ) := m X t = w b t − w b t − X ℓ =1 ℓ b ) (cid:18) η r (1 − b ) (cid:18)(cid:22) log b (cid:18) v w (cid:18) ℓ ( x ) q ( x ) x w (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) ×× r − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) . We remark that the function h r,w,m, η directly depends on the polynomials g , . . . , g r − eventhough this is not visible in the notation. In the remainder of this section, however, thesepolynomials will always be the components of the generating vector which were selected inthe previous steps of our algorithm. Based on the quality function h r,w,m, η , we formulate thecomponent-by-component digit-by-digit algorithm. Algorithm 1
Component-by-component digit-by-digit algorithm
Input:
Prime number b ≥
2, integers m, d ∈ N , and positive product weights η = ( η j ) j ≥ .Set g ,m = 1 and g , = · · · = g d, = 1. for r = 2 to d dofor w = 2 to m do g ∗ = argmin g ∈ F b h r,w,m, η ( g r,w − + g p w − ) g r,w = g r,w − + g ∗ p w − end forend for Set g = ( g , . . . , g d ) with g r := g r,m for r = 1 , . . . , d . Return:
Generating vector g = ( g , . . . , g d ) ∈ ( G ∗ b,m ) d .In the next section, we study the worst-case error behavior of polynomial lattice rules withgenerating vectors obtained by Algorithm 1. The following theorem shows that for the constructed polynomial lattice rules the quantity H d,m, η ( g ), which for product weights η u = Q j ∈ u η j equals H d,m, η ( g ) = b m − X n =1 d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v m (cid:18) n ( x ) g j ( x ) x m (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − m X t =1 b t − X ℓ =1 ℓ b ) d Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − , can be related to the quantity H d − ,m, η ( g { d − } ). Theorem 4.
Let b be prime, m, d ∈ N be integers with d ≥ , and let η = ( η j ) j ≥ be positiveproduct weights. Furthermore, denote by g the corresponding generating vector constructed byAlgorithm 1. Then g satisfies H d,m, η ( g ) ≤ (1 + η d ) H d − ,m, η ( g { d − } ) + η d ( b m − . (16)19 roof. We will prove (16) by an inductive argument over the selection of the terms of order1 ≤ t ≤ m − g d ∈ F b [ x ]. We start by considering the term of order m − h d,m,m, η ( g d,m − + g p m − )over the choices g ∈ F b , and where g d,m − ∈ G b,m − has been determined in the previous stepsof the algorithm. By Lemma 7 (with w = m ) and Definition 3 this is equivalent to minimizing H d,m, η ( g , . . . , g d − , g d,m − + g p m − )with respect to g ∈ F b . By the standard averaging argument, this yields H d,m, η ( g ) = min ¯ g ∈ F b H d,m, η (cid:16) g { d − } , g d,m − + ¯ g p m − (cid:17) ≤ b X ¯ g ∈ F b H d,m, η (cid:16) g { d − } , g d,m − + ¯ g p m − (cid:17) = 1 b X ¯ g ∈ G b, H d,m, η (cid:16) g { d − } , g d,m − + g p m − + ¯ g p m − (cid:17) , (17)where g d,m − has been split up into g d,m − and g p m − in accordance with Algorithm 1 such that g has been selected in the previous step of the algorithm and we used that G b, ∼ = F b .Similarly, we observe that the term of order m − h d,m − ,m, η ( g d,m − + g p m − ) with respect to the choices g ∈ F b . Again, by Lemma 7 (with w = m −
1) and Definition3 this is equivalent to minimizing1 b X ¯ g ∈ G b, H d,m, η ( g { d − } , g d,m − + g p m − + ¯ g p m − )with respect to g ∈ G b, ∼ = F b . By the standard averaging argument, we obtain thatmin g ∈ G b, b X ¯ g ∈ F b H d,m, η (cid:16) g { d − } , g d,m − + g p m − + ¯ g p m − (cid:17) ≤ b X g ∈ F b X ¯ g ∈ G b, H d,m, η (cid:16) g { d − } , g d,m − + g p m − + ¯ g p m − (cid:17) = 1 b X ¯ g ∈ G b, H d,m, η (cid:16) g { d − } , g d,m − + g p m − + ¯ g p m − (cid:17) , where again we split up g d,m − = g d,m − + g p m − according to Algorithm 1. Inductively repeatingthis argument and combining the result with the estimate in (17), we obtain the inequality H d,m, η ( g ) ≤ b m − X ¯ g ∈ G b,m − H d,m, η (cid:16) g { d − } , g p (cid:17) , where we used that in Algorithm 1 we set g d, = 1. Then, using Lemma 7 with w = 1, g d = 1,and g = 0 to equate the right-hand side of the previous estimate, we finally obtain H d,m, η ( g ) ≤ m X t =1 b t − X ℓ =1 ℓ b ) η d (1 − b ) b − t (cid:18)(cid:22) log b (cid:18) v (cid:18) ℓ ( x ) x (cid:19)(cid:19)(cid:23) + 1 (cid:19) ×× d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − m X t =1 b t − X ℓ =1 ℓ b ) d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19)(cid:16) η d (1 − b − t ) (cid:17) . ℓ with ℓ b ), which is equivalent to gcd( ℓ, p ) = 1, we have for some a ∈ F b \ { } that ⌊ log b ( v ( ℓ ( x ) /x )) ⌋ + 1 = ⌊ log b ( a/b ) ⌋ + 1 = − H d,m, η ( g ) ≤ m X t =1 b t − X ℓ =1 ℓ b ) d − Y j =1 (cid:18) η j (1 − b ) (cid:18)(cid:22) log b (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) ×× (cid:16) η d (1 − b − t ) (cid:17) − ( b m − ≤ (1 + η d ) ( H d − ,m, η ( g { d − } ) + ( b m − − ( b m − η d ) H d − ,m, η ( g { d − } ) + η d ( b m − , which is the claimed estimate.Based on the result in Theorem 4 we can use an inductive argument to show that the quantity H d,m, η ( g ) is sufficiently small if g has been constructed by Algorithm 1. Theorem 5.
Let b be prime, let m, d ∈ N be positive integers and let η = ( η j ) j ≥ be positiveproduct weights. Then the generating vector g constructed by Algorithm 1 satisfies H d,m, η ( g ) ≤ b m − d Y j =1 (1 + η j ) . Proof.
Due to the formulation of Algorithm 1, the estimate (16) obtained in Theorem 4 holdsif we replace d by r for any r ∈ { , . . . , d } , such that we get a result for H r,m, η ( g ) for any r ∈ { , . . . , d } . Hence, we can use this estimate inductively to obtain H d,m, η ( g ) ≤ (1 + η d ) H d − ,m, η ( g { d − } ) + η d ( b m − ≤ (1 + η d )(1 + η d − ) H d − ,m, η ( g { d − } ) + (1 + η d ) η d − ( b m −
1) + η d ( b m − H d − ,m, η ( g { d − } ) d Y j = d − (1 + η j ) + ( b m − − d Y j = d − (1 + η j ) ≤ H ,m, η ( g ) d Y j =2 (1 + η j ) + ( b m − − d Y j =2 (1 + η j ) . (18)Next, we observe that H ,m, η ( g ) = H ,m, η (1)= b m − X n =1 (cid:18) η (1 − b ) (cid:18)(cid:22) log b (cid:18) v m (cid:18) n ( x ) x m (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) − ( b m − − η b m − X n =1 ( b − (cid:18)(cid:22) log b (cid:18) v m (cid:18) n ( x ) x m (cid:19)(cid:19)(cid:23) + 1 (cid:19) = − η m X t =1 b t − X n =1 n b ) ( b − (cid:22) log b (cid:18) v t (cid:18) n ( x ) x t (cid:19)(cid:19)(cid:23) + η (1 − b )( b m − − η m X t =1 t − X r =0 b t − X n =1 n b )deg( n ( x ))= r ( b − (cid:22) log b (cid:18) v t (cid:18) n ( x ) x t (cid:19)(cid:19)(cid:23) + η (1 − b )( b m − . n ( x ) ∈ F b [ x ] of degree 0 ≤ r < t with gcd( n, x ) = 1, we have that (cid:22) log b (cid:18) v t (cid:18) n ( x ) x t (cid:19)(cid:19)(cid:23) = − ( t − r )such that we can further deduce that H ,m, η ( g ) = η m X t =1 ( b − t − X r =0 b t − X n =1 n b )deg( n ( x ))= r ( t − r ) + η (1 − b )( b m − η m X t =1 ( b − ( b − t + t − X r =1 ( b − b r − ( t − r ) ! + η (1 − b )( b m − η m X t =1 ( b − (cid:16) ( b − t + b t − bt + t − (cid:17) + η (1 − b )( b m − η ( b − m X t =1 ( b t −
1) + η (1 − b )( b m − η ( b m +1 − bm − b + m ) + η (1 − b )( b m −
1) = η ( b m − ( b − m − . Combining this with the estimate in (18), we finally obtain H d,m, η ( g ) ≤ η ( b m − d Y j =2 (1 + η j ) + ( b m − − d Y j =2 (1 + η j ) = ( b m − − d Y j =1 (1 + η j ) , which yields the claimed estimate.Theorem 5 allows us to prove the following result regarding the construction in Algorithm 1. Theorem 6.
Let b be prime, let m, d ∈ N with m ≥ , and let ( η j ) j ≥ be positive product weights.Then the generating vector g constructed by Algorithm 1 satisfies T η ( g , p m ) ≤ b m d Y j =1 (1 + η j (( b − m + 1)) + b m d Y j =1 (cid:18) η j (cid:18) b − m + 2 bb − (cid:19)(cid:19) . Proof.
We remark that for reals a , . . . , a d ∈ R the general identity X ∅6 = u ⊆{ d } Y j ∈ u a j = − d Y j =1 (1 + a j )holds. Using the bound on T η ( g , p m ) in Theorem 3 and inserting for g the generating vectorobtained from Algorithm 1, for which the bound on H d,m, η ( g ) from Theorem 5 holds, yields T η ( g , p m ) ≤ − d Y j =1 (1 + η j ) − − d Y j =1 (1 + η j ) + X ∅6 = u ⊆{ d } η u b m (( b − m + 1) | u | + X ∅6 = u ⊆{ d } η u b m ( b m | u | ) (cid:18) ( b − m + bb − (cid:19) | u | b m d Y j =1 (1 + η j (( b − m + 1)) + b m d Y j =1 (cid:18) η j (cid:18) b − m + 2 bb − (cid:19)(cid:19) , where in the last step we used that | u | ≤ | u | . Note that by the formulation of Algorithm 1 wehave that gcd( g j , p m ) = 1 for 1 ≤ j ≤ d such that the conditions of Theorem 3 are satisfied.The next theorem states the main result of this paper, implying that by the constructionin Algorithm 1 we obtain an error convergence rate that is arbitrarily close to the optimalrate of N − α (we know that this order is optimal due to the relation between the worst-caseerrors in W αd, γ and f W αd, γ stated in Section 2 and due to the fact that the rate N − α/ is optimalin f W αd, γ ). Additionally, under a summability condition on the weights that is common in therelated literature, the error can be bounded independently of the dimension, by which we obtainwhat is known as strong polynomial tractability in the context of information-based complexity. Theorem 7.
Let b be prime, let m, d ∈ N with m ≥ , let N = b m , and let ( γ j ) j ≥ be positiveproduct weights satisfying X j ≥ γ j < ∞ . Furthermore, denote by g the generating vector obtained by Algorithm 1, run for the weightsequence η = γ = ( γ j ) j ≥ . Then, for any δ > and each α > , the generating vector g satisfies e b m ,d,α, γ α ( g ) ≤ N α (cid:16) C ( γ α ) + ¯ C ( γ , δ ) N αδ (cid:17) , with positive constants C ( γ α ) and ¯ C ( γ , δ ) , which are independent of d and N .Additionally, if Algorithm 1 is run for the weights η = γ /α with α > , which satisfy X j ≥ γ /αj < ∞ , then, for any δ > , the resulting generating vector e g satisfies the error bound e b m ,d,α, γ ( e g ) ≤ N α (cid:16) K ( γ ) + ¯ K ( γ /α , δ ) N αδ (cid:17) , with positive constants K ( γ ) and ¯ K ( γ /α , δ ) , which are independent of d and N .Proof. We know from Proposition 1 that e b m ,d,α, η α ( g ) ≤ N α X ∅6 = u ⊆{ d } η α u (2 µ b ( α )) | u | + T α, η α ( g , p m ) . For the special case of product weights η u = Q j ∈ u η j , u ⊆ { d } , this yields e b m ,d,α, η α ( g ) ≤ N α d Y j =1 (cid:16) µ b ( α ) η αj (cid:17) + T α, η α ( g , p m ) . Since α >
1, we can use an inequality, sometimes referred to as Jensen’s inequality, which statesthat P Mi =1 y i ≤ (cid:16)P Mi =1 y pi (cid:17) /p for non-negative y , . . . , y M and 0 ≤ p ≤
1. This yields T α, η α ( g , p m ) = X = k ∈ A p ( g ) ( r α, η α ( k )) − = X = k ∈ A p ( g ) ( r , η ( k )) − α ≤ X = k ∈ A p ( g ) ( r , η ( k )) − α = ( T η ( g , p m )) α , η yields g which satisfy T η ( g , p m ) ≤ b m d Y j =1 (1 + η j (( b − m + 1)) + b m d Y j =1 (cid:18) η j (cid:18) b − m + 2 bb − (cid:19)(cid:19) . From this, we deduce, using either the weights η = γ /α or η = γ for Algorithm 1, that b m T η ( g , p m ) ≤ d Y j =1 (1 + η j (( b − m + 1)) + b m d Y j =1 (cid:18) η j (cid:18) b − m + 2 bb − (cid:19)(cid:19) ≤ d Y j =1 (1 + η j bm ) + b m d Y j =1 (1 + η j bm ) = (1 + b m ) d Y j =1 (1 + η j bm ) ≤ e C ( δ/ b mδ/ d Y j =1 (1 + η j bm ) ≤ e C ( δ/ b mδ/ ∞ Y j =1 (1 + η j bm )for arbitrary δ >
0, where e C ( δ/
2) is a constant depending only on δ . Due to the imposedcondition on the weights, i.e., P j ≥ γ j < ∞ or P j ≥ γ /αj < ∞ , we can use the result in[8, Lemma 3] to see that the last product can be bounded by b C ( γ ) b mδ/ or b C ( γ /α ) b mδ/ ,respectively, where b C ( γ ) and b C ( γ /α ) may depend on the weights γ or γ /α , but are independentof the dimension. Choosing η = γ , this yields that( T η ( g , p m )) α = ( T γ ( g , p m )) α ≤ N α (cid:16) e C ( δ/ (cid:17) α (cid:16) b C ( γ ) (cid:17) α N αδ , and similarly, for η = γ /α ,( T η ( g , p m )) α = (cid:16) T γ /α ( g , p m ) (cid:17) α ≤ b mα (cid:16) e C ( δ/ (cid:17) α (cid:16) b C ( γ /α ) (cid:17) α N αδ . Setting then C ( γ α ) = Q dj =1 (1 + 2 µ b ( α ) γ αj ) and ¯ C ( γ , δ ) = ( e C ( δ/ α (cid:16) b C ( γ ) (cid:17) α , and, similarly, K ( γ ) = Q dj =1 (1 + 2 µ b ( α ) γ j ) and ¯ K ( γ /α , δ ) = ( e C ( δ/ α (cid:16) b C ( γ /α ) (cid:17) α , we obtain the claimederror estimates, where the first stated bound holds simultaneously for all α > γ /α ,and hence depending on the parameter α , the algorithm yields typical error bounds for theworst-case error in the space W αd, γ . We emphasize that this type of result could also be obtainedby formulating and using an analogous CBC-DBD algorithm which is instead directly based onthe search criterion e b m ,d,α, γ . On the other hand, when run with weights γ , thus independentlyof α , the algorithm produces generating vectors for which bounds on the worst-case errors inthe spaces W αd, γ α hold simultaneously for all α > In this section we discuss the efficient implementation of the introduced CBC-DBD algorithmand analyze its complexity. Throughout this section, we will consider the implementation for thespecial case of b = 2 and product weights γ u = Q j ∈ u γ j for a sequence of positive reals ( γ j ) j ≥ .Choosing the prime base as b = 2 allows for the use of bitwise operations which facilitate anefficient implementation of the construction scheme. We remark that the major challenge forthe implementation of the algorithm for b > b , all other steps of the algorithm can be implemented analogously.24 .1 Implementation and cost analysis of the CBC-DBD algorithm Let q ∈ F [ x ], m, d ∈ N be positive integers and let γ = ( γ u ) u ⊆{ d } , where γ u = Q j ∈ u γ j withpositive reals ( γ j ) j ≥ . We recall that for b = 2 and integers w ∈ { m } , r ∈ { d } the digit-wisequality function h r,w,m, γ in Definition 3, which is used in Algorithm 1, is given by h r,w,m, γ ( q ) = m X t = w t − w t − X ℓ =1 ℓ ≡ (cid:18) − γ r (cid:18)(cid:22) log (cid:18) v w (cid:18) ℓ ( x ) q ( x ) x w (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) ×× r − Y j =1 (cid:18) − γ j (cid:18)(cid:22) log (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) , where the polynomials g , . . . , g r − ∈ F [ x ] have been determined in the previous steps of thealgorithm. Since the cost of a single evaluation of the function h r,w,m, γ is crucial for the totalcost of Algorithm 1, we are interested in an efficient evaluation procedure which will be discussedin the following paragraph.For integers t ∈ { , . . . , m } and odd ℓ ∈ { , . . . , t − } , we define the term a ( r, t, ℓ ) as a ( r, t, ℓ ) := r Y j =1 (cid:18) − γ j (cid:18)(cid:22) log (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) and observe that for the evaluation of h r,w,m, γ ( q ) we can compute and store the term a ( r − , t, ℓ )since it is independent of w and q . This way we can rewrite h r,w,m, γ ( q ) as h r,w,m, γ ( q ) = m X t = w t − w t − X ℓ =1 ℓ ≡ a ( r − , t, ℓ ) (cid:18) − γ r (cid:18)(cid:22) log (cid:18) v w (cid:18) ℓ ( x ) q ( x ) x w (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) , (19)where in Algorithm 1, after having determined g r,w , the values of a ( r, w, ℓ ) for odd integers ℓ ∈ { , . . . , w − } are computed via the recurrence relation a ( r, w, ℓ ) = a ( r − , w, ℓ ) (cid:18) − γ r (cid:18)(cid:22) log (cid:18) v w (cid:18) ℓ ( x ) g r,w ( x ) x w (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) . For an algorithmic implementation, we introduce the vector v = ( v (1) , . . . , v (2 m − ∈ R m − whose components, for the current r ∈ { , . . . , d } , are given by v ( ℓ m − t ) = r Y j =1 (cid:18) − γ j (cid:18)(cid:22) log (cid:18) v t (cid:18) ℓ ( x ) g j ( x ) x t (cid:19)(cid:19)(cid:23) + 1 (cid:19)(cid:19) = a ( r, t, ℓ )for each t = 1 , . . . , m and corresponding odd index ℓ ∈ { , . . . , t − } . Furthermore, notethat for the evaluation of h r,w,m, γ we do not require the values of a ( r, t, ℓ ) for t = 2 , . . . , w − lgorithm 2 Fast component-by-component digit-by-digit algorithm
Input:
Integers m, d ∈ N and positive weights ( γ j ) dj =1 . for ℓ = 1 to m − do v ( ℓ ) = 1 − γ (cid:16)j log (cid:16) v m (cid:16) ℓx m (cid:17)(cid:17)k + 1 (cid:17) end for Set g ,m = 1 and g , = · · · = g d, = 1. for r = 2 to d dofor w = 2 to m do g ∗ = argmin g ∈ F h r,w,m, γ ( g r,w − + g x w − ) with h r,w,m, γ evaluated using (19) g r,w = g r,w − + g ∗ x w − for ℓ = 1 to w − in steps of do v ( ℓ m − w ) = v ( ℓ m − w ) (cid:16) − γ r (cid:16)j log (cid:16) v w (cid:16) ℓ g r,w x w (cid:17)(cid:17)k + 1 (cid:17)(cid:17) end forend forend for Set g = ( g , . . . , g d ) with g r := g r,m for r = 1 , . . . , d . Return:
Generating vector g = ( g , . . . , g d ) ∈ ( G ∗ ,m ) d for N = 2 m .The computational complexity of Algorithm 2 is then summarized in the following theorem. Theorem 8.
Let m, d ∈ N and let γ = ( γ j ) dj =1 be a given sequence of positive weights. Then Al-gorithm 2 constructs a generating vector g = ( g , . . . , g d ) ∈ ( G ∗ ,m ) d using O ( d m m ) operationsand requiring O (2 m ) memory.Proof. Due to the relation in (19), the cost of evaluating h r,w,m, γ ( q ) can be reduced to O ( P mt = w t − )operations. Thus, the number of calculations in the inner loop over w = 2 , . . . , m of Algorithm2 is of order O m X w =2 m X t = w t − ! = O m X w =2 m X t = w t ! = O ( m m − m − O ( m m ) . Hence, the outer loop over r = 2 , . . . , d , which is the main cost of Algorithm 2, can be executedin O ( d m m ) operations. Furthermore, we observe that initialization and updating of the vector v ∈ R m − can both be executed in O (2 m ) operations. Additionally, storing the vector v requires O (2 m ) of memory.We remark that the running time of Algorithm 2 can be reduced further by precomputingand storing the 2 m − (cid:18)(cid:22) log (cid:18) v m (cid:18) ℓx m (cid:19)(cid:19)(cid:23) + 1 (cid:19) for ℓ = 1 , . . . , m − . The derivation leading to the fast implementation in Algorithm 2 is using arguments thatwere used in [6], where a component-by-component digit-by-digit construction for lattice rulesin weighted Korobov spaces has been studied. Theorem 8 shows that the fast implementationof the component-by-component digit-by-digit construction for polynomial lattice rules achievesthe same computational complexity as state-of-the-art component-by-component methods, see,e.g., [3]. In these constructions the speed-up of the algorithm is achieved by reordering theinvolved matrices to be of circulant structure and by then employing a fast matrix-vector productwhich uses fast Fourier transformations (FFTs). We refer to [18] for further details on an26mplementation for polynomial lattice rules. In contrast, our method does not rely on the use ofFFTs and the low time complexity of the resulting algorithm is due to the smaller search spacefor the components g j of the generating vector g . Furthermore, we remark that the mentionedstate-of-the-art CBC constructions mainly use a primitive or irreducible modulus p ∈ F [ x ] sincethen the multiplicative group of F [ x ] / ( p ) is cyclic. While for reducible polynomials, such as p ( x ) = x m , a fast CBC construction is theoretically possible by using a similar strategy as forthe fast CBC construction for lattice rules with a composite number of points, there are, to thebest of our knowledge, no explicit implementations of such an algorithm known. On the otherhand, the CBC-DBD construction considered in this article immediately yields a fast algorithmfor the construction of polynomial lattice rules in O ( d m m ) operations for p ( x ) = x m . In this section, we illustrate the error convergence behavior of the polynomial lattice rulesconstructed by the CBC-DBD algorithm and visualize the computational complexity of theconstruction by means of numerical experiments. As in the previous section, we consider poly-nomial lattice rules in the weighted Walsh space W αd, γ for prime base b = 2 and product weights γ u = Q j ∈ u γ j given in terms of positive reals ( γ j ) j ≥ .In order to demonstrate the performance of the algorithm, we compare the worst-case errorsof the constructed polynomial lattice rules as well as the algorithm’s computation times to thecorresponding quantities obtained by a state-of-the-art component-by-component algorithm, see,e.g., [3]. As remarked in the previous section, no fast CBC construction is known for the case p ( x ) = x m such that instead we compare our algorithm with a CBC construction with primitivepolynomial p ∈ F [ x ] of degree m as the modulus. Both constructions deliver polynomial latticerules for the spaces W αd, γ consisting of 2 m cubature points.The different algorithms have been implemented in MATLAB R2019b and Python 3.6.3.In Python the implementations are available in double-precision as well as arbitrary-precisionfloating-point arithmetic with the latter provided by the multiprecision Python library mpmath. Let m, d ∈ N , α >
1, and a sequence of positive weights γ = ( γ j ) j ≥ be given. By Theorem 1,the worst-case error of a polynomial lattice point set P ( g , p ) = { x , . . . , x b m − } in base b = 2with generating vector g and modulus p ∈ F [ x ], with deg( p ) = m , in the space W αd, γ is given by e b m ,d,α, γ ( g ) = X = k ∈D ( g ,p ) ( r α, γ ( k )) − = 1 b m b m − X n =0 X = k ∈ N d γ supp( k ) wal k ( x n ) r α ( k ) . For b = 2 and product weights γ u = Q j ∈ u γ j , this expression then equals e m ,d,α, γ ( g ) = − m m − X n =0 d Y j =1 (1 + γ j φ α ( x n,j ))with φ α : [0 , → R given by φ α ( x ) = ( µ ( α ) , if x = 0 ,µ ( α ) − (1+ t )( α − ( µ ( α ) + 1) , otherwise, with t = ⌊ log ( x ) ⌋ , see, e.g., [4]. For the polynomial lattice rules constructed by the algorithms considered, we willuse this worst-case error expression as a measure of quality.27n particular, we consider the convergence behavior of the worst-case error e m ,d,α, γ α ( g ) forgenerating vectors g obtained by the CBC-DBD algorithm (with modulus p ( x ) = x m ) andcompare it with the error rates for polynomial lattice rules constructed by the standard fastCBC algorithm (with primitive polynomial p ∈ F [ x ] of degree m ) which uses the worst-caseerror e m ,d,α, γ α as the quality criterion. We display the computation results for dimension d = 100 for different sequences of product weights γ = ( γ j ) j ≥ , different values of m , anddifferent smoothness parameters α . We stress that the almost optimal error rates of O ( N − α + δ ),guaranteed by Theorem 7, may not always be visible for the weights and ranges of N consideredin our numerical experiments. The graphs shown are therefore to be understood as an illustrationof the pre-asymptotic behavior of the worst-case error. Remark . We stress that in these numerical experiments we compare the CBC-DBD algorithmwith modulus p ( x ) = x m to the CBC construction with a primitive modulus polynomial. Bothconstructions yield polynomial lattices consisting of N = b m points that have been constructedfor the same function space W αd, γ such that the comparison is valid. To the best of our knowledge,there is no known implementation of the fast CBC algorithm for polynomial lattice rules basedon the modulus p ( x ) = x m . The reason for the elusiveness of such an implementation is themore involved structure of the group of units of the factor ring F b [ x ] / ( x m ) when factored intocyclic groups, see, e.g., [24]. While for lattice rules the group of integer units modulo N = b m iseither cyclic (for odd b ) or can be factored into two cyclic subgroups (for b = 2), which makesthe corresponding generator easily computable, see, e.g., [18], the ring F b [ x ] / ( x m ) factors intoa larger number of cyclic subgroups (for sufficiently large m ) and their generating elements areless studied in the context of QMC methods.The results in Figure 1 show that the CBC-DBD algorithm constructs generating vectorsof good polynomial lattice rules which have worst-case errors that are comparable to those ofpolynomial lattice rules obtained by the fast CBC algorithm. We observe identical asymptoticerror rates for both algorithms considered, and also note that the CBC-DBD construction alwaysdelivers slightly higher error values. The latter behavior can easily be explained by the fact thatthe CBC construction is directly tailored to the space W αd, γ α for a particular α since e b m ,d,α, γ α is used as the quality measure. In contrast, the CBC-DBD construction is independent of thesmoothness parameter α and constructs polynomial lattices which have a good quality for all α >
1. This in turn also means that the CBC-DBD algorithm only needs to be executed oncewhile the CBC construction has to be run for all considered α . Additionally, we observe thatthe pre-asymptotic error decay is determined by the weight sequence γ = ( γ j ) j ≥ . The fasterthe weights γ j decay, the closer the error rate is to the optimal rate of O ( N − α ) for the space W αd, γ α . 28 rror convergence in the space W αd, γ α with d = 100 , α = 1 . , , . − − − − − − − Number of points N = 2 m W o r s t - c a s ee rr o r e N , d , α , γ α ( g ) O ( N − . ) O ( N − . ) O ( N − . ) (a) Weight sequence γ = ( γ j ) dj =1 with γ j = 1 /j . − − − − − Number of points N = 2 m W o r s t - c a s ee rr o r e N , d , α , γ α ( g ) O ( N − . ) O ( N − . ) O ( N − . ) (b) Weight sequence γ = ( γ j ) dj =1 with γ j = 1 /j . − − Number of points N = 2 m W o r s t - c a s ee rr o r e N , d , α , γ α ( g ) O ( N − ) O ( N − ) O ( N − . ) (c) Weight sequence γ = ( γ j ) dj =1 with γ j = (0 . j . − − − − − − Number of points N = 2 m W o r s t - c a s ee rr o r e N , d , α , γ α ( g ) O ( N − . ) O ( N − . ) O ( N − . ) (d) Weight sequence γ = ( γ j ) dj =1 with γ j = (0 . j . CBC-DBD standard fast CBC α = 1 . α = 2 α = 3Figure 1: Convergence results of the worst-case error e m ,d,α, γ α ( g ) in the weighted space W αd, γ α for smoothness parameters α = 1 . , , d = 100. The generating vectors g areconstructed via the component-by-component digit-by-digit algorithm and the standard CBCconstruction for polynomial lattice rules for N = 2 m , respectively.29 .2 Computational complexity We demonstrate the computational complexity of Algorithm 2 which was proved in Theorem 8.For this purpose, we measure and compare the computation times of implementations of Algo-rithm 2 and the standard fast CBC algorithm for polynomial lattice rules with primitive modulus p ∈ F [ x ], cf., e.g., [18]. For all timings we perform three independent measurements and thenselect the lowest time out of these three runs. We consider multiple values of m, d ∈ N and fixthe positive weight sequence γ = ( γ j ) j ≥ with γ j = 1 /j . Note that the chosen weight sequencedoes not affect the computation times.In Table 1 we display the timing results for the two considered algorithms. Furthermore,Figure 2 provides a graphical illustration of the running times of both algorithms. We remarkthat the measured times only indicate the duration for the construction of the generating vectorsbut do not include the calculation of the corresponding worst-case error. All timings wereperformed on an Intel Core i5 CPU with 2.3 GHz using Python 3.6.3.Table 1: Computation times (in seconds) for constructing the generating vector g of a polynomiallattice rule with 2 m points in d dimensions using the component-by-component digit-by-digitalgorithm ( bold font ) and the standard fast CBC construction (normal font). For the CBCalgorithm we constructed the polynomial lattice rules with smoothness parameter α = 2. d = 50 d = 200 d = 500 d = 1000 d = 2000 m = 10 0.007 0.025 0.061 0.12 0.239 m = 12 0.025 0.089 0.213 0.421 0.827 m = 14 0.117 0.399 0.953 1.839 3.763 m = 16 0.586 2.0 4.804 9.523 18.836 m = 18 2.858 9.466 22.715 44.56 88.198 m = 20 13.703 44.914 106.861 211.073 416.24 The timings displayed in Table 1 and Figure 2 confirm that the computational complexity ofboth algorithms depends on m and d in a similar way and the measured times are in accordancewith Proposition 8. Additionally, the linear dependence of the construction cost on the dimension d is well observable. The measured construction times for Algorithm 2 are slightly higher than forthe fast CBC algorithm but in general both algorithms can be executed in comparable time. Thisis especially remarkable since the fast CBC construction is based on fast Fourier transformationswhich rely on compiled and optimized code via Python’s Discrete Fourier Transform (numpy.fft)library while the CBC-DBD construction does not make use of any compiled libraries. Lastly,we remark that the slight parabola shape of the timing curve of the CBC-DBD algorithm inFigure 2, which one might suspect, is not to be observed for larger values of m .30 omputation times for CBC-DBD and fast CBC algorithm.
10 11 12 13 14 15 16 17 18 19 2010 − − − m C o m pu t a t i o n t i m e i n s ec o nd s CBC-DBD with d = 50fast CBC with d = 50CBC-DBD with d = 2000fast CBC with d = 2000 Figure 2: Computation times (in seconds) for constructing the generating vector g of a polyno-mial lattice rule with 2 m points in d ∈ { , } dimensions using the component-by-componentdigit-by-digit algorithm (circles) and the standard fast CBC construction (crosses). In this paper, we presented an algorithm for constructing good polynomial lattice rules fornumerical integration in weighted Walsh spaces. In particular, we studied a component-by-component digit-by-digit (CBC-DBD) construction with quality measure independent of thesmoothness parameter α , similar to [6], where such an algorithm was analyzed for ordinarylattice rules. The construction algorithm is formulated for the special case of product weightsand yields polynomial lattice rules which admit error convergence rates that are arbitrarily closeto the optimal convergence order. Furthermore, the proven error bounds become independentof the dimension if the weights satisfy suitable summability conditions. In addition to thesetheoretical results, we derived a fast implementation of the considered algorithm which exhibitsthe same computational complexity as the state-of-the-art fast CBC algorithm, but does notrely on the use of fast Fourier transformations (FFTs). The considered algorithm is, to the bestof our knowledge, the first construction method for good polynomial lattice rules with modulus p ( x ) = x m that requires only O ( d m m ) operations. Extensive numerical experiments illustratedour findings and proved that the considered method is competitive with the standard fast CBCalgorithm. 31 eferences [1] R. Cools, F.Y. Kuo, D. Nuyens. Constructing embedded lattice rules for multivariate inte-gration . SIAM J. Sci. Comput., 28, 2162–2188, 2006.[2] J. Dick, F.Y. Kuo, F. Pillichshammer, I.H. Sloan.
Construction algorithms for polynomiallattice rules for multivariate integration . Math. Comp. 74, 1895–1921, 2005.[3] J. Dick, F.Y. Kuo, I.H. Sloan.
High-dimensional integration—the quasi-Monte Carlo way .Acta Numer. 22, 133–288, 2013.[4] J. Dick, F. Pillichshammer.
Multivariate integration in weighted Hilbert spaces based onWalsh functions and weighted Sobolev spaces . J. Complexity 21, 149–195, 2005.[5] J. Dick, F. Pillichshammer.
Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration . Cambridge University Press, Cambridge 2010.[6] A. Ebert, P. Kritzer, D. Nuyens, O. Osisiogu.
Digit-by-digit and component-by-componentconstructions of lattice rules for periodic functions with unknown smoothness . Submitted,2020.[7] B. Golubov, A. Efimov, V. Skvortsov.
Walsh Series and Transforms: Theory and Applica-tions . Moskow: Nauka, 1987. In Russian. (English translation: Kluver Academic Publishers,Dordrecht, Boston, London, 1991).[8] F.J. Hickernell, H. Niederreiter.
The existence of good extensible rank- lattices . J. Com-plexity, 19, 286–300, 2003.[9] E. Hlawka. Zur angenäherten Berechnung mehrfacher Integrale . Monatshefte für Mathe-matik 66, 140–151, 1962.[10] N.M. Korobov.
Approximate evaluation of repeated integrals . Dokl. Akad. Nauk SSSR, 124,1207–1210, 1959. In Russian.[11] N.M. Korobov.
Number-theoretic methods in approximate analysis . Goz. Izdat. Fiz.-Math.,1963. In Russian.[12] N.M. Korobov.
On the computation of optimal coefficients . Dokl. Akad. Nauk SSSR, 267,289–292, 1982. In Russian.[13] N.M. Korobov.
On the computation of optimal coefficients . Dokl. Akad. Nauk SSSR, 26,590–593, 1982.[14] H. Niederreiter.
Point sets and sequences with small discrepancy . Monatsh. Math. 104, 273–337, 1987.[15] H. Niederreiter.
Low-discrepancy point sets obtained by digital constructions over finite fields .Czechoslovak Math. J. 42, 143–166, 1992.[16] H. Niederreiter.
Random Number Generation and Quasi-Monte Carlo Methods . Society forIndustrial and Applied Mathematics, Philadelphia, 1992.[17] D. Nuyens.
The construction of good lattice rules and polynomial lattice rules . In: P. Kritzer,H. Niederreiter, F. Pillichshammer, A. Winterhof (eds.).
Uniform Distribution and Quasi-Monte Carlo Methods: Discrepancy, Integration and Applications , 223–255, De Gruyter,Berlin, 2014. 3218] D. Nuyens, R. Cools.
Fast component-by-component construction, a reprise for differentkernels . In: H. Niederreiter, D. Talay (eds.).
Monte Carlo and Quasi-Monte Carlo Methods2004 , 373–387, Springer, Berlin, 2006.[19] D. Nuyens, R. Cools.
Fast algorithms for component-by-component construction of rank- lattice rules in shift-invariant reproducing kernel Hilbert spaces . Math. Comp. 75, 903–920,2006.[20] D. Nuyens, R. Cools. Fast component-by-component construction of rank-1 lattice rules witha non-prime number of points . J. Complexity 22, 44–22, 2006.[21] I.H. Sloan, S. Joe.
Lattice Methods for Multiple Integration . Clarendon Press, Oxford, 1994.[22] I.H. Sloan, V.A. Reztsov.
Component-by-component construction of good lattice rules . Math.Comp. 71, 263–273, 2002.[23] I.H. Sloan, H. Woźniakowski.
When are quasi-Monte Carlo algorithms efficient for high-dimensional problems? . J. Complexity 14, 1–33, 1998.[24] J.L. Smith, J.A. Gallian.
Factoring Finite Factor Rings . Math. Magazine 58, 93–95, 1985.
Authors’ addresses:
Adrian EbertJohann Radon Institute for Computational and Applied Mathematics (RICAM)Austrian Academy of SciencesAltenbergerstr. 69, 4040 Linz, Austria. [email protected]
Peter KritzerJohann Radon Institute for Computational and Applied Mathematics (RICAM)Austrian Academy of SciencesAltenbergerstr. 69, 4040 Linz, Austria. [email protected]
Onyekachi OsisioguJohann Radon Institute for Computational and Applied Mathematics (RICAM)Austrian Academy of SciencesAltenbergerstr. 69, 4040 Linz, Austria. [email protected]
Tetiana StepaniukInstitute of MathematicsUniversity of LübeckRatzeburger Allee 160, 23562 Lübeck, Germany, [email protected]@math.uni-luebeck.de