Discrepancy Bounds for a Class of Negatively Dependent Random Points Including Latin Hypercube Samples
Michael Gnewuch ∗ Nils Hebbinghaus † February 10, 2021
Abstract
We introduce a class of $\gamma$-negatively dependent random samples. We prove that this class includes, apart from Monte Carlo samples, in particular Latin hypercube samples and Latin hypercube samples padded by Monte Carlo. For a $\gamma$-negatively dependent $N$-point sample in dimension $d$ we provide probabilistic upper bounds for its star discrepancy with explicitly stated dependence on $N$, $d$, and $\gamma$. These bounds generalize the probabilistic bounds for Monte Carlo samples from [Heinrich et al., Acta Arith. 96 (2001), 279–302] and [C. Aistleitner, J. Complexity 27 (2011), 531–540], and they are optimal for Monte Carlo and Latin hypercube samples. In the special case of Monte Carlo samples the constants that appear in our bounds improve substantially on the constants presented in the latter paper and in [C. Aistleitner, M. T. Hofer, Math. Comp. 83 (2014), 1373–1381].

1 Introduction

Discrepancy measures are well established and play an important role in fields like computer graphics, experimental design, empirical process theory, learning theory and machine learning, random number generation, optimization (in particular, stochastic programming), and numerical integration or stochastic simulation, see, e.g., [6, 8, 9, 11, 12, 20, 21, 22, 25, 41, 43, 45, 47, 50] and the literature mentioned therein.

The prevalent and most intriguing discrepancy measure is arguably the star discrepancy, which is defined in the following way: Let $P \subset [0,1)^d$ be an $N$-point set. (We always understand an "$N$-point set" as a "multi-set", i.e., it consists of $N$ points, but those points do not have to be pairwise distinct.)

$^*$ Institut für Mathematik, Universität Osnabrück, Germany ([email protected]).
$^\dagger$ Institut für Informatik, Christian-Albrechts-Universität zu Kiel, Germany ([email protected]).
We define the local discrepancy of $P$ with respect to a Lebesgue-measurable test set $T \subseteq [0,1)^d$ by
\[ D_N(P,T) := \Big| \frac{1}{N} |P \cap T| - \lambda_d(T) \Big|, \]
where $|P \cap T|$ denotes the cardinality of the finite set $P \cap T$ and $\lambda_d$ denotes the $d$-dimensional Lebesgue measure on $\mathbb{R}^d$. For vectors $x = (x_1, \ldots, x_d)$, $y = (y_1, \ldots, y_d) \in \mathbb{R}^d$ we write
\[ [x, y) := \prod_{j=1}^{d} [x_j, y_j) = \{ z \in \mathbb{R}^d \mid x_j \le z_j < y_j \text{ for } j = 1, \ldots, d \}. \]
The star discrepancy of $P$ is then given by
\[ D^*_N(P) := \sup_{y \in [0,1]^d} D_N(P, [0,y)). \]
The star discrepancy is intimately related to quasi-Monte Carlo integration via the Koksma–Hlawka inequality ([36, 39]): For every $N$-point set $P \subset [0,1)^d$ we have
\[ \Big| \int_{[0,1]^d} f(x) \, \mathrm{d}\lambda_d(x) - \frac{1}{N} \sum_{p \in P} f(p) \Big| \le D^*_N(P)\, \mathrm{Var}_{HK}(f), \]
where $\mathrm{Var}_{HK}(f)$ denotes the variation of the integrand $f$ in the sense of Hardy and Krause, see, e.g., [2, 47]. The Koksma–Hlawka inequality is sharp, see again [47]. (An alternative sharp version of the Koksma–Hlawka inequality can be found in [34]; it says that the worst-case error of equal-weight quadratures based on a set of sample points $P$ over the norm unit ball of the Sobolev space of dominated mixed smoothness of order one is exactly the star discrepancy of $P$.) The Koksma–Hlawka inequality shows that equal-weight quadratures based on sample points with small star discrepancy yield small integration errors. (Deterministic equal-weight quadratures are commonly called quasi-Monte Carlo quadratures; for a survey we refer to [11].)
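For tiny point sets the supremum in the definition of $D^*_N$ can be evaluated exactly: it is attained at boxes $[0,y)$ whose upper corners $y$ have coordinates taken from the point set or equal to $1$, and one has to compare both the open and the closed point count with the box volume. A minimal brute-force sketch (the function name and the restriction to tiny $N$ and $d$ are ours; the cost grows like $(N+1)^d$):

```python
import itertools

def star_discrepancy(points):
    """Exact star discrepancy of a small point set in [0,1)^d (brute force).

    The supremum over anchored boxes [0, y) is attained when each
    coordinate of y is a point coordinate or 1; at every such critical
    corner we compare both the open and the closed point count with
    the box volume.
    """
    n = len(points)
    d = len(points[0])
    # candidate corner coordinates per dimension
    grids = [sorted({p[j] for p in points} | {1.0}) for j in range(d)]
    disc = 0.0
    for y in itertools.product(*grids):
        vol = 1.0
        for yj in y:
            vol *= yj
        open_cnt = sum(all(p[j] < y[j] for j in range(d)) for p in points)
        closed_cnt = sum(all(p[j] <= y[j] for j in range(d)) for p in points)
        disc = max(disc, vol - open_cnt / n, closed_cnt / n - vol)
    return disc

# The centered grid {(2i+1)/(2N)} has star discrepancy 1/(2N) in d = 1:
pts = [((2 * i + 1) / 8,) for i in range(4)]
print(star_discrepancy(pts))  # -> 0.125
```

The same routine confirms, e.g., that a single point at $(1/2, 1/2)$ has star discrepancy $3/4$.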
For the very important task of high-dimensional integration, which occurs, e.g., in computational finance, physics or quantum chemistry, it is therefore of interest to know tight bounds for the smallest achievable star discrepancy
\[ D^*(N,d) := \inf\{ D^*_N(P) \mid P \subset [0,1)^d,\ |P| = N \} \]
or, equivalently, for the inverse of the star discrepancy
\[ N^*(\varepsilon, d) := \inf\{ N \in \mathbb{N} \mid D^*(N,d) \le \varepsilon \}, \]
which is the minimum number of sample points that guarantees a discrepancy bound of at most $\varepsilon$, and to be able to construct integration points that satisfy those bounds. To avoid the "curse of dimensionality" it is crucial that such bounds scale well with respect to the dimension $d$.

For fixed $d$ the best known asymptotic upper bounds for $D^*(N,d)$ are of the form
\[ D^*(N,d) \le C_d \ln(N)^{d-1} N^{-1}, \qquad N \ge 2, \tag{1} \]
see [30] or, e.g., the books [12, 47]. For larger $d$ those bounds give us no helpful information for moderate values of $N$, since the function $f(N) := \ln(N)^{d-1} N^{-1}$ is increasing for $N \le e^{d-1}$. Additionally, for $d \ge 3$ values of $N$ much larger than $e^{d-1}$ are needed before $f(N)$ drops below the common "Monte Carlo rate" $1/\sqrt{N}$; in dimension $d = 10$, e.g., $N$ already has to be astronomically large. Moreover, the constant $C_d$ may grow unfavorably as $d$ gets larger. Actually, it is known for some $N$-point constructions $P$ that the constant $C'_d$ in the representation
\[ D^*_N(P) \le \big( C'_d \ln(N)^{d-1} + o(\ln(N)^{d-1}) \big) N^{-1} \]
of (1) tends to zero as $d$ approaches infinity, see, e.g., [47, 48, 5] or [23]. But the behavior of the "whole constant" $C_d$ in (1) is unfortunately not known.
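The poor pre-asymptotic behaviour of (1) is easy to observe numerically; in the following sketch (function name ours, the dimension-dependent constant $C_d$ is ignored) the profile $f(N) = \ln(N)^{d-1} N^{-1}$ is still growing at $N = 8000$ for $d = 10$, and it dwarfs the Monte Carlo rate $1/\sqrt{N}$ even at $N = 10^{12}$:

```python
import math

def f(N, d):
    """Profile ln(N)^(d-1) / N of the asymptotic bound (1), constant C_d dropped."""
    return math.log(N) ** (d - 1) / N

d = 10
# f is increasing up to N = e^(d-1) ~ 8103, so bound (1) gets worse for a while:
assert f(1000, d) < f(8000, d)
# far beyond e^(d-1) the profile still exceeds the Monte Carlo rate by orders
# of magnitude:
N = 10 ** 12
print(f(N, d), 1 / math.sqrt(N))
assert f(N, d) > 1 / math.sqrt(N)
```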
That is why the asymptotic upper bound (1) is not helpful for high-dimensional integration and we have to look for pre-asymptotic bounds, i.e., bounds that give us useful information already for a moderate number of points $N$.

The best known upper and lower bounds for the smallest achievable star discrepancy with explicitly given dependence on the number of sample points $N$ as well as on the dimension $d$ are of the following form: On the one hand, for all $d, N \in \mathbb{N}$ there exists an $N$-point set $P \subset [0,1)^d$ satisfying
\[ D^*_N(P) \le C \sqrt{d/N} \tag{2} \]
for some constant $C > 0$, implying
\[ N^*(\varepsilon, d) \le \lceil C^2 d \varepsilon^{-2} \rceil \tag{3} \]
for all $d \in \mathbb{N}$, $\varepsilon \in (0,1]$. On the other hand, there exist constants $c, \varepsilon_0 > 0$ such that
\[ N^*(\varepsilon, d) \ge c\, d\, \varepsilon^{-1} \tag{4} \]
for all $0 < \varepsilon \le \varepsilon_0$, $d \in \mathbb{N}$, showing that for all $N$-point sets $P \subset [0,1)^d$ necessarily
\[ D^*_N(P) \ge \min\{ \varepsilon_0,\ c\, d/N \}. \tag{5} \]
Notice that (3) and (4) show that the inverse of the star discrepancy depends essentially linearly on the dimension $d$ (in the sense that we have an upper bound for it that depends linearly on $d$ and also a lower bound that depends linearly on $d$). The upper bounds (2) and (3) were proved in [33] without providing an estimate for the constant $C$. An estimate was given in [1], who showed that $C \le 9.65$. In the course of this paper we will improve this estimate on $C$ substantially, thereby lessening the upper bound for $N^*(\varepsilon, d)$ in arbitrary dimension $d$ by a factor of more than 14, cf. (3). All the results mentioned so far are based on probabilistic arguments and do not provide an explicit (deterministic) point construction that satisfies (2). The lower bounds (4) and (5) were established in [35]. Notice that there is a gap between the upper and lower bounds (2) and (5), and (3) and (4), respectively.

Already 15 years ago Heinrich posed the following problems in [32, Problem 1 & 2]:

(P1) For each $N, d \in \mathbb{N}$ give a (deterministic) construction of an $N$-point set $P \subset [0,1)^d$ satisfying (2) for some positive constant $C$ not depending on $N$ or $d$.

(P2) Does any of the various known (deterministic) constructions of low-discrepancy point sets satisfy an estimate like (2)?

(P3) Determine the order of the smallest possible star discrepancy as a function of the number of points $N$ and the dimension $d$.

(P4) Determine
\[ \alpha := \sup \big\{ \alpha \mid \exists\, c, k \ge 0\ \forall\, N, d \in \mathbb{N}\ \exists\, P \subset [0,1)^d :\ |P| = N \wedge D^*_N(P) \le c\, d^k N^{-\alpha} \big\}. \]
(The so-called exponent of tractability of the star discrepancy $\tau$ is related to $\alpha$ in the following way: $\alpha = 1/\tau$; see, e.g., [33].)

Similar problems were stated in [49, 50] as Open Problems 6, 7, and 42. Problems (P1) to (P4) turned out to be very difficult to solve. It is, for instance, obvious that problem (P3) is a very hard problem, since it contains the so-called great open problem of discrepancy theory to find the precise order of the smallest possible star discrepancy in $N$ for fixed dimension $d \ge 3$. But also the other problems turned out to be very difficult to answer and have not been solved so far. For problem (P4) it is known that $1/2 \le \alpha \le 1$.

Conjecture 1.1 (Woźniakowski). $\alpha = 1/2$.

If this conjecture is true, then, due to the essentially linear dependence of the inverse of the star discrepancy on the dimension $d$, the estimate (2) is actually the best possible bound (apart from logarithmic factors) that is polynomial in $N^{-1}$ as well as in $d$.

One reason for the difficulty of Problems (P1) and (P2) is that already the problem of calculating the star discrepancy of an arbitrary $N$-point set is NP-hard, see [29], and, in the language of parametrized complexity theory, $W[1]$-hard, see [24].

Heinrich also posed weaker versions of Problems (P1) and (P2), cf. [32]. Those weaker versions were at least formally solved with the help of derandomized algorithms that generate deterministic $N$-point sets $P$ that satisfy
\[ D^*_N(P) \le C_1 \sqrt{d/N}\, \sqrt{\ln(1+N)}, \tag{6} \]
see [17, 15, 19], or
\[ D^*_N(P) \le C_2 \sqrt{d/N}\, \sqrt{\ln(1+N/d)}, \tag{7} \]
see [16, 18], for some small constants $C_1$, $C_2$. Those algorithms derandomize probabilistic experiments, in which random point sets $P$ satisfying (6) or (7) with high probability are generated. As numerical experiments in [18, 19] showed, those algorithms work well in dimensions up to $d = 21$, but for much larger dimension their running times are prohibitive.
(For a more extensive discussion, see also [28].)

To get closer to a solution of the problems stated by Heinrich, we propose to study the following related randomized problems:

(R1) What kind of randomized point constructions satisfy (2) in expectation and/or with high probability?

(R2) Are there randomized point constructions that satisfy (2) in expectation and/or with high probability and have more evenly distributed lower-dimensional projections or satisfy better asymptotic discrepancy bounds than Monte Carlo points?

(R3) Are there randomized point constructions that lead to a better estimate than (2) or that can even be used to disprove Woźniakowski's conjecture?

We believe that the problems (R1), (R2), and (R3) are important and interesting in their own right. Moreover, an answer to question (R1) or (R2) would draw us closer to a solution of problem (P1) (since we may derandomize promising randomized point constructions) and of problem (P2) (since we get a hint which known deterministic constructions are worth examining more closely). An affirmative answer to question (R3) may lead, due to the probabilistic method, to progress on problems (P3) and (P4).

Let us explain this a little more. As mentioned, the upper bound (2) was proved via probabilistic arguments. Indeed, Monte Carlo points, i.e., independent random points uniformly distributed in $[0,1)^d$, satisfy this bound with high probability (cf. also Corollary 4.5). In [13] it was shown that the star discrepancy of Monte Carlo point sets $X$ behaves like the right-hand side of (2). More precisely, there exists a constant $K > 0$ such that the expected star discrepancy of $X$ is bounded from below by
\[ \mathbb{E}[D^*_N(X)] \ge K \sqrt{d/N} \tag{8} \]
and additionally we have the probabilistic discrepancy bound
\[ \mathbb{P}\Big( D^*_N(X) < K \sqrt{d/N} \Big) \le \exp(-\Omega(d)). \tag{9} \]
The upper bound (2) is thus sharp for Monte Carlo points, showing that they cannot be employed to improve the upper discrepancy bound (2) or to disprove Woźniakowski's conjecture.
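The $\sqrt{d/N}$ behaviour of the Monte Carlo star discrepancy is already visible in small simulations; a sketch for $d = 2$ (the exact-discrepancy routine enumerates the critical corners; all names and parameter choices are ours):

```python
import bisect
import random

def star_discrepancy_2d(points):
    """Exact star discrepancy in d = 2: enumerate critical corners (x, y)
    with coordinates taken from the point set or equal to 1, and compare
    box volume with the open/closed point counts."""
    n = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    disc = 0.0
    for x in xs:
        y_open = sorted(p[1] for p in points if p[0] < x)
        y_closed = sorted(p[1] for p in points if p[0] <= x)
        for y in ys:
            vol = x * y
            open_cnt = bisect.bisect_left(y_open, y)       # points with p1 < y
            closed_cnt = bisect.bisect_right(y_closed, y)  # points with p1 <= y
            disc = max(disc, vol - open_cnt / n, closed_cnt / n - vol)
    return disc

rng = random.Random(0)
for N in (16, 64, 256):
    pts = [(rng.random(), rng.random()) for _ in range(N)]
    # the scaled value sqrt(N) * D*_N stays of roughly constant order,
    # consistent with D*_N being of order sqrt(d/N)
    print(N, round(N ** 0.5 * star_discrepancy_2d(pts), 3))
```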
Clearly, there are other random point constructions that look more promising than simple Monte Carlo point sets, e.g., Latin hypercube samples, see [46], scrambled $(t,m,s)$-nets, see [54, 56, 55], or randomly shifted lattice rules, see [60]. In general, these random points will not be stochastically independent, which raises the next problem:

(R4) How can one analyze random point constructions whose single points are not stochastically independent?

Obviously, this problem is not only of interest for analyzing the star discrepancy, but for stochastic simulation in general.

In this paper we want to address the problems (R1), (R2), and (R4). We proceed as follows. First we introduce a negative dependence property of random point sets that is based on negative orthant dependence and that we call $\gamma$-negative dependence, see Definitions 2.1 and 3.5. This property allows us to use large deviation bounds of Hoeffding- and Bernstein-type, see Section 2. In Section 3 we prove that the class of $\gamma$-negatively dependent random points contains, in particular, Latin hypercube samples and Latin hypercube samples padded by Monte Carlo in arbitrary dimension $d$, see Theorem 3.6. With Theorem 3.7 we provide a generalization of Theorem 3.6 that may be used to verify $\gamma$-negative dependence for random point sets different from (padded or unpadded) Latin hypercube samples. In Section 4 we show that for each $\gamma = e^{\rho d}$, $\rho \in \mathbb{N}$, all $\gamma$-negatively dependent random point sets $P$ satisfy (2) with high probability and, surprisingly, the constant $C = C(\gamma)$ depends only mildly on $\gamma$, see Theorem 4.4. In particular, our result generalizes the results obtained in [1, 4, 33], since the latter results can be seen as probabilistic discrepancy bounds for Monte Carlo point sets. In Corollary 4.5 we provide discrepancy bounds with explicit constants and explicit success probabilities for the special instances of Monte Carlo point sets and Latin hypercube samples (padded by Monte Carlo or not).
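Latin hypercube samples (defined precisely in Definition 3.1 below) and the negative dependence of their box indicators can be illustrated by a small simulation. The following sketch is ours: it generates an LHS and checks empirically that two of its points fall into an anchored box no more often than two independent uniform points would — an illustration of the dependence notion studied here, not a proof:

```python
import random

def latin_hypercube(N, d, rng):
    """Latin hypercube sample: X[n][j] = (pi_j(n) + U[n][j]) / N with
    independent uniform random permutations pi_j of {0, ..., N-1} and
    independent uniforms U[n][j]."""
    perms = []
    for _ in range(d):
        p = list(range(N))
        rng.shuffle(p)
        perms.append(p)
    return [tuple((perms[j][n] + rng.random()) / N for j in range(d))
            for n in range(N)]

# Empirically compare P(X_1 and X_2 both in [0, a)) with the independent
# case P(U in [0, a))^2; for an LHS the former should not be larger.
rng = random.Random(1)
N, d, a, trials = 8, 2, (0.4, 0.6), 20000
vol = a[0] * a[1]   # each X_n is uniform, so P(X_n in [0, a)) = vol
both = 0
for _ in range(trials):
    X = latin_hypercube(N, d, rng)
    if all(x < y for x, y in zip(X[0], a)) and all(x < y for x, y in zip(X[1], a)):
        both += 1
print(both / trials, vol ** 2)  # empirical value falls below vol**2
```

Each one-dimensional projection of the sample occupies every cell $[m/N, (m+1)/N)$ exactly once, which is the source of the improved uniformity.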
In the special case of Monte Carlo point sets these constants and success probabilities improve substantially on the ones derived in [1] and [4]. In Remark 4.6 we explain that the probabilistic discrepancy bound (2) is actually sharp for Latin hypercube samples; this result follows directly from new lower probabilistic discrepancy bounds in [14].

Let us close the introduction with some remarks on our notation. For $N \in \mathbb{N}$ we denote the set $\{1, 2, \ldots, N\}$ by $[N]$. We denote the Lebesgue measure on $\mathbb{R}$ by $\lambda$. By $\mathbb{P}$, $\mathbb{E}$, and $\mathbb{V}$ we always mean probability, expectation, and variance, respectively. If not specified otherwise, all random variables are defined on a probability space $(\Omega, \Sigma, \mathbb{P})$.

2 Negative Dependence

The concept of negative dependence was introduced in [40] for pairs of random variables. In the literature one finds several contributions on rather demanding notions of negative dependence, e.g., negative association introduced in [38]; a survey can be found in [59]. Sufficient for our purpose is the following notion for Bernoulli or binary random variables, i.e., random variables that only take values in $\{0, 1\}$.

Definition 2.1. Let $\gamma \ge 1$. We call binary random variables $T_1, T_2, \ldots, T_N$ upper $\gamma$-negatively dependent if
\[ \mathbb{P}\Big( \bigwedge_{j \in u} T_j = 1 \Big) \le \gamma \prod_{j \in u} \mathbb{P}(T_j = 1) \quad \text{for all } u \subseteq [N], \tag{10} \]
and lower $\gamma$-negatively dependent if
\[ \mathbb{P}\Big( \bigwedge_{j \in u} T_j = 0 \Big) \le \gamma \prod_{j \in u} \mathbb{P}(T_j = 0) \quad \text{for all } u \subseteq [N]. \tag{11} \]
We call $T_1, T_2, \ldots, T_N$ $\gamma$-negatively dependent if both conditions (10) and (11) are satisfied. If $\gamma = 1$, we usually suppress the explicit reference to $\gamma$.

A similar notion, called $\lambda$-correlation, can be found in [57]; 1-negative dependence is usually called negative orthant dependence, cf. [7]. Notice that, in particular, independent binary random variables are negatively dependent. Furthermore, it is easily seen that for $N = 2$ and $\gamma = 1$ the notions of upper and lower $\gamma$-negative dependence are equivalent, cf. [40].

We are interested in binary random variables $T_i$, $i = 1, \ldots, N$, of the form $T_i = 1_A(X_i)$, where $A$ is a Lebesgue-measurable subset of $[0,1)^d$ (whose characteristic function is denoted by $1_A$), and $X_1, \ldots, X_N$ are randomly chosen points in $[0,1)^d$. Panconesi and Srinivasan derived in [57] Chernoff–Hoeffding-type bounds ([10, 37]) for $\lambda$-correlated random variables. We will use the following two similar bounds of Hoeffding- and of Bernstein-type; for a proof see, e.g., [31].

Theorem 2.2.
Let $\gamma \ge 1$, and let $T_1, \ldots, T_N$ be $\gamma$-negatively dependent binary random variables. Put $S := \sum_{i=1}^{N} (T_i - \mathbb{E}[T_i])$. We have
\[ \mathbb{P}(|S| \ge t) \le 2\gamma \exp\Big( -\frac{2t^2}{N} \Big) \quad \text{for all } t > 0. \tag{12} \]

Theorem 2.3.
Let $\gamma \ge 1$, and let $T_1, \ldots, T_N$ be $\gamma$-negatively dependent binary random variables. Put $S := \sum_{i=1}^{N} (T_i - \mathbb{E}[T_i])$ and $\sigma^2 := \frac{1}{N} \sum_{i=1}^{N} \mathbb{V}[T_i]$. Then we have
\[ \mathbb{P}(|S| \ge t) \le 2\gamma \exp\Big( -\frac{t^2}{2N\sigma^2 + 2t/3} \Big) \quad \text{for all } t > 0. \tag{13} \]

Let us close this section by mentioning a line of research that was initiated in [42] and that is related to the one pursued in this paper. Its main goal is to show that for important classes of random variables the variance of (suitable) randomized quasi-Monte Carlo estimators is never worse than the variance of the plain Monte Carlo estimator. The proof techniques used there are based on a pairwise negative dependence condition and Hoeffding's lemma, see [42]. Further results in this direction are provided in [62, 63, 64].

3 Latin Hypercube Sampling and Padding
To provide useful examples of non-trivial $\gamma$-negatively dependent random variables, we prove in this section a negative dependence result for sample points stemming from a Latin hypercube sample, which may additionally be padded by Monte Carlo. Further examples are provided in the follow-up paper [64].

Definition 3.1. A Latin hypercube sample (LHS) $(X_n)_{n=1}^N$ in $[0,1)^d$ is of the form
\[ X_{n,j} = \frac{\pi_j(n-1) + U_{n,j}}{N}, \]
where $X_{n,j}$ denotes the $j$-th coordinate of $X_n$, $\pi_j$ is a permutation of $0, \ldots, N-1$, uniformly chosen at random, and $U_{n,j}$ is uniformly distributed in $[0,1)$. The $d$ permutations $\pi_j$ and the $dN$ random variables $U_{n,j}$ are mutually independent.

The definition of Latin hypercube sampling presented above was introduced in [46] for the design of computer experiments; there is an older version of Latin hypercube sampling presented in [58], where all uniform random variables $U_{n,j}$ are simply replaced by the value $1/2$. It is known that estimating an integral $I$ by the standard estimator based on a Latin hypercube sample with $N$ points never leads to a variance greater than that of the corresponding estimator based on $N - 1$ Monte Carlo points.

Remark 3.2.
Notice that the one-dimensional projections of Latin hypercube samples are much more evenly distributed than the one-dimensional projections of Monte Carlo points. This observation can be put into a quantitative statement by comparing the star discrepancies of the former and the latter projections: For a one-dimensional Latin hypercube sample of size $N$ the star discrepancy is at most $1/N$, while for one-dimensional Monte Carlo samples of the same size the star discrepancy is of size $1/\sqrt{N}$, cf. (8) and (9).

Definition 3.3.
Let $d, d', d'' \in \mathbb{N}$ with $d = d' + d''$. Let $Y = (Y_k)_{k \in \mathbb{N}}$ be a (deterministic or randomized) sequence in $[0,1)^{d'}$, and let $U = (U_k)_{k \in \mathbb{N}}$ be a sequence of independent uniformly distributed random vectors in $[0,1)^{d''}$. The $d$-dimensional concatenated sequence $X = (X_k)_{k \in \mathbb{N}} = (Y_k, U_k)_{k \in \mathbb{N}}$ is called a mixed sequence. One also says that $X$ results from $Y$ by padding by Monte Carlo.

Padding by Monte Carlo was introduced in [61] to tackle difficult problems in particle transport theory. He suggested using a mixed sequence resulting from padding a deterministic low-discrepancy sequence. Mixed sequences showed a favorable performance in several numerical experiments, see, e.g., [51, 52]. The latter papers also provided theoretical results on probabilistic discrepancy estimates of mixed sequences, which have been improved in [3, 27]. Padding by LHS (instead of by Monte Carlo) was considered earlier in [53, Example 5].

We define the one-dimensional grid $G_N$ by
\[ G_N := \{ 0, 1/N, \ldots, (N-1)/N, 1 \}. \]
The following lemma on one-dimensional LHS is the key ingredient in the proof of our main result on $d$-dimensional LHS (with or without padding by Monte Carlo), Theorem 3.6. Theorem 3.6 combined with Theorem 4.4 immediately implies the discrepancy bounds for LHS (with or without padding by Monte Carlo) in Corollary 4.5.

Lemma 3.4.
Let $(X_n)_{n=1}^N$ be a LHS in $[0,1)$. Let $a, b \in [0,1]$ with $a \le b$.

(a) We have for all $\nu \in \{1, \ldots, N\}$ that
\[ \mathbb{P}\Big( \bigwedge_{\ell=1}^{\nu} X_\ell \in [a,b) \Big) \le (b-a)^{\nu}. \tag{14} \]

(b) Let $I^{(0)}_1 := [a,b)$, $I^{(0)}_2 := [0,b)$ and $I^{(1)}_1 := [b,1)$, $I^{(1)}_2 := [0,a) \cup [b,1)$. Then we have for all $\sigma \in \{0,1\}$, $k \in \{1, \ldots, N\}$, and $\nu \in \{0, 1, \ldots, k\}$ that
\[ \mathbb{P}\Big( \bigwedge_{\ell=1}^{\nu} X_\ell \in I^{(\sigma)}_1 \wedge \bigwedge_{\ell=\nu+1}^{k} X_\ell \in I^{(\sigma)}_2 \Big) \le \delta\, \lambda(I^{(\sigma)}_1)^{\nu}\, \lambda(I^{(\sigma)}_2)^{k-\nu}, \tag{15} \]
where
\[ \delta := \begin{cases} 1 & \text{if } a, b \in G_N \text{ or } a = 0, \\ e & \text{else.} \end{cases} \]
The constant $\delta$ in (15) is optimal in the following sense: for each $\delta < e$ and any $\sigma \in \{0,1\}$ there exist $N \in \mathbb{N}$, $a, b \in [0,1]$, $k \in \{1, \ldots, N\}$, and $\nu \in \{0, 1, \ldots, k\}$ such that (15) does not hold.

Proof. Let $\alpha := \lceil Na \rceil$, $\beta := \lfloor Nb \rfloor$, and let $\varepsilon_a, \varepsilon_b \in [0,1)$ be such that $a = (\alpha - \varepsilon_a)/N$ and $b = (\beta + \varepsilon_b)/N$.

(a) We may assume $\nu \ge 2$ and $N - 1 \ge \beta - \alpha \ge \nu - 2$, since otherwise (14) holds trivially. We first consider the case $\beta - \alpha \le N - 2$. If $\nu$ points fall into $[a,b)$, then one of the three disjoint events occurs that exactly $\nu$, $\nu - 1$, or $\nu - 2$ of them fall into $[\alpha/N, \beta/N)$. Therefore
\begin{align*}
P_\nu &:= \mathbb{P}\Big( \bigwedge_{\ell=1}^{\nu} X_\ell \in [a,b) \Big) \\
&= \frac{(\beta-\alpha)(\beta-\alpha-1) \cdots (\beta-\alpha-\nu+1)}{N(N-1) \cdots (N-\nu+1)}
+ \nu\, \frac{(\beta-\alpha)(\beta-\alpha-1) \cdots (\beta-\alpha-\nu+2)}{N(N-1) \cdots (N-\nu+2)} \cdot \frac{\varepsilon_a + \varepsilon_b}{N-\nu+1} \\
&\quad + \nu(\nu-1)\, \frac{(\beta-\alpha)(\beta-\alpha-1) \cdots (\beta-\alpha-\nu+3)}{N(N-1) \cdots (N-\nu+3)} \cdot \frac{\varepsilon_a}{N-\nu+2} \cdot \frac{\varepsilon_b}{N-\nu+1}.
\end{align*}
We have to verify that $P_\nu$ is at most $(b-a)^{\nu} = \big( (\beta-\alpha+(\varepsilon_a+\varepsilon_b))/N \big)^{\nu}$. Since for fixed sum $\varepsilon_a + \varepsilon_b$ the product $\varepsilon_a \varepsilon_b$ is largest if $\varepsilon_a = \varepsilon_b =: \varepsilon$, we may confine ourselves to the latter case. Put
\[ C := \frac{(\beta-\alpha)(\beta-\alpha-1) \cdots (\beta-\alpha-\nu+3)}{N(N-1) \cdots (N-\nu+1)} \]
and define the function $f_\nu$ by
\[ f_\nu(\varepsilon) := C \big[ (\beta-\alpha-\nu+2)(\beta-\alpha-\nu+1) + (\beta-\alpha-\nu+2)\, 2\nu\varepsilon + \nu(\nu-1)\varepsilon^2 \big] \Big( \frac{N}{\beta-\alpha+2\varepsilon} \Big)^{\nu}; \]
it suffices to show that $|f_\nu(\varepsilon)| \le 1$ for all $\varepsilon \in [0,1)$. The only zeros of $f'_\nu$ are $\varepsilon = 1$ and a further one with $\varepsilon \le 0$ if $\nu > 2$, and only $\varepsilon = 1$ if $\nu = 2$. Hence $f_\nu$ takes its maximum in $[0,1]$ in $0$ or in $1$. Now
\[ f_\nu(0) = \frac{(\beta-\alpha)(\beta-\alpha-1) \cdots (\beta-\alpha-\nu+1)}{N(N-1) \cdots (N-\nu+1)} \Big( \frac{N}{\beta-\alpha} \Big)^{\nu} \le 1, \]
and from
\[ (\beta-\alpha-\nu+2)(\beta-\alpha-\nu+1) + (\beta-\alpha-\nu+2)\, 2\nu + \nu(\nu-1) = (\beta-\alpha+2)(\beta-\alpha+1) \]
and $\beta-\alpha+2 \le N$ we obtain
\[ f_\nu(1) = \frac{(\beta-\alpha+2)(\beta-\alpha+1)(\beta-\alpha) \cdots (\beta-\alpha-\nu+3)}{N(N-1) \cdots (N-\nu+1)} \Big( \frac{N}{\beta-\alpha+2} \Big)^{\nu} \le 1. \]
The remaining (simpler) case $\beta - \alpha = N - 1$ can be treated in the same manner.

(b) We do not have to consider the cases $a = 0$ and $\nu = 0$, since they are already covered by (a). Furthermore, due to (a) it suffices to show that
\[ P^{(\sigma)}_C(k,\nu) := \mathbb{P}\Big( \bigwedge_{\ell=\nu+1}^{k} X_\ell \in I^{(\sigma)}_2 \,\Big|\, \bigwedge_{\ell=1}^{\nu} X_\ell \in I^{(\sigma)}_1 \Big) \le \delta\, \lambda(I^{(\sigma)}_2)^{k-\nu}. \]
Let us first consider $\sigma = 0$: If $b \in G_N$, then $b = \beta/N$ and
\[ P^{(0)}_C(k,\nu) = \frac{\beta-\nu}{N-\nu} \cdot \frac{\beta-(\nu+1)}{N-(\nu+1)} \cdots \frac{\beta-(k-1)}{N-(k-1)} \le \Big( \frac{\beta}{N} \Big)^{k-\nu} = \big( \lambda(I^{(0)}_2) \big)^{k-\nu}. \]
If $b \notin G_N$, then $\beta \le N-1$ and
\begin{align*}
P^{(0)}_C(k,\nu) &\le \mathbb{P}\Big( \bigwedge_{\ell=\nu+1}^{k} X_\ell \in \Big[ 0, \frac{\beta+1}{N} \Big) \,\Big|\, \bigwedge_{\ell=1}^{\nu} X_\ell \in I^{(0)}_1 \Big)
= \frac{\beta+1-\nu}{N-\nu} \cdot \frac{\beta+1-(\nu+1)}{N-(\nu+1)} \cdots \frac{\beta+1-(k-1)}{N-(k-1)} \\
&\le \Big( \frac{\beta}{N-1} \Big)^{k-\nu} \le \Big( \frac{N}{N-1} \Big)^{k-\nu} b^{k-\nu} \le \Big( 1 + \frac{1}{N-1} \Big)^{N-1} b^{k-\nu} \le e\, \big( \lambda(I^{(0)}_2) \big)^{k-\nu}.
\end{align*}
Let us now consider $\sigma = 1$: If $a, b \in G_N$, then we have $a = \alpha/N$, $b = \beta/N$, and it is easily verified that (15) holds with $\delta = 1$. If $a \notin G_N$, then $\alpha \ge 1$ and
\begin{align*}
P^{(1)}_C(k,\nu) &\le \mathbb{P}\Big( \bigwedge_{\ell=\nu+1}^{k} X_\ell \in [0,a) \cup \Big[ \frac{\beta}{N}, 1 \Big) \,\Big|\, \bigwedge_{\ell=1}^{\nu} X_\ell \in I^{(1)}_1 \Big) \\
&= \frac{N-\beta+(\alpha-1)-\nu}{N-\nu} \cdot \frac{N-\beta+(\alpha-1)-(\nu+1)}{N-(\nu+1)} \cdots \frac{N-\beta+(\alpha-1)-(k-1)}{N-(k-1)} \\
&\quad + (k-\nu)\, \frac{N-\beta+(\alpha-1)-\nu}{N-\nu} \cdots \frac{N-\beta+(\alpha-1)-(k-2)}{N-(k-2)} \cdot \frac{1-\varepsilon_a}{N-(k-1)}.
\end{align*}
We want to prove that the last term is less than or equal to
\[ e\, \big( (N-\beta-\varepsilon_b+(\alpha-1)+(1-\varepsilon_a))/N \big)^{k-\nu}. \]
Obviously, it is enough to show this for the case $\varepsilon_b = 1$. Put
\[ C := \frac{N-\beta+(\alpha-1)-\nu}{N-\nu} \cdots \frac{N-\beta+(\alpha-1)-(k-2)}{N-(k-2)} \cdot \frac{1}{N-(k-1)} \]
and define the function $f_{k,\nu}$ by
\[ f_{k,\nu}(\varepsilon) := C\, \big[ N-\beta+(\alpha-1)-(k-1) + (k-\nu)\varepsilon \big] \Big( \frac{N}{N-\beta+(\alpha-1)-1+\varepsilon} \Big)^{k-\nu}, \]
where $\varepsilon$ plays the role of $1 - \varepsilon_a$; it suffices to show $|f_{k,\nu}(\varepsilon)| \le e$ for all $\varepsilon \in [0,1]$. The only zero of $f'_{k,\nu}$ is at least $1$, hence $f_{k,\nu}$ takes its maximum in $[0,1]$ in $0$ or in $1$. Now
\[ f_{k,\nu}(0) = \frac{N-\beta+(\alpha-1)-\nu}{N-\nu} \cdots \frac{N-\beta+(\alpha-1)-(k-1)}{N-(k-1)} \Big( \frac{N}{N-\beta+(\alpha-1)-1} \Big)^{k-\nu} \le \Big( \frac{N}{N-1} \Big)^{k-\nu} \le e, \]
and
\[ f_{k,\nu}(1) = \frac{N-\beta+(\alpha-1)-(\nu-1)}{N-\nu} \cdots \frac{N-\beta+(\alpha-1)-(k-2)}{N-(k-1)} \Big( \frac{N}{N-\beta+(\alpha-1)} \Big)^{k-\nu} \le \Big( \frac{N}{N-1} \Big)^{k-\nu} \le e. \]

We now show that the constant $\delta$ in (15) is optimal: Let $\sigma = 0$ and $\beta = N-1 = \alpha$, $\varepsilon_a = 0$, $\varepsilon_b \in (0,1)$, $k = N$, and $\nu = 1$. For this choice of parameters we get
\[ \mathbb{P}\big( X_\ell \in I^{(0)}_1 \big) = b - a \quad \text{and} \quad P^{(0)}_C(N,1) = 1. \]
Therefore every $\delta$ that satisfies (15) for every choice of $N$ and $\varepsilon_b$ has to fulfill
\[ \delta \ge \Big( \frac{N}{N-1+\varepsilon_b} \Big)^{N-1} \ge \Big( 1 + \frac{1-\varepsilon_b}{N} \Big)^{N-1}; \]
since the last expression converges to $e^{1-\varepsilon_b}$ for $N \to \infty$ and since we can choose $\varepsilon_b$ arbitrarily small, this implies $\delta \ge e$.

In the case $\sigma = 1$ we can consider a corresponding example: Let $\alpha = 1 = \beta$, $\varepsilon_a = 0$, $\varepsilon_b \in (0,1)$, $k = N$, and $\nu = 1$. Again, it is easily verified that (15) cannot hold for $\delta < e$ by choosing $N$ sufficiently large and $1 - \varepsilon_b$ sufficiently small. This concludes the proof of Lemma 3.4.

Definition 3.5.
For $d \in \mathbb{N}$ we put $\mathcal{C}_d := \{ [0,a) \mid a \in [0,1]^d \}$ and $\mathcal{D}_d := \{ B \setminus A \mid A, B \in \mathcal{C}_d \}$. Let $\mathcal{S} \in \{ \mathcal{C}_d, \mathcal{D}_d \}$. We say that the random points $X_1, \ldots, X_N$ in $[0,1)^d$ are $\mathcal{S}$-$\gamma$-negatively dependent if for all $S \in \mathcal{S}$ the random variables $1_S(X_1), \ldots, 1_S(X_N)$ are $\gamma$-negatively dependent.

Theorem 3.6.
Let $d, d', d'' \in \mathbb{N}$ such that $d = d' + d''$. Let $(X_n)_{n=1}^N$ be a LHS in $[0,1)^{d'}$ and $(Y_n)_{n=1}^N$ be independently randomized Monte Carlo points in $[0,1)^{d''}$. For $n = 1, \ldots, N$ put $Z_n := (X_n, Y_n)$. For $a, b \in [0,1]^d$ let $A := [0,a)$, $B := [0,b)$, and $D := B \setminus A$. Then
\[ \text{the random variables } 1_D(Z_1), \ldots, 1_D(Z_N) \text{ are } \gamma_{d'}\text{-negatively dependent}, \tag{16} \]
where
\[ \gamma_{d'} := \prod_{i=1}^{d'} \delta_i \quad \text{and} \quad \delta_i := \begin{cases} 1 & \text{if } a_i, b_i \in G_N \text{ or } a_i = 0, \\ e & \text{else.} \end{cases} \]
In particular, the random points $(Z_n)_{n=1}^N$ are $\mathcal{C}_d$-negatively as well as $\mathcal{D}_d$-$\gamma_{d'}$-negatively dependent.

Proof. Let $U = (u_n)_{n=1}^N$ be a family of uniformly distributed i.i.d. random points in $[0,1)^d$. For $c \in \{0, 1, \ldots, d\}$ we define the random point sets $\widehat{P}(c) = (\widehat{p}_n(c))_{n=1}^N$ by
\[ (\widehat{p}_n(c))_i = \begin{cases} (\pi_i(n-1) + u_{n,i})/N & \text{for } i \le c, \\ u_{n,i} & \text{for } i > c, \end{cases} \]
where $\pi_i$ is a uniformly random permutation of $\{0, \ldots, N-1\}$. Here the permutations $\pi_i$, $i \in [d]$, and the $u_n$, $n \in [N]$, are mutually independent. Notice that $\widehat{P}(c)$ is an MC point set for $c = 0$, a $c$-dimensional LHS padded by MC for $1 \le c < d$, and a $d$-dimensional LHS for $c = d$; in particular, $\widehat{P}(d')$ has the same distribution as $(Z_n)_{n=1}^N$. Put
\[ \gamma(c) := \prod_{i=1}^{c} \delta_i. \]
We first show via induction that for $c = 0, 1, \ldots, d'$
\[ \text{the random variables } 1_D(\widehat{p}_1(c)), \ldots, 1_D(\widehat{p}_N(c)) \text{ are upper } \gamma(c)\text{-negatively dependent}. \tag{17} \]
This is clearly satisfied for $c = 0$, since the random variables $1_D(\widehat{p}_1(0)), \ldots, 1_D(\widehat{p}_N(0))$ are even independent. Now let $c \ge 1$ and assume that (17) holds for $c - 1$. We use the shorthand $\widehat{P} := \widehat{P}(c)$ and $\widetilde{P} := \widehat{P}(c-1)$ and corresponding notation for the random points in both sets. We denote by $P^*_c$ the orthogonal projection onto all coordinates except of the $c$-th coordinate (i.e., for $x \in \mathbb{R}^d$ we have $P^*_c(x) = (x_1, \ldots, x_{c-1}, x_{c+1}, \ldots, x_d)$). Note that $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j)$ for all $j \in [N]$. Furthermore, we put $A_c := [0, a_c)$, $A^*_c := P^*_c(A)$, $B_c := [0, b_c)$, $B^*_c := P^*_c(B)$, and $D_c := B_c \setminus A_c$, $D^*_c := B^*_c \setminus A^*_c$. We have $\widehat{p}_j \in D$ if and only if one of the following two disjoint events occurs:

1. $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j) \in A^*_c$ and $\widehat{p}_{j,c} \in D_c$,
2. $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j) \in D^*_c$ and $\widehat{p}_{j,c} \in B_c$.

Since our random point distribution is symmetric, i.e., our random points are exchangeable, we get for $k \in [N]$
\begin{align*}
\mathbb{P}\Big( \bigwedge_{j=1}^{k} \widehat{p}_j \in D \Big)
&= \sum_{\nu=0}^{k} \binom{k}{\nu} \mathbb{P}\Big( \bigwedge_{j=1}^{k} \widehat{p}_j \in D \wedge \bigwedge_{j=1}^{\nu} P^*_c(\widetilde{p}_j) \in A^*_c \wedge \bigwedge_{j=\nu+1}^{k} P^*_c(\widetilde{p}_j) \in D^*_c \Big) \\
&= \sum_{\nu=0}^{k} \binom{k}{\nu} \mathbb{P}\Big( \bigwedge_{j=1}^{\nu} P^*_c(\widetilde{p}_j) \in A^*_c \wedge \bigwedge_{j=\nu+1}^{k} P^*_c(\widetilde{p}_j) \in D^*_c \Big) \\
&\qquad \times \mathbb{P}\Big( \bigwedge_{j=1}^{\nu} \widehat{p}_{j,c} \in D_c \wedge \bigwedge_{j=\nu+1}^{k} \widehat{p}_{j,c} \in B_c \,\Big|\, \bigwedge_{j=1}^{\nu} P^*_c(\widetilde{p}_j) \in A^*_c \wedge \bigwedge_{j=\nu+1}^{k} P^*_c(\widetilde{p}_j) \in D^*_c \Big). \tag{18}
\end{align*}
Since different components of our random point set $\widehat{P}$ are mutually independent, for fixed $\nu \in \{0, 1, \ldots, k\}$ the conditional probability in (18) is equal to
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{\nu} \widehat{p}_{j,c} \in D_c \wedge \bigwedge_{j=\nu+1}^{k} \widehat{p}_{j,c} \in B_c \Big). \]
Since this observation and (18) hold also for $\widetilde{P}$ if we substitute all occurring points $\widehat{p}_j$ by $\widetilde{p}_j$, we obtain
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{k} \widehat{p}_j \in D \Big) \le \delta_c\, \mathbb{P}\Big( \bigwedge_{j=1}^{k} \widetilde{p}_j \in D \Big) \le \gamma(c)\, \lambda_d(D)^k, \tag{19} \]
where the first inequality follows from
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{\nu} \widehat{p}_{j,c} \in D_c \wedge \bigwedge_{j=\nu+1}^{k} \widehat{p}_{j,c} \in B_c \Big) \le \delta_c (b_c - a_c)^{\nu} b_c^{k-\nu} = \delta_c\, \mathbb{P}\Big( \bigwedge_{j=1}^{\nu} \widetilde{p}_{j,c} \in D_c \wedge \bigwedge_{j=\nu+1}^{k} \widetilde{p}_{j,c} \in B_c \Big), \]
which is valid due to (15) for $\sigma = 0$, and the second inequality follows from our induction hypothesis.

Now we show that $1_D(\widehat{p}_1(c)), \ldots, 1_D(\widehat{p}_N(c))$ are lower $\gamma(c)$-negatively dependent. This holds if and only if
\[ 1_F(\widehat{p}_1(c)), \ldots, 1_F(\widehat{p}_N(c)) \text{ are upper } \gamma(c)\text{-negatively dependent}, \tag{20} \]
where $F := A \cup \big( [0,1)^d \setminus B \big)$. We now verify (20) by induction. Again, the statement is obvious for $c = 0$. So let $c \ge 1$ and assume that (20) holds for $c - 1$. As before we use the notation $\widehat{P}$, $\widetilde{P}$, etc. We have $\widehat{p}_j \in F$ if and only if one of the following three disjoint events occurs:

1. $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j) \in A^*_c$ and $\widehat{p}_{j,c} \in A_c \cup [b_c, 1)$,
2. $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j) \in D^*_c$ and $\widehat{p}_{j,c} \in [b_c, 1)$,
3. $P^*_c(\widehat{p}_j) = P^*_c(\widetilde{p}_j) \in [0,1)^{d-1} \setminus B^*_c$ and $\widehat{p}_{j,c} \in [0,1)$.

For $k \in [N]$ we obtain
\begin{align*}
\mathbb{P}\Big( \bigwedge_{j=1}^{k} \widehat{p}_j \in F \Big)
&= \sum_{0 \le \nu_1 \le \nu_2 \le k} \binom{k}{\nu_1, \nu_2 - \nu_1, k - \nu_2} \\
&\quad \times \mathbb{P}\Big( \bigwedge_{j=1}^{\nu_1} P^*_c(\widetilde{p}_j) \in A^*_c \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} P^*_c(\widetilde{p}_j) \in D^*_c \wedge \bigwedge_{j=\nu_2+1}^{k} P^*_c(\widetilde{p}_j) \in [0,1)^{d-1} \setminus B^*_c \Big) \\
&\quad \times \mathbb{P}\Big( \bigwedge_{j=1}^{\nu_1} \widehat{p}_{j,c} \in A_c \cup [b_c,1) \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} \widehat{p}_{j,c} \in [b_c,1) \wedge \bigwedge_{j=\nu_2+1}^{k} \widehat{p}_{j,c} \in [0,1) \\
&\qquad\qquad \Big|\, \bigwedge_{j=1}^{\nu_1} P^*_c(\widetilde{p}_j) \in A^*_c \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} P^*_c(\widetilde{p}_j) \in D^*_c \wedge \bigwedge_{j=\nu_2+1}^{k} P^*_c(\widetilde{p}_j) \in [0,1)^{d-1} \setminus B^*_c \Big). \tag{21}
\end{align*}
For fixed $\nu_1$, $\nu_2$ the conditional probability appearing in the sum in (21) is equal to
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{\nu_1} \widehat{p}_{j,c} \in A_c \cup [b_c,1) \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} \widehat{p}_{j,c} \in [b_c,1) \Big) \]
(provided that the event on which we condition occurs with positive probability). Since this observation and (21) hold also for $\widetilde{P}$ if we substitute all occurring points $\widehat{p}_j$ by $\widetilde{p}_j$, the inequality
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{k} \widehat{p}_j \in F \Big) \le \delta_c\, \mathbb{P}\Big( \bigwedge_{j=1}^{k} \widetilde{p}_j \in F \Big) \le \gamma(c)\, \lambda_d(F)^k \tag{22} \]
follows from
\[ \mathbb{P}\Big( \bigwedge_{j=1}^{\nu_1} \widehat{p}_{j,c} \in A_c \cup [b_c,1) \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} \widehat{p}_{j,c} \in [b_c,1) \Big) \le \delta_c \big( a_c + (1-b_c) \big)^{\nu_1} (1-b_c)^{\nu_2-\nu_1} = \delta_c\, \mathbb{P}\Big( \bigwedge_{j=1}^{\nu_1} \widetilde{p}_{j,c} \in A_c \cup [b_c,1) \wedge \bigwedge_{j=\nu_1+1}^{\nu_2} \widetilde{p}_{j,c} \in [b_c,1) \Big), \]
which holds true due to (15) for $\sigma = 1$, and our induction hypothesis. This concludes the proof of Theorem 3.6.

Studying the proof above it is easy to see that the following generalization of Theorem 3.6 is valid.

Theorem 3.7.
Let $a, b \in [0,1]^d$ and put $A := [0,a)$, $B := [0,b)$, and $D := B \setminus A$. Let $(Z_n)_{n=1}^N$ be a set of (not necessarily independent) random points in $[0,1)^d$ that satisfies the following two conditions:

(i) Different components of the random point set are mutually independent.

(ii) For all $i \in [d]$ and for $I^{(0)}_{1,i} := [a_i, b_i)$, $I^{(0)}_{2,i} := [0, b_i)$ and $I^{(1)}_{1,i} := [b_i, 1)$, $I^{(1)}_{2,i} := [0, a_i) \cup [b_i, 1)$ there exists a $\delta_i > 0$ such that for all $\sigma \in \{0,1\}$, $k \in \{1, \ldots, N\}$, $\nu \in \{0, 1, \ldots, k\}$, and all $J \subseteq [N]$, $J_\nu, J_{k-\nu} \subseteq J$ with $|J| = k$, $|J_\nu| = \nu$, $|J_{k-\nu}| = k - \nu$ and $J_\nu \cap J_{k-\nu} = \emptyset$, one has
\[ \mathbb{P}\Big( \bigwedge_{\ell \in J_\nu} Z_{\ell,i} \in I^{(\sigma)}_{1,i} \wedge \bigwedge_{\ell \in J_{k-\nu}} Z_{\ell,i} \in I^{(\sigma)}_{2,i} \Big) \le \delta_i\, \lambda(I^{(\sigma)}_{1,i})^{\nu}\, \lambda(I^{(\sigma)}_{2,i})^{k-\nu}. \tag{23} \]

Then the random variables $1_D(Z_1), \ldots, 1_D(Z_N)$ are $\gamma_d$-negatively dependent, where
\[ \gamma_d := \prod_{i=1}^{d} \delta_i. \tag{24} \]

4 Probabilistic Discrepancy Bounds
Now we consider the star discrepancy $D^*_N(X)$ (as defined in the introduction) of $\mathcal{D}_d$-$\gamma$-negatively dependent random points $X = (X_n)_{n=1}^N$ (cf. Definition 3.5). To "discretize" the star discrepancy, we define $\delta$-covers as in [17]:

Definition 4.1.
For any $\delta \in (0,1]$ a finite set $\Gamma$ of points in $[0,1]^d$ is called a $\delta$-cover of $[0,1]^d$ if for every $y \in [0,1]^d$ there exist $x, z \in \Gamma \cup \{0\}$ such that $x \le y \le z$ (componentwise) and $\lambda_d([0,z]) - \lambda_d([0,x]) \le \delta$. The number $\mathcal{N}(d, \delta)$ denotes the smallest cardinality of a $\delta$-cover of $[0,1]^d$.

The following theorem was stated and proved in [26].

Theorem 4.2. [26, Thm. 1.15] For any $d \ge 1$ and $\delta \in (0,1]$ we have
\[ \mathcal{N}(d, \delta) \le 2^d\, \frac{d^d}{d!}\, (\delta^{-1} + 1)^d. \]

Notice that due to Stirling's formula we have $d^d/d! \le e^d/\sqrt{2\pi d}$. Furthermore, it is easy to verify that in the case $d = 1$ the identity
\[ \mathcal{N}(1, \delta) = \lceil \delta^{-1} \rceil \tag{25} \]
is established with the help of the $\delta$-cover $\Gamma := \{ 1/\lceil \delta^{-1} \rceil, 2/\lceil \delta^{-1} \rceil, \ldots, 1 \}$.

With the help of $\delta$-covers the star discrepancy can be approximated in the following sense.

Lemma 4.3.
Let $P \subset [0,1)^d$ be an $N$-point set, $\delta > 0$, and let $\Gamma$ be a $\delta$-cover of $[0,1]^d$. Then
$$D^*_N(P) \le \max_{x \in \Gamma} D_N(P, [0, x)) + \delta.$$

The proof of Lemma 4.3 is straightforward, cf., e.g., [17, Lemma 3.1]. We are ready to state and prove our main result, a general probabilistic discrepancy bound for $D_d$-$\gamma$-negatively dependent random points.

Theorem 4.4.
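Before the theorem, a small illustration of Lemma 4.3: replacing the supremum over all anchored boxes by a finite maximum over a $\delta$-cover costs at most $\delta$. The sketch below uses the simple (far from minimal) product cover built from equidistant 1-D grids of mesh at most $\delta/d$, so that $\lambda_d([0,z]) - \lambda_d([0,x]) \le \sum_i (z_i - x_i) \le \delta$; all names are ours:

```python
import itertools
import math

def local_discrepancy(P, x):
    """D_N(P, [0, x)) = | #{p in P : p_i < x_i for all i} / N - vol([0, x)) |."""
    inside = sum(all(p_i < x_i for p_i, x_i in zip(p, x)) for p in P)
    return abs(inside / len(P) - math.prod(x))

def star_discrepancy_upper_bound(P, delta):
    """Lemma 4.3: D*_N(P) <= max_{x in Gamma} D_N(P, [0, x)) + delta,
    with Gamma the product of 1-D grids of mesh 1/m <= delta/d (a valid,
    non-minimal delta-cover of [0, 1]^d)."""
    d = len(P[0])
    m = math.ceil(d / delta)
    grid = [i / m for i in range(1, m + 1)]
    return max(local_discrepancy(P, x)
               for x in itertools.product(grid, repeat=d)) + delta

P = [(0.25,), (0.75,)]    # d = 1; the true star discrepancy of P is 0.25
assert star_discrepancy_upper_bound(P, 0.5) == 0.5   # finite max 0, plus delta
```

The cost of the product cover is exponential in $d$; the whole point of Theorem 4.2 and the dyadic chaining below is to exploit much smaller covers.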
Let $d, N \in \mathbb{N}$ and $\rho \in [0, \infty)$. Let $X = (X_n)_{n \in [N]}$ be a set of $D_d$-$e^{\rho d}$-negatively dependent random points in $[0,1)^d$ such that each $X_n$ is uniformly distributed. Then for every $c > 0$,
$$D^*_N(X) \le c\, \sqrt{\frac{d}{N}} \quad (26)$$
holds with probability at least $1 - e^{-(1.\,\cdot\, c^2 \,-\, .\, -\, \rho)\cdot d}$, implying that for every $q \in (0, 1)$,
$$D^*_N(X) \le .\, \sqrt{\frac{.\, \rho + \ln\big((1-q)^{-1}\big)}{d}}\, \sqrt{\frac{d}{N}} \quad (27)$$
holds with probability at least $q$.

For Monte Carlo points $X_1, \ldots, X_N$ in Theorem 4.4 we have $\rho = 0$ and our bound (27) improves on the main result of [4], that is, Theorem 1 in that paper. This is further illustrated by the next corollary. It follows immediately from Theorem 4.4, since we may take $\rho = 0$ for Monte Carlo points and $\rho = 1$ for Latin hypercube samples (padded by Monte Carlo or not), see Theorem 3.6. Notice in particular the quantitative improvements in the constant in (28) compared to the constant 9.

Corollary 4.5.
Let $d, N \in \mathbb{N}$ and let $X = (X_n)_{n \in [N]}$ be a random point set in $[0,1)^d$.

1. If $X$ is a Monte Carlo point set, then there exists a realization $P \subset [0,1)^d$ of $X$ such that
$$D^*_N(P) \le .\,\cdot \sqrt{\frac{d}{N}}. \quad (28)$$
The probability that $X = (X_n)_{n \in [N]}$ satisfies
$$D^*_N(X) \le .\,\cdot \sqrt{\frac{d}{N}} \qquad \text{and} \qquad D^*_N(X) \le .\,\cdot \sqrt{\frac{d}{N}} \quad (29)$$
is at least $.$ and $.$, respectively.

2. If $X$ is a Latin hypercube sample or a Latin hypercube sample padded by Monte Carlo, then there exists a realization $P \subset [0,1)^d$ of $X$ such that
$$D^*_N(P) \le .\,\cdot \sqrt{\frac{d}{N}}. \quad (30)$$
The probability that $X = (X_n)_{n \in [N]}$ satisfies
$$D^*_N(X) \le .\,\cdot \sqrt{\frac{d}{N}} \qquad \text{and} \qquad D^*_N(X) \le .\,\cdot \sqrt{\frac{d}{N}} \quad (31)$$
is at least $.$ and $.$, respectively.

Estimate (28) implies
$$N^*(\varepsilon, d) \le \big\lceil .\,\cdot\, d\, \varepsilon^{-2} \big\rceil \qquad \text{for all } \varepsilon > 0. \quad (32)$$
We may use Theorem 4.4 to prove similar corollaries for random samples other than Monte Carlo point sets or (padded) Latin hypercube samples; cf. also [64].

Remark 4.6.
We already mentioned in the introduction that the probabilistic discrepancy bound (26) is sharp for Monte Carlo point sets. As shown in [14], the same is the case for Latin hypercube samples: There exists a constant $K > 0$ such that for all $d \ge 1$ and all $N \ge d$ the expected star discrepancy of a Latin hypercube sample $X$ is bounded from below by
$$E[D^*_N(X)] \ge K \sqrt{\frac{d}{N}} \quad (33)$$
and additionally we have the probabilistic discrepancy bound
$$P\left(D^*_N(X) < K \sqrt{\frac{d}{N}}\right) \le \exp(-\Omega(d)), \quad (34)$$
see [14, Theorem 2]. Nevertheless, recall that Latin hypercube samples have a big advantage over Monte Carlo samples, namely that their one-dimensional projections are more evenly distributed, cf. Remark 3.2.

Proof of Theorem 4.4.
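Before the proof, a brief illustration of the stratification property just recalled: a Latin hypercube sample places exactly one point in each of the $N$ one-dimensional strata of every coordinate. A standard construction in the sense of McKay, Beckman, and Conover [46] (the function name is ours):

```python
import random

def latin_hypercube(N, d, rng):
    """Latin hypercube sample in [0, 1)^d: in every coordinate, each of the N
    strata [j/N, (j+1)/N), j = 0, ..., N-1, receives exactly one point."""
    columns = []
    for _ in range(d):
        perm = list(range(N))   # assign the N strata of this coordinate ...
        rng.shuffle(perm)       # ... to the N points in random order
        columns.append([(perm[n] + rng.random()) / N for n in range(N)])
    return [tuple(col[n] for col in columns) for n in range(N)]

pts = latin_hypercube(8, 3, random.Random(1))
for i in range(3):
    # every one-dimensional projection hits each of the 8 strata exactly once
    assert sorted(int(p[i] * 8) for p in pts) == list(range(8))
```

Each $X_n$ produced this way is uniformly distributed in $[0,1)^d$, while the points are not independent; this is precisely the situation covered by Theorem 4.4 with $\rho = 1$.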
We adapt the line of proof of [1, Theorem 1] and employ the bounds on the size of minimal $\delta$-covers from Theorem 4.2, dyadic chaining, and large deviation bounds of Hoeffding- and Bernstein-type (but this time the ones for sums of $\gamma$-negatively dependent random variables in Theorems 2.2 and 2.3). For $a, b \in [0,1]^d$ with $a \le b$ we write $\Delta(a, b) := [0, b) \setminus [0, a)$. We start by putting $\mu := 13$ and
$$c_\mu := \sum_{\ell = 0}^{\infty} \left( \sqrt{\frac{2\mu + 1}{2^\mu}} \right)^{\ell} = \frac{1}{1 - \sqrt{\frac{2\mu + 1}{2^\mu}}}.$$
Let $c > 0$; the remaining constants will be determined later in the proof. Let $K$ be the smallest natural number that satisfies $K \ge \mu$ and
$$\frac{1}{\sqrt{K}\, 2^K} \le \frac{c}{c_\mu} \sqrt{\frac{d}{N}}. \quad (35)$$
We choose for each $\mu \le k \le K$ a $2^{-k}$-cover $\Gamma_k$ of minimum size. Furthermore, we put $\Gamma_{\mu-1} := \{0\}$. Let $P$ be an arbitrary realization of $X$. Due to Lemma 4.3 we can choose for each test box $[0, y) \subseteq [0,1)^d$ an $a_K \in \Gamma_K \cup \{0\}$ such that
$$D_N(P, [0, y)) \le D_N(P, [0, a_K)) + 2^{-K}.$$
If $K > \mu$, we additionally choose for $k = K-1, \ldots, \mu$ points $a_k = a_k(a_{k+1}) \in \Gamma_k \cup \{0\}$ recursively, depending only on the previously chosen point $a_{k+1}$, such that $a_k \le a_{k+1}$ and
$$\lambda_d(\Delta(a_k, a_{k+1})) \le 2^{-k}. \quad (36)$$
Finally, we put $a_{\mu-1} = a_{\mu-1}(a_\mu) := 0$ and get $\Delta(a_{\mu-1}, a_\mu) = [0, a_\mu)$. Notice that $[0, a_K) = \cup_{k=\mu}^{K} \Delta(a_{k-1}, a_k)$. Hence we have
$$D_N(P, [0, y)) \le \sum_{k=\mu}^{K} D_N(P, \Delta(a_{k-1}, a_k)) + 2^{-K}. \quad (37)$$
Let us now for $\mu \le k \le K$ define the sets $\mathcal{A}_k$ by $\mathcal{A}_k := \{\Delta(a_{k-1}(b), b) \mid b \in \Gamma_k\}$. Clearly, $|\mathcal{A}_k| \le |\Gamma_k|$. Moreover, we define events $E_k$ by
$$E_\mu := \left\{ \max_{\Delta_\mu \in \mathcal{A}_\mu} D_N(X, \Delta_\mu) \le c \sqrt{\frac{d}{N}} \right\}$$
for $k = \mu$ and
$$E_k := \left\{ \max_{\Delta_k \in \mathcal{A}_k} D_N(X, \Delta_k) \le c\, \sqrt{\frac{2k+1}{2^{k-1}}}\, \sqrt{\frac{d}{N}} \right\}$$
for $\mu + 1 \le k \le K$. We put $E := \bigcap_{k=\mu}^{K} E_k$. Let us assume that the event $E$ has occurred. Then, due to (37) and (35), the realization $P$ of $X$ satisfies for an arbitrary test box $[0, y) \subseteq [0,1)^d$ D N ( P, [0 , y )) ≤ c c K X k = µ +1 r k − k − ! r dN + 2 − K ≤ c c r µ µ K − µ − X j =0 s µ + j j µ + c µ r K K !! 
r dN ≤ c c r µ µ K − µ − X j =0 (cid:18)r µ + 12 µ (cid:19) j + s µ + ( K − µ ) µ K − µ c µ !! r dN ≤ c c r µ µ K − µ − X j =0 (cid:18)r µ + 12 µ (cid:19) j + (cid:18)r µ + 12 µ (cid:19) K − µ c µ !! r dN = c (cid:18) c c µ r µ µ (cid:19) r dN . Let us now derive a lower bound for the probability P ( E ). For k = µ we may useTheorem 2.2 and a simple union bound to obtain P ( E cµ ) ≤ | Γ µ | e ρd e − c d . (38)For µ + 1 ≤ k ≤ K we first use the definition of K , cf. (35), to get the estimate p ( k − − k ≤ p ( K − K − − k < c c c µ r dN ! − − k . P ( E ck ) ≤ | Γ k | e ρd exp (cid:18) − c c ( k − d / c µ ) (cid:19) . (39)We have P ( E ) = 1 − P ( E c ) ≥ − K X k = µ P ( E ck ) . We put τ µ := c / c µ )and σ := µ − ln(2(2 µ + 1)) −
1. From now on we assume that c ≥ p ( µ + ρ − σ ) / . Let us first consider the case d = 1. Due to (25) we have | Γ k | = 2 k for k = µ, . . . , K .Hence we get from (38) and (39) for k = µ P ( E cµ ) ≤ exp (cid:0) − c + ρ + ( µ + 1) ln(2) (cid:1) and for µ + 1 ≤ k ≤ K P ( E ck ) ≤ exp (cid:0) − (2 c τ µ − ln(2))( k −
1) + ρ + 2 ln(2) (cid:1) . Hence we obtain P ( E ) ≥ − e − (2 c − ρ − ( µ +1) ln(2)) + e ρ +2 ln(2) K X k = µ +1 e − ( c τ µ − ln(2) ) ( k − ! =1 − e − (2 c − ρ − ( µ +1) ln(2)) + e ρ +2 ln(2) e − ( c τ µ − ln(2) ) µ K − µ − X j =0 e − ( c τ µ − ln(2) ) j ! =1 − e − (2 c − ρ − ( µ +1) ln(2)) e − c ( µτ µ − − e − ( c τ µ − ln(2) ) ! ≥ − e − (2 c − ρ − ( µ +1) ln(2)) (cid:18) e − ( µ + ρ − σ )( µτ µ − − e − ( µ + ρ − σ ) τ µ +ln(2) (cid:19) . Choosing τ µ = 0 , ρ ≥
01 + e − ( µ + ρ − σ )( µτ µ − − e − ( µ + ρ − σ ) τ µ +ln(2) < e. Since e − (2 c − ρ − ( µ +1) ln(2)) < e e − (2 c − ρ − µ + σ ) , c = p ( µ + ρ − σ ) / P ( E ) > d ≥
2. Due to Theorem 4.2 we have | Γ k | ≤ √ πd (2 e ) d (2 k + 1) d for k = µ, . . . , K .Hence we get from (38) and (39) for k = µ P ( E cµ ) ≤ r πd e − (2 c − µ − ρ + σ ) d and for µ + 1 ≤ k ≤ K P ( E ck ) ≤ r πd (2 e ) d kd (1 + 2 − k ) d e ρd exp (cid:18) − c c ( k − d / c µ ) (cid:19) ≤ r πd e (1+2 ln(2)+ ϑ + ρ ) d exp (cid:0) − (2 c τ µ − ln(2))( k − d (cid:1) , where ϑ := ln(1 + 2 − µ − ). Put ζ := 2 ln(2) + ϑ . Then we obtain P ( E ) ≥ − r πd (cid:18) e − (2 c − µ − ρ + σ ) d + e (1+ ζ + ρ ) d e − ( c τ µ − ln(2) ) µd K − µ − X j =0 e − ( c τ µ − ln(2) ) jd (cid:19) ≥ − r πd e − (2 c − µ − ρ + σ ) d e − (2 c ( µτ µ − − ln(2)) µ − − ζ − σ ) d − e − (2 c τ µ − ln(2)) d ! ≥ − r πd e − (2 c − µ − ρ + σ ) d (cid:18) e − (( µ + ρ − σ )( µτ µ − − ln(2)) µ − − ζ − σ ) d − e − (( µ + ρ − σ ) τ µ − ln(2)) d (cid:19) . Choosing τ µ = 0 , d = 1, we get1 + e − (( µ + ρ − σ )( µτ µ − − ln(2)) µ − − ζ − σ ) d − e − (( µ + ρ − σ ) τ µ − ln(2)) d < r πd d = 2 and – since the left hand side of inequality (40)is monotonic decreasing in d , while the right hand side is monotonic increasing – holdstherefore for all d ≥
2. Hence the choice c = p ( µ + ρ − σ ) / P ( E ) > Acknowledgment
The authors thank Marcin Wnuk and two anonymous referees for valuable comments. Part of the work of Michael Gnewuch was done while he was a research fellow and a visitor at the School of Mathematics and Statistics of the University of New South Wales in Sydney and a "Chercheur Invité" at the Laboratoire d'Informatique (LIX) of École Polytechnique in Paris, France. He acknowledges support from the Australian Research Council ARC and thanks his hosts Josef Dick, Frances Y. Kuo, Ian H. Sloan, and Benjamin Doerr for their hospitality. Another part of his work was done while he visited special semesters and programs at the Radon Institute for Computational and Applied Mathematics (RICAM) in Linz, Austria, the Institute for Computational and Experimental Research in Mathematics (ICERM) of Brown University in Providence, USA, and the Erwin Schrödinger Institute (ESI) in Vienna, Austria.
References [1] C. Aistleitner. Covering numbers, dyadic chaining and discrepancy.
J. Complexity , 27:531–540,2011.[2] C. Aistleitner and J. Dick. Functions of bounded variation, signed measures, and a generalKoksma–Hlawka inequality.
Acta Arith. , 167:143–171, 2015.[3] C. Aistleitner and M. T. Hofer. Probabilistic error bounds for the discrepancy of mixed sequences.
Monte Carlo Methods Appl., 18, 2012. [4] C. Aistleitner and M. T. Hofer. Probabilistic discrepancy bound for Monte Carlo point sets.
Math. Comp., 83:1373–1381, 2014. [5] E. I. Atanassov. On the discrepancy of the Halton sequences.
Math. Balkanica (N. S.) , 18:15–32,2004.[6] H. Avron, V. Sindhwani, J. Yang, and M. W. Mahoney. Quasi-Monte Carlo feature maps forshift-invariant kernels.
J. Machine Learning Res. , 17:1–38, 2016.[7] H. W. Block, T. H. Savits, and M. Shaked. Some concepts of negative dependence.
Ann. Probab. ,10, 1982.[8] C. Cervellera and D. Macci´o. Local linear regression for function learning: an analysis based onsample discrepancy.
IEEE Transactions on Neural Networks and Learning Systems , 25:2086–2098,2014.[9] B. Chazelle.
The Discrepancy Method . Cambridge University Press, Cambridge, 2000.[10] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum ofobservations.
Ann. Math. Statist. , 23, 1952.[11] J. Dick, F. Y. Kuo, and I. H. Sloan. High dimensional integration – the quasi-Monte Carlo way.
Acta Numerica , 22:133–288, 2013.[12] J. Dick and F. Pillichshammer.
Digital Nets and Sequences . Cambridge University Press, Cambridge,2010.[13] B. Doerr. A lower bound for the discrepancy of a random point set.
J. Complexity , 30:16–20, 2014.
[14] B. Doerr, C. Doerr, and M. Gnewuch. Probabilistic lower discrepancy bounds for Latin hypercube samples. In J. Dick, F. Y. Kuo, and H. Woźniakowski, editors,
Contemporary ComputationalMathematics – a Celebration of the 80th Birthday of Ian Sloan , pages 339–350. Springer-Verlag,2018.[15] B. Doerr and M. Gnewuch. Construction of low-discrepancy point sets of small size by bracketingcovers and dependent randomized rounding. In A. Keller, S. Heinrich, and H. Niederreiter, edi-tors,
Monte Carlo and Quasi-Monte Carlo Methods 2006 , pages 299–312, Berlin Heidelberg, 2008.Springer-Verlag.[16] B. Doerr, M. Gnewuch, P. Kritzer, and F. Pillichshammer. Component-by-component constructionof low-discrepancy point sets of small size.
Monte Carlo Methods Appl. , 14:129–149, 2008.[17] B. Doerr, M. Gnewuch, and A. Srivastav. Bounds and constructions for the star discrepancy via δ -covers. J. Complexity , 21:691–709, 2005.[18] B. Doerr, M. Gnewuch, and M. Wahlstr¨om. Implementation of a component-by-component algo-rithm to generate low-discrepancy samples. In P. L’Ecuyer and A. B. Owen, editors,
Monte Carloand Quasi-Monte Carlo Methods 2008 , Berlin Heidelberg, 2009. Springer.[19] B. Doerr, M. Gnewuch, and M. Wahlstr¨om. Algorithmic construction of low-discrepancy point setsvia dependent randomized rounding.
J. Complexity , 26:490–507, 2010.[20] C. Doerr, M. Gnewuch, and M. Wahlstr¨om. Calculation of discrepancy measures and applications.In W. W. Chen, A. Srivastav, and G. Travaglini, editors,
Panorama of Discrepancy Theory, LectureNotes in Mathematics 2107 , pages 621–678. Springer, 2014.[21] M. Drmota and R. F. Tichy.
Sequences, Discrepancies and Applications , volume 1651 of
LectureNotes in Mathematics . Springer, Berlin and Heidelberg, 1997.[22] K.-T. Fang and Y. Wang.
Number-Theoretic Methods in Statistics . Chapman and Hall/CRC, 1993.[23] H. Faure and C. Lemieux. Generalized Halton sequences in 2008: A comparative study.
ACMTrans. Model. Comput. Simul. , 19:15: 1–31, 2009.[24] P. Giannopoulus, C. Knauer, M. Wahlstr¨om, and D. Werner. Hardness of discrepancy computationand epsilon-net verification in high dimensions.
J. Complexity , 28:162–176, 2012.[25] P. Glasserman.
Monte Carlo Methods in Financial Engineering . Springer-Verlag, New York, 2004.[26] M. Gnewuch. Bracketing numbers for axis-parallel boxes and applications to geometric discrepancy.
J. Complexity , 24:154–172, 2008.[27] M. Gnewuch. On probabilistic results for the discrepancy of a hybrid-Monte Carlo sequence.
J.Complexity , 25:312–317, 2009.[28] M. Gnewuch. Entropy, randomization, derandomization, and discrepancy. In L. Plaskota andH. Wo´zniakowski, editors,
Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing ,pages 43–78. Springer, 2012.[29] M. Gnewuch, A. Srivastav, and C. Winzen. Finding optimal volume subintervals with k points andcalculating the star discrepancy are NP-hard problems. J. Complexity , 25:115–127, 2009.[30] J. H. Halton. On the efficiency of certain quasi-random sequences of points in evaluating multidi-mensional integrals.
Numer. Math. , 2:84–90, 1960.
[31] N. Hebbinghaus. Mixed sequences and application to multilevel algorithms. Master's thesis, Christ Church, University of Oxford, 2012. [32] S. Heinrich. Some open problems concerning the star-discrepancy.
J. Complexity , 19:416–419, 2003.[33] S. Heinrich, E. Novak, G. W. Wasilkowski, and H. Wo´zniakowski. The inverse of the star-discrepancydepends linearly on the dimension.
Acta Arith. , 96:279–302, 2001.[34] F. J. Hickernell, I. H. Sloan, and G. W. Wasilkowski. On tractability of weighted integration overbounded and unbounded regions in R s . Math. Comp. , 73:1885–1901, 2004.[35] A. Hinrichs. Covering numbers, Vapnik- ˇCervonenkis classes and bounds for the star-discrepancy.
J. Complexity , 20:477–483, 2004.[36] E. Hlawka. Funktionen von beschr¨ankter Variation in der Theorie der Gleichverteilung.
Ann. Mat.Pura Appl. , 54:325–333, 1961.[37] W. Hoeffding. Probability inequalities for sums of bounded random variables.
Amer. Statist. Assoc.J. , 58, 1963.[38] K. Joag-Dev and F. Proschan. Negative association of random variables, with applications.
Ann.Statist. , 11, 1983.[39] J. F. Koksma. Een algemeene stelling uit de theorie der gelijkmatige verdeeling modulo 1.
Mathe-matica B (Zutphen) , 11:7–11, 1942/3.[40] E. Lehmann. Some concepts of dependence.
Ann. Math. Statist. , 37, 1966.[41] C. Lemieux.
Monte Carlo and Quasi-Monte Carlo Sampling . Springer, New York, 2009.[42] C. Lemieux. Negative dependence, scrambled nets, and variance bounds.
Math. Oper. Res. , 43:228–251, 2018.[43] G. Leobacher and F. Pillichshammer.
Introduction to Quasi-Monte Carlo Integration and Applications. Birkhäuser, Basel, 2014. [44] J. Matoušek. On the $L_2$-discrepancy for anchored boxes. J. Complexity, 14:527–556, 1998. [45] J. Matoušek.
Geometric Discrepancy . Springer-Verlag Berlin Heidelberg, 2010.[46] M. D. McKay, R. J. Beckman, and W. J. Conover. A comparison of three methods for selectingvalues of input variables in the analysis of output from a computer code.
Technometrics , 21:239–245,1979.[47] H. Niederreiter.
Random Number Generation and Quasi-Monte Carlo Methods , volume 63 of
CBMS-NSF Regional Conference Series in Applied Mathematics . Society for Industrial and Applied Math-ematics (SIAM), Philadelphia, 1992.[48] H. Niederreiter and C. P. Xing. Low-discrepancy bounds and global function fields with manyrational places.
Finite Fields Appl. , 2:241– 273, 1996.[49] E. Novak and H. Wo´zniakowski.
Tractability of Multivariate Problems. Vol. 1: Linear Information .EMS Tracts in Mathematics. European Mathematical Society (EMS), Z¨urich, 2008.[50] E. Novak and H. Wo´zniakowski.
Tractability of Multivariate Problems. Vol. 2: Standard Informationfor Functionals . EMS Tracts in Mathematics. European Mathematical Society (EMS), Z¨urich, 2010.
[51] G. Ökten. A probabilistic result on the discrepancy of a hybrid-Monte Carlo sequence and applications.
Monte Carlo Methods Appl. , 2, 1996.[52] G. ¨Okten, B. Tuffin, and V. Burago. A central limit theorem and improved error bounds for ahybrid-Monte Carlo sequence with applications in computational finance.
J. Complexity , 22:435–458, 2006.[53] A. B. Owen. Lattice sampling revisited: Monte Carlo variance of means over randomized orthogonalarrays.
Ann. Statist. , 22, 1994.[54] A. B. Owen. Randomly permuted ( t, m, s )-nets and ( t, s )-sequences. In H. Niederreiter and P. J.-S. Shiue, editors,
Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing , pages299–317, New York, 1995. Springer.[55] A. B. Owen. Monte Carlo variance of scrambled net quadrature.
SIAM J. Numer. Anal. , 34, 1997.[56] A. B. Owen. Scrambled net variance for integrals of smooth functions.
Ann. Statist., 25, 1997. [57] A. Panconesi and A. Srinivasan. Randomized distributed edge coloring via an extension of the Chernoff–Hoeffding bounds.
SIAM J. Comput. , 26, 1997.[58] H. D. Patterson. The errors of lattice sampling.
J. Royal Statistical Society, Series B , 16:140–149,1954.[59] R. Pemantle. Towards a theory of negative dependence.
Journal of Math. Physics, 41, 2000. [60] I. H. Sloan, F. Y. Kuo, and S. Joe. On the step-by-step construction of quasi-Monte Carlo integration rules that achieve strong tractability error bounds in weighted Sobolev spaces.
Math. Comp. ,71:1609–1640, 2002.[61] J. Spanier. Quasi-Monte Carlo methods for particle transport problems. In H. Niederreiter andP. J.-S. Shiue, editors,
Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing , pages121–148, Berlin, 1995. Springer-Verlag.[62] J. Wiart, C. Lemieux, and G. Y. Dong. On the dependence structure of scrambled ( t, m, s )-nets,2019. preprint, arXiv:1903.09877v3.[63] M. Wnuk and M. Gnewuch. Note on pairwise negative dependence of randomly shifted and jitteredrank-1 lattices.
Operations Research Letters , 48:410–414, 2020. (arXiv:1903.02261v2).[64] M. Wnuk, M. Gnewuch, and N. Hebbinghaus. On negatively dependent sampling schemes, vari-ance reduction, and probabilistic upper discrepancy bounds. In D. Bilyk, J. Dick, and F. Pil-lichshammer, editors,
Discrepancy Theory, pages 43–68. De Gruyter, Berlin/Munich/Boston, 2020. (arXiv:1904.10796v3).