Double-bootstrap methods that use a single double-bootstrap simulation
Jinyuan Chang and Peter Hall

Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia

[email protected]   [email protected]
Abstract
We show that, when the double bootstrap is used to improve performance of bootstrap methods for bias correction, techniques based on using a single double-bootstrap sample for each single-bootstrap sample can be particularly effective. In particular, they produce third-order accuracy for much less computational expense than is required by conventional double-bootstrap methods. However, this improved level of performance is not available for the single double-bootstrap methods that have been suggested to construct confidence intervals or distribution estimators.
Keywords:
Bias correction; Bias estimation; Confidence intervals; Distribution estimation; Edgeworth expansion; Second-order correctness; Third-order correctness.
Double-bootstrap methods that use a single simulation at the second bootstrap level have been studied in at least one context for more than a decade. An early contribution was made by White (2000), although in the setting of diagnosing the overuse of a dataset, rather than speeding up Monte Carlo simulation for general applications of the bootstrap. Davidson & Mackinnon (2001, 2002), and the same authors in a number of subsequent papers accessible via Mackinnon (2006) and Davidson & Mackinnon (2007), introduced the concept independently and explored its applications. Giacomini et al. (2013) christened the technique the warp-speed double-bootstrap method, nomenclature that we shall use here too, and demonstrated that the approach is asymptotically consistent. All this work is for the case of distribution estimation and its application to constructing confidence intervals and hypothesis tests.

In statistics the conventional double bootstrap is used in two main classes of problems: (i) to improve the effectiveness of bias correction, and (ii) to improve the coverage accuracy of confidence intervals. In problem (i), an application of the double bootstrap reduces the order of magnitude of bias by the factor $O(n^{-1})$, and in problem (ii) it reduces coverage error by the factor $O(n^{-1/2})$ for one-sided confidence intervals, and $O(n^{-1})$ for two-sided intervals. In the setting of problem (i), it is not clear whether there exists a version of warp-speed methodology for bias correction, and whether, should it exist, it successfully reduces the order of magnitude of bias. Call these questions 1 and 2, respectively. In problem (ii), it is unclear whether the warp-speed double bootstrap is as effective as the conventional double bootstrap, in the sense of offering the above levels of improved accuracy; we shall refer to this as question 3. In the present paper we show that the answers to questions 1 and 2 are positive, but that the answer to question 3 is negative. In particular, the warp-speed bootstrap does not reduce the order of magnitude of coverage error of a confidence interval.

There is an extensive literature on conventional double-bootstrap methods, particularly in the context of improving the coverage accuracy of single-bootstrap methods. The first mention of the double bootstrap in this setting apparently was by Hall (1986), followed quickly by contributions of Beran (1987, 1988); see also Hall & Martin (1988). The approach suggested by Hall (1992, Chap. 3) allows general multiple bootstrap methods to be developed together, so that different settings do not require separate treatment; however, details of properties of the technique seem to be very problem-specific. Efron (1983) was the first to use the double bootstrap in any setting, in that paper working in the context of estimating the error rate of classifiers. Research on optimising the trade-off between the numbers of simulations in the first and second stages of the conventional double bootstrap, in the context of distribution estimation and constructing confidence intervals, includes that of Booth & Hall (1994), Booth & Presnell (1998) and Lee & Young (1999).

It has become conventional to assess performance of the bootstrap in terms of Edgeworth expansions, not least because that approach enables theoretical properties to be developed in the very broad context addressed by Bhattacharya & Ghosh (1978). The resulting approximations are valid, in absolute rather than relative terms, uniformly in the tails.
An alternative approach, based on large-deviation probabilities, is valid in relative terms; see e.g. Hall (1990). However, it requires either more stringent assumptions or specialised methods that, at least at present, are not available in the context of the models used by Bhattacharya & Ghosh (1978). In the setting of absolute rather than relative accuracy, arbitrarily far out into the tails, the results in this paper take the result of consistency, demonstrated by Giacomini et al. (2013), much further.

Let $\theta = f(\mu)$ be a parameter expressible as a known function, $f$, of a $p$-variate mean, $\mu$, and let $\bar X$ denote an unbiased estimator of $\mu = (\mu_1, \ldots, \mu_p)^T$. Our estimator of $\theta$ is the same function of a sample mean, $\bar X$:
$$\hat\theta = f(\bar X). \eqno(1)$$
The smooth function $f$ maps a point $x$ in $p$-variate Euclidean space to a point on the real line. We do not insist that $\bar X$ be a mean of $n$, say, independent and identically distributed random $p$-vectors, since it might be the case that $\bar X = (\bar X_1, \ldots, \bar X_p)^T$, with
$$\bar X_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ji},$$
where, for each $j$, the variables $X_{ji}$, $1 \le i \le n_j$, are independent with $E(X_{ji}) = \mu_j$, and the $n_j$s are not all equal. Nevertheless, in mathematical terms we shall assume that the $n_j$s are all functions of an integer parameter $n$, and that each $n_j \asymp n$; that is, each ratio $n_j/n$ is bounded away from zero and infinity as $n \to \infty$.

These issues are related to dependence relationships among the random variables $X_{ji}$, which should be reflected in resampling methodology. In our theoretical work we shall suppose that:

either (i) each $n_j = n$ and the vectors $(X_{1i}, \ldots, X_{pi})^T$, for $i \ge 1$, are independent and identically distributed; or (ii) the $X_{ji}$s are totally independent, for $1 \le i \le n_j$ and $1 \le j \le p$, and in this case, for each $j \in \{1, \ldots, p\}$ the variables $X_{j1}, X_{j2}, \ldots$ are identically distributed, and $n_j \asymp n$. (2)

Each of (i) and (ii) above can be generalized, for example to hybrid cases where, for positive integers $p_1, \ldots, p_r$ that satisfy $\sum_{j=1}^r p_j = p$, and defining $q_0 = 0$ and $q_j = \sum_{k=1}^j p_k$, the vectors $V_{ji} = (X_{q_j + 1, i}, \ldots, X_{q_{j+1}, i})^T$, for $0 \le j \le r - 1$ and $i \ge 1$, are completely independent, and for each $j$ the vectors $V_{ji}$, for $i \ge 1$, are identically distributed. Bootstrap methods that reflect these properties can be constructed readily, and theory providing authoritative support in this setting can be developed, but for the sake of brevity, in our theoretical work we shall restrict attention to cases where (2) holds.
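To make the two resampling schemes concrete, here is a minimal illustrative sketch of our own (not part of the original algorithm statements), using numpy with hypothetical function names; it implements first-level resampling under (2)(i) and (2)(ii), and second-level resampling simply applies the same functions to a first-level resample.

```python
import numpy as np

rng = np.random.default_rng(12345)

def resample_case_i(X):
    # Model (2)(i): X is an n x p array of i.i.d. p-vectors.  Resample whole
    # rows with replacement, so dependence between coordinates is preserved.
    n = X.shape[0]
    return X[rng.integers(0, n, size=n)]

def resample_case_ii(columns):
    # Model (2)(ii): `columns` is a list of p one-dimensional arrays, the
    # j-th of length n_j; all variables are totally independent, so each
    # column is resampled separately, and independently, with replacement.
    return [c[rng.integers(0, len(c), size=len(c))] for c in columns]
```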
Bias-corrected estimators of $\theta$, based on the conventional bootstrap and the double bootstrap, respectively, are given by
$$\hat\theta_{\rm bc} = 2\hat\theta - E(\hat\theta^* \mid \mathcal{X}), \qquad \hat\theta_{\rm bcc} = 3\hat\theta - 3E(\hat\theta^* \mid \mathcal{X}) + E(\hat\theta^{**} \mid \mathcal{X}). \eqno(3)$$
Here $\mathcal{X} = \{X_{ji} : 1 \le i \le n_j,\ 1 \le j \le p\}$ denotes the original dataset, $\hat\theta^*$ is the version of $\hat\theta$ computed from a resample $\mathcal{X}^*$ drawn randomly, with replacement, from $\mathcal{X}$, in a manner that reflects appropriately the dependence structure, and $\hat\theta^{**}$ is the version of $\hat\theta$ computed from $\mathcal{X}^{**}$, which in turn is drawn randomly with replacement from $\mathcal{X}^*$, again reflecting dependence.

Monte Carlo approximations to the quantities $\hat\theta_{\rm bc}$ and $\hat\theta_{\rm bcc}$ in (3) are given respectively by
$$\tilde\theta_{\rm bc} = 2\hat\theta - \frac{1}{B} \sum_{b=1}^B \hat\theta^*_b, \qquad \tilde\theta_{\rm bcc} = 3\hat\theta - \frac{3}{B} \sum_{b=1}^B \hat\theta^*_b + \frac{1}{BC} \sum_{b=1}^B \sum_{c=1}^C \hat\theta^{**}_{bc}, \eqno(4)$$
where $\hat\theta^*_b$ denotes the $b$th out of $B$ independent and identically distributed, conditional on $\mathcal{X}$, versions of $\hat\theta^*$, computed from respective resamples $\mathcal{X}^*_b$ drawn by sampling randomly, with replacement, from the data in $\mathcal{X}$; and $\hat\theta^{**}_{bc}$ is the $c$th out of $C$ independent and identically distributed, conditional on $\mathcal{X}$ and $\mathcal{X}^*$, versions of $\hat\theta^{**}$, and is computed from a resample $\mathcal{X}^{**}_{bc}$ drawn by sampling randomly, with replacement, from $\mathcal{X}^*_b$.

2.3 Bootstrap algorithms

Reflecting the model at (1), we can express $\hat\theta^*_b$ and $\hat\theta^{**}_{bc}$ in (4) as $\hat\theta^*_b = f(\bar X^*_b)$ and $\hat\theta^{**}_{bc} = f(\bar X^{**}_{bc})$, where $\bar X^*_b = (\bar X^*_{b1}, \ldots, \bar X^*_{bp})^T$ and $\bar X^{**}_{bc} = (\bar X^{**}_{bc1}, \ldots, \bar X^{**}_{bcp})^T$; $\bar X^*_{bj}$ denotes the mean of data in the resample $\mathcal{X}^*_{bj} = \{X^*_{bj1}, \ldots, X^*_{bjn_j}\}$, and $\bar X^{**}_{bcj}$ is the mean of data in the re-resample $\mathcal{X}^{**}_{bcj} = \{X^{**}_{bcj1}, \ldots, X^{**}_{bcjn_j}\}$ drawn by sampling with replacement from $\mathcal{X}^*_{bj}$. The resampling operations at the first bootstrap level are undertaken by resampling the vectors $X_i = (X_{1i}, \ldots, X_{pi})^T$ randomly, with replacement, if (2)(i) holds, or by resampling the $X_{ji}$s randomly and completely independently, conditional on $\mathcal{X}$ and with replacement, if (2)(ii) obtains; resampling at the second bootstrap level is undertaken analogously.

In Theorem 1 in section 5.1 we shall show that if $C \to \infty$, no matter how slowly, as $n$ and $B$ diverge, then the asymptotic distribution of the Monte Carlo simulation error incurred when constructing $\tilde\theta_{\rm bcc}$ at (4) is the same as it would be if $C = \infty$. In particular, not only is the error of order $(nB)^{-1/2}$; the large-sample limiting distribution of the relevant asymptotically normal random variable, which has standard deviation proportional to $(nB)^{-1/2}$, and which describes in relative detail the accuracy of Monte Carlo bootstrap simulation, is identical to the limiting distribution that would arise if $C = \infty$. Moreover, if $C$ is held fixed then the order of magnitude, $(nB)^{-1/2}$, remains unchanged, but the standard deviation of the large-sample limiting distribution referred to above changes by a constant factor. This result is critical. It demonstrates the relatively small gains that are to be achieved by taking $C$ to be large, and argues in favour of taking $C = 1$, for example. This is the analogue, for bias correction, of the warp-speed bootstrap for distribution estimation when constructing confidence intervals. Therefore the order of magnitude of Monte Carlo simulation error in $\tilde\theta_{\rm bcc}$ is unchanged even if $C$ is held fixed.
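The estimators at (4) translate directly into code. The following sketch is ours, not the authors'; it assumes model (2)(i), data stored in an $n \times p$ numpy array, and a user-supplied smooth function f of the mean vector. It returns both $\tilde\theta_{\rm bc}$ and $\tilde\theta_{\rm bcc}$, with the small fixed value of $C$ that the result above recommends.

```python
import numpy as np

def bias_corrected_estimators(X, f, B=1000, C=1, rng=None):
    # Monte Carlo approximations (4): theta_bc (single bootstrap) and
    # theta_bcc (double bootstrap, C second-level resamples per b).
    rng = rng or np.random.default_rng()
    n = X.shape[0]
    theta_hat = f(X.mean(axis=0))
    first_level, second_level = [], []
    for _ in range(B):
        Xstar = X[rng.integers(0, n, size=n)]            # X*_b, drawn from X
        first_level.append(f(Xstar.mean(axis=0)))
        for _ in range(C):
            Xstar2 = Xstar[rng.integers(0, n, size=n)]   # X**_bc, from X*_b
            second_level.append(f(Xstar2.mean(axis=0)))
    theta_bc = 2 * theta_hat - np.mean(first_level)
    theta_bcc = 3 * theta_hat - 3 * np.mean(first_level) + np.mean(second_level)
    return theta_bc, theta_bcc
```

For instance, `f = lambda m: np.sin(m[0])` with `C=1` corresponds to the single double-bootstrap bias correction whose performance is studied in section 4.1.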
Incidentally, the order of magnitude, $(nB)^{-1/2}$, should be compared with that of the uncorrected bias that remains after applying the bias correction that leads to $\tilde\theta_{\rm bcc}$; it is $n^{-3}$. Therefore, unless $B$ is of order $n^5$ or larger, for the regular bootstrap, the orders of magnitude involving $B$, discussed above, dominate the error in the bias correction.

As in section 2.1 we shall assume that the parameter $\theta$ can be represented as $f(\mu)$, where the function $f : \mathbb{R}^p \to \mathbb{R}$ is known, and $\mu = E(X)$ is an unknown $p$-vector of parameters, estimated by $\bar X = n^{-1} \sum_{i=1}^n X_i$, where $\mathcal{X} = \{X_1, \ldots, X_n\}$ is a random sample of data vectors. Here and below we use model (2)(i) for the data, but only minor modifications are needed if (2)(ii) is employed instead. In such cases, provided that $f$ is sufficiently smooth and $\hat\theta$ is given by (1), the asymptotic variance, $n^{-1}\sigma^2$, of $\hat\theta$ is estimated root-$n$ consistently by $n^{-1}\hat\sigma^2$, where
$$\hat\sigma^2 = \sum_{j_1=1}^p \sum_{j_2=1}^p f_{j_1}(\bar X)\, f_{j_2}(\bar X)\, \frac{1}{n} \sum_{i=1}^n (X_{j_1 i} - \bar X_{j_1})(X_{j_2 i} - \bar X_{j_2}).$$
Here, given a $p$-vector $x = (x_1, \ldots, x_p)^T$, and integers $j_1, \ldots, j_r$ between 1 and $p$, and assuming that $f$ has $r$ well-defined derivatives with respect to each variable, we put $f_{j_1 \ldots j_r}(x) = (\partial/\partial x_{j_1}) \cdots (\partial/\partial x_{j_r})\, f(x)$. The above definitions of $\hat\theta$ and $\hat\sigma$ are used in (5) below.

Let $R$, referred to as the "root" by Giacomini et al. (2013), be given by either of the formulae
$$R = n^{1/2}(\hat\theta - \theta), \qquad R = n^{1/2}(\hat\theta - \theta)/\hat\sigma. \eqno(5)$$
Here $\hat\theta$ and $\hat\sigma$ are estimators of the parameters $\theta$ and $\sigma$ computed from the random sample $\mathcal{X}$, and $\sigma^2$ denotes the asymptotic variance of $n^{1/2}\hat\theta$. The warp-speed bootstrap of Giacomini et al. (2013), closely related to suggestions by White (2000) and Davidson & Mackinnon (2002, 2007), can be defined as follows.

As in section 2, let $\mathcal{X}^*_b$, for $1 \le b \le B$, be drawn randomly, with replacement, from $\mathcal{X}$, and be independent conditional on $\mathcal{X}$. Draw $\mathcal{X}^{**}_b$, denoting a single double-bootstrap resample, by sampling randomly, with replacement, from $\mathcal{X}^*_b$ for $b = 1, \ldots, B$, in such a manner that these re-resamples are independent, conditional on $\mathcal{X}$ and $\mathcal{X}^*_1, \ldots, \mathcal{X}^*_B$. In the context of section 2, $\mathcal{X}^{**}_b$ would be one of the resamples $\mathcal{X}^{**}_{b1}, \ldots, \mathcal{X}^{**}_{bC}$ which were drawn by resampling from $\mathcal{X}^*_b$, but on the present occasion we require only one of these resamples.

Let $\hat\theta^*_b$ and $\hat\theta^{**}_b$ denote the versions of $\hat\theta$ computed from $\mathcal{X}^*_b$ and $\mathcal{X}^{**}_b$, respectively, instead of $\mathcal{X}$, and write $\hat\sigma^*_b$ and $\hat\sigma^{**}_b$ for the corresponding versions of $\hat\sigma$. If $R$ is given by one of the formulae at (5), define
$$R^*_b = n^{1/2}(\hat\theta^*_b - \hat\theta), \qquad R^*_b = n^{1/2}(\hat\theta^*_b - \hat\theta)/\hat\sigma^*_b, \eqno(6)$$
$$R^{**}_b = n^{1/2}(\hat\theta^{**}_b - \hat\theta^*_b), \qquad R^{**}_b = n^{1/2}(\hat\theta^{**}_b - \hat\theta^*_b)/\hat\sigma^{**}_b, \eqno(7)$$
in the respective cases, and put
$$\hat F^*_B(x) = \frac{1}{B} \sum_{b=1}^B I(R^*_b \le x), \qquad \tilde F^*_B(x) = \frac{1}{B} \sum_{b=1}^B I(R^{**}_b \le x). \eqno(8)$$
Then $\hat F^*_B$ is the conventional single-bootstrap, Monte Carlo approximation to the distribution function $F$ of $R$, and the limit of $\hat F^*_B$, as $B \to \infty$, is the conventional single-bootstrap approximation to $F$. The function $\tilde F^*_B$ is a short-cut, warp-speed, double-bootstrap approximation to $F$.
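As an illustration of our own, not taken from the original, the warp-speed roots at (6)–(7) can be computed as follows in the special case $\theta = \mu$ with $p = 1$, where $\hat\sigma$ is simply the sample standard deviation; the empirical distribution functions of the two returned arrays are $\hat F^*_B$ and $\tilde F^*_B$ at (8).

```python
import numpy as np

def warp_speed_roots(X, B=1000, rng=None):
    # Percentile-t roots R*_b and R**_b of (6)-(7) for theta = mu, p = 1,
    # with a single double-bootstrap resample per b, i.e. C = 1.
    rng = rng or np.random.default_rng()
    n = len(X)
    xbar = X.mean()
    R1, R2 = np.empty(B), np.empty(B)
    for b in range(B):
        Xs = X[rng.integers(0, n, size=n)]       # X*_b, resampled from X
        Xss = Xs[rng.integers(0, n, size=n)]     # X**_b, resampled from X*_b
        R1[b] = n ** 0.5 * (Xs.mean() - xbar) / Xs.std()
        R2[b] = n ** 0.5 * (Xss.mean() - Xs.mean()) / Xss.std()
    return R1, R2
```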
Given a nominal coverage level $\alpha \in (0, 1)$ of a confidence interval, define $x = \hat x^*_\alpha$ to be the solution of the equation $\tilde F^*_B(x) = \alpha$, and similarly let $\hat x_\alpha$ be the solution of $\hat F^*_B(x) = \alpha$. If $R$ is given by either of the two expressions in (5), consider the respective confidence intervals
$$I^*_{b\alpha} = (\hat\theta^*_b - n^{-1/2} \hat x^*_\alpha,\ \infty), \qquad I^*_{b\alpha} = (\hat\theta^*_b - n^{-1/2} \hat\sigma^*_b \hat x^*_\alpha,\ \infty), \eqno(9)$$
which are bootstrap versions of the respective intervals
$$I_\alpha = (\hat\theta - n^{-1/2} \hat x_\alpha,\ \infty), \qquad I_\alpha = (\hat\theta - n^{-1/2} \hat\sigma \hat x_\alpha,\ \infty). \eqno(10)$$
In either case, our estimator of the probability $p_\alpha$ that the interval $I_\alpha$ covers $\theta$ is given by
$$\hat p_{B\alpha} = \frac{1}{B} \sum_{b=1}^B I(\hat\theta \in I^*_{b\alpha}). \eqno(11)$$
We take the final interval to be $I_{\hat\beta_{B\alpha}}$, where $\beta = \hat\beta_{B\alpha}$ denotes the solution of $\hat p_{B\beta} = \alpha$.

Earlier warp-speed bootstrap methodology is a little ambiguous in the percentile-$t$ setting, i.e. in the context of the second definition in each of (5)–(7), where the technique is not completely clear from the algorithms of White (2000), Davidson & Mackinnon (2001, 2002) and Giacomini et al. (2013, pp. 570–571). In particular it is unclear from Giacomini et al. (2013) when, or whether, the estimator $\hat\sigma$ should be replaced by its single- or double-bootstrap forms, $\hat\sigma^*$ and $\hat\sigma^{**}$, for example in (6)–(9). The choices we have made are appropriate, however; in particular the algorithm would not be second-order accurate, or third-order accurate in the case of the double bootstrap, if we were simply to use $\hat\sigma$ in those instances.

In section 5.2 we shall show that in the percentile-$t$ case, using the case $B = \infty$ as a benchmark, the approach suggested above produces quantile estimators that are identical to those obtained using the standard single-bootstrap method, up to an error of order $n^{-3/2}$. In particular, they do not reduce the $O(n^{-1})$ coverage error of single-bootstrap methods. Similar results hold for percentile-method bootstrap procedures.
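The calibration step can be sketched as follows (again our own illustration, with a simple grid search standing in for a proper root-finder; `theta_star` and `sigma_star` are arrays of $\hat\theta^*_b$ and $\hat\sigma^*_b$, and `R2` is the array of roots $R^{**}_b$ returned by the previous sketch):

```python
import numpy as np

def calibrated_level(theta_hat, theta_star, sigma_star, R2, n, alpha):
    # Coverage estimator (11) and the calibrated level solving
    # p_hat(beta) = alpha, percentile-t case, warp-speed quantiles from R2.
    def p_hat(beta):
        x_star = np.quantile(R2, beta)            # solves F~*_B(x) = beta
        lower = theta_star - n ** -0.5 * sigma_star * x_star
        return np.mean(theta_hat > lower)         # fraction of b with theta_hat in I*_{b,beta}
    grid = np.linspace(0.001, 0.999, 999)
    values = np.array([p_hat(beta) for beta in grid])
    return grid[np.argmin(np.abs(values - alpha))]   # beta_hat_{B,alpha}
```

The final interval is then $I_{\hat\beta_{B\alpha}}$ of (10), computed with $\hat x_{\hat\beta_{B\alpha}}$, the $\hat\beta_{B\alpha}$-level quantile of the $R^*_b$s.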
Here we report the results of a simulation study comparing the performances of five different bootstrap methods for bias correction: the single bootstrap, the conventional double bootstrap, and the suggested alternative method involving only $C = 1$, 2, 5 or 10 double-bootstrap replications. The data were of two types, drawn either from the exponential distribution with density $2^{-1} e^{-x/2}$ on the positive half-line, or from the log-normal distribution. These two distributions both have nonzero skewness and nonzero kurtosis, making them challenging for the bootstrap. The parameter of interest also took two forms, both of them nonlinear: either $\theta = f(\mu) = \mu^3$ or $\theta = \sin(\mu)$, where $\mu$ was the population mean. In such cases there is a term of order $n^{-2}$ in the bias expansion, which cannot be eliminated by the single bootstrap but can be removed by the double bootstrap. This is reflected in our simulation results, which show that the double bootstrap provides better bias correction than the single-bootstrap method.

Sample size, $n$, was chosen in steps of 20 between 20 and 80; the number of simulations, $B$, in the first bootstrap step was set equal to $n^2$, for each of the bootstrap methods; and the number of simulations, $C$, for the second bootstrap step in the conventional double bootstrap was taken to be the integer part of $10 B^{1/2}$, which we write as $\lfloor 10 B^{1/2} \rfloor$. The choice of $B^{1/2}$ here was suggested by Booth & Hall (1994) in the context of confidence intervals, and gives an expression for $C$ that is orders of magnitude larger than is obtained using relatively small, fixed $C$. For example, when $n = 20$ the value of $C = \lfloor 10 B^{1/2} \rfloor$ is between 20 and 200 times the values $C = 1$, 2, 5 or 10 used to simulate the alternative approach to double-bootstrap methods; when $n = 80$ the respective factors are 80 to 800.

From equation (4),
$$\frac{1}{B} \sum_{b=1}^B \hat\theta^*_b - \hat\theta \qquad \hbox{and} \qquad \frac{3}{B} \sum_{b=1}^B \hat\theta^*_b - \frac{1}{BC} \sum_{b=1}^B \sum_{c=1}^C \hat\theta^{**}_{bc} - 2\hat\theta$$
provide the estimates of the true bias of $\hat\theta$, i.e. $E(\hat\theta) - \theta$, via the single bootstrap and the double bootstrap, respectively. Empirical approximations to bias, computed by averaging over the results of 5,000 Monte Carlo trials in each case, are reported in Tables 1–2 in the Supplementary Material, and the ratios of these approximations to the true bias are graphed in Figure 1. The figure shows that, for the values of $B$ used in our analysis, there is little to choose between performance when using $C = 1$ and $C = \lfloor 10 B^{1/2} \rfloor$.

Figure 1: Performance of bootstrap methods for bias correction. First and second rows show results for the exponential distribution and the log-normal distribution, respectively; left- and right-hand panels show results for $\theta = \mu^3$ and $\theta = \sin(\mu)$, respectively. In each panel the graphs represent the single-bootstrap method (−⋆−) and double-bootstrap methods with $C = 1$ (··· + ···), $C = 2$ (··· ◦ ···), $C = 5$ (··· × ···), $C = 10$ (··· ♦ ···) and $C = \lfloor 10 B^{1/2} \rfloor$ (··· □ ···), respectively.

In this section we illustrate the coverage performance of bootstrap confidence intervals, with nominal coverage 0.9,
for the population means of the two distributions considered in section 4.1, i.e. the exponential and log-normal distributions. Sample size $n$ was taken equal to 20 and 40 in each case; $B$ was increased from 200 to 700 in steps of 100, as indicated on the horizontal axis of each panel; and one-sided and two-sided equal-tailed bootstrap confidence intervals were considered, each using either the percentile or percentile-$t$ bootstrap, implemented via the single bootstrap, the conventional double bootstrap, with $C = \lfloor B^{1/2} \rfloor$, and the warp-speed bootstrap, i.e. the double bootstrap with $C = 1$. This choice of $C$ was suggested by Lee & Young (1999). To provide a perspective different from that in section 4.1, in the present section we graph coverage as a function of $B$ for fixed $n$, rather than as a function of $n$ for fixed $B$ as in section 4.1. Results in the two settings can of course be expressed in the same way; the conclusions do not alter.

Results for sample size $n = 20$, with each point on each graph based on 5,000 Monte Carlo simulations, are presented in Figure 2.
It can be seen that, for each confidence-interval type, the conventional double-bootstrap method gives greater coverage accuracy than the single-bootstrap and warp-speed methods. Results for sample size $n = 40$ are similar, and are reported in the Supplementary Material.

Our main regularity condition, in addition to the model assumptions (1) and (2), is the following:

(i) $f(x)$ is differentiable six times with respect to any combination of the $p$ components of $x$, and those derivatives, as well as $f$ itself, are uniformly bounded; and (ii) the data $X_{ji}$ have at least six finite moments, and $E(|X_{ji}|^6)$ is bounded uniformly in $i$ and $j$. (12)
Figure 2: Performance of bootstrap methods for confidence intervals when $n = 20$. First and second rows show results for the exponential distribution and the log-normal distribution, respectively; left- and right-hand panels show results for one-sided and two-sided equal-tailed confidence intervals, respectively. In each panel the graphs represent single-bootstrap percentile (−⋆−), single-bootstrap percentile-$t$ (− · ⋆ · −), conventional double-bootstrap percentile (−□−), conventional double-bootstrap percentile-$t$ (− · □ · −), warp-speed percentile (−♦−) and warp-speed percentile-$t$ (− · ♦ · −) methods.

Condition (12) can be generalized, but (for example) if we relax significantly the condition of boundedness of $f$ and its derivatives, in (12)(i), then we need to strengthen the assumption about the tails of the distributions of the $X_{ji}$s, in (12)(ii). We shall define
$$\tau^2 = E\bigg[\bigg\{\sum_{j=1}^p (X_j - \mu_j)\, f_j(\mu)\bigg\}^2\bigg]. \eqno(13)$$

In Theorem 1, below, we decompose the bias-corrected estimators $\tilde\theta_{\rm bc}$, based on the single bootstrap, and $\tilde\theta_{\rm bcc}$, based on the double bootstrap, as follows:
$$\tilde\theta_{\rm bc} = U_{\rm bc} + V_{\rm bc}, \qquad \tilde\theta_{\rm bcc} = U_{\rm bcc} + V_{\rm bcc}. \eqno(14)$$
Here $U_{\rm bc}$ and $U_{\rm bcc}$ are the "ideal" versions of $\tilde\theta_{\rm bc}$ and $\tilde\theta_{\rm bcc}$, respectively, that we would obtain if we were to do an infinite number of simulations, i.e. if we were to take $B = C = \infty$; and $V_{\rm bc}$ and $V_{\rm bcc}$ denote error terms arising from doing only a finite number of Monte Carlo simulations.

Part (d) of Theorem 1 shows that the error terms $V_{\rm bc}$, in the case of the single bootstrap, and $V_{\rm bcc}$, for the double bootstrap, both equal $O_p\{(nB)^{-1/2}\}$, and that this is the exact order, regardless of the selection of $C$ in the second bootstrap stage. Although the Monte Carlo error terms in the single bootstrap and the double bootstrap share the same convergence rate, equations (15) show that the double bootstrap provides a higher degree of accuracy, in terms of bias correction, than the single bootstrap if we take $B = C = \infty$. Part (d) also implies that if $B$ is sufficiently large, or more precisely if $n^5 = O(B)$, then the Monte Carlo error is of the same order as, or of smaller order than, the deterministic remainders in (15). These are the main theoretical findings of Theorem 1.
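Part (d) lends itself to a direct numerical check. The sketch below is our own (model (2)(i), `f` as in the earlier sketches); it repeats the Monte Carlo step with the data held fixed and estimates the ratio ${\rm var}(V_{\rm bcc} \mid \mathcal{X}) / {\rm var}(V_{\rm bc} \mid \mathcal{X})$, which by part (d) should be close to $4 + C^{-1}$ when $n$ and $B$ are large.

```python
import numpy as np

def conditional_variance_ratio(X, f, B=2000, C=1, reps=200, rng=None):
    # Estimates var(V_bcc | X) / var(V_bc | X); Theorem 1(d) predicts
    # a value near 4 + 1/C.  The data X are held fixed throughout.
    rng = rng or np.random.default_rng()
    n = X.shape[0]
    v_bc, v_bcc = np.empty(reps), np.empty(reps)
    for r in range(reps):
        s1, s2 = 0.0, 0.0
        for _ in range(B):
            Xs = X[rng.integers(0, n, size=n)]
            s1 += f(Xs.mean(axis=0))
            for _ in range(C):
                Xss = Xs[rng.integers(0, n, size=n)]
                s2 += f(Xss.mean(axis=0))
        v_bc[r] = -s1 / B                        # V_bc, up to a constant in X
        v_bcc[r] = -3 * s1 / B + s2 / (B * C)    # V_bcc, up to a constant in X
    return v_bcc.var() / v_bc.var()
```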
Theorem 1. Assume that the data are generated according to either of the models at (2), that (12) holds, and that $B = B(n) \to \infty$ as $n \to \infty$. Then: (a) equations (14) hold, where $U_{\rm bc}$ and $U_{\rm bcc}$ are functions of $\mathcal{X}$ alone, and in particular do not involve $\mathcal{X}^*$ or $\mathcal{X}^{**}$, and satisfy
$$E(U_{\rm bc}) = \theta + O(n^{-2}), \qquad E(U_{\rm bcc}) = \theta + O(n^{-3}); \eqno(15)$$
and $V_{\rm bc}$ and $V_{\rm bcc}$ are functions of both $\mathcal{X}$ and $\mathcal{X}^*$ (and also of $\mathcal{X}^{**}$, in the case of $V_{\rm bcc}$), and satisfy $E(V_{\rm bc} \mid \mathcal{X}) = E(V_{\rm bcc} \mid \mathcal{X}) = 0$. (b) Both $U_{\rm bc}$ and $U_{\rm bcc}$ equal $\hat\theta + O_p(n^{-1})$, and both satisfy the same central limit theorem as $\hat\theta$. (c) In particular, both $U_{\rm bc}$ and $U_{\rm bcc}$ are asymptotically normally distributed with mean $\theta$ and a variance, $\sigma_n^2$ say, which has the property that $n \sigma_n^2$ is bounded as $n \to \infty$. (d) Conditional on $\mathcal{X}$, $V_{\rm bc}$ and $V_{\rm bcc}$ are asymptotically normally distributed with zero means and variances of size $(nB)^{-1}$, and if $C = C(n) \to \infty$ as $n \to \infty$ then the ratio of the variances converges to 1 as $n$ diverges. In the case of (2)(i) the asymptotic variances of $V_{\rm bc}$ and $V_{\rm bcc}$, both conditional on $\mathcal{X}$ and unconditionally, are $(Bn)^{-1}\tau^2$ and $(4 + C^{-1})(Bn)^{-1}\tau^2$, respectively.

In connection with part (d) it can be shown that, if $C$ diverges (no matter how slowly) as $n$ increases, the asymptotic distribution of the error is the same as it would be if $C = \infty$. If $\sigma_n^2$ is as in part (c) then, under the model (2)(i), there exists a positive constant $c$ such that $n \sigma_n^2 = c + o(1)$ as $n \to \infty$. However, this is not necessarily correct under the model (2)(ii), since in that setting we do not require the ratios $n_j/n$ to converge. In the context of (2)(i), formulae for $U_{\rm bc}$ and $U_{\rm bcc}$ are given at (A9) and (A10), respectively, in the Supplementary Material.

The orders of magnitude of the remainders in (15) are exact when skewness and kurtosis are nonzero. It follows from part (b) of Theorem 1 that, in the case $B = C = \infty$, $\tilde\theta_{\rm bc}$ and $\tilde\theta_{\rm bcc}$ satisfy identical central limit theorems, and in particular both have the same asymptotic variances.

We shall assume that $X$, which represents a generic $p$-vector $X_i = (X_{1i}, \ldots, X_{pi})^T$, where $1 \le i \le n$ and (2)(i) holds, satisfies the following multivariate version of Cramér's continuity condition (Hall, 1992):
$$\limsup_{\|t\| \to \infty} \big| E\{\exp(i t^T X)\} \big| < 1. \eqno(16)$$
On this occasion, $i$ denotes $\sqrt{-1}$.
For brevity we shall treat in detail only the percentile-$t$ case, evidenced by the second formula in each of (5)–(7), and discuss the percentile method briefly below Theorem 2.

Let $\Phi$ and $\phi$ denote the standard normal distribution and density functions, respectively. Assume that an unknown scalar parameter $\theta$ can be written as $\theta = f(\mu)$, where $\mu = E(X)$, and that our estimator of $\theta$ is $\hat\theta = f(\bar X)$, as at (1), where $\bar X = n^{-1} \sum_{i=1}^n X_i$. Methods of Bhattacharya & Ghosh (1978) can be used to prove that, under conventional assumptions such as those in Theorem 2 below,
$$G(x) \equiv {\rm pr}\{n^{1/2}(\hat\theta - \theta)/\hat\sigma \le x\} = \Phi(x) + \sum_{j=1}^3 n^{-j/2} Q_j(x)\, \phi(x) + n^{-2} A_n(x), \eqno(17)$$
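Expansion (17) is easy to visualise in a special case. For the Studentised mean, the first polynomial is known to be $Q_1(x) = \gamma(2x^2 + 1)/6$, where $\gamma$ denotes the population skewness (see e.g. Hall, 1987); the following sketch, our own illustration rather than part of the development here, compares the empirical value of $G(x)$ for exponential data with mean 2 (so $\gamma = 2$) against the normal approximation and the one-term Edgeworth correction.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, trials, x = 20, 200_000, 1.5
X = rng.exponential(scale=2.0, size=(trials, n))
T = n ** 0.5 * (X.mean(axis=1) - 2.0) / X.std(axis=1)   # Studentised mean
gamma = 2.0                                             # skewness of the mean-2 exponential
edgeworth = norm.cdf(x) + n ** -0.5 * gamma * (2 * x ** 2 + 1) / 6 * norm.pdf(x)
print(np.mean(T <= x), norm.cdf(x), edgeworth)          # empirical vs Phi vs Edgeworth
```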
where $Q_j$ is a polynomial of degree $3j - 1$, and is an even or odd function according as $j$ is odd or even, respectively; and the remainder $A_n(x)$ satisfies $\sup_{n \ge 1} \sup_{-\infty < x < \infty} |A_n(x)| < \infty$. Write $\hat x_\alpha$ for the $\alpha$-level quantile of the appropriate bootstrap distribution: in the percentile case it solves
$${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta) \le \hat x_\alpha \mid \mathcal{X}\} = \alpha, \eqno(19)$$
and in the percentile-$t$ case it solves
$${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le \hat x_\alpha \mid \mathcal{X}\} = \alpha. \eqno(20)$$
If we were to use the percentile bootstrap method to construct a one-sided confidence interval for $\theta$, the interval would be $(\hat\theta - n^{-1/2} \hat x_\alpha, \infty)$, with $\hat x_\alpha$ as at (19); if we were to use the percentile-$t$ bootstrap method, it would be $(\hat\theta - n^{-1/2} \hat\sigma \hat x_\alpha, \infty)$, where $\hat x_\alpha$ is as at (20); and if we were to employ the warp-speed bootstrap method, it would be $(\hat\theta - n^{-1/2} \hat\sigma \hat x_{\hat\beta_\alpha}, \infty)$, as discussed in section 3.2, where $\hat\beta_\alpha$ denotes the limit, as $B \to \infty$, of the quantity $\hat\beta_{B\alpha}$ introduced there. However, we shall show in Theorem 2 that $\hat x_{\hat\beta_\alpha} = \hat x_\alpha + O_p(n^{-3/2})$, and so the endpoints of standard percentile-$t$ and warp-speed bootstrap confidence intervals differ only at order $n^{-3/2}$. This signals that conventional arguments, based on Edgeworth expansions, can be used to prove that the standard percentile-$t$ confidence interval, and its warp-speed bootstrap variant, have identical coverage error up to and including terms of order $n^{-1}$, and of course that can be done under the assumptions of Theorem 2. Since, as is well known, the coverage error of the percentile-$t$ interval is genuinely of order $n^{-1}$ (Hall, 1986), it follows that the warp-speed bootstrap does not improve on that accuracy.

Theorem 2. Assume that model (2)(i) applies; that the function $f$, in the definition $\theta = f(\mu)$, has five bounded derivatives; and that (16) holds, $E(\|X\|^K) < \infty$ for sufficiently large $K > 0$, and $B = \infty$. Then $\hat x_{\hat\beta_\alpha} = \hat x_\alpha + O_p(n^{-3/2})$.

The appropriate number of moments that should be assumed for general Edgeworth or Cornish–Fisher expansions, even in relatively simple, non-bootstrap cases, is awkward to determine. For example, the argument of Bhattacharya & Ghosh (1978) requires at least six moments in the case of the Studentised mean, whereas it is known that three moments are sufficient; see e.g. Hall (1987). Even if we were to develop, in full detail, a proof of Theorem 2 based on the methods of Bhattacharya & Ghosh (1978), the number of moments we would need to assume would be unduly generous, and so we instead refer to the number simply as $K$. We choose not to provide such a detailed development here. However, the number of derivatives is relatively easy to address, and the theorem provides detail in that respect.

Let
$$\tilde F^*(x) = {\rm pr}\big\{n^{1/2}(\hat\theta^{**} - \hat\theta^*)/\hat\sigma^{**} \le x \,\big|\, \mathcal{X}\big\},$$
which is the limit of $\tilde F^*_B(x)$, defined in (8), as $B \to \infty$. Then $\hat x_{\hat\beta_\alpha}$ is the solution of $\tilde F^*(x) = \alpha$. Our focus on the case $B = \infty$ deserves comment. In the early days of the bootstrap, $B = \infty$ was seen as "the statistical bootstrap method," and the case of finite $B$ was interpreted as a Monte Carlo approximation to the bootstrap. Indeed, taking $B < \infty$ was viewed more as an issue to be addressed in computational or numerical terms, rather than statistical ones. Reflecting this, for about eight years from the mid 1980s considerable effort was spent developing efficient computational methods for undertaking bootstrap resampling. However, by the early 1990s computers had become so fast that this area of research had largely disappeared. This remains the case today; taking $B$ in the thousands, without using numerical devices to increase simulation efficiency, is now the rule rather than the exception. The difference between such large values of $B$, and using the mathematical ideal value $B = \infty$, is particularly small.
Conclusion and discussion

We have investigated the role played by $C$, the number of resamples used in the second bootstrap stage, in double-bootstrap methods for bias correction and confidence intervals. Specifically, we have shown that the double bootstrap is largely insensitive to the choice of $C$ in the context of bias correction. Indeed, double-bootstrap methods with fixed $C$ can produce third-order accuracy, much as do conventional double-bootstrap methods with diverging $C$. This result demonstrates the effectiveness, for bias correction, of using the double bootstrap with a single double-bootstrap simulation. Although existing work shows that the warp-speed double bootstrap ($C = 1$) can improve accuracy in hypothesis testing, there has not been, until now, any theoretical underpinning of its performance in the context of confidence intervals. However, when only a single bootstrap resample is used in the second bootstrap stage to construct confidence intervals, the order of magnitude of coverage error is not improved relative to that for the single bootstrap.

Supplementary material

Supplementary Material is available for the theoretical proofs of Theorems 1 and 2, and for additional simulation results relating to sections 4.1 and 4.2.

References

Beran, R. (1987). Prepivoting to reduce level error in confidence sets. Biometrika 74, 457–468.

Beran, R. (1988). Prepivoting test statistics: a bootstrap view of asymptotic refinements. J. Amer. Statist. Assoc. 83, 687–697.

Bhattacharya, R.N. & Ghosh, J.K. (1978). On the validity of the formal Edgeworth expansion. Ann. Statist. 6, 434–451.

Booth, J.G. & Hall, P. (1994). Monte Carlo approximation and the iterated bootstrap. Biometrika 81, 331–340.

Booth, J.G. & Presnell, B. (1998). Allocation of Monte Carlo resources for the iterated bootstrap. J. Comput. Graph. Statist. 7, 92–112.

Davidson, R. & Mackinnon, J.G. (2001). Improving the reliability of bootstrap tests. Queen's Institute for Economic Research Discussion Paper No. 995, revised.

Davidson, R. & Mackinnon, J.G. (2002). Fast double bootstrap tests of nonnested linear regression models. Econometric Rev. 21, 417–427.

Davidson, R. & Mackinnon, J.G. (2007). Improving the reliability of bootstrap tests with the fast double bootstrap. Comput. Statist. Data Anal. 51, 3259–3281.

Davison, A.C., Hinkley, D.V. & Schechtman, E. (1986). Efficient bootstrap simulation. Biometrika 73, 555–566.

Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. J. Amer. Statist. Assoc. 78, 316–331.

Giacomini, R., Politis, D.N. & White, H. (2013). A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators. Econometric Theory 29, 567–589.

Hall, P. (1986). On the bootstrap and confidence intervals. Ann. Statist. 14, 1431–1452.

Hall, P. (1987). Edgeworth expansion for Student's t statistic under minimal moment conditions. Ann. Probab. 15, 920–931.

Hall, P. (1988). On symmetric bootstrap confidence intervals. J. Roy. Statist. Soc. Ser. B 50, 35–45.

Hall, P. (1990). On the relative performance of bootstrap and Edgeworth approximations of a distribution function. J. Multivariate Anal. 35, 108–129.

Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.

Hall, P. & Martin, M.A. (1988). On bootstrap resampling and iteration. Biometrika 75, 661–671.

Lee, S.M.S. & Young, G.A. (1999). The effect of Monte Carlo approximation on coverage error of double-bootstrap confidence intervals. J. Roy. Statist. Soc. Ser. B 61, 353–366.

Mackinnon, J.G. (2006). Applications of the fast double bootstrap.
Queen's Economics Department Working Paper No. 1023.

White, H. (2000). A reality check for data snooping. Econometrica 68, 1097–1126.

Supplementary material for "Double-bootstrap methods that use a single double-bootstrap simulation"

Jinyuan Chang and Peter Hall
Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia

A Proof of Theorem 1

In view of (12), Taylor expansion can be used to derive the following formulae:
$$\hat\theta = \theta + \sum_{s=1}^4 \frac{1}{s!} \sum_{j_1=1}^p \cdots \sum_{j_s=1}^p (\bar X_{j_1} - \mu_{j_1}) \cdots (\bar X_{j_s} - \mu_{j_s})\, f_{j_1 \ldots j_s}(\mu) + O_p(n^{-5/2}) \eqno({\rm A}1)$$
and
$$E(\hat\theta) = \theta + \sum_{s=2}^4 \frac{1}{s!} \sum_{j_1=1}^p \cdots \sum_{j_s=1}^p E\{(\bar X_{j_1} - \mu_{j_1}) \cdots (\bar X_{j_s} - \mu_{j_s})\}\, f_{j_1 \ldots j_s}(\mu) + O(n^{-3}), \eqno({\rm A}2)$$
where the remainder term $R_n$ that is denoted by $O_p(n^{-5/2})$ in (A1) satisfies $E(R_n) = O(n^{-3})$. Define
$$\xi_{j_1 j_2} = {\rm cov}(X_{j_1}, X_{j_2}), \qquad \xi_{j_1 j_2 j_3} = E\{(X_{j_1} - \mu_{j_1})(X_{j_2} - \mu_{j_2})(X_{j_3} - \mu_{j_3})\},$$
$$\xi_{j_1 j_2 j_3 j_4} = \xi_{j_1 j_2}\, \xi_{j_3 j_4} + \xi_{j_1 j_3}\, \xi_{j_2 j_4} + \xi_{j_1 j_4}\, \xi_{j_2 j_3}.$$
Then, if (2)(i) holds,
$$E\{(\bar X_{j_1} - \mu_{j_1})(\bar X_{j_2} - \mu_{j_2})\} = n^{-1} \xi_{j_1 j_2},$$
$$E\{(\bar X_{j_1} - \mu_{j_1})(\bar X_{j_2} - \mu_{j_2})(\bar X_{j_3} - \mu_{j_3})\} = n^{-2} \xi_{j_1 j_2 j_3},$$
$$E\{(\bar X_{j_1} - \mu_{j_1})(\bar X_{j_2} - \mu_{j_2})(\bar X_{j_3} - \mu_{j_3})(\bar X_{j_4} - \mu_{j_4})\} = n^{-2} \xi_{j_1 j_2 j_3 j_4} + O(n^{-3}).$$
Hence, by (A2),
$$E(\hat\theta) = \theta + \frac{1}{2n} \sum_{j_1=1}^p \sum_{j_2=1}^p \xi_{j_1 j_2} f_{j_1 j_2}(\mu) + \frac{1}{6 n^2} \sum_{j_1=1}^p \sum_{j_2=1}^p \sum_{j_3=1}^p \xi_{j_1 j_2 j_3} f_{j_1 j_2 j_3}(\mu) + \frac{1}{24 n^2} \sum_{j_1=1}^p \cdots \sum_{j_4=1}^p \xi_{j_1 j_2 j_3 j_4} f_{j_1 j_2 j_3 j_4}(\mu) + O(n^{-3}) = \theta + n^{-1} \gamma_2 + n^{-2}(\gamma_3 + \gamma_4) + O(n^{-3}), \eqno({\rm A}3)$$
where, for $r = 2, 3, 4$,
$$\gamma_r = \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p \xi_{j_1 \ldots j_r} f_{j_1 \ldots j_r}(\mu).$$

If (2)(ii) holds, instead of (2)(i), and if we define $\sigma_j^2 = \xi_{jj}$, and write $I(E)$ for the indicator function of an event $E$, then the following relations obtain:
$$E\{(\bar X_{j_1} - \mu_{j_1})(\bar X_{j_2} - \mu_{j_2})\} = n_{j_1}^{-1} I(j_1 = j_2)\, \sigma_{j_1}^2,$$
$$E\{(\bar X_{j_1} - \mu_{j_1})(\bar X_{j_2} - \mu_{j_2})(\bar X_{j_3} - \mu_{j_3})\} = n_{j_1}^{-2} I(j_1 = j_2 = j_3)\, \xi_{j_1 j_1 j_1},$$
and
$$E\{(\bar X_{j_1} - \mu_{j_1}) \cdots (\bar X_{j_4} - \mu_{j_4})\} = (n_{j_1} n_{j_3})^{-1} I(j_1 = j_2)\, I(j_3 = j_4)\, \sigma_{j_1}^2 \sigma_{j_3}^2 + (n_{j_1} n_{j_2})^{-1} I(j_1 = j_3)\, I(j_2 = j_4)\, \sigma_{j_1}^2 \sigma_{j_2}^2 + (n_{j_1} n_{j_2})^{-1} I(j_1 = j_4)\, I(j_2 = j_3)\, \sigma_{j_1}^2 \sigma_{j_2}^2 + O(n^{-3}).$$
Therefore we can write (A2) as
$$E(\hat\theta) = \theta + n^{-1} \gamma^{(1)} + n^{-2} \gamma^{(2)} + O(n^{-3}), \eqno({\rm A}4)$$
where the quantities $\gamma^{(1)}$ and $\gamma^{(2)}$ may depend on $n$ but are bounded as $n \to \infty$. Property (A4) is the analogue, in the context of (2)(ii) rather than (2)(i), of (A3).

To explore properties of Monte Carlo approximations to the quantities $E(\hat\theta^* \mid \mathcal{X})$ and $E(\hat\theta^{**} \mid \mathcal{X})$ (compare (3) and (4)), observe first that, analogously to (A1),
$$\hat\theta^* = f(\bar X^*) = \hat\theta + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p (\bar X^*_{j_1} - \bar X_{j_1}) \cdots (\bar X^*_{j_r} - \bar X_{j_r})\, f_{j_1 \ldots j_r}(\bar X) + O_p(n^{-5/2}),$$
$$\hat\theta^{**} = f(\bar X^{**}) = \hat\theta^* + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p (\bar X^{**}_{j_1} - \bar X^*_{j_1}) \cdots (\bar X^{**}_{j_r} - \bar X^*_{j_r})\, f_{j_1 \ldots j_r}(\bar X^*) + O_p(n^{-5/2}).$$
Averaging these formulae over bootstrap replicates we obtain the following expansions:
$$S_{\rm bc} \equiv \frac{1}{B} \sum_{b=1}^B \hat\theta^*_b = \hat\theta + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p f_{j_1 \ldots j_r}(\bar X)\, \frac{1}{B} \sum_{b=1}^B (\bar X^*_{b j_1} - \bar X_{j_1}) \cdots (\bar X^*_{b j_r} - \bar X_{j_r}) + O_p(n^{-5/2}), \eqno({\rm A}5)$$
$$S_{\rm bcc} \equiv \frac{1}{BC} \sum_{b=1}^B \sum_{c=1}^C \hat\theta^{**}_{bc} = \frac{1}{B} \sum_{b=1}^B \hat\theta^*_b + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p \frac{1}{B} \sum_{b=1}^B f_{j_1 \ldots j_r}(\bar X^*_b)\, \frac{1}{C} \sum_{c=1}^C (\bar X^{**}_{bc j_1} - \bar X^*_{b j_1}) \cdots (\bar X^{**}_{bc j_r} - \bar X^*_{b j_r}) + O_p(n^{-5/2}). \eqno({\rm A}6)$$
In view of (12), the remainder terms $R_n$, say, that are denoted by $O_p(n^{-5/2})$ in (A5) and (A6) satisfy $E(R_n) = O(n^{-3})$.
Define
$$\hat\xi_{j_1 j_2} = \frac{1}{n} \sum_{i=1}^n (X_{j_1 i} - \bar X_{j_1})(X_{j_2 i} - \bar X_{j_2}), \qquad \hat\xi_{j_1 j_2 j_3} = \frac{1}{n} \sum_{i=1}^n (X_{j_1 i} - \bar X_{j_1})(X_{j_2 i} - \bar X_{j_2})(X_{j_3 i} - \bar X_{j_3}),$$
$$\hat\xi_{j_1 j_2 j_3 j_4} = \hat\xi_{j_1 j_2}\, \hat\xi_{j_3 j_4} + \hat\xi_{j_1 j_3}\, \hat\xi_{j_2 j_4} + \hat\xi_{j_1 j_4}\, \hat\xi_{j_2 j_3}, \qquad \hat\eta_r = \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p \hat\xi_{j_1 \ldots j_r} f_{j_1 \ldots j_r}(\bar X),$$
the latter for $r = 2, 3, 4$. In the discussion below we shall assume, for the sake of definiteness, that the data are generated by the model (2)(i); the case of model (2)(ii) is similar.

Suppose first that we use the regular bootstrap, both for resampling $\mathcal{X}^*_b$ from $\mathcal{X}$ and for resampling $\mathcal{X}^{**}_{bc}$ from $\mathcal{X}^*_b$. Then the conditional expected values of the non-remainder terms on the right-hand sides of (A5) and (A6) satisfy the following identities, respectively:
$$E\bigg\{\hat\theta + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p f_{j_1 \ldots j_r}(\bar X)\, \frac{1}{B} \sum_{b=1}^B (\bar X^*_{b j_1} - \bar X_{j_1}) \cdots (\bar X^*_{b j_r} - \bar X_{j_r}) \,\bigg|\, \mathcal{X}\bigg\} = \hat\theta + n^{-1} \hat\eta_2 + n^{-2}(\hat\eta_3 + \hat\eta_4) + O_p(n^{-3}), \eqno({\rm A}7)$$
$$E\bigg\{\frac{1}{B} \sum_{b=1}^B \hat\theta^*_b + \sum_{r=1}^4 \frac{1}{r!} \sum_{j_1=1}^p \cdots \sum_{j_r=1}^p \frac{1}{B} \sum_{b=1}^B f_{j_1 \ldots j_r}(\bar X^*_b)\, \frac{1}{C} \sum_{c=1}^C (\bar X^{**}_{bc j_1} - \bar X^*_{b j_1}) \cdots (\bar X^{**}_{bc j_r} - \bar X^*_{b j_r}) \,\bigg|\, \mathcal{X}\bigg\} = \hat\theta + n^{-1}(2 - n^{-1})\hat\eta_2 + n^{-2}(3\hat\eta_3 + 2\hat\eta_4) + 2 n^{-2}(\hat\eta_3 + \hat\eta_4) + O_p(n^{-3}), \eqno({\rm A}8)$$
where, as before, the expected values of the $O_p(n^{-3})$ remainder terms equal $O(n^{-3})$.

Recall the definitions of $\tilde\theta_{\rm bc}$ and $\tilde\theta_{\rm bcc}$ at (4), and define
$$U_{\rm bc} \equiv E(\tilde\theta_{\rm bc} \mid \mathcal{X}) = 2\hat\theta - E(S_{\rm bc} \mid \mathcal{X}), \qquad U_{\rm bcc} \equiv E(\tilde\theta_{\rm bcc} \mid \mathcal{X}) = 3\{\hat\theta - E(S_{\rm bc} \mid \mathcal{X})\} + E(S_{\rm bcc} \mid \mathcal{X}).$$
Then (A7) and (A8) imply that $U_{\rm bc} = U_{\rm bc}' + O_p(n^{-3})$ and $U_{\rm bcc} = U_{\rm bcc}' + O_p(n^{-3})$, where the expected values of the $O_p(n^{-3})$ remainder terms equal $O(n^{-3})$, and
$$U_{\rm bc}' = \hat\theta - \{n^{-1} \hat\eta_2 + n^{-2}(\hat\eta_3 + \hat\eta_4)\}, \eqno({\rm A}9)$$
$$U_{\rm bcc}' = \hat\theta - n^{-1}(1 + n^{-1})\hat\eta_2 + n^{-2}(3\hat\eta_3 + 2\hat\eta_4) - n^{-2}(\hat\eta_3 + \hat\eta_4). \eqno({\rm A}10)$$
Therefore $U_{\rm bc}$ and $U_{\rm bcc}$ both equal $\hat\theta + O_p(n^{-1})$, as claimed in part (b) of Theorem 1.

Put $V_{\rm bc} = \tilde\theta_{\rm bc} - E(\tilde\theta_{\rm bc} \mid \mathcal{X})$ and $V_{\rm bcc} = \tilde\theta_{\rm bcc} - E(\tilde\theta_{\rm bcc} \mid \mathcal{X})$. Employing (A3) and the properties
$$E(\hat\eta_2) = (1 - n^{-1})\gamma_2 + n^{-1}(3\gamma_3 + 2\gamma_4) + O(n^{-2}), \qquad E(\hat\eta_r) = \gamma_r + O(n^{-1}) \eqno({\rm A}11)$$
for $r = 3, 4$, we deduce that $E(\tilde\theta_{\rm bc}) = E(U_{\rm bc}) = \theta + O(n^{-2})$, and that $V_{\rm bc} = \tilde\theta_{\rm bc} - U_{\rm bc}$ is a function of both $\mathcal{X}$ and $\mathcal{X}^*$, satisfying $E(V_{\rm bc} \mid \mathcal{X}) = 0$ (in the context of (2)(i)) and ${\rm var}(V_{\rm bc} \mid \mathcal{X}) = \{1 + o_p(1)\}(Bn)^{-1}\tau^2$. Central limit theorems for $U_{\rm bc}$ and $V_{\rm bc}$ follow from Lindeberg's theorem. In the context of (2)(i), those parts of (15) and (b)–(d), in Theorem 1, that pertain to the single-bootstrap estimator $\tilde\theta_{\rm bc}$ follow from these properties. (The exactness of the orders of magnitude of remainders in (15) can be proved by deriving concise formulae for those terms, using (A9)–(A11).)

The results discussed two paragraphs above also imply that $E(\tilde\theta_{\rm bcc}) = E(U_{\rm bcc}) = \theta + O(n^{-3})$, and of course $V_{\rm bcc} = \tilde\theta_{\rm bcc} - U_{\rm bcc}$ is a function of $\mathcal{X}$, $\mathcal{X}^*$ and $\mathcal{X}^{**}$ satisfying $E(V_{\rm bcc} \mid \mathcal{X}) = 0$.
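For orientation, in the scalar case $p = 1$ (a specialisation we add purely for illustration, writing $\hat\sigma^2 = \hat\xi_{11}$ and $\hat\kappa_3 = \hat\xi_{111}$, so that $\hat\xi_{1111} = 3\hat\sigma^4$) the quantities above reduce to
$$\hat\eta_2 = \tfrac12\, \hat\sigma^2 f''(\bar X), \qquad \hat\eta_3 = \tfrac16\, \hat\kappa_3 f'''(\bar X), \qquad \hat\eta_4 = \tfrac18\, \hat\sigma^4 f''''(\bar X),$$
$$U_{\rm bc}' = f(\bar X) - \frac{\hat\sigma^2}{2n}\, f''(\bar X) - \frac{1}{n^2}\Big\{\frac{\hat\kappa_3}{6}\, f'''(\bar X) + \frac{\hat\sigma^4}{8}\, f''''(\bar X)\Big\},$$
so that the single-bootstrap correction at (A9) subtracts an empirical version of each term of the bias expansion (A3).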
Note too that, in the context of (2)(i),
$$(BC)^2\, {\rm var}(S_{\rm bcc} - S_{\rm bc} \mid \mathcal{X}) \sim_p {\rm var}\bigg\{\sum_{b=1}^B \sum_{c=1}^C \sum_{j=1}^p f_j(\bar X^*_b)(\bar X^{**}_{bcj} - \bar X^*_{bj}) \,\bigg|\, \mathcal{X}\bigg\} = E\bigg[\bigg\{\sum_{b=1}^B \sum_{c=1}^C \sum_{j=1}^p f_j(\bar X^*_b)(\bar X^{**}_{bcj} - \bar X^*_{bj})\bigg\}^2 \,\bigg|\, \mathcal{X}\bigg]$$
$$\sim_p E\bigg[\bigg\{\sum_{b=1}^B \sum_{c=1}^C \sum_{j=1}^p f_j(\mu)(\bar X^{**}_{bcj} - \bar X^*_{bj})\bigg\}^2 \,\bigg|\, \mathcal{X}\bigg] = E\bigg(E\bigg[\bigg\{\sum_{b=1}^B \sum_{c=1}^C \sum_{j=1}^p f_j(\mu)(\bar X^{**}_{bcj} - \bar X^*_{bj})\bigg\}^2 \,\bigg|\, \mathcal{X}, \mathcal{X}^*\bigg] \,\bigg|\, \mathcal{X}\bigg)$$
$$= C\, E\bigg(E\bigg[\bigg\{\sum_{b=1}^B \sum_{j=1}^p f_j(\mu)(\bar X^{**}_{b1j} - \bar X^*_{bj})\bigg\}^2 \,\bigg|\, \mathcal{X}, \mathcal{X}^*\bigg] \,\bigg|\, \mathcal{X}\bigg) = C\, E\bigg[\bigg\{\sum_{b=1}^B \sum_{j=1}^p f_j(\mu)(\bar X^{**}_{b1j} - \bar X^*_{bj})\bigg\}^2 \,\bigg|\, \mathcal{X}\bigg]$$
$$= BC\, E\bigg[\bigg\{\sum_{j=1}^p f_j(\mu)(\bar X^{**}_{1j} - \bar X^*_{1j})\bigg\}^2 \,\bigg|\, \mathcal{X}\bigg] \sim_p BC\, n^{-1} \tau^2,$$
and ${\rm cov}(S_{\rm bcc} - S_{\rm bc}, S_{\rm bc} \mid \mathcal{X}) = o_p\{(nB)^{-1}\}$. Therefore,
$${\rm var}(\tilde\theta_{\rm bcc} \mid \mathcal{X}) = {\rm var}(V_{\rm bcc} \mid \mathcal{X}) = {\rm var}(S_{\rm bcc} - 3 S_{\rm bc} \mid \mathcal{X}) = {\rm var}(S_{\rm bcc} - S_{\rm bc} \mid \mathcal{X}) - 4\, {\rm cov}(S_{\rm bcc} - S_{\rm bc}, S_{\rm bc} \mid \mathcal{X}) + 4\, {\rm var}(S_{\rm bc} \mid \mathcal{X}) = (nB)^{-1}(4 + C^{-1})\tau^2 + o_p\{(nB)^{-1}\}.$$
Much as in the case of $\tilde\theta_{\rm bc}$, it can be proved from (A10) and (A11) that $E(\tilde\theta_{\rm bcc}) = E(U_{\rm bcc}) = \theta + O(n^{-3})$. If (2)(i) holds then these properties, and Lindeberg's central limit theorem, imply those parts of Theorem 1 that pertain to the double-bootstrap estimator $\tilde\theta_{\rm bcc}$. Cases where the model (2)(ii) holds are similar.

B Proof of Theorem 2

Consider first the solution $\beta = \beta_\alpha$, say, of the equation
$${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le x_\beta\} = \alpha, \eqno({\rm A}12)$$
where $x = x_\beta$ is the solution of
$${\rm pr}\{n^{1/2}(\hat\theta - \theta)/\hat\sigma \le x\} = \beta. \eqno({\rm A}13)$$
Note that
$${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le x \mid \mathcal{X}\} = \Phi(x) + n^{-1/2} \hat Q_1(x)\phi(x) + \cdots + n^{-3/2} \hat Q_3(x)\phi(x) + n^{-2} \hat A_n(x), \eqno({\rm A}14)$$
where the remainder $\hat A_n(x)$ satisfies $\sup_{-\infty < x < \infty} |\hat A_n(x)| = O_p(1)$, and $\hat Q_j$ denotes the version of $Q_j$, in (17), in which population moments are replaced by their empirical counterparts. It can be deduced from (A14), and from the analogous expansion of (A13), that the solution of (A12) is identical, up to terms of order $n^{-3/2}$, to the solution $x = x_\alpha$ of equation (A13) when $\beta = \alpha$ there, and in particular $x_{\beta_\alpha} = x_\alpha + O(n^{-3/2})$. Therefore,
$$x_{\beta_\alpha} = z_\alpha + n^{-1/2} Q_{\rm cf1}(z_\alpha) + n^{-1} Q_{\rm cf2}(z_\alpha) + O(n^{-3/2}), \eqno({\rm A}17)$$
where $z_\alpha = \Phi^{-1}(\alpha)$ and $Q_{\rm cf1}$ and $Q_{\rm cf2}$ are the polynomials appearing in the Cornish–Fisher expansion of $x_\alpha$.

Recall that the distribution function estimator with which we are working is the version of the second formula in (8) when $B = \infty$ and $C = 1$:
$$\tilde F^*(x) = {\rm pr}\{n^{1/2}(\hat\theta^{**} - \hat\theta^*)/\hat\sigma^{**} \le x \mid \mathcal{X}\},$$
where $\hat\theta^*$ is computed from $\mathcal{X}^*$, and $\hat\theta^{**}$ and $\hat\sigma^{**}$ are computed from $\mathcal{X}^{**}$. Since we are taking $B = \infty$ in our analysis, $\hat x_\alpha$, defined below (8) in the case of finite $B$, is now given by the limit as $B \to \infty$ of that definition, i.e. the solution in $x$ of ${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le x \mid \mathcal{X}\} = \alpha$. In this notation, $\hat\beta_\alpha$ is defined to be the solution in $\beta$ of the equation $\tilde F^*(\hat x_\beta) = \alpha$, i.e. the solution in $\beta$ of
$${\rm pr}\{n^{1/2}(\hat\theta^{**} - \hat\theta^*)/\hat\sigma^{**} \le \hat x_\beta \mid \mathcal{X}\} = \alpha. \eqno({\rm A}18)$$
Now, the solution in $\beta$ of (A18) is an estimator of the solution $\beta = \beta_\alpha$ of ${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le x_\beta\} = \alpha$, where $x = x_\beta$ is the solution of (A13). That is, a representation of $\hat x_{\hat\beta_\alpha}$ as a Cornish–Fisher expansion is identical to the analogous representation of $x_{\beta_\alpha}$, except that moments of $X$ are replaced by the corresponding moments of $X^*$ conditional on $\mathcal{X}$. Since the Cornish–Fisher expansion of $x_{\beta_\alpha}$ is given by (A17), up to and including terms of order $n^{-1}$, then
$$\hat x_{\hat\beta_\alpha} = z_\alpha + n^{-1/2} \hat Q_{\rm cf1}(z_\alpha) + n^{-1} \hat Q_{\rm cf2}(z_\alpha) + O_p(n^{-3/2}).$$
This is identical to the expansion of $\hat x_\alpha$, the solution of ${\rm pr}\{n^{1/2}(\hat\theta^* - \hat\theta)/\hat\sigma^* \le x \mid \mathcal{X}\} = \alpha$, up to and including terms of order $n^{-1}$, and so $\hat x_{\hat\beta_\alpha} = \hat x_\alpha + O_p(n^{-3/2})$, as had to be proved.

C Simulation results

In this section we provide the simulation results for sections 4.1 and 4.2.

C.1 Bias estimation in section 4.1

Tables 1 and 2 report the empirical approximations to bias, computed by averaging over the results of 5,000 Monte Carlo trials, in the settings of the exponential distribution and the log-normal distribution, respectively.

Table 1: Bias estimation based on different bootstrap methods for $\mu^3$ and $\sin(\mu)$ with the Exp(2) distribution. Entries are multiplied by $10^2$; the values in brackets denote the ratios of the estimated biases to the true bias.

n                                 20          40          60          80
theta = mu^3:
true bias                     115.1658     57.0163     38.1427     28.6419
single                        129.7612     62.6221     41.3012     30.7055
                              [1.1267]    [1.0983]    [1.0828]    [1.0720]
double with C = 1             125.9539     61.2805     40.8512     30.2225
                              [1.0937]    [1.0748]    [1.0710]    [1.0552]
double with C = 2             125.1125     61.4080     40.6490     30.2391
                              [1.0864]    [1.0770]    [1.0657]    [1.0558]
double with C = 5             125.3128     61.3515     40.5743     30.2928
                              [1.0881]    [1.0760]    [1.0638]    [1.0576]
double with C = 10            125.6812     61.4801     40.5936     30.2841
                              [1.0913]    [1.0783]    [1.0643]    [1.0573]
double with C = ⌊10B^{1/2}⌋   125.5125     61.4068     40.6418     30.2630
                              [1.0898]    [1.0770]    [1.0655]    [1.0566]
theta = sin(mu):
true bias                      -8.4970     -4.4585     -2.9896     -2.2458
single                         -6.2578     -3.8283     -2.7155     -2.1012
                              [0.7365]    [0.8587]    [0.9083]    [0.9356]
double with C = 1              -7.8440     -4.3452     -2.9636     -2.2358
                              [0.9231]    [0.9746]    [0.9913]    [0.9955]
double with C = 2              -7.8299     -4.3505     -2.9557     -2.2359
                              [0.9215]    [0.9758]    [0.9887]    [0.9956]
double with C = 5              -7.8483     -4.3475     -2.9526     -2.2383
                              [0.9237]    [0.9751]    [0.9876]    [0.9967]
double with C = 10             -7.8521     -4.3499     -2.9541     -2.2380
                              [0.9241]    [0.9756]    [0.9881]    [0.9965]
double with C = ⌊10B^{1/2}⌋    -7.8520     -4.3480     -2.9555     -2.2371
                              [0.9241]    [0.9752]    [0.9886]    [0.9961]

Table 2: Bias estimation based on different bootstrap methods for $\mu^3$ and $\sin(\mu)$ with the log-normal, $\exp\{N(0,1)\}$, distribution. Entries are multiplied by $10^2$; the values in brackets denote the ratios of the estimated biases to the true bias.

n                                 20          40          60          80
theta = mu^3:
true bias                     116.4471     55.6341     36.9453     27.9352
single                        150.1797     66.8400     42.5223     31.3730
                              [1.2897]    [1.2014]    [1.1510]    [1.1231]
double with C = 1             128.1239     59.6595     39.0303     29.2126
                              [1.1003]    [1.0724]    [1.0564]    [1.0457]
double with C = 2             131.4972     59.7961     39.2092     29.2521
                              [1.1292]    [1.0748]    [1.0613]    [1.0471]
double with C = 5             127.7990     59.7409     39.0654     29.1772
                              [1.0975]    [1.0738]    [1.0574]    [1.0445]
double with C = 10            129.5233     59.4563     39.0700     29.1729
                              [1.1123]    [1.0687]    [1.0575]    [1.0443]
double with C = ⌊10B^{1/2}⌋   128.8509     59.5656     39.1011     29.1925
                              [1.1065]    [1.0707]    [1.0584]    [1.0450]
theta = sin(mu):
true bias                      -9.8256     -5.6652     -3.9217     -2.9741
single                         -6.1373     -4.3128     -3.2181     -2.5383
                              [0.6246]    [0.7613]    [0.8206]    [0.8535]
double with C = 1              -8.1200     -5.2653     -3.7202     -2.8340
                              [0.8264]    [0.9294]    [0.9486]    [0.9529]
double with C = 2              -8.0672     -5.2670     -3.7275     -2.8318
                              [0.8210]    [0.9297]    [0.9505]    [0.9522]
double with C = 5              -8.0785     -5.2651     -3.7201     -2.8321
                              [0.8222]    [0.9294]    [0.9486]    [0.9523]
double with C = 10             -8.0812     -5.2684     -3.7214     -2.8320
                              [0.8225]    [0.9300]    [0.9489]    [0.9522]
double with C = ⌊10B^{1/2}⌋    -8.0796     -5.2667     -3.7228     -2.8324
                              [0.8223]    [0.9297]    [0.9493]    [0.9524]

C.2 Performance of n = 40 in section 4.2

Figure 3 shows the empirical coverage of the confidence intervals constructed by the different bootstrap methods when the sample size is $n = 40$.
Figure 3: Performance of bootstrap methods for confidence intervals when $n = 40$. First and second rows show results for the exponential distribution and the log-normal distribution, respectively; left- and right-hand panels show results for one-sided and two-sided equal-tailed confidence intervals, respectively. In each panel the graphs represent single-bootstrap percentile (−⋆−), single-bootstrap percentile-$t$ (− · ⋆ · −), conventional double-bootstrap percentile (−□−), conventional double-bootstrap percentile-$t$ (− · □ · −), warp-speed percentile (−♦−) and warp-speed percentile-$t$ (− · ♦ · −) methods.