Edgeworth corrections for spot volatility estimator

Abstract

We develop Edgeworth expansion theory for the spot volatility estimator under general assumptions on the log-price process that allow for drift and leverage effect. Compared with the existing second order asymptotic normality result, the expansion additionally involves estimates of skewness and kurtosis, so our theory refines the approximation to the finite sample distribution of the spot volatility estimator. We also construct feasible confidence intervals (one-sided and two-sided) for spot volatility by using the Edgeworth expansion. Our Monte Carlo simulation study shows that the intervals based on the Edgeworth expansion perform better than the conventional intervals based on the normal approximation, which supports our theoretical conclusions.

arXiv preprint [math.ST]

Lidan HE (a), Qiang LIU (b, ∗), Zhi LIU (a)

(a) Department of Mathematics, University of Macau
(b) Department of Mathematics, National University of Singapore


Keywords:

High frequency data, Spot volatility, Central limit theorem, Edgeworth expansion, Confidence interval

1. Introduction

The rapid development of computer technology and its wide application in financial markets have made high frequency data increasingly available, and research on such data in both statistics and econometrics has grown substantially over the last several decades. The volatility of an asset quantifies the strength of its fluctuation over time. It plays a pivotal role in asset and derivatives pricing, portfolio selection, risk management and hedging.

Recently, spot volatility estimation from high frequency data has received substantial attention, since it enables one to determine the variation of an asset at any given time. From a theoretical point of view, if we model the latent price of an asset as a continuous semi-martingale, spot volatility is just the coefficient of the diffusion part, namely the conditional variance of the price. Using rolling and blocking sampling filters, Foster and Nelson (1996) were the first to estimate spot volatility from high frequency data, and proved a pointwise asymptotic normality for rolling regression estimators. Fan and Wang (2008) proposed a kernel type estimator for spot volatility and established its explicit asymptotic distribution when the price and volatility processes of an asset are modeled by bivariate diffusion processes. For further literature on kernel smoothing for the estimation of spot volatility, where microstructure noise or jumps may be accommodated, see Renò (2008), Kristensen (2010), Zu and Boswijk (2014), Yu et al. (2014), Liu et al. (2018) and references therein.

Based on the asymptotic normality of the spot volatility estimator, statistical inference on volatility can be made. More precisely, confidence intervals for spot volatility can be constructed. In this paper, our main motivation is to improve upon the existing asymptotic mixed normal approximation for the kernel estimator. Our theory is built upon a general continuous semi-martingale assumption in which correlation between the price and volatility processes, known as the leverage effect in finance, is allowed.

An Edgeworth expansion is a power series expansion of the distribution of an estimator that incorporates moment information beyond the variance (see Hall (1992) for a complete introduction). Thus, it can correct the asymptotic normal approximation by including estimates of higher order moments such as skewness and kurtosis. Recently, it has been applied to volatility estimation to improve performance in small samples. The Edgeworth expansion for realized volatility, which estimates the integrated volatility, was first given in Goncalves and Meddahi (2009). Their result was based on the assumption that the volatility process is independent of the price process, so that the leverage effect was ruled out, and the drift term was not allowed. Using this result, Goncalves and Meddahi (2008) discussed how confidence intervals could be constructed to correct the normal approximation for realized volatility, and conducted Monte Carlo simulation studies to validate their conclusion. Zhang et al. (2011) further considered the presence of microstructure noise when deriving Edgeworth expansions for realized volatility and other microstructure noise robust estimators. Hounyo and Veliyev (2016) established the full formal validity of the Edgeworth expansions for the realized volatility estimators given in the above references.

In this paper, we develop the theory of Edgeworth expansion for the spot volatility estimator, and use it to construct corrected confidence intervals which refine the conventional confidence intervals based on the normal approximation.

The paper is organized as follows. In Section 2, we present the theoretical setup of our model and the related assumptions. In Section 3, we briefly review the kernel type spot volatility estimator and develop its Edgeworth expansion; the corrected confidence intervals are also constructed there. In Section 4, Monte Carlo simulation studies are conducted to evaluate the finite sample performance of the proposed corrected confidence intervals. Section 5 concludes the paper. The theoretical proofs are deferred to the Appendix.

∗ Corresponding author.
∗∗ Qiang LIU's research is supported by MOE-AcRF grant R-146-000-258-114 of Singapore. Zhi LIU's research is supported by the Science and Technology Development Fund, Macau SAR (File no. 202/2017/A3) and NSFC (No. 11971507).
Preprint dated July 23, 2020.

2. Setup

Under the assumption of an arbitrage-free and frictionless market, the logarithmic price of an asset $\{X_t\}_{t\in[0,T]}$ must be modeled as a semi-martingale (Delbaen and Schachermayer (1994)). In this paper, we assume that $\{X_t\}_{t\in[0,T]}$ is a continuous Itô semi-martingale without jumps. This is the fundamental case most widely used in the econometrics literature. In this continuous setting, the underlying data generating process $X_t$, defined on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\in[0,T]}, P)$, is driven by

$$dX_t = b_t\,dt + \sigma_t\,dB_t, \quad t\in[0,T], \tag{1}$$

where $B$ is a standard Brownian motion, and $b$ and $\sigma$ are adapted and locally bounded càdlàg processes. To guarantee the existence and uniqueness of the solution of the stochastic differential equation (1), we assume that the following Lipschitz continuity conditions are satisfied by the volatility process $\sigma$.

Assumption 1. For $s, t \in [0,T]$, there exist a constant $C$ and $0 < \alpha < 1$ such that $E[(\sigma_s - \sigma_t)^2] \le C|s-t|^{2\alpha}$. Moreover, $\sigma_t$ is bounded away from 0, that is, there exists a constant $c$ such that $\sigma_t > c > 0$.

We note that this is a rather general assumption that is widely used in the literature, for example in Jacod and Todorov (2014), Liu et al. (2018) and Zu and Boswijk (2014). A possible model of $\sigma$ that satisfies the above assumption is

$$d\sigma_t = b^{\sigma}_t\,dt + \sigma'_t\,dB_t + \sigma''_t\,dW_t, \quad t\in[0,T], \tag{2}$$

where $W$ is another standard Brownian motion independent of $B$, and $b^{\sigma}$, $\sigma'$, $\sigma''$ are adapted and locally bounded càdlàg processes. In this case, Assumption 1 is satisfied by taking $0 < \alpha \le 1/2$. Further, jumps can also be included in this model without violating the assumption. Interested readers can refer to Jacod and Todorov (2014) for the explicit form. We also note that the common driving process $B$ in the price process (1) and the volatility process (2) captures their correlation, which is called the leverage effect in finance. In contrast, Goncalves and Meddahi (2009) and Goncalves and Meddahi (2008) require $X$ and $\sigma$ to be independent in order to derive Edgeworth expansions for realized volatility. In this sense, our model is a general extension of theirs, and it is the basis on which the Edgeworth expansion for the spot volatility $\sigma_\tau^2$ at time $\tau$ is developed.

In practice, the whole realization path of $\{X_t\}$ for $t\in[0,T]$ is not available, and the price data are recorded at finitely many time points. Without loss of generality, we assume that the observations are obtained at fixed, equally spaced time points within $[0,T]$, that is $\{0, \Delta_n, 2\Delta_n, \cdots, n\Delta_n\}$ with $\Delta_n = T/n$. As $n$ tends to infinity, the time span $\Delta_n$ between consecutive observations shrinks, which results in the so-called high frequency data. In what follows, our whole theory is based on such an infill setting with $n \to \infty$. We define the shorthand $\Delta_i^n X := X_{i\Delta_n} - X_{(i-1)\Delta_n}$ for $i = 1, \ldots, n$.
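The sampling scheme above can be sketched in a few lines of Python. This is only a minimal illustration, assuming an Euler discretisation of (1) and deterministic drift and volatility functions chosen for convenience; the function names `simulate_log_price` and `increments` and the sinusoidal volatility are hypothetical and do not come from the paper.

```python
import math
import random


def simulate_log_price(n, horizon, drift, vol, seed=0):
    """Euler scheme for dX_t = b_t dt + sigma_t dB_t on the grid i * delta_n."""
    rng = random.Random(seed)
    delta_n = horizon / n  # Delta_n = T / n
    path = [0.0]
    for i in range(1, n + 1):
        t = (i - 1) * delta_n  # left end point of the i-th interval
        db = rng.gauss(0.0, math.sqrt(delta_n))  # Brownian increment
        path.append(path[-1] + drift(t) * delta_n + vol(t) * db)
    return path


def increments(path):
    """Return Delta_i^n X = X_{i Delta_n} - X_{(i-1) Delta_n}, i = 1, ..., n."""
    return [path[i] - path[i - 1] for i in range(1, len(path))]


# Illustrative (assumed) inputs: zero drift and a smooth deterministic volatility.
path = simulate_log_price(
    n=780,
    horizon=1.0,
    drift=lambda t: 0.0,
    vol=lambda t: 0.2 + 0.1 * math.sin(2.0 * math.pi * t),
)
dx = increments(path)
```

The list `dx` holds the $n$ increments on which the estimators of the next section operate; a stochastic volatility model such as (2) would replace the deterministic `vol` callable.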

3. Main results

In this paper, we are interested in estimating the spot volatility $\sigma_\tau^2$ at a given time $\tau \in [0,T]$. One of the most often used techniques is to plug a kernel function into an estimator of the integrated volatility $\int_0^T \sigma_t^2\,dt$ and then let the bandwidth parameter tend to 0 (see, e.g., Fan and Wang (2008), Renò (2008), Kristensen (2010), Zu and Boswijk (2014), Yu et al. (2014), Liu et al. (2018)). Namely, when the realized volatility of Barndorff-Nielsen and Shephard (2002) is used, the kernelized estimator of $\sigma_\tau^2$ can be written as

$$\hat\sigma^{2}_{\tau,\mathrm{ker}} = \sum_{i=1}^{n} K_h(i\Delta_n - \tau)\,(\Delta_i^n X)^2, \tag{3}$$

where $h$ is the bandwidth parameter and $K_h(x) = K(x/h)/h$, with $K(x)$ being a kernel function defined on a bounded interval $[a,b]$. We also assume that $K(x)$ is nonnegative and continuously differentiable with

$$\int_a^b K^2(x)\,dx < \infty, \qquad \int_a^b K(x)\,dx = 1. \tag{4}$$

In this paper, we consider the specific kernel function $K(x) = 1_{\{0 \le x < 1\}}$ for clarity:

$$\hat\sigma^2_\tau = \frac{1}{k_n\Delta_n} \sum_{i=\lfloor \tau/\Delta_n\rfloor + 1}^{\lfloor \tau/\Delta_n\rfloor + k_n} (\Delta_i^n X)^2, \tag{5}$$

where $k_n := \lfloor h/\Delta_n \rfloor$ is the number of intraday returns close to time $\tau$ that are used to quantify the variation of the price process $X$ at that time. We note that the asymptotic properties of $\hat\sigma^2_\tau$ can be extended to $\hat\sigma^{2}_{\tau,\mathrm{ker}}$ by replacing the weight $1/k_n$ in (5) with $\frac{\Delta_n}{h}K((i\Delta_n - \tau)/h)$ as in (3). Different kernel functions put different weights on the increments, which may lead to different asymptotic variances and higher order moments. This can be seen from (12) and (16) in Liu et al. (2018), where the uniform, Epanechnikov, quartic and triweight kernel functions are discussed for illustration.

According to the asymptotic results in the aforementioned literature for the kernel version of the spot volatility estimator, we have

$$\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau) \to^{st} N(0, 2\sigma^4_\tau), \quad \text{as } k_n \to \infty,\ k_n\Delta_n \to 0,$$

where $\to^{st}$ denotes stable convergence, which is stronger than convergence in distribution. Interested readers can refer to Jacod and Shiryayev (2003) for its rigorous definition and more detailed properties. Further, we have the following central limit theorem:

$$S(\tau, k_n) := \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau)}{\sqrt{2}\,\sigma^2_\tau} \to^{d} N(0,1), \quad \text{as } k_n \to \infty,\ k_n\Delta_n \to 0. \tag{6}$$

The above result is not feasible for inference on $\sigma^2_\tau$ in practice, since the denominator of the statistic $S(\tau, k_n)$ depends on the unknown spot volatility $\sigma^2_\tau$. Since $\hat\sigma^2_\tau$ estimates $\sigma^2_\tau$ consistently, we obtain the following feasible version of the second order asymptotic result:

$$T(\tau, k_n) := \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau)}{\sqrt{2}\,\hat\sigma^2_\tau} \to^{d} N(0,1), \quad \text{as } k_n \to \infty,\ k_n\Delta_n \to 0. \tag{7}$$

With the asymptotic distribution results (6) and (7), statistical inference on $\sigma^2_\tau$ reduces to constructing confidence intervals for $\sigma^2_\tau$. In what follows, we show how Edgeworth expansions can be derived for the statistics $S(\tau, k_n)$ and $T(\tau, k_n)$, based on which more accurate confidence intervals can be given.

3.2. Edgeworth expansions for spot volatility estimator

Let $\kappa_j[S(\tau, k_n)]$ and $\kappa_j[T(\tau, k_n)]$ denote the $j$-th order cumulants of $S(\tau, k_n)$ and $T(\tau, k_n)$. The Edgeworth expansions for $S(\tau, k_n)$ and $T(\tau, k_n)$ depend on their cumulants. The following lemma gives the first four cumulants of the two statistics.

Lemma 1.

Under Assumption 1 and conditional on $\sigma_\tau$, we have

$$\kappa_1[S(\tau,k_n)] = 0 + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \qquad \kappa_2[S(\tau,k_n)] = 1 + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$\kappa_3[S(\tau,k_n)] = \frac{2\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \qquad \kappa_4[S(\tau,k_n)] = \frac{12}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$

and further,

$$\kappa_1[T(\tau,k_n)] = -\frac{\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$\kappa_2[T(\tau,k_n)] = 1 + \frac{8}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$\kappa_3[T(\tau,k_n)] = -\frac{4\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$\kappa_4[T(\tau,k_n)] = \frac{60}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big).$$

Now, we are ready to give the Edgeworth expansions of $S(\tau, k_n)$ and $T(\tau, k_n)$.

Theorem 1.

Under Assumption 1 and conditional on $\sigma_\tau$, if $k_n \to \infty$ and $k_n^{\alpha+3/2}\Delta_n^{\alpha} \to 0$, then we have the following second order Edgeworth expansions for $S(\tau, k_n)$ and $T(\tau, k_n)$ for any given $x \in \mathbb{R}$:

$$P(S(\tau,k_n) \le x) = \Phi(x) + \frac{1}{\sqrt{k_n}}\,p_1(x)\phi(x) + \frac{1}{k_n}\,p_2(x)\phi(x) + o\Big(\frac{1}{k_n}\Big), \tag{8}$$
$$P(T(\tau,k_n) \le x) = \Phi(x) + \frac{1}{\sqrt{k_n}}\,q_1(x)\phi(x) + \frac{1}{k_n}\,q_2(x)\phi(x) + o\Big(\frac{1}{k_n}\Big), \tag{9}$$

with

$$p_1(x) = -\frac{\sqrt{2}}{3}(x^2 - 1), \qquad p_2(x) = -\frac{1}{2}H_3(x) - \frac{1}{9}H_5(x),$$
$$q_1(x) = \sqrt{2} + \frac{2\sqrt{2}}{3}(x^2 - 1), \qquad q_2(x) = -5H_1(x) - \frac{23}{6}H_3(x) + \frac{4}{9}H_5(x),$$

where $\Phi(\cdot)$ and $\phi(\cdot)$ are the standard normal cumulative distribution function and density function respectively, and $H_i$ denotes the $i$-th order Hermite polynomial, with $H_1(x) = x$, $H_3(x) = x(x^2 - 3)$ and $H_5(x) = x(x^4 - 10x^2 + 15)$.

Remark 1.

For the sample mean of independent and identically distributed random variables, the expansion of its distribution function is obtained from the characteristic function. The Hermite polynomials arise from the inverse Fourier-Stieltjes transform of the characteristic function of a standard normal random variable and are orthogonal with respect to $\phi$. The detailed derivation can be found in Section 2.2 of Hall (1992). Thus, the conclusions (8) and (9) above can be established as long as finitely many moments of $S(\tau, k_n)$ and $T(\tau, k_n)$ can be approximated.

Remark 2. For the choice of the parameter $k_n$, one option is to take $k_n = \lfloor c\Delta_n^{-s} \rfloor$. In this case, the condition $k_n^{\alpha+3/2}\Delta_n^{\alpha} \to 0$ holds for any $s$ with $0 < s < \frac{\alpha}{\alpha + 3/2}$.

In this section, we provide confidence intervals for $\sigma^2_\tau$ based on the Edgeworth expansions of the last part. We first describe the one-sided intervals, which are easier to understand. The discussion of the two-sided confidence interval follows. All of our discussions focus on intervals for $\sigma^2_\tau$ based on the studentized statistic $T(\tau, k_n)$.

Based on the asymptotic normality result (7), the conventional 95% level one-sided confidence interval for $\sigma^2_\tau$ can be written as

$$I_{N-T,1} = \Big(0,\ \hat\sigma^2_\tau - \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.05}}{\sqrt{k_n}}\Big),$$

where $z_{0.05} = -1.645$ is the 5% quantile of the standard normal distribution. By the second order Edgeworth expansion for $T(\tau, k_n)$ in (9), this one-sided confidence interval has coverage probability

$$P(\sigma^2_\tau \in I_{N-T,1}) = P(T(\tau,k_n) \ge z_{0.05}) = 1 - P(T(\tau,k_n) < z_{0.05}) = 1 - \Big[\Phi(z_{0.05}) + \frac{\phi(z_{0.05})\,q_1(z_{0.05})}{\sqrt{k_n}} + O\Big(\frac{1}{k_n}\Big)\Big] = 0.95 - \frac{\phi(z_{0.05})\,q_1(z_{0.05})}{\sqrt{k_n}} + O\Big(\frac{1}{k_n}\Big). \tag{10}$$

Hence the error in the coverage probability of $I_{N-T,1}$ is of order $O(1/\sqrt{k_n})$. This inspires us to consider the following corrected one-sided confidence interval for $\sigma^2_\tau$:

$$I_{E-T,1} = \Big(0,\ \hat\sigma^2_\tau - \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.05}}{\sqrt{k_n}} + \frac{\sqrt{2}\,\hat\sigma^2_\tau\, q_1(z_{0.05})}{k_n}\Big),$$

where we recall that $q_1(x)$ is defined in Theorem 1. The above interval introduces a skewness correction term, namely $\sqrt{2}\,\hat\sigma^2_\tau\, q_1(z_{0.05})/k_n$. The coverage probability of $I_{E-T,1}$ is

$$P(\sigma^2_\tau \in I_{E-T,1}) = P\Big(T(\tau,k_n) \ge z_{0.05} - \frac{q_1(z_{0.05})}{\sqrt{k_n}}\Big) = 1 - \Big[\Phi\Big(z_{0.05} - \frac{q_1(z_{0.05})}{\sqrt{k_n}}\Big) + \frac{1}{\sqrt{k_n}}\,q_1\Big(z_{0.05} - \frac{q_1(z_{0.05})}{\sqrt{k_n}}\Big)\,\phi\Big(z_{0.05} - \frac{q_1(z_{0.05})}{\sqrt{k_n}}\Big) + O\Big(\frac{1}{k_n}\Big)\Big] = 0.95 + O\Big(\frac{1}{k_n}\Big), \tag{11}$$

which follows from the arguments in Section 3.8 of Hall (1992): a first order Taylor expansion of $\Phi$ around $z_{0.05}$ produces a term $+\phi(z_{0.05})q_1(z_{0.05})/\sqrt{k_n}$ that cancels the skewness term. We see that the coverage probability error of $I_{E-T,1}$ is of order $O(1/k_n)$. Compared with the order $O(1/\sqrt{k_n})$ for $I_{N-T,1}$ based on the normal approximation, the corrected interval is more accurate.

3.3.2. Two-sided confidence interval

As with the one-sided corrected confidence interval for $\sigma^2_\tau$, we can apply the Edgeworth expansion to develop a corresponding two-sided version. Following the discussion in the last part and using the asymptotic normality result (7), the conventional 95% level two-sided confidence interval for $\sigma^2_\tau$ is

$$I_{N-T,2} = \Big(\hat\sigma^2_\tau - \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.975}}{\sqrt{k_n}},\ \hat\sigma^2_\tau + \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.975}}{\sqrt{k_n}}\Big), \tag{12}$$

where $z_{0.975} = 1.96$ is the 97.5% quantile of the standard normal distribution. Its coverage probability is given by

$$P(\sigma^2_\tau \in I_{N-T,2}) = P(|T(\tau,k_n)| \le z_{0.975}) = 2\Phi(z_{0.975}) - 1 + \frac{2\phi(z_{0.975})\,q_2(z_{0.975})}{k_n} + o\Big(\frac{1}{k_n}\Big) = 0.95 + \frac{2\phi(z_{0.975})\,q_2(z_{0.975})}{k_n} + o\Big(\frac{1}{k_n}\Big). \tag{13}$$

The above result is derived by using the second order Edgeworth expansion (9) for $T(\tau, k_n)$, together with the symmetry properties of $\Phi$, $\phi$, $q_1$ and $q_2$ ($\phi$ and $q_1$ are even, $q_2$ is odd). It can be seen that the coverage probability error of $I_{N-T,2}$ is of order $O(1/k_n)$. The corrected interval, which contains a skewness and kurtosis correction term and is based on the Edgeworth expansion of $T(\tau, k_n)$, is given by

$$I_{E-T,2} = \Big(\hat\sigma^2_\tau - \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.975}}{\sqrt{k_n}} + \frac{\sqrt{2}\,\hat\sigma^2_\tau\, q_2(z_{0.975})}{k_n^{3/2}},\ \hat\sigma^2_\tau + \frac{\sqrt{2}\,\hat\sigma^2_\tau\, z_{0.975}}{\sqrt{k_n}} - \frac{\sqrt{2}\,\hat\sigma^2_\tau\, q_2(z_{0.975})}{k_n^{3/2}}\Big).$$

By a proof similar to that of (11), we can show that the coverage probability of $I_{E-T,2}$ is

$$P(\sigma^2_\tau \in I_{E-T,2}) = P\Big(|T(\tau,k_n)| \le z_{0.975} - \frac{q_2(z_{0.975})}{k_n}\Big) = 0.95 + O\Big(\frac{1}{k_n^{2}}\Big), \tag{14}$$

which implies that the coverage probability error of $I_{E-T,2}$ is of order $O(1/k_n^{2})$. Comparing (13) and (14) shows to what degree the two-sided confidence interval is corrected by the Edgeworth expansion.

Both the one-sided and two-sided corrected confidence intervals have a smaller error order than the corresponding intervals based on the normal approximation. So far, we have provided the corrected confidence intervals for $\sigma^2_\tau$ based on the studentized statistic $T(\tau, k_n)$. Similar results also hold for the normalized statistic $S(\tau, k_n)$, but since it is an infeasible statistic, we do not discuss it in detail.
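The local estimator (5) and the two one-sided intervals of this section can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the polynomial `q1` uses the form $q_1(x) = \sqrt{2} + \frac{2\sqrt{2}}{3}(x^2-1)$ implied by the cumulants of Lemma 1, and the constant-volatility input at the end is an assumed toy example.

```python
import math
from statistics import NormalDist


def spot_variance(dx, tau, delta_n, k_n):
    """Local estimator (5): scaled sum of the k_n squared increments after tau."""
    start = math.floor(tau / delta_n)  # increments start+1, ..., start+k_n (1-based)
    window = dx[start:start + k_n]
    if len(window) < k_n:
        raise ValueError("not enough increments to the right of tau")
    return sum(x * x for x in window) / (k_n * delta_n)


def q1(x):
    """First order polynomial for the studentized statistic T (assumed form)."""
    return math.sqrt(2.0) + (2.0 * math.sqrt(2.0) / 3.0) * (x * x - 1.0)


def one_sided_upper_bounds(sigma2_hat, k_n, level=0.95):
    """Upper end points of the normal and the Edgeworth-corrected intervals (0, .)."""
    z = NormalDist().inv_cdf(1.0 - level)  # lower quantile, negative for level > 0.5
    scale = math.sqrt(2.0) * sigma2_hat
    normal = sigma2_hat - scale * z / math.sqrt(k_n)
    corrected = normal + scale * q1(z) / k_n  # skewness correction term
    return normal, corrected


# Toy input (assumption): constant volatility 0.2, so every squared increment
# equals 0.04 * delta_n and the estimator returns 0.04 up to rounding.
delta_n = 1.0 / 780
dx = [0.2 * math.sqrt(delta_n)] * 780
sigma2_hat = spot_variance(dx, tau=0.5, delta_n=delta_n, k_n=30)
normal_ub, corrected_ub = one_sided_upper_bounds(sigma2_hat, k_n=30)
```

Because $q_1(z_{0.05}) > 0$, the corrected upper end point always lies to the right of the normal one, which is how the correction counteracts the undercoverage of the conventional interval.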

4. Simulation studies

In this section, we conduct Monte Carlo studies to evaluate the finite sample performance of the corrected intervals based on the Edgeworth expansion, namely $I_{E-T,1}$ and $I_{E-T,2}$. We also compare their performance with that of the corresponding asymptotic theory-based intervals $I_{N-T,1}$ and $I_{N-T,2}$. The simulation results show that the corrected intervals consistently outperform the corresponding non-corrected versions under different settings, which supports our theoretical analysis in the last section.

We consider two stochastic volatility models in our data generating process (1). The first is the following one factor stochastic volatility model:

Model I: $\sigma_t = \exp(\beta_0 + \beta_1 v_t)$, $dv_t = \alpha v_t\,dt + dW_t$,

where $W$ is a standard Brownian motion independent of $B$, and $\beta_0$, $\beta_1$ and $\alpha$ are constants. The second is a two factor stochastic volatility model:

Model II: $\sigma_t = f(\beta_0 + \beta_1 v_{1,t} + \beta_2 v_{2,t})$, $dv_{1,t} = \alpha_1 v_{1,t}\,dt + dW_{1,t}$, $dv_{2,t} = \alpha_2 v_{2,t}\,dt + (1 + \phi v_{2,t})\,dW_{2,t}$,

where $W_1$, $W_2$ are mutually independent standard Brownian motions that are also independent of $B$; $\beta_0, \beta_1, \beta_2, \alpha_1, \alpha_2, \phi$ are constants; and the function $f(\cdot)$ is defined as

$$f(x) = \begin{cases} \exp(x), & \text{if } x \le \log(1.5), \\ 1.5\sqrt{1 - \log(1.5) + x^2/\log(1.5)}, & \text{otherwise.} \end{cases}$$

For the parameter settings, we follow Zu and Boswijk (2014), Huang and Tauchen (2005) and Barndorff-Nielsen et al. (2008), with $\beta_1 = 0.\ldots$, $\alpha = -\ldots$, $\beta_0 = \beta_1^2/(2\alpha)$ for Model I; $\beta_0 = -\ldots$, $\beta_1 = 0.\ldots$, $\beta_2 = 1.\ldots$, $\alpha_1 = -\ldots$, $\alpha_2 = -\ldots$, $\phi = 0.25$ for Model II. The initial value in both models is 0.1, and the drift term in (1) is taken to be $b_t \equiv 1$. For the aforementioned models, a total of 10000 paths are generated, and the estimation of $\sigma^2_\tau$ at $\tau = 0.\ldots, 0.\ldots, 0.\ldots$ is considered. The sample sizes $n$ = 780, 1560, 4680, 7800, 11700, 23400 are considered; they correspond to "30-second", "15-second", "5-second", "3-second", "2-second" and "1-second" interval returns. We set $k_n$ as $\lfloor c\,n^{\ldots} \rfloor$ with $c$ equal to $0.\ldots$

Table 1: Coverage probabilities (%) of nominal 95% confidence intervals for $\sigma^2_\tau$ in Model I

One-sided intervals
            τ = 0.…              τ = 0.…              τ = 0.…
n           I_{N-T,1} I_{E-T,1}  I_{N-T,1} I_{E-T,1}  I_{N-T,1} I_{E-T,1}
780         79.96     87.38      79.99     87.72      79.01     87.38
1560        85.10     91.52      85.68     92.26      85.41     91.52
4680        88.01     93.45      88.59     93.95      88.77     93.45
7800        88.50     93.56      87.54     92.98      88.64     93.56
11700       91.29     95.32      90.64     94.84      91.18     95.32
23400       93.03     96.52      92.64     96.10      92.44     96.52

Two-sided intervals
            τ = 0.…              τ = 0.…              τ = 0.…
n           I_{N-T,2} I_{E-T,2}  I_{N-T,2} I_{E-T,2}  I_{N-T,2} I_{E-T,2}
780         80.56     82.94      81.30     83.82      81.47     84.02
1560        87.16     89.17      87.17     89.57      86.83     88.92
4680        90.15     91.84      89.68     91.61      90.41     92.49
7800        89.40     91.24      90.20     92.09      89.86     91.61
11700       92.36     94.04      92.43     93.73      92.51     94.15
23400       94.62     95.84      93.57     95.02      94.16     95.41

Tables 1 and 2 report the coverage probabilities of $I_{N-T,1}$, $I_{E-T,1}$, $I_{N-T,2}$ and $I_{E-T,2}$ when a nominal coverage probability of 95% is considered for the above two models. Similar phenomena are observed for the two models. The degree of undercoverage of the normal approximation based intervals is larger than that of the corresponding Edgeworth corrected versions. For relatively lower frequency data, namely smaller values of $n$, the degree of undercoverage is larger. When the frequency is high enough, say $n = 23400$, the coverage probabilities of the Edgeworth corrected confidence intervals are close to 95%. In short, the correction substantially reduces the coverage distortions associated with the conventional confidence intervals.

Table 2: Coverage probabilities (%) of nominal 95% confidence intervals for $\sigma^2_\tau$ in Model II

One-sided intervals
            τ = 0.…              τ = 0.…              τ = 0.…
n           I_{N-T,1} I_{E-T,1}  I_{N-T,1} I_{E-T,1}  I_{N-T,1} I_{E-T,1}
780         78.95     86.52      78.75     86.68      79.15     87.02
1560        86.11     92.14      85.00     91.74      85.13     91.87
4680        88.80     93.95      88.75     93.97      88.97     94.22
7800        88.18     93.15      88.48     93.45      88.20     93.19
11700       91.00     95.30      90.75     95.00      90.63     94.97
23400       92.81     96.31      92.41     95.97      93.09     96.67

Two-sided intervals
            τ = 0.…              τ = 0.…              τ = 0.…
n           I_{N-T,2} I_{E-T,2}  I_{N-T,2} I_{E-T,2}  I_{N-T,2} I_{E-T,2}
780         80.79     83.36      81.02     83.47      80.77     83.34
1560        87.20     89.27      87.41     89.66      86.91     89.02
4680        90.45     92.22      90.22     92.27      90.19     92.12
7800        89.33     91.29      89.32     91.38      89.57     91.52
11700       92.36     93.98      92.81     94.46      92.31     93.92
23400       94.17     95.52      94.28     95.05      93.87     95.00
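The kind of coverage comparison reported in the tables can be mimicked with a much simpler experiment. The sketch below is not the simulation design of this section: it assumes constant volatility and zero drift (so the local estimator is a scaled chi-square average), uses a hypothetical `coverage` function, and treats only the one-sided intervals with $q_1(x) = \sqrt{2} + \frac{2\sqrt{2}}{3}(x^2-1)$.

```python
import math
import random
from statistics import NormalDist


def coverage(k_n, reps, sigma2=1.0, seed=1):
    """Empirical coverage of the one-sided normal and corrected intervals."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.05)  # 5% quantile of the standard normal
    q1_z = math.sqrt(2.0) + (2.0 * math.sqrt(2.0) / 3.0) * (z * z - 1.0)
    sd = math.sqrt(sigma2)
    hits_normal = 0
    hits_edgeworth = 0
    for _ in range(reps):
        # Under constant volatility and zero drift, each increment divided by
        # sqrt(delta_n) is N(0, sigma2), so the estimator is a mean of k_n squares.
        s2_hat = sum(rng.gauss(0.0, sd) ** 2 for _ in range(k_n)) / k_n
        upper_n = s2_hat - math.sqrt(2.0) * s2_hat * z / math.sqrt(k_n)
        upper_e = upper_n + math.sqrt(2.0) * s2_hat * q1_z / k_n
        hits_normal += sigma2 < upper_n
        hits_edgeworth += sigma2 < upper_e
    return hits_normal / reps, hits_edgeworth / reps


cov_normal, cov_edgeworth = coverage(k_n=30, reps=5000)
print(f"normal: {cov_normal:.3f}, Edgeworth-corrected: {cov_edgeworth:.3f}")
```

Even in this stripped-down setting the normal interval undercovers for small $k_n$ and the corrected interval moves the empirical coverage toward the nominal 95%, in line with the pattern in the tables.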

5. Conclusion

We derive the Edgeworth expansion for the kernel type estimator of spot volatility, which provides a more accurate approximation to its distribution than the usual mixed normal limit. Our theory is established in the presence of the leverage effect, which has not been considered in the existing literature on Edgeworth corrections for volatility estimators. By applying our theoretical results, we correct the one-sided and two-sided confidence intervals based on the usual central limit theorem. In simulations, the corrected confidence intervals show superior finite sample performance, both for the one-sided and the two-sided versions.

Appendix

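To simplify the proof, we consider $b_s \equiv 0$, since the drift $b$ has no effect on the estimation of volatility. We define the following notation in advance:

$$\hat\sigma^{2\prime}_\tau = \frac{1}{k_n\Delta_n}\sum_{i=\lfloor \tau/\Delta_n\rfloor+1}^{\lfloor \tau/\Delta_n\rfloor+k_n} (\sigma_\tau \Delta_i^n B)^2, \qquad R(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \hat\sigma^{2\prime}_\tau)}{\sqrt{2}\,\hat\sigma^2_\tau}, \qquad R'(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \hat\sigma^{2\prime}_\tau)}{\sqrt{2}\,\sigma^2_\tau},$$
$$M(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^{2\prime}_\tau - \sigma^2_\tau)}{\sqrt{2}\,\sigma^2_\tau}, \qquad U(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau)}{\sigma^2_\tau}, \qquad Q(\tau,k_n) = M(\tau,k_n)\Big(1 + \frac{1}{\sqrt{k_n}}U(\tau,k_n)\Big)^{-1},$$

and observe that

$$S(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau)}{\sqrt{2}\,\sigma^2_\tau} = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \hat\sigma^{2\prime}_\tau)}{\sqrt{2}\,\sigma^2_\tau} + \frac{\sqrt{k_n}\,(\hat\sigma^{2\prime}_\tau - \sigma^2_\tau)}{\sqrt{2}\,\sigma^2_\tau} = R'(\tau,k_n) + M(\tau,k_n),$$
$$T(\tau,k_n) = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \sigma^2_\tau)}{\sqrt{2}\,\hat\sigma^2_\tau} = \frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \hat\sigma^{2\prime}_\tau)}{\sqrt{2}\,\hat\sigma^2_\tau} + \frac{\sqrt{k_n}\,(\hat\sigma^{2\prime}_\tau - \sigma^2_\tau)}{\sqrt{2}\,\sigma^2_\tau}\cdot\frac{\sigma^2_\tau}{\hat\sigma^2_\tau} = R(\tau,k_n) + Q(\tau,k_n).$$

Proof of Lemma 1: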

For $R'(\tau, k_n)$ and $R(\tau, k_n)$, under Assumption 1, we have

$$E[|R(\tau,k_n)|] = E\Big[\Big|\frac{\sqrt{k_n}\,(\hat\sigma^2_\tau - \hat\sigma^{2\prime}_\tau)}{\sqrt{2}\,\hat\sigma^2_\tau}\Big|\Big] \le \frac{\sqrt{k_n}\,C}{k_n\Delta_n}\sum_{i=\lfloor \tau/\Delta_n\rfloor+1}^{\lfloor \tau/\Delta_n\rfloor+k_n} E\big[|(\Delta_i^n X)^2 - (\sigma_\tau\Delta_i^n B)^2|\big]$$
$$\le \frac{\sqrt{k_n}\,C}{k_n\Delta_n}\sum_{i=\lfloor \tau/\Delta_n\rfloor+1}^{\lfloor \tau/\Delta_n\rfloor+k_n} \Big(E\Big[\Big|\int_{(i-1)\Delta_n}^{i\Delta_n}(\sigma_s - \sigma_\tau)\,dB_s\Big|^2\Big]\Big)^{1/2}\Big(E\Big[\Big|\int_{(i-1)\Delta_n}^{i\Delta_n}(\sigma_s + \sigma_\tau)\,dB_s\Big|^2\Big]\Big)^{1/2} \le C k_n^{\alpha+1/2}\Delta_n^{\alpha}.$$

The same result holds for $R'(\tau, k_n)$ and can be derived similarly. Thus, we have $R(\tau,k_n) = O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})$ and $R'(\tau,k_n) = O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})$.

For $Q(\tau,k_n) = M(\tau,k_n)(1 + \frac{1}{\sqrt{k_n}}U(\tau,k_n))^{-1}$, we use the second order Taylor expansion of $f(x) = (1+x)^{-k}$ around 0 for any fixed positive integer $k$, namely $f(x) = 1 - kx + \frac{k(k+1)}{2}x^2 + O(x^3)$. Together with the facts $M(\tau,k_n) = O_p(1)$ and $U(\tau,k_n) = O_p(1)$, we have

$$Q(\tau,k_n)^k = M(\tau,k_n)^k\Big(1 - k\frac{U(\tau,k_n)}{\sqrt{k_n}} + \frac{k(k+1)}{2}\frac{U(\tau,k_n)^2}{k_n}\Big) + O_p\big(k_n^{-3/2}\big). \tag{15}$$

We note that, conditional on the information at time $\tau$, $\sigma_\tau$ can be treated as a constant, and the following results hold:

$$E[M(\tau,k_n)] = 0, \quad E[M(\tau,k_n)^2] = 1, \quad E[M(\tau,k_n)^3] = \frac{2\sqrt{2}}{\sqrt{k_n}}, \quad E[M(\tau,k_n)^4] = 3 + \frac{12}{k_n},$$
$$E[M(\tau,k_n)^5] = \frac{20\sqrt{2}}{\sqrt{k_n}} + \frac{48\sqrt{2}}{k_n^{3/2}}, \quad E[M(\tau,k_n)^6] = 15 + \frac{260}{k_n} + \frac{480}{k_n^2},$$
$$E[M(\tau,k_n)U(\tau,k_n)] = \sqrt{2} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad E[M(\tau,k_n)^2U(\tau,k_n)] = \frac{4}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$E[M(\tau,k_n)^3U(\tau,k_n)] = 3\sqrt{2} + \frac{12\sqrt{2}}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad E[M(\tau,k_n)^4U(\tau,k_n)] = \frac{40}{\sqrt{k_n}} + \frac{96}{k_n^{3/2}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$E[M(\tau,k_n)U(\tau,k_n)^2] = \frac{4\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad E[M(\tau,k_n)^2U(\tau,k_n)^2] = 6 + \frac{24}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$E[M(\tau,k_n)^3U(\tau,k_n)^2] = \frac{40\sqrt{2}}{\sqrt{k_n}} + \frac{96\sqrt{2}}{k_n^{3/2}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad E[M(\tau,k_n)^4U(\tau,k_n)^2] = 30 + \frac{520}{k_n} + \frac{960}{k_n^2} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big).$$

Furthermore, from (15) we obtain

$$E[Q(\tau,k_n)] = -\frac{\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad E[Q(\tau,k_n)^2] = 1 + \frac{10}{k_n} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$E[Q(\tau,k_n)^3] = -\frac{7\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad E[Q(\tau,k_n)^4] = 3 + \frac{152}{k_n} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big).$$

Since $T(\tau,k_n) = R(\tau,k_n) + Q(\tau,k_n)$ and the first four cumulants of $T(\tau,k_n)$ are given by (see, e.g., Hall (1992))

$$\kappa_1(T) = E[T], \qquad \kappa_2(T) = E[T^2] - (E[T])^2, \qquad \kappa_3(T) = E[T^3] - 3E[T^2]E[T] + 2(E[T])^3,$$
$$\kappa_4(T) = E[T^4] - 4E[T^3]E[T] - 3(E[T^2])^2 + 12E[T^2](E[T])^2 - 6(E[T])^4,$$

with $T = T(\tau,k_n)$, we further have

$$\kappa_1[T(\tau,k_n)] = -\frac{\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad \kappa_2[T(\tau,k_n)] = 1 + \frac{8}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$\kappa_3[T(\tau,k_n)] = -\frac{4\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad \kappa_4[T(\tau,k_n)] = \frac{60}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big).$$

Since $S(\tau,k_n) = R'(\tau,k_n) + M(\tau,k_n)$, we similarly obtain

$$\kappa_1[S(\tau,k_n)] = 0 + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad \kappa_2[S(\tau,k_n)] = 1 + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$\kappa_3[S(\tau,k_n)] = \frac{2\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big), \quad \kappa_4[S(\tau,k_n)] = \frac{12}{k_n} + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big). \qquad \square$$

Proof of Theorem 1:

We observe that

$$P(M(\tau,k_n) \le x) = \Phi(x) + \frac{1}{\sqrt{k_n}}\,p_1(x)\phi(x) + \frac{1}{k_n}\,p_2(x)\phi(x) + o\Big(\frac{1}{k_n}\Big), \tag{16}$$
$$P(Q(\tau,k_n) \le x) = \Phi(x) + \frac{1}{\sqrt{k_n}}\,q_1(x)\phi(x) + \frac{1}{k_n}\,q_2(x)\phi(x) + o\Big(\frac{1}{k_n}\Big), \tag{17}$$

which follow from (2.17) and Section 2.3 in Hall (1992), the condition $k_n^{\alpha+3/2}\Delta_n^{\alpha} \to 0$, and the following cumulants:

$$\kappa_1[M(\tau,k_n)] = 0 + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big), \quad \kappa_2[M(\tau,k_n)] = 1 + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big),$$
$$\kappa_3[M(\tau,k_n)] = \frac{2\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big), \quad \kappa_4[M(\tau,k_n)] = \frac{12}{k_n} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big),$$

and

$$\kappa_1[Q(\tau,k_n)] = -\frac{\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad \kappa_2[Q(\tau,k_n)] = 1 + \frac{8}{k_n} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big),$$
$$\kappa_3[Q(\tau,k_n)] = -\frac{4\sqrt{2}}{\sqrt{k_n}} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big), \quad \kappa_4[Q(\tau,k_n)] = \frac{60}{k_n} + O_p\big(k_n^{\alpha}\Delta_n^{\alpha}\big) + O_p\big(k_n^{-3/2}\big).$$

The above cumulant results follow directly from the proof of Lemma 1.

For any given $x \in \mathbb{R}$ and $h \ge 0$, we note that there exists a constant $C$ such that

$$P(M(\tau,k_n) \le x + h) - P(M(\tau,k_n) \le x) \le Ch,$$

and the same bound holds for $Q(\tau,k_n)$, since $\Phi(x)$, $p_1(x)\phi(x)$, $p_2(x)\phi(x)$ in (16) and $q_1(x)\phi(x)$, $q_2(x)\phi(x)$ in (17) are differentiable with continuous derivatives. As shown in the proof of Lemma 1, we have $R(\tau,k_n) = O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})$ and $R'(\tau,k_n) = O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})$. Thus,

$$P(S(\tau,k_n) \le x) = P(R'(\tau,k_n) + M(\tau,k_n) \le x) = P\big(M(\tau,k_n) \le x + O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})\big) = P(M(\tau,k_n) \le x) + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big),$$
$$P(T(\tau,k_n) \le x) = P(R(\tau,k_n) + Q(\tau,k_n) \le x) = P\big(Q(\tau,k_n) \le x + O_p(k_n^{\alpha+1/2}\Delta_n^{\alpha})\big) = P(Q(\tau,k_n) \le x) + O_p\big(k_n^{\alpha+1/2}\Delta_n^{\alpha}\big).$$

Together with the condition $k_n^{\alpha+3/2}\Delta_n^{\alpha} \to 0$, we obtain the conclusions (8) and (9). $\square$
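The moment-to-cumulant step used in the proof of Lemma 1 can be checked numerically. The sketch below is an illustration under stated assumptions: it plugs in only the leading terms of $E[Q(\tau,k_n)^j]$, namely $-\sqrt{2}/\sqrt{k_n}$, $1 + 10/k_n$, $-7\sqrt{2}/\sqrt{k_n}$ and $3 + 152/k_n$, where the coefficient 7 in the third moment is inferred from the moments of $M$ and $U$ via (15) and should be treated as an assumption. The helper names are hypothetical.

```python
import math


def cumulants_from_moments(m1, m2, m3, m4):
    """First four cumulants from the first four raw moments."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3
    k4 = (m4 - 4.0 * m3 * m1 - 3.0 * m2 ** 2
          + 12.0 * m2 * m1 ** 2 - 6.0 * m1 ** 4)
    return k1, k2, k3, k4


def leading_moments_of_q(k_n):
    """Leading terms of E[Q^j], j = 1..4, with all remainder terms dropped."""
    r2 = math.sqrt(2.0)
    root = math.sqrt(k_n)
    return (-r2 / root, 1.0 + 10.0 / k_n, -7.0 * r2 / root, 3.0 + 152.0 / k_n)


k_n = 10 ** 6
k1, k2, k3, k4 = cumulants_from_moments(*leading_moments_of_q(k_n))

# Rescale so that the leading coefficients of the cumulants of T can be read off.
print(k1 * math.sqrt(k_n), (k2 - 1.0) * k_n, k3 * math.sqrt(k_n), k4 * k_n)
```

For large $k_n$ the rescaled values approach $-\sqrt{2}$, $8$, $-4\sqrt{2}$ and $60$, the leading coefficients of the cumulants of $T(\tau,k_n)$; the small discrepancies are of the order of the neglected higher order terms.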