Bias-Optimal Vol-of-Vol Estimation: the Role of Window Overlapping
arXiv preprint [econ.EM]
Giacomo Toscano
Scuola Normale Superiore, Italy - email: [email protected]; address: Piazza dei Cavalieri 7, 56126 Pisa, Italy
Maria Cristina Recchioni
Università Politecnica delle Marche, Italy - email: [email protected]; address: Piazzale Martelli 8, 60121 Ancona, Italy
Abstract
We derive a feasible criterion for the bias-optimal selection of the tuning parameters involved in estimating the integrated volatility of the spot volatility via the simple realized estimator by Barndorff-Nielsen and Veraart (2009). Our analytic results are obtained assuming that the spot volatility is a continuous mean-reverting process and that consecutive local windows for estimating the spot volatility are allowed to overlap in a finite sample setting. Moreover, our analytic results support some optimal selections of tuning parameters prescribed in the literature, based on numerical evidence. Interestingly, it emerges that the window-overlapping is crucial for optimizing the finite-sample bias of volatility-of-volatility estimates.
Keywords: stochastic volatility of volatility, high-frequency data, bias optimization, CIR model, CKLS model.
Preprint submitted to Elsevier, April 9, 2020
1. Introduction
Estimating the volatility of asset volatility is relevant in many areas of mathematical finance. These include the calibration of stochastic volatility of volatility models (Barndorff-Nielsen and Veraart (2009), Sanfelici et al. (2015)), the hedging of portfolios against volatility of volatility risk (Huang et al. (2018)), the estimation of the leverage effect (Kalnina and Xiu (2017), Aït-Sahalia et al. (2017)), and the inference of future returns (Bollerslev et al. (2009)), along with spot volatilities (Mykland and Zhang (2009)). Note that in this paper we use the term volatility to refer to the variance of the price process, as is customary in the literature on high-frequency econometrics.
To the best of our knowledge, the literature offers at least the following five consistent estimators of the integrated volatility of volatility (hereinafter vol-of-vol):
(a) the estimator introduced by Barndorff-Nielsen and Veraart (2009), called the pre-estimated spot-variance based realized variance (PSRV), which is, in fact, the realized variance of the unobservable spot volatility computed using estimates of the latter;
(b) the estimator derived by Vetter (2015), which is a modified version of the PSRV allowing for a central limit theorem with optimal rate of convergence, but also allowing for negative values;
(c) the estimator derived by Vetter (2015), another modified version of the PSRV allowing for a central limit theorem. With respect to the rate-optimal estimator in (b), this estimator has the advantage that it preserves positivity, while its disadvantage is a slower rate of convergence;
(d) the Fourier estimator introduced by Sanfelici et al. (2015), which is consistent in the presence of market microstructure noise;
(e) the Fourier estimator derived by Cuchiero and Teichmann (2015), which allows for a central limit theorem in the presence of jumps in the price and volatility processes.
The first three estimators belong to the category of realized vol-of-vol estimators (see Chapter 8.3 in Aït-Sahalia and Jacod (2014)), the last two belong to the category of Fourier-based vol-of-vol estimators (see Chapter 6.2 in Mancino et al. (2017)).
The numerical studies in Aït-Sahalia et al. (2017) and Sanfelici et al. (2015) show that both realized and Fourier-based integrated vol-of-vol estimators may carry a substantial finite-sample bias unless the selection of the tuning parameters involved in their computation is carefully optimized. However, this is a rather unexplored issue that we aim to address.
To this end, we concentrate our attention on the finite-sample performance of the PSRV (see (a)), since it is the most intuitive and easy-to-implement vol-of-vol estimator in the list (a)–(e).
Based on asymptotic properties, one should obviously use the rate-optimal estimator by Vetter (see (b)) to obtain integrated vol-of-vol estimates in the absence of jumps and microstructure noise. However, asymptotically-optimal estimators do not necessarily guarantee the best finite-sample performance, as pointed out in the extensive study by Gatheral and Oomen (2010) on integrated volatility estimators and confirmed for integrated vol-of-vol estimators by the numerical studies in Aït-Sahalia et al. (2017) and Sanfelici et al. (2015). In other words, there is no reason to expect a priori that the simple PSRV would show worse finite-sample performance than its modified version with the optimal rate of convergence. Consequently, our finite-sample study favors analytic tractability over asymptotic optimality, focusing on the simple PSRV.
As mentioned, the PSRV is the realized volatility of the unobservable spot volatility process, computed from discrete estimates of the latter. Thus, when computing PSRV values, one has to select the spot volatility estimation grid. Moreover, since the spot volatility is estimated as an average of the price realized volatility over a local window, the length of the window must also be selected. More specifically, the figure below details the different quantities involved in the computation of the PSRV: $h$, the time horizon for the estimation of the integrated vol-of-vol; $\delta_N = h/N$, the log-price sampling frequency; $\Delta_N = \lambda_N \delta_N$, with $\lambda_N = \min(N, \lceil \lambda \delta_N^{c-1} \rceil)$ and $c \in (0,1)$, the mesh of the spot volatility estimation grid; $W_N = k_N \delta_N$, with $k_N = \lceil \kappa \delta_N^{b} \rceil$ and $b \in (-1,0)$, the length of the local window used to estimate the spot volatility; $\hat{\nu}(s)$, the spot volatility estimate at time $s = \tau + j\Delta_N$, $j = 0, 1, \ldots, \lfloor h/\Delta_N \rfloor$. Note that $\lceil \cdot \rceil$ denotes the ceiling function.
[Figure: time axis showing the spot volatility estimates $\hat{\nu}(\tau)$, $\hat{\nu}(\tau + \Delta_N)$, $\hat{\nu}(\tau + 2\Delta_N)$, ..., $\hat{\nu}(\tau + h)$, together with the price sampling interval $\delta_N$, the grid mesh $\Delta_N = \lambda_N \delta_N$ and the local window $W_N = k_N \delta_N$.]
As a consequence, for given values of the asymptotic rates $b$ and $c$, the finite-sample performance of the PSRV (i.e., the performance of the PSRV for a fixed $N$) depends on the selection of two tuning parameters: $\lambda$, which determines the mesh of the spot volatility estimation grid, and $\kappa$, which determines the length of the local window used to estimate the spot volatility.
1.2. Objectives and approach
Thus, the aim of this paper is to gain insight into the existence of feasible criteria for the bias-optimal selection of $\lambda$ and $\kappa$ in finite samples. To do so, we use an approach inspired by the one used in Aït-Sahalia et al. (2013) to solve the "leverage effect puzzle." The basic steps in our approach are the following:
I) derive a rule for the bias-optimal selection of $\lambda$ and $\kappa$ in a parametric setting of practical interest, where the explicit formula of the finite-sample bias can be obtained;
II) extend this rule to a more general setting, where explicit formulas are not available or their derivation is too costly, by using dimensional analysis.
Specifically, in the first step, we proceed as follows. Assuming that the log-price is a diffusion and the spot volatility follows the CIR model (as in Bollerslev and Zhou (2002)), we derive the exact parametric expression of the PSRV finite-sample bias, both in the case $W_N > \Delta_N$ (i.e., overlapping of two consecutive local windows to estimate the spot volatility) and in the opposite case $W_N \le \Delta_N$. The case $W_N \le \Delta_N$ is relevant when studying the asymptotic properties of the PSRV, including asymptotic unbiasedness (see Theorem 1). In fact, the condition $W_N/\Delta_N \to 0$ as $N \to \infty$ must hold for the PSRV to be consistent (see Proposition 2). Instead, the case $W_N > \Delta_N$ is relevant for finite-sample settings. Indeed, the numerical study of Sanfelici et al.
(2015) and our preliminary numerical exercise (see Section 2) both show that allowing consecutive local windows to overlap is crucial in order to obtain finite-sample unbiased integrated vol-of-vol estimates through the PSRV over a daily horizon.
Once we have obtained the exact parametric expression of the PSRV finite-sample bias under our parametric assumption for the data-generating process, we derive the constraints on the rates $b$ and $c$ that guarantee the asymptotic unbiasedness of the PSRV, namely $b \in (-1, 0)$ and $c < \min(-b, 1+b)$. Then, for values of $b$ and $c$ within those constraints, conditioning on the natural filtration of the spot volatility process $\nu(t)$ up to the initial time $t = \tau$, we expand the parametric expression of the PSRV finite-sample bias when $W_N > \Delta_N$ for small values of the tuning parameter $\lambda$ and the time horizon $h$. Thereby, we show that the dominant term of the expansion is independent of the tuning parameter $\lambda$ and is annihilated by selecting the asymptotic rate of $k_N$ as $b = -1/2$, the asymptotic rate of $\Delta_N$ as $c < 1/2$, and the local-window tuning parameter as
$$\kappa = \frac{2\sqrt{\hat{\nu}(\tau)}}{\hat{\gamma}},$$
where $\hat{\nu}(\tau)$ is the Fourier estimate of the spot volatility $\nu(t)$ at the initial time $t = \tau$ and $\hat{\gamma}$ is the estimate of the spot volatility diffusion parameter $\gamma$, computed through the indirect inference method detailed in Appendix B. Consequently, the bias-optimal expression of $W_N$ reads
$$W_N = \frac{2\sqrt{\hat{\nu}(\tau)}}{\hat{\gamma}}\, \delta_N^{1/2}.$$
This analytic result supports the numerical findings of Sanfelici et al. (2015).
In the second step, we use dimensional analysis (see, e.g., Kyle and Obizhaeva (2017)) to generalize the rule for the selection of $W_N$ derived analytically in the first step for the CIR model, so that it holds for the general mean-reverting model of Chan et al. (1992), i.e., the CKLS model, where the diffusion part of the spot volatility $\nu(t)$ is represented by the process $\gamma \nu(t)^{\beta}$, $\beta \ge 1/2$. Note that we focus on the CKLS model because, depending on the value of the parameter $\beta$, it includes several models commonly used in the literature to capture the volatility dynamics. For example, it includes the CIR model by Cox et al. (1985) (corresponding to $\beta = 1/2$), the model corresponding to $\beta = 1$, and the CIR-VR model by Cox et al. (1980) (corresponding to $\beta = 3/2$). In this more general setting, the rule prescribes $b = -1/2$, $c < 1/2$ and
$$\kappa = \frac{2\,\hat{\nu}(\tau)^{1-\beta}}{\hat{\gamma}},$$
so that the bias-optimal expression of $W_N$ reads
$$W_N = \frac{2\,\hat{\nu}(\tau)^{1-\beta}}{\hat{\gamma}}\, \delta_N^{1/2},$$
where, again, $\hat{\nu}(\tau)$ is the Fourier estimate of $\nu(\tau)$ (see Malliavin and Mancino (2009)) and $\hat{\gamma}$ is the estimate of $\gamma$ computed through the indirect inference method detailed in Appendix B. The numerical results illustrated in Section 5 provide evidence that this rule is effective in terms of bias reduction.
Finally, we also address the problem of the bias-optimal selection of the PSRV tuning parameters for the more realistic case in which the price is contaminated at high frequencies by microstructure noise. Assuming the presence of an i.i.d. noise component (see Hasbrouck (2007)), we derive the exact analytic expression of the PSRV finite-sample bias and obtain the rate at which it explodes asymptotically. We also show numerically, for typical values of the market noise-to-signal ratio, that if the price is sampled on a suitably sparse grid, the extra bias term induced by the presence of noise is negligible and we can still select the local-window tuning parameter $\kappa$ according to the bias-optimal rule derived in the absence of microstructure noise.
Additionally, as a byproduct of the PSRV bias analysis, we quantify the finite-sample bias reduction following the assumption that the initial value of the volatility process is equal to the long-term volatility parameter, in the case of both the PSRV and the locally averaged realized volatility (see Aït-Sahalia and Jacod (2014)).
This is a very common assumption in the literature, typically made in simulation studies where a mean-reverting process drives the spot volatility (see, e.g., among many others, Aït-Sahalia et al. (2013), Sanfelici et al. (2015), Vetter (2015)).
This paper is organized as follows. In Section 2, we define the locally averaged realized volatility and the PSRV while recalling their asymptotic properties. Moreover, we show the results of a preliminary numerical exercise that motivates the analytic study in Section 3. In Section 3, we derive and study the exact parametric expression for the PSRV finite-sample bias, both when $W_N \le \Delta_N$ and when $W_N > \Delta_N$, under the assumption that the spot volatility follows the CIR model. Moreover, we isolate the dominant term of the PSRV finite-sample bias when $W_N > \Delta_N$ and derive the rule to select the local-window tuning parameter $\kappa$ that annihilates this dominant term. In Section 4, based on dimensional analysis, we generalize the rule to select $\kappa$ to be effective when the spot volatility follows a generic continuous mean-reverting spot volatility model, i.e., the CKLS model. In Section 5, we perform an extensive numerical study in which we test the performance of the feasible version of the rule to select $\kappa$. In Section 6, we show the results of an empirical study in which we compute PSRV values from high-frequency S&P 500 prices, selecting $\kappa$ based on the feasible version of the bias-optimal rule. Section 7 summarizes our conclusions. Finally, Appendix A contains the proofs and Appendix B illustrates the indirect inference method that we use to roughly estimate the CKLS parameters relevant to our study.
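The window-selection rule summarized in the Introduction can be sketched in a few lines. This is a minimal illustration, not the authors' code: it reads the CKLS rule as $\kappa = 2\hat{\nu}(\tau)^{1-\beta}/\hat{\gamma}$ with $b = -1/2$ (which reduces to the CIR rule for $\beta = 1/2$), it ignores the ceiling in $k_N$, and the function name and input values are hypothetical. The estimates $\hat{\nu}(\tau)$ and $\hat{\gamma}$ are taken as given.

```python
import math


def bias_optimal_window(nu_hat, gamma_hat, delta, beta=0.5):
    """Return (kappa, W_N) for the rule kappa = 2 * nu_hat**(1 - beta) / gamma_hat.

    Assumes b = -1/2, so that W_N = k_N * delta ~ kappa * delta**0.5
    (the ceiling in k_N is ignored in this sketch).
    """
    if nu_hat <= 0 or gamma_hat <= 0 or delta <= 0:
        raise ValueError("nu_hat, gamma_hat and delta must be positive")
    kappa = 2.0 * nu_hat ** (1.0 - beta) / gamma_hat
    window = kappa * math.sqrt(delta)
    return kappa, window


# Hypothetical inputs: time in years, 1-minute sampling, 252 days of 6 hours.
delta = 1.0 / (252 * 360)
kappa_cir, window_cir = bias_optimal_window(nu_hat=0.04, gamma_hat=0.5, delta=delta)
kappa_b1, _ = bias_optimal_window(nu_hat=0.04, gamma_hat=0.5, delta=delta, beta=1.0)
```

For $\beta = 1$ the spot volatility estimate drops out of the rule and $\kappa$ depends on $\hat{\gamma}$ only.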
2. Motivation
As mentioned in the Introduction, the PSRV is the sum over a given time horizon of the squared increments of the estimated unobservable spot volatility process. These estimates are obtained as local averages of the price realized volatility. Formally, the locally averaged realized volatility and the PSRV are defined as follows.
Definition 1.
Locally averaged realized volatility
Suppose that the log-price process $p$ is observable on an equally-spaced grid of mesh size $\delta_N$, with $\delta_N \to 0$ as $N \to \infty$. The locally averaged realized volatility at time $t$ is defined as:
$$\hat{\nu}_N(t) := \frac{1}{k_N \delta_N} \sum_{j=1}^{k_N} \Big[ p\big(\lfloor t/\delta_N \rfloor \delta_N - k_N \delta_N + j\delta_N\big) - p\big(\lfloor t/\delta_N \rfloor \delta_N - k_N \delta_N + (j-1)\delta_N\big) \Big]^2,$$
where $k_N = O(\delta_N^{b})$, $b \in (-1, 0)$, is a sequence of positive integers such that $k_N \to \infty$ and $W_N := k_N \delta_N \to 0$ as $N \to \infty$, while $\lfloor \cdot \rfloor$ denotes the floor function.
Proposition 1.
Let the log-price process $p$ be a Brownian semimartingale and let the process $\nu$ denote its instantaneous volatility. Then $\hat{\nu}_N(s)$ is a consistent local estimator of $\nu(s)$ as $N \to \infty$.
Proof. See Aït-Sahalia and Jacod (2014).
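Definition 1 translates directly into code once the log-prices are stored on the sampling grid. The sketch below is an illustration under stated assumptions: `log_prices[i]` holds $p(i\delta_N)$, `t_index` plays the role of $\lfloor t/\delta_N \rfloor$, and the function name is hypothetical.

```python
import numpy as np


def local_realized_vol(log_prices, t_index, k, delta):
    """Locally averaged realized volatility at grid index t_index.

    Sums the k most recent squared log-price increments, i.e. those over
    [t_index - k, t_index], and divides by the window length k * delta.
    """
    if k < 1 or t_index - k < 0 or t_index >= len(log_prices):
        raise ValueError("the local window must lie inside the price grid")
    window = np.asarray(log_prices[t_index - k : t_index + 1], dtype=float)
    increments = np.diff(window)  # k increments from k + 1 prices
    return float(np.sum(increments ** 2) / (k * delta))


# Deterministic check: increments of constant size s give nu_hat = s**2 / delta.
s, delta = 0.01, 0.25
steps = s * np.array([1.0, -1.0] * 5)
prices = np.concatenate(([0.0], np.cumsum(steps)))
nu_hat = local_realized_vol(prices, t_index=10, k=4, delta=delta)
```

The deterministic path is only a sanity check of the indexing; it says nothing about the statistical behavior of the estimator.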
Definition 2.
Pre-estimated spot-variance based realized variance
Suppose that the log-price process $p$ is observable on an equally-spaced grid of mesh size $\delta_N$, with $\delta_N \to 0$ as $N \to \infty$. The pre-estimated spot-variance based realized variance (PSRV) on the interval $[\tau, \tau + h]$ is defined as
$$PSRV_{[\tau,\tau+h],N} := \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \hat{\nu}_N(\tau + i\Delta_N) - \hat{\nu}_N(\tau + (i-1)\Delta_N) \Big]^2,$$
where:
- $\hat{\nu}_N(\cdot)$ is the locally averaged realized volatility in Definition 1, with $k_N = O(\delta_N^{b})$, $b \in (-1, 0)$;
- $\Delta_N = O(\delta_N^{c})$, $c \in (0, 1)$, is the locally averaged realized volatility sampling frequency.
Proposition 2.
Let the log-price process $p$ and the spot volatility process $\nu$ be Brownian semimartingales. Then the PSRV is a consistent estimator of the quadratic variation of the volatility process $\langle \nu, \nu \rangle_{[\tau,\tau+h]}$ if $b \in (-1/2, 0)$ and $c \in (0, -b/2)$.
Proof. See Proposition 8 in Barndorff-Nielsen and Veraart (2009).
Note that the requirements for rates $b$ and $c$ that guarantee consistency imply that $W_N/\Delta_N \to 0$ as $N \to \infty$. Indeed, as one can easily verify, $-1/2 < b < 0$ and $0 < c < -b/2$ imply $c < 1 + b$, which, in turn, implies $W_N/\Delta_N \to 0$ as $N \to \infty$.
Therefore, for a given log-price sampling frequency $\delta_N := h/N$, the computation of the PSRV requires the selection of the spot volatility estimation frequency, $\Delta_N$, and the length of the local window, $W_N$ (i.e., $k_N$), as functions of $\delta_N$. In this regard, following Sanfelici et al. (2015), we make the following assumption:
Assumption 1.
Functional forms of $\Delta_N$ and $k_N$
We assume that $\Delta_N$ and $k_N$ as in, respectively, Definitions 2 and 1 take the following functional forms:
$$\Delta_N = \min\big(N, \lceil \lambda \delta_N^{c-1} \rceil\big)\, \delta_N, \quad \text{with } \lambda > 0 \text{ and } c \in (0, 1),$$
$$k_N = \lceil \kappa \delta_N^{b} \rceil, \quad \text{with } \kappa > 0 \text{ and } b \in (-1, 0).$$
Based on Assumption 1, the computation of the PSRV requires the finite-sample optimal selection of rates $b$ and $c$ and tuning parameters $\kappa$ and $\lambda$, which is not straightforward a priori. In this section, we gain some preliminary insight into this issue by performing a numerical study. As detailed below, the findings of this study are rather puzzling and call for a thorough analytic investigation to support them (see Section 3).
In our preliminary numerical study we simulate log-price observations under the following data-generating process.
Assumption 2.
Data-generating process
For $t \in [0, T]$, $T > 0$, the dynamics of the log-price process $p(t)$ and the spot volatility process $\nu(t)$ read:
$$p(t) = p(0) + \int_0^t \sqrt{\nu(s)}\, dW(s),$$
$$\nu(t) = \nu(0) + \theta \int_0^t \big(\alpha - \nu(s)\big)\, ds + \gamma \int_0^t \sqrt{\nu(s)}\, dZ(s),$$
where $W$ and $Z$ are two correlated Brownian motions on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ and the strictly positive parameters $\theta$, $\alpha$, $\gamma$ denote, respectively, the speed of mean reversion, the long term mean and the vol-of-vol parameter. We also assume that $2\alpha\theta > \gamma^2$ and $\nu(0) > 0$, to ensure that $\nu(t)$ is a.s. positive $\forall t \in [0, T]$.
In particular, we simulate one thousand 1-year trajectories of 1-second observations, with a year composed of 252 trading days of 6 hours each. We consider three scenarios determined by the following sets of model parameters:
Set 1 : ( α, θ, γ, ρ, ν (0)) = (0 . , , . , − . , . Set 2 : ( α, θ, γ, ρ, ν (0)) =(0 . , , . , − . , . Set 3 : ( α, θ, γ, ρ, ν (0)) = (0 . , , . , − . , . Set 1 , is suggested in Sanfelici et al. (2015) and Vetter (2015) andrepresents our baseline scenario. The second,
Set 2, represents the opposite scenario. In fact, the volatility generated by Set 2 is lower than the volatility generated by Set 1, since the long term mean, $\alpha$, and the speed of mean reversion, $\theta$, are, respectively, much lower and much higher than in Set 1. The second scenario is also characterized by a lower volatility of the volatility, which is captured by the parameter $\gamma$, and a more pronounced leverage effect, which is captured by the correlation parameter $\rho$. The third set of parameters, Set 3, differs from the first only in that the initial value of the volatility, $\nu(0)$, is twice the long term volatility, $\alpha$. In this regard, note that if the initial volatility $\nu(0)$ is equal to $\alpha$, the spot volatility has a constant unconditional mean over time under Assumption 2 (see Appendix A in Bollerslev and Zhou (2002)). This is a simplifying assumption typically adopted in numerical studies where a mean-reverting volatility process is used (see, e.g., among many others, Aït-Sahalia et al. (2013), Sanfelici et al. (2015), Vetter (2015)).
We estimate daily values of the PSRV in these three scenarios from simulated prices sampled with a 1-minute frequency using different values of $\kappa$ and $\lambda$ and choosing $b = -1/2$ and $c = 1/4$. Note that this choice of $b$ and $c$ satisfies the constraints for asymptotic unbiasedness (see Theorem 1 in Section 3). Moreover, note that the selection $b = -1/2$ [...] $\kappa$ and $\lambda$ in the three scenarios, we observe that the same selection of $\kappa$ and $\lambda$ could lead to very different values of the bias. Figure 1 summarizes the results of the numerical exercise.
[Figure 1, panels a) to c): relative bias (vertical axis, from -1 to 1) against $\lambda$ (horizontal axis, from 0.0002 to 0.0057), with one curve for each of $\kappa = 1.5, 2, 2.5, 3$.]
Figure 1: Daily PSRV finite-sample relative bias as a function of $\lambda$ for values of $\kappa \in (1.5, 2, 2.5, 3)$ and $\delta_N = 1$ minute, $b = -1/2$, $c = 1/4$. The values of $\lambda$ on the x-axis correspond to $\Delta_N$ equal to $j\delta_N$, for $j = 1, 2, 3, 5, 10, 15, 30$. The panels refer to the following parameter sets: a) Set 1, b) Set 2, c) Set 3.
Figure 1 suggests that the bias-optimal selection of the tuning parameter $\kappa$ in finite samples is strongly dependent on the parameters of the data-generating process. In fact, the same value of $\kappa$ leads to very different values of the bias in the three different scenarios. For example, the selection $\kappa = 2$ leads to a relative bias of approximately -20% in scenario 1, -50% in scenario 2 and [...] in scenario 3. Figure 1 also suggests that the bias is only weakly sensitive to the selection of $\lambda$, and, furthermore, that the bias-optimal value of $\kappa$ is between 1.5 and 2 in the baseline scenario, slightly smaller than 1.5 in the second scenario, and around 2 in the third scenario. As one can easily verify using the functional form of $k_N$ in Assumption 1, values of $\kappa$ of this order of magnitude imply that consecutive local windows for estimating the spot volatility overlap, for all values of $\lambda$ considered. Consequently, Figure 1 tells us that allowing for overlap of the local windows is crucial in order to optimize the finite-sample bias of the PSRV, even when $\Delta_N \gg \delta_N$. This is confirmed by the results of the numerical study by Sanfelici et al. (2015), where, based on the same parameter set used in our baseline scenario, the optimal value of $\kappa$ is found to be approximately equal to 2.
In sum, our preliminary numerical study shows that: only the selection of tuning parameter $\kappa$ is crucial for optimizing the PSRV finite-sample bias; to avoid obtaining highly biased vol-of-vol estimates, it is critical to uncover the dependence between the bias-optimal value of $\kappa$ and the parameters of the data-generating process; for typical values of the CIR parameters, the bias-optimal value of $\kappa$ is such that consecutive local windows for estimating the spot volatility overlap, even when $\Delta_N \gg \delta_N$. Gaining a more in-depth understanding of these numerical findings is what motivates our analytic study in the next section.
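The PSRV of Definition 2, with the tuning integers of Assumption 1, can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code used for the numerical study: time is measured in years, `log_prices[i]` holds $p(i\delta_N)$, `start` is the grid index of $\tau$, and all function and variable names are hypothetical. A cumulative sum of squared increments is used so that overlapping windows ($k_N > \lambda_N$) cost nothing extra.

```python
import math

import numpy as np


def tuning_integers(delta, n_obs, kappa, lam, b=-0.5, c=0.25):
    """Assumption 1: k_N = ceil(kappa * delta**b) and
    lambda_N = min(N, ceil(lam * delta**(c - 1))), so Delta_N = lambda_N * delta."""
    k = math.ceil(kappa * delta ** b)
    step = min(n_obs, math.ceil(lam * delta ** (c - 1.0)))
    return k, step


def psrv(log_prices, start, n_obs, k, step, delta):
    """PSRV on [tau, tau + h] with tau = start * delta and h = n_obs * delta."""
    p = np.asarray(log_prices, dtype=float)
    m = n_obs // step  # floor(h / Delta_N)
    if start < k or start + m * step >= len(p):
        raise ValueError("price grid too short for the requested windows")
    sq = np.diff(p) ** 2
    csum = np.concatenate(([0.0], np.cumsum(sq)))  # csum[i] = sum of sq[:i]
    idx = start + step * np.arange(m + 1)          # spot volatility grid
    nu_hat = (csum[idx] - csum[idx - k]) / (k * delta)
    return float(np.sum(np.diff(nu_hat) ** 2))


# 1-minute sampling over a 6-hour day (N = 360), time in years.
delta = 1.0 / (252 * 360)
k, step = tuning_integers(delta, n_obs=360, kappa=2.0, lam=0.00055)
windows_overlap = k > step  # W_N > Delta_N

# Small hand-checkable example: squared increments 1, 1, 1, 4, 4, 4.
toy = psrv([0, 1, 2, 3, 5, 7, 9], start=2, n_obs=4, k=2, step=2, delta=1.0)
```

With these hypothetical inputs the local window spans several hundred observations while the spot volatility grid advances by three, which is the overlapping regime discussed above.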
3. Analytic results in the CIR framework
Here we carry out the first step towards the bias-optimal selection of the PSRV tuning parameters. That is, we derive the rule for this selection in a parametric setting of practical interest, namely the CIR framework, where the explicit formula of the finite-sample bias can be obtained with simple but tedious calculations, both when $W_N \le \Delta_N$ and when $W_N > \Delta_N$.
The bias expression for $W_N \le \Delta_N$ is the starting point to derive the asymptotic constraints on rates $b$ and $c$ that ensure the asymptotic unbiasedness of the PSRV. In this regard, we obtain the following result.
Theorem 1.
Let Assumption 2 hold. Then, if b ≥ − / and c < − b or b < − / and c < b , W N ∆ N → as N → ∞ and the PSRV as given in Definition 2 is asymptotically unbiased, i.e., E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i → as N → ∞ , where E h h ν, ν i [ τ,τ + h ] i = γ αh + γ (cid:16) E [ ν ( τ )] − α (cid:17) − e − θh θ ,E h P SRV [ τ,τ + h ] ,N i = γ αhA N + γ (cid:16) E [ ν ( τ )] − α (cid:17) − e − θh θ B N + C N ,E [ ν ( τ )] = α + ( ν (0) − α ) e − θτ , and A N , B N and C N are given in Eqs.(4,5,6).Furthermore, bearing in mind that W N = k N δ N , as N → ∞ , k N δ N / ∆ N → + , k N ∆ N → + ∞ wehave: E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = a ∆ N + a k N ∆ N + a k N δ N ∆ N + o (cid:16) ∆ N (cid:17) + o (cid:16) k N ∆ N (cid:17) + o (cid:16) k N δ N ∆ N (cid:17) , (1)11 here: a = − θ γ αh + θ γ ( E [ ν ( τ )] − α ) 1 − e − θh θ + θ − e − θh ) h ( E [ ν ( τ )] − α ) + γ θ (cid:16) α − E [ ν ( τ )] (cid:17)i ,a = 2 θ γ αh + 4 θ γ ( E [ ν ( τ )] − α ) 1 − e − θh θ + 2 θ (1 − e − θh ) h ( E [ ν ( τ )] − α ) + γ θ (cid:16) α − E [ ν ( τ )] (cid:17)i +4 α h + 8 α ( E [ ν ( τ )] − α )(1 − e − θh ) θ ,a = − γ ( E [ ν ( τ )] − α ) 1 − e − θh θ . Proof.
See Appendix A.
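The target quantity in Theorem 1 has a simple closed form under Assumption 2, since $E\langle \nu, \nu \rangle_{[\tau,\tau+h]} = \gamma^2 \int_\tau^{\tau+h} E[\nu(s)]\, ds$ and $E[\nu(s)] = \alpha + (\nu(0) - \alpha) e^{-\theta s}$. The sketch below evaluates $\gamma^2 \alpha h + \gamma^2 (E[\nu(\tau)] - \alpha)(1 - e^{-\theta h})/\theta$ and compares it with a midpoint-rule integration. The parameter values are hypothetical and are not the paper's parameter sets; the function names are illustrative.

```python
import math


def cir_mean(nu0, alpha, theta, t):
    """Unconditional mean of the CIR process at time t."""
    return alpha + (nu0 - alpha) * math.exp(-theta * t)


def expected_qv(nu0, alpha, theta, gamma, tau, h):
    """Closed form of E<nu, nu> on [tau, tau + h] under the CIR model."""
    excess = cir_mean(nu0, alpha, theta, tau) - alpha
    return gamma ** 2 * alpha * h + gamma ** 2 * excess * (1.0 - math.exp(-theta * h)) / theta


def expected_qv_numeric(nu0, alpha, theta, gamma, tau, h, n=20000):
    """Midpoint-rule approximation of gamma^2 * integral of E[nu(s)] over [tau, tau + h]."""
    ds = h / n
    total = sum(cir_mean(nu0, alpha, theta, tau + (i + 0.5) * ds) for i in range(n))
    return gamma ** 2 * total * ds


# Hypothetical parameters satisfying 2 * alpha * theta > gamma**2.
args = dict(nu0=0.2, alpha=0.1, theta=3.0, gamma=0.4, tau=5 / 252, h=1 / 252)
closed = expected_qv(**args)
numeric = expected_qv_numeric(**args)
```

When $\nu(0) = \alpha$ the second term vanishes and the expected quadratic variation reduces to $\gamma^2 \alpha h$.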
Corollary 1.
Let Assumption 1 hold. The leading term of the PSRV finite-sample bias expansion inEq. (1) can be canceled in the case b = − / and c = 1 / , provided that there exists a solution (˜ κ, ˜ λ ) to the following system: a κ + a λ κ + a = 0 κ > W N ≤ ∆ N . If a solution (˜ κ, ˜ λ ) exists, the corresponding bias-optimal selection of W N and ∆ N reads W N = ˜ κδ / N , ∆ N = ˜ λδ / N . Proof.
See Appendix A.
Remark 1.
Under Assumption 1, for b = − / and c = 1 / , the “no-overlapping” condition W N ≤ ∆ N is equivalent to δ N ≤ λ h − κ − . Assuming that a positive solution ˜ κ ( λ ) to a κ + a λ κ + a = 0 exists for some λ > , we define the “no-overlapping” threshold for δ N as δ ∗ ( λ ) := λ h − ˜ κ ( λ ) − .For the three sets of CIR parameters used in the numerical study in Section 2, Figure 2 shows thefrequency corresponding to the threshold δ ∗ ( λ ) as a function of λ with λ ∈ (0 , λ ∗ ] . Here λ ∗ is thelargest admissible value of λ such that ∆ N ≤ h (i.e., λ ∗ δ ∗ ( λ ∗ ) / = h ). Specifically, Figure 2 showsthat the sampling frequency corresponding to δ ∗ ( λ ) is bounded by, respectively, 11 seconds (see Panela), 18 seconds (see Panel b) and 45 seconds seconds (see Panel c). This suggests that for typical valuesof the CIR parameters, the system in Corollary 1 can be solved only for very high frequencies, that is,for frequencies at which prices are affected by microstructure noise. However, this makes the solutionuseless for practical applications, as the presence of noise at high frequencies adds an explosive termto the PSRV bias, which is not taken into account by Theorem 1 (the effects of microstructure noiseon the PSRV bias are examined in the remaining part of this section, see Theorem 4). s e c ond s a) * ( ) ** ( * ) s e c ond s b) * ( ) ** ( * ) s e c ond s c) * ( ) ** ( * ) Figure 2: The threshold δ ∗ ( λ ) is plotted in blue. The dotted gray vertical line corresponds to λ = λ ∗ . The dotted redhorizontal line corresponds to δ ∗ ( λ ∗ ). The panels refer to the following sets of parameters: a) Set 1 : ( α, θ, γ, ν (0)) =(0 . , , . , . Set 2 : ( α, θ, γ, ν (0)) = (0 . , , . , . Set 3 : ( α, θ, γ, ν (0)) = (0 . , , . , . δ ∗ ( λ ) isindependent of the correlation parameter ρ , which is therefore omitted. For panel c) we consider τ = 5 days, while inpanels a) and b) δ ∗ ( λ ) is independent of τ . 
We have assumed $h = 1/252$.
In empirical applications one can only observe the noisy price $\tilde{p}(t)$, that is, the efficient price contaminated by a noise component that originates from market microstructure frictions, such as the presence of a bid-ask spread. Here, we assume that the noise component is an i.i.d. process independent of the efficient price process, as in the seminal paper by Roll (1984). For a general discussion of the statistical models of microstructure noise, see Jacod et al. (2017).
Assumption 3.
Data-generating process in the presence of market microstructure noise
The observable price process ˜ p is given by ˜ p ( t ) = p ( t ) + η ( t ) , where p ( t ) represents the efficient price process and evolves according to Assumption 2 while η ( t ) is asequence of i.i.d. random variables independent of p ( t ) , such that E [ η ( t )] = 0 , E [ η ( t ) ] = V η < ∞ and E [ η ( t ) ] = Q η < ∞ ∀ t . Under Assumption 2, the PSRV is asymptotically biased, as the presence of microstructure noiseintroduces an extra term in the bias expression, D N , that diverges as N → ∞ , as illustrated by thenext theorem. 13 heorem 2. Let Assumption 3 hold. Moreover, let ^ P SRV [ τ,τ + h ] ,N denote the PSRV in Definition 2,computed from noisy price observations. Then, if either b ≥ − and c < − b or b < − and c < b + 1 ,as N → ∞ , E h ^ P SRV [ τ,τ + h ] ,N − h ν, ν i i [ τ,τ + h ] ,N → ∞ , where E h ^ P SRV [ τ,τ + h ] ,N i reads E h ^ P SRV [ τ,τ + h ] ,N i = γ αhA N + γ (cid:16) E [ ν ( τ )] − α (cid:17) − e − θh θ B N + C N + D N , with A N , B N and C N as in Theorem 1, and D N given by D N = [4( Q η + V η ) + 16 αV η δ N ] h k N δ N ∆ N + 8 V η ( α − E [ ν ( τ )])(1 − e − θh ) (1 + e − θ ∆ N )(1 − e − θk N δ N )(1 − e − θ ∆ N ) k N δ N , where k N δ N ∆ N D N = 4( Q η + V η ) h + O ( δ N ) and k N δ N ∆ N → , δ N → . Proof.
See Appendix A.
Theorem 2 therefore provides the rate at which the PSRV diverges under Assumption 3, and also the exact expression of the extra bias component, $D_N$. From the proof of Theorem 2 in Appendix A, one can easily see that the expression of $D_N$ is the same for any continuous mean-reverting volatility model, as its computation only depends on the drift of $\nu$ in Assumption 2. Moreover, Theorems 1 and 2 quantify the bias reduction ensuing from the simplifying assumption $\nu(0) = \alpha$. Indeed, this assumption cuts off the entire source of bias $B_N$ and part of the source of bias $D_N$. Interestingly, this simplifying assumption still reduces the bias in the overlapping case, i.e., when $W_N > \Delta_N$, as shown in the proof of Theorem 4 detailed in Appendix A. The finite-sample bias reduction ensuing from this simplifying assumption is not peculiar to the PSRV, though. In fact, this simplifying assumption is also beneficial for reducing the finite-sample bias of the locally averaged realized volatility, as shown in the next theorem.
Theorem 3.
Let Assumption 2 hold. Moreover, let $\hat{\nu}(\tau)$ denote the locally averaged realized volatility in Definition 1 at time $\tau$. Then, if $b \in (-1, 0)$, $\hat{\nu}(\tau)$ is asymptotically unbiased, i.e.,
$$E[\hat{\nu}(\tau) - \nu(\tau)] = (\nu(0) - \alpha)\, e^{-\theta\tau}\, \frac{e^{\theta k_N \delta_N} - 1 - \theta k_N \delta_N}{\theta k_N \delta_N},$$
and, as $N \to \infty$, we have
$$E[\hat{\nu}(\tau) - \nu(\tau)] = \frac{\theta}{2} (\nu(0) - \alpha)\, e^{-\theta\tau}\, k_N \delta_N + o(k_N \delta_N), \quad k_N \delta_N \to 0.$$
Let Assumption 3 hold. Moreover, let $w(\tau)$ denote the locally averaged realized volatility in Definition 1 at time $\tau$ computed from noisy price observations. Then, $\forall b \in (-1, 0)$, $w(\tau)$ is asymptotically biased, i.e.,
$$E[w(\tau) - \nu(\tau)] = (\nu(0) - \alpha)\, e^{-\theta\tau}\, \frac{e^{\theta k_N \delta_N} - 1 - \theta k_N \delta_N}{\theta k_N \delta_N} + \frac{2 V_\eta}{\delta_N},$$
and, as $N \to \infty$, we have
$$E[w(\tau) - \nu(\tau)] = \frac{\theta}{2} (\nu(0) - \alpha)\, e^{-\theta\tau}\, k_N \delta_N + \frac{2 V_\eta}{\delta_N} + o(k_N \delta_N), \quad k_N \delta_N \to 0.$$
Proof.
See Appendix A.
This theorem has two interesting implications. First, under Assumption 2, the locally averaged realized volatility is unbiased in finite samples if and only if $\nu(0) = \alpha$. Second, under Assumption 3, if $\alpha > \nu(0)$, the presence of noise could actually compensate for the negative bias originating from the first term of the bias expression. This also holds for the PSRV finite-sample bias, provided that the term $D_N$ is of opposite sign with respect to the sum of the other terms in the bias expression.
After deriving the asymptotic constraints on rates $b$ and $c$ that ensure the asymptotic unbiasedness of the PSRV, we focus on the bias-optimal selection of tuning parameters $\kappa$ and $\lambda$ in finite samples. In order to analytically derive a rule for the bias-optimal selection of the PSRV tuning parameters in finite samples, we proceed as follows. First, for fixed $N$, we derive the exact parametric bias expression under the assumption that $W_N > \Delta_N$. This explicit expression includes some extra terms with respect to the corresponding expression in the absence of overlapping, i.e., for $W_N \le \Delta_N$. We then compute a suitable expansion of the bias and we determine the value of the tuning parameters that makes the dominant term of the expansion equal to zero.
We need a "suitable expansion" since the natural expansion as $N \to \infty$ cannot be used to isolate the dominant term of the bias when $W_N > \Delta_N$, as the consistency and asymptotic unbiasedness of the PSRV require that $W_N/\Delta_N \to 0$ as $N \to \infty$. Thus, we determine the leading term of the bias for $W_N > \Delta_N$ through an alternative asymptotic expansion, which exploits some natural, non-restrictive constraints on the magnitude of the tuning parameter $\lambda$ and the time horizon $h$.
Specifically, we first regard the bias as a function of $\lambda$ and we perform its Taylor expansion with base point $\lambda = 0$. Then, regarding each term of this expansion as a function of $h$, we perform their Taylor expansions with base point $h = 0$.
The choice of the base point λ = 0 is supported by the factthat under Assumption 1, the largest feasible values of λ are very small, i.e., on the order of 10 − when c < / δ N is equal to one minute (see, e.g., Figure 1 for the case c = 1 / λ satisfies ∆ N := λδ cN < h . The choice of base point h = 0 is insteadsupported by the fact that in the literature on high-frequency econometrics, the typical time horizonused to estimate the integrated quantities is one trading day, i.e., h = 1 / ≈ · − . The order ofthis sequential expansion is rather natural: intuitively, we first take the limit λ → h → τ .This approach leads to the following result. Theorem 4.
Let Assumptions 1 and 2 hold and let W N > ∆ N . Then, as λ → , h → E h P SRV [ τ,τ + h ] ,N −h ν, ν i [ τ,τ + h ] i = E [ ν ( τ )] k δ bN − γ E [ ν ( τ )] ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ E [ ν ( τ )] h + O ( h − b ) + O ( λ ) if b < − / , c < b .In particular, if ν (0) = α , then E [ ν ( t )] = α ∀ t , so the previous equation reads: E h P SRV [ τ,τ + h ] ,N −h ν, ν i [ τ,τ + h ] i = α k δ bN − γ α ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ αh + O ( h − b ) + O ( λ ) if b < − / , c < b .Moreover, let ( F νt ) t ≥ be the natural filtration associated with the process ν . Then, as λ → , h → E h P SRV [ τ,τ + h ] ,N −h ν, ν i [ τ,τ + h ] |F ντ i = ν ( τ ) k δ bN − γ ν ( τ ) ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ ν ( τ ) h + O ( h − b ) + O ( λ ) if b < − / , c < b .Proof. The proof of Theorem 4 is given in Appendix A. It is worth noting here that the parametricexpression of the PSRV finite-sample bias under the assumption W N > ∆ N differs from the corre-sponding expression under the assumption W N ≤ ∆ N since the parametric expression of the term E [ RV ( τ + i ∆ N , k N δ N ) RV ( τ + i ∆ N − ∆ N , k N δ N )] differs in the two cases.Figure 3 compares the true finite-sample bias of the daily PSRV computed analytically (see theproof of Theorem 4 in Appendix A for its full explicit expression) with the dominant term of theexpansion in Theorem 4 as functions of the tuning parameter κ . The three panels refer to the threescenarios described by the CIR model parameters in Set 1 (panel (a)),
Set 2 (panel (b)), and
Set 3 b = − / c = 1 / λ = 0 . h = 1 / N = 360. Thecorresponding δ N and ∆ N are equal to 1 minute and 3 minutes, as we consider 6-hr trading days. Theapproximation of the true bias with the dominant term of the expansion is very accurate. -4 a) true biasdominant term -5 b) true biasdominant term -3 c) true biasdominant term Figure 3: Comparison between the true finite-sample bias of the daily PSRV and the dominant term of the expansionin Theorem 4 as functions of κ for b = − / c = 1 / λ = 0 . N = 360. Panel a) refers to the parameterset ( α, θ, γ, ν (0)) = (0 . , , . , . α, θ, γ, ν (0)) = (0 . , , . , . α, θ, γ, ν (0)) =(0 . , , . , . τ = 5 days, while the bias terms in panels a) and b) are independent of τ . Based on the results in Theorem 4, we make the following considerations on the bias-optimalselection of the PSRV tuning parameters in finite samples. Consider, without loss of generality, thecase when ν (0) = α . First, we note that the dominant term of the bias can be annihilated simplyby suitably selecting κ for any feasible value of λ when b ≥ − / c < − b . Instead, when b < − / c < b , the dominant term of the bias is independent of κ and λ . Specifically, when b ≥ − / c < − b , the suitable selection is κ = 2 √ α ˆ γδ b +1 / N . However, since κ is a tuning parameter, it is not allowed to depend on N . Therefore, the onlyadmissible choice is b = − / c < /
2, which make the bias-optimal value of κ independent of N and equal to κ ∗ := 2 √ αγ . Interestingly, this analytic result, based on Theorem 4, supports the optimal selections of b and κ determined numerically in the literature. Indeed, note that for the first parameter set in the17umerical exercise in Section 2, Set 1 , which is also used in Sanfelici et al. (2015), κ ∗ is equal to 1 . κ is saidto be approximately equal to 2. Note also that the numerical studies in A¨ıt-Sahalia et al. (2017),Sanfelici et al. (2015) both select b = − / Remark 2.
We underline that in our finite-sample setting, we assume W N > ∆ N , thereby implyinga constraint on the price grid δ N . Specifically, under Assumption 1, for b = − / , c < / and κ = κ ∗ := 2 √ αγ , the overlapping condition W N > ∆ N is equivalent to δ N > δ ∗ := (cid:16) κ ∗ λ (cid:17) c − / . Thethreshold δ ∗ is very small for typical orders of magnitude of the CIR parameters, h corresponding toone trading day and any feasible value of λ . For example, for the values of the CIR parameters inSet 1 (see Section 2), λ = 0 . and c = 1 / (so that, if δ N = 1 minute, then ∆ N := λ ∆ cN ≈ minutes), we have δ ∗ = 7 . · − seconds and thus the condition δ N > δ ∗ is largely satisfied at themost commonly available price sampling frequencies. More generally, independently of the initial value ν (0), we can exploit the conditional bias expan-sion in Theorem 4 and establish the following criterion for the bias-optimal selection of κ : κ ∗∗ := 2 p ν ( τ ) γ , for b = − / c < /
2. This result shows that if the spot volatility has a time-varying mean (i.e., if $\nu(0) \neq \alpha$), the bias-optimal selection of $\kappa$ necessarily varies with time, since it depends on $\nu(\tau)$, the magnitude of the spot volatility at the beginning of the estimation period. However, such a criterion for the selection of $\kappa$ is unfeasible unless estimates of $\gamma$ and $\nu(\tau)$ are available. To overcome this problem, we first reconstruct the unobservable spot volatility path from price observations using the global spot volatility estimator of Malliavin and Mancino (2009). We then apply a simple indirect inference method to estimate $\gamma$ from the reconstructed spot volatility path (see Appendix B). Although this procedure is very rough, the resulting estimates, in particular $\hat{\gamma}$, are of sufficient quality to ensure that our rule for the bias-optimal selection of $\kappa$ is effective, as shown in the numerical study in Section 5. However, this procedure is only one of many possibilities available in the literature (see, for example, Aït-Sahalia and Jacod (2014), Aït-Sahalia and Kimmel (2007), Bollerslev and Zhou (2002), Lunde and Brix (2013)).
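The selection rule and the overlapping condition of Remark 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the numerical inputs in the usage lines are made-up values rather than the parameter sets of Section 2, and time is assumed to be measured in years with 252 six-hour trading days per year.

```python
import math


def bias_optimal_kappa(nu_tau: float, gamma: float) -> float:
    """Rule kappa** = 2 * sqrt(nu(tau)) / gamma, stated for b = -1/2, c < 1/2."""
    if nu_tau <= 0.0 or gamma <= 0.0:
        raise ValueError("nu_tau and gamma must be positive")
    return 2.0 * math.sqrt(nu_tau) / gamma


def window_length(kappa: float, delta_n: float, b: float = -0.5) -> float:
    """W_N = k_N * delta_N, with k_N = kappa * delta_N**b."""
    return kappa * delta_n ** (b + 1.0)


def spot_grid_step(lam: float, delta_n: float, c: float) -> float:
    """Delta_N = lambda * delta_N**c."""
    return lam * delta_n ** c


def overlap_threshold(kappa: float, lam: float, c: float) -> float:
    """delta* = (kappa / lambda)**(1 / (c - 1/2)); W_N > Delta_N iff delta_N > delta*."""
    if c >= 0.5:
        raise ValueError("the threshold is stated for c < 1/2")
    return (kappa / lam) ** (1.0 / (c - 0.5))


# Hypothetical inputs (not taken from the paper's tables).
nu_tau, gamma = 0.04, 0.5
lam, c = 6e-4, 0.25
delta_n = 1.0 / (252 * 360)  # one minute, assuming 252 six-hour trading days per year

kappa = bias_optimal_kappa(nu_tau, gamma)
overlapping = window_length(kappa, delta_n) > spot_grid_step(lam, delta_n, c)
```

In a feasible implementation, `nu_tau` and `gamma` would be replaced by their estimates, so `kappa` would be recomputed at the start of each estimation period.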
4. Generalization to the CKLS framework via dimensional analysis
The second step provides a heuristic criterion for the bias-optimal selection of $\kappa$ when the data-generating process is the generalized version of the model in Assumption 2, as illustrated in the following assumption.

Assumption 4. Generalized data-generating process

For $t \in [0, T]$, $T > 0$, we assume that the dynamics of the log-price process $p(t)$ and the spot volatility process $\nu(t)$ follow
\[ p(t) = p(0) + \int_0^t \left( \mu - \frac{\nu(s)}{2} \right) ds + \int_0^t \sqrt{\nu(s)}\, dW(s), \]
\[ \nu(t) = \nu(0) + \theta \int_0^t \big( \alpha - \nu(s) \big)\, ds + \gamma \int_0^t \nu(s)^{\beta}\, dZ(s), \]
where $W$ and $Z$ are two correlated Brownian motions on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$, $\mu \in \mathbb{R}$, $\beta \ge 1/2$, $\theta, \alpha, \gamma > 0$, and $2\alpha\theta > \gamma^2$ if $\beta = 1/2$.

In this generalized model, the volatility evolves according to a generic CKLS model (Chan et al. (1992)) with arbitrary $\beta \ge 1/2$. Note that the CKLS model incorporates a number of popular models as special cases. For example, if $\beta = 1/2$, one obtains the CIR model (Cox et al. (1985)); if $\beta = 1$, one finds the Brennan-Schwartz model (Brennan and Schwartz (1980)); if $\beta = 3/2$, one gets the CIR-VR model (Cox et al. (1980)). Note that we also include a drift in the price model to numerically confirm that its impact on the PSRV finite-sample bias is negligible.

Via dimensional analysis, we derive a heuristic criterion for the bias-optimal selection of $\kappa$ under Assumption 4. After deriving this criterion, we test its efficacy in a numerical exercise, with very satisfactory results (see Section 5). Note that dimensional analysis is typically used in physics and engineering to make an educated guess about the solution to a problem without performing a full analytic study (see, e.g., Kyle and Obizhaeva (2017), Smith et al. (2003)).

Here, we proceed as follows. Let $\dim[q]$ denote the dimension of the quantity $q$ and consider the model in Assumption 4. Since the log-return $dp(t)$ is dimensionless and $\dim[dW(t)] = \dim[dZ(t)] = time^{1/2}$, from the dynamics of the log-price we obtain $\dim[\mu] = 1/time$ and $\dim[\nu(t)] = 1/time$. Thus, from the dynamics of $\nu(t)$ we have $\dim[\alpha] = 1/time$, $\dim[\theta] = 1/time$ and $\dim[\gamma \nu(t)^{\beta} dZ(t)] = 1/time$. The latter implies $\dim[\gamma]\, \dim[\nu(t)^{\beta}]\, \dim[dZ(t)] = 1/time$, that is, $\dim[\gamma] = 1/time^{-\beta + 3/2}$.

Now, without loss of generality, let $\nu(0) = \alpha$ and consider the dominant term in the expansion of Theorem 4, i.e., the term
\[ \left( \frac{4\alpha^2}{\kappa^2 \delta_N^{2b+1}} - \gamma^2 \alpha \right) h. \]
Since the dominant term of the PSRV bias must clearly have the same dimension as the expected quadratic variation of $\nu$ over any generic interval of length $h$, i.e., $\gamma^2 \alpha h$, we have
\[ \dim\left[ \left( \frac{4\alpha^2}{\kappa^2 \delta_N^{2b+1}} - \gamma^2 \alpha \right) h \right] = \dim[\gamma^2 \alpha h] = 1/time^2, \]
and, as one can easily verify, this implies $\dim[\kappa] = time^{-b}$ (alternatively, one can show that $\dim[\kappa] = time^{-b}$ by simply noting that $k_N = \kappa \delta_N^b$ is dimensionless and $\dim[\delta_N^b] = time^b$).

Now observe that the leading term of any expansion of the PSRV finite-sample bias must have dimension equal to $1/time^2$.
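The dimensional bookkeeping above can be checked mechanically by tracking only the exponent of time carried by each quantity. The sketch below is illustrative: the function names are hypothetical, the first addendum is assumed to scale as $\alpha^2/(\kappa^2 \delta_N^{2b+1})$ (the form consistent with $\kappa^* = 2\sqrt{\alpha}/\gamma$ at $b = -1/2$), and the general-$\beta$ check uses the fact that the instantaneous quadratic variation of $\nu$ under Assumption 4 is $\gamma^2 \nu^{2\beta}$.

```python
from fractions import Fraction

# Exponents of time: dim[q] = time**exponent.
NU = Fraction(-1)       # dim[nu] = dim[alpha] = 1/time
DZ = Fraction(1, 2)     # dim[dZ] = time^(1/2)
H = Fraction(1)         # dim[h] = time
DELTA = Fraction(1)     # dim[delta_N] = time


def gamma_exponent(beta: Fraction) -> Fraction:
    """gamma * nu^beta * dZ must carry the dimension of nu."""
    return NU - beta * NU - DZ


def kappa_exponent(b: Fraction) -> Fraction:
    """k_N = kappa * delta_N^b is a pure number, so dim[kappa] = time^(-b)."""
    return -b * DELTA


def first_addendum_exponent(b: Fraction) -> Fraction:
    """Exponent of alpha^2 / (kappa^2 * delta_N^(2b+1)) * h."""
    return 2 * NU - 2 * kappa_exponent(b) - (2 * b + 1) * DELTA + H


def second_addendum_exponent(beta: Fraction) -> Fraction:
    """Exponent of gamma^2 * nu^(2*beta) * h."""
    return 2 * gamma_exponent(beta) + 2 * beta * NU + H


# Both addenda carry time^(-2) for any rate b and any beta.
exponents = [first_addendum_exponent(Fraction(-1, 2)),
             second_addendum_exponent(Fraction(1, 2))]
```

Both entries of `exponents` equal `Fraction(-2)`, i.e., the two addenda share the dimension $1/time^2$, whatever the values of $b$ and $\beta$.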
Based on this observation, we conjecture that the leading term of the expansion in Theorem 4 under Assumption 4 is
\[ \left( \frac{4\, E[\nu(\tau)]^2}{\kappa^2 \delta_N^{2b+1}} - \gamma^2 E[\nu(\tau)^{2\beta}] \right) h, \]
whose dimension is $1/time^2$, as one can easily check by recalling that $\dim[\kappa] = time^{-b}$, $\dim[\nu(t)] = 1/time$ and $\dim[\gamma] = 1/time^{-\beta + 3/2}$. Accordingly, if one conditions the bias on the natural filtration of $\nu(t)$ up to time $t = \tau$, the generalized bias-optimal value of $\kappa$, for $b = -1/2$, $c < 1/2$, reads
\[ \kappa^{**} = \frac{2\, \nu(\tau)^{1-\beta}}{\gamma}. \]

Our conjecture is based on the origin of the two addenda in the leading term of the bias (see Theorem 4) in the CIR framework. In fact, bearing in mind that the leading term is
\[ \left( \frac{4\, E[\nu(\tau)]^2}{\kappa^2 \delta_N^{2b+1}} - \gamma^2 E[\nu(\tau)] \right) h, \]
we note that the second addendum, i.e., $\gamma^2 E[\nu(\tau)] h$, comes from the expected quadratic variation of the volatility process. More specifically, it originates from the leading term of the following expansion:
\[ E\big[ \langle \nu, \nu \rangle_{[\tau, \tau+h]} \big] = \gamma^2 E[\nu(\tau)]\, h + o(h), \quad h \to 0. \]
Instead, the first addendum, i.e., $\frac{4\, E[\nu(\tau)]^2}{\kappa^2 \delta_N^{2b+1}} h$, is due to the drift of the volatility process.

Thus, in the case of the CKLS model, the first addendum remains unchanged since the drift of the process is the same for any $\beta$, while the second addendum changes according to the expected quadratic variation of the volatility process, which, for small $h$, reads
\[ E\big[ \langle \nu, \nu \rangle_{[\tau, \tau+h]} \big] = \gamma^2 E[\nu(\tau)^{2\beta}]\, h + o(h), \quad h \to 0, \]
since $E\big[ \langle \nu, \nu \rangle_{[\tau, \tau+h]} \big] = \gamma^2 \int_{\tau}^{\tau+h} E[\nu(s)^{2\beta}]\, ds$.

5. Numerical study

As detailed in the previous section, in the absence of microstructure noise and assuming $\nu(\tau)$ to be observable and $\gamma$ to be known, the finite-sample bias of the PSRV is optimized, under Assumption 2 and for any $\nu(0)$, by selecting $b = -1/2$, $c < 1/2$ and $\kappa = \kappa^{**}$, where $\kappa^{**}$ reads
\[ \kappa^{**} := \frac{2\sqrt{\nu(\tau)}}{\gamma}. \]

In this section, we assess the performance of this bias-optimal rule for the selection of $\kappa$ in a realistic scenario where the observed prices are contaminated by noise, the volatility trajectory is not observable, and the model parameters are unknown. To this end, we perform the following numerical exercise, which involves three scenarios with an increasing number of sources of bias.

In the first scenario, we simulate log-price paths under Assumption 2, and compute daily PSRV values from noise-free price observations assuming that the CIR parameters are known and the initial volatility value $\nu(\tau)$ is observable.
In this scenario, we use two price sampling frequencies, that is, $\delta_N = 1$ minute and $\delta_N = 5$ minutes. This allows us to numerically verify that the bias generated by the discrete sampling of prices is relatively small, e.g., less than 5% if $\delta_N = 1$ minute when $\kappa = \kappa^{**}$ (see Table 1).

In the second scenario, we simulate log-price paths under Assumption 3 and compute PSRV values from noisy prices while assuming that the CIR parameters are known and the initial volatility value $\nu(\tau)$ is observable. As the PSRV is not robust to the presence of noise contaminations in the price process, here we only consider the sampling frequency $\delta_N = 5$ minutes, as recommended in the seminal paper by Andersen et al. (2001), where the authors suggest that this sampling frequency reduces the impact of noise on returns while still falling within a high-frequency framework. Indeed, a comparison of the numerical results obtained in these first two scenarios shows that the impact of the price noise on the PSRV estimates is relatively small at the 5-minute sampling frequency, when $\kappa = \kappa^{**}$ is used.

In the third scenario, we still simulate the log-price path under Assumption 3, but the value of the initial volatility, $\nu(\tau)$, is now unobservable and the model parameter $\gamma$ is unknown. Thus, we compute PSRV values from noisy prices by selecting $\kappa = \hat{\kappa}^{**}$, where $\hat{\kappa}^{**}$ is equal to
\[ \hat{\kappa}^{**} := \frac{2\sqrt{\hat{\nu}(\tau)}}{\hat{\gamma}}. \]
Here, $\hat{\nu}(\tau)$ is the Fourier estimate of the spot volatility at time $\tau$, while $\hat{\gamma}$ is given in Appendix B. A comparison of the results obtained in these different scenarios shows that the PSRV finite-sample bias reduction obtained with the feasible selection of $\kappa$, i.e., $\kappa = \hat{\kappa}^{**}$, is very similar to the reduction obtained with the unfeasible selection of $\kappa$, i.e., $\kappa = \kappa^{**}$.

Overall, for each scenario, we consider the three sets of parameters described in Section 2.
Foreach parameter set, we simulate one thousand 1-year trajectories of 1-second observations.The noise component η in Assumption 3 is simulated as an i.i.d. Gaussian process, with noise-to-signalratio ζ ranging from 0.5 to 3.5, as in the numerical exercise proposed in Sanfelici et al. (2015). Wedefine the noise-to-signal ratio ζ as in Sanfelici et al. (2015), i.e., ζ := std (∆ η ) std ( r ) , where ∆ η denotes ageneric increment of the i.i.d. process η under Assumption 3 and r denotes the noise-free log-returnat the maximum sampling frequency available, which is equal to 1 second in our numerical exercise.From the simulated prices, we compute daily PSRV values, that is, we set a small time horizon h ,i.e., h = 1 / κ is valid when b = − / c < /
2. Accordingly, we set b = − / c = 1 / N is equal to 360 when δ N = 1 minute and 72 when δ N = 5 minutes. Note that the overlappingcondition W N > ∆ N is always satisfied for the values of ∆ N in Table 1. In particular, the averagelength of W N is approximately equal to: 530 minutes for Set 1 , 410 minutes for
Set 2 and 580 for
Set 3 , when δ N = 1 minute; 1200 minutes for Set 1 , 930 minutes for
Set 2 and 1310 minutes for
Set3 , when δ N = 5 minutes. These averages are computed over all simulated days and are stable acrossthe three scenarios. Recall that the length of W N varies by day, as it depends on κ ∗∗ , which in turndepends on the volatility value at the beginning of each day, i.e., ν ( τ ) (in scenarios 1 and 2), or itsFourier estimate, i.e., ˆ ν ( τ ) (in scenario 3). 22 oise-to-signal ratio ζ δ N ∆ N λ rel. bias 1 (Set 1 ) rel. bias ( Set 2 ) rel. bias (
Set 3 ) ζ = 0 1 min. δ N (1 min. ) 2 · − δ N (2 min.) 4 · − δ N (3 min.) 6 · − δ N (5 min.) 1 · − δ N (10 min.) 1 . · − δ N (15 min.) 2 . · − ζ = 0 5 min. δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − Table 1:
Scenario 1: daily PSRV finite-sample relative bias with κ = κ ∗∗ , ζ = 0, γ known and ν ( τ ) observable.Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ).noise-to-signal ratio ζ δ N ∆ N λ rel. bias ( Set 1 ) rel. bias (
Set 2 ) rel. bias (
Set 3 ) ζ = 0 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 1 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 2 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 3 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − Table 2:
Scenario 2: daily PSRV finite-sample relative bias with κ = κ ∗∗ , ζ > γ known and ν ( τ ) observable.Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ). oise-to-signal ratio ζ δ N ∆ N λ rel. bias ( Set 1 ) rel. bias (
Set 2 ) rel. bias (
Set 3 ) ζ = 0 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 1 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 2 . δ N (5 min.) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − ζ = 3 . δ N (5 min) 6 · − δ N (10 min.) 1 . · − δ N (15 min.) 1 . · − δ N (30 min.) 3 . · − Table 3:
Scenario 3: daily PSRV finite-sample relative bias with κ = κ ∗∗ , ζ > γ unknown and ν ( τ ) unobserv-able. Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ). Table 1 shows that for δ N = 1 minute and ∆ N ≤ ν (0) = α , while it is slightly larger but still acceptable (i.e., between 3% and4%) when ν (0) = 2 α . This is in line with Theorems 1-4 in that various sources of bias are eliminatedwhen ν (0) = α . With a price sampling frequency of five minutes, the bias is still acceptable, around6% at worst. Additionally, Table 2 shows that in the presence of noise, price sampling at five-minuteintervals to avoid microstructure frictions represents an acceptable compromise, as the bias is lessthan 15% even in the presence of very intense microstructure effects. Finally, Table 3 shows that thestatistical error related to the estimation of γ and ν ( τ ) could actually partially compensate for thebias due to the presence of noise, especially when the common assumption ν (0) = α is violated.We conclude this section by testing the efficacy of the generalized, conjecture-based, criterion forthe bias-optimal selection of κ under Assumption 4, i.e., under the assumption that the volatilityevolves as a CKLS model. In this case, the feasible version of the bias-optimal criterion to select κ is24iven by ˆ κ ∗∗ = 2ˆ ν ( τ ) − β ˆ γ , for b = − / c < /
2. To test the efficacy of this criterion, we repeat the numerical exercisepreviously performed in scenario 1 under Assumption 2, considering three different values of β : β =1 /
2, corresponding to the model by Heston (1993), which differs from the model of Assumption 2 onlyin the presence of a price drift; β = 1, corresponding to the continuous-time GARCH model by Nelson(1990); and β = 3 /
2, corresponding to the 3/2 model by Platen (1997). For all parameter sets, µ isset equal to 0.05. The following tables show that our general criterion for the bias-optimal selection of κ under Assumption 4 is effective, as it gives satisfactory results in terms of relative bias. Note thatthe case β = 1 / κ derived analytically under Assumption 2, i.e., κ = κ ∗∗ , is also effective in the presence of a pricedrift. Model δ N ∆ N λ rel. bias ( Set 1 ) rel. bias (
Set 2 ) rel. bias (
Set 3 ) β = δ N (1 min.) 2 · − δ N (2 min.) 4 · − δ N (3 min.) 6 · − δ N (5 min.) 1 · − δ N (10 min.) 1 . · − δ N (15 min.) 2 . · − Table 4: β = 1 /
2: daily PSRV finite-sample relative bias with κ = 2 γ − ν ( τ ) − β , ζ = 0, γ known and ν ( τ )observable. Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ). The price drift, µ , is always equal to 0 . odel δ N ∆ N λ rel. bias ( Set 1 ) rel. bias (
Set 2 ) rel. bias (
Set 3 ) β = 1 1 min. δ N (1 min.) 2 · − δ N (2 min.) 4 · − δ N (3 min.) 6 · − δ N (5 min.) 1 · − δ N (10 min.) 1 . · − δ N (15 min.) 2 . · − Table 5: β = 1: daily PSRV finite-sample relative bias with κ = 2 γ − ν ( τ ) − β , ζ = 0, γ known and ν ( τ )observable. Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ). The price drift, µ , is always equal to 0 . δ N ∆ N λ rel. bias ( Set 1 ) rel. bias (
Set 2 ) rel. bias (
Set 3 ) β = δ N (1 min.) 2 · − δ N (2 min.) 4 · − δ N (3 min.) 6 · − δ N (5 min.) 1 · − δ N (10 min.) 1 . · − δ N (15 min.) 2 . · − Table 6: β = 3 /
2: daily PSRV finite-sample relative bias with κ = 2 γ − ν ( τ ) − β , ζ = 0, γ known and ν ( τ )observable. Model parameters: α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 1 ); α = 0 . θ = 10, γ = 0 . ρ = − . ν (0) = 0 .
03 (
Set 2 ); α = 0 . θ = 5, γ = 0 . ρ = − . ν (0) = 0 . Set 3 ). The price drift, µ , is always equal to 0 .
6. Empirical study
We conclude this paper with an empirical analysis, where we apply our bias-optimal criterion for $\kappa$ when computing daily PSRV estimates. The dataset is composed of two 1-year samples of S&P 500 1-minute prices relative to the years 2016 and 2017, respectively. The two samples are analyzed separately since the volatility of these two time series behaves very differently. In fact, the year 2016 is characterized by volatility spikes (due, e.g., to uncertainty pertaining to the so-called Brexit in the month of June or the U.S. presidential election in the month of November), while the year 2017 is characterized by low volatility, as one can see in Figure 4. Analyzing the two series separately allows for validation of the feasible rule for the selection of $\kappa$ in two very different scenarios.

[Figure 4 panels: VIX, 2016; VIX, 2017; S&P 500 log-returns, 2016; S&P 500 log-returns, 2017. Horizontal axis: months, January to December.]

Figure 4: Daily
V IX values (left) and daily S&P 500 log-returns (right) in the years 2016 and 2017. We proceed as follows. First, through the method detailed in Appendix B, we fit the model ofAssumption 4 to each sample, comparing the three different values of β considered in the numericalexercise of Section 5. The results of the fitting are shown in Table 7. Model Sample year ˆ γ ˆ α ˆ ν (0) R β = β = 1 2016 6.1682 0.0083 0.0105 0.07252017 5.9973 0.0037 0.0014 0.0901 β = Table 7:
Results of fitting the stochastic volatility model in Assumption 4 for different values of β . Then, based on the resulting R values, we assume the Heston model ( β = 1 /
2) as the datagenerating process for both samples. Consequently, we select b = − / c = 1 / κ = 2ˆ γ − p ˆ ν ( τ )and compute daily PSRV values from 5-minute empirical prices, as the impact of microstructurecontaminations is negligible at that sampling frequency. Before fitting the model of Assumption 4, weperform the Hausman test by A¨ıt-Sahalia and Xiu (2019) for the presence of noise. The result of thetest tells that the impact of noise at the 5-minute frequency is negligible in our samples, confirminga well-known stylized fact (see Andersen et al. (2001)). Based on this result, we then perform the27ump-detection test by Corsi et al. (2010) on 5-minute returns (the test is not robust to the presenceof noise contaminations in the price process). Based on the result of the jump-detection test, weremove from the samples the days in which price jumps are detected. These days amount to 12 . .
30% of the sample in 2017.The following figures show the PSRV values obtained for four different values of λ correspondingto a spot volatility estimation frequency ∆ N equal to 5, 10, 15, and 30 minutes, respectively.
[Figure 5 panels: $\Delta_N$ = 5, 10, 15 and 30 minutes. Horizontal axis: days; vertical axis: daily PSRV.]

Figure 5: Daily PSRV values in the year 2016.

[Figure 6 panels: $\Delta_N$ = 5, 10, 15 and 30 minutes. Horizontal axis: days; vertical axis: daily PSRV.]

Figure 6: Daily PSRV values in the year 2017.
Comparing the dynamics of the VIX index in Figure 4 with those of the PSRV, one notices that when the VIX spikes, the vol-of-vol also spikes (see, e.g., the behavior of the plots at the end of June 2016) and, vice versa, when the VIX is low and stable (e.g., in 2017), the vol-of-vol is also low and stable. This evidence supports the reliability of our vol-of-vol estimates. Finally, note that for either of the two samples, the plots for different values of $\Delta_N$ are basically indistinguishable. With respect to the bias-optimal selection of $\lambda$ (i.e., $\Delta_N$), this evidence confirms what emerges from the analytic study in Section 3: the impact of the selection of $\lambda$ (i.e., $\Delta_N$) on PSRV values is marginal, if not negligible.
7. Conclusions
As pointed out in the Introduction, a number of integrated volatility-of-volatility estimators have been put forward in recent years, although without providing feasible criteria to optimally select the tuning parameters involved in their computation in a finite-sample setting. The main contribution of this paper is to provide an approach to fill this gap. Specifically, we focus on the simplest vol-of-vol estimator available in the literature, the pre-estimated spot-variance based realized variance (PSRV) by Barndorff-Nielsen and Veraart (2009). Inspired by the approach used in Aït-Sahalia et al. (2013), we compute its finite-sample bias in a parametric setting where explicit calculations are fully attainable. In our calculations, consecutive local windows for estimating the spot volatility are allowed to overlap, based on preliminary numerical evidence that this overlapping is crucial for practical purposes.

In turn, full knowledge of the PSRV bias expression, in the absence of microstructure noise, shows two aspects. First, when consecutive local windows overlap, the dominant term of the bias depends only on the local-window tuning parameter $\kappa$ and, more interestingly, a simple choice of this tuning parameter allows us to reduce the bias. Our analytic results support the optimal selection of the local-window tuning parameter derived numerically in Sanfelici et al. (2015).

Second, when local windows do not overlap, the dominant term of the bias depends on both tuning parameters. In this case, this term can be removed only if the system of equations in Corollary 1 admits a solution. Interestingly, this system can be solved for very high frequencies, that is, frequencies at which prices are affected by microstructure noise, thereby implying again a bias in the PSRV estimates.

Hence, the proposed results suggest that a key ingredient for reliable PSRV estimates is the overlapping of two consecutive windows for spot volatility estimation. Note that although the bias-optimal value of the local-window tuning parameter depends on quantities which are not directly observable, namely the vol-of-vol parameter and the spot volatility value at the beginning of the estimation period, rough estimates of these quantities (see Appendix B) make the bias-optimal selection of the tuning parameter effective in removing the bias.

Once the bias-optimal parametric expression of the local-window tuning parameter has been obtained under the CIR model, we generalize it to be effective under the CKLS model by using dimensional analysis. Numerical results corroborate this generalization in that nearly unbiased vol-of-vol estimates are obtained for two other models incorporated in the CKLS class, namely, the continuous-time GARCH model and the 3/2 model.

We highlight that the analytic approach used in this paper to study the PSRV finite-sample bias could be applied to analyze the finite-sample performance of other estimators of second-order quantities which require pre-estimation of the spot volatility (e.g., estimators of the integral of the stochastic leverage, see Chapter 8 in Aït-Sahalia and Jacod (2014)).

Finally, as a byproduct of this analysis, we quantify the ensuing bias reduction for both the PSRV and the locally averaged realized volatility from the assumption that the initial value of the volatility is equal to its long-term mean, which is very common in simulation studies found in the literature.

References
Aït-Sahalia, Y., Fan, J., Laeven, R., Wang, C. D., and Yang, X. (2017). Estimation of the continuous and discontinuous leverage effects. Journal of the American Statistical Association, 112(520):1744–1758.
Aït-Sahalia, Y., Fan, J., and Li, Y. (2013). The leverage effect puzzle: disentangling sources of bias at high frequency. Journal of Financial Economics, 109:224–249.
Aït-Sahalia, Y. and Jacod, J. (2014). High-frequency financial econometrics. Princeton University Press.
Aït-Sahalia, Y. and Kimmel, R. (2007). Maximum likelihood estimation of stochastic volatility models. Journal of Financial Economics, 83(2):413–452.
Aït-Sahalia, Y. and Xiu, D. (2019). A Hausman test for the presence of noise in high frequency data. Journal of Econometrics, 211:176–205.
Andersen, T., Bollerslev, T., Diebold, F., and Ebens, H. (2001). The distribution of realized stock return volatility. Journal of Financial Economics, 61:43–76.
Barndorff-Nielsen, O. and Veraart, A. (2009). Stochastic volatility of volatility in continuous time. CREATES Research Paper, No. 2009-25.
Bollerslev, T., Tauchen, G., and Zhou, H. (2009). Expected stock returns and variance risk premia. The Review of Financial Studies, 22(11):4463–4492.
Bollerslev, T. and Zhou, H. (2002). Estimating stochastic volatility diffusion using conditional moments of integrated volatility. Journal of Econometrics, 109:33–65.
Brennan, M. and Schwartz, E. (1980). Analyzing convertible securities. Journal of Financial and Quantitative Analysis, 15(4):907–929.
Chan, K., Karolyi, G., Longstaff, F., and Sanders, A. (1992). An empirical comparison of alternative models of the short-term interest rate. The Journal of Finance, 47(3):1209–1227.
Corsi, F., Pirino, D., and Renò, R. (2010). Threshold bipower variation and the impact of jumps on volatility forecasting. Journal of Econometrics, 159(2):276–288.
Cox, J., Ingersoll, J., and Ross, S. (1980). An analysis of variable rate loan contracts. The Journal of Finance, 35:389–403.
Cox, J., Ingersoll, J., and Ross, S. (1985). An intertemporal general equilibrium model of asset prices. Econometrica, 53(2):363–384.
Cuchiero, C. and Teichmann, J. (2015). Fourier transform methods for pathwise covariance estimation in the presence of jumps. Stochastic Processes and Their Applications, 125(1):116–160.
Gatheral, J. and Oomen, R. (2010). Zero-intelligence realized variance estimation. Finance and Stochastics, 14(2):249–283.
Hasbrouck, J. (2007). Empirical market microstructure: the institutions, economics, and econometrics of securities trading. Oxford University Press.
Heston, S. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. The Review of Financial Studies, 6(2):327–343.
Huang, D., Schlag, C., Shaliastovich, I., and Thimme, J. (2018). Volatility-of-volatility risk. Journal of Financial and Quantitative Analysis, pages 1–63.
Jacod, J., Li, Y., and Zheng, X. (2017). Statistical properties of microstructure noise. Econometrica, 85:1133–1174.
Kalnina, I. and Xiu, D. (2017). Nonparametric estimation of the leverage effect: a trade-off between robustness and efficiency. Journal of the American Statistical Association, 112(517):384–399.
Kyle, A. and Obizhaeva, A. (2017). Dimensional analysis and market microstructure invariance. Working Paper w0234, New Economic School (NES).
Lunde, A. and Brix, A. (2013). Estimating stochastic volatility models using prediction-based estimating functions. CREATES Research Paper, No. 2013-23.
Malliavin, P. and Mancino, M. (2009). A Fourier transform method for nonparametric estimation of multivariate volatility. Annals of Statistics, 37(4):1983–2010.
Mancino, M., Recchioni, C., and Sanfelici, S. (2017). Fourier-Malliavin volatility estimation. Theory and practice. Springer.
Mykland, P. A. and Zhang, L. (2009). Inference for continuous semimartingales observed at high frequency. Econometrica, 77(5):1403–1445.
Nelson, D. (1990). ARCH models as diffusion approximations. Journal of Econometrics, 45(2):7–38.
Platen, E. (1997). A non-linear stochastic volatility model. Financial Mathematics Research Report No. FMRR005-97, Center for Financial Mathematics, Australian National University.
Roll, R. (1984). A simple implicit measure of the effective bid-ask spread in an efficient market. The Journal of Finance, 39:1127–1139.
Sanfelici, S., Curato, I., and Mancino, M. (2015). High frequency volatility of volatility estimation free from spot volatility estimates. Quantitative Finance, 15(8):1331–1345.
Smith, E., Farmer, D., Gillemot, L., and Krishnamurthy, S. (2003). Statistical theory of the continuous double auction. Quantitative Finance, 3(6):481–514.
Vetter, M. (2015). Estimation of integrated volatility of volatility with applications to goodness-of-fit testing. Bernoulli, 21(4):2393–2418.
Appendix A Proofs
See supplementary file.
Appendix B Indirect inference of the CKLS parameters
See supplementary file.

Appendix A Proofs

Theorem 1

Proof.
From Definition 2 we have
\[
PSRV_{[\tau,\tau+h],N} := \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \hat\nu(\tau+i\Delta_N) - \hat\nu(\tau+i\Delta_N-\Delta_N) \Big]^2 ,
\]
where, for $s$ taking values on the time grid of mesh size $\delta_N$:
- $\hat\nu(s) := RV(s,k_N\delta_N)\,(k_N\delta_N)^{-1}$,
- $RV(s,k_N\delta_N) := \sum_{j=1}^{k_N} \Delta p(s+j\delta_N-k_N\delta_N,\delta_N)^2$,
- $\Delta p(s,\delta_N) \equiv \Delta p(s) := p(s)-p(s-\delta_N)$.

Note that $E[PSRV_{[\tau,\tau+h],N}]$ can be rewritten as
\[
E[PSRV_{[\tau,\tau+h],N}] = (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ E[RV(\tau+i\Delta_N,k_N\delta_N)^2] + E[RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)^2] - 2E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)] \Big\}. \qquad (2)
\]
Therefore, under Assumption 2, the explicit formula for $E[PSRV_{[\tau,\tau+h],N}]$ can be obtained by deriving an analytic expression for $E[RV(\tau+i\Delta_N,k_N\delta_N)^2]$, $E[RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)^2]$ and $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$. We address these three tasks separately as follows.

I) Analytic expression of $E[RV(\tau+i\Delta_N,k_N\delta_N)^2]$

To simplify the notation, we use $a_{i,u,N}$ to denote the quantity $a_{i,u,N} = \tau+i\Delta_N+(u-k_N)\delta_N$ and $(\mathcal{F}^{\nu}_s)_{s\ge 0}$ to denote the natural filtration associated with the process $\nu$.
We have
\[
\begin{aligned}
E[RV(\tau+i\Delta_N,k_N\delta_N)^2]
&= \sum_{j=1}^{k_N} E\big[\Delta p(\tau+i\Delta_N+(j-k_N)\delta_N)^4\big]
 + 2\sum_{j=2}^{k_N} E\Big[\Delta p(\tau+i\Delta_N+(j-k_N)\delta_N)^2 \sum_{h=1}^{j-1}\Delta p(\tau+i\Delta_N+(h-k_N)\delta_N)^2\Big] \\
&= \sum_{j=1}^{k_N} E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^4\Big]
 + 2\sum_{j=2}^{k_N} E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \sum_{h=1}^{j-1}\Big(\int_{a_{i,h-1,N}}^{a_{i,h,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2\Big] \\
&= \sum_{j=1}^{k_N} E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^4\Big]
 + 2\sum_{j=2}^{k_N}\sum_{h=1}^{j-1} E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \Big(\int_{a_{i,h-1,N}}^{a_{i,h,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2\Big],
\end{aligned}
\]
where:

• $\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s) \,\Big|\, \mathcal{F}^{\nu}_{a_{i,j,N}} \sim \mathcal{N}\Big(0, \int_{a_{i,j-1,N}}^{a_{i,j,N}}\nu(s)\,ds\Big)$, which implies
\[
E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^4\Big]
= E\Big[E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^4 \,\Big|\, \mathcal{F}^{\nu}_{a_{i,j,N}}\Big]\Big]
= 3\,E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\nu(s)\,ds\Big)^2\Big];
\]

• for $h<j$ and $s<r$ we have:
\[
\begin{aligned}
E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \Big(\int_{a_{i,h-1,N}}^{a_{i,h,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2\Big]
&= E\Big[E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \Big(\int_{a_{i,h-1,N}}^{a_{i,h,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \,\Big|\, \mathcal{F}^{\nu}_{a_{i,j,N}}\Big]\Big] \\
&= E\Big[E\Big[\Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \,\Big|\, \mathcal{F}^{\nu}_{a_{i,j,N}}\Big]\, E\Big[\Big(\int_{a_{i,h-1,N}}^{a_{i,h,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \,\Big|\, \mathcal{F}^{\nu}_{a_{i,j,N}}\Big]\Big] \\
&= E\Big[\int_{a_{i,j-1,N}}^{a_{i,j,N}}\nu(s)\,ds \int_{a_{i,h-1,N}}^{a_{i,h,N}}\nu(s)\,ds\Big] \\
&= \int_{a_{i,j-1,N}}^{a_{i,j,N}}\int_{a_{i,h-1,N}}^{a_{i,h,N}} E[\nu(r)\nu(s)]\,ds\,dr \\
&= \int_{a_{i,j-1,N}}^{a_{i,j,N}}\int_{a_{i,h-1,N}}^{a_{i,h,N}} E\big[\nu(s)\,E[\nu(r)\,|\,\mathcal{F}^{\nu}_s]\big]\,ds\,dr.
\end{aligned}
\]
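The first bullet above relies on the Gaussian identity $E[X^4] = 3\sigma^4$, applied conditionally on the volatility path and then averaged. A minimal numerical sketch of that step is given below. The helper name `gaussian_moment` and the two-point law used for the integrated variance are illustrative assumptions, not part of the proof.

```python
import math


def gaussian_moment(variance, power, half_width=12.0, steps=20000):
    """Approximate E[X**power] for X ~ N(0, variance) by composite Simpson quadrature."""
    sd = math.sqrt(variance)
    a = -half_width * sd
    width = 2.0 * half_width * sd / steps  # steps must be even for Simpson's rule

    def integrand(x):
        density = math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)
        return x ** power * density

    total = integrand(a) + integrand(-a)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(a + k * width)
    return total * width / 3.0


# Hypothetical two-point law for the integrated variance over one sampling
# interval: (probability, value) pairs standing in for the volatility path.
scenarios = [(0.5, 0.01), (0.5, 0.04)]

# Left side: E[ E[X^4 | integrated variance] ]; right side: 3 E[(integrated variance)^2].
lhs = sum(p * gaussian_moment(v, 4) for p, v in scenarios)
rhs = 3.0 * sum(p * v ** 2 for p, v in scenarios)
```

In this sketch `lhs` and `rhs` agree up to quadrature error, which mirrors the tower-property argument used for the fourth moment of the increments.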
Under Assumption 2 (see Appendix A in Bollerslev and Zhou (2002)), we also have: • E h(cid:16) Z a i,j,N a i,j − ,N v ( s ) ds (cid:17) i = 1 θ (1 − e − θδ N ) n e − θi ∆ N − θjδ N +2 θ (1+ k N ) δ N E [ ν ( τ )] + h γ θ ( e − θi ∆ N − θjδ N + θ ( k N +1) δ N − e − θi ∆ N − θjδ N +2 θ ( k N +1) δ N )+2 αe − θi ∆ N − θjδ N + θ ( k N +1) δ N (1 − e − θi ∆ N − θjδ N + θ ( k N +1) δ N ) i E [ ν ( τ )]+ (cid:16) γ α θ + α (cid:17) (1 + e − θi ∆ N − θjδ N +2 θ ( k N +1) δ N − e − θ i ∆ N − θjδ N + θ ( k N +1) δ N ) o + γ θ (cid:16) θ − δ N e − θδ N − θ e − θδ N (cid:17) +2 θ (1 − e − θδ N ) h αδ N − αθ (1 − e − θδ N ) ih e − θi ∆ N − θjδ N + θ ( k N +1) δ N E [ ν ( τ )]+ α (1 − e − θi ∆ N − θjδ N + θ ( k N +1) δ N ) i + γ θ h αδ N (1 + 2 e − θδ N ) + α θ ( e − θδ N + 4 e − θδ N − i + α δ N + α θ (1 − e − θδ N ) − α θ δ N (1 − e − θδ N ); 35 for h < j and s < r , Z a i,j,N a i,j − ,N Z a i,h,N a i,h − ,N E [ v ( s ) E [ v ( r ) |F νs ] dsdr = h(cid:16) E [ ν ( τ )] − α (cid:17) + γ θ (cid:16) α − E [ ν ( τ )] (cid:17)i θ · e − θi ∆ N − θjδ N − θhδ N +2 θk N δ N (1 − e θδ N ) + − γ α θ e − θjδ N + θhδ N (2 − e − θδ N − e θδ N ) + α δ N − γ αθ ( E [ ν ( τ )] − α ) δ N (1 − e θδ N ) e − θi ∆ N − θjδ N + θk N δ N + − αθ ( E [ ν ( τ )] − α ) δ N (1 − e θδ N ) e − θi ∆ N − θhδ N + θk N δ N . 
Finally, putting everything together, we obtain: E [ RV ( τ + i ∆ N , k N δ N )] =(1 − e − θk N δ N )(1 − e − θδ N ) − e − i ∆ N +2 θk N δ N (1 − e − θδ N ) θ h ( E [ ν ( τ )] − α ) + γ θ (cid:16) α − E [ ν ( τ )] (cid:17)i +(1 − e − θk N δ N )(1 − e − θδ N ) − e − i ∆ N + θk N δ N n γ θ ( E [ ν ( τ )] − α ) 3 θ (1 − e − θδ N ) + γ θ (cid:16) θ − e − θδ N δ N − θ e − θδ N (cid:17) θ ( E [ ν ( τ )] − α ) + h αθ δ N (1 − e − θδ N ) i ( E [ ν ( τ )] − α ) o + γ θ k N h α θ (1 − e − θδ N ) + 3 αθ (cid:16) θ − e − θδ N δ N − θ e − θδ N (cid:17) + 3 αθ δ N (1 + 2 e − θδ N )+ 3 α θ ( e − θδ N + 4 e − θδ N − i + 3 α δ N k N +2 h γ θ (cid:16) α − E [ ν ( τ )] (cid:17) + ( E [ ν ( τ )] − α ) i θ e θk N δ N − θi ∆ N − θδ N ×× (1 − e − θ ( k N − δ N + e − θ ( k N + i ) δ N − e − θδ N + e − θ (2 k N − δ N − e − θk N δ N )(1 − e − θδ N ) − +2 α ( E [ ν ( τ )] − α ) 1 θ δ N e θk N δ N − θi ∆ N ( e θδ N − − ×× [ e − θk N δ N ( e θk N δ N − k N − k N e θδ N ) + k N ( e θδ N −
1) + e θδ N ( e − θk N δ N − γ θ ( E [ ν ( τ )] − α ) 1 θ δ N e − θi ∆ N ( e θk N δ N − k N − k N e θδ N )( e θδ N − − + γ α θ ( e − θk N δ N − k N − k N e − θδ N ) + α δ N ( k N − k N ) . II) Analytic expression of E [ RV ( τ + i ∆ N − ∆ N , k N δ N )] The analytic expression of E [ RV ( τ + i ∆ N − ∆ N , k N δ N )] under Assumption 2 is easily obtainedby replacing i with i − E [ RV ( τ + i ∆ N , k N δ N )] derived in (I). III) Analytic expression of E [ RV ( τ + i ∆ N , k N δ N ) RV ( τ + i ∆ N − ∆ N ∆ N , k N δ N )] We assume that W N = k N δ N < ∆ N for N larger than some threshold µ ∗ >
0. Then, for
N > µ ∗ we rewrite 36 [ RV ( τ + i ∆ N , k N δ N ) RV ( τ + ( i − N , k N δ N )] = E h k N X j =1 ∆ p ( τ + i ∆ N + ( j − k N ) δ N , δ N ) k N X j =1 ∆ p ( τ ( i − N + ( j − k N ) δ N , δ N ) i = E h k N X j =1 (cid:16) Z a i,j,N a i,j − ,N p v ( s ) dW ( s ) (cid:17) k N X j =1 (cid:16) Z a i − ,j,N a i − ,j − ,N p v ( s ) dW ( s ) (cid:17) i = E h E h k N X j =1 (cid:16) Z a i,j,N a i,j − ,N p v ( s ) dW ( s ) (cid:17) k N X j =1 (cid:16) Z a i − ,j,N a i − ,j − ,N p v ( s ) dW ( s ) (cid:17) |F va i,kN ,N ii = E h E h k N X j =1 (cid:16) Z a i,j,N a i,j − ,N p v ( s ) dW ( s ) (cid:17) |F va i,kN ,N i E h k N X j =1 (cid:16) Z a i − ,j,N a i − ,j − ,N p v ( s ) dW ( s ) (cid:17) |F va i,kN ,N ii = E h k N X j =1 E h(cid:16) Z a i,j,N a i,j − ,N p v ( s ) dW ( s ) (cid:17) |F va i,kN ,N i k N X j =1 E h(cid:16) Z a i − ,j,N a i − ,j − ,N p v ( s ) dW ( s ) (cid:17) |F va i,kN ,N ii = E h k N X j =1 Z a i,j,N a i,j − ,N v ( s ) ds k N X j =1 Z a i − ,j,N a i − ,j − ,N v ( s ) ds i = E h Z a i,kN ,N a i, ,N v ( s ) ds Z a i − ,kN ,N a i − , ,N v ( s ) ds i = Z a i,kN ,N a i, ,N Z a i − ,kN ,N a i − , ,N E h v ( r ) v ( s ) i dsdr = Z a i,kN ,N a i, ,N Z a i − ,kN ,N a i − , ,N E [ v ( s ) E [ v ( r ) |F νs ] dsdr, s < r. Under Assumption 2 (see, again, Appendix A in Bollerslev and Zhou (2002)), E [ RV ( τ + i ∆ N , k N δ N ) RV ( τ + ( i − N , k N δ N )] = (3)= 1 θ e θ ∆ N (cid:16) − e θk N δ N (cid:17) e − θi ∆ N (cid:20) ( E [ ν ( τ )] − α ) + γ θ (cid:16) α − E [ ν ( τ )] (cid:17)(cid:21) − e − θ ∆ N (cid:16) − e θk N δ N − e − θ kN δ N (cid:17) γ α θ − θ k N δ N (cid:16) − e θk N δ N (cid:17) e − θi ∆ N (cid:20)(cid:18) γ θ + α (cid:19) ( E [ ν ( τ )] − α ) (cid:21) − θ k N δ N (cid:16) − e θk N δ N (cid:17) e θ ∆ N − θi ∆ N [ α ( E [ ν ( τ )] − α )] + α ( k N δ N ) . After plugging the explicit expressions obtained in (I), (II) and (III) into Eq. 
(A), simple buttedious calculations yield the parametric expression of E [ P SRV [ τ,τ + h ] ,N ] under Assumption 2, whichcan be expressed in the following compact form: E [ P SRV [ τ,τ + h ] ,N ] = γ αhA N + γ (cid:16) E [ ν ( τ )] − α (cid:17) − e − θh h B N + C N , where: 37 N = ( k N δ N ) − ∆ − N n θ k N h θ (1 − e − θδ N ) + 3 1 θ (cid:16) θ − e − θδ N δ N − θ e − θδ N (cid:17) + 3 1 θ δ N (1 + 2 e − θδ N ) + 32 θ ( e − θδ N + 4 e − θδ N − i + 2 θ ( e − θk N δ N − k N − k N e − θδ N )+ 1 θ e − θ ∆ N (2 − e θk N δ N − e − θk N δ N ) o ; (4) B N = ( k N δ N ) − e − θ ∆ N (1 − e − θ ∆ N ) − n (1 + e θ ∆ N ) h θ ( e θk N δ N − − e − θδ N )+ 3 θ ( e θk N δ N − − e − θδ N ) − (cid:16) θ − e − θδ N δ N + − θ e − θδ N (cid:17) + 2 1 θ δ N ( e θδ N − − ( k N − e θk N δ N − k N e θδ N ) i + 2 θ k N δ N (1 − e θk N δ N ) o ; (5) C N = ( k N δ N ) − ( e − θ ∆ N (1 − e − θh )(1 − e − θ ∆ N ) − θ h ( E [ ν ( τ )] − α ) + γ θ (cid:16) α − ν ( τ ) (cid:17)i × n (1 + e θ ∆ N )(1 − e − θδ N ) − h e θk N δ N − − e − θδ N ) + 2(1 − e − θδ N )+ 2 e θk N δ N ( e − θδ N −
1) + 2 e θk N δ N − θδ N (1 − e − θδ N ) i − e θ ∆ N (1 − e θk N δ N ) o + (6 α δ N k N − α k N δ N ) h ∆ − N + e − θ ∆ N (1 − e − θh )(1 − e − θ ∆ N ) − nh αθ δ N ( E [ ν ( τ )] − α )( e θk N δ N −
1) + 2 αθ δ N ( E [ ν ( τ )] − α )( e θδ N − − ×× [( e θk N δ N − k N − k N e θδ N ) + k N e θk N δ N ( e θδ N −
1) + e θδ N (1 − e θk N δN )] i (1 + e θ ∆ N )+ 2 αθ k N δ N ( E [ ν ( τ )] − α )(1 + e θ ∆ N )(1 − e θk N δ N ) o) . (6)We now recall that for N → ∞ , ∆ N = O ( δ cN ), c ∈ (0 , , and k N = O ( δ bN ), b ∈ ( − , b ≥ − / c < − b or b < − / c < b , we havelim N → + ∞ k N ∆ N = 0and lim N → + ∞ k N δ N ∆ N = 0 . Expanding A N , B N , and C N as N → ∞ , one obtains • A N ∼ θk N ∆ N + θ ( k N δ N ) N − θ ( k N δ N ) − θ ∆ N + θ ( k N δ N ) ∆ N ; • B N ∼ θk N ∆ N + δ N ∆ N + k N − δ N k N ∆ N − k N δ N ∆ N − θδ N ∆ N + θδ N − θδ N k N + θ ∆ N + θ ∆ N k N + θδ N k N N − θ δ N ∆ N k N − θ δ N + θ δ N ∆ N − θ δ N ∆ N ; 38 C N ∼ − e − θh θ h ( E [ ν ( τ )] − α ) + γ θ ( α − E [ ν ( τ )]) ih θ k N ∆ N + θ δ N ∆ N + θ ∆ N + θ ∆ N k N + 3 θ ∆ N δ N i + α hk N ∆ N + α ( E [ ν ( τ )] − α )(1 − e − θh ) θk N ∆ N , from which we get Eq.(1).Based on the corresponding asymptotic expansions, one can easily check that as N → ∞ , if b ≥ − / c < − b or, alternatively, b < − / c < b , then A N → B N → C N → N → ∞ , if b ≥ − / c < − b or, alternatively, b < − / c < b ,then E [ P SRV [ τ,τ + h ] ,N ] = γ αhA N + γ ( E [ ν ( τ )] − α ) − e − θh h B N + C N converges to E [ h ν, ν i [ τ,τ + h ] ] = γ αh + γ ( E [ ν ( τ )] − α ) − e − θh θ , where the equivalence E [ h ν, ν i [ τ,τ + h ] ] = γ αh + γ ( E [ ν ( τ )] − α ) − e − θh θ is obtained from Appendix A in Bollerslev and Zhou (2002).In particular, one can easily verify that, as N → ∞ : • for b ≥ − / c < − b , A N − O (∆ N ) , B N − O (∆ N ) , C N = O (∆ N ) if c < − b/ , (7) A N − O (cid:16) k N ∆ N (cid:17) , B N − O (cid:16) k N ∆ N (cid:17) , C N = O (cid:16) k N ∆ N (cid:17) if − b/ ≤ c < − b ; (8) • for − / ≤ b < − / c < b , A N − O (∆ N ) , B N − O (∆ N ) , C N = O (∆ N ) if c < (1 + b ) / , (9) A N − O (∆ N ) , B N − O (cid:16) k N δ N ∆ N (cid:17) , C N = O (∆ N ) if (1 + b ) / ≤ c < − b/ , (10) A N − O (cid:16) k N ∆ N (cid:17) , B N − O (cid:16) k N δ N ∆ N (cid:17) , C N = O (cid:16) k N ∆ N (cid:17) if − b/ ≤ c < b ;(11) • 
for b < − / c < b , A N − O (∆ N ) , B N − O (∆ N ) , C N = O (∆ N ) if c < (1 + b ) / , (12) A N − O (∆ N ) , B N − O (cid:16) k N δ N ∆ N (cid:17) , C N = O (∆ N ) if (1 + b ) / ≤ c < b. (13)The proof is complete. Corollary 1Proof.
Let Assumption 1 hold. Based on Eq. (1) and the asymptotic rates of A N , B N and C N (seeEqs. (7) − (12)), we observe that: 39 for b ≥ − / , c < − b/ b < − / , c < (1 + b ) / b = − / , c < / E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = a λδ cN + o ( δ cN );- for b > − / , c ∈ ( − b, − b/ or b < − / , c ∈ ((1 + b ) / , b ) ,E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = a κλ δ − b − cN + o ( δ − b − cN );- for b ∈ ( − / , − / , c ∈ ((1 + b ) / , b ) ,E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = a κλ δ b − cN + o ( δ b − cN );- for b = − / c > / E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = a κλ δ / − cN + o ( δ / − cN );- for b = − / , c = 1 / ,E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = 1 λ δ / N ( a λ + a κ − + a κ ) + o ( δ / N );- for b = − / , c > / ,E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = 1 λ δ / − cN ( a κ − + a κ ) + o ( δ / − cN );- for b = − / c = 1 / E h P SRV [ τ,τ + h ] ,N − h ν, ν i [ τ,τ + h ] i = δ / N ( a λ + a κλ − ) + o ( δ / N ) . Thus, it is possible to select κ and λ such that the dominant term of the bias expansion is canceledonly when b = − / c ≥ / b = − / c = 1 /
6, provided that the selected values of κ and λ verify the condition W N ≤ ∆ N , which, under Assumption 1, is equivalent to κδ bN ≤ λδ cN . In allother cases, the dominant term of the bias can only be subtracted.The case b = − / c = 1 / ν (0) = α , which is equivalent to E [ ν ( τ )] = α . In fact, if E [ ν ( τ )] = α ,40hen a = 0 and it is not possible to cancel the leading term of the bias expansion through the selectionof κ and λ when b = − / c > / b = − / c = 1 / b = − / c = 1 / κ, ˜ λ ) to the following system a κ + a λ κ + a = 0 κ > W N ≤ ∆ N , where W N = κδ / N and ∆ N = λδ / N . If a solution (˜ κ, ˜ λ ) exists, the corresponding bias-optimalselection of W N and ∆ N reads W N = ˜ κδ / N , ∆ N = ˜ λδ / N . Theorem 2Proof.
Let Assumption 3 hold and consider the estimator:
\[
\widetilde{PSRV}_{[\tau,\tau+h],N} := \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \hat\nu(\tau+i\Delta_N) - \hat\nu(\tau+i\Delta_N-\Delta_N) \Big]^2 ,
\]
where, for $s$ taking values on the time grid of mesh size $\delta_N$:
- $\hat\nu(s) := \widetilde{RV}(s,k_N\delta_N)\,(k_N\delta_N)^{-1}$,
- $\widetilde{RV}(s,k_N\delta_N) := \sum_{j=1}^{k_N} \Delta\tilde p(s+j\delta_N-k_N\delta_N,\delta_N)^2$,
- $\Delta\tilde p(s) := \tilde p(s)-\tilde p(s-\delta_N) = p(s)+\eta(s)-p(s-\delta_N)-\eta(s-\delta_N) = \Delta p(s)+\Delta\eta(s)$.

We observe that
\[
\begin{aligned}
\widetilde{PSRV}_{[\tau,\tau+h],N}
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \widetilde{RV}(\tau+i\Delta_N,k_N\delta_N) - \widetilde{RV}(\tau+(i-1)\Delta_N,k_N\delta_N) \Big]^2 \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \sum_{j=1}^{k_N} \Delta\tilde p(\tau+i\Delta_N+(j-k_N)\delta_N)^2 - \sum_{j=1}^{k_N} \Delta\tilde p(\tau+(i-1)\Delta_N+(j-k_N)\delta_N)^2 \Big]^2 \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \sum_{j=1}^{k_N} \big( \Delta p(\tau+i\Delta_N+(j-k_N)\delta_N) + \Delta\eta(\tau+i\Delta_N+(j-k_N)\delta_N) \big)^2 \\
&\qquad\qquad - \sum_{j=1}^{k_N} \big( \Delta p(\tau+(i-1)\Delta_N+(j-k_N)\delta_N) + \Delta\eta(\tau+(i-1)\Delta_N+(j-k_N)\delta_N) \big)^2 \Big]^2 .
\end{aligned}
\]
To simplify the notation, we replace $\Delta p(\tau+i\Delta_N+(j-k_N)\delta_N)$ with $r(i,j,N)$ and $\Delta\eta(\tau+i\Delta_N+(j-k_N)\delta_N)$ with $\epsilon(i,j,N)$ and rewrite:
\[
\begin{aligned}
\widetilde{PSRV}_{[\tau,\tau+h],N}
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)+\epsilon(i,j,N) \big)^2 - \sum_{j=1}^{k_N} \big( r(i-1,j,N)+\epsilon(i-1,j,N) \big)^2 \Big]^2 \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)^2+\epsilon(i,j,N)^2+2r(i,j,N)\epsilon(i,j,N) \big) \\
&\qquad\qquad - \sum_{j=1}^{k_N} \big( r(i-1,j,N)^2+\epsilon(i-1,j,N)^2+2r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big]^2 \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) + \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \\
&\qquad\qquad + 2\sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big]^2 \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) \Big]^2 + \Big[ \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \Big]^2 \\
&\qquad + 4\Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big]^2 \\
&\qquad + 2\sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \\
&\qquad + 4\sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \\
&\qquad + 4\sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big\}.
\end{aligned}
\]
Based on the previous expression, we can split the expected value of $\widetilde{PSRV}_{[\tau,\tau+h],N}$ into the sum of the following six components:

i) $E[PSRV_{[\tau,\tau+h],N}]$,

ii) $(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \Big( \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \Big)^2 \Big]$,

iii) $4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \Big( \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big)^2 \Big]$,

iv) $2(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ E\Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) \Big]\, E\Big[ \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \Big] \Big\}$,

v) $4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \sum_{j=1}^{k_N} \big( r(i,j,N)^2-r(i-1,j,N)^2 \big) \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big]$,

vi) $4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big]$.

Note that under Assumption 3, $r$ is a martingale and $\epsilon$ is a mean-zero stationary process independent of $r$. Therefore components iv), v) and vi) are equal to zero. Moreover, note that the analytic expression of i) was already obtained in Theorem 1. Thus, in order to obtain the analytic expression of $E[\widetilde{PSRV}_{[\tau,\tau+h],N}]$ under Assumption 3, we only have to compute the analytic expressions of ii) and iii).

We start with ii).
We have:
\[
\begin{aligned}
&(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \Big( \sum_{j=1}^{k_N} \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big) \Big)^2 \Big] \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \sum_{j=1}^{k_N} E\Big[ \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big)^2 \Big] \\
&\qquad + 2\sum_{j=2}^{k_N}\sum_{h=1}^{j-1} E\Big[ \big( \epsilon(i,j,N)^2-\epsilon(i-1,j,N)^2 \big)\big( \epsilon(i,h,N)^2-\epsilon(i-1,h,N)^2 \big) \Big] \Big\} \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \sum_{j=1}^{k_N} E\Big[ \epsilon(i,j,N)^4+\epsilon(i-1,j,N)^4-2\epsilon(i,j,N)^2\epsilon(i-1,j,N)^2 \Big] \\
&\qquad + 2\sum_{j=2}^{k_N}\sum_{h=1}^{j-1} E\Big[ \epsilon(i,j,N)^2\epsilon(i,h,N)^2-\epsilon(i,j,N)^2\epsilon(i-1,h,N)^2-\epsilon(i-1,j,N)^2\epsilon(i,h,N)^2+\epsilon(i-1,j,N)^2\epsilon(i-1,h,N)^2 \Big] \Big\} \\
&= (k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \sum_{j=1}^{k_N} \Big( 2E[\epsilon(i,j,N)^4]-2E[\epsilon(i,j,N)^2]^2 \Big) \Big\}
= 4\big( Q_\eta+V_\eta^2 \big) \frac{h}{k_N\delta_N^2\Delta_N},
\end{aligned}
\]
since $\epsilon$ is an i.i.d. process such that $E[\epsilon(i,j,N)^2]=2V_\eta$ and $E[\epsilon(i,j,N)^4]=2Q_\eta+6V_\eta^2$, as one can easily check.

Then we move on to iii).
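The two noise moments quoted at the end of step ii) can be verified exactly for any discrete mean-zero noise law. The sketch below does so with rational arithmetic. The three-point law on {-1, 0, 1} and the helper names `moment` and `difference_moment` are illustrative assumptions only, with $V_\eta$ and $Q_\eta$ read as the second and fourth moments of the noise.

```python
from fractions import Fraction
from itertools import product


def moment(support, power):
    """E[eta**power] for a discrete law given as (value, probability) pairs."""
    return sum(p * x ** power for x, p in support)


def difference_moment(support, power):
    """E[(eta_1 - eta_2)**power] for two independent draws from the same law."""
    return sum(p1 * p2 * (x1 - x2) ** power
               for (x1, p1), (x2, p2) in product(support, repeat=2))


# Hypothetical mean-zero noise law on {-1, 0, 1}, in exact rational arithmetic.
eta = [
    (Fraction(-1), Fraction(1, 4)),
    (Fraction(0), Fraction(1, 2)),
    (Fraction(1), Fraction(1, 4)),
]

V = moment(eta, 2)  # second moment of the noise (its variance, since the mean is zero)
Q = moment(eta, 4)  # fourth moment of the noise

second = difference_moment(eta, 2)   # E[eps^2] for eps = eta_1 - eta_2
fourth = difference_moment(eta, 4)   # E[eps^4]
per_term = 2 * fourth - 2 * second ** 2  # summand appearing in step ii)
```

With this law, `second` equals `2 * V`, `fourth` equals `2 * Q + 6 * V**2`, and `per_term` equals `4 * (Q + V**2)`, which is the constant multiplying $h/(k_N\delta_N^2\Delta_N)$ in ii).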
First, we rewrite:
\[
\begin{aligned}
&4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \Big( \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big)^2 \Big] \\
&= 4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \sum_{j=1}^{k_N} E\Big[ \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big)^2 \Big] \\
&\qquad + 2\sum_{j=2}^{k_N}\sum_{h=1}^{j-1} E\Big[ \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big)\big( r(i,h,N)\epsilon(i,h,N)-r(i-1,h,N)\epsilon(i-1,h,N) \big) \Big] \Big\} \\
&= 4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big\{ \sum_{j=1}^{k_N} E\Big[ r(i,j,N)^2\epsilon(i,j,N)^2+r(i-1,j,N)^2\epsilon(i-1,j,N)^2-2r(i,j,N)r(i-1,j,N)\epsilon(i,j,N)\epsilon(i-1,j,N) \Big] \\
&\qquad + 2\sum_{j=2}^{k_N}\sum_{h=1}^{j-1} E\Big[ r(i,j,N)r(i,h,N)\epsilon(i,j,N)\epsilon(i,h,N)-r(i,j,N)r(i-1,h,N)\epsilon(i,j,N)\epsilon(i-1,h,N) \\
&\qquad\qquad - r(i-1,j,N)r(i,h,N)\epsilon(i-1,j,N)\epsilon(i,h,N)+r(i-1,j,N)r(i-1,h,N)\epsilon(i-1,j,N)\epsilon(i-1,h,N) \Big] \Big\}.
\end{aligned}
\]
Then we note that:

- $r$ is a martingale and is independent of $\epsilon$, therefore:
  - $E\big[ r(i,j,N)r(i-1,j,N)\epsilon(i,j,N)\epsilon(i-1,j,N) \big]=0$;
  - $E\big[ r(i,j,N)r(i,h,N)\epsilon(i,j,N)\epsilon(i,h,N)-r(i,j,N)r(i-1,h,N)\epsilon(i,j,N)\epsilon(i-1,h,N)-r(i-1,j,N)r(i,h,N)\epsilon(i-1,j,N)\epsilon(i,h,N)+r(i-1,j,N)r(i-1,h,N)\epsilon(i-1,j,N)\epsilon(i-1,h,N) \big]=0$;

- $\epsilon$ is stationary and independent of $r$, therefore:
\[
\begin{aligned}
E\big[ r(i,j,N)^2\epsilon(i,j,N)^2+r(i-1,j,N)^2\epsilon(i-1,j,N)^2 \big]
&= E\big[ \epsilon(i,j,N)^2 \big]\, E\big[ r(i,j,N)^2+r(i-1,j,N)^2 \big] \\
&= 2V_\eta\, E\Big[ \Big(\int_{a_{i,j-1,N}}^{a_{i,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 + \Big(\int_{a_{i-1,j-1,N}}^{a_{i-1,j,N}}\sqrt{\nu(s)}\,dW(s)\Big)^2 \Big] \\
&= 2V_\eta\, E\Big[ \int_{a_{i,j-1,N}}^{a_{i,j,N}}\nu(s)\,ds + \int_{a_{i-1,j-1,N}}^{a_{i-1,j,N}}\nu(s)\,ds \Big],
\end{aligned}
\]
where $a_{i,j,N} := i\Delta_N+(j-k_N)\delta_N$.

Therefore, we can rewrite component iii) as:
\[
\begin{aligned}
&4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \Big( \sum_{j=1}^{k_N} \big( r(i,j,N)\epsilon(i,j,N)-r(i-1,j,N)\epsilon(i-1,j,N) \big) \Big)^2 \Big] \\
&= 4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \sum_{j=1}^{k_N} E\Big[ r(i,j,N)^2\epsilon(i,j,N)^2+r(i-1,j,N)^2\epsilon(i-1,j,N)^2 \Big] \\
&= 4(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \sum_{j=1}^{k_N} 2V_\eta\, E\Big[ \int_{a_{i,j-1,N}}^{a_{i,j,N}}\nu(s)\,ds + \int_{a_{i-1,j-1,N}}^{a_{i-1,j,N}}\nu(s)\,ds \Big] \\
&= 8V_\eta(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} E\Big[ \int_{i\Delta_N-k_N\delta_N}^{i\Delta_N}\nu(s)\,ds + \int_{(i-1)\Delta_N-k_N\delta_N}^{(i-1)\Delta_N}\nu(s)\,ds \Big] \\
&= 8V_\eta(k_N\delta_N)^{-2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} \Big( \int_{i\Delta_N-k_N\delta_N}^{i\Delta_N}\big[ (E[\nu(\tau)]-\alpha)e^{-\theta s}+\alpha \big]\,ds + \int_{(i-1)\Delta_N-k_N\delta_N}^{(i-1)\Delta_N}\big[ (E[\nu(\tau)]-\alpha)e^{-\theta s}+\alpha \big]\,ds \Big) \\
&= \frac{8V_\eta(\alpha-E[\nu(\tau)])(1+e^{\theta\Delta_N})(1-e^{\theta k_N\delta_N})}{\theta k_N^2\delta_N^2} \sum_{i=1}^{\lfloor h/\Delta_N \rfloor} e^{-i\theta\Delta_N} + \frac{16\alpha V_\eta h}{k_N\delta_N\Delta_N} \\
&= \frac{8V_\eta(\alpha-E[\nu(\tau)])(1+e^{-\theta\Delta_N})(1-e^{\theta k_N\delta_N})(1-e^{-\theta h})}{\theta(1-e^{-\theta\Delta_N})k_N^2\delta_N^2} + \frac{16\alpha V_\eta h}{k_N\delta_N\Delta_N}.
\end{aligned}
\]
Finally, putting everything together, we have
\[
E[\widetilde{PSRV}_{[\tau,\tau+h],N}] = E[PSRV_{[\tau,\tau+h],N}] + D_N,
\]
where
\[
D_N := \big[ 4(Q_\eta+V_\eta^2)+16\alpha V_\eta\delta_N \big] \frac{h}{k_N\delta_N^2\Delta_N} + 8V_\eta(\alpha-E[\nu(\tau)])(1-e^{-\theta h}) \frac{(1+e^{-\theta\Delta_N})(1-e^{\theta k_N\delta_N})}{\theta(1-e^{-\theta\Delta_N})k_N^2\delta_N^2}.
\]
We now recall that, as $N\to\infty$, $\Delta_N=O(\delta_N^c)$, $c\in(0,1)$, and $k_N=O(\delta_N^b)$, $b\in(-1,0)$. Then:

- as $N\to\infty$, $E[PSRV_{[\tau,\tau+h],N}] \to E[\langle\nu,\nu\rangle_{[\tau,\tau+h]}]$ if $b\ge -1/2$ and $c<-b$, or $b<-1/2$ and $c<1+b$ (see Theorem 1);

- as $N\to\infty$,
\[
D_N \sim 4(Q_\eta+V_\eta^2)\frac{h}{k_N\delta_N^2\Delta_N} + 16\alpha V_\eta\frac{h}{k_N\delta_N\Delta_N} - 8V_\eta(\alpha-E[\nu(\tau)])(1-e^{-\theta h})\frac{1+e^{-\theta\Delta_N}}{\theta k_N\delta_N\Delta_N},
\]
so that $D_N\to\infty$ as $N\to\infty$ for any $(b,c)\in(-1,0)\times(0,1)$;

- as $N\to\infty$, $D_N$ is $O\big(\frac{1}{k_N\delta_N^2\Delta_N}\big)$ for any $(b,c)\in(-1,0)\times(0,1)$.

Therefore, as $N\to\infty$, if $b\ge -1/2$ and $c<-b$, or $b<-1/2$ and $c<1+b$, then $E[\widetilde{PSRV}_{[\tau,\tau+h],N}] = E[PSRV_{[\tau,\tau+h],N}] + D_N$ diverges, with rate $\frac{1}{k_N\delta_N^2\Delta_N}$. The proof is complete.

Theorem 3

Proof.
Recall from Definition 1 that, for $\tau$ with values on the price-sampling grid of mesh size $\delta_N$:
\[
\hat\nu(\tau) := (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} \big[ p(\tau-k_N\delta_N+j\delta_N)-p(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 .
\]
Moreover, from Appendix A in Bollerslev and Zhou (2002), we have under Assumption 2:
\[
E\Big[ \int_{\tau-\Delta}^{\tau}\nu(t)\,dt \Big] = \alpha\Delta + (E[\nu(0)]-\alpha)\,\theta^{-1}e^{-\theta\tau}(e^{\theta\Delta}-1)
\quad\text{and}\quad
E[\nu(\tau)] = \alpha + (E[\nu(0)]-\alpha)e^{-\theta\tau}.
\]
Therefore, under Assumption 2,
\[
\begin{aligned}
E[\hat\nu(\tau)-\nu(\tau)]
&= (k_N\delta_N)^{-1} E\Big[ \sum_{j=1}^{k_N} \big[ p(\tau-k_N\delta_N+j\delta_N)-p(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} E\Big[ \sum_{j=1}^{k_N} \Big[ \int_{\tau-k_N\delta_N+(j-1)\delta_N}^{\tau-k_N\delta_N+j\delta_N}\sqrt{\nu(t)}\,dW(t) \Big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} E\Big[ \Big[ \int_{\tau-k_N\delta_N+(j-1)\delta_N}^{\tau-k_N\delta_N+j\delta_N}\sqrt{\nu(t)}\,dW(t) \Big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} E\Big[ \int_{\tau-k_N\delta_N+(j-1)\delta_N}^{\tau-k_N\delta_N+j\delta_N}\nu(t)\,dt \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} E\Big[ \int_{\tau-k_N\delta_N}^{\tau}\nu(t)\,dt \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} \big[ \alpha k_N\delta_N + (E[\nu(0)]-\alpha)\,\theta^{-1}e^{-\theta\tau}(e^{\theta k_N\delta_N}-1) \big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (E[\nu(0)]-\alpha)e^{-\theta\tau} \big[ (\theta k_N\delta_N)^{-1}(e^{\theta k_N\delta_N}-1)-1 \big].
\end{aligned}
\]
Expanding this as $N\to\infty$, we can rewrite
\[
E[\hat\nu(\tau)-\nu(\tau)] = (E[\nu(0)]-\alpha)e^{-\theta\tau}\,\frac{\theta k_N\delta_N}{2} + o(k_N\delta_N).
\]
Furthermore, recall that $k_N\delta_N = O(\delta_N^{b+1})$ and $b\in(-1,0)$. Therefore $E[\hat\nu(\tau)-\nu(\tau)]$ converges to zero as $N\to\infty$, with rate $k_N\delta_N$.

Now let Assumption 3 hold and replace $p$ with $\tilde p$ in the definition of the locally averaged realized volatility, i.e., consider the estimator
\[
w(\tau) := (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} \big[ \tilde p(\tau-k_N\delta_N+j\delta_N)-\tilde p(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 .
\]
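The following holds, where the cross terms between price and noise increments have zero expectation because, under Assumption 3, the noise is mean-zero and independent of the price:
\[
\begin{aligned}
E[w(\tau)-\nu(\tau)]
&= (k_N\delta_N)^{-1} E\Big[ \sum_{j=1}^{k_N} \big[ \tilde p(\tau-k_N\delta_N+j\delta_N)-\tilde p(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} E\Big[ \sum_{j=1}^{k_N} \big[ p(\tau-k_N\delta_N+j\delta_N)+\eta(\tau-k_N\delta_N+j\delta_N) \\
&\qquad\qquad - p(\tau-k_N\delta_N+(j-1)\delta_N)-\eta(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} E\Big[ \sum_{j=1}^{k_N} \Big[ \int_{\tau-k_N\delta_N+(j-1)\delta_N}^{\tau-k_N\delta_N+j\delta_N}\sqrt{\nu(t)}\,dW(t) \Big]^2 + \sum_{j=1}^{k_N} \big[ \eta(\tau-k_N\delta_N+j\delta_N)-\eta(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] \\
&\qquad - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} E\Big[ \Big[ \int_{\tau-k_N\delta_N+(j-1)\delta_N}^{\tau-k_N\delta_N+j\delta_N}\sqrt{\nu(t)}\,dW(t) \Big]^2 \Big] \\
&\qquad + (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} E\Big[ \big[ \eta(\tau-k_N\delta_N+j\delta_N)-\eta(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] - \big[ \alpha+(E[\nu(0)]-\alpha)e^{-\theta\tau} \big] \\
&= (E[\nu(0)]-\alpha)e^{-\theta\tau} \big[ (\theta k_N\delta_N)^{-1}(e^{\theta k_N\delta_N}-1)-1 \big] + (k_N\delta_N)^{-1} \sum_{j=1}^{k_N} E\Big[ \big[ \eta(\tau-k_N\delta_N+j\delta_N)-\eta(\tau-k_N\delta_N+(j-1)\delta_N) \big]^2 \Big] \\
&= (E[\nu(0)]-\alpha)e^{-\theta\tau} \big[ (\theta k_N\delta_N)^{-1}(e^{\theta k_N\delta_N}-1)-1 \big] + (k_N\delta_N)^{-1}k_N(2V_\eta) \\
&= (E[\nu(0)]-\alpha)e^{-\theta\tau} \big[ (\theta k_N\delta_N)^{-1}(e^{\theta k_N\delta_N}-1)-1 \big] + 2V_\eta\delta_N^{-1} \\
&= E[\hat\nu(\tau)-\nu(\tau)] + 2V_\eta\delta_N^{-1}.
\end{aligned}
\]
Therefore, under Assumption 3, $E[w(\tau)-\nu(\tau)]$ diverges as $N\to\infty$, with rate $\delta_N^{-1}$. The proof is complete.

Theorem 4

Proof.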
Let Assumptions 1 and 2 hold. Moreover, let $W_N = k_N\delta_N > \Delta_N$, which is equivalent to $N < h\left(\frac{\kappa}{\lambda}\right)^{\frac{1}{1+b-c}}$, $c \neq 1+b$, under Assumption 1. Now note that for $W_N > \Delta_N$, the parametric expression of the PSRV bias obtained in Theorem 1 is not entirely valid, as the parametric expression of $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$ was computed under the assumption that $N$ is such that $W_N < \Delta_N$.

Therefore, to prove the result of this theorem, we have to: I) derive the parametric expression of $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$ under the assumption that $k_N\delta_N > \Delta_N$; II) plug the latter into the formula of the PSRV bias from Theorem 1, in place of the expression of $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$ obtained under the assumption that $W_N < \Delta_N$, and re-compute the parametric expression of the PSRV bias under the assumption that $W_N > \Delta_N$; III) perform the asymptotic expansion of the parametric expression of the bias.

Here, for brevity, we detail only steps I) and III), as step II) involves calculations that are analogous to those performed in the proof of Theorem 1.

I) $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$ when $W_N = k_N\delta_N > \Delta_N$

The parametric expression of $E[RV(\tau+i\Delta_N,k_N\delta_N)\,RV(\tau+i\Delta_N-\Delta_N,k_N\delta_N)]$ under the assumption that $W_N > \Delta_N$ is obtained as follows.
First we decompose E [ RV ( τ + i ∆ N , k N δ N ) RV ( τ + ( i − N , k N δ N )] into the sum of four components: E h RV ( τ + i ∆ N , k N δ N ) RV ( τ + ( i − N , k N δ N ) i = E h(cid:16) RV ( τ + i ∆ N , ∆ N ) + RV ( τ + ( i − N , k N δ N − ∆ N ) (cid:17)(cid:16) RV ( τ + ( i − N , k N δ N − ∆ N ) + RV ( τ + i ∆ N − k N δ N , ∆ N ) (cid:17)i = E h RV ( τ + i ∆ N , ∆ N ) RV ( τ +( i − N , k N δ N − ∆ N )+( RV ( τ + i ∆ N , ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N )+ RV ( τ + ( i − N , k N δ N − ∆ N ) + RV ( τ + ( i − N , k N δ N − ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N ) i = E h RV ( τ + i ∆ N , ∆ N ) RV ( τ + ( i − N , k N δ N − ∆ N ) i + E h ( RV ( τ + i ∆ N , ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N ) i + E h RV ( τ + ( i − N , k N δ N − ∆ N ) i + E h RV ( τ + ( i − N , k N δ N − ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N ) i . (14)We then obtain the parametric expressions of these four components, which we term O , O , O and O , respectively (we omit the intermediate steps, as are they are analogous to those followed in I) andIII) in the proof of Theorem 1):- O = E h ( RV ( τ + i ∆ N , ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N ) i = α ∆ N − ( E [ ν ( τ )] − α ) (cid:16) γ θ + α (cid:17) ∆ N θ e − θi ∆ N (cid:0) − e θ ∆ N (cid:1) − α ( E [ ν ( τ )] − α ) ∆ N θ e − θi ∆ N + θk N δ N (cid:0) − e θ ∆ N (cid:1) − γ α θ e − θk N δ N (2 − e − θ ∆ N − e θ ∆ N )+ h γ θ (cid:16) α − E [ ν ( τ )] (cid:17) + ( E [ ν ( τ )] − α ) i θ e − θi ∆ N (1 − e θ ∆ N ) e θk N δ N ;- O = E h RV ( τ + i ∆ N , ∆ N ) RV ( τ + ( i − N , k N δ N − ∆ N ) i = α ∆ N ( k N δ N − ∆ N ) + ( E [ ν ( τ )] − α ) (cid:16) γ θ + α (cid:17) ( k N δ N − ∆ N ) θ e − θi ∆ N ( e θ ∆ N − α ( E [ ν ( τ )] − α )∆ N θ e − θi ∆ N ( e θk N δ N − e θ ∆ N ) − γ α θ (1 − e θ ∆ N )( e − θ ∆ N − e − θk N δ N )+ h γ θ ( α − E [ ν ( τ )]) + ( E [ ν ( τ )] − α ) i θ e − θi ∆ N (1 − e θ ∆ N )( e θ ∆ N − e θk N δ N );- O = E h RV ( τ + ( i − N , k N δ N − ∆ N ) RV ( τ + i ∆ N − k N δ N , ∆ N ) i = α ∆ N ( k N δ N − ∆ N ) + ( E [ ν ( τ )] − 
α ) (cid:16) γ θ + α (cid:17) ∆ N θ e − θi ∆ N ( e θk N δ N − e θ ∆ N )+ α ( E [ ν ( τ )] − α )( k N δ N − ∆ N ) θ e − θi ∆ N + θk N δ N ( e θ ∆ N − − γ α θ ( e θ ∆ N − e θk N δ N ) e − θk N δ N (1 − e − θ ∆ N )+ h γ θ (cid:16) α − E [ ν ( τ )] (cid:17) + ( E [ ν ( τ )] − α ) i θ e − θi ∆ N + θk N δ N (1 − e θ ∆ N )( e θ ∆ N − e θk N δ N );- O = E h RV ( τ + ( i − N , k N δ N − ∆ N ) i = ( k N δ N − ∆ N ) 1 δ N h γ θ α (cid:16) θ (1 − e − θδ N ) (cid:17) + 1 θ − δ N e − θδ N − θ e − θδ N + δ N (1 + 2 e − θδ N ) + 12 θ ( e − θδ N + 4 e − θδ N −
5) + 3 α δ N i e − θδ N (1 − e − θ ( k N δ N − ∆ N ) )(1 − e − θδ N ) − h θ (1 − e − θδ N ) e − θi ∆ N +2 θ (1+ k N ) δ N (cid:16) E [ ν ( τ )] − γ θ E [ ν ( τ )] − αE [ ν ( τ )] + γ α θ + α (cid:17)i + 3 θ e − θδ N (1 − e − θ ( k N δ N − ∆ N ) )(1 − e − θδ N ) − e − θi ∆ N + θ (1+ k N ) δ N h γ θ ( E [ ν ( τ )] − α )(1 − e − θδ N ) + γ ( E [ ν ( τ )] − α ) (cid:16) θ − δ N e − θδ N − θ e − θδ N (cid:17) + 2 α ( E [ ν ( τ )] − α )(1 − e − θδ N ) θδ N i + α (cid:16) k N δ N − ∆ N (cid:17) − α δ N (cid:16) δ N k N − ∆ N (cid:17) + 2 θ e − θi ∆ N +2 θk N δ N − θδ N (1 − e θδ N ) ( e θδ N − − (1 − e − θδ N ) − (1 − e − θδ N ) − × h γ θ ( E [ ν ( τ )] − α ) + (cid:16) α − E [ ν ( τ )] (cid:17)i × h e θδ N (cid:16) e − θ ( k n δ N − ∆ N ) − e − θ ( k N δ N − ∆ N ) (cid:17) + (cid:16) − e − θ ( k N δ N − ∆ N ) (cid:17) + e − θδ N (cid:16) e − θ ( k N δ N − ∆ N ) − (cid:17)i − γ α θ (2 − e − θδ N − e θδ N ) (cid:16) e − θ ( k N δ N − ∆ N ) − k N δ N − ∆ N ) δ − N − ( k N δ N − ∆ N ) δ − N e − θδ N (cid:17) × ( e − θδ N + e θδ N − − − θ (cid:16) γ θ + α (cid:17) ( E [ ν ( τ )] − α ) δ N (1 − e θδ N ) e − θi ∆ N + θ ∆ N ( e θδ N − − × (cid:16) e θ ( k N δ N − ∆ N ) −
1) + ( k N δ N − ∆ N ) δ − N − ( k N δ N − ∆ N ) δ − N e θδ N (cid:17) − αθ ( E [ ν ( τ )] − α ) δ N (1 − e θδ N ) e − θi ∆ N + θk N δ N ( e θδ N − − × [( k N δ N − ∆ N ) δ − N ( e θδ N − e θδ N ( e − θ ( k N δ N − ∆ N ) − . Roughly speaking, the contribution to the PSRV finite-sample bias due to the overlapping of con-secutive local windows to estimate the spot volatility (i.e., due to assuming that W N = k N δ N > ∆ N )is mainly due to the terms O , O , and O . In fact, when k N δ N = ∆ N (i.e., W N = ∆ N ), the terms O , O , and O are equal to zero, while the term O reduces to the quantity in Eq. (3). Interestingly, theterms O , O , and O are functions of the quantity ( k N δ N − ∆ N ) (i.e., W N − ∆ N ) and, in particular,they are O ( k N δ N − ∆ N ) as ( k N δ N − ∆ N ) → + . III) Asymptotic expansion of the bias when W N = k N δ N > ∆ N Once the exact parametric expression for the PSRV bias under the assumption that k N δ N > ∆ N has been obtained from step II), we expand it sequentially, first as λ →
0, and then as h →
0, to obtain: $E\big[PSRV_{[\tau,\tau+h],N} - \langle \nu, \nu \rangle_{[\tau,\tau+h]}\big]$ = E [ ν ( τ )] k δ bN − γ E [ ν ( τ )] ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ E [ ν ( τ )] h + O ( h − b ) + O ( λ ) if b < − / , c < b , as λ → 0, h →
0. In particular, from Appendix A in Bollerslev and Zhou (2002), we know that $E[\nu(t)] = \alpha + (\nu(0) - \alpha) e^{-\theta t}$. Therefore, if $\nu(0) = \alpha$, then $E[\nu(t)] = \alpha$ for all $t$, and, consequently: $E\big[PSRV_{[\tau,\tau+h],N} - \langle \nu, \nu \rangle_{[\tau,\tau+h]}\big]$ = α k δ bN − γ α ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ αh + O ( h − b ) + O ( λ ) if b < − / , c < b , as λ → 0, h → 0.

Let $(\mathcal{F}^{\nu}_t)_{t \ge 0}$ denote the natural filtration associated with the process $\nu$. It is straightforward to see that $E\big[PSRV_{[\tau,\tau+h],N} - \langle \nu, \nu \rangle_{[\tau,\tau+h]} \mid \mathcal{F}^{\nu}_{\tau}\big]$ = ν ( τ ) k δ bN − γ ν ( τ ) ! h + O ( h − b ) + O ( λ ) if b ≥ − / , c < − b − γ ν ( τ ) h + O ( h − b ) + O ( λ ) if b < − / , c < b , as λ → 0, h → 0.

Mathematica. The code is available as supplementary material.
Appendix B Indirect inference of the CKLS parameters
In order to obtain estimates of the CKLS parameters relevant for the bias-optimal selection of $\kappa$ according to the rule derived in Section 3 and generalized in Section 4, we proceed as follows.

First, we estimate the spot volatility using the fast Fourier transform algorithm, following the procedure detailed in Appendix B.5 of Sanfelici et al. (2015). In particular, from a one-year sample of one-minute log-price observations, we obtain estimates of the spot volatility on the grid of mesh size $\Delta_M := \frac{1}{2M+1}$, where $M$ denotes the cutting frequency used to reconstruct the Fourier coefficients of the spot volatility.

Then, using $\hat{\nu}_i$, $i = 1, 2, \ldots, 2M+1$, to denote the obtained spot volatility estimates, we infer the values of the volatility parameters under Assumption 4 by applying the following zero-intercept multivariate regression:

$$Y = \alpha \theta \Delta_M X_1 - \theta \Delta_M X_2 + \gamma \sqrt{\Delta_M}\, Z,$$

where $Z$ is a vector of independent standard normal random variables, and the dependent variable $Y$ and the independent variables $X_1$, $X_2$ are defined as

$$Y_i := \frac{\hat{\nu}_{i+1} - \hat{\nu}_i}{\hat{\nu}_i^{\beta}}, \qquad X_{1,i} := \hat{\nu}_i^{-\beta}, \qquad X_{2,i} := \hat{\nu}_i^{1-\beta}.$$

In particular, using $a_1$ and $a_2$ to denote the estimates of the regression coefficients and $v$ the estimate of the residual volatility, $\gamma \sqrt{\Delta_M}$, we have:

$$\hat{\theta} = -a_2 / \Delta_M, \qquad \hat{\alpha} = a_1 / (\hat{\theta} \Delta_M), \qquad \hat{\gamma} = v / \sqrt{\Delta_M}.$$
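The regression step above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation (their code is in Mathematica): the function names `estimate_ckls_parameters` and `simulate_ckls_euler` are hypothetical, the elasticity $\beta$ is assumed to be known, and the Euler simulator stands in for the Fourier spot volatility estimates, which are not reproduced here.

```python
import numpy as np


def estimate_ckls_parameters(nu_hat, delta, beta):
    """Zero-intercept least-squares estimates of (theta, alpha, gamma).

    nu_hat : strictly positive spot volatility estimates on an equally spaced grid
    delta  : mesh size of the grid (Delta_M in the text)
    beta   : CKLS elasticity, assumed known in this sketch
    """
    nu_hat = np.asarray(nu_hat, dtype=float)
    nu = nu_hat[:-1]                                   # nu_i for i = 1, ..., n-1
    y = np.diff(nu_hat) / nu ** beta                   # Y_i = (nu_{i+1} - nu_i) / nu_i^beta
    X = np.column_stack((nu ** (-beta), nu ** (1.0 - beta)))  # X_1, X_2; no intercept column

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)       # a_1, a_2
    a1, a2 = coef
    resid = y - X @ coef
    v = np.sqrt(resid @ resid / (len(y) - 2))          # residual standard deviation

    theta_hat = -a2 / delta
    alpha_hat = a1 / (theta_hat * delta)
    gamma_hat = v / np.sqrt(delta)
    return theta_hat, alpha_hat, gamma_hat


def simulate_ckls_euler(theta, alpha, gamma, beta, nu0, delta, n, seed=0):
    """Hypothetical helper: Euler path of d(nu) = theta (alpha - nu) dt + gamma nu^beta dW."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n - 1)
    nu = np.empty(n)
    nu[0] = nu0
    for i in range(n - 1):
        step = (nu[i] + theta * (alpha - nu[i]) * delta
                + gamma * nu[i] ** beta * np.sqrt(delta) * z[i])
        nu[i + 1] = max(step, 1e-12)                   # floor keeps the powers of nu defined
    return nu


# Illustrative check on synthetic data (all parameter values are arbitrary)
delta = 1.0 / 20000
path = simulate_ckls_euler(theta=5.0, alpha=0.04, gamma=0.3, beta=0.5,
                           nu0=0.04, delta=delta, n=20000, seed=1)
theta_hat, alpha_hat, gamma_hat = estimate_ckls_parameters(path, delta, beta=0.5)
print(theta_hat, alpha_hat, gamma_hat)
```

On a single simulated path of this length, the estimate of $\gamma$ is typically tight, whereas the estimates of $\theta$ and $\alpha$ can be considerably noisier, since they rely on the drift, which is only weakly identified over a short horizon.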