Time-Bridge Estimators of Integrated Variance
A. Saichev, D. Sornette
ETH Zurich – Department of Management, Technology and Economics, Switzerland
Swiss Finance Institute, 40, Boulevard du Pont-d'Arve, Case Postale 3, 1211 Geneva 4, Switzerland
Nizhni Novgorod State University – Department of Mathematics, Russia
E-mail addresses: [email protected] & [email protected]
Abstract
We present a set of log-price integrated variance estimators, equal to the sum of open-high-low-close bridge estimators of spot variances within n subsequent time-step intervals. The main characteristic of some of the introduced estimators is that they take into account the information on the occurrence times of the high and low values. The use of the highs and lows of the bridge associated with the original process makes the estimators significantly more efficient than the standard realized variance estimator and its generalizations. Adding the information on the occurrence times of the high and low values further improves the efficiency of the estimators, well above that of the well-known realized variance estimator and of the estimators derived from the sum of Garman and Klass spot variance estimators. The exact analytical results are derived for the case where the underlying log-price process is an Itô stochastic process. Our results suggest more efficient ways to record financial prices at intermediate frequencies.

Didier Sornette, Department of Management, Technology and Economics (D-MTEC, KPL F38.2), ETH Zurich, Kreuzplatz 5, CH-8032 Zurich, Switzerland
Introduction
The integrated variance is a crucial risk indicator of the stochastic log-price process within specific time intervals. Most of the existing high-frequency integrated variance estimators are modifications of the well-known realized volatility (see, for instance, Andersen et al. (2003), Aït-Sahalia (2005), Zhang et al. (2005)), and are based on the knowledge of the open and close prices of n time-step intervals dividing the whole time interval of interest. Another common practice to estimate the variance of a log-price process is to use not two (open-close) log-prices within a given time-step, but four values, the so-called open-high-low-close (OHLC) of the log-prices. Well-known examples are the Garman and Klass (G&K) (1980) and Parkinson (Park) (1980) spot variance estimators.

The main goal of this paper is to demonstrate the efficiency of bridge OHLC integrated variance estimators, which use the knowledge of the high and low values of the bridge process derived from the original log-price process, as well as possibly the random occurrence times of these extrema within each time-step interval. We compare the efficiencies of these time-OHLC bridge estimators with the efficiency of the standard realized variance and with the efficiency of the integrated variance estimators based on the G&K estimators of the variance within each elementary time-step interval. We show that some time-OHLC integrated variance estimators achieve a very significant improvement in efficiency compared with the realized variance and the G&K integrated variance estimators. Another remarkable property of the proposed time-OHLC bridge estimators is that they depend much less on the drift of the log-price process than the realized variance and G&K integrated variance estimators. This has the great advantage of essentially removing the biases that affect the standard estimators, given that the drift (expected return) is in general the most poorly constrained statistical variable.
We compare the efficiencies of the introduced integrated variance estimators using the Itô process as our workhorse to model the stochastic behavior of log-prices.

Present databases record either all prices associated with transactions or prune the data to keep the OHLC at given time steps, for instance, seconds, minutes or days. The latter records, giving the OHLC of the realized log-prices, do not allow the reconstruction of the OHLC (and even less the occurrence times of the highs and lows) for the associated bridge process in each elementary interval. Of course, one could construct the OHLC and any other useful information from the full time series of all transaction prices. But then, one could question the value of deriving new estimators based on a reduced information set. Therefore, the present paper can be considered as a normative exercise to learn about the fundamental limits of integrated variance estimators. Our results are also useful in suggesting more efficient ways to record financial prices at intermediate frequencies: instead of recording the OHLC at the daily scale for instance, we propose that data centers and vendors should store the open and close of the real log-price and the high and low of the corresponding bridge in each day (or at any other chosen frequency). Our calculations below show that this information, which has the same cost and is as easy to obtain at the end of the day from the high frequency data, provides much more efficient estimators of the variance that can be stored for future use. The same conclusion holds true for other risk measures beyond variance such as higher order moments, but this is not explored in the present paper.

The paper is organized as follows. Section 2 describes the properties of the well-known realized variance estimator, which we need in order to compare its efficiency with the efficiencies of the suggested time-OHLC bridge integrated variance estimators.
Section 3 is devoted to the discussion of the efficiencies of the simple bridge integrated variance estimators, illustrating the comparative efficiency and unbiasedness of the bridge integrated variance estimators. This section, written in a pedagogical style, gradually introduces the reader to the area of homogeneous most efficient variance estimators. Section 4 provides a detailed analysis of the efficiency of the OHL and time-OHLC bridge integrated variance estimators, which turn out to be significantly more efficient than the realized variance and the G&K integrated variance estimators. Section 5 describes the results of numerical simulations demonstrating the comparative efficiency of the proposed estimators. Section 6 concludes. The paper is completed by three appendices. Appendix A presents the essential properties of the canonical bridge. Appendix B derives the joint probability density function (pdf) of the high value and of its occurrence time. Appendix C derives and gives the statistical properties of the joint distribution of the high and low values and of the occurrence time of the last extremum for the canonical bridge.

Henceforth, we assume that the log-price $X(t)$ of a given security follows an Itô process
$$dX(t) = \mu(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = X_0, \tag{1}$$
where $W(t)$ is a realization of the standard Wiener process, $\mu(t)$ is the drift process, and $\sigma^2(t)$ is the instantaneous variance of the log-price process $X(t)$.

Definitions and basic properties of realized variance

Let us first provide some basic definitions and properties.
Definition 1
The integrated variance of the process $X(t)$ within the time interval $t \in (0, T)$ is
$$D(T) := \int_0^T \sigma^2(t)\,dt. \tag{2}$$
Definition 2
The spot variance is defined within the time-step interval
$$S_i : (t_{i-1}, t_i], \tag{3}$$
by
$$\hat{D}_{\rm real}\{X(t) : t \in S_i\} := (X_i - X_{i-1})^2, \qquad X_i := X(t_i), \quad t_i := i\Delta, \quad \Delta = \frac{T}{n}. \tag{4}$$
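As a minimal numerical sketch of Definitions 1 and 2, assume a Wiener process with constant drift and volatility simulated on a fine grid. The function names, the seed and the parameter values below are illustrative choices, not part of the paper. The spot variances (4) are computed from the sampled log-prices $X_0, \dots, X_n$, and their sum is compared with the integrated variance (2), which reduces to $\sigma^2 T$ for constant $\sigma$.

```python
import numpy as np

def simulate_log_price(mu, sigma, T, n_fine, rng):
    """Euler path of dX = mu dt + sigma dW on n_fine equal steps, with X(0) = 0."""
    dt = T / n_fine
    increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_fine)
    return np.concatenate(([0.0], np.cumsum(increments)))

def spot_realized_variances(x, n):
    """Squared increments (X_i - X_{i-1})^2 over n equal time steps, as in (4)."""
    step = (len(x) - 1) // n          # fine-grid points per time step
    sampled = x[::step][: n + 1]      # X_0, X_1, ..., X_n
    return np.diff(sampled) ** 2

rng = np.random.default_rng(0)        # fixed seed, illustrative only
T, sigma, n = 1.0, 0.2, 1000
x = simulate_log_price(mu=0.0, sigma=sigma, T=T, n_fine=100 * n, rng=rng)
spot = spot_realized_variances(x, n)
integrated_variance = sigma**2 * T    # D(T) in (2) for constant sigma
print(spot.sum(), integrated_variance)
```

With zero drift, the sum of the spot variances should fluctuate around $\sigma^2 T$ with a relative spread of order $\sqrt{2/n}$.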
Definition 3
The well-known statistical estimator of the integrated variance is the so-called realized variance, defined as
$$[X, X]_T := \sum_{i=1}^{n} \hat{D}_{\rm real}\{X(t) : t \in S_i\}. \tag{5}$$
Remark 1
For Itô processes (1) and for $n \to \infty$, it is well-known that the realized variance converges in probability to the integrated one.

However, for real data, the number n of available data points is always limited, ultimately by the discreteness of the transaction flow and the associated microstructure noise. Such structures, which are not taken into account in the Itô log-price model, can be neglected in the use of the realized variance estimator if the discrete time step $\Delta$ is much larger than the inverse of the mean frequency $\nu$ of the tick-by-tick transactions, so that $n \ll \nu T$.

Assumption 1
While $\Delta \gg 1/\nu$, we assume that $\Delta$ is sufficiently small in comparison with the time scales over which the drift process $\mu(t)$ and the instantaneous variance $\sigma^2(t)$ vary, so that one may replace the original Itô process (1) by Wiener processes with drift
$$dX_i(t) \simeq \mu_i\,dt + \sigma_i\,dW(t), \qquad X_i(t_{i-1}) = X_{i-1}, \quad t \in S_i, \quad \mu_i = \text{const}, \quad \sigma_i = \text{const}. \tag{6}$$

Consider the special case of the Wiener process with drift
$$X(t, \mu, \sigma) = \mu t + \sigma W(t). \tag{7}$$
Using the scale-invariance property of the Wiener process, the following identity holds in law (represented by the symbol $\sim$)
$$\hat{D}_{\rm real}\{X(t, \mu, \sigma) : t \in S_i\} \sim \sigma^2 \Delta\,[\gamma + W(1)]^2 = \sigma^2 \Delta \cdot X^2(1; \gamma), \tag{8}$$
where
$$X(t; \gamma) = \gamma t + W(t), \qquad \gamma = \frac{\mu}{\sigma}\sqrt{\Delta}, \qquad t \in (0, 1), \tag{9}$$
is the canonical Wiener process with drift. Applying the identity in law (8) to the realized variance expression (5), (4), we obtain
$$[X, X]_T \sim \Delta \sum_{i=1}^{n} \sigma_i^2 (\gamma_i + W_i)^2, \tag{10}$$
where $\{W_i\}$ are iid Gaussian variables $N(0, 1)$. The expected value of the realized variance is therefore
$$\mathrm{E}\big[[X, X]_T\big] = \Delta \sum_{i=1}^{n} \sigma_i^2 (1 + \gamma_i^2), \qquad \gamma_i = \frac{\mu_i}{\sigma_i}\sqrt{\Delta}. \tag{11}$$
This recovers the well-known fact that the realized variance is in general biased for non-zero drift, and is unbiased only for zero drift ($\mu(t) \equiv 0$).

The essential idea of the present work is that it is possible to improve on the realized variance estimator of the integrated variance, for a fixed number $n \ll \nu T$ of time-steps of duration $\Delta$, by replacing it by
$$\hat{D}_{\rm est}(T) = \sum_{i=1}^{n} \hat{D}_{\rm est}\{X(t) : t \in S_i\}, \tag{12}$$
where the functional $\hat{D}_{\rm est}\{X(t) : t \in S_i\}$ is an improved estimator of the spot variance given by definition 2. The subscript est is used to refer to some particular estimator, and the subscript real means that this estimator reduces to the realized variance estimator.

Definition 4
The estimator $\hat{D}_{\rm est}(T)$ defined by (12) is said to be unbiased if, for all intervals $i = 1, \dots, n$,
$$\mathrm{E}\big[\hat{D}_{\rm est}\{X(t) : t \in S_i\}\big] = \Delta \cdot \sigma_i^2, \tag{13}$$
which implies
$$\mathrm{E}\big[\hat{D}_{\rm est}(T)\big] = \Delta \sum_{i=1}^{n} \sigma_i^2. \tag{14}$$
When there exists at least one interval j such that condition (13) does not hold, the estimator is considered biased.

Let $\hat{D}_{\rm est}(T)$ be some unbiased variance estimator. We propose to quantify its efficiency in terms of the coefficient of variation
$$\rho[\hat{D}_{\rm est}(T)] = \frac{\sqrt{\mathrm{Var}[\hat{D}_{\rm est}(T)]}}{\mathrm{E}[\hat{D}_{\rm est}(T)]}. \tag{15}$$
As an illustration, the coefficient of variation of the realized variance for a Wiener process with zero drift ($\mu(t) \equiv 0$) is equal to
$$\rho\big[[X, X]_T\big] = \sqrt{2 \sum_{i=1}^{n} \sigma_i^4} \Bigg/ \sum_{i=1}^{n} \sigma_i^2. \tag{16}$$
We will need the following theorem:

Theorem 2.1
The lower bound of the function
$$f(s) := \sqrt{\sum_{i=1}^{n} s_i^2} \Bigg/ \sum_{i=1}^{n} s_i, \qquad s = \{s_1, s_2, \dots, s_n\}, \quad \forall s_i > 0, \tag{17}$$
is equal to
$$\rho(n) := \inf_{\forall s_i > 0} f(s) = \frac{1}{\sqrt{n}}. \tag{18}$$
This lower bound is attained iff all $s_i$ are identical: $s_i \equiv s > 0$.

Proof. Let $\{s_i\}$ be a realization of some random variable S with probabilities $\Pr\{S = s_i\} = 1/n$, $i = 1, \dots, n$. The expected and mean square values of the random variable S are equal to
$$\mathrm{E}[S] = \frac{1}{n} \sum_{i=1}^{n} s_i, \qquad \mathrm{E}[S^2] = \frac{1}{n} \sum_{i=1}^{n} s_i^2. \tag{19}$$
Since, for any random variable S, the inequality $\sqrt{\mathrm{E}[S^2]} \geq \mathrm{E}[S]$ holds, and since $f(s) = \sqrt{\mathrm{E}[S^2]} \big/ \big(\sqrt{n}\,\mathrm{E}[S]\big)$, this implies $f(s) \geq 1/\sqrt{n}$. The inequality becomes an equality iff all $s_i \equiv s$ for any $s > 0$. ∎

Applying this theorem to the right-hand side of expression (16) shows that $\rho[[X, X]_T]$ satisfies the inequality
$$\rho\big[[X, X]_T\big] \geq \rho_{\rm real}(n), \qquad \rho_{\rm real}(n) = \sqrt{\frac{2}{n}}, \tag{20}$$
where the lower bound $\rho_{\rm real}(n)$ of the efficiency is attained only if all $\{\sigma_i\}$ are identical.

Below, we will compare the efficiencies of different estimators via the comparison of their lower bounds
$$\rho_{\rm est}(n) = \inf_{\forall \sigma_i} \rho[\hat{D}_{\rm est}(T)]. \tag{21}$$

An important motivation for the introduction of a new class of so-called "realized bridge variance estimators" is to obtain much smaller biases than that of the realized variance (5) observed for nonzero drift $\mu(t)$.

Definition 5
The bridge $Y(t, S_i)$ in discrete time steps of the original process $X(t)$ is defined by
$$Y(t, S_i) := X(t) - X_{i-1} - \frac{t - t_{i-1}}{\Delta}\,(X_i - X_{i-1}), \qquad t \in S_i, \tag{22}$$
where $X_i := X(t_i)$, $t_i := i\Delta$ and $\Delta = T/n$.

As an example, let $X(t)$ be the Wiener process with drift $X(t, \mu, \sigma)$ defined by (7). Using the transition and scale invariance properties of the Wiener process leads to
$$Y(t, S_i) \sim \sigma\sqrt{\Delta}\,\big(W(\zeta) - \zeta W(1)\big), \qquad \zeta = \frac{t - t_{i-1}}{\Delta} \in (0, 1). \tag{23}$$
This means that the bridge $Y(t, S_i)$ (22) is identical in law to
$$Y(t, S_i) \sim \sigma\sqrt{\Delta}\,Y(\zeta), \tag{24}$$
where
$$Y(t) := W(t) - t \cdot W(1), \qquad t \in (0, 1), \tag{25}$$
is the canonical bridge, whose basic properties are given in Appendix A.

Remark 2
The canonical bridge $Y(t)$ is completely independent of the drift $\mu$. This property is the fundamental reason for the better performance of the bridge variance estimators compared with the realized variance: the biases and efficiencies of bridge variance estimators do not depend on the drift $\mu$.

In the following, we explore the statistical properties of the bridge variance estimators
$$\hat{D}_{\rm est}(T) = \sum_{i=1}^{n} \hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\}, \tag{26}$$
obtained from the general expression (12) by replacing the initial process $\{X(t_i)\}$ by its corresponding bridge $\{Y(t_i, S_i)\}$.

Definition 6
The estimator (26) is called homogeneous if, when applied to the Wiener processes with drift (6), the following identity in law holds
$$\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm est}, \tag{27}$$
where
$$\hat{d}_{\rm est} := \hat{D}_{\rm est}\{Y(t) : t \in (0, 1)\} \tag{28}$$
is the canonical estimator of the spot variance depending on the canonical bridge $Y(t)$ (25). Obviously, the estimator (26) is unbiased if and only if $\mathrm{E}[\hat{d}_{\rm est}] = 1$.

Theorem 3.2
Under Assumption 1, the lower bound of the efficiency of the unbiased homogeneous integrated bridge variance estimator (27) is
$$\rho_{\rm est}(n) = \sqrt{\mathrm{Var}[\hat{d}_{\rm est}] \big/ n}, \tag{29}$$
where $\mathrm{Var}[\hat{d}_{\rm est}]$ is the variance of the canonical spot variance estimator $\hat{d}_{\rm est}$ (28).

Proof. Under Assumption 1, the unbiased homogeneous bridge variance estimator (26) is identical in law to
$$\hat{D}_{\rm est}(T) \sim \Delta \sum_{i=1}^{n} \sigma_i^2 \cdot \hat{d}^{\,i}_{\rm est}, \tag{30}$$
where $\{\hat{d}^{\,i}_{\rm est}\}$ are iid random variables with mean value $\mathrm{E}[\hat{d}_{\rm est}] = 1$ and variance $\mathrm{Var}[\hat{d}_{\rm est}]$. Accordingly, the expected value and variance of the unbiased bridge variance estimator (26) are equal to
$$\mathrm{E}[\hat{D}_{\rm est}(T)] = \Delta \sum_{i=1}^{n} \sigma_i^2, \qquad \mathrm{Var}[\hat{D}_{\rm est}(T)] = \Delta^2\,\mathrm{Var}[\hat{d}_{\rm est}] \sum_{i=1}^{n} \sigma_i^4. \tag{31}$$
Substituting these relations into (15), we obtain
$$\rho[\hat{D}_{\rm est}(T)] = \sqrt{\mathrm{Var}[\hat{d}_{\rm est}] \sum_{i=1}^{n} \sigma_i^4} \Bigg/ \sum_{i=1}^{n} \sigma_i^2. \tag{32}$$
Using theorem 2.1, this yields the result (29). ∎

Our first example of a homogeneous bridge variance estimator is
$$\hat{D}_{\rm simple}(T) = \sum_{i=1}^{n} \hat{D}_{\rm simple}\{Y(t, S_i) : t \in S_i\}, \tag{33}$$
where the estimator of the spot variance is given by
$$\hat{D}_{\rm simple}\{Y(t, S_i) : t \in S_i\} = A\,Y^2(t_i(\eta), S_i), \qquad t_i(\eta) = t_{i-1} + \eta \cdot \Delta, \quad \eta \in (0, 1), \tag{34}$$
and A is a normalizing factor. The estimator $\hat{D}_{\rm simple}(T)$ is homogeneous and, if relations (6) are valid, then
$$Y^2(t_i(\eta), S_i) \sim \sigma_i^2 \Delta \cdot Y_i^2(\eta), \tag{35}$$
where $\{Y_i(\eta)\}$ are iid random variables that are identical in law to the canonical bridge (25). Substituting relation (35) into (33) leads to the identity in law
$$\hat{D}_{\rm simple}(T) \sim A \Delta \sum_{i=1}^{n} \sigma_i^2\,Y_i^2(\eta). \tag{36}$$
The fact that the canonical bridge $Y(\eta)$ is Gaussian with zero mean and variance $\mathrm{E}[Y^2(\eta)] = \eta(1 - \eta)$ implies that the estimator (33) is unbiased in the sense of definition 4 if $A = 1 \big/ \eta(1 - \eta)$. Accordingly, the variance of the estimator (33) is equal to the variance of the realized variance obtained for zero drift ($\mu(t) \equiv 0$):
$$\mathrm{Var}[\hat{D}_{\rm simple}(T)] = 2\Delta^2 \sum_{i=1}^{n} \sigma_i^4. \tag{37}$$
This result means that the lower bound of the efficiency of the simplest bridge estimator (33) is equal to the lower bound of the efficiency of the realized variance estimator at zero drift:
$$\rho_{\rm simple}(n) = \rho_{\rm real}(n) = \sqrt{\frac{2}{n}}. \tag{38}$$
The shortcoming of the estimator (33) is that it is actually less efficient than the realized variance at zero drift, in a sense discussed below.

Definition 7
Let the estimator of the spot variance $\hat{D}_{\rm est}\{X(t) : t \in S_i\}$ or $\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\}$ depend on $\kappa_{\rm est}$ values of the process $X(t)$ or $Y(t, S_i)$ at $\kappa_{\rm est}$ time points within the time interval $t \in S_i$. The corresponding estimators of the realized volatility $\hat{D}_{\rm est}(T)$ (12) or (33) then use a total number $n_{\rm eff} = \kappa_{\rm est} \cdot n$ of time-steps.

Example 1
The realized variance corresponds to $\kappa_{\rm real} = 1$. Indeed, the two values $\{X_{i-1}, X_i\}$ are used to estimate the spot realized variance (4), and the first value is excluded from the semi-closed interval $S_i$ (3).

Example 2
For the simplest bridge estimator (33) with (34), $\kappa_{\rm simple} = 2$. Indeed, the estimator (34) depends on the bridge $Y(t_i(\eta), S_i)$ for $t_i(\eta) \in S_i$, and $Y(t_i(\eta), S_i)$ (22) is defined by the open and close values $\{X_{i-1}, X_i\}$ of the original stochastic process $X(t)$. Excluding the open value, this yields $\kappa_{\rm simple} = 2$.

Example 3
Consider the Garman & Klass (G&K) variance estimator based on open, high, low and close prices, used as the spot variance estimator in expression (12):
$$\hat{D}_{\rm GK}\{X(t) : t \in S_i\} = k_1 (H_i - L_i)^2 - k_2 \big(C_i (H_i - L_i) - H_i L_i\big) - k_3 C_i^2, \qquad k_1 = 0.\ldots, \quad k_2 = 0.\ldots, \quad k_3 = 0.\ldots, \tag{39}$$
where $\{O_i, C_i, H_i, L_i\}$ are the open, close, high and low values
$$O_i = X_{i-1}, \qquad C_i = X_i, \qquad H_i = \sup_{t \in S_i} [X(t) - O_i], \qquad L_i = \inf_{t \in S_i} [X(t) - O_i].$$
Excluding the open value leads to $\kappa_{\rm GK} = 3$.

Definition 8
We characterize the efficiencies of the novel variance estimators by comparing them with that of the standard realized variance estimator. The corresponding comparative efficiency $R_{\rm est}$ is constructed as the ratio of the lower bounds of the efficiencies of the realized variance and of the novel variance estimator:
$$R_{\rm est} = \frac{\rho_{\rm real}(\kappa_{\rm est} \cdot n)}{\rho_{\rm est}(n)}. \tag{40}$$
Substituting into this expression $\rho_{\rm real}(n)$ given by equation (20) and $\rho_{\rm est}(n)$ given by expression (29) yields
$$R_{\rm est} = \sqrt{\frac{2}{\kappa_{\rm est} \cdot \mathrm{Var}[\hat{d}_{\rm est}]}}. \tag{41}$$

Remark 3
For a given duration T used to define the integrated variance (2), relation (41) takes into account that the typical waiting time between successive data samples is given by $\Delta_{\rm eff} \simeq T / n_{\rm eff}$. This waiting time should be approximately the same for the different generalized variance estimators proposed below, leading to similar distortions to the adequacy of the Itô process (1) in its ability to describe the real price process in the presence of discrete tick-by-tick and other microstructure noise.

Example 4
Let us come back to the simple variance estimator based on expression (34) for $\hat{D}_{\rm simple}\{Y(t, S_i) : t \in S_i\}$. The result (38) is equivalent to $\mathrm{Var}[\hat{d}_{\rm simple}] = 2$. Substituting this value in (41) yields
$$R_{\rm simple} = \frac{1}{\sqrt{\kappa_{\rm simple}}} = \frac{1}{\sqrt{2}} \simeq 0.707. \tag{42}$$
The efficiency of the simplest bridge estimator is smaller than that of the realized variance.

Example 5
Let us evaluate the comparative efficiency of the generalized realized variance estimator based on the spot G&K variance estimator in the case of zero drift $\mu(t) \equiv 0$. It is known that the variance of the spot G&K variance estimator given by (39) is equal to
$$\mathrm{Var}\big[\hat{D}_{\rm GK}\{X(t) : t \in S_i\}\big] = \sigma_i^4 \Delta^2 \cdot 0.\ldots \quad \Rightarrow \quad \mathrm{Var}[\hat{d}_{\rm GK}] = 0.\ldots \tag{43}$$
This gives
$$R_{\rm GK} = \sqrt{\frac{2}{\kappa_{\rm GK} \cdot 0.\ldots}} = \sqrt{\frac{2}{3 \cdot 0.\ldots}} \simeq \ldots \tag{44}$$
Therefore, for zero drift, the G&K realized variance estimator is approximately 1.6 times more efficient than the realized variance estimator.

The fact that the G&K realized variance estimator based on open-high-low-close prices is significantly more efficient than the standard realized variance, at least for the Itô process $X(t)$ (1) with zero drift $\mu(t) \equiv 0$, suggests studying other estimators using different combinations of the open-high-low-close prices. Let us start by analyzing the simplest case of what we will refer to as the "high bridge variance estimator", defined through its spot variance given by
$$\hat{D}_{\rm high}\{Y(t, S_i) : t \in S_i\} = A \cdot H_i^2, \tag{45}$$
where A is a normalizing factor and
$$H_i = \sup_{t \in S_i} Y(t, S_i) \tag{46}$$
is the high value of the bridge $Y(t, S_i)$. Note that we use here the same notation for the high value of the bridge $Y(t, S_i)$ as for that of the original process $X(t)$, hoping that this will not give rise to any confusion.

It follows from (24) that
$$\hat{D}_{\rm high}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm high}, \qquad \hat{d}_{\rm high} = A H^2, \tag{47}$$
where the high value H of the canonical bridge $Y(t)$ (25) has the following probability density function (pdf)
$$\varphi_{\rm high}(h) = 4h\,e^{-2h^2}, \qquad h > 0. \tag{48}$$
The derivation of the pdf (48) is given in Jeanblanc et al. (2009) (see also the derivations presented in Appendix B). Accordingly, the expected value and the variance of the square of H are given by
$$\mathrm{E}[H^2] = \frac{1}{2}, \qquad \mathrm{Var}[H^2] = \frac{1}{4}. \tag{49}$$
In order for the high spot bridge variance estimator to be unbiased, we have to choose in (45) the value $A = 2$ for the normalizing factor. This gives $\mathrm{Var}[\hat{d}_{\rm high}] = 1$. With $\kappa_{\rm high} = 2$, we find that the comparative efficiency (41) of the high bridge realized variance estimator is $R_{\rm high} = 1$. Thus, the high bridge realized variance estimator has the same efficiency as the standard realized variance. But the advantage of the former is that, under Assumption 1, it is unbiased for any drift $\mu(t) \neq 0$.

Remark 4
Let us give the intuition for the above result, obtained despite the larger value of $\kappa_{\rm high} = 2$ compared to $\kappa_{\rm real} = 1$. The reason is that the pdf of the random variable $2H^2$ is narrower than that of the random variable $W^2$ defining the spot realized variance at zero drift. The same reason underlies the comparative efficiency of the G&K as well as the other high and low bridge realized variance estimators discussed below. The narrowness of the pdfs of highs and lows compared with the pdfs of the increments of the original stochastic process $X(t)$ results from a weak version of the Law of Large Numbers, in the sense that the highs and lows incorporate significant additional information about the underlying process within a given time-step, thus leading to narrower pdfs.

We now introduce a novel ingredient to further improve the estimation of the variance. In addition to using the high $H_i$ of the bridge $Y(t, S_i)$, we also assume that the time $t^i_{\rm high}$ of the occurrence of this high is recorded:
$$t^i_{\rm high} : \quad H_i = Y(t^i_{\rm high}, S_i). \tag{50}$$
The corresponding time-high bridge spot variance estimator is given by
$$\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\} = A \cdot s\!\left(\frac{t^i_{\rm high} - t_{i-1}}{\Delta}\right) \cdot H_i^2, \tag{51}$$
where A is a normalizing factor, while $s(t)$, $t \in (0, 1)$, is some function that remains to be determined so as to make the above spot variance estimator as efficient as possible. Before providing the solution of this problem, let us note that the following identity in law follows from (24):
$$\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm est}, \tag{52}$$
where
$$\hat{d}_{\rm est} = A \cdot s(t_{\rm high}) \cdot H^2 \tag{53}$$
is the canonical time-high bridge estimator of the spot variance, H is the high value of the canonical bridge $Y(t)$ (25), and $t_{\rm high}$ is the corresponding time-point (50).

The expected value of the canonical estimator (53) is equal to
$$\mathrm{E}[\hat{d}_{\rm est}] = A \int_0^1 s(t)\,\alpha(t; 2)\,dt, \qquad \alpha(t; \lambda) := \int_0^\infty h^\lambda\,\varphi_{\rm high}(h, t)\,dh, \tag{54}$$
where $\varphi_{\rm high}(h, t)$ is the joint pdf of H and $t_{\rm high}$. Taking
$$A = 1 \Bigg/ \int_0^1 s(t)\,\alpha(t; 2)\,dt, \tag{55}$$
we obtain an unbiased time-high canonical bridge estimator:
$$\hat{d}_{\rm est} = \frac{s(t_{\rm high})\,H^2}{\int_0^1 s(t)\,\alpha(t; 2)\,dt}. \tag{56}$$
Its variance is
$$\mathrm{Var}[\hat{d}_{\rm est}] = \frac{\int_0^1 s^2(t)\,\alpha(t; 4)\,dt}{\left(\int_0^1 s(t)\,\alpha(t; 2)\,dt\right)^2} - 1. \tag{57}$$

Theorem 3.3
The function $s(t)$ that minimizes the variance (57) of the unbiased time-high canonical bridge estimator (56) is
$$s_{\text{t-high}}(t) = \frac{\alpha(t; 2)}{\alpha(t; 4)}. \tag{58}$$
The corresponding minimal variance is equal to
$$\mathrm{Var}\!\left[\frac{s_{\text{t-high}}(t_{\rm high})\,H^2}{\int_0^1 s_{\text{t-high}}(t)\,\alpha(t; 2)\,dt}\right] = \inf_{\forall s(t)} \mathrm{Var}[\hat{d}_{\rm est}] = \frac{1}{E_{\text{t-high}}} - 1, \qquad E_{\text{t-high}} = \int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\,dt. \tag{59}$$

Proof. We use the Schwarz inequality
$$\left(\int A(t) B(t)\,dt\right)^2 \leq \int A^2(t)\,dt \int B^2(t)\,dt \tag{60}$$
with
$$A(t) = s(t)\sqrt{\alpha(t; 4)}, \qquad B(t) = \frac{\alpha(t; 2)}{\sqrt{\alpha(t; 4)}}, \tag{61}$$
to obtain
$$\left(\int_0^1 s(t)\,\alpha(t; 2)\,dt\right)^2 \leq \int_0^1 s^2(t)\,\alpha(t; 4)\,dt \int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\,dt. \tag{62}$$
After simple transformations, we rewrite the last inequality in the form
$$\mathrm{Var}[\hat{d}_{\text{t-high}}] = \frac{\int_0^1 s^2(t)\,\alpha(t; 4)\,dt}{\left(\int_0^1 s(t)\,\alpha(t; 2)\,dt\right)^2} - 1 \geq \frac{1}{\int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\,dt} - 1. \tag{63}$$
The equality in (63) is reached by substituting in it $s(t) = s_{\text{t-high}}(t)$ given by expression (58). ∎

The joint pdf of H and $t_{\rm high}$ is derived in Appendix B and reads
$$\varphi_{\rm high}(h, t) = \sqrt{\frac{2}{\pi}}\,\frac{h^2}{\sqrt{t^3 (1 - t)^3}}\,\exp\!\left(-\frac{h^2}{2t(1 - t)}\right), \qquad h > 0, \quad t \in (0, 1). \tag{64}$$
Substituting this expression for $\varphi_{\rm high}(h, t)$ into (54) yields
$$\alpha(t; \lambda) = \frac{2}{\sqrt{\pi}}\,[2t(1 - t)]^{\lambda/2}\,\Gamma\!\left(\frac{3 + \lambda}{2}\right). \tag{65}$$
In particular, $\alpha(t; 2) = 3t(1 - t)$ and $\alpha(t; 4) = 15\,t^2 (1 - t)^2$. Therefore,
$$s_{\text{t-high}}(t) = \frac{1}{5t(1 - t)}, \qquad E_{\text{t-high}} = \frac{3}{5} \quad \Rightarrow \quad \mathrm{Var}[\hat{d}_{\text{t-high}}] = \frac{2}{3}, \tag{66}$$
and
$$R_{\text{t-high}} = \sqrt{\frac{3}{2}} \simeq 1.22. \tag{67}$$
Thus, the time-high bridge realized variance estimator is less efficient than the corresponding G&K estimator at zero drift, but is more efficient than the realized variance.

Remark 5
The numerical result (67) takes into account that the use of $t^i_{\rm high}$ does not increase the number of sample values used in the spot estimator (51). Thus, $\kappa_{\text{t-high}} = \kappa_{\rm high} = 2$.

Bridge time-high-low estimators
Definition 9
The bridge realized variance estimator (26) that uses as spot variance estimator
$$\hat{D}_{\rm bPark}\{Y(t, S_i) : t \in S_i\} = A \cdot (H_i - L_i)^2 \tag{68}$$
is called the bridge Parkinson estimator. In expression (68), $H_i$ and $L_i$ are the high and low values of the bridges $Y(t, S_i)$ (22).

The bridge Parkinson estimator is identical in law to
$$\hat{D}_{\rm bPark}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm bPark}, \qquad \hat{d}_{\rm bPark} = A \cdot (H - L)^2, \tag{69}$$
where H, L are the high and low values of the canonical bridge $Y(t)$ (25). The joint pdf of H and L has been derived by Saichev et al. (2009) and reads
$$\varphi(h, \ell) = \sum_{m = -\infty}^{\infty} m\,\big[m\,I(m(h - \ell)) + (1 - m)\,I(m(h - \ell) + \ell)\big], \qquad I(h) = 4(4h^2 - 1)\,e^{-2h^2}. \tag{70}$$
It will be clear below that it is convenient to describe the joint statistical properties of the high H and low L by using polar coordinates
$$H = R\cos\Theta, \qquad L = R\sin\Theta, \qquad R \in (0, \infty), \quad \Theta \in \left(-\frac{\pi}{2}, 0\right). \tag{71}$$
Accordingly, we rewrite the canonical estimator (69) in the form
$$\hat{d}_{\rm bPark} = A R^2 (1 - \sin 2\Theta). \tag{72}$$
Choosing the constant A that makes the estimator (69) unbiased, we obtain
$$\hat{d}_{\rm bPark} = \frac{R^2 (1 - \sin 2\Theta)}{\int_{-\pi/2}^{0} (1 - \sin 2\theta)\,\alpha(\theta; 2)\,d\theta}, \qquad \alpha(\theta; \lambda) = \int_0^\infty r^{\lambda + 1}\,\varphi(r\cos\theta, r\sin\theta)\,dr. \tag{73}$$
Substituting expression (70) yields
$$\alpha(\theta; \lambda) = \sum_{m = -\infty}^{\infty} m\,\big[m\,\beta(m(\cos\theta - \sin\theta); \lambda) + (1 - m)\,\beta(m(\cos\theta - \sin\theta) + \sin\theta; \lambda)\big],$$
$$\beta(y; \lambda) = \frac{C(\lambda)}{|y|^{2 + \lambda}}, \qquad C(\lambda) = \frac{1 + \lambda}{\sqrt{2^{\lambda}}}\,\Gamma\!\left(1 + \frac{\lambda}{2}\right). \tag{74}$$
The variance of the canonical bridge Parkinson estimator is equal to
$$\mathrm{Var}[\hat{d}_{\rm bPark}] = \frac{\int_{-\pi/2}^{0} (1 - \sin 2\theta)^2\,\alpha(\theta; 4)\,d\theta}{\left(\int_{-\pi/2}^{0} (1 - \sin 2\theta)\,\alpha(\theta; 2)\,d\theta\right)^2} - 1 \simeq \ldots \tag{75}$$
Substituting this value into (41) and taking into account that $\kappa_{\rm bPark} = 3$ for the bridge Parkinson estimator, we obtain the comparative efficiency
$$\mathrm{Var}[\hat{d}_{\rm bPark}] = 0.\ldots, \quad \kappa_{\rm bPark} = 3 \quad \Rightarrow \quad R_{\rm bPark} \simeq \ldots, \tag{76}$$
which means that the bridge Parkinson estimator is significantly more efficient than the G&K estimator at zero drift.

Remark 6
We stress that the canonical estimator $\hat{d}_{\rm bPark}$ is significantly different from the well-known canonical Parkinson estimator (see Parkinson (1980))
$$\hat{d}_{\rm Park} = (H - L)^2, \tag{77}$$
where H and L are the high and low values of the canonical Wiener process with drift $X(t; \gamma)$ (9). In contrast with the bridge Parkinson estimator (73), which is unbiased for any $\gamma$, the standard Parkinson estimator is biased at nonzero drift. Moreover, the variance of the standard Parkinson estimator at zero drift is
$$\mathrm{Var}[\hat{d}_{\rm Park}] \simeq \ldots, \tag{78}$$
which is approximately twice the variance of the bridge Parkinson estimator (76).

Until now, we have considered homogeneous (in the sense of definition 6) high-low estimators that are quadratic functions of the high and low values. We now consider the more general class of homogeneous estimators, whose spot variance estimators have the form
$$\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i\} = D_{\rm est}(H_i, L_i), \tag{79}$$
where $D_{\rm est}(h, \ell)$ is an arbitrary homogeneous function of second order.

Example 6
To illustrate the notion of non-quadratic homogeneous functions of second order, consider the typical example
$$D_{\rm est}(H_i, L_i) = (H_i - L_i)\sqrt{H_i^2 + L_i^2}, \tag{80}$$
which satisfies the scaling property
$$D_{\rm est}(\delta \cdot H_i, \delta \cdot L_i) \equiv \delta^2 \cdot D_{\rm est}(H_i, L_i) \qquad \forall \delta > 0. \tag{81}$$
The following theorem states that the spot variance estimator (79) satisfies the relations (27), (28) of definition 6 for homogeneous estimators.

Theorem 4.4
The spot variance estimator (79) is homogeneous in the sense of definition 6.
Proof.
Let $H_i$ and $L_i$ be the high and low values of the bridge $Y(t, S_i)$. Due to relation (24) and Assumption 1, the following identity in law holds
$$\{H_i, L_i\} \sim \sigma_i\sqrt{\Delta} \cdot \{H, L\}, \tag{82}$$
where $\{H, L\}$ are the high and low values of the canonical bridge $Y(t)$ (25). Substituting this last relation into (79) yields
$$D_{\rm est}(H_i, L_i) \sim D_{\rm est}(\sigma_i\sqrt{\Delta}\,H, \sigma_i\sqrt{\Delta}\,L). \tag{83}$$
Using the homogeneity of the function $D_{\rm est}(h, \ell)$, we rewrite the previous relation in the form
$$D_{\rm est}(H_i, L_i) \sim \sigma_i^2 \Delta \cdot D_{\rm est}(H, L), \tag{84}$$
which is analogous to expression (27), where the canonical estimator of the spot variance is equal to
$$\hat{d}_{\rm est} = D_{\rm est}(H, L). \tag{85}$$
∎

Using the polar coordinates (71), the canonical estimator $\hat{d}_{\rm est}$ reads
$$\hat{d}_{\rm est} = D_{\rm est}(R\cos\Theta, R\sin\Theta). \tag{86}$$
Using the homogeneity of the function $D_{\rm est}$, we obtain
$$\hat{d}_{\rm est} = R^2 \cdot s(\Theta), \qquad s(\theta) = D_{\rm est}(\cos\theta, \sin\theta). \tag{87}$$
Its expected value is equal to
$$\mathrm{E}[\hat{d}_{\rm est}] = \int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta; 2)\,d\theta, \tag{88}$$
where the function $\alpha(\theta; \lambda)$ is given by the equality (73). Thus, the unbiased homogeneous non-quadratic canonical estimator reads
$$\hat{d}_{\rm est} = \frac{R^2 s(\Theta)}{\int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta; 2)\,d\theta} \quad \Rightarrow \quad \mathrm{E}[\hat{d}_{\rm est}] = 1. \tag{89}$$
Accordingly, the variance of the unbiased estimator is equal to
$$\mathrm{Var}[\hat{d}_{\rm est}] = \frac{\int_{-\pi/2}^{0} s^2(\theta)\,\alpha(\theta; 4)\,d\theta}{\left(\int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta; 2)\,d\theta\right)^2} - 1. \tag{90}$$
One can easily prove the result analogous to theorem 3.3 that the minimum value of the variance (90) of the canonical estimator (89) with respect to all possible functions $s(\theta)$ is given by
$$\mathrm{Var}[\hat{d}_{\rm me}] = \inf_{\forall s(\theta)} \mathrm{Var}[\hat{d}_{\rm est}] = \frac{1}{E_{\rm me}} - 1, \qquad E_{\rm me} = \int_{-\pi/2}^{0} \frac{\alpha^2(\theta; 2)}{\alpha(\theta; 4)}\,d\theta, \tag{91}$$
where $\hat{d}_{\rm est}$ is an arbitrary homogeneous canonical estimator of the form (89), while $\hat{d}_{\rm me}$ is the corresponding most efficient estimator given by
$$\hat{d}_{\rm me} = \frac{1}{E_{\rm me}}\,R^2 s_{\rm me}(\Theta), \qquad s_{\rm me}(\theta) = \frac{\alpha(\theta; 2)}{\alpha(\theta; 4)}. \tag{92}$$
Calculating the numerical value of the integral in expression (91) yields
$$\mathrm{Var}[\hat{d}_{\rm me}] = 0.\ldots, \quad \kappa_{\rm me} = 3 \quad \Rightarrow \quad R_{\rm me} \simeq \ldots, \tag{93}$$
which shows a high efficiency compared with the standard realized variance.

Let us consider the unbiased homogeneous time-high-low canonical estimator
$$\hat{d}_{\rm est} = \frac{R^2 s(\Theta, t_{\rm last})}{\int_0^1 dt \int_{-\pi/2}^{0} d\theta\; s(\theta, t)\,\alpha_{\rm last}(\theta, t; 2)}, \tag{94}$$
where $s(\theta, t)$ is an arbitrary function, $t_{\rm last} = \sup\{t_L, t_H\}$ is the larger of the two times at which the high and low values of the canonical bridge occur, and $\alpha_{\rm last}(\theta, t; \lambda)$ is given by (C.17) in Appendix C.3.

It is easy to prove the result analogous to theorem 3.3 that the most efficient estimator of the form (94) is
$$\hat{d}_{\text{t-me}} = \frac{R^2}{E_{\text{t-me}}}\,\frac{\alpha_{\rm last}(\Theta, t_{\rm last}; 2)}{\alpha_{\rm last}(\Theta, t_{\rm last}; 4)}, \qquad E_{\text{t-me}} = \int_0^1 dt \int_{-\pi/2}^{0} d\theta\; \frac{\alpha_{\rm last}^2(\theta, t; 2)}{\alpha_{\rm last}(\theta, t; 4)}, \tag{95}$$
and the variance of this estimator is equal to
$$\mathrm{Var}[\hat{d}_{\text{t-me}}] = \frac{1}{E_{\text{t-me}}} - 1. \tag{96}$$
The numerical calculation of $E_{\text{t-me}}$ gives
$$\mathrm{Var}[\hat{d}_{\text{t-me}}] \simeq \ldots, \quad \kappa_{\text{t-me}} = 3 \quad \Rightarrow \quad R_{\text{t-me}} \simeq \ldots \tag{97}$$
The estimator of the realized variance based on the canonical estimator (95) is significantly more efficient than that based on the G&K estimator at zero drift.

Until now, we have not used explicitly the information contained in the close values $X_i$ (4) of the time-step intervals $S_i$ (3). The close values $X_i$ have been used only for the construction of the bridge $Y(t, S_i)$ (22). It seems plausible that taking into account explicitly the close values $X_i$ in the construction of spot variance estimators may produce bridge realized variance estimators $\hat{D}_{\rm est}(T) = \sum_{i=1}^{n} \hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i; X_i\}$ that are even more efficient than those considered until now. We show that this is indeed the case by studying the example associated with the spot variance estimator given by
$$\hat{D}_{\rm est}\{Y(t, S_i) : t \in S_i; X_i\} = D_{\rm est}(H_i, L_i, X_i), \tag{98}$$
where $D_{\rm est}(h, \ell, x)$ is an arbitrary homogeneous function satisfying relation (81).
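The inputs of an estimator of the form (98) can be extracted from a finely sampled path as in the following minimal sketch. It assumes a simulated Wiener process with constant drift and volatility; the function name `bridge_high_low_close`, the seed, the parameter values, and the choice to return the close as the increment $X_i - X_{i-1}$ are illustrative assumptions, not part of the paper. As a sanity check, the highs are fed into the high bridge spot estimator (45) with $A = 2$, and the result is printed next to the realized variance, which is biased by the drift according to (11).

```python
import numpy as np

def bridge_high_low_close(x, n):
    """Per-interval bridge high H_i, low L_i and close increment X_i - X_{i-1}.

    x holds the log-price on a fine grid with (len(x) - 1) divisible by n.
    The bridge (22) subtracts the straight line joining open and close.
    """
    m = (len(x) - 1) // n                       # fine steps per interval
    zeta = np.arange(m + 1) / m                 # (t - t_{i-1}) / Delta in [0, 1]
    highs, lows, closes = np.empty(n), np.empty(n), np.empty(n)
    for i in range(n):
        seg = x[i * m : (i + 1) * m + 1]        # X(t) on the closed interval
        increment = seg[-1] - seg[0]
        bridge = seg - seg[0] - zeta * increment
        highs[i], lows[i], closes[i] = bridge.max(), bridge.min(), increment
    return highs, lows, closes

rng = np.random.default_rng(1)                  # fixed seed, illustrative only
T, mu, sigma, n, m = 1.0, 5.0, 0.2, 1000, 1000
dt = T / (n * m)
steps = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n * m)
x = np.concatenate(([0.0], np.cumsum(steps)))

H, L, C = bridge_high_low_close(x, n)
high_bridge_estimate = 2.0 * np.sum(H**2)       # spot estimator (45) with A = 2
realized_variance = np.sum(C**2)                # biased upward by the drift
print(high_bridge_estimate, realized_variance, sigma**2 * T)
```

Because the high is taken over a discrete grid, it can only underestimate the supremum of the continuous bridge, so the high bridge estimate in this sketch is expected to fall slightly below $\sigma^2 T$; refining the grid reduces this effect.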
Due to its homogeneity, the following identity in law holds true

$$D_{est}(H_i, L_i, X_i) \sim \sigma_i^2 \Delta \cdot \hat d_{est}, \qquad \hat d_{est} = D_{est}(H, L, X), \tag{99}$$

where $H$ and $L$ are the high and low values of the canonical bridge (25), while $X = \gamma + W$ is the close value of the underlying canonical Wiener process with drift (9). It is known (see, for instance, Jeanblanc et al. (2009)) that the canonical bridge $Y(t)$ and $W$ are statistically independent. Thus, the joint pdf $\varphi(h, \ell, x; \gamma)$ of the three random variables $\{H, L, X\}$ is equal to

$$\varphi(h, \ell, x; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x - \gamma)^2}{2}\right) \varphi(h, \ell), \tag{100}$$

where the joint pdf $\varphi(h, \ell)$ of the high and low values is given by expression (70).

Analogously to (86), it is convenient to represent the canonical estimator $\hat d_{est}$ (99) in the spherical coordinate system

$$H = R \cos\Upsilon \cos\Theta, \quad L = R \cos\Upsilon \sin\Theta, \quad X = R \sin\Upsilon, \qquad \Upsilon \in (-\pi/2, \pi/2), \quad \Theta \in (-\pi/2, 0). \tag{101}$$

The canonical estimator $\hat d_{est}$ (99) then takes the form

$$\hat d_{est} = R^2 s(\Theta, \Upsilon), \tag{102}$$

where

$$s(\Theta, \Upsilon) = D_{est}(\cos\Upsilon \cos\Theta, \cos\Upsilon \sin\Theta, \sin\Upsilon). \tag{103}$$

Analogously to (94) and (95), the unbiased most efficient high-low-close canonical estimator is given by

$$\hat d_{me\text{-}x} = \frac{1}{E_{me\text{-}x}}\, R^2 s_{me\text{-}x}(\Theta, \Upsilon; \gamma), \qquad s_{me\text{-}x}(\theta, \upsilon; \gamma) = \frac{\alpha(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{104}$$

The function $\alpha(\theta, \upsilon; \lambda; \gamma)$ is defined by the equality

$$\alpha(\theta, \upsilon; \lambda; \gamma) = \int_0^\infty r^{\lambda + 2}\, \varphi(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon; \gamma)\, dr. \tag{105}$$

The variance of the most efficient canonical estimator $\hat d_{me\text{-}x}$ is equal to

$$\mathrm{Var}\big[\hat d_{me\text{-}x}\big] = \frac{1}{E_{me\text{-}x}} - 1, \tag{106}$$

with

$$E_{me\text{-}x} = \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{107}$$

The calculation of the integral (107) for $\gamma = 0$ gives

$\mathrm{Var}[\hat d_{me\text{-}x}] \simeq$ . , $\kappa_{me\text{-}x} = 3$, $\Rightarrow R_{me\text{-}x} \simeq$ . . (108)

This estimator is definitely better than the most efficient time-high-low canonical estimator, as can be seen by comparing (108) with (97).
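The independence of the bridge and the close value, on which the factorization (100) rests, can be checked numerically. The following is only a minimal sketch under stated assumptions: the bridge is built with the standard construction $Y(t) = X(t) - t X(1)$ on a regular grid, the function name `bridge_high_low_close` and all parameter values are hypothetical, and a vanishing sample correlation is a necessary consequence of independence, not a proof of it.

```python
import numpy as np

def bridge_high_low_close(n_paths=2000, n_steps=500, gamma=0.0, seed=0):
    """Return high and low of the discretized bridge and the close X(1).

    Assumption: X(t) = gamma * t + W(t) on [0, 1], and the bridge is
    Y(t) = X(t) - t * X(1), sampled on a regular grid of n_steps points.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    t = np.arange(1, n_steps + 1) * dt
    w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
    x = gamma * t + w
    close = x[:, -1]
    # The drift term gamma * t cancels in the bridge.
    y = x - t * close[:, None]
    # The bridge starts and ends at 0, so high >= 0 and low <= 0.
    high = np.maximum(y.max(axis=1), 0.0)
    low = np.minimum(y.min(axis=1), 0.0)
    return high, low, close

high, low, close = bridge_high_low_close()
print("corr(H, X)     =", np.corrcoef(high, close)[0, 1])
print("corr(H^2, X^2) =", np.corrcoef(high**2, close**2)[0, 1])
# The pdf (B.8) of the bridge high implies E[H^2] = 1/2 in continuous time;
# a discrete grid slightly underestimates the high.
print("mean of H^2    =", np.mean(high**2))
```

Running the same function with a nonzero `gamma` and the same seed returns the same highs and lows up to rounding, which illustrates why estimators built only on bridge extrema do not depend on the drift.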
The last example we present here is the realized variance estimator that uses in each interval $S_i$ the high and low values $H_i$, $L_i$ of the bridge $Y(t, S_i)$ (22), the close value $X_i$ of the original stochastic process $X(t)$ and the time instant $t^i_{last} = \sup\{t^i_L, t^i_H\}$, defined as the larger of the two times at which the high and low values of the bridge occur.

One can rigorously prove that, analogously to (104), the homogeneous time-OHLC bridge canonical estimator that is most efficient for some given value of $\gamma$ is equal to

$$\hat d_{t\text{-}me\text{-}x}(\Theta, \Upsilon, t_{last}; \gamma) = R^2 s_{t\text{-}me\text{-}x}(\Theta, \Upsilon, t_{last}; \gamma), \qquad s_{t\text{-}me\text{-}x}(\theta, \upsilon, t; \gamma) = \frac{1}{E_{t\text{-}me\text{-}x}(\gamma)}\, \frac{\alpha(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)}, \tag{109}$$

where

$$E_{t\text{-}me\text{-}x}(\gamma) = \int_0^1 dt \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)} \tag{110}$$

and

$$\alpha(\theta, \upsilon, t; \lambda; \gamma) = \int_0^\infty r^{\lambda + 2}\, \varphi_{last}(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon, t; \gamma)\, dr. \tag{111}$$

The joint pdf $\varphi_{last}(h, \ell, x, t; \gamma)$ is

$$\varphi_{last}(h, \ell, x, t; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x - \gamma)^2}{2}\right) \varphi_{last}(h, \ell, t), \tag{112}$$

where $\varphi_{last}(h, \ell, t)$ is given by expression (C.15) in Appendix C.2.

Remark 7
Recall that the parameter $\gamma$ (9) is unknown, because both the drift $\mu_i$ and the instantaneous variances $\sigma_i^2$ in equations (6) are generally unknown. Therefore, our strategy below is to choose, for definiteness, $\gamma = 0$ and then explore the dependence on $\gamma$ of the bias and efficiency of the different "zero drift" estimators. Accordingly, we will use below the following shorthand notations, omitting the argument $\gamma$, such as

$$\hat d_{t\text{-}me\text{-}x}(\Theta, \Upsilon, t) := \hat d_{t\text{-}me\text{-}x}(\Theta, \Upsilon, t; \gamma = 0).$$

The calculation of the integral (110), where $\alpha(\theta, \upsilon, t; \lambda)$ is given by expression (C.18) in Appendix C.4, yields for $\gamma = 0$

$\mathrm{Var}[\hat d_{t\text{-}me\text{-}x}] = \frac{1}{E_{t\text{-}me\text{-}x}} - 1 \simeq$ . , $\kappa_{t\text{-}me\text{-}x} = 3$, $\Rightarrow R_{t\text{-}me\text{-}x} \simeq$ . . (113)

This estimator is more efficient than all the previous ones discussed until now.

The goal of this section is to check by numerical simulations some analytical results obtained above. Realizations of the canonical Wiener process $X(t; \gamma)$ (9) with drift for time $t \in [0, 1]$ are obtained numerically as cumulative sums of a number I ( t ) = 10 of Gaussian summands, corresponding to a discrete time step ∆ = 10 − . For each numerical realization, we calculate the values of the open-close spot variance canonical estimator, equal in this case to

$$\hat d_{real} = (\gamma + W)^2, \tag{114}$$

and the values of the G&K canonical estimator

$$\hat d_{GK} = k_1 (H - L)^2 - k_2 \big(W (H - L) - HL\big) - k_3 (\gamma + W)^2, \tag{115}$$

where $H$ and $L$ are the high and low values of the simulated process $X(t; \gamma)$. We also constructed numerical realizations of the bridge process $Y(t)$ (25) and calculated the corresponding values of the canonical estimator $\hat d_{t\text{-}me\text{-}x}$ (109). This estimator depends on the function $\alpha(\theta, \upsilon, t; \lambda)$ defined by expression (C.18) in Appendix C.4, which is explicitly obtained by summing a double-infinite series (C.19). In practice, we estimate this double sum by keeping only the first 101 terms in each dimension, corresponding to estimating 101 × ≃ summands in (C.18).

Remark 8
At first glance, it would seem that the calculation of the G&K estimator (115), which needs only a few simple arithmetic operations, is much easier than the evaluation of the large number of summands in the series (C.18) that define the estimator $\hat d_{t\text{-}me\text{-}x}$ (109). In our computerized world, it turns out that there is actually no significant difference from the computational point of view.

γ = 0 )

Figure 1 shows 5000 realizations of the open-close estimator $\hat d_{real}$ (114), of the G&K estimator (115) and of the estimator $\hat d_{t\text{-}me\text{-}x}$ in the case of the Wiener process with zero drift ($\gamma = 0$). It is clear that the last estimator is the most efficient in comparison with the open-close and the G&K estimators. The expected values and variances of these three estimators obtained by statistical averaging over 10 samples are

E[ $\hat d_{real}$ ] ≃ . , E[ $\hat d_{GK}$ ] ≃ . , E[ $\hat d_{t\text{-}me\text{-}x}$ ] ≃ . ,
Var[ $\hat d_{real}$ ] ≃ . , Var[ $\hat d_{GK}$ ] ≃ . , Var[ $\hat d_{t\text{-}me\text{-}x}$ ] ≃ . .

These values are consistent with the theoretical analytical predictions obtained in previous sections:

$\mathrm{E}[\hat d_{real}] = \mathrm{E}[\hat d_{GK}] = \mathrm{E}[\hat d_{t\text{-}me\text{-}x}] = 1$, $\mathrm{Var}[\hat d_{real}] = 2$, Var[ $\hat d_{GK}$ ] ≃ . , Var[ $\hat d_{t\text{-}me\text{-}x}$ ] ≃ . .
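The zero-drift comparison of the open-close and G&K canonical estimators can be reproduced in a few lines. This is a hedged sketch, not the authors' simulation code: the function name `simulate_open_close_and_gk`, the grid size and the sample size are hypothetical, and the G&K estimator is written in its textbook form with the commonly quoted constants 0.511, 0.019 and 0.383, which are assumed here as the values of the three constants $k$ in (115) that the text does not give.

```python
import numpy as np

def simulate_open_close_and_gk(n_paths=2000, n_steps=1000, gamma=0.0, seed=0):
    """Simulate X(t) = gamma * t + W(t) on [0, 1] and return per-path
    open-close and Garman-Klass spot variance estimates."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # Gaussian increments of the Wiener process with drift.
    increments = gamma * dt + np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)
    close = paths[:, -1]
    # The path starts at X(0) = 0, so the high is >= 0 and the low is <= 0.
    high = np.maximum(paths.max(axis=1), 0.0)
    low = np.minimum(paths.min(axis=1), 0.0)
    # Open-close estimator (114).
    d_real = close**2
    # Textbook Garman-Klass form; the constants are an assumption.
    d_gk = (0.511 * (high - low)**2
            - 0.019 * (close * (high + low) - 2.0 * high * low)
            - 0.383 * close**2)
    return d_real, d_gk

d_real, d_gk = simulate_open_close_and_gk()
print("open-close: mean %.3f, variance %.3f" % (d_real.mean(), d_real.var()))
print("G&K:        mean %.3f, variance %.3f" % (d_gk.mean(), d_gk.var()))
```

On a finite grid the simulated high and low are slightly less extreme than in continuous time, so the sample mean of the G&K estimator is expected to fall a little below 1; refining the grid reduces this discretization bias.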
In order to have truly comparable efficiencies of these realized variance estimators, bearing in mind that their effective sample sizes are different ($\kappa_{real} = 1$, $\kappa_{GK} = \kappa_{t\text{-}me\text{-}x} = 3$), we performed moving averages with $r = 30$ subsequent samples for the open-close estimator (114) and with $r = 10$ subsequent samples for the G&K estimator (115) and the estimator $\hat d_{t\text{-}me\text{-}x}$ (109). Figure 2 presents these moving averages, which mimic the normalized estimators of the integrated variance in the case where all instantaneous variances are the same ($\sigma_i^2 = \sigma^2 = \mathrm{const}$). It is clear that the open-close estimator of the realized variance remains significantly less efficient than the G&K estimator, and much less efficient than the most efficient estimator $\hat d_{t\text{-}me\text{-}x}$.

γ-dependence of biases and efficiencies of canonical estimators

In the previous subsection, we presented detailed calculations of the comparative efficiency of unbiased variance estimators for the particular case of Wiener processes with zero drift. In real financial markets, the drift process $\mu(t)$ is unknown and there is no reason for it to vanish. Thus, it is important to explore quantitatively the dependence on the parameter $\gamma$ (9) of the biases and efficiencies of the spot variance canonical estimators described above.

We begin with the open-close spot variance canonical estimator $\hat d_{real}$ (114). Since $\gamma + W$ is Gaussian with mean $\gamma$ and unit variance, its expected value and variance are quadratic functions of $\gamma$:

$$\mathrm{E}\big[\hat d_{real}\big] = 1 + \gamma^2, \qquad \mathrm{Var}\big[\hat d_{real}\big] = 2 + 4\gamma^2. \tag{116}$$

The spot variance homogeneous time-open-high-low canonical bridge estimators, such as the Park estimator $\hat d_{bPark}$ (73) and the time-high-low estimator $\hat d_{t\text{-}me}$ (95), are unbiased for all $\gamma$:

$$\mathrm{E}\big[\hat d_{bPark}\big] = \mathrm{E}\big[\hat d_{t\text{-}me}\big] \equiv 1.$$

Their variances do not depend on $\gamma$ at all:

Var[ $\hat d_{bPark}$ ] ≃ . , Var[ $\hat d_{t\text{-}me}$ ] ≃ . ∀ γ. (117)

To obtain the $\gamma$-dependence of the biases and variances of the G&K canonical estimator $\hat d_{GK}$ (115) and of the canonical estimator $\hat d_{t\text{-}me\text{-}x}$ (109), we generate 10 numerical realizations of the canonical Wiener process $X(t, \gamma)$ (9) with drift, for γ = 0; 0 . . . . . 5; 1 . 6. Then, we calculated the statistical averages and variances of the corresponding 10 realizations of the canonical estimators $\hat d_{GK}$ and $\hat d_{t\text{-}me\text{-}x}$, which are shown in figure 3. The continuous lines are respectively the expected value (116) of the open-close estimator $\hat d_{real}$ given by expression (114) and the fitted curves

$$\mathrm{E}[\hat d_{est}] = a_{est} \gamma^2 + b_{est}$$

for the averaged values of the canonical estimators $\hat d_{GK}$ and $\hat d_{t\text{-}me\text{-}x}$. Their fitted parameters are

$a_{GK}$ ≃ . , $a_{t\text{-}me\text{-}x}$ ≃ . , $b_{GK} \simeq b_{t\text{-}me\text{-}x}$ ≃ .

Figure 4 shows the statistical average of the variances of the canonical estimators $\hat d_{GK}$ and $\hat d_{t\text{-}me\text{-}x}$. The two horizontal lines indicate the variance values (117). The continuous lines show the fitted curves

$$\mathrm{Var}[\hat d_{est}] = c_{est} \gamma^2 + d_{est}$$

of the variances of the canonical estimators $\hat d_{GK}$ and $\hat d_{t\text{-}me\text{-}x}$. Their parameters are

$c_{GK}$ ≃ . , $c_{t\text{-}me\text{-}x}$ ≃ . , $d_{GK}$ ≃ . , $d_{t\text{-}me\text{-}x}$ ≃ . .

We have introduced the canonical estimator $\hat d_{t\text{-}me\text{-}x}$ given by expression (109) that includes the information on the value of the time $t_{last} = \sup\{t_L, t_H\}$, defined as the larger of the two times at which the high and low values of the canonical bridge occur. It seems that the canonical estimator

$$\hat d_{tt\text{-}me\text{-}x}(\Theta, \Upsilon, t_{high}, t_{low}; \gamma) = R^2 s_{tt\text{-}me\text{-}x}(\Theta, \Upsilon, t_{high}, t_{low}; \gamma), \qquad s_{tt\text{-}me\text{-}x}(\theta, \upsilon, t_1, t_2; \gamma) = \frac{1}{E_{tt\text{-}me\text{-}x}(\gamma)}\, \frac{\alpha(\theta, \upsilon, t_1, t_2; 2; \gamma)}{\alpha(\theta, \upsilon, t_1, t_2; 4; \gamma)}, \tag{118}$$

taking into account both the highs and lows and their corresponding occurrence times ($t_{high}: H = Y(t_{high})$, $t_{low}: L = Y(t_{low})$), is even more efficient than the estimator (109).
In expression (118), we have used the notation

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda, \gamma) = \int_0^\infty r^{\lambda + 2}\, \varphi(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon, t_1, t_2; \gamma)\, dr, \tag{119}$$

where $\varphi(h, \ell, x, t_1, t_2; \gamma)$ is the joint pdf of the high, low, close, $t_{high}$ and $t_{low}$ random variables.

We have not explored the statistical properties of the estimator (118) because we have not yet derived the exact analytical expression of $\varphi(h, \ell, x, t_1, t_2; \gamma)$. We can however construct the function $\alpha$ (119) using statistical averaging:

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda, \gamma)\, \cos\upsilon\, d\upsilon\, d\theta\, dt_1\, dt_2 \simeq \frac{1}{K} \sum_{k=1}^{K} R_k^{\lambda}\, I(\Upsilon_k, \Theta_k, t_{high,k}, t_{low,k}). \tag{120}$$

In this expression, the values $\{\Upsilon_k, \Theta_k, t_{high,k}, t_{low,k}\}$ are the parameters of the numerically simulated $k$-th sample of the canonical Wiener process with drift $X(t, \gamma)$ (9), and $I$ is the indicator of the set

$$(\upsilon, \upsilon + d\upsilon) \times (\theta, \theta + d\theta) \times (t_1, t_1 + dt_1) \times (t_2, t_2 + dt_2).$$

We would like to point out that it is possible to construct the function $\alpha$ by an analogous statistical treatment for more general log-price processes that extend the Wiener process with drift to include more adequately the microstructure noise, the presence of heavy tails of returns and other stylized facts that can be found for various financial assets. In other words, relations such as (120) offer the possibility of constructing novel most efficient variance estimators of the form (118), extending the standard approach of econometricians looking for new constructions of efficient volatility estimators. The only requirement is to be able to simulate numerically the underlying stochastic process that represents the dynamics of a given financial asset. Then, the use of statistical averaging, similar to (120), will enable the construction of high-frequency realized estimators that use the most efficient estimators described above as elementary "bricks".
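The statistical averaging in (120) can be illustrated on the simpler high-low case, where only the polar angle $\Theta$ of $(H, L)$ is binned and the variance of the most efficient homogeneous estimator follows from (91). This is a sketch under stated assumptions, not the construction of (118) itself: it drops the variables $\Upsilon$, $t_{high}$ and $t_{low}$, uses a discretized bridge $Y(t) = W(t) - t W(1)$, and all function names, the bin count and the sample sizes are hypothetical choices.

```python
import numpy as np

def simulate_bridge_extremes(n_paths=5000, n_steps=400, seed=0):
    """High and low of discretized canonical bridges Y(t) = W(t) - t W(1)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    t = np.arange(1, n_steps + 1) * dt
    w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
    y = w - t * w[:, -1][:, None]
    # The bridge is pinned at 0, so high >= 0 and low <= 0.
    high = np.maximum(y.max(axis=1), 0.0)
    low = np.minimum(y.min(axis=1), 0.0)
    return high, low

def binned_alpha(radius, theta, lam, edges):
    """Per-bin average (1/K) * sum_k R_k**lam * I(Theta_k in bin),
    which approximates alpha(theta; lam) * d_theta."""
    sums, _ = np.histogram(theta, bins=edges, weights=radius**lam)
    return sums / radius.size

def most_efficient_variance(high, low, n_bins=20):
    """Estimate E_me and Var = 1 / E_me - 1 of (91) from samples."""
    radius = np.hypot(high, low)
    theta = np.arctan2(low, high)  # polar angle in [-pi/2, 0]
    edges = np.linspace(-np.pi / 2, 0.0, n_bins + 1)
    a2 = binned_alpha(radius, theta, 2, edges)
    a4 = binned_alpha(radius, theta, 4, edges)
    ok = a4 > 0  # skip empty bins
    # The bin width cancels in sum (a2 / d)^2 / (a4 / d) * d.
    e_me = np.sum(a2[ok]**2 / a4[ok])
    return e_me, 1.0 / e_me - 1.0

high, low = simulate_bridge_extremes()
e_me, var_me = most_efficient_variance(high, low)
print("E_me ~ %.3f, Var ~ %.3f" % (e_me, var_me))
```

The ratio of the two binned arrays also gives a piecewise-constant approximation of $s_{me}(\theta)$ in (92). The same binning extends to more variables at the cost of needing many more samples per cell, which is the practical limitation of this approach.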
We have introduced a variety of integrated variance estimators, based on the open-high-low values of the bridges $Y(t, S_i)$ (22) and on the close values $X_i$ (4) of the underlying log-price process $X(t)$. The main peculiarity of some of the introduced estimators is that they take into account not only the high and low values but also their occurrence times. This last piece of information leads to estimators that are even more efficient. We discussed quantitatively the statistical properties of the estimators for the class of Itô models of the log-price stochastic process.

Our work opens the road to the construction of novel types of integrated variance estimators of log-price processes of real financial markets that take into account the microstructure noise, heavy power tails of returns, and chaotic jumps.

Acknowledgements: We are grateful to Fulvio Corsi for valuable discussions of some aspects of this paper.

A Basic properties of the canonical bridge
A.1 Symmetry properties
The canonical bridge $Y(t)$ (25) exhibits the following time reversibility and reflection properties

$$Y(t) \sim Y(1 - t), \qquad Y(t) \sim -Y(t). \tag{A.1}$$

Some statistical consequences of these symmetry properties are as follows. Let

$$H = \sup_{t \in (0,1)} Y(t), \qquad L = \inf_{t \in (0,1)} Y(t), \tag{A.2}$$

be the high and low values of the canonical bridge, while $t_{high}$ and $t_{low}$ are their corresponding occurrence times:

$$t_{high}: H = Y(t_{high}), \qquad t_{low}: L = Y(t_{low}). \tag{A.3}$$

Consider the cumulative distribution function (cdf)

$$\Phi_{high}(t) = \Pr\{t_{high} < t\}$$

of the occurrence time $t_{high}$ of the high value of the canonical bridge. Due to the reversibility property (A.1), one has

$$\Pr\{t_{high} < t\} = \Pr\{t_{high} > 1 - t\} \quad \Rightarrow \quad \Phi_{high}(t) + \Phi_{high}(1 - t) = 1. \tag{A.4}$$

Accordingly, the pdf of $t_{high}$,

$$\varphi_{high}(t) := \frac{d\Phi_{high}(t)}{dt},$$

presents the symmetry

$$\varphi_{high}(t) = \varphi_{high}(1 - t). \tag{A.5}$$

Due to the reflection property of the canonical bridge, the cdf $\Phi_{low}(t)$ of $t_{low}$ (A.3) coincides with the cdf of $t_{high}$:

$$\Phi_{low}(t) = \Phi_{high}(t) \quad \Rightarrow \quad \varphi_{low}(t) = \varphi_{high}(t) = \varphi_{high}(1 - t).$$

A.2 Interplay between bridge and Wiener processes

We will need below the well-known identity in law for the canonical bridge

$$Y(t) \sim (1 - t)\, W\!\left(\frac{t}{1 - t}\right).$$

Using the change of time variable

$$\tau = \frac{t}{1 - t} \quad \Longleftrightarrow \quad t = \frac{\tau}{1 + \tau}$$

and the scaling properties of the Wiener process, we can replace the compounded process

$$Y(t(\tau)) = Y\!\left(\frac{\tau}{1 + \tau}\right)$$

by the more convenient process, which is identical in law and reads

$$Y(t(\tau)) \sim Z(\tau) = \frac{1}{1 + \tau}\, W(\tau). \tag{A.6}$$

In turn, the following identity in law holds

$$Y(t) \sim Z(\tau(t)) = Z\!\left(\frac{t}{1 - t}\right). \tag{A.7}$$

B Joint pdf of the high value and its occurrence time
B.1 Reflection method
Let us consider the function $f(\omega; \tau, h)$ such that

$$\Pr\{W(\tau) \in (\omega, \omega + d\omega) \cap W(\tau') < h(1 + \tau') : \tau' \in (0, \tau)\} = f(\omega; \tau, h)\, d\omega. \tag{B.1}$$

This function $f(\omega; \tau, h)$ satisfies the following diffusion equation

$$\frac{\partial f}{\partial \tau} = \frac{1}{2} \frac{\partial^2 f}{\partial \omega^2} \tag{B.2}$$

with initial and absorbing boundary conditions

$$f(\omega; \tau = 0, h) = \delta(\omega), \qquad f(\omega = h + h\tau; \tau, h) = 0. \tag{B.3}$$

We solve the initial-boundary problem (B.2) with (B.3) using the reflection method, which amounts to searching for a solution of the form

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[\exp\left(-\frac{\omega^2}{2\tau}\right) - A \exp\left(-\frac{(\omega - 2h)^2}{2\tau}\right)\right],$$

where the factor $A$ is defined from the absorbing boundary condition (B.3), i.e.

$$\exp\left(-\frac{(h + h\tau)^2}{2\tau}\right) = A \exp\left(-\frac{(h - h\tau)^2}{2\tau}\right) \quad \Rightarrow \quad A = e^{-2h^2}.$$

We thus obtain

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[\exp\left(-\frac{\omega^2}{2\tau}\right) - \exp\left(-2h^2 - \frac{(\omega - 2h)^2}{2\tau}\right)\right]. \tag{B.4}$$

B.2 Pdf of the maximal value of the canonical bridge
In view of (B.1) and (A.6), the joint pdf of $W(\tau)$ and of the high value

$$\mathcal{H}(\tau) = \sup_{\tau' \in (0, \tau)} Z(\tau') \tag{B.5}$$

of the stochastic process $Z(\tau')$ within the interval $\tau' \in (0, \tau)$ is equal to

$$Q(\omega, h; \tau) = \frac{\partial f(\omega; \tau, h)}{\partial h}.$$

Substituting in the above equation the expression (B.4) yields

$$Q(\omega, h; \tau) = \frac{1}{\tau} \sqrt{\frac{2}{\pi\tau}}\, \big(2h(1 + \tau) - \omega\big) \exp\left[-2h^2 - \frac{(\omega - 2h)^2}{2\tau}\right], \qquad \omega < h(1 + \tau), \quad h > 0. \tag{B.6}$$

In particular, the pdf of the high value $\mathcal{H}(\tau)$ (B.5),

$$Q(h; \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, d\omega,$$

is equal to

$$Q(h; \tau) = \sqrt{\frac{2}{\pi\tau}} \exp\left(-\frac{h^2 (1 + \tau)^2}{2\tau}\right) + 2h\, e^{-2h^2}\, \mathrm{erfc}\left(\frac{h(1 - \tau)}{\sqrt{2\tau}}\right).$$

The pdf $Q_{high}(h; t)$ of the high value

$$H(t) = \sup_{t' \in (0, t)} Y(t')$$

is equal to

$$Q_{high}(h; t) = Q\left(h; \frac{t}{1 - t}\right). \tag{B.7}$$

In particular, the pdf's of the high values $\mathcal{H}$ (B.5) and $H$ (A.2) are the same and equal to

$$\varphi_{high}(h) = \lim_{\tau \to \infty} Q(h; \tau) = 4h\, e^{-2h^2}. \tag{B.8}$$

B.3 Pdf of the high value of the bridge and of its occurrence time
In order to derive the joint pdf of the maximal value $H$ (A.2) and of the occurrence time $t_{high}$ (A.3), we first consider the related joint pdf of the high value $\mathcal{H}$ (B.5) of the auxiliary process $Z(\tau)$ (A.6) and of its occurrence time $\tau_{high}: \mathcal{H} = Z(\tau_{high})$.

The function $F(h, \tau)$ that defines the probability

$$F(h, \tau)\, dh = \Pr\{\mathcal{H} \in (h, h + dh),\ \tau_{high} < \tau\}$$

is given by

$$F(h, \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, P(\omega, h, \tau)\, d\omega, \tag{B.9}$$

where $Q(\omega, h; \tau)$ is the joint pdf of $W(\tau)$ and $\mathcal{H}(\tau)$, given by equality (B.6), while

$$P(\omega, h, \tau) = \lim_{\theta \to \infty} P(\omega, h, \tau, \theta), \qquad P(\omega, h, \tau, \theta) = \Pr\{W(\tau' | \tau, \omega) < h(1 + \tau') : \tau' \in (\tau, \tau + \theta)\}. \tag{B.10}$$

Here, $W(\tau' | \tau, \omega)$ is the conditioned Wiener process that takes the value $\omega$ at $\tau' = \tau$. Due to the identity in law (A.6), $P(\omega, h, \tau)$ is equal to the probability that the following inequality holds

$$Z(\tau' | \tau, \omega) < h, \qquad \tau' \in (\tau, \infty),$$

where $Z(\tau' | \tau, \omega)$ is the conditioned stochastic process $Z(\tau')$, which is equal to $\omega / (1 + \tau)$ at $\tau' = \tau$.

The probability $P(\omega, h, \tau, \theta)$ (B.10) is given by

$$P(\omega, h, \tau, \theta) = \int_{-\infty}^{h(1 + \tau + \theta)} f(x; \omega, h, \tau, \theta)\, dx, \tag{B.11}$$

where the pdf $f(x; \omega, h, \tau, \theta)$ satisfies the initial-boundary value problem

$$\frac{\partial f}{\partial \theta} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2}, \qquad f(x; \omega, h, \tau, \theta = 0) = \delta(x - \omega), \qquad f(x = h(1 + \tau + \theta); \omega, h, \tau, \theta) = 0.$$

Its solution, obtained by the reflection method, is

$$f(x; \omega, h, \tau, \theta) = \frac{1}{\sqrt{2\pi\theta}} \left[\exp\left(-\frac{(x - \omega)^2}{2\theta}\right) - \exp\left(-2h\big(h(1 + \tau) - \omega\big) - \frac{\big(x + \omega - 2h(1 + \tau)\big)^2}{2\theta}\right)\right].$$

Substituting this last expression into (B.11) yields

$$P(\omega, h, \tau, \theta) = \frac{1}{2} \left[\mathrm{erfc}\left(\frac{\omega - h(1 + \tau + \theta)}{\sqrt{2\theta}}\right) - e^{-2h(h(1 + \tau) - \omega)}\, \mathrm{erfc}\left(\frac{h(1 + \tau - \theta) - \omega}{\sqrt{2\theta}}\right)\right].$$

In particular, in the limiting case $\theta \to \infty$, one has

$$P(\omega, h, \tau) = 1 - e^{-2h(h(1 + \tau) - \omega)}. \tag{B.12}$$

Substituting $Q(\omega, h; \tau)$ (B.6) and $P(\omega, h, \tau)$ (B.12) into (B.9), after integration, we obtain

$$F(h, \tau) = 2h\, e^{-2h^2}\, \mathrm{erfc}\left(\frac{h(1 - \tau)}{\sqrt{2\tau}}\right). \tag{B.13}$$

Consider now the probability

$$\Phi_{high}(h, t)\, dh = \Pr\{H \in (h, h + dh),\ t_{high} < t\}.$$

Due to the identity in law (A.7), $\Phi_{high}(h, t)$ is equal to

$$\Phi_{high}(h, t) = F\left(h, \frac{t}{1 - t}\right) = 2h\, e^{-2h^2}\, \mathrm{erfc}\left(\frac{h(1 - 2t)}{\sqrt{2t(1 - t)}}\right). \tag{B.14}$$

The integration over $h \in (0, \infty)$ gives the cumulative distribution function (cdf) of the random occurrence time $t_{high}$ (A.3):

$$\Phi_{high}(t) = \Pr\{t_{high} < t\} = \int_0^\infty \Phi_{high}(h, t)\, dh = t, \qquad t \in (0, 1).$$

This means that the occurrence time $t_{high}$ of the high value of the canonical bridge is uniformly distributed. The above cdf satisfies the symmetry property (A.4). The corresponding pdf $\varphi_{high}(t) = 1$ obviously satisfies the symmetry property (A.5).

The sought joint pdf of the high value $H$ of the canonical bridge $Y(t)$ and of its corresponding occurrence time $t_{high}$ is

$$\varphi_{high}(h, t) = \frac{\partial \Phi_{high}(h, t)}{\partial t}. \tag{B.15}$$

Substituting here $\Phi_{high}(h, t)$ (B.14) yields

$$\varphi_{high}(h, t) = \sqrt{\frac{2}{\pi}}\, \frac{h^2}{\sqrt{t^3 (1 - t)^3}} \exp\left(-\frac{h^2}{2t(1 - t)}\right). \tag{B.16}$$

C Statistics of the high, low and occurrence time of the last extremum of the canonical bridge
C.1 Statistical description of the joint pdf of the high,low and occurrence time of the last extremum
The occurrence times of the first and last absolute extremes (A.3) ofcanonical bridge Y ( t ) are formally defined as t first = inf { t L , t H } , t last = sup { t L , t H } . (C.1)The joint pdf of the high H and low L (A.2) together with the cdf of theoccurrence time t last is given byΦ last ( h, ℓ, t ) dhdℓ = Pr { H ∈ ( h, h + dh ) ∩ L ∈ ( ℓ, ℓ + dℓ ) ∩ t last < t } . (C.2)We derive the function Φ last ( h, ℓ, t ) by using a natural generalization of thereasoning presented in Appendix B that led to the joint pdf Φ high ( h, t ) (B.14)of the high value H and of the cdf of the occurrence time t high . Namely, wecalculate first the probability F ( h, ℓ, τ ) dhdℓ = Pr {H ∈ ( h, h + dh ) , L ∈ ( ℓ, ℓ + dℓ ) , τ last < τ } , (C.3)33here H = sup τ ∈ (0 , ∞ ) Z ( τ ) , L = inf τ ∈ (0 , ∞ ) Z ( τ ) ,τ last = sup { τ low , τ high } , τ low : L = Z ( τ low ) , τ high : H = Z ( τ high ) . Analogously to (B.9), F ( h, ℓ, τ ) is equal to F ( h, ℓ, τ ) = Z h (1+ τ ) ℓ (1+ τ ) Q ( ω, h, ℓ, τ ) P ( ω, h, ℓ, τ ) dω, (C.4)where Q ( ω, h, ℓ, τ ) = − ∂ f ( ω ; h, ℓ, τ ) ∂h∂ℓ (C.5)and the pdf f ( ω ; h, ℓ, τ ) satisfies the initial-boundary problem ∂f∂τ = 12 ∂ f∂ω , f ( ω ; h, ℓ, τ = 0) = δ ( ω ) ,f ( ω = h (1 + τ ); h, ℓ, τ ) = 0 , f ( ω = ℓ (1 + τ ); h, ℓ, τ ) = 0 , τ > . (C.6)Similarly to P ( ω, h, τ ) (B.10), the probability P ( ω, h, ℓ, τ ) is given by P ( ω, h, ℓ, τ ) = lim θ →∞ P ( ω, h, ℓ, τ, θ ) ,P ( ω, h, ℓ, τ, θ ) = Pr { ℓ (1 + τ ′ ) < Bτ ′ | τ, ω ) < h (1 + τ ′ ) : τ ′ ∈ ( τ, τ + θ ) } . Analogously to (B.11), the last probability P ( ω, h, ℓ, τ, θ ) is equal to P ( ω, h, ℓ, τ, θ ) = Z h (1+ τ + θ ) ℓ (1+ τ + θ ) f ( x ; ω, h, ℓ, τ, θ ) dx, (C.7)where f ( x ; ω, h, ℓ, τ, θ ) is the solution of the initial-boundary problem ∂f∂θ = 12 ∂ f∂x , f ( x ; ω, h, ℓ, τ, θ = 0) = δ ( x − ω ) ,f ( x = h (1 + τ + θ ); ω, h, ℓ, τ, θ ) = 0 , f ( x = ℓ (1 + τ + θ ); ω, h, ℓ, τ, θ ) = 0 . 
(C.8)Knowing the function F ( h, ℓ, τ ) defined by equality (C.3), one can findthe sought function Φ last ( h, ℓ, t ) (C.2) using the following relationΦ last ( h, ℓ, t ) = F (cid:18) h, ℓ, t − t (cid:19) , (C.9)which is analogous to (B.14). In turn, one can find the joint pdf of the high H , low L values (A.2) and occurrence time of the last absolute extremum t last (C.1) of the canonical bridge Y ( t ) using, analogously to (B.15), the relation ϕ last ( h, ℓ, t ) = ∂ Φ last ( h, ℓ, t ) ∂t . (C.10)34 .2 Solutions of boundary-value problems Using the initial-boundary problem (C.6) with the reflection method, weobtain f ( ω ; h, ℓ, τ ) = ∞ X m = −∞ (cid:2) e − h − ℓ ) m g ( ω + 2( h − ℓ ) m ; τ ) − e − h − ℓ ) m + h ) g ( ω − h + ( h − ℓ ) m ); τ ) (cid:3) , (C.11)where g ( ω ; τ ) = 1 √ πτ exp (cid:18) − ω τ (cid:19) . In turn, the solution of the initial-boundary problem (C.8) is given by f ( x ; ω, h, ℓ, τ, θ ) = ∞ X m = −∞ (cid:20) e − h − ℓ ) m (1+ τ )+2 ω ( h − ℓ ) m × g ( y − ω + 2 m ( h − ℓ )(1 + τ ); θ ) − e − h − ℓ ) m + h ) (1+ τ )+2 ω (( h − ℓ ) m + h ) g ( y + ω − h − ℓ ) m + h )(1 + τ ); θ ) (cid:21) . (C.12)After substituting f ( ω ; h, ℓ, τ ) (C.11) into (C.5), we obtain Q ( ω, h, ℓ, τ ) = 4 τ ∞ X −∞ m (cid:20) me − h − ℓ ) m × [( ω + 2 m ( h − ℓ )(1 + τ )) − τ (1 + τ )] g ( ω + 2 m ( h − ℓ ) , τ ) − (1 + m ) e − m ( h − ℓ )+ h ) × [( ω − m ( h − ℓ ) + h )(1 + τ )) − τ (1 + τ )] g ( ω − m ( h − ℓ ) + h ) , τ ) (cid:21) . (C.13)Substituting f ( x ; ω, h, ℓ, τ, θ ) (C.12) into (C.7), and taking the limit θ → ∞ ,we obtain P ( ω, h, ℓ, τ ) = ∞ X m = −∞ h e − h − ℓ ) (1+ τ ) m +2( h − ℓ ) mω − e − h +( h − ℓ ) m ) (1+ τ )+2( h +( h − ℓ ) m ) ω i . (C.14)After substituting Q ( ω, h, ℓ, τ ) (C.13) and P ( ω, h, ℓ, τ ) (C.14) into (C.4),we obtain the explicit expression for F ( h, ℓ, τ ). 
Substituting it into (C.9)35nd using relation (C.9), we obtain the pdf of the high H , low L values andoccurrence time t last of the last extremum under the form ϕ last ( h, ℓ, t ) = ∞ X m = −∞ ∞ X n = −∞ (cid:16) m h g ( h, t, h − ℓ ) m, h − ℓ ) n ) − g ( ℓ, t, h − ℓ ) m, h − ℓ ) n ) − g ( h, t, h − ℓ ) m, h + ( h − ℓ ) n )) + g ( ℓ, t, h − ℓ ) m, h + ( h − ℓ ) n )) i − m ( m + 1) h g ( h, t, − h + ( h − ℓ ) m ) , h − ℓ ) n ) − g ( ℓ, t, − h + ( h − ℓ ) m ) , h − ℓ ) n ) − g ( h, t, − h + ( h − ℓ ) m ) , h + ( h − ℓ ) n ))+ g ( ℓ, t, − h + ( h − ℓ ) m ) , h + ( h − ℓ ) n )) i(cid:17) ,g ( y, t, a, c ) = − s π (1 − t ) t exp (cid:18) − ( a + y ) − ( a + c )( a − c + 2 y ) t t (1 − t ) (cid:19) × (cid:2) ( a + y ) − ( a + y )(3 + ( a + y )( a − c + 2 y )) t + (3 a − c + 4 y ) t (cid:3) . (C.15) C.3 Function α last ( θ, t ; λ ) Some of the most efficient estimators introduced in this paper are definedthrough the function α last ( θ, t ; λ ) = Z ∞ r λ +1 ϕ last ( r cos θ, r sin θ, t ) dr, (C.16)36hich is analogous to (73), The calculation of the integral (C.16) yields α last ( θ, t ; λ ) = − p π (1 − t ) t δ λ Γ (cid:18) λ (cid:19) ∞ X m,n = −∞ (cid:16) m × h β ( co, t, sc · m, sc · n ; λ ) − β ( si, t, sc · m, sc · n ; λ ) − β ( co, t, sc · m, co + sc · n ); λ )+ β ( si, t, sc · m, co + sc · n ); λ ) i − m ( m + 1) h β ( co, t, − co + sc · m ) , sc · n ; λ ) − β ( si, t, − si + sc · m ) , sc · n ; λ ) − β ( co, t, − co + sc · m ) , co + sc · n ); λ )+ β ( si, t, − co + sc · m ) , co + sc · n ); λ ) i(cid:17) , (C.17)where co = cos θ, si = sin θ, sc = co − si,β ( y, t, a, c ; λ ) = h ( a + y ) [ a + y − ( a − c + 2 y ) t ](3 + λ )+ δt [(6 a − c + 8 y ) t − a + y )] i ,δ = δ ( y, t, a, c ) = ( a + y ) − ( a + c )( a − c + 2 y ) t t (1 − t ) . 
C.4 Function α ( θ, υ, t ; λ ) Consider the function α ( θ, υ, t ; λ ) = Z ∞ r λ +2 ϕ last ( r cos υ cos θ, r cos υ sin θ, r sin υ, t ; γ = 0) dr, (C.18)that enters into the definition of the canonical estimator (109) in the caseof zero drift γ = 0. Using expression (112) for the pdf ϕ last ( h, ℓ, x, t ; γ ), we37btain after calculations the following expression α ( θ, υ, t ; λ ) = Γ (cid:0) λ (cid:1) π p t (1 − t ) ∞ X m,n = −∞ (cid:16) m × h β ′ ( x, co, t, sc · m, sc · n ; λ ) − β ′ ( x, si, t, sc · m, sc · n ; λ ) − β ′ ( x, co, t, sc · m, cc + sc · n ; λ ) + β ′ ( x, si, t, sc · m, cc + sc · n ; λ ) i − m ( m + 1) h β ′ ( x, co, t, − cc − sc · m, sc · n ; λ ) − β ′ ( x, si, t, − cc − sc · m, sc · n ; λ ) − β ′ ( x, co, t, − cc − sc · m, cc + sc · n ; λ )+ β ′ ( x, si, t, − cc − sc · m, cc + sc · n ; λ ) i(cid:17) . (C.19)which is analogous to (C.17). Here, we have set x = sin υ, co = cos θ cos υ, si = sin θ cos υ,cc = 2 cos θ cos υ, sc = 2(cos θ − sin θ ) cos υ,β ′ ( x, y, t, a, c, λ ) = (cid:2) r (4 + λ )( a + y − ( a − c + 2 y ) t )+ δt ((6 a − c + 8 y ) t − a + y )) (cid:3) δ − (6+ λ ) / ,δ = r − ( a + c )( a − c + 2 y ) t t (1 − t ) + x , r = ( a + y ) . eferences A¨ıt-Sahalia, Y., P.A. Mykland, and L. Zhang (2005). How often to samplea continuous-time process in the presence of market microstructure noise.
Review of Financial Studies 18, 351-416.
Andersen, T. G., T. Bollerslev, F. X. Diebold and P. Labys (2003). Modeling and Forecasting Realized Volatility. Econometrica 71, 529-626.
Garman, M. and M. J. Klass (1980). On the Estimation of Security Price Volatilities From Historical Data. Journal of Business 53, 67-78.
Jeanblanc, M., M. Yor and M. Chesney (2009). Mathematical Methods for Financial Markets. Springer.
Parkinson, M. (1980). The Extreme Value Method for Estimating the Variance of the Rate of Return. Journal of Business 53, 61-65.
Saichev, A., D. Sornette, V. Filimonov and F. Corsi (2009). Homogeneous Volatility Bridge Estimators. ETH Zurich working paper, http://ssrn.com/abstract=1523225.
Saichev, A., Y. Malevergne and D. Sornette (2010). Theory of Zipf's Law and Beyond (Lecture Notes in Economics and Mathematical Systems). Springer.
Zhang, L., P. A. Mykland and Y. Aït-Sahalia (2005). A tale of two time scales: determining integrated volatility with noisy high-frequency data.
Journal of the American Statistical Association

Fig. 1 (panels: $\hat d_{real}$, $\hat d_{GK}$, $\hat d_{t\text{-}me\text{-}x}$): Realizations of the open-close (114), G&K (115) and time-OHLC $\hat d_{t\text{-}me\text{-}x}$ (109) estimators.

Fig. 2 (panels: $\hat d_{real}$, $\hat d_{GK}$, $\hat d_{t\text{-}me\text{-}x}$): Moving averages of the open-close (top panel), G&K (middle panel) and time-OHLC (109) (lower panel) estimators over respective window sizes of 30 samples for the top panel and 10 samples for the two other panels. As explained in the text, this moving average mimics the normalized estimators of the integrated variance in the case where all instantaneous variances are the same ($\sigma_i^2 = \sigma^2 = \mathrm{const}$).

Fig. 3 (axes: $\gamma$, $\mathrm{E}[\hat d]$): Top to bottom, $\gamma$-dependence of the expected values of the open-close $\hat d_{real}$ (114), G&K $\hat d_{GK}$ (115) and most efficient $\hat d_{t\text{-}me\text{-}x}$ (109) canonical estimators. The horizontal line is the expected value of the canonical estimators $\hat d_{bPark}$ and $\hat d_{t\text{-}me}$.

Fig. 4 (axes: $\gamma$, $\mathrm{Var}[\hat d]$): $\gamma$-dependence of the statistical average of the variances of the canonical estimators $\hat d_{GK}$ (upper open circles) and $\hat d_{t\text{-}me\text{-}x}$ (lower open circles). The two horizontal lines are the variances (117) of the canonical estimators $\hat d_{bPark}$ (top) and $\hat d_{t\text{-}me}$ (bottom), respectively.