Time-Bridge Estimators of Integrated Variance

Abstract

We present a set of log-price integrated variance estimators, equal to the sum of open-high-low-close bridge estimators of spot variances within n subsequent time-step intervals. The main characteristic of some of the introduced estimators is that they take into account the information on the occurrence times of the high and low values. The use of the highs and lows of the bridge associated with the original process makes the estimators significantly more efficient than the standard realized variance estimator and its generalizations. Adding the information on the occurrence times of the high and low values further improves the efficiency of the estimators, well above that of the well-known realized variance estimator and of those derived from the sum of Garman and Klass spot variance estimators. The exact analytical results are derived for the case where the underlying log-price process is an Itô stochastic process. Our results suggest more efficient ways to record financial prices at intermediate frequencies.


A. Saichev, D. Sornette

ETH Zurich, Department of Management, Technology and Economics, Switzerland
Swiss Finance Institute, 40 Boulevard du Pont-d'Arve, Case Postale 3, 1211 Geneva 4, Switzerland
Nizhni Novgorod State University, Department of Mathematics, Russia


Didier Sornette, Department of Management, Technology and Economics (D-MTEC, KPL F38.2), ETH Zurich, Kreuzplatz 5, CH-8032 Zurich, Switzerland

Introduction

The integrated variance is a crucial risk indicator of the stochastic log-price process within specific time intervals. Most of the existing high-frequency integrated variance estimators are modifications of the well-known realized volatility (see, for instance, Andersen et al. (2003), Aït-Sahalia (2005), Zhang et al. (2005)), and are based on the knowledge of the open and close prices of n time-step intervals dividing the whole time interval of interest. Another common practice to estimate the variance of a log-price process is to use not two (open-close) log-prices within a given time-step, but four values, the so-called open-high-low-close (OHLC) of the log-prices. Well-known examples are the Garman and Klass (G&K) (1980) and Parkinson (Park) (1980) spot variance estimators.

The main goal of this paper is to demonstrate the efficiency of bridge OHLC integrated variance estimators, which use the knowledge of the high and low values of the bridge process derived from the original log-price process, as well as possibly the random occurrence times of these extrema within each time-step interval. We compare the efficiencies of these time-OHLC bridge estimators with the efficiency of the standard realized variance and with the efficiency of the integrated variance estimators based on the G&K estimators of the variance within each elementary time-step interval. We show that some time-OHLC integrated variance estimators achieve a very significant improvement in efficiency compared with the realized variance and the G&K integrated variance estimators. Another remarkable property of the proposed time-OHLC bridge estimators is that they depend much less on the drift of the log-price process than the realized variance and G&K integrated variance estimators. This has the great advantage of essentially removing the biases that affect the standard estimators, given that the drift (expected return) is in general the most poorly constrained statistical variable. We compare the efficiencies of the introduced integrated variance estimators using the Itô process as our workhorse to model the stochastic behavior of log-prices.

Present databases record either all prices associated with transactions or prune the data to keep the OHLC at given time steps, for instance, seconds, minutes or days. The latter records, giving the OHLC of the realized log-prices, do not allow the reconstruction of the OHLC (and even less the occurrence times of the highs and lows) for the associated bridge process in each elementary interval. Of course, one could construct the OHLC and any other useful information from the full time series of all transaction prices. But then, one could question the value of deriving new estimators based on a reduced information set. Therefore, the present paper can be considered as a normative exercise to learn about the fundamental limits of integrated variance estimators. Our results are also useful in suggesting more efficient ways to record financial prices at intermediate frequencies: instead of recording the OHLC at the daily scale for instance, we propose that data centers and vendors should store the open and close of the real log-price and the high and low of the corresponding bridge in each day (or at any other chosen frequency). Our calculations below show that this information, which has the same cost and is as easy to obtain at the end of the day from the high-frequency data, provides much more efficient estimators of the variance that can be stored for future use. The same conclusion holds true for other risk measures beyond variance, such as higher-order moments, but this is not explored in the present paper.

The paper is organized as follows. Section 2 describes the properties of the well-known realized variance estimator, which we need in order to compare its efficiency with the efficiencies of the suggested time-OHLC bridge integrated variance estimators. Section 3 is devoted to the discussion of the efficiencies of the simple bridge integrated variance estimators, illustrating the comparative efficiency and unbiasedness of the bridge integrated variance estimators. This section, written in a pedagogical style, gradually introduces the reader to the area of homogeneous most efficient variance estimators. Section 4 provides a detailed analysis of the efficiency of the OHL and time-OHLC bridge integrated variance estimators, which turn out to be significantly more efficient than the realized variance and the G&K integrated variance estimators. Section 5 describes the results of numerical simulations demonstrating the comparative efficiency of the proposed estimators. Section 6 concludes. The paper is completed by three appendices. Appendix A presents the essential properties of the canonical bridge. Appendix B derives the joint probability density function (pdf) of the high value and of its occurrence time. Appendix C derives and gives the statistical properties of the joint distribution of the high and low values and of the occurrence time of the last extremum for the canonical bridge.

Henceforth, we assume that the log-price $X(t)$ of a given security follows an Itô process

$$dX(t) = \mu(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = X_0, \tag{1}$$

where $W(t)$ is a realization of the standard Wiener process, while $\mu(t)$ is the drift process, and $\sigma^2(t)$ is the instantaneous variance of the log-price process $X(t)$.

2.1 Definitions and basic properties of realized variance

Let us first provide some basic definitions and properties.

Definition 1

The integrated variance of the process $X(t)$ within the time interval $t \in (0, T)$ is

$$D(T) := \int_0^T \sigma^2(t)\,dt. \tag{2}$$

Definition 2

The spot variance is defined within the time-step interval

$$S_i : (t_{i-1}, t_i], \tag{3}$$

by

$$\hat D_{\mathrm{real}}\{X(t) : t \in S_i\} := (X_i - X_{i-1})^2, \qquad X_i := X(t_i), \quad t_i := i\Delta, \quad \Delta = \frac{T}{n}. \tag{4}$$
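As a numerical illustration of the spot values in (4), the following minimal sketch simulates one path of the Itô process (1) and evaluates $(X_i - X_{i-1})^2$ on $n$ equal time-steps; summing these spot values over the $n$ intervals gives an estimate of the integrated variance (2). The function names, the constant $\mu$ and $\sigma$, the fine Euler grid and the parameter values are assumptions made for the illustration only, not part of the paper.

```python
import numpy as np

def simulate_log_price(mu, sigma, T, n_fine, rng):
    """Euler path of dX = mu dt + sigma dW on a fine grid, X(0) = 0.

    Constant mu and sigma are an assumption of this sketch.
    """
    dt = T / n_fine
    increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_fine)
    return np.concatenate(([0.0], np.cumsum(increments)))

def spot_realized_variances(path, n):
    """Spot values (X_i - X_{i-1})^2 of eq. (4) on n equal time-steps."""
    n_fine = len(path) - 1
    if n_fine % n != 0:
        raise ValueError("n must divide the number of fine steps")
    closes = path[:: n_fine // n]          # X_0, X_1, ..., X_n
    return np.diff(closes) ** 2

rng = np.random.default_rng(0)
path = simulate_log_price(mu=0.0, sigma=0.2, T=1.0, n_fine=10_000, rng=rng)
spots = spot_realized_variances(path, n=100)
realized_variance = spots.sum()            # sum of the spot values
integrated_variance = 0.2 ** 2 * 1.0       # D(T) = sigma^2 T for constant sigma
print(realized_variance, integrated_variance)
```

With constant $\sigma$ the integrated variance is simply $\sigma^2 T$, so the printed pair gives a direct feel for the sampling error of the estimator at a finite number of time-steps.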

Definition 3

The well-known statistical estimator of the integrated variance is the so-called realized variance, defined as

$$[X, X]_T := \sum_{i=1}^{n} \hat D_{\mathrm{real}}\{X(t) : t \in S_i\}. \tag{5}$$

Remark 1

For Itô processes (1) and for $n \to \infty$, it is well known that the realized variance converges in probability to the integrated one.

However, for real data, the number $n$ of available data points is always limited, ultimately by the discreteness of the transaction flow and the associated microstructure noise. Such structures, which are not taken into account in the Itô log-price model, can be neglected in the use of the realized variance estimator if the discrete time step $\Delta$ is much larger than the inverse of the mean frequency $\nu$ of the tick-by-tick transactions, so that $n \ll \nu T$.

Assumption 1

While $\Delta \gg 1/\nu$, we assume that $\Delta$ is sufficiently small in comparison with the time scales over which the drift process $\mu(t)$ and the instantaneous variance $\sigma^2(t)$ vary, so that one may replace the original Itô process (1) by Wiener processes with drift

$$dX_i(t) \simeq \mu_i\,dt + \sigma_i\,dW(t), \qquad X_i(t_{i-1}) = X_{i-1}, \quad t \in S_i, \qquad \mu_i = \mathrm{const}, \quad \sigma_i = \mathrm{const}. \tag{6}$$

Consider the special case of the Wiener process with drift

$$X(t, \mu, \sigma) = \mu t + \sigma W(t). \tag{7}$$

Using the scale-invariance property of the Wiener process, the following identity holds in law (represented by the symbol $\sim$)

$$\hat D_{\mathrm{real}}\{X(t, \mu, \sigma) : t \in S_i\} \sim \sigma^2 \Delta\,[\gamma + W(1)]^2 = \sigma^2 \Delta \cdot X^2(1; \gamma), \tag{8}$$

where

$$X(t; \gamma) = \gamma t + W(t), \qquad \gamma = \frac{\mu}{\sigma}\sqrt{\Delta}, \qquad t \in (0, 1), \tag{9}$$

is the canonical Wiener process with drift. Applying the identity in law (8) to the realized variance expression (5), (4), we obtain

$$[X, X]_T \sim \Delta \sum_{i=1}^{n} \sigma_i^2 (\gamma_i + W_i)^2, \tag{10}$$

where $\{W_i\}$ are iid Gaussian variables $N(0, 1)$, so that the expected value of the realized variance is

$$\mathrm{E}\big[[X, X]_T\big] = \Delta \sum_{i=1}^{n} \sigma_i^2 (1 + \gamma_i^2), \qquad \gamma_i = \frac{\mu_i}{\sigma_i}\sqrt{\Delta}. \tag{11}$$

This recovers the well-known fact that the realized variance is in general biased for non-zero drift, and is unbiased only for zero drift ($\mu(t) \equiv 0$).

The essential idea of the present work is that it is possible to improve on the realized variance estimator of the integrated variance, for a fixed number $n \ll \nu T$ of time-steps with durations $\Delta$, by replacing it by

$$\hat D_{\mathrm{est}}(T) = \sum_{i=1}^{n} \hat D_{\mathrm{est}}\{X(t) : t \in S_i\}, \tag{12}$$

where the functional $\hat D_{\mathrm{est}}\{X(t) : t \in S_i\}$ is an improved estimator of the spot variance given by definition 2. The subscript est is used to refer to some particular estimator, and the subscript real means that this estimator reduces to the realized variance estimator.

Definition 4

The estimator $\hat D_{\mathrm{est}}(T)$ defined by (12) is said to be unbiased if, for all intervals $i = 1, \ldots, n$,

$$\mathrm{E}\big[\hat D_{\mathrm{est}}\{X(t) : t \in S_i\}\big] = \Delta \cdot \sigma_i^2, \tag{13}$$

which implies

$$\mathrm{E}\big[\hat D_{\mathrm{est}}(T)\big] = \Delta \sum_{i=1}^{n} \sigma_i^2. \tag{14}$$

When there exists at least one interval $j$ such that condition (13) does not hold, the estimator is considered biased.

Let $\hat D_{\mathrm{est}}(T)$ be some unbiased variance estimator. We propose to quantify its efficiency in terms of the coefficient of variation

$$\rho[\hat D_{\mathrm{est}}(T)] = \frac{\sqrt{\mathrm{Var}[\hat D_{\mathrm{est}}(T)]}}{\mathrm{E}[\hat D_{\mathrm{est}}(T)]}. \tag{15}$$

As an illustration, the coefficient of variation of the realized variance for a Wiener process with zero drift ($\mu(t) \equiv 0$) is equal to

$$\rho\big[[X, X]_T\big] = \sqrt{2 \sum_{i=1}^{n} \sigma_i^4} \Bigg/ \sum_{i=1}^{n} \sigma_i^2. \tag{16}$$

We will need the following theorem:

Theorem 2.1

The lower bound of the function

$$f(s) := \sqrt{\sum_{i=1}^{n} s_i^2} \Bigg/ \sum_{i=1}^{n} s_i, \qquad s = \{s_1, s_2, \ldots, s_n\}, \quad \forall s_i > 0, \tag{17}$$

is equal to

$$\rho(n) := \inf_{\forall s_i > 0} f(s) = \frac{1}{\sqrt{n}}. \tag{18}$$

This lower bound is attained iff all $s_i$ are identical: $s_i \equiv s > 0$.

Proof. Let $\{s_i\}$ be a realization of some random variable $S$ with probabilities $\Pr\{S = s_i\} = 1/n$, $i = 1, \ldots, n$. The expected and mean square values of the random variable $S$ are equal to

$$\mathrm{E}[S] = \frac{1}{n} \sum_{i=1}^{n} s_i, \qquad \mathrm{E}[S^2] = \frac{1}{n} \sum_{i=1}^{n} s_i^2. \tag{19}$$

Since, for any random variable $S$, the inequality $\sqrt{\mathrm{E}[S^2]} \geq \mathrm{E}[S]$ holds, this implies $f(s) \geq 1/\sqrt{n}$. The inequality becomes an equality iff all $s_i \equiv s$ for some $s > 0$. ∎

Applying this theorem to the right-hand side of expression (16) shows that $\rho[[X, X]_T]$ satisfies the inequality

$$\rho\big[[X, X]_T\big] \geq \rho_{\mathrm{real}}(n), \qquad \rho_{\mathrm{real}}(n) = \sqrt{\frac{2}{n}}, \tag{20}$$

where the lower bound $\rho_{\mathrm{real}}(n)$ of the efficiency is attained only if all $\{\sigma_i\}$ are identical.

Below, we will compare the efficiencies of different estimators via the comparison of their lower bounds

$$\rho_{\mathrm{est}}(n) = \inf_{\forall \sigma_i} \rho[\hat D_{\mathrm{est}}(T)]. \tag{21}$$

An important motivation for the introduction of a new class of so-called “realized bridge variance estimators” is to obtain much reduced biases compared with that of the realized variance (5) observed for nonzero drifts $\mu(t) \neq 0$.

Definition 5

The bridge $Y(t, S_i)$ in discrete time steps of the original process $X(t)$ is defined by

$$Y(t, S_i) := X(t) - X_{i-1} - \frac{t - t_{i-1}}{\Delta}\,(X_i - X_{i-1}), \qquad t \in S_i, \tag{22}$$

where $X_i := X(t_i)$, $t_i := i\Delta$ and $\Delta = T/n$.

As an example, let $X(t)$ be the Wiener process with drift $X(t, \mu, \sigma)$ defined by (7). Using the translation and scale invariance properties of the Wiener process leads to

$$Y(t, S_i) \sim \sigma \sqrt{\Delta}\,\big(W(\zeta) - \zeta W(1)\big), \qquad \zeta = \frac{t - t_{i-1}}{\Delta} \in (0, 1). \tag{23}$$

This means that the bridge $Y(t, S_i)$ (22) is identical in law to

$$Y(t, S_i) \sim \sigma \sqrt{\Delta}\; Y(\zeta), \tag{24}$$

where

$$Y(t) := W(t) - t \cdot W(1), \qquad t \in (0, 1), \tag{25}$$

is the canonical bridge whose basic properties are given in Appendix A.

Remark 2

The canonical bridge $Y(t)$ is completely independent of the drift $\mu$. This property is the fundamental reason for the better performance of the bridge variance estimators compared with the realized variance: the biases and efficiencies of bridge variance estimators do not depend on the drift $\mu$.

In the following, we explore the statistical properties of the bridge variance estimators

$$\hat D_{\mathrm{est}}(T) = \sum_{i=1}^{n} \hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\}, \tag{26}$$

obtained from the general expression (12) by replacing the initial process $X(t)$ by its corresponding bridge $Y(t, S_i)$.

Definition 6

The estimator (26) is called homogeneous if, when applied to the Wiener processes with drift (6), the following identity in law holds

$$\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat d_{\mathrm{est}}, \tag{27}$$

where

$$\hat d_{\mathrm{est}} := \hat D_{\mathrm{est}}\{Y(t) : t \in (0, 1)\} \tag{28}$$

is the canonical estimator of the spot variance depending on the canonical bridge $Y(t)$ (25). Obviously, the estimator (26) is unbiased if and only if $\mathrm{E}[\hat d_{\mathrm{est}}] = 1$.

Theorem 3.2

Under Assumption 1, the lower bound of the efficiency of the unbiased homogeneous integrated bridge variance estimator (27) is

$$\rho_{\mathrm{est}}(n) = \sqrt{\mathrm{Var}[\hat d_{\mathrm{est}}] \big/ n}, \tag{29}$$

where $\mathrm{Var}[\hat d_{\mathrm{est}}]$ is the variance of the canonical spot variance estimator $\hat d_{\mathrm{est}}$ (28).

Proof. Under Assumption 1, the unbiased homogeneous bridge variance estimator (26) is identical in law to

$$\hat D_{\mathrm{est}}(T) \sim \Delta \sum_{i=1}^{n} \sigma_i^2 \cdot \hat d^{\,i}_{\mathrm{est}}, \tag{30}$$

where $\{\hat d^{\,i}_{\mathrm{est}}\}$ are iid random variables with mean value $\mathrm{E}[\hat d_{\mathrm{est}}] = 1$ and variance $\mathrm{Var}[\hat d_{\mathrm{est}}]$. Accordingly, the expected value and variance of the unbiased bridge variance estimator (26) are equal to

$$\mathrm{E}[\hat D_{\mathrm{est}}(T)] = \Delta \sum_{i=1}^{n} \sigma_i^2, \qquad \mathrm{Var}[\hat D_{\mathrm{est}}(T)] = \Delta^2\, \mathrm{Var}[\hat d_{\mathrm{est}}] \sum_{i=1}^{n} \sigma_i^4. \tag{31}$$

Substituting these relations into (15), we obtain

$$\rho[\hat D_{\mathrm{est}}(T)] = \sqrt{\mathrm{Var}[\hat d_{\mathrm{est}}] \sum_{i=1}^{n} \sigma_i^4} \Bigg/ \sum_{i=1}^{n} \sigma_i^2. \tag{32}$$

Using theorem 2.1, this yields the result (29). ∎

Our first example of a homogeneous bridge variance estimator is

$$\hat D_{\mathrm{simple}}(T) = \sum_{i=1}^{n} \hat D_{\mathrm{simple}}\{Y(t, S_i) : t \in S_i\}, \tag{33}$$

where the estimator of the spot variance is given by

$$\hat D_{\mathrm{simple}}\{Y(t, S_i) : t \in S_i\} = A\,Y^2(t_i(\eta), S_i), \qquad t_i(\eta) = t_{i-1} + \eta \cdot \Delta, \quad \eta \in (0, 1), \tag{34}$$

and $A$ is a normalizing factor. The estimator $\hat D_{\mathrm{simple}}(T)$ is homogeneous and, if relations (6) are valid, then

$$Y^2(t_i(\eta), S_i) \sim \sigma_i^2 \Delta \cdot Y_i^2(\eta), \tag{35}$$

where $\{Y_i(\eta)\}$ are iid random variables that are identical in law to the canonical bridge (25). Substituting relation (35) into (33) leads to the identity in law

$$\hat D_{\mathrm{simple}}(T) \sim A \Delta \sum_{i=1}^{n} \sigma_i^2\, Y_i^2(\eta). \tag{36}$$

The fact that the canonical bridge $Y(\eta)$ is Gaussian with zero mean and variance $\mathrm{E}[Y^2(\eta)] = \eta(1 - \eta)$ implies that the estimator (33) is unbiased in the sense of definition 4 if $A = 1/[\eta(1 - \eta)]$. Accordingly, the variance of the estimator (33) is equal to the variance of the realized variance obtained for zero drift ($\mu(t) \equiv 0$):

$$\mathrm{Var}[\hat D_{\mathrm{simple}}(T)] = 2\Delta^2 \sum_{i=1}^{n} \sigma_i^4. \tag{37}$$

This result means that the lower bound of the efficiency of the simplest bridge estimator (33) is equal to the lower bound of the efficiency of the realized variance estimator at zero drift:

$$\rho_{\mathrm{simple}}(n) = \rho_{\mathrm{real}}(n) = \sqrt{\frac{2}{n}}. \tag{38}$$

The shortcoming of the estimator (33) is that it is actually less efficient than the realized variance at zero drift, in a sense discussed below.

Definition 7

Let the estimator of the spot variance $\hat D_{\mathrm{est}}\{X(t) : t \in S_i\}$ or $\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\}$ depend on $\kappa_{\mathrm{est}}$ values of the process $X(t)$ or $Y(t, S_i)$ at $\kappa_{\mathrm{est}}$ time instants within the time interval $t \in S_i$. The corresponding estimators of the realized volatility $\hat D_{\mathrm{est}}(T)$ (12) or (33) then use a total number $n_{\mathrm{eff}} = \kappa_{\mathrm{est}} \cdot n$ of time-steps.

Example 1

The realized variance corresponds to $\kappa_{\mathrm{real}} = 1$. Indeed, the two values $\{X_{i-1}, X_i\}$ are used to estimate the spot realized variance (4), and the first value is excluded from the semi-closed interval $S_i$ (3).

Example 2

For the simplest bridge estimator (33) with (34), $\kappa_{\mathrm{simple}} = 2$. Indeed, the estimator (34) depends on the bridge $Y(t_i(\eta), S_i)$ for $t_i(\eta) \in S_i$, and $Y(t_i(\eta), S_i)$ (22) is defined by the open and close values $\{X_{i-1}, X_i\}$ of the original stochastic process $X(t)$. Excluding the open value, this yields $\kappa_{\mathrm{simple}} = 2$.

Example 3

Consider the Garman & Klass (G&K) variance estimator based on open, high, low and close prices, used as the spot variance estimator in expression (12):

$$\hat D_{\mathrm{GK}}\{X(t) : t \in S_i\} = k_1 (H_i - L_i)^2 - k_2 \big(C_i (H_i + L_i) - 2 H_i L_i\big) - k_3 C_i^2, \qquad k_1 = 0.511, \quad k_2 = 0.019, \quad k_3 = 0.383, \tag{39}$$

where $\{O_i, C_i, H_i, L_i\}$ are the open, close, high and low values, the last three being measured relative to the open:

$$O_i = X_{i-1}, \qquad C_i = X_i - O_i, \qquad H_i = \sup_{t \in S_i} [X(t) - O_i], \qquad L_i = \inf_{t \in S_i} [X(t) - O_i].$$

Excluding the open value leads to $\kappa_{\mathrm{GK}} = 3$.

Definition 8

We characterize the efficiencies of the novel variance estimators by comparing them with that of the standard realized variance estimator. The corresponding comparative efficiency $R_{\mathrm{est}}$ is constructed as the ratio of the lower bounds of the efficiencies of the realized variance and of the novel variance estimator:

$$R_{\mathrm{est}} = \frac{\rho_{\mathrm{real}}(\kappa_{\mathrm{est}} \cdot n)}{\rho_{\mathrm{est}}(n)}. \tag{40}$$

Substituting into this expression $\rho_{\mathrm{real}}(n)$ given by equation (20) and $\rho_{\mathrm{est}}(n)$ given by expression (29) yields

$$R_{\mathrm{est}} = \sqrt{\frac{2}{\kappa_{\mathrm{est}} \cdot \mathrm{Var}[\hat d_{\mathrm{est}}]}}. \tag{41}$$

Remark 3

For a given duration $T$ used to define the integrated variance (2), relation (41) takes into account that the typical waiting time between successive data samples is given by $\Delta_{\mathrm{eff}} \simeq T / n_{\mathrm{eff}}$. Such waiting time should be approximately the same for the different generalized variance estimators proposed below, leading to similar distortions to the adequacy of the Itô process (1) in its ability to describe the real price process in the presence of discrete tick-by-tick and other microstructure noise.

Example 4

Let us come back to the simple variance estimator based on expression (34) for $\hat D_{\mathrm{simple}}\{Y(t, S_i) : t \in S_i\}$. The result (38) is equivalent to $\mathrm{Var}[\hat d_{\mathrm{simple}}] = 2$. Substituting this value in (41) yields

$$R_{\mathrm{simple}} = \frac{1}{\sqrt{\kappa_{\mathrm{simple}}}} = \frac{1}{\sqrt{2}} \simeq 0.707. \tag{42}$$

The efficiency of the simplest bridge estimator is smaller than that of the realized variance.

Example 5

Let us evaluate the comparative efficiency of the generalized realized variance estimator based on the spot G&K variance estimator in the case of zero drift $\mu(t) \equiv 0$. It is known that the variance of the spot G&K variance estimator given by (39) is equal to

$$\mathrm{Var}\big[\hat D_{\mathrm{GK}}\{X(t) : t \in S_i\}\big] = \sigma_i^4 \Delta^2 \cdot 0.27 \quad \Rightarrow \quad \mathrm{Var}[\hat d_{\mathrm{GK}}] = 0.27. \tag{43}$$

This gives

$$R_{\mathrm{GK}} = \sqrt{\frac{2}{\kappa_{\mathrm{GK}} \cdot 0.27}} = \sqrt{\frac{2}{3 \cdot 0.27}} \simeq 1.57. \tag{44}$$

Therefore, for zero drift, the G&K realized variance estimator is approximately 1.6 times more efficient than the realized variance estimator.

The fact that the G&K realized variance estimator based on open-high-low-close prices is significantly more efficient than the standard realized variance, at least for the Itô process $X(t)$ (1) with zero drift $\mu(t) \equiv 0$, suggests studying other estimators using different combinations of the open-high-low-close prices. Let us start by analyzing the simplest case of what we will refer to as the “high bridge variance estimator”, defined through its spot variance given by

$$\hat D_{\mathrm{high}}\{Y(t, S_i) : t \in S_i\} = A \cdot H_i^2, \tag{45}$$

where $A$ is a normalizing factor and

$$H_i = \sup_{t \in S_i} Y(t, S_i) \tag{46}$$

is the high value of the bridge $Y(t, S_i)$. Note that we use here the same notation for the high value of the bridge $Y(t, S_i)$ as for that of the original process $X(t)$, hoping that this will not give rise to any confusion.

It follows from (24) that

$$\hat D_{\mathrm{high}}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat d_{\mathrm{high}}, \qquad \hat d_{\mathrm{high}} = A H^2, \tag{47}$$

where the high value $H$ of the canonical bridge $Y(t)$ (25) has the following probability density function (pdf)

$$\varphi_{\mathrm{high}}(h) = 4 h\, e^{-2h^2}, \qquad h > 0. \tag{48}$$

The derivation of the pdf (48) is given in Jeanblanc et al. (2009) (see also the derivations presented in Appendix B). Accordingly, the expected value and the variance of the square of $H$ are given by

$$\mathrm{E}[H^2] = \frac{1}{2}, \qquad \mathrm{Var}[H^2] = \frac{1}{4}. \tag{49}$$

In order for the high spot bridge variance estimator to be unbiased, we have to choose in (45) the value $A = 2$ for the normalizing factor. This gives $\mathrm{Var}[\hat d_{\mathrm{high}}] = 1$. With $\kappa_{\mathrm{high}} = 2$, we find that the comparative efficiency (41) of the high bridge realized variance estimator is $R_{\mathrm{high}} = 1$. Thus, the high bridge realized variance estimator has the same efficiency as the standard realized variance. But the advantage of the former is that, under Assumption 1, it is unbiased for any drift $\mu(t) \neq 0$.

Remark 4

Let us give the intuition for the above result, obtained despite the larger value of $\kappa_{\mathrm{high}} = 2$ compared to $\kappa_{\mathrm{real}} = 1$. The reason is that the pdf of the random variable $2H^2$ is narrower than that of the random variable $W^2$ defining the spot realized variance at zero drift. The same reason underlies the comparative efficiency of the G&K as well as the other high and low bridge realized variance estimators discussed below. The narrowness of the pdfs of highs and lows compared with the pdfs of the increments of the original stochastic process $X(t)$ results from a weak version of the Law of Large Numbers, in the sense that the highs and lows incorporate significant additional information about the underlying process within a given time-step, thus leading to narrower pdfs.

We now introduce a novel ingredient to improve further the estimation of the variance. In addition to using the high $H_i$ of the bridge $Y(t, S_i)$, we also assume that the time $t^i_{\mathrm{high}}$ of the occurrence of this high is recorded:

$$t^i_{\mathrm{high}} : \quad H_i = Y(t^i_{\mathrm{high}}, S_i). \tag{50}$$

The corresponding time-high bridge spot variance estimator is given by

$$\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\} = A \cdot s\!\left(\frac{t^i_{\mathrm{high}} - t_{i-1}}{\Delta}\right) \cdot H_i^2, \tag{51}$$

where $A$ is a normalizing factor, while $s(t)$, $t \in (0, 1)$, is some function that remains to be determined so as to make the above spot variance estimator as efficient as possible. Before providing the solution of this problem, let us note that the following identity in law follows from (24):

$$\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat d_{\mathrm{est}}, \tag{52}$$

where

$$\hat d_{\mathrm{est}} = A \cdot s(t_{\mathrm{high}}) \cdot H^2 \tag{53}$$

is the canonical time-high bridge estimator of the spot variance, $H$ is the high value of the canonical bridge $Y(t)$ (25), and $t_{\mathrm{high}}$ is the corresponding time-point (50).

The expected value of the canonical estimator (53) is equal to

$$\mathrm{E}[\hat d_{\mathrm{est}}] = A \int_0^1 s(t)\, \alpha(t; 2)\, dt, \qquad \alpha(t; \lambda) := \int_0^\infty h^\lambda\, \varphi_{\mathrm{high}}(h, t)\, dh, \tag{54}$$

where $\varphi_{\mathrm{high}}(h, t)$ is the joint pdf of $H$ and $t_{\mathrm{high}}$. Taking

$$A = 1 \Bigg/ \int_0^1 s(t)\, \alpha(t; 2)\, dt, \tag{55}$$

we obtain an unbiased time-high canonical bridge estimator:

$$\hat d_{\mathrm{est}} = \frac{s(t_{\mathrm{high}})\, H^2}{\int_0^1 s(t)\, \alpha(t; 2)\, dt}. \tag{56}$$

Its variance is

$$\mathrm{Var}[\hat d_{\mathrm{est}}] = \frac{\int_0^1 s^2(t)\, \alpha(t; 4)\, dt}{\left(\int_0^1 s(t)\, \alpha(t; 2)\, dt\right)^2} - 1. \tag{57}$$

Theorem 3.3

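The function $s(t)$ that minimizes the variance (57) of the unbiased time-high canonical bridge estimator (56) is

$$s_{\text{t-high}}(t) = \frac{\alpha(t; 2)}{\alpha(t; 4)}. \tag{58}$$

The corresponding minimal variance is equal to

$$\mathrm{Var}\!\left[\frac{s_{\text{t-high}}(t_{\mathrm{high}})\, H^2}{\int_0^1 s_{\text{t-high}}(t)\, \alpha(t; 2)\, dt}\right] = \inf_{\forall s(t)} \mathrm{Var}[\hat d_{\mathrm{est}}] = \frac{1}{E_{\text{t-high}}} - 1, \qquad E_{\text{t-high}} = \int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\, dt. \tag{59}$$

Proof. We use the Schwarz inequality

$$\left(\int A(t) B(t)\, dt\right)^2 \leq \int A^2(t)\, dt \int B^2(t)\, dt \tag{60}$$

with

$$A(t) = s(t) \sqrt{\alpha(t; 4)}, \qquad B(t) = \frac{\alpha(t; 2)}{\sqrt{\alpha(t; 4)}}, \tag{61}$$

to obtain

$$\left(\int_0^1 s(t)\, \alpha(t; 2)\, dt\right)^2 \leq \int_0^1 s^2(t)\, \alpha(t; 4)\, dt \int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\, dt. \tag{62}$$

After simple transformations, we rewrite the last inequality in the form

$$\mathrm{Var}[\hat d_{\text{t-high}}] = \frac{\int_0^1 s^2(t)\, \alpha(t; 4)\, dt}{\left(\int_0^1 s(t)\, \alpha(t; 2)\, dt\right)^2} - 1 \geq \frac{1}{\int_0^1 \frac{\alpha^2(t; 2)}{\alpha(t; 4)}\, dt} - 1. \tag{63}$$

The equality in (63) is reached by substituting in it $s(t) = s_{\text{t-high}}(t)$ given by expression (58). ∎

The joint pdf of $H$ and $t_{\mathrm{high}}$ is derived in Appendix B and reads

$$\varphi_{\mathrm{high}}(h, t) = \sqrt{\frac{2}{\pi}}\, \frac{h^2}{\sqrt{t^3 (1 - t)^3}}\, \exp\!\left(-\frac{h^2}{2 t (1 - t)}\right), \qquad h > 0, \quad t \in (0, 1). \tag{64}$$

Substituting this expression for $\varphi_{\mathrm{high}}(h, t)$ into (54) yields

$$\alpha(t; \lambda) = \frac{2}{\sqrt{\pi}}\, [2 t (1 - t)]^{\lambda/2}\, \Gamma\!\left(\frac{3 + \lambda}{2}\right), \tag{65}$$

so that $\alpha(t; 2) = 3 t (1 - t)$ and $\alpha(t; 4) = 15 t^2 (1 - t)^2$. Therefore,

$$s_{\text{t-high}}(t) = \frac{1}{5 t (1 - t)}, \qquad E_{\text{t-high}} = \frac{3}{5} \quad \Rightarrow \quad \mathrm{Var}[\hat d_{\text{t-high}}] = \frac{2}{3}, \tag{66}$$

and

$$R_{\text{t-high}} = \sqrt{\frac{3}{2}} \simeq 1.225. \tag{67}$$

Thus, the time-high bridge realized variance estimator is less efficient than the corresponding G&K estimator at zero drift, but is more efficient than the realized variance.

Remark 5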

The numerical result (67) takes into account that the use of $t^i_{\mathrm{high}}$ does not increase the number of sample values used in the spot estimator (51). Thus, $\kappa_{\text{t-high}} = \kappa_{\mathrm{high}} = 2$.

Bridge time-high-low estimators

Definition 9

The bridge realized variance estimator (26) that uses as spot variance estimator

$$\hat D_{\mathrm{bPark}}\{Y(t, S_i) : t \in S_i\} = A \cdot (H_i - L_i)^2 \tag{68}$$

is called the bridge Parkinson estimator. In expression (68), $H_i$ and $L_i$ are the high and low values of the bridges $Y(t, S_i)$ (22).

The bridge Parkinson estimator is identical in law to

$$\hat D_{\mathrm{bPark}}\{Y(t, S_i) : t \in S_i\} \sim \sigma_i^2 \Delta \cdot \hat d_{\mathrm{bPark}}, \qquad \hat d_{\mathrm{bPark}} = A \cdot (H - L)^2, \tag{69}$$

where $H$, $L$ are the high and low values of the canonical bridge $Y(t)$ (25). The joint pdf of $H$ and $L$ has been derived by Saichev et al. (2009) and reads

$$\varphi(h, \ell) = \sum_{m = -\infty}^{\infty} m \big[ m\, I(m(h - \ell)) + (1 - m)\, I(m(h - \ell) + \ell) \big], \qquad I(h) = 4 (4h^2 - 1)\, e^{-2h^2}. \tag{70}$$

It will be clear below that it is convenient to describe the joint statistical properties of the high $H$ and low $L$ by using polar coordinates

$$H = R \cos\Theta, \qquad L = R \sin\Theta, \qquad R \in (0, \infty), \quad \Theta \in \left(-\frac{\pi}{2}, 0\right). \tag{71}$$

Accordingly, we rewrite the canonical estimator (69) in the form

$$\hat d_{\mathrm{bPark}} = A R^2 (1 - \sin 2\Theta). \tag{72}$$

Choosing the constant $A$ that makes the estimator (69) unbiased, we obtain

$$\hat d_{\mathrm{bPark}} = \frac{R^2 (1 - \sin 2\Theta)}{\int_{-\pi/2}^{0} (1 - \sin 2\theta)\, \alpha(\theta; 2)\, d\theta}, \qquad \alpha(\theta; \lambda) = \int_0^\infty r^{\lambda + 1}\, \varphi(r \cos\theta, r \sin\theta)\, dr. \tag{73}$$

Substituting expression (70) yields

$$\alpha(\theta; \lambda) = \sum_{m = -\infty}^{\infty} m \big[ m\, \beta(m(\cos\theta - \sin\theta); \lambda) + (1 - m)\, \beta(m(\cos\theta - \sin\theta) + \sin\theta; \lambda) \big],$$

$$\beta(y; \lambda) = \frac{C(\lambda)}{|y|^{\lambda + 2}}, \qquad C(\lambda) = \frac{1 + \lambda}{\sqrt{2^\lambda}}\, \Gamma\!\left(1 + \frac{\lambda}{2}\right). \tag{74}$$

The variance of the canonical bridge Parkinson estimator is equal to

$$\mathrm{Var}[\hat d_{\mathrm{bPark}}] = \frac{\int_{-\pi/2}^{0} (1 - \sin 2\theta)^2\, \alpha(\theta; 4)\, d\theta}{\left(\int_{-\pi/2}^{0} (1 - \sin 2\theta)\, \alpha(\theta; 2)\, d\theta\right)^2} - 1 \simeq 0.2. \tag{75}$$

Substituting this value into (41) and taking into account that $\kappa_{\mathrm{bPark}} = 3$ for the bridge Parkinson estimator, we obtain the comparative efficiency

$$\mathrm{Var}[\hat d_{\mathrm{bPark}}] = 0.2, \quad \kappa_{\mathrm{bPark}} = 3 \quad \Rightarrow \quad R_{\mathrm{bPark}} \simeq 1.83, \tag{76}$$

which means that the bridge Parkinson estimator is significantly more efficient than the G&K estimator at zero drift.

Remark 6

We stress that the canonical estimator $\hat d_{\mathrm{bPark}}$ is significantly different from the well-known canonical Parkinson estimator (see Parkinson (1980))

$$\hat d_{\mathrm{Park}} = \frac{(H - L)^2}{4 \ln 2}, \tag{77}$$

where $H$ and $L$ are the high and low values of the canonical Wiener process with drift $X(t; \gamma)$ (9). In contrast with the bridge Parkinson estimator (73), which is unbiased for any $\gamma$, the standard Parkinson estimator is biased at nonzero drift. Moreover, the variance of the standard Parkinson estimator at zero drift is

$$\mathrm{Var}[\hat d_{\mathrm{Park}}] \simeq 0.41, \tag{78}$$

which is approximately twice the variance of the bridge Parkinson estimator (76).

Until now, we have considered homogeneous (in the sense of definition 6) high-low estimators that are quadratic functions of the high and low values. We now consider the more general class of homogeneous estimators, whose spot variance estimators have the form

$$\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i\} = D_{\mathrm{est}}(H_i, L_i), \tag{79}$$

where $D_{\mathrm{est}}(h, \ell)$ is an arbitrary homogeneous function of second order.

Example 6

To illustrate the notion of non-quadratic homogeneous functions of second order, consider the typical example

$$D_{\mathrm{est}}(H_i, L_i) = (H_i - L_i) \sqrt{H_i^2 + L_i^2}, \tag{80}$$

which satisfies the scaling property

$$D_{\mathrm{est}}(\delta \cdot H_i, \delta \cdot L_i) \equiv \delta^2 \cdot D_{\mathrm{est}}(H_i, L_i) \qquad \forall \delta > 0. \tag{81}$$

The following theorem states that the spot variance estimator (79) satisfies the relations (27), (28) of definition 6 for homogeneous estimators.

Theorem 4.4

The spot variance estimator (79) is homogeneous in the sense of definition 6.

Proof.

Let $H_i$ and $L_i$ be the high and low values of the bridge $Y(t, S_i)$. Due to relation (24) and Assumption 1, the following identity in law holds

$$\{H_i, L_i\} \sim \sigma_i \sqrt{\Delta} \cdot \{H, L\}, \tag{82}$$

where $\{H, L\}$ are the high and low values of the canonical bridge $Y(t)$ (25). Substituting this last relation into (79) yields

$$D_{\mathrm{est}}(H_i, L_i) \sim D_{\mathrm{est}}(\sigma_i \sqrt{\Delta}\, H, \sigma_i \sqrt{\Delta}\, L). \tag{83}$$

Using the homogeneity of the function $D_{\mathrm{est}}(h, \ell)$, we rewrite the previous relation in the form

$$D_{\mathrm{est}}(H_i, L_i) \sim \sigma_i^2 \Delta \cdot D_{\mathrm{est}}(H, L), \tag{84}$$

which is analogous to expression (27), where the canonical estimator of the spot variance is equal to

$$\hat d_{\mathrm{est}} = D_{\mathrm{est}}(H, L). \tag{85}$$

∎

Using the polar coordinates (71), the canonical estimator $\hat d_{\mathrm{est}}$ reads

$$\hat d_{\mathrm{est}} = D_{\mathrm{est}}(R \cos\Theta, R \sin\Theta). \tag{86}$$

Using the homogeneity of the function $D_{\mathrm{est}}$, we obtain

$$\hat d_{\mathrm{est}} = R^2 \cdot s(\Theta), \qquad s(\theta) = D_{\mathrm{est}}(\cos\theta, \sin\theta). \tag{87}$$

Its expected value is equal to

$$\mathrm{E}[\hat d_{\mathrm{est}}] = \int_{-\pi/2}^{0} s(\theta)\, \alpha(\theta; 2)\, d\theta, \tag{88}$$

where the function $\alpha(\theta; \lambda)$ is given by the equality (73). Thus, the unbiased homogeneous non-quadratic canonical estimator reads

$$\hat d_{\mathrm{est}} = \frac{R^2 s(\Theta)}{\int_{-\pi/2}^{0} s(\theta)\, \alpha(\theta; 2)\, d\theta} \quad \Rightarrow \quad \mathrm{E}[\hat d_{\mathrm{est}}] = 1. \tag{89}$$

Accordingly, the variance of the unbiased estimator is equal to

$$\mathrm{Var}[\hat d_{\mathrm{est}}] = \frac{\int_{-\pi/2}^{0} s^2(\theta)\, \alpha(\theta; 4)\, d\theta}{\left(\int_{-\pi/2}^{0} s(\theta)\, \alpha(\theta; 2)\, d\theta\right)^2} - 1. \tag{90}$$

One can easily prove the result analogous to theorem 3.3 that the minimum value of the variance (90) of the canonical estimator (89) with respect to all possible functions $s(\theta)$ is given by

$$\mathrm{Var}[\hat d_{\mathrm{me}}] = \inf_{\forall s(\theta)} \mathrm{Var}[\hat d_{\mathrm{est}}] = \frac{1}{E_{\mathrm{me}}} - 1, \qquad E_{\mathrm{me}} = \int_{-\pi/2}^{0} \frac{\alpha^2(\theta; 2)}{\alpha(\theta; 4)}\, d\theta, \tag{91}$$

where $\hat d_{\mathrm{est}}$ is an arbitrary homogeneous canonical estimator of the form (89), while $\hat d_{\mathrm{me}}$ is the corresponding most efficient estimator given by

$$\hat d_{\mathrm{me}} = \frac{1}{E_{\mathrm{me}}}\, R^2 s_{\mathrm{me}}(\Theta), \qquad s_{\mathrm{me}}(\theta) = \frac{\alpha(\theta; 2)}{\alpha(\theta; 4)}. \tag{92}$$

Calculating the numerical value of the integral in expression (91) yields

$$\mathrm{Var}[\hat d_{\mathrm{me}}] = 0.\ldots, \quad \kappa_{\mathrm{me}} = 3 \quad \Rightarrow \quad R_{\mathrm{me}} \simeq \ldots, \tag{93}$$

which shows a high efficiency compared with the standard realized variance.

Let us consider the unbiased homogeneous time-high-low canonical estimator

$$\hat d_{\mathrm{est}} = \frac{R^2 s(\Theta, t_{\mathrm{last}})}{\int_0^1 dt \int_{-\pi/2}^{0} d\theta\; s(\theta, t)\, \alpha_{\mathrm{last}}(\theta, t; 2)}, \tag{94}$$

where $s(\theta, t)$ is an arbitrary function, $t_{\mathrm{last}} = \sup\{t_L, t_H\}$ is the larger of the two times at which the high and low values of the canonical bridge occur, and $\alpha_{\mathrm{last}}(\theta, t; \lambda)$ is given by (C.17) in Appendix C.3.

It is easy to prove the result analogous to theorem 3.3 that the most efficient estimator of the form (94) is

$$\hat d_{\text{t-me}} = \frac{R^2}{E_{\text{t-me}}}\, \frac{\alpha_{\mathrm{last}}(\Theta, t_{\mathrm{last}}; 2)}{\alpha_{\mathrm{last}}(\Theta, t_{\mathrm{last}}; 4)}, \qquad E_{\text{t-me}} = \int_0^1 dt \int_{-\pi/2}^{0} d\theta\; \frac{\alpha_{\mathrm{last}}^2(\theta, t; 2)}{\alpha_{\mathrm{last}}(\theta, t; 4)}, \tag{95}$$

and the variance of this estimator is equal to

$$\mathrm{Var}[\hat d_{\text{t-me}}] = \frac{1}{E_{\text{t-me}}} - 1. \tag{96}$$

The numerical calculation of $E_{\text{t-me}}$ gives

$$\mathrm{Var}[\hat d_{\text{t-me}}] \simeq \ldots, \quad \kappa_{\text{t-me}} = 3 \quad \Rightarrow \quad R_{\text{t-me}} \simeq \ldots. \tag{97}$$

The estimator of the realized variance based on the canonical estimator (95) is significantly more efficient than that based on the G&K estimator at zero drift.

Until now, we have not used explicitly the information contained in the close values $X_i$ (4) of the time-step intervals $S_i$ (3). The close values $X_i$ have been used only for the construction of the bridge $Y(t, S_i)$ (22). It seems plausible that taking into account explicitly the close values $X_i$ in the construction of spot variance estimators may produce bridge realized variance estimators $\hat D_{\mathrm{est}}(T) = \sum_{i=1}^{n} \hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i; X_i\}$ that are even more efficient than those considered until now. We show that this is indeed the case by studying the example associated with the spot variance estimator given by

$$\hat D_{\mathrm{est}}\{Y(t, S_i) : t \in S_i; X_i\} = D_{\mathrm{est}}(H_i, L_i, X_i), \tag{98}$$

where $D_{\mathrm{est}}(h, \ell, x)$ is an arbitrary homogeneous function satisfying relation (81).
Due to its homogeneity, the following identity in law holds true

$$D_{\text{est}}(H_i, L_i, X_i) \sim \sigma_i^2 \Delta \cdot \hat d_{\text{est}}, \qquad \hat d_{\text{est}} = D_{\text{est}}(H, L, X), \tag{99}$$

where $H$ and $L$ are the high and low values of the canonical bridge (25), while $X = \gamma + W$ is the close value of the underlying canonical Wiener process with drift (9). It is known (see, for instance, Jeanblanc et al. (2009)) that the canonical bridge $Y(t)$ and $W$ are statistically independent. Thus, the joint pdf $\varphi(h, \ell, x; \gamma)$ of the three random variables $\{H, L, X\}$ is equal to

$$\varphi(h, \ell, x; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x - \gamma)^2}{2} \right) \varphi(h, \ell), \tag{100}$$

where the joint pdf $\varphi(h, \ell)$ of the high and low values is given by expression (70).

Analogously to (86), it is convenient to represent the canonical estimator $\hat d_{\text{est}}$ (99) in the spherical coordinate system

$$H = R \cos\Upsilon \cos\Theta, \quad L = R \cos\Upsilon \sin\Theta, \quad X = R \sin\Upsilon, \qquad \Upsilon \in (-\pi/2, \pi/2), \quad \Theta \in (-\pi/2, 0). \tag{101}$$

The canonical estimator $\hat d_{\text{est}}$ (99) then takes the form

$$\hat d_{\text{est}} = R^2 s(\Theta, \Upsilon), \tag{102}$$

where

$$s(\Theta, \Upsilon) = D_{\text{est}}(\cos\Upsilon \cos\Theta, \cos\Upsilon \sin\Theta, \sin\Upsilon). \tag{103}$$

Analogously to (94) and (95), the unbiased most efficient high-low-close canonical estimator is given by

$$\hat d_{\text{me-x}} = \frac{1}{E_{\text{me-x}}} R^2 s_{\text{me-x}}(\Theta, \Upsilon; \gamma), \qquad s_{\text{me-x}}(\theta, \upsilon; \gamma) = \frac{\alpha(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{104}$$

The function $\alpha(\theta, \upsilon; \lambda; \gamma)$ is defined by the equality

$$\alpha(\theta, \upsilon; \lambda; \gamma) = \int_0^\infty r^{\lambda + 2} \varphi(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon; \gamma)\, dr. \tag{105}$$

The variance of the most efficient canonical estimator $\hat d_{\text{me-x}}$ is equal to

$$\mathrm{Var}\big[\hat d_{\text{me-x}}\big] = \frac{1}{E_{\text{me-x}}} - 1, \tag{106}$$

with

$$E_{\text{me-x}} = \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{107}$$

The calculation of the integral (107) for $\gamma = 0$ gives

$$\mathrm{Var}\big[\hat d_{\text{me-x}}\big] \simeq [\ldots], \qquad \kappa_{\text{me-x}} = 3 \quad\Rightarrow\quad R_{\text{me-x}} \simeq [\ldots]. \tag{108}$$

This estimator is definitely better than the most efficient time-high-low canonical estimator, as can be seen by comparing (108) with (97).
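The most efficient estimators (92), (95) and (104) all rest on the same step: among weights $s$, the ratio of the second-order to the fourth-order moment function minimizes the variance, by the Cauchy-Schwarz inequality. The sketch below illustrates this on a discretized toy problem. The arrays `a2` and `a4` merely stand in for binned versions of $\alpha(\cdot; 2)$ and $\alpha(\cdot; 4)$; they are built from an arbitrary toy sample, not from the bridge pdf (70), and all names are hypothetical.

```python
import random

def variance_of_weighted_estimator(s, a2, a4):
    """Discrete analogue of (90): sum(s^2 a4) / (sum(s a2))^2 - 1."""
    num = sum(si * si * q for si, q in zip(s, a4))
    den = sum(si * p for si, p in zip(s, a2)) ** 2
    return num / den - 1.0

rng = random.Random(1)
n_bins, n_samples = 8, 20000

# Toy sample: the angular bin j is uniform, and R^2 has a bin-dependent scale.
a2 = [0.0] * n_bins   # a2[j] approximates E[R^2 ; bin j]
a4 = [0.0] * n_bins   # a4[j] approximates E[R^4 ; bin j]
for _ in range(n_samples):
    j = rng.randrange(n_bins)
    r2 = (1.0 + j) * rng.expovariate(1.0)
    a2[j] += r2 / n_samples
    a4[j] += r2 * r2 / n_samples

# Optimal weights and the resulting minimal variance, cf. (91)-(92).
s_opt = [p / q for p, q in zip(a2, a4)]
E = sum(p * p / q for p, q in zip(a2, a4))
var_opt = variance_of_weighted_estimator(s_opt, a2, a4)
var_flat = variance_of_weighted_estimator([1.0] * n_bins, a2, a4)
```

In this discrete setting `var_opt` coincides with `1 / E - 1`, and any other choice of weights, such as the flat weights used for `var_flat`, cannot do better.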
The last example we present here is the realized variance estimator that uses in each interval $S_i$ the high and low values $H_i$, $L_i$ of the bridge $Y(t, S_i)$ (22), the close value $X_i$ of the original stochastic process $X(t)$, and the time instant $t^i_{\text{last}} = \sup\{t^i_L, t^i_H\}$, defined as the larger of the two times at which the high and low values of the bridge occur.

One can rigorously prove that, analogously to (104), the homogeneous time-OHLC bridge canonical estimator that is most efficient for a given value of $\gamma$ is equal to

$$\hat d_{\text{t-me-x}}(\Theta, \Upsilon, t_{\text{last}}; \gamma) = R^2 s_{\text{t-me-x}}(\Theta, \Upsilon, t_{\text{last}}; \gamma), \qquad s_{\text{t-me-x}}(\theta, \upsilon, t; \gamma) = \frac{1}{E_{\text{t-me-x}}(\gamma)} \frac{\alpha(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)}, \tag{109}$$

where

$$E_{\text{t-me-x}}(\gamma) = \int_0^1 dt \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)} \tag{110}$$

and

$$\alpha(\theta, \upsilon, t; \lambda; \gamma) = \int_0^\infty r^{\lambda + 2} \varphi_{\text{last}}(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon, t; \gamma)\, dr. \tag{111}$$

The joint pdf $\varphi_{\text{last}}(h, \ell, x, t; \gamma)$ is

$$\varphi_{\text{last}}(h, \ell, x, t; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x - \gamma)^2}{2} \right) \varphi_{\text{last}}(h, \ell, t), \tag{112}$$

where $\varphi_{\text{last}}(h, \ell, t)$ is given by expression (C.15) in Appendix C.2.

Remark 7

Recall that the parameter $\gamma$ (9) is unknown, because both the drift $\mu_i$ and the instantaneous variances $\sigma_i^2$ in equations (6) are generally unknown. Therefore, our strategy below is to choose, for definiteness, $\gamma = 0$ and then explore the dependence on $\gamma$ of the bias and efficiency of the different "zero drift" estimators. Accordingly, we will use below shorthand notations that omit the argument $\gamma$, such as

$$\hat d_{\text{t-me-x}}(\Theta, \Upsilon, t) := \hat d_{\text{t-me-x}}(\Theta, \Upsilon, t; \gamma = 0).$$

The calculation of the integral (110), where $\alpha(\theta, \upsilon, t; \lambda)$ is given by expression (C.18) in Appendix C.4, yields for $\gamma = 0$

$$\mathrm{Var}\big[\hat d_{\text{t-me-x}}\big] = \frac{1}{E_{\text{t-me-x}}} - 1 \simeq [\ldots], \qquad \kappa_{\text{t-me-x}} = 3 \quad\Rightarrow\quad R_{\text{t-me-x}} \simeq [\ldots]. \tag{113}$$

This estimator is more efficient than all the previous ones discussed until now.

The goal of this section is to check by numerical simulations some of the analytical results obtained above. Realizations of the canonical Wiener process $X(t; \gamma)$ (9) with drift for time $t \in [0, 1]$ are obtained numerically as cumulative sums of a large number $I$ (a power of 10) of Gaussian summands, corresponding to a discrete time step $\Delta = 1/I$. For each numerical realization, we calculate the values of the open-close spot variance canonical estimator, equal in this case to

$$\hat d_{\text{real}} = (\gamma + W)^2, \tag{114}$$

and the values of the G&K canonical estimator

$$\hat d_{\text{GK}} = k_1 (H - L)^2 - k_2 \big( W (H - L) - HL \big) - k_3 (\gamma + W)^2, \tag{115}$$

where $H$ and $L$ are the high and low values of the simulated process $X(t; \gamma)$. We also constructed numerical realizations of the bridge process $Y(t)$ (25) and calculated the corresponding values of the canonical estimator $\hat d_{\text{t-me-x}}$ (109). This estimator depends on the function $\alpha(\theta, \upsilon, t; \lambda)$ defined by expression (C.18) in Appendix C.4, which is explicitly obtained by summing a double-infinite series (C.19). In practice, we estimate this double sum by keeping only the first 101 terms in each dimension, corresponding to estimating $101 \times 101 \simeq 10^4$ summands in (C.18).

Remark 8

At first glance, it would seem that the calculation of the G&K estimator (115), which needs only a few simple arithmetic operations, is much easier than the evaluation of the large number of summands in the series (C.18) that define the estimator $\hat d_{\text{t-me-x}}$ (109). On modern computers, however, there is no significant difference from the computational point of view.

Figure 1 shows 5000 realizations of the open-close estimator $\hat d_{\text{real}}$ (114), of the G&K estimator (115) and of the estimator $\hat d_{\text{t-me-x}}$ in the case of the Wiener process with zero drift ($\gamma = 0$). The last estimator is clearly more efficient than both the open-close and the G&K estimators. The expected values and variances of these three estimators, obtained by statistical averaging over a large number of samples (a power of 10), are

$$\mathrm{E}[\hat d_{\text{real}}] \simeq [\ldots], \quad \mathrm{E}[\hat d_{\text{GK}}] \simeq [\ldots], \quad \mathrm{E}[\hat d_{\text{t-me-x}}] \simeq [\ldots],$$

$$\mathrm{Var}[\hat d_{\text{real}}] \simeq [\ldots], \quad \mathrm{Var}[\hat d_{\text{GK}}] \simeq [\ldots], \quad \mathrm{Var}[\hat d_{\text{t-me-x}}] \simeq [\ldots].$$

These values are consistent with the theoretical analytical predictions obtained in previous sections:

$$\mathrm{E}[\hat d_{\text{real}}] = \mathrm{E}[\hat d_{\text{GK}}] = \mathrm{E}[\hat d_{\text{t-me-x}}] = 1,$$

$$\mathrm{Var}[\hat d_{\text{real}}] = 2, \quad \mathrm{Var}[\hat d_{\text{GK}}] \simeq [\ldots], \quad \mathrm{Var}[\hat d_{\text{t-me-x}}] \simeq [\ldots].$$
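The simulation procedure described above can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the code used for the figures: it uses far fewer paths and time steps than the text, builds the bridge as $Y(t) = X(t) - t X(1)$, and all function names are hypothetical. It checks the open-close estimator (114) against its theoretical moments at zero drift, $\mathrm{E} = 1$ and $\mathrm{Var} = 2$, and compares the mean squared bridge high with the value $1/2$ implied by the pdf (B.8).

```python
import math
import random

def simulate_path(gamma, n_steps, rng):
    """Canonical Wiener process with drift, X(t) = gamma*t + W(t), on a uniform grid of [0, 1]."""
    dt = 1.0 / n_steps
    x = [0.0]
    for _ in range(n_steps):
        x.append(x[-1] + gamma * dt + rng.gauss(0.0, math.sqrt(dt)))
    return x

def open_close_estimator(x):
    """Estimator (114): squared close value (the open value is zero)."""
    return x[-1] ** 2

def bridge_high(x):
    """High of the bridge Y(t) = X(t) - t*X(1) on the sampling grid."""
    n = len(x) - 1
    return max(x[k] - (k / n) * x[-1] for k in range(n + 1))

rng = random.Random(2024)
n_paths, n_steps, gamma = 3000, 200, 0.0

d_real, h_sq = [], []
for _ in range(n_paths):
    x = simulate_path(gamma, n_steps, rng)
    d_real.append(open_close_estimator(x))
    h_sq.append(bridge_high(x) ** 2)

mean_real = sum(d_real) / n_paths
var_real = sum((d - mean_real) ** 2 for d in d_real) / n_paths
mean_h_sq = sum(h_sq) / n_paths   # slightly below 1/2 because of time discretization
```

The discretely sampled high systematically underestimates the continuous-time high, so `mean_h_sq` is expected to sit somewhat below $1/2$; refining the time grid reduces this gap.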

In order to have truly comparable efficiencies of these realized variance estimators, bearing in mind that their effective sample sizes are different ($\kappa_{\text{real}} = 1$, $\kappa_{\text{GK}} = \kappa_{\text{t-me-x}} = 3$), we performed moving averages with $r = 30$ subsequent samples for the open-close estimator (114) and with $r = 10$ subsequent samples for the G&K estimator (115) and the estimator $\hat d_{\text{t-me-x}}$ (109). Figure 2 presents these moving averages, which mimic the normalized estimators of the integrated variance in the case where all instantaneous variances are the same ($\sigma_i = \sigma = \text{const}$). It is clear that the open-close estimator of the realized variance remains significantly less efficient than the G&K estimator, and much less efficient than the most efficient estimator $\hat d_{\text{t-me-x}}$.

γ-dependence of biases and efficiencies of canonical estimators

In the previous subsection, we presented detailed calculations of the comparative efficiency of unbiased variance estimators for the particular case of Wiener processes with zero drift. In real financial markets, the drift process $\mu(t)$ is unknown and there is no reason for it to vanish. Thus, it is important to explore quantitatively the dependence on the parameter $\gamma$ (9) of the biases and efficiencies of the spot variance canonical estimators described above.

We begin with the open-close spot variance canonical estimator $\hat d_{\text{real}}$ (114). It is easy to show that its expected value and variance are quadratic functions of $\gamma$:

$$\mathrm{E}\big[\hat d_{\text{real}}\big] = 1 + \gamma^2, \qquad \mathrm{Var}\big[\hat d_{\text{real}}\big] = 2 + 4\gamma^2. \tag{116}$$

The spot variance homogeneous time-open-high-low canonical bridge estimators, such as the Park estimator $\hat d_{\text{bPark}}$ (73) and the time-high-low estimator $\hat d_{\text{t-me}}$ (95), are unbiased for all $\gamma$:

$$\mathrm{E}\big[\hat d_{\text{bPark}}\big] = \mathrm{E}\big[\hat d_{\text{t-me}}\big] \equiv 1.$$

Their variances do not depend on $\gamma$ at all:

$$\mathrm{Var}\big[\hat d_{\text{bPark}}\big] \simeq [\ldots], \qquad \mathrm{Var}\big[\hat d_{\text{t-me}}\big] \simeq [\ldots] \qquad \forall \gamma. \tag{117}$$

To obtain the $\gamma$-dependence of the biases and variances of the G&K canonical estimator $\hat d_{\text{GK}}$ (115) and of the canonical estimator $\hat d_{\text{t-me-x}}$ (109), we generated a large number (a power of 10) of numerical realizations of the canonical Wiener process $X(t, \gamma)$ (9) with drift, for a grid of values of $\gamma \ge 0$ starting at $\gamma = 0$. Then, we calculated the statistical averages and variances of the corresponding realizations of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$, which are shown in figure 3. The continuous lines are respectively the expected value (116) of the open-close estimator $\hat d_{\text{real}}$ given by expression (114) and the fitted curves

$$\mathrm{E}[\hat d_{\text{est}}] = a_{\text{est}} \gamma^2 + b_{\text{est}}$$

for the averaged values of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. Their fitted parameters are

$$a_{\text{GK}} \simeq [\ldots], \qquad a_{\text{t-me-x}} \simeq [\ldots], \qquad b_{\text{GK}} \simeq b_{\text{t-me-x}} \simeq [\ldots].$$

Figure 4 shows the statistical average of the variances of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. The two horizontal lines indicate the variance values (117). The continuous lines show the fitted curves

$$\mathrm{Var}[\hat d_{\text{est}}] = c_{\text{est}} \gamma^2 + d_{\text{est}}$$

of the variances of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. Their parameters are

$$c_{\text{GK}} \simeq [\ldots], \qquad c_{\text{t-me-x}} \simeq [\ldots], \qquad d_{\text{GK}} \simeq [\ldots], \qquad d_{\text{t-me-x}} \simeq [\ldots].$$

We have introduced the canonical estimator $\hat d_{\text{t-me-x}}$ given by expression (109), which includes the information on the value of the time $t_{\text{last}} = \sup\{t_L, t_H\}$, defined as the larger of the two times at which the high and low values of the canonical bridge occur. It seems plausible that the canonical estimator

$$\hat d_{\text{tt-me-x}}(\Theta, \Upsilon, t_{\text{high}}, t_{\text{low}}; \gamma) = R^2 s_{\text{tt-me-x}}(\Theta, \Upsilon, t_{\text{high}}, t_{\text{low}}; \gamma), \qquad s_{\text{tt-me-x}}(\theta, \upsilon, t_1, t_2; \gamma) = \frac{1}{E_{\text{tt-me-x}}(\gamma)} \frac{\alpha(\theta, \upsilon, t_1, t_2; 2; \gamma)}{\alpha(\theta, \upsilon, t_1, t_2; 4; \gamma)}, \tag{118}$$

which takes into account both the high and low values and their corresponding occurrence times ($t_{\text{high}} : H = Y(t_{\text{high}})$, $t_{\text{low}} : L = Y(t_{\text{low}})$), is even more efficient than the estimator (109). In expression (118), we have used the notation

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda, \gamma) = \int_0^\infty r^{\lambda + 2} \varphi(r \cos\upsilon \cos\theta, r \cos\upsilon \sin\theta, r \sin\upsilon, t_1, t_2; \gamma)\, dr, \tag{119}$$

where $\varphi(h, \ell, x, t_1, t_2; \gamma)$ is the joint pdf of the high, low, close, $t_{\text{high}}$ and $t_{\text{low}}$ random variables.

We have not explored the statistical properties of the estimator (118) because we have not yet made the effort of deriving the exact analytical expression of $\varphi(h, \ell, x, t_1, t_2; \gamma)$. We can however construct the function $\alpha$ (119) using statistical averaging:

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda, \gamma) \cos\upsilon\, d\upsilon\, d\theta\, dt_1\, dt_2 \simeq \frac{1}{K} \sum_{k=1}^{K} R_k^{\lambda}\, I(\Upsilon_k, \Theta_k, t_{\text{high},k}, t_{\text{low},k}). \tag{120}$$

In this expression, the values $\{\Upsilon_k, \Theta_k, t_{\text{high},k}, t_{\text{low},k}\}$ are the parameters of the numerically simulated $k$-th sample of the canonical Wiener process with drift $X(t, \gamma)$ (9), and $I$ is the indicator of the set

$$(\upsilon, \upsilon + d\upsilon) \times (\theta, \theta + d\theta) \times (t_1, t_1 + dt_1) \times (t_2, t_2 + dt_2).$$

We would like to point out that it is possible to construct the function $\alpha$ by an analogous statistical treatment for more general log-price processes that extend the Wiener process with drift to account more adequately for microstructure noise, heavy tails of returns and other stylized facts found for various financial assets. In other words, relations such as (120) offer the possibility of constructing novel most efficient variance estimators of the form (118), extending the standard approach of econometricians looking for new constructions of efficient volatility estimators. The only requirement is to be able to simulate numerically the underlying stochastic process that represents the dynamics of a given financial asset. Then, the use of statistical averaging, similar to (120), will enable the construction of high-frequency realized estimators that use the most efficient estimators described above as elementary "bricks".

We have introduced a variety of integrated variance estimators, based on the open-high-low values of the bridges $Y(t, S_i)$ (22) and the close values $X_i$ (4) of the underlying log-price process $X(t)$. The main peculiarity of some of the introduced estimators is that they take into account not only the high and low values but also their occurrence times. This last piece of information leads to estimators that are even more efficient.
We discussed quantitatively the statistical properties of the estimators for the class of Itô models for the log-price stochastic process.

Our work opens the road to the construction of novel types of integrated variance estimators of log-price processes of real financial markets that take into account the microstructure noise, heavy power tails of returns, and chaotic jumps.

Acknowledgements: We are grateful to Fulvio Corsi for valuable discussions of some aspects of this paper.

A Basic properties of the canonical bridge

A.1 Symmetry properties

The canonical bridge $Y(t)$ (25) exhibits the following time reversibility and reflection properties

$$Y(t) \sim Y(1 - t), \qquad Y(t) \sim -Y(t). \tag{A.1}$$

Some statistical consequences of these symmetry properties are as follows. Let

$$H = \sup_{t \in (0,1)} Y(t), \qquad L = \inf_{t \in (0,1)} Y(t), \tag{A.2}$$

be the high and low values of the canonical bridge, while $t_{\text{high}}$ and $t_{\text{low}}$ are their corresponding occurrence times:

$$t_{\text{high}} : H = Y(t_{\text{high}}), \qquad t_{\text{low}} : L = Y(t_{\text{low}}). \tag{A.3}$$

Consider the cumulative distribution function (cdf)

$$\Phi_{\text{high}}(t) = \Pr\{t_{\text{high}} < t\}$$

of the occurrence time $t_{\text{high}}$ of the high value of the canonical bridge. Due to the reversibility property (A.1), one has

$$\Pr\{t_{\text{high}} < t\} = \Pr\{t_{\text{high}} > 1 - t\} \quad\Rightarrow\quad \Phi_{\text{high}}(t) + \Phi_{\text{high}}(1 - t) = 1. \tag{A.4}$$

Accordingly, the pdf of $t_{\text{high}}$,

$$\varphi_{\text{high}}(t) := \frac{d\Phi_{\text{high}}(t)}{dt},$$

presents the symmetry

$$\varphi_{\text{high}}(t) = \varphi_{\text{high}}(1 - t). \tag{A.5}$$

Due to the reflection property of the canonical bridge, the cdf $\Phi_{\text{low}}(t)$ of $t_{\text{low}}$ (A.3) coincides with the cdf of $t_{\text{high}}$:

$$\Phi_{\text{low}}(t) = \Phi_{\text{high}}(t) \quad\Rightarrow\quad \varphi_{\text{low}}(t) = \varphi_{\text{high}}(t) = \varphi_{\text{high}}(1 - t).$$

A.2 Interplay between bridge and Wiener processes

We will need below the well-known identity in law for the canonical bridge

$$Y(t) \sim (1 - t)\, W\!\left( \frac{t}{1 - t} \right).$$

Using the change of time variable

$$\tau = \frac{t}{1 - t} \quad\Longleftrightarrow\quad t = \frac{\tau}{1 + \tau}$$

and the scaling properties of the Wiener process, we can replace the compounded process

$$Y(t(\tau)) = Y\!\left( \frac{\tau}{1 + \tau} \right)$$

by a more convenient process, which is identical in law and reads

$$Y(t(\tau)) \sim Z(\tau) = \frac{1}{1 + \tau} W(\tau). \tag{A.6}$$

In turn, the following identity in law holds

$$Y(t) \sim Z(\tau(t)) = Z\!\left( \frac{t}{1 - t} \right). \tag{A.7}$$

B Joint pdf of the high value and its occurrence time

B.1 Reﬂection method

Let us consider the function $f(\omega; \tau, h)$ such that

$$\Pr\{W(\tau) \in (\omega, \omega + d\omega) \cap W(\tau') < h(1 + \tau') : \tau' \in (0, \tau)\} = f(\omega; \tau, h)\, d\omega. \tag{B.1}$$

This function $f(\omega; \tau, h)$ satisfies the diffusion equation

$$\frac{\partial f}{\partial \tau} = \frac{1}{2} \frac{\partial^2 f}{\partial \omega^2} \tag{B.2}$$

with the initial and absorbing boundary conditions

$$f(\omega; \tau = 0, h) = \delta(\omega), \qquad f(\omega = h + h\tau; \tau, h) = 0. \tag{B.3}$$

We solve the initial-boundary problem (B.2) with (B.3) using the reflection method, which amounts to searching for a solution of the form

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[ \exp\left( -\frac{\omega^2}{2\tau} \right) - A \exp\left( -\frac{(\omega - 2h)^2}{2\tau} \right) \right],$$

where the factor $A$ is determined by the absorbing boundary condition (B.3), i.e.

$$\exp\left( -\frac{(h + h\tau)^2}{2\tau} \right) = A \exp\left( -\frac{(h - h\tau)^2}{2\tau} \right) \quad\Rightarrow\quad A = e^{-2h^2}.$$

We thus obtain

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[ \exp\left( -\frac{\omega^2}{2\tau} \right) - \exp\left( -2h^2 - \frac{(\omega - 2h)^2}{2\tau} \right) \right]. \tag{B.4}$$

B.2 Pdf of the maximal value of the canonical bridge

In view of (B.1) and (A.6), the joint pdf of $W(\tau)$ and of the high value

$$\mathcal{H}(\tau) = \sup_{\tau' \in (0, \tau)} Z(\tau') \tag{B.5}$$

of the stochastic process $Z(\tau')$ within the interval $\tau' \in (0, \tau)$ is equal to

$$Q(\omega, h; \tau) = \frac{\partial f(\omega; \tau, h)}{\partial h}.$$

Substituting the expression (B.4) in the above equation yields

$$Q(\omega, h; \tau) = \frac{1}{\tau} \sqrt{\frac{2}{\pi\tau}}\, \big( 2h(1 + \tau) - \omega \big) \exp\left[ -2h^2 - \frac{(\omega - 2h)^2}{2\tau} \right], \qquad \omega < h(1 + \tau), \quad h > 0. \tag{B.6}$$

In particular, the pdf of the high value $\mathcal{H}(\tau)$ (B.5),

$$Q(h; \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, d\omega,$$

is equal to

$$Q(h; \tau) = \sqrt{\frac{2}{\pi\tau}} \exp\left( -\frac{h^2 (1 + \tau)^2}{2\tau} \right) + 2h\, e^{-2h^2} \operatorname{erfc}\left( \frac{h(1 - \tau)}{\sqrt{2\tau}} \right).$$

Accordingly, the pdf $Q_{\text{high}}(h; t)$ of the high value

$$H(t) = \sup_{t' \in (0, t)} Y(t')$$

is equal to

$$Q_{\text{high}}(h; t) = Q\!\left( h; \frac{t}{1 - t} \right). \tag{B.7}$$

In particular, the pdf's of the high values $\mathcal{H}$ (B.5) and $H$ (A.2) are the same and equal to

$$\varphi_{\text{high}}(h) = \lim_{\tau \to \infty} Q(h; \tau) = 4h\, e^{-2h^2}. \tag{B.8}$$

B.3 Pdf of the high value of the bridge and of its occurrence time

In order to derive the joint pdf of the maximal value $H$ (A.2) and of the occurrence time $t_{\text{high}}$ (A.3), we first consider the related joint pdf of the high value $\mathcal{H}$ (B.5) of the auxiliary process $Z(\tau)$ (A.6) and of its occurrence time $\tau_{\text{high}} : \mathcal{H} = Z(\tau_{\text{high}})$.

The function $F(h, \tau)$ that defines the probability

$$F(h, \tau)\, dh = \Pr\{\mathcal{H} \in (h, h + dh),\ \tau_{\text{high}} < \tau\}$$

is given by

$$F(h, \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, P(\omega, h, \tau)\, d\omega, \tag{B.9}$$

where $Q(\omega, h; \tau)$ is the joint pdf of $W(\tau)$ and $\mathcal{H}(\tau)$, given by equality (B.6), while

$$P(\omega, h, \tau) = \lim_{\theta \to \infty} P(\omega, h, \tau, \theta), \qquad P(\omega, h, \tau, \theta) = \Pr\{W(\tau' | \tau, \omega) < h(1 + \tau') : \tau' \in (\tau, \tau + \theta)\}. \tag{B.10}$$

Here, $W(\tau' | \tau, \omega)$ is the conditioned Wiener process that takes the value $\omega$ at $\tau' = \tau$. Due to the identity in law (A.6), $P(\omega, h, \tau)$ is equal to the probability that the following inequality holds

$$Z(\tau' | \tau, \omega) < h, \qquad \tau' \in (\tau, \infty),$$

where $Z(\tau' | \tau, \omega)$ is the conditioned stochastic process $Z(\tau')$, which is equal to $\omega / (1 + \tau)$ at $\tau' = \tau$.

The probability $P(\omega, h, \tau, \theta)$ (B.10) is given by

$$P(\omega, h, \tau, \theta) = \int_{-\infty}^{h(1 + \tau + \theta)} f(x; \omega, h, \tau, \theta)\, dx, \tag{B.11}$$

where the pdf $f(x; \omega, h, \tau, \theta)$ satisfies the initial-boundary value problem

$$\frac{\partial f}{\partial \theta} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2}, \qquad f(x; \omega, h, \tau, \theta = 0) = \delta(x - \omega), \qquad f(x = h(1 + \tau + \theta); \omega, h, \tau, \theta) = 0.$$

Its solution, obtained by the reflection method, is

$$f(x; \omega, h, \tau, \theta) = \frac{1}{\sqrt{2\pi\theta}} \left[ \exp\left( -\frac{(x - \omega)^2}{2\theta} \right) - \exp\left( -2h\big( h(1 + \tau) - \omega \big) - \frac{\big( x + \omega - 2h(1 + \tau) \big)^2}{2\theta} \right) \right].$$

Substituting this last expression into (B.11) yields

$$P(\omega, h, \tau, \theta) = \frac{1}{2} \left[ \operatorname{erfc}\left( \frac{\omega - h(1 + \tau + \theta)}{\sqrt{2\theta}} \right) - e^{-2h( h(1 + \tau) - \omega )} \operatorname{erfc}\left( \frac{h(1 + \tau - \theta) - \omega}{\sqrt{2\theta}} \right) \right].$$

In particular, in the limiting case $\theta \to \infty$, one has

$$P(\omega, h, \tau) = 1 - e^{-2h( h(1 + \tau) - \omega )}. \tag{B.12}$$

Substituting $Q(\omega, h; \tau)$ (B.6) and $P(\omega, h, \tau)$ (B.12) into (B.9), after integration, we obtain

$$F(h, \tau) = 2h\, e^{-2h^2} \operatorname{erfc}\left( \frac{h(1 - \tau)}{\sqrt{2\tau}} \right). \tag{B.13}$$

Consider now the probability

$$\Phi_{\text{high}}(h, t)\, dh = \Pr\{H \in (h, h + dh),\ t_{\text{high}} < t\}.$$

Due to the identity in law (A.7), $\Phi_{\text{high}}(h, t)$ is equal to

$$\Phi_{\text{high}}(h, t) = F\!\left( h, \frac{t}{1 - t} \right) = 2h\, e^{-2h^2} \operatorname{erfc}\left( \frac{h(1 - 2t)}{\sqrt{2t(1 - t)}} \right). \tag{B.14}$$

The integration over $h \in (0, \infty)$ gives the cumulative distribution function (cdf) of the random occurrence time $t_{\text{high}}$ (A.3):

$$\Phi_{\text{high}}(t) = \Pr\{t_{\text{high}} < t\} = \int_0^\infty \Phi_{\text{high}}(h, t)\, dh = t, \qquad t \in (0, 1).$$

This means that the occurrence time $t_{\text{high}}$ of the high value of the canonical bridge is uniformly distributed. The above cdf satisfies the symmetry property (A.4). The corresponding pdf $\varphi_{\text{high}}(t) = 1$ obviously satisfies the symmetry property (A.5).

The sought joint pdf of the high value $H$ of the canonical bridge $Y(t)$ and of its corresponding occurrence time $t_{\text{high}}$ is

$$\varphi_{\text{high}}(h, t) = \frac{\partial \Phi_{\text{high}}(h, t)}{\partial t}. \tag{B.15}$$

Substituting here $\Phi_{\text{high}}(h, t)$ (B.14) yields

$$\varphi_{\text{high}}(h, t) = \sqrt{\frac{2}{\pi}}\, \frac{h^2}{\big[ t(1 - t) \big]^{3/2}} \exp\left( -\frac{h^2}{2t(1 - t)} \right). \tag{B.16}$$

C Statistics of the high, low and occurrence time of the last extremum of the canonical bridge

C.1 Statistical description of the joint pdf of the high, low and occurrence time of the last extremum

The occurrence times of the first and last absolute extremes (A.3) of the canonical bridge $Y(t)$ are formally defined as

$$t_{\text{first}} = \inf\{t_L, t_H\}, \qquad t_{\text{last}} = \sup\{t_L, t_H\}. \tag{C.1}$$

The joint pdf of the high $H$ and low $L$ (A.2), together with the cdf of the occurrence time $t_{\text{last}}$, is given by

$$\Phi_{\text{last}}(h, \ell, t)\, dh\, d\ell = \Pr\{H \in (h, h + dh) \cap L \in (\ell, \ell + d\ell) \cap t_{\text{last}} < t\}. \tag{C.2}$$

We derive the function $\Phi_{\text{last}}(h, \ell, t)$ by using a natural generalization of the reasoning presented in Appendix B that led to the function $\Phi_{\text{high}}(h, t)$ (B.14), the joint pdf of the high value $H$ and cdf of the occurrence time $t_{\text{high}}$. Namely, we first calculate the probability

$$F(h, \ell, \tau)\, dh\, d\ell = \Pr\{\mathcal{H} \in (h, h + dh),\ \mathcal{L} \in (\ell, \ell + d\ell),\ \tau_{\text{last}} < \tau\}, \tag{C.3}$$

where

$$\mathcal{H} = \sup_{\tau \in (0, \infty)} Z(\tau), \qquad \mathcal{L} = \inf_{\tau \in (0, \infty)} Z(\tau),$$

$$\tau_{\text{last}} = \sup\{\tau_{\text{low}}, \tau_{\text{high}}\}, \qquad \tau_{\text{low}} : \mathcal{L} = Z(\tau_{\text{low}}), \qquad \tau_{\text{high}} : \mathcal{H} = Z(\tau_{\text{high}}).$$

Analogously to (B.9), $F(h, \ell, \tau)$ is equal to

$$F(h, \ell, \tau) = \int_{\ell(1 + \tau)}^{h(1 + \tau)} Q(\omega, h, \ell, \tau)\, P(\omega, h, \ell, \tau)\, d\omega, \tag{C.4}$$

where

$$Q(\omega, h, \ell, \tau) = -\frac{\partial^2 f(\omega; h, \ell, \tau)}{\partial h\, \partial \ell} \tag{C.5}$$

and the pdf $f(\omega; h, \ell, \tau)$ satisfies the initial-boundary problem

$$\frac{\partial f}{\partial \tau} = \frac{1}{2} \frac{\partial^2 f}{\partial \omega^2}, \qquad f(\omega; h, \ell, \tau = 0) = \delta(\omega),$$

$$f(\omega = h(1 + \tau); h, \ell, \tau) = 0, \qquad f(\omega = \ell(1 + \tau); h, \ell, \tau) = 0, \qquad \tau > 0. \tag{C.6}$$

Similarly to $P(\omega, h, \tau)$ (B.10), the probability $P(\omega, h, \ell, \tau)$ is given by

$$P(\omega, h, \ell, \tau) = \lim_{\theta \to \infty} P(\omega, h, \ell, \tau, \theta), \qquad P(\omega, h, \ell, \tau, \theta) = \Pr\{\ell(1 + \tau') < W(\tau' | \tau, \omega) < h(1 + \tau') : \tau' \in (\tau, \tau + \theta)\}.$$

Analogously to (B.11), the last probability $P(\omega, h, \ell, \tau, \theta)$ is equal to

$$P(\omega, h, \ell, \tau, \theta) = \int_{\ell(1 + \tau + \theta)}^{h(1 + \tau + \theta)} f(x; \omega, h, \ell, \tau, \theta)\, dx, \tag{C.7}$$

where $f(x; \omega, h, \ell, \tau, \theta)$ is the solution of the initial-boundary problem

$$\frac{\partial f}{\partial \theta} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2}, \qquad f(x; \omega, h, \ell, \tau, \theta = 0) = \delta(x - \omega),$$

$$f(x = h(1 + \tau + \theta); \omega, h, \ell, \tau, \theta) = 0, \qquad f(x = \ell(1 + \tau + \theta); \omega, h, \ell, \tau, \theta) = 0. \tag{C.8}$$

Knowing the function $F(h, \ell, \tau)$ defined by equality (C.3), one can find the sought function $\Phi_{\text{last}}(h, \ell, t)$ (C.2) using the relation

$$\Phi_{\text{last}}(h, \ell, t) = F\!\left( h, \ell, \frac{t}{1 - t} \right), \tag{C.9}$$

which is analogous to (B.14). In turn, one can find the joint pdf of the high $H$ and low $L$ values (A.2) and of the occurrence time $t_{\text{last}}$ (C.1) of the last absolute extremum of the canonical bridge $Y(t)$ using, analogously to (B.15), the relation

$$\varphi_{\text{last}}(h, \ell, t) = \frac{\partial \Phi_{\text{last}}(h, \ell, t)}{\partial t}. \tag{C.10}$$

C.2 Solutions of boundary-value problems

Solving the initial-boundary problem (C.6) by the reflection method, we obtain

$$f(\omega; h, \ell, \tau) = \sum_{m = -\infty}^{\infty} \Big[ e^{-2(h - \ell)^2 m^2}\, g\big( \omega + 2(h - \ell)m; \tau \big) - e^{-2((h - \ell)m + h)^2}\, g\big( \omega - 2(h + (h - \ell)m); \tau \big) \Big], \tag{C.11}$$

where

$$g(\omega; \tau) = \frac{1}{\sqrt{2\pi\tau}} \exp\left( -\frac{\omega^2}{2\tau} \right).$$

In turn, the solution of the initial-boundary problem (C.8) is given by

$$f(x; \omega, h, \ell, \tau, \theta) = \sum_{m = -\infty}^{\infty} \Big[ e^{-2(h - \ell)^2 m^2 (1 + \tau) + 2\omega(h - \ell)m}\, g\big( x - \omega + 2m(h - \ell)(1 + \tau); \theta \big) - e^{-2((h - \ell)m + h)^2 (1 + \tau) + 2\omega((h - \ell)m + h)}\, g\big( x + \omega - 2((h - \ell)m + h)(1 + \tau); \theta \big) \Big]. \tag{C.12}$$

After substituting $f(\omega; h, \ell, \tau)$ (C.11) into (C.5), we obtain

$$Q(\omega, h, \ell, \tau) = \frac{4}{\tau^2} \sum_{m = -\infty}^{\infty} m \Big[ m\, e^{-2(h - \ell)^2 m^2} \big[ (\omega + 2m(h - \ell)(1 + \tau))^2 - \tau(1 + \tau) \big]\, g\big( \omega + 2m(h - \ell); \tau \big) - (1 + m)\, e^{-2(m(h - \ell) + h)^2} \big[ (\omega - 2(m(h - \ell) + h)(1 + \tau))^2 - \tau(1 + \tau) \big]\, g\big( \omega - 2(m(h - \ell) + h); \tau \big) \Big]. \tag{C.13}$$

Substituting $f(x; \omega, h, \ell, \tau, \theta)$ (C.12) into (C.7), and taking the limit $\theta \to \infty$, we obtain

$$P(\omega, h, \ell, \tau) = \sum_{m = -\infty}^{\infty} \Big[ e^{-2(h - \ell)^2 (1 + \tau) m^2 + 2(h - \ell)m\omega} - e^{-2(h + (h - \ell)m)^2 (1 + \tau) + 2(h + (h - \ell)m)\omega} \Big]. \tag{C.14}$$

After substituting $Q(\omega, h, \ell, \tau)$ (C.13) and $P(\omega, h, \ell, \tau)$ (C.14) into (C.4), we obtain the explicit expression for $F(h, \ell, \tau)$.
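The image series (C.11) can be sanity-checked numerically: a solution built by the reflection method should vanish on both moving boundaries $\omega = h(1 + \tau)$ and $\omega = \ell(1 + \tau)$ and stay positive between them. The sketch below assumes that the series has the form of alternating Gaussian images located at $-2(h - \ell)m$ and $2(h + (h - \ell)m)$, each weighted by $\exp(-a^2/2)$ with $a$ the image position. The truncation order `m_max` and the test values of $h$, $\ell$, $\tau$ are arbitrary choices made for the illustration.

```python
import math

def g(w, tau):
    """Gaussian kernel g(w; tau) = exp(-w^2 / (2 tau)) / sqrt(2 pi tau)."""
    return math.exp(-w * w / (2.0 * tau)) / math.sqrt(2.0 * math.pi * tau)

def f_series(w, h, l, tau, m_max=20):
    """Truncated image series assumed for (C.11)."""
    total = 0.0
    for m in range(-m_max, m_max + 1):
        a = 2.0 * (h - l) * m            # shift of the positive images
        c = 2.0 * ((h - l) * m + h)      # shift of the negative images
        total += math.exp(-0.5 * a * a) * g(w + a, tau)
        total -= math.exp(-0.5 * c * c) * g(w - c, tau)
    return total

h, l, tau = 0.8, -0.5, 0.7
upper = f_series(h * (1.0 + tau), h, l, tau)   # value on the upper boundary
lower = f_series(l * (1.0 + tau), h, l, tau)   # value on the lower boundary
inside = f_series(0.1, h, l, tau)              # value at an interior point
```

On the upper boundary the images cancel pairwise for equal $m$; on the lower boundary the positive image of index $m$ cancels the negative image of index $m - 1$, so the truncated sum leaves only a negligible remainder.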
Substituting it into (C.9)35nd using relation (C.9), we obtain the pdf of the high H , low L values andoccurrence time t last of the last extremum under the form ϕ last ( h, ℓ, t ) = ∞ X m = −∞ ∞ X n = −∞ (cid:16) m h g ( h, t, h − ℓ ) m, h − ℓ ) n ) − g ( ℓ, t, h − ℓ ) m, h − ℓ ) n ) − g ( h, t, h − ℓ ) m, h + ( h − ℓ ) n )) + g ( ℓ, t, h − ℓ ) m, h + ( h − ℓ ) n )) i − m ( m + 1) h g ( h, t, − h + ( h − ℓ ) m ) , h − ℓ ) n ) − g ( ℓ, t, − h + ( h − ℓ ) m ) , h − ℓ ) n ) − g ( h, t, − h + ( h − ℓ ) m ) , h + ( h − ℓ ) n ))+ g ( ℓ, t, − h + ( h − ℓ ) m ) , h + ( h − ℓ ) n )) i(cid:17) ,g ( y, t, a, c ) = − s π (1 − t ) t exp (cid:18) − ( a + y ) − ( a + c )( a − c + 2 y ) t t (1 − t ) (cid:19) × (cid:2) ( a + y ) − ( a + y )(3 + ( a + y )( a − c + 2 y )) t + (3 a − c + 4 y ) t (cid:3) . (C.15) C.3 Function α last ( θ, t ; λ ) Some of the most eﬃcient estimators introduced in this paper are deﬁnedthrough the function α last ( θ, t ; λ ) = Z ∞ r λ +1 ϕ last ( r cos θ, r sin θ, t ) dr, (C.16)36hich is analogous to (73), The calculation of the integral (C.16) yields α last ( θ, t ; λ ) = − p π (1 − t ) t δ λ Γ (cid:18) λ (cid:19) ∞ X m,n = −∞ (cid:16) m × h β ( co, t, sc · m, sc · n ; λ ) − β ( si, t, sc · m, sc · n ; λ ) − β ( co, t, sc · m, co + sc · n ); λ )+ β ( si, t, sc · m, co + sc · n ); λ ) i − m ( m + 1) h β ( co, t, − co + sc · m ) , sc · n ; λ ) − β ( si, t, − si + sc · m ) , sc · n ; λ ) − β ( co, t, − co + sc · m ) , co + sc · n ); λ )+ β ( si, t, − co + sc · m ) , co + sc · n ); λ ) i(cid:17) , (C.17)where co = cos θ, si = sin θ, sc = co − si,β ( y, t, a, c ; λ ) = h ( a + y ) [ a + y − ( a − c + 2 y ) t ](3 + λ )+ δt [(6 a − c + 8 y ) t − a + y )] i ,δ = δ ( y, t, a, c ) = ( a + y ) − ( a + c )( a − c + 2 y ) t t (1 − t ) . 
C.4 Function α ( θ, υ, t ; λ ) Consider the function α ( θ, υ, t ; λ ) = Z ∞ r λ +2 ϕ last ( r cos υ cos θ, r cos υ sin θ, r sin υ, t ; γ = 0) dr, (C.18)that enters into the deﬁnition of the canonical estimator (109) in the caseof zero drift γ = 0. Using expression (112) for the pdf ϕ last ( h, ℓ, x, t ; γ ), we37btain after calculations the following expression α ( θ, υ, t ; λ ) = Γ (cid:0) λ (cid:1) π p t (1 − t ) ∞ X m,n = −∞ (cid:16) m × h β ′ ( x, co, t, sc · m, sc · n ; λ ) − β ′ ( x, si, t, sc · m, sc · n ; λ ) − β ′ ( x, co, t, sc · m, cc + sc · n ; λ ) + β ′ ( x, si, t, sc · m, cc + sc · n ; λ ) i − m ( m + 1) h β ′ ( x, co, t, − cc − sc · m, sc · n ; λ ) − β ′ ( x, si, t, − cc − sc · m, sc · n ; λ ) − β ′ ( x, co, t, − cc − sc · m, cc + sc · n ; λ )+ β ′ ( x, si, t, − cc − sc · m, cc + sc · n ; λ ) i(cid:17) . (C.19)which is analogous to (C.17). Here, we have set x = sin υ, co = cos θ cos υ, si = sin θ cos υ,cc = 2 cos θ cos υ, sc = 2(cos θ − sin θ ) cos υ,β ′ ( x, y, t, a, c, λ ) = (cid:2) r (4 + λ )( a + y − ( a − c + 2 y ) t )+ δt ((6 a − c + 8 y ) t − a + y )) (cid:3) δ − (6+ λ ) / ,δ = r − ( a + c )( a − c + 2 y ) t t (1 − t ) + x , r = ( a + y ) . eferences A¨ıt-Sahalia, Y., P.A. Mykland, and L. Zhang (2005). How often to samplea continuous-time process in the presence of market microstructure noise.

Review of Financial Studies 18, 351-416.

Andersen, T. G., T. Bollerslev, F. X. Diebold and P. Labys (2003). Modeling and Forecasting Realized Volatility. Econometrica 71, 579-625.

Garman, M. and M. J. Klass (1980). On the Estimation of Security Price Volatilities From Historical Data. Journal of Business 53, 67-78.

Jeanblanc, M., M. Yor and M. Chesney (2009). Mathematical Methods for Financial Markets. Springer.

Parkinson, M. (1980). The Extreme Value Method for Estimating the Variance of the Rate of Return. Journal of Business 53, 61-65.

Saichev, A., D. Sornette, V. Filimonov and F. Corsi (2009). Homogeneous Volatility Bridge Estimators. ETH Zurich working paper, http://ssrn.com/abstract=1523225.

Saichev, A., Y. Malevergne and D. Sornette (2010). Theory of Zipf's Law and Beyond (Lecture Notes in Economics and Mathematical Systems). Springer.

Zhang, L., P. A. Mykland and Y. Aït-Sahalia (2005). A tale of two time scales: determining integrated volatility with noisy high-frequency data.

Journal of the American Statistical Association.

Fig. 1: Panels $\hat d_{\text{real}}$, $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$ (109).

Fig. 2: Moving averages of the open-close (top panel), G&K (middle panel) and time-OHLC (109) (lower panel) estimators over respective window sizes of 30 samples for the top panel and 10 samples for the two other panels. As explained in the text, this moving average mimics the normalized estimators of the integrated variance in the case where all instantaneous variances are the same ($\sigma_i = \sigma = \text{const}$).

Fig. 3: Top to bottom, $\gamma$-dependence of the expected values $\mathrm{E}[\hat d]$ of the open-close $\hat d_{\text{real}}$ (114), G&K $\hat d_{\text{GK}}$ (115) and most efficient $\hat d_{\text{t-me-x}}$ (109) canonical estimators. The horizontal line is the expected value of the canonical estimators $\hat d_{\text{bPark}}$ and $\hat d_{\text{t-me}}$.

Fig. 4: $\gamma$-dependence of the statistical average of the variances $\mathrm{Var}[\hat d]$ of the canonical estimators $\hat d_{\text{GK}}$ (upper open circles) and $\hat d_{\text{t-me-x}}$ (lower open circles). The two horizontal lines are the variances (117) of the canonical estimators $\hat d_{\text{bPark}}$ (top) and $\hat d_{\text{t-me}}$ (bottom), respectively.