Nonparametric estimation of the volatility function in a high-frequency model corrupted by noise
Axel Munk* and Johannes Schmidt-Hieber, Institut für Mathematische Stochastik, Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen
Email: [email protected], [email protected]

Abstract
We consider the models $Y_{i,n} = \int_0^{i/n}\sigma(s)\,dW_s + \tau(i/n)\,\epsilon_{i,n}$ and $\tilde Y_{i,n} = \sigma(i/n)\,W_{i/n} + \tau(i/n)\,\epsilon_{i,n}$, $i = 1,\ldots,n$, where $(W_t)_{t\in[0,1]}$ denotes a standard Brownian motion and $\epsilon_{i,n}$ are centered i.i.d. random variables with $E(\epsilon_{i,n}^2) = 1$ and finite fourth moment. Furthermore, $\sigma$ and $\tau$ are unknown deterministic functions, and $(W_t)_{t\in[0,1]}$ and $(\epsilon_{1,n},\ldots,\epsilon_{n,n})$ are assumed to be independent processes. Based on a spectral decomposition of the covariance structures we derive series estimators for $\sigma^2$ and $\tau^2$ and investigate the rate of convergence of their MISE in dependence on their smoothness. To this end, specific basis functions and their corresponding Sobolev ellipsoids are introduced, and we show that our estimators are optimal in minimax sense. Our work is motivated by microstructure noise models. A major finding is that the microstructure noise $\epsilon_{i,n}$ introduces an additional degree of ill-posedness of $1/2$, irrespective of the tail behavior of $\epsilon_{i,n}$. The performance of the estimators is illustrated by a small numerical study.
AMS 2000 Subject Classification:
Primary 62M09, 62M10; secondary 62G08, 62G20.
Keywords:
Brownian motion; Variance estimation; Minimax rate; Microstructure noise;Sobolev Embedding.
1 Introduction

Consider the models
$$Y_{i,n} = \int_0^{i/n}\sigma(s)\,dW_s + \tau\Big(\frac{i}{n}\Big)\epsilon_{i,n}, \quad i = 1,\ldots,n, \qquad (1.1)$$
and
$$\tilde Y_{i,n} = \sigma\Big(\frac{i}{n}\Big)W_{i/n} + \tau\Big(\frac{i}{n}\Big)\epsilon_{i,n}, \quad i = 1,\ldots,n, \qquad (1.2)$$
respectively, where $(W_t)_{t\in[0,1]}$ denotes a Brownian motion and $\epsilon_{i,n}$ is so-called microstructure noise, i.e. we assume the $\epsilon_{i,n}$ i.i.d. with $E(\epsilon_{i,n}^2) = 1$ and $E(\epsilon_{i,n}^4) < \infty$. $(W_t)_{t\in[0,1]}$ and $(\epsilon_{1,n},\ldots,\epsilon_{n,n})$ are assumed to be independent, and $\sigma$ and $\tau$ are unknown, positive and deterministic functions.

Our models (1.1) and (1.2) are natural extensions of the situation when $\sigma$ and $\tau$ are constant, which has been, in a slightly broader setting, previously considered by [8], [13], [14] and [24], among others. In the latter papers sharp minimax estimators were derived for $\sigma^2$ and $\tau^2$. The minimax rate for $\sigma^2$ is $n^{-1/4}$ and for $\tau^2$ it is $n^{-1/2}$, the corresponding constants for quadratic loss (MSE) being $8\tau\sigma^3$ and $2\tau^4$, respectively. To estimate $\sigma^2$ and $\tau^2$, maximum likelihood is feasible (see [24]) and achieves these bounds. Other efficient estimators were given by [8], [13] or [14]. In our case, i.e. when $\sigma$ and $\tau$ are functions, these methods fail and techniques from nonparametric regression become necessary. We postpone a more careful discussion of models (1.1) and (1.2) to Section 2.

Both models incorporate, as usual in high-frequency financial models, an additional noise term, denoted as microstructure noise (cf. [1] and [16]), in order to model market frictions such as bid-ask spreads and rounding errors. In general, microstructure noise is often assumed to be a white noise process with bounded fourth moment. Therefore, we may interpret both models as obtaining data from transformed Brownian motions under additional measurement errors. In particular, our assumptions cover the important case $\epsilon_{i,n} \overset{\text{i.i.d.}}{\sim} N(0,1)$.
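For readers who want to experiment, both observation schemes are straightforward to simulate. The following sketch is our own code, not part of the paper; the stochastic integral in (1.1) is approximated by a left-point Riemann sum on the sampling grid itself, and the function names are ours.

```python
import numpy as np

def simulate(n, sigma, tau, model=1, rng=None):
    """Simulate Y_{i,n}, i = 1, ..., n, from model (1.1) (model=1) or
    model (1.2) (model=2).  sigma and tau are vectorized callables on [0, 1]."""
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(1, n + 1) / n
    dW = rng.normal(0.0, np.sqrt(1.0 / n), n)   # Brownian increments on the grid
    eps = rng.normal(0.0, 1.0, n)               # microstructure noise, E eps^2 = 1
    if model == 1:   # time-transformed BM: int_0^{i/n} sigma(s) dW_s (Riemann sum)
        signal = np.cumsum(sigma(t) * dW)
    else:            # space-transformed BM: sigma(i/n) W_{i/n}
        signal = sigma(t) * np.cumsum(dW)
    return signal + tau(t) * eps
```

Calling `simulate` with constant functions recovers the classical constant-parameter setting discussed above.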
In this paper we try to understand how estimation of the functions $\sigma^2$ and $\tau^2$ themselves in (1.1) and (1.2) can be performed, i.e. of the time derivative of the integrated volatility. To our knowledge, this issue has never been addressed before; a remarkable exception is [3], where a harmonic analysis technique is introduced in order to recover $\sigma^2$ from noiseless data. A naive estimator of $\sigma^2$ would be the derivative of an estimator of $\int_0^s\sigma^2(x)\,dx$ with respect to $s$. However, (numerical) differentiation of $\int_0^s\sigma^2(x)\,dx$ with respect to $s$ yields an additional degree of ill-posedness, and to the best of our knowledge there are no estimates and no theoretical results available on how to estimate $\sigma^2$ in our situation. Instead, we propose a regularized estimator for $\sigma^2$ and $\tau^2$ that attains the minimax rate of convergence. Our estimator is a Fourier series estimator, where we estimate the single cosine Fourier coefficients $\int_0^1\sigma^2(x)\cos(k\pi x)\,dx$, $k = 0, 1, \ldots$, by a particular spectral estimator which is specifically tailored to this problem. The difficulty of estimating $\sigma^2$ can be explained generically from the point of view of statistical inverse problems: microstructure noise induces an additional degree of ill-posedness - similar as in a deconvolution problem - which in our case leads to a reduction of the rate of convergence by a factor $1/2$. Surprisingly, and in contrast to deconvolution, this is only reflected in the behavior of the eigenvalues of the covariance operator of the process in (1.1) and (1.2), and not in the tail behavior of the Fourier transform of the error $\epsilon_{i,n}$.

We stress that we are aware of the fact that our model assumes deterministic functions $\sigma$ and $\tau$ which depend only on time $t$; the generalization to $\sigma(t, X_t)$ is not obvious and a challenge for further research. However, the purely deterministic case already helps us to reveal the daily pattern of the volatility, and we believe that our analysis is an important step towards the understanding of these models from the viewpoint of statistical inverse problems.

Results:
All results are obtained with respect to the MISE-risk. Let $\alpha$ and $\beta$ denote the smoothness of $\sigma^2$ and $\tau^2$, respectively. Roughly speaking, these numbers correspond to the usual Sobolev indices, although in our situation a particular choice of basis is required, leading us to the definition of Sobolev s-ellipsoids (see Definition 1). Then we show that $\tau^2$ can be estimated at rate $n^{-\beta/(2\beta+1)}$ for $\beta > 1$, $\alpha > 1/4$ in model (1.1) and $\beta > 1$, $\alpha > 3/4$ in model (1.2), and that for $\sigma^2$ the $n^{-\alpha/(4\alpha+2)}$ rate of convergence holds for $\alpha > 1/2$, $\beta > 1/2$ in model (1.1) and $\alpha > 3/4$, $\beta > 1/2$ in model (1.2), uniformly over Sobolev s-ellipsoids. Lower bounds with respect to Hölder classes for estimation of $\sigma^2$ have been obtained in [17]. Here we extend this result to Sobolev s-ellipsoids. It follows that the obtained rates are indeed minimax.

To summarize, our major finding is that, in contrast to ordinary deconvolution, the difficulty of estimating $\sigma^2$ when corrupted by additional (microstructure) noise $\epsilon$ is generically increased by a factor of $1/2$ over Sobolev s-ellipsoids. This is quite surprising, because one might have expected that, for instance, Gaussian error leads to logarithmic convergence rates due to the exponential decay of its Fourier transform (see e.g. [4], [6], [7] and [11] for some results in this direction). We stress that for our method a minimal smoothness of $\sigma^2$ in (1.1) of $\alpha > 1/2$ is required. Roughly speaking, the results imply that $n$ data points for the estimation of $\sigma^2$ can be compared to the situation where we have $\sqrt n$ observations in usual nonparametric regression.

The work is organized as follows. In Sections 2 and 3 we discuss models (1.1) and (1.2) in more detail, introduce notation and define the required smoothness classes, Sobolev s-ellipsoids (details can be found in Appendix B). Sections 4.1 and 4.2 are devoted to the estimation of $\tau^2$ and $\sigma^2$, respectively, and present the rates of convergence of the estimators (for proofs see Appendix A). Section 5 provides the minimax result.
In Section 6 we briefly discuss some numerical results and illustrate the robustness of the estimator against non-normality and violations of the required smoothness assumptions for $\sigma^2$ and $\tau^2$. Some further results and technicalities for Sections 4.1 and 4.2 are given in the supplementary material.

2 Discussion of the models

In this section we briefly discuss the background from financial economics of model (1.1) and explore the differences between models (1.1) and (1.2). We may consider the processes $(\sigma(t)W_t)_{t\in[0,1]}$ and $\big(\int_0^t\sigma(s)\,dW_s\big)_{t\in[0,1]} \overset{D}{=} (W(H(t)))_{t\in[0,1]}$, $H(t) := \int_0^t\sigma^2(s)\,ds$, as (inhomogeneously) scaled Brownian motions, where scaling takes place in space and in time, respectively. Hence we will refer to $(\sigma(t)W_t)_{t\in[0,1]}$ and $\big(\int_0^t\sigma(s)\,dW_s\big)_{t\in[0,1]}$ in the following as space-transformed (sBM) and time-transformed (tBM) Brownian motion.

Model (1.1):
In the financial econometrics literature, variations of model (1.1) are often denoted as high-frequency models, since $(W_t)_{t\in[0,1]}$ is sampled at time points $t = i/n$, and nowadays there is a vast amount of literature on volatility estimation in high-frequency models with an additional microstructure noise term (see [2], [15], [26] and [27]). These kinds of models have attracted a lot of attention recently, since the usual quadratic variation techniques for estimation of $\int_0^1\sigma^2(x)\,dx$ lead to inconsistent estimators (cf. [26]). We are aware of the fact that, in contrast to our model, volatility is generally modelled not only as time dependent but also as depending on the process itself, i.e. $Y_{i,n} = X_{i/n} + \tau(i/n)\epsilon_{i,n}$, $i = 1,\ldots,n$, $dX_t = \sigma(t, X_t)\,dW_t$. An overview of commonly used parametric forms of $\sigma(t, X_t)$, and a nonparametric treatment in the absence of microstructure noise, can be found in [12]. It is known that the same rates as in the case of constant $\sigma$ and $\tau$ hold true if we consider model (1.1) and estimate the so-called integrated volatility or realized volatility $\int_0^s\sigma^2(x)\,dx$ ($s \in [0,1]$) and $\int_0^s\tau^2(x)\,dx$ instead of $\sigma^2$ and $\tau^2$, respectively (see [20] and [22] for a discussion of estimation of integrated volatility and related quantities). Recently, model (1.1) has been proven to be asymptotically equivalent to a Gaussian shift experiment (see [21]). $\sigma^2$ as a function of time corresponds in model (1.1) to the instantaneous volatility or spot volatility.

Model (1.2):
Model (1.2) can be regarded as a nonparametric extension of the model with constant $\sigma, \tau$, as discussed for variogram estimation by [24]. To motivate the usefulness of sBM we give the following lemma.
Lemma 1. (i) Assume that $\sigma$, $0 < c \le \sigma$, is continuously differentiable. Then the corresponding sBM, $(\sigma(t)W_t)_{t\in[0,1]}$, is the unique solution of the SDE
$$dX_t = X_t\,d\big(\log(\sigma(t))\big) + \sigma(t)\,dW_t, \quad X_0 = 0, \quad 0 \le t \le T.$$
(ii) The variogram of sBM is given by
$$\gamma(s,t) := E(X_t - X_s)^2 = \big(\sigma(t)t^{1/2} - \sigma(s)s^{1/2}\big)^2 + \sigma(t)\sigma(s)\Big[|s-t| - \big(s^{1/2} - t^{1/2}\big)^2\Big].$$

Proof. (i)
It is easy to check that sBM indeed is a solution. To establish uniqueness, weapply Theorem 9.1 in [23]. (ii)
This follows by straightforward calculations.
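The closed form in Lemma 1 (ii) can be checked numerically against the direct expansion $E(X_t - X_s)^2 = \sigma^2(t)t + \sigma^2(s)s - 2\sigma(t)\sigma(s)(s\wedge t)$, which follows from $\mathrm{Cov}(W_s, W_t) = s \wedge t$. The example function below is our own choice; the check is a pure algebraic identity, so the two forms agree to floating-point precision.

```python
import numpy as np

def variogram_direct(s, t, sigma):
    # E(X_t - X_s)^2 for X_t = sigma(t) W_t, via Cov(W_s, W_t) = min(s, t)
    return sigma(t) ** 2 * t + sigma(s) ** 2 * s \
        - 2.0 * sigma(t) * sigma(s) * np.minimum(s, t)

def variogram_lemma(s, t, sigma):
    # closed form of Lemma 1 (ii)
    a = (sigma(t) * np.sqrt(t) - sigma(s) * np.sqrt(s)) ** 2
    b = sigma(t) * sigma(s) * (np.abs(s - t) - (np.sqrt(s) - np.sqrt(t)) ** 2)
    return a + b

sigma = lambda t: 2.0 + np.cos(2.0 * np.pi * t)   # arbitrary smooth example
s = np.linspace(0.01, 1.0, 50)
t = np.linspace(0.01, 1.0, 50)[::-1]
print(np.max(np.abs(variogram_direct(s, t, sigma) - variogram_lemma(s, t, sigma))))
```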
Comparison of the models:
We remark that tBM can be related to sBM by partial integration: $\int_0^t\sigma(s)\,dW_s = \sigma(t)W_t - \int_0^t\sigma'(s)W_s\,ds$. To see the differences, we compare in Figure 1 sBM and tBM in two typical situations: the case where $\sigma(t) = 0$ for $t > T$, and the case where $\sigma$ is discontinuous. If $\sigma(t) = 0$ for $t > T$, sBM tends to zero, whereas tBM tends to a constant, namely the random variable $\int_0^T\sigma(s)\,dW_s$. Furthermore, if $\sigma$ is a jump function, sBM has a jump too, whereas tBM does not. Unlike model (1.1), which can be viewed as a price process, model (1.2) has no direct application in financial mathematics. However, from the viewpoint of nonparametric statistics it seems to be a natural extension of the situation when $\sigma$ and $\tau$ are constant.

3 Notation

In this section we shortly introduce the setup needed in order to define the estimators. First we define suitable smoothness classes, which are different from, but related to, well-known Sobolev ellipsoids (see Definition B.1).
Definition 1.
For $\alpha > 0$, $C > 0$, we call the function space
$$\Theta_s := \Theta_s(\alpha, C) := \Big\{ f \in L^2[0,1] : \exists\,(\theta_n)_{n\in\mathbb{N}} \text{ s.t. } f(x) = \theta_0 + 2\sum_{i=1}^{\infty}\theta_i\cos(i\pi x),\ \sum_{i=1}^{\infty} i^{2\alpha}\theta_i^2 \le C \Big\}$$
a Sobolev s-ellipsoid. If there is a $C < \infty$ such that $f \in \Theta_s(\alpha, C)$, we say $f$ has smoothness $\alpha$. For $0 < l < u < \infty$, we further introduce the uniformly bounded Sobolev s-ellipsoid
$$\Theta_s^b(\alpha, C) := \Theta_s^b(\alpha, C, [l,u]) := \{ f \in \Theta_s(\alpha, C) : l \le f \le u \}.$$
Here the "s" refers to "symmetry", since the $L^2[0,1]$ basis
$$\{\psi_k,\ k = 0, 1, \ldots\} := \big\{1,\ \sqrt{2}\cos(k\pi t),\ k = 1, \ldots\big\} \qquad (3.1)$$
can also be viewed as a basis of the symmetric $L^2[-1,1]$ functions $\{f : f \in L^2[-1,1],\ f(x) = f(-x)\ \forall x \in [0,1]\}$. Usually, Sobolev ellipsoids are introduced with respect to the Fourier basis $\{1, \sqrt{2}\sin(2k\pi t), \sqrt{2}\cos(2k\pi t),\ k = 1, \ldots\}$ on $L^2[0,1]$ (see Definition B.1). As will turn out later on, Sobolev s-ellipsoids are more natural for our approach. If a function has a certain smoothness with respect to one basis, it might have a completely different smoothness with respect to the other. For instance, the function $\cos((2l+1)\pi x)$, $l \in \mathbb{N}$, has smoothness $\alpha$ for all $\alpha < \infty$ with respect to basis (3.1), while, as can be seen by direct calculations, it has only smoothness $\alpha < 1/2$ with respect to the Fourier basis.

Define $f_k : [0,1] \to \mathbb{R}$, $k \in \mathbb{N}_0$, by $f_k(x) := \psi_k(x/2)$. Note that for $k \ge 1$, $f_k^2$ can be expanded in basis (3.1) as $f_k^2 = \psi_0 + 2^{-1/2}\psi_k$. For any function $g$ we introduce the forward difference operator $\Delta_i g := g((i+1)/n) - g(i/n)$, and further the transformed variables $\Delta Y^{k,1}_{i,n} := (Y_{i+1,n} - Y_{i,n})f_k(i/n)$ and $\Delta Y^{k,2}_{i,n} := (\tilde Y_{i+1,n} - \tilde Y_{i,n})f_k(i/n)$, $i = 1, \ldots, n-1$, and write $\Delta Y^k_{i,n} = \Delta Y^{k,l}_{i,n}$, $l = 1, 2$, when no distinction is needed. Throughout the paper we abbreviate first order differences of observations by
$$\Delta Y^k := \big(\Delta Y^k_{1,n}, \ldots, \Delta Y^k_{n-1,n}\big)^t.$$
We write $M_{p,q}$, $M_p$ and $D_p$ for the space of $p \times q$ matrices, $p \times p$ matrices and $p \times p$ diagonal matrices over $\mathbb{R}$, respectively. Further let $D_{n-1} \in M_{n-1}$ be given by $(D_{n-1})_{i,j} = \sqrt{2/n}\,\sin(ij\pi/n)$, and define
$$\lambda_{i,n-1} := 4\sin^2\big(i\pi/(2n)\big), \quad i = 1, \ldots, n-1, \qquad (3.2)$$
the eigenvalues of the covariance matrix $K_{n-1} \in M_{n-1}$ of the MA(1) process $\Delta_i\epsilon_{i,n} := \epsilon_{i+1,n} - \epsilon_{i,n}$, $i = 1, \ldots, n-1$.
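The objects just introduced are easy to verify numerically (a small sanity sketch in our own notation): $D$ is symmetric and orthogonal, and it diagonalizes the tridiagonal covariance matrix $K$ of the differenced noise with the eigenvalues (3.2).

```python
import numpy as np

n = 16
i = np.arange(1, n)                                         # frequencies 1, ..., n-1
D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)   # the matrix D
lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2              # eigenvalues (3.2)

# K: covariance matrix of the MA(1) differences eps_{i+1} - eps_i
# (2 on the diagonal, -1 on the first off-diagonals)
K = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)

# D is symmetric and orthogonal, and K = D Lambda D
print(np.allclose(D @ D, np.eye(n - 1)), np.allclose(D @ np.diag(lam) @ D, K))
```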
More explicitly, $K_{n-1}$ is tridiagonal with
$$(K_{n-1})_{i,j} = \begin{cases} 2 & i = j, \\ -1 & |i-j| = 1, \\ 0 & \text{else.} \end{cases} \qquad (3.3)$$
Note that we can diagonalize $K_{n-1}$ explicitly by $K_{n-1} = D_{n-1}\Lambda_{n-1}D_{n-1}$, where $\Lambda_{n-1}$ is diagonal with diagonal entries given by (3.2). We will suppress the index $n-1$ and write $K$, $D$, $\Lambda$, $\lambda_i$ instead of $K_{n-1}$, $D_{n-1}$, $\Lambda_{n-1}$ and $\lambda_{i,n-1}$, respectively. We write $[x] := \max_{z\in\mathbb{Z}}\{z \le x\}$, $x \in \mathbb{R}$, for the integer part of $x$. $\log(\cdot)$ is defined to be the binary logarithm, and in order to define the estimators properly we assume throughout the paper additionally $n > 2$.

4.1 Estimation of $\tau^2$

Before we turn to the estimation of the volatility $\sigma^2$, we first discuss estimation of the noise variance, i.e. $\tau^2$. Let $J_n^\tau \in D_{n-1}$ be given by
$$(J_n^\tau)_{i,j} := \big(n - [n/\log n]\big)^{-1}\lambda_i^{-1}\delta_{i,j}, \quad \text{for } [n/\log n] \le i, j \le n-1,$$
and $(J_n^\tau)_{i,j} := 0$ otherwise, where $\lambda_i$ is defined in (3.2) and $\delta_{i,j}$ denotes the Kronecker delta. We consider models (1.1) and (1.2) simultaneously. Let
$$\hat t_{k,0} := \big(\Delta Y^k\big)^t DJ_n^\tau D^t\big(\Delta Y^k\big). \qquad (4.1)$$
In Lemma C.1 it will be shown that $\hat t_{k,0}$ is a $\sqrt n$-consistent estimator of
$$t_{k,0} := \int_0^1\tau^2(x)f_k^2(x)\,dx.$$
Note that for $k \ge 1$, $t_{k,0} = \int_0^1\tau^2(x)\psi_0(x)\,dx + 2^{-1/2}\int_0^1\tau^2(x)\psi_k(x)\,dx$. Define $Z := D(\Delta Y^k)$ and denote by $Z_i$ the $i$-th component of $Z$. Then
$$\hat t_{k,0} = \big(n - [n/\log n]\big)^{-1}\sum_{i=[n/\log n]}^{n-1}\lambda_i^{-1}Z_i^2. \qquad (4.2)$$
Hence this can also be seen as a spectral filter in the Fourier domain, where we cut off the first $n/\log n$ frequencies.
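A minimal implementation of (4.1)/(4.2), and of the cosine series estimator (4.3) built from it, might look as follows. This is our own sketch, not the authors' code: the transform $D$ is applied as an explicit matrix rather than via the FFT, and the helper names are ours.

```python
import numpy as np

def hat_t(dYk):
    """Spectral estimator (4.2): sine-transform the differenced data,
    rescale by the eigenvalues lambda_i of (3.2), and average over the
    high frequencies i >= [n / log2(n)]."""
    n = dYk.size + 1
    i = np.arange(1, n)
    D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
    Z = D @ dYk
    lo = int(n / np.log2(n))          # cut off the first n/log n frequencies
    return np.sum(Z[lo - 1:] ** 2 / lam[lo - 1:]) / (n - lo)

def hat_tau2(Y, N, grid):
    """Cosine series estimator of tau^2, evaluated at the points in `grid`."""
    n = Y.size
    x = np.arange(1, n) / n
    dY = np.diff(Y)
    f = lambda k: np.sqrt(2.0) * np.cos(k * np.pi * x / 2.0)   # f_k = psi_k(x/2)
    t = [hat_t(dY)] + [hat_t(dY * f(k)) for k in range(1, N + 1)]
    return t[0] + 2.0 * sum((t[k] - t[0]) * np.cos(k * np.pi * grid)
                            for k in range(1, N + 1))
```

On simulated data with constant $\tau$, `hat_t(np.diff(Y))` should return a value close to $\tau^2$ once $n$ is moderately large.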
Note that for $i \ge 1$, $2^{1/2}(t_{i,0} - t_{0,0}) = \int_0^1\tau^2(x)\psi_i(x)\,dx$ is the $i$-th series coefficient with respect to basis (3.1). This observation suggests to construct the cosine series estimator
$$\hat\tau^2_N(t) := \hat t_{0,0} + 2\sum_{i=1}^N\big(\hat t_{i,0} - \hat t_{0,0}\big)\cos(i\pi t). \qquad (4.3)$$
The next result provides the rate of convergence of $\hat\tau^2_N$ uniformly over Sobolev s-ellipsoids. To this end, a version of the continuous Sobolev embedding theorem is required for non-integer indices $\alpha, \beta$ (see Lemma D.8). A proof of the following theorem can be found in the supplementary material.

Theorem 1 (MISE of $\hat\tau^2_N$). Let $\hat\tau^2_N$ be defined as in (4.3). Assume $\beta > 1$ and $Q, \bar Q > 0$. Further suppose that $N = N_n = o\big(n^{1/2}/\log n\big)$. Assume either model (1.1) and $\alpha > 1/4$, or model (1.2) and $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1}\big).$$
Minimizing the r.h.s. yields $N^* = O\big(n^{1/(2\beta+1)}\big)$, and consequently
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\tau^2_{N^*}\big) = O\big(n^{-2\beta/(2\beta+1)}\big).$$

Remark 1.
Note that for model (1.1), Theorem 1 holds whenever $\alpha > 1/4$. Hence the Brownian motion part of the model can be viewed as a nuisance parameter, not affecting the rates for estimation of $\tau^2$. However, for model (1.2), $\alpha > 3/4$ is required here. This more restrictive assumption is essentially a consequence of the fact that the process $\sigma(i/n)W_{i/n}$ is in general not a martingale.

Remark 2.
The result of Theorem 1 can be extended to $1/2 < \beta \le 1$ in model (1.1), and to $1/2 < \alpha \le 3/4$, $1/2 < \beta \le 1$ in model (1.2). Let $\tilde t_{k,0}$ be defined as $\hat t_{k,0}$ in (4.1), but with $J_n^\tau$ replaced by $\tilde J_n^\tau \in D_{n-1}$,
$$\big(\tilde J_n^\tau\big)_{i,j} = \begin{cases} 2n^{-1}\lambda_i^{-1}\delta_{i,j} & \text{for } [n/2] \le i, j \le n-1, \\ 0 & \text{otherwise.} \end{cases}$$
Introduce further the estimator $\tilde\tau^2_N(t) = \tilde t_{0,0} + 2\sum_{i=1}^N(\tilde t_{i,0} - \tilde t_{0,0})\cos(i\pi t)$, and suppose that $N = O\big(n^{1/(2\beta+1)}\big)$. Then we obtain, by slight modifications of the proof of Theorem 1, for $\beta > 1/2$, $\alpha > 1/2$ and $Q, \bar Q > 0$:

(i) Assume model (1.1). Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1} + Nn^{1-2\beta}\big),$$
and $N^* = O\big(n^{(2\beta-1)/(2\beta+1)}\big)$ yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_{N^*}\big) = O\big(n^{-2\beta(2\beta-1)/(2\beta+1)}\big).$$

(ii) Assume model (1.2). Then we have the expansion
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1} + Nn^{1-2\beta} + Nn^{2-4\alpha}\big),$$
and the choice
$$N^* = \begin{cases} O\big(n^{(2\beta-1)/(2\beta+1)}\big) & \text{for } \beta \le 1 \wedge (2\alpha - 1/2), \\ O\big(n^{(4\alpha-2)/(2\beta+1)}\big) & \text{for } \alpha \le 3/4 \wedge (\beta/2 + 1/4) \end{cases}$$
yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_{N^*}\big) = \begin{cases} O\big(n^{-2\beta(2\beta-1)/(2\beta+1)}\big) & \text{for } \beta \le 1 \wedge (2\alpha - 1/2), \\ O\big(n^{-2\beta(4\alpha-2)/(2\beta+1)}\big) & \text{for } \alpha \le 3/4 \wedge (\beta/2 + 1/4), \end{cases}$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Remark 3.
It is also possible, although more technical, to compute the asymptotic constant of the estimator $\hat\tau^2_{N^*}$. Suppose that the microstructure noise is Gaussian, and assume model (1.1) and $\beta > 1$, or model (1.2) and $\beta > 1$, $\alpha > 3/4$. Then we have, more explicitly,
$$\mathrm{MISE}\big(\hat\tau^2_{N^*}\big) = \frac{2N^*}{n}\int_0^1\tau^4(x)\,dx + \sum_{k=N^*+1}^{\infty}\Big(\int_0^1\tau^2(x)\psi_k(x)\,dx\Big)^2 + o\big(N^*n^{-1}\big).$$

Remark 4.
There are of course simpler estimators for $t_{k,0}$. For instance, if we replace $J_n^\tau$ in (4.1) by $(2n)^{-1}I_{n-1}$, where $I_{n-1} \in D_{n-1}$ denotes the identity matrix, we obtain the quadratic variation estimator of $t_{k,0}$ (cf. [1]), and it is not difficult to show that this estimator attains the optimal rate of convergence. This approach could even be extended to a nonparametric estimator of the form (4.3). However, the single Fourier coefficients are not estimated efficiently: in the case of Gaussian microstructure noise the asymptotic constant is $3n^{-1}\int_0^1\tau_k^4(x)\,dx$ (this is a straightforward extension of Theorem A.1 in [27]), whereas for our estimator we have $2n^{-1}\int_0^1\tau_k^4(x)\,dx$ (see Lemma C.1). If $\tau$ is constant, it can easily be seen that the estimator in (4.1) is efficient for $k = 0$, whereas quadratic variation is not.

Remark 5.
In practical applications it would be more natural to use, instead of $n/\log n$ in (4.2), other cut-off frequencies, e.g. $n^\gamma/\log n$ or $qn$, where $1/2 < \gamma \le 1$, $0 < q < 1$. A smaller $\gamma$ decreases the variance while, on the other hand, increasing the bias of the estimator.

4.2 Estimation of $\sigma^2$

Define $J_n \in D_{n-1}$ by
$$(J_n)_{i,j} = \sqrt n\,\delta_{i,j}, \quad \text{for } [n^{1/2}] + 1 \le i, j \le 2[n^{1/2}], \qquad (4.4)$$
and $(J_n)_{i,j} = 0$ otherwise. Similarly as for the estimation of $\tau^2$, we first introduce an estimator of the appropriate Fourier coefficients by
$$\hat s_{k,0} = \big(\Delta Y^k\big)^t DJ_nD^t\big(\Delta Y^k\big) - \tfrac{7}{3}\pi^2\,\hat t_{k,0}. \qquad (4.5)$$
The second part, i.e. $-\tfrac{7}{3}\pi^2\,\hat t_{k,0}$, is a correction for the bias induced by the microstructure noise; the constant $\tfrac{7}{3}\pi^2$ arises as the limit of $\sqrt n$ times the sum of the eigenvalues $\lambda_i$ between $[n^{1/2}] + 1$ and $2[n^{1/2}]$ in (4.4). As we will see, the estimator $\hat t_{k,0}$ has better convergence properties than the first term in $\hat s_{k,0}$, and hence does not affect the asymptotic variance. Similarly to (4.3), we put
$$\hat\sigma^2_N(t) = \hat s_{0,0} + 2\sum_{i=1}^N\big(\hat s_{i,0} - \hat s_{0,0}\big)\cos(i\pi t). \qquad (4.6)$$

Theorem 2 (MISE of $\hat\sigma^2_N$). Let $\hat\sigma^2_N$ be defined as in (4.6). Suppose that $N = N_n = o\big(n^{1/4}\big)$, $\beta > 1/2$ and $Q, \bar Q > 0$. Assume model (1.1) and $\alpha > 1/2$, or model (1.2) and $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2}\big),$$
and minimizing the r.h.s. yields
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = O\big(n^{-\alpha/(2\alpha+1)}\big) \quad \text{for } N^* = O\big(n^{1/(4\alpha+2)}\big).$$
The proof of Theorem 2 is given in Section A.2.
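For $k = 0$, the estimator (4.5) can be sketched as follows. This is our own simplified code: the noise-bias constant is computed from the eigenvalue sum as in Remark 7 rather than hard-coded, and the plug-in estimate of $\tau^2$ uses the high-frequency average (4.2).

```python
import numpy as np

def hat_s0(Y):
    """Sketch of the coefficient estimator (4.5) for k = 0, estimating
    s_{0,0} = int_0^1 sigma^2(x) dx."""
    n = Y.size
    dY = np.diff(Y)
    i = np.arange(1, n)
    D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
    Z = D @ dY
    m = int(np.sqrt(n))
    win = slice(m, 2 * m)                    # frequencies [sqrt(n)]+1, ..., 2[sqrt(n)]
    quad = np.sqrt(n) * np.sum(Z[win] ** 2)  # quadratic form with J_n of (4.4)
    cut = int(n / np.log2(n))                # plug-in tau^2 estimate, cf. (4.2)
    tau2 = np.mean(Z[cut - 1:] ** 2 / lam[cut - 1:])
    return quad - np.sqrt(n) * np.sum(lam[win]) * tau2
```

On data with constant $\sigma$ the output fluctuates around $\sigma^2$ at the slow $n^{-1/4}$ scale, so sizable samples (or averaging over repetitions) are needed before the estimate stabilizes.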
Remark 6.
It is also possible to extend this result to less smooth functions $\sigma^2$ and $\tau^2$.

(i) Assume model (1.1) and $\alpha > 1/4$, $\beta > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2} + Nn^{1-2\beta} + Nn^{1/2-2\alpha}\big),$$
and
$$N^* = \begin{cases} O\big(n^{(2\alpha-1/2)/(2\alpha+1)}\big) & \text{for } \alpha \le 1/2 \wedge (\beta - 1/4), \\ O\big(n^{(2\beta-1)/(2\alpha+1)}\big) & \text{for } \beta \le 3/4 \wedge (\alpha + 1/4) \end{cases}$$
yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = \begin{cases} O\big(n^{-2\alpha(2\alpha-1/2)/(2\alpha+1)}\big) & \text{for } \alpha \le 1/2 \wedge (\beta - 1/4), \\ O\big(n^{-2\alpha(2\beta-1)/(2\alpha+1)}\big) & \text{for } \beta \le 3/4 \wedge (\alpha + 1/4). \end{cases}$$

(ii) Assume model (1.2) and $\alpha > 3/4$, $\beta > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2} + Nn^{1-2\beta}\big),$$
and $N^* = O\big(n^{(2\beta-1)/(2\alpha+1)}\big)$ yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = O\big(n^{-2\alpha(2\beta-1)/(2\alpha+1)}\big),$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Remark 7.
In analogy to (4.2), the estimator $\hat s_{k,0}$ can also be viewed as a spectral filter in the Fourier domain, where essentially only the frequencies $n^{1/2}, \ldots, 2n^{1/2}$ play a role. For practical purposes, one can generalize this to estimators where the frequencies $k, \ldots, [cn^{1/2}]$, $c > 1$, are used. If $\sigma^2$ is assumed to be very smooth, one may even set $k = 1$. In this more general setting, the constant $-\tfrac{7}{3}\pi^2$ in the definition of the estimator has to be replaced by $-n\big([cn^{1/2}] - k\big)^{-1}\sum_{i=k}^{[cn^{1/2}]}\lambda_i$.

Remark 8.
Since the matrix $D$ in the definition of $\hat s_{k,0}$ is a discrete sine transform (for a definition see [5]), the estimator $\hat\sigma^2_N$ can be calculated explicitly in $O(Nn\log n)$ steps.

5 Lower bounds

In this section we discuss the optimality of the proposed estimators. To this end, we establish lower bounds with respect to Sobolev s-ellipsoids.

Theorem 3.
Assume model (1.1) or model (1.2), $\alpha \in \mathbb{N}\setminus\{0\}$. Further assume $\tau$ constant. Then there exists a $C > 0$ (depending only on $\alpha, Q, l, u$) such that
$$\lim_{n\to\infty}\ \inf_{\hat\sigma^2_n}\ \sup_{\sigma^2\in\Theta_s^b(\alpha,Q)} E\Big(n^{\alpha/(2\alpha+1)}\big\|\hat\sigma^2_n - \sigma^2\big\|_2^2\Big) \ge C.$$

Proof.
The proof relies on a multiple hypothesis testing argument and is close to the proof given in [17], Theorem 2.1. However, the lower bounds there are established with respect to the space of Hölder continuous functions of index $\alpha$ on the interval $I = [0,1]$, i.e. for $0 < l < u < \infty$,
$$C^b(\alpha, L) := C^b(\alpha, L, [l,u]) := \Big\{ f : f^{(p)} \text{ exists for } p = [\alpha],\ \big|f^{(p)}(x) - f^{(p)}(y)\big| \le L|x-y|^{\alpha-p}\ \forall\,x,y \in I,\ 0 < l \le f \le u < \infty \Big\}.$$
Therefore, the statement above does not follow immediately from [17], Theorem 2.1, because $C^b(\alpha, L) \not\subset \Theta_s^b(\alpha, Q)$ due to boundary effects. Here we only point out the differences to the proof of [17], Theorem 2.1. We write $\sigma^2_{\min}, \sigma^2_{\max}$ for the lower and upper bound of $\sigma^2$, respectively, i.e. $\sigma^2 \in \Theta_s^b(\alpha, Q, [\sigma^2_{\min}, \sigma^2_{\max}])$. Without loss of generality we may assume that $\sigma^2_{\min} = 1$. For the multiple hypothesis testing argument (cf. [25]) a specific choice of functions $\sigma^2_{i,n}$ is required. For a construction see [17], proof of Theorem 2.1, where $L := \big(\pi^{2\alpha}Q\big)^{1/2}/\|K^{(\alpha)}\|_\infty$. It remains to show $\sigma^2_{i,n} \in \Theta_s^b(\alpha, Q)$, $i = 0, 1, \ldots, M$. Due to the construction of $\sigma^2_{i,n}$, we have $\sigma^2_{i,n}(t) = 1$ for $t \in [0, 1/4] \cup [3/4, 1]$
and $(\sigma^2_{i,n})^{(l)}(0) = (\sigma^2_{i,n})^{(l)}(1) = 0$ for $l \in \{1, 2, \ldots, \alpha\}$. Thus $\sigma^2_{i,n} \in W\big(\alpha, L\|K^{(\alpha)}\|_\infty\big)$ (for a definition see Equation (B.1)), $\alpha \in \{1, 2, 3, \ldots\}$, $i = 0, \ldots, M$. Hence, by Theorem B.1, it follows that $\sigma^2_{i,n} \in \Theta_s(\alpha, Q)$ for $i = 0, \ldots, M$.

6 Numerical results

In this section we briefly illustrate the performance of our estimators. Our aim is not to give a comprehensive simulation study; rather, we would like to illustrate the behaviour of the estimator when the assumptions of Theorems 1 and 2 are violated. In the following, we apply our estimator to simulated data, where we always set $n = 25{,}000$. From the point of view of financial statistics, this is approximately the sample size obtained over a trading day (6.5 hours, sampled every second). We chose the smoothing parameter $N$ in (4.3) and (4.6) as the minimizer of $\|\hat\tau^2 - \tau^2\|_n$ and $\|\hat\sigma^2 - \sigma^2\|_n$, respectively, which is in practice unknown. Of course, proper selection of the threshold $N^*$ is of major importance for the performance of the estimator. To this end, various methods are available; among others, cross-validation techniques, balancing principles and variants thereof could be employed (see e.g. [9], [10], [18] and [19]). A thorough investigation is postponed to a separate paper. Throughout our simulations we assumed $\tau = 0.01$ and concentrated mainly on estimation of $\sigma^2$, as it is the more challenging task.

In Figure 2 we have displayed the estimator for $\sigma^2(t) = (2 + \cos(2\pi t))/4$. Note that by Definition 1, $\sigma^2$ has "infinite" smoothness, i.e. for any $\alpha > 0$ we can find a $Q < \infty$ such that $\sigma^2 \in \Theta_s(\alpha, Q)$. The reconstruction shows that estimation of $\tau^2$ can be done much more easily than estimation of $\sigma^2$, although it is of smaller magnitude. In Figure 3, we are interested in the behavior of the estimators when heavy-tailed microstructure noise is present. This was simulated by generating $\epsilon_{i,n} \sim 3^{-1/2}\,t(3)$, $i = 1, \ldots, n$, i.i.d., where $t(3)$ denotes a t-distribution with 3 degrees of freedom. We can see from Plot 1 in Figure 3 that the resulting microstructure noise has some severe outliers, in accordance with the tail $x^{-4}$ of the density of $t(3)$. Nevertheless, estimation of $\tau^2$ and $\sigma^2$ is not visibly affected by the distribution of the noise.

In the subsequent figures we illustrate the behaviour of the estimator when the required smoothness assumptions on $\sigma^2$ and $\tau^2$ are violated. To this end, we investigate in Figure 4 the situation when $\sigma^2$ is random itself, i.e. a realization of a Brownian motion, $\sigma^2(t) = 3|\tilde W_t|$. The Brownian motion $(\tilde W_t)_{t\in[0,1]}$ was modelled as independent of the Brownian motion in (1.1) and of the microstructure noise process. It is of course not possible to reconstruct the complete path of $\sigma^2$, but as Figure 4 indicates, the estimator at least detects the smoothed shape of the path. Hence our estimator might already reveal some parts of the pattern of volatility also in the case where $\sigma^2$ is non-deterministic, which is certainly more realistic in most applications.

Finally, in Figure 5 we investigated the case of $\sigma^2$ being a jump function. We put $\sigma^2(t) = 1 + I_{(1/2,1]}(t)$, a function with a jump at $t = 1/2$. Fourier series usually show a Gibbs phenomenon, i.e. an oscillating behavior at discontinuities. This behavior is also clearly visible in the graph of $\hat\sigma^2$. In order to reconstruct jumps in volatility, other methods will certainly be more suitable; this is postponed to a separate paper.
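The heavy-tailed experiment of Figure 3 is easy to reproduce in outline. The sketch below uses our own (smaller) parameter choices; $\epsilon_{i,n} \sim 3^{-1/2}t(3)$, so that $E\epsilon_{i,n}^2 = 1$ while the fourth moment is infinite, i.e. the moment assumption of the model is deliberately violated.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2048
t = np.arange(1, n + 1) / n
eps = rng.standard_t(3, n) / np.sqrt(3.0)             # E eps^2 = 1, tail x^{-4}
sig = np.sqrt((2.0 + np.cos(2.0 * np.pi * t)) / 4.0)  # sigma^2 as in Figure 2
Y = np.cumsum(sig * rng.normal(0.0, np.sqrt(1.0 / n), n)) + 0.1 * eps  # tau = 0.1

# tau^2-estimator (4.2) with k = 0
dY = np.diff(Y)
i = np.arange(1, n)
D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
Z = D @ dY
lo = int(n / np.log2(n))
tau2_hat = np.mean(Z[lo - 1:] ** 2 / lam[lo - 1:])
print(tau2_hat)   # estimate of tau^2 = 0.01 despite the heavy tails
```

Despite the outliers produced by $t(3)$, the spectral average remains a reasonable estimate of $\tau^2$, consistent with the robustness reported in Figure 3.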
Computational tasks:
We implemented the estimators in Matlab, using the routine fft() for the discrete sine transform (see Remark 8). Calculation of the estimators for a sample size of $n = 25{,}000$ took around 2-3 seconds on an Intel Celeron 1.7 GHz processor. As mentioned in Remark 8, the estimator can be calculated in $O(Nn\log n)$ steps. If we choose $N$ of the optimal order, i.e. $N \sim n^{1/(4\alpha+2)}$, we have for the complexity $O(Nn\log n) = o\big(n^{5/4}\log n\big)$, whenever $\alpha > 1/2$.

Appendix A: Convergence rate of $\hat\sigma^2$

In this section we give a proof of Theorem 2. To this end, we first introduce some notation and then prove a lemma in order to obtain uniform estimates of the bias and variance of the single estimators $\hat s_{k,0}$.

A.1 Preliminary Results and Notation
Proofs of the upper bounds are based on a decomposition of $\Delta Y^k$. In this subsection we present some further notation. Let $\sigma_k(t) := \sigma(t)f_k(t)$ and $\tau_k(t) := \tau(t)f_k(t)$, $t \in [0,1]$. Throughout the following, for the Sobolev s-ellipsoids of Definition 1, the constants are $l = \sigma^2_{\min}$ and $u = \sigma^2_{\max}$ for $\sigma^2$, and $l = \tau^2_{\min}$, $u = \tau^2_{\max}$ for $\tau^2$. We define
$$\phi_n := \sup_{\sigma^2\in\Theta_s^b(\alpha,Q)}\ \max_{i=1,\ldots,n-1}\ \sup_{\xi\in[i/n,(i+1)/n]}\Big|\sigma(\xi) - \sigma\Big(\frac{i}{n}\Big)\Big|, \qquad \bar\phi_n := \sup_{\tau^2\in\Theta_s^b(\beta,\bar Q)}\ \max_{k\le n^{1/2}}\ \max_{i=1,\ldots,n-1}|\Delta_i\tau_k|. \qquad (A.1)$$
In order to treat model (1.1) and model (1.2) simultaneously, we first define the more general process
$$V_{k,l} := X_{1,k} + X_{2,k} + Z_{1,k,l} + Z_{2,k}, \quad l = 1, 2, \qquad (A.2)$$
where $X_{1,k}$, $X_{2,k}$, $Z_{1,k,l}$ and $Z_{2,k}$ are the $(n-1)$-dimensional random vectors
$$(X_{1,k})_i := \sigma_k(i/n)\,\Delta_iW, \qquad (X_{2,k})_i := \tau_k(i/n)\,\Delta_i\epsilon_{i,n},$$
$$(Z_{1,k,1})_i := f_k(i/n)\int_{i/n}^{(i+1)/n}\Big(\sigma(s) - \sigma\Big(\frac{i}{n}\Big)\Big)\,dW_s, \qquad (Z_{1,k,2})_i := f_k(i/n)\,(\Delta_i\sigma)\,W_{(i+1)/n},$$
$$(Z_{2,k})_i := f_k(i/n)\,(\Delta_i\tau)\,\epsilon_{i+1,n}, \quad i = 1, \ldots, n-1.$$
Obviously, $\Delta Y^{k,1} = V_{k,1}$ and $\Delta Y^{k,2} = V_{k,2}$ if model (1.1) and (1.2) holds, respectively. Define the generalized estimators $\hat t_{k,0,l} := V_{k,l}^t\,DJ_n^\tau D^t\,V_{k,l}$ and $\hat s_{k,0,l} := V_{k,l}^t\,DJ_nD^t\,V_{k,l} - \tfrac{7}{3}\pi^2\,\hat t_{k,0,l}$. There are matrices $C_{1,k,l}, C_{2,k} \in M_{n-1,n}$ such that
$$V_{k,l} = C_{1,k,l}\,\xi + C_{2,k}\,\epsilon, \qquad (A.3)$$
where $\epsilon = (\epsilon_{1,n},\ldots,\epsilon_{n,n})^t$ and $\xi = \xi_n$ is standard $n$-variate normal, $\epsilon, \xi$ are independent, and $C_{1,k,l}\,\xi = X_{1,k} + Z_{1,k,l}$, $C_{2,k}\,\epsilon = X_{2,k} + Z_{2,k}$. Now let
$$s_{k,p} := \int_0^1\sigma_k^2(x)\cos(p\pi x)\,dx, \qquad t_{k,p} := \int_0^1\tau_k^2(x)\cos(p\pi x)\,dx \qquad (A.4)$$
be the scaled $p$-th Fourier coefficients of the cosine series of $\sigma_k^2$ and $\tau_k^2$, respectively.
Define the sums $A(\sigma_k^2, r)$ by
$$A\big(\sigma_k^2, r\big) = \begin{cases} s_{k,0} + 2\sum_{m=1}^{\infty} s_{k,2nm} & \text{for } r \equiv 0 \bmod 2n, \\ 2\sum_{m=0}^{\infty} s_{k,2nm+n} & \text{for } r \equiv n \bmod 2n, \\ \sum_{q\equiv\pm r \bmod 2n,\ q\ge 0}\ s_{k,q} & \text{for } r \not\equiv 0, n \bmod 2n, \end{cases}$$
and analogously $A(\tau_k^2, r)$ with $s_{k,p}$ replaced by $t_{k,p}$. Some properties of these sums are given in Lemma D.1 and Lemma D.2. Further define
$$\Sigma_k := \mathrm{diag}\big(\sigma_k(1/n), \ldots, \sigma_k(1 - 1/n)\big). \qquad (A.5)$$
We put $\mathrm{Cum}_4(\epsilon) := \mathrm{Cum}_4(\epsilon_{1,n})$ for the fourth cumulant of $\epsilon_{1,n}$. If $X, Y$ are independent random vectors, we write $X \perp Y$.

A.2 Proofs for Estimation of $\sigma^2$

Lemma A.1.
Let $\hat s_{k,0}$ be defined as in (4.5). Further assume $\beta > 1/2$, $Q, \bar Q > 0$, $0 < \sigma^2_{\min} \le \sigma^2_{\max} < \infty$, $0 < \tau^2_{\min} \le \tau^2_{\max} < \infty$ and $k = k_n \in \mathbb{N}$.

(i) Assume model (1.1), $\alpha > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \big|E(\hat s_{k,0}) - s_{k,0}\big| = O\big(n^{-1/2} + n^{-\beta} + n^{1/4-\alpha}\big), \qquad (A.6)$$
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \mathrm{Var}(\hat s_{k,0}) = O\big(n^{-1/2} + n^{1-2\beta}\big). \qquad (A.7)$$

(ii) Assume model (1.2), $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \big|E(\hat s_{k,0}) - s_{k,0}\big| = O\big(n^{-\beta} + n^{1/2-\alpha} + n^{-1/2}\big), \qquad (A.8)$$
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \mathrm{Var}(\hat s_{k,0}) = O\big(n^{-1/2} + n^{1-2\beta}\big), \qquad (A.9)$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Proof.
The proof mainly uses the generalized estimators introduced in Section A.1. It is clear that for two centered random vectors $P$ and $Q$,
\[
\langle P, Q \rangle_\sigma := E\!\left( P^t D J_n D Q \right)
\]
defines a semi-inner product, and by Lemma D.5, $P \perp Q \Rightarrow \langle P, Q \rangle_\sigma = 0$. Hence
\[
E\,\hat s_{k,1,l} = \langle X_{1,k}, X_{1,k} \rangle_\sigma + \langle X_{2,k}, X_{2,k} \rangle_\sigma + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\sigma + \langle Z_{2,k}, Z_{2,k} \rangle_\sigma + 2 \langle X_{1,k}, Z_{1,k,l} \rangle_\sigma + 2 \langle X_{2,k}, Z_{2,k} \rangle_\sigma - \tfrac{\pi^2}{2}\, E\!\left( \hat t_{k,1,l} \right). \quad \text{(A.10)}
\]
Clearly, with (iii) in Lemma D.1 and $r_n := n^{-1/2}\left( [n^{1/2}] - [n^{1/4}] \right)$,
\[
\langle X_{1,k}, X_{1,k} \rangle_\sigma = \frac1n \operatorname{tr}\left( \Sigma_k D J_n D \Sigma_k \right) = \frac1n \operatorname{tr}\left( J_n D \Sigma_k^2 D \right) = n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \left( A(\sigma_k^2,0) - A(\sigma_k^2,2i) \right) = r_n A(\sigma_k^2,0) - n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} A(\sigma_k^2,2i).
\]
Hence, due to $r_n \le 1$ and $|r_n - 1| \le n^{-1/4}$,
\[
\left| \langle X_{1,k}, X_{1,k} \rangle_\sigma - s_{k,0} \right| \le n^{-1/4} |s_{k,0}| + 2 \sum_{m=n}^\infty |s_{k,m}| + \frac{2}{\sqrt n} \sum_{i=0}^\infty |s_{k,i}|,
\]
and with Lemma D.2,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, X_{1,k} \rangle_\sigma - s_{k,0} \right| = O\!\left( n^{-1/4} + n^{1/2-\alpha} \right).
\]
Next we treat $\langle X_{2,k}, X_{2,k} \rangle_\sigma$. In order to do this let $T_k \in D_{n-1}$ with entries $(T_k)_{i,j} = \tau_k(i/n)\,\delta_{i,j}$. Further we define $\tilde T_k \in M_{n-1}$ by
\[
(\tilde T_k)_{i,j} = \begin{cases} (\Delta_i \tau_k)^2, & i = j-1, \\ (\Delta_j \tau_k)^2, & i = j+1, \\ 0, & \text{otherwise}. \end{cases} \quad \text{(A.11)}
\]
Note the relation
\[
\operatorname{Cov}(X_{2,k}) = T_k K T_k = \tfrac12\, T_k^2 K + \tfrac12\, K T_k^2 + \tfrac12\, \tilde T_k. \quad \text{(A.12)}
\]
Using Lemma D.3 yields
\[
\langle X_{2,k}, X_{2,k} \rangle_\sigma = E\!\left( X_{2,k}^t D J_n D X_{2,k} \right) = \operatorname{tr}\left( D J_n D\, T_k K T_k \right) = \tfrac12 \operatorname{tr}\left( J_n D T_k^2 K D \right) + \tfrac12 \operatorname{tr}\left( J_n D K T_k^2 D \right) + \tfrac12 \operatorname{tr}\left( J_n D \tilde T_k D \right) = \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) + \tfrac12 \operatorname{tr}\left( J_n D \tilde T_k D \right), \quad \text{(A.13)}
\]
and further
\[
\operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) = n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i \left( A(\tau_k^2,0) - A(\tau_k^2,2i) \right) = A(\tau_k^2,0)\, n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i - n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i\, A(\tau_k^2,2i). \quad \text{(A.14)}
\]
Because $\max_{i=[n^{1/4}]+1,\dots,[n^{1/2}]} \lambda_i = \lambda_{[n^{1/2}]} \le \pi^2 n^{-1}$, it holds
\[
\left| n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i\, A(\tau_k^2,2i) \right| \le n^{-1/2} \sum_i \lambda_i \sum_{q \equiv \pm 2i \bmod 2n,\, q \ge 0} |t_{k,q}| \le \pi^2 n^{-3/2} \sum_{i=0}^\infty |t_{k,i}| \le 2\pi^2 n^{-3/2} \sum_{i=0}^\infty |t_{0,i}|.
\]
Therefore, (A.14) can be written as
\[
\left| \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) - t_{k,0}\, n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i \right| \le \pi^2 \sum_{m=n}^\infty |t_{k,m}| + 8\pi^2 n^{-1/2} \sum_{i=0}^\infty |t_{0,i}|.
\]
This gives, by Lemma D.7 (which evaluates $n^{-1/2}\sum_i \lambda_i$) and Lemma D.2,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \left| \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) - \tfrac{\pi^2}{2}\, t_{k,0} \right| = O\!\left( n^{-1/2} \right).
\]
Finally, $\operatorname{tr}(J_n) = O(n^{1/2})$. It follows that
\[
\left| \operatorname{tr}\left( J_n D \tilde T_k D \right) \right| \le \operatorname{tr}(J_n)\, \lambda_1\!\left( D \tilde T_k D \right) \le 2 \operatorname{tr}(J_n) \max_i (\Delta_i \tau_k)^2.
\]
So,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{tr}\left( J_n D \tilde T_k D \right) = O\!\left( n^{1/2} \bar\phi_n^2 \right) \quad \text{(A.15)}
\]
and therefore
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \left| \langle X_{2,k}, X_{2,k} \rangle_\sigma - \tfrac{\pi^2}{2}\, t_{k,0} \right| = O\!\left( n^{-1/2} + n^{1/2} \bar\phi_n^2 \right).
\]
We bound the remaining terms of (A.10). Note that
\[
\langle Z_{1,k,1}, Z_{1,k,1} \rangle_\sigma = \operatorname{tr}\left( D J_n D \operatorname{Cov}(Z_{1,k,1}) \right) \le \lambda_1\!\left( \operatorname{Cov}(Z_{1,k,1}) \right) \operatorname{tr}\left( D J_n D \right) \le 2 \phi_n^2,
\]
implying $\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \langle Z_{1,k,1}, Z_{1,k,1} \rangle_\sigma = O(\phi_n^2)$. In order to bound $\langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma$ define
\[
L := \left( \frac{(i \wedge j) + 1}{n} \right)_{i,j=1,\dots,n-1} \quad \text{(A.16)}
\]
and $\Delta\Sigma_k \in D_{n-1}$ by $(\Delta\Sigma_k)_{i,j} := f_k(i/n)\,(\Delta_i \sigma)\,\delta_{i,j}$. We obtain
\[
\langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma = \operatorname{tr}\left( D J_n D\, (\Delta\Sigma_k)\, L\, (\Delta\Sigma_k) \right) \le n^{1/2} \phi_n^2 \quad \text{(A.17)}
\]
and hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma = O\!\left( n^{1/2} \phi_n^2 \right). \quad \text{(A.18)}
\]
Similarly to (A.17), $\langle Z_{2,k}, Z_{2,k} \rangle_\sigma \le n^{1/2} \bar\phi_n^2$, and thus
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \langle Z_{2,k}, Z_{2,k} \rangle_\sigma = O\!\left( n^{1/2} \bar\phi_n^2 \right). \quad \text{(A.19)}
\]
Note that
\[
\operatorname{Cov}\!\left( X_{1,k}, Z_{1,k,2} \right)_{i,j} = \begin{cases} 0, & j < i, \\ n^{-1} f_k^2(i/n)\, \sigma(i/n)\, (\Delta_j \sigma), & j \ge i. \end{cases}
\]
Hence by Proposition D.1 we obtain
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, Z_{1,k,2} \rangle_\sigma \right| = O\!\left( n^{-1/2} \log n\; \phi_n \right).
\]
Applying the Cauchy–Schwarz inequality gives
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, Z_{1,k,1} \rangle_\sigma \right| = O(\phi_n), \qquad \left| \langle X_{2,k}, Z_{2,k} \rangle_\sigma \right| \le \langle X_{2,k}, X_{2,k} \rangle_\sigma^{1/2}\, \langle Z_{2,k}, Z_{2,k} \rangle_\sigma^{1/2}.
\]
Using Proposition C.1 this yields (A.6) and (A.8). In order to give an upper bound for the variance of $\hat s_{k,1,l}$ note that
\[
\operatorname{Var}(\hat s_{k,1,l}) \le 2 \operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) + \tfrac{\pi^4}{2} \operatorname{Var}\!\left( \hat t_{k,1,l} \right).
\]
Furthermore, using (A.3) and Lemma D.3 (vi),
\[
V_{k,l}^t D J_n D V_{k,l} = \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \le 2\, \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi + 2\, \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon.
\]
Hence
\[
\operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) \le 8 \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) + 8 \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right).
\]
Finally, we bound $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi )$ and $\operatorname{Var}( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon )$ in two steps, which will be denoted by (a) and (b).

(a) By Lemma D.4 (iii), we have
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) = 2 \left\| J_n^{1/2} D \operatorname{Cov}\!\left( X_{1,k} + Z_{1,k,l} \right) D J_n^{1/2} \right\|_F^2 \le 4 \left\| J_n^{1/2} D \operatorname{Cov}(X_{1,k}) D J_n^{1/2} \right\|_F^2 + 16 \left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,l}) D J_n^{1/2} \right\|_F^2. \quad \text{(A.20)}
\]
Firstly,
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,1}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( \operatorname{Cov}(Z_{1,k,1}) \right) \operatorname{tr}\!\left( J_n^2 \right) \le n^{-1/2} \phi_n^2,
\]
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,2}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( D J_n D \right) \operatorname{tr}\!\left( \operatorname{Cov}(Z_{1,k,2})^2 \right) \le \lambda_1^2\!\left( D J_n D \right) \phi_n^4\, \| L \|_F^2 \le 4 n\, \phi_n^4,
\]
\[
\left\| J_n^{1/2} D \operatorname{Cov}(X_{1,k}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( \operatorname{Cov}(X_{1,k}) \right) \operatorname{tr}\!\left( J_n^2 \right) \le 4 \sigma_{\max}^4\, n^{-1/2}.
\]
Therefore,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) = O\!\left( n^{-1/2} \right).
\]
(b) Next, we see with the same arguments as in (A.20) that
\[
\operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right) \le \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) D J_n^{1/2} \right\|_F^2 \le 2 \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}(X_{2,k}) D J_n^{1/2} \right\|_F^2 + 8 \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}(Z_{2,k}) D J_n^{1/2} \right\|_F^2.
\]
We obtain
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{2,k}) D J_n^{1/2} \right\|_F^2 \le \bar\phi_n^2\, \operatorname{tr}\!\left( J_n^2 \right) = 4 \bar\phi_n^2\, n^{-1/2}.
\]
From (A.11) it now follows that
\[
\left\| J_n^{1/2} D \operatorname{Cov}(X_{2,k}) D J_n^{1/2} \right\|_F^2 \le 2 \left\| J_n^{1/2} D T_k^2 K D J_n^{1/2} \right\|_F^2 + \tfrac34 \left\| J_n^{1/2} D \tilde T_k D J_n^{1/2} \right\|_F^2.
\]
Let $I_{n-1}$ be the $(n-1)\times(n-1)$ identity matrix. Since $J_n \Lambda \le \lambda_{[n^{1/2}]}\, n^{1/2}\, I_{n-1}$ on the frequencies selected by $J_n$, we have
\[
\left\| J_n^{1/2} D T_k^2 K D J_n^{1/2} \right\|_F^2 = \operatorname{tr}\!\left( J_n^{1/2} D T_k^2 \Lambda J_n \Lambda T_k^2 D J_n^{1/2} \right) \le \lambda_{[n^{1/2}]}\, n^{1/2}\, \operatorname{tr}\!\left( J_n^{1/2} D T_k^2 \Lambda T_k^2 D J_n^{1/2} \right) \le \lambda_{[n^{1/2}]}^2\, n^{1/2}\, \tau_{\max}^4\, \operatorname{tr}(J_n).
\]
Also
\[
\left\| J_n^{1/2} D \tilde T_k D J_n^{1/2} \right\|_F^2 \le \lambda_1^2(J_n)\, \| \tilde T_k \|_F^2 \le n\, \bar\phi_n^4,
\]
and therefore
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right) = O\!\left( n^{-1/2} + n\, \bar\phi_n^4 \right).
\]
Combining (a) and (b) gives
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) = O\!\left( n^{-1/2} + n\, \bar\phi_n^4 \right),
\]
so (A.7) and (A.9) follow with Lemma C.1 and Proposition C.1. $\square$

Proof of Theorem 2. We decompose
\[
\operatorname{MISE}\!\left( \hat\sigma_N^2 \right) = \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma_N^2(t) \right) dt + \int_0^1 \operatorname{Var}\!\left( \hat\sigma_N^2(t) \right) dt.
\]
We have that $\sigma^2(t) = \sum_{i=0}^\infty \langle \psi_i, \sigma^2 \rangle\, \psi_i(t)$, where $\langle \cdot, \cdot \rangle$ denotes the standard scalar product on $L^2[0,1]$. Define $\eta_{k,n,l} := E(\hat s_{k,1,l}) - s_{k,0}$. Then for $i \ge 1$,
\[
E\!\left( 2\hat s_{i,1,l} - \hat s_{0,1,l} \right) = 2^{1/2} \langle \psi_i, \sigma^2 \rangle + 2\eta_{i,n,l} - \eta_{0,n,l}.
\]
Hence
\[
\int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = \eta_{0,n,l}^2 + \frac12 \sum_{i=1}^N \left( 2\eta_{i,n,l} - \eta_{0,n,l} \right)^2 + \sum_{i=N+1}^\infty \langle \psi_i, \sigma^2 \rangle^2
\]
and we obtain
\[
\eta_{0,n,l}^2 + \frac12 \sum_{i=1}^N \left( 2\eta_{i,n,l} - \eta_{0,n,l} \right)^2 \le (8N+1) \max_{i=0,\dots,N} \eta_{i,n,l}^2.
\]
Because $\sigma \in \Theta_s(\alpha,Q)$ it holds that
\[
\sum_{i=N+1}^\infty \langle \psi_i, \sigma^2 \rangle^2 \le (N+1)^{-2\alpha} \sum_{i=1}^\infty i^{2\alpha} \langle \psi_i, \sigma^2 \rangle^2 \le Q\, (N+1)^{-2\alpha}. \quad \text{(A.21)}
\]
Therefore,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{i=0,\dots,N} \eta_{i,n,l}^2 + N^{-2\alpha} \right).
\]
Assume model (1.1); then by Lemma A.1,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N n^{-1/2} + N n^{-2\beta} + N n^{1-2\alpha} + N^{-2\alpha} \right),
\]
and for model (1.2),
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N n^{-2\beta} + N n^{1-2\alpha} + N n^{-1/2} + N^{-2\alpha} \right).
\]
For the variance,
\[
\int_0^1 \operatorname{Var}\!\left( \hat\sigma^2(t) \right) dt = \operatorname{Var}(\hat s_{0,1,l}) + \frac12 \sum_{i=1}^N \operatorname{Var}\!\left( 2\hat s_{i,1,l} - \hat s_{0,1,l} \right) \le (4N+1) \operatorname{Var}(\hat s_{0,1,l}) + 4 \sum_{i=1}^N \operatorname{Var}(\hat s_{i,1,l}).
\]
Hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Var}\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \operatorname{Var}(\hat s_{k,1,l}) \right).
\]
Using Lemma A.1 yields the result. $\square$
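The bound just derived has, for each truncation index $N$, the generic shape $C_1 N \rho_n + C_2 N^{-2\alpha}$, where $\rho_n$ collects the per-coefficient rates from Lemma A.1; the optimal $N$ balances the two terms. A minimal numerical sketch of this balancing (the values of `rho` and `alpha` are illustrative assumptions, not values from the paper):

```python
import numpy as np

alpha, rho = 1.0, 1e-4                       # smoothness and per-coefficient error level (illustrative)
N = np.arange(1, 2001)
bound = N * rho + N ** (-2.0 * alpha)        # shape of the MISE bound: O(N * rho_n + N^{-2 alpha})
N_star = int(N[np.argmin(bound)])            # numerically optimal truncation index
N_balance = rho ** (-1.0 / (2 * alpha + 1))  # from balancing N * rho = N^{-2 alpha}
```

The exact minimizer is $(2\alpha/\rho)^{1/(2\alpha+1)}$, a constant multiple of the balancing value; with $\rho_n = n^{-1/2}$, say, this gives an MISE bound of order $n^{-\alpha/(2\alpha+1)}$, reflecting the additional degree of ill-posedness of $1/2$ mentioned in the introduction.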
Appendix B Sobolev s-ellipsoids
In this appendix we briefly discuss the function space introduced in Section 3 and provide a theorem needed for the lower bound. First recall the classical definition of Sobolev ellipsoids (cf. Proposition 1.14 in [25]).
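Before recalling the definitions, note that the identity behind the equivalence proved in Theorem B.1 below, $\int_0^1 (f^{(\alpha)}(x))^2\,dx = 2\pi^{2\alpha} \sum_k k^{2\alpha} \theta_k^2$ for the cosine coefficients $\theta_k = \int_0^1 f(x)\cos(k\pi x)\,dx$, can be checked numerically. A sketch for $\alpha = 1$ and $f(x) = \cos(\pi x)$ (a deliberately simple choice: $\theta_1 = 1/2$ and all other $\theta_k$ vanish):

```python
import numpy as np

# f(x) = cos(pi*x): cosine coefficient theta_1 = 1/2, theta_k = 0 otherwise; take alpha = 1.
x = (np.arange(200_000) + 0.5) / 200_000
lhs = float(np.mean((np.pi * np.sin(np.pi * x)) ** 2))   # int_0^1 (f'(x))^2 dx = pi^2/2
theta1 = float(np.mean(np.cos(np.pi * x) ** 2))          # theta_1 = int_0^1 f(x) cos(pi*x) dx = 1/2
rhs = 2 * np.pi ** 2 * theta1 ** 2                       # 2*pi^{2 alpha} * sum_k k^{2 alpha} theta_k^2
```

Both sides evaluate to $\pi^2/2$, consistent with the constant $\bar C = 2\pi^{2\alpha} C$ in Theorem B.1.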
Definition B.1.
Define
\[
a_{j,\alpha} = \begin{cases} j^\alpha, & \text{for even } j, \\ (j-1)^\alpha, & \text{for odd } j. \end{cases}
\]
Let $\{\varphi_j\}_{j=1}^\infty$, $\varphi_1(x) := 1$, $\varphi_{2j}(x) := \sqrt 2 \cos(2\pi j x)$, $\varphi_{2j+1}(x) := \sqrt 2 \sin(2\pi j x)$, denote the trigonometric basis on $[0,1]$. Then we call the function space
\[
\Theta(\alpha, C) := \left\{ f \in L^2[0,1] : \exists\, (\theta_j)_{j=1}^\infty \text{ s.t. } f(x) = \sum_{j=1}^\infty \theta_j \varphi_j(x),\ \sum_{j=1}^\infty a_{j,\alpha}^2 \theta_j^2 \le C \right\}
\]
a Sobolev ellipsoid.

Interesting characterizations arise if we put Sobolev $s$-ellipsoids into relation with Sobolev ellipsoids:

Remark B.1.
Let $S$ be the class of all symmetric functions $f \in L^2[0,1]$, i.e. such that $f(x) = f(1-x)$ for all $x \in [0,1]$. Further let $\Theta(\alpha,C)$ be a Sobolev ellipsoid. Then a function belongs to $\Theta_s(\alpha,C) \cap S$ if and only if it belongs to $\Theta(\alpha,C) \cap S$.

Define
\[
W(\alpha, \bar C) := W(\alpha, \bar C, [0,1]) := \left\{ f \in L^2[0,1] : f^{(l)}(0) = f^{(l)}(1) = 0 \text{ for } l \text{ odd},\ l < \alpha,\ \int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx \le \bar C \right\}. \quad \text{(B.1)}
\]
For positive integer values of $\alpha$, we have the following equivalence.

Theorem B.1.
Assume $\alpha \in \{1, 2, \dots\}$, $C > 0$. Let $\bar C = 2\pi^{2\alpha} C$. Then a function is in $W(\alpha, \bar C)$ if and only if it is in $\Theta_s(\alpha, C)$.

Proof. First we show that if a function $f \in W(\alpha,\bar C)$ then also $f \in \Theta_s(\alpha, C)$. Let $\tilde f$ be defined on $[-1,1]$ by
\[
\tilde f(x) := \begin{cases} f(x), & x \in [0,1], \\ f(-x), & x \in [-1,0). \end{cases}
\]
Note that $\tilde f$ is an $\alpha$-times differentiable function, $\tilde f^{(l)}$ is even if $l$ is even and $\tilde f^{(l)}$ is odd if $l$ is odd. Let
\[
s_k(j) := \begin{cases} \int_{-1}^1 \tilde f^{(j)}(x)\, dx, & k = 0, \\ \int_{-1}^1 \tilde f^{(j)}(x) \cos(k\pi x)\, dx, & k \ge 1,\ j \text{ even}, \\ \int_{-1}^1 \tilde f^{(j)}(x) \sin(k\pi x)\, dx, & k \ge 1,\ j \text{ odd}. \end{cases}
\]
It holds for $j \ge 1$ that
\[
s_0(j) = \int_{-1}^1 \tilde f^{(j)}(x)\, dx = \tilde f^{(j-1)}(1) - \tilde f^{(j-1)}(-1) = 0.
\]
Hence we have the Parseval-type equality
\[
\int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx = \frac12 \sum_{k=1}^\infty s_k(\alpha)^2. \quad \text{(B.2)}
\]
Further, for $k \ge 1$ and $j$ even, it follows by partial integration that
\[
s_k(j) = \int_{-1}^1 \tilde f^{(j)}(x) \cos(k\pi x)\, dx = \tilde f^{(j-1)}(x) \cos(k\pi x) \Big|_{-1}^1 + k\pi \int_{-1}^1 \tilde f^{(j-1)}(x) \sin(k\pi x)\, dx = k\pi\, s_k(j-1),
\]
and for $k \ge 1$, $j$ odd,
\[
s_k(j) = \int_{-1}^1 \tilde f^{(j)}(x) \sin(k\pi x)\, dx = \tilde f^{(j-1)}(x) \sin(k\pi x) \Big|_{-1}^1 - k\pi \int_{-1}^1 \tilde f^{(j-1)}(x) \cos(k\pi x)\, dx = -k\pi\, s_k(j-1).
\]
With $\theta_k = \int_0^1 f(x) \cos(k\pi x)\, dx$ it follows for $k \ge 1$ that $s_k(\alpha)^2 = k^{2\alpha} \pi^{2\alpha} s_k(0)^2 = 4 k^{2\alpha} \pi^{2\alpha} \theta_k^2$. Combining this result with (B.2) yields
\[
\int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx = 2\pi^{2\alpha} \sum_{k=1}^\infty k^{2\alpha} \theta_k^2
\]
and hence proves the first part of the theorem. The other direction follows in a straightforward way by differentiation and is thus omitted. $\square$

Supplementary Material
Supplement: Proofs for the upper bound of $\hat\tau_N$ and further technicalities

Acknowledgments
We are grateful to T. Tony Cai, Marc Hoffmann, Mark Podolskij and Ingo Witt for helpfulcomments and discussions.
References

[1] F. Bandi and J. Russell. Microstructure noise, realized variance, and optimal sampling. Rev. Econom. Stud., 75:339–369, 2008.
[2] O. Barndorff-Nielsen, P. Hansen, A. Lunde, and N. Shephard. Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise. Econometrica, 76(6):1481–1536, 2008.
[3] E. Barucci, P. Malliavin, and M. E. Mancino. Harmonic analysis methods for nonparametric estimation of volatility: theory and applications. In Stochastic Processes and Applications to Mathematical Finance. Proceedings of the 5th Ritsumeikan International Symposium, Kyoto, Japan, March 3–6, 2005, pages 1–34, 2006.
[4] N. Bissantz, T. Hohage, A. Munk, and F. Ruymgaart. Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM J. Numer. Anal., 45:2610–2636, 2007.
[5] V. Britanak, P. C. Yip, and K. R. Rao. Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations. Academic Press, 2006.
[6] C. Butucea and A. B. Tsybakov. Sharp optimality for density deconvolution with dominating bias, I. Theory Probab. Appl., 52(1):111–128, 2007.
[7] C. Butucea and A. B. Tsybakov. Sharp optimality for density deconvolution with dominating bias, II. Theory Probab. Appl., 52(2):336–349, 2007.
[8] T. Cai, A. Munk, and J. Schmidt-Hieber. Sharp minimax estimation of the variance of Brownian motion corrupted with Gaussian noise. Statist. Sinica, 2009. Forthcoming.
[9] A. Delaigle and I. Gijbels. Practical bandwidth selection in deconvolution kernel density estimation. Comput. Stat. Data Anal., 45(2):249–267, 2004.
[10] A. K. Dey, B. A. Mair, and F. H. Ruymgaart. Cross-validation for parameter selection in inverse estimation problems. Scand. J. Statist., 23:609–620, 1996.
[11] J. Fan. On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist., 19(3):1257–1272, 1991.
[12] J. Fan, J. Jiang, C. Zhang, and Z. Zhou. Time-dependent diffusion models for term structure dynamics. Statist. Sinica, 13:965–992, 2003.
[13] A. Gloter and J. Jacod. Diffusions with measurement errors. I. Local asymptotic normality. ESAIM Probab. Stat., 5:225–242, 2001.
[14] A. Gloter and J. Jacod. Diffusions with measurement errors. II. Optimal estimators. ESAIM Probab. Stat., 5:243–260, 2001.
[15] J. Jacod, Y. Li, P. A. Mykland, M. Podolskij, and M. Vetter. Microstructure noise in the continuous case: the pre-averaging approach. Stochastic Process. Appl., 119(7):2249–2276, 2009.
[16] A. Madhavan. Market microstructure: a survey. Journal of Financial Markets, 3:205–258, 2000.
[17] A. Munk and J. Schmidt-Hieber. Lower bounds for volatility estimation in microstructure noise models. arXiv:1002.3045, Math arXiv Preprint.
[18] A. Neubauer. The convergence of a new heuristic parameter selection criterion for general regularization methods. Inverse Problems, 24:055005, 2008.
[19] S. Pereverzev and E. Schock. On the adaptive selection of the parameter in regularization of ill-posed problems. SIAM J. Numer. Anal., 43(5):2060–2076, 2005.
[20] M. Podolskij and M. Vetter. Estimation of volatility functionals in the simultaneous presence of microstructure noise and jumps. Bernoulli, 2009. Forthcoming.
[21] M. Reiß. Asymptotic equivalence and sufficiency for volatility estimation under microstructure noise. arXiv:1001.3006, Math arXiv Preprint.
[22] M. Rosenbaum. Integrated volatility and round-off error. Bernoulli, 2009. Forthcoming.
[23] M. Steele. Stochastic Calculus and Financial Applications. Springer, New York, 2001.
[24] M. Stein. Minimum norm quadratic estimation of spatial variograms. J. Amer. Statist. Assoc., 82(399):765–772, 1987.
[25] A. B. Tsybakov. Introduction to Nonparametric Estimation (Springer Series in Statistics). Springer-Verlag, New York, 2009.
[26] L. Zhang. Efficient estimation of stochastic volatility using noisy observations: a multi-scale approach. Bernoulli, 12:1019–1043, 2006.
[27] L. Zhang, P. Mykland, and Y. Aït-Sahalia. A tale of two time scales: determining integrated volatility with noisy high-frequency data.
J. Amer. Statist. Assoc. ,472:1394–1411, 2005. 25igure 1: Plots 1 and 2 display paths of sBM and tBM corresponding to σ ( t ) = (1 / − t ) + (Plot 3). Analogously, Plots 4 and 5 show paths of sBM and tBM with σ ( t ) = 1+ I ( 1 / , ( t )(Plot 6). For Plots 1 and 2 as well as Plots 4 and 5 we took the same realization ( W t ) t ∈ [0 , of the underlying Brownian motion. The first two plots show the different scaling behavior:sBM= 0 and tBM= (cid:82) / σ ( s ) dW s for t > /
2. On the other hand we see by Plots 4 and 5that a jump induces a random shift, i.e. sBM=tBM for t ≤ / W / =sBM for t > / n = 25000 data points from model (1.1), (cid:15) i,n ∼ N (0 , τ = 0 . σ ( t ) =(2 + cos (2 πt )) / . Plot 1 shows the data. Additionally to the data, we plotted the pathof the tBM in Plot 2. The reconstruction of τ and σ (dashed lines) as well as the truefunction (solid lines) are given in Plot 3 and 4, respectively. The threshold parameters wereselected as N ∗ = 1 for estimation of τ and N ∗ = 3 for estimation of σ . τ and ˆ σ is quite robust to heavy-tailednoise. The threshold parameters N ∗ were selected as 1 and 3 for estimation of τ and σ ,respectively. 28igure 4: (Low-smoothness) As Figure 2 but we chose σ ( t ) = 3 (cid:12)(cid:12)(cid:12) ˜ W t (cid:12)(cid:12)(cid:12) , where (cid:16) ˜ W t (cid:17) t ∈ [0 , denotes a Brownian motion independent of the noise and the Brownian motion in (1.1).The estimator returns a smoothed version of the path. The threshold parameters N ∗ wereselected as 1 and 17 for estimation of τ and σ , respectively.29igure 5: (Jump function) As Figure 2 but we chose σ ( t ) = 1 + I ( 1 / , ( t ) . The Gibbsphenomenon is clearly visible. The threshold parameters N ∗ were selected as 1 and 10 forestimation of τ and σ , respectively. 30 upplementary Material: Nonparametric Estimation of theVolatility Function in a High-Frequency Model corrupted byNoise Axel Munk ∗ and Johannes Schmidt-Hieber Institut f¨ur Mathematische Stochastik, Universit¨at G¨ottingen,Goldschmidtstr. 7, 37077 G¨ottingen
Email: [email protected], [email protected]

Abstract
This note provides proofs and supplementary technicalities for the paper "Nonparametric Estimation of the Volatility Function in a High-Frequency Model Corrupted by Noise". In particular, a proof of the rate of convergence of the estimator $\hat\tau_N$ is given.
Primary 62M09, 62M10; secondary 62G08.
Keywords:
Brownian motion; Variance estimation; Minimax rate; Microstructure noise;Sobolev Embedding.
Appendix C Convergence Rate of $\hat\tau$

C.1 Preliminary Results and Notation
First we recall some notation. Let $\sigma_k(i/n) := \sigma(i/n) f_k(i/n)$ and $\tau_k(i/n) := \tau(i/n) f_k(i/n)$. Throughout the following, the constants of the Sobolev $s$-ellipsoids in Definition 1 are $l = \sigma_{\min}$, $u = \sigma_{\max}$ for $\sigma$ and $l = \tau_{\min}$, $u = \tau_{\max}$ for $\tau$. We define $K_n := n^{1/2}/\log n$ and
\[
\phi_n := \sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{i=1,\dots,n-1} \sup_{\xi \in [i/n,(i+1)/n]} \left| \sigma(\xi) - \sigma\!\left(\tfrac in\right) \right|, \qquad
\bar\phi_{n,1/2} := \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \max_{i=1,\dots,n} \sup_{\xi_i \in [(i-1)/n,\, i/n]} \left| \tau_k(\xi_i) - \tau_k\!\left(\tfrac in\right) \right|.
\]

Proposition C.1.
Assume $\alpha, \beta > 1/2$. It holds for any $\delta > 0$ that
\[
\phi_n = O\!\left( n^{1/2-\alpha} + n^{\delta - 1} \right), \qquad \bar\phi_n = O\!\left( n^{1/2-\beta} + n^{-1/2} \right), \qquad \bar\phi_{n,1/2} = O\!\left( n^{1/2-\beta} + n^{-1/2} \log^{-1} n \right).
\]

Proof. We only prove the third equality; the other two can be deduced similarly. Note that for $\tau \in \Theta_b(\beta, Q)$,
\[
\left| \tau_k(\xi_i) - \tau_k\!\left(\tfrac in\right) \right| \le \sqrt 2 \left| \tau(\xi_i) - \tau\!\left(\tfrac in\right) \right| + \tau_{\max}\, \sqrt 2\, k\pi/(2n) \le \frac{\sqrt 2}{\tau_{\min}} \left| \tau^2(\xi_i) - \tau^2\!\left(\tfrac in\right) \right| + \tau_{\max}\, \sqrt 2\, k\pi/(2n).
\]
Taking suprema (note that $k \le K_n$ implies $k\pi/(2n) \le \pi n^{-1/2} \log^{-1} n / 2$) and applying Lemma D.8 gives the result. $\square$
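The $k\pi/(2n)$ term above comes from a Lipschitz bound on the oscillating factor $f_k$. Assuming the cosine form $f_k(t) = \sqrt 2 \cos(k\pi t)$ (the basis functions are not restated in this chunk, so this form is an assumption of the sketch), one has $|f_k'| \le \sqrt 2\, k\pi$, and the grid modulus can be checked numerically:

```python
import numpy as np

n, k = 1000, 30
t = np.arange(n) / n                                  # grid points i/n
xi = t + 0.5 / n                                      # points inside the cells [i/n, (i+1)/n]
f_k = lambda x: np.sqrt(2) * np.cos(k * np.pi * x)    # assumed form of the basis function
modulus = float(np.max(np.abs(f_k(xi) - f_k(t))))     # grid modulus of continuity of f_k
lipschitz = np.sqrt(2) * k * np.pi * 0.5 / n          # |f_k'| <= sqrt(2)*k*pi, step size 1/(2n)
```

With $k \le K_n = n^{1/2}/\log n$ this Lipschitz contribution is of order $n^{-1/2}\log^{-1} n$, matching the third rate in Proposition C.1.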
C.2 Proofs for Estimation of $\tau^2$

Lemma C.1.
Let $\hat t_{k,1}$ be defined as in (4.1). Further assume $\alpha, \beta > 1/2$, $Q, \bar Q > 0$, $0 < \sigma_{\min} \le \sigma_{\max} < \infty$, $0 < \tau_{\min} \le \tau_{\max} < \infty$ and $k = k_n \in \mathbb N$. Assume model (1.1). Then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = O\!\left( n^{1/2-\beta} \log^{1/2} n \right) + o\!\left( n^{-1/2} \right), \quad \text{(C.1)}
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \hat t_{k,1} \right) = O\!\left( n^{-1} \right). \quad \text{(C.2)}
\]
If further $\epsilon$ is $n$-variate standard normal, then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( n^{-1} \log^{-1} n \right), \quad \text{(C.3)}
\]
\[
n^{1/2} \left( \hat t_{k,1} - t_{k,0} \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right) \qquad \text{for } \beta > 1,\ k \le K_n. \quad \text{(C.4)}
\]
Assume model (1.2). Then it holds that
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = O\!\left( n^{1/2-\beta} \log^{1/2} n + n^{1/2-\alpha} \log n \right) + o\!\left( n^{-1/2} \right), \quad \text{(C.5)}
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \hat t_{k,1} \right) = O\!\left( n^{1-2\alpha} \log^2 n + n^{-1} \right). \quad \text{(C.6)}
\]
If further $\epsilon$ is $n$-variate standard normal, then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( n^{1-2\alpha} \log^2 n + n^{-1} \log^{-1} n \right), \quad \text{(C.7)}
\]
\[
n^{1/2} \left( \hat t_{k,1} - t_{k,0} \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right) \qquad \text{for } \alpha > 3/4,\ \beta > 1,\ k \le K_n. \quad \text{(C.8)}
\]

Proof. Again we work with the generalized estimators as introduced in Section A.1.
As in the proof of Lemma A.1 we introduce, for two centered random vectors $P$ and $Q$, the semi-inner product $\langle P, Q \rangle_\tau := E\!\left( P^t D J_n^\tau D^t Q \right)$ and obtain
\[
E\,\hat t_{k,1,l} = \langle X_{1,k}, X_{1,k} \rangle_\tau + \langle X_{2,k}, X_{2,k} \rangle_\tau + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau + \langle Z_{2,k}, Z_{2,k} \rangle_\tau + 2 \langle X_{1,k}, Z_{1,k,l} \rangle_\tau + 2 \langle X_{2,k}, Z_{2,k} \rangle_\tau. \quad \text{(C.9)}
\]
First we bound $\langle X_{2,k}, X_{2,k} \rangle_\tau$, which will turn out to be the leading term. Similarly to (A.13) we have
\[
\langle X_{2,k}, X_{2,k} \rangle_\tau = \operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) + \tfrac12 \operatorname{tr}\!\left( J_n^\tau D^t \tilde T_k D \right),
\]
and due to
\[
\operatorname{tr}\!\left( J_n^\tau \right) = O(\log n) \quad \text{(C.10)}
\]
the same argument as for (A.15) gives
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{tr}\!\left( J_n^\tau D \tilde T_k D \right) = O\!\left( \bar\phi_{n,1/2}^2 \log n \right).
\]
Hence this is a negligible term. Using Lemma D.1 (iii),
\[
\operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) = (n - n/\log n)^{-1} \sum_{i=[n/\log n]}^{n-1} \left( A(\tau_k^2, 0) - A(\tau_k^2, 2i) \right) = \bar r_n A(\tau_k^2, 0) - (n - n/\log n)^{-1} \sum_{i=[n/\log n]}^{n-1} A(\tau_k^2, 2i),
\]
where $\bar r_n = (n - [n/\log n])/(n - n/\log n)$. Note $|1 - \bar r_n| \le 1/(n - n/\log n)$. By Lemma D.2,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) - t_{k,0} \right| \le \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left( |1 - \bar r_n|\, |t_{k,0}| + \sum_{m=n}^\infty |t_{k,m}| + 2 (n - n/\log n)^{-1} \sum_{i=0}^\infty |t_{k,i}| \right) \le C_{\beta,Q}\, n^{1/2-\beta} + 6 (n - n/\log n)^{-1} \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \sum_{i=0}^\infty |t_{0,i}| = O\!\left( n^{-1} + n^{1/2-\beta} \right).
\]
This shows that
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \langle X_{2,k}, X_{2,k} \rangle_\tau - t_{k,0} \right| = O\!\left( n^{-1} + \bar\phi_{n,1/2}^2 \log n + n^{1/2-\beta} \right).
\]
Next,
\[
\langle X_{1,k}, X_{1,k} \rangle_\tau = \frac1n \operatorname{tr}\!\left( D J_n^\tau D^t \Sigma_k^2 \right) \le \frac{2 \sigma_{\max}^2}{n} \operatorname{tr}\!\left( J_n^\tau \right),
\]
implying $\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle X_{1,k}, X_{1,k} \rangle_\tau = O(n^{-1} \log n)$. We obtain with Lemma D.6, in the same way as in (A.17), (A.18) and (A.19),
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle Z_{1,k,1}, Z_{1,k,1} \rangle_\tau = O\!\left( n^{-1} \log n\, \phi_n^2 \right), \qquad
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle Z_{1,k,2}, Z_{1,k,2} \rangle_\tau = O\!\left( \log n\, \phi_n^2 \right),
\]
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \langle Z_{2,k}, Z_{2,k} \rangle_\tau = O\!\left( \bar\phi_{n,1/2}^2 \log n \right).
\]
From the Cauchy–Schwarz inequality it follows that
\[
\left| \langle X_{1,k}, Z_{1,k,l} \rangle_\tau \right| \le \langle X_{1,k}, X_{1,k} \rangle_\tau^{1/2}\, \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau^{1/2} \le \langle X_{1,k}, X_{1,k} \rangle_\tau + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau, \qquad
\left| \langle X_{2,k}, Z_{2,k} \rangle_\tau \right| \le \langle X_{2,k}, X_{2,k} \rangle_\tau^{1/2}\, \langle Z_{2,k}, Z_{2,k} \rangle_\tau^{1/2}.
\]
This yields
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = \begin{cases} O\!\left( n^{-1} \log n + n^{1/2-\beta} + \bar\phi_{n,1/2} \log^{1/2} n \right), & l = 1, \\ O\!\left( n^{-1} \log n + n^{1/2-\beta} + \bar\phi_{n,1/2} \log^{1/2} n + \phi_n \log n \right), & l = 2, \end{cases}
\]
and therefore (C.5) and (C.1) hold by Proposition C.1. In order to calculate the variance we use the decomposition (A.3). We have
\[
\hat t_{k,1,l} = \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon.
\]
Using the Cauchy–Schwarz inequality repeatedly, we can write
\[
\left| \operatorname{Var}\!\left( \hat t_{k,1,l} \right) - \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right| \quad \text{(C.11)}
\]
\[
\le \left( \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) + 2 \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right)^2 + 2 \left( \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) + 2 \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right) \cdot \operatorname{Var}^{1/2}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right).
\]
We subdivide the remaining part of the proofs of (C.6) and (C.2) into three steps (a), (b) and (c), where we bound $\operatorname{Var}( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon )$, $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi )$ and $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon )$, respectively.

(a) Let $\operatorname{TrSq}(A) := \sum_{i=1}^n A_{i,i}^2$ for $A \in M_n$. Then by Lemma D.5 it follows that
\[
\operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) = 2 \left\| C_{2,k}^t D J_n^\tau D C_{2,k} \right\|_F^2 + \operatorname{Cum}_4(\epsilon)\, \operatorname{TrSq}\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \le \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| (J_n^\tau)^{1/2} D \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) D (J_n^\tau)^{1/2} \right\|_F^2,
\]
where equality holds if $\operatorname{Cum}_4(\epsilon) = 0$. By Proposition D.2 we see that
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( \operatorname{Cum}_4(\epsilon)\, n^{-1} + n^{-1} \log^{-1} n \right).
\]
(b) In this part of the proof we bound $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi )$. Similarly to part (a) in the proof of Lemma A.1 it holds that
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) \le 4 \lambda_1^2\!\left( J_n^\tau \right) \left( \left\| \operatorname{Cov}(X_{1,k}) \right\|_F^2 + \left\| \operatorname{Cov}(Z_{1,k,l}) \right\|_F^2 \right) \le \begin{cases} n^{-2} \log^2 n\, \left( n^{-1} \sigma_{\max}^4 + 4 n^{-1} \phi_n^2 \right), & l = 1, \\ n^{-2} \log^2 n\, \left( n^{-1} \sigma_{\max}^4 + 4 n\, \phi_n^2 \right), & l = 2, \end{cases}
\]
where we used Lemma D.6 in the second inequality. Hence we get
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) = \begin{cases} O\!\left( n^{-3} \log^2 n \right), & l = 1, \\ O\!\left( \log^2 n\, \left( n^{-1} \phi_n^2 + n^{-3} \right) \right), & l = 2. \end{cases} \quad \text{(C.12)}
\]
(c) Using Lemma D.5 (ii),
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \le \sqrt 2\, \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) \left\| C_{2,k}^t D J_n^\tau D C_{2,k} \right\|_F
\]
and hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) = \begin{cases} O\!\left( n^{-2} \log n \right), & l = 1, \\ O\!\left( n^{-1/2} \log n\, \left( \phi_n + n^{-1/2} \right) \right), & l = 2. \end{cases}
\]
Combining (a)–(c) yields
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( \frac{\operatorname{Cum}_4(\epsilon)}{n} + \frac{1}{n \log n} \right) + \begin{cases} 0, & l = 1, \\ O\!\left( \phi_n^2 \log^2 n + \phi_n\, n^{-1/2} \log n \right), & l = 2, \end{cases} \quad \text{(C.13)}
\]
and hence (C.2), (C.3), (C.6) and (C.7) follow using Proposition C.1.

Finally, we show the asymptotic normality (C.8) and (C.4). Because of the decomposition (A.3), we have
\[
\hat t_{k,1,l} = \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon.
\]
As proved above,
\[
n^{1/2} \left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \xrightarrow{\ P\ } 0
\]
for $\beta > 1$, $\alpha > 1/2$ if $l = 1$, and for $\beta > 1$, $\alpha > 3/4$ if $l = 2$. Hence by Slutsky's lemma it suffices to show that
\[
n^{1/2} \left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon - E\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right).
\]
In order to apply Theorem D.1, it remains to show $n^{1/2} \lambda_1\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \to 0$. Using Corollary D.1, we see that
\[
n^{1/2} \lambda_1\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \le n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) \right) \le 2 n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}(X_{2,k}) \right) + 8 n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}(Z_{2,k}) \right) \le 2 n^{-1/2} \log n\, \sup_{t \in [0,1]} \tau_k^2(t) \max_i \lambda_i + 8 n^{-1/2} \log n\, \bar\phi_{n,1/2} = o(1),
\]
which yields the last statement of the lemma. $\square$

Proof of Theorem 1.
The proof is close to the one of Theorem 2. We obtain
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\tau_N^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \left| E\!\left( \hat t_{k,1,l} \right) - t_{k,0} \right|^2 + N^{-2\beta} \right),
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Var}\!\left( \hat\tau_N^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \operatorname{Var}\!\left( \hat t_{k,1,l} \right) \right).
\]

Appendix D Technical Results

Proposition D.1.
Let $A \in M_{n-1}$. Then
\[
\operatorname{tr}\!\left( J_n D A D \right) \le \left( n^{1/2} + 5 n^{1/4} + 8 n^{1/2} (1 + \log n) \right) \max_{i,j} \left| (A)_{i,j} \right|.
\]

Proof.
Write $A = (a_{i,j})_{i,j=1,\dots,n-1}$. Note that
$$(DAD)_{i,j} = \frac{2}{n} \sum_{p,q=1}^{n-1} \sin\Big(\frac{ip\pi}{n}\Big) \sin\Big(\frac{qj\pi}{n}\Big) a_{p,q}.$$
For $i = j$ we have further
$$(DAD)_{i,i} = \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p-q)i\pi}{n}\Big) - \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p+q)i\pi}{n}\Big). \quad \text{(D.1)}$$
In order to bound the r.h.s. we need bounds for
$$\Big| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big( r \frac{i\pi}{n} \Big) \Big| \le \frac12 \big| \operatorname{Dir}_{[n/2]}(r\pi/n) - \operatorname{Dir}_{[n/4]}(r\pi/n) \big| \le \frac12 \Bigg| \frac{\sin\big(([n/2]+1/2)\, r\pi/n\big)}{\sin(r\pi/(2n))} \Bigg| + \frac12 \Bigg| \frac{\sin\big(([n/4]+1/2)\, r\pi/n\big)}{\sin(r\pi/(2n))} \Bigg| \le \frac{1}{\sin(r\pi/(2n))}.$$
Let $B_1 := \{1, \dots, n\}$ and $B_2 := \{n+1, \dots, 2n-2\}$. Then
$$\Big| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big( r \frac{i\pi}{n} \Big) \Big| \le \begin{cases} n/4 & \text{for } r = 0, \\ n/r & \text{for } r \in B_1, \\ n/(2n-r) & \text{for } r \in B_2. \end{cases}$$
Therefore, we can bound the first term of the r.h.s. of (D.1) by
$$\Bigg| \sum_{i=[n/4]+1}^{[n/2]} \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p-q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \sum_{p,q=1}^{n-1} |a_{p,q}| \Bigg| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big(\frac{(p-q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \max_{p,q=1,\dots,n-1} |a_{p,q}| \Bigg( (n-1)\frac{n}{4} + 2 \sum_{\substack{p,q=1 \\ q-p \in B_1}}^{n-1} \frac{n}{q-p} \Bigg)$$
and the second term by
$$\Bigg| \sum_{i=[n/4]+1}^{[n/2]} \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p+q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \max_{p,q=1,\dots,n-1} |a_{p,q}| \Bigg( \sum_{\substack{p,q=1 \\ p+q \in B_1}}^{n-1} \frac{n}{p+q} + \sum_{\substack{p,q=1 \\ p+q \in B_2}}^{n-1} \frac{n}{2n-(p+q)} \Bigg) \le 2 n \max_{p,q=1,\dots,n-1} |a_{p,q}|.$$
Due to
$$\sum_{\substack{p,q=1 \\ q-p \in B_1}}^{n-1} \frac{1}{q-p} \le n \sum_{r=1}^{n} \frac{1}{r} \le n (1 + \log n)$$
and
$$\operatorname{tr}(J_n D A D) = \frac{4}{\sqrt{n}} \sum_{i=[n/4]+1}^{[n/2]} (DAD)_{i,i} \le \big( n + 5 n^{1/2} + 8 n^{1/2} (1 + \log n) \big) \max_{p,q=1,\dots,n-1} |a_{p,q}|$$
we obtain the result.
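As a numerical sanity check of the cosine block-sum bounds above, the sketch below verifies the three cases directly. This snippet is an addition, not part of the original proof, and the summation range $([n/4], [n/2]]$ follows the reconstruction used here:

```python
import math

n = 1000
lo, hi = n // 4, n // 2  # block of indices ([n/4], [n/2]]

def block_sum(r):
    """Compute sum_{i=lo+1}^{hi} cos(r*i*pi/n)."""
    return sum(math.cos(r * i * math.pi / n) for i in range(lo + 1, hi + 1))

# r = 0: the sum is exactly the number of terms, hi - lo
assert abs(block_sum(0) - (hi - lo)) < 1e-9

# r in B1 = {1,...,n}: |sum| <= n/r;  r in B2 = {n+1,...,2n-2}: |sum| <= n/(2n-r)
for r in range(1, 2 * n - 1):
    bound = n / r if r <= n else n / (2 * n - r)
    assert abs(block_sum(r)) <= bound
```

The bounds hold because the Dirichlet-kernel estimate gives $1/\sin(r\pi/(2n)) \le n/r$ for $1 \le r \le n$, and the case $r \in B_2$ follows by the symmetry $\cos((2n-r)i\pi/n) = \cos(ri\pi/n)$.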
Proposition D.2. It holds
$$\sup_{\tau \in \Theta_s(\beta,Q)} \max_{k \le n^{1/2}} \Big| \big\| (J_n^\tau)^{1/2} D \operatorname{Cov}(X_{1,k} + Z_{1,k}) D (J_n^\tau)^{1/2} \big\|_F^2 - \frac{1}{n} \int_0^1 \tau_k^2(x)\, dx \Big| = O\big( n^{-1} \log^{-1} n \big).$$
Proof.
We obtain with (A.12) $\operatorname{Cov}(X_{1,k} + Z_{1,k}) = \frac12 T_k K + \frac12 K T_k + S_k$, where $S_k := \frac12 T_k^2 + \operatorname{Cov}(X_{1,k}, Z_{1,k}) + \operatorname{Cov}(Z_{1,k}, X_{1,k}) + \operatorname{Cov}(Z_{1,k})$. Application of the triangle inequality gives
$$\tfrac12 \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F - \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le \big\| (J_n^\tau)^{1/2} D \operatorname{Cov}(X_{1,k} + Z_{1,k}) D (J_n^\tau)^{1/2} \big\|_F \le \tfrac12 \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F + \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F. \quad \text{(D.2)}$$
Note that because of Lemma D.4 (ii) it holds
$$2 \operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) \le \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F^2 \le 4 \big\| (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big\|_F^2. \quad \text{(D.3)}$$
Now we will bound
$$\operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) = \operatorname{tr}\Big( \big[ (J_n^\tau \Lambda)^{1/2} D T_k D (\Lambda J_n^\tau)^{1/2} \big]^2 \Big) = \sum_{i=1}^{n-1} \lambda_i^2\big( D T_k D\, \Lambda J_n^\tau \big)$$
from below. We obtain with Lemma D.3 (iii)
$$\lambda_i\big( D T_k D\, \Lambda J_n^\tau \big) \ge \begin{cases} \lambda_{n-[n/\log n]}(\Lambda J_n^\tau)\, \lambda_{[n/\log n]+i}\big( D T_k D \big), & i \le n - [n/\log n], \\ 0, & i > n - [n/\log n]. \end{cases}$$
Denote by $\tau_{k,(i)}$ the $i$-th largest component of the vector $(\tau_k(1/n), \dots, \tau_k(1-1/n))$. Then
$$\operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) = \sum_{i=1}^{n-1} \lambda_i^2\big( D T_k D\, \Lambda J_n^\tau \big) \ge \sum_{i=1}^{n-[n/\log n]} \big( n - n/\log n \big)^{-2}\, \tau_{k,([n/\log n]+i)}^2 \ge \big( n - n/\log n \big)^{-2} \sum_{i=1}^{n-1} \tau_k^2\Big( \frac{i}{n} \Big) - \tau_{\max}^2\, \frac{n}{\log n}\, \big( n - n/\log n \big)^{-2}. \quad \text{(D.4)}$$
Next we will derive an upper bound for the r.h.s. of (D.3). Let, analogously to Definition (A.11), $\bar T_k$ be the tridiagonal matrix with entries
$$\big( \bar T_k \big)_{i,j} := \begin{cases} \big( \Delta_i \tau_k \big)^2 & \text{for } i = j - 1, \\ \big( \Delta_j \tau_k \big)^2 & \text{for } i = j + 1, \\ 0 & \text{otherwise.} \end{cases}$$
Note that $\max_i | \Delta_i \tau_k | \le \tau_{\max}\, \bar\varphi_{n,2}^{1/2}$. It is easy to check that $T_k K T_k = \tfrac12 T_k^2 K + \tfrac12 K T_k^2 + \tfrac12 \bar T_k$ holds. Clearly, $J_n^\tau \le (n - n/\log n)^{-1} \Lambda^{-1}$, and therefore we have for the upper bound in (D.3)
$$\big\| (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big\|_F^2 \le \big( n - n/\log n \big)^{-1} \big\| (J_n^\tau)^{1/2} D T_k K D \Lambda^{-1/2} \big\|_F^2 = \big( n - n/\log n \big)^{-1} \operatorname{tr}\big( (J_n^\tau)^{1/2} D T_k K T_k D (J_n^\tau)^{1/2} \big) \le \big( n - n/\log n \big)^{-2} \operatorname{tr}\big( T_k^2 \big) + 2 \big( n - n/\log n \big)^{-1} \max_{i,j=1,\dots,n-1} \big| \bar T_k \big|_{i,j}\, \operatorname{tr}\big( J_n^\tau \big).$$
This yields
$$\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \Big| \big\| \tfrac12 (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F^2 - \frac1n \int_0^1 \tau_k^2(x)\, dx \Big| = O\big( n^{-1} \bar\varphi_{n,2}^{1/2} + n^{-1} \log^2 n\, \bar\varphi_{n,2} + n^{-1} \log^{-1} n \big). \quad \text{(D.5)}$$
Now we will bound the remainder term in (D.2). Using Lemma D.6 gives
$$\big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le \lambda_1(J_n^\tau) \| S_k \|_F \le n^{-1} \log^2 n \Big( \big\| \tfrac12 T_k^2 \big\|_F + 4 \| \operatorname{Cov}(Z_{1,k}) \|_F + 8 \| \operatorname{Cov}(X_{1,k}, Z_{1,k}) \|_F \Big).$$
Because $\operatorname{Cov}(X_{1,k}, Z_{1,k})$ is tridiagonal it holds with Lemma D.4 (i)
$$\| \operatorname{Cov}(X_{1,k}, Z_{1,k}) \|_F^2 = \sum_{i,j=1}^{n-1} \big( \operatorname{Cov}(X_{1,k}, Z_{1,k})_{i,j} \big)^2 \le n\, \tau_{\max}^2\, \bar\varphi_{n,2}$$
and therefore
$$\big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le n^{-1} \log^2 n \Big( n^{1/2} \bar\varphi_{n,2}^{1/2} + 16\, n^{1/2} \bar\varphi_{n,2}^{1/2} + 64\, n^{1/2} \bar\varphi_{n,2}^{1/2}\, \tau_{\max} \Big).$$
This leads to
$$\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F = O\big( n^{-1/2} \log^2 n\, \bar\varphi_{n,2}^{1/2} \big),$$
which together with (D.2) and (D.5) completes the proof.

Lemma D.1.
Let $s_{k,p}$ and $t_{k,p}$ be as defined in (A.4). Then it holds
(i)
$$s_{k,p} = \frac12 s_{0,p} + \frac14 s_{0,p-k} + \frac14 s_{0,p+k}, \qquad t_{k,p} = \frac12 t_{0,p} + \frac14 t_{0,p-k} + \frac14 t_{0,p+k}.$$
(ii)
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = A\big( \sigma_k, p \big) - \frac{1}{2n} \big( (-1)^p \sigma_k(1) + \sigma_k(0) \big).$$
(iii) Let $\Sigma_k$ be as defined in (A.5). Then
$$\big( D \Sigma_k D \big)_{i,j} = A\big( \sigma_k, i-j \big) - A\big( \sigma_k, i+j \big).$$
Remark D.1. In (iii), for $|i-j| \ll i+j$, the r.h.s. behaves like $s_{k,i-j}$. In the same way we obtain the equivalent result if we replace $\sigma$ by $\tau$.
Proof. (ii) Note that we can write
$$\sigma_k\Big( \frac{r}{n} \Big) = s_{k,0} + 2 \sum_{q=1}^{\infty} s_{k,q} \cos( q\pi r/n )$$
and hence it holds
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = \frac1n s_{k,0} \sum_{r=1}^{n-1} \cos\Big( \frac{pr\pi}{n} \Big) + \frac2n \sum_{q=1}^{\infty} s_{k,q} \sum_{r=1}^{n-1} \cos\Big( \frac{q\pi r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big).$$
Let $I_{\{A\}}$ denote the indicator function of the set $A$. We have the identities
$$\sum_{r=1}^{n-1} \cos\Big( \frac{pr\pi}{n} \Big) = n\, I_{\{p \equiv 0 \bmod 2n\}} - \frac12 \big( 1 + (-1)^p \big)$$
and
$$2 \sum_{r=1}^{n-1} \cos\Big( \frac{q\pi r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = \sum_{r=1}^{n-1} \cos\Big( \frac{(q-p)\pi r}{n} \Big) + \sum_{r=1}^{n-1} \cos\Big( \frac{(q+p)\pi r}{n} \Big).$$
From this it follows
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = - \frac{1}{2n} \big( 1 + (-1)^p \big) s_{k,0} - \frac1n \sum_{q=1}^{\infty} s_{k,q} \big( (-1)^{q-p} + 1 \big) + A\big( \sigma_k, p \big),$$
which yields the result.
(iii) This follows by applying (ii) to
$$\big( D \Sigma_k D \big)_{i,j} = \frac2n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \sin\Big( \frac{ir\pi}{n} \Big) \sin\Big( \frac{rj\pi}{n} \Big) = \frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{(i-j)r\pi}{n} \Big) - \frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{(i+j)r\pi}{n} \Big).$$
The next lemma bounds the tails of the Fourier coefficients of $\sigma_k$ in Sobolev s-ellipsoids. In particular, the result shows that the Fourier series is absolutely summable.
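The first cosine-sum identity used in the proof above is elementary; the following snippet (added here for illustration, not part of the original text) confirms it numerically for several $n$ and all $p$ up to $4n$:

```python
import math

def cos_sum(n, p):
    """Left-hand side: sum_{r=1}^{n-1} cos(p*r*pi/n)."""
    return sum(math.cos(p * r * math.pi / n) for r in range(1, n))

def closed_form(n, p):
    """Right-hand side: n*1{p = 0 mod 2n} - (1 + (-1)^p)/2."""
    return n * (p % (2 * n) == 0) - 0.5 * (1 + (-1) ** p)

# check the identity over a few sizes and all frequencies up to 4n
for n in (5, 8, 13):
    for p in range(4 * n + 1):
        assert abs(cos_sum(n, p) - closed_form(n, p)) < 1e-8
```

The three cases (odd $p$, even $p$ not divisible by $2n$, and $p \equiv 0 \bmod 2n$) follow from summing the geometric series $\sum_r e^{ipr\pi/n}$.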
Lemma D.2. Let $s_{k,p}$ be as defined in (A.4). Assume $k \le c n^{\gamma}$, where $0 < c < 1$ is a constant, and either $\gamma > 0$, $\alpha > 1/2$, or $k = 0$, $\gamma = 0$ and $\alpha > 1/2$. Then it holds for $n$ large enough
$$\sup_{\sigma \in \Theta_s(\alpha,Q)} \sum_{m=[n^\gamma]}^{\infty} | s_{k,m} | \le C_{\gamma,\alpha,Q,c}\, n^{\gamma(1/2-\alpha)},$$
where $C_{\gamma,\alpha,Q,c}$ is independent of $n$.
Proof. Consider the case $\gamma > 0$, $\alpha > 1/2$. Using Lemma D.1 (i), we see that for $n$ large enough
$$\sum_{m=[n^\gamma]}^{\infty} | s_{k,m} | \le \sum_{m=[(1-c)n^\gamma]}^{\infty} | s_{0,m} | = \sum_{m=1}^{\infty} | s_{0,m} |\, I_{\{ m \ge [(1-c)n^\gamma] \}} \le \Big( \sum_{i=1}^{\infty} i^{2\alpha} s_{0,i}^2 \Big)^{1/2} \Big( \sum_{i=[(1-c)n^\gamma]}^{\infty} i^{-2\alpha} \Big)^{1/2} \le C_{\gamma,\alpha,Q,c}\, n^{\gamma(1/2-\alpha)},$$
where we used the definition of a Sobolev s-ellipsoid in the last step. If $k = 0$, $\gamma = 0$ and $\alpha > 1/2$, the claim follows in the same way.
Lemma D.3.
(i) Let $A \in M_n$ be symmetric. Then $A$ is positive semidefinite if and only if $A = B^t B$ for some $B \in M_n$.
(ii) Let $A$, $B$ be positive semidefinite matrices and denote by $\lambda_1(A)$ the largest eigenvalue of $A$. Then $\operatorname{tr}(AB) \le \lambda_1(A) \operatorname{tr}(B)$.
(iii) Let $A, B \in M_{n-1}$ be positive semidefinite. Then
$$\lambda_{r+s+1}(AB) \le \lambda_{r+1}(A)\, \lambda_{s+1}(B), \quad 0 \le r+s \le n-2,$$
$$\lambda_{n-r-s+1}(AB) \ge \lambda_{n-r}(A)\, \lambda_{n-s}(B), \quad 2 \le r+s \le n.$$
(iv) Let $A$ and $B$ be symmetric matrices. Then
$$\lambda_{r+s+1}(A+B) \le \lambda_{r+1}(A) + \lambda_{s+1}(B), \quad 0 \le r+s \le n-1.$$
(v) (Cauchy-Schwarz inequality for the trace operator) Let $A$ and $B$ be matrices of the same size. Then
$$\big| \operatorname{tr}\big( A B^t \big) \big| \le \operatorname{tr}^{1/2}\big( A A^t \big)\, \operatorname{tr}^{1/2}\big( B B^t \big).$$
(vi) Let $A, B$ be matrices of the same size. Then $A^t B + B^t A \le A^t A + B^t B$.
Corollary D.1.
Let $A$ and $B$ be matrices of the same size. Then
$$\lambda_1\big( A B^t + B A^t \big) \le \lambda_1\big( A A^t \big) + \lambda_1\big( B B^t \big).$$
Proof. By Lemma D.3 (vi), $A B^t + B A^t \le A A^t + B B^t$. Applying Lemma D.3 (iv) for $r = s = 0$ yields the result.
In the following lemma, we summarize some facts on Frobenius norms.
Lemma D.4.
Let $A \in M_{n-1}$. Then
(i) $\| A \|_F^2 := \operatorname{tr}(A A^t) = \sum_{i=1}^{n-1} \lambda_i(A A^t) = \sum_{i,j=1}^{n-1} a_{i,j}^2$, and whenever $A = A^t$ also $\| A \|_F^2 = \sum_{i=1}^{n-1} \lambda_i^2(A)$.
(ii) It holds $2 \operatorname{tr}(A^2) \le \| A + A^t \|_F^2 \le 4 \| A \|_F^2$.
(iii) Let $A$, $B$ be positive semidefinite matrices of the same size with $0 \le A \le B$, and let $X$ be another matrix of the same size. Then $\| X^t A X \|_F \le \| X^t B X \|_F$.
Proof. (i) and (ii) are well known and the proofs are omitted. (iii) By assumption it holds $0 \le X^t A X \le X^t B X$. Hence $\lambda_i(X^t A X) \le \lambda_i(X^t B X)$ and the result follows.
Lemma D.5.
Let $V = (V_1, \dots, V_n)^t$, $W = (W_1, \dots, W_m)^t$ be two independent, centered random vectors. Let $A = (a_{i,j})_{i,j=1,\dots,n} \in M_n$ and $B \in M_{n,m}$. Then
(i) $E(V^t A V) = \operatorname{tr}(A \operatorname{Cov}(V))$ and $E(V^t B W) = 0$.
(ii) Assume further that $V_i \perp V_j$ for all $i, j = 1, \dots, n$, $i \ne j$, that $W_k \perp W_l$ for all $k, l = 1, \dots, m$, $k \ne l$, and that $\operatorname{Var}(V_i) = \operatorname{Var}(W_k) = 1$ for $i = 1, \dots, n$ and $k = 1, \dots, m$. We set $\operatorname{TrSq}(A) := \sum_{i=1}^n a_{i,i}^2$. Then
$$\operatorname{Var}\big( V^t A V \big) = \operatorname{Cum}_4(V_1) \operatorname{TrSq}(A) + \operatorname{tr}\big( A^2 + A A^t \big) \le \operatorname{Cum}_4(V_1) \operatorname{TrSq}(A) + 2 \| A \|_F^2 \le \big( 2 + \operatorname{Cum}_4(V_1) \big) \| A \|_F^2, \quad \text{(D.6)}$$
$$\operatorname{Var}\big( V^t B W \big) = \| B \|_F^2, \qquad \operatorname{Var}\big( V^t A B W \big) \le \big\| A A^t \big\|_F \big\| B B^t \big\|_F. \quad \text{(D.7)}$$
Proof. We only prove the first and the last statement of (ii). Note that
$$\operatorname{Var}\big( V^t A V \big) = \sum_{i,j,k,l=1}^{n} a_{ij} a_{kl} \operatorname{Cov}( V_i V_j, V_k V_l ).$$
If $i = j = k = l$ then $\operatorname{Cov}(V_i V_j, V_k V_l) = 2 + \operatorname{Cum}_4(V_1)$; if $i = k$, $j = l$, $i \ne j$ or $i = l$, $j = k$, $i \ne j$ then $\operatorname{Cov}(V_i V_j, V_k V_l) = 1$. Otherwise $\operatorname{Cov}(V_i V_j, V_k V_l) = 0$, and this gives (D.6). In order to see (D.7) note that by Lemma D.3 (v)
$$\operatorname{Var}\big( V^t A B W \big) = \| A B \|_F^2 = \operatorname{tr}\big( (B B^t)(A^t A) \big) \le \operatorname{tr}^{1/2}\big( (B B^t)^2 \big)\, \operatorname{tr}^{1/2}\big( (A^t A)^2 \big) = \big\| B B^t \big\|_F \big\| A A^t \big\|_F.$$
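For standard normal $V$ the fourth cumulant vanishes, so (D.6) reduces to $\operatorname{Var}(V^t A V) = \operatorname{tr}(A^2 + A A^t)$. The following Monte Carlo sketch of this special case is an addition for illustration; the matrix $A$ is an arbitrary symmetric example:

```python
import numpy as np

rng = np.random.default_rng(0)

# arbitrary symmetric 3x3 example matrix
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

# (D.6) with Cum_4(V_1) = 0: Var(V^t A V) = tr(A^2 + A A^t) = 2 ||A||_F^2
exact = np.trace(A @ A + A @ A.T)

# Monte Carlo estimate over 200000 standard normal vectors
V = rng.standard_normal((200_000, 3))
quad = np.einsum('bi,ij,bj->b', V, A, V)  # the quadratic forms V^t A V
mc = quad.var()
```

With this $A$, $\operatorname{tr}(A^2 + A A^t) = 2\|A\|_F^2 = 36$, and the empirical variance agrees with it up to Monte Carlo error.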
Theorem D.1. Let $\xi \sim N(0, I_n)$ and let $A$ be a positive semidefinite matrix. Then
$$\operatorname{Var}^{-1/2}\big( \xi^t A \xi \big) \big( \xi^t A \xi - E( \xi^t A \xi ) \big) \xrightarrow{\mathcal{D}} N(0, 1)$$
if and only if $\operatorname{Var}^{-1/2}\big( \xi^t A \xi \big)\, \lambda_1(A) \to 0$.
Lemma D.6.
Let $n$ be sufficiently large. Then $\lambda_1( J_n^\tau ) \le n^{-1} \log^2 n$.
Proof. Let $r = [n/\log n]$ and note that $\sin(x)^{-1} \le \pi/(2x)$ for $x \in (0, \pi/2]$. Hence
$$\lambda_r^{-1} \le \frac14 \Big( \frac{n}{r} \Big)^2 \le \frac12 \log^2 n$$
and
$$\lambda_1( J_n^\tau ) = \big( n - [n/\log n] \big)^{-1} \lambda_r^{-1} \le n^{-1} \log^2 n.$$
Lemma D.7. Let $\lambda_i$ be as defined in (3.2). Then it holds
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \lambda_i = \frac{7 \pi^2}{24} + O\big( n^{-1/2} \big).$$
Proof.
Let $x_i = i\pi/(2n)$. Note that $\sin^2( x_i ) = x_i^2 - \xi_i^4/3$, where $\xi_i \in (0, x_i)$. Further
$$\max_{i = [\sqrt{n}/2]+1, \dots, [\sqrt{n}]} x_i \le n^{-1/2} \pi.$$
Hence
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \xi_i^4 \le n \max_{i = [\sqrt{n}/2]+1, \dots, [\sqrt{n}]} x_i^4 = O\big( n^{-1} \big)$$
and thus
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \lambda_i = 4 \sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \Big( \frac{i^2 \pi^2}{4 n^2} - \frac{\xi_i^4}{3} \Big) = \pi^2 n^{-3/2} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} i^2 + O\big( n^{-1} \big) = \frac{7 \pi^2}{24} + O\big( n^{-1/2} \big).$$
Lemma D.8 (Continuous Sobolev Embedding). Let $C(q)$, $q > 0$, denote the space of Hölder continuous functions on $[0,1]$ equipped with the canonical norm $\| \cdot \|_{C(q)}$ and define $\eta : (1/2, \infty) \times [0, \infty) \to \mathbb{R}$,
$$\eta( \alpha, \delta ) := \begin{cases} \alpha - 1/2, & \alpha \in (1/2, 3/2), \\ 1 - \delta, & \alpha = 3/2, \\ 1, & \alpha > 3/2. \end{cases}$$
Suppose $\alpha > 1/2$. Then for any $\delta > 0$ the embedding $\iota : \Theta_{bs}(\alpha, Q) \hookrightarrow C( \eta(\alpha, \delta) )$ is continuous and in particular
$$\sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{C(\eta(\alpha,\delta))} < \infty.$$
Proof.
For a given function $f : [0,1] \to \mathbb{R}$ define $\tilde f : [-1,1] \to \mathbb{R}$,
$$\tilde f(x) := \begin{cases} f(x), & x \in [0,1], \\ f(-x), & x \in [-1, 0). \end{cases}$$
For $s > 0$, let $W^{s,2}[-1,1]\big|_{[0,1]}$ denote the (fractional) Sobolev space on $[-1,1]$, where the domain of the functions is restricted to $[0,1]$, equipped with the norm
$$\| f \|_{W^{s,2}[-1,1]|_{[0,1]}} := \big\| \tilde f \big\|_{W^{s,2}[-1,1]}.$$
Note that this is a function space on $[0,1]$ and $W^{s,2}[-1,1]\big|_{[0,1]} \ne W^{s,2}[0,1]$. For $\alpha > 1/2$ the embedding
$$\iota : \Theta_{bs}(\alpha, Q) \subseteq W^{\alpha,2}[-1,1]\big|_{[0,1]} \hookrightarrow C( \eta(\alpha, \delta) )$$
is continuous, and since it is linear it is also bounded. This yields
$$\sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{C(\eta(\alpha,\delta))} \le \| \iota \| \sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{W^{\alpha,2}[-1,1]|_{[0,1]}} < \infty.$$

References

[1] Taylor, M. (1996).