Nonparametric estimation of the volatility function in a high-frequency model corrupted by noise
Axel Munk* and Johannes Schmidt-Hieber, Institut für Mathematische Stochastik, Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen
Email: [email protected], [email protected]

Abstract
We consider the models $Y_{i,n} = \int_0^{i/n}\sigma(s)\,dW_s + \tau(i/n)\,\epsilon_{i,n}$ and $\tilde Y_{i,n} = \sigma(i/n)\,W_{i/n} + \tau(i/n)\,\epsilon_{i,n}$, $i = 1,\ldots,n$, where $(W_t)_{t\in[0,1]}$ denotes a standard Brownian motion and $\epsilon_{i,n}$ are centered i.i.d. random variables with $E(\epsilon_{i,n}^2) = 1$ and finite fourth moment. Furthermore, $\sigma$ and $\tau$ are unknown deterministic functions, and $(W_t)_{t\in[0,1]}$ and $(\epsilon_{1,n},\ldots,\epsilon_{n,n})$ are assumed to be independent processes. Based on a spectral decomposition of the covariance structures we derive series estimators for $\sigma^2$ and $\tau^2$ and investigate the rate of convergence of their MISE in dependence on their smoothness. To this end, specific basis functions and their corresponding Sobolev ellipsoids are introduced, and we show that our estimators are optimal in minimax sense. Our work is motivated by microstructure noise models. A major finding is that the microstructure noise $\epsilon_{i,n}$ introduces an additional degree of ill-posedness of $1/2$, irrespective of the tail behavior of $\epsilon_{i,n}$. The performance of the estimators is illustrated by a small numerical study.
AMS 2000 Subject Classification:
Primary 62M09, 62M10; secondary 62G08, 62G20.
Keywords:
Brownian motion; Variance estimation; Minimax rate; Microstructure noise;Sobolev Embedding.
1 Introduction

Consider the models
$$Y_{i,n} = \int_0^{i/n}\sigma(s)\,dW_s + \tau\Big(\frac{i}{n}\Big)\epsilon_{i,n}, \quad i = 1,\ldots,n, \qquad (1.1)$$
and
$$\tilde Y_{i,n} = \sigma\Big(\frac{i}{n}\Big)W_{i/n} + \tau\Big(\frac{i}{n}\Big)\epsilon_{i,n}, \quad i = 1,\ldots,n, \qquad (1.2)$$
respectively, where $(W_t)_{t\in[0,1]}$ denotes a Brownian motion and $\epsilon_{i,n}$ is so-called microstructure noise, i.e. we assume the $\epsilon_{i,n}$ i.i.d. with $E(\epsilon_{i,n}^2) = 1$ and $E(\epsilon_{i,n}^4) < \infty$. $(W_t)_{t\in[0,1]}$ and $(\epsilon_{1,n},\ldots,\epsilon_{n,n})$ are assumed to be independent, and $\sigma$ and $\tau$ are unknown, positive and deterministic functions.

Our models (1.1) and (1.2) are natural extensions of the situation when $\sigma$ and $\tau$ are constant, which has been, in a slightly broader setting, previously considered by [8], [13], [14] and [24], among others. In the latter papers sharp minimax estimators were derived for $\sigma^2$ and $\tau^2$. The minimax rate for $\sigma^2$ is $n^{-1/4}$ and for $\tau^2$ it is $n^{-1/2}$, the corresponding constants for quadratic loss (MSE) being $8\tau\sigma^3$ and $2\tau^4$, respectively. To estimate $\sigma^2$ and $\tau^2$, maximum likelihood is feasible (see [24]) and achieves these bounds. Other efficient estimators were given by [8], [13] or [14]. In our case, i.e. when $\sigma$ and $\tau$ are functions, these methods fail and techniques from nonparametric regression become necessary. We postpone a more careful discussion of models (1.1) and (1.2) to Section 2.

Both models incorporate, as usual in high-frequency financial models, an additional noise term, denoted as microstructure noise (cf. [1] and [16]), in order to model market frictions such as bid-ask spreads and rounding errors. In general, microstructure noise is often assumed to be a white noise process with bounded fourth moment. Therefore, we may interpret both models as obtaining data from transformed Brownian motions under additional measurement errors. In particular, our assumptions cover the important case $\epsilon_{i,n} \overset{\text{i.i.d.}}{\sim} N(0,1)$.
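For readers who want to experiment, both observation schemes are straightforward to simulate. The following sketch is our own code, not part of the paper; the stochastic integral in (1.1) is approximated by a left-point Riemann sum on the sampling grid itself, and the function names are ours.

```python
import numpy as np

def simulate(n, sigma, tau, model=1, rng=None):
    """Simulate Y_{i,n}, i = 1, ..., n, from model (1.1) (model=1) or
    model (1.2) (model=2).  sigma and tau are vectorized callables on [0, 1]."""
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(1, n + 1) / n
    dW = rng.normal(0.0, np.sqrt(1.0 / n), n)   # Brownian increments on the grid
    eps = rng.normal(0.0, 1.0, n)               # microstructure noise, E eps^2 = 1
    if model == 1:   # time-transformed BM: int_0^{i/n} sigma(s) dW_s (Riemann sum)
        signal = np.cumsum(sigma(t) * dW)
    else:            # space-transformed BM: sigma(i/n) W_{i/n}
        signal = sigma(t) * np.cumsum(dW)
    return signal + tau(t) * eps
```

Calling `simulate` with constant functions recovers the classical constant-parameter setting discussed above.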
In this paper we try to understand how estimation of the functions $\sigma^2$ and $\tau^2$ themselves in (1.1) and (1.2) can be performed, i.e. of the time derivative of the integrated volatility. To our knowledge, this issue has never been addressed before; a remarkable exception is [3], where a harmonic analysis technique is introduced in order to recover $\sigma^2$ from noiseless data. A naive estimator of $\sigma^2$ would be the derivative of an estimator of $\int_0^s\sigma^2(x)\,dx$ with respect to $s$. However, (numerical) differentiation of $\int_0^s\sigma^2(x)\,dx$ with respect to $s$ yields an additional degree of ill-posedness, and to the best of our knowledge there are no estimates and no theoretical results available on how to estimate $\sigma^2$ in our situation. Instead, we propose a regularized estimator for $\sigma^2$ and $\tau^2$ that attains the minimax rate of convergence. Our estimator is a Fourier series estimator, where we estimate the single cosine Fourier coefficients $\int_0^1\sigma^2(x)\cos(k\pi x)\,dx$, $k = 0, 1, \ldots$, by a particular spectral estimator which is specifically tailored to this problem. The difficulty of estimating $\sigma^2$ can be explained generically from the point of view of statistical inverse problems: microstructure noise induces an additional degree of ill-posedness - similar as in a deconvolution problem - which in our case leads to a reduction of the rate of convergence by a factor $1/2$. Surprisingly, and in contrast to deconvolution, this is only reflected in the behavior of the eigenvalues of the covariance operator of the process in (1.1) and (1.2), and not in the tail behavior of the Fourier transform of the error $\epsilon_{i,n}$.

We stress that we are aware of the fact that our model assumes deterministic functions $\sigma$ and $\tau$ which depend only on time $t$; the generalization to $\sigma(t, X_t)$ is not obvious and a challenge for further research. However, the purely deterministic case already helps us to reveal the daily pattern of the volatility, and we believe that our analysis is an important step towards the understanding of these models from the viewpoint of statistical inverse problems.

Results:
All results are obtained with respect to the MISE-risk. Let $\alpha$ and $\beta$ denote the smoothness of $\sigma^2$ and $\tau^2$, respectively. Roughly speaking, these numbers correspond to the usual Sobolev indices, although in our situation a particular choice of basis is required, leading us to the definition of Sobolev s-ellipsoids (see Definition 1). Then we show that $\tau^2$ can be estimated at rate $n^{-\beta/(2\beta+1)}$ for $\beta > 1$, $\alpha > 1/4$ in model (1.1) and $\beta > 1$, $\alpha > 3/4$ in model (1.2), and that for $\sigma^2$ the $n^{-\alpha/(4\alpha+2)}$ rate of convergence holds for $\alpha > 1/2$, $\beta > 1/2$ in model (1.1) and $\alpha > 3/4$, $\beta > 1/2$ in model (1.2), uniformly over Sobolev s-ellipsoids. Lower bounds with respect to Hölder classes for estimation of $\sigma^2$ have been obtained in [17]. Here we extend this result to Sobolev s-ellipsoids. It follows that the obtained rates are indeed minimax.

To summarize, our major finding is that, in contrast to ordinary deconvolution, the difficulty of estimating $\sigma^2$ when corrupted by additional (microstructure) noise $\epsilon$ is generically increased by a factor of $1/2$ over Sobolev s-ellipsoids. This is quite surprising, because one might have expected that, for instance, Gaussian error leads to logarithmic convergence rates due to the exponential decay of its Fourier transform (see e.g. [4], [6], [7] and [11] for some results in this direction). We stress that for our method a minimal smoothness of $\sigma^2$ in (1.1) of $\alpha > 1/2$ is required. Roughly speaking, the results imply that $n$ data points for the estimation of $\sigma^2$ can be compared to the situation where we have $\sqrt n$ observations in usual nonparametric regression.

The work is organized as follows. In Sections 2 and 3 we discuss models (1.1) and (1.2) in more detail, introduce notation and define the required smoothness classes, Sobolev s-ellipsoids (details can be found in Appendix B). Sections 4.1 and 4.2 are devoted to the estimation of $\tau^2$ and $\sigma^2$, respectively, and present the rates of convergence of the estimators (for proofs see Appendix A). Section 5 provides the minimax result.
In Section 6 we briefly discuss some numerical results and illustrate the robustness of the estimator against non-normality and violations of the required smoothness assumptions for $\sigma^2$ and $\tau^2$. Some further results and technicalities for Sections 4.1 and 4.2 are given in the supplementary material.

2 Discussion of the models

In this section we briefly discuss the background from financial economics of model (1.1) and explore the differences between models (1.1) and (1.2). We may consider the processes $(\sigma(t)W_t)_{t\in[0,1]}$ and $\big(\int_0^t\sigma(s)\,dW_s\big)_{t\in[0,1]} \overset{D}{=} (W(H(t)))_{t\in[0,1]}$, $H(t) := \int_0^t\sigma^2(s)\,ds$, as (inhomogeneously) scaled Brownian motions, where scaling takes place in space and in time, respectively. Hence we will refer to $(\sigma(t)W_t)_{t\in[0,1]}$ and $\big(\int_0^t\sigma(s)\,dW_s\big)_{t\in[0,1]}$ in the following as space-transformed (sBM) and time-transformed (tBM) Brownian motion.

Model (1.1):
In the financial econometrics literature, variations of model (1.1) are often denoted as high-frequency models, since $(W_t)_{t\in[0,1]}$ is sampled at time points $t = i/n$, and nowadays there is a vast amount of literature on volatility estimation in high-frequency models with an additional microstructure noise term (see [2], [15], [26] and [27]). These kinds of models have attracted a lot of attention recently, since the usual quadratic variation techniques for estimation of $\int_0^1\sigma^2(x)\,dx$ lead to inconsistent estimators (cf. [26]). We are aware of the fact that, in contrast to our model, volatility is generally modelled not only as time dependent but also as depending on the process itself, i.e. $Y_{i,n} = X_{i/n} + \tau(i/n)\epsilon_{i,n}$, $i = 1,\ldots,n$, $dX_t = \sigma(t, X_t)\,dW_t$. An overview of commonly used parametric forms of $\sigma(t, X_t)$, and a nonparametric treatment in the absence of microstructure noise, can be found in [12]. It is known that the same rates as in the case of constant $\sigma$ and $\tau$ hold true if we consider model (1.1) and estimate the so-called integrated volatility or realized volatility $\int_0^s\sigma^2(x)\,dx$ ($s \in [0,1]$) and $\int_0^s\tau^2(x)\,dx$ instead of $\sigma^2$ and $\tau^2$, respectively (see [20] and [22] for a discussion of estimation of integrated volatility and related quantities). Recently, model (1.1) has been proven to be asymptotically equivalent to a Gaussian shift experiment (see [21]). $\sigma^2$ as a function of time corresponds in model (1.1) to the instantaneous volatility or spot volatility.

Model (1.2):
Model (1.2) can be regarded as a nonparametric extension of the model with constant $\sigma, \tau$, as discussed for variogram estimation by [24]. To motivate the usefulness of sBM we give the following lemma.
Lemma 1. (i) Assume that $\sigma$, $0 < c \le \sigma$, is continuously differentiable. Then the corresponding sBM, $(\sigma(t)W_t)_{t\in[0,1]}$, is the unique solution of the SDE
$$dX_t = X_t\,d\big(\log(\sigma(t))\big) + \sigma(t)\,dW_t, \quad X_0 = 0, \quad 0 \le t \le T.$$
(ii) The variogram of sBM is given by
$$\gamma(s,t) := E(X_t - X_s)^2 = \big(\sigma(t)t^{1/2} - \sigma(s)s^{1/2}\big)^2 + \sigma(t)\sigma(s)\Big[|s-t| - \big(s^{1/2} - t^{1/2}\big)^2\Big].$$

Proof. (i)
It is easy to check that sBM indeed is a solution. To establish uniqueness, weapply Theorem 9.1 in [23]. (ii)
This follows by straightforward calculations.
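The closed form in Lemma 1 (ii) can be checked numerically against the direct expansion $E(X_t - X_s)^2 = \sigma^2(t)t + \sigma^2(s)s - 2\sigma(t)\sigma(s)(s\wedge t)$, which follows from $\mathrm{Cov}(W_s, W_t) = s \wedge t$. The example function below is our own choice; the check is a pure algebraic identity, so the two forms agree to floating-point precision.

```python
import numpy as np

def variogram_direct(s, t, sigma):
    # E(X_t - X_s)^2 for X_t = sigma(t) W_t, via Cov(W_s, W_t) = min(s, t)
    return sigma(t) ** 2 * t + sigma(s) ** 2 * s \
        - 2.0 * sigma(t) * sigma(s) * np.minimum(s, t)

def variogram_lemma(s, t, sigma):
    # closed form of Lemma 1 (ii)
    a = (sigma(t) * np.sqrt(t) - sigma(s) * np.sqrt(s)) ** 2
    b = sigma(t) * sigma(s) * (np.abs(s - t) - (np.sqrt(s) - np.sqrt(t)) ** 2)
    return a + b

sigma = lambda t: 2.0 + np.cos(2.0 * np.pi * t)   # arbitrary smooth example
s = np.linspace(0.01, 1.0, 50)
t = np.linspace(0.01, 1.0, 50)[::-1]
print(np.max(np.abs(variogram_direct(s, t, sigma) - variogram_lemma(s, t, sigma))))
```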
Comparison of the models:
We remark that tBM can be related to sBM by partial integration: $\int_0^t\sigma(s)\,dW_s = \sigma(t)W_t - \int_0^t\sigma'(s)W_s\,ds$. To see the differences, we compare in Figure 1 sBM and tBM in two typical situations: the case where $\sigma(t) = 0$ for $t > T$, and the case where $\sigma$ is discontinuous. If $\sigma(t) = 0$ for $t > T$, sBM tends to zero, whereas tBM tends to a constant, namely the random variable $\int_0^T\sigma(s)\,dW_s$. Furthermore, if $\sigma$ is a jump function, sBM has a jump too, whereas tBM does not. Unlike model (1.1), which can be viewed as a price process, model (1.2) has no direct application in financial mathematics. However, from the viewpoint of nonparametric statistics it seems to be a natural extension of the situation when $\sigma$ and $\tau$ are constant.

3 Notation

In this section we shortly introduce the setup needed in order to define the estimators. First we define suitable smoothness classes, which are different from, but related to, well-known Sobolev ellipsoids (see Definition B.1).
Definition 1.
For $\alpha > 0$, $C > 0$, we call the function space
$$\Theta_s := \Theta_s(\alpha, C) := \Big\{ f \in L^2[0,1] : \exists\,(\theta_n)_{n\in\mathbb{N}} \text{ s.t. } f(x) = \theta_0 + 2\sum_{i=1}^{\infty}\theta_i\cos(i\pi x),\ \sum_{i=1}^{\infty} i^{2\alpha}\theta_i^2 \le C \Big\}$$
a Sobolev s-ellipsoid. If there is a $C < \infty$ such that $f \in \Theta_s(\alpha, C)$, we say $f$ has smoothness $\alpha$. For $0 < l < u < \infty$, we further introduce the uniformly bounded Sobolev s-ellipsoid
$$\Theta_s^b(\alpha, C) := \Theta_s^b(\alpha, C, [l,u]) := \{ f \in \Theta_s(\alpha, C) : l \le f \le u \}.$$
Here the "s" refers to "symmetry", since the $L^2[0,1]$ basis
$$\{\psi_k,\ k = 0, 1, \ldots\} := \big\{1,\ \sqrt{2}\cos(k\pi t),\ k = 1, \ldots\big\} \qquad (3.1)$$
can also be viewed as a basis of the symmetric $L^2[-1,1]$ functions $\{f : f \in L^2[-1,1],\ f(x) = f(-x)\ \forall x \in [0,1]\}$. Usually, Sobolev ellipsoids are introduced with respect to the Fourier basis $\{1, \sqrt{2}\sin(2k\pi t), \sqrt{2}\cos(2k\pi t),\ k = 1, \ldots\}$ on $L^2[0,1]$ (see Definition B.1). As will turn out later on, Sobolev s-ellipsoids are more natural for our approach. If a function has a certain smoothness with respect to one basis, it might have a completely different smoothness with respect to the other. For instance, the function $\cos((2l+1)\pi x)$, $l \in \mathbb{N}$, has smoothness $\alpha$ for all $\alpha < \infty$ with respect to basis (3.1), while, as can be seen by direct calculations, it has only smoothness $\alpha < 1/2$ with respect to the Fourier basis.

Define $f_k : [0,1] \to \mathbb{R}$, $k \in \mathbb{N}_0$, by $f_k(x) := \psi_k(x/2)$. Note that for $k \ge 1$, $f_k^2$ can be expanded in basis (3.1) as $f_k^2 = \psi_0 + 2^{-1/2}\psi_k$. For any function $g$ we introduce the forward difference operator $\Delta_i g := g((i+1)/n) - g(i/n)$, and further the transformed variables $\Delta Y^{k,1}_{i,n} := (Y_{i+1,n} - Y_{i,n})f_k(i/n)$ and $\Delta Y^{k,2}_{i,n} := (\tilde Y_{i+1,n} - \tilde Y_{i,n})f_k(i/n)$, $i = 1, \ldots, n-1$, and write $\Delta Y^k_{i,n} = \Delta Y^{k,l}_{i,n}$, $l = 1, 2$, when no distinction is needed. Throughout the paper we abbreviate first order differences of observations by
$$\Delta Y^k := \big(\Delta Y^k_{1,n}, \ldots, \Delta Y^k_{n-1,n}\big)^t.$$
We write $M_{p,q}$, $M_p$ and $D_p$ for the space of $p \times q$ matrices, $p \times p$ matrices and $p \times p$ diagonal matrices over $\mathbb{R}$, respectively. Further let $D_{n-1} \in M_{n-1}$ be given by $(D_{n-1})_{i,j} = \sqrt{2/n}\,\sin(ij\pi/n)$, and define
$$\lambda_{i,n-1} := 4\sin^2\big(i\pi/(2n)\big), \quad i = 1, \ldots, n-1, \qquad (3.2)$$
the eigenvalues of the covariance matrix $K_{n-1} \in M_{n-1}$ of the MA(1) process $\Delta_i\epsilon_{i,n} := \epsilon_{i+1,n} - \epsilon_{i,n}$, $i = 1, \ldots, n-1$.
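The objects just introduced are easy to verify numerically (a small sanity sketch in our own notation): $D$ is symmetric and orthogonal, and it diagonalizes the tridiagonal covariance matrix $K$ of the differenced noise with the eigenvalues (3.2).

```python
import numpy as np

n = 16
i = np.arange(1, n)                                         # frequencies 1, ..., n-1
D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)   # the matrix D
lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2              # eigenvalues (3.2)

# K: covariance matrix of the MA(1) differences eps_{i+1} - eps_i
# (2 on the diagonal, -1 on the first off-diagonals)
K = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)

# D is symmetric and orthogonal, and K = D Lambda D
print(np.allclose(D @ D, np.eye(n - 1)), np.allclose(D @ np.diag(lam) @ D, K))
```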
More explicitly, $K_{n-1}$ is tridiagonal with
$$(K_{n-1})_{i,j} = \begin{cases} 2 & i = j, \\ -1 & |i-j| = 1, \\ 0 & \text{else.} \end{cases} \qquad (3.3)$$
Note that we can diagonalize $K_{n-1}$ explicitly by $K_{n-1} = D_{n-1}\Lambda_{n-1}D_{n-1}$, where $\Lambda_{n-1}$ is diagonal with diagonal entries given by (3.2). We will suppress the index $n-1$ and write $K$, $D$, $\Lambda$, $\lambda_i$ instead of $K_{n-1}$, $D_{n-1}$, $\Lambda_{n-1}$ and $\lambda_{i,n-1}$, respectively. We write $[x] := \max_{z\in\mathbb{Z}}\{z \le x\}$, $x \in \mathbb{R}$, for the integer part of $x$. $\log(\cdot)$ is defined to be the binary logarithm, and in order to define the estimators properly we assume throughout the paper additionally $n > 2$.

4.1 Estimation of $\tau^2$

Before we turn to the estimation of the volatility $\sigma^2$, we first discuss estimation of the noise variance, i.e. $\tau^2$. Let $J_n^\tau \in D_{n-1}$ be given by
$$(J_n^\tau)_{i,j} := \big(n - [n/\log n]\big)^{-1}\lambda_i^{-1}\delta_{i,j}, \quad \text{for } [n/\log n] \le i, j \le n-1,$$
and $(J_n^\tau)_{i,j} := 0$ otherwise, where $\lambda_i$ is defined in (3.2) and $\delta_{i,j}$ denotes the Kronecker delta. We consider models (1.1) and (1.2) simultaneously. Let
$$\hat t_{k,0} := \big(\Delta Y^k\big)^t DJ_n^\tau D^t\big(\Delta Y^k\big). \qquad (4.1)$$
In Lemma C.1 it will be shown that $\hat t_{k,0}$ is a $\sqrt n$-consistent estimator of
$$t_{k,0} := \int_0^1\tau^2(x)f_k^2(x)\,dx.$$
Note that for $k \ge 1$, $t_{k,0} = \int_0^1\tau^2(x)\psi_0(x)\,dx + 2^{-1/2}\int_0^1\tau^2(x)\psi_k(x)\,dx$. Define $Z := D(\Delta Y^k)$ and denote by $Z_i$ the $i$-th component of $Z$. Then
$$\hat t_{k,0} = \big(n - [n/\log n]\big)^{-1}\sum_{i=[n/\log n]}^{n-1}\lambda_i^{-1}Z_i^2. \qquad (4.2)$$
Hence this can also be seen as a spectral filter in the Fourier domain, where we cut off the first $n/\log n$ frequencies.
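A minimal implementation of (4.1)/(4.2), and of the cosine series estimator (4.3) built from it, might look as follows. This is our own sketch, not the authors' code: the transform $D$ is applied as an explicit matrix rather than via the FFT, and the helper names are ours.

```python
import numpy as np

def hat_t(dYk):
    """Spectral estimator (4.2): sine-transform the differenced data,
    rescale by the eigenvalues lambda_i of (3.2), and average over the
    high frequencies i >= [n / log2(n)]."""
    n = dYk.size + 1
    i = np.arange(1, n)
    D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
    Z = D @ dYk
    lo = int(n / np.log2(n))          # cut off the first n/log n frequencies
    return np.sum(Z[lo - 1:] ** 2 / lam[lo - 1:]) / (n - lo)

def hat_tau2(Y, N, grid):
    """Cosine series estimator of tau^2, evaluated at the points in `grid`."""
    n = Y.size
    x = np.arange(1, n) / n
    dY = np.diff(Y)
    f = lambda k: np.sqrt(2.0) * np.cos(k * np.pi * x / 2.0)   # f_k = psi_k(x/2)
    t = [hat_t(dY)] + [hat_t(dY * f(k)) for k in range(1, N + 1)]
    return t[0] + 2.0 * sum((t[k] - t[0]) * np.cos(k * np.pi * grid)
                            for k in range(1, N + 1))
```

On simulated data with constant $\tau$, `hat_t(np.diff(Y))` should return a value close to $\tau^2$ once $n$ is moderately large.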
Note that for $i \ge 1$, $2^{1/2}(t_{i,0} - t_{0,0}) = \int_0^1\tau^2(x)\psi_i(x)\,dx$ is the $i$-th series coefficient with respect to basis (3.1). This observation suggests to construct the cosine series estimator
$$\hat\tau^2_N(t) := \hat t_{0,0} + 2\sum_{i=1}^N\big(\hat t_{i,0} - \hat t_{0,0}\big)\cos(i\pi t). \qquad (4.3)$$
The next result provides the rate of convergence of $\hat\tau^2_N$ uniformly over Sobolev s-ellipsoids. To this end, a version of the continuous Sobolev embedding theorem is required for non-integer indices $\alpha, \beta$ (see Lemma D.8). A proof of the following theorem can be found in the supplementary material.

Theorem 1 (MISE of $\hat\tau^2_N$). Let $\hat\tau^2_N$ be defined as in (4.3). Assume $\beta > 1$ and $Q, \bar Q > 0$. Further suppose that $N = N_n = o\big(n^{1/2}/\log n\big)$. Assume either model (1.1) and $\alpha > 1/4$, or model (1.2) and $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1}\big).$$
Minimizing the r.h.s. yields $N^* = O\big(n^{1/(2\beta+1)}\big)$, and consequently
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\tau^2_{N^*}\big) = O\big(n^{-2\beta/(2\beta+1)}\big).$$

Remark 1.
Note that for model (1.1), Theorem 1 holds whenever $\alpha > 1/4$. Hence the Brownian motion part of the model can be viewed as a nuisance parameter, not affecting the rates for estimation of $\tau^2$. However, for model (1.2), $\alpha > 3/4$ is required here. This more restrictive assumption is essentially a consequence of the fact that the process $\sigma(i/n)W_{i/n}$ is in general not a martingale.

Remark 2.
The result of Theorem 1 can be extended to $1/2 < \beta \le 1$ in model (1.1), and to $1/2 < \alpha \le 3/4$, $1/2 < \beta \le 1$ in model (1.2). Let $\tilde t_{k,0}$ be defined as $\hat t_{k,0}$ in (4.1), but with $J_n^\tau$ replaced by $\tilde J_n^\tau \in D_{n-1}$,
$$\big(\tilde J_n^\tau\big)_{i,j} = \begin{cases} 2n^{-1}\lambda_i^{-1}\delta_{i,j} & \text{for } [n/2] \le i, j \le n-1, \\ 0 & \text{otherwise.} \end{cases}$$
Introduce further the estimator $\tilde\tau^2_N(t) = \tilde t_{0,0} + 2\sum_{i=1}^N(\tilde t_{i,0} - \tilde t_{0,0})\cos(i\pi t)$, and suppose that $N = O\big(n^{1/(2\beta+1)}\big)$. Then we obtain, by slight modifications of the proof of Theorem 1, for $\beta > 1/2$, $\alpha > 1/2$ and $Q, \bar Q > 0$:

(i) Assume model (1.1). Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1} + Nn^{1-2\beta}\big),$$
and $N^* = O\big(n^{(2\beta-1)/(2\beta+1)}\big)$ yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_{N^*}\big) = O\big(n^{-2\beta(2\beta-1)/(2\beta+1)}\big).$$

(ii) Assume model (1.2). Then we have the expansion
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_N\big) = O\big(N^{-2\beta} + Nn^{-1} + Nn^{1-2\beta} + Nn^{2-4\alpha}\big),$$
and the choice
$$N^* = \begin{cases} O\big(n^{(2\beta-1)/(2\beta+1)}\big) & \text{for } \beta \le 1 \wedge (2\alpha - 1/2), \\ O\big(n^{(4\alpha-2)/(2\beta+1)}\big) & \text{for } \alpha \le 3/4 \wedge (\beta/2 + 1/4) \end{cases}$$
yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\tilde\tau^2_{N^*}\big) = \begin{cases} O\big(n^{-2\beta(2\beta-1)/(2\beta+1)}\big) & \text{for } \beta \le 1 \wedge (2\alpha - 1/2), \\ O\big(n^{-2\beta(4\alpha-2)/(2\beta+1)}\big) & \text{for } \alpha \le 3/4 \wedge (\beta/2 + 1/4), \end{cases}$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Remark 3.
It is also possible, although more technical, to compute the asymptotic constant of the estimator $\hat\tau^2_{N^*}$. Suppose that the microstructure noise is Gaussian, and assume model (1.1) and $\beta > 1$, or model (1.2) and $\beta > 1$, $\alpha > 3/4$. Then we have, more explicitly,
$$\mathrm{MISE}\big(\hat\tau^2_{N^*}\big) = \frac{2N^*}{n}\int_0^1\tau^4(x)\,dx + \sum_{k=N^*+1}^{\infty}\Big(\int_0^1\tau^2(x)\psi_k(x)\,dx\Big)^2 + o\big(N^*n^{-1}\big).$$

Remark 4.
There are of course simpler estimators for $t_{k,0}$. For instance, if we replace $J_n^\tau$ in (4.1) by $(2n)^{-1}I_{n-1}$, where $I_{n-1} \in D_{n-1}$ denotes the identity matrix, we obtain the quadratic variation estimator of $t_{k,0}$ (cf. [1]), and it is not difficult to show that this estimator attains the optimal rate of convergence. This approach could even be extended to a nonparametric estimator of the form (4.3). However, the single Fourier coefficients are not estimated efficiently: in the case of Gaussian microstructure noise the asymptotic constant is $3n^{-1}\int_0^1\tau_k^4(x)\,dx$ (this is a straightforward extension of Theorem A.1 in [27]), whereas for our estimator we have $2n^{-1}\int_0^1\tau_k^4(x)\,dx$ (see Lemma C.1). If $\tau$ is constant, it can easily be seen that the estimator in (4.1) is efficient for $k = 0$, whereas quadratic variation is not.

Remark 5.
In practical applications it would be more natural to use, instead of $n/\log n$ in (4.2), other cut-off frequencies, e.g. $n^\gamma/\log n$ or $qn$, where $1/2 < \gamma \le 1$, $0 < q < 1$. A smaller $\gamma$ decreases the variance while, on the other hand, increasing the bias of the estimator.

4.2 Estimation of $\sigma^2$

Define $J_n \in D_{n-1}$ by
$$(J_n)_{i,j} = \sqrt n\,\delta_{i,j}, \quad \text{for } [n^{1/2}] + 1 \le i, j \le 2[n^{1/2}], \qquad (4.4)$$
and $(J_n)_{i,j} = 0$ otherwise. Similarly as for the estimation of $\tau^2$, we first introduce an estimator of the appropriate Fourier coefficients by
$$\hat s_{k,0} = \big(\Delta Y^k\big)^t DJ_nD^t\big(\Delta Y^k\big) - \tfrac{7}{3}\pi^2\,\hat t_{k,0}. \qquad (4.5)$$
The second part, i.e. $-\tfrac{7}{3}\pi^2\,\hat t_{k,0}$, is a correction for the bias induced by the microstructure noise; the constant $\tfrac{7}{3}\pi^2$ arises as the limit of $\sqrt n$ times the sum of the eigenvalues $\lambda_i$ between $[n^{1/2}] + 1$ and $2[n^{1/2}]$ in (4.4). As we will see, the estimator $\hat t_{k,0}$ has better convergence properties than the first term in $\hat s_{k,0}$, and hence does not affect the asymptotic variance. Similarly to (4.3), we put
$$\hat\sigma^2_N(t) = \hat s_{0,0} + 2\sum_{i=1}^N\big(\hat s_{i,0} - \hat s_{0,0}\big)\cos(i\pi t). \qquad (4.6)$$

Theorem 2 (MISE of $\hat\sigma^2_N$). Let $\hat\sigma^2_N$ be defined as in (4.6). Suppose that $N = N_n = o\big(n^{1/4}\big)$, $\beta > 1/2$ and $Q, \bar Q > 0$. Assume model (1.1) and $\alpha > 1/2$, or model (1.2) and $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2}\big),$$
and minimizing the r.h.s. yields
$$\sup_{\sigma^2\in\Theta_s^b(\alpha,Q),\,\tau^2\in\Theta_s^b(\beta,\bar Q)} \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = O\big(n^{-\alpha/(2\alpha+1)}\big) \quad \text{for } N^* = O\big(n^{1/(4\alpha+2)}\big).$$
The proof of Theorem 2 is given in Section A.2.
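For $k = 0$, the estimator (4.5) can be sketched as follows. This is our own simplified code: the noise-bias constant is computed from the eigenvalue sum as in Remark 7 rather than hard-coded, and the plug-in estimate of $\tau^2$ uses the high-frequency average (4.2).

```python
import numpy as np

def hat_s0(Y):
    """Sketch of the coefficient estimator (4.5) for k = 0, estimating
    s_{0,0} = int_0^1 sigma^2(x) dx."""
    n = Y.size
    dY = np.diff(Y)
    i = np.arange(1, n)
    D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
    Z = D @ dY
    m = int(np.sqrt(n))
    win = slice(m, 2 * m)                    # frequencies [sqrt(n)]+1, ..., 2[sqrt(n)]
    quad = np.sqrt(n) * np.sum(Z[win] ** 2)  # quadratic form with J_n of (4.4)
    cut = int(n / np.log2(n))                # plug-in tau^2 estimate, cf. (4.2)
    tau2 = np.mean(Z[cut - 1:] ** 2 / lam[cut - 1:])
    return quad - np.sqrt(n) * np.sum(lam[win]) * tau2
```

On data with constant $\sigma$ the output fluctuates around $\sigma^2$ at the slow $n^{-1/4}$ scale, so sizable samples (or averaging over repetitions) are needed before the estimate stabilizes.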
Remark 6.
It is also possible to extend this result to less smooth functions $\sigma^2$ and $\tau^2$.

(i) Assume model (1.1) and $\alpha > 1/4$, $\beta > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2} + Nn^{1-2\beta} + Nn^{1/2-2\alpha}\big),$$
and
$$N^* = \begin{cases} O\big(n^{(2\alpha-1/2)/(2\alpha+1)}\big) & \text{for } \alpha \le 1/2 \wedge (\beta - 1/4), \\ O\big(n^{(2\beta-1)/(2\alpha+1)}\big) & \text{for } \beta \le 3/4 \wedge (\alpha + 1/4) \end{cases}$$
yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = \begin{cases} O\big(n^{-2\alpha(2\alpha-1/2)/(2\alpha+1)}\big) & \text{for } \alpha \le 1/2 \wedge (\beta - 1/4), \\ O\big(n^{-2\alpha(2\beta-1)/(2\alpha+1)}\big) & \text{for } \beta \le 3/4 \wedge (\alpha + 1/4). \end{cases}$$

(ii) Assume model (1.2) and $\alpha > 3/4$, $\beta > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_N\big) = O\big(N^{-2\alpha} + Nn^{-1/2} + Nn^{1-2\beta}\big),$$
and $N^* = O\big(n^{(2\beta-1)/(2\alpha+1)}\big)$ yields
$$\sup_{\sigma^2,\tau^2}\ \mathrm{MISE}\big(\hat\sigma^2_{N^*}\big) = O\big(n^{-2\alpha(2\beta-1)/(2\alpha+1)}\big),$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Remark 7.
In analogy to (4.2), the estimator $\hat s_{k,0}$ can also be viewed as a spectral filter in the Fourier domain, where essentially only the frequencies $n^{1/2}, \ldots, 2n^{1/2}$ play a role. For practical purposes, one can generalize this to estimators where the frequencies $k, \ldots, [cn^{1/2}]$, $c > 1$, are used. If $\sigma^2$ is assumed to be very smooth, one may even set $k = 1$. In this more general setting, the constant $-\tfrac{7}{3}\pi^2$ in the definition of the estimator has to be replaced by $-n\big([cn^{1/2}] - k\big)^{-1}\sum_{i=k}^{[cn^{1/2}]}\lambda_i$.

Remark 8.
Since the matrix $D$ in the definition of $\hat s_{k,0}$ is a discrete sine transform (for a definition see [5]), the estimator $\hat\sigma^2_N$ can be calculated explicitly in $O(Nn\log n)$ steps.

5 Lower bounds

In this section we discuss the optimality of the proposed estimators. To this end, we establish lower bounds with respect to Sobolev s-ellipsoids.

Theorem 3.
Assume model (1.1) or model (1.2), $\alpha \in \mathbb{N}\setminus\{0\}$. Further assume $\tau$ constant. Then there exists a $C > 0$ (depending only on $\alpha, Q, l, u$) such that
$$\lim_{n\to\infty}\ \inf_{\hat\sigma^2_n}\ \sup_{\sigma^2\in\Theta_s^b(\alpha,Q)} E\Big(n^{\alpha/(2\alpha+1)}\big\|\hat\sigma^2_n - \sigma^2\big\|_2^2\Big) \ge C.$$

Proof.
The proof relies on a multiple hypothesis testing argument and is close to the proof given in [17], Theorem 2.1. However, the lower bounds there are established with respect to the space of Hölder continuous functions of index $\alpha$ on the interval $I = [0,1]$, i.e. for $0 < l < u < \infty$,
$$C^b(\alpha, L) := C^b(\alpha, L, [l,u]) := \Big\{ f : f^{(p)} \text{ exists for } p = [\alpha],\ \big|f^{(p)}(x) - f^{(p)}(y)\big| \le L|x-y|^{\alpha-p}\ \forall\,x,y \in I,\ 0 < l \le f \le u < \infty \Big\}.$$
Therefore, the statement above does not follow immediately from [17], Theorem 2.1, because $C^b(\alpha, L) \not\subset \Theta_s^b(\alpha, Q)$ due to boundary effects. Here we only point out the differences to the proof of [17], Theorem 2.1. We write $\sigma^2_{\min}, \sigma^2_{\max}$ for the lower and upper bound of $\sigma^2$, respectively, i.e. $\sigma^2 \in \Theta_s^b(\alpha, Q, [\sigma^2_{\min}, \sigma^2_{\max}])$. Without loss of generality we may assume that $\sigma^2_{\min} = 1$. For the multiple hypothesis testing argument (cf. [25]) a specific choice of functions $\sigma^2_{i,n}$ is required. For a construction see [17], proof of Theorem 2.1, where $L := \big(\pi^{2\alpha}Q\big)^{1/2}/\|K^{(\alpha)}\|_\infty$. It remains to show $\sigma^2_{i,n} \in \Theta_s^b(\alpha, Q)$, $i = 0, 1, \ldots, M$. Due to the construction of $\sigma^2_{i,n}$, we have $\sigma^2_{i,n}(t) = 1$ for $t \in [0, 1/4] \cup [3/4, 1]$
and $(\sigma^2_{i,n})^{(l)}(0) = (\sigma^2_{i,n})^{(l)}(1) = 0$ for $l \in \{1, 2, \ldots, \alpha\}$. Thus $\sigma^2_{i,n} \in W\big(\alpha, L\|K^{(\alpha)}\|_\infty\big)$ (for a definition see Equation (B.1)), $\alpha \in \{1, 2, 3, \ldots\}$, $i = 0, \ldots, M$. Hence, by Theorem B.1, it follows that $\sigma^2_{i,n} \in \Theta_s(\alpha, Q)$ for $i = 0, \ldots, M$.

6 Numerical results

In this section we briefly illustrate the performance of our estimators. Our aim is not to give a comprehensive simulation study; rather, we would like to illustrate the behaviour of the estimator when the assumptions of Theorems 1 and 2 are violated. In the following, we apply our estimator to simulated data, where we always set $n = 25{,}000$. From the point of view of financial statistics, this is approximately the sample size obtained over a trading day (6.5 hours, sampled every second). We chose the smoothing parameter $N$ in (4.3) and (4.6) as the minimizer of $\|\hat\tau^2 - \tau^2\|_n$ and $\|\hat\sigma^2 - \sigma^2\|_n$, respectively, which is in practice unknown. Of course, proper selection of the threshold $N^*$ is of major importance for the performance of the estimator. To this end, various methods are available; among others, cross-validation techniques, balancing principles and variants thereof could be employed (see e.g. [9], [10], [18] and [19]). A thorough investigation is postponed to a separate paper. Throughout our simulations we assumed $\tau = 0.01$ and concentrated mainly on estimation of $\sigma^2$, as it is the more challenging task.

In Figure 2 we have displayed the estimator for $\sigma^2(t) = (2 + \cos(2\pi t))/4$. Note that by Definition 1, $\sigma^2$ has "infinite" smoothness, i.e. for any $\alpha > 0$ we can find a $Q < \infty$ such that $\sigma^2 \in \Theta_s(\alpha, Q)$. The reconstruction shows that estimation of $\tau^2$ can be done much more easily than estimation of $\sigma^2$, although it is of smaller magnitude. In Figure 3, we are interested in the behavior of the estimators when heavy-tailed microstructure noise is present. This was simulated by generating $\epsilon_{i,n} \sim 3^{-1/2}\,t(3)$, $i = 1, \ldots, n$, i.i.d., where $t(3)$ denotes a t-distribution with 3 degrees of freedom. We can see from Plot 1 in Figure 3 that the resulting microstructure noise has some severe outliers, in accordance with the tail $x^{-4}$ of the density of $t(3)$. Nevertheless, estimation of $\tau^2$ and $\sigma^2$ is not visibly affected by the distribution of the noise.

In the subsequent figures we illustrate the behaviour of the estimator when the required smoothness assumptions on $\sigma^2$ and $\tau^2$ are violated. To this end, we investigate in Figure 4 the situation when $\sigma^2$ is random itself, i.e. a realization of a Brownian motion, $\sigma^2(t) = 3|\tilde W_t|$. The Brownian motion $(\tilde W_t)_{t\in[0,1]}$ was modelled as independent of the Brownian motion in (1.1) and of the microstructure noise process. It is of course not possible to reconstruct the complete path of $\sigma^2$, but as Figure 4 indicates, the estimator at least detects the smoothed shape of the path. Hence our estimator might already reveal some parts of the pattern of volatility also in the case where $\sigma^2$ is non-deterministic, which is certainly more realistic in most applications.

Finally, in Figure 5 we investigated the case of $\sigma^2$ being a jump function. We put $\sigma^2(t) = 1 + I_{(1/2,1]}(t)$, a function with a jump at $t = 1/2$. Fourier series usually show a Gibbs phenomenon, i.e. an oscillating behavior at discontinuities. This behavior is also clearly visible in the graph of $\hat\sigma^2$. In order to reconstruct jumps in volatility, other methods will certainly be more suitable; this is postponed to a separate paper.
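The heavy-tailed experiment of Figure 3 is easy to reproduce in outline. The sketch below uses our own (smaller) parameter choices; $\epsilon_{i,n} \sim 3^{-1/2}t(3)$, so that $E\epsilon_{i,n}^2 = 1$ while the fourth moment is infinite, i.e. the moment assumption of the model is deliberately violated.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2048
t = np.arange(1, n + 1) / n
eps = rng.standard_t(3, n) / np.sqrt(3.0)             # E eps^2 = 1, tail x^{-4}
sig = np.sqrt((2.0 + np.cos(2.0 * np.pi * t)) / 4.0)  # sigma^2 as in Figure 2
Y = np.cumsum(sig * rng.normal(0.0, np.sqrt(1.0 / n), n)) + 0.1 * eps  # tau = 0.1

# tau^2-estimator (4.2) with k = 0
dY = np.diff(Y)
i = np.arange(1, n)
D = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
lam = 4.0 * np.sin(i * np.pi / (2.0 * n)) ** 2
Z = D @ dY
lo = int(n / np.log2(n))
tau2_hat = np.mean(Z[lo - 1:] ** 2 / lam[lo - 1:])
print(tau2_hat)   # estimate of tau^2 = 0.01 despite the heavy tails
```

Despite the outliers produced by $t(3)$, the spectral average remains a reasonable estimate of $\tau^2$, consistent with the robustness reported in Figure 3.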
Computational tasks:
We implemented the estimators in Matlab, using the routine fft() for the discrete sine transform (see Remark 8). Calculation of the estimators for a sample size of $n = 25{,}000$ took around 2-3 seconds on an Intel Celeron 1.7 GHz processor. As mentioned in Remark 8, the estimator can be calculated in $O(Nn\log n)$ steps. If we choose $N$ of the optimal order, i.e. $N \sim n^{1/(4\alpha+2)}$, we have for the complexity $O(Nn\log n) = o\big(n^{5/4}\log n\big)$, whenever $\alpha > 1/2$.

Appendix A: Convergence rate of $\hat\sigma^2$

In this section we give a proof of Theorem 2. To this end, we first introduce some notation and then prove a lemma in order to obtain uniform estimates of the bias and variance of the single estimators $\hat s_{k,0}$.

A.1 Preliminary Results and Notation
Proofs of the upper bounds are based on a decomposition of $\Delta Y^k$. In this subsection we present some further notation. Let $\sigma_k(t) := \sigma(t)f_k(t)$ and $\tau_k(t) := \tau(t)f_k(t)$, $t \in [0,1]$. Throughout the following, for the Sobolev s-ellipsoids of Definition 1, the constants are $l = \sigma^2_{\min}$ and $u = \sigma^2_{\max}$ for $\sigma^2$, and $l = \tau^2_{\min}$, $u = \tau^2_{\max}$ for $\tau^2$. We define
$$\phi_n := \sup_{\sigma^2\in\Theta_s^b(\alpha,Q)}\ \max_{i=1,\ldots,n-1}\ \sup_{\xi\in[i/n,(i+1)/n]}\Big|\sigma(\xi) - \sigma\Big(\frac{i}{n}\Big)\Big|, \qquad \bar\phi_n := \sup_{\tau^2\in\Theta_s^b(\beta,\bar Q)}\ \max_{k\le n^{1/2}}\ \max_{i=1,\ldots,n-1}|\Delta_i\tau_k|. \qquad (A.1)$$
In order to treat model (1.1) and model (1.2) simultaneously, we first define the more general process
$$V_{k,l} := X_{1,k} + X_{2,k} + Z_{1,k,l} + Z_{2,k}, \quad l = 1, 2, \qquad (A.2)$$
where $X_{1,k}$, $X_{2,k}$, $Z_{1,k,l}$ and $Z_{2,k}$ are the $(n-1)$-dimensional random vectors
$$(X_{1,k})_i := \sigma_k(i/n)\,\Delta_iW, \qquad (X_{2,k})_i := \tau_k(i/n)\,\Delta_i\epsilon_{i,n},$$
$$(Z_{1,k,1})_i := f_k(i/n)\int_{i/n}^{(i+1)/n}\Big(\sigma(s) - \sigma\Big(\frac{i}{n}\Big)\Big)\,dW_s, \qquad (Z_{1,k,2})_i := f_k(i/n)\,(\Delta_i\sigma)\,W_{(i+1)/n},$$
$$(Z_{2,k})_i := f_k(i/n)\,(\Delta_i\tau)\,\epsilon_{i+1,n}, \quad i = 1, \ldots, n-1.$$
Obviously, $\Delta Y^{k,1} = V_{k,1}$ and $\Delta Y^{k,2} = V_{k,2}$ if model (1.1) and (1.2) holds, respectively. Define the generalized estimators $\hat t_{k,0,l} := V_{k,l}^t\,DJ_n^\tau D^t\,V_{k,l}$ and $\hat s_{k,0,l} := V_{k,l}^t\,DJ_nD^t\,V_{k,l} - \tfrac{7}{3}\pi^2\,\hat t_{k,0,l}$. There are matrices $C_{1,k,l}, C_{2,k} \in M_{n-1,n}$ such that
$$V_{k,l} = C_{1,k,l}\,\xi + C_{2,k}\,\epsilon, \qquad (A.3)$$
where $\epsilon = (\epsilon_{1,n},\ldots,\epsilon_{n,n})^t$ and $\xi = \xi_n$ is standard $n$-variate normal, $\epsilon, \xi$ are independent, and $C_{1,k,l}\,\xi = X_{1,k} + Z_{1,k,l}$, $C_{2,k}\,\epsilon = X_{2,k} + Z_{2,k}$. Now let
$$s_{k,p} := \int_0^1\sigma_k^2(x)\cos(p\pi x)\,dx, \qquad t_{k,p} := \int_0^1\tau_k^2(x)\cos(p\pi x)\,dx \qquad (A.4)$$
be the scaled $p$-th Fourier coefficients of the cosine series of $\sigma_k^2$ and $\tau_k^2$, respectively.
Define the sums $A(\sigma_k^2, r)$ by
$$A\big(\sigma_k^2, r\big) = \begin{cases} s_{k,0} + 2\sum_{m=1}^{\infty} s_{k,2nm} & \text{for } r \equiv 0 \bmod 2n, \\ 2\sum_{m=0}^{\infty} s_{k,2nm+n} & \text{for } r \equiv n \bmod 2n, \\ \sum_{q\equiv\pm r \bmod 2n,\ q\ge 0}\ s_{k,q} & \text{for } r \not\equiv 0, n \bmod 2n, \end{cases}$$
and analogously $A(\tau_k^2, r)$ with $s_{k,p}$ replaced by $t_{k,p}$. Some properties of these sums are given in Lemma D.1 and Lemma D.2. Further define
$$\Sigma_k := \mathrm{diag}\big(\sigma_k(1/n), \ldots, \sigma_k(1 - 1/n)\big). \qquad (A.5)$$
We put $\mathrm{Cum}_4(\epsilon) := \mathrm{Cum}_4(\epsilon_{1,n})$ for the fourth cumulant of $\epsilon_{1,n}$. If $X, Y$ are independent random vectors, we write $X \perp Y$.

A.2 Proofs for Estimation of $\sigma^2$

Lemma A.1.
Let $\hat s_{k,0}$ be defined as in (4.5). Further assume $\beta > 1/2$, $Q, \bar Q > 0$, $0 < \sigma^2_{\min} \le \sigma^2_{\max} < \infty$, $0 < \tau^2_{\min} \le \tau^2_{\max} < \infty$ and $k = k_n \in \mathbb{N}$.

(i) Assume model (1.1), $\alpha > 1/2$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \big|E(\hat s_{k,0}) - s_{k,0}\big| = O\big(n^{-1/2} + n^{-\beta} + n^{1/4-\alpha}\big), \qquad (A.6)$$
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \mathrm{Var}(\hat s_{k,0}) = O\big(n^{-1/2} + n^{1-2\beta}\big). \qquad (A.7)$$

(ii) Assume model (1.2), $\alpha > 3/4$. Then it holds
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \big|E(\hat s_{k,0}) - s_{k,0}\big| = O\big(n^{-\beta} + n^{1/2-\alpha} + n^{-1/2}\big), \qquad (A.8)$$
$$\sup_{\sigma^2,\tau^2}\ \max_{k\le n^{1/4}}\ \mathrm{Var}(\hat s_{k,0}) = O\big(n^{-1/2} + n^{1-2\beta}\big), \qquad (A.9)$$
where the suprema are taken over $\sigma^2 \in \Theta_s^b(\alpha,Q)$, $\tau^2 \in \Theta_s^b(\beta,\bar Q)$.

Proof.
The proof mainly uses the generalized estimators introduced in Section A.1. It is clear that for two centered random vectors $P$ and $Q$,
\[
\langle P, Q \rangle_\sigma := E\!\left( P^t D J_n D Q \right)
\]
defines a semi-inner product, and by Lemma D.5, $P \perp Q \Rightarrow \langle P, Q \rangle_\sigma = 0$. Hence
\[
E\,\hat s_{k,1,l} = \langle X_{1,k}, X_{1,k} \rangle_\sigma + \langle X_{2,k}, X_{2,k} \rangle_\sigma + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\sigma + \langle Z_{2,k}, Z_{2,k} \rangle_\sigma + 2 \langle X_{1,k}, Z_{1,k,l} \rangle_\sigma + 2 \langle X_{2,k}, Z_{2,k} \rangle_\sigma - \tfrac{\pi^2}{2}\, E\!\left( \hat t_{k,1,l} \right). \quad \text{(A.10)}
\]
Clearly, with (iii) in Lemma D.1 and $r_n := n^{-1/2}\left( [n^{1/2}] - [n^{1/4}] \right)$,
\[
\langle X_{1,k}, X_{1,k} \rangle_\sigma = \frac1n \operatorname{tr}\left( \Sigma_k D J_n D \Sigma_k \right) = \frac1n \operatorname{tr}\left( J_n D \Sigma_k^2 D \right) = n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \left( A(\sigma_k^2,0) - A(\sigma_k^2,2i) \right) = r_n A(\sigma_k^2,0) - n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} A(\sigma_k^2,2i).
\]
Hence, due to $r_n \le 1$ and $|r_n - 1| \le n^{-1/4}$,
\[
\left| \langle X_{1,k}, X_{1,k} \rangle_\sigma - s_{k,0} \right| \le n^{-1/4} |s_{k,0}| + 2 \sum_{m=n}^\infty |s_{k,m}| + \frac{2}{\sqrt n} \sum_{i=0}^\infty |s_{k,i}|,
\]
and with Lemma D.2,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, X_{1,k} \rangle_\sigma - s_{k,0} \right| = O\!\left( n^{-1/4} + n^{1/2-\alpha} \right).
\]
Next we treat $\langle X_{2,k}, X_{2,k} \rangle_\sigma$. In order to do this let $T_k \in D_{n-1}$ with entries $(T_k)_{i,j} = \tau_k(i/n)\,\delta_{i,j}$. Further we define $\tilde T_k \in M_{n-1}$ by
\[
(\tilde T_k)_{i,j} = \begin{cases} (\Delta_i \tau_k)^2, & i = j-1, \\ (\Delta_j \tau_k)^2, & i = j+1, \\ 0, & \text{otherwise}. \end{cases} \quad \text{(A.11)}
\]
Note the relation
\[
\operatorname{Cov}(X_{2,k}) = T_k K T_k = \tfrac12\, T_k^2 K + \tfrac12\, K T_k^2 + \tfrac12\, \tilde T_k. \quad \text{(A.12)}
\]
Using Lemma D.3 yields
\[
\langle X_{2,k}, X_{2,k} \rangle_\sigma = E\!\left( X_{2,k}^t D J_n D X_{2,k} \right) = \operatorname{tr}\left( D J_n D\, T_k K T_k \right) = \tfrac12 \operatorname{tr}\left( J_n D T_k^2 K D \right) + \tfrac12 \operatorname{tr}\left( J_n D K T_k^2 D \right) + \tfrac12 \operatorname{tr}\left( J_n D \tilde T_k D \right) = \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) + \tfrac12 \operatorname{tr}\left( J_n D \tilde T_k D \right), \quad \text{(A.13)}
\]
and further
\[
\operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) = n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i \left( A(\tau_k^2,0) - A(\tau_k^2,2i) \right) = A(\tau_k^2,0)\, n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i - n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i\, A(\tau_k^2,2i). \quad \text{(A.14)}
\]
Because $\max_{i=[n^{1/4}]+1,\dots,[n^{1/2}]} \lambda_i = \lambda_{[n^{1/2}]} \le \pi^2 n^{-1}$, it holds
\[
\left| n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i\, A(\tau_k^2,2i) \right| \le n^{-1/2} \sum_i \lambda_i \sum_{q \equiv \pm 2i \bmod 2n,\, q \ge 0} |t_{k,q}| \le \pi^2 n^{-3/2} \sum_{i=0}^\infty |t_{k,i}| \le 2\pi^2 n^{-3/2} \sum_{i=0}^\infty |t_{0,i}|.
\]
Therefore, (A.14) can be written as
\[
\left| \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) - t_{k,0}\, n^{-1/2} \sum_{i=[n^{1/4}]+1}^{[n^{1/2}]} \lambda_i \right| \le \pi^2 \sum_{m=n}^\infty |t_{k,m}| + 8\pi^2 n^{-1/2} \sum_{i=0}^\infty |t_{0,i}|.
\]
This gives, by Lemma D.7 (which evaluates $n^{-1/2}\sum_i \lambda_i$) and Lemma D.2,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \left| \operatorname{tr}\left( \Lambda J_n D T_k^2 D \right) - \tfrac{\pi^2}{2}\, t_{k,0} \right| = O\!\left( n^{-1/2} \right).
\]
Finally, $\operatorname{tr}(J_n) = O(n^{1/2})$. It follows that
\[
\left| \operatorname{tr}\left( J_n D \tilde T_k D \right) \right| \le \operatorname{tr}(J_n)\, \lambda_1\!\left( D \tilde T_k D \right) \le 2 \operatorname{tr}(J_n) \max_i (\Delta_i \tau_k)^2.
\]
So,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{tr}\left( J_n D \tilde T_k D \right) = O\!\left( n^{1/2} \bar\phi_n^2 \right) \quad \text{(A.15)}
\]
and therefore
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \left| \langle X_{2,k}, X_{2,k} \rangle_\sigma - \tfrac{\pi^2}{2}\, t_{k,0} \right| = O\!\left( n^{-1/2} + n^{1/2} \bar\phi_n^2 \right).
\]
We bound the remaining terms of (A.10). Note that
\[
\langle Z_{1,k,1}, Z_{1,k,1} \rangle_\sigma = \operatorname{tr}\left( D J_n D \operatorname{Cov}(Z_{1,k,1}) \right) \le \lambda_1\!\left( \operatorname{Cov}(Z_{1,k,1}) \right) \operatorname{tr}\left( D J_n D \right) \le 2 \phi_n^2,
\]
implying $\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \langle Z_{1,k,1}, Z_{1,k,1} \rangle_\sigma = O(\phi_n^2)$. In order to bound $\langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma$ define
\[
L := \left( \frac{(i \wedge j) + 1}{n} \right)_{i,j=1,\dots,n-1} \quad \text{(A.16)}
\]
and $\Delta\Sigma_k \in D_{n-1}$ by $(\Delta\Sigma_k)_{i,j} := f_k(i/n)\,(\Delta_i \sigma)\,\delta_{i,j}$. We obtain
\[
\langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma = \operatorname{tr}\left( D J_n D\, (\Delta\Sigma_k)\, L\, (\Delta\Sigma_k) \right) \le n^{1/2} \phi_n^2 \quad \text{(A.17)}
\]
and hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \langle Z_{1,k,2}, Z_{1,k,2} \rangle_\sigma = O\!\left( n^{1/2} \phi_n^2 \right). \quad \text{(A.18)}
\]
Similarly to (A.17), $\langle Z_{2,k}, Z_{2,k} \rangle_\sigma \le n^{1/2} \bar\phi_n^2$, and thus
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \langle Z_{2,k}, Z_{2,k} \rangle_\sigma = O\!\left( n^{1/2} \bar\phi_n^2 \right). \quad \text{(A.19)}
\]
Note that
\[
\operatorname{Cov}\!\left( X_{1,k}, Z_{1,k,2} \right)_{i,j} = \begin{cases} 0, & j < i, \\ n^{-1} f_k^2(i/n)\, \sigma(i/n)\, (\Delta_j \sigma), & j \ge i. \end{cases}
\]
Hence by Proposition D.1 we obtain
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, Z_{1,k,2} \rangle_\sigma \right| = O\!\left( n^{-1/2} \log n\; \phi_n \right).
\]
Applying the Cauchy–Schwarz inequality gives
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \left| \langle X_{1,k}, Z_{1,k,1} \rangle_\sigma \right| = O(\phi_n), \qquad \left| \langle X_{2,k}, Z_{2,k} \rangle_\sigma \right| \le \langle X_{2,k}, X_{2,k} \rangle_\sigma^{1/2}\, \langle Z_{2,k}, Z_{2,k} \rangle_\sigma^{1/2}.
\]
Using Proposition C.1 this yields (A.6) and (A.8). In order to give an upper bound for the variance of $\hat s_{k,1,l}$ note that
\[
\operatorname{Var}(\hat s_{k,1,l}) \le 2 \operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) + \tfrac{\pi^4}{2} \operatorname{Var}\!\left( \hat t_{k,1,l} \right).
\]
Furthermore, using (A.3) and Lemma D.3 (vi),
\[
V_{k,l}^t D J_n D V_{k,l} = \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \le 2\, \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi + 2\, \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon.
\]
Hence
\[
\operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) \le 8 \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) + 8 \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right).
\]
Finally, we bound $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi )$ and $\operatorname{Var}( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon )$ in two steps, which will be denoted by (a) and (b).

(a) By Lemma D.4 (iii), we have
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) = 2 \left\| J_n^{1/2} D \operatorname{Cov}\!\left( X_{1,k} + Z_{1,k,l} \right) D J_n^{1/2} \right\|_F^2 \le 4 \left\| J_n^{1/2} D \operatorname{Cov}(X_{1,k}) D J_n^{1/2} \right\|_F^2 + 16 \left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,l}) D J_n^{1/2} \right\|_F^2. \quad \text{(A.20)}
\]
Firstly,
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,1}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( \operatorname{Cov}(Z_{1,k,1}) \right) \operatorname{tr}\!\left( J_n^2 \right) \le n^{-1/2} \phi_n^2,
\]
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{1,k,2}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( D J_n D \right) \operatorname{tr}\!\left( \operatorname{Cov}(Z_{1,k,2})^2 \right) \le \lambda_1^2\!\left( D J_n D \right) \phi_n^4\, \| L \|_F^2 \le 4 n\, \phi_n^4,
\]
\[
\left\| J_n^{1/2} D \operatorname{Cov}(X_{1,k}) D J_n^{1/2} \right\|_F^2 \le \lambda_1^2\!\left( \operatorname{Cov}(X_{1,k}) \right) \operatorname{tr}\!\left( J_n^2 \right) \le 4 \sigma_{\max}^4\, n^{-1/2}.
\]
Therefore,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n D C_{1,k,l}\, \xi \right) = O\!\left( n^{-1/2} \right).
\]
(b) Next, we see with the same arguments as in (A.20) that
\[
\operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right) \le \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) D J_n^{1/2} \right\|_F^2 \le 2 \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}(X_{2,k}) D J_n^{1/2} \right\|_F^2 + 8 \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| J_n^{1/2} D \operatorname{Cov}(Z_{2,k}) D J_n^{1/2} \right\|_F^2.
\]
We obtain
\[
\left\| J_n^{1/2} D \operatorname{Cov}(Z_{2,k}) D J_n^{1/2} \right\|_F^2 \le \bar\phi_n^2\, \operatorname{tr}\!\left( J_n^2 \right) = 4 \bar\phi_n^2\, n^{-1/2}.
\]
From (A.11) it now follows that
\[
\left\| J_n^{1/2} D \operatorname{Cov}(X_{2,k}) D J_n^{1/2} \right\|_F^2 \le 2 \left\| J_n^{1/2} D T_k^2 K D J_n^{1/2} \right\|_F^2 + \tfrac34 \left\| J_n^{1/2} D \tilde T_k D J_n^{1/2} \right\|_F^2.
\]
Let $I_{n-1}$ be the $(n-1)\times(n-1)$ identity matrix. Since $J_n \Lambda \le \lambda_{[n^{1/2}]}\, n^{1/2}\, I_{n-1}$ on the frequencies selected by $J_n$, we have
\[
\left\| J_n^{1/2} D T_k^2 K D J_n^{1/2} \right\|_F^2 = \operatorname{tr}\!\left( J_n^{1/2} D T_k^2 \Lambda J_n \Lambda T_k^2 D J_n^{1/2} \right) \le \lambda_{[n^{1/2}]}\, n^{1/2}\, \operatorname{tr}\!\left( J_n^{1/2} D T_k^2 \Lambda T_k^2 D J_n^{1/2} \right) \le \lambda_{[n^{1/2}]}^2\, n^{1/2}\, \tau_{\max}^4\, \operatorname{tr}(J_n).
\]
Also
\[
\left\| J_n^{1/2} D \tilde T_k D J_n^{1/2} \right\|_F^2 \le \lambda_1^2(J_n)\, \| \tilde T_k \|_F^2 \le n\, \bar\phi_n^4,
\]
and therefore
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n D C_{2,k}\, \epsilon \right) = O\!\left( n^{-1/2} + n\, \bar\phi_n^4 \right).
\]
Combining (a) and (b) gives
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \operatorname{Var}\!\left( V_{k,l}^t D J_n D V_{k,l} \right) = O\!\left( n^{-1/2} + n\, \bar\phi_n^4 \right),
\]
so (A.7) and (A.9) follow with Lemma C.1 and Proposition C.1. $\square$

Proof of Theorem 2. We decompose
\[
\operatorname{MISE}\!\left( \hat\sigma_N^2 \right) = \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma_N^2(t) \right) dt + \int_0^1 \operatorname{Var}\!\left( \hat\sigma_N^2(t) \right) dt.
\]
We have that $\sigma^2(t) = \sum_{i=0}^\infty \langle \psi_i, \sigma^2 \rangle\, \psi_i(t)$, where $\langle \cdot, \cdot \rangle$ denotes the standard scalar product on $L^2[0,1]$. Define $\eta_{k,n,l} := E(\hat s_{k,1,l}) - s_{k,0}$. Then for $i \ge 1$,
\[
E\!\left( 2\hat s_{i,1,l} - \hat s_{0,1,l} \right) = 2^{1/2} \langle \psi_i, \sigma^2 \rangle + 2\eta_{i,n,l} - \eta_{0,n,l}.
\]
Hence
\[
\int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = \eta_{0,n,l}^2 + \frac12 \sum_{i=1}^N \left( 2\eta_{i,n,l} - \eta_{0,n,l} \right)^2 + \sum_{i=N+1}^\infty \langle \psi_i, \sigma^2 \rangle^2
\]
and we obtain
\[
\eta_{0,n,l}^2 + \frac12 \sum_{i=1}^N \left( 2\eta_{i,n,l} - \eta_{0,n,l} \right)^2 \le (8N+1) \max_{i=0,\dots,N} \eta_{i,n,l}^2.
\]
Because $\sigma \in \Theta_s(\alpha,Q)$ it holds that
\[
\sum_{i=N+1}^\infty \langle \psi_i, \sigma^2 \rangle^2 \le (N+1)^{-2\alpha} \sum_{i=1}^\infty i^{2\alpha} \langle \psi_i, \sigma^2 \rangle^2 \le Q\, (N+1)^{-2\alpha}. \quad \text{(A.21)}
\]
Therefore,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{i=0,\dots,N} \eta_{i,n,l}^2 + N^{-2\alpha} \right).
\]
Assume model (1.1); then by Lemma A.1,
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N n^{-1/2} + N n^{-2\beta} + N n^{1-2\alpha} + N^{-2\alpha} \right),
\]
and for model (1.2),
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N n^{-2\beta} + N n^{1-2\alpha} + N n^{-1/2} + N^{-2\alpha} \right).
\]
For the variance,
\[
\int_0^1 \operatorname{Var}\!\left( \hat\sigma^2(t) \right) dt = \operatorname{Var}(\hat s_{0,1,l}) + \frac12 \sum_{i=1}^N \operatorname{Var}\!\left( 2\hat s_{i,1,l} - \hat s_{0,1,l} \right) \le (4N+1) \operatorname{Var}(\hat s_{0,1,l}) + 4 \sum_{i=1}^N \operatorname{Var}(\hat s_{i,1,l}).
\]
Hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Var}\!\left( \hat\sigma^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \operatorname{Var}(\hat s_{k,1,l}) \right).
\]
Using Lemma A.1 yields the result. $\square$
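The bound just derived has, for each truncation index $N$, the generic shape $C_1 N \rho_n + C_2 N^{-2\alpha}$, where $\rho_n$ collects the per-coefficient rates from Lemma A.1; the optimal $N$ balances the two terms. A minimal numerical sketch of this balancing (the values of `rho` and `alpha` are illustrative assumptions, not values from the paper):

```python
import numpy as np

alpha, rho = 1.0, 1e-4                       # smoothness and per-coefficient error level (illustrative)
N = np.arange(1, 2001)
bound = N * rho + N ** (-2.0 * alpha)        # shape of the MISE bound: O(N * rho_n + N^{-2 alpha})
N_star = int(N[np.argmin(bound)])            # numerically optimal truncation index
N_balance = rho ** (-1.0 / (2 * alpha + 1))  # from balancing N * rho = N^{-2 alpha}
```

The exact minimizer is $(2\alpha/\rho)^{1/(2\alpha+1)}$, a constant multiple of the balancing value; with $\rho_n = n^{-1/2}$, say, this gives an MISE bound of order $n^{-\alpha/(2\alpha+1)}$, reflecting the additional degree of ill-posedness of $1/2$ mentioned in the introduction.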
Appendix B Sobolev s-ellipsoids
In this appendix we briefly discuss the function space introduced in Section 3 and provide a theorem needed for the lower bound. First recall the classical definition of Sobolev ellipsoids (cf. Proposition 1.14 in [25]).
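Before recalling the definitions, note that the identity behind the equivalence proved in Theorem B.1 below, $\int_0^1 (f^{(\alpha)}(x))^2\,dx = 2\pi^{2\alpha} \sum_k k^{2\alpha} \theta_k^2$ for the cosine coefficients $\theta_k = \int_0^1 f(x)\cos(k\pi x)\,dx$, can be checked numerically. A sketch for $\alpha = 1$ and $f(x) = \cos(\pi x)$ (a deliberately simple choice: $\theta_1 = 1/2$ and all other $\theta_k$ vanish):

```python
import numpy as np

# f(x) = cos(pi*x): cosine coefficient theta_1 = 1/2, theta_k = 0 otherwise; take alpha = 1.
x = (np.arange(200_000) + 0.5) / 200_000
lhs = float(np.mean((np.pi * np.sin(np.pi * x)) ** 2))   # int_0^1 (f'(x))^2 dx = pi^2/2
theta1 = float(np.mean(np.cos(np.pi * x) ** 2))          # theta_1 = int_0^1 f(x) cos(pi*x) dx = 1/2
rhs = 2 * np.pi ** 2 * theta1 ** 2                       # 2*pi^{2 alpha} * sum_k k^{2 alpha} theta_k^2
```

Both sides evaluate to $\pi^2/2$, consistent with the constant $\bar C = 2\pi^{2\alpha} C$ in Theorem B.1.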
Definition B.1.
Define
\[
a_{j,\alpha} = \begin{cases} j^\alpha, & \text{for even } j, \\ (j-1)^\alpha, & \text{for odd } j. \end{cases}
\]
Let $\{\varphi_j\}_{j=1}^\infty$, $\varphi_1(x) := 1$, $\varphi_{2j}(x) := \sqrt 2 \cos(2\pi j x)$, $\varphi_{2j+1}(x) := \sqrt 2 \sin(2\pi j x)$, denote the trigonometric basis on $[0,1]$. Then we call the function space
\[
\Theta(\alpha, C) := \left\{ f \in L^2[0,1] : \exists\, (\theta_j)_{j=1}^\infty \text{ s.t. } f(x) = \sum_{j=1}^\infty \theta_j \varphi_j(x),\ \sum_{j=1}^\infty a_{j,\alpha}^2 \theta_j^2 \le C \right\}
\]
a Sobolev ellipsoid.

Interesting characterizations arise if we put Sobolev $s$-ellipsoids into relation with Sobolev ellipsoids:

Remark B.1.
Let $S$ be the class of all symmetric functions $f \in L^2[0,1]$, i.e. such that $f(x) = f(1-x)$ for all $x \in [0,1]$. Further let $\Theta(\alpha,C)$ be a Sobolev ellipsoid. Then a function belongs to $\Theta_s(\alpha,C) \cap S$ if and only if it belongs to $\Theta(\alpha,C) \cap S$.

Define
\[
W(\alpha, \bar C) := W(\alpha, \bar C, [0,1]) := \left\{ f \in L^2[0,1] : f^{(l)}(0) = f^{(l)}(1) = 0 \text{ for } l \text{ odd},\ l < \alpha,\ \int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx \le \bar C \right\}. \quad \text{(B.1)}
\]
For positive integer values of $\alpha$, we have the following equivalence.

Theorem B.1.
Assume $\alpha \in \{1, 2, \dots\}$, $C > 0$. Let $\bar C = 2\pi^{2\alpha} C$. Then a function is in $W(\alpha, \bar C)$ if and only if it is in $\Theta_s(\alpha, C)$.

Proof. First we show that if a function $f \in W(\alpha,\bar C)$ then also $f \in \Theta_s(\alpha, C)$. Let $\tilde f$ be defined on $[-1,1]$ by
\[
\tilde f(x) := \begin{cases} f(x), & x \in [0,1], \\ f(-x), & x \in [-1,0). \end{cases}
\]
Note that $\tilde f$ is an $\alpha$-times differentiable function, $\tilde f^{(l)}$ is even if $l$ is even and $\tilde f^{(l)}$ is odd if $l$ is odd. Let
\[
s_k(j) := \begin{cases} \int_{-1}^1 \tilde f^{(j)}(x)\, dx, & k = 0, \\ \int_{-1}^1 \tilde f^{(j)}(x) \cos(k\pi x)\, dx, & k \ge 1,\ j \text{ even}, \\ \int_{-1}^1 \tilde f^{(j)}(x) \sin(k\pi x)\, dx, & k \ge 1,\ j \text{ odd}. \end{cases}
\]
It holds for $j \ge 1$ that
\[
s_0(j) = \int_{-1}^1 \tilde f^{(j)}(x)\, dx = \tilde f^{(j-1)}(1) - \tilde f^{(j-1)}(-1) = 0.
\]
Hence we have the Parseval-type equality
\[
\int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx = \frac12 \sum_{k=1}^\infty s_k(\alpha)^2. \quad \text{(B.2)}
\]
Further, for $k \ge 1$ and $j$ even, it follows by partial integration that
\[
s_k(j) = \int_{-1}^1 \tilde f^{(j)}(x) \cos(k\pi x)\, dx = \tilde f^{(j-1)}(x) \cos(k\pi x) \Big|_{-1}^1 + k\pi \int_{-1}^1 \tilde f^{(j-1)}(x) \sin(k\pi x)\, dx = k\pi\, s_k(j-1),
\]
and for $k \ge 1$, $j$ odd,
\[
s_k(j) = \int_{-1}^1 \tilde f^{(j)}(x) \sin(k\pi x)\, dx = \tilde f^{(j-1)}(x) \sin(k\pi x) \Big|_{-1}^1 - k\pi \int_{-1}^1 \tilde f^{(j-1)}(x) \cos(k\pi x)\, dx = -k\pi\, s_k(j-1).
\]
With $\theta_k = \int_0^1 f(x) \cos(k\pi x)\, dx$ it follows for $k \ge 1$ that $s_k(\alpha)^2 = k^{2\alpha} \pi^{2\alpha} s_k(0)^2 = 4 k^{2\alpha} \pi^{2\alpha} \theta_k^2$. Combining this result with (B.2) yields
\[
\int_0^1 \left( f^{(\alpha)}(x) \right)^2 dx = 2\pi^{2\alpha} \sum_{k=1}^\infty k^{2\alpha} \theta_k^2
\]
and hence proves the first part of the theorem. The other direction follows in a straightforward way by differentiation and is thus omitted. $\square$

Supplementary Material
Supplement: Proofs for the upper bound of $\hat\tau_N$ and further technicalities

Acknowledgments
We are grateful to T. Tony Cai, Marc Hoffmann, Mark Podolskij and Ingo Witt for helpfulcomments and discussions.
References

[1] F. Bandi and J. Russell. Microstructure noise, realized variance, and optimal sampling. Rev. Econom. Stud., 75:339–369, 2008.
[2] O. Barndorff-Nielsen, P. Hansen, A. Lunde, and N. Shephard. Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise. Econometrica, 76(6):1481–1536, 2008.
[3] E. Barucci, P. Malliavin, and M. E. Mancino. Harmonic analysis methods for nonparametric estimation of volatility: theory and applications. In Stochastic Processes and Applications to Mathematical Finance. Proceedings of the 5th Ritsumeikan International Symposium, Kyoto, Japan, March 3–6, 2005, pages 1–34, 2006.
[4] N. Bissantz, T. Hohage, A. Munk, and F. Ruymgaart. Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM J. Numer. Anal., 45:2610–2636, 2007.
[5] V. Britanak, P. C. Yip, and K. R. Rao. Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations. Academic Press, 2006.
[6] C. Butucea and A. B. Tsybakov. Sharp optimality for density deconvolution with dominating bias, I. Theory Probab. Appl., 52(1):111–128, 2007.
[7] C. Butucea and A. B. Tsybakov. Sharp optimality for density deconvolution with dominating bias, II. Theory Probab. Appl., 52(2):336–349, 2007.
[8] T. Cai, A. Munk, and J. Schmidt-Hieber. Sharp minimax estimation of the variance of Brownian motion corrupted with Gaussian noise. Statist. Sinica, 2009. Forthcoming.
[9] A. Delaigle and I. Gijbels. Practical bandwidth selection in deconvolution kernel density estimation. Comput. Stat. Data Anal., 45(2):249–267, 2004.
[10] A. K. Dey, B. A. Mair, and F. H. Ruymgaart. Cross-validation for parameter selection in inverse estimation problems. Scand. J. Statist., 23:609–620, 1996.
[11] J. Fan. On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist., 19(3):1257–1272, 1991.
[12] J. Fan, J. Jiang, C. Zhang, and Z. Zhou. Time-dependent diffusion models for term structure dynamics. Statist. Sinica, 13:965–992, 2003.
[13] A. Gloter and J. Jacod. Diffusions with measurement errors. I. Local asymptotic normality. ESAIM Probab. Stat., 5:225–242, 2001.
[14] A. Gloter and J. Jacod. Diffusions with measurement errors. II. Optimal estimators. ESAIM Probab. Stat., 5:243–260, 2001.
[15] J. Jacod, Y. Li, P. A. Mykland, M. Podolskij, and M. Vetter. Microstructure noise in the continuous case: the pre-averaging approach. Stochastic Process. Appl., 119(7):2249–2276, 2009.
[16] A. Madhavan. Market microstructure: a survey. Journal of Financial Markets, 3:205–258, 2000.
[17] A. Munk and J. Schmidt-Hieber. Lower bounds for volatility estimation in microstructure noise models. arXiv:1002.3045, Math arXiv Preprint.
[18] A. Neubauer. The convergence of a new heuristic parameter selection criterion for general regularization methods. Inverse Problems, 24:055005, 2008.
[19] S. Pereverzev and E. Schock. On the adaptive selection of the parameter in regularization of ill-posed problems. SIAM J. Numer. Anal., 43(5):2060–2076, 2005.
[20] M. Podolskij and M. Vetter. Estimation of volatility functionals in the simultaneous presence of microstructure noise and jumps. Bernoulli, 2009. Forthcoming.
[21] M. Reiß. Asymptotic equivalence and sufficiency for volatility estimation under microstructure noise. arXiv:1001.3006, Math arXiv Preprint.
[22] M. Rosenbaum. Integrated volatility and round-off error. Bernoulli, 2009. Forthcoming.
[23] M. Steele. Stochastic Calculus and Financial Applications. Springer, New York, 2001.
[24] M. Stein. Minimum norm quadratic estimation of spatial variograms. J. Amer. Statist. Assoc., 82(399):765–772, 1987.
[25] A. B. Tsybakov. Introduction to Nonparametric Estimation (Springer Series in Statistics). Springer-Verlag, New York, 2009.
[26] L. Zhang. Efficient estimation of stochastic volatility using noisy observations: a multi-scale approach. Bernoulli, 12:1019–1043, 2006.
[27] L. Zhang, P. Mykland, and Y. Aït-Sahalia. A tale of two time scales: determining integrated volatility with noisy high-frequency data.
J. Amer. Statist. Assoc. ,472:1394–1411, 2005. 25igure 1: Plots 1 and 2 display paths of sBM and tBM corresponding to σ ( t ) = (1 / − t ) + (Plot 3). Analogously, Plots 4 and 5 show paths of sBM and tBM with σ ( t ) = 1+ I ( 1 / , ( t )(Plot 6). For Plots 1 and 2 as well as Plots 4 and 5 we took the same realization ( W t ) t ∈ [0 , of the underlying Brownian motion. The first two plots show the different scaling behavior:sBM= 0 and tBM= (cid:82) / σ ( s ) dW s for t > /
2. On the other hand we see by Plots 4 and 5that a jump induces a random shift, i.e. sBM=tBM for t ≤ / W / =sBM for t > / n = 25000 data points from model (1.1), (cid:15) i,n ∼ N (0 , τ = 0 . σ ( t ) =(2 + cos (2 πt )) / . Plot 1 shows the data. Additionally to the data, we plotted the pathof the tBM in Plot 2. The reconstruction of τ and σ (dashed lines) as well as the truefunction (solid lines) are given in Plot 3 and 4, respectively. The threshold parameters wereselected as N ∗ = 1 for estimation of τ and N ∗ = 3 for estimation of σ . τ and ˆ σ is quite robust to heavy-tailednoise. The threshold parameters N ∗ were selected as 1 and 3 for estimation of τ and σ ,respectively. 28igure 4: (Low-smoothness) As Figure 2 but we chose σ ( t ) = 3 (cid:12)(cid:12)(cid:12) ˜ W t (cid:12)(cid:12)(cid:12) , where (cid:16) ˜ W t (cid:17) t ∈ [0 , denotes a Brownian motion independent of the noise and the Brownian motion in (1.1).The estimator returns a smoothed version of the path. The threshold parameters N ∗ wereselected as 1 and 17 for estimation of τ and σ , respectively.29igure 5: (Jump function) As Figure 2 but we chose σ ( t ) = 1 + I ( 1 / , ( t ) . The Gibbsphenomenon is clearly visible. The threshold parameters N ∗ were selected as 1 and 10 forestimation of τ and σ , respectively. 30 upplementary Material: Nonparametric Estimation of theVolatility Function in a High-Frequency Model corrupted byNoise Axel Munk ∗ and Johannes Schmidt-Hieber Institut f¨ur Mathematische Stochastik, Universit¨at G¨ottingen,Goldschmidtstr. 7, 37077 G¨ottingen
Email: [email protected], [email protected]

Abstract
This note provides proofs and supplementary technicalities for the paper "Nonparametric Estimation of the Volatility Function in a High-Frequency Model Corrupted by Noise". In particular, a proof of the rate of convergence of the estimator $\hat\tau_N$ is given.
Primary 62M09, 62M10; secondary 62G08.
Keywords:
Brownian motion; Variance estimation; Minimax rate; Microstructure noise;Sobolev Embedding.
Appendix C Convergence Rate of $\hat\tau$

C.1 Preliminary Results and Notation
First we recall some notation. Let $\sigma_k(i/n) := \sigma(i/n) f_k(i/n)$ and $\tau_k(i/n) := \tau(i/n) f_k(i/n)$. Throughout the following, the constants of the Sobolev $s$-ellipsoids in Definition 1 are $l = \sigma_{\min}$, $u = \sigma_{\max}$ for $\sigma$ and $l = \tau_{\min}$, $u = \tau_{\max}$ for $\tau$. We define $K_n := n^{1/2}/\log n$ and
\[
\phi_n := \sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{i=1,\dots,n-1} \sup_{\xi \in [i/n,(i+1)/n]} \left| \sigma(\xi) - \sigma\!\left(\tfrac in\right) \right|, \qquad
\bar\phi_{n,1/2} := \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \max_{i=1,\dots,n} \sup_{\xi_i \in [(i-1)/n,\, i/n]} \left| \tau_k(\xi_i) - \tau_k\!\left(\tfrac in\right) \right|.
\]

Proposition C.1.
Assume $\alpha, \beta > 1/2$. It holds for any $\delta > 0$ that
\[
\phi_n = O\!\left( n^{1/2-\alpha} + n^{\delta - 1} \right), \qquad \bar\phi_n = O\!\left( n^{1/2-\beta} + n^{-1/2} \right), \qquad \bar\phi_{n,1/2} = O\!\left( n^{1/2-\beta} + n^{-1/2} \log^{-1} n \right).
\]

Proof. We only prove the third equality; the other two can be deduced similarly. Note that for $\tau \in \Theta_b(\beta, Q)$,
\[
\left| \tau_k(\xi_i) - \tau_k\!\left(\tfrac in\right) \right| \le \sqrt 2 \left| \tau(\xi_i) - \tau\!\left(\tfrac in\right) \right| + \tau_{\max}\, \sqrt 2\, k\pi/(2n) \le \frac{\sqrt 2}{\tau_{\min}} \left| \tau^2(\xi_i) - \tau^2\!\left(\tfrac in\right) \right| + \tau_{\max}\, \sqrt 2\, k\pi/(2n).
\]
Taking suprema (note that $k \le K_n$ implies $k\pi/(2n) \le \pi n^{-1/2} \log^{-1} n / 2$) and applying Lemma D.8 gives the result. $\square$
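The $k\pi/(2n)$ term above comes from a Lipschitz bound on the oscillating factor $f_k$. Assuming the cosine form $f_k(t) = \sqrt 2 \cos(k\pi t)$ (the basis functions are not restated in this chunk, so this form is an assumption of the sketch), one has $|f_k'| \le \sqrt 2\, k\pi$, and the grid modulus can be checked numerically:

```python
import numpy as np

n, k = 1000, 30
t = np.arange(n) / n                                  # grid points i/n
xi = t + 0.5 / n                                      # points inside the cells [i/n, (i+1)/n]
f_k = lambda x: np.sqrt(2) * np.cos(k * np.pi * x)    # assumed form of the basis function
modulus = float(np.max(np.abs(f_k(xi) - f_k(t))))     # grid modulus of continuity of f_k
lipschitz = np.sqrt(2) * k * np.pi * 0.5 / n          # |f_k'| <= sqrt(2)*k*pi, step size 1/(2n)
```

With $k \le K_n = n^{1/2}/\log n$ this Lipschitz contribution is of order $n^{-1/2}\log^{-1} n$, matching the third rate in Proposition C.1.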
C.2 Proofs for Estimation of $\tau^2$

Lemma C.1.
Let $\hat t_{k,1}$ be defined as in (4.1). Further assume $\alpha, \beta > 1/2$, $Q, \bar Q > 0$, $0 < \sigma_{\min} \le \sigma_{\max} < \infty$, $0 < \tau_{\min} \le \tau_{\max} < \infty$ and $k = k_n \in \mathbb N$. Assume model (1.1). Then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = O\!\left( n^{1/2-\beta} \log^{1/2} n \right) + o\!\left( n^{-1/2} \right), \quad \text{(C.1)}
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \hat t_{k,1} \right) = O\!\left( n^{-1} \right). \quad \text{(C.2)}
\]
If further $\epsilon$ is $n$-variate standard normal, then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( n^{-1} \log^{-1} n \right), \quad \text{(C.3)}
\]
\[
n^{1/2} \left( \hat t_{k,1} - t_{k,0} \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right) \qquad \text{for } \beta > 1,\ k \le K_n. \quad \text{(C.4)}
\]
Assume model (1.2). Then it holds that
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = O\!\left( n^{1/2-\beta} \log^{1/2} n + n^{1/2-\alpha} \log n \right) + o\!\left( n^{-1/2} \right), \quad \text{(C.5)}
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \hat t_{k,1} \right) = O\!\left( n^{1-2\alpha} \log^2 n + n^{-1} \right). \quad \text{(C.6)}
\]
If further $\epsilon$ is $n$-variate standard normal, then
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( n^{1-2\alpha} \log^2 n + n^{-1} \log^{-1} n \right), \quad \text{(C.7)}
\]
\[
n^{1/2} \left( \hat t_{k,1} - t_{k,0} \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right) \qquad \text{for } \alpha > 3/4,\ \beta > 1,\ k \le K_n. \quad \text{(C.8)}
\]

Proof. Again we work with the generalized estimators as introduced in Section A.1.
As in the proof of Lemma A.1 we introduce, for two centered random vectors $P$ and $Q$, the semi-inner product $\langle P, Q \rangle_\tau := E\!\left( P^t D J_n^\tau D^t Q \right)$ and obtain
\[
E\,\hat t_{k,1,l} = \langle X_{1,k}, X_{1,k} \rangle_\tau + \langle X_{2,k}, X_{2,k} \rangle_\tau + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau + \langle Z_{2,k}, Z_{2,k} \rangle_\tau + 2 \langle X_{1,k}, Z_{1,k,l} \rangle_\tau + 2 \langle X_{2,k}, Z_{2,k} \rangle_\tau. \quad \text{(C.9)}
\]
First we bound $\langle X_{2,k}, X_{2,k} \rangle_\tau$, which will turn out to be the leading term. Similarly to (A.13) we have
\[
\langle X_{2,k}, X_{2,k} \rangle_\tau = \operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) + \tfrac12 \operatorname{tr}\!\left( J_n^\tau D^t \tilde T_k D \right),
\]
and due to
\[
\operatorname{tr}\!\left( J_n^\tau \right) = O(\log n) \quad \text{(C.10)}
\]
the same argument as for (A.15) gives
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{tr}\!\left( J_n^\tau D \tilde T_k D \right) = O\!\left( \bar\phi_{n,1/2}^2 \log n \right).
\]
Hence this is a negligible term. Using Lemma D.1 (iii),
\[
\operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) = (n - n/\log n)^{-1} \sum_{i=[n/\log n]}^{n-1} \left( A(\tau_k^2, 0) - A(\tau_k^2, 2i) \right) = \bar r_n A(\tau_k^2, 0) - (n - n/\log n)^{-1} \sum_{i=[n/\log n]}^{n-1} A(\tau_k^2, 2i),
\]
where $\bar r_n = (n - [n/\log n])/(n - n/\log n)$. Note $|1 - \bar r_n| \le 1/(n - n/\log n)$. By Lemma D.2,
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{tr}\!\left( \Lambda J_n^\tau D^t T_k^2 D \right) - t_{k,0} \right| \le \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left( |1 - \bar r_n|\, |t_{k,0}| + \sum_{m=n}^\infty |t_{k,m}| + 2 (n - n/\log n)^{-1} \sum_{i=0}^\infty |t_{k,i}| \right) \le C_{\beta,Q}\, n^{1/2-\beta} + 6 (n - n/\log n)^{-1} \sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \sum_{i=0}^\infty |t_{0,i}| = O\!\left( n^{-1} + n^{1/2-\beta} \right).
\]
This shows that
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \langle X_{2,k}, X_{2,k} \rangle_\tau - t_{k,0} \right| = O\!\left( n^{-1} + \bar\phi_{n,1/2}^2 \log n + n^{1/2-\beta} \right).
\]
Next,
\[
\langle X_{1,k}, X_{1,k} \rangle_\tau = \frac1n \operatorname{tr}\!\left( D J_n^\tau D^t \Sigma_k^2 \right) \le \frac{2 \sigma_{\max}^2}{n} \operatorname{tr}\!\left( J_n^\tau \right),
\]
implying $\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle X_{1,k}, X_{1,k} \rangle_\tau = O(n^{-1} \log n)$. We obtain with Lemma D.6, in the same way as in (A.17), (A.18) and (A.19),
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle Z_{1,k,1}, Z_{1,k,1} \rangle_\tau = O\!\left( n^{-1} \log n\, \phi_n^2 \right), \qquad
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \langle Z_{1,k,2}, Z_{1,k,2} \rangle_\tau = O\!\left( \log n\, \phi_n^2 \right),
\]
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \langle Z_{2,k}, Z_{2,k} \rangle_\tau = O\!\left( \bar\phi_{n,1/2}^2 \log n \right).
\]
From the Cauchy–Schwarz inequality it follows that
\[
\left| \langle X_{1,k}, Z_{1,k,l} \rangle_\tau \right| \le \langle X_{1,k}, X_{1,k} \rangle_\tau^{1/2}\, \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau^{1/2} \le \langle X_{1,k}, X_{1,k} \rangle_\tau + \langle Z_{1,k,l}, Z_{1,k,l} \rangle_\tau, \qquad
\left| \langle X_{2,k}, Z_{2,k} \rangle_\tau \right| \le \langle X_{2,k}, X_{2,k} \rangle_\tau^{1/2}\, \langle Z_{2,k}, Z_{2,k} \rangle_\tau^{1/2}.
\]
This yields
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| E\,\hat t_{k,1} - t_{k,0} \right| = \begin{cases} O\!\left( n^{-1} \log n + n^{1/2-\beta} + \bar\phi_{n,1/2} \log^{1/2} n \right), & l = 1, \\ O\!\left( n^{-1} \log n + n^{1/2-\beta} + \bar\phi_{n,1/2} \log^{1/2} n + \phi_n \log n \right), & l = 2, \end{cases}
\]
and therefore (C.5) and (C.1) hold by Proposition C.1. In order to calculate the variance we use the decomposition (A.3). We have
\[
\hat t_{k,1,l} = \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon.
\]
Using the Cauchy–Schwarz inequality repeatedly, we can write
\[
\left| \operatorname{Var}\!\left( \hat t_{k,1,l} \right) - \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right| \quad \text{(C.11)}
\]
\[
\le \left( \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) + 2 \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right)^2 + 2 \left( \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) + 2 \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right) \cdot \operatorname{Var}^{1/2}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right).
\]
We subdivide the remaining part of the proofs of (C.6) and (C.2) into three steps (a), (b) and (c), where we bound $\operatorname{Var}( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon )$, $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi )$ and $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon )$, respectively.

(a) Let $\operatorname{TrSq}(A) := \sum_{i=1}^n A_{i,i}^2$ for $A \in M_n$. Then by Lemma D.5 it follows that
\[
\operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) = 2 \left\| C_{2,k}^t D J_n^\tau D C_{2,k} \right\|_F^2 + \operatorname{Cum}_4(\epsilon)\, \operatorname{TrSq}\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \le \left( 2 + \operatorname{Cum}_4(\epsilon) \right) \left\| (J_n^\tau)^{1/2} D \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) D (J_n^\tau)^{1/2} \right\|_F^2,
\]
where equality holds if $\operatorname{Cum}_4(\epsilon) = 0$. By Proposition D.2 we see that
\[
\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( \operatorname{Cum}_4(\epsilon)\, n^{-1} + n^{-1} \log^{-1} n \right).
\]
(b) In this part of the proof we bound $\operatorname{Var}( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi )$. Similarly to part (a) in the proof of Lemma A.1 it holds that
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) \le 4 \lambda_1^2\!\left( J_n^\tau \right) \left( \left\| \operatorname{Cov}(X_{1,k}) \right\|_F^2 + \left\| \operatorname{Cov}(Z_{1,k,l}) \right\|_F^2 \right) \le \begin{cases} n^{-2} \log^2 n\, \left( n^{-1} \sigma_{\max}^4 + 4 n^{-1} \phi_n^2 \right), & l = 1, \\ n^{-2} \log^2 n\, \left( n^{-1} \sigma_{\max}^4 + 4 n\, \phi_n^2 \right), & l = 2, \end{cases}
\]
where we used Lemma D.6 in the second inequality. Hence we get
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) = \begin{cases} O\!\left( n^{-3} \log^2 n \right), & l = 1, \\ O\!\left( \log^2 n\, \left( n^{-1} \phi_n^2 + n^{-3} \right) \right), & l = 2. \end{cases} \quad \text{(C.12)}
\]
(c) Using Lemma D.5 (ii),
\[
\operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \le \sqrt 2\, \operatorname{Var}^{1/2}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi \right) \left\| C_{2,k}^t D J_n^\tau D C_{2,k} \right\|_F
\]
and hence
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \operatorname{Var}\!\left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) = \begin{cases} O\!\left( n^{-2} \log n \right), & l = 1, \\ O\!\left( n^{-1/2} \log n\, \left( \phi_n + n^{-1/2} \right) \right), & l = 2. \end{cases}
\]
Combining (a)–(c) yields
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le K_n} \left| \operatorname{Var}\!\left( \hat t_{k,1} \right) - \frac2n \int_0^1 \tau_k^4(x)\, dx \right| = O\!\left( \frac{\operatorname{Cum}_4(\epsilon)}{n} + \frac{1}{n \log n} \right) + \begin{cases} 0, & l = 1, \\ O\!\left( \phi_n^2 \log^2 n + \phi_n\, n^{-1/2} \log n \right), & l = 2, \end{cases} \quad \text{(C.13)}
\]
and hence (C.2), (C.3), (C.6) and (C.7) follow using Proposition C.1.

Finally, we show the asymptotic normality (C.8) and (C.4). Because of the decomposition (A.3), we have
\[
\hat t_{k,1,l} = \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon + \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon.
\]
As proved above,
\[
n^{1/2} \left( \xi^t C_{1,k,l}^t D J_n^\tau D C_{1,k,l}\, \xi + 2\, \xi^t C_{1,k,l}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \xrightarrow{\ P\ } 0
\]
for $\beta > 1$, $\alpha > 1/2$ if $l = 1$, and for $\beta > 1$, $\alpha > 3/4$ if $l = 2$. Hence by Slutsky's lemma it suffices to show that
\[
n^{1/2} \left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon - E\!\left( \epsilon^t C_{2,k}^t D J_n^\tau D C_{2,k}\, \epsilon \right) \right) \xrightarrow{\ \mathcal L\ } \mathcal N\!\left( 0,\ 2 \int_0^1 \tau_k^4(x)\, dx \right).
\]
In order to apply Theorem D.1, it remains to show $n^{1/2} \lambda_1\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \to 0$. Using Corollary D.1, we see that
\[
n^{1/2} \lambda_1\!\left( C_{2,k}^t D J_n^\tau D C_{2,k} \right) \le n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}\!\left( X_{2,k} + Z_{2,k} \right) \right) \le 2 n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}(X_{2,k}) \right) + 8 n^{-1/2} \log n\; \lambda_1\!\left( \operatorname{Cov}(Z_{2,k}) \right) \le 2 n^{-1/2} \log n\, \sup_{t \in [0,1]} \tau_k^2(t) \max_i \lambda_i + 8 n^{-1/2} \log n\, \bar\phi_{n,1/2} = o(1),
\]
which yields the last statement of the lemma. $\square$

Proof of Theorem 1.
The proof is close to the one of Theorem 2. We obtain
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Bias}^2\!\left( \hat\tau_N^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \left| E\!\left( \hat t_{k,1,l} \right) - t_{k,0} \right|^2 + N^{-2\beta} \right),
\]
\[
\sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \int_0^1 \operatorname{Var}\!\left( \hat\tau_N^2(t) \right) dt = O\!\left( N \sup_{\sigma \in \Theta_{bs}(\alpha,Q),\, \tau \in \Theta_{bs}(\beta,\bar Q)} \max_{0 \le k \le N} \operatorname{Var}\!\left( \hat t_{k,1,l} \right) \right).
\]

Appendix D Technical Results

Proposition D.1.
Let $A \in M_{n-1}$. Then
\[
\operatorname{tr}\!\left( J_n D A D \right) \le \left( n^{1/2} + 5 n^{1/4} + 8 n^{1/2} (1 + \log n) \right) \max_{i,j} \left| (A)_{i,j} \right|.
\]

Proof.
Write $A = (a_{i,j})_{i,j=1,\dots,n-1}$. Note that
$$(DAD)_{i,j} = \frac{2}{n} \sum_{p,q=1}^{n-1} \sin\Big(\frac{ip\pi}{n}\Big) \sin\Big(\frac{qj\pi}{n}\Big) a_{p,q}.$$
For $i = j$ we have further
$$(DAD)_{i,i} = \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p-q)i\pi}{n}\Big) - \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p+q)i\pi}{n}\Big). \quad \text{(D.1)}$$
In order to bound the r.h.s. we need bounds for
$$\Big| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big( r \frac{i\pi}{n} \Big) \Big| \le \frac12 \big| \operatorname{Dir}_{[n/2]}(r\pi/n) - \operatorname{Dir}_{[n/4]}(r\pi/n) \big| \le \frac12 \Bigg| \frac{\sin\big(([n/2]+1/2)\, r\pi/n\big)}{\sin(r\pi/(2n))} \Bigg| + \frac12 \Bigg| \frac{\sin\big(([n/4]+1/2)\, r\pi/n\big)}{\sin(r\pi/(2n))} \Bigg| \le \frac{1}{\sin(r\pi/(2n))}.$$
Let $B_1 := \{1, \dots, n\}$ and $B_2 := \{n+1, \dots, 2n-2\}$. Then
$$\Big| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big( r \frac{i\pi}{n} \Big) \Big| \le \begin{cases} n/4 & \text{for } r = 0, \\ n/r & \text{for } r \in B_1, \\ n/(2n-r) & \text{for } r \in B_2. \end{cases}$$
Therefore, we can bound the first term of the r.h.s. of (D.1) by
$$\Bigg| \sum_{i=[n/4]+1}^{[n/2]} \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p-q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \sum_{p,q=1}^{n-1} |a_{p,q}| \Bigg| \sum_{i=[n/4]+1}^{[n/2]} \cos\Big(\frac{(p-q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \max_{p,q=1,\dots,n-1} |a_{p,q}| \Bigg( (n-1)\frac{n}{4} + 2 \sum_{\substack{p,q=1 \\ q-p \in B_1}}^{n-1} \frac{n}{q-p} \Bigg)$$
and the second term by
$$\Bigg| \sum_{i=[n/4]+1}^{[n/2]} \frac{1}{n} \sum_{p,q=1}^{n-1} a_{p,q} \cos\Big(\frac{(p+q)i\pi}{n}\Big) \Bigg| \le \frac{1}{n} \max_{p,q=1,\dots,n-1} |a_{p,q}| \Bigg( \sum_{\substack{p,q=1 \\ p+q \in B_1}}^{n-1} \frac{n}{p+q} + \sum_{\substack{p,q=1 \\ p+q \in B_2}}^{n-1} \frac{n}{2n-(p+q)} \Bigg) \le 2 n \max_{p,q=1,\dots,n-1} |a_{p,q}|.$$
Due to
$$\sum_{\substack{p,q=1 \\ q-p \in B_1}}^{n-1} \frac{1}{q-p} \le n \sum_{r=1}^{n} \frac{1}{r} \le n (1 + \log n)$$
and
$$\operatorname{tr}(J_n D A D) = \frac{4}{\sqrt{n}} \sum_{i=[n/4]+1}^{[n/2]} (DAD)_{i,i} \le \big( n + 5 n^{1/2} + 8 n^{1/2} (1 + \log n) \big) \max_{p,q=1,\dots,n-1} |a_{p,q}|$$
we obtain the result.
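As a numerical sanity check of the cosine block-sum bounds above, the sketch below verifies the three cases directly. This snippet is an addition, not part of the original proof, and the summation range $([n/4], [n/2]]$ follows the reconstruction used here:

```python
import math

n = 1000
lo, hi = n // 4, n // 2  # block of indices ([n/4], [n/2]]

def block_sum(r):
    """Compute sum_{i=lo+1}^{hi} cos(r*i*pi/n)."""
    return sum(math.cos(r * i * math.pi / n) for i in range(lo + 1, hi + 1))

# r = 0: the sum is exactly the number of terms, hi - lo
assert abs(block_sum(0) - (hi - lo)) < 1e-9

# r in B1 = {1,...,n}: |sum| <= n/r;  r in B2 = {n+1,...,2n-2}: |sum| <= n/(2n-r)
for r in range(1, 2 * n - 1):
    bound = n / r if r <= n else n / (2 * n - r)
    assert abs(block_sum(r)) <= bound
```

The bounds hold because the Dirichlet-kernel estimate gives $1/\sin(r\pi/(2n)) \le n/r$ for $1 \le r \le n$, and the case $r \in B_2$ follows by the symmetry $\cos((2n-r)i\pi/n) = \cos(ri\pi/n)$.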
Proposition D.2. It holds
$$\sup_{\tau \in \Theta_s(\beta,Q)} \max_{k \le n^{1/2}} \Big| \big\| (J_n^\tau)^{1/2} D \operatorname{Cov}(X_{1,k} + Z_{1,k}) D (J_n^\tau)^{1/2} \big\|_F^2 - \frac{1}{n} \int_0^1 \tau_k^2(x)\, dx \Big| = O\big( n^{-1} \log^{-1} n \big).$$
Proof.
We obtain with (A.12) $\operatorname{Cov}(X_{1,k} + Z_{1,k}) = \frac12 T_k K + \frac12 K T_k + S_k$, where $S_k := \frac12 T_k^2 + \operatorname{Cov}(X_{1,k}, Z_{1,k}) + \operatorname{Cov}(Z_{1,k}, X_{1,k}) + \operatorname{Cov}(Z_{1,k})$. Application of the triangle inequality gives
$$\tfrac12 \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F - \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le \big\| (J_n^\tau)^{1/2} D \operatorname{Cov}(X_{1,k} + Z_{1,k}) D (J_n^\tau)^{1/2} \big\|_F \le \tfrac12 \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F + \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F. \quad \text{(D.2)}$$
Note that because of Lemma D.4 (ii) it holds
$$2 \operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) \le \big\| (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F^2 \le 4 \big\| (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big\|_F^2. \quad \text{(D.3)}$$
Now we will bound
$$\operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) = \operatorname{tr}\Big( \big[ (J_n^\tau \Lambda)^{1/2} D T_k D (\Lambda J_n^\tau)^{1/2} \big]^2 \Big) = \sum_{i=1}^{n-1} \lambda_i^2\big( D T_k D\, \Lambda J_n^\tau \big)$$
from below. We obtain with Lemma D.3 (iii)
$$\lambda_i\big( D T_k D\, \Lambda J_n^\tau \big) \ge \begin{cases} \lambda_{n-[n/\log n]}(\Lambda J_n^\tau)\, \lambda_{[n/\log n]+i}\big( D T_k D \big), & i \le n - [n/\log n], \\ 0, & i > n - [n/\log n]. \end{cases}$$
Denote by $\tau_{k,(i)}$ the $i$-th largest component of the vector $(\tau_k(1/n), \dots, \tau_k(1-1/n))$. Then
$$\operatorname{tr}\Big( \big( (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big)^2 \Big) = \sum_{i=1}^{n-1} \lambda_i^2\big( D T_k D\, \Lambda J_n^\tau \big) \ge \sum_{i=1}^{n-[n/\log n]} \big( n - n/\log n \big)^{-2}\, \tau_{k,([n/\log n]+i)}^2 \ge \big( n - n/\log n \big)^{-2} \sum_{i=1}^{n-1} \tau_k^2\Big( \frac{i}{n} \Big) - \tau_{\max}^2\, \frac{n}{\log n}\, \big( n - n/\log n \big)^{-2}. \quad \text{(D.4)}$$
Next we will derive an upper bound for the r.h.s. of (D.3). Let, analogously to Definition (A.11), $\bar T_k$ be the tridiagonal matrix with entries
$$\big( \bar T_k \big)_{i,j} := \begin{cases} \big( \Delta_i \tau_k \big)^2 & \text{for } i = j - 1, \\ \big( \Delta_j \tau_k \big)^2 & \text{for } i = j + 1, \\ 0 & \text{otherwise.} \end{cases}$$
Note that $\max_i | \Delta_i \tau_k | \le \tau_{\max}\, \bar\varphi_{n,2}^{1/2}$. It is easy to check that $T_k K T_k = \tfrac12 T_k^2 K + \tfrac12 K T_k^2 + \tfrac12 \bar T_k$ holds. Clearly, $J_n^\tau \le (n - n/\log n)^{-1} \Lambda^{-1}$, and therefore we have for the upper bound in (D.3)
$$\big\| (J_n^\tau)^{1/2} D T_k K D (J_n^\tau)^{1/2} \big\|_F^2 \le \big( n - n/\log n \big)^{-1} \big\| (J_n^\tau)^{1/2} D T_k K D \Lambda^{-1/2} \big\|_F^2 = \big( n - n/\log n \big)^{-1} \operatorname{tr}\big( (J_n^\tau)^{1/2} D T_k K T_k D (J_n^\tau)^{1/2} \big) \le \big( n - n/\log n \big)^{-2} \operatorname{tr}\big( T_k^2 \big) + 2 \big( n - n/\log n \big)^{-1} \max_{i,j=1,\dots,n-1} \big| \bar T_k \big|_{i,j}\, \operatorname{tr}\big( J_n^\tau \big).$$
This yields
$$\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \Big| \big\| \tfrac12 (J_n^\tau)^{1/2} D (T_k K + K T_k) D (J_n^\tau)^{1/2} \big\|_F^2 - \frac1n \int_0^1 \tau_k^2(x)\, dx \Big| = O\big( n^{-1} \bar\varphi_{n,2}^{1/2} + n^{-1} \log^2 n\, \bar\varphi_{n,2} + n^{-1} \log^{-1} n \big). \quad \text{(D.5)}$$
Now we will bound the remainder term in (D.2). Using Lemma D.6 gives
$$\big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le \lambda_1(J_n^\tau) \| S_k \|_F \le n^{-1} \log^2 n \Big( \big\| \tfrac12 T_k^2 \big\|_F + 4 \| \operatorname{Cov}(Z_{1,k}) \|_F + 8 \| \operatorname{Cov}(X_{1,k}, Z_{1,k}) \|_F \Big).$$
Because $\operatorname{Cov}(X_{1,k}, Z_{1,k})$ is tridiagonal it holds with Lemma D.4 (i)
$$\| \operatorname{Cov}(X_{1,k}, Z_{1,k}) \|_F^2 = \sum_{i,j=1}^{n-1} \big( \operatorname{Cov}(X_{1,k}, Z_{1,k})_{i,j} \big)^2 \le n\, \tau_{\max}^2\, \bar\varphi_{n,2}$$
and therefore
$$\big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F \le n^{-1} \log^2 n \Big( n^{1/2} \bar\varphi_{n,2}^{1/2} + 16\, n^{1/2} \bar\varphi_{n,2}^{1/2} + 64\, n^{1/2} \bar\varphi_{n,2}^{1/2}\, \tau_{\max} \Big).$$
This leads to
$$\sup_{\tau \in \Theta_{bs}(\beta,\bar Q)} \max_{k \le n^{1/2}} \big\| (J_n^\tau)^{1/2} D S_k D (J_n^\tau)^{1/2} \big\|_F = O\big( n^{-1/2} \log^2 n\, \bar\varphi_{n,2}^{1/2} \big),$$
which together with (D.2) and (D.5) completes the proof.

Lemma D.1.
Let $s_{k,p}$ and $t_{k,p}$ be as defined in (A.4). Then it holds
(i)
$$s_{k,p} = \frac12 s_{0,p} + \frac14 s_{0,p-k} + \frac14 s_{0,p+k}, \qquad t_{k,p} = \frac12 t_{0,p} + \frac14 t_{0,p-k} + \frac14 t_{0,p+k}.$$
(ii)
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = A\big( \sigma_k, p \big) - \frac{1}{2n} \big( (-1)^p \sigma_k(1) + \sigma_k(0) \big).$$
(iii) Let $\Sigma_k$ be as defined in (A.5). Then
$$\big( D \Sigma_k D \big)_{i,j} = A\big( \sigma_k, i-j \big) - A\big( \sigma_k, i+j \big).$$
Remark D.1. In (iii), for $|i-j| \ll i+j$, the r.h.s. behaves like $s_{k,i-j}$. In the same way we obtain the equivalent result if we replace $\sigma$ by $\tau$.
Proof. (ii) Note that we can write
$$\sigma_k\Big( \frac{r}{n} \Big) = s_{k,0} + 2 \sum_{q=1}^{\infty} s_{k,q} \cos( q\pi r/n )$$
and hence it holds
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = \frac1n s_{k,0} \sum_{r=1}^{n-1} \cos\Big( \frac{pr\pi}{n} \Big) + \frac2n \sum_{q=1}^{\infty} s_{k,q} \sum_{r=1}^{n-1} \cos\Big( \frac{q\pi r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big).$$
Let $I_{\{A\}}$ denote the indicator function of the set $A$. We have the identities
$$\sum_{r=1}^{n-1} \cos\Big( \frac{pr\pi}{n} \Big) = n\, I_{\{p \equiv 0 \bmod 2n\}} - \frac12 \big( 1 + (-1)^p \big)$$
and
$$2 \sum_{r=1}^{n-1} \cos\Big( \frac{q\pi r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = \sum_{r=1}^{n-1} \cos\Big( \frac{(q-p)\pi r}{n} \Big) + \sum_{r=1}^{n-1} \cos\Big( \frac{(q+p)\pi r}{n} \Big).$$
From this it follows
$$\frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{pr\pi}{n} \Big) = - \frac{1}{2n} \big( 1 + (-1)^p \big) s_{k,0} - \frac1n \sum_{q=1}^{\infty} s_{k,q} \big( (-1)^{q-p} + 1 \big) + A\big( \sigma_k, p \big),$$
which yields the result.
(iii) This follows by applying (ii) to
$$\big( D \Sigma_k D \big)_{i,j} = \frac2n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \sin\Big( \frac{ir\pi}{n} \Big) \sin\Big( \frac{rj\pi}{n} \Big) = \frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{(i-j)r\pi}{n} \Big) - \frac1n \sum_{r=1}^{n-1} \sigma_k\Big( \frac{r}{n} \Big) \cos\Big( \frac{(i+j)r\pi}{n} \Big).$$
The next lemma bounds the tails of the Fourier coefficients of $\sigma_k$ in Sobolev s-ellipsoids. In particular, the result shows that the Fourier series is absolutely summable.
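The first cosine-sum identity used in the proof above is elementary; the following snippet (added here for illustration, not part of the original text) confirms it numerically for several $n$ and all $p$ up to $4n$:

```python
import math

def cos_sum(n, p):
    """Left-hand side: sum_{r=1}^{n-1} cos(p*r*pi/n)."""
    return sum(math.cos(p * r * math.pi / n) for r in range(1, n))

def closed_form(n, p):
    """Right-hand side: n*1{p = 0 mod 2n} - (1 + (-1)^p)/2."""
    return n * (p % (2 * n) == 0) - 0.5 * (1 + (-1) ** p)

# check the identity over a few sizes and all frequencies up to 4n
for n in (5, 8, 13):
    for p in range(4 * n + 1):
        assert abs(cos_sum(n, p) - closed_form(n, p)) < 1e-8
```

The three cases (odd $p$, even $p$ not divisible by $2n$, and $p \equiv 0 \bmod 2n$) follow from summing the geometric series $\sum_r e^{ipr\pi/n}$.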
Lemma D.2. Let $s_{k,p}$ be as defined in (A.4). Assume $k \le c n^{\gamma}$, where $0 < c < 1$ is a constant, and either $\gamma > 0$, $\alpha > 1/2$, or $k = 0$, $\gamma = 0$ and $\alpha > 1/2$. Then it holds for $n$ large enough
$$\sup_{\sigma \in \Theta_s(\alpha,Q)} \sum_{m=[n^\gamma]}^{\infty} | s_{k,m} | \le C_{\gamma,\alpha,Q,c}\, n^{\gamma(1/2-\alpha)},$$
where $C_{\gamma,\alpha,Q,c}$ is independent of $n$.
Proof. Consider the case $\gamma > 0$, $\alpha > 1/2$. Using Lemma D.1 (i), we see that for $n$ large enough
$$\sum_{m=[n^\gamma]}^{\infty} | s_{k,m} | \le \sum_{m=[(1-c)n^\gamma]}^{\infty} | s_{0,m} | = \sum_{m=1}^{\infty} | s_{0,m} |\, I_{\{ m \ge [(1-c)n^\gamma] \}} \le \Big( \sum_{i=1}^{\infty} i^{2\alpha} s_{0,i}^2 \Big)^{1/2} \Big( \sum_{i=[(1-c)n^\gamma]}^{\infty} i^{-2\alpha} \Big)^{1/2} \le C_{\gamma,\alpha,Q,c}\, n^{\gamma(1/2-\alpha)},$$
where we used the definition of a Sobolev s-ellipsoid in the last step. If $k = 0$, $\gamma = 0$ and $\alpha > 1/2$, the claim follows in the same way.
Lemma D.3.
(i) Let $A \in M_n$ be symmetric. Then $A$ is positive semidefinite if and only if $A = B^t B$ for some $B \in M_n$.
(ii) Let $A$, $B$ be positive semidefinite matrices and denote by $\lambda_1(A)$ the largest eigenvalue of $A$. Then $\operatorname{tr}(AB) \le \lambda_1(A) \operatorname{tr}(B)$.
(iii) Let $A, B \in M_{n-1}$ be positive semidefinite. Then
$$\lambda_{r+s+1}(AB) \le \lambda_{r+1}(A)\, \lambda_{s+1}(B), \quad 0 \le r+s \le n-2,$$
$$\lambda_{n-r-s+1}(AB) \ge \lambda_{n-r}(A)\, \lambda_{n-s}(B), \quad 2 \le r+s \le n.$$
(iv) Let $A$ and $B$ be symmetric matrices. Then
$$\lambda_{r+s+1}(A+B) \le \lambda_{r+1}(A) + \lambda_{s+1}(B), \quad 0 \le r+s \le n-1.$$
(v) (Cauchy-Schwarz inequality for the trace operator) Let $A$ and $B$ be matrices of the same size. Then
$$\big| \operatorname{tr}\big( A B^t \big) \big| \le \operatorname{tr}^{1/2}\big( A A^t \big)\, \operatorname{tr}^{1/2}\big( B B^t \big).$$
(vi) Let $A, B$ be matrices of the same size. Then $A^t B + B^t A \le A^t A + B^t B$.
Corollary D.1.
Let $A$ and $B$ be matrices of the same size. Then
$$\lambda_1\big( A B^t + B A^t \big) \le \lambda_1\big( A A^t \big) + \lambda_1\big( B B^t \big).$$
Proof. By Lemma D.3 (vi), $A B^t + B A^t \le A A^t + B B^t$. Applying Lemma D.3 (iv) for $r = s = 0$ yields the result.
In the following lemma, we summarize some facts on Frobenius norms.
Lemma D.4.
Let $A \in M_{n-1}$. Then
(i) $\| A \|_F^2 := \operatorname{tr}(A A^t) = \sum_{i=1}^{n-1} \lambda_i(A A^t) = \sum_{i,j=1}^{n-1} a_{i,j}^2$, and whenever $A = A^t$ also $\| A \|_F^2 = \sum_{i=1}^{n-1} \lambda_i^2(A)$.
(ii) It holds $2 \operatorname{tr}(A^2) \le \| A + A^t \|_F^2 \le 4 \| A \|_F^2$.
(iii) Let $A$, $B$ be positive semidefinite matrices of the same size with $0 \le A \le B$, and let $X$ be another matrix of the same size. Then $\| X^t A X \|_F \le \| X^t B X \|_F$.
Proof. (i) and (ii) are well known and the proofs are omitted. (iii) By assumption it holds $0 \le X^t A X \le X^t B X$. Hence $\lambda_i(X^t A X) \le \lambda_i(X^t B X)$ and the result follows.
Lemma D.5.
Let $V = (V_1, \dots, V_n)^t$, $W = (W_1, \dots, W_m)^t$ be two independent, centered random vectors. Let $A = (a_{i,j})_{i,j=1,\dots,n} \in M_n$ and $B \in M_{n,m}$. Then
(i) $E(V^t A V) = \operatorname{tr}(A \operatorname{Cov}(V))$ and $E(V^t B W) = 0$.
(ii) Assume further that $V_i \perp V_j$ for all $i, j = 1, \dots, n$, $i \ne j$, that $W_k \perp W_l$ for all $k, l = 1, \dots, m$, $k \ne l$, and that $\operatorname{Var}(V_i) = \operatorname{Var}(W_k) = 1$ for $i = 1, \dots, n$ and $k = 1, \dots, m$. We set $\operatorname{TrSq}(A) := \sum_{i=1}^n a_{i,i}^2$. Then
$$\operatorname{Var}\big( V^t A V \big) = \operatorname{Cum}_4(V_1) \operatorname{TrSq}(A) + \operatorname{tr}\big( A^2 + A A^t \big) \le \operatorname{Cum}_4(V_1) \operatorname{TrSq}(A) + 2 \| A \|_F^2 \le \big( 2 + \operatorname{Cum}_4(V_1) \big) \| A \|_F^2, \quad \text{(D.6)}$$
$$\operatorname{Var}\big( V^t B W \big) = \| B \|_F^2, \qquad \operatorname{Var}\big( V^t A B W \big) \le \big\| A A^t \big\|_F \big\| B B^t \big\|_F. \quad \text{(D.7)}$$
Proof. We only prove the first and the last statement of (ii). Note that
$$\operatorname{Var}\big( V^t A V \big) = \sum_{i,j,k,l=1}^{n} a_{ij} a_{kl} \operatorname{Cov}( V_i V_j, V_k V_l ).$$
If $i = j = k = l$ then $\operatorname{Cov}(V_i V_j, V_k V_l) = 2 + \operatorname{Cum}_4(V_1)$; if $i = k$, $j = l$, $i \ne j$ or $i = l$, $j = k$, $i \ne j$ then $\operatorname{Cov}(V_i V_j, V_k V_l) = 1$. Otherwise $\operatorname{Cov}(V_i V_j, V_k V_l) = 0$, and this gives (D.6). In order to see (D.7) note that by Lemma D.3 (v)
$$\operatorname{Var}\big( V^t A B W \big) = \| A B \|_F^2 = \operatorname{tr}\big( (B B^t)(A^t A) \big) \le \operatorname{tr}^{1/2}\big( (B B^t)^2 \big)\, \operatorname{tr}^{1/2}\big( (A^t A)^2 \big) = \big\| B B^t \big\|_F \big\| A A^t \big\|_F.$$
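For standard normal $V$ the fourth cumulant vanishes, so (D.6) reduces to $\operatorname{Var}(V^t A V) = \operatorname{tr}(A^2 + A A^t)$. The following Monte Carlo sketch of this special case is an addition for illustration; the matrix $A$ is an arbitrary symmetric example:

```python
import numpy as np

rng = np.random.default_rng(0)

# arbitrary symmetric 3x3 example matrix
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

# (D.6) with Cum_4(V_1) = 0: Var(V^t A V) = tr(A^2 + A A^t) = 2 ||A||_F^2
exact = np.trace(A @ A + A @ A.T)

# Monte Carlo estimate over 200000 standard normal vectors
V = rng.standard_normal((200_000, 3))
quad = np.einsum('bi,ij,bj->b', V, A, V)  # the quadratic forms V^t A V
mc = quad.var()
```

With this $A$, $\operatorname{tr}(A^2 + A A^t) = 2\|A\|_F^2 = 36$, and the empirical variance agrees with it up to Monte Carlo error.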
Theorem D.1. Let $\xi \sim N(0, I_n)$ and let $A$ be a positive semidefinite matrix. Then
$$\operatorname{Var}^{-1/2}\big( \xi^t A \xi \big) \big( \xi^t A \xi - E( \xi^t A \xi ) \big) \xrightarrow{\mathcal{D}} N(0, 1)$$
if and only if $\operatorname{Var}^{-1/2}\big( \xi^t A \xi \big)\, \lambda_1(A) \to 0$.
Lemma D.6.
Let $n$ be sufficiently large. Then $\lambda_1( J_n^\tau ) \le n^{-1} \log^2 n$.
Proof. Let $r = [n/\log n]$ and note that $\sin(x)^{-1} \le \pi/(2x)$ for $x \in (0, \pi/2]$. Hence
$$\lambda_r^{-1} \le \frac14 \Big( \frac{n}{r} \Big)^2 \le \frac12 \log^2 n$$
and
$$\lambda_1( J_n^\tau ) = \big( n - [n/\log n] \big)^{-1} \lambda_r^{-1} \le n^{-1} \log^2 n.$$
Lemma D.7. Let $\lambda_i$ be as defined in (3.2). Then it holds
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \lambda_i = \frac{7 \pi^2}{24} + O\big( n^{-1/2} \big).$$
Proof.
Let $x_i = i\pi/(2n)$. Note that $\sin^2( x_i ) = x_i^2 - \xi_i^4/3$, where $\xi_i \in (0, x_i)$. Further
$$\max_{i = [\sqrt{n}/2]+1, \dots, [\sqrt{n}]} x_i \le n^{-1/2} \pi.$$
Hence
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \xi_i^4 \le n \max_{i = [\sqrt{n}/2]+1, \dots, [\sqrt{n}]} x_i^4 = O\big( n^{-1} \big)$$
and thus
$$\sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \lambda_i = 4 \sqrt{n} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} \Big( \frac{i^2 \pi^2}{4 n^2} - \frac{\xi_i^4}{3} \Big) = \pi^2 n^{-3/2} \sum_{i=[\sqrt{n}/2]+1}^{[\sqrt{n}]} i^2 + O\big( n^{-1} \big) = \frac{7 \pi^2}{24} + O\big( n^{-1/2} \big).$$
Lemma D.8 (Continuous Sobolev Embedding). Let $C(q)$, $q > 0$, denote the space of Hölder continuous functions on $[0,1]$ equipped with the canonical norm $\| \cdot \|_{C(q)}$ and define $\eta : (1/2, \infty) \times [0, \infty) \to \mathbb{R}$,
$$\eta( \alpha, \delta ) := \begin{cases} \alpha - 1/2, & \alpha \in (1/2, 3/2), \\ 1 - \delta, & \alpha = 3/2, \\ 1, & \alpha > 3/2. \end{cases}$$
Suppose $\alpha > 1/2$. Then for any $\delta > 0$ the embedding $\iota : \Theta_{bs}(\alpha, Q) \hookrightarrow C( \eta(\alpha, \delta) )$ is continuous and in particular
$$\sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{C(\eta(\alpha,\delta))} < \infty.$$
Proof.
For a given function $f : [0,1] \to \mathbb{R}$ define $\tilde f : [-1,1] \to \mathbb{R}$,
$$\tilde f(x) := \begin{cases} f(x), & x \in [0,1], \\ f(-x), & x \in [-1, 0). \end{cases}$$
For $s > 0$, let $W^{s,2}[-1,1]\big|_{[0,1]}$ denote the (fractional) Sobolev space on $[-1,1]$, where the domain of the functions is restricted to $[0,1]$, equipped with the norm
$$\| f \|_{W^{s,2}[-1,1]|_{[0,1]}} := \big\| \tilde f \big\|_{W^{s,2}[-1,1]}.$$
Note that this is a function space on $[0,1]$ and $W^{s,2}[-1,1]\big|_{[0,1]} \ne W^{s,2}[0,1]$. For $\alpha > 1/2$ the embedding
$$\iota : \Theta_{bs}(\alpha, Q) \subseteq W^{\alpha,2}[-1,1]\big|_{[0,1]} \hookrightarrow C( \eta(\alpha, \delta) )$$
is continuous, and since it is linear it is also bounded. This yields
$$\sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{C(\eta(\alpha,\delta))} \le \| \iota \| \sup_{f \in \Theta_{bs}(\alpha, Q)} \| f \|_{W^{\alpha,2}[-1,1]|_{[0,1]}} < \infty.$$

References

[1] Taylor, M. (1996).