Uniform concentration inequality for ergodic diffusion processes observed at discrete times∗

arXiv [math.PR]

L. Galtchouk† S. Pergamenshchikov‡

August 15, 2018
Abstract
In this paper a concentration inequality is proved for the deviation in the ergodic theorem in the case of discrete time observations of diffusion processes. The proof is based on the geometric ergodicity property for diffusion processes. As an application we consider the nonparametric pointwise estimation problem for the drift coefficient under discrete time observations.
Keywords:
Ergodic diffusion processes; Markov chains; Tail distribution; Upper exponential bound; Concentration inequality.
AMS 2000 Subject Classifications: 60F10

∗ The second author is partially supported by the RFFI-Grant 09-01-00172-a.
† Department of Mathematics, Strasbourg University, 7, rue René Descartes, 67084, Strasbourg, France, e-mail: [email protected]
‡ Laboratoire de Mathématiques Raphael Salem, Avenue de l'Université, BP. 12, Université de Rouen, F76801, Saint Etienne du Rouvray, Cedex France, e-mail: [email protected]

1 Introduction
We consider the process $(y_t)_{t\ge 0}$ governed by the stochastic differential equation

$$\mathrm{d}y_t = S(y_t)\,\mathrm{d}t + \sigma(y_t)\,\mathrm{d}W_t, \quad 0\le t\le T, \tag{1.1}$$

where $(W_t,\mathcal{F}_t)_{t\ge 0}$ is a standard Wiener process, $y_0$ is an initial condition and $\vartheta=(S,\sigma)$ are unknown functions. For this model we consider the pointwise estimation problem for the function $S$ at a fixed point $x_0\in\mathbb{R}$ (i.e. $S(x_0)$), on the basis of the discrete time observations of the process (1.1), i.e.

$$(y_{t_j})_{1\le j\le N}, \tag{1.2}$$

where $t_j=j\delta$, $N=[T/\delta]$ and $\delta$ is some positive fixed observation frequency which will be specified later. Usually, for this problem one uses kernel estimators $\widehat{S}_N(x_0)$ defined as

$$\widehat{S}_N(x_0)=\frac{\sum_{k=1}^{N}\psi_{h,x_0}(y_{t_k})\,\Delta y_{t_k}}{\sum_{k=1}^{N}\psi_{h,x_0}(y_{t_k})\,\Delta t_k},\qquad \psi_{h,x_0}(y)=\frac{1}{h}\,\Psi\left(\frac{y-x_0}{h}\right), \tag{1.3}$$

where $\Psi(y)$ is a kernel function which equals zero for $|y|\ge 1$, $0<h<1$, $\Delta y_{t_k}=y_{t_k}-y_{t_{k-1}}$ and $\Delta t_k=\delta$.

The main difficulty with this estimator is that the denominator is random. Therefore, to obtain the convergence rate for this estimator we have to study the behavior of the denominator; more precisely, one needs to show that

$$\sum_{k=1}^{N}\psi_{h,x_0}(y_{t_k})\,\Delta t_k\approx \pi_\vartheta(\psi_{h,x_0})\,hT \quad\text{as } T\to\infty,$$

where

$$\pi_\vartheta(\psi_{h,x_0})=\int_{\mathbb{R}}\psi_{h,x_0}(y)\,q_\vartheta(y)\,\mathrm{d}y \tag{1.4}$$

and $q_\vartheta$ is the ergodic density defined in (2.2).

Unfortunately, the ergodic theorem does not give this kind of result directly, because the times $t_k$ and the bandwidth $h$ depend on $T$. Usually one obtains such properties through concentration inequalities for the deviation in the ergodic theorem, i.e. one needs to study the limit behavior of the deviation

$$D_T(\phi)=\sum_{k=1}^{N}\bigl(\phi(y_{t_k})-\pi_\vartheta(\phi)\bigr)\,\Delta t_k \tag{1.5}$$

for some functions $\phi$ which may depend on $T$, for example $\phi(\cdot)=\psi_{h,x_0}(\cdot)$. More precisely, we need to show that for any $\varepsilon>0$, $m>0$ and $\vartheta$,

$$\lim_{T\to\infty}T^{m}\,\mathbf{P}_\vartheta\bigl(|D_T(\psi_{h,x_0})|>\varepsilon T\bigr)=0, \tag{1.6}$$

where $\mathbf{P}_\vartheta$ is the law of the process $(y_t)_{t\ge 0}$ under the coefficients $\vartheta=(S,\sigma)$. Usually, to get properties of type (1.6) one needs to establish an exponential inequality for the deviations (1.5).

There are a number of papers devoted to concentration inequalities for functions of independent random variables (we refer the reader to [2] and references therein) and for functions of dependent random variables (see [4], [5], [14]). For Markov chains such inequalities were obtained in [1]. For continuous time Markov processes an exponential concentration inequality was obtained in [3] (see also references therein). Some applications of concentration inequalities to statistics are presented in [13]. Concentration inequalities for diffusion processes are given in [8], [16], [18].

For statistical applications, we need upper bounds for the tail distribution that are uniform over the functions $\phi$, like the exponential bounds in [8]. We cannot apply the method from [8] directly, since it is based on the continuous time version of the Ito formula. In this paper we apply this approach through uniform (over the functions $S$) geometric ergodicity. We recall (see [15]) that geometric ergodicity yields a geometric rate in the convergence

$$\lim_{t\to\infty}\mathbf{E}_\vartheta\bigl(g(y_t)\,|\,y_0=x\bigr)=\pi_\vartheta(g)$$

for any integrable function $g$ and any initial value $x\in\mathbb{R}$. Here $\mathbf{E}_\vartheta$ denotes the expectation with respect to the distribution $\mathbf{P}_\vartheta$. In [10] it is shown, through the Lyapunov functions method, that the process (1.1) is geometrically ergodic uniformly over functions $\vartheta=(S,\sigma)$ from the functional class $\Theta$ defined in (2.1).

The paper is organized as follows. In the next section we formulate the main results. In Section 3 we introduce all the necessary parameters. In Section 4 we show a concentration inequality in the ergodic theorem for continuous observations of the process (1.1). In Section 5 we announce the uniform geometric ergodic property for the process (1.1). In Section 6 we give the Burkholder inequality for dependent random variables. In Section 7 we prove all main results. The Appendix contains the proofs of some auxiliary results.
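To make the deviation (1.5) concrete, the following minimal sketch simulates one path and evaluates $D_T(\psi_{h,x_0})$. Everything specific in it is an assumption made for illustration and is not taken from the paper: the drift $S(y)=-y$ with $\sigma\equiv 1$ (for which formula (2.2) gives $q(y)=e^{-y^2}/\sqrt{\pi}$), the triweight kernel, the names `triweight`, `psi`, `pi_of_psi`, `deviation`, and the numerical values of $T$, $\delta$, $h$. This toy model is chosen for convenience and is not checked against the class $\Theta$.

```python
import math
import random


def triweight(u):
    """C^2 kernel supported on [-1, 1] that integrates to one (illustrative choice)."""
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 if abs(u) < 1.0 else 0.0


def psi(y, x0, h):
    """Rescaled kernel psi_{h,x0}(y) = Psi((y - x0) / h) / h, as in (1.3)."""
    return triweight((y - x0) / h) / h


def pi_of_psi(x0, h, n=2000):
    """Midpoint-rule value of (1.4) for the density q(y) = exp(-y^2) / sqrt(pi)."""
    step = 2.0 * h / n
    total = 0.0
    for i in range(n):
        y = x0 - h + (i + 0.5) * step
        total += psi(y, x0, h) * math.exp(-y * y) / math.sqrt(math.pi)
    return total * step


def deviation(T, delta, x0, h, seed=0):
    """D_T(psi_{h,x0}) from (1.5) for dy = -y dt + dW observed at t_j = j * delta."""
    rng = random.Random(seed)
    n_obs = int(T / delta)
    decay = math.exp(-delta)
    noise_sd = math.sqrt((1.0 - math.exp(-2.0 * delta)) / 2.0)
    target = pi_of_psi(x0, h)
    y = 0.0  # initial condition y_0 (assumed)
    d = 0.0
    for _ in range(n_obs):
        # Exact transition of the Ornstein-Uhlenbeck process over one step delta.
        y = decay * y + noise_sd * rng.gauss(0.0, 1.0)
        d += (psi(y, x0, h) - target) * delta
    return d
```

Calling `deviation(500.0, 0.05, 0.5, 0.5)` returns one realisation of $D_T$; dividing it by $T$ gives the quantity whose tail is controlled in (1.6).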
First we describe the functional class $\Theta$ for functions $\vartheta=(S,\sigma)$ defined in [10]. We start with some real numbers $x_*\ge 1$, $M>0$ and $L>1$, for which we denote by $\Sigma_{L,M}$ the class of functions $S$ from $\mathbf{C}^1(\mathbb{R})$ such that

$$\sup_{|x|\le x_*}\bigl(|S(x)|+|\dot{S}(x)|\bigr)\le M$$

and

$$-L\le\inf_{|x|\ge x_*}\dot{S}(x)\le\sup_{|x|\ge x_*}\dot{S}(x)\le -L^{-1}.$$

Furthermore, for some fixed numbers $0<\sigma_{\min}\le\sigma_{\max}<\infty$, we denote by $\mathcal{V}$ the class of the functions $\sigma$ from $\mathbf{C}^2(\mathbb{R})$ such that

$$\sigma_{\min}\le\inf_{x\in\mathbb{R}}\min\bigl(|\sigma(x)|,|\dot{\sigma}(x)|,|\ddot{\sigma}(x)|\bigr)\le\sup_{x\in\mathbb{R}}\max\bigl(|\sigma(x)|,|\dot{\sigma}(x)|,|\ddot{\sigma}(x)|\bigr)\le\sigma_{\max}.$$

Finally, we set

$$\Theta=\Sigma_{L,M}\times\mathcal{V}. \tag{2.1}$$

It should be noted (see, for example, [11]) that for any $\vartheta=(S,\sigma)\in\Theta$, the equation (1.1) has a unique strong solution which is an ergodic process with the invariant density $q_\vartheta$ defined as

$$q_\vartheta(x)=\left(\int_{\mathbb{R}}\sigma^{-2}(z)\,e^{\widetilde{S}(z)}\,\mathrm{d}z\right)^{-1}\sigma^{-2}(x)\,e^{\widetilde{S}(x)}, \tag{2.2}$$

where $\widetilde{S}(x)=2\int_0^x S_1(v)\,\mathrm{d}v$ and $S_1(x)=S(x)/\sigma^2(x)$.

Now we describe the functional classes for the functions $\phi$. First, for any parameters $\nu_0>0$ and $\nu_1>0$ we set

$$\mathcal{V}_{\nu_0,\nu_1}=\bigl\{\phi\in\mathbf{C}^2(\mathbb{R}) : |\phi|_1\le\nu_0,\ \|\phi\|_*\le\nu_1\bigr\}, \tag{2.3}$$

where $|\phi|_1=\int_{\mathbb{R}}|\phi(y)|\,\mathrm{d}y$ and $\|\phi\|_*=\sup_{y\in\mathbb{R}}|\phi(y)|$.

For any function $\phi$ from $\mathbf{C}^2(\mathbb{R})$ we denote by $\mathcal{L}_\vartheta(\phi)$ the generator of the process (1.1), i.e.

$$\mathcal{L}_\vartheta(\phi)(y)=S(y)\,\dot{\phi}(y)+\frac{\sigma^2(y)}{2}\,\ddot{\phi}(y).$$

Using this notation, we set

$$\mu(\phi)=\sup_{\vartheta\in\Theta}\|\mathcal{L}_\vartheta(\phi)\|_*\quad\text{and}\quad\widetilde{\mu}(\phi)=\sup_{\vartheta\in\Theta}|\widetilde{\pi}_\vartheta(\phi)|, \tag{2.4}$$

where $\widetilde{\pi}_\vartheta(\phi)=\pi_\vartheta(\mathcal{L}_\vartheta(\phi))$. Now for any vector $\nu=(\nu_0,\nu_1,\nu_2,\nu_3,\nu_4)$ from $\mathbb{R}^5_+$ we set

$$\mathcal{K}_\nu=\bigl\{\phi\in\mathcal{V}_{\nu_0,\nu_1} : \|\dot{\phi}\|_*\le\nu_2,\ \mu(\phi)\le\nu_3,\ \widetilde{\mu}(\phi)\le\nu_4\bigr\}. \tag{2.5}$$

Theorem 2.1. For any vector $\nu=(\nu_0,\nu_1,\nu_2,\nu_3,\nu_4)$ from $\mathbb{R}^5_+$ and any $0<\delta\le 1$ there exist positive parameters $z_0=z_0(\delta,\nu)$, $\gamma=\gamma(\delta,\nu)$ and $\kappa_1=\kappa_1(\delta,\nu)$ such that

$$\sup_{T\ge 1}\ \sup_{z\ge z_0}\ \sup_{\phi\in\mathcal{K}_\nu}\ \sup_{\vartheta\in\Theta}\ e^{z\min(\kappa_1 z,\,\gamma)}\,\mathbf{P}_\vartheta\bigl(|D_T(\phi)|\ge z\sqrt{N}\bigr)\le 4, \tag{2.6}$$

where the parameters $z_0$, $\gamma$ and $\kappa_1$ are defined in (3.5)-(3.6).

Now we apply this theorem to the pointwise estimation problem, i.e. to the functions $\psi_{h,x_0}$ defined in (1.3). To this end we assume that the frequency $\delta$ in the observations (1.2) is of the following form

$$\delta=\delta_T=\frac{1}{T\,l_T}, \tag{2.7}$$

where the function $l_T$ is such that for any $m>0$

$$\lim_{T\to\infty}\frac{l_T}{T^m}=0\quad\text{and}\quad\lim_{T\to\infty}\frac{l_T}{\ln T}=+\infty. \tag{2.8}$$

Further, let $\epsilon=\epsilon_T$ be a positive function satisfying the following properties

$$\lim_{T\to\infty}\epsilon_T=0,\quad\lim_{T\to\infty}\frac{l_T}{T\epsilon_T}=0\quad\text{and}\quad\lim_{T\to\infty}\frac{\epsilon_T\,l_T}{\ln T}=+\infty. \tag{2.9}$$

We can take, for example, $l_T$ and $1/\epsilon_T$ as suitable powers of $\ln(T+1)$.

Theorem 2.2.
Assume that the kernel function $\Psi$ in (1.3) is twice continuously differentiable. Moreover, assume that the functions $\delta_T$ and $l_T$ satisfy the properties (2.7) and (2.9). Then there exist coefficients $z_*=z_*(\Psi)>0$ and $\gamma_*=\gamma_*(\Psi)>0$ such that

$$\limsup_{T\to\infty}\ \sup_{a\ge a_*}\ e^{a\gamma_* l_T}\ \sup_{h\ge T^{-1/2}}\ \sup_{\vartheta\in\Theta}\ \mathbf{P}_\vartheta\bigl(|D_T(\psi_{h,x_0})|\ge aT\bigr)\le 4, \tag{2.10}$$

where $a_*=z_*/l_T$; the parameters $z_*$ and $\gamma_*$ are given in Section 3.

This theorem immediately implies the following
Corollary 2.1.
Assume that all conditions of Theorem 2.2 hold. Then, for any $m>0$,

$$\limsup_{T\to\infty}\ T^m\ \sup_{a\ge a_*}\ \sup_{h\ge T^{-1/2}}\ \sup_{\vartheta\in\Theta}\ \mathbf{P}_\vartheta\bigl(|D_T(\psi_{h,x_0})|\ge aT\bigr)=0.$$

Consider now the function

$$\chi_{h,x_0}(y)=\frac{1}{h}\,\chi\left(\frac{y-x_0}{h}\right), \tag{2.11}$$

where $\chi(y)=\mathbf{1}_{\{|y|\le 1\}}$.

Theorem 2.3.
Assume that the parameter $\delta$ has the form (2.7). Then, for any $m>0$ and for any function $\epsilon_T$ satisfying the conditions (2.8) and (2.9),

$$\lim_{T\to\infty}\ T^m\ \sup_{h\ge T^{-1/2}}\ \sup_{\vartheta\in\Theta}\ \mathbf{P}_\vartheta\bigl(|D_T(\chi_{h,x_0})|\ge\epsilon_T T\bigr)=0. \tag{2.12}$$

Remark 2.1.
It is well known that to obtain the optimal rate in the estimation problem for a differentiable function $S$ in the process (1.1) one needs to choose the bandwidth $h$ as $h=T^{-1/(2\alpha+1)}$ with the regularity parameter $\alpha\ge 1$. This means that, for the pointwise estimation problem, $h\ge T^{-1/3}$. But for the quadratic risk one needs to choose the parameter $h$ as $h=T^{-1/2}$ (see [6]-[7], [9]).

In this section we introduce all necessary constants and parameters. First, we set

$$\upsilon_1=e^{\beta_1^2/(4\beta_2)}\quad\text{and}\quad\upsilon_2=\sqrt{\pi/\beta_2}\;e^{\beta_1^2/(4\beta_2)}, \tag{3.1}$$

where $\beta_1=2M/\sigma_{\min}^2$ and $\beta_2=1/(L\sigma_{\max}^2)$. Moreover, as we will see in the Appendix, the ergodic density (2.2) is uniformly bounded by $q_*$, where

$$q_*=\frac{\sigma_{\max}^2}{\sigma_{\min}^2}\,e^{\beta_1 x_*+\beta_1^2/(4\beta_2)}. \tag{3.2}$$

Now we set

$$r_0=r_0(\nu_0)=2\nu_0\,\sigma_{\min}^{-2}\bigl(1+\upsilon_1+q_*(x_*+\upsilon_2)\bigr)\,e^{x_*\beta_1}, \tag{3.3}$$

where the parameter $\nu_0$ is defined in (2.3). Using this function we set

$$\kappa_0=\kappa_0(\nu_0)=\frac{1}{108\,r_0^2\,(3\rho^2+y_0^2+2\sigma_{\max}^2)}, \tag{3.4}$$

where $\rho=\max\bigl(|y_0|,\ \sigma_{\max}\sqrt{L},\ 2(x_*+ML)\bigr)$.

Now for any $\delta>0$ and $\nu=(\nu_0,\nu_1,\nu_2,\nu_3,\nu_4)$ from $\mathbb{R}^5_+$ we set

$$z_0=z_0(\delta,\nu)=\delta^{3/2}\max\bigl(c_1^*\nu_3,\ c_2^*\nu_2,\ \nu_4T^{1/2},\ \nu_1T^{-1/2}\bigr),\qquad \tau=\tau(\delta,\nu)=\delta^{3/2}\max\bigl(c_1^*\nu_3,\ c_2^*\nu_2\bigr), \tag{3.5}$$

where

$$c_1^*=2e^{\kappa+1}\sqrt{\frac{R(1+\rho)}{\kappa}}\quad\text{and}\quad c_2^*=\sqrt{2}\,e\,\sigma_{\max}.$$

The parameters $R$ and $\kappa$ are defined in Theorem 5.1. Finally we set

$$\gamma=\frac{1}{4\tau}\quad\text{and}\quad\kappa_1=\kappa_1(\delta,\nu)=\frac{9\kappa_0(1-\delta)}{64\,\delta}. \tag{3.6}$$

Now we set

$$M_1=M+L(x_*+|x_0|+2). \tag{3.7}$$

For any integrable, twice continuously differentiable function $\Psi:\mathbb{R}\to\mathbb{R}$ we define

$$k_*(\Psi)=\max\bigl(|\dot{\Psi}|_1,\ |\ddot{\Psi}|_1,\ \|\Psi\|_*,\ \|\dot{\Psi}\|_*,\ \|\ddot{\Psi}\|_*\bigr). \tag{3.8}$$

Using this operator we define the parameters

$$z_*=\lambda_1k_*(\Psi)\quad\text{and}\quad\tau_*=\lambda_2k_*(\Psi), \tag{3.9}$$

where $\lambda_1=\max\bigl(c_1^*M_1,\ c_2^*,\ M_1q_*,\ 1\bigr)$ and $\lambda_2=\max\bigl(c_1^*M_1,\ c_2^*\bigr)$. Finally, we set

$$\gamma_*=\frac{1}{4\tau_*}. \tag{3.10}$$

In this section we study the deviation in the ergodic theorem for the continuous observation case, which is defined as

$$\Delta_T(\phi)=\frac{1}{\sqrt{T}}\int_0^T\bigl(\phi(y_t)-\pi_\vartheta(\phi)\bigr)\,\mathrm{d}t, \tag{4.1}$$

where $\phi$ is any integrable function, i.e. $|\phi|_1<\infty$.

Proposition 4.1. For any $\nu_0>0$ and $\nu_1>0$,

$$\sup_{z\ge 0}\ e^{\kappa_0z^2}\ \sup_{T\ge 1}\ \sup_{\phi\in\mathcal{V}_{\nu_0,\nu_1}}\ \sup_{\vartheta\in\Theta}\ \mathbf{P}_\vartheta\bigl(|\Delta_T(\phi)|\ge z\bigr)\le 2, \tag{4.2}$$

where the parameter $\kappa_0$ is given in (3.4).

Proof.
Similarly to [8], we first show that the deviation (4.1) has an exponential moment, i.e. we show that for the parameter $\kappa_0$ from (3.4)

$$\sup_{T\ge 1}\ \sup_{\vartheta\in\Theta}\ \mathbf{E}_\vartheta\,e^{\kappa_0\Delta_T^2(\phi)}\le 2. \tag{4.3}$$

To show this inequality we need to estimate the expectation of any even power of the deviation $\Delta_T(\phi)$. To this end we represent this deviation as the sum of a continuous martingale and a negligible term. For this one needs to find a bounded solution of the following differential equation

$$\dot{v}_\vartheta(u)+2\,\frac{S(u)}{\sigma^2(u)}\,v_\vartheta(u)=2\,\frac{\widetilde{\phi}(u)}{\sigma^2(u)},\qquad\widetilde{\phi}(u)=\phi(u)-\pi_\vartheta(\phi). \tag{4.4}$$

One can check directly that the function

$$v_\vartheta(u)=-2\int_u^\infty\frac{\widetilde{\phi}(y)}{\sigma^2(y)}\exp\left\{2\int_u^yS_1(z)\,\mathrm{d}z\right\}\mathrm{d}y \tag{4.5}$$

yields such a solution. We recall that the function $S_1=S/\sigma^2$ is defined in (2.2). Moreover, by Lemma A.2 from the Appendix, this function $v_\vartheta$ is uniformly bounded. By applying the Ito formula to the function $V(y)=\int_0^yv_\vartheta(u)\,\mathrm{d}u$ we obtain the following representation

$$\int_0^T\widetilde{\phi}(y_s)\,\mathrm{d}s=V(y_T)-V(y_0)-\zeta_T, \tag{4.6}$$

where $\zeta_T=\int_0^Tv_\vartheta(y_s)\,\sigma(y_s)\,\mathrm{d}W_s$. Therefore, for any $T\ge 1$ we can estimate $\Delta_T(\phi)$ from above as

$$|\Delta_T(\phi)|\le r_0|y_T|+r_0|y_0|+\frac{1}{\sqrt{T}}\,|\zeta_T|.$$

Moreover, taking into account (see [12], Lemma 4.11) that for any $m\ge 1$

$$\mathbf{E}_\vartheta(\zeta_T)^{2m}\le(2m-1)!!\;r_0^{2m}\sigma_{\max}^{2m}\,T^m,$$

we obtain by Proposition A.1 that for any $m\ge 1$

$$\mathbf{E}_\vartheta|\Delta_T(\phi)|^{2m}\le 3^{2m-1}\left(r_0^{2m}\bigl(\mathbf{E}_\vartheta|y_T|^{2m}+|y_0|^{2m}\bigr)+\frac{\mathbf{E}_\vartheta(\zeta_T)^{2m}}{T^m}\right)\le(3r_0)^{2m}\bigl((2m+1)(2m-1)!!\,\rho^{2m}+y_0^{2m}+(2m-1)!!\,\sigma_{\max}^{2m}\bigr).$$

Therefore, for the parameter $\kappa_0$ we obtain

$$\mathbf{E}_\vartheta\,e^{\kappa_0\Delta_T^2(\phi)}\le 1+\sum_{m=1}^\infty\frac{\kappa_0^m}{m!}\,(3r_0)^{2m}\bigl((2m+1)!!\,\rho^{2m}+y_0^{2m}+(2m-1)!!\,\sigma_{\max}^{2m}\bigr)\le 1+\sum_{m=1}^\infty\kappa_0^m(3r_0)^{2m}\bigl((\sqrt{3}\rho)^{2m}+y_0^{2m}+2^m\sigma_{\max}^{2m}\bigr)\le 1+\sum_{m=1}^\infty(1/2)^m=2.$$

Here we used the bounds $(2m+1)!!\le 3^m\,m!$ and $(2m-1)!!\le 2^m\,m!$. From here we obtain the inequality (4.3), and by the Chebyshev inequality we come to the upper bound (4.2). Hence Proposition 4.1.
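The claim that (4.5) solves (4.4) can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not in the paper: $S(z)=-z$ and $\sigma\equiv 1$ (so $S_1(z)=-z$ and $q(y)=e^{-y^2}/\sqrt{\pi}$), a triweight bump as the test function $\phi$, Simpson quadrature with a truncated upper limit, and the hypothetical names `bump`, `simpson`, `centred`, `v`, `ode_residual`. It is restricted to $u\ge 0$, where the integrand is numerically stable.

```python
import math


def bump(y, x0=0.5, h=0.5):
    """Hypothetical test function phi: a C^2 triweight bump centred at x0."""
    u = (y - x0) / h
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 / h if abs(u) < 1.0 else 0.0


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


# Stationary mean pi(phi) for q(y) = exp(-y^2) / sqrt(pi); phi vanishes outside [0, 1].
PI_PHI = simpson(lambda y: bump(y) * math.exp(-y * y) / math.sqrt(math.pi), 0.0, 1.0)


def centred(y):
    """phi tilde = phi - pi(phi), as in (4.4)."""
    return bump(y) - PI_PHI


def v(u, cutoff=8.0):
    """Formula (4.5) with S_1(z) = -z and sigma = 1, truncated at u + cutoff (u >= 0)."""
    return -2.0 * simpson(lambda y: centred(y) * math.exp(u * u - y * y), u, u + cutoff)


def ode_residual(u, eps=1e-3):
    """Central-difference residual of v'(u) + 2 S_1(u) v(u) - 2 phi_tilde(u)."""
    derivative = (v(u + eps) - v(u - eps)) / (2.0 * eps)
    return derivative - 2.0 * u * v(u) - 2.0 * centred(u)
```

For points such as `u = 0.3`, `0.75` or `1.5` the residual should be close to zero up to quadrature and finite-difference error, and `v(u)` should stay of moderate size, in line with the boundedness stated in Lemma A.2.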
Remark 4.1.
It should be noted that the inequality (4.2) is shown in [8] for the process (1.1) with $\sigma=1$. Thus Proposition 4.1 extends the result from [8] to any diffusion function $\sigma$.

Here we announce a result on geometric ergodicity obtained in [10].
Theorem 5.1.
There exist some constants $R\ge 1$ and $\kappa>0$ such that

$$\sup_{t\ge 0}\ e^{\kappa t}\ \sup_{\|g\|_*\le 1}\ \sup_{x\in\mathbb{R}}\ \sup_{\vartheta\in\Theta}\ \frac{\bigl|\mathbf{E}_\vartheta\bigl(g(y_t)\,|\,y_0=x\bigr)-\pi_\vartheta(g)\bigr|}{1+|x|}\le R, \tag{5.1}$$

where the parameters $R$ and $\kappa$ are given in [10].

In this section we give the following inequality from [4],[17].
Proposition 6.1.
Let (Ω , F , ( F j ) ≤ j ≤ n , P ) be a filtered probability space and ( X j , F j ) ≤ j ≤ n be sequence of random variables such that for some p ≥ ≤ j ≤ n E | X j | p < ∞ . Define b j,n ( p ) = E ( | X j | n X k = j | E ( X k |F j ) | ) p/ /p . hen E | n X j =1 X j | p ≤ (2 p ) p/ n X j =1 b j,n ( p ) p/ . (6.1)Proof of this Proposition is given in Appendix. First note, that by Proposition A.1 and the H¨older inequality we obtain forany α ≥ t ≥ sup ϑ ∈ Θ E ϑ ( | y t | α | y = x ) ≤ α + 1) α/ ρ α . (7.1)Now we represent the deviation D T ( φ ) as D T ( φ ) = Z T ( φ ( y t ) − π ϑ ( φ ))d t + A ,T − A ,T = √ T ∆ T ( φ ) + A ,T − A ,T , (7.2)where A ,T = N X j =1 Z t j t j − (cid:16) φ ( y t j ) − φ ( y t ) (cid:17) d t and A ,T = Z TδN ( φ ( y t ) − π ϑ ( φ ))d t . To estimate the term A ,T we represent through the Ito formula the difference φ ( y t j ) − φ ( y t ) as φ ( y t j ) − φ ( y t ) = Z t j t L ϑ ( φ )( y s ) d s + Z t j t ˙ φ ( y s ) σ ( y s )d W s = e π ϑ ( φ )( t j − t ) + Ψ j ( t ) + Z t j t ˙ φ ( y s ) σ ( y s )d W s , where Ψ j ( t ) = Z t j t ψ ( y s ) d s , ω j ( t ) = Z t j t ˙ φ ( y s ) σ ( y s ) d W s and ψ ( y ) = L ϑ ( φ )( y ) − e π ϑ ( φ ). Now setting X j = Z t j t j − Ψ j ( t ) d t and η j = Z t j t j − ω j ( t )d t ,
10e obtain A ,T = e π ϑ ( φ ) N δ N X j =1 X j + N X j =1 η j . (7.3)To estimate the second term in the right-hand part of (7.3), we makeuse of the Proposition 6.1.We start with verifying its conditions. Putting F s = σ { y u , ≤ u ≤ s } , we obtain by Theorem 5.1, that for any t ≥ s and forany φ from the functional class (2.5) | E ϑ ( ψ ( y t ) |F s ) | ≤ µ ( φ ) R (1 + | y s | ) e − κ ( t − s ) ≤ ν R (1 + | y s | ) e − κ ( t − s ) . Therefore, for any k > j , | E ϑ ( X k |F t j ) | ≤ Re κ (1 + | y t j | ) ν δ e − κδ ( k − j ) . (7.4)It should be noted also, that the random variables X j are bounded, i.e. | X j | ≤ ν δ . To estimatie the probability tail for the sum P nj =1 X j we will use the inequality(6.1). For this we need to estmate the coefficients b j,N ( p ) for any p ≥
1. Fromhere, taking into account that 1 − e − κδ ≥ κδe − κ and that for p ≥ (cid:16) E ϑ (1 + | y t j | ) p/ (cid:17) /p ≤ (cid:16) E ϑ | y t j | p/ (cid:17) /p , we can estimate the coefficient b j,N ( p ) as b j,N ( p ) ≤ κ R e κ ς (cid:16) E | y t j | p/ ) /p (cid:17) , where ς = ν δ . Now the inequality (7.1) yields b j,N ( p ) ≤ R ς p p ≤ R ς p p , where R = 1 κ R e κ (1 + ρ ) . Using this in (6.1) we obtain, that for any p > E ϑ | N X k =1 X k | p ≤ (2 p ) p/ N p/ R p/ ς p (2 p ) p/ ≤ (2 p R ς ) p N p/ p p . P ϑ | N X k =1 X k | ≥ z √ N ! ≤ e p ln( a )+ p ln p with a = 2 p R ς/z . Minimizing now the right-hand part over p ≥
2, we obtainfor z ≥ e p R ν δ / P ϑ | N X k =1 X k | ≥ z √ N ! ≤ e − z/ς , (7.5)where ς = 2 e p R ς .Moreover, note that by the Burkholder-Davis-Gundy inequality, for any α ≥ E ϑ | ω j ( t ) | α ≤ ( α ) α/ ν α σ α max ( t j − t ) α/ . Using this and the the H¨older inequality, we get E ϑ | η j | α ≤ δ α − Z t j t j − E ϑ | ω j ( t ) | α d t ≤ δ α/ α α/ ν α σ α max . Note, that in this case in the right hand of the inequality (6.1) b j,N = (cid:0) E ϑ | η j | p (cid:1) /p . Therefore, similarly to the inequality (4.5) we find, that for all z ≥ ς , P ϑ | N X k =1 η k | ≥ z √ N ! ≤ e − z/ς , (7.6)where ς = √ eδ / ν σ max . Now from (7.3), (4.5)–(4.6) it follows that for z ≥ z P ϑ (cid:16) | A ,T | ≥ z √ N (cid:17) ≤ P ϑ | N X k =1 X k | ≥ z √ N / ! + P ϑ | N X k =1 η k | ≥ z √ N / ! ≤ e − z/ τ , (7.7)when the parameters z and τ are given in (3.5). Moreover, note that due to(2.5) the last term in (7.2) is bounded, i.e. | A ,T | ≤ δ k φ k ∗ ≤ δν ≤ z √ N / . z ≥ z one has P ϑ ( | D T ( φ ) | ≥ z √ N ) ≤ P ϑ (cid:16) √ T | ∆ T ( φ ) | + | A ,T | ≥ z √ N / (cid:17) ≤ P ϑ (cid:16) √ T | ∆ T ( φ ) | ≥ z √ N / (cid:17) + P ϑ (cid:16) | A ,T | ≥ z √ N / (cid:17) . Taking into account here, that
N/T ≥ (1 − δ ) /δ for any 0 < δ < T ≥ P ϑ ( | D T ( φ ) | ≥ z √ N ) ≤ P ϑ | ∆ T ( φ ) | ≥ z p (1 − δ )8 √ δ ! + P ϑ (cid:18) | A ,T | ≥ z √ N (cid:19) . Therefore, applying here the inequalities (4.2) and (7.7) we come to the upperbound (2.6) with the parameter κ given in (3.6). Hence Theorem 2.1. Firstly, note that in this case | ψ h,x | = | Ψ | , k ψ h,x k ∗ = 1 h k Ψ k ∗ and k ˙ ψ h,x k ∗ = 1 h k ˙Ψ k ∗ . Moreover, taking into account that | S ( y ) | ≤ M + L x ∗ + L | y | , we find thatsup | y |≤| x | +2 | S ( y ) | ≤ M , (7.8)where M is given in (3.7).Therefore, in view of the fact that 0 < h <
1, we can estimate from above theparametrs (2.4) as µ ( ψ h,x ) ≤ µ ∗ h − and e µ ( ψ h,x ) ≤ e µ ∗ h − , (7.9)where µ ∗ = max (cid:16) k ˙Ψ k ∗ , k ¨Ψ k ∗ (cid:17) M and e µ ∗ = max (cid:16) | ˙Ψ | , | ¨Ψ | (cid:17) M q ∗ . Therefore, the function ψ h,x belongs to the class (2.5) with the following pa-rameters ν = | Ψ | , ν = k Ψ k ∗ h , ν = k ˙Ψ k ∗ h , ν = µ ∗ h , ν = e µ ∗ h . κ ( | Ψ | ) and the param-eters (3.5) can be represented as z = δ / h max (cid:16) c ∗ µ ∗ , c ∗ k ˙Ψ k ∗ h , e µ ∗ hT / , k Ψ k ∗ h T − / (cid:17) τ = δ / h max (cid:16) c ∗ µ ∗ , c ∗ k ˙Ψ k ∗ h (cid:17) . (7.10)Therefore, thanks to the condition (2.8) for any T − / ≤ h ≤ z ≤ l − / T z ∗ and τ ≤ l − / T τ ∗ , (7.11)where the parameters z ∗ and τ ∗ are given in (3.9). Note now that, by thecondition (2.7) P ϑ (cid:16) | D T ( ψ h,x ) | ≥ a T (cid:17) ≤ P ϑ (cid:16) | D T ( ψ h,x ) | ≥ z √ N (cid:17) where z = a/ p l T . The first inequality in (7.11) implies that z ≥ z for all a ≥ a ∗ = z ∗ /l T . Moreover, from the last inequality in (7.11) it follows, thatfor a ≥ a ∗ min ( κ z , γ ) = min (cid:18) κ z , τ (cid:19) ≥ min κ z ∗ l T p l T , l T p l T τ ∗ ! . Taking into account here the definition of κ in (3.6) and the form for δ givenby (2.7) we obtain that for sufficiently large T min κ z ∗ l T p l T , l T p l T τ ∗ ! = l T p l T τ ∗ . Thus, through Theorem 2.1 we come to the inequality (2.10). Hence Theo-rem 2.2
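The change of scale used in this proof can be written out in one place. The display below is a sketch under two assumptions that are not stated explicitly in the text: the integer part in $N=[T/\delta]$ is ignored, and the bounds (7.11) are read as $z_0\le l_T^{-3/2}z_*$ and $\tau\le l_T^{-3/2}\tau_*$.

```latex
\begin{aligned}
\delta = \frac{1}{T\,l_T}
  &\;\Longrightarrow\; N \approx \frac{T}{\delta} = T^{2} l_T ,
  \qquad \sqrt{N} \approx T\sqrt{l_T}, \\
aT = z\sqrt{N}
  &\;\Longleftrightarrow\; z = \frac{a}{\sqrt{l_T}}, \\
z \ge z_0
  &\;\Longleftarrow\; \frac{a}{\sqrt{l_T}} \ge \frac{z_*}{l_T^{3/2}}
  \;\Longleftrightarrow\; a \ge \frac{z_*}{l_T} = a_* , \\
z\,\gamma = \frac{z}{4\tau}
  &\;\ge\; \frac{a}{\sqrt{l_T}}\cdot\frac{l_T^{3/2}}{4\tau_*}
  = a\,\gamma_*\,l_T .
\end{aligned}
```

Under these readings, the exponent $a\gamma_*l_T$ in (2.10) is the product $z\gamma$ from Theorem 2.1 after the rescaling $z=a/\sqrt{l_T}$.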
First we represent the tail probability as P ϑ (cid:16) | D T ( χ h,x ) | ≥ ǫ T T (cid:17) = I + I , where I = P ϑ N X j =1 χ h,x ( y t j )∆ t j ≤ ( π ϑ ( χ h,x ) − ǫ T ) T I = P ϑ N X j =1 χ h,x ( y t j )∆ t j ≥ ( π ϑ ( χ h,x ) + ǫ T ) T . Let us define now the following smoothing indicator functionsΨ ,η ( u ) = 1 η Z + ∞−∞ {| z |≤ − η } V (cid:18) z − uη (cid:19) d z and Ψ ,η ( u ) = 1 η Z + ∞−∞ {| z |≤ η } V (cid:18) z − uη (cid:19) d z , where η is a smoothing positive parameter which will be specified later, V is atwo times continuously differentiable even R → R function such that V ( z ) = 0for | z | ≥ Z − V ( z )d z = 1 . It is easy to see that, for any y ∈ R and 0 < η ≤ / ,η ( u )( y ) ≤ χ ( y ) ≤ Ψ ,η ( y )and Ψ ,η ( y ) = 0 for | y | ≥
2. Moreover, for the functions ψ i,h ( y ) = 1 h Ψ i,η (cid:18) y − z h (cid:19) using the inequality (A.4), we can estimate the difference between the coore-sponding ergodic intergals (1.4) as | π ϑ ( χ h,x ) − π ϑ ( ψ i,h ) | ≤ ηq ∗ . Therefore, choosing here η = ǫ T we obtain, for sufficiently large T , I i ≤ P ϑ (cid:0) | D T ( φ i,h ) | ≥ ǫ T T / (cid:1) . One can check directly that in this case the operator (3.8) has the followingasymptotic ( T → ∞ ) form k ∗ (Ψ i,η ) = O (cid:0) η − (cid:1) . Therefore, from (3.9) and (7.11) it follows that for T → ∞ and h ≥ T − / z ( φ i,h ) = O (cid:16) η − l − / T (cid:17) and τ ( φ i,h ) = O (cid:16) η − l − / T (cid:17) , z ( φ i,h ) = O ǫ T l / T ! and τ ( φ i,h ) = O ǫ T l / T ! . Now we have P ϑ (cid:16) | D T ( ψ h,x ) | ≥ ǫ T T (cid:17) ≤ P ϑ (cid:16) | D T ( ψ h,x ) | ≥ z √ N (cid:17) , where z = ǫ T / p l T . The last equality in (2.9) implies z ≥ z for sufficientlylarge T . Moreover, taking into account, that there exists a constant c ∗ > T κ z ≥ c ∗ T p l T ǫ T and γ ≥ c ∗ l T p l T ǫ T , i.e. for sufficiently large T min ( κ z , γ ) ≥ c ∗ l T p l T ǫ T . Therefore, by Theorem 2.1 for sufficiently large T P ϑ (cid:16) | D T ( ψ h,x ) | ≥ ǫ T T (cid:17) ≤ e − c ∗ l T ǫ T . Now the last condition in (2.9) yields the equality (2.12). Hence Theorem 2.3.
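The sandwich between the two smoothed indicators used in this proof can be illustrated numerically. The sketch below assumes, for illustration only, that the mollifier $V$ is the triweight density on $[-1,1]$ and that the two inner indicators are taken over $|z|\le 1-\eta$ and $|z|\le 1+\eta$; the function names are hypothetical.

```python
def kernel_cdf(t):
    """CDF of the triweight density V(s) = 35/32 (1 - s^2)^3 on [-1, 1]."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    value = 0.5 + 35.0 / 32.0 * (t - t**3 + 0.6 * t**5 - t**7 / 7.0)
    return min(1.0, max(0.0, value))


def smoothed_indicator(u, eta, inner):
    """(1/eta) * integral of 1{|z| <= r} V((z - u)/eta) dz with r = 1 - eta or 1 + eta."""
    radius = 1.0 - eta if inner else 1.0 + eta
    # Substituting s = (z - u)/eta turns the integral into a difference of CDF values.
    return kernel_cdf((radius - u) / eta) - kernel_cdf((-radius - u) / eta)


def indicator(u):
    """chi(u) = 1{|u| <= 1}, as in (2.11)."""
    return 1.0 if abs(u) <= 1.0 else 0.0
```

For any `eta` in `(0, 0.5]`, `smoothed_indicator(u, eta, True)` stays below `indicator(u)`, which in turn stays below `smoothed_indicator(u, eta, False)`, and the upper function vanishes for `abs(u) >= 2`.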
A Appendix
A.1 Proof of Proposition 6.1
We set h n ( t ) = E | S n − + tX n | p with S n = n X j =1 X j . By the induction method we assume that for any 1 ≤ k ≤ n − ≤ t ≤ h k ( t ) ≤ (2 p ) p/ B p/ k ( t ) , (A.1)where B k ( t ) = k − X j =1 b j,k ( p ) + tb k,k ( p ) . Note now that as is shown in [17] (Theorem 2.3) E | S n | p = p ( p − n X j =1 Z E | S j − + vX j | p − ( − vX j + Υ( j, n ))d v . (A.2)16ith Υ( j, n ) = X j n X k = j E ( X k |F j ) . Therefore, h n ( t ) = p ( p − n − X j =1 Z E | S j − + vX j | p − ( − vX j + G ( i, n, t ))d v + p ( p − Z E | S n − + vtX n | p − t (1 − v ) X n d v , where G ( j, n, t ) = Υ( j, n −
1) + tX j E ( X n |F j ) . Moreover, we can estimate h n ( t ) as h n ( t ) p ≤ n − X j =1 Z E | S j − + vX j | p − | G ( i, n, t ) | d v + Z t E | S n − + sX n | p − X n d s Now taking into account that for 0 ≤ t ≤ (cid:0) E | G ( j, n, t ) | p/ (cid:1) /p ≤ b j,n ( p ) , we obtain by the H¨older inequality Z E | S j − + vX j | p − | G ( i, n, t ) | d v ≤ Z h αj ( v ) b j,n ( p )d v , where α = 1 − /p . Therefore, h n ( t ) p ≤ n − X j =1 b j,n ( p ) Z h αj ( v )d v + b n,n ( p ) Z t h αn ( s )d s Now by the induction assumption for any 1 ≤ j ≤ n − b j,n ( p ) Z h αj ( v )d v ≤ (2 p ) ( p − / Z B ( p − / j ( v ) d v b j,n ( p ) . Moreover, taking into account that B j ( v ) ≤ j − X i =1 b i,n + vb j,n ( p ) ,
17e obtain that Z B ( p − / j ( v ) d v b j,n ( p ) ≤ p ( j X i =1 b i,n ) p/ − ( j − X i =1 b i,n ) p/ ! . This implies for any 0 ≤ t ≤ h n ( t ) ≤ k n Z t h αn ( v ) d v + f n (A.3)with k n = p b n,n ( p ) and f n = p n − X j =1 b j,n ( p ) p/ . Now by setting Z ( t ) = Z t h αn ( s )d s + f n k n , we obtain from (A.3) that ˙ Z ( t ) ≤ k αn Z α ( t ) . Now introducing g ( t ) = ˙ Z ( t ) − k αn Z α ( t ) , we obtain the differential equation˙ Z ( t ) = k αn Z α ( t ) + g ( t )with g ( t ) ≤
0. From here we obtain Z /p ( t ) = Z /p (0) + 2 p k αn t + Z tO g ( u ) Z α ( u ) d u ≤ Z /p (0) + 2 p k αn t , i.e. Z ( t ) ≤ (cid:18) Z /p (0) + 2 p k αn t (cid:19) p/ . Substituting this bound in (A.3) we obtain h n ( t ) ≤ k n Z ( t ) ≤ k n (cid:18) Z /p (0) + 2 p k αn t (cid:19) p/ = p n − X j =1 b j,n ( p ) + 2 ptb n,n ( p ) p/ . Hence Proposition 6.1. 18 .2 Uniform bound for the invariant density
Lemma A.1.
The invariant density (2.2) is uniformly bounded: sup x ∈ R sup ϑ ∈ Θ q ϑ ( x ) ≤ q ∗ < ∞ , (A.4) where the upper bound q ∗ is given in (3.2) . Proof.
First, note that through the definition of Θ we can check directly thatfor any | x | ≥ x ∗ Z x S ( v ) d v ≤ β | x | − β ( | x | − x ∗ ) , (A.5)where the coefficients β and β are given in (3.1). Therefore, taking intoaccount, that for | x | ≥ x ∗ Z x S ( v ) d v ≤ β x ∗ , we obtain that 2 sup x ∈ R Z x S ( v ) d v ≤ β x ∗ + β β . Estimating now the denominator in (2.2) from below as Z R σ − ( z ) e e S ( z ) d z ≥ Z σ − ( z ) d z ≥ σ max , and taking into account the definition of q ∗ we come to the upper uniformbound (A.4). Hence Proposition A.1. A.3 Moment bound for the process y t . Proposition A.1.
For any m ≥ t ≥ sup ϑ ∈ Θ E ϑ | y t | m ≤ m + 1)(2 m − ρ m ≤ m ) m ρ m , where ρ is given in (3.4) . Proof.
First note, that through the Ito formula we can write for the function z t ( m ) = E ϑ y mt the following intergal equality z t ( m ) = z ( m ) + 2 m Z t E ϑ y m − s S ( y s )d s + m (2 m − Z t E ϑ y m − s σ ( y s ) ds , z t ( m ) = 2 m E ϑ y m − s S ( y t ) + m (2 m − E ϑ y m − t σ ( y t ) . Taking into account here that sup x ∈ R σ ( x ) ≤ σ we obtain, that for any m ≥ t ≥ z t ( m ) ≤ m E ϑ y m − t S ( y t ) + m (2 m − σ z t ( m − . Now we need to estimate from above the function x m − S ( x ). Obviously, thatfor any K > x ∗ x m − S ( x ) ≤ K m − sup | x |≤ K | S ( x ) | {| x |≤ K } + x m S ( x ) x {| x | >K } . Taking into account that sup | x | > x ∗ | ˙ S ( x ) | ≤ L , we obtain, for any x ∈ [ x ∗ , K ], | S ( x ) | ≤ | S ( x ∗ ) | + L | x − x ∗ | ≤ M + L ( K − x ∗ ) . Similarly, we obtain the same upper bound for x ∈ [ − K, − x ∗ ]. Therefore,sup | x |≤ K | S ( x ) | ≤ M + L ( K − x ∗ ) . Consider now the case | x | > K . We recall, that sup | x |≥ x ∗ ˙ S ( x ) ≤ − L − .Therefore, S ( x ) x ≤ MK − K − x ∗ LK .
Choosing K = 2( x ∗ + M L ) yields S ( x ) x ≤ − L .
Therefore, x m − S ( x ) ≤ K m − ( M + L ( K − x ∗ )) − L x m {| x | >K } = K m − ( M + L ( K − x ∗ )) + β x m {| x |≤ K } − L x m ≤ A m − β x m , where A m = (2( x ∗ + M L )) m − (cid:0) M + x ∗ (cid:0) L + L − (cid:1) + 2 L M (cid:1) z t ( m ) ≤ m A m − L − m z t ( m ) + m (2 m − σ z t ( m − . We can rewrite this inequality as follows˙ z t ( m ) = − L − mz t ( m ) + m (2 m − σ z t ( m −
1) + ψ t , where sup t ≥ ψ t ≤ m A m . This equality provides z t ( m ) = z ( m ) e − mL − t + m (2 m − σ Z t e − mL − ( t − s ) z s ( m − s + Z t e − mL − ( t − s ) ψ s d s ≤ m (2 m − σ Z t e − mL − ( t − s ) z s ( m − s + B m , where B m = y m + 2 A m L . Setting B = 1 and resolving this inequality byrecurrence yields z t ( m ) ≤ m − m X j =0 (cid:0) σ L (cid:1) m − j B j . It is easy to see, that B m ≤ (cid:0) max (cid:0) | y | , x ∗ + M L ) (cid:1)(cid:1) m . Therefore sup t ≥ z t ( m ) ≤ m + 1)(2 m − ρ m ≤ m ) m ρ m , where ρ is defined in (3.4). Hence Proposition A.1. A.4 Properties of the function (4.5)
Lemma A.2.
For any integrated function φ the solution (4.5) is uniformbounded, i.e. sup ϑ ∈ Θ sup y ∈ R | v ϑ ( y ) | ≤ r , where the upper bound r is introduced in (3.3) . roof. Firstly we note, that for any ϑ from Θ and any intergated R → R function φ | π ϑ ( φ ) | ≤ q ∗ | φ | . Moreover, by the definition of the parameter β we get2 sup | u |≤ x ∗ | S ( u ) | ≤ β . Therefore, for 0 ≤ u ≤ x ∗ we can estimate the function v ϑ as | v ϑ ( u ) | ≤ e x ∗ β σ ((1 + q ∗ x ∗ ) | φ | + I ( φ )) , where β is given in (3.1) and I ( φ ) = Z ∞ x ∗ ( | φ ( y ) | + q ∗ | φ | ) e R y x ∗ S ( z )d z d y . To estimate this term note that similarly to (A.5) we can obtain that for any y ≥ a ≥ x ∗ Z ya S ( z )d z ≤ β ( y − a ) − β ( y − a ) . (A.6)Using this inequlity for a = x ∗ , we get I ( φ ) ≤ Z ∞ x ∗ | φ ( y ) | e β ( y − x ∗ ) − β ( y − x ∗ ) d y + q ∗ | φ | Z ∞ e β z − β z d z ≤ | φ | sup z ≥ e β z − β z + q ∗ | φ | Z ∞ e β z − β z d z ≤ | φ | ( υ + q ∗ υ ) , where the parameters υ and υ are introduced in (3.1). Therefore, taking intoaccount the definition (3.3), the last inequality impliessup ϑ ∈ Θ sup ≤ u ≤ x ∗ | v ϑ ( u ) | ≤ r . (A.7)If u ≥ x ∗ , then through the inequality (A.6) we estimate the function v ϑ ( u )from above as sup ϑ ∈ Θ sup u ≥ x ∗ | v ϑ ( u ) | ≤ | φ | σ ( υ + q ∗ υ ) ≤ r . Let now u ≤
0. Taking into account that Z R e φ ( y ) σ ( y ) exp { Z y S ( z ) d z } d y = 0 ,
22e can represent the function v ϑ as v ϑ ( u ) = 2 Z ∞| u | e φ ( − y ) σ ( − y ) e − R y | u | S ( − z ) d z d y . Similarly to (A.6), one can check directly, that for any y ≥ a ≥ x ∗ − Z ya S ( − z ) d z ≤ β ( y − a ) − β ( y − a ) . Therefore, by the same way as in the proof of (A.7) we can estimate thefunction v ϑ ( u ) as sup ϑ ∈ Θ sup u ≤ | v ϑ ( u ) | ≤ r . Hence Lemma A.2.
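The two elementary bounds behind the constants $\upsilon_1$ and $\upsilon_2$ in this proof can be checked numerically. The sketch below assumes that (3.1) reads $\upsilon_1=e^{\beta_1^2/(4\beta_2)}$ and $\upsilon_2=\sqrt{\pi/\beta_2}\,e^{\beta_1^2/(4\beta_2)}$, and it uses arbitrary illustrative values of $\beta_1$ and $\beta_2$; the function names are hypothetical.

```python
import math


def upsilon(beta1, beta2):
    """Constants (upsilon_1, upsilon_2) under the assumed reading of (3.1)."""
    peak = math.exp(beta1 * beta1 / (4.0 * beta2))
    return peak, math.sqrt(math.pi / beta2) * peak


def sup_and_integral(beta1, beta2, upper=60.0, n=200000):
    """Grid supremum and midpoint integral of exp(beta1 z - beta2 z^2) on [0, upper]."""
    step = upper / n
    best, total = 1.0, 0.0  # the value at z = 0 is exp(0) = 1
    for i in range(n):
        z = (i + 0.5) * step
        value = math.exp(beta1 * z - beta2 * z * z)
        best = max(best, value)
        total += value * step
    return best, total
```

With `beta1 = 2.0` and `beta2 = 0.5`, the grid supremum should not exceed the first constant and the integral should not exceed the second, which is how these constants enter the bound on $I(\phi)$.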
References

[1] P. Bertail, S. Clémençon, Sharp bounds for the tails of functionals of Markov chains. Theor. Veroyatnost i Primenen N3, (2009), 609-619.

[2] S. Boucheron, G. Lugosi, P. Massart, Concentration inequalities using the entropy method. The Ann. Probab. (2003) 1583-1614.

[3] D.A. Cattiaux, A. Guillin, Deviation bounds for additive functionals of Markov process. ESAIM PS, Vol 12, (2008) p. 12-29.

[4] J. Dedecker, P. Doukhan, A new covariance inequality and applications. Stochastic Proc. Appl. (2003) 63-80.

[5] J. Dedecker, C. Prieur, New dependence coefficients. Examples and applications to statistics. Probab. Theory and Related Fields (2005) 203-236.

[6] L. Galtchouk, S. Pergamenshchikov, Sequential nonparametric adaptive estimation of the drift coefficient in diffusion processes. Mathematical Methods of Statistics (2001) 316-330.

[7] L. Galtchouk, S. Pergamenshchikov, Asymptotically efficient sequential kernel estimates of the drift coefficient in ergodic diffusion processes. Statistic Inferences for Stochastic Processes (2006) 1-16.

[8] L. Galtchouk, S. Pergamenshchikov, Uniform concentration inequality for ergodic diffusion processes. Stochastic Processes and their applications (2007) 830-839.

[9] L. Galtchouk, S. Pergamenshchikov, Adaptive sequential estimation for ergodic diffusion processes in quadratic metric. Journal of Nonparametric Statistics (2) (2011) 255-285.

[10] L. Galtchouk, S. Pergamenshchikov, Geometric ergodicity for families of homogeneous Markov chains. http://hal.archives-ouvertes.fr/hal-00455976/fr (2011)

[11] I.I. Gihman, A.V. Skorohod, Stochastic differential equations. Springer, New York, 1972.

[12] R.Sh. Liptser, A.N. Shiryaev, Statistics of random processes, I, Springer, New York, 1977.

[13] P. Massart, Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse Math. (2000) 245-303.

[14] V. Maume-Deschamps, Concentration inequalities and estimation of conditional probabilities. Université de Bourgogne, Décembre 2004, Prépublication n. 396.

[15] S. Meyn, R. Tweedie, Markov Chains and Stochastic Stability. Springer Verlag, 1993.

[16] S.M. Pergamenshchikov, On large deviation probabilities in ergodic theorem for singularly perturbed stochastic systems. Weierstrass Institut für Angewandte Analysis und Stochastik, Berlin, 1998, Preprint n. 414.

[17] E. Rio, Théorie asymptotique des processus faiblement dépendants. Collection: Mathématiques & Applications, Springer, Berlin, 2000.

[18] A.Yu. Veretennikov, On large deviations for diffusion processes with measurable coefficients. Uspekhi Mat. Nauk, 50