A least square-type procedure for parameter estimation in stochastic differential equations with additive fractional noise
arXiv [math.PR]
ANDREAS NEUENKIRCH AND SAMY TINDEL
Abstract.
We study a least square-type estimator for an unknown parameter in the drift coefficient of a stochastic differential equation with additive fractional noise of Hurst parameter $H > 1/2$. The estimator is based on discrete time observations of the stochastic differential equation, and using tools from ergodic theory and stochastic analysis we derive its strong consistency.

1. Introduction
In this article, we will consider the following $\mathbb{R}^d$-valued stochastic differential equation (SDE)

$$Y_t = y_0 + \int_0^t b(Y_s; \vartheta_0)\, ds + \sum_{j=1}^m \sigma_j B_t^j, \qquad t \in [0,T]. \tag{1}$$

Here $y_0 \in \mathbb{R}^d$ is a given initial condition, $B = (B^1, \ldots, B^m)$ is an $m$-dimensional fractional Brownian motion (fBm) with Hurst parameter $H \in (0,1)$, the unknown parameter $\vartheta_0$ lies in a certain set $\Theta$ which will be specified later on, $\{ b(\cdot; \vartheta), \vartheta \in \Theta \}$ is a family of drift coefficients with $b(\cdot; \vartheta) : \mathbb{R}^d \to \mathbb{R}^d$, and $\sigma_1, \ldots, \sigma_m \in \mathbb{R}^d$ are assumed to be known diffusion coefficients.

Let us recall that $B$ is a centred Gaussian process defined on a complete probability space $(\Omega, \mathcal{F}, P)$. Its law is thus characterized by its covariance function, which is defined by

$$E\big( B_t^i B_s^j \big) = \frac{1}{2} \big( |t|^{2H} + |s|^{2H} - |t-s|^{2H} \big)\, \mathbf{1}_{\{i = j\}}, \qquad s, t \in \mathbb{R}.$$

The variance of the increments of $B$ is then given by

$$E | B_t^i - B_s^i |^2 = |t-s|^{2H}, \qquad s, t \in \mathbb{R}, \quad i = 1, \ldots, m,$$

and this implies that almost surely the fBm paths are $\gamma$-Hölder continuous for any $\gamma < H$. Furthermore, for $H = 1/2$, fBm coincides with the usual Brownian motion, which makes the family $\{ B^H, H \in (0,1) \}$ the most natural generalization of this classical process. In the current paper we assume that the Hurst coefficient satisfies $H > 1/2$ and we focus on the estimation of the parameter $\vartheta_0 \in \Theta$. Note that the Hurst parameter and the diffusion coefficients can be estimated via the quadratic variation of $Y$, see e.g. [1, 4, 15].

Date: May 3, 2019.
2010 Mathematics Subject Classification.
Key words and phrases. fractional Brownian motion, parameter estimation, least square procedure, ergodicity.
S. Tindel is member of the BIGS (Biology, Genetics and Statistics) team at INRIA.
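The covariance function of fBm recalled in the introduction can be turned directly into a small simulation routine. The following is a minimal sketch, assuming a one-dimensional fBm sampled on a grid of strictly positive times via a Cholesky factorization of the covariance matrix; the function names `fbm_covariance` and `sample_fbm`, the grid and the small diagonal jitter are illustrative choices and are not taken from the paper.

```python
import numpy as np

def fbm_covariance(times, hurst):
    """Covariance matrix 0.5 * (|t|^{2H} + |s|^{2H} - |t - s|^{2H}) on a time grid."""
    t = np.asarray(times, dtype=float)
    s, u = np.meshgrid(t, t, indexing="ij")
    return 0.5 * (np.abs(s) ** (2 * hurst)
                  + np.abs(u) ** (2 * hurst)
                  - np.abs(s - u) ** (2 * hurst))

def sample_fbm(times, hurst, rng, jitter=1e-12):
    """Draw one fBm path on `times` (distinct, nonzero) by Cholesky factorization.

    The jitter is a numerical safeguard (an assumption of this sketch),
    added so that the factorization does not fail from rounding errors.
    """
    cov = fbm_covariance(times, hurst)
    chol = np.linalg.cholesky(cov + jitter * np.eye(len(cov)))
    return chol @ rng.standard_normal(len(cov))

rng = np.random.default_rng(0)
grid = np.linspace(0.01, 1.0, 100)   # illustrative grid, excludes t = 0
path = sample_fbm(grid, 0.7, rng)    # one path with H = 0.7
```

For $H = 1/2$ the matrix reduces to $\min(s,t)$, the covariance of standard Brownian motion, which gives a quick sanity check. The Cholesky approach costs $O(n^3)$ operations and is meant for small grids only.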
Estimators for the unknown parameter in equation (1) based on continuous observation of $Y$ have been studied e.g. in [2, 14, 17, 18, 20, 21, 22]. Estimators based on discrete time data, which are important for practical applications, are then obtained via discretization. However, to the best of our knowledge no genuine estimators based on discrete time data have been analyzed yet.

We propose here a least square estimator for $\vartheta_0$ based on discrete observations of the process $Y$ at times $\{ t_k ;\ 0 \le k \le n \}$. For simplicity, we shall take equally spaced observation times with $t_{k+1} - t_k = \kappa n^{-\alpha} := \alpha_n$ with given $\alpha \in (0,1)$, $\kappa > 0$. We call our method a least square-type procedure, insofar as we consider a quadratic statistic of the form

$$Q_n(\vartheta) = \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \Big( \big| \delta Y_{t_k t_{k+1}} - b(Y_{t_k}; \vartheta)\, \alpha_n \big|^2 - \|\sigma\|^2 \alpha_n^{2H} \Big), \tag{2}$$

where $\delta Y_{u_1 u_2} := Y_{u_2} - Y_{u_1}$ for any $0 \le u_1 \le u_2 \le T$ and $\|\sigma\|^2 = \sum_{j=1}^m |\sigma_j|^2$.

Let us now describe the assumptions under which we shall work, starting from a standard hypothesis on the parameter set $\Theta$:

Hypothesis 1.1.
The set $\Theta$ is compactly embedded in $\mathbb{R}^q$ for a given $q \ge 1$.

In order to describe the assumptions on our coefficients $b$, we will use the following notation for partial derivatives:

Notation 1.2.
Let $f : \mathbb{R}^d \times \Theta \to \mathbb{R}$ be a $C^{p_1, p_2}$ function for $p_1, p_2 \ge 1$. Then for any tuple $(i_1, \ldots, i_p) \in \{1, \ldots, d\}^p$, we set $\partial_{i_1 \ldots i_p} f$ for $\frac{\partial^p f}{\partial x_{i_1} \cdots \partial x_{i_p}}$. Moreover, we will write $\partial_x f$ resp. $\partial_\vartheta f$ for the Jacobi matrices $(\partial_{x_1} f, \ldots, \partial_{x_d} f)$ and $(\partial_{\vartheta_1} f, \ldots, \partial_{\vartheta_q} f)$.

With this notation in mind, our drift coefficients and their derivatives will satisfy a polynomial growth condition, plus an inward condition which is traditional for estimation procedures in the Brownian diffusion case (see e.g. [5, 16]):
Hypothesis 1.3.
We have $b \in C^{1,1}(\mathbb{R}^d \times \Theta; \mathbb{R}^d)$ and there exist constants $c_1, c_2 > 0$ and $N \in \mathbb{N}$ such that:

(i) For every $x, y \in \mathbb{R}^d$ and $\vartheta \in \Theta$ we have
$$\langle b(x; \vartheta) - b(y; \vartheta),\, x - y \rangle \le -c_1 |x - y|^2 .$$

(ii) For every $x \in \mathbb{R}^d$ and $\vartheta \in \Theta$ the following growth bounds are satisfied:
$$|b(x; \vartheta)| \le c_2 \big( 1 + |x|^N \big), \qquad |\partial_x b(x; \vartheta)| \le c_2 \big( 1 + |x|^N \big), \qquad |\partial_\vartheta b(x; \vartheta)| \le c_2 \big( 1 + |x|^N \big).$$

As a consequence of the above assumptions on the drift coefficient and the initial condition, for given $\vartheta_0 \in \Theta$ the solution of equation (1) converges for $t \to \infty$ to a stationary and ergodic stochastic process $(\overline{Y}_t, t \ge 0)$, see the next section.

Finally, we also assume that our drift coefficient is of gradient type, i.e.:

Hypothesis 1.4.
There exists a function $U \in C^{2,1}(\mathbb{R}^d \times \Theta; \mathbb{R})$ such that
$$\partial_x U(x; \vartheta) = b(x; \vartheta), \qquad x \in \mathbb{R}^d, \ \vartheta \in \Theta.$$

With those assumptions in mind, we obtain the following convergence result:
Theorem 1.5.
Assume that the Hypotheses 1.1, 1.3 and 1.4 are satisfied for equation (1) and that we moreover have $H > 1/2$. Let $Q_n(\vartheta)$ be defined by (2). Then we have

$$\sup_{\vartheta \in \Theta} \Big| Q_n(\vartheta) - \big( E |b(\overline{Y}_0; \vartheta)|^2 - E |b(\overline{Y}_0; \vartheta_0)|^2 \big) \Big| \to 0 \tag{3}$$

in the $P$-almost sure sense.

This convergence is in contrast to the case $H = 1/2$, i.e. to the case of SDEs with additive Brownian noise. There it holds

$$\sup_{\vartheta \in \Theta} \Big| Q_n(\vartheta) - E |b(\overline{Y}_0; \vartheta) - b(\overline{Y}_0; \vartheta_0)|^2 \Big| \to 0 \tag{4}$$

in the $P$-almost sure sense, see Remark 3.6, and usually the consistent least squares estimator

$$\operatorname{argmin}_{\vartheta \in \Theta} \sum_{k=0}^{n-1} \big| \delta Y_{t_k t_{k+1}} - b(Y_{t_k}; \vartheta)\, \alpha_n \big|^2$$

is considered, see e.g. [5, 16]. The difference in the limits (3) and (4) is due to the higher smoothness and long-range dependence of fractional Brownian motion for $H > 1/2$. Our estimator can thus be seen as a "zero squares" estimator instead of a classical least square estimator. In order to show its convergence, we shall work under the following natural identifiability assumption:

Hypothesis 1.6.
For any $\vartheta_0 \in \Theta$, we have $E |b(\overline{Y}_0; \vartheta_0)|^2 = E |b(\overline{Y}_0; \vartheta)|^2$ iff $\vartheta = \vartheta_0$.

With this additional hypothesis, the main result of the current article is the consistency of the zero squares estimator based on the statistic $Q_n$:

Theorem 1.7.
Assume that the Hypotheses 1.1, 1.3, 1.4 and 1.6 are satisfied for equation (1) and let $H > 1/2$. Let $Q_n(\vartheta)$ be defined by (2), and let $\widehat{\vartheta}_n = \operatorname{argmin}_{\vartheta \in \Theta} |Q_n(\vartheta)|$. Then for any $\vartheta_0 \in \Theta$, we have $\lim_{n \to \infty} \widehat{\vartheta}_n = \vartheta_0$ in the $P$-almost sure sense.

Note that minimizing $|Q_n(\vartheta)|$ is of course equivalent to finding the zero of $Q_n(\vartheta)$. Let us briefly compare Theorem 1.7 with the existing literature on estimation procedures for fBm driven equations:

(i) Most of the previous results, see e.g. [2, 14, 17, 21], deal with the one-dimensional fractional Ornstein-Uhlenbeck process in a continuous observation setting. In particular, for this process simple continuous time least-square estimators are obtained in [2, 14], for which convergence rates and asymptotic error distributions are also derived. Compared to these results our estimation procedure covers a broad class of ergodic multi-dimensional equations and relies on discrete data only.

(ii) A general estimation procedure based on moment matching is established in [20]. However, the main assumption in [20] is that many independent observations of sample paths over a short time interval are available, which is not the case in many practical situations, where one sample path is instead discretely observed over a long time period. Let us also mention the article [3], in which a general discrete data maximum likelihood type procedure has been designed for parameter estimation in both the drift and diffusion coefficients, however without proof of consistency.

(iii) Our current work probably compares best with the maximum likelihood estimator analyzed in [22]. The latter pioneering reference focused on one-dimensional SDEs of the form $dY_t = \vartheta\, h(Y_t)\, dt + dB_t$ with $h : \mathbb{R} \to \mathbb{R}$ satisfying suitable regularity assumptions.
Strong consistency is obtained for the continuous time estimator and also for a discretized version of the estimator. However, the discretized estimator involves rather complicated operators related to the kernel functions arising in the Wiener-integral representation of fBm, which are avoided in our approach. Moreover, in contrast to [22] the consistency proof for our estimator does not rely on Malliavin calculus methods.

So, in view of the existing results in the literature, Theorem 1.7 can be seen as a step towards simple and implementable parameter estimation procedures for SDEs driven by fBm.

Finally, let us comment on the assumptions we have imposed on the drift coefficient and on the Hurst parameter:

(a) The hypotheses of Theorem 1.5 are standard for the case $H = 1/2$, except Hypothesis 1.4, which restricts us to gradient-type drift coefficients. We require this condition to show an ergodic-type result for weighted sums of the increments of fBm, see Lemma 3.2. However, Hypothesis 1.4 is also implicitly present in the additional condition of [16, Theorem 1].

(b) It can easily be shown that whenever $\vartheta$ is a one-dimensional coefficient (namely for $q = 1$), Hypothesis 1.6 is satisfied if the drift coefficient is of the form $b(x; \vartheta) = \vartheta h(x)$ for some $h : \mathbb{R}^d \to \mathbb{R}^d$ and if the stationary solution is non-degenerate, i.e. we have $E |\overline{Y}_0|^2 \neq 0$. The latter conditions hold in particular in the case of the ergodic fractional Ornstein-Uhlenbeck process. It would be nice to obtain criteria for richer classes of examples, but this would rely on differentiability and non-degeneracy properties of the map $\vartheta \mapsto E |b(\overline{Y}_0; \vartheta)|^2$ (see [11] in the Markovian case). We wish to investigate this question in future works.

(c) Even if the noise enters additively in our equation, we still need the assumption $H > 1/2$ in order to prove Theorem 1.7. Indeed, this hypothesis ensures the convergence of some deterministic and stochastic Riemann sums in the computations below (see Remark 3.3 for further details).
Whether an adaptation of the proposed zero squares estimator is also convergent in the case $H < 1/2$ remains an open problem.

Let us finish this introduction with the simplest example of an equation which satisfies the above assumptions: namely the one-dimensional fractional Ornstein-Uhlenbeck process given by

$$dY_t = \vartheta_0 Y_t\, dt + dB_t, \qquad Y_0 = y_0 \in \mathbb{R},$$

with $\vartheta_0 < 0$. The solution of this SDE reads as

$$Y_t = y_0 \exp(\vartheta_0 t) + \exp(\vartheta_0 t) \int_0^t \exp(-\vartheta_0 s)\, dB_s .$$

For $t \to \infty$ this process converges to the stationary fractional Ornstein-Uhlenbeck process

$$\exp(\vartheta_0 t) \int_{-\infty}^t \exp(-\vartheta_0 s)\, dB_s, \qquad t \ge 0,$$

see e.g. [7]. Here straightforward computations yield the explicit estimator

$$\widehat{\vartheta}_n = \frac{\sum_{k=0}^{n-1} Y_{t_k}\, \delta Y_{t_k t_{k+1}}}{\sum_{k=0}^{n-1} Y_{t_k}^2\, \alpha_n} - \sqrt{ \left( \frac{\sum_{k=0}^{n-1} Y_{t_k}\, \delta Y_{t_k t_{k+1}}}{\sum_{k=0}^{n-1} Y_{t_k}^2\, \alpha_n} \right)^2 - \frac{\sum_{k=0}^{n-1} \big( |\delta Y_{t_k t_{k+1}}|^2 - \alpha_n^{2H} \big)}{\sum_{k=0}^{n-1} Y_{t_k}^2\, \alpha_n^2} } .$$

Notice that even in the case of the fractional Ornstein-Uhlenbeck process, we could neither prove the consistency nor show the inconsistency of our estimator for $H < 1/2$.

The remainder of this paper is structured as follows: In Section 2 we give some auxiliary results on stochastic calculus for fractional Brownian motion. Section 3 is then devoted to the proof of our main theorems.

2. Auxiliary Results
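As a numerical companion to the fractional Ornstein-Uhlenbeck example that closes the introduction, the explicit estimator can be sketched as follows. This is a minimal sketch under stated assumptions: scalar observations, drift $b(x; \vartheta) = \vartheta x$, $\sigma = 1$, a known Hurst parameter and a path that is not identically zero. The function names `q_statistic` and `zero_squares_ou` are hypothetical, and the fallback to the vertex of the parabola when $Q_n$ has no real zero is a choice of this sketch (it minimizes $|Q_n|$ in that case) that is not discussed in the paper.

```python
import numpy as np

def q_statistic(theta, y, step, hurst):
    """Q_n(theta) from (2) for b(x; theta) = theta * x and sigma = 1."""
    y = np.asarray(y, dtype=float)
    increments = np.diff(y)              # delta Y_{t_k t_{k+1}}, k = 0, ..., n-1
    left = y[:-1]                        # Y_{t_k}
    n = increments.size
    residual = increments - theta * left * step
    return np.sum(residual ** 2 - step ** (2 * hurst)) / (n * step ** 2)

def zero_squares_ou(y, step, hurst):
    """Minimizer of |Q_n| for the scalar linear drift.

    n * step^2 * Q_n(theta) = a * theta^2 - 2 * b * theta + c is a parabola in theta.
    If it has real zeros, the smaller one is returned (the root with the minus
    sign, as in the explicit formula); otherwise the vertex b / a is returned.
    """
    y = np.asarray(y, dtype=float)
    increments = np.diff(y)
    left = y[:-1]
    a = step ** 2 * np.sum(left ** 2)    # assumed positive
    b = step * np.sum(left * increments)
    c = np.sum(increments ** 2 - step ** (2 * hurst))
    disc = b * b - a * c
    if disc < 0:
        return b / a
    return (b - np.sqrt(disc)) / a

# Illustrative (made-up) observations with step alpha_n = 0.5 and H = 0.7.
observations = [1.0, 0.5, 0.3, 0.2]
theta_hat = zero_squares_ou(observations, 0.5, 0.7)
```

When the discriminant is nonnegative, `theta_hat` is an exact zero of `q_statistic`, which mirrors the remark that minimizing $|Q_n(\vartheta)|$ amounts to finding the zero of $Q_n(\vartheta)$.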
2.1. Ergodic Properties of the SDE.
To deduce the ergodic properties of SDE (1) we will work without loss of generality on the canonical probability space $(\Omega, \mathcal{F}, P)$, i.e. $\Omega = C_0(\mathbb{R}, \mathbb{R}^m)$ equipped with the compact open topology, $\mathcal{F}$ is the corresponding Borel-$\sigma$-algebra and $P$ is the distribution of the fractional Brownian motion $B$, which is consequently given here by the canonical process $B_t(\omega) = \omega(t)$, $t \in \mathbb{R}$. Together with the shift operators $\theta_t : \Omega \to \Omega$ defined by

$$\theta_t \omega(\cdot) = \omega(\cdot + t) - \omega(t), \qquad t \in \mathbb{R}, \ \omega \in \Omega,$$

the canonical probability space defines an ergodic metric dynamical system, see e.g. [9]. In particular, the measure $P$ is invariant under the shift operators $\theta_t$, i.e. the shifted process $(B_s(\theta_t \cdot))_{s \in \mathbb{R}}$ is still an $m$-dimensional fractional Brownian motion, and for any integrable random variable $F : \Omega \to \mathbb{R}$ we have

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T F(\theta_t(\omega))\, dt = E F$$

for $P$-almost all $\omega \in \Omega$. Owing to the results in Section 4 of [7] we have the following:

Theorem 2.1.
Let Hypothesis 1.3 hold. Then for any $\vartheta_0 \in \Theta$ the following holds:

(i) Equation (1) admits a unique solution $Y$ in $C^\lambda(\mathbb{R}_+; \mathbb{R}^d)$ for all $\lambda < H$.

(ii) There exists a random variable $\overline{Y} : \Omega \to \mathbb{R}^d$ such that
$$\lim_{t \to \infty} | Y_t(\omega) - \overline{Y}(\theta_t \omega) | = 0$$
for $P$-almost all $\omega \in \Omega$. Moreover, we have $E |\overline{Y}|^p < \infty$ for all $p \ge 1$.

Note that the law of $\overline{Y}$ must coincide with the attracting invariant measure for (1) given in [10], see also [12, 13]. Moreover, proceeding as in [7] we have:

Proposition 2.2.
Assume Hypothesis 1.3 holds true. Then for any $\vartheta_0 \in \Theta$ and $p \ge 1$ there exist constants $c_p, k_p > 0$ such that

$$E |Y_t|^p \le c_p, \qquad E |Y_t - Y_s|^p \le k_p |t - s|^{pH}, \qquad \text{for all } s, t \ge 0.$$

The integrability of $\overline{Y}$ now implies the ergodicity of equation (1):
Proposition 2.3.
Assume Hypothesis 1.3 holds true. Then for any $\vartheta_0 \in \Theta$ and any $f \in C^1(\mathbb{R}^d; \mathbb{R})$ such that

$$|f(x)| + |\partial_x f(x)| \le c \big( 1 + |x|^N \big), \qquad x \in \mathbb{R}^d,$$

for some $c > 0$, $N \in \mathbb{N}$, we have

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(Y_t)\, dt = E f(\overline{Y}) \qquad P\text{-a.s.}$$

Proof.
Since the shift operator is ergodic and $f$ has polynomial growth, we have

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(\overline{Y}(\theta_t))\, dt = E f(\overline{Y}) \qquad P\text{-a.s.}$$

Moreover, since $\lim_{t \to \infty} | Y_t(\omega) - \overline{Y}(\theta_t \omega) | = 0$ by Theorem 2.1 and $f$ is polynomially Lipschitz, the assertion easily follows. □

2.2. Generalized Riemann-Stieltjes Integrals.
We set

$$\| f \|_{\infty; [a,b]} = \sup_{t \in [a,b]} |f(t)|, \qquad |f|_{\lambda; [a,b]} = \sup_{s, t \in [a,b],\, s \neq t} \frac{|f(t) - f(s)|}{|t - s|^\lambda},$$

where $f : \mathbb{R} \to \mathbb{R}^n$ and $\lambda \in (0,1)$.

Now, let $f \in C^\lambda([a,b]; \mathbb{R})$ and $g \in C^\mu([a,b]; \mathbb{R})$ with $\lambda + \mu > 1$. Then it is well known that the Riemann-Stieltjes integral $\int_a^b f(x)\, dg(x)$ exists, see e.g. [24]. Also, the classical chain rule for the change of variables remains valid, see e.g. [25]: Let $f \in C^\lambda([a,b]; \mathbb{R})$ with $\lambda > 1/2$ and $F \in C^1(\mathbb{R}; \mathbb{R})$. Then we have

$$F(f(y)) - F(f(a)) = \int_a^y F'(f(x))\, df(x), \qquad y \in [a,b]. \tag{5}$$

Moreover, one has a density type formula: let $f, h \in C^\lambda([a,b]; \mathbb{R})$ and $g \in C^\mu([a,b]; \mathbb{R})$ with $\lambda + \mu > 1$. Then for $\varphi : [a,b] \to \mathbb{R}$,

$$\varphi(y) = \int_a^y f(x)\, dg(x), \qquad y \in [a,b],$$

we have

$$\int_a^b h(x)\, d\varphi(x) = \int_a^b h(x) f(x)\, dg(x). \tag{6}$$

For later use, we also note the following estimate, which can be found e.g. in [24].

Proposition 2.4.
Let $f, g$ be as above. There exists a constant $c_{\lambda,\mu}$ (independent of $a, b$) such that

$$\left| \int_a^b (f(s) - f(a))\, dg(s) \right| \le c_{\lambda,\mu}\, |f|_{\lambda; [a,b]}\, |g|_{\mu; [a,b]}\, |b - a|^{\lambda + \mu}$$

holds for all $a, b \in [0, \infty)$.
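To illustrate how this estimate is typically used on a single observation interval, the following worked bound specializes it to $[a,b] = [t_k, t_{k+1}]$, which has length $\alpha_n$, and to the same Hölder exponent $\lambda$ for both functions. This specialization is an illustration under those assumptions and is not stated in this form in the text.

```latex
\left| \int_{t_k}^{t_{k+1}} \big( f(s) - f(t_k) \big)\, dg(s) \right|
  \le c_{\lambda,\lambda}\, |f|_{\lambda; [t_k, t_{k+1}]}\, |g|_{\lambda; [t_k, t_{k+1}]}\, \alpha_n^{2\lambda},
  \qquad 2\lambda > 1 .
```

The requirement $2\lambda > 1$ forces $\lambda > 1/2$. For paths that are $\lambda$-Hölder continuous only for $\lambda < H$, such a choice of $\lambda$ exists precisely when $H > 1/2$.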
2.3. The Garcia-Rodemich-Rumsey Lemma.
We will use the following variant of the Garcia-Rodemich-Rumsey Lemma [8]:
Lemma 2.5.
Let $q > 1$, $\alpha \in (1/q, 1)$ and $f : [0, \infty) \to \mathbb{R}$ be a continuous function. Then there exists a constant $c_{\alpha,q} > 0$, depending only on $\alpha, q$, such that

$$|f|^q_{\alpha - 1/q;\, [s,t]} \le c_{\alpha,q} \int_s^t \int_s^t \frac{|f(u) - f(v)|^q}{|u - v|^{1 + q\alpha}}\, du\, dv .$$

2.4. A Lemma on Pathwise Convergence Rates.
The following Lemma (see e.g. [19]), which is a direct consequence of the Borel-Cantelli Lemma, allows us to turn convergence rates in the $p$-th mean into pathwise convergence rates.

Lemma 2.6.
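Let $\alpha > 0$, $p_0 \in \mathbb{N}$ and $c_p \in [0, \infty)$ for $p \ge p_0$. In addition, let $Z_n$, $n \in \mathbb{N}$, be a sequence of random variables such that

$$( E |Z_n|^p )^{1/p} \le c_p \cdot n^{-\alpha}$$

for all $p \ge p_0$ and all $n \in \mathbb{N}$. Then for all $\varepsilon > 0$ there exists a random variable $\eta_\varepsilon$ such that

$$|Z_n| \le \eta_\varepsilon \cdot n^{-\alpha + \varepsilon} \qquad \text{a.s.}$$

for all $n \in \mathbb{N}$. Moreover, $E |\eta_\varepsilon|^p < \infty$ for all $p \ge 1$.

2.5. Quadratic Variations of Fractional Brownian Motion.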
The following result for the behavior of the quadratic variations of a one-dimensional fractional Brownian motion $\beta$ with Hurst parameter $H$ is well known, see e.g. [23]. Indeed, for $H < 3/4$ we have

$$E \Big| \frac{1}{n} \sum_{k=0}^{n-1} \big[ |\delta \beta_{k\, k+1}|^2 - 1 \big] \Big|^2 \le c_H \cdot \frac{1}{n}, \tag{7}$$

while for $H = 3/4$, $n > 1$, it holds

$$E \Big| \frac{1}{n} \sum_{k=0}^{n-1} \big[ |\delta \beta_{k\, k+1}|^2 - 1 \big] \Big|^2 \le c_{3/4} \cdot \frac{\log(n)}{n}. \tag{8}$$

Finally, if $H \in (3/4, 1)$ then

$$E \Big| \frac{1}{n} \sum_{k=0}^{n-1} \big[ |\delta \beta_{k\, k+1}|^2 - 1 \big] \Big|^2 \le c_H \cdot n^{4H - 4}. \tag{9}$$

Here, $c_H > 0$ denotes a constant depending only on $H$.

3. Proof of Theorems 1.5 and 1.7
We will denote constants whose particular value is not important (and which do not depend on $\vartheta$ or $n$) by $c$, regardless of their value.

Recall that

$$Q_n(\vartheta) = \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \Big( \big| \delta Y_{t_k t_{k+1}} - b(Y_{t_k}; \vartheta)\, \alpha_n \big|^2 - \|\sigma\|^2 \alpha_n^{2H} \Big).$$

For $t \ge 0$, setting

$$F_t = \sum_{j=1}^m \sigma_j B_t^{(j)} \qquad \text{and} \qquad r_k = \int_{t_k}^{t_{k+1}} \big( b(Y_u; \vartheta_0) - b(Y_{t_k}; \vartheta_0) \big)\, du,$$

and moreover using the notation

$$\delta_{\vartheta_0 \vartheta} b(x) = b(x; \vartheta) - b(x; \vartheta_0) \qquad \text{and} \qquad \delta F_{t_k t_{k+1}} = F_{t_{k+1}} - F_{t_k},$$

it is readily checked that

$$\begin{aligned}
Q_n(\vartheta) ={}& \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} |\delta_{\vartheta_0 \vartheta} b(Y_{t_k})|^2 \alpha_n^2 - \frac{2}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle \delta_{\vartheta_0 \vartheta} b(Y_{t_k}), \delta F_{t_k t_{k+1}} \rangle\, \alpha_n \\
&+ \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \big( |\delta F_{t_k t_{k+1}}|^2 - \|\sigma\|^2 \alpha_n^{2H} \big) + \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} |r_k|^2 \\
&- \frac{2}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle \delta_{\vartheta_0 \vartheta} b(Y_{t_k}), r_k \rangle\, \alpha_n + \frac{2}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle \delta F_{t_k t_{k+1}}, r_k \rangle .
\end{aligned} \tag{10}$$

Note that our assumptions on the drift coefficient imply that

$$\sup_{\vartheta \in \Theta} |b(x; \vartheta) - b(y; \vartheta)| \le c \big( 1 + |x|^N + |y|^N \big) \cdot |x - y|$$

for all $x, y \in \mathbb{R}^d$ and

$$|b(x; \vartheta_1) - b(x; \vartheta_2)| \le c \big( 1 + |x|^N \big) \cdot |\vartheta_1 - \vartheta_2|$$

for all $x \in \mathbb{R}^d$ and $\vartheta_1, \vartheta_2 \in \Theta$. So, straightforward estimations using Proposition 2.2 give

$$E |r_k|^p \le c \cdot \alpha_n^{p(1+H)} .$$

Hence for all $p \ge 1$ it holds

$$E \Big| \sum_{k=0}^{n-1} |r_k|^2 \Big|^p \le c \cdot n^p\, \alpha_n^{2p(1+H)},$$

and Lemma 2.6 implies

$$\lim_{n \to \infty} \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} |r_k|^2 = 0 \qquad P\text{-a.s.} \tag{11}$$

Using Proposition 2.2 and Lemma 2.6 again, it follows similarly

$$\lim_{n \to \infty} \sup_{\vartheta \in \Theta} \frac{1}{n \alpha_n^2} \Big| \sum_{k=0}^{n-1} \langle \delta_{\vartheta_0 \vartheta} b(Y_{t_k}), r_k \rangle\, \alpha_n \Big| = 0 \qquad P\text{-a.s.} \tag{12}$$

and, since $H > 1/2$, we also have

$$\lim_{n \to \infty} \frac{1}{n \alpha_n^2} \Big| \sum_{k=0}^{n-1} \langle \delta F_{t_k t_{k+1}}, r_k \rangle \Big| = 0 \qquad P\text{-a.s.} \tag{13}$$

Plugging relations (11)–(13) into (10), we have obtained that

$$Q_n(\vartheta) = Q_n^{(1)}(\vartheta) - 2\, Q_n^{(2)}(\vartheta) + Q_n^{(3)} + R_n(\vartheta), \tag{14}$$

where $\lim_{n \to \infty} \sup_{\vartheta \in \Theta} |R_n(\vartheta)| = 0$ in the $P$-almost sure sense and

$$Q_n^{(1)}(\vartheta) = \frac{1}{n} \sum_{k=0}^{n-1} |\delta_{\vartheta_0 \vartheta} b(Y_{t_k})|^2, \qquad Q_n^{(2)}(\vartheta) = \frac{1}{n \alpha_n} \sum_{k=0}^{n-1} \langle \delta_{\vartheta_0 \vartheta} b(Y_{t_k}), \delta F_{t_k t_{k+1}} \rangle$$

and

$$Q_n^{(3)} = \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \big( |\delta F_{t_k t_{k+1}}|^2 - \|\sigma\|^2 \alpha_n^{2H} \big).$$

The treatment of the terms $Q_n^{(1)}(\vartheta)$, $Q_n^{(2)}(\vartheta)$ and $Q_n^{(3)}$ will be carried out in the following series of Lemmata. We first show a discrete version of Proposition 2.3:

Lemma 3.1.
Let $f \in C^{1,1}(\mathbb{R}^d \times \Theta; \mathbb{R}^d)$ be a function such that

$$|f(x; \vartheta)| \le c \big( 1 + |x|^N \big), \qquad |\partial_x f(x; \vartheta)| \le c \big( 1 + |x|^N \big), \qquad |\partial_\vartheta f(x; \vartheta)| \le c \big( 1 + |x|^N \big)$$

for some $c > 0$, $N \in \mathbb{N}$, independent of $\vartheta \in \Theta$. Then we have

$$\sup_{\vartheta \in \Theta} \Big| \frac{1}{n} \sum_{k=0}^{n-1} |f(Y_{t_k}; \vartheta)|^2 - E |f(\overline{Y}_0; \vartheta)|^2 \Big| \to 0 \qquad P\text{-a.s.}$$

In particular, we have

$$\sup_{\vartheta \in \Theta} \Big| Q_n^{(1)}(\vartheta) - E |\delta_{\vartheta_0 \vartheta} b(\overline{Y}_0)|^2 \Big| \to 0 \qquad P\text{-a.s.}$$

Proof.
Let $T_n = n \alpha_n$ and set

$$V_n(\vartheta) = \frac{1}{T_n} \int_0^{T_n} |f(Y_s; \vartheta)|^2\, ds .$$

The ergodicity of $Y$ yields that there exists a set $A_1 \in \mathcal{F}$ with full measure such that

$$\lim_{n \to \infty} V_n(\vartheta)(\omega) = E |f(\overline{Y}_0; \vartheta)|^2$$

for all $\vartheta \in \Theta \cap \mathbb{Q}^q$ and all $\omega \in A_1$. The assumptions on $f$ give

$$|V_n(\vartheta_1) - V_n(\vartheta_2)| \le c \cdot \Big( 1 + \frac{1}{T_n} \int_0^{T_n} |Y_s|^{2N}\, ds \Big) \cdot |\vartheta_1 - \vartheta_2|, \tag{15}$$

so $V_n$ is Lipschitz continuous in $\vartheta$ and thus

$$\sup_{\vartheta \in \Theta} \big| V_n(\vartheta) - E |f(\overline{Y}_0; \vartheta)|^2 \big| = \sup_{\vartheta \in \Theta \cap \mathbb{Q}^q} \big| V_n(\vartheta) - E |f(\overline{Y}_0; \vartheta)|^2 \big| .$$

However, from (15) and the ergodicity of $Y$, it also follows that there exists a set $A_2 \in \mathcal{F}$ with $P(A_2) = 1$ on which the family of random functions $V_n : \Theta \to \mathbb{R}$, $n \in \mathbb{N}$, is equicontinuous, and hence the Arzela-Ascoli Theorem yields the desired uniform convergence, i.e.

$$\lim_{n \to \infty} \sup_{\vartheta \in \Theta \cap \mathbb{Q}^q} \big| V_n(\vartheta) - E |f(\overline{Y}_0; \vartheta)|^2 \big| = 0 \qquad P\text{-a.s.} \tag{16}$$

Setting

$$G_n(t; \vartheta) = |f(Y_t; \vartheta)|^2 - |f(Y_{t_k}; \vartheta)|^2, \qquad t \in [t_k, t_{k+1}), \quad k = 0, 1, \ldots,$$

it remains to show that

$$\frac{1}{T_n} \int_0^{T_n} \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|\, dt \to 0 \qquad P\text{-a.s.}$$

To this aim, note that the assumptions on $f$ imply

$$\sup_{\vartheta \in \Theta} |G_n(t; \vartheta)| \le c \cdot \big( 1 + |Y_t|^{2N} + |Y_{t_k}|^{2N} \big) \cdot |Y_t - Y_{t_k}| .$$

Using Proposition 2.2 and Hölder's inequality we obtain

$$\sup_{t \ge 0} E \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|^p \le c \cdot \alpha_n^{pH} \tag{17}$$

for all $p \ge 1$. Now, Jensen's inequality gives

$$E \Big| \frac{1}{T_n} \int_0^{T_n} \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|\, dt \Big|^p \le \frac{1}{T_n} \int_0^{T_n} E \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|^p\, dt,$$

so (17) yields

$$E \Big| \frac{1}{T_n} \int_0^{T_n} \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|\, dt \Big|^p \le c \cdot \alpha_n^{pH}$$

for all $p \ge 1$. Lemma 2.6 implies

$$\frac{1}{T_n} \int_0^{T_n} \sup_{\vartheta \in \Theta} |G_n(t; \vartheta)|\, dt \to 0 \qquad P\text{-a.s.}$$

for $n \to \infty$. □

We have a similar ergodic result for weighted sums of the increments of the process $F$.

Lemma 3.2.
Let $f \in C^{1,1}(\mathbb{R}^d \times \Theta; \mathbb{R}^d)$ be a function such that

$$|f(x; \vartheta)| \le c \big( 1 + |x|^N \big), \qquad |\partial_x f(x; \vartheta)| \le c \big( 1 + |x|^N \big), \qquad |\partial_\vartheta f(x; \vartheta)| \le c \big( 1 + |x|^N \big)$$

for some $c > 0$, $N \in \mathbb{N}$, independent of $\vartheta \in \Theta$. Assume moreover that there exists a function $U \in C^{2,1}(\mathbb{R}^d \times \Theta; \mathbb{R})$ such that

$$\partial_x U(x; \vartheta) = f(x; \vartheta), \qquad x \in \mathbb{R}^d, \ \vartheta \in \Theta,$$

i.e. $f$ is of gradient type. Then, for $H > 1/2$, we have

$$\sup_{\vartheta \in \Theta} \Big| \frac{1}{n \alpha_n} \sum_{k=0}^{n-1} \langle f(Y_{t_k}; \vartheta), \delta F_{t_k t_{k+1}} \rangle + E \langle b(\overline{Y}_0; \vartheta_0), f(\overline{Y}_0; \vartheta) \rangle \Big| \to 0 \qquad P\text{-a.s.}$$

In particular,

$$\sup_{\vartheta \in \Theta} \Big| Q_n^{(2)}(\vartheta) + E \langle b(\overline{Y}_0; \vartheta_0), \delta_{\vartheta_0 \vartheta} b(\overline{Y}_0) \rangle \Big| \to 0 \qquad P\text{-a.s.}$$
Proof.
Let $T_n = n \alpha_n$. First note that the change of variable and density formulas for Riemann-Stieltjes integrals, see (5) and (6) in Subsection 2.2, give

$$\frac{1}{T_n} \big( U(Y_{T_n}; \vartheta) - U(y_0; \vartheta) \big) = \frac{1}{T_n} \int_0^{T_n} \langle f(Y_u; \vartheta), b(Y_u; \vartheta_0) \rangle\, du + \frac{1}{T_n} \int_0^{T_n} \langle f(Y_u; \vartheta), dF_u \rangle .$$

Now the properties of $f$, Proposition 2.2 and Lemma 2.6 imply that

$$\sup_{\vartheta \in \Theta} \frac{1}{T_n} | U(Y_{T_n}; \vartheta) - U(y_0; \vartheta) | \to 0 \qquad P\text{-a.s.}$$

Moreover, we have

$$\sup_{\vartheta \in \Theta} \Big| \frac{1}{T_n} \int_0^{T_n} \langle f(Y_u; \vartheta), b(Y_u; \vartheta_0) \rangle\, du - E \langle f(\overline{Y}_0; \vartheta), b(\overline{Y}_0; \vartheta_0) \rangle \Big| \to 0 \qquad P\text{-a.s.},$$

which can be derived completely analogously to (16). It follows that

$$\sup_{\vartheta \in \Theta} \Big| \frac{1}{T_n} \int_0^{T_n} \langle f(Y_u; \vartheta), dF_u \rangle + E \langle f(\overline{Y}_0; \vartheta), b(\overline{Y}_0; \vartheta_0) \rangle \Big| \to 0 \qquad P\text{-a.s.}$$

So, it remains to show that

$$\sup_{\vartheta \in \Theta} \frac{1}{T_n} \Big| \int_0^{T_n} \langle G_n(t; \vartheta), dF_t \rangle \Big| \to 0 \qquad P\text{-a.s.}, \tag{18}$$

where

$$G_n(t; \vartheta) = f(Y_t; \vartheta) - f(Y_{t_k}; \vartheta), \qquad t \in [t_k, t_{k+1}), \quad k = 0, 1, \ldots .$$

Applying Proposition 2.4 and using the polynomial Lipschitz continuity of $f$ yields, for all $1/2 < \lambda < H$,

$$\Big| \int_0^{T_n} \langle G_n(t; \vartheta), dF_t \rangle \Big| \le c \cdot \alpha_n^{2\lambda} \cdot \sum_{j=1}^m \sum_{k=0}^{n-1} \sup_{t \in [t_k, t_{k+1}]} \big( 1 + |Y_t|^N \big)\, |Y|_{\lambda; [t_k, t_{k+1}]}\, |B^{(j)}|_{\lambda; [t_k, t_{k+1}]} .$$

From the Garcia-Rodemich-Rumsey inequality, see Lemma 2.5, and Proposition 2.2 we have that

$$\Big( E |Y|^p_{\lambda; [t_k, t_{k+1}]} \Big)^{1/p} \le c \cdot \alpha_n^{H - \lambda}$$

and also

$$\Big( E |B^{(j)}|^p_{\lambda; [t_k, t_{k+1}]} \Big)^{1/p} \le c \cdot \alpha_n^{H - \lambda} .$$
Since moreover

$$\sup_{t \in [t_k, t_{k+1}]} \big( 1 + |Y_t|^N \big) \le c \cdot \Big( 1 + |Y_{t_k}|^N + \alpha_n^{\lambda N} \cdot |Y|^N_{\lambda; [t_k, t_{k+1}]} \Big)$$

and $\sup_{t \ge 0} E |Y_t|^p < \infty$ for all $p \ge 1$, it follows that

$$\Big( E \sup_{\vartheta \in \Theta} \Big| \frac{1}{T_n} \int_0^{T_n} \langle G_n(t; \vartheta), dF_t \rangle \Big|^p \Big)^{1/p} \le c \cdot \alpha_n^{2H - 1} . \tag{19}$$

Now Lemma 2.6 implies (18), since $H > 1/2$. □

Remark 3.3. As mentioned in the introduction, the condition $H > 1/2$ is used in our proofs. Specifically, it is invoked in the convergence of the weighted stochastic integral (19) and also in order to derive (13).

The following Lemma deals with the remaining term, i.e. the quadratic variations of the process $F$:

Lemma 3.4.
We have

$$\lim_{n \to \infty} Q_n^{(3)} = \lim_{n \to \infty} \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \big( |\delta F_{t_k t_{k+1}}|^2 - \|\sigma\|^2 \alpha_n^{2H} \big) = 0 \qquad P\text{-a.s.}$$

with $\|\sigma\|^2 = \sum_{j=1}^m |\sigma_j|^2$.

Proof. We have

$$|\delta F_{t_k t_{k+1}}|^2 - \|\sigma\|^2 \alpha_n^{2H} = \underbrace{\sum_{j=1}^m |\sigma_j|^2 \Big( |\delta B^{(j)}_{t_k t_{k+1}}|^2 - \alpha_n^{2H} \Big)}_{= I_k^{(1)}} + \underbrace{\sum_{i,j=1,\, i \neq j}^m \langle \sigma_i, \sigma_j \rangle\, \delta B^{(i)}_{t_k t_{k+1}}\, \delta B^{(j)}_{t_k t_{k+1}}}_{= I_k^{(2)}} .$$

Owing to the scaling property of fBm it follows that

$$E \Big| \sum_{k=0}^{n-1} I_k^{(1)} \Big|^p = \alpha_n^{2Hp}\, E \Big| \sum_{k=0}^{n-1} \sum_{j=1}^m |\sigma_j|^2 \big[ |\delta B^{(j)}_{k\, k+1}|^2 - 1 \big] \Big|^p .$$

Since all moments of random variables in a finite Gaussian chaos are equivalent, it follows from (7)-(9) that

$$E \Big| \sum_{k=0}^{n-1} \big[ |\delta B^{(j)}_{k\, k+1}|^2 - 1 \big] \Big|^p \le c \cdot |\log(n)|^p \cdot n^{p(2H-1)}$$

and consequently

$$E \Big| \sum_{k=0}^{n-1} I_k^{(1)} \Big|^p \le c \cdot \alpha_n^{2Hp} \cdot |\log(n)|^p \cdot n^{p(2H-1)} .$$

Since $\alpha_n = \kappa \cdot n^{-\alpha}$ with $\alpha \in (0,1)$, Lemma 2.6 now implies that

$$\lim_{n \to \infty} \frac{1}{n \alpha_n^2} \Big| \sum_{k=0}^{n-1} I_k^{(1)} \Big| = 0 \qquad P\text{-a.s.}$$

So it remains to consider the off-diagonal terms, i.e. $I_k^{(2)}$. Here we can exploit the following trick: Let $\beta$ and $\tilde{\beta}$ be two independent fractional Brownian motions with the same Hurst index. From (7)-(9) we clearly have that

$$V_n = \sum_{k=0}^{n-1} \Big( |\delta \beta_{t_k t_{k+1}}|^2 - |\delta \tilde{\beta}_{t_k t_{k+1}}|^2 \Big)$$

satisfies

$$E |V_n|^p \le c \cdot \alpha_n^{2Hp} \cdot |\log(n)|^p \cdot n^{p(2H-1)} .$$

However, setting $B^{(i)} = (\beta + \tilde{\beta}) / \sqrt{2}$ and $B^{(j)} = (\beta - \tilde{\beta}) / \sqrt{2}$, then $B^{(i)}$ and $B^{(j)}$ are two independent fractional Brownian motions and

$$V_n \overset{\mathcal{L}}{=} 2 \sum_{k=0}^{n-1} \delta B^{(i)}_{t_k t_{k+1}}\, \delta B^{(j)}_{t_k t_{k+1}} .$$
Now we can easily conclude that

$$\lim_{n \to \infty} \frac{1}{n \alpha_n^2} \Big| \sum_{k=0}^{n-1} I_k^{(2)} \Big| = 0 \qquad P\text{-a.s.}$$

□

Proof of Theorem 1.5.
Let us go back to expressions (10) and (14). Applying Lemmas 3.1 and 3.2 we obtain that

$$\lim_{n \to \infty} \sup_{\vartheta \in \Theta} \Big| \big( Q_n^{(1)}(\vartheta) - 2\, Q_n^{(2)}(\vartheta) \big) - \big( E |b(\overline{Y}_0; \vartheta)|^2 - E |b(\overline{Y}_0; \vartheta_0)|^2 \big) \Big| = 0$$

almost surely. Furthermore, recall that Lemma 3.4 asserts that $\lim_{n \to \infty} Q_n^{(3)} = 0$ almost surely. The proof is now finished. □

Proof of Theorem 1.7.
Let us first recall the following result ([6, 16]):
Proposition 3.5.
Assume that the family of random variables $L_n(\vartheta)$, $n \in \mathbb{N}$, $\vartheta \in \Theta$, satisfies:

(1) With probability one, $L_n(\vartheta) \to L(\vartheta)$ uniformly in $\vartheta \in \Theta$ as $n \to \infty$.

(2) The limit $L$ is non-random and $L(\vartheta_0) \le L(\vartheta)$ for all $\vartheta \in \Theta$.

(3) It holds $L(\vartheta) = L(\vartheta_0)$ if and only if $\vartheta = \vartheta_0$.

Then, we have

$$\widehat{\vartheta}_n \to \vartheta_0 \qquad P\text{-a.s.}$$

for $n \to \infty$, where $L_n(\widehat{\vartheta}_n) = \min_{\vartheta \in \Theta} L_n(\vartheta)$.

The strong consistency of the zero squares estimator follows now from Theorem 1.5 and an application of Proposition 3.5 to $|Q_n(\vartheta)|$. □

Remark 3.6. In the case $H = 1/2$ we have under similar assumptions that

$$\sup_{\vartheta \in \Theta} \Big| \big( Q_n(\vartheta) - Q_n(\vartheta_0) \big) - E |b(\overline{Y}_0; \vartheta) - b(\overline{Y}_0; \vartheta_0)|^2 \Big| \to 0$$

in the $P$-almost sure sense, see e.g. [5]. Since

$$Q_n(\vartheta_0) = \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \big( |\delta F_{t_k t_{k+1}}|^2 - \|\sigma\|^2 \alpha_n \big) + \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} |r_k|^2 + \frac{2}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle \delta F_{t_k t_{k+1}}, r_k \rangle,$$

an application of Lemmas 3.1 and 3.4 (which are also valid for $H = 1/2$) yields that

$$\lim_{n \to \infty} Q_n(\vartheta_0) = \lim_{n \to \infty} \frac{2}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle \delta F_{t_k t_{k+1}}, r_k \rangle \qquad P\text{-a.s.}$$

However, using the Itô isometry and the Burkholder-Davis-Gundy inequality we have

$$E \Big| \frac{1}{n \alpha_n^2} \sum_{k=0}^{n-1} \langle r_k, \delta F_{t_k t_{k+1}} \rangle \Big|^p \le c \cdot \frac{1}{n^p \alpha_n^{2p}} \cdot T_n^{p/2}\, \alpha_n^{p/2} \le c \cdot n^{-p(1/2 - \alpha)}$$

and Lemma 2.6 thus gives

$$\lim_{n \to \infty} Q_n(\vartheta_0) = 0 \qquad P\text{-a.s.}$$

Hence we end up with

$$\sup_{\vartheta \in \Theta} \Big| Q_n(\vartheta) - E |b(\overline{Y}_0; \vartheta) - b(\overline{Y}_0; \vartheta_0)|^2 \Big| \to 0,$$

so the limit of the statistic $Q_n$ is different for $H = 1/2$, where one obtains the standard least square estimator.

References

[1] A. Bégyn (2005): Quadratic variations along irregular subdivisions for Gaussian processes.
Electronic J. Probab., 691–717.
[2] R. Belfadli, K. Es-Sebaiy, Y. Ouknine (2011): Parameter Estimation for Fractional Ornstein-Uhlenbeck Processes: Non-ergodic Case. Arxiv Preprint.
[3] A. Chronopoulou, S. Tindel (2011): On inference for fractional differential equations. Arxiv Preprint.
[4] J.F. Coeurjolly (2001): Estimating the Parameters of a Fractional Brownian Motion by Discrete Variations of its Sample Paths. Stat. Infer. Stoch. Process., no. 2, 199–227.
[5] D. Florens-Zmirou (1989): Approximate discrete-time schemes for statistics of diffusion processes. Statistics, no. 4, 547–557.
[6] R. Frydman (1980): A proof of the consistency of maximum likelihood estimators of non-linear regression models with autocorrelated errors. Econometrica, 853–860.
[7] M. Garrido-Atienza, P. Kloeden, A. Neuenkirch (2009): Discretization of stationary solutions of stochastic systems driven by fractional Brownian motion. Appl. Math. Optim., no. 2, 151–172.
[8] A.M. Garcia, E. Rodemich, H. Rumsey Jr. (1978): A real variable lemma and the continuity of paths of some Gaussian processes. Indiana Math. J., 565–578.
[9] M. Garrido-Atienza, B. Schmalfuss (2011): Ergodicity of the infinite-dimensional fractional Brownian motion. J. Dynam. Differential Equations, no. 3, 671–681.
[10] M. Hairer (2005): Ergodicity of stochastic differential equations driven by fractional Brownian motion. Ann. Probab., no. 2, 703–758.
[11] M. Hairer, A. Majda (2010): A simple framework to justify linear response theory. Nonlinearity, no. 4, 909–922.
[12] M. Hairer, A. Ohashi (2007): Ergodic theory for SDEs with extrinsic memory. Ann. Probab., no. 5, 1950–1977.
[13] M. Hairer, S. Pillai (2011): Ergodicity of hypoelliptic SDEs driven by fractional Brownian motion. Ann. Inst. Henri Poincaré, Probab. Stat., no. 2, 601–628.
[14] Y. Hu, D. Nualart (2010): Parameter estimation for fractional Ornstein-Uhlenbeck processes. Stat. Prob. Lett., 1030–1038.
[15] J. Istas, G. Lang (1994): Quadratic variations and estimation of the local Hölder index of a Gaussian process. Ann. Inst. Poincaré, 407–436.
[16] R.A. Kasonga (1988): The consistency of a non-linear least squares estimator from diffusion processes. Stoch. Proc. Appl., 263–275.
[17] M. Kleptsyna, A. Le Breton (2002): Statistical analysis of the fractional Ornstein-Uhlenbeck type process. Stat. Inference Stoch. Process., no. 3, 229–248.
[18] A. Le Breton (1998): Filtering and parameter estimation in a simple linear system driven by a fractional Brownian motion. Stat. Probab. Lett., no. 3, 263–274.
[19] P. Kloeden, A. Neuenkirch (2007): The pathwise convergence of approximation schemes for stochastic differential equations. LMS J. Comp. Math., 235–253.
[20] A. Papavasiliou, C. Ladroue (2011): Parameter estimation for rough differential equations. Ann. Statist., to appear.
[21] B.L.S. Prakasa Rao (2010): Statistical inference for fractional diffusion processes. Wiley Series in Probability and Statistics, Chichester, John Wiley & Sons.
[22] C. Tudor, F. Viens (2007): Statistical aspects of the fractional stochastic calculus. Ann. Statist., no. 3, 1183–1212.
[23] C. Tudor, F. Viens (2009): Variations and estimators for self-similarity parameters via Malliavin calculus. Ann. Probab., no. 6, 2093–2134.
[24] L.C. Young (1936): An inequality of Hölder type connected with Stieltjes integration. Acta Math.
[25] … Math. Nachr., no. 9, 1097–1106.
Andreas Neuenkirch, Fachbereich Mathematik, TU Kaiserslautern, Postfach 3049, D-67663 Kaiserslautern, Germany [email protected]