Uniform LAN property of locally stable Lévy process observed at high frequency
arXiv [math.PR]
D. O. IVANENKO, A. M. KULIK, AND H. MASUDA
Abstract.
Suppose we have a high-frequency sample from the Lévy process of the form $X^\theta_t = \beta t + \gamma Z_t + U_t$, where $Z$ is a possibly asymmetric locally $\alpha$-stable Lévy process, and $U$ is a nuisance Lévy process less active than $Z$. We prove the LAN property about the explicit parameter $\theta = (\beta, \gamma)$ under very mild conditions without specific form of the Lévy measure of $Z$, thereby generalizing the LAN result of Aït-Sahalia and Jacod [1]. In particular, it is clarified that a non-diagonal norming may be necessary in the truly asymmetric case. Due to the special nature of the local $\alpha$-stable property, the asymptotic Fisher information matrix takes a clean-cut form.

1. Introduction
Ever since Le Cam's pioneering work [20], local asymptotics of likelihood random fields has been playing a crucial role in the theory of asymptotic inference. Specifically, the celebrated local asymptotic normality property (LAN) introduced by Le Cam has been a longstanding prominent concept, based on which we can deduce, among others, asymptotic optimality criteria for estimation and hypothesis testing. Beyond the classical i.i.d. models, there are many existing LAN results for several kinds of statistical experiments with dependent data, including ergodic time-series models, homoscedastic models, and ergodic stochastic processes, to mention just a few. One can consult [21] and the references therein for a systematic account of the LAN together with many related topics.

It is common knowledge that verification of the LAN for a stochastic process with no closed-form likelihood is generally a difficult matter. In the case of diffusions under high-frequency sampling, Gobet [13] and [14] successfully derived the LAN and LAMN by means of the Malliavin calculus. There the structures of the limit experiments turned out to be simple enough (normal or mixed normal). One of the theoretical merits of high-frequency sampling is that it enables us to take into account a small-time approximation of the underlying model, based on which we may derive an implementable and asymptotically efficient estimator. This has been achieved for diffusion models, see Kessler [16] and Genon-Catalot and Jacod [11]. However, to say nothing of Lévy driven non-linear stochastic differential equations, much less is known about explicit LAN results for Lévy processes observed at high frequency, where the transition probability is hardly available in a closed form. We refer to [24] for several explicit case studies about LAN results and related statistical-estimation problems concerning Lévy processes observed at high frequency.
Especially when the underlying Lévy process or the most active part of the process is symmetric $\alpha$-stable, the explicit LAN result has been proved in [1] and [23]. See also [2] for the precise asymptotic behavior of the Fisher information matrix for the same model setting as in [1].

Date: Revised October 1, 2018 (First version November 6, 2014).
2000 Mathematics Subject Classification. Primary ; Secondary.
Key words and phrases. High-frequency sampling, LAN, Likelihood function, Lévy process, Regular statistical experiment.
Partly supported by JSPS KAKENHI Grant Number 26400204 (HM).

We will consider the Lévy process $X^\theta$ described by $X^\theta_t = \beta t + \gamma Z_t + U_t$, where $Z$ is a locally $\alpha$-stable Lévy process and where $U$ is a Lévy process which is independent of $Z$ and less active than $Z$, the latter being regarded as a nuisance process; we specify them below. The objective of this paper is to derive the LAN about the explicit parameter $\theta = (\beta, \gamma)$ under very mild conditions, when $X^\theta$ is observed at high frequency. Our model setting is broad enough to cover many specific examples of infinite-activity pure-jump Lévy processes, and in particular generalizes the LAN result of [1], for the locally $\alpha$-stable property only requires that the Lévy measure behaves like that of the $\alpha$-stable distribution near the origin, hence is a much weaker requirement than the genuine $\alpha$-stable case. It turns out that the special nature of the locally $\alpha$-stable character leads to clean-cut limit experiments described in terms of the $\alpha$-stable density. Owing to high-frequency sampling, the method we propose is highly insensitive to the nuisance process $U$, and allows us to formulate the LAN property uniformly with respect to a class of nuisance processes; this explains the term "uniform" in the title of the paper.

Our proof of the LAN property is based on two principal ingredients. One of them is the classical $L_2$-regularity technique, which dates back to Le Cam. Another important ingredient is the Malliavin calculus-based integral representation for the derivative of the log-likelihood function, which we use in order to derive the $L_p$-bounds for this derivative. This method of proof is mainly based on the ideas developed in [17], [18] for the model where $X^\theta$ is a solution to a Lévy-driven SDE observed with a fixed frequency, but in the high-frequency case we encounter a new challenge: to design a particular version of the Malliavin calculus in a way which provides asymptotically precise $L_p$-bounds. We mention an independent recent paper [7], where similar tools are developed for the same purposes.
Our way to obtain the asymptotically precise $L_p$-bounds and its relation to that developed in [7] are discussed in detail in Section 4 below.

It is natural to ask whether our LAN result extends to stochastic differential equations driven by a locally $\alpha$-stable $Z$. This extension is far-reaching and may involve the notion of the locally asymptotically mixed normality property (LAMN) introduced by Jeganathan [19], which covers cases of a random asymptotic Fisher information matrix. This is particularly relevant to heteroscedastic processes observed at $n$ distinct time points over a fixed time domain. In such cases it is typical that randomness of the covariance structure is not averaged out in the limit experiments. See [9], [12] and [13] for the case of diffusion processes. The LA(M)N property of a solution to an SDE driven by a locally $\alpha$-stable Lévy process under high-frequency sampling is one of the currently projected topics. To the best of our current knowledge, the papers [7] and [22] are the only existing results in this direction. This will involve more technicalities than the present Lévy-process setting, and will be investigated in a subsequent paper.

This paper is organized as follows. In Section 2 we describe the model, introduce the assumptions, and formulate the main results of the paper. Section 3 contains the main part of the proof, which is based on Le Cam's $L_2$-regularity technique and relies on $L_p$-bounds for the derivative of the log-likelihood function. These $L_p$-bounds are proved in Section 4 by means of a specially designed version of the Malliavin calculus.

2. Main results
Let $X^\theta$ be a Lévy process of the form
$$X^\theta_t = \beta t + \gamma Z_t + U_t, \qquad t \ge 0. \tag{2.1}$$
Here $Z$ and $U$ are independent Lévy processes defined on a probability space $(\Omega, \mathcal{F}, P)$, and $\theta = (\beta, \gamma)^\top \in \mathbb{R}^2$ is an unknown parameter subject to statistical estimation. We assume $Z$ to be such that in its Lévy-Khintchine representation
$$E e^{i\lambda Z_t} = e^{t\psi(\lambda)},$$
the characteristic exponent $\psi$ has the form
$$\psi(\lambda) = \int_{\mathbb{R}} \left( e^{i\lambda u} - 1 - i\lambda u 1_{|u|\le 1} \right) \mu(du). \tag{2.2}$$
That is, $Z$ does not contain the diffusion term, the truncation function equals $u 1_{|u|\le 1}$, and no additional drift term is involved. Throughout this paper, the Lévy measure $\mu$ is assumed to satisfy the following conditions:

H1. $\mu(du) = m(u)\,du$, and for some $\alpha \in (0,2)$,
$$m(u) \sim \begin{cases} C_+ |u|^{-\alpha-1}, & u \to 0+, \\ C_- |u|^{-\alpha-1}, & u \to 0-, \end{cases} \qquad C_- + C_+ > 0.$$

H2. $m \in C^1(\mathbb{R}\setminus\{0\})$, and there exists a constant $u_0 > 0$ such that the function
$$\tau(u) := \frac{|u m'(u)|}{m(u)}$$
is bounded on the set $\{|u| \le u_0\}$ and satisfies
$$\int_{|u| > u_0} \tau^{2+\delta_0}(u)\, \mu(du) < \infty$$
for some $\delta_0 > 0$.

For an $\alpha$-stable process, the Lévy measure has the density
$$m_{\alpha, C_\pm}(u) := \begin{cases} C_+ |u|^{-\alpha-1}, & u > 0, \\ C_- |u|^{-\alpha-1}, & u < 0. \end{cases} \tag{2.3}$$
Hence H1 requires that locally near the origin the Lévy measure of $Z$ behaves similarly to that of an $\alpha$-stable process; that is why we call $Z$ locally $\alpha$-stable. The constant $(C_+ - C_-)/(C_+ + C_-) \in [-1, 1]$ is the skewness parameter of the corresponding $\alpha$-stable law. Condition H2 does not require $\tau(u)$ to be bounded for "large" $u$; this includes in the class of admissible $Z$ a wide range of "stable-like" Lévy processes with
$$m(u) = f(u)\, m_{\alpha, C_\pm}(u), \quad \text{where } f(u) \to 1, \ |u| \to 0.$$

Example 2.1 (Tempered $\alpha$-stable process). For either $f(u) = e^{-\sqrt{u}}$, $f(u) = e^{-u}$, or $f(u) = e^{-|u|}$, conditions H1, H2 hold true, although $\tau(u)$ fails to be bounded.

Example 2.2 (Smoothly damped $\alpha$-stable process). Let $m(u) = f(u)|u|^{-\alpha-1} 1_{[-u_1, u_1]}(u)$, $u_1 > 0$, where $f$ is continuous in $\mathbb{R}$, $f > 0$ for $u \in [-u_1, u_1]$ with $f(u) \to 1$ as $u \to$
0, and $f$ smoothly vanishes outside the interval $[-u_1, u_1]$ in such a way that $u \mapsto |u||f'(u)|/f(u)$ is locally bounded and moreover $u \mapsto \{|u||f'(u)|/f(u)\}^{2+\delta} m(u)$ is $du$-integrable over the set $\{|u| \ge u_0\}$ for some $\delta > 0$, $u_0 > 0$. Then conditions H1, H2 hold true; note that $\tau(u) \le |u||f'(u)|/f(u) + \alpha + 1$. One particular example of such $f$ is of the form
$$f(u) = C e^{-1/(u+u_1)^2 - 1/(u-u_1)^2}\, 1_{[-u_1, u_1]}(u).$$

We focus on the following setting:
• the process $X^\theta$ is discretely observed, i.e. the $n$-th sample contains its values at the first $n$ points $\{t_{k,n} = k h_n,\ k = 1, \dots, n\}$ of the uniform partition of the time axis with the partition interval $h_n$;
• $h_n \to 0$, $n \to \infty$, i.e. the discrete observations of $X^\theta$ have high frequency.

We note that the terminal sampling time $n h_n$ may or may not tend to infinity as $n \to \infty$.

In what follows an open set $\Theta \subset \mathbb{R}^2$ denotes the set of possible values of the unknown parameter $\theta$; we assume that $\Theta \subset \mathbb{R} \times (0, \infty)$, i.e. the parameter $\gamma$ takes only positive values. Denote by $P^\theta_n$ the law of the sample
$$\left\{ X^\theta_{t_{k,n}},\ k = 1, \dots, n \right\}$$
in $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$, and write
$$\mathcal{E}_n = \left\{ \mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), (P^\theta_n, \theta \in \Theta) \right\}$$
for a statistical model based on this sample.

Under our conditions on the process $Z$, the law $P^\theta_n$ is absolutely continuous with respect to Lebesgue measure (see Section 3), i.e. the model $\mathcal{E}_n$ possesses the likelihood function
$$L_n(\theta; x_1, \dots, x_n) = \frac{P^\theta_n(dx_1 \dots dx_n)}{dx_1 \dots dx_n}.$$
Denote by
$$Z_n(\theta_0, \theta; x_1, \dots, x_n) = \frac{L_n(\theta; x_1, \dots, x_n)}{L_n(\theta_0; x_1, \dots, x_n)}$$
the likelihood ratio of $P^\theta_n$ with respect to $P^{\theta_0}_n$, with the convention $(\text{anything})/0 = \infty$.

Our goal is to establish the LAN property for the sequence of statistical models $\mathcal{E}_n$, $n \ge 1$. Recall that the LAN property is said to hold at a point $\theta_0 \in \Theta$ with the matrix rate $\{r(n) = r(n, \theta_0),\ n \ge 1\}$ and the covariance matrix $\Sigma(\theta_0)$, if for every $v$ the sampled likelihood ratio
$$Z_n(\theta_0, \theta_0 + r(n)v) = Z_n\left(\theta_0, \theta_0 + r(n)v;\ X^{\theta_0}_{t_{1,n}}, \dots, X^{\theta_0}_{t_{n,n}}\right)$$
possesses the representation
$$Z_n(\theta_0, \theta_0 + r(n)v) = \exp\left\{ v^\top \Delta_n(\theta_0) - \frac{1}{2} v^\top \Sigma(\theta_0) v + \Psi_n(v, \theta_0) \right\} \tag{2.4}$$
with
$$\Delta_n(\theta_0) \Rightarrow N(0, \Sigma(\theta_0)), \quad n \to \infty, \tag{2.5}$$
and
$$\Psi_n(v, \theta_0) \xrightarrow{P} 0, \quad n \to \infty, \tag{2.6}$$
along $P^{\theta_0}_n$.

Put
$$c_t = t \int_{t^{1/\alpha} < |u| \le 1} u\, \mu(du), \tag{2.7}$$
which is identically zero if $\mu$ is symmetric, and denote by $Z^{\alpha, C_\pm}$ the $\alpha$-stable process whose characteristic exponent has the form (2.2) with the Lévy measure (2.3), where $C_+, C_-$ are given by the condition H1. Finally, denote by $\phi_{\alpha, C_\pm}$ the distribution density of $Z^{\alpha, C_\pm}_1$ (this density exists, see [27] or Proposition 3.1 below).

Now we are able to formulate our main result.

Theorem 2.1.
Let $X^\theta$ be given by (2.1) and assume that $Z$ satisfies H1 and H2, that
$$t^{-1/\alpha} U_t \to 0, \quad t \to 0, \tag{2.8}$$
in probability, and that $n^{-1/2} h_n^{1/\alpha - 1} \to 0$ (automatic if $\alpha \in (0, 1]$ since we are supposing that $h_n \to 0$). Then the LAN property holds true at every point $\theta \in \Theta$ with
$$r(n) = n^{-1/2} \begin{pmatrix} h_n^{1/\alpha - 1} & c_{h_n} h_n^{-1} \\ 0 & 1 \end{pmatrix}, \qquad \Sigma(\theta) = \begin{pmatrix} \Sigma_{11}(\theta) & 0 \\ 0 & \Sigma_{22}(\theta) \end{pmatrix}, \tag{2.9}$$
where
$$\Sigma_{11}(\theta) = \gamma^{-2} \int_{\mathbb{R}} \left( \frac{\phi'_{\alpha, C_\pm}(x)}{\phi_{\alpha, C_\pm}(x)} \right)^2 \phi_{\alpha, C_\pm}(x)\, dx, \qquad \Sigma_{22}(\theta) = \gamma^{-2} \int_{\mathbb{R}} \left( 1 + \frac{x \phi'_{\alpha, C_\pm}(x)}{\phi_{\alpha, C_\pm}(x)} \right)^2 \phi_{\alpha, C_\pm}(x)\, dx.$$

Remark 2.1. Recall the definition of the Blumenthal-Getoor activity index of a Lévy process $Y$ with the Lévy measure $\mu_Y$:
$$\alpha_Y := \inf\left\{ q \ge 0 : \int_{|u| \le 1} |u|^q \mu_Y(du) < \infty \right\}.$$
Then it is sufficient for the condition (2.8) that $\alpha_U < \alpha$ (see, e.g., p.362 of [25]); note that $\alpha_Z = \alpha$. In this paper we are assuming that the activity index $\alpha$ is known. This might seem disappointing; however, as was clarified in [2] and [23], if one attempts joint maximum-likelihood estimation of $\alpha$ and the scale parameter $\gamma$, one may confront the degeneracy of the asymptotic Fisher information matrix. This degeneracy is inevitable, and how to cope with it is beyond the scope of this paper.

Remark 2.2. In view of the standard theory [15] concerning asymptotically efficient estimation in a LAN model, Theorem 2.1 suggests seeking an estimator $\hat\theta_n = (\hat\beta_n, \hat\gamma_n)^\top$ such that
$$r(n)^{-1}(\hat\theta_n - \theta) = \begin{pmatrix} \sqrt{n}\, h_n^{1 - 1/\alpha} (\hat\beta_n - \beta) - h_n^{-1/\alpha} c_{h_n} \cdot \sqrt{n}(\hat\gamma_n - \gamma) \\ \sqrt{n}(\hat\gamma_n - \gamma) \end{pmatrix}$$
weakly tends to the centered normal distribution with the asymptotic covariance matrix $\Sigma(\theta)$. Observe that, when $\mu$ is asymmetric, the factor
$$h_n^{-1/\alpha} c_{h_n} = h_n^{1 - 1/\alpha} \int_{h_n^{1/\alpha} < |u| \le 1} u\, \mu(du)$$
may or may not vanish, or even may diverge, implying that the asymmetry essentially and non-trivially affects estimation of the drift parameter $\beta$.
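To make the behavior of the off-diagonal factor $h_n^{-1/\alpha} c_{h_n}$ concrete, the following minimal sketch evaluates it in closed form under a simplifying assumption that is not made in the paper: on $\{|u| \le 1\}$ the Lévy density is taken to be exactly the stable density (2.3). The function names `c_t` and `offdiag_factor` are hypothetical and used only for illustration.

```python
import math

def c_t(t, alpha, c_plus, c_minus):
    """c_t = t * int_{t^{1/alpha} < |u| <= 1} u m(u) du, see (2.7).

    Assumption: m(u) equals the stable density (2.3) on |u| <= 1,
    so the integral reduces to (C+ - C-) * int_a^1 u^{-alpha} du, a = t^{1/alpha}.
    Requires 0 < t < 1.
    """
    a = t ** (1.0 / alpha)
    skew = c_plus - c_minus
    if abs(alpha - 1.0) < 1e-12:
        integral = -math.log(a)  # int_a^1 du/u
    else:
        integral = (1.0 - a ** (1.0 - alpha)) / (1.0 - alpha)
    return t * skew * integral

def offdiag_factor(h, alpha, c_plus, c_minus):
    """The factor h^{-1/alpha} * c_h that appears in r(n)^{-1}."""
    return h ** (-1.0 / alpha) * c_t(h, alpha, c_plus, c_minus)

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        values = [offdiag_factor(h, alpha, 1.0, 0.0) for h in (1e-2, 1e-4, 1e-6)]
        print(alpha, values)
```

Under this assumption the factor equals $(C_+ - C_-)(h^{1-1/\alpha} - 1)/(1 - \alpha)$ for $\alpha \ne 1$: it stays bounded and non-zero for $\alpha > 1$, diverges for $\alpha \le 1$, and vanishes identically in the symmetric case.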
As a matter of fact, the necessity of non-diagonal norming seems to be non-standard in the literature: typically, it is enough to take $r(n) = \mathrm{diag}\big\{ \big( E\big| \frac{\partial}{\partial\theta_j} \log L_n(\theta; X^\theta_{t_{1,n}}, \dots, X^\theta_{t_{n,n}}) \big|^2 \big)^{-1/2} \big\}_j$ whenever it exists; for instance, the monograph [3] is devoted to the diagonal norming. We refer to [10] and [26] for some technical refinements of asymptotic inference by using a non-diagonal norming. Nevertheless, we note that since $r(n)$ of (2.9) is invertible, we have no trouble in constructing an asymptotic confidence region from an asymptotically normally distributed estimator converging at rate $r(n)$.

Under the condition (2.8) the process $U$ is interpreted as a "nuisance noise", in the sense that $U$ is less active than the "principal" part $Z$, as was mentioned in Remark 2.1. A natural question is whether or not it is possible to extend Theorem 2.1 so as to make the LAN property valid not only for each single $U$, but also uniformly over some "nuisance class" $\mathcal{U}$ of $U$. Our method of proof of Theorem 2.1 is strong enough to provide the following uniform LAN property in such an extended setting.
Theorem 2.2.
Let $\mathcal{U}$ be a class of Lévy processes such that condition (2.8) holds true uniformly over $U \in \mathcal{U}$. If in addition $Z$ and $h_n$ satisfy the conditions of Theorem 2.1, then for every $U \in \mathcal{U}$ and $\theta \in \Theta$, the likelihood ratio for the discretely observed process (2.1) admits a representation (2.4) with $r(n)$, $\Sigma(\theta)$ specified in (2.9), and relations (2.5), (2.6) hold true uniformly over $U \in \mathcal{U}$.
As was explained in [1], the uniform negligibility of $U$ plays an important role for purposes of semiparametric (adaptive) statistical estimation of $\theta$: on p.358 of [1], the authors introduce a class of possible nuisance noise distributions $\mathcal{L}(U)$, over which one can precisely formulate an asymptotically uniformly efficient estimation of $\theta$; this in turn leads to the notion of an asymptotically uniformly efficient estimator of $\theta$. As a matter of fact, it would be possible to precisely state a uniform-in-$U$ version of the Hájek-Le Cam convolution theorem, which effectively clarifies the uniform asymptotic lower bound of an expected loss of any regular estimator with $r(n)$-rate of convergence; among others, see Section 2.3 of [3] and Section II.11 of [15] for details. The construction of an asymptotically efficient estimator involves several further issues, to be reported elsewhere.

3. Proofs of Theorem 2.1 and Theorem 2.2
In this section we prove Theorem 2.1 and outline the proof of Theorem 2.2. The key ingredient in these proofs is the $L_p$-bound for the derivative of the log-likelihood (Proposition 3.2), which we discuss in detail and prove separately in Section 4 below.

3.1. Proof of Theorem 2.1: an outline and preliminaries.
Denote by $p_t(\theta; x, y)$ the transition probability density for $X^\theta$, considered as a Markov process; in what follows we will prove that this density exists. Denote also
$$g_t(\theta; x, y) = \frac{\nabla_\theta p_t(\theta; x, y)}{p_t(\theta; x, y)} = \nabla_\theta \log p_t(\theta; x, y), \qquad q_t(\theta; x, y) = \frac{\nabla_\theta p_t(\theta; x, y)}{2\sqrt{p_t(\theta; x, y)}} = \nabla_\theta \sqrt{p_t(\theta; x, y)},$$
assuming the derivatives to exist for $P_t(x, \cdot)$-a.a. $y$ for every fixed $x, t$. Since $X^\theta$ has independent increments, we can write
$$p_t(\theta; x, y) = p_t(\theta; y - x), \quad g_t(\theta; x, y) = g_t(\theta; y - x), \quad q_t(\theta; x, y) = q_t(\theta; y - x).$$
Then the sampled likelihood ratio for the model can be written in the form
$$Z_n(\theta_0, \theta_0 + r(n)v) = \prod_{k=1}^n \frac{p_{h_n}\big(\theta_0 + r(n)v;\ X^{\theta_0}_{t_{k,n}} - X^{\theta_0}_{t_{k-1,n}}\big)}{p_{h_n}\big(\theta_0;\ X^{\theta_0}_{t_{k,n}} - X^{\theta_0}_{t_{k-1,n}}\big)}.$$
Denote
$$\eta^\theta_{k,n} = X^\theta_{t_{k,n}} - X^\theta_{t_{k-1,n}}, \quad 1 \le k \le n,$$
and observe that $p_{h_n}(\theta; \cdot)$ is the distribution density for $\eta^\theta_{k,n}$. Hence the statistical model described above, after the re-sampling
$$(X^\theta_{t_{k,n}})_{k=1}^n \mapsto (\eta^\theta_{k,n})_{k=1}^n,$$
actually is reduced to one with a triangular array of independent observations. The LAN property for triangular arrays of independent observations is well studied, e.g. [15], Theorem II.3.1′, Theorem II.6.1, and Remark II.6.2. In particular, in order to prove the required LAN property at a point $\theta_0 \in \Theta$ it is enough for us to prove the following assertions. Here and below, $a^{\otimes 2} = a a^\top$ for a vector $a$.

A1 For every $n$, the function
$$\Theta \ni \theta \mapsto \sqrt{p_{h_n}(\theta; \cdot)} \in L_2(\mathbb{R})$$
is continuously differentiable; that is, the statistical experiment is regular.

A2
$$\lim_{n \to \infty} E\left| \sum_{k=1}^n \Big( r(n)^\top g_{h_n}\big(\theta_0;\ X^{\theta_0}_{k h_n} - X^{\theta_0}_{(k-1) h_n}\big) \Big)^{\otimes 2} - \Sigma(\theta_0) \right| = 0.$$

A3 For some $\varepsilon > 0$,
$$\lim_{n \to \infty} n \int_{\mathbb{R}} \left| r(n)^\top g_{h_n}(\theta_0; y) \right|^{2 + \varepsilon} p_{h_n}(\theta_0; y)\, dy = 0.$$

A4 For every
N > n →∞ sup | v |
Proposition 3.1. (1) $\zeta_{\alpha, t} \Rightarrow Z^{\alpha, C_\pm}_1$, $t \to 0$.
(2) The variables $\zeta_{\alpha, t}$, $t > 0$, and $Z^{\alpha, C_\pm}_1$ possess the distribution densities $\phi_{\alpha, t}$, $t > 0$, and $\phi_{\alpha, C_\pm}$, respectively. These densities are infinitely differentiable, bounded together with their derivatives, and for every $N > 0$
$$\sup_{|x| \le N} |\phi_{\alpha, t}(x) - \phi_{\alpha, C_\pm}(x)| \to 0, \qquad \sup_{|x| \le N} |\phi'_{\alpha, t}(x) - \phi'_{\alpha, C_\pm}(x)| \to 0, \quad t \to 0. \tag{3.1}$$

Denote
$$Y_t = X_t - U_t = \gamma t^{1/\alpha} \zeta_{\alpha, t} + \beta t - \gamma c_t,$$
then the distribution density for $Y_t$ under $P^\theta$ equals
$$p^Y_t(\theta; x) = \gamma^{-1} t^{-1/\alpha} \phi_{\alpha, t}\left( \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right), \tag{3.2}$$
and consequently
$$\partial_\beta p^Y_t(\theta; x) = -\gamma^{-2} t^{1 - 2/\alpha} \phi'_{\alpha, t}\left( \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right),$$
$$\partial_\gamma p^Y_t(\theta; x) = -\gamma^{-2} t^{-1/\alpha} \left[ \phi_{\alpha, t}\left( \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right) + \frac{x - \beta t}{\gamma t^{1/\alpha}} \phi'_{\alpha, t}\left( \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right) \right]$$
$$= -\gamma^{-2} t^{-1/\alpha} \left[ \phi_{\alpha, t}(z) + z \phi'_{\alpha, t}(z) - c_t t^{-1/\alpha} \phi'_{\alpha, t}(z) \right]_{z = \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t)}.$$
Because $X_t = Y_t + U_t$ and $Y, U$ are independent, we have
$$p_t(\theta; x) = \int_{\mathbb{R}} p^Y_t(\theta; x - t^{1/\alpha} y)\, \nu_t(dy), \tag{3.3}$$
where $\nu_t$ denotes the law of $t^{-1/\alpha} U_t$.

Taking (3.2) into account, we can re-arrange the above convolution formula for $p_t(\theta; x)$ in the following way:
$$p_t(\theta; x) = \gamma^{-1} t^{-1/\alpha} f_t\left( \theta;\ \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right), \tag{3.4}$$
where
$$f_t(\theta; z) = \int_{\mathbb{R}} \phi_{\alpha, t}(z - \gamma^{-1} y)\, \nu_t(dy).$$
Note that (2.8) implies $\nu_t \Rightarrow \delta_0$, $t \to 0$, and $\gamma$ is separated from 0 when $\theta = (\beta, \gamma) \in \tilde\Theta$. Then by the first assertion in (3.1), for every $N > 0$
$$\sup_{\theta \in \tilde\Theta} \sup_{|z| \le N} |f_t(\theta; z) - \phi_{\alpha, C_\pm}(z)| \to 0, \quad t \to 0. \tag{3.5}$$
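The scaling formula (3.2) and the two derivative formulas that follow it can be checked numerically. The sketch below is only a sanity check under stated assumptions: the density $\phi_{\alpha,t}$ is replaced by a standard normal stand-in (the chain-rule computation does not depend on the particular smooth density), $c_t$ is passed in as a plain number, and the names `p_Y` and `grad_p_Y` are hypothetical.

```python
import math

def phi(z):
    """Stand-in for phi_{alpha,t}: standard normal density (an assumption)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def dphi(z):
    """Derivative of the stand-in density."""
    return -z * phi(z)

def p_Y(beta, gamma, x, t, alpha, c_t):
    """Density of Y_t as in (3.2)."""
    s = gamma * t ** (1.0 / alpha)
    return phi((x - beta * t + gamma * c_t) / s) / s

def grad_p_Y(beta, gamma, x, t, alpha, c_t):
    """Closed-form partial derivatives of p_Y in beta and gamma."""
    s = t ** (1.0 / alpha)
    z = (x - beta * t + gamma * c_t) / (gamma * s)
    d_beta = -gamma ** -2 * t ** (1.0 - 2.0 / alpha) * dphi(z)
    d_gamma = -gamma ** -2 / s * (phi(z) + z * dphi(z) - c_t / s * dphi(z))
    return d_beta, d_gamma

def finite_difference(beta, gamma, x, t, alpha, c_t, eps=1e-6):
    """Central finite differences of p_Y, used only for comparison."""
    fd_beta = (p_Y(beta + eps, gamma, x, t, alpha, c_t)
               - p_Y(beta - eps, gamma, x, t, alpha, c_t)) / (2 * eps)
    fd_gamma = (p_Y(beta, gamma + eps, x, t, alpha, c_t)
                - p_Y(beta, gamma - eps, x, t, alpha, c_t)) / (2 * eps)
    return fd_beta, fd_gamma

if __name__ == "__main__":
    args = (0.3, 1.7, 0.05, 0.01, 1.5, 0.002)
    print(grad_p_Y(*args), finite_difference(*args))
```

The closed-form gradient and the finite-difference gradient should agree up to discretization error; the convolution formula (3.3) then transfers the same structure to $p_t$.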
It follows from (3.3) that
$$\partial_\beta p_t(\theta; x) = \int_{\mathbb{R}} \partial_\beta p^Y_t(\theta; x - t^{1/\alpha} y)\, \nu_t(dy), \qquad \partial_\gamma p_t(\theta; x) = \int_{\mathbb{R}} \partial_\gamma p^Y_t(\theta; x - t^{1/\alpha} y)\, \nu_t(dy).$$
Then, similarly as above, we have
$$\partial_\beta p_t(\theta; x) = -\gamma^{-2} t^{1 - 2/\alpha} f^{(1)}_t\left( \theta;\ \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t) \right),$$
$$\partial_\gamma p_t(\theta; x) = -\gamma^{-2} t^{-1/\alpha} \left[ f_t(\theta; z) + f^{(2)}_t(\theta; z) - c_t t^{-1/\alpha} f^{(1)}_t(\theta; z) \right]_{z = \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t)} \tag{3.6}$$
with
$$f^{(1)}_t(\theta; z) = \int_{\mathbb{R}} \phi'_{\alpha, t}(z - \gamma^{-1} y)\, \nu_t(dy), \qquad f^{(2)}_t(\theta; z) = \int_{\mathbb{R}} (z - \gamma^{-1} y)\, \phi'_{\alpha, t}(z - \gamma^{-1} y)\, \nu_t(dy),$$
and for every $N > 0$
$$\sup_{\theta \in \tilde\Theta} \sup_{|z| \le N} |f^{(1)}_t(\theta; z) - \phi'_{\alpha, C_\pm}(z)| \to 0, \qquad \sup_{\theta \in \tilde\Theta} \sup_{|z| \le N} |f^{(2)}_t(\theta; z) - z \phi'_{\alpha, C_\pm}(z)| \to 0, \quad t \to 0. \tag{3.7}$$
Below we use formulae (3.4) – (3.7) to control the "local" behavior of the functions $g_t$, $q_t$ involved in A1 – A4. To control the "global" behavior, we use the following moment bound; we prove this bound in Section 4 below.

Proposition 3.2.
Under the conditions of Theorem 2.1, for every $\tilde\Theta$ and every $\delta \in (0, \delta_0)$ (where $\delta_0$ comes from H2),
$$\sup_{n \ge 1,\ \theta \in \tilde\Theta} E\left| \tilde r(n)^\top g_{h_n}\big(\theta; X^\theta_{h_n}\big) \right|^{2 + \delta} < \infty.$$

3.2. Proofs of A1–A4.
3.2.1. Proof of A1. For this it is sufficient to show that, for a fixed $t = h_n$, the mapping
$$\Theta \ni \theta \mapsto q_t(\theta; \cdot) \in L_2(\mathbb{R}) \tag{3.8}$$
is continuous, and for any $\theta_1, \theta_2$ such that the segment $[\theta_1, \theta_2]$ is contained in $\Theta$,
$$\sqrt{p_t(\theta_2; \cdot)} - \sqrt{p_t(\theta_1; \cdot)} = \left( \int_0^1 q_t((1 - s)\theta_1 + s\theta_2; \cdot)\, ds \right)^\top (\theta_2 - \theta_1) \tag{3.9}$$
with the integral understood in the sense of convergence of the Riemann sums in $L_2(\mathbb{R})$. The argument here is similar to, and simpler than, the one used in the proof of Theorem 2 in [18], hence we just outline the main steps. Define
$$\Psi_\varepsilon(z) = \begin{cases} 0, & z < \varepsilon/2, \\ \dfrac{(z - \varepsilon/2)^2}{2\varepsilon^{3/2}}, & z \in [\varepsilon/2, \varepsilon], \\ \sqrt{z} - \dfrac{7}{8}\sqrt{\varepsilon}, & z \ge \varepsilon. \end{cases}$$
Then, by the construction, for $z > 0$
$$\Psi_\varepsilon(z) \to \Psi_0(z) := \sqrt{z}, \qquad \Psi'_\varepsilon(z) \to \Psi'_0(z) = \frac{1}{2\sqrt{z}}, \quad \varepsilon \to 0.$$
Because $\Psi_\varepsilon \in C^1$, $\varepsilon > 0$, we have by Proposition 3.1 and (3.4), (3.6) that $\Psi_\varepsilon(p_t(\theta; x))$ depends smoothly on $\theta, x$ and
$$q_{t,\varepsilon}(\theta; x) := \nabla_\theta \big( \Psi_\varepsilon(p_t(\theta; x)) \big) = \Psi'_\varepsilon(p_t(\theta; x))\, \nabla_\theta p_t(\theta; x).$$
Then
$$\Psi_\varepsilon\big( p_t(\theta_2; \cdot) \big) - \Psi_\varepsilon\big( p_t(\theta_1; \cdot) \big) = \left( \int_0^1 q_{t,\varepsilon}((1 - s)\theta_1 + s\theta_2; \cdot)\, ds \right)^\top (\theta_2 - \theta_1),$$
and to prove (3.9) it is sufficient to prove that
$$\Psi_\varepsilon\big( p_t(\theta; \cdot) \big) \to \Psi_0\big( p_t(\theta; \cdot) \big) = \sqrt{p_t(\theta; \cdot)}, \quad \varepsilon \to 0,$$
in $L_2(\mathbb{R})$ for every $\theta \in \Theta$, and that
$$q_{t,\varepsilon}(\theta; \cdot) \to q_t(\theta; \cdot), \quad \varepsilon \to 0,$$
in $L_2(\mathbb{R})$ uniformly in $\theta \in [\theta_1, \theta_2]$. The latter would also provide that the function (3.8) is continuous as a uniform limit of continuous functions. Let us prove the second convergence, since the proof of the first one is similar and simpler. By the construction, we have $0 \le \Psi'_\varepsilon(z) \le \Psi'_0(z) = (2\sqrt{z})^{-1}$, therefore
$$q_{t,\varepsilon}(\theta; x) = \Psi'_\varepsilon(p_t(\theta; x))\, \nabla_\theta p_t(\theta; x) = \Upsilon_\varepsilon(p_t(\theta; x))\, g_t(\theta; x),$$
where
$$\Upsilon_\varepsilon(z) = \Psi'_\varepsilon(z)\, z \ \begin{cases} \le (1/2)\sqrt{z}, & z > 0, \\ = (1/2)\sqrt{z}, & z \ge \varepsilon. \end{cases}$$
Hence
$$\int_{\mathbb{R}} \big| q_{t,\varepsilon}(\theta; x) - q_t(\theta; x) \big|^2 dx \le \int_{\{x :\, p_t(\theta; x) \le \varepsilon\}} \big| g_t(\theta; x) \big|^2 p_t(\theta; x)\, dx.$$
Recall that $t = h_n$. Take $\tilde\Theta$ in Proposition 3.2 equal to the segment $[\theta_1, \theta_2]$, then
$$\sup_{\theta \in \tilde\Theta} \int_{\mathbb{R}} \big| g_t(\theta; x) \big|^{2 + \delta} p_t(\theta; x)\, dx = \sup_{\theta \in \tilde\Theta} E\big| g_t(\theta; X_t) \big|^{2 + \delta} < \infty.$$
Hence by the Hölder inequality
$$\int_{\mathbb{R}} \big| q_{t,\varepsilon}(\theta; x) - q_t(\theta; x) \big|^2 dx \le C \left( \int_{\{x :\, p_t(\theta; x) \le \varepsilon\}} p_t(\theta; x)\, dx \right)^{\delta/(2 + \delta)}.$$
The density $p_t(\theta; x)$ is given explicitly by (3.4). Using this representation and changing the variables $z = \gamma^{-1} t^{-1/\alpha} (x - \beta t + \gamma c_t)$, we get
$$\int_{\{x :\, p_t(\theta; x) \le \varepsilon\}} p_t(\theta; x)\, dx = \int_{\{z :\, f_t(\theta; z) \le \gamma t^{1/\alpha} \varepsilon\}} f_t(\theta; z)\, dz.$$
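The role of the smoothing function $\Psi_\varepsilon$ is easy to check numerically. The sketch below uses one concrete $C^1$ choice of the constants (a quadratic piece $(z - \varepsilon/2)^2/(2\varepsilon^{3/2})$ glued to $\sqrt{z} - \tfrac{7}{8}\sqrt{\varepsilon}$); these constants are an assumption obtained by matching values and first derivatives at $z = \varepsilon$, and the function names are hypothetical. It verifies the two properties the proof relies on: the pieces match at the junction, and $0 \le \Psi'_\varepsilon(z) \le (2\sqrt{z})^{-1}$.

```python
import math

def psi_eps(z, eps):
    """A C^1 regularization of sqrt(z) that vanishes for z < eps/2 (assumed constants)."""
    if z < eps / 2:
        return 0.0
    if z <= eps:
        return (z - eps / 2) ** 2 / (2 * eps ** 1.5)
    return math.sqrt(z) - 7 * math.sqrt(eps) / 8

def dpsi_eps(z, eps):
    """Derivative of psi_eps with respect to z."""
    if z < eps / 2:
        return 0.0
    if z <= eps:
        return (z - eps / 2) / eps ** 1.5
    return 0.5 / math.sqrt(z)

def max_violation(eps, n=2000, z_max=4.0):
    """Largest value of dpsi_eps(z) - 1/(2 sqrt z) on a grid; should be <= 0."""
    worst = -math.inf
    for i in range(1, n + 1):
        z = z_max * i / n
        worst = max(worst, dpsi_eps(z, eps) - 0.5 / math.sqrt(z))
    return worst

if __name__ == "__main__":
    eps = 0.1
    print(psi_eps(eps, eps), math.sqrt(eps) / 8, max_violation(eps))
```

Since $\Psi'_\varepsilon$ coincides with $\Psi'_0$ on $\{z \ge \varepsilon\}$, the difference $q_{t,\varepsilon} - q_t$ is supported on $\{p_t \le \varepsilon\}$, which is what the subsequent Hölder estimate exploits.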
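We have $\phi_{\alpha, t} \in L_1(\mathbb{R})$, and therefore the mapping
$$[\theta_1, \theta_2] \ni \theta = (\beta, \gamma) \mapsto \phi_{\alpha, t}(\cdot - \gamma^{-1} y) \in L_1(\mathbb{R})$$
is continuous. Hence the mapping
$$[\theta_1, \theta_2] \ni \theta \mapsto f_t(\theta; \cdot) = \int_{\mathbb{R}} \phi_{\alpha, t}(\cdot - \gamma^{-1} y)\, \nu_t(dy)$$
is continuous, as well. This finally implies that
$$\int_{\{x :\, p_t(\theta; x) \le \varepsilon\}} p_t(\theta; x)\, dx = \int_{\{z :\, f_t(\theta; z) \le \gamma t^{1/\alpha} \varepsilon\}} f_t(\theta; z)\, dz \to 0, \quad \varepsilon \to 0,$$
uniformly in $\theta \in [\theta_1, \theta_2]$, which completes the proof of the required convergence and provides A1.

3.2.2. Proof of A2. Denote
$$\Gamma^\theta_{k,n} = \tilde r(n)^\top g_{h_n}\big( \theta;\ X^\theta_{k h_n} - X^\theta_{(k-1) h_n} \big), \quad k = 1, \dots, n,$$
then
$$\sum_{k=1}^n \Big( r(n)^\top g_{h_n}\big( \theta;\ X^\theta_{k h_n} - X^\theta_{(k-1) h_n} \big) \Big)^{\otimes 2} = \frac{1}{n} \sum_{k=1}^n \big( \Gamma^\theta_{k,n} \big)^{\otimes 2}.$$
Since $X$ is a Lévy process, $\{\Gamma^\theta_{k,n}\}_{1 \le k \le n}$ is a triangular array of random vectors, which are row-wise independent and identically distributed. Let us analyze the common law of $\Gamma^\theta_{k,n}$ in the $n$-th row. Denote
$$\xi^\theta_{k,n} = \gamma^{-1} h_n^{-1/\alpha} \big( X^\theta_{k h_n} - X^\theta_{(k-1) h_n} - \beta h_n + \gamma c_{h_n} \big), \quad k = 1, \dots, n,$$
which are i.i.d. random variables with
$$\xi^\theta_{1,n} \overset{d}{=} \zeta_{\alpha, h_n} + \gamma^{-1} h_n^{-1/\alpha} U_{h_n}.$$
By statement (1) of Proposition 3.1 and (2.8), we have then
$$\xi^\theta_{1,n} \Rightarrow Z^{\alpha, C_\pm}_1. \tag{3.10}$$
Next, by (3.4), (3.6) the components of $g_{h_n}(\theta; X^\theta_{k h_n} - X^\theta_{(k-1) h_n})$ are given by
$$g^1_{h_n}\big( \theta;\ X^\theta_{k h_n} - X^\theta_{(k-1) h_n} \big) = -\gamma^{-1} h_n^{1 - 1/\alpha} \frac{f^{(1)}_{h_n}(\theta; \xi^\theta_{k,n})}{f_{h_n}(\theta; \xi^\theta_{k,n})},$$
$$g^2_{h_n}\big( \theta;\ X^\theta_{k h_n} - X^\theta_{(k-1) h_n} \big) = -\gamma^{-1} \left[ 1 + \frac{f^{(2)}_{h_n}(\theta; \xi^\theta_{k,n})}{f_{h_n}(\theta; \xi^\theta_{k,n})} \right] + \gamma^{-1} c_{h_n} h_n^{-1/\alpha} \frac{f^{(1)}_{h_n}(\theta; \xi^\theta_{k,n})}{f_{h_n}(\theta; \xi^\theta_{k,n})}.$$
Recall that
$$\tilde r(n)^\top = \begin{pmatrix} h_n^{1/\alpha - 1} & 0 \\ c_{h_n} h_n^{-1} & 1 \end{pmatrix},$$
hence we can write finally
$$\Gamma^\theta_{k,n} = \gamma^{-1} G_{\alpha, h_n}(\theta; \xi^\theta_{k,n}),$$
where the vector-valued functions $G_{\alpha, h_n}$ have the components
$$G^1_{\alpha, h_n}(\theta; x) = -\frac{f^{(1)}_{h_n}(\theta; x)}{f_{h_n}(\theta; x)}, \qquad G^2_{\alpha, h_n}(\theta; x) = -1 - \frac{f^{(2)}_{h_n}(\theta; x)}{f_{h_n}(\theta; x)}.$$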
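Denote by $G_{\alpha, C_\pm}$ the vector-valued function with the components
$$G^1_{\alpha, C_\pm}(x) = -\frac{\phi'_{\alpha, C_\pm}(x)}{\phi_{\alpha, C_\pm}(x)}, \qquad G^2_{\alpha, C_\pm}(x) = -1 - \frac{x \phi'_{\alpha, C_\pm}(x)}{\phi_{\alpha, C_\pm}(x)},$$
and denote for $\varepsilon > 0$, $N > 0$
$$K_{\varepsilon, N} = \{ x : |x| \le N,\ \phi_{\alpha, C_\pm}(x) \ge \varepsilon \},$$
which is a compact set in $\mathbb{R}$. It follows from (3.5), (3.7) that, for every fixed $\theta \in \Theta$, $\varepsilon > 0$, $N > 0$,
$$G_{\alpha, t}(\theta; x) \to G_{\alpha, C_\pm}(x), \quad t \to 0,$$
uniformly with respect to $x \in K_{\varepsilon, N}$. By (3.10) and continuity of the limit distribution,
$$\limsup_{n \to \infty} P(\xi^\theta_{1,n} \notin K_{\varepsilon, N}) \le P(Z^{\alpha, C_\pm}_1 \notin K_{\varepsilon, N}).$$
We also have
$$P(Z^{\alpha, C_\pm}_1 \notin K_{\varepsilon, N}) \to 0, \quad \varepsilon \to 0,\ N \to \infty.$$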
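For intuition about the limit score function $G_{\alpha, C_\pm}$ and the matrix $\Sigma(\theta) = \gamma^{-2} E\, G_{\alpha, C_\pm}(Z_1)^{\otimes 2}$, here is a minimal numerical sketch in the one case where the stable density is elementary. It assumes, purely for illustration, that $\alpha = 1$, the law is symmetric, and the normalization is such that $Z_1$ is standard Cauchy, $\phi(x) = 1/(\pi(1 + x^2))$; the general $\phi_{\alpha, C_\pm}$ has no closed form. The helper names are hypothetical.

```python
import math

def cauchy_expectation(h, n=20000):
    """Midpoint-rule approximation of E[h(X)] for X standard Cauchy.

    Uses the substitution x = tan(theta), under which the Cauchy law becomes
    the uniform law on (-pi/2, pi/2).
    """
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * math.pi / n
        total += h(math.tan(theta))
    return total / n

def score_beta(x):
    """G^1(x) = -phi'(x)/phi(x) for the standard Cauchy density."""
    return 2 * x / (1 + x * x)

def score_gamma(x):
    """G^2(x) = -1 - x phi'(x)/phi(x) for the standard Cauchy density."""
    return -1 + 2 * x * x / (1 + x * x)

def fisher_matrix(gamma):
    """Entries of gamma^{-2} E[G G^T] in the assumed standard Cauchy case."""
    s11 = cauchy_expectation(lambda x: score_beta(x) ** 2) / gamma ** 2
    s22 = cauchy_expectation(lambda x: score_gamma(x) ** 2) / gamma ** 2
    s12 = cauchy_expectation(lambda x: score_beta(x) * score_gamma(x)) / gamma ** 2
    return s11, s12, s22

if __name__ == "__main__":
    print(fisher_matrix(2.0))
```

In this special case both diagonal entries equal $1/(2\gamma^2)$ and the off-diagonal entry vanishes by symmetry.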
Let us summarize: the random vectors $\Gamma^\theta_{k,n}$ are represented as images of the i.i.d. random variables $\xi^\theta_{k,n}$ under the functions $G_{\alpha, h_n}(\theta; \cdot)$, and
• the common law of $\xi^\theta_{k,n}$ weakly converges to the law of $Z^{\alpha, C_\pm}_1$;
• on every compact set $K_{\varepsilon, N}$, the functions $G_{\alpha, h_n}(\theta; \cdot)$ converge uniformly to the function $G_{\alpha, C_\pm}$, which is continuous on this compact set;
• by choosing $\varepsilon > 0$ small enough and $N > 0$ large enough, the probability of the event $\xi^\theta_{k,n} \in K_{\varepsilon, N}$ can be made arbitrarily close to 1.

Because weak convergence is preserved by continuous mappings, we deduce from the above that the common law of $\Gamma^\theta_{k,n}$, $k = 1, \dots, n$, weakly converges as $n \to \infty$ to the law of $\Gamma^\theta = \gamma^{-1} G_{\alpha, C_\pm}(Z^{\alpha, C_\pm}_1)$. On the other hand, Proposition 3.2 yields that the family $\{(\Gamma^\theta_{k,n})^{\otimes 2}\}$ is uniformly integrable, hence by the Law of Large Numbers for independent random variables
$$E\left| \frac{1}{n} \sum_{k=1}^n (\Gamma^\theta_{k,n})^{\otimes 2} - E(\Gamma^\theta)^{\otimes 2} \right| \to 0, \quad n \to \infty.$$
An easy calculation shows that the covariance matrix of $\Gamma^\theta$ equals $\Sigma(\theta)$ given by the second identity in (2.9). Taking $\theta = \theta_0$, we complete the proof of A2.

3.2.3. Proof of A3. Because
$$n \int_{\mathbb{R}} \big| r(n)^\top g_{h_n}(\theta_0; y) \big|^{2 + \delta} p_{h_n}(\theta_0; y)\, dy = n^{-\delta/2} E\big| \tilde r(n)^\top g_{h_n}\big( \theta_0; X^{\theta_0}_{h_n} \big) \big|^{2 + \delta},$$
assertion A3 follows from Proposition 3.2 immediately.

3.2.4. Proof of A4.
We have
$$q_t(\theta; x) = \frac{1}{2} g_t(\theta; x) \sqrt{p_t(\theta; x)},$$
hence for any $\theta_1, \theta_2 \in \Theta$
$$n \int_{\mathbb{R}} \big| r(n)^\top (q_{h_n}(\theta_2; x) - q_{h_n}(\theta_1; x)) \big|^2 dx = \frac{1}{4} \int_{\mathbb{R}} \left| \tilde r(n)^\top g_{h_n}(\theta_2; x) \sqrt{\frac{p_{h_n}(\theta_2; x)}{p_{h_n}(\theta_1; x)}} - \tilde r(n)^\top g_{h_n}(\theta_1; x) \right|^2 p_{h_n}(\theta_1; x)\, dx$$
$$= \frac{1}{4} E\left| \frac{1}{\gamma_2} G_{\alpha, h_n}(\theta_2, \xi^{\theta_1}_{1,n}) \sqrt{\frac{f_{h_n}(\theta_2; \xi^{\theta_1}_{1,n})}{f_{h_n}(\theta_1; \xi^{\theta_1}_{1,n})}} - \frac{1}{\gamma_1} G_{\alpha, h_n}(\theta_1, \xi^{\theta_1}_{1,n}) \right|^2;$$
we keep using the notation introduced in the proof of A2. Take $\theta_1 = \theta_0$, and let $\theta_n = (\beta_n, \gamma_n) \to \theta_0$ be an arbitrary sequence. It follows from (3.5), (3.7) that for every fixed $\varepsilon > 0$, $N > 0$
$$\sup_{x \in K_{\varepsilon, N}} \left| \frac{1}{\gamma_n} G_{\alpha, h_n}(\theta_n, x) \sqrt{\frac{f_{h_n}(\theta_n; x)}{f_{h_n}(\theta_0; x)}} - \frac{1}{\gamma_0} G_{\alpha, h_n}(\theta_0, x) \right| \to 0, \quad n \to \infty.$$
Then, by the Cauchy inequality $(a + b)^2 \le 2a^2 + 2b^2$,
$$\limsup_{n \to \infty} n \int_{\mathbb{R}} \big| r(n)^\top (q_{h_n}(\theta_n; x) - q_{h_n}(\theta_0; x)) \big|^2 dx$$
$$\le \limsup_{n \to \infty} \left( \frac{1}{2\gamma_n^2} E \big| G_{\alpha, h_n}(\theta_n, \xi^{\theta_0}_{1,n}) \big|^2 \frac{f_{h_n}(\theta_n; \xi^{\theta_0}_{1,n})}{f_{h_n}(\theta_0; \xi^{\theta_0}_{1,n})} 1_{\xi^{\theta_0}_{1,n} \notin K_{\varepsilon, N}} + \frac{1}{2\gamma_0^2} E \big| G_{\alpha, h_n}(\theta_0, \xi^{\theta_0}_{1,n}) \big|^2 1_{\xi^{\theta_0}_{1,n} \notin K_{\varepsilon, N}} \right)$$
$$= \limsup_{n \to \infty} \left( \frac{1}{2\gamma_n^2} E \big| G_{\alpha, h_n}(\theta_n, \xi^{\theta_n}_{1,n}) \big|^2 1_{\xi^{\theta_n}_{1,n} \notin K_{\varepsilon, N}} + \frac{1}{2\gamma_0^2} E \big| G_{\alpha, h_n}(\theta_0, \xi^{\theta_0}_{1,n}) \big|^2 1_{\xi^{\theta_0}_{1,n} \notin K_{\varepsilon, N}} \right).$$
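Since
$$\gamma^{-1} G_{\alpha, h_n}(\theta; \xi^\theta_{1,n}) = \Gamma^\theta_{1,n} = \tilde r(n)^\top g_{h_n}\big( \theta; X^\theta_{h_n} \big),$$
we deduce using Proposition 3.2 and the Hölder inequality that
$$\limsup_{n \to \infty} n \int_{\mathbb{R}} \big| r(n)^\top (q_{h_n}(\theta_n; x) - q_{h_n}(\theta_0; x)) \big|^2 dx \le C \limsup_{n \to \infty} \Big( P(\xi^{\theta_n}_{1,n} \notin K_{\varepsilon, N})^{\delta/(2 + \delta)} + P(\xi^{\theta_0}_{1,n} \notin K_{\varepsilon, N})^{\delta/(2 + \delta)} \Big)$$
with some constant $C$. We have
$$\xi^{\theta_n}_{1,n} \Rightarrow Z^{\alpha, C_\pm}_1, \qquad \xi^{\theta_0}_{1,n} \Rightarrow Z^{\alpha, C_\pm}_1, \quad n \to \infty,$$
hence we get
$$\limsup_{n \to \infty} n \int_{\mathbb{R}} \big| r(n)^\top (q_{h_n}(\theta_n; x) - q_{h_n}(\theta_0; x)) \big|^2 dx \le C\, P(Z^{\alpha, C_\pm}_1 \notin K_{\varepsilon, N})^{\delta/(2 + \delta)}.$$
Recall that $\varepsilon > 0$, $N > 0$ are arbitrary. Taking $\varepsilon \to 0$, $N \to \infty$ we get finally
$$\limsup_{n \to \infty} n \int_{\mathbb{R}} \big| r(n)^\top (q_{h_n}(\theta_n; x) - q_{h_n}(\theta_0; x)) \big|^2 dx = 0,$$
which completes the proof of A4. □

3.3. Outline of the proof of Theorem 2.2.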
To get the required LAN property uniformly in $U \in \mathcal{U}$, it is enough to fix a sequence $U^n$ of Lévy processes such that $h_n^{-1/\alpha} U^n_{h_n} \to 0$ in probability, and to prove the LAN property for the model in which the process $X^\theta$ in the $n$-th sample is replaced by
$$X^{\theta, n}_t = \beta t + \gamma Z_t + U^n_t.$$
The moment bound in Proposition 3.2 is, to a very high extent, insensitive with respect to the process $U$; in particular, we will show in Section 4 that
$$\sup_{n \ge 1,\ \theta \in \tilde\Theta} E\left| \tilde r(n)^\top g_{h_n}\big( \theta; X^{\theta, n}_{h_n} \big) \right|^{2 + \varepsilon} < \infty. \tag{3.11}$$
The law of the process $U$ is involved in the definition of the functions $f, f^{(1)}, f^{(2)}$, but it is straightforward to see that the relations (3.5), (3.7) in fact hold true uniformly with respect to $U \in \mathcal{U}$. Finally, the random variables
$$\xi^{\theta, n}_{k,n} = \gamma^{-1} h_n^{-1/\alpha} \big( X^{\theta, n}_{k h_n} - X^{\theta, n}_{(k-1) h_n} - \beta h_n + \gamma c_{h_n} \big) \overset{d}{=} \zeta_{\alpha, h_n} + \gamma^{-1} h_n^{-1/\alpha} U^n_{h_n}$$
weakly converge to $Z^{\alpha, C_\pm}_1$. Hence repeating, with obvious notational changes, the calculations from Section 3.2, we get properties A1 – A4 for the modified model, which proves the required LAN property, uniform in $U \in \mathcal{U}$. □

4. Malliavin calculus-based integral representation for the derivative of the log-likelihood function and related $L_p$-bounds

Our main aim in this section is to prove Proposition 3.2, which is the cornerstone of the proof of Theorem 2.1. With this purpose in mind, we give an integral representation for the derivative of the log-likelihood function by means of a certain version of the Malliavin calculus. In the diffusive case, such a representation was developed by Gobet in [13] and [14]; see also [8]. In the Lévy setting, the choice of a particular design of such a calculus is a non-trivial problem, and we discuss it in detail below.
4.1. The main statement: formulation, discussion, and an outline of the proof.
In what follows, $\nu(ds, du)$ and $\tilde\nu(ds, du) = \nu(ds, du) - ds\,\mu(du)$ are, respectively, the Poisson point measure and the compensated Poisson measure from the Lévy-Itô representation of $Z$:
$$Z_t = \int_0^t \int_{|u| > 1} u\, \nu(ds, du) + \int_0^t \int_{|u| \le 1} u\, \tilde\nu(ds, du).$$
We denote
$$\mathrm{D}_t X^\theta_t = \gamma \mathrm{D}_t Z_t = \gamma \int_0^t \int_{\mathbb{R}} u^2\, \nu(ds, du), \qquad \mathrm{D}^2_t X^\theta_t = 2\gamma \int_0^t \int_{\mathbb{R}} u^3\, \nu(ds, du);$$
the genealogy of the notation will become clear later. To define the stochastic integrals with respect to $\nu$ properly, we decompose them into two integrals (which is a usual trick). The "small jump" parts, which correspond to values $u \in [-1, 1]$, are well defined because the functions $u^2$, $u^3$ are integrable with respect to $\mu$ on $[-1, 1]$; the corresponding integrals with respect to $\nu$ are understood in the $L_1$ sense. The "large jump" parts of the integrals with respect to $\nu$ are understood in the path-wise way, i.e. as sums over a finite set of jumps. Next, we denote
$$\chi(u) = -u^2 \frac{m'(u)}{m(u)} - 2u$$
and put
$$\delta_t(1) = \int_0^t \int_{|u| \le u_0} \chi(u)\, \tilde\nu(ds, du) + \int_0^t \int_{|u| > u_0} \chi(u)\, \nu(ds, du) + t u_0^2 \big[ m(u_0) - m(-u_0) \big],$$
where $u_0$ comes from the condition H2. By H2, $|\chi(u)| \le C|u|$ for $|u| \le u_0$, hence the "small jump" integral above is well defined in the $L_2$ sense; the "large jump" integral is understood in the path-wise sense.

We define the modified Malliavin weight (we postpone for a while the explanation of the terminology) as the vector $\Xi^\theta_t = (\Xi^\beta_t, \Xi^\gamma_t)^\top$ with
$$\Xi^\beta_t = \frac{t\, \delta_t(1)}{\mathrm{D}_t X^\theta_t} + \frac{t\, \mathrm{D}^2_t X^\theta_t}{(\mathrm{D}_t X^\theta_t)^2}, \qquad \Xi^\gamma_t = \frac{Z_t\, \delta_t(1)}{\mathrm{D}_t X^\theta_t} + \frac{Z_t\, \mathrm{D}^2_t X^\theta_t}{(\mathrm{D}_t X^\theta_t)^2} - \frac{1}{\gamma}. \tag{4.1}$$
Denote by $E^{t, \theta}_{x, y}$ the expectation with respect to the law of the bridge of the process $X^\theta$ conditioned by $X^\theta_0 = x$, $X^\theta_t = y$. Note that because the process $X^\theta$ possesses a continuous transition probability density $p_t(\theta; x, y) = p_t(\theta; y - x)$, the law of the bridge is well defined for any $t, x, y$ such that $p_t(\theta; x, y) > 0$.

Theorem 4.1. (1) Let $\delta_0$ be the same as in H2. Then for every $\delta \in (0, \delta_0)$ and every $\tilde\Theta$,
$$\sup_{\theta \in \tilde\Theta} \sup_{n \ge 1} E\left| \tilde r(n)^\top \Xi^\theta_{h_n} \right|^{2 + \delta} < \infty. \tag{4.2}$$
(2) The following intergal representation formula holds true: (4.3) g t ( θ ; x ) = ( E t,θ ,x Ξ θt , p t ( θ ; x ) > , , otherwise . It follows from (4.3) that ˜ r ( n ) ⊤ g t ( θ ; X θh n ) = E h ˜ r ( n ) ⊤ Ξ θh n (cid:12)(cid:12)(cid:12) X θh n i . Note that the explicit formula for the weight Ξ θt does not involve the “nuisance noise” U at all,and the dependence of p t ( θ ; X θh n ) on U is contained in the operation of the conditional expectation,only. Then by the Jensen inequality E (cid:12)(cid:12)(cid:12) ˜ r ( n ) ⊤ g t ( θ ; X θh n ) (cid:12)(cid:12)(cid:12) δ ≤ E (cid:12)(cid:12)(cid:12) ˜ r ( n ) ⊤ Ξ θh n (cid:12)(cid:12)(cid:12) δ , where U is not involved in the right hand side term. Hence Theorem 4.1 immediately yields bothProposition 3.2 and the moment bound (3.11).Let us explain the main idea which Theorem 4.1 is based on. By analogy to Gobet’s resultsin the diffusive case ([13], [14]), one can naturally expect that an integral representation of theform (4.3) could be obtained by means of a proper version of the Malliavin calculus for jumpprocesses. Two possible ways to do that were developed in [7] and [18], being in fact close toeach other and, heuristically, being based on “infinitesimal perturbation of the jump configuration”with an intensity function ρ which is a “compactly” supported smooth function (see a more detailedexposition in Section 4.2 below). Using either of these two approaches it is possible to prove ananalogue of (4.3) with Ξ θt being replaced by some Ξ θt,ρ which, in full analogy to Gobet’s approach,has the meaning of the Malliavin weight (see formula (4.12) below). However, we then encounterfollowing two difficulties, both being related with moment bounds for the corresponding terms. 
• In order to provide that Ξ γt,ρ is square integrable (which is necessary for Ξ θt,ρ to have repre-sentation (4.12) with the Skorokhod integral in the right hand side), we need an additionalmoment bound for Z : for some δ ′ > Z | u |≥ | u | δ ′ µ ( du ) < ∞ . This excludes from the consideration “heavy tailed” L´evy processes, e.g. the particularlyimportant α -stable process. • Even if we confine ourselves by the class of “light tailed” L´evy processes satisfying (4.4),we can not obtain analogue of the moment bound (4.2) for Ξ θt,ρ : namely, respective upperbound for L δ -norm of Ξ θt,ρ would explode as t → ρ ( u ) = u , u ∈ R , in the formula for Ξ θt,ρ ; see Section 4.2 below, and especially Remark 4.1 which explains the heuristicsbehind the particular choice ρ ( u ) = u . This explains both the name “the modified Malliavinweight” we have used for the term defined by (4.1), and the background for the notation D t , δ t : wetake the explicit formulae for the Malliavin derivative D t,ρ and respective Skorokhod integral δ t,ρ , and put therein ρ ( u ) = u . Note that because ρ ( u ) = u is not compactly supported, D t X θt mayfail to be square integrable, which means that now D t X θt can not be interpreted as a Malliavinderivative. As a consequence, now one can not apply the Malliavin calculus tools to prove (4.3)directly. Hence we will use the following three step procedure to prove (4.3):(1) first, we apply Malliavin calculus tools to prove analogue of (4.3) for Ξ θt,ρ with compactlysupported ρ under the additional moment condition (4.4);(2) second, we approximate ρ ( u ) = u by a sequence of compactly supported ρ ’s;(3) finally, we approximate general Z by a sequence of “light tailed” L´evy processes Z L , L ≥ Proof of (4.2): Moment bound.
First, we give an explicit expression for $\tilde r(n)^\top\Xi^\theta_t$. Denote
\[ \tilde Z_t=Z_t+c_t=\int_0^t\int_{|u|>t^{1/\alpha}}u\,\nu(ds,du)+\int_0^t\int_{|u|\le t^{1/\alpha}}u\,\tilde\nu(ds,du), \]
then
\[ \tilde r(n)^\top\Xi^\theta_t=\left(\frac{t^{1/\alpha}\delta_t(1)}{\mathrm{D}_tX^\theta_t}+\frac{t^{1/\alpha}\,\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^2},\ \ \frac{\tilde Z_t\,\delta_t(1)}{\mathrm{D}_tX^\theta_t}+\frac{\tilde Z_t\,\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^2}-\frac{1}{\gamma}\right)^{\!\top}. \tag{4.5} \]
We will conclude the required bound (4.2) from a sequence of auxiliary estimates for the terms involved in the explicit expression (4.5).

Denote
\[ \kappa_t=\int_0^t\int_{|u|\le t^{1/\alpha}}u^2\,\nu(ds,du). \]

Lemma 4.1. For every $p\ge 1$ there exists $C_p<\infty$ such that
\[ E\big(t^{-2/\alpha}\kappa_t\big)^{-p}\le C_p,\qquad t\in(0,1]. \]

Proof. For any $\varepsilon<1$,
\[ P\big(\kappa_t<\varepsilon^2t^{2/\alpha}\big)\le P\Big(\nu\big([0,t]\times\{|u|\in[\varepsilon t^{1/\alpha},t^{1/\alpha}]\}\big)=0\Big)=\exp\Big\{-t\mu\big(|u|\in[\varepsilon t^{1/\alpha},t^{1/\alpha}]\big)\Big\}. \]
Condition H1 yields that, with some positive constant $C$,
\[ t\mu\big(|u|\in[\varepsilon t^{1/\alpha},t^{1/\alpha}]\big)\ge Ct\int_{\varepsilon t^{1/\alpha}}^{t^{1/\alpha}}\alpha|u|^{-\alpha-1}du=C(\varepsilon^{-\alpha}-1),\qquad t\in(0,1]. \]
Hence for the family of random variables $t^{-2/\alpha}\kappa_t$, $t\in(0,1]$, we have the uniform bound
\[ P\big(t^{-2/\alpha}\kappa_t<\varepsilon^2\big)\le e^{-C\varepsilon^{-\alpha}+C},\qquad \varepsilon<1,\ t\le 1, \]
which proves the required statement. □

Because
\[ \mathrm{D}_tX^\theta_t=\gamma\int_0^t\int_{\mathbb{R}}u^2\,\nu(ds,du)\ge\gamma\kappa_t, \]
Lemma 4.1 immediately gives the following: for every $p\ge 1$,
\[ \sup_{\theta\in\tilde\Theta,\ t\in(0,1]}E\left(\frac{t^{1/\alpha}}{\sqrt{\mathrm{D}_tX^\theta_t}}\right)^{p}<\infty. \tag{4.6} \]
Next, observe that both $\mathrm{D}_tX^\theta_t$ and $\mathrm{D}^2_tX^\theta_t$ are represented as sums over the set of jumps of the process $Z$. Because
\[ \Big(\sum_i a_i\Big)^{3/2}\ge\sum_i a_i^{3/2},\qquad \{a_i\}\subset[0,\infty), \]
we have
\[ \left|\frac{\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^{3/2}}\right|\le\frac{2}{\sqrt{\gamma}}. \tag{4.7} \]

Lemma 4.2. For every $p\ge 1$,
\[ \sup_{\theta\in\tilde\Theta,\ t\in(0,1]}E\left|\frac{\tilde Z_t}{\sqrt{\mathrm{D}_tX^\theta_t}}\right|^{p}<\infty. \tag{4.8} \]

Proof. We have
\[ \tilde Z_t=\int_0^t\int_{|u|\le t^{1/\alpha}}u\,\tilde\nu(ds,du)+\int_0^t\int_{|u|>t^{1/\alpha}}u\,\nu(ds,du)=:\xi_t+\zeta_t, \]
\[ \mathrm{D}_tX^\theta_t=\gamma\int_0^t\int_{\mathbb{R}}u^2\,\nu(ds,du)=\gamma(\kappa_t+\eta_t),\qquad \eta_t=\int_0^t\int_{|u|>t^{1/\alpha}}u^2\,\nu(ds,du) \]
($\kappa_t$ is already defined above). Then
\[ \left|\frac{\tilde Z_t}{\sqrt{\mathrm{D}_tX^\theta_t}}\right|\le\gamma^{-1/2}\left(\frac{|\xi_t|}{\sqrt{\kappa_t+\eta_t}}+\frac{|\zeta_t|}{\sqrt{\kappa_t+\eta_t}}\right)\le\gamma^{-1/2}\left(\frac{|\xi_t|}{\sqrt{\kappa_t}}+\frac{|\zeta_t|}{\sqrt{\eta_t}}\right). \tag{4.9} \]
By Lemma 4.1, the family $\big(t^{1/\alpha}/\sqrt{\kappa_t}\big)_{t\in(0,1]}$ has bounded $L_p$-norms for any $p\ge 1$. In addition, the family $\{t^{-1/\alpha}\xi_t\}_{t\in(0,1]}$ also has bounded $L_p$-norms for any $p\ge 1$. To see this, observe that $\xi_t$ is an integral of a deterministic function over a compensated Poisson point measure, and therefore its exponential moments can be expressed explicitly:
\[ E\exp(c\xi_t)=\exp\Big[t\int_{|u|\le t^{1/\alpha}}\big(e^{cu}-1-cu\big)\mu(du)\Big]. \]
Taking $c=\pm t^{-1/\alpha}$ and using H1, we get
\[ E\exp\big(\pm t^{-1/\alpha}\xi_t\big)\le\exp\Big[C_1t\int_{|u|\le t^{1/\alpha}}\big(t^{-1/\alpha}u\big)^2\mu(du)\Big]\le C_2, \]
which yields the required $L_p$-bounds. Applying the Cauchy inequality, we finally get that the family $\{|\xi_t|/\sqrt{\kappa_t}\}_{t\in(0,1]}$ has bounded $L_p$-norms.

For the second summand in the right hand side of (4.9), we write the Cauchy inequality:
\[ \frac{|\zeta_t|}{\sqrt{\eta_t}}\le\sqrt{N_t},\qquad N_t=\nu\big([0,t]\times\{|u|>t^{1/\alpha}\}\big). \]
Observe that $N_t$ has a Poisson law with the intensity $t\mu(|u|>t^{1/\alpha})$, $t\in(0,1]$, which is bounded because of H1.
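The Cauchy inequality used in this step can be spelled out on the jump configuration. The following is only a sketch: the enumeration $u_1,\dots,u_{N_t}$ of the jumps of $Z$ on $[0,t]$ with $|u_i|>t^{1/\alpha}$ is notation introduced for this illustration, and the tail estimate $\mu(|u|>r)\le Cr^{-\alpha}$, $r\in(0,1]$, is assumed here to be what H1 provides.

```latex
\[
\begin{aligned}
\zeta_t &= \sum_{i=1}^{N_t} u_i, \qquad \eta_t = \sum_{i=1}^{N_t} u_i^2,\\
|\zeta_t| &\le \Big(\sum_{i=1}^{N_t} 1\Big)^{1/2}\Big(\sum_{i=1}^{N_t} u_i^2\Big)^{1/2}
           = \sqrt{N_t}\,\sqrt{\eta_t},\\
t\,\mu\big(|u|>t^{1/\alpha}\big) &\le t\cdot C\,\big(t^{1/\alpha}\big)^{-\alpha} = C,
\qquad t\in(0,1].
\end{aligned}
\]
```

On the event $\{N_t=0\}$ both $\zeta_t$ and $\eta_t$ vanish and the ratio is read as zero. A Poisson variable whose intensity is bounded uniformly in $t$ has all moments bounded uniformly in $t$, which is the property of $\sqrt{N_t}$ needed to conclude.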
Hence the family $\{|\zeta_t|/\sqrt{\eta_t}\}_{t\in(0,1]}$ also has bounded $L_p$-norms, which completes the proof of (4.8). □

Lemma 4.3. Let $\delta$ be the same as in H2. Then
\[ \sup_{\theta\in\tilde\Theta,\ t\in(0,1]}E\left|\frac{\delta_t(1)}{\sqrt{\mathrm{D}_tX^\theta_t}}\right|^{2+\delta}<\infty. \tag{4.10} \]

Proof. The proof is similar to the previous one, but some additional technicalities arise because now we have $\chi(u)$ instead of $u$ under the integrals in the numerator. We have
\[ \delta_t(1)=\int_0^t\int_{|u|\le t^{1/\alpha}}\chi(u)\,\tilde\nu(ds,du)+\int_0^t\int_{|u|>t^{1/\alpha}}\chi(u)\,\nu(ds,du)+t^{1+2/\alpha}\big[m(t^{1/\alpha})-m(-t^{1/\alpha})\big]=:\hat\xi_t+\hat\zeta_t+\varpi_t. \]
Denote $t_0=(u_0)^\alpha$ with $u_0$ coming from H2; then the ratio
\[ \upsilon(u):=\frac{\chi(u)}{u}=-\frac{u\,m'(u)}{m(u)}-2 \]
is bounded on $\{|u|\le u_0\}$, and for $t\le t_0$ we have $\{|u|\le t^{1/\alpha}\}\subset\{|u|\le u_0\}$. Then the same argument as we have used before shows that the family $\big(|\hat\xi_t|/\sqrt{\kappa_t}\big)_{t\in(0,t_0]}$ has bounded $L_p$-norms for every $p\ge 1$. It follows from H1 that
\[ \varpi_t\sim(C_+-C_-)\,t^{1/\alpha},\qquad t\to 0, \]
and thus $|\varpi_t|\le Ct^{1/\alpha}$, $t\in(0,1]$. Then by Lemma 4.1 the family $\{\varpi_t/\sqrt{\kappa_t}\}_{t\in(0,t_0]}$ has bounded $L_p$-norms for every $p\ge 1$. Finally, for $t\le t_0$, by the Cauchy inequality we have
\[ |\hat\zeta_t|\le\sqrt{\eta_t}\,\sqrt{J_t}, \]
where
\[ J_t=\int_0^t\int_{|u|>t^{1/\alpha}}\upsilon^2(u)\,\nu(ds,du)=\int_0^t\int_{t^{1/\alpha}<|u|\le u_0}\upsilon^2(u)\,\nu(ds,du)+\int_0^t\int_{|u|>u_0}\upsilon^2(u)\,\nu(ds,du)=:J^1_t+J^2_t. \]
Because $\upsilon$ is bounded on $\{|u|\le u_0\}$, we have $J^1_t\le CN_t$, and hence the variables $J^1_t$, $t\le t_0$, have bounded $L_p$-norms for every $p\ge 1$. The random variable $J^2_t$ has a compound Poisson distribution with the intensity of the Poisson random variable equal to $t\mu(|u|>u_0)$, and the law of a single jump equal to the image under $\upsilon^2$ of the measure $\mu$ conditioned to $\{|u|>u_0\}$. By condition H2, $\upsilon$ has a finite moment of the order $2+\delta$ with respect to this conditioned measure, therefore the variables $J_t$, $t\le t_0$, have bounded $L_{1+\delta/2}$-norms. Summarizing all the above, we get
\[ \sup_{\theta\in\tilde\Theta,\ t\in(0,t_0]}E\left|\frac{\delta_t(1)}{\sqrt{\mathrm{D}_tX^\theta_t}}\right|^{2+\delta}<\infty. \]
The same bound for $t\in[t_0,1]$ can be proved in a similar and simpler way; in that case, instead of taking the integrals with respect to $\{|u|\le t^{1/\alpha}\}$, $\{|u|>t^{1/\alpha}\}$, one should consider, both in the numerator and the denominator, the integrals with respect to $\{|u|\le u_0\}$, $\{|u|>u_0\}$. □

Now we deduce (4.2) by simply applying the Hölder inequality to (4.5) with the estimates (4.6)–(4.8) and (4.10).
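The Hölder step can be made explicit as follows. This is a sketch under the assumption that the exponent in (4.10) is some $s>2$ and the exponent in (4.2) is some $r\in(2,s)$; the symbols $r$, $s$, $q$, $A$, $B$ are introduced only for this illustration.

```latex
\[
\begin{aligned}
\frac{t^{1/\alpha}\delta_t(1)}{\mathrm{D}_tX^\theta_t}
  &= \frac{t^{1/\alpha}}{\sqrt{\mathrm{D}_tX^\theta_t}}\cdot
     \frac{\delta_t(1)}{\sqrt{\mathrm{D}_tX^\theta_t}},
&\qquad
\frac{t^{1/\alpha}\,\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^2}
  &= \frac{t^{1/\alpha}}{\sqrt{\mathrm{D}_tX^\theta_t}}\cdot
     \frac{\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^{3/2}},\\
\frac{\tilde Z_t\,\delta_t(1)}{\mathrm{D}_tX^\theta_t}
  &= \frac{\tilde Z_t}{\sqrt{\mathrm{D}_tX^\theta_t}}\cdot
     \frac{\delta_t(1)}{\sqrt{\mathrm{D}_tX^\theta_t}},
&\qquad
\frac{\tilde Z_t\,\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^2}
  &= \frac{\tilde Z_t}{\sqrt{\mathrm{D}_tX^\theta_t}}\cdot
     \frac{\mathrm{D}^2_tX^\theta_t}{(\mathrm{D}_tX^\theta_t)^{3/2}},
\end{aligned}
\]
\[
\|AB\|_{L_r}\le\|A\|_{L_q}\,\|B\|_{L_s},
\qquad \frac1r=\frac1q+\frac1s,
\qquad q=\frac{rs}{s-r}.
\]
```

In each product the first factor has bounded $L_q$-norms for every finite $q$ by (4.6) or (4.8), while the second factor is either bounded by (4.7) or has bounded $L_s$-norms by (4.10). This is why the exponent in (4.2) has to be taken strictly smaller than the one in (4.10): $q$ is finite only when $r<s$.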
Remark 4.1. Now we can explain the main idea on which the choice of the intensity function $\rho(u)=u^2$ is based. When $\rho$ is compactly supported, as in [17], the “large jumps” are excluded from the formula for $\mathrm{D}_{t,\rho}X^\theta_t$. On the other hand, “large jumps” are involved e.g. in $\tilde Z_t$, which will appear in the numerator of one term in (4.12). We have $\rho(u)=0$, $|u|>u_*$ for some $u_*>0$, and hence the integrals
\[ \int_0^t\int_{|u|>u_*}u\,\nu(ds,du),\qquad \int_0^t\int_{\mathbb{R}}\rho(u)\,\nu(ds,du) \]
are independent. In addition, we know that
\[ E\left(\int_0^t\int_{|u|>u_*}u\,\nu(ds,du)\right)^{2}=t\int_{|u|>u_*}u^2\mu(du)+\left(t\int_{|u|>u_*}u\,\mu(du)\right)^{2}, \]
\[ t^{-2/\alpha}\int_0^t\int_{\mathbb{R}}\rho(u)\,\nu(ds,du)\Rightarrow\zeta,\qquad t\to 0, \]
where $\zeta$ is a positive $(\alpha/2)$-stable variable. Therefore
\[ E\left(\frac{\int_0^t\int_{|u|>u_*}u\,\nu(ds,du)}{\sqrt{\mathrm{D}_{t,\rho}X^\theta_t}}\right)^{2}\ge Ct^{-2/\alpha+1}, \]
which is unbounded for small $t$ because $\alpha<2$. This indicates that, in the present high-frequency sampling setting, one can hardly expect to get a uniform moment bound of the type (4.2) for a Malliavin weight which corresponds to a compactly supported $\rho$. Nevertheless, in the modified construction we “extend the support” of $\rho$; this brings a “large jumps” part to the denominator, which provides a good balance to the respective parts which appear in the numerator, and this is the reason why the modified weight satisfies the required uniform moment bounds.

Remark 4.2. Another natural possibility to design the Malliavin weight is to take into account the local scale of the process $Z$ and to make the function $\rho$ depend on $t$ in the following way:
\[ \rho(u)=\rho_t(u)=u^2\varsigma(t^{-1/\alpha}u) \]
with a smooth function $\varsigma$ such that $\varsigma(u)=1$, $|u|\le 1$, and $\varsigma(u)=0$, $|u|\ge 2$. Actually, the “scaled” choice of $\rho=\rho_t$ with the size of its support $\asymp t^{1/\alpha}$ is essentially the one used in the Malliavin calculus construction developed in [7]. It is easy to see that, under such a choice, an analogue of (4.10) would hold true; the reason is that now $\delta_t$ would contain only a “small jumps part”, which is well balanced with $\sqrt{\mathrm{D}_tX^\theta_t}$ in completely the same way we have seen in the proof of Lemma 4.2. This would give the uniform moment bound for the first component of the respective Malliavin weight, and hence the “scaled” choice of $\rho=\rho_t$ is appropriate when the unknown parameter is involved in the drift term only. However, for the model with the parameter involved in the jump term this choice does not seem to be appropriate, for the reason stated in Remark 4.1: the “large jump part” of $\tilde Z_t$ is not well balanced by the “small jump part” $\sqrt{\kappa_t}$ of $\sqrt{\mathrm{D}_tX^\theta_t}$.

In regard to this point, for the sake of reference let us discuss [7] in a little more detail. Therein the authors proved the drift-parameter LAMN property when observing a sample $(X_{j/n})_{j=0}^{n}$ from the process $X$ of the form
\[ X_t=x+\int_0^t b(X_s,\beta)\,ds+Z_t. \]
Their technical assumptions are: (i) the boundedness of the Lévy measure of $Z$, which is assumed to be locally $\alpha$-stable in a neighborhood of the origin; (ii) the smoothness and boundedness of $(x,\beta)\mapsto b(x,\beta)$; and that (iii) the stable-like index $\alpha\in(1,2)$ [...] $n^{-1/\alpha}(X_{j/n}-X_{(j-1)/n})$, which is to be close to $L_{j/n}-L_{(j-1)/n}$. In particular, the boundedness (ii) seems essential in their proof, as well as the fact that only the drift parameter is subject to statistical estimation. We emphasize that, because of the absence of the jump-related parameter $\gamma$, the result of [7] is comparable neither with our current results nor even with those from the aforementioned paper [1]. We expect that our proof technique, based on the modified Malliavin weight combined with the approximation of the intensity function $\rho$, will be workable for the general case of state-dependent coefficients, including a jump coefficient which contains a parameter $\gamma$; this is a subject of further research.

4.3. Proof of (4.3): Integral representation.
The proof consists of the three steps outlined at the end of Section 4.1. The first step is based on a version of the Malliavin calculus on a space of trajectories of a Lévy process, which we outline below and which is essentially developed in [18]. Note that the Malliavin calculus for Lévy noises is a classical and well developed tool, which dates back to [4] and [5]. However, we found it difficult to apply the existing technique directly for our purposes: the reason is that, unlike in the classical approach developed in [4] and [5], we are interested not in the distribution density $p_t(\theta;x,y)$ itself (which is typically treated by means of the inverse Fourier transform), but in the ratio $\partial_\theta p_t(\theta;x,y)/p_t(\theta;x,y)$. To get the $L_p$ bounds for this ratio, and especially to carry out the approximating procedures outlined at the end of Section 4.1, we need an integral representation for this ratio in the simplest possible form. Mainly for that purpose, and also to make the exposition self-contained, we introduce a specially designed simple version of the Malliavin calculus. This version is of course neither a substantial novelty nor the unique possible one; see e.g. the construction in [7] aimed at similar purposes.

Let the Lévy measure $\mu$ of $Z$ satisfy H1, H2 and assume additionally that $Z$ is “light tailed” in the sense that (4.4) holds true for some $\delta'>0$. Fix some smooth function $\rho$ such that $\rho(u)=u^2$ in a neighborhood of the point $u=0$. Consider a flow $Q_c$, $c\in\mathbb{R}$, of transformations of $\mathbb{R}$, which satisfies
\[ \frac{d}{dc}Q_c(u)=\rho(Q_c(u)),\qquad Q_0(u)=u, \]
and for a fixed $t>0$ define the family $Q^t_c$, $c\in\mathbb{R}$, of transformations of the process $Z_s$, $s\ge 0$, by the following convention: the process $Q^t_cZ$ has jumps at the same time instants as the initial process $Z$; if the process $Z$ has a jump with the amplitude $u$ at the time moment $s$, then the respective jump of $Q^t_cZ$ has the amplitude equal to either $Q_c(u)$ or $u$ if $s\le t$ or $s>t$, respectively. It is proved in [18], Proposition 1, that under conditions H1, H2 the law of $Q^t_cZ$ in $\mathbb{D}(0,\infty)$ is absolutely continuous with respect to the law of $Z$. Hence every transformation $Q^t_c$ can be naturally extended to a transformation of the space $L_0(\Omega,\sigma(Z),P)$ of the functionals of the process $Z$. Denote this transformation by the same symbol $Q^t_c$, and call a random variable $\xi\in L_2(\Omega,\sigma(Z),P)$ stochastically differentiable if there exists the mean square limit
\[ \hat{\mathrm{D}}\xi=\lim_{\varepsilon\to 0}\frac{Q^t_\varepsilon\xi-\xi}{\varepsilon}. \]
The $L_2(\Omega,\sigma(Z),P)$-closure of the operator $\hat{\mathrm{D}}$ is called the stochastic derivative and is denoted by $\mathrm{D}$. The adjoint operator $\delta=\mathrm{D}^*$ is called the divergence operator or the extended stochastic integral. The operators $\mathrm{D}$, $\delta$ are well defined under conditions H1, H2 for every $t>0$ and $\rho$ specified above; see [18], Remark 3.

Clearly, the above construction depends on the choice of $t$ and $\rho$: to track this dependence we use the notation $\mathrm{D}_{t,\rho}$, $\delta_{t,\rho}$ instead of $\mathrm{D}$, $\delta$. In a slightly larger generality, literally the same construction can be made on the space $L_2(\Omega,\sigma(Z,U),P)$ of the functionals of the pair of processes $Z$ and $U$, with the trajectories of $U$ not being perturbed by $Q^t_c$.
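To make the flow $Q_c$ concrete, here is a formal computation. It is a sketch only: it is valid for small $|c|$ and for jump sizes $u$ that stay in the neighborhood of the origin where $\rho(v)=v^2$, and the test function $\varphi$ and the finite-sum functional $F$ are hypothetical objects introduced for the illustration.

```latex
\[
\frac{d}{dc}Q_c(u)=Q_c(u)^2,\quad Q_0(u)=u
\quad\Longrightarrow\quad
Q_c(u)=\frac{u}{1-cu}\qquad (cu<1),
\]
\[
F=\sum_{s\le t}\varphi(\Delta Z_s)
\quad\Longrightarrow\quad
\frac{d}{dc}\Big(Q^t_cF\Big)\Big|_{c=0}
=\sum_{s\le t}\varphi'(\Delta Z_s)\,\rho(\Delta Z_s)
=\int_0^t\!\!\int_{\mathbb{R}}\varphi'(u)\rho(u)\,\nu(ds,du).
\]
```

With $\varphi(u)=u$ the right hand side becomes $\int_0^t\int_{\mathbb{R}}\rho(u)\,\nu(ds,du)$, which matches the form of the stochastic derivative of $Z_t$ used in this section. The rigorous statement, including the closure of the operator, is the one referred to in [18].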
Then, analogously to the calculations made in [18], Sections 3.1, 3.2, we have
\[ \mathrm{D}_{t,\rho}Z_t=\int_0^t\int_{\mathbb{R}}\rho(u)\,\nu(ds,du),\qquad \mathrm{D}_{t,\rho}U_t=0, \]
\[ \delta_{t,\rho}(1)=\int_0^t\int_{\mathbb{R}}\chi_\rho(u)\,\tilde\nu(ds,du),\qquad \chi_\rho(u)=-\rho(u)\frac{m'(u)}{m(u)}-\rho'(u); \]
recall that $\nu$ and $\tilde\nu$ are, respectively, the Poisson point measure and the compensated Poisson measure from the Lévy-Itô representation of $Z$. Respectively,
\[ \mathrm{D}_{t,\rho}X^\theta_t=\gamma\,\mathrm{D}_{t,\rho}Z_t=\gamma\int_0^t\int_{\mathbb{R}}\rho(u)\,\nu(ds,du). \]
Furthermore, the second order stochastic derivative of $X^\theta_t$ is well defined:
\[ \mathrm{D}^2_{t,\rho}X^\theta_t=\gamma\int_0^t\int_{\mathbb{R}}\rho(u)\rho'(u)\,\nu(ds,du). \]
By Proposition 3.1 and formula (3.4), the variable $X^\theta_t$ has a distribution density $p_t(\theta;x)$ which is continuously differentiable with respect to $\theta,x$. On the other hand, the Malliavin calculus developed above allows one to derive an integral representation for the ratio
\[ g_t(\theta;x,y)=\frac{\nabla_\theta p_t(\theta;x,y)}{p_t(\theta;x,y)}. \]
Namely, repeating literally the proof of assertion III of Theorem 1 in [18], we obtain the following representation:
\[ g_t(\theta;x)=\begin{cases}E^{t,\theta}_{0,x}\,\Xi^\theta_{t,\rho}, & p_t(\theta;x)>0,\\ 0, & \text{otherwise},\end{cases} \tag{4.11} \]
where
\[ \Xi^\theta_{t,\rho}:=\delta_{t,\rho}\left(\frac{\nabla_\theta X^\theta_t}{\mathrm{D}_{t,\rho}X^\theta_t}\right)=\frac{(\delta_{t,\rho}(1))(\nabla_\theta X^\theta_t)}{\mathrm{D}_{t,\rho}X^\theta_t}+\frac{(\mathrm{D}^2_{t,\rho}X^\theta_t)(\nabla_\theta X^\theta_t)}{(\mathrm{D}_{t,\rho}X^\theta_t)^2}-\frac{\mathrm{D}_{t,\rho}(\nabla_\theta X^\theta_t)}{\mathrm{D}_{t,\rho}X^\theta_t}. \tag{4.12} \]
Note that, formally, we cannot apply Theorem 1 of [18] directly, because now we have an additional process $U$ on which our target process $X^\theta$ depends. Nevertheless, because $U$ is not perturbed under the transformations $Q^t_c$ which give rise to the Malliavin calculus construction, it is easy to check that literally the same argument as the one used in the proof of Theorem 1 of [18] can be applied in the current (slightly extended) setting.

Recall that we already know that $p_t(\theta;x,y)$ exists and is smooth with respect to $\theta,x,y$.
We have
\[ \begin{aligned}\int_{\mathbb{R}}f(y)\nabla_\theta p_t(\theta;x,y)\,dy&=\nabla_\theta E^\theta_xf(X^\theta_t)=E^\theta_xf'(X^\theta_t)(\nabla_\theta X^\theta_t)=E^\theta_x\,\mathrm{D}_{t,\rho}f(X^\theta_t)\left(\frac{\nabla_\theta X^\theta_t}{\mathrm{D}_{t,\rho}X^\theta_t}\right)\\&=E^\theta_xf(X^\theta_t)\,\Xi^\theta_{t,\rho}=E^\theta_xf(X^\theta_t)\,g_t(\theta;x,X^\theta_t)=\int_{\mathbb{R}}f(y)g_t(\theta;x,y)p_t(\theta;x,y)\,dy.\end{aligned} \]
Hence the formula (4.11) is actually equivalent to the following: for every compactly supported $f\in C^1(\mathbb{R})$,
\[ \nabla_\theta Ef(X^\theta_t)=Ef(X^\theta_t)\,\Xi^\theta_{t,\rho}, \tag{4.13} \]
and to prove (4.3) it is sufficient to prove (4.13) with the modified Malliavin weight $\Xi^\theta_t$ instead of $\Xi^\theta_{t,\rho}$. To do that, we exploit an approximation procedure, hence we rewrite (4.13) in an integral form, which is convenient for approximation purposes:
\[ Ef\big(X^{\theta+v}_t\big)-Ef\big(X^\theta_t\big)=\int_0^1E\Big\{f\big(X^{\theta+sv}_t\big)\big(\Xi^{\theta+sv}_{t,\rho},v\big)\Big\}\,ds. \tag{4.14} \]
Because $\partial_\beta X^\theta_t=t$, $\partial_\gamma X^\theta_t=Z_t$, we have
\[ \mathrm{D}_{t,\rho}(\partial_\beta X^\theta_t)=0,\qquad \mathrm{D}_{t,\rho}(\partial_\gamma X^\theta_t)=\mathrm{D}_{t,\rho}Z_t=\frac{1}{\gamma}\mathrm{D}_{t,\rho}X^\theta_t, \]
therefore
\[ \Xi^\theta_{t,\rho}=\left(\frac{t\,\delta_{t,\rho}(1)}{\mathrm{D}_{t,\rho}X^\theta_t}+\frac{t\,\mathrm{D}^2_{t,\rho}X^\theta_t}{(\mathrm{D}_{t,\rho}X^\theta_t)^2},\ \ \frac{Z_t\,\delta_{t,\rho}(1)}{\mathrm{D}_{t,\rho}X^\theta_t}+\frac{Z_t\,\mathrm{D}^2_{t,\rho}X^\theta_t}{(\mathrm{D}_{t,\rho}X^\theta_t)^2}-\frac{1}{\gamma}\right)^{\!\top}. \]
Now we proceed with the first approximation step as follows. Fix some smooth function $\rho$ on $\mathbb{R}$, $\rho(u)\ge 0$, such that
\[ \rho(u)=\begin{cases}u^2, & |u|\le 1,\\ 0, & |u|\ge 2,\end{cases} \]
and define $\rho_N(u)=N^2\rho(u/N)$. Observe that, for $t$ fixed and $N$ large enough, $\rho_N(u)=u^2$ for $|u|\le t^{1/\alpha}$, hence
\[ \mathrm{D}_{t,\rho_N}X^\theta_t\ge\gamma\kappa_t. \]
Next, there exists a constant $C$ such that $\rho_N(u)\le Cu^2$, $|\rho_N'(u)|\le C|u|$, and therefore
\[ \left|\frac{\chi_{\rho_N}(u)}{u}\right|\le C(\tau(u)+1). \]
Then, repeating literally the calculations from Section 4.2, we can obtain a bound similar to (4.2) for $t$ fixed, but for a family of weights $\Xi^\theta_{t,\rho_N}$, $N\ge 1$:
\[ \sup_{N\ge 1,\ \theta\in\tilde\Theta}E\big|\Xi^\theta_{t,\rho_N}\big|^{2+\delta_1}<\infty,\qquad \delta_1<\delta\wedge\delta' \]
(here $\delta$ comes from H2, and $\delta'$ comes from (4.4)). Hence the family $\{\Xi^\theta_{t,\rho_N},\ N\ge 1,\ \theta\in\tilde\Theta\}$ is uniformly integrable.
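The two bounds on $\rho_N$ follow from a scaling argument, sketched here under the assumption that $\rho_N(u)=N^2\rho(u/N)$ with $\rho$ at least continuously differentiable, equal to $v^2$ for $|v|\le 1$ and vanishing for $|v|\ge 2$. The constants $C_0$, $C_1$ are labels introduced for this illustration.

```latex
\[
C_0:=\sup_{v\ne 0}\frac{\rho(v)}{v^2}<\infty,\qquad
C_1:=\sup_{v\ne 0}\frac{|\rho'(v)|}{|v|}<\infty,
\]
\[
\rho_N(u)=N^2\rho\Big(\frac uN\Big)\le N^2C_0\Big(\frac uN\Big)^2=C_0u^2,
\qquad
|\rho_N'(u)|=N\Big|\rho'\Big(\frac uN\Big)\Big|\le NC_1\frac{|u|}{N}=C_1|u|,
\]
\[
\left|\frac{\chi_{\rho_N}(u)}{u}\right|
=\left|\frac{\rho_N(u)}{u^2}\cdot\frac{u\,m'(u)}{m(u)}+\frac{\rho_N'(u)}{u}\right|
\le C_0\left|\frac{u\,m'(u)}{m(u)}\right|+C_1 .
\]
```

Both suprema are finite because $\rho(v)/v^2=1$ and $\rho'(v)/v=2$ for $|v|\le 1$, while $\rho$ and $\rho'$ are bounded and vanish for $|v|\ge 2$. The key point is that the constants do not depend on $N$, so the moment estimates of Section 4.2 carry over uniformly in $N$. Also $\rho_N(u)=u^2$ for all $|u|\le N$, so $\rho_N\to u^2$ pointwise.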
It is straightforward to see that
\[ \Xi^{\theta_N}_{t,\rho_N}\to\Xi^\theta_t,\qquad N\to\infty, \]
with probability 1 for any sequence $\theta_N\to\theta\in\Theta$. Combined with the above uniform integrability, this shows that
\[ \Xi^\theta_{t,\rho_N}\to\Xi^\theta_t,\qquad N\to\infty, \]
in $L_1(\Omega,P)$ uniformly with respect to $\theta\in\tilde\Theta$. Hence we can pass to the limit in (4.14) as $N\to\infty$ and get the required identity (4.13) with the modified Malliavin weight $\Xi^\theta_t$. This proves the representation (4.3) under the additional moment assumption (4.4).

The second approximation step is aimed at removing the assumption (4.4), and is similar to the above one. Consider a family of processes $Z^L$, $L\ge 1$, with the Lévy measures
\[ \mu_L(du)=m_L(u)\,du,\qquad m_L(u)=m(u)e^{-u^2/L}; \]
the $Z^L$-driven versions of the processes $X^\theta_t$ and $\Xi^\theta_t$ are also specified by the superscript $L$: $X^{\theta,L}$, $\Xi^{\theta,L}$. Because $|u|^{2+\delta'}e^{-u^2/L}\le C$, every $\mu_L$ satisfies (4.4). In addition, it is an easy calculation to show that conditions H1, H2 are satisfied for $\mu_L$ uniformly with respect to $L\ge 1$. Hence we have the following:
(a) for every $L\ge 1$, (4.14) holds true with $X^\theta_t$ and $\Xi^\theta_t$ replaced by $X^{\theta,L}_t$ and $\Xi^{\theta,L}_t$, respectively;
(b) for every $t\in(0,1]$, the family $\{\Xi^{\theta,L}_t,\ \theta\in\tilde\Theta,\ L\ge 1\}$ satisfies a moment bound of the type (4.2), uniform in $L\ge 1$;
(c) the pairs $(X^{\theta,L}_t,\Xi^{\theta,L}_t)$ weakly converge to $(X^\theta_t,\Xi^\theta_t)$ as $L\to\infty$.
Since the family $\{\Xi^{\theta,L}_t\}$ is uniformly integrable by the above property (b), we can pass to the limit in the relation (4.14) for $X^{\theta,L}_t$, $\Xi^{\theta,L}_t$, and finally get (4.14) for $X^\theta_t$, $\Xi^\theta_t$. This proves (4.3) and completes the proof of Theorem 4.1. □

Appendix A. Proof of Proposition 3.1

(1) Because $U$ is negligible (see (2.8)), we can restrict our considerations to the variables
\[ \zeta_{\alpha,t}=t^{-1/\alpha}(Z_t+c_t). \]
Their characteristic functions have the form $Ee^{i\lambda\zeta_{\alpha,t}}=e^{\psi_{\alpha,t}(\lambda)}$, where
\[ \begin{aligned}\psi_{\alpha,t}(\lambda)&=t\int_{\mathbb{R}}\Big(e^{i\lambda t^{-1/\alpha}u}-1-i\lambda t^{-1/\alpha}u\,1_{|u|\le 1}\Big)\mu(du)+i\lambda c_tt^{-1/\alpha}\\&=t\int_{\mathbb{R}}\Big(e^{i\lambda t^{-1/\alpha}u}-1-i\lambda t^{-1/\alpha}u\,1_{|u|\le t^{1/\alpha}}\Big)\mu(du);\end{aligned} \]
in the last identity we have used the formula (2.7) for $c_t$. Changing the variable $v=ut^{-1/\alpha}$, we get
\[ \psi_{\alpha,t}(\lambda)=\int_{\mathbb{R}}\Big(e^{i\lambda v}-1-i\lambda v\,1_{|v|\le 1}\Big)\mu_t(dv), \]
where $\mu_t(dv)$ has the density
\[ m_t(v)=t^{1+1/\alpha}m(t^{1/\alpha}v). \]
By H1, for every $\varepsilon>0$ there exists $u_\varepsilon>0$ such that
\[ (1-\varepsilon)\,m_{\alpha,C_\pm}(v)\le m_t(v)\le(1+\varepsilon)\,m_{\alpha,C_\pm}(v),\qquad |v|\le t^{-1/\alpha}u_\varepsilon. \]
On the other hand, the term $\big(e^{i\lambda v}-1-i\lambda v\,1_{|v|\le 1}\big)$ is bounded, and
\[ \mu_t\big(\{v:|v|>t^{-1/\alpha}u_\varepsilon\}\big)=t\mu\big(\{u:|u|>u_\varepsilon\}\big)\to 0,\qquad t\to 0. \]
Using that, one can easily derive
\[ \psi_{\alpha,t}(\lambda)\to\psi_{\alpha,C_\pm}(\lambda):=\int_{\mathbb{R}}\Big(e^{i\lambda v}-1-i\lambda v\,1_{|v|\le 1}\Big)m_{\alpha,C_\pm}(v)\,dv,\qquad t\to 0. \]
Because the characteristic function of $Z^{\alpha,C_\pm}$ equals $e^{\psi_{\alpha,C_\pm}(\lambda)}$, this completes the proof.

(2) Consider first the case $U\equiv 0$; now $\phi_{\alpha,t}$ does not depend on $\theta$, and we omit $\theta$ in the notation. We would like to apply the inverse Fourier transform representation for $\phi_{\alpha,t}$ and its derivatives:
\[ \phi_{\alpha,t}(x)=\frac{1}{2\pi}\int_{\mathbb{R}}e^{-i\lambda x+\psi_{\alpha,t}(\lambda)}\,d\lambda,\qquad (\partial_x)^k\phi_{\alpha,t}(x)=\frac{1}{2\pi}\int_{\mathbb{R}}(-i\lambda)^ke^{-i\lambda x+\psi_{\alpha,t}(\lambda)}\,d\lambda. \]
To do that, we have to verify that the functions under the integrals are absolutely integrable. We have
\[ \big|e^{-i\lambda x+\psi_{\alpha,t}(\lambda)}\big|\le e^{\mathrm{Re}\,\psi_{\alpha,t}(\lambda)}, \]
\[ \mathrm{Re}\,\psi_{\alpha,t}(\lambda)=t\int_{\mathbb{R}}\big(\cos(t^{-1/\alpha}\lambda u)-1\big)\mu(du)\le t\int_{t^{-1/\alpha}|\lambda u|<1}\big(\cos(t^{-1/\alpha}\lambda u)-1\big)\mu(du). \]
Then by H1 there exist $c_1,c_2>0$ such that
\[ \big|e^{-ix\lambda+\psi_{\alpha,t}(\lambda)}\big|\le c_1e^{-c_2|\lambda|^\alpha},\qquad x,\lambda\in\mathbb{R},\ t\in(0,1]. \]
This proves the existence of $\phi_{\alpha,t}(x)$ and all its derivatives. Moreover, we have
\[ \sup_{x\in\mathbb{R},\ t\in(0,1]}|\phi'_{\alpha,t}(x)|<\infty,\qquad \sup_{x\in\mathbb{R},\ t\in(0,1]}|\phi''_{\alpha,t}(x)|<\infty. \]
Now we come back to the case of non-zero $U$. Because the law of $\zeta^\theta_{\alpha,t}$ is a convolution of the laws of $\zeta_{\alpha,t}$ and $\gamma^{-1}t^{-1/\alpha}U_t$, the above bound can be extended:
\[ \sup_{\theta\in\tilde\Theta}\ \sup_{x\in\mathbb{R},\ t\in(0,1]}|\phi'_{\alpha,t}(\theta;x)|<\infty,\qquad \sup_{\theta\in\tilde\Theta}\ \sup_{x\in\mathbb{R},\ t\in(0,1]}|\phi''_{\alpha,t}(\theta;x)|<\infty. \tag{A.1} \]
In addition, $\zeta^\theta_{\alpha,t}\Rightarrow Z^{\alpha,C_\pm}$ uniformly in $\theta\in\tilde\Theta$, $U\in\mathcal{U}$: this follows from statement (1) and the fact that $\gamma^{-1}t^{-1/\alpha}U_t$ is uniformly negligible. This convergence and the first (resp. second) bound in (A.1) provide the first (resp. second) convergence in (3.1). □

Acknowledgement. The authors are grateful to the anonymous referees for their valuable comments, which led to substantial improvement of the earlier version.
References
[1] Aït-Sahalia, Y. and Jacod, J. (2007), Volatility estimators for discretely sampled Lévy processes. Ann. Statist. 35, no. 1, 355–392.
[2] Aït-Sahalia, Y. and Jacod, J. (2008), Fisher’s information for discretely sampled Lévy processes. Econometrica 76, 727–761.
[3] Basawa, I. V. and Scott, D. J. (1983), Asymptotic optimal inference for nonergodic models. Lecture Notes in Statistics, 17. Springer-Verlag, New York-Berlin.
[4] Bichteler, K., Gravereaux, J. B. and Jacod, J. (1987), Malliavin calculus for processes with jumps. Gordon and Breach Science Publishers, N.Y., London, Paris, Tokyo.
[5] Bismut, J. M. (1981), Calcul des variations stochastique et processus de sauts. Z. Wahrsch. theor. verw. Geb. 56(4), 469–505.
[6] Chaumont, L. and Uribe Bravo, G. (2011), Markovian bridges: Weak continuity and pathwise constructions. Ann. Probab. 39(2), 609–647.
[7] Clément, E. and Gloter, A. (2015), Local Asymptotic Mixed Normality property for discretely observed stochastic differential equations driven by stable Lévy processes. Stochastic Proc. Appl. 125, 2316–2352.
[8] Corcuera, J. M. and Kohatsu-Higa, A. (2011), Statistical inference and Malliavin calculus. Seminar on Stochastic Analysis, Random Fields and Applications VI, Progress in Probability 63, Springer Basel, 59–82.
[9] Dohnal, G. (1987), On estimating the diffusion coefficient. J. Appl. Probab. 24, 105–114.
[10] Fahrmeir, L. (1988), A note on asymptotic testing theory for nonhomogeneous observations. Stochastic Process. Appl. 28, 267–273.
[11] Genon-Catalot, V. and Jacod, J. (1993), On the estimation of the diffusion coefficient for multi-dimensional diffusion processes. Ann. Inst. H. Poincaré Probab. Statist. 29, 119–151.
[12] Genon-Catalot, V. and Jacod, J. (1994), Estimation of the diffusion coefficient for diffusion processes: random sampling. Scand. J. Statist. 21, 193–221.
[13] Gobet, E. (2001), Local asymptotic mixed normality property for elliptic diffusion: a Malliavin calculus approach. Bernoulli 7, 899–912.
[14] Gobet, E. (2001), LAN property for ergodic diffusions with discrete observations. Ann. Inst. H. Poincaré Probab. Statist. 38, 711–737.
[15] Ibragimov, I. A. and Hasminskii, R. Z. (1981), Statistical estimation: asymptotic theory. Springer-Verlag, New York.
[16] Kessler, M. (1997), Estimation of an ergodic diffusion from discrete observations. Scand. J. Statist. 24, 211–229.
[17] Ivanenko, D. O. and Kulik, A. M. (2014), LAN property for families of distributions of solutions to Lévy driven SDEs. Modern Stochastics: Theory and Applications 1, 33–47.
[18] Ivanenko, D. O. and Kulik, A. M. (2015), Malliavin calculus approach to statistical inference for Lévy driven SDEs. Methodology and Computing in Applied Probability 17, 107–123.
[19] Jeganathan, P. (1982), On the asymptotic theory of estimation when the limit of the log-likelihood ratios is mixed normal. Sankhyā Ser. A 44, 173–212.
[20] Le Cam, L. (1960), Locally asymptotically normal families of distributions. Certain approximations to families of distributions and their use in the theory of estimation and testing hypotheses. Univ. California Publ. Statist. 3, 37–98.
[21] Le Cam, L. and Yang, G. L. (2000), Asymptotics in statistics. Some basic concepts. Second edition. Springer Series in Statistics. Springer-Verlag, New York.
[22] Mai, H. (2014), Efficient maximum likelihood estimation for Lévy-driven Ornstein-Uhlenbeck processes. Bernoulli 20, 919–957.
[23] Masuda, H. (2009), Joint estimation of discretely observed stable Lévy processes with symmetric Lévy density. J. Japan Statist. Soc. 39, 49–75.
[24] Masuda, H., Parametric estimation of Lévy processes. Lévy Matters IV, Estimation for Discretely Observed Lévy Processes, pp. 179–286, Lecture Notes in Mathematics, Vol. 2128, Springer.
[25] Sato, K. (1999), Lévy processes and infinitely divisible distributions. Cambridge University Press.
[26] Sweeting, T. J. (1992), Asymptotic ancillarity and conditional inference for stochastic processes. Ann. Statist. 20, 580–589.
[27] Zolotarev, V. M. (1986), One-dimensional stable distributions. Transl. Math. Monographs, 65, AMS, Providence.
Kyiv National Taras Shevchenko University, Volodymyrska, 64, Kyiv, 01033, Ukraine
Institute of Mathematics, Ukrainian National Academy of Sciences, 01601 Tereshchenkivska, 3, Kyiv, Ukraine
1) Faculty of Mathematics, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan; 2) CREST, JST, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan