Simplified quasi-likelihood analysis for a locally asymptotically quadratic random field

Abstract

The asymptotic decision theory by Le Cam and Hájek has been given a lucid perspective by the Ibragimov-Has'minskii theory on convergence of the likelihood random field. Their scheme has been applied to stochastic processes by Kutoyants, and today this scheme is called the IHK program. It ensures that asymptotic properties of an estimator follow directly from the convergence of the random field if a large deviation estimate exists. The quasi-likelihood analysis (QLA) proved a polynomial type large deviation (PLD) inequality to get through this bottleneck of the program. A conclusion of the QLA is that if the quasi-likelihood random field is asymptotically quadratic and if a key index reflecting the identifiability of the random field is non-degenerate, then the PLD inequality is always valid, and as a result, the IHK program can run. Many studies have already taken advantage of the QLA theory. However, quite a few of them still use it in an inefficient way. The aim of this paper is to provide a reformed and simplified version of the QLA and to improve accessibility to the theory. As an example of the effects of the theory based on the PLD, the user can obtain asymptotic properties of the quasi-Bayesian estimator by only verifying non-degeneracy of the key index.

Nakahiro Yoshida
Graduate School of Mathematical Sciences, University of Tokyo †; Japan Science and Technology Agency CREST; The Institute of Statistical Mathematics ∗
February 22, 2021 (arXiv, math.ST)


Keywords and phrases

Ibragimov-Has'minskii theory, quasi-likelihood analysis, polynomial type large deviation, random field, asymptotic decision theory, non-ergodic statistics.

The asymptotic decision theory by Le Cam and Hájek has been given a lucid perspective by the Ibragimov-Has'minskii theory ([3, 4, 5]) on convergence of the likelihood random field. Their […]

As a conclusion of the QLA theory, if the quasi-likelihood random field is locally asymptotically quadratic (LAQ) and if a key index reflecting the identifiability of the random field is non-degenerate, then the polynomial type large deviation inequality is always valid, and as a result, the IHK program can run.

Since an ad hoc model-dependent method is not necessary, the QLA is universal and applies to various dependent models. Many studies are based on the QLA and take advantage of it. These applications include sampled ergodic diffusion processes (Yoshida [30]), adaptive estimation for diffusion processes (Uchida and Yoshida [25]), adaptive Bayes type estimators for ergodic diffusion processes (Uchida and Yoshida [28]), approximate self-weighted LAD estimation of discretely observed ergodic Ornstein-Uhlenbeck processes (Masuda [13]), parametric estimation of Lévy processes (Masuda [15]), Gaussian quasi-likelihood random fields for ergodic Lévy driven SDE (Masuda [14]), and ergodic point processes for limit order book (Clinet and Yoshida [1]). Thanks to its flexibility, the QLA is also applicable to non-ergodic statistics: volatility parameter estimation in regular sampling of finite time horizon (Uchida and Yoshida [27]) and in non-synchronous sampling (Ogihara and Yoshida [19]), and a non-ergodic point process regression model (Ogihara and Yoshida [20]).

∗ This work was in part supported by Japan Science and Technology Agency CREST JPMJCR14D7; Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research No. 17H01702 (Scientific Research); and by a Cooperative Research Program of the Institute of Statistical Mathematics.

† Graduate School of Mathematical Sciences, University of Tokyo: 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan.
Analysis of complex algorithms is possible by relying on the universal design of the QLA: hybrid multi-step estimators (Kamatani and Uchida [7]), and adaptive Bayes estimators and hybrid estimators for small diffusion processes based on sampled data (Nomura and Uchida [17]). Information criteria, sparse estimation and regularization methods have recently been understood in the framework of the QLA: contrast-based information criterion for diffusion processes (Uchida [24]), AIC for non-concave penalized likelihood method (Umezu et al. [29]), Schwarz type model comparison for LAQ models (Eguchi and Masuda [2]), moment convergence of regularized least-squares estimator for linear regression model (Shimizu [22]), moment convergence in regularized estimation under multiple and mixed-rates asymptotics (Masuda and Shimizu [16]), penalized method and polynomial type large deviation inequality (Kinoshita and Yoshida [8]) and the related Suzuki and Yoshida [23]. Jump filtering problems: jump diffusion processes (Ogihara and Yoshida [18]), threshold estimation for stochastic processes with small noise (Shimizu [21]), global jump filters (Inatsugu and Yoshida [6]). Partial quasi-likelihood analysis: Yoshida [31]. Such a variety of applications demonstrates the universality of the framework of the QLA. Since the IHK program runs there, we can obtain limit theorems and the $L^p$-boundedness of the QL estimators (the quasi-maximum likelihood estimator and the quasi-Bayesian estimator), which is indispensable to develop statistical theories.

Footnote. The term "quasi-likelihood" is not in the sense of GLM. We use "quasi-likelihood analysis" because statistical inference for sampled stochastic processes cannot avoid a quasi-likelihood function for estimation. The method is relatively new, but not because of "quasi". The difficulty in large deviation estimates already existed in the likelihood analysis for stochastic processes.

The essence of the QLA is the polynomial type large deviation inequality that was proved in a general setting (Yoshida [30]). Since the LAQ property quite often appears when the model is differentiable, Yoshida [30] was based on this structure. Because of it, the limit distribution of the associated estimators has an explicit expression. The paper [30] gave it, but due to the general way of writing, quite a few users tend to skip the passage after the PLD theorem and try to reconstruct it in each situation. In fact, such a task is unnecessary. Besides, four times differentiability is often assumed in many applications of the QLA. This may be only because a handy condition in [30] assumed an estimate of the supremum of the third-order derivative of the quasi-log likelihood random field $H_T$, though the paper gave a condition ([A1′]) to treat $H_T$ of class $C^2$.

The aim of this paper is to provide a simplified version of the QLA theory directly connecting the assumptions with the limit theorems in order to improve accessibility to the theory. Essentially, the user is only requested to verify non-degeneracy of a key index, and this task is trivial in particular in ergodic statistics. We will give handy conditions for the quasi-likelihood random field of class $C^2$, based on [30], in order to reach the asymptotic properties of the estimators at a single leap. Some assumptions in [30] are arranged and replaced by simple-looking ones in this paper. This simplification will serve future progress, e.g. in analysis of regularization methods.
The LAQ property we adopted here is just one principle of separation, and it is possible to develop a similar theory for a non-LAQ type random field; see Kinoshita and Yoshida [8] for a case of regularization.

A smart way of presenting the theory is to use the convergence of the quasi-likelihood random field $Z_T$ to a random field $Z$ in the function space $\hat C(\mathbb{R}^p)$, the separable Banach space of continuous functions $f$ on $\mathbb{R}^p$ satisfying $\lim_{|u|\to\infty} f(u) = 0$, equipped with the supremum norm. This approach is possible, but to carry it out one needs a suitable measurable extension of $Z_T$ to the outside of the originally given local parameter space and an argument about tightness of random fields on the non-compact $\mathbb{R}^p$. In this article, we deliberately avoid this approach to give priority to simplicity. As a result, the presentation of the theory is now much more elementary than in Yoshida [30].

Given a probability space $(\Omega, \mathcal{F}, P)$ and a bounded open set $\Theta$ in $\mathbb{R}^p$, we consider a random field $H_T : \Omega\times\Theta \to \mathbb{R}$, a function measurable with respect to the product $\sigma$-field $\mathcal{F}\times\mathcal{B}(\Theta)$, $\mathcal{B}(\Theta)$ being the Borel $\sigma$-field of $\Theta$. Here $\mathbb{T}$ is a subset of $\mathbb{R}_+ = [0,\infty)$ satisfying $\sup\mathbb{T} = \infty$. We suppose that $H_T$ is continuous and of class $C^2$, that is, for every $\omega\in\Omega$, the mapping $\Theta\ni\theta\mapsto H_T(\theta)\in\mathbb{R}$ is of class $C^2$, and that $H_T$ is continuously extended to $\partial\Theta$. We shall present a simplified version of the polynomial type large deviation inequality of Yoshida [30] under a handy set of sufficient conditions.

Footnote. Because of the assumptions below about the continuity of $H_T$ and the separability of $\Theta$, this measurability is equivalent to that the function $H_T(\cdot,\theta)$ is measurable for each $\theta\in\Theta$.

Let $\theta^*\in\Theta$. Let $a_T\in GL(\mathbb{R}^p)$ be a scaling matrix such that $|a_T|\to 0$ as $T\to\infty$. Define $\Delta_T$ and $\Gamma_T(\theta)$ by
$$\Delta_T = \partial_\theta H_T(\theta^*)\,a_T \quad\text{and}\quad \Gamma_T(\theta) = -a_T^\star\,\partial_\theta^2 H_T(\theta)\,a_T, \tag{2.1}$$
where $\star$ denotes the matrix transpose. We suppose that $\Gamma$ is a $p\times p$ symmetric random matrix. Let $U(\theta, r) = \{\theta'\in\mathbb{R}^p;\ |\theta'-\theta| < r\}$ for $\theta\in\Theta$ and $r>0$. There exists a positive constant $r_0$ such that $U(\theta^*, r_0)\subset\Theta$.

The minimum and maximum eigenvalues of a symmetric matrix $M$ are denoted by $\lambda_{\min}(M)$ and $\lambda_{\max}(M)$, respectively. Let $b_T = \{\lambda_{\min}(a_T^\star a_T)\}^{-1}$. In particular, $b_T\to\infty$ as $T\to\infty$. Moreover, we assume that
$$b_T^{-1} \le \lambda_{\max}(a_T^\star a_T) \le C_0\,b_T^{-1} \quad (T\in\mathbb{T}) \tag{2.2}$$
for some constant $C_0\in[1,\infty)$. A typical case is $b_T = n$ and $a_T = n^{-1/2} I_p$, where $I_p$ is the identity matrix.

Remark 2.1.

In an ergodic diffusion model, the parameter $\theta_1$ of the diffusion coefficient and the parameter $\theta_2$ of the drift coefficient have different convergence rates in estimation with high frequency data. Condition (2.2) may then seem restrictive, but this impression is incorrect. The random field $H_n$ is not necessarily the same as a quasi-likelihood function $\Psi_n$ used for estimation in reality. The random field $H_n$ is rather "living in the proof" in various manners. Consider a joint quasi-maximum likelihood estimator $(\hat\theta_{1,n}, \hat\theta_{2,n})$ for $(\theta_1,\theta_2)$. To analyze the asymptotic behavior of $\hat\theta_{1,n}$, the random field $H_n(\theta_1) = \Psi_n(\theta_1, \hat\theta_{2,n})$ can be used. $H_n(\theta_1)$ is estimated by taking the supremum over the second argument of $\Psi_n$ at some stage. For $\hat\theta_{2,n}$, one can switch $H_n$ to a different random field $H_n(\theta_2) = \Psi_n(\hat\theta_{1,n}, \theta_2)$. Such a stepwise application of the QLA in the present article's form can be observed in many studies; see Yoshida [30], Uchida and Yoshida [26, 28] and the papers listed in the Introduction.

Define $Y_T : \Omega\times\Theta\to\mathbb{R}$ by
$$Y_T(\theta) = \frac{1}{b_T}\big\{H_T(\theta) - H_T(\theta^*)\big\} \quad (\theta\in\Theta)$$
for $T\in\mathbb{T}$. Let $Y : \Omega\times\Theta\to\mathbb{R}$ be a continuous random field. Let $L > 0$.

[S1]

Parameters $\alpha$, $\beta_1$, $\beta_2$, $\rho_1$ and $\rho_2$ satisfy the following inequalities:
$$0<\beta_1<1/2,\quad 0<\rho_1<\min\big\{1,\ \alpha/(1-\alpha),\ 2\beta_1/(1-\alpha)\big\},\quad 0<2\alpha<\rho_2,\quad \beta_2\ge 0,\quad 1-2\beta_2-\rho_2>0.$$

[S2]
(i) There exists a positive random variable $\chi$ and the following conditions are fulfilled.
(i-1) $Y(\theta) = Y(\theta) - Y(\theta^*) \le -\chi\,|\theta-\theta^*|^2$ for all $\theta\in\Theta$.
(i-2) For some constant $C_L$, it holds that $P\big[\chi\le r^{-(\rho_2-2\alpha)}\big] \le C_L/r^L$ $(r>0)$.
(ii) For some constant $C_L$, it holds that $P\big[\lambda_{\min}(\Gamma) < r^{-\rho_1}\big] \le C_L/r^L$ $(r>0)$.

Let $\beta = \alpha/(1-\alpha)$. Let $\|V\|_p = \big(E[|V|^p]\big)^{1/p}$ for $p>0$ and a random variable $V$.

[S3]
(i) For $M_1 = L(1-\rho_1)^{-1}$, $\sup_{T\in\mathbb{T}} \big\|\,|\Delta_T|\,\big\|_{M_1} < \infty$.
(ii) For $M_2 = L(1-2\beta_2-\rho_2)^{-1}$,
$$\sup_{T\in\mathbb{T}} \Big\| \sup_{\theta\in\Theta\setminus U(\theta^*,\,b_T^{-\alpha/2})} b_T^{\frac12-\beta_2}\,\big|Y_T(\theta)-Y(\theta)\big| \Big\|_{M_2} < \infty.$$
(iii) For $M_3 = L(\beta-\rho_1)^{-1}$,
$$\sup_{T\in\mathbb{T}} \Big\| \sup_{u\in U_T:\,|u|\le 1} \big|\Gamma_T(\theta^*+\delta u)-\Gamma_T(\theta^*)\big| \Big\|_{M_3} = O(\delta) \quad (\delta\downarrow 0).$$
(iv) For $M_4 = L\big(2\beta_1(1-\alpha)^{-1}-\rho_1\big)^{-1}$,
$$\sup_{T\in\mathbb{T}} \Big\| b_T^{\beta_1}\,\big|\Gamma_T(\theta^*)-\Gamma\big| \Big\|_{M_4} < \infty.$$

Remark 2.2. (i) In the above conditions, each constant $C_L$ is independent of $r$ and $T$, but may depend on the parameters appearing in [S1] as well as $L$. (ii) In applications, we often need to estimate the supremum of a sequence of martingales depending on $\theta$ to verify the above moment conditions. Use of Sobolev's embedding inequality is a simple solution. (iii) The random matrix $\Gamma$ is positive-definite a.s. if [S2] (ii) is satisfied.

Let $U_T = \{u\in\mathbb{R}^p;\ \theta^*+a_Tu\in\Theta\}$ and $V_T(r) = \{u\in U_T;\ |u|\ge r\}$ for $r>0$. Define the random field $Z_T$ on $U_T$ by
$$Z_T(u) = \exp\big(H_T(\theta^*+a_Tu) - H_T(\theta^*)\big)$$
for $u\in U_T$. Following Yoshida [30], we give a polynomial type large deviation inequality for the random field $Z_T$.

Theorem 2.3.

Given a positive constant $L$, suppose that [S1], [S2] and [S3] are fulfilled. Then there exists a constant $C_L$ such that
$$P\Big[\sup_{u\in V_T(r)} Z_T(u) \ge \exp\big(-2^{-1}r^{2-(\rho_1\vee\rho_2)}\big)\Big] \le \frac{C_L}{r^L} \tag{2.3}$$
for all $r>0$ and $T\in\mathbb{T}$. The supremum of the empty set should read $-\infty$.

Proof. Suppose that the constants $\alpha, \beta_1, \beta_2, \rho_1, \rho_2$ satisfy Condition [S1]. […] $\rho = 2$ for $H_T$ of class $C^2$. According to Section 3.1 of [30], it suffices to verify Conditions [A1′] and [A2]-[A6] therein. Condition [A4] of [30] with $\rho = 2$ and the condition that $\alpha\in(0,1)$ are satisfied under [S1] since $\rho$ […]. Condition [A1′] of [30] requires the estimate
$$\sup_{T>0} P\big[S'_T(r)^c\big] \le \frac{C_L}{r^L} \quad (r>0) \tag{2.4}$$
for some constant $C_L$, where the event $S'_T(r)$ is defined by
$$S'_T(r) = \Big\{ \sup_{h:\ \theta^*+h\in\Theta,\ b_T^{-1/2}r\,\le\,|h|\,\le\,C_0^{1/2}b_T^{-\alpha/2}} \big|\Gamma_T(\theta^*+h)-\Gamma\big| < r^{-\rho_1} \Big\}.$$
To verify (2.4), we may assume $r\le C_0^{1/2}\,b_T^{(1-\alpha)/2}$, equivalently,
$$b_T^{-1} \le C_0^{1/(1-\alpha)}\,r^{-2/(1-\alpha)}. \tag{2.5}$$
Otherwise, $S'_T(r)^c = \emptyset$, and there is nothing to show. Furthermore, we may assume $r$ is sufficiently large (in particular, $r\ge 1$) to show Inequality (2.4), by changing $C_L$ if necessary. We have
$$P\big[S'_T(r)^c\big] \le P_1(T,r) + P_2(T,r), \tag{2.6}$$
where
$$P_1(T,r) = P\Big[\sup_{h:\ |h|\le C_0^{1/\{2(1-\alpha)\}}r^{-\beta}} \big|\Gamma_T(\theta^*+h)-\Gamma_T(\theta^*)\big| \ge \tfrac12 r^{-\rho_1}\Big]$$
and
$$P_2(T,r) = P\Big[\big|\Gamma_T(\theta^*)-\Gamma\big| \ge \tfrac12 r^{-\rho_1}\Big]\,1_{\{b_T^{-1}\le C_0^{1/(1-\alpha)}r^{-2/(1-\alpha)}\}}$$
in view of (2.5). For sufficiently large $r$, by Condition [S3] (iii),
$$\sup_{T\in\mathbb{T}} P_1(T,r) \lesssim r^{M_3\rho_1}\,\sup_{T\in\mathbb{T}} E\Big[\sup_{h:\ |h|\le r^{-\beta}} \big|\Gamma_T(\theta^*+h)-\Gamma_T(\theta^*)\big|^{M_3}\Big] \lesssim r^{-M_3(\beta-\rho_1)} = r^{-L}. \tag{2.7}$$
Next, by Condition [S3] (iv), we have
$$\sup_{T\in\mathbb{T}} P_2(T,r) \lesssim \sup_{T\in\mathbb{T}} \Big( b_T^{-M_4\beta_1}\,r^{M_4\rho_1}\,1_{\{b_T^{-1}\le C_0^{1/(1-\alpha)}r^{-2/(1-\alpha)}\}} \Big) \lesssim r^{-M_4(2\beta_1/(1-\alpha)-\rho_1)} = r^{-L}. \tag{2.8}$$
From (2.6), (2.7) and (2.8), we obtain (2.4); therefore [A1′] of [30] is verified.

Condition [A6] of [30] follows from Condition [S3] (i) and Condition [S3] (ii). Condition [S2] (i) ensures Conditions [A3] for $\rho = 2$ and [A5] of [30]. Moreover, [S2] (ii) verifies [A2] of [30]. Now, as already mentioned, we apply Theorem 1 of [30] to $Z_T$ for $\rho = 2$ in order to obtain (2.3).

Define $r_T(u)$ by
$$r_T(u) = \begin{cases} H_T(\theta^*+a_Tu) - H_T(\theta^*) - \Big(\Delta_T[u] - \dfrac12\,\Gamma[u^{\otimes 2}]\Big) & (u\in U_T) \\[4pt] 1 & (u\notin U_T) \end{cases} \tag{2.9}$$

Proposition 2.4.

Suppose that
$$\sup_{u\in U_T\cap U(0,K)} \big|\Gamma_T(\theta^*+a_Tu)-\Gamma\big| \to^p 0 \quad (T\to\infty) \tag{2.10}$$
for every $K>0$. Then the random field $Z_T$ is locally asymptotically quadratic at $\theta^*$, that is,
$$Z_T(u) = \exp\Big(\Delta_T[u] - \frac12\,\Gamma[u^{\otimes 2}] + r_T(u)\Big) \quad (u\in U_T) \tag{2.11}$$
and $r_T(u)\to^p 0$ as $T\to\infty$ for every $u\in\mathbb{R}^p$.

Proof. By definition of $r_T(u)$, Equation (2.11) holds for $u\in U_T$. For each $u\in\mathbb{R}^p$, there is a number $T_u$ such that $a_Tu\in U(0,r_0)$ for all $T\ge T_u$. Then $r_T(u)$ admits the expression
$$r_T(u) = -\int_0^1 (1-s)\,\big\{\Gamma_T(\theta^*+s\,a_Tu)-\Gamma\big\}\,ds\,[u^{\otimes 2}]. \tag{2.12}$$
Therefore $r_T(u)\to^p 0$ as $T\to\infty$ by (2.10).

Remark 2.5.

We have $\sup_{u\in U_T\cap U(0,K)} \big|\Gamma_T(\theta^*+a_Tu)-\Gamma_T(\theta^*)\big| \to^p 0$ as $T\to\infty$ under [S3] (iii) since
$$\limsup_{T\to\infty} \Big\| \sup_{u\in U_T\cap U(0,K)} \big|\Gamma_T(\theta^*+a_Tu)-\Gamma_T(\theta^*)\big| \Big\|_{M_3} \le \limsup_{T\to\infty} \Big\| \sup_{v\in\mathbb{R}^p:\ |v|\le 1} \big|\Gamma_T(\theta^*+K|a_T|v)-\Gamma_T(\theta^*)\big| \Big\|_{M_3} \le \limsup_{T\to\infty} O(K|a_T|) = 0.$$
On the other hand, $\Gamma_T(\theta^*)-\Gamma\to^p 0$ as $T\to\infty$ under [S3] (iv). Therefore, $Z_T$ is locally asymptotically quadratic at $\theta^*$ if [S3] (iii) and (iv) are satisfied, since the convergence (2.10) holds under [S3] (iii) and (iv), though these conditions are stronger than needed for (2.10).

Let $\Delta$ be a $p$-dimensional random vector on some extension of $(\Omega,\mathcal{F},P)$. Define a random field $Z$ on $\mathbb{R}^p$ by
$$Z(u) = \exp\Big(\Delta[u] - \frac12\,\Gamma[u^{\otimes 2}]\Big) \tag{2.15}$$
for $u\in\mathbb{R}^p$. Let $\hat u = \Gamma^{-1}\Delta$.

Any measurable mapping $\hat\theta^M_T : \Omega\to\overline\Theta$ is called a quasi-maximum likelihood estimator (QMLE) for $H_T$ if
$$H_T(\hat\theta^M_T) = \max_{\theta\in\overline\Theta} H_T(\theta). \tag{2.16}$$
Since $H_T$ is continuous on the compact $\overline\Theta$, such a measurable function always exists, which is ensured by the measurable selection theorem. Uniqueness of $\hat\theta^M_T$ is not assumed. Let $\hat u^M_T = a_T^{-1}(\hat\theta^M_T-\theta^*)$ for the QMLE $\hat\theta^M_T$.

Let $\mathcal{G}$ be a $\sigma$-field such that $\sigma[\Gamma]\subset\mathcal{G}\subset\mathcal{F}$. It is said that a sequence $(V_T)_{T\in\mathbb{T}}$ of random variables taking values in a metric space $S$ equipped with the Borel $\sigma$-field converges $\mathcal{G}$-stably to an $S$-valued random variable $V_\infty$ defined on an extension of $(\Omega,\mathcal{F},P)$ if $(V_T,\Psi)\to^d(V_\infty,\Psi)$ as $T\to\infty$ for any $\mathcal{G}$-measurable random variable $\Psi$. The $\mathcal{G}$-stable convergence is denoted by $\to^{d_s(\mathcal{G})}$.

Theorem 2.6.

Let $L > p > 0$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that
$$\Delta_T \to^{d_s(\mathcal{G})} \Delta \tag{2.17}$$
as $T\to\infty$. Then
(a) As $T\to\infty$,
$$\hat u^M_T - \Gamma^{-1}\Delta_T \to^p 0. \tag{2.18}$$
(b) As $T\to\infty$,
$$E\big[f(\hat u^M_T)\Phi\big] \to E\big[f(\hat u)\Phi\big] \tag{2.19}$$
for any bounded $\mathcal{G}$-measurable random variable $\Phi$ and any $f\in C(\mathbb{R}^p)$ satisfying $\limsup_{|u|\to\infty}|u|^{-p}|f(u)| < \infty$.

Proof. As mentioned in Remark 2.5, the convergence (2.10) holds for every $K>0$. Therefore, by (2.12),
$$\sup_{u:\ |u|\le R} |r_T(u)| \to^p 0 \quad (T\to\infty) \tag{2.20}$$
for every $R>0$. The space $C(U(0,R))$ of continuous functions on $U(0,R)$ is equipped with the supremum norm. Combining the representation (2.11) of $Z_T$ with the convergences (2.17) and (2.20), by estimating the modulus of continuity of $\log Z_T$ on $U(0,R)$, we obtain tightness of the family $\{Z_T|_{U(0,R)}\}_{T\ge T_0}$ for some $T_0\in\mathbb{T}$, which yields the convergence
$$Z_T|_{U(0,R)} \to^d Z|_{U(0,R)} \tag{2.21}$$
in $C(U(0,R))$ as $T\to\infty$ for every $R>0$. Indeed, the finite-dimensional convergence $Z_T\to^{d_f}Z$ is given by (2.17), (2.20) and (2.11).

Let $F$ be any closed set in $\mathbb{R}^p$. Then
$$\begin{aligned}
\limsup_{T\to\infty} P\big[\hat u^M_T\in F\big]
&\le \limsup_{T\to\infty} P\big[\hat u^M_T\in F\cap U(0,R)\big] + \limsup_{T\to\infty} P\big[\hat u^M_T\in V_T(R)\big] \\
&\le \limsup_{T\to\infty} P\Big[\sup_{u\in F\cap U(0,R)} Z_T(u) - \sup_{u\in F^c\cap U(0,R)} Z_T(u) \ge 0\Big] + \limsup_{T\to\infty} P\Big[\sup_{u\in V_T(R)} Z_T(u) \ge 1\Big] \\
&\le P\Big[\sup_{u\in F\cap U(0,R)} Z(u) - \sup_{u\in F^c\cap U(0,R)} Z(u) \ge 0\Big] + \frac{C_L}{R^L}
\end{aligned} \tag{2.22}$$
by the convergence (2.21) and the polynomial type large deviation inequality (2.3) given by Theorem 2.3. Let $R\to\infty$ in (2.22) to obtain
$$\limsup_{T\to\infty} P\big[\hat u^M_T\in F\big] \le P\Big[\sup_{u\in F} Z(u) - \sup_{u\in F^c} Z(u) \ge 0\Big] \le P\big[\hat u\in F\big]. \tag{2.23}$$
Here the positivity of $\Gamma$ given by [S2] (ii) (Remark 2.2) was used for the first inequality, and the last inequality is by the uniqueness of the maximum point of the random field $Z$ defined by (2.15). Inequality (2.23) shows the convergence $\hat u^M_T\to^d\hat u$ as $T\to\infty$.

From the convergence of $\hat u^M_T$, in particular $\hat\theta^M_T\to^p\theta^*$, and when $\hat\theta^M_T\in U(\theta^*,r_0)$, one has
$$\Delta_T = \int_0^1 \Gamma_T\big(\theta^*+s(\hat\theta^M_T-\theta^*)\big)\,ds\ \hat u^M_T$$
since $\partial_\theta H_T(\hat\theta^M_T) = 0$. Then we obtain (2.18) from [S3] (iii) and (iv). The $\mathcal{G}$-stable convergence
$$\hat u^M_T \to^{d_s(\mathcal{G})} \hat u \tag{2.24}$$
follows from (2.17).

As already used in the above argument,
$$P\big[|\hat u^M_T|\ge r\big] \le P\Big[\sup_{u\in V_T(r)} Z_T(u) \ge 1\Big] \le \frac{C_L}{r^L} \tag{2.25}$$
for all $T\in\mathbb{T}$ and $r>0$. Therefore, $\sup_{T\in\mathbb{T}} E\big[|\hat u^M_T|^q\big] < \infty$ for any constant $q$ such that $L>q>p$. This means the family $\{f(\hat u^M_T)\}_{T>0}$ is uniformly integrable. Consequently, we obtain (2.19) from (2.24).

Remark 2.7. (i) The convergence (2.19) holds for non-bounded $\Phi$ if $\Phi$ has the dual integrability for $f(\hat u^M_T)$. For example, the convergence holds for $\Phi\in L^r(\mathcal{G})$ for some $r>1$ if $\limsup_{|u|\to\infty}|u|^{-p(r-1)/r}|f(u)| < \infty$. (ii) The asymptotic equivalence (2.18) between $\hat u^M_T$ and $\Gamma^{-1}\Delta_T$ is called the first-order efficiency, in particular for the maximum likelihood estimator. This relation is useful when one considers a joint convergence of $\hat u^M_T$ with other variables. Such an asymptotic representation of the error is useful in analysis of a model having multi-scaled parameters.

The quasi-likelihood analysis enables us to derive asymptotic properties of the Bayesian estimator, as well as the quasi-maximum likelihood estimator. The mapping
$$\hat\theta^B_T = \Big[\int_\Theta \exp\big(H_T(\theta)\big)\varpi(\theta)\,d\theta\Big]^{-1} \int_\Theta \theta\,\exp\big(H_T(\theta)\big)\varpi(\theta)\,d\theta \tag{2.26}$$
is called a quasi-Bayesian estimator (QBE) with respect to the prior density $\varpi$. The QBE $\hat\theta^B_T$ takes values in the convex hull of $\Theta$. When $H_T$ is the log likelihood function, the QBE coincides with the Bayesian estimator with respect to the quadratic loss function. We will assume $\varpi$ is continuous and $0 < \inf_{\theta\in\Theta}\varpi(\theta) \le \sup_{\theta\in\Theta}\varpi(\theta) < \infty$.

Theorem 2.8. (I)

Let $L>1$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that the convergence (2.17) holds as $T\to\infty$. Then
(a) As $T\to\infty$,
$$\hat u^B_T - \Gamma^{-1}\Delta_T \to^p 0. \tag{2.27}$$
(b) As $T\to\infty$,
$$\hat u^B_T \to^{d_s(\mathcal{G})} \hat u. \tag{2.28}$$

(II) Let $p\ge 1$ and $L > (p+1)\vee 2$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that the convergence (2.17) holds as $T\to\infty$. Moreover, suppose that there exist positive constants $q$, $c_0$, $\delta$ and $T_0\in\mathbb{T}$ such that $q>p$ and
$$\sup_{T\ge T_0} E\big[\big|H_T(\theta^*+a_Tu)-H_T(\theta^*)\big|^q\big] \le c_0\,|u|^q \tag{2.29}$$
for all $u\in U(0,\delta)$. Then (2.27) holds, and moreover,
$$E\big[f(\hat u^B_T)\Phi\big] \to E\big[f(\hat u)\Phi\big] \tag{2.30}$$
as $T\to\infty$ for any $\mathcal{G}$-measurable bounded random variable $\Phi$ and any $f\in C(\mathbb{R}^p)$ satisfying $\limsup_{|u|\to\infty}|u|^{-p}|f(u)| < \infty$.

Remark 2.9.

In Theorem 2.8, we implicitly assume that $T_0$ is sufficiently large so that $U(0,\delta)\subset U_T$ for all $T\ge T_0$, and the left-hand side of (2.29) makes sense.

Remark 2.10.

Condition (2.29) holds under any one of the following conditions:
(i) There exist constants $q>p$, $\delta>0$ and $T_0\in\mathbb{T}$ such that $\sup_{T\ge T_0}\sup_{\theta\in U_T\cap U(\theta^*,\delta)}\|\Gamma_T(\theta)\|_q < \infty$ and $\sup_{T\ge T_0}\|\Delta_T\|_q < \infty$.
(ii) $|\Gamma|\in L^q$, and $M_1$, $M_3$ and $M_4$ appearing in [S3] satisfy $M_1\wedge M_3\wedge M_4\ge q$.
This follows from the representation (2.11) of $Z_T(u)$ and the representation (2.12) of $r_T(u)$.

Proof of Theorem 2.8. (I) We obtain a polynomial type large deviation inequality from Theorem 2.3: for any $D>0$, there exist positive constants $C_1$ and $C_2$ such that
$$P\Big[\sup_{V_T(r)} Z_T \ge C_1 r^{-D}\Big] \le C_2\,r^{-L} \tag{2.31}$$
for all $T$ and $r>0$. Choose a number $D$ such that $D > p + (p\vee 1)$. […] $C'$, then
$$\begin{aligned}
P\Big[\int_{V_T(r)} (1+|u|)\,Z_T(u)\,du > C'\sum_{\ell=0}^\infty (r+\ell)^{p-D}\Big]
&\le \sum_{\ell=0}^\infty P\Big[\int_{\{r+\ell\le|u|<r+\ell+1\}\cap U_T} (1+|u|)\,Z_T(u)\,du > C'(r+\ell)^{p-D}\Big] \\
&\le \sum_{\ell=0}^\infty P\Big[\sup_{\{r+\ell\le|u|<r+\ell+1\}\cap U_T} Z_T(u) > C_1 (r+\ell)^{-D}\Big] \\
&\le C_2 \sum_{\ell=0}^\infty (r+\ell)^{-L}
\end{aligned}$$
for $T\in\mathbb{T}$ and $r>1$, by (2.31). Since $D-p>1$ and $L>1$, there exist constants $C_3$ and $\epsilon$ (independent of $(r,T)$) such that
$$P\Big[\int_{V_T(r)} (1+|u|)\,Z_T(u)\,du > C_3\,r^{-\epsilon}\Big] \le C_3\,r^{-\epsilon} \quad (r>1,\ T\in\mathbb{T}). \tag{2.32}$$
The variable $\hat u^B_T$ has the expression
$$\hat u^B_T = \Big(\int_{U_T} Z_T(u)\,\varpi(\theta^*+a_Tu)\,du\Big)^{-1} \int_{U_T} u\,Z_T(u)\,\varpi(\theta^*+a_Tu)\,du. \tag{2.33}$$
For $g(u) = (1,u)$, let
$$X_T = \int_{U_T} g(u)\,Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du,\quad X_{T,r} = \int_{U_T\cap U(0,r)} g(u)\,Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du,\quad W_{T,r} = \int_{V_T(r)} g(u)\,Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du,$$
$$X_\infty = \int_{\mathbb{R}^p} g(u)\,Z(u)\,\varpi(\theta^*)\,du,\quad X_{\infty,r} = \int_{U(0,r)} g(u)\,Z(u)\,\varpi(\theta^*)\,du,$$
where $\theta^\dagger_T(u) = \theta^*+a_Tu$ and $Z$ is given by (2.15). Then $X_T = X_{T,r}+W_{T,r}$ and the following properties hold.
(i) For any $\eta>0$, there exists $r_1>0$ such that $\sup_{T\in\mathbb{T}} P[|W_{T,r}|>\eta] < \eta$ for all $r\ge r_1$.
(ii) For every $r>0$, $X_{T,r}\to^d X_{\infty,r}$ as $T\to\infty$.
(iii) $X_{\infty,r}\to^d X_\infty$ as $r\to\infty$.
Indeed, (i) follows from (2.32), (ii) from the convergence (2.21), and (iii) is obvious. Therefore
$$X_T \to^d X_\infty \tag{2.34}$$
as $T\to\infty$.

Denote $X_T = (X^{(0)}_T, X^{(1)}_T)$ and $X_{T,r} = (X^{(0)}_{T,r}, X^{(1)}_{T,r})$. We will consider sufficiently large $T$ such that $U_T\supset U(0,1)$. Let
$$A_T = \big(|X^{(1)}_T|+|X^{(0)}_T|\big)\Big(X^{(0)}_T \int_{U(0,1)} Z_T(u)\,du\ \inf_{\theta\in\Theta}\varpi(\theta)\Big)^{-1}.$$
Then
$$A_T \ge \frac{|X^{(1)}_T|+|X^{(0)}_T|}{X^{(0)}_T\,X^{(0)}_{T,r}}$$
for all $r\ge 1$. We have
$$\Big|\big(X^{(0)}_T\big)^{-1}X^{(1)}_T - \big(X^{(0)}_{T,r}\big)^{-1}X^{(1)}_{T,r}\Big| \le |W_{T,r}|\,A_T$$
for all $r\ge 1$. Let $\epsilon>0$. Then there exists a positive number $\eta>0$ such that
$$\limsup_{T\to\infty} P\big[A_T>\eta\big] < \epsilon$$
since $\{A_T\}_{T\ge T_0}$ is tight for some $T_0\in\mathbb{T}$ by (2.34) and (2.21). For the pair $(\epsilon,\eta)$, there exists $r_2 = r_2(\epsilon,\eta)\ge 1$ such that
$$\limsup_{T\to\infty} P\Big[|W_{T,r}| > \frac{\epsilon}{\eta}\Big] < \epsilon \quad (r\ge r_2)$$
by the property (i) mentioned just before (2.34). In what follows, we fix an $r\ge r_2$. Then
$$\limsup_{T\to\infty} P\Big[\Big|\big(X^{(0)}_T\big)^{-1}X^{(1)}_T - \big(X^{(0)}_{T,r}\big)^{-1}X^{(1)}_{T,r}\Big| > \epsilon\Big] \le \limsup_{T\to\infty} P\Big[|W_{T,r}| > \frac{\epsilon}{\eta}\Big] + \limsup_{T\to\infty} P\big[A_T>\eta\big] < 2\epsilon. \tag{2.35}$$
We have
$$\Big|\int_{U(0,r)} u^i Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du - \int_{U(0,r)} u^i Z_T(u)\,\varpi(\theta^*)\,du\Big| \le \int_{U(0,r)} (1+|u|)\,Z_T(u)\,du \times \sup\big\{|\varpi(\theta)-\varpi(\theta^*)|;\ \theta\in\Theta,\ |\theta-\theta^*|\le|a_T|r\big\}$$
for $i = 0, 1$, where $u^0 = 1$ and $u^1 = u$. Therefore
$$\Big(\int_{U(0,r)} Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du\Big)^{-1}\int_{U(0,r)} u\,Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du - \Big(\int_{U(0,r)} Z_T(u)\,du\Big)^{-1}\int_{U(0,r)} u\,Z_T(u)\,du \to^p 0 \tag{2.36}$$
as $T\to\infty$.

Moreover, we have
$$\begin{aligned}
\Big|\int_{U(0,r)} u^i Z_T(u)\,du - \int_{U(0,r)} u^i \hat Z_T(u)\,du\Big|
&= \Big|\int_{U(0,r)} u^i \exp\big(\Delta_T[u]-2^{-1}\Gamma[u^{\otimes 2}]+r_T(u)\big)\,du - \int_{U(0,r)} u^i \exp\big(\Delta_T[u]-2^{-1}\Gamma[u^{\otimes 2}]\big)\,du\Big| \\
&\le \int_{U(0,r)} (1+|u|)\,\hat Z_T(u)\,du\ \sup_{u\in U(0,r)} \big|e^{r_T(u)}-1\big|,
\end{aligned}$$
where $\hat Z_T(u) = \exp\big(\Delta_T[u]-2^{-1}\Gamma[u^{\otimes 2}]\big)$. Therefore
$$\Big(\int_{U(0,r)} Z_T(u)\,du\Big)^{-1}\int_{U(0,r)} u\,Z_T(u)\,du - \Big(\int_{U(0,r)} \hat Z_T(u)\,du\Big)^{-1}\int_{U(0,r)} u\,\hat Z_T(u)\,du \to^p 0 \tag{2.37}$$
as $T\to\infty$ thanks to the convergence (2.20).

Let
$$\hat X_T = (\hat X^{(0)}_T, \hat X^{(1)}_T) = \int_{\mathbb{R}^p} g(u)\,\hat Z_T(u)\,du,\quad \hat X_{T,r} = (\hat X^{(0)}_{T,r}, \hat X^{(1)}_{T,r}) = \int_{U(0,r)} g(u)\,\hat Z_T(u)\,du,\quad \hat W_{T,r} = \int_{\mathbb{R}^p\setminus U(0,r)} g(u)\,\hat Z_T(u)\,du.$$
Then
$$\Big|\big(\hat X^{(0)}_T\big)^{-1}\hat X^{(1)}_T - \big(\hat X^{(0)}_{T,r}\big)^{-1}\hat X^{(1)}_{T,r}\Big| \le |\hat W_{T,r}|\,\hat A_T$$
for all $r\ge 1$, where
$$\hat A_T = \big(|\hat X^{(1)}_T|+|\hat X^{(0)}_T|\big)\Big(\hat X^{(0)}_T \int_{U(0,1)} \hat Z_T(u)\,du\ \inf_{\theta\in\Theta}\varpi(\theta)\Big)^{-1} \ge \frac{|\hat X^{(1)}_T|+|\hat X^{(0)}_T|}{\hat X^{(0)}_T\,\hat X^{(0)}_{T,r}}.$$
[…] show that for any $\eta>0$, there exist $r_3>0$ and $T_1\in\mathbb{T}$ such that
$$\sup_{T\ge T_1} P\big[|\hat W_{T,r}|>\eta\big] < \eta \quad (r\ge r_3). \tag{2.38}$$
In the same way as we showed (2.35),
$$\limsup_{T\to\infty} P\Big[\Big|\big(\hat X^{(0)}_T\big)^{-1}\hat X^{(1)}_T - \big(\hat X^{(0)}_{T,r}\big)^{-1}\hat X^{(1)}_{T,r}\Big| > \epsilon\Big] < 2\epsilon. \tag{2.39}$$
To obtain (2.39) by using (2.38) and the tightness of $\{\hat A_T\}_{T\ge T_0}$ for some $T_0\in\mathbb{T}$, we replace $r$ by a larger number, if necessary.

Combining (2.35), (2.36), (2.37) and (2.39), we obtain
$$\limsup_{T\to\infty} P\Big[\Big|\big(X^{(0)}_T\big)^{-1}X^{(1)}_T - \big(\hat X^{(0)}_T\big)^{-1}\hat X^{(1)}_T\Big| > 4\epsilon\Big] \le 4\epsilon. \tag{2.40}$$
This completes the proof of (2.27) since $\hat u^B_T = \big(X^{(0)}_T\big)^{-1}X^{(1)}_T$ and $\big(\hat X^{(0)}_T\big)^{-1}\hat X^{(1)}_T = \Gamma^{-1}\Delta_T$. From (2.27), we obtain (2.28).

(II) There exists a number $p_*$ such that $p_*\ge 1$ and $p < p_* < (D-p)\wedge(L-1)$, since $p\ge 1$ and $L>(p+1)\vee 2$. Then the following estimates are standard:
$$\begin{aligned}
E\big[|\hat u^B_T|^{p_*}\big]
&\le E\Big[\Big(\int_{U_T} Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du\Big)^{-1}\int_{U_T} |u|^{p_*} Z_T(u)\,\varpi(\theta^\dagger_T(u))\,du\Big] \\
&\le C(\varpi)\sum_{r=0}^\infty (r+1)^{p_*}\,E\Big[\Big(\int_{U_T} Z_T(u)\,du\Big)^{-1}\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du\Big] \\
&\le C(\varpi)\big(1+\Phi_{1,T}+\Phi_{2,T}\big)
\end{aligned}$$
for some constant $C(\varpi)$, where
$$\Phi_{1,T} = \sum_{r=1}^\infty (r+1)^{p_*}\,E\Big[\Big(\int_{U_T} Z_T(u)\,du\Big)^{-1}\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du \times 1_{\big\{\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du\ >\ C'/r^{D-p+1}\big\}}\Big]$$
and
$$\Phi_{2,T} = \sum_{r=1}^\infty (r+1)^{p_*}\,E\Big[\Big(\int_{U_T} Z_T(u)\,du\Big)^{-1}\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du \times 1_{\big\{\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du\ \le\ C'/r^{D-p+1}\big\}}\Big].$$
[…] $C'$. Since the integrand of the expectation of $\Phi_{1,T}$ is not greater than one, we obtain
$$\Phi_{1,T} \le \sum_{r=1}^\infty (r+1)^{p_*}\,P\Big[\int_{\{u;\ r<|u|\le r+1\}\cap U_T} Z_T(u)\,du > \frac{C'}{r^{D-p+1}}\Big] \lesssim \sum_{r=1}^\infty r^{-(L-p_*)}$$
thanks to the polynomial type large deviation inequality (2.31). For $\Phi_{2,T}$,
$$\Phi_{2,T} \lesssim \sum_{r=1}^\infty r^{-(D-p_*-p+1)}\,E\Big[\Big(\int_{U_T} Z_T(u)\,du\Big)^{-1}\Big].$$
Therefore, the family $\{|\hat u^B_T|^p\}_{T\in\mathbb{T}}$ is uniformly integrable if
$$\sup_{T\ge T_0} E\Big[\Big(\int_{U(0,\delta)} Z_T(u)\,du\Big)^{-1}\Big] < \infty \tag{2.41}$$
since $U(0,\delta)\subset\cap_{T\ge T_0}U_T$ (see Remark 2.9) and then $\sup_{T\in\mathbb{T}} E\big[|\hat u^B_T|^{p_*}\big] < \infty$. We note that the family $\{\hat u^B_T\}$ […]

[…] [S2] and [S3].

[T1]
(i) There exists a positive random variable $\chi$ and the following conditions are fulfilled.
(i-1) $Y(\theta) = Y(\theta)-Y(\theta^*) \le -\chi\,|\theta-\theta^*|^2$ for all $\theta\in\Theta$.
(i-2) For every $L>0$, there exists a constant $C$ such that $P\big[\chi\le r^{-1}\big] \le C/r^L$ $(r>0)$.
(ii) For every $L>0$, there exists a constant $C$ such that $P\big[\lambda_{\min}(\Gamma) < r^{-1}\big] \le C/r^L$ $(r>0)$.

Remark 2.11. $\chi^{-1}\in L^{\infty-} = \cap_{p>1}L^p$ under [T1] (i-2). $|\Gamma^{-1}|\in L^{\infty-}$ under [T1] (ii) since $\big(\lambda_{\min}(\Gamma)\big)^{-1}\in L^{\infty-}$ and $|\Gamma^{-1}| \le C_p\big(\lambda_{\min}(\Gamma)\big)^{-1}$ for a constant $C_p$ only depending on $p$. The $L^q$-integrability of $\Gamma$ will be assumed when we verify (2.29).

[T2]

There exist positive numbers ǫ and ǫ such that the following conditions are satisﬁed forall p > (i) sup T ∈ T (cid:13)(cid:13) | ∆ T | (cid:13)(cid:13) p < ∞ . (ii) sup T ∈ T (cid:13)(cid:13)(cid:13)(cid:13) sup θ ∈ Θ b Tǫ (cid:12)(cid:12) Y T ( θ ) − Y ( θ ) (cid:12)(cid:12)(cid:13)(cid:13)(cid:13)(cid:13) p < ∞ . (iii) sup T ∈ T (cid:13)(cid:13)(cid:13)(cid:13) sup u ∈ U T : | u |≤ (cid:12)(cid:12) Γ T ( θ ∗ + δu ) − Γ T ( θ ∗ ) (cid:12)(cid:12)(cid:13)(cid:13)(cid:13)(cid:13) p = O ( δ ) ( δ ↓ . (iv) sup T ∈ T (cid:13)(cid:13) b Tǫ (cid:12)(cid:12) Γ T ( θ ∗ ) − Γ (cid:12)(cid:12)(cid:13)(cid:13) p < ∞ . Denote by f ∈ C p ( R p ) the set of continuous functions of at most polynomial growth. Wefurther simplify Theorems 2.6 and 2.8. Theorem 2.12.

Suppose that Conditions [T1] and [T2] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Then
(a) As $T \to \infty$, $\hat u^M_T - \Gamma^{-1}\Delta_T \to^p 0$. (2.42)
(b) As $T \to \infty$, $E\big[ f(\hat u^M_T)\,\Phi \big] \to E\big[ f(\hat u)\,\Phi \big]$ (2.43) for any $f \in C_p(\mathbb{R}^p)$ and any $\mathcal{G}$-measurable random variable $\Phi \in \cup_{p>1} L^p$.

Proof. There exist values of the parameters $\alpha$, $\beta_1$, $\beta_2$, $\rho_1$ and $\rho_2$ satisfying $\beta_1 \in (0, \min\{\epsilon_2, 1/2\})$, $1/2 - \beta_2 \le \epsilon_1$ and Condition [S1]. Condition [S2] is verified for any given $L > 0$ under [T1], and [T2] is sufficient for [S3] for any $L > 0$. Therefore we can apply Theorem 2.6. This concludes the proof.
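As a side check on Remark 2.11, the following short computation (a sketch, not taken from the paper) shows why the tail condition [T1] (i-2) yields $\chi^{-1} \in L^{\infty-}$. The label $C_L$ is an assumption of this sketch: it stands for the constant $C$ of [T1] (i-2) associated with the chosen $L$.

```latex
% Sketch: moments of \chi^{-1} from the polynomial tail bound in [T1] (i-2).
% Fix p > 0 and choose L > p; C_L denotes the constant C of [T1] (i-2) for this L.
\begin{align*}
E\big[\chi^{-p}\big]
  &= \int_0^\infty p\,r^{p-1}\,P\big[\chi^{-1} > r\big]\,dr \\
  &\le \int_0^1 p\,r^{p-1}\,dr + \int_1^\infty p\,r^{p-1}\,P\big[\chi \le r^{-1}\big]\,dr \\
  &\le 1 + p\,C_L \int_1^\infty r^{p-1-L}\,dr
   = 1 + \frac{p\,C_L}{L-p} < \infty .
\end{align*}
```

Since $L$ may be taken arbitrarily large, every moment of $\chi^{-1}$ is finite. The same argument applied to [T1] (ii) gives $(\lambda_{\min}(\Gamma))^{-1} \in L^{\infty-}$.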

Theorem 2.13. Suppose that Conditions [T1] and [T2] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Moreover, suppose that $|\Gamma| \in L^q$ for some $q > p$. Then
(a) As $T \to \infty$, $\hat u^B_T - \Gamma^{-1}\Delta_T \to^p 0$. (2.44)
(b) As $T \to \infty$, $E\big[ f(\hat u^B_T)\,\Phi \big] \to E\big[ f(\hat u)\,\Phi \big]$ (2.45) for any $f \in C_p(\mathbb{R}^p)$ and any $\mathcal{G}$-measurable random variable $\Phi \in \cup_{p>1} L^p$.

Proof. We apply Theorem 2.8. In particular, (2.29) holds now according to Remark 2.10.
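To make the conclusions (2.42) and (2.44) concrete, here is a minimal numerical sketch in a toy setting that is not taken from the paper: i.i.d. observations $x_i \sim N(\theta^*, 1)$, the quasi-log-likelihood $H_T(\theta) = -\frac12 \sum_i (x_i - \theta)^2$, scaling $b_T = T$, and a flat prior on a bounded interval. In this model $\log Z_T(u) = \Delta_T u - \frac12 \Gamma u^2$ holds exactly with $\Gamma = 1$, so both estimators should agree with $\Gamma^{-1}\Delta_T$ up to discretisation error. The function name `toy_quasi_bayes` and all of its parameters are hypothetical.

```python
import math
import random


def toy_quasi_bayes(T=400, theta_star=0.3, half_width=2.0, n_grid=4001, seed=0):
    """Toy check of quasi-Bayes vs. quasi-MLE in local coordinates.

    Hypothetical model (an assumption, not from the paper):
    x_i ~ N(theta*, 1) i.i.d., H_T(theta) = -0.5 * sum((x_i - theta)^2),
    flat prior on Theta = (theta* - half_width, theta* + half_width),
    local parameter u = sqrt(T) * (theta - theta*).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(theta_star, 1.0) for _ in range(T)]

    # For this model log Z_T(u) = Delta_T * u - 0.5 * Gamma * u^2 exactly.
    delta_T = sum(x - theta_star for x in xs) / math.sqrt(T)
    gamma = 1.0

    # Grid on U_T = {u : theta* + u / sqrt(T) in Theta}.
    u_max = math.sqrt(T) * half_width
    du = 2.0 * u_max / (n_grid - 1)
    us = [-u_max + i * du for i in range(n_grid)]
    log_z = [delta_T * u - 0.5 * gamma * u * u for u in us]

    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(log_z)
    w = [math.exp(v - m) for v in log_z]

    # Ratio of Riemann sums; the grid spacing and the flat prior cancel.
    u_bayes = sum(u * wi for u, wi in zip(us, w)) / sum(w)
    # Grid maximiser of Z_T, used here as a stand-in for the quasi-MLE.
    u_mle = us[log_z.index(m)]
    return delta_T, gamma, u_mle, u_bayes, du


delta_T, gamma, u_mle, u_bayes, du = toy_quasi_bayes()
print(f"Gamma^-1 Delta_T = {delta_T / gamma:.6f}")
print(f"u_mle            = {u_mle:.6f}")
print(f"u_bayes          = {u_bayes:.6f}")
```

In this exactly quadratic case the agreement is a matter of arithmetic rather than asymptotics; Theorems 2.12 and 2.13 assert the analogous statement in the limit for random fields that are only locally asymptotically quadratic.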

Further simpliﬁcation in ergodic statistics

When the limit $Y$ of $Y_T$ is deterministic, some more simplification of the theory is possible. In this section, we suppose that the random field $Y$ and the $p \times p$ positive-definite symmetric matrix $\Gamma$ are deterministic. Let $L$ be a positive number. We will consider the following conditions.

[U1] $\Gamma$ is positive-definite and, in addition, there is a positive number $\chi$ such that $Y(\theta) = Y(\theta) - Y(\theta^*) \le -\chi\,|\theta - \theta^*|^2$ for all $\theta \in \Theta$.

[U2] The numbers $\alpha$, $\beta_1$, $\beta_2$ and $\rho_2$ satisfy the inequalities

$$0 < 2\alpha < \rho_2, \quad \beta_2 \ge 0, \quad 1 - 2\beta_2 - \rho_2 > 0, \quad 0 < \beta_1 < 1/2, \tag{3.1}$$

and the following conditions are fulfilled.
(i) For some $M_1 > L$, $\sup_{T>0} \big\| \Delta_T \big\|_{M_1} < \infty$.
(ii) For $M_2 = L(1 - 2\beta_2 - \rho_2)^{-1}$, $\sup_{T>0} \big\| \sup_{\theta \in U_T \setminus U(\theta^*,\, b_T^{-\alpha/2})} b_T^{\frac12 - \beta_2} \big| Y_T(\theta) - Y(\theta) \big| \big\|_{M_2} < \infty$.
(iii) For some $M_3 > L\beta_1^{-1}$, $\sup_{T>0} \big\| \sup_{u \in U_T:\, |u| \le 1} \big| \Gamma_T(\theta^* + \delta u) - \Gamma_T(\theta^*) \big| \big\|_{M_3} = O(\delta)$ $(\delta \downarrow 0)$.
(iv) For some $M_4 > L(2\beta_1)^{-1}(1 - \alpha)$, $\sup_{T>0} \big\| b_T^{\beta_1} \big| \Gamma_T(\theta^*) - \Gamma \big| \big\|_{M_4} < \infty$.

Remark 3.1. Condition [U1] is almost trivial because the function $Y$ should be of class $C^2$ on $\Theta$ and continuous on the compact set $\bar\Theta$, and then local non-degeneracy of the information implies the global identifiability.

Theorem 3.2.

Suppose that Conditions [U1] and [U2] are fulfilled for a positive constant $L$. Then there exists a constant $C_L$ such that

$$P\bigg[ \sup_{u \in V_T(r)} Z_T(u) \ge \exp\big( -2^{-1} r^{2-\rho_2} \big) \bigg] \le \frac{C_L}{r^L}$$

for all $T \in \mathbb{T}$ and $r > 0$. Here the supremum on the empty set should read $-\infty$ by convention.

Proof. Choose a positive constant $\rho_1$ such that

$$0 < \rho_1 < \min\big\{ 1,\ \alpha/(1-\alpha),\ 2\beta_1/(1-\alpha) \big\}, \quad \rho_1 \le \rho_2,$$
$$M_1 \ge M_1' := L(1 - \rho_1)^{-1}, \quad M_3 \ge M_3' := L(\beta_1 - \rho_1)^{-1}, \quad M_4 \ge M_4' := L\big( 2\beta_1(1-\alpha)^{-1} - \rho_1 \big)^{-1} \tag{3.2}$$

for $M_1$, $M_3$, $M_4$ given in Condition [U2]. Such a $\rho_1$ exists. It is sufficient to verify the conditions of Theorem 2.3. Condition [S1] is fulfilled by [U2] and a choice of $\rho_1$ in (3.2). Condition [S2] (i-1) is satisfied with [U1]. Conditions (i-2) and (ii) of [S2] are trivial because $\chi$ is a deterministic positive number and $\Gamma$ is positive-definite and deterministic in the present situation, respectively. Conditions (i)-(iv) of [S3] are verified by [U2] with $(M_1', M_2, M_3', M_4')$ for $(M_1, M_2, M_3, M_4)$ in [S3].

$H_T$ is characterized by (2.16). Theorem 2.6 is rephrased as follows with the trivial $\sigma$-field for $\mathcal{G}$.

Theorem 3.3.

Let $L > p > 0$. Suppose that Conditions [U1] and [U2] are satisfied and that

$$\Delta_T \to^d \Delta \tag{3.3}$$

as $T \to \infty$. Then
(a) $\hat u^M_T - \Gamma^{-1}\Delta_T \to^p 0$ as $T \to \infty$.
(b) $E\big[ f(\hat u^M_T) \big] \to E\big[ f(\hat u) \big]$ as $T \to \infty$ for any $f \in C(\mathbb{R}^p)$ satisfying $\limsup_{|u| \to \infty} |u|^{-p} |f(u)| < \infty$.

Proof. Take $\rho_1$ as in (3.2) and apply Theorem 2.6.

Consider the quasi-Bayesian estimator $\hat\theta^B_T$ defined by (2.26). We can rephrase Theorem 2.8 as follows.

Theorem 3.4.

Let $p \ge 1$ and $L > (p+1) \vee 2$. Suppose that Conditions [U1] and [U2] are satisfied and that the convergence (3.3) holds as $T \to \infty$. Moreover, suppose that there exist positive constants $q$, $c_0$, $\delta$, $T_0 \in \mathbb{T}$ and $c_1$ such that $q > p$ and the inequality (2.29) holds for all $u \in U(0,\delta)$. Then
(a) $\hat u^B_T - \Gamma^{-1}\Delta_T \to^p 0$ as $T \to \infty$.
(b) $E\big[ f(\hat u^B_T) \big] \to E\big[ f(\hat u) \big]$ as $T \to \infty$ for any $f \in C(\mathbb{R}^p)$ satisfying $\limsup_{|u| \to \infty} |u|^{-p} |f(u)| < \infty$.

As a corollary of Theorems 3.3 and 3.4, we obtain the following result.

Theorem 3.5. Suppose that Conditions [U1] and [T2] are satisfied and that the convergence (3.3) holds as $T \to \infty$. Then
(a) $\hat u^{A}_T - \Gamma^{-1}\Delta_T \to^p 0$ as $T \to \infty$ for $A \in \{M, B\}$.
(b) $E\big[ f(\hat u^{A}_T) \big] \to E\big[ f(\hat u) \big]$ as $T \to \infty$ for $A \in \{M, B\}$ and any $f \in C_p(\mathbb{R}^p)$.

References

[1] Clinet, S., Yoshida, N.: Statistical inference for ergodic point processes and application to limit order book. arXiv preprint arXiv:1512.01899 (2015)
[2] Eguchi, S., Masuda, H.: Schwarz type model comparison for LAQ models. arXiv preprint arXiv:1606.01627 (2016)
[3] Ibragimov, I.A., Has'minskiĭ, R.Z.: The asymptotic behavior of certain statistical estimates in the smooth case. I. Investigation of the likelihood ratio. Teor. Verojatnost. i Primenen., 469–486 (1972)
[4] Ibragimov, I.A., Has'minskiĭ, R.Z.: Asymptotic behavior of certain statistical estimates. II. Limit theorems for a posteriori density and for Bayesian estimates. Teor. Verojatnost. i Primenen., 78–93 (1973)
[5] Ibragimov, I.A., Has'minskiĭ, R.Z.: Statistical estimation, Applications of Mathematics, vol. 16. Springer-Verlag, New York (1981). Asymptotic theory, Translated from the Russian by Samuel Kotz
[6] Inatsugu, H., Yoshida, N.: Global jump filters and quasi-likelihood analysis for volatility. Annals of the Institute of Statistical Mathematics: updated arXiv:1806.10706v3 pp. 1–44 (2021)
[7] Kamatani, K., Uchida, M.: Hybrid multi-step estimators for stochastic differential equations based on sampled data. Statistical Inference for Stochastic Processes (2), 177–204 (2014)
[8] Kinoshita, Y., Yoshida, N.: Penalized quasi likelihood estimation for variable selection. arXiv preprint arXiv:1910.12871 (2019)
[9] Kutoyants, Y.: Identification of dynamical systems with small noise, Mathematics and its Applications, vol. 300. Kluwer Academic Publishers Group, Dordrecht (1994)
[10] Kutoyants, Y.A.: Parameter estimation for stochastic processes, Research and Exposition in Mathematics, vol. 6. Heldermann Verlag, Berlin (1984). Translated from the Russian and edited by B. L. S. Prakasa Rao
[11] Kutoyants, Y.A.: Statistical inference for spatial Poisson processes, Lecture Notes in Statistics, vol. 134. Springer-Verlag, New York (1998)
[12] Kutoyants, Y.A.: Statistical inference for ergodic diffusion processes. Springer Series in Statistics. Springer-Verlag London Ltd., London (2004)
[13] Masuda, H.: Approximate self-weighted LAD estimation of discretely observed ergodic Ornstein-Uhlenbeck processes. Electronic Journal of Statistics, 525–565 (2010)
[14] Masuda, H.: Convergence of Gaussian quasi-likelihood random fields for ergodic Lévy driven SDE observed at high frequency. The Annals of Statistics (3), 1593–1641 (2013)
[15] Masuda, H.: Parametric estimation of Lévy processes. In: Lévy Matters IV, pp. 179–286. Springer (2015)
[16] Masuda, H., Shimizu, Y.: Moment convergence in regularized estimation under multiple and mixed-rates asymptotics. Mathematical Methods of Statistics (2), 81–110 (2017)
[17] Nomura, R., Uchida, M.: Adaptive Bayes estimators and hybrid estimators for small diffusion processes based on sampled data. Journal of the Japan Statistical Society (2), 129–154 (2016)
[18] Ogihara, T., Yoshida, N.: Quasi-likelihood analysis for the stochastic differential equation with jumps. Stat. Inference Stoch. Process. (3), 189–229 (2011). DOI 10.1007/s11203-011-9057-z. URL http://dx.doi.org/10.1007/s11203-011-9057-z
[19] Ogihara, T., Yoshida, N.: Quasi-likelihood analysis for nonsynchronously observed diffusion processes. Stochastic Processes and their Applications (9), 2954–3008 (2014)
[20] Ogihara, T., Yoshida, N.: Quasi likelihood analysis of point processes for ultra high frequency data. arXiv preprint arXiv:1512.01619 (2015)
[21] Shimizu, Y.: Threshold estimation for stochastic processes with small noise. arXiv preprint arXiv:1502.07409 (2015)
[22] Shimizu, Y.: Moment convergence of regularized least-squares estimator for linear regression model. Annals of the Institute of Statistical Mathematics (5), 1141–1154 (2017)
[23] Suzuki, T., Yoshida, N.: Penalized least squares approximation methods and their applications to stochastic processes. Japanese Journal of Statistics and Data Science pp. 1–29 (2020)
[24] Uchida, M.: Contrast-based information criterion for ergodic diffusion processes from discrete observations. Ann. Inst. Statist. Math. (1), 161–187 (2010). DOI 10.1007/s10463-009-0245-1. URL http://dx.doi.org/10.1007/s10463-009-0245-1
[25] Uchida, M., Yoshida, N.: Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Process. Appl. (8), 2885–2924 (2012). DOI 10.1016/j.spa.2012.04.001. URL http://dx.doi.org/10.1016/j.spa.2012.04.001
[26] Uchida, M., Yoshida, N.: Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Processes and their Applications (8), 2885–2924 (2012)
[27] Uchida, M., Yoshida, N.: Quasi likelihood analysis of volatility and nondegeneracy of statistical random field. Stochastic Process. Appl. (7), 2851–2876 (2013). DOI 10.1016/j.spa.2013.04.008. URL http://dx.doi.org/10.1016/j.spa.2013.04.008
[28] Uchida, M., Yoshida, N.: Adaptive Bayes type estimators of ergodic diffusion processes from discrete observations. Statistical Inference for Stochastic Processes (2), 181–219 (2014)
[29] Umezu, Y., Shimizu, Y., Masuda, H., Ninomiya, Y.: AIC for non-concave penalized likelihood method. arXiv preprint arXiv:1509.01688 (2015)
[30] Yoshida, N.: Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Ann. Inst. Statist. Math. (3), 431–479 (2011). DOI 10.1007/s10463-009-0263-z. URL http://dx.doi.org/10.1007/s10463-009-0263-z
[31] Yoshida, N.: Partial quasi-likelihood analysis. Japanese Journal of Statistics and Data Science 1