Simplified quasi-likelihood analysis for a locally asymptotically quadratic random field
arXiv [math.ST]
Nakahiro Yoshida
Graduate School of Mathematical Sciences, University of Tokyo †
Japan Science and Technology Agency CREST
The Institute of Statistical Mathematics
February 22, 2021
Summary
The asymptotic decision theory by Le Cam and Hájek has been given a lucid perspective by the Ibragimov-Has'minskii theory on convergence of the likelihood random field. Their scheme has been applied to stochastic processes by Kutoyants, and today this plot is called the IHK program. This scheme ensures that asymptotic properties of an estimator follow directly from the convergence of the random field if a large deviation estimate exists. The quasi-likelihood analysis (QLA) proved a polynomial type large deviation (PLD) inequality to get through a bottleneck of the program. A conclusion of the QLA is that if the quasi-likelihood random field is asymptotically quadratic and if a key index reflecting the identifiability of the random field is non-degenerate, then the PLD inequality is always valid, and as a result, the IHK program can run. Many studies have already taken advantage of the QLA theory. However, quite a few of them still use it in an inefficient way. The aim of this paper is to provide a reformed and simplified version of the QLA and to improve accessibility to the theory. As an example of the effects of the theory based on the PLD, the user can obtain asymptotic properties of the quasi-Bayesian estimator by only verifying non-degeneracy of a key index.
Keywords and phrases
Ibragimov-Has'minskii theory, quasi-likelihood analysis, polynomial type large deviation, random field, asymptotic decision theory, non-ergodic statistics.
The asymptotic decision theory by Le Cam and Hájek has been given a lucid perspective by the Ibragimov-Has'minskii theory ([3, 4, 5]) on convergence of the likelihood random field.

∗ This work was in part supported by Japan Science and Technology Agency CREST JPMJCR14D7; Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research No. 17H01702 (Scientific Research); and by a Cooperative Research Program of the Institute of Statistical Mathematics.
† Graduate School of Mathematical Sciences, University of Tokyo: 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan.

As a conclusion of the QLA theory, if the quasi-likelihood random field is locally asymptotically quadratic (LAQ) and if a key index reflecting the identifiability of the random field is non-degenerate, then the polynomial type large deviation inequality is always valid, and as a result, the IHK program can run.

Since no ad hoc model-dependent method is necessary, the QLA is universal and applies to various dependent models. Many studies are based on the QLA and take advantage of it. These applications include sampled ergodic diffusion processes (Yoshida [30]), adaptive estimation for diffusion processes (Uchida and Yoshida [25]), adaptive Bayes type estimators for ergodic diffusion processes (Uchida and Yoshida [28]), approximate self-weighted LAD estimation of discretely observed ergodic Ornstein-Uhlenbeck processes (Masuda [13]), parametric estimation of Lévy processes (Masuda [15]), Gaussian quasi-likelihood random fields for ergodic Lévy driven SDE (Masuda [14]), and ergodic point processes for limit order book (Clinet and Yoshida [1]). Thanks to its flexibility, the QLA is also applicable to non-ergodic statistics: volatility parameter estimation in regular sampling of finite time horizon (Uchida and Yoshida [27]) and in non-synchronous sampling (Ogihara and Yoshida [19]), and a non-ergodic point process regression model (Ogihara and Yoshida [20]).
Analysis of complex algorithms is possible by relying on the universal design of the QLA: hybrid multi-step estimators (Kamatani and Uchida [7]), adaptive Bayes estimators and hybrid estimators for small diffusion processes based on sampled data (Nomura and Uchida [17]). Information criteria, sparse estimation and regularization methods have recently been understood in the framework of the QLA: contrast-based information criterion for diffusion processes (Uchida [24]), AIC for non-concave penalized likelihood method (Umezu et al. [29]), Schwarz type model comparison for LAQ models (Eguchi and Masuda [2]), moment convergence of regularized least-squares estimator for linear regression model (Shimizu [22]), moment convergence in regularized estimation under multiple and mixed-rates asymptotics (Masuda and Shimizu [16]), penalized method and polynomial type large deviation inequality (Kinoshita and Yoshida [8]) and the related Suzuki and Yoshida [23]. Jump filtering problems: jump diffusion processes (Ogihara and Yoshida [18]), threshold estimation for stochastic processes with small noise (Shimizu [21]), global jump filters (Inatsugu and Yoshida [6]). Partial quasi-likelihood analysis: Yoshida [31]. Such a variety of applications demonstrates the universality of the framework of the QLA. Since the IHK program runs there, we can obtain limit theorems and the $L^p$-boundedness of the QL estimators (the quasi-maximum likelihood estimator and the quasi-Bayesian estimator), which is indispensable to develop statistical theories.

The term "quasi-likelihood" is not in the sense of GLM. We use "quasi-likelihood analysis" because statistical inference for sampled stochastic processes cannot avoid a quasi-likelihood function for estimation. The method is relatively new, but not because of "quasi". The difficulty in large deviation estimates already existed in the likelihood analysis for stochastic processes.

The essence of the QLA is the polynomial type large deviation inequality that was proved in a general setting (Yoshida [30]). Since the LAQ property quite often appears when the model is differentiable, Yoshida [30] was based on this structure. Because of it, the limit distribution of the associated estimators has an explicit expression. The paper [30] gave it, but because of the general way it is written, quite a few users tend to avoid following the passage after the PLD theorem and try to reconstruct it in each situation. However, such a task is in fact unnecessary. Besides, four times differentiability is often assumed in many applications of the QLA. This may be only because a handy condition in [30] assumed an estimate of the supremum of the third-order derivative of the quasi-log likelihood random field $H_T$, though the paper gave a condition ([A1′]) to treat $H_T$ of class $C^2$.

The aim of this paper is to provide a simplified version of the QLA theory directly connecting the assumptions with the limit theorems in order to improve accessibility to the theory. Essentially, the user is only requested to verify non-degeneracy of a key index, and this task is trivial in particular in ergodic statistics. We will give handy conditions for the quasi-likelihood random field of class $C^2$, based on [30], in order to reach the asymptotic properties of the estimators in a single leap. Some assumptions in [30] are rearranged and replaced by simpler-looking ones in this paper. This simplification will serve future progress, e.g., in the analysis of regularization methods.
The LAQ property we adopted here is just one principle of separation, and it is possible to develop a similar theory for a non-LAQ type random field; see Kinoshita and Yoshida [8] for a case of regularization.

A smart way of presenting the theory is to use the convergence of the quasi-likelihood random field $Z_T$ to a random field $Z$ in the function space $\hat{C}(\mathbb{R}^p)$, the separable Banach space of continuous functions $f$ on $\mathbb{R}^p$ satisfying $\lim_{|u|\to\infty} f(u) = 0$, equipped with the supremum norm. This plot is possible, but to carry it out, one needs a suitable measurable extension of $Z_T$ to the outside of the originally given local parameter space and an argument about tightness of random fields on the non-compact $\mathbb{R}^p$. In this article, we deliberately avoid this approach to give priority to simplicity. As a result, the presentation of the theory is now much more elementary than in Yoshida [30].

Given a probability space $(\Omega, \mathcal{F}, P)$ and a bounded open set $\Theta$ in $\mathbb{R}^p$, we consider a random field $H_T : \Omega \times \Theta \to \mathbb{R}$, a function measurable with respect to the product $\sigma$-field $\mathcal{F} \times \mathbb{B}(\Theta)$, $\mathbb{B}(\Theta)$ being the Borel $\sigma$-field of $\Theta$. Because of the assumptions below about the continuity of $H_T$ and the separability of $\Theta$, this is equivalent to the measurability of the function $H_T(\cdot, \theta)$ for each $\theta \in \Theta$. Here $\mathbb{T}$ is a subset of $\mathbb{R}_+ = [0, \infty)$ satisfying $\sup \mathbb{T} = \infty$. We suppose that $H_T$ is continuous and of class $C^2$, that is, for every $\omega \in \Omega$, the mapping $\Theta \ni \theta \mapsto H_T(\theta) \in \mathbb{R}$ is of class $C^2$, and that $H_T$ is continuously extended to $\partial\Theta$. We shall present a simplified version of the polynomial type large deviation inequality of Yoshida [30] under a handy set of sufficient conditions.

Let $\theta^* \in \Theta$. Define $\Delta_T$ and $\Gamma_T(\theta)$ by

$$\Delta_T = \partial_\theta H_T(\theta^*)\, a_T \quad\text{and}\quad \Gamma_T(\theta) = -a_T^\star\, \partial_\theta^2 H_T(\theta)\, a_T, \tag{2.1}$$

where $\star$ denotes the matrix transpose and $a_T \in GL(\mathbb{R}^p)$ is a scaling matrix such that $|a_T| \to 0$ as $T \to \infty$. We suppose that $\Gamma$ is a $p \times p$ symmetric random matrix. Let $U(\theta, r) = \{\theta' \in \mathbb{R}^p;\ |\theta' - \theta| < r\}$ for $\theta \in \Theta$ and $r > 0$. There exists a positive constant $r_0$ such that $U(\theta^*, r_0) \subset \Theta$.

The minimum and maximum eigenvalues of a symmetric matrix $M$ are denoted by $\lambda_{\min}(M)$ and $\lambda_{\max}(M)$, respectively. Let $b_T = \{\lambda_{\min}(a_T^\star a_T)\}^{-1}$. In particular, $b_T \to \infty$ as $T \to \infty$. Moreover, we assume that

$$b_T^{-1} \le \lambda_{\max}(a_T^\star a_T) \le C_0\, b_T^{-1} \qquad (T \in \mathbb{T}) \tag{2.2}$$

for some constant $C_0 \in [1, \infty)$. A typical case is $n$ for $b_T$ and $n^{-1/2} I_p$ for $a_T$, where $I_p$ is the identity matrix.

Remark 2.1. In an ergodic diffusion model, the parameter $\theta_1$ of the diffusion coefficient and the parameter $\theta_2$ of the drift coefficient have different convergence rates in estimation with high frequency data. Condition (2.2) may then seem restrictive, but this impression is incorrect. The random field $H_n$ is not necessarily the same as the quasi-likelihood function $\Psi_n$ used for estimation in reality. The random field $H_n$ is rather "living in the proof" in various manners. Consider a joint quasi-maximum likelihood estimator $(\hat\theta_{1,n}, \hat\theta_{2,n})$ for $(\theta_1, \theta_2)$. To analyze the asymptotic behavior of $\hat\theta_{1,n}$, the random field $H_n(\theta_1) = \Psi_n(\theta_1, \hat\theta_{2,n})$ can be used. $H_n(\theta_1)$ is estimated by taking the supremum over the second argument of $\Psi_n$ at some stage. For $\hat\theta_{2,n}$, one can switch $H_n$ to a different random field $H_n(\theta_2) = \Psi_n(\hat\theta_{1,n}, \theta_2)$. Such a stepwise application of the QLA in the present article's form can be observed in many studies; see Yoshida [30], Uchida and Yoshida [26, 28] and the papers listed in the Introduction.

Define $Y_T : \Omega \times \Theta \to \mathbb{R}$ by

$$Y_T(\theta) = \frac{1}{b_T}\big\{ H_T(\theta) - H_T(\theta^*) \big\} \qquad (\theta \in \Theta)$$

for $T \in \mathbb{T}$. Let $Y : \Omega \times \Theta \to \mathbb{R}$ be a continuous random field. Let $L > 0$.

[S1] Parameters $\alpha$, $\beta_1$, $\beta_2$, $\rho_1$ and $\rho_2$ satisfy the following inequalities:
$$0 < \beta_1 < 1/2, \quad 0 < \rho_1 < \min\big\{1,\ \alpha/(1-\alpha),\ 2\beta_1/(1-\alpha)\big\}, \quad 0 < 2\alpha < \rho_2, \quad \beta_2 \ge 0, \quad 1 - 2\beta_2 - \rho_2 > 0.$$

[S2] (i) There exists a positive random variable $\chi$ and the following conditions are fulfilled.
(i-1) $Y(\theta) = Y(\theta) - Y(\theta^*) \le -\chi\,|\theta - \theta^*|^2$ for all $\theta \in \Theta$.
(i-2) For some constant $C_L$, it holds that $P\big[\chi \le r^{-(\rho_2 - 2\alpha)}\big] \le C_L\, r^{-L}$ $(r > 0)$.
(ii) For some constant $C_L$, it holds that $P\big[\lambda_{\min}(\Gamma) < r^{-\rho_1}\big] \le C_L\, r^{-L}$ $(r > 0)$.

Let $\beta = \alpha/(1-\alpha)$. Let $\|V\|_p = \big(E[|V|^p]\big)^{1/p}$ for $p > 0$ and a random variable $V$.

[S3] (i) For $M_1 = L(1-\rho_1)^{-1}$, $\ \sup_{T \in \mathbb{T}} \big\| |\Delta_T| \big\|_{M_1} < \infty$.
(ii) For $M_2 = L(1 - 2\beta_2 - \rho_2)^{-1}$,
$$\sup_{T \in \mathbb{T}} \Big\| \sup_{\theta \in \Theta \setminus U(\theta^*,\, b_T^{-\alpha/2})} b_T^{\frac12 - \beta_2}\, \big| Y_T(\theta) - Y(\theta) \big| \Big\|_{M_2} < \infty.$$
(iii) For $M_3 = L(\beta - \rho_1)^{-1}$,
$$\sup_{T \in \mathbb{T}} \Big\| \sup_{u \in U_T:\ |u| \le 1} \big| \Gamma_T(\theta^* + \delta u) - \Gamma_T(\theta^*) \big| \Big\|_{M_3} = O(\delta) \qquad (\delta \downarrow 0).$$
(iv) For $M_4 = L\big(2\beta_1 (1-\alpha)^{-1} - \rho_1\big)^{-1}$,
$$\sup_{T \in \mathbb{T}} \big\| b_T^{\beta_1} \big| \Gamma_T(\theta^*) - \Gamma \big| \big\|_{M_4} < \infty.$$

Remark 2.2. (i) In the above conditions, each constant $C_L$ is independent of $r$ and $T$, but may depend on the parameters appearing in [S1] as well as $L$. (ii) In applications, we often need to estimate the supremum of a sequence of martingales depending on $\theta$ to verify the above moment conditions. Use of Sobolev's embedding inequality is a simple solution. (iii) The random matrix $\Gamma$ is positive-definite a.s. if [S2] (ii) is satisfied.

Let $U_T = \{u \in \mathbb{R}^p;\ \theta^* + a_T u \in \Theta\}$ and $V_T(r) = \{u \in U_T;\ |u| \ge r\}$ for $r > 0$. Define the random field $Z_T$ on $U_T$ by

$$Z_T(u) = \exp\big( H_T(\theta^* + a_T u) - H_T(\theta^*) \big)$$

for $u \in U_T$. Following Yoshida [30], we give a polynomial type large deviation inequality for the random field $Z_T$.

Theorem 2.3.
Given a positive constant $L$, suppose that [S1], [S2] and [S3] are fulfilled. Then there exists a constant $C_L$ such that

$$P\Big[ \sup_{u \in V_T(r)} Z_T(u) \ge \exp\big( -2^{-1} r^{2 - (\rho_1 \vee \rho_2)} \big) \Big] \le \frac{C_L}{r^L} \tag{2.3}$$

for all $r > 0$ and $T \in \mathbb{T}$. The supremum of the empty set should read $-\infty$.

Proof. Suppose that the constants $\alpha, \beta_1, \beta_2, \rho_1, \rho_2$ satisfy Condition [S1]. We will apply Theorem 1 of [30] with $\rho = 2$ for $H_T$ of class $C^2$. According to Section 3.1 of [30], it suffices to verify Conditions [A1′] and [A2]-[A6] therein. Condition [A4] of [30] with $\rho = 2$ and the condition that $\alpha \in (0, 1)$ are satisfied under [S1] since $\rho_2 < 1$. Condition [A1′] of [30] requires the estimate

$$\sup_{T > 0} P\big[ S'_T(r)^c \big] \le \frac{C_L}{r^L} \qquad (r > 0) \tag{2.4}$$

for some constant $C_L$, where the event $S'_T(r)$ is defined by

$$S'_T(r) = \Big\{ \sup_{h:\ \theta^* + h \in \Theta,\ b_T^{-1/2} r \le |h| \le C_0^{1/2} b_T^{-\alpha/2}} \big| \Gamma_T(\theta^* + h) - \Gamma \big| < r^{-\rho_1} \Big\}.$$

To verify (2.4), we may assume $r \le C_0^{1/2}\, b_T^{(1-\alpha)/2}$, equivalently,

$$b_T^{-1} \le C_0^{1/(1-\alpha)}\, r^{-2/(1-\alpha)}. \tag{2.5}$$

Otherwise, $S'_T(r)^c = \emptyset$, and there is nothing to show. Furthermore, we may assume $r$ is sufficiently large (in particular, $r \ge 1$) to show Inequality (2.4), by changing $C_L$ if necessary. We have

$$P\big[ S'_T(r)^c \big] \le P_1(T, r) + P_2(T, r), \tag{2.6}$$

where

$$P_1(T, r) = P\Big[ \sup_{h:\ |h| \le C_0^{1/\{2(1-\alpha)\}} r^{-\beta}} \big| \Gamma_T(\theta^* + h) - \Gamma_T(\theta^*) \big| \ge \tfrac12 r^{-\rho_1} \Big]$$

and

$$P_2(T, r) = P\Big[ \big| \Gamma_T(\theta^*) - \Gamma \big| \ge \tfrac12 r^{-\rho_1} \Big]\, 1_{\{ b_T^{-1} \le C_0^{1/(1-\alpha)} r^{-2/(1-\alpha)} \}}$$

in view of (2.5); indeed, under (2.5), $C_0^{1/2} b_T^{-\alpha/2} \le C_0^{1/\{2(1-\alpha)\}} r^{-\beta}$ with $\beta = \alpha/(1-\alpha)$. For sufficiently large $r$, by Condition [S3] (iii),

$$\sup_{T \in \mathbb{T}} P_1(T, r) \lesssim r^{M_3 \rho_1} \sup_{T \in \mathbb{T}} E\Big[ \sup_{h:\ |h| \le C_0^{1/\{2(1-\alpha)\}} r^{-\beta}} \big| \Gamma_T(\theta^* + h) - \Gamma_T(\theta^*) \big|^{M_3} \Big] \lesssim r^{-M_3(\beta - \rho_1)} = r^{-L}. \tag{2.7}$$

Next, by Condition [S3] (iv), we have

$$\sup_{T \in \mathbb{T}} P_2(T, r) \lesssim \sup_{T \in \mathbb{T}} \Big( b_T^{-M_4 \beta_1}\, r^{M_4 \rho_1}\, 1_{\{ b_T^{-1} \le C_0^{1/(1-\alpha)} r^{-2/(1-\alpha)} \}} \Big) \lesssim r^{-M_4 (2\beta_1/(1-\alpha) - \rho_1)} = r^{-L}. \tag{2.8}$$

From (2.6), (2.7) and (2.8), we obtain (2.4); therefore [A1′] of [30] is verified.

Condition [A6] of [30] follows from Condition [S3] (i) and Condition [S3] (ii). Condition [S2] (i) ensures Conditions [A3] for $\rho = 2$ and [A5] of [30]. Moreover, [S2] (ii) verifies [A2] of [30]. Now, as already mentioned, we apply Theorem 1 of [30] to $Z_T$ for $\rho = 2$ in order to obtain (2.3).

Define $r_T(u)$ by

$$r_T(u) = \begin{cases} H_T(\theta^* + a_T u) - H_T(\theta^*) - \Big( \Delta_T[u] - \dfrac12 \Gamma[u^{\otimes 2}] \Big) & (u \in U_T) \\[4pt] 1 & (u \notin U_T) \end{cases} \tag{2.9}$$

Proposition 2.4.
Suppose that

$$\sup_{u \in U_T \cap U(0, K)} \big| \Gamma_T(\theta^* + a_T u) - \Gamma \big| \to^p 0 \qquad (T \to \infty) \tag{2.10}$$

for every $K > 0$. Then the random field $Z_T$ is locally asymptotically quadratic at $\theta^*$, that is,

$$Z_T(u) = \exp\Big( \Delta_T[u] - \frac12 \Gamma[u^{\otimes 2}] + r_T(u) \Big) \qquad (u \in U_T) \tag{2.11}$$

and $r_T(u) \to^p 0$ as $T \to \infty$ for every $u \in \mathbb{R}^p$.

Proof. By definition of $r_T(u)$, Equation (2.11) holds for $u \in U_T$. For each $u \in \mathbb{R}^p$, there is a number $T_u$ such that $a_T u \in U(0, r_0)$ for all $T \ge T_u$. Then, by Taylor's formula for $H_T$ around $\theta^*$, $r_T(u)$ admits the expression

$$r_T(u) = -\int_0^1 (1 - s) \big\{ \Gamma_T(\theta^* + s a_T u) - \Gamma \big\}\, ds\, [u^{\otimes 2}]. \tag{2.12}$$

Therefore $r_T(u) \to^p 0$ as $T \to \infty$ by (2.10).

Remark 2.5. We have

$$\sup_{u \in U_T \cap U(0, K)} \big| \Gamma_T(\theta^* + a_T u) - \Gamma_T(\theta^*) \big| \to^p 0 \tag{2.13}$$

as $T \to \infty$ under [S3] (iii) since

$$\limsup_{T \to \infty} \Big\| \sup_{u \in U_T \cap U(0, K)} \big| \Gamma_T(\theta^* + a_T u) - \Gamma_T(\theta^*) \big| \Big\|_{M_3} \le \limsup_{T \to \infty} \Big\| \sup_{v \in \mathbb{R}^p:\ |v| \le 1} \big| \Gamma_T(\theta^* + K |a_T| v) - \Gamma_T(\theta^*) \big| \Big\|_{M_3} \le \limsup_{T \to \infty} O(K |a_T|) = 0.$$

On the other hand,

$$\Gamma_T(\theta^*) - \Gamma \to^p 0 \tag{2.14}$$

as $T \to \infty$ under [S3] (iv). Therefore, $Z_T$ is locally asymptotically quadratic at $\theta^*$ if [S3] (iii) and (iv) are satisfied, since the convergence (2.10) holds under [S3] (iii) and (iv), though these conditions are stronger than necessary for (2.10).

Let $\Delta$ be a $p$-dimensional random vector on some extension of $(\Omega, \mathcal{F}, P)$. Define a random field $Z$ on $\mathbb{R}^p$ by

$$Z(u) = \exp\Big( \Delta[u] - \frac12 \Gamma[u^{\otimes 2}] \Big) \tag{2.15}$$

for $u \in \mathbb{R}^p$. Let $\hat u = \Gamma^{-1} \Delta$.

Any measurable mapping $\hat\theta^M_T : \Omega \to \overline{\Theta}$ is called a quasi-maximum likelihood estimator (QMLE) for $H_T$ if

$$H_T(\hat\theta^M_T) = \max_{\theta \in \overline{\Theta}} H_T(\theta). \tag{2.16}$$

Since $H_T$ is continuous on the compact $\overline{\Theta}$, such a measurable function always exists, which is ensured by the measurable selection theorem. Uniqueness of $\hat\theta^M_T$ is not assumed. Let $\hat u^M_T = a_T^{-1}(\hat\theta^M_T - \theta^*)$ for the QMLE $\hat\theta^M_T$.

Let $\mathcal{G}$ be a $\sigma$-field such that $\sigma[\Gamma] \subset \mathcal{G} \subset \mathcal{F}$. It is said that a sequence $(V_T)_{T \in \mathbb{T}}$ of random variables taking values in a metric space $S$ equipped with the Borel $\sigma$-field converges $\mathcal{G}$-stably to an $S$-valued random variable $V_\infty$ defined on an extension of $(\Omega, \mathcal{F}, P)$ if $(V_T, \Psi) \to^d (V_\infty, \Psi)$ as $T \to \infty$ for any $\mathcal{G}$-measurable random variable $\Psi$. The $\mathcal{G}$-stable convergence is denoted by $\to^{d_s(\mathcal{G})}$.

Theorem 2.6.
Let $L > p > 0$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that

$$\Delta_T \to^{d_s(\mathcal{G})} \Delta \tag{2.17}$$

as $T \to \infty$. Then

(a) As $T \to \infty$,
$$\hat u^M_T - \Gamma^{-1} \Delta_T \to^p 0. \tag{2.18}$$

(b) As $T \to \infty$,
$$E\big[ f(\hat u^M_T)\, \Phi \big] \to E\big[ f(\hat u)\, \Phi \big] \tag{2.19}$$
for any bounded $\mathcal{G}$-measurable random variable $\Phi$ and any $f \in C(\mathbb{R}^p)$ satisfying $\limsup_{|u| \to \infty} |u|^{-p} |f(u)| < \infty$.

Proof. As mentioned in Remark 2.5, the convergence (2.10) holds for every $K > 0$ under [S3] (iii) and (iv). Therefore, by the representation (2.12),

$$\sup_{u:\ |u| \le R} |r_T(u)| \to^p 0 \qquad (T \to \infty) \tag{2.20}$$

for every $R > 0$. The space $C(\overline{U(0, R)})$ of continuous functions on $\overline{U(0, R)}$ is equipped with the supremum norm. Combining the representation (2.11) of $Z_T$ with the convergences (2.17) and (2.20), by estimating the modulus of continuity of $\log Z_T$ on $\overline{U(0, R)}$, we obtain tightness of the family $\{ Z_T|_{\overline{U(0,R)}} \}_{T \ge T_0}$ for some $T_0 \in \mathbb{T}$, which yields the convergence

$$Z_T|_{\overline{U(0,R)}} \to^d Z|_{\overline{U(0,R)}} \tag{2.21}$$

in $C(\overline{U(0, R)})$ as $T \to \infty$ for every $R > 0$. The finite-dimensional convergence $Z_T \to^{d_f} Z$ is given by (2.17), (2.20) and (2.11).

Let $F$ be any closed set in $\mathbb{R}^p$. Then

$$\begin{aligned}
\limsup_{T \to \infty} P\big[ \hat u^M_T \in F \big]
&\le \limsup_{T \to \infty} P\big[ \hat u^M_T \in F \cap U(0, R) \big] + \limsup_{T \to \infty} P\big[ \hat u^M_T \in V_T(R) \big] \\
&\le \limsup_{T \to \infty} P\Big[ \sup_{u \in F \cap U(0,R)} Z_T(u) - \sup_{u \in F^c \cap U(0,R)} Z_T(u) \ge 0 \Big] + \limsup_{T \to \infty} P\Big[ \sup_{u \in V_T(R)} Z_T(u) \ge 1 \Big] \\
&\le P\Big[ \sup_{u \in F \cap U(0,R)} Z(u) - \sup_{u \in F^c \cap U(0,R)} Z(u) \ge 0 \Big] + \frac{C_L}{R^L}
\end{aligned} \tag{2.22}$$

by the convergence (2.21) and the polynomial type large deviation inequality (2.3) given by Theorem 2.3; note that $Z_T(0) = 1$, so the maximum of $Z_T$ is at least $1$. Let $R \to \infty$ in (2.22) to obtain

$$\limsup_{T \to \infty} P\big[ \hat u^M_T \in F \big] \le P\Big[ \sup_{u \in F} Z(u) - \sup_{u \in F^c} Z(u) \ge 0 \Big] \le P\big[ \hat u \in F \big]. \tag{2.23}$$

Here the positivity of $\Gamma$ given by [S2] (ii) (Remark 2.2) was used for the first inequality, and the last inequality is by the uniqueness of the maximum point of the random field $Z$ defined by (2.15). Inequality (2.23) shows the convergence $\hat u^M_T \to^d \hat u$ as $T \to \infty$.

From the convergence of $\hat u^M_T$, in particular $\hat\theta^M_T \to^p \theta^*$, and when $\hat\theta^M_T \in U(\theta^*, r_0)$, one has

$$\Delta_T = \int_0^1 \Gamma_T\big( \theta^* + s(\hat\theta^M_T - \theta^*) \big)\, ds\ \hat u^M_T$$

since $\partial_\theta H_T(\hat\theta^M_T) = 0$. Then we obtain (2.18) from [S3] (iii) and (iv). The $\mathcal{G}$-stable convergence

$$\hat u^M_T \to^{d_s(\mathcal{G})} \hat u \tag{2.24}$$

follows from (2.17).

As already used in the above argument,

$$P\big[ |\hat u^M_T| \ge r \big] \le P\Big[ \sup_{u \in V_T(r)} Z_T(u) \ge 1 \Big] \le \frac{C_L}{r^L} \tag{2.25}$$

for all $T \in \mathbb{T}$ and $r > 0$. Therefore,

$$\sup_{T \in \mathbb{T}} E\big[ |\hat u^M_T|^q \big] < \infty$$

for any constant $q$ such that $L > q > p$. This means the family $\{ f(\hat u^M_T) \}_{T \in \mathbb{T}}$ is uniformly integrable. Consequently, we obtain (2.19) from (2.24).

Remark 2.7. (i) The convergence (2.19) holds for non-bounded $\Phi$ if $\Phi$ has the dual integrability for $f(\hat u^M_T)$. For example, the convergence holds for $\Phi \in L^r(\mathcal{G})$ for some $r > 1$ if $\limsup_{|u| \to \infty} |u|^{-p(r-1)/r} |f(u)| < \infty$. (ii) The asymptotic equivalence (2.18) between $\hat u^M_T$ and $\Gamma^{-1} \Delta_T$ is called the first-order efficiency, in particular for the maximum likelihood estimator. This relation is useful when one considers a joint convergence of $\hat u^M_T$ with other variables. Such an asymptotic representation of the error is useful in the analysis of a model having multi-scaled parameters.

The quasi-likelihood analysis enables us to derive asymptotic properties of the Bayesian estimator, as well as the quasi-maximum likelihood estimator. The mapping

$$\hat\theta^B_T = \Big[ \int_\Theta \exp\big( H_T(\theta) \big)\, \varpi(\theta)\, d\theta \Big]^{-1} \int_\Theta \theta\, \exp\big( H_T(\theta) \big)\, \varpi(\theta)\, d\theta \tag{2.26}$$

is called a quasi-Bayesian estimator (QBE) with respect to the prior density $\varpi$. The QBE $\hat\theta^B_T$ takes values in the convex hull of $\Theta$. When $H_T$ is the log likelihood function, the QBE coincides with the Bayesian estimator with respect to the quadratic loss function. We will assume $\varpi$ is continuous and $0 < \inf_{\theta \in \Theta} \varpi(\theta) \le \sup_{\theta \in \Theta} \varpi(\theta) < \infty$.

Theorem 2.8. (I)
Let $L > 1$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Then

(a) As $T \to \infty$,
$$\hat u^B_T - \Gamma^{-1} \Delta_T \to^p 0. \tag{2.27}$$

(b) As $T \to \infty$,
$$\hat u^B_T \to^{d_s(\mathcal{G})} \hat u. \tag{2.28}$$

(II) Let $p \ge 0$ and $L > (p + 1) \vee 2$. Suppose that Conditions [S1], [S2] and [S3] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Moreover, suppose that there exist positive constants $q$, $c_0$, $\delta$ and $T_0 \in \mathbb{T}$ such that $q > p$ and

$$\sup_{T \ge T_0} E\big[ \big| H_T(\theta^* + a_T u) - H_T(\theta^*) \big|^q \big] \le c_0 |u|^q \tag{2.29}$$

for all $u \in U(0, \delta)$. Then (2.27) holds, and moreover,

$$E\big[ f(\hat u^B_T)\, \Phi \big] \to E\big[ f(\hat u)\, \Phi \big] \tag{2.30}$$

as $T \to \infty$ for any $\mathcal{G}$-measurable bounded random variable $\Phi$ and any $f \in C(\mathbb{R}^p)$ satisfying $\limsup_{|u| \to \infty} |u|^{-p} |f(u)| < \infty$.

Remark 2.9. In Theorem 2.8, we implicitly assume that $T_0$ is sufficiently large so that $U(0, \delta) \subset U_T$ for all $T \ge T_0$, and the left-hand side of (2.29) makes sense.

Remark 2.10. Condition (2.29) holds under any one of the following conditions:
(i) There exist constants $q > p$, $\delta > 0$ and $T_0 \in \mathbb{T}$ such that $\sup_{T \ge T_0} \sup_{\theta \in U_T \cap U(\theta^*, \delta)} \| \Gamma_T(\theta) \|_q < \infty$ and $\sup_{T \ge T_0} \| \Delta_T \|_q < \infty$.
(ii) $|\Gamma| \in L^q$, and $M_1$, $M_3$ and $M_4$ appearing in [S3] satisfy $M_1 \wedge M_3 \wedge M_4 \ge q$.
This follows from the representation (2.11) of $Z_T(u)$ and the representation (2.12) of $r_T(u)$.

Proof of Theorem 2.8. (I) We obtain a polynomial type large deviation inequality from Theorem 2.3: for any $D > 0$, there exist positive constants $C_1$ and $C_2$ such that

$$P\Big[ \sup_{V_T(r)} Z_T \ge C_1 r^{-D} \Big] \le C_2 r^{-L} \tag{2.31}$$

for all $T$ and $r > 0$. Choose a number $D$ such that $D > p + (p \vee 1)$. Then, for a suitable constant $C'$,

$$\begin{aligned}
P\Big[ \int_{V_T(r)} (1 + |u|) Z_T(u)\, du > C' \sum_{\ell=0}^\infty (r + \ell)^{p - D} \Big]
&\le \sum_{\ell=0}^\infty P\Big[ \int_{\{ r + \ell \le |u| < r + \ell + 1 \} \cap U_T} (1 + |u|) Z_T(u)\, du > C' (r + \ell)^{p - D} \Big] \\
&\le \sum_{\ell=0}^\infty P\Big[ \sup_{\{ r + \ell \le |u| < r + \ell + 1 \} \cap U_T} Z_T(u) > C_1 (r + \ell)^{-D} \Big] \\
&\le C_2 \sum_{\ell=0}^\infty (r + \ell)^{-L}
\end{aligned}$$

for $T \in \mathbb{T}$ and $r > 1$, by (2.31). Since $D - p > 1$ and $L > 1$, there exist positive constants $C_3$ and $\epsilon$ (independent of $(r, T)$) such that

$$P\Big[ \int_{V_T(r)} (1 + |u|) Z_T(u)\, du > C_3 r^{-\epsilon} \Big] \le C_3 r^{-\epsilon} \qquad (r > 1,\ T \in \mathbb{T}). \tag{2.32}$$

The variable $\hat u^B_T$ has the expression

$$\hat u^B_T = \Big( \int_{U_T} Z_T(u)\, \varpi(\theta^* + a_T u)\, du \Big)^{-1} \int_{U_T} u\, Z_T(u)\, \varpi(\theta^* + a_T u)\, du. \tag{2.33}$$

For $g(u) = (1, u)$, let

$$X_T = \int_{U_T} g(u) Z_T(u) \varpi(\theta^\dagger_T(u))\, du, \qquad X_{T,r} = \int_{U_T \cap U(0,r)} g(u) Z_T(u) \varpi(\theta^\dagger_T(u))\, du,$$
$$W_{T,r} = \int_{V_T(r)} g(u) Z_T(u) \varpi(\theta^\dagger_T(u))\, du,$$
$$X_\infty = \int_{\mathbb{R}^p} g(u) Z(u) \varpi(\theta^*)\, du, \qquad X_{\infty,r} = \int_{U(0,r)} g(u) Z(u) \varpi(\theta^*)\, du,$$

where $\theta^\dagger_T(u) = \theta^* + a_T u$ and $Z$ is given by (2.15). Then $X_T = X_{T,r} + W_{T,r}$ and the following properties hold.
(i) For any $\eta > 0$, there exists $r_0 > 0$ such that $\sup_{T \in \mathbb{T}} P[ |W_{T,r}| > \eta ] < \eta$ for all $r \ge r_0$.
(ii) For every $r > 0$, $X_{T,r} \to^d X_{\infty,r}$ as $T \to \infty$.
(iii) $X_{\infty,r} \to^d X_\infty$ as $r \to \infty$.
Indeed, (i) follows from (2.32), (ii) from the convergence (2.21), and (iii) is obvious. Therefore

$$X_T \to^d X_\infty \tag{2.34}$$

as $T \to \infty$.

Denote $X_T = (X^{(0)}_T, X^{(1)}_T)$ and $X_{T,r} = (X^{(0)}_{T,r}, X^{(1)}_{T,r})$. We will consider sufficiently large $T$ such that $U_T \supset U(0, 1)$. Let

$$A_T = \big( |X^{(1)}_T| + |X^{(0)}_T| \big) \Big( X^{(0)}_T \int_{U(0,1)} Z_T(u)\, du\ \inf_{\theta \in \Theta} \varpi(\theta) \Big)^{-1}.$$

Then

$$A_T \ge \frac{|X^{(1)}_T| + |X^{(0)}_T|}{X^{(0)}_T X^{(0)}_{T,r}}$$

for all $r \ge 1$. We have

$$\Big| \big( X^{(0)}_T \big)^{-1} X^{(1)}_T - \big( X^{(0)}_{T,r} \big)^{-1} X^{(1)}_{T,r} \Big| \le |W_{T,r}|\, A_T$$

for all $r \ge 1$. Let $\epsilon > 0$. Then there exists a positive number $\eta$ such that

$$\limsup_{T \to \infty} P\big[ A_T > \eta \big] < \frac{\epsilon}{2}$$

since $\{A_T\}_{T \ge T_0}$ is tight for some $T_0 \in \mathbb{T}$ by (2.34) and (2.21). For the pair $(\epsilon, \eta)$, there exists $r_1 = r_1(\epsilon, \eta) \ge 1$ such that

$$\limsup_{T \to \infty} P\Big[ |W_{T,r}| > \frac{\epsilon}{\eta} \Big] < \frac{\epsilon}{2} \qquad (r \ge r_1)$$

by the property (i) mentioned just before (2.34). In what follows, we fix an $r \ge r_1$. Then

$$\limsup_{T \to \infty} P\Big[ \Big| \big( X^{(0)}_T \big)^{-1} X^{(1)}_T - \big( X^{(0)}_{T,r} \big)^{-1} X^{(1)}_{T,r} \Big| > \epsilon \Big] \le \limsup_{T \to \infty} P\Big[ |W_{T,r}| > \frac{\epsilon}{\eta} \Big] + \limsup_{T \to \infty} P\big[ A_T > \eta \big] < \epsilon. \tag{2.35}$$

We have

$$\Big| \int_{U(0,r)} u^i Z_T(u) \varpi(\theta^\dagger_T(u))\, du - \int_{U(0,r)} u^i Z_T(u) \varpi(\theta^*)\, du \Big| \le \int_{U(0,r)} (1 + |u|) Z_T(u)\, du \times \sup\big\{ |\varpi(\theta) - \varpi(\theta^*)|;\ \theta \in \Theta,\ |\theta - \theta^*| \le |a_T| r \big\}$$

for $i = 0, 1$, where $u^0 = 1$ and $u^1 = u$. Therefore

$$\Big( \int_{U(0,r)} Z_T(u) \varpi(\theta^\dagger_T(u))\, du \Big)^{-1} \int_{U(0,r)} u Z_T(u) \varpi(\theta^\dagger_T(u))\, du - \Big( \int_{U(0,r)} Z_T(u)\, du \Big)^{-1} \int_{U(0,r)} u Z_T(u)\, du \to^p 0 \tag{2.36}$$

as $T \to \infty$.

Moreover, we have

$$\begin{aligned}
\Big| \int_{U(0,r)} u^i Z_T(u)\, du - \int_{U(0,r)} u^i \hat Z_T(u)\, du \Big|
&= \Big| \int_{U(0,r)} u^i \exp\big( \Delta_T[u] - 2^{-1} \Gamma[u^{\otimes 2}] + r_T(u) \big)\, du - \int_{U(0,r)} u^i \exp\big( \Delta_T[u] - 2^{-1} \Gamma[u^{\otimes 2}] \big)\, du \Big| \\
&\le \int_{U(0,r)} (1 + |u|) \hat Z_T(u)\, du\ \sup_{u \in U(0,r)} \big| e^{r_T(u)} - 1 \big|
\end{aligned}$$

where $\hat Z_T(u) = \exp\big( \Delta_T[u] - 2^{-1} \Gamma[u^{\otimes 2}] \big)$. Therefore

$$\Big( \int_{U(0,r)} Z_T(u)\, du \Big)^{-1} \int_{U(0,r)} u Z_T(u)\, du - \Big( \int_{U(0,r)} \hat Z_T(u)\, du \Big)^{-1} \int_{U(0,r)} u \hat Z_T(u)\, du \to^p 0 \tag{2.37}$$

as $T \to \infty$ thanks to the convergence (2.20).

Let

$$\hat X_T = (\hat X^{(0)}_T, \hat X^{(1)}_T) = \int_{\mathbb{R}^p} g(u) \hat Z_T(u)\, du, \qquad \hat X_{T,r} = (\hat X^{(0)}_{T,r}, \hat X^{(1)}_{T,r}) = \int_{U(0,r)} g(u) \hat Z_T(u)\, du,$$
$$\hat W_{T,r} = \int_{\mathbb{R}^p \setminus U(0,r)} g(u) \hat Z_T(u)\, du.$$

Then

$$\Big| \big( \hat X^{(0)}_T \big)^{-1} \hat X^{(1)}_T - \big( \hat X^{(0)}_{T,r} \big)^{-1} \hat X^{(1)}_{T,r} \Big| \le |\hat W_{T,r}|\, \hat A_T$$

for all $r \ge 1$, where

$$\hat A_T = \big( |\hat X^{(1)}_T| + |\hat X^{(0)}_T| \big) \Big( \hat X^{(0)}_T \int_{U(0,1)} \hat Z_T(u)\, du\ \inf_{\theta \in \Theta} \varpi(\theta) \Big)^{-1} \ge \frac{|\hat X^{(1)}_T| + |\hat X^{(0)}_T|}{\hat X^{(0)}_T \hat X^{(0)}_{T,r}}.$$

The positivity of $\Gamma$ and the tightness of $\{\Delta_T\}_{T \in \mathbb{T}}$ show that for any $\eta > 0$, there exist $r_0 > 0$ and $T_0 \in \mathbb{T}$ such that

$$\sup_{T \ge T_0} P\big[ |\hat W_{T,r}| > \eta \big] < \eta \qquad (r \ge r_0). \tag{2.38}$$

In the same way as we showed (2.35),

$$\limsup_{T \to \infty} P\Big[ \Big| \big( \hat X^{(0)}_T \big)^{-1} \hat X^{(1)}_T - \big( \hat X^{(0)}_{T,r} \big)^{-1} \hat X^{(1)}_{T,r} \Big| > \epsilon \Big] < \epsilon. \tag{2.39}$$

To obtain (2.39) by using (2.38) and the tightness of $\{\hat A_T\}_{T \ge T_0}$ for some $T_0 \in \mathbb{T}$, we replace $r$ by a larger number, if necessary.

Combining (2.35), (2.36), (2.37) and (2.39), we obtain

$$\limsup_{T \to \infty} P\Big[ \Big| \big( X^{(0)}_T \big)^{-1} X^{(1)}_T - \big( \hat X^{(0)}_T \big)^{-1} \hat X^{(1)}_T \Big| > 4\epsilon \Big] < 2\epsilon. \tag{2.40}$$

This completes the proof of (2.27) since $\hat u^B_T = \big( X^{(0)}_T \big)^{-1} X^{(1)}_T$ and $\big( \hat X^{(0)}_T \big)^{-1} \hat X^{(1)}_T = \Gamma^{-1} \Delta_T$. From (2.27), we obtain (2.28).

(II) There exists a number $p^*$ such that $p^* \ge 1$ and $p < p^* < (D - p) \wedge (L - 1)$ since $p \ge 0$ and $L > (p + 1) \vee 2$. Then the following estimates are standard:

$$\begin{aligned}
E\big[ |\hat u^B_T|^{p^*} \big]
&\le E\Big[ \Big( \int_{U_T} Z_T(u) \varpi(\theta^\dagger_T(u))\, du \Big)^{-1} \int_{U_T} |u|^{p^*} Z_T(u) \varpi(\theta^\dagger_T(u))\, du \Big] \\
&\le C(\varpi) \sum_{r=0}^\infty (r + 1)^{p^*} E\Big[ \Big( \int_{U_T} Z_T(u)\, du \Big)^{-1} \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du \Big] \\
&\le C(\varpi) \big( 1 + \Phi_{1,T} + \Phi_{2,T} \big)
\end{aligned}$$

for some constant $C(\varpi)$, where

$$\Phi_{1,T} = \sum_{r=1}^\infty (r + 1)^{p^*} E\Big[ \Big( \int_{U_T} Z_T(u)\, du \Big)^{-1} \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du \times 1_{\big\{ \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du\ >\ C' r^{-(D - p + 1)} \big\}} \Big]$$

and

$$\Phi_{2,T} = \sum_{r=1}^\infty (r + 1)^{p^*} E\Big[ \Big( \int_{U_T} Z_T(u)\, du \Big)^{-1} \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du \times 1_{\big\{ \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du\ \le\ C' r^{-(D - p + 1)} \big\}} \Big]$$

for a suitable constant $C'$. Since the integrand of the expectation of $\Phi_{1,T}$ is not greater than one, we obtain

$$\Phi_{1,T} \le \sum_{r=1}^\infty (r + 1)^{p^*} P\Big[ \int_{\{u;\ r < |u| \le r + 1\} \cap U_T} Z_T(u)\, du > \frac{C'}{r^{D - p + 1}} \Big] \lesssim \sum_{r=1}^\infty r^{-(L - p^*)}$$

thanks to the polynomial type large deviation inequality (2.31). For $\Phi_{2,T}$,

$$\Phi_{2,T} \lesssim \sum_{r=1}^\infty r^{-(D - p^* - p + 1)}\, E\Big[ \Big( \int_{U_T} Z_T(u)\, du \Big)^{-1} \Big].$$

Therefore, the family $\{ |\hat u^B_T|^p \}_{T \in \mathbb{T}}$ is uniformly integrable if

$$\sup_{T \ge T_0} E\Big[ \Big( \int_{U(0, \delta)} Z_T(u)\, du \Big)^{-1} \Big] < \infty \tag{2.41}$$

since $U(0, \delta) \subset \cap_{T \ge T_0} U_T$ (see Remark 2.9) and then

$$\sup_{T \in \mathbb{T}} E\big[ |\hat u^B_T|^{p^*} \big] < \infty.$$

We note that the family $\{\hat u^B_T\}_{T \in \mathbb{T},\ T < T_0}$ […]

[T1] (i) There exists a positive random variable $\chi$ and the following conditions are fulfilled.
(i-1) $Y(\theta) = Y(\theta) - Y(\theta^*) \le -\chi\, |\theta - \theta^*|^2$ for all $\theta \in \Theta$.
(i-2) For every $L > 0$, there exists a constant $C$ such that $P\big[ \chi \le r^{-1} \big] \le C\, r^{-L}$ $(r > 0)$.
(ii) For every $L > 0$, there exists a constant $C$ such that $P\big[ \lambda_{\min}(\Gamma) < r^{-1} \big] \le C\, r^{-L}$ $(r > 0)$.

Remark 2.11. $\chi^{-1} \in L^{\infty-} = \cap_{p > 1} L^p$ under [T1] (i-2).
$|\Gamma^{-1}| \in L^{\infty-}$ under [T1] (ii) since $\big( \lambda_{\min}(\Gamma) \big)^{-1} \in L^{\infty-}$ and $|\Gamma^{-1}| \le C_p \big( \lambda_{\min}(\Gamma) \big)^{-1}$ for a constant $C_p$ only depending on $p$. The $L^q$-integrability of $\Gamma$ will be assumed when we verify (2.29).

[T2] There exist positive numbers $\epsilon_1$ and $\epsilon_2$ such that the following conditions are satisfied for all $p > 1$:
(i) $\sup_{T \in \mathbb{T}} \big\| |\Delta_T| \big\|_p < \infty$.
(ii) $\sup_{T \in \mathbb{T}} \Big\| \sup_{\theta \in \Theta} b_T^{\epsilon_1} \big| Y_T(\theta) - Y(\theta) \big| \Big\|_p < \infty$.
(iii) $\sup_{T \in \mathbb{T}} \Big\| \sup_{u \in U_T:\ |u| \le 1} \big| \Gamma_T(\theta^* + \delta u) - \Gamma_T(\theta^*) \big| \Big\|_p = O(\delta)$ $(\delta \downarrow 0)$.
(iv) $\sup_{T \in \mathbb{T}} \big\| b_T^{\epsilon_2} \big| \Gamma_T(\theta^*) - \Gamma \big| \big\|_p < \infty$.

Denote by $C_p(\mathbb{R}^p)$ the set of continuous functions of at most polynomial growth. We further simplify Theorems 2.6 and 2.8.

Theorem 2.12. Suppose that Conditions [T1] and [T2] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Then

(a) As $T \to \infty$,
$$\hat u^M_T - \Gamma^{-1} \Delta_T \to^p 0. \tag{2.42}$$

(b) As $T \to \infty$,
$$E\big[ f(\hat u^M_T)\, \Phi \big] \to E\big[ f(\hat u)\, \Phi \big] \tag{2.43}$$
for any $f \in C_p(\mathbb{R}^p)$ and any $\mathcal{G}$-measurable random variable $\Phi \in \cup_{p > 1} L^p$.

Proof. There exist values of the parameters $\alpha, \beta_1, \beta_2, \rho_1$ and $\rho_2$ satisfying $\beta_1 \in (0, \min\{\epsilon_2, 1/2\})$, $1/2 - \beta_2 \le \epsilon_1$ and Condition [S1]. Condition [S2] is verified for any given $L > 0$ under [T1], and [T2] is sufficient for [S3] for any $L > 0$. Therefore we can apply Theorem 2.6. This concludes the proof.

Theorem 2.13. Suppose that Conditions [T1] and [T2] are satisfied and that the convergence (2.17) holds as $T \to \infty$. Moreover, suppose that $|\Gamma| \in L^q$ for some $q > p$. Then

(a) As $T \to \infty$,
$$\hat u^B_T - \Gamma^{-1} \Delta_T \to^p 0. \tag{2.44}$$

(b) As $T \to \infty$,
$$E\big[ f(\hat u^B_T)\, \Phi \big] \to E\big[ f(\hat u)\, \Phi \big] \tag{2.45}$$
for any $f \in C_p(\mathbb{R}^p)$ and any $\mathcal{G}$-measurable random variable $\Phi \in \cup_{p > 1} L^p$.

Proof. We apply Theorem 2.8.
In particular, (2.29) holds now according to Remark 2.10.

3 Further simplification in ergodic statistics

When the limit $Y$ of $Y_T$ is deterministic, some more simplification of the theory is possible. In this section, we suppose that the random field $Y$ and the $p\times p$ positive-definite symmetric matrix $\Gamma$ are deterministic. Let $L$ be a positive number. We will consider the following conditions.

[U1] $\Gamma$ is positive-definite and, in addition, there is a positive number $\chi_0$ such that $Y(\theta)=Y(\theta)-Y(\theta^*)\leq-\chi_0\,|\theta-\theta^*|^2$ for all $\theta\in\Theta$.

[U2] The numbers $\alpha$, $\beta_1$, $\beta_2$ and $\rho_2$ satisfy the inequalities
\[
0<2\alpha<\rho_2,\quad \beta_2\geq 0,\quad 1-2\beta_2-\rho_2>0,\quad 0<\beta_1<1/2, \tag{3.1}
\]
and the following conditions are fulfilled.
 (i) For some $M_1>L$, $\sup_{T>0}\big\|\Delta_T\big\|_{M_1}<\infty$.
 (ii) For $M_2=L(1-2\beta_2-\rho_2)^{-1}$, $\sup_{T>0}\Big\|\sup_{\theta\in U_T\setminus U(\theta^*,\,b_T^{-\alpha/2})}b_T^{\frac{1}{2}-\beta_2}\big|Y_T(\theta)-Y(\theta)\big|\Big\|_{M_2}<\infty$.
 (iii) For some $M_3>L\beta_1^{-1}$, $\sup_{T>0}\Big\|\sup_{u\in U_T:\,|u|\leq 1}\big|\Gamma_T(\theta^*+\delta u)-\Gamma_T(\theta^*)\big|\Big\|_{M_3}=O(\delta)$ $(\delta\downarrow 0)$.
 (iv) For some $M_4>L(2\beta_1)^{-1}(1-\alpha)$, $\sup_{T>0}\big\|b_T^{\beta_1}\big|\Gamma_T(\theta^*)-\Gamma\big|\big\|_{M_4}<\infty$.

Remark 3.1. Condition [U1] is almost trivial because the function $Y$ should be of class $C^2$ on $\Theta$ and continuous on the compact set $\overline{\Theta}$, and then local non-degeneracy of the information implies the global identifiability.

Theorem 3.2. Suppose that Conditions [U1] and [U2] are fulfilled for a positive constant $L$. Then there exists a constant $C_L$ such that
\[
P\bigg[\sup_{u\in V_T(r)}Z_T(u)\geq\exp\big(-2^{-1}r^{2-\rho_2}\big)\bigg]\leq\frac{C_L}{r^L}
\]
for all $T\in\mathbb{T}$ and $r>0$. Here the supremum on the empty set should read $-\infty$ by convention.

Proof.
Choose a positive constant $\rho_1$ such that
\[
\begin{aligned}
&0<\rho_1<\min\big\{1,\ \alpha/(1-\alpha),\ 2\beta_1/(1-\alpha)\big\},\qquad \rho_1\leq\rho_2,\\
&M_1\geq M_1':=L(1-\rho_1)^{-1},\qquad M_3\geq M_3':=L(\beta_1-\rho_1)^{-1},\\
&M_4\geq M_4':=L\big(2\beta_1(1-\alpha)^{-1}-\rho_1\big)^{-1}
\end{aligned}
\tag{3.2}
\]
for $M_1$, $M_3$, $M_4$ given in Condition [U2]. Such a $\rho_1$ exists. It is sufficient to verify the conditions of Theorem 2.3. Condition [S1] is fulfilled by [U2] and a choice of $\rho_1$ in (3.2). Condition [S2] (i-1) is satisfied with [U1]. Conditions (i-2) and (ii) of [S2] are trivial because $\chi_0$ is a deterministic positive number and $\Gamma$ is positive-definite and deterministic in the present situation, respectively. Conditions (i)-(iv) of [S3] are verified by [U2] with $(M_1',M_2,M_3',M_4')$ for $(M_1,M_2,M_3,M_4)$ in [S3].

... $H_T$ is characterized by (2.16). Theorem 2.6 is rephrased as follows with the trivial $\sigma$-field for $\mathcal{G}$.

Theorem 3.3. Let $L>p>0$. Suppose that Conditions [U1] and [U2] are satisfied and that
\[
\Delta_T\to^d\Delta \tag{3.3}
\]
as $T\to\infty$. Then,
 (a) $\hat{u}_T^M-\Gamma^{-1}\Delta_T\to^p 0$ as $T\to\infty$.
 (b) $E\big[f(\hat{u}_T^M)\big]\to E\big[f(\hat{u})\big]$ as $T\to\infty$ for any $f\in C(\mathbb{R}^p)$ satisfying $\limsup_{|u|\to\infty}|u|^{-p}|f(u)|<\infty$.

Proof. Take $\rho_1$ as in (3.2) and apply Theorem 2.6.

Consider the quasi-Bayesian estimator $\hat{\theta}_T^B$ defined by (2.26). We can rephrase Theorem 2.8 as follows.

Theorem 3.4. Let $p\geq 1$ and $L>(p+1)\vee\ldots$ Suppose that Conditions [U1] and [U2] are satisfied and that the convergence (3.3) holds as $T\to\infty$. Moreover, suppose that there exist positive constants $q$, $c_0$, $\delta$, $T_0\in\mathbb{T}$ and $c_1$ such that $q>p$ and the inequality (2.29) holds for all $u\in U(0,\delta)$. Then
 (a) $\hat{u}_T^B-\Gamma^{-1}\Delta_T\to^p 0$ as $T\to\infty$.
 (b) $E\big[f(\hat{u}_T^B)\big]\to E\big[f(\hat{u})\big]$ as $T\to\infty$ for any $f\in C(\mathbb{R}^p)$ satisfying $\limsup_{|u|\to\infty}|u|^{-p}|f(u)|<\infty$.

As a corollary of Theorems 3.3 and 3.4, we obtain the following result.

Theorem 3.5. Suppose that Conditions [U1] and [T2] are satisfied and that the convergence (3.3) holds as $T\to\infty$. Then
 (a) $\hat{u}_T^{\mathsf{A}}-\Gamma^{-1}\Delta_T\to^p 0$ as $T\to\infty$ for $\mathsf{A}\in\{M,B\}$.
 (b) $E\big[f(\hat{u}_T^{\mathsf{A}})\big]\to E\big[f(\hat{u})\big]$ as $T\to\infty$ for $\mathsf{A}\in\{M,B\}$ and any $f\in C_p(\mathbb{R}^p)$.

References

[1] Clinet, S., Yoshida, N.: Statistical inference for ergodic point processes and application to limit order book. arXiv preprint arXiv:1512.01899 (2015)
[2] Eguchi, S., Masuda, H.: Schwarz type model comparison for LAQ models. arXiv preprint arXiv:1606.01627 (2016)
[3] Ibragimov, I.A., Has'minskiĭ, R.Z.: The asymptotic behavior of certain statistical estimates in the smooth case. I. Investigation of the likelihood ratio. Teor. Verojatnost. i Primenen., 469–486 (1972)
[4] Ibragimov, I.A., Has'minskiĭ, R.Z.: Asymptotic behavior of certain statistical estimates. II. Limit theorems for a posteriori density and for Bayesian estimates. Teor. Verojatnost. i Primenen., 78–93 (1973)
[5] Ibragimov, I.A., Has'minskiĭ, R.Z.: Statistical estimation, Applications of Mathematics, vol. 16. Springer-Verlag, New York (1981). Asymptotic theory, Translated from the Russian by Samuel Kotz
[6] Inatsugu, H., Yoshida, N.: Global jump filters and quasi-likelihood analysis for volatility. Annals of the Institute of Statistical Mathematics: updated arXiv:1806.10706v3 pp. 1–44 (2021)
[7] Kamatani, K., Uchida, M.: Hybrid multi-step estimators for stochastic differential equations based on sampled data. Statistical Inference for Stochastic Processes (2), 177–204 (2014)
[8] Kinoshita, Y., Yoshida, N.: Penalized quasi likelihood estimation for variable selection. arXiv preprint arXiv:1910.12871 (2019)
[9] Kutoyants, Y.: Identification of dynamical systems with small noise, Mathematics and its Applications, vol. 300. Kluwer Academic Publishers Group, Dordrecht (1994)
[10] Kutoyants, Y.A.: Parameter estimation for stochastic processes, Research and Exposition in Mathematics, vol. 6. Heldermann Verlag, Berlin (1984). Translated from the Russian and edited by B. L. S. Prakasa Rao
[11] Kutoyants, Y.A.: Statistical inference for spatial Poisson processes, Lecture Notes in Statistics, vol.
134. Springer-Verlag, New York (1998)
[12] Kutoyants, Y.A.: Statistical inference for ergodic diffusion processes. Springer Series in Statistics. Springer-Verlag London Ltd., London (2004)
[13] Masuda, H.: Approximate self-weighted LAD estimation of discretely observed ergodic Ornstein-Uhlenbeck processes. Electronic Journal of Statistics, 525–565 (2010)
[14] Masuda, H.: Convergence of Gaussian quasi-likelihood random fields for ergodic Lévy driven SDE observed at high frequency. The Annals of Statistics (3), 1593–1641 (2013)
[15] Masuda, H.: Parametric estimation of Lévy processes. In: Lévy Matters IV, pp. 179–286. Springer (2015)
[16] Masuda, H., Shimizu, Y.: Moment convergence in regularized estimation under multiple and mixed-rates asymptotics. Mathematical Methods of Statistics (2), 81–110 (2017)
[17] Nomura, R., Uchida, M.: Adaptive Bayes estimators and hybrid estimators for small diffusion processes based on sampled data. Journal of the Japan Statistical Society (2), 129–154 (2016)
[18] Ogihara, T., Yoshida, N.: Quasi-likelihood analysis for the stochastic differential equation with jumps. Stat. Inference Stoch. Process. (3), 189–229 (2011). DOI 10.1007/s11203-011-9057-z. URL http://dx.doi.org/10.1007/s11203-011-9057-z
[19] Ogihara, T., Yoshida, N.: Quasi-likelihood analysis for nonsynchronously observed diffusion processes. Stochastic Processes and their Applications (9), 2954–3008 (2014)
[20] Ogihara, T., Yoshida, N.: Quasi likelihood analysis of point processes for ultra high frequency data. arXiv preprint arXiv:1512.01619 (2015)
[21] Shimizu, Y.: Threshold estimation for stochastic processes with small noise. arXiv preprint arXiv:1502.07409 (2015)
[22] Shimizu, Y.: Moment convergence of regularized least-squares estimator for linear regression model. Annals of the Institute of Statistical Mathematics (5), 1141–1154 (2017)
[23] Suzuki, T., Yoshida, N.: Penalized least squares approximation methods and their applications to stochastic processes. Japanese Journal of Statistics and Data Science pp. 1–29 (2020)
[24] Uchida, M.: Contrast-based information criterion for ergodic diffusion processes from discrete observations. Ann. Inst. Statist. Math. (1), 161–187 (2010). DOI 10.1007/s10463-009-0245-1. URL http://dx.doi.org/10.1007/s10463-009-0245-1
[25] Uchida, M., Yoshida, N.: Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Process. Appl. (8), 2885–2924 (2012). DOI 10.1016/j.spa.2012.04.001. URL http://dx.doi.org/10.1016/j.spa.2012.04.001
[26] Uchida, M., Yoshida, N.: Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Processes and their Applications (8), 2885–2924 (2012)
[27] Uchida, M., Yoshida, N.: Quasi likelihood analysis of volatility and nondegeneracy of statistical random field. Stochastic Process. Appl. (7), 2851–2876 (2013). DOI 10.1016/j.spa.2013.04.008. URL http://dx.doi.org/10.1016/j.spa.2013.04.008
[28] Uchida, M., Yoshida, N.: Adaptive Bayes type estimators of ergodic diffusion processes from discrete observations. Statistical Inference for Stochastic Processes (2), 181–219 (2014)
[29] Umezu, Y., Shimizu, Y., Masuda, H., Ninomiya, Y.: AIC for non-concave penalized likelihood method. arXiv preprint arXiv:1509.01688 (2015)
[30] Yoshida, N.: Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Ann. Inst. Statist. Math. (3), 431–479 (2011). DOI 10.1007/s10463-009-0263-z. URL http://dx.doi.org/10.1007/s10463-009-0263-z
[31] Yoshida, N.: Partial quasi-likelihood analysis. Japanese Journal of Statistics and Data Science 1