Estimating FARIMA models with uncorrelated but non-independent error terms

Abstract

In this paper we derive the asymptotic properties of the least squares estimator (LSE) of fractionally integrated autoregressive moving-average (FARIMA) models under the assumption that the errors are uncorrelated but not necessarily independent nor martingale differences. We relax considerably the independence and even the martingale difference assumptions on the innovation process to extend the range of application of the FARIMA models. We propose a consistent estimator of the asymptotic covariance matrix of the LSE which may be very different from that obtained in the standard framework. A self-normalized approach to confidence interval construction for weak FARIMA model parameters is also presented. All our results are done under a mixing assumption on the noise. Finally, some simulation studies and an application to the daily returns of stock market indices are presented to corroborate our theoretical work.


arXiv [stat.AP], October

Yacouba Boubacar Maïnassara, Youssef Esstafa, Bruno Saussereau

Université Bourgogne Franche-Comté, Laboratoire de mathématiques de Besançon, UMR CNRS 6623, 16 route de Gray, 25030 Besançon, France.
e-mail: [email protected]; e-mail: [email protected]; e-mail: [email protected]


AMS 2000 subject classifications: Primary 62M10; secondary 91B84.

Keywords and phrases: Nonlinear processes; FARIMA models; Least-squares estimator; Consistency; Asymptotic normality; Spectral density estimation; Self-normalization; Cumulants.

1. Introduction

Long memory processes take up a large part of the time series literature (see for instance [GJ80], [FT86], [Dah89], [Hos81], [BFGK13], [Pal07], among others). They also play an important role in many scientific disciplines and applied fields such as hydrology, climatology, economics and finance, to name a few. To model the long memory phenomenon, a widely used model is the fractional autoregressive integrated moving average (FARIMA, for short) model. Consider a second-order centered stationary process $X := (X_t)_{t\in\mathbb{Z}}$ satisfying a FARIMA$(p, d_0, q)$ representation of the form

$$a(L)(1-L)^{d_0} X_t = b(L)\epsilon_t, \tag{1}$$

where $d_0 \in\, ]0, 1/2[$ is the long memory parameter, $L$ stands for the back-shift operator, $a(L) = 1 - \sum_{i=1}^{p} a_i L^i$ is the autoregressive (AR for short) operator and $b(L) = 1 - \sum_{i=1}^{q} b_i L^i$ is the moving average (MA for short) operator (by convention $a_0 = b_0 = 1$). The operators $a$ and $b$ represent the short memory part of the model.

Y. Boubacar Maïnassara, Y. Esstafa and B. Saussereau / Estimation of weak FARIMA models

The linear innovation process $\epsilon := (\epsilon_t)_{t\in\mathbb{Z}}$ is assumed to be a stationary sequence that satisfies

(A0): $\mathbb{E}[\epsilon_t] = 0$, $\operatorname{Var}(\epsilon_t) = \sigma_\epsilon^2$ and $\operatorname{Cov}(\epsilon_t, \epsilon_{t+h}) = 0$ for all $t \in \mathbb{Z}$ and all $h \neq 0$.

Under the above assumptions the process $\epsilon$ is called a weak white noise. Different sub-classes of FARIMA models can be distinguished depending on the noise assumptions. It is customary to say that $X$ is a strong FARIMA$(p, d_0, q)$ representation, and we will do so henceforth, if in (1) $\epsilon$ is a strong white noise, namely an independent and identically distributed (iid for short) sequence of random variables with mean 0 and common variance. A strong white noise is obviously a weak white noise because independence entails uncorrelatedness. Of course the converse is not true. Between weak and strong noises, one can say that $\epsilon$ is a semi-strong white noise if $\epsilon$ is a stationary martingale difference, namely a sequence such that $\mathbb{E}(\epsilon_t \mid \epsilon_{t-1}, \epsilon_{t-2}, \dots) = 0$.
An example of semi-strong white noise is the generalized autoregressive conditional heteroscedastic (GARCH) model (see [FZ10]). If $\epsilon$ is a semi-strong white noise in (1), $X$ is called a semi-strong FARIMA$(p, d_0, q)$. If no additional assumption is made on $\epsilon$, that is if $\epsilon$ is only a weak white noise (not necessarily iid, nor a martingale difference), the representation (1) is called a weak FARIMA$(p, d_0, q)$. It is clear from these definitions that the following inclusions hold:

$$\{\text{strong FARIMA}(p, d_0, q)\} \subset \{\text{semi-strong FARIMA}(p, d_0, q)\} \subset \{\text{weak FARIMA}(p, d_0, q)\}.$$

Nonlinear models are increasingly used because numerous real time series exhibit nonlinear dynamics. For instance, conditional heteroscedasticity cannot be generated by FARIMA models with iid noises. As mentioned by [FZ05, FZ98] in the case of ARMA models, many important classes of nonlinear processes admit weak ARMA representations in which the linear innovation is not a martingale difference. The main issue with nonlinear models is that they are generally hard to identify and implement. These technical difficulties certainly explain why the asymptotic theory of FARIMA model estimation is mainly limited to the strong or semi-strong FARIMA model.

We now present some of the main works on FARIMA model estimation when the noise is strong or semi-strong. For the estimation of long-range dependent processes, the commonly used estimation method is based on the Whittle frequency domain maximum likelihood estimator (MLE) (see for instance [Dah89], [FT86], [TT97], [GS90]). The asymptotic properties of the MLE of FARIMA models are well known under the restrictive assumption that the errors $\epsilon_t$ are independent or a martingale difference (see [Ber95], [BFGK13], [Pal07], [BCT96], [LL97], [HK98], among others). All the works mentioned above assume either strong or semi-strong innovations.
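The difference between these noise classes can be checked numerically. The sketch below simulates the two dependent noises used later in Section 4, the GARCH(1,1) noise (20) and the product noise (21), and compares the lag-one sample autocorrelation of the levels with that of the squares. It is only an illustration: it assumes that (21) reads $\epsilon_t = \eta_t^2\eta_{t-1}$, and the function names, the burn-in length and the sample size are choices made here, not taken from the paper.

```python
import random


def garch_noise(n, rng, omega=0.04, alpha=0.12, beta=0.85, burn=500):
    """Semi-strong noise as in (20): eps_t = sigma_t * eta_t, GARCH(1,1) variance."""
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    eps, out = 0.0, []
    for t in range(n + burn):
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2
        eps = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)
        if t >= burn:  # drop the burn-in draws (burn length is an arbitrary choice)
            out.append(eps)
    return out


def product_noise(n, rng):
    """Weak noise, assuming (21) reads eps_t = eta_t**2 * eta_{t-1}."""
    eta = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eta[t] ** 2 * eta[t - 1] for t in range(1, n + 1)]


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    ch = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n
    return ch / c0


if __name__ == "__main__":
    rng = random.Random(0)
    for name, eps in [("GARCH (20)", garch_noise(20000, rng)),
                      ("product (21)", product_noise(20000, rng))]:
        squares = [e * e for e in eps]
        print(name, "levels:", round(autocorr(eps, 1), 3),
              "squares:", round(autocorr(squares, 1), 3))
```

For both noises the levels should show an autocorrelation close to zero, which is all that (A0) requires, while the squares remain correlated, which rules out independence.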
In the modeling of financial time series, for example, the GARCH assumption on the errors is often used (see for instance [BCT96], [HK98]) to capture the conditional heteroscedasticity. It is clearly important to have a sound inference procedure for the parameter of the FARIMA model when the (possibly dependent) error is subject to unknown conditional heteroscedasticity. Little is known, however, when the martingale difference assumption is relaxed. Our aim in this paper is to consider a flexible FARIMA specification and to relax the independence assumption (and even the martingale difference assumption) in order to be able to cover weak FARIMA representations of general nonlinear models. This is why it is interesting to consider weak FARIMA models.

Footnote: To cite a few examples of nonlinear processes, let us mention the self-exciting threshold autoregressive (SETAR), the smooth transition autoregressive (STAR), the exponential autoregressive (EXPAR), the bilinear, the random coefficient autoregressive (RCA) and the functional autoregressive (FAR) models (see [Ton90] and [FY08] for references on these nonlinear time series models).

Very few works deal with the asymptotic behavior of the MLE of weak FARIMA models. To our knowledge, [Sha12, Sha10b] are the only papers on this subject. Under weak assumptions on the noise process, the author obtained the asymptotic normality of the Whittle estimator (see [Whi53]). Nevertheless, the inference problem is not addressed. This is due to the fact that the asymptotic covariance matrix of the Whittle estimator involves the integral of the fourth-order cumulant spectra of the dependent errors $\epsilon_t$. Using non-parametric bandwidth-dependent methods, one can build an estimate of this integral, but there is no guidance on the choice of the bandwidth in the estimation procedures (see [Sha12, Tan82, Kee87, Chi88] for further details).
The difficulty is caused by the dependence in $\epsilon_t$. Indeed, for strong noise, a bandwidth-free consistent estimator of the asymptotic covariance matrix is available. When $\epsilon_t$ is dependent, no explicit formula for a consistent estimator of the asymptotic variance matrix seems to be provided in the literature (see [Sha12]).

In this work we propose to adopt for weak FARIMA models the estimation procedure developed in [FZ98], so we use the least squares estimator (LSE for short). We show that a strong mixing property and the existence of moments are sufficient to obtain a consistent and asymptotically normally distributed least squares estimator for the parameters of a weak FARIMA representation. For technical reasons, we often use an assumption on the summability of cumulants. This can be a consequence of mixing and moment assumptions (see [DL89], for more details). These kinds of hypotheses enable us to circumvent the problem of the lack of speed of convergence (due to the long-range dependence) in the infinite AR or MA representations. We fix this gap by proposing rather sharp estimations of the infinite AR and MA representations in the presence of long-range dependence (see Subsection 6.1 for details).

In our opinion there are three major contributions in this work. The first one is to show that the estimation procedure developed in [FZ98] can be extended to weak FARIMA models. This goal is achieved thanks to Theorem 1 and Theorem 2, in which the consistency and the asymptotic normality are stated. The second one is to provide an answer to the open problem raised by [Sha12] (see also [Sha10b]) on the asymptotic covariance matrix estimation. We propose in our work a weakly consistent estimator of the asymptotic variance matrix (see Theorem 3). Thanks to this estimation of the asymptotic variance matrix, we can construct a confidence region for the estimation of the parameters.
Finally, another way to construct such a confidence region is provided by a self-normalization procedure (see Theorem 6).

The paper is organized as follows. Section 2 shows that the least squares estimator for the parameters of a weak FARIMA model is strongly consistent when the weak white noise $(\epsilon_t)$ is ergodic and stationary, and that the LSE is asymptotically normally distributed when $(\epsilon_t)$ satisfies mixing assumptions. The asymptotic variance of the LSE may be very different in the weak and strong cases. Section 3 is devoted to the estimation of this covariance matrix. We also propose a self-normalization-based approach to constructing a confidence region for the parameters of weak FARIMA models, which avoids estimating the asymptotic covariance matrix. Simulation studies and illustrative applications on real data are presented and discussed in Section 4, and all our figures and tables are gathered in Section 7. The proofs of the main results are collected in Section 6.

In all this work, we shall use the matrix norm defined by $\|A\| = \sup_{\|x\|\le 1}\|Ax\| = \rho^{1/2}(A'A)$, when $A$ is a $\mathbb{R}^{k_1\times k_2}$ matrix, $\|x\|^2 = x'x$ is the Euclidean norm of the vector $x \in \mathbb{R}^{k_2}$, and $\rho(\cdot)$ denotes the spectral radius.

2. Least squares estimation

In this section we present the parametrization and the assumptions that are used in the sequel. Then we state the asymptotic properties of the LSE of weak FARIMA models.

We make the following standard assumption on the roots of the AR and MA polynomials in (1).

(A1): The polynomials $a(z)$ and $b(z)$ have all their roots outside of the unit disk with no common factors.

Let $\Theta^*$ be the compact space

$$\Theta^* := \Big\{ (\theta_1, \theta_2, \dots, \theta_{p+q}) \in \mathbb{R}^{p+q} : a_\theta(z) = 1 - \sum_{i=1}^{p}\theta_i z^i \text{ and } b_\theta(z) = 1 - \sum_{j=1}^{q}\theta_{p+j} z^j \text{ have all their zeros outside the unit disk and have no zero in common} \Big\}.$$

Denote by $\Theta$ the cartesian product $\Theta^* \times [d_1, d_2]$, where $[d_1, d_2] \subset\, ]0, 1/2[$ with $d_1 \le d_0 \le d_2$. The unknown parameter of interest $\theta_0 = (a_1, a_2, \dots, a_p, b_1, b_2, \dots, b_q, d_0)'$ is supposed to belong to the parameter space $\Theta$.

The fractional difference operator $(1-L)^d$ is defined, using the generalized binomial series, by

$$(1-L)^d = \sum_{j\ge 0}\alpha_j(d) L^j,$$

where for all $j \ge 0$, $\alpha_j(d) = \Gamma(j-d)/\{\Gamma(j+1)\Gamma(-d)\}$ and $\Gamma(\cdot)$ is the Gamma function. Using the Stirling formula we obtain that for large $j$, $\alpha_j(d) \sim j^{-d-1}/\Gamma(-d)$ (one refers to [BFGK13] for further details).

For all $\theta \in \Theta$ we define $(\epsilon_t(\theta))_{t\in\mathbb{Z}}$ as the second-order stationary process which is the solution of

$$\epsilon_t(\theta) = \sum_{j\ge 0}\alpha_j(d) X_{t-j} - \sum_{i=1}^{p}\theta_i \sum_{j\ge 0}\alpha_j(d) X_{t-i-j} + \sum_{j=1}^{q}\theta_{p+j}\,\epsilon_{t-j}(\theta). \tag{2}$$

Observe that, for all $t \in \mathbb{Z}$, $\epsilon_t(\theta_0) = \epsilon_t$ a.s. Given a realization $X_1, \dots, X_n$ of length $n$, $\epsilon_t(\theta)$ can be approximated, for $0 < t \le n$, by $\tilde\epsilon_t(\theta)$ defined recursively by

$$\tilde\epsilon_t(\theta) = \sum_{j=0}^{t-1}\alpha_j(d) X_{t-j} - \sum_{i=1}^{p}\theta_i \sum_{j=0}^{t-i-1}\alpha_j(d) X_{t-i-j} + \sum_{j=1}^{q}\theta_{p+j}\,\tilde\epsilon_{t-j}(\theta), \tag{3}$$

with $\tilde\epsilon_t(\theta) = X_t = 0$ if $t \le 0$. It will be shown that these initial values are asymptotically negligible uniformly in $\theta$ and, in particular, that $\epsilon_t(\theta) - \tilde\epsilon_t(\theta) \to 0$ almost surely as $t \to \infty$ (see Lemma 4 hereafter). Thus the choice of the initial values has no influence on the asymptotic properties of the estimator of the model parameters.

Let $\Theta^*_\delta$ denote the compact set

$$\Theta^*_\delta = \big\{ \theta \in \mathbb{R}^{p+q} ; \text{ the roots of the polynomials } a_\theta(z) \text{ and } b_\theta(z) \text{ have modulus} \ge 1 + \delta \big\}.$$
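The coefficients $\alpha_j(d)$ and the truncated residuals (3) are simple to compute. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the coefficients are obtained from the recursion $\alpha_0(d) = 1$, $\alpha_j(d) = \alpha_{j-1}(d)\,(j-1-d)/j$ (which follows from the Gamma-function formula), and the list `x` holds $X_1, \dots, X_n$ with presample values set to zero as in (3).

```python
import math


def frac_diff_coeffs(d, n):
    """Coefficients alpha_0(d), ..., alpha_{n-1}(d) of (1 - L)^d."""
    alpha = [1.0]
    for j in range(1, n):
        # alpha_j / alpha_{j-1} = (j - 1 - d) / j, from the Gamma-function formula
        alpha.append(alpha[-1] * (j - 1 - d) / j)
    return alpha


def farima_residuals(x, a, b, d):
    """Truncated residuals (3) for AR coefficients a, MA coefficients b and memory d.

    x[0] plays the role of X_1; all presample values are taken equal to zero.
    """
    n = len(x)
    alpha = frac_diff_coeffs(d, n)
    # y[t] is the fractionally differenced series truncated at the sample start
    y = [sum(alpha[j] * x[t - j] for j in range(t + 1)) for t in range(n)]
    eps = []
    for t in range(n):
        value = y[t]
        for i, a_i in enumerate(a, start=1):
            if t - i >= 0:
                value -= a_i * y[t - i]      # AR part: - theta_i * y_{t-i}
        for j, b_j in enumerate(b, start=1):
            if t - j >= 0:
                value += b_j * eps[t - j]    # MA part: + theta_{p+j} * eps_{t-j}
        eps.append(value)
    return eps


if __name__ == "__main__":
    print(frac_diff_coeffs(0.3, 4))
    print(farima_residuals([1.0, 0.5, -0.2, 0.1], a=[0.4], b=[0.2], d=0.3))
```

The double loop costs of the order of $n^2$ operations, which is acceptable for a sketch; a production implementation would rather use a fast convolution.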
We define the set $\Theta_\delta$ as the cartesian product of $\Theta^*_\delta$ by $[d_1, d_2]$, i.e. $\Theta_\delta = \Theta^*_\delta \times [d_1, d_2]$, where $\delta$ is a strictly positive constant chosen such that $\theta_0$ belongs to $\Theta_\delta$.

The random variable $\hat\theta_n$ is called least squares estimator if it satisfies, almost surely,

$$\hat\theta_n = \operatorname*{argmin}_{\theta\in\Theta_\delta} Q_n(\theta), \quad \text{where } Q_n(\theta) = \frac{1}{n}\sum_{t=1}^{n}\tilde\epsilon_t^{\,2}(\theta). \tag{4}$$

Our main results are proven under the following assumptions:

(A2): The process $(\epsilon_t)_{t\in\mathbb{Z}}$ is strictly stationary and ergodic.

The strong consistency of the least squares estimator will be proved under the three above assumptions ((A0), (A1) and (A2)). For the asymptotic normality of the LSE, additional assumptions are required. It is necessary to assume that $\theta_0$ is not on the boundary of the parameter space $\Theta$.

(A3): We have $\theta_0 \in \overset{\circ}{\Theta}$, where $\overset{\circ}{\Theta}$ denotes the interior of $\Theta$.

The stationary process $\epsilon$ is not supposed to be an independent sequence. So one needs to control its dependency by means of its strong mixing coefficients $\{\alpha_\epsilon(h)\}_{h\in\mathbb{N}}$ defined by

$$\alpha_\epsilon(h) = \sup_{A\in\mathcal{F}^{t}_{-\infty},\, B\in\mathcal{F}^{\infty}_{t+h}} \left| \mathbb{P}(A\cap B) - \mathbb{P}(A)\mathbb{P}(B) \right|,$$

where $\mathcal{F}^{t}_{-\infty} = \sigma(\epsilon_u, u \le t)$ and $\mathcal{F}^{\infty}_{t+h} = \sigma(\epsilon_u, u \ge t+h)$.

We shall need an integrability assumption on the moments of the noise $\epsilon$ and a summability condition on the strong mixing coefficients $(\alpha_\epsilon(k))_{k\ge 0}$.

(A4): There exists an integer $\tau$ such that for some $\nu \in\, ]0, 1]$, we have $\mathbb{E}|\epsilon_t|^{\tau+\nu} < \infty$ and $\sum_{h=0}^{\infty}(h+1)^{k-2}\{\alpha_\epsilon(h)\}^{\frac{\nu}{k+\nu}} < \infty$ for $k = 1, \dots, \tau$.

Note that (A4) implies the following weak assumption on the joint cumulants of the innovation process $\epsilon$ (see [DL89], for more details).

(A4'): There exists an integer $\tau \ge 2$ such that $C_\tau := \sum_{i_1,\dots,i_{\tau-1}\in\mathbb{Z}} |\operatorname{cum}(\epsilon_0, \epsilon_{i_1}, \dots, \epsilon_{i_{\tau-1}})| < \infty$.

In the above expression, $\operatorname{cum}(\epsilon_0, \epsilon_{i_1}, \dots, \epsilon_{i_{\tau-1}})$ denotes the $\tau$-th order cumulant of the stationary process.
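To make the criterion (4) concrete, the following sketch estimates $d$ in the simplest case $p = q = 0$ by minimizing $Q_n$ over a grid. It is an illustration only, not the numerical procedure used in the paper: the function names are hypothetical, the data are generated by applying the truncated expansion of $(1-L)^{-d_0}$ to iid Gaussian noise with zero presample values (so the simulated series is not exactly stationary), and a grid search replaces a proper optimizer.

```python
import random


def frac_coeffs(d, n):
    """First n coefficients of (1 - L)^d; call with -d for the inverse filter."""
    c = [1.0]
    for j in range(1, n):
        c.append(c[-1] * (j - 1 - d) / j)
    return c


def filter_causal(coeffs, x):
    """Apply sum_{j=0}^{t} coeffs[j] * x[t-j], with zero presample values."""
    return [sum(coeffs[j] * x[t - j] for j in range(t + 1)) for t in range(len(x))]


def q_n(d, x):
    """Least squares criterion (4) for a FARIMA(0, d, 0) model."""
    residuals = filter_causal(frac_coeffs(d, len(x)), x)
    return sum(e * e for e in residuals) / len(x)


def lse_grid(x, grid):
    """Grid minimizer of Q_n, a crude stand-in for a numerical optimizer."""
    return min(grid, key=lambda d: q_n(d, x))


if __name__ == "__main__":
    rng = random.Random(0)
    n, d0 = 400, 0.3                      # hypothetical sample size and true memory
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = filter_causal(frac_coeffs(-d0, n), eps)
    grid = [0.05 * k for k in range(1, 10)]
    print("grid LSE of d:", lse_grid(x, grid))
```

With zero presample values the truncated filters are exact inverses of each other, so at $d = d_0$ the residuals reproduce the simulated noise, which mirrors the identity $\epsilon_t(\theta_0) = \epsilon_t$.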
Due to the fact that the $\epsilon_t$'s are centered, we notice that for fixed $(i, j, k)$

$$\operatorname{cum}(\epsilon_0, \epsilon_i, \epsilon_j, \epsilon_k) = \mathbb{E}[\epsilon_0\epsilon_i\epsilon_j\epsilon_k] - \mathbb{E}[\epsilon_0\epsilon_i]\mathbb{E}[\epsilon_j\epsilon_k] - \mathbb{E}[\epsilon_0\epsilon_j]\mathbb{E}[\epsilon_i\epsilon_k] - \mathbb{E}[\epsilon_0\epsilon_k]\mathbb{E}[\epsilon_i\epsilon_j].$$

Assumption (A4) is a usual technical hypothesis which is useful when one proves the asymptotic normality (see [FZ98] for example). Let us notice however that we impose a stronger convergence speed for the mixing coefficients than in the works on weak ARMA processes. This is due to the fact that the coefficients in the AR or MA representation of $\epsilon_t(\theta)$ no longer decay exponentially because of the fractional operator (see Subsection 6.1 for details and comments).

As mentioned before, Hypothesis (A4) implies (A4'), which is also a technical assumption usually used in the fractionally integrated ARMA processes framework (see for instance [Sha10c]) or even in an ARMA context (see [FZ07, ZL15]). One remarks that in [Sha10b], the author emphasized that a geometric moment contraction implies (A4'). This provides an alternative to strong mixing assumptions but, to our knowledge, there is no relation between these two kinds of hypotheses.

The asymptotic properties of the LSE of the weak FARIMA model are stated in the following two theorems.

Theorem 1. (Consistency). Assume that $(\epsilon_t)_{t\in\mathbb{Z}}$ satisfies (1) and belongs to $\mathbb{L}^2$. Let $(\hat\theta_n)_n$ be a sequence of least squares estimators. Under Assumptions (A0), (A1) and (A2), we have

$$\hat\theta_n \xrightarrow[n\to\infty]{\text{a.s.}} \theta_0.$$

The proof of this theorem is given in Subsection 6.2.

In order to state our asymptotic normality result, we define the function

$$O_n(\theta) = \frac{1}{n}\sum_{t=1}^{n}\epsilon_t^{2}(\theta), \tag{5}$$

where the sequence $(\epsilon_t(\theta))_{t\in\mathbb{Z}}$ is given by (2). We consider the following information matrices

$$I(\theta_0) = \lim_{n\to\infty}\operatorname{Var}\left\{\sqrt{n}\,\frac{\partial}{\partial\theta}O_n(\theta_0)\right\} \quad\text{and}\quad J(\theta_0) = \lim_{n\to\infty}\left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}O_n(\theta_0)\right] \text{ a.s.}$$

The existence of these matrices is proved when one demonstrates the following result.

Theorem 2. (Asymptotic normality). We assume that $(\epsilon_t)_{t\in\mathbb{Z}}$ satisfies (1). Under (A0)-(A3) and Assumption (A4) with $\tau = 4$, the sequence $(\sqrt{n}(\hat\theta_n - \theta_0))_{n\ge 1}$ has a limiting centered normal distribution with covariance matrix $\Omega := J^{-1}(\theta_0)\, I(\theta_0)\, J^{-1}(\theta_0)$.

The proof of this theorem is given in Subsection 6.3.

Remark 1. Hereafter (see more precisely (52)), we will be able to prove that

$$J(\theta_0) = 2\,\mathbb{E}\left[\frac{\partial}{\partial\theta}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta'}\epsilon_t(\theta_0)\right].$$

Thus the matrix $J(\theta_0)$ has the same expression in the strong and weak FARIMA cases (see Theorem 1 of [Ber95]). On the contrary, the matrix $I(\theta_0)$ is in general much more complicated in the weak case than in the strong case.

Remark 2. In the standard strong FARIMA case, i.e. when (A2) is replaced by the assumption that $(\epsilon_t)$ is iid, we have $I(\theta_0) = 2\sigma_\epsilon^2 J(\theta_0)$. Thus the asymptotic covariance matrix reduces to $\Omega_S := 2\sigma_\epsilon^2 J^{-1}(\theta_0)$. Generally, when the noise is not an independent sequence, this simplification cannot be made and we have $I(\theta_0) \neq 2\sigma_\epsilon^2 J(\theta_0)$. The true asymptotic covariance matrix $\Omega = J^{-1}(\theta_0) I(\theta_0) J^{-1}(\theta_0)$ obtained in the weak FARIMA framework can be very different from $\Omega_S$. As a consequence, for the statistical inference on the parameter, the ready-made software packages used to fit FARIMA models do not provide a correct estimation of $\Omega$ for weak FARIMA processes, because standard time series analysis software uses empirical estimators of $\Omega_S$. The problem also holds in the weak ARMA case (see [FZ07] and the references therein). This is why it is interesting to find an estimator of $\Omega$ which is consistent for both weak and (semi-)strong FARIMA cases.

Based on the above remark, the next section deals with two different methods to find an estimator of $\Omega$.
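The simplification in the strong case can be traced through a short computation. The lines below are a sketch of the standard argument rather than a proof taken from the paper; they use only that $\partial\epsilon_t(\theta_0)/\partial\theta$ is a function of the past values $\{\epsilon_s, s < t\}$ and that $I(\theta_0)$ is the sum of the autocovariances of the score terms $2\epsilon_t\,\partial\epsilon_t(\theta_0)/\partial\theta$.

```latex
\begin{aligned}
I(\theta_0)
 &= 4\sum_{h=-\infty}^{\infty}
    \mathbb{E}\!\left[\epsilon_t\,\epsilon_{t-h}\,
    \frac{\partial\epsilon_t(\theta_0)}{\partial\theta}\,
    \frac{\partial\epsilon_{t-h}(\theta_0)}{\partial\theta'}\right] \\
 &= 4\,\mathbb{E}\!\left[\epsilon_t^{2}\,
    \frac{\partial\epsilon_t(\theta_0)}{\partial\theta}\,
    \frac{\partial\epsilon_t(\theta_0)}{\partial\theta'}\right]
    && \text{(iid noise: for } h \neq 0 \text{ the most recent innovation factors out with mean } 0\text{)} \\
 &= 4\,\sigma_\epsilon^{2}\,
    \mathbb{E}\!\left[\frac{\partial\epsilon_t(\theta_0)}{\partial\theta}\,
    \frac{\partial\epsilon_t(\theta_0)}{\partial\theta'}\right]
    && \text{(} \epsilon_t \text{ is independent of its own past)} \\
 &= 2\,\sigma_\epsilon^{2}\,J(\theta_0),
 \qquad\text{hence}\qquad
 \Omega = J^{-1}(\theta_0)\,\bigl(2\sigma_\epsilon^{2}J(\theta_0)\bigr)\,J^{-1}(\theta_0)
        = 2\sigma_\epsilon^{2}\,J^{-1}(\theta_0) = \Omega_S .
\end{aligned}
```

When the noise is only uncorrelated, neither of the two middle steps is available, which is why the cross terms with $h \neq 0$ and the fourth-order structure of $\epsilon$ remain in $I(\theta_0)$.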

3. Estimating the asymptotic variance matrix

For statistical inference, the asymptotic variance $\Omega$ has to be estimated. In particular Theorem 2 can be used to obtain confidence intervals and significance tests for the parameters.

First of all, the matrix $J(\theta_0)$ can be estimated empirically by the square matrix $\hat J_n$ of order $p+q+1$ defined by:

$$\hat J_n = \frac{2}{n}\sum_{t=1}^{n}\left\{\frac{\partial}{\partial\theta}\tilde\epsilon_t(\hat\theta_n)\right\}\left\{\frac{\partial}{\partial\theta'}\tilde\epsilon_t(\hat\theta_n)\right\}. \tag{6}$$

The convergence of $\hat J_n$ to $J(\theta_0)$ is classical (see Lemma 8 in Subsection 6.3 for details).

In the standard strong FARIMA case, in view of Remark 2, we have $\hat\Omega_S := 2\hat\sigma_\epsilon^2\hat J_n^{-1}$ with $\hat\sigma_\epsilon^2 = Q_n(\hat\theta_n)$. Thus $\hat\Omega_S$ is a strongly consistent estimator of $\Omega_S$. In the general weak FARIMA case, this estimator is not consistent when $I(\theta_0) \neq 2\sigma_\epsilon^2 J(\theta_0)$. So we need a consistent estimator of $I(\theta_0)$.

For all $t \in \mathbb{Z}$, let

$$H_t(\theta) = 2\epsilon_t(\theta)\frac{\partial}{\partial\theta}\epsilon_t(\theta) = \left(2\epsilon_t(\theta)\frac{\partial}{\partial\theta_1}\epsilon_t(\theta), \dots, 2\epsilon_t(\theta)\frac{\partial}{\partial\theta_{p+q+1}}\epsilon_t(\theta)\right)'. \tag{7}$$

We shall see in the proof of Lemma 9 that

$$I(\theta_0) = \lim_{n\to\infty}\operatorname{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^{n}H_t(\theta_0)\right) = \sum_{h=-\infty}^{\infty}\operatorname{Cov}\big(H_t(\theta_0), H_{t-h}(\theta_0)\big).$$

Following the arguments developed in [BMCF12], the matrix $I(\theta_0)$ can be estimated using Berk's approach (see [Ber74]). More precisely, by interpreting $I(\theta_0)/2\pi$ as the spectral density of the stationary process $(H_t(\theta_0))_{t\in\mathbb{Z}}$ evaluated at frequency 0, we can use a parametric autoregressive estimate of the spectral density of $(H_t(\theta_0))_{t\in\mathbb{Z}}$ in order to estimate the matrix $I(\theta_0)$.

For any $\theta \in \Theta$, $H_t(\theta)$ is a measurable function of $\{\epsilon_s, s \le t\}$. The stationary process $(H_t(\theta_0))_{t\in\mathbb{Z}}$ admits the following Wold decomposition $H_t(\theta_0) = u_t + \sum_{k=1}^{\infty}\psi_k u_{t-k}$, where $(u_t)_{t\in\mathbb{Z}}$ is a $(p+q+1)$-variate weak white noise with variance matrix $\Sigma_u$.

Assume that $\Sigma_u$ is non-singular, that $\sum_{k=1}^{\infty}\|\psi_k\| < \infty$, and that $\det(I_{p+q+1} + \sum_{k=1}^{\infty}\psi_k z^k) \neq 0$ when $|z| \le 1$.
Then $(H_t(\theta_0))_{t\in\mathbb{Z}}$ admits a weak multivariate AR($\infty$) representation of the form

$$\Phi(L)H_t(\theta_0) := H_t(\theta_0) - \sum_{k=1}^{\infty}\Phi_k H_{t-k}(\theta_0) = u_t, \tag{8}$$

such that $\sum_{k=1}^{\infty}\|\Phi_k\| < \infty$ and $\det\{\Phi(z)\} \neq 0$ for all $|z| \le 1$. It is proved in [BM09, Lüt05] that one may find a constant $K$ and $0 < \rho < 1$ such that

$$\|\Phi_k\| \le K\rho^{k}. \tag{9}$$

Thanks to the previous remarks, the estimation of $I(\theta_0)$ is therefore based on the following expression

$$I(\theta_0) = \Phi^{-1}(1)\,\Sigma_u\,\Phi'^{-1}(1).$$

Consider the regression of $H_t(\theta_0)$ on $H_{t-1}(\theta_0), \dots, H_{t-r}(\theta_0)$ defined by

$$H_t(\theta_0) = \sum_{k=1}^{r}\Phi_{r,k}H_{t-k}(\theta_0) + u_{r,t}, \tag{10}$$

where $u_{r,t}$ is uncorrelated with $H_{t-1}(\theta_0), \dots, H_{t-r}(\theta_0)$. Since $H_t(\theta_0)$ is not observable, we introduce $\hat H_t \in \mathbb{R}^{p+q+1}$ obtained by replacing $\epsilon_t(\cdot)$ by $\tilde\epsilon_t(\cdot)$ and $\theta_0$ by $\hat\theta_n$ in (7):

$$\hat H_t = 2\tilde\epsilon_t(\hat\theta_n)\frac{\partial}{\partial\theta}\tilde\epsilon_t(\hat\theta_n). \tag{11}$$

Let $\hat\Phi_r(z) = I_{p+q+1} - \sum_{k=1}^{r}\hat\Phi_{r,k}z^k$, where $\hat\Phi_{r,1}, \dots, \hat\Phi_{r,r}$ denote the coefficients of the LS regression of $\hat H_t$ on $\hat H_{t-1}, \dots, \hat H_{t-r}$. Let $\hat u_{r,t}$ be the residuals of this regression and let $\hat\Sigma_{\hat u_r}$ be the empirical variance (defined in (12) below) of $\hat u_{r,1}, \dots, \hat u_{r,n}$. The LSE of $\Phi_r = (\Phi_{r,1}, \dots, \Phi_{r,r})$ and $\Sigma_{u_r} = \operatorname{Var}(u_{r,t})$ are given by

$$\hat\Phi_r = \hat\Sigma_{\hat H, \hat H_r}\hat\Sigma^{-1}_{\hat H_r} \quad\text{and}\quad \hat\Sigma_{\hat u_r} = \frac{1}{n}\sum_{t=1}^{n}\big(\hat H_t - \hat\Phi_r\hat H_{r,t}\big)\big(\hat H_t - \hat\Phi_r\hat H_{r,t}\big)', \tag{12}$$

where $\hat H_{r,t} = (\hat H'_{t-1}, \dots, \hat H'_{t-r})'$,

$$\hat\Sigma_{\hat H, \hat H_r} = \frac{1}{n}\sum_{t=1}^{n}\hat H_t\hat H'_{r,t} \quad\text{and}\quad \hat\Sigma_{\hat H_r} = \frac{1}{n}\sum_{t=1}^{n}\hat H_{r,t}\hat H'_{r,t},$$

with by convention $\hat H_t = 0$ when $t \le 0$.
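The regression (10)-(12) can be illustrated in the scalar case, where the autoregressive estimate of the long-run variance reduces to $\hat\sigma_u^2/(1 - \sum_k\hat\phi_{r,k})^2$. The sketch below is a simplified one-dimensional analogue, not the multivariate estimator of the paper: the function names are hypothetical, the input series stands in for one coordinate of $\hat H_t$, and the order $r$ is supplied by the caller rather than selected by an information criterion.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ar_long_run_variance(h, r):
    """Scalar analogue of the spectral estimator: sigma_u^2 / (1 - sum_k phi_k)^2."""
    n = len(h)

    def lagged(t, k):
        # convention of (12): the series is set to zero before the sample start
        return h[t - k] if t - k >= 0 else 0.0

    # empirical second moments entering the normal equations of the AR(r) fit
    gram = [[sum(lagged(t, i) * lagged(t, j) for t in range(n)) / n
             for j in range(1, r + 1)] for i in range(1, r + 1)]
    cross = [sum(h[t] * lagged(t, i) for t in range(n)) / n for i in range(1, r + 1)]
    phi = solve(gram, cross)
    resid = [h[t] - sum(phi[k - 1] * lagged(t, k) for k in range(1, r + 1))
             for t in range(n)]
    sigma2 = sum(u * u for u in resid) / n
    return sigma2 / (1.0 - sum(phi)) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    h, prev = [], 0.0
    for _ in range(5000):
        prev = 0.5 * prev + rng.gauss(0.0, 1.0)   # toy AR(1) series, long-run variance 4
        h.append(prev)
    print("AR(2) estimate of the long-run variance:", ar_long_run_variance(h, 2))
```

The toy AR(1) input has long-run variance $1/(1-0.5)^2 = 4$, which gives a reference value against which the printed estimate can be compared.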
We assume that $\hat\Sigma_{\hat H_r}$ is non-singular (which holds true asymptotically).

In the case of linear processes with independent innovations, Berk (see [Ber74]) has shown that the spectral density can be consistently estimated by fitting autoregressive models of order $r = r(n)$, whenever $r$ tends to infinity and $r^3/n$ tends to 0 as $n$ tends to infinity. There are differences with [Ber74]: $(H_t(\theta_0))_{t\in\mathbb{Z}}$ is multivariate, is not directly observed and is replaced by $(\hat H_t)_{t\in\mathbb{Z}}$. It is shown that this result remains valid for the multivariate linear process $(H_t(\theta_0))_{t\in\mathbb{Z}}$ with non-independent innovations (see [BMCF12, BMF11], for references in weak (multivariate) ARMA models). We will extend the results of [BMCF12] to weak FARIMA models.

The asymptotic study of the estimator of $I(\theta_0)$ using the spectral density method is given in the following theorem.

Theorem 3. We assume (A0)-(A3) and Assumption (A4') with $\tau = 8$. In addition, we assume that the process $(H_t(\theta_0))_{t\in\mathbb{Z}}$ defined in (7) admits a multivariate AR($\infty$) representation (8). Then, the spectral estimator of $I(\theta_0)$ satisfies

$$\hat I^{SP}_n := \hat\Phi_r^{-1}(1)\,\hat\Sigma_{\hat u_r}\,\hat\Phi_r'^{-1}(1) \xrightarrow[n\to\infty]{\mathbb{P}} I(\theta_0) = \Phi^{-1}(1)\,\Sigma_u\,\Phi'^{-1}(1),$$

where $r$ depends on $n$ and satisfies $\lim_{n\to\infty} r^5(n)/n^{1-2(d_2-d_1)} = 0$ (recall that $d_0 \in [d_1, d_2] \subset\, ]0, 1/2[$).

The proof of this theorem is given in Subsection 6.4.

A second method to estimate the asymptotic matrix (or rather to avoid estimating it) is proposed in the next subsection.

We have seen previously that we may obtain confidence intervals for weak FARIMA model parameters as soon as we can construct a convergent estimator of the variance matrix $I(\theta_0)$ (see Theorems 2 and 3). The parametric approach based on an autoregressive estimate of the spectral density of $(H_t(\theta_0))_{t\in\mathbb{Z}}$ that we used before has the drawback of requiring a choice of the truncation parameter $r$ in (10). This choice of the truncation order is often crucial and difficult. So the aim of this section is to avoid such a difficulty.

This section is also of interest because, to our knowledge, this approach has not been studied for weak FARIMA models. A notable exception is [Sha12], who studied this problem in a short memory case (see Assumption 1 in [Sha12], which implies that the process $X$ is short-range dependent).

We propose an alternative method to obtain confidence intervals for weak FARIMA models by avoiding the estimation of the asymptotic covariance matrix $I(\theta_0)$. It is based on a self-normalization approach used to build a statistic which depends on the true parameter $\theta_0$ and which is asymptotically distribution-free (see Theorem 1 of [Sha12] for a reference in the weak ARMA case). The idea comes from [Lob01] and has already been extended by [BMS18, KL06, Sha10c, Sha10a, Sha12] to more general frameworks.
See also [Sha15] for a review of some recent developments on the inference of time series data using the self-normalized approach.

Let us briefly explain the idea of the self-normalization. By a Taylor expansion of the function $\partial Q_n(\cdot)/\partial\theta$ around $\theta_0$, under (A3), we have

$$\sqrt{n}\,\frac{\partial}{\partial\theta}Q_n(\hat\theta_n) = \sqrt{n}\,\frac{\partial}{\partial\theta}Q_n(\theta_0) + \left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}Q_n\big(\theta^*_{n,i,j}\big)\right]\sqrt{n}\big(\hat\theta_n - \theta_0\big), \tag{13}$$

where the $\theta^*_{n,i,j}$'s are between $\hat\theta_n$ and $\theta_0$. Using the following equation

$$\sqrt{n}\left(\frac{\partial}{\partial\theta}O_n(\theta_0) - \frac{\partial}{\partial\theta}Q_n(\theta_0)\right) = \sqrt{n}\,\frac{\partial}{\partial\theta}O_n(\theta_0) + \left\{\left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}Q_n(\theta^*_{n,i,j})\right] - J(\theta_0) + J(\theta_0)\right\}\sqrt{n}(\hat\theta_n - \theta_0),$$

we shall be able to prove that (13) implies that

$$\sqrt{n}\,\frac{\partial}{\partial\theta}O_n(\theta_0) + J(\theta_0)\sqrt{n}(\hat\theta_n - \theta_0) = o_{\mathbb{P}}(1). \tag{14}$$

This is due to the following technical properties:

• the convergence in probability of $\sqrt{n}\,\partial Q_n(\theta_0)/\partial\theta$ to $\sqrt{n}\,\partial O_n(\theta_0)/\partial\theta$ (see Lemma 5 hereafter),
• the almost-sure convergence of $[\partial^2 Q_n(\theta^*_{n,i,j})/\partial\theta_i\partial\theta_j]$ to $J(\theta_0)$ (see Lemma 8 hereafter),
• the tightness of the sequence $(\sqrt{n}(\hat\theta_n - \theta_0))_n$ (see Theorem 2) and
• the existence and invertibility of the matrix $J(\theta_0)$ (see Lemma 6 hereafter).

Thus we obtain from (14) that

$$\sqrt{n}(\hat\theta_n - \theta_0) = \frac{1}{\sqrt{n}}\sum_{t=1}^{n}U_t + o_{\mathbb{P}}(1),$$

where (recall (7)) $U_t = -J^{-1}(\theta_0)H_t(\theta_0)$.

At this stage, we do not rely on the classical method that would consist in estimating the asymptotic covariance matrix $I(\theta_0)$. We rather try to apply Lemma 1 in [Lob01]. So we need to check that a functional central limit theorem holds for the process $U := (U_t)_{t\ge 1}$. To this end, we define the normalization matrix $P_{p+q+1,n}$ of $\mathbb{R}^{(p+q+1)\times(p+q+1)}$ by

$$P_{p+q+1,n} = \frac{1}{n^2}\sum_{t=1}^{n}\left(\sum_{j=1}^{t}(U_j - \bar U_n)\right)\left(\sum_{j=1}^{t}(U_j - \bar U_n)\right)', \tag{15}$$

where $\bar U_n = (1/n)\sum_{i=1}^{n}U_i$.
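The quantities $H_t$, $J$, $U_t$ and the normalizer (15) can be computed explicitly in the simplest model. The sketch below treats a FARIMA$(0, d, 0)$ at the true parameter, where $\partial\epsilon_t(\theta_0)/\partial d = \log(1-L)\epsilon_t = -\sum_{j\ge 1}\epsilon_{t-j}/j$; the sum is truncated at the start of the sample, the matrix $J(\theta_0)$ is replaced by its empirical counterpart, and the function names are hypothetical. It is meant as an illustration of the definitions, not as the procedure of the paper.

```python
import random


def score_terms(eps):
    """Return (U_t) and the empirical J for a FARIMA(0, d, 0) at the true d.

    Uses d eps_t / d d = -sum_{j>=1} eps_{t-j} / j, truncated at the sample start.
    """
    n = len(eps)
    deriv = [-sum(eps[t - j] / j for j in range(1, t + 1)) for t in range(n)]
    h = [2.0 * e * g for e, g in zip(eps, deriv)]      # H_t as in (7)
    j_hat = 2.0 * sum(g * g for g in deriv) / n        # empirical version of J
    u = [-v / j_hat for v in h]                        # U_t = -J^{-1} H_t
    return u, j_hat


def self_normalizer(u):
    """Scalar version of (15): n^{-2} times the sum of squared centered partial sums."""
    n = len(u)
    mean = sum(u) / n
    total, partial = 0.0, 0.0
    for v in u:
        partial += v - mean
        total += partial * partial
    return total / n ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    eps = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    u, j_hat = score_terms(eps)
    print("empirical J:", j_hat, "normalizer P_n:", self_normalizer(u))
```

For iid noise with unit variance the empirical $J$ should be close to $2\pi^2/6$, so that $2\sigma_\epsilon^2/J$ is close to $6/\pi^2$, the classical asymptotic variance of the memory estimator in the strong case.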
To ensure the invertibility of the normalization matrix $P_{p+q+1,n}$ (this is the result stated in the next proposition), we need the following technical assumption on the distribution of $\epsilon_t$.

(A5): The process $(\epsilon_t)_{t\in\mathbb{Z}}$ has a positive density on some neighborhood of zero.

Proposition 4. Under the assumptions of Theorem 2 and (A5), the matrix $P_{p+q+1,n}$ is almost surely non-singular.

The proof of this proposition is given in Subsection 6.5.

Let $(B_m(r))_{r\ge 0}$ be an $m$-dimensional Brownian motion starting from 0. For $m \ge 1$, we denote by $\mathcal{U}_m$ the random variable defined by:

$$\mathcal{U}_m = B'_m(1)\,V_m^{-1}\,B_m(1), \tag{16}$$

where

$$V_m = \int_0^1 \big(B_m(r) - rB_m(1)\big)\big(B_m(r) - rB_m(1)\big)'\,dr. \tag{17}$$

The critical values of $\mathcal{U}_m$ have been tabulated by [Lob01].

The following theorem states the self-normalized asymptotic distribution of the random vector $\sqrt{n}(\hat\theta_n - \theta_0)$.

Theorem 5. Under the assumptions of Theorem 2 and (A5), we have

$$n(\hat\theta_n - \theta_0)'\,P^{-1}_{p+q+1,n}\,(\hat\theta_n - \theta_0) \xrightarrow[n\to\infty]{\text{in law}} \mathcal{U}_{p+q+1}.$$

The proof of this theorem is given in Subsection 6.6.

Of course, the above theorem is useless for practical purposes because the normalization matrix $P_{p+q+1,n}$ is not observable. This gap is fixed below by replacing the matrix $P_{p+q+1,n}$ by its empirical or observable counterpart

$$\hat P_{p+q+1,n} = \frac{1}{n^2}\sum_{t=1}^{n}\left(\sum_{j=1}^{t}\Big(\hat U_j - \frac{1}{n}\sum_{k=1}^{n}\hat U_k\Big)\right)\left(\sum_{j=1}^{t}\Big(\hat U_j - \frac{1}{n}\sum_{k=1}^{n}\hat U_k\Big)\right)' \quad\text{where } \hat U_j = -\hat J_n^{-1}\hat H_j. \tag{18}$$

The above quantity is observable and we are able to state the following theorem, which is the applicable version of Theorem 5.

Theorem 6. Under the assumptions of Theorem 2 and (A5), we have

$$n(\hat\theta_n - \theta_0)'\,\hat P^{-1}_{p+q+1,n}\,(\hat\theta_n - \theta_0) \xrightarrow[n\to\infty]{\text{in law}} \mathcal{U}_{p+q+1}.$$

The proof of this theorem is given in Subsection 6.7.

At the asymptotic level $\alpha$, a joint $100(1-\alpha)\%$ confidence region for the elements of $\theta_0$ is then given by the set of values of the vector $\theta$ which satisfy the following inequality:

$$n(\hat\theta_n - \theta)'\,\hat P^{-1}_{p+q+1,n}\,(\hat\theta_n - \theta) \le \mathcal{U}_{p+q+1,\alpha},$$

where $\mathcal{U}_{p+q+1,\alpha}$ is the quantile of order $1-\alpha$ for the distribution of $\mathcal{U}_{p+q+1}$.

Corollary 7. For any $1 \le i \le p+q+1$, a $100(1-\alpha)\%$ confidence region for $\theta_0(i)$ is given by the following set:

$$\left\{ x \in \mathbb{R} ;\ n\big(\hat\theta_n(i) - x\big)^2\,\hat P^{-1}_{p+q+1,n}(i,i) \le \mathcal{U}_{1,\alpha} \right\},$$

where $\mathcal{U}_{1,\alpha}$ denotes the quantile of order $1-\alpha$ of the distribution of $\mathcal{U}_1$.

The proof of this corollary is similar to that of Theorem 6 when one restricts oneself to a one-dimensional case.
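The interval of Corollary 7 is explicit once a critical value is available. The sketch below shows the computation for one coordinate and, as an alternative to the tables of [Lob01], approximates a quantile of $\mathcal{U}_1$ by simulating (16)-(17) on a discrete grid. The function names, the grid size and the number of replications are illustrative assumptions, and the Monte Carlo quantile is only a rough approximation of the tabulated value.

```python
import random


def simulate_u1(steps, rng):
    """One draw of U_1 = B(1)^2 / int_0^1 (B(r) - r B(1))^2 dr on a grid of `steps` points."""
    dt = 1.0 / steps
    b, path = 0.0, []
    for _ in range(steps):
        b += rng.gauss(0.0, dt ** 0.5)   # Brownian increment over one grid step
        path.append(b)
    # Riemann sum of the squared Brownian bridge, as in (17) with m = 1
    v = sum((path[k] - (k + 1) * dt * b) ** 2 for k in range(steps)) * dt
    return b * b / v


def u1_quantile(level, reps=2000, steps=200, seed=0):
    """Monte Carlo approximation of the quantile of order `level` of U_1."""
    rng = random.Random(seed)
    draws = sorted(simulate_u1(steps, rng) for _ in range(reps))
    return draws[min(reps - 1, int(level * reps))]


def self_normalized_interval(theta_hat, n, p_inv_ii, crit):
    """Set of x with n * (theta_hat - x)^2 * p_inv_ii <= crit, as in Corollary 7.

    p_inv_ii is the (i, i) entry of the inverse of the observable normalizer (18).
    """
    half = (crit / (n * p_inv_ii)) ** 0.5
    return theta_hat - half, theta_hat + half


if __name__ == "__main__":
    crit = u1_quantile(0.95)
    # hypothetical inputs: estimate 0.3, sample size 2000, inverse-normalizer entry 50
    print(self_normalized_interval(0.3, 2000, 50.0, crit))
```

In practice one would use the tabulated critical values rather than the simulated ones; the simulation is included only to make the definition of $\mathcal{U}_1$ tangible.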

4. Numerical illustrations

In this section, we investigate the finite sample properties of the asymptotic results introduced in this work. To this end we use Monte Carlo experiments. The numerical illustrations of this section are made with the open source statistical software R (see R Development Core Team, 2017, or http://cran.r-project.org/).

We study numerically the behavior of the LSE for FARIMA models of the form

$$(1-L)^{d}\,(X_t - a X_{t-1}) = \epsilon_t - b\,\epsilon_{t-1}, \qquad (19)$$

where the unknown parameter is taken as $\theta_0 = (a, b, d) = (-\ldots, -\ldots, \ldots)$. First we assume that in (19) the innovation process $(\epsilon_t)_{t\in\mathbb{Z}}$ is an iid centered Gaussian process with common variance 1, which corresponds to the strong FARIMA case. In two other experiments, the innovation processes $(\epsilon_t)_{t\in\mathbb{Z}}$ in (19) are defined respectively by

$$\begin{cases} \epsilon_t = \sigma_t \eta_t \\ \sigma_t^2 = 0.04 + 0.12\,\epsilon_{t-1}^2 + 0.85\,\sigma_{t-1}^2 \end{cases} \qquad (20)$$

and

$$\epsilon_t = \eta_t^2\,\eta_{t-1}, \qquad (21)$$

where $(\eta_t)_{t\ge 1}$ is a sequence of iid centered Gaussian random variables with variance 1. Note that the innovation process in (21) is not a martingale difference, whereas the noise defined in (20) is.

We simulated $N = 1{,}000$ independent trajectories of size $n = 2{,}000$ of Model (19) in the three following cases: the strong Gaussian noise, the semi-strong noise (20) and the weak noise (21).

Figure 1, Figure 2 and Figure 3 compare the distribution of the LSE in these three contexts. The distributions of $\hat d_n$ are similar in the three cases, whereas the LSE $\hat a_n$ of $a$ is more accurate in the weak case than in the strong and semi-strong cases. The distributions of $\hat b_n$ are more accurate in the strong case than in the weak case. Note that in the weak case the distributions of $\hat b_n$ are more accurate than in the semi-strong case.

Figure 4 compares the standard estimator $\hat\Omega_S = 2\hat\sigma_\epsilon^2 \hat J_n^{-1}$ and the sandwich estimator $\hat\Omega = \hat J_n^{-1} \hat I_n^{SP} \hat J_n^{-1}$ of the LSE asymptotic variance $\Omega$. We used the spectral estimator $\hat I_n^{SP}$ defined in Theorem 3. The multivariate AR order $r$ (see (10)) is automatically selected by AIC (we use the function VARselect() of the vars R package). In the strong FARIMA case we know that the two estimators are consistent. In view of the two upper subfigures of Figure 4, it seems that the sandwich estimator is less accurate in the strong case. This is not surprising because the sandwich estimator is more robust, in the sense that it remains consistent in the semi-strong and weak FARIMA cases, contrary to the standard estimator (see the middle and bottom subfigures of Figure 4). Figure 5 (resp. Figure 6) presents a zoom of the left (resp. right) middle and bottom panels of Figure 4. It is clear that in the semi-strong or weak case $n(\hat a_n - a)^2$, $n(\hat b_n - b)^2$ and $n(\hat d_n - d)^2$ are, respectively, better estimated by $\hat J_n^{-1}\hat I_n^{SP}\hat J_n^{-1}(1,1)$, $\hat J_n^{-1}\hat I_n^{SP}\hat J_n^{-1}(2,2)$ and $\hat J_n^{-1}\hat I_n^{SP}\hat J_n^{-1}(3,3)$ (see Figure 6) than by $2\hat\sigma_\epsilon^2\hat J_n^{-1}(1,1)$, $2\hat\sigma_\epsilon^2\hat J_n^{-1}(2,2)$ and $2\hat\sigma_\epsilon^2\hat J_n^{-1}(3,3)$ (see Figure 5). The failure of the standard estimator of $\Omega$ in the weak FARIMA framework may have important consequences in terms of identification or hypothesis testing and validation.

Now we are interested in the standard confidence interval and the modified versions proposed in Subsections 3.1 and 3.2. Table 1 displays the empirical sizes in the three previous FARIMA cases. For the nominal level $\alpha = 5\%$, the empirical size over the $N = 1{,}000$ independent replications should vary between the significant limits 3.6% and 6.4% with probability 95%. For the nominal level $\alpha = 1\%$, the significant limits are 0.3% and 1.7%, and for the nominal level $\alpha = 10\%$, they are 8.1% and 11.9%. When the relative rejection frequencies are outside the significant limits, they are displayed in bold type in Table 1. For the strong FARIMA model, all the relative rejection frequencies are inside the significant limits for $n$ large. For the semi-strong FARIMA model, the relative rejection frequencies of the standard confidence interval are definitely outside the significant limits, contrary to the modified versions proposed.
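The significant limits quoted for Table 1 can be recovered with a short computation. The sketch below assumes (the text does not state the formula) that they come from the normal approximation to the binomial distribution of the empirical size, namely $\alpha \pm 1.96\sqrt{\alpha(1-\alpha)/N}$; the function name `significance_limits` is hypothetical.

```python
import math

def significance_limits(alpha, n_rep, z=1.96):
    """Approximate 95% limits for an empirical size over n_rep replications.

    Assumption: normal approximation to the binomial distribution of the
    rejection frequency, with z = 1.96 as the 97.5% Gaussian quantile.
    """
    half_width = z * math.sqrt(alpha * (1.0 - alpha) / n_rep)
    return alpha - half_width, alpha + half_width

if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        low, high = significance_limits(alpha, 1000)
        print(f"alpha = {alpha:.2f}: [{100 * low:.2f}%, {100 * high:.2f}%]")
```

Under this assumption the limits for $N = 1{,}000$ agree with the values quoted above once the lower limit is rounded down and the upper limit rounded up to one decimal place.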
For the weak FARIMA model, only the standard confidence interval of $\hat b_n$ is outside the significant limits when $n$ increases. In conclusion, Table 1 confirms the comments made concerning Figure 4.

We now consider an application to the daily returns of four stock market indices (CAC, DAX, Nikkei and S&P 500). The returns are defined by $r_t = \log(p_t/p_{t-1})$, where $p_t$ denotes the price index of the stock market index at time $t$. The observations cover the period from the starting date of each index to February 14, 2019. Figure 7 (resp. Figure 8) plots the closing prices (resp. the returns) of the four stock market indices. Figure 9 shows that the squared returns $(r_t^2)_{t\ge 1}$ are generally strongly autocorrelated.

In financial econometrics the returns are often assumed to be martingale increments. The squares of the returns often have second-order moments close to those of an ARMA(1, 1), which is compatible with a GARCH(1, 1) model for the returns (see [FZ10]). A long-range memory property of stock market return series was also largely investigated by [DGE93] (see also [BFGK13], [Pal07] and [BCT96]). The squared returns $(r_t^2)_{t\ge 1}$ have significant positive autocorrelations at least up to lag 100 (see Figure 9), which confirms the claim that stock market returns have long-term memory (see [DGE93]). In particular, the returns process $(r_t)_{t\ge 1}$ is characterized by substantially more correlation between absolute or squared returns than between the returns themselves.

Now we focus on the dynamics of the squared returns and we fit a FARIMA(1, d, 1) model to the squares of the four daily return series. Denoting by $(X_t)_{t\ge 1}$ the mean-corrected series of the squared returns, we adjust the following model:

$$(1-L)^{d}\,(X_t - a X_{t-1}) = \epsilon_t - b\,\epsilon_{t-1}.$$

Table 2 displays the LSE of the parameter $\theta = (a, b, d)$ for each series of squared daily returns.
The $p$-values of the corresponding LSE, $\hat\theta_n = (\hat a_n, \hat b_n, \hat d_n)$, are given in parentheses. The last column presents the estimated residual variance. Note that for all series, the estimated coefficients $|\hat a_n|$ and $|\hat b_n|$ are smaller than one, which is in accordance with our Assumption (A1). We also observe that for all series the estimated long-range dependence coefficients $\hat d_n$ are significant for any reasonable asymptotic level and lie inside $]0, 0.5[$. We thus think that Assumption (A3) is satisfied and that our asymptotic normality theorem can be applied. Table 3 then presents, for each series, the modified confidence interval at the asymptotic level $\alpha = 5\%$ for the parameters estimated in Table 2.
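For readers who wish to experiment with the simulation design of this section, the three innovation processes (iid Gaussian noise, the GARCH(1, 1) noise (20) and the product noise (21)) can be generated as follows. This is a minimal sketch and not the authors' R code: the function names `simulate_noises` and `sample_acf`, the burn-in length and the seed are assumptions made for illustration.

```python
import numpy as np

def simulate_noises(n, seed=0, burn=500):
    """Return (strong, semi_strong, weak) noise samples of length n.

    strong      : iid N(0, 1)
    semi_strong : GARCH(1,1) noise (20), a martingale difference
    weak        : eps_t = eta_t**2 * eta_{t-1} as in (21), uncorrelated
                  but not a martingale difference
    The burn-in (an assumption of this sketch) lets the GARCH recursion
    forget its starting value.
    """
    rng = np.random.default_rng(seed)
    total = n + burn + 1
    eta = rng.standard_normal(total)

    strong = eta[burn + 1:].copy()

    eps = np.zeros(total)
    # Start the variance recursion at the unconditional variance.
    sigma2 = np.full(total, 0.04 / (1.0 - 0.12 - 0.85))
    for t in range(1, total):
        sigma2[t] = 0.04 + 0.12 * eps[t - 1] ** 2 + 0.85 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * eta[t]
    semi_strong = eps[burn + 1:]

    weak = eta[burn + 1:] ** 2 * eta[burn:-1]
    return strong, semi_strong, weak

def sample_acf(x, lag):
    """Empirical autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[lag:], x[:-lag]) / np.dot(x, x))

if __name__ == "__main__":
    for name, e in zip(("strong", "semi-strong", "weak"),
                       simulate_noises(20000)):
        print(name, sample_acf(e, 1), sample_acf(e ** 2, 1))
```

In such a run the lag-one autocorrelation of each noise should be close to zero, while the squares of the semi-strong and weak noises remain correlated, which is the feature that separates them from the strong case.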

5. Conclusion

Taking into account the possible lack of independence of the error terms, we show in this paper that we can fit FARIMA representations of a wide class of nonlinear long memory time series. This is possible thanks to our theoretical results and it is illustrated in our real data application and simulation studies.

The standard methodology (in which the noise is supposed to be iid), in particular the significance tests on the parameters, needs however to be adapted to take into account the possible lack of independence of the error terms. A first step has been made thanks to our results on the confidence intervals. In future work, we intend to study how the existing identification (see [BM12], [BMK16]) and diagnostic checking (see [BMS18], [FRZ05]) procedures should be adapted in the presence of long-range dependence and dependent noise.

6. Proofs

In all our proofs, $K$ is a strictly positive constant that may vary from line to line.

In this subsection, we give some results on estimates of the coefficients of formal power series that will arise in our study. Some of them are well known and some others are new to our knowledge. We will make some precise comments hereafter.

We begin by recalling the following properties of power series. If for $|z| \le R$ the power series $f(z) = \sum_{i\ge 0} a_i z^i$ and $g(z) = \sum_{i\ge 0} b_i z^i$ are well defined, then $(fg)(z) = \sum_{i\ge 0} c_i z^i$ is also well defined for $|z| \le R$, with the sequence $(c_i)_{i\ge 0}$ given by $c = a * b$, where $*$ denotes the convolution product between $a$ and $b$ defined by $c_i = \sum_{k=0}^{i} a_k b_{i-k} = \sum_{k=0}^{i} a_{i-k} b_k$. We will make use of the Young inequality, which states that if $a \in \ell^p$ and $b \in \ell^q$ with $\frac1p + \frac1q = 1 + \frac1r$ and $1 \le p, q, r \le \infty$, then

$$\|a * b\|_{\ell^r} \le \|a\|_{\ell^p} \times \|b\|_{\ell^q}.$$

Now we come back to the power series that arise in our context. Recall that for the true value of the parameter,

$$a_{\theta_0}(L)(1-L)^{d_0} X_t = b_{\theta_0}(L)\,\epsilon_t. \qquad (22)$$

Thanks to the assumptions on the moving average polynomials $b_\theta$ and the autoregressive polynomials $a_\theta$, the power series $a_\theta^{-1}$ and $b_\theta^{-1}$ are well defined.

Thus the functions $\epsilon_t(\theta)$ defined in (2) can be written as

$$\epsilon_t(\theta) = b_\theta^{-1}(L)\,a_\theta(L)(1-L)^{d} X_t \qquad (23)$$
$$\phantom{\epsilon_t(\theta)} = b_\theta^{-1}(L)\,a_\theta(L)(1-L)^{d-d_0}\,a_{\theta_0}^{-1}(L)\,b_{\theta_0}(L)\,\epsilon_t \qquad (24)$$

and if we denote by $\gamma(\theta) = (\gamma_i(\theta))_{i\ge 0}$ the sequence of coefficients of the power series $b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d}$ (which is absolutely convergent at least for $|z| \le 1$), we may write for all $t \in \mathbb{Z}$:

$$\epsilon_t(\theta) = \sum_{i\ge 0} \gamma_i(\theta)\,X_{t-i}. \qquad (25)$$

In the same way, by (23) one has

$$X_t = (1-L)^{-d}\,a_\theta^{-1}(L)\,b_\theta(L)\,\epsilon_t(\theta)$$

and if we denote by $\eta(\theta) = (\eta_i(\theta))_{i\ge 0}$ the coefficients of the power series $(1-z)^{-d}\,a_\theta^{-1}(z)\,b_\theta(z)$, one has

$$X_t = \sum_{i\ge 0} \eta_i(\theta)\,\epsilon_{t-i}(\theta). \qquad (26)$$

We stress the fact that $\gamma_0(\theta) = \eta_0(\theta) = 1$ for all $\theta$.

For large $j$, [HTSC99] have shown that, uniformly in $\theta$, the sequences $\gamma(\theta)$ and $\eta(\theta)$ satisfy

$$\frac{\partial^k \gamma_j(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}} = O\!\left(j^{-1-d}\,\{\log(j)\}^k\right), \quad \text{for } k = 0, 1, 2, 3, \qquad (27)$$

and

$$\frac{\partial^k \eta_j(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}} = O\!\left(j^{-1+d}\,\{\log(j)\}^k\right), \quad \text{for } k = 0, 1, 2, 3. \qquad (28)$$

Note that, in view of (25), (26) and (27), for all $\theta \in \Theta_\delta$, $\epsilon_t(\theta)$ belongs to $L^2$, $(\epsilon_t(\theta))_{t\in\mathbb{Z}}$ is an ergodic sequence and, for all $t \in \mathbb{Z}$, the function $\epsilon_t(\cdot)$ is continuous.

One difficulty that has to be addressed is that (25) includes the infinite past $(X_{t-i})_{i\ge 0}$, whereas only a finite number of observations $(X_t)_{1\le t\le n}$ are available to compute the estimators defined in (4). The simplest solution is truncation, which amounts to setting all unobserved values equal to zero.
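Thus, for all $\theta \in \Theta$ and $1 \le t \le n$ one defines

$$\tilde\epsilon_t(\theta) = \sum_{i=0}^{t-1} \gamma_i(\theta)\,X_{t-i} = \sum_{i\ge 0} \gamma_i^t(\theta)\,X_{t-i} \qquad (29)$$

where the truncated sequence $\gamma^t(\theta) = (\gamma_i^t(\theta))_{i\ge 0}$ is defined by

$$\gamma_i^t(\theta) = \begin{cases} \gamma_i(\theta) & \text{if } 0 \le i \le t-1 \\ 0 & \text{otherwise.} \end{cases}$$

Since our assumptions are made on the noise in (1), it will be useful to express the random variable $\epsilon_t(\theta)$ and its partial derivatives with respect to $\theta$ as functions of $(\epsilon_{t-i})_{i\ge 0}$.

From (24), there exists a sequence $\lambda(\theta) = (\lambda_i(\theta))_{i\ge 0}$ such that

$$\epsilon_t(\theta) = \sum_{i=0}^{\infty} \lambda_i(\theta)\,\epsilon_{t-i} \qquad (30)$$

where the sequence $\lambda(\theta)$ is given by the coefficients of the power series $b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)$. Consequently $\lambda(\theta) = \gamma(\theta) * \eta(\theta_0)$ or, equivalently,

$$\lambda_i(\theta) = \sum_{j=0}^{i} \gamma_j(\theta)\,\eta_{i-j}(\theta_0). \qquad (31)$$

We proceed in the same way for the derivatives of $\epsilon_t(\theta)$. More precisely, for any $\theta \in \Theta$, $t \in \mathbb{Z}$ and $1 \le k, l \le p+q+1$ there exist sequences $\dot\lambda_k(\theta) = (\dot\lambda_{i,k}(\theta))_{i\ge 1}$ and $\ddot\lambda_{k,l}(\theta) = (\ddot\lambda_{i,k,l}(\theta))_{i\ge 1}$ such that

$$\frac{\partial \epsilon_t(\theta)}{\partial\theta_k} = \sum_{i=1}^{\infty} \dot\lambda_{i,k}(\theta)\,\epsilon_{t-i} \qquad (32)$$
$$\frac{\partial^2 \epsilon_t(\theta)}{\partial\theta_k\,\partial\theta_l} = \sum_{i=1}^{\infty} \ddot\lambda_{i,k,l}(\theta)\,\epsilon_{t-i}. \qquad (33)$$

Of course it holds that $\dot\lambda_k(\theta) = \frac{\partial\gamma(\theta)}{\partial\theta_k} * \eta(\theta_0)$ and $\ddot\lambda_{k,l}(\theta) = \frac{\partial^2\gamma(\theta)}{\partial\theta_k\,\partial\theta_l} * \eta(\theta_0)$.

Similarly we have

$$\tilde\epsilon_t(\theta) = \sum_{i=0}^{\infty} \lambda_i^t(\theta)\,\epsilon_{t-i}, \qquad (34)$$
$$\frac{\partial \tilde\epsilon_t(\theta)}{\partial\theta_k} = \sum_{i=1}^{\infty} \dot\lambda_{i,k}^t(\theta)\,\epsilon_{t-i}, \qquad (35)$$
$$\frac{\partial^2 \tilde\epsilon_t(\theta)}{\partial\theta_k\,\partial\theta_l} = \sum_{i=1}^{\infty} \ddot\lambda_{i,k,l}^t(\theta)\,\epsilon_{t-i}, \qquad (36)$$

where $\lambda^t(\theta) = \gamma^t(\theta) * \eta(\theta_0)$, $\dot\lambda_k^t(\theta) = \frac{\partial\gamma^t(\theta)}{\partial\theta_k} * \eta(\theta_0)$ and $\ddot\lambda_{k,l}^t(\theta) = \frac{\partial^2\gamma^t(\theta)}{\partial\theta_k\,\partial\theta_l} * \eta(\theta_0)$.

In order to handle the truncation error $\epsilon_t(\theta) - \tilde\epsilon_t(\theta)$, one needs information on the sequence $\lambda(\theta) - \lambda^t(\theta)$. This is the purpose of the following lemma.

Lemma 1.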

For $2 \le r \le \infty$, $1 \le k \le p+q+1$ and $\theta \in \Theta$, we have

$$\|\lambda(\theta) - \lambda^t(\theta)\|_{\ell^r} = O\!\left(t^{-1+\frac1r-(d-d_0)}\right) \quad\text{and}\quad \|\dot\lambda_k(\theta) - \dot\lambda_k^t(\theta)\|_{\ell^r} = O\!\left(t^{-1+\frac1r-(d-d_0)}\right).$$

Proof.

We have

$$\lambda(\theta) - \lambda^t(\theta) = \left(\gamma(\theta) - \gamma^t(\theta)\right) * \eta(\theta_0).$$

In view of (28), the sequence $\eta(\theta_0)$ belongs to $\ell^q$ for any $q > 1/(1-d_0)$. Young's inequality for convolution yields that for all $r \ge 2$

$$\|\lambda(\theta) - \lambda^t(\theta)\|_{\ell^r} \le \|\gamma(\theta) - \gamma^t(\theta)\|_{\ell^p}\,\|\eta(\theta_0)\|_{\ell^q} \qquad (37)$$

with $q = (1-(d_0+\beta))^{-1} > 1/(1-d_0)$ and $p = r/(1+r(d_0+\beta))$, for some $\beta > 0$ sufficiently small. Thus there exists $K$ such that $\|\eta(\theta_0)\|_{\ell^q} \le K$. Since for any $j \ge 0$,

$$\gamma_j(\theta) - \gamma_j^t(\theta) = \begin{cases} 0 & \text{if } 0 \le j \le t-1 \\ \gamma_j(\theta) & \text{otherwise,} \end{cases}$$

we obtain using (27) that

$$\|\lambda(\theta) - \lambda^t(\theta)\|_{\ell^r} \le K\left(\sum_{k=0}^{\infty} \left|\gamma_k(\theta) - \gamma_k^t(\theta)\right|^p\right)^{1/p} \le K\left(\sum_{k=t}^{\infty} |\gamma_k(\theta)|^p\right)^{1/p} \le K\left(\sum_{k=t}^{\infty} \frac{1}{k^{p+pd}}\right)^{1/p}$$
$$\le K\left(\int_t^{\infty} \frac{1}{x^{p+pd}}\,dx\right)^{1/p} \le K\,t^{-1-d+\frac1p} \le K\,t^{-1+\frac1r-(d-d_0)+\beta},$$

where the constant $K$ varies from line to line. The conclusion follows by letting $\beta$ tend to 0.

The second point of the lemma is shown in the same way as the first. This is because, from (27), the coefficient $\partial\gamma_j(\theta)/\partial\theta_k = O(j^{-1-d+\zeta})$ for any small enough $\zeta > 0$. The proof of the lemma is then complete.

Remark 3.

Taking $r = \infty$ in the above lemma implies that the sequence $\dot\lambda_k(\theta_0) - \dot\lambda_k^t(\theta_0)$ is bounded, and more precisely there exists $K$ such that

$$\sup_{j\ge 1} \left|\dot\lambda_{j,k}(\theta_0) - \dot\lambda_{j,k}^t(\theta_0)\right| \le \frac{K}{t} \qquad (38)$$

for any $t$ and any $1 \le k \le p+q+1$.

One shall also need the following lemmas.
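The exponent bookkeeping behind Lemma 1 and Remark 3 can be spelled out as follows. This is only a worked check of the choices of $p$ and $q$ made in the proof, written under the conditions $p \ge 1$ and $p(1+d) > 1$ that the integral bound implicitly uses; it adds no new result.

```latex
\begin{align*}
\frac{1}{q} &= 1-(d_0+\beta), \qquad
\frac{1}{p} = \frac{1+r(d_0+\beta)}{r} = \frac{1}{r}+(d_0+\beta)
\quad\Longrightarrow\quad
\frac{1}{p}+\frac{1}{q} = 1+\frac{1}{r},\\[4pt]
\left(\int_t^{\infty} x^{-p(1+d)}\,dx\right)^{1/p}
&= \bigl(p(1+d)-1\bigr)^{-1/p}\; t^{\frac{1}{p}-(1+d)}
 = K\, t^{-1+\frac{1}{r}-(d-d_0)+\beta}.
\end{align*}
```

For $r = \infty$ and $\theta = \theta_0$ the exponent reduces to $-1+\beta$, which gives the bound $K/t$ of Remark 3 once $\beta$ is sent to 0; for $r = 2$ it reduces to $-\frac12-(d-d_0)+\beta$.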

Lemma 2.

For any $2 \le r \le \infty$, $1 \le k \le p+q+1$ and $\theta \in \Theta$, there exists a constant $K$ such that we have

$$\|\dot\lambda_k^t(\theta)\|_{\ell^r} \le K.$$

Proof.

The proof follows the same arguments as the proof of Lemma 1.

Lemma 3.

There exists a constant $K$ such that we have

$$\left|\dot\lambda_{i,k}(\theta_0)\right| \le \frac{K}{i}. \qquad (39)$$

Proof.

For $1 \le k \le p+q+1$, the sequence $\dot\lambda_k(\theta) = (\dot\lambda_{i,k}(\theta))_{i\ge 1}$ is in fact the sequence of the coefficients in the power series of

$$\frac{\partial}{\partial\theta_k}\left(b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)\right).$$

Thus $\dot\lambda_{i,k}(\theta_0)$ is the $i$-th coefficient taken at $\theta = \theta_0$. There are three cases.

⋄ $k = 1, \ldots, p$: Since

$$\frac{\partial}{\partial\theta_k}\left(b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)\right) = -\,b_\theta^{-1}(z)\,z^k\,(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z),$$

we deduce that $\dot\lambda_{i,k}(\theta_0)$ is the $i$-th coefficient of $-z^k a_{\theta_0}^{-1}(z)$, which satisfies $|\dot\lambda_{i,k}(\theta_0)| \le K\rho^i$ for some $\rho < 1$ (see [FZ98] for example).

⋄ $k = p+1, \ldots, p+q$: We have

$$\frac{\partial}{\partial\theta_k}\left(b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)\right) = \left(\frac{\partial}{\partial\theta_k} b_\theta^{-1}(z)\right) a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)$$

and consequently $\dot\lambda_{i,k}(\theta_0)$ is the $i$-th coefficient of $\left(\frac{\partial}{\partial\theta_k} b_{\theta_0}^{-1}(z)\right) b_{\theta_0}(z)$, which also satisfies $|\dot\lambda_{i,k}(\theta_0)| \le K\rho^i$ (see [FZ98]).

The last case is not a consequence of the usual works on ARMA processes.

⋄ $k = p+q+1$: In this case, $\theta_k = d$ and so we have

$$\frac{\partial}{\partial\theta_k}\left(b_\theta^{-1}(z)\,a_\theta(z)(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)\right) = b_\theta^{-1}(z)\,a_\theta(z)\,\ln(1-z)\,(1-z)^{d-d_0}\,a_{\theta_0}^{-1}(z)\,b_{\theta_0}(z)$$

and consequently $\dot\lambda_{i,k}(\theta_0)$ is the $i$-th coefficient of $\ln(1-z)$, which is equal to $-1/i$.

The three above cases imply the expected result.

We can follow line by line the proof of Theorem 1 in [FZ98]. The only difference lies in the following lemma, which states that the choice of the initial values has no influence on the estimation. Its proof is completely different from the one given in [FZ98] because we do not have the same speed of convergence.
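The third case in the proof of Lemma 3, where differentiating $(1-z)^{d-d_0}$ with respect to $d$ at $d = d_0$ produces the coefficients $-1/i$ of $\ln(1-z)$, can be checked numerically. The sketch below is only an illustration: the helper names are hypothetical, and the derivative is approximated by a central finite difference rather than computed symbolically.

```python
def frac_diff_coeffs(delta, n_terms):
    """Coefficients pi_0, ..., pi_{n_terms-1} of the series (1 - z)**delta.

    Uses the standard recursion pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - delta) / j.
    """
    coeffs = [1.0]
    for j in range(1, n_terms):
        coeffs.append(coeffs[-1] * (j - 1 - delta) / j)
    return coeffs

def log_coeffs_by_finite_difference(n_terms, h=1e-6):
    """Approximate d/d(delta) of the coefficients of (1 - z)**delta at delta = 0.

    Since (1 - z)**delta = exp(delta * ln(1 - z)), this derivative is the
    coefficient sequence of ln(1 - z).
    """
    plus = frac_diff_coeffs(h, n_terms)
    minus = frac_diff_coeffs(-h, n_terms)
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]

if __name__ == "__main__":
    approx = log_coeffs_by_finite_difference(8)
    for i in range(1, 8):
        print(i, approx[i], -1.0 / i)
```

The two printed columns should agree up to the finite-difference error, which illustrates why the derivative with respect to $d$ decays only like $1/i$ while the ARMA directions decay geometrically.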

Lemma 4.

Under the assumptions of Theorem 1, we have almost surely

$$\lim_{t\to\infty}\ \sup_{\theta\in\Theta_\delta} \left|\epsilon_t(\theta) - \tilde\epsilon_t(\theta)\right| = 0. \qquad (40)$$

Proof.

From (25) and (29), for all $\theta \in \Theta_\delta$ and all $t \in \mathbb{Z}$, we have

$$\epsilon_t(\theta) - \tilde\epsilon_t(\theta) = \sum_{j\ge 0} \gamma_j(\theta)\,X_{t-j} - \sum_{j=0}^{t-1} \gamma_j(\theta)\,X_{t-j} = \sum_{j\ge t} \gamma_j(\theta)\,X_{t-j} = \sum_{k\ge 0} \gamma_{t+k}(\theta)\,X_{-k}.$$

Recall that for any sequence $(Y_n)_n$ of random variables it holds that

$$Y_n \xrightarrow[n\to\infty]{\text{a.s.}} Y \iff \sup_{k\ge n} |Y_k - Y| \xrightarrow[n\to\infty]{P} 0.$$

Hence $\sup_{\theta\in\Theta_\delta} |\epsilon_t(\theta) - \tilde\epsilon_t(\theta)|$ converges almost surely to 0 as soon as $\sup_{k\ge t}\sup_{\theta\in\Theta_\delta} |\epsilon_k(\theta) - \tilde\epsilon_k(\theta)|$ converges in probability to 0. In view of (27), for all $\beta > 0$ and for large $t$ we have

$$P\left(\sup_{k\ge t}\sup_{\theta\in\Theta_\delta} |\epsilon_k(\theta) - \tilde\epsilon_k(\theta)| > \beta\right) = P\left(\sup_{k\ge t}\sup_{\theta\in\Theta_\delta} \Big|\sum_{j\ge 0} \gamma_{k+j}(\theta)\,X_{-j}\Big| > \beta\right)$$
$$\le P\left(\sum_{j\ge 0} \sup_{k\ge t}\sup_{\theta\in\Theta_\delta} |\gamma_{k+j}(\theta)|\,|X_{-j}| > \beta\right) \le \frac{K}{\beta}\left(\sup_{t\in\mathbb{Z}} E|X_t|\right) \sum_{j\ge 0} \left(\frac{1}{t+j}\right)^{1+d_1} \le \frac{K}{\beta\,d_1\,(t-1)^{d_1}} \xrightarrow[t\to\infty]{} 0$$

and (40) is proved.

By a Taylor expansion of the function $\partial Q_n(\cdot)/\partial\theta$ around $\theta_0$ and under (A3), we have

$$\sqrt{n}\,\frac{\partial}{\partial\theta} Q_n(\hat\theta_n) = \sqrt{n}\,\frac{\partial}{\partial\theta} Q_n(\theta_0) + \left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} Q_n\!\left(\theta^*_{n,i,j}\right)\right] \sqrt{n}\left(\hat\theta_n - \theta_0\right), \qquad (41)$$

where the $\theta^*_{n,i,j}$'s are between $\hat\theta_n$ and $\theta_0$. The equation (41) can be rewritten in the form:

$$\sqrt{n}\,\frac{\partial}{\partial\theta} O_n(\theta_0) - \sqrt{n}\,\frac{\partial}{\partial\theta} Q_n(\theta_0) = \sqrt{n}\,\frac{\partial}{\partial\theta} O_n(\theta_0) + \left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} Q_n\!\left(\theta^*_{n,i,j}\right)\right] \sqrt{n}\left(\hat\theta_n - \theta_0\right). \qquad (42)$$

Under the assumptions of Theorem 2, it will be shown respectively in Lemma 5 and Lemma 8 that

$$\sqrt{n}\,\frac{\partial}{\partial\theta} O_n(\theta_0) - \sqrt{n}\,\frac{\partial}{\partial\theta} Q_n(\theta_0) = o_P(1)$$

and

$$\lim_{n\to\infty} \left\{\left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} Q_n\!\left(\theta^*_{n,i,j}\right)\right] - J(\theta_0)\right\} = 0 \quad \text{a.s.}$$

As a consequence, the asymptotic normality of $\sqrt{n}(\hat\theta_n - \theta_0)$ will follow from that of $\sqrt{n}\,\partial O_n(\theta_0)/\partial\theta$.

Lemma 5.

For $1 \le k \le p+q+1$, under the assumptions of Theorem 2, we have

$$\sqrt{n}\left(\frac{\partial}{\partial\theta_k} Q_n(\theta_0) - \frac{\partial}{\partial\theta_k} O_n(\theta_0)\right) = o_P(1). \qquad (43)$$

Proof.

Throughout this proof, $\theta = (\theta_1, \ldots, \theta_{p+q}, d)' \in \Theta_\delta$ is such that $d_0 < d \le d_2$, where $d_2$ is the upper bound of the support of the long-range dependence parameter $d_0$.

The proof is quite long so we divide it into several steps.

⋄ Step 1: preliminaries

For $1 \le k \le p+q+1$ we have

$$\sqrt{n}\,\frac{\partial}{\partial\theta_k} Q_n(\theta_0) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \tilde\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0) = \Delta^k_{n,1}(\theta) + \Delta^k_{n,2}(\theta) + \Delta^k_{n,3}(\theta) + \Delta^k_{n,4}(\theta_0) + \sqrt{n}\,\frac{\partial}{\partial\theta_k} O_n(\theta_0), \qquad (44)$$

where

$$\Delta^k_{n,1}(\theta) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \left(\tilde\epsilon_t(\theta_0) - \tilde\epsilon_t(\theta)\right) \frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0), \qquad \Delta^k_{n,2}(\theta) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \left(\tilde\epsilon_t(\theta) - \epsilon_t(\theta)\right) \frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0),$$
$$\Delta^k_{n,3}(\theta) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \left(\epsilon_t(\theta) - \epsilon_t(\theta_0)\right) \frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0), \qquad \Delta^k_{n,4}(\theta_0) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \epsilon_t(\theta_0) \left(\frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0) - \frac{\partial}{\partial\theta_k}\epsilon_t(\theta_0)\right).$$

Using (30) and (34), the fourth term $\Delta^k_{n,4}(\theta_0)$ can be rewritten in the form:

$$\Delta^k_{n,4}(\theta_0) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \sum_{j=1}^{\infty} \left\{\dot\lambda^t_{j,k}(\theta_0) - \dot\lambda_{j,k}(\theta_0)\right\} \epsilon_t\,\epsilon_{t-j}. \qquad (45)$$

Therefore, if we prove that the three sequences of random variables $(\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta))_n$, $(\Delta^k_{n,2}(\theta))_n$ and $(\Delta^k_{n,4}(\theta_0))_n$ converge in probability towards 0, then (43) will be true.

⋄ Step 2: convergence in probability of $(\Delta^k_{n,4}(\theta_0))_n$ to 0

For simplicity, in this step we denote by $\dot\lambda_{j,k}$ the coefficient $\dot\lambda_{j,k}(\theta_0)$ and by $\dot\lambda^t_{j,k}$ the coefficient $\dot\lambda^t_{j,k}(\theta_0)$. Let $\varrho(\cdot,\cdot)$ be the function defined for $1 \le t, s \le n$ by

$$\varrho(t,s) = \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \left\{\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right\} \left\{\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right\} E\left[\epsilon_t\,\epsilon_{t-j_1}\,\epsilon_s\,\epsilon_{s-j_2}\right].$$

For all $\beta > 0$, using the symmetry of the function $\varrho(t,s)$, we obtain that

$$P\left(\left|\Delta^k_{n,4}(\theta_0)\right| \ge \beta\right) \le \frac{4}{n\beta^2}\, E\left[\left(\sum_{t=1}^{n} \sum_{j_1=1}^{\infty} \left\{\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right\} \epsilon_t\,\epsilon_{t-j_1}\right)^2\right] \le \frac{4}{n\beta^2} \sum_{t=1}^{n} \sum_{s=1}^{n} \varrho(t,s) \le \frac{8}{n\beta^2} \sum_{t=1}^{n} \sum_{s=1}^{t} \varrho(t,s).$$

By the stationarity of $(\epsilon_t)_{t\in\mathbb{Z}}$, which is assumed in (A2), we have

$$E\left[\epsilon_t\,\epsilon_{t-j_1}\,\epsilon_s\,\epsilon_{s-j_2}\right] = \mathrm{cum}\left(\epsilon_0, \epsilon_{-j_1}, \epsilon_{s-t}, \epsilon_{s-t-j_2}\right) + E\left[\epsilon_0\,\epsilon_{-j_1}\right] E\left[\epsilon_{s-t}\,\epsilon_{s-t-j_2}\right] + E\left[\epsilon_0\,\epsilon_{s-t}\right] E\left[\epsilon_{-j_1}\,\epsilon_{s-t-j_2}\right] + E\left[\epsilon_0\,\epsilon_{s-t-j_2}\right] E\left[\epsilon_{-j_1}\,\epsilon_{s-t}\right].$$

Since the noise is not correlated, we deduce that $E[\epsilon_0\,\epsilon_{-j_1}] = 0$ and $E[\epsilon_0\,\epsilon_{s-t-j_2}] = 0$ for $1 \le j_1, j_2$ and $s \le t$. Consequently we obtain

$$P\left(\left|\Delta^k_{n,4}(\theta_0)\right| \ge \beta\right) \le \frac{8}{n\beta^2} \sum_{t=1}^{n} \sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sup_{j_1\ge 1} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right| \left|\mathrm{cum}\left(\epsilon_0, \epsilon_{-j_1}, \epsilon_{s-t}, \epsilon_{s-t-j_2}\right)\right|$$
$$+ \frac{8}{n\beta^2} \sum_{t=1}^{n} \sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right| \left|E\left[\epsilon_0\,\epsilon_{s-t}\right] E\left[\epsilon_{-j_1}\,\epsilon_{s-t-j_2}\right]\right|. \qquad (46)$$

If

$$\sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sup_{j_1\ge 1} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right| \left|\mathrm{cum}\left(\epsilon_0, \epsilon_{-j_1}, \epsilon_{s-t}, \epsilon_{s-t-j_2}\right)\right| \xrightarrow[t\to\infty]{} 0, \qquad (47)$$

Cesàro's Lemma implies that the first term in the right hand side of (46) tends to 0. Thanks to Lemma 1 applied with $r = \infty$ (or see Remark 3) and Assumption (A4') with $\tau = 4$, we obtain that

$$\sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sup_{j_1\ge 1} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right| \left|\mathrm{cum}\left(\epsilon_0, \epsilon_{-j_1}, \epsilon_{s-t}, \epsilon_{s-t-j_2}\right)\right| \le \frac{K}{t} \sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \left|\mathrm{cum}\left(\epsilon_0, \epsilon_{-j_1}, \epsilon_{s-t}, \epsilon_{s-t-j_2}\right)\right|$$
$$\le \frac{K}{t} \sum_{s=-\infty}^{\infty} \sum_{j_1=-\infty}^{\infty} \sum_{j_2=-\infty}^{\infty} \left|\mathrm{cum}\left(\epsilon_0, \epsilon_s, \epsilon_{j_1}, \epsilon_{j_2}\right)\right| \xrightarrow[t\to\infty]{} 0,$$

hence (47) holds true. Concerning the second term of the right hand side of the inequality (46), we have

$$\frac{8}{n\beta^2} \sum_{t=1}^{n} \sum_{s=1}^{t} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^s_{j_2,k}\right| \left|E\left[\epsilon_0\,\epsilon_{s-t}\right] E\left[\epsilon_{-j_1}\,\epsilon_{s-t-j_2}\right]\right|$$
$$= \frac{8\sigma_\epsilon^2}{n\beta^2} \sum_{t=1}^{n} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right| \left|\dot\lambda_{j_2,k} - \dot\lambda^t_{j_2,k}\right| \left|E\left[\epsilon_{-j_1}\,\epsilon_{-j_2}\right]\right| = \frac{8\sigma_\epsilon^4}{n\beta^2} \sum_{t=1}^{n} \sum_{j_1=1}^{\infty} \left|\dot\lambda_{j_1,k} - \dot\lambda^t_{j_1,k}\right|^2$$
$$= \frac{8\sigma_\epsilon^4}{n\beta^2} \sum_{t=1}^{n} \left\|\dot\lambda_k - \dot\lambda^t_k\right\|^2_{\ell^2} \le \frac{K}{\beta^2 n} \sum_{t=1}^{n} \frac{1}{t} \xrightarrow[n\to\infty]{} 0,$$

where we have used the fact that the noise is not correlated, Lemma 1 with $r = 2$ and Cesàro's Lemma. This ends Step 2.

⋄ Step 3: $(\Delta^k_{n,2}(\theta))_n$ converges in probability to 0

For all $\beta > 0$, we have

$$P\left(\left|\Delta^k_{n,2}(\theta)\right| \ge \beta\right) \le \frac{2}{\beta\sqrt{n}} \sum_{t=1}^{n} \left\|\tilde\epsilon_t(\theta) - \epsilon_t(\theta)\right\|_{L^2} \left\|\frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0)\right\|_{L^2}.$$

First, using Lemma 2, we have

$$\left\|\frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0)\right\|^2_{L^2} = E\left[\left(\sum_{i=1}^{\infty} \dot\lambda^t_{i,k}(\theta_0)\,\epsilon_{t-i}\right)^2\right] = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \dot\lambda^t_{i,k}(\theta_0)\,\dot\lambda^t_{j,k}(\theta_0)\,E\left[\epsilon_{t-i}\,\epsilon_{t-j}\right] = \sigma_\epsilon^2 \sum_{i=1}^{\infty} \left\{\dot\lambda^t_{i,k}(\theta_0)\right\}^2 \le K. \qquad (48)$$

In view of (30), (34) and (48), we may write

$$P\left(\left|\Delta^k_{n,2}(\theta)\right| \ge \beta\right) \le \frac{K}{\beta\sqrt{n}} \sum_{t=1}^{n} \left(E\left[\left(\tilde\epsilon_t(\theta) - \epsilon_t(\theta)\right)^2\right]\right)^{1/2} \le \frac{K}{\beta\sqrt{n}} \sum_{t=1}^{n} \left(\sum_{i\ge 0} \sum_{j\ge 0} \left(\lambda^t_i(\theta) - \lambda_i(\theta)\right) \left(\lambda^t_j(\theta) - \lambda_j(\theta)\right) E\left[\epsilon_{t-i}\,\epsilon_{t-j}\right]\right)^{1/2}$$
$$\le \frac{\sigma_\epsilon K}{\beta\sqrt{n}} \sum_{t=1}^{n} \left(\sum_{i\ge 0} \left(\lambda^t_i(\theta) - \lambda_i(\theta)\right)^2\right)^{1/2} \le \frac{\sigma_\epsilon K}{\beta\sqrt{n}} \sum_{t=1}^{n} \left\|\lambda(\theta) - \lambda^t(\theta)\right\|_{\ell^2}.$$

Using Lemma 1, the fact that $d > d_0$ and the fractional version of Cesàro's Lemma, we obtain

$$P\left(\left|\Delta^k_{n,2}(\theta)\right| \ge \beta\right) \le \frac{\sigma_\epsilon K}{\beta\sqrt{n}} \sum_{t=1}^{n} \frac{1}{t^{1/2+(d-d_0)}} \xrightarrow[n\to\infty]{} 0.$$

This proves the expected convergence in probability. Recall that the fractional version of Cesàro's Lemma states that for $(h_t)_t$ a sequence of positive real numbers, $\kappa > 0$ and $c \ge 0$ we have

$$\lim_{t\to\infty} h_t\,t^{1-\kappa} = |\kappa|\,c \ \Rightarrow\ \lim_{n\to\infty} \frac{1}{n^\kappa} \sum_{t=0}^{n} h_t = c.$$

⋄ Step 4: convergence of $(\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta))_n$

Note now that, for all $n \ge 1$, we have

$$\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta) = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \left\{\left(\epsilon_t(\theta) - \tilde\epsilon_t(\theta)\right) - \left(\epsilon_t(\theta_0) - \tilde\epsilon_t(\theta_0)\right)\right\} \frac{\partial}{\partial\theta_k}\tilde\epsilon_t(\theta_0).$$

By the mean value theorem, there exists $0 < c_\omega < 1$ such that

$$\left|\left(\epsilon_t(\theta) - \tilde\epsilon_t(\theta)\right) - \left(\epsilon_t(\theta_0) - \tilde\epsilon_t(\theta_0)\right)\right| \le \left\|\frac{\partial(\epsilon_t - \tilde\epsilon_t)}{\partial\theta}\left((1-c_\omega)\theta + c_\omega\theta_0\right)\right\|_{\mathbb{R}^{p+q+1}} \left\|\theta - \theta_0\right\|_{\mathbb{R}^{p+q+1}}. \qquad (49)$$

Following the same method as in the previous step we obtain

$$E\left[\left(\left(\epsilon_t(\theta) - \tilde\epsilon_t(\theta)\right) - \left(\epsilon_t(\theta_0) - \tilde\epsilon_t(\theta_0)\right)\right)^2\right] \le \left\|\theta - \theta_0\right\|^2_{\mathbb{R}^{p+q+1}} \sum_{k=1}^{p+q+1} E\left[\left|\frac{\partial(\epsilon_t - \tilde\epsilon_t)}{\partial\theta_k}\left((1-c_\omega)\theta + c_\omega\theta_0\right)\right|^2\right]$$
$$\le \left\|\theta - \theta_0\right\|^2_{\mathbb{R}^{p+q+1}} \sum_{k=1}^{p+q+1} \sup_\theta E\left[\left|\frac{\partial(\epsilon_t - \tilde\epsilon_t)}{\partial\theta_k}(\theta)\right|^2\right] \le \left\|\theta - \theta_0\right\|^2_{\mathbb{R}^{p+q+1}} \sum_{k=1}^{p+q+1} \sigma_\epsilon^2\, \sup_\theta \left\|(\dot\lambda_k - \dot\lambda^t_k)(\theta)\right\|^2_{\ell^2}$$
$$\le K \left\|\theta - \theta_0\right\|^2_{\mathbb{R}^{p+q+1}} \sup_{d;\ d_0 \le d \le d_2} \left(\frac{1}{t^{1/2+(d-d_0)}}\right)^2 \le K\,\frac{\left\|\theta - \theta_0\right\|^2_{\mathbb{R}^{p+q+1}}}{t}, \qquad (50)$$

where we have used the fact that the function $\theta \mapsto E\left[\left|\frac{\partial(\epsilon_t - \tilde\epsilon_t)}{\partial\theta_k}(\theta)\right|^2\right]$ is bounded and continuous. By (50) and (48), it follows that

$$P\left(\left|\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta)\right| \ge \beta\right) \le \frac{K}{\beta} \left\|\theta - \theta_0\right\|_{\mathbb{R}^{p+q+1}} \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \frac{1}{t^{1/2}}$$

and the fractional version of Cesàro's Lemma implies

$$\lim_{n\to\infty} P\left(\left|\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta)\right| \ge \beta\right) \le \frac{K}{\beta} \left\|\theta - \theta_0\right\|_{\mathbb{R}^{p+q+1}}. \qquad (51)$$

⋄ Step 5: end of the proof

For any $\varepsilon > 0$, we choose $\theta$ such that $\frac{K}{\beta}\left\|\theta - \theta_0\right\|_{\mathbb{R}^{p+q+1}} \le \varepsilon$. Then, from (51), there exists $n_0$ such that for all $n \ge n_0$,

$$P\left(\left|\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta)\right| \ge \beta\right) \le \varepsilon.$$

By Steps 2 and 3, one also has for $n \ge n_0$

$$P\left(\left|\Delta^k_{n,2}(\theta) + \Delta^k_{n,4}(\theta_0)\right| \ge \beta\right) \le \varepsilon.$$

Since these bounds hold for every $\beta > 0$, we obtain for all $n \ge n_0$,

$$P\left(\left|\sqrt{n}\,\frac{\partial}{\partial\theta_k} Q_n(\theta_0) - \sqrt{n}\,\frac{\partial}{\partial\theta_k} O_n(\theta_0)\right| \ge \beta\right) \le P\left(\left|\Delta^k_{n,1}(\theta) + \Delta^k_{n,3}(\theta)\right| \ge \frac{\beta}{2}\right) + P\left(\left|\Delta^k_{n,2}(\theta) + \Delta^k_{n,4}(\theta_0)\right| \ge \frac{\beta}{2}\right) \le 2\varepsilon$$

and the expected convergence in probability is proved.

We show in the following lemma the existence and invertibility of $J(\theta_0)$.

Lemma 6.

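Under the assumptions of Theorem , the matrix

$$J(\theta_0) = \lim_{n\to\infty} \left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} O_n(\theta_0)\right]$$

exists almost surely and is invertible.

Proof. For all $1 \le i, j \le p+q+1$, we have

$$\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} O_n(\theta_0) = \frac{1}{n} \sum_{t=1}^{n} \frac{\partial^2}{\partial\theta_i\,\partial\theta_j} \epsilon_t^2(\theta_0) = \frac{2}{n} \sum_{t=1}^{n} \left\{\frac{\partial}{\partial\theta_i}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta_0) + \epsilon_t(\theta_0)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta_0)\right\}.$$

Note that in view of (25), (26) and (27), the first and second order derivatives of $\epsilon_t(\cdot)$ belong to $L^2$. By using the ergodicity of $(\epsilon_t)_{t\in\mathbb{Z}}$ assumed in Assumption (A2), we deduce that

$$\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} O_n(\theta_0) \xrightarrow[n\to\infty]{\text{a.s.}} 2E\left[\frac{\partial}{\partial\theta_i}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta_0)\right] + 2E\left[\epsilon_t(\theta_0)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta_0)\right].$$

By (30) and (33), $\epsilon_t$ and $\partial\epsilon_t(\theta_0)/\partial\theta$ are uncorrelated, as well as $\epsilon_t$ and $\partial^2\epsilon_t(\theta_0)/\partial\theta\,\partial\theta'$. Thus we have

$$\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} O_n(\theta_0) \xrightarrow[n\to\infty]{\text{a.s.}} J(\theta_0)(i,j) := 2E\left[\frac{\partial}{\partial\theta_i}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta_0)\right]. \qquad (52)$$

From (30) and (39) we obtain that

$$E\left[\frac{\partial}{\partial\theta_i}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta_0)\right] = E\left[\left(\sum_{k_1\ge 1} \dot\lambda_{k_1,i}(\theta_0)\,\epsilon_{t-k_1}\right) \left(\sum_{k_2\ge 1} \dot\lambda_{k_2,j}(\theta_0)\,\epsilon_{t-k_2}\right)\right] = \sum_{k_1\ge 1} \sum_{k_2\ge 1} \dot\lambda_{k_1,i}(\theta_0)\,\dot\lambda_{k_2,j}(\theta_0)\,E\left[\epsilon_{t-k_1}\,\epsilon_{t-k_2}\right] \le K\sigma_\epsilon^2 \sum_{k_1\ge 1} \left(\frac{1}{k_1}\right)^2 < \infty.$$

Therefore $J(\theta_0)$ exists almost surely.

If the matrix $J(\theta_0)$ is not invertible, there exist real constants $c_1, \ldots, c_{p+q+1}$, not all equal to zero, such that $c' J(\theta_0)\,c = \sum_{i=1}^{p+q+1} \sum_{j=1}^{p+q+1} c_j\,J(\theta_0)(j,i)\,c_i = 0$, where $c = (c_1, \ldots, c_{p+q+1})'$. In view of (52) we obtain that

$$\sum_{i=1}^{p+q+1} \sum_{j=1}^{p+q+1} E\left[\left(c_j\,\frac{\partial\epsilon_t(\theta_0)}{\partial\theta_j}\right) \left(c_i\,\frac{\partial\epsilon_t(\theta_0)}{\partial\theta_i}\right)\right] = E\left[\left(\sum_{k=1}^{p+q+1} c_k\,\frac{\partial\epsilon_t(\theta_0)}{\partial\theta_k}\right)^2\right] = 0,$$

which implies that

$$\sum_{k=1}^{p+q+1} c_k\,\frac{\partial\epsilon_t(\theta_0)}{\partial\theta_k} = 0 \ \text{ a.s., or equivalently } \ c'\,\frac{\partial\epsilon_t(\theta_0)}{\partial\theta} = 0 \ \text{ a.s.} \qquad (53)$$

Differentiating the equation (1), we obtain that

$$c'\,\frac{\partial}{\partial\theta}\left\{a_{\theta_0}(L)(1-L)^{d_0}\right\} X_t = c'\left\{\frac{\partial}{\partial\theta} b_{\theta_0}(L)\right\} \epsilon_t + b_{\theta_0}(L)\,c'\,\frac{\partial}{\partial\theta}\epsilon_t(\theta_0)$$

and by (53) we may write that

$$c'\left(\frac{\partial}{\partial\theta}\left\{a_{\theta_0}(L)(1-L)^{d_0}\right\} X_t - \left\{\frac{\partial}{\partial\theta} b_{\theta_0}(L)\right\} \epsilon_t\right) = 0 \ \text{ a.s.}$$

It follows that (1) can therefore be rewritten in the form:

$$\left(a_{\theta_0}(L)(1-L)^{d_0} + c'\,\frac{\partial}{\partial\theta}\left\{a_{\theta_0}(L)(1-L)^{d_0}\right\}\right) X_t = \left(b_{\theta_0}(L) + c'\,\frac{\partial}{\partial\theta} b_{\theta_0}(L)\right) \epsilon_t, \ \text{ a.s.}$$

Under Assumption (A1) the representation in (1) is unique (see [Hos81]), so

$$c'\,\frac{\partial}{\partial\theta}\left\{a_{\theta_0}(L)(1-L)^{d_0}\right\} = 0 \qquad (54)$$

and

$$c'\,\frac{\partial}{\partial\theta} b_{\theta_0}(L) = 0. \qquad (55)$$

First, (55) implies that

$$\sum_{k=p+1}^{p+q} c_k\,\frac{\partial}{\partial\theta_k} b_{\theta_0}(L) = \sum_{k=p+1}^{p+q} -c_k\,L^{k-p} = 0$$

and thus $c_k = 0$ for $p+1 \le k \le p+q$.

Similarly, (54) yields that

$$\sum_{k=1}^{p} c_k\,\frac{\partial}{\partial\theta_k} a_{\theta_0}(L)(1-L)^{d_0} + c_{p+q+1}\,a_{\theta_0}(L)\,\frac{\partial(1-L)^{d}}{\partial d}(d_0) = 0.$$

Since $\partial(1-L)^{d}/\partial d = (1-L)^{d}\ln(1-L)$, it follows that

$$-\sum_{k=1}^{p} c_k L^k + c_{p+q+1} \sum_{k\ge 1} e_k L^k = 0,$$

where the sequence $(e_k)_{k\ge 1}$ is given by the coefficients of the power series $a_{\theta_0}(L)\ln(1-L)$. Since $e_0 = 0$ and $e_1 = -1$, we obtain that

$$\begin{cases} c_1 = -c_{p+q+1} \\ c_k = e_k\,c_{p+q+1} & \text{for } k = 2, \ldots, p \\ 0 = e_k\,c_{p+q+1} & \text{for } k \ge p+1. \end{cases}$$

Since the polynomial $a_{\theta_0}$ is not the null polynomial, this implies that $c_{p+q+1} = 0$ and then $c_k = 0$ for $1 \le k \le p$. Thus $c = 0$, which leads us to a contradiction. Hence $J(\theta_0)$ is invertible.

Lemma 7.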

For any $1 \le i, j \le p+q+1$ and under the assumptions of Theorem 1, we have almost surely

$$\lim_{t\to\infty}\ \sup_{\theta\in\Theta_\delta} \left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta) - \frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\right| = 0 \quad\text{and}\quad \lim_{t\to\infty}\ \sup_{\theta\in\Theta_\delta} \left|\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta) - \frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\tilde\epsilon_t(\theta)\right| = 0. \qquad (56)$$

Proof.

The proof uses the same arguments as the proof of Lemma 4, so it is omitted.
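The truncation arguments of Lemmas 4 and 7 rest on the fact that the tail of the coefficient sequence $\gamma(\theta)$ is summable and decays like a power of $t$. A minimal numerical illustration is given below for the pure fractional case FARIMA(0, d, 0), where $\gamma_j$ is the $j$-th coefficient of $(1-z)^d$. The function names and the choice $d = 0.4$ are assumptions of this sketch, and the limiting constant $1/\Gamma(1-d)$ is the classical asymptotic for binomial coefficients rather than a statement taken from the paper.

```python
import math

def frac_diff_coeffs(d, n_terms):
    """Coefficients pi_0, ..., pi_{n_terms-1} of the series (1 - z)**d."""
    coeffs = [1.0]
    for j in range(1, n_terms):
        coeffs.append(coeffs[-1] * (j - 1 - d) / j)
    return coeffs

def tail_abs_sum(d, t):
    """Sum of |pi_j| over j >= t, for 0 < d < 1.

    For 0 < d < 1 all pi_j with j >= 1 are negative and the full series sums
    to (1 - 1)**d = 0, so the tail of absolute values equals the partial sum
    pi_0 + ... + pi_{t-1}.
    """
    return sum(frac_diff_coeffs(d, t))

if __name__ == "__main__":
    d = 0.4
    for t in (10, 100, 1000, 10000):
        # The rescaled tail should stabilise near 1 / Gamma(1 - d).
        print(t, tail_abs_sum(d, t) * t ** d, 1.0 / math.gamma(1.0 - d))
```

The rescaled tail stabilises as $t$ grows, which is the $t^{-d}$ decay that makes the effect of the unobserved past vanish, although more slowly than the geometric rate available for ARMA models.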

Lemma 8.

For any $1 \le i, j \le p+q+1$ and under the assumptions of Theorem 1, we have almost surely

$$\lim_{n\to\infty} \left\{\left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} Q_n\!\left(\theta^*_{n,i,j}\right)\right] - J(\theta_0)\right\} = 0 \qquad (57)$$

where $\theta^*_{n,i,j}$ is defined in (41).

Proof. For any $\theta \in \Theta_\delta$, let

$$J_n(\theta) = \frac{\partial^2}{\partial\theta\,\partial\theta'} Q_n(\theta) = \frac{2}{n} \sum_{t=1}^{n} \left\{\frac{\partial}{\partial\theta}\tilde\epsilon_t(\theta)\right\} \left\{\frac{\partial}{\partial\theta'}\tilde\epsilon_t(\theta)\right\} + \frac{2}{n} \sum_{t=1}^{n} \tilde\epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta\,\partial\theta'}\tilde\epsilon_t(\theta),$$

and

$$J^*_n(\theta) = \frac{\partial^2}{\partial\theta\,\partial\theta'} O_n(\theta) = \frac{2}{n} \sum_{t=1}^{n} \left\{\frac{\partial}{\partial\theta}\epsilon_t(\theta)\right\} \left\{\frac{\partial}{\partial\theta'}\epsilon_t(\theta)\right\} + \frac{2}{n} \sum_{t=1}^{n} \epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta\,\partial\theta'}\epsilon_t(\theta).$$

We have

$$\left|\frac{\partial^2}{\partial\theta_i\,\partial\theta_j} Q_n\!\left(\theta^*_{n,i,j}\right) - J(\theta_0)(i,j)\right| \le \left|J_n(\theta^*_{n,i,j})(i,j) - J^*_n(\theta^*_{n,i,j})(i,j)\right| + \left|J^*_n(\theta^*_{n,i,j})(i,j) - J^*_n(\theta_0)(i,j)\right| + \left|J^*_n(\theta_0)(i,j) - J(\theta_0)(i,j)\right|. \qquad (58)$$

So it is enough to show that the three terms in the right hand side of (58) tend almost surely to 0 when $n$ tends to infinity. Following the same arguments as in the proof of Lemma 6 and applying the ergodic theorem, we obtain that

$$J^*_n(\theta_0) \xrightarrow[n\to\infty]{\text{a.s.}} 2E\left[\frac{\partial}{\partial\theta}\epsilon_t(\theta_0)\,\frac{\partial}{\partial\theta'}\epsilon_t(\theta_0)\right] = J(\theta_0).$$

Let us now show that the term $|J^*_n(\theta^*_{n,i,j})(i,j) - J^*_n(\theta_0)(i,j)|$ converges almost surely to 0. In view of (25) and (27), we have

$$\sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right)\right\| = \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left(\sum_{k_1, k_2\ge 1} \frac{\partial}{\partial\theta_i}\gamma_{k_1}(\theta)\,\frac{\partial}{\partial\theta_j}\gamma_{k_2}(\theta)\,X_{t-k_1} X_{t-k_2}\right)\right\|$$
$$\le \sum_{k_1, k_2\ge 1} \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\frac{\partial}{\partial\theta_i}\gamma_{k_1}(\theta)\right\| \sup_{\theta\in\Theta_\delta} \left|\frac{\partial}{\partial\theta_j}\gamma_{k_2}(\theta)\right| |X_{t-k_1}|\,|X_{t-k_2}| + \sum_{k_1, k_2\ge 1} \sup_{\theta\in\Theta_\delta} \left|\frac{\partial}{\partial\theta_i}\gamma_{k_1}(\theta)\right| \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\frac{\partial}{\partial\theta_j}\gamma_{k_2}(\theta)\right\| |X_{t-k_1}|\,|X_{t-k_2}|$$
$$\le K \sum_{k_1, k_2\ge 1} (\log(k_1))^2\,k_1^{-1-d_1}\,\log(k_2)\,k_2^{-1-d_1}\,|X_{t-k_1}|\,|X_{t-k_2}| + K \sum_{k_1, k_2\ge 1} \log(k_1)\,k_1^{-1-d_1}\,(\log(k_2))^2\,k_2^{-1-d_1}\,|X_{t-k_1}|\,|X_{t-k_2}|.$$

Consequently we obtain

$$E_{\theta_0}\left[\sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right)\right\|\right] \le K \sum_{k_1, k_2\ge 1} (\log(k_1))^2\,k_1^{-1-d_1}\,\log(k_2)\,k_2^{-1-d_1}\,\sup_{t\in\mathbb{Z}} E_{\theta_0}|X_t|^2$$
$$+ K \sum_{k_1, k_2\ge 1} \log(k_1)\,k_1^{-1-d_1}\,(\log(k_2))^2\,k_2^{-1-d_1}\,\sup_{t\in\mathbb{Z}} E_{\theta_0}|X_t|^2 \le K. \qquad (59)$$

Following the same approach used in (59), we have

$$E_{\theta_0}\left[\sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left\{\epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta)\right\}\right\|\right] < \infty. \qquad (60)$$

A Taylor expansion implies that there exists a random variable $\theta^{**}_{n,i,j}$ between $\theta^*_{n,i,j}$ and $\theta_0$ such that

$$\left|J^*_n(\theta^*_{n,i,j})(i,j) - J^*_n(\theta_0)(i,j)\right| = \left|\frac{\partial}{\partial\theta} J^*_n(\theta^{**}_{n,i,j})(i,j) \cdot (\theta^*_{n,i,j} - \theta_0)\right| \le \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta} J^*_n(\theta)(i,j)\right\| \left\|\theta^*_{n,i,j} - \theta_0\right\|$$
$$\le \frac{2}{n} \sum_{t=1}^{n} \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right)\right\| \left\|\theta^*_{n,i,j} - \theta_0\right\| + \frac{2}{n} \sum_{t=1}^{n} \sup_{\theta\in\Theta_\delta} \left\|\frac{\partial}{\partial\theta}\left\{\epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta)\right\}\right\| \left\|\theta^*_{n,i,j} - \theta_0\right\|.$$

Theorem 1, the ergodic theorem, (59) and (60) then imply that $\lim_{n\to\infty} |J^*_n(\theta^*_{n,i,j})(i,j) - J^*_n(\theta_0)(i,j)| = 0$ a.s.

To prove the almost sure convergence of the first term of the right hand side of (58), it suffices to show that

$$\frac{1}{n} \sum_{t=1}^{n} \sup_{\theta\in\Theta_\delta} \left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\,\frac{\partial}{\partial\theta_j}\epsilon_t(\theta) - \frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\,\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)\right| \quad\text{and}\quad \frac{1}{n} \sum_{t=1}^{n} \sup_{\theta\in\Theta_\delta} \left|\epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\epsilon_t(\theta) - \tilde\epsilon_t(\theta)\,\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\tilde\epsilon_t(\theta)\right|$$

converge almost surely to 0.
On the one hand, we have
\[
\begin{aligned}
&\frac{1}{n}\sum_{t=1}^n\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)-\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)\right|\\
&\quad\le\frac{1}{n}\sum_{t=1}^n\left\{\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)-\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\right|\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right|
+\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\right|\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)-\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right|\right\}\\
&\quad\le\left(\frac{1}{n}\sum_{t=1}^n\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)-\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\right|\right)^2\right)^{1/2}
\left(\frac{1}{n}\sum_{t=1}^n\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right|\right)^2\right)^{1/2}\\
&\qquad+\left(\frac{1}{n}\sum_{t=1}^n\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\right|\right)^2\right)^{1/2}
\left(\frac{1}{n}\sum_{t=1}^n\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)-\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right|\right)^2\right)^{1/2}.
\end{aligned}
\]
In view of (25) and (27) it follows that
\[
\mathbb{E}_{\theta_0}\left[\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)\right|\right)^2\right]
\le\sup_{t\in\mathbb{Z}}\mathbb{E}_{\theta_0}|X_t|^2\left(\sum_{k\ge1}\log(k)k^{-1-d_1}\right)^2<\infty.
\]
Similar computations imply that
\[
\mathbb{E}_{\theta_0}\left[\left(\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)\right|\right)^2\right]<\infty.
\]
Cesàro's Lemma, Lemma 7 and the ergodic theorem yield
\[
\frac{1}{n}\sum_{t=1}^n\sup_{\theta\in\Theta_\delta}\left|\frac{\partial}{\partial\theta_i}\epsilon_t(\theta)\frac{\partial}{\partial\theta_j}\epsilon_t(\theta)-\frac{\partial}{\partial\theta_i}\tilde\epsilon_t(\theta)\frac{\partial}{\partial\theta_j}\tilde\epsilon_t(\theta)\right|\xrightarrow[n\to\infty]{\text{a.s.}}0.
\]
On the other hand, one may similarly prove that
\[
\frac{1}{n}\sum_{t=1}^n\sup_{\theta\in\Theta_\delta}\left|\epsilon_t(\theta)\frac{\partial^2}{\partial\theta_i\partial\theta_j}\epsilon_t(\theta)-\tilde\epsilon_t(\theta)\frac{\partial^2}{\partial\theta_i\partial\theta_j}\tilde\epsilon_t(\theta)\right|\xrightarrow[n\to\infty]{\text{a.s.}}0.
\]
Thus
\[
\sup_{\theta\in\Theta_\delta}\left\|J_n(\theta)-J^*_n(\theta)\right\|\xrightarrow[n\to\infty]{\text{a.s.}}0
\]
and the lemma is proved.

The following lemma states the existence of the matrix $I(\theta_0)$.

Lemma 9.

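Under the assumptions of Theorem 2, the matrix
\[
I(\theta_0)=\lim_{n\to\infty}\mathrm{Var}\left\{\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)\right\}
\]
exists.

Proof. By the stationarity of $(H_t(\theta_0))_{t\in\mathbb{Z}}$ (recall that this process is defined in (7)), we have
\[
\mathrm{Var}\left\{\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)\right\}
=\mathrm{Var}\left\{\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)\right\}
=\frac{1}{n}\sum_{t=1}^n\sum_{s=1}^n\mathrm{Cov}\{H_t(\theta_0),H_s(\theta_0)\}
=\frac{1}{n}\sum_{h=-n+1}^{n-1}(n-|h|)\,\mathrm{Cov}\{H_t(\theta_0),H_{t-h}(\theta_0)\}.
\]
By the dominated convergence theorem, the matrix $I(\theta_0)$ exists and is given by
\[
I(\theta_0)=\sum_{h=-\infty}^{\infty}\mathrm{Cov}\{H_t(\theta_0),H_{t-h}(\theta_0)\}
\]
whenever
\[
\sum_{h=-\infty}^{\infty}\left\|\mathrm{Cov}\{H_t(\theta_0),H_{t-h}(\theta_0)\}\right\|<\infty. \tag{61}
\]
For $s\in\mathbb{Z}$ and $1\le k\le p+q+1$, we denote by $H_{s,k}(\theta_0)=2\epsilon_s(\theta_0)\frac{\partial}{\partial\theta_k}\epsilon_s(\theta_0)$ the $k$-th entry of $H_s(\theta_0)$. In view of (30) we have
\[
\begin{aligned}
\left|\mathrm{Cov}\{H_{t,i}(\theta_0),H_{t-h,j}(\theta_0)\}\right|
&=4\left|\mathrm{Cov}\left(\sum_{k_1\ge1}\dot\lambda_{k_1,i}(\theta_0)\epsilon_t\epsilon_{t-k_1},\sum_{k_2\ge1}\dot\lambda_{k_2,j}(\theta_0)\epsilon_{t-h}\epsilon_{t-h-k_2}\right)\right|\\
&\le4\sum_{k_1\ge1}\sum_{k_2\ge1}\left|\dot\lambda_{k_1,i}(\theta_0)\right|\left|\dot\lambda_{k_2,j}(\theta_0)\right|\left|\mathbb{E}\left[\epsilon_t\epsilon_{t-k_1}\epsilon_{t-h}\epsilon_{t-h-k_2}\right]\right|\\
&\le\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left|\mathbb{E}\left[\epsilon_t\epsilon_{t-k_1}\epsilon_{t-h}\epsilon_{t-h-k_2}\right]\right|,
\end{aligned}
\]
where we have used Lemma 3. It follows that
\[
\sum_{h=-\infty}^{\infty}\left|\mathrm{Cov}\{H_{t,i}(\theta_0),H_{t-h,j}(\theta_0)\}\right|
\le\sum_{h\in\mathbb{Z}\setminus\{0\}}\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left|\mathrm{cum}\left(\epsilon_t,\epsilon_{t-k_1},\epsilon_{t-h},\epsilon_{t-h-k_2}\right)\right|
+\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left|\mathbb{E}\left[\epsilon_t\epsilon_{t-k_1}\epsilon_t\epsilon_{t-k_2}\right]\right|.
\]
Thanks to the stationarity of $(\epsilon_t)_{t\in\mathbb{Z}}$ and Assumption (A4') with $\tau=4$ we deduce that
\[
\begin{aligned}
\sum_{h=-\infty}^{\infty}\left|\mathrm{Cov}\{H_{t,i}(\theta_0),H_{t-h,j}(\theta_0)\}\right|
&\le\sum_{h\in\mathbb{Z}\setminus\{0\}}\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-k_1},\epsilon_{-h},\epsilon_{-h-k_2}\right)\right|
+\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left|\mathbb{E}\left[\epsilon_0\epsilon_{-k_1}\epsilon_0\epsilon_{-k_2}\right]\right|\\
&\le K\sum_{h,k,l\in\mathbb{Z}}\left|\mathrm{cum}(\epsilon_0,\epsilon_k,\epsilon_h,\epsilon_l)\right|
+\sum_{k_1\ge1}\sum_{k_2\ge1}\frac{K}{k_1k_2}\left(\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-k_1},\epsilon_0,\epsilon_{-k_2}\right)\right|+\sigma_\epsilon^2\left|\mathbb{E}\left[\epsilon_{-k_1}\epsilon_{-k_2}\right]\right|\right)\\
&\le K\sum_{h,k,l\in\mathbb{Z}}\left|\mathrm{cum}(\epsilon_0,\epsilon_k,\epsilon_h,\epsilon_l)\right|+K\sigma_\epsilon^4\sum_{k\ge1}\left(\frac{1}{k}\right)^2\le K
\end{aligned}
\]
and we obtain the expected result.

Lemma 10.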

Under the assumptions of Theorem 2, the random vector $\sqrt{n}(\partial/\partial\theta)O_n(\theta_0)$ has a limiting normal distribution with mean 0 and covariance matrix $I(\theta_0)$.

Proof. Observe that for any $t\in\mathbb{Z}$
\[
\mathbb{E}\left[\epsilon_t\frac{\partial}{\partial\theta}\epsilon_t(\theta_0)\right]=0 \tag{62}
\]
because $\partial\epsilon_t(\theta_0)/\partial\theta$ belongs to the Hilbert space $\mathcal{H}_\epsilon(t-1)$, linearly generated by the family $(\epsilon_s)_{s\le t-1}$. Therefore we have
\[
\lim_{n\to\infty}\mathbb{E}\left[\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)\right]
=\lim_{n\to\infty}\frac{2}{\sqrt{n}}\sum_{t=1}^n\mathbb{E}\left[\epsilon_t\frac{\partial}{\partial\theta}\epsilon_t(\theta_0)\right]=0.
\]
For $i\ge1$, we denote $\Lambda_i(\theta_0)=(\dot\lambda_{i,1}(\theta_0),\dots,\dot\lambda_{i,p+q+1}(\theta_0))'$ and we introduce, for $r\ge1$,
\[
H_{t,r}(\theta_0)=2\sum_{i=1}^r\Lambda_i(\theta_0)\epsilon_t\epsilon_{t-i}
\quad\text{and}\quad
G_{t,r}(\theta_0)=2\sum_{i\ge r+1}\Lambda_i(\theta_0)\epsilon_t\epsilon_{t-i}.
\]
From (30) we have
\[
\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)=\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)+\frac{1}{\sqrt{n}}\sum_{t=1}^nG_{t,r}(\theta_0).
\]
Since $H_{t,r}(\theta_0)$ is a function of a finite number of values of the process $(\epsilon_t)_{t\in\mathbb{Z}}$, the stationary process $(H_{t,r}(\theta_0))_{t\in\mathbb{Z}}$ satisfies a mixing property (see Theorem 14.1 in [Dav94], p. 210) of the form (A4). The central limit theorem for strongly mixing processes (see [Her84]) implies that $(1/\sqrt{n})\sum_{t=1}^nH_{t,r}(\theta_0)$ has a limiting $\mathcal{N}(0,I_r(\theta_0))$ distribution with
\[
I_r(\theta_0)=\lim_{n\to\infty}\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)\right).
\]
Since $\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)$ and $\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)$ have zero expectation, we shall have
\[
\lim_{r\to\infty}\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)\right)
=\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)\right)
=\mathrm{Var}\left\{\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)\right\},
\]
as soon as
\[
\lim_{r\to\infty}\mathbb{E}\left[\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)-\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)\right\|^2\right]=0. \tag{63}
\]
As a consequence we will have $\lim_{r\to\infty}I_r(\theta_0)=I(\theta_0)$.
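The variance of a normalized sum of a stationary sequence, which defines both $I_r(\theta_0)$ and $I(\theta_0)$, can be written as a weighted sum of autocovariances with weights $(n-|h|)/n$. The following minimal Python sketch checks that identity numerically in the scalar case. The function names and the MA(1) autocovariance used as an example are illustrative assumptions, not objects taken from the paper.

```python
import numpy as np


def ma1_autocov(theta):
    """Autocovariance of a hypothetical scalar MA(1) with unit innovation variance."""
    def gamma(h):
        h = abs(int(h))
        if h == 0:
            return 1.0 + theta ** 2
        if h == 1:
            return float(theta)
        return 0.0
    return gamma


def variance_weighted_form(gamma, n):
    """(1/n) * sum over |h| < n of (n - |h|) * gamma(h), using symmetry in h."""
    total = gamma(0)
    for h in range(1, n):
        total += 2.0 * (n - h) / n * gamma(h)
    return total


def variance_double_sum(gamma, n):
    """(1/n) * sum over t, s of gamma(t - s), computed without any rearrangement."""
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    cov = np.vectorize(gamma)(lags)
    return cov.sum() / n


gamma = ma1_autocov(0.5)
v_weighted = variance_weighted_form(gamma, 50)
v_double = variance_double_sum(gamma, 50)
```

In this toy case the two forms agree, and as $n$ grows the value approaches the sum of all autocovariances, here $(1+0.5)^2 = 2.25$, which is the scalar analogue of the limit defining $I(\theta_0)$.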
The limit in (63) is obtained as follows:
\[
\begin{aligned}
\mathbb{E}\left[\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)-\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)\right\|^2_{\mathbb{R}^{p+q+1}}\right]
&=\mathbb{E}\left[\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^nG_{t,r}(\theta_0)\right\|^2_{\mathbb{R}^{p+q+1}}\right]\\
&\le\frac{4}{n}\sum_{l=1}^{p+q+1}\mathbb{E}\left[\left(\sum_{t=1}^n\sum_{k\ge r+1}\dot\lambda_{k,l}(\theta_0)\epsilon_{t-k}\epsilon_t\right)^2\right]\\
&\le\frac{4}{n}\sum_{l=1}^{p+q+1}\sum_{t=1}^n\sum_{s=1}^n\sum_{k\ge r+1}\sum_{j\ge r+1}\left|\dot\lambda_{k,l}(\theta_0)\right|\left|\dot\lambda_{j,l}(\theta_0)\right|\left|\mathbb{E}\left[\epsilon_{t-k}\epsilon_t\epsilon_{s-j}\epsilon_s\right]\right|.
\end{aligned}
\]
We use successively the stationarity, Lemma 3 and Assumption (A4') with $\tau=4$ in order to obtain that
\[
\begin{aligned}
\mathbb{E}\left[\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^nH_t(\theta_0)-\frac{1}{\sqrt{n}}\sum_{t=1}^nH_{t,r}(\theta_0)\right\|^2_{\mathbb{R}^{p+q+1}}\right]
&\le\frac{4}{n}\sum_{l=1}^{p+q+1}\sum_{h=1-n}^{n-1}\sum_{k\ge r+1}\sum_{j\ge r+1}\left|\dot\lambda_{k,l}(\theta_0)\right|\left|\dot\lambda_{j,l}(\theta_0)\right|(n-|h|)\left|\mathbb{E}\left[\epsilon_{t-k}\epsilon_t\epsilon_{t-h-j}\epsilon_{t-h}\right]\right|\\
&\le4\sum_{l=1}^{p+q+1}\sum_{h=-\infty}^{\infty}\sum_{k\ge r+1}\sum_{j\ge r+1}\left|\dot\lambda_{k,l}(\theta_0)\right|\left|\dot\lambda_{j,l}(\theta_0)\right|\left|\mathbb{E}\left[\epsilon_{t-k}\epsilon_t\epsilon_{t-h-j}\epsilon_{t-h}\right]\right|\\
&\le\frac{K}{(r+1)^2}\sum_{h\ne0}\sum_{k\ge r+1}\sum_{j=-\infty}^{\infty}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-k},\epsilon_{-j},\epsilon_{-h}\right)\right|\\
&\quad+\frac{K}{(r+1)^2}\sum_{k\ge r+1}\sum_{j\ge r+1}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-k},\epsilon_{-j},\epsilon_0\right)\right|+K\sigma_\epsilon^4\sum_{k\ge r+1}\left(\frac{1}{k}\right)^2
\end{aligned}
\]
and we obtain the convergence stated in (63) when $r\to\infty$.

Using Theorem 7.7.1 and Corollary 7.7.1 of Anderson (see [And71], pages 425-426), the lemma is proved once we have, uniformly in $n$,
\[
\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nG_{t,r}(\theta_0)\right)\xrightarrow[r\to\infty]{}0.
\]
Arguing as before we may write
\[
\begin{aligned}
\left[\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nG_{t,r}(\theta_0)\right)\right]_{ij}
&=\left[\mathrm{Var}\left(\frac{2}{\sqrt{n}}\sum_{t=1}^n\sum_{k\ge r+1}\Lambda_k(\theta_0)\epsilon_{t-k}\epsilon_t\right)\right]_{ij}\\
&=\frac{4}{n}\sum_{t=1}^n\sum_{s=1}^n\sum_{k_1\ge r+1}\sum_{k_2\ge r+1}\dot\lambda_{k_1,i}(\theta_0)\dot\lambda_{k_2,j}(\theta_0)\mathbb{E}\left[\epsilon_{t-k_1}\epsilon_t\epsilon_{s-k_2}\epsilon_s\right]\\
&\le4\sum_{h=-\infty}^{\infty}\sum_{k_1,k_2\ge r+1}\left|\dot\lambda_{k_1,i}(\theta_0)\dot\lambda_{k_2,j}(\theta_0)\right|\left|\mathbb{E}\left[\epsilon_{t-k_1}\epsilon_t\epsilon_{t-h-k_2}\epsilon_{t-h}\right]\right|,
\end{aligned}
\]
and we obtain that
\[
\sup_n\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^nG_{t,r}(\theta_0)\right)\xrightarrow[r\to\infty]{}0, \tag{64}
\]
which completes the proof.

Now we can end this quite long proof of the asymptotic normality result.

Proof of Theorem 2

In view of Lemma 5, the equation (42) can be rewritten in the form:
\[
o_{\mathbb{P}}(1)=\sqrt{n}\frac{\partial}{\partial\theta}O_n(\theta_0)+\left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}Q_n\big(\theta^*_{n,i,j}\big)\right]\sqrt{n}\left(\hat\theta_n-\theta_0\right).
\]
From Lemma 10, $\left[(\partial^2/\partial\theta_i\partial\theta_j)Q_n(\theta^*_{n,i,j})\right]\sqrt{n}(\hat\theta_n-\theta_0)$ converges in distribution to $\mathcal{N}(0,I(\theta_0))$. Using Lemma 8 and Slutsky's theorem we deduce that
\[
\left(\left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}Q_n\big(\theta^*_{n,i,j}\big)\right],\left[\frac{\partial^2}{\partial\theta_i\partial\theta_j}Q_n\big(\theta^*_{n,i,j}\big)\right]\sqrt{n}(\hat\theta_n-\theta_0)\right)
\]
converges in distribution to $(J(\theta_0),Z)$ with $\mathbb{P}_Z=\mathcal{N}(0,I(\theta_0))$. Consider now the function $h:\mathbb{R}^{(p+q+1)\times(p+q+1)}\times\mathbb{R}^{p+q+1}\to\mathbb{R}^{p+q+1}$ that maps $(A,X)$ to $A^{-1}X$. If $D_h$ denotes the set of discontinuity points of $h$, we have $\mathbb{P}((J(\theta_0),Z)\in D_h)=0$. By the continuous mapping theorem,
\[
h\left(\left[(\partial^2/\partial\theta_i\partial\theta_j)Q_n(\theta^*_{n,i,j})\right],\left[(\partial^2/\partial\theta_i\partial\theta_j)Q_n(\theta^*_{n,i,j})\right]\sqrt{n}(\hat\theta_n-\theta_0)\right)
\]
converges in distribution to $h(J(\theta_0),Z)$ and thus $\sqrt{n}(\hat\theta_n-\theta_0)$ has a limiting normal distribution with mean 0 and covariance matrix $J^{-1}(\theta_0)I(\theta_0)J^{-1}(\theta_0)$. The proof of Theorem 2 is then completed.

We show in this section the convergence in probability of $\hat\Omega:=\hat J_n^{-1}\hat I^{SP}_n\hat J_n^{-1}$ to $\Omega$, which is an adaptation of the arguments used in [BMCF12].

Using the same approach as that followed in Lemma 8, we show that $\hat J_n$ converges almost surely to $J$. We give below the proof of the convergence in probability of the estimator $\hat I^{SP}_n$, obtained using the approach of the spectral density, towards $I$.

We recall that the matrix norm used is given by $\|A\|=\sup_{\|x\|\le1}\|Ax\|=\rho^{1/2}(A'A)$, when $A$ is a $\mathbb{R}^{k_1\times k_2}$ matrix, $\|x\|^2=x'x$ is the Euclidean norm of the vector $x\in\mathbb{R}^{k_2}$, and $\rho(\cdot)$ denotes the spectral radius. This norm satisfies
\[
\|A\|^2\le\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}a_{i,j}^2, \tag{65}
\]
with $a_{i,j}$ the entries of $A\in\mathbb{R}^{k_1\times k_2}$. The choice of the norm is crucial for the following results to hold (with e.g. the Euclidean norm, this result is not valid).

We denote
\[
\Sigma_{H,H_r}=\mathbb{E}H_tH_{r,t}',\qquad\Sigma_H=\mathbb{E}H_tH_t',\qquad\Sigma_{H_r}=\mathbb{E}H_{r,t}H_{r,t}',
\]
where $H_t:=H_t(\theta_0)$ is defined in (7) and $H_{r,t}=(H'_{t-1},\dots,H'_{t-r})'$. For any $n\ge1$, we have
\[
\begin{aligned}
\hat I^{SP}_n&=\hat\Phi_r^{-1}(1)\hat\Sigma_{\hat u_r}\hat\Phi_r'^{-1}(1)\\
&=\left(\hat\Phi_r^{-1}(1)-\Phi^{-1}(1)\right)\hat\Sigma_{\hat u_r}\hat\Phi_r'^{-1}(1)+\Phi^{-1}(1)\left(\hat\Sigma_{\hat u_r}-\Sigma_u\right)\hat\Phi_r'^{-1}(1)\\
&\quad+\Phi^{-1}(1)\Sigma_u\left(\hat\Phi_r'^{-1}(1)-\Phi'^{-1}(1)\right)+\Phi^{-1}(1)\Sigma_u\Phi'^{-1}(1).
\end{aligned}
\]
We then obtain
\[
\begin{aligned}
\left\|\hat I^{SP}_n-I(\theta_0)\right\|
&\le\left\|\hat\Phi_r^{-1}(1)-\Phi^{-1}(1)\right\|\left\|\hat\Sigma_{\hat u_r}\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|
+\left\|\Phi^{-1}(1)\right\|\left\|\hat\Sigma_{\hat u_r}-\Sigma_u\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|\\
&\quad+\left\|\Phi^{-1}(1)\right\|\left\|\Sigma_u\right\|\left\|\hat\Phi_r'^{-1}(1)-\Phi'^{-1}(1)\right\|\\
&\le\left\|\hat\Phi_r^{-1}(1)-\Phi^{-1}(1)\right\|\left(\left\|\hat\Sigma_{\hat u_r}\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|+\left\|\Phi^{-1}(1)\right\|\left\|\Sigma_u\right\|\right)
+\left\|\hat\Sigma_{\hat u_r}-\Sigma_u\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|\left\|\Phi^{-1}(1)\right\|\\
&\le\left\|\hat\Phi_r^{-1}(1)\right\|\left\|\Phi(1)-\hat\Phi_r(1)\right\|\left\|\Phi^{-1}(1)\right\|\left(\left\|\hat\Sigma_{\hat u_r}\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|+\left\|\Phi^{-1}(1)\right\|\left\|\Sigma_u\right\|\right)\\
&\quad+\left\|\hat\Sigma_{\hat u_r}-\Sigma_u\right\|\left\|\hat\Phi_r'^{-1}(1)\right\|\left\|\Phi^{-1}(1)\right\|.
\end{aligned}
\tag{66}
\]
In view of (66), to prove the convergence in probability of $\hat I^{SP}_n$ to $I(\theta_0)$, it suffices to show that $\hat\Phi_r(1)\to\Phi(1)$ and $\hat\Sigma_{\hat u_r}\to\Sigma_u$ in probability. Let the $r\times1$ vector $\mathbf{1}_r=(1,\dots,1)'$ and the $r(p+q+1)\times(p+q+1)$ matrix $E_r=I_{p+q+1}\otimes\mathbf{1}_r$, where $\otimes$ denotes the matrix Kronecker product and $I_m$ the $m\times m$ identity matrix. Write $\Phi^*_r=(\Phi_1,\dots,\Phi_r)$ where the $\Phi_i$'s are defined by (8). We have
\[
\begin{aligned}
\left\|\hat\Phi_r(1)-\Phi(1)\right\|
&=\left\|\sum_{k=1}^r\hat\Phi_{r,k}-\sum_{k=1}^r\Phi_{r,k}+\sum_{k=1}^r\Phi_{r,k}-\sum_{k=1}^{\infty}\Phi_k\right\|\\
&\le\left\|\sum_{k=1}^r\left(\hat\Phi_{r,k}-\Phi_{r,k}\right)\right\|+\left\|\sum_{k=1}^r\left(\Phi_{r,k}-\Phi_k\right)\right\|+\left\|\sum_{k=r+1}^{\infty}\Phi_k\right\|\\
&\le\left\|\left(\hat\Phi_r-\Phi_r\right)E_r\right\|+\left\|\left(\Phi^*_r-\Phi_r\right)E_r\right\|+\left\|\sum_{k=r+1}^{\infty}\Phi_k\right\|\\
&\le\sqrt{p+q+1}\,\sqrt{r}\left(\left\|\hat\Phi_r-\Phi_r\right\|+\left\|\Phi^*_r-\Phi_r\right\|\right)+\left\|\sum_{k=r+1}^{\infty}\Phi_k\right\|.
\end{aligned}
\tag{67}
\]
Under the assumptions of Theorem 3 (see (9)) we have
\[
\left\|\sum_{k=r+1}^{\infty}\Phi_k\right\|\le\sum_{k=r+1}^{\infty}\|\Phi_k\|\le K\sum_{k=r+1}^{\infty}\rho^k\xrightarrow[n\to\infty]{\text{a.s.}}0.
\]
Therefore it is enough to show that $\sqrt{r}\|\hat\Phi_r-\Phi_r\|$ and $\sqrt{r}\|\Phi^*_r-\Phi_r\|$ converge in probability towards 0 in order to obtain the convergence in probability of $\hat\Phi_r(1)$ towards $\Phi(1)$.
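To make the objects $\hat\Phi_r(1)$ and $\hat\Sigma_{\hat u_r}$ concrete, the following Python sketch fits a VAR($r$) by least squares to a generic centered series and assembles a quantity of the form $\hat\Phi_r^{-1}(1)\hat\Sigma_{\hat u_r}\hat\Phi_r'^{-1}(1)$. It is only an illustration under stated assumptions: the function name is hypothetical, the convention $\Phi(1)=I-\sum_k\Phi_k$ is assumed rather than taken from the text above, the residual covariance is normalized by $n-r$ instead of $n$, and the input is a directly observed series rather than the estimated $\hat H_t$ of the paper.

```python
import numpy as np


def spectral_long_run_variance(H, r):
    """Sketch of a VAR(r)-based long-run variance estimate (hypothetical helper).

    H : array of shape (n,) or (n, m), assumed centered.
    r : autoregressive order used in the least squares fit.
    """
    H = np.asarray(H, dtype=float)
    if H.ndim == 1:
        H = H[:, None]
    n, m = H.shape

    # Regress H_t on (H_{t-1}, ..., H_{t-r}) for t = r, ..., n-1.
    Y = H[r:]
    X = np.hstack([H[r - k: n - k] for k in range(1, r + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (r*m, m)

    # Residual covariance, normalized by the number of usable observations.
    resid = Y - X @ coef
    sigma_u = resid.T @ resid / (n - r)

    # Row block k of coef holds the transpose of the k-th lag matrix.
    phi_sum = np.zeros((m, m))
    for k in range(r):
        phi_sum += coef[k * m:(k + 1) * m].T

    # Assumed convention: Phi(1) = I - sum of the lag matrices.
    phi_at_one = np.eye(m) - phi_sum
    phi_inv = np.linalg.inv(phi_at_one)
    return phi_inv @ sigma_u @ phi_inv.T
```

For a scalar AR(1) with coefficient 0.5 and unit innovation variance, the target value is $1/(1-0.5)^2=4$, which gives a simple sanity check of the sketch on simulated data.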
From (10) we have
\[
H_t(\theta_0)=\Phi_rH_{r,t}(\theta_0)+u_{r,t}, \tag{68}
\]
and thus
\[
\Sigma_{u_r}=\mathrm{Var}(u_{r,t})=\mathbb{E}\left[u_{r,t}\left(H_t(\theta_0)-\Phi_rH_{r,t}(\theta_0)\right)'\right].
\]
The vector $u_{r,t}$ is orthogonal to $H_{r,t}(\theta_0)$. Therefore
\[
\mathrm{Var}(u_{r,t})=\mathbb{E}\left[\left(H_t(\theta_0)-\Phi_rH_{r,t}(\theta_0)\right)H_t'(\theta_0)\right]=\Sigma_H-\Phi_r\Sigma'_{H,H_r}.
\]
Consequently the least squares estimator of $\Sigma_{u_r}$ can be rewritten in the form:
\[
\hat\Sigma_{\hat u_r}=\hat\Sigma_{\hat H}-\hat\Phi_r\hat\Sigma'_{\hat H,\hat H_r}, \tag{69}
\]
where
\[
\hat\Sigma_{\hat H}=\frac{1}{n}\sum_{t=1}^n\hat H_t\hat H_t'. \tag{70}
\]
Similar arguments combined with (8) yield
\[
\begin{aligned}
\Sigma_u=\mathbb{E}\left[u_tu_t'\right]=\mathbb{E}\left[u_tH_t'(\theta_0)\right]
&=\mathbb{E}\left[H_t(\theta_0)H_t'(\theta_0)\right]-\sum_{k=1}^r\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]-\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\\
&=\Sigma_H-\Phi^*_r\Sigma'_{H,H_r}-\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right].
\end{aligned}
\]
By (69) we obtain
\[
\begin{aligned}
\left\|\hat\Sigma_{\hat u_r}-\Sigma_u\right\|
&=\left\|\hat\Sigma_{\hat H}-\hat\Phi_r\hat\Sigma'_{\hat H,\hat H_r}-\Sigma_H+\Phi^*_r\Sigma'_{H,H_r}+\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\right\|\\
&=\left\|\hat\Sigma_{\hat H}-\Sigma_H-\left(\hat\Phi_r-\Phi^*_r\right)\hat\Sigma'_{\hat H,\hat H_r}-\Phi^*_r\left(\hat\Sigma'_{\hat H,\hat H_r}-\Sigma'_{H,H_r}\right)+\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\right\|\\
&\le\left\|\hat\Sigma_{\hat H}-\Sigma_H\right\|+\left\|\left(\hat\Phi_r-\Phi^*_r\right)\left(\hat\Sigma'_{\hat H,\hat H_r}-\Sigma'_{H,H_r}\right)\right\|+\left\|\left(\hat\Phi_r-\Phi^*_r\right)\Sigma'_{H,H_r}\right\|\\
&\quad+\left\|\Phi^*_r\left(\hat\Sigma'_{\hat H,\hat H_r}-\Sigma'_{H,H_r}\right)\right\|+\left\|\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\right\|.
\end{aligned}
\tag{71}
\]
From Lemma 9 and the hypotheses of Theorem 3 (see (9)) we deduce that
\[
\left\|\sum_{k=r+1}^{\infty}\Phi_k\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\right\|
\le\sum_{k=r+1}^{\infty}\|\Phi_k\|\left\|\mathbb{E}\left[H_{t-k}(\theta_0)H_t'(\theta_0)\right]\right\|
\le K\sum_{k=r+1}^{\infty}\rho^k\xrightarrow[n\to\infty]{\text{a.s.}}0.
\]
Observe also that
\[
\|\Phi^*_r\|^2\le\sum_{k\ge1}\mathrm{Tr}\left(\Phi_k\Phi_k'\right)<\infty.
\]
Therefore the convergence of $\hat\Sigma_{\hat u_r}$ to $\Sigma_u$ will be a consequence of the four following properties:
• $\|\hat\Sigma_{\hat H}-\Sigma_H\|=o_{\mathbb{P}}(1)$,
• $\mathbb{P}\text{-}\lim_{n\to\infty}\|\hat\Phi_r-\Phi^*_r\|=0$,
• $\mathbb{P}\text{-}\lim_{n\to\infty}\|\hat\Sigma'_{\hat H,\hat H_r}-\Sigma'_{H,H_r}\|=0$ and
• $\|\Sigma'_{H,H_r}\|=O(1)$.

The above properties will be proved thanks to several lemmas that are stated and proved hereafter. This ends the proof of Theorem 3. For this, consider the following lemmas:

Lemma 11.

Under the assumptions of Theorem 3, we have
\[
\sup_{r\ge1}\max\left\{\left\|\Sigma_{H,H_r}\right\|,\left\|\Sigma_{H_r}\right\|,\left\|\Sigma^{-1}_{H_r}\right\|\right\}<\infty.
\]
Proof.

See Lemma 1 in the supplementary material of [BMCF12].

Lemma 12.

Under the assumptions of Theorem 3 there exists a finite positive constant $K$ such that, for $1\le r_1,r_2\le r$ and $1\le m_1,m_2\le p+q+1$, we have
\[
\sup_{t\in\mathbb{Z}}\sum_{h=-\infty}^{\infty}\left|\mathrm{Cov}\left\{H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0),H_{t-r_1-h,m_1}(\theta_0)H_{t-r_2-h,m_2}(\theta_0)\right\}\right|<K.
\]
Proof.

We denote in the sequel by $\dot\lambda_{j,k}$ the coefficient $\dot\lambda_{j,k}(\theta_0)$ defined in (31), and we write $H_{t,m}$ for $H_{t,m}(\theta_0)$.

Using the fact that the process $(H_t(\theta_0))_{t\in\mathbb{Z}}$ is centered and taking into consideration the strict stationarity of $(\epsilon_t)_{t\in\mathbb{Z}}$, we obtain that for any $t\in\mathbb{Z}$
\[
\begin{aligned}
&\sum_{h=-\infty}^{\infty}\left|\mathrm{Cov}\left(H_{t-r_1,m_1}H_{t-r_2,m_2},H_{t-r_1-h,m_1}H_{t-r_2-h,m_2}\right)\right|\\
&\quad=\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_2,m_2}H_{t-r_1-h,m_1}H_{t-r_2-h,m_2}\right]-\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_2,m_2}\right]\mathbb{E}\left[H_{t-r_1-h,m_1}H_{t-r_2-h,m_2}\right]\right|\\
&\quad\le\sum_{h=-\infty}^{\infty}\left|\mathrm{cum}\left(H_{t-r_1,m_1},H_{t-r_2,m_2},H_{t-r_1-h,m_1},H_{t-r_2-h,m_2}\right)\right|\\
&\qquad+\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_1-h,m_1}\right]\right|\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_2-h,m_2}\right]\right|
+\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_2-h,m_2}\right]\right|\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_1-h,m_1}\right]\right|\\
&\quad\le16\sum_{h=-\infty}^{\infty}\sum_{i,j,k,\ell\ge1}\left|\dot\lambda_{i,m_1}\dot\lambda_{j,m_2}\dot\lambda_{k,m_1}\dot\lambda_{\ell,m_2}\right|\left|\mathrm{cum}\left(\epsilon_0\epsilon_{-i},\epsilon_{r_1-r_2}\epsilon_{r_1-r_2-j},\epsilon_{-h}\epsilon_{-h-k},\epsilon_{r_1-r_2-h}\epsilon_{r_1-r_2-h-\ell}\right)\right|\\
&\qquad+T^{(1)}_{r_1,m_1,r_2,m_2}+T^{(2)}_{r_1,m_1,r_2,m_2},
\end{aligned}
\]
where
\[
T^{(1)}_{r_1,m_1,r_2,m_2}=\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_1-h,m_1}\right]\right|\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_2-h,m_2}\right]\right|
\]
and
\[
T^{(2)}_{r_1,m_1,r_2,m_2}=\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_2-h,m_2}\right]\right|\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_1-h,m_1}\right]\right|.
\]
Thanks to Lemma 3 one may use the product theorem for the joint cumulants ([Bri81]) as in the proof of Lemma A.3 in [Sha11] in order to obtain that
\[
\sum_{h=-\infty}^{\infty}\sum_{i,j,k,\ell\ge1}\left|\dot\lambda_{i,m_1}\dot\lambda_{j,m_2}\dot\lambda_{k,m_1}\dot\lambda_{\ell,m_2}\right|\left|\mathrm{cum}\left(\epsilon_0\epsilon_{-i},\epsilon_{r_1-r_2}\epsilon_{r_1-r_2-j},\epsilon_{-h}\epsilon_{-h-k},\epsilon_{r_1-r_2-h}\epsilon_{r_1-r_2-h-\ell}\right)\right|<\infty,
\]
where we have used the absolute summability of the $k$-th ($k=2,\dots,8$) cumulants assumed in (A4') with $\tau=8$.

Observe now that
\[
T^{(1)}_{r_1,m_1,r_2,m_2}
\le\sup_{h\in\mathbb{Z}}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_1-h,m_1}\right]\right|\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_2-h,m_2}\right]\right|.
\]
For any $h\in\mathbb{Z}$, from (30) we have
\[
\begin{aligned}
\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_1-h,m_1}\right]\right|
&\le4\sum_{i,j\ge1}\left|\dot\lambda_{i,m_1}\right|\left|\dot\lambda_{j,m_1}\right|\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-i},\epsilon_{-h},\epsilon_{-h-j}\right)\right|\\
&\quad+4\sum_{i,j\ge1}\left|\dot\lambda_{i,m_1}\right|\left|\dot\lambda_{j,m_1}\right|\Big(\left|\mathbb{E}[\epsilon_0\epsilon_{-i}]\mathbb{E}[\epsilon_{-h}\epsilon_{-h-j}]\right|+\left|\mathbb{E}[\epsilon_0\epsilon_{-h}]\mathbb{E}[\epsilon_{-i}\epsilon_{-h-j}]\right|+\left|\mathbb{E}[\epsilon_0\epsilon_{-h-j}]\mathbb{E}[\epsilon_{-i}\epsilon_{-h}]\right|\Big)\\
&\le K\sum_{i,j\ge1}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-i},\epsilon_{-h},\epsilon_{-h-j}\right)\right|+K\sigma_\epsilon^4\sum_{i\ge1}\left|\dot\lambda_{i,m_1}\right|^2.
\end{aligned}
\]
Under Assumption (A4') with $\tau=4$ and in view of Lemma 3 we may write that
\[
\sup_{h\in\mathbb{Z}}\left|\mathbb{E}\left[H_{t-r_1,m_1}H_{t-r_1-h,m_1}\right]\right|
\le K\sup_{h\in\mathbb{Z}}\sum_{i,j\ge1}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-i},\epsilon_{-h},\epsilon_{-h-j}\right)\right|+K\sigma_\epsilon^4\sum_{i\ge1}\left|\dot\lambda_{i,m_1}\right|^2<\infty.
\]
Similarly, we obtain
\[
\sum_{h=-\infty}^{\infty}\left|\mathbb{E}\left[H_{t-r_2,m_2}H_{t-r_2-h,m_2}\right]\right|
\le K\sum_{h=-\infty}^{\infty}\sum_{i,j\ge1}\left|\mathrm{cum}\left(\epsilon_0,\epsilon_{-i},\epsilon_{-h},\epsilon_{-h-j}\right)\right|+K\sigma_\epsilon^4\sum_{i\ge1}\left|\dot\lambda_{i,m_2}\right|^2<\infty.
\]
Consequently $T^{(1)}_{r_1,m_1,r_2,m_2}<\infty$ and the same approach yields that $T^{(2)}_{r_1,m_1,r_2,m_2}<\infty$, and the lemma is proved.

Let $\hat\Sigma_{H_r}$, $\hat\Sigma_H$ and $\hat\Sigma_{H,H_r}$ be the matrices obtained by replacing $\hat H_t$ by $H_t(\theta_0)$ in $\hat\Sigma_{\hat H_r}$, $\hat\Sigma_{\hat H}$ and $\hat\Sigma_{\hat H,\hat H_r}$.

Lemma 13.

Under the assumptions of Theorem 3, $\sqrt{r}\|\hat\Sigma_{H_r}-\Sigma_{H_r}\|$, $\sqrt{r}\|\hat\Sigma_{H,H_r}-\Sigma_{H,H_r}\|$ and $\sqrt{r}\|\hat\Sigma_H-\Sigma_H\|$ tend to zero in probability as $n\to\infty$ when $r=o(n^{1/3})$.

Proof. For $1\le m_1,m_2\le p+q+1$ and $1\le r_1,r_2\le r$, the $(\{(r_1-1)(p+q+1)+m_1\},\{(r_2-1)(p+q+1)+m_2\})$-th element of $\hat\Sigma_{H_r}$ is given by
\[
\frac{1}{n}\sum_{t=1}^nH_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0).
\]
For all $\beta>0$, we use (65) and we obtain
\[
\begin{aligned}
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{H_r}-\Sigma_{H_r}\right\|\ge\beta\right)
&\le\frac{r}{\beta^2}\mathbb{E}\left\|\hat\Sigma_{H_r}-\Sigma_{H_r}\right\|^2
\le\frac{r}{\beta^2}\mathbb{E}\left\|\frac{1}{n}\sum_{t=1}^nH_{r,t}H_{r,t}'-\mathbb{E}\left[H_{r,t}H_{r,t}'\right]\right\|^2\\
&\le\frac{r}{\beta^2}\sum_{r_1=1}^r\sum_{r_2=1}^r\sum_{m_1=1}^{p+q+1}\sum_{m_2=1}^{p+q+1}\mathbb{E}\left[\left(\frac{1}{n}\sum_{t=1}^nH_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0)-\mathbb{E}\left[H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0)\right]\right)^2\right].
\end{aligned}
\]
The stationarity of the process $(H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0))_t$ and Lemma 12 imply
\[
\begin{aligned}
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{H_r}-\Sigma_{H_r}\right\|\ge\beta\right)
&\le\frac{r}{\beta^2}\sum_{r_1,r_2=1}^r\sum_{m_1,m_2=1}^{p+q+1}\mathrm{Var}\left(\frac{1}{n}\sum_{t=1}^nH_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0)\right)\\
&\le\frac{r}{(n\beta)^2}\sum_{r_1,r_2=1}^r\sum_{m_1,m_2=1}^{p+q+1}\sum_{t=1}^n\sum_{s=1}^n\mathrm{Cov}\left(H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0),H_{s-r_1,m_1}(\theta_0)H_{s-r_2,m_2}(\theta_0)\right)\\
&\le\frac{r}{(n\beta)^2}\sum_{r_1,r_2=1}^r\sum_{m_1,m_2=1}^{p+q+1}\sum_{h=1-n}^{n-1}(n-|h|)\,\mathrm{Cov}\left(H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0),H_{t-h-r_1,m_1}(\theta_0)H_{t-h-r_2,m_2}(\theta_0)\right)\\
&\le\frac{r}{n\beta^2}\sum_{r_1,r_2=1}^r\sum_{m_1,m_2=1}^{p+q+1}\sup_{t\in\mathbb{Z}}\sum_{h=-\infty}^{\infty}\left|\mathrm{Cov}\left(H_{t-r_1,m_1}(\theta_0)H_{t-r_2,m_2}(\theta_0),H_{t-h-r_1,m_1}(\theta_0)H_{t-h-r_2,m_2}(\theta_0)\right)\right|\\
&\le\frac{C(p+q+1)^2r^3}{n\beta^2}.
\end{aligned}
\]
Consequently we have
\[
\mathbb{E}\left[r\left\|\hat\Sigma_H-\Sigma_H\right\|^2\right]\le\mathbb{E}\left[r\left\|\hat\Sigma_{H,H_r}-\Sigma_{H,H_r}\right\|^2\right]\le\mathbb{E}\left[r\left\|\hat\Sigma_{H_r}-\Sigma_{H_r}\right\|^2\right]\le\frac{C(p+q+1)^2r^3}{n}\xrightarrow[n\to\infty]{}0
\]
when $r=o(n^{1/3})$. The conclusion follows.

We show in the following lemma that the previous lemma remains valid when we replace $H_t(\theta_0)$ by $\hat H_t$.

Lemma 14.

Under the assumptions of Theorem 3, $\sqrt{r}\|\hat\Sigma_{\hat H_r}-\Sigma_{H_r}\|$, $\sqrt{r}\|\hat\Sigma_{\hat H,\hat H_r}-\Sigma_{H,H_r}\|$ and $\sqrt{r}\|\hat\Sigma_{\hat H}-\Sigma_H\|$ tend to zero in probability as $n\to\infty$ when $r=o(n^{(1-2(d_0-d_1))/5})$.

Proof. As mentioned at the end of the proof of the previous lemma, we only have to deal with the term $\sqrt{r}\|\hat\Sigma_{\hat H_r}-\Sigma_{H_r}\|$.

We denote by $\hat\Sigma_{H_r,n}$ the matrix obtained by replacing $\tilde\epsilon_t(\hat\theta_n)$ by $\epsilon_t(\hat\theta_n)$ in $\hat\Sigma_{\hat H_r}$. We have
\[
\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\Sigma_{H_r}\right\|\le\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|+\sqrt{r}\left\|\hat\Sigma_{H_r,n}-\hat\Sigma_{H_r}\right\|+\sqrt{r}\left\|\hat\Sigma_{H_r}-\Sigma_{H_r}\right\|.
\]
By Lemma 13, the term $\sqrt{r}\|\hat\Sigma_{H_r}-\Sigma_{H_r}\|$ converges to 0 in probability. The lemma will be proved as soon as we show that
\[
\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|=o_{\mathbb{P}}(1) \tag{72}
\]
and
\[
\sqrt{r}\left\|\hat\Sigma_{H_r,n}-\hat\Sigma_{H_r}\right\|=o_{\mathbb{P}}(1), \tag{73}
\]
when $r=o(n^{(1-2(d_0-d_1))/5})$. This is done in two separate steps.

Step 1: proof of (72). For all $\beta>0$, we have
\[
\begin{aligned}
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|\ge\beta\right)
&\le\frac{\sqrt{r}}{\beta}\mathbb{E}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|
\le\frac{\sqrt{r}}{\beta}\mathbb{E}\left\|\frac{1}{n}\sum_{t=1}^n\hat H_{r,t}\hat H_{r,t}'-\frac{1}{n}\sum_{t=1}^nH^{(n)}_{r,t}H^{(n)\prime}_{r,t}\right\|\\
&\le\frac{K\sqrt{r}}{\beta}\sum_{r_1=1}^r\sum_{r_2=1}^r\sum_{m_1=1}^{p+q+1}\sum_{m_2=1}^{p+q+1}\mathbb{E}\left|\frac{1}{n}\sum_{t=1}^n\hat H_{t-r_1,m_1}\hat H_{t-r_2,m_2}-\frac{1}{n}\sum_{t=1}^nH^{(n)}_{t-r_1,m_1}H^{(n)}_{t-r_2,m_2}\right|,
\end{aligned}
\]
where
\[
H^{(n)}_{t,m}=2\epsilon_t(\hat\theta_n)\frac{\partial}{\partial\theta_m}\epsilon_t(\hat\theta_n)\quad\text{and}\quad H^{(n)}_{r,t}=\left(H^{(n)\prime}_{t-1},\dots,H^{(n)\prime}_{t-r}\right)'.
\]
It follows that
\[
\begin{aligned}
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|\ge\beta\right)
\le\frac{K\sqrt{r}}{n\beta}\sum_{r_1,r_2=1}^r\sum_{m_1,m_2=1}^{p+q+1}\mathbb{E}\Bigg|\sum_{t=1}^n\Bigg(
&\tilde\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)\,\tilde\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\\
&-\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\,\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\epsilon_{t-r_2}(\hat\theta_n)\Bigg)\Bigg|.
\end{aligned}
\tag{74}
\]
Observe now that
\[
\begin{aligned}
&\tilde\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)\,\tilde\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)
-\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\,\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\epsilon_{t-r_2}(\hat\theta_n)\\
&\quad=\left(\tilde\epsilon_{t-r_1}(\hat\theta_n)-\epsilon_{t-r_1}(\hat\theta_n)\right)\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)\,\tilde\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\\
&\qquad+\epsilon_{t-r_1}(\hat\theta_n)\left(\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)-\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\right)\tilde\epsilon_{t-r_2}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\\
&\qquad+\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\left(\tilde\epsilon_{t-r_2}(\hat\theta_n)-\epsilon_{t-r_2}(\hat\theta_n)\right)\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\\
&\qquad+\epsilon_{t-r_1}(\hat\theta_n)\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\,\epsilon_{t-r_2}(\hat\theta_n)\left(\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)-\frac{\partial}{\partial\theta_{m_2}}\epsilon_{t-r_2}(\hat\theta_n)\right).
\end{aligned}
\]
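The four-term identity above is the usual telescoping of a difference of two four-fold products, in which one factor at a time is switched from its "tilde" version to its exact version. A minimal Python sketch, with a hypothetical helper name and arbitrary numbers standing in for the residuals and their derivatives, makes the pattern explicit:

```python
import numpy as np


def product_difference_terms(a_t, b_t, c_t, d_t, a, b, c, d):
    """Return four terms whose sum equals a_t*b_t*c_t*d_t - a*b*c*d.

    The *_t arguments play the role of the truncated quantities and the
    others the role of their exact counterparts (illustrative only).
    """
    term1 = (a_t - a) * b_t * c_t * d_t
    term2 = a * (b_t - b) * c_t * d_t
    term3 = a * b * (c_t - c) * d_t
    term4 = a * b * c * (d_t - d)
    return term1, term2, term3, term4


rng = np.random.default_rng(1)
values = rng.standard_normal(8)
terms = product_difference_terms(*values)
direct = np.prod(values[:4]) - np.prod(values[4:])
```

Each term contains exactly one difference factor, which is what allows a bound on the differences alone to control the whole expression.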
We substitute the above identity into (74) and we obtain by Hölder's inequality that
\[
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|\ge\beta\right)\le\frac{K\sqrt{r}}{n\beta}\sum_{r_1=1}^r\sum_{r_2=1}^r\sum_{m_1=1}^{p+q+1}\sum_{m_2=1}^{p+q+1}\left(T_{n,1}+T_{n,2}+T_{n,3}+T_{n,4}\right) \tag{75}
\]
where
\[
\begin{aligned}
T_{n,1}&=\sum_{t=1}^n\left\|\tilde\epsilon_{t-r_1}(\hat\theta_n)-\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^2}\left\|\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\tilde\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6},\\
T_{n,2}&=\sum_{t=1}^n\left\|\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_1}}\tilde\epsilon_{t-r_1}(\hat\theta_n)-\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^2}\left\|\tilde\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6},\\
T_{n,3}&=\sum_{t=1}^n\left\|\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\tilde\epsilon_{t-r_2}(\hat\theta_n)-\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^2}\left\|\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6},\\
T_{n,4}&=\sum_{t=1}^n\left\|\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_1}}\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^6}\left\|\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^6}\left\|\frac{\partial}{\partial\theta_{m_2}}\tilde\epsilon_{t-r_2}(\hat\theta_n)-\frac{\partial}{\partial\theta_{m_2}}\epsilon_{t-r_2}(\hat\theta_n)\right\|_{L^2}.
\end{aligned}
\]
For all $\theta\in\Theta_\delta$ and $t\in\mathbb{Z}$, in view of (30) and Lemma 1, we have
\[
\begin{aligned}
\left\|\tilde\epsilon_t(\hat\theta_n)-\epsilon_t(\hat\theta_n)\right\|_{L^2}
&=\left(\mathbb{E}\left[\left(\sum_{j\ge0}\left(\lambda^t_j(\hat\theta_n)-\lambda_j(\hat\theta_n)\right)\epsilon_{t-j}\right)^2\right]\right)^{1/2}
\le\sup_{\theta\in\Theta_\delta}\left(\mathbb{E}\left[\left(\sum_{j\ge0}\left(\lambda^t_j(\theta)-\lambda_j(\theta)\right)\epsilon_{t-j}\right)^2\right]\right)^{1/2}\\
&\le\sigma_\epsilon\sup_{\theta\in\Theta_\delta}\left\|\lambda(\theta)-\lambda^t(\theta)\right\|_{\ell^2}\le K\frac{1}{t^{1/2-(d_0-d_1)}}.
\end{aligned}
\]
It is not difficult to prove that $\tilde\epsilon_t(\theta)$ and $\partial\tilde\epsilon_t(\theta)/\partial\theta$ belong to $L^6$. The fact that $\epsilon_t(\theta)$ and $\partial\epsilon_t(\theta)/\partial\theta$ have moments of order 6 can be proved using the same method as in Lemma 12, using the absolute summability of the $k$-th ($k=2,\dots,8$) cumulants assumed in (A4') with $\tau=8$. We deduce that
\[
T_{n,1}\le K\sum_{t=1}^n\left\|\tilde\epsilon_{t-r_1}(\hat\theta_n)-\epsilon_{t-r_1}(\hat\theta_n)\right\|_{L^2}
\le K\sum_{t=1-r}^{0}\left\|\epsilon_t(\hat\theta_n)\right\|_{L^2}+K\sum_{t=1}^n\left\|\tilde\epsilon_t(\hat\theta_n)-\epsilon_t(\hat\theta_n)\right\|_{L^2}
\le K\left(r+\sum_{t=1}^n\frac{1}{t^{1/2-(d_0-d_1)}}\right).
\]
Then we obtain
\[
T_{n,1}\le K\left(r+n^{1/2+(d_0-d_1)}\right). \tag{76}
\]
The same calculations hold for the terms $T_{n,2}$, $T_{n,3}$ and $T_{n,4}$. Thus
\[
T_{n,1}+T_{n,2}+T_{n,3}+T_{n,4}\le K\left(r+n^{1/2+(d_0-d_1)}\right) \tag{77}
\]
and inserting this estimate into (75) implies that
\[
\mathbb{P}\left(\sqrt{r}\left\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\right\|\ge\beta\right)
\le\frac{Kr^{5/2}(p+q+1)^2}{n\beta}\left(r+n^{1/2+(d_0-d_1)}\right)
\le K\left(\frac{r^{7/2}}{n}+\frac{r^{5/2}}{n^{1/2-(d_0-d_1)}}\right).
\]
Since $2/7>(1-2(d_0-d_1))/5$, the sequence $\sqrt{r}\|\hat\Sigma_{\hat H_r}-\hat\Sigma_{H_r,n}\|$ converges in probability to 0 as $n\to\infty$ when $r=r(n)=o(n^{(1-2(d_0-d_1))/5})$.

Step 2: proof of (73). First we follow the same approach as in the previous step.
We have (cid:13)(cid:13)(cid:13) ˆΣ H r , n − ˆΣ H r (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) n n X t =1 H ( n ) r , t H ( n ) ′ r , t − n n X t =1 H r , t H ′ r , t (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ≤ r X r =1 r X r =1 p + q +1 X m =1 p + q +1 X m =1 n n X t =1 H ( n ) t − r , m H ( n ) t − r , m − n n X t =1 H t − r , m H t − r , m ! ≤ r X r =1 r X r =1 p + q +1 X m =1 p + q +1 X m =1 n n X t =1 ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n ) ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ ) ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ ) (cid:19) . . Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models Since ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n ) ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ ) ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ )= (cid:16) ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) (cid:17) ∂∂θ m ǫ t − r ( ˆ θ n ) ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n )+ ǫ t − r ( θ ) (cid:18) ∂∂θ m ǫ t − r ( ˆ θ n ) − ∂∂θ m ǫ t − r ( θ ) (cid:19) ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n )+ ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ ) (cid:16) ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) (cid:17) ∂∂θ m ǫ t − r ( ˆ θ n )+ ǫ t − r ( θ ) ∂∂θ m ǫ t − r ( θ ) ǫ t − r ( θ ) (cid:18) ∂∂θ m ǫ t − r ( ˆ θ n ) − ∂∂θ m ǫ t − r ( θ ) (cid:19) , one has (cid:13)(cid:13)(cid:13) ˆΣ H r , n − ˆΣ H r (cid:13)(cid:13)(cid:13) ≤ r X r =1 r X r =1 p + q +1 X m =1 p + q +1 X m =1 ( U n ,1 + U n ,2 + U n ,3 + U n ,4 ) (78)where U n ,1 = 1 n n X t =1 (cid:12)(cid:12)(cid:12) ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12)(cid:12) , U n ,2 = 1 n n X t =1 | ǫ t − r ( θ ) | (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) − ∂∂θ m ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12)(cid:12) 
(cid:12)(cid:12)(cid:12) ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12)(cid:12) , U n ,3 = 1 n n X t =1 | ǫ t − r ( θ ) | (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ǫ t − r ( ˆ θ n ) − ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12)(cid:12) U n ,4 = 1 n n X t =1 | ǫ t − r ( θ ) | (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12)(cid:12) | ǫ t − r ( θ ) | (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t − r ( ˆ θ n ) − ∂∂θ m ǫ t − r ( θ ) (cid:12)(cid:12)(cid:12)(cid:12) . Taylor expansions around θ yield that there exists θ and θ between ˆ θ n and θ such that (cid:12)(cid:12)(cid:12) ǫ t ( ˆ θ n ) − ǫ t ( θ ) (cid:12)(cid:12)(cid:12) ≤ w t (cid:13)(cid:13)(cid:13) ˆ θ n − θ (cid:13)(cid:13)(cid:13) and (cid:12)(cid:12)(cid:12)(cid:12) ∂∂θ m ǫ t ( ˆ θ n ) − ∂∂θ m ǫ t ( θ ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ q t (cid:13)(cid:13)(cid:13) ˆ θ n − θ (cid:13)(cid:13)(cid:13) with w t = (cid:13)(cid:13)(cid:13) ∂ǫ t ( θ ) /∂θ ′ (cid:13)(cid:13)(cid:13) and q t = (cid:13)(cid:13)(cid:13) ∂ ǫ t ( θ ) /∂θ ′ ∂θ m (cid:13)(cid:13)(cid:13) . Using the fact that E (cid:12)(cid:12)(cid:12)(cid:12) w t − r ∂∂θ m ǫ t − r ( ˆ θ n ) ǫ t − r ( ˆ θ n ) ∂∂θ m ǫ t − r ( ˆ θ n ) (cid:12)(cid:12)(cid:12)(cid:12) < ∞ and that ( √ n ( ˆ θ n − θ )) n is a tight sequence (which implies that k ˆ θ n − θ k = O P (1 / √ n ) ), wededuce that U n ,1 = O P (cid:18) √ n (cid:19) . . Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models The same arguments are valid for U n ,2 , U n ,3 and U n ,4 . Consequently U n ,1 + U n ,2 + U n ,3 + U n ,4 =O P (1 / √ n ) and (78) yields (cid:13)(cid:13)(cid:13) ˆΣ H r , n − ˆΣ H r (cid:13)(cid:13)(cid:13) = O P (cid:18) r n (cid:19) . 
When $r = o(n^{1/5})$ we finally obtain $\sqrt{r}\,\|\hat\Sigma_{H_r,n} - \hat\Sigma_{H_r}\| = o_P(1)$.

Lemma 15.

Under the assumptions of Theorem 3, we have $\sqrt{r}\,\|\Phi^*_r - \Phi_r\| = o_P(1)$ as $r \to \infty$.

Proof.

Recall that by (8) and (68) we have
\[
H_t(\theta_0) = \Phi_r H_{r,t} + u_{r,t} = \Phi^*_r H_{r,t} + \sum_{k=r+1}^{\infty} \Phi_k H_{t-k}(\theta_0) + u_t := \Phi^*_r H_{r,t} + u^*_{r,t}.
\]
By the orthogonality conditions in (8) and (68), one has
\[
\Sigma_{u^*_r,H_r} := E\big[u^*_{r,t} H'_{r,t}\big] = E\big[(H_t(\theta_0) - \Phi^*_r H_{r,t}) H'_{r,t}\big] = E\big[(\Phi_r H_{r,t} + u_{r,t} - \Phi^*_r H_{r,t}) H'_{r,t}\big] = (\Phi_r - \Phi^*_r)\,\Sigma_{H_r},
\]
and consequently
\[
\Phi^*_r - \Phi_r = -\Sigma_{u^*_r,H_r}\,\Sigma^{-1}_{H_r}. \tag{79}
\]
Using Lemma 11 and Lemma 12, (79) implies that
\[
\begin{aligned}
P\big(\sqrt{r}\,\|\Phi^*_r - \Phi_r\| \ge \beta\big)
&\le \frac{\sqrt{r}}{\beta}\,\big\|\Sigma_{u^*_r,H_r}\big\|\,\big\|\Sigma^{-1}_{H_r}\big\| \\
&\le \frac{K\sqrt{r}}{\beta}\,\Big\| E\Big[\Big(\sum_{k\ge r+1} \Phi_k H_{t-k}(\theta_0) + u_t\Big) H'_{r,t}\Big]\Big\| \\
&\le \frac{K\sqrt{r}}{\beta} \sum_{k\ge r+1} \|\Phi_k\|\,\big\|E\big[H_{t-k}(\theta_0) H'_{r,t}\big]\big\| \\
&\le \frac{K\sqrt{r}}{\beta} \sum_{\ell\ge 1} \|\Phi_{\ell+r}\|\,\big\|E\big[H_{t-\ell-r}(\theta_0)\big(H'_{t-1}(\theta_0),\dots,H'_{t-r}(\theta_0)\big)\big]\big\| \\
&\le \frac{K\sqrt{r}}{\beta} \sum_{\ell\ge 1} \|\Phi_{\ell+r}\| \Big(\sum_{j=1}^{p+q+1}\sum_{k=1}^{p+q+1}\sum_{r_1=1}^{r} \big|E\big[H_{t-r-\ell,j}(\theta_0)\,H_{t-r_1,k}(\theta_0)\big]\big|^2\Big)^{1/2} \\
&\le \frac{K\sqrt{r}}{\beta} \sum_{\ell\ge 1} \|\Phi_{\ell+r}\| \Big(\sum_{j=1}^{p+q+1}\sum_{k=1}^{p+q+1}\sum_{r_1=1}^{r} E\big[H^2_{t-r-\ell,j}(\theta_0)\big]\,E\big[H^2_{t-r_1,k}(\theta_0)\big]\Big)^{1/2} \\
&\le \frac{K\,(p+q+1)\,r}{\beta} \sum_{\ell\ge 1} \|\Phi_{\ell+r}\|.
\end{aligned}
\]
By (9), $r\sum_{\ell\ge 1}\|\Phi_{\ell+r}\| = o(1)$ as $r \to \infty$. The proof of the lemma follows.

Lemma 16.

Under the assumptions of Theorem 3, we have $\sqrt{r}\,\big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| = o_P(1)$ as $n \to \infty$ when $r = o\big(n^{(1-2(d_0-d_1))/5}\big)$ and $r \to \infty$.

Proof. We have
\[
\big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| \le \Big(\big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| + \big\|\Sigma^{-1}_{H_r}\big\|\Big)\,\big\|\Sigma_{H_r} - \hat\Sigma_{\hat H_r}\big\|\,\big\|\Sigma^{-1}_{H_r}\big\|,
\]
and by induction we obtain
\[
\big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| \le \big\|\Sigma^{-1}_{H_r}\big\| \sum_{k=1}^{\infty} \big\|\Sigma_{H_r} - \hat\Sigma_{\hat H_r}\big\|^k\,\big\|\Sigma^{-1}_{H_r}\big\|^k.
\]
Write $\delta_r := \|\Sigma_{H_r} - \hat\Sigma_{\hat H_r}\|$ and $A_r := \|\Sigma^{-1}_{H_r}\| \sum_{k\ge 1} \delta_r^k \|\Sigma^{-1}_{H_r}\|^k$. We have
\[
\begin{aligned}
P\Big(\sqrt{r}\,\big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| > \beta\Big)
&\le P\big(\sqrt{r}\,A_r > \beta\big) \\
&\le P\Big(\sqrt{r}\,A_r > \beta \ \text{and}\ \delta_r \big\|\Sigma^{-1}_{H_r}\big\| < 1\Big) + P\Big(\sqrt{r}\,A_r > \beta \ \text{and}\ \delta_r \big\|\Sigma^{-1}_{H_r}\big\| \ge 1\Big) \\
&\le P\left(\sqrt{r}\,\frac{\big\|\Sigma^{-1}_{H_r}\big\|^2\,\delta_r}{1 - \delta_r \big\|\Sigma^{-1}_{H_r}\big\|} > \beta\right) + P\Big(\sqrt{r}\,\delta_r \big\|\Sigma^{-1}_{H_r}\big\| \ge 1\Big) \\
&\le P\left(\sqrt{r}\,\delta_r > \frac{\beta}{\big\|\Sigma^{-1}_{H_r}\big\|^2 + \beta\, r^{-1/2} \big\|\Sigma^{-1}_{H_r}\big\|}\right) + P\Big(\sqrt{r}\,\delta_r \ge \big\|\Sigma^{-1}_{H_r}\big\|^{-1}\Big).
\end{aligned}
\]
Lemmas 11 and 14 imply the result.
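The geometric-series bound used in the proof of Lemma 16 can be checked numerically. The snippet below is only an illustrative sketch: the matrices are simulated, the function name `inverse_perturbation_bound` is hypothetical, and the spectral norm is assumed as the (sub-multiplicative) matrix norm, which the source does not specify.

```python
import numpy as np

def inverse_perturbation_bound(sigma, sigma_hat):
    """Return (lhs, rhs) for the bound
        ||sigma_hat^{-1} - sigma^{-1}|| <= ||sigma^{-1}||^2 * delta / (1 - delta * ||sigma^{-1}||),
    where delta = ||sigma - sigma_hat|| and delta * ||sigma^{-1}|| < 1.
    The spectral norm (ord=2) is an assumption made for this illustration."""
    inv = np.linalg.inv(sigma)
    inv_norm = np.linalg.norm(inv, 2)
    delta = np.linalg.norm(sigma - sigma_hat, 2)
    if delta * inv_norm >= 1.0:
        raise ValueError("perturbation too large: the geometric series does not converge")
    lhs = np.linalg.norm(np.linalg.inv(sigma_hat) - inv, 2)
    rhs = inv_norm**2 * delta / (1.0 - delta * inv_norm)
    return lhs, rhs

# Simulated example: a well-conditioned symmetric matrix and a small symmetric perturbation.
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 5))
sigma = a @ a.T + 5.0 * np.eye(5)      # smallest eigenvalue is at least 5
e = rng.standard_normal((5, 5))
sigma_hat = sigma + 0.01 * (e + e.T)   # plays the role of the estimated matrix
lhs, rhs = inverse_perturbation_bound(sigma, sigma_hat)
print(lhs <= rhs)
```

The right-hand side is the closed form of the series $\|\Sigma^{-1}\|\sum_{k\ge 1}\delta^k\|\Sigma^{-1}\|^k$, which is why the condition $\delta\,\|\Sigma^{-1}\| < 1$ is isolated in the probability split above.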

Lemma 17.

Under the assumptions of Theorem 3, we have $\sqrt{r}\,\big\|\hat\Phi_r - \Phi_r\big\| = o_P(1)$ as $r \to \infty$ and $r = o\big(n^{(1-2(d_0-d_1))/5}\big)$.

Proof.

Lemmas 11 and 16 yield
\[
\big\|\hat\Sigma^{-1}_{\hat H_r}\big\| \le \big\|\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big\| + \big\|\Sigma^{-1}_{H_r}\big\| = O_P(1). \tag{80}
\]
By (68), we have
\[
E\big[u_{r,t} H'_{r,t}\big] = E\big[(H_t(\theta_0) - \Phi_r H_{r,t}) H'_{r,t}\big] = \Sigma_{H,H_r} - \Phi_r \Sigma_{H_r}.
\]
Since this expectation vanishes by the orthogonality condition, we have $\Phi_r = \Sigma_{H,H_r}\Sigma^{-1}_{H_r}$. Lemmas 11, 14, 16 and (80) imply
\[
\begin{aligned}
\sqrt{r}\,\big\|\hat\Phi_r - \Phi_r\big\|
&= \sqrt{r}\,\big\|\hat\Sigma_{\hat H,\hat H_r}\hat\Sigma^{-1}_{\hat H_r} - \Sigma_{H,H_r}\Sigma^{-1}_{H_r}\big\| \\
&= \sqrt{r}\,\big\|\big(\hat\Sigma_{\hat H,\hat H_r} - \Sigma_{H,H_r}\big)\hat\Sigma^{-1}_{\hat H_r} + \Sigma_{H,H_r}\big(\hat\Sigma^{-1}_{\hat H_r} - \Sigma^{-1}_{H_r}\big)\big\| = o_P(1),
\end{aligned}
\]
and the lemma is proved.

Proof of Theorem 3

Since by Lemma 14 we have $\|\hat\Sigma_{\hat H} - \Sigma_{H}\| = o_P(r^{-1/2}) = o_P(1)$ and $\|\hat\Sigma_{\hat H,\hat H_r} - \Sigma_{H,H_r}\| = o_P(r^{-1/2}) = o_P(1)$, and by Lemma 15 $\|\hat\Phi_r - \Phi^*_r\| = o_P(r^{-1/2}) = o_P(1)$, Theorem 3 is then proved.

6.5. Invertibility of the normalization matrix $P_{p+q+1,n}$

The following proofs are quite technical and are adaptations of the arguments used in [BMS18]. To prove Proposition 4, we need to introduce the following notation. We denote by $S_t$ the vector of $\mathbb{R}^{p+q+1}$ defined by
\[
S_t = \sum_{j=1}^{t} U_j = \sum_{j=1}^{t} -J^{-1} H_j = -J^{-1} \sum_{j=1}^{t} \epsilon_j \frac{\partial}{\partial\theta}\epsilon_j(\theta_0),
\]
and $S_t(i)$ is its $i$-th component. We have
\[
S_{t-1}(i) = S_t(i) - U_t(i). \tag{81}
\]
If the matrix $P_{p+q+1,n}$ is not invertible, there exist real constants $d_1,\dots,d_{p+q+1}$, not all equal to zero, such that $d' P_{p+q+1,n}\, d = 0$, where $d = (d_1,\dots,d_{p+q+1})'$. Thus we may write $\sum_{i=1}^{p+q+1}\sum_{j=1}^{p+q+1} d_j\, P_{p+q+1,n}(j,i)\, d_i = 0$ or equivalently
\[
\frac{1}{n^2}\sum_{t=1}^{n}\sum_{i=1}^{p+q+1}\sum_{j=1}^{p+q+1} d_j \Big(\sum_{k=1}^{t} \big(U_k(j) - \bar U_n(j)\big)\Big)\Big(\sum_{k=1}^{t} \big(U_k(i) - \bar U_n(i)\big)\Big) d_i = 0.
\]
Then
\[
\sum_{t=1}^{n} \Big(\sum_{i=1}^{p+q+1} d_i \Big(\sum_{k=1}^{t} \big(U_k(i) - \bar U_n(i)\big)\Big)\Big)^2 = 0,
\]
which implies that for all $t \ge 1$
\[
\sum_{i=1}^{p+q+1} d_i \Big(\sum_{k=1}^{t} \big(U_k(i) - \bar U_n(i)\big)\Big) = \sum_{i=1}^{p+q+1} d_i \Big(S_t(i) - \frac{t}{n} S_n(i)\Big) = 0.
\]
So we have
\[
\frac{1}{t}\sum_{i=1}^{p+q+1} d_i\, S_t(i) = \sum_{i=1}^{p+q+1} d_i \Big(\frac{1}{n} S_n(i)\Big). \tag{82}
\]
We apply the ergodic theorem and we use the orthogonality of $\epsilon_t$ and $(\partial/\partial\theta)\epsilon_t(\theta_0)$ in order to obtain that
\[
\sum_{i=1}^{p+q+1} d_i \Big(\frac{1}{n}\sum_{k=1}^{n} U_k(i)\Big) \xrightarrow[n\to\infty]{\text{a.s.}} \sum_{i=1}^{p+q+1} d_i\, E[U_k(i)] = -\sum_{i,j=1}^{p+q+1} d_i\, J^{-1}(i,j)\, E\Big[\epsilon_k \frac{\partial \epsilon_k}{\partial\theta_j}\Big] = 0.
\]
Reporting this convergence in (82) implies that $\sum_{i=1}^{p+q+1} d_i\, S_t(i) = 0$ a.s. for all $t \ge 1$.
By (81), we deduce that
\[
\sum_{i=1}^{p+q+1} d_i\, U_t(i) = -\sum_{i=1}^{p+q+1} d_i \sum_{j=1}^{p+q+1} J^{-1}(i,j) \Big(\epsilon_t \frac{\partial \epsilon_t}{\partial\theta_j}\Big) = 0, \quad \text{a.s.}
\]
Thanks to Assumption (A5), $(\epsilon_t)_{t\in\mathbb{Z}}$ has a positive density in some neighborhood of zero and then $\epsilon_t \neq 0$ almost surely. So we would have $d' J^{-1} \frac{\partial \epsilon_t}{\partial\theta} = 0$ a.s. Now we can follow the same arguments that we developed in the proof of the invertibility of $J$ (see the proof of Lemma 6 and more precisely (53)) and this leads us to a contradiction. We deduce that the matrix $P_{p+q+1,n}$ is non-singular.

6.6. Proof of Theorem 5

The arguments follow those of [BMS18] in a simpler context. Recall that the Skorohod space $D^{\ell}[0,1]$ is the set of $\mathbb{R}^{\ell}$-valued functions on $[0,1]$ which are right-continuous and have left limits everywhere. It is endowed with the Skorohod topology, and weak convergence on $D^{\ell}[0,1]$ is denoted by $\xrightarrow{D^{\ell}}$. The integer part of $x$ will be denoted by $\lfloor x \rfloor$.

The first goal is to show that there exists a lower triangular matrix $T$ with nonnegative diagonal entries such that
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{\lfloor nr \rfloor} U_t \xrightarrow[n\to\infty]{D^{p+q+1}} (TT')^{1/2} B_{p+q+1}(r), \tag{83}
\]
where $(B_{p+q+1}(r))_{r\ge 0}$ is a $(p+q+1)$-dimensional standard Brownian motion. Using (30), $U_t$ can be rewritten as
\[
U_t = -\Big(\Big\{\sum_{i=1}^{\infty} \dot\lambda_{i,1}(\theta_0)\,\epsilon_t \epsilon_{t-i},\ \dots,\ \sum_{i=1}^{\infty} \dot\lambda_{i,p+q+1}(\theta_0)\,\epsilon_t \epsilon_{t-i}\Big\}\, J^{-1\prime}\Big)'.
\]
The non-correlation between the $\epsilon_t$'s implies that the process $(U_t)_{t\in\mathbb{Z}}$ is centered. In order to apply the functional central limit theorem for strongly mixing processes, we need to identify the asymptotic covariance matrix in the classical central limit theorem for the sequence $(U_t)_{t\in\mathbb{Z}}$. It is proved in Theorem 2 that
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{n} U_t \xrightarrow[n\to\infty]{\text{in law}} \mathcal{N}\big(0, \Omega =: 2\pi f_U(0)\big),
\]
where $f_U(0)$ is the spectral density of the stationary process $(U_t)_{t\in\mathbb{Z}}$ evaluated at frequency 0.
The existence of the matrix $\Omega$ has already been discussed (see the proofs of Lemmas 6 and 9). Since the matrix $\Omega$ is positive definite, it can be factorized as $\Omega = TT'$ where the $(p+q+1)\times(p+q+1)$ lower triangular matrix $T$ has nonnegative diagonal entries. Therefore, we have
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{n} (TT')^{-1/2}\, U_t \xrightarrow[n\to\infty]{\text{in law}} \mathcal{N}(0, I_{p+q+1}),
\]
where $(TT')^{-1/2}$ is the Moore-Penrose inverse (see [MN99], p. 36) of $(TT')^{1/2}$ and $I_{p+q+1}$ is the identity matrix of order $p+q+1$.

As in the proof of the asymptotic normality of $(\sqrt{n}(\hat\theta_n - \theta_0))_n$, the distribution of $n^{-1/2}\sum_{t=1}^{n} U_t$ when $n$ tends to infinity is obtained by introducing the random vector $U^k_t$ defined for any strictly positive integer $k$ by
\[
U^k_t = -\Big(\Big\{\sum_{i=1}^{k} \dot\lambda_{i,1}(\theta_0)\,\epsilon_t \epsilon_{t-i},\ \dots,\ \sum_{i=1}^{k} \dot\lambda_{i,p+q+1}(\theta_0)\,\epsilon_t \epsilon_{t-i}\Big\}\, J^{-1\prime}\Big)'.
\]
Since $U^k_t$ depends on a finite number of values of the noise process $(\epsilon_t)_{t\in\mathbb{Z}}$, it also satisfies a mixing property (see Theorem 14.1 in [Dav94], p. 210). The central limit theorem for strongly mixing processes of [Her84] shows that its asymptotic distribution is normal with zero mean and variance matrix $\Omega_k$ that converges to $\Omega$ when $k$ tends to infinity (see the proof of Lemma 10):
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{n} U^k_t \xrightarrow[n\to\infty]{\text{in law}} \mathcal{N}(0, \Omega_k).
\]
The above arguments also apply to the matrix $\Omega_k$ with some matrix $T_k$ which is defined analogously to $T$. Consequently, we obtain
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{n} (T_k T_k')^{-1/2}\, U^k_t \xrightarrow[n\to\infty]{\text{in law}} \mathcal{N}(0, I_{p+q+1}).
\]
Now we are able to apply the functional central limit theorem for strongly mixing processes of [Her84] and we obtain that
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{\lfloor nr \rfloor} (T_k T_k')^{-1/2}\, U^k_t \xrightarrow[n\to\infty]{D^{p+q+1}} B_{p+q+1}(r).
\]
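As a small numerical illustration of the factorization $\Omega = TT'$ and of the standardizing matrix $(TT')^{-1/2}$ used above, the following sketch works with a made-up positive definite matrix. The values in `omega` are hypothetical and are not taken from the paper; the symmetric inverse square root is computed by an eigendecomposition, which is one possible choice among others.

```python
import numpy as np

# Hypothetical 3 x 3 positive definite matrix standing in for Omega (not from the paper).
omega = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.5, 0.3],
                  [0.2, 0.3, 1.0]])

# Cholesky factor: lower triangular T with positive diagonal such that omega = T T'.
t = np.linalg.cholesky(omega)

# Symmetric inverse square root (T T')^{-1/2} via the eigendecomposition of omega.
eigval, eigvec = np.linalg.eigh(omega)
omega_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

# Standardizing a vector with covariance omega yields identity covariance.
whitened_cov = omega_inv_sqrt @ omega @ omega_inv_sqrt
print(np.allclose(t @ t.T, omega), np.allclose(whitened_cov, np.eye(3)))
```

When $\Omega$ is positive definite, as assumed in the text, the Moore-Penrose inverse of $(TT')^{1/2}$ coincides with its ordinary inverse, which is what the eigendecomposition computes here.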
Since
\[
(TT')^{-1/2}\, U^k_t = \big((TT')^{-1/2} - (T_k T_k')^{-1/2}\big) U^k_t + (T_k T_k')^{-1/2}\, U^k_t,
\]
we may use the same approach as in the proof of Lemma 10 in order to prove that $n^{-1/2}\sum_{t=1}^{n} \big((TT')^{-1/2} - (T_k T_k')^{-1/2}\big) U^k_t$ converges in distribution to 0. Consequently we obtain that
\[
\frac{1}{\sqrt{n}} \sum_{t=1}^{\lfloor nr \rfloor} (TT')^{-1/2}\, U^k_t \xrightarrow[n\to\infty]{D^{p+q+1}} B_{p+q+1}(r).
\]
In order to conclude that (83) is true, it remains to observe that, uniformly with respect to $n$, it holds that
\[
\tilde Y^k_n(r) := \frac{1}{\sqrt{n}} \sum_{t=1}^{\lfloor nr \rfloor} (TT')^{-1/2}\, \tilde Z^k_t \xrightarrow[k\to\infty]{D^{p+q+1}} 0, \tag{84}
\]
where
\[
\tilde Z^k_t = -\Big(\Big\{\sum_{i=k+1}^{\infty} \dot\lambda_{i,1}(\theta_0)\,\epsilon_t \epsilon_{t-i},\ \dots,\ \sum_{i=k+1}^{\infty} \dot\lambda_{i,p+q+1}(\theta_0)\,\epsilon_t \epsilon_{t-i}\Big\}\, J^{-1\prime}\Big)'.
\]
By (64),
\[
\sup_n \operatorname{Var}\Big(\frac{1}{\sqrt{n}} \sum_{t=1}^{n} \tilde Z^k_t\Big) \xrightarrow[k\to\infty]{} 0
\]
and since $\lfloor nr \rfloor \le n$,
\[
\sup_{0\le r\le 1}\, \sup_n \Big\{\big\|\tilde Y^k_n(r)\big\|\Big\} \xrightarrow[k\to\infty]{} 0.
\]
Thus (84) is true and the proof of (83) is achieved. By (83) we deduce that
\[
\frac{1}{\sqrt{n}} \Big(\sum_{j=1}^{\lfloor nr \rfloor} (U_j - \bar U_n)\Big) \xrightarrow[n\to\infty]{D^{p+q+1}} (TT')^{1/2} \big(B_{p+q+1}(r) - r B_{p+q+1}(1)\big). \tag{85}
\]
The continuous mapping theorem on the Skorohod space then yields
\[
P_{p+q+1,n} \xrightarrow[n\to\infty]{\text{in law}} (TT')^{1/2} \Big[\int_0^1 \{B_{p+q+1}(r) - r B_{p+q+1}(1)\}\{B_{p+q+1}(r) - r B_{p+q+1}(1)\}'\, dr\Big] (TT')^{1/2} = (TT')^{1/2}\, V_{p+q+1}\, (TT')^{1/2}.
\]
Using (83), (85) and the continuous mapping theorem on the Skorohod space, one finally obtains
\[
\begin{aligned}
n\big(\hat\theta_n - \theta_0\big)' P^{-1}_{p+q+1,n} \big(\hat\theta_n - \theta_0\big)
&\xrightarrow[n\to\infty]{D^{p+q+1}} \big\{(TT')^{1/2} B_{p+q+1}(1)\big\}' \big\{(TT')^{1/2}\, V_{p+q+1}\, (TT')^{1/2}\big\}^{-1} \big\{(TT')^{1/2} B_{p+q+1}(1)\big\} \\
&= B'_{p+q+1}(1)\, V^{-1}_{p+q+1}\, B_{p+q+1}(1) := \mathcal{U}_{p+q+1}.
\end{aligned}
\]
The proof of Theorem 5 is then complete.
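The normalization matrix built from recentred partial sums, and the quadratic form that appears in the limit above, can be sketched in a few lines. This is only an illustration under stated assumptions: the input `u` is simulated Gaussian noise standing in for the vectors $U_t$, the $1/n^2$ scaling follows the display used in the invertibility argument, and the function names `sn_normalizer` and `sn_statistic` are hypothetical.

```python
import numpy as np

def sn_normalizer(u):
    """P_n = n^{-2} * sum_{t=1}^n S_t S_t', with S_t = sum_{j<=t} (U_j - Ubar_n).
    `u` has shape (n, k): one row per time index, one column per component."""
    u = np.asarray(u, dtype=float)
    n = u.shape[0]
    s = np.cumsum(u - u.mean(axis=0), axis=0)   # row t-1 holds the recentred partial sum S_t
    return s.T @ s / n**2

def sn_statistic(theta_hat, theta_0, p_n, n):
    """Quadratic form n * (theta_hat - theta_0)' P_n^{-1} (theta_hat - theta_0)."""
    diff = np.asarray(theta_hat, dtype=float) - np.asarray(theta_0, dtype=float)
    return n * diff @ np.linalg.solve(p_n, diff)

# Simulated stand-in for (U_t): i.i.d. standard normal vectors of dimension 3.
rng = np.random.default_rng(1)
n, k = 500, 3
u = rng.standard_normal((n, k))
p_n = sn_normalizer(u)
# For illustration, the sample mean of u plays the role of theta_hat - theta_0.
stat = sn_statistic(u.mean(axis=0), np.zeros(k), p_n, n)
print(p_n.shape, stat >= 0.0)
```

No bandwidth or truncation parameter enters `sn_normalizer`, which is the practical appeal of the self-normalized approach; the price is a non-standard limit law $\mathcal{U}_{p+q+1}$ whose quantiles have to be tabulated or simulated.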

6.7. Proof of Theorem 6

In view of (15) and (18), we write $\hat P_{p+q+1,n} = P_{p+q+1,n} + Q_{p+q+1,n}$ where
\[
\begin{aligned}
Q_{p+q+1,n} ={}& \big(J(\theta_0)^{-1} - \hat J_n^{-1}\big) \frac{1}{n^2} \sum_{t=1}^{n} \Big\{\sum_{j=1}^{t} \Big(H_j - \frac{1}{n}\sum_{k=1}^{n} H_k\Big)\Big\} \Big\{\sum_{j=1}^{t} \Big(H_j - \frac{1}{n}\sum_{k=1}^{n} H_k\Big)\Big\}' \\
&+ \hat J_n^{-1} \frac{1}{n^2} \sum_{t=1}^{n} \Bigg( \Big\{\sum_{j=1}^{t} \Big(H_j - \frac{1}{n}\sum_{k=1}^{n} H_k\Big)\Big\} \Big\{\sum_{j=1}^{t} \Big(H_j - \frac{1}{n}\sum_{k=1}^{n} H_k\Big)\Big\}' \\
&\qquad\qquad - \Big\{\sum_{j=1}^{t} \Big(\hat H_j - \frac{1}{n}\sum_{k=1}^{n} \hat H_k\Big)\Big\} \Big\{\sum_{j=1}^{t} \Big(\hat H_j - \frac{1}{n}\sum_{k=1}^{n} \hat H_k\Big)\Big\}' \Bigg).
\end{aligned}
\]
Using the same approach as in Lemma 8, $\hat J_n$ converges almost surely to $J$. Thus we deduce that the first term on the right-hand side of the above equation tends to zero in probability. The second term is a sum composed of the following terms:
\[
q^{i,j,k,l}_{s,t} = \epsilon_s(\theta_0)\,\epsilon_t(\theta_0)\, \frac{\partial \epsilon_s(\theta_0)}{\partial\theta_i} \frac{\partial \epsilon_t(\theta_0)}{\partial\theta_j} - \tilde\epsilon_s(\hat\theta_n)\,\tilde\epsilon_t(\hat\theta_n)\, \frac{\partial \tilde\epsilon_s(\hat\theta_n)}{\partial\theta_k} \frac{\partial \tilde\epsilon_t(\hat\theta_n)}{\partial\theta_l}.
\]
Using arguments similar to those used before (see for example the use of Taylor's expansion in Subsection 6.4), we have $q^{i,j,k,l}_{s,t} = o_P(1)$ as $n$ goes to infinity and thus $Q_{p+q+1,n} = o_P(1)$. So one may find a matrix $Q^*_{p+q+1,n}$ that tends to the null matrix in probability and such that
\[
\begin{aligned}
n\big(\hat\theta_n - \theta_0\big)' \hat P^{-1}_{p+q+1,n} \big(\hat\theta_n - \theta_0\big)
&= n\big(\hat\theta_n - \theta_0\big)' \big(P_{p+q+1,n} + Q_{p+q+1,n}\big)^{-1} \big(\hat\theta_n - \theta_0\big) \\
&= n\big(\hat\theta_n - \theta_0\big)' P^{-1}_{p+q+1,n} \big(\hat\theta_n - \theta_0\big) + n\big(\hat\theta_n - \theta_0\big)' Q^*_{p+q+1,n} \big(\hat\theta_n - \theta_0\big).
\end{aligned}
\]
Thanks to the arguments developed in the proof of Theorem 5, $n(\hat\theta_n - \theta_0)' P^{-1}_{p+q+1,n} (\hat\theta_n - \theta_0)$ converges in distribution. So $n(\hat\theta_n - \theta_0)' Q^*_{p+q+1,n} (\hat\theta_n - \theta_0)$ tends to zero in distribution, hence in probability. Then $n(\hat\theta_n - \theta_0)' \hat P^{-1}_{p+q+1,n} (\hat\theta_n - \theta_0)$ and $n(\hat\theta_n - \theta_0)' P^{-1}_{p+q+1,n} (\hat\theta_n - \theta_0)$ have the same limit in distribution and the result is proved.

References

[And71] T. W. Anderson – The statistical analysis of time series, John Wiley & Sons, Inc., New York-London-Sydney, 1971.
[BCT96] R. T. Baillie, C.-F. Chung & M. A. Tieslau – Analysing inflation by the fractionally integrated ARFIMA-GARCH model, Journal of applied econometrics (1996), no. 1, p. 23–40.
[Ber74] K. N. Berk – Consistent autoregressive spectral estimates, Ann. Statist. (1974), p. 489–502, Collection of articles dedicated to Jerzy Neyman on his 80th birthday.
[Ber95] J. Beran – Maximum likelihood estimation of the differencing parameter for invertible short and long memory autoregressive integrated moving average models, J. Roy. Statist. Soc. Ser. B (1995), no. 4, p. 659–672.
[BFGK13] J. Beran, Y. Feng, S. Ghosh & R. Kulik – Long-memory processes, Springer, Heidelberg, 2013, Probabilistic properties and statistical methods.
[BM09] Y. Boubacar Maïnassara – Estimation, validation et identification des modèles ARMA faibles multivariés, PhD thesis, Université Lille 3, 2009.
[BM12] Y. Boubacar Maïnassara – Selection of weak VARMA models by modified Akaike's information criteria, J. Time Series Anal. (2012), no. 1, p. 121–130.
[BMCF12] Y. Boubacar Mainassara, M. Carbon & C. Francq – Computing and estimating information matrices of weak ARMA models, Comput. Statist. Data Anal. (2012), no. 2, p. 345–361.
[BMF11] Y. Boubacar Mainassara & C. Francq – Estimating structural VARMA models with uncorrelated but non-independent error terms, J. Multivariate Anal. (2011), no. 3, p. 496–505.
[BMK16] Y. Boubacar Maïnassara & C. C. Kokonendji – Modified Schwarz and Hannan-Quinn information criteria for weak VARMA models, Stat. Inference Stoch. Process. (2016), no. 2, p. 199–217.
[BMS18] Y. Boubacar Maïnassara & B. Saussereau – Diagnostic checking in multivariate ARMA models with dependent errors using normalized residual autocorrelations, J. Amer. Statist. Assoc. (2018), no. 524, p. 1813–1827.
[Bri81] D. R. Brillinger – Time series: data analysis and theory, vol. 36, SIAM, 1981.
[Chi88] S.-T. Chiu – Weighted least squares estimators on the frequency domain for the parameters of a time series, Ann. Statist. (1988), no. 3, p. 1315–1326.
[Dah89] R. Dahlhaus – Efficient parameter estimation for self-similar processes, Ann. Statist. (1989), no. 4, p. 1749–1766.
[Dav94] J. Davidson – Stochastic limit theory, Advanced Texts in Econometrics, The Clarendon Press, Oxford University Press, New York, 1994, An introduction for econometricians.
[DGE93] Z. Ding, C. W. Granger & R. F. Engle – A long memory property of stock market returns and a new model, Journal of Empirical Finance (1993), p. 83–106.
[DL89] P. Doukhan & J. León – Cumulants for stationary mixing random sequences and applications to empirical spectral density, Probab. Math. Stat (1989), p. 11–26.
[FRZ05] C. Francq, R. Roy & J.-M. Zakoïan – Diagnostic checking in ARMA models with uncorrelated errors, J. Amer. Statist. Assoc. (2005), no. 470, p. 532–544.
[FT86] R. Fox & M. S. Taqqu – Large-sample properties of parameter estimates for strongly dependent stationary Gaussian time series, Ann. Statist. (1986), no. 2, p. 517–532.
[FY08] J. Fan & Q. Yao – Nonlinear time series: nonparametric and parametric methods, Springer Science & Business Media, 2008.
[FZ98] C. Francq & J.-M. Zakoïan – Estimating linear representations of nonlinear processes, J. Statist. Plann. Inference (1998), no. 1, p. 145–165.
[FZ05] C. Francq & J.-M. Zakoïan – Recent results for linear time series models with non independent innovations, in Statistical modeling and analysis for complex data problems, GERAD 25th Anniv. Ser., vol. 1, Springer, New York, 2005, p. 241–265.
[FZ07] C. Francq & J.-M. Zakoïan – HAC estimation and strong linearity testing in weak ARMA models, J. Multivariate Anal. (2007), no. 1, p. 114–144.
[FZ10] C. Francq & J.-M. Zakoïan – GARCH models: Structure, statistical inference and financial applications, Wiley, 2010.
[GJ80] C. W. J. Granger & R. Joyeux – An introduction to long-memory time series models and fractional differencing, J. Time Ser. Anal. (1980), no. 1, p. 15–29.
[GS90] L. Giraitis & D. Surgailis – A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of Whittle's estimate, Probab. Theory Related Fields (1990), no. 1, p. 87–104.
[Her84] N. Herrndorf – A functional central limit theorem for weakly dependent sequences of random variables, Ann. Probab. (1984), no. 1, p. 141–153.
[HK98] M. Hauser & R. Kunst – Fractionally integrated models with ARCH errors: with an application to the Swiss 1-month euromarket interest rate, Review of Quantitative Finance and Accounting (1998), no. 1, p. 95–113.
[Hos81] J. R. M. Hosking – Fractional differencing, Biometrika (1981), no. 1, p. 165–176.
[HTSC99] M. Hallin, M. Taniguchi, A. Serroukh & K. Choy – Local asymptotic normality for regression models with long-memory disturbance, Ann. Statist. (1999), no. 6, p. 2054–2080.
[Kee87] D. M. Keenan – Limiting behavior of functionals of higher-order sample cumulant spectra, Ann. Statist. (1987), no. 1, p. 134–151.
[KL06] C.-M. Kuan & W.-M. Lee – Robust M tests without consistent estimation of the asymptotic covariance matrix, J. Amer. Statist. Assoc. (2006), no. 475, p. 1264–1275.
[LL97] S. Ling & W. K. Li – On fractionally integrated autoregressive moving-average time series models with conditional heteroscedasticity, J. Amer. Statist. Assoc. (1997), no. 439, p. 1184–1194.
[Lob01] I. N. Lobato – Testing that a dependent process is uncorrelated, J. Amer. Statist. Assoc. (2001), no. 455, p. 1066–1076.
[Lüt05] H. Lütkepohl – New introduction to multiple time series analysis, Springer-Verlag, Berlin, 2005.
[MN99] J. R. Magnus & H. Neudecker – Matrix differential calculus with applications in statistics and econometrics, Wiley Series in Probability and Statistics, John Wiley & Sons, Ltd., Chichester, 1999, Revised reprint of the 1988 original.
[Pal07] W. Palma – Long-memory time series, Wiley Series in Probability and Statistics, Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, 2007, Theory and methods.
[Sha10a] X. Shao – Corrigendum: A self-normalized approach to confidence interval construction in time series, J. R. Stat. Soc. Ser. B Stat. Methodol. (2010), no. 5, p. 695–696.
[Sha10b] X. Shao – Nonstationarity-extended Whittle estimation, Econometric Theory (2010), no. 4, p. 1060–1087.
[Sha10c] X. Shao – A self-normalized approach to confidence interval construction in time series, J. R. Stat. Soc. Ser. B Stat. Methodol. (2010), no. 3, p. 343–366.
[Sha11] X. Shao – Testing for white noise under unknown dependence and its applications to diagnostic checking for time series models, Econometric Theory (2011), no. 2, p. 312–343.
[Sha12] X. Shao – Parametric inference in stationary time series models with dependent errors, Scand. J. Stat. (2012), no. 4, p. 772–783.
[Sha15] X. Shao – Self-normalization for time series: a review of recent developments, J. Amer. Statist. Assoc. (2015), no. 512, p. 1797–1817.
[Tan82] M. Taniguchi – On estimation of the integrals of the fourth order cumulant spectral density, Biometrika (1982), no. 1, p. 117–122.
[Ton90] H. Tong – Non-linear time series: a dynamical system approach, Oxford University Press, 1990.
[TT97] M. S. Taqqu & V. Teverovsky – Robustness of Whittle-type estimators for time series with long-range dependence, Comm. Statist. Stochastic Models (1997), no. 4, p. 723–757, Heavy tails and highly volatile phenomena.
[Whi53] P. Whittle – Estimation and information in stationary time series, Ark. Mat. (1953), p. 423–434.
[ZL15] K. Zhu & W. K. Li – A bootstrapped spectral test for adequacy in weak ARMA models, J. Econometrics (2015), no. 1, p. 113–130.

7. Figures and tables (a) (b) (c) − . − . − . − . . . . . Estimation errors

Strong FARIMA (a) (b) (c) − . − . − . − . . . . . Estimation errors

Semi−strong FARIMA (a) (b) (c) − . − . − . − . . . . . Estimation errors

Weak FARIMA

Fig 1 . LSE of N = 1, 000 independent simulations of the FARIMA (1, d , 1) model (19) with size n = 2, 000 andunknown parameter θ = ( a , b , d ) = ( − − , when the noise is strong (left panel), when the noise issemi-strong (20) (middle panel) and when the noise is weak of the form (21) (right panel). Points (a)-(c), in thebox-plots, display the distribution of the estimation error ˆ θ n ( i ) − θ ( i ) for i = 1, 2, 3 .. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models −3 0 2 − . Normal quantiles a - a ^ n Strong case −3 0 2 − . Normal quantiles b - b ^ n Strong case −3 0 2 − . Normal quantiles d - d ^ n Strong case −3 0 2 − . Normal quantiles a - a ^ n Semi−strong case −3 0 2 − . Normal quantiles b - b ^ n Semi−strong case −3 0 2 − . Normal quantiles d - d ^ n Semi−strong case −3 0 2 − . Normal quantiles a - a ^ n Weak case −3 0 2 − . Normal quantiles b - b ^ n Weak case −3 0 2 − . Normal quantiles d - d ^ n Weak case

Fig 2 . LSE of N = 1, 000 independent simulations of the FARIMA (1, d , 1) model (19) with size n = 2, 000 andunknown parameter θ = ( a , b , d ) = ( − − . The top panels present respectively, from left to right, theQ-Q plot of the estimates ˆa n , ˆb n and ˆd n of a , b and d in the strong case. Similarly the middle and the bottom panelspresent respectively, from left to right, the Q-Q plot of the estimates ˆa n , ˆb n and ˆd n of a , b and d in the semi-strongand weak cases.. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models Strong case

Distribution of a - a^ n D en s i t y −0.10 0.05 Strong case

Distribution of b - b^ n D en s i t y −0.2 0.0 0.2 Strong case

Distribution of d - d^ n D en s i t y −0.05 0.05 Semi−strong case

Distribution of a - a^ n D en s i t y −0.4 0.0 Semi−strong case

Distribution of b - b^ n D en s i t y −0.4 0.0 Semi−strong case

Distribution of d - d^ n D en s i t y −0.10 0.05 Weak case

Distribution of a - a^ n D en s i t y −0.15 0.05 Weak case

Distribution of b - b^ n D en s i t y −0.2 0.1 Weak case

Distribution of d - d^ n D en s i t y −0.05 0.10 Fig 3 . LSE of N = 1, 000 independent simulations of the FARIMA (1, d , 1) model (19) with size n = 2, 000 andunknown parameter θ = ( a , b , d ) = ( − − . The top panels present respectively, from left to right, thedistribution of the estimates ˆa n , ˆb n and ˆd n of a , b and d in the strong case. Similarly the middle and the bottompanels present respectively, from left to right, the distribution of the estimates ˆa n , ˆb n and ˆd n of a , b and d in thesemi-strong and weak cases. The kernel density estimate is displayed in full line, and the centered Gaussian densitywith the same variance is plotted in dotted line.. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models (a) (b) (c) Estimates of diag ( W S ) Strong FARIMA (a) (b) (c) Estimates of diag ( W ) Strong FARIMA (a) (b) (c) Estimates of diag ( W S ) Semi−strong FARIMA (a) (b) (c) Estimates of diag ( W ) Semi−strong FARIMA (a) (b) (c)

Estimates of diag ( W S ) Weak FARIMA (a) (b) (c)

Estimates of diag ( W ) Weak FARIMA

Fig 4 . Comparison of standard and modiﬁed estimates of the asymptotic variance Ω of the LSE, on the simulatedmodels presented in Figure 1. The diamond symbols represent the mean, over N = 1, 000 replications, of thestandardized errors n ( ˆa n + 0.7) for (a) (1.90 in the strong case and 4.32 (resp. 1.80) in the semi-strong case (resp.in the weak case)), n ( ˆb n + 0.2) for (b) (5.81 in the strong case and 11.33 (resp. 8.88) in the semi-strong case(resp. in the weak case)) and n ( ˆd n − for (c) (1.28 in the strong case and 2.65 (resp. 1.40) in the semi-strongcase (resp. in the weak case)).. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models (a) (b) (c) Estimates of diag ( W S ) Semi−strong FARIMA (a) (b) (c)

Estimates of diag ( W S ) Weak FARIMA

Fig 5 . A zoom of the left-middle and left-bottom panels of Figure 4. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models (a) (b) (c) Estimates of diag ( W ) Semi−strong FARIMA (a) (b) (c)

Estimates of diag ( W ) Weak FARIMA

Fig 6 . A zoom of the right-middle and right-bottom panels of Figure 4. Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models Table 1

Empirical size of standard and modiﬁed conﬁdence interval: relative frequencies (in %) of rejection.Modiﬁed SN stands for the self-normalized approach. In Modiﬁed we use the sandwich estimator of theasymptotic variance Ω of the LSE while in Standard we use ˆΩ S . The number of replications is N = 1000 . Model Length n Level Standard Modiﬁed Modiﬁed SN ˆa n ˆb n ˆd n ˆa n ˆb n ˆd n ˆa n ˆb n ˆd n α = 1% Strong FARIMA n = 200 α = 5% α = 10% α = 1% n = 2, 000 α = 5% α = 10% α = 1% n = 5, 000 α = 5% α = 10% α = 1% n = 200 α = 5% α = 10% α = 1% n = 2, 000 α = 5% α = 10% α = 1% n = 5, 000 α = 5% α = 10% α = 1% Weak FARIMA n = 200 α = 5% α = 10% α = 1% n = 2, 000 α = 5% α = 10% α = 1% n = 5, 000 α = 5% α = 10% . Boubacar Maïnassara, Y. Esstafa and B. Saussereau/Estimation of weak FARIMA models Time t I nde x v a l ue CAC40

Time t I nde x v a l ue DAX

Time t I nde x v a l ue Nikkei

Time t I nde x v a l ue SP 500

Fig 7. Closing prices of the four stock market indices from the starting date of each index to February 14, 2019.

[Figure 8 panels: return against time t for CAC40, DAX, Nikkei and SP 500.]

Fig 8. Returns of the four stock market indices from the starting date of each index to February 14, 2019.

[Figure 9 panels: ACF against lag for the squared returns of CAC40, DAX, Nikkei and SP 500.]

Fig 9. Sample autocorrelations of squared returns of the four stock market indices.
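The quantities plotted in Figures 8 and 9 can be illustrated with a minimal sketch. It assumes that "returns" means log returns of the closing prices and that the sample autocorrelations use the usual 1/n normalisation; neither convention is stated in the captions. The price path is a hypothetical GARCH(1,1)-type simulation (the parameters `omega`, `alpha`, `beta` are illustrative choices), not the index data used in the paper, and the maximal lag of 100 is likewise an assumption.

```python
import numpy as np


def log_returns(prices):
    """Log returns r_t = log(p_t / p_{t-1}); an assumed definition of 'returns'."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(h), h = 0..max_lag, with 1/n normalisation."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()                      # centre the series
    gamma0 = np.dot(xc, xc) / n            # sample variance (lag-0 autocovariance)
    acf = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        acf[h] = np.dot(xc[: n - h], xc[h:]) / n / gamma0
    return acf


# Hypothetical data: returns with volatility clustering (illustrative parameters).
rng = np.random.default_rng(0)
n = 5000
omega, alpha, beta = 1e-6, 0.10, 0.85
r = np.empty(n)
sigma2 = omega / (1.0 - alpha - beta)      # start at the unconditional variance
for t in range(n):
    r[t] = np.sqrt(sigma2) * rng.standard_normal()
    sigma2 = omega + alpha * r[t] ** 2 + beta * sigma2

# Build a price path from the simulated returns, then recover the returns from it.
prices = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(r))))
returns = log_returns(prices)

# Sample autocorrelations of the squared returns up to lag 100.
acf_sq = sample_acf(returns ** 2, max_lag=100)
print(acf_sq[:6])
```

Replacing `prices` with an array of observed closing prices gives the analogue of one panel of Figure 9. Slowly decaying autocorrelations of the squared returns are the kind of feature that motivates fitting a long-memory model to them.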

Table 2

Fitting a FARIMA(1, d, 1) model to the squares of the 4 daily returns considered. The corresponding p-values are reported in parentheses. The last column presents the estimated residual variance.

Series     Length n      $\hat{a}_n$         $\hat{b}_n$         $\hat{d}_n$        $\hat{\sigma}^2_\epsilon$
CAC        n = 7,341
DAX        n = 7,860
Nikkei     n = 13,318    -0.0217 (0.9528)    0.1579 (0.6050)     0.3217 (0.0000)
S&P 500    n = 17,390    -0.3371 (0.0023)    -0.1795 (0.0227)    0.2338 (0.0000)

[In Table 2, the columns $\hat{a}_n$, $\hat{b}_n$, $\hat{d}_n$ are grouped under $\hat{\theta}_n$ and the last column under Var($\epsilon_t$). The CAC and DAX estimates and all residual-variance entries are missing from this copy.]

Table 3

Modified confidence intervals at the asymptotic level $\alpha$ = 5% for the parameters estimated in Table 2. Modified SN stands for the self-normalized approach, while Modified corresponds to the confidence interval obtained by using the sandwich estimator of the asymptotic variance $\Omega$ of the LSE.

[Table 3 layout: column Series (CAC, DAX, Nikkei, S&P 500), followed by the intervals for $\hat{a}_n$, $\hat{b}_n$, $\hat{d}_n$ under Modified and under Modified SN. The interval endpoints are missing from this copy.]

Contents

  [...] $I(\theta)$
  3.2 A self-normalized approach to confidence interval construction in weak FARIMA models
4 Numerical illustrations
  4.1 Simulation studies and empirical sizes for confidence intervals
  4.2 Application to real data
5 Conclusion
6 Proofs
  6.1 Preliminary results
  6.2 Proof of Theorem 1
  6.3 Proof of Theorem 2
  6.4 Proof of the convergence of the variance matrix estimator
  6.5 Invertibility of the normalization matrix $P_{p+q+1,n}$
  6.6 Proof of Theorem 5
  6.7 Proof of Theorem 6
References
7 Figures and tables

The figures "acf.png", "acfzoom.png", "graph1.png", "distributionn1000.png", "estimOmegan1000.png", "estimOmegan1000zoom.png", "estimOmegan1000zoomsand.png", "estimationErrorsn1000.jng.png", "estimationErrorsn1000.png" and "graph.png" are available in "png" format from: http://arxiv.org/ps/1910.07213v1
