Inference in mixed causal and noncausal models with generalized Student's t-distributions
Francesco Giancaterini∗ and Alain Hecq
Department of Quantitative Economics, School of Business and Economics, Maastricht University
December, 2020
Abstract
This paper analyzes the properties of the Maximum Likelihood Estimator for mixed causal and noncausal models when the error term follows a Student's t-distribution. In particular, we compare several existing methods to compute the expected Fisher information matrix and show that they cannot be applied in the heavy-tail framework. For this purpose, we propose a new approach to make inference on causal and noncausal parameters in finite sample sizes. It is based on the empirical variance computed on the generalized Student's t, even when the population variance is not finite. Monte Carlo simulations show the good performance of our new estimator for fat-tail series. We illustrate how the different approaches lead to different standard errors in four time series: the annual debt-to-GDP ratio for Canada, the variation of daily Covid-19 deaths in Belgium, monthly wheat prices and the monthly inflation rate in Brazil.

Keywords: MLE, noncausal models, generalized Student's t-distribution, inference.
JEL: C22
∗ Corresponding author: Francesco Giancaterini, Maastricht University, School of Business and Economics, Department of Quantitative Economics, P.O. Box 616, 6200 MD Maastricht, The Netherlands. Email: [email protected]. The authors would like to thank Sean Telg, Elisa Voisin and Ines Wilms for their various suggestions. All errors are ours.

Mixed causal and noncausal models (MAR) are time series processes with both lead and lag components. Such specifications allow one to capture nonlinear features such as bubbles, namely processes that experience a rapid increase followed by a sudden crash. Linear autoregressive models (e.g., ARMA models) cannot exhibit these bubble patterns. MAR models have successfully been implemented on several time series, for instance commodity prices, the inflation rate, bitcoin and other equity prices. Furthermore, forecasts from mixed causal and noncausal models often beat those from linear ones. They also have an economic flavor: they are interpreted as situations in which economic agents have more information than econometricians, linking MAR models with the existence of non-fundamentalness in structural econometric models (see Alessi et al. (2011) and Lanne and Saikkonen (2013)). Still, their estimation, and in particular making inference on MAR parameters, is far from trivial.

This paper analyzes the behaviour of the Maximum Likelihood Estimator (MLE) for mixed causal and noncausal models with an error term following a Student's t-distribution. Although most theoretical results for MARs are derived under the assumption of a finite variance of the error term (see i.a. Breidt, Davis, Lii and Rosenblatt (1991); Lanne and Saikkonen (2011)), we emphasize that working with the generalized version of the Student's t also allows us to cover infinite variance cases (when the degrees of freedom ν of the Student's t satisfy 1 < ν ≤ 2, and in particular when they lie between 1.5 and 2).
The alternative methods to make inference in the infinite variance cases would be either to work with a different asymptotic theory (Davis and Resnick (1985)), to use different distributions (see the work on alpha-stable distributions by Fries and Zakoian (2019)), or, in the case of purely noncausal models, to rely on bootstrap estimators (Cavaliere, Nielsen and Rahbek (2020)).

The rest of the paper is organized as follows. Section 2 introduces mixed causal and noncausal models. Section 3 presents the different ways of obtaining the expected Fisher information matrix for MARs; the existing strategies are briefly reviewed. Section 4 proposes a new approach to compute the standard errors of causal and noncausal parameters, based on a robust estimator of the residuals. We show its validity in finite samples. Section 5 studies, using Monte Carlo simulations, the performance of the current methodologies and of the new approach. Section 6 is dedicated to the empirical applications on four different time series. Section 7 concludes.

Breidt et al. (1991) introduce a maximum likelihood procedure for estimating the parameters of noncausal processes. Their starting point is the autoregressive model

a(L) y_t = ε_t,   (1)

where L is the backshift operator and ε_t is an independent and identically distributed (i.i.d.) non-Gaussian sequence of random variables with mean zero and finite variance. It is assumed that the autoregressive polynomial a(z) = 1 − a_1 z − ··· − a_p z^p has no roots on the unit circle, so that a(z) ≠ 0 for |z| = 1. Breidt et al. (1991) further assume that the polynomial a(z) has respectively s roots inside and r roots outside the unit circle. Equation (1) can be factored as

a(z) = ϕ*(z) φ(z),   (2)

where ϕ*(z) is called the noncausal polynomial since its roots are inside the unit circle, such that ϕ*(z) = 1 − ϕ*_1 z − ... − ϕ*_s z^s ≠ 0 for |z| ≥ 1.
Breidt et al. (1991) derive the covariance matrix of the estimated parameters only for probability density functions of ε_t that satisfy a certain set of assumptions listed in Section 3. (The non-Gaussianity of ε_t is required to identify noncausal from causal models.) The generalized Student's t-distribution with degrees of freedom equal to or less than 2 does not satisfy one of these assumptions and, as a consequence, this approach cannot be used in the heavy-tail framework.

Lanne and Saikkonen (2011) directly start with a mixed causal and noncausal model expressed as the product of backward- and forward-looking polynomials

φ(L) ϕ(L⁻¹) y_t = ε_t,   (3)

where L⁻¹ produces leads such that L⁻¹ y_t = y_{t+1}. We denote such a model an MAR(r, s), with φ(L) the causal/autoregressive polynomial of order r and ϕ(L⁻¹) the noncausal/lead polynomial of order s. With this representation it is assumed that both φ(z) and ϕ(z) have their roots outside the unit circle:

φ(z) ≠ 0 and ϕ(z) ≠ 0 for |z| ≤ 1.   (4)

Note that purely causal or purely noncausal models are respectively obtained when ϕ(L⁻¹) = 1 or φ(L) = 1. In (3) the parameter vectors φ = (φ_1, ..., φ_r) and ϕ = (ϕ_1, ..., ϕ_s) turn out to be orthogonal to the parameters that describe the distribution of the error term ε_t (see Lemma 1 of Lanne and Saikkonen (2011)). They can be estimated by an AMLE approach; AMLE refers to the approximate maximum likelihood estimator, so called because we lose the first r and the last s observations when estimating an MAR(r, s). An important and useful feature of mixed causal and noncausal models is that we can set:

u_t = ϕ(L⁻¹) y_t  ↔  u_t = φ_1 u_{t−1} + ··· + φ_r u_{t−r} + ε_t,   (5)
v_t = φ(L) y_t  ↔  v_t = ϕ_1 v_{t+1} + ··· + ϕ_s v_{t+s} + ε_t.   (6)

In order to obtain the standard errors of the estimated parameters, Lanne and Saikkonen work with a density function which satisfies assumptions similar to those presented in Breidt et al. (1991) and, in particular, that it must have a finite variance.

Hecq, Lieb and Telg (2016) propose a new approach to more easily compute the standard errors for MAR(r, s) models, using the generalized t-distribution and relying on the results developed for the linear regression model by Fonseca et al. (2008).
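Representations (5) and (6) lend themselves to a direct implementation. The sketch below is our own illustration (not code from the MARX package; the function name is ours); it computes the residuals of an MAR(1,1) with known coefficients by applying the lead filter first and the lag filter second:

```python
import numpy as np

def mar11_residuals(y, phi, varphi):
    """Residuals of an MAR(1,1): eps_t = phi(L) varphi(L^-1) y_t.

    First apply the noncausal (lead) filter, u_t = y_t - varphi * y_{t+1},
    as in (5); then the causal (lag) filter, eps_t = u_t - phi * u_{t-1}.
    As in the AMLE setup, the first r = 1 and last s = 1 observations are lost.
    """
    y = np.asarray(y, dtype=float)
    u = y[:-1] - varphi * y[1:]   # u_t = varphi(L^-1) y_t, for t = 1, ..., T-1
    return u[1:] - phi * u[:-1]   # eps_t = phi(L) u_t,     for t = 2, ..., T-1
```

Applying the filters in the opposite order, through v_t = φ(L)y_t in (6), returns the same residuals, since the lag and lead polynomials commute.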
Their approach, also implemented in the R package MARX, works if and only if E(ε_t²) < ∞, and hence if the degrees of freedom are larger than 2. We show however that this approach can be misleading, as it imposes strong restrictions that can lead to incorrect estimates of the standard errors.

Let us consider a general density function f and denote the likelihood function of θ by

L(θ) = ∏_{t=1}^{T} f(ε_t | θ).

We indicate with θ_0 = (θ_1, ..., θ_p) the vector of the true values of the causal and noncausal coefficients (p = r + s). The other parameters of the general density function (degrees of freedom and scale parameter) are, for the moment, assumed to be known and equal to their true population values; we will show next that they are independent from the estimation of θ. Furthermore, we assume that ε_t has a finite variance, equal to σ². Taking logs of L(θ), we obtain the log-likelihood function

l(θ) = ln L(θ) = Σ_{t=1}^{T} ln f(ε_t | θ).   (7)

Defining b(θ) = δl(θ)/δθ, the score vector of the log-likelihood, the MLE of θ is given by the solution θ̂ to the p = r + s equations b(θ̂) = 0. If the sample size is sufficiently large, it turns out that the distribution of the maximum likelihood estimator θ̂ can be well approximated by

θ̂ ≈ N(θ_0, I(θ_0)⁻¹),   (8)

where I is the expected Fisher information matrix

I(θ) = −E[ δ²l(θ) / δθδθ′ ].   (9)

Since it is not always trivial to evaluate analytically the expected value of the Hessian matrix, we can also compute the observed Fisher information matrix:

I(θ) = −[ δ²l(θ) / δθδθ′ ].   (10)

By the law of large numbers, the observed information (10) converges in probability to the expected information (9).
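When the analytical Hessian is cumbersome, the observed information (10) can be approximated by finite differences. The sketch below is generic and our own (the helper names and the step size h are arbitrary choices, not part of any package discussed here):

```python
import numpy as np

def observed_info(loglik, theta, h=1e-5):
    """Observed Fisher information (10): minus the Hessian of the
    log-likelihood at theta, via central finite differences."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    H = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p); ei[i] = h
            ej = np.zeros(p); ej[j] = h
            H[i, j] = (loglik(theta + ei + ej) - loglik(theta + ei - ej)
                       - loglik(theta - ei + ej) + loglik(theta - ei - ej)) / (4 * h * h)
    return -H

def std_errors(loglik, theta_hat):
    """Standard errors implied by (8): sqrt of diag(I(theta_hat)^(-1))."""
    return np.sqrt(np.diag(np.linalg.inv(observed_info(loglik, theta_hat))))
```

For instance, for a Gaussian location model with unit variance and T observations, the routine recovers the textbook standard error 1/√T.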
In practice, since the true value of θ is not known, these two matrices are obtained by replacing the population parameters by their ML estimates, which yields the expected and observed information matrices evaluated at θ̂.

Let us start with the observed information matrix of an MAR(r, s) as described in (3). We consider ε_t i.i.d. and distributed according to a generalized Student's t-distribution, such that its density function at time t is

f_σ(ε_t; ν, η) = Γ((ν+1)/2) / [ Γ(ν/2) √(πν) η ] · [ 1 + (1/ν)(ε_t/η)² ]^{−(ν+1)/2},   (11)

with the corresponding approximate log-likelihood function, conditional on y = [y_1, ..., y_T], equal to

l(φ, ϕ, ν, η | y) = (T − p) [ ln Γ((ν+1)/2) − ln(√(πν) η) − ln Γ(ν/2) ] − (ν+1)/2 · Σ_{t=r+1}^{T−s} ln[ 1 + (1/ν) (φ(L)ϕ(L⁻¹) y_t / η)² ].   (12)

We indicate with ν_0 and η_0 the true values of the degrees of freedom and of the scale parameter, respectively. Instead, σ² denotes the true value of the variance of the error term which, for a generalized Student's t-distribution, is equal to

σ² = η² ν / (ν − 2),  ∀ ν > 2.

(We use the notation of Hamilton (1994, p. 143).)
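For concreteness, the generalized Student's t log-likelihood (11)-(12) can be evaluated directly on the residuals ε_t = φ(L)ϕ(L⁻¹)y_t. A minimal sketch (our own helper, written with log-gamma terms for numerical stability):

```python
import numpy as np
from math import lgamma, log, pi

def gen_t_loglik(eps, nu, eta):
    """Sum of log-densities of i.i.d. draws from the generalized
    Student's t (11): each term is
    ln Gamma((nu+1)/2) - ln Gamma(nu/2) - ln(sqrt(pi*nu)*eta)
    - (nu+1)/2 * ln(1 + (eps/eta)^2 / nu)."""
    eps = np.asarray(eps, dtype=float)
    c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(pi * nu) - log(eta)
    return eps.size * c - 0.5 * (nu + 1) * np.sum(np.log1p((eps / eta) ** 2 / nu))
```

With ν = 1 and η = 1 this reduces to the Cauchy log-density, while for ν > 2 the implied error variance is η²ν/(ν − 2), as above.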
In this case, the observed information matrix is given by

I(θ) = − [ δ²l(θ)/δφδφ′   δ²l(θ)/δφδϕ′
           δ²l(θ)/δϕδφ′   δ²l(θ)/δϕδϕ′ ],   (13)

knowing that, in the general case:

δ²l(θ)/δφδφ′ = 2(ν+1) η⁻⁴ ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻² U_{t−1} U′_{t−1} [z_t η]² ) − (ν+1) η⁻² ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻¹ U_{t−1} U′_{t−1} );

δ²l(θ)/δϕδϕ′ = 2(ν+1) η⁻⁴ ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻² V_{t+1} V′_{t+1} [z_t η]² ) − (ν+1) η⁻² ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻¹ V_{t+1} V′_{t+1} );

δ²l(θ)/δφδϕ′ = 2(ν+1) η⁻⁴ ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻² U_{t−1} V′_{t+1} [z_t η]² ) − (ν+1) η⁻² ( Σ_{t=r+1}^{T−s} (ν + z_t²)⁻¹ (U_{t−1} V′_{t+1} + Y_t z_t η) );

with z_t = φ(L) u_t / η = ϕ(L⁻¹) v_t / η = φ(L) ϕ(L⁻¹) y_t / η, U_{t−1} = (u_{t−1}, ..., u_{t−r})′, V_{t+1} = (v_{t+1}, ..., v_{t+s})′, and Y_t an r × s matrix with elements y_{t−i+j} (i = 1, ..., r and j = 1, ..., s).

Section 3.1 shows that in mixed causal and noncausal models the expected Fisher information matrix (unlike the observed I(θ̂)) cannot be computed when the population variance is not finite. In Section 5 we will evaluate, by means of Monte Carlo simulations, whether the observed Fisher information matrix still allows Equation (8) to be respected in this context.

Lanne and Saikkonen (2011) propose to calculate the asymptotic covariance matrix using a general (Lebesgue) density function which depends on a parameter vector λ, in which all the distributional parameters are collected (the scale parameter and the degrees of freedom which, exactly as in the previous section, are respectively indicated with η and ν). Furthermore, it is characterized by an i.i.d. innovation term with finite and constant variance, equal to σ². Similar conditions as those of Andrews et al. (2006) must be satisfied.
In detail, these are:

(A1) For all x ∈ R and all λ ∈ Λ, f(x; λ) > 0, and f(x; λ) is twice continuously differentiable with respect to (x, λ).

(A2) For all λ ∈ Λ, ∫ x f′(x; λ) dx = −1.

(A3) For all λ ∈ Λ, ∫ f″(x; λ) dx = 0.

(A4) ∫ x² f″(x; λ) dx = 2.

(A5) J = ∫ (f′(x; λ))² / f(x; λ) dx > 0.

(A7) For j, k = 1, ..., d and all λ ∈ Λ,
• f(x; λ) is dominated by a function f₀(x) such that ∫ x² f₀(x) dx < ∞, and
• x² (f′(x; λ)/f(x; λ))², x² |f″(x; λ)/f(x; λ)|, |x| |δf′(x; λ)/δλ_j| / f(x; λ), (δf(x; λ)/δλ_j)² / f(x; λ)², and |δ²f(x; λ)/δλ_j δλ_k| / f(x; λ) are dominated by a₁ + a₂|x|^{c₁}, where a₁, a₂ and c₁ are nonnegative constants and ∫ |x|^{c₁} f₀(x) dx < ∞.

In this Section we relax the assumption that the distributional parameters of the density f are known. We also need to introduce some notation used in their paper. Let ζ_t ∼ i.i.d.(0, 1) and define the stationary AR(r) process u*_t by φ(L) u*_t = ζ_t and the stationary AR(s) process v*_t by ϕ(L) v*_t = ζ_t. Define also U*_{t−1} = (u*_{t−1}, ..., u*_{t−r})′, V*_{t−1} = (v*_{t−1}, ..., v*_{t−s})′ and the associated covariance matrices Γ_{U*} = Cov(U*_{t−1}), Γ_{V*} = Cov(V*_{t−1}) and Γ_{U*V*} = Cov(U*_{t−1}, V*_{t−1}) = Γ′_{V*U*}.

Theorem 1 (Lanne and Saikkonen, 2011). Given conditions (A1)-(A7), there exists a sequence of local maximizers θ̂ = (φ̂, ϕ̂, η̂, ν̂) of l(θ) in (7) such that

(T − p)^{1/2} (θ̂ − θ_0) →d N(0, diag(Σ⁻¹, Ω⁻¹)),

where Σ⁻¹ is the asymptotic variance-covariance matrix of the AML estimators of (φ, ϕ), such that

Σ = [ J Γ_{U*}      Γ_{U*V*}
      Γ_{V*U*}      J Γ_{V*} ] = [ σ² J̃ Γ_{U*}   Γ_{U*V*}
                                   Γ_{V*U*}      σ² J̃ Γ_{V*} ],   (14)

and Ω⁻¹ is the asymptotic variance-covariance matrix of the distributional parameters (Ω is the matrix defined in Equation (11) in Lanne and Saikkonen (2011)).

Lanne and Saikkonen (2008) show in detail how to obtain the Σ matrix. Furthermore, they show that the block diagonality holds because of representation (3) and conditions (A2)-(A4). Due to the block diagonality of the covariance matrix of the limiting distribution, the AML estimators of (φ̂, ϕ̂) and (ν̂, η̂) are asymptotically independent. The matrix Σ is positive definite if condition (A5) holds. By the Cauchy-Schwarz inequality and (A2),

1 = { ∫ x [f′_σ(x; λ)/f_σ(x; λ)] f_σ(x; λ) dx }² ≤ { ∫ x² f_σ(x; λ) dx } { ∫ (f′_σ(x; λ)/f_σ(x; λ))² f_σ(x; λ) dx } = σ² J̃,   (15)

with equality if and only if f is Gaussian. Hence the condition holds for non-Gaussian f, since J̃ can be rewritten as

J̃ = σ⁻² ∫ (f′(x; λ))² / f(x; λ) dx = σ⁻² J,   (16)

where the density function inside the integral refers to a rescaled density function (that is, with unit variance).
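The bound (15) can be checked numerically for a given density. The sketch below is our own code (the parameter values are illustrative, and the quadrature is a simple trapezoid rule on a truncated grid, so the result is approximate); it computes σ²J̃ for the generalized Student's t directly from the density:

```python
import numpy as np
from math import gamma, sqrt, pi

def t_density(x, nu, eta):
    """Generalized Student's t density with scale eta."""
    c = gamma((nu + 1) / 2) / (gamma(nu / 2) * sqrt(pi * nu) * eta)
    return c * (1.0 + (x / eta) ** 2 / nu) ** (-(nu + 1) / 2)

def sigma2_J_tilde(nu, eta, lim=200.0, n=400_001):
    """sigma^2 times the integral of f'(x)^2 / f(x) dx, using central
    differences and the trapezoid rule on [-lim, lim]; the truncation
    error is tiny for nu > 2 since the integrand decays like x^-(nu+3)."""
    x = np.linspace(-lim, lim, n)
    f = t_density(x, nu, eta)
    fp = np.gradient(f, x)                              # numerical f'(x)
    g = fp ** 2 / f
    J = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(x))     # trapezoid rule
    sigma2 = eta ** 2 * nu / (nu - 2)                   # population variance
    return sigma2 * J
```

For ν = 5 and η = 2 the computed value is about 1.25 > 1, in line with (15) for a non-Gaussian density.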
In other words, Σ is positive definite if σ² J̃ > 1 (which in particular implies J > 0, i.e. (A5)). In our case, ε_t is i.i.d. according to a generalized Student's t-distribution and

σ² J̃ = σ² ∫ [f′_σ(ε_t; ν, η)]² / f_σ(ε_t; ν, η) dε_t = ( η² ν/(ν−2) ) ( η⁻² (ν+1)/(ν+3) ) = ν(ν+1) / [(ν−2)(ν+3)] > 1,  ∀ ν > 2,

with f_σ(ε_t; ν, η) defined in (11). It is easy to see that this approach works if and only if ε_t has finite variance. This is the reason why most authors (e.g. Lanne and Saikkonen (2011)) use a standardized Student's t-distribution (such that σ = η) in their empirical applications. When we consider this type of standardized distribution, the log-likelihood function is

l′(φ, ϕ, ν, η | y) = Σ_t ln{ Γ((ν+1)/2) / [ Γ(ν/2) √(πν) η √((ν−2)/ν) ] · [ 1 + (1/ν) ( φ(L)ϕ(L⁻¹) y_t / (η √((ν−2)/ν)) )² ]^{−(ν+1)/2} },

such that, unlike (12), its structure only ensures convergence for ν > 2. We also observe that the heavier the tails are, the faster the estimator seems to converge (see Hecq et al. (2016)).

Hecq et al. (2016) take their inspiration from the conclusions of Fonseca et al. (2008), who consider the linear regression model

y_i = X′_i β + ε_i  (i = 1, ..., T),   (17)

where X_i and β are both p × 1 vectors and the ε_i are i.i.d. following a generalized Student's t-distribution with ν degrees of freedom and a scale parameter η, such that

c(ν, η) = Γ((ν+1)/2) ν^{ν/2} / [ Γ(ν/2) √π η ],   (18)

the log-likelihood function being

l(β, η, ν | y, X) = T ln c(ν, η) − (ν+1)/2 · Σ_{i=1}^{T} ln(ν + z_i²),   (19)

where z_i = (y_i − X′_i β)/η. The first derivative of the log-likelihood function with respect to β is given by

δl(β, η, ν | y, X)/δβ = (ν+1) Σ_{i=1}^{T} (ν + z_i²)⁻¹ [ X_i (y_i − X′_i β) / η² ],

whereas the second derivative with respect to β, applying the product and chain rules, is

δ²l(β, η, ν | y, X)/δβδβ′ = 2(ν+1) η⁻⁴ Σ_{i=1}^{T} [ (ν + z_i²)⁻² (X_i X′_i)(y_i − X′_i β)² ] − (ν+1) η⁻² Σ_{i=1}^{T} [ (ν + z_i²)⁻¹ (X_i X′_i) ].   (20)

In order to obtain the expected Fisher information I(·) with respect to β, we take the expectation of this expression and multiply it by −1. Fonseca et al. (2008) show that

I(β) = −E[ δ²l(β, η, ν | y, X)/δβδβ′ ] = η⁻² (ν+1)/(ν+3) Σ_{i=1}^{T} X_i X′_i.   (21)

Hecq et al. (2016) adapt the results obtained by Fonseca et al. (2008) to the noncausal model setup. That is, they consider a general MAR(r, s) model

φ(L) ϕ(L⁻¹) y_t = ε_t,   (22)

where ε_t ∼ t(ν, η). To obtain a similar model setup, Hecq et al. (2016) use representations (5) and (6): they substitute these alternative representations of the noncausal model into the original linear representation (17), so that they can compute the standard errors of the causal/noncausal coefficients using the results of Fonseca et al. (2008). In other words, they obtain the standard errors of the causal coefficients using (5), assuming the noncausal parameters are known.
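In matrix form, (21) is one line of code. The following is our own sketch of the resulting standard errors (not the MARX implementation):

```python
import numpy as np

def expected_info_beta(X, nu, eta):
    """Expected Fisher information for beta in the Student's t
    regression, eq. (21): I(beta) = eta^(-2) (nu+1)/(nu+3) sum_i X_i X_i'."""
    X = np.asarray(X, dtype=float)
    return (nu + 1) / ((nu + 3) * eta ** 2) * (X.T @ X)

def std_errors_beta(X, nu, eta):
    """Standard errors: square roots of the diagonal of I(beta)^(-1)."""
    return np.sqrt(np.diag(np.linalg.inv(expected_info_beta(X, nu, eta))))
```

In the adaptation of Hecq et al. (2016), the rows X_i would collect the lagged u's of (5) (respectively the led v's of (6)), the other block of coefficients being treated as known.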
Instead, for the standard errors of the noncausal parameters, they use representation (6), supposing that the causal coefficients are known. This is of course an approximation, which leads to a block diagonal, conditional version of the expected Fisher information matrix (14). For instance, in an MAR(1,1) they obtain the following conditional expected Fisher information matrices for the causal and the noncausal parts:

I(φ | ϕ) = −E[ δ²l(φ, η, ν | ϕ) / δφ² ],  I(ϕ | φ) = −E[ δ²l(ϕ, η, ν | φ) / δϕ² ],

implying both δ²l(·)/δφδϕ = 0 and δ²l(·)/δϕδφ = 0; hence

I(φ, ϕ) = −E[ δ²l(φ, η, ν | ϕ)/δφ²          0
              0          δ²l(ϕ, η, ν | φ)/δϕ² ].   (23)

Obviously, when we invert I(φ, ϕ), we obtain different results from those we would obtain by inverting the complete Fisher information matrix. Hecq et al. (2016) illustrate that this approximation gives mildly satisfactory results. Furthermore, exactly as in Lanne and Saikkonen (2011), Hecq et al. (2016) state that this methodology can be applied only if the error term has a finite variance (hence if ν > 2).

In this section, we investigate the conclusions obtained by Hecq et al. (2016) in the finite variance case of the error term. In particular, we want to evaluate to what extent the assumption of block diagonality of the conditional expected Fisher information matrix yields misspecified standard errors. Indeed, their approach is implemented in the R package MARX and has been applied in several studies. For this purpose, we compute the empirical density functions of the percentage difference between the standard errors obtained through the two aforementioned approaches. In particular, we analyze the empirical density functions of Z_{φ,i} and Z_{ϕ,j}, where:

Z_{φ,i} = (S.E.L._{φ,i} − S.E.H._{φ,i}) / S.E.H._{φ,i} × 100;  Z_{ϕ,j} = (S.E.L._{ϕ,j} − S.E.H._{ϕ,j}) / S.E.H._{ϕ,j} × 100.

S.E.L._{φ,i} indicates the standard error of the i-th causal coefficient obtained through the expected Fisher information matrix (Σ), with i ∈ [1, r], whereas S.E.H._{φ,i} represents the standard error of the i-th causal coefficient derived from Hecq et al.'s approach. The same holds for S.E.L._{ϕ,j} and S.E.H._{ϕ,j}; the only difference is that the latter refer to the j-th noncausal coefficient, with j ∈ [1, s].

The data generating process is an MAR(1,1) with a scale parameter η = 5, T = 1000 observations and 10000 replications. In addition, we consider different values of the degrees of freedom (ν = 3, ν = 4 and ν = 5) and different combinations of values for the causal/noncausal coefficients, that is:

• φ = 0.65, ϕ = 0.35;
• φ = 0.5, ϕ = 0.5;
• φ = 0.35, ϕ = 0.65.

Figures 1-3 show the empirical density functions of Z_φ and of Z_ϕ obtained through the Monte Carlo experiments. We conclude that the standard errors proposed by Lanne and Saikkonen (2011) should be used for non-heavy-tailed models. The approximation developed in Hecq et al. (2016), on the other hand, underestimates the standard errors and consequently provides too narrow a confidence interval. Furthermore, this underestimation decreases with decreasing degrees of freedom. Hence, although the approach proposed by Hecq et al. (2016) is easy to implement, it should only be applied in cases of heavy-tailed disturbances, where Lanne and Saikkonen (2011)'s method already shows some convergence problems. This happens, due to estimation uncertainty, when the degrees of freedom are small, even though the population variance is finite.

In this section we propose a new methodology to compute the standard errors of MAR parameters. It is valid for mixed causal and noncausal models whenever the error term is distributed according to a generalized Student's t-distribution and the sample size is finite. Although in the heavy-tail framework it is not possible to derive the theoretical limiting distributions of these parameters, Monte Carlo simulations in the next section show how our new estimator empirically satisfies Equation (8) for ν ∈ (1, D], with D < ∞.

In Section 3.1, it is stated that the variance of the error term (σ²) multiplies the block diagonal matrices of the expected Fisher information matrix defined in (14).
Since the Student's t-distribution with heavy-tailed innovations is characterized by an undefined variance, the expected Fisher information matrix cannot be computed in this context.

Figure 1: Density plots of the variables Z_φ and Z_ϕ, based on 5 degrees of freedom and T = 1000 observations.

Figure 2: Density plots of the variables Z_φ and Z_ϕ, based on 4 degrees of freedom and T = 1000 observations.

Figure 3: Density plots of the variables Z_φ and Z_ϕ, based on 3 degrees of freedom and T = 1000 observations.

Our alternative strategy consists in replacing the variance of the error term with the variance of the residuals (σ²_ε̂) in (14). Furthermore, especially in those cases where the population variance is not finite (ν ∈ (1, 2]), we rely on a robust estimator of the scale. The median absolute deviation of the residuals is

MAD_ε̂ = median(|ε̂_i − median(ε̂_i)|),   (24)

and a consistent estimate of the standard deviation is given by

σ_ε̂ = k × MAD_ε̂.   (25)

Rousseeuw et al. (1993) show that, if we set k to 1.48, Equation (25) ensures convergence to the standard deviation under the assumption of normality. Let us now find the value of k that, when multiplied by the MAD estimator, provides a robust estimate of the standard deviation of the residuals under the assumption of a Student's t-distribution, for ν ∈ (1, D]. The standard deviation of the residuals depends on two different parameters: the degrees of freedom (ν) and the sample size (T). This implies that k is also a function of ν and T:

k(ν, T) = σ̂_ε̂(ν, T) / MAD_ε̂.
(26)

In other words, k is a random variable with a different density function for each value of ν and T. Suppose ν = 1.5 and T = 500: to obtain the empirical density function of k(1.5, 500), we run a Monte Carlo experiment with ν = 1.5 and T = 500, and in each replication we compute the value of k using Equation (26). In this way, the Monte Carlo experiment yields as many values of k as the number of replications. To identify from these values the empirical density function of k, we use kernel density estimation. Extreme values of k can affect the non-parametric estimation; to avoid this, we keep only the values of k within the range

[Q1 − 1.5 × IQR, Q3 + 1.5 × IQR],

where Q1 and Q3 are respectively the first and the third quartile of k and IQR is its interquartile range. With the remaining values, we obtain an empirical density function as shown in Figure 4. In addition to choosing ν = 1.5 and T = 500, we consider N = 700,000 replications. A large number of replications is important to obtain an empirical density function that is as accurate as possible. Finally, in order to obtain a robust estimate of the standard deviation of the residuals, we take the mode of k, indicating this value as k*. Appendix A provides values of k* for other ν and T.

In conclusion, this approach gives us a Fisher information matrix

Σ̄ = [ σ²_ε̂ J̃ Γ_{U*}   Γ_{U*V*}
      Γ_{V*U*}        σ²_ε̂ J̃ Γ_{V*} ],   (27)

where, using Equation (25), we have

σ²_ε̂ J̃ = ( k*(ν, T) × MAD_ε̂ )² η⁻² (ν+1)/(ν+3).

(For now we are not interested in the scale parameter of the error term since, as Equation (26) shows, η does not affect k.)

Figure 4: Empirical density function of k(ν = 1.5, T = 500), obtained using 700,000 replications.
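The construction of k* can be sketched as follows. This is our own code: the Tukey-fence factor 1.5 and the histogram-based mode are stand-ins for the trimming constant and the kernel-density step described above, which the paper does not fully specify:

```python
import numpy as np

def mad(x):
    """Median absolute deviation of the residuals, eq. (24)."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

def k_draws(nu, T, n_rep=2000, seed=0):
    """Monte Carlo draws of k(nu, T) = sigma_hat / MAD, eq. (26).
    Unit scale is enough, since eta cancels in the ratio."""
    rng = np.random.default_rng(seed)
    ks = np.empty(n_rep)
    for i in range(n_rep):
        eps = rng.standard_t(nu, size=T)
        ks[i] = eps.std(ddof=1) / mad(eps)
    return ks

def k_star(ks, bins=60):
    """Trim draws outside [Q1 - 1.5 IQR, Q3 + 1.5 IQR], then take the
    midpoint of the fullest histogram bin as an estimate of the mode."""
    q1, q3 = np.percentile(ks, [25, 75])
    fence = 1.5 * (q3 - q1)
    kept = ks[(ks >= q1 - fence) & (ks <= q3 + fence)]
    counts, edges = np.histogram(kept, bins=bins)
    j = int(np.argmax(counts))
    return 0.5 * (edges[j] + edges[j + 1])
```

As a sanity check, for large ν the draws concentrate near the Gaussian value k ≈ 1.48 of Rousseeuw et al. (1993), while for heavy tails the sample standard deviation inflates relative to the MAD and the mode is larger.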
So far we have seen that, for mixed causal and noncausal models with an innovation term distributed according to a generalized Student's t-distribution, it is not possible to derive the theoretical limiting distribution in the heavy-tail framework. This section focuses on identifying, through Monte Carlo experiments, which of the aforementioned estimators of the standard errors satisfies Equation (8) in finite samples. As previously stated, we also include in the analysis the standard errors obtained from the observed Fisher information matrix I(φ̂, ϕ̂).

For this purpose, we run several Monte Carlo simulations characterized by N = 10000 replications each. The data generating process is an MAR(1,1) with a scale parameter η = 3 and sample sizes T = (100, 200, 500, 1000). We also consider several degrees of freedom ν, both above and below 2, and different combinations of causal and noncausal coefficients, that is:

• φ = 0, ϕ = 0;
• φ = 0.65, ϕ = 0.35;
• φ = 0.5, ϕ = 0.5;
• φ = 0.35, ϕ = 0.65.

For each replication we test whether the estimated causal and noncausal coefficients are equal to their respective true values. In particular, we compute two different t-tests, H₀: φ = φ₀ and H₀: ϕ = ϕ₀, against the two-sided alternatives φ ≠ φ₀ and ϕ ≠ ϕ₀, respectively. Tables 1-14 show the empirical rejection frequencies (at the nominal significance level of 5%) obtained using the different methodologies to compute the standard errors. In particular, the columns Σ̄̂ and I(φ̂, ϕ̂) indicate the empirical rejection frequencies whenever the standard errors are obtained from the matrices (27) and (23), respectively.

We observe that for ν > 2 the rejection frequencies obtained with (23) remain well above 5%, even though the Student's t-distribution is characterized by tails fatter than a standard normal distribution. The reason is that in the denominator of the t-test we have underestimated standard errors (see Section 3.3). We also observe that our new approach and the observed Fisher information matrix show smaller distortions in small sample sizes (T = 100, T = 200) than the expected Fisher information matrix; the latter only gets close to the 5% nominal rejection frequency for T = 1000. For ν ≤ 2, the conditional expected Fisher information matrix (23) performs better than the observed one, but the results are still far from those that we would have obtained in the case of a standard normal distribution. Our new approach (Σ̄̂) is the only one that allows us to empirically satisfy Equation (8). Moreover, this new method provides slightly smaller empirical rejection frequencies for high values of the causal and noncausal coefficients. This is not an issue of great relevance in terms of inference, as high values are likely significantly different from zero.

We illustrate the differences and the similarities in the computed standard errors of MAR models on four time series.
These are: (a) the annual debt to GDP ratio for Canada from 1870 to 2015 (source:IMF), (b) the variation of daily Covid-19 deaths in Belgium from 10/March/2020 to 17/July/2020(source: WHO), (c) the monthly wheat prices from January 1990 until September 2020 (source:IMF) and (d) the monthly inflation rate in Brazil (obtained from year to year difference on IPCAindex ) observed from January 1997 to June 2020 (source: Central Bank of Brazil). Figure 5presents the data. With this panel of applications we want to show that MAR models are alsointeresting for modeling other series than the usual commodity prices.The way to estimate MAR models imply a series of steps. We first estimate a conventionalcausal autoregressive model by OLS in order to obtain the lag order p using information criteria(see Lanne and Saikkonen (2011)). We find p = 2 for three out of the four series, namely forinflation, debt to GDP ratio and wheat prices whereas p = 4 is chosen for Belgian’s Covid series.Using an AML approach and searching for the r and s with p = r + s that maximize the generalizedStudent’s t likelihood function, Canadian debt/GDP, wheat prices as well as Brazilian inflationfollow a MAR(1,1) and the variation of Covid-19 deaths a MAR(2,2). We detail next the valueof estimated parameters and their standard errors obtained using methods reviewed and newlyintroduced in this paper.From our simulation results, we can expect some differences and similarities given the degreesof freedom estimated for the four variables: for the Canadian series ˆ ν = 2 .
37, for ‘Covid-19 dataˆ ν = 1 .
17, on wheat prices ˆ ν = 2 .
21 and ˆ ν = 3 .
22 for Brazilian inflation. Although we observefat tails in each series, it is only on daily Belgian data that the the degree of freedom is below 2.However, none of them is significantly different from two. To check this, we use the standard errorsgiven by − ( T − p ) − δ l T ( ˆ φ , ˆ ϕ , θ ) /δ θ δ θ (cid:48) with ˆ θ = (ˆ ν, ˆ η ), being a consistent estimator of theexpected Fisher information matrix of the distributional parameters Ω (see Lanne and Saikkonen(2011)). This matrix, unlike Σ, has no restrictions and can be computed also when the population The IPCA targets population families with household income ranging from 1 to 40 minimum wages. Thisincome range guarantees a 90% coverage of families living in 13 geographic zones: metropolitan areas of Bel´em,Fortaleza, Recife, Salvador, Belo Horizonte, Vit´oria, Rio de Janeiro, S˜ao Paulo, Curitiba, Porto Alegre, as well as theFederal District and the cities of Goiˆania and Campo Grande. Basket items include Food and Beverages, Housing,Household Articles, Wearing Apparel, Transportation, Health and Personal Care, Personal Expenses, Education andCommunication. Empirical rejection frequencies - MAR(1,1): φ = 0 , ϕ = 0 , ν = 3 Sample size ˆΣ I ( (cid:98) φ, (cid:98) ϕ ) I ( (cid:98) φ, (cid:98) ϕ ) ˆ¯Σ φ ϕ φ ϕ φ ϕ φ ϕ T=100 24.26% 23.64% 12.08% 11.70% 18.36% 18.45% 9.52% 9.44%T=200 14.93% 15.43% 8.43% 8.77% 14.61% 15.33% 7.21% 7.53%T=500 8.98% 9.35% 5.82% 6.65% 12.14% 12.25% 5.26% 5.76%T=1000 7.17% 7.28% 5.50% 5.48% 10.50% 10.73% 4.74% 4.56%
Table 1: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
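The quantity tabulated here and in the following tables is an empirical rejection frequency: the share of Monte Carlo replications whose t-ratio, (estimate minus true value) divided by its standard error, falls outside ±1.96. A minimal sketch of that computation, assuming studentized estimates as inputs (the function name and the illustrative normal draws are ours, not the paper's):

```python
import numpy as np

def rejection_frequency(estimates, std_errors, true_value, crit=1.96):
    """Fraction of replications whose t-ratio (theta_hat - theta0)/se
    falls outside the interval [-crit, +crit]."""
    t_ratios = (np.asarray(estimates) - true_value) / np.asarray(std_errors)
    return np.mean(np.abs(t_ratios) > crit)

# Illustration: if the t-ratios were exactly standard normal, the
# rejection frequency should be close to the nominal 5% level.
rng = np.random.default_rng(0)
draws = rng.standard_normal(100_000)            # plays the role of the estimates
freq = rejection_frequency(draws, np.ones_like(draws), 0.0)
```

A well-calibrated standard-error estimator should drive this frequency toward 5% as T grows, which is the pattern the tables are meant to reveal.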
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 3

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100       22.37%   22.36%   10.52%   10.83%   16.85%   16.53%    8.21%    8.20%
T=200       13.91%   13.90%    6.96%    7.77%   12.55%   13.31%    4.47%    4.80%
T=500        8.22%    8.42%    5.47%    5.89%   10.26%   10.27%    4.76%    5.16%
T=1000       7.38%    6.95%    5.65%    5.37%    9.66%    9.37%    4.82%    4.27%

Table 2: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 3

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100       23.50%   23.80%   11.33%   11.16%   17.44%   17.35%    8.53%    8.58%
T=200       14.60%   14.97%    7.65%    8.20%   14.26%   14.65%    5.19%    5.21%
T=500        8.82%    8.95%    5.85%    6.25%   11.55%   11.91%    5.08%    5.36%
T=1000       7.30%    7.04%    5.35%    5.32%   10.64%   10.25%    4.54%    4.44%

Table 3: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 3

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100       23.12%   22.34%   10.80%   10.47%   16.34%   16.75%    8.33%    8.17%
T=200       13.47%   13.76%    7.40%    6.80%   12.44%   13.06%    6.30%    6.41%
T=500        8.68%    8.66%    5.80%    9.97%   10.47%   10.48%    5.12%    5.08%
T=1000       7.07%    6.52%    5.37%    5.21%    9.30%    9.07%    4.41%    4.10%

Table 4: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.

Empirical rejection frequencies - MAR(1,1): φ = 0, ϕ = 0, ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     13.32%   13.56%   12.20%   12.63%    6.18%    6.41%
T=200          /        /     10.50%   11.08%    9.62%   10.21%    5.18%    5.73%
T=500          /        /      9.55%    9.81%    8.73%    8.67%    4.71%    4.74%
T=1000         /        /      8.87%    9.58%    7.88%    8.68%    4.18%    4.89%

Table 5: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
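From Table 5 onward the designs set ν below 2, so the error term has a finite mean but an infinite population variance (which is why the Σ̂ columns are not available). Draws from such a generalized Student's t can be generated by rescaling a standard Student's t variate by the scale parameter η; a small sketch under that assumption (the function name is ours):

```python
import numpy as np

def generalized_t_sample(nu, eta, size, rng=None):
    """Draws from a Student's t with nu degrees of freedom rescaled by
    the scale parameter eta.  For 1 < nu <= 2 the draws have a finite
    mean but an infinite population variance."""
    rng = np.random.default_rng() if rng is None else rng
    return eta * rng.standard_t(nu, size=size)

# Error term for a heavy-tail Monte Carlo design such as those above.
rng = np.random.default_rng(1)
eps = generalized_t_sample(nu=1.5, eta=1.0, size=1000, rng=rng)
```

Because the population variance does not exist for these draws, sample second moments diverge as the sample grows, which is exactly the situation in which the classical information-matrix estimators break down.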
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     10.06%   11.82%   10.93%   11.41%    5.54%    5.73%
T=200          /        /      7.38%    9.64%    7.82%    9.28%    4.36%    5.04%
T=500          /        /      6.72%    8.36%    7.18%    8.10%    3.74%    4.56%
T=1000         /        /      6.53%    8.83%    6.81%    8.32%    3.77%    4.45%

Table 6: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     11.24%   11.18%   11.72%   11.54%    5.58%    6.08%
T=200          /        /      8.59%    8.87%    8.81%    9.14%    4.58%    4.74%
T=500          /        /      7.33%    7.45%    7.63%    7.67%    3.99%    4.08%
T=1000         /        /      7.29%    8.37%    7.34%    7.96%    4.04%    4.41%

Table 7: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     12.29%    9.95%   11.32%   11.11%    5.86%    5.64%
T=200          /        /      9.45%    7.77%    9.10%    8.04%    4.71%    4.44%
T=500          /        /      8.29%    6.74%    7.77%    7.17%    4.18%    3.90%
T=1000         /        /      7.92%    7.19%    7.72%    7.43%    4.33%    4.13%

Table 8: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.

Empirical rejection frequencies - MAR(1,1): φ = 0, ϕ = 0, ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     16.72%   16.19%   12.63%   12.90%    5.81%    5.82%
T=200          /        /     13.93%   14.63%   11.06%   11.31%    5.35%    5.37%
T=500          /        /     12.65%   12.84%   10.32%   10.10%    5.41%    4.84%
T=1000         /        /     11.92%   11.99%    9.62%    9.46%    4.97%    5.42%

Table 9: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     11.28%   14.71%   10.90%   12.08%    4.75%    5.13%
T=200          /        /      9.50%   12.47%    9.00%   10.33%    4.19%    5.01%
T=500          /        /      7.70%   11.20%    8.06%    9.67%    3.94%    4.56%
T=1000         /        /      7.63%   10.76%    7.15%    8.79%    3.61%    4.81%

Table 10: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     13.09%   13.13%   11.52%   11.87%    4.98%    5.18%
T=200          /        /     11.20%   10.76%    9.67%    9.63%    4.35%    4.80%
T=500          /        /      9.39%    9.95%    8.79%    8.91%    4.19%    4.15%
T=1000         /        /      9.04%    9.03%    7.61%    8.23%    3.93%    4.34%

Table 11: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     14.27%   11.48%   12.00%   10.98%    5.10%    4.70%
T=200          /        /     12.51%    9.09%   10.16%    8.54%    4.67%    4.27%
T=500          /        /     10.71%    8.39%    9.32%    8.47%    4.49%    3.96%
T=1000         /        /      9.96%    7.84%    8.23%    7.46%    4.23%    3.94%

Table 12: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.

Empirical rejection frequencies - MAR(1,1): φ = 0, ϕ = 0, ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     21.30%   21.22%   14.91%   14.34%    6.11%    6.16%
T=200          /        /     20.15%   19.66%   13.85%   14.08%    5.91%    6.24%
T=500          /        /     19.00%   18.45%   13.34%   12.93%    6.03%    5.28%
T=1000         /        /     18.18%   18.64%   12.95%   12.89%    5.62%    5.56%

Table 13: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     13.70%   18.38%   12.30%   13.39%    4.62%    5.69%
T=200          /        /     11.85%   16.86%   10.69%   13.02%    4.51%    5.59%
T=500          /        /     10.22%   15.17%    9.22%   11.57%    4.04%    4.26%
T=1000         /        /      9.69%    5.35%    8.61%   11.38%    3.53%    4.81%

Table 14: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     16.54%   16.12%   13.26%   12.65%    5.31%    5.09%
T=200          /        /     14.97%   13.93%   12.12%   12.00%    4.90%    4.91%
T=500          /        /     13.21%   12.87%   10.67%   10.72%    4.51%    3.98%
T=1000         /        /     12.02%   11.29%    9.82%   10.35%    3.86%    4.12%

Table 15: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
Empirical rejection frequencies - MAR(1,1): φ = 0., ϕ = 0., ν = 1.

Sample size      Σ̂                I(φ̂, ϕ̂)         I(φ̂, ϕ̂)         Σ̄̂
               φ        ϕ        φ        ϕ        φ        ϕ        φ        ϕ
T=100          /        /     19.05%   13.43%   14.26%   12.09%    5.79%    4.72%
T=200          /        /     17.39%   11.42%   12.92%    9.67%    5.27%    4.34%
T=500          /        /     15.54%    9.96%   11.59%    9.24%    5.05%    3.48%
T=1000         /        /     14.67%    9.88%   11.37%    9.22%    4.73%    3.63%

Table 16: Percentage of observations outside the interval [-1.96, +1.96]. This value is equal to 5% in a standard normal distribution.
[Figure 5 here. Panels: (a) Canadian government debt as % of GDP (1870-2015), annual data for the Canadian debt expressed as a percentage of GDP; (b) Covid-19 Belgium: variation of daily deaths (10/03/2020 - 17/07/2020), daily data for the variation of Covid-19 deaths in Belgium; (c) Wheat prices (1990:01-2020:09), monthly data; (d) Brazilian inflation (1997:01-2020:06), monthly data for the inflation rate in Brazil.]
Figure 5: Charts of the four time series covered by the empirical investigation.

The results are sensitive to the approach used to compute the standard errors. In the empirical application concerning the Canadian debt (Table 17), we obtain too narrow confidence intervals whenever we compute the standard errors using the Hecq et al. (2016) or the Lanne and Saikkonen (2011) methodology. Instead, in the table associated with the Brazilian inflation rate (Table 20), we notice that the standard errors obtained through the robust estimator of the residuals are larger (and consequently so are the confidence intervals) than those obtained by the "traditional" methodologies described in Section 3. The same is true for the causal coefficient in the wheat prices application (Table 19). For the noncausal coefficient of the same empirical application, we obtain too narrow confidence intervals when the standard error is computed by I(φ̂, ϕ̂) and by Σ̂. Finally, the time series of the variation of Covid-19 deaths in Belgium (Table 18) is characterized by an error term with an undefined variance. Our method yields narrower confidence intervals than those obtained using Hecq et al. (2016) and the observed Fisher information matrix.

Canadian debt expressed as % of GDP
                          Standard errors
Estimated coefficients    I(φ̂, ϕ̂)    I(φ̂, ϕ̂)    Σ̂           Σ̄̂
φ̂ = 0.6504               0.072153   0.031742   0.031260   0.048623
ϕ̂ = 0.8860               0.029278   0.012380   0.019083   0.029682
η̂ = 2.        ν̂ = 2.37

Table 17: Estimated coefficients and standard errors for the Canadian debt ratio.
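Standard errors such as those reported in Tables 17-20 can be based on the inverse of the negative Hessian of the log-likelihood at the ML estimate. The following is a generic finite-difference sketch, not the paper's exact implementation (the function names are ours, and the toy Gaussian likelihood only illustrates the mechanics):

```python
import numpy as np

def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at theta."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    H = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            e_i = np.zeros(k); e_i[i] = h
            e_j = np.zeros(k); e_j[j] = h
            H[i, j] = (f(theta + e_i + e_j) - f(theta + e_i - e_j)
                       - f(theta - e_i + e_j) + f(theta - e_i - e_j)) / (4 * h**2)
    return H

def hessian_std_errors(loglik, theta_hat):
    """Standard errors from the inverse of the negative Hessian of the
    log-likelihood evaluated at the ML estimate."""
    H = numerical_hessian(loglik, theta_hat)
    return np.sqrt(np.diag(np.linalg.inv(-H)))

# Toy check: for an i.i.d. N(mu, 1) log-likelihood with n = 400, the
# standard error of mu should be close to 1/sqrt(400) = 0.05.
x = np.random.default_rng(2).standard_normal(400)
loglik = lambda th: -0.5 * np.sum((x - th[0]) ** 2)
se = hessian_std_errors(loglik, np.array([x.mean()]))
```

In the MAR setting, `loglik` would be the generalized Student's t log-likelihood as a function of the parameters of interest, and the distinction between the estimators compared in this paper lies precisely in which (expected, observed, or sample-variance-based) curvature matrix is inverted.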
Covid-19 in Belgium: variation of daily deaths
                          Standard errors
Estimated coefficients    I(φ̂, ϕ̂)    I(φ̂, ϕ̂)    Σ̂    Σ̄̂
φ̂₁ = -0.4660             0.056116   0.028793   /    0.024285
φ̂₂ = -0.5853             0.033870   0.028793   /    0.024277
ϕ̂₁ = 0.0803              0.045516   0.025620   /    0.023881
ϕ̂₂ = 0.6037              0.033870   0.025619   /    0.023881
η̂ = 4.        ν̂ = 1.17

Table 18: Estimated coefficients and standard errors for the variation of daily Covid-19 deaths in Belgium.
Wheat prices
                          Standard errors
Estimated coefficients    I(φ̂, ϕ̂)    I(φ̂, ϕ̂)    Σ̂           Σ̄̂
φ̂ = 0.9241               0.007701   0.003549   0.007851   0.013969
ϕ̂ = 0.2866               0.051349   0.023949   0.019681   0.035019
η̂ = 6.        ν̂ = 2.21

Table 19: Estimated coefficients and standard errors for wheat prices.
Brazilian inflation rate
                          Standard errors
Estimated coefficients    I(φ̂, ϕ̂)    I(φ̂, ϕ̂)    Σ̂           Σ̄̂
φ̂ = 0.5842               0.036605   0.028383   0.038656   0.046492
ϕ̂ = 0.9385               0.009304   0.006895   0.016444   0.019777
η̂ = 0.        ν̂ = 3.22

Table 20: Estimated coefficients and standard errors for the inflation rate in Brazil.

In this paper we first reviewed the behaviour of the ML estimator for mixed causal and noncausal models, focusing in particular on those with an error term distributed according to a generalized Student's t-distribution. We have seen that the expected Fisher information matrix of the causal and noncausal parameters (derived by Lanne and Saikkonen (2011)) can be computed if and only if the probability density function satisfies a certain set of assumptions. The generalized Student's t-distribution with an infinite variance (ν ∈ (1, 2]) does not satisfy these assumptions.

The following table shows the different values of k that maximize their own empirical density functions according to the different values of T and ν selected for the Monte Carlo simulations presented in Section 5.

k*        ν = 1.      ν = 1.      ν = 1.      ν = 1.      ν = 1.      ν = 3
T=100     4.186322    3.317155    3.049654    2.866044    2.57295     1.937395
T=200     5.311298    3.901011    3.557615    3.2330488   2.85024     2.02271
T=500     7.266156    4.941986    4.297126    3.849296    3.233094    2.082257
T=1000    9.022733    5.839081    4.971029    4.330869    3.491673    2.116381

References
Alessi, L., Barigozzi, M., and Capasso, M. Non-fundamentalness in structural econometric models: A review. International Statistical Review 79, 1 (2011), 16-47.

Andrews, B., and Davis, R. A. Model identification for infinite variance autoregressive processes. Journal of Econometrics 172, 2 (2013), 222-234.

Andrews, B., Davis, R. A., and Breidt, F. J. Maximum likelihood estimation for all-pass time series models. Journal of Multivariate Analysis 97, 7 (2006), 1638-1659.

Breidt, F. J., Davis, R. A., Lii, K.-S., and Rosenblatt, M. Maximum likelihood estimation for noncausal autoregressive processes. Journal of Multivariate Analysis 36, 2 (1991), 175-198.

Cavaliere, G., Nielsen, H. B., and Rahbek, A. Bootstrapping noncausal autoregressions: with applications to explosive bubble modeling. Journal of Business & Economic Statistics 38, 1 (2020), 55-67.

Davis, R., and Resnick, S. Limit theory for moving averages of random variables with regularly varying tail probabilities. The Annals of Probability (1985), 179-195.

Davis, R. A., Knight, K., and Liu, J. M-estimation for autoregressions with infinite variance. Stochastic Processes and Their Applications 40, 1 (1992), 145-180.

Fries, S., and Zakoian, J.-M. Mixed causal-noncausal AR processes and the modelling of explosive bubbles (2017).

Hamilton, J. Time Series Analysis. Princeton University Press (1994).

Hecq, A., Issler, J. V., and Telg, S. Mixed causal-noncausal autoregressions with exogenous regressors. Journal of Applied Econometrics 35, 3 (2020), 328-343.

Hecq, A., Lieb, L., and Telg, S. Identification of mixed causal-noncausal models in finite samples. Annals of Economics and Statistics/Annales d'Économie et de Statistique, 123/124 (2016), 307-331.

Hecq, A., and Voisin, E. Predicting bubble bursts in oil prices using mixed causal-noncausal models. arXiv preprint arXiv:1911.10916 (2019).

Hecq, A., and Voisin, E. Forecasting bubbles with mixed causal-noncausal autoregressive models. Econometrics and Statistics (2020).

Lanne, M., and Saikkonen, P. Modeling expectations with noncausal autoregressions. Available at SSRN 1210122 (2008).

Lanne, M., and Saikkonen, P. Noncausal autoregressions for economic time series. Journal of Time Series Econometrics 3, 3 (2011).

Lanne, M., and Saikkonen, P. Noncausal vector autoregression. Econometric Theory (2013), 447-481.

Rousseeuw, P. J., and Croux, C. Alternatives to the median absolute deviation. Journal of the American Statistical Association 88, 424 (1993), 1273-1283.