The Identification Problem for Linear Rational Expectations Models
Majid M. Al-Sadoon (Durham University Business School)
Piotr Zwiernik (Universitat Pompeu Fabra & BGSE)
August 27, 2019
arXiv [math.ST]
Abstract
We consider the problem of the identification of stationary solutions to linear rational expectations models from the second moments of observable data. Observational equivalence is characterized and necessary and sufficient conditions are provided for: (i) identification under affine restrictions, (ii) generic identification under affine restrictions of analytically parametrized models, and (iii) local identification under non-linear restrictions. The results strongly resemble the classical theory for VARMA models, although significant points of departure are also documented.

JEL Classification: C10, C22, C32.
Keywords: Identification, linear rational expectations models, linear systems, vector autoregressive moving average models.

∗ Thanks are due to Manfred Deistler, Benedikt Pötscher, Hashem Pesaran, Fabio Canova, Barbara Rossi, Christian Brownlees, Geert Mesters, Davide Debortoli, Gábor Lugosi, Omiros Papaspiliopoulos, Albert Satorra, Bernd Funovits, Peter C. B. Phillips, Michele Piffer, and to seminar participants at Universitat Pompeu Fabra, University of Vienna, UC San Diego, University of Southampton, Universitat de les Illes Balears, NYU Abu Dhabi, London School of Economics, and Queen Mary University of London.
† MMA acknowledges support by Spanish Ministry of Economy and Competitiveness projects ECO2012-33247 and ECO2015-68136-P (MINECO/FEDER, UE) and Fundación BBVA Scientific Research Grant PR16-DAT-0043.
‡ PZ acknowledges the support of the Spanish Government grants RYC-2017-22544, PGC2018-101643-B-I00, and SEV-2015-0563, and Ayudas Fundación BBVA a Equipos de Investigación Científica 2017.

Introduction
The linear rational expectations model (LREM) is distinguished among dynamic linear systems in that the present state depends not only on events leading up to the present but also on endogenously formulated expectations of the future. Such models are used today by researchers, practitioners, and policy-makers for causal and counter-factual analysis as well as forecasting. Yet the statistical properties of LREMs are poorly understood, and identification in particular has remained an open problem throughout what is now known as the rational expectations revolution.

Establishing identifiability of this model class is important for several reasons. From the parametrization point of view, the parameters of LREMs codify the decision making of economic actors (e.g. households or firms) and it is important to know whether or not this behaviour can be learned from the available data. From the estimation point of view, lack of identifiability leads to ill-conditioned optimization procedures when employing extremum estimators; Bayesian methods are not immune to identification failure either, as the posterior retains the shape of the prior along observationally equivalent directions in the parameter space. Finally, inference is substantially more difficult in the absence of identification from both the frequentist and Bayesian perspectives.

Partial results for identification of LREMs were derived by Muth (1981), Wallis (1980), and Pesaran (1981). However, these results apply to very restricted LREMs and cannot be employed in the study of modern LREMs. Subsequent econometric work in the area completely ignored identification until Canova & Sala (2009) called attention to serious identification problems plaguing many LREMs used in practice, a point echoed by Pesaran & Smith (2011), Romer (2016), and Blanchard (2018).
This spurred a number of researchers to provide computational diagnostics for local and global identification (Iskrev, 2010; Komunjer & Ng, 2011; Qu & Tkachenko, 2017; Kociecki & Kolasa, 2018). Unfortunately, this work has not attempted an analytical examination of the mapping from parameters to observables of LREMs. Consequently, it has failed to uncover the underlying reasons for identification failure and has resorted instead to detecting the symptoms. A further consequence of this is that it has not been possible to make strong connections to classical identification results (e.g. the work of Hannan & Deistler (2012)).

The present work builds on recent results by Onatski (2006), Anderson et al. (2012), Anderson et al. (2016), and Al-Sadoon (2018) to explain why identification failure occurs in LREMs, provide a characterization of observational equivalence, and provide analytical diagnostic tests of identification that extend classical results for vector autoregressive moving average (VARMA) models. The key idea is that the mapping from parameters to observables of LREMs involves an initial Wiener-Hopf factorization but is otherwise identical to the mapping for VARMA. Once this is recognized, the theory proceeds almost exactly analogously to the classical theory.

LREMs are subject to identification failure for some of the same reasons that simultaneous equations models and VARMA are. This is to be expected since the class of LREMs nests the aforementioned classes of models. However, there is a new source of identification failure that afflicts only LREMs and that has to do with the endogeneity of expectations. The fact that expectations in LREMs are functions of other endogenous variables necessitates more restrictions than might be called for in a classical model. Said differently, forward dependence is not identified.
This result has been known to many authors in the literature, although it receives its most general treatment in this paper.

Our characterization of observational equivalence extends well-known results in the VARMA literature. We find that every class of observationally equivalent models sits inside a particular subspace of the parameter space, the dimension of which then gives the number of restrictions necessary for identification. This number is determined for both specific and generic points in the parameter space.

We then consider the identification problem under affine restrictions. These restrictions include zero restrictions, normalization to one restrictions, and (more generally) restrictions on linear combinations of parameters, possibly across equations. This leads to the geometric picture in Figure 1. Every point in the parameter space can now be thought of as lying at the intersection of two subspaces, the set of observationally equivalent parameters, E, and the set of parameters satisfying the affine restrictions, R. When these two subspaces intersect at a single point, that point is identified. Otherwise, they intersect along a subspace and local identification fails. This observation allows us to obtain necessary and sufficient conditions for identification under affine restrictions. It also immediately implies the equivalence of global and local identification for LREMs identified by affine restrictions.

Most LREMs in practice consist of parameters that are themselves rational functions of more fundamental parameters, the so-called "deep parameters." In this case, it will be difficult to provide necessary and sufficient conditions for identification under affine restrictions that would be easily testable. However, we show that necessary and sufficient conditions for identification under affine restrictions are still possible for generic systems. When the restrictions are non-linear, we provide necessary and sufficient conditions for local identification.
The geometry of Figure 1 continues to be helpful here even as R is no longer an affine space.

It is perhaps worth pointing out some distinctive features of our analysis at the outset. Unlike previous approaches, ours does not require any special assumptions on the number of exogenous shocks relative to observables (i.e. regularity) or redundant dynamics (i.e. minimality). Our approach does, however, make a strong identifying assumption on the first impulse response of the system. Nevertheless, we believe the new approach paves the way for much further progress on identification of LREMs, as we discuss later on.

The reader wishing to have the full picture of the theory is advised to begin with Appendix A before starting Section 3. Section 2 sets the notation. Section 3 introduces the LREM and its solution. Section 4 characterizes observational equivalence in LREMs. Section 5 uses these results to provide conditions for identifiability of a parameter under affine constraints. Section 6 extends the set-up to non-linear parametrizations and constraints. Section 7 concludes.

Notation
Denote by Z ⊂ R ⊂ C the sets of integers, real numbers, and complex numbers respectively. We will need T = {z ∈ C : |z| = 1}, D = {z ∈ C : |z| < 1}, and D̄ = {z ∈ C : |z| ≤ 1}, the unit circle, the open unit disk, and the closed unit disk respectively. Complements of sets will be denoted by a superscript c. Denote by R[z] ⊂ R[z, z^{-1}] ⊂ R(z) the sets of real polynomials in z, real Laurent polynomials in z, and real irreducible rational functions in z respectively. Similarly, R[z^{-1}] is the set of real polynomials in z^{-1}. When forming arrays populated by elements of a given set, we will use the superscript n × m (e.g. R[z]^{n×m} is the set of n × m polynomial matrices). When m = 1, we will simply use the superscript n (e.g. R^n is the set of n-dimensional real vectors). For a non-zero B ∈ R[z, z^{-1}]^{n×m}, we denote by max deg(B) the highest power of z that appears in B, while min deg(B) is the lowest power of z that appears in B. Finally, we will denote by I_n and 0_{n×m} the n × n identity matrix and the n × m matrix of zeros respectively.

The linear rational expectations model (LREM) is characterized by the structural equations

\sum_{i=-q}^{p} B_i E_t(Y_{t-i}) = \sum_{i=0}^{k} A_i ε_{t-i},   t ∈ Z.   (1)

Here ε is a sequence of m-dimensional, exogenous, and unobserved i.i.d. variables of mean zero and var(ε_t) = I_m, while Y is a sequence of n-dimensional endogenous observed variables. The variables in ε are understood as exogenous inputs to the system (e.g. shocks to productivity, monetary policy, etc.), while the variables in Y are understood as the output of the system (e.g. inflation, interest rates, etc.). The coefficient matrices A_0, …, A_k ∈ R^{n×m} codify the direct contemporaneous and lagged effects of the shocks on the system. The coefficient matrices B_{-q}, …, B_p ∈ R^{n×n} codify how the expected, current, and lagged values of Y are directly related to each other.
It will be convenient to encode the parameters as

B(z) = \sum_{i=-q}^{p} B_i z^i,   A(z) = \sum_{i=0}^{k} A_i z^i.

Later we will introduce restrictions on these parameters and allow them to depend in a possibly non-linear way on another set of parameters. Following the usual convention, we often omit the argument in the notation, writing simply B and A.

Example 3.1.
The most common specification of the LREM takes the form

B = \begin{pmatrix} Φ & 0_{l×(n-l)} \\ Γ & B_{22} \end{pmatrix},   A = \begin{pmatrix} Θ & 0_{l×(m-l)} \\ 0_{(n-l)×l} & A_{22} \end{pmatrix}.

Here all submatrices except for B_{22} ∈ R[z, z^{-1}]^{(n-l)×(n-l)} are polynomial matrices. Thus, the first l variables of Y are considered exogenous in the last n − l equations, which model the economic behaviour of primary interest. The first l variables of ε affect the dynamics of the exogenous variables, while the rest enter into the system directly.

In the special case where B is a polynomial matrix, we have the most general formulation of the classical VARMAX model with exogenous variables that have rational spectral density. Specializing further to the case where all matrices are constant, we obtain the classical simultaneous equations model.

A solution to (1) will be understood to be an n-dimensional stochastic process Y satisfying the causality condition that Y_t be measurable with respect to the σ-algebra generated by ε_t, ε_{t-1}, … for all t ∈ Z, in addition to the structural equations (1) with E_t(·) = E(·|ε_t, ε_{t-1}, …) for all t ∈ Z.

The parameter space of the LREM, denoted by Ω_LREM, is a set of pairs (
B, A) ∈ R[z, z^{-1}]^{n×n} × R[z]^{n×m} characterized by three restrictions, which we carefully introduce in this and the next section. The first of these restrictions is

(EU-LREM)  B = B_- B_+, where
  B_- ∈ R[z^{-1}]^{n×n}, rank(B_-(z)) = n for all z ∈ D^c, and lim_{z→∞} B_-(z) = I_n,
  B_+ ∈ R[z]^{n×n}, rank(B_+(z)) = n for all z ∈ D̄.

Restriction (EU-LREM) is equivalent to the existence and uniqueness of a stationary solution to (1) (see Proposition 1 of Onatski (2006) and Theorem 3.2 of Al-Sadoon (2018)). This is because it is equivalent to the existence of a Wiener-Hopf factorization with zero partial indices for B (Clancey & Gohberg, 1981, Theorems I.1.1, I.1.2, and I.2.1).

To express the solution, we will need to recall the operators

[\sum_{i=-∞}^{∞} D_i z^i]_+ = \sum_{i=0}^{∞} D_i z^i,   [\sum_{i=-∞}^{∞} D_i z^i]_- = \sum_{i=-∞}^{-1} D_i z^i,

where \sum_{i=-∞}^{∞} D_i z^i converges in an annulus {z ∈ C : ρ < |z| < ρ^{-1}} for some ρ ∈ (0, 1). The unique stationary solution to (1) is then given by

Y_t = B_+^{-1}(L) [B_-^{-1} A]_+(L) ε_t,   t ∈ Z,   (2)

where L is the lag operator, B_+^{-1}(L) is the composition of L and the Taylor series expansion of B_+^{-1}(z) in a neighbourhood of z = 0, and [B_-^{-1} A]_+(L) is the composition of L and the Taylor series expansion of [B_-^{-1} A]_+(z) in a neighbourhood of z = 0.

Now defining A_+ = B_- [B_-^{-1} A]_+, we can rewrite the solution in the more familiar form

Y_t = B^{-1}(L) A_+(L) ε_t,   t ∈ Z,   (3)

where B^{-1}(L) is the composition of L and the Laurent series expansion of B^{-1}(z) in a neighbourhood of T and A_+(L) is the composition of L and the Laurent series expansion of A_+(z) in a neighbourhood of T. This expression is similar to the solution of the classical VARMA model save that A is replaced by A_+.

From (2) and (3) we see that there are two analytical expressions for the transfer function of the solution, B_+^{-1}[B_-^{-1}A]_+ and B^{-1}A_+.
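As a concrete numerical sketch of the [·]_± operators, the snippet below truncates the Laurent expansion of B_-^{-1}(z)A(z) in a scalar case and recovers [B_-^{-1}A]_+ (the values of b_-, a_0, a_1 are arbitrary illustrations; the closed form being checked anticipates Example 3.2 below):

```python
import numpy as np

# Scalar illustration of the [.]_+ operator: B_minus(z) = 1 - b_minus z^{-1},
# A(z) = a0 + a1 z; parameter values are arbitrary with |b_minus| < 1.
b_minus, a0, a1 = 0.5, 1.0, 2.0
N = 60  # truncation order of the geometric series for B_minus^{-1}

# Laurent coefficients of B_minus^{-1}(z) A(z) = (a0 + a1 z) * sum_{i>=0} b_minus^i z^{-i},
# stored as a dict {power of z: coefficient}
coeffs = {}
for i in range(N):
    for power, a in [(0, a0), (1, a1)]:
        coeffs[power - i] = coeffs.get(power - i, 0.0) + a * b_minus**i

# [.]_+ keeps the non-negative powers only
plus_part = {p: c for p, c in coeffs.items() if p >= 0}

# the truncation is a polynomial: here (a0 + a1*b_minus) + a1*z, as in Example 3.2
assert abs(plus_part[0] - (a0 + a1 * b_minus)) < 1e-12
assert abs(plus_part[1] - a1) < 1e-12
```

The negative-power part of the expansion is an infinite tail, but the [·]_+ truncation is a polynomial, which is the content of Lemma 3.1 (ii) below.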
Each of these expressions plays a crucial role in the theory of identification of LREMs.

For future reference, we collect some of the algebraic consequences of (EU-LREM) in the following lemma.

Lemma 3.1.
Let (B, A) ∈ R[z, z^{-1}]^{n×n} × R[z]^{n×m} and let B satisfy (EU-LREM). Then:
(i) min deg(B_-) = min deg(B) and max deg(B_+) = max deg(B).
(ii) [B_-^{-1} A]_+ ∈ R[z]^{n×m} and max deg([B_-^{-1} A]_+) = max deg(A).
(iii) A_+ ∈ R[z, z^{-1}]^{n×m}, min deg(A_+) ≥ min deg(B), and max deg(A_+) = max deg(A).
(iv) [A_+]_+ = A.

Proof. (i) Let the terms of lowest and highest degree of B_- and B_+ be B_{-,µ} z^{-µ} and B_{+,ν} z^{ν} respectively. Then the terms of lowest and highest degree of B = B_- B_+ are B_{-,µ} B_{+,0} z^{-µ} and B_{-,0} B_{+,ν} z^{ν} respectively, as B_{+,0} = B_+(0) and B_{-,0} = I_n are non-singular.
(ii) The highest degree term of the Laurent series expansion of B_-^{-1} in an annulus containing T is I_n. Thus, the highest degree term of B_-^{-1} A is the highest degree term of A and the result follows.
(iii) Follows directly from (i) and (ii).
(iv) Compute

[A_+]_+ = [B_- [B_-^{-1} A]_+]_+ = [B_- (B_-^{-1} A)]_+ - [B_- [B_-^{-1} A]_-]_+ = [A]_+ - 0 = A.

The second equality follows from the fact that B_-^{-1} A = [B_-^{-1} A]_+ + [B_-^{-1} A]_-. The third equality follows from the fact that the Laurent series expansion of B_- [B_-^{-1} A]_- in an annulus containing T consists of only negative powers of z.

It follows from Lemma 3.1 (i) that the factors B_±(z) are polynomial matrices in z^{±1} with degrees determined by the minimum and maximum degrees of B. Thus, in the VARMA setting where min deg(B) = 0, we have B_- = I_n and B_+ = B, so (EU-LREM) reduces to the condition for existence and uniqueness of a causal stationary solution to the VARMA model (condition (EU-VARMA) in Appendix A). Lemma 3.1 (ii) implies that the solution (2) can be viewed equivalently as a solution to the VARMA model with autoregressive part B_+ and moving average part [B_-^{-1} A]_+, a fact that will play a crucial role in our analysis. Lemma 3.1 (iii) proves that A_+ is a Laurent polynomial with degrees bounded by the minimum degree of B and the maximum degree of A.
It follows, again, that in the VARMA setting A_+ = A. Finally, Lemma 3.1 (iv) is the remarkable property of the solution that even though the mapping (B, A) ↦ (B, A_+) is highly non-linear, it is one-to-one and its left inverse is linear.

Example 3.2.
Consider the LREM (1) with n = m = p = q = k = 1 under restriction (EU-LREM). Then Lemma 3.1 (i) implies that we may write B = B_- B_+ with B_-(z) = 1 - b_- z^{-1}, B_+(z) = b_0 (1 - b_+ z), and |b_±| < 1. Note that in this parameterization B_{-1} = -b_0 b_-, B_0 = b_0 (1 + b_- b_+), and B_1 = -b_0 b_+. Writing A(z) = a_0 + a_1 z, we have

B_-^{-1}(z) A(z) = \frac{a_0 + a_1 z}{1 - b_- z^{-1}} = (a_0 b_- + a_1 b_-^2) \sum_{i=1}^{∞} b_-^{i-1} z^{-i} + a_0 + a_1 b_- + a_1 z

in an annulus containing T. This implies that

[B_-^{-1} A]_+(z) = a_0 + a_1 b_- + a_1 z.

In consequence,

A_+(z) = -(a_0 b_- + a_1 b_-^2) z^{-1} + a_0 + a_1 z.

The solution finally is given by

Y_t = \frac{a_0 + a_1 b_- + a_1 L}{b_0 (1 - b_+ L)} ε_t,

which is the same as the solution of the ARMA model with autoregressive part B_+(z) = b_0 (1 - b_+ z) and moving average part [B_-^{-1} A]_+(z) = a_0 + a_1 b_- + a_1 z.

Before completing the characterization of the parameter space, it is helpful to consider observational equivalence as a means to motivate the second and third properties of the parameter space. The spectral density of the observed data,

f_{YY}(z) = \sum_{j=-∞}^{∞} cov(Y_j, Y_0) z^j,

satisfies

f_{YY}(z) = B^{-1}(z) A_+(z) A_+′(z^{-1}) B^{-1}′(z^{-1}).

We say that two parameters (
B, A) and (B̃, Ã) are observationally equivalent and denote this by (B, A) ∼ (B̃, Ã) if both produce the same spectral density; that is, if and only if

B^{-1}(z) A_+(z) A_+′(z^{-1}) B^{-1}′(z^{-1}) = B̃^{-1}(z) Ã_+(z) Ã_+′(z^{-1}) B̃^{-1}′(z^{-1}).

Evidently observationally equivalent parameters are related in a very complicated way. The traditional way forward in the VARMA literature has been to first impose restrictions that identify the transfer function before considering how to identify the parameters from the transfer function (Hannan & Deistler, 2012, Theorem 1.3.3). In particular, it is typically imposed that for every parameter (
B, A) the transfer function B^{-1}(z) A_+(z) is of full rank for all z ∈ D. This condition is known variably in the literature as the invertibility, fundamentalness, or minimum phase condition. The transfer function then corresponds to a Wold representation of Y (Hannan & Deistler, 2012, p. 25). Imposing this additional restriction allows us to conclude that (B, A) ∼ (B̃, Ã) if and only if there exists an orthogonal matrix V ∈ R^{m×m} such that

B̃^{-1} Ã_+ = B^{-1} A_+ V.

See e.g. Theorems 4.6.8 and 4.6.11 of Lindquist & Picci (2015). Thus, we have identified the transfer function from the spectral density matrix up to an orthogonal transformation.

Restrictions that eliminate V are very well understood in the VARMA literature (Lütkepohl, 2005, Chapter 9). The simplest and most convenient choice for our purposes is to restrict the first coefficient matrix of the transfer function to be canonical quasi-lower triangular as in Anderson et al. (2012). Here, the first non-zero element of column j is positive and occurs in row i_j with 1 ≤ i_1 < ⋯ < i_m ≤ n. In the special case n = m, this is just the Cholesky identification scheme commonly attributed in the literature to Sims (1980). Thus, we arrive at the second property of all (B, A) ∈ Ω_LREM,

(CF-LREM)  rank(B^{-1}(z) A_+(z)) = m for all z ∈ D and B^{-1}(z) A_+(z)|_{z=0} is canonical quasi-lower triangular.

Note that under (EU-LREM), (CF-LREM) is equivalent to [B_-^{-1} A]_+(z) having rank m for all z ∈ D and B_+^{-1}(0) [B_-^{-1} A]_+(0) being canonical quasi-lower triangular.

The invertibility part of (CF-LREM) is not restrictive for Gaussian ε, which is a typical specification in the LREM literature (Herbst & Schorfheide, 2016). This is because the distribution of a Gaussian stationary process is completely determined by its spectral density and so it is not possible to distinguish invertible from non-invertible Gaussian models (Rosenblatt, 2000, p. 11).
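The indistinguishability of invertible and non-invertible Gaussian models can be checked directly in the simplest case: an MA(1) with its root flipped across the unit circle, after rescaling the innovation variance, has an identical spectral density (the value of θ below is an arbitrary illustration):

```python
import numpy as np

# Invertibility is not detectable from second moments: (1 + theta L) eps_t with
# var(eps) = 1 and (1 + (1/theta) L) eta_t with var(eta) = theta^2 share the
# same spectral density on the unit circle. theta is illustrative.
theta = 0.5
z = np.exp(1j * np.linspace(0, 2 * np.pi, 128, endpoint=False))

f1 = np.abs(1 + theta * z) ** 2                   # invertible MA(1), var(eps) = 1
f2 = theta**2 * np.abs(1 + (1 / theta) * z) ** 2  # non-invertible MA(1), var(eta) = theta^2

assert np.allclose(f1, f2)
```

Both transfer functions agree in modulus on T, which is all a Gaussian likelihood can see.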
The second part of (CF-LREM), imposing canonical quasi-lower triangularity of the first impulse response uniformly over the parameter space, is, on the other hand, very restrictive and not likely to be satisfied for the typical multivariate LREM. However, it permits us a great deal of mathematical traction and we hope that it will be possible to relax it in future work. The reader who still finds (CF-LREM) objectionable may consider that we have solved the identification problem for LREMs up to an orthogonal matrix transformation, as in Anderson et al. (2016). We might add that this problem, which is mathematically equivalent to the global identification of the simplest static model (9) under general linear restrictions, remains an open problem in the literature (Rubio-Ramírez et al. (2010) and Bacchiocchi & Kitagawa (2019) provide results in the special case of exact identification).

Even though we have not yet completed our characterization of Ω_LREM, we can already simplify observational equivalence based on conditions (EU-LREM) and (CF-LREM).
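In the square case n = m, the normalization in the second part of (CF-LREM) amounts to choosing the rotation V that makes the first impulse response lower triangular with positive diagonal. A minimal numerical sketch using an LQ decomposition (the matrix C0 below is an arbitrary illustration, not a model-implied quantity):

```python
import numpy as np

rng = np.random.default_rng(0)
C0 = rng.standard_normal((3, 3))  # hypothetical nonsingular first impulse response

# LQ decomposition via QR of the transpose: C0.T = Qt @ R, so C0 @ Qt = R.T
Qt, R = np.linalg.qr(C0.T)
V = Qt
L = C0 @ V                        # lower triangular

# flip column signs so the diagonal of L is positive; V stays orthogonal
s = np.sign(np.diag(L))
s[s == 0] = 1.0
V, L = V * s, L * s

assert np.allclose(C0 @ V, L)               # rotated impulse response
assert np.allclose(V.T @ V, np.eye(3))      # V orthogonal
assert np.allclose(L, np.tril(L))           # lower triangular
assert (np.diag(L) > 0).all()               # positive diagonal
```

For rectangular or rank-deficient impulse responses the quasi-lower triangular pattern with rows i_1 < ⋯ < i_m replaces the strict triangle, but the idea of absorbing V into a canonical form is the same.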
Theorem 4.1.
Let (B, A) and (B̃, Ã) satisfy (EU-LREM) and (CF-LREM). Then (B, A) ∼ (B̃, Ã) if and only if

B̃_+^{-1} [B̃_-^{-1} Ã]_+ = B_+^{-1} [B_-^{-1} A]_+, or equivalently, B̃^{-1} Ã_+ = B^{-1} A_+.

Theorem 4.1 generalizes well known classical results. When B and A are polynomial matrices, it reduces exactly to the result for the VARMA model (Theorem A.6) and when B and A are constants, it reduces to the result for the classical linear simultaneous equations model (Theorem A.2). Unfortunately Theorem 4.1 does not have the simple algebraic flavour of the classical models, which leads naturally to the notion of coprimeness and other useful and elegant algebraic ideas. However, we will see that with special care, these complications are surmountable.

Theorem 4.1 makes clear what the identification problem is for LREMs. Two parameters are observationally equivalent if and only if they generate the same transfer function. But the mapping from parameters to transfer functions factorizes as

(B, A) ↦^{φ_1} (B_-, B_+, A) ↦^{φ_2} (B_-, B_+, [B_-^{-1} A]_+) ↦^{φ_3} (B_+, [B_-^{-1} A]_+) ↦^{φ_4} B_+^{-1} [B_-^{-1} A]_+.   (4)

Since the Wiener-Hopf factorization is a well defined function and B = B_- B_+, φ_1 is one-to-one and so no identification problems are possible at this stage. Similarly, by Lemma 3.1 (iv), φ_2 is one-to-one, so no identification problems are possible at this stage either. If we consider now φ_4, this mapping is not one-to-one due to the identification problem for VARMA models (see Section A for a review). This is to be expected because the set of VARMA is a subset of the set of LREMs. The only new aspect to the identification problem for LREMs is φ_3, where B_- is dropped. Since B_- determines the forward dependence of the solution to (1), we arrive at the following fundamental observation.

The Fundamental Aspect of the Identification Problem for LREMs: Forward dependence is not identified.

The claim here goes beyond the well known fact that the effect of endogenous variables (e.g.
expectations) in a classical structural equations model is not identified. What distinguishes LREMs is that expectations are endogenous variables which are determined by other endogenous variables. Thus, they necessitate more restrictions than necessary to identify the classical structural equation models. This point has already been made by several authors including the father of rational expectations himself, Muth (1981), in a paper written in 1960. Of course these earlier realizations of this point were in the context of much more restrictive LREMs relative to what we consider in this paper. By the end of this section we will characterize exactly how many additional restrictions are necessary for the identification of LREMs.

Before we can do that, however, we must introduce the final characterization of our parameter space. Notice that the set of elements in R[z, z^{-1}]^{n×n} × R[z]^{n×m} satisfying (EU-LREM) and (CF-LREM) is infinite dimensional. The sets of observationally equivalent parameters, as described in Theorem 4.1, are also infinite dimensional. In practice, however, LREMs are specified with a finite number of leads and lags. Thus, it is typically assumed that there exist non-negative integers κ and λ such that for every parameter (B, A),

(L-LREM)  min deg(B) ≥ -λ,  max deg(B) ≤ κ,  max deg(A) ≤ κ.

Thus, Ω_LREM is the set of pairs (B, A) ∈ R[z, z^{-1}]^{n×n} × R[z]^{n×m} satisfying (EU-LREM), (CF-LREM), and (L-LREM). The condition (L-LREM) allows us to think of Ω_LREM as a subset of R^{n^2(κ+λ+1)+nm(κ+1)} with the Euclidean topology. The following result provides a parametrization of Ω_LREM analogous to the parametrization of VARMA models.
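The non-identification of forward dependence can be made concrete in the scalar model of Example 3.2: holding b_0 and b_+ fixed, any change in the forward-dependence parameter b_- can be absorbed into a_0 without changing the spectral density. A numerical check (all parameter values are illustrative):

```python
import numpy as np

# In Example 3.2 the moving average part of the solution is
# (a0 + a1*b_minus) + a1*z, so replacing b_minus by bt_minus while setting
# a0 -> a0 + a1*(b_minus - bt_minus) leaves the solution unchanged.
b0, b_plus = 1.0, 0.4
b_minus, a0, a1 = 0.5, 1.0, 2.0
bt_minus = -0.3                                   # different forward dependence
at0, at1 = a0 + a1 * (b_minus - bt_minus), a1     # compensating adjustment

def spec_dens(bm, c0, c1, z):
    # transfer function of the solution: ((c0 + c1*bm) + c1*z) / (b0*(1 - b_plus*z))
    w = ((c0 + c1 * bm) + c1 * z) / (b0 * (1 - b_plus * z))
    return (w * np.conj(w)).real

z = np.exp(1j * np.linspace(0, 2 * np.pi, 200))
assert np.allclose(spec_dens(b_minus, a0, a1, z), spec_dens(bt_minus, at0, at1, z))
```

The two parameter points differ in their forward dependence yet produce identical second moments, so no amount of data can distinguish them without further restrictions.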
Proposition 4.2. Ω_LREM is homeomorphic to a subset of R^{n^2(1+λ+κ)+nm(1+κ)-m(m-1)/2}, the interior of which consists of two connected components.

Proof. Let

Θ_LREM = { (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) :
  F_λ, …, F_1, G_0, …, G_κ ∈ R^{n×n}, C_0, A_1, …, A_κ ∈ R^{n×m},
  rank(F(z)) = n for all z ∈ D^c, where F(z) = I_n + \sum_{i=1}^{λ} F_i z^{-i},
  rank(G(z)) = n for all z ∈ D̄, where G(z) = \sum_{i=0}^{κ} G_i z^i,
  C_0 is canonical quasi-lower triangular,
  A(z) = G_0 C_0 - A_1 F̄_1 - ⋯ - A_κ F̄_κ + \sum_{i=1}^{κ} A_i z^i, where F^{-1}(z) = \sum_{i=0}^{∞} F̄_i z^{-i} for all z ∈ D^c,
  and rank([F^{-1} A]_+(z)) = m for all z ∈ D }.

Then Θ_LREM can be viewed as a subset of R^{n^2(1+λ+κ)+nm(1+κ)-m(m-1)/2}. We claim that the mapping φ_LREM : Θ_LREM → Ω_LREM, defined by

(F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) ↦ ( (I_n + \sum_{i=1}^{λ} F_i z^{-i}) (\sum_{i=0}^{κ} G_i z^i), G_0 C_0 - A_1 F̄_1 - ⋯ - A_κ F̄_κ + \sum_{i=1}^{κ} A_i z^i ),

is a homeomorphism. Since (I_n + \sum_{i=1}^{λ} F_i z^{-i})(\sum_{i=0}^{∞} F̄_i z^{-i}) = I_n for all z ∈ D^c, we have that F̄_0 = I_n and F̄_i = -\sum_{j=1}^{min{λ,i}} F_j F̄_{i-j} for i ≥ 1. Thus F̄_1, …, F̄_κ are polynomials in the elements of F_1, …, F_λ and are therefore continuous. It follows that φ_LREM is continuous. Next consider φ_LREM^{-1}(B, A). By the uniqueness of the Wiener-Hopf factorization, it must be that F(z) = B_-(z) and G(z) = B_+(z). The continuity of the Wiener-Hopf factorization (Clancey & Gohberg, 1981, Proposition X.1.1) then ensures that B is mapped continuously to (F_λ, …, F_1, G_0, …, G_κ). On the other hand, it must be the case that C_0 = G_0^{-1}(A_0 + A_1 F̄_1 + ⋯ + A_κ F̄_κ), so that φ_LREM^{-1} is a function. Finally, C_0 is a rational function of the coefficient matrices of B_-, B_+, and A and therefore continuous over (B, A) ∈ Ω_LREM. Thus, φ_LREM^{-1} is continuous.

Next, we claim that

Θ°_LREM = { (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) ∈ Θ_LREM : rank([F^{-1} A]_+(z)) = m for all z ∈ T }

is the interior of Θ_LREM. By the continuity of zeros of a polynomial with respect to its coefficients (Horn & Johnson, 1985, Appendix D), Θ°_LREM is open. Now pick any point (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) ∈ Θ_LREM \ Θ°_LREM; then rank([F^{-1} A]_+(z)) < m for some z ∈ T. Now define A_0(ρ), …, A_κ(ρ) by

\begin{pmatrix} I_n & & & \\ F̄_1 & I_n & & \\ \vdots & \ddots & \ddots & \\ F̄_κ & ⋯ & F̄_1 & I_n \end{pmatrix} \begin{pmatrix} A_κ(ρ) \\ A_{κ-1}(ρ) \\ \vdots \\ A_0(ρ) \end{pmatrix} = \begin{pmatrix} ρ^κ I_n & & & \\ & ρ^{κ-1} I_n & & \\ & & \ddots & \\ & & & I_n \end{pmatrix} \begin{pmatrix} I_n & & & \\ F̄_1 & I_n & & \\ \vdots & \ddots & \ddots & \\ F̄_κ & ⋯ & F̄_1 & I_n \end{pmatrix} \begin{pmatrix} A_κ \\ A_{κ-1} \\ \vdots \\ A_0 \end{pmatrix}.

Then for any ρ > 1, (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1(ρ), …, A_κ(ρ)) ∉ Θ_LREM because the moving average part of its VARMA representation is [F^{-1} A(ρ)]_+(z) = [F^{-1} A]_+(ρz), which has a zero in D. It follows that the points of Θ_LREM \ Θ°_LREM are boundary points of Θ_LREM.

Finally, let (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) ∈ Θ°_LREM. Then we may follow a similar reasoning to show that

(F_λ, …, F_1, G_0, G_1(1-t), …, G_κ(1-t), C_0, A_1(1-t), …, A_κ(1-t)) ∈ Θ°_LREM,  t ∈ [0, 1],

where G_j(ρ) = ρ^j G_j for j = 1, …, κ. Thus, (F_λ, …, F_1, G_0, …, G_κ, C_0, A_1, …, A_κ) ∈ Θ°_LREM is in the same connected component as (F_λ, …, F_1, G_0, 0, …, 0, C_0, 0, …, 0), which in turn is in the same connected component as (0, …, 0, G_0, 0, …, 0, C_0, 0, …, 0) by the path (F_λ(1-t), …, F_1(1-t), G_0, 0, …, 0, C_0, 0, …, 0) for t ∈ [0, 1], where F_j(ρ) = ρ^j F_j for j = 1, …, λ. The final claim then follows from Proposition A.1.

In the course of proving Proposition 4.2, we find that Ω_LREM is neither open nor closed. The boundary points that are also elements of Ω_LREM are exactly those parameters where the transfer function has a zero on T. The point (0_{n×n}, 0_{n×m}) is a boundary point of Ω_LREM that is not an element of Ω_LREM. The fact that Ω_LREM consists of two connected components is a property inherited from the classical simultaneous equations model (see Proposition A.1); for example, a parameter with det(B_+(0)) > 0 cannot be connected by a path in Ω_LREM to one with det(B_+(0)) < 0.

Now that we have the full characterization of the parameter space, we may consider simplifying the conditions for observational equivalence even further than what we have seen in Theorem 4.1. Let (
B, A) ∼ (B̃, Ã) and let C = B^{-1} A_+. Then Theorem 4.1 implies that Ã_+ = B̃ C.
If B̃(z) = \sum_{i=-λ}^{κ} B̃_i z^i, Ã_+(z) = \sum_{i=-λ}^{κ} Ã⁺_i z^i, and C(z) = \sum_{i=0}^{∞} C_i z^i in an annulus containing T, then equating term by term above we arrive at the following equivalent expression:

[ Ã⁺_{-λ}  ⋯  Ã⁺_κ  ⋯ ] = [ B̃_{-λ}  ⋯  B̃_κ  ⋯ ] \begin{pmatrix} C_0 & C_1 & C_2 & ⋯ \\ & C_0 & C_1 & ⋱ \\ & & C_0 & ⋱ \\ & & & ⋱ \end{pmatrix}.

Although this is an infinite dimensional system, (L-LREM) will allow us to restrict attention to a finite dimensional subsystem, which will then allow us to provide a simpler characterization of observational equivalence. The key idea is a familiar one from linear systems theory (see e.g. Lemma A.4 of Dufour & Renault (1998)).
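The key idea, that the Taylor coefficients of a rational function h = f/g with g(0) ≠ 0 obey a linear recursion and are therefore pinned down by finitely many initial coefficients, can be sketched numerically (the polynomials below are arbitrary illustrations):

```python
import numpy as np

# Taylor coefficients of h = f/g with g(0) != 0 satisfy g0*h_j = f_j - sum_{i>=1} g_i*h_{j-i}.
f = np.array([1.0, -0.3, 0.0])   # f(z) = 1 - 0.3 z, so max deg(f) <= p = 2
g = np.array([2.0, 0.5, -0.4])   # g(z) = 2 + 0.5 z - 0.4 z^2, g(0) != 0
N = 30

h = np.zeros(N)
for j in range(N):
    fj = f[j] if j < len(f) else 0.0
    conv = sum(g[i] * h[j - i] for i in range(1, min(j, len(g) - 1) + 1))
    h[j] = (fj - conv) / g[0]

# sanity check: the convolution g*h reproduces f (padded with zeros) up to truncation
check = np.convolve(g, h)[:N - len(g)]
f_padded = np.zeros(N - len(g)); f_padded[:len(f)] = f
assert np.allclose(check, f_padded)
```

Once the first p + 1 coefficients of h vanish and f has no coefficients beyond degree p, the recursion forces every subsequent coefficient to vanish as well, which is the content of the lemma below.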
Lemma 4.3.
Suppose f, g ∈ R[z], max deg(f) ≤ p, and g(0) ≠ 0. Then h = f/g is identically zero if and only if the first p + 1 terms in the Taylor series expansion of h(z) in a neighbourhood of z = 0 are zero.

The essential point of Lemma 4.3 is that the coefficients of any Taylor series expansion of a rational function are linearly recursive and therefore determined by the initial coefficients.

Before deriving the simplification of observational equivalence, we will also need the following lemma, which develops some properties of a submatrix of the infinite matrix we have identified above. The reader may wish to review the concepts of coprimeness and the McMillan degree, δ, of a rational matrix discussed in Section A.2 (c.f. (15)).

Lemma 4.4.

Let (B, A) ∈ Ω_LREM, let C = B^{-1} A_+ have a Taylor series expansion C(z) = \sum_{i=0}^{∞} C_i z^i in a neighbourhood of z = 0, and let

H = \begin{pmatrix} C_{κ+λ+1} & C_{κ+λ+2} & ⋯ & C_{(n+1)κ+λ} \\ \vdots & \vdots & ⋱ & \vdots \\ C_2 & C_3 & ⋯ & C_{nκ+1} \\ C_1 & C_2 & ⋯ & C_{nκ} \end{pmatrix}.

Then:
(i) rank(H) = δ(C(z^{-1}) - C(0)) ≤ nκ.
(ii) The set of parameters satisfying rank(H) = nκ is generic (i.e. contains an open and dense subset of Ω_LREM).

Proof. (i) The conditions for (
B, A) ∈ Ω_LREM together with Lemma 3.1 (i) and (ii) imply that (B_+, [B_-^{-1} A]_+) ∈ Ω_VARMA, the parameter space for VARMA models developed in Section A. The result is obtained in the course of proving Lemma A.8 (i).

(ii) Lemma A.8 (ii) identifies a generic subset of Ω_VARMA where the result holds. We will prove that its preimage under the mapping (B, A) ↦ (B_+, [B_-^{-1} A]_+) is generic in Ω_LREM. This will follow if we can show that φ_3 ∘ φ_2 ∘ φ_1 in (4) is continuous and open (observe that if X and Y are topological spaces, f : X → Y is continuous and open, and D is open and dense in Y, then f^{-1}(D) is open and dense in X). First,

(B, A) ↦^{φ_1} (B_-, B_+, A),

viewed as a mapping (R^{n×n})^{κ+λ+1} × (R^{n×m})^{κ+1} → (R^{n×n})^{λ} × (R^{n×n})^{κ+1} × (R^{n×m})^{κ+1}, is continuous by Proposition X.1.1 of Clancey & Gohberg (1981). Since multiplication is continuous, φ_1^{-1} is also continuous and φ_1 is therefore a homeomorphism. Next,

(B_-, B_+, A) ↦^{φ_2} (B_-, B_+, [B_-^{-1} A]_+),

viewed as mapping (R^{n×n})^{λ} × (R^{n×n})^{κ+1} × (R^{n×m})^{κ+1} to itself, is a composition of inversion, multiplication, and [·]_+, all of which are continuous in the domain we consider. By Lemma 3.1 (iv), φ_2^{-1} is a composition of multiplication and [·]_+, which are continuous. Thus, φ_2 is also a homeomorphism. Finally,

(B_-, B_+, [B_-^{-1} A]_+) ↦^{φ_3} (B_+, [B_-^{-1} A]_+)

is a projection and therefore continuous and open. The composition of the three mappings is then continuous and open.

We are now in a position to simplify Theorem 4.1 and characterize the set

(B, A)/∼ = { (B̃, Ã) ∈ Ω_LREM : (B̃, Ã) ∼ (B, A) },   (B, A) ∈ Ω_LREM.

Theorem 4.5.
Let (B, A), (B̃, Ã) ∈ Ω_LREM, let C = B⁻¹A⁺ have a Taylor series expansion C(z) = Σ_{i=0}^∞ C_i z^i in a neighbourhood of z = 0, let T be the (κ+λ+1) × (κ+λ+1) block Toeplitz matrix whose (i, j) block is C_{j−i} (with C_k = 0 for k < 0), let H be the (κ+λ+1) × nκ block Hankel matrix whose (i, j) block is C_{κ+λ+1−i+j} (as in Lemma 4.4), and let

P = [ −T −H ; I_{m(κ+λ+1)} 0_{m(κ+λ+1)×nmκ} ].

Then:
(i) (B̃, Ã) ∼ (B, A) if and only if vec([B̃_{−λ} ⋯ B̃_κ Ã⁺_{−λ} ⋯ Ã⁺_κ]) ∈ ker(P′ ⊗ I_n).
(ii) (B, A)/∼ is a relatively open subset of the subspace mat(ker(P′ ⊗ I_n)), where

mat : vec([B_{−λ} ⋯ B_κ A⁺_{−λ} ⋯ A⁺_κ]) ↦ ( Σ_{i=−λ}^{κ} B_i z^i , Σ_{i=0}^{κ} A⁺_i z^i ).

(iii) dim((B, A)/∼) = n²(κ+λ+1) − n δ(C(z⁻¹) − C(0)) ≥ n²(1+λ) and for generic points in the parameter space dim((B, A)/∼) = n²(1+λ).

Proof. (i) If (B̃, Ã) ∼ (B, A), then Theorem 4.1 implies that

0 = −z^λ B̃(z) C(z) + z^λ Ã⁺(z)
= −z^λ B̃(z) B₊⁻¹(z) [B₋⁻¹A]₊(z) + z^λ Ã⁺(z)
= ( −z^λ B̃(z) adj(B₊(z)) [B₋⁻¹A]₊(z) + det(B₊(z)) z^λ Ã⁺(z) ) / det(B₊(z)).

By Lemma 3.1 and (EU-LREM), each element of the right hand side can be expressed as a ratio of a polynomial (of degree at most max{λ + max deg(B̃) + max deg(adj(B₊)) + max deg([B₋⁻¹A]₊), λ + max deg(det(B₊)) + max deg(Ã⁺)} ≤ max{λ + κ + (n−1)κ + κ, λ + nκ + κ} = (n+1)κ + λ) and det(B₊), which satisfies det(B₊(0)) ≠ 0. By Lemma 4.3, this is equivalent to the first 1 + (n+1)κ + λ Taylor series coefficients equating to zero. Thus, observational equivalence is equivalent to

−[B̃_{−λ} ⋯ B̃_κ][T H] + [Ã⁺_{−λ} ⋯ Ã⁺_κ 0_{n×m} ⋯ 0_{n×m}] = 0_{n×(1+(n+1)κ+λ)m}

or equivalently

[B̃_{−λ} ⋯ B̃_κ Ã⁺_{−λ} ⋯ Ã⁺_κ] P = 0_{n×(1+(n+1)κ+λ)m}.
Vectorizing, we obtain (P′ ⊗ I_n) vec([B̃_{−λ} ⋯ B̃_κ Ã⁺_{−λ} ⋯ Ã⁺_κ]) = 0_{nm(1+(n+1)κ+λ)×1}.

(ii) If (B̆, Ă) ∈ mat(ker(P′ ⊗ I_n)) then it satisfies (L-LREM). We claim that if, additionally, it satisfies (EU-LREM), then (CF-LREM) is also satisfied. To see this, note that the first 1 + (n+1)κ + λ Taylor series coefficients of Ă − [B̆C]₊ equate to zero and, following the same argument as used in (i), it can be shown that Ă = [B̆C]₊. Therefore

Ă⁺ = B̆₋[B̆₋⁻¹Ă]₊ = B̆₋[B̆₋⁻¹[B̆C]₊]₊ = B̆₋[B̆₋⁻¹B̆C]₊ − B̆₋[B̆₋⁻¹[B̆C]₋]₊ = B̆₋[B̆₊C]₊ = B̆₋B̆₊C = B̆C,

where we have used the property that [·]₊ + [·]₋ is the identity mapping on R(z)^{n×m} as well as the fact that B̆₊ and C are analytic in D. It follows that (CF-LREM) is satisfied as claimed. Thus, (B, A)/∼ is the intersection of mat(ker(P′ ⊗ I_n)) with

{ (B̆, Ă) ∈ R[z, z⁻¹]^{n×n} × R[z]^{n×m} : (EU-LREM) and (L-LREM) are satisfied }.

The latter set is open in R^{n²(κ+λ+1)+nm(κ+1)} due to the continuity of the Wiener-Hopf factorization with respect to the entries of the matrix function (Clancey & Gohberg, 1981, Proposition X.1.1). Therefore, (B, A)/∼ is relatively open in mat(ker(P′ ⊗ I_n)).

(iii) dim(ker(P′)) = dim(ker(H′)) and so the result follows from Lemma 4.4 (i) and the standard properties of Kronecker products (Horn & Johnson, 1991, Theorem 4.2.15). For generic parameters the result follows from Lemma 4.4 (ii).

Theorem 4.5 (i) provides a significantly simpler criterion for observational equivalence than Theorem 4.1; it substitutes a simple linear algebraic criterion for a rational matrix equation. Theorem 4.5 (ii) characterizes a set of observationally equivalent parameters as a relatively open subset of an affine subspace of Ω_LREM determined by the first 1 + (n+1)κ + λ impulse responses.
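The Hankel rank of Lemma 4.4 is easy to probe numerically. The following sketch (scalar case n = m = 1, with hypothetical transfer functions chosen purely for illustration, not taken from the paper) builds a Hankel matrix of impulse responses and confirms that its rank equals the degree of the underlying rational function:

```python
import numpy as np

def hankel_of_impulses(c, rows, cols):
    # (i, j) entry is c[i + j + 1]: impulse responses past the Toeplitz block,
    # matching the matrix H of Lemma 4.4 (scalar case) up to a row reordering.
    return np.array([[c[i + j + 1] for j in range(cols)] for i in range(rows)])

# Impulse responses of the degree-1 transfer function C(z) = 1 / (1 - 0.5 z) ...
c1 = [0.5 ** i for i in range(16)]
# ... and of a degree-2 function, the sum of two geometric modes:
c2 = [0.5 ** i + (-0.3) ** i for i in range(16)]

rank1 = np.linalg.matrix_rank(hankel_of_impulses(c1, 5, 5))  # 1
rank2 = np.linalg.matrix_rank(hankel_of_impulses(c2, 5, 5))  # 2
```

The computed ranks (1 and 2) count the poles of the transfer functions, in line with rank(H) = δ(C(z⁻¹) − C(0)) in Lemma 4.4 (i).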
Finally, Theorem 4.5 (iii) gives the dimension of a set of observationally equivalent parameters, which can be understood as the number of restrictions that must be imposed on the parameter space in order to identify a parameter with a spectral density matrix. This number can be understood as the difference between the effective number of free parameters n²(κ+λ+1) (recall that Ã⁺ is determined from C whenever B̃ is known) and the complexity of the transfer function nδ(C(z⁻¹) − C(0)).

When λ = 0, Theorem 4.5 specializes to classical results for VARMA models as reviewed in Appendix A. Notice that the dimension of the set of observationally equivalent parameters is larger by n²λ than in the VARMA setting (Theorem A.9 (iii)). This is exactly the number of free parameters in B₋(z) and another manifestation of the fundamental aspect of the identification problem for LREMs. Example 4.1.
Consider the setting of Example 3.2. If ( ˜ B, ˜ A ) ∼ ( B, A ) then, by Theorem 4.1,( ˜ B − z − + ˜ B + ˜ B z ) (cid:18) a b − + a + a zb (1 − b + z ) (cid:19) = ˜ A + − z − + ˜ A +0 + ˜ A +1 z. Multiplying through by b (1 − b + z ) z we obtain an equality of two polynomials of degree atmost three, which implies four linear equations in the six variables ˜ B − , ˜ B , ˜ B , ˜ A + − , ˜ A +0 , nd ˜ A +1 . These are precisely the equations verified in Theorem 4.5 (i). We have, P ′ ⊗ I n = − C − C − C − C − C − C − C − C − C . Consider now the special case of the parameter (
B, A ) = (1 ,
1) so that C = 1 and C = C = C = 0. Then (cid:0) , , , , , (cid:1) ′ ∈ ker( P ′ ⊗ I n ). This corresponds to (cid:0) z, z (cid:1) ∈ Ω LREM and it follows from Theorem 4.5 (i) that (cid:0) z, z (cid:1) ∼ (1 , (cid:0) , , , , , (cid:1) ′ ∈ ker( P ′ ⊗ I n ), which corresponds to (cid:0) z − + 1 + z, z (cid:1) ∈ Ω LREM and is therefore also observationally equivalent to (1 , In practice, LREMs are restricted in a variety of ways such as exclusion (setting a parameterto zero), normalization (setting a parameter to 1), and, more generally, affine restrictions thatset linear combinations of the parameters (possibly across equations) to fixed values. Here weconsider the ability of such restrictions to identify a single parameter.Let Ω
RLREM be a subset of Ω
LREM endowed with the relative topology. We say that(
B, A ) ∈ Ω RLREM is identified in Ω RLREM if every ( ˜ B, ˜ A ) ∼ ( B, A ) in Ω
RLREM is equal to (
B, A ).We say that a parameter (
B, A) is locally identified in Ω_RLREM if it has a neighbourhood N in Ω_RLREM such that every (B̃, Ã) ∼ (B, A) in N is equal to (B, A). Clearly, a parameter is locally identified in Ω
RLREM if it is identified in Ω
RLREM but the converse is not true in general.
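Operationally, the identification checks developed in this section reduce to a column-rank computation on a stacked matrix: a parameter is identified exactly when no nonzero direction lies simultaneously in the kernel of P′ ⊗ I_n (the observational-equivalence directions) and in the kernel of the restriction matrix. A minimal sketch with hypothetical two-dimensional toy numbers (not drawn from the paper):

```python
import numpy as np

def identified(P_kron, R):
    # Full column rank of the stacked matrix means
    # ker(P_kron) ∩ ker(R) = {0}: no equivalent parameter satisfies the restrictions.
    M = np.vstack([P_kron, R])
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

# One observational-equivalence direction: ker(P_kron) = span{(1, 1)'}.
P_kron = np.array([[1.0, -1.0]])

R_good = np.array([[1.0, 0.0]])   # pins down the first coordinate: identifying
R_bad = np.array([[1.0, -1.0]])   # (1, 1)' also satisfies it: not identifying
```

With `R_good` the stacked matrix has full column rank, while with `R_bad` the equivalence direction survives the restriction, mirroring the geometry of the affine subspaces discussed below Theorem 5.1.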
Theorem 5.1.
Let Ω_RLREM be the set of (B, A) ∈ Ω_LREM satisfying

R vec([B_{−λ} ⋯ B_κ A_0 ⋯ A_κ]) = u,   (5)

where R ∈ R^{r×(n²(κ+λ+1)+nm(κ+1))} and u ∈ R^r. Partition R as [R_B R_A], where R_B ∈ R^{r×n²(κ+λ+1)} and R_A ∈ R^{r×nm(κ+1)}, and set R̄ = [R_B 0_{r×nmλ} R_A]. If (B, A) ∈ Ω_RLREM and P is defined as in Theorem 4.5, then (B, A) is identified in Ω_RLREM if and only if

M = [ P′ ⊗ I_n ; R̄ ]

is of full column rank n(n+m)(κ+λ+1).

Proof. Let ζ = vec([B_{−λ} ⋯ B_κ A⁺_{−λ} ⋯ A⁺_κ]). If M is of full column rank, then ζ is the only point in ker(P′ ⊗ I_n) that satisfies (5). Theorem 4.5 (i) then implies that (B, A) is identified in Ω
RLREM. If M is not of full rank, then there exists ξ ≠ 0 with ξ ∈ ker(P′ ⊗ I_n) ∩ ker(R̄). If c > 0 is small enough, then (B, A) ∼ mat(ζ + cξ) ≠ (B, A) and, since mat(ζ + cξ) satisfies (5), (B, A) is not identified in Ω
RLREM .Theorem 5.1 is a direct generalization of classical results for simultaneous equations models(the case λ = κ = 0) and for VARMA (the case λ = 0) derived in Theorems A.4 and A.10respectively. Theorem 5.1 also exhibits similar geometry to the classical results (Figure 1).Any given parameter ( B, A ) ∈ Ω LREM lies in the intersection of two affine subspaces. Thefirst affine subspace, denoted by E , is mat (ker( P ′ ⊗ I n )) and contains the set of parametersobservationally equivalent to ( B, A ) by Theorem 4.5 (i). The second affine subspace is thespace Ω
RLREM , denoted by R , which contains the set of parameters satisfying the restrictions(5). When ( B, A ) is the only point of intersection then it is identified in Ω
RLREM. Otherwise, the two affine subspaces intersect along an affine subspace, which contains a line segment in Ω
RLREM by Theorem 4.5 (ii) and so every neighbourhood of (
B, A) contains infinitely many observationally equivalent parameters that also satisfy the given restrictions. Thus, for the LREM subject to affine restrictions, a parameter is identified if and only if it is locally identified.

Theorem 5.1 leaves unstated whether or not u in (5) is equal to the zero vector. In fact, it is meaningless to allow u to be the zero vector. If u = 0, then Theorem 4.5 (i) and (ii) imply that (cB, cA) ∈ Ω_RLREM ∩ (B, A)/∼ for all c in some neighbourhood of 1, thus (B, A) cannot be identified. Stated differently, if u = 0 then M cannot be of full rank because this would force (B, A) = (0_{n×n}, 0_{n×m}) ∉ Ω_LREM. Example 5.1.
Consider the setting of Example 3.2. By Theorem 4.5 and the discussionfollowing it, to obtain identifiability we need to impose at least n (1 + λ ) = 2 restrictions. hus, we may consider fixing B − and A . This implies that M = − C − C − C − C − C − C − C − C − C . We have det( M ) = C C and so any point in the parameter space satisfying our two re-strictions and such that its associated transfer function has a non-zero second impulse re-sponse is identified. Note that the two restrictions are not sufficient to identify the parameter( B, A ) = (1 ,
1) that we considered in Example 4.1. Since there are more free parameters thannecessary in order to characterize this point, we will need two additional restrictions.Suppose now that we are interested in identifying just the i -th equation of (1). Let Ω RLREM be as before and let (
B, A ) ∈ Ω RLREM . We say that the i -th equation of (1) is identified at ( B, A ) in Ω RLREM if every ( ˜ B, ˜ A ) ∼ ( B, A ) in Ω
RLREM has the same i -th equation as ( B, A ).We say that the i -th equation of (1) is locally identified at ( B, A ) in Ω RLREM if (
B, A ) has aneighbourhood N in Ω RLREM such that every ( ˜ B, ˜ A ) ∼ ( B, A ) in N has the same i -th equationas ( B, A ). Clearly, if all equations are (locally) identified at (
B, A ) in Ω
RLREM , then (
B, A ) is(locally) identified in Ω
RLREM . Theorem 5.2.
Let Ω RLREM be the set of ( B, A ) ∈ Ω LREM satisfying R i vec (cid:16) e ′ i h B − λ · · · B κ A · · · A κ i(cid:17) = u i , (6) where R i ∈ R r × n ( κ + λ +1)+ m ( κ +1) , u i ∈ R r , and e i ∈ R n is the i -th standard unit vector.Partition R i as h R iB R iA i , where R iA ∈ R r × n ( κ + λ +1) and R iB ∈ R r × m ( κ +1) , and set R i = h R iB r × mλ R iA i . If ( B, A ) ∈ Ω RLREM and P is defined as in Theorem 4.5, thenthe i -th equation of (1) is identified at ( B, A ) in Ω RLREM if and only if M i = P ′ R i has full column rank ( n + m )( κ + λ + 1) . roof. Let ζ = vec (cid:16)h B − λ · · · B κ A + − λ · · · A + κ i(cid:17) . If M i is of full column rank,then vec (cid:16) e ′ i h B − λ · · · B κ A + − λ · · · A + κ i(cid:17) = ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ is the only pointin ker( P ′ ) that satisfies (6), since P ′ ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) = ( I m (1+( n +1) κ + λ ) ⊗ e ′ i )( P ′ ⊗ I n ).Theorem 4.5 (i) then implies that any parameter in Ω RLREM that is observationally equivalentto (
B, A ) must have the same i -th equation as ( B, A ). Thus the i -th equation is identified at( B, A ) in Ω
RLREM . If M i is not of full rank, then there exists 0 = ξ i ∈ ker( P ′ ) ∩ ker( R i ). If c > B, A ) ∼ mat( ζ + cξ i ⊗ e i ) and sincemat( ζ + cξ i ⊗ e i ) satisfies (6) but has a different i -th equation than ( B, A ), the i -th equationis not identified at ( B, A ) in Ω
RLREM .Theorem 5.2 provides necessary and sufficient conditions for the identification of an equa-tion of an LREM. It has exactly the same flavour, interpretation, and geometry as Theo-rem 5.1. It also generalizes classical results for VARMA and simultaneous equations modelsand retains the property of equivalence of identification and local identification. Finally, forthe same reason as before, u i = 0 cannot be allowed.We remark that Theorems 5.1 and 5.2 can be formulated in terms of different matricesthan M and M i respectively. In Corollary A.11 we show that, when attention is restricted toVARMA models, our result is equivalent to that of Deistler & Schrader (1979), who formulateidentification in terms of the rank of a matrix populated not by impulse responses but bythe coefficient matrices of ( B, A ). The direct generalization of Deistler & Schrader (1979) tothe LREM setting would then involve a matrix populated by (
B, A⁺). Since this formulation would involve the negative terms of A⁺, which have no economic interpretation, it is clear that our formulation in terms of impulse responses is the preferable one.

In practice it is usually the case that we are interested in the identification of all parameters, not just a single one. Moreover, most LREMs in practice are not only restricted by affine constraints but their free parameters are usually functions of more fundamental structural parameters. Similarly, the restrictions considered will not always be affine. In this section, we provide results for these scenarios. We begin with a motivating example.

Example 6.1. Hansen & Sargent (1981) study an LREM for the level of employment of a factor of production. Their model is of the form considered in Example 3.2, parametrized as B(z) = θ z⁻¹ − ((θ/θ) + 1 + θ) + z, A(z) = θ −. Here, θ is a time discount factor, θ is a cost of adjustment, and θ is a measure of returns to scale. This model has κ = 1 and λ = 1 and is subject to two affine restrictions, B = 1 and A = 0. The question then is whether every θ = (θ, θ, θ) is identified.

We can attempt to answer the question posed in Example 6.1 as follows. Let the parameter space of interest be a subset Θ ⊂ R^d that maps one-to-one to a subset of Ω_LREM. Thus, fixing a linear subspace as in the previous section, there is a one-to-one mapping φ : Θ → Ω_RLREM, where Ω
RLREM ⊂ Ω LREM is, as before, endowed with the relative topology. We refer to anLREM parameterized as above as a φ -LREM. We say that a φ -LREM is generically identifiedin Ω RLREM if there is a relatively open and dense subset Ψ ⊂ Θ such that every parameter in φ (Ψ) is identified in Ω RLREM . Clearly if every point of φ (Θ) is identified in Ω RLREM , then theLREM is generically identified in Ω
RLREM .In order to develop results for this new notion of identification, we will need the followingwell known lemma for which we offer an elementary proof.
Lemma 6.1.
Let X ⊂ R d be non-empty, open, and connected. Let f : X → R be a non-zeroreal analytic function. Then { x ∈ X : f ( x ) = 0 } is open and dense in X .Proof. The continuity of f implies that the set is indeed open. It remains to prove denseness.Let x, y ∈ X be such that f ( x ) = 0 and f ( y ) = 0. The connectedness and openness of X imply that there exists a polygonal path in X with vertices x = z , z , . . . , z n − , z n = y .Suppose the first integer i such that f ( z i ) = 0 is greater than 1. Then f ( tz i + (1 − t ) z i − )is a non-zero real analytic function in t defined over an open interval containing [0 ,
1] with azero at t = 0. Since the zeros of a non-zero real analytic function over an open interval areisolated (Krantz & Parks, 2002, Corollary 1.2.6), it follows that f ( tz i + (1 − t ) z i − ) = 0 forany small enough t = 0. If we now perturb z i − along the direction of z i by a non-zero butsmall enough amount to a point z ′ i − , then the polygonal path with vertices z , . . . , z ′ i − , . . . , z n remains in X and f ( z ′ i − ) = 0. Thus, we may assume that f ( z ) = 0. But then the previousargument may be repeated to show that arbitrarily near x = z there are points at which f isnon-zero. heorem 6.2. Under the assumptions and notation of Theorem 5.1, let Θ ⊂ R d be non-empty, open, and connected, and let φ : Θ → Ω RLREM be analytic and one-to-one.(i) If M has full column rank n ( n + m )( κ + λ + 1) for some point in Θ then the φ -LREMis generically identified.(ii) If there is a non-empty open subset of Θ on which M is rank deficient, then no point in φ (Θ) is identified in Ω RLREM .Proof. (i) We claim that the elements of M are real analytic functions of θ ∈ Θ. Since θ enters into M through the matrix P , it suffices to show that each coefficient matrix of C isan analytic function of θ . Every element of every coefficient matrix of B and A is analyticin θ by assumption. Using the fact that a composition of real analytic maps is real analytic(Krantz & Parks, 2002, Proposition 2.2.8), it is enough to show that each of the maps in thefactorization (4) is real analytic. Clearly ϕ , ϕ , and ϕ define real analytic mappings becausethey are compositions of real analytic mappings (multiplications, inversions, and projections).The fact that ϕ is real analytic follows from a simple extension of Proposition X.1.2 ofClancey & Gohberg (1981). Next, since there is a point in Θ at which M has full columnrank, M has a non-zero minor of order n ( n + m )( κ + λ + 1). 
This minor is a real analyticfunction of θ ∈ Θ and since it is not identically zero, it is generically non-zero by Lemma 6.1.Thus, M is generically of full column rank. Finally, Theorem 5.1 and the injectivity of φ implythat the φ -LREM is generically identified in Ω RLREM .(ii) If M is rank deficient on an open subset of Θ, all its minors of order n ( n + m )( κ + λ + 1)are equal to zero on this subset. By the preceding analysis, each of these minors is a realanalytic mapping on Θ that vanishes on an open set. Thus, the set of points in Θ where theseminors do not vanish is not dense in Θ and therefore Lemma 6.1 implies that all of theseminors vanish identically over Θ. It follows that M is rank deficient at every point in Θ. ByTheorem 5.1, no point in φ (Θ) is identified in Ω RLREM .The advantage of Theorem 6.2 is that it provides very simple conditions for generic iden-tification. One need only find a single point in Θ whose associated M matrix is of full rankto conclude generic identification. Example 6.2.
Consider the setting of Example 6.1. We have φ : ( θ , θ , θ ) ( θ z − − (( θ /θ ) + 1 + θ ) + z, θ − ) nd Θ = φ − (Ω LREM ). Let ρ ( θ ) and ρ ( θ ) be the zeros of θ z − − (( θ /θ )+1+ θ )+ z orderedas | ρ ( θ ) | < | ρ ( θ ) | (condition (EU-LREM) is equivalent to | ρ ( θ ) | < < | ρ ( θ ) | for all θ ∈ Θ).Thus, the solution has the transfer function C ( z ) = θ ( z − ρ ( θ )) . Condition (CF-LREM) thenrequires that C (0) = − θ ρ ( θ ) >
0. Given the continuity of ρ and ρ , it follows thatΘ = (cid:26) θ ∈ R : | ρ ( θ ) | < < | ρ ( θ ) | , θ ρ ( θ ) < (cid:27) is an open subset of R . Now consider the matrix associated with the given restrictions M = − C − C − C − C − C − C − C − C − C . Since det( M ) = C − C C and C j = − θ ρ j +12 , det( M ) is identically zero so no point of φ (Θ)is identified in Ω RLREM .Suppose we restrict θ to equal 1. We now have φ : ( θ , θ ) ( z − − (( θ /θ ) + 2) + z, θ − )and Θ = φ − (Ω LREM ) = { θ > , θ + θ < } ∪ { θ < , θ < } . We also have M = − C − C − C − C − C − C − C − C − C . It is easily seen that M is of full rank if and only if C = − θ ρ = 0, which is the casethroughout the new Θ. However, we need not rely on this observation to conclude genericidentifiability. The new Θ is the union of two open connected sets and so we may check therank of M at any randomly chosen points in either component (e.g. ( θ , θ ) = ( − , −
1) and(1 , − φ -LREM is generically identified. he example above makes clear that the conditions for the applicability of Theorem 6.2can be difficult to verify. In particular, we suspect that under the conditions of Theorem 5.1and with φ taken as the identity mapping, the φ -LREM is generically identified whenever anyof its elements is identified. However, we have not succeeded in obtaining a parametrization ofΩ RLREM that satisfies the conditions of Theorem 6.2 due to difficulties created by the canonicalquasi-lower triangular assumption on C (0) in (CF-LREM). Thus, this must be left for futureresearch.Generic identification for the i -th equation can be defined analogously. We say that the i -th equation of a φ -LREM is generically identified in Ω RLREM if there is a relatively open anddense subset Ψ ⊂ Θ such that for every parameter in φ (Ψ) the i -the equation is identified inΩ RLREM . Theorem 6.3.
Under the assumptions and notation of Theorem 5.2, let Θ ⊂ R d be non-empty, open, and connected, and let φ : Θ → Ω RLREM be analytic and one-to-one.(i) If M i has full column rank ( n + m )( κ + λ + 1) for some point in Θ then the i -th equationof the φ -LREM is generically identified.(ii) If there is a non-empty open subset of Θ on which M i is rank deficient, then at no pointin φ (Θ) is the i -th equation identified in Ω RLREM .Proof.
The proof is identical to that of Theorem 6.2 and is omitted.

If the parameter space is restricted by nonlinear constraints it is generally difficult to obtain conditions for identification. However, it is possible to obtain necessary and sufficient conditions for local identification.
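Under nonlinear restrictions the relevant rank condition (made precise in Theorem 6.4 below) involves the Jacobian of the restriction map at the point of interest, which can be probed with finite differences. A sketch with hypothetical toy inputs, a single equivalence direction and one smooth restriction, neither taken from the paper:

```python
import numpy as np

def jacobian(R, x, eps=1e-7):
    # Forward-difference Jacobian of R: R^k -> R^r at x.
    fx = np.atleast_1d(np.asarray(R(x), dtype=float))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (np.atleast_1d(np.asarray(R(xp), dtype=float)) - fx) / eps
    return J

def locally_identified(P_kron, R, x):
    # Sufficient rank condition: stack the equivalence matrix with the
    # Jacobian of the restrictions and check full column rank at x.
    M = np.vstack([P_kron, jacobian(R, x)])
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

P_kron = np.array([[1.0, -1.0]])             # equivalence direction (1, 1)'
R = lambda x: np.array([x[0] ** 2 - 1.0])    # nonlinear normalization x_0^2 = 1
x = np.array([1.0, 0.5])                     # satisfies the restriction
```

At this point the Jacobian is approximately [2, 0], so the stacked matrix has full column rank and the rank test passes; as the discussion after Theorem 6.4 cautions, such a check is only conclusive under the regularity (constant-rank) condition.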
Theorem 6.4.
Let Ω RLREM be the set of ( B, A ) ∈ Ω LREM satisfying (7) R (cid:16) vec (cid:16)h B − λ · · · B κ A · · · A κ i(cid:17)(cid:17) = 0 , where R : R n ( κ + λ +1)+ nm ( κ +1) R r is continuously differentiable. Let π : vec (cid:16)h B − λ · · · B κ A + − λ · · · A + κ i(cid:17) vec (cid:16)h B − λ · · · B κ A +0 · · · A + κ i(cid:17) , and let R = R ◦ π . If ( B, A ) ∈ Ω RLREM and P is defined as in Theorem 4.5, then ( B, A ) islocally identified in Ω RLREM if M = P ′ ⊗ I n ∇ R s of full column rank n ( n + m )( κ + λ + 1) , where ∇ R is the Jacobian of R evaluated at ( B, A ) .Conversely, if ( B, A ) ∈ Ω RLREM , P is defined as in Theorem 4.5, and P ′ ⊗ I n ∇ R ( ˜ B, ˜ A ) is of fixed rank lower than n ( n + m )( κ + λ + 1) for all ( ˜ B, ˜ A ) in a neighbourhood of ( B, A ) ,then ( B, A ) is not locally identified in Ω RLREM .Proof.
If (
B, A ) is not locally identified, there exists a sequence ( ˜ B j , ˜ A j ) ∈ Ω LREM convergingto (
B, A ) such that for all j ≥
1, ( ˜ B j , ˜ A j ) ∈ ( B, A ) / ∼ , ( ˜ B j , ˜ A j ) ∈ Ω RLREM , and ( ˜ B j , ˜ A j ) =( B, A ). Now define ζ j = c j vec (cid:16)h ˜ B j, − λ − B − λ · · · ˜ B j,κ − B κ ˜ A + j, − λ − A + − λ · · · ˜ A + j,κ − A + κ i(cid:17) , where c j ∈ R simply ensures that k ζ j k = 1 for all j ≥
1. Since ( ˜ B j , ˜ A j ) ∈ ( B, A ) / ∼ , Theorem4.5 (i) implies that ζ j ∈ ker( P ′ ⊗ I n ) for all j ≥
1. On the other hand, since ( ˜ B j , ˜ A j ) ∈ Ω RLREM for all j ≥ ∇ Rζ j converges to zero. It follows that M ζ j converges to zero. But this isimpossible because k M ζ j k is bounded below by the smallest singular value of M , which isnon-zero because M is of full rank (Horn & Johnson, 1985, Theorems 7.3.5 and 7.3.10). Thus,( B, A ) is locally identified.Conversely, for ( ˜ B, ˜ A ) ∈ R [ z, z − ] n × n × R [ z ] n × m satisfying (L-LREM) define Z ( ˜ B, ˜ A ) = ( P ′ ⊗ I n )vec (cid:16)h ˜ B − λ · · · ˜ B κ ˜ A + − λ · · · ˜ A + κ i(cid:17) R ( ˜ B, ˜ A ) and notice that Z ( B, A ) = 0, ∇ Z ( B, A ) = M , and the rank of ∇ Z ( ˜ B, ˜ A ) is constant and equalto the rank of M in a neighbourhood of ( B, A ). It follows from the Rank Theorem (Rudin,1976, Theorem 9.32) that every neighbourhood of (
B, A ) contains points different from (
B, A )where Z vanishes. By Theorem 4.5 (ii), the zero set of Z coincides with Ω RLREM ∩ ( B, A ) / ∼ in a neighbourhood of ( B, A ) and as a results (
B, A ) is not locally identified in Ω
RLREM. The reason why the converse in Theorem 6.4 requires a more stringent condition is well understood in the identification literature (Hsiao, 1983, Section 5.1). The condition is known as regularity. Without it, it is not possible to conclude local non-identifiability when M is rank deficient. For example, let n = m = 1, κ = λ = 0, and R(B, A) = (A − . Clearly every point of Ω_RLREM, defined as in Theorem 6.4, is identified. On the other hand, M is equal to [ −C₀ 1 ; 0 0 ] at any given (
B, A ) ∈ Ω RLREM . Note, however, that in this case, ∇ Z is not of fixedrank in any neighbourhood of any ( B, A ) ∈ Ω RLREM .Local identification results for the i -th equation are similar. Theorem 6.5.
Let Ω RLREM be the set of ( B, A ) ∈ Ω LREM satisfying (8) R i (cid:16) vec (cid:16) e ′ i h B − λ · · · B κ A · · · A κ i(cid:17)(cid:17) = 0 , where R i : R n ( κ + λ +1)+ m ( κ +1) R r is continuously differentiable and e i ∈ R n is the i -thstandard unit vector. Let π i : vec (cid:16) e ′ i h B − λ · · · B κ A + − λ · · · A + κ i(cid:17) vec (cid:16) e ′ i h B − λ · · · B κ A +0 · · · A + κ i(cid:17) , and let R i = R i ◦ π i . If ( B, A ) ∈ Ω RLREM and P is defined as in Theorem 4.5, then the i -thequation of (1) is locally identified at ( B, A ) in Ω RLREM if M i = P ′ ∇ R i is of full column rank ( n + m )( κ + λ + 1) , where ∇ R i is the Jacobian of R i evaluated at ( B, A ) .Conversely, if ( B, A ) ∈ Ω RLREM , P is defined as in Theorem 4.5, and P ′ ∇ R i ( ˜ B, ˜ A ) is of fixed rank lower than ( n + m )( κ + λ + 1) for all ( ˜ B, ˜ A ) in a neighbourhood of ( B, A ) , thenthe i -the equation of (1) is is not locally identified at ( B, A ) in Ω RLREM .Proof.
If the i -th equation of ( B, A ) is not locally identified, there exists a sequence ( ˜ B j , ˜ A j ) ∈ Ω LREM converging to (
B, A ) such that for all j ≥
1, ( ˜ B j , ˜ A j ) ∈ ( B, A ) / ∼ , ( ˜ B j , ˜ A j ) ∈ Ω RLREM ,and the i -th equations of ( ˜ B j , ˜ A j ) and ( B, A ) are different. Now define ζ j = c j vec (cid:16)h ˜ B j, − λ − B − λ · · · ˜ B j,κ − B κ ˜ A + j, − λ − A + − λ · · · ˜ A + j,κ − A + κ i(cid:17) , where c j ∈ R simply ensures that k ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ j k = 1 for all j ≥
1. Since ( ˜ B j , ˜ A j ) ∈ ( B, A ) / ∼ , Theorem 4.5 (i) implies that ζ j ∈ ker( P ′ ⊗ I n ) for all j ≥
1. This implies that( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ j ∈ ker( P ′ ), since P ′ ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) = ( I m (1+( n +1) κ + λ ) ⊗ e ′ i )( P ′ ⊗ I n ). n the other hand, since ( ˜ B j , ˜ A j ) ∈ Ω RLREM for all j ≥ ∇ R ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ j convergesto zero. It follows that M i ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ j converges to zero. But this is impossiblebecause k M i ( I ( n + m )( κ + λ +1) ⊗ e ′ i ) ζ j k is bounded below by the smallest singular value of M i ,which is non-zero because M i is of full rank (Horn & Johnson, 1985, Theorems 7.3.5 and7.3.10). Thus, the i -the equation is locally identified at ( B, A ) in Ω
RLREM .Conversely, for ( ˜ B, ˜ A ) ∈ R [ z, z − ] n × n × R [ z ] n × m satisfying (L-LREM) define Z i ( ˜ B, ˜ A ) = P ′ vec (cid:16) e ′ i h ˜ B − λ · · · ˜ B κ ˜ A + − λ · · · ˜ A + κ i(cid:17) R i ( ˜ B, ˜ A ) and notice that Z i ( B, A ) = 0, ∇ Z i ( B, A ) = M i , and the rank of ∇ Z i ( ˜ B, ˜ A ) is constantand equal to the rank of M i in a neighbourhood of ( B, A ). It follows from the Rank Theorem(Rudin, 1976, Theorem 9.32) that every neighbourhood of the parameters of the i -the equationof ( B, A ) contains points where Z i vanishes. Thus, in every neighbourhood of ( B, A ) wecan find a ( ˜ B, ˜ A ) that is identical to ( B, A ) except in the i -th equation and contained inΩ RLREM ∩ ( B, A ) / ∼ . Thus, the i -th equation of ( B, A ) is not locally identified in Ω
RLREM. This paper's title is an homage to the seminal paper of the VARMA identification literature (Hannan, 1971). Like Hannan's paper, the present work characterizes observational equivalence and provides conditions for identification in a variety of settings. More importantly, and again much like Hannan's paper, the present work has not succeeded in answering all of the questions surrounding the identification of the model under study. We now turn to some of the pending issues.

Recognizing the difficulty of the identification problem for VARMA models, the literature proposed a variety of canonical parametrizations (Hannan & Deistler, 2012, p. 67). These parametrizations allow the researcher to specify a model without having to worry about identification. It would be quite useful for empirical work to find similar parametrizations for LREMs.

Our framework has excluded measurement errors, which are commonly used in the LREM literature. This is not an insurmountable challenge as the literature on latent variables and measurement error is very well developed (Bollen, 1989; Fuller, 1987). Treating this material in the present work would have made it prohibitively long and complicated, not to mention distracting from the primary theoretical difficulties of the identification problem for LREMs. Thus, this is also left for further work.

It is tempting to think of the condition numbers of M or M_i from Sections 5 and 6 as measures of distance to non-identifiability. However, this remains to be proven. It would be interesting to look more closely at this problem, especially in light of the weak identification of LREMs frequently encountered in empirical work.

Finally, the quasi-lower triangular assumption on the first impulse response in (CF-LREM) has allowed us to solve a long standing open problem, the identification problem for LREMs, up to another long standing open problem, the general identification problem for simultaneous equations models.
This progress was made possible by recognizing that the appropriate mathematical framework for LREMs is Wiener-Hopf factorization theory. Interestingly, Hannan (1971) had a similar trajectory. As Hannan described how he came to resolve the identification problem for VARMA,

"It was really quite simple once you recognize what the underlying mathematical technique is. . . . That's how it came about . . . it is important to have in command the mathematics so you can solve the problem. Of course, the 64 dollar question is which mathematics to learn, because you can't learn all of it." (Pagan & Hannan, 1985, p. 273)

The 64 dollar question now is which new mathematics will allow us to solve the general identification problem for simultaneous equations models.

A A Review of Classical Identification Theory
This section reviews some of the basic theory of identification in linear systems, beginning with the classical simultaneous equations model and proceeding to VARMA models. For more detailed treatments, the reader may wish to consult Hsiao (1983) or Hannan & Deistler (2012).

A.1 The Classical Simultaneous Equations Model

The classical simultaneous equations model is the set of structural equations

B Y_t = A ε_t,  t ∈ Z.  (9)

Here ε is an m-dimensional exogenous and unobserved i.i.d. sequence with mean zero and var(ε) = I_m, while Y is an n-dimensional endogenous observed sequence.

The parameter space of the classical simultaneous equations model, denoted by Ω_SEM, is a set of pairs (
B, A ) ∈ R n × n × R n × m characterized by two restrictions, the first of which is:(EU-SEM) rank( B ) = n. Restriction (EU-SEM) is equivalent to the existence and uniqueness of the solution, Y t = B − Aε t , t ∈ Z . The variance matrix of the observed data then satisfiesvar( Y ) = B − AA ′ B − ′ . Before introducing the second restriction, we must understand why it is needed. To thatend, we say that two parameters (
B, A) and (B̃, Ã) are observationally equivalent, and denote this by (B, A) ∼ (B̃, Ã), if both produce the same var(Y); that is, if and only if

B̃⁻¹ÃÃ′(B̃⁻¹)′ = B⁻¹AA′(B⁻¹)′.

In order to make progress on this problem it is necessary to impose further restrictions. One such restriction requires B⁻¹A to be of full column rank so that there are no redundant shocks in the system. This then implies the existence of an orthogonal matrix V ∈ R^{m×m} such that

B̃⁻¹Ã = B⁻¹AV.
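The rotation argument is easy to verify numerically. A minimal sketch with a randomly generated hypothetical parameter (B, A), not taken from the paper: post-multiplying A by any orthogonal V leaves var(Y) unchanged, so (B, A) and (B, AV) are observationally equivalent:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

B = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # invertible (near identity)
A = rng.standard_normal((n, m))

V, _ = np.linalg.qr(rng.standard_normal((m, m)))   # a random orthogonal m x m matrix

Binv = np.linalg.inv(B)
var_Y = Binv @ A @ A.T @ Binv.T
var_Y_rotated = Binv @ (A @ V) @ (A @ V).T @ Binv.T
# The two variance matrices coincide because V V' = I_m cancels inside.
```

This is exactly why the dependence on V must be eliminated by a normalization such as (CF-SEM) below.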
Now it is certainly possible to formulate the identification problem at this level of generality. However, this makes the problem substantially more difficult. We will opt, as most of the literature has done, for imposing additional restrictions on the parameter space in order to eliminate the dependence on V. In particular, the QR algorithm allows us to reduce B⁻¹AV to a canonical quasi-lower triangular form, where the first non-zero element of the j-th column is positive and occurs on row i_j, with 1 ≤ i_1 < i_2 < ⋯ < i_m ≤ n (Anderson et al., 2012, Theorem 1). Note that when n = m, the canonical quasi-lower triangular form is the Cholesky factor of var(Y). We will need the following additional restriction on the parameter space.

(CF-SEM) B⁻¹A is of rank m and canonical quasi-lower triangular.

Thus Ω_SEM is the set of pairs (
$(B, A) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m}$ satisfying (EU-SEM) and (CF-SEM). We will endow it with the relative topology inherited from $\mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m}$. Some aspects of its topology are given in the following result.

Proposition A.1. $\Omega_{SEM}$ is homeomorphic to an open subset of $\mathbb{R}^{n(n+m) - m(m-1)/2}$ consisting of two connected components.

Proof. Let
\[
\Theta_{SEM} = \big\{ (B, C) : B \in \mathbb{R}^{n \times n},\ C \in \mathbb{R}^{n \times m},\ \mathrm{rank}(B) = n,\ \mathrm{rank}(C) = m, \text{ and } C \text{ is canonical quasi-lower triangular} \big\}.
\]
Then $\Theta_{SEM}$ can be viewed as a subset of $\mathbb{R}^{n(n+m) - m(m-1)/2}$ and the mapping $\phi_{SEM} : \Theta_{SEM} \to \Omega_{SEM}$ defined by $(B, C) \mapsto (B, BC)$ is a homeomorphism. Since the smallest singular value of a matrix is continuous with respect to the elements of the matrix (Horn & Johnson, 1991, Theorem 3.3.16), $\Theta_{SEM}$ is an open subset of $\mathbb{R}^{n(n+m) - m(m-1)/2}$. The set of non-singular $n \times n$ matrices consists of two components, one containing $I_n$ and another containing $\begin{bmatrix} -1 & 0 \\ 0 & I_{n-1} \end{bmatrix}$ (Hall, 2003, Proposition 1.12). In turn, the set of canonical quasi-lower triangular $n \times m$ matrices of full rank has a single component, as every such element is connected by a straight line to $\begin{bmatrix} I_m \\ 0_{(n-m) \times m} \end{bmatrix}$. Thus, $\Theta_{SEM}$ consists of two connected components.

Proposition A.1 implies that $\Omega_{SEM}$ can be parametrized by $n(n+m) - m(m-1)/2$ free parameters.
We have already characterized observational equivalence as follows.

Theorem A.2. Let $(B, A), (\tilde{B}, \tilde{A}) \in \Omega_{SEM}$. Then $(\tilde{B}, \tilde{A}) \sim (B, A)$ if and only if
\[
\tilde{B}^{-1} \tilde{A} = B^{-1} A. \tag{10}
\]
We need a simpler characterization of the spaces of observationally equivalent parameters, denoted by
\[
(B, A)/\sim \; = \; \big\{ (\tilde{B}, \tilde{A}) \in \Omega_{SEM} : (\tilde{B}, \tilde{A}) \sim (B, A) \big\}, \qquad (B, A) \in \Omega_{SEM}.
\]
That is the purpose of the next result.
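The characterization in Theorem A.2 is easy to check numerically. The following sketch (in Python with numpy, included purely as an illustration; the paper itself uses no code) premultiplies a system by an arbitrary non-singular $U$, which changes $(B, A)$ but leaves $B^{-1}A$, and hence $\mathrm{var}(Y)$, intact.

```python
import numpy as np

rng = np.random.default_rng(0)
n = m = 3
B = rng.standard_normal((n, n))   # non-singular with probability one
A = rng.standard_normal((n, m))

# Premultiplying the system by a non-singular U changes (B, A) but not
# B^{-1} A, so (UB, UA) ~ (B, A) by Theorem A.2.
U = rng.standard_normal((n, n))
Bt, At = U @ B, U @ A

C = np.linalg.solve(B, A)         # B^{-1} A
Ct = np.linalg.solve(Bt, At)      # (UB)^{-1} (UA) = B^{-1} A

# var(Y) = B^{-1} A A' B^{-1}' = C C' is identical for the two parameters.
print(np.allclose(C @ C.T, Ct @ Ct.T))
```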
Theorem A.3. Let $(B, A), (\tilde{B}, \tilde{A}) \in \Omega_{SEM}$ and let
\[
C = B^{-1} A, \qquad P = \begin{bmatrix} -C \\ I_m \end{bmatrix}.
\]
Then:
(i) $(\tilde{B}, \tilde{A}) \sim (B, A)$ if and only if $\mathrm{vec}([\tilde{B}\ \tilde{A}]) \in \ker(P' \otimes I_n)$.
(ii) $(B, A)/\sim$ is a relatively open and dense subset of the subspace $\mathrm{mat}(\ker(P' \otimes I_n))$, where $\mathrm{mat} : \mathrm{vec}([B\ A]) \mapsto (B, A)$.
(iii) $\dim((B, A)/\sim) = n^2$.

Proof. (i) By Theorem A.2, $(\tilde{B}, \tilde{A}) \sim (B, A)$ if and only if $-\tilde{B}C + \tilde{A} = 0_{n \times m}$. Vectorizing both sides we obtain
\[
\underbrace{\begin{bmatrix} -C' \otimes I_n & I_{nm} \end{bmatrix}}_{P' \otimes I_n} \mathrm{vec}([\tilde{B}\ \tilde{A}]) = 0_{nm \times 1}.
\]
(ii) If $(\breve{B}, \breve{A}) \in \mathrm{mat}(\ker(P' \otimes I_n))$, then the preceding implies that (CF-SEM) is satisfied whenever (EU-SEM) is satisfied. Therefore, $(B, A)/\sim$ is the intersection of $\mathrm{mat}(\ker(P' \otimes I_n))$ with
\[
\big\{ (\breve{B}, \breve{A}) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m} : \text{(EU-SEM) is satisfied} \big\}.
\]
Since the latter set is open in $\mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m}$, $(B, A)/\sim$ is relatively open in $\mathrm{mat}(\ker(P' \otimes I_n))$. If $(\breve{B}, \breve{A}) \in \mathrm{mat}(\ker(P' \otimes I_n))$ and $\det(\breve{B}) = 0$, then arbitrarily near $\breve{B}$ we can find a non-singular $\bar{B}$ (Horn & Johnson, 1985, p. 312) and we can then define $\bar{A} = \bar{B}C$, which is then also arbitrarily near $\breve{A}$. It follows that $(B, A)/\sim$ is dense in $\mathrm{mat}(\ker(P' \otimes I_n))$.
(iii) $\dim(\ker(P')) = n$ and so $\dim(\ker(P' \otimes I_n)) = n^2$ by the standard properties of Kronecker products (Horn & Johnson, 1991, Theorem 4.2.15).

Theorem A.3 characterizes the spaces of observationally equivalent parameters as relatively open and dense subsets of subspaces of the parameter space. As the origin is not an element of $\Omega_{SEM}$, these subsets are proper.

Theorem A.3 makes it clear that the parameter space needs to be further restricted in order to be able to identify a single parameter with a given $\mathrm{var}(Y)$. Let $\Omega_{RSEM}$ be a subset of
$\Omega_{SEM}$, endowed with the relative topology. We say that $(B, A) \in \Omega_{RSEM}$ is identified in $\Omega_{RSEM}$ if every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $\Omega_{RSEM}$ is equal to $(B, A)$. We say that a parameter $(B, A)$ is locally identified in $\Omega_{RSEM}$ if it has a neighbourhood $N$ in $\Omega_{RSEM}$ such that every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $N$ is equal to $(B, A)$. Clearly, a parameter is locally identified in $\Omega_{RSEM}$ if it is identified in $\Omega_{RSEM}$, but the converse is not true in general.
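Theorem A.3 can also be verified numerically. The following sketch (assuming numpy; the dimensions are illustrative) confirms that $\mathrm{vec}([B\ A])$ lies in $\ker(P' \otimes I_n)$ and that this kernel has dimension $n^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

B = rng.standard_normal((n, n))            # non-singular w.p. 1 (EU-SEM)
A = rng.standard_normal((n, m))
C = np.linalg.solve(B, A)                  # C = B^{-1} A

P = np.vstack([-C, np.eye(m)])             # P = [-C; I_m] as in Theorem A.3
K = np.kron(P.T, np.eye(n))                # P' (x) I_n

# vec([B A]) (column-major) lies in ker(K): [B A] P = -B C + A = 0.
zeta = np.hstack([B, A]).ravel(order="F")
print(np.allclose(K @ zeta, 0))            # True, Theorem A.3 (i)

# The kernel has dimension n(n+m) - rank(K) = n^2, Theorem A.3 (iii).
dim_ker = n * (n + m) - np.linalg.matrix_rank(K)
print(dim_ker)                             # 9 = n^2
```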
Theorem A.4. Let $\Omega_{RSEM}$ be the set of $(B, A) \in \Omega_{SEM}$ satisfying
\[
R\, \mathrm{vec}([B\ A]) = u, \tag{11}
\]
where $R \in \mathbb{R}^{r \times n(m+n)}$ and $u \in \mathbb{R}^r$. If $(B, A) \in \Omega_{RSEM}$ and $P$ is defined as in Theorem A.3, then $(B, A)$ is identified in $\Omega_{RSEM}$ if and only if
\[
M = \begin{bmatrix} P' \otimes I_n \\ R \end{bmatrix}
\]
is of full column rank $n(n+m)$.

Proof. Let $\zeta = \mathrm{vec}([B\ A])$. If $M$ is of full column rank, then $\zeta$ is the only point in $\ker(P' \otimes I_n)$ that satisfies (11). Theorem A.3 (i) then implies that $(B, A)$ is identified in $\Omega_{RSEM}$. If $M$ is not of full rank, then there exists $0 \neq \xi \in \ker(P' \otimes I_n) \cap \ker(R)$. If $c > 0$ is sufficiently small, then $\mathrm{mat}(\zeta + c\xi) \in \Omega_{RSEM}$, $(B, A) \sim \mathrm{mat}(\zeta + c\xi) \neq (B, A)$, and since $\mathrm{mat}(\zeta + c\xi)$ satisfies (11), $(B, A)$ is not identified in $\Omega_{RSEM}$.
The geometry of Theorem A.4 is illustrated in Figure 1. Any given parameter $(B, A)$ lies in the intersection of two affine subspaces. The first affine subspace, denoted by $E$, contains the set of parameters observationally equivalent to $(B, A)$ by Theorem A.3 (i). The second affine subspace, denoted by $R$, is the space $\Omega_{RSEM}$, which contains the set of parameters satisfying the restrictions (11). When $(B, A)$ is the only point of intersection, it is identified in $\Omega_{RSEM}$. Otherwise, the two affine subspaces intersect along an affine subspace, which contains a line segment in $\Omega_{RSEM}$ by Theorem A.3 (ii), and so every neighbourhood of $(B, A)$ contains infinitely many observationally equivalent parameters that also satisfy the given restrictions. Thus, for the classical simultaneous equations model subject to affine restrictions, a parameter is identified if and only if it is locally identified.

Suppose now that we are interested in identifying just the $i$-th equation of (9). Let $\Omega_{RSEM}$ be as before and let
$(B, A) \in \Omega_{RSEM}$. We say that the $i$-th equation of (9) is identified at $(B, A)$ in $\Omega_{RSEM}$ if every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $\Omega_{RSEM}$ has the same $i$-th equation as $(B, A)$. We say that the $i$-th equation of (9) is locally identified at $(B, A)$ in $\Omega_{RSEM}$ if $(B, A)$ has a neighbourhood $N$ in $\Omega_{RSEM}$ such that every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $N$ has the same $i$-th equation as $(B, A)$. Clearly, if all equations are (locally) identified at $(B, A)$ in $\Omega_{RSEM}$, then $(B, A)$ is (locally) identified in $\Omega_{RSEM}$.

Theorem A.5. Let $\Omega_{RSEM}$ be the set of $(B, A) \in \Omega_{SEM}$ satisfying
\[
R_i\, \mathrm{vec}(e_i' [B\ A]) = u_i, \tag{12}
\]
where $R_i \in \mathbb{R}^{r_i \times (m+n)}$, $u_i \in \mathbb{R}^{r_i}$, and $e_i \in \mathbb{R}^n$ is the $i$-th standard unit vector. If $(B, A) \in \Omega_{RSEM}$ and $P$ is defined as in Theorem A.3, then the $i$-th equation of (9) is identified at $(B, A)$ in $\Omega_{RSEM}$ if and only if
\[
M_i = \begin{bmatrix} P' \\ R_i \end{bmatrix}
\]
is of full column rank $n + m$.

Proof. Let $\zeta = \mathrm{vec}([B\ A])$. If $M_i$ is of full column rank, then $\mathrm{vec}(e_i' [B\ A]) = (I_{n+m} \otimes e_i') \zeta$ is the only point in $\ker(P')$ that satisfies (12), since $P' (I_{n+m} \otimes e_i') = (I_m \otimes e_i')(P' \otimes I_n)$. Theorem A.3 (i) then implies that any parameter in $\Omega_{RSEM}$ that is observationally equivalent to $(B, A)$ must have the same $i$-th equation as $(B, A)$. Thus the $i$-th equation is identified at $(B, A)$ in $\Omega_{RSEM}$. If $M_i$ is not of full rank, then there exists $0 \neq \xi_i \in \ker(P') \cap \ker(R_i)$. If $c > 0$ is sufficiently small, then $\mathrm{mat}(\zeta + c\,\xi_i \otimes e_i) \in \Omega_{RSEM}$, $(B, A) \sim \mathrm{mat}(\zeta + c\,\xi_i \otimes e_i)$, and since $\mathrm{mat}(\zeta + c\,\xi_i \otimes e_i)$ satisfies (12) but has a different $i$-th equation than $(B, A)$, the $i$-th equation is not identified at $(B, A)$ in $\Omega_{RSEM}$.

The geometry of Theorem A.5 is exactly analogous to that of Theorem A.4, as is the equivalence of identification and local identification.
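The rank conditions of Theorems A.4 and A.5 are straightforward to verify in practice. The following sketch (assuming numpy; the particular parameter and restrictions are illustrative) checks the condition of Theorem A.4 for a parameter normalized by the restrictions $B = I_n$, a case in which $\ker(P' \otimes I_n) \cap \ker(R)$ is trivial and the parameter is identified.

```python
import numpy as np

n = m = 2
B = np.eye(n)                            # normalized by the restrictions below
A = np.array([[1.0, 0.0], [0.5, 2.0]])   # canonical quasi-lower triangular
C = np.linalg.solve(B, A)                # C = B^{-1} A

P = np.vstack([-C, np.eye(m)])
K = np.kron(P.T, np.eye(n))              # P' (x) I_n

# Affine restrictions (11): fix B = I_n, i.e. R selects the n^2 entries of B
# inside vec([B A]) (column-major ordering), with u = vec(I_n).
R = np.hstack([np.eye(n * n), np.zeros((n * n, n * m))])

M = np.vstack([K, R])
print(np.linalg.matrix_rank(M) == n * (n + m))   # True: (B, A) is identified
```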
A.2 The Classical VARMA Model
The classical VARMA model generalizes (9) by allowing for dependence across time (i.e. dynamics):
\[
\sum_{i=0}^{p} B_i Y_{t-i} = \sum_{i=0}^{k} A_i \varepsilon_{t-i}, \qquad t \in \mathbb{Z}. \tag{13}
\]
Here $\varepsilon$ is an $m$-dimensional exogenous and unobserved i.i.d. sequence of mean zero and $\mathrm{var}(\varepsilon) = I_m$, while $Y$ is an $n$-dimensional endogenous observed sequence.

It will be convenient to collect the parameters $B_0, \ldots, B_p \in \mathbb{R}^{n \times n}$ and $A_0, \ldots, A_k \in \mathbb{R}^{n \times m}$ in the form of the polynomial matrices $B(z) = \sum_{i=0}^{p} B_i z^i$ and $A(z) = \sum_{i=0}^{k} A_i z^i$. The parameter space of the VARMA model, denoted by $\Omega_{VARMA}$, is then a set of pairs $(B, A) \in \mathbb{R}[z]^{n \times n} \times \mathbb{R}[z]^{n \times m}$ characterized by three restrictions, the first of which is:

(EU-VARMA) $\mathrm{rank}(B(z)) = n$ for all $z \in \mathbb{D}$.

Restriction (EU-VARMA) is equivalent to the existence and uniqueness of a stationary causal solution,
\[
Y_t = B^{-1}(L) A(L) \varepsilon_t, \qquad t \in \mathbb{Z},
\]
where $L$ is the lag operator, $B^{-1}(L)$ is the composition of $L$ and the Taylor series expansion of $B^{-1}(z)$ in a neighbourhood of $z = 0$, and $A(L)$ is the composition of $L$ and $A$ (Hannan & Deistler, 2012, pp. 10-11). This implies that the spectral density of the observed data,
\[
f_{YY}(z) = \sum_{j=-\infty}^{\infty} \mathrm{cov}(Y_j, Y_0) z^j,
\]
satisfies
\[
f_{YY}(z) = B^{-1}(z) A(z) A'(z^{-1}) B^{-1\prime}(z^{-1}).
\]
We say that two parameters $(B, A)$ and $(\tilde{B}, \tilde{A})$ are observationally equivalent, denoted $(\tilde{B}, \tilde{A}) \sim (B, A)$, if both produce the same spectral density; that is, if and only if
\[
B^{-1}(z) A(z) A'(z^{-1}) B^{-1\prime}(z^{-1}) = \tilde{B}^{-1}(z) \tilde{A}(z) \tilde{A}'(z^{-1}) \tilde{B}^{-1\prime}(z^{-1}).
\]
Just as in the classical simultaneous equations model, in order to make progress here it is necessary to impose further restrictions. One such restriction requires the transfer function $B^{-1}(z) A(z)$ to be of full rank for all $z \in \mathbb{D}$ so that every shock can be reconstructed from the present and past values of $Y$. This condition is known variably in the literature as the invertibility, fundamentalness, or minimum phase condition. Proceeding then, it is well known in the literature that observational equivalence holds if and only if there exists an orthogonal matrix $V \in \mathbb{R}^{m \times m}$ such that
\[
\tilde{B}^{-1} \tilde{A} = B^{-1} A V.
\]
See e.g. Theorems 4.6.8 and 4.6.11 of Lindquist & Picci (2015). We may then eliminate $V$ just as we did in the classical simultaneous equations model. Thus, we arrive at the second restriction on all $(B, A) \in \Omega_{VARMA}$:

(CF-VARMA) $\mathrm{rank}(B^{-1}(z) A(z)) = m$ for all $z \in \mathbb{D}$ and $B^{-1}(0) A(0)$ is canonical quasi-lower triangular.

If (EU-VARMA) is maintained, then (CF-VARMA) is equivalent to the more familiar restriction that $A(z)$ have rank $m$ for all $z \in \mathbb{D}$ and $B^{-1}(0) A(0)$ be canonical quasi-lower triangular (Anderson et al., 2016). However, our formulation is more convenient for LREM applications.

Note also that when $n = m$, assumption (CF-VARMA) sets $B^{-1}(0) A(0)$ equal to the Cholesky factor of the variance of the innovations of $Y$. Thus, it corresponds to the identification strategy commonly attributed to Sims (1980).

Even though we have not yet completed our characterization of $\Omega_{VARMA}$, we can already simplify observational equivalence based on conditions (EU-VARMA) and (CF-VARMA).
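The role of the orthogonal matrix $V$ is easily illustrated numerically. The following sketch (assuming numpy; the lag polynomials are arbitrary) rotates the moving average coefficients by an orthogonal $V$ and confirms that the spectral density is unchanged at a point of the unit circle.

```python
import numpy as np

rng = np.random.default_rng(2)
n = m = 2
B0, B1 = np.eye(n), 0.4 * rng.standard_normal((n, n))   # B(z) = B0 + B1 z
A0, A1 = rng.standard_normal((n, m)), 0.3 * rng.standard_normal((n, m))
V, _ = np.linalg.qr(rng.standard_normal((m, m)))        # an orthogonal matrix

def f_yy(Bs, As, z):
    """Spectral density B(z)^{-1} A(z) A(z)^* B(z)^{-*} evaluated on |z| = 1."""
    Bz = sum(Bi * z**i for i, Bi in enumerate(Bs))
    Az = sum(Ai * z**i for i, Ai in enumerate(As))
    W = np.linalg.solve(Bz, Az)                          # transfer function
    return W @ W.conj().T

z = np.exp(0.7j)                                         # a point on the unit circle
f1 = f_yy([B0, B1], [A0, A1], z)
f2 = f_yy([B0, B1], [A0 @ V, A1 @ V], z)                 # rotated shocks
print(np.allclose(f1, f2))                               # True
```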
Theorem A.6. Let $(B, A), (\tilde{B}, \tilde{A}) \in \Omega_{VARMA}$. Then $(B, A) \sim (\tilde{B}, \tilde{A})$ if and only if
\[
\tilde{B}^{-1} \tilde{A} = B^{-1} A. \tag{14}
\]
The set of points in $\mathbb{R}[z]^{n \times n} \times \mathbb{R}[z]^{n \times m}$ satisfying (EU-VARMA) and (CF-VARMA) is infinite dimensional. The sets of observationally equivalent parameters, as described in Theorem A.6, are also infinite dimensional. Therefore, in practice one usually specifies a non-negative integer $\kappa$ such that for every $(B, A) \in \Omega_{VARMA}$,

(L-VARMA) $\max\deg(B) \le \kappa$ and $\max\deg(A) \le \kappa$.

Thus, $\Omega_{VARMA}$ is the set of pairs $(B, A) \in \mathbb{R}[z]^{n \times n} \times \mathbb{R}[z]^{n \times m}$ satisfying (EU-VARMA), (CF-VARMA), and (L-VARMA). The condition (L-VARMA) allows us to think of $\Omega_{VARMA}$ as a subset of $\mathbb{R}^{n(n+m)(\kappa+1)}$ with the induced Euclidean topology.

Proposition A.7. $\Omega_{VARMA}$ is homeomorphic to a subset of $\mathbb{R}^{n(n+m)(1+\kappa) - m(m-1)/2}$, the interior of which consists of two connected components.

Proof. Let
\[
\Theta_{VARMA} = \Big\{ (B_0, \ldots, B_\kappa, C_0, A_1, \ldots, A_\kappa) : B_0, \ldots, B_\kappa \in \mathbb{R}^{n \times n},\ C_0, A_1, \ldots, A_\kappa \in \mathbb{R}^{n \times m},\ \mathrm{rank}\Big( \textstyle\sum_{i=0}^{\kappa} B_i z^i \Big) = n \text{ for all } z \in \mathbb{D},\ C_0 \text{ is canonical quasi-lower triangular, and } \mathrm{rank}\Big( B_0 C_0 + \textstyle\sum_{i=1}^{\kappa} A_i z^i \Big) = m \text{ for all } z \in \mathbb{D} \Big\}.
\]
Then $\Theta_{VARMA}$ can be viewed as a subset of $\mathbb{R}^{n(n+m)(1+\kappa) - m(m-1)/2}$ and the mapping $\phi_{VARMA} : \Theta_{VARMA} \to \Omega_{VARMA}$ defined by
\[
(B_0, \ldots, B_\kappa, C_0, A_1, \ldots, A_\kappa) \mapsto \Big( \sum_{i=0}^{\kappa} B_i z^i,\ B_0 C_0 + \sum_{i=1}^{\kappa} A_i z^i \Big)
\]
is a homeomorphism. Now let
\[
\Theta^{\circ}_{VARMA} = \Big\{ (B_0, \ldots, B_\kappa, C_0, A_1, \ldots, A_\kappa) \in \Theta_{VARMA} : \mathrm{rank}\Big( B_0 C_0 + \textstyle\sum_{i=1}^{\kappa} A_i z^i \Big) = m \text{ for all } z \in \mathbb{T} \Big\}.
\]
If we pick any point $(B_0, \ldots, B_\kappa, C_0, A_1, \ldots, A_\kappa) \in \Theta_{VARMA} \setminus \Theta^{\circ}_{VARMA}$, then for any $\rho > 1$, the point $(B_0, \ldots, B_\kappa, C_0, \rho A_1, \ldots, \rho^\kappa A_\kappa) \notin \Theta_{VARMA}$, because $\mathrm{rank}\big( B_0 C_0 + \sum_{i=1}^{\kappa} A_i (\rho z)^i \big)$ will fall below $m$ for some point in $\mathbb{D}$. Thus the points of $\Theta_{VARMA} \setminus \Theta^{\circ}_{VARMA}$ are boundary points. In contrast, the continuity of zeros of a polynomial with respect to its coefficients (Horn & Johnson, 1985, Appendix D) ensures that $\Theta^{\circ}_{VARMA}$ is open. Thus $\Theta^{\circ}_{VARMA}$ is the interior of $\Theta_{VARMA}$. By similar reasoning, $(B_0, (1-t)B_1, \ldots, (1-t)^\kappa B_\kappa, C_0, (1-t)A_1, \ldots, (1-t)^\kappa A_\kappa)$ is in $\Theta^{\circ}_{VARMA}$ for any $t \in [0, 1]$, so every $(B_0, B_1, \ldots, B_\kappa, C_0, A_1, \ldots, A_\kappa) \in \Theta^{\circ}_{VARMA}$ is in the same connected component as $(B_0, 0, \ldots, 0, C_0, 0, \ldots, 0)$; as in the proof of Proposition A.1, these points form two connected components.

The elements of $\Theta^{\circ}_{VARMA}$ correspond to the strictly invertible processes in $\Omega_{VARMA}$. Thus, the parameter space may be thought to have a boundary consisting of those elements of $\Omega_{VARMA}$ which are invertible but not strictly invertible.

Proposition A.7 is the analogue to Theorem 2.5.3 (ii) of Hannan & Deistler (2012). Note, however, that Hannan and Deistler's parametrization is canonical (i.e. they parametrize the equivalence classes of parameters) and ours is not.

The condition (L-VARMA) also simplifies observational equivalence. For suppose $(B, A) \sim (\tilde{B}, \tilde{A})$ and let $C = B^{-1}A$. Then (14) can be rewritten as $\tilde{A} = \tilde{B}C$. If $\tilde{B}(z) = \sum_{i=0}^{\kappa} \tilde{B}_i z^i$, $\tilde{A}(z) = \sum_{i=0}^{\kappa} \tilde{A}_i z^i$, and $C(z) = \sum_{i=0}^{\infty} C_i z^i$ for $z \in \mathbb{D}$, then equating Taylor series coefficients we arrive at the following equivalent expression:
\[
[\tilde{A}_0\ \tilde{A}_1\ \cdots\ \tilde{A}_\kappa\ 0\ \cdots] = [\tilde{B}_0\ \tilde{B}_1\ \cdots\ \tilde{B}_\kappa\ 0\ \cdots]
\begin{bmatrix}
C_0 & C_1 & C_2 & \cdots \\
 & C_0 & C_1 & \ddots \\
 & & C_0 & \ddots \\
 & & & \ddots
\end{bmatrix}.
\]
Although there are infinitely many equations in this expression, (L-VARMA) will allow us to restrict attention to only the first $1 + (n+1)\kappa$ of them, which will then allow us to determine the dimension of the sets of observationally equivalent models. This is achieved in the next result. Before we state it formally, we review the concepts of coprimeness and McMillan degree. If $B \in \mathbb{R}[z]^{n \times n}$, $A \in \mathbb{R}[z]^{n \times m}$, and $\det(B)$ is not identically zero, we say that the pair $(B, A)$ is coprime if $\mathrm{rank}([B(z)\ A(z)]) = n$ for all $z \in \mathbb{C}$. Every $C \in \mathbb{R}(z)^{n \times m}$ can be represented as $C = B^{-1}A$ for some coprime pair $(B, A)$, and
\[
\delta(C) = \max\deg(\det(B)) \tag{15}
\]
is an invariant of such representations of $C$, known as the McMillan degree of $C$ (Hannan & Deistler, 2012, pp. 41 and 51).

Lemma A.8.
Let $(B, A) \in \Omega_{VARMA}$, let $C = B^{-1}A$ have a Taylor series expansion $C(z) = \sum_{i=0}^{\infty} C_i z^i$ in a neighbourhood of $z = 0$, and let
\[
H = \begin{bmatrix} C_{\kappa+1} & C_{\kappa+2} & \cdots & C_{(n+1)\kappa} \\ \vdots & \vdots & \ddots & \vdots \\ C_2 & C_3 & \cdots & C_{n\kappa+1} \\ C_1 & C_2 & \cdots & C_{n\kappa} \end{bmatrix}.
\]
Then:
(i) $\mathrm{rank}(H) = \delta(C(z^{-1}) - C(0)) \le n\kappa$.
(ii) The set of parameters satisfying $\mathrm{rank}(H) = n\kappa$ is generic (i.e. contains an open and dense subset of $\Omega_{VARMA}$).

Proof. (i) The rank of the infinite Hankel matrix
\[
\begin{bmatrix} C_1 & C_2 & C_3 & \cdots \\ C_2 & C_3 & C_4 & \cdots \\ C_3 & C_4 & C_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}
\]
is equal to $\delta(C(z^{-1}) - C(0))$ (Hannan & Deistler, 2012, Theorem 2.4.1 (iii)). Now write
\[
C(z^{-1}) - C(0) = B^{-1}(z^{-1})\big( A(z^{-1}) - B(z^{-1}) C(0) \big) = \big( z^\kappa B(z^{-1}) \big)^{-1} \big( z^\kappa A(z^{-1}) - z^\kappa B(z^{-1}) C(0) \big).
\]
Restriction (L-VARMA) implies that $z^\kappa B(z^{-1}) \in \mathbb{R}[z]^{n \times n}$ and $z^\kappa A(z^{-1}) \in \mathbb{R}[z]^{n \times m}$. Since $B_0 = B(0)$ is non-singular by restriction (EU-VARMA), $\max\deg\big(\det\big( z^\kappa B(z^{-1}) \big)\big) = n\kappa$ (Hannan & Deistler, 2012, p. 42). It follows that $\delta(C(z^{-1}) - C(0)) \le n\kappa$ (Hannan & Deistler, 2012, Lemma 2.2.1 (e)). By Theorem 2.4.1 (iii) of Hannan & Deistler (2012) again and reordering the blocks, $\delta(C(z^{-1}) - C(0))$ is the rank of the matrix
\[
Q = \begin{bmatrix} C_{n\kappa} & C_{n\kappa+1} & \cdots & C_{2n\kappa-1} \\ \vdots & \vdots & \ddots & \vdots \\ C_2 & C_3 & \cdots & C_{n\kappa+1} \\ C_1 & C_2 & \cdots & C_{n\kappa} \end{bmatrix}.
\]
Since $A = BC$ is a polynomial matrix of degree at most $\kappa$, the coefficients of $z^{\kappa+1}, \ldots, z^{2n\kappa-1}$ of $BC$ are all zero. That is,
\[
\sum_{i=0}^{\kappa} B_i C_{j-i} = 0, \qquad \kappa + 1 \le j \le 2n\kappa - 1.
\]
Since $B_0$ is non-singular,
\[
C_j = -B_0^{-1} \sum_{i=1}^{\kappa} B_i C_{j-i}, \qquad \kappa + 1 \le j \le 2n\kappa - 1.
\]
This implies that all of the top blocks of $Q$ are linearly dependent on the bottom $\kappa$ blocks of $Q$, and therefore that
\[
\delta(C(z^{-1}) - C(0)) = \mathrm{rank}(Q) = \mathrm{rank}(H) = \mathrm{rank}(\check{H}),
\]
where
\[
\check{H} = \begin{bmatrix} C_\kappa & C_{\kappa+1} & \cdots & C_{(n+1)\kappa - 1} \\ \vdots & \vdots & \ddots & \vdots \\ C_2 & C_3 & \cdots & C_{n\kappa+1} \\ C_1 & C_2 & \cdots & C_{n\kappa} \end{bmatrix}
\]
will be needed later.

(ii) The proof is in two steps.

STEP 1: The set of coprime $(B, A) \in \Omega_{VARMA}$ with non-singular $B_\kappa$ is generic in $\Omega_{VARMA}$.
Our technique for proving this follows the technique used to prove Theorem 3 of Anderson et al. (2016), although our proof is substantially more explicit. We are not able to rely on that result directly because of our additional assumption that $C(0)$ is quasi-lower triangular. Recall that
\[
\Omega_{VARMA} = \big\{ (B, A) \in \mathbb{R}[z]^{n \times n} \times \mathbb{R}[z]^{n \times m} : \max\deg(B), \max\deg(A) \le \kappa,\ \mathrm{rank}(B) = n \text{ for all } z \in \mathbb{D},\ \mathrm{rank}(A) = m \text{ for all } z \in \mathbb{D}, \text{ and } B^{-1}(0) A(0) \text{ is canonical quasi-lower triangular} \big\}.
\]
Recall also that the topology on this set is the topology inherited from $\mathbb{R}^{n(n+m)(\kappa+1)}$. We will prove that the following subset is open and dense in $\Omega_{VARMA}$:
\[
\check{\Omega}_{VARMA} = \big\{ (B, A) \in \Omega_{VARMA} : \det(B) \text{ has distinct zeros},\ \det(B_\kappa) \neq 0,\ \mathrm{rank}(A) = m \text{ for all } z \in \mathbb{T}, \text{ and } (B, A) \text{ is coprime} \big\}.
\]
To see that $\check{\Omega}_{VARMA}$ is open, let $(B, A) \in \check{\Omega}_{VARMA}$. We will construct a neighbourhood of $(B, A)$ in $\check{\Omega}_{VARMA}$ as $N = N_1 \cap N_2 \cap N_3 \cap N_4$, where $N_i$ is a neighbourhood of $(B, A)$ satisfying the $i$-th additional condition in $\check{\Omega}_{VARMA}$.

By the continuity of the zeros of a polynomial with respect to its coefficients (Horn & Johnson, 1985, Appendix D), there is a neighbourhood $N_1$ of $(B, A)$ such that for every $(\check{B}, \check{A}) \in N_1$, $\det(\check{B})$ has distinct zeros.

Since $\det(B_\kappa) \neq 0$, the continuity of the determinant implies that there is a neighbourhood $N_2$ of $(B, A)$ such that for every $(\check{B}, \check{A}) \in N_2$, $\det(\check{B}_\kappa) \neq 0$.

Since $\mathrm{rank}(A) = m$ for all $z \in \mathbb{D}$, $A$ has a minor of order $m$ that has no zeros in $\mathbb{D}$. Using the continuity of the zeros of a polynomial with respect to its coefficients again, there is a neighbourhood $N_3$ of $(B, A)$ such that for every $(\check{B}, \check{A}) \in N_3$, $\mathrm{rank}(\check{A}) = m$ for all $z \in \mathbb{D}$.

Finally, since $\det(B)$ has distinct zeros and $B_\kappa$ is non-singular, $\det(B)$ has $n\kappa$ distinct zeros $z_1, \ldots, z_{n\kappa} \in \mathbb{C}$. The fact that $(B, A)$ is coprime then implies that
\[
\Pi_i A(z_i) \neq 0, \qquad i = 1, \ldots, n\kappa,
\]
where $\Pi_i$ is the orthogonal projection matrix onto the left null space of $B(z_i)$. Since $\mathrm{rank}(B(z_i)) = n - 1$ for $i = 1, \ldots, n\kappa$, small perturbations to $B(z_i)$ that leave the rank fixed at $n - 1$ induce small perturbations in $\Pi_i$ (Gohberg et al., 2006, Theorem 13.5.1). Thus, there exists a neighbourhood $N_4$ of $(B, A)$ such that for every $(\check{B}, \check{A}) \in N_4$,
\[
\check{\Pi}_i \check{A}(\check{z}_i) \neq 0, \qquad i = 1, \ldots, n\kappa,
\]
where $\check{z}_1, \ldots, \check{z}_{n\kappa}$ are the zeros of $\det(\check{B})$ and $\check{\Pi}_i$ is the orthogonal projection matrix onto the left null space of $\check{B}(\check{z}_i)$. In other words, every $(\check{B}, \check{A}) \in N$ is coprime.

To see that $\check{\Omega}_{VARMA}$ is dense, let $(B, A) \in \Omega_{VARMA} \setminus \check{\Omega}_{VARMA}$. We propose an infinitesimal perturbation of $B$, followed by an infinitesimal perturbation of $A$, that leads to a point in $\check{\Omega}_{VARMA}$. Thus, any neighbourhood of $(B, A)$ in $\Omega_{VARMA}$ will contain an element of $\check{\Omega}_{VARMA}$.

If $\det(B)$ has a zero of multiplicity greater than 1, we claim that there exists an infinitesimal perturbation of $B$ into the set of polynomial matrices satisfying (EU-VARMA) and (L-VARMA) and whose determinants have distinct zeros. In the course of proving their Theorem 3, Anderson et al. (2016) prove that an infinitesimal perturbation exists indeed into the set of polynomial matrices satisfying (L-VARMA) whose determinants have distinct zeros. The claim then follows from the fact that the set of polynomial matrices satisfying (EU-VARMA) and (L-VARMA) is an open subset of its ambient space $\mathbb{R}^{n^2(\kappa+1)}$ by the continuity of zeros of polynomials with respect to their coefficients.

Next, if $B_\kappa$ is singular, we can infinitesimally perturb its singular values at zero to obtain a non-singular matrix. By the continuity of zeros of polynomials with respect to their coefficients, this infinitesimal perturbation does not interfere with $\det(B)$ having distinct zeros or condition (EU-VARMA).

Next, if $\mathrm{rank}(A(z)) < m$ for some $z \in \mathbb{T}$, then for $\rho < 1$ sufficiently close to 1, $A(\rho z)$ is such that $\mathrm{rank}(A(\rho z)) = m$ for all $z \in \mathbb{D}$. This perturbation has no effect on the rank of the transfer function in $\mathbb{D}$ and leaves the constant coefficient matrix invariant. Thus, the perturbation leaves condition (CF-VARMA) invariant.

Finally, suppose that after the sequence of infinitesimal perturbations above we arrive at a $(B, A) \in \Omega_{VARMA}$ that is not coprime. Enumerate the zeros of $\det(B)$ as $z_1, \ldots, z_{n\kappa}$ and choose non-zero vectors $v_1, \ldots, v_{n\kappa} \in \mathbb{C}^n$ spanning the left null spaces of $B(z_1), \ldots, B(z_{n\kappa})$ respectively. Since $(B, A)$ is not coprime, there is an index $i$ such that $v_i' A(z_i) = 0$. Choose $\Delta \in \mathbb{R}^{n \times m}$ satisfying $v_i' \Delta \neq 0$ for all $i$ such that $v_i' A(z_i) = 0$. Then an infinitesimal perturbation of $A_\kappa$ in the direction of $\Delta$ is sufficient to produce a coprime pair. This infinitesimal perturbation has no effect on the condition that $\mathrm{rank}(A(z)) = m$ for all $z \in \mathbb{D}$, by the continuity of zeros of polynomials with respect to their coefficients, and it leaves $A(0)$ invariant. Thus (CF-VARMA) remains invariant.

STEP 2: For all parameters in $\check{\Omega}_{VARMA}$, $\mathrm{rank}(H) = n\kappa$.

We have already established that the matrix $\check{H}$ encountered in (i) is of rank $\delta(C(z^{-1}) - C(0)) \le n\kappa$. If $\mathrm{rank}(\check{H}) < n\kappa$, then there exist vectors $x_i \in \mathbb{R}^n$, $i = 0, \ldots, \kappa - 1$, not all zero, such that
\[
(x_0', \ldots, x_{\kappa-1}')\, \check{H} = 0_{1 \times nm\kappa}.
\]
Setting
\[
x(z) = \sum_{i=0}^{\kappa-1} x_i z^i,
\]
this implies that the terms of degree $\kappa, \kappa+1, \ldots, (n+1)\kappa - 1$ of $x'C$ vanish. To see that indeed all higher degree terms vanish as well, notice that each element of the numerator of
\[
y' = x'C = \frac{x'\, \mathrm{adj}(B)\, A}{\det(B)}
\]
is expressible as a polynomial of degree bounded above by $\max\deg(x) + \max\deg(\mathrm{adj}(B)) + \max\deg(A) \le (\kappa - 1) + (n-1)\kappa + \kappa = (n+1)\kappa - 1$. Since $\det(B(0)) \neq 0$ by (EU-VARMA), it follows from Lemma 4.3 that $y \in \mathbb{R}^m[z]$ and $\max\deg(y) \le \kappa - 1$. Now set
\[
U = \begin{bmatrix} x' B^{-1} \\ S \end{bmatrix}
\]
with $S \in \mathbb{R}^{(n-1) \times n}$ chosen so that $\det(U)$ is not identically zero (e.g. choose $z_0 \in \mathbb{D}$ such that $x(z_0) \neq 0$ and choose $S$ as an orthogonal complement to $x'(z_0) B^{-1}(z_0)$). Then
\[
\dot{B} = UB = \begin{bmatrix} x' \\ SB \end{bmatrix} \in \mathbb{R}[z]^{n \times n}, \qquad \dot{A} = UA = \begin{bmatrix} y' \\ SA \end{bmatrix} \in \mathbb{R}[z]^{n \times m},
\]
and $\dot{B}^{-1} \dot{A} = B^{-1} A = C$. But this violates the minimality of $\delta(C) = n\kappa$ among all matrix fraction descriptions of $C$, because $\max\deg(\det(\dot{B})) < n\kappa$ (Hannan & Deistler, 2012, Lemma 2.2.1 (e)). Thus, $\mathrm{rank}(\check{H}) = n\kappa$ and therefore $\mathrm{rank}(H) = n\kappa$.

We are now in a position to simplify Theorem A.6 and characterize the set
\[
(B, A)/\sim \; = \; \big\{ (\tilde{B}, \tilde{A}) \in \Omega_{VARMA} : (\tilde{B}, \tilde{A}) \sim (B, A) \big\}, \qquad (B, A) \in \Omega_{VARMA}.
\]
Theorem A.9.
Let $(B, A), (\tilde{B}, \tilde{A}) \in \Omega_{VARMA}$, let $C = B^{-1}A$ have a Taylor series expansion $C(z) = \sum_{i=0}^{\infty} C_i z^i$ in a neighbourhood of $z = 0$, and let
\[
T = \begin{bmatrix} C_0 & C_1 & \cdots & C_\kappa \\ & C_0 & \ddots & \vdots \\ & & \ddots & C_1 \\ & & & C_0 \end{bmatrix}, \qquad
H = \begin{bmatrix} C_{\kappa+1} & C_{\kappa+2} & \cdots & C_{(n+1)\kappa} \\ \vdots & \vdots & \ddots & \vdots \\ C_2 & C_3 & \cdots & C_{n\kappa+1} \\ C_1 & C_2 & \cdots & C_{n\kappa} \end{bmatrix},
\]
\[
P = \begin{bmatrix} -T & -H \\ I_{m(\kappa+1)} & 0_{m(\kappa+1) \times nm\kappa} \end{bmatrix}.
\]
Then:
(i) $(\tilde{B}, \tilde{A}) \sim (B, A)$ if and only if $\mathrm{vec}([\tilde{B}_0 \cdots \tilde{B}_\kappa\ \tilde{A}_0 \cdots \tilde{A}_\kappa]) \in \ker(P' \otimes I_n)$.
(ii) $(B, A)/\sim$ is a relatively open subset of the subspace $\mathrm{mat}(\ker(P' \otimes I_n))$, where
\[
\mathrm{mat} : \mathrm{vec}([B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa]) \mapsto \Big( \sum_{i=0}^{\kappa} B_i z^i,\ \sum_{i=0}^{\kappa} A_i z^i \Big).
\]
(iii) $\dim((B, A)/\sim) = n^2(\kappa+1) - n\,\delta(C(z^{-1}) - C(0)) \ge n^2$, and for generic points in the parameter space $\dim((B, A)/\sim) = n^2$.

Proof. (i) If $(\tilde{B}, \tilde{A}) \sim (B, A)$, then Theorem A.6 implies that
\[
0 = -\tilde{B}C + \tilde{A} = \frac{-\tilde{B}\, \mathrm{adj}(B)\, A + \det(B)\, \tilde{A}}{\det(B)}.
\]
Each element of the right hand side can be expressed as a ratio of a polynomial of degree at most
\[
\max\big\{ \max\deg(\tilde{B}) + \max\deg(\mathrm{adj}(B)) + \max\deg(A),\ \max\deg(\det(B)) + \max\deg(\tilde{A}) \big\} \le \max\{ \kappa + (n-1)\kappa + \kappa,\ n\kappa + \kappa \} = (n+1)\kappa
\]
and $\det(B)$, which satisfies $\det(B(0)) \neq 0$ by (EU-VARMA). By Lemma 4.3, this is equivalent to the first $1 + (n+1)\kappa$ Taylor series coefficients equating to zero. Thus, observational equivalence is equivalent to
\[
-[\tilde{B}_0 \cdots \tilde{B}_\kappa][T\ \ H] + [\tilde{A}_0 \cdots \tilde{A}_\kappa\ \ 0_{n \times m} \cdots 0_{n \times m}] = 0_{n \times (1+(n+1)\kappa)m},
\]
or equivalently,
\[
[\tilde{B}_0 \cdots \tilde{B}_\kappa\ \tilde{A}_0 \cdots \tilde{A}_\kappa]\, P = 0_{n \times (1+(n+1)\kappa)m}.
\]
Vectorizing, we obtain
\[
(P' \otimes I_n)\, \mathrm{vec}([\tilde{B}_0 \cdots \tilde{B}_\kappa\ \tilde{A}_0 \cdots \tilde{A}_\kappa]) = 0_{nm(1+(n+1)\kappa) \times 1}.
\]
(ii) If $(\breve{B}, \breve{A}) \in \mathrm{mat}(\ker(P' \otimes I_n))$, then it satisfies (L-VARMA). If, additionally, it satisfies (EU-VARMA), the preceding implies that $\breve{A} = \breve{B}C$ and therefore (CF-VARMA) is satisfied. Thus, $(B, A)/\sim$ is the intersection of $\mathrm{mat}(\ker(P' \otimes I_n))$ with
\[
\big\{ (\breve{B}, \breve{A}) \in \mathbb{R}[z]^{n \times n} \times \mathbb{R}[z]^{n \times m} : \text{(EU-VARMA) and (L-VARMA) are satisfied} \big\}.
\]
The latter set is open in $\mathbb{R}^{n(n+m)(\kappa+1)}$ due to the continuity of zeros of polynomials with respect to their coefficients (Horn & Johnson, 1985, Appendix D). Therefore, $(B, A)/\sim$ is relatively open in $\mathrm{mat}(\ker(P' \otimes I_n))$.
(iii) $\dim(\ker(P')) = \dim(\ker(H'))$, and so the result follows from Lemma A.8 (i) and the standard properties of Kronecker products (Horn & Johnson, 1991, Theorem 4.2.15). For generic parameters the result follows from Lemma A.8 (ii).

Theorem A.9 is a direct generalization of Theorem A.3 to the VARMA setting. Theorem A.9 (i) characterizes the sets of observationally equivalent parameters as subsets of particular subspaces of the parameter space determined by the first $1 + (n+1)\kappa$ impulse responses. This result is equivalent to Theorem 1 of Deistler & Schrader (1979), although Deistler and Schrader use the more traditional formulation of observational equivalence, $\tilde{B} = UB$ and $\tilde{A} = UA$ for some $U \in \mathbb{R}(z)^{n \times n}$. Theorem A.9 (ii) then shows that the sets of observationally equivalent parameters constitute relatively open, although not necessarily dense, subsets of the aforementioned subspaces (see Example A.1). Theorem A.9 (iii) finally characterizes the dimension of the sets of observationally equivalent parameters. Theorem A.9 (ii) and (iii) are analogues to Theorem 2.5.3 (v) of Hannan & Deistler (2012) (see also their Remark 2 on page 67). As noted earlier, however, Hannan and Deistler parametrize the VARMA model differently.

Example A.1. Let $n = m = \kappa = 1$ and suppose $(B, A) = (1, 1)$. Then
\[
P' \otimes I_n = \begin{bmatrix} -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]
Clearly $(1, -1, 1, -1)' \in \ker(P' \otimes I_n)$. Thus, $(\tilde{B}, \tilde{A}) = (1 - z, 1 - z) \in \mathrm{mat}(\ker(P' \otimes I_n))$. However, this pair violates condition (EU-VARMA) and, by the continuity of zeros of polynomials with respect to their coefficients (Horn & Johnson, 1985, Appendix D), there is a neighbourhood of $(\tilde{B}, \tilde{A})$ in $\mathbb{R}^4$ containing no parameters. Thus, $(B, A)/\sim$ is not dense in $\mathrm{mat}(\ker(P' \otimes I_n))$.

We are now in a position to consider affine restrictions. Let $\Omega_{RVARMA}$ be a subset of
$\Omega_{VARMA}$, endowed with the relative topology. We say that $(B, A) \in \Omega_{RVARMA}$ is identified in $\Omega_{RVARMA}$ if every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $\Omega_{RVARMA}$ is equal to $(B, A)$. We say that a parameter $(B, A)$ is locally identified in $\Omega_{RVARMA}$ if it has a neighbourhood $N$ in $\Omega_{RVARMA}$ such that every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $N$ is equal to $(B, A)$. Again, a parameter is locally identified in $\Omega_{RVARMA}$ if it is identified in $\Omega_{RVARMA}$, but the converse is not true in general.

Theorem A.10. Let $\Omega_{RVARMA}$ be the set of $(B, A) \in \Omega_{VARMA}$ satisfying
\[
R\, \mathrm{vec}([B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa]) = u, \tag{16}
\]
where $R \in \mathbb{R}^{r \times n(n+m)(\kappa+1)}$ and $u \in \mathbb{R}^r$. If $(B, A) \in \Omega_{RVARMA}$ and $P$ is defined as in Theorem A.9, then $(B, A)$ is identified in $\Omega_{RVARMA}$ if and only if
\[
M = \begin{bmatrix} P' \otimes I_n \\ R \end{bmatrix}
\]
is of full column rank $n(n+m)(\kappa+1)$.

Proof. Let $\zeta = \mathrm{vec}([B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa])$. If $M$ is of full column rank, then $\zeta$ is the only point in $\ker(P' \otimes I_n)$ that satisfies (16). Theorem A.9 (i) then implies that $(B, A)$ is identified in $\Omega_{RVARMA}$. If $M$ is not of full rank, then there exists $0 \neq \xi \in \ker(P' \otimes I_n) \cap \ker(R)$. If $c > 0$ is sufficiently small, then $\mathrm{mat}(\zeta + c\xi) \in \Omega_{RVARMA}$, $(B, A) \sim \mathrm{mat}(\zeta + c\xi) \neq (B, A)$, and since $\mathrm{mat}(\zeta + c\xi)$ satisfies (16), $(B, A)$ is not identified in $\Omega_{RVARMA}$.
Theorem A.10 is a direct generalization of Theorem A.4. The geometry of Theorem A.10 is also exactly analogous to that of Theorem A.4. Thus, for the classical VARMA model subject to affine restrictions, a parameter is identified if and only if it is locally identified.

Corollary A.11 (Deistler & Schrader (1979)). Let
\[
E : \mathrm{vec}([Y_0 \cdots Y_{(n+1)\kappa}\ X_0 \cdots X_{(n+1)\kappa}]) \mapsto \mathrm{vec}([Y_0 \cdots Y_\kappa\ X_0 \cdots X_\kappa]),
\]
where $Y_0, \ldots, Y_{(n+1)\kappa} \in \mathbb{R}^{n \times n}$ and $X_0, \ldots, X_{(n+1)\kappa} \in \mathbb{R}^{n \times m}$, let $E_\perp$ be an orthogonal complement to $E$, let
\[
R_{DS} = \begin{bmatrix} RE \\ E_\perp \end{bmatrix},
\]
and let
\[
D = \begin{bmatrix}
B_0 & \cdots & B_\kappa & & & A_0 & \cdots & A_\kappa & & \\
 & \ddots & & \ddots & & & \ddots & & \ddots & \\
 & & B_0 & \cdots & B_\kappa & & & A_0 & \cdots & A_\kappa
\end{bmatrix},
\]
where each of the two banded halves of $D$ has $1 + (n+1)\kappa$ block columns. Then $(B, A)$ is identified in $\Omega_{RVARMA}$ if and only if $R_{DS}(D' \otimes I_n)$ is of full column rank $n^2(1 + (n+1)\kappa)$.

Proof. We claim that $\begin{bmatrix} P' \otimes I_n \\ R \end{bmatrix}$ is of full column rank if and only if $R_{DS}(D' \otimes I_n)$ is of full column rank. Let $\zeta = \mathrm{vec}([Y_0 \cdots Y_\kappa\ X_0 \cdots X_\kappa])$. Then $(P' \otimes I_n)\zeta = 0$ if and only if
\[
[Y_0 \cdots Y_\kappa][T\ \ H] = [X_0 \cdots X_\kappa\ \ 0_{n \times m} \cdots 0_{n \times m}],
\]
where the right hand side has $1 + (n+1)\kappa$ blocks. This is equivalent to
\[
[Y_0 \cdots Y_{(n+1)\kappa}] \underbrace{\begin{bmatrix} C_0 & C_1 & \cdots & C_{(n+1)\kappa} \\ & C_0 & \ddots & \vdots \\ & & \ddots & C_1 \\ & & & C_0 \end{bmatrix}}_{T_{DS}} = [X_0 \cdots X_{(n+1)\kappa}], \qquad E_\perp \underbrace{\mathrm{vec}([Y_0 \cdots Y_{(n+1)\kappa}\ X_0 \cdots X_{(n+1)\kappa}])}_{\zeta_{DS}} = 0.
\]
Thus, the kernel of $\begin{bmatrix} P' \otimes I_n \\ R \end{bmatrix}$ is $\{0\}$ if and only if the kernel of $\begin{bmatrix} P_{DS}' \otimes I_n \\ R_{DS} \end{bmatrix}$ is $\{0\}$, where
\[
P_{DS} = \begin{bmatrix} -T_{DS} \\ I_{m(1+(n+1)\kappa)} \end{bmatrix}.
\]
Since $A = BC$, $D$ is an orthogonal complement to $P_{DS}'$. Thus, $(P_{DS}' \otimes I_n)\zeta_{DS} = 0$ if and only if $\zeta_{DS} = (D' \otimes I_n)\xi_{DS}$ for some $\xi_{DS} \in \mathbb{R}^{n^2(1+(n+1)\kappa)}$. It follows that $\begin{bmatrix} P_{DS}' \otimes I_n \\ R_{DS} \end{bmatrix} \zeta_{DS} = 0$ if and only if $\zeta_{DS} = (D' \otimes I_n)\xi_{DS}$ and $R_{DS}(D' \otimes I_n)\xi_{DS} = 0$. In other words, the kernel of $\begin{bmatrix} P_{DS}' \otimes I_n \\ R_{DS} \end{bmatrix}$ is $\{0\}$ if and only if the kernel of $R_{DS}(D' \otimes I_n)$ is $\{0\}$.

Corollary A.11, due to Deistler & Schrader (1979), is evidently equivalent to Theorem A.10. The main difference between the two formulations is that our determining matrix is populated by impulse responses, whereas Deistler and Schrader's matrix is populated by structural parameters.

Suppose now that we are interested in identifying just the $i$-th equation of (13). Let $\Omega_{RVARMA}$ be as before and let
$(B, A) \in \Omega_{RVARMA}$. We say that the $i$-th equation of (13) is identified at $(B, A)$ in $\Omega_{RVARMA}$ if every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $\Omega_{RVARMA}$ has the same $i$-th equation as $(B, A)$. We say that the $i$-th equation of (13) is locally identified at $(B, A)$ in $\Omega_{RVARMA}$ if $(B, A)$ has a neighbourhood $N$ in $\Omega_{RVARMA}$ such that every $(\tilde{B}, \tilde{A}) \sim (B, A)$ in $N$ has the same $i$-th equation as $(B, A)$. Again, if all equations are (locally) identified at $(B, A)$ in $\Omega_{RVARMA}$, then $(B, A)$ is (locally) identified in $\Omega_{RVARMA}$.

Theorem A.12. Let $\Omega_{RVARMA}$ be the set of $(B, A) \in \Omega_{VARMA}$ satisfying
\[
R_i\, \mathrm{vec}(e_i' [B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa]) = u_i, \tag{17}
\]
where $R_i \in \mathbb{R}^{r_i \times (n+m)(\kappa+1)}$, $u_i \in \mathbb{R}^{r_i}$, and $e_i \in \mathbb{R}^n$ is the $i$-th standard unit vector. If $(B, A) \in \Omega_{RVARMA}$ and $P$ is defined as in Theorem A.9, then the $i$-th equation of (13) is identified at $(B, A)$ in $\Omega_{RVARMA}$ if and only if
\[
M_i = \begin{bmatrix} P' \\ R_i \end{bmatrix}
\]
is of full column rank $(n+m)(\kappa+1)$.

Proof. Let $\zeta = \mathrm{vec}([B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa])$. If $M_i$ is of full column rank, then $\mathrm{vec}(e_i'[B_0 \cdots B_\kappa\ A_0 \cdots A_\kappa]) = (I_{(n+m)(\kappa+1)} \otimes e_i')\zeta$ is the only point in $\ker(P')$ that satisfies (17), since $P'(I_{(n+m)(\kappa+1)} \otimes e_i') = (I_{m(1+(n+1)\kappa)} \otimes e_i')(P' \otimes I_n)$. Theorem A.9 (i) then implies that any parameter in $\Omega_{RVARMA}$ that is observationally equivalent to $(B, A)$ must have the same $i$-th equation as $(B, A)$. Thus the $i$-th equation is identified at $(B, A)$ in $\Omega_{RVARMA}$. If $M_i$ is not of full rank, then there exists $0 \neq \xi_i \in \ker(P') \cap \ker(R_i)$. If $c > 0$ is sufficiently small, then $\mathrm{mat}(\zeta + c\,\xi_i \otimes e_i) \in \Omega_{RVARMA}$, $(B, A) \sim \mathrm{mat}(\zeta + c\,\xi_i \otimes e_i)$, and since $\mathrm{mat}(\zeta + c\,\xi_i \otimes e_i)$ satisfies (17) but has a different $i$-th equation than $(B, A)$, the $i$-th equation is not identified at $(B, A)$ in $\Omega_{RVARMA}$.

References
Al-Sadoon, M. M. (2018). The Linear Systems Approach to Linear Rational Expectations Models. Econometric Theory, (03), 628–658.

Anderson, B. D., Deistler, M., Chen, W., & Filler, A. (2012). Autoregressive Models of Singular Spectral Matrices. Automatica, (11), 2843–2849.

Anderson, B. D., Deistler, M., Felsenstein, E., & Koelbl, L. (2016). The Structure of Multivariate AR and ARMA Systems: Regular and Singular Systems; the Single and the Mixed Frequency Case. Journal of Econometrics, (2), 366–373.

Bacchiocchi, E. & Kitagawa, T. (2019). The Dark Side of the SVAR: A Trip into the Local Identification World. Technical report.

Blanchard, O. (2018). On the Future of Macroeconomic Models. Oxford Review of Economic Policy, (1–2), 43–54.

Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. New York, USA: John Wiley & Sons, Inc.

Canova, F. & Sala, L. (2009). Back to Square One: Identification Issues in DSGE Models. Journal of Monetary Economics, (4), 431–449.

Clancey, K. F. & Gohberg, I. (1981). Factorization of Matrix Functions and Singular Integral Operators. Operator Theory: Advances and Applications (Vol. 3). Boston, USA: Birkhäuser Verlag Basel.

Deistler, M. & Schrader, J. (1979). Linear Models with Autocorrelated Errors: Structural Identifiability in the Absence of Minimality Assumptions. Econometrica, (2), 495–504.

Dufour, J.-M. & Renault, E. (1998). Short Run and Long Run Causality in Time Series: Theory. Econometrica, (5), 1099–1125.

Fuller, W. A. (1987). Measurement Error Models. Wiley Series in Probability and Mathematical Statistics. New York, USA: John Wiley & Sons, Inc.

Gohberg, I., Lancaster, P., & Rodman, L. (2006). Invariant Subspaces of Matrices with Applications. Classics in Applied Mathematics. Philadelphia, USA: SIAM.

Hall, B. C. (2003). Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics. New York, USA: Springer.

Hannan, E. J. (1971). The Identification Problem for Multiple Equation Systems with Moving Average Errors. Econometrica, (5), 751–765.

Hannan, E. J. & Deistler, M. (2012). The Statistical Theory of Linear Systems. Classics in Applied Mathematics. Philadelphia, PA, USA: SIAM.

Hansen, L. P. & Sargent, T. J. (1981). Formulating and Estimating Dynamic Linear Rational Expectations Models. In R. E. Lucas & T. J. Sargent (Eds.), Rational Expectations and Econometric Practice (pp. 91–125). Minneapolis: The University of Minnesota Press.

Herbst, E. P. & Schorfheide, F. (2016). Bayesian Estimation of DSGE Models. Princeton, USA: Princeton University Press.

Horn, R. A. & Johnson, C. R. (1985). Matrix Analysis. Cambridge, United Kingdom: Cambridge University Press.

Horn, R. A. & Johnson, C. R. (1991). Topics in Matrix Analysis. Cambridge, United Kingdom: Cambridge University Press.

Hsiao, C. (1983). Identification. In Z. Griliches & M. D. Intriligator (Eds.), Handbook of Econometrics, volume 1, chapter 4 (pp. 223–283). Elsevier.

Iskrev, N. (2010). Local Identification in DSGE Models. Journal of Monetary Economics, (2), 189–202.

Kociecki, A. & Kolasa, M. (2018). Global Identification of Linearized DSGE Models. Quantitative Economics, (3), 1243–1263.

Komunjer, I. & Ng, S. (2011). Dynamic Identification of Dynamic Stochastic General Equilibrium Models. Econometrica, (6), 1995–2032.

Krantz, S. G. & Parks, H. R. (2002). A Primer of Real Analytic Functions (2 ed.). Birkhäuser Advanced Texts: Basler Lehrbücher. Boston, USA: Birkhäuser Boston.

Lindquist, A. & Picci, G. (2015). Linear Stochastic Systems: A Geometric Approach to Modeling, Estimation, and Identification. Series in Contemporary Mathematics 1. Berlin Heidelberg: Springer-Verlag.

Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Berlin, Germany: Springer.

Muth, J. F. (1981). Estimation of Economic Relationships Containing Latent Expectations Variables. In R. E. Lucas & T. J. Sargent (Eds.), Rational Expectations and Econometric Practice (pp. 321–328). Minneapolis: The University of Minnesota Press.

Onatski, A. (2006). Winding Number Criterion for Existence and Uniqueness of Equilibrium in Linear Rational Expectations Models. Journal of Economic Dynamics and Control, (2), 323–345.

Pagan, A. & Hannan, E. J. (1985). The ET Interview: Professor E. J. Hannan. Econometric Theory, (2), 263–289.

Pesaran, M. (1981). Identification of Rational Expectations Models. Journal of Econometrics, (3), 375–398.

Pesaran, M. H. & Smith, R. P. (2011). Beyond the DSGE Straitjacket. The Manchester School, (s2), 5–16.

Qu, Z. & Tkachenko, D. (2017). Global Identification in DSGE Models Allowing for Indeterminacy. The Review of Economic Studies, (3), 1306–1345.

Romer, P. (2016). The Trouble with Macroeconomics. The American Economist, 1–20.

Rosenblatt, M. (2000). Gaussian and Non-Gaussian Linear Time Series and Random Fields. Springer Series in Statistics. New York, USA: Springer Verlag.

Rubio-Ramírez, J. F., Waggoner, D. F., & Zha, T. (2010). Structural Vector Autoregressions: Theory of Identification and Algorithms for Inference. The Review of Economic Studies, (2), 665–696.

Rudin, W. (1976). Principles of Mathematical Analysis (3 ed.). New York, USA: McGraw-Hill, Inc.

Sims, C. A. (1980). Macroeconomics and Reality. Econometrica, (1), 1–48.

Wallis, K. F. (1980). Econometric Implications of the Rational Expectations Hypothesis. Econometrica, (1), 49–73.