Inference on the New Keynesian Phillips Curve with Very Many Instrumental Variables
Max-Sebastian Dovì*
[email protected]
January 26, 2021

Abstract
Limited-information inference on New Keynesian Phillips Curves (NKPCs) and other single-equation macroeconomic relations is characterised by weak and high-dimensional instrumental variables (IVs). Beyond the efficiency concerns previously raised in the literature, I show by simulation that ad-hoc selection procedures can lead to substantial biases in post-selection inference. I propose a Sup Score test that remains valid under dependent data, arbitrarily weak identification, and a number of IVs that increases exponentially with the sample size. Conducting inference on a standard NKPC with 359 IVs and 179 observations, I find substantially wider confidence sets than those commonly found.

* I thank Sophocles Mavroeidis and Anna Mikusheva for very helpful comments and suggestions. All errors and omissions are my own.

1 Introduction
Instrumental variable (IV) methods are often used to conduct limited-information inference on (structural) single-equation macroeconomic relations that describe the dependence of a scalar variable on a set of covariates. Examples of such macroeconomic relations include New Keynesian Phillips Curves (NKPCs), Euler equations, and Taylor rules. (For the case of NKPCs, see Dufour, Khalaf, and Kichian (2006), Kapetanios, Khalaf, and Marcellino (2015), Kleibergen and Mavroeidis (2009), Ma (2002), Mavroeidis, Plagborg-Møller, and Stock (2014), and Mirza and Storjohann (2014); for the case of Euler equations, see Ascari, Magnusson, and Mavroeidis (2019), Kleibergen (2005), Stock and Wright (2000), and Yogo (2004); for the case of Taylor rules, see Mavroeidis (2010) and Mirza and Storjohann (2014).) IV-based limited-information inference on such macroeconomic relations has arguably proven popular because there is no requirement that parts of the model other than the specified relation itself be true in order to conduct valid inference. In virtually all applications, the relation is assumed to contain an additive error term that is shown (e.g., by the assumption of Rational Expectations (RE)) or primitively assumed to be uncorrelated with predetermined variables excluded from the specified relation. This makes any predetermined variable a valid IV.

As documented extensively in the existing literature, using IVs to conduct limited-information inference on such macroeconomic relations often runs into issues related to weak identification. This occurs when the variation in the IVs is only able to explain a small portion of the variation of the endogenous variables. This problem is especially pronounced when the analysis is restricted to using only a few variables to forecast the endogenous variables, a restriction that arises when using IV methods that treat the number of IVs as fixed relative to the sample size. Since any predetermined variable is a valid (if not very informative) IV, this naturally raises the question of which IVs to choose out of the very many available ones.

The limited literature that seeks to formally address the high dimensionality of the available IVs in such macroeconomic settings is primarily motivated by the potential inefficiency of using IVs selected in an ad-hoc way (Bayar, 2018; Berriel, Medeiros, and Sena, 2016, 2019; Kapetanios, Khalaf, and Marcellino, 2015; Mirza and Storjohann, 2014). Through simulations and/or empirical applications, these studies find smaller confidence sets than the ones implied by IVs traditionally used in the past. Although this evidence is certainly suggestive, it should be noted that formal efficiency claims rely on conditions that are not easily verifiable in practice. For instance, factor-based approaches to reducing the dimensionality of the IVs likely work well if there is a factor structure, and if whatever explains most of the variation in the IVs also explains (a good portion of) the variation of the endogenous variables. While the former may be made plausible through certain tests, the latter remains an assumption the researcher has to make. Similarly, a LASSO-based selection of IVs works well only under the assumption that the relation between the endogenous variables and the candidate IVs is sufficiently sparse.

Rather than being motivated by such efficiency concerns, this paper revisits the question of high-dimensional limited-information inference because some types of formal or intuitive regularisation can lead to invalid inference, even if weak-IV robust methods are used after regularisation. This is due to what Chernozhukov, Hansen, and Spindler (2015) call the 'endogeneity bias', which arises when variables are selected on the basis of their in-sample correlation with a model's error term.
Notation. For any real number $a$, $\lfloor a \rfloor$ denotes the largest integer $b$ such that $b \le a$. For any two real numbers $c$ and $d$, $c \lesssim d$ if $c$ is smaller than or equal to $d$ up to a universal positive constant. The remaining notation follows standard conventions.

Organisation of the paper. Section 2 introduces the model considered in this paper. Section 3 outlines the methods used in this paper to conduct inference in the context of very many IVs. Section 4 provides simulation-based evidence on the size and power of these methods. Section 5 revisits inference on the US NKPC using very many IVs. Section 6 concludes.
2 Model
The structural equation I consider is the hybrid NKPC of Galí and Gertler (1999),

$$\pi_t = c + \lambda s_t + \gamma_f E_t[\pi_{t+1}] + \gamma_b \pi_{t-1} + u_t, \quad (1)$$

where $\pi_t$ is the inflation rate, $s_t$ is the forcing variable, and $\lambda$, $c$, $\gamma_f$, and $\gamma_b$ are parameters of the model. $u_t$ is an unobserved disturbance term, which can be interpreted as a measurement error, or as a shock to inflation, such as a cost-push shock.

The identifying moment conditions can be derived within the framework of Generalised Instrumental Variable (GIV) estimation. In this approach, realised one-period-ahead inflation is substituted in for expected inflation. This means that Equation (1) can be re-written as

$$\pi_t = c + \lambda s_t + \gamma_f \pi_{t+1} + \gamma_b \pi_{t-1} + \underbrace{u_t - \gamma_f\left[\pi_{t+1} - E_t[\pi_{t+1}]\right]}_{\epsilon_t}.$$

If it is further assumed that $E_{t-1}[u_t] = 0$, the assumption of RE gives rise to the moment conditions $E[Z_t \epsilon_t] = 0$ for any $k \times 1$ vector of predetermined variables $Z_t$. Due to the very large number of predetermined time series available, the dimension of $Z_t$ is comparable to or larger than the number of observations, $T$.

It should be noted that the example of NKPCs (including the particular specification chosen) and the assumption of RE are not central to two of the contributions of this paper. The same concerns relating to the endogeneity bias persist, and the same Sup Score test proposed below remains valid, for the broad class of models defined by single-equation relations of the type

$$y = g(Y, X, \theta) + \varepsilon, \quad (2)$$

and moment equations given by

$$E[Z_t \varepsilon_t] = 0, \quad (3)$$

where $y$ is a $T \times 1$ vector, $g$ is a known real-valued function, $Y$ is a $T \times p_1$ matrix of endogenous covariates, $X$ is a $T \times p_2$ matrix of exogenous covariates, $Z$ is a $T \times k$ matrix of IVs such that $k \ge p_1 + p_2$, $\theta$ is a $(p_1 + p_2) \times 1$ parameter vector, $p_1$ and $p_2$ are both fixed, and $\varepsilon$ is a $T \times 1$ vector of errors. This class covers settings in which $k$ is of the same magnitude as or even larger than $T$, such as limited-information inference on NKPCs, Euler equations, and Taylor rules.

In particular, the NKPC considered in Equation (1) can be mapped into the more general model in Equation (2) as follows. Since the NKPC is linear, the exogenous (predetermined) variables can be partialled out. Hence, $y = M_X \pi$, $Y = M_X [s\ \ \pi_{+1}]$, $M_X = I - X(X'X)^{-1}X'$, $X = [1_{T \times 1}\ \ \pi_{-1}]$, $g(Y, X, \theta) = Y\theta$, $\theta = [\lambda, \gamma_f]'$, $\varepsilon = M_X \epsilon$, and $Z = M_X \tilde{Z}$, where $\tilde{Z}$ is a $T \times (k-2)$ matrix of excluded IVs, and $s$, $\pi_{+1}$, $\pi_{-1}$, and $\epsilon$ are the $T \times 1$ vectors with $t$-th elements $s_t$, $\pi_{t+1}$, $\pi_{t-1}$, and $\epsilon_t$, respectively.

For all methods considered in this paper, confidence sets are constructed by inverting statistics that test the hypothesis

$$H_0: \theta = \theta_0 \quad \text{vs} \quad H_1: \theta \ne \theta_0. \quad (4)$$

The $(1-\alpha)$ confidence set can be constructed by collecting the values of $\theta_0$ for which the null hypothesis in Equation (4) is not rejected at the $\alpha$ level of significance. For convenience, define $\varepsilon_0 \equiv y - g(Y, X, \theta_0)$.

3 Inference with Very Many IVs

Most of the existing literature that conducts inference on relations of the form presented in Equation (2) using moment conditions of the type shown in Equation (3) has employed methods that require the IVs to be low-dimensional. In the presence of very many IVs, these approaches can be seen as a two-step procedure. First, the IVs are selected. Second, a low-dimensional (weak-identification robust) method is applied with the selected IVs. The first step is usually not made explicit, and is often not given any attention, which makes it impossible to model this step accurately. In Section 3.1.2, I consider three different selection procedures that reasonably cover (in terms of their deleterious effect on subsequent inference) the range of selection procedures used in the previous literature. These are random selection, 'crude thresholding', and LASSO. In Section 3.1.1, I outline the S statistic of Stock and Wright (2000), which forms the post-selection inferential method common to all three selection procedures considered in this paper.

Before proceeding, it is helpful to gain some intuition as to why IV selection may lead to invalid IVs. For simplicity, suppose that all variables are endogenous (or that the model is linear and that the exogenous covariates have been partialled out). Consider the following projection ('first stage')

$$Y = Z\zeta + v,$$

where $\zeta$ is a $k \times p_1$ matrix of coefficients and $v$ is a $T \times p_1$ matrix of projection errors. Suppose that $\zeta = 0$, and consider a selection procedure that selects the IVs that are most highly correlated with the endogenous variables, $Y$. This amounts to selecting those IVs that are most highly correlated in-sample with the first-stage error term. By the endogeneity of the system, this means that those IVs most correlated with the error term, $\varepsilon$, will be selected, so that conditional on selection, the IVs are no longer valid. This phenomenon carries over more broadly to cases of weak (but non-zero) identification, as discussed in Hansen and Kozbur (2014).
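To make this mechanism concrete, the following minimal sketch (not taken from the paper; the calibration and all names are illustrative) generates wholly irrelevant IVs ($\zeta = 0$), selects the four IVs most correlated in-sample with $Y$, and compares the selected IVs' average absolute sample covariance with the error term to that of the full set:

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 100, 150

# Irrelevant IVs (zeta = 0); Y is endogenous because it loads on eps.
Z = rng.standard_normal((T, k))
eps = rng.standard_normal(T)
Y = 0.8 * eps + rng.standard_normal(T)       # the first stage is pure noise

# 'Crude' selection: keep the IVs most correlated in-sample with Y.
corr_Y = np.abs(Z.T @ Y) / T
selected = np.argsort(corr_Y)[-4:]

# Conditional on selection, the chosen IVs co-move with eps in-sample.
print(np.abs(Z.T @ eps / T).mean())              # typical size, all IVs
print(np.abs(Z[:, selected].T @ eps / T).mean())  # inflated for selected IVs
```

Conditional on selection, the chosen IVs display a markedly larger in-sample correlation with the error term, which is exactly the endogeneity bias described above.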
3.1.1 S Statistic

In this paper, the GMM-based S statistic of Stock and Wright (2000) will be used for low-dimensional post-selection inference. More powerful and computationally intensive (GMM-based) weak-identification robust methods could be used instead of the S statistic (see Kleibergen and Mavroeidis (2009) and Mirza and Storjohann (2014)). Considering them instead of the S statistic does not qualitatively affect the results of the simulations, while increasing their computational burden substantively. Furthermore, Mavroeidis, Plagborg-Møller, and Stock (2014, p. 165) state that amongst the different specifications for the NKPC they consider, the confidence sets implied by these more powerful methods are similar to the ones implied by the S statistic.

Letting $k_s \ge p_1 + p_2$ denote the number of IVs selected, the S statistic is given by $T$ times the value of the continuously updated GMM objective function, i.e.,

$$S(\theta_0) = T\, \bar{\varepsilon}_T(\theta_0)'\, W_T(\theta_0)\, \bar{\varepsilon}_T(\theta_0), \quad (5)$$

where $\bar{\varepsilon}_T(\theta_0) = T^{-1}\sum_{t=1}^T Z_{st}\varepsilon_{0t}$, $Z_{st}$ is the $k_s \times 1$ vector of selected IVs, and $W_T(\theta_0)$ is the continuously updated $k_s \times k_s$ weight matrix, given by the inverse of a consistent estimator of the covariance matrix of the moment conditions of the selected IVs, as in Kleibergen and Mavroeidis (2009) and Stock and Wright (2000). Throughout, I use the heteroscedasticity and autocorrelation consistent (HAC) estimator of Newey and West (1987). Under the null hypothesis in Equation (4) and the regularity conditions discussed in Stock and Wright (2000), this statistic is asymptotically $\chi^2_{k_s}$. Whenever $p_2 \ne 0$ (i.e., there are exogenous covariates in the relation), the exogenous covariates can be concentrated out, to yield the concentrated S statistic as in Stock and Wright (2000, Theorem 3). In both the simulations and the empirical application below, the constant and the one-period lagged inflation are concentrated out.

The S statistic further recommends itself in this context because it allows for a straightforward test of the exclusion restrictions of the IVs. It may be hoped that any substantial bias caused by improper selection may be flagged in the form of a low p-value for the test of the null hypothesis that the IVs selected, $Z_s$, are uncorrelated with the structural error term, $\varepsilon$. To investigate this possibility further, in the simulations I also evaluate the weak-identification robust Hansen test. This is given by the minimum value of the S statistic in Equation (5). Without making an assumption of strong identification, this statistic is asymptotically bounded by a $\chi^2_{k_s - p}$ distribution, where $p$ denotes the number of elements of $\theta$ (Mavroeidis, Plagborg-Møller, and Stock, 2014, p. 178), which provides a weak-identification robust critical value for the test of the overidentifying restrictions of the IVs selected. A sketch of the computation of the S statistic follows below.
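The following sketch illustrates one way Equation (5) could be computed; it is an assumption-laden illustration rather than the author's code, and the function name and interface are hypothetical. It uses the Bartlett-kernel HAC estimator of Newey and West (1987) for the weight matrix:

```python
import numpy as np

def s_statistic(eps0, Z_sel, lags=4):
    """Continuously-updated GMM objective (Equation (5)) at theta_0.

    eps0  : (T,) residuals y - g(Y, X, theta_0) under the null
    Z_sel : (T, k_s) matrix of selected IVs
    """
    T = len(eps0)
    m = Z_sel * eps0[:, None]          # moment contributions Z_st * eps_0t
    mbar = m.mean(axis=0)              # (k_s,) sample moment vector
    md = m - mbar                      # demeaned, for the HAC estimator

    # Newey-West (1987) HAC estimate of the long-run covariance matrix
    V = md.T @ md / T
    for l in range(1, lags + 1):
        w = 1 - l / (lags + 1)         # Bartlett kernel weight
        G = md[l:].T @ md[:-l] / T
        V += w * (G + G.T)

    # T * mbar' V^{-1} mbar
    return T * mbar @ np.linalg.solve(V, mbar)
```

The resulting value would be compared to a $\chi^2_{k_s}$ critical value; its minimum over $\theta_0$ gives the Hansen test described above.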
3.1.2 Selection Procedures

Conducting inference with the S statistic requires selecting a sufficiently small subset of $k_s$ IVs from the available $k$ IVs. An often-used rule of thumb is to select $k_s$ to be of the order of magnitude of $T^{1/3}$. This rate is motivated by the results in Andrews and Stock (2007) and Newey and Windmeijer (2009), who show that this rate condition is sufficient for the case of independent data. Recently, fully weak-identification robust AR-type statistics have been developed that allow for the number of IVs to be of the order of magnitude of $T$ (Anatolyev and Gospodinov, 2010; Crudu, Mellace, and Sándor, 2020; Mikusheva and Sun, 2020). However, all of these approaches treat the IVs as fixed, and are hence not applicable in the context of time series.

In most of the empirical studies on IV-based limited-information inference, the researcher picks a small number of $k_s$ IVs that are subsequently used for analysis. Often, the choice of IVs is simply motivated with reference to previous studies that used those IVs. It is hence impossible to model the choice of IVs of the previous literature accurately in a simulation exercise. As an (imperfect) approximation, I consider the following three selection procedures; a sketch of the two data-driven ones is given after this discussion.

The first selection procedure involves randomly selecting $k_s$ IVs out of the $k$ available IVs. Since the selection of IVs is not informed by the data itself, this selection procedure is guaranteed to not violate the identifying moment conditions.

The second selection procedure I consider will be referred to as crude thresholding. This involves first computing $p_1$ separate $k \times 1$ vectors of in-sample correlations between each endogenous variable and the candidate IVs, and then selecting the IVs corresponding to the $\lfloor k_s/p_1 \rfloor$ largest (in absolute value) entries in each of the vectors. By selecting the variables based on in-sample correlations, this selection procedure is likely to break the exclusion restriction of the IVs selected. Although (to my knowledge) this crude thresholding has not been applied to IV-based limited-information inference, more sophisticated versions of thresholding have been considered in the past (e.g., Mirza and Storjohann (2014, Appendix B) and Bayar (2018)). The hard thresholding in Mirza and Storjohann (2014, Appendix B) and Bayar (2018) is not applicable in high-dimensional contexts, since OLS is infeasible when there are more variables than observations. See also the ranking of IVs based on t-values in Mirza and Storjohann (2014, Table B1).

The first two selection procedures (random selection and crude thresholding) arguably cover the extremes in terms of the effects IV selection can have on the validity of the IVs. Random selection provides the selection ideal, since it leaves the identifying moment conditions completely unaffected. However, particularly with reference to the traditional IVs often considered in the literature, it seems unlikely that random selection (over the very many available predetermined macroeconomic time series) led to choosing proximate lags of the endogenous variables as IVs. Indeed, given the persistence of most macroeconomic time series (and hence of the endogenous variables in any given application), it seems plausible that at least part of the motivation for considering proximate lags of the endogenous variables as IVs stems from their ability to usefully explain some of their in-sample variation. Suggestive evidence for this type of selection is also given by the fact that the IVs selected by crude thresholding in the empirical application in Section 5 show substantial overlap with these traditional IVs. Therefore, it seems likely that random selection and crude thresholding provide a suggestive lower and upper bound on the selection-induced bias that could underlie existing empirical applications.

The third selection procedure I consider is a LASSO-based selection of IVs. This is motivated by the recent increase in popularity of penalisation-based approaches to the (very) many IV problem. For each endogenous variable, I compute

$$\hat{\zeta}_r = \arg\min_{\zeta_r \in \mathbb{R}^k} \sum_{t=1}^T \left(Y_{rt} - \zeta_r' Z_t\right)^2 + \Lambda_r \|\zeta_r\|_1,$$

where $Y_{rt}$ is the element in position $t$ of the $T \times 1$ vector $Y_r$ given by the $r$-th column of $Y$, $\zeta_r$ for $r = 1, \ldots, p_1$ is a $k \times 1$ vector of coefficients, and $\Lambda_r > 0$ for $r = 1, \ldots, p_1$ are scalar penalty parameters that are set such that $\hat{\zeta}_r$ has $\lfloor k_s/p_1 \rfloor$ non-zero elements. The IVs selected are given by the IVs that have at least one corresponding non-zero entry in at least one of $\hat{\zeta}_r$ for $r = 1, \ldots, p_1$.
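As an illustration, the two data-driven selection rules could be implemented as follows (a sketch under the definitions above; function names are hypothetical, the IVs are assumed standardised, and the penalties $\Lambda_r$ are tuned implicitly by walking the LARS path until the target support size is reached):

```python
import numpy as np
from sklearn.linear_model import lars_path

def crude_threshold(Y, Z, k_s):
    """Keep the floor(k_s/p1) IVs most correlated in-sample with each
    endogenous variable ('crude thresholding' in Section 3.1.2)."""
    p1 = Y.shape[1]
    per_var = k_s // p1
    keep = set()
    for r in range(p1):
        corr = np.abs(Z.T @ Y[:, r])            # proportional to correlations
        keep.update(np.argsort(corr)[-per_var:])
    return sorted(keep)

def lasso_select(Y, Z, k_s):
    """Trace the LASSO path for each endogenous variable, stop when
    floor(k_s/p1) coefficients are active, and take the union over r."""
    p1 = Y.shape[1]
    per_var = k_s // p1
    keep = set()
    for r in range(p1):
        _, _, coefs = lars_path(Z, Y[:, r], method="lasso")
        n_active = (coefs != 0).sum(axis=0)     # active-set size per step
        steps = np.flatnonzero(n_active >= per_var)
        col = steps[0] if steps.size else coefs.shape[1] - 1
        keep.update(np.flatnonzero(coefs[:, col]))
    return sorted(keep)
```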
3.2 A Sup Score Test for Dependent Data

Although the interplay between weak identification and high-dimensional IVs has recently received some attention (see Hansen and Kozbur (2014)), none of the currently available approaches is both robust to arbitrarily weak identification and applicable in a time-series context. Indeed, to the best of my knowledge, the only approach that is formally robust to arbitrarily weak identification in the presence of very many IVs is the Sup Score test of Belloni et al. (2012). The Sup Score test of Belloni et al. (2012), however, treats the IVs as fixed, and is hence not applicable in time-series contexts. In this section, I propose a Sup Score test that remains valid under high-dimensional dependent data using recent results of Zhang and Cheng (2014, 2018).

The Sup Score statistic I propose is given by

$$R = \max_{1 \le j \le k} \left| \frac{1}{\sqrt{T}}\, Z_j' \varepsilon_0 \right|. \quad (6)$$

This can be seen as a non-studentised version of the Belloni et al. (2012) Sup Score statistic, which in turn can be interpreted as an extension to high dimensions of the Anderson and Rubin (1949) (AR) statistic. It also bears some resemblance to the non-studentised AR statistic proposed by Horowitz (2018).

The critical values for the test statistic in Equation (6) are computed using a block bootstrap. Let $l_T \equiv \lfloor T/b_T \rfloor$, where $b_T$ is the block length. Define the block sums

$$\hat{A}_{tj} = \sum_{l=(t-1)b_T+1}^{t b_T} \left( Z_{lj}\varepsilon_{0l} - \{\overline{Z'\varepsilon_0}\}_j \right), \quad t = 1, \ldots, l_T,$$

where $\{\overline{Z'\varepsilon_0}\}_j$ is the $j$-th element of the $k \times 1$ vector $T^{-1}\sum_{t=1}^T Z_t\varepsilon_{0t}$. Consider the bootstrap statistic given by

$$L_{\hat{A}} = \max_{1 \le j \le k} \frac{1}{\sqrt{T}} \left| \sum_{t=1}^{l_T} \hat{A}_{tj}\, e_t \right|,$$

where $\{e_t\}$ is a sequence of i.i.d. $N[0,1]$ random variables. The critical value for a test of size $\alpha$ of Equation (4) is given by

$$c(\alpha) = \inf\left\{ \gamma \in \mathbb{R} : P\left( L_{\hat{A}} \le \gamma \mid \{Z_t\varepsilon_{0t}\}_{t=1}^T \right) \ge 1 - \alpha \right\}.$$

The decision rule for testing the null hypothesis in Equation (4) at the $\alpha$ level of significance is given by

$$\text{Reject } H_0 \iff R > c(\alpha).$$
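A minimal sketch of the statistic in Equation (6) together with the multiplier block bootstrap just described (the function name and defaults are illustrative, and the IVs are assumed to be standardised in-sample):

```python
import numpy as np

def sup_score_test(eps0, Z, b_T=4, n_boot=500, alpha=0.10, seed=0):
    """Sup Score test of H0: theta = theta_0 (Equation (6)).

    eps0 : (T,) residuals under the null; Z : (T, k) standardised IVs.
    """
    rng = np.random.default_rng(seed)
    T, k = Z.shape
    m = Z * eps0[:, None]                       # (T, k): Z_tj * eps_0t
    R = np.abs(m.sum(axis=0)).max() / np.sqrt(T)

    l_T = T // b_T
    centred = m - m.mean(axis=0)                # subtract {Z'eps_0}_j / T
    # block sums A_hat_tj over l_T blocks of length b_T
    A = centred[: l_T * b_T].reshape(l_T, b_T, k).sum(axis=1)

    e = rng.standard_normal((n_boot, l_T))      # i.i.d. N(0,1) multipliers
    L = np.abs(e @ A).max(axis=1) / np.sqrt(T)  # bootstrap draws of L_A
    c_alpha = np.quantile(L, 1 - alpha)
    return R, c_alpha, R > c_alpha
```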
I now turn to conditions that are sufficient to ensure that the test described above has correct size.

Assumption 1.
i. $Z_t\varepsilon_{0t}$ is a stationary time series that allows for the causal representation $Z_t\varepsilon_{0t} = G(\ldots, u_{t-1}, u_t)$ for some measurable function $G$, where the $u_t$ are a sequence of mean-zero i.i.d. random variables. Furthermore, assume that $Z_{tj}\varepsilon_{0t} = G_j(\ldots, u_{t-1}, u_t)$ for all $j = 1, \ldots, k$, where $G_j$ is the $j$-th component of the map $G$.
ii. $E[Z_{tj}\varepsilon_{0t}] = 0$, $E[(Z_{tj}\varepsilon_{0t})^2] > 0$, and $E[(Z_{tj}\varepsilon_{0t})^4] < \infty$ for all $j = 1, \ldots, k$.
iii. $k \lesssim \exp(T^{b})$ and $b_T \lesssim T^{\tilde{b}}$ for constants $b < 1/15$, $4\tilde{b} + 7b < 1$, and $\tilde{b} - b > 0$.
iv. $\sup_{1 \le j,h \le k} \sum_{l=-\infty}^{\infty} |l|\, E\left[|Z_{tj}\varepsilon_{0t}\, Z_{t+l,h}\varepsilon_{0,t+l}|\right] = O(T^{\breve{b}})$, with $\breve{b} < \tilde{b} - b$.
v. $E\left[|G_j(\ldots, u_{t-1}, u_t) - G_j(\ldots, u^*_{-1}, u^*_0, u_1, \ldots, u_t)|^q\right] \le C\rho^t$ for some $0 < \rho < 1$ and some positive constant $C$, where $q \ge 4$ and $\{u^*_t\}$ are i.i.d. copies of $\{u_t\}$.

Assumption 1.i. requires the product of the IVs and the error terms to be stationary and have some causal representation. Assumption 1.ii. makes weak assumptions on the moments of the data, and includes the identifying moment condition. In practice, I standardise the IVs in-sample to ensure that the test is invariant to the scaling of the IVs. Assumption 1.iii. bounds the degree of high dimensionality permitted and the size of the bootstrap blocks. Although the restriction on the dimensionality ($b < 1/15$) is stronger than the ones usually encountered in the independent case (see Belloni et al. (2012) and Deng and Zhang (2020)), it still allows for very many IVs compared to the sample size. Assumption 1.iv. imposes restrictions on the correlation of the product of the IVs with the error term across different points in time. Assumption 1.v. imposes a (uniform) Geometric Moment Contraction (GMC) restriction on the product of the IVs and the error terms, as in Wang and Shao (2019). The GMC requires that the process under consideration have a sufficiently 'short memory'. Processes that obey such a condition include (under suitable assumptions) standard linear processes (e.g., standard vector autoregressions and Volterra processes) as well as several nonlinear processes (e.g., autoregressive models with conditional heteroscedasticity, random coefficient autoregressive models, and exponential autoregressive models). I refer to Chen et al. (2016), Hsing and Wu (2004), Wang and Shao (2019), Wu (2005), and Zhang and Cheng (2014, 2018) and the references therein for a discussion of the different processes that obey such a condition.

No assumption on the first stage (i.e., the relationship between $Y$ and $Z$) has to be made. This means that the proposed Sup Score test is uniformly valid over all (finite) values of the coefficient on the IVs in the first stage (including arbitrarily weak identification). This also means that no restriction on the factor or sparsity structure of the first stage has to be imposed. The lack of assumptions on the first stage also implies that the Sup Score test does not suffer from any 'missing IV problem' (see also Dufour (2009)).

Whether these conditions are satisfied in any given macroeconomic application depends on the error terms (i.e., the structural equation) and on the properties of the excluded IVs. Example 1 shows that under suitable assumptions on the error term that encompass, amongst others, some popular assumptions made in the literature on NKPCs (e.g., Dufour, Khalaf, and Kichian (2006)), it is only required that the IVs satisfy a GMC condition. This is attractive in the context of limited-information inference, since the researcher only has to assume that the IVs belong to one of the many processes that have been shown to obey such a condition, without having to take a stance on the particular process.

Example 1.
Assume that $\varepsilon_t$ is i.i.d. across $t$, $E[\varepsilon_t] = 0$, $E[\varepsilon_t^2] > 0$, and $E[\varepsilon_t^4] < \infty$. Assume that $Z_t$ is a stationary time series and allows for the causal representation $Z_t = F(\ldots, v_{t-1}, v_t)$, $Z_{tj} = F_j(\ldots, v_{t-1}, v_t)$, for some measurable function $F$, where the $v_t$ are a sequence of mean-zero i.i.d. random variables (independent of $\varepsilon_t$). Assume further that $E[Z_{tj}^2] > 0$ and $E[Z_{tj}^4] < \infty$ for all $j = 1, \ldots, k$. Assume that the conditions on the dimensionality of the IV problem in Assumption 1.iii. are satisfied. Further, assume that $Z_t$ satisfies

$$E\left[|Z_{tj} - F_j(\ldots, v^*_{-1}, v^*_0, v_1, \ldots, v_t)|^4\right] \le \tilde{C}\tilde{\rho}^t, \quad (7)$$

where $\{v^*_t\}$ are i.i.d. copies of $\{v_t\}$, $\tilde{C}$ is some constant, and $0 < \tilde{\rho} < 1$. Then Assumption 1 is satisfied.
Proof. See Appendix A.

I now state the main theoretical result of this paper, which ensures that the approach proposed controls the size of the test. It should be noted, however, that (similarly to other sup-based test statistics, such as in Belloni et al. (2012) and Chernozhukov, Chetverikov, and Kato (2018)) the above approach is not efficient. This is to be expected, given the weak assumptions made on (the structure of) the IVs. The (finite-sample) power properties of the above approach will be investigated in the simulation section below. The results show that it has non-trivial power.

Theorem 1. Under Assumption 1 and the null hypothesis in Equation (4),

$$\lim_{T \to \infty} P(\text{Reject } H_0) \le \alpha.$$

Proof. See Appendix B.

Theorem 1 makes it possible to construct confidence sets by inverting the test as outlined above.
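For instance, a confidence set for $(\lambda, \gamma_f)$ in the NKPC of Equation (11) below could be traced out on a grid as in the following sketch (a hypothetical helper; `test` may be any size-correct procedure, such as the `sup_score_test` sketch above):

```python
def confidence_set(dpi, s, dpi_fwd, Z, test, grid_lam, grid_gf, alpha=0.10):
    """Collect (lambda, gamma_f) pairs not rejected at level alpha, as in
    Equation (4); dpi, s, dpi_fwd are the aligned series entering the
    structural equation, with exogenous covariates already partialled out."""
    keep = []
    for lam in grid_lam:
        for gf in grid_gf:
            eps0 = dpi - lam * s - gf * dpi_fwd   # residuals under H0
            if not test(eps0, Z, alpha=alpha)[2]:
                keep.append((lam, gf))
    return keep
```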
4 Simulations

The simulations presented in this section serve a twofold purpose. First, I use the simulations to study how improper selection of IVs can lead to problematic post-selection inference. Second, I use the simulations to illustrate the asymptotic validity of the Sup Score test established in the section above, as well as its finite-sample power properties. Taken together, the simulations hence motivate and further justify applying the Sup Score test proposed in Section 3 in practice.

The simulations in this paper are based on the approach in Mavroeidis, Plagborg-Møller, and Stock (2014). The central difference is that rather than modelling the econometrician as having perfect knowledge of the relevant IVs, and incorrectly employing methods that are not robust to weak identification, I model the econometrician as using exclusively weak-identification robust methods, but not knowing which IVs correspond to the truly relevant ones. Given the extensive literature that has pointed out that NKPCs can suffer from weak identification, this setup seems closer to the estimation problem that an econometrician is likely to face. Due to the focus of this paper on the bias introduced by IV selection, I do not consider the factor-based approaches of Kapetanios, Khalaf, and Marcellino (2015) and Mirza and Storjohann (2014). The substantial biases caused by the improper selection of a small number of IVs can also serve to motivate the use of such factor methods. However, the factor-based GMM approach in Mirza and Storjohann (2014) seems to treat the number of IVs as fixed (and does not provide formal conditions for validity), while the factor AR statistic of Kapetanios, Khalaf, and Marcellino (2015) is only applicable in a high-dimensional context if a sufficiently strong factor structure is assumed. In contrast, the Sup Score test proposed in this paper remains valid in high-dimensional contexts regardless of the factor or sparsity structure of the IVs.

I base my simulations on the simplest possible specification considered in Mavroeidis, Plagborg-Møller, and Stock (2014). This involves imposing the restriction $\gamma_b + \gamma_f = 1$ (which is known to the econometrician) and setting $c = 0$ (which is not known to the econometrician), so that the NKPC can be re-written as

$$(1 - \gamma_f)\Delta\pi_t = \lambda s_t + \gamma_f E_t[\Delta\pi_{t+1}] + \epsilon_t. \quad (8)$$

I embed this NKPC into a dynamic system by specifying that the reduced-form dynamics are given by

$$\begin{bmatrix} \pi_t \\ s_t \\ f_t \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} \pi_{t-1} \\ s_{t-1} \\ f_{t-1} \end{bmatrix} + \begin{bmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{bmatrix}, \quad (9)$$

where

$$\begin{bmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{bmatrix} \overset{\text{i.i.d.}}{\sim} N\left( 0,\ \begin{bmatrix} \omega_{11} & \omega_{12} & \omega_{13} \\ \omega_{21} & \omega_{22} & \omega_{23} \\ \omega_{31} & \omega_{32} & \omega_{33} \end{bmatrix} \right),$$

and $f_t$ is a scalar factor variable. All coefficients except for $a_{11}$, $a_{12}$, and $a_{13}$ have to be calibrated. The coefficients $a_{11}$, $a_{12}$, and $a_{13}$ are backed out of the NKPC based on the Anderson and Moore (1985) algorithm.

High dimensionality of the IVs is introduced by specifying that there exists an $m \times 1$ vector of observed variables $Q_t$ that follow the process given by

$$Q_t = \xi f_t + u_{Qt}, \quad u_{Qt} \overset{\text{i.i.d.}}{\sim} N[0, I_m], \quad (10)$$

where $\xi$ is an $m \times 1$ vector of loadings. The econometrician conducts inference on $\lambda$ and $\gamma_f$ within the GIV and RE framework using

$$\Delta\pi_t = c + \lambda s_t + \gamma_f(\pi_{t+1} - \pi_{t-1}) + \epsilon_t, \quad E[Z_{st}\epsilon_t] = 0, \quad (11)$$

where the variables are defined as in Section 2 and Section 3.
The econometrician does not observe the factor itself, but only observes the forcing variable and inflation, as well as the $m$ variables in $Q_t$. In this setup, $Z_{st}$ is a subset of the available IVs given by $Z_t = [1, \pi_{t-1}, s_{t-1}, Q'_{t-1}]'$ that always includes a constant (since it is specified in the structural equation the econometrician considers). For the case of post-selection inference based on the S statistic, the constant is concentrated out; for the case of the Sup Score statistic, it is partialled out.

This setup recommends itself for two reasons. First, it constitutes a minimal departure from popular simulations in the existing literature. This ensures that any reported results are not an artefact of a particularly uncharitable setup. It is, for instance, straightforward to include the variables in $Q_t$ directly in the reduced-form VAR. However, the results from such a DGP are very sensitive to the particular calibration of the parameters chosen, and hence may cast doubt on whether the simulations are in fact capturing the effect of IV selection as opposed to the behaviour of a particular DGP. Second, it ensures the existence of a sufficiently small set of (excluded) 'oracle IVs' (given by $\pi_{t-1}$, $s_{t-1}$, $f_{t-1}$) without necessarily imposing a sparse setup on the observed first-stage projection (although sparsity can be imposed by setting the coefficients and covariances linking the observables to the factor to zero, or simply $\xi = 0$). Ensuring a sufficiently sparse set of 'oracle IVs' further motivates considering only a single lag of a single factor. A sketch of this DGP is given below.
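A minimal sketch of the DGP in Equations (9) and (10), assuming the first row of the VAR matrix has already been backed out of the NKPC (all names and defaults are illustrative):

```python
import numpy as np

def simulate_dgp(A, Omega, xi, T=100, m=200, seed=0):
    """Simulate the reduced-form system (9) and the observed IVs (10).

    A     : 3x3 VAR matrix for (pi_t, s_t, f_t)
    Omega : 3x3 innovation covariance matrix
    xi    : (m,) vector of factor loadings
    """
    rng = np.random.default_rng(seed)
    burn = 200
    R = np.zeros((T + burn, 3))                 # R_t = (pi_t, s_t, f_t)
    u = rng.multivariate_normal(np.zeros(3), Omega, size=T + burn)
    for t in range(1, T + burn):
        R[t] = A @ R[t - 1] + u[t]
    R = R[burn:]
    # observed high-dimensional variables: Q_t = xi * f_t + noise
    Q = np.outer(R[:, 2], xi) + rng.standard_normal((T, m))
    return R, Q
```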
The former is desirable because it allows for a comparison of the different ad-hoc inference procedures relative to the most efficient approach (conditional on using the S statistic). The latter is desirable because it allows for a more general (and perhaps more realistic, see Giannone, Lenza, and Primiceri (2018)) approach to modelling the first stage. This setup is able to achieve both a sparse (unobserved) oracle first stage and an observed first stage that is not necessarily sparse because the elements of $Q_{t-1}$ that have a non-zero corresponding entry in $\xi$ will contain some relevant variation for identification, due to the dependence of the endogenous variables $[\pi_{t+1} - \pi_{t-1}, s_t]'$ on $f_{t-1}$. Based on the setup above, it is possible to derive two different concentration parameters ($\mu_O$ and $\mu_E$) that reflect the strength of identification in the sparse unobserved oracle first stage and in the observed first stage. The details are given in Appendix C.

The calibrations are as follows. Throughout, I fix $T = 100$, and hold $\gamma_f$, $\lambda$, and the elements of $\Omega$ at constant calibrated values across all designs (with several covariance terms set to zero). For simplicity, I set $a_{31} = a_{32} = 0$, so that the factor follows an autoregressive process with coefficient $a_{33}$. I set $m = 200$ and choose a deterministic $\xi$ with elements proportional to $\tau = 0.05$, alternating in sign and varying in magnitude across $q = 1, \ldots, m$. This is meant to provide a deterministic calibration that balances positive and negative, as well as large and small, coefficients. The small value chosen for $\tau$ ensures that the information on the factor contained in the observed variables is sufficiently diluted, and that there is some interesting variation in the informational content of the unobserved oracle first stage and the one actually observed. The results are unaffected by different choices of $\xi$ or $\tau$. I refer to Appendix C for more details; the derivations there also show that choosing small values of $\tau$ has a similar effect to choosing a larger variance for the errors in Equation (10). For all selection procedures, I force the selection of $k_s = 4$ IVs to ensure that the first stage is not overfitted, which again ensures that any distortions in inference are attributable to the selection step itself. For the S statistic, I set the lag length for the Newey and West (1987) HAC variance estimator to 4. For the Sup Score test proposed above, I set the block size to $b_T = 4$ and the number of bootstrap replications to 500. I allow the coefficients linking the variables to the lagged factor to take on different values; these coefficients control how informative the factor is in predicting the endogenous variables, and by extension how informative the variables in $Q_{t-1}$ are.

Table 1 shows the size of the S and Sup Score statistics following the different selection procedures outlined above for a test with nominal size 10%. The calibrations chosen (the factor coefficients vary over 0.000, 0.200, and 0.450) ensure that a broad range of identification strengths and sparsity structures is considered. The first panel corresponds to the perfectly sparse first stage where none of the variables in $Q_{t-1}$ are informative IVs. As a consequence, the concentration parameter of the unobserved oracle first stage is the same as the one that is observed. The second and third panels increase the dependence of the endogenous variables on the factor. Given the calibration of $\xi$, this means that all of the IVs observed by the econometrician are at least somewhat informative. Since the variables in $Q_t$ contain noisy information on the unobserved factor, the concentration parameter in the observed first stage will now be lower than the concentration parameter of the unobserved oracle first stage. As expected, the oracle IVs yield correct, if somewhat conservative, size. Since random selection does not make use of any correlations present in the actual data, the S statistic with randomly selected IVs also yields correct size. The results for crude thresholding and the LASSO suggest that in all cases size is not controlled, although the distortions appear to be somewhat milder for the LASSO. The results for the Sup Score test proposed in this paper show that the test controls size regardless of the DGP considered.

Table 1 also reports the rejection frequency of a two-step approach that tests the null hypothesis at a given level of significance only if the robust test of overidentifying restrictions fails to reject the hypothesis of exogeneity for the IVs selected at that level of significance. This is a very conservative approach. Indeed, when faced with evidence that the selected IVs may be endogenous, rather than abandoning the analysis altogether, it seems more likely that the econometrician will proceed to select other IVs, potentially worsening the endogeneity bias. Even in this conservative approach, crude thresholding fails to control for size.
LASSO selection followed by this two-step approach appears to control for size. These results suggest that while the test of overidentifying restrictions can help mitigate some of the endogeneity bias introduced by improper selection, it is unable to fully remove it.

Figure 1 shows the power of the different approaches. I present the results for the case where the factor coefficients are all set to 0.45, restricting attention to the procedures (conditional on using the S statistic) that control for size. The results show that randomly selecting IVs yields no power. This is unsurprising, given that in this setup the first stage is sparse, so that random selection predominantly selects not very informative IVs. The power heatmaps for crude thresholding and the LASSO have a similar shape to the oracle heatmaps. However, for certain parts of the parameter space considered, the rejection frequency of these procedures is substantially higher than the one of the oracle test. Conditional on using the same test post-selection, both crude thresholding and the LASSO can be at most as powerful as the test that directly uses the oracle IVs. Therefore, this excess rejection frequency is spurious, which in practice would translate to small confidence sets. The power heatmaps for the Sup Score test show that the Sup Score test has non-trivial power.

Table 1: Simulation results: size.
(Each panel of Table 1 reports, for calibration values 0.000, 0.200, and 0.450 of the factor coefficients, the concentration parameters $\mu_O$ and $\mu_E$ together with the rejection frequencies (R.F. and T.S.) of the oracle, random selection, crude thresholding, LASSO, and Sup Score procedures.)

Notes:
R.F. denotes the rejection frequency.
T.S. denotes the rejection frequency where a null hypothesis is rejected only if the robust test of overidentifying restrictions fails to reject the selected IVs' exogeneity.
Nominal test size: 10%.
1,000 Monte Carlo replications.

Figure 1: Simulation results: power. Panels: (a) Oracle, (b) Random, (c) Crude Thresholding, (d) LASSO, (e) Sup Score. Factor coefficients all set to 0.45. Nominal test size: 10%. 1,000 Monte Carlo replications.
5 Empirical Application
Data for the empirical part of this paper are taken from FRED. I use the non-farm labour share as transformed in Galí and Gertler (1999) as the forcing variable. I use the inflation rate implied by the GDP deflator. I consider the period 1974Q2-2018Q4, and include 90 variables aimed to reflect different parts of the US economy based on the list in McCracken and Ng (2016), with four lags each, transforming them as recommended therein. I do not include all the variables listed in McCracken and Ng (2016), since they are not all available over a sufficiently long period of time. This yields 179 observations with 359 IVs. Appendix D contains a detailed description of the data. A sketch of the construction of the lagged IV matrix is given below.

A natural question to ask is whether considering these 359 IVs is enough to dispel concerns about potential endogeneity biases. Though this is more than any number of IVs previously considered in the literature, there are certainly more valid IVs (i.e., additional predetermined variables). However, a substantial endogeneity bias caused by the selection of these 359 IVs would emerge only if the variables were included in the list of McCracken and Ng (2016) based on their correlation with the endogenous variables of this application. This seems very unlikely.

For all ad-hoc selection procedures, I limit the number of IVs selected to four, to ensure that overfitting is not a concern, and set the lag length for the Newey and West (1987) HAC variance estimator to 4. The results do not change appreciably when other values are chosen. The confidence sets yielded by the traditional IVs and the ad-hoc selection procedures are shown in Figure 2, and the corresponding IVs are listed in Table 2. Mirroring the results in Section 4, the confidence set resulting from random selection is extremely wide, and suggests that the hybrid NKPC is essentially unidentified. The confidence set from applying the LASSO is smaller than the one implied by random selection, but it also does not exclude that the coefficient on expected inflation is in fact equal to zero.
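For concreteness, the construction of the lagged IV matrix could look as follows (a hypothetical sketch using pandas; `df` is assumed to hold the transformed quarterly series):

```python
import pandas as pd

def build_iv_matrix(df, n_lags=4):
    """Stack n_lags lags of each (already transformed) FRED series into
    the IV matrix; column names such as 'GDPDEF.-2' follow Table 2."""
    lagged = {
        f"{col}.-{l}": df[col].shift(l)
        for col in df.columns
        for l in range(1, n_lags + 1)
    }
    Z = pd.DataFrame(lagged).dropna()
    # standardise in-sample so that the Sup Score test is scale-invariant
    return (Z - Z.mean()) / Z.std()
```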
Table 2: Identity of the IVs for each of the selection procedures.

Traditional        Random           Crude Thresholding    LASSO
PRS85006173.-1     WILL5000IND.-3   PRS85006173.-1        PRS85006173.-1
PRS85006173.-2     NDMANEMP.-3      PRS85006173.-2        PRS85006173.-2
GDPDEF.-2          EXUSUK.-3        DSERRG3M086SBEA.-1    DSERRG3M086SBEA.-1
GDPDEF.-3          PERMITMW.-4      CES3000000008.-1      SRVPRD.-3
Notes: PRS85006173 refers to Nonfarm Business Sector: Labor Share; GDPDEF refers to Gross Domestic Product: Implicit Price Deflator; WILL5000IND refers to Wilshire 5000 Total Market Index; NDMANEMP refers to All Employees, Nondurable Goods; EXUSUK refers to U.S. / U.K. Foreign Exchange Rate; PERMITMW refers to New Private Housing Units Authorized by Building Permits in the Midwest Census Region; DSERRG3M086SBEA refers to Personal consumption expenditures: Services (chain-type price index); CES3000000008 refers to Average Hourly Earnings of Production and Nonsupervisory Employees, Manufacturing; SRVPRD refers to All Employees, Service-Providing. The suffix '.-l' denotes the l-period lag.

Figure 2: 90% confidence sets for the NKPC in Equation (1) using the S statistic with IVs selected by the different ad-hoc procedures. Panels: (a) Traditional, (b) Random, (c) Crude Thresholding, (d) LASSO. The identity of the IVs selected by each of the procedures is shown in Table 2.

The confidence sets implied by traditional IVs and crude thresholding are qualitatively very similar. This similarity is explained by the fact that the IVs chosen by crude thresholding are very similar to the traditional IVs, as shown in Table 2. This suggests two things. First, it suggests that the ad-hoc selection procedures used in this paper may in fact provide a reasonable approximation to the approach taken for selecting IVs in the past literature. Second, given that in the simulation exercise in Section 4 crude thresholding yields the worst size distortions of the selection procedures considered, this result suggests that the confidence sets reported in the previous literature are likely to suffer from at least some distortion due to endogeneity bias. In particular, it suggests that the process of trying to find 'strong' IVs may have led to an undercovering of the true parameter values.

The confidence sets of the Sup Score test proposed in this paper for different block lengths (4, 6, 8, and 10) are shown in Figure 3. For all block lengths considered (the results do not appear to be sensitive to the choice of block length), the confidence sets are smaller than the ones yielded by the random selection approach, but wider than for the traditional, crude thresholding, and LASSO approaches. This is likely due to a combination of the incorrect size of the latter approaches and the low power of the Sup Score test documented in Section 4. The results suggest that while certain parts of the parameter space considered can be rejected at the 10% level of significance, neither $\lambda$ nor $\gamma_f$ is found to be different from zero over the values of the parameter space considered.

Figure 3: 90% confidence sets for the NKPC in Equation (1) using the Sup Score test. Panels: (a) $b_T = 4$, (b) $b_T = 6$, (c) $b_T = 8$, (d) $b_T = 10$.

The Sup Score test can also help shed some light on what the most relevant IVs are, since there is a (likely) unique IV that maximises the Sup Score statistic in Equation (6) for every null hypothesis being tested. I record the identity of the IV maximising the Sup Score statistic for each null hypothesis tested, and report the results in Table 3 and Figure 4. Table 3 shows the identity of the IVs maximising the Sup Score statistic. Figure 4 shows in which part of the parameter space the different IVs maximise the Sup Score statistic. Two things stand out.
First, the IVs that feature prominently in Table 2 (i.e., IVs selected by the ad-hoc procedures) also tend to feature in the set of IVs maximising the Sup Score statistic (i.e., lags of PRS85006173 and CES3000000008). The reason why the confidence sets are larger for the Sup Score test is in part that the Sup Score test is able to account for the very many other valid IVs that these variables were chosen from. Second, there are some IVs that maximise the Sup Score statistic that do not feature in Table 2, such as the three-period lagged Housing Starts in Northeast Census Region (HOUSTNE.-3). This relates to the predominant motivation for wanting to consider very many IVs: the truly relevant IVs can often be 'exotic', in the sense that intuition alone would not point to their relevance.

Table 3: Identity of the IVs maximising the Sup Score statistic.
IV                    Number of null hypotheses   Description
CES3000000008.-1      13,879   One-period lag of Average Hourly Earnings of Production and Nonsupervisory Employees, Manufacturing
HOUSTNE.-3            11,021   Three-period lag of Housing Starts in Northeast Census Region
PRS85006173.-3        8,404    Three-period lag of Nonfarm Business Sector: Labor Share
PRS85006173.-1        7,263    One-period lag of Nonfarm Business Sector: Labor Share
PRS85006173.-2        4,339    Two-period lag of Nonfarm Business Sector: Labor Share
CUMFNS.-3             1,802    Three-period lag of Capacity Utilization: Manufacturing
CUSR0000SAS.-2        1,742    Two-period lag of Consumer Price Index for All Urban Consumers: Services in U.S. City Average
IPCONGD.-3            68       Three-period lag of Industrial Production: Consumer Goods
DDURRG3M086SBEA.-1    2        One-period lag of Personal consumption expenditures: Durable goods (chain-type price index)
Figure 4: Location in the parameter space $\gamma_f \times \lambda$ where the different IVs in Table 3 maximise the Sup Score statistic.

6 Conclusion

IV-based limited-information estimation of single equations has become increasingly popular in Macroeconomics over the last 20 years. Using a simulation exercise based on NKPCs, I showed that selecting IVs in ad-hoc ways (random selection, crude thresholding, and LASSO) can invalidate them, thus yielding invalid inference even if tests with desirable properties (such as robustness to weak identification) are used post-selection. To address this issue, I proposed a Sup Score test that remains valid for high-dimensional IVs and for time-series data. In the same simulation exercise that showed that ad-hoc selection procedures can lead to invalid inference, this statistic yielded correct size and reasonable power. Finally, I applied the Sup Score test to conduct inference on the US NKPC with 359 IVs on a sample of 179 observations. The results showed that the confidence sets implied by the Sup Score test are substantially wider than the ones of all other approaches. The simulation results and the empirical application point to the importance of developing further high-dimensional IV methods with good power properties that remain valid under dependence and arbitrarily weak identification.

References
Anatolyev, Stanislav and Nikolay Gospodinov (2010). "Specification Testing in Models with Many Instruments". In: Econometric Theory.
Anderson, Gary and George Moore (1985). "A Linear Algebraic Procedure for Solving Linear Perfect Foresight Models". In: Economics Letters.
Anderson, T. W. and Herman Rubin (1949). "Estimation of the Parameters of a Single Equation in a Complete System of Stochastic Equations". In: The Annals of Mathematical Statistics.
Andrews, Donald W. K. and James H. Stock (2007). "Testing with Many Weak Instruments". In: Journal of Econometrics.
Ascari, Guido, Leandro M. Magnusson, and Sophocles Mavroeidis (2019). "Empirical Evidence on the Euler Equation for Consumption in the US". In: Journal of Monetary Economics, pp. 1–24.
Bayar, Omer (2018). "Weak instruments and estimated monetary policy rules". In: Journal of Macroeconomics 58, pp. 308–317.
Belloni, Alexandre, Daniel Chen, Victor Chernozhukov, and Christian Hansen (2012). "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain". In: Econometrica.
Berriel, Medeiros, and Sena (2016). In: Economics Letters.
Berriel, Medeiros, and Sena (2019). In: European Conferences of the Econometrics Community Conference Paper, pp. 1–27.
Chen, Xiaohong, Qi-Man Shao, Wei Biao Wu, and Lihu Xu (2016). "Self-normalized Cramér-type moderate deviations under dependence". In: The Annals of Statistics.
Chernozhukov, Victor, Denis Chetverikov, and Kengo Kato (2018). "Inference on Causal and Structural Parameters Using Many Moment Inequalities". In: The Review of Economic Studies.
Chernozhukov, Victor, Christian Hansen, and Martin Spindler (2015). "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments". In: American Economic Review.
Crudu, Federico, Giovanni Mellace, and Zsolt Sándor (2020). "Inference in Instrumental Variable Models with Heteroskedasticity and Many Instruments". In: Econometric Theory.
Deng, Han and Cunhui Zhang (2020). "Beyond Gaussian Approximation: Bootstrap for Maxima of Sums of Independent Random Vectors". In: arXiv.
Dufour, Jean-Marie (2009). In: Journal of Business & Economic Statistics.
Dufour, Jean-Marie, Lynda Khalaf, and Maral Kichian (2006). "Inflation dynamics and the New Keynesian Phillips Curve: An identification robust econometric analysis". In: Journal of Economic Dynamics and Control.
Galí, Jordi and Mark Gertler (1999). "Inflation dynamics: A structural econometric analysis". In: Journal of Monetary Economics.
Giannone, Domenico, Michele Lenza, and Giorgio E. Primiceri (2018). "Economic Predictions with Big Data: The Illusion of Sparsity". In: CEPR Discussion Paper.
Hansen, Christian and Damian Kozbur (2014). "Instrumental variables estimation with many weak instruments using regularized JIVE". In: Journal of Econometrics.
Horowitz, Joel L. (2018). "Non-Asymptotic Inference in Instrumental Variables Estimation". In: arXiv.
Hsing, Tailen and Wei Biao Wu (2004). "On weighted U-statistics for stationary processes". In: The Annals of Probability.
Kapetanios, George, Lynda Khalaf, and Massimiliano Marcellino (2015). "Factor-Based Identification-Robust Inference in IV Regressions". In: Journal of Applied Econometrics.
Kleibergen, Frank (2005). "Testing Parameters in GMM Without Assuming that They Are Identified". In: Econometrica.
Kleibergen, Frank and Sophocles Mavroeidis (2009). "Weak Instrument Robust Tests in GMM and the New Keynesian Phillips Curve". In: Journal of Business & Economic Statistics.
Ma, Adrian (2002). "GMM estimation of the new Phillips curve". In: Economics Letters.
Mavroeidis, Sophocles (2010). "Monetary Policy Rules and Macroeconomic Stability: Some New Evidence". In: American Economic Review.
Mavroeidis, Sophocles, Mikkel Plagborg-Møller, and James H. Stock (2014). "Empirical Evidence on Inflation Expectations in the New Keynesian Phillips Curve". In: Journal of Economic Literature.
McCracken, Michael W. and Serena Ng (2016). "FRED-MD: A Monthly Database for Macroeconomic Research". In: Journal of Business & Economic Statistics.
Mikusheva, Anna and Liyang Sun (2020). "Inference with Many Weak Instruments". In: arXiv.
Mirza, Harun and Lidia Storjohann (2014). "Making Weak Instrument Sets Stronger: Factor-Based Estimation of Inflation Dynamics and a Monetary Policy Rule". In: Journal of Money, Credit and Banking.
Newey, Whitney K. and Kenneth D. West (1987). "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix". In: Econometrica.
Newey, Whitney K. and Frank Windmeijer (2009). "Generalized Method of Moments with Many Weak Moment Conditions". In: Econometrica 77, pp. 687–719.
Stock, James H. and Jonathan H. Wright (2000). "GMM with Weak Identification". In: Econometrica.
Wang, Runmin and Xiaofeng Shao (2019). "Hypothesis Testing for High-Dimensional Time Series via Self-Normalization". In: Mimeo, pp. 1–30.
Wu, Wei Biao (2005). "Nonlinear System Theory: Another Look at Dependence". In: Proceedings of the National Academy of Sciences.
Yogo, Motohiro (2004). "Estimating the Elasticity of Intertemporal Substitution When Instruments Are Weak". In: The Review of Economics and Statistics.
Zhang, Xianyang and Guang Cheng (2014). "Bootstrap Inference for High Dimensional Time Series". In: arXiv.
Zhang, Xianyang and Guang Cheng (2018). "Gaussian Approximation for High Dimensional Time Series". In: Bernoulli.
A Proof of Example 1

Proof. Assumption 1.ii. holds by the assumption of mean-zero independent error terms, the assumption of non-zero variances, and the assumption of finite fourth moments.

Assumption 1.iv. is satisfied by the assumption of (mean-zero) independent error terms.

Since $E[\varepsilon_t^4] < \infty$, I can re-write Equation (7) as

$$E\left[|Z_{tj} - F_j(\ldots, v^*_{-1}, v^*_0, v_1, \ldots, v_t)|^4\right] E\left[|\varepsilon_t|^4\right] \le \breve{C}\tilde{\rho}^t,$$

for some new constant $\breve{C}$. Then it follows that

$$\breve{C}\tilde{\rho}^t \ge E\left[\left( F_j(\ldots, v_{t-1}, v_t)\varepsilon_t - F_j(\ldots, v^*_{-1}, v^*_0, v_1, \ldots, v_t)\varepsilon_t \right)^4\right].$$

I now define $\tilde{v}_t = [v_t', \varepsilon_t]'$ (so that $\tilde{v}_t$ is an i.i.d. mean-zero random variable), which yields

$$Z_{tj}\varepsilon_t = F_j(\ldots, v_{t-1}, v_t)\varepsilon_t \equiv \tilde{F}_j(\ldots, \tilde{v}_{t-1}, \tilde{v}_t),$$

so that $\tilde{F}_j$ continues to be a measurable function with arguments that are i.i.d. random variables. Therefore, Assumption 1.i. holds.

Since $\varepsilon_t$ is independent across $t$ and of $v_s$ for $s = 1, \ldots, T$, and identically distributed, it follows that

$$F_j(\ldots, v^*_{-1}, v^*_0, v_1, \ldots, v_t)\varepsilon_t = \tilde{F}_j(\ldots, \tilde{v}^*_{-1}, \tilde{v}^*_0, \tilde{v}_1, \ldots, \tilde{v}_t),$$

where $\{\tilde{v}^*_t\}$ are i.i.d. copies of $\{\tilde{v}_t\}$. Thus,

$$\breve{C}\tilde{\rho}^t \ge E\left[\left( \tilde{F}_j(\ldots, \tilde{v}_{t-1}, \tilde{v}_t) - \tilde{F}_j(\ldots, \tilde{v}^*_{-1}, \tilde{v}^*_0, \tilde{v}_1, \ldots, \tilde{v}_t) \right)^4\right].$$

Therefore, Assumption 1.v. holds (with $q = 4$).
B Proof of Theorem 1

Proof. Throughout, it is assumed that the null hypothesis in Equation (4) holds, so that $\varepsilon_t$ is replaced by $\varepsilon_{0t}$. The proof is a straightforward application of the results in Zhang and Cheng (2018) (referred to as ZC18 in the sequel) and Zhang and Cheng (2014) (referred to as ZC14 in the sequel). To this end, let $W_t = [W_{t1}, \ldots, W_{tk}]'$ be a Gaussian sequence which is independent of $Z_t\varepsilon_t$ and preserves the autocovariance structure of $Z_t\varepsilon_t$. Let $L_{Z\varepsilon} = \max_{1\le j\le k} |T^{-1/2} Z_j'\varepsilon|$ and $L_W = \max_{1\le j\le k} |T^{-1/2}\sum_{t=1}^T W_{tj}|$.

I first verify that the conditions in Assumption 1 are sufficient for Theorem 2.1 in ZC18 to hold. Assumption 2.1 in ZC18 holds since, by Assumption 1.ii., $Z_{tj}\varepsilon_t$ has finite fourth moments, so that setting $D_n$ in ZC18 to an appropriate polynomial rate in $T$ (with exponent determined by $b$ and $\tilde{b}$), and $h(\cdot)$ in ZC18 to $h(x) = x^4$, satisfies the first of the two possible conditions in Assumption 2.1 of ZC18 by the rate restrictions on $b$ and $\tilde{b}$ given in Assumption 1.iii.

Assumption 2.2 in ZC18 holds by replacing $M$ in ZC18 with $b_T$, and setting $\gamma$ in ZC18 to a sequence that tends to zero at a polynomial rate in $T$ (see also the sentence immediately following Theorem 3.2 in ZC14).

Assumption 2.3 in ZC18 contains two conditions. The first condition (what they express as $c_1 < \min_{1\le j\le k}\sigma_{j,j} \le \max_{1\le j\le k}\sigma_{j,j} < c_2$) holds since, by Assumption 1.ii., $Z_{tj}\varepsilon_t$ has non-degenerate finite second moments. The second condition (what they express as $\sum_{j=1}^{+\infty} j\theta_{j,k,4} < c$) is satisfied by Assumption 1.v., since, as per Remark 3.2 in Wang and Shao (2019), the GMC condition used in the present paper (and arguably in the literature that uses physical dependence measures more broadly) is equivalent to the one used in ZC18 and ZC14.

Therefore, by Theorem 2.1 in ZC18, under the conditions in Assumption 1, the process $Z_t\varepsilon_t$ can be approximated by its Gaussian equivalent, i.e.,

$$\sup_{a \in \mathbb{R}} |P(L_{Z\varepsilon} \le a) - P(L_W \le a)| \lesssim T^{-c_1}, \quad (B.1)$$

for a positive constant $c_1$ that depends only on $b$ and $\tilde{b}$.

The bound in Equation (B.1) satisfies the first condition for Theorem 4.2 in ZC14. (The careful reader will have noticed that Theorem 4.2 in ZC14 appeals to the conditions in Theorem 3.3 in ZC14, which is virtually the same theorem as Theorem 2.1 in ZC18, except for an additional GMC assumption on the Gaussian equivalent of $Z_t\varepsilon_t$. However, a careful reading of the proof of Theorem 4.2 in ZC14 reveals that this theorem appeals to the conditions in Theorem 3.3 in ZC14 exclusively in order to establish a bound on the Gaussian approximation as in Equation (B.1) above. Since Theorem 2.1 in ZC18 establishes this bound without this assumption, the GMC on the Gaussian equivalent of $Z_t\varepsilon_t$ can be dropped in appealing to Theorem 4.2 in ZC14.) It remains to verify Condition 2 of Assumption 4.1 in ZC14, which requires checking four conditions.

The first condition (what they express as $\bar{\sigma}_{x,M} \vee \bar{\sigma}_{x,N} \lesssim n^{s}$) is satisfied by Assumption 1.v., since, by Remark 4.1 in ZC14, this condition is satisfied with $s = 0$ whenever the data in question obey the GMC condition.

The second condition (what they express as $\varsigma_{x,M} \vee \varsigma_{x,N} \lesssim n^{s'}$) is satisfied by setting their $M, N$ to $b_T$ and $s'$ to a multiple of $\tilde{b}$: a direct calculation using the finite fourth moments in Assumption 1.ii. shows that, for all $j = 1, \ldots, k$, the moments of the normalised block sums $b_T^{-1/2}\sum_{t=1}^{b_T} Z_{tj}\varepsilon_t$ grow at most polynomially in $b_T \lesssim T^{\tilde{b}}$ (Assumption 1.iii.), which ensures that the second condition of Condition 2 of Assumption 4.1 in ZC14 is satisfied.

The third condition (what they express as $\varpi_x \lesssim n^{s}$) is satisfied by Assumption 1.iv. and setting $s = \breve{b}$.

The fourth condition (a set of positivity restrictions on the rate exponents) is satisfied by the restrictions on $b$, $\tilde{b}$, and $\breve{b}$ in Assumption 1.iii. and Assumption 1.iv.

It is hence possible to invoke Theorem 4.2 in ZC14, which yields

$$\sup_{\alpha \in (0,1)} |P(L_{Z\varepsilon} \le \tilde{c}(\alpha)) - \alpha| \lesssim T^{-c_2},$$

where

$$\tilde{c}(\alpha) = \inf\left\{\gamma \in \mathbb{R} : P\left(\tilde{L}_{\hat{A}} \le \gamma \mid \{Z_t\varepsilon_t\}_{t=1}^T\right) \ge 1 - \alpha\right\}, \qquad \tilde{L}_{\hat{A}} = \max_{1\le j\le k} \frac{1}{\sqrt{T}}\left|\sum_{t=1}^{l_T}\hat{A}_{tj}e_t\right|,$$

and $\{e_t\}$ is a sequence of i.i.d. $N[0,1]$ random variables. The constant $c_2$ is positive by the restrictions on $b$, $\tilde{b}$, and $\breve{b}$ in Assumption 1.iii. and Assumption 1.iv.

Finally, notice that the procedure proposed in this paper computes only the means of random variables. This means that the 'influence function' ($IF$ in ZC14) does not have to be estimated (since the true value is known under the null hypothesis). It also means that the statistic is 'exactly linear', i.e., the remainder term $R_N$ in ZC14 is zero. This implies that the two conditions in Assumption 5.1 in ZC14 are trivially satisfied (since, in their notation, $E_{AB} = R_N = 0$). Also, the block length of $Z_{tj}\varepsilon_t$ is simply unity, so that $N$ in ZC14 is simply $T$, and the dimension of the parameter to be estimated ($q$ in their notation) is simply the number of IVs considered, $k$. By Theorem 5.1 in ZC14, which requires the conditions for Theorem 4.1 and Assumption 5.1 in ZC14 to hold, and the identifying moment condition $E[Z_t\varepsilon_t] = 0$, it hence follows that

$$\sup_{\alpha \in (0,1)} \left| P\left( \max_{1\le j\le k} \sqrt{T}\left|\frac{1}{T}Z_j'\varepsilon\right| \le c(\alpha) \right) - \alpha \right| \lesssim T^{-c_2},$$

i.e.,

$$\sup_{\alpha \in (0,1)} |P(R \le c(\alpha)) - \alpha| \lesssim T^{-c_2}.$$

Letting $T \to \infty$ yields the required result.

C Derivation of Concentration Parameters for Simulations
I first derive the concentration parameter for the unobserved oracle first stage. The derivations for this concentration parameter are very similar to those in the online appendix of Mavroeidis, Plagborg-Møller, and Stock (2014). As there, the constant can be omitted for the purposes of deriving the concentration matrix, because in all simulations it is set equal to zero in the structural NKPC.

The endogenous variables in the model the econometrician estimates (Equation (11)) can be written in terms of the excluded IVs as

$$\begin{bmatrix} \pi_{t+1} - \pi_{t-1} \\ s_t \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{E_1} \begin{bmatrix} \pi_{t+1} \\ s_{t+1} \\ f_{t+1} \end{bmatrix} + \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}}_{E_2} \begin{bmatrix} \pi_t \\ s_t \\ f_t \end{bmatrix} + \underbrace{\begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{E_3} \begin{bmatrix} \pi_{t-1} \\ s_{t-1} \\ f_{t-1} \end{bmatrix}. \quad (C.1)$$

Define $p_t = [\pi_{t+1} - \pi_{t-1}, s_t]'$, $R_t = [\pi_t, s_t, f_t]'$, $u_t = [u_{1t}, u_{2t}, u_{3t}]'$,

$$\Psi = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \quad \text{and} \quad \Omega = \begin{bmatrix} \omega_{11} & \omega_{12} & \omega_{13} \\ \omega_{21} & \omega_{22} & \omega_{23} \\ \omega_{31} & \omega_{32} & \omega_{33} \end{bmatrix}.$$

Equation (C.1) can now be written as

$$p_t = E_1 R_{t+1} + E_2 R_t + E_3 R_{t-1} = (E_1\Psi^2 + E_2\Psi + E_3)R_{t-1} + (E_1\Psi + E_2)u_t + E_1 u_{t+1} = D R_{t-1} + w_t,$$

for $D = E_1\Psi^2 + E_2\Psi + E_3$ and $w_t = (E_1\Psi + E_2)u_t + E_1 u_{t+1}$.

Assuming that $R_t$ is stationary, and letting $\Gamma = V[R_t]$,

$$\text{vec}(\Gamma) = (I_9 - \Psi \otimes \Psi)^{-1}\,\text{vec}(\Omega).$$

The population projection of $p_t$ on $R_{t-1}$ has coefficient matrix given by

$$M = E[p_t R'_{t-1}]\Gamma^{-1} = E\left[(D R_{t-1} + w_t) R'_{t-1}\right]\Gamma^{-1} = D\, E\left[R_{t-1} R'_{t-1}\right]\Gamma^{-1} = D,$$

since $E[w_t R'_{t-1}] = E\left[\left((E_1\Psi + E_2)u_t + E_1 u_{t+1}\right) R'_{t-1}\right] = 0$.

The projection error of the unobserved oracle first stage is given by

$$e_t = p_t - M R_{t-1} = D R_{t-1} + w_t - D R_{t-1} = w_t.$$

The variance of the population projection error of the unobserved oracle first stage, $\Sigma = V[e_t]$, is hence given by

$$\Sigma = V[w_t] = (E_1\Psi + E_2)\Omega(E_1\Psi + E_2)' + E_1\Omega E_1'.$$

The concentration matrix of the unobserved oracle first stage is then given by

$$C = T\, \Sigma^{-1/2} D \Gamma D' \Sigma^{-1/2\prime},$$

where $\Sigma^{-1/2}\Sigma^{-1/2\prime} = \Sigma^{-1}$. The minimum eigenvalue $\mu_O$ of the matrix $C$ gives the concentration parameter of the unobserved oracle first stage.

The steps to derive the concentration matrix corresponding to the first stage observed by the econometrician are similar. Define $\tilde{R}_t = [\pi_t, s_t, Q_t']'$, and

$$\Xi = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0_{m\times 1} & 0_{m\times 1} & \xi \end{bmatrix}, \quad F = \begin{bmatrix} 0_{2\times 2} & 0_{2\times m} \\ 0_{m\times 2} & I_m \end{bmatrix},$$

so that $\tilde{R}_t = \Xi R_t + \tilde{u}_t$, where $\tilde{u}_t = [0, 0, u_{Qt}']'$ and $V[\tilde{u}_t] = F$.

Assuming that $R_t$ is stationary, $\tilde{R}_t$ is also stationary. Letting $\tilde{\Gamma} = V[\tilde{R}_t]$,

$$\tilde{\Gamma} = V[\Xi R_t + \tilde{u}_t] = \Xi V[R_t]\Xi' + F = \Xi\Gamma\Xi' + F.$$

The population projection of $p_t$ on $\tilde{R}_{t-1}$ has coefficient matrix given by

$$\tilde{M} = E[p_t \tilde{R}'_{t-1}]\tilde{\Gamma}^{-1} = E\left[p_t(R'_{t-1}\Xi' + \tilde{u}'_{t-1})\right]\tilde{\Gamma}^{-1} = E\left[(DR_{t-1} + w_t)R'_{t-1}\right]\Xi'\tilde{\Gamma}^{-1} = D\Gamma\Xi'\tilde{\Gamma}^{-1}.$$

The projection error for the observed first stage is given by

$$\tilde{e}_t = p_t - \tilde{M}\tilde{R}_{t-1} = DR_{t-1} + w_t - \tilde{M}(\Xi R_{t-1} + \tilde{u}_{t-1}) = (D - \tilde{M}\Xi)R_{t-1} + w_t - \tilde{M}\tilde{u}_{t-1}.$$

The variance of the population projection error of the observed first stage, $\tilde{\Sigma} = V[\tilde{e}_t]$, is hence given by

$$\tilde{\Sigma} = V[(D - \tilde{M}\Xi)R_{t-1}] + V[w_t] + V[\tilde{M}\tilde{u}_{t-1}] = (D - \tilde{M}\Xi)\Gamma(D - \tilde{M}\Xi)' + \Sigma + \tilde{M}F\tilde{M}'.$$

The concentration matrix of the observed first stage is then given by

$$\tilde{C} = T\, \tilde{\Sigma}^{-1/2}\tilde{M}\tilde{\Gamma}\tilde{M}'\tilde{\Sigma}^{-1/2\prime},$$

where $\tilde{\Sigma}^{-1/2}\tilde{\Sigma}^{-1/2\prime} = \tilde{\Sigma}^{-1}$. The minimum eigenvalue $\mu_E$ of the matrix $\tilde{C}$ gives the concentration parameter of the observed first stage.
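The closed-form expressions above translate directly into code. The following sketch (hypothetical names; SciPy's generalized eigensolver) computes $\mu_O$ and $\mu_E$, using the fact that the eigenvalues of $\Sigma^{-1/2} D \Gamma D' \Sigma^{-1/2\prime}$ coincide with the generalized eigenvalues of the pair $(D\Gamma D', \Sigma)$:

```python
import numpy as np
from scipy.linalg import eigh, block_diag

def concentration_params(Psi, Omega, xi, T=100):
    """mu_O and mu_E from the closed-form expressions in Appendix C."""
    m = len(xi)
    E1 = np.array([[1., 0., 0.], [0., 0., 0.]])   # selects pi_{t+1}
    E2 = np.array([[0., 0., 0.], [0., 1., 0.]])   # selects s_t
    E3 = np.array([[-1., 0., 0.], [0., 0., 0.]])  # selects -pi_{t-1}

    # Gamma solves vec(Gamma) = (I_9 - Psi (x) Psi)^{-1} vec(Omega)
    Gamma = np.linalg.solve(np.eye(9) - np.kron(Psi, Psi),
                            Omega.ravel()).reshape(3, 3)
    D = E1 @ Psi @ Psi + E2 @ Psi + E3
    B = E1 @ Psi + E2
    Sigma = B @ Omega @ B.T + E1 @ Omega @ E1.T

    # oracle: smallest generalized eigenvalue of (T * D Gamma D', Sigma)
    mu_O = eigh(T * D @ Gamma @ D.T, Sigma, eigvals_only=True)[0]

    Xi = np.vstack([np.eye(2, 3),
                    np.column_stack([np.zeros(m), np.zeros(m), xi])])
    F = block_diag(np.zeros((2, 2)), np.eye(m))
    Gamma_t = Xi @ Gamma @ Xi.T + F
    M_t = D @ Gamma @ Xi.T @ np.linalg.inv(Gamma_t)
    Sigma_t = ((D - M_t @ Xi) @ Gamma @ (D - M_t @ Xi).T
               + Sigma + M_t @ F @ M_t.T)
    mu_E = eigh(T * M_t @ Gamma_t @ M_t.T, Sigma_t, eigvals_only=True)[0]
    return mu_O, mu_E
```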
D Data

Table 4: Data description.
Code | Description | Type
RPI | Real Personal Income | G
PERMIT | New Private Housing Units Authorized by Building Permits | G
INDPRO | Industrial Production Index | G
PERMITNE | New Private Housing Units Authorized by Building Permits in the Northeast Census Region | G
CUMFNS | Capacity Utilization: Manufacturing | D
PERMITMW | New Private Housing Units Authorized by Building Permits in the Midwest Census Region | G
IPFINAL | Industrial Production: Final Products (Market Group) | G
PERMITS | New Private Housing Units Authorized by Building Permits in the South Census Region | G
IPCONGD | Industrial Production: Consumer Goods | G
PERMITW | New Private Housing Units Authorized by Building Permits in the West Census Region | G
IPDCONGD | Industrial Production: Durable Consumer Goods | G
DPCERA3M086SBEA | Real personal consumption expenditures (chain-type quantity index) | G
IPNCONGD | Industrial Production: Nondurable Consumer Goods | G
CMRMTSPL | Real Manufacturing and Trade Industries Sales | G
IPBUSEQ | Industrial Production: Business Equipment | G
UMCSENT | University of Michigan: Consumer Sentiment | G
IPMAT | Industrial Production: Materials | G
M1SL | M1 Money Stock | G
IPMANSICS | Industrial Production: Manufacturing (SIC) | G
M2SL | M2 Money Stock | G
IPB51222s | Industrial Production: Residential utilities | G
TOTRESNS | Total Reserves of Depository Institutions | G
IPFUELS | Industrial Production: Fuels | G
BUSLOANS | Commercial and Industrial Loans, All Commercial Banks | G
CLF16OV | Civilian Labor Force Level | G
REALLN | Real Estate Loans, All Commercial Banks | G
UNRATE | Unemployment Rate | N
NONREVSL | Total Nonrevolving Credit Owned and Securitized, Outstanding | G
UEMPMEAN | Average Weeks Unemployed | D
DTCOLNVHFNM | Consumer Motor Vehicle Loans Owned by Finance Companies, Outstanding | G
UEMPLT5 | Number Unemployed for Less Than 5 Weeks | G
DTCTHFNM | Total Consumer Loans and Leases Owned and Securitized by Finance Companies, Outstanding | G
UEMP5TO14 | Number Unemployed for 5-14 Weeks | G
INVEST | Securities in Bank Credit, All Commercial Banks | G
UEMP15OV | Number Unemployed for 15 Weeks & Over | G
FEDFUNDS | Effective Federal Funds Rate | G
UEMP15T26 | Number Unemployed for 15-26 Weeks | G
TB3SMFFM | 3-Month Treasury Bill Minus Federal Funds Rate | N
UEMP27OV | Number Unemployed for 27 Weeks & Over | G
TB6SMFFM | 6-Month Treasury Bill Minus Federal Funds Rate | N
PAYEMS | All Employees, Total Nonfarm | G
T1YFFM | 1-Year Treasury Constant Maturity Minus Federal Funds Rate | N
USGOOD | All Employees, Goods-Producing | G
T5YFFM | 5-Year Treasury Constant Maturity Minus Federal Funds Rate | N
CES1021000001 | All Employees, Mining | G
T10YFFM | 10-Year Treasury Constant Maturity Minus Federal Funds Rate | N
USCONS | All Employees, Construction | G
AAAFFM | Moody's Seasoned Aaa Corporate Bond Minus Federal Funds Rate | N
MANEMP | All Employees, Manufacturing | G
BAAFFM | Moody's Seasoned Baa Corporate Bond Minus Federal Funds Rate | N
PRS85006173 | Nonfarm Business Sector: Labor Share | GG
TWEXMMTH | Trade Weighted U.S. Dollar Index: Major Currencies, Goods | G
DMANEMP | All Employees, Durable Goods | G
EXSZUS | Switzerland / U.S. Foreign Exchange Rate | G
NDMANEMP | All Employees, Nondurable Goods | G
EXJPUS | Japan / U.S. Foreign Exchange Rate | G
SRVPRD | All Employees, Service-Providing | G
EXUSUK | U.S. / U.K. Foreign Exchange Rate | G
USTPU | All Employees, Trade, Transportation, and Utilities | G
EXCAUS | Canada / U.S. Foreign Exchange Rate | G
USWTRADE | All Employees, Wholesale Trade | G
WPSFD49502 | Producer Price Index by Commodity for Final Demand: Personal Consumption Goods | G
USTRADE | All Employees, Retail Trade | G
WPSID61 | Producer Price Index by Commodity for Intermediate Demand by Commodity Type: Processed Goods for Intermediate Demand | G
USFIRE | All Employees, Financial Activities | G
WTISPLC | Spot Crude Oil Price: West Texas Intermediate (WTI) | G
USGOVT | All Employees, Government | G
PPICMM | Producer Price Index by Commodity Metals and metal products: Primary nonferrous metals | G
CES0600000007 | Average Weekly Hours of Production and Nonsupervisory Employees, Goods-Producing | G
CPIAUCSL | Consumer Price Index for All Urban Consumers: All Items in U.S. City Average | G
AWOTMAN | Average Weekly Overtime Hours of Production and Nonsupervisory Employees, Manufacturing | D
CUSR0000SAC | Consumer Price Index for All Urban Consumers: Commodities in U.S. City Average | G
AWHMAN | Average Weekly Hours of Production and Nonsupervisory Employees, Manufacturing | D
CUSR0000SAD | Consumer Price Index for All Urban Consumers: Durables in U.S. City Average | G
CES0600000008 | Average Hourly Earnings of Production and Nonsupervisory Employees, Goods-Producing | G
CUSR0000SAS | Consumer Price Index for All Urban Consumers: Services in U.S. City Average | G
CES2000000008 | Average Hourly Earnings of Production and Nonsupervisory Employees, Construction | G
PCEPI | Personal Consumption Expenditures: Chain-type Price Index | G
CES3000000008 | Average Hourly Earnings of Production and Nonsupervisory Employees, Manufacturing | G
DDURRG3M086SBEA | Personal consumption expenditures: Durable goods (chain-type price index) | G
HOUST | Housing Starts: Total: New Privately Owned Housing Units Started | G
DNDGRG3M086SBEA | Personal consumption expenditures: Nondurable goods (chain-type price index) | G
HOUSTNE | Housing Starts in Northeast Census Region | G
DSERRG3M086SBEA | Personal consumption expenditures: Services (chain-type price index) | G
HOUSTMW | Housing Starts in Midwest Census Region | G
GDPDEF | Gross Domestic Product: Implicit Price Deflator | G
HOUSTS | Housing Starts in South Census Region | G
WILL5000IND | Wilshire 5000 Total Market Index | G
HOUSTW | Housing Starts in West Census Region | G
GDPC1 | Real Gross Domestic Product | G
Notes:
N refers to no transformation of the data.
G refers to transforming variable $f_t$ by computing $100(\log(f_t) - \log(f_{t-1}))$.
D refers to transforming variable $f_t$ by computing $f_t - f_{t-1}$.
GG refers to transforming variable $f_t$ by computing $100\log(f_t/100)$ (as in Galí and Gertler (1999) and Kleibergen and Mavroeidis (2009)).