Estimating the Effect of Central Bank Independence on Inflation Using Longitudinal Targeted Maximum Likelihood Estimation
Philipp F. M. Baumann† Michael Schomaker‡ Enzo Rossi§

July 30, 2020
Abstract
The notion that an independent central bank reduces a country's inflation is a controversial hypothesis. To date, it has not been possible to satisfactorily answer this question because the complex macroeconomic structure that gives rise to the data has not been adequately incorporated into statistical analyses. We develop a causal model that summarizes the economic process of inflation. Based on this causal model and recent data, we discuss and identify the assumptions under which the effect of central bank independence on inflation can be identified and estimated. Given these and alternative assumptions, we estimate this effect using modern doubly robust effect estimators, i.e., longitudinal targeted maximum likelihood estimators. The estimation procedure incorporates machine learning algorithms and is tailored to address the challenges associated with complex longitudinal macroeconomic data. We do not find strong support for the hypothesis that having an independent central bank for a long period of time necessarily lowers inflation. Simulation studies evaluate the sensitivity of the proposed methods in complex settings when certain assumptions are violated and highlight the importance of working with appropriate learning algorithms for estimation.
Keywords: causal inference, doubly robust, super learning, macroeconomics, monetary policy.

∗ The views, opinions, findings, and conclusions or recommendations expressed in this paper are strictly those of the authors. They do not necessarily reflect the views of the Swiss National Bank (SNB). The SNB takes no responsibility for any errors or omissions in, or for the correctness of, the information contained in this paper.
† KOF Swiss Economic Institute, ETH Zurich. e-mail: [email protected]
‡ Institute of Public Health, Medical Decision Making and Health Technology Assessment, UMIT - University for Health Sciences, Medical Informatics and Technology, Hall in Tirol, Austria and Centre for Infectious Disease Epidemiology & Research, University of Cape Town, Cape Town, South Africa. e-mail: [email protected]
§ Swiss National Bank and University of Zurich. e-mail: [email protected]

Introduction
The impact of the institutional design of central banks on real economic outcomes has received considerable attention over the past three decades. Whether central bank independence (CBI) can lower inflation and provide inflation stability in a country is a particularly controversial issue. It has been claimed that more than 9,000 works have been devoted to the investigation of the role of CBI in influencing economic outcomes (Vuletin and Zhu, 2011). After the 2008-09 Global Financial Crisis, the debate on the optimal design of monetary policy authorities has become even more intense.

The statistical and economic literature is rich in studies that evaluate the relationship between CBI and inflation. A common approach is to treat countries as units in a linear regression model where inflation (the percentage change in the consumer price index (CPI)) is the outcome and a binary CBI index and several economic and political variables are covariates. While many studies have found that an independent central bank may lower inflation (Grilli et al., 1991; Cukierman et al., 1992; Alesina and Summers, 1993; Klomp and De Haan, 2010a,b; Arnone and Romelli, 2013), other studies that have used a broader range of characteristics of a nation's economy have been unable to find such a relationship (Cargill, 1995; Fuhrer, 1997; Oatley, 1999). Moreover, there have been studies suggesting that the effect of CBI on inflation can only be seen during specific time periods (Klomp and De Haan, 2010a) or only in developed countries (Klomp and De Haan, 2010b; Neyapti, 2012; Alpanda and Honig, 2014).

Numerous articles have pointed out the weaknesses that come with simple cross-sectional regression approaches when evaluating the effect of CBI on inflation. First, the problem at hand is longitudinal in nature, and only an appropriate panel setup may be suitable to estimate the (long-term) effect of CBI on inflation.
Second, the question of interest is essentially causal: i.e., what (average) inflation would we observe in 10 years' time, if – from now on – each country's monetary institution had an independent central bank compared to the situation in which the central bank were not independent? However, the abovementioned cross-sectional regression approaches do not incorporate any causal considerations into their analyses.

Some more recent work has attempted to overcome at least parts of these problems. For example, Crowe and Meade (2007, 2008) use a panel data setup with two time intervals, and Klomp and De Haan (2010b) work with a random coefficient panel model. Other authors, e.g., Walsh (2005), acknowledge not only that current CBI may cause future inflation but also that current inflation is possibly related to future CBI status. Several authors have thus tried to use instrumental variable approaches but have been unable to find strong instruments (Crowe and Meade, 2008; Jácome and Vázquez, 2008).

It is clear that evaluating the effect of CBI on inflation requires a longitudinal causal estimation approach. However, it has been shown repeatedly that standard regression approaches are typically not suitable to answer causal questions, particularly when the setup is longitudinal and when the confounders of the outcome-intervention relationship are affected by previous intervention decisions (Daniel et al., 2013). There are at least three methods to evaluate the effect of longitudinal (multiple time-point) interventions on an outcome in such complex situations: 1) inverse probability of treatment weighted (IPTW) approaches (Robins et al., 2000); 2) standardization with respect to the time-dependent confounders (i.e., g-formula-type approaches (Robins, 1986; Bang and Robins, 2005)); and 3) doubly robust methods, such as targeted maximum likelihood estimation (TMLE, Van der Laan and Rose, 2011), which can be seen as a combination and generalization of the other two approaches.

Longitudinal
targeted maximum likelihood estimation (LTMLE, van der Laan and Gruber, 2012) is a doubly robust estimation technique that requires iteratively fitting models for the outcome and intervention mechanisms at each time point. With LTMLE, the causal quantity of interest (such as an average treatment effect (ATE)) is estimated consistently if either the iterated outcome regressions or the intervention mechanisms are estimated consistently. LTMLE, like other doubly robust methods, has an advantage over other approaches in that it can more readily incorporate machine learning methods while retaining valid statistical inference. Recent research has shown that this is important if correct model specification is difficult, such as when dealing with complex longitudinal data, potentially of small sample size, where relationships and interactions are most likely highly nonlinear and where the number of variables is large compared to the sample size (Tran et al., 2019; Schomaker et al., 2019).

Using causal inference in economics has a long history, starting with path analyses and potential outcome language (Tinbergen, 1930; Wright, 1934) and continuing with regression discontinuity analyses (Hahn et al., 2001), instrumental variable designs (Imbens, 2014), and propensity score approaches in the context of the potential outcome framework (Rosenbaum and Rubin, 1983), among many other methods. More recently, there have been works advocating the use of doubly robust techniques in econometrics (Chernozhukov et al., 2018).
From the perspective of statistical inference, this is a very promising suggestion because the integration of modern machine learning methods in causal effect estimation is almost inevitable in areas with a large number of covariates and complex data-generating processes (Schomaker et al., 2019).

However, the application of doubly robust effect estimation can be challenging for (macro-)economic data. First, the causal model that summarizes the knowledge about the data-generating process is often more complex for economic than for epidemiological questions, where most successful implementations have been demonstrated thus far (Kreif et al., 2017; Decker et al., 2014; Schnitzer, Moodie, van der Laan, Platt and Klein, 2014; Schnitzer, van der Laan, Moodie and Platt, 2014; Schnitzer, Lok and Bosch, 2016; Tran et al., 2016; Schomaker et al., 2019; Bell-Gorrod et al., 2019). The task of representing the causal model in a directed acyclic graph (DAG) becomes particularly challenging when considering how economic variables interact with each other over time. Thus, to build a DAG, a thorough review of a vast amount of literature is needed, and economic feedback loops need to be incorporated appropriately. Imbens (2019), who discusses different schools of causal inference and their use in statistics and econometrics, as well as different estimation techniques, emphasizes this point: "[...] a major challenge in causal inference is coming up with the causal model."

Second, even if a causal model has been developed, identification of an estimand has been established and data have been collected, statistical estimation may be nontrivial given the complexity of a particular data set (Schomaker et al., 2019). If the sample size is small, potentially smaller than the number of (time-varying) covariates, recommended estimation techniques can fail, and the development of an appropriate set of learning and screening algorithms is important.
The benefits of LTMLE, which is doubly robust effect estimation in conjunction with machine learning to reduce the chance of model misspecification, can be best utilized under a good and broad selection of learners that are tailored to the problem of interest.

Estimating the effect of CBI on inflation is a typical example of a causal inference question that faces all of the challenges described above. Our paper makes five novel contributions to the literature. i) We discuss identification and estimation for our question of interest and estimate the effect of CBI on inflation; ii) we develop a causal model that can be applied to other questions related to macroeconomics in general; iii) we demonstrate that it is possible to develop a DAG for economic questions, which is important, as it has been argued that "the lack of adoption in economics is that the DAG literature has not shown much evidence of the benefits for empirical practice in settings that are important in economics" (Imbens, 2019); iv) we demonstrate how to integrate machine learning into complex causal effect estimation, including how to define a successful learner set when the number of covariates is larger than the sample size and when there is time-dependent confounding with treatment-confounder feedback (Hernan and Robins, 2020); and v) we use simulation studies to study the performance of doubly robust estimation techniques under the challenges described above.

This paper is structured as follows. In the next section, we motivate our question of interest, and this is followed by the description of our framework. Section 4 contains the data analysis and describes the doubly robust estimation strategy to estimate the effect of CBI on inflation. In Section 5, we conduct simulation studies motivated by our data analysis. Section 6 concludes the paper.
When governments have discretionary control over monetary instruments, typically a short-term interest rate, they can prioritize other policy goals over price stability. For instance, after nominal wages have been negotiated (or nominal bonds purchased), politicians may be tempted to create inflation to boost employment and output (gross domestic product, GDP) or to devalue government debt. This is referred to as the time-inconsistency problem of commitments to price stability. It results in an inflation rate higher than what is socially desirable. To overcome this outcome, the literature stresses the benefits of enforced commitments (rules). In particular, Rogoff (1985) has proposed delegating monetary policy to an independent and "conservative" central banker to reduce the tendency to produce high inflation. Here, conservative means that the central banker dislikes inflation more than the government, in the sense that (s)he places a greater weight on price stability than the government does. Once central bankers are insulated from political pressures, commitments to price stability can be credible, which helps to maintain low inflation. Rogoff's seminal paper had a twofold effect: stimulating the implementation of central bank reforms on the policy side and creating avenues for the design of indices that are suitable to capture the degree of independence of these institutions on the research side.

Following these ideas, a considerable policy consensus grew around the potential of having independent central banks to promote inflation stability (Bernhard et al., 2002; Kern et al., 2019). Numerous countries followed this policy advice. Between 1985 and 2012, and excluding the creation of regional central banks, there were 266 reforms to the statutory independence of central banks, 236 of which were implemented in developing countries.
Most of these reforms (77%) strengthened CBI (Garriga, 2016). However, despite the broad impact of this policy advice, the empirical evidence in support of it remains controversial. We thus investigate the effect of CBI on inflation with a causal framework that treats countries as units in a longitudinal (panel) setup. The data set we use in our analysis was created specifically for this purpose and extends the data set from Baumann et al. (2019).
Methodological Framework
We consider panel data with $n$ units studied over time ($t = 0, 1, \ldots, T$). At each time point $t$, we observe an outcome $Y_t$, an intervention of interest $A_t$ and several time-dependent covariates $L^j_t$, $j = 1, \ldots, q$, collected in a set $L_t = \{L^1_t, \ldots, L^q_t\}$. Variables measured at the first time point ($t = 0$) are denoted as $L_0 = \{L^1_0, \ldots, L^{q_0}_0\}$ and are called "baseline variables". The intervention and covariate histories of a unit $i$ (up to and including time $t$) are $\bar{A}_{t,i} = (A_{0,i}, \ldots, A_{t,i})$ and $\bar{L}^s_{t,i} = (L^s_{0,i}, \ldots, L^s_{t,i})$, $s = 1, \ldots, q$, respectively, with $q, q_0 \in \mathbb{N}$.

We are interested in the counterfactual outcome $Y^{\bar{a}_t}_{t,i}$ that would have been observed at time $t$ if unit $i$ had received, possibly contrary to the fact, the intervention history $\bar{A}_{t,i} = \bar{a}_t$. For a given intervention $\bar{A}_{t,i} = \bar{a}_t$, the counterfactual covariates are denoted as $\bar{L}^{\bar{a}_t}_{t,i}$. If an intervention depends on covariates, it is dynamic. A dynamic intervention $d_t(\bar{L}_t) = \bar{d}_t$ assigns treatment $A_{t,i} \in \{0, 1\}$ as a function of $\bar{L}_{t,i}$. If $\bar{L}_{t,i}$ is the empty set, the treatment $\bar{d}_t$ is static. We use the notation $\bar{A}_t = \bar{d}_t$ to refer to the intervention history up to and including time $t$ for a given rule $\bar{d}_t$. The counterfactual outcome at time $t$ related to a dynamic rule $\bar{d}_t$ is $Y^{\bar{d}_t}_{t,i}$, and the counterfactual covariates at the respective time point are $\bar{L}^{\bar{d}_t}_{t,i}$.

If we assume a time ordering of $L_t \rightarrow A_t$ at each time point, use $Y_T$ as the outcome, and define $Y_t$, $t < T$, to be contained in $L_t$, the data can be represented as $n$ iid copies of the following longitudinal data structure:
$$O = (L_0, A_0, L_1, A_1, \ldots, L_{T-1}, A_{T-1}, Y_T) \overset{\text{iid}}{\sim} P.$$
For the given ordering, we can write the respective likelihood $\mathcal{L}(O)$ as
$$
\begin{aligned}
p(O_i) &= p(L_{0,i}, A_{0,i}, L_{1,i}, A_{1,i}, \ldots, L_{T-1,i}, A_{T-1,i}, Y_{T,i}) \\
&= p(Y_{T,i} \mid \bar{A}_{T-1,i}, \bar{L}_{T-1,i}) \times p(A_{T-1,i} \mid \bar{L}_{T-1,i}, \bar{A}_{T-2,i}) \times p(L_{T-1,i} \mid \bar{A}_{T-2,i}, \bar{L}_{T-2,i}) \times \ldots \times p(L_{0,i}) \\
&= p(Y_{T,i} \mid \bar{A}_{T-1,i}, \bar{L}_{T-1,i}) \times \underbrace{\prod_{t=0}^{T-1} p(A_{t,i} \mid \bar{L}_{t,i}, \bar{A}_{t-1,i})}_{g_{A_t}} \times \underbrace{\prod_{t=0}^{T-1} p(L_{t,i} \mid \bar{A}_{t-1,i}, \bar{L}_{t-1,i})}_{\tilde{q}_{L_t}}.
\end{aligned}
$$
In the above factorization, $p(\cdot)$ refers to the density of $P$ (with respect to some dominating measure) and $A_{-1} := L_{-1} := \emptyset$. If an order for $L_t$ is given, e.g., $L^1_t \rightarrow \ldots \rightarrow L^q_t$, a more refined factorization is possible. In line with the notation of other papers (e.g., Tran et al., 2019), we define the $q$-portion of the likelihood to also contain the outcome: $q_{L_t} := \tilde{q}_{L_t} \times p(Y_{T,i} \mid \bar{A}_{T-1,i}, \bar{L}_{T-1,i})$. Similarly, we define $g := \prod_{t=0}^{T-1} g_{A_t}$ and $q := \prod_{t=0}^{T-1} q_{L_t}$.

3.3 On the distinction between the causal and statistical model

Causal effects cannot be estimated from data alone; their estimation requires additional structural (i.e., causal) assumptions about the data-generating process. Therefore, any causal analysis comes with both a structural (i.e., causal) and a statistical model. The former can be represented by a directed acyclic graph (DAG), encodes conditional independence assumptions and is logically equivalent to a (nonparametric) structural equation framework. Ideally, the structural model is supported by knowledge from the literature. The statistical model encodes assumptions about the family of possible observed data distributions associated with the DAG, with the ultimate aim to estimate post-intervention distributions and quantities. With doubly robust effect estimation, parametric assumptions are typically eschewed, both to avoid model misspecification and to incorporate machine learning while retaining valid inference. In our framework and analyses below, we proceed as follows: we start with minimal assumptions with respect to both the causal and the statistical model (Sections 3.4 and 3.5), i.e.
we do not impose any parametric restrictions on the statistical model and require for the causal model only that variables can be affected by the past, and not the future. In our analysis in Section 4.1, we then make more detailed assumptions: first, we encode our structural assumptions in a directed acyclic graph (Figure 1) and support this model with references from the economic literature (Appendix). In the statistical model, we use the above likelihood factorization and targeted maximum likelihood estimation with super learning, to avoid any overly restrictive parametric assumptions.
In line with the notation of Section 3.2, we consider a statistical model $\mathcal{M} = \{P = q \times g : q \in \mathcal{Q}, g \in \mathcal{G}\}$ for the true distribution $P$ that requires minimal (parametric) assumptions. In contrast to many medical applications, we do not impose restrictions on this model; that is, $A_t$ and $Y_t$ are not deterministically determined for any given data history. Once an intervention is implemented, it can be stopped at any time point and potentially started again. Similarly, the outcome can be observed at any time point, and we do not assume that censoring is possible.

Causal assumptions about the data-generating process are encoded in the model $\mathcal{M}^F$. This nonparametric (structural equation) model states our assumptions about the time ordering of the data and the causal mechanism that gave rise to the data. Thus far, it relates to
$$
\begin{aligned}
Y_T &= f_{Y_T}(\bar{A}_{T-1}, \bar{L}_{T-1}, U_{Y_T}) \\
L_t &= f_{L_t}(\bar{A}_{t-1}, \bar{L}_{t-1}, U_{L_t}) \quad : t = 0, 1, \ldots, T-1 \\
A_t &= f_{A_t}(\bar{L}_t, \bar{A}_{t-1}, U_{A_t}) \quad : t = 0, 1, \ldots, T-1.
\end{aligned}
$$
$U := (U_{Y_T}, U_{L_t}, U_{A_t})$ are unmeasured variables from some underlying distribution $P_U$. For now, we do not make any assumptions regarding $P_U$. However, in the data example further below, we need to enforce some restrictions on this distribution. The functions $f_O(\cdot)$ are (deterministic) nonparametric structural equations that assume that each variable may be affected only by variables measured in the past and not those that are measured in the future. Section 4.3 refines the causal model for the data-generating process of the motivating question and represents any additional assumptions made in a DAG.

In this paper, we focus on the differences in intervention-specific means, i.e., in target parameters such as
$$\psi_{j,k} = E(Y^{\bar{d}^j_t}_T) - E(Y^{\bar{d}^k_t}_T), \quad j \neq k. \quad (1)$$
If we set the intervention according to a static or dynamic rule ($\bar{A}_t = \bar{d}^l_t$ for all $t$) with $l \in \{j, k\}$ in the causal model $\mathcal{M}^F$, we obtain the post-intervention distribution $P_{\bar{d}^l_t}$.
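To make the structural equation model and the notion of a post-intervention distribution concrete, the following sketch simulates a toy system with treatment-confounder feedback ($L_t$ depends on past treatment, $A_t$ on the current covariate) and evaluates two static rules by intervening on the treatment equations. All functional forms, coefficients and sample sizes are our own hypothetical choices for illustration, not the model used in this paper:

```python
import numpy as np

# Toy structural equation model with treatment-confounder feedback (our own
# hypothetical coefficients, not the paper's model): L_t depends on past
# treatment, A_t depends on the current covariate, and Y on the final (A, L).
rng = np.random.default_rng(0)
n, T = 10_000, 3

def simulate(rule=None):
    """Draw from the SEM; if `rule` is 0 or 1, intervene by setting every A_t."""
    L_prev = rng.normal(size=n)                # L_0 = f_L0(U_L0)
    A_prev = np.zeros(n)
    for t in range(T):
        L = 0.5 * L_prev + 0.3 * A_prev + rng.normal(size=n)      # f_Lt
        if rule is None:                       # observational treatment draw
            A = (rng.random(n) < 1 / (1 + np.exp(-L))).astype(float)
        else:                                  # static intervention: A_t := rule
            A = np.full(n, float(rule))
        L_prev, A_prev = L, A
    return 1.0 * A_prev + 0.5 * L_prev + rng.normal(size=n)       # Y = f_Y

# Means of the post-intervention distributions under "always treat" vs.
# "never treat"; their difference approximates the ATE implied by the SEM.
ate = simulate(rule=1).mean() - simulate(rule=0).mean()
```

Intervening simply replaces the treatment equation $f_{A_t}$ by a constant while all other structural equations are left untouched, which is exactly how the post-intervention distribution $P_{\bar{d}^l_t}$ arises from $\mathcal{M}^F$.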
The counterfactual outcome $Y^{\bar{d}^l_t}_T$ is the one that would have been observed had $A_t$ been set deterministically to 0 or 1 according to rule $\bar{d}^l_t$. We thus restrict the set of possible interventions to those where the intervention is binary, $A_{t,i} \in \{0, 1\}$.

It has been shown that target parameters of the form (1) can be identified under the (partly untestable) assumptions of consistency, conditional exchangeability and positivity, which are defined below. Specifically, it follows from the work of Bang and Robins (2005) and van der Laan and Gruber (2012) that given these three assumptions, using the iterative conditional expectation rule, and for the particular time ordering as defined in Section 3.2, we can write the target parameter as
$$
\begin{aligned}
\psi_{j,k} &= E(Y^{\bar{d}^j_t}_T) - E(Y^{\bar{d}^k_t}_T) \\
&= E\Big(E\big(\ldots E\big(E(Y_T \mid \bar{A}_{T-1} = \bar{d}^j_{T-1}, \bar{L}_{T-1}) \mid \bar{A}_{T-2} = \bar{d}^j_{T-2}, \bar{L}_{T-2}\big) \ldots \mid \bar{A}_0 = \bar{d}^j_0, L_0\big)\Big) \\
&\quad - E\Big(E\big(\ldots E\big(E(Y_T \mid \bar{A}_{T-1} = \bar{d}^k_{T-1}, \bar{L}_{T-1}) \mid \bar{A}_{T-2} = \bar{d}^k_{T-2}, \bar{L}_{T-2}\big) \ldots \mid \bar{A}_0 = \bar{d}^k_0, L_0\big)\Big). \quad (2)
\end{aligned}
$$
The assumptions of consistency, conditional exchangeability and positivity have been discussed in the literature in detail (Daniel et al., 2011, 2013; Robins and Hernan, 2009; Young et al., 2011; Tran et al., 2019).
Briefly, consistency is the requirement that $Y^{\bar{d}_t}_T = Y_T$ if $\bar{A}_{t-1} = \bar{d}_{t-1}$ and $\bar{L}^{\bar{d}_t}_t = \bar{L}_t$ if $\bar{A}_{t-1} = \bar{d}_{t-1}$. Conditional exchangeability requires the counterfactual outcome under the assigned treatment rule to be independent of the observed treatment assignment, given the observed past: $Y^{\bar{d}_t}_T \perp A_t \mid \bar{L}_t, \bar{A}_{t-1}$ for all $\bar{A}_t = \bar{d}_t$, $\bar{L}_t = \bar{l}_t$ and all $t$. Positivity says that each unit should have a positive probability of continuing to receive the intervention according to the assigned treatment rule, given that this has been done so far, and irrespective of the covariate history: $P(A_t = \bar{d}_t \mid \bar{L}_t = \bar{l}_t, \bar{A}_{t-1} = \bar{d}_{t-1}) > 0$ for all $t, \bar{d}_t, \bar{l}_t$ with $P(\bar{L}_t = \bar{l}_t, \bar{A}_{t-1} = \bar{d}_{t-1}) \neq 0$.

In principle, (conditional) exchangeability can be verified graphically in a DAG using the back-door criterion (Pearl, 2010; Molina et al., 2014); i.e., by closing all back-door paths and by not conditioning on descendants of the intervention. For multiple time-point interventions, a generalized version of this criterion can be used to verify conditional exchangeability. This requires blocking all back-door paths from $A_t$ to $Y_T$ that do not go through any future treatment node $A_{t+1}$ (Hernan and Robins, 2020). More generally, it has been suggested to use single-world intervention graphs to verify exchangeability, particularly to evaluate identification for complex dynamic interventions; see Richardson and Robins (2013) for details.

3.7 Effect estimation with Longitudinal TMLE

The longitudinal TMLE estimator (van der Laan and Gruber, 2012) relies on equation (2). To estimate $\psi_{j,k}$, one can separately evaluate each of the two nested expectation terms and integrate out $\bar{L}_{T-1}$ with respect to the post-intervention distribution $P_{\bar{d}^l_t}$.
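The iterated-conditional-expectation identity in equation (2) can be illustrated with a plain substitution (g-computation) estimator before any targeting is applied. The two-period simulation and the simple linear regressions below are our own illustrative choices; the paper itself uses targeted estimation with super learning rather than parametric models:

```python
import numpy as np

# Illustrative sketch, not the paper's analysis: estimate E(Y^a) for a static
# rule (A0 = A1 = a) via the iterated conditional expectations of equation (2),
# using simulated two-period data with treatment-confounder feedback.
rng = np.random.default_rng(1)
n = 20_000
L0 = rng.normal(size=n)
A0 = (rng.random(n) < 1 / (1 + np.exp(-L0))).astype(float)
L1 = 0.6 * L0 + 0.4 * A0 + rng.normal(size=n)      # confounder affected by A0
A1 = (rng.random(n) < 1 / (1 + np.exp(-L1 + A0))).astype(float)
Y = 1.0 * A1 + 0.5 * A0 + 0.3 * L1 + rng.normal(size=n)

def ols_predict(X, y, X_new):
    """Fit least squares with an intercept and predict on new covariate values."""
    X_d = np.column_stack([np.ones(len(X)), X])
    beta = np.linalg.lstsq(X_d, y, rcond=None)[0]
    return np.column_stack([np.ones(len(X_new)), X_new]) @ beta

def iterated_mean(a):
    # innermost regression E(Y | A1, L1, A0, L0), then set A1 = A0 = a
    Q1 = ols_predict(np.column_stack([A1, L1, A0, L0]), Y,
                     np.column_stack([np.full(n, a), L1, np.full(n, a), L0]))
    # next level E(Q1 | A0, L0), then set A0 = a and average over L0
    Q0 = ols_predict(np.column_stack([A0, L0]), Q1,
                     np.column_stack([np.full(n, a), L0]))
    return Q0.mean()

# true effect by construction: 1 + 0.5 + 0.4 * 0.3 = 1.62
ate_hat = iterated_mean(1.0) - iterated_mean(0.0)
```

The targeting step described next adds an update to each of these nested regressions so that the resulting estimator becomes doubly robust.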
To improve inference with respect to $\psi_{j,k}$, a targeted estimation step at each time point yields a doubly robust estimator of the desired target quantity (see Van der Laan and Rose (2011) or Schnitzer and Cefalu (2017) for details). Specifically, we recur to the following algorithm for $t = T, \ldots, 1$:

1. Estimate $Q_T = E(Y_T \mid \bar{A}_{T-1}, \bar{L}_{T-1})$ with an appropriate model (for $t = T$). If $t < T$, use the prediction from step 3d (of the previous iteration) as the outcome, and fit the respective model. The estimated model is denoted as $\hat{Q}_t$.

2. Now, plug in $\bar{A}_{t-1} = \bar{d}^l_{t-1}$ based on rule $\bar{d}^l_t$, and use the fitted model from step 1 to predict the outcome at time $t$ (which we denote as $\hat{Q}^{\bar{d}^l_t}_t$).

3. To improve estimation with respect to the target parameter, update the initial estimate of step 2 by means of the following regression:

   a) The outcome refers again to the measured outcome for $t = T$ and to the prediction from item 3d (of the previous iteration) if $t < T$.

   b) The offset is the original predicted outcome $\hat{Q}^{\bar{d}^l_t}_t$ from step 2 (iteration $t$).

   c) The "clever covariate" is defined as
   $$H_{t-1} = \prod_{s=0}^{t-1} \frac{I(\bar{A}_s = \bar{d}_s)}{g_{A_s = \bar{d}^l_s}} \quad (3)$$
   with $g_{A_s = \bar{d}^l_s} = P(A_s = \bar{d}^l_s \mid \bar{L}_s = \bar{l}_s, \bar{A}_{s-1} = \bar{d}^l_{s-1})$. The estimate of $g_{A_s = \bar{d}^l_s}$ is denoted as $\hat{g}_{A_s = \bar{d}^l_s}$.

   d) Predict the updated (nested) outcome, $\hat{Q}^{*\bar{d}^l_t}_t$, based on the model defined through 3a, 3b, and 3c. This model contains no intercept. Alternatively, the same model can be fitted with $H_{t-1}$ as a weight rather than a covariate (Kreif et al., 2017; Tran et al., 2019). In this case, an intercept is required. We follow the latter approach in our implementations.

4. The estimate for $E(Y^{\bar{d}^l_t}_T)$ is obtained by calculating the mean of the predicted outcome from step 3d (where $t = 1$).

5. Confidence intervals can, for example, be obtained using the vector of the estimated influence curve; see Tran et al. (2018) for a review of adequate choices.

6. Repeat 1.-5. to estimate $E(Y^{\bar{d}^j_t}_T)$ and $E(Y^{\bar{d}^k_t}_T)$. Now, $\hat{\psi}_{j,k}$ and its corresponding confidence intervals can be calculated.

For an arbitrary distribution $P \in \mathcal{M}$ and a specific intervention rule $g = g(P)$, we consider the statistical model $\mathcal{M}(g) = \{P^* \in \mathcal{M} : g(P^*) = g\}$ for the respective treatment rules $g$. With such a model we could estimate $\psi^*$ with the algorithm described in 3.7. For $\psi^*$ it can be shown (e.g., van der Laan and Rose, 2018) that $\hat{\psi}^*$ is an asymptotically efficient estimator of $\psi^*$, where
$$\sqrt{n}(\hat{\psi}^* - \psi^*) \overset{d}{\longrightarrow} N(0, \sigma^2_*). \quad (4)$$
The variance can be estimated with the sample variance of the estimated influence curve. This is essentially because the construction of the covariate in step 3c guarantees that the estimating equation corresponding to the (efficient) influence curve is solved, which in turn yields desirable (asymptotic) inferential properties. The influence curve emerges from the linear span of the scores (i.e.
first derivative) of the logistic loss for the density of the outcome variable (evaluated at zero) for a given value of the clever covariate (Schnitzer, van der Laan, Moodie and Platt, 2014). Thus, in the longitudinal case, for intervention rules $\bar{g}_t$, these score components can be summed across the points in time, which yields the efficient influence curve
$$\hat{IC}^* = \left\{\sum_{t=1}^{T} \hat{H}^*_{t-1}\left[\hat{Y}^{*\bar{d}_t = \bar{g}_t}_t - \hat{Y}^{*\bar{d}_{t-1} = \bar{g}_{t-1}}_{t-1}\right]\right\} + \hat{Y}^{*\bar{d}_0 = \bar{g}_0}_0 - \hat{\psi}^*. \quad (5)$$

The above estimation procedure is doubly robust, which means that the estimator is consistent as long as either the Q- or the g-models (steps 1 and 3c in the algorithm described above) are estimated consistently (Bang and Robins, 2005). If both are estimated consistently (at reasonable rates), the estimator is asymptotically efficient because the construction of the covariate in step 3c guarantees that the estimating equation corresponding to the efficient influence curve is solved, which in turn yields desirable (asymptotic) inferential properties (Van der Laan and Rose, 2011; Schnitzer and Cefalu, 2017).

To estimate the conditional expectations in the algorithm, one could use (parametric) regression models. Under the assumption that they are correctly specified, this approach would be valid. However, in the context of complex macroeconomic data, as in our motivating example below, it is challenging to estimate appropriate parametric models because of small sample sizes, a large number of relevant variables and complex nonlinear relationships. Longitudinal TMLE can (in contrast to many competing estimation techniques) incorporate machine learning algorithms while still retaining valid inference to reduce the possibility of model misspecification. However, in the settings presented below, machine learning approaches need to be tailored to the specific problem and address the following challenges:

i) Complexity: Macroeconomic relationships are often highly nonlinear and have various interactions of higher order, which need to be modeled in a sophisticated manner while taking into account the time ordering of the data.

ii) Dispensable variables: The inclusion of covariates in the estimation procedure that are not required for identification, i.e., do not block any back-door paths, can potentially be harmful even if they are not colliders or mediators (Schnitzer, Lok and Gruber, 2016); that is, the inclusion of such variables can increase the finite-sample variance and lead to small estimated probabilities of following a particular treatment rule given the past, which may be both incorrectly interpreted as positivity violations and make the updating step in the TMLE algorithm unstable.

iii) p > n: For longitudinal macroeconomic data, the number of parameters is often larger than the sample size. This is because for long follow-up, the whole covariate history needs to be considered, interactions may be nonlinear, and different variables may have different scales and features that need to be modeled adequately. Consequently, one needs to either reduce the number of parameters with an appropriate estimation procedure or eliminate variables beforehand using variable screening. It has been argued that screening of variables is inevitable to facilitate estimation with LTMLE in many settings (Schnitzer, Lok and Gruber, 2016).

Section 4.5 recommends possible approaches to tackle these challenges in common macroeconomic settings.

We accessed databases of the World Bank and the International Monetary Fund to collect annual data for economic, political and institutional variables. Our outcome of interest is inflation in 2010 ($Y$). All covariates are measured annually at equidistant points in time for $t^* = 1998, \ldots, 2010$. Our intervention of interest is CBI ($A_{t^*}$), which we define as suggested by Dincer and Eichengreen (2014): their CBI index measures several dimensions of independence and runs from 0, the lowest level of independence, to 1, the highest level of independence.
It contains considerations such as the independence of the chief executive officer (CEO) and limits on his/her reappointment, the bank's independence in terms of policy formulation, its objective or mandate, the stringency of limits on lending money to the public sector, measures of provisions affecting (re)appointment of board members other than the CEO, restrictions on government representation on the board, and intervention of the government in exchange rate policy formulation. Our outcome variable is defined as the year-on-year changes (expressed as annual percentages) of average consumer prices measured by a CPI. A CPI measures changes in the prices of goods and services that households consume. To calculate CPIs, government agencies conduct household surveys to identify a basket of commonly purchased items and then track the cost of purchasing this basket over time. The cost of this basket at a given time, expressed relative to a base year, is the CPI, and the percentage change in the CPI over a certain period is referred to as consumer price inflation, the most widely used measure of inflation. Our measured covariates are $L_{t^*} = \{L^1_{t^*}, \ldots, L^{18}_{t^*}\}$ and include a variety of macroeconomic variables such as money supply, energy prices, economic openness, institutional variables such as central bank transparency and monetary policy strategies, and political variables (see Figure 1, Table 2 and Baumann et al. (2019) for details). In line with the notation of Section 3, we consider $Y_{t^*}$, $t^* < T = 2010$, to be part of $L_{t^*}$, i.e., we define $L^{19}_{t^*} := Y_{t^*}$.

Our aim was to include as many countries as possible in our analysis. This entailed a tradeoff between the number of countries and the completeness of the data set. We were able to collect annual data from 1998 to 2010 for 124 countries for 14 explanatory variables and for the dependent variable $Y_{t^*}$.
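As a minimal numeric illustration of the outcome definition above, year-on-year consumer price inflation can be derived from a series of annual CPI levels as follows (the CPI values are invented for the example):

```python
def yoy_inflation(cpi):
    """Year-on-year consumer price inflation, in percent, from annual CPI levels."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(cpi, cpi[1:])]

# Hypothetical CPI levels for four consecutive years (the base year is arbitrary)
inflation = yoy_inflation([100.0, 104.0, 106.08, 111.384])  # approx. [4.0, 2.0, 5.0]
```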
We further derived growth rates and other indicators from those measured variables to capture data for all 18 covariates (L_t∗). Some of the data were missing, however. To decide whether the missing data were likely missing not at random (MNAR), and therefore possibly not useful without making additional assumptions, we examined countries' characteristics. We decided that observations for certain variables, countries or groups of countries had to be excluded because they were not available; for instance, wars, insufficiently developed institutions, social unrest or other circumstances sometimes made the collection of data impossible. We split the data set according to our assessment of whether the observation was MNAR. Data that we regarded as missing at random (MAR) (2.7% of the data set) were multiply imputed using Amelia II (Honaker et al., 2011), taking the time-series cross-sectional structure of the data into account. We did not impute data that were likely MNAR. However, some variables that were categorized as MNAR were used in the analysis (e.g., CBI). As a result, we obtained observations for 60 countries and 13 points in time (i.e., calendar years 1998-2010) for 19 measured variables (L_{1,t∗}, …, L_{18,t∗}, Y_t∗ ≡ L_{19,t∗}, A_t∗). In this final data set, 0.1% of observations were missing and thus imputed.

According to the World Bank's income classification, approximately 20% of the remaining 60 countries are low-income countries, 36% belong to the lower-middle-income category, 27% to the upper-middle-income category and 17% to the high-income category. From this country income distribution, we infer that our results reflect the actual heterogeneity present across the world.

Our target parameters are ATEs as defined in (1). To be more specific, consider the following three interventions, of which two are static and one dynamic, each of them applied for all t∗ ∈ {1998, …, 2008}:

d̄^1_{t∗} : a_{t∗} = 1

d̄^2_{t∗,i}(L̄_{t∗−1}) : a_{t∗,i} = 1 if median(L_{t∗−7,i}, …, L_{t∗−1,i}) ≤ 0 or median(L_{t∗−7,i}, …, L_{t∗−1,i}) ≥ 5, and a_{t∗,i} = 0 otherwise

d̄^3_{t∗} : a_{t∗} = 0

A country's central bank is set to be either independent or dependent during the whole time period under the first and third interventions above (i.e., d̄^1_{t∗} and d̄^3_{t∗}). This means that we intervene on the first 11 (i.e., 1998-2008) of the 13 points in time (i.e., 1998-2010), because we assume a two-year lag between the CBI intervention and its effect on inflation. The second (dynamic) intervention sets a country's central bank to be independent if its median inflation rate over the past 7 years was below 0% or greater than 5%. The rationale for this relates to the fact that excessive inflation and deflation over several years are considered to produce harmful effects on a country's economy (see, e.g., Tobin (1965); Fisher (1933)). To guarantee price stability, which excludes inflation beyond a certain level as well as deflation, an independent central bank is required. Over the last twenty years, the optimal level of inflation has been associated with approximately 2%. If a country's inflation is constantly well above this level, in our case 5%, it will change the status of its central bank towards independence. The same holds for an inflation rate systematically falling below zero. Note that for the dynamic intervention d̄^2_{t∗,i}, data prior to 1998 had to be collected and utilized.

We define the following two target parameters:

ψ_{1,3} = E(Y_{d̄^1_{t∗}}) − E(Y_{d̄^3_{t∗}}),   (6)

ψ_{2,3} = E(Y_{d̄^2_{t∗}}) − E(Y_{d̄^3_{t∗}}).   (7)

The first, ψ_{1,3}, quantifies the expected difference in inflation two years after the last intervention (i.e., in 2010) if every country had had an independent central bank for 11 years in a row compared to a dependent central bank for 11 consecutive years.
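The dynamic rule d̄^2 above can be sketched as a simple decision function (a minimal sketch; the inflation histories in the example are hypothetical, while the thresholds and the 7-year window follow the text):

```python
# Sketch of the dynamic rule d̄² above: the bank is set to independent
# (a = 1) at t* when the median of the seven preceding annual inflation
# values lies at or outside the 0%-5% band, and to dependent (a = 0)
# otherwise. Inflation histories below are hypothetical.
from statistics import median

def dynamic_rule(inflation_history, lower=0.0, upper=5.0, window=7):
    """inflation_history: annual inflation values in the years before t*."""
    if len(inflation_history) < window:
        raise ValueError("need at least the last 7 years before t*")
    m = median(inflation_history[-window:])
    return 1 if (m <= lower or m >= upper) else 0

print(dynamic_rule([8, 9, 7, 6, 8, 10, 7]))      # persistent high inflation -> 1
print(dynamic_rule([2, 3, 1, 2, 2, 3, 2]))       # inside the band -> 0
print(dynamic_rule([-1, 0, -2, -1, 0, -1, -2]))  # persistent deflation -> 1
```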
The second, ψ_{2,3}, quantifies the effect that would have been observed if every country's central bank had become independent for those time points at which the country's median inflation in the preceding 7 years had been outside the range from 0 to 5, compared to a strictly dependent central bank for 11 consecutive years (i.e., for the period 1998-2008).

We separate the measured variables into two blocks: the first block, L^A_{t∗}, comprises the covariates measured prior to the intervention node, and the second, L^B_{t∗}, those measured thereafter. In line with Sections 3.2 and 3.4, we do not make any overly restrictive assumptions with respect to our statistical model. First, we assume that our data come from a general true distribution P and are ordered such that

O = (Y_1998, L^A_1998, A_1998, L^B_1998, Y_1999, L^A_1999, A_1999, L^B_1999, …, Y_2009, L^A_2009, A_2009, L^B_2009, Y_2010) iid ∼ P.

In the context of our application, we do not need to make any deterministic assumptions regarding our intervention assignment: a central bank can, in principle, be independent or dependent at any point in time, irrespective of the country's history – and thus be intervened upon. As discussed in Section 3.5, we assume that each variable may be affected only by variables measured in the past and not by those measured in the future. In addition, we make several assumptions regarding the data-generating process, which are summarized in the DAG in Figure 1. Not all variables listed in O are needed during estimation; see Section 4.5.

Figure 1: DAG containing the structural assumptions about the data-generating process for a specific time point t∗. The target quantity is ψ_{j,k} and relates to Y, which refers to Consumer Prices_{t∗}, colored in green. The intervention rules relate to CBI at time t∗, colored in red. Measured covariates are grey, and unmeasured covariates are white. A justification of the DAG is given in Appendix A.

An arrow A → B reflects our belief, corroborated by economic theory, that A may cause B, whereas the absence of such an arrow states that we assume no causal relationship between the two variables in question. Figure 1 has been developed based on economic theory. For example, arrow number 6 describes the causal effect of real GDP (Output) on one component of companies' price setting (Price Markup), which is motivated by the fact that changes in demand (ceteris paribus) in the goods market enable companies to set higher prices in a profit-maximizing environment. Detailed definitions of the considered variables, as well as detailed justification for the assumptions encoded in our DAG, are given in Tables 2 and 3 in the Appendix. The DAG shows the causal pathways through which CBI can affect consumer prices and thus ultimately inflation.
We next explain the main paths from the intervention node to consumer prices. An independent central bank sets its policy tools autonomously to achieve its objective(s). Moreover, an independent central bank is less pressured to pursue an overly expansionary monetary policy that would produce only high inflation. Such a central bank is more likely to live up to its word, which increases its credibility (arrow 74). Higher credibility keeps inflation expectations in check (arrow 32). The more contained inflation expectations are, the lower the demands for nominal wage compensation will be (arrow 75), which, in turn, keeps labor costs (arrow 29), production costs (arrow 23) and companies' prices (arrow 3) low. This will ultimately also be reflected in relatively low consumer prices (arrow 2). Another pathway from the intervention to the outcome acts through monetary policy decisions. Following an intervention, monetary policy makers' time preferences are reduced (arrow 69), and this will be taken into account in their monetary policy decisions (arrow 49). Monetary policy decisions are mirrored in the money supply (arrow 52), which is tantamount to banks' loan creation (arrow 66) and, as a result, affects firms' investment decisions (arrow 67) and thus output (arrow 11). The final stage affects the markups in firms' prices (arrow 6), with a final effect on consumer prices (arrows 4 and 2).

There are several back-door paths from the intervention to the outcome. They all start with arrow 99 because CBI is influenced by past inflation, which also affects current monetary policy decisions (arrow 65). Monetary policy will in turn impact the formation of inflation expectations (arrow 59) or the money supply (arrow 52). Along edges 66, 67, 11, 6, 4 and 2, this affects the outcome.
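The back-door reasoning above can be mechanized. The sketch below uses a hypothetical miniature subgraph, not the full Figure 1 DAG (several mediating nodes are collapsed and arrow numbers are omitted); because the enumerated paths contain no colliders, a path counts as blocked as soon as the conditioning set contains one of its intermediate nodes:

```python
# A toy mechanization of the back-door reasoning above, on a hypothetical
# miniature subgraph (NOT the full Figure 1 DAG; several mediating nodes
# are collapsed). Back-door paths are enumerated by stepping to a parent
# of the treatment and then following directed edges forward; such paths
# contain no colliders, so conditioning on any inner node blocks them.

EDGES = {  # parent -> children
    "PastInflation": ["CBI", "MPDecision", "InflationExpectations"],
    "MPDecision": ["MoneySupply"],
    "MoneySupply": ["BankLoans"],
    "BankLoans": ["Output"],
    "Output": ["ConsumerPrices"],
    "InflationExpectations": ["NominalWages"],
    "NominalWages": ["ConsumerPrices"],
    "CBI": ["Credibility"],
    "Credibility": ["InflationExpectations"],
}

def backdoor_paths(treatment, outcome):
    """Paths that start with an edge INTO the treatment, then run forward."""
    parents = [p for p, children in EDGES.items() if treatment in children]
    paths, stack = [], [[treatment, p] for p in parents]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == outcome:
            paths.append(path)
            continue
        for child in EDGES.get(node, []):
            if child not in path:
                stack.append(path + [child])
    return paths

def blocked(path, conditioning_set):
    """A collider-free path is blocked if we condition on any inner node."""
    return any(v in conditioning_set for v in path[1:-1])

paths = backdoor_paths("CBI", "ConsumerPrices")
print(len(paths))                                         # 2 in this toy graph
print(all(blocked(p, {"PastInflation"}) for p in paths))  # True
```

In this toy graph, as in the paper's argument, conditioning on past inflation alone suffices to block every back-door path.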
Under the assumption that the DAG as motivated in Appendix A is correct, establishing identification in terms of the (generalized) back-door criterion requires the following considerations: some back-door paths that start with an arrow from previous consumer prices into the intervention are subsequently blocked by the collider of monetary policy decisions (that is, along edges 99, 65, 56, etc.). Other back-door paths along edges 99, 65, etc. can be blocked by conditioning on the measured variable past inflation. There are various paths from the intervention to the outcome that start with edges 69, 49 and 52. All those paths contain mediators that one should not necessarily condition on in our example, because otherwise the effect of CBI on inflation through these paths would be blocked (Hernan and Robins, 2020). The same considerations apply to the paths starting with edges 74 and 32.

In summary, our DAG suggests that all back-door paths from A_{t∗} to the outcome (that do not go through any future treatment node A_{t∗+1}) can be blocked by including past inflation in the analysis. As many other variables lie on a mediating path from the intervention to the outcome (i.e., are descendants of A_{t∗}), they should not be conditioned upon. A final consideration suggests that conditioning on past inflation (to block all back-door paths) may also block the indirect effect of CBI on the future outcome along paths 72 and 75, so the estimate of the final effect that includes past inflation might be slightly conservative.

We argue that the developed DAG should serve as the basis for identification considerations and estimation strategies. However, in complex macroeconomic situations, violations of this causal model need to be taken into account, and other estimation strategies may also be useful. We now explain how this can be facilitated.

We can, in principle, follow the algorithm described in Section 3.7 to estimate the target quantity of interest.
This includes estimation of the (nested) outcome models Q̄_{t∗} (step 1) and the intervention models g for A_{t∗} = d̄_{t∗} (step 3c) for each time point. That is, we estimate the g-model for t∗ = 1998, …, 2008 and Q̄_{t∗} for t∗ = 2000, …, 2010. The outcome is Y_T := Y_2010, which corresponds to the value of inflation in 2010, while d̄^1_{t∗}, d̄^2_{t∗,i}(L̄_{t∗−1}) and d̄^3_{t∗} are the interventions targeting CBI as described in Section 4.2.

We consider three approaches to covariate inclusion. The first is based on the identifiability considerations related to our DAG, and the other two refine the variable inclusion criteria for the scenario in which some structural causal assumptions in the DAG may be incorrect.

i) DAG-based approach (PlainDAG): Based on the identifiability arguments from Section 4.4, L̄_{t∗} contains only the relevant baseline variables from 1998 that were measured prior to the first intervention node, as well as past inflation.

ii) Greedy super learning approach (ScreenLearn): This approach uses the full set of measured variables L_{t∗}. It assumes that each variable could potentially lie on a back-door path that remained undiscovered due to misspecification of the causal model. For example, a researcher who argues that bank loans directly affect a central bank's independence (i.e., that there is an arrow from bank loans to CBI) would have to consider a back-door path along arrows 67, 11, 6, 4, 2 and thus include public debt in L_{t∗}. Similarly, if it is doubted that some variables are not necessarily mediators but rather confounders on a back-door path that exists due to unmeasured variables, e.g., CBI → unmeasured variable → Output → … → Consumer Prices, then measured variables such as
Output (real GDP) would also have to be included in L_{t∗}. We suggest that an analysis that includes all measured variables in L_{t∗} can serve as a useful sensitivity analysis to explore the extent to which effect estimates may change under different assumptions.

iii) Economic theory approach (EconDAG): A further approach, termed EconDAG, includes only variables that are measured during a particular transmission cycle, as defined by our DAG; that is, for the Q-model at t∗, only the measured variables between t∗ − 2 and t∗ and the past intervention variables are considered. As above, given the assumed time ordering, only variables from the past, and not from the future, are utilized in the respective models.

Given the complexity of the data-generating process, it makes sense to use machine learning techniques to estimate the respective g- and Q-models. For a specified set of learning algorithms and a given set of data, the method minimizing the expected prediction error (as estimated by k-fold cross-validation) could be chosen. As the best algorithm in terms of prediction error may depend on the given data set, it is often recommended to use super learning instead – and this is what we use for i), ii) and iii). Super learning (Van der Laan et al., 2007) (or "stacking", Breiman (1996)) considers a set of learners; instead of picking the learner with the smallest prediction error, one chooses the convex combination of learners that minimizes the k-fold cross-validation error (for a given loss function; we use k = 10). The weights relating to this convex combination can be obtained with non-negative least squares estimation (which is implemented in the R package SuperLearner, Polley et al. (2017)).
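The paper's estimation relies on the R package SuperLearner; as a language-neutral illustration of the stacking idea, the following minimal Python sketch uses two toy learners, for which the non-negative least squares step can be approximated by a one-dimensional search over the convex weight:

```python
# The paper obtains super learner weights with non-negative least squares
# in the R package SuperLearner. This is a minimal sketch of the stacking
# idea with two toy learners; for two learners the weight search reduces
# to a one-dimensional grid over the convex weight alpha.
import random

def fit_mean(xs, ys):
    """Learner 1: the arithmetic mean (ignores x)."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_ols(xs, ys):
    """Learner 2: univariate least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return lambda x, a=my - b * mx, b=b: a + b * x

def cv_predictions(fit, xs, ys, k=10):
    """Out-of-fold predictions from k-fold cross-validation."""
    n, preds = len(xs), [0.0] * len(xs)
    for fold in [list(range(i, n, k)) for i in range(k)]:
        train = [j for j in range(n) if j not in fold]
        f = fit([xs[j] for j in train], [ys[j] for j in train])
        for j in fold:
            preds[j] = f(xs[j])
    return preds

random.seed(1)
xs = [random.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + random.gauss(0, 1) for x in xs]
p_mean = cv_predictions(fit_mean, xs, ys)
p_ols = cv_predictions(fit_ols, xs, ys)

def cv_error(alpha):
    """CV squared error of alpha * OLS + (1 - alpha) * mean."""
    return sum((alpha * po + (1 - alpha) * pm - y) ** 2
               for po, pm, y in zip(p_ols, p_mean, ys)) / len(ys)

alpha = min((i / 100 for i in range(101)), key=cv_error)
print(round(alpha, 2))  # close to 1: the linear learner dominates here
```

On this deliberately linear toy data the weight concentrates on the regression learner; with the heterogeneous macroeconomic data of the application, the weights spread over many learners (see Figure 4 in the Appendix).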
It can be shown that this weighted combination will perform asymptotically at least as well as the best algorithm, if not better (Van der Laan et al., 2008).

As described in Section 3.7.2, the challenge of model specification, including the choice of appropriate learners and screening algorithms, is to address the complex nonlinear relationships in the data and the p > n problem. Our strategy is to use the following algorithms: the arithmetic mean; generalized linear models (with main terms only and including all two-way interactions); Bayesian generalized linear models with an independent Gaussian prior distribution for the coefficients; classification and regression trees; multivariate adaptive (polynomial) regression splines; generalized additive models; Breiman's random forest; generalized boosted regression modeling; and single-hidden-layer neural networks. The algorithms were carefully chosen to reflect a balance between simple and computationally efficient strategies and more sophisticated approaches that are able to model highly nonlinear relationships and higher-order interactions that may be prevalent in the data. Furthermore, parametric, semiparametric and nonparametric approaches were applied to allow for enough flexibility with respect to committing to parametric assumptions. In particular, tree-based procedures were chosen to handle challenges that frequently come with economic data, for instance outliers. In addition, since some of the continuous predictors are transformed by the natural logarithm, this strictly monotone transformation may affect their variable importance in a regression-based procedure, while trees are not impaired in that respect.

For strategies i)-iii), we use the following learning and screening algorithms:

a) Screening algorithms:
Used only for estimation approach ii) because of the large covariate set compared to the sample size; we used the elastic net (Zou and Hastie, 2005), the random forest (Breiman, 2001), Cramér's V (with either 4 or 8 variables selected at a maximum) and the Pearson correlation coefficient. The screening algorithms were chosen such that at least a subset of them could handle both categorical and quasi-continuous variables well.

b) Learning algorithms:
The 11 learning algorithms mentioned above are the same for estimation strategies i) and iii); i) and iii) were thus estimated with 11 algorithms each. In contrast, strategy ii) benefited from the 5 screening algorithms mentioned in a), and we thus omitted generalized boosted regression modeling from the learner set. In addition, learning algorithms that are applicable in the p > n case were added without prior screening to the 50 (= 5 × 10) algorithms. As a result, when Breiman's random forest and single-hidden-layer neural networks were added without screening, 52 algorithms could be used for strategy ii); see also Figure 4 in the Appendix.
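Correlation-based screening, one of the screening algorithms listed above, can be sketched as follows (variable names and toy values below are hypothetical; the actual analysis uses the implementations available through the SuperLearner package):

```python
# Sketch of correlation-based screening (one of the screening algorithms
# listed above): rank covariates by the absolute Pearson correlation with
# the outcome and keep the top k before handing them to a learner.
# Variable names and values below are hypothetical.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)
               * sum((y - my) ** 2 for y in ys))
    return num / den

def screen_top_k(covariates, outcome, k):
    """covariates: {name: values}; returns the names of the k strongest."""
    ranked = sorted(covariates,
                    key=lambda name: abs(pearson(covariates[name], outcome)),
                    reverse=True)
    return ranked[:k]

y = [1.0, 2.0, 3.0, 4.0, 5.0]
covs = {
    "past_inflation": [1.1, 2.0, 2.9, 4.2, 5.1],  # strongly related
    "noise": [3.0, -1.0, 2.0, 0.5, 1.0],          # essentially unrelated
}
print(screen_top_k(covs, y, k=1))  # ['past_inflation']
```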
The results of our analyses are visualized in Figure 2. Our main analysis (
PlainDAG) suggests that if a country had legislated CBI for every year between 1998 and 2008, it would have had an average reduction in inflation of -0.63 (95% confidence interval (CI): -2.33; 1.07) percentage points in 2010. The other two approaches led to similar results: -0.50 (95% CI: -2.31; 1.30) for
ScreenLearn and -0.86 (95% CI: -2.74; 1.03) for
EconDAG.

Figure 2: ψ̂_{1,3} and ψ̂_{2,3} for the three different estimation strategies.

Similarly, if a country had legislated an independent central bank for every year in which the median of the past 7 years of inflation had been above 5% or below 0% from 1998 to 2008, this would have led to an average reduction in inflation of -0.43 percentage points (95% CI: -2.63; 1.77) in 2010 compared to a dependent central bank (that is, dichotomized CBI = 0) for the same time span, as obtained from the estimation strategy PlainDAG. The other two strategies led to similar conclusions: -0.44 (95% CI: -1.93; 1.04) for ScreenLearn and -0.82 (95% CI: -2.16; 0.52) for EconDAG.

Thus, if there is any inflation-reducing effect from CBI, it is probably small. This is our main finding from a monetary policy point of view. Interestingly, all three estimation approaches led to similar results.

The diagnostics for all analyses are given in Table 1. The cumulative product of inverse probabilities was never below the truncation level of 0.01, which was reassuring. The maximum value of the clever covariates, as defined in (3), was always well below 10, which suggests that the chosen super learning approach worked well. However, the mean clever covariate, which is supposed to be broadly approximately 1, was not ideal for dynamic treatment strategy 2, suggesting that ψ̂_{2,3} should be interpreted with care.

                  ScreenLearn ψ̂_{1,3}   ScreenLearn ψ̂_{2,3}   EconDAG ψ̂_{1,3}    EconDAG ψ̂_{2,3}    PlainDAG ψ̂_{1,3}   PlainDAG ψ̂_{2,3}
Intervention      d̄^1      d̄^3         d̄^2      d̄^3         d̄^1     d̄^3       d̄^2     d̄^3       d̄^1     d̄^3       d̄^2     d̄^3
Trunc. (%)        0.000    0.000        0.000    0.000        0.000   0.000      0.000   0.000      0.000   0.000      0.000   0.000
CC Mean           0.823    0.760        0.803    0.457        0.846   0.828      0.826   0.454      0.830   0.718      0.870   0.490
CC Max.           3.038    3.635        3.071    1.996        3.603   4.709      3.323   2.089      3.214   3.568      3.417   2.429
CC Mean Max.      0.939    0.983        0.879    0.502        0.874   0.942      0.914   0.487      0.878   0.863      0.924   0.553
CC Mean Min.      0.735    0.602        0.731    0.432        0.763   0.642      0.749   0.405      0.781   0.621      0.780   0.442
Table 1: Row 1: percentage of observations that had to be truncated because the cumulative product of inverse probabilities was < 0.01. Rows 2 and 3: mean and maximum value of the clever covariate. All results are averages over the 5 imputed data sets. Rows 4 and 5 contain the minimum and maximum of the five mean clever covariate values across the imputed data sets.

Figure 4 (Appendix) visualizes the learner weight distribution. In our analysis, a multitude of learners and screening algorithms were important, including neural networks, random forests, regression trees and Bayesian generalized linear models.

A naive analysis comparing the mean reductions in inflation between 2000 and 2010 between those countries that had an independent central bank (from 1998 to 2008) and those that had a dependent central bank led to the following results: the mean reduction in inflation between 2000 and 2010 was 2.3 percentage points for those with independent central banks, compared to 1.0 percentage points for those with dependent central banks. The difference in reduction was thus 1.3 percentage points (95% CI: -6.1; 3.5). However, such a crude comparison does not permit a causal interpretation and is not an estimate of ψ_{1,3}.

Motivated by our data analysis, we explore the extent to which model misspecification and the choice of learner sets may affect effect estimation with longitudinal targeted maximum likelihood estimation (and competing methods).
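One of the competing methods evaluated in the simulations is longitudinal inverse probability of treatment weighting. A minimal Hájek-type sketch of that estimator is given below; the per-period treatment probabilities g_t are taken as known here, whereas in the analyses they are estimated (e.g., by super learning), and the units and probabilities are hypothetical:

```python
# Minimal sketch of the longitudinal IPTW comparator: units that followed
# the static rule a_t = 1 at every period are weighted by the inverse of
# their cumulative probability of doing so (Hájek form). The per-period
# probabilities g_t = P(A_t = 1 | past) are taken as known here; in the
# analyses they are estimated, e.g., by super learning.

def iptw_static(units, rule=1):
    """units: list of (treatments A_1..A_T, probs g_1..g_T, outcome Y)."""
    num = den = 0.0
    for treatments, probs, y in units:
        if any(a != rule for a in treatments):
            continue  # unit deviated from the rule at some period
        w = 1.0
        for g in probs:
            w *= 1.0 / (g if rule == 1 else 1.0 - g)
        num += w * y
        den += w
    return num / den

units = [  # hypothetical example data
    ((1, 1, 1), (0.8, 0.7, 0.9), 2.0),
    ((1, 1, 1), (0.5, 0.6, 0.5), 1.0),
    ((1, 0, 1), (0.8, 0.4, 0.9), 3.0),  # dropped: deviates from the rule
]
print(round(iptw_static(units), 3))
```

Tiny cumulative probabilities translate into huge weights, which is one reason such estimates can be volatile; LTMLE's targeting step mitigates, but does not remove, this sensitivity.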
We specified two data-generating processes: a simple one with 3 time points and one time-dependent confounder, and a more complex one with up to 6 time points and 10 time-varying variables. For the first simulation (
Simulation 1), we assume the following time ordering:

O = (L_1, A_1, Y_1, L_2, A_2, Y_2, L_3, A_3, Y_3).

Using the R package simcausal (Sofrygin et al., 2016), we define the preintervention distributions as listed in Table 4 (Appendix). For the second simulation (Simulation 2), we use the following time ordering:

O = (L_{1,1}, …, L_{10,1}, A_1, Y_1, L_{1,2}, …, L_{10,2}, A_2, Y_2, …, L_{1,6}, …, L_{10,6}, A_6, Y_6).

We generated the preintervention data according to the distributions specified in Table 5 (Appendix). For both simulations, we were interested in evaluating ATEs between two static interventions. That is, we were interested in

d̄^{Sim1,1}_{t+} = {a_{t+} = 1 for all t+ ∈ {1, 2, 3}},  d̄^{Sim1,2}_{t+} = {a_{t+} = 0 for all t+ ∈ {1, 2, 3}}

and

d̄^{Sim2,1}_{t++} = {a_{t++} = 1 for all t++ ∈ {1, …, 6}},  d̄^{Sim2,2}_{t++} = {a_{t++} = 0 for all t++ ∈ {1, …, 6}}.

The target parameters of interest are thus

ψ_1 = E(Y_{d̄^{Sim1,1}}) − E(Y_{d̄^{Sim1,2}}),  ψ_2 = E(Y_{d̄^{Sim2,1}}) − E(Y_{d̄^{Sim2,2}}).   (8)

In our primary analysis, we used longitudinal targeted maximum likelihood estimation for both simulations. In a secondary analysis, we also evaluated the performance of (longitudinal) inverse probability of treatment weighting (see, e.g., Daniel et al., 2013, and the references therein). For LTMLE, we considered four different estimation approaches, the first for the first simulation and another three for the second simulation:

i) Estimation as explained in Section 3.7. Q- and g-models were fitted with (generalized) linear models. This is estimation approach
GLM.

ii) Estimation as explained in Section 3.7. Q- and g-models were fitted with a data-adaptive approach using super learning. There were four candidate learners: the arithmetic mean, GLMs, Bayesian generalized linear models with an independent Gaussian prior distribution for the coefficients, and classification and regression trees. No screening of variables was conducted. This is estimation approach L1.

iii) Estimation as explained in Section 3.7. Q- and g-models were fitted with a data-adaptive approach using super learning. The same four learners as in L1 were utilized; however, variable screening with Pearson's correlation coefficient was conducted. In addition, four more learners were added: multivariate adaptive (polynomial) regression splines (Friedman, 1991), generalized additive models, and generalized linear models including the main effects with all corresponding two-way interactions. These additional learners included variable screening with the elastic net. This is estimation approach L2.

iv) Estimation as explained in Section 3.7. Q- and g-models were fitted with a data-adaptive approach using super learning. The eight learning/screening combinations from L2 were used. In addition, single-hidden-layer neural networks were used, once without variable screening and once with elastic-net screening. Finally, the last learner is classification and regression with the random forest. This is estimation approach L3.

We also obtained estimates for the ATE based on IPTW. The estimation of the propensity scores was identical to the estimation of the g-models within LTMLE and is thus also based on the estimation procedures described in i)-iv).

We compared the estimated absolute (abs.) bias and coverage probabilities for the estimated ATEs for the two simulations and for both correctly and incorrectly specified Q-models (see details below).

i)
Simulation 1:
The incorrect, misspecified Q-models omit L := (L_1, L_2, L_3) entirely. By contrast, the g-models were specified such that the entire covariate histories are taken into account. As a result, if no screening is applied (estimation strategies GLM and L1), all relevant variables are used for estimation; with screening (estimation strategies L2 and L3), however, some variables might be omitted.

ii) Simulation 2:
The incorrect, misspecified Q-models omit seven of the time-varying covariates from estimation. Thus, one relevant back-door path remains unblocked, which leads to time-dependent confounding with treatment-confounder feedback. As in Simulation 1, all g-models were specified such that the entire covariate histories are taken into account.

The results after 1000 simulation runs are summarized in Figure 3.

Figure 3: Absolute bias and coverage probability for both simulations – for correctly specified Q- and g-models (Both Correct) and misspecified Q-models (Q Incorrect) of LTMLE.

In Simulation 1, LTMLE provides approximately unbiased estimates even under misspecified Q-models. This is because targeted maximum likelihood estimation is doubly robust, and thus misspecification of either the Q- or the g-models can be handled. However, the coverage probabilities are too high. See Tran et al. (2018) for a discussion of this issue. Under the more complex setup of Simulation 2, there is small bias if both the Q- and g-models contain the relevant adjustment variables (
Both Correct) and learner set L1 is used (Bias = 0.991). The more sophisticated learner sets L2 and L3 yield much better estimates (Bias = 0.158 and 0.144, respectively). With incorrect specification of the Q-model, there is again some bias (Bias = 1.438, 0.639 and 0.663, respectively). Interestingly, for Simulation 2, the most complex estimation approach, with the largest learner set L3, does not produce a substantial improvement over L2. This highlights that a simple increase in the number of learners does not necessarily improve the finite-sample performance of LTMLE, although sufficient breadth and complexity is certainly needed, as seen from the inferior performance of the first learner set.

In Simulation 1, the confidence intervals have too large coverage probabilities. However, in Simulation 2, using L2 and L3 yields (close to) nominal coverage probabilities. Nevertheless, our results highlight the need to develop more reliable variance estimators, such that better overall coverage can be achieved. Note that while LTMLE may produce approximately unbiased point estimates, IPTW does not seem to benefit from complex estimation procedures for the propensity scores (g-models) in the second simulation. The estimates are rather volatile, with some bias and poor coverage probabilities. These conclusions hold for all learner sets considered (Appendix, Figure 5).

We have shown that even for complex macroeconomic questions, it is possible to develop a causal model and implement modern doubly robust longitudinal effect estimators. We believe that this is an important contribution in light of the current debate on the appropriate implementation and use of causal inference for economic questions (Imbens, 2019).
Our suggestion was to commit to a causal model, motivate it in substantial detail (as in Appendix A.2), discuss possible violations of it, and ultimately conduct sensitivity analyses that evaluate effect estimates under different (structural) assumptions.

While the statistical literature has emphasized the benefits of doubly robust effect estimation in conjunction with extensive machine learning (Van der Laan and Rose, 2011), its use in sophisticated longitudinal settings has sometimes been limited due to computational challenges and constraints (Schomaker et al., 2019). We have shown how the use of screening and learning algorithms that are tailored to the question of interest can help to facilitate a successful implementation of this approach. As stressed by Imbens (2019): "[...] models in econometric papers are often developed with the idea that they are useful on settings beyond the specific application in the paper". We hope that both our causal model, i.e., the DAG, and our proposed estimation techniques will be useful in applications other than ours.

Our simulation studies suggest that LTMLE with super learning can yield good point estimates compared to competing approaches, even under model misspecification. However, both the coverage of confidence intervals and the appropriate choice of learners are challenges that warrant more investigation. Recent research confirms that the development of more robust variance estimators is urgently needed (Tran et al., 2018) and that learner selection is becoming more diverse (Gehringer et al., 2018).

From a monetary policy point of view, we conclude that there is no strong support for the hypothesis that an independent central bank necessarily lowers inflation, although our confidence intervals were wide. Future research may investigate whether this finding holds for subgroups of particular countries, such as developing countries, and for different time periods.
However, even if the impact of CBI on inflation seems to be weak, independent central banks could still have beneficial effects on outcomes other than those investigated by us.
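The doubly robust property underlying the estimators used here — consistency if either the outcome (Q) models or the treatment (g) models are estimated consistently — can be illustrated in a minimal point-treatment setting. The paper's longitudinal analysis is implemented in R with the ltmle and SuperLearner packages; the sketch below is a simplified Python analogue with simulated data, an assumed true effect of 1.0, and deliberately misspecified nuisance models, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

L = rng.normal(0.0, 1.0, n)
g = 1.0 / (1.0 + np.exp(-L))          # true treatment mechanism P(A=1 | L)
A = rng.binomial(1, g)
Y = 1.0 * A + 2.0 * L + rng.normal(0.0, 1.0, n)   # true effect of A on Y is 1.0

def aipw(Y, A, ghat, Q1, Q0):
    """Augmented IPW (doubly robust) estimator of E[Y^1] - E[Y^0]."""
    return (np.mean(A / ghat * (Y - Q1) + Q1)
            - np.mean((1 - A) / (1 - ghat) * (Y - Q0) + Q0))

# (a) Correct g-model, deliberately misspecified Q-model (marginal mean only)
Qbar = np.full(n, Y.mean())
est_a = aipw(Y, A, g, Qbar, Qbar)

# (b) Correct Q-model (OLS of Y on A and L), deliberately misspecified g-model
X = np.column_stack([np.ones(n), A, L])
beta = np.linalg.lstsq(X, Y, rcond=None)[0]
Q1 = beta[0] + beta[1] * 1 + beta[2] * L
Q0 = beta[0] + beta[1] * 0 + beta[2] * L
est_b = aipw(Y, A, np.full(n, 0.5), Q1, Q0)

print(round(est_a, 2), round(est_b, 2))
```

In (a) the Q-model is only the marginal outcome mean, yet the correct g-model recovers the effect; in (b) the g-model is a constant 0.5, yet the correct outcome regression recovers it. LTMLE extends this logic to multiple time points.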
References
Alesina, A. and Summers, L. H. (1993), "Central bank independence and macroeconomic performance: some comparative evidence",
Journal of Money, Credit and Banking
Journal of International Money and Finance
44, 118–135.
Arnone, M. and Romelli, D. (2013), "Dynamic central bank independence indices and inflation rate: A new empirical exploration",
Journal of Financial Stability
Journal of Monetary Economics
Biometrics
Working Paper. URL: https://arxiv.org/pdf/2006.06274.pdf
Bell-Gorrod, H., Fox, M. P., Boulle, A., Prozesky, H., Wood, R., Tanser, F., Davies, M.-A. and Schomaker, M. (2019), "The impact of delayed switch to second-line antiretroviral therapy on mortality, depending on failure time definition and CD4 count at failure", bioRxiv. URL:
Bernanke, B. S., Laubach, T., Mishkin, F. S. and Posen, A. S. (2001), Inflation Targeting: Lessons from the International Experience, Princeton University Press.
Bernhard, W., Broz, J. L. and Clark, W. R. (2002), "The political economy of monetary institutions", International Organization
Macroeconomics: A European Perspective, Prentice Hall.
Blinder, A. S. (2000), "Central-bank credibility: why do we care? how do we build it?",
American Economic Review
Review of Industrial Organization
URL:
Breiman, L. (1996), “Stacked regressions”,
Machine Learning
Machine Learning
Macroeconomics: A European Text, OUP Oxford.
Burnside, C. (2005),
Fiscal Sustainability in Theory and Practice: a Handbook, Washington, DC: World Bank.
Cahuc, P., Postel-Vinay, F. and Robin, J.-M. (2006), "Wage bargaining with on-the-job search: Theory and evidence",
Econometrica
URL:
Cargill, T. (1995), “The statistical association between central bank independence and inflation”,
BNL Quarterly Review
Macroeconomics: Institutions, Instability, and the Financial System, Oxford University Press.
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W. and Robins, J. (2018), "Double/debiased machine learning for treatment and structural parameters",
The Econometrics Journal
Journal of Economic Surveys
Journal of Economic Perspectives
European Journal of Political Economy
The World Bank Economic Review
The Stata Journal
Statistics in Medicine.
Decker, A., Hubbard, A., Crespi, C., Seto, E. and Wang, M. (2014), "Semiparametric estimation of the impacts of longitudinal interventions on adolescent obesity using targeted maximum-likelihood: Accessible estimation with the ltmle package", Journal of Causal Inference
Applied Economics
International Journal of Central Banking
Journal of the European Economic Association, in J. B. Taylor and M. Woodford, eds, "Handbook of Macroeconomics", 1 edn, Vol. 1, Part C, Elsevier, chapter 25, pp. 1615–1669.
Fisher, I. (1933), "The debt-deflation theory of great depressions",
Econometrica: Journal of the Econometric Society
The Annals of Statistics, pp. 1–67.
Fuhrer, J. C. (1997), "Central bank independence and inflation targeting: monetary policy paradigms for the next millennium?",
New England Economic Review (Jan/Feb), 19–36.
Garriga, A. C. (2016), "Central bank independence in the world: A new data set",
International Interactions
Epidemiology
Asian Journal of Finance and Accounting
Economic Policy
Journal of the European Economic Association
Econometrica
Journal of Institutional and Theoretical Economics (JITE)
The Foundations of Modern Macroeconomics, Oxford University Press.
Hernan, M. and Robins, J. (2020),
Causal Inference, forthcoming, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Taylor & Francis.
URL:
Honaker, J., King, G. and Blackwell, M. (2011), “Amelia II: A program for missing data”,
Journal of Statistical Software
Statistical Science
URL:
Jácome, L. I. and Vázquez, F. (2008), "Is there any link between legal central bank independence and inflation? Evidence from Latin America and the Caribbean", European Journal of Political Economy
Swiss Journal of Economics and Statistics
American Political Science Review
European Journal of Political Economy.
(a) "Inflation and central bank independence: a meta-regression analysis", Journal of Economic Surveys.
(b) "Central bank independence and inflation revisited", Public Choice
American Journal of Epidemiology
IMF Staff Papers
Journal of International Economics
The Economics of Money, Banking and Financial Markets: European Edition, Pearson.
Mishkin, F. S. (1999), "International experiences with different monetary policy regimes",
Journal of Monetary Economics
American Economic Review
Epidemiological Methods
Journal of Economic Policy Reform
Public Choice
International Journal of Biostatistics
SuperLearner: Super Learner Prediction. R package version 2.0-22.
Richardson, T. and Robins, J. (2013), "Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality",
Center for Statistics and the Social Sciences, University of Washington Working Paper Series
Number 128.
URL:
Robins, J. (1986), "A new approach to causal inference in mortality studies with a sustained exposure period - application to control of the healthy worker survivor effect", Mathematical Modelling.
in G. Fitzmaurice, M. Davidian, G. Verbeke and G. Molenberghs, eds, "Longitudinal Data Analysis", CRC Press, pp. 553–599.
Robins, J. M., Hernan, M. A. and Brumback, B. (2000), "Marginal structural models and causal inference in epidemiology",
Epidemiology.
Rogoff, K. (1985), "The optimal degree of commitment to an intermediate monetary target", The Quarterly Journal of Economics
The Quarterly Journal of Economics
Biometrika
Statistics in Medicine
37, 530–543.
Schnitzer, M. E., Lok, J. and Bosch, R. J. (2016), "Double robust and efficient estimation of a prognostic model for events in the presence of dependent censoring",
Biostatistics
The International Journal of Biostatistics
Biometrics
Annals of Applied Statistics
Statistics in Medicine.
simcausal: Simulating Longitudinal Data with Causal Inference Applications. R package version 0.5.3.
Svensson, L. E. (1997), "Inflation forecast targeting: Implementing and monitoring inflation targets",
European Economic Review.
in "Handbook of Monetary Economics", Vol. 3, Elsevier, pp. 1237–1302.
Taylor, J. B. (1993), "Discretion versus policy rules in practice", in "Carnegie-Rochester Conference Series on Public Policy", Vol. 39, pp. 195–214.
Tinbergen, J. (1930), "Determination and interpretation of supply curves: an example", Zeitschrift für Nationalökonomie
1, 669–679.
Tobin, J. (1965), "Money and economic growth",
Econometrica.
arXiv e-prints, arXiv:1810.03030.
URL: https://arxiv.org/abs/1810.03030
Tran, L., Yiannoutsos, C., Musick, B., Wools-Kaloustian, K., Siika, A., Kimaiyo, S., Van der Laan, M. and Petersen, M. L. (2016), "Evaluating the impact of a HIV low-risk express care task-shifting program: A case study of the targeted learning roadmap",
Epidemiological Methods
International Journal of Biostatistics, in press.
van der Laan, M. J. and Gruber, S. (2012), "Targeted minimum loss based estimation of causal effects of multiple time point interventions",
International Journal of Biostatistics
Statistical Applications in Genetics and Molecular Biology.
Van der Laan, M., Polley, E. and Hubbard, A. (2008), "Super learner", Statistical Applications in Genetics and Molecular Biology
6, Article 25.
Van der Laan, M. and Rose, S. (2011),
Targeted Learning, Springer.
van der Laan, M. and Rose, S. (2018), Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, Springer Series in Statistics, Springer International Publishing.
Vuletin, G. and Zhu, L. (2011), "Replacing a disobedient central bank governor with a docile one: A novel measure of central bank independence and its effect on inflation",
Journal of Money, Credit and Banking.
manuscript, University of California, Santa Cruz. URL: https://iecon.tau.ac.il/sites/economy.tau.ac.il/files/media server/Economics/Sapir/conferences/Carl%20E.Walsh.pdf
Walsh, C. E. (2010),
Monetary Theory and Policy, 3rd edn, The MIT Press, Cambridge, Massachusetts.
Wei, S.-J. and Tytell, I. (2004), "Does financial globalization induce better macroeconomic policies?", 4(84).
URL:
Wright, P. (1934), "The method of path coefficients",
The Annals of Mathematical Statistics
5, 161–215.
Young, J. G., Cain, L. E., Robins, J. M., O'Reilly, E. J. and Hernan, M. A. (2011), "Comparative effectiveness of dynamic treatment regimes: an application of the parametric g-formula",
Statistics in Biosciences
Journal of the Royal Statistical Society: Series B (Statistical Methodology).

A More details on the causal model
A.1 Definition of the variables listed in the DAG
Node | Explanation | Emp. Approx.
Consumer Prices | Price changes in the consumption basket of a representative household. | Inflation (%)
Consumption Tax | Value added tax on the net price of goods and services. | Unmeasured
Pricing by Companies | Firms set their product prices based on production costs and markups to maximize profit. | Unmeasured
Price Markup | Surcharge on marginal cost. It depends on aggregate demand and market power. | Unmeasured
Production Cost | Convenient breakdown of unit costs into labor and non-labor costs. It generally depends on the industry and countries' development. | Unmeasured
Labor Costs | Direct wages, salaries, labor taxes, and social security contributions. | Unmeasured
Non-Labor Costs | Capital, land and intermediate inputs such as intermediate goods, primary commodities and energy. | Unmeasured
Energy Prices | Mainly world market prices for energy resources such as oil, gas and coal. | En. Prices (USD)
Taxes and Social Securities | Labor taxes and social security contributions. | Unmeasured
Market Power | Perfect competition forces firms to set marginal costs equal to prices. This corresponds to a lack of market power. By contrast, product differentiation suggests high market power. | Unmeasured
Output | In a small open economy, output consists of consumption, investments, government spending and net exports. | GDP (USD)
Consumption | Private consumption as a share of disposable household income. This is divided into two components: autonomous consumption and marginal propensity to consume. | Unmeasured
Disposable Income | Consumer income after transfers and taxes. | Unmeasured
Tobin's q | An economic measure that compares the market value of installed capital with the replacement cost of installed capital. A value greater than 1 leads to new investments. If the value is smaller than 1, purchasing existing capital is cheaper than investing in new capital. | Unmeasured
Investments | Purchases of real estate by households and purchases of new capital goods (machines and plants) by firms. | Unmeasured
Nominal Wage | Employees' salaries unrelated to the development of prices or indexation. | Unmeasured
Bargaining Power | Strength of bargaining position of employees in the wage-setting process. | Unmeasured
Labor Unions | Associations that represent the employed labor force in setting wage levels, working conditions and worker rights. | Unmeasured
Labor Productivity | The ratio of output (GDP) to the number of workers. | Unmeasured
Output Gap | Fluctuations of current output (GDP) from its potential. | Out. Gap (%)
Technological Progress | A technological improvement resulting in higher machine productivity. | Unmeasured
Human and Public Capital | Expenses for discovering and developing new ideas and products. | Unmeasured
Inflation Expectations | Expected consumer price level changes approximated by the backward-looking geometric mean of inflation over the past three years. | Inflation (%)
Savings | The sum of accumulated private (and public) savings. Savings can be negative. | Unmeasured
Foreign Output | World output (GDP) depending on foreign consumption, investment and fiscal spending. | Unmeasured
Net Exports | Defined as exports minus the value of imports. | Unmeasured
Real Exchange Rate | Determined by the nominal exchange rate and the domestic and foreign price levels. | Unmeasured
Nominal Exchange Rate | Domestic currency in terms of foreign currency. | Unmeasured
Fiscal Spending | The sum of all government expenditures (on education, consumption, investments, etc.). | Unmeasured
Fiscal Revenue | The sum of fiscal earnings (mainly taxes). | Unmeasured
Primary Balance | Primary surplus/deficit: Government revenues minus government spending excluding interest payments on outstanding debt. | Prim. Bal. (% GDP)
Public Debt | If the government runs a primary deficit in a given year, debt increases. The increase in debt is exacerbated by interest payments on existing debt. | Debt (% GDP)
Debt Management | Decisions of a government on debt structure, potentially resulting in different currency, price and interest-rate indexation composition as well as different maturities of newly issued and outstanding debt. | Unmeasured
Money Demand | Demand for money, defined as currency plus deposit accounts, determined by GDP and the level of interest rates on bonds. | Unmeasured
Money Supply | Different monetary aggregates (M0-M3) are available. For this analysis M2 was used. | M2 Gr. (%)
Nominal Interest Rate | The level of the interest rate is determined by the intersection of money supply and money demand. | Unmeasured
Targeting Regime | Monetary policy strategy introduced in the 1990s intended to stabilize inflation at a pre-announced point target or target range. | Unmeasured
Exchange-Rate Regime | Monetary policy strategy intended to stabilize inflation at a level commensurate with that of a strong currency. By pegging the currency to an anchor country's currency, its monetary policy and, hence, inflation is imported. Deviations from the target exchange rate are corrected by purchases and sales of the pegged currency. | Unmeasured
Capital Openness | Index measuring a country's degree of capital account openness. | Fin. Open.
AS & MH | Adverse selection and moral hazard due to information asymmetries in credit markets. | Unmeasured
Firms' net worth | A firm's total assets minus its total liabilities yields its equity. | Unmeasured
Firms' liquidity | Firms' liquidity is directly linked to their cashflow. Cash is the most liquid asset and is used to meet short-term liabilities. | Unmeasured
Age structure | Demographic indicator that captures the share of the total population older than 65 years. | Age 65 (%)
Trade openness | The sum of imports and exports is set in relation to a country's output. It is a proxy for globalization. | (Imports + Exports) / GDP
Asset Prices | Prices of assets in which households, firms, or governments are able to hold wealth, such as stocks, bonds, bank deposits, cash or real estate. | Unmeasured
Real Interest Rate | The difference between the nominal interest rate and the expected rate of inflation. | Unmeasured
Currency Competition | Governments and central banks are forced to implement disciplined policies since they compete with foreign currencies for capital. The primary mechanism through which greater openness to foreign capital might lead to lower inflation arises presumably from its disciplining effect on monetary policy. | Unmeasured
CB Transparency | Central banks publicly announce their forecasts, policy decisions and assessments of the economy. A central bank's transparency is strongly related to its accountability and its credibility. | Transparency
CB Independence | Independence of a central bank from governmental bodies. Measured via de jure indices (e.g., statutes); see the main text for detailed explanations. | CBI
CB Credibility | A central bank that does what it has announced publicly is considered to be credible. This is reflected in inflation expectations that are low and stable. | Unmeasured
Pol. Instab. | The percentage of veto players dropping from the government in any given year. In presidential systems, veto players are defined as the president and the largest party in the legislature. In parliamentary systems, the veto players are defined as the prime minister and the three largest government parties. | Pol. Stab.
Pol. Instit. | The quality of political institutions. | Civil Liberties
Time Preference | Time horizon envisaged by policymakers within which they want to achieve a certain macroeconomic outcome. It may vary from a short (high time preference) to a middle- to long-term perspective (low time preference). | Unmeasured
Share of Non-Tradables | Distinction between tradeable and non-tradeable goods. Non-tradability means that a good is produced and consumed in the same economy (e.g., haircuts). | Unmeasured
GDP p.c. | GDP is the sum of all finished goods and services that are produced in a year. The p.c. term divides this value by the number of citizens. GDP p.c. is a proxy for economic wealth and living standards. | GDP pc (USD)
Bank Loans | Commercial banks create money when they offer loans depending on the availability of central bank reserves at their disposal. | Credit (% GDP) Gr.
Past Inflation | Median of inflation during the past 7 years. | Inflation (%)
MP Decision | Monetary policy makers' (i.e., central bankers') decisions are contractionary, neutral or expansionary. | Unmeasured
Wealth | Household wealth is accumulated savings over previous periods (it can be negative in the event of net debt) and disposable income in the current period. | Unmeasured

A.2 Explanation for the arrows in the DAG

Arrow | Causality Assumption | Source
1 | Consumer prices can change after changes in consumption taxes (e.g., VAT). | Gelardi (2014)
2 | Consumer prices are set individually by retailers and companies. | Burda and Wyplosz (2010, p. 290)
3 | Production costs generally dominate the price-setting process. Profit margins strongly depend on the industry in question. | Burda and Wyplosz (2010, p. 291)
4 | Channels the aggregate demand side of the price-setting process. In a small open economy, demand shocks to goods and services affect the price level. | Burda and Wyplosz (2010, p. 312)
5 | Higher product differentiation leads to higher market power and higher markups in a profit-maximizing environment. | Burda and Wyplosz (2010, p. 291)
6 | Changes in aggregate demand in the goods market enable firms to set higher prices. | Bloch and Olive (2001)
7 | Expansionary monetary policy, which lowers nominal interest rates, also causes an improvement in firms' balance sheets because it raises their cash flow. The rise in cash flow increases firms' (or households') liquidity. | Mishkin et al. (2013, p. 544 f.)
8 | In a small open economy, domestic demand for goods, and thus output, is also affected by net exports. | Blanchard et al. (2010, p. 125)
9 | Fiscal spending describes the decision of the government to spend money. It affects output (GDP). | Blanchard et al. (2010, p. 45)
10 | Private consumption also affects output. | Blanchard et al. (2010, p. 44)
11 | Investments are another factor affecting output. | Blanchard et al. (2010, p. 44)
12 | The share of disposable income that is not consumed in this period is saved based on the marginal propensity to save. | Blanchard et al. (2010, p. 52)
13 | Governments undertake investments in human capital (e.g., education) or public capital (e.g., infrastructure) to bolster long-run economic growth. | Burda and Wyplosz (2010, pp. 85 ff.)
14 | A Tobin's q not equal to 1 gives incentives to invest or divest in capital and therefore affects aggregate investment. | Burda and Wyplosz (2010, p. 195)
15 | Similar to arrow 13, companies and other non-governmental agents affect human capital. | Burda and Wyplosz (2010, pp. 85 ff.)
16 | The current value of GDP may deviate from its potential. | Burda and Wyplosz (2010, p. 11)
17 | Investments in human capital have a positive impact on innovation and economic development. | Diebolt and Hippe (2019)
18 | Training and education generally lead to high-skilled workers, and in turn, to high productivity of the labor force. | Burda and Wyplosz (2010, pp. 85 f.)
19 | Potential output growth is mainly determined by technological progress. | Burda and Wyplosz (2010, p. 71)
20 | Technological progress indicates higher productivity, and higher productivity can again be expressed as obtaining the same output with fewer inputs (here, lower non-labor costs and higher profits). | Burda and Wyplosz (2010, p. 71)
21 | The first component that determines production costs is non-labor costs (for the second, cf. arrow 23). | Burda and Wyplosz (2010, p. 291)
22 | Changes in energy prices are transmitted through supply shocks and affect the non-labor costs of production. | Burda and Wyplosz (2010, p. 297)
23 | The second component that determines production costs is labor costs. | Burda and Wyplosz (2010, p. 291)
24 | Gross hourly labor costs also include vacation, social security contributions and other benefits paid by employers to the benefit of workers. | Burda and Wyplosz (2010, p. 291)
25 | Higher skills increase workers' bargaining power in the wage-setting process. | Cahuc et al. (2006)
26 | During boom periods, rising employment generally improves the bargaining position of workers. | Burda and Wyplosz (2010, p. 294)
27 | Labor unions generally improve the bargaining position of workers. | Burda and Wyplosz (2010, p. 121)
28 | A better bargaining position leads to a higher wage markup. | Burda and Wyplosz (2010, p. 294)
29 | Nominal wages translate directly into labor costs. | Burda and Wyplosz (2010, p. 292)
30 | Inflation expectations are built on publicly announced inflation targets. | Gürkaynak et al. (2010)
31 | Fiscal revenue increases the government's capacity to spend. | Walsh (2010, p. 136)
[...]
– | Past Inflation can be considered as a summary statistic of past consumer price movements. | By definition.
73 | "The hybrid Phillips curve is an example of how models used in the policy arena seek to overcome unsatisfactory features of both the adaptive expectations Phillips curve (it is empirically successful, but is subject to the Lucas critique; lacks micro-foundations and rational expectations; and lacks a channel for credibility to affect inflation) and the NKPC (which is forward looking and therefore not subject to the Lucas critique; has micro-foundations and rational expectations with a role for credibility, but counterfactual empirical predictions). The hybrid Phillips includes forward-looking inflation expectations but acknowledges that inflation appears to be persistent or inertial, i.e. that it depends on lagged values of itself. ... The hybrid Phillips curve can be rationalized by the assumption that some proportion of firms use a backward-looking rule of thumb to set their inflation expectations while the remainder use forward-looking expectations." | Carlin and Soskice (2015, p. 610)
74 | One way for a central bank to establish credibility is by increasing its independence. | Blinder (2000)
75 | Employees want to protect themselves from a loss in purchasing power, so they embed their inflation expectations into their nominal wages. | Burda and Wyplosz (2010, p. 293)
76 | "Expansionary monetary policy, which causes a rise in stock prices along the lines described earlier, raises the net worth of firms ...". | Mishkin et al. (2013, p. 544)
77 | "The lower the net worth of business firms, the more severe the adverse selection and moral hazard problems in lending to these firms. Lower net worth means that lenders in effect have less collateral for their loans, so their potential losses from adverse selection are higher." | Mishkin et al. (2013, p. 544)
78 | "The lower net worth of businesses also increases the moral hazard problem because it means that owners have a lower equity stake in their firms, giving them more incentive to engage in risky investment projects. Because taking on riskier investment projects makes it more likely that lenders will not be paid back, a decrease in businesses' net worth leads to a decrease in lending and hence in investment spending." | Mishkin et al. (2013, p. 544)
79 | In a more integrated world, competition between currencies is even more present since countries want to attract foreign investments, and this race is exacerbated in a financially integrated world. | Wei and Tytell (2004)
80 | The primary mechanism through which greater openness to foreign capital might lead to lower inflation is presumably some sort of disciplining effect on monetary policy. | Wei and Tytell (2004)
81 | The quality of political institutions might directly influence the relationship between CBI and inflation. The effectiveness of CBI in strengthening credibility and enhancing inflation performance is increased by the presence of multiple political veto players or if checks and balances are sufficiently strong. | Keefer and Stasavage (2003) & Hayo and Voigt (2008)
82 | Political instability can have a number of possible effects. The most commonly discussed of these is that more instability makes it difficult for policy makers to commit to low inflation. | Campillo and Miron (1996, p. 10)

B Additional Material related to the Data Analysis

[Figure 4: distributions of super learner weights for each learner–screening combination (SL.mean, SL.glm, SL.bayesglm, SL.rpart, SL.glm.interaction_info, SL.earth, SL.gam, SL.polymars, SL.randomForest_grid and SL.nnet, each paired with the screen.corPearson, screen.cramersv_grid, screen.randomForest_grid and screen.glmnet_nVar screening algorithms, plus SL.nnet_All and SL.randomForest_grid_All), shown separately for Q-weights and g-weights.]

Figure 4: Distribution of learner weights. The visualized distributions are based on the merged learner weights that resulted from the estimation of Ψ under the interventions d̄_{t∗} considered, summarized across the imputed data sets. The plotted point represents the mean of each distribution. If it is below 0.01, both the distribution and the mean are displayed in red.

C Details on the Simulation Study
C.1 IPTW

[Figure: absolute bias and coverage probability (%), each shown for Simulation 1 and Simulation 2 under a correct g-model and with super learning, by learner set (GLM, L1, L2, L3).]
Figure 5: Absolute bias and coverage probabilities for estimation with IPTW. Bias: 0.009 (GLM), 6.377 (L1), 6.325 (L2), 6.431 (L3) and coverage probability: 99.1 % (GLM), 67.3 % (L1), 66.6 % (L2), 66.3 % (L3).
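To make the IPTW comparison concrete, here is a minimal point-treatment sketch in Python (the paper's estimators are implemented in R, and the data-generating values below are illustrative, not those of the simulations above). It shows the estimator that the figure evaluates in its longitudinal form: reweighting each arm by the inverse probability of the treatment actually received removes the confounding that biases the naive contrast.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# One confounder L drives both treatment A and outcome Y; true effect of A is 1.0
L = rng.normal(0.0, 1.0, n)
g = 1.0 / (1.0 + np.exp(-L))        # propensity score P(A=1 | L), known here
A = rng.binomial(1, g)
Y = 1.0 * A + 2.0 * L + rng.normal(0.0, 1.0, n)

# Naive contrast is biased upwards: high-L units are treated more often
naive = Y[A == 1].mean() - Y[A == 0].mean()

# IPTW (normalized): weight each arm by the inverse probability of treatment
w1, w0 = A / g, (1 - A) / (1 - g)
iptw = np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)

print(round(naive, 2), round(iptw, 2))
```

With the propensity score known, the weighted contrast recovers the true effect while the naive contrast overstates it; in practice the g-model must be estimated, which is where the choice of learners discussed above enters.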
C.2 Data-Generating Processes (DGP)

Table 4: DGP for Simulation 1 (σ² denotes a noise variance that is not legible in this version):

t = 1:
L_t ~ N(0, σ²)
A_t ~ B(expit(L_t))
Y_t ~ N(50 A_t + L_t, σ²)

t = 2, ...:
L_t ~ N(L_{t−1} + A_{t−1}, σ²)
A_t ~ B(expit(L_t + 2 A_{t−1} − L_{t−1}))
Y_t ~ N(50 A_t + L_t + L_{t−1} + Y_{t−1}, σ²)

Table 5: DGP for Simulation 2 (c_j denotes a coefficient that is likewise not legible):

t = 1:
L_t ~ N(0, σ²)
A_t ~ B(expit(L_t))
Y_t ~ N(A_t + L_t, σ²)

t = 2, ...:
L_t ~ N(L_{t−1}, σ²)
A_t ~ B(expit(c₁ L_t + c₂ L_{t−1}))
Y_t ~ N(A_t + L_t + L_{t−1} + c₃ L_{t−1}, σ²)

Additional covariate processes (their distinguishing superscripts are not legible), for t = 1, ...:
L_t ~ N(Y_t, σ²); L_t ~ N(A_t + L_t, σ²); L_t ~ N(Y_t + L_{t−1}, σ²); L_t ~ N(Y_t + L_t, σ²); L_t ~ N(A_t, σ²); L_t ~ N(L_t, σ²); L_t ~ N(L_t, σ²); L_t ~ N(L_t, σ²); L_t ~ N(L_t, σ²); L_t ~ N(L_t + L_t, 0.25)
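A DGP like the one in Table 4 can be simulated directly, and the true value of the target contrast obtained by Monte Carlo under the two static interventions (the paper uses the R package simcausal for this purpose). The Python sketch below follows the Table 4 structure; the number of time points (T = 3), the noise standard deviation (0.5, since the original variances are not legible above) and the seed are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(7)

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate(n, T=3, abar=None, sd=0.5):
    """Simulate a Table-4-style DGP.

    abar: None draws treatment from the observational mechanism;
          0 or 1 applies the static intervention setting every A_t to that value.
    sd:   illustrative noise standard deviation (assumed, not from the paper).
    """
    L = rng.normal(0.0, sd, n)                                   # t = 1
    A = np.full(n, abar) if abar is not None else rng.binomial(1, expit(L))
    Y = rng.normal(50.0 * A + L, sd)
    for _ in range(T - 1):                                       # t = 2, ..., T
        L_prev, A_prev, Y_prev = L, A, Y
        L = rng.normal(L_prev + A_prev, sd)
        A = (np.full(n, abar) if abar is not None
             else rng.binomial(1, expit(L + 2 * A_prev - L_prev)))
        Y = rng.normal(50.0 * A + L + L_prev + Y_prev, sd)
    return Y

# True value of E[Y_T^(abar=1)] - E[Y_T^(abar=0)], approximated by Monte Carlo:
psi = simulate(200_000, abar=1).mean() - simulate(200_000, abar=0).mean()
print(round(psi, 1))
```

Knowing the true intervention-specific means this way is what allows the bias and coverage of LTMLE and IPTW to be evaluated in the simulation study.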