The sooner the better: lives saved by the lockdown during the COVID-19 outbreak. The case of Italy

Abstract

This paper estimates the effects of non-pharmaceutical interventions (mainly, the lockdown) on the COVID-19 mortality rate for the case of Italy, the first Western country to impose a national shelter-in-place order. We use a new estimator, the Augmented Synthetic Control Method (ASCM), that overcomes some limits of the standard Synthetic Control Method (SCM). The results are twofold. From a methodological point of view, the ASCM outperforms the SCM in that the latter cannot select a valid donor set, assigning all the weight to only one country (Spain) while placing zero weight on all the remaining ones. From an empirical point of view, we find strong evidence of the effectiveness of non-pharmaceutical interventions in avoiding losses of human lives in Italy: conservative estimates indicate that, for each human life actually lost, there would have been on average another 1.15 in the absence of lockdown; in total, the policy saved 20,400 human lives.


Roy Cerqueti (a,b), Raffaella Coppier (c), Alessandro Girardi (d,∗), Marco Ventura (e,†)

(a) Department of Social and Economic Sciences, Sapienza University of Rome, Italy
(b) School of Business, London South Bank University, UK
(c) Department of Law and Economics, University of Macerata
(d) Parliamentary Budget Office (PBO), Rome, Italy
(e) Department of Economics and Law, Sapienza University of Rome, Italy

January 29, 2021

∗ Any opinion, finding, and conclusion or recommendation expressed in this material are those of the author(s) and do not necessarily reflect the views of the PBO.

† Corresponding author: Marco Ventura, Department of Economics and Law, Sapienza University of Rome, Via del Castro Laurenziano 9, 00161 Rome, Italy. Fax: +39 (0)6 233 232 419.

Keywords: COVID-19; non-pharmaceutical interventions; Augmented Synthetic Control Method; Italy.

JEL Classification: C19; C23; I18

Exponentially growing threats require a strong and early policy response. In the first wave of the COVID-19 outbreak, the timing of confinement measures played a fundamental role in flattening the contagion curve (Flaxman et al., 2020; Amuedo-Dorantes et al., 2020). However, policy makers hesitate to take resolute measures when threats appear to be limited. This caution is understandable because, if countermeasures work, it will seem in retrospect as if the policy response was an overreaction, possibly causing a loss of consensus (Pisano et al., 2020; The Economics, 2020). Once the curve is flattened, the public will likely blame the incumbent government for the tremendous economic losses caused by social distancing orders without fully grasping their essential role in halting the spread of the viral disease.

This paper adds to this debate by estimating the effects of lockdown (or, more generally, of the so-called non-pharmaceutical interventions) on the propagation of COVID-19, with particular emphasis on the most relevant aspect: saving human lives. Its real-world relevance and socio-economic implications need no further explanation. The task is particularly challenging from a methodological perspective due to typical selection bias problems, which explains the growing interest of econometricians in this research question. At first sight, a natural candidate to face this challenge is the SCM, first introduced by Abadie and Gardeazabal (2003) and developed in the subsequent studies by Abadie et al. (2010, 2015).
At its very essence, the SCM involves the comparison of outcome variables between the treated unit, i.e., the unit affected by the intervention, and similar but unaffected units, reproducing an accurate counterfactual of the unit of [...]

Footnote: Since its introduction, the SCM has been widely used in social sciences and applied to a broad spectrum of topics, spanning from terrorism and crime to natural resources and disasters, political and economic reforms, immigration, education, pregnancy and parental leave, taxation, as well as social connections and local development. Athey and Imbens (2017) have defined it as the most crucial innovation in the policy evaluation literature over the last fifteen years. For a recent survey see Abadie (2020).

[...] vis-a-vis a group of countries in which interventions took place (see Born et al., 2020 and Cho, 2020).

Despite its popularity, however, some real-world circumstances can make this instrument inapplicable, on pain of biased estimates. This may happen for many reasons, such as the use of heterogeneous measurement methodologies whereby the same phenomenon is measured differently in different countries. The case under scrutiny falls exactly within these circumstances. To overcome this problem, we follow the general approach proposed by Ben-Michael et al. (2020), augmenting the SCM with the Ridge regression model and thereby obtaining a Ridge ASCM, which can be written as a weighted average of the control unit outcomes. To the best of our knowledge, the Ridge ASCM has not yet been applied to assess the impact of non-pharmaceutical interventions for the COVID-19 pandemic. Only a couple of papers in the environmental field have used it (notably, on forest fires in Colombia, Amador-Jimenez et al., 2020, and on air pollution in China, Cole et al., 2020).

Our contribution to the literature on the socio-economic effects related to the COVID-19 pandemic is manifold.
First, it sheds light on a controversial policy intervention that has been and still is largely debated. The intervention's effectiveness is evaluated in terms of the most immediate and desired effect, namely avoided deaths. Secondly, from a strictly methodological point of view, it uses one of the most recent advances of the popular SCM, i.e., the Ridge ASCM, which overcomes the non-negligible limit of imperfect pre-treatment fit, which, in turn, would generate a biased estimate from the canonical estimator. Thirdly, the study focuses on Italy, widely acknowledged as a paradigmatic case owing to its pioneering role, immediately after China, in facing the current pandemic. The first Italian COVID-19 cases were registered quite early (January 2020 or even before) and contagion accelerated from its inception. In March 2020, Italy was the country with the highest number of cases apart from China, rapidly becoming the European epicentre of the outbreak, with 207,428 confirmed cases and 28,236 deaths as of the beginning of May 2020 (Ministry of Health). These figures represented approximately 14% of all confirmed cases and 20% of deaths in Europe, and 6% of confirmed cases and just over 11% of deaths worldwide. Moreover, and even more interestingly from our standpoint, Italy was the first Western country in which the government imposed restrictions on mobility, economic activities and social interactions: the already mentioned strict lockdown. The lockdown order was officially in force in Italy from March 9 to May 18, 2020, i.e., for 70 days. The intervention was highly criticized at the time, especially by some media and politicians, possibly because it was the first among Western countries and the importance of acting in a timely manner was not yet completely clear. Nonetheless, many other European countries followed the Italian model within a few weeks, including the UK, which had initially declared itself against this type of intervention.
A long list of scientific contributions witnesses the relevance of the Italian case for understanding the COVID-19 spread and, [...]

Footnote: Other recent advances pertain to multiple treated units (Robbins et al., 2017; Abadie and L'Hour, 2019), extensions of permutation methods (Dube and Zipperer, 2015), denoising the outcome variable and imputation of missing values (Amjad et al., 2018, 2019 and Athey et al., 2018), inference (Chernozhukov et al., 2019a, 2019b; Cattaneo et al., 2019), the role of covariances (Botosaru and Ferman, 2019; Ferman et al., 2020) and bias correction (Powell, 2018; Arkhangelsky et al., 2019; Chernozhukov et al., 2019b).

This section is devoted to the illustration of the Ridge ASCM procedure. To be self-contained and provide a better understanding of this econometric method, we begin with a brief description of the canonical SCM. In its very essence, the SCM aims to simulate the outcome path of a country had it not undergone a particular policy intervention. Operatively, the synthetic control is built as a weighted average of the units in the control group (donor pool), where the weights are chosen so that the synthetic control's outcome closely matches the treated unit's trajectory in the pre-treatment period, while also satisfying some constraints such as being non-negative and adding up to one.
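The weight selection described above can be illustrated with a minimal sketch. It assumes the simplest setting (matching on pre-treatment outcomes only, no penalty term, equal importance for every pre-treatment period) and uses projected gradient descent, which is one of several ways to solve a least-squares problem on the simplex and is not necessarily the solver used in the paper. The function names and the toy data are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold of the last valid k
    return [max(x - theta, 0.0) for x in v]


def scm_weights(x_treated, x_donors, steps=5000):
    """Simplex-constrained least squares by projected gradient descent.

    x_treated: list of T0 pre-treatment outcomes of the treated unit.
    x_donors:  list of donors, each a list of T0 pre-treatment outcomes.
    """
    n, t0 = len(x_donors), len(x_treated)
    w = [1.0 / n] * n  # start from the simple average of the donors
    # Step size 1/L, with L an upper bound on the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * sum(v * v for d in x_donors for v in d) + 1e-12)
    for _ in range(steps):
        fitted = [sum(w[i] * x_donors[i][t] for i in range(n)) for t in range(t0)]
        resid = [fitted[t] - x_treated[t] for t in range(t0)]
        grad = [2.0 * sum(resid[t] * x_donors[i][t] for t in range(t0))
                for i in range(n)]
        w = project_to_simplex([w[i] - lr * grad[i] for i in range(n)])
    return w


# Toy example (made-up numbers): the treated path is the average of donors 0 and 1.
donors = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0, 0.0]]
weights = scm_weights([2.0, 2.0, 2.0], donors)
```

In this toy case the treated unit lies inside the convex hull of the donors, so an exact fit exists with non-negative weights. If the treated path were, say, `[4.0, 4.0, 4.0]`, no point on the simplex could reproduce it.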

More formally, let $Y_{it}(0)$ and $Y_{it}(1)$ represent the potential outcomes for unit $i$, with $i = 1, \dots, N$, at time $t$, with $t = 1, \dots, T$, under control and treatment, respectively. Let $W_i$ be an indicator that unit $i$ is treated at time $T_0 < T$, where units with $W_i = 0$ never receive the treatment. In the SCM, one supposes that only one unit receives treatment and, for ease of reference, this is listed as the first one, $W_1 = 1$, with the remaining $N_0 = N - 1$ units acting as controls and $T = T_0 + 1$. The observed outcomes are then

$$
Y_{it} =
\begin{cases}
Y_{it}(0) & \text{if } W_i = 0 \text{ or } t \le T_0 \\
Y_{it}(1) & \text{if } W_i = 1 \text{ and } t > T_0
\end{cases}
\tag{1}
$$

We also assume that control potential outcomes are generated as a fixed component $m_{it}$ plus a mean-zero additive noise $\varepsilon_{it}$ drawn from some distribution,

$$
Y_{it}(0) = m_{it} + \varepsilon_{it}
\tag{2}
$$

The treated potential outcome is then $Y_{it}(1) = Y_{it}(0) + \tau_{it}$, where the $\tau_{it}$ represent the treatment effects, which are the objects of our estimation and are fixed parameters. Therefore, the treatment effect can be rewritten as $\tau = \tau_{1T} = Y_{1T}(1) - Y_{1T}(0)$. The error terms in the post-treatment period are collected in the vector $\varepsilon_T = (\varepsilon_{1T}, \dots, \varepsilon_{NT})$ and are assumed to be mean-zero and uncorrelated with treatment assignment. That is, the treatment assignment $W_i$ is ignorable given $m_{it}$,

$$
E_{\varepsilon_T}[W_i \varepsilon_{iT}] = E_{\varepsilon_T}[(1 - W_i)\varepsilon_{iT}] = E_{\varepsilon_T}[\varepsilon_{iT}]
\tag{3}
$$

where $E_{\varepsilon_T}$ denotes the expectation taken with respect to the error term $\varepsilon_T$. It follows that the noise terms for the treated and control units do not systematically deviate from each other. Let $X_{it}$ represent pre-treatment outcomes, which are used as covariates alongside other covariates; $X_0$ represents the $N_0 \times T_0$ matrix of control units' pre-treatment outcomes and covariates, and $Y_{0T}$ is the $N_0$-vector of control unit outcomes in period $T$. With only one treated unit, $Y_{1T}$ is a scalar, and $X_1$ is a $T_0$-row vector of treated unit pre-treatment outcomes and/or covariates.
The potential outcome for the treated unit, $Y_{1T}(0)$, is computed by the SCM as a weighted average of the control outcomes, $Y_{0T}'\gamma$, where $\gamma = (\gamma_2, \dots, \gamma_N)$ is the vector of weights. The elements of $\gamma$ are chosen to balance pre-treatment outcomes and other covariates. For our purposes, the SCM can be formalized as the solution with respect to $\gamma$ of the following constrained optimization problem

$$
\min_{\gamma} \; \| X_1 - X_0'\gamma \|_2^2 + \zeta \sum_{W_i = 0} f(\gamma_i)
\quad \text{s.t.} \quad \sum_{W_i = 0} \gamma_i = 1, \quad \gamma_i \ge 0 \;\; \forall\, i : W_i = 0
\tag{4}
$$

where the constraints limit $\gamma$ to the unit simplex and where $\| X_1 - X_0'\gamma \|_2^2 \equiv (X_1 - X_0'\gamma)'(X_1 - X_0'\gamma)$ is the squared 2-norm on $\mathbb{R}^{T_0}$. The simplex constraint in (4) ensures that the weights will be sparse and non-negative, while the hyperparameter $\zeta > 0$ [...] $X_1$, is inside the convex hull of the control units' lagged outcomes and covariates, $X_0$. Due to possible high dimension, however, achieving perfect pre-treatment fit is not always feasible with weights constrained to be on the simplex, and in these cases Abadie et al. (2015) recommend against using SCM. Thus, the conditional nature of the analysis is critical to deploying SCM, exclud[...]

Footnote: Notice that since pre-treatment outcomes are included in the $X$ matrix, “pre-treatment fit” and “covariance balancing” are equivalent expressions.

[...]

\begin{align}
\hat{Y}^{aug}_{1T}(0) &= \sum_{W_i = 0} \hat{\gamma}^{scm}_i Y_{iT} + \Big( \hat{m}_{1T} - \sum_{W_i = 0} \hat{\gamma}^{scm}_i \hat{m}_{iT} \Big) \tag{5} \\
&= \hat{m}_{1T} + \sum_{W_i = 0} \hat{\gamma}^{scm}_i \big( Y_{iT} - \hat{m}_{iT} \big) \tag{6}
\end{align}

where $\hat{\gamma}^{scm}_i$ is the estimated $i$-th SCM weight. In this context, the canonical SCM is a special case in which $\hat{m}_{iT}$ is constant. Albeit fully equivalent, equations (5) and (6) highlight two distinct features of the Ridge ASCM. From (5) it appears clear that the Ridge ASCM corrects the SCM estimate, $\sum_{W_i = 0} \hat{\gamma}^{scm}_i Y_{iT}$, by the imbalance in a particular function of the pre-treatment outcomes, $\hat{m}(\cdot)$. Intuitively, since $\hat{m}(\cdot)$ estimates the post-treatment outcome, we can view this as an estimate of the bias due to imbalance, analogous to bias correction for inexact matching (Rubin, 1973; Abadie and Imbens, 2011). Therefore, the SCM and Ridge ASCM estimates will be similar if the estimated bias is small. By contrast, equation (6) is similar in spirit to standard doubly robust estimation (Robins et al., 1994), which begins with the outcome model but then re-weights to balance residuals.

Given these premises, the choice of the estimator $\hat{m}(\cdot)$ is important both to understand the properties of the procedure and for practical performance. Notably, there are some attractive features of estimating $\hat{m}(\cdot)$ via Ridge regression, which is linear in both pre-treatment outcomes and comparison units. This case is referred to as the Ridge ASCM. Here, the estimator of the post-treatment outcome is $\hat{m}(X_i) = \hat{\eta}^{ridge}_0 + X_i'\hat{\eta}^{ridge}$, where $\hat{\eta}^{ridge}_0$ and $\hat{\eta}^{ridge}$ are the coefficients of a Ridge regression of control post-treatment outcomes $Y_{0T}$ on centered pre-treatment outcomes $X_0$ with penalty hyperparameter $\lambda^{ridge}$:

$$
\big\{ \hat{\eta}^{ridge}_0, \hat{\eta}^{ridge} \big\} = \arg\min_{\eta_0, \eta} \sum_{W_i = 0} \big( Y_{iT} - (\eta_0 + X_i'\eta) \big)^2 + \lambda^{ridge} \| \eta \|_2^2
\tag{7}
$$

The Ridge ASCM estimator is then:

$$
\hat{Y}^{aug}_{1T}(0) = \sum_{W_i = 0} \hat{\gamma}^{scm}_i Y_{iT} + \Big( X_1 - \sum_{W_i = 0} \hat{\gamma}^{scm}_i X_i \Big) \cdot \hat{\eta}^{ridge}
\tag{8}
$$

When augmenting with Ridge regression, the implied weights are themselves the solution to a penalized synthetic control problem, as in the standard SCM problem. Nevertheless, while the original SCM constrains weights to be on the simplex, this does not occur with the Ridge ASCM. Indeed, when the treated unit lies outside the convex hull of the control units, the Ridge ASCM improves the pre-treatment fit relative to the SCM by allowing for negative weights and extrapolating away from the convex hull.
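Equations (7) and (8) can be sketched in a few lines of code. The example below is only an illustration under simplifying assumptions: the SCM weights are taken as given, the Ridge slopes are obtained from the normal equations of (7) with an unpenalised intercept, and the linear system is solved by plain Gaussian elimination. All function names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_slopes(x_donors, y_donors, lam):
    """Slopes of a Ridge regression of y on centred x, as in equation (7)."""
    n, p = len(x_donors), len(x_donors[0])
    x_mean = [sum(row[j] for row in x_donors) / n for j in range(p)]
    y_mean = sum(y_donors) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in x_donors]
    yc = [y - y_mean for y in y_donors]
    # Normal equations: (Xc'Xc + lam * I) eta = Xc'yc
    gram = [[sum(xc[i][j] * xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    rhs = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    return solve_linear(gram, rhs)


def ridge_ascm_estimate(x_treated, x_donors, y_donors, scm_w, lam):
    """Equation (8): SCM estimate plus a Ridge correction for the imbalance."""
    eta = ridge_slopes(x_donors, y_donors, lam)
    scm_estimate = sum(w * y for w, y in zip(scm_w, y_donors))
    imbalance = [x_treated[j] - sum(w * row[j] for w, row in zip(scm_w, x_donors))
                 for j in range(len(x_treated))]
    return scm_estimate + sum(d * e for d, e in zip(imbalance, eta))


# Made-up data in which the donors' outcome is exactly y = 1 + 2*x1 + 3*x2.
x_donors = [[1.0, 3.0], [3.0, 1.0], [0.0, 0.0]]
y_donors = [12.0, 10.0, 1.0]
scm_w = [0.5, 0.5, 0.0]
estimate = ridge_ascm_estimate([4.0, 4.0], x_donors, y_donors, scm_w, lam=0.0)
```

The treated unit `[4.0, 4.0]` lies outside the convex hull of the donors, so the SCM weights leave an imbalance of `[2.0, 2.0]` that the Ridge term extrapolates over. With a very large `lam` the slopes shrink towards zero and the estimate falls back to the plain SCM value.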
The Ridge ASCM directly penalizes the distance from the sparse, non-negative SCM weights, controlling the amount of extrapolation by the choice of $\lambda^{ridge}$, and only resorts to negative weights if the treated unit is outside of the convex hull. When the treated unit is in the convex hull of the control units, so that the SCM weights exactly balance the lagged outcomes, the Ridge ASCM and SCM weights are identical. When the SCM weights do not achieve an exact balance, the Ridge ASCM solution will use negative weights to extrapolate from the convex hull of the control units. The amount of extrapolation is determined both by the amount of imbalance and by the hyperparameter $\lambda^{ridge}$. When SCM yields good pre-treatment fit or when $\lambda^{ridge}$ is large, the adjustment term will be small and $\hat{\gamma}^{aug}$ will remain close to the SCM weights; in that case, Ridge ASCM and SCM weights will be equivalent and the estimation error will only be due to the variance of the weights and post-treatment noise.

It follows that $\lambda^{ridge}$ plays a crucial role and its value must be derived optimally. Operatively, one possibility is to follow the in-time placebo check proposed by Abadie et al. (2015). Let $\hat{Y}^{(-t)}_{1k} = \sum_{W_i = 0} \hat{\gamma}^{aug}_{i(-t)} Y_{ik}$ be the estimate of $Y_{1k}$ obtained excluding time period $t$ from the sample. The idea consists in comparing the difference $Y_{1t} - \hat{Y}^{(-t)}_{1t}$ for some $t \le T_0$ as a placebo check. We can extend this idea to compute the leave-one-out cross-validation Mean Squared Error (MSE) over time periods:

$$
CV(\lambda^{ridge}) = \sum_{t = 1}^{T_0} \Big( Y_{1t} - \hat{Y}^{(-t)}_{1t} \Big)^2
\tag{9}
$$

The cross-validation procedure then chooses either the $\lambda^{ridge}$ that minimizes (9) or the maximal value of $\lambda^{ridge}$ with MSE within one standard error of the minimal MSE, as suggested by Hastie et al. (2009), among others. Such a choice trades off overfitting, i.e. a too small $\lambda^{ridge}$, and biased estimates, i.e. a too large $\lambda^{ridge}$; indeed, if $\lambda^{ridge} \to \infty$ the ASCM is equivalent to the SCM.

2.2 Data sources and variables construction

As mentioned above, the ASCM procedure's goal is to evaluate the impact of a lockdown on the most immediate and desired effect, namely the number of avoided deaths. Specifically, the outcome variable Y is the mortality rate, defined as the cumulative death count per million population (dth), taken from the Epidemic Intelligence team of the ECDC (European Centre for Disease Prevention and Control). Since daily reported figures for deaths tend to be challenging to compare and qualify across countries, due to possible confounding idiosyncratic socioeconomic differences related to health care systems and population ageing, we also consider several covariates, X, that are expected to be linked to the outcome variable. Accordingly, among the predictors, we include variables capturing the COVID-19 dynamics, such as cumulative cases per million population (num), which are intuitive predictors of mortality rates. The second group of predictors includes variables capturing the “resilience” of each country's health system. This subset includes the number of hospital beds per hundred thousand population (hsp), under the assumption that the more developed the health system, the less fatal the COVID-19 infection will be. As for the outcome variable, the source for num and hsp is the Epidemic Intelligence team of the ECDC. Following Sá (2020) and Rocklöv and Sjödin (2020), among others, we also include socioeconomic characteristics that are likely to be (positively) related to mortality rates; accordingly, the median age (age) and the average household size (hld) are added to the set of covariates. All demographic variables are taken from the United Nations report (United Nations, 2019).
We also control for “mobility trends” across different categories of places and behavior changes derived from Google Mobility Reports, which collect percentage changes in visits and length of stay at different places relative to a baseline given by the median values of the same day of the week from January 3, 2020, to February 6, 2020. Following Chernozhukov et al. (2021), we focus on four out of six mobility sub-indices (namely, “Grocery and Pharmacy”, “Transit Stations”, “Retail and Recreation” and “Workplaces”).

Footnote: For ease of reference, henceforth the simpler expression “ASCM” will be used to refer to the “Ridge ASCM”.

Footnote: Cho (2020) pointed out that identifying the most appropriate outcome variable to assess real epidemiological effects is controversial. On the one hand, endogenous cross-country differences in testing rates regarding eligibility and accessibility limit the usefulness of the cases of infection. On the other hand, daily death counts might be affected by measurement problems because some jurisdictions include both confirmed and probable cases of deaths, as opposed to others only reporting confirmed cases.

Footnote: This type of data is collected from smartphones, with an initial level of “normal conditions” set to 0. When the data fall below the base value, this suggests that some forms of mobility constraints have been imposed in a specific area, so that average mobility decreases.

[...] mob) by following a “non-model based” aggregation scheme, as discussed in Marcellino (2006). The subsequent logical step is identifying the donor states to form the synthetic control unit. When constructing a reliable counterfactual, it is well understood that the relationship between the predictors and the outcome variable in the donor pool must be as similar as possible to the relationship in the treated unit. Accordingly, the selection of the donor pool's candidate elements should be carried out by identifying countries sharing some key similarities with the treated one.
In the present context, geographical proximity is a crucial factor to be considered, as the spread of the pandemic has not been homogeneous across space and over time, moving from Asia in late 2019 to Europe at the beginning of 2020 and, subsequently, to the Americas. Given our focus on the Italian case, an obvious choice is to select the donor pool's elements among European countries. Accordingly, we have included all members of the European Union (except Luxembourg) plus Switzerland, Norway, and the United Kingdom (28 countries in total).

Since the daily evolution of the mortality rate at the individual country level reflects different diffusion patterns at a given point in time, we have normalized the time unit such that “day 1” refers to the day on which cumulative infection cases per million exceed one in the treated country, as in Cho (2020). In our case, “day 1” corresponds to February 23, 2020, with the lockdown policy enacted on March 9. Therefore, in our setup, the pre-treatment period consists of 15 daily observations. Because the treated state is contrasted with the control unit after treatment, the relevant policy under scrutiny (non-pharmaceutical interventions in our context) should not be enacted in any donor pool state during the study period. Accordingly, our sample's ending date is given by the date when lockdown measures took place in the synthetic counterfactual. To identify such an average date, we have used an ad hoc index elaborated by the Oxford COVID-19 Government Response Tracker, namely the Stringency Index, SI, which collects standardized information on several differ[...]

Footnote: Specifically, the four sub-indices are standardised to have zero mean and unit standard deviation. This step prevents the resulting (simple) average index, which is calculated in the subsequent step, from being dominated by variables with a particularly pronounced degree of volatility or an incomparably high absolute mean.
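The time normalisation described above (“day 1” is the first day on which cumulative cases per million exceed one) can be sketched as follows. The helper names and the short case series are hypothetical and serve only to show the mechanics; the two calendar dates are those stated in the text.

```python
from datetime import date, timedelta


def first_day_exceeding(cumulative_per_million, threshold=1.0):
    """Index of the first observation strictly above the threshold, or None."""
    for idx, value in enumerate(cumulative_per_million):
        if value > threshold:
            return idx
    return None


def align_to_day_one(dates, cumulative_per_million, threshold=1.0):
    """Map each calendar date to event time, with day 1 at the first exceedance."""
    start = first_day_exceeding(cumulative_per_million, threshold)
    if start is None:
        return {}
    return {d: i - start + 1 for i, d in enumerate(dates) if i >= start}


# Made-up series that crosses one case per million on February 23, 2020.
dates = [date(2020, 2, 20) + timedelta(days=i) for i in range(5)]
event_time = align_to_day_one(dates, [0.2, 0.5, 0.9, 1.3, 2.0])

# Length of the pre-treatment window implied by the dates given in the text.
day_one = date(2020, 2, 23)
lockdown = date(2020, 3, 9)
pre_treatment_days = (lockdown - day_one).days
```

Because 2020 is a leap year, the window from February 23 to March 8 contains 15 days, which matches the 15 pre-treatment observations reported above.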

Resorting to the ASCM makes it possible to choose in a transparent way the weights used to build the counterfactual for the treated unit (Abadie et al., 2010, p. 494). Nonetheless, this advantage is weakened by a lack of consensus on how (and which) covariates should be chosen. Due to this approach's relative infancy, there are not enough papers to formally test for specification searching (Brodeur et al., 2016). On the one hand, using all lagged outcome variables avoids the problem of omitting potentially irrelevant covariates because it eliminates all other predictors' effects (Kaul et al., 2018), so that the synthetic counterfactual is created regardless of the other predictors' values. This specification is the one that minimizes the Root Mean Squared Prediction Error (RMSPE) in the pre-treatment period and is not subject to arbitrary decisions. On the other hand, it makes all the other covariates irrelevant, threatening the estimator's unbiasedness in the post-treatment predictions (Ferman et al., 2020). Given this lack of guidance, focusing on the specification that uses all the pre-treatment outcome lags as matching variables is generally recommended (Ferman et al., 2020), unless there is a strong prior belief that it is crucial to balance on a specific set of covariates (as in the present context). Moreover, optimizing the dependent variable's pre-treatment fit and ignoring the covariates can be quite misleading: the more the covariates are truly influential for future values of the outcome, the larger the potential bias of the estimated treatment effect may become. Therefore, economic theory and the researcher's intuition play a relevant part in the context of the ASCM as well.
Building on the relevant literature on the topic discussed in the Introduction, we have considered the following sets of covariates: variables related to the resilience of the health care system (captured by hsp), to the demographic structure of the population (represented by age and hld), as well as to the pandemic dynamics (epitomized by num and mob). Operatively, we first consider [...]

Footnote: This result is essentially attributable to the fact that covariates are fitted rather poorly when all outcome lags are used, introducing a bias that can be substantial even for reasonably long-treatment time spans.

[...] hsp, age and hld). Finally, in the third group of specifications, from (c1) to (c5), we extend the set of predictors by including lagged time-varying predictors averaged over time in a way which is consistent with the pre-treatment outcome values in (b1) to (b5), respectively.

As a preliminary step, we compute the Average Treatment Effect on the Treated (ATT), that is, the average deviation of the counterfactual series from the actual one over the treatment period (from March 9, 2020, to April 11, 2020), for each specification (a0)-(c5), in order to identify the specifications with a negative and statistically significant gap, consistent with our priors. Since more than one specification satisfies these conditions, we follow the recommendation by Ferman et al. (2020) of presenting results for many different specifications and, in particular, we include the specification (a0) as a benchmark. The upper part of Table 1 presents an overview of all the ASCM specifications that we consider in the analysis, while the last row reports the p-value associated with the ATT computed for the corresponding specification.
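The ATT defined above is simply the mean gap between the actual and the synthetic series over the treatment period. A minimal sketch, with a hypothetical function name and made-up numbers, and with the gap signed as actual minus synthetic so that avoided deaths show up as a negative value:

```python
def average_treatment_effect(actual, synthetic, first_treated_index):
    """Mean of (actual - synthetic) over the post-treatment observations."""
    gaps = [a - s for a, s in zip(actual[first_treated_index:],
                                  synthetic[first_treated_index:])]
    if not gaps:
        raise ValueError("no post-treatment observations")
    return sum(gaps) / len(gaps)


# Made-up cumulative mortality paths: two pre-treatment and two post-treatment days.
actual = [1.0, 2.0, 3.0, 4.0]
synthetic = [1.0, 2.0, 5.0, 8.0]
att = average_treatment_effect(actual, synthetic, first_treated_index=2)
```

Here the synthetic series runs above the actual one after treatment, so the resulting ATT is negative.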
INSERT HERE TABLE 1

The results show that all of the ATTs are negative and statistically significant at the 5 percent level (or better), calling for a criterion to combine the test statistics for the individual specifications and distil them into a summary test statistic (Imbens and Rubin, 2015). Specifically, we assume that the test function is simply a weighted average of the test statistics for individual specifications. The same equally-weighted scheme is applied to combine each specification into a synthetic statistic (Christensen and Miguel, 2018; Cohen-Cole et al., 2009). In this vein, Figure 1 shows the treatment effects, defined as the differences between the mortality rate in Italy and the synthetic control over the evaluation period, averaged [...]

Footnote: In all the specifications, we select the hyperparameter $\lambda^{ridge}$ as the largest $\lambda$ within one standard error of the $\lambda$ that minimizes the cross-validation placebo fit CV($\lambda$), as discussed in Section 2.1 above. The results obtained under the alternative rule of picking the minimal $\lambda$ are almost identical to those reported in the main text.

[...]

Though suggestive, the visual evidence presented above is insufficient to ensure proper implementation of the ASCM. Its practical use calls for the fulfilment of three conditions: (I) only the treated unit is affected by the policy change assessed over the post-treatment period; (II) the counterfactual outcome can be approximated by a fixed combination of donor states; (III) the policy change has no effect before it is implemented.
The procedures to align country-specific variables to a common starting date, as well as the definition of a general rule to identify the (average) treatment date for the donor set (i.e., the last observation of our sample), discussed in Section 2.2 above, help to address point (I); in what follows we focus on the remaining two conditions.

As for point (II), Figure 3 reports the estimated weights according to both the ASCM and the canonical SCM for each country belonging to the donor pool. The donor pool identified by the SCM consists of just one element (Spain), with zero weight attached to the remaining 27 countries. In contrast, in the case of the ASCM, a richer structure of the weights emerges. With regard to the mortality rate, this is in line with Cho (2020), who observes that death counts might be affected by measurement problems because some jurisdictions include both confirmed and probable cases of deaths, as opposed to others only reporting confirmed cases. In the specific case of Italy, Cerqua et al. (2020) claim that the official count of deaths due to COVID-19 is likely to under-report the phenomenon. In our context, the mechanics behind the ASCM allow us to extrapolate out of the convex hull and to assign non-zero weights to a higher number of countries. In more detail, there is confirmation of a relevant role for Spain (0.917), along with France (0.730) and Poland (0.583), with (relatively smaller) positive weights attached to several other countries as well (namely Belgium, Bulgaria, Denmark, Greece, the United Kingdom, the Netherlands, and Slovenia). In contrast, Germany, Switzerland, and Ireland do not contribute to the synthetic unit's construction, while the remaining 15 countries are associated with negative weights (ranging from -0.312 for Norway to -0.076 for Estonia). Overall, from an applied viewpoint, the ASCM can identify three European countries where the pandemic severity peaked compared to other donor pool candidate units.
At the same time, resorting to the ASCM seems a legitimate choice from a methodological standpoint, given the documented difficulty of building up a donor pool within the canonical SCM. In such a case, SCM and ASCM estimates tend to diverge. Specifically, ASCM estimates are likely to rely heavily on extrapolation from the convex hull of the control units in order to improve pre-treatment fit (Ben-Michael et al., 2020), by allowing for negative weights (if the treated unit is outside the convex hull) in place of the sparse and always non-negative weighting structure of the standard SCM. Moreover, resorting to a CV procedure makes it possible to obtain the optimal amount of extrapolation through the choice of $\lambda^{ridge}$ in condition (9).

INSERT HERE FIGURE 3

Turning to point (III), the synthetic outcome is expected to closely match the treated outcome's temporal profile during the pre-treatment period. Thus, as a preliminary step, Table 2 reports the average values over the pre-treatment period of the predictors for Italy (“actual”) and the counterfactual control (“synth”), where the latter is constructed with the ASCM weights assigned to the elements of the donor pool as detailed in Figure 3. Overall, the synthetic control unit provides a much better-matched profile of Italy along the predictors than the simple average of all countries in the donor pool (“donor”), suggesting that the ASCM-based selection of weights is more appropriate for building a control unit than choosing the weights subjectively by means of, for instance, the simple average across all the donor units.
INSERT HERE TABLE 2

Such a requirement is necessary to ensure that the comparison of the outcome paths during the post-treatment period provides insight into the effect of the treatment: when the dynamics of the synthetic control and the treated entity tend to diverge, the treatment presumably caused the difference; in contrast, if both paths display similarities in the treatment period, the treatment does not appear to have affected the outcome. Figure 4 compares the temporal profile of the mortality rate for both actual and synthetic Italy, separating pre- and post-treatment periods: the upper panel assesses the quality of fit in the pre-treatment period and refers to the first 15 daily observations since cases per million exceeded one (corresponding to the temporal window from February 23, 2020, to March 8, 2020); the lower panel displays the difference between the two series over the entire sample span, which also includes the post-treatment period ranging from March 9, 2020 (when the policy intervention took place) to April 11, 2020.

INSERT HERE FIGURE 4

While the cumulative mortality rate in the synthetic control unit closely overlaps the actual series during the pre-treatment period, there is a visible divergence a few days after the policy intervention date (vertical line in the lower panel), when the synthetic unit starts following a much steeper path than the actual series. The resulting gap, defined as the difference between the actual series and its synthetic control, turns out to be negative and statistically significant, with an ATT of -132.9 and a p-value of 0.000. In more detail, we find that the cumulative mortality rate in synthetic Italy exceeds 670 per million population roughly five weeks after the lockdown intervention, while the corresponding figure for actual Italy is as low as 310.
This evidence suggests that, with a (negative) gap of around 340 deaths per million, Italy's mortality rate would have been higher by over 115 percent had there not been the policy intervention. Moreover, the gap from the actual series turns out to be statistically significant at the 95 percent confidence level according to the confidence interval computed by the jackknife+ approach of Barber et al. (2019).

Confidence intervals such as those reported in Figure 4 have only recently been introduced in the SCM literature (see also Cattaneo et al., 2019), so that the statistical significance of the post-treatment gap is typically assessed by resorting to permutation techniques (the so-called placebo test; Abadie et al., 2010). Nonetheless, permutation-based tests can convey useful information in the present context to support our empirical findings. In what follows, we discuss three types of sensitivity tests: placebo in-space, placebo in-time, and leave-one-out tests.

Under the "placebo in-space" test, the ASCM is sequentially applied to each country in the donor pool as though it were the treated unit, using the remaining members of the pool as donors, as before. The resulting placebo unit is then compared with its synthetic counterpart. Comparing the difference between the treated unit and its synthetic control with the differences between placebo countries and their controls makes it possible to better evaluate the effectiveness of the policy intervention on the treated unit.

As the Root Mean Square Prediction Error (RMSPE) measures the gap between the variable of interest for the treated country and its synthetic counterpart, it is possible to calculate a set of RMSPE values for the pre- and post-treatment periods for each unit considered in the analysis. Consequently, the RMSPE of the treated country after the treatment is expected to be large relative to its value before treatment. On the other hand, placebo units should not see a substantial increase in their RMSPE following the treatment.
For this reason, Table 3 reports the RMSPE pre/post-treatment ratio of each donor country divided by the same quantity computed for the treated country, Italy. Whenever the entry in the table is less than 1, it indicates a relatively higher difficulty when forecasting future outcome values for Italy. The share of RMSPE ratios above one is then used to obtain a p-value for Italy, which measures the probability of observing a ratio as high as the one obtained for Italy if one were to pick a country at random from the potential controls.

INSERT HERE TABLE 3

Overall, the RMSPE ratios turn out to be well below unity, suggesting that the actual path of the treated unit diverges from its synthetic control after the intervention in a much more substantial way than for the countries belonging to the donor pool. The resulting p-value is (2/29=) 0.069, as Italy ranks second out of 29 countries, which falls within the conventional range of statistical significance used in the relevant literature. A remarkable exception is Belgium's case, with an RMSPE ratio slightly above unity; nonetheless, the associated ATT has the opposite sign to the expected one, in a way similar to what emerges for the other eight countries. There is no evidence of a statistically significant ATT for three entities of the donor pool, while for the remaining cases the estimated ATT ranges from -0.7 (for Croatia) to -34.5 (for Greece and Hungary).

As a further sensitivity test, we run the "in-time placebo" test, in which the donor pool remains fixed and the treated unit is always Italy, but the treatment date is re-assigned to occur during the pre-treatment period, as devised by Abadie et al. (2015). Moreover, this placebo model's sample period must end when the actual treatment occurred (day 15, in our context) to avoid capturing its effects. Operationally, the in-time placebo test is conducted under the assumption that the treatment occurred on day 8, roughly in the middle of our pre-treatment period.
Apart from the lockdown date, we apply the baseline setup's exact setting and use the same predictor variables, including lagged outcome values for the first half of the pre-treatment period. As Figure 5 shows, our synthetic Italy for a placebo treatment on day 8 closely follows the path of actual Italy, not only during the first half of the baseline pre-treatment period but also in the second part of the sample, with an estimated gap barely different from zero (continuous black line), also according to the 95 percent confidence region. Similar results are obtained when the fictitious treatment date is assigned to days 10 and 12 (corresponding to two-thirds and three-fourths of the baseline pre-treatment period, respectively). According to the dotted and dashed lines in Figure 5, significant reductions in the mortality rate for these two fake lockdown dates cannot be found over the actual pre-treatment period. Overall, the in-time placebo test shows that the placebo estimate resembles the actual pre-treatment path closely enough to give us confidence that our main findings are not due to chance, ruling out the possibility that the above-discussed difference between synthetic and actual Italy arises for reasons other than the treatment.

INSERT HERE FIGURE 5

INSERT HERE FIGURE 6

The third sensitivity check we consider is the leave-one-out test (Abadie et al., 2015), where the model is re-estimated leaving out one donor country at a time to assess whether any single donor unit is driving the results. Figure 6 shows all leave-one-out synthetic gaps (thin grey lines) and the mean value across all of them (dashed line). It emerges that the average gap across all leave-one-out estimates closely matches the baseline gap that includes all donor countries in terms of ATTs (-128.2 and -132.9, respectively), giving further support to the robustness of our findings.
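The permutation logic behind the in-space placebo p-value and the leave-one-out check can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used in the paper: the helper names are hypothetical, the RMSPE ratio is written in the conventional post/pre form, and `average_gap` is a deliberately naive stand-in (an equally weighted donor mean) for the ASCM estimator, which is not reimplemented here.

```python
import math
from statistics import mean


def rmspe(actual, synthetic):
    """Root mean square prediction error between two equal-length series."""
    return math.sqrt(
        sum((a - s) ** 2 for a, s in zip(actual, synthetic)) / len(actual)
    )


def post_pre_ratio(actual, synthetic, n_pre):
    """Post-treatment RMSPE divided by pre-treatment RMSPE."""
    pre = rmspe(actual[:n_pre], synthetic[:n_pre])
    post = rmspe(actual[n_pre:], synthetic[n_pre:])
    return post / pre


def placebo_p_value(treated_ratio, placebo_ratios):
    """Share of units (treated included) with a ratio at least as large
    as the treated unit's ratio, i.e. rank / number of units."""
    ratios = [treated_ratio] + list(placebo_ratios)
    rank = sum(1 for r in ratios if r >= treated_ratio)
    return rank / len(ratios)


def average_gap(treated, donors):
    """Stand-in estimator (assumption): mean gap between the treated series
    and the equally weighted average of the donor series."""
    synthetic = [mean(values) for values in zip(*donors.values())]
    return mean(t - s for t, s in zip(treated, synthetic))


def leave_one_out(treated, donors, estimator=average_gap):
    """Re-run the estimator once per donor, dropping that donor each time."""
    return {
        left_out: estimator(
            treated, {k: v for k, v in donors.items() if k != left_out}
        )
        for left_out in donors
    }


if __name__ == "__main__":
    # Hypothetical ratios: 28 placebo units, one of them above the treated unit.
    print(placebo_p_value(10.0, [1.0] * 27 + [11.0]))
    toy_donors = {"A": [2.0, 3.0], "B": [4.0, 5.0], "C": [3.0, 4.0]}
    print(leave_one_out([1.0, 2.0], toy_donors))
```

With 28 placebo units and the treated unit ranking second, the function returns 2/29, which matches the way the p-value is computed in the text. Passing a different `estimator` callable would allow the same leave-one-out loop to wrap a proper ASCM routine.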

This paper is the first contribution that uses the ASCM to evaluate the effectiveness of non-pharmaceutical interventions against COVID-19. Evidence has been provided for Italy, the first Western country to implement shelter-in-place orders after China. The paper shows how the ASCM helps remove bias from a naive application of the canonical SCM. Indeed, the latter estimator shrinks the donor pool to only one country, i.e., Spain, generating de facto an estimate of the effect as a bare difference in means between Italy and Spain. Constraining SCM weights to the unit simplex may be too restrictive, especially when it is hard to reproduce accurate synthetic pre-treatment dynamics. Our empirical case falls precisely in this circumstance, and the ASCM overcomes the problem by assigning negative weights to some donor units. As already pointed out by Ben-Michael et al. (2020), since the ASCM removes the non-negativity constraint and allows for extrapolation outside of the convex hull, the pre-treatment fit from the ASCM turns out to be at least as good as the pre-treatment fit from the SCM alone. Our evidence suggests that applied economists should compare weights obtained from the SCM and the ASCM and opt for the former only if the two are not entirely different. The socio-economic relevance of the issue analyzed in this contribution makes the importance of such a comparison even more evident, because the estimates may significantly diverge, and different conclusions may be based on biased estimates.

The results provide extensive evidence on the effectiveness of non-pharmaceutical interventions in avoiding human deaths and preventing health care systems from collapsing in the present COVID-19 era. According to our benchmark estimate, for each life lost, the policy has saved 1.15 lives. In other words, while the cumulative mortality rate has recorded 310 lives lost per million population, without the lockdown policy it would have been 670, that is, 340 lives saved per million population.
With a total Italian population of 60 million in 2020, the policy has produced (60*340=) 20,400 lives saved. It is important to stress that this figure can be considered conservative because the sample span we can use for the econometric exercise is shorter than the total temporal horizon over which the policy has been implemented, due to the lack of non-treated units from a specific point in time onwards. Similar desirable results, albeit not comparable in magnitude, have been found by Friedson et al. (2020) for California, while for the case of Sweden, Cho (2020) estimates 25 percent of lives lost attributable to the non-treatment decision. Due to limited external validity and the different methodologies applied, the figure by Cho (2020) is hardly comparable with ours; nevertheless, a more substantial effect in Italy is quite reasonable because of structural differences between the two countries in terms of (lower) endowment of hospital beds, (older) median age of the population and (larger) average household size (for Italy).

As a possible extension, one could think of extending and projecting the findings up to the last day of the policy (i.e., a further 36 days up to May 18, 2020) and re-calculating the total effect of the policy. One can also consider relating the findings of this paper to the economic damages caused by the lockdown measure, while from a strictly methodological point of view, a further possible extension consists in constructing a formal test of the equality in the mean of the weights generated by the SCM and the ASCM. These issues are beyond the scope of the present work and are left for further research.
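The back-of-the-envelope arithmetic behind the headline figure can be reproduced in a few lines. The sketch below is only an illustration using the rounded numbers quoted in the text; the function names are hypothetical. Note that the rounded levels (670 and 310 per million) differ by 360, so the sketch takes the gap of about 340 per million directly from the text for the total, and uses the levels only for the relative effect.

```python
def lives_saved(gap_per_million, population_millions):
    """Total lives saved = gap in deaths per million times population (millions)."""
    return gap_per_million * population_millions


def extra_deaths_per_actual_death(synthetic_rate, actual_rate):
    """Additional counterfactual deaths for each death actually recorded."""
    return (synthetic_rate - actual_rate) / actual_rate


if __name__ == "__main__":
    # Rounded figures quoted in the text (deaths per million, population in millions).
    print(lives_saved(340, 60))
    print(extra_deaths_per_actual_death(670, 310))
```

The first call reproduces the (60*340=) 20,400 figure, while the second gives a ratio slightly above 1.15, in line with the statement that mortality would have been higher by over 115 percent.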

References

Gap plot for the benchmark case

Note: The horizontal axis indicates the days after treatment. The dashed line is the difference between the mortality rate in Italy and the synthetic control from the specification with all pre-treatment outcome values (a0). The solid black line is the gap plot obtained by averaging the synthetic control over all the alternative specifications. The shaded area is the confidence region computed as ± [...].

Table: Full set of specifications

NOTE: dth(i): i-th lag of cumulative death counts per million population; hsp: number of hospital beds per hundred thousand population; age: median age; hld: average household size; num(.): cumulative cases per million population; mob(.): mobility indicator; (.) indicates the average value over the lags chosen for the dependent variable; (*) stands for the average value over the entire pre-sample period; ATT = Average Treatment effect on the Treated.

Columns: specifications (a0) to (a5), five (b) specifications and five (c) specifications. Rows: dth(i) lags, dth(*), hsp, age, hld, num(.), mob(.), ATT, p-value.

Gap plots for alternative specifications

Note: The horizontal axis indicates the days after treatment. The solid black line is the difference between the mortality rate in Italy and the synthetic control (gap) from the specification (c4) with the first half of the pre-treatment outcome values averaged over time and structural time-invariant covariates (hsp, age, hld), as defined in Section 3.1. The dotted and dashed lines represent the gap plot obtained by taking the average and the median of all the alternative specifications, respectively. The hyphenated line is the gap from the specification (c5) with the first three-fourths of the pre-treatment outcome values averaged over time and structural time-invariant covariates (hsp, age, hld), as defined in Section 3.1.

Figure 3: Comparison between SCM and ASCM weights

Note: The picture reports the weights from the SCM (white) and from the ASCM (black).

Table 2: Balancing table.

NOTE: dth(i): i-th lag of cumulative death counts per million population; hsp: number of hospital beds per hundred thousand population; age: median age; hld: average household size; num(.): cumulative cases per million population; mob(.): mobility indicator.

          Actual    Synth     Donor
dth(1)    0.033     0.026     0.001
dth(2)    0.033     0.079     0.008
dth(3)    0.099     0.109     0.010
dth(4)    0.182     0.124     0.011
dth(5)    0.199     0.182     0.019
dth(6)    0.282     0.253     0.023
dth(7)    0.348     0.356     0.039
hsp       3.180     3.641     4.777
age       45.500    44.034    42.007
hld       2.400     2.383     2.449
num(.)    6.341     5.066     6.002
mob(.)    4.143     4.943     2.492

(a) Pre-treatment period; (b) Pre- and post-treatment periods

Figure 4: Actual and synthetic Italy

Note: In both panels, the horizontal axis indicates the days since deaths per million exceeded one. The profile of Italy is shown by the solid line, while its synthetic counterfactual is shown by the dashed line. The vertical line in Panel (b) represents the lockdown date. The shaded area is the confidence interval computed by means of the jackknife+ approach of Barber et al. (2019).

Figure 5: In-time placebo

Note: The horizontal axis indicates the days since deaths per million exceeded one. The black solid, dotted and dashed lines are the gap plots when the fictitious treatment date is assigned to half, two-thirds and three-fourths of the actual pre-treatment period, respectively. The shaded area is the 95 percent confidence interval computed by means of the jackknife+ approach of Barber et al. (2019).

Figure 6: Leave-one-out

Note: The horizontal axis indicates the days since deaths per million exceeded one. The black dashed line is the gap plot of the baseline specification obtained with the entire set of donor countries. The grey lines are the leave-one-out gap plots obtained by removing one country at a time from the donor pool of the baseline specification.

Table 3: In-space placebo.