Latent Causal Socioeconomic Health Index

Abstract

This research develops a model-based LAtent Causal Socioeconomic Health (LACSH) index at the national level. We build upon the latent health factor index (LHFI) approach that has been used to assess the unobservable ecological/ecosystem health. This framework integratively models the relationship between metrics, the latent health, and the covariates that drive the notion of health. In this paper, the LHFI structure is integrated with spatial modeling and statistical causal modeling, so as to evaluate the impact of a continuous policy variable (mandatory maternity leave days and government's expenditure on healthcare, respectively) on a nation's socioeconomic health, while formally accounting for spatial dependency among the nations. A novel visualization technique for evaluating covariate balance is also introduced for the case of a continuous policy (treatment) variable. We apply our LACSH model to countries around the world using data on various metrics and potential covariates pertaining to different aspects of societal health. The approach is structured in a Bayesian hierarchical framework and results are obtained by Markov chain Monte Carlo techniques.


By F. Swen Kuh, Grace S. Chiu and Anton H. Westveld

The Australian National University; William & Mary’s Virginia Institute of Marine Science; University of Washington; University of Waterloo

1. Introduction.

The gross domestic product (GDP) has been conventionally used as a measure when benchmarking different countries’ growth and production. However, the commonly used GDP arguably captures only one aspect, the economic performance of a country, rather than a country’s overall performance and wellbeing. Consequently, there has been much ongoing discussion and effort to find an alternative ‘wellbeing’ indicator as a holistic measure of a country’s socioeconomic health [Conceição and Bandura (2008)]. Such wellbeing indices are useful for governments and organizations to benchmark a country’s overall performance (other than solely economic) and help policy makers form evidence-based decisions. Despite that, there are issues with existing methods that attempt to quantify this health/wellbeing feature. For instance, some methods combine multiple sources of subjectivity and arbitrarily turn them into a single score without rigorously quantifying the uncertainties around the score [New Economics Foundation (2016); Organisation for Economic Co-operation and Development (OECD) (2016); United Nations Development Programme (2018)], while others measure a country’s wellbeing using a chosen proxy variable such as the life satisfaction score [Sachs et al. (2018)], which is not a direct measurement of the variable of interest. Health and wellbeing are increasingly being accepted as multidimensional concepts that often involve multiple subjective and objective measures on the macro- and micro-levels [McGillivray and Clarke (2006); Yang (2018)]. We recognize that the concept of wellbeing is inevitably subjective and we focus on reducing the subjectivity in the quantifiable measures through statistical inference of the country’s socioeconomic health as a model parameter.

Keywords and phrases: LACSH index, Bayesian inference, causal inference, LHFI, latent health, hierarchical model, spatial modeling, generalized propensity score.

arXiv [stat.ME]

This paper proposes a LAtent Causal Socioeconomic Health (LACSH) index by developing a hierarchical, latent variable framework to simultaneously model each country’s health as a latent parameter, account for spatial correlation among countries, and evaluate the causal impact of a policy variable on the latent health. This new methodology contributes to the aforementioned effort towards a holistic approach by addressing the subjectivity and uncertainty propagation through a single statistical inferential framework. The LACSH index is an adaptation of the latent health factor index (LHFI) method [Chiu et al. (2011); Chiu, Wu and Lu (2013)] to quantify the country’s ‘health’ $H$ as a latent parameter. Our work extends the treatment of underlying ecosystem health as unobservable and latent in Chiu et al. (2011) and Chiu, Wu and Lu (2013) to the assessment of societal health for countries.

Note that the approach to measuring latent traits is not unique, as the idea appears in item response theory (IRT) in the psychometrics literature [Rabe-Hesketh and Skrondal (2004)]. Other examples include the quantification of the position of political actors on a political spectrum [Jackman (2001); Martin and Quinn (2002)], constructing measures of nations’ underlying democracy [Treier and Jackman (2008)], and assessing ecological/ecosystem health [Chiu et al. (2011); Chiu, Wu and Lu (2013)]. Rijpma et al. (2016) model the wellbeing of countries also as a latent variable, similar to the special-case LHFI model that regresses health indicators on $H$ alone. In contrast, the general LHFI model further regresses $H$ on covariates that are chosen due to their perceived explanatory nature to health.
In this paper, our holistic framework further incorporates spatial and causal modeling structure into the LHFI framework.

In applying our work, we quantify the latent health of the countries using data collected at the national level. Observable variables (e.g. gross national income (GNI) per capita, life expectancy, mean years of schooling) are treated as either indicators or drivers/covariates of a country’s underlying health condition as opposed to measures of health. We use ‘health’ and ‘wellbeing’ interchangeably to capture the notion of a country’s socioeconomic performance from the social, political, economic and environmental perspectives simultaneously. For the rest of the paper we will continue to refer to this holistic notion as (latent) health when referring to both the model parameter and the concept of wellbeing. As national-level variables tend to be spatially dependent [Ward and Gleditsch (2018)], we incorporate a spatial modeling structure into the LHFI framework to formally model this dependency among the countries.

In addition to the quantification of the latent health of countries, the incorporation of causal modeling into our framework enables further insight into the effect of a policy variable on the health of a country. Propensity score adjustment for reducing confounding bias in observational studies has been used widely in the literature since the seminal paper by Rosenbaum and Rubin (1983). Subsequently, there have been ample discussions [An (2010); McCandless et al. (2010); Kaplan and Chen (2012); Zigler et al. (2013)] on modeling the uncertainty associated with the inference of the propensity score, as reflected by McCandless, Gustafson and Austin (2009) who model the uncertainty under a Bayesian framework to evaluate the impact of statin therapy on mortality of myocardial infarction patients.
We extend this idea by using the generalized propensity score framework for continuous treatment to estimate a dose-response function [Hirano and Imbens (2004); Imai and Van Dyk (2004)]. We evaluate the impact of varied doses of a ‘policy treatment’ variable (in our case, mandatory maternity leave (MML) days and domestic general government health expenditure (GGHE) per capita) on a country’s health. Including this notion of ‘policy treatment’ in our model allows a model-based assessment of the effect of a policy variable on the (latent) health of a country, in the context of counterfactuals.

To elaborate on the above elements, our paper is laid out as follows. In the next section, we briefly review the methodologies used to construct some of the existing socioeconomic health indices. Section 3 introduces the countries’ data, and Section 4 discusses the methodology and building blocks we will employ to construct our latent socioeconomic health for nations. In Section 5, we propose a framework (using the building blocks discussed in Section 4) which is applied to the data, and highlight some of the results from our models; we also discuss a new visualization technique for assessing covariate balance under the generalized propensity score framework. In Section 6, we revisit the data by providing an in-depth discussion of the specifics of the data and model structure we have used. Finally, we review the limitations of our work and conclude the paper by discussing some potential future work in Section 7. Appendices A–E in the supplemental document contain further details on computation, posterior distributions, and additional insights.

2. Review on existing indices.

Global and regional indices.

There is an increasing awareness that the GDP has been inappropriately used as a broader benchmark measure for overall welfare among countries [Kubiszewski et al. (2014)]. Several methods have been proposed as an alternative measure to the GDP, but existing approaches have used variables such as the life evaluation score or ‘happiness’ as a proxy measure of a country’s health (or subjective wellbeing) [Sachs et al. (2018); Conceição and Bandura (2008)]. This is also problematic, as a country’s health is a multidimensional concept as aforementioned. We review five such alternative indices in Table 2.1. The background and components contributing to these five and other indices have been discussed by Hashimoto, Oda and Qi (2018) and Kubiszewski et al. (2014), but here we focus on the statistical methodology being used. Note that 2 out of these 5 indices assume equal weighting of pre-specified variables that contribute to a country’s health. There appears to be little justification that the concept of health is represented by equal parts of a wide variety of variables, apart from convenience.

TABLE 2.1
Selection of existing indices and methodology used

| Index | Statistical methodology |
|---|---|
| United Nations Human Development Index (HDI) | Arithmetic means of different variables are computed, then a geometric mean of the arithmetic means is computed to form the HDI |
| World Happiness Report | Pooled ordinary least squares regression (from econometrics) of the national average response to the survey question of life evaluations on 6 categories of variables hypothesized as underlying determinants of the nation’s ‘happiness score’ |
| Social Progress Index (SPI) | First, a principal component analysis (PCA) is used to determine the weighting of indicators within each component, and the weights and indicators are multiplied to obtain component scores. Next, component scores are transformed onto a scale of 0-100, an arithmetic mean is computed for each dimension, and another arithmetic mean is computed to obtain the final SPI |
| Happy Planet Index (HPI) | The variables ‘experienced wellbeing’ and ‘life expectancy’ are multiplied, then divided by ‘ecological footprint’; scaling constants are used to map the final HPI to range from 0-100 |
| Organisation for Economic Co-operation and Development (OECD) Better Life Index (BLI) | OECD BLI website user-specified weights are assigned to each topic (e.g. education, income, etc.), and up to four indicators which constitute each topic are assigned equal weights to form the final BLI |
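To make the HDI row of the table concrete, the following is a minimal sketch of that style of aggregation: an arithmetic mean within each dimension, followed by a geometric mean across dimensions. The function names, dimension names, and input values are hypothetical and are not taken from the HDI documentation; the inputs are assumed to be already scaled to the interval (0, 1].

```python
from math import prod


def dimension_means(dimensions):
    """Arithmetic mean of the (pre-scaled) indicators in each dimension."""
    return [sum(vals) / len(vals) for vals in dimensions.values()]


def hdi_style_index(dimensions):
    """Geometric mean of the per-dimension arithmetic means."""
    means = dimension_means(dimensions)
    return prod(means) ** (1.0 / len(means))


def equal_weight_index(dimensions):
    """Plain arithmetic mean of the dimension means, for comparison."""
    means = dimension_means(dimensions)
    return sum(means) / len(means)


# Hypothetical, illustrative values only (not real country data).
example = {
    "health": [0.9],
    "education": [0.7, 0.8],
    "income": [0.6],
}

geometric = hdi_style_index(example)
arithmetic = equal_weight_index(example)
print(f"geometric: {geometric:.4f}, arithmetic: {arithmetic:.4f}")
```

Both aggregations return a single point score with no accompanying measure of uncertainty, which is the limitation that the model-based approach in this paper is designed to address.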

3. Data.

We collated our data from the years 2010–2016 from publicly available databases. We consider 15 metrics, $Y$, 2 treatment variables, $T$, and 4 covariates, $X$, shown in Table 3.1.

Most of the metrics and covariates employed in our models are taken from the data section in the United Nations Human Development Report, which is sourced from various organizations and the World Bank database. Specifically, the POLITY variable is sourced from the Polity IV project [Center for Systemic Peace (2016)], and Corruption Perception Index from the Transparency International website [Transparency International (2018)]. Other relevant variables (e.g. literacy rate among adults in the country) were not included in our model due to a substantial amount of missing data.

TABLE 3.1
List of variables $X$, $T$ and $Y$

| Role | Variable |
|---|---|
| $X$, covariate | Forest area ♯ |
| $X$, covariate | Access to electricity, rural |
| $X$, covariate | Mean years of schooling |
| $X$, covariate | Population, total ♭ |
| $T$, treatment variable | Federally mandated maternity leave days (MML) |
| $T$, treatment variable | Domestic general government expenditure on health (GGHE) per capita ♭ |
| $Y$, metric | Education index ⋆ |
| $Y$, metric | Population density ♭⋆ |
| $Y$, metric | Popn., urban (% of total) ⋆ |
| $Y$, metric | Popn., ages 65 and older ♭⋆ |
| $Y$, metric | Employment to popn. ratio (%) ⋆ |
| $Y$, metric | Unemployment rate (%) ♯⋆ |
| $Y$, metric | Corruption Perception Index ♯ |
| $Y$, metric | Life expectancy ∂ |
| $Y$, metric | Infant mortality rate ♯ |
| $Y$, metric | Internet users (% of popn.) ♯⋆ |
| $Y$, metric | Renewable energy consumption (%) ♯†⋆ |
| $Y$, metric | POLITY index ∂ |
| $Y$, metric | Gross National Income (GNI) per capita (current international $) ♭⋆ |
| $Y$, metric | Prop. of parliamentary seats held by women (%) ♯ |
| $Y$, metric | Popn. with at least some secondary education (% ages 25 and older) ⋆ |

⋆ Variables included as $Y^*$; ♯ square-root transformed; ♭ log-transformed; ∂ cubic-transformed.

† At the time that this manuscript was being prepared, the latest publicly available data obtained on this metric was from 2015.

As the covariates $X$ in this framework are regarded as the drivers of a country’s socioeconomic health, we chose them based on the country’s resources and existing infrastructure (e.g. forest area). The $Y$’s are indicators of health (e.g. education index) based on measures that we perceive as reflective of a country’s health. In particular, GNI as opposed to GDP was used as it is perceived as a more inclusive indicator of a country’s wealth [Klugman, Rodríguez and Choi (2011)]. These indicators, or metrics, have been a priori transformed so that increasing values reflect better health and to reduce skewness; see Section 6.3 for additional details. For our single policy treatment variable $T$, we consider each of the following in two separate models: federally mandated number of maternity leave (MML) days and domestic general government health expenditure (GGHE) per capita. These two variables were chosen due to their proposed benefits to individuals, the economy or society as a whole [Chapman et al. (2008); Lea (1993)]. Note that the World Bank data source only has alternate years of maternity leave data, and we had to informally impute the data for some of the OECD countries using data from the OECD website [OECD (2018)]. A discussion of data imputation is found in Section 6.3.

Importantly, it is recognized that the selection of modeled variables is inevitably subjective but could be informed by the modeler’s domain knowledge. As such, in this paper we focus on the methodology and its interpretation.
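The variable preparation described in this section (square-root, log, or cubic transformation, orientation so that larger values reflect better health, and averaging over several years) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, and reversing direction by negation is an assumption, since the exact orientation step is deferred to Section 6.3.

```python
import math

# Transformations named in the legend of Table 3.1.
TRANSFORMS = {
    "sqrt": math.sqrt,
    "log": math.log,
    "cubic": lambda v: v ** 3,
    "identity": lambda v: v,
}


def transform_metric(values, kind, higher_is_better=True):
    """Apply a skewness-reducing transform, then orient the metric.

    Negating a metric for which larger raw values mean worse health
    (e.g. an unemployment rate) is an assumption made for illustration.
    """
    transformed = [TRANSFORMS[kind](v) for v in values]
    if higher_is_better:
        return transformed
    return [-v for v in transformed]


def average_over_years(value_by_year, years):
    """Average one country's values over the given years (e.g. 2010-2014)."""
    years = list(years)
    return sum(value_by_year[y] for y in years) / len(years)


# Hypothetical values for a single country, for illustration only.
oriented = transform_metric([4.0, 9.0], "sqrt", higher_is_better=False)
pre_treatment_mean = average_over_years(
    {2010: 1.0, 2011: 2.0, 2012: 3.0, 2013: 4.0, 2014: 5.0, 2015: 100.0},
    range(2010, 2015),
)
```

Note that the averaging window stops at 2014, so the 2015 value in the example does not enter the pre-treatment mean.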

4. Methodology.

To quantify the latent causal socioeconomic health (LACSH) and its uncertainty in a policy-specific context, we integrate two recent approaches, the latent health factor index (LHFI) [Chiu et al. (2011)] and the generalized propensity score (GPS) methodology and its extensions [Hirano and Imbens (2004), Imai and Van Dyk (2004)], along with spatial modeling to account for spatial dependence among countries. We are interested in these two methods as the former describes health as latent, i.e. a trait that is not directly measurable, while the latter allows us to examine the effect of policy prescription and estimate the dose-response function for different ‘doses’ of a policy treatment and the corresponding response, namely a nation’s overall socioeconomic health.

4.1. Latent health.

As an analogy to a country’s latent health, the underlying health conditions of a person who is deemed healthy cannot be directly compared to those conditions of another person. It is the measurable variables such as height, weight or calorie intake of a person that can be compared. Similarly, for a country, there is no single directly observable quantity that can represent “how well a country is doing”. Thus, the health of a country is a notion that we wish to evaluate comprehensively and holistically. For instance, we may argue that variables like GNI, life expectancy, and infant mortality rate can each coarsely inform us on some aspect of the state in which a country’s health is, but not its overall health. The LHFI framework unifies multiple aspects of health by modeling the underlying condition that we wish to assess as a latent parameter (not directly measurable), but it is dependent on different measurables that are either drivers of health (covariates), or indicators of health (metrics). A schematic representation of the LHFI framework is shown in Figure 1.

[Figure 1: schematic of the LHFI framework (adapted from Chiu, Wu and Lu (2013)).]

The LHFI structure employed to model health as a latent variable for our specific context is a type of mixed model [Rabe-Hesketh and Skrondal (2004)], where nation-specific health is a random effect. We formulate our model as a Bayesian hierarchical mixed model, as it is noted in Gelman and Hill (2007) as the most direct approach to handle latent structures.

Martin and Quinn (2002) discuss the unidentifiability issues that are prominent in item response models (a type of generalized linear mixed model) in their work. To address this issue in our framework, we truncate the distribution for one pre-selected country’s health parameter, $H_{anc}$, to anchor our latent score’s scale. A more in-depth discussion of the anchoring approach can be found in Section 6.1.

Note that a country’s metrics are multivariate in nature.
Thus, in our hierarchical model, we use a multivariate normal distribution on the first level in the hierarchy, i.e. the metric-level ($Y$-level) (equation (4.1)). We consider the previous years’ covariates and metrics, averaged over years, as a set of combined covariates that drive the countries’ health in the following year, which, in turn, is reflected by that year’s metrics.

Specifically, we designate the year 2015 as the current year, in which the treatment is administered, and we evaluate its effect on the country’s socioeconomic health and metrics in the following year (2016). $\mathbf{X}^*$ denotes the averaged covariate values over the years 2010–2014. Similarly, $\mathbf{Y}^*$ denotes the averaged metric values over the years 2010–2014; to avoid collinearity, we retain only one of the metrics that show a correlation of 0.8+ with another metric or a covariate $\mathbf{X}^*$ (see Table 3.1). In particular, we eliminate the metrics one-by-one until all correlations between $\mathbf{X}^*$ and $\mathbf{Y}^*$ are less than 0.8. Both $\mathbf{X}^*$ and $\mathbf{Y}^*$ are regarded as predictors of latent health in 2016. (The exclusion of 2015 from the definitions of $\mathbf{X}^*$ and $\mathbf{Y}^*$ will be discussed in Section 4.3.) Therefore, our base LHFI model (excluding GPS and spatial elements) with an ‘$H$-anchor’ takes on the form below:

$$\mathbf{y}_i \mid \mathbf{a}, H_i, \Sigma_Y \overset{ind.}{\sim} \mathrm{MVN}(\mathbf{a} H_i, \Sigma_Y) \tag{4.1}$$

$$\mathbf{H} \mid \boldsymbol{\zeta}, \mathbf{W}^*, \sigma_H^2 \sim \mathrm{TMVN}(\mathbf{W}^* \boldsymbol{\zeta}, \Sigma_H)\, \mathbb{1}\{H_{anc} < 0\} \tag{4.2}$$

where $\Sigma_H = \sigma_H^2 \mathbf{I}$ and where MVN and TMVN denote the multivariate and truncated multivariate normal distributions, respectively. The TMVN is defined as the joint distribution of $N - 1$ MVNs (for non-anchor countries) with a truncated normal (for $H_{anc}$). This joint distribution has mean $\mathbf{W}^* \boldsymbol{\zeta}$ and an $N \times N$ diagonal covariance matrix $\Sigma_H$, which reflects the naive assumption that countries are independent given the covariates. See Horrace (2005) for full details of the TMVN formulation. Normality is assumed due to the nature of our metric variables (see Section 3).

At the $Y$-level, we let $\mathbf{y}_i = (y_{i1}, \ldots, y_{iP})^T$ be a $P \times 1$ vector of the $i$th country’s metrics in the year 2016 for $i = 1, \ldots, N$; $\mathbf{a} = (a_1, \ldots, a_P)^T$ be the $P \times 1$ vector for the ‘loadings’ of any country’s health on its metrics; and $\Sigma_Y$ be the $P \times P$ covariance matrix for the 2016 metrics.

We refer to equation (4.2) as the health-level ($H$-level), where $\mathbf{H}$ is an $N \times 1$ vector of latent health in 2016 for all $N$ countries including the chosen anchor country; $\mathbf{W}^* = (\mathbf{1}, \mathbf{T}, \mathbf{X}^*, \mathbf{Y}^*)$ is an $N \times (2 + K + Q)$ matrix where $\mathbf{1}$ is an $N \times 1$ vector of ones; $\mathbf{T}$ is an $N \times 1$ vector of treatment values in 2015; $\mathbf{X}^* = (\mathbf{x}^*_1, \ldots, \mathbf{x}^*_K)$ is an $N \times K$ matrix for $K$ different covariates, $\mathbf{x}^*_k$ being an $N \times 1$ vector of the $k$th covariate for $k = 1, \ldots, K$, averaged over the years 2010–2014; and $\mathbf{Y}^* = (\mathbf{y}^*_1, \ldots, \mathbf{y}^*_Q)$ is an $N \times Q$ matrix where $\mathbf{y}^*_q$ is an $N \times 1$ vector of the $q$th metric for $q = 1, \ldots, Q$, also averaged over the years 2010–2014 (and $Q$
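The one-by-one elimination of highly correlated metrics described above can be sketched as a greedy filter. This is a minimal illustration, not the authors' code: the order in which metrics are examined, the use of absolute Pearson correlation, and all names and data below are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prune_metrics(covariates, metrics, threshold=0.8):
    """Keep a metric only if its absolute correlation with every covariate
    and every previously kept metric is below `threshold`.

    Metrics are examined in dictionary order (an assumption; the source
    does not state the elimination order).
    """
    kept = {}
    for name, column in metrics.items():
        others = list(covariates.values()) + list(kept.values())
        if all(abs(pearson(column, other)) < threshold for other in others):
            kept[name] = column
    return kept


# Hypothetical averaged values for five countries, for illustration only.
x_star = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0]}
y_candidates = {
    "m1": [2.0, 4.0, 6.0, 8.0, 10.5],   # nearly collinear with x1
    "m2": [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with x1
    "m3": [1.1, -1.0, 1.0, -1.0, 1.0],  # nearly collinear with m2
}
y_star = prune_metrics(x_star, y_candidates)
```

In this toy input only `m2` survives, so the retained set plays the role of the $Q$ columns of $\mathbf{Y}^*$. A different examination order could retain a different member of a correlated pair, which is why the order is flagged as an assumption.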

Macro-level variables of countries are expected to be spatially correlated [Ward and Gleditsch (2018)], as countries that are close together in regions (e.g. Europe, North America and Central Asia) tend to be more similar in terms of a cultural, economic, social or political context. This suggests that the latent health of countries may also be spatially dependent. In order to assess the need for spatial modeling in our framework, we fit the base LHFI model using equations (4.1) - (4.2), treating the policy variable (in this case MML and not GGHE) as a regular covariate, and examined its residuals. The residuals are defined as

$$\hat{\epsilon}_{H^*} = \widehat{\mathbf{H} - \mathbf{X}\boldsymbol{\zeta}}$$

where the ‘hat’ ($\widehat{\ \ }$) value is the posterior median of the residuals after subtraction on the right-hand side based on the Markov chain Monte Carlo (MCMC) samples.

[Figure 2: Residuals $\hat{\epsilon}_{H^*}$ on the world map.]

Figure 2 presents the residuals from the base LHFI model fit on the world map. The gray areas on the map are countries that are not represented in our dataset. It is apparent that countries that are geographically close together in regions such as North America, Western Europe, Central and South East Asia have posterior residuals indicating that they are similarly under- or over-estimated by our model. Similarly, the model that replaces MML with GGHE also shows the need to address spatial dependency. To accommodate this, we incorporate spatial dependency among our residuals on the health-level, and modify equation (4.2) to account for spatial dependence in its residuals.

4.3. Causal inference.

In addition to quantifying our latent socioeconomic health and its uncertainty, we seek to integrate causal modeling into our framework to provide insight into the effect of a ‘policy treatment’ variable on the health of a country.

Two schools of thought dominate the causal inference literature, namely “Pearl’s causal diagram” [Pearl (2009)] and “Rubin’s causal model” [Imbens and Rubin (2015)]. Both attempt to establish causal effects from observational studies, which was previously considered impossible because such studies are not randomized controlled trials [Imbens and Rubin (2015); Hernán and Robins (2020)]. Among causal inference methods for non-experimental data, propensity score (PS) analysis (stratification, matching and covariate adjustment) in the so-called Rubin’s approach has been widely used to address selection bias. In our current work under the Rubin framework, instead of dichotomizing the continuous treatment variables, we consider the generalized propensity score (GPS) for continuous treatment, which was developed similarly by Hirano and Imbens (2004) and Imai and Van Dyk (2004), to estimate the dose-response function. The GPS approach is an extension to the propensity score method for binary treatments and multi-valued treatments [Rosenbaum and Rubin (1983); Imbens (2000)]. It allows us to fully utilize the raw information while reducing the ambiguity due to an arbitrary quantile used to categorize treatments. We follow the specifications as laid out by Hirano and Imbens (2004) in our work.

In the case of binary treatments, there has since been research that considers the uncertainty in the propensity scores [McCandless, Gustafson and Austin (2009); An (2010)], although incorporating the outcome variable at the stage where the inference of the PS is conducted may be contentious [Kaplan and Chen (2012); Zigler et al. (2013); Zigler (2016)]. For this reason, our GPS framework extends the work by Zigler et al. (2013) for binary treatment, whereby we use the Bayesian posterior-predictive distribution of the GPS to separate the design stage and analysis stage in order to ‘cut the feedback’ (i.e. to ensure that the inference of the GPS does not depend on the outcome variable) [McCandless et al. (2010); Zigler et al. (2013); Zigler and Dominici (2014)]. See Appendix A for implementation details.

Irrespective of a categorical or continuous treatment variable, there are three main assumptions in Rubin’s approach of causal modeling, namely, (i) the stable unit treatment value assumption (SUTVA), which stipulates no interference between units [Rubin (2005)]; (ii) strongly ignorable treatment assignment, which stipulates no unmeasured confounders [Rosenbaum and Rubin (1983)]; and (iii) consistency, where the potential outcome of the treatment must correspond to the observed response when the treatment variable is set to the observed ‘exposure’ level [Cole and Frangakis (2009)]. For the GPS method, Hirano and Imbens (2004) generalize (ii) to the weak unconfoundedness assumption, which only requires conditional independence for each value of the treatment ($H(t) \perp T \mid X$ for all $t \in \mathcal{T}$) as opposed to joint independence for all potential outcomes.

However, incorporating causal modeling in a spatial setting potentially violates the no interference assumption (SUTVA) as discussed at length in Keele and Titiunik (2015) and Noreen (2018). In investigating the effect of convenience voting and voter turnout, Keele and Titiunik (2015) are concerned about interference and spillover effects: the units (individuals) may be influenced due to proximity of geographical regions (or influenced at the workplace or by their social network, etc.).

In our paper, the causal question of interest is the effect of a national policy variable on a country’s health, with the unit of interest being at the national level rather than at the individual level. We consider two policy variables in our work.
For the mandatory maternity leave (MML) variable, two obvious scenarios of interference and spillover in our case may be a) individuals immigrating or emigrating and in their newly adopted country, either influencing policy makers or affecting the health of the country (e.g. a Canadian mother whose wellbeing benefited from the Canadian federal maternity leave policy emigrates to the United States, which does not have federal maternity leave, thus possibly improving the health of the United States); b) policy makers being influenced by their international social networks.

The second policy variable of interest is domestic general government health expenditure (GGHE) per capita in a country. Note that ‘health’ in this variable refers to the individual-level’s public healthcare funding rather than the countries’ overall socioeconomic health $H$. We argue that the potential scenarios of interference and spillover are similar to a) and b) mentioned above.

As such, we can assume minimal effects of individuals’ international migration on MML, GGHE, or socioeconomic health at the national level. Additionally, we can assume that federal policy making regarding maternity leave and public healthcare expenditure is a collective domestic effort and generally conducted with minimal foreign interference. Finally, as discussed by Schutte and Donnay (2014), when there is only minor overlap in the units, a consistent treatment effect can still be valid. These arguments suggest that SUTVA is reasonable in our case.

Moreover, we utilize structural variables, namely, the country’s existing infrastructure and the average of previous years’ metrics as covariates in our framework. This is to align with the approach by Rubin (2005) of conditioning on the pre-treatment variables. We can assume that the current year’s policy treatment is affected by metrics and covariates from previous years.
We assume that, given these observable pre-treatment covariates through the GPS, a country’s choice of MML days and GGHE is random, and that there are no unmeasured important confounders. Hence, we proceed with the GPS framework while assuming that the required assumptions (i), (ii), and (iii) hold.

Our GPS formulation is based on work by Hirano and Imbens (2004). Similar to the PS approach for the binary case, in which the PS is the probability of receiving treatment given the covariates, the GPS is defined as the conditional probability density of the continuous treatment given the covariates. The relevant properties and methodology are discussed at length in Hirano and Imbens (2004) and Kluve et al. (2012). The ‘outcome variable’ in our work is the country’s latent health. A schematic representation is presented in Figure 3.

[Figure 3: Extension of the LHFI framework with causal modeling.]

As such, we propose the LHFI methodology that includes causal and spatial modeling as the LAtent Causal Socioeconomic Health (LACSH) index. To incorporate causal modeling into our spatial LHFI model, we introduce the policy treatment variable $T$ and its generalized propensity score $R = r(T, \boldsymbol{\gamma}, \mathbf{Z}^*, \sigma_T^2)$ to our health-level through its mean (however, note the special MCMC implementation in Appendix A regarding “cutting the feedback” in the MCMC):

$$\mathbf{y}_i \mid \mathbf{a}, H_i, \Sigma_Y \overset{ind.}{\sim} \mathrm{MVN}(\mathbf{a} H_i, \Sigma_Y) \tag{4.3}$$

$$\mathbf{H} \mid \boldsymbol{\beta}, \mathbf{T}, \mathbf{R}, \Sigma_H \sim \mathrm{TMVN}(\boldsymbol{\mu}, \Sigma_H)\, \mathbb{1}\{H_{anc} < 0\} \tag{4.4}$$

$$T_i \mid \mathbf{Z}^*_i, \boldsymbol{\gamma}, \sigma_T^2 \sim N(\mathbf{Z}^*_i \boldsymbol{\gamma}, \sigma_T^2) \tag{4.5}$$

where

$$[\boldsymbol{\mu}]_i = \beta_0 + \beta_1 T_i + \beta_2 T_i^2 + \beta_3 R_i + \beta_4 R_i^2 + \beta_5 T_i R_i \tag{4.6}$$

$$R_i = r(T_i, \boldsymbol{\gamma}, \mathbf{Z}^*_i, \sigma_T^2) = \frac{1}{\sqrt{2\pi\sigma_T^2}} \exp\left(-\frac{1}{2\sigma_T^2}\,(T_i - \mathbf{Z}^*_i \boldsymbol{\gamma})^2\right) \tag{4.7}$$

$$\Sigma_H = \sigma_H^2\, \Omega(d, \phi) \tag{4.8}$$

$$\Omega(d, \phi) = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho_{n-1,n} \\ \rho_{n1} & \cdots & \rho_{n,n-1} & 1 \end{pmatrix} \tag{4.9}$$

$$\rho_{nm} = \exp(-d_{nm}/\phi) = \rho_{mn} \tag{4.10}$$

where $\Sigma_H$ denotes the $N \times N$ spatial covariance matrix for health; $\rho_{nm}$ is the correlation parameter between countries $n$ and $m$, which is a function of $d_{nm}$ (the great circle distance (GCD) between two countries) and $\phi$ (the ‘range’ or inverse rate of decay parameter).

In equation (4.6), $T_i$ is the policy treatment variable of interest in 2015, $R_i = r(\cdot)$ is the GPS, and $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_5)^T$ are the associated regression coefficients. The inclusion of quadratic and interaction terms of the GPS and treatment variable is described elsewhere in the paper. $\mathbf{Z}^*_i$ is the $i$th row vector of the $N \times (1 + K + Q)$ matrix $\mathbf{Z}^* = (\mathbf{1}, \mathbf{X}^*, \mathbf{Y}^*)$, whose columns are the $N \times 1$ vector of ones, and the covariates and metrics averaged over 2010–2014 (as described in Section 4.1); the corresponding $(1 + K + Q) \times 1$ regression coefficient vector is $\boldsymbol{\gamma} = (\gamma_0, \gamma_1, \ldots, \gamma_{K+Q})^T$.

The covariance function we employ in equations (4.8) - (4.10) is a special case of the Matérn class of spatial covariance functions, for modeling the dependence between spatial observations [Gelfand et al. (2010)]. For instance, a large value of $\rho$ suggests that countries that are relatively far from one another are still moderately correlated [Hoeting et al. (2006)]. Note that while we consider GCDs, geographical distance measures on a global scale have always been a contentious issue [Ward and Gleditsch (2018)]. We discuss some of the possible extensions to the spatial component in our framework in Section 6.2.
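The deterministic pieces of equations (4.5) to (4.10) can be evaluated as in the sketch below: the GPS as a normal density, the quadratic mean of the health-level, and the exponential spatial covariance built from great circle distances. This is an illustration under assumptions, not the authors' implementation. The function names are hypothetical, the haversine formula with a mean Earth radius of 6371 km is one common way to compute a GCD, and the choice of a single (latitude, longitude) point to represent each country is not specified in this section.

```python
import math


def gps(t, z_row, gamma, sigma2_t):
    """Normal density of treatment t given covariates, as in (4.5) and (4.7)."""
    mean = sum(z * g for z, g in zip(z_row, gamma))
    norm_const = math.sqrt(2.0 * math.pi * sigma2_t)
    return math.exp(-((t - mean) ** 2) / (2.0 * sigma2_t)) / norm_const


def health_mean(t, r, beta):
    """Quadratic mean in treatment t and GPS r, as in (4.6)."""
    b0, b1, b2, b3, b4, b5 = beta
    return b0 + b1 * t + b2 * t ** 2 + b3 * r + b4 * r ** 2 + b5 * t * r


def great_circle_km(p, q, radius_km=6371.0):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(min(1.0, h)))


def spatial_covariance(coords, sigma2_h, phi):
    """Sigma_H = sigma_H^2 * Omega with rho_nm = exp(-d_nm / phi), (4.8)-(4.10)."""
    return [
        [sigma2_h * math.exp(-great_circle_km(p, q) / phi) for q in coords]
        for p in coords
    ]


# Hypothetical inputs, for illustration only.
r_i = gps(t=0.0, z_row=[1.0], gamma=[0.0], sigma2_t=1.0)
mu_i = health_mean(t=2.0, r=0.5, beta=(1.0, 1.0, 1.0, 1.0, 1.0, 1.0))
cov = spatial_covariance([(0.0, 0.0), (0.0, 90.0), (45.0, 45.0)],
                         sigma2_h=2.0, phi=5000.0)
```

In the full model these quantities are evaluated inside the MCMC, with the GPS handled at a separate design stage so that its inference does not depend on the outcome; the sketch only shows the arithmetic for fixed parameter values.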

5. Latent Health for the World.

We present results from our LACSH model (spatial causal LHFI in Section 4.3), separately fitted to the countries' data using MML days and GGHE as the treatment variable.

For Bayesian inference, an adaptive MCMC algorithm [Roberts and Rosenthal (2009)] was used to automatically tune the sampling of all parameters on the H-level, as well as of H_anc, due to non-conjugacy and to improve convergence and mixing. All other parameters were sampled using Gibbs sampling. Specific sampling specifications are documented in Appendix A. For model results, we utilized roughly 100,000 post-burn-in MCMC samples from the posterior distribution. Standard diagnostics suggested that the chain for each parameter had reached its steady state.

5.1. Priors.

We specify conjugate diffuse priors for most parameters. For each regression coefficient $a_j$ and $\beta_k$, and for the log-variance $\log(\sigma_H^2)$, we specify a normal prior distribution with mean 0 and variance 100. The covariance matrix $\Sigma_Y$ is given an inverse-Wishart prior with $P + 2$ degrees of freedom and an identity scale matrix. $\log(\phi)$ is given a normal prior with mean 0 and variance 100, which yields a diffuse prior on the positive spatial correlation inverse decay rate $\phi$. The diffuse priors for $\gamma$ and $\sigma_T^2$ are a zero-mean normal and an inverse gamma with shape = 1 and scale = 0.01, respectively.

Ranking of countries according to latent health, H.

I. MML days.

FIG 4. Latent health for 120 countries in 2016, color-coded by income group, with MML as treatment. [Plot: countries on the vertical axis, ordered by Latent Health Index on the horizontal axis (approximately −0.1 to 0.1). Income groups: low income, lower middle income, upper middle income, high income. Highlighted countries: Bulgaria, Burkina Faso, Finland, Mali, Norway, Qatar, Sri Lanka, Sweden, Ukraine, United Arab Emirates.]

Figure 4 shows the country ranking based on the posterior medians of the H's (colored dots) along with their corresponding 90% credible intervals (gray bands). The figure highlights some countries that are ranked highest, lowest, or differently from what their United Nations (UN) designated income group* would suggest. Formal quantification of the uncertainty for our health parameter suggests that countries are not polarized into developed/developing countries or rich/poor countries; the lack of polarization aligns with the findings in Rosling (2019).

Nevertheless, our color-coding according to the designated income groups shows that the countries are generally ranked according to their income group. This suggests that the health of a country is highly correlated with its income group. However, as will be discussed below, income is not necessarily the most important index to examine when considering the health of a country.

The posterior median and corresponding credible interval for the highest-ranked, lowest-ranked, and anchor country are shown in Table 5.1.

* The UN and the World Bank classify countries every year into four income groups based on their GNI per capita (current US$).

II. GGHE.

FIG 5. Latent health for 120 countries in 2016, color-coded by income group, with GGHE as treatment. [Plot: countries on the vertical axis, ordered by Latent Health Index on the horizontal axis (approximately −0.10 to 0.10). Income groups: low income, lower middle income, upper middle income, high income. Highlighted countries: Croatia, Japan, Liberia, Mozambique, Niger, Norway, Sweden, Trinidad and Tobago, Tunisia, Ukraine.]

The countries' health ranking using GGHE as the treatment variable is shown in Figure 5. Compared to using MML as the treatment variable, this ranking shows tighter credible intervals (narrower gray bands) and follows the UN's designated income groups more closely. This may be due to the high correlation between the GGHE treatment variable and many metric and covariate variables (e.g. 'mean years of schooling', 'GNI per capita', etc.).

Some exceptions are Ukraine and Sri Lanka, which are classified as lower-middle income countries by the United Nations but are ranked among the high and upper-middle income countries in both models, fitted using MML and GGHE as the treatment variable. These results suggest that a country's health is not solely reflected by its income or wealth.

5.3. Numerical results and implications.

Posterior summaries for selected parameters are shown in Table 5.1. The other model parameters are tabulated in Appendix C.
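Posterior summaries of this kind (5%, median, 95%), and the posterior probability that one country's latent health exceeds another's, are simple functions of the MCMC draws. The following is a minimal sketch, assuming the draws for two countries are stored as equal-length lists paired by MCMC scan; the simulated draws and the names `h_a` and `h_b` are hypothetical stand-ins, not output from the LACSH model.

```python
import random

def quantile(sorted_xs, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def summarize(draws):
    """Return the (5%, median, 95%) posterior summary of one parameter."""
    s = sorted(draws)
    return quantile(s, 0.05), quantile(s, 0.50), quantile(s, 0.95)

def prob_greater(draws_a, draws_b):
    """Share of scans in which a exceeds b (draws paired by MCMC scan)."""
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)

# Simulated stand-ins for posterior draws of two countries' latent health.
rng = random.Random(0)
h_a = [rng.gauss(0.10, 0.02) for _ in range(5000)]
h_b = [rng.gauss(0.09, 0.02) for _ in range(5000)]
summary_a = summarize(h_a)
p_a_beats_b = prob_greater(h_a, h_b)
```

Pairing the draws by scan matters: it respects the posterior dependence between the two health parameters, which comparing separately sorted samples would ignore.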

Nations' latent health, H. The left panel of Table 5.1 shows the highest and lowest ranked $H_i$ from our LACSH model using MML as the treatment variable, corresponding to Norway ($i = 84$) and Mali ($i = 73$). For the top-ranked countries, the posterior probability that Norway is in better health than Sweden is only 0.53, a negligible difference; the probability that Sweden is in better health than Finland is 0.51, and that Finland is in better health than Japan is 0.57, suggesting that the four countries may be grouped together. Similar posterior probabilities can easily be obtained for other countries.

Some of the posterior summaries for our LACSH model using GGHE as the treatment variable are tabulated in the right panel of Table 5.1. Norway ($i = 84$) is again ranked at the top, as with MML as treatment, followed by Sweden ($i = 104$), Japan ($i = 57$) and Australia ($i = 5$).

Health loadings, a. Insights into the strength and direction of the relationship between metrics and health are available from the inference about the health loadings, $a_j$. The left panel of Table 5.1 shows the results (using MML as the treatment variable) for the four loadings with the highest positive impact, based on the medians of the marginal posterior distributions, and one example of a loading with a negative impact. In decreasing order of effect size, the corresponding positive metrics are: 'population, ages 65 and older (% of total)' ($j = 9$), 'education index' ($j = 1$), 'infant mortality rate (per 1,000 live births)' (reversed scale) ($j = 6$), and 'life expectancy at birth' ($j = 5$). 'GNI per capita' is ranked eighth in terms of its positive association with a country's health. In fact, the posterior probability for health to have a bigger effect on 'education index' than on 'GNI per capita' is 0.995. This is strong evidence that a country's health is reflected not solely by its wealth, but by other social factors as well. The right panel of Table 5.1 shows the results using GGHE as the treatment variable; relevant discussions are in Appendix D.

FIG 6. Plot of employment-to-population ratio vs. LACSH model posterior median of latent health, with MML as treatment, and with a least squares regression fit for visualization

Figure 6 shows that there is a weak negative relationship between the metric 'employment to population ratio' and a country's latent health (the 90% credible interval for $a_2$ is $(-7.35, -1.38)$). As we can see from the figure, countries with a high employment-to-population ratio are generally in the low-income group, and the ratio decreases with increasing income. This perhaps goes against the naive belief that a high employment ratio reflects a country's 'good' health. There are several possible explanations for this result. One, wages are typically low in low-income countries, so that a higher proportion of the population has to work in order to secure a decent living wage. Two, lower-income countries relatively lack effective social safety nets and social protection systems for their populations, so that a higher proportion of the population works for a longer period of time until (a possibly later) retirement age. Some exceptions are Qatar and the United Arab Emirates (UAE), which are high-income countries but also have a high employment ratio because a large proportion of their populations are expatriate workers [Parcero and Ryan (2017)].

Given the other metrics in the model, with MML as the treatment variable, three metrics, namely 'population density', 'unemployment rate' (reversed-scale), and 'renewable energy consumption', were found not to have a substantial statistical relationship with a country's latent health.
With GGHE as the treatment variable, the metric 'proportion of seats held by women in national parliament' also shows no statistical relationship with a country's latent health, whereas 'renewable energy consumption' does: the model suggests a negative relationship between 'renewable energy consumption' and a country's latent health.

These results demonstrate that our model-based approach does not require a priori input on which metrics reflect 'good' health, or which metrics are important to a country's latent health. We provide further discussions in Appendix D.

TABLE 5.1. Posterior summaries for selected LACSH model parameters, with MML and GGHE as treatment, respectively

Left panel (MML as treatment): Parameter MC† 5% Median 95% H (Norway) H (Mali) -0.15 -0.10 -0.07 H anc -0.12 -0.07 -0.04 a a a a -7.35 -4.17 -1.38 β -0.02 -0.01 0.00 σ H φ γ -0.10 0.04 0.17

Right panel (GGHE as treatment): Parameter MC 5% Median 95% H (Norway) H (Mozambique) -0.13 -0.11 -0.09 H anc -0.12 -0.10 -0.08 a a a a -13.72 -10.86 -8.24 β σ H φ γ -0.14 -0.00 0.14

† Markov chain

Average dose-response function.

To examine the dose-response function, which relates varied levels of a policy treatment variable to the country's health, we utilize the GPS formulation as proposed by Hirano and Imbens (2004). The GPS approach allows us to evaluate a country's health outcome that corresponds to each specified value of the continuous treatment (i.e. MML days or GGHE). Note that the conditional expectation of the outcome as a function of the treatment $T$ and the GPS $R$ is $\beta(t, r) = E[H \mid T = t, R = r]$ for each specified 'dose' of treatment $t \in \mathcal{T}$.

We can obtain an estimate of the entire dose-response function by estimating the average potential outcome at a given $t$. In particular, $\mu(t) = E[H_i(t)]$ is calculated as in equation (4.6), but substituting specific values of $t$ for $T$ in equations (4.6) and (4.7). The entire dose-response function is then $\mu(t) = E[\beta\{t, r(t, Z^*_i)\}]$, which is estimated by $\hat{\mu}(t) = (1/N) \sum_{i=1}^{N} \beta\{t, r(t, Z^*_i)\}$. As $\hat{\mu}(t)$ depends on the parameter $\beta$, we examine the posterior median of $\hat{\mu}(t)$ at a given $t$ (red curves in Figures 7 and 8).

Hirano and Imbens (2004) assert that the conditional expectation of the outcome as a function of the treatment level $T$ and the GPS $R$, namely $\beta(t, r)$, does not have a causal interpretation, but that $\mu(t)$, which corresponds to the dose-response function at treatment level $t$, does have a causal interpretation when compared to its value at another treatment level $t'$.

Many authors in the GPS literature use quadratic and interaction terms of the GPS and treatment variable in the conditional expectation of the outcome. We examine the need for those terms, using MML as treatment, by comparing the models with and without the terms in equation (4.6) using the log-pseudo marginal likelihood (LPML) (also known as the pseudo-Bayes factor) [Gelfand and Dey (1994); Chib (1995); Hanson (2006)]. We find that the model with the quadratic and interaction terms is marginally preferred (LPML values -176.50 and -177.31, respectively), consistent with the literature. We plot the posterior average dose-response function using thinned posterior samples of the coefficients (for visualization purposes) from our LACSH model (eq. (4.4)-(4.7)).

FIG 7. The LACSH model posterior median dose-response function (red), with MML as treatment, and with 100 thinned MCMC samples (gray) for visualization of uncertainty

The left panel in Figure 7 shows gray curves as the posterior dose-response based on a thinned MCMC sample; the red curve indicates the posterior median based on approximately 100,000 MCMC scans. The curves show a very weak increase (with a slight blip) in the average dose-response as the number of MML days increases. The right panel, which shows the posterior median curve zoomed in vertically, shows an increasing dose-response when MML days range over 53-145 and 236-410. Upon further inspection, the double 'dip' may be due to the lack of data around certain ranges of MML days, with clusters of countries having similar MML values.

FIG 8. The LACSH model posterior median dose-response function (red), with GGHE as treatment, and with 100 thinned MCMC samples (gray) for visualization of uncertainty

The average dose-response function shown in Figure 8 suggests that increasing the level of health expenditure (GGHE) monotonically leads to an increased level of the country's health. The figure also presents strong evidence of a causal phenomenon, in that all posterior samples in the figure are monotone.
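The estimator of the average dose-response function described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the conditional means of the treatment (the fitted values $Z^*_i\gamma$) and the treatment variance are held fixed at single values for simplicity, and the coefficient draws are made up rather than taken from the fitted model.

```python
import math
import statistics

def gps(t, cond_mean, sigma2_t):
    """Normal density of dose t given a country's conditional mean, as in (4.7)."""
    resid = t - cond_mean
    return math.exp(-resid * resid / (2.0 * sigma2_t)) / math.sqrt(2.0 * math.pi * sigma2_t)

def health_mean(beta, t, r):
    """Quadratic-plus-interaction mean structure, as in (4.6)."""
    b0, b1, b2, b3, b4, b5 = beta
    return b0 + b1 * t + b2 * t * t + b3 * r + b4 * r * r + b5 * t * r

def dose_response(t, beta, cond_means, sigma2_t):
    """(1/N) * sum_i beta{t, r(t, Z*_i)} for one posterior draw of beta."""
    vals = [health_mean(beta, t, gps(t, m, sigma2_t)) for m in cond_means]
    return sum(vals) / len(vals)

def posterior_median_curve(t_grid, beta_draws, cond_means, sigma2_t):
    """Pointwise posterior median of the estimated dose-response over beta draws."""
    return [statistics.median(dose_response(t, b, cond_means, sigma2_t)
                              for b in beta_draws)
            for t in t_grid]

# Hypothetical inputs: three countries, three coefficient draws.
cond_means = [-0.5, 0.0, 0.8]
beta_draws = [(0.0, 0.01, 0.0, 0.02, 0.0, 0.0),
              (0.0, 0.02, 0.0, 0.01, 0.0, 0.0),
              (0.0, 0.03, 0.0, 0.00, 0.0, 0.0)]
curve = posterior_median_curve([-1.0, 0.0, 1.0], beta_draws, cond_means, 1.0)
```

Evaluating `dose_response` once per retained draw, rather than only at the median, gives the family of gray curves used to visualize uncertainty around the red median curve.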

Assessing covariate balance under the generalized propensity scores framework.

In Rubin's approach to causal inference, the balance between treatment and control groups with respect to the pre-treatment covariates is a crucial assumption [Imbens and Rubin (2015)]. We introduce a novel technique to visually assess covariate balance in the GPS framework. Generally, testing for covariate balance with a continuous treatment variable is not straightforward, and we consider the approaches by both Hirano and Imbens (2004) and Imai and Van Dyk (2004) (as noted in Kluve et al. (2012)) to test for:

$$Z^*_i \perp \mathbb{1}\{T_i = t\} \mid r(t, Z^*_i)$$

where the GPS $r(\cdot)$ is evaluated at different specified values of $t$ for the continuous treatment variable. Specifically, we divide the sorted data of $\{T_1, \ldots, T_N\}$ into moving blocks of 20 observations, overlapping 10 observations between neighboring blocks. (This results in 11 blocks for $N = 120$ observations.) The GPS is then evaluated at the median of each block.

As the GPS is a dimension reduction tool to control for what is usually a large number of covariates, we investigate covariate balance collectively, rather than for each individual covariate as is done in the literature. Specifically, we represent the covariates by their first principal component $f(\cdot)$ when evaluating

$$f(Z^*_i) \perp \mathbb{1}\{T_i = t\} \mid r(t, Z^*_i) \tag{5.1}$$

where $t$ is the block median,

$$\mathbb{1}\{T_i = t\} = \begin{cases} 1 & \text{if country } i \text{ is in the current block of 20 countries} \\ 0 & \text{otherwise} \end{cases}$$

and $r(\cdot)$ is computed as

$$r(t; u_i, v) = \frac{1}{v\sqrt{2\pi}} \exp\left(-\frac{(t - u_i)^2}{2v^2}\right)$$

where $u_i$ and $v$ are based on a standalone frequentist multiple linear regression of the $T$ data on the $Z^*$ data, such that $u_i$ is the fitted value for the $i$th country, and $v$ is the fitted residual standard error.

To evaluate whether equation (5.1) holds, first, for each block of 20 countries, we run a frequentist logistic regression of $\mathbb{1}\{T_i = t\}$ on $f$ and $r(\cdot)$ and record the p-value of the slope for $f$. A p-value exceeding, say, 0.1 suggests that the covariates, given the GPS, are not significantly related to the treatment variable at and around that median value. Because the main objective of the GPS methodology is to remove any potential biases introduced by the covariates, we consider the set of 11 p-values collectively, whereby covariate balance is deemed adequate if most of the 11 p-values, plotted on the reversed scale $1 - p$, satisfy $1 - p < 0.9$ (equivalently, $p > 0.1$). In the case of MML, Figure 9 shows that 5 out of 11 blocks are above the 0.9 threshold, but if we used 0.95 then 8 out of 11 blocks show adequate covariate balance. Figure 10 shows that the same blocks contribute to the overall imbalance whether we use 0.9 or 0.95 as the threshold.

FIG 9. Plot of p-values (reversed-scale) for assessing covariate balance, with MML as treatment

FIG 10. Plot of p-values (reversed-scale) for assessing covariate balance, with GGHE as treatment

In addition to 'blocking on the (generalized propensity) score' to assess covariate balance [Imai and Van Dyk (2004)], our novel approach described here also allows us to identify the ranges of treatment values that may be the sources of any overall imbalance. The observations that fall within those ranges may be further investigated for improving overall covariate balance, if necessary. In the case of MML, Figure 9 does not show consistent patterns of covariate imbalance. In contrast, in the case of GGHE, Figure 10 shows that the lower range of GGHE values (approximately 0-90) consistently shows covariate imbalance. To address this, we removed countries in that range and re-ran our model on the subsample of countries, which removed the consistent pattern of imbalance. More details are in Appendix E.
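The blocking scheme used in this balance check can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the treatment values are a toy sequence, and the logistic regression step that produces the p-values is not reproduced here (the p-values passed to the last function are placeholders).

```python
import statistics

def moving_blocks(treatment, size=20, step=10):
    """Sort by treatment, then cut overlapping blocks.

    Returns a list of (block median, set of member indices) pairs.
    """
    order = sorted(range(len(treatment)), key=lambda i: treatment[i])
    blocks = []
    for start in range(0, len(order) - size + 1, step):
        members = order[start:start + size]
        block_median = statistics.median(treatment[i] for i in members)
        blocks.append((block_median, set(members)))
    return blocks

def block_indicator(n, members):
    """1 if observation i is in the current block, 0 otherwise."""
    return [1 if i in members else 0 for i in range(n)]

def count_adequate(p_values, threshold=0.9):
    """Number of blocks whose reversed-scale p-value (1 - p) is below the threshold."""
    return sum(1 for p in p_values if 1.0 - p < threshold)

# Toy treatment values for N = 120 observations.
treatment = [float(i) for i in range(120)]
blocks = moving_blocks(treatment)
first_median, first_members = blocks[0]
indicator = block_indicator(len(treatment), first_members)
```

With 120 observations, blocks of 20 and a step of 10, the block starts are 0, 10, ..., 100, which gives the 11 blocks mentioned in the text.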

Inverse decay parameter $\phi$ and spatial correlation function $\rho(d, \phi)$. Figures 11 and 12 show the posterior median of the spatial correlation function $\rho$ evaluated at a given distance $d$ between capital cities.

FIG 11. Spatial correlation function based on the posterior of $\rho = \exp(-d/\phi)$ given $d$, under the LACSH model with MML as treatment

FIG 12. Spatial correlation function based on the posterior of $\rho = \exp(-d/\phi)$ given $d$, under the LACSH model with GGHE as treatment

Figures 11 and 12 show decreasing spatial correlation between the countries as distance increases. The model with MML shows a sharper decrease but with more uncertainty (wider credible band) compared to GGHE. For the two countries whose capitals are furthest apart (Spain and New Zealand), our models yield noticeable spatial correlation with both MML and GGHE as the treatment variable. These results suggest that LACSH models would be inadequate if spatial dependence were not accounted for.
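A posterior band for the spatial correlation at a given distance follows directly from posterior draws of $\phi$. The sketch below assumes a haversine great circle distance in kilometers with a mean Earth radius of 6371 km, and uses made-up draws of $\phi$ on the same kilometer scale; the paper's actual distance scaling and posterior draws are not reproduced here, and all names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat / 2.0) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def quantile(sorted_xs, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def rho_summary(d, phi_draws):
    """(5%, median, 95%) of rho = exp(-d / phi) over posterior draws of phi."""
    rhos = sorted(math.exp(-d / phi) for phi in phi_draws)
    return quantile(rhos, 0.05), quantile(rhos, 0.50), quantile(rhos, 0.95)

# Hypothetical draws of phi, in kilometers.
phi_draws = [8000.0, 9000.0, 10000.0, 11000.0, 12000.0]
near = rho_summary(1000.0, phi_draws)
far = rho_summary(15000.0, phi_draws)
```

Because $\exp(-d/\phi)$ is monotone in $\phi$, the band could equally be obtained by transforming the quantiles of $\phi$; transforming each draw keeps the sketch valid for other correlation functions as well.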

6. Some technical details.

Identifiability.

Recall that at the metric-level (Y-level), $a_j$ is the population-level loading of a country's health on its $j$th metric. However, similar to the discussions by Chiu and Westveld (2011) and Martin and Quinn (2002), modeling health $H_i$ as a random effect leads to an unidentifiable $a$ vector unless constraints are imposed.

The constraint we have utilized in our models is a TMVN distribution on the H-level so that the health of an anchor country is negative (or positive, if desired). To decide on the anchor country, we conducted a pilot run of the base LHFI model in Section 4.1 but without any anchor, then selected a low-income group country (Burundi, in this case) on the extreme end of the H-scale as the anchor in all subsequent formal models.

As the ranking of a country is relative to the others', the constraint restricts the anchor country's health to the negative space and imposes this fixed scale on all other countries. This constraint solves the parameter identifiability issue and aids the interpretation of $H_i$, as it encodes in the model the fact that a higher value of $H_i$ should be interpreted as a higher level of health, not lower.

We have explored alternative constraints, including a 'soft anchor' (fixing the mean and variance for $H_{anc}$), a 'hard anchor' (fixing $H_{anc}$ to a constant), and transposing the H-scale manually after MCMC sampling (so that increasing values of H correspond to increasing levels of health). While our 'truncated anchor' leads to an additional computational burden, we believe this constraint to be the most desirable as it results in the most flexible approach.

6.2. Spatial distances.

In our LACSH model, spatial correlation between countries was included to explicitly account for their respective geographical locations. It is modeled through the simplest Matérn covariance function in the form of an exponential decay over great circle distances between capital cities of countries. Due to the earth's spherical nature, Euclidean distances may be inappropriate on a global scale [Banerjee (2005)], and Gleditsch and Ward (2001) also define a minimum distance between countries based on country borders of up to a certain distance. When we extend the LACSH model to more complex forms in order to formally incorporate dependencies jointly across time (2010–2016) and space (see Section 7), we may explore this minimum distance measure, as well as alternative covariance functions.

6.3. Data transformation and missing values.

A higher unemployment rate is generally regarded as bad for societal health [Wulfgramm (2014); Helliwell and Huang (2014)]. For this reason, the metric for 'unemployment rate' was linearly transformed prior to modeling so that higher values reflect better societal health. The same transformation was also applied to the metric for 'infant mortality rate'. In practice, the modeler need not carry out this transformation, because the fitted model can be used to distinguish the strength and direction of the relationship between latent health and metrics, as reflected by the signs of the metrics' loadings. For instance, the results from both of our treatment variables suggest that higher values of latent health are associated with higher values of the (reversed-scale) infant mortality rate metric (equivalent to a low infant mortality rate in the country before the data transformation). In other words, it is inferred that countries with better latent health have a low infant mortality rate, conditional on the set of metrics and covariates included in our model. Other visualizations and summaries of the posterior distribution for the negative metric effect, $a_j$, are included in Appendix D in the supplementary material.

Note that the selection of variables to be included as metrics and covariates in our model was largely based on the availability of data. Initial model fits included more covariates; however, due to collinearity, past-year metrics with a correlation higher than the arbitrary threshold of 0.8 with other covariates/metrics were removed sequentially from modeling. While this paper focuses on the development of methodology, when applying the methodology in practice, the covariates and metrics could be specified by the modeler more according to their domain knowledge and less according to data availability. In either case, the issue of missing data may require special attention.

In this paper, we considered two continuous 'policy treatment' variables separately in our models. In particular, the MML data obtained from the World Bank are available only every other year. Therefore, data for the years 2010, 2012, and 2014 are considered missing. For those years, only countries with the same values for the years before and after the missing year entered our model. To further reduce data missingness in each year, OECD data were used for some OECD countries when MML was missing from the World Bank data, although we note that the two organizations have slightly different definitions of maternity leave. For example, according to the World Bank, Sweden has zero MML days based on its definition. However, the OECD and other sources suggest that this may not be an accurate representation of Sweden's maternity policy. (As such, future iterations of our work will consider non-World Bank definitions.)

In our work, given the right-skewed nature of GGHE, we log-transformed the values before standardization. MML days also appeared skewed, but the data appeared in clusters, thus showing large unobserved ranges, and various transformations did not improve the data distribution. For this reason, we kept the variable untransformed other than standardizing it to have mean zero and unit variance. In future extensions of the current work, we may consider other approaches, such as rank likelihood estimation, when using treatment variables that are distributed in clusters. All transformations, including those in Table 3.1, are intended to adhere to our normality assumptions.

Finally, even if data exist in published records, it is recognized that such data collected at the country level by various world organizations may have been derived from different and unpublished imputation techniques. Of course, the quality of the data would depend on the actual imputation techniques employed. Moreover, one may not rule out the possibility that data or official statistics reported by certain countries may have been fabricated. Although these disadvantages could reduce the accountability of our modeling results, overcoming such data-related challenges is beyond the scope of our paper.
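The transformations described in this section can be illustrated with a short sketch. The exact linear map used to reverse a metric's scale is not stated in the text, so plain negation is assumed here as one possible choice; the sample standard deviation is likewise an assumption, and all data values and function names are hypothetical.

```python
import math
import statistics

def standardize(xs):
    """Center to mean zero and scale to unit (sample) variance."""
    m = statistics.fmean(xs)
    s = statistics.stdev(xs)
    return [(x - m) / s for x in xs]

def reverse_scale(xs):
    """Assumed linear flip: negate so that larger values mean better health."""
    return [-x for x in xs]

def log_standardize(xs):
    """Log-transform a right-skewed positive variable, then standardize."""
    return standardize([math.log(x) for x in xs])

# Hypothetical raw values for three countries.
infant_mortality = [2.0, 10.0, 60.0]    # per 1,000 live births
gghe = [10.0, 100.0, 1000.0]            # right-skewed expenditure variable

mortality_metric = standardize(reverse_scale(infant_mortality))
gghe_treatment = log_standardize(gghe)
```

After the flip, the country with the lowest raw infant mortality has the highest metric value, so a positive loading on this metric reads as 'better health, lower mortality'.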

7. Discussion and future work.

In this paper, we developed a LACSH index for nations through a comprehensive model-based approach which integrates spatial dependence and examines policy effects through a causal modeling framework. Through rigorous handling of quantitative treatment variables and the uncertainty quantification that directly results from the integrated LACSH modeling, we have demonstrated that our novel unified framework and visual assessment of covariate balance can be valuable to evidence-based social science with causal implications.

As mentioned in Section 4.3, to facilitate formal causal inference, our LACSH approach incorporates the Bayesian extension of the generalized propensity score in the spatial LHFI framework. In addition, we intend to consider the alternative framework of Pearl's causal diagram approach, which could reveal whether, by controlling for certain variables, we have unintentionally opened some 'backdoor paths' in the causal diagram which would result in spurious correlation [Pearl (2009)]. In the literature, backdoor paths are any non-causal paths between the treatment and outcome variables in the causal diagram. As such, Pearl's approach might prompt us to control for a different set of variables and potentially lead to a different scientific conclusion.

Regarding temporal data, currently, metrics from 2016 are hierarchically regressed on the treatment from 2015 and the GPS, which depends on temporally averaged covariates and metrics from 2010–2014. A substantively more complex model would be required to formally model temporal correlation in any of Y, H, T and X, as an extension to this paper. This would result in a spatio-temporal hierarchical causal model. We anticipate that careful consideration of separability (or otherwise) between space and time will be required.

Lastly, formal inference that allows us to identify which metrics are crucial in reflecting the health of a country may be of interest. To do so, we would consider modeling the metric effects as proportions that sum to 1, resulting in a type of variable/model selection framework. The implication of such a parameterization is a reduction in any modeler-induced selection bias due to choosing variables that are a priori perceived as being important in reflecting a country's health.

Acknowledgments.

This research has been supported by an IBISWorld philanthropic donation by Phil Ruthven to the Australian National University in the form of research funds awarded to GS Chiu. We thank Beatrix Jones, Bruce Chapman, Carolyn Huston, Corwin Zigler, Paul Gustafson, Peter Mueller, Tim Higgins, and the attendees of Bayes on the Beach 2017 and Joint Statistical Meeting 2018 for stimulating discussions and constructive comments on the topic. The authors acknowledge William & Mary Research Computing for providing computational resources and technical support that have contributed to the results reported within this paper.

APPENDIX A: MCMC ALGORITHM

All model inference is done through MCMC sampling. All parameters are updated via Gibbs sampling except for some of the parameters on the H-level, because equations (4.2) and (4.4) are a combination of $N - 1$ multivariate normals and a truncated normal:

1. Sample $H_i \mid H_{[-i]}$ from its normal full conditional distribution with mean $M$ and variance $V^{-1}$ (that is, $V$ is the conditional precision), where:

$$V = a^T \Sigma_Y^{-1} a + D^{-1}$$

$$M = V^{-1}\left(a^T \Sigma_Y^{-1} y_i + D^{-1} m_i\right)$$

$$D = \Sigma_{H[i,i]} - \Sigma_{H[i,-i]}^T\, \Sigma_{H[-i,-i]}^{-1}\, \Sigma_{H[-i,i]}$$

$$m_i = \mu_i + \Sigma_{H[i,-i]}^T\, \Sigma_{H[-i,-i]}^{-1} \left(H_{[-i]} - \mu_{-i}\right)$$

$$\mu = (1, T, T^2, R, R^2, TR)\, \beta$$

for $i = 1, \ldots, N$; $i \neq anc$.

2. Sample $a = (a_1, \ldots, a_P)^T$ from its multivariate normal full conditional distribution with mean $M$ and covariance $V^{-1}$, where:

$$V = \sum_{i=1}^{N} H_i^2\, \Sigma_Y^{-1} + 100\, I_P$$

$$M = V^{-1}\left(\Sigma_Y^{-1} Y^T H\right)$$

where $I_P$ is a $P \times P$ identity matrix and $P = 15$.

3. Sample $\Sigma_Y$ from its Inv-Wishart($\nu_n$, $S_n$) full conditional distribution where:

$$\nu_n = \nu_0 + N$$

$$S_n = (Y - H a^T)^T (Y - H a^T) + I_P$$

where $\nu_0 = P + 2$, $N = 120$ and $I_P$ is as described above.

4. Sample $\sigma_T^2$ from its Inverse-Gamma($\alpha_n$, $\beta_n$) full conditional distribution where:

$$\alpha_n = N/2, \qquad \beta_n = \sum_{i=1}^{N} D_i^2 / 2, \qquad D_i = T_i - Z^*_i \gamma$$

5. Sample $\gamma = (\gamma_0, \ldots, \gamma_{K+Q})^T$ from a multivariate normal distribution with mean $M$ and covariance $V^{-1}$, where:

$$V = \sigma_T^{-2} Z^{*T} Z^* + 100\, I_{(1+K+Q)}$$

$$M = \sigma_T^{-2}\, V^{-1} Z^{*T} T$$

where $I_{(1+K+Q)}$ is a $(1 + K + Q) \times (1 + K + Q)$ identity matrix and $(1 + K + Q) = 10$. This MVN distribution is proportional to $\prod_{i=1}^{n} P(T_i \mid \gamma, \sigma_T^2)\, P(\gamma)$, which is deliberately not the full conditional for $\gamma$, in order to 'cut the feedback' [McCandless et al. (2010); Zigler et al. (2013)]. This approximate conditional for $\gamma$ is then used as the posterior predictive on the H-level. Note that this approximation ignores the (H-level) contribution from the (country's health) outcome, thus cutting the feedback.

6. Sample the H-level parameters $(\beta^\star, \log(\sigma_H^{2\star}), \log(\phi^\star), H^\star_{i=anc})$ as a vector from the proposal distribution $Q_s(u, \cdot)$ as set out below. In particular, the Metropolis algorithm is performed with a scan-specific proposal distribution in the form given by Roberts and Rosenthal (2009): for the initial scans, $Q_s(u, \cdot)$ is an MVN distribution centered at $u$ with a fixed covariance proportional to $I_d/d$, whereas for later scans, $Q_s(u, \cdot)$ is a two-component mixture of $\mathrm{MVN}(u, v^2 \Sigma_s/d)$ and the same fixed-covariance MVN, where $u$ is our parameter vector in the previous MCMC iteration and $d$ is the dimension of our target distribution. The value of $v$ differs by treatment variable, with a smaller value for MML than the $v = 5$ used for GGHE; these values were adapted from Roberts and Rosenthal (2009) to improve slow mixing.

We define the ratio of densities as

$$\kappa = \frac{p\left(\beta^\star, \log(\sigma_H^{2\star}), \log(\phi^\star), H^\star_{i=anc} \mid R, T, H_{i \neq anc}\right)}{p\left(\beta, \log(\sigma_H^2), \log(\phi), H_{i=anc} \mid R, T, H_{i \neq anc}\right)}$$

and accept $(\beta^\star, \log(\sigma_H^{2\star}), \log(\phi^\star), H^\star_{i=anc})$ jointly with probability $\kappa \wedge 1$.
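The scan-specific proposal in step 6 can be sketched as follows. The default constants below (fixed standard deviation 0.1, scaling 2.38, mixture weight 0.05, warm-up of 2d scans) are the ones given by Roberts and Rosenthal (2009) and are assumptions here; they are not necessarily the values used for the LACSH models. The function names and the toy covariance are hypothetical, and the empirical covariance is passed in rather than updated.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def mvn_draw(rng, mean, cov):
    """One draw from MVN(mean, cov) via the Cholesky factor."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

def propose(rng, u, scan, emp_cov, v=2.38, weight=0.05, warmup=None):
    """Scan-specific proposal: fixed MVN early on, adaptive mixture afterwards."""
    d = len(u)
    warmup = 2 * d if warmup is None else warmup
    fixed = [[(0.1 ** 2) / d if i == j else 0.0 for j in range(d)] for i in range(d)]
    if scan <= warmup or rng.random() < weight:
        return mvn_draw(rng, u, fixed)
    adapted = [[v ** 2 * emp_cov[i][j] / d for j in range(d)] for i in range(d)]
    return mvn_draw(rng, u, adapted)

def accept(rng, log_kappa):
    """Accept with probability min(kappa, 1), working on the log scale."""
    return rng.random() < math.exp(min(0.0, log_kappa))

rng = random.Random(1)
emp_cov = [[1.0, 0.3], [0.3, 2.0]]   # stand-in for the empirical covariance
candidate = propose(rng, [0.0, 0.0], scan=50, emp_cov=emp_cov)
```

Keeping a small fixed-covariance component in the mixture guards against the adaptive component collapsing if the empirical covariance becomes nearly singular.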
APPENDIX B: COUNTRIES USED IN LACSH MODEL i Country1 Afghanistan2 Albania3 United Arab Emirates4 Armenia5 Australia6 Austria7 Azerbaijan8 Burundi (Anchor country)9 Belgium10 Benin11 Burkina Faso12 Bangladesh13 Bulgaria14 Bahrain15 Belarus16 Brazil17 Bhutan18 Botswana19 Canada20 Switzerland21 Chile22 China23 Cameroon24 Congo, Rep.25 Colombia26 Costa Rica27 Cyprus28 Germany29 Denmark30 Dominican Republic i Country31 Algeria32 Ecuador33 Spain34 Estonia35 Ethiopia36 Finland37 France38 Gabon39 United Kingdom40 Ghana41 Gambia, The42 Greece43 Guatemala44 Guyana45 Honduras46 Croatia47 Hungary48 Indonesia49 India50 Ireland51 Iran, Islamic Rep.52 Iraq53 Israel54 Italy55 Jamaica56 Jordan57 Japan58 Kazakhstan59 Kenya60 Cambodia i Country61 South Korea62 Kuwait63 Lao PDR64 Lebanon65 Liberia66 Sri Lanka67 Lesotho68 Lithuania69 Luxembourg70 Latvia71 Morocco72 Mexico73 Mali74 Myanmar75 Mongolia76 Mozambique77 Mauritius78 Malawi79 Malaysia80 Namibia81 Niger82 Nicaragua83 Netherlands84 Norway85 Nepal86 New Zealand87 Oman88 Pakistan89 Panama90 Peru i Country91 Philippines92 Papua New Guinea93 Poland94 Portugal95 Paraguay96 Qatar97 Russian Federation98 Rwanda99 Saudi Arabia100 Sudan101 Senegal102 El Salvador103 Slovenia104 Sweden105 Togo106 Thailand107 Tajikistan108 Trinidad and Tobago109 Tunisia110 Turkey111 Uganda112 Ukraine113 Uruguay114 United States115 Uzbekistan116 Vietnam117 Yemen, Rep.118 South Africa119 Zambia120 Zimbabwe APPENDIX C: RESULTS FOR LACSH MODELS

C.1. Posterior summaries for LACSH model, MML as treatment variable.

a_j                  5%     50%     95%
a
a                 -7.35   -4.17   -1.38
a
a
a
a
a                 -3.17   -0.11    2.93
a
a
a
a
a                 -4.47   -1.44    1.46
a
a
a                 -7.66   -3.59    0.53
β                 -0.01    0.04    0.10
β                 -0.02   -0.01    0.00
β                 -0.00    0.00    0.00
β                 -0.20   -0.01    0.18
β                 -0.24    0.07    0.41
β
σ_H
φ
Σ_Y{ , } ‡
Σ_Y{ , }          -0.47   -0.69   -0.47
Σ_Y{ , }          -0.40   -0.61   -0.40
Σ_Y{ , }
Σ_Y{ , }          -0.40   -0.62   -0.40
γ (1+K+Q)
γ                 -0.11    0.00    0.11
γ                 -0.10    0.04    0.17
γ                 -0.21   -0.08    0.05
γ                 -0.11    0.02    0.14
γ                 -0.07    0.07    0.20
γ                 -0.09    0.03    0.14
γ                 -0.02    0.12    0.26
γ                 -0.15   -0.03    0.08
γ                 -0.14   -0.01    0.12
γ                 -0.11    0.02    0.15
γ                 -0.13   -0.01    0.11
γ                 -0.19   -0.07    0.06
γ                 -0.13    0.00    0.13
γ                 -0.21   -0.08    0.05

 j  Metrics, Y
 1  Education index
 2  Employment to popn. ratio, 15+, total (%)
 3  GNI per capita (2011 PPP$)
 4  Internet users (% of popn.)
 5  Life expectancy at birth, total (years)
 6  Mortality rate, infant (per 1,000 live births)
 7  Population density
 8  Popn. with at least some secondary education (% ages 25 and older)
 9  Popn., ages 65 and older (% of total)
10  Popn., urban (% of total)
11  Renewable energy consumption (% of total final energy consumption)
12  Proportion of seats held by women in national parliaments (%)
13  Unemployment, total (% of total labor force)
14  POLITY index
15  Corruption Perception Index

K + Q  Previous years' covariates, Z* = (X*, Y*)

‡ Only the top five in magnitude of the posterior median are presented.

H_i
  i     5%     50%     95%
  1  -0.11   -0.08   -0.05
  2
  3  -0.15   -0.09   -0.04
  4  -0.01    0.02    0.04
  5
  6
  7  -0.04   -0.02    0.01
  8  -0.12   -0.07   -0.04
  9
 10  -0.11   -0.08   -0.05
 11  -0.14   -0.10   -0.07
 12  -0.07   -0.04   -0.01
 13
 14  -0.15   -0.08   -0.03
 15
 16  -0.00    0.02    0.04
 17  -0.08   -0.05   -0.02
 18  -0.08   -0.05   -0.02
 19
 20
 21
 22  -0.01    0.01    0.03
 23  -0.10   -0.06   -0.04
 24  -0.10   -0.06   -0.03
 25  -0.02    0.00    0.02
 26
 27
 28
 29
 30  -0.02   -0.00    0.02
 31  -0.04   -0.01    0.02
 32  -0.02    0.00    0.02
 33
 34
 35  -0.11   -0.08   -0.05
 36
 37
 38  -0.07   -0.04   -0.01
 39
 40  -0.09   -0.06   -0.03
 41  -0.12   -0.09   -0.06
 42
 43  -0.03   -0.01    0.01
 44  -0.04   -0.02    0.01
 45  -0.04   -0.02    0.01
 46
 47
 48  -0.04   -0.02    0.00
 49  -0.05   -0.03   -0.00
 50
 51  -0.07   -0.03    0.00
 52  -0.09   -0.05   -0.02
 53
 54
 55  -0.01    0.01    0.03
 56  -0.05   -0.02    0.01
 57
 58  -0.02    0.01    0.04
 59  -0.09   -0.06   -0.04
 60  -0.06   -0.03   -0.01
 61
 62  -0.13   -0.07   -0.03
 63  -0.08   -0.05   -0.02
 64  -0.01    0.01    0.04
 65  -0.10   -0.06   -0.02
 66  -0.01    0.02    0.05
 67  -0.13   -0.08   -0.05
 68
 69
 70
 71  -0.04   -0.02    0.01
 72  -0.00    0.02    0.05
 73  -0.15   -0.10   -0.07
 74  -0.05   -0.03    0.00
 75  -0.03    0.01    0.04
 76  -0.13   -0.09   -0.05
 77  -0.02    0.01    0.03
 78  -0.12   -0.08   -0.05
 79  -0.02    0.00    0.02
 80  -0.10   -0.06   -0.04
 81  -0.13   -0.09   -0.05
 82  -0.02   -0.00    0.02
 83
 84
 85  -0.06   -0.04   -0.01
 86
 87  -0.14   -0.08   -0.03
 88  -0.09   -0.06   -0.03
 89
 90  -0.00    0.02    0.05
 91  -0.03   -0.01    0.01
 92  -0.10   -0.07   -0.04
 93
 94
 95  -0.01    0.01    0.04
 96  -0.16   -0.09   -0.04
 97
 98  -0.11   -0.07   -0.04
 99  -0.10   -0.05   -0.01
100  -0.10   -0.06   -0.03
101  -0.10   -0.07   -0.04
102  -0.02   -0.00    0.02
103
104
105  -0.11   -0.07   -0.04
106  -0.02    0.00    0.03
107  -0.07   -0.04   -0.01
108  -0.02    0.01    0.04
109  -0.02    0.00    0.02
110  -0.02    0.00    0.02
111  -0.11   -0.07   -0.05
112
113
114
115  -0.07   -0.04   -0.01
116  -0.04   -0.01    0.01
117  -0.13   -0.09   -0.06
118  -0.07   -0.03   -0.00
119  -0.11   -0.07   -0.04
120  -0.09   -0.06   -0.03

C.2. Posterior summaries for LACSH model, GGHE as treatment variable.

a_j §                5%     50%     95%
a
a                 -6.54   -3.69   -0.88
a
a
a
a
a                 -2.46    0.42    3.32
a
a
a
a                 -0.95    1.91    4.82
a                 -3.17   -0.32    2.60
a
a
a                -13.72  -10.86   -8.24
β                 -0.07    0.02    0.15
β
β
β                 -0.01    0.01    0.03
β                 -0.03   -0.01    0.02
β
σ_H
φ
Σ_Y{ , } ¶
Σ_Y{ , }
Σ_Y{ , }
Σ_Y{ , }
Σ_Y{ , }
γ (1+K+Q)
γ                 -0.14   -0.00    0.14
γ                 -0.04    0.11    0.26
γ                 -0.16   -0.01    0.13
γ                 -0.13    0.01    0.15
γ                 -0.02    0.13    0.28
γ                 -0.16   -0.01    0.13
γ                 -0.06    0.09    0.24
γ                 -0.16   -0.03    0.11
γ                 -0.03    0.12    0.27
γ                 -0.24   -0.09    0.06
γ                 -0.13    0.01    0.15
γ                 -0.15   -0.01    0.13
γ                 -0.09    0.05    0.20
γ                 -0.03    0.12    0.27

§ Refer to Appendix C.1 for indexing of metrics and covariates.
¶ Only the top five in magnitude of the posterior median are presented.

H_i
  i     5%     50%     95%
  1  -0.10   -0.08   -0.07
  2  -0.00    0.00    0.01
  3
  4  -0.02   -0.01    0.00
  5
  6
  7  -0.01   -0.00    0.01
  8  -0.11   -0.09   -0.07
  9
 10  -0.09   -0.08   -0.07
 11  -0.10   -0.09   -0.07
 12  -0.07   -0.05   -0.04
 13
 14
 15
 16  -0.00    0.01    0.02
 17  -0.02   -0.01   -0.00
 18  -0.04   -0.02   -0.01
 19
 20
 21
 22
 23  -0.08   -0.07   -0.06
 24  -0.07   -0.06   -0.04
 25  -0.00    0.01    0.02
 26
 27
 28
 29
 30  -0.01   -0.00    0.00
 31
 32  -0.01    0.00    0.01
 33
 34
 35  -0.10   -0.08   -0.07
 36
 37
 38  -0.03   -0.02   -0.01
 39
 40  -0.07   -0.05   -0.04
 41  -0.10   -0.09   -0.07
 42
 43  -0.04   -0.03   -0.02
 44  -0.04   -0.03   -0.02
 45  -0.04   -0.03   -0.03
 46
 47
 48  -0.03   -0.02   -0.01
 49  -0.05   -0.04   -0.03
 50
 51
 52  -0.04   -0.03   -0.01
 53
 54
 55  -0.02   -0.01   -0.00
 56  -0.01   -0.00    0.01
 57
 58
 59  -0.08   -0.06   -0.05
 60  -0.06   -0.04   -0.03
 61
 62
 63  -0.06   -0.04   -0.03
 64
 65  -0.11   -0.09   -0.08
 66  -0.01    0.00    0.01
 67  -0.09   -0.07   -0.06
 68
 69
 70
 71  -0.02   -0.01   -0.00
 72
 73  -0.11   -0.09   -0.08
 74  -0.05   -0.04   -0.03
 75  -0.02   -0.01    0.01
 76  -0.13   -0.11   -0.09
 77
 78  -0.10   -0.09   -0.07
 79
 80  -0.04   -0.03   -0.02
 81  -0.11   -0.09   -0.08
 82  -0.03   -0.02   -0.01
 83
 84
 85  -0.07   -0.06   -0.05
 86
 87
 88  -0.07   -0.06   -0.05
 89
 90  -0.01    0.00    0.02
 91  -0.03   -0.02   -0.01
 92  -0.07   -0.05   -0.04
 93
 94
 95  -0.02   -0.01    0.00
 96
 97
 98  -0.08   -0.07   -0.05
 99
100  -0.08   -0.06   -0.05
101  -0.08   -0.06   -0.05
102  -0.02   -0.01   -0.00
103
104
105  -0.10   -0.08   -0.07
106
107  -0.07   -0.06   -0.05
108
109  -0.01    0.00    0.01
110
111  -0.10   -0.08   -0.07
112  -0.01    0.00    0.01
113
114
115  -0.04   -0.03   -0.02
116  -0.03   -0.02   -0.01
117  -0.09   -0.08   -0.06
118  -0.04   -0.02   -0.01
119  -0.08   -0.07   -0.05
120  -0.09   -0.08   -0.06

APPENDIX D: ADDITIONAL INTERESTING RESULTS ON PARAMETERS

FIG D.1. Plot of employment-to-population ratio vs. posterior median of latent health, with GGHE as treatment and with a least squares regression fit for visualization

Figure D.1 shows a negative relationship between employment-to-population ratio and countries' latent health, conditioned on other metrics with GGHE as treatment. The relevant discussions appear in Section 5.1.2.

Additionally, with GGHE as treatment, the metric 'renewable energy consumption' (shown in Figure D.2 and discussed in Section 6.3) also has a negative relationship with the country's latent health (the 90% credible interval for the corresponding $a_j$ is (−[…], −[…])).

FIG D.2. Plot of renewable energy consumption vs. posterior median of latent health, with GGHE as treatment and with a least squares regression fit for visualization

There are a few possible explanations for this negative relationship. For example, lower-income countries lack the capital to expand existing infrastructure for electricity access, especially into rural areas; hence certain renewable energy sources that do not rely on existing infrastructure serve as more viable options. In addition, high-income countries may be reluctant, for various reasons, to transition from established infrastructure to renewable energy sources.

APPENDIX E: POSTERIOR AVERAGE DOSE-RESPONSE CURVES FOR SUBSAMPLE OF COUNTRIES

FIG E.1. Posterior median dose-response function (red), with GGHE as treatment, for the subsample of 82 countries, with 100 thinned MCMC samples (gray) for visualization of uncertainty

Figure E.1 shows the average dose-response curves for 82 countries, a subsample of the original set of 120 countries. Subsampling was done to address the consistent pattern of covariate imbalance in Figure 10. Compared to the full sample, the average dose-response curve for the subsample shows less uncertainty, with tighter credible bands.

FIG E.2. Plot of p-values (reversed-scale) for assessing covariate balance, with GGHE as treatment, for the subsample of 82 countries

Figure E.2 shows no consistent pattern of covariate imbalance.

REFERENCES

[1] An, W. (2010). 4. Bayesian Propensity Score Estimators: Incorporating Uncertainties in Propensity Scores into Causal Inference. Sociological Methodology.
[2] Banerjee, S. (2005). On geodetic distance computations in spatial modeling. Biometrics.
[3] Chapman, B., Higgins, T., Lin, L. et al. (2008). Sharing the costs of parental leave: Paid parental leave and income contingent loans. Information Paper (Committee for Economic Development of Australia) iv.
[4] Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association.
[5] Chiu, G. S. and Westveld, A. H. (2011). A unifying approach for food webs, phylogeny, social networks, and statistics. Proceedings of the National Academy of Sciences.
[6] Chiu, G. S., Wu, M. A. and Lu, L. (2013). Model-based assessment of estuary ecosystem health using the latent health factor index, with application to the Richibucto Estuary. PloS one e65697.
[7] Chiu, G. S., Guttorp, P., Westveld, A. H., Khan, S. A. and Liang, J. (2011). Latent health factor index: A statistical modeling approach for ecological health assessment. Environmetrics.
[8] Cole, S. R. and Frangakis, C. E. (2009). The consistency statement in causal inference: a definition or an assumption? Epidemiology.
[9] Conceição, P. and Bandura, R. (2008). Measuring subjective wellbeing: A summary review of the literature. United Nations Development Programme (UNDP) development studies, working paper.
[10] Organisation for Economic Co-operation and Development (OECD) (2016). OECD Better Life Index.
[11] Center for Systemic Peace (2016). The Polity Project.
[12] New Economics Foundation (2016). Happy Planet Index 2016: Methods Paper. Technical Report.
[13] Gelfand, A. E. and Dey, D. K. (1994). Bayesian model choice: asymptotics and exact calculations. Journal of the Royal Statistical Society: Series B (Methodological).
[14] Gelfand, A. E., Diggle, P., Guttorp, P. and Fuentes, M. (2010). Handbook of spatial statistics. CRC Press.
[15] Gelman, A. and Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York, NY, USA.
[16] Gleditsch, K. S. and Ward, M. D. (2001). Measuring space: A minimum-distance database and applications to international studies. Journal of Peace Research.
[17] Hanson, T. E. (2006). Inference for mixtures of finite Polya tree models. Journal of the American Statistical Association.
[18] Hashimoto, T., Oda, K. and Qi, Y. (2018). On Well-being, Sustainability and Wealth Indices beyond GDP: A guide using cross-country comparisons of Japan, China, South Korea. Economic Studies.
[19] Helliwell, J. F. and Huang, H. (2014). New measures of the costs of unemployment: Evidence from the subjective well-being of 3.3 million Americans. Economic Inquiry.
[20] Hernán, M. and Robins, J. (2020). Causal inference. Boca Raton: Chapman & Hall/CRC, forthcoming.
[21] Hirano, K. and Imbens, G. W. (2004). The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives.
[22] Hoeting, J. A., Davis, R. A., Merton, A. A. and Thompson, S. E. (2006). Model selection for geostatistical models. Ecological Applications.
[23] Hoff, P. D. (2009). A first course in Bayesian statistical methods. Springer.
[24] Horrace, W. C. (2005). Some results on the multivariate truncated normal distribution. Journal of Multivariate Analysis.
[25] Imai, K. and van Dyk, D. A. (2004). Causal inference with general treatment regimes: Generalizing the propensity score. Journal of the American Statistical Association.
[26] Imbens, G. W. (2000). The role of the propensity score in estimating dose-response functions. Biometrika.
[27] Imbens, G. W. and Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.
[28] Transparency International (2018). Corruption Perception Index.
[29] Jackman, S. (2001). Multidimensional analysis of roll call data via Bayesian simulation: Identification, estimation, inference, and model checking. Political Analysis.
[30] Kaplan, D. and Chen, J. (2012). A two-step Bayesian approach for propensity score analysis: Simulations and case study. Psychometrika.
[31] Keele, L. J. and Titiunik, R. (2015). Geographic boundaries as regression discontinuities. Political Analysis.
[32] Klugman, J., Rodríguez, F. and Choi, H.-J. (2011). The HDI 2010: new controversies, old critiques. The Journal of Economic Inequality.
[33] Kluve, J., Schneider, H., Uhlendorff, A. and Zhao, Z. (2012). Evaluating continuous training programmes by using the generalized propensity score. Journal of the Royal Statistical Society: Series A (Statistics in Society).
[34] Kubiszewski, I., Costanza, R., Franco, C., Lawn, P., Talberth, J., Jackson, T. and Aylmer, C. (2014). Beyond GDP: are there better ways to measure well-being? The Conversation.
[35] Lea, R. A. (1993). World Development Report 1993: Investing in Health.
[36] Martin, A. D. and Quinn, K. M. (2002). Dynamic ideal point estimation via Markov chain Monte Carlo for the US Supreme Court, 1953–1999. Political Analysis.
[37] McCandless, L. C., Gustafson, P. and Austin, P. C. (2009). Bayesian propensity score analysis for observational data. Statistics in Medicine.
[38] McCandless, L. C., Douglas, I. J., Evans, S. J. and Smeeth, L. (2010). Cutting feedback in Bayesian regression adjustment for the propensity score. The International Journal of Biostatistics.
[39] McGillivray, M. and Clarke, M. (2006). Human well-being: Concepts and measures. United Nations University Press.
[40] Noreen, S. (2018). Quantifying the Impact of Local SUTVA Violations in Spatiotemporal Causal Models. PhD thesis, Emory University.
[41] OECD (2018). OECD.Stat.
[42] Parcero, O. J. and Ryan, J. C. (2017). Becoming a knowledge economy: the case of Qatar, UAE, and 17 benchmark countries. Journal of the Knowledge Economy.
[43] Pearl, J. (2009). Causality. Cambridge University Press.
[44] United Nations Development Programme (2018). Human Development Indices and Indicators: 2018 Statistical Update. Technical Report.
[45] Rabe-Hesketh, S. and Skrondal, A. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Chapman and Hall/CRC.
[46] Rijpma, A. et al. (2016). What can't money buy? Wellbeing and GDP since 1820. Technical Report No. Working Papers 0078, Utrecht University, Centre for Global Economic History.
[47] Roberts, G. O. and Rosenthal, J. S. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics.
[48] Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika.
[49] Rosling, H. (2019). Factfulness. Flatiron.
[50] Rubin, D. B. (2005). Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association.
[51] Sachs, J. D., Layard, R., Helliwell, J. F. et al. (2018). World Happiness Report 2018. Technical Report.
[52] Schutte, S. and Donnay, K. (2014). Matched wake analysis: finding causal relationships in spatiotemporal event data. Political Geography.
[53] South, A. (2011). rworldmap: A New R package for Mapping Global Data. R Journal.
[54] Stern, S., Wares, A. and Epner, T. (2018). 2018 Social Progress Index Methodology Summary. Technical Report.
[55] Treier, S. and Jackman, S. (2008). Democracy as a latent variable. American Journal of Political Science.
[56] Ward, M. D. and Gleditsch, K. S. (2018). Spatial regression models. Sage Publications.
[57] Wulfgramm, M. (2014). Life satisfaction effects of unemployment in Europe: The moderating influence of labour market policy. Journal of European Social Policy.
[58] Yang, L. (2018). Measuring well-being: a multidimensional index integrating subjective well-being and preferences. Journal of Human Development and Capabilities.
[59] Zigler, C. M. (2016). The Central Role of Bayes Theorem for Joint Estimation of Causal Effects and Propensity Scores. The American Statistician.
[60] Zigler, C. M. and Dominici, F. (2014). Uncertainty in propensity score estimation: Bayesian methods for variable selection and model-averaged causal effects. Journal of the American Statistical Association.
[61] Zigler, C. M., Watts, K., Yeh, R. W., Wang, Y., Coull, B. A. and Dominici, F. (2013). Model feedback in Bayesian propensity score estimation. Biometrics 69.