The sociospatial factors of death: Analyzing effects of geospatially-distributed variables in a Bayesian mortality model for Hong Kong
Thayer Alshaabi, David Rushing Dewhurst, James P. Bagrow, Peter Sheridan Dodds, Christopher M. Danforth
TThe sociospatial factors of death: Analyzing effects of geospatially-distributedvariables in a Bayesian mortality model for Hong Kong
Thayer Alshaabi,
1, 2, ∗ David Rushing Dewhurst,
1, 2, 3
James P. Bagrow,
1, 4
Peter Sheridan Dodds,
1, 2, 4 and Christopher M. Danforth
1, 2, 4, † Vermont Complex Systems Center, The University of Vermont, Burlington, VT 05405. Computational Story Lab, The University of Vermont, Burlington, VT 05405 MassMutual Data Science, Boston, MA 02110 Department of Mathematics & Statistics, The University of Vermont, Burlington, VT 05405 (Dated: July 17, 2020)Human mortality is in part a function of multiple socioeconomic factors that differ both spatiallyand temporally. Adjusting for other covariates, the human lifespan is positively associated withhousehold wealth. However, the extent to which mortality in a geographical region is a function ofsocioeconomic factors in both that region and its neighbors is unclear. There is also little informationon the temporal components of this relationship. Using the districts of Hong Kong over multiplecensus years as a case study, we demonstrate that there are differences in how wealth indicatorvariables are associated with longevity in (a) areas that are affluent but neighbored by sociallydeprived districts versus (b) wealthy areas surrounded by similarly wealthy districts. We also showthat the inclusion of spatially-distributed variables reduces uncertainty in mortality rate predictionsin each census year when compared with a baseline model. Our results suggest that geographicmortality models should incorporate nonlocal information (e.g., spatial neighbors) to lower thevariance of their mortality estimates, and point to a more in-depth analysis of sociospatial spillovereffects on mortality rates.
I. INTRODUCTIONA. Mortality risks and social deprivation
Temporal dynamics of mortality risks with regards tonation-wide epidemics [1, 2], pollution [3, 4], and lifequality and expectancy [5] have been widely studiedthroughout the last decade in Hong Kong and aroundthe world. Researchers have hypothesized and identifiedmany connections between longevity, social deprivation,and socioeconomic inequality. Studies have put forwardevidence of disparities in mortality risks all around theglobe [6–8].Notably, there are many different definitions ofsocial deprivation. Messer et al. ’s study [9] pro-vides a well-written overview of the literature regard-ing neighborhood-level socioeconomic deprivation. Thestudy not only discusses ways in which deprivation hasbeen defined throughout the last decade of research, butit also proposes a new method to calculate and hope-fully standardize what they call a “neighborhood depri-vation index” (NDI). The authors highlight the limita-tions of definitions found in the literature. Using prin-cipal components analysis (PCA) on census data from1995 to 2001, they demonstrate the effectiveness of theirproposed measurement at capturing socioeconomicallydeprived areas across the US.Researchers have investigated various socioeconom-ic, psychological, and behavioral factors of mortality ∗ [email protected] † [email protected] risks [10]. In the present study, we use a set of socialsocioeconomic attributes including income, unemploy-ment, access to life insurance, and mobility, to define andcapture the magnitude of social deprivation compared toan average baseline for Hong Kong.The notion of disparity in health and mortality risksis often studied using population-scale inputs such as airpollution, as well as sensitive individual variables such asage, race, and gender with respect to the privacy concernsthat emerge from such applications [ ? ]. Ou et al. [11]infer socioeconomic status by type of housing, education,and occupation. They find that regions with lower socioe-conomic status have a higher ratio of air pollution whenthey look at the spatial correlation of groups living inpublic rental housing compared to areas with a higherdensity of private housing. They also report that areaswith a higher density of blue-collar workers have a higherair pollution-associated mortality than others.Chung et al. [12] show evidence of disparities in mor-tality risks when considering age as a control variable.The authors investigate the effect of socioeconomic sta-tus concerning the rapid economic development of HongKong from 1976 to 2010. They find a decline in socioe-conomic disparity in mortality risks across the countrythroughout that period. Mortality rates are generallyhigher for lower socioeconomic status regions regardlessof gender. They also show that various health benefitsbrought by economic growth are greater for regions withhigher socioeconomic status. The market share of healthbenefits is unequally distributed among groups of vary-ing status: Individuals with higher socioeconomic statushave access to greater benefits than those of lower socioe-conomic status.Typeset by REVTEX a r X i v : . [ phy s i c s . s o c - ph ] J u l B. Spatial association of mortality risks
The spatial associations between income inequalityand health risks are widely understood both internation-ally [13–17], and for individual cities and states [18, 19].Although many studies have investigated income inequal-ity [20–23] and compared that across countries [24], moststudies use a spatially localized approach in their investi-gations. Nonetheless, studies have shown the importanceof spatial associations in identifying relations betweensocioeconomic deprivation and longevity.Geographically weighted regression (GWR) [25, 26]is a commonly used approach designed to account forspatial associations. Local attributes can play a strongrole in the model dynamics given the assumption thatsocioeconomic factors vary geographically, and GWR wasintroduced to overcome the challenge of uncertainty ina global (e.g., average) spatial regression model. Theauthors argue that socioeconomic attributes are intrinsi-cally intra-connected over space due to the mechanismsby which communities evolve. Their study makes anempirical comparison of their method to other station-ary regression models to investigate the spatial distribu-tion of long-term illness on a data set published by theUK Census of Population. They show that GWR canaccount for the spatial nonstationarity between differentvariables in the data.Researchers have looked into the spatial associa-tion between air pollution and mortality in HongKong [27], Czech [28], Rome [29], and France [30]. Coss-man et al. [31] examine the spatial distribution of mortal-ity rates over 35 years starting from 1968 to 2002 acrossall counties in the US. Unlike previous studies, the Coss-man et al. study highlights a nonrandom pattern of clus-tering in mortality rates in the US where high mortalityrates are primarily driven by economic decline. Howev-er, counties in the middle-upper plain of the US showan interestingly unique pattern. Although these countieshave followed a similar economic decline and an increas-ing rate of out-migration, they still have the lowest mor-tality rates across the country. By contrast, the south-eastern counties have the highest mortality rates despitehaving similar rates of economic decline and migration.To assess geospatial associations between pollutionand mortality in Hong Kong, Thach et al. [32] exam-ine the spatial interactions of tertiary planning units(TPUs) [33]—similar to census-blocks in the US. Theauthors indicate a positive spatial correlation betweenmortality rates and seasonal thermal changes across theregion. They argue that the variation between TPUs is akey factor for cause-specific mortality rates. Their resultsshow that socioeconomically deprived regions have ahigher rate of mortality, especially during the winter sea-son.
C. Sociospatial factors of death
Studying the relative spatial interactions of social andeconomic indicators is certainly not a new field. Investi-gations of such relations can be noted throughout the lit-erature measuring nonlocal and/or interdependent inter-actions of inequality in health care [34], peer pressure ineducation [35, 36], and decision-making [37, 38].Yang et al. [39] argue that mortality rates of countiesin the US are highly associated with social and economi-cal features found in neighboring counties. Their findingssuggest that mortality rates in a given county are highlydriven by social indicators from bordering counties dueto the spillover of socioeconomic wealth or social depri-vation across neighborhoods.Another recent study by Holtz et al. [40] highlightsthe significant influence of nonlocal interactions such aspeer effect and spillovers on regional policies regardingthe global outbreak of COVID–19.To identify and examine broader dimensions ofinequality from a spatial point of view, various stud-ies have either investigated the geographical effect ofneighborhoods [41], or proposed new methods [42] simi-lar to Moran’s I [43] and spatial auto-regression. Using anetwork-based approach to explore the dynamics of com-munities and their impact on mortality risks, we presenthere a small-case study using a collection of datasets fromHong Kong.Though Hong Kong is a small island territory, itexhibits significant variation in occupations, income, for-eign inhabitant density, and residence status of workers.We demonstrate heterogeneity of such exogenous featuresand investigate nonlocal behavioral interactions of wealthand social deprivation across neighborhoods. We presenta statistical analysis comparing local and nonlocal modelsto show the importance of spatial associations for mor-tality modeling. In particular, we use a spatial networkapproach to study a socioeconomic peer-pressure effectamong communities. For instance, we investigate howthe magnitude of a socially deprived area can consequent-ly have a nonlocal effect on its neighbors’ mortality risks.Similarly, we examine how the spatial spread of wealthof an affluent area can spillover to its surrounding areas,and thus impact their longevity. Our work not only showsthe deep influence of these spatial interactions of neigh-borhoods on predicting mortality rates, but also providesa method for investigating systematic inference errors ofmortality models.
II. METHODS
We introduce and analyze the datasets in Sec. II A.Detailed descriptive analysis of our data set variablescan be found in Appendix A. For our statistical analysis,we employ a set of simple Bayesian generalized additivemodels to predict mortality rates across districts in HongKong. We use three different models to illustrate the roleof spatial associations by comparing models with spa-tial features to a baseline model without spatial features.First, we present our local model that does not use anyspatial information in Sec. II B 1. We compare our Base-line model to two nonlocal spatial models in Sec. II B 2.Our first nonlocal model uses spatial features from thenearest neighbours, while the second uses features fromall neighbours weighted by their distance to the targetarea. We will refer to the nonlocal models as SP, andWSP respectively.
A. Data sources
Mortality data:
We use the official counts of knownand registered deaths provided by the Census and Statis-tics Department of Hong Kong [47]. The data set con-tains 892,055 death records between 1995 and 2017.Every record includes a wide range of information such asage, gender, and place of death (TPU) [33]—a geospatialreference system used to report population census statis-tics.
Census data:
We collected socioeconomic variablescurated by the Census and Statistics Department of HongKong [48]. We have three snapshots at 5-year intervals,2006, 2011, and 2016. For each year, the data set includesthe total population density by district, median income,median rent to income ratio, median monthly householdincome, unemployment rates, unemployment rates acrosshouseholds, unemployment rates among minorities, theproportion of homeless people, the proportion of home-less mobile residents, the proportion of single parents, theproportion of households with kids in school, the propor-tion of households with children (aged under 15), and theproportion of households with elderly (aged 65 or above).
Life insurance data:
We have obtained data from aHong Kong based life insurance provider. According tothe Hong Kong Insurance Authority [49], our providerhad roughly 2.5% market share of all non-linked individ-ual life insurance policies issued in Hong Kong in 2016.We normalize the number of polices issued at the districtlevel by population size for each time snapshot to reportthe proportion of individuals insured by each district.
Geospatial unit:
Initially, we planned to use TPU asthe main geospatial unit to combine our data sets spa-tially. Unfortunately, most death records are found ina only small fraction of TPUs likely due to data priva-cy concerns—records in small TPUs may reveal sensi-tive information about specific individuals there. Giventhat the mortality data set has a large fraction of missingTPUs, and to avoid any risk of identifying individuals inthe data set, we use districts as our main spatial unit ofanalysis [44]. This choice is consistent with prior work,where most studies have either filtered out small TPUsin their analyses [12, 27], or aggregated their records atthe district level [32] to overcome this challenge.
Categorization:
We organize our features into threedifferent categories. 1.
Base : This set has most of the socioeconomic fea-tures in our data sets such as population density,unemployment rates, the proportion of homelesspeople, mobile residents, and single-parent house-holds. However, we do not include wealth-, age-, orrace-related features here.2.
Wealth : In addition to the base features describedabove, we include median income, median rent-to-income ratio, median monthly household income,and life insurance coverage by the district.3.
All : This set includes all features in our data setincluding sensitive variables—from a sociopoliticalperspective—such as the proportion of minorities,and unemployment rates among minorities. Thisset also includes age-related features such as theproportion of young and elderly households.
B. Statistical methods
1. Local model (Baseline)
We treat the design tensors X as exogenous vari-ables and do not model their evolution across time.A “design tensor” is a rank 3 tensor given by X =( X , X , . . . , X T ) where X t is the design matrix fortime period t . Each design matrix X t is of dimension N × ( p + 1), where N is the number of observations and p is the number of predictors. The +1 in the dimension-ality of the design matrix is because we add a constantto the linear model. The dynamics of the local modelsare described by a system of linear equations, y t = X t β t + σ t u t , (1) β t = β t − + µ + Lv t , (2)log σ t = log σ t − + µ + (cid:96)w t , (3)for t = 1 , ..., T .Eq. 1 is an ordinary linear model for the response vec-tor y t as a function of the design matrix X t and coeffi-cients β t . We presently define the quantities that com-pose Eqs. 2 and 3. We set u t , v t ∼ MultivariateNormal( , I ) (4)in Eqs. 1 and 2, while w t ∼ Normal(0 , I we meanthe ( p + 1) × ( p + 1) identity matrix. Hence the modellikelihood is p ( y | β , σ ) = T (cid:89) t =1 N (cid:89) n =1 p ( y tn | X tn β t , σ t )= T (cid:89) t =1 N (cid:89) n =1 Normal( X tn β t , σ t ) . (5)We display a graphical model corresponding to Eq. 5 inFig. 2. FIG. 1.
Spatial networks of Hong Kong’s districts . ( A ) We display a map of the 18 districts of Hong Kong as outlinedin Ref. [44]. We cross-reference the roads and bridges connecting these districts to build a spatial network [45, 46]. ( B ) Wedemonstrate the first undirected network layout of Hong Kong’s districts. Districts (nodes) are linked if they border each otheror share a direct road/bridge in a binary fashion. ( C ) We show a fully connected version of the network. For the fully connectednetwork, edges to neighboring districts are weighted exponentially. Different weighting schemes can be applied here, however,for our application we use the spatial distance measured by the length of the shortest path connecting any two districts on thenetwork. FIG. 2.
A graphical model representing the likelihoodfunction given in Eq. 5.
Latent log σ t and β t evolve asbiased random walks, while y tn and X tn are treated as observ-able random variables and exogenous parameters respectively.The entire temporal model is plated across districts N = 18. We assume a prior on β t that evolves as a biased ran-dom walk with drift given by µ and correlation matrix Σ with Cholesky decomposition L . We make this choicebecause we a priori believe that β t does not remain con-stant throughout the time under study, though we areunsure of exactly how it changes over this time. Likewise,we suppose that log σ t evolves according to a univariaterandom walk with drift given by µ and standard devia-tion (cid:96) . We make this assumption for the same reason:We do not believe it is likely that σ t remains constantover the entire time period of study. The random walkpriors for β t and σ t are each centered about zero becausewe impose a zero mean prior on µ and µ . We initializethese random walks with zero-centered multivariate nor-mal initial conditions, β ∼ MultivariateNormal( , Σ ) , (6)and log σ ∼ Normal(0 , (cid:96) ) . (7)The distribution of β T ≡ ( β , ..., β T ) is thus given by p ( β T | µ , Σ ) = T (cid:89) t =1 p ( β t | β t − , µ , Σ ) (8)= T (cid:89) t =1 MultivariateNormal( β t − + µ , Σ ) , (9)with an analogous (univariate) distribution holding forlog σ . We set µ ∼ MultivariateNormal( , I ) , (10) µ ∼ Normal(0 , , (11)so that the prior distribution over the paths of the regres-sion coefficients and log standard deviation are centeredabout zero—the null hypothesis—for all t . We place a uniform prior (LKJ(1)) over the correlationcomponent of Σ . The mean of this prior is at the identi-ty matrix. The vector of standard deviations of Σ , s , ishypothesized to follow an isotropic multivariate log nor-mal distribution as s ∼ LogNormal(0 , ,
1) prior on (cid:96) , the standard devi-ation of the increments of log σ t . We make this choicebecause we do not possess prior information about theappropriate noise scale of β t or log σ t and the log normaldistribution is a weakly-informative prior that does notencode much prior information about their noise scales.We did not perform exact inference of this model butrather fit parameters of a surrogate variational posterior(“guide”) distribution. We made this choice because vari-ational inference replaces the process of sampling withthe much faster problem of optimization [50]. Denotingthe vector of all latent random variables by z ≡ ( µ ( σ ) , (cid:96), µ ( β ) , Σ ( β ) , log σ T , β T ) , (12)we fit the parameters ψ of an approximate posterior dis-tribution q ψ ( z ) to maximize the variational lower bound,defined as the expectation under q ψ ( z ) of the differencebetween the log joint probability and log q ψ ( z ) [51]. Wechose a low-rank multivariate normal guide with rankequal to approximately √ dim z . This low rank approxi-mation allows for modeling of correlations in the poste-rior distribution of z with a lower number of parametersthan, for example, a full-rank multivariate normal guidedistribution. All bounded latent random variables arereparameterized to lie in an unconstrained space so thatwe could approximate them with the multivariate normalguide.
2. Nonlocal models (SP and WSP)
We use the road networks in Hong Kong [45, 46] tobuild a spatial network of the 18 districts of Hong Kong.Each node in the network represents a single district.Nodes are linked if they share a direct road or bridge.In Fig. 1A, we show a map of Hong Kong’s districts.We display an undirected network of districts in Fig. 1B.By contrast, we show a fully connected version of thenetwork in Fig. 1C. Edges are weighted by their spatialdistance measured by the length of the shortest path d ij to reach from district i to district jw ij = exp {− ( d ij − } , (13)and weights decays exponentially as the length of theshortest path increases between any two districts.Similarly, we treat the adjacency matrix A as an exoge-nous variable. We fit two nonlocal models that takeaccount of the design matrix associated with each dis-trict’s neighbors; a spatial model (SP) that uses the bina-ry adjacency matrix and a weighted spatial model (WSP)that uses the weighted adjacency matrix. The equations FIG. 3.
Temporal dynamics of spatial socioeconomic characteristics of Hong Kong . We show the spatial distributionof five features in our datasets for three different census years. Here, heatmaps are normalized by the mean and standarddeviation. Darker shades of red show areas above the mean for each of these variables while shades of grey show areas belowthe mean. (
A–C ) We display the spatial growth of population over time. (
D–F ) We demonstrate the variation of mortalityrates, and life insurance converge (
G–I ). We see some segregation of unemployment rates in (
J–L ), and median income in(
M–L ). describing the time evolution of this data generating pro-cess are y t = X t β t + f ( X t ⊗ B ) γ t + σ t u t , (14) β t = β t − + µ + Lv t , (15) γ t = γ t − + m + Λ q t , (16)log σ t = log σ t − + µ + (cid:96)w t , (17)for t = 1 , ..., T . The rank three tensor X t ⊗ B is theouter product of B ≡ A − I with the design matrix X t . The function f is a reduction function that low-ers the rank of the tensor by one by collapsing the firstdimension. Here we take f to be the mean across thefirst dimension. In other words, f ( X t ⊗ B ) is a designmatrix where f ( X t ⊗ B ) ij is the average of the valuesof predictor j over all of the neighbors of district i in thenetwork.The prior distributions for γ t , m , Λ , and q t are identi-cal to those for β t , µ , L , and u t except their dimension-ality is lowered from p + 1 to p since we do not includean intercept term in f ( X t ⊗ B ).We use Pyro [52], a probabilistic programming lan-guage that operates on top of
Pytorch [53], a dynamicgraph differentiable programming library, to implementour models. Our source code is available on Gitlab [54].
III. RESULTS AND DISCUSSION
In Fig. 3, we show the spatial distribution of socioe-conomic characteristics for 2006, 2011, and 2016. Eachheatmap is normalized by the mean and standard devi-ation for each year. Darker shades of red show areasabove the mean for each of these variables while shadesof grey show areas below the mean. We show normal-ized population density in Figs. 3A through C. We seedense clusters both at the center of the country and onthe northwestern side. We note high mortality rates (seeFigs. 3D–F) are evident in the southern islands while lifeinsurance policies decline there. Life insurance coveragevaries across Hong Kong but is still higher in denselyurbanized areas shown in Figs. 3G through I. The north-western territories have higher rates of unemploymentcompared to the southeastern side of Hong Kong as wesee in Figs. 3J–L. In Figs. 3M–O, we observe that theeast and center districts have higher normalized medianincome when adjusted for inflation. We display addition-al statistics regarding households rather than individualsin Fig. S2.Our models provide evidence to suggest that thereare significant relationships between socioeconomic vari-ables, such as household unemployment, percentage ofsingle parents, and mortality rate. Many of these rela-tions are significant in each of the census periods understudy (2006, 2011, and 2016) while other relations aresignificant for some of the census periods but not others.We display the distributions of our models’ coefficientsand associated significance assessments in Figs. S6–S9. To take a closer look at this variation across mod-els, we compute the difference between the mean of theprediction distributions and the ground truth mortali-ty rates for each district. We provide empirical pdfs ofthe error distributions in Figs. 4 and S13–S14. If themodels accurately associate features with observed mor-tality rate, these distributions should be centered aroundzero. Conversely, if the models display systematic over-or under-estimation of mortality rate, these distributionwill not be centered around zero. In Fig. 4, we examinedistribution of errors for each model and for each district(observation) for 2016 which is possible since our modelsare fully Bayesian and generate a distribution of possibleoutcomes.We assess significance of model coefficients using cen-tered Q % credible intervals (CI). A centered Q % credi-ble interval of a probability density function (pdf) p ( x )is an interval ( a, b ) defined such that (1 − Q ) = (cid:82) a −∞ dx p ( x ) = (cid:82) ∞ b dx p ( x ). We measure the signifi-cance of systematic errors in each model by computingthe 80% CI, whereby systematic overestimation is col-ored in orange (CI > < FIG. 4.
Systematic error distributions for models trained on the default set of features in 2016 . We examinesigned errors by computing the difference between the mean of prediction distributions p i and the ground truth mortality rates y for each district. A perfect model would have a narrow error distribution centred on 0 (solid red line going across). Positive errorvalues demonstrate overestimation whereas negative values show underestimation of mortality rates for each district. Modelswith significance systematic overestimation are colored in orange while models with significance systematic underestimationare colored in blue as measured by the 80% CI. We show the same distributions for models trained on wealth-related, and allfeatures in Fig. S13 and Fig. S14, respectively. FIG. 5.
Impact of sociospatial factors on mortality risks. . The first three rows show normalized inference error for fourdistricts that are poorly fit by our models. (
A–B ) We show districts with systematic overestimation of mortality rates, while(
C–D ) show districts where mortality rates are systematically underestimated. For each district, we show the normalized valueof some features of interest along with the average value of these features in the neighboring districts indicated by (*) symbol.The red dashed line indicate the average value for each of these normalized features centered at zero. Evidence of sociospatialfactors of longevity can be seen in all four districts. Particularly, we note a spillover of wealth measured by median income.Districts in ( A ) and ( B ) maintain lower mortality rates while surrounded by districts with around average mortality rates. Bycontrast, districts in ( C ) and ( D ) have a socioeconomic pull driving the whole neighborhood to have higher mortality rates. number of mobile residents than average (see Fig. 5C). InFig. 5D, we see an extraordinary higher number of home-less people compared to the neighboring population. Oursystematic errors can also be driven by a higher percent-age of minorities in this district, which hints at disparitiesin mortality risks in the area. Further analyses with fin-er geospatial resolution would be needed to explain thisbehavior.We also analyze our mortality models’ error distribu-tions through a local ego network approach. An ego net-work of a node is the network consisting that node andits nearest neighbors. Here each node is a district and itsneighbors are that district’s neighboring districts. Thejoint distribution of mortality at time t and district i , y ti ,and model mortality prediction error e ti ≡ p ti − y ti condi-tioned on location at node i , can be concisely represent-ed in the form of an ego network for each district. Wedisplay an example of this representation in Fig. 6. Wecolor nodes by standardized mortality and edges by stan-dardized prediction error. Though this is an exploratorymethodology that deserves greater expansion and atten-tion in future work, we note qualitatively that the localview of predicted mortality versus true mortality variessubstantially as a function of district. For example, thedistrict of Wan Chai is connected to four other districts, three of which have substantially-higher mortality thanaverage and in particular higher mortality than Wan Chaiitself, which has lower mortality than average. The mod-el predictions for these neighbors are lower than theirtrue mortality. An observer in Wan Chai who has accessonly to the model predictions and not the true mortali-ty data would rationally assume that mortality in thesedistricts is much lower than it truly is and could sub-sequently make further inferences or decisions based onfaulty-but-rational assumption.Notably, socioeconomic diversity or heterogeneity ofthe neighborhood can change the local perception ofmortality models across neighboring communities. Weobserve that high divergence from the neighborhood isassociated with higher rates of uncertainty/error in themodels. One can propose using prediction error as aproxy of what the models have learned (or failed to learn)about socioeconomic status of neighborhood relative tothe target district. We envision future work incorporat-ing this sort of information to fine tune mortality mod-els.0 FIG. 6.
Ego networks of each district demonstrating sociospatial factors of mortality for the 2016 weightedspatial model . We display ego networks of each district in Hong Kong and its nearest neighbors in the road and bridgenetwork. The central node (highlighted with a grey box) of each network corresponds to the labelled district. Neighborsare not arranged around the ego district geographically. Node color corresponds to normalized mortality rate and edge colorcorresponds to signed prediction error for the 2016 WSP model. These ego networks encode a qualitative measure of thesociospatial factors in mortality modeling. We display the equivalent networks for the Baseline and SP models in Fig. S15 andFig. S16 respectively.
IV. CONCLUDING REMARKS
Our results support our hypothesis that spatial asso-ciations of wealth or social deprivation among neighbor-hoods has a direct and sometimes substantial impact onmortality risks. Our experiments reveal that localizedmodels—which do not account for sociospatial factors—can systematically over- or under-estimate mortalityrate—while spatial models manage to reduce the errorof predicting mortality rate. We find varying degrees ofinference error as a factor of the marginal function of thedesign tensor. Different features can introduce differentbiases to the models indicating different levels of dispar-ities in predicting mortality risks. We recommend usinga non-local approach to account for all of these factors.Our findings are limited for four reasons. First andforemost, we only have access to census data for threeindividual years spanning a decade and a half. A betterexplanation of the nonlocalized effect of neighborhoodscould be achieved by testing our hypotheses on additionalyears, along with more socioeconomic features to enrichour design tensor. Second, our geospatial resolution is unfortunately nothigh enough to identify some of the dynamics of con-nected communities. Nonetheless, our methodology canbe implemented in a similar manner regardless of the spa-tial unit used for the experiment. Our spatial network ismainly based on the road network of Hong Kong, whichcould be extended to account for roads/bridges and pub-lic transport across any desired spatial unit.Third, our models are rather simple but explainableto demonstrate the role of relative spatial associationsom longevity. More sophisticated geospatial techniquescould help us understand the impact of socioeconomicfactors on mortality risks.Finally, we have only explored a distance-based weight-ing scheme for the connections across districts in the net-work. Population density could be included to enrich thesocioeconomic effect of neighboring regions on a givennode within the network (for example, theory of inter-vening opportunities [55]). Other attributes such as geo-graphic information associated with community healthservices could help us assess their value and reallocatethese centers to more optimized locations.1
ACKNOWLEDGMENTS
The authors are grateful for the computing resourcesprovided by the Vermont Advanced Computing Coreand financial support from the Massachusetts MutualLife Insurance Company and Google Open Source underthe Open-Source Complex Ecosystems And Networks(OCEAN) project. While they did not collaborate on the manuscript, we thank Hong Kong’s Census and Statis-tics Department for facilitating access to their mortalitydataset. The authors are grateful for useful conversa-tions with Adam Fox, Marc Maier, and Xiangdong Gu.We also thank Josh Minot, Michael Arnold, Anne MarieStupinski, Colin Van Oort, and many of our colleaguesat the Computational Story Lab for their discussions andfeedback on this project. [1] I. O. Wong, C. Schooling, B. J. Cowling, and G. M.Leung. Breast cancer incidence and mortality in a tran-sitioning Chinese population: Current and future trends.
British Journal of Cancer , 112(1):167–170, 2015.[2] P. Wu, A. M. Presanis, H. S. Bond, E. H. Lau, V. J.Fang, and B. J. Cowling. A joint analysis of influenza-associated hospitalizations and mortality in Hong Kong,1998–2013.
Scientific Reports , 7(1):929, 2017.[3] C.-M. Wong, S. Ma, A. J. Hedley, and T.-H. Lam. Effectof air pollution on daily mortality in Hong Kong.
Envi-ronmental Health Perspectives , 109(4):335–340, 2001.[4] H. Qiu, L. Tian, K.-f. Ho, V. C. Pun, X. Wang, andT. Ignatius. Air pollution and mortality: Effect modifi-cation by personal characteristics and specific cause ofdeath in a case-only study.
Environmental Pollution ,199:192–197, 2015.[5] T.-H. Lam, S.-Y. Ho, A. J. Hedley, K.-H. Mak, and G. M.Leung. Leisure time physical activity and mortality inHong Kong: Case-control study of all adult deaths in1998.
Annals of Epidemiology , 14(6):391–398, 2004.[6] M. D. Hayward, A. M. Pienta, and D. K. McLaughlin.Inequality in men’s mortality: The socioeconomic statusgradient and geographic context.
Journal of Health andSocial Behavior , pages 313–330, 1997.[7] A. Deaton and D. Lubotsky. Mortality, inequality andrace in American cities and states.
Social Science &Medicine , 56(6):1139–1153, 2003.[8] G. Rey, E. Jougla, A. Fouillet, and D. H´emon. Ecologicalassociation between a deprivation index and mortalityin France over the period 1997–2001: Variations withspatial scale, degree of urbanicity, age, gender and causeof death.
BMC Public Health , 9(1):33, 2009.[9] L. C. Messer, B. A. Laraia, J. S. Kaufman, J. Eyster,C. Holzman, J. Culhane, I. Elo, J. G. Burke, andP. Ocampo. The development of a standardized neigh-borhood deprivation index.
Journal of Urban Health ,83(6):1041–1062, 2006.[10] E. Puterman, J. Weiss, B. A. Hives, A. Gemmill,D. Karasek, W. B. Mendes, and D. H. Rehkopf. Pre-dicting mortality from 57 economic, behavioral, social,and psychological factors.
Proceedings of the NationalAcademy of Sciences , 2020.[11] C.-Q. Ou, A. J. Hedley, R. Y. Chung, T.-Q. Thach,Y.-K. Chau, K.-P. Chan, L. Yang, S.-Y. Ho, C.-M.Wong, and T.-H. Lam. Socioeconomic disparities in airpollution-associated mortality.
Environmental Research ,107(2):237–244, 2008.[12] R. Y. Chung, F. T. Lai, G. K. Chung, B. H. Yip, S. Y.Wong, and E. K. Yeoh. Socioeconomic disparity in mor-tality risks widened across generations during rapid eco-nomic development in Hong Kong: An age-period-cohort analysis from 1976 to 2010.
Annals of Epidemiology ,28(11):743–752, 2018.[13] K. Buckingham and P. R. Freeman. Sociodemographicand morbidity indicators of need in relation to the use ofcommunity health services: Observational study.
BMJ ,315(7114):994–996, 1997.[14] D. Kim, I. Kawachi, S. Vander Hoorn, and M. Ezzati.Is inequality at the heart of it? Cross-country associ-ations of income inequality with cardiovascular diseasesand risk factors.
Social Science & Medicine , 66(8):1719–1732, 2008.[15] Z. Chen and C. A. G. Crawford. The role of geograph-ic scale in testing the income inequality hypothesis asan explanation of health disparities.
Social Science &Medicine , 75(6):1022–1031, 2012.[16] J. X. Fan, M. Wen, and L. Kowaleski-Jones. Tract-andcounty-level income inequality and individual risk of obe-sity in the United States.
Social Science Research , 55:75–82, 2016.[17] T.-C. Yang, S. A. Matthews, F. Sun, and M. Armendariz.Modeling the importance of within-and between-countyeffects in an ecological study of the association betweensocial capital and mental distress.
Preventing ChronicDisease , 16:E75–E75, 2019.[18] E. E. Bjornstrom. An examination of the relation-ship between neighborhood income inequality, socialresources, and obesity in Los Angeles county.
AmericanJournal of Health Promotion , 26(2):109–115, 2011.[19] D. Kim, F. Wang, and C. Arcan. Peer reviewed: Geo-graphic association between income inequality and obesi-ty among adults in New York state.
Preventing ChronicDisease , 15, 2018.[20] I. Kawachi and B. P. Kennedy. The relationship of incomeinequality to mortality: Does the choice of indicator mat-ter?
Social Science & Medicine , 45(7):1121–1127, 1997.[21] J. M. Major, C. A. Doubeni, N. D. Freedman, Y. Park,M. Lian, A. R. Hollenbeck, A. Schatzkin, B. I. Graubard,and R. Sinha. Neighborhood socioeconomic deprivationand mortality: NIH-AARP diet and health study.
PLOSONE , 5(11), 2010.[22] S. F. Reardon and K. Bischoff. Income inequality andincome segregation.
American Journal of Sociology ,116(4):1092–1153, 2011.[23] T.-C. Yang and L. Jensen. Exploring the inequality-mortality relationship in the US with Bayesian spa-tial modeling.
Population Research and Policy Review ,34(3):437–460, 2015.[24] N. A. Ross, M. C. Wolfson, J. R. Dunn, J.-M. Berthelot,G. A. Kaplan, and J. W. Lynch. Relation between incomeinequality and mortality in Canada and in the UnitedStates: Cross sectional assessment using census data and vital statistics. BMJ , 320(7239):898–902, 2000.[25] A. S. Fotheringham, M. E. Charlton, and C. Brunsdon.Geographically weighted regression: A natural evolutionof the expansion method for spatial data analysis.
Envi-ronment and Planning A , 30(11):1905–1927, 1998.[26] A. S. Fotheringham, C. Brunsdon, and M. Charlton.
Geo-graphically weighted regression: The analysis of spatiallyvarying relationships . John Wiley & Sons, 2003.[27] C.-M. Wong, C.-Q. Ou, K.-P. Chan, Y.-K. Chau, T.-Q.Thach, L. Yang, R. Y.-N. Chung, G. N. Thomas, J. S. M.Peiris, T.-W. Wong, et al. The effects of air pollution onmortality in socially deprived urban areas in Hong Kong,China.
Environmental Health Perspectives , 116(9):1189–1194, 2008.[28] M. Branis and M. Linhartova. Association betweenunemployment, income, education level, population sizeand air pollution in Czech cities: Evidence for environ-mental inequality? A pilot national scale analysis.
Health& Place , 18(5):1110–1114, 2012.[29] F. Forastiere, M. Stafoggia, C. Tasco, S. Picciotto,N. Agabiti, G. Cesaroni, and C. A. Perucci. Socioeco-nomic status, particulate air pollution, and daily mor-tality: Differential exposure or differential susceptibility.
American Journal of Industrial Medicine , 50(3):208–216,2007.[30] C. M. Padilla, W. Kihal-Talantikite, V. M. Vieira,P. Rossello, G. Le Nir, D. Zmirou-Navier, and S. Deguen.Air quality and social deprivation in four Frenchmetropolitan areas—a localized spatio-temporal environ-mental inequality analysis.
Environmental Research ,134:315–324, 2014.[31] J. S. Cossman, R. E. Cossman, W. L. James, C. R. Camp-bell, T. C. Blanchard, and A. G. Cosby. Persistent clus-ters of mortality in the United States.
American Journalof Public Health , 97(12):2148–2150, 2007.[32] T.-Q. Thach, Q. Zheng, P.-C. Lai, P. P.-Y. Wong, P. Y.-K. Chau, H. J. Jahn, D. Plass, L. Katzschner, A. Krae-mer, and C.-M. Wong. Assessing spatial associationsbetween thermal stress and mortality in Hong Kong: Asmall-area ecological study.
Science of the Total Envi-ronment
Journal of Health Economics , 30(4):685–694,2011.[35] R. W. Zimmer and E. F. Toma. Peer effects in private andpublic schools across countries.
Journal of Policy Anal-ysis and Management: The Journal of the Associationfor Public Policy Analysis and Management , 19(1):75–92, 2000.[36] B. Sacerdote. Peer effects in education: How might theywork, how big are they and how much do we know thusfar? In
Handbook of the Economics of Education , vol-ume 3, pages 249–277. Elsevier, 2011.[37] E. Bodine-Baron, S. Nowak, R. Varadavas, and N. Sood.Conforming and non-conforming peer effects in vaccina-tion decisions. Technical report, National Bureau of Eco-nomic Research, 2013.[38] D. Albert, J. Chein, and L. Steinberg. The teenage brain:Peer influences on adolescent decision making.
Current Directions in Psychological Science , 22(2):114–120, 2013.[39] T.-C. Yang, A. J. Noah, and C. Shoff. Exploring geo-graphic variation in US mortality rates using a spatialDurbin approach.
Population, Space and Place , 21(1):18–37, 2015.[40] D. Holtz, M. Zhao, S. G. Benzell, C. Y. Cao, M. A.Rahimian, J. Yang, J. N. L. Allen, A. Collis, A. V.Moehring, T. Sowrirajan, et al. Interdependence and thecost of uncoordinated responses to COVID–19. 2020.[41] T.-C. Yang, L. Jensen, and M. Haran. Social capitaland human mortality: Explaining the rural paradox withcounty-level mortality data.
Rural Sociology , 76(3):347–374, 2011.[42] H. Li, C. A. Calder, and N. Cressie. Beyond Moran’sI: Testing for spatial dependence based on the spatialautoregressive model.
Geographical Analysis , 39(4):357–375, 2007.[43] P. A. Moran. Notes on continuous stochastic phenomena.
Biometrika
International Journal of Geographical Information Sci-ence , 30(2):263–299, 2016.[46] B. Y. Chen, W. H. Lam, A. Sumalee, Q. Li, and Z.-C.Li. Vulnerability analysis for large-scale and congestedroad networks with demand uncertainty.
Transporta-tion Research Part A: Policy and Practice
Proceedings of the SeventeenthInternational Conference on Artificial Intelligence andStatistics , 2014.[51] M. D. Hoffman, D. M. Blei, C. Wang, and J. Paisley.Stochastic variational inference.
The Journal of MachineLearning Research , 14(1):1303–1347, 2013.[52] E. Bingham, J. P. Chen, M. Jankowiak, F. Obermey-er, N. Pradhan, T. Karaletsos, R. Singh, P. Szerlip,P. Horsfall, and N. D. Goodman. Pyro: Deep universalprobabilistic programming.
Journal of Machine LearningResearch , 2018.[53] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury,G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga,A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Rai-son, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang,J. Bai, and S. Chintala. Pytorch: An imperative style,high-performance deep learning library. In H. Wallach,H. Larochelle, A. Beygelzimer, F. d Alch´e-Buc, E. Fox,and R. Garnett, editors,
Advances in Neural InformationProcessing Systems 32 , pages 8024–8035. Curran Asso-ciates, Inc., 2019.[54] https://gitlab.com/compstorylab/asis.[55] S. A. Stouffer. Intervening opportunities: A theory relat- ing mobility and distance. American Sociological Review ,5(6):845–867, 1940.
Appendix A: Data variables
FIG. S1.
Cross-sectional attributes for calendar year 2016 by district including (A) mortality rates, (B) numberof life insurance policies, (C) unemployment rates, (D) number of homeless individuals, and (E) number of mobile residents.The dashed red lines indicate the average value for each of these variables in Hong Kong. The four attributes presented hereare not strongly correlated with mortality. • Mortality rates : Normalized death rates by districts derived from the registered deaths provided by the Censusand Statistics Department of Hong Kong [47]. • MMA life insurance coverage : Annual proportion of insurance policies issued in each district and normalizedby population size. • Median income : Estimation of median income by the Census and Statistics Department of Hong Kong foreach district. • Median rent to income ratio : The percentage of monthly household income spent on monthly householdrent in inflation-adjusted terms. • Median monthly household income : Similar to median income, but this is computed for households ratherthan individuals. • Unemployment rates : The proportion of people without an active job normalized by population size. This isa proxy variable derived from the gap between the available labour force and economically active individuals—aged over 15 and has been working for at least a week in the reference year [48]. • Unemployment rates across households : Similar to unemployment rates but derived at the householdslevel rather than individuals. • Unemployment rates among minorities : Similar to unemployment rates but computed for minorities only.4
FIG. S2.
Households statistics by year. • Proportion of homeless people : Another proxy variable to estimate the proportion of people not living inregistered domestic households. • Proportion of homeless mobile residents : Hong Kong’s permanent residents who had stayed in a givendistrict for “at least 1 month but less than 3 months” during the the 6 months before or after the referencemoment [48]. • Proportion of single parents : “Mothers or fathers who are never married, widowed, divorced or separated,with child(ren) aged under 18 living with them in the same household” [48]. • Proportion of households with kids in school : The proportion of households with young adults attending5
FIG. S3.
Records by TPU. (A) shows the fraction of records reported by each TPU in the mortality dataset. (B) showsan exponential declines in the number of records by TPU whereby the horizontal axis shows TPUs ranked by the fraction ofdeath records reported on the vertical axis. Similarly, (C) shows the annual average of deaths for each decade by TPU. full-time courses in educational institutions in Hong Kong. • Proportion of households with children : The proportion of households with young kids aged under 15. • Proportion of households with elderly : The proportion of households with seniors aged 65 or above.Our initial analysis shows variations in mortality rates across districts. In Fig. S1, we show a cross-sectional analysisof 4 different socioeconomic indices with respect to our response variable (mortality rate) for 2016. We normalizethese values to show the number of people in each bin per 1,000 individuals for each variable. The dashed red linesindicate the average value. Districts with a lower number of mobile residents have higher rates of morality such asWong Tai Sin, Kwun Tong, Kwai Tsing, and Sham Shui Po. We note high mortality rates in the Yau Tsim Mong andSouthern districts, while having a higher number of homeless people. By contrast, many districts with a lower numberof homeless people have higher rates of mortality such as Wong Tai Sin, Eastern, and Kwun Tong, and Kowloon City.6
Appendix B: Tertiary Planning Unit (TPU)
A Tertiary Planning Unit (TPU) [33]—similar to a census-block in the US—is a geospatial reference system usedby the Hong Kong’s Census and Statistics Department to report population census statistics. To take a closer lookat the geospatial tags we have in our dataset, we show the fraction of death records found in our data set by eachTPU (Fig. S3) and by districts (Fig. S4). Unfortunately, most death records were found in a small fraction of TPUs,which is understandable due to known data privacy concerns about the dataset as records in small TPUs may revealsensitive information about specific individuals there. In Fig. S5, we show the annual number of reported deathsby TPU. We note that death records are not reported frequently at the TPU level in the dataset. Missing recordsare flagged with XXX as indicated in the data set dictionary provided by the Hong Kong’s Census and StatisticsDepartment.
FIG. S4.
Records by district. (A) shows the fraction of records reported by each district in the mortality dataset. (B) shows an exponential declines in the number of records by district whereby the horizontal axis shows districts ranked by thefraction of death records reported on the vertical axis. Similarly, (C) shows the annual average of deaths for each decade bydistrict.
To avoid any risk of identifying individuals in the dataset, we use districts as our main spatial unit trying our bestto stay away from the pitfalls of data privacy in census data. In fact, most studies have either filtered out small TPUsin their analyses [12, 27], or aggregated their records at the district level [32] to overcome this challenge in the dataset.7In Fig. S4, the main 18 districts of Hong Kong show very consistent data distributions for all variables in the dataset throughout the last few decades. We note that the last 5 entries in the heatmap are not necessarily districts perse, but they are only used by the census department to report deaths outside the borders of the main 18 districts.However, given access to a more granular geospatial resolution ( i.e.,
TPU) our methodology can be implemented ina similar manner to identify and explore some of these sociotechnical dynamics.
Appendix C: Model parameter estimates
We show the distributions of β ’s in Fig. S6—the parameter for the baseline model. For each panel, we show thekernel density estimation of β as a function of each variable in the design tensor. Distributions that are significantlyabove 0 are colored in orange while distributions significantly below 0 are colored in blue as measured by the 80% CI.Similar demonstrations of spatial and weighted spatial models can be found in Fig. S7, and Fig. S8 respectively. Inaddition to the distributions of β ’s, we also show the kernel density estimation of γ ’s—the hyperparameter used forthe spatial competent in each model in Fig. S9. Appendix D: Kernel density estimation of normalized mortality rates
In Fig. S10, we show the kernel density estimation of normalized mortality rates by a model trained on the defaultset of base features for the calendar year 2016. The red dashed line indicates the true normalized mortality rate foreach district in 2016. The KDE of each model is plotted by district: Baseline model (grey), Spatial model (blue),and Weighted Spatial model (orange). For additional insights into the models, Fig. S11 and Fig. S12 show differentvariants of these models trained on wealth-related and all features respectively.
Appendix E: Supplementary Figures FIG. S5.
Annual number of deaths by TPU.
Death records are not reported frequently at the TPU level in our mortalitydataset. We note that the code (XXX) is used to indicate missing labels. FIG. S6.
Distributions of β for the baseline model. We show the kernel density estimation of β coefficient for eachfeature in the design tensor. Distributions that are significantly above the mean are colored in orange while distributionssignificantly below the mean are colored in blue as measured by the 80% CI. FIG. S7.
Distributions of β for the spatial model. We show the kernel density estimation of β coefficient for each featurein the design tensor. Distributions that are significantly above the mean are colored in orange while distributions significantlybelow the mean are colored in blue as measured by the 80% CI. FIG. S8.
Distributions of β for the weighted spatial model. We show the kernel density estimation of β coefficient foreach feature in the design tensor. Distributions that are significantly above the mean are colored in orange while distributionssignificantly below the mean are colored in blue as measured by the 80% CI. FIG. S9.
Distributions of γ for the weighted spatial model. We show the kernel density estimation of γ coefficient foreach feature in the design tensor. Distributions that are significantly above the mean are colored in orange while distributionssignificantly below the mean are colored in blue as measured by the 80% CI. FIG. S10.
KDE of normalized mortality rates trained on the default base features for calendar year 2016 . Weshow the KDE the Baseline model (grey), Spatial model (blue), and Weighted Spatial model (orange). The red dashed lineindicates the true normalized mortality rate for each district in 2016. FIG. S11.
KDE of normalized mortality rates by model trained on wealth-related in addition to the set of basefeatures for calendar year 2016 . We show the KDE the Baseline model (grey), Spatial model (blue), and Weighted Spatialmodel (orange). The red dashed line indicates the true normalized mortality rate for each district in 2016. FIG. S12.
KDE of normalized mortality rates by model trained on all features for calendar year 2016 . We showthe KDE the Baseline model (grey), Spatial model (blue), and Weighted Spatial model (orange). The red dashed line indicatesthe true normalized mortality rate for each district in 2016. FIG. S13.
Systematic error distributions for models trained on the wealth-related set of features in 2016 . Modelswith significance systematic overestimation are colored in orange while models with significance systematic underestimationare colored in blue as measured by the 80% CI. FIG. S14.
Systematic error distributions for models trained on all features in 2016 . Models with significancesystematic overestimation are colored in orange while models with significance systematic underestimation are colored in blueas measured by the 80% CI. FIG. S15.
Ego networks of each district demonstrating sociospatial factors of mortality for the 2016 baselinemodel . We display ego networks of each district in Hong Kong and its nearest neighbors in the road and bridge network. Thecentral node (highlighted with a grey box) of each network corresponds to the labelled district. Neighbors are not arrangedaround the ego district geographically. Node color corresponds to normalized mortality rate and edge color corresponds tosigned prediction error for the 2016 WSP model. These ego networks encode a qualitative measure of the sociospatial factorsin mortality modeling.FIG. S16.