Interplay between intra-urban population density and mobility in determining the spread of epidemics
Surendra Hazarie, David Soriano-Paños, Alex Arenas, Jesús Gómez-Gardeñes, Gourab Ghoshal
IInterplay between intra-urban population density and mobility indetermining the spread of epidemics.
Surendra Hazarie, ∗ David Soriano-Pa˜nos, ∗ Alex Arenas, Jes´us G´omez-Garde˜nes,
2, 4, † and Gourab Ghoshal
1, 5, ‡ Department of Physics & Astronomy,University of Rochester, Rochester, NY, 14627, USA. GOTHAM Lab – Department of Condensed Matter Physics andInstitute for Biocomputation and Physics of Complex Systems (BIFI),University of Zaragoza, E-50009 Zaragoza, Spain. Departament d’Enginyeria Inform`atica i Matem`atiques,Universitat Rovira i Virgili, E-43007 Tarragona, Spain. Center for Computational Social Science (CCSS),Kobe University, Kobe 657-8501, Japan. Department of Computer Science, University of Rochester, Rochester, NY, 14627, USA.
Abstract
In this work, we address the connection between population density centers in urban areas,and the nature of human flows between such centers, in shaping the vulnerability to the onset ofcontagious diseases. A study of 163 cities, chosen from four different continents reveals a universaltrend, whereby the risk induced by human mobility increases in those cities where mobility flows arepredominantly between high population density centers. We apply our formalism to the spread ofSARS-COV-2 in the United States, providing a plausible explanation for the observed heterogeneityin the spreading process across cities. Armed with this insight, we propose realistic mitigationstrategies (less severe than lockdowns), based on modifying the mobility in cities. Our resultssuggest that an optimal control strategy involves an asymmetric policy that restricts flows enteringthe most vulnerable areas but allowing residents to continue their usual mobility patterns. ∗ These two authors contributed equally to this work. † [email protected] ‡ [email protected] a r X i v : . [ phy s i c s . s o c - ph ] F e b . Introduction During the last century, humankind has rapidly evolved into an interconnected society drivenby the existence of a vast mobility network connecting different areas around the globe. Inparticular, the striking growth experienced by the international mobility network [1] hashelped to bridge socio-cultural [2–4] and economic gaps [5]. Accompanying this is the phe-nomenon of urbanization, whereby a majority of the world’s population reside in denselypacked urban centers, with the trend only accelerating [6, 7], given the socioeconomic ad-vantages that cities afford [8, 9]. Allayed against these benefits, the increase in mobility hasremoved the main bottleneck limiting the spatial diffusion of confined epidemic outbreaks.Once the disease spreads to different regions, it takes advantage of the high populationdensity and infrastructure networks in cities [10] to rapidly spread through the local popu-lation. As a consequence, over the past few years, several contagious disease outbreaks haveemerged, notable among them H1N1 in 2009 [11], Ebola in 2014 [12], ZIKV in 2015 [13]and of course more recently SARS-COV-2 [14, 15]. Indeed, the frequency with which thesepandemics occur is troublingly increasing [16].Despite the different nature of the pathogens, their spreading, both globally and locally,is primarily explained by the sequential combination of case importation from contagionsources, followed by local community transmission converting initially unaffected regionsinto new endemic areas. Different flavors of mobility play a role in each process: on the onehand, long-haul flights [17] are usually the drivers facilitating the entry of pathogens into agiven country, to the extent that the airport mobility network has proved to be a reliableproxy to determine pathogen arrival times [18], international infection routes [19], and tocomplement phylogeographic inference of emerging pathogens such as SARS-CoV-2 [20]. Onthe other hand, once index cases are found within a given region, a complex combination ofthe local mobility [21] and socio-economic features of the population [22, 23] determines thespeed of epidemic spread and the extent of its outbreak.Quantifying the impact of local mobility on the global diffusion of a pandemic constitutesa challenging task. In this sense, several examples addressing the impact of daily recur-2ent mobility patterns on the spread of contagious diseases can be found in the literature[24–31]. The majority of these, however, are theoretical frameworks analyzing the featuresof synthetic mobility networks, and the influence of total volume of travelers on the courseof the epidemic. Nonetheless, recent advances made in data-gathering techniques allow forobtaining accurate representations of daily urban rhythms constructed from mobile phonetraces [32], geolocalized tweets [33], Location Based Social Networks [34], or extensive cen-sus surveys. These data sets enable the extension of the theoretical machinery to addressreal epidemic scenarios. Indeed, recurrent mobility patterns have been already useful foridentifying the most exposed areas in some epidemic scenarios [35] as well as reproducingthe infection routes of H1N1 influenza [36], Malaria [37], and more recently SARS-COV-2[38–41].While much attention has been spent on reconstructing past infections, or epidemic forecast-ing in the case of extant pandemics, an important question that immediately arises, is whatmakes regions—in particular urban agglomerations where most people reside—vulnerable tothe spread of pathogens in the first place? While factors such as population density, levelsof healthcare, quality of infrastructure and socioeconomic disparities play a major role [42],vulnerability to spread is a complex interplay between these features that is, in general,difficult to disentangle. For instance, the role of population density is an open question withevidence both for and against its influence on epidemic spreading [43, 44]. Indeed, merelythe density of contacts, while relevant at a neighborhood level, is not enough to explain themechanisms of spread; one would also need to consider the mobility network of flows thatgovern the exchange of people between the regions. In such a setting, the spatial distribu-tion of the population densities and the strength of interaction between the regions becomeespecially relevant.In other words it is reasonable to assume that the morphology of the city in terms of howits residents are distributed and how they navigate the city plays a crucial role in theirsusceptibility to pandemics. Indeed, recent studies have shown that the spatial patterns ofhow residents utilize transportation infrastructure is a strong indicator of that regions’ levelsof social inclusion, quality of infrastructure and wealth creation [45]. In [46] the authors3ropose a measure of a city’s dynamical organization based on mobility hotspots [47] toclassify them in a spectrum between compact-hierarchical and sprawled layouts. The extentto which cities are compact or sprawled serve as a low-dimensional proxy for various urbanindicators related to quality of life, health and pollution.In this work we connect the dots between the morphology of human activity in cities, in termsof its associated mobility flows and the distribution of resident populations, and its effecton shaping the transmission of infectious diseases and their associated epidemic outbreaks.We collect data from 163 cities across four continents, on their population density at thezip-code level, and intra-urban mobility flows for the first half of 2020. Using this we extractpopulation density hotspots (i.e. those areas with the highest concentration of residents)and measure the extent to which flows between hotspots dominate the total flows in thecity. To capture epidemic spreading, we generalize a MIR (Movement-Interaction-Return)epidemic model [27] that captures the interplay between recurrent mobility flows and thedistributions of resident populations. We derive the epidemic threshold, representing theminimum infectivity per contact required to instigate an epidemic outbreak, and connect itwith the distribution of flows among population density hotspots. In particular we show that,despite their ostensible differences in terms of spatial layout, evolutionary history, or levels ofinfrastructure, all considered cities lie on a universal curve capturing an inverse relationshipbetween the epidemic threshold and the extent to which mobility flows are localized betweenhotspots.The results suggests an increased susceptibility to epidemics as a function of flows beingconcentrated between high population centers. As a proof-of-concept, we analyze the cur-rent SARS-COV-2 pandemic by quantifying the epidemic growth from the initial infectioncurves as an empirical proxy for city vulnerability and plotting it against our calculatedepidemic threshold. The empirical trends match our theoretical formalism where cities withmobility concentrated primarily between hotspots are more vulnerable as compared to thosewith more egalitarian mobility distributions. Based on this observation, we propose a re-alistic mitigation policy that, being much less severe than draconian lockdowns, lowers thesusceptibility of cities that lie in the vulnerable spectrum.4
I. ResultsA. Data
The population density for cities at zip-code resolution was collected from national censusbureaus [48–50] and from high resolution population density estimates recently published byFacebook [51]. Each of these cities correspond to the largest in their respective countriesin terms of population size. From this data, we extract H k population density hotspotsfor each city k (varying from city-to-city) by applying a non-parametric method based onthe derivative of the Lorenz curve [46, 47] (for details of the calculation see SupplementarySection S1).The mobility flows within each city are sourced from the Google SARS-COV-2 AggregatedMobility Research Dataset, and contain anonymized flows mobility flows aggregated overusers who have turned on the Location History setting, which is off by default. The flowsare between cells of approximately 5 km for the period ranging November 3rd 2019 toFebruary 29th 2020. For the purposes of our analysis we only consider the period before anymobility mitigation measures were initiated as a response to the SARS-COV-2 pandemic(see Supplementary Section S2 for further details). The flows are encoded in a matrix T whose elements T ij correspond to the population out-flows from i to j and whose diagonalelements correspond to self-flows. For each city k , given H k hotspots, we calculate thehotspot concentration, κ k , defined as the fraction of total flows in the city that occur onlybetween hotspots thus, κ k = (cid:80) i,j(cid:15)H k T ij (cid:80) i,j T ij . (1)This metric lies in the range 0 ≤ κ k ≤
1, with the limiting cases corresponding to flowsexclusively between hotspots or only between hotspot and non-hotspot areas. The list ofcities for each country, the administrative unit, and the hotspot concentration is shown inTab. S1. The results show a wide spectrum of values (0 . ≤ κ k ≤ .
79) both within andbetween countries, indicating significant variability in cities in terms of the morphology of5
IG. 1.
Spatial representation of the urban mobility network for selected cities (upperpanel) Autralia, (lower panel) United States. Areas marked blue correspond to population densityhotspots. The cities are organized in descending order according to the extent to which mobilityflows are concentrated between areas of high population density, κ (Eq.(1)). Line color encodes thenumber of inhabitants following each route normalized by the highest flow observed within eachcity. human flows between population centers.In Fig. 1, we plot the spatial layout of the hotspots and the mobility network for six repre-sentative cities in Australia (upper panel) and the United States (lower panel). The citiesare organized in descending order of κ , and it is apparent that in those cities with high κ ,flows are mainly confined around hotspots (marked in red), whereas they are increasinglymore distributed with decreasing κ . An additional feature is that hotspots are more spatiallyconcentrated in those cities with lower κ and dispersed in those with higher κ , indicatinga more heterogeneous distribution of population density in the former as compared to thelatter. 6 . Model To characterize a city’s vulnerability to disease spread, we calculate the epidemic thresholdby generalizing the formalism introduced in [27]. In what is to follow, we incorporate a Sus-ceptible (S), infectious (I) or recovered (R) dynamics, however, we note that the formalismis easily generalizable to more elaborate compartmental schemes. An infectious individualtransmits the pathogen to healthy counterparts via direct interaction at a rate λ . In turn,infectious individuals enter the compartment R at a rate µ , which typically encodes theinverse of the expected contagious period. The mixing among healthy and infectious indi-viduals is governed by the spatial distribution of the population and their mobility patterns.The different zip-codes of the city are represented as patches i , which are initially populatedby n i residents. The activity of the residents is considered on a daily basis and split intothree stages: Movement, Interaction and Return. Residents decide to either move to a dif-ferent patch with probability p , or remain with probability 1 − p . If the former, the choiceof location is proportional to the elements of the origin destination matrix T . After allmovements (or lack thereof) have been completed, interactions occur within patches accord-ing to a mean-field assumption where every individual makes the same number of contactsproportional to the population density via a function f i ; the final step involves return to theplace of residence. The same process is then repeated for the next time-step (day).Given N patches in a city, the dynamics is completely specified by 2 × N coupled discreteequations governing the temporal evolution of the fraction of infected and recovered indi-viduals residing in each patch. Namely, the fraction of infected, ρ Ii ( t + 1), and recoveredindividuals, ρ Ri ( t + 1), associated to patch i at time t + 1, reads: ρ Ii ( t + 1) = (1 − µ ) ρ Ii ( t ) + (1 − ρ Ii ( t ) − ρ Ri ( t ))Π i ( t ) ,ρ Ri ( t + 1) = ρ Ri ( t ) + µρ Ii ( t ) , (2)where Π i ( t ) represents the probability of a susceptible individual resident in i to contract thedisease at time t . Assuming that the process reaches a steady-state and that the size of the7utbreak is small in comparison to the overall population, after a sequence of manipulations(Supplementary Section S3, Eqns. S4-S15), it can be shown that the epidemic threshold isof the form, λ c ( p ) = µ/ Λ max ( M ), where Λ max is the spectral radius of a mixing matrix M ,taking into account the mobility flows, the degree of mobility p , the effective population in agiven patch, and the number of contacts as a function of population density in that patch. If p = 0 then the threshold would correspond to a static population that never moves, whereasif p = 1 then this accounts for a fully active population. Based on this observation one candefine a normalized epidemic threshold ˜ λ c = λ c ( p = 1) /λ c ( p = 0) to focus only on mobilityeffects while removing the influence of the population density. C. Connecting the epidemic threshold to hotspot flows
Given the quantitative description of human flows composing the mobility backbone of eachcity, we now focus on determining the effect that their morphology has on the epidemicthreshold. Given the extensive list of considered cities, and the attendant variation in popu-lation density, in order for a fair characterization, it is important to remove this component,and therefore the relevant parameter here is the normalized threshold ˜ λ c . In Fig. 2 A weplot ˜ λ c as a function of κ for all the cities in our dataset, represented as filled circles coloredaccording to their respective countries. We find a universal trend, whereby all cities fall intoa curve marking an inverse relationship between the vulnerability of the cities and the extentto which flows are concentrated between hotspots, i.e ˜ λ c ∼ κ β with β ≈ − .
25. In partic-ular, cities where there is a more egalitarian distribution of flows (lower κ ), the epidemicthreshold is higher for a mobile population as compared to a static population, indicatingthat movement between regions lowers the risk of an epidemic outbreak. Conversely, in thosecities where the population moves primarily between hotspots, there is little-to-no differencein risk in terms of whether residents stay in their patches, or whether they move to differentones. In Fig. S1, we show the equivalent of Fig. 2 A but now split by country finding thesame trend attesting to the robustness of the inverse dependence. (The fits to β for eachcountry are shown in Tab. S1). As a comparison to other population-related measures, in8 . . . . . . . . . ˜ c United StatesItalyAustraliaSouth Africa c ˜ M O D c Beneficial Neutral Detrimental020406080100120 C o un t AB C
FIG. 2.
Connecting morphology to vulnerability A
Normalized epidemic threshold ˜ λ c versuspopulation density hotspot concentration κ for cities in each of the chosen 4 countries (color code).The solid line shows the fit to a power-law function given by ˜ λ c = Aκ β , with A = 1 . ± . β = − . ± .
03. The shadowed region covers the 95% confidence interval. B Normalizedthreshold after removing all the flows connecting hotspots ˜ λ MODc as a function of the normalizedepidemic threshold without intervention ˜ λ c . The green (red) areas contains those cities for whichremoving flows among hotspots is beneficial (detrimental). C Division of the cities according tothe result: beneficial, neutral and detrimental as described in the text.
Fig. S2, we plot the Spearman correlation matrix for κ , the average population density, theLloyd mean crowding [52] and ˜ λ c for the 50 American cities. The figure suggests that theconcentration of flows between areas of high population density, κ , is a much better predictorof the vulnerability of cities than the other two measures.9o further characterize the relation existing between cities’ vulnerability and the concentra-tion of mobility between density hotspots, we next analyze the impact of reshuffling the flowsat a local scale for each of the analyzed cities. In particular, we preserve the total amountof flows, by removing all links connecting hotspots (thus setting κ = 0) and redistribut-ing them evenly across non-hotspot locations. We then recompute the resulting normalizethreshold ˜ λ MODc and in Fig. 2 B plot it as a function of the threshold corresponding to theunperturbed network ˜ λ c . As the figure indicates for the vast majority of cities (irrespectiveof their original value of κ ), the effect of switching off the flows between hotspots leadsto an increase in the epidemic threshold, in turn lowering their vulnerability to epidemicspread. In particular there appears to be three categories of cities: beneficial , those wherethe threshold is increased greater than 10%; neutral , those where the threshold is increasedby less than 10%; and finally detrimental , those where the threshold is instead lowered. Thenumber of cities belonging to each category is plotted as a bar-chart in Fig. 2 C , indicatingthat around 75% of cities experience a lowering of their vulnerability, 15% remain neutral,and the remaining 10% experience an increased susceptibility (interestingly this category isdominated by American cities). For the small number of cities, where we find this counter-intuitive effect of lowered threshold, it is likely that there are other more complex featuresat play not considered in this analysis.Nevertheless, these results provide overwhelming evidence that, in most cases, the con-centration of human mobility between densely populated areas is a feature that enhancesdisease spreading and makes such cities vulnerable to epidemics. Moreover, the beneficialeffect caused by the reorientation of intra-hotspot flows towards less densely populated areasseems to be rooted in a homogenization of the distribution of the underlying density, which,as mentioned in [28, 31], enforces infected individuals to stay away from the contagion focus,thus reducing their infection power. In turn, this homogenizing flow structure appears natu-rally in cities with low κ and, as suggested by the empirical trends in Fig. 2 A , characterizesthe most resilient cities. Therefore, a lower κ translates into a greater mix of populations,between high and low population density centers, where they can actually take advantage ofmobility between city sub-regions to prevent outbreaks.10 . Application to real pandemic settings The formalism proposed here can readily be applied to assess the exposure of cities to actualoutbreaks. To illustrate this, we next focus on the spread of SARS-COV-2 in the 50 mostpopulated Core-Based-Statistical-Areas (CBSA) in the United States, chosen due to theappropriate spatial resolution in terms of infection data. Note that, although, thus far, wehave focused on the SIR model, the following will illustrate the generality of the results,in the context of the spread of SARS-CoV-2, that has been recently analyzed with moreelaborate compartmental models [38, 52–55].As a proxy for a city’s vulnerability to epidemic spread, we make use of the number ofconfirmed infected cases at the county-level collected from the New York Times [56] andUSAFacts [57]. Given the inherent noise due to reporting artifacts, and assuming an ex-ponential growth, I k ( t ) ∼ exp( b k t ) during early onset, we apply a smoothing procedure toextract the growth-rate b k of the number of infected cases, and use that as proxy for a city k ’s susceptibility to disease spread. To remove any effects due to non-pharmaceutical inter-ventions and behavioral changes in the population, we focus on the period before mitigationmeasures. The full details of the procedure are shown in Supplementary Section S5, and thetemporal infection plots for each city along with the fits are shown in Fig. S3.In order to properly connect with the growth rate, we need to reintroduce the effect of thepopulation distribution and, therefore, the relevant variable is the unnormalized threshold λ c ( p = 1). Note that, by choosing p = 1, we force all the inhabitants within a city tofollow the flow matrix T , but not all of them leave their residential area due to existence ofself-loops in this matrix. Indeed, according to the avalaible data for the cities analyzed here,around 36% of the population remains on average inside their residential administrative unit.In Fig. 3 A we plot the empirically extracted growth rate b as a function of the epidemicthreshold, λ c , finding once again an inverse trend, confirming the role of the threshold as aproxy for vulnerability. Those cities which experienced a faster epidemic growth during theearly onset of the pandemic indeed had a lower threshold according to our formalism.11 .
55 0 .
60 0 .
65 0 .
70 0 . λ c . . . . . . . . C i t yv u l n e r a b ili t y b New York BostonMiamiPhiladelphia r P = − . r S = − . .
55 0 .
60 0 .
65 0 .
70 0 . . . . . r P = − . r S = − . h d hot i κ . . . . . . . . . C i t yv u l n e r a b ili t y b New YorkBostonMiami Philadelphia r P = 0 . r S = 0 . . . . . r P = 0 . r S = 0 . FIG. 3.
Validation of the model in US cities A
Epidemic growth rate b as a function of theepidemic threshold λ c ( p ) (Eq. S16). In inset we show the same plot after removing NYC. B Epi-demic growth rate b as a function of the average population density within hotspots (cid:104) d hot (cid:105) and thehotspot concentration κ . The exponent 0 .
43 is introduced to reflect the dependence obtained whenfitting the normalized epidemic threshold to κ for the United States (Table S2). In both panels, r P and r S denote the Pearson and Spearman rank correlation coefficients among the representedquantities. Next we connect, the localization of flows to hotspots to the empirical vulnerability in eachof the cities. Since κ only takes into account flows between hotspots but does not accountfor their population density, the variable that captures the effective interaction betweenthe residents in hotspots areas should be a combination of both factors. Specifically, wehave chosen (cid:104) d hot (cid:105) κ . , where the first term corresponds to the average population densitywithin hotspots and the second term reflects the scaling obtained for the normalized epidemicthreshold in the individual case of CBSA from the United States (see Supplementary Section4 for further details). In Fig. 3 B we plot b as a function of this quantity finding a clearmonotonically increasing trend. Thus taken together, the results of Fig. 3 indicate thatthe empirical trends mirror our theoretical formalism, whereby cities that experience strongearly growth of the epidemic have a lower threshold, a phenomenon for which one of the maincausative mechanisms is that the movement in cities occurs primarily between hotspots.12 . Potential mitigation measures The observations thus far, immediately suggest the possibility of effective mitigation mea-sures that may shore up the robustness of vulnerable cities to the onset of epidemic spread.Given the lack of therapeutics or vaccines for SARS-COV-2, the prevailing strategy adoptedglobally has been to resort to non-pharmaceutical interventions of which a key ingredienthas been aggressive lock-downs. While ostensibly being very effective in mitigating an ac-tive epidemic, significant disruption to the socio-economic fabric is one of the unfortunateconsequences [58, 59]. Having demonstrated the key role played by the interactions be-tween population density hotspots, we next investigate some targeted interventions, or evenpreemptions, that are milder than completely restricting mobility city-wide and assess theirefficacy in reducing vulnerability. The strategy we pursue is to modify flows between differenttypes of locations in the city without the need to isolate individuals at home. The differentschemes are illustrated in the upper panel of Fig. 4. In the first intervention (InterventionI), an asymmetrical strategy involves restricting flows from non-hotspot (heretofore referredto as suburbs) towards hotspot areas and converting them to self-loops, while keeping allother flows the same. Intervention II corresponds to the reverse situation where flows fromhotspots to suburbs are converted to self-loops. Finally, in Intervention III, only movementbetween hotspots is restricted.For each of these scenarios we recompute the normalized epidemic threshold ˜ λ mc ( m ∈{ I,II,III } ) and plot it against the original threshold, ˜ λ c , in the bottom panels of Fig. 4.In panel A , we find an increase of the threshold across the board indicating that Interven-tion I is a rather effective strategy. Given that a potential epidemic has a high probabilityof being seeded and correspondingly spreading extensively in high population density ar-eas, preventing the residents in suburbs from visiting these locations protects them frombeing exposed to the disease. Conversely in panel B we see that restricting residents inhotspots to travel to suburbs has the opposite effect in further decreasing the threshold.This counterproductive effect emerges due to a phenomenon discussed in [28]. When thereis an asymmetry between the population density in cities, mobility from hotspots to suburbs13 ntervention I Intervention II Intervention III SuburbHotspot
A B C c ˜ I c c ˜ II c c ˜ III c . . . . . . FIG. 4.
Impact of mobility interventions on the epidemic threshold within cities inthe United States . The impact is quantified by representing the normalized threshold for eachintervention, ˜ λ Ic , ˜ λ IIc , ˜ λ IIIc (in panels
A,B,C respectively) as a function of the normalized threshold˜ λ c . In all panels, the dashed line denotes the boundary separating the cities for which the inter-vention is beneficial (above) or detrimental (below). The color of the points represent the hotspotconcentration κ for each CBSA. leads to an increase in the threshold due to the dilution of the effective population in thehotspots thus reducing the number of contacts, as well as diverting potentially infectiousindividuals to to lower density regions where their impact is mitigated. Removing this routemakes the situation significantly worse. Finally, panel C reveals that limiting the mobilitybetween hotspots has a mostly neutral effect, although the trends are noisier given thatits effectiveness is closely related to the underlying population density distribution insidehotspots. The results suggest that, among the scenarios outlined, the most beneficial policyis to restrict residents in suburbs from visiting hotspots, while at the same time allowingresidents in hotspots to continue with their regular mobility behavior. Note, that severalother combinations are possible, for instance a combination of interventions I & III which islikely to have even more of a beneficial effect.14 II. Discussion
Similar to how a virus enters the human body and replicates from cell to cell, the spreadof pathogens in susceptible populations is influenced by the interaction between its hosts.Thus, it is the social interactions, mediated by behavioral and mixing patterns, that shapesthe spread of disease in human populations. Among these aspects, human mobility is akey factor underlying the unfolding patterns of epidemic outbreaks. Understanding howhuman mobility shapes the spatiotemporal unfolding of contagious diseases is essential forthe design of efficient containment policies to ameliorate their impact. In this paper, weinvestigate the interplay between the population density and the spatial distribution of flowsin urban areas, and its impact on determining exposure to epidemic outbreaks. We reporta clear trend worldwide: the existence of a high volume of individuals commuting amongthe population density centers of a given city makes it more vulnerable to the spread ofepidemics. The extent of a city’s vulnerability as determined from our formalism, allowsus to shed light on a real epidemic scenario: the spread of SARS-COV-2 across UnitedStates. In particular, the epidemic threshold determined by the population density profilesand urban mobility patterns provides one of the potential causative mechanisms behind thedifferent levels of infection observed across cities in the United States.As an application to potential epidemic waves, our indicator allows for identifying those citiesthat are likely to become epidemic centers once the first imported cases arrive there. This isof importance, given that it can guide authorities to identify places where timely containmentpolicies can be locally implemented to avoid large outbreaks caused by massive communitytransmission. It is precisely the lack of anticipation to the SARS-COV-2 pandemic, that hasled countries to enforce aggressive containment measures aimed at ameliorating the impactof the disease. The predominant strategy has been the implementation of lockdown policies,forcing a large fraction of the population to stay isolated at home, thus reducing consid-erably their number of interactions. While there is consensus on the effectiveness of theseinterventions to mitigate an ongoing outbreak, the collateral socio-economic damage causedby lockdowns requires a change of direction towards less-aggressive containment measures.15n the case of SARS-COV-2, the strict individual isolation characterizing the first interven-tions have given rise to more relaxed lockdown scenarios combined with efficient Test-Trackand Isolate policies [60–62]. Along these lines, we present different realistic scenarios basedon modifying mobility habits to actively avoid the emergence of large areas of contagion.Our analysis suggest that a potentially effective policy involves an asymmetric closure ofneuralgic centers of the cities, restricting movement to population density hotspots from res-idents of other areas, while allowing those living in hotspots to commute, in order to dilutethe number of contacts in the most vulnerable areas. Teleworking and effective distributionof key services in a city are practical manifestations of such interventions.An advantage of our formalism is its relative simplicity, paving the way to extend the resultsto more general scenarios. For example, these results are built upon the assumption thatpopulation density centers are much more vulnerable to contagious diseases than scarcelydense areas. While this is a logical assumption, the beneficial effect of mobility from hotspotsand suburbs could be reversed for diseases with large reproduction number R . In thisscenario, suburbs would also have the potential to develop large outbreaks so the existenceof infection routes across the city would lead to an acceleration of the propagation of theepidemic front. Finally, another obvious extension is to consider movement at differentgeographical scales [63, 64]. Accounting for movement between cities, for instance, couldprovide valuable information to coordinate joint efforts among different regions to modifyboth inter- and intra-urban flows in service of reducing the impact of a pandemic. Needlessto say, pandemics are complex processes involving a multitude of spatial and socioeconomicfactors. The results presented here may provide one of the building blocks for policy plannersin devising effective preventive and mitigation measures for future crises. Limitations
These results should be interpreted in light of several important limitations. First, the Googlemobility data is limited to smartphone users who have opted in to Google’s Location Historyfeature, which is off by default. These data may not be representative of the population aswhole, and furthermore their representativeness may vary by location. Importantly, theselimited data are only viewed through the lens of differential privacy algorithms, specifically16esigned to protect user anonymity and obscure fine detail. Moreover, comparisons acrossrather than within locations are only descriptive since these regions can differ in substantialways. [1] The world of air transport in 2018. Tech. Rep., International Civil Aviation Organization(2018).[2] Varghese, N.
Globalization of higher education and cross-border student mobility (Citeseer,2008).[3] Filatotchev, I., Liu, X., Lu, J. & Wright, M. Knowledge spillovers through human mobilityacross national borders: Evidence from zhongguancun science park in china.
Research Policy , 453–462 (2011).[4] Williams, A. M. & Bal´aˇz, V. What human capital, which migrants? returned skilled migrationto slovakia from the uk 1. International migration review , 439–468 (2005).[5] Boubtane, E., Dumont, J.-C. & Rault, C. Immigration and economic growth in the oecdcountries 1986–2006. Oxford Economic Papers , 340–360 (2016).[6] United Nations, Department of Economic and Social Affairs, Population Division (2019).World Urbanization Prospects 2018: Highlights (ST/ESA/SER.A/421). https://population.un.org/wup/Publications/Files/WUP2018-Highlights.pdf . Accessed: 2019-01-30.[7] Le N´echet, F. Urban spatial structure, daily mobility and energy consumption: a study of 34european cities. Cybergeo: European Journal of Geography (2012).[8] Bettencourt, L. M. The origins of scaling in cities.
Science , 1438–1441 (2013).[9] Pan, W., Ghoshal, G., Krumme, C., Cebrian, M. & Pentland, A. Urban characteristicsattributable to density-driven tie formation.
Nature Communications , 1961 (2013).[10] Kirkley, A., Barbosa, H., Barthelemy, M. & Ghoshal, G. From the betweenness centrality instreet networks to structural invariants in random planar graphs. Nature Communications ,2501 (2018).
11] Van Kerkhove, M. D. et al.
Epidemiologic and virologic assessment of the 2009 influenzaa (h1n1) pandemic on selected temperate countries in the southern hemisphere: Argentina,australia, chile, new zealand and south africa.
Influenza and other respiratory viruses ,e487–e498 (2011). URL https://pubmed.ncbi.nlm.nih.gov/21668677 .[12] Gomes, M. F. et al. Assessing the international spreading risk associated with the 2014 westafrican ebola outbreak.
PLoS currents (2014).[13] Zhang, Q. et al. Spread of zika virus in the americas.
Proceedings of the National Academy ofSciences , E4334–E4343 (2017).[14] Estrada, E. Covid-19 and sars-cov-2. modeling the present, looking at the future.
PhysicsReports , 1–51 (2020).[15] Organization, W. H. et al.
Coronavirus disease 2019 (covid-19): situation report, 72. Tech.Rep. (2020).[16] Saunders-Hastings, P. R. & Krewski, D. Reviewing the history of pandemic influenza: Under-standing patterns of emergence and transmission.
Pathogens , 66 (2016).[17] Bowen Jr, J. T. & Laroe, C. Airline networks and the international diffusion of severe acuterespiratory syndrome (sars). Geographical Journal , 130–144 (2006).[18] Brockmann, D. & Helbing, D. The Hidden Geometry of Complex, Network-Driven ContagionPhenomena.
Science , 1337 LP – 1342 (2013). URL http://science.sciencemag.org/content/342/6164/1337.abstract .[19] Colizza, V., Barrat, A., Barth´elemy, M. & Vespignani, A. Predictability and epidemic path-ways in global outbreaks of infectious diseases: the sars case study.
BMC Medicine , 34(2007). URL https://doi.org/10.1186/1741-7015-5-34 .[20] Lemey, P., Hong, S., Hill, V. & et al. Accommodating individual travel history and unsampleddiversity in bayesian phylogeographic inference of sars-cov-2. Nature Communications , 5110(2020).[21] Barbosa, H. et al. Human mobility: Models and applications.
Physics Reports , 1–74(2018). URL .[22] Bonaccorsi, G. et al.
Economic and social consequences of human mobility restrictions undercovid-19.
Proceedings of the National Academy of Sciences , 15530–15535 (2020).
23] Ahmed, F., Ahmed, N., Pissarides, C. & Stiglitz, J. Why inequality could spread covid-19.
The Lancet Public Health , e240 (2020).[24] Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. Epidemic processesin complex networks. Reviews of Modern Physics , 925–979 (2015). URL https://link.aps.org/doi/10.1103/RevModPhys.87.925 .[25] Belik, V., Geisel, T. & Brockmann, D. Natural Human Mobility Patterns and Spatial Spreadof Infectious Diseases. Physical Review X , 011001 (2011). URL https://link.aps.org/doi/10.1103/PhysRevX.1.011001 .[26] Balcan, D. & Vespignani, A. Phase transitions in contagion processes mediated by recurrentmobility patterns. Nature Physics , 581–586 (2011). URL https://doi.org/10.1038/nphys1944 .[27] G´omez-Garde˜nes, J., Soriano-Pa˜nos, D. & Arenas, A. Critical regimes driven by recurrentmobility patterns of reaction–diffusion processes in networks. Nature Physics , 391–395(2018). URL https://doi.org/10.1038/s41567-017-0022-7 .[28] Soriano-Pa˜nos, D., Lotero, L., Arenas, A. & G´omez-Garde˜nes, J. Spreading Processes inMultiplex Metapopulations Containing Different Mobility Networks. Physical Review X ,031039 (2018). URL https://link.aps.org/doi/10.1103/PhysRevX.8.031039 .[29] Moss, R., Naghizade, E., Tomko, M. & Geard, N. What can urban mobility data reveal aboutthe spatial distribution of infection in a single city? BMC Public Health , 656 (2019). URL https://doi.org/10.1186/s12889-019-6968-x .[30] Granell, C. & Mucha, P. J. Epidemic spreading in localized environments with recurrentmobility patterns. Physical Review E , 052302 (2018).[31] Soriano-Pa˜nos, D., Ghoshal, G., Arenas, A. & G´omez-Garde˜nes, J. Impact of temporal scalesand recurrent mobility patterns on the unfolding of epidemics. Journal of Statistical Mechanics:Theory and Experiment , 024006 (2020). URL https://iopscience.iop.org/article/10.1088/1742-5468/ab6a04 .[32] Tizzoni, M. et al.
On the use of human mobility proxies for modeling epidemics.
PLOSComputational Biology , 1–15 (2014). URL https://doi.org/10.1371/journal.pcbi.1003716 .
33] Mazzoli, M. et al.
Field theory for recurrent mobility.
Nature Communications , 3895(2019).[34] Barbosa, H., de Lima-Neto, F. B., Evsukoff, A. & Menezes, R. The effect of recency to humanmobility. EPJ Data Science , 21 (2015).[35] Soriano-Pa˜nos, D. et al. Vector-borne epidemics driven by human mobility.
Physical ReviewResearch , 013312 (2020).[36] Tizzoni, M. et al. Real-time numerical forecast of global epidemic spreading: case studyof 2009 a/h1n1pdm.
BMC Medicine , 165 (2012). URL https://doi.org/10.1186/1741-7015-10-165 .[37] Wesolowski, A. et al. Quantifying the impact of human mobility on malaria.
Science ,267–270 (2012).[38] Arenas, A. et al.
A mathematical model for the spatiotemporal epidemic spreading ofCOVID19. medRxiv .[39] Costa, G. S., Cota, W. & Ferreira, S. C. Metapopulation modeling of COVID-19 advancing intothe countryside: an analysis of mitigation strategies for Brazil. medRxiv .[40] Badr, H. S. et al.
Association between mobility patterns and covid-19 transmission in the usa:a mathematical modelling study.
The Lancet Infectious Diseases .[41] Bertuzzo, E. et al.
The geography of covid-19 spread in italy and implications for the relaxationof confinement measures.
Nature Communications , 4264 (2020).[42] Alirol, E., Getaz, L., Stoll, B., Chappuis, F. & Loutan, L. Urbanisation and infectious diseasesin a globalised world. The Lancet. Infectious diseases , 131–141 (2011). URL https://pubmed.ncbi.nlm.nih.gov/21272793 .[43] Kraemer, M. U. G. et al. Big city, small world: density, contact rates, and transmissionof dengue across pakistan.
Journal of the Royal Society, Interface , 20150468–20150468(2015). URL https://pubmed.ncbi.nlm.nih.gov/26468065 .[44] Li, R., Richmond, P. & Roehner, B. M. Effect of population density on epidemics. Physica A:Statistical Mechanics and its Applications , 713–724 (2018). URL https://EconPapers. epec.org/RePEc:eee:phsmap:v:510:y:2018:i:c:p:713-724 .[45] Lee, M., Barbosa, H., Youn, H., Holme, P. & Ghoshal, G. Morphology of travel routes andthe organization of cities. Nature Communications , 2229 (2017).[46] Bassolas, A. et al. Hierarchical organization of urban mobility and its connection withcity livability.
Nature Communications , 4817 (2019). URL https://doi.org/10.1038/s41467-019-12809-y .[47] Louail, T. et al. From mobile phone data to the spatial structure of cities.
Scientific Reports , 5276 (2014). URL https://doi.org/10.1038/srep05276 .[48] Free US Population Density And Unemployment Rate By Zip Code (ac-cessed 2020- 08-27). URL https://blog.splitwise.com/2014/01/06/free-us-population-density-and-unemployment-rate-by-zip-code/ .[49] Australian Bureau of Statistics. .[50] Statistics South Africa. .[51]
Facebook (accessed 2020- 08-27). URL https://data.humdata.org/organization/facebook .[52] Rader, B. et al.
Crowding and the shape of covid-19 epidemics.
Nature Medicine (2020).[53] Di Domenico, L., Pullano, G., Sabbatini, C. E., Bo¨elle, P.-Y. & Colizza, V. Impact of lockdownon covid-19 epidemic in ˆıle-de-france and possible exit strategies.
BMC Medicine , 240(2020).[54] Prem, K. et al. The effect of control strategies to reduce social mixing on outcomes of thecovid-19 epidemic in wuhan, china: a modelling study.
The Lancet Public Health (2020).[55] Gatto, M. et al.
Spread and dynamics of the covid-19 epidemic in italy: Effects of emergencycontainment measures.
Proceedings of the National Academy of Sciences , 10484–10491(2020).[56] https://github.com/nytimes/covid-19-data .[57] https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/ .[58] Gdp and employment flash estimates for the second quarter of 2020. Tech. Rep., Eurostat(2020).[59] Gross domestic product, second quarter 2020 (advance estimate) and annual update. Tech. ep., Bureau of Economic Analysis (2020).[60] Hellewell, J. et al. Feasibility of controlling covid-19 outbreaks by isolation of cases andcontacts.
The Lancet Global Health (2020).[61] Ferretti, L. et al.
Quantifying sars-cov-2 transmission suggests epidemic control with digitalcontact tracing.
Science (2020).[62] Salath´e, M. et al.
Covid-19 epidemic in switzerland: on the importance of testing, contacttracing and isolation.
Swiss medical weekly , w20225 (2020).[63] Watts, D. J., Muhamad, R., Medina, D. C. & Dodds, P. S. Multiscale, resurgent epidemics ina hierarchical metapopulation model.
Proceedings of the National Academy of Sciences ,11157–11162 (2005).[64] Balcan, D. et al.
Multiscale mobility networks and the spatial spreading of infectious diseases.
Proceedings of the National Academy of Sciences , 21484–21489 (2009).[65] Wilson, R. et al.
Differentially private sql with bounded user contribution (2020).[66] Https://research.google/pubs/pub48778/.[67] Hu, H., Nigmatulina, K. & Eckhoff, P. The scaling of contact rates with population densityfor the infectious disease models.
Mathematical Biosciences , 125 – 134 (2013). URL .[68] https://github.com/nytimes/covid-19-data.[69] https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/.[70]
Savitzky–Golay filter (accessed 2020- 08-17). URL https://en.wikipedia.org/wiki/Savitzky-Golay_filter .[71]
Smoothing in Python (accessed 2020- 08-17). URL https://plotly.com/python/smoothing/ . upporting Information S1. Hotspot classification
Hotspots are identified by setting a threshold on the population densities of cells withina city. The threshold for hotspots is assigned by applying a non-parametric method, theLouBar method [46, 47], based on the derivative of the Lorenz curve. The Lorenz curve isthe sorted cumulative distribution of population densities and is obtained by plotting, inascending order, the normalized cumulative number of nodes vs. the normalized cumulativepopulation density. The threshold is then obtained by taking the derivative of the Lorenzcurve at (1, 1) and extrapolating it to the point at which it intersects the x-axis. Weclassify hotspots in cities according to this LouBar method applied to population densities,in agreement with the reliance of the model on effective densities. A cell i is considered ahotspot of city k , H k if it satisfies: d i,k > d Lou k (S1)where d i is the population density of cell i in city k and d Lou k is the threshold determined byperforming the LouBar method on the population densities of all cells in city k . This allowsus to place emphasis on zones within cities that encourage the most relative interaction, asopposed to sharp biasing due to population magnitude. We then examine the hotspots flowconcentration in each city k , κ k , defined as fraction of total flows in the city system thatexist between these population density hotspots of city. Therefore, κ k is given by: κ k = (cid:80) i,j(cid:15)H k T ij (cid:80) i,j T ij , (S2)where T ij denotes the flow of individuals going from patch i to patch j according to themobility data. S-1
2. Data
TABLE S1: Hotspot flow concentration κ for the differentcities analyzed in the manuscript. The resolution column con-tain the two geographical divisions used inside each countryto construct the metapopulations. For example, “Zip codeswithin CBSA” in the case of the USA implies that each city(metapopulation) corresponds to a CBSA and the entitiescomposing each city (patches) correspond to zip codes. City Country Resolution κ Bunbury AUS SA2 within SA4 0.059Capital Region AUS SA2 within SA4 0.125Sydney AUS SA2 within SA4 0.162Darwin AUS SA2 within SA4 0.185Cairns AUS SA2 within SA4 0.186Richmond AUS SA2 within SA4 0.202Gold Coast AUS SA2 within SA4 0.212South Australia AUS SA2 within SA4 0.221Hume AUS SA2 within SA4 0.228Moreton Bay AUS SA2 within SA4 0.233Central Queensland AUS SA2 within SA4 0.236Wide Bay AUS SA2 within SA4 0.237Melbourne AUS SA2 within SA4 0.250Townsville AUS SA2 within SA4 0.250Australian Capital Territory AUS SA2 within SA4 0.253Launceston and North East AUS SA2 within SA4 0.303Sunshine Coast AUS SA2 within SA4 0.308
S-2 ity Country Resolution κ Central Coast AUS SA2 within SA4 0.310Illawarra AUS SA2 within SA4 0.325Perth AUS SA2 within SA4 0.330Logan AUS SA2 within SA4 0.349Brisbane AUS SA2 within SA4 0.350Hunter Valley exc Newcastle AUS SA2 within SA4 0.369Newcastle and Lake Macquarie AUS SA2 within SA4 0.381Western Australia AUS SA2 within SA4 0.386Ipswich AUS SA2 within SA4 0.396West and North West AUS SA2 within SA4 0.415Mackay AUS SA2 within SA4 0.415Latrobe AUS SA2 within SA4 0.419North West AUS SA2 within SA4 0.423Hobart AUS SA2 within SA4 0.438Central West AUS SA2 within SA4 0.496Adelaide AUS SA2 within SA4 0.520Bolzano ITA S2 cells within communes 0.210Sassari ITA S2 cells within communes 0.278Catania ITA S2 cells within communes 0.300Livorno ITA S2 cells within communes 0.306Taranto ITA S2 cells within communes 0.307Foggia ITA S2 cells within communes 0.318Perugia ITA S2 cells within communes 0.319Bari ITA S2 cells within communes 0.337Giugliano In Campania ITA S2 cells within communes 0.337Terni ITA S2 cells within communes 0.341Piacenza ITA S2 cells within communes 0.344
S-3 ity Country Resolution κ Genova ITA S2 cells within communes 0.376Ravenna ITA S2 cells within communes 0.381Ferrara ITA S2 cells within communes 0.392Ancona ITA S2 cells within communes 0.392Lecce ITA S2 cells within communes 0.393Cesena ITA S2 cells within communes 0.394Pesaro ITA S2 cells within communes 0.405Parma ITA S2 cells within communes 0.406Pescara ITA S2 cells within communes 0.434Roma ITA S2 cells within communes 0.435Venezia ITA S2 cells within communes 0.436Modena ITA S2 cells within communes 0.458Trieste ITA S2 cells within communes 0.466Brescia ITA S2 cells within communes 0.467Trento ITA S2 cells within communes 0.469Siracusa ITA S2 cells within communes 0.470Napoli ITA S2 cells within communes 0.471Palermo ITA S2 cells within communes 0.472Milano ITA S2 cells within communes 0.478Rimini ITA S2 cells within communes 0.479Prato ITA S2 cells within communes 0.489Verona ITA S2 cells within communes 0.494Salerno ITA S2 cells within communes 0.501Torino ITA S2 cells within communes 0.507Arezzo ITA S2 cells within communes 0.527Reggio Di Calabria ITA S2 cells within communes 0.534Latina ITA S2 cells within communes 0.543
S-4 ity Country Resolution κ Udine ITA S2 cells within communes 0.545Bologna ITA S2 cells within communes 0.567Vicenza ITA S2 cells within communes 0.568Andria ITA S2 cells within communes 0.575Novara ITA S2 cells within communes 0.582Padova ITA S2 cells within communes 0.586Bergamo ITA S2 cells within communes 0.598Firenze ITA S2 cells within communes 0.629Messina ITA S2 cells within communes 0.657Cagliari ITA S2 cells within communes 0.751Monza ITA S2 cells within communes 0.773Virginia Beach USA Zip codes within CBSA 0.033Washington USA Zip codes within CBSA 0.104Miami USA Zip codes within CBSA 0.109Columbus USA Zip codes within CBSA 0.140Seattle USA Zip codes within CBSA 0.140Atlanta USA Zip codes within CBSA 0.157Houston USA Zip codes within CBSA 0.170Pittsburgh USA Zip codes within CBSA 0.178Richmond USA Zip codes within CBSA 0.180Raleigh USA Zip codes within CBSA 0.182San Francisco USA Zip codes within CBSA 0.184Nashville USA Zip codes within CBSA 0.188Los Angeles USA Zip codes within CBSA 0.189Salt Lake City USA Zip codes within CBSA 0.204Cincinnati USA Zip codes within CBSA 0.205New York USA Zip codes within CBSA 0.213
S-5 ity Country Resolution κ Austin USA Zip codes within CBSA 0.213Minneapolis USA Zip codes within CBSA 0.215Birmingham USA Zip codes within CBSA 0.220Boston USA Zip codes within CBSA 0.229Providence USA Zip codes within CBSA 0.231Charlotte USA Zip codes within CBSA 0.235Louisville/Jefferson County USA Zip codes within CBSA 0.246Dallas USA Zip codes within CBSA 0.247Orlando USA Zip codes within CBSA 0.252St. Louis USA Zip codes within CBSA 0.256Riverside USA Zip codes within CBSA 0.269Buffalo USA Zip codes within CBSA 0.270Chicago USA Zip codes within CBSA 0.276Kansas City USA Zip codes within CBSA 0.286Baltimore USA Zip codes within CBSA 0.292San Jose USA Zip codes within CBSA 0.295Denver USA Zip codes within CBSA 0.297Philadelphia USA Zip codes within CBSA 0.298Cleveland USA Zip codes within CBSA 0.304Hartford USA Zip codes within CBSA 0.316San Antonio USA Zip codes within CBSA 0.327Portland USA Zip codes within CBSA 0.349San Diego USA Zip codes within CBSA 0.352Phoenix USA Zip codes within CBSA 0.383Milwaukee USA Zip codes within CBSA 0.385Memphis USA Zip codes within CBSA 0.393Indianapolis USA Zip codes within CBSA 0.411
S-6 ity Country Resolution κ Las Vegas USA Zip codes within CBSA 0.415Oklahoma City USA Zip codes within CBSA 0.418Sacramento USA Zip codes within CBSA 0.419New Orleans USA Zip codes within CBSA 0.424Jacksonville USA Zip codes within CBSA 0.436Tampa USA Zip codes within CBSA 0.436Detroit USA Zip codes within CBSA 0.462Govan Mbeki ZAF Wards within municipalities 0.013Hibiscus Coast ZAF Wards within municipalities 0.023City of Matlosana ZAF Wards within municipalities 0.025Nelson Mandela Bay ZAF Wards within municipalities 0.028City of Cape Town ZAF Wards within municipalities 0.029Emalahleni ZAF Wards within municipalities 0.031City of Tshwane ZAF Wards within municipalities 0.033Local Municipality of Madibeng ZAF Wards within municipalities 0.041Ekurhuleni ZAF Wards within municipalities 0.044City of Johannesburg ZAF Wards within municipalities 0.054Matjhabeng ZAF Wards within municipalities 0.060Nkomazi ZAF Wards within municipalities 0.060Greater Letaba ZAF Wards within municipalities 0.064Mangaung ZAF Wards within municipalities 0.069Makhado ZAF Wards within municipalities 0.075Emfuleni ZAF Wards within municipalities 0.107Greater Tzaneen ZAF Wards within municipalities 0.137eThekwini ZAF Wards within municipalities 0.147Mogalakwena ZAF Wards within municipalities 0.149Lepele ZAF Wards within municipalities 0.151
S-7 ity Country Resolution κ Buffalo City ZAF Wards within municipalities 0.216Polokwane ZAF Wards within municipalities 0.244The Msunduzi ZAF Wards within municipalities 0.247Rustenburg ZAF Wards within municipalities 0.253Makhuduthamaga ZAF Wards within municipalities 0.265Thembisile ZAF Wards within municipalities 0.285Chief Albert Luthuli ZAF Wards within municipalities 0.465Dr JS Moroka ZAF Wards within municipalities 0.497Maluti a Phofung ZAF Wards within municipalities 0.502Bushbuckridge ZAF Wards within municipalities 0.580Thulamela ZAF Wards within municipalities 0.645
A. Mobility Data
The Google COVID-19 Aggregated Mobility Research Dataset contains anonymized mobilityflows aggregated over users who have turned on the Location History setting, which is offby default. This is similar to the data used to show how busy certain types of places are inGoogle Maps — helping identify when a local business tends to be the most crowded. Thedataset aggregates flows of people from region to region.To produce this dataset, machine learning is applied to logs data to automatically segmentit into semantic trips [46]. To provide strong privacy guarantees, all trips were anonymizedand aggregated using a differentially private mechanism [65] to aggregate flows over time (see https://policies.google.com/technologies/anonymization ). This research is done onthe resulting heavily aggregated and differentially private data. No individual user data wasever manually inspected, only heavily aggregated flows of large populations were handled.All anonymized trips are processed in aggregate to extract their origin and destination loca-tion and time. For example, if users traveled from location a to location b within time intervalS-8 , the corresponding cell ( a, b, t ) in the tensor would be n ± err, where err is Laplacian noise.The automated Laplace mechanism adds random noise drawn from a zero mean Laplacedistribution and yields ( (cid:15), δ )–differential privacy guarantee of (cid:15) = 0 .
66 and δ = 2 . × − per metric. Specifically, for each week W and each location pair ( a, b ), we compute thenumber of unique users who took a trip from location a to location b during week W . Toeach of these metrics, we add Laplace noise from a zero-mean distribution of scale 1 / . (cid:15), δ )–differential privacy with values defined above. The parameter (cid:15) controls thenoise intensity in terms of its variance, while δ represents the deviation from pure (cid:15) –privacy.The closer they are to zero, the stronger the privacy guarantees.We use data collected weekly from November 3rd 2019 to February 29th 2020, ensuring thatwe capture standard movement behavior, uninfluenced by pandemic conditions. Dependingon the availability of population data per country, the mobility flows between S2 cells areaggregated to patch size of comparable resolution between countries: • United States (USA): 50 Urban areas with zip code patches [48]. • Italy (ITA): 49 Communes with S2 cell patches [51]. • South Africa (ZAF): 31 Municipalities with ward patches [50]. • Australia (AUS): 33 Statistical Areas Level 4, with Level 2 patches [49].To aggregate the S2 cells to the corresponding alternate patch types, the centroid pointsof the S2 cells were spatially joined with GIS boundaries of the alternate resolution. Forexample, in the US, flows from a zip code X to a zip code Y are determined by summingall of the flows that start in S2 cells whose centroids lie within X and end in S2 cells whosecentroids lie within Y. S-9 . Population density data
Population density data is aggregated to patch-size in a similar manner when necessary. Inthe case of US zip codes, Australian statistical areas, and South African wards, populationinformation is collected and available at those respective resolutions. For patches in Italy,population is determined by aggregating the populations of the 30 meter tiles collected byFacebook [51] to S2 cells. This is done by spatially merging the centroids of the tiles to thepatches, similar to the S2-to-patch mobility flow aggregation.
S3. Epidemic modelA. Dynamical equations
The model used for estimating the vulnerability of a given city is a generalized version of theformalism included in [27]. From an epidemiological point of view, the model incorporates aSIR dynamics where individuals can be susceptible of contracting the disease (S), infectious(I) or recovered (R). An infectious individual transmits the pathogen to healthy counterpartsvia direct interaction at a rate λ . In turn, infectious individuals enter the compartment R at a rate µ , which typically encodes the inverse of the expected contagious period. Themixing among healthy and infectious individuals is governed by the spatial distributionof the population and their mobility patterns, which are accommodated in the formalismby using metapopulations. A metapopulation is a complex network composed by a setof N patches which represent places gathering agents. Those individuals can move acrossthe metapopulation, being these movements determined by the mobility patterns usuallyencoded in origin-destination matrices.In our model, each individual has an associated node which is identified as her residence.Therefore, each patch i is initially populated by n i agents. We split each day into threestages: Movement, Interaction and Return. First, agents decide whether or not moving witha probability p . If moving, they choose their destination according to the origin destinationS-10atrix R , being the elements R ij the probability of moving from patch i to patch j . Weconstruct this matrix from mobility data, encoded in T , as R ij = T ij (cid:80) l T il . (S3)After all the movements have been completed, interaction among agents sharing the samecurrent location occurs. At this point, we make a mean-field assumption within each patch,so every agent inside a given location makes the same number of contacts. We also assumethat the number of contacts is proportional to the density of each patch via a function f which allows for including different ways of weighting the relevance of population density forthe number of interactions inside a given patch. Finally, as we want to reflect the mostlycommuting nature of human mobility, we force all the individuals to return to their residenceand repeat the same process for a new time step (day).Under these assumptions, the dynamics is totally characterized by a set of 2 × N coupleddiscrete equations governing the temporal evolution of the fraction of infected and recoveredindividuals with residence in each patch. In particular, the fraction of infected, ρ Ii ( t + 1),and recovered individuals, ρ Ri ( t + 1), associated to patch i at time t + 1, read: ρ Ii ( t + 1) = (1 − µ ) ρ Ii ( t ) + (1 − ρ Ii ( t ) − ρ Ri ( t ))Π i ( t ) , (S4) ρ Ri ( t + 1) = ρ Ri ( t ) + µρ Ii ( t ) . (S5)Eq. (S5) determines the evolution of recovered patients. As the SIR model assumes thatthe compartment R constitutes the final epidemiological state, this evolution is just givenby the number of infected individuals overcoming the disease. Regarding the evolution ofinfectious individuals, the r.h.s. of Eq. (S4) corresponds to those infected overcoming thedisease and the second one involves contagions of susceptible individuals. In this sense, theprobability that a susceptible individual living inside i contracts the disease at time t , Π i ( t ),S-11an be expressed as: Π i ( t ) = (1 − p ) P i ( t ) + p N (cid:88) j =1 R ij P j ( t ) . (S6)The first term identifies those contagions occurring inside the residence patch whereas thesecond term contains those taking place inside neighboring areas. Likewise, the probabilityof contracting the disease inside a given node i at time t , P i ( t ), is given by: P i ( t ) = 1 − (cid:32) − λ I effi ( t ) n effi (cid:33) f i , (S7)where f i determines the number of contacts per day of individuals inside i . Finally, theterms n effi and I effi ( t ), which denote the effective population and effective number of infectedindividuals inside patch i at time t after population movements, read: n effi = N (cid:88) j =1 n j ( δ ij (1 − p ) + R ji ) , (S8) I effi = N (cid:88) j =1 n j ρ Ij ( t ) ( δ ij (1 − p ) + R ji ) . (S9)
1. Estimating cities’ vulnerability
The former equations offer the possibility of estimating the vulnerability for each city. Forthis purpose, we study the epidemic threshold defined as the minimum infectivity per contactneeded to observe an epidemic outbreak. Therefore, the lower the epidemic threshold is, theeasier an epidemic wave propagates, thus reflecting a higher city vulnerability to epidemicoutbreaks. To estimate the epidemic threshold, we assume that the disease has reached astationary equilibrium and that the epidemic size is negligible compared with the populationsize. Mathematically, this imply that (cid:126)ρ I ( t + 1) = (cid:126)ρ ( t ) = (cid:126)(cid:15) (cid:28) (cid:126)
1. In addition, we neglectthe individuals belonging to the compartment R by setting (cid:126)ρ R ( t + 1) = (cid:126)ρ R ( t ) = (cid:126)
0. BothS-12ssumptions allow us to linearize Eq. (S7) which now reads: P i (cid:39) λf i I effi n effi . (S10)After introducing Eqs.(S6-S10) into Eq.(S4) and taking into account the stationary regime,we obtain: µλ (cid:15) i = N (cid:88) j =1 (cid:34) (1 − p ) δ ij f i n effi + p (1 − p ) (cid:32) R ij f j n effj + R ji f i n effi (cid:33) + p (cid:88) l R il R jl f l n effl (cid:35) n j (cid:124) (cid:123)(cid:122) (cid:125) M ij (cid:15) j . (S11)The former expression holds if µλ corresponds to an eigenvalue of matrix M . As our goal isobtaining the minimum λ value triggering epidemic outbreaks, the epidemic threshold λ c isgiven by: λ c = µ Λ max ( M ) , (S12)with Λ max ( M ) denoting the spectral radius of matrix M . B. Function governing contacts
At this point, it is necessary to specify the form of the function f , determining how thenumber of contacts that each individual makes inside each patch depends on its density. Toshed light on the role of mobility inside each city, we follow a non-parametric approach byassuming that these contacts are linearly proportional to the density inside each patch. Thisway, the results shown in Figs. 2 & 4 are obtained by assuming: f i = n effi a i , (S13)being a i the area of patch i .If one is interested in isolating the role played by mobility in shaping cities’ vulnerability,we must remove the dependence of the overall population density for each case. To do so,S-13 . . . . . ˜ ∏ c USA BeneficialNeutralDetrimental 0 . . . . . . . . . . . . ∑ ˜ ∏ c AUS 0 . . . . . . . ∑ A BC D
FIG. S1. Normalized epidemic threshold versus population density hotspot concentration for the( A ) United States (Spearman: -.62, Pearson = -.59), ( B ) Italy (Spearman = -.81, Pearson = -.77 ),( C ) South Africa (Spearman = -.79, Pearson = -.57), and ( D ) Australia (Spearman = -.45, Pearson= -.47) , with each city colored according to the outcome of removing flows connecting hotspotsand distribute them evenly among non-hotspots neighboring areas (Fig. 2 B ). Solid line shows thepower-law fitting ˜ λ c = Aκ β and shadowed regions cover 68% confidence interval. we decide to compute the normalized epidemic threshold ˜ λ c which is defined as the ratiobetween the actual epidemic threshold, computed by accounting for human mobility, andthe threshold corresponding to a static scenario. Therefore, this quantity reads:˜ λ c = Λ max ( M )( p = 0)Λ max ( M )( p = 1) (S14)Although useful for illustrating the role of mobility in each city, the non-parametric linearrelation does not correspond with a realistic scenario due to the large difference in termsof contacts existing among zones with disparate densities. To make a more fair comparisonS-14n the expansion of COVID-19 over different cities, we choose a more complex functiongoverning the number of contacts. Following [67], we assume that f i = 2 − e − ξ neffiai , (S15)which is bounded such that f i ∈ [1 , ξ is estimated by maximizingthe correlation among the theoretical and the observed vulnerabilities, yielding ξ = 2 · − square miles. By including this function, we estimate the epidemic threshold for each city, λ c , as: λ c = µ Λ max ( M )( p = 1) , (S16)where we set µ = 1 for the sake of simplicity. Note that this parameter does not have anyinfluence on the cities’ ranking since it is inherent to the disease features and does not dependon human interactions. S4. Examining the impact of hotspot concentration at the country level
In the main text, we reveal that all the cities analyzed here, regardless of their associatedcountry, fall in an universal curve governing the dependence of the normalized epidemicthreshold ˜ λ c on the flow concentration within hotspots κ . Moreover, we check that thisuniversal curve is well-represented by a sub-linear power law decay function. Nonetheless,despite the robustness of the results, the spatial resolution of the basic units composing themetapopulations is limited by data availability for each country. Therefore, different spatialresolutions are mixed in Fig. 2 of the main text, which partially hinders the close relationbetween ˜ λ and κ c .To characterize the relevance of the flows connecting hotspots for epidemic spreading, we splitthe universal curve presented in the main text and represent in Fig. S1 the individual curvesfor each of the four countries analyzed here. The high Pearson and Spearman correlationcoefficients among ˜ λ and κ along with the goodness of the power law fitting for each individualcountry strengthen our message about the close connection between the normalized epidemicS-15 ABLE S2. Exponent of the power law function ˜ λ c = Aκ β governing the dependence of thenormalized epidemic threshold on the hotspot concentration. Country Decay rate β USA − . ± . − . ± . − . ± . − . ± . S5. Connecting Indicators and Population Measures
We examine the relationship between normalized epidemic threshold ¯ λ , population densityhotspot concentration κ and two other common population measures: average populationdensity and Lloyd’ mean crowding for cities in the US in Fig. S2. A. Average Population Density
The average was calculated by averaging the population densities d i,k of all i tot zip codes i within each city k : (cid:80) d i,k i tot (S17)and describes the relative concentration of people within a city.S-16 λ c κ AveragePopulationDensity Lloyd’s MeanCrowding¯ λ c κ AveragePopulationDensityLloyd’s MeanCrowding 1.00 -0.62 -0.25 0.22-0.62 1.00 -0.05 -0.15-0.25 -0.05 1.00 0.360.22 -0.15 0.36 1.00 − . − . . . . FIG. S2. Spearman Correlations between κ , the average population density in the city, and theLloyd mean crowding [51] with the normalized epidemic threshold ˜ λ c B. Lloyd Mean Crowding
Lloyd mean crowding [52] was calculated according to (cid:80) ( q i − q i (cid:80) q i (S18)for patch populations q i . Lloyd mean crowding measures the number of unique contactspossible for members of patches in a city. We note that unlike κ and average populationdensity, this measure is independent of patch size and therefore doesn’t specifically addressproximity.We find that our chosen pair of ¯ λ and κ contains the strongest correlations. Of the threepopulation-related metrics κ , average density, and Lloyd’s Mean crowding, κ is the onlyS-17ne that contains mobility information, and like average population density, differentiatesbetween patches of different spatial size, supporting that mobility and density are vitalinclusions to properly understanding the epidemic situation of a city. S6. Empirical Epidemic-Indicator Correlations
We quantify the extent to which COVID-19 is able to spread in a US city by examining thetimeseries of confirmed cases per county [68, 69], from January 23 2020 to April 16 2020(before truncating.) We aggregate this data to the level of CBSAs by summing across eachCBSAs component counties. Given the noise in the data (due to collection and reportingartifacts, etc) we perform preprocessing on the curves. We use a Savitzky-Golay Filter[70] to smooth the data by fitting intermediate windows [71] with low-order polynomials.We then truncate our data to a window of two weeks after 100 cases were confirmed ineach county. This allows our window of observation to capture the regime where COVID-19 awareness encouraged active testing, but before intervention methods influence how thedisease propagates within cities. This way, we capture the disease behavior specific to thecity structure, and not external suppression. To estimate the vulnerability of each city, wefit the filtered cumulative number of infections to an exponential function I ( t ) = ae bt (S19)and extract the growth rate b . Although more sophisticated approaches have been proposedin the literature based on the estimation of the effective reproductive number R e , our proce-dure is simple but effective to capture the vulnerability of each city at the early stage of theoutbreak. The smoothed curves along with the exponential fits are presented in Fig. S3.S-18 . . . . . . l n I Atlanta b = 0.1866 ± . . . . b = 0.1264 ± . . . . . b = 0.1598 ± . . . . b = 0.1054 ± b = 0.2762 ± . . . . . l n I Buffalo b = 0.1517 ± . . . . . b = 0.1365 ± b = 0.2184 ± . . . . b = 0.1455 ± . . . . . b = 0.1373 ± . . . . . l n I Columbus b = 0.1309 ± . . . . . b = 0.1732 ± . . . . . b = 0.1869 ± b = 0.1990 ± . . . . . b = 0.1479 ± . . . . . . l n I Houston b = 0.1903 ± . . . . . . . b = 0.1769 ± . . . . b = 0.1379 ± . . . . b = 0.1189 ± . . . . . b = 0.1813 ± . . . . . . l n I Los Angeles b = 0.2086 ± . . . . b = 0.1005 ± . . . . . b = 0.1276 ± b = 0.2369 ± . . . . . b = 0.1362 ± . . . l n I Minneapolis b = 0.0972 ± . . . . . b = 0.1612 ± b = 0.1855 ± b = 0.4319 ± . . . . b = 0.1109 ± . . . . . l n I Orlando b = 0.1628 ± b = 0.2237 ± . . . . . b = 0.1348 ± . . . . . b = 0.1271 ± . . . . b = 0.1203 ± . . . . . l n I Providence b = 0.1711 ± . . . . . .
00 Raleigh b = 0.0964 ± . . . . b = 0.1105 ± . . . . . . b = 0.1736 ± . . . . b = 0.1278 ± . . . . l n I Salt Lake City b = 0.1341 ± . . . . b = 0.1276 ± . . . . b = 0.1501 ± . . . . b = 0.1653 ± . . . . b = 0.1271 ± . . . . . l n I Seattle b = 0.1585 ± . . . . . . b = 0.1530 ± . . . . . b = 0.1646 ± . . . . b = 0.1291 ± . . . . . . b = 0.1931 ± FIG. S3. Cumulative number of COVID-19 reported 14 days after the first 100 reported casesfor the 50 cities in the United States. Dots show real data, smoothed to remove the inherent noisynature of case reports by using a Savitzky-Golay filter. The line shows the exponential fit of thedata via least-squares method to I ( t ) = ae bt ..