Disproportionate incidence of COVID-19 in African Americans correlates with dynamic segregation
Aleix Bassolas, Sandro Sousa, and Vincenzo Nicosia
School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, United Kingdom
(Dated: July 9, 2020)
Socio-economic disparities quite often have a central role in the unfolding of large-scale catastrophic events. One of the most concerning aspects of the ongoing COVID-19 pandemic [1] is that it disproportionately affects people from Black and African American backgrounds [2–6], creating an unexpected infection gap. Interestingly, the abnormal impact on these ethnic groups seems to be almost uncorrelated with other risk factors, including co-morbidity, poverty, level of education, access to healthcare, residential segregation, and response to cures [7–11]. A proposed explanation for the observed incidence gap is that people from African American backgrounds are more often employed in low-income service jobs, and are thus more exposed to infection through face-to-face contacts [12], but the lack of direct data has so far prevented strong conclusions in this sense. Here we introduce the concept of dynamic segregation, that is, the extent to which a given group of people is internally clustered or exposed to other groups as a result of mobility and commuting habits. By analysing census and mobility data on more than 120 major US cities, we found that the dynamic segregation of African American communities is significantly associated with the weekly excess COVID-19 incidence and mortality in those communities. The results confirm that knowing where people commute to, rather than where they live, is much more relevant for disease modelling.
The spread of a non-airborne virus like COVID-19 is mostly mediated by direct face-to-face contacts with other infected people. This is why the first measures attempting to contain the spread of the virus included the introduction of travel restrictions, social distancing, curfews, and stay-at-home orders [13–16]. However, the distribution of the number of contacts per person is known to be fat-tailed [17], so that most of the infections are actually caused by a relatively small set of individuals, called super-spreaders [18, 19], who normally have a disproportionately high number of face-to-face contacts. Intuitively enough, super-spreaders are most commonly found among service workers (cashiers, postmen, clerks, cooks, bus drivers, waiters, etc.), since their job involves being in direct contact with a large number of people on a regular basis. This fact makes super-spreaders more prone to catch diseases that propagate preferentially through direct contacts, like COVID-19 does, and, involuntarily, more efficient at spreading them.
The fact that mainly African Americans seem to be affected by such a markedly unusual COVID-19 incidence [20–22], rather than, say, people with low income, little access to healthcare, or other increased risk factors [7–11], points to ethnic segregation, i.e., the tendency of people belonging to the same ethnic group to live closer in space, as a possible culprit [23–27]. Indeed, ethnic segregation is a long-standing problem across the US [28], so the idea that the abnormal proportion of COVID-19 infections among African Americans could be due to spatial segregation does not sound unreasonable.
However, the results available so far confirm that, although there is a correlation between ethnic segregation and overall incidence of COVID-19 in the population, there seems to be little evidence of an association with the infection gap in African Americans [29].
Our hypothesis is that the observed infection gap is most probably due to a prevalence of super-spreading behaviours in African American communities, i.e., activities that contribute to increasing the typical number and variety of face-to-face contacts of individuals (including, for instance, their job, habits, social life, commuting and mobility patterns) and that effectively make them more exposed to the infection. In particular, we argue that these super-spreading behaviours are connected to the presence of what we call dynamic segregation. By dynamic segregation we mean the extent to which individuals of a certain class or group are either preferentially exposed to other groups, or internally clustered, as a result of their mobility patterns. In this sense, dynamic segregation is somewhat complementary to the classical notion of segregation based on residential data, and is instead related to similar measures of segregation based on the concept of activity space [30]. In principle, the fact that a certain residential neighbourhood has an overabundance of people belonging to a single ethnic group might have per se little or no role in increasing the probability that those people catch COVID-19. Conversely, the fact that a group of people works preferentially in specific sectors, or in specific areas of a city, almost automatically increases the typical number of face-to-face contacts they have during a day, e.g., by forcing them to commute long distances in packed public transport services.
RESULTS
A. Model
FIG. 1. Using typical times of random walks to quantify urban dynamic segregation. The sequence of ethnicities (here indicated by different colours) visited by a random walk over (a) the adjacency network or (f) the commuting network among census tracts of a city retains relevant information about the presence of spatial correlations in ethnicity distribution. Indeed, the normalised values of Class Coverage Time $\tilde{\gamma}_{\alpha}$ (panels b,d,g,i) and Class Mean First Passage Time $\tilde{\tau}_{\alpha\beta}$ (panels c,e,h,j) of a random walk exhibit different patterns in different cities, and reveal different kinds of ethnic correlations in the adjacency and in the commuting network of the same city. We show here the values for Chicago and Los Angeles, since Illinois and California have, respectively, one of the highest and one of the lowest COVID-19 incidence gaps. Indeed, the mean first passage time from African American to White neighbourhoods in the adjacency graph is much higher in Chicago than in Los Angeles, while the commuting graph reveals that African Americans are much more exposed to all the other ethnicities in Chicago than in Los Angeles.
We quantify the dynamic segregation of a certain group in an urban area by means of the typical time needed by individuals of that group to get in touch with individuals of other groups when they move around the city. In our model, a city is represented by a graph G where nodes are census tracts and each edge indicates a relation between two areas, namely either physical adjacency or the existence of commuting flows between them. Each node is assigned to a class, according to the ethnicity distribution in the corresponding area (see Methods for details). Then, we consider a random walk on the graph G, and we look at the statistics of Class Mean First Passage Times (CMFPT) and Class Coverage Times (CCT). The former is the number of steps needed by a walker starting on a node of a certain class α to end up for the first time on a node of class β, while the latter is related to the time needed by a random walk to visit all the classes in the system (see Methods for details). The underlying idea is that a random walk through the graph preserves most of the information about correlations and heterogeneity of node classes [31].
Consequently, if a system is dynamically segregated, the statistics of CMFPT and CCT will be substantially different from those observed on a null-model graph having exactly the same set of nodes and edges, but where a node is assigned a class at random from the underlying ethnicity distribution.
[Figure 2: scatter plots of the incidence gap $\Delta A_{\mathrm{inf}}$ against C (clustering), E (exposure), I (isolation), and G (spatial Gini), on the adjacency graph (panels a-d) and on the commuting graph (panels e-h).]
FIG. 2. Correlation between incidence gap and dynamic segregation in the early stages of the epidemic. The incidence gap $\Delta A_{\mathrm{inf}}$ across US states in the first two weeks after extensive lock-down measures were enforced exhibits a rather strong correlation with measures of segregation based on CMFPT and CCT on the adjacency (panels a-d) and on the commuting graphs (panels e-h). In particular, the dynamic clustering C (a,e) is always positively correlated with $\Delta A_{\mathrm{inf}}$, the dynamic exposure E (b,f) is positively correlated with $\Delta A_{\mathrm{inf}}$ only in the commuting network, and the dynamic isolation I (c,g) is negatively associated with incidence gap only in the adjacency network. Notice that classical measures of residential segregation, like the Spatial Gini coefficient (d,h), are instead poorly or not correlated at all with incidence gap. Each colour corresponds to a temporal snapshot of the data set (red for 12/ […]); significance thresholds: p < […].
In Fig. 1 we provide a visual sketch of the model and we show the distributions of CMFPT and CCT in Chicago and Los Angeles. We chose these two specific cities since Illinois and California are two states respectively characterised by a relatively high and a relatively low incidence gap [32, 33] (a detail of incidence gap across US states is available in Supplementary Figures 18-19). Here each node is associated to one of the seven high-level ethnic groups defined by the US Census Bureau [34], with a probability proportional to the abundance of that ethnicity in the corresponding census tract. The variables of interest are $\tilde{\tau}_{\alpha\beta}$ and $\tilde{\gamma}_{\alpha}$. These are, respectively, the ratio of the CMFPT from class α to class β in the real system and in the null-model, and the ratio of the CCT when the walker starts from class α in the real system and in the null model (see Eq. 8 and Eq. 10 in Methods). In short, the farther away $\tilde{\tau}_{\alpha\beta}$ is from 1, the higher the dynamic segregation from class α to class β. Similarly, the higher the value of $\tilde{\gamma}_{\alpha}$, the more isolated ethnicity α is from all the other ones.
The top panels of Fig. 1 correspond to the unweighted network A of physical adjacency between census tracts, while the bottom panels are obtained on the weighted network C of typical daily commute flows among the same set of census tracts [35] (see Methods for details).
Notice that the two graphs have quite different structures: the adjacency graph is planar and each edge connects only nodes that are physically close, while in the commuting graph long edges between physically separated tracts are not only possible, but quite frequent. As a consequence, the adjacency graph provides information about short trips, e.g., for daily shopping and access to local services, while the commuting graph represents long-range trips, e.g., related to commuting to and from work. It is clear that each ethnicity has a peculiar pattern of passage times to the other ethnicities, and this pattern varies across cities. For instance, in Chicago the two largest values of $\tilde{\tau}_{\alpha\beta}$ on the adjacency graph are observed between African Americans and White, and between Asian and African Americans. Conversely, in Los Angeles the two largest values of $\tilde{\tau}_{\alpha\beta}$ are between African American and Asian and between Other and Asian. As expected, the profile of $\tilde{\tau}_{\alpha\beta}$ for a given class is quite different if we consider the commuting network instead of the adjacency graph. In Chicago, the largest value of $\tilde{\tau}_{\alpha\beta}$ is from White to African American, while in Los Angeles there are many pairs of classes with very similar values of $\tilde{\tau}_{\alpha\beta}$, indicating that in this city dynamic segregation for African Americans is less prominent than in Chicago. The value of $\tilde{\gamma}_{\alpha}$ for African Americans is especially low in Chicago, but noticeably different from that of the other ethnicities in Los Angeles. Results for other cities are discussed in Appendix A and Supplementary Figures 1-4. As we shall see in a moment, $\tilde{\gamma}_{\alpha}$ is related to the isolation of a class, so that lower values correspond to increased exposure to all the other classes.
FIG. 3. Distribution of local dynamic segregation. The distribution of the fraction of African American population living in each census tract (panels a,f) is mostly unrelated to the local clustering index $\tilde{\xi}_i$ (panels b,d,g,i) and to the local isolation index $\tilde{\psi}_i$ (panels c,e,h,j). The figure shows the result for Chicago (top panels) and for Los Angeles (bottom panels). Overall, there is little correlation between the density of African American residents and the dynamic segregation of African Americans in an area. This explains why dynamic segregation indices in a city correlate quite strongly with the COVID-19 infection gap, while no strong association with residential segregation has been found so far.
B. Dynamic segregation and infection gap
Starting from the statistics of CMFPT and CCT at the level of each city, we defined three indices of dynamic segregation, namely dynamic clustering (C), dynamic exposure (E), and dynamic isolation (I), and we associated with each state in the US the weighted average of each of those indices across the largest metropolitan areas of the state (the definitions of these measures are provided in Methods, while a ranking of US states by each segregation index is reported in Appendix B and in Supplementary Figure 5). We considered two temporal data sets of the weekly percentage of African Americans infected by and deceased due to COVID-19 for each state in the US [32, 33] (more details available in Methods), and we calculated the incidence gap $\Delta A_{\mathrm{inf}}$ in each state as the difference between the percentage of infected of that state that are African Americans and the percentage of African American population in the same state. Hence, positive values of $\Delta A_{\mathrm{inf}}$ correspond to a disproportionate incidence of COVID-19 on African American communities.
In Fig. 2 we show the scatter plots of the average dynamic clustering, exposure, and isolation of African Americans at state level, and of the corresponding COVID-19 infection gap in the first two weeks after major lock-down measures were introduced across the US. We chose these two temporal snapshots because the number of confirmed infected individuals in a week actually depends on their contacts up to two weeks before, due to the COVID-19 incubation period [36]. The top panels report the results on the adjacency networks of census tracts, while the bottom panels are for the commuting graphs.
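The incidence gap defined above is a plain difference of two percentages, which the following minimal sketch makes explicit. The function name and the numbers are hypothetical and serve only as an illustration; they are not taken from the data sets used in the paper.

```python
def incidence_gap(pct_infected_aa: float, pct_population_aa: float) -> float:
    """Incidence gap: share of infected who are African American
    minus the share of African Americans in the state population.
    Both inputs are percentages in [0, 100]."""
    return pct_infected_aa - pct_population_aa


# Hypothetical state: 30% of confirmed cases, 14% of residents.
gap = incidence_gap(30.0, 14.0)  # positive: disproportionate incidence
```

A positive value signals over-representation among the infected, a value near zero signals proportionate incidence.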
Interestingly, there exists a quite strong correlation between dynamic segregation and the disproportionate number of infected in African American communities. In particular, the dynamic clustering of African Americans in a state correlates positively and quite strongly with the infection gap observed in that state in the first two weeks of the data set, both on the adjacency (respectively R = 0.58 and R = 0.44 in the first two weeks) and in the commuting network (respectively R = 0. […] R = 0. […] R = 0.40 and R = 0. […] R = 0. […] $A_{\mathrm{dec}}$, and the ratios of infection/deaths incidence instead of the difference (see Supplementary Figures 8-10). In particular, dynamic isolation exhibits a somewhat stronger correlation with death gap (R = 0. […]
[Figure 4: panels a-d, single variable analysis, showing R for $\Delta A_{\mathrm{inf}}$ over time against C (clustering), E (exposure), I (isolation), and G (spatial Gini) on the commuting and adjacency graphs; panels e-h, multivariate analysis, showing the same indices combined with public transport usage (PT).]
FIG. 4.
Temporal evolution of incidence gap correlations and multivariate analysis with public transport usage. Evolution of the Pearson correlation (R) between African American incidence gap and (a) dynamic clustering, (b) dynamic exposure, (c) dynamic isolation, and (d) Spatial Gini coefficient, respectively on the adjacency (solid red lines) and commuting graphs (dashed blue lines). (e-h) Multivariate analysis of the same indices and usage of public transportation by African Americans for (e) dynamic clustering, (f) dynamic exposure, (g) dynamic isolation, and (h) spatial Gini coefficient. The type of marker indicates the sign of the correlation (triangles pointing up for positive correlations, and down for negative correlations). Given the uneven temporal reporting of ethnicity data, each temporal snapshot has a slightly different number of US states (details provided in Supplementary Figure 17). We have also tried alternative formulations for C and E obtaining significant correlations, as shown in Supplementary Figure 27.
[…] indices can be better explained by looking at how residential data and dynamic segregation are distributed across a city. In Fig. 3 we show the heat-maps of abundance of African American residents in Chicago and Los Angeles together with the local segregation indices $\tilde{\xi}_i$ and $\tilde{\psi}_i$, respectively derived from passage times and coverage times, on the adjacency and on the commuting graph of census tracts (see the definitions provided in Methods; additional maps for Detroit and Houston are reported in Appendix C and Supplementary Figure 15). It is true that in Chicago $\tilde{\xi}_i$ in the adjacency graph is still somewhat correlated with the fraction of African American population (see Supplementary Figure 16). But the distribution of $\tilde{\xi}_i$ in the commuting graph is totally different. In particular, the regions characterised by residential clusters of African Americans exhibit lower values of $\tilde{\xi}_i$, meaning that the commuting patterns make those neighbourhoods overall less isolated. Conversely, new hot-spots are identified in the South-Eastern region of Gary, likely due to the fact that people in this region do not commute much to the city centre anyway.
Similarly, the areas of Los Angeles with the largest local isolation are not the neighbourhoods with a higher percentage of African American residents, but rather the suburbs characterised by high commuting.
C. Combined effects of dynamic segregation and use of public transport
Finally, in Fig. 4 a-d we show the correlation between the infection gap and the different segregation measures as the pandemic progresses. Unsurprisingly, the correlation with any single measure decreases over time for all the indices, both on the adjacency and on the commuting graph. Similar results are found for the correlation with death gap and with ratios of incidence and deaths in African Americans (see Appendix D and Supplementary Figures 20-22) as well as with a second data set we had access to [33] (see Supplementary Figures 23-26). The main reason for the observed decrease is that once large-scale mobility restrictions are put in place (as happened between the end of March and the beginning of April across all the US states with stay-at-home orders and curfews) the overall mobility structure of each city is massively disrupted. As a result, super-spreading behaviours due to usual commuting patterns are massively reduced, and the contagion progresses mainly through face-to-face interactions happening close to the place of residence of each individual, which are not captured well by CMFPT and CCT on the commuting graph.
In order to capture the shift towards local transport after lock-downs are enforced, in Fig. 4 e-h we show the results of the multivariate analysis of the same set of segregation indices shown in Fig. 2 and of the fraction of African American population using public transport in each city (see Methods for details). The combination of dynamic segregation and use of public transport correlates quite consistently with the incidence gap. These findings are made more relevant by the fact that the incidence gap in African Americans in the same period is quite poorly correlated with the overall usage of public transport in the population, as well as with a variety of other socio-economic indices, as shown in Supplementary Figures 30-31. Since cities are complex interconnected systems, it is plausible to hypothesise that segregation and public transport usage are related in subtle and intricate ways, so that it is practically impossible to establish whether the former has caused the latter, or instead the two phenomena have co-evolved over time.
DISCUSSION
The vulnerability of African American communities and their higher socio-economic disparities were a standing issue in the US long before the pandemic; the disproportionate infection rates simply highlighted and amplified the problem. The presence of a COVID-19 incidence gap in the Black and African American population is somewhat unexpected, since no specific biological risk factor has been strongly associated with an increased vulnerability to the virus of any specific ethnic group. Hence, the most unbiased assumption to explain such a disproportionate incidence, which in some areas is three to five times higher than the fraction of African American population, is that it should be related to behavioural and social factors, rather than to biological ones. The most frequently proposed explanation is that African Americans are more exposed to COVID-19 because they are more frequently employed in service jobs. This explanation is indeed reasonable, since service workers normally have hundreds of face-to-face interactions during a day. Indeed, some recent studies have estimated that the switch to remote working was mainly available to people employed in non-essential services, and amounted to 22%-25% of the work force before April [37]. As expected, service workers are one of those categories to which the option of switching to remote working during the lock-down was not available at all, especially in sectors deemed vital for the functioning of a country during lock-down, including food production and retailers, healthcare, transportation, and logistics. According to the US Labor Force Statistics [38], the occupations with the highest concentration of African Americans are indeed jobs characterised by face-to-face interaction, and most of them fall in the area of essential jobs: postal service sorters/processors (42%), nursing (37%), postal service clerks (35%), protective service workers (34%) and barbers (32%). It would not then come as a surprise to discover that one of the major early COVID-19 outbreaks happened in South Dakota, in a meat-processing plant, whose workers were mainly of African American background [39].
The potential relation between ethnicity and mobility was hinted at in a recent study [40] which found that the decrease in the usage of subway transport in New York during the lock-down was uneven across ethnicities, with African Americans experiencing the smallest relative drop. But unfortunately, the publicly available data about COVID-19 incidence do not contain detailed information about socio-economic characteristics of infected individuals, so drawing an association between African Americans, employment in essential service jobs, availability of remote-working options, and increased COVID-19 exposure is very hard.
An interesting finding of the present work is that the combination of dynamic segregation and use of public transport seems to explain the persistence of the infection gap throughout the early phases of the pandemic. Indeed, before lock-downs are put in place, African Americans are found to be more exposed to the virus, mainly due to the structure of their daily commuting patterns. After lock-downs are enforced, instead, they are more likely to pass the virus over to other African Americans, as a result of the high levels of clustering and isolation of these communities measured in the adjacency graphs of census tracts, which are a more reliable proxy for face-to-face interactions when long-distance commuting is disrupted. In general, the states where African Americans are more exposed with respect to long-distance trips are also those where they are more clustered with respect to short-range mobility (the rank correlation between the two measures is 0.62, as shown in Supplementary Figures 6-7 and Supplementary Table I).
The importance of considering the interaction of different classes due to mobility through the urbanscape has recently received some attention [30, 41–46]. In this sense, it is quite interesting that the simple diffusion model we used here to quantify the presence of dynamic segregation, and the corresponding indices of clustering, exposure, and isolation, are able to unveil a relatively strong correlation between the structure of mobility in a metropolitan area and the excess incidence of COVID-19 infections and deaths in African Americans. Although the model we consider uses relatively little, coarse-grained information about a city (placement of census tracts, local ethnicity distribution, and commuting trips among them), the strong correlation between dynamic segregation and incidence gap allows us to conclude that when it comes to predicting the exposure of a group to a non-airborne virus, knowing the places where the members of that group commute for work is more important and more relevant than knowing where they actually live. This is also confirmed by the quite poor association of incidence gap with other classical measures of racial segregation (see Supplementary Figures 28-29).
The results presented in this work suggest that policy makers should definitely take into account mobility patterns when modelling the spread of a disease in an urban area, and in predicting the impact of specific counter-measures. In particular, a strategy to mitigate the incidence gap should focus on reducing as much as possible long-distance trips for people that are naturally more exposed to face-to-face contacts, e.g., due to their occupation, and enforcing stricter measures of social distancing on local activities.
METHODS
Geographic network data sets
Ethnicity data was obtained from [34] and includes the data from the 2010 decennial census. Commuting trips data comes from the 2011 US census [35], focusing on the seven highest-level ethnicity classes, namely: White, Black or African American, American Indian and Alaska Native, Asian, Native Hawaiian and Other Pacific Islander, Some Other Race, Two or More Races. Population is updated to the latest American Community Survey 2014-2018 5-year Data Release [47].
For each metropolitan area we constructed two distinct spatial networks. The first one is the adjacency network, denoted by A and obtained by associating each cell to a node and connecting two nodes with a link if the corresponding cells border each other. Notice that A is an undirected and unweighted graph. The second graph is the commuting network, denoted as C. In this network each node is a tract and the directed and weighted link $\omega_{ij}$ between node i and j indicates the number of commuting trips from i to j as obtained from census information. To reconstruct a mobility network that resembles the real one (which amounts to something between 30% and 40% of the total mobility in a city) we aggregated both the trips from home to work and the corresponding return trip from work to home.
Each node of the adjacency network A preserves information about the ethnicity distribution on the corresponding census tract. We use the $N \times \Gamma$ matrix $M = \{m_{i,\alpha}\}$, where $\Gamma$ is the number of ethnicities present in the city. The generic element $m_{i,\alpha}$ of M indicates the number of citizens of ethnicity α living on node i. We denote by $M_i = \{m_{i,\alpha}\}$ the vector of population distribution at node i, and by $M_{\alpha} = \sum_{i=1}^{N} m_{i,\alpha}$ the total number of individuals of class α present in the system.
In the commuting network C, instead, we attribute to each node i both the resident population at the corresponding tract and the population commuting to node i, so that the abundance of individuals of class α on node i becomes:
$$\tilde{m}_{i,\alpha} = m_{i,\alpha} + \sum_{j} \omega_{ji}\, m_{j,\alpha}, \tag{1}$$
where $\omega_{ji}$ is the number of daily commuting trips from node j to node i. By doing so we aim to capture the fact that a commuter to cell i will potentially have face-to-face interactions with both residents in that area and other workers commuting to that area every day. Moreover, since the commuting network C accounts for both work-home and home-work trips, the adjusted population on the commuting network accounts for the potential contacts that individuals had at the origin of a trip as well.
Class Mean First Passage Time (CMFPT)
Let us consider a generic graph $G(\mathcal{V}, \mathcal{E})$ with $|\mathcal{E}| = K$ edges on $|\mathcal{V}| = N$ nodes, and a colouring function $f: \mathcal{V} \to \chi$ that associates to each node i of G a discrete label $f(i)$ from the finite set $\chi$ with cardinality $|\chi| = \Gamma$. Let us also consider a random walk on G, defined by the transition matrix $\Pi = \{\pi_{ij}\}$ where $\pi_{ji}$ is the probability that the walk jumps from node i to node j in one step. On the adjacency network A we use a uniform random walk, i.e., $\pi_{ji} = 1/k_i$, where $k_i$ is the degree of node i, while on the commuting graph C we have $\pi_{ji} = \omega_{ij}/s_i$, where $s_i = \sum_{j} \omega_{ij}$ is the out-strength of node i.
Here we focus on the statistical properties of the trajectories $W_i = \{f(i_0), f(i_1), \ldots\}$ of node labels visited by the random walk W at each time when starting from $i_0 = i$ at time $t = 0$. This dynamics contains information about the existence of correlation and heterogeneity in the distribution of colours. For instance, if the graph G is a regular lattice and the function f associates colours to nodes uniformly at random, we expect that, for long enough times, all the trajectories starting from each of the N nodes will be statistically indistinguishable.
We denote as $T_{i,\alpha}$ the Mean First Passage Time from a given node i to nodes of class α, i.e., the expected number of steps needed by a walk starting on i to visit for the first time any node j such that $f(j) = \alpha$. We can write a self-consistent forward equation for $T_{i,\alpha}$ [48]:
$$T_{i,\alpha} = 1 + \sum_{j=1}^{N} \left(1 - \delta_{f(j),\alpha}\right) \pi_{ji}\, T_{j,\alpha}. \tag{2}$$
The Mean First Passage Time $\tau_{\alpha\beta}$ from class α to class β is defined as:
$$\tau_{\alpha\beta} = \frac{1}{N_{\alpha}} \sum_{j=1}^{N} T_{j,\beta}\, \delta_{f(j),\alpha}, \tag{3}$$
where $N_{\alpha}$ is the number of nodes in the graph associated to class α. Notice that in practice the value of $\tau_{\alpha\beta}$ is obtained as an average over many realisations of the random walk.
A notable issue of the MFPT defined in Eq. (3) is the fact that its values might depend on the specific distribution of colours (i.e., on their abundance) and on the size of the network under consideration, which makes it difficult to compare Mean First Passage Times computed on different systems. To obviate this problem, we define the normalised Class Mean First Passage Time between class α and class β as:
$$\tilde{\tau}_{\alpha\beta} = \frac{\tau_{\alpha\beta}}{\tau^{\mathrm{null}}_{\alpha\beta}}, \tag{4}$$
where $\tau^{\mathrm{null}}_{\alpha\beta}$ is the MFPT from class α to class β obtained in a null-model graph. The null-model considered here is the graph having the same topology as the original one, and where node colours have been reassigned uniformly at random, i.e., reshuffled by keeping their relative abundance. Notice that $\tilde{\tau}_{\alpha\beta}$ is a pure number: if $\tilde{\tau}_{\alpha\beta} > 1$ (resp., $\tilde{\tau}_{\alpha\beta} < 1$), the expected time to reach a node of class β when starting from a node of class α is higher (resp., lower) than in the corresponding null-model. In general, a value different from 1 indicates the presence of correlations and heterogeneity.
Class Coverage Time (CCT)
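Before turning to coverage times, the passage-time definitions of Eqs. (2)-(4) can be made concrete with a minimal sketch. It assumes a single deterministic colour per node and solves Eq. (2) exactly as a linear system, whereas the text estimates these quantities by averaging over simulated walks; the function names, the ring graph, and the number of null-model reshufflings are illustrative assumptions only.

```python
import numpy as np


def class_mfpt(P, labels, n_classes):
    """Return tau[a, b], the mean first passage time from class a to class b.

    P[i, j] is the probability of stepping from node i to node j
    (written pi_ji in the text); labels[i] is the class of node i.
    """
    n = P.shape[0]
    tau = np.zeros((n_classes, n_classes))
    for b in range(n_classes):
        keep = (labels != b).astype(float)  # 1 - delta_{f(j), b}
        # Eq. (2) rearranged as (I - P diag(keep)) T = 1
        T = np.linalg.solve(np.eye(n) - P * keep, np.ones(n))
        for a in range(n_classes):
            tau[a, b] = T[labels == a].mean()  # Eq. (3)
    return tau


def normalised_class_mfpt(P, labels, n_classes, n_null=200, seed=0):
    """Eq. (4): observed CMFPT divided by its average over reshuffled colours."""
    rng = np.random.default_rng(seed)
    tau = class_mfpt(P, labels, n_classes)
    tau_null = np.mean(
        [class_mfpt(P, rng.permutation(labels), n_classes) for _ in range(n_null)],
        axis=0,
    )
    return tau / tau_null


# Toy example (hypothetical): a ring of 8 nodes with two clustered classes.
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
P = A / A.sum(axis=1, keepdims=True)  # uniform walk: 1 / k_i per neighbour
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
tau_tilde = normalised_class_mfpt(P, labels, n_classes=2)
```

In this toy case the two classes occupy opposite halves of the ring, so a walker starting in one class needs more steps to reach the other than under a random colouring, and the off-diagonal entries of `tau_tilde` exceed 1. The exact solve requires every class to be reachable from every node, which holds on a connected graph where each class has at least one node.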
The coverage time is classically defined as the number of steps needed by a random walk to visit a certain percentage of the nodes of a graph when starting from a given node i [48]. In the case of a network with coloured nodes, a walk started at node i will be associated to the generic trajectory $W_i = \{f_{i_0}, f_{i_1}, f_{i_2}, \ldots\}$ of node labels visited by the walk at each time. Since we are interested in quantifying the heterogeneity of ethnicity distributions, we consider the time series $W_i = \{M_{i_0}, M_{i_1}, M_{i_2}, \ldots\}$ where $M_{i_t} = \{m_{i_t,\alpha}\}$ is the distribution of ethnicities at node $i_t$ visited by the walk at time t. If we consider the trajectory up to time t, the vector $Q_{i_t} = H_t \sum_{\tau} M_{i_\tau}$ is the distribution of ethnicities visited up to time t by the walker started at i (here $H_t$ is a normalisation constant that guarantees $\sum_{j} \{Q_{i_t}\}_j = 1$). We quantify the discrepancy between $Q_{i_t}$ and the global ethnicity distribution across the city $P = H' M^{\top} \mathbf{1}_N$ by means of the Jensen-Shannon divergence:
$$J(P \,\|\, Q_{i_t}) = \frac{1}{2}\left[ D(P \,\|\, \mu) + D(Q_{i_t} \,\|\, \mu) \right], \tag{5}$$
where $\mu = \frac{1}{2}(P + Q_{i_t})$ and $D(P \,\|\, Q)$ is the Kullback-Leibler divergence between P and Q. We define the Class Coverage Time from node i at threshold ε as:
$$\gamma_i = \operatorname{argmin}_t \left\{ J(P \,\|\, Q_{i_t}) \le \varepsilon \right\} \tag{6}$$
and the associated normalised Class Coverage Time:
$$\tilde{\gamma}_i = \frac{\gamma_i}{\gamma_i^{\mathrm{null}}}, \tag{7}$$
where $\gamma_i^{\mathrm{null}}$ is the Class Coverage Time from node i in a null-model where the colours associated to the nodes have been reshuffled uniformly at random.
CMFPT and CCT in census networks
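As a reference point for what follows, the coverage-time definition of Eqs. (5)-(6) can be sketched for a single simulated walk. This is an illustrative sketch under stated assumptions, not the authors' implementation: the natural logarithm, the threshold value `eps=0.01`, the `max_steps` cap, and the three-node example are all hypothetical choices.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence of Eq. (5), using natural logarithms."""
    mu = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a = 0 contribute nothing
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * (kl(p, mu) + kl(q, mu))


def class_coverage_time(P, M, start, eps, rng, max_steps=100_000):
    """First time t at which the ethnicity mix accumulated along one walk
    is within eps (in Jensen-Shannon divergence) of the city-wide mix.

    P[i, j]: probability of stepping from node i to node j.
    M[i, a]: number of residents of class a at node i (all rows non-zero).
    """
    p_global = M.sum(axis=0) / M.sum()
    seen = np.zeros(M.shape[1])
    node = start
    for t in range(max_steps):
        seen += M[node]  # unnormalised Q; divided by its sum below
        if js_divergence(p_global, seen / seen.sum()) <= eps:
            return t
        node = rng.choice(len(P), p=P[node])
    return max_steps


# Hypothetical three-tract city with two classes.
rng = np.random.default_rng(0)
P = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
M = np.array([[90.0, 10.0], [10.0, 90.0], [50.0, 50.0]])
gamma_0 = class_coverage_time(P, M, start=0, eps=0.01, rng=rng)
```

A walk started in a tract whose local mix already matches the city-wide one stops immediately, while a walk started in a homogeneous tract has to visit other tracts first. The normalised quantity of Eq. (7) would divide the average of such times by the same average computed after reshuffling the rows of `M` across nodes.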
In the case of ethnicity distributions in geographical networks, each node is not uniquely associated with a colour, but has instead a local distribution of ethnicities. Nevertheless, the formalism for the computation of Class Mean First Passage Times and Class Coverage Time described above can still be used in this case. We consider a stochastic colouring function $\widetilde{f}: V \to C$ that associates to each node $i$ of the adjacency graph one of the $\Gamma = 7$ ethnicities $\alpha$ with probability $\frac{m_{i,\alpha}}{\sum_{\beta} m_{i,\beta}}$ (respectively, with probability $\frac{\widetilde{m}_{i,\alpha}}{\sum_{\beta} \widetilde{m}_{i,\beta}}$ in the commuting graph), i.e., proportionally to the abundance of ethnicity $\alpha$ in node $i$.
To compute the CMFPT we consider $S$ independent realisations of the stochastic colouring process for each network. On each realisation $\ell$, we estimate the MFPT among all classes as in Eq. (3), and the corresponding null-model MFPT. Then, we compute the average Class Mean First Passage Time from class $\alpha$ to class $\beta$ as:

$$\langle \widetilde{\tau}_{\alpha\beta} \rangle = \frac{\sum_{\ell=1}^{S} \tau^{(\ell)}_{\alpha\beta}}{\sum_{\ell=1}^{S} \left(\tau^{\mathrm{null}}_{\alpha\beta}\right)^{(\ell)}} \tag{8}$$

where $\tau^{(\ell)}_{\alpha\beta}$ is the CMFPT computed on the $\ell$-th realisation and $\left(\tau^{\mathrm{null}}_{\alpha\beta}\right)^{(\ell)}$ is the corresponding value in the null-model. For each system we computed $\tau^{\mathrm{null}}_{\alpha\beta}$ on 500 realisations of the null model, with 500 independent colour assignments per realisation, and 2000 walks per node.
The computation of CCT works in a similar way. In order to take into account the heterogeneous distribution of ethnicities across nodes, before a walker starts from node $i$ we sample one of the ethnicities present on $i$, according to their local abundance at $i$, $\{m_{i,\beta}\}$, and we attribute node $i$ to it. Then, we compute the CCT from node $i$ of class $\alpha$ as the average CCT from node $i$ across all the walks starting from $i$ where node $i$ was actually assigned to class $\alpha$, and we call this quantity $\gamma_{i\alpha}$.
Notice that in this case we consider the trajectories $W^{\alpha}_i = \{M^{\alpha}_{i_0}, M^{\alpha}_{i_1}, M^{\alpha}_{i_2}, \ldots\}$ where $M^{\alpha}_{i_\ell}$ is the distribution of ethnicities at the $\ell$-th node visited by the walker, which does not include class $\alpha$. The normalised Class Coverage Time from class $\alpha$ when starting from node $i$ is defined as:

$$\widetilde{\gamma}_{i\alpha} = \frac{\gamma_{i\alpha}}{\gamma^{\mathrm{null}}_{i\alpha}} \tag{9}$$

where $\gamma^{\mathrm{null}}_{i\alpha}$ is the CCT from node $i$ of class $\alpha$ in the null-model. Finally, the average CCT from class $\alpha$ is simply obtained as:

$$\widetilde{\gamma}_{\alpha} = \frac{1}{N} \sum_{i=1}^{N} \widetilde{\gamma}_{i\alpha} \tag{10}$$

For all the computations of CCT shown in the paper we considered averages over 5000 walks per node and we set ε = 0 .
Global indices of dynamic ethnic segregation
We constructed three global indices of dynamic segregation based on the values of CMFPT and CCT. In particular, we focused on the observed discrepancies of CMFPT and CCT between African Americans and other ethnicities. In the following the index $A$ will always indicate African Americans, while the index $O$ will indicate all the other ethnicities. We start by defining the following quantities:

$$\tau_{AA} = \langle \widetilde{\tau}_{AA} \rangle, \qquad \tau_{AO} = \frac{\sum_{\alpha \neq A} M_{\alpha} \langle \widetilde{\tau}_{A\alpha} \rangle}{\sum_{\alpha \neq A} M_{\alpha}}, \qquad \tau_{OA} = \frac{\sum_{\alpha \neq A} M_{\alpha} \langle \widetilde{\tau}_{\alpha A} \rangle}{\sum_{\alpha \neq A} M_{\alpha}}, \qquad \tau_{OO} = \frac{\sum_{\alpha,\beta \neq A} \langle \widetilde{\tau}_{\alpha\beta} \rangle M_{\alpha} M_{\beta}}{\sum_{\alpha,\beta \neq A} M_{\alpha} M_{\beta}}$$

In practice: $\tau_{AA}$ is the expected CMFPT from African Americans to African Americans; $\tau_{AO}$ is the expected CMFPT from African Americans to all the other classes (weighted by ethnicity distribution); $\tau_{OA}$ is the expected CMFPT from all the other classes to African Americans (again, weighted by ethnicity distribution); and $\tau_{OO}$ is the expected CMFPT among all the other ethnicities. Notice that all these quantities are pure numbers, since they are based on the corresponding quantities defined in Eq. (8), which are correctly normalised with respect to the null-model.
The clustering of African Americans is quantified as:

$$C = \frac{\tau_{AO}}{\tau_{OO}} \tag{11}$$

so that values of $C$ larger than 1 indicate that for an African American finding any other ethnicity is harder (i.e., requires more time) than for all other ethnicities. Similarly, we define the exposure of African Americans to other ethnicities as:

$$E = \frac{\tau_{OA}}{\tau_{AO}} \tag{12}$$

where values of $E$ larger than 1 indicate that it is easier for African Americans to be found in touch with any other ethnicity than for people from all the other ethnicities to be found in touch with African Americans.
We define similar quantities for the CCT of African Americans and other ethnicities, namely $\widetilde{\gamma}_{iA}$ as in Eq. (9), and:

$$\widetilde{\gamma}_{iO} = \frac{1}{\Gamma - 1} \sum_{\alpha \neq A} \widetilde{\gamma}_{i\alpha} \tag{13}$$

Finally, we define the isolation of African Americans for the whole system by the average ratio:

$$I = \frac{1}{N} \sum_{i=1}^{N} \frac{\widetilde{\gamma}_{iA}}{\widetilde{\gamma}_{iO}} \tag{14}$$

over all the nodes. Notice that values of $I$ larger than 1 indicate that the normalised CCT from nodes of class $A$ (African American) is higher than the CCT from nodes of all the other classes. The State-level value of each index is obtained as an average of the corresponding index on the cities of the state, weighted by the population of each city.
Local dynamic ethnic segregation
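To make the global definitions of Eqs. (11)-(14) concrete before the local indices are introduced, the sketch below computes $C$, $E$ and $I$ from small made-up tables. The class labels, the dictionaries `tau`, `pop` and `gamma`, and the function names are hypothetical; the weights `pop` stand in for the $M_{\alpha}$ of the weighted averages above. The per-node ratios returned by `isolation` are the quantities averaged in Eq. (14).

```python
def weighted_mean(values, weights):
    """Weighted average of `values` with non-negative `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def clustering_and_exposure(tau, pop, a="A"):
    """Return (C, E) of Eqs. (11)-(12).

    tau[x][y]: normalised CMFPT from class x to class y (as in Eq. 8)
    pop[x]:    weight M_x of class x
    """
    others = [c for c in pop if c != a]
    w = [pop[c] for c in others]
    tau_ao = weighted_mean([tau[a][c] for c in others], w)
    tau_oa = weighted_mean([tau[c][a] for c in others], w)
    pairs = [(x, y) for x in others for y in others]
    tau_oo = weighted_mean([tau[x][y] for x, y in pairs],
                           [pop[x] * pop[y] for x, y in pairs])
    return tau_ao / tau_oo, tau_oa / tau_ao

def isolation(gamma, a="A"):
    """Return (I, per-node ratios) following Eqs. (13)-(14).

    gamma: list over nodes of dicts class -> normalised CCT from that node
    """
    ratios = []
    for g in gamma:
        others = [v for c, v in g.items() if c != a]
        gamma_o = sum(others) / len(others)  # mean over the Gamma - 1 other classes
        ratios.append(g[a] / gamma_o)
    return sum(ratios) / len(ratios), ratios

# Made-up numbers for three classes: A (African American), W and H.
tau = {"A": {"A": 0.5, "W": 2.0, "H": 2.0},
       "W": {"A": 3.0, "W": 1.0, "H": 1.0},
       "H": {"A": 3.0, "W": 1.0, "H": 1.0}}
pop = {"A": 20, "W": 60, "H": 20}
gamma = [{"A": 2.0, "W": 1.0, "H": 1.0},
         {"A": 1.0, "W": 1.0, "H": 1.0}]

C, E = clustering_and_exposure(tau, pop)
I_value, node_ratios = isolation(gamma)
print(C, E, I_value, node_ratios)
```

In this toy configuration all three indices exceed 1, which under the definitions above corresponds to African Americans being clustered, exposed and isolated at the same time.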
We define two local segregation indices for African Americans in a census tract $i$. The first index is based on CMFPT:

$$\widetilde{\xi}_i = \frac{\sum_{\alpha \neq A} \widetilde{T}_{i,\alpha}}{(\Gamma - 1)\, \widetilde{T}_{i,A}} \tag{15}$$

where $\widetilde{T}_{i,\alpha}$ corresponds to the normalised Mean First Passage Time to a generic class $\alpha$ when a random walker starts from node $i$, while $\widetilde{T}_{i,A}$ is the CMFPT to African American tracts. Values of $\widetilde{\xi}_i$ larger than 1 indicate that the time to reach any other ethnicity is higher than the time needed to reach African Americans, hence indicating a local clustering of African Americans around node $i$. The local index of isolation is derived from CCT:

$$\widetilde{\psi}_i = \frac{\widetilde{\gamma}_{iA}}{\widetilde{\gamma}_{iO}} \tag{16}$$

where $\widetilde{\gamma}_{iA}$ is the CCT from node $i$ for African Americans and $\widetilde{\gamma}_{iO}$ is the average CCT from node $i$ for all the other ethnicities, as defined in Eq. (13). In general, if $\widetilde{\psi}_i$ is larger than 1 then African Americans living at node $i$ are isolated, since they will require more time to visit all the other classes than required by individuals from other ethnicities.
Data on COVID-19 incidence among the African American Population
The data on the percentage of infected African Americans were obtained from two different sources, [32] and [33]. The first data set reports the number of infected and deceased individuals of each ethnicity, along with those of unknown ethnicity. To calculate the percentage of African Americans we first removed the unknown cases from the total; otherwise our analysis would also capture the fraction of unknown cases. For the other data set we simply extracted the data provided in its tables.
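The effect of removing the unknown cases can be seen on a small made-up example. The counts, the category labels and the function name below are illustrative assumptions and do not come from the data sets in [32] or [33].

```python
def percent_excluding_unknown(counts, group, unknown_key="Unknown"):
    """Percentage of `group` among cases whose ethnicity is known."""
    known_total = sum(v for k, v in counts.items() if k != unknown_key)
    return 100.0 * counts[group] / known_total

# Hypothetical case counts for one state.
cases = {"African American": 300, "White": 500, "Other": 200, "Unknown": 250}

share_known = percent_excluding_unknown(cases, "African American")
# Keeping the unknown cases in the denominator dilutes every group's share.
share_naive = 100.0 * cases["African American"] / sum(cases.values())

print(share_known)  # 30.0
print(share_naive)  # 24.0
```

With the unknown cases left in the total, the share of every known group is underestimated by the same factor, which is the bias the removal step avoids.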
Public Transportation data set
The public transportation data set was obtained from the 2018 American Community Survey from the U.S. Census Bureau [34]. It includes information about the percentage of public transportation usage per ethnicity and State.
Multivariate Analysis
The multivariate analysis was performed using R and the ANOVA model in the car package.
ACKNOWLEDGEMENTS
A.B. and V.N. acknowledge support from the EPSRC New Investigator Award Grant No. EP/S027920/1. This work made use of the MidPLUS cluster, EPSRC Grant No. EP/K000128/1. This research utilised Queen Mary’s Apocrita HPC facility, supported by QMUL Research-IT. doi.org/10.5281/zenodo.438045.
AUTHOR CONTRIBUTIONS
All the authors devised the study. A.B. and S.S. performed the simulations and computations. All the authors provided methods and analysed the results. A.B. and S.S. prepared the figures and all the visual material. All the authors wrote the paper and approved the final submitted version.
Appendix A: Quantifying ethnic segregation through CMFPT and CCT
Ethnic segregation and COVID-19 incidence through CMFPT
Ethnic segregation is quantified here through random walks and, more precisely, class mean first passage times (CMFPT). Given a set of classes present in a city (ethnicities in this case), we are interested in the number of steps needed to reach each class as a function of the ethnicity at the origin. Random walks start from each of the city cells, or tracts, and move until they have visited each of the distinct classes or ethnicities present in the city. If we average the passage times across all the city cells and then divide by the same quantity from the null model, we obtain the normalised CMFPT between class $\alpha$ and $\beta$, $\widetilde{\tau}_{\alpha\beta}$. It is important to note that this matrix is not necessarily symmetric and depends on the spatial distribution of classes. We show in Supplementary Figure S-1 the CMFPT on the adjacency graph for four cities: Detroit, Chicago, Houston and Los Angeles. At first glance, strong differences can be detected between the cities on the left and those on the right. The normalised CMFPT are substantially higher in Detroit and Chicago when compared to Houston and Los Angeles. Despite the difference in the maximum values, the shape of the matrix and curves is not so different across cities, with African Americans much more isolated than the rest of the ethnicities. Reaching African Americans is much harder for any other class, while it is considerably easier for other African Americans. As can be seen, there are strong differences not only between cities but also between the types of network used (see Supplementary Figure S-2). One significant change that appears in some cities when $\widetilde{\tau}_{\alpha\beta}$ is computed over the commuting graph is that, for African Americans, Whites are easier to reach than other African Americans, which means that mobility plays a crucial role in bringing African Americans closer to the rest of the population and exposing them. Additionally, the normalised CMFPT seems to be less dependent on the ethnicity of the origin and more on the ethnicity of the destination, likely as a consequence of the higher mixing produced by the long-range links present in the mobility network. Not only that, but the differences between ethnicities are also reduced. Overall, to properly quantify segregation we need to take into account not only the residences but also how the ethnicities move in cities.
Ethnic segregation and COVID-19 incidence through CCT
We consider a number of Combined Statistical Areas (CSA) in the US: 128 networks were constructed based on adjacency, while 171 networks were constructed for commuting. These systems are represented as a spatial graph $G$ and we look at the statistical properties of the trajectories of a random walk on $G$. Each walk starting at node $i$ is associated with an ethnicity sampled from $Q_{i_t}$ and it stops at time $t$ when $J(P \,\|\, Q_{i_t}) \le \varepsilon$. When the ethnicity $\alpha$ is sampled, the corresponding bin is removed from the computation of $J(P \,\|\, Q_{i_t})$ so that the effect that $\alpha$ has on the coverage time at threshold $\varepsilon$ can be quantified.
Trajectories from each node are averaged over 5000 repetitions for the adjacency network and 2000 for commuting. The null model for CCT is obtained by randomly reassigning the vector $M_i$ to a new node in $G$, so that the concentration of an ethnicity in a region, if such a spatial pattern is present, is dissolved across $G$. Here we consider 20 independent repetitions of the null model for each CSA on both adjacency and commuting networks; the Class Coverage Time (CCT) for the null model of a city is then the average behaviour over these 20 independent realisations.
FIG. S-1. Normalised inter-class mean first passage times among the different ethnicities contained in our data set when walkers move on the adjacency network, in two different visualisation styles, for the following cities: Chicago, Detroit, Houston and Los Angeles. [Panels: Detroit, Houston, Chicago, Los Angeles. Matrix axes: From / To ethnicity; curve axes: destination class (ethnicity) vs. $\widetilde{\tau}_{\alpha\beta}$. Classes: White, African American, American Indian, Asian, Native Hawaiian, Other, Two or more.]
FIG. S-2. Normalised inter-class mean first passage times among the different ethnicities contained in our data set when walkers move on the commuting network, in two different visualisation styles, for the following cities: Chicago, Detroit, Houston and Los Angeles. [Panels: Detroit, Houston, Chicago, Los Angeles. Axes: destination class (ethnicity) vs. $\widetilde{\tau}_{\alpha\beta}$.]
The normalised CCT is reported for the two spatial configurations in Fig. S-3 a-b. The coverage times obtained for the adjacency networks (a) are considerably larger than those for commuting (b); in the latter, the majority of values span a small range between 0 and 4. The distributions of $\widetilde{\gamma}_{\alpha}$ for each ethnicity have a comparable shape within each network type, where most values are contained in a common interval, yet they are distinguishable and differences between the ethnicities can be observed. The corresponding non-normalised quantities can be read on panels c-d, where the CCT of the real system and of the equivalent null model are reported.
FIG. S-3. Class coverage times and the corresponding normalised values on the adjacency (a) and commute (b) networks for all CSA. Panels c-d report the coverage times (non-normalised) for adjacency and commute networks, where the black crosses correspond to the values for the equivalent null model. Each data point contained in an ethnicity column is equivalent to $\widetilde{\gamma}_{\alpha}$ at a CSA. [x-axis classes: White, Afr. Am., Am. Ind., Asian, Nat. Haw., Other, Two/More.]
It is important to note that the difference between the CCT of the real system and that of the null model is significantly large on the adjacency networks, with the former having cases where coverage times are two orders of magnitude larger (see Fig. S-3 c). Although the null model corresponds to the non-segregated counterpart of the city and coverage times are expected to be smaller, these large differences suggest caution and open an interesting question for further investigation: in particular, which factors drive the large differences, for instance whether they are mainly due to the population distribution, the threshold $\varepsilon$, the network topology, or a combination of two or more factors.
In addition to the cities discussed in the main manuscript, we report two other systems in Fig. S-4, where individual values of $\widetilde{\gamma}_{\alpha}$ for each ethnicity can be observed on the adjacency and commute networks. African Americans are substantially less isolated in Houston compared to the other ethnicities in both adjacency and commute networks. In Detroit, the adjacency information gives the opposite picture for African Americans, while Whites are the most isolated on the commute network.
FIG. S-4. Normalised coverage time on the adjacency and commute networks for Detroit (a-b) and Houston (c-d). Values are larger on the adjacency network compared to commute for both cities, while African Americans are significantly less isolated in Houston for both networks.
City name | City name | City name | City name
Albany-Schenectady | Albuquerque-Santa Fe-Las Vegas | Appleton-Oshkosh-Neenah | Asheville-Brevard
Atlanta–Athens-Clarke County–Sandy Springs | Bend-Redmond-Prineville | Birmingham-Hoover-Talladega | Bloomington-Bedford
Bloomington-Pontiac | Bloomsburg-Berwick-Sunbury | Boston-Worcester-Providence | Bowling Green-Glasgow
Brownsville-Harlingen-Raymondville | Buffalo-Cheektowaga | Cape Coral-Fort Myers-Naples | Cape Girardeau-Sikeston
Charleston-Huntington-Ashland | Charlotte-Concord | Chattanooga-Cleveland-Dalton | Chicago-Naperville
Cincinnati-Wilmington-Maysville | Cleveland-Akron-Canton | Clovis-Portales | Columbia-Moberly-Mexico
Columbia-Orangeburg-Newberry | Columbus-Auburn-Opelika | Columbus-Marion-Zanesville | Columbus-West Point
Corpus Christi-Kingsville-Alice | Dallas-Fort Worth | Davenport-Moline | Dayton-Springfield-Sidney
Denver-Aurora | DeRidder-Fort Polk South | Des Moines-Ames-West Des Moines | Detroit-Warren-Ann Arbor
Dixon-Sterling | Dothan-Enterprise-Ozark | Eau Claire-Menomonie | Edwards-Glenwood Springs
Elmira-Corning | El Paso-Las Cruces | Erie-Meadville | Fargo-Wahpeton
Fayetteville-Lumberton-Laurinburg | Findlay-Tiffin | Fort Wayne-Huntington-Auburn | Fresno-Madera
Gainesville-Lake City | Grand Rapids-Wyoming-Muskegon | Green Bay-Shawano | Greensboro–Winston-Salem–High Point
Greenville-Spartanburg-Anderson | Greenville-Washington | Harrisburg-York-Lebanon | Harrisonburg-Staunton-Waynesboro
Hartford-West Hartford | Hickory-Lenoir | Hot Springs-Malvern | Houston-The Woodlands
Huntsville-Decatur-Albertville | Idaho Falls-Rexburg-Blackfoot | Indianapolis-Carmel-Muncie | Ithaca-Cortland
Jackson-Brownsville | Jackson-Vicksburg-Brookhaven | Jacksonville-St. Marys-Palatka | Johnson City-Kingsport-Bristol
Johnstown-Somerset | Jonesboro-Paragould | Joplin-Miami | Kalamazoo-Battle Creek-Portage
Kansas City-Overland Park-Kansas City | Knoxville-Morristown-Sevierville | Kokomo-Peru | Lafayette-Opelousas-Morgan City
Lafayette-West Lafayette-Frankfort | Lake Charles-Jennings | Lansing-East Lansing-Owosso | Las Vegas-Henderson
Lexington-Fayette–Richmond–Frankfort | Lima-Van Wert-Celina | Lincoln-Beatrice | Little Rock-North Little Rock
Longview-Marshall | Los Angeles-Long Beach | Louisville/Jefferson County–Elizabethtown–Madison | Lubbock-Levelland
Macon-Bibb County–Warner Robins | Madison-Janesville-Beloit | Manhattan-Junction City | Mankato-New Ulm-North Mankato
Mansfield-Ashland-Bucyrus | Martin-Union City | McAllen-Edinburg | Medford-Grants Pass
Memphis-Forrest City | Miami-Fort Lauderdale-Port St. Lucie | Midland-Odessa | Milwaukee-Racine-Waukesha
Minneapolis-St. Paul | Mobile-Daphne-Fairhope | Modesto-Merced | Monroe-Ruston-Bastrop
Morgantown-Fairmont | Moses Lake-Othello | Mount Pleasant-Alma | Myrtle Beach-Conway
Nashville-Davidson–Murfreesboro | New Bern-Morehead City | New Orleans-Metairie-Hammond | New York-Newark
North Port-Sarasota | Oklahoma City-Shawnee | Omaha-Council Bluffs-Fremont | Orlando-Deltona-Daytona Beach
Oskaloosa-Pella | Paducah-Mayfield | Parkersburg-Marietta-Vienna | Pensacola-Ferry Pass
Peoria-Canton | Philadelphia-Reading-Camden | Pittsburgh-New Castle-Weirton | Portland-Lewiston-South Portland
Portland-Vancouver-Salem | Pueblo-Canyon City | Pullman-Moscow | Quincy-Hannibal
Raleigh-Durham-Chapel Hill | Rapid City-Spearfish | Redding-Red Bluff | Reno-Carson City-Fernley
Richmond-Connersville | Rochester-Austin | Rochester-Batavia-Seneca Falls | Rockford-Freeport-Rochelle
Rocky Mount-Wilson-Roanoke Rapids | Rome-Summerville | Sacramento-Roseville | Saginaw-Midland-Bay City
St. Louis-St. Charles-Farmington | Salt Lake City-Provo-Orem | San Jose-San Francisco-Oakland | Savannah-Hinesville-Statesboro
Seattle-Tacoma | Sioux City-Vermillion | South Bend-Elkhart-Mishawaka | Spokane-Spokane Valley-Coeur d’Alene
Springfield-Branson | Springfield-Greenfield Town | Springfield-Jacksonville-Lincoln | State College-DuBois
Steamboat Springs-Craig | Syracuse-Auburn | Tallahassee-Bainbridge | Toledo-Port Clinton
Tucson-Nogales | Tulsa-Muskogee-Bartlesville | Tyler-Jacksonville | Victoria-Port Lavaca
Virginia Beach-Norfolk | Visalia-Porterville-Hanford | Washington-Baltimore-Arlington | Wausau-Stevens Point-Wisconsin Rapids
Wichita-Arkansas City-Winfield | Williamsport-Lock Haven | Youngstown-Warren
TABLE S-I. Table of cities studied
Appendix B: Correlations between the incidence of COVID-19 cases among the African American population and ethnic segregation
Rankings and comparisons
We detail the cities studied in Supplementary Table S-I; it is important to note that these are all the cities studied, regardless of whether the corresponding states provide ethnic information on the impact of COVID-19.
Supplementary Figure S-5 displays the ranking of values for each of the four metrics studied in the main manuscript computed over the adjacency or commuting graphs. As can be seen, strong similarities between rankings appear. To evaluate how similar those rankings are, we have calculated the Kendall $\tau_k$ between each pair of rankings. Supplementary Figure S-6 displays the values of $\tau_k$ between each pair of the four metrics studied in the main manuscript computed in the adjacency and commuting graphs. For instance, there is a high correlation between the index $C$ computed in the adjacency and the commuting graphs, while for the exposure index $E$ there is almost no correlation between both, likely pointing out that exposure can only be effectively measured by including the commuting network. Another observation is the connection between $C$ and $E$ measured on the commuting graph, which seems to point out that in those states where African Americans are more segregated they are also more exposed.
In Supplementary Figure S-7, we compare the indices studied in this work, showing that most of them are related to each other, yet not necessarily linearly. We find that the indices capturing the clustering of African Americans are also related to those capturing mixing and exposure: the more clustered together, the more sensitive and exposed to the rest of the population. When comparing the same indices in the commuting and adjacency graphs, we observe that they have a non-linear relation, highlighting again the importance of considering both of them.
FIG. S-5. Ranking for the four indices studied in the main manuscript: $C$, $E$, $I$ and $G$ computed in both the adjacency and commuting graph. a $C$ (Adjacency), b $C$ (Commuting), c $E$ (Adjacency), d $E$ (Commuting), e $I$ (Adjacency), f $I$ (Commuting), g $G$ (Adjacency), h $G$ (Commuting). [x-axis: states in ranked order; y-axis: index value.]
FIG. S-6. Kendall $\tau_k$ correlation between each of the four indices studied in the main manuscript computed either over the adjacency or the commuting graphs. [Matrix rows and columns: $C$, $E$, $I$, $G$ on the adjacency and commuting graphs; colour scale: $\tau_k$.]
Correlations between the segregation indices and other measures of COVID-19 incidence
The COVID-19 data used were obtained from [32] and include several temporal snapshots until mid-May. The main variables we used are (i) the difference in infected/deceased African Americans, where 0 would mean that the percentage of African Americans in the population of a state is the same as their percentage among the infected, and (ii) the ratio, calculated as the percentage of infected African Americans divided by their percentage among the overall population, which is one if the two are equal and higher than one if African Americans are over-represented among the infected/deceased. Supplementary Table S-II summarises the results obtained for the linear fits in Figure 2 of the main manuscript.
Supplementary Figure S-8 summarises the results obtained in the case of the ratio of infected African Americans. As detailed in the main manuscript, there are two versions of each index depending on whether the walkers move upon the adjacency or the commuting network. Compared to the difference in percentage, correlations are much lower for the ratio, likely as a consequence of the several outliers. While states with a low percentage of African Americans among the overall population might easily suffer a huge increase in the ratio, those with a higher percentage of African Americans among the population might have a lower increase.
We have also evaluated how our metrics relate to the ratio and difference among the deceased African Americans (see Supplementary Figures S-9 and S-10). Although many more factors, such as age or underlying health conditions, might influence mortality, most of the correlations remain significant to some extent, especially those related to exposure. Moreover, the indices computed on the commuting network seem to be more informative than those based on adjacency, which seems to point out that residential segregation provides only a partial picture of ethnic inequality. Mobility is also crucial to understand the mixing between different ethnicities: it is not only relevant where certain ethnicities live but also where they work and with whom they interact when they do so.
Overall, although our metrics are informative in both cases, they seem to be more related to the difference in percentage than to the ratio. There are states in which the percentage of African Americans among the population is low and, therefore, the ratio can increase drastically.
Ideally, each ethnicity $\alpha$ should be compared with the corresponding ratio to the overall population and the incidence of COVID-19 cases. As these data were not available during the preparation of this work, we look at the gap $\Delta A_{\mathrm{inf}}$ of African Americans and its relation with the Isolation level of all other ethnicities in this study. Considering the quantity for all other ethnicities defined as:

$$\widetilde{\gamma}_{iO} = \frac{1}{\Gamma - 1} \sum_{\beta \neq \alpha} \widetilde{\gamma}_{i\beta} \tag{S-1}$$

the Isolation index for an ethnicity $\alpha$ is given by:

$$I_{\alpha} = \frac{1}{N} \sum_{i=1}^{N} \frac{\widetilde{\gamma}_{i\alpha}}{\widetilde{\gamma}_{iO}} \tag{S-2}$$

Correlations with the infection rate gap $\Delta A_{\mathrm{inf}}$ on the adjacency and commute network are reported in Figs. S-11 and S-12 for all ethnicities. Quantities are obtained from the COVID-19 data set at two different periods, 12-04-2020 and 19-04-2020 respectively. The corresponding $R^2$ of the Pearson correlation is reported in the inset of each panel. There is a negative correlation in b, which indicates that less isolated African Americans have a higher incidence of infection cases. Whites (a) and Native Hawaiians (e) exhibit no correlation, while the remaining ethnicities (c-d and f-g) have a positive R which decreases over time. We found no correlation for any ethnicity on the commuting network.
FIG. S-7. Comparison between the indices studied in the main manuscript. a $C$ (Adjacency) and $E$ (Commuting), b $C$ (Commuting) and $C$ (Adjacency), c $C$ (Commuting) and $E$ (Commuting), d $C$ (Commuting) and $I$ (Adjacency), e $C$ (Commuting) and $G$ (Commuting), f $E$ (Commuting) and $G$ (Commuting), g $I$ (Adjacency) and $I$ (Commuting), h $G$ (Adjacency) and $G$ (Commuting).
FIG. S-8. Relation between the ratio of infected African Americans and the four indices considered. a-d Indices computed over the adjacency network: a $C$ (clustering), b $E$ (exposure), c $I$ (isolation), d $G$ (spatial Gini). e-h Indices computed over the commuting network: e $C$ (clustering), f $E$ (exposure), g $I$ (isolation), h $G$ (spatial Gini). Each of the colours corresponds to a temporal snapshot of the data set, red for 12 / / / /. $R^2$ is computed as the square of the linear correlation coefficient. [y-axis: ratio of infected.]
FIG. S-9. Relation between the difference in the percentage of deceased African Americans and the four indices considered. a-d Indices computed over the adjacency network: a $C$ (clustering), b $E$ (exposure), c $I$ (isolation), d $G$ (spatial Gini). e-h Indices computed over the commuting network: e $C$ (clustering), f $E$ (exposure), g $I$ (isolation), h $G$ (spatial Gini). Each of the colours corresponds to a temporal snapshot of the data set, red for 12 / / / /. $R^2$ is computed as the square of the linear correlation coefficient. [y-axis: $\Delta A_{\mathrm{dec}}$.]
Date | Index | Network type | slope | intercept
12/04/2020 | C | Adjacency | 8.26 | -3.47
12/04/2020 | E | Adjacency | -12.35 | 27.82
12/04/2020 | I | Adjacency | -19.66 | 30.01
12/04/2020 | G | Adjacency | 31.86 | -3.28
12/04/2020 | C | Commuting | 139.14 | -135.94
12/04/2020 | E | Commuting | 38.83 | -29.92
12/04/2020 | I | Commuting | -7.70 | 20.99
12/04/2020 | G | Commuting | 33.65 | -9.10
19/04/2020 | C | Adjacency | 6.95 | -1.87
19/04/2020 | E | Adjacency | -6.45 | 19.04
19/04/2020 | I | Adjacency | -7.633 | 17.77
19/04/2020 | G | Adjacency | 20.20 | 1.65
19/04/2020 | C | Commuting | 117.98 | -114.11
19/04/2020 | E | Commuting | 25.93 | -16.45
19/04/2020 | I | Commuting | -4.79 | 16.15
19/04/2020 | G | Commuting | 24.88 | -5.02
TABLE S-II. Coefficients obtained from the linear fits in Figure 2 of the main manuscript
FIG. S-10. Relation between the ratio of deceased African Americans and the four indices considered. a-d Indices computed over the adjacency network: a $C$ (clustering), b $E$ (exposure), c $I$ (isolation), d $G$ (spatial Gini). e-h Indices computed over the commuting network: e $C$ (clustering), f $E$ (exposure), g $I$ (isolation), h $G$ (spatial Gini). Each of the colours corresponds to a temporal snapshot of the data set, red for 12 / / / /. $R^2$ is computed as the square of the linear correlation coefficient. [y-axis: ratio of deceased.]
FIG. S-11. Isolation index of all ethnicities as a function of the infected rate gap of COVID-19 cases in the African American population on the adjacency network. [Panel insets, $R^2$ for the two snapshots: a White 0.00, 0.03; b Afr. Am. 0.31, 0.05; c Am. Ind. 0.21, 0.30; d Asian 0.24, 0.31; e Nat. Haw. 0.02, 0.08; f Other 0.16, 0.18; g Two/More 0.23, 0.33.]
African American is the only ethnicity to exhibit a negative correlation of isolation and∆ A inf , suggesting that a higher infection rate can be related to lower isolation. I White A i n f aa R :0.00 R :0.00 I Afr. Am. bb R :0.03 R :0.01 I Am. Ind. cc R :0.04 R :0.11 I Asian dd R :0.03 R :0.05 I Nat. Haw. A i n f ee R :0.02 R :0.13 I Other ff R :0.06 R :0.11 I Two/More gg Commuting R :0.04 R :0.10 FIG. S-12. Isolation index of all ethnicities as a function of the infected rate gap of COVID-19 cases in the African Americanpopulation considering the commute network. There is no significant correlation for any of the ethnicities.
Similarly, correlations with $\Delta A_{deceased}$ are computed for the deceased data on the adjacency and commuting networks (see Figs. S-13 and S-14). We observe a pattern similar to the one obtained from the infection rates: there are correlations for the same group of ethnicities on the adjacency network and no significant relationship on the commuting network.

[Figure S-13: panels a-g, adjacency network. y-axis: $\Delta A_{deceased}$; x-axis: isolation index I of each ethnicity. Inset $R^2$ values (two snapshots): a White 0.05, 0.04; b Afr. Am. 0.20, 0.27; c Am. Ind. 0.28, 0.31; d Asian 0.28, 0.25; e Nat. Haw. 0.08, 0.09; f Other 0.18, 0.28; g Two/More 0.32, 0.35.]
FIG. S-13. Isolation index of all ethnicities as a function of the deceased rate gap of COVID-19 cases in the African American population considering the adjacency network.

[Figure S-14: panels a-g, commuting network. y-axis: $\Delta A_{deceased}$; x-axis: isolation index I of each ethnicity. Inset $R^2$ values (two snapshots): a White 0.05, 0.12; b Afr. Am. 0.00, 0.01; c Am. Ind. 0.09, 0.13; d Asian 0.06, 0.15; e Nat. Haw. 0.05, 0.12; f Other 0.07, 0.10; g Two/More 0.09, 0.11.]
FIG. S-14. Isolation index of all ethnicities as a function of the deceased rate gap of COVID-19 cases in the African American population considering the commuting network. There is no significant correlation for any of the ethnicities.
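As an illustration of Eqs. (S-1) and (S-2), the following minimal Python sketch computes the isolation index from a small table of per-location values. It assumes that Eq. (S-1) averages over the $\Gamma - 1$ other ethnicities and that the $\tilde{\gamma}_{i\alpha}$ values are supplied as a list of rows, one per location. The function name `isolation_index` and the toy numbers are hypothetical and are not taken from the data set.

```python
def isolation_index(gamma, alpha):
    """Isolation index I_alpha for the ethnicity in column `alpha`.

    gamma: list of rows, one per location i; gamma[i][b] plays the role of
           the normalised quantity for ethnicity b at location i (assumed > 0).
    """
    n_groups = len(gamma[0])
    total = 0.0
    for row in gamma:
        # Eq. (S-1): average over all ethnicities other than alpha
        # (assumed normalisation 1 / (Gamma - 1)).
        others = [row[b] for b in range(n_groups) if b != alpha]
        gamma_io = sum(others) / (n_groups - 1)
        total += row[alpha] / gamma_io
    # Eq. (S-2): average of the per-location ratios over the N locations.
    return total / len(gamma)


# Toy example: two locations, three ethnicities (illustrative values only).
toy_gamma = [
    [0.2, 0.4, 0.4],
    [0.5, 0.25, 0.25],
]
toy_isolation = isolation_index(toy_gamma, alpha=0)
```

In this sketch a ratio above 1 at a location means that the chosen ethnicity has a larger value there than the average of the other ethnicities.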
Appendix C: Local segregation maps through CMFPT and CCT and spatial correlation
In the main manuscript, we show the values of the local segregation indices $\xi$ and $\psi$ for Chicago and Los Angeles, showing that there are significant differences in their spatial distribution as well as in their maximum values. Here we also provide results for Detroit and Houston, which again display significant differences. Detroit is the most populated city in Michigan, one of the states with the highest values in most of the indices considered, whereas Houston is the most populated city in Texas, a state with consistently low values in most segregation indices. Regarding the impact of COVID-19 among the African Americans of those states, in Michigan the gap is around 34% in early April and 24% in mid-May. In Texas, instead, the gap is around 2% at the beginning of April and 5% in mid-May.

In the main manuscript and Supplementary Figure S-15 we plot the local measures of segregation in each of the census tracts of Chicago, Los Angeles, Detroit and Houston. Those maps display certain common patterns that we quantify in Supplementary Figure S-16. Therein we have calculated the Kendall $\tau_k$ correlation coefficient by performing pairwise comparisons of the values for each tract unit. In addition to the segregation indices, we also compared the values of the ratio of African American population. It is relevant to note that while the value of $\tau_k$ for the ratio of African American population and $\tilde{\xi}$ computed in the adjacency graph is around 0.[…] […] $\tilde{\xi}$ is computed in the commuting graph (i.e., 0.81 in Detroit and 0.55 in Houston), meaning that the effect of commuting on the segregation of the African American population can display strong differences across cities and, therefore, mobility offers a different picture of urban segregation.

[Figure S-15: maps of Detroit (panels a-e) and Houston (panels f-j).]
FIG. S-15. Maps of local segregation in American cities. Ratio of African American population and local segregation indices computed with CMFPT and CCT in a-e Detroit and f-j Houston. For Detroit: a Ratio of African American population, b-c $\tilde{\xi}$ and $\tilde{\psi}$ computed over the adjacency graph and d-e $\tilde{\xi}$ and $\tilde{\psi}$ computed over the commuting graph. For Houston: f Ratio of African American population, g-h $\tilde{\xi}$ and $\tilde{\psi}$ computed over the adjacency graph and i-j $\tilde{\xi}$ and $\tilde{\psi}$ computed over the commuting graph.
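The pairwise comparison behind the Kendall coefficient can be sketched in a few lines of Python. The block below implements the simple tau-a variant, which ignores ties in the normalisation; the source does not state which variant or software was used, so this is an assumption. The function name `kendall_tau` and the tract values are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two equally long sequences of tract values.

    Every pair of tracts is compared once: the pair is concordant when both
    quantities order the two tracts the same way, discordant otherwise.
    Tied pairs count as neither (tau-a, an assumption of this sketch).
    """
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for four census tracts: share of African American
# residents and a local segregation index.
share = [0.10, 0.40, 0.35, 0.80]
local_index = [0.20, 0.50, 0.60, 0.90]
tau = kendall_tau(share, local_index)
```

A value close to 1 means that the two quantities rank the tracts almost identically, which is how the adjacency and commuting versions of a local index can be compared with the residential composition.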
Appendix D: Temporal analysis of correlations with segregation indices and other socioeconomic indicators
Statistical analysis of the COVID-19 incidence data
In this section, we provide the temporal evolution of correlations between the difference in the percentage of COVID-19 incidence among African Americans and other segregation indices. First of all, Supplementary Figure S-17 shows the number of states included in each of the temporal snapshots. As can be seen, it increases with time, yet already in the first temporal snapshots there are almost 20 of them. It is important to note that by mid-April the US had reached the first peak of the pandemic.

In addition to the number of states included in the analysis, we also observe significant changes over time in the values for the different states analysed. We provide in Supplementary Figures S-18 and S-19 the evolution of the difference in percentage of infected and deceased African Americans. States are split into quartiles of the distribution of the percentage of African Americans in the overall population. While the average seems almost stable in most of the quartiles, this is more a product of compensating changes than of stability in the values of any single state. For instance, in the first quartile there is a sharp increase in Minnesota compensated by a decrease in DC. In the third quartile, the sharp decrease in Illinois is compensated by the increase in Arkansas. It is also important to note that some states, for instance Minnesota, display strong discrepancies between the percentages of deceased and infected.
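To make the two quantities tracked in this appendix concrete, the sketch below computes the difference in percentage and the ratio for two made-up states. It assumes that the difference is the percentage of African Americans among the infected minus their percentage in the population, and that the ratio is the quotient of the same two numbers. The function names and all figures are hypothetical.

```python
def incidence_gap(pct_among_infected, pct_in_population):
    """Difference in percentage points (assumed definition of the gap)."""
    return pct_among_infected - pct_in_population


def incidence_ratio(pct_among_infected, pct_in_population):
    """Ratio of the two percentages (assumed definition of the ratio)."""
    return pct_among_infected / pct_in_population


# Hypothetical state with a small African American population share.
gap_small = incidence_gap(8.0, 2.0)
ratio_small = incidence_ratio(8.0, 2.0)

# Hypothetical state with a large African American population share.
gap_large = incidence_gap(45.0, 30.0)
ratio_large = incidence_ratio(45.0, 30.0)
```

With these made-up numbers the state with the smaller population share has the smaller gap but by far the larger ratio, which illustrates why the ratio is more prone to outliers when the population share is low.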
Correlations with the ratio of infected African Americans and with the difference in percentage and the ratio of deceased African Americans
In the main manuscript, the main variable analysed is the difference in the percentage of infected African Americans, since other factors might influence the number of deceased, and the ratio can lead to several outliers. In Supplementary Figures S-20, S-21 and S-22, we show respectively the correlation with the difference in percentage of deceased African Americans, the ratio of infected and the ratio of deceased. In the case of the difference in percentage, we can see that although correlations are lower, they are stable across time. It is important to note that in the case of deceased individuals other factors, like age or underlying health conditions, might play a significant role. In the case of both ratios, correlations are slightly higher in the first snapshots but suffer a steeper decrease. Again, C and E computed in the commuting graph seem to outperform the rest of the metrics.

[Figure S-16: four correlation matrices (Chicago, Los Angeles, Detroit, Houston) of Kendall $\tau_k$ between Afr. Amer. pop., $\tilde{\xi}$ (Adjacency), $\tilde{\xi}$ (Commuting), $\tilde{\psi}$ (Adjacency) and $\tilde{\psi}$ (Commuting).]
FIG. S-16. Correlations between each of the local metrics of segregation $\tilde{\xi}$ and $\tilde{\psi}$ and the local ratio of African American population. Correlations among the local indices of segregation, and between each of them and the ratio of African American population by census tract. Top row: Chicago and Los Angeles. Bottom row: Detroit and Houston.

Temporal evolution of correlations with another data set
We also had access to another project that aggregates data on the ethnicity of both infected and deceased African Americans by COVID-19 through three different temporal snapshots, 22/[…] […] the ratio of infected and deceased African Americans as well as the difference in the percentage of deceased are also compatible with those obtained with the previous data set (see Supplementary Figures S-26, S-24 and S-25).

[Figure S-17: y-axis: number of states; x-axis: date of the temporal snapshot.]
FIG. S-17. Number of states included in the analysis for each temporal snapshot.

[Figure S-18: four panels. y-axis: $\Delta A_{inf}$; x-axis: date.]
FIG. S-18. The temporal evolution of the difference in percentage of infected African Americans by state. Each plot represents a quartile of the distribution of the percentage of African American population.

[Figure S-19: panels a-d. y-axis: $\Delta A_{dec}$; x-axis: date.]
FIG. S-19. The temporal evolution of the difference in percentage of deceased African Americans by state. Each plot represents a quartile of the distribution of the percentage of African American population.

[Figure S-20: panels a-h. y-axis: $R^2$ with $\Delta A_{dec}$; x-axis: date; top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-20. Evolution of the Pearson and Spearman correlation ($R^2$) found between the difference in percentage of deceased African Americans and the four indices studied in the main manuscript. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini).

[Figure S-21: panels a-h. y-axis: $R^2$ with the ratio of infected; x-axis: date; top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-21. Evolution of the Pearson and Spearman correlation ($R^2$) found between the ratio of infected African Americans and each of the indices studied in this work. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini).

[Figure S-22: panels a-h. y-axis: $R^2$ with the ratio of deceased; x-axis: date; top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-22. Evolution of the Pearson and Spearman correlation ($R^2$) found between the ratio of deceased African Americans and the four indices studied in the main manuscript. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini).

[Figure S-23: panels a-h. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date (three snapshots); top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-23. Evolution of the Pearson and Spearman correlation ($R^2$) found between the difference on the deceased African Americans and the four indices studied in the main manuscript using another data source. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini). The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-24: panels a-h. y-axis: $R^2$ with the ratio of infected; x-axis: date (three snapshots); top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-24. Evolution of the Pearson and Spearman correlation ($R^2$) found between the difference on the deceased African Americans and the four indices studied in the main manuscript using another data source. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini). The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-25: panels a-h. y-axis: $R^2$ with $\Delta A_{dec}$; x-axis: date (three snapshots); top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-25. Evolution of the Pearson and Spearman correlation ($R^2$) found between the ratio of deceased African Americans and the four indices studied in the main manuscript using another data source. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini). The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-26: panels a-h. y-axis: $R^2$ with the ratio of deceased; x-axis: date (three snapshots); top row: Adjacency, bottom row: Commuting; legend: Pearson, Spearman.]
FIG. S-26. Evolution of the Pearson and Spearman correlation ($R^2$) found between the ratio of infected African Americans and the four indices studied in the main manuscript using another data source. a-d Indices computed over the adjacency network: a C (clustering), b E (exposure), c I (isolation), d G (spatial Gini). e-h Indices computed over the commuting network: e C (clustering), f E (exposure), g I (isolation), h G (spatial Gini). The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

Formulation of the alternative indices $C'$ and $E'$
In the main manuscript we have studied the metrics C and E, which are computed from the elements of the normalised CMFPT $\tilde{\tau}_{\alpha,\beta}$. However, there are other ways to capture the clustering and exposure of an ethnicity by combining the elements of that matrix differently. Here we propose two alternative formulations for those two metrics:
\[ C' = \frac{\tau_{OA}}{\tau_{OO}}, \qquad E' = \frac{\tau_{AA}}{\tau_{AO}} \]
The first quantity is the ratio between the time from other ethnicities to African Americans and the time from other ethnicities to others, where higher values correspond to African Americans being more isolated compared to other ethnicities. The second quantity is instead the ratio between the time separating African Americans from each other and the time between African Americans and any other ethnicity, where higher values correspond to African Americans being more exposed to others than to themselves.
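The two alternative ratios can be written as a short Python sketch. It assumes that the four aggregated passage times between African Americans (A) and all other ethnicities (O) are already available; how the entries of the class matrix are aggregated into the "O" block is not specified here and is left out. The function name `alternative_indices`, the dictionary keys and the numbers are hypothetical.

```python
def alternative_indices(tau):
    """Return (C', E') from aggregated passage times.

    tau: dict with keys "AA", "AO", "OA", "OO", where tau["XY"] is the
         (normalised) time from class X to class Y; values assumed > 0.
    """
    c_prime = tau["OA"] / tau["OO"]  # others -> A, relative to others -> others
    e_prime = tau["AA"] / tau["AO"]  # A -> A, relative to A -> others
    return c_prime, e_prime


# Illustrative values only, not taken from any city in the study.
toy_tau = {"AA": 0.8, "AO": 1.6, "OA": 1.5, "OO": 1.0}
c_prime, e_prime = alternative_indices(toy_tau)
```

Evaluating the same function on the matrices obtained from the adjacency graph and from the commuting network gives the two versions of each index that are compared in the text.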
The correlation between our alternative proposals and the difference in the percentage of infected African Americans is shown in Supplementary Figure S-27. While correlations are slightly lower, they are still significant. One interesting finding is that the sign of the correlation with $E'$ changes between the adjacency graph and the commuting network, highlighting once again the need to consider mobility in order to understand the segregation and exposure of ethnicities in urban landscapes.

Temporal evolution of correlations with other segregation indices from the literature
We have also studied the correlations between the difference in the percentage of COVID-19 incidence among African Americans and other segregation indices from the literature. First of all, we obtained the segregation index $\sigma_{\alpha}$ proposed in [42], which is also based on the movement of random walks in spatial systems and captures the probability that a randomly chosen individual of group $\alpha$ meets another individual of the same group or, in this case, ethnicity. Additionally, we computed Moran's I, which is a measure of spatial auto-correlation and compares the ethnic composition of neighbourhoods [26]. The evolution of the correlation of both metrics with the difference in the percentage of infected among African Americans is shown in Supplementary Figure S-28, where only the Moran index calculated over the adjacency graph seems to display a significant correlation.

As an additional metric, we have built a matrix of distances between ethnicities, similar to the one obtained for $\tilde{\tau}_{\alpha,\beta}$, using a measure proposed in [41]. Inspired by the Getis and Ord statistic [49], the metric proposed in [41] quantifies for each location $i$ the exposure of ethnicity $\alpha$ to ethnicity $\beta$ as
\[ {}^{\beta}_{\alpha}G^{*}_{i} = \frac{\sum_{j=1}^{n} w_{ij}(\hat{d}^{p}_{i})\, m_{j,\beta}}{\sum_{j=1}^{n} m_{j,\beta}}, \tag{S-1} \]
where $n$ is the total number of locations in a city, $j$ runs over those locations, $m_{j,\beta}$ is the population of ethnicity $\beta$ in location $j$, $\hat{d}^{p}_{i}$ is an estimate of trip length and $w_{ij}(\hat{d}^{p}_{i})$ is a function of the distance that is equal to 1 when $d_{ij} < \hat{d}^{p}_{i}$ and 0 otherwise. In the case of the adjacency graph only adjacent pairs of tracts were considered, whereas in the case of the commuting network only pairs connected by commuting trips were considered. Overall, ${}^{\beta}_{\alpha}G^{*}_{i}$ quantifies the ratio of the population of ethnicity $\beta$ to which the individuals residing in $i$ are exposed. In our case we set the threshold $\hat{d}^{p}_{i}$ equal to the average commuting distance in each of the cities.
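The exposure statistic of Eq. (S-1) can be sketched for a single location as follows. The block assumes a plain distance threshold and does not reproduce the additional restriction to adjacent tracts or to pairs connected by commuting trips. The function name `g_star`, its parameters and the toy numbers are hypothetical.

```python
def g_star(distances_from_i, population_beta, threshold):
    """Fraction of the ethnicity-beta population within reach of location i.

    distances_from_i: distance d_ij from location i to every location j.
    population_beta:  population m_{j,beta} of ethnicity beta in every j.
    threshold:        trip-length estimate; w_ij = 1 if d_ij < threshold.
    """
    reachable = sum(
        m for d, m in zip(distances_from_i, population_beta) if d < threshold
    )
    return reachable / sum(population_beta)


# Toy city with four locations (illustrative numbers only).
toy_distances = [0.0, 2.0, 5.0, 9.0]   # distances from location i, in km
toy_population = [100, 300, 400, 200]  # residents of ethnicity beta per location
exposure = g_star(toy_distances, toy_population, threshold=6.0)
```

In this toy case three of the four locations fall within the threshold, so the statistic is the share of the ethnicity-beta population living in those three locations.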
Succinctly, ${}^{\beta}_{\alpha}G^{*}_{i}$ is a value between 0 and 1 that encapsulates the fraction of the population of ethnicity $\beta$ to which the residents of $i$ are exposed. We average the value of ${}^{\beta}_{\alpha}G^{*}_{i}$ to obtain a distance matrix between ethnicities in each of the cities as
\[ {}^{\beta}_{\alpha}\langle G^{*}\rangle = \frac{\sum_{i=1}^{n} m_{i,\alpha}\, {}^{\beta}_{\alpha}G^{*}_{i}}{\sum_{i=1}^{n} m_{i,\alpha}}, \tag{S-2} \]
so that we take into account the fraction of the population of ethnicity $\alpha$ in location $i$. Finally, from the matrix ${}^{\beta}_{\alpha}\langle G^{*}\rangle$ we compute the same exposure and clustering indices computed from $\tilde{\tau}_{\alpha,\beta}$ in the main text, calculating first
\[ \langle G^{*}\rangle_{AO} = \frac{\sum_{\beta \neq A} M_{\beta}\, {}^{A}_{\beta}\langle G^{*}\rangle}{\sum_{\beta \neq A} M_{\beta}}, \qquad \langle G^{*}\rangle_{OA} = \frac{\sum_{\beta \neq A} M_{\beta}\, {}^{\beta}_{A}\langle G^{*}\rangle}{\sum_{\beta \neq A} M_{\beta}}, \qquad \langle G^{*}\rangle_{OO} = \frac{\sum_{\alpha,\beta \neq A} {}^{\beta}_{\alpha}\langle G^{*}\rangle\, M_{\alpha} M_{\beta}}{\sum_{\alpha,\beta \neq A} M_{\alpha} M_{\beta}}, \]
to finally obtain
\[ C_{f} = \frac{\langle G^{*}\rangle_{AO}}{\langle G^{*}\rangle_{OO}}, \qquad E_{f} = \frac{\langle G^{*}\rangle_{OO}}{\langle G^{*}\rangle_{OO}}. \]
In addition to calculating the indices on the adjacency network and on the commuting network with dynamical population, we also computed them on the commuting network with the residential population, to investigate the role played by the dynamical population. As can be seen in Supplementary Figure S-29, significant correlations appear with all indices, yet the highest ones are with the exposure index, especially when computed on the commuting network with dynamical population. Correlations are, however, lower and less stable than those obtained in the main manuscript.

Overall, it is important to note that none of the additional segregation metrics we have studied in this section is more informative than the ones we proposed in the main manuscript based on CMFPT and CCT. Moreover, the use […]

[Figure S-27: panels a-d. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date; a-b Adjacency, c-d Commuting; legend: Pearson, Spearman.]
FIG. S-27. Evolution of the Pearson and Spearman correlation ($R^2$) found between the difference of infected African Americans and the alternative indices proposed, $C'$ and $E'$. a $C'$ (clustering) and b $E'$ (exposure) calculated upon the adjacency network. c $C'$ (clustering) and d $E'$ (exposure) calculated upon the commuting network. The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-28: panels a-d. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date; a-b Adjacency, c-d Commuting (dynamical population); legend: Pearson, Spearman.]
FIG. S-28. The temporal evolution of the correlation ($R^2$) between the incidence of COVID-19 in African American population and the segregation indices $\sigma$ and Moran's I. a $\sigma$ and b Moran's I computed over the adjacency graph. c $\sigma$ and d Moran's I computed over the commuting graph. The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-29: six panels. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date; rows: Adjacency, Commuting (dynamical population), Commuting (residential population); columns: $C_f$ (clustering), $E_f$ (exposure); legend: Pearson, Spearman.]
FIG. S-29. Correlations between the incidence of COVID-19 in African American population and the clustering and exposure indices computed from the $G^{*}$ statistic proposed in [41]. a,b Correlation with the clustering $C_f$ and exposure $E_f$ indices computed over the adjacency graph. c,d Correlation with the clustering $C_f$ and exposure $E_f$ indices computed over the commuting graph when the dynamical population is incorporated. e,f Correlation with the clustering $C_f$ and exposure $E_f$ indices computed over the commuting graph when only the residential population is incorporated. The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

The relation between the incidence of COVID-19 in African Americans and socio-economic indicators
We present in this section the correlations between a set of socio-economic indicators and the incidence of COVID-19 in African Americans. For the sake of brevity, we focus here only on the data set used in the main manuscript and on the difference in the percentage of infections, which is the case where correlations are higher. The indicators we have studied are the median household income, the percentage of the population below the poverty level, the percentage of insured and uninsured African Americans, the usage of public transportation by both African Americans and the overall population, the percentage of African American population in a state, the average commuting distance and the ratio between the average commuting distance of African Americans and that of the overall population. All of the metrics are provided at the level of the African American population and the results are shown in Supplementary Figure S-30. The median household income, the percentage of the population below the poverty level, and the percentage of insured and uninsured African Americans were obtained from the 2018 American Community Survey elaborated by the U.S. Census Bureau [34]. Most of the variables yield low or very low correlations, except for the usage of public transportation by African Americans. Economic indicators such as median income or percentage of poverty seem to correlate slightly with the incidence of COVID-19, which could be because a more deprived African American community is in a riskier situation. Regarding the health indicators related to the degree of insurance of African Americans, there seems to be no direct relation with the number of infected. This is not so surprising, since we are analysing the percentage of infected and, therefore, having insurance might not significantly change the risk of getting the illness.
Finally, the usage of public transportation seems to play a crucial role in the spread of the disease, especially when comparing the usage by the African American population with the usage by the overall population, for which no correlation appears. The fact that African Americans use public transportation more might put them in a more dangerous position, and might also be a reflection of their economic status. Moreover, it could happen that in those cities in which African Americans are more segregated they also have to use public transportation more.

In addition to those socio-economic variables, we also tested whether the overall African American population can be used as a proxy for the difference in percentage. We also computed on our commuting networks the average commuting distance of the African American population as well as its ratio with the commuting distance of the overall population. As displayed in Supplementary Figure S-31, the overall percentage of African American population seems to be related to the difference in the percentage of infected. However, there is a striking difference between the Pearson and the Spearman correlation coefficients, which means that the rank is more or less conserved yet there are strong outliers. In other words, a state with a higher percentage of African American population is more likely to have a higher difference in the percentage of infected, yet the points do not align along a straight trend. Regarding the mobility indicators, none of them yields a significant correlation, meaning that it is not so much how far African Americans travel that matters, but where they travel and whom they meet.

[Figure S-30: six panels. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date; legend: Pearson, Spearman. Panel titles: Percentage of African Americans below the poverty level; Median household income of African Americans; Percent of insured African Americans; Percent of uninsured African Americans; Percentage of PT usage among African Americans; Percentage of PT usage among the overall population.]
FIG. S-30. The temporal evolution of Pearson and Spearman correlations ($R^2$) between the incidence of COVID-19 in African American population and a set of socio-economic indicators. On the top row and from left to right we have the median household income of African Americans, the percentage of African Americans below the poverty level and the percent of insured African Americans. On the bottom row and from left to right there is the percent of uninsured African Americans, the percentage of use of public transportation among African Americans and the percentage of use among the overall population. The markers indicate the sign of the relation, positive for triangles pointing up and negative for triangles pointing down.

[Figure S-31: three panels. y-axis: $R^2$ with $\Delta A_{inf}$; x-axis: date; legend: Pearson, Spearman. Panel titles: Percentage of African American Population; Average commuting distance of African Americans (Km); Normalized average commuting distance of African Americans.]
FIG. S-31. Correlations between the incidence of COVID-19 in African American population and a set of population and mobility indicators.
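The gap between the Pearson and Spearman coefficients discussed in this section can be reproduced on synthetic data. The sketch below uses a monotonic series with one strong outlier, so the ranks are perfectly conserved while the linear fit is poor. All numbers are made up, the helper names are hypothetical, and the rank function assumes there are no ties.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank of each value, starting at 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic states: population share (x) and gap (y), with one strong outlier.
pop_share = [1, 2, 3, 4, 5, 6]
gap = [1, 2, 3, 4, 5, 60]

r_pearson = pearson(pop_share, gap)
r_spearman = spearman(pop_share, gap)
```

Because the ordering of the points is preserved, the Spearman coefficient is 1, whereas the single outlier pulls the Pearson coefficient well below 1.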