Early Indicators of COVID-19 Spread Risk Using Digital Trace Data of Population Activities
Xinyu Gao, Chao Fan, Yang Yang, Sanghyeon Lee, Qingchun Li, Mikel Maron, Ali Mostafavi
Early Indicators of COVID-19 Spread Risk Using Digital Trace Data of Population Activities with Different Purpose
Xinyu Gao ∗ +
Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, Texas, 77843
Chao Fan +
Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, Texas, 77843
Yang Yang
Department of Computer Science & Engineering, Texas A&M University, College Station, Texas, 77843
Sanghyeon Lee
Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843
Qingchun Li
Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, Texas, 77843
Mikel Maron
Community Team, Mapbox, Washington, DC, 20005
Ali Mostafavi ∗
Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, Texas, 77843
Abstract
The spread of pandemics such as COVID-19 is strongly linked to human activities. The objective of this paper is to specify and examine early indicators of disease spread risk in cities during the initial stages of an outbreak based on patterns of human activities obtained from digital trace data. In this study, the Venables distance ($D_v$) and the activity density ($D_a$) are used to quantify and evaluate human activities for 193 US counties whose cumulative number of confirmed cases was greater than 100 as of March 31, 2020. Venables distance measures the agglomeration of human activities based on the average distance between activities across a city or county (a smaller distance could lead to a greater contact risk). Activity density measures the overall level of activity in a county or city (more activity could lead to a greater risk). Accordingly, Pearson correlation analysis is used to examine the relationship between the two human activity indicators and the basic reproduction number in the following weeks. The results show statistically significant correlations between the indicators of human activities and the basic reproduction number in all counties, as well as a significant leader-follower relationship (time lag) between them. The results also show a lag of one to two weeks between the change in activity indicators and the decrease in the basic reproduction number. This result implies that the human activity indicators are effective early indicators of the spread risk of the pandemic during the early stages of the outbreak. Hence, the authorities could use the results to proactively assess the risk of disease spread by monitoring the daily Venables distance and activity density.
∗ Corresponding author
+ Xinyu Gao and Chao Fan contributed equally to this paper.
arXiv [physics.soc-ph]
Introduction
The objective of this study is to reveal and evaluate early indicators for COVID-19 spread risk in cities during the initial stages of the outbreak using measures of human activities derived from digital trace data. An arguably unprecedented global pandemic, the coronavirus disease 2019 (COVID-19) has infected millions of people worldwide with a mortality rate of 6.6% and a high infection rate [1][2]. Since the spread of COVID-19 is highly dependent on human activities, incidence of infection could be contained by restricting human activities and mobility [3]. Many countries and authorities have implemented various non-pharmaceutical interventions (e.g., shelter-in-place orders, regional lockdowns, and travel restrictions) to slow the spread of disease by restricting human mobility and activities and thereby disrupting transmission chains. Such social distancing and activity reduction interventions have proven to be critical in slowing down the spread of pandemics both in previous epidemics [4] and during COVID-19 [5][6][7].
While reduction in human activities is considered an effective measure for containing epidemics and pandemics, there are few reliable, proven, real-time leading indicators related to human activities that could provide early insights about the risk of disease spread in a region to inform proactive policy making. One reason for this limitation has been the absence of quantitative measures and data that could be examined to proactively evaluate human activities. With advancements in location intelligence data technologies, however, information derived from cellular devices offers a large depository of digital trace data related to human activities. These data have increasingly been adapted and analyzed to understand and quantify human activity and mobility in pandemic analysis, as well as in other applications [8][9][10].
For example, in the context of COVID-19, the radius of gyration, which captures the mobility of individuals using human movement trajectories, was adopted to analyze the COVID-19 spread in Japan [11]. Daily step counts (gathered from smartphones) were used to estimate and predict decreased movement of individuals within the United States during COVID-19 [3]. Two of the most important aspects of human activities during an epidemic are agglomeration of activities and intensity of activities.
Although previous research reveals insights regarding human activities in the context of COVID-19, the relationship between human activities and disease-spreading risk has not been fully explored, and leading indicators of human activities for proactively assessing the risk of disease spread during the early stages of pandemics are lacking. The majority of research studies [12][13][14][15] focus on quantifying and analyzing the changes in human activities as a consequence of the outbreak of the virus and in response to protective policies (such as shelter-in-place policies). The time-lag relationship between these human activity metrics and the spread of the virus, which can be generally described by the basic reproduction number ($R_0$), has not yet been fully examined. The basic reproduction number, $R_0$, is defined as the number of secondary cases produced by one previous case in a completely susceptible population [16]. Although research studies [17][18] have focused on leading indicators obtained from users' online search behavior, a decrease in online search frequencies may not have a direct impact on the spread of the virus. Hence, the previous indicators cannot be used for proactive assessment of disease spread risk.
In addition, the nature of human activities (such as activities in public indoor venues versus in residences) may have differing effects on the spread risk of a disease.
The measures of human activities should distinguish between the nature of activities to provide useful leading insights for decision making and policy formulation.
In this study, we adopted the Venables distance ($D_v$) index [19] and also created the activity density ($D_a$) index to serve as two real-time indicators to examine the spatial and temporal patterns of human activities across 193 counties in the United States using Mapbox high-resolution temporal-spatial activity index data from January 1 to March 31, 2020. The Venables distance captures the average distance (i.e., concentration) of human activities across a city or county (a smaller distance between persons might indicate a greater contact risk). The activity density captures the intensity level of overall activities in a county or city (higher activity levels might indicate a greater spread risk). Human activities were examined in four categories (social, traffic, work, and other) based on the location and time of activities. Accordingly, we analyzed the correlation between the two metrics ($D_v$ and $D_a$) and the basic reproduction number for 193 counties with the highest number of confirmed COVID-19 cases. The rest of this paper is organized into three sections. The first section describes the two datasets (Mapbox data and total confirmed cases number data), as well as the analysis methods. The second section describes the results of the time-lag correlation analysis between the two metrics and the basic reproduction number. The last section discusses the results and the implications of the findings for future work.
In this section, we describe the two datasets (Mapbox data and total confirmed cases number data) and the procedures for human activity categorization. Also covered in this section are definitions and equations related to the Venables distance ($D_v$), the activity density ($D_a$), and the basic reproduction number ($R_0$).
The time-lag cross-correlation analysis method is presented at the end of this section.
We utilized digital trace telemetry data obtained from Mapbox from January 1 to March 30, 2020. The dataset contains a metric of telemetry-based human activity, $a_{tile,t}$, which varies across spatial tiles and time $t$. The partition of tiles is based on Mercantile, a Python library capable of creating spatial-resolution grids worldwide. The $a_{tile,t}$ metric is collected, aggregated, and normalized by Mapbox from location updates of users' cell phones over time. The more users located in a tile at time $t$, the higher the human activity (i.e., $a_{tile,t}$). The dataset covers the United States and the District of Columbia; however, in this study, we examined only 193 counties whose cumulative confirmed cases were greater than 100 as of March 31, 2020 (shown in Figure 1). In the raw data, the temporal resolution is 4 hours, and each tile covers about 100 meters by 100 meters. Since the data is derived from cell phone activity, data may not exist for all tiles at all times. For example, a park that is open during the day but closed at night would not generate any data at midnight. Also, to protect users' privacy during the data aggregation process, tiles with a small number of users are reported without any activity data. It is also noteworthy that the data is aggregated and normalized for each month, so the absolute values of activity indices for different months cannot be directly compared.
To reveal the time-lag relationship between the metrics and the spread of the virus, the total number of confirmed cases was used. We obtained the data from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [20]. The data in this repository were gathered and aggregated from various sources, such as the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC).
We extracted the total number of confirmed cases $c_{i,j}$ from the CSSE repository, where $i$ represents each date and $j$ represents each county.
Figure 1: 193 selected counties whose cumulative confirmed cases were greater than 100 as of March 31, 2020.
The nature of an activity might put its participants at a higher risk level for contracting the virus. For example, activities in public common areas, such as grocery stores or gyms, would lead to greater risk of disease spread compared to activities in residential areas, such as working from home or walking a dog in the community. The fine granularity of the spatial resolution enables classification of each tile into one of four categories: (1) social tiles, (2) traffic tiles, (3) work tiles, and (4) other tiles. Categorization is based on the following characteristics: (1) social tiles contain at least one point of interest; (2) traffic tiles include a road; (3) work tiles show no activity in the evening; and (4) other tiles are located in residential areas. The analysis in this paper examines human activities for social, work, and traffic tiles separately. Example tile maps related to each category are shown in Figure 2 for Harris County, Texas.
Figure 2: Maps of four different tile categories (Social, Traffic, Work, and Other) in Harris County, Texas.
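The four categorization rules above can be sketched as a small decision function. This is an illustrative sketch, not the authors' implementation: the `TileFeatures` fields and the precedence order (social, then traffic, then work, then other) are assumptions, since the text does not say how a tile that matches several rules is resolved.

```python
from dataclasses import dataclass


@dataclass
class TileFeatures:
    """Hypothetical per-tile attributes; the field names are assumptions."""
    has_poi: bool             # tile contains at least one point of interest
    has_road: bool            # tile includes a road
    evening_activity: float   # activity observed in the evening (0 means none)


def categorize_tile(tile: TileFeatures) -> str:
    """Assign exactly one category per tile.

    Assumed precedence: social > traffic > work > other.
    """
    if tile.has_poi:
        return "social"
    if tile.has_road:
        return "traffic"
    if tile.evening_activity == 0:
        return "work"
    return "other"
```

A fixed precedence keeps the assignment deterministic when a tile qualifies for more than one category, which is the ambiguity the paper's limitations discussion mentions.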
To quantify the agglomeration of human activities, we used the Venables distance ($D_v$) as a weighted average distance of human activities. The Venables distance aggregates the spatial distribution of $a_{tile,t}$ in a county and captures the urban spatial structure of human activities [19]. $D_v$ is calculated using Equation 1:

$$D_v(t) = \frac{\sum_{tile_1 \neq tile_2} a_{tile_1,t} \times a_{tile_2,t} \times d_{tile_1,tile_2}}{\sum_{tile_1 \neq tile_2} a_{tile_1,t} \times a_{tile_2,t}} \tag{1}$$

where $a_{tile_1,t}$ and $a_{tile_2,t}$ are the metrics of human activities in $tile_1$ and $tile_2$ at time $t$, respectively, and $d_{tile_1,tile_2}$ is the distance between the centroids of these two tiles. In Harris County, Texas, there are more than 70K unique tiles, which makes it computationally expensive to analyze all pairs of existing tiles. To reduce the computational burden, we aggregated the 100-meter by 100-meter tiles, $a_{tile,t}$, into square cells with 2-kilometer sides using Equation 2:

$$a_{k,t} = \frac{\sum_{tile \,\in\, \text{cell } k} a_{tile,t}}{A_k} \tag{2}$$

where $a_{k,t}$ is the intensity of human activity in cell $k$ at time $t$, and $A_k$ is the area of cell $k$. By aggregating human activity into larger spatial cells, we reduced the computational effort while maintaining a meaningful spatial resolution and without losing important characteristics of the raw data. Accordingly, the modified Venables distance is derived as shown in Equation 3:

$$D_v(t) = \frac{\sum_{k_1 \neq k_2} a_{k_1,t} \times a_{k_2,t} \times d_{k_1,k_2}}{\sum_{k_1 \neq k_2} a_{k_1,t} \times a_{k_2,t}} \tag{3}$$

where $a_{k_1,t}$ and $a_{k_2,t}$ are the intensities of human activities in cells $k_1$ and $k_2$ at time $t$, respectively, and $d_{k_1,k_2}$ is the distance between the centroids of these two cells. In Equation 3, the values of the activity intensity ($a_{k,t}$) are used as weights to calculate a human activity-weighted distance for the whole area. In other words, the relative values of $a_{k,t}$ were used to examine changes in agglomeration of activities.
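Equations 2 and 3 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: tile positions are assumed to be planar coordinates in meters (a projection step is not shown), every cell is assumed to have the full area of `cell_size` squared, and the function names are hypothetical.

```python
import numpy as np


def aggregate_to_cells(xy, activity, cell_size=2000.0):
    """Equation 2: sum tile activity per square cell and divide by cell area.

    xy is an (n, 2) array of tile centroid coordinates in meters (assumed planar).
    Returns the cell centroids and the activity intensity of each cell.
    """
    xy = np.asarray(xy, dtype=float)
    activity = np.asarray(activity, dtype=float)
    idx = np.floor(xy / cell_size).astype(int)           # integer cell index per tile
    cells, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()                            # shape differs across NumPy versions
    totals = np.bincount(inverse, weights=activity, minlength=len(cells))
    centroids = (cells + 0.5) * cell_size
    return centroids, totals / cell_size**2


def venables_distance(centroids, intensity):
    """Equation 3: activity-weighted average distance over all pairs k1 != k2."""
    centroids = np.asarray(centroids, dtype=float)
    a = np.asarray(intensity, dtype=float)
    diff = centroids[:, None, :] - centroids[None, :, :]
    d = np.sqrt((diff**2).sum(axis=-1))                  # pairwise centroid distances
    w = np.outer(a, a)                                   # pair weights a_k1 * a_k2
    np.fill_diagonal(w, 0.0)                             # exclude the k1 == k2 pairs
    return float((w * d).sum() / w.sum())
```

Because the intensities appear in both numerator and denominator, scaling all activity values by a constant leaves the result unchanged, which is why only the relative values of the activity intensity matter here.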
We calculated $D_v(t)$ for each county $j$, denoted as $D_v(j, t)$, over all cells in county $j$.
Activity Density
Although $D_v$ captures the agglomeration of human activities, the density of activities is also critical for examining population contact. To make the raw data ($a_{tile,t}$) from Mapbox comparable among different months, we de-normalized the activity index to the contact activity metric $ca_{tile,t}$ for each tile and each month (where $t$ is time). In the de-normalization process, the assumption was that the minimum human activity intensity among tiles is the same in each month. First, the tile with the minimum human activity index ($a_{tile,t}$) in each month is found. Then we set the value of contact activity $ca_{tile,t}$ in that low-activity tile to 5 and de-normalized the values for the other tiles based on this value. For each county, the activity density at time $t$ ($D_a(t)$) was calculated using Equation 4:

$$D_a(t) = \sqrt{\frac{1}{N} \sum_{tile=1}^{N} ca_{tile,t}} \tag{4}$$

The basic reproduction number ($R_0$) (the number of secondary cases arising from one previous case) is a critical parameter in epidemic modeling for understanding the speed of disease spread [16][21], as well as the risk of virus spread [22][23][24][25]. $R_0$ is calculated using Equation 5:

$$c_{i+t} = c_i \cdot R_0^{\,t/\tau} \tag{5}$$

where $c_i$ and $c_{i+t}$ represent the total confirmed cases on day $i$ and day $i + t$, respectively, and $\tau$ is a constant parameter. We estimated $R_0$ using CDC data ($c_{i,j}$) in Equation 6:

$$R_{i,j} = e^{\tau \, \frac{\ln c_{i,j} - \ln c_{i-t,j}}{t}} \tag{6}$$

where $R_{i,j}$ is the basic reproduction number at date $i$ in county $j$. Because the CDC total confirmed cases data fluctuates within the course of a week (i.e., more cases are reported on weekdays and fewer on weekends), the time interval $t$ was set to 7 days. Based on the existing literature and simulation models related to COVID-19 [26][27][28], the constant parameter $\tau$ was set to 5.1 days.
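Equation 6 follows from Equation 5 by solving for the reproduction number over a window of $t$ days. A minimal sketch of that estimate is given below; the function name and the list-based input are assumptions for illustration, and the defaults mirror the values stated above ($t = 7$ days, $\tau = 5.1$ days).

```python
import math


def reproduction_number(cases, i, t=7, tau=5.1):
    """Estimate the reproduction number at day index i from cumulative cases.

    cases is a sequence of cumulative confirmed cases for one county,
    indexed by day. Implements R = exp(tau * (ln c_i - ln c_{i-t}) / t).
    """
    if i - t < 0:
        raise ValueError("need at least t days of history before day i")
    c_now, c_prev = cases[i], cases[i - t]
    if c_now <= 0 or c_prev <= 0:
        raise ValueError("cumulative cases must be positive to take logarithms")
    return math.exp(tau * (math.log(c_now) - math.log(c_prev)) / t)
```

For example, if cumulative cases double over the 7-day window, the estimate is $2^{5.1/7}$; if the count is flat, the estimate is exactly 1.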
Accordingly, $R_0$ was calculated for all 193 counties for the analysis period.
In the next step, we examined the correlation between the two human activity indicators and the basic reproduction number across all counties. Since these variables are time series, we used time-lagged cross-correlation analysis to assess the synchrony of the time series data sets. The cross-correlation coefficient was calculated using Equation 7:

$$\rho_{A_1,A_2}(\delta) = \frac{\mathrm{Cov}\left(A_1(t), A_2(t+\delta)\right)}{\sigma_{A_1} \sigma_{A_2}} \tag{7}$$

where $\rho_{A_1,A_2}$ is the cross-correlation coefficient for two time series $A_1$ and $A_2$; $\delta$ is the time offset of $A_2$; $\mathrm{Cov}(X, Y)$ is the covariance of two variables; and $\sigma_{A_1}$ and $\sigma_{A_2}$ are the standard deviations of $A_1$ and $A_2$, respectively. By definition, $\rho_{A_1,A_2}$ represents the correlation between two variables, and $|\rho_{A_1,A_2}| \le 1$ ($|\rho_{A_1,A_2}| = 1$ if and only if $A_1 = mA_2 + n$, where $m$ and $n$ are constants). Then, by iteratively calculating $\rho_{A_1,A_2}(\delta)$ with different $\delta$, the correlation coefficient reaches its peak when $\delta = T$, and $T$ was determined as the time lag between the two variables.
This section presents the results related to the calculation of the two human activity metrics and their time-lagged correlation with the basic reproduction number across 193 counties during the initial stage of the COVID-19 outbreak in the United States.
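The lag search behind Equation 7 can be sketched as follows. This is an illustrative sketch rather than the authors' procedure: it computes the Pearson correlation on the overlapping window for each offset, takes the peak by absolute value (an assumption, since the sign convention of the compared series is not stated), and does not compute P-values. The function names are hypothetical.

```python
import numpy as np


def lagged_correlation(a1, a2, delta):
    """Pearson correlation between a1(t) and a2(t + delta) for delta >= 0."""
    a1 = np.asarray(a1, dtype=float)
    a2 = np.asarray(a2, dtype=float)
    n = min(len(a1), len(a2)) - delta        # length of the overlapping window
    if n < 3:
        raise ValueError("not enough overlapping samples for this offset")
    return float(np.corrcoef(a1[:n], a2[delta:delta + n])[0, 1])


def best_lag(a1, a2, max_lag):
    """Return (T, rho) where T is the offset with the largest |rho|."""
    scores = {d: lagged_correlation(a1, a2, d) for d in range(max_lag + 1)}
    lag = max(scores, key=lambda d: abs(scores[d]))
    return lag, scores[lag]
```

In this setting, `a1` would hold a daily activity indicator for one county and `a2` the corresponding series of reproduction number estimates, so a positive lag means the activity change leads the change in spread.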
In this study, the Venables distance ($D_v$) and the activity density ($D_a$) were calculated to assess human activities at the county level using data from Mapbox. The first four weeks (January 1 to 28) were considered as the baseline, and $D_v$ and $D_a$ values were compared with the baseline average for the corresponding day of the week. For example, the $D_v$ values on March 1 (Sunday) were compared with the mean of the $D_v$ values on Sundays between January 1 and 28. Three different ways of daily activity aggregation were used: peak, average, and noon. The peak (largest), average, or noon (11 a.m. to 3 p.m.) value of human activity $a_{tile,t}$ was selected as the representative value of each tile on each day. Figure 3 and Figure 4 show the percentage change of $D_v$ and $D_a$ for the social, traffic, and work activity categories and for the three types of daily tile aggregation.
Figure 3: The percentage change of $D_v$ for 193 counties in March 2020. The height of each bar is the average percentage change of all 193 counties. The error bar indicates the standard deviation among all counties. The three tile categories of social, traffic, and work are shown in each column, and three daily tile aggregations of peak, average, and noon are shown in each row.
Figure 4: The percentage change of $D_a$ for 193 counties in March 2020. The height of each bar is the average percentage change of all 193 counties, and the error bar indicates the standard deviation among all counties. The three tile categories of social, traffic, and work are shown in each column, and three daily tile aggregations of peak, average, and noon are shown in each row.
The increasing trend of Venables distance ($D_v$) implies declining concentration and rising distance among people, and the decreasing trend of activity density ($D_a$) indicates less human activity compared with the beginning of the year. Due to the shelter-in-place policies, residents changed their daily activity patterns.
More and more people reduced non-essential outdoor activities (e.g., shopping in supermarkets, exercising in gyms, and eating at restaurants). Such changes in daily human activity patterns led to the changes in $D_v$ and $D_a$. Among the three categories, significant change can be seen in both social and traffic tiles, while the change in work tiles is not obvious. This is because the activities in work tiles could be essential activities. $D_v$ increased the most, around 15%, in social tiles, while $D_a$ fell the most, around 25%, in traffic tiles. Differences among the three types of daily tile aggregation were not significant. The percentage change of peak values is slightly greater than that of the other two, indicating that peak values are influenced more by COVID-19, while average values are more stable. The following analysis uses peak values to calculate $D_v$ and $D_a$. Histograms of the average percentage change of $D_v$ and $D_a$ for each county are shown in Figure 5. The average percentage change is calculated over the last week of March. The $D_v$ values increased in the majority of counties, and the $D_a$ values decreased for the social and traffic categories, while the work category shows a more even distribution around 0% for both $D_v$ and $D_a$. These histograms are consistent with the claim that human activities in work tiles are more essential than those in the other two categories (social and traffic) and therefore did not show significant change during the COVID-19 study period.
Figure 5: Histogram plots of average percentage change of $D_v$ and $D_a$ (each row) for three different categories (each column).
While $D_v$ and $D_a$ describe different global characteristics of human activity ($D_v$ captures the spatial distribution of human activity, and $D_a$ focuses on its intensity), both reveal insights into large-scale human activity patterns, which could have a significant influence on the spread of the virus.
The correlation analysis between these two metrics and the basic reproduction number $R_0$ therefore becomes critical.
The spread of the coronavirus is closely related to human activity patterns. In the previous section, we showed that the average distance between human activities ($D_v$) increased by 10% to 15%, and the average human activity intensity ($D_a$) decreased by 5% to 10% for social tiles during March 2020 compared with the baseline period of January 2020. This result provides a good indication of the reduction in human activities in response to social distancing policies. In this section, we examine the extent to which the change in human activity metrics was related to the change in the reproduction number of the coronavirus in the 193 counties under study. Hence, we conducted the time-lagged correlation analysis for the two human activity metrics calculated based on the social, traffic, and work tile activity categories. Figure 6 shows the time offset results between the Venables distance ($D_v$) change and the basic reproduction number ($R_0$) for the social, traffic, and work activity categories. Since the number of confirmed cases follows a skewed distribution during March 2020, a log scale is used to illustrate the results. The results show that, in the majority of counties, the decline in the basic reproduction number ($R_0$) happens 20 to 40 days after the increase in Venables distance. This result is consistent across all three activity categories. In the right column of Figure 6, the bar charts show the correlation between the offset $D_v$ and $R_0$ within different P-value intervals. The average correlation coefficients (with P-values less than 0.05) are around 0.8 for each category, indicating a significant relationship between the increased distance among human activities and the decline in the virus spread speed.
For P-values greater than 0.05 (indicating insufficient evidence of a correlation between the two variables), the correlation indices are correspondingly smaller. The number of counties in each P-value interval shows that about 50% of the counties have P-values less than 0.001 for the Venables distance calculated based on social and traffic activity tiles. For work tiles, however, the correlation between the two variables is significant in a smaller number of counties.
Figure 6: The time-lagged correlation analysis between Venables distance ($D_v$) change and reproduction number ($R_0$). The left column shows the number of cases and the number of counties for different offset days, and the right column shows the correlation index and the number of counties with different P-value intervals. Each row presents one of the three tile activity categories: social, traffic, and work.
Figure 7 shows the time offset result between activity density ($D_a$) change and reproduction number ($R_0$) for the three activity categories. The results show that the decline in the basic reproduction number happens 6 to 17 days after the reduction of the activity intensity ($D_a$); a similar result exists in all activity categories. The time lag is shorter than the one obtained for the Venables distance ($D_v$), which means that the spread of the virus responds to a reduction in human activity intensity more quickly than to a reduction in human agglomeration. In the right column of Figure 7, the bar charts show the correlation of the offset $D_a$ and $R_0$ in different P-value intervals. The average correlation indices for P-values less than 0.05 are around 0.9 for the tile activity categories. This result indicates a significant relationship between the human activity intensity reduction and the decline in the virus spread speed. For P-values greater than 0.05, the correlation indices are smaller as well.
The number of counties in each P-value interval shows that about 50% of counties have P-values less than 0.01 for social and traffic tiles, while the work tiles result shows P-values between 0.1 and 1.0 (indicating a non-significant correlation).
Figure 7: The time-lagged correlation analysis result between activity density ($D_a$) change and reproduction number ($R_0$). The left column shows the number of cases and the number of counties for different offset days, and the right column shows the correlation index and the number of counties with different P-value intervals. Each row presents one of the three tile activity categories: social, traffic, and work.
In the next step, we examined the variation of findings across counties with different population sizes, numbers of confirmed cases, and dates of first confirmed case. The goal is to examine the extent to which the correlation between the two metrics of human activities and the reproduction number is sensitive to these county features. The 193 counties were divided into three equal-sized categories according to population size and confirmed cases (on March 18, 2020), labeled high, medium, and low. Similarly, the first-case dates were labeled as early, mid-range, and late, each covering one third of the counties. Then, the changes in $D_v$ and $D_a$ (on March 31, 2020) were examined for each label in each tile category, and the results are plotted in Figure 8. As shown in Figure 8, for all three tile activity categories, $D_a$ declined more in counties with a larger population, more confirmed cases, and an earlier first-case date. This result indicates a greater recognition of and response to the pandemic risks in more populous counties with early confirmed cases.
This study shows the utility of two human activity metrics, the Venables distance ($D_v$) and the activity density ($D_a$), as leading indicators for the spread speed of COVID-19 in the early stages of the outbreak.
These metrics were derived from digital trace data obtained from Mapbox high-resolution temporal-spatial datasets. The results provide statistical evidence regarding the time-lag correlation between these two metrics and the basic reproduction number ($R_0$) in the context of COVID-19. The significant leader-follower relationship between human activities and the rate of spread of viral infections could help authorities monitor and control the transmission of COVID-19 and future pandemics. For example, the time lag indicates that the spread of the virus responds to a reduction in human activity intensity more quickly than to a reduction in human agglomeration. Hence, the proposed indicators can be calculated from digital trace telemetry data in near real time to proactively assess the risk of virus spread.
This study has some limitations that should be addressed in future studies. First, the tile activity categorization (social, work, and traffic) is not precise. One tile could be labeled as both social and work. In this study, however, due to data availability limitations, we classified each tile into only one of the three categories. Second, the CDC confirmed-cases data had limitations due to testing availability. In this study, we did not adjust the confirmed case data based on the extent of testing in different counties. A lack of testing in some areas resulted in underestimation of the total cases.
Figure 8: Change in Venables distance ($D_v$) and activity density ($D_a$) across counties with different population size, confirmed cases number, and date of first confirmed case in the three tile activity categories (social, traffic, and work).
Acknowledgements
The authors would like to acknowledge that Mapbox provided digital trace telemetry data of human activity. The authors would like to thank Kieran Gupta, Sofia Heisler, and Ruggero Tacchi from Mapbox for providing technical support.
Funding
This work was supported by several grants, including from the United States National Science Foundation RAPID project
Author contributions statement
Research design and conceptualization: X. G., C. F., A. M.; Data collection, processing, analysis, and visualization: X. G., C. F., Y. Y., S. L., Q. L.; Writing: X. G., A. M.; Reviewing and revising: all authors.
References
[1] World Health Organization. WHO Coronavirus Disease (COVID-19) Dashboard, 2020. https://covid19.who.int/.
[2] Raghuvir Keni, Anila Alexander, Pawan Ganesh Nayak, Jayesh Mudgal, and Krishnadas Nandakumar. Covid-19: emergence, spread, possible treatments, and global burden. Frontiers in public health, 8:216, 2020.
[3] Anton Gollwitzer, Cameron Martel, Julia Marshall, Johanna Marie Höhs, and John A Bargh. Connecting self-reported social distancing to real-world behavior at the individual and us state level. 2020.
[4] Peter Caley, David J Philp, and Kevin McCracken. Quantifying social distancing arising from pandemic influenza. Journal of the Royal Society Interface, 5(23):631–639, 2008.
[5] Roy M Anderson, Hans Heesterbeek, Don Klinkenberg, and T Déirdre Hollingsworth. How will country-based mitigation measures influence the course of the covid-19 epidemic? The Lancet, 395(10228):931–934, 2020.
[6] Huaiyu Tian, Yonghong Liu, Yidan Li, Chieh-Hsi Wu, Bin Chen, Moritz UG Kraemer, Bingying Li, Jun Cai, Bo Xu, Qiqi Yang, et al. An investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china. Science, 368(6491):638–642, 2020.
[7] Ankit Ramchandani, Chao Fan, and Ali Mostafavi. Deepcovidnet: An interpretable deep learning model for predictive surveillance of covid-19 using heterogeneous features and their interactions. IEEE Access, 2020.
[8] Fereshteh Asgari, Vincent Gauthier, and Monique Becker. A survey on human mobility and its applications. arXiv preprint arXiv:1307.0814, 2013.
[9] Duygu Balcan, Vittoria Colizza, Bruno Gonçalves, Hao Hu, José J Ramasco, and Alessandro Vespignani. Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences, 106(51):21484–21489, 2009.
[10] Hugo Barbosa, Marc Barthelemy, Gourab Ghoshal, Charlotte R James, Maxime Lenormand, Thomas Louail, Ronaldo Menezes, José J Ramasco, Filippo Simini, and Marcello Tomasini. Human mobility: Models and applications. Physics Reports, 734:1–74, 2018.
[11] Takahiro Yabe, Kota Tsubouchi, Naoya Fujiwara, Takayuki Wada, Yoshihide Sekimoto, and Satish V Ukkusuri. Non-compulsory measures sufficiently reduced human mobility in japan during the covid-19 epidemic. arXiv preprint arXiv:2005.09423, 2020.
[12] Serina Y Chang, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec. Mobility network modeling explains higher sars-cov-2 infection rates among disadvantaged groups and informs reopening strategies. medRxiv, 2020.
[13] Paolo Cintia, Daniele Fadda, Fosca Giannotti, Luca Pappalardo, Giulio Rossetti, Dino Pedreschi, Salvo Rinzivillo, Pietro Bonato, Francesco Fabbri, Francesco Penone, et al. The relationship between human mobility and viral transmissibility during the covid-19 epidemics in italy. arXiv preprint arXiv:2006.03141, 2020.
[14] Song Gao, Jinmeng Rao, Yuhao Kang, Yunlei Liang, and Jake Kruse. Mapping county-level mobility pattern changes in the united states in response to covid-19. SIGSPATIAL Special, 12(1):16–26, 2020.
[15] Qingchun Li, Liam Bessell, Xin Xiao, Chao Fan, Xinyu Gao, and Ali Mostafavi. Disparate patterns of movements and visits to points of interests located in urban hotspots across us metropolitan cities during covid-19. arXiv preprint arXiv:2006.14157, 2020.
[16] Klaus Dietz. The estimation of the basic reproduction number for infectious diseases. Statistical methods in medical research, 2(1):23–41, 1993.
[17] Vasileios Lampos, Simon Moura, Elad Yom-Tov, Ingemar J Cox, Rachel McKendry, and Michael Edelstein. Tracking covid-19 using online search. arXiv preprint arXiv:2003.08086, 2020.
[18] Tina Lu and Ben Y Reis. Internet search patterns reveal clinical course of disease progression for covid-19 and predict pandemic spread in 32 countries. medRxiv, 2020.
[19] Thomas Louail, Maxime Lenormand, Oliva G Cantu Ros, Miguel Picornell, Ricardo Herranz, Enrique Frias-Martinez, José J Ramasco, and Marc Barthelemy. From mobile phone data to the spatial structure of cities. Scientific reports, 4:5276, 2014.
[20] Johns Hopkins University. Johns hopkins university coronavirus resource center, 2020. https://coronavirus.jhu.edu/us-map/.
[21] Marino Gatto, Enrico Bertuzzo, Lorenzo Mari, Stefano Miccoli, Luca Carraro, Renato Casagrandi, and Andrea Rinaldo. Spread and dynamics of the covid-19 epidemic in italy: Effects of emergency containment measures. Proceedings of the National Academy of Sciences, 117(19):10484–10491, 2020.
[22] Alberto Aleta, David Martin-Corral, Ana Pastore y Piontti, Marco Ajelli, Maria Litvinova, Matteo Chinazzi, Natalie E Dean, M Elizabeth Halloran, Ira M Longini Jr, Stefano Merler, et al. Modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave scenarios of the covid-19 epidemic. medRxiv, 2020.
[23] Giulia Giordano, Franco Blanchini, Raffaele Bruno, Patrizio Colaneri, Alessandro Di Filippo, Angela Di Matteo, and Marta Colaneri. Modelling the covid-19 epidemic and implementation of population-wide interventions in italy. Nature Medicine, pages 1–6, 2020.
[24] Quan-Hui Liu, Marco Ajelli, Alberto Aleta, Stefano Merler, Yamir Moreno, and Alessandro Vespignani. Measurability of the epidemic reproduction number in data-driven contact networks. Proceedings of the National Academy of Sciences, 115(50):12680–12685, 2018.
[25] Mark EJ Newman. Spread of epidemic disease on networks. Physical review E, 66(1):016128, 2002.
[26] Chao Fan, Sanghyeon Lee, Yang Yang, Bora Oztekin, Qingchun Li, and Ali Mostafavi. Effects of population co-location reduction on cross-county transmission risk of covid-19 in the united states. arXiv preprint arXiv:2006.01054, 2020.
[27] Juanjuan Zhang, Maria Litvinova, Yuxia Liang, Yan Wang, Wei Wang, Shanlu Zhao, Qianhui Wu, Stefano Merler, Cécile Viboud, Alessandro Vespignani, et al. Changes in contact patterns shape the dynamics of the covid-19 outbreak in china. Science, 2020.
[28] Juanjuan Zhang, Maria Litvinova, Wei Wang, Yan Wang, Xiaowei Deng, Xinghui Chen, Mei Li, Wen Zheng, Lan Yi, Xinhua Chen, et al. Evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside hubei province, china: a descriptive and modelling study.