A machine learning aided global diagnostic and comparative tool to assess effect of quarantine control in Covid-19 spread
Raj Dandekar, Chris Rackauckas, and George Barbastathis
Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Department of Applied Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Singapore-MIT Alliance for Research and Technology (SMART) Centre, Singapore 138602
July 28, 2020
Article Summary Line:
Data-driven epidemiological model to quantify and compare quarantine control policies in controlling COVID-19 spread in Europe, North America, South America and Asia.
Running Title:
Machine Learning aided quarantine model - Covid19.
Keywords:
COVID, Machine Learning, Epidemiology
We have developed a globally applicable diagnostic Covid-19 model by augmenting the classical SIR epidemiological model with a neural network module. Our model does not rely upon previous epidemics like SARS/MERS and all parameters are optimized via machine learning algorithms employed on publicly available Covid-19 data. The model decomposes the contributions to the infection timeseries to analyze and compare the role of quarantine control policies employed in highly affected regions of Europe, North America, South America and Asia in controlling the spread of the virus. For all continents considered, our results show a generally strong correlation between strengthening of the quarantine controls as learnt by the model and actions taken by the regions’ respective governments. Finally, we have hosted our quarantine diagnosis results for the top 70 affected countries worldwide on a public platform, which can be used for informed decision making by public health officials and researchers alike.
The Coronavirus respiratory disease 2019 originating from the virus “SARS-CoV-2” [1, 2] has led to a global pandemic, with more than 12 million confirmed global cases in more than 200 countries as of July 12, 2020. As the disease began to spread beyond its apparent origin in Wuhan, the responses of local and national governments varied considerably. The evolution of infections has been similarly diverse, in some cases appearing to be contained and in others reaching catastrophic proportions. In Hubei province itself, starting at the end of January, more than 10 million residents were quarantined by shutting down public transport systems, train and airport stations, and imposing police controls on pedestrian traffic. Subsequently, similar policies were applied nation-wide in China. By the end of March, the rate of infections was reportedly receding.

By the end of February 2020, the virus began to spread in Europe, with Italy employing extraordinary quarantine measures starting 11 March 2020. France enforced a lockdown beginning 17 March, followed later by the UK on 23 March, whereas no lockdown was enforced in Sweden. South Korea, Iran and Spain experienced acute initial increases, but then adopted drastic generalized quarantine. In the United States, the first infections were detected in Washington State as early as 20 January 2020, and it is now being reported that the virus had been circulating undetected in New York City as early as mid-February. Federal and state government responses were comparatively delayed and variable, with most states having declared stay-at-home orders by the end of March. In South America, Brazil, Chile and Peru are the most affected countries as of 12 July, and they employed differing quarantine policies. Brazil’s first case was reported in the last week of February and the country went into a state of partial quarantine on 24 March. Chile declared a state of catastrophe for 90 days in the first week of March, and the military was deployed to enforce quarantine measures. In Peru, a nationwide curfew was employed much later, on March 19.
Thus, affected countries around the world enforced differing quarantine strategies in an effort to mitigate the virus spread.

Given the available Covid-19 data for the infected case count by country and world-wide, it is seen that the infection growth curve also showed significantly diverse behaviour globally. In some countries, the infected case count peaked within a month and showed a subsequent decline, while in certain other countries, it was seen to increase for much longer before plateauing. In some of the highly affected countries, the infected count has not yet reached a plateau and the daily active cases continue to increase or remain stagnant as of 12 July 2020.

Given the observed spatially and temporally diverse government responses and outcomes, the role played by the varying quarantine measures in different countries in shaping the infection growth curve is still not clear. With Covid-19 data by country and world-wide now publicly available, there is an urgent need to use data-driven approaches to bridge this gap, and to quantitatively estimate and compare the role of the quarantine policy measures implemented in several countries in curtailing the spread of the disease.

As of this writing, more than 100 papers have been made available, mostly in preprint form. Existing models have one or more of the following limitations:
• Lack of independent estimation: Using parameters based on prior knowledge of SARS/MERS coronavirus epidemiology and not derived independently from the Covid-19 data, or fixing parameters like the rate of detection and the nature of government response prior to running the model.
• Lack of global applicability: Not implemented on a global scale.
• Lack of interpretability: Using several free/fitting parameters, making the model cumbersome and complicated for policy makers to reproduce and use.
In this paper, we propose a globally scalable, interpretable model with completely independent parameter estimation through a novel approach: augmenting a first principles-derived epidemiological model with a data-driven module, implemented as a neural network. We leverage this model to quantify the quarantine strengths and to analyze and compare the role of quarantine control policies employed to control the effective reproduction number of the virus in the European, North American, South American and Asian continents.

In a classical and commonly used model, known as SEIR, the population is divided into the susceptible S, exposed E, infected I and recovered R groups, and their relative growths and competition are represented as a set of coupled ordinary differential equations. The simpler SIR model does not account for the exposed population E. A major assumption of these models is that the rate of transitions between population states is fixed; as a result, they cannot capture the large-scale effects of more granular interactions, such as the population’s response to social distancing and quarantine policies. In our approach, we relax this assumption by estimating the time-dependent quarantine effect on virus exposure as a neural network that informs the infected variable I in the SIR model. This trained model thus decomposes the effects, and the neural network encodes information about the quarantine strength function in the locale where the model is trained.

Figure 1: (a) Schematic of the augmented QSIR model considered in the present study. (b) Schematic of the neural network architecture used to learn the quarantine strength function Q(t).

In general, neural networks with arbitrary activation functions are universal approximators. Unbounded activation functions in particular, such as the rectified linear unit (ReLU), have been known to be effective in approximating nonlinear functions with a finite set of parameters.
Thus, a neural network solution is attractive to approximate quarantine effects in combination with analytical epidemiological models. The downside is that the internal workings of a neural network are difficult to interpret. The recently emerging field of Scientific Machine Learning exploits conservation principles within a universal differential equation, SIR in our case, to mitigate overfitting and other related machine learning risks.

In the present work, the neural network is trained from publicly available infection and population data for Covid-19 for a specific region under study; details are in the Experimental Procedures section. Thus, our proposed model is globally applicable and interpretable with parameters learned from the current Covid-19 data, and does not rely upon data from previous epidemics like SARS/MERS.
The classic SIR epidemiological model is a standard tool for basic analysis concerning the outbreak of epidemics. In this model, the entire population is divided into three sub-populations: susceptible S; infected I; and recovered R. The sub-populations’ evolution is governed by the following system of three coupled nonlinear ordinary differential equations

$$\frac{dS(t)}{dt} = -\frac{\beta S(t) I(t)}{N} \tag{1}$$

$$\frac{dI(t)}{dt} = \frac{\beta S(t) I(t)}{N} - \gamma I(t) \tag{2}$$

$$\frac{dR(t)}{dt} = \gamma I(t). \tag{3}$$

Here, β and γ are the infection and recovery rates, respectively, and are assumed to be constant in time. The total population N = S(t) + I(t) + R(t) is seen to remain constant as well; that is, births and deaths are neglected. The recovered population is to be interpreted as those who can no longer infect others, so it also includes individuals deceased due to the infection. The possibility of recovered individuals becoming reinfected is accounted for by SEIS models, but we do not use this model here, as the reinfection rate for Covid-19 survivors is considered to be negligible as of now. The reproduction number R_t in the SEIR and SIR models is defined as

$$R_t = \frac{\beta}{\gamma}. \tag{4}$$

An important assumption of the SIR models is homogeneous mixing among the subpopulations. Therefore, this model cannot account for social distancing or social network effects. Additionally, the model assumes uniform susceptibility and disease progress for every individual, and that no spreading occurs through animals or other non-human means. Alternatively, the SIR model may be interpreted as quantifying the statistical expectations on the respective mean populations, while deviations from the model’s assumptions contribute to statistical fluctuations around the mean.
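As a minimal sketch of equations (1)-(4), the SIR system can be integrated numerically in a few lines of plain Python. The fixed-step Runge-Kutta integrator, the function names and the parameter values in the usage lines are assumptions chosen for illustration; they are not taken from the study.

```python
def sir_rhs(state, beta, gamma, n_total):
    """Right-hand side of equations (1)-(3)."""
    s, i, r = state
    infections = beta * s * i / n_total
    recoveries = gamma * i
    return (-infections, infections - recoveries, recoveries)


def rk4_step(state, dt, beta, gamma, n_total):
    """One classical fourth-order Runge-Kutta step (integrator is an assumption)."""
    def shifted(k, factor):
        return tuple(x + factor * dt * dx for x, dx in zip(state, k))

    k1 = sir_rhs(state, beta, gamma, n_total)
    k2 = sir_rhs(shifted(k1, 0.5), beta, gamma, n_total)
    k3 = sir_rhs(shifted(k2, 0.5), beta, gamma, n_total)
    k4 = sir_rhs(shifted(k3, 1.0), beta, gamma, n_total)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(beta, gamma, s0, i0, r0, days, steps_per_day=10):
    """Return the daily (S, I, R) states, starting with the initial state."""
    n_total = s0 + i0 + r0  # N = S + I + R stays constant
    dt = 1.0 / steps_per_day
    state = (s0, i0, r0)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma, n_total)
        trajectory.append(state)
    return trajectory


def reproduction_number(beta, gamma):
    """Equation (4): R_t = beta / gamma."""
    return beta / gamma


# Hypothetical parameter values, for illustration only.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=999_500.0, i0=500.0, r0=0.0, days=60)
r_t = reproduction_number(0.3, 0.1)
```

Because β and γ are constants here, the sketch cannot respond to a change in policy partway through the simulation, which is the limitation the augmented model addresses.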
To study the effect of quarantine control globally, we start with the SIR epidemiological model. Figure 1a shows the schematic of the modified SIR model, the QSIR model, which we consider. We augment the SIR model by introducing a time-varying quarantine strength rate term Q(t) and a quarantined population T(t), which is prevented from having any further contact with the susceptible population. Thus, the term I(t) denotes the infected population still having contact with the susceptibles, as in the standard SIR model, while the term T(t) denotes the infected population who are effectively quarantined and isolated. We can therefore write an expression for the quarantined infected population T(t) as

$$T(t) = Q(t) \times I(t) \tag{5}$$

Further, we introduce an additional recovery rate δ which quantifies the rate of recovery of the quarantined population. Based on the modified model, we define a Covid spread parameter, in a similar way to the reproduction number defined in the SIR model (4), as

$$C_p(t) = \frac{\beta}{\gamma + \delta + Q(t)}. \tag{6}$$

C_p > 1 indicates that the infection is still spreading (the "danger zone" referred to below), while C_p < 1 indicates that the spread has been brought under control. Since Q(t) does not follow from first principles and is highly dependent on local quarantine policies, we devised a neural network-based approach to approximate it.

Recently, it has been shown that neural networks can be used as function approximators to recover unknown constitutive relationships in a system of coupled ordinary differential equations [30, 32]. Following this principle, we represent Q(t) as an n-layer-deep neural network with weights W_1, W_2, ..., W_n, activation function r and the input vector U = (S(t), I(t), R(t)) as

$$Q(t) = r\left(W_n\, r\left(W_{n-1} \dots r\left(W_1 U\right)\right)\right) \equiv \mathrm{NN}(W, U) \tag{7}$$

For the implementation, we choose a network with a fixed number of layers n; Figure 1b shows the architecture. The governing equations of the QSIR model are then

$$\frac{dS(t)}{dt} = -\frac{\beta S(t) I(t)}{N} \tag{8}$$

$$\frac{dI(t)}{dt} = \frac{\beta S(t) I(t)}{N} - (\gamma + Q(t))\, I(t) = \frac{\beta S(t) I(t)}{N} - (\gamma + \mathrm{NN}(W, U))\, I(t) \tag{9}$$

$$\frac{dR(t)}{dt} = \gamma I(t) + \delta T(t) \tag{10}$$

$$\frac{dT(t)}{dt} = Q(t) I(t) - \delta T(t) = \mathrm{NN}(W, U)\, I(t) - \delta T(t). \tag{11}$$

More details about the model initialization and parameter estimation methods are given in the Experimental Procedures section.

In all cases considered below, we trained the model using data starting from the date when the 500th infection was recorded in each region and up to June 1, 2020. In each subsequent case study, Q(t) denotes the rate at which infected persons are effectively quarantined and isolated from the remaining population, and thus gives composite information about (a) the effective testing rate of the infected population as the disease progressed and (b) the intensity of the enforced quarantine as a function of time. To understand the nature of the evolution of Q(t), we look at the time point when Q(t) approximately shows an inflection point, or a ramp-up point. An inflection point in Q(t) indicates the time when the rate of increase of Q(t), i.e. dQ(t)/dt, was at its peak, while a ramp-up point corresponds to a sudden intensification of quarantine policies employed in the region under consideration.

We define the quarantine efficiency, Q_eff, as the increase in Q(t) within a month following the detection of the 500th infected case in the region under consideration. Thus

$$Q_{\text{eff}} = Q(30) - Q(1) \tag{12}$$

The magnitude of Q_eff shows how rapidly the infected individuals were prevented from coming into contact with the susceptibles in the first month following the detection of the 500th infected case, and thus contains composite information about the quarantine and lockdown strength and about the testing and tracing protocols used to identify and isolate infected individuals.

Figure 2 shows the comparison of the model-estimated infected and recovered case counts with actual Covid-19 data for the highest affected European countries as of 1 June 2020, namely: Russia, UK, Spain and Italy, in that order. We find that despite a small set of optimized parameters (note that the contact rate β and the recovery rate γ are fixed, and not functions of time), a reasonably good match is seen in all four cases.

Figure 3 shows the evolution of the neural network learnt quarantine strength function Q(t) for the considered European nations. Inflection points in Q(t) are seen for UK, Spain and Italy at 14, 10 and 16 days, respectively, post detection of the 500th case, i.e. on 23 March, 15 March and 14 March, respectively. This is in good agreement with the nationwide quarantine imposed on 25 March, 14 March and 9 March in UK, Spain and Italy, respectively [5, 33, 34].
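The QSIR system (8)-(11), the network form (7), the spread parameter (6) and the quarantine efficiency (12) can be sketched together in plain Python. Everything specific here is an assumption made for illustration: the hand-picked, untrained weights `WEIGHTS`, the scaling of the network input by the total population, the initial state, the rate values, the Runge-Kutta integrator, and the reading of "within a month" as day 1 to day 30. The study learns the weights from data; this sketch only shows how the pieces fit together.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def quarantine_strength(weights, u):
    """Equation (7) with r = ReLU and no bias terms: Q = r(W_n r(... r(W_1 U)))."""
    activations = list(u)
    for layer in weights:  # each layer is a list of weight rows
        activations = [
            relu(sum(w * a for w, a in zip(row, activations))) for row in layer
        ]
    return activations[0]


def q_of_state(state, weights, n_total):
    s, i, r, _ = state
    # Scaling the input by the population is an assumption of this sketch.
    return quarantine_strength(weights, (s / n_total, i / n_total, r / n_total))


def qsir_rhs(state, beta, gamma, delta, weights, n_total):
    s, i, r, t = state
    q = q_of_state(state, weights, n_total)
    infections = beta * s * i / n_total
    return (
        -infections,                   # equation (8)
        infections - (gamma + q) * i,  # equation (9)
        gamma * i + delta * t,         # equation (10)
        q * i - delta * t,             # equation (11)
    )


def rk4_step(state, dt, *params):
    def shifted(k, factor):
        return tuple(x + factor * dt * dx for x, dx in zip(state, k))

    k1 = qsir_rhs(state, *params)
    k2 = qsir_rhs(shifted(k1, 0.5), *params)
    k3 = qsir_rhs(shifted(k2, 0.5), *params)
    k4 = qsir_rhs(shifted(k3, 1.0), *params)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_qsir(beta, gamma, delta, weights, state0, days, steps_per_day=10):
    """Return daily (S, I, R, T) states and the daily values of Q."""
    n_total = sum(state0)  # total including T: an assumption of this sketch
    dt = 1.0 / steps_per_day
    state = tuple(state0)
    states = [state]
    q_series = [q_of_state(state, weights, n_total)]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma, delta, weights, n_total)
        states.append(state)
        q_series.append(q_of_state(state, weights, n_total))
    return states, q_series


def spread_parameter(beta, gamma, delta, q):
    """Equation (6): C_p = beta / (gamma + delta + Q)."""
    return beta / (gamma + delta + q)


def quarantine_efficiency(q_series, first_day=1, last_day=30):
    """Equation (12), reading 'within a month' as day 1 to day 30 (assumption)."""
    return q_series[last_day] - q_series[first_day]


# Hypothetical, untrained weights: hidden layer picks out R/N and I/N.
WEIGHTS = [[[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]], [[5.0, 2.0]]]
states, q_series = simulate_qsir(
    0.4, 0.05, 0.05, WEIGHTS, (999_490.0, 500.0, 0.0, 10.0), days=40
)
cp_series = [spread_parameter(0.4, 0.05, 0.05, q) for q in q_series]
q_eff = quarantine_efficiency(q_series)
```

The first day on which an entry of `cp_series` drops below 1 corresponds to the red-to-blue transition described in the figure captions.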
Figure 16a shows the comparison of the contact rate β, the quarantine efficiency as defined in the beginning of this subsection and the recovery rate γ. It should be noted that the contact and recovery rates are assumed to be constant in our model over the duration spanning the detection of the 500th infected case and June 1, 2020. The average contact rate in Spain and Italy is seen to be higher than in Russia and the UK over the considered duration. Although the social distancing strength also varied with time, we do not focus on that aspect in the present study; it will be the subject of future studies. A higher quarantine efficiency combined with a higher recovery rate led Spain and Italy to bring the Covid spread parameter C_p (defined in (6)) from > 1 to < 1 more quickly (25 days in the case of Italy) than the UK (32 days) and Russia (42 days) (figure 4).

Figure 2: COVID-19 infected and recovered evolution compared with our neural network augmented model prediction in the highest affected European countries as of June 1, 2020. Panels: (a) Russia, (b) UK, (c) Spain, (d) Italy. Horizontal axis: days post 500 infected.

Figure 3: Quarantine strength Q(t) learned by the neural network in the highest affected European countries as of June 1, 2020. The transition from the red to blue shaded region indicates the Covid spread parameter of value C_p < 1. The inflection point in the learnt Q(t) is denoted by the red dashed line. For regions in which a clear inflection or ramp-up point is not seen (Russia), the red dashed line is not shown. Panels: (a) Russia (nationwide stay at home imposed), (b) UK, (c) Spain, (d) Italy (government lockdown imposed). Axes: Q(t) against days post 500 infected.

Figure 4: Control of COVID-19 quantified by the Covid spread parameter evolution in the highest affected European countries as of June 1, 2020. The transition from the red to blue shaded region indicates C_p < 1. Panels: (a) Russia, (b) UK, (c) Spain, (d) Italy. Axes: C_p against days post 500 infected, with the line C_p = 1 marked.
Figure 5 shows Q_eff for the 23 highest affected European countries. We can see that Q_eff in the western European regions is generally higher than in eastern Europe. This can be attributed to the strong lockdown measures implemented in western countries like Spain, Italy, Germany and France after the rise of infections seen first in Italy and Spain. Although countries like Switzerland and Turkey did not enforce a lockdown as strict as that of their west European counterparts, they were generally successful in halting the infection count before it reached catastrophic proportions, due to strong testing and tracing protocols [37, 38]. As a result, these countries also managed to identify potentially infected individuals and prevent them from coming into contact with susceptibles, giving them a high Q_eff score as seen in figure 5. In contrast, our study also identifies countries like Sweden, which had very limited lockdown measures, through a low Q_eff score (figure 5). This strengthens the validity of our model in diagnosing the effectiveness of quarantine and isolation protocols in different countries, which agrees well with the actual protocols seen in these countries.

Figure 5: Quarantine efficiency Q_eff, defined in (12), for the 23 highest affected European countries. Note that Q_eff contains composite information about the quarantine and lockdown strength and about the testing and tracing protocols used to identify and isolate infected individuals. The map also shows the demarcation between countries with a high Q_eff (green dotted line) and those with a low Q_eff (red dotted line).
Figure 6: COVID-19 infected and recovered evolution compared with our neural network augmented model prediction in the highest affected USA states as of June 1, 2020. Panels: (a) New York, (b) New Jersey, (c) Illinois, (d) California. Horizontal axis: days post 500 infected. Curves: data and prediction for the infected and recovered counts.

Figure 7: Quarantine strength Q(t) learned by the neural network in the highest affected USA states as of June 1, 2020. The transition from the red to blue shaded region indicates the Covid spread parameter of value C_p < 1. The inflection or ramp-up point in the learnt Q(t) is denoted by the red dashed line. Panels: (a) New York, (b) New Jersey, (c) Illinois, (d) California, each marking when stay at home was imposed. Axes: Q(t) against days post 500 infected.

Figure 8: Control of COVID-19 quantified by the Covid spread parameter evolution in the highest affected USA states as of June 1, 2020. The transition from the red to blue shaded region indicates C_p < 1. Panels: (a) New York, (b) New Jersey, (c) Illinois, (d) California. Axes: C_p against days post 500 infected, with the line C_p = 1 marked.

USA

Figure 6 shows a reasonably good match between the model-estimated infected and recovered case counts and actual Covid-19 data for the highest affected North American states (including states from Mexico, the United States, and Canada) as of 1 June 2020, namely: New York, New Jersey, Illinois and California. Q(t) for New York and New Jersey shows a ramp-up point in the week immediately following the detection of the 500th case in these regions, i.e. on 19 March for New York and on 24 March for New Jersey (figure 7). This matches well with the actual dates, 22 March in New York and 21 March in New Jersey, when stay-at-home orders and isolation measures were enforced in these states. A relatively slower rise of Q(t) is seen for Illinois, while California shows a ramp-up a week after detection of the 500th case. Although no significant difference is seen in the mean contact and recovery rates between the different US states, the quarantine efficiency in New York and New Jersey is seen to be significantly higher than that of Illinois and California (figure 16b), indicating the effectiveness of the rapidly deployed quarantine interventions in New York and New Jersey.
Owing to the high quarantine efficiency in New York and New Jersey, these states were able to bring the Covid spread parameter C_p down to less than 1 in 19 days (figure 8). On the other hand, although Illinois and California came close to C_p = 1, C_p still remained greater than 1 (figure 8), indicating that these states were still in the danger zone as of June 1, 2020. An important caveat to this result is the reporting of the recovered data.

Compared with Europe, the recovery rates seen in North America are significantly lower (figures 16a,b). It should be noted that accurate reporting of recovery rates is likely to play a major role in this apparent difference. In our study, the recovered data include individuals who cannot further transmit infection, and thus include treated patients who are currently in a healthy state and also individuals who died due to the virus. Since deaths can be quantified in a robust manner, the death data are generally reported more accurately. However, there is no clear definition for quantifying the number of people who transitioned from infected to healthy. As a result, accurate and timely reporting of recovered data is seen to vary significantly between countries, with under-reporting of the recovered data being a common practice. Since the effective reproduction number calculation depends on the recovered case count, accurate data regarding the recovered count is vital to assess whether the infection has been curtailed in a particular region or not. Thus, our results strongly indicate the need for each country to follow a particular metric for estimating the recovered count robustly, which is vital for data-driven assessment of the pandemic spread.
Figure 9a shows the quarantine efficiency for 20 major US states spanning the whole country. Figure 9b shows the comparison between a report published in the Wall Street Journal on May 21 highlighting USA states based on their lockdown conditions, and the quarantine efficiency magnitude in our study. The size of the circles represents the magnitude of the quarantine efficiency. The blue color indicates the states for which the quarantine efficiency was greater than the mean quarantine efficiency across all US states, while red indicates the opposite. Our results indicate that the north-eastern and western states were much more responsive in implementing rapid quarantine measures in the month following early detection, as compared to the southern and central states. This matches the on-ground situation: a generally strong correlation is seen between the red circles in our study (states with lower quarantine efficiency) and the yellow regions in the Wall Street Journal report (states with reduced imposition of restrictions), and between the blue circles in our study (states with higher quarantine efficiency) and the blue regions in the Wall Street Journal report (states with a generally higher level of restrictions). This strengthens the validity of our approach, in which the quarantine efficiency is recovered through a trained neural network rooted in fundamental epidemiological equations.

Figure 9: (a) Quarantine efficiency Q_eff, defined in (12), for 20 major USA states (NY, NJ, MI, PA, FL, GA, CA, MA, TX, IL, MD, OK, UT, AZ, NE, WA, OH, OR, CO, SD). Note that Q_eff contains composite information about the quarantine and lockdown strength and about the testing and tracing protocols used to identify and isolate infected individuals. (b) Comparison between a report published in the Wall Street Journal on May 21 and the quarantine efficiency magnitude in our study. A generally strong correlation is seen between the magnitude of quarantine efficiency in our study and the level of restrictions actually imposed in different USA states.
Figure 10: COVID-19 infected and recovered evolution compared with our neural network augmented model prediction in the highest affected Asian countries as of June 1, 2020. Panels: (a) India, (b) China, (c) South Korea. Horizontal axis: days post 500 infected.

Figure 11: Quarantine strength Q(t) learnt by the neural network in the highest affected Asian countries as of June 1, 2020. The transition from the red to blue shaded region indicates the Covid spread parameter of value C_p < 1. The ramp-up point in the learnt Q(t) is denoted by the red dashed line. For regions in which a clear inflection or ramp-up point is not seen (India), the red dashed line is not shown. Panels: (a) India (second phase of government lockdown), (b) China (government lockdown imposed), (c) South Korea (widespread testing, isolation and tracing). Axes: Q(t) against days post 500 infected.

Figure 12: Control of COVID-19 quantified by the Covid spread parameter evolution in the highest affected Asian countries as of June 1, 2020. The transition from the red to blue shaded region indicates C_p < 1. Panels: (a) India, (b) China, (c) South Korea. Axes: C_p against days post 500 infected, with the line C_p = 1 marked.

Figure 10 shows a reasonably good match between the model-estimated infected and recovered case counts and actual Covid-19 data for the highest affected Asian countries as of 1 June 2020, namely: India, China and South Korea. Q(t) shows a rapid ramp-up in China and South Korea (figure 11), which agrees well with cusps in government interventions which took place in the weeks leading to and after the end of January and February for China and South Korea, respectively. On the other hand, a slow build-up of Q(t) is seen for India, with no significant ramp-up. This is reflected in the quarantine efficiency comparison (figure 16c), which is much higher for China and South Korea compared to India. South Korea shows a significantly lower contact rate than its Asian counterparts, indicating strongly enforced and followed social distancing protocols. No significant difference in the recovery rate is observed between the Asian countries. Owing to the high quarantine efficiency in China, and a high quarantine efficiency coupled with strongly enforced social distancing in South Korea, these countries were able to bring the Covid spread parameter C_p from > 1 to < 1 (figure 12).

Figure 13 shows a reasonably good match between the model-estimated infected and recovered case counts and actual Covid-19 data for the highest affected South American countries as of 1 June 2020, namely: Brazil, Chile and Peru. For Brazil, Q(t) is seen to be approximately constant and to stagnate (figure 14a).
The key difference between the Covid progression in Brazil compared to other nations isthat the infected and the recovered (recovered healthy + dead in our study) count is very close toone another as the disease progressed (figure 13). Owing to this, as the disease progressed, the newinfected people introduced in the population were balanced by the infected people removed fromthe population, either by being healthy or deceased. This higher recovery rate combined with agenerally low quarantine efficiency and contact rate (figure 16d) manifests itself in the Covid spreadparameter for Brazil to be < Q ( t ) is almost constant for the entire duration considered (figure 14b). Thus, althoughgovernment regulations were imposed swiftly following the initial detection of the virus, leading toa high initial magnitude of Q ( t ) , the government imposition became subsequently relaxed. This13
20 40 60
Days post 500 infected (a) Brazil
Days post 500 infected (b) Chile
Days post 500 infected (c) Peru
Figure 13: COVID-19 infected and recovered evolution compared with our neural network augmentedmodel prediction in the highest affected South American countries as of June 1, 2020.
Days post 500 infected Q ( t ) Quarantine strengthQuarantine imposed in big cities (a) Brazil
Days post 500 infected Q ( t ) Quarantine strengthNationwide curfew imposed (b) Chile
Days post 500 infected Q ( t ) Quarantine strengthNationwide quarantine announced (c) Peru
Figure 14: Quarantine strength Q ( t ) learnt by the neural network in the highest affected South Americancountries as of June 1, 2020. The transition from the red to blue shaded region indicates the Covid spreadparameter of value C p <
20 40 60
[Figure 15, panels (a) Brazil, (b) Chile, (c) Peru. Horizontal axis: days post 500 infected; vertical axis: $C_p$, with the line $C_p = 1$ marked.]

Figure 15: Control of COVID-19 quantified by the Covid spread parameter evolution in the highest affected South American countries as of June 1, 2020. The transition from the red to blue shaded region indicates $C_p < 1$ [...]

[...] may be attributed to several political and social factors outside the scope of the present study. Even for Chile, the infected and recovered counts remain close to each other compared to other nations. A generally high quarantine magnitude coupled with a moderate recovery rate (figure 16d) leads to $C_p$ being $< 1$ [...] $Q(t)$ shows a very slow build up (figure 14c) with a very low magnitude. Also, the recovered count is lower than the infected count compared to its South American counterparts (figure 13c). A low quarantine efficiency coupled with a low recovery rate (figure 16d) leads Peru to be in the danger zone ($C_p > 1$) for 48 days post detection of the 500th case (figure 15c).

Our model captures the infected and recovered counts for highly affected countries in Europe, North America, Asia and South America reasonably well, and is thus globally applicable. Along with capturing the evolution of infected and recovered data, the novel machine learning aided epidemiological approach allows us to extract valuable information regarding the quarantine policies, the evolution of the Covid spread parameter $C_p$, the mean contact rate (social distancing effectiveness), and the recovery rate. Thus, it becomes possible to compare across different countries, with the model serving as an important diagnostic tool.

Our results show a generally strong correlation between strengthening of the quarantine controls, i.e. increasing $Q(t)$ as learnt by the neural network model; actions taken by the regions' respective governments; and decrease of the Covid spread parameter $C_p$ for all continents considered in the present study.

Based on the Covid-19 data collected (details in the Materials and Methods section), we note that accurate and timely reporting of recovered data varies significantly between countries, with under-reporting of the recovered data being a common practice. In the North American countries, for example, the recovered counts are significantly lower than those of their European and Asian counterparts. Thus, our results strongly indicate the need for each country to follow a particular metric for estimating the recovered count robustly, which is vital for data driven assessment of the pandemic spread.

[Figure 16, panels (a) Europe: 1. Russia, 2. UK, 3. Spain, 4. Italy; (b) USA: 1. NY, 2. NJ, 3. Illinois, 4. CA; (c) Asia: 1. India, 2. China, 3. Korea; (d) South America: 1. Brazil, 2. Chile, 3. Peru. Quantities shown: contact rate, quarantine efficiency $Q_{\text{eff}}$, recovery rate.]

Figure 16: Global comparison of infection, recovery rates and quarantine efficiency.

[...] This could be the subject of future studies.
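The "danger zone" reading of the Covid spread parameter in the discussion of Figure 15 (a region is in the danger zone while $C_p > 1$) amounts to counting the days before the series first falls to or below 1. The following is a minimal sketch of that count. The function name and the sample values are hypothetical and are not taken from the authors' code or results.

```python
def days_in_danger_zone(cp_series, threshold=1.0):
    """Count the leading days on which C_p stays above the threshold.

    cp_series holds one C_p value per day, starting at the day the
    500th case was detected. Returns the index of the first day with
    C_p <= threshold, or the series length if that never happens.
    """
    for day, cp in enumerate(cp_series):
        if cp <= threshold:
            return day
    return len(cp_series)


# Hypothetical daily C_p values, for illustration only.
example_cp = [1.8, 1.4, 1.1, 0.9, 0.7]
print(days_in_danger_zone(example_cp))
```

A region whose series never crosses the threshold returns the full series length, which makes it easy to tell apart from a region that crossed on the final day.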
The starting point $t = 0$ [...] i.e. $I \approx$ [...] $t =$ [...] $T(t)$ is initialized to a small number $T(t = 0) \approx$ [...]

The time resolved data for the infected, $I_{\text{data}}$, and recovered, $R_{\text{data}}$, for each locale considered is obtained from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The neural network-augmented SIR ODE system was trained by minimizing the mean square error loss function

$$L_{NN}(W, \beta, \gamma, \delta) = \left\| \log(I(t) + T(t)) - \log(I_{\text{data}}(t)) \right\|^2 + \left\| \log(R(t)) - \log(R_{\text{data}}(t)) \right\|^2 \qquad (13)$$

that includes the neural network's weights $W$. For most of the regions under consideration, $W, \beta, \gamma, \delta$ were optimized by minimizing the loss function given in (13). Minimization was performed using local adjoint sensitivity analysis [32, 46], following a similar procedure outlined in a recent study, with the ADAM optimizer with a learning rate of 0.01. The iterations required for convergence varied based on the region considered and generally ranged from 40,[...] to [...].

[...] $W, \beta, \gamma, \delta$. In the first stage, (13) was minimized. For the second stage, we fix the optimal $\gamma, \delta$ found in the first stage and optimize the remaining parameters $W, \beta$ based on the loss function defined just on the infected count, $L(W, \beta) = \left\| \log(I(t) + T(t)) - \log(I_{\text{data}}(t)) \right\|^2$. In the second stage, we do not include the recovered count $R(t)$ in the loss function, since $R(t)$ depends on $\gamma, \delta$, which have already been optimized in the first stage. By placing more emphasis on fitting the infected count, such a two stage procedure leads to much more accurate model estimates when the recovered data count is low. The iterations required for convergence in both stages varied based on the region considered and generally ranged from 30,[...] to [...]. [...] $< 1\%$ for all regions considered.

Preliminary versions of this work can be found at medRxiv 2020.04.03.20052084 and arXiv:2004.02752.

Data for the infected and recovered case count in all regions was obtained from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. All code files are available at https://github.com/RajDandekar/MIT-Global-COVID-Modelling-Project-1. All results are publicly hosted at https://covid19ml.org/ or https://rajdandekar.github.io/COVID-QuarantineStrength/.
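To make the structure of the loss in (13) and of the second-stage loss concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation, which uses the Julia tooling cited in [32]. The function names and sample series are hypothetical, and averaging over time points is an assumption made here to match the "mean square error" wording; it only rescales the loss.

```python
import math


def log_mse(pred, data):
    """Mean squared difference between the logs of two positive series."""
    if len(pred) != len(data) or not pred:
        raise ValueError("series must be non-empty and of equal length")
    return sum((math.log(p) - math.log(d)) ** 2 for p, d in zip(pred, data)) / len(pred)


def stage_one_loss(I, T, R, I_data, R_data):
    """Loss in the form of (13): an infected term plus a recovered term.

    The infected term compares I(t) + T(t) with the reported infected data.
    """
    infected_plus_quarantined = [i + t for i, t in zip(I, T)]
    return log_mse(infected_plus_quarantined, I_data) + log_mse(R, R_data)


def stage_two_loss(I, T, I_data):
    """Second-stage loss: infected term only, used with gamma and delta held fixed."""
    return log_mse([i + t for i, t in zip(I, T)], I_data)


# Hypothetical three-day series, for illustration only.
I_model, T_model, R_model = [400.0, 520.0, 650.0], [100.0, 130.0, 170.0], [40.0, 60.0, 90.0]
I_obs, R_obs = [500.0, 640.0, 830.0], [45.0, 58.0, 88.0]
print(stage_one_loss(I_model, T_model, R_model, I_obs, R_obs))
print(stage_two_loss(I_model, T_model, I_obs))
```

Working in log space keeps the early, small case counts from being swamped by the later, much larger ones. Dropping the recovered term in the second stage leaves the optimizer free to fit the infected curve alone.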
This effort was partially funded by the Intelligence Advanced Research Projects Activity (IARPA). We are grateful to Emma Wang for help with some of the simulations, and to Haluk Akay, Hyungseok Kim and Wujie Wang for helpful discussions and suggestions.
The authors declare no conflicts of interest.
References

[1] Chan, J. F.-W, Yuan, S, Kok, K.-H, To, K. K.-W, Chu, H, Yang, J, Xing, F, Liu, J, Yip, C. C.-Y, Poon, R. W.-S, et al. (2020) A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. The Lancet 395, 514-523.
[2] CDC. (2020) Coronavirus Disease 2019 (COVID-19) Situation Summary, 3 March 2020.
[3] WHO. (2020) Coronavirus disease 2019 (COVID-19) Situation Report - 174, 12 July 2020.
[4] Cyranoski, D. (2020) What china's coronavirus response can teach the rest of the world. Nature.
[5] Gibney, E. (2020) Whose coronavirus strategy worked best? Scientists hunt most effective policies. Nature News.
[6] Holshue, M. L, DeBolt, C, Lindquist, S, Lofy, K. H, Wiesman, J, Bruce, H, Spitters, C, Ericson, K, Wilkerson, S, Tural, A, et al. (2020) First case of 2019 novel coronavirus in the united states. New England Journal of Medicine.
[7] Carey, B & Glanz, J. (2020) Hidden Outbreaks Spread Through U.S. Cities Far Earlier Than Americans Knew, Estimates Say. The New York Times.
[8] Report. (2020) Coronavirus in Latin America: What governments are doing to stop the spread. Global Americans (https://theglobalamericans.org/2020/03/coronavirus-in-latin-america/).
[9] Bertsimas, D, Bandi, H, Boussioux, L, Cory-Wright, R, Delarue, A, Digalakis, V, Gilmour, S, Graham, J, Kim, A, Lahlou Kitane, D, Lin, Z, Lukin, G, Li, M, Mingardi, L, Na, L, Orfanoudaki, A, Papalexopoulos, T, Paskov, I, Pauphilet, J, Skali Lami, M, Sobiesk, B, Stellato, B, Carballo, Y, Wang, H, Wiberg, C, & Zeng, C. (2020) An aggregated dataset of clinical outcomes for covid-19 patients. covid analytics.
[10] Chinazzi, M, Davis, J. T, Ajelli, M, Gioannini, C, Litvinova, M, Merler, S, y Piontti, A. P, Mu, K, Rossi, L, Sun, K, et al. (2020) The effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak. Science.
[11] Li, M. L, Bouardi, H. T, Lami, O. S, Trikalinos, T. A, Trichakis, N. K, & Bertsimas, D. (2020) Forecasting covid-19 and analyzing the effect of government interventions (https://doi.org/10.1101/2020.06.23.20138693).
[12] Kraemer, M. U, Yang, C.-H, Gutierrez, B, Wu, C.-H, Klein, B, Pigott, D. M, Du Plessis, L, Faria, N. R, Li, R, Hanage, W. P, et al. (2020) The effect of human mobility and control measures on the covid-19 epidemic in china. Science 368, 493-497.
[13] Ferguson, N, Laydon, D, Nedjati Gilani, G, Imai, N, Ainslie, K, Baguelin, M, Bhatia, S, Boonyasiri, A, Cucunuba Perez, Z, Cuomo-Dannenburg, G, et al. (2020) Impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand. Imperial College London.
[14] Imai, N, Cori, A, Dorigatti, I, Baguelin, M, Donnelly, C. A, Riley, S, & Ferguson, N. M. (2020) Report 3: transmissibility of 2019-nCov. Imperial College London.
[15] Read, J. M, Bridgen, J. R, Cummings, D. A, Ho, A, & Jewell, C. P. (2020) Novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions.
[16] Tang, B, Wang, X, Li, Q, Bragazzi, N. L, Tang, S, Xiao, Y, & Wu, J. (2020) Estimation of the transmission risk of the 2019-nCov and its implication for public health interventions. Journal of Clinical Medicine 9, 462.
[17] Li, Q, Guan, X, Wu, P, Wang, X, Zhou, L, Tong, Y, Ren, R, Leung, K. S, Lau, E. H, Wong, J. Y, et al. (2020) Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. New England Journal of Medicine.
[18] Wu, J. T, Leung, K, & Leung, G. M. (2020) Nowcasting and forecasting the potential domestic and international spread of the 2019-nCov outbreak originating in Wuhan, China: a modelling study. The Lancet 395, 689-697.
[19] Kucharski, A. J, Russell, T. W, Diamond, C, Liu, Y, Edmunds, J, Funk, S, Eggo, R. M, Sun, F, Jit, M, Munday, J. D, et al. (2020) Early dynamics of transmission and control of covid-19: a mathematical modelling study. The Lancet Infectious Diseases.
[20] Fang, H, Chen, J, & Hu, J. (2006) Modelling the sars epidemic by a lattice-based monte-carlo simulation. IEEE 27, 7470-7473.
[21] Saito, M. M, Imoto, S, Yamaguchi, R, Sato, H, Nakada, H, Kami, M, Miyano, S, & Higuchi, T. (2013) Extension and verification of the seir model on the 2009 influenza a (h1n1) pandemic in japan. Mathematical biosciences 246, 47-54.
[22] Smirnova, A, deCamp, L, & Chowell, G. (2019) Forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the seir model. Bulletin of mathematical biology 81, 4343-4365.
[23] Cybenko, G. (1989) Approximations by superpositions of sigmoidal functions. Mathematics of Control, Signals and Systems 2, 303-314.
[24] Hornik, K. (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks.
[25] Sonoda, S & Murata, N. (2017) Neural network with unbounded activation functions is universal approximator. Appl. Comp. Harmonic Anal. 43, 233-268.
[26] Glorot, X, Bordes, A, & Bengio, Y. (2011) Deep sparse rectifier neural networks. Proc. 14th International Conference on Artificial Intelligence and Statistics, 315-323.
[27] Goodfellow, I, Warde-Farley, D, Mirza, M, Courville, A, & Bengio, Y. (2013) Maxout networks. 30th int. conf. mach. learn., 1319-1327.
[28] Dahl, G. E, Sainath, T. N, & Hinton, G. E. (2013) Improving deep neural networks for LVCSR using rectified linear units and dropout. IEEE Acoustics, Speech and Signal Processing, 8609-8613.
[29] Baker, N, Alexander, F, Bremer, T, Hagberg, A, Kevrekidis, Y, Najm, H, Parashar, M, Patra, A, Sethian, J, Wild, S, et al. (2019) Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence. USDOE Washington.
[30] Rackauckas, C, Ma, Y, Martensen, J, Warner, C, Zubov, K, Supekar, R, Skinner, D, & Ramadhan, A. (2020) Universal Differential Equations for Scientific Machine Learning (arXiv:2001.04385).
[31] Mukhopadhyay, B & Bhattacharyya, R. (2008) Analysis of a spatially extended nonlinear seis epidemic model with distinct incidence for exposed and infectives. Nonlinear Analysis: Real World Applications 9, 585-598.
[32] Rackauckas, C, Innes, M, Ma, Y, Bettencourt, J, White, L, & Dixit, V. (2019) Diffeqflux.jl - A Julia Library for Neural Differential Equations. CoRR abs/1902.02376 (http://arxiv.org/abs/1902.02376).
[33] Jones, S. (2020) Spain orders nationwide lockdown to battle coronavirus. The Guardian, March 14.
[34] Harlan, C & Pitrelli, S. (2020) Italy extends coronavirus lockdown to entire country, imposing restrictions on 60 million people. The Washington Post.
[35] Helm, T, Graham-Harrison, E, & Mckie, R. (2020) How did Britain get its coronavirus response so wrong? Guardian.
[36] DW. (2020) Coronavirus: What are the lockdown measures across europe? DW.
[37] O'Dea, C. (2020) What Switzerland did right in the battle against coronavirus. MarketWatch, June 15.
[38] Guerin, O. (2020) Coronavirus: How Turkey took control of covid-19 emergency. BBC News.
[39] Goodman, S. (2020) Sweden Has Become the World's Cautionary Tale. The New York Times.
[40] Maxouris, C. (2020) These states have some of the most drastic restrictions to combat the spread of coronavirus. CNN.
[41] Gershman, J. (2020) A Guide to State Coronavirus Reopenings and Lockdowns. The Wall Street Journal, May 21.
[42] Normille, D. (2020) Coronavirus cases have dropped sharply in south korea. what's the secret to its success? science.
[43] Thompson, D. (2020) What's Behind South Korea's COVID-19 Exceptionalism? The Atlantic.
[44] Romo, R. (2020) Politics and poverty hinder Covid-19 response in Latin America. CNN, May 29.
[45] Apple. (2020) Mobility trend report.
[46] Cao, Y, Li, S, Petzold, L, & Serban, R. (2003) Adjoint sensitivity analysis for differential-algebraic equations: The adjoint dae system and its numerical solution. SIAM journal on scientific computing 24, 1076-1089.
[47] Kingma, D. P & Ba, J. (2014) Adam: A method for stochastic optimization (arXiv preprint arXiv:1412.6980).