A deep-learning model for evaluating and predicting the impact of lockdown policies on COVID-19 cases
Ahmed Ben Said, Abdelkarim Erradi, Hussein Ahmed Aly, Abdelmonem Mohamed
Computer Science & Engineering Department, College of Engineering, 2713, Doha, Qatar
{abensaid, erradi, ha1601589, am1604044}@qu.edu.qa
Abstract
To reduce the impact of the COVID-19 pandemic, most countries have implemented several counter-measures to control the spread of the virus, including school and border closing, shutting down public transport and workplaces, and restrictions on gatherings. In this research work, we propose a deep-learning prediction model for evaluating and predicting the impact of various lockdown policies on daily COVID-19 cases. This is achieved by first clustering countries having similar lockdown policies, then training a prediction model based on the daily cases of the countries in each cluster along with the data describing their lockdown policies. Once the model is trained, it can be used to evaluate several scenarios associated with lockdown policies and investigate their impact on the predicted COVID-19 cases. Our evaluation experiments, conducted on Qatar as a use case, show that the proposed approach achieves competitive prediction accuracy. Additionally, our findings highlight that lifting restrictions, particularly on schools and borders, would result in a significant increase in the number of cases during the study period.
Keywords: Deep-learning prediction model, Impact of lockdown policies, COVID-19, What-If Analysis
1. Introduction
A new coronavirus (COVID-19) [1, 2] emerged in Wuhan, the capital of Hubei province in central China, in December 2019. The World Health Organization classified the virus as a global pandemic on March 11, 2020.
Preprint submitted to Journal Name September 14, 2020

[...] The findings also showed that lockdown is indeed effective in locations with a high infection rate. Lifting the lockdown restrictions after May 17 would result in a sharp spike in the number of cases. Atalan [6] analyzed the effect of the number of lockdown days on the spread of COVID-19. The study showed evidence that the pandemic can be suppressed by effective lockdown measures. The author also argues that these measures are effective from psychological, environmental and economic perspectives. Vinceti et al. [7] analyzed mobility restrictions in the three most affected Italian regions, Lombardy, Veneto and Emilia-Romagna, from February 1 through April 6, 2020. Results showed that the daily number of cases is inversely correlated with mobility restriction after the second lockdown policy. The peak was observed 14 to 18 days after imposing the lockdown. A study of the epidemiological situation in the French region of Île-de-France is presented in [8]. The authors use a stochastic age-structured transmission model that includes age profiles and social contacts in the area of study. This model is used to evaluate the lockdown policy. The simulated scenarios are integrated by introducing changes to the contact matrices. The lockdown information is derived from mobility data provided by mobile phones. Simulation scenarios are established, including testing with different types and durations of social distancing. Results showed that, prior to lockdown, the estimated reproduction number is 3.18. It falls to 0.68 during lockdown thanks to an 81% reduction in the number of contacts.

Although great research efforts have been dedicated to investigating the impact of restriction measures on the spread of the pandemic, to the best of our knowledge, no reported work has investigated the impact of multiple restriction measures, e.g. school closing, restrictions on gatherings, workplace closing, travel restrictions and public transport shutdown, and how any change in these restrictions would affect the pandemic situation. In this work, we propose a deep-learning-based approach that enables stakeholders and decision makers to establish several scenarios related to restriction measures and investigate their impact. The proposed solution is versatile, in the sense that it is applicable to any country. In addition, the proposed method considers countries sharing similar restrictions and lockdown policies to establish the simulated scenarios and assess their implications, rather than analyzing each country separately.
2. What If analysis
Our objective is to propose a solution that assists decision makers in establishing multiple scenarios related to the restrictions imposed to control the spread of COVID-19. In the following, we present an overview of the proposed approach and its technical details.
[Figure 1 diagram blocks: Elbow Method, Optimal Number of Clusters, K-Means, Country Clusters, Daily Cases Prediction Model, Cases Prediction, What if]
Figure 1: What If analysis framework

• How fast the country reacted to the spread of the virus: this is assessed in terms of the difference in days between the day the first case is reported and the day each of the following restrictions is imposed: school closing, restriction on gatherings, border closing, public transport shutdown and workplace closing.
• Efficiency of the lockdown policy: we propose to assess the efficiency in terms of the reproduction number. This number is the main concern of all health authorities and decision makers. Indeed, a value greater than 1 indicates that the virus is spreading, while a value less than 1 indicates that the spread is being controlled. We propose to calculate this number for each country and average it every two weeks.

This set of extracted features is clustered in order to determine countries sharing similar lockdown and restriction policies. For this, we use the well-known K-Means algorithm, which requires the number of clusters as an input parameter. Hence, we use the Elbow method to determine the optimal number of clusters. For a given country we intend to analyze, we use its daily reported cases as well as the daily reported cases of the countries belonging to the same cluster. These data, combined with the lockdown measures, are used to train a deep learning model to predict the next-day number of COVID-19 cases. Once trained, this model is queried with data reflecting a lockdown scenario. In other words, we are "asking" the model to predict the next-day number of cases if, for example, schools are open and/or restrictions on gatherings are relaxed while other measures remain in effect. In the following, we present the technical details of the reproduction number calculation, the Elbow method, K-Means clustering and the deep-learning-based prediction model.
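The two feature groups listed above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the dictionary layout and all dates and R_t values are hypothetical, and the handling of a restriction that was never imposed is an assumption, since the text does not specify it.

```python
from datetime import date
from statistics import mean

# Restriction names follow the list in the text; the key spellings are hypothetical.
RESTRICTIONS = ["school", "gatherings", "borders", "public_transport", "workplace"]


def reaction_delays(first_case, imposed_on):
    """Days between the first reported case and each restriction.

    A restriction that was never imposed is returned as None (an assumption;
    the text does not say how such gaps are encoded before clustering).
    """
    return [
        (imposed_on[name] - first_case).days if imposed_on.get(name) else None
        for name in RESTRICTIONS
    ]


def biweekly_mean(r_t, period=14):
    """Average a daily R_t series over consecutive 14-day blocks.

    A trailing block shorter than `period` is averaged over the days it has.
    """
    return [mean(r_t[i:i + period]) for i in range(0, len(r_t), period)]


# Made-up country: first case on March 1, restrictions imposed over the next weeks.
first_case = date(2020, 3, 1)
imposed_on = {
    "school": date(2020, 3, 10),
    "gatherings": date(2020, 3, 15),
    "borders": date(2020, 3, 18),
    "public_transport": date(2020, 3, 21),
    "workplace": date(2020, 3, 17),
}
delays = reaction_delays(first_case, imposed_on)

# Made-up R_t series: two weeks above 1 followed by two weeks below 1.
r_series = [1.5] * 14 + [0.5] * 14
r_biweekly = biweekly_mean(r_series)

# One feature vector per country, ready to be clustered with the other countries.
features = delays + r_biweekly
```

Stacking one such vector per country gives the matrix that is passed to the clustering step.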
Bettencourt et al. [9] proposed a Bayesian approach to estimate the reproduction number $R_t$ in real time. This value depends on the value of the previous day, $R_{t-1}$, and on all the previous $m$ values up to $R_{t-m}$. Using Bayesian modelling, the belief about the true value of $R_t$ is updated based on how many cases are reported each day:

$$P(R_t \mid k_t) = \frac{P(k_t \mid R_t)\, P(R_t)}{P(k_t)} \tag{1}$$

where $P(R_t)$ is the prior belief about $R_t$ without data, $P(k_t \mid R_t)$ is the conditional probability of having $k_t$ cases given $R_t$, and $P(k_t)$ is the probability of having $k_t$ cases. Using the posterior of the previous day, $P(R_{t-1} \mid k_{t-1})$, as today's prior $P(R_t)$, we can write:

$$P(R_t \mid k_t) \propto P(k_t \mid R_t)\, P(R_{t-1} \mid k_{t-1}) \tag{2}$$

By iterating across all periods, we obtain:

$$P(R_t \mid k_t) \propto P(R_0) \prod_{t=0}^{T} P(k_t \mid R_t) \tag{3}$$

Assuming a uniform prior $P(R_0)$, we have:

$$P(R_t \mid k_t) \propto \prod_{t=0}^{T} P(k_t \mid R_t) \tag{4}$$

However, as emphasized by [9], for $R_t$ that remains greater than 1 for a long period and then becomes less than 1, the posterior gets stuck. In other words, the posterior cannot forget the long periods of time for which $R_t > 1$. In his study of the reproduction number for the United States, Kevin Systrom [10] suggested including in the posterior only the previous $m$ days. Hence, the prior is derived using information from the recent past rather than the entire history. Thus, we can write:

$$P(R_t \mid k_t) \propto \prod_{t=T-m}^{T} P(k_t \mid R_t) \tag{5}$$

The objective is to group and identify countries having similar lockdown policies. The intuitive idea is that COVID-19 would impact these countries in the same way. To achieve this goal, we apply the Elbow method to determine the optimal number of clusters. This optimal number is used as the input parameter of the K-Means clustering algorithm. In the following, we present the technical details of each step.

2.3.1. Elbow method
Let $X = \{x_1, x_2, ..., x_n\}$ be a set of $n$ $d$-dimensional points to be clustered into $K$ clusters, i.e. each $x_i$, $i = 1, ..., n$, is assigned to a cluster $c_k$, $k = 1, ..., K$. K-Means partitions the data by minimizing the squared error between the mean of a cluster and the data points that are members of the cluster. Let $m_k$ be the mean of cluster $c_k$. The squared error between a cluster center and its members is defined as:

$$J(c_k) = \sum_{x_i \in c_k} \| x_i - m_k \|^2 \tag{6}$$

K-Means seeks to minimize the sum of the squared errors:

$$J(C) = \sum_{k=1}^{K} \sum_{x_i \in c_k} \| x_i - m_k \|^2 \tag{7}$$

where $C$ is the set of clusters. To minimize Eq. 7, the following steps are applied:
1. Randomly assign $K$ cluster centers and repeat steps 2 and 3.
2. Assign each data point to the closest cluster center.
3. Calculate the new cluster centers.

The number of clusters $K$ is an input parameter of K-Means. Hence, we use the Elbow method to determine the optimal number of clusters for which the obtained partition is compact, i.e. has a low $J(C)$. Adding more clusters would result in an even more compact partition, which leads to over-fitting. Hence, the variation of $J(C)$ with respect to $K$ would first exhibit a fast decrease followed by a slow one. The Elbow method recommends selecting the number of clusters that corresponds to the elbow of the curve of $J(C)$ vs. $K$.

To conduct the 'What if' analysis for a specific country, the cluster to which the country belongs is identified. Then, the daily COVID-19 cases of the countries in this cluster are collected in addition to data related to lockdown measures. In the following, we detail the characteristics of the data and the prediction model trained on both daily COVID-19 and lockdown data.

2.4.1. Data description
The temporal characteristic is a critical component of the data, as data are collected on a daily basis. Hence, the data can be seen as a multivariate time series consisting of:
• Daily COVID-19 cases. These data are provided by government agencies and can be collected through several APIs. We collect data from February 15th to July 31st.
• School closing, where 0 indicates no measures are taken, 1 - recommend closing, 2 - require closing (only some levels or categories, e.g. just high school, or just public schools) and 3 - require closing all levels.
• Workplace closing, where 0 indicates no measures are taken, 1 - recommend closing (or recommend work from home), 2 - require closing (or work from home) for some sectors or categories of workers and 3 - require closing (or work from home) for all-but-essential workplaces (e.g. grocery stores, doctors).
• Restrictions on gatherings, where 0 indicates no restrictions are imposed, 1 - restrictions on very large gatherings (the limit is above 1000 people), 2 - restrictions on gatherings between 101-1000 people, 3 - restrictions on gatherings between 11-100 people and 4 - restrictions on gatherings of 10 people or less.
• Public transport shutdown, where 0 indicates no measures are taken, 1 - recommend closing (or significantly reduce volume/route/means of transport available) and 2 - require closing (or prohibit most citizens from using it).
• International travel controls, where 0 indicates no restrictions are taken, 1 - screening arrivals, 2 - quarantine arrivals from some or all regions, 3 - ban arrivals from some regions and 4 - ban on all regions or total border closure.
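To make the "multivariate time series" view concrete, the sketch below turns aligned daily series into supervised samples, where a window of past days is paired with the next-day case count. It is an illustration under assumptions: the function name `make_windows`, the window length of 7 days and the column order of the policy matrix are not given in the text, and the numbers are made up.

```python
import numpy as np


def make_windows(cases, policies, lookback=7):
    """Build (past window -> next-day cases) samples from aligned daily series.

    cases:    shape (T,), daily case counts
    policies: shape (T, 5), daily ordinal levels for school, workplace,
              gatherings, public transport and international travel
              (this column order is an assumption)
    lookback: window length in days (assumed value, not stated in the text)
    """
    cases = np.asarray(cases, dtype=float)
    policies = np.asarray(policies, dtype=float)
    n = len(cases) - lookback
    # Input of the daily-cases pathway: (samples, lookback, 1)
    x_cases = np.stack([cases[i:i + lookback] for i in range(n)])[..., None]
    # Input of the lockdown pathway: (samples, lookback, 5)
    x_policy = np.stack([policies[i:i + lookback] for i in range(n)])
    # Target: the number of cases on the day that follows each window
    y = cases[lookback:]
    return x_cases, x_policy, y


# Made-up data: 10 days of cases and constant policy levels.
toy_cases = np.arange(10.0)
toy_policies = np.tile([3, 2, 4, 1, 3], (10, 1))
x_cases, x_policy, y = make_windows(toy_cases, toy_policies, lookback=7)
```

Under this layout, a what-if scenario amounts to editing one column of `x_policy` (for example setting the school level to 0) before querying a trained model.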
Fig. 2 depicts the general architecture of the proposed daily COVID-19 forecasting model. It is characterized by two pathways: one pathway is dedicated to the daily COVID-19 time series data, while the other is fed with the multivariate time series of lockdown measures. The first pathway consists of a stack of two Bidirectional LSTM layers followed by a dense layer. The second pathway is a stack of two LSTM layers followed by a dense layer. On top of these pathways, a merge layer concatenates the outputs of the two dense layers. The merge layer is followed by two dense layers. The model outputs the prediction of the next-day number of cases.
[Figure 2 diagram: pathway 1: Bi-LSTM, Bi-LSTM, Dense; pathway 2: LSTM, LSTM, Dense; then Concatenate, Dense, Dense, Predicted COVID-19 cases]
Figure 2: Two-pathway model
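The layer layout of Fig. 2 could be written with the Keras functional API roughly as follows. This is a sketch under assumptions, not the authors' code: the paper does not report the number of units, the activations, the window length, the optimizer or the loss, so `units=32`, ReLU, `lookback=7`, Adam and mean squared error are placeholder choices, and the function name is hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_two_pathway_model(lookback=7, n_policies=5, units=32):
    # Pathway 1: daily-case series -> two Bi-LSTM layers -> dense layer
    cases_in = layers.Input(shape=(lookback, 1), name="daily_cases")
    x = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(cases_in)
    x = layers.Bidirectional(layers.LSTM(units))(x)
    x = layers.Dense(units, activation="relu")(x)

    # Pathway 2: lockdown-measure series -> two LSTM layers -> dense layer
    policy_in = layers.Input(shape=(lookback, n_policies), name="lockdown_measures")
    y = layers.LSTM(units, return_sequences=True)(policy_in)
    y = layers.LSTM(units)(y)
    y = layers.Dense(units, activation="relu")(y)

    # Merge layer: concatenate the two dense outputs, then two dense layers,
    # the last of which emits the next-day case prediction
    merged = layers.Concatenate()([x, y])
    z = layers.Dense(units, activation="relu")(merged)
    out = layers.Dense(1, name="next_day_cases")(z)

    model = tf.keras.Model(inputs=[cases_in, policy_in], outputs=out)
    model.compile(optimizer="adam", loss="mse")  # assumed training setup
    return model


model = build_two_pathway_model()
```

The first recurrent layer of each pathway returns the full sequence so that the second recurrent layer still receives one vector per day.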
In the following, we present the technical details of the LSTM and Bidirectional LSTM layers.
• LSTM layer: An LSTM layer consists of a sequence of LSTM cells, and the sequence data are fed in a forward way. The LSTM cell is depicted in Fig. 3. Given the current value $x_t$, the previous hidden state $h_{t-1}$ and the previous cell state $C_{t-1}$, the following transformations are applied:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{8}$$
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{9}$$
$$\hat{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{10}$$
$$C_t = f_t * C_{t-1} + i_t * \hat{C}_t \tag{11}$$
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{12}$$
$$h_t = o_t * \tanh(C_t) \tag{13}$$

where $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent functions respectively, $f_t$ is the forget gate, $i_t$ is the input gate and $o_t$ is the output gate. $W$ and $b$ are the weight matrices and bias vectors respectively, $[\cdot]$ is the concatenation operator and $*$ is the element-wise product.

Figure 3: Long Short-Term Memory (LSTM) cell

• Bidirectional LSTM (Bi-LSTM): Bi-LSTM includes another LSTM layer for which the data are fed in a backward way, as depicted in Fig. 4.
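Eqs. 8-13 translate almost line by line into NumPy. The sketch below is illustrative only: the function names, the dictionary layout of the parameters and the random initialization are assumptions, and the bidirectional helper simply concatenates the last hidden state of a forward pass with that of a pass over the reversed sequence.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell update following Eqs. 8-13.

    W and b are dicts keyed by "f", "i", "c", "o". Each W[k] has shape
    (hidden, hidden + n_in) and acts on the concatenation [h_prev, x_t].
    """
    hx = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ hx + b["f"])      # forget gate, Eq. 8
    i_t = sigmoid(W["i"] @ hx + b["i"])      # input gate, Eq. 9
    c_hat = np.tanh(W["c"] @ hx + b["c"])    # candidate state, Eq. 10
    c_t = f_t * c_prev + i_t * c_hat         # element-wise products, Eq. 11
    o_t = sigmoid(W["o"] @ hx + b["o"])      # output gate, Eq. 12
    h_t = o_t * np.tanh(c_t)                 # new hidden state, Eq. 13
    return h_t, c_t


def run_lstm(xs, W, b):
    """Feed a sequence forward through one cell and return the last hidden state."""
    hidden = b["f"].shape[0]
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, b)
    return h


def bi_lstm_last(xs, fwd, bwd):
    """Bi-LSTM summary: forward pass plus a pass over the reversed sequence."""
    return np.concatenate([run_lstm(xs, *fwd), run_lstm(xs[::-1], *bwd)])


def init_params(hidden, n_in, rng):
    # Small random weights and zero biases (illustrative initialization)
    W = {k: rng.normal(scale=0.1, size=(hidden, hidden + n_in)) for k in "fico"}
    b = {k: np.zeros(hidden) for k in "fico"}
    return W, b


rng = np.random.default_rng(0)
xs = rng.normal(size=(7, 1))  # made-up 7-day univariate sequence
summary = bi_lstm_last(xs, init_params(2, 1, rng), init_params(2, 1, rng))
```

The forward and backward passes use separate parameters, which is why the bidirectional output is twice as wide as the hidden state.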
3. Experimental results
In this section, we focus on Qatar as a use case. First, we assess the proposed method in terms of prediction accuracy, i.e. the error between the actual and predicted daily COVID-19 cases in Qatar. We compare our proposed solution with two other approaches: the same model trained without lockdown data, and a model trained on Qatar's daily COVID-19 cases only. Then, we investigate the model outcome while changing several input parameters associated with the lockdown measures.
[Figure 4 diagram: LSTM cells arranged in a Forward row and a Backward row]
Figure 4: Bidirectional Long Short-Term Memory (LSTM)
Fig. 5 illustrates the Elbow method results. Results show that the optimal number of clusters corresponds to K = 10, with a distortion score of 1324. This number is used as the input parameter of the K-Means algorithm. Clustering results show that Qatar belongs to a cluster with other countries including Azerbaijan, Benin, Bahrain, Georgia, Croatia, Indonesia, Italy, Kuwait, Lebanon, Mexico, Mozambique, Norway, Oman, Pakistan, Romania and Seychelles. These countries share similar characteristics in terms of how fast they reacted to the first reported cases and the evolution of the reproduction number $R_t$. They are characterized by an average $R_t$ = 1.[...]
[Figure 5 plot: distortion vs. K; legend: Distortion, Elbow at K = 10, Score = 1324]
Figure 5: Distortion score for different numbers of clusters. The elbow corresponds to K = 10.
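The elbow search reported here can be illustrated with a small NumPy version of K-Means that returns the distortion J(C) of Eq. 7. Everything in this sketch is an assumption made for illustration: the text does not say which implementation or elbow criterion was used, so a single random initialization and a discrete-curvature rule (largest second difference of the curve) stand in for them, and the data are synthetic.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centers, labels, J) with J as in Eq. 7."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as initial centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each point to its closest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 3: recompute centers (an empty cluster keeps its old center)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1), d2.min(axis=1).sum()


def elbow(ks, distortions):
    """Pick the K where the curve bends most (largest second difference).

    This rule is a stand-in for reading the elbow off the plot.
    """
    j = np.asarray(distortions, dtype=float)
    curvature = j[:-2] - 2.0 * j[1:-1] + j[2:]
    return ks[1 + int(curvature.argmax())]


# Synthetic 2-D feature vectors; a random start can end in a local minimum,
# so in practice several restarts per K are usual.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(loc=c, scale=0.2, size=(20, 2)) for c in (0.0, 5.0, 10.0)])
ks = list(range(1, 7))
distortions = [kmeans(X, k)[2] for k in ks]
best_k = elbow(ks, distortions)
```

With real country features, `X` would hold one row per country and the selected K would then be passed to a final K-Means run.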
Approach            RMSE    MAE
Proposed              28     21
Qatar Data only      129    108
No Lockdown Data     197    161
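The text does not spell out the two error measures. Assuming the standard definitions of root mean squared error and mean absolute error, they can be computed as in the sketch below; the case counts are made up and are not the Qatar data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted daily cases."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted daily cases."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


# Made-up four-day example: the errors are 3, -4, 0 and 5 cases.
actual = [200, 210, 190, 220]
predicted = [203, 206, 190, 225]
example_rmse = rmse(actual, predicted)  # sqrt((9 + 16 + 0 + 25) / 4) = sqrt(12.5)
example_mae = mae(actual, predicted)    # (3 + 4 + 0 + 5) / 4 = 3.0
```

RMSE penalizes large daily misses more heavily than MAE, which is why it is the larger of the two numbers in every row of the table.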
[Figure 6 plot: daily COVID-19 cases vs. day, Jul 31 to Sep 04; series: True Daily Cases, Proposed, Qatar Data only, No Lockdown Data]
Figure 6: True and predicted daily COVID-19 cases in Qatar for the three approaches

In this section, we use the proposed model as a baseline to assess Qatar's lockdown policy during the first week of September. The proposed model can be used to investigate the effect of changing the lockdown policy on the evolution of COVID-19 cases. On September 1st, Qatar entered the 4th phase of its lockdown policy, which consists of partially lifting restrictions on gatherings and partially opening public transport and educational institutions. Workplaces are open at 80% capacity. Travelers entering Qatar from low-risk countries are still required to self-quarantine at home for one week. For other countries, passengers must be quarantined at hotels for 15 days. In this section, we analyze how lifting all restrictions on schools, public transport, workplaces and borders would impact the number of cases in Qatar during the first week of September 2020. We establish multiple scenarios in which we hypothetically lift all restrictions on one sector while keeping all other sectors under their actual policy, and investigate the model outcome.

Fig. 7 illustrates how lifting all restrictions on schools and educational institutions would impact the daily cases in Qatar starting from September 1st. Results show that the proposed model predicts fluctuating numbers until September 3rd, followed by a continuous, alarming increase starting from September 4th. The actual number of cases is almost stable. During the period of this analysis, the policy for schools and educational institutions mandates a partial opening. Indeed, at Qatar University for example, most of the lectures are conducted online for this period.

[Figure 7 plot: daily COVID-19 cases vs. day, Aug 01 to Sep 12; series: True Daily COVID-19 Cases, Predicted Cases with Schools fully open]
Figure 7: Effect of lifting all restrictions on school in Qatar

Fig. 8 shows the evolution of the number of cases if all restrictions on public transport are lifted. We notice that the proposed model predicts a fluctuating number of cases for the one-week analysis period, without any significant impact.

[Figure 8 plot: daily COVID-19 cases vs. day, Aug 01 to Sep 12; series: True Daily COVID-19 Cases, Predicted Cases with Public Transport fully open]
Figure 8: Effect of lifting all restrictions on public transport in Qatar

We illustrate in Fig. 9 the effect of lifting all restrictions on gatherings. No specific pattern is detected, which suggests that fully lifting restrictions on gatherings would not significantly affect the daily cases.

[Figure 9 plot: daily COVID-19 cases vs. day, Aug 01 to Sep 12; series: True Daily COVID-19 Cases, Predicted Cases with no restriction on gathering]
Figure 9: Effect of lifting all restrictions on gathering in Qatar

Fig. 10 depicts the predicted number of cases if restrictions on workplaces are completely lifted. We notice a fluctuating number of cases close to the actual numbers, which suggests that lifting restrictions on workplaces while keeping all other restrictions would not affect COVID-19 cases.

[Figure 10 plot: daily COVID-19 cases vs. day, Aug 01 to Sep 12; series: True Daily COVID-19 Cases, Predicted Cases with Workplace fully open]
Figure 10: Effect of lifting all restrictions on workplace in Qatar

The effect of lifting all restrictions on borders is detailed in Fig. 11. Results show the dramatic implication of such a decision. Indeed, an exponential growth in the number of cases would occur, with more than 350 cases expected, a difference of more than 100 cases compared to the actual number. During the period of this analysis, passengers from countries identified as low risk are required to quarantine for one week, while passengers from other countries must quarantine at hotels for two weeks.

[Figure 11 plot: daily COVID-19 cases vs. day, Aug 01 to Sep 12; series: Actual Daily COVID-19 Cases, Predicted Cases with borders fully open]
Figure 11: Effect of lifting all restrictions on borders in Qatar

Our analysis has identified school, educational institution and border restrictions as key factors affecting the daily COVID-19 cases. When these restrictions are fully lifted, the proposed model predicts a sudden increase in the number of cases, indicating that taking such decisions would lead to dramatic consequences. The proposed model did not detect any significant change in the number of cases if restrictions are fully lifted on gatherings, workplaces and public transport.

4. Conclusion

We proposed a data-driven approach for predicting the daily COVID-19 cases which also allows testing several scenarios related to lockdown policy. The proposed model considers both lockdown information and the daily cases of countries that have similar lockdown policies and showed the same response to the outbreak of the virus. We focused in our experiments on Qatar as a use case and showed that the proposed model achieves better prediction by including lockdown information and training the model on data of countries with similar policies. Our analysis also showed that completely lifting restrictions on schools and borders would contribute to a sudden increase in the number of cases in Qatar.
Acknowledgment
This work was made possible by COVID-19 Rapid Response Call (RRC) grant
References
[1] Y. Yang, F. Peng, R. Wang, K. Guan, T. Jiang, G. Xu, J. Sun, C. Chang, The deadly coronaviruses: the 2003 SARS pandemic and the 2020 novel coronavirus epidemic in China, J. Autoimmun., 102434, 2020
[2] S. Chauhan, Comprehensive review of coronavirus disease 2019 (COVID-19), Biomedical Journal, In Press, 2020
[3] WHO, Situation report - 77 coronavirus disease 2019 (COVID-19), WWW Document, 2020
[4] Z. Sun, H. Zhang, Y. Yang, H. Wan, Y. Wang, Impacts of geographic factors and population density on the COVID-19 spreading under the lockdown policies of China, Science of the Total Environment, 746, 141347, 2020
[5] T. Sardar, Sk. S. Nadim, S. Rana, J. Chattopadhyay, Assessment of lockdown effect in some states and overall India: A predictive mathematical study on COVID-19 outbreak, Chaos, Solitons & Fractals, 139, 110078, 2020
[6] A. Atalan, Is the lockdown important to prevent the COVID-19 pandemic? Effects on psychology, environment and economy-perspective, Annals of Medicine and Surgery, 56, pp. 38-42, 2020
[7] M. Vinceti, T. Filippini, K. J. Rothman, F. Ferrari, A. Goffi, G. Maffeis, N. Orsini, Lockdown timing and efficacy in controlling COVID-19 using mobile phone tracking, EClinicalMedicine, 100457, 2020
[8] L. Di Domenico, G. Pullano, C. E. Sabbatini, P.-Y. Boëlle, V. Colizza, Impact of lockdown on COVID-19 epidemic in Île-de-France and possible exit strategies, medRxiv 2020.04.13.20063933
[9] L. M. A. Bettencourt, R. M. Ribeiro, Real time Bayesian estimation of the epidemic potential of emerging infectious diseases, PLoS ONE, 3, e2185, 2008
[10] https://rt.live/