Mobility-based prediction of SARS-CoV-2 spreading
Lorenzo Chicchi, Lorenzo Giambagli, Lorenzo Buffoni, Duccio Fanelli
Lorenzo Chicchi (a,b,c), Lorenzo Giambagli (a,b,c), Lorenzo Buffoni (a,b), Duccio Fanelli (a,b,c)
(a) Department of Physics and Astronomy, University of Florence, Sesto Fiorentino, Florence, Italy
(b) CSDC, University of Florence, Sesto Fiorentino, Florence, Italy
(c) INFN Sezione di Firenze, Sesto Fiorentino, Florence, Italy
Abstract
The rapid spreading of SARS-CoV-2 and its dramatic consequences are forcing policymakers to take strict measures in order to keep the population safe. At the same time, societal and economic interactions are to be safeguarded. A wide spectrum of containment measures has hence been devised and implemented, in different countries and at different stages of the pandemic evolution. Mobility towards workplaces or retail venues, public transit usage and permanence in residential areas constitute reliable tools to indirectly capture the actual grade of the imposed containment protocols. In this paper, taking Italy as an example, we develop and test a deep learning model which can forecast various spreading scenarios based on different mobility indices, at a regional level. We show that containment measures contribute to “flatten the curve” and quantify the minimum time frame necessary for the imposed restrictions to result in a perceptible impact, depending on their associated grade.
Keywords:
LSTM, COVID-19, Mobility, Deep Learning
1. Introduction
Machine Learning (ML) [1, 2] has been extensively employed in the context of time series modeling and forecasting [3]. Groundbreaking applications in natural language processing [4], financial forecasting [5] and speech recognition [6] have earned this particular subfield of ML considerable investment and attention. Notably, the use of Deep Neural Networks [7, 8], as opposed to traditional approaches to time series analysis, enables the algorithm itself to learn from the data the relevant variables and their associated correlations. Following the rapid spreading of SARS-CoV-2, numerous attempts have been made to predict the time evolution of epidemics across different spatial scales [9, 10, 11]. To this end, ML techniques have also been employed [12, 13, 14]. Although very accurate and useful, these models often lack the ability to incorporate the effects of containment measures as implemented by local governments, and solely rely on selected epidemiological variables (e.g. number of tests performed, number of deaths) to predict the spreading of the virus. The putative impact of different containment strategies as devised by local governments is hence customarily modeled by resorting to standard epidemiological tools [15, 16], a choice which potentially limits the predictive ability of the trained ML devices. Starting from these premises, we suggest that mobility indices provide solid, almost real-time, indicators of the implemented containment strategies. When included in the training, they are processed as key information for the forecasts of the ML algorithm. A self-consistent argument allows us in turn to estimate the time it takes for the imposed mobility restrictions to materialize in an effective drop of the curve of infected individuals.

Preprint submitted to Journal Name, February 17, 2021 (arXiv, cond-mat.dis-nn).

In the following, we will describe the adopted machine learning approach, which is tailored to predicting the SARS-CoV-2 epidemic evolution in the twenty regions of Italy. The model is trained by using the time series of selected epidemic quantities (number of infections, number of deaths, etc.) and includes information on the population mobility. We will show that, by looking at epidemic and mobility trends during the $n_p$ past days, the model is able to return meaningful information on the values of a target epidemiological parameter in the next $n_f$ days.
Working in the proposed framework, we are also able to estimate the time needed for the imposed restrictions to yield consequences that can be appreciated at the scale of the whole community, in terms of a reduction of hospitalized individuals. To this end, we consider different grades of imposed restrictions on individual mobility, ranging from a complete, nationwide lockdown to milder, regional-level restrictions, to virtually no restrictions at all.

The twenty regions of Italy are: Valle d’Aosta, Piemonte, Lombardia, Trentino-Alto Adige, Veneto, Friuli Venezia Giulia, Liguria, Emilia Romagna, Toscana, Marche, Umbria, Lazio, Abruzzo, Molise, Campania, Apulia, Basilicata, Calabria, Sicilia and Sardegna.

2. Methods

We worked with Recurrent Neural Networks (RNN) [17], a class of deep learning architectures widely used in time series modeling. Such an architecture is designed to be sensitive to the ordering of the elements in the input sequence [17]. This is achieved by introducing an inner state vector that is updated by the network itself during each successive iteration. This vector allows the network to “keep memory” of the past input values. RNNs suffer from the so-called vanishing gradient problem: the gradients computed at later steps of the sequence fade away quickly in the backpropagation process, without reaching earlier input signals, thus making it hard for the RNN to learn and correctly incorporate long-range dependencies [17]. To counter this problem, gating-based architectures, such as the Long Short-Term Memory (LSTM), have been proposed [18]. Trainable vectors, called gates, are accommodated in the architecture and control the inner state update at each iteration. This technical solution makes it possible for the network to “forget” or “store” the novel bits of information that are processed at each time step, along the sequence of collected events.
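To make the role of the gates concrete, the following minimal sketch implements one time step of a single-unit LSTM cell in its commonly used formulation (forget, input and output gates plus a candidate update). It is an illustration only, not the network used in this work: the function name `lstm_step`, the scalar input and the hand-picked weights are assumptions chosen so that the "store" and "forget" regimes are easy to see.

```python
import math


def sigmoid(z):
    """Logistic function, squashes a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with scalar input.

    params maps each gate name ("f", "i", "o", "g") to a tuple
    (w, u, b): input weight, recurrent weight and bias (all hypothetical).
    """
    def pre(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("f"))      # forget gate: how much old memory is kept
    i = sigmoid(pre("i"))      # input gate: how much new content is written
    o = sigmoid(pre("o"))      # output gate: how much memory is exposed
    g = math.tanh(pre("g"))    # candidate content computed from the input
    c = f * c_prev + i * g     # updated inner (cell) state
    h = o * math.tanh(c)       # updated hidden state
    return h, c


# Gates saturated so that the cell "stores": old memory kept, new input ignored.
store = {"f": (0.0, 0.0, 20.0), "i": (0.0, 0.0, -20.0),
         "o": (0.0, 0.0, 0.0), "g": (1.0, 0.0, 0.0)}
# Gates saturated so that the cell "forgets": old memory erased, input written.
forget = {"f": (0.0, 0.0, -20.0), "i": (0.0, 0.0, 20.0),
          "o": (0.0, 0.0, 0.0), "g": (1.0, 0.0, 0.0)}

_, c_store = lstm_step(x=0.5, h_prev=0.0, c_prev=0.7, params=store)
_, c_forget = lstm_step(x=0.5, h_prev=0.0, c_prev=0.7, params=forget)
```

With the first parameter set the cell state stays essentially at its previous value 0.7, while with the second it is overwritten by the candidate value tanh(0.5). In a trained network the gate values are learned and typically lie between these two extremes.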
In this way, early information deemed crucial for handling the forecasting task can be stored in the bulk, while recent inputs identified as unessential are safely removed from the memory kernel. This is precisely the reason why we have decided to employ an LSTM-like architecture for the problem at hand. In the following, we shall operate with a deep architecture composed of two LSTM hidden layers of 300 and 20 nodes, respectively. Moreover, an additional dense layer is introduced to produce the sought output. Further, use is made of the Adam [19] optimizer with a learning rate of 0 .

The dataset consists of discrete daily series of length $T$ of selected epidemic and mobility parameters for each of the 20 regions in Italy. More specifically, we focus on the following quantities: (i) number of patients in intensive care, (ii) number of hospitalized patients, (iii) number of patients in home isolation, (iv) number of deaths. Data from the COVID-19 Community Mobility Reports of Google [20] are employed to track the change in time of the degree of mobility, as associated to different regions of Italy. We calculate in particular the evolution, as percentage change from baseline values, of the reported mobility indexes in the following areas:

1. retail and recreation
2. grocery and pharmacy
3. parks
4. transit stations
5. workplaces
6. residential

In Fig. 1 the evolution of the reference mobility indicators is displayed for the case of Lombardy. The impact of the imposed restrictions on the mobility indexes can be clearly appreciated by visual inspection of the depicted global trends. It is hence surmised that the aforementioned mobility indicators provide a faithful barometer to gauge the actual impact of the imposed containment measures. As such, they can be accounted for when training the LSTM to forecast the future evolution of the epidemic.

Combining all these together, for each day of the time series, we access 10

Figure 1: Time evolution of mobility parameters in Lombardy.
A seven-day moving average was applied to filter out the weekly fluctuations and highlight the global trend. The base value is defined as the average value in the five weeks between 3rd January and 6th February 2020 for the considered weekday, as explained in [20].

$n_p$ days. Epidemic and mobility data sum up to a total of 10 scalar parameters per day of acquisition. The output target vector has length $n_f$, the time horizon of the prediction. More specifically, each entry of the output vector returns a prediction for the number of patients in intensive care (IC) units, at the day of forecast, up to $n_f$ days in the future. A schematic representation of the data structure and processing is provided in Fig. 2. The mobile window is made to slide along the scrutinised time series, day after day. For each position of the window, the information stemming from the $n_p$ preceding days (including the current day of observation) is acquired and compared with the desired output, the number of occupied IC units in the future $n_f$ days. During the training phase, this information is used to adjust the weights of the LSTM. When properly trained, this device is used for forecasting purposes by letting the sliding window explore a portion of the time series not supplied during the learning phase. The computing apparatus is fed with the needed input information referred to the past $n_p$ days (including the day of elaboration) to anticipate the future (the following $n_f$ days) in terms of expected COVID-19 patients necessitating IC units.

Summing up, the training data set is made of $20(T - n_p - n_f)$ examples (that is, input-target pairs), where the factor 20 stems from the number of considered regions.
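The sliding-window construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function names `make_windows` and `build_dataset`, the ordering of the 10 daily features and the choice `target_index=0` for the IC column are hypothetical. Note that using every admissible window position gives $T - n_p - n_f + 1$ pairs per region, one more than the count quoted above; the difference presumably comes down to an indexing convention.

```python
def make_windows(series, n_p, n_f, target_index=0):
    """Slice one region's daily series into (input, target) pairs.

    series: list of T daily records, each a list of scalar features.
    Returns a list of (x, y) where x holds the records of the n_p days
    up to and including the current day, and y holds the target feature
    over the following n_f days.
    """
    T = len(series)
    examples = []
    # t is the index of the current (last observed) day of the window.
    for t in range(n_p - 1, T - n_f):
        x = series[t - n_p + 1 : t + 1]
        y = [day[target_index] for day in series[t + 1 : t + 1 + n_f]]
        examples.append((x, y))
    return examples


def build_dataset(regions, n_p=21, n_f=7, target_index=0):
    """Pool the windows of all regions into a single list of examples."""
    dataset = []
    for series in regions.values():
        dataset.extend(make_windows(series, n_p, n_f, target_index))
    return dataset


# Synthetic stand-in data: 2 regions, 40 days, 10 features per day.
toy = {name: [[float(day + k) for k in range(10)] for day in range(40)]
       for name in ("region_a", "region_b")}
examples = build_dataset(toy)
```

Each input `x` is an $n_p \times 10$ block and each target `y` a vector of length $n_f$, which matches the shapes a sequence model of this kind expects.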
In the analysis reported below, $n_p = 21$ (meaning that we process data from the last 21 days of observation) and $n_f = 7$ (hence, for each position of the sliding window, we look forward in time with a horizon of prediction that covers one week in the future).

Figure 2: Each data item consists of an ensemble of input parameters (concerning both epidemiological and mobility quantities, and associated to the $n_p$ days that precede the day where the analysis is carried out) and an output set that contains our forecast. This is the number of occupied Intensive Care (IC) units in the following $n_f$ days (from the last day of observation).

The data set is divided into two subsets: training and test sets. The first set is used to train the model, whereas the second one is employed to test the ability of the trained device to cope with data that were not supplied during the learning stages. To probe the robustness of the method, we have devised two different procedures to split the data into training and test sets. These are listed in the following:

1. The training set consists of a limited segment of the available time series. In this case, the training procedure is carried out by solely employing data up to a prescribed date. The future evolution of the system, beyond the last day of the processed observation, is used to test the model. This makes it possible to test the performance of the LSTM model against data that refer to a time window not processed during learning.
2. The training set is a subset of the available regions. In this case, the learning process is carried out over the entire length of the time series associated to a subset of the 20 regions. The accuracy of the prediction is tested against data referred to regions that were not used during the learning phases.

In the next section we will discuss the obtained results and validate the model as a viable tool to anticipate SARS-CoV-2 spreading across the country.

3. Results
We used the architecture and data set described in the previous section to define a learning problem that allows one to predict the evolution of the number of intensive care (IC) units occupied by COVID-19 patients in different regions of Italy.

We begin by adopting the first of the two aforementioned frameworks. This implies dealing with the full set of available time series, up to a given time, for the training phase. The trained network is then employed to forecast the evolution of the epidemic. Results are reported in Fig. 3 for a subset of regions, namely Piedmont, Umbria and Veneto. The evolution of occupied IC units (orange trace) is nicely predicted by the model (coloured dots). For each day, the number of truly occupied IC units is compared to the corresponding value, as predicted by the LSTM with different time horizons. More specifically, yellow dots refer to predictions which exploit information made accessible up to the preceding day. At the opposite limit, black dots are forecasts that process information that is one week (7 days) old. Intermediate color grades refer to predictions which interpolate between these two extremes. The data reported in Fig. 3 are obtained by training the LSTM with data up to November 16th, where the dashed line is positioned. From here on, predictions are obtained by sliding the computing window (as depicted in Fig. 2) forward in time. The information relative to the $n_p$ input days is processed and used to anticipate the expected load of IC units in the next $n_f$ days. The forecasted evolution agrees well with the observed curve of occupied IC units. Remarkably, the position of the peak is nicely captured by the computed time series, which is hence capable of anticipating the evolution of the examined system.

In Fig. 4 the results obtained when dealing with the alternative setting listed above are depicted.
As mentioned, we now train the model by focusing on a subset of the available regions and use this knowledge to predict the evolution of IC units occupied by COVID-19 patients in regions that were not supplied as part of the training set. Colors are assigned following the same code introduced above: yellow dots refer to predictions that look just one day into the future. Black dots stand for the opposite extreme: the LSTM anticipates the evolution one week ahead in time. The agreement between predicted and observed time series is again remarkable.

Figure 3: Predicted evolution as compared to the experimentally recorded time series: the plotted curves refer to the number of Intensive Care (IC) units occupied by SARS-CoV-2 patients in Piemonte, Umbria and Veneto. Red lines stand for the observed (hence, real) evolution. Coloured dots represent the forecast of the LSTM model. Yellow dots are predictions that look one day into the future. Black symbols rely instead on processing one-week-old data. Different color gradings, ranging from yellow to black, interpolate between these two limiting scenarios. In this case $n_p$ is set to 21. Data are rescaled by using, for each variable of the set, its corresponding maximum value, as displayed in the training interval. This latter value is also used to normalize data from the test set, so that only information from the training set is effectively employed.

In Fig. 5 the Root-Mean-Square Error (RMSE) associated to the predictions is plotted. Each bar represents the error made when trying to predict the target values $d \in [1, n_f]$ days in the future, where the parameter $n_f$ defines the forecast horizon of the model. The RMSE is computed over the test set. Panel A of Fig. 5 refers to $n_f = 7$, whereas panel B is obtained for $n_f = 14$.
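The horizon-resolved error described above can be computed as in the following sketch, which returns one RMSE value per forecast distance $d$. The function name `rmse_per_horizon` and the toy numbers are illustrative assumptions, not the data behind Fig. 5.

```python
import math


def rmse_per_horizon(y_true, y_pred):
    """Return a list whose entry d-1 is the RMSE of the d-days-ahead forecasts.

    y_true, y_pred: lists of equal length; each element is a list of n_f
    values (observed and predicted target for days 1..n_f ahead).
    """
    n_f = len(y_true[0])
    errors = []
    for d in range(n_f):
        squared = [(t[d] - p[d]) ** 2 for t, p in zip(y_true, y_pred)]
        errors.append(math.sqrt(sum(squared) / len(squared)))
    return errors


# Toy test set with n_f = 3: the forecast error grows with the horizon.
observed = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]]
forecast = [[10.0, 12.0, 15.0], [20.0, 20.0, 19.0]]
print(rmse_per_horizon(observed, forecast))  # [0.0, 1.0, 3.0]
```

Each entry of the returned list corresponds to one bar of a plot such as Fig. 5.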
As expected, the accuracy goes down when $d$ becomes larger. Although the model with larger $n_f$ allows us to make earlier predictions, the accuracy of the predictions gets worse when confronted with actual data: a lower accuracy is found not only for distant predictions but also for closer ones. The choice $n_f = 7$ is a compromise between the need for reliable predictions, on the one side, and the request of imposing a plausible temporal horizon, i.e. one useful for forecasting, on the other.

Figure 4: The number of occupied IC units is plotted against time, expressed in days. Red traces reflect the observed, hence true, numbers. Colored dots stand for the LSTM forecast. Here, the analysis is carried out for three regions that were not part of the training set. The adopted color code is specified in the caption of Fig. 3. The analysis is carried out by using $n_p = 21$. Here, epidemiological data are normalized by an arbitrary constant that we assume to scale extensively with the size of the examined region.

Figure 5: Root-Mean-Square Error (RMSE) computed on the test set: the error compares the real evolution of the IC occupation number and the expected evolution on the basis of the LSTM prediction at day $d \in [1, n_f]$ from the last processed observation. Panel (A) and panel (B) refer to LSTM models with $n_f = 7$ and $n_f = 14$, respectively.

In the following, we will briefly elaborate on the role of mobility and highlight its effects on the evolution of the epidemic.

Different containment measures have been imposed to counter the spreading of the COVID-19 epidemic. Such measures, like social distancing and lockdown, result in a clear impact on mobility. In Fig. 6 the evolution of five normalized mobility parameters is plotted for Piedmont and Tuscany. Data are processed by applying a 7-day moving average to obtain smoother profiles and remove weekly fluctuations.
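A 7-day moving average of this kind can be sketched as below. The text does not state whether the window is trailing or centered, so a trailing window is assumed here; the function name `moving_average` and the synthetic series are likewise illustrative.

```python
def moving_average(values, window=7):
    """Trailing moving average; the first window-1 days have no output."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for k in range(window, len(values)):
        # Slide the window by one day: add the new value, drop the oldest.
        running += values[k] - values[k - window]
        smoothed.append(running / window)
    return smoothed


# Synthetic index with a weekend dip every 7 days on top of a flat trend.
raw = [-20.0, -20.0, -20.0, -20.0, -20.0, -48.0, -48.0] * 3
print(moving_average(raw))  # constant -28.0 once the weekly cycle is averaged out
```

Because the window length equals the period of the weekly cycle, the periodic component cancels exactly and only the underlying trend survives.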
The averaged mobility indexes display global trends which bear the imprint of the containment measures imposed by national and local authorities. To support this claim, for each region, five time intervals associated to different containment measures have been identified. At the beginning of the time series, a depression of all the mobility scores is detected (except for the parameter that quantifies mobility in residential areas, shown as purple lines, which in general, and as expected, shows opposite trends as compared to those stemming from the other parameters). This has to be put in relation with the strict lockdown imposed by the Italian government in the spring of 2020. Subsequently, the curves associated to the parameters of non-residential areas grow until they reach a new plateau. The plateau is associated with a no-restrictions (or few-restrictions) period, during the summer, when containment measures had been relaxed. Other characteristic periods can indeed be identified, specifically at the end of November and at the beginning of December. This last segment of the recorded time series is indirectly influenced by the introduced color-code labelling of the regions, which reflects the degree of local severity of the epidemic. Each region is in fact associated to a color (respectively, yellow, orange and red in ascending order of severity), and a different level of restrictions is adopted depending on the region color. The correlation between the actual severity of the imposed restrictions and the displayed mobility trends can be clearly appreciated by visual inspection of Fig. 6. To help visualization, a few (colored) vertical stripes are depicted which refer to different conditions of the mobility, as outlined above. The first bar, colored in grey, is traced in correspondence of the strict lockdown back in the spring of 2020. By averaging over the selected time interval (the width of the greyish bar) we obtain an average estimate of the mobility parameters, as associated to the lockdown phase.
Similarly, the other depicted bars identify other characteristic instances of the epidemic evolution: the green stripe is meant to select mobility scores referred to the summer of 2020. The red/orange/yellow bars identify the status of the region, following the novel strategy to label the severity of the disease at the local scale. Also in this case, by averaging over the width of the corresponding intervals, one obtains a set of values for the mobility indexes which indirectly reflect the imposed containment action (from draconian lockdown to no restrictions, via the intermediate settings associated to different labelling colors).

This information can be used in the attempt to predict the role of an enforced modulation of the mobility, following the different scenarios recalled above. More specifically, at any given day, one can change the mobility entries supplied to the trained LSTM (by drawing from the aforementioned alternative classes, identified via the corresponding averaged entries). The aim is to examine the ensuing effect, which materializes at the level of the forecasted evolution of the occupied IC units, the target of the LSTM. In Fig. 7 the result of the analysis is displayed for two reference regions, although the reached conclusion holds in general. A pointwise modulation of the mobility (i.e. a change in the mobility that is confined to just one day) produces appreciable changes in the predicted hospitalization, the response being more marked the stricter the imposed reduction of the mobility. Remarkably, and according to the LSTM, the effect of a local change in the mobility becomes visible 8-10 days in the future, a plausible outcome of the analysis which calls for a timely planning of the containment protocols. On the basis of the above, it is hence surmised that machine learning schemes of the type analyzed here could help in devising optimal strategies for an intelligent combination of openings and closures, at the local scale.
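The scenario analysis described above amounts to two operations: averaging the mobility entries over a reference interval, and overwriting the mobility entries of one day in an input window with those averages. The sketch below illustrates both under stated assumptions: the feature layout (4 epidemic entries followed by 6 mobility entries), the names `scenario_average` and `apply_scenario`, and the toy series are all hypothetical, and the trained model itself is not included.

```python
import copy

N_EPIDEMIC = 4  # assumed layout: 4 epidemic features first, then 6 mobility ones


def scenario_average(series, start, stop):
    """Average the mobility entries over the days start..stop-1 of a series."""
    days = series[start:stop]
    n_mob = len(days[0]) - N_EPIDEMIC
    return [sum(day[N_EPIDEMIC + k] for day in days) / len(days)
            for k in range(n_mob)]


def apply_scenario(window, day_index, mobility):
    """Return a copy of an input window whose mobility entries on one day
    are replaced by the given scenario values; epidemic entries are kept."""
    modified = copy.deepcopy(window)
    modified[day_index][N_EPIDEMIC:] = list(mobility)
    return modified


# Toy series: 30 days, epidemic features set to 1.0, mobility equal to -day.
series = [[1.0] * N_EPIDEMIC + [-float(day)] * 6 for day in range(30)]
lockdown = scenario_average(series, start=0, stop=5)   # mean of 0,-1,-2,-3,-4
window = series[9:30]                                   # last 21 days
counterfactual = apply_scenario(window, day_index=20, mobility=lockdown)
# counterfactual would then be passed to the trained model in place of window.
```

Comparing the forecasts obtained from `window` and from `counterfactual` isolates the effect attributed by the model to the one-day change in mobility.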
Furthermore, notice that the lag time quantified above provides an a posteriori justification for choosing $n_f = 7$ as a forecast horizon of the LSTM machinery.

Figure 6: Mobility evolution in Piedmont and Tuscany, with reference to five distinct categories, as outlined in the annexed legend. Data are from [20] and have been normalized to yield quantities that range in the interval [0 ,
4. Conclusions
To summarize our findings, using a simple LSTM model trained on both epidemiological and mobility data we were able to correctly forecast the spreading of SARS-CoV-2 across different regions and at different times. Our model proved robust to alternative train/test splits in the spatial (hold out a region) and temporal (hold out a temporal interval) domains. The choice of employing available information on human mobility constitutes the novelty of the proposed approach. The obtained forecasts are indeed shown to depend appreciably on the imposed mobility scores. Artificially reducing the degree of imposed mobility yields a consistent flattening of the curve of expected occupied intensive care units. Importantly, the effect of an abatement of the mobility materializes in a consequent contraction of the occupied IC units. The contraction becomes visible after 8-10 days from the time the mobility change became effective. Interestingly, pointwise mobility stops (one day) seem to generate a noticeable effect on the predicted IC occupation curve. Elaborating further along these lines could help in devising viable strategies to oppose the spreading of the epidemic, with a minimal impact on both social and economic activities.

Figure 7: Predictions of IC units occupied by COVID-19 patients in Piedmont and Tuscany made under the hypotheses of 6 different past mobility scenarios: true mobility (blue dots), lockdown (black dots), no restrictions (green dots), red zone (red dots), orange zone (orange dots) and yellow zone (yellow dots). In each column, from left to right, the change in the mobility scores is operated at a different day, as measured from the time the first prediction is made (see annexed legend).

References

[1] Christopher M. Bishop.
Pattern Recognition and Machine Learning. Springer, New York, 1st ed. 2006, corr. 2nd printing 2011 edition, April 2011.
[2] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media, 2009.
[3] Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, and Pierre-Alain Muller. Deep learning for time series classification: a review. Data Mining and Knowledge Discovery, 33(4):917–963, 2019.
[4] KR Chowdhary. Natural language processing. In Fundamentals of Artificial Intelligence, pages 603–649. Springer, 2020.
[5] Peter Tino, Christian Schittenkopf, and Georg Dorffner. Financial volatility trading using recurrent neural networks. IEEE Transactions on Neural Networks, 12(4):865–874, 2001.
[6] Li Deng, Geoffrey Hinton, and Brian Kingsbury. New types of deep neural network learning for speech recognition and related applications: An overview. In , pages 8599–8603. IEEE, 2013.
[7] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[8] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
[9] Duccio Fanelli and Francesco Piazza. Analysis and forecast of COVID-19 spreading in China, Italy and France. Chaos, Solitons & Fractals, 134:109761, 2020.
[10] Timoteo Carletti, Duccio Fanelli, and Francesco Piazza. COVID-19: The unreasonable effectiveness of simple models. Chaos, Solitons & Fractals: X, 5:100034, 2020.
[11] Alessandro Vespignani, Huaiyu Tian, Christopher Dye, James O Lloyd-Smith, Rosalind M Eggo, Munik Shrestha, Samuel V Scarpino, Bernardo Gutierrez, Moritz UG Kraemer, Joseph Wu, et al. Modelling COVID-19. Nature Reviews Physics, 2(6):279–281, 2020.
[12] Sourabh Shastri, Kuljeet Singh, Sachin Kumar, Paramjit Kour, and Vibhakar Mansotra. Time series forecasting of COVID-19 using deep learning models: India-USA comparative case study. Chaos, Solitons & Fractals, 140:110227, 2020.
[13] Junaid Farooq and Mohammad Abid Bazaz. A deep learning algorithm for modeling and forecasting of COVID-19 in five worst affected states of India. Alexandria Engineering Journal, 60(1):587–596, 2020.
[14] Jimson Mathew, Ranjan Kumar Behera, et al. A deep learning framework for COVID outbreak prediction. arXiv preprint arXiv:2010.00382, 2020.
[15] Moritz UG Kraemer, Chia-Hung Yang, Bernardo Gutierrez, Chieh-Hsi Wu, Brennan Klein, David M Pigott, Louis Du Plessis, Nuno R Faria, Ruoran Li, William P Hanage, et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science, 368(6490):493–497, 2020.
[16] Cornelia Ilin, Sébastien E Annan-Phan, Xiao Hui Tai, Shikhar Mehra, Solomon M Hsiang, and Joshua E Blumenstock. Public mobility data enables COVID-19 forecasting and management at local and global scales. Technical report, National Bureau of Economic Research, 2020.
[17] Yoav Goldberg. Neural network methods for natural language processing. Synthesis Lectures on Human Language Technologies, 10(1):1–309, 2017.
[18] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[19] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[20] Google.