Predicting regional COVID-19 hospital admissions in Sweden using mobility data
Philip Gerlee, Julia Karlsson, Ingrid Fritzell, Thomas Brezicka, Armin Spreco, Toomas Timpka, Anna Jöud, Torbjörn Lundh
Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Sweden
Sahlgrenska University Hospital, Gothenburg, Sweden
Department of Health, Medicine and Caring Science, Linköping University
Center for Health Services Development, Region Östergötland, Linköping, Sweden
Department of Laboratory Medicine, Lund University, Sweden
Skåne University Hospital, Department of Research and Development, Lund, Sweden
(Dated: January 5, 2021)
arXiv preprint [q-bio.PE]

Abstract

The transmission of COVID-19 is dependent on social contacts, the rate of which has varied during the pandemic due to mandated and voluntary social distancing. Changes in transmission dynamics eventually affect hospital admissions and we have used this connection in order to model and predict regional hospital admissions in Sweden during the COVID-19 pandemic. We use an SEIR-model for each region in Sweden in which the infectivity is assumed to depend on mobility data in terms of public transport utilisation and mobile phone usage. The results show that the model can capture the timing of the first and beginning of the second wave of the pandemic. Further, we show that for two major regions of Sweden models with public transport data outperform models using mobile phone usage. The model assumes a three week delay from disease transmission to hospitalisation which makes it possible to use current mobility data to predict future admissions.
I. INTRODUCTION
Infectious diseases are disseminated through transmission of infectious agents in association with physical meetings (social contacts) between individuals. These meetings occur at home or at other locations such as workplaces or schools, which are reached using some means of transportation, e.g. by car, public transport or foot. The meetings tend to take on regular patterns and variations, and these can be used for different types of analytic purposes [1, 2].

The COVID-19 pandemic has affected society in numerous ways. One striking feature is the reduction in individual mobility, which has been enforced either by strict legal lockdowns or, as in the case of Sweden, by recommendations to the general public. This reduction in mobility has had the intended effect of “flattening the curve” during the first, second and possibly future waves of the pandemic.

Obtaining an understanding of the effect of mobility on the transmission of COVID-19 requires an ability to measure and quantify these changes. This has been achieved by geographically tracking cell phone usage, either directly by mobile phone operators [3] or via usage of Google services [4] that are readily available for all regions. In addition to this, mobility has also been measured by considering the utilisation of public transport [5].

II. METHODS
A retrospective design was used for data collection and analysis. We developed an SEIR-model of disease transmission which outputs the expected number of hospital admissions. Here we describe the hospital admission and mobility data, the epidemiological model that we have used as well as the method for fitting the model to data. The code for the model and the data used is available at:
A. Data
Endpoint data: We consider hospital admission data from Sweden at the regional level aggregated by the National Board of Health and Welfare [10]. The data contains the total number of newly admitted patients diagnosed with COVID-19 per week, starting with week 10. The data is reported separately for each of the 21 regions in Sweden. Missing data points were replaced by zeroes for all regions.

Syndromic data: In order to account for changes in behaviour due to governmental recommendations we have made use of mobility data from two sources: public transport data from the public transport authorities in Region Västra Götaland and Region Skåne, called Västtrafik (VT) and Skånetrafiken (ST), and Google mobility reports (GMR). The VT- and ST-data describe the total number of journeys made by public transport in the region and are reported on a weekly basis. Data are given in terms of a percent change compared to travel during week 9. The GMR-data also describes the change in mobility compared to a baseline, which is the median value from the 5-week period Jan 3 – Feb 6, 2020. Mobility is split into place categories and we have used values from the category 'transit stations'. The GMR-data is reported on a daily basis and in order to make it compatible with the model we calculate weekly averages. Figure 6 in the Appendix shows the above mobility measures as a function of time.
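The conversion of daily GMR values to weekly averages can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `weekly_means`, the use of ISO week numbers, and the choice to skip days without a reported value are all assumptions, and the example values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekly_means(daily):
    """Average daily mobility values within each ISO week.

    daily: dict mapping datetime.date -> percent change from baseline,
           or None where no value was reported (assumed handling: skip).
    Returns a dict {(iso_year, iso_week): mean value}, ordered by week.
    """
    buckets = defaultdict(list)
    for day, value in daily.items():
        if value is None:          # skip gaps instead of counting them as 0
            continue
        iso = day.isocalendar()    # (ISO year, ISO week, ISO weekday)
        buckets[(iso[0], iso[1])].append(value)
    return {week: mean(values) for week, values in sorted(buckets.items())}


# Hypothetical daily values for illustration only (not real GMR data).
example = {
    date(2020, 3, 9): -10.0,
    date(2020, 3, 10): -20.0,
    date(2020, 3, 11): None,
    date(2020, 3, 16): -40.0,
}
weekly = weekly_means(example)
```

Keying the result by (year, week) keeps the weekly mobility series aligned with admission data that is itself reported per week.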
B. Epidemiological model
To model the weekly time series of COVID-19 related hospital admissions we have used an SEIR-model with time-dependent infectivity $\beta(t)$ which is informed by mobility measures. Infectivity is assumed to vary with mobility such that the number of new social contacts for each infected individual increases with travel.

We assume that mobility measured by public transport utilisation and mobile phone usage reflects the general level of mobility in each region, which is then assumed to impact the contact rate and consequently the infectivity. Note that we do not assume that disease transmission occurs exclusively during travel, but rather that the above mobility measures serve as a useful proxy for the rate of social contacts.

The model is defined in terms of the following set of coupled ordinary differential equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\beta(t)\frac{SI}{N} \\
\frac{dE}{dt} &= \beta(t)\frac{SI}{N} - \rho E \\
\frac{dI}{dt} &= \rho E - \gamma I \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
\tag{1}
$$

Here $\rho$ is the rate at which people leave the exposed compartment, $\gamma$ is the rate of recovery and $N$ is the population size of the region. In order to solve the system we also need to specify an initial condition and when in time it occurs. We assume that all individuals are susceptible except for an initial number $I_0$ of infectious individuals introduced $t_0$ weeks prior to the first data point in the admission data (week 10).

To connect the dynamics of the SEIR-model with hospital admissions we assume that individuals in the infectious compartment give rise to future hospital admissions. To model this we assume that the number of hospital admissions $t_a$ weeks into the future is given by a fraction $p$ of the present number of infectious individuals.

C. Model parametrisation and fitting
The parameters of the SEIR-model were taken from previously published studies and we have used $\rho = 1.37$ week$^{-1}$ (corresponding to a latency period of on average 5.1 days) and recovery rate $\gamma = 1.4$ week$^{-1}$ (corresponding to an infectious period of on average 5 days) [11].

Since testing was limited during the early stages of the pandemic in Sweden it is difficult to estimate the initial condition for our model. For simplicity we assume a single infected individual in a population of susceptibles appearing $t_0 = 4$ weeks prior to the first data point. Adjusting the initial condition for each region could possibly yield more accurate predictions, but here we have chosen a robust initial condition which gives sensible predictions for all regions.

The scaling that relates the number of infected to hospital admissions was set to $p = 0.$[…] and the delay to $t_a = 3$ weeks. This value is related to the time from infection to hospital admission, which has been reported to be 17 days (5 days latency [11] plus 12 days from symptom onset to admission [12]). However, it should not be interpreted as a parameter describing the fate of an individual patient, but should rather be interpreted as the time it takes for changes in disease transmission to propagate (sometimes via secondary cases) to hospital admissions. A previous study using mobility data has shown a time delay in admissions due to mobility restrictions in the range of 9-25 days [13], which covers our assumed value of 21 days.

Given the uncertainty in many of the above parameter values we have carried out a sensitivity analysis by varying one parameter at a time within a reasonable range. The results of this analysis are presented in the Appendix.

The infectivity $\beta(t)$ is informed by the mobility data in the following way: For Västra Götaland and Skåne we use the public transport data and assume a linear relationship

$$\beta(t) = a + bV(t)$$

where $a, b$ are parameters that are fitted to the admission data (see below for details) and $V(t)$ is the change in travel during week $t$.
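With these parameter values and the linear infectivity, model (1) and its link to admissions can be integrated numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses a hand-written fourth-order Runge-Kutta scheme, holds $\beta$ constant within each week, and treats $V(t)$ as a fractional change with the 0.2 baseline already added. The population size `N`, the fraction `p` and the mobility series are hypothetical placeholders; `a` and `b` are the Västra Götaland estimates reported in the Results.

```python
def seir_weekly(beta, n_weeks, N, rho=1.37, gamma=1.4, I0=1.0, steps=100):
    """Integrate model (1) with classical RK4, time measured in weeks.

    beta: function week_index -> infectivity, held constant within a week
          (an assumption; the text does not say how beta is interpolated).
    Returns a list of (S, E, I, R) tuples at week boundaries, starting at t = 0.
    """
    def rhs(state, b):
        S, E, I, R = state
        new_infections = b * S * I / N
        return (-new_infections,
                new_infections - rho * E,
                rho * E - gamma * I,
                gamma * I)

    def shifted(state, slope, h):
        return tuple(x + h * k for x, k in zip(state, slope))

    state = (N - I0, 0.0, I0, 0.0)
    h = 1.0 / steps
    trajectory = [state]
    for week in range(n_weeks):
        b = beta(week)
        for _ in range(steps):
            k1 = rhs(state, b)
            k2 = rhs(shifted(state, k1, h / 2), b)
            k3 = rhs(shifted(state, k2, h / 2), b)
            k4 = rhs(shifted(state, k3, h), b)
            state = tuple(x + h / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
                          for x, s1, s2, s3, s4 in zip(state, k1, k2, k3, k4))
        trajectory.append(state)
    return trajectory


def predicted_admissions(trajectory, p, t_a=3):
    """Admissions in week t + t_a are a fraction p of I(t)."""
    return {week + t_a: p * state[2] for week, state in enumerate(trajectory)}


# Illustrative inputs: a and b are the Västra Götaland estimates quoted in the
# Results; N, p and the mobility series are hypothetical placeholders.
a, b = 4.16, 5.74
mobility = [0.2] * 4 + [0.1, -0.1, -0.3, -0.4, -0.4, -0.3]  # V(t), baseline 0.2
trajectory = seir_weekly(lambda w: a + b * mobility[w], len(mobility), N=1_700_000)
admissions = predicted_admissions(trajectory, p=0.02)
```

Because admissions in week $t + t_a$ depend only on the infectious compartment in week $t$, mobility observed up to the present week determines predictions three weeks ahead.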
For all other regions we use the GMR-data in a similar way and assume that

$$\beta(t) = a + bG_i(t)$$

where $G_i(t)$ is the GMR-data (place category 'transit stations') for region $i$, and $a, b$ are parameters that are estimated.

In order to account for the fact that not only mobility changed at the onset of the pandemic, but also other circumstances such as physical distancing and increased hand hygiene, we adjust the baseline values for $V(t)$ and $G_i(t)$ from 0 to 0.2.

The infectivity parameters $a, b$ are estimated by minimising the root mean squared error (RMSE)

$$E(\theta) = \sqrt{\frac{1}{n}\sum_{i=0}^{n}\left(pI(t_i + t_a, \theta) - A(t_i)\right)^2}$$

with respect to $\theta = (a, b)$. Here $pI(t_i + t_a, \theta)$ is the predicted number of hospital admissions, $A(t_i)$ is the actual number of admissions, and the sum runs over all time points $t_i$. To find the minimum RMSE we use the grid search method with 80 linearly spaced values in the range 1-12 for both $a$ and $b$ [14]. For each region $i$ we thus obtain a set $\hat{\theta}_i = (\hat{a}_i, \hat{b}_i)$ of estimated parameters. When comparing the model error between different regions we normalise the RMSE by dividing by the maximum number of weekly admissions for each region.

In order to quantify the uncertainty in our parameter estimates we select all parameter sets $(a, b)$ that achieve an RMSE within 20% of $E(\hat{\theta})$. We solve the SEIR-model for all those parameter combinations and remove the lower and upper 5th percentile to obtain a 95% credible interval. This procedure corresponds to sampling from the posterior in an Approximate Bayesian Computation framework with $E(\theta)$ as our summary statistic [15].

For Region Västra Götaland we fit the mobility-driven SEIR-model (1) using increasing amounts of reported hospital admissions. We start by including data up until week 20 and test the model's predictive ability in terms of the mean average predictive error (MAPE) on the coming three weeks. This procedure is repeated for increasing amounts of training data. To illustrate the robustness of the model we also plot how the estimated model parameters $\hat{a}$ and $\hat{b}$ change as we include more weekly data.

III. RESULTS

A. Predicting hospital admissions using public transport utilisation
For Region Västra Götaland the resulting model error in terms of MAPE can be seen in figure 1A. By successively increasing the training data, we see in fig. 1B that the model remains largely unchanged beyond week 30, which timewise corresponds to the end of the first wave of the pandemic.

When using all available data we find that $\hat{a} = 4.16$ and $\hat{b} = 5.74$ (fig. 1C), and we note that the model captures the dynamics of admissions during both the first and beginning of the second wave, although the rate of decline during the first wave is overestimated.
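For reference, the grid search behind such estimates, together with the 20% tolerance set described in the Methods, can be sketched as follows. This is a hedged illustration, not the authors' code: `grid_search`, `linspace` and `rmse` are hypothetical helper names, the error is averaged over the number of data points, and a toy linear predictor stands in for the SEIR output $pI$ so that the block runs on its own.

```python
import math


def linspace(lo, hi, n):
    """n linearly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def rmse(predicted, observed):
    """Root mean squared error over paired weekly values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))


def grid_search(predict, observed, values, tol=0.20):
    """Exhaustive search over (a, b) in values x values.

    predict: function (a, b) -> predicted weekly admissions (stand-in for
             p * I from the SEIR-model).
    Returns the best (a, b), its RMSE, and every (a, b) whose RMSE is
    within `tol` (relative) of the best.
    """
    errors = {(a, b): rmse(predict(a, b), observed)
              for a in values for b in values}
    best = min(errors, key=errors.get)
    threshold = (1 + tol) * errors[best]
    accepted = [theta for theta, err in errors.items() if err <= threshold]
    return best, errors[best], accepted


# Toy check with a hypothetical linear predictor instead of the SEIR-model.
values = linspace(1, 12, 80)
toy_mobility = [0.2, 0.0, -0.2, -0.4, -0.3, -0.1]


def toy_predict(a, b):
    return [a + b * v for v in toy_mobility]


noise = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01]
observed = [y + d for y, d in zip(toy_predict(values[23], values[34]), noise)]
best, best_error, accepted = grid_search(toy_predict, observed, values)
```

In an actual fit, `predict` would solve the SEIR-model for each $(a, b)$ pair, and the accepted set would be re-simulated to form the interval shown as dashed lines in Fig. 1C.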
B. Using Google mobility data to predict hospital admissions
For all other regions we make use of Google mobility data (see Methods for details). Figure 2 shows model fits for Östergötland and Stockholm (see fig. 5 in the Appendix for model fits to admissions in all Swedish regions and table I for normalised RMSE and estimated parameters). Again, we note that the model correctly describes the timing of the first and second wave. Visual inspection of the model fits for all regions suggests that the model performs better for regions with a larger population.
C. Public transport data improves model fit compared to Google mobility reports
For Skåne Region we have both public transport data and GMR-data, which was used on all other regions. Figure 3 shows the best model fits using the mobility data from the public transport agency Skånetrafiken compared to GMR-data. We note that although the model using GMR fits the data during the second wave better, the overall fit is considerably improved by using data from public transport. In terms of the RMSE we observe that the model error is 23 admissions/week for the public transport model compared to 40 admissions/week for the model that uses GMR. A similar trend is seen for Region Västra Götaland, where public transport data yields an RMSE of 40 admissions/week whereas GMR-data gives an error of 90 admissions/week.

FIG. 1. Model fit to admission data from Region Västra Götaland. A: The model error in terms of the MAPE on 3 week predictions as a function of the number of weeks of data used in the fitting. B: The estimated model parameters $(\hat{a}, \hat{b})$ as a function of the number of weeks of data used in the fitting. C: The optimal fit when all data points are used (until week 45). The dashed lines show the 95% credible interval for the model fit (see Methods).

FIG. 2. Optimal model fit for A: Stockholm ($\hat{a} = 4.68$ and $\hat{b} = 6.37$) and B: Östergötland ($\hat{a} = 3.$[…] and $\hat{b} = 5.$[…]).

IV. DISCUSSION
We set out to investigate whether variations in data reflecting weekly commuting rates were associated with later COVID-19 hospitalisation rates. It was found that COVID-19 hospital admissions can be modelled using time-dependent mobility data and that an SEIR model can be fitted using two free parameters to regional data from Sweden.

Our approach is similar to a recent study by Chang et al. [14] who used spatially resolved mobility data in order to model disease transmission in metropolitan areas in the US. They compared their model output to COVID-19 incidence, whereas we have focused on hospital admissions. The reasons for this are twofold: firstly, the data on incidence in Sweden is unreliable due to limited testing, and secondly, hospital admissions are a more interesting metric for healthcare providers. Given the time lag between current mobility, which drives infections, and future hospital admissions, the model provides a tool for predicting the demands on hospital beds up to three weeks in advance. Although the SEIR-model describes the transmission of the disease, the model details were not in focus in the present study. Uncertainty in model parameters such as the initial condition and the fraction of individuals that become hospitalised implies that the model dynamics in terms of the number of susceptible, infectious and recovered individuals are unreliable. It is also worth pointing out that the connection between infection and hospitalisation is not assumed to be direct. It may well be that an individual who contributes to the measured mobility transmits the virus in several steps to an individual with an increased risk of severe illness who is subsequently hospitalised.

FIG. 3. Optimal model fit for Skåne Region using mobility data from public transport (red line) and Google Mobility Report (black line).

Despite these simplifications, the model was able to capture the general shape and timing of both the first and the beginning of the second wave for most regions. There appears to be a link between the population size of the region and the goodness of fit. The model fits the admission data for larger regions better, and a possible explanation for this is the large degree of randomness seen in the smaller regions. A recurring feature seen across most regions is the inability of the model to accurately describe the width of the first peak. The model tends to underestimate the actual width, and this is likely due to the lack of detail in the model or inaccuracy in the assumed parameter values. Moreover, it is noteworthy that the model captured the timing of the onset of the second peak without accounting for seasonal or temperature-driven infectivity. Instead, the results suggest that mobility in itself, which might contain seasonal variation, is sufficient to capture the dynamics of hospital admissions.

The present research has some limitations that should be taken into consideration when interpreting the results. When trained on admission data until week 20 for Region Västra Götaland the model initially performs well in terms of the model error (MAPE) on the 3 consecutive weeks (Fig. 1A). This is followed by an increase in the MAPE during weeks 25-30, which is due to an increase in admissions that the model is unable to capture, and then a subsequent decline. In terms of the model robustness, we observe that the parameters remain largely unchanged after week 30, suggesting that data from the first wave was sufficient to fit the model (Fig. 1B).

In the model we have disregarded any kind of age-structure and migration of cases between regions, and assumed a highly simplified connection between infection and hospital admission. In addition, we have assumed that the disease was introduced in an identical way in all regions. These choices were made in order to formulate a simple and general model, which could be applied directly to all regions. We have shown for Region Skåne that public transport data provides a better fit between model and admission data, and further tailoring the model to each region will most likely improve model fit even further. In terms of numerical methods, we have also made a couple of simplifications. We carried out the parameter estimation one region at a time. Here it would be beneficial to consider a hierarchical mixed-effects model that considers all regions simultaneously [16]. We have performed a sensitivity analysis with respect to the initial condition, the parameters that relate the size of the infectious compartment to hospital admissions and the initial mobility (see Fig. 4). The results show that the model fit can be somewhat improved by making slight adjustments to the baseline parameter values. However, given the uncertainty in these parameter values we do not find it motivated to adjust our baseline values.

This study should be seen as a first attempt to model regional-level hospital admissions in Europe using mobility data. The assumed delay of three weeks between infection and admission implies that the model can, with current mobility data, make predictions three weeks into the future. The results encourage continued research on the use of mobility data in health service capacity planning during the COVID-19 pandemic.
V. ACKNOWLEDGEMENTS
We would like to thank Jonas Hägglund at Västtrafik for providing the public transport data. PG and TL would like to acknowledge seed funding from Chalmers University of Technology Areas of Advance “Information and Communication Technology” and “Health Engineering”.

[1] E. Holm, T. Timpka, et al., A discrete time-space geography for epidemiology: from mixing groups to pockets of local order in pandemic simulations, in MedInfo (2007) pp. 464–468.
[2] M. Strömgren, E. Holm, Ö. Dahlström, J. Ekberg, H. Eriksson, A. Spreco, and T. Timpka, Place-based social contact and mixing: a typology of generic meeting places of relevance for infectious disease transmission, Epidemiology & Infection, 2582 (2017).
[3] B. Klein, T. LaRock, S. McCabe, L. Torres, F. Privitera, B. Lake, M. U. Kraemer, J. S. Brownstein, D. Lazer, T. Eliassi-Rad, et al., Assessing changes in commuting and individual mobility in major metropolitan areas in the United States during the COVID-19 outbreak (2020).
[4] Google COVID-19 community mobility reports, accessed: 2020-12-14.
[5] E. Jenelius and M. Cebecauer, Impacts of COVID-19 on public transport ridership in Sweden: Analysis of ticket validations, sales and passenger counts, Transportation Research Interdisciplinary Perspectives, 100242 (2020).
[6] K. Linka, M. Peirlinck, and E. Kuhl, The reproduction number of COVID-19 and its correlation with public health interventions, medRxiv (2020).
[7] Y. Zhou, R. Xu, D. Hu, Y. Yue, Q. Li, and J. Xia, Effects of human mobility restrictions on the spread of COVID-19 in Shenzhen, China: a modelling study using mobile phone data, The Lancet Digital Health, e417 (2020).
[8] M. Liu, R. Thomadsen, and S. Yao, Forecasting the spread of COVID-19 under different reopening strategies, medRxiv (2020).
[9] N. Picchiotti, M. Salvioli, E. Zanardini, and F. Missale, COVID-19 pandemic: a mobility-dependent SEIR model with undetected cases in Italy, Europe and US, arXiv preprint arXiv:2005.08882 (2020).
[10] Statistik om antal slutenvårdade covid-19 patienter, accessed: 2020-12-14.
[11] Estimates of the peak-day and the number of infected individuals during the COVID-19 outbreak in the Stockholm region, Sweden February – April 2020, accessed: 2020-12-14.
[12] F. Zhou, T. Yu, R. Du, G. Fan, Y. Liu, Z. Liu, J. Xiang, Y. Wang, B. Song, X. Gu, et al., Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study, The Lancet (2020).
[13] M. Vinceti, T. Filippini, K. J. Rothman, F. Ferrari, A. Goffi, G. Maffeis, and N. Orsini, Lockdown timing and efficacy in controlling COVID-19 using mobile phone tracking, EClinicalMedicine, 100457 (2020).
[14] S. Chang, E. Pierson, P. W. Koh, J. Gerardin, B. Redbird, D. Grusky, and J. Leskovec, Mobility network models of COVID-19 explain inequities and inform reopening, Nature, 1 (2020).
[15] A. A. Alahmadi, J. A. Flegg, D. G. Cochrane, C. C. Drovandi, and J. M. Keith, A comparison of approximate versus exact techniques for Bayesian parameter inference in nonlinear ordinary differential equation models, Royal Society Open Science, 191315 (2020).
[16] M. J. Lindstrom and D. M. Bates, Nonlinear mixed effects models for repeated measures data, Biometrics, 673 (1990).

VI. APPENDIX

A. Sensitivity analysis
Figure 4 shows how the model error (RMSE) of the best fit for Region Västra Götaland changes when the parameters $p$, $t_a$, $I_0$ and $V(t = 0)$ are varied. We note that it is possible to achieve a slightly better model fit when the probability of hospitalisation is lowered to $p = 0.1$, but the improvement in model fit is minor. For the delay we see that our value of $t_a = 3$ weeks lies close to a local minimum, but little would be gained (in terms of RMSE) by increasing the delay. The number of infected individuals at $t = 0$ has a more complicated impact on the error. A smaller RMSE could be achieved by increasing $I_0$ from its default value of 1, but the improvement is again minor. Lastly, the initial infectivity has a minor impact on the model error as long as it remains below 0.6.

FIG. 4. Sensitivity analysis of model parameters for Region Västra Götaland (panels A to D). The default values are $p = 0.$[…], $t_a = 3$, $I_0 = 1$ and $V(t = 0) = 0.2$.

B. Fitting the model to 20 Swedish regions

Here we present model fits for all Swedish regions except Gotland, for which no data was available from the National Board of Health and Welfare.