COVIDHunter: An Accurate, Flexible, and Environment-Aware Open-Source COVID-19 Outbreak Simulation Model
Mohammed Alser, Jeremie S. Kim, Nour Almadhoun Alserr, Stefan W. Tell, Onur Mutlu
Bioinformatics
doi: 10.1093/bioinformatics/xxxxxx
Advance Access Publication Date: Day Month Year
Manuscript Category
Subject Section
Mohammed Alser ∗, Jeremie S. Kim, Nour Almadhoun Alserr, Stefan W. Tell, and Onur Mutlu ∗
ETH Zurich, Zurich 8006, Switzerland
∗ To whom correspondence should be addressed.
Abstract
Motivation:
Early detection and isolation of COVID-19 patients are essential for the successful implementation of mitigation strategies and for eventually curbing the spread of the disease. With a limited number of daily COVID-19 tests performed in every country, simulating the COVID-19 spread along with the potential effect of each mitigation strategy currently remains one of the most effective ways of managing the healthcare system and guiding policy-makers. We introduce COVIDHunter, a flexible and accurate COVID-19 outbreak simulation model that evaluates the current mitigation measures applied to a region and provides suggestions on what strength the upcoming mitigation measure should be. The key idea of COVIDHunter is to quantify the spread of COVID-19 in a geographical region by simulating the average number of new infections caused by an infected person, considering the effect of external factors such as environmental conditions (e.g., climate, temperature, humidity) and mitigation measures.
Results:
Using Switzerland as a case study, COVIDHunter estimates that policy-makers need to keep the current mitigation measures for at least 30 days to prevent demand from quickly exceeding existing hospital capacity. Relaxing the mitigation measures by 50% for 30 days increases both the daily capacity need for hospital beds and the daily number of deaths exponentially by an average of . ×; the affected patients may occupy ICU beds and ventilators for a period of time. Unlike existing models, the COVIDHunter model accurately monitors and predicts the daily number of cases, hospitalizations, and deaths due to COVID-19. Our model is flexible to configure and simple to modify for modeling different scenarios under different environmental conditions and mitigation measures.
Availability: https://github.com/CMU-SAFARI/COVIDHunter
Contact: [email protected], [email protected]
Supplementary information:
Supplementary data is available at Bioinformatics online.
Coronavirus disease 2019 (COVID-19) is caused by the SARS-CoV-2 virus, which was first detected in Wuhan, the capital city of Hubei Province in China, in early December 2019 (Du Toit, 2020). Since then, it has rapidly spread to nearly every corner of the globe and was declared a pandemic in March 2020 by the World Health Organization (WHO). As of January 2021, COVID-19 has resulted in more than 96 million laboratory-confirmed cases around the world and has killed nearly 2.2% of the infected population. As there are currently no anti-SARS-CoV-2-specific drugs or effective vaccines widely available to everyone, early detection and isolation of COVID-19 patients remain essential for effectively curbing the disease spread. As a result, many countries across the world have implemented unprecedented lockdown and social distancing measures, affecting millions of people.

Regardless of the availability and affordability of COVID-19 testing, it is still extremely challenging to detect and isolate COVID-19 infections at early stages due to three key issues. 1) It is very difficult to accurately identify the initial contraction time of COVID-19 for a patient. This is because COVID-19 patients can develop symptoms between 2 and 14 days (or longer in a few cases) after exposure to the new coronavirus (Lauer et al., 2020; Li et al., 2020). This variable delay is referred to as the virus' incubation period. 2) The coronavirus genome can exhibit rapid genetic changes in its nucleotide sequence, which may occur during viral cell replication, within the host body, or during transmission between hosts (Andersen et al., 2020). This genetic diversity affects the virus' virulence, infectivity, transmissibility, and evasion of the host immune responses (Phan, 2020; Pachetti et al., 2020; Toyoshima et al., 2020). 3) The situation becomes even worse as the coronavirus can survive, and therefore remain infectious, outside the host on common surfaces such as metal, glass, and banknotes (both paper and polymer) at room temperature for up to 28 days (Kampf et al., 2020; Riddell et al., 2020).
Simulating the spread of COVID-19 has the potential to mitigate the effects of the three key issues, help to better manage the healthcare system, and provide guidance to policy-makers on the effectiveness of various (current, planned, or discussed) social distancing and mitigation measures. To this end, many COVID-19 simulation models have been proposed (e.g., Tradigo et al., 2020; Russell et al., 2020; Ashcroft et al., 2020), some of which are announced to assist in decision-making for policy-makers in countries such as the United Kingdom (ICL (Flaxman et al., 2020)), the United States (IHME (Reiner et al., 2020)), and Switzerland (IBZ (Huisman et al., 2020)). These models tend to follow one of two key approaches. (1) Evaluating the current actual epidemiological situation by accounting for reporting delays and under-reporting due to inefficiencies such as a low number of COVID-19 tests. (2) Evaluating the current and future epidemiological situation by simulating the COVID-19 outbreak without relying on the observed (laboratory-confirmed) number of cases in simulation.

The first approach, taken by the IBZ (Huisman et al., 2020), LSHTM (Russell et al., 2020), and Ashcroft et al. (2020) models, is not mainly used for prediction purposes, as it reflects the epidemiological situation with about two weeks of delay (due to its dependence on observed COVID-19 reports). The IBZ model (Huisman et al., 2020) estimates the daily reproduction number, R, of SARS-CoV-2 from observed COVID-19 incidence time series data after accounting for reporting delays and under-reporting using the numbers of confirmed hospitalizations and deaths. The R number describes how a pathogen spreads in a particular population by quantifying the average number of new infections caused by each infected person at a given point in time. The LSHTM model (Russell et al., 2020) adjusts the daily number of observed COVID-19 cases by accounting for under-reporting (uncertainty) using deaths-to-cases ratio estimates and correcting for delays between case confirmation (i.e., laboratory-confirmed infection) and death.

The second approach, taken by the ICL (Flaxman et al., 2020) and IHME (Reiner et al., 2020) models, usually requires a large number of various input parameters and assumptions. The IHME model (Reiner et al., 2020) requires input parameters such as testing rates, mobility, social distancing policies, population density, altitude, smoking rates, self-reported contacts, and mask use. This model makes two key assumptions: 1) the infection fatality rate (IFR), which indicates the rate of people that die from the infection, is taken from data from the Diamond Princess cruise ship and New Zealand, and 2) the decreasing fatality rate is reflective of increased testing rates (identifying higher rates of asymptomatic cases). The ICL model (Flaxman et al., 2020) requires input parameters such as the daily number of confirmed deaths, IFR, mobility rates from Google, age- and country-specific data on demographics, patterns of social contact, and hospital availability. This model makes three key assumptions: 1) age-specific IFRs observed in China and Europe are the same across every country, 2) the number of confirmed deaths is equal to the true number of COVID-19 deaths, and 3) the change in transmission rates is a function of average mobility trends.

To our knowledge, there is currently no model capable of accurately monitoring the current epidemiological situation and predicting future scenarios while considering a reasonably low number of parameters and accounting for the effects of environmental conditions, as we summarize in Table 1.
The low number of parameters provides four key advantages: 1) allowing flexible (easy-to-adjust) configuration of the model input parameters for different scenarios and different geographical regions, 2) enabling short simulation execution time and simpler modeling, 3) enabling easy validation/correction of the model prediction outcomes by adjusting fewer variables, and 4) being extremely useful and powerful especially during the early stages of a pandemic, when many of the parameters are unknown. Simulation models need to consider the fact that environmental conditions (e.g., air temperature) affect pathogen infectivity (Fares, 2013; Kampf et al., 2020; Riddell et al., 2020; Xu et al., 2020), and simulating this effect helps to provide an accurate estimation of the epidemiological situation.

Our goal in this work is to develop such a COVID-19 outbreak simulation model. To this end, we introduce COVIDHunter, a simulation model that evaluates the current mitigation measures (i.e., non-pharmaceutical interventions, or NPIs) that are applied to a region and provides insight into what strength the upcoming mitigation measure should be and for how long it should be applied, while considering the potential effect of environmental conditions. Our model accurately forecasts the numbers of infected and hospitalized patients, and deaths, for a given day, as validated on historical COVID-19 data (after accounting for under-reporting). The key idea of COVIDHunter is to quantify the spread of COVID-19 in a geographical region by calculating the daily reproduction number, R, of COVID-19 and scaling the reproduction number based on changes in both mitigation measures and environmental conditions. The R number changes during the course of the pandemic due to the change in the ability of a pathogen to establish an infection during a season and due to mitigation measures that lead to a lower number of susceptible individuals.
COVIDHunter simulates the entire population of a region and assigns each individual in the population to a stage of the COVID-19 infection (e.g., from being healthy to being short-term immune to COVID-19) based on the scaled R number. Our model is flexible to configure and simple to modify for modeling different scenarios, as it uses only three input parameters, two of which are time-varying, to calculate the R number. Whenever applicable, we compare the simulation output of our model to that of four state-of-the-art models currently used to inform policy-makers: IBZ (Huisman et al., 2020), LSHTM (Russell et al., 2020), ICL (Flaxman et al., 2020), and IHME (Reiner et al., 2020).

The contributions of this paper are as follows:
• We introduce COVIDHunter, a flexible and validated simulation model that evaluates the current and future epidemiological situation by simulating the COVID-19 outbreak. COVIDHunter accurately forecasts for a given day 1) the reproduction number, 2) the number of infected people, 3) the number of hospitalized people, 4) the number of deaths, and 5) the number of individuals at each stage of the COVID-19 infection. COVIDHunter evaluates the effect of different current and future mitigation measures on these five numbers.
• As a case study, we statistically analyze the relationship between temperature and the number of COVID-19 cases in Switzerland. We find that for each 1 °C rise in daytime temperature, there is a 3.67% decrease in the daily number of confirmed cases. We demonstrate how considering the effect of climate (e.g., daytime temperature) on COVID-19 spread significantly improves the prediction accuracy.
• Compared to the IBZ, LSHTM, ICL, and IHME models, COVIDHunter achieves more accurate estimation, provides no prediction delay, and provides ease of use and high flexibility due to its simple modeling approach that uses a small number of parameters.
• Using COVIDHunter, we demonstrate that the spread of COVID-19 in Switzerland is still active (i.e., R > 1.0) and that curbing this spread requires maintaining the same strength of the currently applied mitigation measures for at least another 30 days.
• We release the well-documented source code of COVIDHunter and show how easy it is to flexibly configure it for any scenario and extend it for measures and conditions other than those we account for.
The primary purpose of our COVIDHunter model is to monitor and predict the spread of COVID-19 in a flexibly configurable and easy-to-use way, while accounting for changes in mitigation measures and environmental conditions over time. We employ a three-stage approach to develop and deploy this model. (1) The COVIDHunter model predicts the daily R value based on only three input parameters to maintain both quick simulation and high flexibility in configuring these parameters. Each input parameter is configured based on either existing research findings or user-defined values. Our model allows for directly leveraging existing models that study the effect of only mitigation measures (or only environmental conditions) on the spread of COVID-19, as we show in Section 2.2. (2) The COVIDHunter model predicts the number of COVID-19 cases based on the predicted R number. COVIDHunter simulates the entire population of a region and labels each individual according to different stages of the COVID-19 infection timeline. Each stage has a different degree of infectiousness and contagiousness. The model simulates these stages for each individual to maintain accurate predictions. (3) The COVIDHunter model predicts the number of hospitalizations and deaths based on both the predicted number of cases and the R number. Next, we explain the COVIDHunter model in detail.

Table 1. Comparison to other models used to inform government policymakers, as of January 2021.
Model | Open Source | Well-Documented | Accounting for Weather Changes | Low Number of Parameters | Reported COVID-19 Statistics
COVIDHunter (this work) | ✓ | ✓ | ✓ | ✓ | ✓ (R, cases, hospitalizations, and deaths)
IBZ (Huisman et al., 2020) | ✓ | ✗ | ✗ | ✓ | ✗ (only R)
LSHTM (Russell et al., 2020) | ✓ | ✗ | ✗ | ✓ | ✗ (only cases)
ICL (Flaxman et al., 2020) | ✓ | ✓ | ✗ | ✗ | ✓ (R, cases, hospitalizations, and deaths)
IHME (Reiner et al., 2020) | ✓ ∗ | ✗ | ✗ | ✗ | ✗ (cases, hospitalizations, and deaths)
Based on each model's GitHub page (all models are available on GitHub).
∗ The available packages are configured only for the IHME infrastructure.

The COVIDHunter model predicts the dynamic value of R for a population at a given day while considering three key factors: 1) the transmissibility of an infection into a susceptible host population, 2) mitigation measures (e.g., lockdown, social distancing, and isolating infected people), and 3) environmental conditions (e.g., air temperature). Our model calculates the time-varying R number using Equation 1 as follows:

$$R(t) = R_0 \cdot (1 - M(t)) \cdot C_e(t) \qquad (1)$$

The R number for a given day, t, is calculated by multiplying three terms: 1) the base reproduction number ($R_0$) for the subject virus, 2) one minus the mitigation coefficient (M) for the given day t, and 3) the environmental coefficient ($C_e$) for the given day t.

The $R_0$ number quantifies the transmissibility of an infection into a susceptible host population by calculating the expected average number of new infections caused by an infected person in a population with no prior immunity to a specific virus (as a pandemic virus is by definition novel to all populations). Hence, the $R_0$ number represents the transmissibility of an infection at only the beginning of the outbreak, assuming the population is not protected via vaccination.
Unlike the R number, the $R_0$ number is a fixed value that does not depend on time. The R number is a time-dependent variable that accounts for the population's reduced susceptibility. The $R_0$ number for the COVID-19 virus can be obtained from several existing studies (such as Hilton and Keeling, 2020; Chang et al., 2020; Shi et al., 2020; de Souza et al., 2020; Rahman et al., 2020) that estimate it by modeling contact patterns during the first wave of the pandemic.

The mitigation coefficient (M) applied to the population is a time-dependent variable with a value between 0 and 1, where 1 represents the strongest mitigation measure and 0 represents no mitigation measure applied. In different countries, mitigation measures take different forms, such as social distancing, self-isolation, school closure, banning public events, and complete lockdown. These measures exhibit significant heterogeneity and differ in timing and intensity across countries (Hale et al., 2020; Davies et al., 2020). Quantifying the mitigation measures on a scale from 0 to 1 across different countries is challenging. The Oxford Stringency Index (Hale et al., 2020) is a twice-weekly-updated index that takes values from 0 to 100, representing the severity of nine mitigation measures that are applied by more than 160 countries. Another study (Brauner et al., 2020) estimates the effect of only seven mitigation measures on the R number in 41 countries. We can directly leverage such studies for calculating the mitigation coefficient on a given day after changing the scale from 0:100 to 0:1 by dividing each value of, for example, the Oxford Stringency Index by 100.

The environmental coefficient ($C_e$) is a time-dependent variable representing the effect of external environmental factors on the spread of COVID-19, and it has a value between 0 and 2.
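As a minimal sketch of Equation 1 (not the released COVIDHunter implementation), the three terms can be combined as follows. The function names and the example inputs (a base reproduction number of 2.7, a stringency value of 60, and an environmental coefficient of 0.9) are illustrative assumptions.

```python
def mitigation_from_stringency(stringency_index: float) -> float:
    """Rescale a 0..100 stringency value to the 0..1 mitigation coefficient M."""
    if not 0.0 <= stringency_index <= 100.0:
        raise ValueError("stringency index must lie in [0, 100]")
    return stringency_index / 100.0


def reproduction_number(r0: float, m: float, c_e: float) -> float:
    """Equation 1: R(t) = R_0 * (1 - M(t)) * C_e(t)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("mitigation coefficient M must lie in [0, 1]")
    if not 0.0 <= c_e <= 2.0:
        raise ValueError("environmental coefficient C_e must lie in [0, 2]")
    return r0 * (1.0 - m) * c_e


# Hypothetical inputs for one day t (all three values are assumptions).
m_t = mitigation_from_stringency(60.0)
r_t = reproduction_number(r0=2.7, m=m_t, c_e=0.9)
print(f"M(t) = {m_t:.2f}, R(t) = {r_t:.3f}")
```

With M(t) = 1 the product is zero regardless of the other terms, which matches the interpretation of 1 as the strongest possible mitigation.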
Several related viral infections, such as those caused by the Influenza virus, human coronaviruses, and human respiratory viruses, already show notable seasonality (showing peak incidences during only the winter (or summer) months) (Moriyama et al., 2020; Fisman, 2012). The seasonal changes in temperature, humidity, and ultraviolet light affect the pathogen infectiousness outside the host (Fares, 2013; Kampf et al., 2020; Riddell et al., 2020; Xu et al., 2020). However, indoor environmental conditions are usually well-controlled throughout the year, where human behavior and the number of households can be the major contributors to the spread of COVID-19 (Moriyama et al., 2020). There are currently several studies that demonstrate the strong dependence of the transmission of the SARS-CoV-2 virus on one or more environmental conditions, even after controlling for (isolating) the impact of mitigation measures and behavioral changes that reduce contacts. Several studies have demonstrated increased infectiousness by a country-dependent fixed rate with each 1 °C fall in daytime temperature (Xie and Zhu, 2020; Prata et al., 2020). Another study supports the same temperature-infectiousness relationship, but it also finds that, before applying any mitigation measures, a one-degree drop in relative humidity shows increased infectiousness by a rate lower (2.94× less) than that of temperature (Wang et al., 2020).

One of the most comprehensive studies, which spans more than 3700 locations around the world, is HARVARD CRW (Xu et al., 2020). It finds the statistical correlation between the relative changes in the R number and both weather (temperature, ultraviolet index, humidity, air pressure, and precipitation) and air pollution (SO2 and ozone) after controlling for the impact of mitigation measures. The study provides a CRW Index that has a value from 0.5 to 1.5. The percentage difference between any two consecutive values provided by the CRW Index represents the effect that both weather and air pollutants have on the R number.
For example, a drop in the CRW Index by 10% in a given location points to a 10% reduction in the R number due to weather changes and air pollutants. Our model enables applying any of these studies by adjusting our environmental coefficient on a given day, as we experimentally demonstrate in Section 3. For example, if the COVIDHunter user chooses to consider the HARVARD CRW study, and the CRW Index shows a 10% drop compared to its immediately preceding data point, then the environmental coefficient of COVIDHunter should be 0.9 so that the R value also decreases by 10%. Next, we explain how our model forecasts the number of COVID-19 cases based on Equation 1.

COVIDHunter tracks the number of infected and uninfected persons over time by clustering the population into four main categories: HEALTHY, INFECTED, CONTAGIOUS, and IMMUNE. The model initially considers the entire population as uninfected (i.e., HEALTHY). For each simulated day, the model calculates the R value using Equation 1 and decides how many persons can be infected during that day. The day when the first case of infection is introduced into a population is defined by the user. For each newly infected person (INFECTED), the model maintains a counter that counts the number of days from being infected to being contagious (CONTAGIOUS). Several COVID-19 case studies show that presymptomatic transmission can occur 1–3 days before symptom onset (Wei et al., 2020; Slifka and Gao, 2020). COVID-19 patients mostly develop symptoms after an incubation period of 1 to 14 days (the median incubation period is estimated to be 4.5 to 5.8 days) (Lauer et al., 2020; Li et al., 2020). We calculate the number of days it takes an infected person to become contagious as a random number with a Gaussian distribution that has user-defined lowest and highest values. Each contagious person may infect N other persons depending on mobility, population density, number of households, and several other factors (Ferguson et al., 2020). We calculate the value of N as a random number with a Gaussian distribution that has a lowest value of 0 and a highest value determined by the user. If N is greater than the R number (i.e., the target number of infections for that day has been reached), further infections are curtailed by infecting only R persons, which prevents overestimation of N. Once the contagious person infects the desired number of susceptible persons, the status of the contagious person becomes immune (IMMUNE). The immune status indicates that the person has immunity to reinfection due to either vaccination or being recently infected (Lumley et al., 2020).

Our model also simulates the effect of infected travelers (e.g., daily cross-border commuters within the European Union) on the value of R. These travelers can initiate the infection(s) at the beginning of the pandemic. If such infected travelers are absent from the target population (due to, for example, an emergency lockdown), the virus would die out once the value of R decreases below 1 for a sufficient period of time. Both the number and percentage of infected travelers entering a region are configurable in our model. The percentage of incoming infected travelers is not affected by the changes in the local mitigation measures, as these travelers were infected abroad.

Our model predicts the daily number of COVID-19 cases for a given day t as follows:

$$Daily\_Cases(t) = \sum_{n=0}^{T_{INF}(t)} N(n) + \sum_{m=0}^{U_{CON}(t)} N(m) \qquad (2)$$

where $T_{INF}$ is the daily number of infected travelers, which is a user-defined variable, N() is a function that calculates the number of persons to be infected by a given person as a random number with a Gaussian distribution, and $U_{CON}$ is the daily number of contagious persons calculated by our model.
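The state transitions and Equation 2 can be sketched as a small day-by-day loop. This is a simplified illustration under stated assumptions, not the released COVIDHunter code: the Gaussian draws are clipped to the user-defined range with an assumed spread, the daily cap is read as round(R(t) × number of spreaders), and all names (`bounded_gauss`, `simulate`, `r_of_t`, `travelers_of_t`) are hypothetical.

```python
import random


def bounded_gauss(rng, low, high):
    """Integer Gaussian draw centred on the range midpoint, clipped to [low, high]."""
    mu = (low + high) / 2
    sigma = (high - low) / 6 or 1.0  # assumption: +/- 3 sigma spans the range
    return min(high, max(low, round(rng.gauss(mu, sigma))))


def simulate(days, population, r_of_t, travelers_of_t, max_n, min_inc, max_inc, seed=0):
    rng = random.Random(seed)
    healthy, immune = population, 0
    incubating = []  # days left until each INFECTED person turns CONTAGIOUS
    daily_cases = []
    for t in range(days):
        # INFECTED persons whose counter expires today become CONTAGIOUS.
        contagious = sum(1 for d in incubating if d <= 1)
        incubating = [d - 1 for d in incubating if d > 1]
        # Equation 2 sums N() over infected travelers and contagious persons.
        spreaders = travelers_of_t(t) + contagious
        # Assumed reading of the cap: the daily target is R(t) per spreader.
        target = round(r_of_t(t) * spreaders)
        new_cases = 0
        for _ in range(spreaders):
            if new_cases >= target:
                break
            n = bounded_gauss(rng, 0, max_n)
            new_cases += min(n, target - new_cases)
        new_cases = min(new_cases, healthy)  # only HEALTHY persons can be infected
        healthy -= new_cases
        immune += contagious  # CONTAGIOUS -> IMMUNE after spreading
        incubating += [bounded_gauss(rng, min_inc, max_inc) for _ in range(new_cases)]
        daily_cases.append(new_cases)
    return {"daily_cases": daily_cases, "healthy": healthy,
            "infected": len(incubating), "immune": immune}


# Hypothetical scenario: constant R, two infected travelers per day for ten days.
result = simulate(days=60, population=10_000, r_of_t=lambda t: 1.5,
                  travelers_of_t=lambda t: 2 if t < 10 else 0,
                  max_n=25, min_inc=1, max_inc=5)
print(sum(result["daily_cases"]), result["healthy"], result["immune"])
```

Tracking one counter per infected person keeps the sketch close to the per-individual description above; a production model would likely use a more compact representation for populations in the millions.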
There are currently two key approaches for calculating the estimated number of both hospitalizations and deaths due to COVID-19: 1) using historical statistical probabilities, each of which is unique to each age group in a population (Bhatia and Klausner, 2020; Bi et al., 2020), and 2) using historical COVID-19 hospitalizations-to-cases and deaths-to-cases ratios (Kobayashi et al., 2020). We choose to follow a modified version of the second approach, as it does not require 1) clustering the population into age groups and 2) calculating the risk of each individual using the given probability, both of which affect the complexity of the model and the simulation time.

The number of COVID-19 hospitalizations for a given day, t, can be calculated as follows:

$$Daily\_Hospitalizations(t) = Daily\_Cases(t) \cdot X \cdot C_X \qquad (3)$$

where $Daily\_Cases(t)$ is calculated using Equation 2 and X is the hospitalizations-to-cases ratio, which is calculated as the average of daily ratios of the number of COVID-19 hospitalizations to the laboratory-confirmed number of COVID-19 cases. As the true number of cases is unknown due to the lack of population-scale testing, it is extremely difficult to make accurate estimates of the true number of COVID-19 hospitalizations. As such, we assume a fixed multiplicative relationship between the number of laboratory-confirmed cases and the true number of cases. We use the user-defined correction coefficient, $C_X$, of the hospitalizations-to-cases ratio to account for such a multiplicative relationship.

The number of COVID-19 deaths for a given day t can be calculated as follows:

$$Daily\_Deaths(t) = Daily\_Cases(t) \cdot Y \cdot C_Y \qquad (4)$$

where $Daily\_Cases(t)$ is calculated using Equation 2 and Y is the deaths-to-cases ratio, which is calculated as the average of daily ratios of the number of COVID-19 deaths to the number of laboratory-confirmed COVID-19 cases.
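Equations 3 and 4 are simple scalings of the predicted case count, as the following sketch illustrates. The function names, the case series, and the choice of correction coefficients equal to 1.0 are assumptions for illustration; the ratios 3.526% and 2.441% are the Swiss second-wave values reported later in the text.

```python
def daily_hospitalizations(daily_cases: float, x: float, c_x: float = 1.0) -> float:
    """Equation 3: Daily_Hospitalizations(t) = Daily_Cases(t) * X * C_X."""
    return daily_cases * x * c_x


def daily_deaths(daily_cases: float, y: float, c_y: float = 1.0) -> float:
    """Equation 4: Daily_Deaths(t) = Daily_Cases(t) * Y * C_Y."""
    return daily_cases * y * c_y


X = 0.03526  # hospitalizations-to-cases ratio
Y = 0.02441  # deaths-to-cases ratio (based on excess deaths)

predicted_cases = [1000, 1500, 2200]  # hypothetical Daily_Cases(t) values
for t, cases in enumerate(predicted_cases):
    hosp = daily_hospitalizations(cases, X)
    dead = daily_deaths(cases, Y)
    print(f"day {t}: cases={cases}, hospitalizations={hosp:.1f}, deaths={dead:.1f}")
```

Keeping the two correction coefficients as separate arguments reflects that they are configured independently by the user.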
The observed number of COVID-19 deaths can still be less than the true number of COVID-19 deaths due to, for example, under-reporting. We use the user-defined correction coefficient, $C_Y$, to account for the under-reporting. One way to find the true number of COVID-19 deaths is to calculate the number of excess deaths. The number of excess deaths is the difference between the observed number of deaths during a time period and the expected (based on historical data) number of deaths during the same time period. For this reason, $C_Y$ may not necessarily be equal to $C_X$.

We can validate our model using two key approaches. 1) Comparing the daily R number predicted by our model (using Equation 1) with the daily reported official R number for the same region. 2) Comparing the daily number of COVID-19 cases predicted by our model (using Equation 2) with the daily number of laboratory-confirmed COVID-19 cases. As of January 2021, we have already witnessed one year of the pandemic, which provides us with several observations and lessons. The most obvious source of uncertainty, affecting all models, is that the true number of persons that are previously infected or currently infected is unknown (Wilke and Bergstrom, 2020). This affects the accuracy of the reported R number, since it is calculated as, for example, the ratio of the number of cases for a week (7-day rolling average) to the number of cases for the preceding week. Adjusting the parameters of our model to fit the curve of the number of confirmed cases is likely to be highly uncertain. The publicly available numbers of COVID-19 hospitalizations and deaths can provide more reliable data.

For these reasons, we decide to use a combination of reported numbers of cases, hospitalizations, and deaths for validating our model using three key steps. 1) We leverage the more reliable data of the reported number of hospitalizations (or deaths) to estimate the true number of COVID-19 cases using the ratio of the number of laboratory-confirmed hospitalizations (or deaths) to the number of laboratory-confirmed cases during the second wave of the COVID-19 pandemic. We assume that the COVID-19 statistics during the second wave are more accurate than those during the first wave because generally more testing is performed in the second wave. 2) We consider a multiplicative relationship between the true number of COVID-19 cases and that estimated in step 1. In our experimental evaluation (Section 3), we use the true number of COVID-19 cases calculated using different multiplicative factor values (we refer to them as certainty rate levels) as a ground truth for validating our model. A certainty rate of, for example, 50% means that the true number of COVID-19 cases is actually double that calculated in step 1. 3) We use our model to calculate both the daily R number (using Equation 1) and the number of COVID-19 cases (using Equation 2). We fix two terms of Equation 1, $R_0$ and $C_e$, using publicly available data for a given region and change the third term, M, until we fit the curve of the number of cases predicted by our model to the ground-truth plot calculated in step 2. We use the same methodology to validate our predicted numbers of hospitalizations and deaths with different certainty rate levels, as we show in Section 3 and the Supplementary Excel File.

We specifically build the COVIDHunter model to be flexible to configure and easy to extend for representing any existing or future scenario using different values of the three terms of Equation 1, 1) $R_0$, 2) M(t), and 3) $C_e(t)$, in addition to several other parameters such as the population, number of travelers, percentage of expected infected travelers to the total number of travelers, and hospitalizations- or deaths-to-cases ratios.
Our modeling approach acts across the overall population without assuming any specific age structure for transmission dynamics. It is still possible to consider each age group separately using individual runs of the COVIDHunter model simulation, each of which has its own parameter values adjusted for the target age group. The COVIDHunter model considers each location independently of other locations, but it also accounts for potential movement between locations by adjusting the corresponding parameters for travelers. By allowing most of the parameters to vary in time, t, the COVIDHunter model is capable of accounting for any change in transmission intensity due to changes in environmental conditions and mitigation measures over time. As we explain in Section 2.2, the flexibility of configuring the environmental coefficient and mitigation coefficient allows our proposed model to control for location-specific differences in population density, cultural practices, age distribution, and time-variant mitigation responses in each location. Our modeling approach considers a single strain of the COVID-19 virus by using a single base reproduction number, $R_0$. It is possible to consider multiple virus strains by running the model simulation multiple times, each of which considers one of the strains individually. The model can also be extended to consider multiple virus strains by replacing the $R_0$ number with multiple $R_0$ numbers that represent the different strains (Reichmuth et al., 2021).

We evaluate the daily 1) R number, 2) mitigation measures, and 3) numbers of COVID-19 cases, hospitalizations, and deaths. We also evaluate the daily numbers of HEALTHY, INFECTED, CONTAGIOUS, and IMMUNE in the Supplementary Excel File (https://github.com/CMU-SAFARI/COVIDHunter/blob/main/Evaluation_Results/SimulationResultsForSwitzerland.xlsx). We compare the predicted values to their corresponding observed values whenever possible. We provide a comprehensive treatment of all datasets, models, and evaluation results with different model configurations in the Supplementary Materials and the Supplementary Excel Files.

We use Switzerland as a use case for all the experiments. However, our model is not limited to any specific region, as the parameters it uses are completely configurable. To predict the R number, we use Equation 1, which requires three key variables. We set the base reproduction number, $R_0$, for SARS-CoV-2 in Switzerland to 2.7, as shown in (Hilton and Keeling, 2020). We choose two main approaches for setting the value of the time-varying environmental coefficient variable ($C_e$). 1) Performing statistical analysis of the relationship between the daily number of COVID-19 cases and the average daytime temperature in Switzerland. As we provide in the Supplementary Materials, Section 1, our statistical analysis shows that each 1 °C rise in daytime temperature is associated with a 3.67% (t-value = -3.244 and p-value = 0.0013) decrease in the daily number of confirmed COVID-19 cases. We refer to this approach as the Cases-Temperature Coefficient (CTC). 2) Applying the HARVARD CRW (Xu et al., 2020) (CRW in short), which provides the statistical relationship between the relative changes in the R number and both weather factors and air pollutants after controlling for the impact of mitigation measures. We change the daily mitigation coefficient, M(t), based on the ratio of the number of confirmed hospitalizations to the number of confirmed cases with two certainty rate levels of 100% and 50%, as we explain in detail in Section 2.5. This helps us to take into account uncertainty in the observed number of COVID-19 cases, hospitalizations, and deaths.
We set the minimum and maximum incubation time for SARS-CoV-2 as 1 and 5 days, respectively, as the 5-day period represents the median incubation period worldwide (Lauer et al., 2020; Li et al., 2020). We set the population to 8654622. We empirically choose the values of N, the number of travelers, and the ratio of the number of infected travelers to the total number of travelers to be 25, 100, and 15%, respectively.

As the exact true number of COVID-19 cases remains unknown (due to, for example, the lack of population-scale COVID-19 testing), we expect the true number of COVID-19 cases in Switzerland to be higher than the observed (laboratory-confirmed) number of cases. We calculate the expected true number of cases based on both numbers of deaths and hospitalizations, as we explain in Section 2.5. To account for possibly missing COVID-19 deaths, we consider the excess deaths instead of observed deaths. We calculate the excess deaths as the difference between the observed weekly number of deaths in 2020 and the 5-year average of weekly deaths. We find X (hospitalizations-to-cases ratio) and Y (deaths-to-cases ratio, using excess death data) to be 3.526% and 2.441%, respectively, during the second wave of the pandemic in Switzerland. We choose the second wave to calculate the values of X and Y because Switzerland increased the daily number of COVID-19 tests by about 5.3× (21641/4074) on average compared to the first wave.
We calculate the expected number of cases on a given day t with certainty rate levels of 100% and 50% based on hospitalizations by dividing the number of hospitalizations at t by X and X/2, respectively, as we show in Figure 1. We apply the same approach to calculate the expected number of cases on a given day t with certainty rate levels of 100% and 50% based on deaths using Y and Y/2, respectively.

Based on Figure 1, we make two key observations. 1) The plot for the expected number of cases calculated based on the number of deaths is shifted forward by 10-20 days (15 days on average) from that for the expected number of cases calculated based on the number of hospitalizations. This is because each hospitalized patient usually spends some number of days in hospital before dying of COVID-19. We do not observe a significant time shift between the plot of the expected number of cases calculated based on the number of hospitalizations and the plot of observed (laboratory-confirmed) cases. 2) The expected number of cases calculated based on the number of hospitalizations is on average . × higher than the expected number of cases calculated based on the number of deaths (after accounting for the 15-day shift) for the same certainty rate. This is expected, as not all hospitalized patients die.

We conclude that both numbers of hospitalizations and deaths can be used for estimating the true number of COVID-19 cases after accounting for the time-shift effect.
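The conversion from hospitalizations or deaths to expected cases can be sketched in a few lines of Python. This is an illustrative sketch rather than the COVIDHunter implementation: the function names `expected_cases` and `shift_earlier` are hypothetical, the daily counts are made up, and the sketch assumes that a certainty rate of 50% halves the outcome-to-cases ratio (so the estimate doubles) and that the death-based curve is aligned by moving it 15 days earlier.

```python
def expected_cases(daily_outcomes, outcome_to_cases_ratio, certainty_rate=1.0):
    """Estimate daily cases from daily hospitalizations or deaths.

    outcome_to_cases_ratio is X or Y from the text, given as a fraction
    (e.g. 0.03526 for 3.526%). certainty_rate is 1.0 or 0.5 in the text.
    """
    if not 0.0 < certainty_rate <= 1.0:
        raise ValueError("certainty_rate must be in (0, 1]")
    # Assumption: a lower certainty rate scales the ratio down (X -> X/2 at 50%).
    effective_ratio = outcome_to_cases_ratio * certainty_rate
    return [count / effective_ratio for count in daily_outcomes]


def shift_earlier(series, days=15):
    """Drop the first `days` entries so that a lagging (death-based) series
    lines up with a hospitalization-based one."""
    return series[days:]


X = 0.03526  # hospitalizations-to-cases ratio reported for the second wave
hospitalizations = [100, 120, 150]  # made-up daily counts, for illustration only

cases_100 = expected_cases(hospitalizations, X, certainty_rate=1.0)
cases_50 = expected_cases(hospitalizations, X, certainty_rate=0.5)
print([round(c) for c in cases_100])
print([round(c) for c in cases_50])
```

Under these assumptions, the 50% estimate is always twice the 100% estimate, which is the relationship between the two pairs of expected-case curves in Figure 1.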
Fig. 1. Observed (officially reported) and expected number of COVID-19 cases in Switzerland during the year of 2020. We calculate the expected number of cases based on both the hospitalizations-to-cases and deaths-to-cases ratios for the second wave. We assume two certainty rate levels of 50% and 100%.

R number of SARS-CoV-2

We calculate the predicted R number using our model (Equation 1) and compare it to the observed official R number and the R number of two state-of-the-art models, ICL and IBZ, for the two years of 2020 and 2021. We configure COVIDHunter using the following configurations: 1) CTC as the environmental condition approach, 2) certainty rate levels of 50% and 100%, and 3) a mitigation coefficient value of 0.7. All our scripts are provided on our GitHub page. We consider the mean R number provided by the ICL model. We consider the median R number calculated by the IBZ model based on the observed number of hospitalized patients. IBZ provides the predicted (after mid-December 2020) R number as the mean of the estimates from the last 7 days.

Based on Figure 2, we make three key observations. 1) COVIDHunter predicts the changes in the R number much (4-13 days) earlier than the ICL model does, which leads to a more accurate prediction. The R number predicted by COVIDHunter (with a certainty rate level of 50%) is on average 1.56× lower than that predicted by the ICL model, the IBZ model, and the observed official R number. Using a certainty rate level of 100%, COVIDHunter predicts the R number to be close in value to the observed R number. 2) Our model predicts that the current R number is still higher than 1 (1.137 and 1.023 using certainty rate levels of 50% and 100%, respectively) during January 2021. This indicates that the spread of the SARS-CoV-2 virus is still active and that it causes an exponential increase in the number of new cases.
3) Our model predicts that if we keep the same mitigation measure strength as that of January 2021 for at least 30 days (M(t) = 0.7), then the R number would drop by 18.2% (R = 0.929 and 0.836 for certainty rate levels of 50% and 100%, respectively). However, if the mitigation measures that are applied nationwide in Switzerland are relaxed by 50% (M(t) = 0.35) for only 30 days (22 January to 22 February 2021), then the R number increases by at least 2.17×.

We conclude that COVIDHunter's estimation of the R number is more accurate than that calculated by the ICL and IBZ models, as validated by the currently observed R number.
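The 2.17× figure can be checked with a short calculation. Equation 1 is not reproduced in this section, so the sketch below assumes only that R(t) scales in proportion to (1 - M(t)) when all other factors are held fixed; this proportionality and the helper names are assumptions made for illustration.

```python
def relative_change_in_r(m_before, m_after):
    """Factor by which R changes when the mitigation coefficient changes,
    assuming R(t) is proportional to (1 - M(t)) with everything else fixed."""
    return (1.0 - m_after) / (1.0 - m_before)


def percent_drop(before, after):
    """Percentage decrease from `before` to `after`."""
    return 100.0 * (before - after) / before


# Relaxing mitigation from M(t) = 0.7 to M(t) = 0.35.
print(round(relative_change_in_r(0.7, 0.35), 2))  # 2.17

# Drop in R when M(t) = 0.7 is kept, using the R values quoted in the text.
print(round(percent_drop(1.137, 0.929), 1))  # certainty rate level of 50%
print(round(percent_drop(1.023, 0.836), 1))  # certainty rate level of 100%
```

Both drops come out at roughly 18%, consistent with the reported 18.2% given that the quoted R values are rounded.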
[Figure 2 plot. x-axis: Date; y-axis: Reproduction Number, R(t); series: Observed R, ICL, IBZ, CTC_50%_M(t)=0.35, CTC_50%_M(t)=0.7, CTC_100%_M(t)=0.7; phases: Monitoring, Predicting.]
Fig. 2. Observed and predicted reproduction number, R(t), for the two years of 2020 and 2021. We use the CTC environmental condition approach, certainty rate levels of 50% and 100%, and mitigation coefficient values of 0.35 and 0.7 for COVIDHunter. We compare COVIDHunter's predicted R number to the observed R number and two state-of-the-art models, ICL and IBZ. The horizontal dashed line represents R(t) = 1.0.

We evaluate the mitigation coefficient, M(t), which represents the mitigation measures applied (or to be applied) in Switzerland from January 2020 to May 2021. We use two different environmental condition approaches, CRW and CTC. We assume two certainty rate levels of 50% and 100% to account for uncertainty in the observed number of cases. We use five mitigation coefficient, M(t), values of 0.35, 0.4, 0.5, 0.6, and 0.7 for each configuration of COVIDHunter during 22 January to 22 February 2021. We compare the evaluated mitigation measures to those evaluated by the Oxford Stringency Index (Hale et al., 2020), as we provide in Figure 3. We also evaluate the mitigation coefficient when we ignore the effect of environmental changes (i.e., by setting C_e = 1 in Equation 1), while maintaining the same number of COVID-19 cases as that provided with a certainty rate level of 50%.

Based on Figure 3, we make four key observations. 1) Excluding the effect of environmental changes from the COVIDHunter model, by setting C_e = 1 in Equation 1, leads to an inaccurate evaluation of the mitigation measures. For example, during the summer of 2020 (between the two major waves of 2020), COVIDHunter (WithoutCTC_50%) evaluates the mitigation coefficient to be as high as 0.6. This means that the mitigation measures (only mandatory mask wearing on public transport) applied during the summer of 2020 are only 14% more relaxed compared to the mitigation measures (e.g., closure of schools, restaurants, and borders, ban on small and large events) applied during the first wave, which is implausible. This highlights the importance of considering the effect of external environmental changes on simulating the spread of COVID-19. Unfortunately, environmental change effects are not considered by any of the IBZ, LSHTM, ICL, and IHME models, which we believe is a serious shortcoming of these prior models. 2) A drop of 3% (as we observe in mid-November 2020) to 30% (as we observe at the end of August 2020) in the strength of the mitigation measures for a certain period of time (10 to 20 days) is enough to double the predicted number of COVID-19 cases. 3) We evaluate the strength of the mitigation measures applied in Switzerland to be usually (65% of the time) 80% to 131% higher than that provided by the Oxford Stringency Index. 4) The strength of the mitigation measures changed 11 times during the year of 2020, and each strength level was maintained for at least 9 days and at most 66 days (32 days on average).

We conclude that considering the effect of environmental changes (e.g., daytime temperature) on the spread of COVID-19 improves simulation outcomes and provides accurate evaluation of the strength of the past and current mitigation measures.
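[Figure 3 plot. x-axis: Date; y-axis: Mitigation Coefficient, M(t); series: Oxford Stringency Index, CTC_50%, CTC_100%, CRW_50%, CRW_100%, WithoutCTC_50%; phase label: Monitoring.]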
Fig. 3. Predicted strength of the mitigation measures (mitigation coefficient, M(t)) applied in Switzerland from January 2020 to May 2021 provided by the Oxford Stringency Index and COVIDHunter. We use two different environmental condition approaches, CRW and CTC. We assume two certainty rate levels of 50% and 100%. We use five mitigation coefficient, M(t), values of 0.35, 0.4, 0.5, 0.6, and 0.7 for each configuration of our model during 22 January to 22 February 2021. The plot called WithoutCTC_50% represents the evaluation of the current mitigation measures while ignoring the effect of environmental changes.
We evaluate COVIDHunter's predicted daily number of COVID-19 cases in Switzerland. We compare the numbers predicted by our model to the observed numbers and those provided by three state-of-the-art models (ICL, IHME, and LSHTM), as shown in Figure 4. We calculate the observed number of cases as the expected number of cases with a certainty rate level of 100% (as we discuss in Section 3.2). We use three default configurations for the prediction of the ICL model: 1) strengthening mitigation measures by 50%, 2) maintaining the same mitigation measures, and 3) relaxing mitigation measures by 50%, which we refer to as ICL+50%, ICL, and ICL-50%, respectively, in Figures 4, 5, and 6. We use the mean numbers reported by the IHME model that represent the most relaxed mitigation measures, called "no vaccine" by the IHME model. We use the median numbers reported by the LSHTM model.

Based on Figure 4, we make four key observations. 1) Our model predicts that the number of COVID-19 cases reduces significantly (to fewer than 600 daily cases) within March 2021 if the same strength of the currently applied mitigation measure is maintained for at least 30 days. If the authority decides to relax the mitigation measures to the lowest strength that has been applied during the year of 2020 (i.e., M(t) = 0.35), then the daily expected number of cases increases by an average of . × and . × (up to 288,827 daily cases) using the CRW and CTC environmental approaches, respectively. We provide a comprehensive evaluation of the effect of different mitigation coefficient values on the number of cases in the Supplementary Materials, Section 2. 2) COVIDHunter predicts the number of COVID-19 cases to be equivalent to that predicted by the IHME model during the second wave with a certainty rate level of 50%. However, during the first wave, the predictions of the IHME model match the expected number of cases using a certainty rate level of 100%. This means that, unlike our model, the IHME model treats the laboratory-confirmed cases as if testing had been done at population scale during the first wave, which is very likely incorrect. This is in line with a recent study (Ioannidis et al., 2020) that demonstrates the high inaccuracy of the IHME model. 3) Overall, our model predicts on average . × and . × fewer COVID-19 cases than the ICL model using the CTC and CRW approaches, respectively, and a certainty rate of 50%.
This suggests that the multiplicative relationship between the confirmed number of cases and the true number of cases can be represented by a certainty rate of 22% to 33%, which our model can easily account for. The ICL model also shows that there is a sharp drop in the daily number of cases after 13 November 2020, which corresponds to a 1.6×, 1.4×, and 1.3× increase in the Oxford Stringency Index, CRW coefficient, and CTC coefficient, respectively, applied on 30 October 2020, as we show in Figure 3. 4) The number of COVID-19 cases estimated by the LSHTM model during the first wave is (a) on average 24% less than that estimated by COVIDHunter and (b) 10 days later than that predicted by COVIDHunter, IHME, and ICL. The prediction of the LSHTM model during the second wave is not available in the model's pre-computed projections.

We conclude that COVIDHunter provides more accurate estimation of the number of COVID-19 cases, compared to IHME (which provides inaccurate estimation during the first wave) and ICL (which provides over-estimation), with complete control over the certainty rate level, mitigation measures, and environmental conditions. Unlike LSHTM, COVIDHunter also ensures no prediction delay.

[Figure 4 plot. x-axis: Date; y-axis: Number of COVID-19 Cases; series: Expected Cases, CRW_50%_M(t)=0.7, CRW_50%_M(t)=0.35, CTC_50%_M(t)=0.7, CTC_50%_M(t)=0.35, CTC_100%_M(t)=0.7, IHME, LSHTM, ICL, ICL+50%, ICL-50%.]
Fig. 4. Observed and predicted number of COVID-19 cases by our model and three other state-of-the-art models. We use two different environmental condition approaches, CRW and CTC, with two certainty rate levels of 50% and 100%. We use two mitigation coefficient, M(t), values of 0.35 and 0.7 for each configuration of our model during 22 January to 22 February 2021.

We evaluate COVIDHunter's predicted daily number of COVID-19 hospitalizations in Figure 5. We use the observed official number of hospitalizations as is. Using the number of cases calculated with Equation 2, we find X (hospitalizations-to-cases ratio) to be 4.288% and 2.780%, using CRW and CTC, respectively, during the second wave.

We make five key observations based on Figure 5. 1) The number of hospitalizations calculated by COVIDHunter with a certainty rate level of 50% matches that calculated by the IHME model. However, the IHME model provides its prediction 10-12 days later than COVIDHunter and the ICL model. 2) The ICL model predicts the number of hospitalizations to be × and × higher than that predicted by COVIDHunter during the first wave ( . × and . × during the second wave), using the CTC and CRW approaches, respectively, for evaluating the environmental conditions and a certainty rate of 50%. This suggests that the ICL model provides × and . × higher number of hospitalizations compared to the observed number of hospitalizations, during the first and second waves, respectively, which is highly unlikely and overestimated. 3) COVIDHunter with a certainty rate level of 100% predicts the number of hospitalizations to perfectly fit the curve of the observed number of hospitalizations, reaching up to 257 hospitalized patients a day. 4) Our model predicts that the number of COVID-19 hospitalizations reduces with stricter mitigation measures maintained for at least 30 days. Relaxing the mitigation measures by 50% (M(t) is changed from 0.7 to 0.35) exponentially increases the number of hospitalizations by an average of . × and . ×, reaching up to 12385 new daily hospitalized patients, as predicted by COVIDHunter using the CRW and CTC environmental condition approaches, respectively. This is in line with what the ICL model (ICL-50%) predicts when it is configured for a 50% relaxation in the mitigation measures. 5) The use of the CTC approach for determining the environmental coefficient value yields a slightly different number of hospitalizations compared to that provided by the use of the CRW approach. This is expected, as the CTC approach considers only the monthly average change in temperature, whereas the CRW approach considers the daily change in several environmental conditions.

We conclude that 1) unlike the IBZ and LSHTM models, COVIDHunter is able to predict the number of hospitalizations and 2) COVIDHunter provides more accurate estimation of the number of hospitalizations compared to that calculated by ICL (which provides overestimation) and IHME (which provides late estimation). COVIDHunter predicts the number of COVID-19 hospitalizations in a simple, convenient, and flexible way that requires calculating only the daily number of cases and the hospitalization-to-cases ratio, C_X.
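As a rough illustration of how little is needed for this step, the sketch below multiplies a daily case series by a hospitalization-to-cases ratio. The function name is hypothetical and this is not the COVIDHunter code; the inputs are the CRW ratio quoted above (4.288%) and the maximum of 4972 daily cases that the Supplementary Materials report for CRW with M(t) = 0.7.

```python
def predict_hospitalizations(daily_cases, hosp_to_cases_ratio):
    """Daily hospitalizations as a fixed fraction (C_X) of daily cases."""
    return [cases * hosp_to_cases_ratio for cases in daily_cases]


C_X_CRW = 0.04288  # 4.288%, second-wave ratio with the CRW approach
peak_cases = [4972]  # maximum daily cases reported for CRW with M(t) = 0.7

peak_hospitalizations = predict_hospitalizations(peak_cases, C_X_CRW)
print(round(peak_hospitalizations[0]))  # 213
```

The result agrees with the maximum of 213 daily hospitalizations reported for the same configuration in the Supplementary Materials.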
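[Figure 5 plot. x-axis: Date; y-axis: Number of COVID-19 Hospitalizations; series: Observed Hospitalizations, CRW_50%_M(t)=0.7, CRW_50%_M(t)=0.35, CTC_50%_M(t)=0.7, CTC_50%_M(t)=0.35, CTC_100%_M(t)=0.7, IHME, ICL, ICL+50%, ICL-50%; phases: Monitoring, Predicting.]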
Fig. 5. Observed and predicted number of COVID-19 hospitalizations. We use two different environmental condition approaches, CRW and CTC, with two certainty rate levels of 50% and 100%. We use two mitigation coefficient values, M(t), of 0.35 and 0.7 for each configuration of our model during 22 January to 22 February 2021.

We evaluate COVIDHunter's predicted daily number of COVID-19 deaths in Figure 6 after accounting for the 15-day shift (as we discuss in Section 3.2). We calculate the observed number of deaths as the number of excess deaths (Section 2.4) to account for uncertainty in reporting COVID-19 deaths. Using the number of cases calculated using Equation 2, we find Y (deaths-to-cases ratio, using excess death data) to be 2.730% and 1.739%, using CRW and CTC, respectively, during the second wave.

We make three key observations based on Figure 6. 1) COVIDHunter with a certainty rate of 100% predicts the number of deaths to perfectly fit the three curves of the observed number of excess deaths, ICL deaths, and IHME deaths, reaching up to 160 deaths a day. During the second wave, the ICL curve is shifted (late prediction) by 5-10 days from that of other models. 2) Similar to what we observe for the number of hospitalizations, our model predicts that the number of COVID-19 deaths significantly reduces with stricter mitigation measures maintained for at least the upcoming 30 days. Relaxing the mitigation measures by 50% (M(t) is changed from 0.7 to 0.35) exponentially increases the death toll by an average of . × and . ×, reaching up to 7885 new daily deaths, as predicted by COVIDHunter using the CRW and CTC environmental condition approaches, respectively. 3) During the first wave, the use of a certainty rate of 50% provides . × and . × ( . × and . × during the second wave) higher number of deaths compared to that provided by the ICL and IHME models, when COVIDHunter uses the CRW and CTC environmental condition approaches, respectively.

We conclude that 1) unlike the IBZ and LSHTM models, COVIDHunter is able to predict the number of deaths, 2) COVIDHunter predicts the number of deaths to be similar to that predicted by the ICL and IHME models, while providing more accurate estimation of other COVID-19 statistics (R, number of cases, and hospitalizations) compared to ICL and IHME, as we comprehensively evaluate in the previous sections, and 3) COVIDHunter requires calculating only the daily number of cases and the deaths-to-cases ratio, C_Y, to predict the daily number of deaths.
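The same idea extends to deaths. The sketch below is an illustration under stated assumptions, not the model's code: it applies the deaths-to-cases ratio C_Y to the case count from 15 days earlier, which is one simple way to represent the 15-day shift discussed in Section 3.2. The function name and the use of `None` for days without enough history are choices made for this example.

```python
def predict_deaths(daily_cases, deaths_to_cases_ratio, lag_days=15):
    """Daily deaths as a fraction (C_Y) of the cases seen `lag_days` earlier.

    Days without enough history are returned as None.
    """
    deaths = []
    for day in range(len(daily_cases)):
        if day < lag_days:
            deaths.append(None)
        else:
            deaths.append(daily_cases[day - lag_days] * deaths_to_cases_ratio)
    return deaths


C_Y_CRW = 0.02730  # 2.730%, second-wave ratio with the CRW approach
cases = [4972] * 16  # flat series at the reported CRW maximum, for illustration

deaths = predict_deaths(cases, C_Y_CRW)
print(deaths[14])         # None: no case count from 15 days earlier
print(round(deaths[15]))  # 136
```

With the CRW maximum of 4972 daily cases, the product is about 136, which matches the maximum number of daily deaths that the Supplementary Materials report for CRW with M(t) = 0.7.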
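[Figure 6 plot. x-axis: Date; y-axis: Number of COVID-19 Deaths; series: Observed Excess Deaths, CRW_50%_M(t)=0.7, CRW_50%_M(t)=0.35, CTC_50%_M(t)=0.7, CTC_50%_M(t)=0.35, CTC_100%_M(t)=0.7, IHME, ICL, ICL+50%, ICL-50%; phases: Monitoring, Predicting.]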
Fig. 6. Observed and predicted number of COVID-19 deaths. We use two different environmental condition approaches, CRW and CTC, with two certainty rate levels of 50% and 100%. We use two mitigation coefficient values, M(t), of 0.35 and 0.7 for each configuration of our model during 22 January to 22 February 2021.
We demonstrate that we can monitor and predict the spread of COVID-19 in an easy-to-use, flexible, and validated way using our new simulation model, COVIDHunter. We show how to flexibly configure our model for any scenario and easily extend it for different mitigation measures and environmental conditions. The use of a small number of variables in our model enables a simple and flexible yet powerful way of adapting our model to different conditions for a given region. We demonstrate the importance of considering the effect of environmental changes on the spread of COVID-19 and how doing so can greatly improve simulation accuracy. COVIDHunter flexibly offers the ability to directly make the best use of existing models that study the effect of one or both of environmental conditions and mitigation measures on the spread of COVID-19.

We benchmark our model against major alternative models of the COVID-19 pandemic that are used to assist governments. Compared to these models, COVIDHunter achieves more accurate estimation, provides no prediction delay, and provides ease of use and high flexibility due to the simple modeling approach that uses a small number of parameters. Using COVIDHunter, we demonstrate that the spread of COVID-19 in Switzerland (as a case study) is still active (i.e., R > 1.0) and that curbing this spread requires maintaining the same strength of the currently applied mitigation measures for at least another 30 days. Using COVIDHunter (CTC_100%_M(t)=0.7) on 7 January 2021, we predicted that on 27 January 2021 the number of cases, hospitalizations, and deaths would drop by 19%, 20%, and 30%, respectively. The predicted drop is in line with the observed official number of cases, hospitalizations, and deaths (as shown by the Federal Office of Public Health in Switzerland) but with different ratios (41%, 59%, and 49%, respectively). We believe the difference between the observed and COVIDHunter's predicted numbers of cases, hospitalizations, and deaths is due to one or more of the following reasons: 1) the lack of population-scale COVID-19 testing, 2) the use of a stricter mitigation measure than M(t) = 0.7, and 3) the lack of information about the ground truth on the number of COVID-19 cases, hospitalizations, and deaths. We provide insights on the effect of each change in the strength of the applied mitigation measure on the number of daily cases, hospitalizations, and deaths. We make all the data, statistical analyses, and a well-documented model implementation publicly and freely available to enable full reproducibility and to help society and decision-makers accurately and openly review the current situation and estimate the future impact of decisions.

We suggest and plan at least five main directions/additions to further improve the predictive power and benefits of our COVIDHunter model. 1) Clustering the population based on age groups. This potentially has different effects on, for example, population, environmental conditions, and mitigation measures (Bhatia and Klausner, 2020; Bi et al., 2020). 2) Considering vaccinated persons as another new category of persons in a population. 3) Considering reinfection after immunity (Lumley et al., 2020). 4) Considering the average number of households (or population density), as well as other potential population-level effects, while calculating the number of new infected persons caused by an infected person. 5) Considering different strains of the COVID-19 virus by allowing for multiple base reproduction numbers. Our goal is to update COVIDHunter with such improvements and capabilities while keeping the simplicity, ease of use, and flexibility of its modeling strategy.

References
Andersen, K. G., Rambaut, A., Lipkin, W. I., et al. (2020). The proximal origin of SARS-CoV-2. Nature medicine, (4), 450–452.
Ashcroft, P., Huisman, J. S., Lehtinen, S., et al. (2020). COVID-19 infectivity profile correction. Swiss Medical Weekly, (3132).
Bhatia, R. and Klausner, J. (2020). Estimating individual risks of COVID-19-associated hospitalization and death using publicly available data. PloS one, (12), e0243026.
Bi, Q., Wu, Y., Mei, S., et al. (2020). Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet Infectious Diseases, (8), 911–919.
Brauner, J. M., Mindermann, S., Sharma, M., et al. (2020). Inferring the effectiveness of government interventions against COVID-19. Science.
Chang, S. L., Harding, N., Zachreson, C., et al. (2020). Modelling transmission and control of the COVID-19 pandemic in Australia. Nature Communications.
Davies, N. G., Kucharski, A. J., Eggo, R. M., et al. (2020). Effects of non-pharmaceutical interventions on COVID-19 cases, deaths, and demand for hospital services in the UK: a modelling study. The Lancet Public Health.
de Souza, W. M., Buss, L. F., da Silva Candido, D., et al. (2020). Epidemiological and clinical characteristics of the early phase of the COVID-19 epidemic in Brazil. Nature Human Behaviour, 856–865.
Du Toit, A. (2020). Outbreak of a novel coronavirus. Nature Reviews Microbiology, (3), 123–123.
Fares, A. (2013). Factors influencing the seasonal patterns of infectious diseases. International journal of preventive medicine, (2), 128.
Ferguson, N., Laydon, D., Nedjati-Gilani, G., et al. (2020). Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Imperial College London, 77482.
Fisman, D. (2012). Seasonality of viral infections: mechanisms and unknowns. Clinical Microbiology and Infection, (10), 946–954.
Flaxman, S., Mishra, S., Gandy, A., et al. (2020). Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature, (7820), 257–261.
Hale, T., Petherick, A., Phillips, T., and Webster, S. (2020). Variation in government responses to COVID-19. Blavatnik school of government working paper.
Hilton, J. and Keeling, M. J. (2020). Estimation of country-level basic reproductive ratios for novel Coronavirus (SARS-CoV-2/COVID-19) using synthetic contact matrices. PLOS Computational Biology, (7), 1–10.
Huisman, J. S., Scire, J., Angst, D. C., et al. (2020). Estimation and worldwide monitoring of the effective reproductive number of SARS-CoV-2. medRxiv.
Ioannidis, J. P., Cripps, S., and Tanner, M. A. (2020). Forecasting for COVID-19 has failed. International journal of forecasting.
Kampf, G., Todt, D., Pfaender, S., and Steinmann, E. (2020). Persistence of coronaviruses on inanimate surfaces and their inactivation with biocidal agents. Journal of Hospital Infection, (3), 246–251.
Kobayashi, T., Jung, S.-m., Linton, N. M., et al. (2020). Communicating the Risk of Death from Novel Coronavirus Disease (COVID-19). Journal of Clinical Medicine, (2).
Lauer, S. A., Grantz, K. H., Bi, Q., et al. (2020). The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Annals of internal medicine, (9), 577–582.
Li, Q., Guan, X., Wu, P., et al. (2020). Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. New England Journal of Medicine.
Lumley, S. F., O’Donnell, D., Stoesser, N. E., et al. (2020). Antibody status and incidence of SARS-CoV-2 infection in health care workers. New England Journal of Medicine.
Moriyama, M., Hugentobler, W. J., and Iwasaki, A. (2020). Seasonality of respiratory viral infections. Annual review of virology.
Pachetti, M., Marini, B., Benedetti, F., et al. (2020). Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant. Journal of Translational Medicine, 1–9.
Phan, T. (2020). Genetic diversity and evolution of SARS-CoV-2. Infection, genetics and evolution, 104260.
Prata, D. N., Rodrigues, W., and Bermejo, P. H. (2020). Temperature significantly changes COVID-19 transmission in (sub) tropical cities of Brazil. Science of the Total Environment, page 138862.
Rahman, B., Sadraddin, E., and Porreca, A. (2020). The basic reproduction number of SARS-CoV-2 in Wuhan is about to die out, how about the rest of the World? Reviews in Medical Virology, page e2111.
Reichmuth, M., Hodcroft, E., Riou, J., et al. (2021). Transmission of SARS-CoV-2 variants in Switzerland. https://ispmbern.github.io/covid-19/variants/index.pdf. Accessed: 2021-1-30.
Reiner, R. C., Barber, R. M., Collins, J. K., et al. (2020). Modeling COVID-19 scenarios for the United States. Nature Medicine.
Riddell, S., Goldie, S., Hill, A., et al. (2020). The effect of temperature on persistence of SARS-CoV-2 on common surfaces. Virology journal, (1), 1–7.
Russell, T. W., Golding, N., Hellewell, J., et al. (2020). Reconstructing the early global dynamics of under-ascertained COVID-19 cases and infections. BMC medicine, (1), 1–9.
Shi, Q., Hu, Y., Peng, B., et al. (2020). Effective control of SARS-CoV-2 transmission in Wanzhou, China. Nature medicine, pages 1–8.
Slifka, M. K. and Gao, L. (2020). Is presymptomatic spread a major contributor to COVID-19 transmission? Nature Medicine, (10), 1531–1533.
Toyoshima, Y., Nemoto, K., Matsumoto, S., et al. (2020). SARS-CoV-2 genomic variations associated with mortality rate of COVID-19. Journal of human genetics, (12), 1075–1082.
Tradigo, G., Guzzi, P. H., Kahveci, T., and Veltri, P. (2020). A method to assess COVID-19 infected numbers in Italy during peak pandemic period. In , pages 3017–3020. IEEE.
Wang, J., Tang, K., Feng, K., and Lv, W. (2020). High temperature and high humidity reduce the transmission of COVID-19. Available at SSRN 3551767.
Wei, W. E., Li, Z., Chiew, C. J., et al. (2020). Presymptomatic Transmission of SARS-CoV-2—Singapore, January 23–March 16, 2020. Morbidity and Mortality Weekly Report, (14), 411.
Wilke, C. O. and Bergstrom, C. T. (2020). Predicting an epidemic trajectory is difficult. Proceedings of the National Academy of Sciences, (46), 28549–28551.
Xie, J. and Zhu, Y. (2020). Association between ambient temperature and COVID-19 infection in 122 cities from China. Science of the Total Environment, 138201.
Xu, R., Rahmandad, H., Gupta, M., et al. (2020). The Modest Impact of Weather and Air Pollution on COVID-19 Transmission. medRxiv.

Supplementary Material for COVIDHunter: An Accurate, Flexible, and Environment-Aware Open-Source COVID-19 Outbreak Simulation Model
Mohammed Alser, Jeremie S. Kim, Nour Almadhoun Alserr,Stefan W. Tell, and Onur Mutlu
ETH Zurich, Zurich 8006, Switzerland
The purpose of this study is to explore the relationship between the daily new confirmed COVID-19 case counts or death counts and temperature in Switzerland. We obtain the daily number of confirmed COVID-19 cases and deaths in Switzerland from official reports of the Federal Office of Public Health (FOPH) in Switzerland [1] starting from March 2020 until December 2020. We obtain the air temperature data from the Federal Office of Meteorology and Climatology (MeteoSwiss) in Switzerland [2]. We calculate the daily average air temperature during the same time period (March 2020 to December 2020) for all the 26 cantons in Switzerland.

To evaluate the correlation between the temperature data and the number of daily confirmed COVID-19 cases or the daily counts of death, we use a generalized additive model (GAM). GAMs are commonly used to fit linear and non-linear regression models relating meteorological factors (e.g., temperature, humidity) to COVID-19 infection and transmission [3, 4, 5]. Our analyses are performed with R software version 4.0.3, where a p-value < 0.05 is considered statistically significant. Our model attempts to represent the linear behavior of the growth curve of the counts of the new confirmed cases or deaths in Switzerland. Therefore, we can test the hypothesis of whether there is a significant negative correlation between the COVID-19 confirmed daily case or death counts and temperature.

The results demonstrate a significant negative correlation between temperature and COVID-19 daily case and death counts. Specifically, the relationship is linear for the average temperature in the range from 1 to 26 °C. Based on Figure S1, we make two key observations. 1) For each 1 °C rise in temperature, there is a 3.67% (t-value = 3.244 and p-value = 0.0013) decrease in the daily number of COVID-19 confirmed cases (Figure S1(a)). 2) For each 1 °C rise in temperature, there is a 23.8% decrease in the daily number of COVID-19 deaths (t-value = 9.312 and p-value = 0.0), as shown in Figure S1(b).

Figure S1: Correlation between temperature and COVID-19 confirmed (a) case count and (b) death count in 26 cantons of Switzerland.
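To make the size of this effect concrete, the following sketch converts the per-degree estimate for confirmed cases into a change over a larger temperature difference. It is only an illustration: the function is hypothetical, and whether the 3.67% decrease should be compounded per degree (as with a log-link model) or applied additively is an assumption that the text does not settle, so both readings are shown.

```python
def relative_cases(delta_celsius, percent_per_degree=3.67, compound=True):
    """Expected case count relative to baseline after a temperature rise.

    compound=True applies the per-degree decrease multiplicatively;
    compound=False applies it additively (floored at zero).
    """
    rate = percent_per_degree / 100.0
    if compound:
        return (1.0 - rate) ** delta_celsius
    return max(0.0, 1.0 - rate * delta_celsius)


for rise in (1, 5, 10):
    compounded = relative_cases(rise)
    additive = relative_cases(rise, compound=False)
    print(rise, round(compounded, 3), round(additive, 3))
```

The two readings coincide for a 1 °C rise and diverge as the temperature difference grows, which is why the choice matters when the coefficient is applied across seasons.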
Evaluating the Effect of Different Mitigation Coefficient Values on COVIDHunter’s Predicted Number of Cases, Hospitalizations, and Deaths
Using COVIDHunter, we predict the number of COVID-19 cases, hospitalizations, and deaths from 22 January to 31 March 2021. We show the maximum and the average daily number of COVID-19 cases, hospitalizations, and deaths over this period in Figures S2 and S3, respectively. We use two environmental condition approaches, CRW and CTC, with a certainty rate level of 50%. We assume five values of the mitigation coefficient, M(t): 0.35, 0.4, 0.5, 0.6, and 0.7, for each configuration of COVIDHunter during 22 January to 22 February 2021. This range of mitigation coefficient values covers the lowest (i.e., M(t)=0.35) and the highest (i.e., M(t)=0.7) strengths of mitigation measures that were applied during 2020.

Based on Figures S2 and S3, we make three key observations. 1) COVIDHunter predicts that the maximum daily number of COVID-19 cases, hospitalizations, and deaths over 22 January to 31 March 2021 would be 4972, 213, and 136, respectively, using CRW and M(t)=0.7, as we show in Figure S2(a-c). Using our environmental condition approach, CTC, and M(t)=0.7, the maximum daily number of COVID-19 cases over the same period would be 7580, and the maximum daily number of COVID-19 hospitalizations and deaths would be almost the same as those calculated by COVIDHunter with CRW, as we show in Figure S2(d-f). 2) Relaxing the mitigation measures by 50% (M is changed from 0.7 to 0.35) exponentially increases the maximum daily number of cases, hospitalizations, and deaths by 58×, reaching up to 288827, 12385, and 7885, respectively, as predicted by COVIDHunter with the CRW approach (Figure S2(a-c)). Using the CTC approach and M(t)=0.35, COVIDHunter predicts an exponential increase in the maximum daily number of cases, hospitalizations, and deaths by only . × , as we show in Figure S2(d-f).
This is expected, as the CTC approach considers only the drop in temperature rather than the average effect of many environmental conditions, as the CRW approach does. 3) Relaxing the mitigation measures by 50% (M is changed from 0.7 to 0.35) causes the daily number of cases, hospitalizations, and deaths to exponentially increase by an average of 29 . × and 23 . × over 22 January to 31 March 2021 using the CRW and CTC environmental approaches, respectively, as we show in Figure S3.

We conclude that COVIDHunter provides a flexible evaluation of the effect of different strengths of past and current mitigation measures on the number of COVID-19 cases, hospitalizations, and deaths. COVIDHunter evaluates the applied mitigation measures with high flexibility in configuring the environmental coefficient and the mitigation coefficient, which helps society and decision-makers to accurately review the current situation and estimate the future impact of decisions.

[Figure S2 panels: (a)-(c) CRW, (d)-(f) CTC. Y-axes: maximum number of cases, hospitalizations, and deaths. X-axis: mitigation coefficient, M(t). Fit annotations: -11.79x in (a)-(c), -10.41x in (d)-(f).]
Figure S2: The maximum daily number of COVID-19 cases, hospitalizations, and deaths as predicted by COVIDHunter over 22 January to 31 March 2021. We use five mitigation coefficient, M(t), values of 0.35, 0.4, 0.5, 0.6, and 0.7 for each configuration of our model during 22 January to 22 February 2021. We use two different environmental condition approaches, CRW (a)-(c) and CTC (d)-(f), with a certainty rate level of 50%. The dashed line represents an exponential model fit to the data.

[Figure S3 panels: (a)-(c) CRW, (d)-(f) CTC. Y-axes: average number of cases, hospitalizations, and deaths. X-axis: mitigation coefficient, M(t). Fit annotations: -9.714x in (a)-(c), -9.018x in (d)-(f).]
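The exponential fits drawn as dashed lines in Figures S2 and S3 can be related to the fold increases discussed in the text with a short calculation. The sketch below assumes that each figure annotation (e.g., -11.79x) is the exponent of a fit of the form y(M) = A·exp(-k·M), and that the first annotation in each figure belongs to the CRW panels and the second to the CTC panels; neither assumption is stated explicitly in the source. Under that form, moving M from 0.7 to 0.35 multiplies y by exp(k·0.35). The names `fold_increase` and `FIT_RATES` are hypothetical.

```python
import math


def fold_increase(fit_rate, m_from, m_to):
    """Ratio y(m_to) / y(m_from) for an exponential fit y(M) = A * exp(-fit_rate * M).

    The prefactor A cancels, so only the fit rate and the change in M matter.
    """
    return math.exp(fit_rate * (m_from - m_to))


# Fit rates read from the figure annotations (assumed to be the exponents k).
FIT_RATES = {
    ("maximum", "CRW"): 11.79,   # Figure S2 (a)-(c), assumed panel assignment
    ("maximum", "CTC"): 10.41,   # Figure S2 (d)-(f), assumed panel assignment
    ("average", "CRW"): 9.714,   # Figure S3 (a)-(c), assumed panel assignment
    ("average", "CTC"): 9.018,   # Figure S3 (d)-(f), assumed panel assignment
}

# Relaxing the mitigation measures by 50%: M goes from 0.7 to 0.35.
results = {
    key: fold_increase(rate, m_from=0.7, m_to=0.35)
    for key, rate in FIT_RATES.items()
}

for (statistic, approach), ratio in results.items():
    print(f"{statistic} daily count, {approach}: about {ratio:.1f}x")
```

These are ratios implied by the fitted curves, so they need not match the ratios computed directly from the two endpoint predictions (for example, the 58× reported for the maximum daily counts with CRW).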
Figure S3: The average daily number of COVID-19 cases, hospitalizations, and deaths as predicted by COVIDHunter over 22 January to 31 March 2021. We use five mitigation coefficient, M(t), values of 0.35, 0.4, 0.5, 0.6, and 0.7 for each configuration of our model during 22 January to 22 February 2021. We use two different environmental condition approaches, CRW (a)-(c) and CTC (d)-(f), with a certainty rate level of 50%. The dashed line represents an exponential model fit to the data.

Evaluated Datasets
Our experimental evaluation uses a large number of different real datasets, including 1) daily R number values, 2) observed daily number of COVID-19 cases, 3) observed daily number of COVID-19 hospitalizations, 4) observed daily number of COVID-19 deaths, 5) number of excess deaths, 6) the estimated strength of mitigation measures as calculated by the Oxford Stringency Index, and 7) estimates of COVID-19 statistics as calculated by existing state-of-the-art simulation models (ICL, IHME, LSHTM, and IBZ), from seven different sources. The raw datasets are provided in the Supplementary Excel File and can also be obtained from the original sources, as we list below:

• Observed COVID-19 statistics (R number values and number of cases, hospitalizations, and deaths)
  – Official reports (January 7, 2021):
  – Smoothed data (January 7, 2021): https://ourworldindata.org/coronavirus/country/switzerland?country=~CHE
• Excess deaths:
  – Information:
  – Direct link (January 7, 2021):
• Oxford Stringency Index
  –
• Imperial College London (ICL) Model:
  – Information: https://mrc-ide.github.io/global-lmic-reports/
  – Direct link (January 15, 2021): https://github.com/mrc-ide/global-lmic-reports/raw/master/data/2021-01-30_v7.csv.zip
• Institute for Health Metrics and Evaluation (IHME) Model:
  – Information: https://mrc-ide.github.io/global-lmic-reports/
  – Direct link (January 15, 2021):
• The London School of Hygiene & Tropical Medicine (LSHTM) Model:
  – Information: https://cmmid.github.io/topics/covid19/global_cfr_estimates.html
  – Direct link (January 15, 2021): https://raw.githubusercontent.com/cmmid/cmmid.github.io/master/topics/covid19/reports/under_reporting_estimates/under_ascertainment_estimates.csv
• The Theoretical Biology Group at ETH Zurich (IBZ) Model:
  – Information: https://ibz-shiny.ethz.ch/covid-19-re-international/
  – Direct link (January 15, 2021): https://github.com/covid-19-Re/dailyRe-Data
Supplementary Excel File: https://github.com/CMU-SAFARI/COVIDHunter/blob/main/Evaluation_Results/ComparisonToOtherModels.xlsx

References
[1] Coronavirus - The Federal Office of Public Health in Switzerland. Accessed: 2020-12-31.
[2] Switzerland forecast - The Federal Office of Meteorology and Climatology MeteoSwiss. Accessed: 2020-12-31.
[3] Jiangtao Liu, Ji Zhou, Jinxi Yao, Xiuxia Zhang, Lanyu Li, Xiaocheng Xu, Xiaotao He, Bo Wang, Shihua Fu, Tingting Niu, et al. Impact of meteorological factors on the COVID-19 transmission: A multi-city study in China. Science of the Total Environment, page 138513, 2020.
[4] David N Prata, Waldecy Rodrigues, and Paulo H Bermejo. Temperature significantly changes COVID-19 transmission in (sub)tropical cities of Brazil. Science of the Total Environment, page 138862, 2020.
[5] Jingui Xie and Yongjian Zhu. Association between ambient temperature and COVID-19 infection in 122 cities from China.