[PDF] Bayesian dynamical estimation of the parameters of an SE(A)IR COVID-19 spread model

Abstract

In this article, we consider a dynamic epidemiology model for the spread of the COVID-19 infection. Starting from the classical SEIR model, the model is modified so as to better describe characteristic features of the underlying pathogen and its infectious modes. In line with the large number of secondary infections not related to contact with documented infectious individuals, the model includes a cohort of asymptomatic or oligosymptomatic infectious individuals, not accounted for in the data of new daily counts of infections. A Bayesian particle filtering algorithm is used to update dynamically the relevant cohort and simultaneously estimate the transmission rate as the new data on the number of new infections and disease related death become available. The underlying assumption of the model is that the infectivity rate is dynamically changing during the epidemics, either because of a mutation of the pathogen or in response to mitigation and containment measures. The sequential Bayesian framework naturally provides a quantification of the uncertainty in the estimate of the model parameters, including the reproduction number, and of the size of the different cohorts. Moreover, we introduce a dimensionless quantity, which is the equilibrium ratio between asymptomatic and symptomatic cohort sizes, and propose a simple formula to estimate the quantity. This ratio leads naturally to another dimensionless quantity that plays the role of the basic reproduction number R 0 of the model. When we apply the model and particle filter algorithm to COVID-19 infection data from several counties in Northeastern Ohio and Southeastern Michigan we found the proposed reproduction number R 0 to have a consistent dynamic behavior within both states, thus proving to be a reliable summary of the success of the mitigation measures.

Full PDF

BBayesian dynamical estimation of the parameters of an SE(A)IRCOVID-19 spread model

D Calvetti , A Hoover J Rose E Somersalo , Department of Mathematics, Applied Mathematics, and StatisticsCase Western Reserve University Center for Community Health IntegrationCase Western Reserve University Department of MathematicsUniversity of Akron

Abstract

In this article, we consider a dynamic epidemiology model for the spread of the COVID-19 infection. Start-ing from the classical SEIR model, the model is modiﬁed so as to better describe characteristic features of theunderlying pathogen and its infectious modes. In line with the large number of secondary infections not relatedto contact with documented infectious individuals, the model includes a cohort of asymptomatic or oligosymp-tomatic infectious individuals, not accounted for in the data of new daily counts of infections. A Bayesian particleﬁltering algorithm is used to update dynamically the relevant cohort and simultaneously estimate the transmissionrate as new daily incidence and mortality data become available. The underlying assumption of the model isthat the infectivity rate is dynamically changing, either because of a mutation of the pathogen or in response tomitigation and containment measures. The sequential Bayesian framework naturally provides a quantiﬁcation ofof uncertainty in model parameter estimates, including the reproduction number, and of the size of the differentcohorts. Moreover, we introduce a dimensionless quantity, which is the equilibrium ratio between asymptomaticand symptomatic cohort sizes, and propose a simple formula to estimate the quantity. This ratio leads naturally toanother dimensionless quantity that plays the role of the basic reproduction number R of the model. When weapply the model and particle ﬁlter algorithm to COVID-19 infection data from several counties in NortheasternOhio and Southeastern Michigan we found the proposed reproduction number R to have a consistent dynamicbehavior within both states, thus proving to be a reliable summary of the success of the mitigation measures. Since its emergence in Wuhan, China, at the end of 2019, the novel coronavirus SARS-CoV-2 has spread worldwide.In a little over four months it has evolved into a pandemic affecting nearly every country in the world in spite ofthe measures taken to control and contain the contagion. The novelty of the virus, which is related, but differentfrom the coronaviruses responsible for the SARS and MERS epidemics in 2003 and 2009, poses a challenge tousing mathematical models to predict the dynamics of the pandemic and the effects of changing mitigation strate-gies. Following the initial outbreak, number of contributors have either proposed mathematical models speciﬁcallydesigned to examine the spread of COVID-19 [1, 2], or addressed important points related to the estimation of thekey model parameters [3, 4] from partial, biased daily updated data [5, 6] and incomplete information about thepathogen responsible for the pandemic. 1 a r X i v : . [ q - b i o . P E ] M a y arly virological assessment of SARS-CoV-2 [7] from a small sample of patients in Germany found active virusreplication in the upper respiratory tract, as opposed to SARS where the predominant expression was in the lowerrespiratory tract. Patients were tested when symptoms were mild or in the prodromal phase. Pharyngeal viralshedding was very high during the ﬁrst week of symptoms, peaking on the fourth day, with shedding continuedafter the symptoms subsided, well into the second week. Moreover, most of the patients seemed to have passedtheir shedding peak in the upper respiratory tract at the time of the ﬁrst testing, suggesting efﬁcient transmission ofSARS-CoV2 through pharyngeal shedding when the symptoms are mild. In [8] peak shedding is suggested to occuraround day 0, and of the transmission is estimated to occur in the asymptomatic phase. These speciﬁc clinicalfeatures need an interpretation in the epidemic models.The spread and speed of the COVID-19 pandemic has triggered a burst of modeling activity in the effort to helppredict the location and intensity of the next hotspots. One of the most used indicators of the potential for spread ofan infectious agent is the basic reproduction number R [9, 10, 11], although questions have been raised as to howit should be computed [12, 13, 14] and interpreted [15, 16, 17], and what can be inferred from it [18]. In general, R > signals epidemic potential, and the larger the R , the faster the spread of the infectious agent. In [19],where a SEIR model is calibrated and tested on the Wuhan outbreak data, the initial estimate of R , whose deﬁnitionand signiﬁcance will be discussed later, was updated after the implementation of strict measures for control andprevention of the contagion. The model was then used to predict the magnitude of the pandemic by predicting thenumber of infected individuals on February 29, 2020 under the two different scenarios of an increasing R , changingfrom 1.9 to 2.6 and ending at 3.1, corresponding to letting the pandemic follow its course, and that of a decreasing R , going from 3.1 to 2.6, then 1.9, and eventually 0.9 and 0.5, following the mitigation measures taken in Wuhan.Of models particularly adapted to descibe COVID-19, we mention the SIDARTHE model [20], consisting of eightseparate compartments to address the different role in the spread of the pandemic of unreported infectious individu-als, who presumably constitute the great majority. The model parameters, inferred from ofﬁcial data on the numberof diagnosed and recovered cases, and ﬁtted by recursive least squares methods, are adaptively changed during thesimulation to reﬂect the introduction of progressively restrictive measures. More speciﬁcally, for the Wuhan out-break the value of R , set to 2.38 on day 1, is changed to 1.6 on day 5, increasing to 1.8 on day 12 and returning to1.6 on day 22. After day 22, R is reduced to 0.99, and from day 38 it is set to 0.85. The SIDARTHE model is thenused to predict the course of the pandemic if the lockdown measures are either weakened or tightened.The importance of considering the contribution from undocumented infections is repeatedly addressed in the COVID-19 literature [21]. The population seroprevalence of antibodies in a random sample of 3300 people in Santa ClaraCounty, California [22] early in the U.S. outbreak, would indicate the number of infections to be much higher thanthe number of reported cases, possibly by a factor of 50 to 80. This is in line with 10% seropositivity reported in thetown of Robbio, Italy, and 14% in the town of Gangelt, Germany.Another important issue, raised in [23], is the bias in the estimate of key epidemic parameters if the delay in casereporting is not taken into consideration.Documented and undocumented infections are accounted for in separate compartments in the SEIR network dynamicmetapopulation model in [24], which assumes that asymptomatic, or oligosymptomatic cases can expose a far largerproportion of the susceptible cohort to the virus than can reported cases, with asymptomatic cases less infectiousthan symptomatic ones. The model takes into account the travels of infectious individuals between cities by settingup a network comprised of 375 cities in China, with the amount of trafﬁc between any pair of cities estimated frommobility data, and assigns fours separate cohort to each city. The estimator of the distribution of the six modelparameters via a variant of ensemble Kalman ﬁlter suggests that in the ﬁrst month of the Wuhan outbreak, onlyapproximately one out of six infections was reported. The adoption of strong mitigation measures starting fromJanuary 23, 2020 is included in the model by ﬁrst reducing the mobility coefﬁcients to 2%, of the normal value,and by setting them to zero when all trafﬁc stopped. The median latent period of COVID-19 in Wuhan according2o this model is 3.7 days, and the median infectious period 3.48 days, with a median delay in case reporting of 10days at ﬁrst, and 6 days after the lockdown. The median estimate of the reproduction number varies from 2.38 in theearly phase, to 1.66 and 0.99 after containment measures, when the fraction of undocumented infections decreasessharply.In [25], the epidemic is assumed to go through three phases: in the ﬁrst there is a slow accumulation of newinfections, most of which are unreported. The second phase is characterized by a rapid growth of infections, diseasesand deaths, while in the third phase the epidemic slows down due to the depletion of susceptible individuals. TheSIR model used to describe the course of the COVID-19 pandemic, calibrated on the daily number of deaths in theUnited Kingdom and Italy and with R = 2 . and R = 2 . , respectively, inferred from the literature, suggestthat the epidemic already started one month before the ﬁrst reported death.The most widely used tools in the United States for predicting the dynamics of the pandemic include the CHIME[26, 27] and the IHME forecasting model [28]. The former is an SIR/SEIR based differential-algebraic simulationtool with an accessible interface. The latter, in order to counteract the typical SIR overestimation of the proportionof infected and to focus on the most severely ill patients, uses empirically observed COVID-19 population mortalitycurves. The underlying statistical model assumes that the cumulative death rate at each location follows the Gaussianerror function, and produces long and short term predictions.The goal of this article is to provide a relatively simple and ﬂexible model equipped with a Bayesian state andparameter estimation protocol to dynamically process the COVID-19 data inﬂow to assess the current standing ofthe population and to make short term forecasts of the progression. All parameters and outputs of the algorithmare easily interpretable and adjustable. The underlying model is an adaptation of the classical SEIR model [29],adjusted to better conform with certain speciﬁc features of the current COVID-19 epidemics. In particular, we re-interpret the cohort of exposed individuals, deﬁning it as individuals who carry and shed the virus asymptomatically,presymptomatically, or oligosymptomatically, thus not being isolated or hospitalized. Moreover, this cohort doesnot contribute to the number of conﬁrmed cases. We refer to this model as SE(A)IR. We propose a Bayesian particleﬁltering (PF) algorithm for estimating dynamically the state vector consisting of the sizes of the four cohorts in themodel based on a Poisson distributed observation of the infected cohort size, with the dynamic model generatingthe mean of the distribution. Concomitantly, we estimate the presumably dynamically changing rate of transmissionwith posterior envelopes of model uncertainty. Being a fully Bayesian algorithm, the output consist of model un-certainties. Moreover, we show the forecasting of the expected number of new infections based on the model withpredictive uncertainty envelopes.One key factor contributing to the challenge of making predictions and planning is the unknown number of indi-viduals spreading the virus asymptomatically. The particle ﬁlter algorithm provides an estimate for the ratio of theasymptomatic and symptomatic virus carriers. A novel feature of this contribution is the derivation of a Riccati typeequation for the ratio of the sizes of the two cohorts. Moreover, the Riccati equation has a short time approximatestable equilibrium. The equilibrium value, which can be analytically calculated from the model parameters, corre-sponds well to the model-based estimated ratio and can be used to deﬁne a dynamically changing effective basicreproduction number R for the epidemic, facilitating the comparison of model predictions with other models.The methodology is extensively tested using COVID-19 data of 18 counties in Northeastern Ohio (Cleveland area)and 19 counties in Southeastern Michigan (Detroit area) during the period from early March 2020 to early May2020, including the period when both states introduced similar, yet slightly different mitigation protocols.3 COVID-19 epidemiology model

Compartment models in mathematical epidemiology partition a homogenous and well-mixed population into cohortsof individuals at different stages of the infection [ ? ]. The popular SIR model, with separate compartments forsusceptible (S), infected (I) and recovered (R), introduced nearly a century ago by Kermack and McKendrick [29],introduced a population dynamics component into the previous, purely phenomenological statistical models (see,e.g. [30]) that still seem to have a life of their own in modeling the COVID-19 epidemic [28].A signiﬁcant challenge for the control and containment of the COVID-19 epidemic is the spread of the infection bya large portion of asymptomatic or lightly symptomatic infectious individuals who are unaware of being vectors ofthe virus. This is especially problematic when, as is the case at the time of the writing of this article, due to limitedavailability, testing priority is given to symptomatic individuals or to vulnerable populations, thus the size of theasymptomatic cohort must be estimated indirectly. In the next subsection, we propose a compartment model thatcan be used to obtain such an estimate, and discuss its advantages and limitations. We begin by considering a modiﬁcation of the classical SEIR model where the infected cohort I is subdivided intotwo groups according to the manifestation of symptoms, denoting by A the asymptomatic, infected, and infectioussubcohort and by I the symptomatic infected one. Hence, while E and A are both asymptomatic and technicallyinfected, the E cohort is not infectious as in the classical SEIR model, as opposed to A that sheds the virus. Thecompartment model, schematically represented by the branching ﬂow diagram in the left panel of Figure 1, isgoverned by the system of differential equations dSdt = − ϕ ,dEdt = ϕ − ϕ − ϕ (1)3 ,dAdt = ϕ (1)3 − ϕ (2)3 , (1) dIdt = ϕ − ϕ − ϕ ,dRdt = ϕ (2)3 + ϕ , where the functional form of the ﬂuxes is will be speciﬁed below. The COVID-19 data, consisting of the daily countof newly reported infection, corresponds to observations of the ﬂux ϕ : In the absence of additional population-level inputs, this is the data that must be used to estimate the cohort sizes and model parameters. In particular, ifno data concerning the asymptomatic cohort dynamics are available, the values of the ﬂuxes ϕ (1)3 and ϕ (2)3 can beset rather arbitrarily, since they are minimally connected with the ﬂuxes in the lower branch of the ﬂow diagram,whose values are part of the observations. To estimate for the size of the asymptomatic cohort in the absence ofadditional information, it is necessary to modify the model. Below we propose a modiﬁed version of the model thatis suitable for estimating the size of the asymptomatic cohort, while retaining many of the salient features of theextended model. The modiﬁcation that we propose applies an approach similar to that used in metabolic networkmodel reduction [ ? ], where lumping enzymatic reactions whose parameters cannot be estimated from the data isfairly common.In our reduced model, the ﬁctitious compartment E ( A ) embeds the asymptomatic cohort into the exposed one, as4 igure 1: Left: compartment diagram of the model including both symptomatic and asymptomatic infected cohorts.After a non-infectious incubation period, the exposed individuals branch either to symptotmatic ( I ) or asymptomatic( A ) infectious compartments, with unknown frequencies and unknown reasons. In the modiﬁed SE(A)IR model(right), the two compartments E and A on the blue background are lumped together to form the ﬁctitious E ( A ) compartment representing the exposed, asymptomatic cohorts. The two ﬂuxes ϕ (1)3 and ϕ (2)3 are merged to a singleﬂux ϕ .depicted in Figure1, (right), thus the size of the new cohort is E ( A ) = E + A, and, from (1) it follows that dE ( A ) dt = dEdt + dAdt = ϕ − ϕ − ϕ (2)3 . In this manner, the ﬂux ϕ (1)3 , that describes a variation internal to the lumped compartment, is no longer part of themodel governing equations. Assuming, for simplicity, that both asymptomatic and symptomatic individuals moveto the recovered compartment with the same relative rate, denoted by γ , it follows that ϕ = γE ( A ) , ϕ = γI. The ﬂux ϕ represents the rate at which part of the exposed individuals develop symptoms, and for simplicity, weassume that the ﬂux is proportional to the size of the compounded cohort, that is, ϕ = ηE ( A ) . One of the most relevant parameters in an epidemic is the rate at which susceptible individual are infected. In theclassical SEIR model, the transmission ﬂux assumes the form ϕ = β IN S, where β is the transmission rate and N is the population size. However, one of the problems of COVID-19 is thesigniﬁcant amount of virus transmission by the asymptomatic cohort within E ( A ) . Our model assumes that in thecurrent COVID-19 epidemics, most infected individuals who have developed symptoms, are either hospitalized or inself-isolation, thus playing a limited role in the exposure of susceptible individuals to the contagion. Acknowledgingthe differing roles that symptomatic and asymptomatic virus shedders play in transmission to susceptibles, we deﬁne ϕ = β pE ( A ) + qIN S, where p and q are frequencies, ≤ p, q ≤ . The frequency p may be interpreted as the fraction of the compoundcohort E(A) who are infectious. Classical SEIR models, where usually it is assumed that the exposed cohort is not5 Figure 1: Left: compartment diagram of the model including both symptomatic and asymptomatic infected cohorts.After a non-infectious incubation period, the exposed individuals branch either to symptotmatic ( I ) or asymptomatic( A ) infectious compartments, with unknown frequencies and unknown reasons. In the modiﬁed SE(A)IR model(right), the two compartments E and A on the blue background are lumped together to form the ﬁctitious E ( A ) compartment representing the exposed, asymptomatic cohorts. The two ﬂuxes ϕ (1)3 and ϕ (2)3 are merged to a singleﬂux ϕ .depicted in Figure1, (right), thus the size of the new cohort is E ( A ) = E + A, and, from (1) it follows that dE ( A ) dt = dEdt + dAdt = ϕ − ϕ − ϕ (2)3 . In this manner, the ﬂux ϕ (1)3 , that describes a variation internal to the lumped compartment, is no longer part of themodel governing equations. Assuming, for simplicity, that both asymptomatic and symptomatic individuals moveto the recovered compartment with the same relative rate, denoted by γ , it follows that ϕ = γE ( A ) , ϕ = γI. The ﬂux ϕ represents the rate at which part of the exposed individuals develop symptoms, and for simplicity, weassume that the ﬂux is proportional to the size of the compounded cohort, that is, ϕ = ηE ( A ) . One of the most relevant parameters in an epidemic is the rate at which susceptible individual are infected. In theclassical SEIR model, the transmission ﬂux assumes the form ϕ = β IN S, where β is the transmission rate and N is the population size. However, one of the problems of COVID-19 is thesigniﬁcant amount of virus transmission by the asymptomatic cohort within E ( A ) . Our model assumes that in thecurrent COVID-19 epidemics, most infected individuals who have developed symptoms, are either hospitalized or inself-isolation, thus playing a limited role in the exposure of susceptible individuals to the contagion. Acknowledgingthe differing roles that symptomatic and asymptomatic virus shedders play in transmission to susceptibles, we deﬁne ϕ = β pE ( A ) + qIN S, where p and q are frequencies, ≤ p, q ≤ . The frequency p may be interpreted as the fraction of the compoundcohort E(A) who are infectious. Classical SEIR models, where usually it is assumed that the exposed cohort is not5nfective, hence only the infected I cohort is responsible for the spreading of the epidemic, can be accounted for bysetting p = 0 , q = 1 . In our model COVID-19 dynamics, we assume that p > q . The transmission rate β , whichis the quantity of primary interest when it comes to assessing how fast the epidemics spread, integrates elementsrelated to the characteristics of the pathogen determining the probability of infection of a given susceptible contact,as well as factors related to social behavior, including the number and nature of daily contacts. In summary, thegoverining equations of the proposed COVID-19 model are dSdt = − β pE ( A ) + qIN S, (2) dE ( A ) dt = β pE ( A ) + qIN S − ηE ( A ) − γE ( A ) , (3) dIdt = ηE ( A ) − γI − µI, (4) dRdt = γE ( A ) + γI, (5)where β is the transmission rate, γ is the recovery rate, η is the symptomatic rate integrating in it the incubationprocess E → I , and µ is the death rate.The data that is used to inform our parametrized model, as well as to update the state estimation of the cohorts overtime is the number of conﬁrmed, symptomatic new daily cases, or, equivalently, daily observations of the ﬂux ϕ = ηE ( A ) , or, in fact, of a stochastic integer-valued realization of it, as explained in more detail in the next section.Since, as already pointed out, the rate of transmissibility integrates not only properties of the pathogen, but alsofactors related to human behavior that change in the course of an epidemic, in general β is not a static parameter. Inthe ﬁltering approach used to estimate the size of the cohorts in the course of the epidemic, to be explained below, β is modeled as a time dependent parameter. In the absence of a known deterministic dynamical evolution model for β , its proposed changes will be described in terms of a stochastic geometric random walk.Before describing the computational methodology on which our model predictions are based, some qualitative com-ments about the model are in order. In classical SEIR models, the inclusion of the exposed cohort E adds a timedelay corresponding to the incubation period not accounted for in SIR models. While our expression for ϕ im-plicitly assumes that the infected and asymptomatic individuals are immediately infectious, the ﬂux ϕ introduces aslight delay in the I cohort if, for instance, the transmission rate is changed. Moreover, if p (cid:54) = 0 , we may write ϕ = βp E ( A ) + ( q/p ) IN S, and by absorbing p into the transmission rate β , the model can be written in terms of the ratio q/p determining thesize of the relative effect of infected symptomatic individuals on the spread of the infection. Hence, without a loss ofgenerality, we may set p = 1 , and assume q < . The value q = 0 . used in our numerical tests is a rough estimate,and the effect of that parameter on the estimated quantities is brieﬂy discussed in the results. Finally, we point outthat the simplifying assumption that the symptomatic and asymptomatic individuals recover at the same relative rateis not essential for the methodology developed below, and can be easily removed.In the discussion to follow, we simplify the notation and write E instead of E ( A ) for the exposed/asymptomaticcohort size. 6 Bayesian particle ﬁltering

Bayesian ﬁltering algorithms update the information about the state vector and parameters in a sequential manner asnew data arrives. In the following, we use the convention of denoting by upper case letters the random variables andby lower case letter their realizations. We refer to [31, 32] for particle ﬁltering, and to [33, 34, 35] for the particulartype of applications to ODE systems.Let { X t } denote a discrete time Markov process, with the transition probability distribution π X t +1 | X t ( x t +1 | x t ) .Furthermore, let { B t } denote the stochastic process representing the observations, and let π B t | ( b t | x t ) denote thelikelihood density, where we implicitly assume that the current observation B t depends only on the current state X t and not on the past. Finally, let B t denote the cumulative data up to time t , that is, B t = { B , B , . . . , B t } . We denote by B t the set of observed realizations, B t = { b , b , . . . , b t } . In Bayesian ﬁltering algorithms the updateof the posterior distributions is carried out in two consecutive steps, π X t | B t → π X t +1 | B t → π X t +1 | B t +1 , where the ﬁrst step is referred to as propagation step, and the second as correction, or analysis step. The ﬁrst step isaccomplished through the Chapman-Kolmogorov formula, π X t +1 | B t ( x t +1 | B t ) = (cid:90) π X t +1 | X t ( x t +1 | x t ) π X t | B t ( x t | B t ) dx t , while the analysis step builds on Bayes’ formula, π X t +1 | B t +1 ( x t +1 | B t +1 ) = π X t +1 | B t ( x t +1 | B t ) π B t +1 | X t +1 ( b t +1 | x t +1 ) , that is, the predicted distribution of X t +1 acts as the prior as the next observation arrives. In the following, wespecify the transition probability kernel and the likelihood in our model, and describe the computational steps forthe numerical implementation. We then extend the discussion to include the estimation of static parameters.Consider our modiﬁed SE(A)IR model introduced in the previous section, and deﬁne the state vector at time tz t =  S t E t I t R t  , t = 1 , , . . . . Since we assume that the infectivity parameter β may vary over time, we denote its value at time t by β t . Formally,let ψ be the numerical propagator advancing the state variable and the infectivity parameter from one day to the next, t → t + 1 , ψ ( z t , β t ) = z t +1 . In our computations, the time integration performed by means of a standard ODE solver such as Runge-Kutta, andduring the one day propagation step, the infectivity β t is kept constant.Formally, we write a propagation model of the form x t = (cid:20) z t β t (cid:21) → F ( x t ) = (cid:20) ψ ( z t , β t ) β t (cid:21) = (cid:98) x t +1 , (6)7nd we account for uncertainties both in the state vector z t and the parameter vector β t by introducing an innovationterm. We guarantee that all components of the state vector and the parameter β are nonnegative, with a multiplicativeinnovation, assuming a geometric random walk model, log X t +1 ∼ N (log (cid:98) x t +1 , C ) , (7)where C ∈ R is a diagonal positive deﬁnite matrix.In the deﬁnition of the likelihood, we assume that the data consist of realizations b t of the daily new infections,denoted by B t . The new infection count is assumed to be Poisson distributed, with the Poisson parameter equal tothe ﬂux ϕ ( t ) , B t ∼ Poisson( ϕ ( t )) , where ϕ ( t ) = ηE t . Therefore, the likelihood of the observed number b t of new infections is π B t | Z t ( b t | z t ) = ( ηE t ) b t b t ! e − ηE t . To initialize the process, we need to deﬁne the initial state at the time t = 0 before the ﬁrst observed infections. Sincethe initial state is unknown, its initial value becomes part of the estimation problem. We postulate that before theﬁrst infection is observed, there is an unknown number of asymptomatic individuals in the community. We assumethat the initial number of asymptomatic cases follows a Poisson distribution with a uniformly distributed expectedvalue Λ ∼ Uniform([0 , λ max ]) . Moreover, we assume that β follows a uniform distribution over some interval [ β min , β max ] .We are now ready to outline the particle ﬁlter (PF) algorithm for estimating the state vector X t based on the infectioncount. Assume that at time t a sample from the distribution π X t | B t { x t , x t , . . . , x Nt } , is available, and each sample point is associated its corresponding weight w jt . We write a particle approximation ofthe Chapman-Kolmogorov formula π X t +1 | B t ( x t +1 | B t ) = (cid:90) π X t +1 | X t ( x t +1 | x t ) π X t | B t ( x t | B t ) dx t ≈ N (cid:88) j =1 w jt π X t +1 | X t ( x t +1 | x jt ) , and combine it with Bayes’ formula to obtain the updating formula π X t +1 | B t +1 ( x t +1 | B t +1 ) ≈ N (cid:88) j =1 w jt π X t +1 | X t ( x t +1 | x jt ) π B t +1 | X t +1 ( b t +1 | x t +1 ) . Let (cid:98) x jt +1 denote a propagated predictor particle associated with x jt , that is, (cid:98) x jt +1 = F ( x jt ) . At the arrival of the next observation b t +1 , the likelihood π B t +1 | X t +1 ( b t +1 | (cid:98) x jt +1 ) expresses how well the predictoris at explaining the data. Writing the above formula as π X t +1 | B t +1 ( x t +1 | B t +1 ) ≈ N (cid:88) j =1 (cid:110) w jt π B t +1 | X t +1 ( b t +1 | (cid:98) x t +1 ) (cid:111)(cid:124) (cid:123)(cid:122) (cid:125) (1) (cid:26) π B t +1 | X t +1 ( b t +1 | x t +1 ) π B t +1 | X t +1 ( b t +1 | (cid:98) x t +1 ) (cid:27)(cid:124) (cid:123)(cid:122) (cid:125) (2) π X t +1 | X t ( x t +1 | x jt ) (cid:124) (cid:123)(cid:122) (cid:125) (3) ,

8e can interpret the formula as follows: Given a predictor particle (cid:98) x jt +1 , the expression (1), combining the impor-tance of its predecessor in the weight and its ﬁtness in the likelihood, evaluates the relevance of the particle; (2)weighs the importance of the new particle relative to the predictor, and the transition kernel (3) generates a newparticle from the predictor by the formula (7). This hierarchical organization is the backbone of the particle ﬁlteralgorithm. Particle ﬁltering algorithmInitialize:

Draw N independent realizations of β ∼ Uniform([ β min , β max ]) and Λ ∼ Uniform([0 , λ max ]) , { β , β , . . . , β N } , { λ , λ , . . . , λ N } , and generate N realizations of the initial state of Z , z j =  N − λ j λ j  , j = 1 , , . . . , N, then deﬁne the initial cloud of the particles { x , x , . . . , x N } , x j = (cid:34) z j β j (cid:35) . Set t = 0 . While t < t max do (a) Propagate the particles according to (6) to generate the predictive particle cloud { (cid:98) x t +1 , (cid:98) x t +1 , . . . , (cid:98) x Nt +1 } . (b) Extract the second component (cid:98) e jt +1 of each of the propagated particles, and compute the weights, (cid:98) g jt +1 = w jt ( η (cid:98) E jt +1 ) b t +1 b t +1 ! e − η (cid:98) E jt +1 , (cid:98) g jt +1 ← (cid:98) g jt +1 (cid:80) (cid:96) (cid:98) g (cid:96)t +1 . (c) Sample with replacement N indices (cid:96) j ∈ { , , . . . , N } , j = 1 , , . . . , N , with the probability weights g jt +1 . Deﬁne the new resampled predictive cloud (cid:98) x jt +1 ← (cid:98) x (cid:96) j t +1 . Generate a new particle cloud through the innovation process, log x jt +1 = log (cid:98) x jt +1 + C / w j , w j ∼ N (0 , I ) . e jt +1 from each new particle, and compute the new likelihood weights, g jt +1 = ( ηE jt +1 ) b t +1 b t +1 ! e − ηE jt +1 , g jt +1 ← g jt +1 (cid:80) (cid:96) g (cid:96)t +1 . Update the weights, w jt +1 = g jt +1 (cid:98) g jt +1 . Advance t → t + 1 . end do In the particle ﬁlter algorithm, the model parameters γ , η and µ are ﬁxed. It is possible to modify the algorithm toestimate these parameters also. In that case, to estimate the death rate µ , the information about the deceased must beincluded in the data, as the new infection count is essentially insensitive to that parameter. The parameters γ and η are less time-dependent than the infectivity β , and we set their values according to what is suggested in the literature. A challenges in forecasting the spread of the COVID-19 epidemic is the presumably large portion of population thatis asymptomatic or shows only light symptoms, while shedding the virus, which in our model is part of the cohort E . The ratio between the numbers of symptomatic and asymptomatic individuals, ρ ( t ) = E ( t ) I ( t ) , can be estimates from the output of the particle ﬁlter. Differentiating ρ with respect to time, dρdt = 1 I dEdt − EI dIdt , expressing the derivatives of E and I in terms of the governing equations (3)-(4), and simplifying, we ﬁnd that ρ satisﬁes the Riccati equation dρdt = β p + qρN S − ( η + γ ) ρ − ηρ − ( γ + µ ) ρ = − η (cid:18) ρ + (cid:18) η + 2 γ + µη − βη SN q (cid:19) ρ − βη SN p (cid:19) . If we assume that the frequency of susceptible cohort ν S = S/N is approximately constant, we ﬁnd an equilibriumstate ρ ∗ of the ratio, ρ ∗ = − (cid:18) η + 2 γ + µη − βη ν S q (cid:19) + (cid:115) (cid:18) η + 2 γ + µη − βη ν S q (cid:19) + βη ν S p, (8)which is a stable equilibrium. In particular, when the infection has not spread yet to a signiﬁcant portion of thepopulation, substituting ν S ≈ yields a potentially useful estimate for the prevalence of the asymptomatic infectionin the population.In the computed results, a comparison of the values of ratio ρ computed from the state vectors and the equilibriumestimate, can be used to asses whether the infection pattern is settling to or near the equilibrium state.10he equilibrium condition provides a natural way of deﬁning a basic reproduction number R for the model. As-suming equilibrium, we have dIdt = ηE − ( γ + µ ) I = − ( γ + µ ) (cid:18) − ηγ + µ ρ ∗ (cid:19) I = − ( γ + µ )(1 − R ) I, where R = ηγ + µ ρ ∗ . Thus, when R = 1 the infection stops growing under the assumption of a stable ratio of asymptomatic to symp-tomatic infections. This interpretation of the reproduction number is in close agreement with the R for SIR models.While not perfect, this number summarizes the phase of the epidemic, as our computed examples will demonstrate. To test our model, we consider the COVID-19 data of new infection counts in different counties of NortheasternOhio and Southeastern Michigan. The counties represent different population densities and demographics, e.g.,Cuyahoga (OH) and Wayne (MI) include dense urban areas (Cleveland and Detroit), Summit (OH) and Genesee(MI) represent mid-size urban centers (Akron and Flint), Lake (OH) and Oakland (MI) represent areas near urbancenters, while Holmes (OH) and Sanilac (MI) host rural communities. The data consist of the conﬁrmed daily casesmade available by USAfacts [6].The values of the model parameters and of the prior densities, as described in the algorithm in Section 3, are listedin Table 1. The innovation covariance matrix C is a diagonal matrix of the form C = diag( σ /N , σ , σ , σ , σ ) , where σ and σ are the variances of the uncertainty in the logarithm of the state vector Z t and in the logarithm ofthe transmission rate β t ,respectively. Observe that for the susceptible population, the variance is weighted with thepopulation squared to keep the innovation for this cohort from becoming excessive. We have S jt +1 = (cid:98) S jt +1 exp (cid:16) σ N w jt +1 (cid:17) ≈ (cid:98) S jt +1 + σ (cid:98) S jt +1 N w jt +1 , and without the scaling by N , the innovation in a large population could be signiﬁcantly large. Numerical testsindicate that with a too large innovation, the algorithm may go astray.The panels in the left column of Figures 2-3 display the raw data and the dynamically estimated expected valuesof the new infections computed with the particle ﬁlter. In the estimation process, we averaged the data over amoving window of three days to attenuate the effects of, e.g., reporting lags, and differences between weekends andweekdays. For each county we show the 50% and 75% credibility intervals and the median deﬁned by the particles.More precisely, for each parameter/state vector sample X jt , ≤ j ≤ N , we calculate the corresponding ﬂuxes ϕ j ( t ) = ηE jt , and evaluate the interval obtained by discarding the / × (1 − p/ × N smallest and largestvalues. The red curve is the median value of the ﬂuxes.The middle column in Figures 2-3 shows the 50% and 75% posterior belief envelope of the transmission rate param-eter for the respective counties.Finally, the right column of Figure 2-3 shows the 50% and 75% posterior envelopes estimated ratio between thecohorts E and I . In the same ﬁgure, the dashed curve represents the equilibrium value ρ ∗ of the ratio based on the11 article ﬁlter paremeters a priori lower bound transmission rate [1/days] β min β max λ max N

20 000standard deviation of innovation of the state σ σ Model paremeters recovery rate γ η µ β ( t ) equal to the posterior mean, β ( t ) = N (cid:88) j =1 w jt β jt . Towards the end of the observation period the equilibrium value is in good agreement with the sample-based estimate.The numerical results indicate that the asymptotic value ρ ∗ is a good proxy of the ratio E/I , in particular once thetransmission rate β has stabilized, therefore providing a quick way to assess the size of the asymptomatic cohortfrom the size of infected cohort. To understand better the dependency of the ratio ρ ∗ on the model parameters, weintroduce two dimensionless quantities characterizing the system of differential equations, t = γη , s = βη . Neglecting the effect of the death rate µ , and assuming that the infection is not yet widespread, so that S/N ≈ , weﬁnd an approximate formula for ρ ∗ of (8) in terms of the dimensionless quantities t and s , ρ ∗ = −

12 (1 + 2 t − sq ) + (cid:114)

14 (1 + 2 t − sq ) + sp, (9)Figure 5 shows the equilibrium value as a function of the dimensionless parameters for three different choices of theparameter: q = 0 . (left), q = 0 . (center) and q = 1 (right). In the four Ohio counties, the equilibrium value isconsistently near ρ ∗ ≈ . , while slightly below it in the Michigan counties. Observe that the value ρ ∗ = 0 . wouldcorrespond to effective basic reproduction number R ≈ / / ρ ∗ ≈ . , indicating that the disease is in a slow progression phase.While this deﬁnition of R in terms of ρ ∗ implicitly assumes equilibrium, it is possible to compute the time courseof R for the period when the system has not reached the equilibrium, by using formula (9) and the estimatedtransmission rate β . Figure 4 shows the posterior belief envelopes for R for six counties, three in Ohio, and three inMichigan, calculated from the posterior sample for β at each time. We did not show the results for Holmes County(OH) and Sanilac County (MI), whose R was consistently very low. The time courses of R show similar patterns12n all six counties, and while in Michigan the peak values were higher than in Ohio, the values towards the end ofthe observation period are lower, varying around the critical values R = 1 .The particle solutions to the estimation problem provide a natural tool for predicting new infections. For eachparticle, the last realization x jt provides an initial value for the differential equations and an estimated value forthe transmission rate, thus we may use the system of ODEs to propagate the particle data forward in time. Fromthe ensemble of the predictions we can then extract the forecasting belief envelopes, similarly as we did for thenowcasting problem. Figures 6-7 show the predictions for the following 20 days for the four sample counties underthe assumption that the mitigation conditions remain unchanged. In addition, we have run two alternative scenarios:In the ﬁrst one, the last estimate of the transmission rate for each particle is multiplied by a factor κ = 0 . , simulatinga tightening of the containment and control measures, while in the second scenario, the transmission rate is increasedby multiplicative factor κ = 1 . , simulating a relaxation of the control measures. We observe that the predictionsillustrate well the what the lower R in Michigan compared to Ohio means in terms of expected number of newinfections.To further test the predictive skills of the model, we leave out from the current data set { b , b , . . . , b T } the last t pred data points, considering only the data up to (cid:101) T = T − t pred , and compute a sample of predicted new cases, deﬁningthe predicted particle sample b j pred ,t ∼ Poisson( ηE j pred ,t ) , t = (cid:101) T + 1 , . . . , (cid:101) T + t pred = T, ≤ j ≤ N, that is, we propagate the particles computed at (cid:101) T forward, and generate artiﬁcial observations. A comparison of theresulting predictive envelopes and the actual data is indicative of the predictive skill of the model: large deviationsindicate that the model may not have taken some important factors into account.Figure 8 shows the day predictions for three counties in Ohio and three counties in Michigan, together with theactual data. The plots indicate that, in general, the model captures the trend of the data relatively well, with theexception of the Saginaw County (MI), where the predicted trend seems higher than the actual observation.Finally, we consider the sensitivity of the results to the choice of some of the model parameters that assumed tobe known, and in particular, the time constants T inc = 1 /η (incubation) and T rec = 1 /γ (recovery). To test thesensitivity, we select one of the counties, Cuyahoga (OH) and run the program with different parameter combina-tions. Figure 9–12 show a selection of results. The results show that both the estimated transmission rate and theratio ρ = E/I are rather robust for variations of the parameter, the former decreasing slightly with increasing T rec .Interestingly, towards the end of the observation interval, the ratio ρ is in all cases close to the equilibrium value ρ ∗ ≈ . . Therefore, the value of R is approximately R = ηγ + µ ρ ∗ ≈ ηγ ρ ∗ ≈ . × T rec T inc . However, this approximation gives an idea of the spreading of the transmission only if the ratio

E/I is near theequilibrium. The plots show that in particular at the beginning of the observation period the ratio is far from theequilibrium, and R would predict a too slow growth rate for the transmission dynamics. The proposed dynamical Bayesian ﬁltering algorithm based on particle ﬁlters is shown to be able to produce a robustand consistent estimate of the state vectors consisting of the sizes of the four cohorts of a new SE(A)IR model forCOVID-19 spread, as well as of the transmission rate, a key parameter that is not expected to be a constant. In13pidemiology models, the transmission rate β is usually deﬁned as β = (transmission probability per contact ) × (number of contacts per day),where the ﬁrst factor is related to the characteristics of the pathogen, while the second factor is related to the behaviorof the individuals and can change in connection with hygiene, social distancing and other mitigation measures. Inthe Ohio and Michigan counties, we observe a rapid growth of β at the beginning of the outbreak, followed by aclear and consistent drop. The initial increase may be due, to some extent, to the scarcity of the data in the beginningof the epidemics, and reﬂect the learning effort of the algorithm. The drop that follows, however, correlates wellwith the stay-at-home order and other social distancing measures introduced early on in both states, from March 16to March 23 in Michigan, and from March 15 to March 19 in Ohio. In both states and across all counties consideredhere, the transmission rate is fairly stable, except for Wayne County, Michigan, where β is slightly higher earlyin the epidemic. This may be the effect of a wider use of public transportation, socio-economic conditions anddemographic factors. If the reason for the drop is the adoption of mitigation measures, it is natural to wonder why itsonset is not the same in all counties. A possible explanation of the delay may lie in the fact that the infection arrivedin different communities through the mobility of infectious individuals, and what happens earlier in large, denselypopulated areas affects the surrounding communities with a delay. The effects of the network structure on the timingand intensity of the COVID-19 spread will be addressed separately.This SE(A)IR model we present directly addresses the role of asymptomatic individuals shedding virus, many ofwhom recover before developing symptoms, in transmission dynamics. We demonstrate a method for dynamicallyestimating the ratio of asymptomatic/presymptomatic/oligosymptomatic individuals in the E compartment to symp-tomatic individuals in the I compartment. This ratio, whose equilibrium value consistently settles to a value lessthan one across all counties examined and with a wide range of model parameters, is considerably lower than the50-85 ratio speculated in [22] based on a serosurvey conducted in California on April 3-4, 2020. This disparity maystem partly from the growing availability of testing throughout April as well as the cross-reactivity in the ELISAtests used as part of the serosurvey. Understanding better this discrepancy will be a topic of future studies. How-ever, we emphasize that while our methodology predicts that the asymptomatic cohort is typically smaller than thesymptomatic one, according to the model, it is still responsible for the majority of transmissions. Furthermore, asalready pointed out, the current SE(A)IR model does not address some of the features of COVID-19 transmission,including the latent period of the asymptomatic cohort. While designing a model that properly addresses the delayof the infectious phase of the asymptomatic patients is straightforward, retaining a structure that allows us to informabout the size of asymptomatic cohort based on data on symptomatic patients alone remains a challenge. Elaboratingon that point will be a future direction of this research.The model-based predictions indicate that the current value of β alone is not enough to project how the infectionprocess will proceed. In fact, while the value β at the end of the considered time interval is nearly the same forall counties, the predictions on the new infections differ signiﬁcantly. The main reason for this discrepancy is thedifferent size of the cohorts. The numbers infected tend to increase strongly even if β is relatively small if thepipeline of exposed/asymptomatic individuals is crowded. Because an unobservable concentration of individuals inthe exposed/asymptomatic category may represent an uptick in future cases, only several days (preferably two weeks)of consecutively falling case numbers likely represents systematic decline in population-level disease activity. Theuncertainty associated with undulating case counts is reﬂected in the wide predictive envelopes seen in projectionsfor several of the Ohio counties. It is worth noting 10 day projections shown in Figure 8 tend to overestimateobserved incidence in the last few days. We suspect this is attributable in part to reporting delays.In this paper we introduce a new way of estimating the reproduction number R for the proposed model, based on anequilibrium condition of a related non-linear Riccati type equation. The equilibrium condition, in turn, was shownto correspond well to the estimated ratio of the symptomatic and non-symptomatic cases after the transmissionrate had stabilized, presumably due to the social distancing measures. Interestingly, the equilibrium value ρ ∗ and14he corresponding R are very consistent for a number of counties in two states, Ohio and Michigan, that adoptedsimilar distancing measures around the same time, regardless of the fact that Michigan had a signiﬁcantly highernumber of infection than Ohio at the time of the adoption. These observations give an indication that the quantitiesreﬂect well the effectiveness of the mitigation measures, providing a useful and quick indicator for assessing suchmeasures. Acknowledgements

The work of DC was partly supported by NSF grants DMS-1522334 and DMS-1951446, and the work of ES waspartly supported by NSF grant DMS-1714617.

References [1] Adam J Kucharski, Timothy W Russell, Charlie Diamond, Yang Liu, John Edmunds, Sebastian Funk, Ros-alind M Eggo, Fiona Sun, Mark Jit, James D Munday, et al. Early dynamics of transmission and control ofCOVID-19: a mathematical modelling study.

The lancet infectious diseases , 2020.[2] Kathy Leung, Joseph T Wu, Di Liu, and Gabriel M Leung. First-wave COVID-19 transmissibility and sever-ity in China outside hubei after control measures, and second-wave scenario planning: a modelling impactassessment.

The Lancet , 2020.[3] Cleo Anastassopoulou, Lucia Russo, Athanasios Tsakris, and Constantinos Siettos. Data-based analysis, mod-elling and forecasting of the COVID-19 outbreak.

PloS one , 15(3):e0230405, 2020.[4] Shi Zhao, Qianyin Lin, Jinjun Ran, Salihu S Musa, Guangpu Yang, Weiming Wang, Yijun Lou, DaozhouGao, Lin Yang, Daihai He, et al. Preliminary estimation of the basic reproduction number of novel coron-avirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak.

International journal of infectious diseases , 92:214–217, 2020.[5] CSSE Johns Hopkins. Coronavirus 2019-nCoV global cases.

Center for Systems Science and Engineering,Johns Hopkins University , 2020.[6] Coronavirus locations: COVID-19 map by county and state, 2020.[7] Roman W¨olfel, Victor M Corman, Wolfgang Guggemos, Michael Seilmaier, Sabine Zange, Marcel A M¨uller,Daniela Niemeyer, Terry C Jones, Patrick Vollmar, Camilla Rothe, Michael Hoelscher, Tobias Bleicker, Sebas-tian Br¨unink, Julia Schneider, Rosina Ehmann, Katrin Zwirglmaier, Christian Drosten, and Clemens. Virolog-ical assessment of hospitalized patients with COVID-2019.

Nature , 2020.[8] Xi He, Eric H. Y. Lau, Peng Wu, Xilong Deng, Jian Wang, Xinxin Hao, Yiu Chung Lau, Jessica Y. Wong,Yujuan Guan, Xinghua Tan, Xiaoneng Mo, Yanqing Chen, Baolin Liao, Weilie Chen, Fengyu Hu, QingZhang, Mingqiu Zhong, Yanrong Wu, Lingzhai Zhao, Fuchun Zhang, Benjamin J. Cowling, Fang Li, andGabriel M. Leung. Temporal dynamics in viral shedding and transmissibility of COVID-19.

Nature Medicine ,30(30):10127–10134, 2020.[9] Richard G MacDonald. The analysis of equilibrium in malaria.

Tropical diseases bulletin , 1952.[10] George Macdonald et al. The epidemiology and control of malaria.

The Epidemiology and Control of Malaria. ,1957. 1511] Johan Andre Peter Heesterbeek. A brief history of R and a recipe for its calculation. Acta biotheoretica ,50(3):189–204, 2002.[12] O Diekmann, J a P Heesterbeek, and J A J Metz. On the deﬁnition and computation of the basic reproductionratio R in models for infectious diseases in heterogeneous populations. Journal of Mathematical Biology ,28(4):365–382, 2005.[13] Odo Diekmann, JAP Heesterbeek, and Michael G Roberts. The construction of next-generation matrices forcompartmental epidemic models.

Journal of the Royal Society Interface , 7(47):873–885, 2010.[14] Klaus Dietz. The estimation of the basic reproduction number for infectious diseases.

Statistical methods inmedical research , 2(1):23–41, 1993.[15] Paul L Delamater, Erica J Street, Timothy F Lesley, Y Tony Yang, and Kathryn H Jacobsen. Complexity of thebasic reproduction number (R ). Emerging Infectious Diseases , 25(1):1–4, 2019.[16] Jane M Heffernan, R J Smith, and L M Wahl. Pespectives on the basic reproductive ratio.

Journal of the RoyalSociety, Interface , 2(4):365–382, 1990.[17] Jing Li, Daniel Blakeley, et al. The failure of r . Computational and mathematical methods in medicine , 2011,2011.[18] Benjamin Ridenhour, Jessica M Kowalik, and David K Shay. Unraveling R : Considerations for public healthapplications. American Journal of Public Health , 104(2):e, 2014.[19] Huwen Wang, Zezhou Wang, Yinqiao Dong, Ruijie Chang, Chen Xu, Xiayue Yu, Shuxian Zhang, LhakpaTsamlag, Meili Shang, Jinyan Huang, Ying Wang, gang Xu, Tian Shen, Xinxin Zhang, and Yong Cai. Phase-adjusted estimation of the number of coronavirus disease 2019 cases in Wuhan, China.

Cell Discovery ,6(10):281–93, 2020.[20] Giulia Giordano, Franco Blanchini, Raffaele Bruno, Patrizio Colaneri, Alessandro Di Filippo, Angela Di Mat-teo, and Marta Colaneri. Modelling the COVID-19 epidemic and implementation of population-wide interven-tions in Italy.

Nature Medicine , pages 1–6, 2020.[21] Joseph T Wu, Kathy Leung, and Gabriel M Leung. Nowcasting and forecasting the potential domestic andinternational spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study.

Lancet ,2020.[22] E Bendavid, B Mulaney, N Sood, S Shah, E Ling, R Bromley-Dulfano, C Lai, Z Weissberg, R Saavedra,J Tedrow, et al. Covid-19 antibody seroprevalence in Santa Clara county, California. medRxiv. 2020.[23] Tim K Tsang, Peng Wu, Yun Lin, Eric HY Lau, Gabriel M Leung, and Benjamin J Cowling. Effect of chang-ing case deﬁnitions for COVID-19 on the epidemic curve and transmission parameters in mainland China: amodelling study.

The Lancet Public Health , 2020.[24] Ruiyun Li, Sen Pei, Bin Chen, Yimeng Song, Tao Zhang, Wan Yang, and Jeffrey Shaman . Substantialundocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2).

Science , 2020.[25] Jos´e Lourenc¸o, Robert Paton, Mahan Ghafari, Moritz Kraemer, Craig Thompson, Peter Simmonds, Paul Klen-erman, and Sunetra Gupta. Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic. medRxiv , 2020.[26] Ismael Khorshed Abdulrahman. Simcovid: An open-source simulation program for the covid-19 outbreak. medRxiv , 2020. 1627] Gary E Weissman, Andrew Crane-Droesch, Corey Chivers, ThaiBinh Luong, Asaf Hanish, Michael Z Levy,Jason Lubken, Michael Becker, Michael E Draugelis, George L Anesi, et al. Locally informed simulation topredict hospital capacity needs during the covid-19 pandemic.

Annals of internal medicine , 2020.[28] IHME COVID-19 health service utilization forecasting team and Christopher JL Murray. Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months, 2020.[29] William O Kermack and AG McKendrick. A contribution to the mathematical theory of epidemic.

ProceedingsRoyal Society London , 115(772):700 –721, 1927.[30] William Farr. Causes of death in England and Wales.

Second Annual Report of the Registrar General of Births,Deaths and Marriages in England , 2:69– 98.[31] Jane Liu and Mike West. Combined parameter and state estimation in simulation-based ﬁltering. In

SequentialMonte Carlo methods in practice , pages 197–223. Springer, 2001.[32] Jari Kaipio and Erkki Somersalo.

Statistical and computational inverse problems , volume 160. SpringerScience & Business Media, 2006.[33] Andrea Arnold, Daniela Calvetti, and Erkki Somersalo. Parameter estimation for stiff deterministic dynamicalsystems via ensemble Kalman ﬁlter.

Inverse Problems , 30(10):105008, 2014.[34] Andrea Arnold, Daniela Calvetti, and Erkki Somersalo. Linear multistep methods, particle ﬁltering and se-quential Monte Carlo.

Inverse Problems , 29(8):085007, 2013.[35] Andrea Arnold, Daniela Calvetti, Albert Gjedde, Peter Iversen, and Erkki Somersalo. Astrocytic tracer dynam-ics estimated from [1-11C]-acetate PET measurements.

Mathematical medicine and biology: a journal of theIMA , 32(4):367–382, 2015. 17 igure 2: Particle ﬁlter estimates with of 50% and 75% belief posterior envelopes using the COVID-19 data of fourcounties in Northeastern Ohio. The top row corresponds to Cuyahoga County (Cleveland area), the second row toSummit County (Akron area), the third row to Lake County (commuter community close to Cleveland), and thebottom row to the rural Holmes County. In the left column, the blue bars indicate the daily count of conﬁrmedcases and the grey bars are the deaths (not used here), while the envelopes indicate the nowcasting of the meanof new cases with model uncertainties. The red curve indicates the median of the sample. In the middle column,the posterior envelopes of 50% and 75% of the transmission rate β are shown. On the right, we show the posteriorenvelopes of the ratio of the cohorts E and I . The dashed line indicates the equilibrium value ρ ∗ based on the Riccatiequation, computed with the posterior mean of β , and with the approximation S/N ≈ .16 Figure 2: Particle ﬁlter estimates with of 50% and 75% belief posterior envelopes using the COVID-19 data offour counties in Northeastern Ohio. The top row corresponds to Cuyahoga County (Cleveland area), the secondrow to Summit County (Akron area), the third row to Lake County (commuter community close to Cleveland), andthe bottom row to the rural Holmes County. In the left column, the blue bars indicate the daily count of conﬁrmedcases and the grey bars are the deaths (not used here), while the envelopes indicate the nowcasting of the meanof new cases with model uncertainties. The red curve indicates the median of the sample. In the middle column,the posterior envelopes of 50% and 75% of the transmission rate β are shown. On the right, we show the posteriorenvelopes of the ratio of the cohorts E and I . The dashed line indicates the equilibrium value ρ ∗ based on the Riccatiequation, computed with the posterior mean of β , and with the approximation S/N ≈ .18 igure 3: Particle ﬁlter estimates 50% and 75% belief posterior envelopes using the COVID-19 data of four countiesin Southeastern Michigan. The top row corresponds to Wayne County (Detroit area), the second row to GeneseeCounty (Flint area), the third row to Oakland County (suburban near Detroit), and the bottom row to the rural SanillacCounty. For the explanation of the different columns, see the caption of Figure 2.17 Figure 3: Particle ﬁlter estimates 50% and 75% belief posterior envelopes using the COVID-19 data of four countiesin Southeastern Michigan. The top row corresponds to Wayne County (Detroit area), the second row to GeneseeCounty (Flint area), the third row to Oakland County (suburban near Detroit), and the bottom row to the rural SanillacCounty. For the explanation of the different columns, see the caption of Figure 2.19igure 4: The estimated R posterior envelopes for three counties in Ohio (top row), and four in Michigan (bottomrow). The lighter envelope correspond to 75% posterior belief, and the darker one to 50% belief.Figure 5: The equilibrium value ρ ∗ of the ratio E/I as a function of the dimensionless quantities t = γ/η and s = β/η with three different values of the infectivity factor of the cohort I : q = 0 . (left), q = 0 . (center) and q = 1 (right). 20igure 6: The predictive envelopes for four counties in Ohio. In the left column, the propagation for each particleis continued by using the last state vector as initial data, and the last estimated β value as transmission rate. In themiddle column, the transmission rates are slightly lowered by a factor of 0.8, while in the right column, the rates areincreased by a factor of 1.2. The predictive envelopes correspond to 50% and 75% levels of belief.21igure 7: The predictive envelopes for four selected counties in Michigan. The explanations of the columns are thesame as in Figure 6. 22igure 8: Ten days prediction of new daily infections in six different counties, three in Ohio and three in Michigan.The predictions are based on the data excluding the last ten observations. The two different shades correspond to50% and 75% uncertainty of predicting new observations. The actual data are shown as stem plots.Figure 9: The effect of parameters η = 1 /T inc and γ = 1 /T rec on the estimated expected value of new cases. Theparameter combinations T inc /T rec in these ﬁgures are chosen as follows: First row: 4/7, 4/10, 4/14. Second row:7/10, 7/14, 7/21. 23igure 10: The effect of parameters η = 1 /T inc and γ = 1 /T rec on the estimated value of the transmission rate β . The parameter combinations T inc /T rec in these ﬁgures are chosen as follows: First row: 4/7, 4/10, 4/14. Secondrow: 7/10, 7/14, 7/21.Figure 11: The effect of parameters η = 1 /T inc and γ = 1 /T rec on the estimated value of the ratio E ( A ) /I . Theequilibrium value ρ ∗ is plotted as a dashed curve. The parameter combinations T inc /T rec in these ﬁgures are chosenas follows: First row: 4/7, 4/10, 4/14. Second row: 7/10, 7/14, 7/21.24igure 12: The effect of parameters η = 1 /T inc and γ = 1 /T rec on the estimated reproduction number R . Theparameter combinations T inc /T recrec