[PDF] A stochastic SIR model for the analysis of the COVID-19 Italian epidemic

Abstract

We propose a stochastic SIR model, specified as a system of stochastic differential equations, to analyse the data of the Italian COVID-19 epidemic, taking also into account the under-detection of infected and recovered individuals in the population. We find that a correct assessment of the amount of under-detection is important to obtain reliable estimates of the critical model parameters. Moreover, a single SIR model over the whole epidemic period is unable to correctly describe the behaviour of the pandemic. Then, the adaptation of the model in every time-interval between relevant government decrees that implement contagion mitigation measures, provides short-term predictions and a continuously updated assessment of the basic reproduction number.

Full PDF

aa r X i v : . [ q - b i o . P E ] F e b A stochastic SIR model for the analysis of the COVID-19 Italianepidemic

A. Bodini, S. Pasquali, A. Pievatolo, F. RuggeriCNR IMATI “E. Magenes”, Milano, Italy

Abstract

We propose a stochastic SIR model, speciﬁed as a system of stochastic diﬀerential equations, toanalyse the data of the Italian COVID-19 epidemic, taking also into account the under-detectionof infected and recovered individuals in the population. We ﬁnd that a correct assessment of theamount of under-detection is important to obtain reliable estimates of the critical model parameters.Moreover, a single SIR model over the whole epidemic period is unable to correctly describe thebehaviour of the pandemic. Then, the adaptation of the model in every time-interval betweenrelevant government decrees that implement contagion mitigation measures, provides short-termpredictions and a continuously updated assessment of the basic reproduction number.

Keywords : susceptible-infected-removed; basic reproduction number; state-space SDE; under-detection;identiﬁability; particle ﬁltering

In this work we apply a stochastic version of the well-known SIR (with the three Susceptible, Infectiveand Removed compartments) epidemic model to the Italian COVID-19 data. The SIR model has beenintroduced about ten years after the 1918 inﬂuenza pandemic, also known as Spanish ﬂu ([24]), andis still popular as a simple tool to approximate disease behaviour ([34]). Numerous extensions such asthe simpler SEIR and SIRD models or the more complex SIDARTHE ([19]) have been subsequentlydeveloped to make more reliable assumptions on the epidemic dynamic. To recognize the role ofheterogenous contact networks in the transmission dynamics of infectious diseases, an extended SIRmodel with seven compartments has been developed on the nodes of a network where each nodeaccounts for the mean number of contacts in a typical day, [41]. For the outbreak of COVID-19,alternative models have been introduced ranging from elementary models ([36]) to very complex onesincluding spatial and temporal variations (e.g. [1], [2], [38], [37]). However, as a parsimonious model1ble to allowing the measurement and forecast of the impact of non-pharmacological interventions likesocial distancing, the SIR model still preserves a primary role in the analysis of the early phase ofCOVID-19 outbreak ([3],[8]). In [12] for instance, the SIR model is combined with Bayesian parameterinference and the eﬀects of governmental interventions in Germany are modelled as ﬂexible changepoints in the spreading rate. A SIR model with time-dependent infectivity and recovery rate which areestimated by solving an inverse problem is considered in [27]. A hierarchical epidemiological model inwhich two observed time series of daily proportions of infected and removed cases are emitted from theunderlying infection dynamics governed by a Markov SIR infectious disease process is proposed in [39].By introducing a time–varying transmission rate, the Authors cover the eﬀects of diﬀerent interventionmeasures in dissimilar periods in Italy. In [10] it is assumed that the susceptible population is a variablethat can be adjusted to account for new infected individuals spreading throughout a community. Witha similar aim, [7] include among the parameters of a SIRD model the initial number of susceptibleindividuals as well as the proportionality factor relating the detected number of positives with theactual (and unknown) number of infected individuals. An alternative way to account for possiblerandom errors in reporting is proposed in [20] as well.As mentioned before, several compartments may be added to the SIR model but, in previous work([4]) it was shown that while the parameters of the SIR model can be uniquely determined fromthe time evolution of the normalized curve of removed individuals, the same does not hold for morecomplex models. Thus, the SEIR and other models should not be used in the absence of additionalinformation that might be obtained from clinical studies. In the present work, since we assume noclinical information, we will use the SIR model. In particular, we introduce a stochastic SIR model todescribe the dynamics of the infection, coupled with an observation equation that relates the actualnumbers of infected and removed to the observed ones. There are two sources of randomness: the ﬁrstin the SIR model, introduced through uncertainty in the infection and recovery rate parameters, suchthat they depend on time; the second in the observation equation, where a random under-reportingerror is assumed. This model is amenable to sequential updating of parameters and forecasting, auseful feature when data come in on a daily basis. The updating method is a particle ﬁlter withparameter learning [14] for the fraction of individuals in each compartment.The article has two objectives: on the one side, the exploration of the suitability of a SIR model,while, on the other, it provides some results on the epidemic in Italy. The ﬁrst ﬁnding is that asingle SIR model is unable to capture the dynamics of the entire development of the COVID-19epidemic, because of the numerous policy adjustments by health authorities and diﬀerent governmentinterventions that took place during the emergency. Hence we ﬁtted one SIR model to each phasebetween selected government interventions, obtaining a good ﬁt to the data. Notwithstanding, thepredictive capability of this model remains very limited. The second ﬁnding is that it is very important2o use the correct probability distribution for the observation error: failure to do so may produceparameter estimates that seemingly provide a good ﬁt to the data but do not correctly describe thetrue underlying dynamics.The article is organised as follows. In Section 2 we introduce the stochastic SIR model andin Section 3 we present a discretised state-space version and the observation equation, where thestate is the fraction of susceptible, infected and removed individuals. A Rao-Blackwellised particleﬁlter (RBPF) algorithm for state ﬁltering, state prediction and parameter learning is illustrated inAppendix A, [14]. In Section 4 we study the ﬁltering and prediction problem on data simulated fromour model with the help of graphical displays addressing the following: the accuracy and the precisionof both parameter estimation and ﬁltering and the accuracy of the forecast (Sections 4.1 and 4.2); theproblem of the simultaneous identiﬁability of the parameters in both the SIR model and the observationerror distribution (Section 4.3). In Section 5 we apply our method to the Italian data of both the ﬁrstand the second infection wave, obtaining a good ﬁt, as stated above, but a forecast that may be validonly in the short term. In the same section we brieﬂy compare the performance of a deterministic SIRmodel to our stochastic SIR model, showing that the latter is superior. We also consider the problemof assigning the correct observation error distribution using available information on the InfectionFatality Rate and compare the level of the ﬁltered state to the result of a sample serological surveycarried out by Istat (Italian national statistical oﬃce) and the Italian Health Ministry between Mayand July 2020. This comparison indicates that our model, if correctly calibrated, provides a realisticassessment of the state of the epidemics. A section with some ﬁnal remarks concludes the article.

Consider a population and denote by S t the fraction of susceptible individuals at time t , by I t thefraction of infected individuals and by R t the fraction of recovered individuals (survivors and dead).We suppose that the population is closed, then for every time tS t + I t + R t = 1 . The deterministic SIR model can be written  dS t dt = − βI t S tdI t dt = βI t S t − γI tdR t dt = γI t (1)where β is the disease transmission rate, that is, the fraction of all contacts, between infected andsusceptible people, that become infectious per unit of time and per individual in the population, and γ

3s the removal rate. The reciprocal of γ is the duration of infection. The parameters β and γ allow toapproximate the basic reproduction number (or ratio, also called basic reproductive number or ratio)that can be thought of as the expected number of infected people generated by an infected individualin a population where all individuals are susceptible to infection. Despite its conceptual simplicity, R is usually estimated with complex mathematical models developed using various sets of assumptions([13]). In the above SIR model it holds R = βγ , where parameters β and γ are unknown and have to be estimated. We suppose that they are subjectto uncertainty and change in time as follows: β t = β + σw (1) t γ t = γ + ηw (2) t (2)with w (1) t and w (2) t independent Wiener noises, that is, β t is supposed normally distributed with mean β and variance σ t and γ t is normally distributed with mean γ and variance η t . For alternative waysto introduce stochasticity, see [17]. The parameters σ and η measuring the noise intensity are assumedknown and suﬃciently small to obtain positive β t and γ t with probability approximately equal to one.Substituting the expression (2) for β and γ in system (1), we obtain the following stochastic SIRmodel:  dS t = − β I t S t dt − σI t S t dw (1) t dI t = ( β I t S t − γ I t ) dt + σI t S t dw (1) t − ηI t dw (2) t dR t = γ I t dt + ηI t dw (2) t (3)Unfortunately, the introduction of the noise in the parameters β and γ no longer grants thecondition S t + I t + R t = 1. We can enforce it by substituting S t by 1 − I t − R t in the second and thirdequations and removing the ﬁrst equation to obtain the reduced system  dI t = ( β I t (1 − I t − R t ) − γ I t ) dt + σI t (1 − I t − R t ) dw (1) t − ηI t dw (2) t dR t = γ I t dt + ηI t dw (2) t (4)Denoting by X t = ( I t , R t ) T the state vector, by W t = (cid:16) w (1) t , w (2) t (cid:17) T the vector of independent Wienerprocesses and by θ = ( β , γ ) T the parameter vector, we can rewrite system (4) in vectorial form: dX t = h ( X t ) θ dt + g ( X t ) dW t (5)where h ( X t ) =  I t (1 − I t − R t ) − I t I t  ; g ( X t ) =  σI t (1 − I t − R t ) − ηI t ηI t  . (6)We called X t the state of the system, which for COVID-19 is unobservable, and introduce Y t to denotewhat can be actually observed, in accordance with the terminology derived from state-space modelling.The vector Y t is characterised in the following section.4 Parameter estimation, ﬁltering, forecasting, and goodness-of-ﬁt

To estimate parameter θ we propose a Rao-Blackwellized particle ﬁlter (RBPF) algorithm based onthe Euler discretization of the stochastic system (5): X t +1 = X t + h ( X t ) θ ∆ t + g ( X t )∆ W t , t = 0 , , , . . . (7)where we also use t for discrete time to save notation. The method is described in Appendix A. Thisalgorithm allows to jointly calculate, at each time step, the estimated parameter and the state of thesystem using a noisy observation of the state as input. It is widely recognised that in this pandemiccollected data on infected and removed suﬀer of under–diagnosis and under–detection, ([40]) that is, theobservations are subject to an observation error. We suppose that each component of the observationvector Y t +1 is given by the product of the corresponding component of X t +1 and a random variable:  Y t, Y t,  =  U t, X t, U t, X t,  (8)where U t, and U t, are independent beta distributed random variables with shape parameters a and b . (In the following, by U , Y and X with no subscript we mean scalar random variables distributed as U t,i , Y t,i , and X t,i , respectively, i = 1 , θ is Gaussian with mean µ and covariance matrix Σ .The model can also be used to predict the future behaviour of the epidemic. Let y t be the timeseries of observations up to time t ; for a ﬁxed initial state x , the RBPF algorithm provides a sample x ( i )0: t , i = 1 , . . . , M , to approximate the posterior distribution of the state p ( x t | y t ). Furthermore,the conditional distributions of θ given x t , p ( θ | x t ), is Gaussian with mean µ t = E ( θ | x t ) andcovariance matrix Σ t = Cov ( θ | x t ). The RBPF algorithm produces a sample ( µ ( i ) t , Σ ( i ) t ) of conditionalmean vectors and covariance matrices given x ( i )0: t . To forecast X t + k given y t , we aim at computing E ( X t + k | y t ). If we ﬁx θ and x t , and run model (7) for k time steps, we obtain a value for X t + k asa function f k ( x t , θ , ξ ), in which ξ indicates the sequence of increments ∆ W s , s = t + 1 , . . . , t + k .Using f k ( x t , θ , ξ ), the conditional expectation is E ( X t + k | y t ) = Z f k ( x t , θ , ξ ) p ( ξ ) p ( θ | x t ) p ( x t | y t ) dξdθ dx t (9)where conditional independence of θ on y t given x t allows for substitution of p ( θ | x t , y t ) by p ( θ | x t ). Then, if for each i we draw θ ( i )0 from p ( θ | x ( i )0: t ) and ξ ( i ) from the distribution of the Wienerprocess increment, the predictive expectation of X t + k is approximated by E ( X t + k | y t ) ≃ M M X i =1 f k ( x ( i )0: t , θ ( i )0 , ξ ( i ) ) . (10)5o visualize how well the SIR model ﬁts the observations, we need a way to compare the simulatedtrajectories with the observed data. We have supposed that the observations are smaller than thetrue value of the state X , therefore we have to scale them by a factor that makes them comparableto the ﬁltered state. A scaling factor is suggested by building a prediction interval of the state X ateach observation time. Note that from (8) the random variable Y /X is a pivotal quantity with betadistribution and we may state that1 − q = P (cid:18) u q ≤ YX ≤ u − q (cid:19) = P Yu − q ≤ X ≤ Yu q ! (11)where u q and u − q are the q and the 1 − q percentiles of the beta distribution of U . Then, thecorresponding prediction interval for X , after observing y , is yu − q , yu q ! (12)and a natural scaling factor for a point prediction of X is the median of U , producing y/u . . Thefeature of (12) is that it does not depend on the SIR modelling assumption, but only on the observationerror assumption, therefore it oﬀers a way to see how well the SIR dynamic follows the (transformed)data. To check the convergence of the method we ﬁx a parameter value and simulate the observations (or data ).We start from an initial condition of 1% infected and 0.1% removed. We simulate data for theparameters β = 0 . γ = 0 .

1. The parameters σ and η in (2) are 0.03 and 0.01, respectively.We run model (4) with an underlying time step of 1 /

24 day to generate 67 daily step states. Theﬁrst 60 states will be used for the estimation procedure and the other 7 to check the goodness of theforecast. Then, we use equation (8) with a = 10 and b = 40 for the beta distribution of the observationerror. This distribution is reported in the left panel of Figure 1 and the simulated observations are inthe right panel of Figure 1.We apply the RBPF algorithm described in Appendix A with 200,000 particles and with a timestep of 1 /

24 day. Since we have a single observation for each day, to run the algorithm we haveto impute new hourly observations by means of a linear interpolation between two consecutive dailyobservations. The imputed observations are no longer indipendently distributed conditionally on thestates, however this approximate procedure keeps the eﬀective sample size of the RBPF algorithm atlarge values with no appreciable diﬀerence on the results.6 detected fraction time (days) no r m a li s ed n . o f i nd i v i dua l s susceptibleinfectedremoved Figure 1: Left panel: beta distribution with parameters a = 10 and b = 40 for the observation errorterms U t, and U t, in (8). Right panel: data generated by (8) from state trajectories generated by(4).The initial guess for the parameter θ = ( β , γ ) T is µ = (0 . , . T and the prior covariancematrix Σ = diag (0 . , . S t = 1 − I t − R t and then the goodness of ﬁt is a consequence ofthe ﬁt for the other two compartments.We denote the estimates of β and γ with information up to time t by ˆ β t and ˆ γ t , see (A-5). Theirtime behaviour is shown in the left panel of Figure 3. The behaviour of ˆ R ( t ), the estimated basicreproduction number (A-6), is displayed in the right panel of Figure 3.We denote the ﬁltered or forecasted state as ˆ I t and ˆ R t , where the value of t determines whetherwe are ﬁltering or forecasting, that is, if our observation period ends at time s , then when t > s , ˆ I t and ˆ R t are forecasts; otherwise they are ﬁltered states. From the RBPF we get the ﬁltered statesand, using (10), we get a forecast of the dynamics. ˆ I t and ˆ R t are compared to the true state in theleft panel of Figure 2, where we see that both the ﬁt up to day 60 and the forecast on days 61-67 aresatisfactory. In particular, the forecast well represents the trend of the state.In reality the true states are unobserved, and we can only compare the ﬁltered states to theobservations, by taking under-detection into account. Therefore, we compute daily prediction intervalsfor infected I t and removed R t , as in (12) with q = 0 .

10 20 30 40 50 60 70 time (days) no r m a li s ed n . o f i nd i v i dua l s Figure 2: Application of the RBPF algorithm. Left panel: trajectories of infected (red line) andremoved (green line). Circles represent the true state. The dynamics up to day 60 are the ﬁlteredstates, while the dynamics from day 61 to day 67 are forecasts. Right panel: posterior distributionsfor the parameters β and γ at time 60. time (days) time (days) Figure 3: Left panel: behaviour of ˆ β t (blue) and ˆ γ t (red) obtained from (A-5). Right panel: behaviourof ˆ R ( t ) obtained from (A-6) (blue line) and value of true R = β /γ = 0 . / .

10 20 30 40 50 60 70 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved Figure 4: True states (circles), ﬁltered state and forecast (thick lines) and adjusted observations (thinline) with 95% prediction intervals (12). The thick lines up to day 60 are the ﬁltered states, whilethose from day 61 to day 67 are forecasts. The thin lines are the observations divided by u . , themedian of the observation error distribution.error distribution in the left panel of Figure 1, for which q . = 0 .

10 and q . = 0 .

32. The simulateddynamics (the true states) are inside the intervals and cross the adjusted observations, then, if theadjusted observations and the ﬁltered state agree with each other, this is a necessary condition for theﬁltered state to follow the true state.

In this section we analyze the variability of the ﬁltered states and of the estimated parameters dueto the variability of the data generated from the system (4), in order to get an impression of how farthey can get from the true values, even if the true random under-reporting error distribution is used.We use the same parameters of Section 4.1, that is, ( β , γ ) = (0 . , . σ = 0 . η = 0 .

03, initialvalues µ = (0 . , . T and Σ = diag (0 . , .

02) and initial state X = (1% , . /

24 day and for 200,000 particles. For every simulations we compute the trajectoriesˆ I t /I t and ˆ R t /R t , where we recall that ˆ I t and ˆ R t are the estimated infected and removed individualsﬁltered by the RBPF algorithm, while I t and R t are the true states for the corresponding simulation.For the sake of a clear representation, in Figure 5 we show one every ﬁve trajectories of ˆ I t /I t (left9

10 20 30 40 50 60 time (days) time (days)

Figure 5: Left panel: behaviour of ˆ I t /I t for 100 diﬀerent simulations. Right panel: behaviour of ˆ R t /R t for 100 diﬀerent simulations.panel) and ˆ R t /R t (right panel).In both panels of Figure 5, after a transient phase with larger dispersion, ˆ I t /I t and ˆ R t /R t end upﬂuctuating around 1, with a stable dispersion in the left panel and a decreasing dispersion in the rightpanel. Now consider the sum of the root mean square error (RMSE) between ˆ I t and I t and of theRMSE between ˆ R t and R t for all the trajectories, as a measure of distance between the estimate andthe truth. Among all the trajectories we represented the one with the smallest and the one with thelargest distance in the left and in the right panels of Figure 6, respectively. The latter picture showsthat the ﬁt may be very unsatisfactory.Finally, we report the scatter plot of all the pairs ( ˆ β t , ˆ γ t ) obtained in the 500 simulations (Figure 7)at t = 60. We observe that they are dispersed around the true value of the parameter (0 . , . β t / ˆ γ t shows a smaller variability than ˆ β t andˆ γ t , around a straight line with slope close to three, the true value of R . Therefore we expect to beable to estimate the basic reproduction number with better accuracy and precision than the infectionand removal rate parameters. A statistical model, belonging to a parametric family, is called identiﬁable if, for any two diﬀerentvalues of parameters, there exists at least a measurable set in the sample space that is not assigned10

10 20 30 40 50 60 time (days) no r m a li s ed n . o f i nd i v i dua l s trajectory min distance time (days) no r m a li s ed n . o f i nd i v i dua l s trajectory max distance Figure 6: Left panel: dynamics of ﬁltered states ˆ I t and ˆ R t (continuous lines) in the case of minimumsum of root mean square error between ﬁltered states and true states (circles). Right panel: dynamicsof ﬁltered states ˆ I t and ˆ R t (continuous lines) in the case of maximum sum of root mean square errorbetween ﬁltered states and true states (circles). posterior meansparams. for max trject.params. for min trject.true value Figure 7: Scatter plot of the parameters (ˆ γ , ˆ β ) obtained for the diﬀerent simulations. The reddot represents the true pair (0 . , . β against ˆ γ (slope 2.78). 11he same probability by the two members of the family, that is, given θ = θ , there exists at leastone set B such that Pr( Y t ∈ B ; θ ) = Pr( Y t ∈ B ; θ ) , (13)where Y t denotes a ﬁnite-length trajectory of observations from (8). For a deterministic model, thisproperty is called structural identiﬁability, which holds if there exists a map from the parameter tothe output θ y t ( θ ) which is injective, that is, given θ = θ , the two models y ( θ ) and y ( θ )describe diﬀerent output trajectories. By a diﬀerential algebra approach, [30] show that the followingdeterministic SIR model with its output  dI t dt = β I t ( N − I t − R t ) − γ I tdR t dt = γ I t y ,t = K I t y ,t = K R t (14)deﬁned for a non-normalised population of size N , is structurally identiﬁable with respect to theunknown parameters β , γ and K . The parameter K > U random variables in (8). Then, after adding noiseto the output they go on to show that, despite structural identiﬁability, the parameters may notbe practically identiﬁable, that is, a good or acceptable agreement between observations and ﬁt isdisplayed for diﬀerent values of the parameters when observations end before reaching the peak.The way randomness has been included into this problem via model (7)-(8) is diﬀerent from [30],but we also observe practical identiﬁability. The problem of identiﬁability is also discussed in [17] forstochastic SIR models. We generate state trajectories composed by 30 daily values using model (4)with parameters β = 0 . γ = 0 .

03. Then, to obtain the actual observations we multiply eachvalue by a number drawn from a beta distribution with parameters a = 10 and b = 40 (blue linein Figure 8). After running the RBPF algorithm using the same beta distribution we obtain resultsanalogous to those in the previous sections, that is, a satisfactory ﬁt of the observed dynamics and agood estimate of the parameters β and γ .Now we run the RBPF algorithm assuming an observation error with beta distribution with amean larger than the truth, with parameters a = 10 and b = 30 (red line in Figure 8). We comparethe ﬁltered states with both the true ones and the observed data. First, we consider 500 simulationsand look at the ratio between the ﬁltered state and the state. For the sake of a clear representation, inFigure 9 we show one every ﬁve trajectories for both the infected and the removed individuals. Theseratios are, generally, less than 1 denoting an underestimation of both infected and removed.Then, we consider the ratio between the ﬁltered states and the adjusted observations. The resultsof one every ﬁve trajectories are reported in Figure 10 for both infected and removed individuals.12 detected fraction Figure 8: Comparison between the two beta densities used to model the observation error. time (days) time (days)

Figure 9: Left panel: behaviour of ˆ I t /I t for 100 diﬀerent simulations. Right panel: behaviour of ˆ R t /R t for 100 diﬀerent simulations. 13 time (days) time (days) Figure 10: Left panel: behaviour of ˆ I t / ( y t, /m ) for 100 diﬀerent simulations, where m is the truemedian of the observation error distribution. Right panel: behaviour of ˆ I t / ( y t, /m ) for 100 diﬀerentsimulations.These ratios ﬂuctuate around 1, denoting a satisfactory ﬁt to the observed data. Figure 11 shows thedynamics of two simulations: the dynamics with minimum distance of the ﬁltered state from the truestate (intended as the minimum sum of the root mean square errors over the two components), inthe left panel, and the dynamics with maximum distance, in the right panel. Both trajectories ﬁt the(scaled) observed data reasonably well.Even if the ﬁltered states follow the adjusted observations, the values of the estimated parametersare not correct, as shown in Figure 12. In fact, the pair of estimated parameters in the 100 simulationsare not equally dispersed around the true value (red point) but are placed mainly below the true value,denoting a bad estimation for β . The estimation of γ is better.It follows that it is very important to suitably choose the beta distribution of the observation error(as we have done in Section 5) in the collection of infected and removed people to avoid practicalnonidentiﬁability. In this section we use data of the epidemic in Italy, collected by Protezione Civile (Civil ProtectionDepartment) from 24th February. As a ﬁrst step we consider the data for the whole Italy from 1stMarch up to 26th November 2020. Available data are the number of infected, dead, and recoveredindividuals. Removed people can be obtained by summing dead and recovered people. In Italy, alldeaths of people infected with SARS-CoV-2 were classiﬁed as COVID-19 ([31]). The infected andremoved individuals in Italy from 24th February to 26th November are represented in Figure 13. Thetotal residing population as of 31st December 2019 is 60,244,639 people, as certiﬁed by Istat.14 time (days) no r m a li s ed n . o f i nd i v i dua l s trajectory min distance time (days) no r m a li s ed n . o f i nd i v i dua l s trajectory max distance Figure 11: Left panel: case of minimum sum of root mean square error between true and ﬁltered state,dynamics of ˆ I t and ˆ R t (continuous lines) and adjusted observations (thin line with asterisks). Rightpanel: case of maximum sum of root mean square error between true and ﬁltered state, dynamics ofˆ I t and ˆ R t (continuous lines) and adjusted observations (thin line with asterisks). posterior meansparams. for max trject.params. for min trject.true value Figure 12: Scatter plot of the parameters (ˆ γ , ˆ β ), obtained for the diﬀerent simulations, withwrong parameters a and b in the observation error distribution. The red dot represents the true pair(0 . , . n . o f i nd i v i dua l s Figure 13: Infected (red asterisks) and removed (green asterisks) in Italy from 24th February (time0) to 26th November 2020. Data from Protezione Civile.To deal with underdetection we consider observations as generated by (8), where we recall that U t, and U t, are independently beta distributed with common shape parameters a and b . In particular,we have I obs = Y = U X and R obs = Y = U X , where the additional notations I obs and R obs areintroduced as meaningful names for what follows.As we can see from Figure 13 the epidemic can be divided in two waves: the ﬁrst oﬃcially began on24th February and lasted until mid-summer, when the number of infected people began to rise again,as in the rest of Europe ([5]). The second wave is distinguished from the ﬁrst also by the increasedtest capacity. Hence we consider the two waves as diﬀerent models, with respect to both the SIRparameters and the observation error distribution parameters and we conventionally set the start ofthe second wave on 1st August. To ﬁx a and b we refer to the Infection Fatality Ratio (IFR), a fundamental quantity to estimatethe seriousness of the epidemic, and to its crude estimate, the Case Fatality Ratio (CFR). The IFRis the ratio of COVID-19 deaths to total infections of SARS-CoV-2, asymptomatic and undiagnosedinfections included, while the CFR is the ratio of COVID-19 deaths to conﬁrmed cases. By thedeﬁnition, the CFR is greater than the IFR. At any time t an estimate CF R t of the CFR and an16stimate IF R t of the IFR are related by the simple relationship CF R t = D t R obs,t = D t R t × R t R obs,t = IF R t × R t R obs,t (15)where D t denotes all deaths by time t ([18]).Since by (15) R obs,t = IF R t CF R t × R t , (16)the ratio IF R t /CF R t can be regarded as the under-reporting factor that we modelled as the U betarandom variable introduced earlier.Given estimates of IF R t as t = 1 , . . . , T and the corresponding observed sequence of CF R t wewould obtain a sample u = IF R /CF R , . . . , u T = IF R T /CF R T and an estimate of a and b by anyestablished method. Using the method of moments, for example, and considering an estimate of theIFR to substitute IF R t , we would getˆ a = ¯ u (cid:18) ¯ u (1 − ¯ u ) s u − (cid:19) = ¯ u /CF R ( ¯ u /CF R (1 − IF R ¯ u /CF R ) s /CF R − IF R ) ˆ b = (1 − ¯ u ) (cid:18) ¯ u (1 − ¯ u ) s u − (cid:19) = 1 − IF R ¯ u /CF R IF R ( ¯ u /CF R (1 − IF R ¯ u /CF R ) s /CF R − IF R ) (17)where ¯ u and s u are the sample mean and variance of ( u , . . . , u T ) and ¯ u /CF R and s /CF R are thesample mean and variance of 1 /CF R , . . . , /CF R T . These equations show that the IFR aﬀects bothparameters.The fatality ratio approach has the advantage that the IFR is a pure number and information onits value can be gathered from diﬀerent populations. Then, in practice, we may estimate the samplemean and variance of 1 /CF R , . . . , /CF R T from the observed fatality and removal data, and for aselected IF R assign ˆ a and ˆ b . If a range of values is available for the IFR from another source, such asa conﬁdence or a credibility interval, we may repeat the analysis for IF R varying within the intervaland evaluate the sensitivity of the results.For Italy, we may use an indirect method to point at a plausible value of the IFR within thisinterval, taking advantage of a seroprevalence survey targeting IgG antibody conducted in Italy fromMay to July by Istat, the Italian national statistical oﬃce, and the Italian Health Ministry. Preliminaryresults obtained from 64,660 people were presented in early August ([21]). According to them, almost1.5 million people in Italy or 2.5% of the population had developed coronavirus antibodies, a ﬁgure sixtimes larger than oﬃcial numbers reported. In short, the idea is to compare the 2.5% ﬁgure of peoplewho developed antibodies to the healed people (who have antibodies) estimated from the ﬁlteredstate ˆ R t in an appropriate time interval. The infected compartment may also contain seropositiveindividuals, still, the fraction of people in this compartment had become small when Istat’s surveystarted, so we consider only the recovered compartment. The reasoning behind this comparison is17hat if the assumed IFR is correct, then the observation error distribution derived from (17) is correctand the ﬁltered states are realistic and they should be in agreement with the Istat survey result.To be more speciﬁc, let R t = H t + D t , where H t and D t are the fractions (over the population) ofhealed and dead people by time t , respectively. Healed people can be seronegative if IgG antibodiesare no longer in their system, but we can safely assume that a person enters the healed record soonso they s/he can be considered as seropositive when they do. Now, H t includes all healed since thestart of the epidemic, hence a fraction of H t can be seronegative, depending on the duration d ofseropositivity. Hence we should compare 2.5% to H t − H t − d , where H t − d = 0 if t − d <

1. The truevalues of H t are unknown. We may recover them from ˆ R t and the available data on the fraction ofdeaths as ˆ H t = ˆ R t − D t /u . , where u . is the median of the distribution of the observation error.Since Istat’s survey was carried out between 25 May and 15 July 2020, we compare 2.5% to¯ H = 152

15 July X t =25 May ( ˆ H t − ˆ H t − d ) . (18)A plausible value of d is three months ([15]). This procedure rests on several assumptions and we onlyregard it as a way to check for gross deviations of our model from reality. We run the RBPF algorithm with 20,000 particles and time discretisation step of 1 /

24 day as donefor the synthetic data. The initial values are ( β , γ ) T = µ = (0 . , . T and Σ = diag (0 . , . σ = 0 .

03 and η = 0 .

20 40 60 80 100 120 140 160 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremovedIstat value detected fraction Figure 14: Left panel: Filtered states for infected (thick red line) and removed (thick green line).Parameters of the beta observation error distribution (right panel) are a = 11 .

72 and b = 81 . q = 0 . u . . The magenta dottedline represents the 2 .

5% of the Italian population that have developed coronavirus antibodies in theIstat analysis from 25th May to 15th July. Time 0 is 1st March.indicates a point estimate of IFR of 0.68% (0.53%–0.82%) with high heterogeneity, and suggests thatin many populations the IFR would be >

1% if excess mortality was taken into account.For the ﬁrst wave, we computed a and b from (17) for a range of IFR values from 0.1%–6%,where the minimum value is the lowest we found in the relevant literature and has been suggestedas lower bound of IFR in Europe by CEBM. The maximum value is still inspired by CEBM andby the considerations in [28]. Indeed, in Italy an estimated initial CFR of about 11-19% has beenreported ([11], [35], [31], [9]). This suggested to consider as a possible maximum initial value anIFR=6%, accounting for the lack of knowledge at the beginning of the ﬁrst wave. In particular, weconsidered 0 . . . IF R = 5%, for which ¯ H = 2 . a = 11 .

72 and b = 81 .

39 and the corresponding beta densityis shown in the right panel of Figure 14. The initial condition for I t ( R t ), for each trajectory, is givenby the normalized number of infected (removed) people collected by Protezione Civile on 1st Marchdivided by the median of this beta distribution. The ﬁltered states of infected and removed individualsare represented with thick lines in the left panel of Figure 14 where also the prediction intervals forboth infected and removed are reported. The prediction intervals are computed from (12).The plots of ˆ β t and ˆ γ t from (A-5) are in the top panel of Figure 15 and the plot of ˆ R ( t ) from(A-6) is in the bottom panel. 19

20 40 60 80 100 120 140 160 time (days) time (days)

Figure 15: Single time interval case. Top panel: plots of ˆ β t and ˆ γ t from (A-5). Bottom panel: plot ofˆ R ( t ) from (A-6). Time 0 is 1st March.The gap between the thick and the thin red lines in Figure 14 shows that a single SIR is notable to correctly describe the behaviour of the true dynamics. Therefore, we split the ﬁrst wave intosubintervals. For the partition we consider the DPCMs with the greatest impact on social organizationallowing for 10 days for the DPCM to have an eﬀect on the epidemics (that is, change-points are theDPCM dates plus 10 days). In particular, we consider the following DPCM dates: 11th March, 22ndMarch, 26th April and 3rd June, so the change-points are on 21st March, 1st April, 3rd May and 13rdJune.For each time interval, except for the ﬁrst, we use the ﬁltered state ˆ x t (A-4) at the end of theprevious interval as initial state and the values of ˆ β t and ˆ γ t at the end of the previous interval as startingparameters. Then the discontinuity in the update is determined only by the initial covariance matrix.Since in the case of a single time interval we observed that the entries of Σ are updated to very smallvalues, then for the case of several intervals we do not restart each interval with Σ = diag (0 . , . at the end of the previous interval and the initial Σ , takingΣ = diag (0 . , . β t , ˆ γ t and ˆ R ( t ) show jumps at the change-points (Figure 17), not very pronounceddue to the choice of a small Σ . After a few steps from each jump, the trajectories stabilize following DPCM: Italian acronym for government decrees. For a summary of the DPCMs related to the COVID-19 emergencysee

20 40 60 80 100 120 140 160 time (days) i nd i v i dua l s infectedremoved Figure 16: Filtered states for infected (thick red line) and removed (thick green line) from ﬁve diﬀerentSIRs in the intervals [0 , , , , , a = 11 .

72 and b = 81 .

39. The prediction intervals are computed from (12) with q = 0 . u . . Time0 is 1st March. time (days) time (days) Figure 17: Multiple time intervals case. Top panel: plots of ˆ β t and ˆ γ t from (A-5). Bottom panel: plotof ˆ R ( t ) from (A-6). Time 0 is 1st March. 21

10 20 30 40 5000.0050.010.0150.020.0250.03 no r m a li s ed n . o f i nd i v i dua l s infectedremoved infectedremoved time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved time (days) infectedremoved Figure 18: Dynamics of infected and removed individuals with forecasts during an increasing phase(ﬁrst row) or a decreasing phase (second row). The forecasts start 10 days after a change point (ﬁrstcolumn) or 20 days after (second column). Starting dates are: 11th April (top left), 21st April (topright), 6th May (bottom left), 16th May (bottom right). Forecasts of infected and removed individualsare highlighted with diﬀerent colours.a regular trend. The dynamics of infected individuals ﬁts very well the observed infected divided by u . . After day 66 (corresponding to 26 April), ˆ β t is smaller than ˆ γ t , so ˆ R ( t ) <

1. This value is morerealistic than ˆ R ( t ) in the single time interval case, which is always greater than 1 (Figure 15).This result is in agreement with the eﬀective reproduction number published for the ﬁrst time byIstituto Superiore di Sanit`a (Italian National Istitute of Health, ISS) on 30 April ([23]): the eﬀectivereproduction numbers were reported for every Italian region (except for two because of bad qualitydata) and they were all smaller than one.Now, we analyze the forecast of the infected and removed dynamics for the ﬁrst wave, computedas in (10). We consider diﬀerent cases: we suppose to have observations up to 10 days or 20 days afterthe second or third change-points, and we try to forecast the dynamics for 7 future days (Figure 18).It can be observed from Figure 18 that the forecast is satisfactory when it starts 10 days aftera change-point (left panels), while when it starts 20 days after a change-point is satisfactory only inthe decreasing phase (bottom right panel). This diﬀerent performance is mainly due to how fast ˆ β t changes after each change-point, rather than to the number of observations taken before the forecastis started. 22 detected fraction Figure 19: Beta distributions of the observation error for diﬀerent values of IFR: 1 .

15% (light bluedashed line), 1 .

3% (red dotted line), 1 .

5% (green continuous line), 1 .

75% (magenta dashed-dottedline). For comparison also the beta density used for the ﬁrst wave is reported (blue continuous line).Now, we focus on the second wave, starting on 1st August and we select three change-points 10days after the following measures: on 1st September (partial opening of museums, stadiums, andthe increase in public transport occupancy); on 21st October (curfew in Lombardy, the most aﬀectedregion); on 3rd November (DPCM establishing red, orange and yellow scenarios to classify the Regionsfrom the highest to the lowest risk and introducing tiered restrictions). We run the RBPF algorithm,with µ = (0 . , . T as initial value for ( β , γ ) T and Σ = diag (0 . , . σ = 0 . η = 0 .

01. The state is formed by the infected individuals and by the new removed individualssince 1st August, that is, the diﬀerence between the collected removed at each time and the removedon 31st July.For the second wave the under-detection error of infected and removed people is smaller, becauseof an increase in resources for taking swab tests. For this reason it is appropriate to recalculate theparameters of the beta observation error distribution from (17) only with the data since 1st August.Unfortunately, we lack a benchmark such as the serological survey during the ﬁrst wave, and thereforewe present the results obtained by considering four diﬀerent

IF R values: 1 . . . . IF R because the corresponding observation error beta distributions are located on small values likefor the ﬁrst wave.It can be seen from Figure 19 that both the mean and the variance of the beta distribution increaseas the IFR increases. The corresponding dynamics are shown in Figure 20. The normalised numbersof individuals decrease when the IFR increases, because observations are divided by the median of the23

60 180 200 220 240 260 280 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved

160 180 200 220 240 260 280 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved

160 180 200 220 240 260 280 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved Figure 20: Filtered states for infected (thick red line) and removed (thick green line) in the secondwave for diﬀerent IFR: 1 .

15% (top left), 1 .

3% (top right), 1 .

5% (bottom left), 1 .

75% (bottom right).The prediction intervals are computed from (12) with q = 0 . u . . Forecast of infected and removed individuals arehighlighted with diﬀerent colors. Time 160 is 1st August.beta distribution which is increasing with the IFR. Also the width of the prediction intervals decreasesas IFR increases. The ﬁltered dynamics well reconstruct the behaviour of the adjusted data in allthe cases. The estimated parameters ˆ β t and ˆ γ t are very similar for the diﬀerent cases and only theparameters obtained using IFR=1 .

3% are reported in Figure 21 with the corresponding ˆ R ( t ).We observe a big jump in ˆ R ( t ) on the date of the second change point, 10 days after the curfew inLombardy, followed by a slow decrease, denoting that this measure did not produce the desired eﬀect.Then it was followed by the measure of 3rd November that allows ˆ R ( t ) to accelerate its decrease,approaching one, in agreement with what the ISS reported in its 25th November bulletin [22]. In this section we consider for comparison model (1) combined with the observation equations (8),that is, the state dynamics is completely deterministic. We repeat part of the analysis done withthe stochastic SIR model on the ﬁrst wave data, using the same observation error distribution, and24

60 180 200 220 240 260 280 time (days) time (days)

Figure 21: Top panel: plots of ˆ β t and ˆ γ t from (A-5). Bottom panel: plot of ˆ R ( t ) from (A-6). TheIFR is 1 . β, γ ) via maximum likelihood. In Figure 22 we show the single-phase deterministic SIRsimulation along with the adjusted observations, in two situations: when the number of infected isdescending after the peak and when the descent is almost complete. In both cases the deterministic SIRis unable to capture the dynamics. The single-phase stochastic SIR model (see Figure 14), althoughunsuitable, performs better thanks to the learning mechanism that makes adaptations both to theparameter and the ﬁltered state. The piecewise deterministic SIR model, see Figure 23, follows thescaled observations more closely, still it is rather less ﬂexible than its stochastic counterpart (see Figure16). This is a further demonstration that a single SIR is unsuitable to describe the true behaviour ofthe pandemic. In this work we have proposed a piecewise stochastic SIR model with change-points to the COVID-19data in Italy from 24th February to 26th November 2020, using the dates of measures taken by thegovernment to control the epidemic to deﬁne the change-points. The under-detection of the fractions ofinfected and removed in the population has also been explicitly modelled, introducing a distribution forthe observation error. This strategy makes it possible to estimate the actual dynamics of the epidemicby overcoming the limitations of a single deterministic SIR model on the one hand and by correctingthe observed data on the other hand. Then, via a particle ﬁltering and parameter learning algorithm,the model can produce short-term predictions of the population in each compartment and continously25

20 40 60 80 100 time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved Figure 22: First wave: deterministic SIR simulations with ML parameter estimates and adjusted ob-servations and seven-day forecast, until 15 May (left) and until 30 June (right). Estimated parametersare ( β, γ ) = (0 . , .

08) and ( β, γ ) = (0 . , . time (days) no r m a li s ed n . o f i nd i v i dua l s infectedremoved time (days) Figure 23: First wave: piecewise deterministic SIR simulations with ML parameter estimates andadjusted observations and seven-day forecast (left) until 15 May; parameter estimates ( ˆ β, ˆ γ ) in eachphase (right) are (0 . , . . , . . , . . × − , . Acknowledgments

The authors acknowledge the many fruitful discussions with several colleagues, and in particular thecolleagues at CNR-IMATI that participated in the COVID-19 modelling study group.

A RBPF algorithm

To estimate parameter θ = ( β , γ ) T and state X t we propose to apply a Rao-Blackwellized particleﬁlter (RBPF) algorithm. We consider the Euler discretization of the stochastic system (5) reportedin equation (7). Since the system is linear in θ , we can apply the Kalman ﬁlter. Suppose that θ = ( β , γ ) T has a normal prior distribution with mean µ and covariance matrix Σ , then thedistribution of θ , given x t +1 = ( x , x , ..., x t +1 ) after t + 1 time steps, as t = 0 , , , . . . , is normal27ith mean µ t +1 and covariance matrix Σ t +1 given by  µ t +1 = µ t + S Tt +1 [ x t +1 − x t − h ( x t ) µ t ∆ t ]Σ t +1 = Σ t − S Tt +1 h ( x t ) Σ t ∆ tS Tt +1 = Σ t h T ( x t ) ∆ t (cid:2) h ( x t ) Σ t h T ( x t ) (∆ t ) + g ( x t ) g T ( x t )∆ t (cid:3) − (A-1)The distribution of X t +1 given x t is Gaussian with mean B t +1 and covariance matrix G t +1 given by  B t +1 = x t + h ( x t ) µ t ∆ tG t +1 = h ( x t ) Σ t h T ( x t ) (∆ t ) + g ( x t ) g T ( x t )∆ t. (A-2)Recalling that the observations are obtained multiplying the state X t for the beta-distributedobservation error term, as deﬁned in equation (8), the RBPF algorithm can be summarized as follows:STEP 1 • At time t = 0, we draw M initial values of X from its prior distribution π ( x ) and obtain M values x ( i )0 , i = 1 , , ..., M or, alternatively, we put x equal to the initial observation. • We consider a prior distribution for the parameter θ , given by a normal distribution N ( µ , Σ ),where µ is a vector of initial parameters, and Σ is a diagonal covariance matrix. • To obtain candidate values of the state at importance sampling steps, we will use the distributionimplied by the state-transition equation (7) after marginalising it with respect to θ . At stepone, a value for ˜ X ( i )1 , conditional on x ( i )0 , is sampled from a normal distribution with mean B ( i )1 and covariance matrix G ( i )1 , for i = 1 , . . . , M , given by (A-2) with k = 0. • Denoting by y the observation at time k = 1, we compute weights for each particle from thelikelihood at ˜ x ( i )1 ˜ v ( i )1 = L (˜ x ( i )1 ; y ) = p ( y , | ˜ x ( i )1 , ) × p ( y , | ˜ x ( i )1 , )where p ( y | x ) = (cid:0) yx (cid:1) a − (cid:0) − yx (cid:1) b − B ( a, b ) 1 x I [0 ,x ] ( y ) . (A-3)In order to resample the particles, we need to normalize the weights: v ( i )1 = ˜ v ( i )1 P Mi =1 ˜ v ( i )1 . • We update the posterior distribution of θ given n ˜ x ( i )1 , x ( i )0 o by taking one step of the Kalmanﬁlter of equation (A-1), obtaining the new mean vector ˜ µ ( i )1 and covariance matrix ˜Σ ( i )1 .28 We resample M particles from a discrete distribution with support n(cid:16) ˜ x ( i )1 , ˜ µ ( i )1 , ˜Σ ( i )1 (cid:17)o i =1 ,...,M and corresponding probabilities n v ( i )1 o i =1 ,...,M . We denote by n(cid:16) x ( i )1 , µ ( i )1 , Σ ( i )1 (cid:17)o i =1 ,...,M theresampled particles.At time t ≥

1, assume the sample n(cid:16) x ( i ) t , µ ( i ) t , Σ ( i ) t (cid:17)o i =1 ,...,M is available.STEP t + 1 • For i = 1 , . . . , M , sample candidate particles ˜ x ( i ) t +1 from a normal distribution with mean B ( i ) t +1 and covariance matrix G ( i ) t +1 , given by (A-2). • Compute the weights ˜ v ( i ) t +1 for each particle as the product of two distributions with density(A-3). Normalize the weights: v ( i ) t +1 = ˜ v ( i ) t +1 P Mi =1 ˜ v ( i ) t +1 . • Update the posterior distribution of θ given x ( i )0: t +1 , which is a normal distribution with mean˜ µ ( i ) t +1 and covariance matrix ˜Σ ( i ) t +1 given by equation (A-1). • Resample M particles using the probabilities n v ( i ) t +1 o i =1 ,...,M and denote the resampled particlesby n(cid:16) x ( i ) t +1 , µ ( i ) t +1 , Σ ( i ) t +1 (cid:17)o i =1 ,...,M .Particles n(cid:16) x ( i ) t , µ ( i ) t , Σ ( i ) t (cid:17)o i =1 ,...,M are a sample from the distribution of interest. In detail, the x ( i ) t ’s are a sample from p ( x t | y t ) and, by keeping track of the resampling history, the entire sample x ( i )0: t , i = 1 , . . . , M is potentially available, hence a sample from p ( x t | y t ). The mean of x ( i ) t over theparticles approximates E ( x t | y t ) and we call it the ﬁltered state:ˆ x t = 1 M M X i =1 x ( i ) t . (A-4)The µ ( i ) t ’s and Σ ( i ) t ’s are a sample of conditional means and covariance matrices, that is, E ( θ | x ( i )0: t )and Cov ( θ | x ( i )0: t ). Therefore, an estimate of E ( θ | y t ) is( ˆ β t , ˆ γ t ) T = ˆ θ t = 1 M M X i =1 µ ( i ) t (A-5)and, by sampling M values from M Gaussian distributions N ( µ ( i ) t , Σ ( i ) t ), i = 1 , . . . , M , we produce asample ( θ (1) t , . . . , θ ( M ) t ) from p ( θ | y t ).The basic reproduction number is deﬁned as R = β /γ , therefore an estimate based on y t is E ( β /γ | y t ), which is computed as ˆ R ( t ) = 1 M M X i =1 β ( i ) t γ ( i ) t (A-6)29here ( β ( i ) t , γ ( i ) t ) T = θ ( i ) t . If the variances on the diagonal of Σ ( i ) t are small, the additional sampling fromthe N ( µ ( i ) t , Σ ( i ) t ) may be unnecessary and the following approximation might be used, correspondingto degenerate conditional distributions: ˆ R ( t ) = 1 M M X i =1 µ ( i )1 ,t µ ( i )2 ,t . (A-7)ˆ R ( t ) can be regarded as our best estimate of the basic reproduction number in the light of the observeddata, not to be confused with the net or eﬀective reproduction number. A very useful discussion onthe meaning of R is provided by [13]. References [1] A. Arenas, W. Cota, J. G´omez-Garde˜nes, S. G´omez, C. Granell, J. T. Matamalas, D. Soriano-Pa˜nos, and B. Steinegger. Modeling the Spatiotemporal Epidemic Spreading of COVID-19 andthe Impact of Mobility and Social Distancing Interventions.

Physical Review X , 10:041055, 2020.[2] Y. Bai, A. Saﬁkhani, and G. Michailidis. Non-stationary Spatio-Temporal Modeling of COVID-19Progression in The US. medRxiv , 2020.[3] A. L. Bertozzi, E. Franco, G. Mohler, M. B. Short, and D. Sledge. The challenges of modelingand forecasting the spread of COVID-19.

Proceedings of the National Academy of Sciences ,117(29):16732–16738, 2020.[4] A. H. Bilge, F. Samanlioglu, and O. Ergonul. On the uniqueness of epidemic models ﬁtting anormalized curve of removed individuals.

Journal of Mathematical Biology , 71(4):767–794, 2015.[5] E. Bontempi. The Europe second wave of COVID-19 infection and the Italy “strange” situation.

Environmental Research , page 110476, 2020.[6] N. F. Brazeau, R. Verity, S. Jenks, H. Fu, C. Whittaker, P. Winskill, I. Dorigatti, P. Walker,S. Riley, R. P. Schnekenberg, et al. Report 34: COVID-19 infection fatality ratio: estimates fromseroprevalence. 2020.[7] G. C. Calaﬁore, C. Novara, and C. Possieri. A Modiﬁed SIR Model for the COVID-19 Contagionin Italy. In , pages 3889–3894, 2020.[8] T. Carletti, D. Fanelli, and F. Piazza. COVID-19: The unreasonable eﬀectiveness of simplemodels.

Chaos, Solitons & Fractals: X , 5:100034, 2020.309] J. Cavataio and S. Schnell. Interpreting SARS-CoV-2 seroprevalence, deaths, and fatality rate —Making a case for standardized reporting to improve communication.

Mathematical Biosciences ,333:108545, 2021.[10] I. Cooper, A. Mondal, and C. G. Antonopoulos. A SIR model assumption for the spread ofCOVID-19 in diﬀerent communities.

Chaos, Solitons & Fractals , 139:110057, 2020.[11] G. De Natale, V. Ricciardi, G. De Luca, D. De Natale, G. Di Meglio, A. Ferragamo, V. Marchitelli,A. Piccolo, A. Scala, R. Somma, et al. The COVID-19 Infection in Italy: a Statistical Study ofan Abnormally Severe Disease.

Journal of Clinical Medicine , 9(5):1564, 2020.[12] J. Dehning, J. Zierenberg, F. P. Spitzner, M. Wibral, J. P. Neto, M. Wilczek, and V. Priesemann.Inferring change points in the spread of COVID-19 reveals the eﬀectiveness of interventions.

Science , 369(6500), 2020.[13] P. L. Delamater, E. J. Street, T. F. Leslie, Y. T. Yang, and K. H. Jacobsen. Complexity of theBasic Reproduction Number (R0).

Emerging Infectious Diseases , 25(1):1, 2019.[14] A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesianﬁltering.

Statistics and computing , 10(3):197–208, 2000.[15] E. Duysburgh, L. Mortgat, C. Barbezange, K. Dierick, N. Fischer, L. Heyndrickx, V. Hutse,I. Thomas, S. Van Gucht, B. Vuylsteke, et al. Persistence of IgG response to SARS-CoV-2.

TheLancet Infectious Diseases , 21(2):163–164, 2021.[16] M. Fabiani, G. Onder, S. Boros, M. Spuri, G. Minelli, A. Mateo Urdiales, X. Andrianou, F. Ric-cardo, M. Del Manso, D. Petrone, L. Palmieri, M. F. Vescio, A. Bella, and P. Pezzotti. Il case fatal-ity rate dell’infezione SARS-CoV-2 a livello regionale e attraverso le diﬀerenti fasi dell’epidemiain Italia. Technical Report 1/2021, Istituto Superiore di Sanit`a, 2020. [Online; accessed 10-February-2021; in Italian].[17] T. Ganyani, C. Faes, and N. Hens. Simulation and Analysis Methods for Stochastic Compart-mental Epidemic Models.

Annual Review of Statistics and Its Applications , 8:19.1–19.20, 2021.[18] A. C. Ghani, C. A. Donnelly, D. R. Cox, J. T. Griﬃn, C. Fraser, T. H. Lam, L. M. Ho, W.-S.Chan, R. M. Anderson, A. J. Hedley, et al. Methods for estimating the case fatality ratio for anovel, emerging infectious disease.

American Journal of Epidemiology , 162(5):479–486, 2005.[19] G. Giordano, F. Blanchini, R. Bruno, P. Colaneri, A. Di Filippo, A. Di Matteo, and M. Colaneri.Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy.

Nature Medicine , 26(6):855–860, 2020. 3120] H. G. Hong and Y. Li. Estimation of time-varying reproduction numbers underlying epidemio-logical processes: A new statistical tool for the COVID-19 pandemic.

PloS ONE , 15(7):e0236464,2020.[21] Istituto Nazionale di Statistica. Primi risultati dell’indagine di sieroprevalenza sul SARS-CoV-2.Technical report, 2020. [Online; accessed 01-February-2021; in Italian].[22] Istituto Superiore di Sanit`a. Epidemia COVID-19. Aggiornamento nazionale25 novem-bre 2020 – ore 16:00. , 2020. [Online;accessed 10-February-2021; in Italian].[23] Istituto Superiore di Sanit`a. Epidemia COVID-19. Aggiornamento nazionale28aprile 2020 – ore 16:00. , 2020. [Online;accessed 01-February-2021; in Italian].[24] W. O. Kermack and A. G. McKendrick. A contribution to the mathematical theory of epidemics.

Proceedings of the Royal Society of London. Series A , 115(772):700–721, 1927.[25] K. Linka, M. Peirlinck, and E. Kuhl. The reproduction number of COVID-19 and its correlationwith public health interventions.

Computational Mechanics , 66(4):1035–1050, 2020.[26] S. Mallapaty. How deadly is the coronavirus? Scientists are close to an answer.

Nature , pages467–468, 2020.[27] T. T. Marinov and R. S. Marinova. Dynamics of COVID-19 using inverse problem for coeﬃcientidentiﬁcation in SIR epidemic models.

Chaos, Solitons & Fractals: X , 5:100041, 2020.[28] G. Meyerowitz-Katz and L. Merone. A systematic review and meta-analysis of published researchdata on COVID-19 infection-fatality rates.

International Journal of Infectious Diseases , 2020.[29] J. Oke and C. Heneghan. Global Covid-19 Case Fatality Rates. , 2020. [Online; accessed 01-February-2021].[30] C. Piazzola, L. Tamellini, and R. Tempone. A note on tools for prediction under uncertaintyand identiﬁability of SIR-like dynamical systems for epidemiology.

Mathematical Biosciences ,332:108514, 2021. 3231] F. Riccardo, M. Ajelli, X. D. Andrianou, A. Bella, M. Del Manso, M. Fabiani, S. Bellino, S. Boros,A. M. Urdiales, V. Marziano, et al. Epidemiological characteristics of COVID-19 cases andestimates of the reproductive numbers 1 month into the epidemic, Italy, 28 January to 31 March2020.

Euro Surveillance , 25(49):2000790, 2020.[32] T. W. Russell, J. Hellewell, C. I. Jarvis, K. Van Zandvoort, S. Abbott, R. Ratnayake, S. Flasche,R. M. Eggo, W. J. Edmunds, A. J. Kucharski, et al. Estimating the infection and case fatality ratiofor coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the DiamondPrincess cruise ship, February 2020.

Euro Surveillance , 25(12):2000256, 2020.[33] T. Stocks, T. Britton, and M. H¨ohle. Model selection and parameter estimation for dynamic epi-demic models via iterated ﬁltering: application to rotavirus in Germany.

Biostatistics , 21(3):400–416, 2021.[34] J. Tolles and T. Luong. Modeling Epidemics With Compartmental Models.

JAMA , 323(24):2515–2516, 2020.[35] D. Tosi, A. Verde, and M. Verde. Clariﬁcation of Misleading Perceptions of COVID-19 Fatalityand Testing Rates in Italy: Data Analysis.

Journal of Medical Internet Research , 22(6):e19825,2020.[36] N. Wang, Y. Fu, H. Zhang, and H. Shi. An evaluation of mathematical models for the outbreakof COVID-19.

Precision Clinical Medicine , 3(2):85–93, 2020.[37] Z. Wang, X. Zhang, G. Teichert, M. Carrasco-Teja, and K. Garikipati. Correction to: Systeminference for the spatio-temporal evolution of infectious diseases: Michigan in the time of COVID-19.

Computational Mechanics , page 1, 2020.[38] Z. Wang, X. Zhang, G. Teichert, M. Carrasco-Teja, and K. Garikipati. System inference for thespatio-temporal evolution of infectious diseases: Michigan in the time of COVID-19.

Computa-tional Mechanics , 66(5):1153–1176, 2020.[39] J. Wangping, H. Ke, S. Yang, C. Wenzhe, W. Shengshu, Y. Shanshan, W. Jianwei, K. Fuyin,T. Penggang, L. Jing, et al. Extended SIR Prediction of the Epidemics Trend of COVID-19 inItaly and Compared With Hunan, China.

Frontiers in Medicine , 7:169, 2020.[40] World Health Organization. Estimating mortality from COVID-19: scientiﬁc brief, 4 August2020. Technical documents, 2020. 3341] L. Xue, S. Jing, J. C. Miller, W. Sun, H. Li, J. G. Estrada-Franco, J. M. Hyman, and H. Zhu. Adata-driven network model for the emerging COVID-19 epidemics in Wuhan, Toronto and Italy.