[PDF] Analysis of Virus Propagation: A Transition Model Representation of Stochastic Epidemiological Models

Abstract

The growing literature on the propagation of COVID-19 relies on various dynamic SIR-type models (Susceptible-Infected-Recovered) which yield model-dependent results. For transparency and ease of comparing the results, we introduce a common representation of the SIR-type stochastic epidemiological models. This representation is a discrete time transition model, which allows us to classify the epidemiological models with respect to the number of states (compartments) and their interpretation. Additionally, the transition model eliminates several limitations of the deterministic continuous time epidemiological models which are pointed out in the paper. We also show that all SIR-type models have a nonlinear (pseudo) state space representation and are easily estimable from an extended Kalman filter.

Full PDF

AAnalysis of Virus Propagation: A Transition ModelRepresentation of Stochastic Epidemiological Models

C. Gourieroux ∗ , J. Jasiak ‡ ﬁrst version: March 31, 2020revised: June 19, 2020 The growing literature on the propagation of COVID-19 relies on vari-ous dynamic SIR-type models (Susceptible-Infected-Recovered) whichyield model-dependent results. For transparency and ease of comparingthe results, we introduce a common representation of the SIR-typestochastic epidemiological models. This representation is a discretetime transition model, which allows us to classify the epidemiologicalmodels with respect to the number of states (compartments) and theirinterpretation. Additionally, the transition model eliminates severallimitations of the deterministic continuous time epidemiological modelswhich are pointed out in the paper. We also show that all SIR-typemodels have a nonlinear (pseudo) state space representation and areeasily estimable from an extended Kalman ﬁlter.

Keywords:

Covid-19, Epidemiological Model, Compartment, SIRModel, Transition Model, State-Space Representation. ∗ University of Toronto, Toulouse School of Economics and CREST, e-mail : [email protected] ‡ York University, Canada, e-mail : [email protected] The authors gratefully acknowledge ﬁnancial support of the Chair ACPR: Regulation et Systemic Risks,the ANR (Agence Nationale de la Recherche) and the Natural Sciences and Engineering Council ofCanada (NSERC). We thank A. Monfort for helpful comments. a r X i v : . [ q - b i o . P E ] J un HIS VERSION: June 19, 2020 For almost 100 years following the introductory article of Kermack, McKendrick (1927),the SIR (Susceptible-Infected-Recovered) model has remained the primary tool of analysisfor epidemiological studies and has inspired a considerable number of extensions [seee.g. Hethcote (2000), Brauer,Castillo-Chavez (2001),Vynnicky et al. (2010), for generalreview]. In recent applications to the Coronavirus propagation however, these SIR-typemodels tend to produce results and forecasts that lack robustness and show variationacross models.This paper introduces a common representation of SIR-type stochastic epidemiolog-ical models to facilitate the comparison between models and their outcomes. This rep-resentation is a discrete time transition model, which is used to deﬁne a typology ofepidemiological models with respect to the number of states (compartments) and theirinterpretation.The discrete time transition model is characterized by a transition matrix, whichdetermines the probabilities of transitions between the states distinguished in an epi-demiological model. As such, it can easily accommodate individual and aggregate countdata sampled at various frequencies. On the contrary, a discretization bias arises whena continuous time deterministic diﬀerential system is adapted to data sampled at ﬁxedintervals. In particular, the discretization bias aﬀects the estimated collective immunityratio, which causes its reliability and even its existence to be questionable. Moreover, atime discretized SIR-type model is shown to provide diﬀerent results in an application todata sampled at diﬀerent frequencies, such as the daily or weekly frequencies, due to itsinconsistency with respect to the time unit. The same limitation concerns the reproduc-tion number, which is a commonly used epidemiological parameter. We also commenton the constrained speciﬁcations of some SIR-type models, with the potentially complexdynamics of infection probabilities being determined from very few parameters. The tran-sition model allows us to avoid these and other limitations revealed in the commonly usedepidemiological models and discussed in the paper.When aggregate counts are observed, all SIR-type models are shown to have a (Gaus-sian) (pseudo) nonlinear state space representation, which is convenient for statisticalinference. We show that a quasi-maximum likelihood (QML) estimation method can be

HIS VERSION: June 19, 2020

This section introduces a general speciﬁcation that encompasses the main epidemiologicalSIR-type models. It is a discrete time stochastic transition model that allows for mod-elling of individual histories during an epidemic without the limitations revealed in thedeterministic epidemiological models. In particular, we point that the time discretizedversion of a continuous time SIR-type model depends on the time unit and needs to bere-adjusted for the sampling frequency of the data. Moreover, the reproduction numbercomputed from the time discretized model depends on the time unit as well and takes adiﬀerent value when computed from daily or weekly data. We also show a diﬃculty withinference based on a continuous time model of frequencies.

We consider a large panel of individual histories Y i,t , i = 1 , .., N, t = 0 , .., T , where thevariable Y is qualitative multinomial with J alternatives denoted by j = 1 , ..., J . Thesealternatives are the states of infection, recovery or death, depending on the model spec-iﬁcation (or compartments in the epidemiological terminology). The discrete time t is HIS VERSION: June 19, 2020

Assumption A1:

The individual histories are such that:i) The variables Y i,t , i = 1 , .., N at time t ﬁxed have the same marginal (i.e. cross-sectional) distributions. This common distribution depends on time t and is discrete. Itis determined by the vector p ( t ) of size J with components: p j ( t ) = P ( Y i,t = j ) , j = 1 , .., J. These components are non-negative and sum up to 1 .ii) The processes ( Y i,t , t = 1 , ., T ) , i = 1 , .., N are independent, heterogeneous Markovprocesses of order 1, with common transition probabilities. The transitions from date t − t are characterized by the J × J transition matrix Π[ p ( t − p ( t ) represents the cross-sectional (marginal) probabilities of states. Inpractice, the cross-sectional probability p ( t ) is close to the cross-sectional frequency f ( t ),computed from the values of Y i,t . Then, transition matrix Π[ p ( t − f ( t − p ( t −

1) to remainindependent of the population size.

Assumption A2: i) The epidemic starts at time 0. ii) At time 0 all individuals are instate j = 1, which is interpreted as the susceptible compartment.Under Assumptions A1 and A2, the individual histories Y i,t are independent and iden-tically distributed. Therefore, the individuals are exchangeable, i.e. have similar riskfactors (homogenous population).The initial condition also implies nonstationary evolutions of the processes of individ-ual histories over time. This nonstationarity and the time dependence of the transitionmatrix through p ( t −

1) only are the distinct characteristics of a SIR-type model. Thereis one exception, however. When the SIR model is a homogenous Markov process, thetransition matrix Π is independent of p ( t − HIS VERSION: June 19, 2020 p ( t −

1) in the presence of time dependence. The models diﬀerwith respect to the form of those functions and of the components of p ( t − p ( t ). Those bounds are mainly determined by the maximum (resp. mini-mum) of moduli of all eigenvalues of Π[ p ( t − t . Assumptions A1 and A2 deﬁning the stochastic dynamics of Y i,t lead to a deterministicnonlinear recursive model for the dynamics of marginal probabilities p ( t ). This determin-istic representation is obtained by applying the Bayes formula and is given by: p ( t ) = Π [ p ( t − (cid:48) p ( t − , t = 1 , .., T, (2.1)with initial condition: p (0) = (1 , , .., (cid:48) . As shown in Section 4, system (2.1) can be usedas a system of estimating equations for statistical inference.System (2.1) can be rewritten to deﬁne the dynamics of changes in marginal proba-bilities: ∆ p ( t ) = p ( t ) − p ( t −

1) = { Π [ p ( t − − Id } p ( t − , (2.2)where Id denotes the identity matrix. The equation (2.2) highlights the role of thegenerator: Π [ p ( t − − Id , in determining the changes in marginal probabilities ∆ p ( t ).Remark 1: Discretization bias.A major part of literature concerns epidemiological models written as deterministicdiﬀerential systems in continuous time. A continuous time analogue of the deterministicmodel (2.2) is: dp ( t ) /dt = { Π[ p ( t )] − Id } p ( t ) . (2.3) HIS VERSION: June 19, 2020 p ( t ) deﬁned from (2.2) and (2.3), especially over the medium run.To highlight the diﬀerences between the discrete and continuous time modelling, letus consider the 2-state SIS model with a linear force of infection, i.e. a linear function π (see Remark 1). The probability of being infected p ( t ) satisﬁes the following recursiveequation : p ( t ) = [ βp ( t − − p ( t − − γ ) p ( t − , in discrete time and dp ( t ) /dt = ( β − γ ) p ( t ) − γp ( t ) , in continuous time. Even though both equations look similar and contain the same pa-rameter symbols, the following diﬀerences can be pointed out:i) The discrete time version of SIS is not time consistent, as p ( t ) at lag 2 derived byrecursive substitution is a quartic function of p ( t − β and γ in both the discrete and continuous time SIS models givenabove depend on the selected time unit too. In the continuous time model however, β and γ are multiplied by the same factor when the time unit is changed. Then, the so-called reproduction number R = β/γ is invariant with respect to the time unit. As thediscrete time model is not time consistent, R computed directly from the time discretizedmodel is not invariant with respect to the time unit either. Hence, diﬀerent values of R are obtained from daily and weekly data. Any result obtained from the time discretizeddiﬀerential equation (called the Euler discretization) and interpreted in the continuoustime framework needs to be interpreted with caution. This ﬁnding calls into question Coeﬃcients β, γ are assumed constrained to ensure that p ( t ) takes values between 0 and 1, for any p ( t −

1) in [0,1].

HIS VERSION: June 19, 2020 R and some of its transforms, such as theasymptotic value p ( ∞ ), which are important components of epidemiological studies.Remark 2: Non-diﬀerentiability of a continuous time frequency modelIt is common to write the epidemiological model as a diﬀerential system of frequencies f ( t ), instead of marginal probabilities p ( t ), as: df ( t ) /dt = [Π[ f ( t )] − Id ] f ( t ) . (2.4)This diﬀerential system is not compatible with the set of admissible values of vectors f ( t ) .The components of f ( t ) are not continuously valued functions, as they take on values equalto the multiples of 1 /N , which implies the non-diﬀerentiability of function f ( t ). Therefore,an epidemiological model of this type cannot provide accurate results. Moreover, as themodel is deterministic, it cannot take into account the ex-ante uncertainty about thevectors f ( t ), which are random. Let us now examine the dynamics of marginal probabilities of states p ( t ). The analysis oftheir evolution during an epidemic can be simpliﬁed if we focus on either the beginning,or the end of the epidemic and consider local expansions. At time t = 0, the initial value is: p (0) = (1 , , .., (cid:48) . Below, we consider expansionsof orders 1 and 2 of the recursive system (2.1) in a neighbourhood of p (0). i) First-Order Expansion :The ﬁrst-order expansion is: p ( t ) = Π[ p (0)] (cid:48) p ( t − , (2.5)which corresponds to a homogeneous Markov model with transition matrix Π[ p (0)]. Itcan be solved analytically as: It is also the case when a stochastic feature is introduced by replacing the deterministic diﬀerentialequation by a stochastic one, such as a multivariate Jacobi process to account for the positivity and unitmass restrictions on the components of f ( t ) [see, Admani et al. (2018), Jiang et al. (2011), El Kouﬁ etal. (2019) for examples of stochastic diﬀerential epidemic models and Gourieroux,Jasiak (2006) for themultivariate Jacobi process]. HIS VERSION: June 19, 2020 p ( t ) = Π [ p (0)] (cid:48) t p (0) , (2.6)We ﬁnd that locally the components of marginal probabilities p ( t ) are combinations ofexponential functions (and also of sine functions, cosine functions, which can possibly bemultiplied by polynomials, if some eigenvalues of Π [ p (0)] are complex and/or multiple).Their dynamics are constrained by the speciﬁc form of matrix Π[ p (0)], which is a transitionmatrix. More speciﬁcally, all components of p ( t ) have to take values between 0 and 1.In order to satisfy this restriction, locally, the marginal probability p ( t ) is exponentiallydecreasing over time, whereas marginal probabilities of other states p j ( t ) , j = 2 , .., J areexponentially increasing over time. ii) Second-Order Expansion The second order expansion leads to the dynamic system: p ( t ) = { Π [ p (0)] + J (cid:88) j =1 d Π [ p (0)] /dp j p j ( t − } (cid:48) p ( t − , (2.7)which is a Riccati quadratic recursive system. In general, the epidemiological models include some absorbing states, such as thestates of deceased, or recovered (see the examples in Section 3 and Appendix 1). In thiscase, the sequence of marginal probabilities p ( t ) has a limit when t → ∞ : p ( ∞ ), say. Ifthere is only one absorbing state J , say, we get p ( ∞ ) = (0 , , .., (cid:48) . Then, the ﬁrst- andsecond-order expansions can be performed in a neighbourhood of p ( ∞ ), yielding dynamicapproximations systems analogous to systems (2.5) and (2.7).The early phases of the epidemic, which are characterized by expansions given inSection 2.3.1. are illustrated in Section 4. Expansions derived in Section 2.3.2 can beused for other research objectives, such as determining the level of collective immunity. Let us now study the dynamic properties of two commonly used epidemiological models,which are the SI and SIR models (see Appendix 1) in the framework of a discrete time

HIS VERSION: June 19, 2020

The transition model representation of the deterministic SI model involves the following2 × − π ( p ); π ( p ).row 2, I: 0, 1.The state I of infected, still infectious and immunized is the absorbing state. Function π is the contagion function (called the force of infection ) that satisﬁes the followingassumption: Assumption A3: π is a non-decreasing function of p , which takes values between 0 and1. The value π (0) can be interpreted as an exogenous component of the contagion. In anopen economy, it can be due to the eﬀect of tourism, international trade and migration.In a closed economy, such as the world in its entirety, it can be set equal to zero. Thereis a strict (endogeneous) contagion eﬀect if function π is strictly increasing. Example 1:

A common speciﬁcation of the force of infection π is the linear function: π ( p ) = b p , where parameter b takes values between 0 and 1, or a logistic function of p : π ( p ) = exp( a + bp ) / [1 + exp ( a + bp )] , where coeﬃcient b is non-negative. In the logistic force of infection the exogenous infectionrate is measured by: exp a/ [1 + exp a ] and the strict endogenous contagion eﬀect byparameter b . Other functional forms have also been considered in the growth literatureand obtained, for instance, by replacing p by a power of p in the expressions givenabove [see e.g. Richards(1959), Kuhi et al. (2003), Table 1, Brandenburg (2019), Wu etal. (2020), Harvey, Kattuman (2020)].The form of the transition matrix given above leads to the following nonlinear recursiveequation of order 1 for the marginal probability of being infected: HIS VERSION: June 19, 2020 p ( t ) = π ( p ( t − − p ( t − p ( t − . (3.1) Proposition 1:

Under Assumption A3, p ( t ) is a non-decreasing function of time withexponential lower and upper bounds:1 − [1 − π (0)] t ≤ p ( t ) ≤ − [1 − π (1)] t .It tends to 1 when t → ∞ .Proof:i) It is non-decreasing, as p ( t ) − p ( t −

1) is non-negative.ii) The bounds are obtained by observing that p ( t ) is an increasing function of π .iii) Since p ( t ) is non-decreasing and bounded by 1, it converges to a value p ( ∞ ). Thislimit is equal to 1, by considering (3.1) at t = ∞ .QEDIn particular, if there is no contagion eﬀect i.e. if π ( p ) is constant and equal to π ,then the marginal probability of being infected is: p ( t ) = 1 − [1 − π ] t . It is interesting to consider the expansions of the dynamics of the probability of beinginfected in the SI model at the beginning of the contagion,i.e. when p is close to zero, orat the end of the contagion, i.e. when p is close to 1.i) Beginning of the contagionA second-order expansion leads to: p ( t ) − p ( t − ∝ [ π (0) + dπ (0) /dp p ( t − − p ( t − . (3.2)ii) End of the contagionThe second-order expansion in a neighbourhood of p = 1 yields: p ( t ) − π (1) + dπ (1) /dp [ p ( t − − − p ( t − p ( t − − . (3.3)Both approximations lead to discrete time logistic recursive equations for the probabil-ity of being infected p ( t ) and the probability of not being infected 1 − p ( t ), respectively. HIS VERSION: June 19, 2020 The continuous time analogue of the recursive equation (3.2) is: dp ( t ) /dt = ( α + βp ( t ))(1 − p ( t )) , (3.4)where α = π (0) , β = dπ (0) /dp are both nonnegative.Equation (3.4) can be solved analytically. Proposition 2: i) Assuming that the beginning of the epidemics is at time t = 0, thesolution of equation (3.4) is: p ( t ) = [ α exp[( α + β ) t ] − α ] / [ α exp[( α + β ) t ] + β ] . ii) If β > α , the solution is such that the derivative dp ( t ) /dt attains the maximumwhen p ( t ) = ( β − α ) / (2 β ). The time-to- inﬂection is reached at t ∗ = log( β/α ) / ( α + β ).Proof: see Appendix 2.It follows that the probability of being infected p ( t ) is a logistic function of time.Moreover, if the strict contagion eﬀect is large as compared to the exogenous componentof the contagion, there is a peak in the changes of ratios of infected individuals over time.The size and timing of the peak depend on the propagation parameters. However, theSI model has only two parameters, which is insuﬃcient to independently determine thepeak, the time to peak and other characteristics such as the ﬂatness of the curve at thepeak (the so-called ”plateau” eﬀect) as well as the asymmetry of the curve with respectto the peak.These above outcomes of the SI model (3.1) are based on a local expansion of the initialnonlinear recursive equation and are therefore valid at the beginning of an epidemic only.The length of the time episode over which such an expansion is valid depends on function π , and also on the values of parameters a, b in the parametric SI model in Example 1. Let us consider below the parametric SI model in Example 1 and illustrate graphicallyits dynamics. At time 0, we ﬁx the probability of being infected p (0) = 0 and set theparameters α, β , where α > , β > α = 0 . , β = 0 . HIS VERSION: June 19, 2020 p ( t ) which satisﬁes the continuoustime SI model (3.4), i.e. with the logistic evolution given in Proposition 2, and its discretetime Euler approximation at the beginning of an epidemic given in equation(3.2). It iscomputed with the same values of parameters α, β given above. Figure 1 shows that thediscrete time approximation (3.2) can be very misleading when it is used for forecastingover a medium or long run.[Insert Figure 1: Evolutions of p ( t ), SI Model]When β < α , we get an increasing concave curve that tends to 1. When β > α , as inFigure 1, we get an exponential convex increase for small t , followed by an increasingconcave pattern of convergence to 1.The evolutions of changes of p ( t ) are shown in Figure 2:[Insert Figure 2: Evolutions of Changes in p(t), SI Model]When β > α , we get a hump-shaped pattern with the curve decreasing at a slower rateafter the peak than increasing before the peak.We complete the sensitivity analysis of the main features of the SI model by examiningthe size of peak (Figure 3) and the time-to-inﬂection (Figure 4).[Insert Figure 3: Size of Peak, SI Model][Insert Figure 4: Time to Inﬂection, SI Model] Let us now consider the SIR model (see Appendix 1) with three states: S for Suscepti-ble; I for Infected, infectious, not immunized; R for Recovered, immunized and no longerinfectious. Its transition model representation involves the 3 × − π ( p ); π ( p ) ; 0row 2, I : 0, p ; p row3,R : 0; 0; 1where p is strictly positive, and R is the absorbing state. HIS VERSION: June 19, 2020 p = 0, the 2 × p ( t ) = π [ p ( t − − p ( t − − p ( t − p p ( t − , (3.5) p ( t ) = p p ( t −

1) + p ( t − . From the second equation of system (3.5),we get: p ( t −

1) = [ p ( t ) − p ( t − /p , (3.6)and by substituting into the ﬁrst equation, we derive the recursive equation satisﬁed by p ( t ): p ( t ) = p ( t −

1) + π [( p ( t − − p ( t − /p ][ p − p ( t −

1) + (1 − p ) p ( t − p [ p ( t − − p ( t − . (3.7) Proposition 3 i) The sequence p ( t ) [resp. p ( t )] is decreasing [resp. increasing].ii) The sequence p ( t ) satisﬁes a nonlinear recursive equation of order 2.iii) The sequence p ( t ) is a linear moving average of order 1 in p ( t + 1).The higher order of temporal dependence in p ( t ) is due to the interpretation of state I asa transitory state between S and R. Thus, the dynamics of p ( t ) has to account for boththe entries into and exits from the state I.Let us now discuss the behaviour of marginal probabilities p(t) when t tends to inﬁnity.Since p ( t ) is increasing, and it is upper bounded by 1, its limit is p ( ∞ ), say. Fromequation (3.6), it follows that the limit of p ( t ) is zero. Then, by taking into account theﬁrst equation of (3.5), we get: HIS VERSION: June 19, 2020 Lemma 1:

When t → ∞ ,i) p ( t ) → π (0) is diﬀerent from 0, p ( t ) → π (0) = 0 , p ( t ) might tend to a limiting value p ( ∞ ) < p ( ∞ ), which is strictly lessthan 1 and computing this limiting value, which is interpreted as the level of collectiveimmunity are common topics in the epidemiological literature. Due to the time discreti-sation bias (see Remark 1), estimation errors on a long run parameter p ( ∞ ) are large inan early phase of epidemic. Hence, the estimated ratio of collective immunity and evenits existence may be unreliable. As mentioned Section in 3.1, it is interesting to consider the homogeneous Markovchain, obtained when function π is constant (no contagion). Then the evolution of p ( t ) isdriven by a linear recursive equation of order 1: p ( t ) = Π (cid:48) p ( t − − π, p ,

1. The following proposition is obtained:

Proposition 4 : For a constant π , we have: p ( t ) = A (1 − π ) t + Bp t + C , where A, B, C are 3-dimensional vectors.The eﬀects of entries into and exits from state I induce the two driving exponentialfunctions.When function π is not constant, the decreasing sequence p ( t ) and increasing sequence p ( t ) take values between their analogues computed from a homogeneous Markov chainwith π = π (0) and π = π (1), respectively. At the beginning of an epidemic the probability of being infected p ( t ) is close to 0.Then, the estimating equations can be replaced by second-order discrete or continuousdeterministic systems of Riccati type. Some of these Riccati systems have closed-formsolutions that involve transcendental functions [see Miller (2012), Harko et al. (2014)].However these analytical solutions have complicated expressions. Alternatively, they canbe derived by simulation methods, which are easy to perform in the SIR model. HIS VERSION: June 19, 2020 π ( p ) = a + bp , where a > , b > , a + b <

1. Then theSIR model involves 3 independent parameters, and is expected to provide more ﬂexibilitythan the SI model, due to the additional parameter p . Below, we perform a sensitivityanalysis similar to that in Section 3.1.4 and focused on series p ( t ). Parameters a, b, p are set equal to a = 0 . , b = 0 . , p = 0 . p ( t ), SIR Model][Insert Figure 6 : Evolution of Change in p ( t ), SIR Model]The timing of a peak is determined by parameter a as shown below. We hold parameter b = 0 .

85 constant and change the values of parameter a in Figure 7.[Insert Figure 7 : Timing of Peak, a varying, SIR Model]Next, parameter a = 0 .

005 is held constant and the values of parameter b are allowed tovary. The size of peak is determined by parameter b as shown in Figure 8 below.[Insert Figure 8: Size of Peak, b varying, SIR model] This section present the methods of inference for the discrete time transition model. Letus now consider a parametric transition matrix Π[ p ( t − θ ], with parameter vector θ ,and assume that the empirical frequencies f ( t ) , t = 1 , .., T are observed. In addition, weassume that the parametric model is well speciﬁed. Under Assumptions A1, A2, these frequencies converge at rate 1 / √ N to their theoreticalcounterparts and are asymptotically normal. Thus we can write: f ( t ) = p ( t ) + u ( t ) , (4.1)where the errors are Gaussian with mean zero and the variance-covariance matrix at lag h given below [Gourieroux, Jasiak (2020)]: Cov [ u ( t ) , u ( t − h )] = (1 /N ) { Π( t − , h ) diag [ p ( t − h )] − p ( t ) p ( t − h ) (cid:48) } , (4.2) HIS VERSION: June 19, 2020 t − , h ) = Π[ p ( t − ... Π[ p ( t − h )]. System (4.1) resembles a measurement equation in a state space system with the mea-surement variable f ( t ), measurement error u ( t ), and the following system of transitionequations for the state variable p ( t ),: p ( t ) = Π[ p ( t − θ ] (cid:48) p ( t − . (4.3)However, the system of equations (4.3) and (4.1) does not fully satisfy the deﬁnition ofa state space representation because the measurement errors u ( t ) are serially correlated,as shown in equation (4.2).This diﬃculty is easily circumvented by assuming a pseudo Gaussian distribution forthe errors u (cid:48) t , and disregarding the autocorrelation. Their variance-covariance matrix attime t can be assumed equal to an identity matrix Id (Ordinary Least Squares approach)or an unknown constant matrix Ω (Weighted Least Squares approach), or even the trueexpression of V ( u t ) can be considered. Upon this change of autocovariance structure, a(pseudo) state space representation is obtained.The pseudo state space representation can be estimated by the Gaussian quasi-maximumlikelihood (QML). The quasi-maximum likelihood approach has also an interpretation interms of estimating equations and asymptotic least squares [see, Berkson (1944), Go-dambe, Thompson (1974), McRae (1977), Kalbﬂeisch et al. (1983), Hardin, Hilbe (2003),Miller, Judge (2015)]. As the asymptotic theory is established for T ﬁxed and N → ∞ ,the QML and weighted least squares methods provide consistent estimators of parametervector θ , which are not fully eﬃcient as the true structure of autocovariances of errrors u t has not been taken into account [see, Gourieroux, Jasiak (2020), Appendix 2].In practice, the QML estimates of θ can be computed numerically from an extended,or unscented Kalman Filter applied to the (pseudo) state space model.The estimation approach outlined above remains valid when some frequencies f j ( t ) As the QML approach does not account for the structure of the variance-covariance matrix of the u ( t )’s it can be improved by replacing a ”moment” estimator by a GMM estimator. see, e.g. Song, Grizzle (1995), Krener (2003) for the Extended Kalman Filter. HIS VERSION: June 19, 2020

The simple epidemiological models can be easily extended to allow for time dependentparameters, obtained by replacing θ by a ( t ) , b , say, to distinguish the time dependentparameter vector from the constant parameters. Then, their transition model represen-tation involves the transition matrix Π[ p ( t − , a ( t ) , b ]. The time dependent propagationparameters can, for instance, capture the time varying implementation of social distanc-ing measures and the compliance with these measures [see e.g. DiDomenico et al. (2020),Alvares et al. (2020)]. Such an extended model can be written also in the (pseudo) statespace representation and estimated by using the methods given in the previous Section.The (pseudo) state space representations depend on the assumptions on the evolution ofparameter vector a ( t ). At least three types of modelling approaches can be considered.i) Dynamics of a ( t ) left unspeciﬁed The state space representation comprises the measurement equation: f ( t ) = p ( t ) + u ( t ) , and the transition equations: HIS VERSION: June 19, 2020 p ( t ) = Π[ p ( t − , a ( t ) , b ] (cid:48) p ( t − . The state variables are the marginal probabilities p ( t ) and parameters a ( t ) while b isthe vector of constant parameter. They can be jointly estimated and the state variablesﬁltered by the extended Kalman ﬁlter, under an identiﬁcation condition. In particular,the order condition: ( J − T − ≥ T dim a + dim b has to be satisﬁed.ii) Stochastic evolution of a ( t )An alternative model has the same measurement equation and extends the previous statespace representation system as it includes additional transitions such as: a ( t ) = φa ( t −

1) + v ( t ) , where the errors v ( t ) are Gaussian noises independent of the measurement errors u ( t ) and | φ | < p ( t ) andparameters a ( t ), and b is the constant parameter vector. The extended Kalman Filtercan be used to jointly estimate b and ﬁlter the components of a ( t ).iii) Exogenous information on a ( t )If the indicators x ( t ) of social distancing are available, such as counts of travellers[see, Hurtacsu et al. (2020)] and the daily numbers of ﬁnes for disobeying the socialdistancing rules, the model can be extended to include x ( t ). Then, the model is similar tothe representation above with the autoregressive dynamic replaced by an equation suchas: a ( t ) = Cx ( t ) + v ( t )where C is a row vector of constant coeﬃcients to be estimated and v ( t ) are Gaussiannoises independent of the measurement errors u ( t ). Let us now examine the limiting case of time independent stochastic parameters,which is a very special case of stochastic dynamics and consider the logistic model ofProposition 2. The stochastic parameters are introduced to account for heterogeneity of

HIS VERSION: June 19, 2020 q k , k = 1 , ..., K on values ( α k , β k ) , k = 1 , ..., K [see, Gourieroux et al.(1996),Yan, Chowell (2019)]. Then, equation (3.4) is written conditional on α, β and deﬁnes theevolution of p ,k ( t ) for values α k , β k . Next, these evolutions need to be re-integrated withrespect to α, β , in order to ﬁnd the marginal probability of being infected p ( t ) as follows: p ( t ) = K (cid:88) k =1 q k p ,k ( t ) = K (cid:88) k =1 { q k [ α k [exp( α k + β k ) t ] − α k ] / [ α k exp[( α k + β k ) t ] + β k ] } . We get a convex combination of logistic functions . This additive speciﬁcation impliesthat p ( t ) cannot follow a quadratic diﬀerential equation such as (3.4). There is a doubleheterogeneity i) in coeﬃcients α , which means that there exist multiple initial exogenousclusters of infection of diﬀerent sizes, ii) in coeﬃcients β , which means that the speeds ofpropagation of each cluster are diﬀerent. This model is not a special case of (SI) K [seeAppendix 1] as there is no contagion between the sub-populations k and only contagionwithin is allowed.The presence of heterogeneity in a logistic model generates the following eﬀects:i) several peaks can appear in the changes in p ( t ) over time. This is the wave eﬀect[Witham (1974), Gourieroux et al. (1996)], due to diﬀerent propagation parameters ofeach wave.ii) the persistence of p ( t ) increases due to the additive representation. This is a well-known long memory eﬀect revealed in Granger, Joyeux (1980) for linear autoregressivemodels that also exist in nonlinear logistic models [see Sattenspiel (1990)].Below, a three-wave pattern is illustrated in Figure 9 for the following parametervalues: α = 0 . , . , . β = 0 . , . , . used as a basis of functions in neural networks. HIS VERSION: June 19, 2020 Contrary to the major part of literature on epidemiological models, which considers deter-ministic continuous time models of counts of individuals in various compartments (states),we consider stochastic models in discrete time for variables representing individual his-tories [see also Allen (1994), Das et al.(2011) for discrete time approach]. The proposeddiscrete time transition model has the following advantages:i) It eliminates the lack of consistency between time discretized continuous time mod-els with respect to the time unit. In the continuous time setup, it also eliminates theassumption of diﬀerentiability of aggregate counts, which are discrete variables. This isespecially important in the early phase of an epidemic when some of these counts aresmall.ii) The stochastic model allows us to combine aggregate count variables and individualmedical histories of patients under medical care.iii) The stochastic component allows us for deriving not only the point, but alsointerval forecast. This is important at the beginning of an epidemic when the number ofobservations is small and the results are less reliable.iv) The estimation of the transition model can be performed by applying an extendedKalman ﬁlter to its (pseudo) state space representation.

HIS VERSION: June 19, 2020 REFERENCES

HIS VERSION: June 19, 2020

HIS VERSION: June 19, 2020 Appendix A.1Structural Epidemiological Models

This Appendix presents a typology of basic epidemiological models that can serve asbuilding blocks of more complex speciﬁcations. The diﬀerence between the basic modelsare with respect to:i) the number of states ( compartments) and their interpretations as S= susceptible,E=exposed, I =infected, R =recovered, D=deceased.ii) The number and types of virus propagation sourcesiii) The location of zeros in the transition matrix (i.e. the causal structure)iv) The structure of time dependent transition probabilities.We provide below the transition probabilities along with the state interpretations. Thetime dependent transition probabilities are denoted by π and are functions of (lagged)marginal probabilities p ( t ).The models described below are the following:2-state: SI model, SIS model3-state:SIR model4-state : SIRD model, SEIR model, SIR model, (SI) model, S,IU ,ID, R model.The interpretations of states and the form of transition matrices are given below.2-state SI modelS= susceptible,I= infected,being immunized and staying infectious for the S peoplerow 1,S: π ( p ) , π ( p )row 2, I : 0,1One absorbing state,one source of infection.2-state SIS modelS:susceptible, I infected, can recover, but without being immunized.row 1,S: π ( p ), π ( p )row 2,I: p , p , with p > HIS VERSION: June 19, 2020 π ( p ) , π ( p ) , π ( p )row 2, I: 0, p , p row 3, R: 0,0,1One absorbing state,one source of infection [see e.g. Tchenchue et al.(2003), Meng,Chen(2008),Jianget al. (2011), Toda (2020)].4-state SIRD modelS=susceptible, I=infected, not immunized,infectious, R=recovered, no longer infectious,immunized, D=deceased.row 1, S: π ( p ) , π ( p ) , π ( p ) , p row 2, I : 0, p , p , p row 3, R: 0,0, p , p row 4, D: 0,0,0,1One absorbing state, one propagation source.4-state (SI) The population is divided into 2 sub-populations, as region 1 & region 2, male & female,young & old. It is easily extended to any number of regions. S j =susceptible of type j , I j = infected, immunized, infectious of type j .row 1, S : π ( p , p ) , , π ( p , p ) , S : 0, π ( p , p ) , , π ( p , p )row 3, I : 0,0,1,0row 4, I : 0,0,0,1Two absorbing states,two propagation sources [see e.g. Feng et al. (2005)].4-state SEIRS=susceptible E= exposed, but not yet infectious (there is a latency period), not immu-nized, I = infected and infectious, not immunized, R=recovered,no longer infectious,immunized.row 1,S: π ( p ) , π ( p ) , , p , p , p , p HIS VERSION: June 19, 2020 π ( p , p ) , π ( p , p ) , π ( p , p ) , p row 2, IU: 0, p , p , p row 3, ID: 0, 0, p , p row 4, R: 0, 0, 0, 1One absorbing state, two propagation sources [see e.g. Gourieroux, Jasiak (2020)].These structural models can be extended by considering other states, such as the birth to oﬀset the future number of deaths due to coronavirus, the types of medical treatmentof patients in hospitals, the severity (asymptomatic (mild), symptomatic, high), or thedetection (contact tracing, inﬂuenza like illness surveillance, tests, etc) [see, e.g. Verity etal. (2020)]. The models can also be extended by combining the basic models as buildingblocks to construct a 5-state model such as the SEIRD [see e.g. Korolev(2020)], or S IUID R D, a 6-state model such as the S IU , ID , R, or ( SIR ) . The structure of thetransition matrix can also be modiﬁed to account for the possibility that a fraction ofrecovered individuals is not entirely immunized and can be infected twice. In transition models the state birth is usually introduced to balance the deaths and to providestationary evolutions of the processes [see e.g. Harko et al. (2014)]. This ad-hoc introduction of birthsis not relevant at the beginning of the epidemic when the interest is in determining the nonstationarydynamic at the beginning of the epidemic, rather than in the long run equilibrium. Second, there is noincreased count of births (known 9 months earlier) in order to oﬀset the increasing number of deaths dueto coronavirus.

HIS VERSION: June 19, 2020 Appendix A.2Rational Recursive Equations

This Appendix shows the exact solution and the exact time discretization of the prob-ability to be infected in a continuous time SI model. We provide below the results fora general one-dimensional Riccati equation in continuous time written on a series x ( t ).Then, the results can be applied to the special case x ( t ) = p ( t ).i) The diﬀerential equation

This diﬀerential equation is: dx ( t ) /dt = − λ ( x ( t ) − a )( x ( t ) − b ) / ( a − b ) , where λ is strictly positive.ii) The solution

Since: ( a − b ) / [( x ( t ) − a )( x ( t ) − b )] = 1 / ( x ( t ) − a ) − / ( x ( t ) − b ) , we deduce: dx ( t )[1 / ( x ( t ) − a ) − / ( x ( t ) − b )] = − λdt, and by integration: | x ( t ) − a | / | x ( t ) − b | = exp ( − λt ) | x (0) − a | / | x (0) − b | . The form of this relation implies that the trajectory x ( t ) and the starting value x (0)satisfy always the same relationship with respect to a and b, that is x ( t ) is in the interval(a,b) (resp. below, above), if x (0) is in this interval (resp. below, above). Therefore, wecan disregard the absolute values to get:( x ( t ) − a ) / ( x ( t ) − b ) = exp ( − λt )( x (0) − a ) / ( x (0) − b ) , for any nonnegative t . HIS VERSION: June 19, 2020 x ( t ): x ( t ) = [ a − bkexp ( − λt )] / [1 − kexp ( − λt )] , where: k = ( x (0) − a ) / ( x (0) − b ).iii) The exact time discretized recursive model

We deduce that the exact time discretized counterpart corresponds to a rational trans-form of x ( t − x ( t ) = { a [ x ( t − − b ] − b [ x ( t − − a ] exp ( − λ ) } / { [ x ( t − − b ] − [ x ( t − − a ] exp ( − λ ) } . This rational recursive equation, which is the exact time discetization, diﬀers from thecrude Euler discretization.iv)

Special case

The results above can be computed for equation (3.4) with: λ = β > a = − α/β, b =1. Without an exogenous source of infection: α = 0 ,the interval ( a, b ) is the interval (0,1), x ( t ) = p ( t ) is decreasing and tends to 0 when t tends to inﬁnity, for any starting value p (0). HIS VERSION: June 19, 2020 . . . . . . timetime Figure 1: Evolution of p ( t ), SI Model HIS VERSION: June 19, 2020 . . . . . timetime

HIS VERSION: June 19, 2020 beta a l pha . . . . . . Figure 4: Time to Inﬂection, SI Model

HIS VERSION: June 19, 2020 . . . . . timetime

HIS VERSION: June 19, 2020 . . . . . timetime Figure 5: Evolution of p ( t ), SIR Model HIS VERSION: June 19, 2020 . . . . . timetime

HIS VERSION: June 19, 2020 . . . . . timetime Figure 5: Evolution of p ( t ), SIR Model HIS VERSION: June 19, 2020 . . . . . timetime Figure 6: Evolutions of Changes in p ( t ), SIR Model HIS VERSION: June 19, 2020 . . . . . time a=0.015, b=0.85a=0.005, b=0.85a=0.0001, b=0.85 Figure 7: Timing of Peak, a varying, SIR Model HIS VERSION: June 19, 2020 . . . . . time b=0.95b=0.75b=0.55b=0.95b=0.75b=0.55 Figure 8: Size of Peak, b varying, SIR Model HIS VERSION: June 19, 2020 . . . . . . . timetime