Quantifying the transmission potential of pandemic influenza
aa r X i v : . [ q - b i o . P E ] D ec Quantifying the transmission potential of pandemicinfluenza
Gerardo Chowell ∗ , Hiroshi Nishiura School of Human Evolution and Social Change, Arizona State UniversityTempe, AZ 85282, USA Theoretical Epidemiology, University of Utrecht,Yalelaan 7, 3584 CL Utrecht, The Netherlands ∗ Corresponding author. Email: [email protected] Fax: (505) 480-965-7671 bstract This article reviews quantitative methods to estimate the basic reproductionnumber of pandemic influenza, a key threshold quantity to help determine theintensity of interventions required to control the disease. Although it is difficultto assess the transmission potential of a probable future pandemic, historical epi-demiologic data is readily available from previous pandemics, and as a referencequantity for future pandemic planning, mathematical and statistical analyses ofhistorical data are crucial. In particular, because many historical records tendto document only the temporal distribution of cases or deaths (i.e. epidemiccurve), our review focuses on methods to maximize the utility of time-evolutiondata and to clarify the detailed mechanisms of the spread of influenza.First, we highlight structured epidemic models and their parameter estima-tion method which can quantify the detailed disease dynamics including thosewe cannot observe directly. Duration-structured epidemic systems are subse-quently presented, offering firm understanding of the definition of the basic andeffective reproduction numbers. When the initial growth phase of an epidemic isinvestigated, the distribution of the generation time is key statistical informationto appropriately estimate the transmission potential using the intrinsic growthrate. Applications of stochastic processes are also highlighted to estimate thetransmission potential using similar data. Critically important characteristicsof influenza data are subsequently summarized, followed by our conclusions tosuggest potential future methodological improvements. ACS classifications:
Viral diseases (87.19.xd); Population dynamics (87.23.Cc);Stochastic models in biological physics (87.10.Mn)
Keywords:
Influenza; pandemic; epidemiology; basic reproduction number; model.
Contents R using a structured epidemic model 11 R and R ( t ) R using the intrinsic growth rate . . . . . . . . . . . . . 194.2 The concept of R ( t ) and its estimation . . . . . . . . . . . . . . . . . . 24 R What to be clarified further? 46 R really means? . . . . . . . . . . . . . . . . . . . . . . . . . . . 467.2 New theory to replace mass action principle . . . . . . . . . . . . . . . 497.3 Which kind of data do we have to explore? . . . . . . . . . . . . . . . . 50 Influenza epidemics are observed around the world during the wintertime and with astrong seasonal component in temperate regions [1][2]. Influenza is a disease causedby the influenza virus, an RNA virus belonging to the Orthomyxoviridae [3]. Manyfeatures are common with those of the paramyxovirus infections of the acute upperrespiratory tract. Typical symptoms of the disease are characterized by fever, myal-gia, severe malaise, non-productive cough, and sore throats. The disease spreads whenan infected individual coughs or sneezes and sends the virus into the air, and othersusceptible individuals inhale the virus. The virus is also believed to be transmittedwhen a person touches a surface that is contaminated with the virus ( e.g. door knob,etc.) and then touches the nose or eyes. Infected individuals can transmit the virusalmost within a day following infection ( i.e. latent period). Although it is generallybelieved that infected individuals can pass the virus for 3-7 days following symptom on-set, there is some uncertainty on the duration of the infectious period. The generationtime ( i.e. sum of latent and infectious periods) for influenza, reported and assumed inthe literature, ranges from 3 days [4][5][6] to about 6 days [7][8].Individuals that are infected with influenza are believed to become permanently4mmune against the specific virus strain. Hence, the virus is able to persist in the humanpopulation through relatively minor (single point) mutations in the virus compositionknown as drifts. Influenza (sub)types A/H3N2, A/H1N1 and B are currently co-circulating in the human population [9]. Major changes in the virus composition viarecombination or gene reassortment processes (known as genetic shifts) can lead tothe emergence of novel influenza viruses with the potential of generating dramaticmorbidity and mortality levels around the world [3].The 1918-19 influenza pandemic known as the
Spanish influenza caused by the in-fluenza virus A (H1N1) has been the most devastating in recent history with estimatedworldwide mortality ranging from 20 to 100 million deaths [10][11] with a case fatalityof 2-6 percent [12][13]. The worldwide 1918 influenza pandemic spread in three wavesstarting from Midwestern United States in the spring of 1918 [14][15]. The deadlysecond wave began in late August probably in France while the third wave is generallyconsidered as part of normal more scattered winter outbreaks similar to those observedafter the 1889/90 pandemic [14]. Subsequent pandemics during the 20th century areattributed to subtyes A (H2N2) from 1957-58 (Asian influenza) and A (H3N2) in 1968(Hong Kong influenza) [16].The ability to quickly detect and institute control efforts at the early stage of aninfluenza pandemic is directly linked to the final levels of morbidity and mortality inthe population [13]. To appropriately assess the disaster size of a probable future pan-demic, we have to quantify the transmission potential (and its associated uncertainty).Although it is difficult to directly measure the transmissibility of a future pandemic,5istorical epidemiologic data is readily available from previous pandemics, and as areference quantity for future pandemic planning, mathematical and statistical analysesof the historical data can offer various insights. In particular, because many historicalrecords tend to document only the temporal distribution of cases or deaths ( i.e. epi-demic curve), we modelers have faced with a difficult need to clarify the mechansms ofthe spread of influenza using such time-evolution data alone. In this paper, we reviewa number of mathematical and statistical methods for the estimation of the transmis-sion potential of pandemic influenza, focusing on theoretical techniques to maximizethe utility of the temporal distribution of influenza cases. The methods that havebeen incorporated in this review include the applications of epidemiologically struc-tured epidemic models, explicitly duration-structured epidemic system, and stochasticprocesses ( i.e. branching and counting processes). Whereas this review does not coverthe spread of influenza in space, spatial heterogeneity in transmission and the growinginterest in the role of contact network are briefly discussed as the future challenge.
The basic reproduction number, R (pronounced as R nought ), is a key quantity usedto estimate transmissibility of infectious diseases. Theoretically, R is defined as theaverage number of secondary cases generated by a single primary case during its entireperiod of infectiousness in a completely susceptible population [17]. As the epidemicprogresses, the number of susceptible individuals is decreased due to infection, and thereproduction number decays following R ( t ) = R S ( t ) /S (0) where S ( t ) and S (0) are,6espectively, the number of susceptible individuals at time t and before the epidemicstarts; the latter is equivalent to the total population size N given that all individualsare susceptible before the beginning of an epidemic. Clearly, this definition only appliesto (novel) emerging infectious diseases ( e.g. the epidemic of severe acute respiratorysyndrome (SARS) from 2002-3) or re-emerging infectious diseases that had not circu-lated in the population in question for long enough to allow for residual immunity inthe population to disappear due to births and deaths.The reproduction number is directly related to the type and intensity of interven-tions necessary to control an epidemic since the objective is to make R ( t ) < R ( t ) <
1, one or a combination of control strategies may beimplemented. For example, one of the best known uses of R is in determining thecritical coverage of immunization required to eradicate a disease in a randomly mixingpopulation. That is, when vaccine is available against a disease in question, it is ofinterest to estimate the critical proportion of the population that needs to be vacci-nated ( i.e. vaccination coverage) in order to attain R < − p c (in a randomly mixing population) can be es-timated from the R of the disease in question as follows [20]:7 c > ǫ (cid:18) − R (cid:19) (1)where ǫ is the efficacy ( i.e. direct effectiveness) of vaccination [21]. p c given in (1)suggests that the disease could be eradicated even when all susceptible individuals arenot vaccinated. The protection conferred to the population by achieving a critical vac-cination coverage is known as herd immunity [22][23].A brief history of the theoretical developments on the basic reproduction numberand its analytical computation via epidemic modeling is given elsewhere [23][24][25].The mathematical definition and calculation of R using next-generation argumentsis described initially by Odo Diekmann and his colleagues [17][26], where R is thedominant eigenvalue of the resulting next generation matrix. Further elaborations andreviews can be found elsewhere [27][28][29][30][31][32][33]. Classically, rather than thethreshold phenomena, R was used to suggest the severity of an epidemic, because theproportion of those experiencing infection at the end of an epidemic depends only on R [34] (see Section 3).Statistical methods to quantitatively estimate R have been reviewed by Klaus Di-etz [35]. Depending on the characteristics of data and underlying assumptions of themodels, R can be estimated using various different approaches [36]. In addition to thefinal size equation, R of an epidemic of newly emerging disease can be estimated fromthe intrinsic growth rate [4][6][8][18][37][38], which is also referred to as the rate ofnatural increase , suggesting the natural growth rate of infected individuals in a fullysusceptible population (discussed in Section 4). Moreover, for simple epidemic models8ith relatively few parameters, R can be estimated with other unobservable quantitiesby rigorous curve fitting of model equations to the observed epidemic data (discussedin Section 3)[38][39][40]. Not only R but also R ( t ) can be estimated from the temporaldistribution of infectious diseases, reconstructing the transmission network or inferringthe time-inhomogeneous number of secondary transmissions [41][42][43][44].To estimate the basic reproduction number of endemic diseases, different approachesare taken. One would need first to carry out serological surveys to quantify the frac-tion of the population that is effectively protected against infection ( i.e. age- and/ortime-specific proportion of those possessing acquired immunity needs to be estimated).Through this effort, the force of infection , the rate at which susceptible individualsget infected, is estimated [45]. For example, this is the case for rubella, mumps andmeasles (that are still circulating in some regions of the world even when high effectivevaccination coverage is achieved). Although the estimation of R for endemic diseasesis out of the scope of this review, methodological details and the applications to esti-mate the force of infection and R can be found elsewhere [46][47][48][49][50][51][52].In practice, the reproduction number denoted simply by R and defined as the num-ber of secondary cases generated by a primary infectious cases in a partially protectedpopulation might be useful. R can also be estimated from the initial growth phase of anepidemic in such a partially immunized population. In a randomly mixing population,the relationship between the basic reproduction number ( R ) and the reproductionnumber ( R ) is given by R = (1 − p ) R where p is the proportion of the population thatis effectively protected against infection (in the beginning of an epidemic). Besides, for9any recurrent infectious diseases including seasonal influenza, estimating the back-ground immunity p in the population is extremely difficult due to cross-immunity ofantigenically-related influenza strains and vaccination campaigns.With reagard to seasonal influenza, the reproduction number ( R ) over the last 3decades has been estimated at 1 . .
05) in the United States, France, and Australiawith an overall range of 0 . − . R estimate of 1 . . . R has been suggested for the 1951 influenza epidemicin England and Canada [57].Because influenza pandemics such as the Spanish flu from 1918-19 are associated tothe emergence of novel influenza strains to which most of the population is susceptible,it might be reasonable to assume that the reproduction number R ≈ R . Previousstudies have estimated that R of the 1918-19 influenza pandemic ranged between 1 . . R in recent studies. The variability of R es-timates suggests that local factors, including geographic and demographic conditions,could play an important role in disease spread [65][66][67]. In the following sections,we review how these estimates are obtained and how we shall interpret the estimates,starting from a simple structured epidemic model proposed in 1927.10 Estimating R using a structured epidemic model Mathematical models provide a unique way to analyze the transmission dynamics andstudy various different scenarios associated to the spread of communicable diseasesin population(s) [18][68][69]. The history of the mathematical modeling of infectiousdiseases greatly remounts to the study of Sir Ronald Ross in 1911 [70] who inventeda classic malaria model and also discovered the mosquito-borne transmission mecha-nisms of malaria. Employing a mass action principle for the spread of malaria, Rossexplored the effects of controling the mosquito population using simple mathematicalmodels [71]. Following his effort, Kermack and Mckendrick introduced a classical SIR(susceptible-infectious-removed) epidemic model in 1927, which is most frequently uti-lized in the present day, given by the following system of nonlinear ordinary differentialequations (ODEs) [72]: dS ( t ) dt = − βS ( t ) I ( t ) NdI ( t ) dt = βS ( t ) I ( t ) N − γI ( t ) dR ( t ) dt = γI ( t ) (2)where S ( t ) denotes susceptible individuals at time t ; I ( t ), infected (assumed infectious)individuals at time t ; and R ( t ), recovered (assumed permanently immune) individualsat time t ; β is the transmission rate; γ the recovery rate; and N is the total popula-tion size which is assumed constant for a closed population ( i.e. a population withoutimmigration and emmigration). Susceptible individuals in contact with the virus en-ter the infectious class ( I ) at the rate βI/N . That is, homogeneous mixing between11ndividuals is assumed.The basic reproduction number, R , for the epidemic system (2) is given by theproduct of the transmission rate and the mean infectious period. That is: R = βγ (3)Classically, R has been known as a quantity to suggest severity of an epidemic [34].Indeed, analytical expression of R in (3) is derived simply by solving the above system(2). Replacing I ( t ) in the right hand side of dS ( t ) /dt by (1 /γ ) dR ( t ) /dt , we get1 S ( t ) dS ( t ) dt = − βγN dR ( t ) dt (4)Integrating both sizes of (4) from 0 to infinity,ln S ( ∞ ) S (0) = − βγN ( R ( ∞ ) − R (0)) (5)Since S ( ∞ ) = N − R ( ∞ ), and because we assume S (0) = N and R (0) = 0, equation(5) can be rewritten as ln N − R ( ∞ ) N = − βγ R ( ∞ ) N (6)In the above equation (6), final size , i.e. , the proportion of those experiencing infectionamong a total number of individuals in a community following a large scale epidemic,is defined as p := R ( ∞ ) /N . That is,ln(1 − p ) = − R p (7)Therefore, the following final size equation of an autonomous SIR (or SEIR) modelis obtained: ˆ R = − ln(1 − p ) p (8)12quation (8) can be analytically derived using both deterministic (models governedby ODEs or partial differential equations (PDEs))[73] and stochastic models [74].Despite the usefulness of (2), SIR assumptions given by ODEs are not always di-rectly applicable to real data. One of the reasons include that there is no disease wherean infected individual can cause secondary transmission immediately after his/her in-fection.Accordingly, we have used slightly extended compartmental models in the previ-ous studies to describe the transmission dynamics of the 1918-19 influenza pandemicand estimate the reproduction number [38][39]. We now describe two different SEIR(susceptible-exposed-infectious-removed) models that have been used to estimate thereproduction number. The first model is the simple SEIR model, and the second modelaccounts for asymptomatic and hospitalized individuals.The simple SEIR model classifies individuals as susceptible (S), exposed (E), infec-tious (I), recovered (R), and dead (D) [18]. Susceptible individuals in contact with thevirus enter the exposed class at the rate βI ( t ) /N , where β is the transmission rate, I ( t ) is the number of infectious individuals at time t and N = S ( t ) + E ( t ) + I ( t ) + R ( t )is the total population for any t . The entire population is assumed to be susceptibleat the beginning of the epidemic. Individuals in latent period (E) progress to theinfectious class at the rate k (where 1 /k suggests the mean latent period). We as-sume homogeneous mixing ( i.e. random mixing) between individuals and, therefore,the fraction I ( t ) /N is the probability of a random contact with an infectious individ-ual in a population of size N . Since we assume that the time-scale of the epidemic13s much faster than characteristic times for demographic processes (natural birth anddeath), background demographic processes are not included in the model. Infectiousindividuals either recover or die from influenza at the mean rates γ and δ , respectively.Recovered individuals are assumed protected for the duration of the outbreak. Themortality rate is given by δ = γ [CFP/(1-CFP)], where CFP is the mean case fatalityproportion. The transmission process can be modeled using the system of nonlineardifferential equations: dS ( t ) dt = − βS ( t ) I ( t ) NdE ( t ) dt = βS ( t ) I ( t ) N ( t ) − kE ( t ) dI ( t ) dt = kE ( t ) − ( γ + δ ) I ( t ) dR ( t ) dt = γI ( t ) dD ( t ) dt = δI ( t ) dC ( t ) dt = kE ( t ) (9)where C ( t ) is the cumulative number of infectious individuals. The basic reproductionnumber of the above system (9) is given by the product of the mean transmission rateand the mean infectious period, R = β/ ( γ + δ ).A more complex SEIR model (Figure 1) classifies individuals as susceptible ( S ),exposed ( E ), clinically ill and infectious ( I ), asymptomatic and partially infectious( A ), diagnosed and reported ( J ), recovered ( R ), and death ( D ). The birth and naturaldeath rates are assumed to have a common rate µ (60-year life expectancy as in [38]).The entire population is assumed susceptible at the beginning of the pandemic wave.Susceptible individuals in contact with the virus progress to the latent class at the14ate β ( I ( t ) + J ( t ) + qA ( t )) /N where β is the transmission rate, and 0 < q < A ). Since there isno explicit evidence estimating and proving the effectiveness of public health interven-tions, and because a high burden was placed upon the sanitary and medical sectors,diagnosed/hospitalized individuals ( J ) are assumed equally infectious. Although it isdifficult to explicitly evaluate the difference in infectiousness between general commu-nity and hospital, we roughly made this assumption since 78 percent of the nurses of theSan Francisco Hospital contracted influenza [75]. A more rigorous assumption requireseither statistical analysis of more detailed time-series data [76] or an epidemiologicalcomparison of specific groups by contact frequency [77]. The total population size attime t is given by N = S ( t )+ E ( t )+ I ( t )+ A ( t )+ J ( t )+ R ( t ). We assumed homogeneousmixing of the population and, therefore, the fraction ( I ( t ) + J ( t ) + qA ( t )) /N is theprobability of a random contact with an infectious individual. A proportion 0 < ρ < I ) at the rate k whilethe remaining (1 − ρ ) progress to the asymptomatic partially infectious class ( A ) at thesame rate k (fixed to 1/1.9 days − [8]). Asymptomatic cases progress to the recoveredclass at the rate γ . Clinically infectious individuals (class I ) are diagnosed (reported)at the rate α or recover without being diagnosed (e.g., mild infections, hospital refusals)at the rate γ . Diagnosed individuals recover at the rate γ = 1 / (1 /γ − /α ) or dieat rate δ . The mortality rates were adjusted according to the case fatality proportion(CFP), such that δ = CF P − CF P ( µ + γ ).The transmission process can be modeled using the following system of nonlinear15ifferential equations: dS ( t ) dt = µN ( t ) − βS ( t )( I ( t ) + J ( t ) + qA ( t )) N − µS ( t ) dE ( t ) dt = βS ( t )( I ( t ) + J ( t ) + qA ( t )) N − ( k + µ ) E ( t ) dA ( t ) dt = k (1 − ρ ) E − ( γ + µ ) A ( t ) dI ( t ) dt = kρE ( t ) − ( α + γ + µ ) I ( t ) dJ ( t ) dt = αI ( t ) − ( γ + δ + µ ) J ( t ) dR ( t ) dt = γ ( A ( t ) + I ( t )) + γ J ( t ) − µR ( t ) dD ( t ) dt = δJ ( t ) dC ( t ) dt = αI ( t ) (10)We assume the cumulative number of influenza notifications, our observed epidemicdata, is given by C ( t ). Seven model parameters ( β , γ , α , q , ρ , E (0), I (0)) are estimatedfrom the epidemic curve by least squares fitting as explained below. The reproductionnumber for model (10) is given by (see [38]): R = βkk + µ ( ρ (cid:18) γ + α + µ + α ( γ + α + µ )( γ + δ + µ ) (cid:19) + (1 − ρ ) (cid:18) qγ + µ (cid:19)) (11)and the clinical reporting proportion is given by: O = αα + γ + µ . (12) In the simplest manner, model parameters can be estimated via least-square fitting ofthe model solution to the observed data. That is, one looks for the set of parameters ˆΘ whose model solution best fits the epidemic data by minimizing the sum of the squared16ifferences between the observed data y t and the model solution C ( t, Θ ). That is, weminimize: X ( Θ ) = n X t = ( y t − C ( t , Θ )) (13)The standard deviation of the parameters can be estimated by computing the asymp-totic variance-covariance AV ( ˆΘ) matrix of the least-squares estimate by [78]: AV ( ˆΘ ) = σ ( ∇ Θ C ( Θ ) ∇ Θ C ( Θ ) T ) − (14)which can be estimated by ˆ σ ( ˆ ∇ ˆ Θ C ( ˆΘ ) ˆ ∇ ˆ ΘC ( ˆΘ ) T ) − (15)where n is the total number of observations, ˆ σ is the estimated variance, and ˆ ∇ C are numerical derivatives of C . Estimates of ˆ R can be obtained by substituting thecorresponding individual parameter estimates into an analytical formula of R . Fur-ther, using the delta method [79], we can derive an expression for the variance of theestimated basic reproduction number ˆ R . An expression for the variance of R for thesimple SEIR model (Equations 9) is given by: V ( ˆ R ) ≈ ˆ R { V ( ˆ β )ˆ β + V (ˆ γ )(ˆ γ + ˆ δ ) + V (ˆ δ )(ˆ γ + ˆ δ ) − ( 2ˆ β (ˆ γ + ˆ δ ) )( Cov (ˆ γ, ˆ β ) − ˆ βCov (ˆ δ, ˆ γ )ˆ γ + ˆ δ + Cov (ˆ δ, ˆ β )) } . (16)This expression depends on the variance (denoted by V ) of the individual parameterestimates as well as their covariance (denoted by Cov ).17 .2 Bootstrap confidence intervals
Another method to generate uncertainty bounds on the reproduction number is gen-erating bootstrap confidence intervals by generating sets of realizations of the best-fitcurve C ( t ) [80]. Each realization of the cumulative number of case notifications C i ( t )( i = 1, 2, . . . , m ) is generated as follows: for each observation C ( t ) for t = 2, 3, . . . , n days generate a new observation C ′ i ( t ) for t ≥ C ′ i (1) = C (1)) that is sampled from a Poisson distribution with mean: C ( t ) − C ( t −
1) (the daily increment in C ( t ) from day t − t ). The corresponding realization of the cumulative number of influenzanotifications is given by C i ( t ) = P tj =1 C ′ i ( t ) where t = 1, 2, 3, . . . , n . The reproductionnumber was then estimated from each of 1000 simulated epidemic curves to generatea distribution of R estimates from which simple statistics can be computed including95% confidence intervals. These statistics need to be interpreted with caution. For ex-ample, 95% confidence intervals for R derived from our bootstrap sample of R shouldbe interpreted as containing 95% of future estimates when the same assumptions aremade and the only noise source is observation error. It is tempting but incorrect tointerpret these confidence intervals as containing the true parameters with probability0 .
95. Figure 2 shows the temporal distributions of the reproduction number and theproportion of the clinical reporting obtained by the bootstrap method after fitting thecomplex SEIR epidemic model to the initial phase of the Fall influenza wave using 17epidemic days of the Spanish Flu Pandemic in San Francisco, California.18
Primer of mathematics and statistics of R and R ( t ) In addition to the estimation of R , it is of practical importance to evaluate time-dependent variations in the transmission potential. Explanation of the time course ofan epidemic can be partly achieved by estimating the effective reproduction number, R ( t ), defined as the actual average number of secondary cases per primary case attime t (for t >
0) [41][42][44][81][82] . Although effective interventions against Spanishinfluenza may have been limited in the early 20th century, it is plausible that the contactfrequency leading to infection varied with time owing to the huge number of deathsand dissemination of information through local media ( e.g. newspapers). R ( t ) showstime-dependent variation with a decline in susceptible individuals (intrinsic factors)and with the implementation of control measures (extrinsic factors). If R ( t ) <
1, itsuggests that the epidemic is in decline and may be regarded as being under control attime t (vice versa, if R ( t ) > R using the intrinsic growth rate To appropriately understand the theoretical concept of R ( t ), let us firstly consideran explicitly infection-age structured epidemic model. Whereas Kermack-McKendrickmodel governed by ODEs (i.e. SIR and SEIR models as discussed above) has been well-known, Kermack and McKendrick had actually proposed an infection-age structured R ( t ) should not be confused with the number of removed individuals using the same notation. Inthe following arguments of this paper, R ( t ) denotes the effective reproduction number. S ( t ) and U ( t ). Further, let i ( t, τ ) be the density of infectiousindividuals at time t and infection-age τ ( i.e. time since infection). The model isgiven by dS ( t ) dt = − λ ( t ) S ( t ) (cid:18) ∂∂t + ∂∂τ (cid:19) i ( t, τ ) = − γ ( τ ) i ( t, τ ) i ( t,
0) = λ ( t ) S ( t ) dU ( t ) dt = R ∞ γ ( τ ) i ( t, τ ) dτ (17)where λ ( t ) is referred to as the force of infection (foi) ( i.e. as discussed in Section 2,foi is defined as the rate at which susceptible individuals get infected) which is givenby: λ ( t ) = Z ∞ β ( τ ) i ( t, τ ) dτ (18)and γ ( τ ) is the rate of recovery at infection-age τ . It should be noted that the abovemodel has not taken into account the background host demography ( i.e. birth anddeath). In a closed population, the total population size N is thus given by: N = S ( t ) + Z ∞ i ( t, τ ) dτ + U ( t ) (19)It should also be noted that, although i ( t, τ ) is referred to as density , it is not meantto be a normalized density ( i.e. integral of i ( t, τ ) over t and τ does not sum up to1). Rather, we use density to mathematically refer to the absolute frequency in theinfection-age space. 20he system (17) can be reasonably integrated i ( t, τ ) = Γ( τ ) j ( t − τ ) , f or t − τ > τ )Γ( τ − t ) j ( τ − t ) , f or τ − t > j ( t ) = i ( t, τ ) = exp (cid:0) − R τ γ ( σ ) dσ (cid:1) (21)and j ( τ ) suggests the density of initially infected individuals at the beginning of anepidemic. In the following arguments, we call j ( t ) as incidence of infection ( i.e. newinfections at a given point of time t ). It is not difficult to derive S ( t ) = S (0) − Z t j ( σ ) dσ (22)from (17)- (21). Thus, the subequation of i ( t,
0) in system (17) is rewritten as j ( t ) = λ ( t ) (cid:20) S (0) − Z t j ( σ ) dσ (cid:21) (23)Taking into account the initial condition in (20), equation (22) is rewritten as j ( t ) = (cid:20) S (0) − Z t j ( σ ) dσ (cid:21) (cid:20) G ( t ) + Z t ψ ( τ ) j ( t − τ ) dτ (cid:21) (24)where ψ ( τ ) = β ( τ )Γ( τ ) G ( t ) = R ∞ β ( σ + t ) Γ( σ + t )Γ( σ ) j ( σ ) dσ (25)Considering the initial invasion phase ( i.e. initial growth phase of an epidemic), weget a linearized equation j ( t ) = S (0) G ( t ) + S (0) Z t ψ ( τ ) j ( t − τ ) dτ (26)21he equation (26) represents Lotka’s integral equation, where the basic reproductionnumber, R , is given by R = S (0) Z ∞ ψ ( τ ) dτ (27)Thus, the epidemic will grow if R > R <
1. The abovemodel can yield the same final size equation as seen in models governed by ODEs [17].Assuming that the infection-age distribution is stable, we get a simplified renewalequation j ( t ) = Z ∞ A ( τ ) j ( t − τ ) dτ (28)where A ( τ ) is the product of ψ ( τ ) and S (0). Moreover, assuming that we observe anexponential growth of incidence during the initial phase (ı.e. j ( t ) = k exp( rt ) where k and r are, respectively, a constant ( k >
0) and the intrinsic growth rate), the followingrelationship must be met j ( t ) = j ( t − τ ) exp( rτ ) (29)Replacing j ( t − τ ) in the right hand side of (28) by (29), we get j ( t ) = Z ∞ A ( τ ) j ( t ) exp( − rτ ) dτ (30)Removing j ( t ) from both sides of (30), we get the Lotka-Euler characteristic equation:1 = Z ∞ e − rτ A ( τ ) , dτ (31)Further, we consider a probability density of the generation time ( i.e. the time frominfection of an individual to the infection of a secondary case by that individual [85]),denoted by w ( τ ): w ( τ ) := A ( τ ) R ∞ A ( x ) dx = A ( τ ) R . (32)22sing (32), the equation (31) can be replaced by1 R = Z ∞ exp( − rτ ) w ( τ ) , dτ (33)The equations (29)-(33) are what Wallinga and Lipsitch discussed in a recent study[6], reasonably suggesting the relationship between the generation time and R . Ac-cordingly, the estimator of R using the intrinsic growth rate is given by:ˆ R = 1 M ( − r ) , (34)where M ( − r ) is the moment generating function of the generation time distribution w ( τ ), given the intrinsic growth rate r [6] . Equation (34) significantly improved theissue of estimating R using the intrinsic growth rate alone, because (34) permits val-idating estimates of R by various different distributional assumptions for w ( τ ). Theissue of assuming explicit distributions for latent and infectious periods has been high-lighted in recent studies [86][87][88][89],[90] and indeed, this point is in part addressedby (34), because the convolution of latent and infectious periods yields w ( τ ). Moreover,since the assumed lengths of generation time most likely yielded different estimates of R for Spanish influenza by different studies [60], equation (34) highlights a critical In the original study of Wallinga and Lipsitch [6], the notation R is not used for equation (34)and rather document (34) as the estimator of R . Most likely, there are two reasons for this. First,we cannot assure if all individuals are susceptible to pandemic influenza before the start of epidemic(as discussed in Section 2). Second, we cannot assume that infection-age distribution is stable duringthe initial growth phase, which is highlighted in (20). Thus, it should be remembered that the abovediscussion is mathematically tight in theory, but there are certain number of assumptions to apply theconcept to observed data. Since writing R alone is always confusing (as it is unclear if R is concernedwith time or immunity status), here we use R instead. R with thisdataset is limited to the initial growth phase only (right panel in Fig 3). Even thoughthe data represent deaths over time ( i.e. not infection events with time), we can di-rectly extract the same intrinsic growth rate as practised with onset data, assumingthat death data are a good proxy for morbidity data (see our discussions in Section 6).Assuming exponential growth in deaths as shown in (29), the intrinsic growth rate r isestimated to be 0.16 per day. Supposing that w ( τ ) is arbitrarily assumed to follow agamma distribution with mean G and coefficient of variation, CV = p V ar ( G ) /G , R is given by R = (1 + rG ( CV ) ) CV )2 (35)Although there is no concensus regarding the generation time of Spanish influenza, weassume it ranges from 2-5 days. Assuming further that CV = 0 . R is estimated torange from 1.36 (for G = 2 day) to 2.07 (for G = 5 days). R ( t ) and its estimation In the following, let us consider the non-linear phase of an epidemic. Derivation of R given by (34) assumes an exponential growth which is applicable only during the veryinitial phase of an epidemic (or, when the transmission is stationary over time), andthus, it is of practical importance to widen the utility of above-described renewal equa-24ions in order to appropriately interpret the time-course of an influenza pandemic. Letus explicitly account for the depletion of susceptible individuals, as we deal with an es-timation issue with time-inhomogeneous assumptions (ı.e. non-linear phase). Adoptingthe mass action assumption, we get: j ( t ) = S ( t ) Z ∞ ψ ( τ ) j ( t − τ ) dτ = Z ∞ A ( t, τ ) j ( t − τ ) dτ (36)where A ( t, τ ) should be interpreted as the reproductive power at time t and infection-age τ at which an infected individual generates secondary cases. We refer to the latterpart of equation (36) as a non-autonomous renewal equation, where the number of newinfection at time t is proportional to the number of infectious individuals (as assumedin the renewal equation in the initial phase).Using equation (36), the effective reproduction number, R ( t ) ( i.e. instantaneousreproduction number at calendar time t ) is defined as: R ( t ) = Z ∞ A ( t, τ ) dτ (37)Following (37), we can immediately see that R ( t ) with an autonomous assumption ( i.e. where contact and recovery rates do not vary with time) is given by: R ( t ) = S ( t ) S (0) R (38)which is shown in [17]. In practical terms, equation (38) suggests that time-varying de-crease in transmission potential as well as decline in the epidemic reflects only depletionof susceptible individuals. This corresponds to a classic assumption of the Kermackand McKendrick model. 25owever, as we discussed in the beginning of this section, we postulate that humancontact behaviors (and other extrinsic factors) modifies the dynamics of pandemicinfluenza, assuming that the decline in incidence does reflect not only depletion ofsusceptibles but also various extrinsic dynamics ( e.g. isolation, quarantine and clo-sure of public buildings). Thus, instead of the assumption in (36), we shall assumetime-inhomogeneous ψ ( t, τ ); i.e. j ( t ) = S ( t ) Z ∞ ψ ( t, τ ) j ( t − τ ) dτ = Z ∞ A ( t, τ ) j ( t − τ ) dτ (39)to describe A ( t, τ ).To derive simple estimator of R ( t ), it is convenient to assume separation of vari-ables for A ( t, τ ) (implicitly assuming that the relative infectiousness to infection-age isindependent of calendar time) [92]. Under this assumption, A ( t, τ ) is rewritten as theproduct of two functions φ ( t ) and φ ( τ ): A ( t, τ ) = φ ( t ) φ ( τ ) (40)Arbitrarily assuming a normalized density for φ ( τ ), i.e. , Z ∞ φ ( τ ) dτ ≡ R ( t ) = Z ∞ A ( t, τ ) dτ = φ ( t ) (42)suggesting that the function φ ( t ) is equivalent to the effective reproduction number R ( t ). Another function φ ( τ ) represents the density of infection events as a function26f infection-age τ . Accordingly, we can immediately see that φ ( τ ) is exactly the sameas w ( τ ), the generation time distribution. That is, the above arguments suggest that A ( t, τ ) ( i.e. the rate at which an infectious individual at calendar time t and infection-age τ produces secondary transmission) is decomposed as: A ( t, τ ) = R ( t ) w ( τ ) (43)Inserting (43) into (39) yields an estimator of R ( t ) [92]:ˆ R ( t ) = j ( t ) R ∞ j ( t − τ ) w ( τ ) dτ (44)The above equation (44) is exactly what was proposed in applications to SARS [41]and foot and mouth disease [93]; i.e. discretizing (44) to apply it to the daily incidencedata ( i.e. using j i incident cases infected between time t i and time t i +1 and descretizedgeneration time distribution w i ), ˆ R ( t i ) = j i P nj =0 j i − j w j (45)was used as the estimator. However, it should be noted that the study in SARSimplicitly assumed that onset data c ( t ) at time t reflects the above discussed infectionevent j ( t ). That is, supposing that we observed c i onset cases reported between t i and t i +1 , R ( t ) was calculated as ˆ R ( t i ) = c i P nj =0 c i − j s j (46)where s j is the discretized serial interval which is defined as the time from onset of aprimary case to onset of the secondary cases [94][95]. The method permits reasonabletransformation of an epidemic curve ( i.e. temporal distribution of case onset) to the27stimates of time-inhomogeneous reproduction number R ( t ). Employing the relativelikelihood of case k infected by case l using the density function of serial interval s ( t ); i.e. , p ( k,l ) = s ( t k − t l | θ ) P m = k s ( t k − t m | θ ) (47)Using (47), expected value and variance of R ( t i ) are given by the following E ( R ( t i )) = 1 n t X l : t l = t n − q X k =1 p ( k,l ) V ar ( R ( t i )) = 1 n t n − q X k =1 X l : t l = t p ( k,l ) (1 − p ( k,l ) ) − X l,m : t l = t m = t p ( k,l ) p ( k,m ) ! (48)where n t is the total number of reported case onsets at time t [96].In the present day, only by using the above described methods (or similar conceptswith similar assumptions), we can transform epidemic curves into R ( t ) and roughlyassess the impact of control measures on an epidemic. However, whereas the equations(45) and (46) are similar in theory, we need to explicitly account for the difference be-tween onset and infection event. In fact, when there are many asymptomatic infectionsand asymptomatic secondary transmissions, serial interval is not equivalent to the gen-eration time, and thus, directly adopting the above methods would be inappropriate.Since this point is particularly important in analyzing influenza data, we discuss thisissue in Section 6. 28 Statistical methods to estimate R In the previous sections, we discussed several different methods to estimate R either by(i) employing detailed curve fitting method assuming a structured epidemic model or(ii) using the intrinsic growth rate (or doubling time [97][98]). Summarizing the abovediscussions, we believe that the readers should benefit from memorizing R = 1 /M ( − r )for the use of the intrinsic growth rate r in estimating R [6] and remembering the finalsize equation R = − ln(1 − p ) /p suggesting the severity of an epidemic as the theoreticalconcept. Indeed, estimator using either the intrinsic growth rate or final size has stillcontinued to play an important role in discussing R of pandemic influenza [63].However, it should be noted that deterministic models do not permit incorporatingstochasticity explicitly ( e.g. standard error for R is determined by measurement oferrors alone), as the models argue only average number of secondary transmissions within the assumed transmission dynamics. That is, our arguments given above exploreonly the time-evolution of influenza spread in the mean field . To address the issue ofvariation in secondary transmissions, full stochastic models are called for [99].From a viewpoint of data science, the discrete-time branching process, which is alsoreferred to as Galton-Watson process, can reasonably assess individual heterogeneityin secondary transmissions [100][101]. As we discussed the initial growth phase ofthe deterministic model, let us consider the same epidemic phase where we observe ageometric increase in the number of cases by generation [23]. We denote the initial29umber of infected individuals by C in generation 0. Then, during the first generation, C cases are produced by secondary transmissions of C . Similarly, let C n be thenumber of infections in generation n . The branching process of this type assumes thatevery infected individual has an independently and identically distributed stochasticrandom variable ρ ( n ) i representing the number of secondary cases produced by case i ingeneration ( n ), and that environmental stochasticity and immigration/emigration canbe ignored. Supposing that the pattern of secondary transmission follows a discreteprobability distribution p k with k secondary transmission(s); i.e. , p k = P r ( ρ ( n ) i = k ) ( k = 0 , , , ... ) (49)then, the expected number of secondary transmissions and the variance are given by R = E ( ρ ( n ) i ) = P ∞ k =1 kp k V ar ( ρ ( n ) i ) = P ∞ k =0 ( k − R ) p k (50)In other words, the concept of probability distribution p k reflects offspring dis-tribution in population ecology, and this permits explicit modeling of variations insecondary transmissions in infectious diseases [102][103]. This approach is particularlyimportant during the initial phase of an epidemic, because the number of infectiousindividuals is small in this stage, and thus, it is deemed essential to take into accountdemographic stochasticity, i.e. , variation in the numbers of secondary transmissionsby chance. Indeed, the model has been applied to observed outbreak data where weobserved the extinction before growing to a major epidemic [104][105].Let us briefly discuss the variation in secondary transmissions and an estimationmethod of R using the discrete-time branching process, deriving analytical expres-30ions of the expected number of infected individuals in generation n , M n = E ( C n ) andthe variance V n = V ar ( C n ). It is impossible to avoid using the probability generatingfunction (pgf) to discuss the branching process. The above described ρ ( n ) i character-ize positive and discrete number of secondary transmissions, and thus, is a non-zerodiscrete random variable. The pgf of ρ , g ρ ( s ) is given by g ρ ( s ) = E ( s ρ ) = ∞ X k =0 p k s k (51)There are two basic properties concerning g ( s ) in relation to the epidemic process.First, R is by definition the mean value of secondary transmissions (equation (50))and, thus given by g ′ (1). Second, the probability that an infected individual does notcause any secondary transmissions p = Pr( ρ =0) is given by g (0), which is useful fordiscussing threshold phenomena and extinction [101]. If we note that C = 1 (i.e. onlyone index case), the Galton-Watson process has the following pgf identity: g ( s ) = sg n +1 ( s ) = g n ( g ( s )) = g ( g n ( s )) (52)Even when there are C = a cases in generation 0 (where a > a different independent infection-trees and thus g C ( s ) = s a g C n ( s ) = ( g n ( s )) a (53)31rom the above discussions, the expected number of cases in generation n , M n , andthe variance V n is M n = R n M V n = nV ar ( ρ ) , ( R = 1) R n − V ar ( ρ ) R n − R − , ( R = 1) (54)The process grows geometrically if R >
1, stays constant if R = 1, and decays ge-ometrically if R <
1. These three cases are referred to as supercritical , critical ,and subcritical , respectively. However, unlike the deterministic model, it should beremembered that critical process does not permit continued transmissions, and rather,the process becomes extinct almost surely (i.e. probability of extinction given R = 1is one) [101].Mathematically, demographic stochasticity in transmission is represented by a Pois-son process, which has been practiced in the application of branching processes to epi-demics [17]. Assuming that mean value of secondary transmissions is a constant R ,the conditional distribution of observing C n +1 cases, given C n cases, follows a Poissondistribution: C n +1 | C n ∼ P oisson [ C n R ] (55)Supposing that we analyze influenza data documenting the generations of cases from0 to n in which we observed geometric growth, the likelihood of estimating R isproportional to n Y k =1 ( R C k − ) C k exp ( − R C k − ) (56)32ere we apply the above model to the Spanish influenza data in Zurich (Figure 3).Assuming that the generation time of length τ , w ( τ ), is given by the following deltafunction with the mean length 3 days, w ( τ ) = ∞ , f or τ = 30 , f or τ = 3 (57)then the observed series of data can be grouped by generation ( C , C , C , ...):1 , , , , , , , ... (58)Since we assumed exponential growth during the initial 16 days in the previous sec-tion, here we similarly assume a geometric increase up to the 6th generation. Applying(56) to the above data, maximum likelihood estimate of R (and the corresponding95 percent confidence intervals) is 1.51 (1.24, 1.81). The model is simple enough toestimate R , and indeed, a slight extension of the discrete-time branching process hasbeen employed to estimate R as well as the proportion of undiagnosed cases in theanalysis of SARS outbreak data [106].It should be noted that the discrete-time branching process assumes homogeneouspattern of spread. A technical issue has arisen on this subject during the SARS out-break. Usually, we observe some cases who produce an extraordinary number of sec-ondary cases compared with other infected individuals, which are referred to as super-spreaders . Because of this, observed offspring distributions for directly transmitteddiseases tend to be extremely skewed to the right. Empirically, it has been suggestedthat Poisson offspring distribution is sometimes insufficient to highlight the presence33f superspreaders in epidemic modeling [107]. For example, if non-zero discrete dis-tribution of secondary cases follows a geometric distribution with mean R , the pgf isgiven by a geometric distribution with mean R : g ( s ) = 11 + R (1 − s ) (59)Moreover, if the offspring distribution follows gamma distribution with mean R anddispersion parameter k , the pgf g ( s ) follows negative binomial distribution [108]: g ( s ) = (cid:18) Rk (1 − s ) (cid:19) − k (60)We still do not know if pandemic influenza is also the case to warrant the skewedoffspring distributions. To explicitly test if superspreading events frequently exist ininfluenza transmission, it is necessary to accumulate contact tracing data for this dif-ficult disease, the cases of which often show flu-like symptoms only (as discussed inSection 1). In addition, it should be noted that we cannot attribute the skewed off-spring distribution to the underlying contact network only. To date, there are twomajor reasons which can generate superspreaders: (i) those who experience very fre-quent contacts (social superspreader) or (ii) those who are suffering from high pathogenloads or those who can scatter the pathogen through the air such as the use of nebuliserin hospitals (biological superspreader). From the offspring distribution only, we cannotdistinguish these two mechanisms. With regard to the estimation of R using final size, we briefly discuss another methodbased on a stochastic process. As we discussed above, let S ( t ), I ( t ) and U ( t ) be the34umbers of susceptible, infectious and recovered individuals at time t , respectively.Further, let β and 1 /γ denote the transmission rate and the mean duration of theinfectious period, respectively. Supposing that K ( t ), the number of individuals whoexperienced infection between time 0 and time t , is given by K ( t ) = S (0) − S ( t ),the two processes K ( t ) and U ( t ) are increasing counting processes where the generalepidemic is explained by: P r ( dK ( t ) = 1 , dU ( t ) = 0 | Z t ) = β ¯ S ( t ) I ( t ) dtP r ( dK ( t ) = 0 , dU ( t ) = 1 | Z t ) = γI ( t ) dtP r ( dK ( t ) = 0 , dU ( t ) = 0 | Z t ) = 1 − β ¯ S ( t ) I ( t ) dt − γI ( t ) dt (61)where Z t denotes the σ -algebra generated by the history of the epidemic { S ( u ) , I ( u ); 0 < u < t } and ¯ S ( t ) = S ( t ) /n (where n is the size of the susceptible population at time 0). Thelatter is equivalent to assuming density-independent transmission ( i.e. also referredto as true mass action or frequency dependent assumption [109]). Based on equation(61), two zero-mean martingales [110] are defined by: M ( t ) = K ( t ) − R t β ¯ S ( u ) I ( u ) duM ( t ) = U ( t ) − R t γI ( u ) du (62)From the martingale theory [111], a zero-mean martingale is given by M ( t ) = R t S ( u ) dM ( u ) − βγ M ( t )= nS (0) + nS (0) − ... + nS ( t ) + 1 − βγ U ( t ) (63)Thus, the estimator ˆ θ = ˆ β/ ˆ γ is given byˆ θ = (cid:20) nS (0) + nS (0) − ... + nS ( T ) + 1 (cid:21) U ( T )= − ln (1 − ˜ p ) U ( T ) (64)35here ˜ p is the observed final size (= 1 − S ( T ) /n ) at the end of the epidemic at time T . Furthermore, the variance of the zero-martingale is given by V ar ( M ( t )) = V ar ( M ( t )) + θ V ar ( M ( t )) (65)From the martingale central limit theorem [112], the estimator θ is approximatelynormally distributed in a major outbreak in a large community. The standard error isthen consistently estimated by: s.e. (ˆ θ ) = (cid:20) nS (0) + n ( S (0) − + ... + n ( S ( T ) + 1) − ˆ θ R ( T ) (cid:21) U ( T )= nS (0) + 12 + nS (0) + 12 − ˆ θ R ( T ) U ( T ) (66)Consequently, the estimator and standard error of R are given by:ˆ R = n × ˆ θs.e. ( ˆ R ) = n × s.e. (ˆ θ ) (67)More detailed mathematical descriptions can be found elsewhere [74][113][114].Here we show a numerical example. Let us consider a large epidemic of equineinfluenza ( i.e. influenza in horses) as our case study. In 1971, a nationwide epidemicof equine-2 influenza A (H3N8) was observed in Japan [115]. For example, in NiigataRacecourse, 580 influenza cases were diagnosed with influenza among a total of 640susceptible horses. The final size p is thus 90.6 percent (95 percent CI: 88.4, 92.9).From this data, we calculate R and its uncertainty bounds.Using ˜ p = 0 .
906 and total number of infected U ( T ) = 580 in equation (64), ˆ θ is36stimated as 0.00408. Therefore, the estimate of R = 2 .
60 is given by equation (67).Moreover, from equation (66) where S ( T ) = 639 −
579 = 60 and S (0)(= n ) = 639 (weassume one case was already infected at time t = 0), we obtain s.e. (ˆ θ ) = 0 . θ is assumed to follow normal distribution. Therefore, the 95 percent confidenceinterval for R is given as [ R − . × × . , R + 1 . × × . . , . β and γ are independent of time ( i.e. constant), and thus, that any extrinsicfactors should not have influenced the course of the observed epidemic. In the above described models, we always assumed that the pattern of influenza trans-mission is homogeneous, which is clearly unrealistic to capture the transmission dy-namics of influenza. Since the last century, it has already been understood that thetransmission dynamics are not sufficiently modeled by assuming homogeneous mixing.However, because more detailed data are lacking ( e.g. epidemic records of pandemicinfluenza with time, age and space), what we could offer has been mainly to extractthe intrinsic growth rate from the initial exponential growth, and estimate R usingthe estimator based on a model with the homogeneous mixing assumption.One line of addressing heterogeneous patterns of transmission using the observeddata is separating household transmission from community transmission. In other37ords, it is of practical importance to distinguish between individual and group R [116]. From the beginning of explicit modeling of influenza [117][118], a method toseparately estimate the transmission parameters has been proposed, which has beenpartly extended in a recent study [92] or applied to further old data of pandemic in-fluenza [119]. Indeed, an important aspect of this issue was highlighted in a recentstudy which compared estimates of R between those having casual and close contacts[63]. To estimate key parameters of household and community transmissions of in-fluenza, or to simulate realistic patterns of influenza spread, such a consideration isfruitful.Mathematically elaborating this concept, there are several publications which pro-posed the basis of analyzing household transmission data employing stochastic models[120][121][122]. Moreover, a rigorous study has been made to estimate parameters de-termining the intrinsic dynamics ( e.g. infectious period) using household transmissiondata with time [123].Future challenges on the estimation of R include the application of such theo-ries to the observed data with some extension. For example, as we discussed above,knowing the generation time would be crucial to elucidate a robust estimate of R [6][89][92][90]. However, we do not know if the generation time varies between closeand casual contacts; this should be the case, because, as long as the generation timeis given by covolution of latent and infectious periods, close contact should lead toshorter generation time than casual contact. In future studies, influenza models maybetter to highlight the increasing importance of considering household transmission to38stimate the transmission potential using the temporal distribution of infection events. Except for our approach in Section 3, mathematical arguments given in this paperare not particularly special for influenza. In other words, we modelers have employedsimilarly structured models which describe the population dynamics of other directlytransmitted diseases, and such models are applicable not only for influenza but alsofor many viral diseases including measles, smallpox, chickenpox, rubella and so on[18]. However, influenza has many different epidemiologic characteristics comparedto other childhood viral diseases. For instance, following the previous efforts in in-fluenza epidemiology [124][125][126] and modeling [127][128], we should at least notethe following:1. Detailed mechanisms of immunity have yet to be clarified. Since influenza virushas an wide antigenic diversity ( i.e. unlike other childhood viral diseases, anti-genic stimulation is not monoclonal), this complicates our understanding in thefraction of immune individuals, cross-protection mechanisms and evolutionarydynamics [129][130][131][132].2. Flu-like symptoms are too common, and thus, we cannnot explicitly distinguishinfluenza from other common viral infections without expensive laboratory testsfor each case. Because of this character, it is difficult to effectively implementusual public health measures ( e.g. contact tracing and isolation).39. Although explicit estimates are limited [133][134], a certain fraction of infectedindividuals does not exhibit any symptoms (following infection). This compli-cates not only the eradication [135] but also epidemiologic evaluations of vaccinesand therapeutics [136].4. Looking into the details of the intrinsic dynamics [4][123], it appears recentlythat the generation time and infectious period are much shorter than what werebelieved previously. Therefore, despite the relatively small R estimate, the turn-over of a transmission cycle ( i.e. speed of growth) is rather quick. The incubationperiod of Spanish influenza is as short as 1.5 days [137], complicating the imple-mentation of quarantine measures [138].Thus, depending on the characteristics of observed data (and the specific purpose ofmodeling), we have to highlight these factors referring to the best available evidence.This is one of the most challenging issues in designing public health interventionsagainst a potential future pandemic. In addition to the above described issue, we, of course, must remember what the re-ported data is. In many studies, the compartment I ( t ) or relevant class of infectiousindividuals of the SIR (or SEIR) model was fitted to the observed data. Indeed, in themajority of previous classic studies, R ( t ) ( i.e. removed class; denoted by U ( t ) in ourdiscussion) of Kermack and McKendrick model was fitted to the data, assuming thatthe removed class highlights observed data as the reported cases no longer produce40econdary cases. However, the observed epidemiologic data is actually neither I ( t ) nor U ( t ). Always, what we get as the temporal distribution reflects case onset or deaths with time which is mostly accompanied by some reporting delay.We believe this is one of the most challenging issues in epidemic modeling. Ex-cept for rare examples in sexually transmitted diseases, infection event is not directlyobservable, and thus, we have to maximize the utility of reported (observed) data,explicitly understanding what the data represents.In this case, back-calculation of the infection events is called for. Let c ( t ) denotethe number of onsets at time t , this should be modeled by using incidence j ( t ) and thedensity of the incubation period of length τ , f ( τ ): c ( t ) = Z ∞ j ( t − τ ) f ( τ ) dτ (68)Further, supposing that b ( t ) is the number of reported cases at time t and the densityof reporting delay of length σ is h ( σ ), observed data is modeled as: b ( t ) = R ∞ c ( t − s ) h ( s ) ds = R ∞ R ∞ j ( t − s − τ ) f ( τ ) dτ h ( s ) ds (69)That is, only by using the observed data b ( t ) and known information of the reportingdelay h ( s ) and incubation period distributions f ( τ ), we can translate the observed datainto infection process j ( t ).In some cases, only death data with time, d ( t ), is available [60]. Similarly, thiscan be modeled using the backcalculation. Let q denote the case fatality of influenzawhich is reasonably assumed time-independent, and further let m ( u ) be the relative41requency of time from onset to death, d ( t ) is given by: d ( t ) = q Z ∞ c ( t − u ) m ( u ) du (70)Even when using onset data with delay or death data, it should be noted that theintrinsic growth rate is identical to that estimated from the infection event distribution.Assuming that the incidence j ( t ) exhibits exponential growth during the initial phaseof an epidemic, i.e. , j ( t ) = k exp( rt ), equations (68) and (70) can be rewritten as b ( t ) = R ∞ R ∞ k exp( r ( t − s − τ )) f ( τ ) dτ h ( s ) ds = k exp( rt ) R ∞ R ∞ exp( − r ( s + τ )) f ( τ ) dτ h ( s ) ds (71)and d ( t ) = q R ∞ R ∞ j ( t − u − τ ) f ( τ ) dτ m ( u ) du = q R ∞ R ∞ k exp( r ( t − u − τ )) f ( τ ) dτ m ( u ) du = qk exp( rt ) R ∞ R ∞ exp( − r ( u + τ )) f ( τ ) dτ m ( u ) du (72)Thus, the growth terms exp( rt ) ( i.e. which depends on time) of b ( t ) and d ( t ) are stillidentical to that of incidence j ( t ). In other words, mathematically the equations (71)and (72) could be a justification to extract an estimate of the intrinsic growth rate fromcases with reporting delay or deaths with time. However, we should always rememberthat the infection-age distribution is not stable during the initial phase, and moreover,this method cannot address individual variation in the secondary transmissions ( e.g. superspreaders, as we discussed in Section 5). In this way, it’s not an easy task to clarify the infection events with time. A simi-lar application of the convolution equation has been intensively studied in modeling42IV/AIDS. Since AIDS has a long incubation period, and because AIDS diagnosisis certainly reported in the surveillance data (at least, in industrialized countries),backcalculation of the number of HIV infections with time using the nubmer of AIDSdiagnoses and the incubation period distribution has been an issue to capture the wholeepidemiologic picture of HIV/AIDS [139][140][141][142]. In the current modeling prac-tice using the temporal distribution of onset events, we are now faced with a need toapply this technique to diseases with much shorter incubation periods.Now, let’s look back at a method to estimate R ( t ), which was proposed by Wallingaand Teunis [41]. Whereas the method has a background of mathematical reasoning(as shown in (46), Section 4.2), the estimator was derived implicitly assuming that observed data exactly reflects infection events . If asymptomatic infection and transmis-sion are rare, this assumption might be justifiable as the lengths of the serial intervaland generation time become almost identical. However, as long as we cannot ignoreasymptomatic transmissions, which is particularly the case for influenza, the assump-tion s ( σ ) = w ( τ ) might be problematic [85].Since R ( t ) of this method was given by summing up the probability of causingsecondary transmissions by an onset case at the onset time of this case t , we shouldrewrite the assumption using a modified onset-based renewal equation as follows [60]: c ( t ) = Z t c ( t − τ ) R ( t − τ ) s ( τ ) dτ (73)For simplicity, we ignore reporting delay in the observed data, roughly assuming thatthe observed data reflects c ( t ). Translating equation (73) in words, it is implicitlyassumed that secondary transmission happens exactly at the time of onset , and based43n this assumption, R ( t ) in the right hand side of (73) can be backcalculated.To understand the assumptions behind the above equation, let us assume thatincidence j ( t ) is given by j ( t ) = S ( t ) Z ∞ β ( σ )Γ( σ ) c ( t − σ ) dσ (74)where β ( σ ) is the transmission rate at disease-age σ ( i.e. the time since onset ofinfection [143]) and Γ( σ ) is the survivorship of cases following onset. It should benoted that equation (74) ignores secondary transmissions before onset of illness. Aswe discussed above, c ( t ) is given by j ( t ) and the incubation period distribution f ( τ ), c ( t ) = Z ∞ j ( t − τ ) f ( τ ) dτ (75)Replacing c ( t ) in the right hand side of (74) by (75), we get j ( t ) = S ( t ) Z ∞ j ( t − s ) φ ( s ) ds (76)where s represents infection-age ( i.e. time since infection), and φ ( s ) is given by φ ( s ) = Z s β ( a )Γ( a ) f ( s − a ) da (77)which represents generation time distribution. From equations (76) and (77), we canfind that R ( t ) is given by R ( t ) = S ( t ) R ∞ φ ( s ) ds = S ( t ) R ∞ β ( σ )Γ( σ ) dσ R ∞ f ( τ ) dτ (78)Equation (78) can be further reduced to R ( t ) = R S ( t ) /S (0) which represents Kermackand McKendrick’s assumption. Replacing j ( t ) in the right hand side of (75) by (74),44e get c ( t ) = Z ∞ ψ ( t, σ ) c ( t − σ ) dσ (79)where ψ ( t, σ ) denotes the serial interval distribution of calender time t and disease-age σ : ψ ( t, σ ) = Z σ β ( σ − τ )Γ( σ − τ ) f ( τ ) S ( t − τ ) dτ (80)Equation (80) is difficult to solve as it includes S ( t − τ ) in the right hand side. However,in the special case, e.g. , let’s say when we can assume β ( τ )Γ( τ ) = kδ ( τ ) (where k isconstant and δ ( · ) is delta function), ψ ( t, σ ) = kf ( σ ) S ( t − σ ) (81)Inserting (81) back to (79), c ( t ) = R ∞ kf ( σ ) S ( t − σ ) c ( t − σ ) dσ = R ∞ R ( t − σ ) c ( t − σ ) f ( σ ) R ∞ f ( τ ) dτ dσ (82)which is onset-based renewal equation which was presented in (73). What to be learntfrom (82) is, the assumption that secondary transmission happens immediately afteronset suggests that the incubation period distribution is identical to the serial intervaldistribution as shown above, which is a bit funny conclusion. Maximizing the utility ofobserved data has still remained an open question.In addition to modeling the temporal distribution, explicit modeling of asymp-tomatic infection is also called for [135]. Provided that there are so many asymptomatictransmissions which are not in the negligible order, we need to shift our concept oftransmissibility; e.g. , rather than R , a threshold quantity of symptomatic infection is45equired. In such a case, application of type-reproduction number T might be useful[144][145], and it has already been put into practice [146]. In this review, we focused on the use of the temporal distribution of influenza toestimate R (or R ( t )) and the relevant key parameters. It must be remembered thatour arguments, almost necessarily, employed homogeneous mixing assumption, as wecannot extract information on heterogeneous patterns of infection from a single streamof temporal data alone. Presently, more information ( e.g. at least, spatio-temporaldistribution) is becoming available for influenza. In this section, we briefly sketch whatcan be (and should be) done in the future to quantify the transmission dynamics ofpandemic influenza. R really means? It’s not a new issue that heterogeneous patterns of transmission could even destroy themean field theory in infectious diseases. For example, in a pioneering study of gon-orrhea transmission dynamics by Hethcote and Yorke [147], an importance of contactheterogeneity was sufficiently highlighted. Since a simple model assuming homogeneousmixing did not reflect the patterns of gonorrhea transmission in the United States, Het-hcote and Yorke divided the population in question into two; those who are sexuallyvery active and not, the former of which was referred to as core group. Comparedwith the temporal distribution of infection given by a model with homogeneous mixing46ssumption, the simple heterogeneous model with a core group revealed much quickerincrease in epidemic size, showing rather different trajectory of an epidemic. Giventhat the variance of sexual partnership is extremely large ( i.e. if the distribution ofthe frequency of sexual intercourse is extremely skewed to the right with a very longright tail), the estimate of R is shown to become considerably high. The finding sup-ports a vulnerability of our society to the invasion of sexually transmitted diseases.Following this pioneering study, considerable efforts have been made to approximatelymodel the heterogeneous patterns of transmission using extended mean field equations[18][26][148][149].In addition to such an approximation of heterogeneous transmission, recent progressin epidemic modeling with explicit contact network structures suggests that vari-ance of the contact frequency plays a key role in determining the threshold quantity,and in some special cases, the concept of threshold phenomena could be confused[150][151][152][153]. In Section 4, we defined the force of infection as λ ( t ) = Z ∞ β ( τ ) i ( t, τ ) dτ (83)In deteministic models given by simple ODEs (which ignores infection-age), λ ( t ) isequivalent to βI ( t ). These are what classical mean field models suggest.Let us account for an epidemic on networks, whose node-connectivity distribution( i.e. the distribution of probabilities that nodes have exactly k neighbors) follows someexplicit distribution P ( k ). The force of infection λ c , which yields R = 1, in a staticcontact network is given by λ c = h k ih k i (84)47ere h k i denotes the average connectivity of the nodes. Assuming that P ( k ) follows apower law of the form P ( k ) = ck − v (where c is constant), h k i = c X k k − v (85)Given that v ≤ λ c = 0, and in such a case, R even becomes infinite. This impliesthat the disease spread will continue for any mean estimate of R . Such a networkstructure is referred to as scale free [154], complicating disease control efforts in publichealth [155][156]. The importance of the network structure would also be highlightedfor v > v is estimated to be around 3 or alittle larger [157]. Following such a finding, many non-sexual directly transmitteddiseases are also modeled in the present day assuming the scale-free network [150].However, it should be noted that the pattern of contact does not necessarily followscale-free for all directly transmitted diseases. Indeed, there is no empirical evidencewhich suggests that the contact structure of any droplet infections follows the powerlaw ( i.e. we do not know if the above described contact heterogeneity is the case fordiseases except for sexually transmitted diseases). A typical example of confusion isseen in the superspreading events during the 2002-03 SARS epidemics [158], wherewe cannot explicitly attribute the phenomena either to contact network or biologicalfactors (as long as contact and infection event are not directly observable). We stilldo not know how we should account for the distribution P ( k ) for influenza and otherviral respiratory diseases ( i.e. power law or not) which remains to be clarified for each48isease in future research. Methodoligical developments have been made to account for the network heterogeneitywith data [159]. An approximate approach to address this issue is highlighted partic-ularly in spatio-temporal modeling, an excellent account of which is reviewed by MattKeeling [160].Even though it’s difficult to quantify the transmission dynamics with an explicitcontact network with time, there are useful analytical approximations to capture thedynamics of influenza (and other respiratory transmitted viral diseases) and estimatethe transmission potential. For example, the force of infection with a power law ap-proximation is reasonably given by: λ ( t ) = βI ( t ) α S ( t ) (86)In (86), α and Ψ characterize the epidemic dynamics; e.g. initial growth ( i.e if α isless than 0, the modified form acts to dampen the exponential growth of incidence)and endemic equilibrium ( i.e. when Ψ is greater than 0, density-dependent damping isincreased). A model of this type was actually validated with measles data in Englandand Wales, comparing the prediction with that of employing the mass action principle[161].Another approximation might be a pair-wise model [162], which can explicitly ac-count for the correlation between connected pairs. The model reasonably permitsderiving the force of infection λ using the number of various connected paris, which49mplies wide applicability to the epidemiologic data of sexually transmitted infections.Incorporating spatial heterogeneity in an approximate manner would shed light on fur-ther quantifications [163][164], and thus, simple and reasonably tractable models whichpermit spatio-temporal modeling of influenza are expected ( e.g. [165]). Summarizing the above discussions, we have presented modeling approaches that canquantify the transmission potential of pandemic influenza. As we have shown, temporalcase distributions have been analyzed in many instances, and previous efforts have comeclose to maximize the utility of temporal distributions ( i.e. epidemic curve). However,at the same time, we have also learned that we can extract almost the intrinsic growthrate alone from a single time-evolution data. Accordingly, we are now faced with a needto clarify heterogeneous patterns of transmission and more detailed intrinsic dynamicsof influenza [166][167][168]. With regard to the latter, primitive epidemiologic questions( e.g. probability of clinical attack given infection) remain to be answered for Spanish,Asian and Hong Kong influenza. Let’s summarize what we need to clarify theoreticallyabout pandemic influenza in list:1. Acquired immunity2. Evolutionary dynamics3. Multi-host species transmission4. Asymptomatic transmission 50. Attack rate ( i.e.
Pr(onset | infection))6. Case fatality ( i.e. Pr(death | onset))7. Generation time and serial interval8. Latent, incubation, infectious and symptomatic periods with further data9. Transmission potential with time, space and antigenic types10. Transmission potential with time and ageThese issues highlight an importance to quantify the transmission of influenza usingnot only cases ( i.e. those followed onset of symptoms) but also some hint suggesting theinfection event. For example, majority of the above listed issues could be reasonablyaddressed by implementing serological surveys ( e.g. antibody titers of individuals and,preferably, time-delay delay distribution from infection to seroconversion). Since theproportion of those who do not experience symptomatic infection ( i.e. probabilityof asymptomatic infection) is not small for influenza [64][146], case records can tellus little to address the above mentioned issues, and thus, historical data of Spanishinfluenza may hardly offer further information. By maximizing the utility of observeddata, we have to clarify the dynamics of influenza further, and identify key informationwhich characterize the specific mechanisms of spread.51 eferences [1] L. Simonsen, The global impact of influenza on morbidity and mortality. Vaccine (1999) S3-S10.[2] A.W. Park and K. Glass, Dynamic patterns of avian and human influenza in eastand southeast Asia. Lancet Infectious Diseases (2007) 543-548.[3] R.G. Webster, W.J. Bean, O.T. Gorman, T.M. Chambers and Y. Kawaoka, Evo-lution and ecology of influenza A viruses. Microbiological Reviews (1992)152-179.[4] N.M. Ferguson, D.A.T. Cummings, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai,S. Iamsirithaworn and D.S. Burke, Strategies for containing an emerging influenzapandemic in Southeast Asia. Nature (2005) 209-214.[5] N.M. Ferguson, D.A. Cummings, C. Fraser, J.C. Cajka, P.C. Cooley and D.S.Burke, Strategies for mitigating an influenza pandemic.
Nature (2006) 448-452.[6] J. Wallinga and M. Lipsitch, How generation intervals shape the relationshipbetween growth rates and reproductive numbers.
Proceedings B (2007) 599-604.[7] I.M. Longini, M.E. Halloran, A. Nizam and Y. Yang, Containing pandemic in-fluenza with antiviral agents.
American Journal of Epidemiology (2004) 623-633. 528] C.E. Mills, J.M. Robins and M. Lipsitch, Transmissibility of 1918 pandemic in-fluenza.
Nature (2004) 904-906.[9] K. Nicholson, A. Hay, Textbook of influenza (Blackwell, Oxford, 1998)[10] B.A. Cunha, Influenza: historical aspects of epidemics and pandemics.
InfectiousDisease Clinics of North America (2004) 141-155.[11] C.J. Murray, A.D. Lopez, B. Chin, D. Feehan and K.H. Hill, Estimation ofpotential global pandemic influenza mortality on the basis of vital registry datafrom the 1918-20 pandemic: a quantitative analysis. Lancet (2006) 2211-2218.[12] E. Sydenstricker, Variations in case fatality during the influenza epidemic of 1918.
Public Health Reports (1921) 2201-2211.[13] H. Markel, H.B. Lipman, J.A. Navarro, A. Sloan, J.R. Michalsen, A.M. Stern andM.S. Cetron, Nonpharmaceutical interventions implemented by US cities duringthe 1918-1919 influenza pandemic. JAMA (2007) 644-654.[14] K.D. Patterson and G.F. Pyle, The geography and mortality of the 1918 influenzapandemic.
Bulletin of the History of Medicine (1991) 4-21.[15] N.P. Johnson and J. Mueller, Updating the accounts: global mortality of the1918-1920 ”Spanish” influenza pandemic. Bulletin of the History of Medicine (2002) 105-115. 5316] L. MacKellar, Pandemic influenza: A review. Population and Development Re-view (2007) 429-451.[17] O. Diekmann and J.A.P. Heesterbeek, Mathematical Epidemiology of InfectiousDiseases: Model Building, Analysis and Interpretation (John Wiley and Sons,New York, 2000).[18] R.M. Anderson and R.M. May, Infectious Diseases of Humans: Dynamics andControl (Oxford University Press, Oxford, 1991).[19] R.M. Anderson and R.M. May, Directly transmitted infectious diseases: controlby vaccination. Science (1982) 1053-1060.[20] C.E. Smith, Factors in the transmission of virus infections from animal to man.
The Scientific Basis of Medicine Annual Reviews (1964) 125-150.[21] M.E. Halloran, M. Haber, I.M. Longini and C.J. Struchiner, Direct and indirecteffects in vaccine efficacy and effectiveness.
American Journal of Epidemiology (1991) 323-331.[22] P.E. Fine, Herd immunity: history, theory, practice.
Epidemiologic Reviews (1993) 265-302.[23] H. Nishiura, K. Dietz and M. Eichner, The earliest notes on the reproductionnumber in relation to herd immunity: Theophil Lotz and smallpox vaccination. Journal of Theoretical Biology (2006) 964-967.5424] J.A.P. Heesterbeek, A brief history of R and a recipe for its calculation. ActaBiotheoretica (2002) 189-204.[25] H. Nishiura and H. Inaba, Discussion: Emergence of the concept of the ba-sic reproduction number from mathematical demography. Journal of TheoreticalBiology (2007) 357-364.[26] O. Diekmann, J.A.P. Heesterbeek and J.A.J. Metz, On the definition and thecomputation of the basic reproductive ratio R in models for infectious diseases. Journal of Mathematical Biology (1990) 503-522.[27] P. van den Driessche and J. Watmough, Reproduction numbers and sub-thresholdendemic equilibria for compartmental models of disease transmission. Mathemat-ical Biosciences (2002) 29-48.[28] C. Castillo-Chavez, Z. Feng and W. Huang, On the computation of R and its roleon global stability, in: Mathematical Approaches for Emerging and ReemergingInfectious Diseases: An Introduction, IMA Volume 125 (Springer-Veralg, Berlin,2002) pp. 229-250.[29] J.M. Hyman and J. Li, An intuitive formulation for the reproductive number forthe spread of of diseases in heterogeneous populations. Mathematical Biosciences (2000) 65-86.[30] J.M. Heffernan, R.J. Smith and L.M. Wahl, Perspectives on the basic reproduc-tive ratio.
Journal of the Royal Society Interface (2005) 281-293.5531] J.M. Heffernan and L.M. Wahl, Improving estimates of the basic reproductiveratio: using both the mean and the dispersal of transition times. TheoreticalPopulation Biology (2006) 135-145.[32] H. Nishiura, Mathematical and statistical analyses of the spread of dengue. Dengue Bulletin (2006) 51-67.[33] M.J. Keeling and B.T. Grenfell, Individual-based perspectives on R(0). Journalof Theoretical Biology (2000) 51-61.[34] D.G. Kendall, Deterministic and stochastic epidemics in closed populations. in:Third Berkeley Symposium on Mathematical Statistics and Probability 4, ed P.Newman (University of California Press, New York, 1956) pp.149-165.[35] K. Dietz, The estimation of the basic reproduction number for infectious diseases.
Statistical Methods in Medical Research (1993) 23-41.[36] M.C. De Jong, O. Diekmann and J.A. Heesterbeek, The computation of R fordiscrete-time epidemic models with dynamic heterogeneity. Mathematical Bio-sciences (1994) 97-114.[37] M. Lipsitch, T. Cohen, B. Cooper, J.M. Robins, S. Ma, L. James, G. Gopalakr-ishna, S.K. Chew, C.C. Tan, M.H. Samore, D. Fisman and M. Murray, Trans-mission dynamics and control of severe acute respiratory syndrome.
Science (2003) 1966-1970. 5638] G. Chowell, C.E. Ammon, N.W. Hengartner and J.M. Hyman, TransmissionDynamics of the Great Influenza Pandemic of 1918 in Geneva, Switzerland: As-sessing the Effects of Hypothetical Interventions.
Journal of Theoretical Biology (2006) 193-204.[39] G. Chowell, H. Nishiura and L.M. Bettencourt, Comparative estimation of thereproduction number for pandemic influenza from daily case notification data.
Journal of the Royal Society Interface (2007) 155-166.[40] G. Chowell, C.E. Ammon, N.W. Hengartner and J.M. Hyman, Estimating thereproduction number from the initial phase of the Spanish flu pandemic waves inGeneva, Switzerland. Mathematical Biosciences and Engineering (2007) 457-470.[41] J. Wallinga and P. Teunis, Different epidemic curves for severe acute respira-tory syndrome reveal similar impacts of control measures. American Journal ofEpidemiology (2004) 509-516.[42] S. Cauchemez, P.Y. Boelle, G. Thomas and A.J. Valleron, Estimating in real timethe efficacy of measures to control emerging communicable diseases.
AmericanJournal of Epidemiology (2006) 591-597.[43] L.M.A. Bettencourt, R.M. Ribeiro, G. Chowell, T. Lant and C. Castillo-Chavez,Towards real time epidemiology: data assimilation, modeling and anomaly detec-tion of health surveillance data streams. in: Intelligence and security informatics:Biosurveillance. Proceedings of the 2nd NSF Workshop, Biosurveillance, 2007.57ecture Notes in Computer Science. eds F. Zeng et al. (Springer-Verlag, Berlin,2007) pp. 79-90.[44] H. Nishiura, M. Schwehm, M. Kakehashi and M. Eichner, Transmission potentialof primary pneumonic plague: time inhomogeneous evaluation based on historicaldocuments of the transmission network.
Journal of Epidemiology and CommunityHealth (2006) 640-645.[45] H. Muench, Catalytic Models in Epidemiology (Harvard University Press, Cam-bridge, Massachusetts, 1959).[46] D. Schenzle, K. Dietz and G.G. Frosner, Antibody against hepatitis A in sevenEuropean countries. II. Statistical analysis of cross-sectional surveys. AmericanJournal of Epidemiology (1979) 70-76.[47] C.P. Farrington, Modelling forces of infection for measles, mumps and rubella.
Statistics in Medicine (1990) 953-967.[48] N. Keiding, Age-specific incidence and prevalence: a statistical perspective. Jour-nal of the Royal Statistical Society Series A (1991) 371-412.[49] A.E. Ades and D.J. Nokes, Modeling age- and time-specific incidence from sero-prevalence:toxoplasmosis.
American Journal of Epidemiology (1993) 1022-1034. 5850] H.J. Whitaker and C.P. Farrington, Estimation of infectious disease parame-ters from serological survey data: the impact of regular epidemics.
Statistics inMedicine (2004) 2429-2443.[51] C.P. Farrington, M.N. Kanaan and N.J. Gay, Estimation of the basic reproduc-tion number for infectious diseases from age-stratified serological survey data. Applied Statistics (2001) 251-283.[52] K. Satou and H. Nishiura, Transmission dynamics of hepatitis E among swine:potential impact upon human infection. BMC Veterinary Research (2007) 9.[53] G. Chowell, M.A. Miller and C. Viboud, Seasonal influenza in the United States,France, and Australia: transmission and prospects for control. Epidemiology andInfection (2007) in press.[54] A. Flahault, S. Letrait, P. Blin, S. Hazout, J. Menares and A.J. Valleron, Mod-elling the 1985 influenza epidemic in France.
Statistics in Medicine (1988)1147-1155.[55] C.C. Spicer, The mathematical modelling of influenza epidemics. British MedicalBulletin (1979) 23-28.[56] C.C. Spicer, C.J. Lawrence, Epidemic influenza in Greater London. Journal ofHygiene (1984) 105-112. 5957] C. Viboud, T. Tam, D. Fleming, A. Handel, M.A. Miller and L. Simonsen, Trans-missibility and mortality impact of epidemic and pandemic influenza, with em-phasis on the unusually deadly 1951 epidemic. Vaccine (2006) 6701-6707.[58] R. Gani, H. Hughes, D.M. Fleming, T. Griffin, J. Medlock and S. Leach, Poten-tial impact of antiviral drug use during influenza pandemic. Emerging InfectiousDiseases (2005) 1355-1362.[59] E. Massad, M.N. Burattini, F.A. Coutinho and L.F. Lopez, The 1918 influenza Aepidemic in the city of Sao Paulo, Brazil. Medical Hypotheses (2007) 442-445.[60] H. Nishiura, Time variations in the transmissibility of pandemic influenza inPrussia, Germany, from 1918-19. Theoretical Biology and Medical Modelling (2007) 20.[61] G. Sertsou, N. Wilson, M. Baker, P. Nelson and M.G. Roberts, Key transmissionparameters of an institutional outbreak during the 1918 influenza pandemic es-timated by mathematical modelling. Theoretical Biology and Medical Modelling (2006) 38.[62] V. Andreasen, C. Viboud and L. Simonsen, Epidemiologic characterization of thesummer wave of the 1918 influenza pandemic in Copenhagen: Implications forpandemic control strategies. Journal of Infectious Diseases (2008) in press.[63] E. Vynnycky, A. Trindall and P. Mangtani, Estimates of the reproduction num-bers of Spanish influenza using morbidity data.
International Journal of Epi-demiology (2007) 881-889. 6064] J.D. Mathews, C.T. McCaw, J. McVernon, E.S. McBryde and J.M. McCaw, Abiological model for influenza transmission: pandemic planning implications ofasymptomatic infection and immunity. PLoS ONE (2007) e1220.[65] G. Chowell, L.M.A. Bettencourt, N. Johnson, W.J. Alonso and C. Viboud, The1918-1919 influenza pandemic in England and Wales: Spatial patterns in trans-missibility and mortality impact Proceedings of the Royal Society B (2008) inpress.[66] L. Sattenspiel and D.A. Herring, Structured epidemic models and the spread ofinfluenza in the central Canadian subarctic.
Human Biology (1998) 91-115.[67] L. Sattenspiel and D.A. Herring, Simulating the effect of quarantine on the spreadof the 1918-19 flu in central Canada. Bulletin of Mathematical Biology (2003)1-26.[68] F. Brauer and C. Castillo-Chavez, Mathematical Models in Population Biologyand Epidemiology (Springer-Verlag, New York, 2000)[69] H.W. Hethcote, The mathematics of infectious diseases. SIAM Review (2000)599-653.[70] R. Ross, The Prevention of Malaria (John Murray, London, 1911).[71] K. Dietz, Mathematical models for transmission and control of malaria. in:Malaria, Principles and Practice of Malariology, eds W.H. Wernsdorfer and I.McGregor (Churchill Livingstone, Edinburgh, 1988) pp.1091-1133.6172] W.O. Kermack and A.G. McKendrick, Contributions to the mathematical theoryof epidemics - I. Proceedings of the Royal Society Series A (1927) 700-721(reprinted in
Bulletin of Mathematical Biology (1991) 33-55).[73] J. Ma and D.J. Earn, Generality of the final size formula for an epidemic of anewly invading infectious disease. Bulletin of Mathematical Biology (2006)679-702.[74] N.G. Becker, Analysis of infectious disease data (Chapman and Hall, New York,1989).[75] A.K. Hrenoff, The Influenza Epidemic of 1918-1919 in San Francisco. The Mili-tary Surgeon (1941) 805-811.[76] B. Cooper and M. Lipsitch, The analysis of hospital infection data using hiddenMarkov models. Biostatistics (2004) 223-237.[77] H. Nishiura, T. Kuratsuji, T. Quy, N.C. Phi, V. Van Ban, L.E. Ha, H.T. Long, H.Yanai, N. Keicho, T. Kirikae, T. Sasazuki and R.M. Anderson, Rapid awarenessand transmission of severe acute respiratory syndrome in Hanoi French Hospital,Vietnam. American Journal of Tropical Medicine and Hygiene (2005) 17-25.[78] M. Davidian and D.M. Giltinan, Nonlinear Models for Repeated Measurementdata. Monographs on Statistics and Applied Probability 62. (Chapman and Hall,New York, 1995). 6279] P. Bickel and K.A. Doksum, Mathematical Statistics. (Holden-Day, Oakland,California, 1977).[80] B. Efron and R.J. Tibshirani, Bootstrap methods for standard errors, confidenceintervals, and other measures of statistical accuracy. Statistical Science (1986)54-75.[81] D.T. Haydon, M. Chase-Topping, D.J. Shaw, L. Matthews, J.K. Friar, J. Wile-smith and M.E. Woolhouse, The construction and analysis of epidemic trees withreference to the 2001 UK foot-and-mouth outbreak. Proceedings of the Royal So-ciety of London Series B (2003) 121-127.[82] S. Cauchemez, P.Y. Boelle, C.A. Donnelly, N.M. Ferguson, G. Thomas, G.M.Leung, A.J. Hedley, R.M. Anderson and A.J. Valleron. Real-time estimates inearly detection of SARS.
Emerging Infectious Diseases (2006) 110-113.[83] O. Diekmann, Limiting behaviour in an epidemic model. Nonlinear Analysis,Theory, Methods and Applications (1977) 459-470.[84] J.A.J. Metz, The epidemic in a closed population with all susceptibles equallyvulnerable; some results for large susceptible populations and small initial infec-tions. Acta Biotheoretica (1978) 75-123.[85] A. Svensson, A note on generation times in epidemic models. Mathematical Bio-sciences (2007) 300-311. 6386] H.J. Wearing, P. Rohani and M.J. Keeling, Appropriate models for the manage-ment of infectious diseases.
PLoS Medicine (2005) e174.[87] A.L. Lloyd, Destabilization of epidemic models with the inclusion of realisticdistributions of infectious periods. Proceedings of the Royal Society of LondonSeries B (2001) 985-993.[88] A.L. Lloyd, Realistic distributions of infectious periods in epidemic models:Changing patterns of persistence and dynamics.
Theoretical Population Biology (2001) 59-71.[89] M.G. Roberts and J.A. Heesterbeek, Model-consistent estimation of the basicreproduction number from the incidence of an emerging infection. Journal ofMathematical Biology (2007) 803-816.[90] P. Yan. Separate roles of the latent and infectious periods in shaping the relationbetween the basic reproduction number and the intrinsic growth rate of infectiousdisease outbreaks. Journal of Theoretical Biology (2008) in press.[91] A. Imahorn, Epidemiologische Beobachtungen ueber die Grippe-Epidemie 1918im Oberwallis. Inaugural-Dissertation zur Erlangung der Doktorwuerde derMedizinischen Fakultaet der Universitaet Zuerich (Universitaet Zuerich, Zurich,2000) (in German).[92] C. Fraser, Estimating individual and household reproduction numbers in anemerging epidemic.
PLoS ONE (2007) e758.6493] N.M. Ferguson, C.A. Donnelly and R.M. Anderson, Transmission intensity andimpact of control policies on the foot and mouth epidemc in Great Britain. Nature (2001) 542-548.[94] P.E. Fine, The interval between successive cases of an infectious disease.
Ameri-can Journal of Epidemiology (2003) 1039-1047.[95] H. Nishiura, Epidemiology of a primary pneumonic plague in Kantoshu,Manchuria, from 1910 to 1911: statistical analysis of individual records collectedby the Japanese Empire.
International Journal of Epidemiology (2006) 1059-1065.[96] B.J. Cowling, L.M. Ho and G.M. Leung, Effectiveness of control measures duringthe SARS epidemic in Beijing: a comparison of the Rt curve and the epidemiccurve. Epidemiology and Infection , (2007) in press.[97] C.A. Marques, O.P. Forattini and E. Massad, The basic reproduction number fordengue fever in Sao Paulo state, Brazil: 1990-1991 epidemic.
Transactions of theRoyal Society of Tropical Medicine and Hygiene (1994) 58-59.[98] A.P. Galvani, X. Lei and N.P. Jewell, Severe acute respiratory syndrome: tem-poral stability and geographic variation in case-fatality rates and doubling times. Emerging Infectious Diseases (2003) 991-994.[99] J. Lessler, D.A. Cummings, S. Fishman, A. Vora and D.S. Burke, Transmissibilityof swine flu at Fort Dix, 1976. Journal of the Royal Society Interface (2007)755-762. 65100] P. Haccou, P. Jagers and V.A. Vatutin, Branching Processes: Variation, Growthand Extinction of Populations (Cambridge University Press, Cambridge, 2005).[101] M. Kimmel, D.E. Axelrod, Branching Process in Biology (Springer, New York,2002).[102] N.G. Becker, On a general stochastic epidemic model. Theoretical PopulationBiology (1977) 23-36.[103] N. Becker, Estimation for discrete time branching processes with application toepidemics. Biometrics (1977) 515-522.[104] C.P. Farrington, M.N. Kanaan and N.J. Gay, Branching process models forsurveillance of infectious diseases controlled by mass vaccination. Biostatistics (2003) 279-295.[105] N.M. Ferguson, C. Fraser, C.A. Donnelly, A.C. Ghani and R.M. Anderson, Publichealth. Public health risk from the avian H5N1 influenza epidemic. Science (2004) 968-969.[106] K. Glass, N. Becker and M. Clements, Predicting case numbers during infectiousdisease outbreaks when some cases are undiagnosed.
Statistics in Medicine (2007) 171-183.[107] J.O. Lloyd-Smith, S.J. Schreiber, P.E. Kopp and W.M. Getz, Superspreadingand the effect of individual variation on disease emergence. Nature (2005)355-359. 66108] J.O. Lloyd-Smith, Maximum likelihood estimation of the negative binomial dis-persion parameter for highly overdispersed data, with applications to infectiousdiseases.
PLoS ONE (2007) e180.[109] M.C.M. De Jong, O. Diekmann and J.A.P. Heesterbeek, How does transmissionof infection depend on population size? in: Epidemic models: their structureand relation to data. ed. D. Mollison (Cambridge University Press, Cambridge,1995) pp.84-94.[110] N.G. Becker, Martingale methods for the analysis of epidemic data. StatisticalMethods in Medical Research (1993) 93-112.[111] P. Hall and C.C. Heyde, Martingale limit theory and its application (AcademicPress, New York, 1980).[112] W.N. Rida, Asymptotic properties of some estimators for the infection rate inthe general stochastic epidemic model. Journal of the Royal Statistical SocietySeries B (1991) 269-283.[113] N.G. Becker and T. Britton, Statistical studies of infectious disease incidence. Journal of the Royal Statistical Society Series B (1999) 287-307.[114] H. Andersson and T. Britton, Stochastic Epidemic Models and Their StatisticalAnalysis. Lecture Notes in Statistics No. 151 (Springer, New York, 2000).67115] K. Satou and H. Nishiura, Basic reproduction number for equine-2 influenzavirus A (H3N8) epidemic in racehorse facilities in Japan, 1971. Journal of EquineVeterinary Science (2006) 310-316.[116] B. Barnes, K. Glass and N.G. Becker, The role of health care workers and antiviraldrugs in the control of pandemic influenza. Mathematical Biosciences (2007)403-416.[117] I.M. Longini and J.S. Koopman, Household and community transmission param-eters from final distributions of infections in households.
Biometrics (1982)115-126.[118] I.M. Longini, J.S. Koopman, A.S. Monto and J.P. Fox, Estimating householdand community transmission parameters for influenza. American Journal of Epi-demiology (1982) 736-751.[119] H. Nishiura and G. Chowell, Household and community transmission of the Asianinfluenza A (H2N2) and influenza B viruses in 1957 and 1961.
Southeast AsianJournal of Tropical Medicine and Public Health (2007) in press.[120] F.M. Ball and G. Scalia-Tomba, Epidemics with two levels of mixing. Annals ofApplied Probability (1997) 46-89.[121] F. Ball and P. Neal, A general model for stochastic SIR epidemics with two levelsof mixing. Mathematical Biosciences (2002) 73-102.68122] N.G. Becker and K. Dietz, The effect of household distribution on transmissionand control of highly infectious diseases.
Mathematical Biosciences (1995)207-219.[123] S. Cauchemez, F. Carrat, C. Viboud, A.J. Valleron and P.Y. Boelle, A BayesianMCMC approach to study transmission of influenza: application to householdlongitudinal data.
Statistics in Medicine (2004) 3469-3487.[124] E.D. Kilbourne, Influenza (Plenum Medical, New York, 1987).[125] E.D. Kilbourne (Ed), The influenza viruses and influenza (Academic Press, NewYork, 1975).[126] R.E. Hope-Simpson, The Transmission of Epidemic Influenza (Plenum Medical,New York, 1992).[127] P. Selby, Influenza Models (Kluwer Academic Pub, New York, 1982).[128] A.D. Cliff, P. Haggett, J.K. Ord, Spatial Aspects of Influenza Epidemics (PionLimited, London, 1986).[129] N.M. Ferguson, A.P. Galvani and R.M. Bush, Ecological and immunological de-terminants of influenza evolution. Nature (2003) 428-433.[130] L.J. Abu-Raddad and N.M. Ferguson, The impact of cross-immunity, mutationand stochastic extinction on pathogen diversity.
Proceedings B (2004) 2431-2438. 69131] M.I. Nelson, L. Simonsen, C. Viboud, M.A. Miller, J. Taylor, K.S. George, S.B.Griesemer, E. Ghedin, N.A. Sengamalay, D.J. Spiro, I. Volkov, B.T. Grenfell,D.J. Lipman, J.K. Taubenberger and E.C. Holmes, Stochastic processes are keydeterminants of short-term evolution in influenza a virus.
PLoS Pathogens (2006) e125.[132] K. Koelle, S. Cobey, B. Grenfell and M. Pascual, Epochal evolution shapes thephylodynamics of interpandemic influenza A (H3N2) in humans. Science (2006) 1898-1903.[133] E. Vynnycky and W.J. Edmunds, Analyses of the 1957 (Asian) influenza pan-demic in the United Kingdom and the impact of school closures.
Epidemiologyand Infection , (2007) in press.[134] J. Mulder and N. Masurel, Pre-epidemic antibody against 1957 strain of Asiaticinfluenza in serum of older people living in the Netherlands.
Lancet (1958)810-814.[135] C. Fraser, S. Riley, R.M. Anderson and N.M. Ferguson, Factors that make aninfectious disease outbreak controllable. Proceedings of the National Academy ofScience U S A (2004) 6146-6151.[136] J.T. Kemper, Error sources in the evaluation of secondary attack rates.
AmericanJournal of Epidemiology (1980) 457-464.[137] H. Nishiura, Early efforts in modeling the incubation period of infectious diseaseswith an acute course of illness.
Emerging Themes in Epidemiology (2007) 2.70138] T. Day, A. Park, N. Madras, A. Gumel and J. Wu, When Is Quarantine a UsefulControl Strategy for Emerging Infectious Diseases? American Journal of Epi-demiology (2006) 479-485.[139] R. Brookmeyer and M.H. Gail, AIDS Epidemiology: A Quantitative Approach(Oxford University Press, Oxford, 1994).[140] N.P. Jewell, K. Dietz and V.T. Farewell (Eds), AIDS Epidemiology: Method-ological issues (Birkhaeuser, Boston, 1992).[141] T. Colton, T. Johnson and D. Machin D (Eds). Proceedings of the Conference onQuantitative Methods for Studying AIDS, held in Blaubeuren, Germany, June14-18, 1993.
Statistics in Medicine (1994) 1899-2188.[142] H. Nishiura, Lessons from previous predictions of HIV/AIDS in the United Statesand Japan: epidemiologic models and policy formulation. Epidemiologic Perspec-tives and Innovations (2007) 3.[143] H. Nishiura and M. Eichner, Infectiousness of smallpox relative to disease age:estimates based on transmission network and incubation period. Epidemiologyand Infection (2007) 1145-1150.[144] J.A. Heesterbeek and M.G. Roberts, The type-reproduction number T in modelsfor infectious disease control.
Mathematical Biosciences (2007) 3-10.71145] M.G. Roberts and J.A. Heesterbeek, A new method for estimating the effortrequired to control an infectious disease.
Proceedings of the Royal Society ofLondon Series B (2003) 1359-1364.[146] H. Inaba and H. Nishiura, On the dynamical system formulation of the typereproduction number for infectious diseases and its application to the asymp-tomatic transmission model.
Mathematical Biosciences (2007) submitted.[147] H.W. Hethcote and J.Y. Yorke, Gonorrhea Transmission Dynamics and Control.Lecture Notes in Biomathematics 56, ed S. Levin (Berlin, Springer, 1980).[148] M.E. Woolhouse, C. Dye, J.F. Etard, T. Smith, J.D. Charlwood, G.P. Garnett,P. Hagan, J.L. Hii, P.D. Ndhlovu, R.J. Quinnell, C.H. Watts, S.K. Chandiwanaand R.M. Anderson, Heterogeneities in the transmission of infectious agents: im-plications for the design of control programs.
Proceedings of the Natinal Academyof Science U S A (1997) 338-342.[149] S.M. Blower and A.R. McLean, Mixing ecology and epidemiology. Proceedings ofthe Royal Society of London Series B (1991) 187-192.[150] M.E.J. Newman, Spread of epidemic disease on networks.
Physical Review E (2002) 016128.[151] R.M. May and A.L. Lloyd, Infection dynamics on scale-free networks. PhysicalReview E (2001) 066112. 72152] J.C. Miller, Epidemic size and probability in populations with heterogeneousinfectivity and susceptibility. Physical Review E (2007) 010101.[153] H.P. Duerr, M. Schwehm, C.C. Leary, S.J. De Vlas and M. Eichner, Epidemic sizeand probability in populations with heterogeneous infectivity and susceptibility. Epidemiology and Infection (2007) 1124-1132.[154] R. Albert and A.L. Barabasi, Statistical mechanics of complex networks.
Reviewof Modern Physics (2002) 47-97.[155] I.Z. Kiss, D.M. Green and R.R. Kao, Disease contact tracing in random andclustered networks. Proceedings B (2005) 1407-1414.[156] I.Z. Kiss, D.M. Green and R.R. Kao, Infectious disease control using contacttracing in random and scale-free networks.
Journal of the Royal Society Interface (2006) 55-62.[157] F. Liljeros, C.R. Edling, L.A.N. Amaral, H.E. Stanley and Y Aberg, The web ofhuman sexual contacts. Nature (2001) 907-908.[158] N. Masuda, N. Konno and K. Aihara, Transmission of severe acute respira-tory syndrome in dynamical small-world networks.
Physical Review E (2004)031917.[159] M.J. Keeling and K.T. Eames, Networks and epidemic models. Journal of theRoyal Society Interface (2005) 295-307.73160] M.J. Keeling, Extensions to mass-action. in:Ecological Paradigms Lost. Routesof Theory Change. eds K. Cuddington and B.E. Besiner (Elsevier, Amsterdam,2005) pp. 107-142.[161] O.N. Bjornstad, B.F. Finkenstadt and B.T. Grenfell, Dynamics of measles epi-demics: Estimating scaling of transmission rates using a time series SIR model. Ecological Monographs (2002) 169-184.[162] M.J. Keeling, The effects of local spatial structure on epidemiological invations. Proceedings of the Royal Society of London Series B (1999) 859-869.[163] S. Riley, Large-scale spatial transmission models of infectious disease.
Science (2007) 1298-1301.[164] M.J. Keeling, S.P. Brooks and C.A. Gilligan, Using conservation of pattern toestimate spatial parameters from a single snapshot.
Proceedigns of the NationalAcademy of Science U S A (2004) 9155-9160.[165] C. Viboud C, O.N. Bjornstad, D.L. Smith, L. Simonsen, M.A. Miller and B.T.Grenfell, Synchrony, waves, and spatial hierarchies in the spread of influenza.
Science (2006) 447-51.[166] H. Nishiura, Smallpox during pregnancy and maternal outcomes.
Emerging In-fectious Diseases (2006) 1119-1121.74167] H.P. Duerr, S.O. Brockmann, I. Piechotowski, M. Schwehm and M. Eichner,Influenza pandemic intervention planning using InfluSim: pharmaceutical andnon- pharmaceutical interventions. BMC Infectious Diseases (2007) 76.[168] M. Eichner, M. Schwehm, H.P. Duerr and S.O. Brockmann, The influenza pan-demic preparedness planning tool InfluSim. BMC Infectious Diseases (2007)17. 75able 1: Reported estimates of R for pandemic influenza during the fall wave (2ndwave) from 1918-19Location Serial interval R Autonomous system Reference(days) fitted with entireepidemic curveSan Francisco, USA 6 3.5 Yes [39]6 2.4 No [39]45 cities in the USA 6 a b c a Sensitivity of the R estimates to different assumptions for the serial interval wasexamined; b Three pandemic waves were simultaneously fitted; c The epidemic wasobserved in a community with closed contact (i.e. military camp).76igure 1:
Flow chart of the state progression of individuals among the dif-ferent epidemiological classes as modeled by the complex SEIR model.
Seeequations (10). 77 F r equen cy F r equen cy S c a l ed r e s i dua l s Time (days)0 2 4 6 8 10 12 14 160500 Time (days) C u m u l a t i v e nu m be r o f i n f l uen z a no t i f i c a t i on s
17 epidemic days
Figure 2:
Model fits, residuals plots, and the resulting distributions of thereproduction number and the proportion of the clinical reporting obtainedafter fitting the complex SEIR epidemic model to the initial phase of theFall influenza wave using 17 epidemic days of the Spanish Flu Pandemic inSan Francisco, California . See equations (10) [39]. In the top panel, the epidemicdata of the cumulative number of reported influenza cases are the circles, the solid lineis the model best fit, and the solid blue lines are 1000 realizations of the model fit tothe data obtained through parametric bootstrapping as explained in the main text.78igure 3:
Temporal distribution of Spanish influenza in Zurich.
Left panelshows an epidemic curve ( i.e.i.e.