Heterogeneity and Superspreading Effect on Herd Immunity
Yaron Oz, Ittai Rubinstein, and Muli Safra
Raymond and Beverly Sackler School of Physics and Astronomy, Tel-Aviv University, Tel-Aviv 69978, Israel
School of Natural Sciences, Institute for Advanced Study, Princeton NJ, USA
Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel
October 14, 2020
Abstract
We model and calculate the fraction of infected population necessary for herd immunity to occur, taking into account the heterogeneity in infectiousness and susceptibility, as well as the correlation between the two parameters. We show that these cause the reproduction number to decrease with progression, and consequently have a drastic effect on the estimate of the necessary percentage of the population that has to contract the disease for herd immunity to be reached. We discuss the implications to COVID-19 and other pandemics.
The COVID-19 pandemic has had a dramatic impact on the world in recent months, setting in motion a huge effort in diverse research disciplines in an attempt to understand the nature of the disease and the dynamics of the virus's spread. A question of utmost importance in this context is how many infected individuals it takes to reach herd immunity, where by herd immunity one means that the virus is unable to find enough susceptible hosts to continue its spread and consequently the disease fades out.

Herd immunity is typically expected to be reached when a large fraction of the population becomes immune to the disease and the effective reproduction number (which quantifies the spread) drops below one. The standard estimate for the necessary fraction for herd immunity to be reached is about 60% of the susceptible individuals. This estimate, however, assumes a homogeneous structure of the epidemic spread network, where both the infectiousness and susceptibility of individuals are assumed to be homogeneously distributed.

It is well recognized, however, that the epidemic spread network is not homogeneous but rather heterogeneous, with distinct people being infectious (likely to infect others) and susceptible (likely to become infected themselves) to different degrees (for reviews see e.g. [1, 2]). Individuals with very high secondary infection rates are referred to as superspreaders [3]. An estimate for the COVID-19 pandemic [4] asserts that between 5% and 10% of the infected individuals cause 80% of the secondary infections.

The reasons that different people are infectious and susceptible to different degrees may be, for instance, increased contact with others that can increase both parameters, or hygiene and better protective equipment that can decrease both. Thus, infectiousness and susceptibility are potentially highly correlated.
A correlation between infectiousness and susceptibility can significantly affect our estimate for the percentage of the population that must contract the disease for herd immunity to be reached.

If more infectious people are also more susceptible, then our initial estimates of the basic reproduction number (the mean value of secondary infections caused by an infected individual) will be biased, leading us to believe that it is much larger than it really is: we are oversampling the infectiousness of more susceptible people. Deviations in the susceptibility can lead to us seeing an early spike in the number of cases as the susceptible are infected, with a sudden drop later on as the disease spreads to less susceptible populations. Furthermore, if more infectious people are also more susceptible, then they will also be infected and develop natural immunity much sooner.

Our aim in this work is to model and calculate the fraction of infected in the population that gives rise to herd immunity while taking into account the heterogeneity in infectiousness and susceptibility, their correlation and the superspreading effect.

In order to analyze the spread of the disease we assign to each individual $a$ a susceptibility parameter $S(a)$ and an infectiousness parameter $I(a)$ drawn from some probability distributions. $S(a)$ and $I(a)$ quantify how likely $a$ is to be infected and to infect others, respectively. The probability that $a$ will transmit the disease to $b$, should $a$ be infected, is $I(a) \times S(b)$, and this is the effective parameter in our analysis.
The product of $I$ and $S$ scales like the inverse of the susceptible population, where typically $S$ scales like its inverse and $I$ is independent of it.

We measure the progress of the disease as a function of the number of individuals who contracted it. This is the natural governing parameter when considering questions of the type: will the disease fade out as a function of the fraction of the population that got infected? We begin the evolution process of the disease at step $n = 0$ with a certain number of infected individuals and increase the number of infected by one at each step. The initial number of infected individuals does not influence the analysis and we will take it to be one. Our calculation is done by taking an expectation value over all possible scenarios of infection.

We model the system for any distribution of $I, S$, and present a general formula for the behaviour of heterogeneous diseases. We also consider the special case where each individual is infectious and susceptible to the same degree, that is, where the distributions from which $I$ and $S$ are drawn are highly correlated. We derive a simple analytical result when the infectiousness and susceptibility parameters are, in particular, chosen from a Gamma distribution with shape and scale parameters $k$ and $\theta$, respectively, a distribution previously attributed to the infectiousness of COVID-2 [3].

Consider first the general case. Define the average conditional infectiousness $\varphi(s)$:

$$\varphi(s) \overset{\mathrm{def}}{=} \mathbb{E}_{S(a)=s}\left[ I(a) \right], \tag{1.1}$$

and the normalized susceptibility distribution at step $n$ in the evolution of the disease, $\rho(s, n)$:

$$\rho(s, n) \overset{\mathrm{def}}{=} \Pr\left[ (S(a) = s) \mid a \text{ is healthy after } n \text{ steps} \right]. \tag{1.2}$$

Denote the susceptibility distribution at the beginning by $\rho(s) = \rho(s, 0)$. The average conditional infectiousness (1.1) is independent of the number of infected individuals, while the susceptibility distribution does depend on it: individuals with higher susceptibility are more likely to be chosen first, hence their proportion among the healthy decreases as the process progresses. We will prove the following general claim about the fraction of the population necessary to reach herd immunity:
Claim I (General Case): for any $\delta$, when a fraction

$$1 - \int \rho(\sigma) \exp(-\delta\sigma)\, d\sigma \tag{1.3}$$

of the population is infected, the effective reproduction number will be reduced by a factor of

$$\frac{\int \varphi(\sigma)\rho(\sigma)\,\sigma \exp(-\delta\sigma)\, d\sigma}{\int \varphi(\sigma)\rho(\sigma)\,\sigma\, d\sigma}. \tag{1.4}$$

The threshold for herd immunity is when the value of the effective reproduction number is 1.

Consider next the particular case of the Gamma distribution with shape and scale parameters $k$ and $\theta$, respectively. We will prove the following claim:

Claim II (Gamma distribution): Under the above assumptions, herd immunity will be reached when a fraction

$$\varepsilon = 1 - R_0^{-\frac{k}{k+2}} \tag{1.5}$$

of the population is infected. $R_0$ is the reproduction number at the beginning of the disease spread and $k$ is the shape (spread) parameter of the Gamma distribution.

Note that the necessary fraction of the population (1.5) does not depend on the scale parameter $\theta$ of the distribution. Substituting the estimates of $R_0$ and $k$ for COVID-19 we get $\varepsilon \approx 5\%$. For COVID-2, $k = 0.19$ [3] and in this case we get $\varepsilon \approx 9\%$.

Consider the effective value of the reproduction number at step $n$ in the evolution of the pandemic, which we will denote by $R(n)$, with $R_0 = R(0)$. We define it as the expectation value of secondary infections conditional on the individuals that have been infected. Denote by $\Lambda_n$ the distribution over the $n$-th individual to be infected (namely, linear in susceptibility at that stage). We thus have:

$$R(n) \overset{\mathrm{def}}{=} \mathbb{E}_{a \sim \Lambda_n}\left[ \text{number of people } a \text{ infects} \right]. \tag{2.1}$$

Mathematically we have:

$$\begin{aligned} R(n) &= \mathbb{E}_{a \sim \Lambda_n}\Big[ I(a) \sum_{b \neq a} S(b) \Big] \\ &= \sum_a \frac{S(a)}{\sum_{a'} S(a')}\, I(a) \sum_{b \neq a} S(b) \\ &\simeq \sum_a S(a) I(a), \end{aligned} \tag{2.2}$$

where the summation over $a$ in the last two lines is over the healthy individuals at step $n - 1$, and the final approximation follows from adding $a$ to the summation over $b$. Note that since $I \times S$ scales like the inverse of the susceptible population at the $n$-th step, $N(n)$, the reproduction number (2.2) does not scale with it.

Using (1.1) and (1.2) we get:

$$R(n) = N(n) \int \varphi(s)\rho(s, n)\, s\, ds. \tag{2.3}$$

We denote the susceptible population at the beginning by $N_0 = N(0)$.

In the next section we will study the dynamics of the spread of the disease and how the effective reproduction number $R(n)$ depends on the number of infected individuals.

Disease Spread Dynamics
The process starts at step $n = 0$ with one infected individual. At each step that someone is infected, the probability that $a$ was infected is proportional to $S(a)$. Thus,

$$\Pr\left[ a \text{ is healthy at step } n \right] = \left( 1 - \frac{S(a)}{\sum_{b:\, b \text{ is healthy at step } n-1} S(b)} \right) \cdot \Pr\left[ a \text{ is healthy at step } n-1 \right]. \tag{3.1}$$

Taking logarithms and expanding $\log(1 - x) = -x - O(x^2)$, we get:

$$\begin{aligned} \log \Pr\left[ a \text{ is healthy at step } n \right] - \log \Pr\left[ a \text{ is healthy at step } n-1 \right] &= -\frac{S(a)}{N(n-1)\, \mathbb{E}_{b \sim \Lambda_{n-1}}[S(b)]} - O\left( \left( \frac{S(a)}{\sum_{b:\, b \text{ is healthy at step } n-1} S(b)} \right)^2 \right) \\ &= -\alpha(n) S(a) - O\left( \left( \frac{\max_b S(b)}{N\, \mathbb{E}_b S(b)} \right)^2 \right), \end{aligned} \tag{3.2}$$

where

$$\alpha(n) = \frac{1}{\sum_{b:\, b \text{ is healthy at step } n-1} S(b)}. \tag{3.3}$$

Since at each step someone is infected, there can be at most $N$ steps:

$$\begin{aligned} \log \Pr\left[ a \text{ is healthy at step } n \right] &= -\sum_{\tau \le n} \alpha(\tau) S(a) - O\left( \left( \frac{\max_b S(b)}{N\, \mathbb{E}_b S(b)} \right)^2 n \right) \\ &= -\sum_{\tau \le n} \alpha(\tau) S(a) - O\left( \left( \frac{\max_b S(b)}{\mathbb{E}_b S(b)} \right)^2 \frac{n}{N^2} \right). \end{aligned} \tag{3.4}$$

Therefore, as long as $\frac{\max_b S(b)}{\mathbb{E}_b S(b)} \ll \sqrt{N}$ we have:

$$\log \Pr\left[ a \text{ is healthy at step } n \right] \approx -\sum_{\tau \le n} \alpha(\tau) S(a). \tag{3.5}$$

Denoting $\delta(n) = \sum_{\tau \le n} \alpha(\tau)$, we get a relation between the susceptibility distributions at steps $n$ and zero:

$$\frac{\rho(s, n)}{\rho(s, 0)} = \frac{\exp(-\delta(n) s)}{\int \rho(\sigma, 0) \exp(-\delta(n)\sigma)\, d\sigma}, \tag{3.6}$$

between the susceptible populations:

$$\frac{N(n)}{N(0)} = \int \rho(s, 0) \exp(-\delta(n) s)\, ds, \tag{3.7}$$

and between the effective reproduction numbers (2.3):

$$\frac{R(n)}{R(0)} = \frac{\int \varphi(s)\rho(s, 0) \exp(-\delta(n) s)\, s\, ds}{\int \varphi(s)\rho(s, 0)\, s\, ds}. \tag{3.8}$$

This proves Claim I, with (1.3) being $1 - \frac{N(n)}{N(0)}$.

We can derive the condition for reaching herd immunity at step $n_{\mathrm{herd}}$ from:

$$1 = R(n_{\mathrm{herd}}) = N(n_{\mathrm{herd}}) \int \varphi(\sigma)\rho(\sigma, n_{\mathrm{herd}})\, \sigma\, d\sigma = N(0) \int \varphi(\sigma)\rho(\sigma, 0) \exp(-\delta(n_{\mathrm{herd}})\sigma)\, \sigma\, d\sigma, \tag{3.9}$$

where $N(n_{\mathrm{herd}})$ is related to $N(n = 0)$ by (3.7).

Consider first the particular case where the infectiousness and susceptibility satisfy $I \sim S$ and both are drawn from a Gamma distribution:

$$\rho_{k,\theta}(s, 0) = \frac{1}{\Gamma(k)\theta^k}\, s^{k-1} \exp\left( -\frac{s}{\theta} \right), \tag{4.1}$$

where $k$ and $\theta$ are the shape and scale parameters of the distribution at $n = 0$. Using (3.6) we get:

$$\rho_{k,\theta}(s, n) = \rho_{k,\theta(n)}(s, 0), \qquad \theta(n) = \frac{\theta}{1 + \theta\delta(n)}. \tag{4.2}$$

Thus, while the shape of the distribution does not change during the evolution of the disease, its scale does. We denote $\beta(n) = \frac{\theta(n)}{\theta}$; then using (3.7) and (3.8) we get:

$$\frac{N(n)}{N(0)} = \beta(n)^k, \qquad \frac{R(n)}{R(0)} = \left( \frac{N(n)}{N(0)} \right)^{\frac{k+2}{k}}, \tag{4.3}$$

where

$$R(n) = N(n) \int \rho(s, n)\, s^2\, ds = \left( k^2 + k \right) \theta(n)^2 N(n). \tag{4.4}$$

Herd immunity is reached when $R(n)$ drops below one, and this happens when

$$\frac{N(n)}{N(0)} = R_0^{-\frac{k}{k+2}}, \tag{4.5}$$

thus proving Claim II.

Calculating the fraction of the population $\varepsilon$ leading to herd immunity, following Claim I, can be carried out analytically for the case of a Gamma distribution, while for general distributions it has to be performed numerically. In figures 1 and 2 we plot $\varepsilon$ as a function of the coefficient of variation, i.e. the ratio of the standard deviation and the mean, $C_v = \frac{\sigma}{\mu}$. We plot the results for three distributions: Gamma, Folded Normal (Truncated Gaussian) and Power Law, when $R_0$ equals three and six, respectively.

We set $\varphi(s) = s$, hence $R_0$ is determined by the second moment of the distribution (cf. (4.4)). We can see that as $C_v$ approaches 0, the distributions behave similarly. However, for larger values, the behaviour of the system depends upon the distribution, with the power law distribution approaching $\varepsilon = 0$ much faster. The coefficient of variation of the truncated Gaussian distribution converges to $\sqrt{\frac{\pi}{2} - 1} \approx 0.76$ as $\frac{\mu}{\sigma} \to 0^+$.

In general, we see that the higher the variance of the infectiousness and susceptibility, the lower the fraction of the population that needs to be infected in order to reach herd immunity.

Figure 1: The percentage of infected in the population necessary for herd immunity to occur as a function of the coefficient of variation for $R_0 = 3$.
The results are shown for three distributions: Gamma, Folded Normal and Power Law. The higher the variance of the infectiousness and susceptibility, the lower the fraction of the population necessary.

Figure 2: The percentage of infected in the population necessary for herd immunity to occur as a function of the coefficient of variation for $R_0 = 6$. The results are shown for three distributions: Gamma, Folded Normal and Power Law. In comparison to figure 1 we see, as expected, that the required fraction is higher for a given coefficient of variation.

Discussion
The spread of the COVID-19 pandemic is characterized by high variance of the infectiousness and susceptibility distributions. In addition to the high degree of heterogeneity in infectiousness and susceptibility, one expects a significant correlation between them stemming, for instance, from the social aspect of the spread of diseases. We studied the implications of this structure on the condition to reach herd immunity.

We proved two claims, one for general distributions and one for the Gamma distribution, and showed that the heterogeneity and correlation have a drastic effect on the estimate of the percentage of the population that must contract the disease before herd immunity is reached.

Under the assumption of a Gamma distribution we found that for COVID-19 a fraction $\varepsilon \approx 5\%$ of infected population suffices to reach herd immunity, while a fraction $\varepsilon \approx 9\%$ is needed for COVID-2. While writing the paper we became aware of two recent works [9, 10] that, using different mathematical frameworks, reached similar conclusions for the fraction of infected population that is required for herd immunity under the assumption of a Gamma distribution.

Our mathematical analysis of the disease spread dynamics can be viewed as a random walk on a complete graph. It is of interest to study other graph structures and quantify the differences [11].
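As an illustrative check of Claims I and II, and not part of the original analysis, the following Python sketch evaluates the closed-form threshold (1.5) and compares it with a direct numerical evaluation of (1.3) and (1.4) for a Gamma density with $\varphi(s) = s$. The function names, the integration grid, and the parameter choices $R_0 = 3$, $k = 0.1$, and the test case $k = 2$, $\theta = 1$ are assumptions made for illustration; only $k = 0.19$ is taken from the text.

```python
import math

def claim2_fraction(r0, k):
    """Closed-form herd-immunity fraction (1.5): 1 - R0**(-k/(k+2))."""
    return 1.0 - r0 ** (-k / (k + 2.0))

def _trapezoid(values, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def claim1_fraction(rho, phi, r0, s_max=40.0, n=4000):
    """Herd-immunity fraction from (1.3)-(1.4) by numerical integration.

    rho is the initial susceptibility density (assumed normalized and
    negligible beyond s_max); phi is the conditional infectiousness.
    The grid size and cutoff are illustrative assumptions.
    """
    h = s_max / n
    grid = [i * h for i in range(n + 1)]
    dens = [rho(s) for s in grid]
    weight = [phi(s) * d * s for s, d in zip(grid, dens)]
    base = _trapezoid(weight, h)

    def reduction(delta):
        # Factor (1.4) by which the reproduction number is reduced.
        damped = [w * math.exp(-delta * s) for w, s in zip(weight, grid)]
        return _trapezoid(damped, h) / base

    # Bisection for the delta at which R0 * reduction(delta) = 1.
    target = 1.0 / r0
    lo, hi = 0.0, 1.0
    while reduction(hi) > target:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if reduction(mid) > target:
            lo = mid
        else:
            hi = mid
    delta = 0.5 * (lo + hi)

    # Expression (1.3): one minus the fraction that is still healthy.
    healthy = _trapezoid(
        [d * math.exp(-delta * s) for d, s in zip(dens, grid)], h
    )
    return 1.0 - healthy

def gamma_density(k, theta):
    """Gamma density (4.1) with shape k and scale theta."""
    norm = math.gamma(k) * theta ** k
    return lambda s: s ** (k - 1) * math.exp(-s / theta) / norm

# Closed form with assumed R0 = 3; k = 0.1 is an assumed value, k = 0.19 is from the text.
eps_k010 = claim2_fraction(3.0, 0.1)
eps_k019 = claim2_fraction(3.0, 0.19)

# Numerical Claim I versus closed-form Claim II for an assumed test case k = 2, theta = 1.
eps_numeric = claim1_fraction(gamma_density(2.0, 1.0), lambda s: s, 3.0)
eps_closed = claim2_fraction(3.0, 2.0)

print(f"k=0.10: {eps_k010:.3f}, k=0.19: {eps_k019:.3f}")
print(f"k=2: numeric {eps_numeric:.4f}, closed form {eps_closed:.4f}")
```

Under these assumed inputs the closed form gives roughly 5% for $k = 0.1$ and roughly 9% for $k = 0.19$, in line with the fractions quoted above, and the numerical route through Claim I agrees with (1.5) up to discretization error. The test case uses $k = 2$ rather than a small $k$ because the Gamma density is singular at $s = 0$ for $k < 1$, where a plain trapezoid rule would be unreliable.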
Acknowledgements
We would like to thank Nir Kalkstein and Aviad Rubinstein for valuable discussions on the importance of the high variance to the spread of the disease. The work is supported in part by the Israeli Science Foundation center of excellence, the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant agreement No. 835152), BSF 2016414 and the IBM Einstein Fellowship at the Institute for Advanced Study in Princeton.
References
[1] K. Rock, S. Brand, J. Moir, M. J. Keeling, "Dynamics of infectious diseases", Reports on Prog. Phys., 026602 (2014).
[2] R. Pastor-Satorras, C. Castellano, P. Van Mieghem and A. Vespignani, "Epidemic processes in complex networks", Rev. Mod. Phys., 925 (2015).
[3] J. O. Lloyd-Smith, S. J. Schreiber, P. E. Kopp and W. M. Getz, "Superspreading and the effect of individual variation on disease emergence", Nature, 355–359 (2005).
[4] D. Miller et al., "Full genome viral sequences inform patterns of SARS-CoV-2 spread into and within Israel", https://doi.org/10.1101/2020.05.21.20104521.
[5] A. Endo, S. Abbott, A. Kucharski, S. Funk, "Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China", Wellcome Open Res.5