Stochastic SIR epidemics in a population with households and schools
Tanneke Ouboter, Ronald Meester and Pieter Trapman ∗
September 19, 2018
Abstract
We study the spread of stochastic SIR (Susceptible → Infectious → Recovered) epidemics in two types of structured populations, both consisting of schools and households. In each of the types, every individual is part of one school and one household. In the independent partition model, the partitions of the population into schools and households are independent of each other. This model corresponds to the well-studied household-workplace model. In the hierarchical model, which we introduce here, members of the same household are also members of the same school.

We introduce computable branching process approximations for both types of populations and use these to compare the probabilities of a large outbreak. The branching process approximation in the hierarchical model is novel and of independent interest. We prove by a coupling argument that if all households and schools have the same size, an epidemic spreads more easily (in the sense that the number of individuals infected is stochastically larger) in the independent partition model. We also show by example that this result does not necessarily hold if households and/or schools do not all have the same size.
Mathematical modeling of the spread of infectious diseases has a long history [Diekmann et al., 2012]. A commonly used model for epidemics, the SIR epidemic in a closed population, is easy to describe, yet it already has interesting features. In stochastic models for epidemics in large, unstructured, homogeneously mixing populations (that is, every pair of individuals makes contacts at the same rate), branching process approximations can be used to compute the probability that an epidemic will occur if a disease is introduced in a population. In case of an epidemic, we can also use branching process approximations to compute the expected fraction of the population that is infected throughout the course of the epidemic.

∗ PT is supported by Vetenskapsrådet (Swedish Research Council), Grant nr: 2010587

The homogeneous mixing assumption is too strong. One way to gain realism is to assume multiple levels of mixing. A popular extension is the so-called household model [Ball et al., 1997]. In this model the population is partitioned into households of relatively small size. Within households, contacts are more frequent than in the general population. Inspired by the modeling of the spread of childhood diseases, this model has been extended further by also introducing an independent partition of the population into schools (or workplaces) [Ball and Neal, 2002, Pellis et al., 2009, 2012]. For these diseases, the spread in schools plays an important role and the household-school model is a natural model in this case.

The assumption that the partitions into households and schools are independent has as a consequence that in large populations it is unlikely that members of the same family attend the same school. However, it would be more realistic to assume that siblings do go to the same school.
This idea leads to the hierarchical model, which we define below.

Mathematically speaking, the independent partition model is the easier one to understand. It would be of great theoretical and practical interest if we could show that epidemics spread more easily (the precise meaning of this is explained in the following sections) in the independent partition model than in the more realistic hierarchical model. Indeed, control strategies which are known to work in the independent model will then also stop epidemics in the hierarchical model.

We show in Theorem 2.3 that if the sizes of schools are all the same and the sizes of households are all the same, then an epidemic indeed spreads more easily in the independent partition model. We use branching process approximations for this conclusion. However, if the sizes of schools and households are variable, then, perhaps surprisingly, this is not true in general; see Theorem 2.4. The branching process approximation in the hierarchical model is new and interesting in its own right.

We consider two models for a population structure. In either model, individuals are part of exactly one household and exactly one school. In the independent partition model, the partitions of the population into households and into schools are independent. This model has been studied before as the household-workplace model, e.g. in [Ball and Neal, 2002, Pellis et al., 2009, 2012]. In the hierarchical model, members of the same household attend the same school.

We can formally construct populations of either type with a given number $n$ of schools as follows. In the hierarchical model every school contains individuals from an independent and identically distributed (i.i.d.) number of households, distributed as $N_c$. Households have i.i.d. sizes, distributed as $N_h$.
Hence the number of individuals in a school is distributed as $N_s \sim \sum_{k=1}^{N_c} N_h^{(k)}$, where $N_h^{(k)}$, $k = 1, 2, \ldots$, are independent copies of $N_h$, which are also independent of $N_c$. For mathematical convenience we assume that both $N_h$ and $N_c$ have bounded support on the positive integers. We denote by $N = N(n)$ the total number of individuals in the population, that is, $N = \sum_{k=1}^{n} N_s^{(k)}$, where $N_s^{(k)}$ denotes the number of individuals in the $k$-th school.

In the independent partition model we use the school and household sizes from the hierarchical model and use independent partitions of the $N$ individuals in the population, uniformly chosen among all partitions with respectively the required household sizes and the required school sizes.

We consider a stochastic SIR epidemic in a closed population. In this model, individuals are in one of the three states $S$, $I$ and $R$. If a susceptible individual contacts (note Remark 2.1 below) an infectious individual, the susceptible one becomes infectious immediately (she is infected) and stays so for exactly one time unit (but note Remark 2.2 below). After this infectious period the individual recovers and stays immune forever. There are three types of contacts: pairs of individuals which are in the same household make household-contacts according to Poisson processes with intensity $\lambda_h$. Similarly, pairs of individuals within the same school make school-contacts according to Poisson processes with intensity $\lambda_s$. Finally, all pairs of individuals in the population make global-contacts according to Poisson processes with intensity $\lambda_g/(N-1)$. Hence two individuals who share both a household and a school contact each other at total rate $\lambda_h + \lambda_s + \lambda_g/(N-1)$.

A household epidemic is defined as an outbreak which occurs if all global and school contacts are ignored; a household epidemic is always restricted to one household.
Similarly, a school epidemic is an outbreak which occurs if all global and household contacts are ignored.

In this paper we are mainly interested in the final size of an epidemic, that is, the fraction $\rho$ of individuals which are infected throughout the epidemic. We use (and state the arguments for this) that in the large population limit, if the fraction of infected individuals is positive (i.e. a major outbreak occurs), then with probability tending to 1 (as the population size grows) this fraction is equal to the probability of a major outbreak.

Remark 2.1
Contacts as defined above are not necessarily identical to physical contacts. Only encounters which lead to the transmission of the disease if one of the individuals is infectious and the other susceptible are considered to be contacts. So, if only half of the contacts of an infectious individual with a susceptible lead to transmission, then we can model the “infectious contacts” by thinning all original Poisson processes representing physical encounters. The remaining points are still distributed according to a Poisson process, now with half the density of the original process.
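The thinning argument of Remark 2.1 can be illustrated with a short Python sketch. It assumes a unit infectious period, as in the model above; the function name and its parameters are illustrative and do not appear in the original text.

```python
import math


def contact_probability(encounter_rate, transmission_prob, infectious_period=1.0):
    """Probability of at least one infectious contact between a given pair.

    Physical encounters form a Poisson process with rate `encounter_rate`.
    Keeping each encounter independently with probability `transmission_prob`
    (thinning) gives a Poisson process with rate
    encounter_rate * transmission_prob.
    """
    thinned_rate = encounter_rate * transmission_prob
    # A Poisson process with this rate has no points in the infectious
    # period with probability exp(-rate * period).
    return 1.0 - math.exp(-thinned_rate * infectious_period)


# Example: encounters at rate 2, half of which transmit the disease.
p = contact_probability(2.0, 0.5)
```

With these values the pair behaves exactly as a pair with contact intensity 1, so `p` equals $1 - e^{-1}$.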
Remark 2.2
The assumptions that (1) the infectious period is non-random, (2) the infectious period starts immediately at infection, and (3) the contacts are described by homogeneous Poisson processes, are stronger than necessary. They might be replaced by the assumption that for every (ordered) pair of individuals the event that the first individual, if infectious, contacts the second individual is independent of contacts between other pairs of individuals. In particular, the methods and results of this manuscript apply to models in which the infectious periods of individuals are not random. This includes SEIR epidemic models with non-random infectious period, in which there is a random exposed (latent) period between the moment an individual is infected and the moment that it starts to be infectious. See [Kuulasmaa, 1982, Meester and Trapman, 2011] for a discussion.
Our results are twofold. In the first place we introduce certain branching processes (sometimes multi-type) which enable us to carefully describe the initial phase of an epidemic. As far as we are aware, our methods for computing the quantities of interest in the hierarchical model, in particular the approximating branching process, are new. The various branching processes used for the two models are somewhat hard to compare directly since the units of the various branching processes are not the same. In order to use the branching processes and make actual computations, we need to know how to make exact computations for epidemics restricted to households or schools, which are relatively small compared to the total population. This part is carried out in Section 3, while the actual branching process approximations are described in Section 4. Our strategy for the independent partition model is similar to the computations suggested by Ball and Neal [2002] and our results are in agreement with theirs. We note that Ball and Neal [2002] do allow for random infectious periods and their model is in that sense more general than ours.

In the second place we are interested in a direct comparison of the independent and the hierarchical model. We prove the following theorem.
Theorem 2.3
Consider a hierarchical and an independent partition model in which households and schools have non-random sizes. Let $n$ be the number of schools and let $Z_H(n)$ and $Z_I(n)$ denote the number of ultimately recovered individuals in the hierarchical model and the independent partition model, respectively. Then, for any fixed $k$, we have
\[
\liminf_{n \to \infty} \left\{ \mathbb{P}(Z_H(n) \le k) - \mathbb{P}(Z_I(n) \le k) \right\} \ge 0. \tag{1}
\]

Hence for fixed household and school sizes, the epidemic spreads more easily in the independent model. Note that the theorem implies that the probability of a large outbreak in the hierarchical model is bounded above by the probability of a large outbreak in the independent partition model.

The assumption that all households and all schools have non-random sizes cannot be deleted in general. This is shown in the following theorem.

Theorem 2.4
If we allow for variation in the sizes of the households and schools in the population, then (1) does not hold in general. In particular, we have the following two counterexamples.

Let $p_h := 1 - e^{-\lambda_h}$ and $p_s := 1 - e^{-\lambda_s}$. In either of the following two situations, (1) does not hold for $j$ large enough:

1. For some fixed (and large) $j$, households have size $j$ with probability $(2j)^{-1}$, and size 1 with probability $1 - (2j)^{-1}$. Furthermore, $N_c \equiv 1$, that is, in the hierarchical model schools contain exactly one household. Furthermore, $p_s = p_h = 2(3j)^{-1}$ and $\lambda_g = 1/[\ldots]$.

2. All households have size 2, so $N_h \equiv 2$. For some fixed (and large) $j$, $N_c = j$ with probability $(4j)^{-1}$ and $N_c = 1$ with probability $1 - (4j)^{-1}$. Furthermore, $\lambda_g = 1/[\ldots]$, $p_h = 1$ and $p_s = (3j)^{-1}$.

Of course one example would suffice to show that (1) does not hold in general, but the reason why (1) does not hold is somewhat different in the two examples, and therefore we present them both.

We prove these theorems in Section 5.
Remark 2.5
The examples in Theorem 2.4 are obviously extreme and chosen such that we can exploit dependencies which do not appear if all households and schools have the same size. In the first example there are huge differences between the sizes of the households. Individuals which are part of a large household are automatically part of a large school in the hierarchical model, while the sizes of the school and the household of an individual are independent in the independent partition model. The dependence can be used to increase the offspring mean of the branching process approximating the epidemic in the hierarchical model, defined in Section 4. In particular, the branching process can become supercritical in the hierarchical model, while without the dependencies it would be subcritical.

In the second example, all households have the same size and are relatively small, but the variance in school sizes is large. Here we use that household members of individuals in a large school are also part of a large school in the hierarchical model, while this is not automatically the case in the independent partition model. In particular, the parameters of the model are chosen in such a way that if we ignore all global contacts, expected epidemic sizes will be larger in the hierarchical model than in the independent partition model.

Less extreme examples could be used, but computations would be messy and the examples would rely on the same principles.
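To make the first example more concrete, the following Python sketch computes the household size seen by an individual chosen uniformly at random, that is, the size-biased distribution with $\mathbb{P}(\tilde N_h = k) = k\,\mathbb{P}(N_h = k)/\mathbb{E}(N_h)$. The function names are illustrative and not part of the original text; exact rational arithmetic is used so that the result can be read off directly.

```python
from fractions import Fraction


def size_biased(pmf):
    """Size-biased version of a pmf given as {size: probability}."""
    mean = sum(size * prob for size, prob in pmf.items())
    return {size: size * prob / mean for size, prob in pmf.items()}


def example_one_household_pmf(j):
    """Household sizes of the first example in Theorem 2.4 (requires j >= 2)."""
    if j < 2:
        raise ValueError("j must be at least 2")
    large = Fraction(1, 2 * j)
    return {j: large, 1: 1 - large}


# Household size of a uniformly chosen individual for j = 10.
biased = size_biased(example_one_household_pmf(10))
```

Although only a fraction $(2j)^{-1}$ of the households is large, a uniformly chosen individual lives in a large household with probability $j/(3j-1)$, which is close to $1/3$ for large $j$.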
It is useful to describe the collection of ultimately recovered individuals by means of a random graph in which vertices represent individuals. This is a classical approach, see e.g. [Cox and Durrett, 1988]. The graph is built up as follows. For every vertex we draw, with probability $p_h = 1 - e^{-\lambda_h}$, directed “household edges” to each of the other vertices corresponding to individuals in its household. Similarly we draw, with probability $p_s = 1 - e^{-\lambda_s}$, directed “school edges” to each of the other vertices representing individuals in its school. Finally we draw, with probability $p_g = 1 - e^{-\lambda_g/(N-1)}$, directed “global edges” to each of the other vertices in the population. All edges are drawn independently of each other.

The endpoints of respectively household, school or global edges starting at a given vertex correspond to the individuals that will be contacted by the individual represented by this given vertex via respectively household, school or global contacts during its infectious period, were it to be infected in the course of the epidemic. The set of individuals infected in the course of the epidemic started at a randomly chosen individual is distributed as the set of vertices that can be reached by a directed path starting at the vertex representing such a randomly chosen individual.

It is well known [Cox and Durrett, 1988] that the set of vertices which can be reached from the vertex representing the initially infected individual has the same distribution as the cluster of this vertex in an undirected graph in which undirected (household) edges are drawn independently between (pairs of) vertices representing members of the same household with probability $p_h$, undirected (school) edges are drawn independently between vertices representing members of the same school with probability $p_s$ and finally, all pairs of vertices share a (global) edge with probability $p_g$.
In this undirected graph, the epidemic generating graph, there is no time evolution. The school cluster of a vertex is the cluster of the vertex in the epidemic generating graph if all global and household edges are ignored. This cluster corresponds to the individuals infected through a school epidemic if the individual corresponding to the index vertex gets infected. The household cluster of a vertex is the cluster of the vertex in the epidemic generating graph if all global and school edges are ignored. The household cluster has a similar interpretation as the school cluster.

We state without proof that for both the hierarchical and the independent model, the epidemic generating graph has, in the large population limit with probability tending to 1, at most one cluster of the same order of size as the population. The fraction of the number of vertices in this cluster converges in probability to a constant as the number of schools goes to infinity. The proof of this fact runs along the lines of the proofs of similar statements for Erdős-Rényi graphs [Durrett, 2006] (cf. [Ball et al., 2009]). It implies that if the initially infected individual is chosen uniformly at random from the population, then the probability of a large outbreak is the same as the fraction of individuals which are ultimately removed. Indeed, if we first construct the epidemic generating graph and then choose the vertex representing the initial infectious individual uniformly at random, then it is straightforward to see that the probability of a large outbreak and the fraction of the individuals which are infected during such an outbreak are both equal to the fraction of the vertices in the large cluster.

Next we present exact computations in small populations.
Since we apply this to either household or school epidemics, we consider the situation in which there is only one type of contact, that is, the epidemic generating graph is built up in such a way that undirected edges between any pair of vertices exist with probability $p$, independently of all other pairs. The result is a special case of [Diekmann et al., 2012, p. 281].

Theorem 3.1 ([Diekmann et al., 2012])
Consider a standard SIR epidemic in a homogeneously mixing population in which initially 1 individual is infectious and $L$ individuals are susceptible. Let, for $0 \le k \le L$, $P^L_k = P^L_k(p)$ denote the probability that $k$ members out of the initial population of $L$ susceptibles are ultimately recovered. Then we have for all $0 \le \ell \le L$,
\[
\sum_{k=0}^{\ell} P^L_k \binom{L-k}{\ell-k} (1-p)^{-(L-\ell)(k+1)} = \binom{L}{\ell}. \tag{2}
\]

Applying this result for $\ell = 0, 1, \ldots, L$ consecutively enables us to efficiently compute $P^L_0, P^L_1, \ldots, P^L_L$.

This theorem has a multi-type variant [Ball, 1986], which we will use in the analysis of the hierarchical model. Let $k$ be a positive integer. For $\ell := (\ell_1, \ldots, \ell_k)$ and $L := (L_1, \ldots, L_k)$, $\ell \le L$ is defined to mean $\ell_i \le L_i$ for all $1 \le i \le k$. Furthermore, we write
\[
\binom{L}{\ell} = \prod_{i=1}^{k} \binom{L_i}{\ell_i}
\qquad \text{and} \qquad
\sum_{u=0}^{\ell} = \sum_{u_1=0}^{\ell_1} \cdots \sum_{u_k=0}^{\ell_k}.
\]
Let $p_{ij}$ be the probability that a given type-$i$ individual contacts a given type-$j$ individual during its infectious period in case the type-$i$ individual is infected. We assume that initially there is one infective individual, which without loss of generality can be chosen to have type 1.

Theorem 3.2 ([Ball, 1986])
Consider a population subdivided into $k$ different types of sizes $(L_1 + 1, L_2, \ldots, L_k)$, where 1 individual of type 1 is initially infectious and all other individuals are initially susceptible. Let $P_u$ be the probability that the vector of numbers of ultimately recovered individuals (not including the initial infective) in the epidemic is equal to $u = (u_1, \ldots, u_k)$. Then for each $0 \le \ell \le L$ we have
\[
\sum_{u=0}^{\ell} \binom{L-u}{\ell-u} P_u \Big/ \prod_{i=1}^{k} \prod_{j=1}^{k} (1-p_{ij})^{(L_j-\ell_j)(\mathbb{1}\{i=1\}+u_i)} = \binom{L}{\ell}. \tag{3}
\]

For SIR epidemics in large homogeneously mixing populations, branching process approximations are reasonable since it is unlikely that during the early stages of the epidemic contacts of infectious individuals are made with non-susceptibles (see e.g. [Diekmann et al., 2012] or use birthday problem arguments). However, within schools or households, the epidemics take place in small groups, and a standard branching process approximation is no longer viable. In this section, we explain that despite the existence of local epidemics inside schools or households, a branching process approximation can still be carried out. We carry out the approximations separately for the two types of models. We reserve the word “child” to refer to the offspring of a particle in the branching processes, not to a child in the actual population.
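The triangular system (2) of Theorem 3.1, which supplies the household and school final size distributions used in the approximations, can be solved for $\ell = 0, 1, \ldots, L$ in turn. The following Python sketch does this under the assumptions of the theorem (one initial infective, $L$ initial susceptibles, pairwise contact probability $p$). The function name is hypothetical, and plain floating point arithmetic is used, which is adequate for small groups but may lose accuracy for large $L$.

```python
from math import comb


def final_size_distribution(L, p):
    """Return [P_0, ..., P_L], where P_k is the probability that k of the
    L initial susceptibles are ultimately recovered, obtained from (2)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if p == 1.0:
        # Every pair is connected, so all L susceptibles are infected.
        return [0.0] * L + [1.0]
    q = 1.0 - p
    P = []
    for ell in range(L + 1):
        # Terms of (2) with k < ell are already known.
        known = sum(
            P[k] * comb(L - k, ell - k) * q ** (-(L - ell) * (k + 1))
            for k in range(ell)
        )
        # The k = ell term is P_ell * q ** (-(L - ell) * (ell + 1)).
        P.append((comb(L, ell) - known) * q ** ((L - ell) * (ell + 1)))
    return P


def expected_final_size(L, p):
    """Expected number of initial susceptibles that are ultimately recovered."""
    return sum(k * prob for k, prob in enumerate(final_size_distribution(L, p)))
```

For $L = 2$ the recursion gives $P^2_0 = (1-p)^2$, $P^2_1 = 2p(1-p)^2$ and $P^2_2 = 1 - P^2_0 - P^2_1$, which can be checked by hand.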
In this subsection we show that we can carry out a branching process approximation in the independent partition model, where the units are individuals of the population and which will have three types. The three types of particles in the branching process correspond to individuals infected through either global, school or household contacts respectively. The number of global children in the branching process of each particle is Poisson distributed with mean $\lambda_g$. Particles corresponding to individuals not infected through a school contact have a number of “school-children” distributed as the size of a school epidemic of a randomly chosen individual in the population. Similarly, particles corresponding to individuals not infected through a household contact have a number of “household-children” distributed as the size of a household epidemic of a randomly chosen individual in the population. Below follows a more detailed approach.

Imagine that we know the number and sizes of the schools and the households in the partition, but that we have not assigned individuals to the partitions yet. Let $n$ be the number of schools and $n'$ be the number of households, and number the schools from 1 to $n$ and the households from 1 to $n'$.

If we choose an individual uniformly at random from the population, then the probability that we choose an individual from a household of size $k$ is given by the size biased distribution $k\,\mathbb{P}(N_h = k)/\mathbb{E}(N_h)$. Indeed, it is $k$ times more likely to be in a given household of size $k$ than it is to be in a given household of size 1. A size biased variant of a random variable is decorated with a tilde, so $\mathbb{P}(\tilde N_h = k) := k\,\mathbb{P}(N_h = k)/\mathbb{E}(N_h)$. Similarly the school size of a uniformly at random chosen individual is distributed as $\tilde N_s$ and the number of households within the school of a uniformly at random chosen individual is distributed as $\tilde N_c$.

We now create an i.i.d. sequence of randomly chosen schools by picking schools with replacement according to a size biased distribution. (There is a technical detail here which we ignore: in order to draw with replacement and to perform couplings below, we actually need to use the empirical distributions for household and school sizes, i.e. the distribution determined by the actual (finite) sequence of household and school sizes. A rigorous treatment of this detail can be found in [Ball et al., 2009].) Say that this sequence is $x(1), x(2), \ldots$. Let $T_s$ be the first repeated index in this sequence, that is, $T_s = \min\{j : x(j) = x(i) \text{ for some } i < j\}$. Similarly create an infinite i.i.d. sequence of households drawn (with replacement) according to a size biased distribution. We denote this sequence by $x'(1), x'(2), \ldots$ and let $T_h$ be the first repeated index in the sequence of households. Since the school and household sizes have bounded support, a birthday type argument gives that for $c \in (0, 1/2)$, $\mathbb{P}(T_s < n^c) \to 0$ and $\mathbb{P}(T_h < n^c) \to 0$ as $n \to \infty$.

We now describe the coupled construction of the branching process and the epidemic generating graph. In the cluster of the initial infective in the epidemic generating graph and the approximating branching process we distinguish between three types of vertices: “global”, “school” and “household”, where the type of a vertex is the type of the edge through which the vertex enters the process. The ancestor of the branching process (and the uniformly at random chosen vertex used to start the exploration of the epidemic generating graph) receives the type “global”.

We associate to the ancestor in the branching process school $x(1)$ and household $x'(1)$. Assume that school $x(1)$ has size $l$; then the number of children of the ancestor with type “school” is equal to $k$ with probability $P^{l-1}_k(p_s)$. (Here, the $-1$ in $l - 1$ comes from the fact that in this school epidemic there are $l - 1$ initially susceptible individuals besides the ancestor.) Similarly, if household $x'(1)$ has size $l'$ then the number of children of the ancestor with type “household” is equal to $k'$ with probability $P^{l'-1}_{k'}(p_h)$, and this number is independent of the number of “school” children. Finally, the ancestor also has a Poisson number of “global” children with expectation $\lambda_g$. This number is independent of the number of “household” and “school” children.

It is important to note that, in order to be able to use branching process approximations, we treat all vertices in a school/household cluster of the ancestor in the epidemic generating graph as its children, while in reality the ancestor and child might not share an edge, but only have a path of school/household edges between them. We need this technique of assigning vertices to a generation to keep branching process approximations meaningful in the sense that we retain enough independence (cf. [Pellis et al., 2012]).

To the “global” children we assign both households and schools: if the number of “global” children is $k$ then we assign schools $x(2)$ up to and including $x(k+1)$ and households $x'(2)$ up to and including $x'(k+1)$ to those children. Furthermore, we assign households to the “school” children (the following households in the sequence) and schools to the “household” children (the following schools in the sequence). As long as the total number of assigned schools is less than $T_s$ and the total number of assigned households is less than $T_h$, we can proceed with the construction in the obvious way. In this construction we create part of the epidemic generating graph through a branching process.

If the coupling proceeds then we assign vertices to school $x(1)$ and household $x'(1)$ in such a way that the household and school overlap at only one vertex, say vertex $v$, which is the vertex corresponding to the ancestor in the branching process.
If in the branching process the ancestor has $k$ “school” children, then the size of the cluster of $v$ created by school edges is $k + 1$ (the $+1$ is because $v$ is also part of the cluster). In the same way we choose the cluster of $v$ created by household contacts in $x'(1)$. If the number of “household” children of the ancestor is $k'$, then this cluster has size $k' + 1$. Finally, global edges are drawn to vertices which are part of households and schools which (i) only overlap with each other at the chosen vertex itself, and (ii) do not overlap with the households and schools of vertices in households and schools already explored.

The next step is to assign individuals (and the schools and households they belong to) to the generation 1 vertices. This happens in exactly the same way as individuals were assigned to the ancestor, apart from the fact that “school” individuals do not have “school” children, since their school is already explored, and “household” individuals do not have “household” children. As long as the number of schools assigned to individuals does not exceed $T_s$ and the number of households assigned to individuals does not exceed $T_h$, the construction proceeds.

Since the probability that $T_s$ or $T_h$ is less than $n^c$ for $c \in (0, 1/2)$ goes to 0 as $n \to \infty$, the branching process approximation works with large probability for all small clusters (i.e. clusters of smaller order than $n^{1/2}$) in the epidemic generating graph. If the vertex representing the initial infective individual is in such a cluster then the approximating branching process goes extinct. If the initial infective individual is part of a large cluster, i.e. a cluster which asymptotically contains a positive fraction of the graph, then standard arguments used in random graph theory (e.g. [Durrett, 2006, Ch. 3]) show that the approximating branching process survives.

The next step is to compute the probability that the constructed branching process dies out. Every individual has a Poisson number of “global” children with mean $\lambda_g$. Every “global” and “school” individual has a random number of “household” children, and we denote the probability that this number is equal to $k$ by
\[
z_h(k) := \sum_{l=k}^{\infty} \mathbb{P}(\tilde N_h = l+1)\, P^l_k(p_h),
\]
where $P^l_k(p_h)$ is defined via (2). Similarly every “global” and “household” individual has a random number of “school” children. We denote the probability that this number is equal to $k$ by
\[
z_s(k) := \sum_{l=k}^{\infty} \mathbb{P}(\tilde N_s = l+1)\, P^l_k(p_s).
\]
“School” individuals do not have “school” children and “household” individuals do not have “household” children, but apart from that the numbers of children of the different types are independent.

It is well known [Jagers, 1975, Ch. 4] how to compute extinction probabilities for multi-type branching processes. Define the probability generating functions of the offspring distribution as follows. For $0 \le t_g, t_s, t_h \le 1$,
\[
\begin{aligned}
f_g(t_g, t_s, t_h) &:= \sum_{k_g=0}^{\infty} \sum_{k_s=0}^{\infty} \sum_{k_h=0}^{\infty} \frac{(\lambda_g)^{k_g}}{k_g!} e^{-\lambda_g} z_s(k_s) z_h(k_h) (t_g)^{k_g} (t_s)^{k_s} (t_h)^{k_h} \\
&= \sum_{k_s=0}^{\infty} \sum_{k_h=0}^{\infty} e^{-\lambda_g(1-t_g)} z_s(k_s) z_h(k_h) (t_s)^{k_s} (t_h)^{k_h},
\end{aligned}
\]
and similarly,
\[
\begin{aligned}
f_s(t_g, t_h) &:= \sum_{k_h=0}^{\infty} e^{-\lambda_g(1-t_g)} z_h(k_h) (t_h)^{k_h} \qquad (0 \le t_g, t_h \le 1), \\
f_h(t_g, t_s) &:= \sum_{k_s=0}^{\infty} e^{-\lambda_g(1-t_g)} z_s(k_s) (t_s)^{k_s} \qquad (0 \le t_g, t_s \le 1).
\end{aligned}
\]
Since the ancestor has type “global”, the probability that the branching process dies out is $t_g$, where $(t_g, t_s, t_h)$ is the smallest positive real solution of the following set of equations:
\[
t_g = f_g(t_g, t_s, t_h), \qquad t_s = f_s(t_g, t_h), \qquad t_h = f_h(t_g, t_s).
\]

It is equally well known [Jagers, 1975, Ch. 4] how to quickly decide whether or not the probability that the approximating branching process survives is positive. Let $m_s = \sum_{k=0}^{\infty} k z_s(k)$ be the expected number of individuals infected in a school epidemic (excluding the initially infected individual in the school) and $m_h = \sum_{k=0}^{\infty} k z_h(k)$ be the expected number of individuals infected in a household epidemic (excluding the initially infected individual in the household). Define the so-called next generation matrix of the branching process by
\[
M = \begin{pmatrix} \lambda_g & m_s & m_h \\ \lambda_g & 0 & m_h \\ \lambda_g & m_s & 0 \end{pmatrix}. \tag{4}
\]
The probability of extinction of the branching process is strictly less than 1 if and only if the largest eigenvalue of $M$ (which is positive and real) is strictly larger than 1. A small computation shows that this is the case exactly when
\[
\lambda_g (m_s+1)(m_h+1) > 1 - m_s m_h = (m_s+1) + (m_h+1) - (m_s+1)(m_h+1). \tag{5}
\]
These results correspond to the results in [Ball and Neal, 2002]. The largest eigenvalue of $M$ is often referred to as the basic reproduction number, $R_0$ [Diekmann et al., 2012, Pellis et al., 2012]. In Section 5 we will give some examples where (5) is used.

In the hierarchical model, the spread within the household and the spread within the school caused by an individual are no longer independent, since households are entirely contained in schools.
We get around this problem by changing the unit of the branching process. In particular, the particles in the branching process no longer correspond to individuals, but to clusters of individuals in the same household. Moreover, the type of a particle is given by the number of individuals in its corresponding cluster. Below we derive the offspring distribution for this branching process. However, note that there are no easy closed expressions available for describing this distribution.

Consider a household of size $k$, say. The epidemic generating graph restricted to this household and restricted to household edges partitions the household into clusters. The joint distribution of the sizes of the clusters in the partition can be obtained via (2). Indeed, the size of the first cluster (in order of exploration) is $l_1$ with probability $P^{k-1}_{l_1-1}$. Conditional on the size of the first cluster being $l_1$, the size of the second cluster is $l_2$ with probability $P^{k-l_1-1}_{l_2-1}$, and so on. In this way, every household is partitioned into clusters with the property that if one of the individuals in the cluster gets infected, then all the vertices in that cluster get infected. Further infections within that household have to go through either school or global contacts, or through individuals outside the household.

Instead of considering a school as partitioned into households, we view a school as partitioned into clusters generated by household edges. The sizes of those clusters are not independent. The joint distribution of these cluster sizes is difficult to describe explicitly, but, as described in the previous paragraph, computationally relatively easy to deal with.

To compute the final size of an epidemic restricted to school and household contacts for a school with a given configuration of households in it, we first assign types to the clusters generated by household contacts, where the type is the number of vertices within the cluster.
Those clusters are the “super-individuals” and we apply (3) to compute the final size within the school, with $p_{ij} = 1 - (1 - p_s)^{ij}$. Note that $1 - p_{ij}$ is the probability that there is no school contact between any of the individuals in the type $i$ cluster and any of the individuals in the type $j$ cluster. In order to compute the number of individuals infected in an epidemic restricted to a school, let $u_i$ denote the number of clusters of size $i$ which are ultimately infected through school contacts (now including the initially infected cluster within the school), which can be computed by using (3). The final size of the epidemic restricted to household and school contacts is then $\sum_{i=0}^{\infty} i u_i - 1$. (The $-$ [...] $N_h$. The number of households in the school this vertex is part of is then distributed as $\tilde{N}_c$. All other households in this school have sizes distributed as $N_h$. We can use this to find the distribution of the size of the cluster, created by school and household edges, which contains a vertex chosen uniformly at random from the population. Let $Y$ have the same distribution as this random variable. Note that the uniformly chosen vertex is included in $Y$. The distribution of $Y$ is difficult to describe in closed form, but it is computationally tractable.

Finally we can describe the branching process. Since school sizes have bounded support, a giant component in the epidemic generating graph means that many schools are infected. This suggests that we can consider a branching process of initial cases in schools (and therefore a branching process of infected schools, cf. [Ball and Neal, 2002]). The (direct) offspring of a particle of the branching process consists of all vertices that can be reached by a path of school and household edges, apart from the final edge, which is global and leads to a new infected school. So, the direct offspring of a particle correspond to all vertices which can be reached by a global edge from one of the vertices infected in the epidemic restricted to household and school contacts that starts from the individual corresponding to the particle. If this approximating branching process survives then the vertex corresponding to the initial infectious individual is in the giant component (with large probability as $n \to \infty$), while if this branching process goes extinct then the cluster of the vertex corresponding to the initial infectious individual is also small (compared to $n$). The number of children of an individual in the branching process is distributed as $Z \sim \sum_{k=1}^{Y} X_k$, where the $X_k$'s are i.i.d. Poisson random variables with mean $\lambda_g$, which are independent of $Y$.
This branching process is a single-type branching process for which the extinction probability, $q$, is the smallest root of $t = \sum_{k=0}^{\infty} P(Z = k)\, t^k$ [Jagers, 1975]. This smallest root is strictly less than 1 if and only if the offspring mean $R_* = E(Y)\lambda_g > 1$. We give some examples of how to use these computations in the proof of Theorem 2.4 below.
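The two threshold checks of this section can be sketched numerically. The Python code below is only an illustration: the function names are hypothetical, and the distribution of $Y$ in the usage example is made up, since the text does not give one in closed form. The first two functions test criterion (5) and cross-check it against the largest eigenvalue of the matrix $M$ in (4), computed by power iteration. The third uses the fact that, for $Z = \sum_{k=1}^{Y} X_k$ with Poisson($\lambda_g$) summands, $E(t^Z) = E\big(e^{-\lambda_g (1-t) Y}\big)$, and iterates this map from $t = 0$, which converges to the smallest root in $[0,1]$.

```python
import math


def independent_model_is_supercritical(lam_g, m_s, m_h):
    """Criterion (5) for the independent partition model."""
    return lam_g * (m_s + 1) * (m_h + 1) > 1 - m_s * m_h


def largest_eigenvalue(lam_g, m_s, m_h, iterations=500):
    """Largest eigenvalue of the next generation matrix (4) by power iteration.

    Assumes lam_g, m_s and m_h are all strictly positive.
    """
    M = [[lam_g, m_s, m_h],
         [lam_g, 0.0, m_h],
         [lam_g, m_s, 0.0]]
    v = [1.0, 1.0, 1.0]
    value = 0.0
    for _ in range(iterations):
        w = [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]
        value = max(w)               # current eigenvalue estimate
        v = [x / value for x in w]   # renormalise so that max(v) == 1
    return value


def extinction_probability(y_dist, lam_g, tol=1e-12, max_iter=100000):
    """Smallest root in [0, 1] of t = E[t**Z] for Z = X_1 + ... + X_Y.

    y_dist maps each value y of Y to P(Y = y); the X_k are Poisson(lam_g),
    so E[t**Z] = sum_y P(Y = y) * exp(-lam_g * (1 - t) * y).
    """
    def pgf(t):
        return sum(p * math.exp(-lam_g * (1.0 - t) * y) for y, p in y_dist.items())

    t = 0.0
    for _ in range(max_iter):
        nxt = pgf(t)
        if abs(nxt - t) < tol:
            return nxt
        t = nxt
    return t


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    print(independent_model_is_supercritical(0.5, 0.5, 0.5),
          largest_eigenvalue(0.5, 0.5, 0.5))
    y_dist = {1: 0.5, 2: 0.3, 4: 0.2}   # made-up distribution of Y
    lam_g = 0.6
    r_star = lam_g * sum(y * p for y, p in y_dist.items())
    print(r_star, extinction_probability(y_dist, lam_g))
```

Starting the iteration at $t = 0$ matters: the generating function is increasing, so the iterates increase monotonically to the smallest fixed point, whereas a start near 1 would converge to the trivial root $t = 1$.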
Proof of Theorem 2.3.
We use a coupling of the epidemic processes on the hierarchical and the independent partition models similar to the one used for the branching process approximations in the previous section. This coupling is then used to show that $Z_I(n)$ and $Z_H(n)$ are asymptotically stochastically ordered as stated in Theorem 2.3.

Again we consider an i.i.d. sequence $x(1), x(2), \ldots$ of schools, by uniformly picking schools with replacement. Let $T_s = T_s(n)$ be the index of the first repeated school in this sequence. Similarly create an infinite i.i.d. sequence $x'(1), x'(2), \ldots$ of households drawn uniformly with replacement, and let $T_h = T_h(n)$ be the index of the first repeated household in this sequence. Let $A_k = A_k(n) = \{T_s > k+1\} \cap \{T_h > k+1\}$. We have

\[ P(Z_I(n) \le k) \le P(\{Z_I(n) \le k\} \cap A_k) + P(A_k^c). \]

By birthday-problem type arguments we have that for all $\delta > 0$, $P(A_k^c) < \delta$ for sufficiently large $n$. Since $P(\{Z_H(n) \le k\} \cap A_k) \le P(Z_H(n) \le k)$, it suffices to prove that

\[ P(\{Z_I(n) \le k\} \cap A_k) \le P(\{Z_H(n) \le k\} \cap A_k), \tag{6} \]

or equivalently,

\[ P(Z_I(n) \le k \mid A_k) \le P(Z_H(n) \le k \mid A_k). \tag{7} \]

We simultaneously construct the epidemic generating graph of the hierarchical and of the independent partition model on a suitable probability space and show that for all $k$, and conditioned on $A_k$, the inclusion $\{Z_I \le k\} \subseteq \{Z_H \le k\}$ holds.

Let $x(1)$ be the school of the initially infected individual both for the hierarchical and the independent partition model. Let a school epidemic run in this school and use this epidemic for both the hierarchical and the independent partition model. Note that this gives the right distribution of the first school epidemic in both models.

Now assume that the size of the school cluster in the corresponding epidemic generating graph is $j$, where $j \le k$. In the independent partition model the households of the $j$ individuals already infected are $x'(1), \ldots, x'(j)$. Those households do not overlap since we condition on $A_k$. The sizes of the household epidemics (including the initial infected within the household) are then i.i.d. and all distributed as the random variable $X$, where $P(X = i) = P^{n_h-1}_{i-1}$, with $P^{n_h-1}_{i-1}$ defined as in Theorem 3.1. In the hierarchical model, the households of the $j$ individuals affected by the school epidemic need not all be different. The household epidemics are now run one by one. The initial infectives for those household epidemics are the $j$ individuals affected by the school epidemic, but we ignore the individuals already infected before (by the school epidemic, or by household epidemics explored earlier). The probability that $i$ individuals are ultimately infected through such a household epidemic is given by $P^{\tilde{n}_h}_{i-1}$, where $\tilde{n}_h$ is the number of individuals in the household not affected before by the epidemic. Observing that $\tilde{n}_h \le n_h -$ [...] $n_s -$ [...] $A_k$), while in the hierarchical model this number is at most $n_s - 2$. We proceed in this way analysing the epidemic through school and household contacts and we notice that the size of the cluster generated by school and household edges in the epidemic generating graph in the hierarchical model is bounded above by the same quantity as in the independent partition model. Let $C_H$ (respectively $C_I$) denote this cluster in the hierarchical model (respectively independent partition model). If the number of vertices in such a cluster in the hierarchical or independent partition model is at most $l$, then the number of households and schools investigated is at most $l$, since each vertex is part of exactly 1 school and 1 household.

The next step is to investigate the global edges from the vertices in $C_H$ and $C_I$. Assume that $C_H$ has size $l'$ and $C_I$ has size $l$. Note that $l' \le l$. We keep the two epidemic generating graphs coupled by using a sequence of $l'$ i.i.d. Poisson numbers with mean $\lambda_g$ and use these for the number of global contacts of the first $l'$ vertices in $C_I$ and for the $l'$ vertices in $C_H$. In addition we independently assign i.i.d. Poisson numbers (with mean $\lambda_g$) to the other $l - l'$ vertices in $C_I$. Note that if $l$ plus the total number of global edges from $C_I$ does not exceed $k$, then, because of the conditioning on $A_k$, the globally contacted vertices are all in different households and schools and also not in households and schools encountered before. In the hierarchical model we assign the same schools to the globally contacted vertices as in the independent partition model, or a subset of those. Then we proceed with investigating the epidemic generating graph by investigating the new schools in the same way as we investigated the school of the initially infected vertex.

By this coupling we obtain that if the exploration of the cluster stops before $k+1$ vertices are included in the independent partition model, then in the hierarchical model the cluster size is also less than $k+1$.
Note that if we have explored the cluster in the independent partition model until we have included $k$ vertices and there are still school, household or global edges not yet explored in the construction, then conditioning on $A_k$ guarantees that this $(k+1)$-st edge is to a not yet encountered vertex and so the cluster of the initially chosen vertex has size at least $k+1$. This guarantees that, conditioned on $A_k$, we have $\{Z_I \le k\} \subseteq \{Z_H \le k\}$ and completes the proof.

Proof of Theorem 2.4.
We start the computations with Example 1. In this example households have size $j$ with probability $(2j)^{-1}$, and size 1 with probability $1 - (2j)^{-1}$. Furthermore, in the hierarchical model schools contain exactly one household. We set $p_s = p_h = 2(3j)^{-1}$ and $\lambda_g = 1/$ [...] $j$ large enough to support our claim. Observe that the fraction of individuals within a household of size 1 is given by

\[ \frac{1 - (2j)^{-1}}{1 - (2j)^{-1} + j(2j)^{-1}} = \frac{2j-1}{3j-1} \approx \frac{2}{3}. \]

Consider the independent model. Since $j$ is large, we may approximate the epidemic within a household or school of size $j$ by a sub-critical branching process with offspring mean $2/3$. The total number of ultimately removed individuals in a school/household is then roughly $\sum_{k=0}^{\infty} (2/3)^k = 3$. In a school or household of size 1, there are no secondary individuals. It is easy to check with (5) that $M$ has largest eigenvalue less than 1, and therefore that the epidemic in the independent partition model is sub-critical.

In the hierarchical model, the probability of having an edge between two vertices in the same household is $1 - (1-p_s)(1-p_h)(1-p_g)$ and since $p_s, p_h$ are small and $p_g$ is much smaller still ($p_g \approx \lambda_g/N$, where $N$ is the total population size), we have that $1 - (1-p_s)(1-p_h)(1-p_g) \approx p_s + p_h = 4(3j)^{-1}$. If $j$ is large this leads to a supercritical Erdős-Rényi graph, if the epidemic generating graph is restricted to a combined household and school. In particular the largest cluster in a large school is of order $j$; say that with probability $1/2$ it has size at least $\alpha j$, where $\alpha > 0$. We claim that each large component is in expectation, via global edges, connected to $\lambda_g \times \alpha j \times \frac{1}{3} \times \frac{1}{2} \times \alpha$ other components of size at least $\alpha j$. Indeed, approximately one third of those edges connect to other vertices in large schools/households, half of them contain a combined household/school cluster of size at least $\alpha j$, and $\alpha$ is the probability that this edge actually ends in the large household/school cluster. If $j$ is large enough, the combined quantity is larger than 1 and there is a cluster of household/school clusters of size at least $\alpha j$, all connected through global edges, which itself has size of order $N$. This shows that the epidemic generating graph has a cluster of the same order of magnitude as the population, which implies that (1) does not hold.

Next we consider Example 2. Recall that in this example all households have size 2. In the hierarchical model schools contain $j$ households with probability $(4j)^{-1}$ and only 1 household with probability $1 - (4j)^{-1}$. Furthermore, $\lambda_g = 1/$ [...] $p_h = 1$ and $p_s = (3j)^{-1}$. Similarly to Example 1, we deduce that $m_s \approx$ [...] $/$ [...] $m_h = 2$. This implies, using (5) again, that the epidemic in the independent partition model is subcritical.

In the hierarchical model, we see the households as “super-individuals” (since if one of the two household members gets infected the other will automatically get infected as well). So we can consider a social structure which only contains schools. Two (super) individuals (i.e. households) contact each other with probability $1 - (1 - (3j)^{-1})^4 \approx 4(3j)^{-1}$. Again, the epidemic generating graph for the school epidemic is an Erdős-Rényi graph, in which vertices on average share edges with $4/3$ [...] $j$ and we can copy the argument from the previous example.

We have discussed an ordering of epidemic severity of infectious diseases in two extreme population models.
We stress that the comparisons are about the final size and the probability of a large outbreak, and not about the reproduction number $R$ (the offspring mean of the approximating branching process). Indeed, since the units of the branching processes are so different we have different interpretations of $R$ in the various models and it makes no immediate sense to compare this value for the two models directly. This is further illustrated in Figure 1 where we plot $R$ and the survival probability $\rho$ against the infection rates. In this figure we keep the proportions $\lambda_g : \lambda_s : \lambda_h$ fixed at $1 : 2 : 4$, while households have size 2 and schools have size 4. We note that while the survival probability of the independent partition model is at least as large as the survival probability for the hierarchical model, $R$ is not ordered in this way. However, the infection rates for which $R$ crosses the threshold 1 are lower for the independent partition model, as it should be, since the survival probability (for which there is an ordering) is strictly positive if and only if $R > 1$.

References
F. Ball. A unified approach to the distribution of total size and total area under the trajectory of infectives in epidemic models. Adv. in Appl. Probab., 18(2):289–310, 1986.

F. Ball and P. Neal. A general model for stochastic SIR epidemics with two levels of mixing. Math. Biosci., 180:73–102, 2002.

F. Ball, D. Mollison, and G. Scalia-Tomba. Epidemics with two levels of mixing. Ann. Appl. Probab., 7:46–89, 1997.

F. Ball, D. Sirl, and P. Trapman. Threshold behaviour and final outcome of an epidemic on a random network with household structure. Adv. in Appl. Probab., 41(3):765–796, 2009.

J. Cox and R. Durrett. Limit theorems for the spread of epidemics and forest fires. Stochastic Process. Appl., 30(2):171–191, 1988.

O. Diekmann, H. Heesterbeek, and T. Britton. Mathematical Tools for Understanding Infectious Disease Dynamics. Princeton University Press, 2012.

R. Durrett. Random graph dynamics. Cambridge University Press, 2006.

P. Jagers. Branching Processes with Biological Applications. Wiley, New York, 1975.

K. Kuulasmaa. The spatial general epidemic and locally dependent random graphs. J. Appl. Probab., 19(4):745–758, 1982.

R. Meester and P. Trapman. Bounding basic characteristics of spatial epidemics with a new percolation model. Adv. in Appl. Probab., 43(2):335–347, 2011.

L. Pellis, N. Ferguson, and C. Fraser. Threshold parameters for a model of epidemic spread among households and workplaces. Journal of the Royal Society Interface, 6(40):979–987, 2009.

L. Pellis, F. Ball, and P. Trapman. Reproduction numbers for epidemic models with households and other social structures. I. Definition and calculation of $R_0$. Math. Biosci., 235(1):85–97, 2012.

Figure 1: The basic reproduction number $R$ (a) and the survival probability $\rho$ (b) as a function of the global infection rate $\lambda_g$ for a model in which the proportions of the rates are $\lambda_g : \lambda_s : \lambda_h$