[PDF] A Systematic Framework of Modelling Epidemics on Temporal Networks

Abstract

We present a modelling framework for the spreading of epidemics on temporal networks from which both the individual-based and pair-based models can be recovered. The proposed temporal pair-based model that is systematically derived from this framework offers an improvement over existing pair-based models by moving away from edge-centric descriptions while keeping the description concise and relatively simple. For the contagion process, we consider the Susceptible-Infected-Recovered (SIR) model, which is realized on a network with time-varying edges. We show that the shift in perspective from individual-based to pair-based quantities enables exact modelling of Markovian epidemic processes on temporal tree networks. On arbitrary networks, the proposed pair-based model provides a substantial increase in accuracy at a low computational and conceptual cost compared to the individual-based model. From the pair-based model, we analytically find the condition necessary for an epidemic to occur, otherwise known as the epidemic threshold. Due to the fact that the SIR model has only one stable fixed point, which is the global non-infected state, we identify an epidemic by looking at the initial stability of the model.

Full PDF

HHumphries et al.

RESEARCH

A Systematic Framework of Modelling Epidemicson Temporal Networks

Rory Humphries , Kieran Mulchrone , Jamie Tratalos , Simon More and Philipp H¨ovel * Correspondence:[email protected] School of Mathematical Sciences,University College Cork, WesternRoad, T12 XF64 Cork, IrelandFull list of author information isavailable at the end of the article

Abstract

We present a modelling framework for the spreading of epidemics on temporalnetworks from which both the individual-based and pair-based models can berecovered. The proposed temporal pair-based model that is systematically derivedfrom this framework oﬀers an improvement over existing pair-based models bymoving away from edge-centric descriptions while keeping the description conciseand relatively simple. For the contagion process, we consider theSusceptible-Infected-Recovered (SIR) model, which is realized on a network withtime-varying edges. We show that the shift in perspective from individual-basedto pair-based quantities enables exact modelling of Markovian epidemic processeson temporal tree networks. On arbitrary networks, the proposed pair-based modelprovides a substantial increase in accuracy at a low computational andconceptual cost compared to the individual-based model. From the pair-basedmodel, we analytically ﬁnd the condition necessary for an epidemic to occur,otherwise known as the epidemic threshold. Due to the fact that the SIR modelhas only one stable ﬁxed point, which is the global non-infected state, we identifyan epidemic by looking at the initial stability of the model.

In recent years epidemiological modelling, along with many other ﬁelds, has seenrenewed activity thanks to the emergence of network science [1, 2, 3, 4]. Approach-ing these models from the view of complex coupled systems has shed new lightonto spreading processes where the early black-box ordinary diﬀerential equation(ODE) models from Kermack and McKendrick had its limitations [5, 6]. TheseODE models assume homogeneous mixing of the entire population, which may bean appropriate approximation for small communities. However, when attempting to a r X i v : . [ phy s i c s . s o c - ph ] N ov umphries et al. Page 2 of 20 model the spread of disease at a national or international level, they fail to capturehow heterogeneities in both travel patterns and population distributions contributeto and aﬀect the spread of disease. Epidemiological models on complex networksaim to solve this problem by moving away from averaged dynamics of populationsand mean-ﬁeld descriptions. Instead, the focus is on interactions between individ-uals or meta populations, where the spreading process is driven by contacts in thenetwork [7, 8, 9].There have been many improvements made in regards to network models, e.g.,generalised multi-layer network structures or more speciﬁcally temporal networksthat allow for the network structure to change with time [4, 10, 11, 12]. Temporalnetworks are a natural way of representing contacts and lead to an insightful inter-play between the disease dynamics and the evolving network topology [13, 14, 15].With the ever growing availability of mobility and contact data it has become eas-ier to provide accurate and high-resolution data to inform network models. Theresults can be extremely useful tools for public-health bodies and other stakehold-ers [16, 17, 18].In previous works, a widely used epidemiological concept is the individual-basedmodel [1, 19, 20]. It assumes statistical independence in the state of each vertex. Amajor problem associated with such a model is that it suﬀers quite badly from anecho chamber eﬀect due to the fact that there is no memory of past interactionsdue to statistical independence. There have been eﬀorts to ameliorate this prob-lem by introducing memory at the level of each vertex’s direct neighbours. Thesemodels referred to as contact-based [21] or pair-based [22] and have been shownto signiﬁcantly reduce the echo chamber eﬀect, depending on the underlying net-work structure. These two models diﬀer in their initial approach. The contact-basedmodel takes an edge-based perspective, which extends the message passing approach[13, 23], and all dynamic equations are formulated in terms of edges. By contrast,the pair-based model keeps the vertex-based approach of the individual-based modeland dynamic equations are in terms of vertices.In this paper, we extend the Pair-Based (PB) model to a temporal setting givinga Temporal Pair-Based (TPB) Model. We show how it can be drastically reducedand simpliﬁed under a certain dynamical assumption [24]. We deal speciﬁcally withsusceptible-infected-recovered (SIR) models. Once the TPB model is written in umphries et al.

Page 3 of 20 concise form, it is then possible to show that the contact-based model is equivalentto a linearised version of the TPB model. We then establish the conditions foran epidemic to occur according to the TPB model, also known as the epidemicthreshold. We investigate how the TPB model performs on a number of syntheticand empirical networks and investigate what kind of network topologies work bestwith the TPB model.The remainder of this paper is structured as follows: In Section 2, we summarizethe theoretical framework. Then, we address the calculation of an epidemic thresh-old in Section 3, before presenting the main results in Section 4. Finally, we oﬀersome conclusions in Section 5.

Let us consider a temporal network G = { G , . . . , G T } to be a series of networks G t = ( V, E t ), which all share the same vertex set V but diﬀer in their edge set E t .The adjacency matrix for the network at time t will be denoted by A [ t ] , and A [ t ] ij = 1implies an edge between vertices i and j at time t . Let Ω be the set of compartments in the model, that is, in the SIR model: Ω = { S, I, R } . Let X n = ( X n , X n , . . . , X nN ) T be the vector whose i -th element refers tothe state of the i -th vertex in the network at time step n , thus X n ∈ Ω N where | V | = N . The evolution of the disease is then described by the master equations[25], P ( X n +1 ) = (cid:88) X n ∈ Ω N P ( X n +1 | X n ) P ( X n ) . (1)In other words, we assume that the infection process is Markovian. P ( X n +1 ) is theprobability of the network being in the particular conﬁguration of states given by X n +1 and P ( X n +1 | X n ) is probability of the network moving from the conﬁgurationof states X n to X n +1 between their respective time steps. These equations describethe entire process on the network. However, in order to progress the system forwardone step in time, the probabilities of all combinations of state vectors must be found.This usually is not feasible for network processes with potentially billions of verticesas for the SIR process the total combination of states is given by 3 N . umphries et al. Page 4 of 20

Instead, it is possible to describe the evolution of the disease using a system ofReduced Master Equations (RME) [20], which describes the evolution of subsystemswithin the network, such as individual vertices, removing the need to obtain everypossible combination of states. An important note is that these RMEs are in factnot themselves true master equations as they are not necessarily linear due to thefact that the transition rates of the subsystems are non-linear combinations of thetransitions rates of the original system. However, we shall continue to use the termRME introduced by the author of [20].For notational convenience we use the following notation for the joint marginalprobabilities, P ( X ni , X ni , · · · , X ni m ) = (cid:104) X ni X ni · · · X ni m (cid:105) . (2)When we wish to specify a particular realisation of X ni , we denote it by S ni , I ni or R ni to imply X ni = S , X ni = I or X ni = R respectively. Employing this new notationwe start with the RME which describes the evolution of individual vertices, (cid:104) X n +1 i (cid:105) = (cid:88) X ni ∈ Ω (cid:104) X n +1 i | X ni (cid:105)(cid:104) X ni (cid:105) . (3)For SIR dynamics, the evolution of each vertex in each compartment is given as thefollowing, (cid:104) S n +1 i (cid:105) = (cid:104) S ni (cid:105) − (cid:104) I n +1 i | S ni (cid:105)(cid:104) S ni (cid:105) (4a) (cid:104) I n +1 i (cid:105) = (cid:104) I ni (cid:105) − (cid:104) R n +1 i | I ni (cid:105)(cid:104) I ni (cid:105) + (cid:104) I n +1 i | S ni (cid:105)(cid:104) S ni (cid:105) (4b) (cid:104) R n +1 i (cid:105) = (cid:104) R ni (cid:105) + (cid:104) R n +1 i | I ni (cid:105)(cid:104) I ni (cid:105) , (4c)where (cid:104) I n +1 i | S ni (cid:105) reads the probability vertex i is infected at time n + 1 given it wassusceptible at time n and similarly for (cid:104) R n +1 i | I ni (cid:105) . Note that (cid:104) R ni (cid:105) can be recoveredusing the conservation of the probabilities (cid:104) S ni (cid:105) + (cid:104) I ni (cid:105) + (cid:104) R ni (cid:105) = 1. In order tocompute the transition rates we deﬁne the following quantities, the probability ofinfection on contact β , the rate of recovery µ . A [ n ] is the temporal adjacency matrixof the network on which the process is occurring. Following directly from [22], the umphries et al. Page 5 of 20 transition rates of moving from S to I , and I to R are given by (cid:104) I n +1 i | S ni (cid:105) = 1 (cid:104) S ni (cid:105) (cid:34) β (cid:88) j ∈ V A [ n ] ij (cid:104) S ni I nj (cid:105) − β (cid:88) j ,j ∈ V A [ n ] ij A [ n ] ij (cid:104) S ni I nj I nj (cid:105) + . . . − ( − β ) N − (cid:88) j ,...,j N − ∈ V A [ n ] ij . . . A [ n ] ij N − (cid:104) S ni I nj . . . I nj N − (cid:105)  (5a) (cid:104) R n +1 i | I ni (cid:105) = µ. (5b)For the expression within the square brackets of Eq. (5a), the ﬁrst term is the probabilitythat vertex i is infected by some other vertex in the network, however, double counts events.Each subsequent term accounts for double-counting and over-correcting in the previous.These equations describe the probabilistic SIR process on temporal networks. However,the system of equations is not closed as it lacks a description for their joint probabilities.There are a number of ways which this problem can be tackled, usually by making anumber of numerical or dynamical approximations [21, 7, 19, 26]. In the next sections weattempt to improve on and unify many existing approaches showing how they are derivedfrom the system of RME’s given by Eqs. (4) and (5). One of the most commonly used epidemiological models on networks is individual-based(IB) model, which is extended to the temporal setting in [19]. We refer to this exten-sion as the temporal individual-based (TIB) model. This refers to the assumption ofstatistical independence of vertices or the mean ﬁeld approximation, i.e., the factorisa-tion (cid:104) X ni X ni . . . X ni M (cid:105) = (cid:104) X ni (cid:105)(cid:104) X ni (cid:105) . . . (cid:104) X ni M (cid:105) . By assuming this independence of vertices,Eq. (5a) simpliﬁes to (cid:104) I n +1 i | S ni (cid:105) = β (cid:88) j ∈ V A [ n ] ij (cid:104) I nj (cid:105)− β (cid:88) j ,j ∈ V A [ n ] ij A [ n ] ij (cid:104) I nj (cid:105)(cid:104) I nj (cid:105) + . . . − ( − β ) N − (cid:88) j ,...,j N − ∈ V A [ n ] ij . . . A [ n ] ij N − (cid:104) I nj (cid:105) . . . (cid:104) I nj N − (cid:105) , (6a)which upon factorising can be concisely written as (cid:104) I n +1 i | S ni (cid:105) = 1 − (cid:89) k ∈ V (cid:16) − βA [ n ] ki (cid:104) I nk (cid:105) (cid:17) . (7)Upon substituting the transition rates (cid:104) I n +1 i | S ni (cid:105) and (cid:104) R n +1 i | I ni (cid:105) under the assumption ofstatistical independence the full TIB model is written as (cid:104) S n +1 i (cid:105) = (cid:104) S ni (cid:105) (cid:89) k ∈ V (cid:16) − βA [ n ] ki (cid:104) I nk (cid:105) (cid:17) (8a) (cid:104) I n +1 i (cid:105) = (cid:104) I ni (cid:105) (1 − µ ) + (cid:104) S ni (cid:105) (cid:32) − (cid:89) k ∈ V (cid:16) − βA [ n ] ki (cid:104) I nk (cid:105) (cid:17)(cid:33) . (8b) umphries et al. Page 6 of 20

A This model closes the equation (5a) at the level of vertices, thus ignoring all correla-tions with other vertices at previous times. However, ignoring all past correlations causesthe model to suﬀer quite badly from an echo chamber eﬀect [14]. This echo chamber hasthe eﬀect of vertices artiﬁcially amplifying each other’s probability of being infected (cid:104) I ni (cid:105) at each new time step, as the marginal probability of each vertex is highly correlated withthe rest of the network and the factorisation of Eq. (5a) means each vertex forgets its pastinteractions. As demonstrated in [14], it is possible to show that in the absence of a recov-ered compartment, a static network of two linked vertices for non-zero initial conditions hasprobabilities of being infected which converge according to lim n →∞ (cid:104) I n (cid:105) = lim n →∞ (cid:104) I n (cid:105) = 1for the TIB model. In contrast to the TIB-model, instead of assuming independence of vertices we can ap-proximate the marginal probabilities in terms of combinations of lower order marginalsusing some form of moment closure [22][24]. Here we make an equivalent assumption tothat of the message passing approaches [14][13]. We assume the network contains no time-respecting non-backtracking cycles, in other words, starting at some initial vertex i thatleaves via vertex j , there is no way to ﬁnd a time-respecting path returning to this vertexthat does not return via j . This is equivalent to a tree network when the temporal networkis viewed in its static embedding of the supra-adjacency representation [27]. This allows usto write all higher order moments in Eq. (5a) as a combination of pairs (cid:104) S ni I nk (cid:105) . To showwhy this is possible, consider the three vertices i, j, k connected by two edges through i .If conditional independence of these vertices is assumed given we have the state of i , thenone can make the following assumption, A (cid:104) X ni X nj X nk (cid:105) = (cid:104) X nj X nk | X ni (cid:105)(cid:104) X ni (cid:105) = (cid:104) X ni X nj (cid:105)(cid:104) X ni X nk (cid:105)(cid:104) X ni (cid:105) . (9)This has the eﬀect of assuming the network is tree-like in structure as it implies anyinteraction between vertices j and k must occur through vertex i , thus the process isexact on networks which contain no time-respecting non-backtracking cycles and otherwiseprovides an improved approximation of varying degree which depends on the true networkstructure. The result obtained in Eq. (9) is often referred to as the Kirkwood closure [23].Under the assumption that the network is tree like, the following simpliﬁcation is obtainedfor Eq. (5a), (cid:104) I n +1 i | S ni (cid:105) = 1 − (cid:89) k ∈ V (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) . (10)However, we run into the problem that we have no description for pairs of vertices. Thus,we derive expressions for their evolution from the RMEs for pairs of vertices which is givenby, (cid:104) X n +1 i X n +1 j (cid:105) = (cid:88) ( X ni ,X nj ) ∈ Ω (cid:104) X n +1 i X n +1 j | X ni X nj (cid:105)(cid:104) X ni X nj (cid:105) , (11) umphries et al. Page 7 of 20

For (cid:104) S n +1 i I n +1 j (cid:105) , we obtain (cid:104) S n +1 i I n +1 j (cid:105) = (cid:104) S ni I nj (cid:105) + (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105)(cid:104) S ni S nj (cid:105)− (cid:104) I n +1 i I n +1 j | S ni I nj (cid:105)(cid:104) S ni I nj (cid:105)− (cid:104) S n +1 i R n +1 j | S ni I nj (cid:105)(cid:104) S ni I nj (cid:105)− (cid:104) I n +1 i R n +1 j | S ni I nj (cid:105)(cid:104) S ni I nj (cid:105) . (12)Note that the RME for (cid:104) S ni S nj (cid:105) is also required, which we ﬁnd to be the following (cid:104) S n +1 i S n +1 j (cid:105) = (cid:104) S ni S nj (cid:105) − (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105)(cid:104) S ni S nj (cid:105)− (cid:104) I n +1 i I n +1 j | S ni S nj (cid:105)(cid:104) S ni S nj (cid:105)− (cid:104) I n +1 i I n +1 j | S ni S nj (cid:105)(cid:104) S ni S nj (cid:105) . (13)Since only the probabilities (cid:104) S ni I nj (cid:105) and (cid:104) S ni S nj (cid:105) are needed in order to describe theRMEs in Eq. (10), we consider those two combinations of states. From [22], we obtain theexact transition rates for pairs of vertices, though we ﬁnd we can factorise the pair-wisetransition rates similar to Eq. (5a). Here, we give the expression for (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105) only while the rest of the pair-wise transition rates are given in Appendix A: (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105) =1 (cid:104) S ni S nj (cid:105)  β (cid:88) k ∈ V A [ n ] ik (cid:104) S ni S nj I nk (cid:105) − β (cid:88) k ,k ∈ V A [ n ] ik A [ n ] ik (cid:104) S ni S nj I nk I nk (cid:105) + . . . − ( − β ) N − (cid:88) k ,...,k N − ∈ V A [ n ] ik . . . A [ n ] ik N − (cid:104) S ni S nj . . . I nk N − (cid:105)  ×  − β (cid:88) k ∈ V A [ n ] ik (cid:104) S ni S nj I nk (cid:105) + β (cid:88) k ,k ∈ V A [ n ] ik A [ n ] ik (cid:104) S ni S nj I nk I nk (cid:105)− . . . +( − β ) N − (cid:88) k ,...,k N − ∈ V A [ n ] ik . . . A [ n ] ik N − (cid:104) S ni S nj . . . I nk N − (cid:105)  . (14) In the above equation, the term in the ﬁrst pair of square brackets corresponds to theprobability that vertex i does not become infected and the term in the second pair of squarebrackets corresponds to the probability that vertex j becomes infected. Upon applying ourmoment closure technique Eq. (14) may be written as (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105) = (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19)  − (cid:89) k ∈ Vk (cid:54) = i (cid:18) − βA [ n ] kj (cid:104) S nj I nk (cid:105)(cid:104) S nj (cid:105) (cid:19) . (15)By introducing the following functions, the RMEs for pairs as well as the individualvertices can be written more concisely. The probability that vertex i does not becomeinfected at time step n + 1, given that i is not infected at time step n is denoted byΨ ni = (cid:89) k ∈ V (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) . (16) umphries et al. Page 8 of 20

Similarly, the probability that vertex i does not become infected at time step n + 1, giventhat i is not infected at time step n while excluding any interaction with j , is given byΦ nij = (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) . (17)Then, the evolution of the state of every vertex in the network is determined by thefollowing closed set of equations, (cid:104) S n +1 i (cid:105) =Ψ ni (cid:104) S ni (cid:105) (18a) (cid:104) I n +1 i (cid:105) =(1 − µ ) (cid:104) I ni (cid:105) + [1 − Ψ ni ] (cid:104) S ni (cid:105) (18b) (cid:104) S n +1 i I n +1 j (cid:105) =(1 − µ )(1 − βA ji )Φ nij (cid:104) S ni I nj (cid:105) + Φ nij (1 − Φ nji ) (cid:104) S ni S nj (cid:105) (18c) (cid:104) S n +1 i S n +1 j (cid:105) =Φ nij Φ nji (cid:104) S ni S nj (cid:105) . (18d)This approximation allows a large increase in accuracy compared to TIB model whileonly adding two equations to the ﬁnal model. All past dynamic correlations are nowtracked by the model and so the echo chamber eﬀect is eliminated, but only with directneighbours (vertices which share an edge). A major beneﬁt of this particular TPB modelover other existing iterations [21][26] is that this model can be implemented as an element-wise sparse matrix multiplication rather than having to iterate through all edges for everytime step making it extremely computationally eﬃcient and fast on even large networks. Italso beneﬁts from a low conceptual cost by not deviating from a vertex-based perspective,like the contact-based models, which move to the perspective of edges and thus deﬁne themodel in terms of the line-graphs and non-backtracking matrices[21].Similar to [14], we can compare pair-based models to the TIB model using the twovertex example. In that illustrative conﬁguration, we consider two vertices connectedby an undirected static edge and give the two vertices some initial non-zero probabil-ity (cid:104) I (cid:105) = (cid:104) I (cid:105) = z of being infected. We then run the TIB and TPB models for somegiven parameters β and µ and compare it to the ground truth, which is the average of anumber of Monte-Carlo realisations. (cid:104) I (cid:105) = 0 . (cid:104) I (cid:105) = 0 . From Fig. 1, it becomes apparent how the TIB model fails to capture the true SIR pro-cess on the network due to the previously discussed echo chamber induced by assumingstatistical independence of vertices. It becomes clear that the TPB model accurately de-scribes the underlying SIR process for this simple example as each vertex is able to recoverthe dynamic correlations of past interactions with direct neighbours. umphries et al.

Page 9 of 20

Figure 1

Running 25 time steps of the TIB and TPB SIR model as well as the average over 10 Monte-Carlo simulations for the two vertex example. Parameters: β = 0 . , µ = 0 . and (cid:104) I (cid:105) = (cid:104) I (cid:105) = 0 . . In the contact-based model [21], the central component is θ nij , which is the probability thatnode j has not passed infection to node i up to time step n . From θ nij , the quantity (cid:104) S ni (cid:105) may be computed as (cid:104) S n +1 i (cid:105) = (cid:104) S i (cid:105) (cid:89) j ∈ V θ n +1 ij . (19)This equation is the basis for the contact-based model and allows us to easily comparewith the pair-based model as it describes the same quantity as our Eq. (18a). The authorsalso assume that the evolution of θ nij satisﬁes the following relation θ n +1 ij = θ nij − βA [ n ] ji (cid:104) S ni I nj (cid:105)(cid:104) S ni (cid:105) (20) θ ij = 1 . In the pair-based model, the evolution of the susceptible probability, given by Eq. (18a),can similarly be rewritten in terms of its initial conditions, (cid:104) S n +1 i (cid:105) = Ψ ni (cid:104) S ni (cid:105) (21a)= (cid:104) S i (cid:105) n (cid:89) m =0 Ψ mi (21b)= (cid:104) S i (cid:105) (cid:89) j ∈ V n (cid:89) m =0 (cid:18) − βA [ m ] ji (cid:104) S mi I mj (cid:105)(cid:104) S mi (cid:105) (cid:19) . (21c)From equating (19) and (21) it is clear that if the models are exactly equivalent then θ ij is deﬁned by θ n +1 ij = n (cid:89) m =0 (cid:18) − βA [ m ] ji (cid:104) S mi I mj (cid:105)(cid:104) S mi (cid:105) (cid:19) . (22) umphries et al. Page 10 of 20

However, this contradicts the assumption made by Eq. (20). Thus the pair-based andcontact-based models are only equivalent if the following linearisation is assumed: n (cid:89) m =0 (cid:18) − βA [ m ] ji (cid:104) S mi I mj (cid:105)(cid:104) S mi (cid:105) (cid:19) ≈ − n (cid:88) m =0 βA [ m ] ji (cid:104) S mi I mj (cid:105)(cid:104) S mi (cid:105) , (23)which then implies (22) can be written as θ n +1 ij = 1 − n − (cid:88) m =0 βA [ m ] ji (cid:104) S mi I mj (cid:105)(cid:104) S mi (cid:105) − βA [ n ] ji (cid:104) S ni I nj (cid:105)(cid:104) S ni (cid:105) (24a)= θ nij − βA [ n ] ji (cid:104) S ni I nj (cid:105)(cid:104) S ni (cid:105) (24b)This shows that the contact-based model is a linearised version of the pair-based model. One of the most important metrics used in epidemiology is the epidemic threshold, whichdetermines the critical values of the model parameters at which a transition in qualitativebehaviour occurs and the disease-free equilibrium (DFE), which we deﬁne as (cid:104) S ni (cid:105) = S ∗ i , (cid:104) I ni (cid:105) = 0 , (cid:104) R ni (cid:105) = R ∗ i ∀ i (25)where (cid:104) S ∗ i (cid:105) + (cid:104) R ∗ i (cid:105) = 1. From this deﬁnition it is clear that there exists a whole classof DFEs which must considered. At this critical point, the DFE becomes unstable and onaverage an epidemic occurs. Calculating the epidemic threshold is a bit more diﬃcult withthe SIR model compared with the SIS due to the fact that the ﬂow of probability is inonly one direction between compartments S → I → R , thus the class of DFE solutionsare always asymptotically stable. Thus, we will look at classifying the initial stability ofthe SIR model as we perturb it from the state, (cid:104) S ni (cid:105) = 1 , (cid:104) I ni (cid:105) = 0 , (cid:104) R ni (cid:105) = 0 ∀ i (26)which we shall deﬁne as the pre-disease equilibrium (PDE). If it is unstable that meansthe disease has a chance to take hold and will spread through the network causing anepidemic before dying out. We now look at small perturbations from the PDE, if theyvanish then the disease will die out and won’t have a chance to propagate through thenetwork. We shall deﬁne an epidemic in the SIR model as instability of the PDE under suchperturbations. First, we look to linearise the diﬀerence equation for (cid:104) I n +1 i (cid:105) near the PDE,this translates to linearising the non-linear function Ψ ni . Under the assumption (cid:104) I ni (cid:105) = (cid:15) i for every vertex i such that 0 < (cid:15) i (cid:28) , we ﬁnd0 ≤ (cid:104) S ni I nj (cid:105)(cid:104) S ni (cid:105) ≤ (cid:15) j (27)by the fact that for the joint probability (cid:104) S ni I nj (cid:105) ≤ min {(cid:104) S ni (cid:105) , (cid:104) I nj (cid:105)} . Thus for (cid:15) i (cid:28) (cid:104) S ni I nj (cid:105)(cid:104) S ni (cid:105) ≈ (cid:15) j . Upon substituting this into Ψ ni we ﬁnd that umphries et al. Page 11 of 20 Ψ ni ≈ (cid:89) k ∈ V (cid:16) − βA [ n ] ki (cid:15) j (cid:17) ≈ (cid:88) k ∈ V βA [ n ] ki (cid:15) j (28)We can then use this to linearise (cid:104) I n +1 j (cid:105) from (18). While (cid:15) i (cid:28) (cid:104) I n +1 i (cid:105) ≈ (cid:104) I ni (cid:105) (1 − µ ) + (cid:88) k ∈ V βA [ n ] ik (cid:104) I nk (cid:105) . (29)This linearisation eliminates (cid:104) S ni (cid:105) from the equation. Interestingly, this is exactly theform of the SIS model in the TIB framework for which the epidemic threshold is easilyfound [19]. Therefore, we ﬁnd that the SIS and SIR models share the same epidemicthreshold condition. We introduce the matrix M [ n ] , called the infection propagator, whichis a linear map that describes the evolution of the SIR model close to the PDE: M [ n ] ij = βA [ n ] ij + δ ij (1 − µ ) . (30)Following Ref. [19], we ﬁnd that the condition required for an epidemic to occur is givenby ρ (cid:32) n (cid:89) m =0 M [ m ] (cid:33) > . (31)For the values of β and µ which the above is satisﬁed, implies that when a disease isintroduced into the network the PDE is unstable for a period of time. What this meansis that in the equivalent SIS model with the same parameters, the proportion of infectedvertices never settles on the PDE. In this section, we compare the accuracy of the TIB model and the TPB model againstthe ground truth Monte-Carlo average, that is, direct stochastic simulations. In short, weshow how the TPB model can oﬀer a massive increase in accuracy and also discuss when itfails to accurately capture the true dynamics of the stochastic SIR process. Furthermore,we validate the analytical epidemic threshold.

The assumption in the TPB model is conditional independence between vertices with aneighbour in common given the common neighbours state. This is equivalent to assumingthe network contains no time-respecting non-backtracking cycles [28, 29]. To illustrate thisreasoning, we consider a static tree network, that is, tree networks contain no cycles oflength 3 or greater, made up of 100 vertices. All vertices start from some initial non-zeroprobability (cid:104) I i (cid:105) = z of being infected. We then run the TIB model and the TPB model forsome given parameters β and µ and compare it to the ground truth, which is the averageof a number of Monte-Carlo realisations. umphries et al. Page 12 of 20

Figure 2 (a) A random tree network made up of 100 vertices. (b) Time series of the TIB (bluedashed) and TPB (green solid) SIR model as well as 10 Monte-Carlo (MC) simulations (reddotted) for the tree network shown in panel (a). Parameters: β = 0 . , µ = 0 . and (cid:104) I (cid:105) = (cid:104) I (cid:105) = 0 . . Figure 2(b) shows how the TIB model fails to capture the true SIR process on the networkdue to the previously discussed echo chamber induced by assuming statistical independenceof vertices. It becomes clear that the TPB model accurately describes the underlying SIRprocess for this simple example as each vertex is able to recover the dynamic correlationsof past interactions with direct neighbours. As we will see from the next section, temporalnetworks that are well approximated by tree networks are also well approximated by theTPB model.

In the following section, we consider 2 empirical temporal networks that all vary in bothstructure and temporal activity. For each of the empirical networks we wish to test ourour ﬁndings that the TPB model oﬀers an increase in accuracy over the TIB model. Weobserve a change in behaviour as the model parameters cross the epidemic threshold. Werun the SIR TIB and TPB models for all our networks for diﬀerent values of β and µ and then compare them to the average of a suﬃciently large number of MC simulations.This allows us to quantify how well the diﬀerent models approximate the dynamics of thetrue SIR process. At each time-step, the average prevalence of states within the networkare collected and denoted as (cid:104) S n avg (cid:105) , (cid:104) I n avg (cid:105) and (cid:104) R n avg (cid:105) with the cumulative prevalence ofinfection being taken as (cid:104) I n avg (cid:105) + (cid:104) R n avg (cid:105) . Then, we validate our analytical ﬁnding for theepidemic threshold of the TPB SIR process. For this purpose, we ﬁx a value for µ and thenfor increasing values of β , perform a number of MC simulations for long times in orderto get a distribution of the ﬁnal out break size, which is given by lim n →∞ (cid:104) I n avg (cid:105) + (cid:104) R n avg (cid:105) .In the long-term dynamics of the SIR process, lim n →∞ (cid:104) I n avg (cid:105) + (cid:104) R n avg (cid:105) , will usually exceedthe observation time of the network. Therefore, periodicity of the networks is assumed ina similar way to [19] when computing the ﬁnal outbreak sizes. umphries et al. Page 13 of 20

Table 1

List of empirical networks. Network ListNetwork Vertices Agg. Edges Avg. Edges SnapshotsConference 405 9699 20.02 3509Cattle Trades 111513 1041054 347.17 365

The Cattle Trades network consists of all trades between herds within the Republic ofIreland during the year 2017 with a temporal resolution of one day [17]. Due to the natureof the trade data, interactions are directional. Thus, this data set is modelled by a directednetwork, where each vertex represents a herd and each edge represents a trade weightedby the number of animals traded. The aggregated degree distribution of the network asshown in Fig. 3 (a) indicates a scale-free behaviour often seen in empirical networks. Thenetwork appears to be quite sparse as is evident from Fig. 3 (b), with an average of only347 edges per day while having an aggregated 1,041,054 edges over the entire year. Thedata also displays a strong bi-modal seasonal trend with there being two distinct peakswhile there tends to be very little trades occurring on Sundays when the data points lienear zero. Although we ignore external drivers of the disease, this model still oﬀers insightinto how susceptible to epidemics the network is as trade is the main vector of non-localtransmission. There are a number of infectious diseases that aﬀect cattle, such as Footand Mouth Disease and Bovine Tuberculosis, the latter of which is still a major problemin Ireland, thus eﬀective models for the spread of infectious diseases among herds areincredibly useful tools. In the present study, we focus on the SIR dynamics, but the TPBmodel framework can easily be extended to other models,From Fig. 4 we can compare the performance of the TIB and TPB models on the cattletrades network. The ﬁgure shows a year worth of simulations of both models plottedagainst the average of 10 MC realisations for the same choice of parameters. As is evidentfrom the ﬁgure, the TPB model oﬀers a signiﬁcant improvement over the TIB model asthere is a clear agreement with the MC average in the both the average and cumulativeaverage disease prevalence for both choices of parameters. The reason for such a signiﬁcantimprovement can be explained by the fact that the TPB model is exact on networks withno non-backtracking cycles. However, because the cattle trade network is a productionnetwork, there exist very few non-backtracking cycles making the network structure highlytree-like in its supra-adjacency embedding. This can be explained by the fact that theexistence of such cycles are ineﬃcient and cost prohibitive in the trade process. As aresult the network is well approximated by a tree network and contains very few non-backtracking cycles. Therefore, the SIR process is well approximated by the TPB modelfor such a network. As shown in Fig. 5 the number of non-backtracking cycles as a fractionof the total number of paths in the network is very small, peaking at just over 0.025%.Hence, the network is unlikely to suﬀer from the echo chamber eﬀect in the TPB model.However, there are very many reciprocal (bi-directional) links in the network, meaningmany farms trade in both directions. The reason for the diﬀerence in prediction between umphries et al.

Page 14 of 20

Figure 3 (a) In- and out-degree of the network aggregated over the entire year worth of data. (b)Time series of number of active edges per day in the network. (c) trades at the level of countiesaggregated over the observation period. The number of trades is indicated by the edge width andcolour. the TIB and TPB models is that the TPB model is able to account for all these reciprocaledges.Next, we test our analytical ﬁndings for the epidemic threshold on the cattle tradenetwork. Figure 5(b) depicts the average ﬁnal outbreak sizes of a number of realisationsagainst increasing values for β while keeping µ ﬁxed at 0.005. The critical β which givesrise to an epidemic according to the analytically computed epidemic threshold for such aﬁxed µ in the TPB model is given as β crit ≈ . β that are greater thanthe computed epidemic threshold, there is an obvious but gradual change in dynamics aslocal outbreaks no longer die out, but now propagate throughout the network leading tolarger ﬁnal outbreak sizes as the value for β gets larger, thus showing agreement with theanalytical result for the epidemic threshold. Overall, we ﬁnd that such trade networks area good candidate for the TPB model as they avoid many non-backtracking cycles. The second data set (cf. Tab. 1) is the Conference network described in Ref. [18]. Itincludes the face-to-face interactions of 405 participants at the SFHH conference held inNice, France 2009. Each snapshot of the network represents the aggregated contacts inwindows of 20s. Since this data set describes face-to-face interactions, each contact is bi-directional and so an undirected network is the natural choice to model these interactions.Because of the small number of nodes in the network, it is diﬃcult to draw detailedconclusions from the degree distribution. As shown in Fig. 6(a), there is a clear heavy tailwith most vertices having a relatively small aggregated degree. In Fig. 6(b) we see thatthe network activity in this case shows a number of peaks occurring then quickly dying umphries et al.

Page 15 of 20

Figure 4

TIB (blue dashed) and TPB (green solid) SIR models on the Irish cattle trade networktogether with the average of Monte-Carlo simulations (red dotted). Panels (a),(c) show thetime series for the prevalence of infection in the network and panels (b),(d) shows the cumulativeprevalence of infection within the network. The probability of infection is set to β = 0 . in (a),(b)and β = 0 . in (c),(d). The initial chance of infection is 0.03. Other parameters: µ = 0 . . out. These are explained by breaks between sessions at the conference during which theparticipants converse and interact. Because of the time scale and observation period of thisparticular temporal network, it is not feasible to model the spread of disease as infectionand recovery is unlikely to occur within the observation period, which is approximately20 hours. However, we can use our model to simulate the spread of viral information or”gossip” using the same dynamics as the SIR model. Infection is equivalent to receivingsome information in such a way that it becomes interesting enough to for the individualto try and spread to those they contact in the future and recovery is equivalent to growingtired of the information and no longer inform others they meet.Figure 7 shows the time series of the diﬀerent models for two probabilities of infection: β = 0 . β = 0 . umphries et al. Page 16 of 20

Figure 5 (a) Proportion (bars) and number of all non-backtracking (NBT) paths (blue solid) andcycles (blue dashed) that close at each respective time step. (b) Final outbreak sizes for varyingvalues of β with µ = 0 . . The vertical line shows the critical probability β crit obtained from theTPB model. The average of the Monte-Carlo (MC) is shown as green curve. Figure 6 (a) Degree of the conference network aggregated over the entire observation period. (b)Time series of number of active edges per time step in the network.

The reason we do not see a good agreement with the MC average for this particular dataset is due to the underlying topology of the network that is a physical social interactionnetwork where individuals congregate in groups and most or all in the group interact withone another. This leads to large clusters that give rise to many non-backtracking cycles.The more time-respecting non-backtracking cycles that occur, the worse the TPB modelwill perform. It is for this reason that we see a relatively large deviation from the MCsimulations for the TPB model. This can be explained by Figure 8(a), which in contrastto the cattle network shows that the number of NBT-cycles is relatively dense at manypoints in time, reaching as high as 12%.In Fig. 8(b) we see the distribution of ﬁnal outbreak proportions against the critical β computed for the epidemic threshold of the TPB model. Again, for values of β that aregreater than the computed epidemic threshold, there is an obvious but gradual changein dynamics as local outbreaks no longer die out, but now propagate throughout thenetwork leading to larger ﬁnal outbreak sizes as the value for β gets larger, thus showingagreement with the analytical result for the epidemic threshold. However, the agreement umphries et al. Page 17 of 20

Figure 7

TIB (blue dashed) and TPB (green solid) SIR models on the conference networktogether with the average of Monte-Carlo (MC) simulations (red dotted). Panels (a),(c) showthe time series for the prevalence of infection in the network. Panels (b),(d) depict the cumulativeprevalence of infection within the network. The probability of infection is set to β = 0 . in (a),(b)and β = 0 . in (c),(d). The initial chance of infection is 0.03. Other parameters: µ = 0 . . with the ﬁnal outbreak sizes only remains consistent with the MC average for values of β below and slightly above the epidemic threshold. In this paper, we have presented work done on SIR pair-based models by systematicallyextending them to a temporal setting and investigating the eﬀect of non-backtrackingcycles on the accuracy of the model on arbitrary network structures. We have found thatthe existence of many such non-backtracking cycles leads to a deviation in the pair-basedmodel from the true SIR process due to the echo chamber eﬀect they induce. Thus, thepair-based model is best suited to network structures which do not contain many cycles,such as production networks. We also ﬁnd that our analytical ﬁnding for the epidemicthreshold holds up when compared to numerical simulations by, showing a qualitativechange in the ﬁnal outbreak proportion.

Abbreviations

NBT

Non-Backtracking Tracking. IB Individual Based.

TIB

Temporal Individual-Based PB Pair Based.

TPB

Temporal Pair-Based. MC Monte-Carlo.

DFE

Disease-Free Equilibrium.

PDE

Pre-Disease Equilibrium. umphries et al.

Page 18 of 20

Figure 8 (a) Proportion (bars) and number of all non-backtracking (NBT) paths (blue solid) andcycles (blue dashed) that close at each respective time step. (b) Final outbreak sizes for varyingvalues of β with µ = 0 . . Author details School of Mathematical Sciences, University College Cork, Western Road, T12 XF64 Cork, Ireland. UCD Centrefor Veterinary Epidemiology and Risk Analysis, UCD School of Veterinary Medicine, University College Dublin,Belﬁeld, D04 W6F6 Dublin, Ireland.

References

1. Newman M. Networks. Oxford University Press; 2018.2. Barabasi AL. Network Science. Cambridge University Pr.; 2016.3. Zhan XX, Li Z, Masuda N, Holme P, Wang H. Susceptible-infected-spreading-based network embedding instatic and temporal networks. EPJ Data Science. 2020;9(1):30.4. Masuda N, Holme P. Temporal network epidemiology. Springer; 2017.5. Keeling MJ, Eames KTD. Networks and epidemic models. Journal of The Royal Society Interface. 2005jun;2(4):295–307.6. Humphries R, Spillane M, Mulchrone K, Wieczorek S, O ' Riordain M, Hoevel P. A Metapopulation NetworkModel for the Spreading of SARS-CoV-2: Case Study for Ireland. medRxiv. 2020 jun;Available from: .7. Yang Wang, Chakrabarti D, Chenxi Wang, Faloutsos C. Epidemic spreading in real networks: an eigenvalueviewpoint. In: 22nd International Symposium on Reliable Distributed Systems, 2003. Proceedings. Florence,Italy: IEEE Comput. Soc; 2003. p. 25–34. Available from: http://ieeexplore.ieee.org/document/1238052/ .8. Sharkey KJ, Kiss IZ, Wilkinson RR, Simon PL. Exact Equations for SIR Epidemics on Tree Graphs. Bulletin ofMathematical Biology. 2015 Apr;77(4):614–645. Available from: http://link.springer.com/10.1007/s11538-013-9923-5 .9. Colizza V, Pastor-Satorras R, Vespignani A. Reaction–diﬀusion processes and metapopulation models inheterogeneous networks. Nature Physics. 2007;3(4):276–282.10. Iannelli F, Koher A, Brockmann D, H¨ovel P, Sokolov IM. Eﬀective Distances for Epidemics Spreading onComplex Networks. Phys Rev E. 2017 January;95(1):012313.11. Koher A, Lentz HHK, H¨ovel P, Sokolov IM. Infections on Temporal Networks - A Matrix-Based Approach.PLOS ONE. 2016 April;11(4):e0151209.12. Lentz HHK, Koher A, H¨ovel P, Gethmann J, Selhorst T, Conraths FJ. Livestock disease spread through animalmovements: a static and temporal network analysis of pig trade in Germany. PLOS ONE. 2016May;11(5):e0155196.13. Karrer B, Newman MEJ. Message passing approach for general epidemic models. Phys Rev E. 2010Jul;82(1):016101. Available from: https://link.aps.org/doi/10.1103/PhysRevE.82.016101 .14. Shrestha M, Scarpino SV, Moore C. Message-passing approach for recurrent-state epidemic models onnetworks. Phys Rev E. 2015 Aug;92(2):022821. Available from: umphries et al.

Page 19 of 20 https://link.aps.org/doi/10.1103/PhysRevE.92.022821 .15. Lentz HH, Selhorst T, Sokolov IM. Unfolding accessibility provides a macroscopic approach to temporalnetworks. Physical review letters. 2013;110(11):118701.16. Gethmann J, Probst C, Bassett J, Blunk P, H¨ovel P, Conraths FJ. An epidemiological and economic simulationmodel to evaluate strategies for the control of bovine virus diarrhea in Germany. Front Vet Sci. 2019;6:406.17. Tratalos JA, Madden JM, McGrath G, Graham DA, ´Aine B Collins, More SJ. Spatial and networkcharacteristics of Irish cattle movements. Preventive Veterinary Medicine. 2020;183:105095. Available from: .18. G´enois M, Barrat A. Can co-location be used as a proxy for face-to-face contacts? EPJ Data Science. 2018Dec;7(1):11. Available from: https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-018-0140-1 .19. Valdano E, Ferreri L, Poletto C, Colizza V. Analytical Computation of the Epidemic Threshold on TemporalNetworks. Phys Rev X. 2015 Apr;5(2):021005. Available from: https://link.aps.org/doi/10.1103/PhysRevX.5.021005 .20. Sharkey KJ. Deterministic epidemic models on contact networks: Correlations and unbiological terms.Theoretical Population Biology. 2011 Jun;79(4):115–129. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0040580911000128 .21. Koher A, Lentz HHK, Gleeson JP, H¨ovel P. Contact-Based Model for Epidemic Spreading on TemporalNetworks. Phys Rev X. 2019 Aug;9(3):031017. Available from: https://link.aps.org/doi/10.1103/PhysRevX.9.031017 .22. Frasca M, Sharkey KJ. Discrete-time moment closure models for epidemic spreading in populations ofinteracting individuals. J Theor Biol. 2016 Jun;399:13–21. Available from: https://linkinghub.elsevier.com/retrieve/pii/S002251931600165X .23. Kirkwood JG. Statistical Mechanics of Fluid Mixtures. J Chem Phys. 1935 May;3(5):300–313. Available from: http://aip.scitation.org/doi/10.1063/1.1749657 .24. Sharkey KJ, Wilkinson RR. Complete hierarchies of SIR models on arbitrary networks with exact andapproximate moment closure. Math Biosci. 2015 Jun;264:74–85. Available from: https://linkinghub.elsevier.com/retrieve/pii/S002555641500067X .25. Gardiner CW, Gardiner CW. Stochastic methods: a handbook for the natural and social sciences. 4th ed.Springer series in synergetics. Berlin: Springer; 2009.26. G´omez S, Arenas A, Borge-Holthoefer J, Meloni S, Moreno Y. Discrete-time Markov chain approach tocontact-based disease spreading in complex networks. EPL (Europhysics Letters). 2010 Feb;89(3):38009.Available from: http://stacks.iop.org/0295-5075/89/i=3/a=38009?key=crossref.94807395648b53ac18f3a1e6eff617a9 .27. Bianconi G. Multilayer networks: structure and function. First edition ed. Oxford: Oxford University Press;2018. OCLC: 1045077621.28. Hashimoto Ki. Zeta functions of ﬁnite graphs and representations of p-adic groups. In: Automorphic forms andgeometry of arithmetic varieties. Elsevier; 1989. p. 211–280.29. Bordenave C, Lelarge M, Massouli´e L. Non-backtracking spectrum of random graphs: community detection andnon-regular ramanujan graphs. In: 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.IEEE; 2015. p. 1347–1357.

Appendix A: Vertex Pair Transition Rates for the PB Model

In this Appendix, we provide the transition rates used in Eqs. (14) : (cid:104) S n +1 i I n +1 j | S ni S nj (cid:105) = (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19)  − (cid:89) k ∈ Vk (cid:54) = i (cid:32) − βA [ n ] kj (cid:104) S nj I nk (cid:105)(cid:104) S nj (cid:105) (cid:33) (32a) umphries et al. Page 20 of 20 (cid:104) I n +1 i S n +1 j | S ni S nj (cid:105) =  − (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) (cid:89) k ∈ Vk (cid:54) = i (cid:32) − βA [ n ] kj (cid:104) S nj I nk (cid:105)(cid:104) S nj (cid:105) (cid:33) (32b) (cid:104) I n +1 i I n +1 j | S ni I nj (cid:105) =(1 − µ )  − (1 − βA [ n ] ji ) (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) (32c) (cid:104) S n +1 i R n +1 j | S ni I nj (cid:105) = µ  (1 − βA [ n ] ji ) (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) (32d) (cid:104) I n +1 i R n +1 j | S ni I nj (cid:105) = µ  − (1 − T ji ) (cid:89) k ∈ Vk (cid:54) = j (cid:18) − βA [ n ] ki (cid:104) S ni I nk (cid:105)(cid:104) S ni (cid:105) (cid:19) ..