[PDF] Estimating the effective reproduction number for heterogeneous models using incidence data

Abstract

The effective reproduction number, R(t), is a central point in the study of infectious diseases. It establishes in an explicit way the extent of an epidemic spread process in a population. The current estimation methods for the time evolution of R(t), using incidence data, rely on the generation interval distribution, g(\tau), which is usually obtained from empirical data or already known distributions from the literature. However, there are systems, especially highly heterogeneous ones, in which there is a lack of data and an adequate methodology to obtain g(\tau). In this work, we use mathematical models to bridge this gap. We present a general methodology for obtaining an explicit expression of the reproduction numbers and the generation interval distributions provided by an arbitrary compartmental model. Additionally, we present the appropriate expressions to evaluate those reproduction numbers using incidence data. To highlight the relevance of such methodology, we apply it to the spread of Covid-19 in municipalities of the state of Rio de janeiro, Brazil. Using two meta-population models, we estimate the reproduction numbers and the contributions of each municipality in the generation of cases in all others. Our results point out the importance of mathematical modelling to provide epidemiological meaning of the available data.

Full PDF

rrspa.royalsocietypublishing.org

Research

Article submitted to journal

Subject Areas:

Mathematical modeling, Differentialequations, Applied mathematics

Keywords:

Effective reproduction number,Mathematical models,Meta-population models , COVID-19

Author for correspondence:

Daniel Cardoso Pereira Jorgee-mail: [email protected]

Estimating the effectivereproduction number forheterogeneous models usingincidence data

D. C. P. Jorge , J. F. Oliveira , J. G. V.Miranda , R. F. S. Andrade , and S. T. R.Pinho Instituto de Física Teórica, Universidade EstadualPaulista - UNESP, R. Dr. Teobaldo Ferraz 271, SãoPaulo 01140-070, Brazil Center of Data and Knowledge Integration for Health(CIDACS), Instituto Gonçalo Moniz, FundaçãoOswaldo Cruz, Salvador, Bahia, Brazil Instituto de Física, Universidade Federal da Bahia,Salvador, Bahia, Brazil

The effective reproduction number, R ( t ) , is a centralpoint in the study of infectious diseases. It establishesin an explicit way the extent of an epidemic spreadprocess in a population. The current estimationmethods for the time evolution of R ( t ) , usingincidence data, rely on the generation intervaldistribution, g ( τ ) , which is usually obtained fromempirical data or already known distributions fromthe literature. However, there are systems, especiallyhighly heterogeneous ones, in which there is a lack ofdata and an adequate methodology to obtain g ( τ ) . Inthis work, we use mathematical models to bridge thisgap. We present a general methodology for obtainingan explicit expression of the reproduction numbersand the generation interval distributions providedby an arbitrary compartmental model. Additionally,we present the appropriate expressions to evaluatethose reproduction numbers using incidence data.To highlight the relevance of such methodology, weapply it to the spread of Covid-19 in municipalitiesof the state of Rio de janeiro, Brazil. Using two meta-population models, we estimate the reproductionnumbers and the contributions of each municipalityin the generation of cases in all others. Our resultspoint out the importance of mathematical modellingto provide epidemiological meaning of the availabledata. © The Author(s) Published by the Royal Society. All rights reserved. a r X i v : . [ q - b i o . P E ] F e b r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A ..........................................................

1. Introduction

The human population gets along with different microorganisms. Some of them lead totransmitted diseases that result in epidemics, even pandemics such as SARS-Cov-2. It is veryimportant to deﬁne reliable measures to characterize the spread of those pathogens, both at thebeginning and in the course of epidemics.The reproduction number R ( t ) indicates the number of new infections that result from a singleinfected individual at any time t . When it is greater than 1, each infected individual tends togenerate more than one infected individuals resulting in the occurrence of an epidemic outbreak.The reproduction number is usually known from its initial value, when all of the populationis susceptible, known as the basic reproduction number R . Due to the recovery of infectedindividuals, a portion of the population becomes immune to the disease, thus, herd immunitybegins to emerge in the population. This acts as a barrier to the transmission of the disease, so R ( t ) takes into account the evolution of the number of susceptible individuals in the populationand is usually referred as the effective reproduction number. It provides us with information onhow disease transmission occurs during the outbreak it is an important metric to make healthcaremanagers aware of the current level of the disease propagation.By considering heterogeneities in the homogeneous compartmental models, the reproductionnumber is affected. Indeed different classes of infected individuals can lead to differentcontributions for the generation of new infections. A lot of studies have taken the heterogeneityinto account when dealing with the basic reproduction number R [1,2], however, this is not thesame for the effective reproduction number, R ( t ) .To study reproduction number, the renewal equation, a Lotka-Euler type equation [3], isusually used. This equation emerges from the mathematical models dynamics and can be derivedusing infection-age models. This type of modelling comes from the study of age-structuredpopulation growth and was ﬁrst used within the scope of mathematical demography with theaim of modeling the birth dynamics due to the offspring produced by an individual over itslifespan. Analogously, the same methodology can be applied to the infectious diseases scenarioby modelling the generation of new infected individuals during the infection period of thosepreviously infected [4]. Infection-age models are the origins of the widely known SIR and SEIRmodels developed by Kermack-Mckendrick [5]. Most known compartment ODE models forinfectious diseases have their infection-age identical counterparts. The advantage with respectto the infection-age models is that they allow us to have access to the distribution of the infectedindividuals during the infectious period.The estimation of the reproduction number in the course of an outbreak usually rely on thedaily count of new cases along with the renewal equation. This approach uses the generationinterval distribution, also called the serial interval or generation time distribution, and waspopularized by [3]. Several works indicate how to obtain this distribution through empiricaldata [6–9] or from other known distributions and models [3,10–13]. However, these distributionsrely on epidemiological and empirical studies and several systems are very difﬁcult to analyse oreven do not have the required data for the envisaged evaluation. Moreover, highly heterogeneoussystems do not have a general methodology for estimating the reproduction number and itsgeneration interval distribution [14–17].In this study, we use mathematical models to bridge the gap between the reported casesand the reproduction number. Therefore, we present a methodology that derives reproductionnumbers and the generation interval distribution for an arbitrary compartment mathematicalmodel. With such methodology, we can investigate highly heterogeneous systems usingappropriate models and evaluate their reproduction numbers. With that in mind, apply themethod to a meta-population model to analyse the role of spatial heterogeneity in the spreadingof COVID-19 in the Brazilian territory.Our work is separated as follow: in section 2, we introduce heterogeneity by allowing theexistence of different groups of individuals in a population that can be sorted into compartments r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... [18], considering then a very general heterogeneous model that can be reduced to most models inliterature. We use this general formulation to develop our methodology to obtain the reproductionnumbers and expressions that correlate it to actual data. Then, in section 3, we apply it to aspeciﬁc scenario related to the role of the inter-municipal commuter ﬂow of people and extractits reproduction number and the generation interval distribution. In section 4 and 5 we useactual data of the emerging SARS-Cov-2 coronavirus pandemic in the municipalities of the stateof Rio de Janeiro, Brazil, to estimate the reproduction numbers that emerge from the system.Additionally, we reconstruct the time series of cases from the contributions of each municipalityin the propagation of the disease.

2. Reproduction number in a heterogeneous population (a) A general infection-age model

In a heterogeneous population, individuals can be discernible by age-groups, spatial locations,behaviour, different susceptibility to diseases or any other factor that may distinguish themfor one another. In this section, we consider a general heterogeneous model that separates theindividuals in m homogeneous compartments. For the purpose of evaluating the reproductionnumber, we ﬁrst consider a sub-set of variables encompassing the infected compartments, i.e.,those compartments containing individuals that carry the etiologic agent in their organisms.These compartments are denoted by x ( t, τ ) = ( x ( t, τ ) , ..., x n ( t, τ )) , where t and τ indicate,respectively, the calendar time and the infection-age, the time elapsed since they got the infection.Of course the condition x ( t, τ ) = 0 for τ > t must be satisﬁed. If m = m i + m , where m i and m indicate the number of infected and non-infected compartments in the homogeneous model, n = m i . Therefore, x represents the number of individuals in each compartment at a speciﬁccalendar time “ t ” and at infection age “ τ ”. x entries can be called the infection-age distributions.The total number of individuals in each compartment, denoted as X i ( t ) , can be obtained byintegrating x i ( t, τ ) with respect of τ from zero to inﬁnity, where X ( t ) = ( X ( t ) , ..., X n ( t )) . Forsimplicity, we will be using matrix notation in some equations. The matrices and vectors willbe identiﬁed by using bold font, whereby M ( y, z ) = h M ij ( y, z ) i . Also, as usual, integrals andderivatives with respect of scalar variables operate componentwise.Drawing a parallel with Van den Driessche’s next generation method [19], we distinguish thenew infections from the other compartment ﬂows, deﬁning F ( t ) = ( F ( t ) , ..., F n ( t )) as the rateof appearance of new infections and V ( t, τ ) = ( V ( t, τ ) , ..., V n ( t, τ )) as the rate of transfer intocompartments. Therefore, F ( t ) describes the ﬂow from non-infected compartments into infectedones and depends on X ( t ) . On the other hand, V ( t, τ ) is related to the ﬂow between infectedcompartments or from infected to non-infected ones and must depend on x ( t, τ ) . Thus, an usualinfection-age model can be written as: (cid:16) ∂∂t + ∂∂τ (cid:17) x ( t, τ ) = − V ( t, τ ) , (2.1) x ( t, τ = 0) = F ( t ) . (2.2)Infection-age models are partial differential equations (PDE) that takes into account the timeelapsed since the infection τ . The equations (2.1) and (2.2) can usually be linked to their respectiveODE models, and in fact the Kermack-Mckendrick SIR and SEIR models [5] are the special casesof their infection-age correspondents. This modeling allows us to have access to the distributionof the infected during the infectious period, which enable us to estimate the number of activeinfected individuals and the reproduction number along the outbreak using the reported data.Integrating (2.1) from zero to inﬁnity of with respect of τ we obtain: ddt X ( t ) = F ( t ) − Z ∞ V ( t, τ ) dτ, (2.3) r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... which is the correspondent to the sub-set of equations for the infected variables of the ODE model.Therefore, even though we are dealing with an infection-age model, one can simply build F and V from an ODE system and apply the methodology in this work. We demonstrate this for usualODE models in the Supplementary Material 1.Since F i ( t ) and V i ( t, τ ) are well deﬁned, we proceed to solving equation (2.1) using themethod of integration along the characteristic line. Thus, the left side of the equation (2.1)indicates that the characteristics of those PDE’s are lines of slope 1, which implies that t = τ + c with c being an arbitrary constant. We ﬁx a point ( t , τ ) and introduce a variable ω such that u i ( s ) = x i ( t + ω, τ + ω ) are functions that provide the values of the compartment densitiesalong the characteristic. After straightforward calculations [4], we obtain ddω u ( ω ) = − V ( ω ) , (2.4)where V i ( ω ) = V i ( t + ω, τ + ω ) . In epidemic models, ¯ V i ( ω ) are usually linear equations, makingthis system easy to solve. In that case, the system can be represented as: ddω u = − ∂ V ∂ u u , (2.5)where − ∂ V ∂ u = h − ∂∂u j V i ( ω ) i is the matrix of the linear system. Assuming that there are noinfected individuals prior to t = 0 , we only need to take into account the solution where t > τ ,leading to ω = τ , t = τ + t and τ = 0 . The solution for a linear system can be written as u ( ω ) = Γ ( ω ) u (0) , (2.6)where Γ ( ω ) = Γ ( t, τ ) , is the fundamental matrix obtained by solving (2.4). Therefore, we identify F ( t ) = u (0) in (2.2). So, (2.6) becomes x ( t, τ ) = Γ ( t, τ ) F ( t − τ ) . (2.7)For a linear V ( t, τ ) , Γ ( τ ) components are exponential functions. Equation (2.7) can also beused to express the solution of a general non-linear V ( t, τ ) , but if it is not the case, such systemwill not be considered in this work. On the other hand, we can also assume that F i ( t ) can bewritten as F i ( t ) = n X j Z ∞ Ω ij ( t, τ ) x j ( t, τ ) dτ, (2.8)in which Ω ( t, τ ) is related to the generation of infected individuals in " i " due to " j ". If Ω does notdepend on τ , as in ODE models, it assumes the form of Ω ( t ) = h ∂∂x j F i ( t ) i (2.9)Most system in literature satisﬁes (2.7) and (2.8), thus the method in this work is very generaland can be applied to a wide range of models. (b) Obtaining the reproduction numbers To estimate the reproduction number using the reported data of new infections, we need to linkthem with the equations of the model. This can be done by substituting (2.7) in (2.8). Arrangingthe equation we get F ( t ) = Z ∞ A ( t, τ ) F ( t − τ ) dτ, (2.10)for r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... A ( t, τ ) = Ω ( t, τ ) Γ ( t, τ ) . (2.11)The functions A ij ( t, τ ) , analogously to [10], are referred as the rate of new infections in X i dueto X j previous infections at a calendar time t and infection-age τ , whereby A ij ( t, τ > t ) ≡ . So,we can describe the new cases in " X i " from the cases that occurred previously in all the groups.In fact, (2.10) is a general form for the widely known renewal equation, [3,10], actually a sum ofrenewal equations. Thus, by deﬁning the number of new infections in the " i " compartment due to" j " previews infections as J ij ( t ) = Z ∞ A ij ( t, τ ) F j ( t − τ ) dτ, (2.12)it becomes clear that F i ( t ) = P nj J ij ( t ) . Thus, it is possible to quantify the inﬂuence of theinfections occurred in one compartment new ones in another compartment. To obtain theexpected number of new infections in X i a newly infected individual form X j is expected togenerate, thus the reproduction number, we integrate A ij ( t, τ ) from zero to inﬁnity with respectto τ , as in [10]. R ij ( t ) = Z ∞ A ij ( t, τ ) dτ. (2.13) R is usually called the next-generation matrix, where its entries corresponds to thereproduction numbers of the system [14]. It is remarkable that, for a ODE model where V islinear, the next-generation matrix evaluated at the free-disease ﬁxed point R| x , is equivalent tothe one developed in [19]. Naturally from (2.13), we deﬁne g ij ( t, τ ) = A ij ( t, τ ) R ∞ A ij ( t, τ ) dτ , (2.14)where g ij ( t, τ ) is normalized and denoted as the generation interval distribution [3,10]. Thus, itis related to the ﬂow of individuals between infected compartments and their recover process.We assume a generation interval distribution depending on t and τ , which is general enough tobe applied to models whose dynamics can change over time. Therefore, using (2.13) and (2.14) in(2.12) we obtain: J ij ( t ) = R ij ( t ) Z ∞ g ij ( t, τ ) F j ( t − τ ) dτ. (2.15)It is important to highlight that R ij ( t ) is not necessarily the number of new infected in X i generated by X j . Instead, the meaning of R ij ( t ) is linked to the generation of new infected onesin “ i ”, F i ( t ) , due to infected individuals previously generated in “ j ”, F j ( t − τ ) , regardless of whatstage of the disease these infected individuals who entered “ j ” are at instant t . That is the casebecause a new infected individual may generate new infections through out many stages of thedisease. In the SEIR model, for example, all of the new infections are in the E compartment, butthe individual only becomes infective when it goes to the I . To access the reproduction numbera newly infected in X j is expected to generate to all of the other compartments, we can just sumthe R ij over i , deﬁning R j ( t ) = n X i R ij ( t ) . (2.16)Through out this work, the over-line represents the merge of the ﬁrst index , in this case, in theform of a sum. This lead us to analogously deﬁne A j = P ni A ij , whereby its integral from zero toinﬁnity with respect to τ is R j . Thus, we are able to write J j ( t ) = R j ( t ) Z ∞ g j ( t, τ ) F j ( t − τ ) dτ, (2.17)where J j ( t ) = P ni J ij ( t ) and r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... g j ( t, τ ) = A j ( t, τ ) R ∞ A j ( t, τ ) dτ = P ni R ij ( t ) g ij ( t, τ ) P ni R ij ( t ) (2.18)Therefore, J j ( t ) is the total number of new infections generated by previews infections in X j . It is interesting that the generation interval distribution g j ( t, τ ) takes the form of a weightedaverage in which the g ij ( t, τ ) are the weights.Thus, the implementation of the proposed method in this work amounts to: identifying theterms F and V from a model; using them to ﬁnd Ω and Γ ; obtaining A and integrating to get R and g . Further in this work we present applications of the method and estimations using actualdata. Examples of the method in different types of models can be found on the SupplementaryMaterial 1. (c) The total reproduction number After obtaining the reproduction numbers of the constituents of the system, we now intend todeﬁne a reproduction number for the whole system. The number of new infections in a group, F i ( t ) , can be described as a fraction of the total number of cases from all groups, F T ( t ) = P ni F i ( t ) such that F i ( t ) = α i ( t ) F T ( t ) . (2.19)Since α i ( t ) is the proportion of the total number of cases that F i ( t ) represents, the condition P i α i ( t ) must be satisﬁed. Thus, using (2.10) with (2.19), we obtain: F T ( t ) = R T ( t ) Z ∞ g T ( t, τ ) F T ( t − τ ) dτ, (2.20)where R T ( t ) = α · R is the reproduction number of the system and g T ( t, τ ) = P i α i ( t ) R i ( t ) g i ( τ ) P i α i ( t ) R i ( t ) . (2.21)Since R T ( t ) is the scalar product between α and R , we can interpret it as the projection ofthe R over the fractions α forming a weighted average. Thus, it is interesting to note that we candescribe the system as an average of its heterogeneities. Since the deﬁnition of the total number ofreproductions R T is very general, it is not always meaningful. For example, if we form a systemof two independent dynamics, that is, ( R ij = 0 for i = j ) it is still possible to obtain a R T , evenif there is no meaning to it. The α i ( t ) functions are the key for analysing the feasibility of a totalreproduction number that has a dynamical meaning.We focus our attention on a case where the α i ( t ) appears naturally in the equations. If Ω ij ( t, τ ) can be separated in a function of t depending on the " i " index and another function of t and τ depending on the " j " index, then it can be written as Ω = α ⊗ Ω = h α i ( t ) Ω j ( t, τ ) i (2.22)where ⊗ represents a tensor product. We deﬁne Ω j = P i Ω ij , and noticed that A j = P k Ω k Γ kj e A = α ⊗ A . Thus, Ω and A can be separated in proportions, α and Ω . In fact, the above equationalso impacts (2.16) and (2.17) which can be factorized in terms of R = α ⊗ R e J = α ⊗ J .Using (2.18) we can also get g ij ( t, τ ) = g j ( t, τ ) . Furthermore, because the next generation matrixis obtained from a tensor product of vectors, the largest eigenvalue of R corresponds to the scalarproduct of R e α , that is R T . Thus, in these systems the total reproduction number assessed at thedisease-free equilibrium point, t = 0 , corresponds to the basic reproduction number, R T (0) = R .Also, in that case, the R T is the spectral radius of R , as in [19]. This is very common in diseasetransmission models, in fact the SIR and SEIR model are examples of this case. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... (d) Estimations with real data So far, we’ve developed a general framework to estimate the reproduction numbers from the rateof new infections F . However, when we deal with real data what we have is the collection of allof the new infections in an ∆t period of time, which leads to the deﬁnition of B and T . B i ( t ) = ρ i Z t + ∆tt F i ( t ) dt , T ij ( t ) = ρ i Z t + ∆tt J ij ( t ) dt . (2.23)Therefore B i ( t ) are the reported cases in X i , where ρ i is a constant related to a higher orlower notiﬁcation, due to sub/super-notiﬁcation. Analogously we deﬁne T ij ( t ) as the numberof reported cases in X i due to previous cases in X j . By assuming that R ij , g ij ( t, τ ) and F i areapproximately constants during a ∆t interval, we use (2.15) and (2.8) to derive T ij ( t ) = R ij ( t ) t X τ =0 ρ i ρ j g ij ( t, τ ) B j ( t − τ ) ∆t (2.24)for B i ( t ) = P nj T ij . (2.24) is a general form for the discrete version of the renewal equation. In fact,by using (2.20) we are able to recover a well known result in literature [17] R T ( t ) = B T ( t ) P tτ =0 ρ i ρ j g T ( t, τ ) B T ( t − τ ) ∆t , (2.25)where B T ( t ) = P ni B i ( t ) .

3. Explicit expression of R ( t ) for two meta-population models In this section we analyse two meta-population models detailed in Supplementary Material 2.The methodology developed on this study is applied to both models to obtain the correspondentreproduction numbers and generation interval distributions, see Supplementary Material 1 fordetailed calculations. (a) SIR-type meta-population model

In this section we will consider a meta-population model takes into account groups of spatiallyseparated "island" populations with some level of interaction. Such models are widely used forsystems in which the movement of individuals between meta-populations is considered [20–26].Since each population is connected with the others, the system can be interpreted as a networkfor which the nodes represent the meta-populations and the weight of their edges representthe intensity of the movement between them. This type of model does not describe the dailymovement of individuals explicitly, but as an interaction of the meta-populations. It is suitablewhen the population sizes are not permanently affected by the ﬂow of individuals, as in the caseof commuter movement between locations of residence, work and study. This type of movementis obligatory cyclical, predictable and recurring regularly, most of the time on a daily basis. In thismodel, we assumed that each meta-population i , with N i individuals, has its own transmissionrate β i ( t ) . Also, the movement of individuals between meta-populations is taken into account,where we introduce Φ ij ( t ) as the density of ﬂow between the populations i and j . That is, theamount of i resident individuals commuting from i to j divided by the total population of i , N i . The parameters β i ( t ) and Φ ij ( t ) are time dependent, so we can incorporate the changes in thebehavior of the populations on those variables.In Supplementary Material 2 we derive this model,inspired by [26], and in Supplementary Material 1 we obtain the corresponding reproductionnumbers and generation interval distribution for it, which reads: r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... R ij ( t ) = S i ( t ) λ ij ( t ) γ , g ij ( τ ) = g ( τ ) = γ e − γ τ . (3.1)Here λ ij is related to the transmission of the disease from a meta-population “ i ” to other meta-population “ j ” and is derived based on simple assumptions about the commuter movementof individuals in the network, see Supplementary Material 2. Notably, if we isolate the meta-populations from the network, Φ ij ( t ) = 0 , for all i and j , then the reproduction numbers and thegeneration interval distributions becomes identical to the classical SIR model [10]. (b) A meta-population model for Covid-19 (SEIIR) The classical SIR model is a very simple and qualitative approach to disease transmissiondynamics, but it does not provide the best description for most diseases. For now on, we willbe focusing on a model for a speciﬁc disease, the SARS-Cov-2 coronavirus. In this case, thetransmission can be facilitated by the existence of individuals whose symptoms are very weakor even nonexistent [27], therefore, this heterogeneity can change the dynamics. In order to haveconsistence description of this aspect, it is wise to assume that the existence of two classes ofinfected individuals, the symptomatic and the asymptomatic/undetected ones, as consideredin a general model for the SARS-Cov-2 coronavirus [28]. Therefore, we now take into accountinfected individuals who do not need to be hospitalized and are not recorded in ofﬁcial registereddata, thus becoming undetectable. For the sake of simplicity, we will refer to those individualsonly as asymptomatic ones, for simplicity. In Supplementary Material 2 we proceed to derivethis model, based in the meta-population SIR-type approach described in the previous section. InSupplementary Material 1 the expressions for the reproduction number and generation intervaldistribution are detailed derived. There, we show that we only need R for i, j ≤ n to describe thedynamics. Thus, in this main framework, whenever we refer to R or g we will be alluding to their i, j ≤ n elements. Thus, we obtain: R ij ( t ) = S i ( t ) λ ij ( t ) h pγ s + δ (1 − p ) γ a i , g ij ( τ ) = g ( τ ) = pγ s g s ( τ ) + δ (1 − p ) γ a g a ( τ ) pγ s + δ (1 − p ) γ a . (3.2)the λ ij ’s expressions are the same presented for the SIR-type model in Supplementary Material2. δ is a factor that reduces or enhances the asymptomatic infectivity, p is the proportion ofindividuals that becomes symptomatic when infected, γ a and γ s are the recover rates of theasymptomatic and symptomatic individuals, respectively. g a ( τ ) and g s ( τ ) are expressed in termsof exponential functions described according to (3.3). g a ( τ ) = κγ a γ a − κ ( e − κτ − e − γ a τ ) , g s ( τ ) = κγ s γ s − κ ( e − κτ − e − γ s τ ) . (3.3)Whereby, if we isolate the meta-populations from the network, Φ ij ( t ) = 0 , for all i and j , thereproduction numbers and generation interval distributions returns to the expression obtainedin [28].

4. Applications for the meta-population models using actual data

In this section, we present results for the methodology applied to the meta-population modeldeveloped in the previews section. We use actual data of the ﬁrst six months of the COVID-19 pandemic in Brazilian cities, such as: reported cases in each municipality, daily commutermovement due to work between municipalities and daily mobility tends towards workplaces.In Supplementary Material 3 we derive and present the expressions and parameters needed toestimate the reproduction numbers for both meta-population models. Thus, we obtain a dailytime series of the reproduction numbers for each model. r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... (a) Database In this work, we use daily notiﬁcations of new cases due to COVID-19 in Brazil, obtained frompublic websites: https://covid.saude.gov.br/ and https://brasil.io/datasets/ ,which provide data from the Health Ministry. We obtained intermunicipal commuter movementof workers and students data from a study on population arrangements and urban concentrationsin Brazil conducted by IBGE (Brasilian Institute of Geography and Statistics) in 2015 that can befound at [29]. In addition, we obtain daily mobility data for each Brazilian state from a publicreport by Google, accessed at: https://google.com/covid19/mobility/ .We used the data observed until September 14th, 2020, and performed a 10-day movingaverage, in order to attenuate noise and better express the data trend. To take the social distancingrestrictions into account, we considered only the commuter movement data related to work, since,with the mitigation measures of COVID-19, the ﬂow due to education were signiﬁcantly reduced.In addition, because the movement towards work also dropped due to social isolation policies,we used the mobility data obtained from the community mobility report provided by Google toestimate this reduction. This database compares for each state the daily mobility to workplaceswith the past trends, therefore, we can access the percentage of reduction in commuting to work.A moving average is performed in the mobility time series and we reduce the intermunicipalwork ﬂow according to the percentage indicated on the data. Thus, we obtain the ﬂow ofindividuals due to commuter work that leads to Φ ij ( t ) . The parameters used to feed the modelwere obtained in [30] and are displayed in Supplementary Material 3.The data from the state of Rio de Janeiro was selected for this analysis. Ten cities with thehighest commuter ﬂow with the capital Rio de Janeiro (RJ), were chosen, namely: Duque deCaxias (DdC), Nova Iguaçu (NI), São João de Meriti (SJdM), Niterói (Nt), São Gonçalo (SG),Belford Roxo (BR), Nilópolis (Ns), Mesquita (Mq), Queimados (Q), Magé (Ma). All of the chosencities for this work are part of the Rio de Janeiro’s metropolitan region. Additional informationabout the municipalities is presented in Supplementary Material 4. (b) Analyses of the results In our ﬁrst results, shown in Figure 1, we present a comparison between the SIR and SEIIRoutputs. Using the daily time series of the reproduction numbers, see Supplementary Material3, we obtain the series of R ( t ) and T ( t ) from (2.16) and (2.24), respectively. We observe thatthe R of SEIIR model is, in the average, higher then the corresponding results for the SIRmodel. On the other hand, the estimations of the total number of exported cases that are reported, P t P ni T ij ( t ) for i = j , is very similar for both SIR and SEIIR models Also, it seems that the totalcommuter movement, which is the sum of all the inﬂow and outﬂow happening in a municipality,is not the only main factor that determine the number of exported cases of a municipality. Thisnon-linearity can be observed when comparing São João de Meriti (SJdM) and Niterói (Nt) orDuque de Caxias (DdC) and Nova Iguaçu (NI) (see Figure 1b). Those municipalities have a similaramount of total ﬂow but very different results for the exported cases. Interestingly, even nothaving the highest R j , the capital, Rio de Janeiro (RJ) presents the largest amount of exportedcases, which also showcases the non-linear dynamics of the phenomenon.From now on, we will only look at the results of the SEIIR model. With T ij ( t ) we are able toaccess the contribution of each municipality on the outbreaks happening in the state. Thus, bydividing T ij ( t ) by B i ( t ) in every time step, we obtain a time series for the proportion of the totalcases in “ i ” generated by “ j ”. Therefore, we proceed into evaluating the mean value through timeof T ij / B i , where it is displaced in Figure 2. It is observed a high autochthonous behavior on thedisease transmission, whereby the highest inﬂuence of a city is on itself. Therefore, most of thecases generated in a municipality are caused by its own individuals. However, we also identifycities where the cases generated by other municipalities on it are very important. The R ( t ) matrixalso corroborates the presence of an important autochthonous behavior as its diagonal elements r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... RJ DdC NI SJdM Nt SG BR Ns Mq Q Ma

Cities i (a) RJ DdC NI SJdM Nt SG BR Ns Mq Q Ma

Cities N º o f i n d i v i d u a l s (b) SIR SEIIR Movement

Figure 1. Comparison between SIR and SEIIR outputs.

SIR results in blue and SEIIR ones in orange. The bar graph(a) compares the SIR and SEIIR R i averages in time for all municipalities. In (b), for each municipality, estimations forthe total number of exported cases that are reported for both models are displaced with the total commuter movement inthat city. The names of the municipalities are abbreviated using acronyms: Rio de Janeiro (RJ), Duque de Caxias (DdC),Nova Iguaçu (NI), São João de Meriti (SJdM), Niterói (Nt), São Gonçalo (SG), Belford Roxo (BR), Nilópolis (Ns), Mesquita(Mq), Queimados (Q), Magé (Ma). RJ DdC NI SJdM Nt SG BR Ns Mq Q MaRJDdCNISJdMNtSGBRNsMqQMa

Figure 2. Inﬂuence on number of cases of a municipality in another.

The heatmap captures the average, throughtime, inﬂuence on number of cases of a municipality in another as a proportion of the total number of daily cases, T ij / B i . i corresponds to the rows and j to the columns. correspond to the highest values of the reproduction numbers. We also observed lots of verysmalloff-diagonal elements low values through out the matrix.In Figure 3 we illustrate the results of Figure 2, whereby only non-autochthonous inﬂuencesabove 5% are considered. Thus, we must highlight the capital, Rio de Janeiro (RJ), as the mostimportant agent on the disease transmission to the municipalities on the network. However, citieslike Nova Iguaçu (NI) also presents itself as a relevant disseminator of the pathogen. As shownin Figure 1, Nova Iguaçu NI is the highest city in case exportation, besides the capital. In Figure3 we identify cities like: São João de Meriti (SJdM), Belford Roxo (BR), Nilópolis (Ns), Mesquita r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... R J N t N I S G N s S J d M B R D d C M q Q M a Figure 3. Visualization of the inﬂuence between municipalities.

Here, we present a visualization of the resultsfrom Figure 2. Only non-autochthonous inﬂuences above 5% were considered. The thickness of the lines connectingmunicipalities is proportional to the number of cases that one generates on the other. The color of each line representsthe municipality that is generating the cases. N º o f c a s e s (a) N º o f c a s e s (b) Figure 4. Reconstruction of the time series.

The black dots represents the daily reported cases in (a) São Gonçalo(SG), (b) São João de Meriti (SJdM). The blue dots are the amount of cases generated in (a) São Gonçalo (SG), (b) SãoJoão de Meriti (SJdM) due to the capital, Rio de Janeiro (RJ). In red we have the number of cases reported in (a) SãoGonçalo (SG) due to Niterói (Nt); (b) São João de Meriti (SJdM) due to Nova Iguaçu (NI). (Mq) and Queimados(Q) as the main receptors of those cases. Niterói (Nt), even having a NI-like number of exported cases, did not presented a high inﬂuence on a lot of cities. On the otherhand, Niterói (Nt) generates a signiﬁcant amount of cases in São Gonçalo (SG), highlighting theimportance of the connection of those two municipalities.We illustrate the time series reconstruction in Figure 4 where two scenarios are displaced. Inthe ﬁrst one, we focus on the cases reported in São Gonçalo (SG) and compare total amount, B i ( t ) , with the number of daily cases in SG generated by Rio de Janeiro (RJ) and Niterói (Nt).We observe that Nt has a predominance over the capital, presenting a higher number of casesgenerated in all times. The second scenario is related to the total number of cases in São João de r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... Meriti (SJdM) and the contributions due to Rio de Janeiro (RJ) and Nova Iguaçu (NI). In this case,the capital presents the largest number of cases generated in that city, besides the city itself. It isalso interesting to observe in both scenarios how the cities contributions merge into a part of thetotal cases notiﬁed on a municipality.

5. Discussion

Understanding and dealing with infectious diseases is an on going challenge to humankind.The methodology presented in this work is very general and can be applied to multiple diseasetransmission systems. By merging the model dynamics with the available epidemic data, thismethod is able to estimate key epidemiological factors, like the reproduction numbers and thegeneration interval distributions. These theoretical results are the basis for a lot of possible dataanalyses, specially for highly heterogeneous systems that have been demanding for a suitablemethodology for estimating the reproduction number through actual data. The method is robustand reproduces known results in literature, as shown in Supplementary Material 1 for the SIR,SEIR and SEIIR models. This methodology opens room for the analyses of more sophisticatedmodels, that leads to a better understanding and control of infectious diseases.With the emerging of the COVID-19 pandemic, due to SARS-Cov-2 coronavirus, in December2019, scientists allover the world have joined forces to formulate control strategies [31]. Factorslike the presence of asymptomatic individuals, risk groups and spatial distribution of the diseasebrings out a system with multiples layers of heterogeneity [26,27,32,33]. Although the applicationof some vaccines starts recently, mitigating COVID-19 with non-pharmacological strategies isessential. Therefore, the second half of this work we focused on the application of the developedmethodology for two models with spatial heterogeneity, one focusing on the COVID-19 speciﬁcdynamics. Each municipal demography, captured by β parameters, and the commuter movementwere combined by the model lead to some interesting results. The reproduction numbers areobtained for both models indicate that they only differ by multiplying factors, as /γ and pγ s + δ (1 − p ) γ a , as in [28,30]. The measured results shows that the asymptomatic presence is relatedto an increase on the reproduction number, because the model predicts more infected individualsthan what is reported. Another substantial result is obtained when comparing the exported caseswith the total commuter movement in a municipality. In this case, the relationship is not direct,since the intricate topology of the network fosters a non-linear dynamic in the phenomenon. Thisresult points up the importance of a model to provide epidemiological meaning of the availabledata. For this speciﬁc case, it became evident that analysing the system only by the commutermovement data wouldn’t be enough to point out the key cities on the disease transmissiondynamic, as we did using the models and methodology.The results of the model display the role of each municipality on the epidemic on the network.Cities like Nova Iguaçu, Niterói and São Gonçalo pops out as important agents on the spreadof the pathogen through out the metropolitan region of Rio de Janeiro, Brazil. The capital ishighlighted as the main hub of spreading, which is related to its high incidence of the diseasecombined with its central role on the movement behavior of individuals on the network, as alsoobserved in [26,30,34]. However, cases like Niterói and São Gonçalo, where the interaction of bothcities is higher than the capital’s inﬂuence, cannot be neglected. This call’s the attention for therelevance of analysis, like in this work, that provides the epidemiological interpretation of data.In this work, we choose to present an analysis with actual data in which the reproduction numberis not the main result. We proceeded with this approach to portrait the reproduction number notas a dead end result, but as a tool for obtaining deeper analyses, such as the reconstruction of thetime series and the number of exported cases.Our results for the Rio de Janeiro metropolitan area, however, have limitations. The modeldeveloped in this work provides a very simple approximation of intercity commuter ﬂow,while more sophisticated models available in the literature are required to provide more precisedescription of the movement behavior. Another limitation of this work is that it does not consider r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... the interstate ﬂow, since the state of Rio de Janeiro was portrayed as a closed system Also, we didnot take into account further heterogeneous features, which could be accounted as well by thegeneral theoretical formalism, besides space and asymptomatic presence, leaving a gap for otherkey COVID-19 dynamics characteristics, like age-groups. In addition, the examples presented inthis work considered actual reported data of conﬁrmed cases and mobility, which may presentproblems of underestimation and report delay. Taking those limitations into account, the resultsdisplaced here are still able to give substantial understanding of the system studied, which is acommon feature in mathematical modelling. Finally, we reinforce the importance of the methodproposed on this work and highlight its broad application on infectious disease models.Ethics. Since all data handled in this study is publicly available, an approval by an ethics committee is notrequired, according to Resolutions 466/2012 and 510/2016 (article 1, sections III and V) from the NationalHealth Council (CNS), Brazil.

Data Accessibility.

All data and codes are gathered and presented in our public GitHub repository at https://github.com/danielcpj/Rt-heterogeneous-models

Authors’ Contributions.

DCPJ, STRP and RFSA conceived of and designed the methodology presentedin this study. DCPJ, STRP, JGVM and JFO formulated and interpreted the meta-population models andapplications. DCPJ performed the data analysis and drafted the manuscript. All authors read, reviewed andapproved the manuscript.

Competing Interests.

The authors declare that they have no competing interests.

Funding.

DCPJ was funded by a Scientﬁic Initiation scholarship from CNPq (process number 117568/2019-8). JFO was supported by the Fiocruz Program of Promotion of Innovation - innovative ideas and products -COVID-19, orders and strategies - Inova Fiocruz (Processo VPPIS-005-FIO-20-2-40), and the Center of Dataand Knowledge Integration for Health (CIDACS) through the Zika Platform - a long-term surveillanceplatform for Zika virus and microcephaly (Uniﬁed Health System (SUS), Brazilian Ministry of Health).STRP was supported by an International Cooperation grant (process number INT0002/2016) from BahiaResearch Foundation (FAPESB). RFSA was supported by Brazilian agency CNPq through Grants No.422561/2018-5 and 304257/2019-2. STRP and RFSA were supported by the National Institute of Science andTechnology - Complex Systems from CNPq, Brazil. JGVM acknowledges the support of the National Councilof Technological and Scientiﬁc Development, CNPq, Brazil (Grant number: 307828/2018-2).

Acknowledgements.

The authors acknowledge the discussions and suggestions from members of theCoVida Network ( ). Supplementary Materials

Supplementary Material 1-

Methodology applied for epidemiological compartment models

Supplementary Material 2-

Meta-population models formulation.

Supplementary Material 3-

Expressions and parameter values to estimate the reproductionnumbers of the meta-population models

Supplementary Material 4-

Additional information about the municipalities.

References

1. Brauer, F. 2008

Epidemic Models with Heterogeneous Mixing and Treatment . Bulletin ofMathematical Biology 70, 1869-1885. (doi:10.1007/s11538-008-9326-1)2. Hyman, J. & Li, J. 2000

An intuitive formulation for the reproductive number for the spread ofdiseases in heterogeneous populations . Mathematical Biosciences 167, 65-86. (doi:10.1016/s0025-5564(00)00025-0)3. Wallinga, J. & Lipsitch, M. 2006

How generation intervals shape the relationship between growth ratesand reproductive numbers . Proceedings of the Royal Society B: Biological Sciences 274, 599-604.(doi:10.1098/rspb.2006.3754)4. Martcheva, M. 2015

An introduction to mathematical epidemiology . New York; Heidelberg:Springer. (doi:10.1007/978-1-4899-7612-3_12) r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A ..........................................................

5. Kermack, W. & McKendrick, A. 1927

A contribution to the mathematical theory of epidemics .Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematicaland Physical Character 115, 700-721.(doi:10.1098/rspa.1927.0118).6. Park, S., Champredon, D. & Dushoff, J. 2020 Inferring generation-interval distributionsfrom contact-tracing data. Journal of The Royal Society Interface 17, 20190719.(doi:10.1098/rsif.2019.0719)7. Huber, J., Johnston, G., Greenhouse, B., Smith, D. & Perkins, T. 2016 Quantitative, model-based estimates of variability in the generation and serial intervals of Plasmodium falciparummalaria. Malaria Journal 15. (doi:10.1186/s12936-016-1537-6)8. Lessler, J. et al. 2016 Times to key events in Zika virus infection and implications forblood donation: a systematic review. Bulletin of the World Health Organization 94, 841-849.(doi:10.2471/blt.16.174540)9. Nishiura, H., Linton, N. & Akhmetzhanov, A. 2020 Serial interval of novel coronavirus(COVID-19) infections. International Journal of Infectious Diseases 93, 284-286.(doi:10.1016/j.ijid.2020.02.060)10. Nishiura, H. & Chowell, G. 2009

The Effective Reproduction Number as a Prelude toStatistical Estimation of Time-Dependent Epidemic Trends . Mathematical and Statistical EstimationApproaches in Epidemiology , 103-121. (doi:10.1007/978-90-481-2313-1_5).11. Champredon, D. & Dushoff, J. 2020 Intrinsic and realized generation intervals in infectious-disease transmission. (doi:10.1098/rspb.2015.2026)12. Akkouchi, M. 2004

On the convolution of exponential distribution . J. Chungcheong Math. Soc..21.13. Champredon, D., Dushoff, J. & Earn, D. J. D. 2018 Equivalence of the Erlang-distributed SEIRepidemic model and the renewal equation. (doi:10.1137/18M1186411)14. Nishiura, H., Castillo-Chavez, C., Safan, M., & Chowell, G. 2009 Transmission potential ofthe new inﬂuenza A(H1N1) virus and its age-speciﬁcity in Japan. Euro Surveill, 14, 19227.(doi:10.2807/ese.14.22.19227-en)15. Park, S., Champredon, D., Weitz, J. & Dushoff, J. 2019 A practical generation-interval-based approach to inferring the strength of epidemics from their speed. Epidemics 27, 12-18.(doi:10.1016/j.epidem.2018.12.002)16. Arenas A, Cota W, Gómez-Gardeñes J, Gómez S, Granell C, Matamalas JT, Soriano-Paños D, Steinegger B. 2020 Modeling the Spatiotemporal Epidemic Spreading of COVID-19and the Impact of Mobility and Social Distancing Interventions. Phys. Rev. X 10, 041055.(doi:10.1103/PhysRevX.10.041055)17. Fraser C. 2007 Estimating Individual and Household Reproduction Numbers in an EmergingEpidemic. PLOS ONE 2, e758. (doi:10.1371/journal.pone.0000758)18. Hethcote, H. 1978

An immunization model for a heterogeneous population . Theoretical PopulationBiology 14, 338-349. (doi:10.1016/0040-5809(78)90011-4)19. van den Driessche, P. & Watmough, J. 2002 Reproduction numbers and sub-threshold endemicequilibria for compartmental models of disease transmission. Mathematical Biosciences 180,29-48. doi:10.1016/s0025-5564(02)00108-620. van den Driessche, P. 2008 Spatial Structure: Patch Models. Mathematical Epidemiology , 179-189. (doi:10.1007/978-3-540-78911-6_7)21. Bichara, D. & Iggidr, A. 2017 Multi-patch and multi-group epidemic models: a newframework. Journal of Mathematical Biology 77, 107-134. (doi:10.1007/s00285-017-1191-9)22. Lloyd, A. & May, R. 1996 Spatial Heterogeneity in Epidemic Models. Journal of TheoreticalBiology 179, 1-11. (doi:10.1006/jtbi.1996.0042)23. Lajmanovich, A. & Yorke, J. 1976 A deterministic model for gonorrhea in a nonhomogeneouspopulation. Mathematical Biosciences 28, 221-236. (doi:10.1016/0025-5564(76)90125-524. Arino, J. & Portet, S. 2015 Epidemiological implications of mobility between a largeurban centre and smaller satellite cities. Journal of Mathematical Biology 71, 1243-1265.(doi:10.1007/s00285-014-0854-z)25. Keeling, M., Danon, L., Vernon, M. & House, T. 2010 Individual identity and movementnetworks for disease metapopulations. Proceedings of the National Academy of Sciences 107,8866-8870. (doi:10.1073/pnas.1000416107)26. Miranda, J. G. V. et al. 2021 Scaling effect in covid-19 spreading: The role of heterogeneityin a hybrid ode-networkmodel with restrictions on the inter-cities ﬂow. Physica D: NonlinearPhenomena 415, 132792. (doi:10.1016/j.physd.2020.132792) r s pa . r o y a l s o c i e t y pub li s h i ng . o r g P r o c R S o c A .......................................................... stimating the eﬀective reproduction number forheterogeneous models using actual data D. C. P. Jorge , J. F. Oliveira , J. G. V. Miranda , R. F. S. Andrade , S. T. R. Pinho Instituto de F´ısica Te ´orica, Universidade Estadual Paulista - UNESP, R. Dr. Teobaldo Ferraz 271, S ˜ao Paulo01140-070, Brazil Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonc¸alo Moniz, Fundac¸ ˜ao OswaldoCruz, Salvador, Bahia, Brazil Instituto de F´ısica, Universidade Federal da Bahia, Salvador, Bahia, Brazil * Correspondence: Daniel C. P. Jorge ([email protected]).

ABSTRACT

This document contain the Supplementary Materials for the manuscript entitled "Estimating the eﬀective reproduction number forheterogeneous models using incidence data" by Jorge et al . a r X i v : . [ q - b i o . P E ] F e b upplementary Material 1 Methodology applied to epidemiological compart-ment models pplying the method to epidemiological models

We proceed to illustrate the application of the method developed in this work to compartmental epidemic models.We chose two simple models that are known in literature and two variations of a meta-population model that isused in the main framework and developed in Supplementary Material 2. We seek to show the method step by step,presenting its detailed calculations. All models presented are composed of ordinary diﬀerential equations. In thetransitions between infected compartments described in these models, V ( t, τ ), the the infected compartments x ( t, τ )appear with linear dependence. Therefore we can simplify: ∂ V i ∂u j = ∂ V i ∂x j = cte and ddω u ( ω ) = − ∂ V ∂ x u ( ω ) . (1)In addition, all the parameters of the models in this Supplementary Material are constant, which leads to Ω ( t ) = ∂∂ X F ( t ) and Γ ( ω ) = Γ ( τ ) (2) SEIR model

The SEIR model is designed by introducing a new exposed E stage of the disease into the SIR model . We canassume that when an individual becomes infected, it must go through a latency period before showing its ﬁrstsymptoms and starting to infect other individuals. This is accomplished by introducing the exposed compartment E and its removal rate κ . Thus, all individuals who are infected start in the exposed state and, on average, aftera 1 /κ latency time are introduced into the I compartment. It is also possible to introduce a factor related to a (cid:15) pre-symptomatic infection. This way, individuals can start to infect in the exposed compartment, before presentingsymptoms. Thus, this model, considering pre-symptomatic infection, can be written as dSdt = − βSN h I + (cid:15)E i , (3) dEdt = βSN h I + (cid:15)E i − κE, (4) dIdt = κE − γI, (5) dRdt = γI. (6)Thus, we can sort the infected compartments as X ( t ) = [ E ( t ) , I ( t )]. In this way the distributions of the infectiousphase are deﬁned as x ( t, τ ) = [ i e ( t, τ ) , i i ( t, τ )]. We get F ( t ) e V ( t, τ ): F ( t ) = βSN (cid:2) I + (cid:15)E (cid:3) ! , V ( t, τ ) = κi e ( t, τ ) γi i ( t, τ ) − κi e ( t, τ ) ! , (7)as we recover the sub-set of equations for the infected compartments with: ddt X ( t ) = F ( t ) − Z ∞ V ( t, τ ) dτ. (8)The change from E to I is not considered a new infection, but the progression of the disease stage. Therefore, newinfections only occur in the exposed compartment E , causing F ( t ) = 0. Thus, from F ( t ), we obtain Ω ( t ): Ω ( t ) = h ∂∂ X F ( t ) i = (cid:18) (cid:15) βSN βSN (cid:19) . (9)We proceed to obtain Γ( τ ) by solving ddω u ( ω ) = − ∂ V ∂ x u ( ω ) = (cid:20) − κ κ − γ (cid:21) u ( ω ) . (10) n order to obtain u ( ω ) = (cid:20) e − κω κγ − κ [ e − κω − e − γω ] e − γω (cid:21) u (0) . (11)Therefore: Γ ( τ ) = (cid:20) e − κω κγ − κ [ e − κτ − e − γτ ] e − γτ (cid:21) (12)With Ω ( t ) and Γ ( τ ) we perform the multiplication between the matrices, in order to obtain A ( t, τ ) = (cid:20) (cid:15) βSN βSN (cid:21) (cid:20) e − κτ κγ − κ [ e − κτ − e − γτ ] e − γτ (cid:21) . (13)All terms in the second line are null, leaving only A ( t, τ ) = (cid:15) βSN e − κτ + βSN κγ − κ [ e − κτ − e − γτ ] , (14) A ( t, τ ) = βSN e − γτ . (15)Performing the integral from zero to inﬁnity with respect to τ we obtain the reproduction numbers of the system R ( t ): R ( t ) = β SN  (cid:15)κ + 1 γ γ  . (16)Where it is clear that R = β SN  (cid:15)κ + 1 γ γ  . (17)Since there is only generation of new infected in the exposed compartment, we have to F ( t ) = F T ( t ). Thus, thevector α ( t ) can be written as α = (cid:0) , (cid:1) , thus R = α ⊗ R = (cid:18) (cid:19) O β SN  (cid:15)κ + 1 γ γ  . (18)Thus, to obtain the total reproduction number of the system, it is enough to make the scalar product between R and α : R T ( t ) = α · R = βSN h (cid:15)κ + 1 γ i , (19)which leads to g T ( τ ) = (cid:15) e − κτ + κγ − κ [ e − κτ − e − γτ ] (cid:15)/κ + 1 /γ . (20) hen we make (cid:15) →

0, we retrieve the result of the reproduction number of the classic SEIR model (withoutpre-symptomatic infection), R = β/γ . Similarly, the generation interval distribution of the SEIR model is alsoreduced to the well-known form in the literature , demonstrating robustness in the method. It is trivial to apply themethod to the SIR model, whose analysis corresponds to tanking a limit of κ → ∞ . Therefore, both the reproductionnumber and the generation interval distribution, g ( τ ) = γe − γτ , return to the known results from literature . SIIR model

This is an adaptation of the SIR model, where we include two diﬀerent manifestations of the disease, I and I .In this model, there is only one susceptible population whose individuals can evolve into two compartments thatcarry the infectious agent. Both types of the disease carry the same pathogen, so that individuals infected witheither type can go for I and I . When infected, an individual has the probability p of manifesting the type I ofthe disease and q = (1 − p ) of manifesting the type I . We assume two independent infection rates β and β for I and I . Likewise, each slot has its own γ or γ removal rate. So, we write the model equations: dSdt = − ( β I + β I ) SN , (21) dI dt = p ( β I + β I ) SN − γ I , (22) dI dt = q ( β I + β I ) SN − γ I , (23) dRdt = γ I + γ I . (24)We sort the infected compartments as X ( t ) = [ I ( t ) , I ( t )] and x ( t, τ ) = [ i ( t, τ ) , i ( t, τ )], where it is clear that X ( t ) = R ∞ x ( t, τ ) dτ . We obtain F ( t ) and V ( t, τ ) as: F ( t ) = SN p ( β I + β I ) q ( β I + β I ) ! , V ( t, τ ) = (cid:18) γ i ( t, τ ) γ i ( t, τ ) (cid:19) . (25)Thus Ω ( t ) = SN pβ pβ qβ qβ ! , − ∂ V ∂ x = (cid:18) − γ − γ (cid:19) . (26)The linear O.D.E system described by the matrix − ∂ V /∂ x it is simple to solve, since this is a diagonal matrix.So, we get Γ ( τ ) = (cid:18) e − γ τ e − γ τ (cid:19) , (27)which, when multiplied by the matrix Ω ( t ) results in A ( t, τ ) = SN pβ e − γ τ pβ e − γ τ qβ e − γ τ qβ e − γ τ ! (28)It is interesting to realize that this is one of the cases where A ij ( t, τ ) = α i A j ( t, τ ), recalling that A j ( t, τ ) = P i A ij ( t, τ ).Therefore α = (cid:0) p, q (cid:1) and we can factorize the matrix into A and α as: A = α ⊗ A = (cid:18) pq (cid:19) O  β SN e − γ τ β SN e − γ τ  . (29) here ⊗ represents the tensorial product, α ⊗ A = h α i A j i . Integrating A ( t, τ ) in relation to τ we get: R ( t ) = Z ∞ A ( t, τ ) dτ = SN β /γ β /γ ! . (30)Therefore, we proceed to the scalar product between α and R : R T ( t ) = α · R = S ( t ) N (cid:20) p β γ + q β γ (cid:21) . (31)When t → R = pβ /γ + qβ /γ . We realize that the total reproduction number is the sum of the reproductionnumbers of the two types of infection times the percentage of occurrence of each. Thus, we proceed to obtain thedistribution of the generation interval, which takes the form: g T ( τ ) = pβ e − γ τ + qβ e − γ τ β /γ + β /γ . (32) SIR-type meta-population model

In this section we consider a meta-population model that will be used in the main framework and is detailed atSupplementary Material 2. Here we summarize the model. We consider the existence of “ n ” meta-populations withcoupled SIR-type dynamics. Where, due to the movement of individuals between meta-populations, an infectedcompartment in one meta-population can inﬂuence the disease transmission process of all the others. The couplingof the equations happens by the transmission rates λ ik related to the contamination process that emerges from theﬂow of individuals. The SIR-type model for n meta-populations can be written as dS i dt = − n X j λ ij ( t ) I j ( t ) S i ( t ) , (33) dI i dt = n X j λ ij ( t ) I j ( t ) S i ( t ) − γ I i ( t ) , (34) dR i dt = γI i ( t ) , (35)in which S i ( t ), I i ( t ) and R i ( t ) correspond to the susceptible, infected and removed individuals that belong to themeta-population “ i ”. The λ ij parameter is related to the transmission between the meta-populations “ i ” and “ j ”,see Supplementary Material 2. We assume that the recover rate γ is uniform for all meta-populations.The infectedcompartments are sorted as X ( t ) = [ I ( t ) , I ( t ) , . . . , I n ( t )] and x ( t, τ ) = [ i ( t, τ ) , i ( t, τ ) , . . . , i n ( t, τ )], where it is clearthat X ( t ) = R ∞ x ( t, τ ) dτ . We proceed to identify F i ( t ) and V i ( t, τ ) as F ( t ) = " n X j λ ij ( t ) I j ( t ) S i ( t ) , V ( t, τ ) = " − γ i i ( t, τ ) (36)Therefore we get: Ω ( t ) =  λ S λ S . . . λ n S λ S λ S . . . λ n S ... ... . . . ... λ n S n λ n S n . . . λ nn S n  , − ∂ V ∂ x =  γ . . . γ . . . . . . γ  . (37)It is clear to see that − ∂ V /∂ x is a diagonal matrix, such that ddω u ( ω ) = − ∂ V ∂ x u ( ω ) are of trivial solution Γ ( τ ) = e − γτ I , (38) here I represents the identity matrix of dimension “ n ”. Thus, when multiplying the matrices Ω and Γ we arrive at: A ( t, τ ) = h λ ik S i e − γτ i (39)Integramos A de forma a obter a matriz de próxima geração do sistema R , dada por: R ( t ) = " λ ik S i γ . (40)That leads to: g ij ( τ ) = g ( τ ) = γ e − γ τ . (41)Whereby, if only one meta-population is considered, the SIR model results are recovered . SEIIR-type meta-population model

Finally, we present a meta-population model related to the transmission dynamics of SARS-Cov-2 coronavirus. Asin the previous section, the meta-populations are connected by the commuter movement of individuals. In thismodel, individuals that are infected have to pass through a latency period to become infectious, during that time,we consider that those are in the exposed compartment E . After a mean latency period of 1 /k , those infected canbecome symptomatic or asymptomatic, I s and I a respectively. While in these compartments, the individuals of“ j ” are able to generate new infected ones in “ i ” based on the transmission rate λ ij . However, asymptomatic onesare considered to have a lower transmissibility, thus, the transmission rate must be multiplied by a factor δ thatlowers it’s infectivity. The SEIIR model was developed by Oliveira in and in Supplementary Material 2 we derivean SEIIR-type meta-population model that can be written as: dS i dt = − n X j λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i S i ( t ) , (42) dE i dt = n X j λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i − κ E i ( t ) , (43) dI si dt = pκ E i ( t ) − γ s I si ( t ) , (44) dI ai dt =(1 − p ) κ E i ( t ) − γ a I ai ( t ) , (45) dR i dt = γ s I si ( t ) + γ a I ai ( t ) . (46)To proceed with the methodology, we sort the compartments as X = [ E , ..., E n , I a , ..., I na , I s , ..., I ns ] and x = [ i e , ..., i en , i s , ..., i ns , i a , ..., i na ] in a way that if there are n meta-populations we have 3 n infected compartments.Therefore, we obtain F ( t ) and V ( t, τ ) as: F i ( t ) = (P nj λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i , for i ≤ n , for i > n (47) V i ( t, τ ) =  κ x i ( t, τ ) , for i ≤ nγ s x i ( t, τ ) − pκ x i − n ( t, τ ) , for n < i ≤ nγ a x i ( t, τ ) − (1 − p ) κ x i − n ( t, τ ) , for 2 n < i (48)where I sj = X j + n and I aj = X j +2 n . Thus, from (47), we obtain: ( t ) =  s Ω a  , − ∂ V ∂ x =  − κ pκ − γ s − p ) κ − γ a  ⊗ I n . (49)Whereby I n is an n × n identity matrix and ⊗ is a tensor product. Also, Ω is divided into nine n × n submatrices,where is an all-zeroes n × n matrix, Ω s ≡ h λ ij S i i and Ω a ≡ δ Ω s . Solving the system of diﬀerential equations onthe characteristic line, we obtain: Γ ( τ ) =  e − κτ p κγ s − κ (cid:16) e − κτ − e − γ s τ (cid:17) e − γ s τ − p ) κγ a − κ (cid:16) e − κτ − e − γ a τ (cid:17) e − γ a τ  ⊗ I n . (50)We proceed into the multiplication of the matrices Ω and Γ in order to obtain: A ( t, τ ) =  A e A s A a  (51)whereby the submatrices are deﬁned as: A e ( t ) ≡ " λ ij ( t ) S i ( t ) (cid:18) p κγ s − κ (cid:16) e − κτ − e − γ s τ (cid:17) + δ (1 − p ) κγ a − κ (cid:16) e − κτ − e − γ a τ (cid:17)(cid:19) , (52) A s ≡ e − γ s τ Ω s and A a ≡ e − γ a τ Ω a . Therefore, the next generation matrix is R ( t ) =  R e R s R a  (53)where: R e ( t ) ≡ " λ ij ( t ) S i ( t ) (cid:18) pγ s + δ (1 − p ) γ a (cid:19) , (54)Because there is no generation of infected individuals on the symptomatic and asymptomatic compartments, theexpressions for R s and R a are not needed. This is so because F i ( t ) for i > n is null for any time. Therefore, therenewal equations of F i ( t ) for i > n have the tautological result that zero equals zero, regardless of the values of R s or R a . Thus, all we need to describe the dynamics is R for i, j ≤ n , that is R e . Lastly, we obtain the generationinterval distribution matrix for i, j ≤ n : g ij ( τ ) ≡ g ( τ ) = pγ s g s ( τ ) + δ (1 − p ) γ a g a ( τ ) pγ s + δ (1 − p ) γ a , (55)for g a ( τ ) = κγ a γ a − κ ( e − κτ − e − γ a τ ) , g s ( τ ) = κγ s γ s − κ ( e − κτ − e − γ s τ ) . (56)Whereby, if only one meta-population is considered, we return to the results in literature . upplementary Material 2 Meta-population models formulation eta-population models formulation

In this Supplementary Material we will be, inspired by , developing a meta-population model that takes into accountthe movement of individuals between meta-populations to describe the propagation of a disease through out multiplemunicipalities . We will be interpreting the inter-municipal ﬂow as a complex network where its nodes representmunicipalities and the weight of its edges represents the intensity of the ﬂow between the connected municipalities.We also consider that each municipality can be interpreted as a meta-population, with its own compartmentsand parameters, which can be represented by a vector y i ( t ), where the index “ i ” indicates the meta-populationportrayed.Each entry of y i ( t ) represents a compartment of this meta-population, such as y i ( t ) = (cid:0) S i ( t ) , I i ( t ) , R i ( t ) (cid:1) in the case of a SIR model. In which S i ( t ), I i ( t ) and R i ( t ) correspond to the susceptible, infected and removedindividuals that are residents of the meta population " i ", respectively. The sum of the elements of y i corresponds tothe number of individuals of the meta-population " i ", in the SIR model we have S i ( t ) + I i ( t ) + R i ( t ) = N i ( t ).We can represent the amount of individuals that goes from the meta-population " i " to another meta-population " j "each day as ϕ ij , so it represents the ﬂow of individuals between those meta-populations. Since each meta-populationis described from it’s compartments, we can describe the ﬂow of those using y i ( t ) inFlow from i to j = ϕ ij ( t ) y i N i . (57)We deﬁne Φ ij ( t ) ≡ ϕ ij ( t ) N i as the density of ﬂow. In that way, Φ ij y i is the number of individuals from eachcompartment class that are ﬂowing from " i " to " j ". It’s natural to see that sum of Φ ij y i for each compartment classof y i is equal to ϕ ij , in the SIR model for example: Φ ij S i + Φ ij I i + Φ ij R i = ϕ ij .Given that the meta-populations are connected through the ﬂow , it is necessary to identify how they. For thispurpose, graphs are used which, in general, represent complex networks, given the large number of connections. Torepresent networks, matrices are used, as in the case of adjacency matrices whose elements represent the density ofﬂow (Φ ij ). Thus, the values of each Φ ij ( t ) are the weight of the edges with a null value representing the absence ofan edge. Since there are no self interactions in this network, we do not consider the ﬂow of a meta-population toitself, thus ϕ ii ≡ i ” described by the vector y e, i = (cid:0) S e ,i ( t ) , I e ,i ( t ) , R e ,i ( t ) (cid:1) which corresponds to individuals located, at time t, in the “i” meta-population regardless of their origin meta-population.The eﬀective population represents the new characteristics of a population due to the ﬂow; for example, ameta-population that does not have infected resident individuals may have carriers of the pathogen in their eﬀectivepopulation due to the ﬂow from some other meta-population that presents infected individuals. We can write y e, i as: y e, i ( t ) = Resident Pop. z }| { y i ( t ) − Outﬂow z }| { n X j = i Φ ij y i ( t ) + Inﬂow z }| { n X j = i Φ ji y j ( t ) (58)For, N being the number of meta populations. Rearranging the equation: y e, i ( t ) = y i ( t ) (cid:16) − n X j Φ ij (cid:17) + n X j Φ ji y j ( t ) (59)It’s clear then that the eﬀective population of " i " is the sum of the individuals from " i " that are in " i " , y i (cid:16) − P nj = i Φ ij (cid:17) , with the individuals from other meta populations that are in " i ", P nj = i Φ ji y j . If we sum eachelement of y e, i we have the eﬀective population number: N e, i = N i − P nj = i ϕ ij + P nj = i ϕ ji . It is important tohighlight that the resident population is not changed, therefore, we do not consider migration but only commuterperiodic movement, whereby the individuals of one meta-population go to another to work or study.We now proceed in order to describe how the transmission of a disease occurs in a meta-population, takinginto account the ﬂow of individuals in the network. We must then describe the occurrence of new infections in the eta-population y i ( t ), which can occur within the “i” meta-population β i N i × Susceptible Indv. from “ i ” in “ i ” z }| { S i (cid:16) − n X j Φ ij (cid:17) × Infected Indv. in “ i ” z}|{ I e, i , (60)or on another meta-population " j " β j N j × Susceptible Indv. from “ i ” in “ j ” z }| { Φ ij S i × Infected Indv. in “ j ” z}|{ I e, j . (61) β i ( t ) and β j ( t ) being the transmission rates within the “i” and “j” meta-populations, respectively. The parameters β i ( t ) and Φ ij ( t ) are time dependent, so we can incorporate the changes in the behavior of the populations on thosevariables. Thus, the number of new infections of individuals that belong to “ i ”, F i ( t ), will be equal to the sum of(60) and (61). We can substitute equation (59) in expressions (60) and (61), in order to expand the term of theeﬀective population of infected individuals( I e ,i ( t ) , I e ,i ( t )) and be able to express them according to the infectedresidents in each meta-population. Carrying out these operations, we get to: F i ( t ) = n X j λ ij ( t ) S i ( t ) I j ( t ) (62)For: λ ii = β i N i (cid:16) − n X j Φ ij (cid:17) + n X j β j N j Φ ij (63) λ ij = β i N i Φ ji (cid:16) − n X k Φ ik (cid:17) + β j N j Φ ij (cid:16) − n X k Φ jk (cid:17) + n X k β k N k Φ ik Φ jk (64)The λ ij ( t ) are related to the transmission between individuals of the same meta-population and from individualsof the “j” meta-population to individuals of the “i” meta-population. SIR-Type model

In this section, we construct the set of equations for the SIR-type model based in the contamination process modeledin this Supplementary Material. We will be modeling the resident populations of each municipality and will not beconsidering migration, therefore, resident individuals of one meta-population will always remain residents of thatsame meta-population. It is assumed that the susceptible population can only decrease due to infection process,therefore, no life and death dynamic. In the same way, the removed individuals are only generated by a removingrate γ , uniform for all meta-populations. Gathering those considerations, we write the system of equations: dS i dt = − n X j λ ij ( t ) I j ( t ) S i ( t ) , (65) dI i dt = n X j λ ij ( t ) I j ( t ) S i ( t ) − γ I i ( t ) , (66) dR i dt = γI i ( t ) . (67)Where it is clear that when ϕ ij = 0, for all i ’s and j ’s, each meta-population will be described by the classicalSIR homogeneous model. Of course, this SIR approach is not taking into account the other heterogeneities of adisease besides space. On the following section we present a more sophisticated approach focused on the Covid-19transmission dynamics. meta-population model for Covid-19 (SEIIR) In this section, we establish a more precise description for the dynamics of SARS-Cov-2 coronavirus on themunicipalities network. To do so, we consider, as in , that the infected individuals can be separated in three classes:the exposed E , individuals infected which are in the latency period and do not transmit the disease; the symptomaticindividuals I s , that are infectious, present a substantial amount of symptoms and are registered in the oﬃcial data;the asymptomatic/undetected ones I a , that are infectious but present mild/non-existing symptoms and are notregistered in the oﬃcial data. The infected individuals always start in the exposed compartment, a portion p of themeventually becomes symptomatic and the other (1 − p ) portion becomes asymptomatic. Since we have two types ofinfectious individuals, both of them must be taken in consideration in the generation of new infected. Therefore,assuming that the asymptomatic individuals transmission rate is a fraction δ of the symptomatic transmission rate,it is simple to derive, similarly to (60) and (61), the number of infected individuals that are generated on the exposedcompartment of each meta-population: F ei ( t ) = n X j λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i . (68)Therefore, by the same assumptions presented on the previews section, the following model is obtained: dS i dt = − n X j λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i S i ( t ) , (69) dE i dt = n X j λ ij ( t ) S i ( t ) h I sj ( t ) + δI aj ( t ) i − κ E i ( t ) , (70) dI si dt = pκ E i ( t ) − γ s I si ( t ) , (71) dI ai dt =(1 − p ) κ E i ( t ) − γ a I ai ( t ) , (72) dR i dt = γ s I si ( t ) + γ a I ai ( t ) . (73)Whereby κ , γ s and γ a are the removing rate of the exposed, symptomatic and asymptomatic compartments,respectively, and are uniform in all meta-populations. upplementary Material 3 Expressions and parameter values for evaluating R ( t ) for the meta-population models xpressions for evaluating R ( t ) using incidence data Here, we derive the expressions and parameter values needed to estimate the reproduction numbers for the meta-population models. Firstly, we start by substituting the explicit expressions of the reproduction numbers of eachmodel in the equation 2.24 of the main framework. After some straightforward calculations, we obtain: Q i ( t ) = n X j λ ij ( t ) a j ( t ) . (74)whereby, Q i ( t ) = B i ( t ) /S i ( t ) and a j = t X τ =0 γ g ( τ )∆ t (75)for the SIR model and a j = t X τ =0 h pγ s + δ (1 − p ) γ a i g ( τ )∆ t (76)for the SEIIR model. We consider that the B i ( t ) is the collection of all the new infections during a ∆ t time intervalof 1 day. With appropriate units of measure, we have ∆ t = 1. In the SIR model we consider that every infection isreported, ρ i = 1 and in the SEIIR case only the symptomatic individuals report their infections, ρ i = p . We proceedinto estimating the susceptible population. If we integrate (65) and substitute (62) and equation 2.23 of the mainframework, we get: S i ( t ) = N i − t X t =0 B i ( t ) ρ i . (77)Finally, substituting (63) and (64) in (74) results in: Q i ( t ) = n X j Θ ij ( t ) β j ( t ) , (78)for: Θ ii = 1 N i " a i (cid:16) − n X k ϕ ik (cid:17) + n X j a j ϕ ji (cid:16) − n X k ϕ ik (cid:17) , (79)Θ ij = 1 N j " a j ϕ ij (cid:16) − n X k ϕ jk (cid:17) + ϕ ij n X k a k ϕ kj . (80)Therefore value of Q i ( t ) and θ ij for each day can be obtained using the reported data and parameters. Thus, foreach day, this leaves us with algebraic system of “ n ” variables, β j ( t ) and “ n ” equations. We can analytically solvethis system for every meta-population, with the help of a computer algorithm, and obtain the daily values of β j ( t )of every meta population. Substituting the values of every β j ( t ) in (63) and (64), we can compute the values ofthe λ i j ( t )’s which leads us to the values of the reproduction number, equations 3.1 and 3.2 of the main framework.Since the available data is of daily number of cases, the R ( t ) is evaluated for each calendar day. arameters To estimate the β i ( t ) parameters, and consequently estimate the reproduction numbers, the parameters of bothmodels must be obtained. δ , γ a , γ s , p , and κ are estimated for the state of Rio de Janeiro in and can be foundon Table 1. Additionally, in the SIR type model we assume γ = γ s . The intermunicipal commuter movement ofworkers and students for the cities of Brazil, ϕ ij , can be found in a study conducted by IBGE (Brasilian Institute ofGeography and Statistics) . Supplementary Table 1.

Key epidemiological parameters of the SEIIR model obtained in . Parameter Description Value δ Asymptomatic/non-detected infectivity factor 0 . p Proportion of latent (E) that proceed to symptomatic infective 0 . κ − Mean exposed period (days − ) 1 / . γ − a Mean asymptomatic period (days − ) 1 / . γ − s Mean symptomatic period (days − ) 1 / . upplementary Material 4 Additional information about the municipalities dditional information about the municipalities

Municipality Acronyms Populational size Total reported cases

Belford Roxo BR 485,687 8,578Duque de Caxias DdC 905,129 8,736Magé Ma 242,113 3,539Mesquita Mq 800,835 1,351Nilópolis Ns 154,749 1,245Niterói Nt 497,883 12,165Nova Iguaçu NI 167,287 5,908Queimados Q 150,333 2,376

Rio de Janeiro

RJ 6,592,227 95,444São Gonçalo SG 1,075,372 11,601São João de Meriti SJdM 448,340 3,167

Supplementary Table 2Supplementary Table 3.

The selected cities of the state of Rio de Janeiro in the Southeast of Brazil with theiracronyms, number of inhabitants, and the reported COVID-19 cases until September 14th (2020). In bold, thecapital (Rio de Janeiro).

Supplementary Figure 1.

Distribution of cases by the chosen cities of this study. J N s N t S J d M N I S G B R D d C M q Q M a Supplementary Figure 2.