[PDF] Building Mean Field State Transition Models Using The Generalized Linear Chain Trick and Continuous Time Markov Chain Theory

Abstract

The well-known Linear Chain Trick (LCT) allows modelers to derive mean field ODEs that assume gamma (Erlang) distributed passage times, by transitioning individuals sequentially through a chain of sub-states. The time spent in these states is the sum of k exponentially distributed random variables, and is thus gamma (Erlang) distributed. The Generalized Linear Chain Trick (GLCT) extends this technique to the much broader phase-type family of distributions, which includes exponential, Erlang, hypoexponential, and Coxian distributions. Intuitively, phase-type distributions are the absorption time distributions for continuous time Markov chains (CTMCs). Here we review CTMCs and phase-type distributions, then illustrate how to use the GLCT to efficiently build mean field ODE models from underlying stochastic model assumptions. We generalize the Rosenzweig-MacArthur and SEIR models and show the benefits of using the GLCT to compute numerical solutions. These results highlight some practical benefits, and the intuitive nature, of using the GLCT to derive ODE models from first principles.

Full PDF

BBuilding Mean Field State Transition Models UsingThe Generalized Linear Chain Trick andContinuous Time Markov Chain Theory

Preprint

Hurtado, Paul J.

University of Nevada, RenoORCID: 0000-0002-8499-5986 [email protected]

Richards, Cameron

University of Nevada, RenoORCID: 0000-0002-1620-9998July 9, 2020

Abstract

The well-known Linear Chain Trick (LCT) allows modelers to derive mean field ODEs thatassume gamma (Erlang) distributed passage times, by transitioning individuals sequentiallythrough a chain of sub-states. The time spent in these states is the sum of k exponentiallydistributed random variables, and is thus gamma (Erlang) distributed. The Generalized LinearChain Trick (GLCT) extends this technique to the much broader phase-type family of distribu-tions, which includes exponential, Erlang, hypoexponential, and Coxian distributions. Intuitively,phase-type distributions are the absorption time distributions for continuous time Markov chains(CTMCs). Here we review CTMCs and phase-type distributions, then illustrate how to use theGLCT to efficiently build mean field ODE models from underlying stochastic model assumptions.We generalize the Rosenzweig-MacArthur and SEIR models and show the benefits of using theGLCT to compute numerical solutions. These results highlight some practical benefits, and theintuitive nature, of using the GLCT to derive ODE models from first principles. K eywords Linear chain trick; gamma chain trick; phase-type distribution; Coxian distribution;Erlang distribution a r X i v : . [ q - b i o . P E ] J u l reprint - July 9, 2020 Contents

A.1 Rosenzweig-MacArthur Model & Extensions . . . . . . . . . . . . . . . . . . . . . . . 18A.2 SEIR Model & Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 reprint - July 9, 2020

Continuous time state transition models, often formulated as mean field ODE models, are widelyused throughout the biological sciences and across multiple scales. Examples include models ofmulti-species interactions, infectious disease transmission, cell proliferation, and various otherapplications in which entities transition among a finite number of states (e.g., Allen, 2007; Andersonand May, 1992; Arrowsmith and Place, 1990; Beuter et al., 2003; Clapp and Levy, 2015; Dayan andAbbott, 2005; Edelstein-Keshet, 2005; Ellner and Guckenheimer, 2006; Hirsch, Smale, and Devaney,2012; Izhikevich, 2010; Keener and Sneyd, 2008a,b; Meiss, 2017; Murray, 2011a,b; Strogatz, 2014;Wiggins, 2003; Yates, Ford, and Mort, 2017). One criticism of mean field ODE models is thatthey often implicitly assume the time individuals spend in the different states are exponentiallydistributed, and it is known that the timing of state transitions can very meaningfully affect modeldynamics and model outputs in an applied setting (Getz et al., 2018; Krylova and Earn, 2013;Metz and Diekmann, 1986; Metz and Diekmann, 1991; Nisbet, Gurney, and Metz, 1989; Robertsonet al., 2018; Wearing, Rohani, and Keeling, 2005). That is, an ODE model with a linear lossrate can be interpreted as implicitly assuming an underlying stochastic state transmission modelwith an exponentially distributed dwell-time in that focal state. For example, the simple model dx/dt = b − d x is consistent with assuming an underlying stochastic state transition model inwhich individuals spend an exponentially distributed amount of time with mean 1 /d in the statecorresponding to variable x .One remedy to address this issue with ODE models is known as the Linear Chain Trick (LCT;Hurtado and Kirosingh, 2019; Smith, 2010, and references therein), which allows modelers to insteadassume gamma (Erlang ) distributed passage times (a.k.a., dwell times ). This is accomplished bypartitioning a state into a series of k sub-states, where individuals transition sequentially throughthis ‘‘linear chain” of states. The resulting time spent in this collection of sub-states is thus thesum of k exponentially distributed random variables, and therefore follows an Erlang distribution(if each exponential has the same rate) or a generalized Erlang distribution if the rates differ.The Generalized Linear Chain Trick (GLCT) (Hurtado and Kirosingh, 2019) extends this techniqueto allow modelers to assume these passage times follow a much broader family of distributions thatincludes the phase-type family of distributions (Bladt and Nielsen, 2017a,b; Horv´ath, Scarpa, andTelek, 2016; Horv´ath and Telek, 2017; Reinecke, Bodrog, and Danilkina, 2012). This broad familyincludes exponential, Erlang, hypoexponential, hyperexponential, Coxian and some other nameddistributions. Intuitively, phase-type distributions can be thought of as the family of all possible hitting time (or absorption time ) distributions for continuous time Markov chains (CTMCs). Inaddition, statistical methods exist for estimating such distributions from data (Horv´ath, Scarpa, andTelek, 2016; Horv´ath et al., 2012; Hurtado and Kirosingh, 2019, and references therein) allowingresearchers to build approximate empirical distributions into ODE models using a more flexiblefamily of distributions than only the Erlang distributions.In this paper, we illustrate how to use the GLCT alongside concepts and techniques from CTMCtheory to build and numerically solve mean field ODE models using a much richer set of possibleunderlying stochastic model assumptions. The paper is organized as follows. First, we reviewCTMCs and phase-type distributions. We then state the GLCT for phase-type distributions Gamma distributions with integer-valued shape parameters are those that can be thought of as the sum of iid exponentially distributed random variables, and are known as

Erlang distributions. The sum of independent exponentially distributed random variables with different rates is known as a generalizedErlang or hypoexponential distribution. reprint - July 9, 2020 and, for comparison, the well-known LCT. In the Results section, we generalize some simplebiological state transition models by replacing their implicit assumption of exponentially distributedpassage time assumptions with arbitrary phase-type distributions. Lastly, we investigate some ofthe computational costs and benefits of using this generalized model framework with regards tocomputing numerical solutions. To provide proper context for an intuitive understanding of the Generalized Linear Chain Trick(GLCT), we briefly review continuous time Markov chains (CTMCs) with a focus on CTMCs thathave a single absorbing state. Our focus will then be on the probability distributions that describethe time it takes to reach that absorbing state starting from one of the transient states, sincethese absorption time distributions define the phase-type family of probability distributions. Thefollowing summaries build upon similar descriptions laid out in Hurtado and Kirosingh (2019).

Discrete time Markov chains (DTMCs) describe the transition of an individual (or other distinctentity) among a set of n states. The transition probabilities from a state i to a state j ( p ij where1 ≤ i, j ≤ n ) are best organized using a transition probability matrix P , where p ij is value in the i th row and j th column of the matrix ( P ij = p ij ). For our purposes below, we will restrict our attentionto Markov Chains in which the first k = n − transient states , and the last ( k + 1) stateis an absorbing state . This means the system eventually enters this last state and remains there oneach subsequent time step with probability 1.The transition probability matrix P can be written in block form according to these first k = n − P = (cid:34) P X P a (cid:35) (1)where P X is a k × k matrix describing transition probabilities among transient states, P a is the k × i th transient state to the absorbing state, and is a 1 × k vector of zeros.In a continuous time Markov chain (CTMC), these transitions don’t occur according to a fixedtime step, but instead each transition occurs after an exponentially distributed amount of time.If the individual is in state i , that exponential distribution has rate λ i or equivalently has meanduration 1 /λ i . Let R be the vector of the rates λ i , for i = 1 , . . . , k . Due to the memorylessnessproperty of exponential distributions transitions from state i to state i can effectively be ignored,thus we can formulate a new transition probability matrix that describes an equivalent Markovchain but where we only track transitions to new states. This reformulated Markov chain is knownas the embedded jump process (or embedded Markov chain ), and it is described with a transitionprobability matrix (cid:101) P where the diagonal entries are 0 and the off-diagonal transition probabilities (cid:101) p ij = p ij / (cid:80) j (cid:54) = i p ij . That is, these are just the off-diagonal entries of P with the diagonal set to0, and the rows normalized so each row sums to 1. For our purposes, since a Markov chain withan absorbing state is not ergodic and therefore does not properly have an embedded jump processrepresentation, the above procedure is only applied to k rows corresponding to transient states.4 reprint - July 9, 2020 Thus, the last row of P and (cid:101) P will be the same, so that the last state in the Markov chain remainsan absorbing state.In a CTMC context, the transition probability matrix (cid:101) P and rate vector R are combined intoa single matrix called the transition rate matrix Q . The entries of Q can be thought of as themean-field, per-individual loss rates from each state (along the diagonal) and the transition ratesfrom state i into state j (the off diagonal entries; see below). It has the block form Q = (cid:34) A B0 (cid:35) (2)where A is a k × k matrix describing the transition rates among transient states, B is the k × A and B are constructed from the entries of the transition probability matrix (cid:101) P and thevector of rates for the exponential distributed dwell-times R as follows. The i th diagonal entry of A is the loss rate − λ i and the rest of the entries in the i th row are the product of λ i and the transitionprobability (cid:101) p ij . That is, the ij off-diagonal entries of A are the per-individual transition rates fromthe i th state into the j th state, given by λ i (cid:101) p ij . Since the rows of (cid:101) P sum to 1, the rows of Q sum to0, and it then follows that vector B is equal to the negative of the row sums of A . Thus, we canwrite B = − A 1 , where is a column vector of k ones.Finally, assume that the initial state of such a CTMC is one of the k transient states. Let α be thelength k column vector of probabilities (i.e., (cid:80) ki =1 α i = 1) that define the initial state distributionover these k states.Note that, for CTMCs which have k transient states and 1 absorbing state, all of the informationnecessary to describe the CTMC is contained in the transition rate matrix for transient states, A ,and initial distribution vector α . As detailed next, these quantities are also sufficient to parameterizethe corresponding phase-type distribution. With the above family of CTMCs in mind, let T i be defined as the duration of time that it takesto first reach the absorbing state, given the CTMC starts in the i th transient state. We call thisan absorption time . More generally, let T be the absorption time given that the initial state isdetermined by the initial probability vector α (i.e., T follows the mixture distribution of randomvariables T i with mixing probabilities α ). Phase-type distributions are the family of absorptiontime distributions for all such T described above.The most familiar examples are the exponential distribution, and the Erlang distribution (i.e., thosegamma distributions that have an integer shape parameter k ≥

1) which can be thought of asthe sum of k independent exponentially distributed random variables, each with rate r . Erlangdistributions can be parameterized in terms of their mean µ and coefficient of variation c v (thestandard deviation over the mean), or their rate r and shape k , where µ = kr , σ = kr , c v = 1 √ k , and thus, r = µσ , and k = µ σ = 1 c v . (3)More generally, phase-type distributions are parameterized by vector α and transient state ratematrix A (as defined in the previous section), and have the probability density function, cumulative5 reprint - July 9, 2020 distribution function, and j th moment given (respectively) by: f ( t ) = α e A t ( − A1 ) (4a) F ( t ) = 1 − α e A t (4b) E ( T j ) = j ! α ( − A ) − j . (4c)Here is a column vector of ones that has the same number of rows as α and A . Note that α and A are not a unique parameterization of a given phase-type distribution, and there are equivalentparameterizations using vector-matrix pairs of the same dimension as well as different dimensions.Phase-type distributions can be classified as cyclic (transient states can be visited infintely often) andacyclic (transient states can only be visited once). This family of distributions has been relativelywell studied in the queuing theory literature, and elsewhere, and readers are encouraged to consultAltiok (1985), Bladt and Nielsen (2017a,b), Horv´ath, Scarpa, and Telek (2016), Horv´ath et al.(2012), Reinecke, Bodrog, and Danilkina (2012), and Reinecke, Krau, and Wolter (2012) for furtherdetails.Additionally, freely available computational tools such as BuTools for Matlab and Python (Horv´athand Telek, 2017, 2020) enable researchers to fit phase-type distributions to data. This fact, combinedwith the Generalized Linear Chain Trick, allows for the construction of ODE models that incorporateempirically derived distributional assumptions for the time spent in a given state. The GLCT provides modelers with a direct way to take an existing ODE model that includes astate that has an exponentially distributed dwell time, and obtain a new set of ODEs where thatexponentially distributed dwell time has been replaced with a phase-type dwell time distribution.This is done by partitioning that focal state into a set of sub-states and using the GLCT to write thenew systems of ODEs that govern those sub-states using the matrix and vector parameterizationof the assumed phase-type distribution. This technique can also be used to implement the classicLinear Chain Trick (LCT), since Erlang distributions (i.e., gamma distributions with integer shapeparameters) are a subfamily of phase-type distributions.The GLCT in its most general form (Hurtado and Kirosingh, 2019) extends the GLCT for phase-type distributions to the scenario where the rates and probabilities in the CTMC frameworkdescribed above can vary with time. Here, we only provide a statement of the GLCT for phase-typedistributions:

Theorem 1 (GLCT for phase-type distributions [Corollary 2 in Hurtado and Kirosingh, 2019]) . Assume individuals enter state X at rate I ( t ) and that the distribution of time spent in state Xfollows a continuous phase-type distribution given by the length k initial probability vector α andthe k × k matrix A . The mean field equations for these transient sub-states x i are given by ddt x ( t ) = α I ( t ) + A T x ( t ) (5) where the rate of individuals leaving each of these sub-states of X is given by the vector ( − A 1 ) ◦ x ,where ◦ is the Hadamard (element-wise) product of the two vectors, and thus the total rate ofindividuals leaving state X is given by the sum of those terms, i.e., ( − A 1 ) T x = − T A T x . reprint - July 9, 2020 Note that the influx of individuals at time t (at rate I ( t )) is distributed across the sub-states of Xaccording to the initial distribution vector α , and the second term in eq. (5) describes both themovements among sub-states of X as well as the loss rate from the state X from each sub-state.The Linear Chain Trick (LCT) has been known for decades, and is special case of the GLCTfor phase-type distributions stated above (Hurtado and Kirosingh, 2019). Here we give a formalstatement of the LCT, which assumes an Erlang distributed dwell time, with shape parameter k and rate parameter r . Corollary 1 ( Linear Chain Trick ) . Consider the GLCT above. Assume that the dwell-time distribution is an Erlang distribution withshape k and rate r (or if written in terms of shape k and mean τ = k/r , then use rate r = k/τ ).Then the corresponding mean field ODE equations for the k sub-states of X are dx dt = I ( t ) − r x dx dt = r x − r x ...dx k dt = r x k − − r x k . (6) where the total loss rate from state X at time t is the loss rate from the final sub-state, r x k ( t ) .Proof. The phase-type distribution formulation of the Erlang distribution with shape k and rate r is given by v and M below, and substituting these into eq. (5) which yields the desired result. v =  ...  and M =  − r r · · · − r r . . . ... . . . . . . . . . . . . . . . − r r · · · − r  (7)See Hurtado and Kirosingh (2019) for a direct proof that uses a recursive relationship betweenErlang density functions and their derivatives. (cid:4) In the sections below, we extend two well-known models using the GLCT by replacing the implicitexponentially distributed dwell time assumptions of these models with phase-type distributionassumptions. These more general model formulations can also be used as a way to formulate modelsthat could otherwise be derived using the standard LCT (i.e., the assumption of Erlang distributeddwell times). This may be the more desirable approach since the phase-type formulation of suchmodels can be more practically and computationally advantageous to work with, which we show insection 2.4. 7 reprint - July 9, 2020

Maturation delays in population models can influence model outputs, although such delays are notalways incorporated into models used in applications (Robertson et al., 2018). In this section, weillustrate how one can use the GLCT to incorporate phase-type maturation times into such populationmodels, using the widely used Rosenzweig-MacArthur model of predator-prey (consumer-resource)dynamics (Murdoch, Briggs, and Nisbet, 2003; Rosenzweig and MacArthur, 1963): dNdt = r N (cid:18) − NK (cid:19) − a Pk + N N (8a) dPdt = χ a Nk + N P − µ P (8b)In the absence of predators ( P ), the prey population ( N ) is subject to logistic growth, and predatorsconsume prey following a Holling’s type II functional response (Dawes and Souza, 2013; Holling,1959a,b; Murdoch, Briggs, and Nisbet, 2003). Predators will then live for an exponentially distributedlifetime with mean 1 /µ .One approach to incorporating a maturation delay of duration τ is to formulate a delay differentialequation (DDE), as in (Xia et al., 2009): dN ( t ) dt = r N ( t ) (cid:18) − N ( t ) K (cid:19) − a P ( t ) k + N ( t ) N ( t ) (9a) dP ( t ) dt = χ a N ( t − τ ) k + N ( t − τ ) P ( t − τ ) − µ P ( t ) (9b)This can be thought of as the limit of a distributed delay model, with mean delay time τ , forwhich the variance or coefficient of variation goes to zero. This corresponds to a delay distributionwith point mass at τ (i.e., the distribution can be described with a Dirac delta function). TheLCT has long been used to approximate such limiting cases in DDE models by assuming instead adelay distribution that is Erlang distributed with mean τ and a very small coefficient of variation,i.e., a large shape parameter k = 1 /c v (Hurtado and Kirosingh, 2019; Smith, 2010). Writing thisapproximating model, as in Hurtado (2020), yields the Rosenzweig–MacArthur model with Erlangdistributed maturation time in the predators: dNdt = r N (cid:18) − NK (cid:19) − a Pk + N N (10a) dx dt = χ a Nk + N P − kτ x (10b) dx j dt = kτ x j − − kτ x j , for j = 2 , . . . , k (10c) dPdt = kτ x k − µ P (10d)The sub-states x j , j = 1 , . . . , k , track the immature stages of the predators before they mature.Using the GLCT, the above model can be generalized in two ways. First, the Erlang distributedmaturation time assumption that yields the sub-states x i can be replaced by the assumption of amore general phase-type distribution with matrix-vector parameterization α X and A X . Similarly,the exponentially distributed time duration that predators spend as adults can also be replaced8 reprint - July 9, 2020 with a more general phase-type distribution with parameter vector α P and matrix A P . Accordingto the GLCT – where x ( t ) denotes the vector of maturing predator sub-states x i ( t ), y ( t ) is thevector of adult predator sub-states y j ( t ), and where P ( t ) = (cid:80) all j y j ( t ) – these assumptions yieldthe more general model: dNdt = r N (cid:18) − NK (cid:19) − a Pk + N N (11a) d x dt = χ a Nk + N P α X + A X T x (11b) d y dt = − T A X T x (cid:124) (cid:123)(cid:122) (cid:125) scalar α P + A P T y (11c)Observe that eqs. (10) are the special case of eqs. (11) where the phase-type distribution matrix-vector parameterization for an Erlang distribution with mean τ and shape k is given by α X =  ...  and A X =  − kτ kτ · · · − kτ kτ · · · . . . · · · − kτ kτ · · · − kτ  (12)and for an exponential distribution with rate µ , α P = (cid:104) (cid:105) and A P = (cid:104) − µ (cid:105) . (13)Note that eqs. (10) are a much more compact way of formulating such generalized models withoutthe need to specify the number of sub-states. As shown below, this formulation allows modelers towrite more efficient computer code for computing numerical solutions to such models, and also canbe used with computer algebra systems to generate ODEs like eqs. (10) from first principles. SIR-type models of infectious disease transmission are widely used in the study of infectious diseases,and can help inform public health efforts to limit the spread of infectious diseases (Anderson andMay, 1992; Diekmann and Heesterbeek, 2000; Keeling and Grenfell, 1997; Kermack and McKendrick,1927, 1932, 1933, 1991a,b,c; Lloyd, 2009; Wearing, Rohani, and Keeling, 2005). For example, suchmodels are currently being used in response to the ongoing SARS-CoV-2/COVID-19 pandemic.It is known that including a latent period prior to the onset of symptoms and infectiousness, as wellas incorporating non-exponential distributions for the time spent in the different infection states,can be important to include in models that are being used in such applications (Feng, Xu, andZhao, 2007; Wang et al., 2017; Wearing, Rohani, and Keeling, 2005).Here we use the GLCT to formulate a more general SEIR model where we assume that the latentperiod (time spent in state E) and infectious period (time spent in state I) follow independentphase-type distributions. For simplicity, we assume the state variables have been scaled by thetotal population size so that S + E + I + R = 1, and that there are no births or deaths in the model.9 reprint - July 9, 2020 To begin, consider this simple SEIR model, where S is the fraction of susceptibles in the population, E the fraction of exposed individuals with latent infections, I the fraction of individuals with activeinfections, and R the fraction of recovered or removed individuals: dSdt = − β S I (14a) dEdt = β S I − r E E (14b) dIdt = r E E − r I I (14c) dRdt = r I I (14d)Next, assume the latent period distribution is phase-type with parameters α E and A E , and theinfectious period distribution is also phase-type, but with parameters α I and A I . Let x = [ E , . . . ] T and y = [ I , . . . ] T be the column vectors of the fraction of individuals in each of the exposed andinfectious sub-states, respectively, where E = (cid:80) E j and I = (cid:80) I i . Then by the GLCT we can writethe mean field ODEs for the generalized SEIR model as follows: dSdt = − β S I (15a) d x dt = α E β S I + A E T x (15b) d y dt = α I (cid:0) ( − A E ) T x (cid:1)(cid:124) (cid:123)(cid:122) (cid:125) scalars + A I T y (15c) dRdt = (cid:122) (cid:125)(cid:124) (cid:123) ( − A I ) T y (15d)To assume, for example, an Erlang distributed latent period with mean τ E and shape k E , i.e., rate1 /τ E and coefficient of variation c v = 1 / √ k E , then one would use α E =  ...  and A E =  − k E τ E k E τ E · · · − kτ E k E τ E · · · . . . · · · − k E τ E k E τ E · · · − k E τ E  (16)Similarly, an Erlang infectious period distribution with mean τ I and shape parameter k I (coefficientof variation 1 / √ k I ) would be parameterized by α I =  ...  and A I =  − k I τ I k I τ I · · · − kτ I k I τ I · · · . . . · · · − k I τ I k I τ I · · · − k I τ I  (17)10 reprint - July 9, 2020 Simplifying the right hand side of eqs. (15) using the matrix and vector definitions above yieldsthe familiar sub-state equations for an SEIR model with Erlang distributed latent and infectiousperiods, eqs. (18). dSdt = − β S I (18a) d x dt =  β SI − k E τ E E k E τ E E − k E τ E E k E τ E E − k E τ E E ... k E τ E E k E − − k E τ E E k E  (18b) d y dt =  k E τ E E k E − k I τ I I k I τ I I − k I τ I I k I τ I I − k I τ I I ... k I τ I I k I − − k I τ I I k I  (18c) dRdt = k I τ I I k I (18d)where x = [ E , . . . , E k E ] T , y = [ I , . . . , I k I ] T , and I = (cid:80) I i .Other phase-type distributions could be assumed, e.g., by fitting non-Erlang phase-type distributionsto data using computational tools like the free software BuTools (Horv´ath and Telek, 2017, 2020).

The examples above illustrate how an existing DDE or ODE model can be generalized by assumingthat states with fixed or exponentially distributed dwell times instead have phase-type distributeddwell times. Here we take the generalized SEIR model above and use the GLCT to further exploremore complex model assumptions. We do this by considering two special cases of this generalizedmodel (see Figs. 1, 2): one that models hospitalization in a manner that does not change thedistribution of time in the infected class, and a second case that models heterogeneity in the severityand duration of disease. In each case, we assume that infected individuals will either experiencemild or severe disease with the potential for distributional differences in the duration of infection.For simplicity, here we assume the removed class R contains both recovered and deceased individuals,and that, upon transitioning from the class of individuals with latent infection (E) to the infectiousclass (I), a fraction ρ will go on to develop severe symptoms (in state I s ) that may requirehospitalization, whereas the remaining fraction (1 − ρ ) do not develop severe disease and insteadenter a different state of more mild disease (I ). Using the GLCT, the states I and I s are partitionedinto sub-states, where the numbers in each are described by vectors y and y s , respectively, andtheir dynamics obey the following system of ODEs: dSdt = − β S I (19a) d x dt = α E β S I + A E T x (19b)11 reprint - July 9, 2020 d y dt = α I (cid:0) (1 − ρ )( − A E ) T x (cid:1) + A I T y (19c) d y s dt = α I s (cid:0) ρ ( − A E ) T x (cid:1) + A I s T y s (19d) dRdt = ( − A I ) T y + ( − A I s ) T y s (19e)Eqs. (19) can be viewed as a special case of eqs. (15) defined in terms of a mixture of two phase-typedistributions, where y = [ y T , y s T ], the vector α I = [(1 − ρ ) α I T , ρ α I s T ], and A I is the blockdiagonal matrix A I =diag( A I , A I s ). The two cases below are described in the context of eqs. (19). In this scenario, individuals progress towards recovery or death according to an Erlang distributionwith rate λ and shape k y . Independently, they also move towards hospitalization according to anErlang distributed hospitalization time with rate r and shape k H (for an example of this structureused to model Ebola, but where k H = 1, see Feng et al., 2016). Modeling this with Erlang latentand infectious period distributions for simplicity, and according to the competing Poisson processesmotif detailed in (Hurtado and Kirosingh, 2019), yields a sub-state structure as shown in Fig. 1.The matrix-vector pairs α E , A E , and α I , A I are as described above for Erlang distributions.The matrix-vector pair A I s and α I s are defined as follows . If we order these states starting at the I entry and work across each rows left to right before moving down to the next row (see Fig. 1),then the associated rate matrix has the following block form with each column and row having k H blocks of dimension k × k . This block structure corresponds to each row of the I sub-statesshown in the lower portion of Fig. 1 as a k H × k grid of sub-states. A I s =  A d1 A sup · · ·

00 A d1 A sup · · ·

00 0 A d1 A sup . . . ... . . . . . . . . . . . . ... d1 A sup · · · d2  (20)where the diagonal and superdiagonal blocks are A d1 =  − λ − r λ · · · . . . λ · · · ... . . . . . . ... · · · . . . λ · · · − λ − r  , A sup =  r · · · . . . · · · ... . . . . . . ... · · · . . .

00 0 · · · r  (21) Compare to the matrix-vector parameterization of the minimum of two Erlang distributions (the minimum of twophase-type distributions is itself phase-type) using the formulas given in Bladt and Nielsen (2017b). reprint - July 9, 2020 S E E E k E I I I k I I I k I I I k I k H I k H I k H k R · · · · · ·· · ·· · ·· · · ... ... ... ... Figure 1:

SEIR-type model with heterogeneity in illness severity and hospitalizations that do not alter theinfectious period. A special case of eqs. (19) (cf. Fig. 2). See the main text for details. Here the standardLCT has been applied to the exposed (E) state, and the GLCT is applied to the infectious (I) states, using thecompeting Poisson Process approach (Hurtado and Kirosingh, 2019) to model hospitalizations in a fraction ofthe infectious individuals. This ensures that the time spent in I is independent of whether or not individualstransition to the hospital. and the bottom right diagonal entry is A d2 =  − λ λ · · · . . . λ · · · ... . . . . . . ... · · · . . . λ · · · − λ  . (22)Together, the matrix A I s and the length k H k I s initial distribution vector α I s = [1 , , · · · , T complete the parameterization of model eqs. (19) to yield the model structure illustrated in Fig. 1.Observe that the matrix above has the same diagonal and superdiagonal blocks in all rows exceptfor the last row, for which the diagonal entries are − λ and not − λ − r . Here the dwell time ineach sub-state of I s (except for the last row) follows the minimum of two independent exponentialdistributions with rates λ and r , so by the properties of exponential distributions, those dwell timesare each exponentially distributed with rate λ + r . Individuals leaving those sub-states then eithermove horizontally towards resolving their infections with probability λ/ ( λ + r ), or move downwardstowards hospitalization (the last row) with probability r/ ( λ + r ) (Hurtado and Kirosingh, 2019). Inthe last row, individuals are hospitalized and can only move horizontally towards the resolutionof their illness. This structure ensures that time transition to the hospital (the last row) does notimpact their overall distribution of time spent in state I .13 reprint - July 9, 2020 S E E E k E I I I k I I I k I R I R I Rk R I C I C I Ck C R · · · · · ·· · · · · ·· · · Figure 2:

SEIR-type model with heterogeneity in illness severity in which a fraction of infected individualsexperience severe illness and a fraction of those require critical care. This is also a special case of eqs. (19)(see Fig. 1). See the main text for details.

To further illustrate the flexibility of eqs. (15) and (19), we now consider the model structureillustrated in Fig. 2. In this case, we make similar assumptions to the case above, except for thestates that pertain to the fraction ρ of individuals who experience severe illness. Those individualsexhibit an Erlang distributed period of more mild disease, with rate r and shape parameter k .Those individuals then either recover (with probability f ) after an Erlang distributed period oftime with rate r R and shape k R , or they become even more ill (with probability 1 − f ) and requirehospitalization for an Erlang distributed amount time with rate r C and shape k C . As above, allindividuals eventually enter a removed state R which, for our purposes here, makes no distinctionbetween recovery and death.Comparing Figs. 1 and 2, we see that this second scenario is also a special case of eqs. (19), and onlydiffers from that case in the definition of matrix A I s and the length k + k R + k C initial distributionvector α I s = [1 , , · · · , T . Ordering the substates of I s from I to I k to I R to I Rk R to I C toI Ck C , then, by the assumptions above, A I s is given by A I s =  − r r · · · − r r · · · ... . . . . . . ... · · · r r · · · − r · · · . . . · · · ... . . . . . . ... · · · . . . f r · · · · · · . . . · · · ... . . . . . . ... · · · . . . − f ) r · · · − r R r R · · · − r R r R · · · ... . . . . . . ... · · · − r R r R · · · − r R

00 0 − r C r C · · · − r C r C · · · ... . . . . . . ... · · · − r C r C · · · − r C  . (23)The two examples above illustrate the utility of deriving models using the GLCT and by thinkingabout deriving model structure from first principles using intuition from CTMCs. This approach14 reprint - July 9, 2020 can be leveraged for the analytical study of such models (Hurtado and Richards, in prep. ), but aswe show in the next section it can also facilitate the process of computing numerical solutions tosuch systems of ODEs. Software like Matlab, Python, Julia, and R, have built-in ODE solvers that implement variousnumerical methods for obtaining numerical solutions to ODEs. In Fig. 3, we summarize the averagetime it takes to compute a numerical solution to the generalized Rosenzweig-MacArthur modelwith Erlang distributed maturation time and time predators spend in the adult stage, either inthe (LCT) form of eqs. (10) or in the (GLCT) form of eq. (11). In Fig. 4, we make a similarcomparison with the SEIR model with Erlang latent and infectious periods comparing the timeit takes to compute numerical solutions to that model in the form of eqs. (18) or the equivalentmodel in the (GLCT) form of eqs. (15). See the figure captions, the R code in Appendix A, and theElectronic Supplements for further computational details.

M (Model dimension is 1 + 2M) E l ap s ed T i m e ( s e c ond s ) Rosenzweig−MacArthur Model

Standard: No maturation delay & exponential time as adult predatorsLCT: Erlang maturation delay & time as adult predatorsGLCT: Erlang maturation delay & time as adult predators (phase−type formulation)

Figure 3:

Benchmark results for 140 iterations of computing numerical solutions to the Rosenzweig-MacArthur model with Erlang (gamma) distributed maturation times and time spent in the adult-stage,using either a direct (LCT; medium gray) or more general (GLCT; black) model formulations (the standardRosenzweig-MacArthur model with no maturation delay and exponentially distributed time spent in theadult stage [light gray; eqs. (8)] is included as a baseline). The second and third cases are mathematicallyequivalent systems. For smaller shape parameters (lower dimension systems) the GLCT model formulation isrelatively slower than explicitly writing out the 2 M + 1 equations, whereas for larger shape parameters (higherdimension systems) the GLCT formulation becomes markedly faster. This is likely due to the efficiency of thematrix computations. The x-axis values M indicate the number of maturing predator sub-states ( k x = M )and adult predator sub-states ( k y = M ), which yields a 2 M + 1 dimensional model. Numerical solutionswere computed using the ode() function in the deSolve package (Soetaert, Petzoldt, and Setzer, 2010) in R(R Core Team, 2020), using method ode45 with atol=10^-6 , for time points 0 to 500 in increments of 1,with r = 1, K = 1000, a = 5, k = 500, χ = 0 . µ x = 0 . µ y = 1, N (0) = 1000, x i (0) = 0 ( i ≥ y (0) = 10,and y j (0) = 0 ( j > reprint - July 9, 2020 N (Substates of E and I; Model dimension = 2 + 2N) E l ap s ed T i m e ( s e c ond s ) SEIR Model

Standard: Exponential latent & infetious periodsLCT: Erlang latent & infectious periodsGLCT: Erlang latent & infectious periods (phase−type formulation)

Figure 4:

Benchmark results for 500 iterations of computing numerical solutions to the SEIR model withErlang distributed latent and infectious periods, using either a direct (LCT; medium gray) or more general(GLCT; black) model formulations (SEIR with exponentially distributed latent and infectious periods [lightgray; eqs. (14)] is included as a baseline). The second and third cases are mathematically equivalent systems.For smaller shape parameters (lower dimension systems) the GLCT model formulation is relatively slowerthan explicitly writing out the 2N+2 equations, whereas for larger shape parameters (higher dimensionsystems) the GLCT formulation becomes faster, likely due to the efficiency of the matrix computations.The x-axis values N indicate the number of sub-states in each of E ( k E = N ) and I ( k I = N ), whichyields a 2 N + 2 dimensional model. Numerical solutions were computed using the ode() function in the deSolve package (Soetaert, Petzoldt, and Setzer, 2010) in R (R Core Team, 2020), using method ode45 with atol=10^-6 , for time points 0 to 100 in increments of 0 .

5, with β = 1, µ E = 4, µ I = 7, S (0) = 0 . E (0) = 0 . E i (0) = 0 ( i > I (0) = 0, I i (0) = 0 ( i > R (0) = 0. See Appendix A for details. To summarize these results, neither approach performs uniformly better than the other. In bothcomparisons, low dimensional models (i.e., smaller shape parameters and thus larger coefficients ofvariation) coded using the more explicit (LCT-based) model formulation, like eqs. (10) and (18),yielded numerical solutions faster than mathematically equivalent models coded using the moregeneral phase-type (GLCT-based) formulation, like eqs. (11) and (15).For higher dimensional models (i.e., larger shape parameters and thus smaller coefficients ofvariation), the phase-type (GLCT) formulation of these models outperformed their LCT-typecounterparts. This is very likely the result of the matrix calculations used in the phase-type (GLCT)formulation of the models being more computationally efficient.It is noteworthy that the GLCT-based ODE function only needs to be coded once, as it is agnosticof the number of sub-state variables. In contrast, researchers typically hard-code the number ofsub-states for such computations using an LCT-based model, and must write multiple ODE functionsto consider model outputs using different shape parameters for the assumed Erlang distributions.In summary, the GLCT may allow for faster computing times for high dimensional systems andit can simplify writing code for ODE solvers since a single instance of the model can be used tosimulate ODE models with an arbitrary number of dimensions.16 reprint - July 9, 2020

ODE models are widely used in the biological sciences and can often be viewed as mean fieldmodels for some (oftentimes, unspecified) stochastic state transition model. Such ODE models aresometimes criticized for their implicit assumption of exponentially distributed dwell times (e.g.,exponentially distributed lifetimes of organisms), and their frequent lack of age or stage structure,which may not adequately capture important time lags in the system being modeled, such as thematuration times of organisms or latent periods in disease transmission models.In this paper, we have provided an overview of the Generalized Linear Chain Trick (GLCT),a relatively new tool modelers can use to improve upon existing ODE models to address theseshortcomings, and we illustrate its utility using multiple examples. The GLCT extends the well-known Linear Chain Trick (LCT) to a broad family of probability distributions known as thephase-type distributions, and also clarifies in a straightforward way how mean field ODEs reflectunderlying stochastic model assumptions when viewed from the perspective of continuous timeMarkov chains (CTMCs). Therefore, we have also provided an overview of CTMCs, and theirabsorption time distributions in particular. Importantly, the phase-type distributions comprisethese absorption time distributions, and include a broad set of named probability distributionsincluding exponential, Erlang, hypoexponential (generalized Erlang), hyperexponential, and Coxiandistributions. Freely available statistical tools like BuTools (Horv´ath and Telek, 2017, 2020) existfor fitting phase-type distributions to data, enabling modelers to build approximate empirical dwelltime distributions into ODE models using the GLCT.We have illustrated how to apply the GLCT by using it to derive extensions of two familiarmodels: the Rosenzweig-MacArthur Predator-Prey model, and the SEIR model of infectiousdisease transmission. We showed how two special cases of the generalized SEIR model can beconstructed to accommodate additional complexity among infected individuals: the first case modelsa hospitalization scenario in which the transition to the hospitalized state has no impact on thedistribution of the overall time individuals spend sick (cf. Feng et al., 2016), and the second casemodels heterogeneity in the progression and severity of infection outcomes. These examples illustratehow the GLCT can be used to refine model assumptions in a rigorous manner, but without the needto explicitly derive mean field ODEs from stochastic models and/or mean field integral equations.Lastly, we showed some of the potential computational benefits of using a GLCT-based approachto ODE model formulation by comparing the time it takes to compute numerical solutions ofODEs using the standard approach versus using a GLCT-based approach. We found that, forlow dimensional models, the GLCT-based computations are slower than using a more traditionalapproach; however for higher dimensional models, the GLCT-based computations were faster. Thisimprovement in computing time is likely the result of the computational efficiency of doing matrixand vector based computations. In addition to faster computation times, another benefit of theGLCT-based approach is that only one ODE model function needs to be written since it is agnostic ofthe model dimension. In contrast, models that have been extended using the standard LCT typicallyhave the number of sub-states (i.e., the shape parameter for the Erlang distributions) hard-coded,and therefore multiple model functions must be coded to explore different shape parameters.In conclusion, we hope this work encourages other researchers to think more carefully aboutunderlying model assumptions when deriving ODE models. We also hope this work demonstratesthe relative ease with which some basic intuition from Markov chain theory can be used to specifyclear model assumptions from first principles, which can then be very quickly realized as one ormore mean field ODE models using the GLCT (Hurtado and Kirosingh, 2019).17 reprint - July 9, 2020

Acknowledgements

The authors thank Deena Schmidt, Jillian Kiefer, and the other members Mathematical BiologyLab Group at UNR for conversations and comments that improved this manuscript.

Funding

This work was supported by a grant awarded to PJH by the Sloan Scholars Mentoring Network ofthe Social Science Research Council with funds provided by the Alfred P. Sloan Foundation; andthis material is based upon work supported by the National Science Foundation under Grant No.DEB-1929522. This work was partly motivated by PJH’s participation in the ICMA-VII conferenceheld at Arizona State University, October 12-14, 2019, with travel support provided to PJH fromNSF grant

Disclosure statement

The authors declare that they have no conflict of interest.

Appendix A R Code for Numerical Solutions to ODEs

For the complete R code used to generate Figs. 3 and 4, see the Electronic Supplements. Thefollowing R code shows a portion of that code, to illustrate how the GLCT-based model formulationsdiffer from the LCT-based formulations.

A.1 Rosenzweig-MacArthur Model & Extensions library(deSolve) library(rbenchmark) IC = c(N=1000,Y=10); parms = c(r = 1, K = 1000, a = 5, k = 500, chi = 0.5, mux = 0.5, muy = 1); RM.ode <- function(tm,z,ps) { r = ps[1]; K = ps[2]; a = ps[3]; k = ps[4]; chi = ps[5]; mux = ps[6]; muy = ps[7]; N=z[1] Y=z[2] dN = r*N*(1-N/K)-(a/(k+N))*Y*N dY = chi*(a/(k+N))*N*Y-Y/muy return(list(c(dN,dY))) } reprint - July 9, 2020 RMpt.ode <- function(tm,z,ps) { r = ps[1]; K = ps[2]; a = ps[3]; k = ps[4]; chi = ps[5]; mux = ps[6]; muy = ps[6]; N = z[1] X = z[1+(1:kx)] Y = z[1+kx+(1:ky)] P = onesyt %*% Y dN = r*N*(1-N/K)- a/(k+N)*P*N dX = chi*(a/(k+N))*P[1,1]*N * ax + Axt %*% X dY = ay %*% -onesxt %*% Axt %*% X + Ayt %*% Y return(list(c(dN,as.numeric(dX),as.numeric(dY)))) } RMpt.init <- function(ps) { mux=ps[["mux"]] muy=ps[["muy"]] kx <<- ps[[’kx’]] ky <<- ps[[’ky’]] ax = matrix(0, nrow=kx, ncol=1); ax[1] = 1; Ax = kx/mux*(diag(rep(-1,kx),kx)); if(kx >1) for(i in 1:(kx -1)) {Ax[i,i+1] = kx/mux} ay = matrix(0, nrow=kx, ncol=1); ay[1] = 1; Ay = ky/muy*(diag(rep(-1,ky),ky)); if(ky >1) for(i in 1:(ky -1)) {Ay[i,i+1] = ky/muy} z0=numeric (1+kx+ky) z0[1] <- IC[["N"]] z0[1+kx+1] <- IC[["Y"]] ax <<- ax Axt <<- t(Ax) ay <<- ay Ayt <<- t(Ay) onesxt <<- matrix(1,ncol=kx,nrow=1) onesx <<- matrix(1,nrow=kx,ncol=1) onesyt <<- matrix(1,ncol=ky,nrow=1) onesy <<- matrix(1,nrow=ky,ncol=1) ICs <<- z0 } RMlct1.ode <- function(tm,z,ps) { r = ps[1]; K = ps[2]; a = ps[3]; k = ps[4]; chi = ps[5]; mux = ps[6]; muy = ps[7]; N = z[1] P = onesyt %*% z[1+kx+(1:ky)] dN = r*N*(1-N/K) - a/(k+N)*P*N dX1 = chi*a/(k+N)*P*N - z[2]*kx/mux dY1 = kx/mux*z[2] - z[3]*ky/muy return(list(c(dN, dX1 , dY1))) } RMlct2.ode <- function(tm,z,ps) { r = ps[1]; K = ps[2]; a = ps[3]; k = ps[4]; chi = ps[5]; mux = ps[6]; muy = ps[7];

N = z[1]

P = onesyt %*% z[1+kx+(1:ky)] dN = r*N*(1-N/K) - a/(k+N)*P*N dX1 = chi*a/(k+N)*P*N - z[2]*kx/mux dX2 = kx/mux*z[2] - z[3]*kx/mux dY1 = kx/mux*z[3] - z[4]*ky/muy dY2 = ky/muy*z[4] - z[5]*ky/muy return(list(c(dN, dX1 , dX2 , dY1 , dY2))) } RMlct20.ode <- function(tm,z,ps) { r = ps[1]; K = ps[2]; a = ps[3]; k = ps[4]; chi = ps[5]; mux = ps[6]; muy = ps[7];

N = z[1] reprint - July 9, 2020 P = onesyt %*% z[1+kx+(1:ky)] dN = r*N*(1-N/K) - a/(k+N)*P*N dX1 = chi*a/(k+N)*P*N - z[2]*kx/mux dX2 = kx/mux*z[2] - z[3]*kx/mux dX3 = kx/mux*z[3] - z[4]*kx/mux dX4 = kx/mux*z[4] - z[5]*kx/mux dX5 = kx/mux*z[5] - z[6]*kx/mux dX6 = kx/mux*z[6] - z[7]*kx/mux dX7 = kx/mux*z[7] - z[8]*kx/mux dX8 = kx/mux*z[8] - z[9]*kx/mux dX9 = kx/mux*z[9] - z[10]*kx/mux dX10 = kx/mux*z[10] - z[11]*kx/mux dX11 = kx/mux*z[11] - z[12]*kx/mux dX12 = kx/mux*z[12] - z[13]*kx/mux dX13 = kx/mux*z[13] - z[14]*kx/mux dX14 = kx/mux*z[14] - z[15]*kx/mux dX15 = kx/mux*z[15] - z[16]*kx/mux dX16 = kx/mux*z[16] - z[17]*kx/mux dX17 = kx/mux*z[17] - z[18]*kx/mux dX18 = kx/mux*z[18] - z[19]*kx/mux dX19 = kx/mux*z[19] - z[20]*kx/mux dX20 = kx/mux*z[20] - z[21]*kx/mux dY1 = kx/mux*z[21] - z[22]*ky/muy dY2 = ky/muy*z[22] - z[23]*ky/muy dY3 = ky/muy*z[23] - z[24]*ky/muy dY4 = ky/muy*z[24] - z[25]*ky/muy dY5 = ky/muy*z[25] - z[26]*ky/muy dY6 = ky/muy*z[26] - z[27]*ky/muy dY7 = ky/muy*z[27] - z[28]*ky/muy dY8 = ky/muy*z[28] - z[29]*ky/muy dY9 = ky/muy*z[29] - z[30]*ky/muy dY10 = ky/muy*z[30] - z[31]*ky/muy dY11 = ky/muy*z[31] - z[32]*ky/muy dY12 = ky/muy*z[32] - z[33]*ky/muy dY13 = ky/muy*z[33] - z[34]*ky/muy dY14 = ky/muy*z[34] - z[35]*ky/muy dY15 = ky/muy*z[35] - z[36]*ky/muy dY16 = ky/muy*z[36] - z[37]*ky/muy dY17 = ky/muy*z[37] - z[38]*ky/muy dY18 = ky/muy*z[38] - z[39]*ky/muy dY19 = ky/muy*z[39] - z[40]*ky/muy dY20 = ky/muy*z[40] - z[41]*ky/muy return(list(c(dN, dX1 , dX2 , dX3 , dX4 , dX5 , dX6 , dX7 , dX8 , dX9 , dX10 , dX11 , dX12 , dX13 , dX14 , dX15 , dX16 , dX17 , dX18 , dX19 ,dX20 , dY1 , dY2 , dY3 , dY4 , dY5 , dY6 , dY7 , dY8 , dY9 , dY10 , dY11 , dY12 , dY13 , dY14 , dY15 , dY16 , dY17 , dY18 , dY19 , dY20))) } mthd = "ode45" atol= 1e-6 Tmax = 500 tms=seq(0,Tmax ,length =500) reps <-140 parms1 <- parms; parms1[’kx’]=1; parms1[’ky’]=1; RMpt.init(parms1); b1=benchmark(RM.1 ={ode(y=IC, times = tms , func = RM.ode , parms = parms , method = mthd , atol=atol)},

RMlct .1={ode(y=ICs , times=tms , func=RMlct1.ode , parms = parms1 , method = mthd , atol=atol)},

RMpt.1 ={ode(y=ICs , times=tms , func=RMpt.ode , parms = parms1 , method = mthd , atol=atol)}, replications = reps) parms2 <- parms; parms2[’kx’]=2; parms2[’ky’]=2; parms2; RMpt.init(parms2); b2=benchmark(RM.2 ={ode(y=IC, times = tms , func = RM.ode , parms = parms , method = mthd , atol=atol)},

RMlct .2={ode(y=ICs , times=tms , func=RMlct2.ode , parms = parms2 , method = mthd , atol=atol)},

RMpt.2 ={ode(y=ICs , times=tms , func=RMpt.ode , parms = parms2 , method = mthd , atol=atol)}, replications = reps) parms20 <- parms; parms20[’kx’]=20; parms20[’ky’]=20; parms20; RMpt.init(parms20); b20=benchmark(RM.20 ={ode(y=IC, times = tms , func = RM.ode , parms = parms , method = mthd , atol=atol)},

RMlct .20={ ode(y=ICs , times=tms , func=RMlct20.ode , parms = parms20 , method = mthd , atol=atol)},

RMpt.20 ={ode(y=ICs , times=tms , func=RMpt.ode , parms = parms20 , method = mthd , atol=atol)}, replications = reps) out <- rbind(b1,b2,b20) out reprint - July 9, 2020 A.2 SEIR Model & Extensions library(deSolve) library(rbenchmark) reps <-500 parms = c(b=1, muE=4, muI=7, cvE=1, cvI=1, kE=1, kI=1) parms2 = c(b=1, muE=4, cvE=1/sqrt (2+1e-10), muI=7, cvI=1/sqrt (2+1e-10), kE=2, kI=2) parms2 parms3 = c(b=1, muE=4, cvE=1/sqrt (3.01), muI=7, cvI=1/sqrt (3.01)) parms3[’kE’] <- floor ((1/parms[[’cvE’]])^2) parms3[’kI’] <- floor ((1/parms[[’cvI’]])^2) parms3 SEIR.ode <- function(tm,z,ps) { b=ps[["b"]] muE=ps[["muE"]] muI=ps[["muI"]] dS = -b*z[1]*z[3] dE = b*z[1]*z[3] - z[2]/muE dI = z[2]/muE - z[3]/muI dR = z[3]/muI return(list(c(dS, dE, dI, dR))) } SEIRlct1.ode <- function(tm,z,ps) { b=ps[["b"]] muE=ps[["muE"]] muI=ps[["muI"]] kE =1 kI =1 Itot = sum(z[1+kE+(1:kI)]) dS = -b*z[1]*Itot dE1 = b*z[1]*Itot - z[2]*kE/muE dI1 = z[2]*kE/muE - z[3]*kI/muI dR = z[3]*kI/muI return(list(c(dS, dE1 , dI1 , dR))) } SEIRlct2.ode <- function(tm,z,ps) { b=ps[["b"]] muE=ps[["muE"]] muI=ps[["muI"]] kE =2 kI =2 Itot = sum(z[1+kE+(1:kI)]) dS = -b*z[1]*Itot dE1 = b*z[1]*Itot - z[2]*kE/muE dE2 = z[2]*kE/muE - z[3]*kE/muE dI1 = z[3]*kE/muE - z[4]*kI/muI dI2 = z[4]*kI/muI - z[5]*kI/muI dR = z[5]*kI/muI return(list(c(dS, dE1 , dE2 , dI1 , dI2 , dR))) } SEIRlct25.ode <- function(tm,z,ps) { b=ps[["b"]] muE=ps[["muE"]] muI=ps[["muI"]] kE = 25 kI = 25 Itot = sum(z[1+kE+(1:kI)]) dS = -b*z[1]*Itot dE1 = b*z[1]*Itot - z[2]*kE/muE dE2 = z[2]*kE/muE - z[3]*kE/muE reprint - July 9, 2020 dE3 = z[3]*kE/muE - z[4]*kE/muE dE4 = z[4]*kE/muE - z[5]*kE/muE dE5 = z[5]*kE/muE - z[6]*kE/muE dE6 = z[6]*kE/muE - z[7]*kE/muE dE7 = z[7]*kE/muE - z[8]*kE/muE dE8 = z[8]*kE/muE - z[9]*kE/muE dE9 = z[9]*kE/muE - z[10]*kE/muE dE10 = z[10]*kE/muE - z[11]*kE/muE dE11 = z[11]*kE/muE - z[12]*kE/muE dE12 = z[12]*kE/muE - z[13]*kE/muE dE13 = z[13]*kE/muE - z[14]*kE/muE dE14 = z[14]*kE/muE - z[15]*kE/muE dE15 = z[15]*kE/muE - z[16]*kE/muE dE16 = z[16]*kE/muE - z[17]*kE/muE dE17 = z[17]*kE/muE - z[18]*kE/muE dE18 = z[18]*kE/muE - z[19]*kE/muE dE19 = z[19]*kE/muE - z[20]*kE/muE dE20 = z[20]*kE/muE - z[21]*kE/muE dE21 = z[21]*kE/muE - z[22]*kE/muE dE22 = z[22]*kE/muE - z[23]*kE/muE dE23 = z[23]*kE/muE - z[24]*kE/muE dE24 = z[24]*kE/muE - z[25]*kE/muE dE25 = z[25]*kE/muE - z[26]*kE/muE dI1 = z[26]*kE/muE - z[27]*kI/muI dI2 = z[27]*kI/muI - z[28]*kI/muI dI3 = z[28]*kI/muI - z[29]*kI/muI dI4 = z[29]*kI/muI - z[30]*kI/muI dI5 = z[30]*kI/muI - z[31]*kI/muI dI6 = z[31]*kI/muI - z[32]*kI/muI dI7 = z[32]*kI/muI - z[33]*kI/muI dI8 = z[33]*kI/muI - z[34]*kI/muI dI9 = z[34]*kI/muI - z[35]*kI/muI dI10 = z[35]*kI/muI - z[36]*kI/muI dI11 = z[36]*kI/muI - z[37]*kI/muI dI12 = z[37]*kI/muI - z[38]*kI/muI dI13 = z[38]*kI/muI - z[39]*kI/muI dI14 = z[39]*kI/muI - z[40]*kI/muI dI15 = z[40]*kI/muI - z[41]*kI/muI dI16 = z[41]*kI/muI - z[42]*kI/muI dI17 = z[42]*kI/muI - z[43]*kI/muI dI18 = z[43]*kI/muI - z[44]*kI/muI dI19 = z[44]*kI/muI - z[45]*kI/muI dI20 = z[45]*kI/muI - z[46]*kI/muI dI21 = z[46]*kI/muI - z[47]*kI/muI dI22 = z[47]*kI/muI - z[48]*kI/muI dI23 = z[48]*kI/muI - z[49]*kI/muI dI24 = z[49]*kI/muI - z[50]*kI/muI dI25 = z[50]*kI/muI - z[51]*kI/muI dR = z[51]*kI/muI return(list(c(dS, dE1 , dE2 , dE3 , dE4 , dE5 , dE6 , dE7 , dE8 , dE9 , dE10 , dE11 , dE12 , dE13 , dE14 , dE15 , dE16 , dE17 , dE18 ,dE19 , dE20 , dE21 , dE22 , dE23 , dE24 , dE25 , dI1 , dI2 , dI3 , dI4 , dI5 , dI6 , dI7 , dI8 , dI9 , dI10 , dI11 , dI12 , dI13 , dI14, dI15 , dI16 , dI17 , dI18 , dI19 , dI20 , dI21 , dI22 , dI23 , dI24 , dI25 , dR))) } SEIRpt.ode <- function(tm,z,ps) { b=ps[["b"]] muE=ps[["muE"]] muI=ps[["muI"]] kE =ps[[’kE’]] kI =ps[[’kI’]]

Itot = sum(z[1+kE+(1:kI)]) dS = -b*z[1]*Itot dE = b*z[1]*Itot * aE + AEt %*% z[1+(1:kE)] dI = aI %*% (-OnesE%*%AEt %*% z[1+(1:kE)]) + AIt %*% z[1+kE+(1:kI)] dR = as.numeric(-OnesI%*%AIt %*% z[1+kE+(1:kI)]) return(list(c(dS, as.numeric(dE), as.numeric(dI), dR))) } SEIRpt.init <- function(ps) { muE=ps[["muE"]] muI=ps[["muI"]] kE = ps[[’kE’]] kI = ps[[’kI’]] aE = matrix(0,nrow = kE, ncol=1); aE[1] = 1;

AE = kE/muE*(diag(rep(-1,kE),kE)); if(kE >1) for(i in 1:(kE -1)) {AE[i,i+1] = kE/muE} aI = matrix(0,nrow = kI, ncol=1); aI[1] = 1;

AI = kI/muI*(diag(rep(-1,kI),kI)); if(kI >1) for(i in 1:(kI -1)) {AI[i,i+1] = kI/muI}

PopSize =10000 z0=numeric(kE+kI+2) z0[2] <- 1/PopSize reprint - July 9, 2020 z0[1] <- 1-z0[2] aE <<- aE AEt <<- t(AE) aI <<- aI

AIt <<- t(AI)

OnesE <<- matrix(1,ncol=kE,nrow=1)

OnesI <<- matrix(1,ncol=kI,nrow=1)

ICs <<- z0 } library(ggplot2) Tmax = 100 tms=seq(0,Tmax ,length =200)

SEIRpt.init(parms) out = ode(y=ICs , times = tms , func = SEIRpt.ode , parms = parms , method = "ode23"); c(Pos=sum(out[,-1]>=0), Neg=sum(out[,-1]<0) ) matplot(tms , out[,-1],type="l",lty=1) out = ode(y=ICs , times = tms , func = SEIRpt.ode , parms = parms , method = "ode45"); c(Pos=sum(out[,-1]>=0), Neg=sum(out[,-1]<0) ) matplot(tms , out[,-1],type="l",lty=2,add=TRUE) out = ode(y=ICs , times = tms , func = SEIRpt.ode , parms = parms , method = "lsodes"); c(Pos=sum(out[,-1]>=0), Neg=sum(out[,-1]<0) ) matplot(tms , out[,-1],type="l",lty=2,add=TRUE) out = ode(y=ICs , times = tms , func = SEIRpt.ode , parms = parms , method = "vode"); c(Pos=sum(out[,-1]>=0), Neg=sum(out[,-1]<0) ) matplot(tms , out[,-1],type="l",lty=2,add=TRUE)

SEIR.ode <- compiler::cmpfun(SEIR.ode)

SEIRpt.ode <- compiler::cmpfun(SEIRpt.ode)

SEIRlct1.ode <- compiler::cmpfun(SEIRlct1.ode)

SEIRlct2.ode <- compiler::cmpfun(SEIRlct2.ode)

SEIRlct25.ode <- compiler::cmpfun(SEIRlct25.ode) mthd = "ode45"; atol= 1e-6

IC=c(S=0.9999 , E=0.0001 , I=0, R=0) parms1 <- parms; parms1[’kE’]=1; parms1[’kI’]=1; parms1; SEIRpt.init(parms1) b1=benchmark(SEIR.1 ={ode(y=IC, times = tms , func = SEIR.ode , parms = parms , method = mthd , atol=atol)},

SEIRlct .1={ode(y=ICs , times=tms , func=SEIRlct1.ode , parms = parms1 , method = mthd , atol=atol)},

SEIRpt.1 ={ode(y=ICs , times=tms , func=SEIRpt.ode , parms = parms1 , method = mthd , atol=atol)}, replications = reps) parms2 <- parms; parms2[’kE’]=2; parms2[’kI’]=2; parms2; SEIRpt.init(parms2) b2=benchmark(SEIR.2 ={ode(y=IC, times = tms , func = SEIR.ode , parms = parms , method = mthd , atol=atol)},

SEIRlct .2={ode(y=ICs , times=tms , func=SEIRlct2.ode , parms = parms2 , method = mthd , atol=atol)},

SEIRpt.2 ={ode(y=ICs , times=tms , func=SEIRpt.ode , parms = parms2 , method = mthd , atol=atol)}, replications = reps) parms25 <- parms; parms25[’kE’]=25; parms25[’kI’]=25; parms25; SEIRpt.init(parms25) b25=benchmark(SEIR.25 ={ode(y=IC, times = tms , func = SEIR.ode , parms = parms , method = mthd , atol=atol)},

SEIRlct .25={ ode(y=ICs , times=tms , func=SEIRlct25.ode , parms = parms25 , method = mthd , atol=atol)},

SEIRpt .25 ={ode(y=ICs , times=tms , func=SEIRpt.ode , parms = parms25 , method = mthd , atol=atol)}, replications = reps) b1; b2; b25 out <- rbind(b1,b2,b25) out$N <- stringr::str_split_fixed(out$test ,’\\.’,2)[,2] out$Model <- stringr::str_split_fixed(out$test ,’\\.’,2)[,1] out$Model <- gsub("SEIR$","Standard: Exponential latent & infetious periods",out$Model) out$Model <- gsub("lct","LCT: Erlang latent & infectious periods",out$Model) out$Model <- gsub("pt","GLCT: Erlang latent & infectious periods (phase -type formulation)",out$Model) out$Model <- gsub("SEIR",’’,out$Model) out$Model <- factor(out$Model , levels = unique(out$Model) ) out$N <- as.numeric(out$N) out reprint - July 9, 2020 References

Allen, L.J.S. (2007).

An Introduction to Mathematical Biology . Pearson/Prentice Hall.Altiok, Tayfur (1985). ‘‘On the Phase-Type Approximations of General Distributions’’. In:

IIETransactions doi : .Anderson, Roy M. and Robert M. May (1992). Infectious Diseases of Humans: Dynamics andControl . Oxford University Press.Arrowsmith, D. K. and C. M. Place (1990).

An Introduction to Dynamical Systems . CambridgeUniversity Press. 432 pp.Beuter, Anne et al., eds. (2003).

Nonlinear Dynamics in Physiology and Medicine . InterdisciplinaryApplied Mathematics (Book 25). Springer. 468 pp.Bladt, Mogens and Bo Friis Nielsen (2017a).

Matrix-Exponential Distributions in Applied Probability .Springer US. doi : .– (2017b). ‘‘Phase-Type Distributions’’. In: Matrix-Exponential Distributions in Applied Probability .Springer US, pp. 125–197. doi : .Clapp, Geoffrey and Doron Levy (2015). ‘‘A review of mathematical models for leukemia andlymphoma’’. In: Drug Discovery Today: Disease Models

16, pp. 1 –6. doi : .Dawes, J.H.P. and M.O. Souza (2013). ‘‘A derivation of Holling’s type I, II and III functionalresponses in predator–prey systems’’. In: Journal of Theoretical Biology doi : .Dayan, Peter and Laurence F. Abbott (2005). Theoretical Neuroscience: Computational andMathematical Modeling of Neural Systems . Computational Neuroscience. The MIT Press.Diekmann, O. and J.A.P. Heesterbeek (2000).

Mathematical epidemiology of infectious diseases:Model building, analysis and interpretation . Wiley Series in Mathematical and ComputationalBiology. John Wiley & Sons, LTD, New York.Edelstein-Keshet, Leah (2005).

Mathematical Models in Biology . Classics in Applied Mathematics(Book 46). Society for Industrial and Applied Mathematics. doi : .Ellner, Stephen P. and John Guckenheimer (2006). Dynamic Models in Biology . Princeton UniversityPress.Feng, Zhilan, Dashun Xu, and Haiyun Zhao (2007). ‘‘Epidemiological Models with Non-ExponentiallyDistributed Disease Stages and Applications to Disease Control’’. In:

Bulletin of MathematicalBiology doi : .Feng, Zhilan et al. (2016). ‘‘Mathematical models of Ebola-Consequences of underlying assumptions’’.In: Mathematical biosciences

Ecology Letters doi : .Hirsch, Morris W., Stephen Smale, and Robert L. Devaney (2012). Differential Equations, DynamicalSystems, and an Introduction to Chaos . 3rd. Elsevier. doi : .24 reprint - July 9, 2020 Holling, C. S. (1959a). ‘‘Some Characteristics of Simple Types of Predation and Parasitism’’. In:

The Canadian Entomologist doi : .– (1959b). ‘‘The Components of Predation as Revealed by a Study of Small-Mammal Predation ofthe European Pine Sawfly’’. In: The Canadian Entomologist doi : .Horv´ath, Andr´as, Marco Scarpa, and Mikl´os Telek (2016). ‘‘Phase Type and Matrix ExponentialDistributions in Stochastic Modeling’’. In: Principles of Performance and Reliability Modelingand Evaluation: Essays in Honor of Kishor Trivedi on his 70th Birthday . Ed. by Lance Fiondellaand Antonio Puliafito. Cham: Springer International Publishing, pp. 3–25. doi : .Horv´ath, G´abor and Mikl´os Telek (2017). ‘‘BuTools 2: A rich toolbox for Markovian performanceevaluation’’. English. In: ValueTools 2016 - 10th EAI International Conference on PerformanceEvaluation Methodologies and Tools . Association for Computing Machinery, pp. 137–142. doi : .– (2020). BuTools V2.0 . http://webspn.hit.bme.hu/~telek/tools/butools/doc/ . Accessed:2020-05-15.Horv´ath, G´abor et al. (2012). ‘‘Efficient Generation of PH-Distributed Random Variates’’. In: Proceedings of the 19th international conference on Analytical and Stochastic Modeling Techniquesand Applications . Ed. by Khalid Al-Begain, Dieter Fiems, and Jean-Marc Vincent. Berlin,Heidelberg: Springer Berlin Heidelberg, pp. 271–285. doi : .Hurtado, Paul J. (2020). ‘‘Building New Models: Rethinking and Revising ODE Model Assumptions’’.In: Foundations for Undergraduate Research in Mathematics . Springer International Publishing,pp. 1–86. doi : .Hurtado, Paul J. and Adam S. Kirosingh (2019). ‘‘Generalizations of the ‘Linear Chain Trick’:incorporating more flexible dwell time distributions into mean field ODE models’’. In: Journal ofMathematical Biology doi : .Izhikevich, Eugene M. (2010). Dynamical Systems in Neuroscience: The Geometry of Excitabilityand Bursting . Computational Neuroscience. MIT Press. 464 pp.Keeling, M. J. and B. T. Grenfell (1997). ‘‘Disease Extinction and Community Size: Modeling thePersistence of Measles’’. In:

Science doi : .eprint: http://science.sciencemag.org/content/275/5296/65.full.pdf .Keener, James and James Sneyd (2008a). Mathematical Physiology I: Cellular Physiology . 2nd.Springer.– (2008b).

Mathematical Physiology II: Systems Physiology . 2nd. Springer.Kermack, W. O. and A. G. McKendrick (1927). ‘‘A Contribution to the Mathematical Theory ofEpidemics’’. In:

Proceedings of the Royal Society of London. Series A, Containing Papers of aMathematical and Physical Character

Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematicaland Physical Character reprint - July 9, 2020

Kermack, W. O. and A. G. McKendrick (1933). ‘‘Contributions to the Mathematical Theory ofEpidemics. III. Further Studies of the Problem of Endemicity’’. In:

Proceedings of the RoyalSociety of London. Series A, Containing Papers of a Mathematical and Physical Character

Bulletin of MathematicalBiology doi : .– (1991b). ‘‘Contributions to the mathematical theory of epidemics—II. The problem of endemicity’’.In: Bulletin of Mathematical Biology doi : .– (1991c). ‘‘Contributions to the mathematical theory of epidemics—III. Further studies of theproblem of endemicity’’. In: Bulletin of Mathematical Biology doi : .Krylova, Olga and David J. D. Earn (2013). ‘‘Effects of the infectious period distribution onpredicted transitions in childhood disease dynamics’’. In: Journal of The Royal Society Interface doi : .Lloyd, Alun L. (2009). ‘‘Sensitivity of Model-Based Epidemiological Parameter Estimation to ModelAssumptions’’. In: Mathematical and Statistical Estimation Approaches in Epidemiology . Ed. byGerardo Chowell et al. Dordrecht: Springer Netherlands, pp. 123–141. doi : .Meiss, James D. (2017). Differential Dynamical Systems, Revised Edition . Philadelphia, PA: Societyfor Industrial and Applied Mathematics. doi : .Metz, J. A. J. and O. Diekmann, eds. (1986). The Dynamics of Physiologically Structured Populations .Vol. 68. Lecture Notes in Biomathematics. Springer, Berlin, Heidelberg. doi : .Metz, J.A.J. and Odo Diekmann (1991). ‘‘Exact finite dimensional representations of models forphysiologically structured populations. I: The abstract formulation of linear chain trickery’’. In: Proceedings of Differential Equations With Applications in Biology, Physics, and Engineering1989 . Ed. by J. A. Goldstein, F. Kappel, and W. Schappacher. Vol. 133, pp. 269–289.Murdoch, William W., Cheryl J. Briggs, and Roger M. Nisbet (2003).

Consumer--Resource Dynamics .Vol. 36. Monographs in Population Biology. Princeton, USA: Princeton University Press.Murray, James D. (2011a).

Mathematical Biology: I. An Introduction . Interdisciplinary AppliedMathematics (Book 17). Springer. 584 pp.– (2011b).

Mathematical Biology II: Spatial Models and Biomedical Applications . InterdisciplinaryApplied Mathematics (Book 18). Springer. 584 pp.Nisbet, R. M., W. S. C. Gurney, and J. A. J. Metz (1989). ‘‘Stage Structure Models Applied inEvolutionary Ecology’’. In:

Applied Mathematical Ecology . Ed. by Simon A. Levin, Thomas G.Hallam, and Louis J. Gross. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 428–449. doi : .R Core Team (2020). R: A Language and Environment for Statistical Computing . R Foundation forStatistical Computing. Vienna, Austria. 26 reprint - July 9, 2020

Reinecke, Philipp, Levente Bodrog, and Alexandra Danilkina (2012). ‘‘Phase-Type Distributions’’.In:

Resilience Assessment and Evaluation of Computing Systems . Ed. by Katinka Wolter et al.Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 85–113. doi : .Reinecke, Philipp, Tilman Krau, and Katinka Wolter (2012). ‘‘Cluster-based fitting of phase-typedistributions to empirical data’’. In: Computers & Mathematics with Applications doi : .Robertson, Suzanne L. et al. (2018). ‘‘A matter of maturity: To delay or not to delay? Continuous-time compartmental models of structured populations in the literature 2000-2016’’. In: NaturalResource Modeling doi : .Rosenzweig, M. L. and R. H. MacArthur (1963). ‘‘Graphical Representation and Stability Conditionsof Predator-Prey Interactions’’. In: The American Naturalist doi : .Smith, Hal (2010). An introduction to delay differential equations with applications to the lifesciences . Vol. 57. Springer Science & Business Media.Soetaert, Karline, Thomas Petzoldt, and R. Woodrow Setzer (2010). ‘‘Solving Differential Equationsin R: Package deSolve’’. In:

Journal of Statistical Software doi : .Strogatz, Steven H. (2014). Nonlinear Dynamics and Chaos: With Applications to Physics, Biology,Chemistry, and Engineering . 2nd. Studies in Nonlinearity. Westview Press.Wang, Xiaojing et al. (2017). ‘‘Evaluations of Interventions Using Mathematical Models withExponential and Non-exponential Distributions for Disease Stages: The Case of Ebola’’. In:

Bulletin of Mathematical Biology doi : .Wearing, Helen J, Pejman Rohani, and Matt J Keeling (2005). ‘‘Appropriate Models for theManagement of Infectious Diseases’’. In: PLOS Medicine doi : .Wiggins, Stephen (2003). Introduction to Applied Nonlinear Dynamical Systems and Chaos . 2nd.Vol. 2. Texts in Applied Mathematics. Springer-Verlag New York. 868 pp. doi : .Xia, Jing et al. (2009). ‘‘The Effects of Harvesting and Time Delay on Predator-prey Systemswith Holling Type II Functional Response’’. In: SIAM Journal on Applied Mathematics doi : .Yates, Christian A., Matthew J. Ford, and Richard L. Mort (2017). ‘‘A Multi-stage Representationof Cell Proliferation as a Markov Process’’. In: Bulletin of Mathematical Biology doi :10.1007/s11538-017-0356-4