Abstract

Cell division times in microbial populations display significant fluctuations, that impact the population growth rate in a non-trivial way. If fluctuations are uncorrelated among different cells, the population growth rate is predicted by the Euler-Lotka equation, which is a classic result in mathematical biology. However, cell division times can be significantly correlated, due to physical properties of cells that are passed through generations. In this paper, we derive an equation remarkably similar to the Euler-Lotka equation which is valid in the presence of correlations. Our exact result is based on large deviation theory and does not require particularly strong assumptions on the underlying dynamics. We apply our theory to a phenomenological model of bacterial cell division in E.coli and to experimental data. We find that the discrepancy between the growth rate predicted by the Euler-Lotka equation and our generalized version is relatively small, but large enough to be measurable by our approach.

Full PDF

aa r X i v : . [ q - b i o . P E ] J a n Generalized Euler-Lotka equation for correlated cell divisions

Simone Pigolotti ∗ Biological Complexity Unit, Okinawa Institute of Science and Technology and Graduate University, Onna, Okinawa 904-0495.

Cell division times in microbial populations display signiﬁcant ﬂuctuations. These ﬂuctuationsimpact the population growth rate in a non-trivial way. If ﬂuctuations are uncorrelated amongdiﬀerent cells, the population growth rate is predicted by the Euler-Lotka equation, which is a classicresult in mathematical biology. However, cell division times can present signiﬁcant correlations,due to physical properties of cells that are passed from mothers to daughters. In this paper, wederive an equation remarkably similar to the Euler-Lotka equation which is valid in the presence ofcorrelations. Our exact result is based on large deviation theory and does not require particularlystrong assumptions on the underlying dynamics. We apply our theory to a phenomenological modelof bacterial cell division. We ﬁnd that the discrepancy between the growth rate predicted bythe Euler-Lotka equation and our generalized version is relatively small, but large enough to bemeasurable in experiments.

Microbial populations in steady, nutrient-rich condi-tions tend to grow exponentially. Their exponentialgrowth rate Λ can be taken as a proxy for the popu-lation ﬁtness and is therefore a biologically importantquantity. In a population of cells dividing at regulartimes τ , the population size at a time T multiple of τ is N ( T ) = N (0)2 T/τ , so that Λ = ln 2 /τ . In practice,cell division times of microbial populations signiﬁcantlyﬂuctuate, so that the division time τ i of a given individ-ual i must be considered as a random quantity. As aconsequence, the growth of N ( T ) is stochastic. In thesesituations, we can still deﬁne an exponential growth rateby Λ = lim T →∞ T ln N ( T ) . (1)For independent, identically distributed cell divisiontimes τ i , the exponential growth rate converges to a de-terministic value and can be computed as solution of thecelebrated Euler-Lotka equation2 h e − Λ τ i τ = 1 , (2)where we denote the average over the distribution p ( τ )of the division times by h f ( τ ) i τ = R dτ p ( τ ) f ( τ ). We usethis notation also for discrete variables, with the integralappropriately replaced by a sum. Equation (2) is a clas-sic result in mathematical biology. A recent experimentalstudy has tested its prediction by tracking individual celldivisions in a microﬂuidic device [1]. Beside microbialpopulations, the Euler-Lotka equation ﬁnds importantapplications in epidemiology, where the factor 2 is re-placed by the reproductive number R [2].Experimental studies have revealed that ﬂuctuations inmicrobial cell features are correlated among generations[3–5]. These correlations are caused by properties of cellsthat are passed through generations. These propertiescan be physical such as cell mass, or biological such asgene expression. Their ﬂuctuations are often controlledto preserve homeostasis, i.e., a stable state of cells acrossgenerations. For example, experimental and theoretical studies provided evidences for an ”adder” mechanism, inwhich cells attempt at growing their mass by a constantamount before dividing [4, 6, 7].Regardless of the mechanism underlying correlationsin cell division times, generalizing Eq. (2) to the corre-lated case has proven to be a hard problem. One rela-tively simple case is the “Markovian” scenario where acell generation time depends only on that of her mother.Expressions for the growth rate in these cases have beenderived in classic works by Powell [8] and Lebowitz andRubinow [9], see also [10]. Alternative approaches esti-mate the growth rate by comparing the outcome of sam-pling the population forward in time with retrospectivesampling, in which individuals in the ﬁnal population aretraced back to their ancestors [11, 12]. A recent studylinked the exponential growth rate Λ to the asymptoticdistribution of the number of cell divisions ∆ among lin-eages [13]. This approach makes use of techniques bor-rowed from large deviation theory and has the advantageof neither requiring the Markovian assumption, nor ret-rospective sampling.In this paper, we introduce a generalization of theEuler-Lotka equation (2) which is valid for correlated celldivision times:lim ∆ →∞

1∆ ln * e − Λ ∆ P i =1 τ i + { τ i } = − ln 2 , (3)where h . . . i { τ i } denotes an average over sequences { τ i } of cell division times along lineages. If the τ i s were un-correlated, then the left hand side of Eq. (3) reduces tothe cumulant generating function ln h e qτ i τ , and thereforeEq. (3) becomes equivalent to the traditional Euler-Lotkaequation (2). Equation (3) only requires as hypothesesthat population dynamics is at steady state and the sumof the τ i s across a lineage satisﬁes a large deviation prin-ciple with a convex rate function, which in practice arerather mild assumptions. Equation (3) can therefore beused to compute the population growth rate from indi-vidual cell division times in rather general settings.We consider a microbial population initially consti-tuted of a single individual. The population grows intime by a sequence of cell divisions. We represent thegenealogy of the population by a tree, whose nodes arecell division events and branches are times between con-secutive cell divisions, see Fig. 1a. T i m e (a) (b) (cid:127) τ i FIG. 1. Population dynamics represented as a lineage tree.(a) a microbial population grows in time from a single cell.Nodes (circle) denote cell division events. Lengths of branchesdenote the cell division times τ i s. One lineage is representedwith a thick orange line. In this case, the population is let toevolve until a ﬁxed time T . (b) Lineage tree in an alternativeensemble, in which lineages are let to evolve until they haveaccumulated exactly ∆ = 4 cell divisions. We now introduce the concept of a lineage. A lineageis identiﬁed by an individual in the population at time T complemented by its past history, i.e., the number ∆of cell divisions separating it from the individual at time0 and the sequence of cell division times { τ i } of all itsancestors, see Fig. 1a. Following Refs. [13, 14], we nowimagine to randomly select a lineage by starting from theinitial individual and picking at each divisions one of thetwo newborns with equal probability. With this proce-dure, each individual lineage is chosen with probability2 − ∆ , where ∆ depends in principle on the lineage. Theprobability that this randomly selected lineage includes∆ cell division events is equal to p (∆; T ) = 2 − ∆ N (∆; T ),where N (∆; T ) is the number of lineages with ∆ cell di-visions at time T . Since N ( T ) = P ∆ N (∆; T ), we obtain N ( T ) = h ∆ i ∆ . (4)Substituting this expression into the deﬁnition of the ex-ponential growth rate, Eq. (1), we ﬁndΛ = lim T →∞ T ln h ∆ i ∆ . (5)We remark that, in this argument, we did not distinguishbetween the a priori probabilities p (∆; T ) and the empir-ical ones, estimated from lineage frequencies in a singletree. In fact, it can be shown that the empirical proba-bilities rapidly converge to the a priori ones in the large T limit [13, 15], so that this distinction is not importantfor our aims.To make further progress, we introduce some ideasfrom large deviation theory [16]. Large deviation the-ory describes the leading behavior of distributions when a parameter (like the time T in our case) becomes large.In large deviation theory, variables such as ∆, whose av-erage is proportional to T , are called extensive. We as-sociate with T the intensive variable δ = ∆ /T , whoseaverage tends to a constant for large T . The large devi-ation principle for δ is expressed by p ( δ ) ≍ e − T I ( δ ) ( δ ) . (6)The function I ( δ ) ( δ ) is called the rate function. We usethe notation I ( δ ) to stress that I is the rate function asso-ciated with the distribution of the variable δ . The symbol“ ≍ ” denotes the leading exponential behavior; it can beseen as a shorthand for I ( δ ) = − lim T →∞ [ln p ( δ )] /T .An alternative way of studying asymptotic ﬂuctuationsof intensive random variables is via the scaled cumulantgenerating function, deﬁned by ψ ( δ ) ( q ) = lim T →∞ T ln h e qT δ i δ . (7)The Gartner-Ellis theorem states that, if the rate func-tion is convex, it is related with the scaled cumulant gen-erating by a Legendre-Fenchel transform I ( δ ) ( δ ) = sup q [ qδ − ψ ( δ ) ( q )] . (8)Since the Legendre-Fenchel transform is an involution, italso holds that ψ ( δ ) ( q ) = sup δ [ qδ − I ( δ ) ( δ )].We now return to Eq. (4) and brieﬂy summarize themain result of Ref. [13]. Assuming that δ satisﬁes a largedeviation principle, we obtainΛ = lim T →∞ T ln Z dδe T [ δ ln 2 − I ( δ ) ( δ )] . (9)In the limit T → ∞ , the integral can be evaluated withthe method of steepest descent, obtainingΛ = sup δ [ δ ln 2 − I ( δ ) ( δ )] . (10)Equation (10) is the central result of Ref. [13]. An al-ternative way to obtain it is to directly identify the ex-pression of the scaled cumulant generating function inEq. (5): Λ = ψ ( δ ) (ln 2) . (11)Equation (10) then follows by expressing the scaled cu-mulant generating function in terms of the rate functionby means of the Gartner-Ellis theorem.Application of this theory requires knowledge of theasymptotic distribution of ∆, or its intensive counterpart δ . However, in analogy with the Euler-Lotka equation (2)it would be desirable to express the growth rate in termsof the distribution of division times and its correlations.To this aim, we now consider a case in which, rather thanletting the population grow until a given time T , eachlineage is let to grow until it has accumulated exactly ∆cell divisions, see Fig. 1b. In this alternative ensemble , ∆is ﬁxed whereas T ﬂuctuates among lineages. In this case,we consider T as an extensive random variable, since itsaverage grows linearly with the ﬁxed large parameter ∆.We similarly associate with T the intensive variable t = T / ∆. We expect t to satisfy as well a large deviationprinciple: p ( t ) ≍ e − ∆ I ( t ) ( t ) . (12)In the language of probability theory and in particularof queuing theory, ∆( T ) is called a counting process and T (∆) its inverse. A useful result [17] states that the largedeviation of their associated intensive variables, δ and t respectively, are related by I ( δ ) ( x ) = xI ( t ) (1 /x ) . (13)Note that this is the result that one would obtain bytaking the large deviation form of the probability dis-tribution and simply applying the rules for a change ofvariable.We now substitute this result into Eq. (10), obtainingΛ = sup δ (cid:26) δ (cid:20) ln 2 − I ( t ) (cid:18) δ (cid:19)(cid:21)(cid:27) , (14)and, by applying the Gartner-Ellis theorem,Λ = sup δ (cid:26) δ (cid:20) ln 2 − sup q (cid:16) qδ − ψ ( t ) ( q ) (cid:17)(cid:21)(cid:27) = sup δ inf q h δ ln 2 − q + δψ ( t ) ( q ) i . (15)We assume that the function in curly brackets smoothlydepends on δ and q and therefore compute the supremumand inﬁmum by simply taking derivatives. The extremal-ity condition respect to δ is expressed by ψ ( t ) ( q inf ) = − ln 2 . (16)Substituting this condition back into Eq. (15) yields Λ = − q inf , so that we rewrite Eq. (16) as ψ ( t ) ( − Λ) = − ln 2 . (17)Upon substituting the deﬁnition of the scaled cumulantgenerating function, Eq. (7), into Eq. (17), we obtain thegeneralized Euler-Lotka equation (3), as anticipated.Taking the derivative in Eq. (15) respect to q resultsin δ max ddq ψ ( t ) ( q ) (cid:12)(cid:12)(cid:12)(cid:12) q = − Λ = 1 . (18)This equation relates the dominant value of δ with thestatistics of the division times and provides another facetto the generalized Euler-Lotka theory. Equation (18) is best interpreted in the simple case of uncorrelated celldivisions, where it reduces to δ max = 1 / h τ e − Λ τ i . If Λ ≪

1, the dominant value of γ is simply its average value,i.e. the inverse of the average division time. However,for quickly growing population, the dominant value of δ becomes signiﬁcantly larger than this value, as cells thatreproduce faster contribute more to population growth. Λ ( m i n - ) ∆ c=0, n lin =100GELEL 0.0118 0.0119 0.012 0.0121 0.0122 10 100 Λ ( m i n - ) n lin c=0, ∆ =21GELEL 0.0119 0.012 0.0121 0.0122 10 100 Λ ( m i n - ) ∆ c=0.5, n lin =100GELEL 0.0118 0.0119 0.012 0.0121 0.0122 10 100 Λ ( m i n - ) n lin c=0.5, ∆ =21GELEL FIG. 2. Estimates of the growth rate Λ in the adder model ofRef. [4] obtained by the generalized Euler-Lotka equation (3)(GEL) and by the conventional Euler-Lotka equation (2)(EL). In all panels, parameters of the growth rate distributionare α = 0 . − and σ α = 0 . − . The inher-ited length fraction f is distributed according to a Gaussianwith mean f = 0 . σ f = 0 .

03. Theadded length l follows a lognormal distribution with mean l = 3 . µm and standard deviation σ l = 0 . µm . In pan-els (a) and (c), we ﬁx the number of lineages n lin = 100 andplot the results as a function of the number of cell divisions∆ in each lineage. In panels (b) and (d), ∆ = 21 is ﬁxed andresults are plotted as a function of n lin . Panels (a) and (b):growth rates of mothers and daughters are positively corre-lated ( c = 0 . c = 0). In all panels, sim-ulations are repeated 20 times; error bars denote standarddeviations computed from these realizations. To illustrate this result, we consider a phenomenolog-ical model of bacterial growth inspired to the “adder”principle [4]. In the model, each bacterial cell grows inlength at a rate α . The rate α ﬂuctuates among cells butis constant in time per each individual cell. is distributedaccording to the formula α = α + c ( α M − α ) + σ α (1 − c ) ξ. (19)where α M is the value of α of the mother of the con-sidered cell. The parameters α and σ α are the averageand variance of the distribution of α , respectively. Theparameter ξ is a Gaussian random variable with zero av-erage and unit variance. Finally, the parameter c con-trols the degree of correlations between the growth rateof mothers and daughters.Each cell is characterized by a length s b at birth and s d at death. The adder model postulates that the addedlength l = s d − s b is roughly constant among cells. Afterdivision, a daughter inherits a fraction f of the mother’slength. We allow for some variability by taking both f and l as random variables, with Gaussian and lognor-mal distribution respectively. Averages and variances ofthese distributions are estimated from experimental data[4]. According to our assumptions, the time between celldivisions is expressed by τ = [ α ln( s d /s b )] − .We simulate the model to obtained n lin lineages, eachincluding ∆ cell divisions. We plug the results of thesesimulations into the generalized Euler-Lotka equation (3)and thereby compute the value of Λ, see Fig. 2. We con-sider two diﬀerent scenarios: one in which the growthrate α is positively correlated across generations c = 0 . c = 0). In both sce-narios, we compare the results of the generalized Euler-Lotka equations with those of the classic Euler-Lotkaequation (2). We ﬁnd that, in the correlated case, thegrowth rates predicted by the two equations are nearlyindistinguishable, see Fig. 2a and 2b. This is not surpris-ing, as the adder mechanism tends to cause negative cor-relations among cell division times, which can be coun-terbalance by positive correlations in the growth rate [4].Indeed, when α is uncorrelated, the adder mechanism isnot compensated and we ﬁnd a diﬀerence between thegrowth rate predicted by the two equations, see Fig. 2cand 2d. In this case, this diﬀerence is on the order of0 . p ( τ i | τ i − , τ i − , τ i − . . . ) = p ( τ i | τ i − ) . (20)In this case, one has D e q P ∆ i =1 τ i E = Z dτ . . . dτ ∆ p τ ∆ Y i =1 e qτ i p ( τ i | τ i − ) . (21)This expression and the deﬁnition of the scaled cumulantgenerating function, Eq. (7), imply that ψ ( τ ) ( q ) = ln λ ( q ) , (22)where λ ( q ) is the leading eigenvalue of the convolutionkernel Π τ i ,τ j = e qτ i p ( τ i | τ i =1 ) . (23) Combining Eq. (3) and Eq. (22) we obtain 2 λ ( − Λ) = 1,which is a classic result for the Markovian case [8–10].In conclusions, in this paper we derived the general-ized Euler-Lotka equation (3). This equation describesthe growth rate of populations where cell divisions occurin a correlated way. We obtained this result by meansof a result in queuing theory [17] that was recently ap-plied in stochastic thermodynamics [18] and to study en-zyme replicating information [19]. Using a phenomeno-logical model of bacterial cell division, we have demon-strated that our result can be easily applied to lineagedata. A comparison with the prediction of the tradi-tional Euler-Lotka equation permits to quantitatively as-sess the impact of correlations on the population growthrate. Due to these properties, we expect the generalizedEuler-Lotka equation to become a useful tool to analyzepopulation dynamics tracked at the single-cell level inexperiments.I thank Deepak Bhat and Anzhelika Koldaeva for dis-cussions on the Euler-Lotka theory and Luca Peliti for acritical reading of the manuscript. ∗ [email protected][1] M. Hashimoto, T. Nozoe, H. Nakaoka, R. Okura,S. Akiyoshi, K. Kaneko, E. Kussell, and Y. Wakamoto,Proceedings of the National Academy of Sciences ,3251 (2016).[2] N. C. Grassly and C. Fraser, Nature Reviews Microbiol-ogy , 477 (2008).[3] P. Wang, L. Robert, J. Pelletier, W. L. Dang, F. Taddei,A. Wright, and S. Jun, Current biology , 1099 (2010).[4] M. Campos, I. V. Surovtsev, S. Kato, A. Paintdakhi,B. Beltran, S. E. Ebmeier, and C. Jacobs-Wagner, Cell , 1433 (2014).[5] M. Wallden, D. Fange, E. G. Lundius, ¨O. Baltekin, andJ. Elf, Cell , 729 (2016).[6] A. Amir, Physical review letters , 208102 (2014).[7] S. Taheri-Araghi, S. Bradde, J. T. Sauls, N. S. Hill, P. A.Levin, J. Paulsson, M. Vergassola, and S. Jun, Currentbiology , 385 (2015).[8] E. Powell, Microbiology , 492 (1956).[9] J. L. Lebowitz and S. Rubinow, Journal of MathematicalBiology , 17 (1974).[10] J. Lin and A. Amir, Cell systems , 358 (2017).[11] T. J. Kobayashi and Y. Sughiyama, Physical review let-ters , 238102 (2015).[12] R. Garc´ıa-Garc´ıa, A. Genthon, and D. Lacoste, PhysicalReview E , 042413 (2019).[13] E. Levien, T. GrandPre, and A. Amir, Physical ReviewLetters (2020).[14] T. Nozoe, E. Kussell, and Y. Wakamoto, PLoS genetics , e1006653 (2017).[15] F. Jafarpour, C. S. Wright, H. Gudjonson, J. Riebling,E. Dawson, K. Lo, A. Fiebig, S. Crosson, A. R. Dinner,and S. Iyer-Biswas, Physical Review X , 021007 (2018).[16] H. Touchette, Physics Reports , 1 (2009).[17] P. W. Glynn and W. Whitt, Queueing Systems , 107 (1994).[18] T. R. Gingrich and J. M. Horowitz, Physical review let-ters , 170601 (2017). [19] D. Chiuchi´u, Y. Tu, and S. Pigolotti, Physical reviewletters123