Some mathematical tools for the Lenski experiment
aa r X i v : . [ q - b i o . P E ] O c t Some mathematical tools for the Lenski experiment
Bernard Ycart ∗ Agnès Hamon † J. Gaffé ‡ D. Schneider § Abstract
The Lenski experiment is a long term daily reproduction of
Escherichia coli , that hasevidenced phenotypic and genetic evolutions along the years. Some mathematical models,that could be usefull in understanding the results of that experiment, are reviewed here:stochastic and deterministic growth, mutation appearance and fixation, competition ofspecies.
Keywords: cell kinetics; branching processes; Luria-Delbrück distribution
MSC:
This is primarily intended as a review of a very small part of the vast litterature onmathematical models applied to population biology. The application that we have inmind is the
Lenski experiment (referred to as LE hereafter, see [4, 43, 64, 71, 72, 73]).Here is a much simplified description (see Philippe et al. [64] for a more detailed account).The ancestral strain was
Escherichia Coli B . Twelve populations have been propagatedby daily serial transfer, using a 1:100 dilution, in the same defined environment, formore than 40,000 generations. The growth medium supports a maximum of 5 × cellsper mL; therefore the number of cells in each one of the twelve 10 mL vessels variesdaily from 5 × to 5 × . Our main focus will be on modelling the appearanceand fixation of mutations along successive generations. The total number of mutationalevents that have happened during the 40,000 generations in each population is estimatedat 3 × ([64], p. 849). Our main question here is: how did some of the mutantstrains survive and eventually invade the population? We argue that only beneficialmutations ( i.e. increasing the fitness) can have survived the experimental process. Indeedwhen a nonbeneficial mutation occurs a given day, the proportion of mutants in thepopulation remains extremely small, and therefore the chances that mutants survive the1:100 sampling to the next day are very small. On the contrary, a strain carrying abeneficial mutation survives if mutant cells can be found in sizeable amounts at themoment of dilution. Provided the new strain survives the first day dilution, chances arethat its proportion in the global population will rapidly increase. When that proportionis close to one, the initial strain has a high probability to be wiped out by daily dilution.Our objective here is to provide a quantified basis to the above assertions.Although our emphasis will mainly be on probability, we shall use some elementarypopulation dynamics, on which our basic references are Murray [58] and Kot [45]. Thestochastic tools used here, essentially discrete sampling and birth-and-death processes,are presented in many excellent textbooks. For probability theory, Feller’s treatise (inparticular the second volume [22]) remains a good reference. Ross [68] and Tuckwell ∗ LJK, CNRS UMR 5224, Univ. Grenoble-Alpes, France
[email protected] † LJK, CNRS UMR 5224, Univ. Grenoble-Alpes, France
[email protected] ‡ LAPM, CNRS UMR 5163, Univ. Grenoble-Alpes, France
[email protected] § LAPM, CNRS UMR 5163, Univ. Grenoble-Alpes, France
77] are oriented towards applications in biology. The basic theory of continuous timeMarkov processes is developed in Bharucha-Reid [11] and Çinlar [18]. Ethier & Kurtz[20] give a more advanced treatment, in particular of asymptotics and convergence todiffusion processes. Some textbooks, such as Gardiner [25] are particularly oriented toapplications. Being one of the founders of the theory, Bartlett [8, 9] emphasizes populationmodelling. Moran [57] develops genetic applications. Some more recent textbooks havebeen devoted to stochastic modelling in biology, such as Ewens [21] and Wilkinson [83];Allen [2] and Durrett [17] are of particular interest to us. The course given on “StochasticPopulation Systems” by D. Dawson to the Summer School in probability at PIMS-UBCin 2009 covers all the material that we shall use and much more. The lecture notes areavailable on line .The paper is organized as follows. The daily sampling process will be examinedfirst, in section 2. Next, we shall review in section 3 some deterministic models forthe evolution of two competing populations (normal and mutant cells) in a constrainedenvironment. On a very simple model, it will shown that the final proportion of mutants( i.e. before the next dilution) can be deduced from the initial proportion ( i.e. right afterthe previous dilution) and the two division rates of normal and mutant cells. Section4 will be devoted to the stochastic counterpart of the deterministic models examined insection 3. The emphasis will be on the convergence of stochastic to deterministic models.Stochastic mutation models will be reviewed in section 5. There we shall focus on the firstappearance of mutations and the proportion of mutants at the end of the first day. Allmodels considered here are based on several parameters, among which at least individualdivision rates and mutation probabilities. The question of estimating those parametersis therefore crucial, and it will be reviewed in section 6. The problem of beneficial mutations being lost in populations with periodic bottleneckshas been studied by Wahl and Gerrish [82]. Our aim here is to give a simple quantitativeevaluation of the survival of mutations from one day to the next. Recall that the maximalcapacity of the medium is n f = 5 × . At the end of each day, n f cells are present on thepopulation, out of which n = n f /
100 are sampled. Assume that at the end of one day, N normal cells and M = n f − N mutant cells are present. How many mutant cells willremain after the 1:100 dilution? The simplest model assumes equiprobability: assumingthat the medium is homogeneous, each n -sized sample has the same probability to bechosen for the next day. Therefore the number of mutants surviving dilution follows thehypergeometric distribution with parameters n f , M and n . In other words, the chancesto find exactly n normal cells and m = n − n mutant cells after dilution are: (cid:0) Mm (cid:1)(cid:0) Nn (cid:1)(cid:0) n f n (cid:1) . Since n f is large, and n is small compared to n f , the hypergeometric distribution can becorrectly approximated by the binomial with parameters n and p = M/n f (proportionof mutants). As a consequence, the probability to find exactly m mutants is (cid:0) Mm (cid:1)(cid:0) Nn (cid:1)(cid:0) n f n (cid:1) ≃ (cid:18) n m (cid:19) p m (1 − p ) n − n . If p is large enough (say p > − ), the binomial distribution above can be approxi-mated by a Gaussian, with expectation n p and variance n p (1 − p ). In other words, the roportion p ′ = m/n will be close to p , a 95% fluctuation interval being: p ± . p p (1 − p ) √ n . Since n = 5 10 , one can express it differently by saying that with 98.73% probability, p ′ is at distance at most 5 × − of p .However, if p is too small, the approximation above is no longer valid. A betterapproximation is given by the Poisson distribution with parameter n p : (cid:0) Mm (cid:1)(cid:0) Nn (cid:1)(cid:0) n f n (cid:1) ≃ e − n p ( n p ) m m ! . We should like to emphasize here that when M is small (a few units), chances thatall mutant cells disappear after dilution are high. Indeed, the probability that no mutantcell remains is: (cid:0) Nn (cid:1)(cid:0) n f n (cid:1) ≃ e − n M/n f ≃ − M . For instance, if M = 10 mutant cells are present, there is a 90% probability that theywill all disappear after dilution. From the 100-fold increase, one can evaluate the dailynumber of generations from a given cell to be of order 6 (2 = 64) to 7 (2 = 128). Thetable below gives the values of the probability of disappearance for each value of M = 2 d , d (the number of generations) ranging from 0 to 8. Numerical calculations (from theexact hypergeometric distribution) were obtained through R [67].divisions 0 1 2 3 4 5 6 7 8cells 1 2 4 8 16 32 64 128 256proba of disappearance 0.99 0.98 0.96 0.92 0.85 0.72 0.53 0.28 0.08 The simplest deterministic model, that of exponential growth, is named after Malthus[53]: d N ( t )d t = νN ( t ) , where N ( t ) denotes the number of cells at time t , and ν is the individual division rate(IDR) of each cell. That model had been considered long before Malthus, in particularby Euler. What Malthus actually discussed in 1798 was precisely the growth limitationdue to lack of resources.It may safely be pronounced, therefore, that population, when unchecked, goeson doubling itself every twenty-five years, or increases in a geometrical ratio.The rate according to which the productions of the earth may be supposedto increase, it will not be easy to determine. Of this, however, we may beperfectly certain, that the ratio of their increase must be totally of a differentnature from the ratio of the increase of population.Quetelet was the first one to propose to account for growth limitations by substracting aquadratic correction term to the differential equation above, in 1836 (see [66] p. 288).La population tend à croître selon une progression géométrique.La résistance, ou la somme des obstacles à son développement, est, touteschoses égales d’ailleurs, comme le carré de la vitesse avec laquelle la populationtend à croître. wo years later, Verhulst [79] discussed various other limitating terms and compared themodels so obtained to experimental data. The model with a quadratic limitation is nowknown as Verhulst model, or else logistic model .d N ( t )d t = νN ( t ) (cid:18) − N ( t ) n f (cid:19) , where n f is the maximal population sustained by the environment. The model wasgeneralized to several species competing in the same environment by Lotka [50] in 1925,and Volterra [80] in 1926. Volterra discussed the different models in a course given in1928 to the newly founded Institut Henri Poincaré. The lecture notes “Lessons on themathematical theory of the struggle for life” [81], to which we shall refer, appeared in1931. Soon after, Lotka and Volterra predictions were confronted to experimental datacoming from all sorts of different contexts, in particular by Gause [26].Here are the first lines of Volterra [81], from section 1.1 entitled “Deux espèces sedisputant la même nourriture”.Supposons que, avec une nourriture en quantité suffisante pour satisfaire com-plètement la voracité de ces êtres, il y ait des coefficients d’accroissementpositifs et constants ε , ε . Si nous nous plaçons maintenant dans le cas réeld’espèces vivant dans un milieu délimité, la nourriture diminuera quand lesnombres N , N des individus des deux espèces augmenteront et cela ferabaisser la valeur des coefficients d’accroissement. Si l’on représente la nour-riture dévorée par unité de temps par F ( N , N ) fonction nulle avec N , N ensemble, tendant vers l’infini avec chacune des variables et fonction crois-sante de chacune d’elles, il sera assez naturel de prendre comme coefficientsd’accroissement ε − γ F ( N , N ) , ε − γ F ( N , N ) ,γ , γ étant des constantes positives correspondant aux deux espèces et à leursbesoins respectifs de nourriture.D’où le système différentiel traduisant le développement des espèces.d N d t = [ ε − γ F ( N , N )] N , d N d t = [ ε − γ F ( N , N )] N . (1)Maintenant se pose le problème mathématique d’étudier les intégrales N , N de ce système, avec des valeurs initiales N , N positives pour t = t .On peut démontrer que pour tout intervalle fini ( t , T ) il y a une solutionunique, de deux fonctions continues, restant comprises entre deux nombrespositifs, le plus grand ne dépendant pas de l’extrémité T de l’intervalle (c’est-à-dire que N , N restent bornés).Étudions ce qui arrive quand le temps s’écoule indéfiniment. En transcrivant(1) sous la formed log N d t = ε − γ F ( N , N ) , d log N d t = ε − γ F ( N , N ) , (1’)il vient par combinaison γ d log N d t − γ d log N d t = ε γ − ε γ , puis N γ N γ = ( N ) γ ( N ) γ e ( ε γ − ε γ )( t − t ) . (2) égligeons le cas infiniment peu probable où ε γ − ε γ = 0et supposons, en permutant au besoin les espèces, que ε γ − ε γ > ε γ > ε γ ;alors, d’après (2), lim t = ∞ N γ N γ = + ∞ . (3) N restant borné, N tend donc vers 0.Nous concluerons donc que la seconde espèce, celle de εγ le plus petit, s’épuise et disparaît, tandis que la première subsiste .More as an example than a realistic model, we shall consider a particular case of Volterra’ssystem. Let us denote by N ( t ) the number of normal cells at time t , and by M ( t ) thenumber of mutant cells. Let ν and µ be their respective IDR’s, and n f denote themaximal capacity (total number of cells sustained by the medium). Our model, that weshall call Volterra Model (VM) even though it is only a very particular case of (1) deemed“infiniment peu probable” by Volterra, will be: d N ( t )d t = νN ( t ) (cid:18) − N ( t ) + M ( t ) n f (cid:19) d M ( t )d t = µM ( t ) (cid:18) − N ( t ) + M ( t ) n f (cid:19) (VM)In other words, the growth of both normal and mutant cells is equally restricted, pro-portionally to their sum. It stops when that sum reaches the maximal capacity n f . Ofcourse, the IDR’s ν and µ should be estimated, and we shall review estimation methodsin section 6. For the moment, we shall assume that the time scale has been set so that ν = 1. Therefore µ = µ/ν can be viewed as the fitness of mutants, and we shall mainlyconsider here the case µ > N ( t ) and M ( t ), all starting from the same initial values N (0) = 0 . × (5 × ) and M (0) = 0 . × (5 × ). The value of µ (the relative fitness of mutants) ranges from 1 . t tends to + ∞ . The calculations show that the asymptotic value is reachedin practice for relatively small values of t . All numerical calculations and plots were madethrough Scilab [14].Let us denote by N ( ∞ ) and M ( ∞ ), the limits as t tends to infinity of N ( t ) and M ( t ).They can be computed using Volterra’s method: ( N ( ∞ )) µ ( M ( ∞ )) ν = ( N (0)) µ ( M (0)) ν N ( ∞ ) + M ( ∞ ) = n f Our main interest is in the evolution of the relative proportion of mutants at time t ,denoted by P ( t ): P ( t ) = M ( t ) N ( t ) + M ( t ) . .0e+005.0e+071.0e+081.5e+082.0e+082.5e+083.0e+083.5e+084.0e+08 0 1 2 3 4 5 6 7 8 Figure 1: Evolution in time of the numbers of normal and mutant cells in time in the Volterramodel. The initial proportion of mutants is 0 .
1. The IDR of normal cells is 1. That of mutantcells is 1 . . . . Figure 2 shows the evolution in time of P ( t ), with the same values of the parameters asbefore. Unlike the general case studied by Volterra, the two populations reach an equi-librium where the proportion of mutants is not 1. Nevertheless the IDR of mutants beinglarger than that of normal cells, P ( t ) increases in time. It converges to an asymptoticvalue, denoted by P ( ∞ ). That “limit proportion of mutants” can be viewed as a functionboth of the initial proportion P (0) and the fitness µ/ν . Figure 3 plots that function, forsmall values of P (0) and fitnesses ranging from 1 to 5.Consider a given strain of mutants with fitness µ >
1, and denote by f µ the (increasing)function that maps P (0) onto P ( ∞ ). Say that the mutation has taken place on day 0,and denote by P the (small but positive) proportion of mutant cells at the end of day0. After dilution, the proportion of mutant cells at the beginning of day 1 will be arandom variable with expectation P (see the discussion in section 2). Provided it isnon null, at the end of day 1 the new proportion will be the image by f µ of the initial .10.20.30.40.50.60.70.8 0 1 2 3 4 5 6 7 8 Figure 2: Evolution in time of the proportion of mutant cells in time in the Volterra model.The initial proportion of mutants is 0 .
1. The IDR of normal cells is 1. That of mutant cellsis 1 . . . . one. Dilution transforms it in another random variable P with expectation f µ ( P ), andso on. Denote by P n the proportion of mutants at the beginning of day n . At the endof day n , it will be f µ ( P n ), then at the beginning of day n + 1, P n +1 will be a newrandom variable with expectation f µ ( P n ). The sequence of random variables so definedis a Markov chain, converging to 1 (mutants will eventually invade the whole population).Using renewal theory techniques, the distribution of the (random) number of days beforecomplete invasion can be precisely estimated. The rapid growth of f µ (figure 3) makes itquite likely that even starting at day 0 with a small proportion of mutants, the number ofdays to complete invasion will be relatively small, conditionally of course to the fact thatmutants do not disappear upon dilution in the first days. An estimate for the numberof days until complete invasion can be obtained by iteratively applying f u , starting with p = 1 /n (a single mutant on day 0). The results are plotted on figure 4: with a relativefitness of 1.2, it takes about 25 days for mutant cells to invade the population, whereas Figure 3: Limit proportion of mutant cells in the Volterra model (Z-axis, normal scale). Theinitial proportion P (0), ranges from e − = 0 .
368 to e − = 4 . × − (X-axis, log-scale).The 50 fitnesses are linearly spaced between 1 and 5 (Y-axis, normal scale). with a relative fitness of 2, it takes 5 days.Symmetrically, the same techniques will show that for a nonbeneficial mutation ( µ N ( t ) and M ( t ), oneadds the substrate (nutrient) concentration S ( t ). The model, that we shall call (somewhatarbitrarily) Powell Model or PM is given by the following system of differential equations .00.10.20.30.40.50.60.70.80.91.0 0 5 10 15 20 25 30 Figure 4: Daily evolution for the proportion of mutant cells. The initial proportion is 1 / (5 10 )at the beginning of day 0. The IDR of normal cells is 1. That of mutant cells is 1 . . . . (see Kot [45], system (12.16) p. 205). d S ( t )d t = D ( S i − S ( t )) − y n νS ( t ) N ( t ) k n + S ( t ) − y m µS ( t ) N ( t ) k m + S ( t )d N ( t )d t = νS ( t ) N ( t ) k n + S ( t ) − DN ( t )d M ( t )d t = µS ( t ) M ( t ) k m + S ( t ) − DN ( t ) (PM)The parameters are the following.• ν, µ : the IDR’s of normal and mutant cells,• y n , y m : the yield coefficients of normal and mutant cells (consumption of substrateby bacteria per unit of time), k n , k m : the half-saturation constants of normal and mutant cells,• D : the dilution,• S i : the inflowing substrate concentration.In our case, one can consider that S i = 0 for each single day. Besides being closer to theexperimental reality, the PM has another interesting feature. In the VM we consideredthat the evolutionary advantage of the mutants translated only into a faster division rate.It might also yield a higher consumption of nutrients, which would even accelerate theinvasion. A mathematical study of the PM is no more difficult than for the VM; but itwould require estimates for 7 parameters instead of 3. Nevertheless, we believe that thequalitative conclusions that were drawn from the VM would still remain true for the PM;they are summarized below.1. nonbeneficial mutations disappear over a few days, through the effect of daily dilu-tion2. beneficial mutations, if the strain survives dilution on the first day, have a highprobability to invade the full population, and eventually to eliminate normal cells,after at most a few tens of days.3. On a single day, two competing strains of bacteria increase at different rates, accord-ing to their relative fitnesses; their populations eventually stabilize to asymptoticvalues, the relative proportion of the fittest being strictly larger at the end of theday than at the beginning.We also wish to point out that the models above can be easily extended to more than 2competing species. As a common feature, the generalizations will obey the principle ofcompetitive exclusion (see section 3.5 of Murray [58]): in the long term, the fittest specieseliminates the others.The fixation along successive generations of beneficial mutations has been discussedin many references and textbooks. Papp et al. [63] recently wrote an interesting reviewon systems-biology approaches to genomic evolution. Lang et al. [48] propose a thoroughexperimental study in different strains of yeast. Campos & Wahl [12, 13] discuss theeffect of population bottlenecks. Hermisson & Pfaffelhuber [32] study genetic hitchhikingunder recurrent mutations. The probabilistic modelling of population growth started in 1925 with Yule [84]. Suchan early date is somewhat misleading. The burst of the theory took place in a few yearsaround 1950 with such important contributors as Bartlett [5, 6, 7], Harris [31, 10] andKendall [39, 40, 41]. More references and a perspective on the early historical developmentcan be found in the discussion following Armitage’s presentation to the Royal StatisticalSociety [3], the chairman being Bartlett. Kendall [41] gives another useful early review. Ofcourse the same material has since appeared in many textbooks, starting with Bartlett’s[8, 9] (see also Bharucha-Reid [11]). Since the 1950’s, stochastic population models haveremained a lively subject: see e.g.
Pakes [62] for a more recent review. Aldous [1] givesan interesting perpective relating Yule’s founding paper to present researches.The basic hypothesis of all models is that all bacteria behave independently in thesame manner; of course this should be understood in the stochastic sense: the times atwhich all bacteria divide are independent and identically distributed random variables.The common distribution function of these random times will be denoted by F . The twosimplest particular cases are:• deterministic: F ( t ) = I [ t , + ∞ ) ( t ) (the division time is fixed and equal to t ). Markovian: F ( t ) = (1 − e − νt ) I [0 , + ∞ ) ( t ) (the division time is exponentially dis-tributed with parameter ν ).These are the cases where an explicit mathematical treatment is easily feasible, and weshall focus on the second one. More general families of distributions have been considered,including Gamma distributions (see Kendall [39, 41] for a discussion and comparison withexperimental data).When a division occurs, the cell that divides is turned into a random number of newbacteria, each having a fresh division date. Let ϕ be the generating function of therandom number of “children” produced at division time: ϕ ( z ) = E [ z K ] = + ∞ X k =0 z k P [ K = k ] , where K is the random number of children at any division. Even though we shall beconcerned exclusively by the deterministic case K ≡ ϕ = z ), let us mention thatthe general case is the basis of the famous Galton-Watson model of branching processes.Notice also that K can be null, which corresponds to a death.Generating functions of integer valued random variables will be used extensively inwhat follows. Let us recall the following basic properties:1. If two random variables N and M are independent, then the generating function of N + M is the product of the generating functions of N and M .2. The generating function of N has a k -th left derivative at z = 1 if and only if E [ N k ]exists. In that case,lim z → − d E [ z N ]d z = E [ N ( N − · · · ( N − k + 1)] . The basic object of interest is the number of bacteria alive at time t , when one single cellis present at time 0; its generating function will be denoted by G ( z, t ). G ( z, t ) = + ∞ X n =0 z n P [ N t = n | N = 1] , where N t denotes the (random) number of bacteria living at time t , under the hypothesisthat N ≡
1. The function G is related to F and ϕ through the Bellman-Harris integralequation (see [10]). G ( z, t ) = (cid:18)Z t ϕ ( G ( z, t − s )) d F ( s ) (cid:19) + z (1 − F ( t )) . (BH)Even though we shall not provide a rigorous proof, giving an intuitive interpretation ofthat equation can still be useful for future developments. Let us argue as follows. Thedistribution function F ( t ) is the probability that the division of the initial cell has takenplace no later than t ; 1 − F ( t ) is the probability it has occurred after t , d F ( s ) is theprobability that it occurs between s and s + d s . If at time t the division has not occurred(probability (1 − F ( t ))) then the number of bacteria is still equal to 1, and its generatingfunction is z : this accounts for the second part of the right hand side, z (1 − F ( t )).Suppose now that a division has occurred at some time s between 0 and t (in [ s, s + d s ]with probability d F ( s )). Assume it leads to k new cells, each starting a new lineage.Since all lineages develop with identical rules, the population at time t , stemming fromone lineage which started at time s , will have generating function G ( z, t − s ). The totalnumber will be the sum of all k lineages, which are supposed to be independent: its enerating function will be the k -th power G k ( z, t − s ). Now since k cells are producedwith probability P [ K = k ], the generating function at time t becomes + ∞ X k =0 G k ( z, t − s ) P [ K = k ] = ϕ ( G ( z, t − s )) . This accounts for the first term in the (BH) equation.As a particular case, when F ( t ) = (1 − e νt ) I [0 , + ∞ ) ( t ) (exponential distribution, Marko-vian case), and K ≡ ϕ ( z ) = z ), the Bellman-Harris equation (BH) becomes: G ( z, t ) = (cid:18)Z t G ( z, t − s ) ν e − νs d s (cid:19) + z e − νt . Inside the integral, change t − s into u : G ( z, t ) = (cid:18) e − νt Z t G ( z, u ) ν e νu d u (cid:19) + z e − νt . or else: G ( z, t )e νt = (cid:18) Z t G ( z, u ) ν e νu d u (cid:19) + z . The derivative in time is: ∂G ( z, t ) ∂t e νt + G ( z, t )( ν e νt ) = G ( z, t )( ν e νt ) . Simplifying by e νt and rearranging terms, leads to the following differential equation. ∂G ( z, t ) ∂t = ν ( G ( z, t ) − G ( z, t )) . (DE)The solution to (DE) for G ( z,
0) = z is easily computed: G ( z, t ) = e − νt z − (1 − e − νt ) z . This is the generating function of the geometric distribution with parameter p = e − νt . Inother terms, for all n >
1, the probability that n bacteria are alive at time t is: P [ N t = n ] = e − νt (1 − e − νt ) n − . The first two moments of N t are: E [ N t ] = 1 p = e νt and Var[ N t ] = 1 − pp = e νt − e νt . The Markov process { N t , t > } is one of the simplest examples of a birth-and-deathprocess (actually a “pure birth” process). It is usually named after Yule (from [84]),who derived the geometric distribution of N t . Furry [24] introduced the same process inanother context. Novozhilov et al. [59] give a very simple presentation of birth-and-deathprocesses used in biology.In the LE, the number of bacteria varies from n = 5 × to n f = 5 × . Soit is legitimate to consider large number approximations. If n bacteria are initiallypresent, their lineages evolve independently, according to the distribution that has justbeen described: at each time t , the total population is distributed as the sum of n independent random variables, each following the geometric distribution with parameter − νt , that is a Pascal distribution with parameters n and e − νt . If n is large, it can beapproximated by a Gaussian (normal) distribution with same mean and variance, i.e. : E [ N t ] = n p = n e νt and Var[ N t ] = n − pp = n e νt − n e νt . Observe that for n large, the standard-deviation is small compared to the expectation: E [ N t ] = n p = n e νt and p Var[ N t ] ≃ p n e νt = E [ N t ] √ n . Therefore, as a first approximation, the population is expected to grow deterministically,as n e νt .The Yule process is the stochastic counterpart of the deterministic exponential growthmodel. Let us denote by N ( t ) and n ( t ) the respective numbers of cells in the stochasticmodel and in the deterministic one. The deterministic model is specified by an ordinarydifferential equation. d n ( t )d t = ν n ( t ) , or equivalently by the integral equation n ( t ) = n + Z t ν n ( s ) d s . The basis of the stochastic model is a random time scale, specified by a Poisson process.Let { Y ( t ) , t > } be a unit Poisson process: Y has jumps of size 1 at successive instants,separated by exponentially distributed durations with expectation 1. The stochasticprocess N ( t ) can be defined by the following integral equation, which is the stochasticcounterpart of the deterministic one. N ( t ) = n + Y (cid:18)Z t νN ( s ) d s (cid:19) . The distribution at time t of N ( t ) is the solution of a system of ordinary differential equa-tions, the Chapmann-Kolmogorov equations (also called the
Chemical Master Equation in the context of kinetics). Denoting by p n ( t ) the probability that N ( t ) = n , the equationfor n > p n ( t )d t = − νnp n ( t ) + ν ( n − p n − ( t ) . As we have seen, that system has an explicit solution. However, it is a very particularcase and no explicit solution can be hoped for in more general situations. Also, the factthat the expectation of N ( t ) is exactly equal to the deterministic solution n ( t ) is quiterare. What is general however, is the large number approximation that was explainedabove. We shall rewrite it in a way that can be easily generalized. Its foundation is thelong term approximation for the Poisson process Y . At (large) time n u , Y ( n u ) equals n u on average, with fluctuations of order √ n , described by a Brownian motion.lim n → + ∞ Y ( n u ) − n u √ n = W ( u ) , where Y is a unit Poisson process, W is the standard Brownian motion and the limit isunderstood in distribution. Consider now the integral equation defining N ( t ) and divideboth members by n . N ( t ) n = 1 + 1 n Y (cid:18)Z t νN ( s ) d s (cid:19) . enote by U ( t ) the ratio N ( t ) n . U ( t ) = 1 + 1 n Y (cid:18)Z t νn U ( s ) d s (cid:19) . Approximating Y ( n u ) by n u + √ n W ( u ): U ( t ) ≃ Z t νU ( s ) d s + 1 √ n W (cid:18)Z s νU ( s ) d s (cid:19) . The diffusion process U is the solution of a Stochastic Differential Equation (called the
Chemical Langevin Equation in kinetics)d U ( t ) = νU ( t ) d t + 1 √ n p νU ( t ) d W ( t ) . Of course for very large n , the diffusion term vanishes and the equation becomes thedeterministic differential equation (the Reaction Rate Equation in kinetics).What we have just seen on the example of the Yule process is an illustration of a verygeneral modelling principle. Three different scales can be considered.1.
Microcopic scale : stochastic jump process with discrete state space. The randomfluctuations are governed by Poisson processes;2.
Mesoscopic scale : stochastic diffusion process with continous state space. The ran-dom fluctuations are governed by Brownian motions;3.
Macroscopic scale : deterministic function of time, solution of a differential systemof equations.The three scales are coherent in the sense that each scale is a large number approximationof the previous one.In order to illustrate the three scales principle, we shall introduce here a microscopicmodel of competition between normal and mutant cells, matching the Volterra modelstudied in section 3. Let N ( t ) and M ( t ) still denote the numbers of normal and mutantcells. At each division, one of the two counts increases by one unit. When ( N ( t ) , M ( t )) =( n, m ), the respective rates of increase of normal and mutant cells are: ρ ( n, m ) = νn (cid:18) − n + mn f (cid:19) and σ ( n, m ) = µm (cid:18) − n + mn f (cid:19) . We shall call this model competitive process (CP). A loose description of the CP can begiven as follows.1. When n normal and m mutant cells are present, the next division will occur aftera random time, following the exponential distribution with parameter ρ ( n, m ) + σ ( n, m ).2. Upon next division,(a) with probability ρ ( n,m ) ρ ( n,m )+ σ ( n,m ) , N ( t ) will increase by 1, M ( t ) remaining un-changed.(b) with probability σ ( n,m ) ρ ( n,m )+ σ ( n,m ) , M ( t ) will increase by 1, N ( t ) remaining un-changed.The formal description uses two independent unit Poisson processes, Y and Y . N ( t ) = N (0) + Y (cid:18)Z t ρ ( N ( s ) , M ( s )) d s (cid:19) M ( t ) = M (0) + Y (cid:18)Z t σ ( N ( s ) , M ( s )) d s (cid:19) (CP) et p be the initial proportion of mutant molecules, so that N (0) = n (1 − p ) and M (0) = n p . Assuming that n is large, we reproduce the diffusion approximationscheme that was exposed above for the Yule process. Dividing both equations by n ,then setting U ( t ) = N ( t ) n and V ( t ) = M ( t ) n , one gets: U ( t ) = 1 − p + 1 n Y (cid:18)Z t νn U ( s ) (cid:18) − n n f ( U ( s ) + V ( s )) (cid:19) d s (cid:19) V ( t ) = p + 1 n Y (cid:18)Z t µn V ( s ) (cid:18) − n n f ( U ( s ) + V ( s ) (cid:19) d s (cid:19) Replace now Y ( n u ) and Y ( n v ) by: n u + √ n W ( u ) and n v + √ n W ( v ) , where W and W are two independent standard Brownian motions. The CM becomes: U ( t ) ≃ − p + Z t νU ( s ) (cid:18) − n n f ( U ( s ) + V ( s ) (cid:19) d s + 1 √ n W (cid:18)Z t U ( s ) (cid:18) − n n f ( U ( s ) + V ( s ) (cid:19) d s (cid:19) V ( t ) ≃ p + Z t µV ( s ) (cid:18) − n n f ( U ( s ) + V ( s ) (cid:19) d s + 1 √ n W (cid:18)Z t V ( s ) (cid:18) − n n f ( U ( s ) + V ( s ) (cid:19) d s (cid:19) Thus the stochastic process (
U, V ) is a bidimensional diffusion, solution to the followingstochastic differential equations. d U ( t ) = νU ( t ) (cid:18) − n n f ( U ( t ) + V ( t ) (cid:19) d t + 1 √ n s U ( t ) (cid:18) − n n f ( U ( t ) + V ( t )) (cid:19) d W ( t )d V ( t ) = µV ( t ) (cid:18) − n n f ( U ( t ) + V ( t ) (cid:19) d t + 1 √ n s µV ( t ) (cid:18) − n n f ( U ( t ) + V ( t )) (cid:19) d W ( t )This is the mesoscopic scale model. Of course, by neglecting the diffusion term in √ n ,one gets the deterministic (or macroscopic) Volterra model of section 3.How should one choose among the three modelling scales? Figure 5 shows a simulationof 10 trajectories of the CM, for n = 5 × and n f = 5 × , the relative fitness ofmutants being 1 .
6. On the same graphics, the trajectory of the deterministic Volterramodel has been superposed. As can be observed, the dispersion of random trajectoriesaround the deterministic ones is small. For the more realistic values of n = 5 × and n f = 5 × , the random trajectories would be almost undistinguishable from thedeterministic ones. However, this is only true for a relatively large value of p . If onlya few mutant cells were initially present, stochastic models would certainly give morereliable results than the deterministic one. n = 5 × and n f = 5 × .The initial proportion of mutant cells is 0 .
1. The IDR’s of normal and mutant cells are 1 and1 . Once again, our intention in proposing that very simple model, was to illustrate thepossible treatments rather than imposing it as the most realistic. More sophisticatedlogistic-type stochastic models were proposed long ago: see Novozhilov [59] and section6.8 p. 242 of Allen [2]. Logistic growth processes have been the object of several mathe-matical studies, in particular by Tan & Piantadosi [76], or more recently by Lambert [47].Refinements include for instance spatial random dispersal [78] or local regulation [19].
The history of mutation models is as long as that of stochastic growth processes, sinceestimates of mutation probabilities were already present in Yule’s founding paper [84]. owever, it really started with Luria and Delbrück experiments [51]. Interestingly enough,they used the same Escherichia Coli B as the LE. Their idea was to introduce a virus thatkilled most bacteria, except those having acquired resistance by mutation. This allowedthem to count mutant bacteria. Repeating the experiment, they estimated the distri-bution of the (random) number of mutants. The data seemed to indicate a distributionwith a much heavier tail than expected: most counts had very few mutant cells, but in asizeable proportion of the counts, they found a relative high number of mutants. In viewof our discussion in section 2, that feature is crucial for the survival of mutations throughdaily sampling in the LE. Indeed we have shown that mutations have good chances tosurvive daily dilution, only when they are carried by sufficiently many cells. Luria andDelbrück used a very simple deterministic model, assuming that cell counts doubled ateach multiple of a given fixed period. Nevertheless, that model allowed them to derive anasymptotic distribution of the number of mutants, that showed indeed a heavy tail behav-ior. Other models were later consider, in particular by Lea & Coulson [49], Harris [31],Bartlett [5, 6], Kendall [40, 41], and Armitage [3]. Mandelbrot [54] proposed a generalproof for the convergence to the Luria-Delbrück distribution. More recently, the impor-tance of that distribution for the treatment of mutation experimental data has stemmedfurther researches such as [69, 52, 38]. The Luria-Delbrück distribution is known onlythrough its generating function and no close form exists for its probabilities. However anefficient recursive formula permits to calculate them explicitly. Pakes [61] gives an easyderivation of that formula. An example of generalization is given by Dewanji et al. [16].Presently, the most active author on the subject is Zheng; he wrote a useful mathematicalreview in [85]. Historical reviews include Sarkar [69] and Zheng [88].The objective of this section is to present Bartlett’s derivation of the Luria-Delbrückdistribution under the following modelling assumptions (see [6], [3], p. 37, [9], section 4.31p. 124, and more recently Zheng [87]).1. At time 0 a homogeneous population of n “normal” cells is given ( n is large);2. normal cells divide at a constant rate ν ;3. when a division occurs:(a) with probability 1 − p two normal cells are produced,(b) with probability p one normal and one mutant cells are produced,(the mutation probability p is small);4. mutant cells divide at a constant rate µ ;5. any other mutation than normal to mutant is excluded;6. all random events (divisions and mutations) are mutually independent.Let G denote the bivariate generating function for the numbers of normal and mutantcells, starting with a single normal cell at time 0. G ( z, y, t ) = + ∞ X n =0 + ∞ X m =0 z n y m P [ N ( t ) = n , M ( t ) = m | N (0) = 1 , M (0) = 0] . If the population starts with a single mutant cell, only mutant cells can be produced later,and there is no need to consider a bivariate generating function. We shall denote by H the generating fonction of the number of mutant cells, starting with a single mutant cellat t = 0. H ( y, t ) = + ∞ X m =0 y m P [ M ( t ) = m | M (0) = 1] . Its calculation from the Bellman-Harris equation was already exposed in section 4. H ( y, t ) = y e − µt − y + y e − µt . he generating functions G and H are related through another Bellman-Harris equation. G ( z, y, t ) = (cid:18)Z t (cid:16) (1 − p ) G ( z, y, t − s )+ p G ( z, y, t − s ) H ( y, t − s ) (cid:17) ν e − νs d s (cid:17) + z e − νt . (BH2)The justification of (BH2) is quite similar to that of (BH), given in section 4. The initialnormal cell divides at a time which is exponentially distributed with parameter ν . At time t , it may not have divided yet (with probability e − νt ), and the generating function is still z . If it divides at some time s between 0 and t (in [ s, s + d s ] with probability ν e − νs d s ), itturns either into 2 normal cells with probability 1 − p , or into a normal and a mutant cellwith probability p . The two new cells start independent lineages of their own, accountedfor by G ( z, y, t − s ) if two normal cells are produced, and by G ( z, y, t − s ) H ( y, t − s ) ifone normal and one mutant are produced. As in section 4, (BH2) can be transformedinto an ordinary differential equation. ∂G ( z, y, t ) ∂t = ν ((1 − p ) G ( z, y, t ) + p G ( z, y, t ) H ( y, t ) − G ( z, y, t )) . (DE2)This is a linear first order equation in the inverse G − ( z, y, t ): ∂G − ( z, y, t ) ∂t = ν ( p −
1) + νG − ( z, y, t )(1 − p H ( y, t )) . Replacing H ( y, t ) by its explicit expression, the general solution to the homogeneousequation is found to be proportional to:e νt (1 − y + y e − µt ) p νµ . Using the initial value G ( z, y,
0) = z , G − ( z, y, t ) = e νt (1 − y + y e − µt ) p νµ (cid:18) z + ν ( p − Z t e − νs (1 − y + y e − µs ) − p νµ d s (cid:19) . When µ = ν , the primitive can be explicitly calculated. This is the only case consideredby Bartlett. Z t e − νs (1 − y + y e − νs ) − p d s = 1(1 − p ) νy (cid:0) − (1 − y + y e − νt ) − p (cid:1) . Hence Bartlett’s explicit expression: G − ( z, y, t ) = (1 − y + y e − νt ) p e − νt z + (1 − y + y e − νt ) y e − νt − (1 − y + y e − νt ) p y e − νt . This gives the generating function of both counts, starting with one normal cell. Thegenerating function of mutant cells alone is obtained by setting z = 1 in the expressionabove. The generating function of mutant cells, starting with n normal cells is the n power: G n (1 , y, t ) = (cid:18) (1 − y + y e − νt ) p e − νt + (1 − y + y e − νt ) y e − νt − (1 − y + y e − νt ) p y e − νt (cid:19) − n . We want an asymptotic value for this expression, as p tends to 0, n and t to + ∞ . Letus replace the two terms in (1 − y + y e − νt ) p by their order 1 expansion:(1 − y + y e − νt ) p = 1 + p log(1 − y + y e − νt ) + o ( p ) . emember also that t is large, so that:log(1 − y + y e − νt ) = log(1 − y ) + O (e − νt )One gets: G n (1 , y, t ) = (cid:18) p y − y e − νt log(1 − y ) + o ( p ) + pO (e − νt ) (cid:19) − n The non-trivial limit is obtained as as p tends to 0, n and t to + ∞ , in such a way thatlim n p e νt = α , where α is a positive real number. Then:lim G n (1 , y, t ) = exp (cid:18) α − yy log(1 − y ) (cid:19) = (1 − y ) α − yy . The
Luria-Delbrück distribution with parameter α is defined as the probability distribu-tion on integers with generating function: g α ( y ) = (1 − y ) α − yy . The function g α has no left derivative at y = 1: the Luria-Delbrück distribution has nomoment of any order. Denote by p n the corresponding probabilities: g α ( y ) = + ∞ X m =0 p m y m . The first value is obtained for y = 0: p = e − α . There is no explicit expression for p m asa function of m . An equivalent as m tends to infinity is p m ∼ αm . The exact values canbe numerically computed through the recursive formula: p m = αm m − X i =0 p i n − i + 1 . See Ma et al. [52], Pakes [61], and Kemp [38] for simple derivations of the main resultson the p n ’s. As an example, the table below gives some values for the probability thanmore than 50 mutant cells remain, for different values of α . α P [ X >
50] 0 .
021 0 .
045 0 .
072 0 .
102 0 .
136 0 .
172 0 .
213 0 .
256 0 .
302 0 . × . Let us take n = 6 × . For themutation probability, Philippe et al. [64] give p = 5 × − per base pair. This issensibly lower than the value given by Kendall [41], who simply says that the mutationprobability is lesser than 10 − . As we have seen in section 4, e νt is the expected numberof cells stemming from one initial cell. In the LE, the daily increase is 100-fold. So weshall retain e νt = 10 . The parameter for the Luria-Delbrück distribution is: α = n p e νt = 3 . For the Luria-Delbrück distribution with parameter 3, the probability to get no mutantcells is p = 0 .
05. The probability to get more than 100 mutant cells is 0 . he asymptotics of the number of mutants in the general case µ = ν is discussedby Jaeger and Sarkar [34]. To the best of our knowledge, no derivation from Bartlett’smodel with µ = ν has been made. However, we do not think it would change by much theconclusions. If µ > ν (beneficial mutation), mutant cells will multiply faster on average,(thus be more numerous) than predicted by the Luria-Delbrück distribution. In otherwords, the Luria-Delbrück distribution function is an upper bound for the distributionfunction of the number of mutant cells in the general case. The heavy tail property canonly be reinforced.The Bartlett model that was exposed here is not unique. Luria-Delbrück distributionscan be obtained through other models, including the earlier deterministic growth modelof Lea and Coulson [49] (see Zheng’s review [85]). It has been extended to other types ofnon-Markovian growth models by Dewandji et al. [16]. Gause [26], in the introduction to his section “On the mechanism of competition in yeastcells”, cites early 30’s publications while making quite clear statements on the relationbetween experimental data and mathematical models. We could not say it better.No mathematical theories can be accepted by biologists without a most carefulexperimental verification. We can but agree with the following remarks madein Nature (H. T. H. P. ’31) concerning the mathematical theory of the strugglefor existence developed by Vito Volterra: “This work is connected with Prof.Volterra’s researches on integro-differential equations and their applications tomechanics. In view of the simplifying hypothesis adopted, the results are notlikely to be accepted by biologists until they have been confirmed experimen-tally, but this work has as yet scarcely begun.” First of all, very reasonabledoubts may arise whether the equations of the struggle for existence given inthe preceding chapter express the essence of the processes of competition, orwhether they are merely empirical expressions. everybody remembers the at-tempt to study from a purely formalistic viewpoint the phenomena of heredityby calculating the likeness between ancestors and descendants. This methoddid not give the means of penetrating into the mechanism of the correspond-ing processes and was consequently entirely abandoned. In order to dissipatethese doubts and to show that the above-given equations actually express themechanism of competition, we shall now turn to an experimental analysis of acomparatively simple case. It has been possible to measure directly the factorsregulating the struggle for existence in this case, and thus to verify some ofthe mathematical theories.Generally speaking, biologists usually have to deal with empirical equations.The essence of such equations is admirably expressed in the following wordsof Raymond Pearl (’30): “The worker in practically any branch of science ismore or less frequently confronted with this sort of problem: he has a seriesof observations in which there is clear evidence of a certain orderliness, on theone hand, and evident fluctuations from that order, on the other hand. Whathe obviously wishes to do. . . is to emphasize the orderliness and minimize thefluctuations about it. . . He would like an expression, exact if possible, or, fail-ing that, approximate, of the law if there be one. This means a mathematicalexpression of the functional relation between the variables. . .“It should be made clear at the start that there is, unfortunately, no methodsknown to mathematics which will tell anyone in advance of the trial whatis either the correct or even the best mathematical function with which tograduate a particular set of data. The choice of the proper mathematical unction is essentially, at its very best, only a combination of good judgmentand good luck. In this realm, as in every other, good judgment depends inthe main only upon extensive experience. What we call good luck in thissort of connection has also about the same basis. The experienced person inthis branch of applied mathematics knows at a glance what general class ofmathematical expression will take a course, when plotted, on the whole likethat followed by the observations. He furthermore knows that by putting asmany constants into his equation as there are observations in the data he canmake his curve hit all the observed points exactly, but in so doing will havedefeated the very purpose with which he started, which was to emphasize thelaw (if any) and minimize the fluctuations, because actually if he does what hasbeen described he emphasizes the fluctuations and probably loses completelyany chance of discovering a law.“Of mathematical functions involving a small number of constants there arebut relatively few. . . In short, we live in a world which appears to be organizedin accordance with relatively few and relatively simple mathematical functions.Which of these one will choose in starting off to fit empirically a group ofobservations depends fundamentally, as has been said, only on good judgmentand experience. There is no higher guide” (pp. 407-408).The mathematical models that have been presented so far have no predictive value, untilthey have been confronted to experimental data. The parameters should be estimated,the data have to be adjusted, and the goodness-of-fit must be tested. The adjustmentof population models to bacteria growth experiments has been the object of countlesspublications, of which we have retained a few among the most recent.Miao et al. [56] give an interesting review of parameter estimation methods for de-terministic models. The general methods for adjusting models are presented by Jaqamanand Danuser in [35]. Gutenkunst et al. [29] argue on sloppy parameter sensitivity insystems biology models. All models have at least in common growth rates and mutationprobabilities, as parameters to be estimated from the data. Although most referencesfocus on mutation rates, the estimation of growth rates was recently studied by Maruvka et al. [55]. Mutation rates can be estimated by adjusting observed number of mutants inbacterial growth experiments. The method is usually referred to as fluctuation analysis .It has been presented in many references, including Koch [44], Sarkar et al. [70], Stewart et al. [75], Jones et al. [37]. Dedicated softwares were made available by Zheng [86] andHall et al. [30]. Several refinements in the use of the Luria-Delbrück probabilities havebeen proposed, in particular by Kepler & Oprea [42, 60].The estimation of mutation rates was studied as early as 1974 by Crump & Hoel [15].The quality of mutation rates estimates is discussed by Stewart [74]. Different estimationmethods are presented by Foster [23]. Improvements have been proposed by Koziol [46],and by Gerrish [27]. Grant et al. [28] made a case study on Mycobacterium tubercolosis . The few classics of mathematical modelling that have been reviewed here are hoped tobe of some potential use in the description of different aspects of the Lenski experiment.As already said, none of them has any scientific validity as long as they have not beenconfronted to real data. However, the models share some common qualitative featuresthat are summarized here for comparison with actual observations. Instead of categorizingthe models according to mathematical coherence as we did before, we will summarize theirpredictions on four different questions about the LE.1.
Day 0: Which number of cells can carry a given mutation that was not present the revious day? All existing models use asymptotics as the initial number of cells is large and theprobability of mutations is low. The random number of new mutants follows aheavy-tailed distribution of which the Luria-Delbrück distribution is a prototype.This means that with a reasonable probability, sizeable amounts of mutants can bepresent at the end of any given day.2.
Night 0: What are the chances that a mutation having appeared on day 0 disappearsthrough dilution?
Since the daily sampling eliminates each day 99% of the cells, mutants presentin small quantities have high chances to disappear. In particular, non-beneficialmutations do not survive. Beneficial mutations carried by enough cells (at least afew tens), have good chances to be represented the next days.3.
Day 1: If two different strains of cells are present at the beginning of a given day,what will their proportion be at the end of the day?
If a beneficial mutation is carried by a certain proportion of the population, even aweak one, that proportion will gradually increase along the day with the multipli-cation of cells. So with high probability, the initial proportion will be larger on thenext day.4.
Day N: How many days will it take before a beneficial mutation is carried by allcells in the population?
The initial proportion of day N + 1 is an increasing function of that of the previousday. The better the fitness of mutants, the steeper that function. When the initialproportion of mutants is already strong, there is a high probability that the lessfit cells will be wiped out by the next daily sampling. The invasion of the fullpopulation will take from a few days to a month. References [1] D.J. Aldous. Stochastic models add descriptive statistics for phylogenetic trees, fromYule to today.
Statist. Sci. , 16(1):23–34, 2001.[2] L.J.S. Allen.
Stochastic processes with applications to biology . Pearson – PrenticeHall, New-Jersey, 2003.[3] P. Armitage. The statistical theory of bacterial populations subject to mutation.
J.R. Statist. Soc. B , 14:1–40, 1952.[4] J.E. Barrick, D.S. Yu, S.H. Yoon, H. Jeong, T.K. Oh, D. Schneider, R.E. Lenski,and J.F. Kim. Genome evolution and adaptation in a long-term experiment with
Escherichia Coli . Evolution , 461(29):1243–1249, 2009.[5] M.S. Bartlett. Some evolutionary stochastic processes.
J. R. Statist. Soc. B ,11(2):211–229, 1949.[6] M.S. Bartlett. The dual recurrence relation for multiplicative processes.
Math. Proc.Camb. Phil. Soc. , 47(4):821–825, 1951.[7] M.S. Bartlett. Processus stochastiques ponctuels.
Ann. IHP , 14(1):35–60, 1954.[8] M.S. Bartlett.
Stochastic population models in ecology and epidemiology . Methuen,London, 1960.[9] M.S. Bartlett.
An introduction to stochastic processes, with special reference to meth-ods and applications . Cambridge University Press, 1966.[10] R. Bellman and T. Harris. On age-dependent binary branching processes.
Ann.Math. , 55(2):280–295, 1952.
11] A.T. Bharucha-Reid.
Elements of the theory of Markov processes and their Applica-tions . McGraw-Hill, London, 1960.[12] P.R.A. Campos and L.M. Wahl. The effects of population bottlenecks on clonalinterference, and the adaptation effective population size.
Evolution , 63(4):950–958,2009.[13] P.R.A. Campos and L.M. Wahl. The adaptation rate of asexuals: deleterious mu-tations, clonal interference and population bottlenecks.
Evolution , 64(7):1773–1783,2010.[14] Consortium Scilab.
Scilab: Le logiciel libre de calcul numérique . Consortium Scilab,Digiteo, Paris, France, 2011.[15] K.S. Crump and D.G. Hoel. Mathematical models for estimating mutation rates incell populations.
Biometrika , 61(2):237–252, 1974.[16] A. Dewanji, E.G. Luebeck, and S.H. Moolgavkar. A generalized Luria-Delbrückmodel.
Math. Biosci. , 197:140–152, 2005.[17] R. Durrett.
Probability models for DNA sequence evolution . Springer, New-York,2008.[18] E. Çinlar.
Introduction to stochastic processes . Prentice Hall, New York, 1975.[19] A.M. Etheridge. Survival and extinction in a locally regulated population.
Ann.Appl. Probab. , 14(1):188–214, 2004.[20] S.N. Ethier and T.G. Kurtz.
Markov processes: characterization and convergence .Wiley series in Probability and Statistics. Wiley, New-York, 2005.[21] W.J. Ewens.
Mathematical population genetics: theoretical introduction . Springer,New York, 2004.[22] W. Feller.
An introduction to probability theory and its applications , volume II.Wiley, London, 1971.[23] P.L. Foster. Methods for determining spontaneous mutation rates.
Methods Enzy-mol. , 409:195–213, 2006.[24] W.H. Furry. On fluctuation phenomena in the passage of high energy electronsthrough lead.
Phys. Rev. , 52:569–581, 1937.[25] C.W. Gardiner.
Handbook of stochastic methods for physics, chemistry and the nat-ural sciences . Springer, Berlin, 2004.[26] G.F. Gause.
The struggle for existence . Williams & Wilkins, Baltimore, 1934.[27] P.J. Gerrish. A simple formula for obtaining markedly improved mutation ratesestimates.
Genetics , 180:1773–1778, 2008.[28] A. Grant, C. Arnold, N. Thorne, S. Gharbia, and A. Underwood. Mathematicalmodelling of
Mycobacterium tuberculosis
VNTR loci estimates a very slow mutationrate for the repeats.
J. Mol. Evol. , 66:565–574, 2008.[29] R.N. Gutenkunst, J.J. Waterfall, F.P. Casey, K.S. Brown, C.R. Myers, and J.P.Sethna. Universally sloppy parameter sensitivities in systems biology models.
PLoSComput. Biol. , 3:1871–1878, 2007.[30] B.M. Hall, C. Ma, P. Liang, and K.K. Singh. Fluctuation Analysis CalculatOR(FALCOR): a web tool for the determination of mutation rate using Luria-Delbrückfluctuation analysis.
Bioinformatics , 25(12):1564–1565, 2009.[31] T.E. Harris. Some mathematical models for branching processes.
2d Berk. Symp. ,pages 305–328, 1951.[32] J. Hermisson and P. Pfaffelhuber. The pattern of genetic hitchhiking under recurrentmutation.
Elec. J. Probab. , 13(68):2069–2106, 2008.
33] S.B. Hsu, S. Hubbell, and P. Waltman. A mathematical theory for single-nutrientcompetition in continuous cultures of micro-organisms.
SIAM J. Applied. Math. ,32(2):366–383, 1977.[34] G. Jaeger and S. Sarkar. On the distribution of bacterial mutants: the effects ofdifferential fitness of mutants and non-mutants.
Genetica , 96:217–223, 1995.[35] K. Jaqaman and G. Danuser. Linking data to models: data regression.
Nat. Rev.Mol. Cell Biol. , 7:1–10, Nov 2006.[36] K.A. Johnson and R.S. Goody. The original Michaelis constant: translation of the1913 Michaelis-Menten paper.
Biochemistry , 50(4):8264–8269, 2011.[37] M.E. Jones, S.M. Thomas, and A. Rogers. Luria-Delbrück fluctuation experiments:design and analysis.
Genetics , 136:1209–1216, 1994.[38] A.W. Kemp. Comments on the Luria-Delbrück distribution.
J. Appl. Probab. ,31(3):822–828, 1994.[39] D.G. Kendall. On the role of variable generation time in the development of astochastic birth process.
Biometrika , 35(3-4):316–330, 1948.[40] D.G. Kendall. Stochastic processes and population growth.
J. R. Statist. Soc. B ,11(2):230–282, 1949.[41] D.G. Kendall. Les processus stochastiques de croissance en biologie.
Ann. IHP ,13(1):43–108, 1952.[42] T.B. Kepler and M. Oprea. Improved inference of mutation rates: I: an integralrepresentation for the Luria-Delbrück distribution.
Theor. Pop. Biol. , 59(1):41–48,2001.[43] A.I. Khan, D.M. Dinh, D. Schneider, R.E. Lenski, and T.F. Cooper. Negative epis-tasis between beneficial mutations in an evolving bacterial population.
Science ,332(6034):1193–1196, 2011.[44] A.L. Koch. Mutation and growth rates from Luria-Delbrück fluctuation tests.
Mutat.Res. , 95(2-3):129–143, 1982.[45] M. Kot.
Elements of mathematical ecology . Cambridge University Press, 2001.[46] J.A. Koziol. A note on efficient estimation of mutation rates using Luria-Delbrückfluctuation analysis.
Mutat. Res. , 249:275–280, 1991.[47] A. Lambert. The branching process with logistic growth.
Ann. Appl. Probab. ,15(2):1506–1535, 2005.[48] G.I. Lang, D. Botstein, and M.M. Desai. Genetic variation and the fate of beneficialmutations in asexual populations.
Genetics , 188(3):647–661, 2011.[49] D.E. Lea and C.A. Coulson. The distribution of the number of mutants in bacterialpopulations.
J. Genetics , 49:264–285, 1949.[50] A.J. Lotka.
Elements of physical biology . Williams & Wilkins, Baltimore, 1925.[51] D.E. Luria and M. Delbrück. Mutations of bacteria from virus sensitivity to virusresistance.
Genetics , 28:491–511, 1943.[52] W.T. Ma, G.v.H. Sandri, and S. Sarkar. Analysis of the Luria-Delbrück distributionusing discrete convolution powers.
J. Appl. Probab. , 29(2):255–267, 1992.[53] T.R. Malthus.
An essay on the principle of population . Johnson, London, 1798.[54] B. Mandelbrot. A population birth-and-mutation process, I: explicit distributions forthe number of mutants in an old culture of bacteria.
J. Appl. Probab. , 11(3):437–444,1974.[55] Y.E. Maruvka, N.M. Shnerb, Y. Bar-Yam, and J. Wakeley. Recovering populationparameters from a single gene genealogy: an unbiased estimator of the growth rate.
Mol. Biol. Evol. , 28(5):1617–1631, 2011.
56] H. Miao, X. Xia, A.S. Perelson, and H. Wu. On identifiability of nonlinear ODEmodels and applications in viral dynamics.
SIAM Rev. , 53(1):3–39, 2011.[57] P.A.P. Moran.
The statistical processes of evolutionary theory . Clarendon Press,Oxford, 1962.[58] J.D. Murray.
Mathematical biology . Springer, New-York, 1989.[59] A.S. Novozhilov, G.P. Karev, and E.V. Koonin. Biological applications of the theoryof birth-and-death processes.
Briefings Bioinfo. , 7(1):70–85, 2005.[60] M. Oprea and T.B. Kepler. Improved inference of mutation rates II: generalizationof the Luria-Delbrück distribution for realistic cell-cycle time distributions.
Theor.Pop. Biol. , 59(1):49–59, 2001.[61] A.G. Pakes. Remarks on the Luria-Delbrück distribution.
J. Appl. Probab. ,30(4):991–994, 1993.[62] A.G. Pakes. Biological applications of branching processes. In D.N. Shanbhag andC.R. Rao, editors,
Stochastic Processes: Modelling and Simulation , volume 21 of
Handbook of Statistics , pages 693 – 773. Elsevier, 2003.[63] B. Papp, A.A. Notebaart, and C. Pál. Systems biology approaches for predictinggenomic evolution.
Nature Rev. Genetics , 12:591–602, 2011.[64] N. Philippe, E. Crozat, R.E. Lenski, and D. Schneider. Evolution of global regulatorynetwork during a long-term experiment with
Escherichia Coli . BioEssays , 29(9):846–860, 2007.[65] E.O. Powell. Criteria for the growth of contaminants and mutants in continuousculture.
J. Gen. Microbiol. , 18:259–268, 1958.[66] A. Quetelet.
Essai de physique sociale, tome premier . Hauman, Bruxelles, 1836.[67] R Development Core Team.
R: A Language and Environment for Statistical Com-puting . R Foundation for Statistical Computing, Vienna, Austria, 2008. ISBN 3-900051-07-0.[68] S. Ross.
Introduction to probability models . Academic Press, 10 th edition, 2010.[69] S. Sarkar. Haldane’s solution of the Luria-Delbrück distribution. Genetics , 127:257–261, 1991.[70] S. Sarkar, W.T. Ma, and G.v.H. Sandri. On fluctuation analysis: a new, simpleand efficient method for computing the expected number of mutants.
Genetica ,85:173–179, 1992.[71] D. Schneider, E. Duperchy, E. Coursange, R.E. Lenski, and M. Blot. Long-termexperimental evolution in
Escherichia Coli
IX. characterization of insertion sequence-mediated mutations and rearrangements.
Genetics , 156(2):477–488, 2000.[72] D. Schneider, E. Duperchy, J. Dupeyrot, E. Coursange, R.E. Lenski, and M. Blot.Genomic comparisons among
Escherichia Coli strains b, k12, and O157:H7 using ISelements as molecular markers.
BMC Microbiol. , 2:18, 2002.[73] M.T. Stanek, T.F. Cooper, and R.E. Lenski. Identification and dynamics of a ben-eficial mutation in a long term evolution experiment with
Escherichia Coli . BMCEvol. Biol. , 9:302, 2009.[74] F.M. Stewart. Fluctuation tests: how reliable are the estimates of mutation rates?
Genetics , 137:1139–1146, 1994.[75] F.M. Stewart, D.M. Gordon, and B.R. Levin. Fluctuation analysis: the probabilitydistribution of the number of mutants under different conditions.
Genetics , 124:175–185, 1990.[76] W.Y. Tan and S. Piantadosi. On stochastic growth processes with application tostochastic logistic growth.
Statist. Sinica , 1:527–540, 1991.
77] H.C. Tuckwell.
Elementary applications of probability theory . Chapman & Hall/CRC,Boca Raton, FL, 1995.[78] H.C. Tuckwell and J.A. Koziol. Logistic population growth under random dispersal.
Bull. Math. Biology , 49(4):495–506, 1987.[79] P.F. Verhulst. Notice sur la loi que la population suit dans son accroissement. InJ.G. Garnier and A. Quetelet, editors,
Correspondance mathématique et physique ,volume 10, pages 113–121. Société Belge de Librairie, Bruxelles, 1838.[80] V. Volterra. Fluctuations in the abundance of a species considered mathematically.
Nature , 118:558–560, 1926.[81] V. Volterra.
Leçons sur la théorie mathématique de la lutte pour la vie . Gauthier-Villars, Paris, 1931.[82] L.M. Wahl and P.J. Gerrish. The probability that beneficial mutations are lost inpopulations with periodic bottlenecks.
Evolution , 55:2606–2610, 2001.[83] D.J. Wilkinson.
Stochastic modelling for systems biology . Chapman & Hall/CRC,Boca Raton, FL, 2006.[84] G.U. Yule. A mathematical theory of evolution, based on the conclusions of Dr. J.C.Willis, F.R.S.
Phil. Trans. Roy. Soc. London Ser. B , 213:21–87, 1925.[85] Q. Zheng. Progress of a half century in the study of the Luria-Delbrück distribution.
Math. Biosc. , 162:1–32, 1999.[86] Q. Zheng. Statistical and algorithmic methods for fluctuation analysis with SAL-VADOR as an implementation.
Math. Biosci. , 176:237–252, 2002.[87] Q. Zheng. On Bartlett’s formulation of the Luria-Delbrück mutation model.
Math.Biosci. , 215:48–54, 2008.[88] Q. Zheng. The Luria-Delbrück distribution: early statistical thinking about evolu-tion.
Chance , 23(2):15–18, 2010., 23(2):15–18, 2010.