[PDF] How often does the ratchet click? Facts, heuristics, asymptotics

Abstract

The evolutionary force of recombination is lacking in asexually reproducing populations. As a consequence, the population can suffer an irreversible accumulation of deleterious mutations, a phenomenon known as Muller's ratchet. We formulate discrete and continuous time versions of Muller's ratchet. Inspired by Haigh's (1978) analysis of a dynamical system which arises in the limit of large populations, we identify the parameter gamma = N*lambda/(Ns*log(N*lambda)) as most important for the speed of accumulation of deleterious mutations. Here N is population size, s is the selection coefficient and lambda is the deleterious mutation rate. For large parts of the parameter range, measuring time in units of size N, deleterious mutations accumulate according to a power law in N*lambda with exponent gamma if gamma>0.5. For gamma<0.5 mutations cannot accumulate. We obtain diffusion approximations for three different parameter regimes, depending on the speed of the ratchet. Our approximations shed new light on analyses of Stephan et al. (1993) and Gordo & Charlesworth (2000). The heuristics leading to the approximations are supported by simulations.

Full PDF

aa r X i v : . [ m a t h . P R ] S e p How often does the ratchet click?Facts, heuristics, asymptotics. by A. Etheridge ∗ , P. Pfaffelhuber † and A. Wakolbinger University of Oxford, Ludwig-Maximilians University Munichand Goethe-University Frankfurt

17. July 2007

Abstract

The evolutionary force of recombination is lacking in asexually repro-ducing populations. As a consequence, the population can suﬀer an ir-reversible accumulation of deleterious mutations, a phenomenon knownas

Muller’s ratchet . We formulate discrete and continuous time versionsof Muller’s ratchet. Inspired by Haigh’s (1978) analysis of a dynami-cal system which arises in the limit of large populations, we identify theparameter γ = Nλ/ ( Ns · log( Nλ )) as most important for the speed ofaccumulation of deleterious mutations. Here N is population size, s is theselection coeﬃcient and λ is the deleterious mutation rate. For large partsof the parameter range, measuring time in units of size N , deleterious mu-tations accumulate according to a power law in Nλ with exponent γ if γ ≥ .

5. For γ < . Muller’s ratchet is a mechanism that has been suggested as an explanation forthe evolution of sex (Maynard Smith 1978). The idea is simple; in an asexuallyreproducing population chromosomes are passed down as indivisible blocks andso the number of deleterious mutations accumulated along any ancestral line inthe population can only increase. When everyone in the current ‘best’ class hasaccumulated at least one additional bad mutation then the minimum mutationalload in the population increases: the ratchet clicks. In a sexually reproducing ∗ Research supported in part by EPSRC GR/T19537/01. † Travel support from DFG, Bilateral Research Group FOR 498.

AMS 2000 subject classiﬁcation. (Primary) (Sec-ondary).

Keywords and phrases.

Muller’s ratchet, selection, mutation, Fleming-Viot process,Wright-Fisher model, diﬀusion approximation INTRODUCTION fewer deleterious mutations. The high cost of sexual reproduction is thus oﬀset bythe beneﬁts of inhibiting the ratchet. Equally the ratchet provides a possibleexplanation for the degeneration of Y chromosomes in sexual organisms (e.g.Charlesworth 1978, 1996; Rice 1994; Charlesworth & Charlesworth 1997, 1998).However, in order to assess its real biological importance one should establishunder what circumstances Muller’s ratchet will have an evolutionary eﬀect. Inparticular, how many generations will it take for an asexually reproducing pop-ulation to lose its current best class? In other words, what is the rate of theratchet?In spite of the substantial literature devoted to the ratchet (see Loewe 2006for an extensive bibliography), even in the simplest mathematical models aclosed form expression for the rate remains elusive. Instead various approxima-tions have been proposed which ﬁt well with simulations for particular parameterregimes. The analysis presented here uniﬁes these approximations into a singleframework and provides a more detailed mathematical understanding of theirregions of validity.The simplest mathematical model of the ratchet was formulated in the pi-oneering work of Haigh (1978). Consider an asexual population of constantsize N . The population evolves according to classical Wright-Fisher dynamics.Thus each of the N individuals in the ( t + 1)st generation independently choosesa parent from the individuals in the t th generation. The probability that anindividual which has accumulated k mutations is selected is proportional to its relative ﬁtness, (1 − s ) k . The number of mutations carried by the oﬀspring isthen k + J where J is an (independent) Poisson random variable with mean λ .Haigh identiﬁes n = N e − λ/s as an approximation (at large times) for thetotal number of individuals carrying the least number of mutations and ﬁndsnumerical evidence of a linear relationship between n and the average timebetween clicks of the ratchet, at least for ‘intermediate’ values of n (which hequantiﬁes as n > n Stephan et al. (1993) note the increasing importance of s for the rateof the ratchet. The simulations of Gordo & Charlesworth (2000) also suggestthat for n ﬁxed at a large value the ratchet can run at very diﬀerent rates.They focus on parameter ranges that may be the most relevant to the problemof the degeneration of large non-recombining portions of chromosomes.In our approach we use diﬀusion approximations to identify another param-eter as being an important factor in determining the rate of the ratchet. Wedeﬁne γ := N λN s log(

N λ ) . (1.1)Notice that n = N ( N λ ) − γ . In these parameters one can reinterpret Haigh’sempirical results as saying that if we measure time in units of size N , then therate of the ratchet follows a power law in N λ . In fact our main observation inthis note is that for a substantial portion of parameter space (which we shall

THE DISCRETE RATCHET – HAIGH’S APPROACH

Rule of Thumb.

The rate of the ratchet is of the order N γ − λ γ for γ ∈ (1 / , ,whereas it is exponentially slow in ( N λ ) − γ for γ < / . There are two novelties here. First, the abrupt change in behaviour at γ = 1 / γ ∈ (1 / , § γ > N λ we have n < N λ is too large then we see the transition from exponentiallyrare clicks to frequent clicks takes place at larger values of γ .The rest of this note is laid out as follows. In § § s and λ are small. We then pass in § § N generations (but always rarelyper generation). Performing a scaling of this diﬀusion allows us to predict therelationship between the parameters N λ and

N s of the biological model andthe value of γ at which we can expect to see the phase transition from rareclicking to power law behaviour of the rate of the ratchet. In § § The population dynamics described in the introduction can be reformulatedmathematically as follows. Let N be a ﬁxed natural number (the populationsize), λ > s ∈ (0 ,

1) (the selection coeﬃcient).The population is described by a stochastic process taking values in P ( N ), thesimplex of probability weights on N . Suppose that x ( t ) = ( x k ( t )) k =0 , ,... ∈P ( N ) is the vector of type frequencies (or frequency proﬁle ) in the t th gen-eration (so for example N x k ( t ) individuals in the population carry exactly k mutations). Let H be an N -valued random variable with P [ H = k ] propor-tional to (1 − s ) k x k ( t ), let J be a Poisson( λ )-random variable independent of THE DISCRETE RATCHET – HAIGH’S APPROACH g a v e r a g e nu m b e r o f c li c k s i n N g e n e r a ti on s N=10 l =10 - (B) N a v e r a g e nu m b e r o f c li c k s i n N g e n e r a ti on s g = 0.55 g = 0.7 g = 0.9 g = 1.5 l = 10 - Figure 1: (A) We plot the rate of the ratchet against γ , where time is measuredin units of N generations. As predicted by our rule of thumb we see a sharpchange of behaviour around γ = 0 .

5. (B) We see the power-law behaviour ofthe rate for various values of γ . The solid lines are given by simulation of aWright-Fisher model. The dashed lines ﬁt the prediction that the time betweenclicks is a constant times N ( N λ ) − γ for γ > .

5. Note that this breaks down for γ > THE DISCRETE RATCHET – HAIGH’S APPROACH H and let K , ..., K N be independent copies of H + J . Then the random typefrequencies in the next generation are X k ( t + 1) = 1 N { i : K i = k } . (2.1)We shall refer to this as the ratchet dynamics in discrete time .First consider what happens as N → ∞ . By the law of large numbers, (2.1)results in the deterministic dynamics x ( t + 1) := E x ( t ) [ X ( t + 1)] . (2.2)An important property of the dynamics (2.2) is that vectors of Poisson weightsare mapped to vectors of Poisson weights. To see this, note that when N → ∞ the right hand side of (2.2) is just the law of the random variable H + J . If x ( t ) =Poisson( α ), the Poisson distribution with mean α , then H is Poisson ( α (1 − s ))-distributed and consequently x ( t + 1) is the law of a Poisson( α (1 − s ) + λ )random variable. We shall see in § x > π := Poisson( θ )as t → ∞ , where θ := λs . Haigh’s analysis of the ﬁnite population model focusses on the number ofindividuals in the best class. Let us write k ∗ for the number of mutations carriedby individuals in this class. In a ﬁnite population, k ∗ will increase with time,but the proﬁle of frequencies relative to k ∗ , { X k + k ∗ } , forms a recurrent Markovchain. We set Y := ( Y k ) k =0 , ,.. := ( X k ∗ + k ) k =0 , ,.. and observe that since ﬁtness is always measured relative to the mean ﬁtnessin the population, between clicks of the ratchet the equation for the dynamicsof Y is precisely the same as that for X when X >

0. Suppose that after t generations there are n ( t ) = N y ( t ) individuals in the best class. Then theprobability of sampling a parent from this class and not acquiring any additionalmutations is p ( t ) := ( y ( t ) /W ( t )) e − λ , where W ( t ) = ∞ X i =0 y i ( t )(1 − s ) i . (2.3)Thus, given y ( t ), the size of the best class in the next generation has a binomialdistribution with N trials and success probability p ( t ), and so the evolution of THE FLEMING-VIOT DIFFUSION W ( t ), the mean ﬁtness of the population. Weshall see this property of Y reﬂected in the diﬀusion approximation of § π := 11 − π ( π , π , . . . ) = 11 − e − θ (cid:18) θe − θ , θ e − θ , ... (cid:19) . (2.4)The time until the next click is then subdivided into two phases. During the ﬁrstphase the deterministic dynamical system decays exponentially fast towards itsPoisson equilibrium, swamping the randomness arising from the ﬁnite popula-tion size. At the time that he proposes for the end of the ﬁrst phase the size ofthe best class is approximately 1 . π . The mean ﬁtness of the population hasdecreased by an amount which can also be readily estimated and, combiningthis with a Poisson approximation to the binomial distribution, Haigh proposesthat (at least initially) during the second (longer) phase the size of the best classshould be approximated by a Galton-Watson branching process with a Poissonoﬀspring distribution.Haigh’s original proposal was that since the mean ﬁtness of the population(and consequently the mean number of oﬀspring in the Galton Watson process)changes only slowly during the second phase it should be taken to be constantthroughout that phase. Later reﬁnements have modiﬁed Haigh’s approach intwo key ways. First they have worked with a diﬀusion approximation so thatthe Galton-Watson process is replaced by a Feller diﬀusion and second, insteadof taking a constant drift, they look for a good approximation of the meanﬁtness given the size of the best class , resulting in a Feller diﬀusion with logistic growth. Our aim in the rest of this paper is to unify these approximations in asingle mathematical framework and discuss them in the light of simulations. Acrucial building block will be the following extension of (2.4), which we call the Poisson proﬁle approximation (or PPA) of Y based on Y :Π( Y ) := (cid:16) Y , − Y − π ( π , π , . . . ) (cid:17) . (2.5)As a ﬁrst step, we now turn to a diﬀusion approximation for the full ratchetdynamics (2.1). For large N and small λ and s , the following stochastic dynamics on P ( N )in continuous time captures the conditional expectation and variance of the THE FLEMING-VIOT DIFFUSION dX k = X j s ( j − k ) X j X k + λ ( X k − − X k )  dt + X j = k r N X j X k dW jk , (3.1) k = 0 , , , . . . , where X − := 0, and ( W jk ) j>k is an array of independent standard Wienerprocesses with W kj := − W jk . This is, of course, just the inﬁnite-dimensionalversion of the standard multi-dimensional Wright-Fisher diﬀusion. Existence ofa process solving (3.1) can be established using a diﬀusion limit of the discretedynamics of §

2. The coeﬃcient s ( j − k ) is the ﬁtness diﬀerence between type k and type j , λ ( X k − − X k ) is the ﬂow into and out of class k due to mutationand the diﬀusion coeﬃcients N X j X k reﬂect the covariances due to multinomialsampling. Remark 3.1.

Often when one passes to a Fleming-Viot diﬀusion approxima-tion, one measures time in units of size N and correspondingly the parameters s and λ appear as N s and

N λ . Here we have not rescaled time, hence the factorof N in the noise and the unscaled parameters s and λ in the equations. ✷ Writing M ( X ) := X j jX j for the ﬁrst moment of X , (3.1) translates into dX k = (cid:16) s ( M ( X ) − k ) − λ ) X k + λX k − (cid:17) dt + X j = k r N X j X k dW jk (3.2)In exactly the same way as in our discrete stochastic system, writing k ∗ for thenumber of mutations carried by individuals in the ﬁttest class, one would liketo think of the population as a travelling wave with proﬁle Y k = X k + k ∗ . Noticein particular that dY = ( sM ( Y ) − λ ) Y dt + r N Y (1 − Y ) dW , where W is a standard Wiener process. Thus, just as in Haigh’s setting, thefrequency of the best class is determined by the mean ﬁtness of the population.Substituting into (3.1) one can obtain a stochastic equation for M , dM = ( λ − sM ) dt + dG where M = P j ( j − M ) X j and the martingale G has quadratic variation d h G i = 1 N M dt. THE INFINITE POPULATION LIMIT dM = ( − N M + ( λ − sM )) dt + dH where M = P j ( j − M ) X j and the martingale H has quadratic variation d h H i = N ( M − M ) dt with M denoting the fourth centred moment and soon. These equations for the centred moments are entirely analogous to thoseobtained by Higgs & Woodcock (1995) except that in the Fleming-Viot settingthey are exact. As pointed out there, B¨urger (1991) obtained similar equationsto study the evolution of polygenic traits. The diﬃculty in using these equationsto study the rate of the ratchet is of course that they are not closed: the equationfor M k involves M k +1 and so on. Moreover, there is no obvious approximatingclosed system. By contrast, the inﬁnite population limit, in which the noise isabsent, turns out to have a closed solution. The continuous time analogue of (2.2) is the deterministic dynamical system dx k = (cid:16) ( s ( M ( x ) − k ) − λ ) x k + λx k − (cid:17) dt, k = 0 , , , . . . (4.1)where x − = 0, obtained by letting N → ∞ in our Fleming-Viot diﬀusion (3.2).Our goal in this section is to solve this system of equations. Note that Maiaet al. (2003) have obtained a complete solution of the corresponding discretesystem following (2.2).As we shall see in Proposition 4.1, the stationary points of the system areexactly the same as for (2.2), that is x = π and all its right shifts ( π k − k ∗ ) k =0 , ,... , k ∗ = 0 , , , . . . . Since the Poisson distribution can be characterised as the onlydistribution on N with all cumulants equal, it is natural to transform (4.1) intoa system of equations for the cumulants, κ k , k = 1 , , . . . , of the vector x . Thecumulants are deﬁned by the relationlog ∞ X k =0 x k e − ξk = ∞ X k =1 κ k ( − ξ ) k k ! . (4.2)We assume x > κ := − log x . (4.3) Proposition 4.1.

For κ k , k = 0 , , , . . . as in (4.2) and (4.3) the system (4.1) is equivalent to ˙ κ k = − sκ k +1 + λ, k = 0 , , , . . . THE INFINITE POPULATION LIMIT Setting κ := ( κ , κ , . . . ) this system is solved by κ = Bκ (0) ⊤ + λs (1 − e − st )1 , B = ( b ij ) i,j =0 , ,... , b ij = ( ( − st ) j − i ( j − i )! j ≥ i otherwise. (4.4) In particular, x ( t ) = e − κ ( t ) = x (0) exp (cid:0) − λs (1 − e − st ) (cid:1)(cid:16) P ∞ k =0 x k (0) e − stk (cid:17) (4.5) and κ ( t ) = ∞ X k =0 kx k ( t ) = − ∂∂ξ log ∞ X k =0 x k (0) e − ξk (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ξ = st + λs (1 − e − st ) . (4.6) Remark 4.2. If x (0) is a Poisson( µ ) distribution then substituting into (4.4)we see that x ( t ) is a Poisson distribution with parameter λ/s + e − st ( µ − λ/s ).In other words, just as for the discrete dynamical system considered by Haigh,vectors of Poisson weights are mapped to vectors of Poisson weights. In particu-lar π := Poisson( λ/s ) is once again a stationary point of the system. Moreover,this proposition shows that for any vector x (0) with x (0) >

0, the solutionconverges to this stationary point. The corresponding convergence result inthe discrete case is established in Maia et al. (2003). More generally, if k ∗ isthe smallest value of k for which x k (0) > π k − k ∗ ) k =0 , , ,... . Proof of Proposition 4.1.

Using (4.1) we have ddt log (cid:16) x ∞ X k =0 x k e − ξk (cid:17) = x P ∞ k =0 x k e − ξk (cid:16) − P ∞ k =0 x k e − ξk x ˙ x + 1 x ∞ X k =0 ˙ x k e − ξk (cid:17) = − s ∞ X j =0 jx j + λ − s P ∞ k =0 kx k e − ξk P ∞ k =0 x k e − ξk + s ∞ X j =0 jx j + λ P ∞ k =0 x k e − ξk (cid:16) e − ξ ∞ X k =1 x k − e − ξ ( k − − ∞ X k =0 x k e − ξk (cid:17) = s ddξ log (cid:16) ∞ X k =0 x k e − ξk (cid:17) + λe − ξ . Thus, (4.2) gives ddt ∞ X k =0 κ k ( − ξ ) k k ! = − s ∞ X k =0 κ k +1 ( − ξ ) k k ! + λe − ξ . Comparing coeﬃcients in the last equation we obtain˙ κ k = − sκ k +1 + λ, k = 0 , , . . . . THE INFINITE POPULATION LIMIT D := ( δ i +1 ,j ) i,j =0 , , ,... , , , . . . ) , so that ˙ κ ⊤ = − sDκ ⊤ + λ ⊤ . (4.7)Since ( e − Dst ) ij = ( ( − st ) j − i ( j − i )! j ≥ i κ ( t ) ⊤ = e − Dst κ (0) ⊤ + λ Z t e − Dsu du = e − Dst κ (0) ⊤ + λs (1 − e − st )1 ⊤ . Remark 4.3.

With the initial condition x (0) := ˜ π given by (2.4), equations(4.5) and (4.6) become x ( t ) = e − θ θe − st − e − θe − st (4.8)and κ ( t ) = θ − θe − st e θe − st − . (4.9)At time τ := log θs , (4.10)we have x ( τ ) = e − θ − e − ≈ . π . Comparing with § τ is precisely the counterpart of the time proposed byHaigh as the end of ‘phase one’.In § M ( Y ) given Y will require the value of M ( y ( τ ))for y solving (4.1) when started from a Poisson proﬁle approximation. This isthe purpose of the next proposition. Proposition 4.4.

For y ∈ (0 , , let y ( t ) be the solution of (4.1) with theinitial state y (0) := Π( y ) deﬁned in (2.5) , and let τ be Haigh’s relaxation timedeﬁned in (4.10) . Then for A ≥ with η := θ − A M ( y ( Aτ )) = θ + ηe η − (cid:16) − y ( Aτ ) π (cid:17) . THE INFINITE POPULATION LIMIT Proof.

Since ∞ X k =0 π k e − ξk = exp (cid:0) − θ (1 − e − ξ ) (cid:1) = π − e − ξ , we have ∞ X k =0 y k e − ξk (cid:12)(cid:12)(cid:12) ξ = sAτ = y + 1 − y − π π (cid:0) e θe − ξ − (cid:1)(cid:12)(cid:12)(cid:12) ξ = sAτ = y + 1 − y − π π ( e η − − ∂∂ξ ∞ X k =0 y k e − ξk (cid:12)(cid:12)(cid:12) ξ = sAτ = 1 − y − π π − e − ξ θe − ξ (cid:12)(cid:12)(cid:12) ξ = sAτ = 1 − y − π π e η η. Using the solution (4.5) and (4.6) and y (0) = y y ( Aτ ) = y π e η y + − y − π π ( e η − y π e η (1 − π ) y (1 − π e η ) + π ( e η − , (4.11) M ( y ( Aτ )) = − y − π π e η ηy + − y − π π ( e η −

1) + θ − η = θ + η π − y y (1 − π e η ) + π ( e η − . (4.12)From (4.11), y = y ( Aτ ) π ( e η − π e η (1 − π ) − y ( Aτ )(1 − π e η )and thus π − y = π e η ( π − y ( Aτ ))(1 − π ) π e η (1 − π ) − y ( Aτ )(1 − π e η ) ,y (1 − π e η ) + π ( e η −

1) = π ( e η − π e η (1 − π ) π e η (1 − π ) − y ( Aτ )(1 − π e η ) . Plugging the last two equations into (4.12) we ﬁnd M ( y ( Aτ )) = θ + ηe η − (cid:16) − y ( Aτ ) π (cid:17) . ONE DIMENSIONAL DIFFUSION APPROXIMATIONS . . . . Y M N=10 ll =0.1s=0.024Y = pp M = qq Figure 2: Using simulations (see also §

6) we plot ( Y , M ). There is a good ﬁtto a linear relationship between Y and M . Note that γ = 0 . Recall from § Y of the bestclass follows dY = (cid:16) sM ( Y ) − λ (cid:17) Y dt + r N Y (1 − Y ) dW , (5.1)where W is a standard Wiener process. The system of equations (3.2) is toocomplex for us to be able to ﬁnd an explicit expression for M ( Y ), which dependson the whole vector Y of class sizes. Instead we seek a good approximation of M given Y . Substituting this into equation (5.1) will then yield a one-dimensionaldiﬀusion which we use as an approximation for the size of the best class. Ofcourse this assumption of a functional dependence between Y and M is aweakness of the one-dimensional diﬀusion approximation, but simulations showthat there is a substantial correlation between Y and M , see, for example,Figure 2.To understand our approach to ﬁnding a map Y M , recall as a ﬁrst ONE DIMENSIONAL DIFFUSION APPROXIMATIONS Y from aPoisson proﬁle can only be due to the randomness arising from resampling ina ﬁnite population. Since resampling has no tendency to increase or decreasethe frequency of a given class, the average proﬁle immediately after a click ofthe ratchet is approximated by the state where π is distributed evenly over allother classes according to their equilibrium frequencies. During his short ‘phaseone’, Haigh then allows this proﬁle to ‘relax’ through the action of the discretedynamical system (2.2) and it is the mean ﬁtness in the population after thisshort relaxation time which determines the behaviour of the best class during‘phase two’.A natural next step in extending this argument is to suppose that also inbetween click times the resampling distributes the mass π − Y evenly on allother classes. In other words, given Y , approximate the state of the system by Y = Π( Y ) given by (2.5).Of course in reality the dynamical system interacts with the resamplingas it tries to restore the system to its Poisson equilibrium. If this restoringforce is strong, just as in Haigh’s approach one estimated mean ﬁtness duringphase two from the ‘relaxed’ proﬁle, so here one should approximate the meanﬁtness M not from the PPA, but from states which arise by evolving the PPAusing the dynamical system for a certain amount of time. We call the resultingstates relaxed Poisson Proﬁle approximations or RPPA. There are three diﬀerentparameter regimes with which we shall be concerned. Each corresponds to adiﬀerent value of η in the functional relationship M = θ + ηe η − (cid:16) − Y π (cid:17) . (5.2)of Proposition 4.4. These can be distinguished as follows: A small , η ≈ θ, M ≈ θ − π (1 − Y ) , (5.3a) A = 1 , η = 1 , M ≈ θ + 0 . (cid:16) − Y π (cid:17) , (5.3b) A large , η ≈ , M ≈ θ + (cid:16) − Y π (cid:17) (5.3c)The resulting maps Y M are plotted in Figure 3. Observe that for consis-tency, M has to increase, on average, by 1 during one click of the ratchet.Finally, before we can apply our one-dimensional diﬀusion approximation wemust choose a starting value for Y following equation (5.1). For A large, thesystem is already close to its new equilibrium at the time of a click and so wetake Y = π .For A = 1, at the time of the click we observe a state which has relaxed fortime τ from a state of the form ˜ π from (2.4). We computed in Remark 4.3 thatsuch a state comes with Y = 1 . π . ONE DIMENSIONAL DIFFUSION APPROXIMATIONS Y p p p M qq /(1− p ) q +0.58 q +1 1A small 1A=1 1A big Figure 3: Since simulations show a strong correlation between the ﬁrst moment M and Y , we use (5.3a)-(5.3c) to predict M from Y depending on the modelparameters.For small values of A , observe that the proﬁle of the population immediatelyafter a click is approximately ˜ π from (2.4). Since ˜ π is not a state of the formΠ( y ) the arguments that led to Proposition 4.4 do not apply. Instead we followHaigh in dividing the time between clicks into two phases. Consider ﬁrst ‘phaseone’. Recall from the dynamical system that dY = ( sM − λ ) Y dt. We write ( sM − λ ) Y = c ( π − Y ) , where c (like Y and M ) depends on r = θe − st . Starting in Y (0) = ˜ π we havefrom (4.8) and (4.9) Y ( t ) = e − θ θe − st − exp( − θe − st ) = e − θ r − e − r ,M ( t ) = θ − θe − st exp( θe − st ) − θ − re r − . We compute cs = ( M − θ ) Y π − Y = 1 − re r − − − e − r r = r (1 − e − r ) − r e − r r (1 − e − r ) − (1 − e − r ) . ONE DIMENSIONAL DIFFUSION APPROXIMATIONS .

25 for all r > dY = s ( π − Y ) dt + r N Y dW (5.4)started from π (1 − π ) . We allow Y to evolve according to equation (5.4) untilit reaches 1 . π , say, and then use our estimate of M from equation (5.3) toestimate the evolution of Y during the (longer) ‘phase two’.We assume that states of the ratchet are RPPAs, i.e., Poisson proﬁle ap-proximations (2.5) which are relaxed for time Aτ , where τ = s log θ , whichleads to the functional relationship (5.2). Consequently, we suggest that (5.1)is approximated by the ‘mean reversion’ dynamics dY = s ηe η − (cid:0) − Y π (cid:1) Y dt + r N Y dW , (5.5)with η = θ − A , where we have used a Feller noise instead of the Wright-Fisherterm in (5.1). In other words, Y is a Feller branching diﬀusion with logisticgrowth .Using the three regimes from (5.3), we have the approximations A small , dY = λ ( π − Y ) Y dt + r N Y dW, (5.6a) A = 1 , dY = 0 . s (cid:16) − Y π (cid:17) Y dt + r N Y dW , (5.6b) A large , dY = s (cid:16) − Y π (cid:17) Y dt + r N Y dW , (5.6c)(where in the ﬁrst equation we have used that − π ≈ π and that Y π isnegligible).An equation similar to (5.6b) was found (by diﬀerent means) by Stephan etal (1993) and further discussed in Gordo & Charlesworth (2000). Stephan andKim (2002) analyse whether a prefactor of 0.5 or 0.6 in (5.6b) ﬁts better withsimulated data. We discuss the relationship with these papers in detail in § Z ( t ) = 1 π Y ( N π t ) . SIMULATIONS A small equation (5.6a) becomes dZ = N λπ (1 − Z ) Zdt + √ ZdW = (

N λ ) − γ (1 − Z ) Zdt + √ ZdW. (5.7)For A = 1 on the other hand we obtain from (5.6b) dZ = 0 . N sπ (1 − Z ) Zdt + √ ZdW = 0 .

58 1 γ log( N λ ) (

N λ ) − γ (1 − Z ) Zdt + √ ZdW. (5.8)For A large we obtain from (5.6c) the same equation without the factor of 0 . A , i.e.(5.7), is strongly mean reverting for γ < /

2. Recall that the choice of small A is appropriate when the ratchet is clicking frequently and so this indicates thatfrequent clicking simply will not happen for γ < /

2. To indicate the boundarybetween rare and moderate clicking, equation (5.8) is much more relevant thanequation (5.7). At ﬁrst sight, equation (5.8) looks strongly mean reverting for all γ <

1, which would seem to suggest that the ratchet will click only exponentiallyslowly in (

N λ ) − γ . However, the closer γ is to one, the larger the value of N λ we must take for this asymptotic regime to provide a good approximation. Forexample, in the table below we describe parameter combinations for which thecoeﬃcient in front of the mean reversion term in equation (5.8) is at least ﬁve.We see that for γ < / N λ , whereas for γ > / N λ . γ . . . .

55 0 . . . . N λ ≥

20 10 · · · · · · Thus, for example, if γ = 0 . N λ to be of the order of 10 in orderfor the strong mean reversion of equation (5.8) to be evident. This is not avalue of N λ which will be observed in practice. Indeed, as a ‘rule of thumb’, forbiologically realistic parameter values, we should expect the transition from noclicks to a moderate rate of clicks to take place at around γ = 0 . We have argued that the one-dimensional diﬀusions (5.6) approximate the fre-quency in the best class and from this deduced the rule of thumb from §

1. Inthis section we use simulations to test the validity of our arguments.For a population following the dynamics (2.1), the ( t + 1)st generation isformed by multinomial sampling of N individuals with weights p k ( t ) = k X j =0 x k − j ( t )(1 − s ) k − j W ( t ) e − λ λ j j ! , (6.1) SIMULATIONS W ( t ) is the average ﬁtness in the t th generation from (2.3) and it is thisWright-Fisher model which was implemented in the simulations.To supplement the numerical results of Figure 1 we provide simulation resultsfor the average time between clicks (where time is measured in units of N generations) for ﬁxed N and γ and varying λ ; see Figure 4. Note that, for ﬁxed γ in equation (1.1), s is increasing with λ . We carry out simulations usinga population size of N = 10 and λ varying from 10 − to 1. For γ = 0 . N λ = 10 andthe diﬀusion (5.6b) predicts the clicking of the ratchet suﬃciently well. Forincreasing γ , the power law breaks down only for larger values of N λ . For γ =0 .

7, in our simulations we only observe the power law behaviour but conjecturethat for larger values of

N λ the power law would break down; compare with thetable above.For a ﬁner analysis of which of the equations (5.6) works best, we study theresulting Green functions numerically; see Figure 5. In particular, we recordthe relative time spent in some dY in simulations and compare this quantityto the numerically integrated, normalised Green functions given through thediﬀusions (5.6a) and (5.6b). (We do not consider (5.6c) because it only givesan approximation if the ratchet clicks rarely.) We see in (A) that for γ = 0 . y . However,for γ = 0 .

9, clicks are more frequent and we expect (5.6a) to provide a betterapproximation. Indeed, although both (5.6a) and (5.6b) predict the power lawbehaviour, as (B) shows, the ﬁrst equation produces better estimates for therelative amount of time spent in some dY .To support our claim that the states observed are relaxed Poisson ProﬁleApproximations we use a phase-plane analysis; see Figure 6. At any point intime of a simulation, values for Y and M can be observed. The resultingplots indicate that we can distinguish the three parameter regimes introducedin §

5. In the case of rapid clicking of the ratchet (so that the states we observehave not relaxed a lot and thus are approximately of the form Π( y )) we seein (A), (B) that the system is driven by the restoring force to M < θ . Thereason is that M is small at click times and these are frequent. However, theslope of the line relating Y and M is low, as predicted by (5.3a). (We used(1 − Y ) / (1 − π ) ≈ − Y + π in the plot here.) For the case A = 1 the systemspends some time near Y = 0 and thus the ratchet clicks, but not frequently.So, the dynamical system restores states partly to equilibrium and we see thatthe slope given in (5.3b) gives the most reasonable prediction in (C), (D), (E).For rare clicking, i.e. A large, the dynamical system has even more time and(F) shows that the slope is as predicted by (5.3c).Our prediction that we observe proﬁles which are well approximated by arelaxed PPA applies especially well at click times. We check this numerically byobserving the frequency of the (new) best class at click times; see Figure 7. Forsmall γ the ratchet clicks rarely and the system has some time to relax to its SIMULATIONS N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] . SimulationsA smallA=1 N = 10 g = 0.5 N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] . . . . . SimulationsA smallA=1 N = 10 g = 0.55 (C) (D) N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] . . . . . SimulationsA smallA=1 N = 10 g = 0.6 N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] . . . . SimulationsA smallA=1 N = 10 g = 0.7 (E) (F) N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] e − e − e − e − SimulationsA smallA=1 N = 10 g = 0.8 N l ti m e t o n e x t c li c k [ N g e n e r a ti on s ] e − e − e − e − SimulationsA smallA=1 N = 10 g = 0.9 Figure 4: The power law behaviour of the rate of the ratchet with respect to γ isvalid for a large portion of the parameter space. (A) For γ = 0 . N λ > . (B), (C) For γ = 0 .

55 and γ = 0 .

6, we have to explore a larger portion of the parameter space in order tosee that the power law does not apply any more. (D), (E), (F) For γ ≥ . N = 10 and simulations ran for 5 · ( γ = 0 .

5: 2 · ) generations for each value of N λ . DISCUSSION . . . . . . . Y / p d e n s it y SimulationsA smallA=1N = 10 l = 0.1 g = 0.5 0 2 4 6 8 10 . . . . . Y / p d e n s it y SimulationsA smallA=1N = 10 l = 0.1 g = 0.9 Figure 5: We compare the plots for the occupation density of Y from thesimulations with theoretical curves corresponding to the Green functions forthe cases of small A and A = 1 in (5.6). (A) If clicks are rare, A = 1 producesbetter results than small A . (B) If clicks are frequent, the simulated densities of Y are better approximated by small A . Every plot is based on the simulationof 5 · generations.new equilibrium even before the click of the ratchet. As a consequence, we seethat the frequency of the (new) best class at the time of the click is already closeto π . However, if γ is large and the ratchet clicks frequently, the dynamicalsystem has no time before the click to relax the system to the new equilibrium.Therefore, we observe that the frequency of the new best class is close to π . Haigh (1978) was the ﬁrst to attempt a rigorous mathematical analysis of theideas of Muller (1964). However, in spite of the apparent simplicity of Haigh’smathematical formulation of the model, the exact rate of Muller’s ratchet re-mains elusive. In this note, we have developed arguments in the spirit ofHaigh (1978), Stephan et al. (1993) and Gordo & Charlesworth (2000) to giveapproximations for this rate.Haigh gave the empirical formula4

N π + 7 log θ + s − DISCUSSION . . . . . Y M gg =0.9M = qq (1+ pp −y )M = qq + 0.58(1−Y / pp ) 0.000 0.005 0.010 0.015 0.020 . . . . Y M gg =0.8M = qq + 0.58(1−Y / pp )M = qq (1+ pp −y ) (C) (D) . . . . Y M gg =0.7M = qq + 0.58(1−Y / pp )M = qq (1+ pp −y ) 0.00 0.01 0.02 0.03 0.04 . . . . Y M gg =0.6M = qq + 0.58(1−Y / pp ) (E) (F) . . . . . . Y M gg =0.5M = qq + 0.58(1−Y / pp ) 0.47 0.48 0.49 0.50 0.51 0.52 0.53 . . . . . Y M gg =0.1M = qq +1−Y / pp Figure 6: There are three regimes for the relationship between Y and M , asgiven in (5.3). If clicks are frequent, at least the slope of the relationship between Y and M ﬁts roughly to equation (5.3a). If clicks occur reasonably often,(5.3b) gives a good approximation. If clicks are rare, (5.3c) gives a reasonableprediction. The plots show simulations for diﬀerent values of γ . The dashedhorizontal and vertical lines are M = θ and Y = π , respectively. For everyplot we used N = 10 , λ = 0 . generations. DISCUSSION . . . . g F r e qu e n c y o f t h e n e w b e s t c l a ss p p SimulationsN=10 l =0.01 0.5 0.6 0.7 0.8 0.9 . . . . . g F r e qu e n c y o f t h e n e w b e s t c l a ss p p SimulationsN=10 l =0.1 Figure 7: Our heuristic that observed states come from a RPPA applies inparticular at click times. (A) For small γ , i.e., rare clicking, the frequency ofthe best class is already close to π while (B) it is close to π for larger γ , i.e.,frequent clicking. Every plot is based on the simulation of 5 · generations.The reasoning leading to (5.6b) in these papers is twofold. Stephan etal. (1993) and Stephan & Kim (2002) argue that although ﬁtness decreasesby se − λ during one ‘cycle’ of the discrete ratchet model from § kse − λ for k = 0 . k = 0 . M ( Y ) discussed in § M ( π ) = θ and M (0) = θ + ke − λ ≈ θ + k ; compare with Figure 3. On theother hand, Gordo & Charlesworth (2000) use a calculation of Haigh which tellsus that if the dynamical system (2.1) is started in ˜ π from (2.4), then at theend of phase one (corresponding in the continuous setting, as we observed inRemark 4.3, to time s log θ ) we have sM ≈ − e − λ (1 + 0 . s ). This leads tothe approximation M (1 . π ) = θ − .

42 and again interpolating linearly using M ( π ) = θ gives (5.6b).Simulations show that (5.6b) provides a good approximation to the rate ofthe ratchet for a wide range of parameters; see e.g. Stephan and Kim (2002).The novelty in our work is that we derive (5.6b) explicitly from the dynamicalsystem. In particular, we do not use a linear approximation, but instead derivea functional linear relationship in Proposition 4.4. In addition, we clarify therˆole of the two diﬀerent phases suggested by Haigh. As simulations show, sincephase one is fast, it is already complete at the time when phase two starts.Therefore, in practice, we observe states that are relaxed PPAs.The drawback of our analysis is that we cannot give good arguments for thechoice of A = 1 in (5.3b) and (5.6b). However, note that the choice of A = 1 is EFERENCES θ = 10, the choice of A = 0 . A = 2 leads to the prefactor of 0.95,neither of which ﬁts with simulated data; see Figure 6.We obtain two more diﬀusion approximations, which are valid in the cases offrequent and rare clicking, respectively. In practice, both play little rˆole in theprediction of the rate of the ratchet. For fast clicking, (5.6b) shows the samepower law behaviour as (5.6a) and rare clicks are never observed in simulations.Of course from a biological perspective our mathematical model is very naive.In particular, it is unnatural to suppose that each new mutation confers thesame selective disadvantage and, indeed, not all mutations will be deleterious.Moreover, if one is to argue that Muller’s ratchet explains the evolution of sex,then one has to quantify the eﬀect of recombination. Such questions provide arich, but challenging, mathematical playground. Acknowledgements

We have discussed Muller’s ratchet with many diﬀerentpeople. We are especially indebted to Ellen Baake, Nick Barton, MatthiasBirkner, Charles Cuthbertson, Don Dawson, Wolfgang Stephan, Jay Taylor andFeng Yu.

References

B¨urger, R. (1991). Moments, cumulants and polygenic dynamics.

J. Math.Biol. , 30:199–213.Charlesworth, B. (1978). Model for evolution of Y chromosomes and dosagecompensation.

Proc. Natl. Acad. Sci. USA , 75(11):5618–5622.Charlesworth, B. (1996). The evolution of chromosomal sex determination anddosage compensation.

Curr. Biol. , 6:149–162.Charlesworth, B. and Charlesworth, D. (1997). Rapid ﬁxation of deleteriousalleles can be caused by Muller’s ratchet.

Genet. Res. , 70:63–73.Charlesworth, B. and Charlesworth, D. (1998). Some evolutionary consequencesof deleterious mutations.

Genetica , 102/103:2–19.Gessler, D. D. (1995). The constraints of ﬁnite size in asexual populations andthe rate of the ratchet.

Genet. Res. , 66(3):241–253.Gordo, I. and Charlesworth, B. (2000). The degeneration of asexual haploidpopulations and the speed of Muller’s ratchet.

Genetics , 154(3):1379–1387.Haigh, J. (1978). The accumulation of deleterious genes in a population–Muller’sRatchet.

Theor. Popul. Biol. , 14(2):251–267.Higgs, P. G. and Woodcock, G. (1995). The accumulation of mutations inasexual populations and the structure of genealogical trees in the presence ofselection.

J. Math. Biol. , 33:677–702.

EFERENCES

Ann. Appl.Probab , 15(2):1506–1535.Loewe, L. (2006). Quantifying the genomic decay paradox due to Muller’sratchet in human mitochondrial DNA.

Genet. Res. , 87:133–159.Maia, L. P., Botelho, D. F., and Fontatari, J. F. (2003). Analytical solution ofthe evolution dynamics on a multiplicative ﬁtness landscape.

J. Math. Biol. ,47:453–456.Maynard Smith, J. (1978).

The Evolution of Sex . Cambridge University Press.Rice, W. (1994). Degeneration of a non-recombining chromosome.

Science ,263:230–232.Stephan, W., Chao, L., and Smale, J. (1993). The advance of Muller’s ratchetin a haploid asexual population: approximate solutions based on diﬀusiontheory.

Genet. Res. , 61(3):225–231.Stephan, W. and Kim, Y. (2002). Recent applications of diﬀusion theory topopulation genetics. In