[PDF] Time to fixation in the presence of recombination

Abstract

We study the evolutionary dynamics of a haploid population of infinite size recombining with a probability r in a two locus model. Starting from a low fitness locus, the population is evolved under mutation, selection and recombination until a finite fraction of the population reaches the fittest locus. An analytical method is developed to calculate the fixation time T to the fittest locus for various choices of epistasis. We find that (1) for negative epistasis, T decreases slowly for small r but decays fast at larger r (2) for positive epistasis, T increases linearly for small r and mildly for large r (3) for compensatory mutation, T diverges as a power law with logarithmic corrections as the recombination fraction approaches a critical value. Our calculations are seen to be in good agreement with the exact numerical results.

Full PDF

aa r X i v : . [ q - b i o . P E ] O c t Time to ﬁxation in the presence of recombination

Kavita Jain

Theoretical Sciences Unit and Evolutionary and Organismal Biology Unit,Jawaharlal Nehru Centre for Advanced Scientiﬁc Research,Jakkur P.O., Bangalore 560064, India

Abstract

We study the evolutionary dynamics of a haploid population of inﬁnite sizerecombining with a probability r in a two locus model. Starting from a lowﬁtness locus, the population is evolved under mutation, selection and recom-bination until a ﬁnite fraction of the population reaches the ﬁttest locus. Ananalytical method is developed to calculate the ﬁxation time T to the ﬁttestlocus for various choices of epistasis. We ﬁnd that (1) for negative epistasis, T decreases slowly for small r but decays fast at larger r (2) for positiveepistasis, T increases linearly for small r and mildly for large r (3) for com-pensatory mutation, T diverges as a power law with logarithmic correctionsas the recombination fraction approaches a critical value. Our calculationsare seen to be in good agreement with the exact numerical results. Key words: ﬁxation time, recombination, epistasis

1. Introduction Sexual reproduction is ubiquitous in nature - most eukaryotes reproduce sexually and genetic mixing is common in some bacterias [21, 25]. However, asexual reproduction is not entirely absent. Microbes such as virus and bacteria reproduce asexually most of the time, ancient asexuals [12] which have remained exclusively asexual for millions of years persist and human mitochondrial DNA has not recombined for a few million years [17]. It is then natural to ask: under what conditions is one or the other mode of reproduction preferred? A detailed study of theoretical models has been helpful in identifying some relevant parameters and conditions. A parameter which plays a cru- cial role in the evolution of sex and recombination is epistatic interaction Preprint submitted to Elsevier October 29, 2018 mongst gene loci [16]. Experiments have shown that the individual locus do not always contribute independently to the ﬁtness of the whole sequence [28, 24] and the deviation of the ﬁtness from the independent loci model is a measure of the epistatic interactions. The nature of epistasis is important in determining whether a mode of reproduction may be viable. For instance, in the absence of back mutations and recombination, a ﬁnite asexual pop- ulation evolving on a nonepistatic ﬁtness landscape accumulates deleterious mutations irreversibly (Muller’s ratchet) [19, 7]. But the degeneration can be eﬀectively halted if synergistic epistasis is present [13, 10]. On complex multipeaked ﬁtness landscapes that incorporate sign epistasis [27], the eﬀect of sex has been seen to depend on the detailed topography of the ﬁtness landscape [14, 4]. The epistatic interactions play an important role in inﬁnitely large popu- lations as well. In a two locus model with the four possible sequences denoted by ab , Ab , aB and AB and with respective ﬁtnesses w , w , w , w ( > w , w ), recombination reduces the frequency of the favorable mutant AB when epis- tasis parameter e = w w − w is positive but increases the AB frequency for negative e [5]. In this article, we ask: in an inﬁnitely large recombining population if all the population is initially located at the sequence ab , how much time T does it take to get ﬁxed to the double mutant AB with ﬁtness w > w ? The ﬁxation time T is expected to decrease with recombination for negative epistasis and increase for positive epistasis [20, 6]. These qualitative trends are understandable from the results of [5]: for e <

0, as recombination acts in favor of the double mutant, it will get ﬁxed faster than in the asexual case while the reverse holds for e > The main purpose of this article is to ﬁnd analytical expressions for the ﬁxation time T . To this end, we develop a new method to handle the inher- ently nonlinear equations obeyed by the genotype frequencies in the presence of recombination (see Section 2). The basic idea of our approach is that at any instant, only one of the genotypes dominate so that the equations can be expanded perturbatively in powers of the ratio of the non-dominant genotype frequency to the dominant one. The rest of the article is organised as follows. We ﬁrst deﬁne the model under consideration in Section 2. The dynamics of the population frequencies for various choices of epistasis are discussed in Section 3. The ﬁxation time deﬁned as the time at which the double mutant frequency reaches a given ﬁnite fraction is calculated in Section 4. The eﬀect of initial conditions on ﬁxation time is considered in Section 5. The last section discusses our results

2. Model We consider a two locus model with sequences denoted by ab , Ab , aB and AB and respective ﬁtnesses w , w , w , w ( > w , w , w ). The population atthese sequences evolves according to mutation, selection and recombinationdynamics. In such models, several schemes have been used to implementthese basic processes such as recombination followed by mutation and thenselection [6], selection, mutation and then recombination [15] and selection,mutation and recombination appearing as additive terms in continuous timemodels [2]. Here we work with a discrete time, two locus model in whichthe mutation occurs after recombination and selection [26]. The mutationprobability from a to A and b to B is given by µ but the back mutations areneglected. The recombination between ab and AB or aB and Ab occurs witha probability r . Denoting the population fraction at sequences ab , Ab , aB and AB at time t by x , ..., x respectively and time t + 1 by x ′ , ..., x ′ , thetime evolution occurs according to the following nonlinear coupled equations: x ′ = (1 − µ ) ( w x − rD )¯ w ( t ) (1) x ′ = µ (1 − µ )( w x − rD ) + (1 − µ )( w x + rD )¯ w ( t ) (2) x ′ = µ (1 − µ )( w x − rD ) + (1 − µ )( w x + rD )¯ w ( t ) (3) x ′ = µ ( w x − rD ) + µ ( w x + rD ) + µ ( w x + rD ) + ( w x − rD )¯ w ( t ) (4)Here D ( t ) = ( w w x ( t ) x ( t ) − w w x ( t ) x ( t )) / ¯ w ( t ) is the linkage dise- quilibrium at time t and ¯ w ( t ) = P k =1 w k x k ( t ) is the average ﬁtness of the population. In the following, we will work with w = 1 , w = w and w > w , w and initial condition x k (0) = δ k, . As a consequence, x ( t ) = x ( t ) for all t >

0. We deﬁne the epistasis parameter e = w w − w and will discuss fourseparate cases: (i) zero epistasis which requires w > w > w > w < x k ( t ) = z k ( t ) / P j =1 z j ( t )3here the z k ’s satisfy the following condition: X i =1 z ′ i = X i =1 w i z i (5)The unnormalised populations z k ’s obey the following set of equations, z ′ = (1 − µ ) (cid:16) z − r ˜ D (cid:17) (6) z ′ = µ (1 − µ ) (cid:16) z − r ˜ D (cid:17) + (1 − µ ) (cid:16) w z + r ˜ D (cid:17) (7) z ′ = µ (cid:16) z − r ˜ D (cid:17) + 2 µ (cid:16) w z + r ˜ D (cid:17) + (cid:16) w z − r ˜ D (cid:17) (8)with the initial condition z k (0) = δ k, . In the above equations,˜ D = D X i =1 z i = w z z − w z ( P i =1 w i z i ) X i =1 z i (9)For r = 0, the above model reduces to the standard quasispecies modelfor asexuals and can be solved exactly as the population z i ’s obey linearequations [11]. For µ →

0, we ﬁnd z ( t ) ≈ z ( t ) ≈ µ (cid:20) w t − w − (cid:21) (11) z ( t ) ≈ µ (cid:20) w w − w t − w t w − w − w + 1 w − w t − w − (cid:21) (12)Figure 1 shows the time evolution of the populations z k ’s when r = 0 for negative and positive epistasis. With nonzero recombination, it does not seem possible to solve the above equations for z i ( t ) exactly due to the bilinear terms in linkage disequilbrium D . However, in the next section, we will obtain approximate expressions for the unnormalised population z i .

3. Time evolution of populations As we shall see, the dynamics of population z i ’s can be divided in following three dynamical phases: (i) z ≫ z , z (phase I) (ii) z ≫ z , z (phase t Figure 1: Recombination probability r = 0: Time evolution of z (solid), z (broken) and z (dotted) for w = 2 , w = 3 , µ = 10 − , and z (+), z ( × ) and z ( ⊡ ) for w = 2 , w =5 , µ = 10 − using exact equations (6)-(8). z ≫ z , z (phase III). Thus we can expand equations (6)-(8) in powers of z /z , z /z in phase I, z /z , z /z in phase II and similarly, z /z , z /z in phase III. The time scale at which a phase ends is obtained by matching the solutions of the relevant populations in the two phases. In the crossover region however the above assumptions are not expected to hold strictly. But as we shall see, the ﬁxation time is nevertheless well approximated. Note that the perturbation expansions here are diﬀerent from a small r expansion [15]. When w = w , the epistasis parameter e = 0. The dynamics of popula-tions x i ’s and z i ’s evolving for such a ﬁtness choice is shown in Fig. 2. For e = 0, the linkage disequilibrium ˜ D ( t ) obeys the following evolution equation: D ′ = w (1 − µ ) [ w ( z z − z ) − rDS S ][(1 − µ + µw ) z + 2 w (1 − µ + µw ) z + w z − r (1 − µ ) (1 − w ) DS ] ∝ w ( z z − z ) − rDS S (13)where S = P i =1 z i and S = P i =1 w i z i . As D ∝ z z − z for e = 0, it follows that D ′ ∼ D . Therefore, if the population has zero linkagedisequilibrium to start with, it remains zero for all times in the absenceof epistasis [5]. Thus for e = 0, the population z k ’s obey following linearequations, z ′ = (1 − µ ) z z ′ = µ (1 − µ ) z + (1 − µ ) w z z ′ = µ z + 2 µw z + w z which can be easily solved. For small µ , we obtain z ( t ) ≈ z ( t ) ≈ µ (cid:18) w t − w − (cid:19) (15) z ( t ) ≈ (cid:20) µ (cid:18) w t − w − (cid:19)(cid:21) . (16) From the above solution, we see that at time τ at which phase I ends, z z − z = z − z = 0 so that z = z and therefore the phase II is absent in this case. t Figure 2: Non-epistatic interaction: Time evolution of z (solid), z (broken) and z (dotted) for w = 1 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). Thenormalised fractions are shown in the inset. t Figure 3: Negative epistasis: Time evolution of z (solid), z (broken) and z (dotted) for w = 2 , w = 3 , r = 0 . , µ = 10 − using exact equations (6)-(8). We now consider the case when epistasis is negative, w < w . The time evolution of populations for this ﬁtness scheme is shown in Fig. 1 for r = 0 and Fig. 3 for r >

0. In both cases, the population z dominates at short times followed by z and ﬁnally z takes over. This behavior is also reﬂected in the normalised populations x k ’s shown in the inset of Fig. 3. Phase I.

Since z k (0) = δ k, , initially z , z ≪ z so that the sum P i =1 z i ≈ z .Using this in the expression for ˜ D , we ﬁnd that ˜ D ≈ w z − ( w z /z ). Taking µ → z ′ ≈ z + r (cid:18) w z − w z z z (cid:19) z ′ ≈ µz + w z − r (cid:18) w z − w z z z (cid:19) z ′ ≈ µ z + 2 µw z + w z + r (cid:18) w z − w z z z (cid:19) Since z /z , z /z ≪

1, we can write z ′ ≈ z which immediately gives z ( t ) =8 for t < τ where τ is the time at which phase I ends. In the equation for z ( z ), the ﬁrst term on the RHS is the mutation term which signiﬁes that evenif w ( w ) and r are equal to zero, the population z ( z ) will remain nonzerodue to a constant mutational supply from ab . The second term on the RHSof the z equation is the selection term and the last two terms are due torecombination. It turns out that the recombination term can be ignored toyield z ′ ≈ µz + w z . This approximation is later justiﬁed by showing thatindeed the last two terms are negligible compared to w z . In summary, wehave z ′ ≈ z (17) z ′ ≈ µz + w z (18) z ′ ≈ µ z + 2 µw z + (1 − r ) w z + rw z z (19)On solving the above equations, we obtain z ( t ) ≈ z ( t ) ≈ µ (cid:18) w t − w − (cid:19) (21) z ( t ) ≈ µ (cid:20) ( w (1 − r )) t − w (1 − r ) − (cid:21) + 2 µ w w − (cid:20) ( w (1 − r )) t − w t w (1 − r ) − w − ( w (1 − r )) t − w (1 − r ) − (cid:21) + rµ w ( w − (cid:20) ( w (1 − r )) t − w t w (1 − r ) − w − w (1 − r )) t − w t w (1 − r ) − w + ( w (1 − r )) t − w (1 − r ) − (cid:21) (22)We check that the r = 0 limit is recovered from the above solution. Fromthe last equation, we ﬁnd that for r >

0, the growth rate of population z isgiven by max { w (1 − r ) , w } . Since r must be positive, for e <

0, the growthrate of z is w and we have z ( t ) ≈ µ rw ( rw + | e | )( w − × w t , r > w z ≫ r ( w z − w z ) thus justifying (18). It is evident from (21) and (23) that when z becomes one, z = rw / ( rw + | e | ) < z intersects z before z . The time τ at which z ( τ ) = 1 isgiven by τ = 1ln w ln (cid:18) w − µ (cid:19) (24)9hich is independent of r . Phase II.

After time τ , the population z ≫ z , z and therefore the sum P i =1 z i ≈ z . For weak selection, this gives ˜ D ≈ − z /

2. Thus in thisphase, z k ’s obey the following equations: z ′ ≈ (cid:16) w − r (cid:17) z (25) z ′ ≈ w z + r z (26) z ′ ≈ z + r z (27)As the equation for z is decoupled from z and z , we can ﬁrst solve for z and then use the solution to ﬁnd z and z . This ﬁnally gives z ( t ) ≈ (cid:16) w − r (cid:17) t − τ z ( τ ) (28) z ( t ) ≈ w t − τ z ( τ ) + r w − ( r/ t − τ − w t − τ w − ( r/ − w z ( τ ) (29) z ( t ) ≈ z ( τ ) + r w − ( r/ t − τ − w − ( r/ − z ( τ ) (30)where z ( τ ) = z ( τ ) = 1 , z ( τ ) = rw / ( rw + | e | ). The time τ at which z overtakes z is given by τ = τ + 1ln [2 w / (2 w − r )] ln (cid:20) rw + | e | )( w − w − r )[2( w − w ) − r ] rw − r w − r | e | (cid:21) , r > e = 0, the time τ − τ during which phase II is presentvanishes. To understand how τ varies with r , consider the r -dependent term g ( r ) in τ which can be rewritten as g ( r ) = 1ln [2 w / (2 w − r )] ln (cid:20) | e | r (1 − r )( r + 2( w − w )) | e | + rw − w [2( w − w ) − r ] (cid:21) ≈ w /w ) ln (cid:20) | e | r (1 − r )( r + 2( w − w )) | e | + rw − w [2( w − w ) − r ] (cid:21) = 1ln( w /w ) ln [1 + R ]The ratio R = 1 when r satisﬁes the quadratic equation w r + ( w − w )(2 w − w ) r − ( w − w ) | e | = ( r − r + )( r + | r − | ) = 0 where r + ( r − ) is10he positive (negative) root of the quadratic equation. For r ≪ r + , the ratio R ≫ R ) ≈ ln R . For r ≫ r + , R ≪ R ) ≈ R .Using these approximations in the expression for g ( r ) above, we ﬁnd that g ( r ) ∼ ( ln [2( w − w ) /r ] , r ≪ r + | e | /r , r ≫ r + (32)Thus the time τ − τ decreases slowly as ln(1 /r ) for small r and as 1 /r for large r . Due to these properties of τ , the single mutant population can dominate for appreciable time interval for small r . Phase III.

For t > τ , the population z ≫ z , z so that P i z i ≈ z . Due tothis, ˜ D ≈ ( z /w ) − (( w z ) / ( w z )). For small µ , the equation for z k ’s in(6)-(8) can thus be simpliﬁed to give z ′ ≈ w z (33) z ′ ≈ w z (34) z ′ ≈ (cid:18) − rw (cid:19) z + rw w z z (35)where we have neglected the recombination term contribution to the equationfor z by assuming w z ≫ rz /w (see below). From the ﬁrst two equations,we see that z ∼ w t and z ∼ w t . Thus the last term in the equation for z can contribute when e <

0. Explicitly, we obtain z ( t ) = w t − τ z ( τ ) (36) z ( t ) = w t − τ z ( τ ) (37) z ( t ) = (1 − rw ) t − τ z ( τ ) + r w w z ( τ ) z ( τ ) (1 − ( r/w )) t − τ − ( w /w ) t − τ (1 − ( r/w )) − ( w /w ) (38) ≈ rw z ( τ ) w ( r + | e | ) z ( τ ) (cid:18) w w (cid:19) t − τ , e < z k ( τ ) are given by (28)-(30) at time t = τ . From the above solution, it is easily veriﬁed that w z ≫ rz /w is a good approximation for t > τ . We now turn to the case when epistasis is positive. The condition w > w can be satisﬁed for w < w >

1. For w >

1, the time evolution of populations for this ﬁtness scheme is shown in Fig. 4 for e < rw and e > rw . populations z , z and z for w < is absent in all these cases. Phase I.

As discussed for e <

0, in this phase, z ≫ z , z and can be wellapproximated by one, z ( t ) ≈ t < τ . Then the populations z and z obey the following equations: z ′ ≈ µ + w z + r ( w z − w z ) (40) z ′ ≈ µ + 2 µw z + w (1 − r ) z + rw z (41)Figures 4 and 5 show that initially z > z but after some time ( < τ ), z canovertake z while both z , z <

1. This behavior is characteristic of positiveepistasis as can be seen from Fig. 1 for r = 0 also. For this reason, thepopulation z can get a contribution from z in phase I when e > r -dependent term in the equation for z . The equationfor z remains the same as for negative epistasis. Since z appears linearlyin the above equations for z and z , it is possible to eliminate z from theequation for z and express it in terms of z alone. This gives a three termrecursion relation for z : z ( t + 1) = (1 + rw − w ) µ + rw µ + (2 µrw − w + rw ) w z ( t − w + w − rw ) z ( t ) + rw w z ( t − − rw z ( t ) , t ≥ z (0) = 0 and z (1) = µ . We will ﬁnd the solution to the nonlinear equation for z ( t ) iteratively [1]. We ﬁrst ﬁnd the solution f ( t ) of the above diﬀerence equation for z when the nonlinear terms are set to zero. The corrections to z ( t ) arising due to nonlinearity will then be determined by writing z ( t ) = f ( t )(1 + f ( t )). The solution f ( t ) satisﬁes the following linear, inhomogeneous diﬀerenceequation: f ( t + 1) = B f ( t ) + C f ( t −

1) + A , t ≥ A = (1+ rw − w ) µ + rw µ , B = ( w + w − rw ) , C = (2 µrw − w + rw ) w . The solution of this linear equation subject to f (0) = 0 , f (1) = µ can be found by using the method of variation of parameters [1] and is givenby f ( t ) = µ (1 − α + ) − A ( α + − α − )(1 − α + ) α t + + µ (1 − α − ) − A ( α − − α + )(1 − α − ) α t − + A (1 − α − )(1 − α + )12 t t Figure 4: Positive epistasis: Time evolution of z (solid), z (broken) and z (dotted) for (a) w = 1 . , w = 1 . , r = 0 . , µ = 10 − (b) w = 1 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). The normalised fractions are shown in the inset. where α ± = ( w − rw + w ) ± p ( w − rw − w ) + 8 µrw w O ( µ ) in the last expressionwhich gives f ( t ) ≈ (cid:18) µ + µ rw [2 w (1 − w ) + (1 + w )( w − w − rw )]( w − w − w − rw ) (cid:19) (cid:20) w t − w − (cid:21) + µ rw ( w + w − rw )( w − rw − w ) (cid:20) ( w − rw ) t − w − rw − (cid:21) (44)Note that the above solution consists of two growth rates for z namely w and w (1 − r ). Using z ( t ) ≈ f ( t ) in (41) and keeping terms to O ( µ ), we get (22) for z ( t ). It follows that the population z does not grow if r > r c = ( w − /w for w <

1. But for w >

1, the population z always grows and the growth rate is given by max { w (1 − r ) , w } as in the case of negative e . In the following, we will discuss the two cases w > w < w >

1: When w >

1, due to (22), the following subcases arise for z ( t ): z ( t ) ≈ ( µ e (1 − r )( w − rw + w )( e − rw )( w − rw − w − rw − w ) × ( w − rw ) t , r < e/w µ rw ( rw − e )( w − × w t , r > e/w (45)13or r > e/w , it follows from (45) that when z becomes one, z = rw / ( rw − e ) > z hits unity before z and thus phase II is absent. The time τ at which z ( τ ) = 1 is given by τ ( r ) = 1ln w ln (cid:18) w − µw (cid:19) + 12 ln w ln (cid:16) w − er (cid:17) , r > e/w (46)For r ≫ e , the last term in the above expression (and hence τ ) increases as ∼ − e/ ( rw ) with increasing r . For r < e/w , using (45), we ﬁnd that the time τ at which z ( τ ) = 1 isgiven by τ ( r ) = 1ln( w − rw ) ln (cid:20) ( e − rw )( w − − rw )( w − w − rw ) µ e (1 − r )( w + w − rw ) (cid:21) , r < e/w (47)For r = 0, we have τ (0) ≈ w ln (cid:20) ( w − w − w ) µ ( w + w ) (cid:21) (48)which matches the one obtained using (12) or (22) for w ≪ w . To ﬁnd thebehavior of τ for r ≪ e/w , we rewrite the expression for τ ( r ) asln( w − rw ) τ ( r ) − ln w τ (0)= ln "(cid:18) − rw e (1 − r ) (cid:19) (cid:18) − rw w − (cid:19) (cid:18) − rw w − w (cid:19) (cid:18) − rw w + w (cid:19) − Using the inequality r < rw < e < w − w < w − < w + w in the lastequation, we ﬁndln( w − rw ) τ ( r ) ≈ ln w τ (0) + ln (cid:18) − rw e (cid:19) (49)The above expression can be further simpliﬁed to give τ ( r ) ≈ τ (0) (cid:18) r ln w (cid:19) , r ≪ e/w (50)which shows that τ increases linearly with r for r ≪ e/w . w <

1: It is known that for an inﬁnite population, there exits a critical recombination fraction r c beyond which a population initially located at ab t Figure 5: Compensatory mutation: Time evolution of z (solid), z (broken) and z (dotted) for w = 0 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). Thenormalised fractions are shown in the inset. ﬁtness peak [3, 5, 8, 22]. For our model, as discussed above, z can grow (and hence x can be ﬁxed) provided w (1 − r ) > r < r c ≈ ( w − /w . Due to (44) obtained by dropping nonlinear terms in the equation for z ,both z and z increase as ( w − rw ) t . But Fig. 5 shows that z and z donot continue to grow in this manner but rise sharply as the end of phaseI approaches. To understand this, it is essential to include the nonlinearterms in the equations (40) and (41) for z and z . To this end, we write z ( t ) = f ( t ) [1 + f ( t )] where f ( t ) given by (44) reduces to f ( t ) ≈ µ − w + µ rw ( w + w − rw )( w − rw − w ) (cid:20) ( w − rw ) t w − rw − (cid:21) (51)for w < , w (1 − r ) >

1. As we shall later, f ( t ) remains close to zero forshort times but contributes substantially at large times. Using this form of z in (42) and neglecting the quadratic terms in f , we ﬁnd that f ( t ) obeys thefollowing approximate linear inhomogeneous equation with time-dependentcoeﬃcients, f ( t + 1) = B ( t ) f ( t ) + C ( t ) f ( t −

1) + A ( t ) (52)where the coeﬃcient B ( t ) = w + w − rw − rw f ( t ) (53) C ( t ) = (2 µrw − w + rw ) w + 2 rw w f ( t ) (54) A ( t ) = r ( w − w f ( t ) (55)For w − rw >

1, we deﬁne ǫ = r c − r . For ǫ → f ( t ) ≈ a (1 + ǫw ) t where a = µ r c (1 + w )(1 − w ) ǫ (56)and the coeﬃcients are given by B ( t ) ≈ w + 1 − r c w f ( t ) (57) C ( t ) ≈ − w + 2 r c w w f ( t ) (58) A ( t ) ≈ r c w w f ( t ) (59)16riting the diﬀerence equation (52) for f ( t ) as a diﬀerential equation,we get df ( t ) dt + 1 − B ( t ) − C ( t )1 + C ( t ) f ( t ) = A ( t )1 + C ( t ) (60)with the initial condition f (0) = 0. It is straightforward to solve the abovediﬀerential equation and we obtain f ( t ) ≈ (cid:2)(cid:0) b + c (1 + ǫw ) t (cid:1) α − (cid:3) (61)where b = 1 − w − w + 2 r c w w a (62) c = 1 − b (63) α = r c w ǫ (64)The second term in the parentheses in the above equation can be neglectedfor t ≪ t = ln [(1 − w ) / (2 r c w w a )] /ǫw and since a ∼ µ , we obtain f ( t ) ≈ t ≫ t , the ﬁrst term inthe parentheses can be ignored and for ǫ →

0, we obtain f ( t ) ∼ e r c t . Thus z ( t ) = f ( t )(1 + f ( t )) increases as z ( t ) ∼ ( (1 + ǫw ) t , t ≪ t (1 + ǫw ) [1+ r c / ( w ǫ )] t , t ≫ t (65)Thus at times close to the end of phase I, z increases at a faster rate. To ﬁnd the time τ at which phase I ends, we ﬁrst calculate z ( t ) using z ( t ) = f ( t )(1 + f ( t )) in (40). This yields z ( t ) ≈ f ( t + 1) [1 + f ( t + 1)] − µ − w f ( t ) [1 + f ( t )] + rw f ( t ) [1 + 2 f ( t )] rw (66)On expanding f ( t + 1) and f ( t + 1) for small ǫ , we obtain f ( t + 1) ≈ (1 + ǫw ) f ( t ) (67) f ( t + 1) ≈ f ( t ) + 12 cǫαw (1 + ǫw ) t (cid:2) b + c (1 + ǫw ) t (cid:3) α − (68)17ubstituting this in the above expression for z ( t ) and using z ( τ ) = 1, weﬁnd that τ is determined from the following equation:(1 − w ) a y (1 + y ) + ar c ( c + 2 aw )2 y y = rw + µ (69)where y = (1 + ǫw ) τ and y = ( b + c (1 + ǫw ) τ ) α . As this equation isdiﬃcult to analyse, we consider only the fastest growing term to obtain thefollowing approximate equation for τ : y y ar c (cid:2) c + 2 aw (cid:3) ≈ rw (70) Phase III.

For t > τ , the populations z k ’s obey the equations (33)-(35) as for e < τ replaced by τ . For w > z and z grow exponentially fast with their respective ﬁtnesses but z decays with time. The rate of decline is determined by the ratio w / ( w − r ). If this ratio is larger than unity, z ∼ ( w /w ) t and as (( w − r ) /w ) t otherwise. For w < z decays with time while z continues to grow. The remarks above for z behavior when w > w < case also.

4. Fixation time

As seen in the last section, the unnormalised populations z k ’s vary ex-ponentially (or faster) with time so that the normalised population x willreach unity asymptotically. Therefore, we deﬁne the ﬁxation time T as thetime when the population fraction x ( T ) = 1 − δ where δ →

0. In terms of z k ’s, this condition gives z ( T ) + 2 z ( T ) − δ − δ z ( T ) = 0 (71)where z k ( T ) in the above equation is the population fraction in the Phase III at t = T . Another reason why δ > δ = 0, the above equation cannot be satisﬁed as both z and z are always positive. For r = 0 and w >

1, since z ≈

1, we can write δz ( T ) ≈ z ( T ) whichgives T ≈ w /w ) ln (cid:20) − δ )( w − w )( w − µδ ( w + w )( w − (cid:21) (72)which decreases monotonically as w increases. .1. No epistasis Since z ( t ) = z ( t ) due to (15) and (16), the condition (71) simpliﬁes togive z ( T ) = (1 + z ( T )) √ − δ which leads to T ≈ w ln (cid:20) w − √ − δµ (1 − √ − δ ) (cid:21) (73)For w = 2 , w = 4 , µ = 10 − and δ = 0 .

01, the above expression yields T = 27 .

56 in excellent agreement with the result of our exact numerical iteration which gives the ﬁxation time equal to 28 for various values of r . Using (36)-(38) at t = T in (71), we have rw w ( r − e ) (cid:18) w w (cid:19) T − τ + 2 − δ − δ (cid:18) w w (cid:19) T − τ = 0 (74)Since the ﬁrst term on the LHS is exponentially decaying, we neglect it toobtain T = τ + 1ln( w /w ) ln (cid:20) − δ ) δ (cid:21) (75)where τ is given by (31). As discussed in Sec. 3.2, since τ decreases with r , the ﬁxation time T is a decreasing function of recombination probability r when epistasis is negative. A comparison of the analytical estimate with the exact numerical result for two sets of parameters shows a good agreement (see Fig. 6). Dividing both sides of (71) by w and using (36)-(38), we ﬁnd that thecontribution due to z term can be neglected as it is exponentially decaying.This gives 2 z ( τ ) ≈ δ − δ (cid:18) w w (cid:19) T − τ (76)and hence T = τ + ln [2(1 − δ ) /δ ]ln( w /w ) + ln z ( τ )ln( w /w ) (77)We ﬁrst consider the w > w <

44 46 48 50 52 54 56 58 60 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =2.0, w =3.0w =1.5, w =2.0 Figure 6: Negative epistasis: Fixation time as a function of r obtained using exact iteration( • ) and analytical results ( ◦ ) given by (75) for two sets of ﬁtnesses and µ = 10 − , δ = 0 . w = 2 . , w = 3 .

30 35 40 45 50 55 60 65 70 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 T r w =1.25, w =1.8125w =1.25, w =2.5 Figure 7: Positive epistasis: Fixation time as a function of r obtained using exact iteration( • ) and analytical result ( ◦ and × ) given by (77) and τ (upto a constant) respectively fortwo sets of ﬁtnesses and µ = 10 − , δ = 0 . w >

1: In the above expression, z ( τ ) is given by (44) and τ by (47) for r < e/w and (46) for r > e/w . Computing T using these formulae in (77), we obtain the ﬁxation time as a function of r shown in Fig. 7 (open circles) for ﬁtness choices w = 1 . , w = 1 . , e/w ≈ .

14 and w = 1 . , w = . , e/w = 0 . the exact numerical results except in the vicinity of r = e/w . Figure 7 also shows τ which displays a similar behavior as T . We have already seen that the time τ increases linearly with r for r ≪ e/w but weakly for r ≫ e/w . Thus the ﬁxation time T for e > , w > r but is weakly dependent on r for r > e/w . w <

1: The inset of Fig. 8 shows that the ﬁxation time diverges as r ap-proaches critical recombination probability r c = ( w − /w . From (77),the ﬁxation time T for w < τ from (69) and z ( τ ) = f ( τ )(1 + f ( τ )). The time T thus obtained (open circles) whenplotted against r c − r is compared with the results from exact iteration inFig. 8 and shows a good agreement. The approximate τ obtained using (70)is also plotted which well approximates the ﬁxation time T . Therefore, it is21

100 1000 10000 100000 0.001 0.01 0.1 T r c -r

100 1000 10000 0.1 0.15 0.2 T r Figure 8: Compensatory mutation: Fixation time as a function of r obtained using exactiteration ( • ) and analytical results ( ◦ and × ) given by (77) and (70) respectively forparameters w = 0 . , w = 1 . , µ = 10 − , δ = 0 .

01 with r c = ( w − /w . The solidline has a slope equal to − r → r c . For ǫ →

0, (70) can be written as y y ar c (cid:2) c + 2 aw (cid:3) ≈ r c w (78)On taking logarithms both sides, the above equation reduces to2 r c w a (1 − w ) ǫ e τ ǫw ≈ ln w (1 − w )(1 − w + r c w ) w a (79)where we have neglected the linear term in τ as compared to the exponentialterm in τ . Writing a = ˜ a/ǫ , we ﬁnally obtain τ ≈ ǫw (cid:20) ln ln (cid:18) w (1 − w ) ǫ (1 − w + r c w ) w ˜ a (cid:19) − ln (cid:18) r c w ˜ a (1 − w ) ǫ (cid:19)(cid:21) (80)which decays slower than 1 /ǫ (see Fig. 8) due to the logarithmic corrections.

5. Initial Condition with nonzero linkage disequilibrium

So far, we have discussed the population dynamics starting with an initial condition in which only one genotype has a nonzero population. In this sec- tion, we consider the situation when a small ﬁnite frequency at the interme- diate loci is also present at t = 0 i.e. x (0) = 0 , x (0) = x (0) = (1 − x (0)) / As the analytical method presented in the last sections assumes that all but one frequencies is rare at a given time, it seems diﬃcult to obtain analytical results. Therefore we present numerical results to show how the change in initial condition aﬀects the ﬁxation time.

As shown in Fig. 9, due to a nonzero population at intermediate loci, the ﬁxation time at a given r is reduced as compared to the situation when only the genotype ab is present initially. For negative epistasis (Fig. 9a), the trend in the generalised situation appears similar to that discussed in Fig. 6 in that the T decreases slowly for small r but fast for large r . However for positive epistasis with w > and unlike Fig. 7 does not increase. For compensatory mutations, the ﬁxation time shown in Fig. 10 increases as r approaches a critical recombination rate and diverges slower than ( r c − r ) − .

20 22 24 26 28 30 32 34 36 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =2.0, w =3.0w =1.5, w =2.0

16 18 20 22 24 26 28 30 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =1.25, w =1.8125w =1.25, w =2.5 Figure 9: Fixation time as a function of r obtained using exact iteration ( • ) when x (0) =0 . , x (0) = x (0) = 0 .

025 for the same parameters as in Figs. 6 and 7.

6. Conclusions

In this article, we have studied the dynamics of a 2 locus model in which the population evolves deterministically under mutation, selection and re- combination. As the recombination process makes the equations nonlinear, in general it is diﬃcult to study such problems analytically. Here we have developed an analytical method to ﬁnd the ﬁxation time to the best locus for various ﬁtness schemes.

The ﬁxation time T is one of the measures for judging whether recombi- nation is beneﬁcial for a population [6]. If the ﬁxation time decreases with increasing recombination rate r , one may deduce that recombining popu- lation has an advantage over an asexual one. Our calculations show that when the epistasis parameter e is negative, the ﬁxation time decreases fast for large r which suggests that high recombination rate may be beneﬁcial for populations with negatively epistatic ﬁtness. In a ﬁtness landscape with positive epistasis and ﬁtness increasing mono- tonically with mutational distance from the initial sequence, the ﬁxation time is shown to increase with recombination. As already discussed in the

Introduction, the result that T increases with r for positive epistasis is in qualitative agreement with the expectation from the results of Eshel and Feldman [5]. Our analytical calculations show that the functional behavior of time T depends on the ratio rw /e . The ﬁxation time is shown to increase linearly when r ≪ e/w but remains roughly constant for r ≫ e/w . A compensatory mutation is said to occur when the ﬁtness loss caused by

100 1000 10000 100000 0.0001 0.001 0.01 0.1 T r c -r

100 1000 10000 0.15 0.2 T r Figure 10: Fixation time as a function of r obtained using exact iteration ( • ) when x (0) =0 . , x (0) = x (0) = 0 .

025 for the same parameters as in Figs. 8. The numericallydetermined critical recombination rate r c ≈ . − Te = 0 r > r (73) e < , w > r > e Decreases with r (75) e > , w > r < e/w Increases linearly with r (50) r > e/w Increases weakly with r (46) e > , w < r < r c Diverges with r c − r (80) r > r c Inﬁnite

Table 1: Table summarising the dependence of ﬁxation time T on recombination proba-bility r for various choices of epistasis e . one mutation is remedied by its epistatic interaction with a second mutation at a diﬀerent site in the genome. For such a ﬁtness scheme in which the initial and ﬁnal ﬁtness hills are separated by a ﬁtness valley, it is known that an inﬁnitely large population cannot cross the intermediate valley beyond a critical recombination rate r c [3, 5]. This implies that the ﬁxation time diverges as r approaches r c . Our exact numerical results for ﬁxation time when plotted against r c − r on a double logarithmic scale indicate a power law decay. Assuming that the divergence is purely algebraic, a ﬁt to the numerical data then gives T ∼ ( r c − r ) − . . However our calculation that takes the nonlinearities into account shows that the divergence is actually a power law with logarithmic corrections. Here we have focused on the evolution of deterministic population but it is important to include drift eﬀects as the real populations have a ﬁnite size N . The ﬁnite population problem with compensatory mutation has been studied in certain parameter regimes using simulations and within a diﬀu- sion approximation. The analytical calculations of [26, 9, 8] and numerical simulations of [18, 23] for a two-locus model with compensatory mutation indicates that for ﬁxed s and N , a ﬁnite population manages to reach the best locus for any r and the ﬁxation time increases with r as seen for in- ﬁnite population. However it is not clear how the population approaches the inﬁnite N limit. For e > , w >

1, it is found numerically there exits a critical epistasis value below which the ﬁxation time decreases [20]. An analytical understanding of such interesting aspects of the interplay between recombination and drift remains an open problem.

Acknowledgement: The author is grateful to J. Krug for introducing her to this problem and useful discussions. She also thanks S.-C. Park for com- part of this work was done.

References [1] C.M. Bender and S.A. Orszag.

Advanced Mathematical Methods for

Scientists and Engineers . Springer, 1999. [2] R. B¨urger. Linkage and the maintenance of heritable variation by mutation-selection balance.

Genetics , 121:175–184, 1989. [3] J.F. Crow and M. Kimura. Evolution in sexual and asexual populations.

Am. Nat. , 99:439–450, 1965. [4] J.A.G.M. de Visser, S.-C. Park, and J. Krug. Exploring the eﬀect of sex on an empirical ﬁtness landscape.

Am. Nat. , 174:S15-S30, 2009. [5] I. Eshel and M.W. Feldman. On the evolutionary eﬀect of recombination.

Theo. Pop. Biol. , 1:88–100, 1970. [6] M.W. Feldman, S.P. Otto, and F.B. Christiansen. Population genetic perspectives on the evolution of recombination.

Annu. Rev. Genet. , [7] J. Felsenstein. The evolutionary advantage of recombination. Genetics , [8] P.G. Higgs. Compensatory neutral mutations and the evolution of RNA. Genetica , 102/103:91–101, 1998. [9] M. Iizuka and M. Takefu. Average time until ﬁxation of mutants with compensatory ﬁtness interaction.

Genes Genet. Syst. , 71:167–173, 1996. [10] K. Jain. Loss of least-loaded class in asexual populations due to drift and epistasis.

Genetics , 179:2125, 2008. [11] K. Jain and J. Krug. Adaptation in simple and complex ﬁtness land- scapes. In U. Bastolla, M. Porto, H.E. Roman, and M. Vendruscolo, editors,

Structural Approaches to Sequence Evolution: Molecules, Net- works and Populations , pages 299–340. Springer, Berlin, 2007.

Trends Ecol.

Evol. , 11:41–46, 1996. [13] A.S. Kondrashov. Muller’s ratchet under epistatic selection.

Genetics , [14] F.A. Kondrashov and A.S. Kondrashov. Multidimensional epistasis and the disadvantage of sex. Proc. Natl. Acad. Sci. U.S.A. , 98:12089–12092, [15] R.D. Kouyos, S.P. Otto, and S. Bonhoeﬀer. Eﬀect of varying epistasis on the evolution of recombination.

Genetics , 173:589–597, 2006. [16] R.D. Kouyos, O.K. Silander, and S. Bonhoeﬀer. Epistasis between dele- terious mutations and the evolution of recombination.

Trends Ecol.

Evol. , 22:308–315, 2007. [17] L. Loewe. Quantifying the genomic decay paradox due to Muller’s ratchet in human mitochondrial DNA.

Genet. Res. Camb. , 87:133–159, [18] Y. Michalakis and M. Slatkin. Interaction of selection and recombination in the ﬁxation of negative-epistatic genes.

Genet. Res. , 67:257–269, 1996. [19] H. J. Muller. The relation of recombination to mutational advance.

Mutation Res. , 1:2–9, 1964. [20] S.P. Otto, M.W. Feldman, and F.B. Christiansen. Some advantages and disadvantages of recombination. In S.A. Levin, editor,

Frontiers in

Mathematical Biology , pages 198–211. Berlin: Springer Verlag, 1994. [21] S.P. Otto and T. Lenormand. Evolution of sex: Resolving the paradox of sex and recombination.

Nat. Rev. Genet. , 3:252, 2002. [22] S.-C. Park and J. Krug. Bistability in two-locus models with selection, mutation, and recombination, submitted to Theo. Pop. Biol. [23] P.C. Phillips. Waiting for a compensatory mutation: Phase zero of the shifting-balance process.

Genet. Res. , 67:271–283, 1996. structure and evolution of genetic systems.

Nat. Rev. Genet. , 9:855– [25] W. R. Rice. Evolution of sex: Experimental tests of the adaptive signif- icance of sexual recombination.

Nat. Rev. Genet. , 3:241–251, 2002. [26] W. Stephan. The rate of compensatory evolution.

Genetics , 144:419– [27] D. M. Weinreich, R. A. Watson, and L. Chao. Sign epistasis and genetic constraint on evolutionary trajectories.

Evolution , 59:1165–1174, 2005. [28] J.B. Wolf, E.D.I. Brodie, and M.J. Wade, editors.

Epistasis and the evolutionary process . Oxford Univ. Press, New York, 2000.290