Time to fixation in the presence of recombination
aa r X i v : . [ q - b i o . P E ] O c t Time to fixation in the presence of recombination
Kavita Jain
Theoretical Sciences Unit and Evolutionary and Organismal Biology Unit,Jawaharlal Nehru Centre for Advanced Scientific Research,Jakkur P.O., Bangalore 560064, India
Abstract
We study the evolutionary dynamics of a haploid population of infinite sizerecombining with a probability r in a two locus model. Starting from a lowfitness locus, the population is evolved under mutation, selection and recom-bination until a finite fraction of the population reaches the fittest locus. Ananalytical method is developed to calculate the fixation time T to the fittestlocus for various choices of epistasis. We find that (1) for negative epistasis, T decreases slowly for small r but decays fast at larger r (2) for positiveepistasis, T increases linearly for small r and mildly for large r (3) for com-pensatory mutation, T diverges as a power law with logarithmic correctionsas the recombination fraction approaches a critical value. Our calculationsare seen to be in good agreement with the exact numerical results. Key words: fixation time, recombination, epistasis
1. Introduction Sexual reproduction is ubiquitous in nature - most eukaryotes reproduce sexually and genetic mixing is common in some bacterias [21, 25]. However, asexual reproduction is not entirely absent. Microbes such as virus and bacteria reproduce asexually most of the time, ancient asexuals [12] which have remained exclusively asexual for millions of years persist and human mitochondrial DNA has not recombined for a few million years [17]. It is then natural to ask: under what conditions is one or the other mode of reproduction preferred? A detailed study of theoretical models has been helpful in identifying some relevant parameters and conditions. A parameter which plays a cru- cial role in the evolution of sex and recombination is epistatic interaction Preprint submitted to Elsevier October 29, 2018 mongst gene loci [16]. Experiments have shown that the individual locus do not always contribute independently to the fitness of the whole sequence [28, 24] and the deviation of the fitness from the independent loci model is a measure of the epistatic interactions. The nature of epistasis is important in determining whether a mode of reproduction may be viable. For instance, in the absence of back mutations and recombination, a finite asexual pop- ulation evolving on a nonepistatic fitness landscape accumulates deleterious mutations irreversibly (Muller’s ratchet) [19, 7]. But the degeneration can be effectively halted if synergistic epistasis is present [13, 10]. On complex multipeaked fitness landscapes that incorporate sign epistasis [27], the effect of sex has been seen to depend on the detailed topography of the fitness landscape [14, 4]. The epistatic interactions play an important role in infinitely large popu- lations as well. In a two locus model with the four possible sequences denoted by ab , Ab , aB and AB and with respective fitnesses w , w , w , w ( > w , w ), recombination reduces the frequency of the favorable mutant AB when epis- tasis parameter e = w w − w is positive but increases the AB frequency for negative e [5]. In this article, we ask: in an infinitely large recombining population if all the population is initially located at the sequence ab , how much time T does it take to get fixed to the double mutant AB with fitness w > w ? The fixation time T is expected to decrease with recombination for negative epistasis and increase for positive epistasis [20, 6]. These qualitative trends are understandable from the results of [5]: for e <
0, as recombination acts in favor of the double mutant, it will get fixed faster than in the asexual case while the reverse holds for e > The main purpose of this article is to find analytical expressions for the fixation time T . To this end, we develop a new method to handle the inher- ently nonlinear equations obeyed by the genotype frequencies in the presence of recombination (see Section 2). The basic idea of our approach is that at any instant, only one of the genotypes dominate so that the equations can be expanded perturbatively in powers of the ratio of the non-dominant genotype frequency to the dominant one. The rest of the article is organised as follows. We first define the model under consideration in Section 2. The dynamics of the population frequencies for various choices of epistasis are discussed in Section 3. The fixation time defined as the time at which the double mutant frequency reaches a given finite fraction is calculated in Section 4. The effect of initial conditions on fixation time is considered in Section 5. The last section discusses our results
2. Model We consider a two locus model with sequences denoted by ab , Ab , aB and AB and respective fitnesses w , w , w , w ( > w , w , w ). The population atthese sequences evolves according to mutation, selection and recombinationdynamics. In such models, several schemes have been used to implementthese basic processes such as recombination followed by mutation and thenselection [6], selection, mutation and then recombination [15] and selection,mutation and recombination appearing as additive terms in continuous timemodels [2]. Here we work with a discrete time, two locus model in whichthe mutation occurs after recombination and selection [26]. The mutationprobability from a to A and b to B is given by µ but the back mutations areneglected. The recombination between ab and AB or aB and Ab occurs witha probability r . Denoting the population fraction at sequences ab , Ab , aB and AB at time t by x , ..., x respectively and time t + 1 by x ′ , ..., x ′ , thetime evolution occurs according to the following nonlinear coupled equations: x ′ = (1 − µ ) ( w x − rD )¯ w ( t ) (1) x ′ = µ (1 − µ )( w x − rD ) + (1 − µ )( w x + rD )¯ w ( t ) (2) x ′ = µ (1 − µ )( w x − rD ) + (1 − µ )( w x + rD )¯ w ( t ) (3) x ′ = µ ( w x − rD ) + µ ( w x + rD ) + µ ( w x + rD ) + ( w x − rD )¯ w ( t ) (4)Here D ( t ) = ( w w x ( t ) x ( t ) − w w x ( t ) x ( t )) / ¯ w ( t ) is the linkage dise- quilibrium at time t and ¯ w ( t ) = P k =1 w k x k ( t ) is the average fitness of the population. In the following, we will work with w = 1 , w = w and w > w , w and initial condition x k (0) = δ k, . As a consequence, x ( t ) = x ( t ) for all t >
0. We define the epistasis parameter e = w w − w and will discuss fourseparate cases: (i) zero epistasis which requires w > w > w > w < x k ( t ) = z k ( t ) / P j =1 z j ( t )3here the z k ’s satisfy the following condition: X i =1 z ′ i = X i =1 w i z i (5)The unnormalised populations z k ’s obey the following set of equations, z ′ = (1 − µ ) (cid:16) z − r ˜ D (cid:17) (6) z ′ = µ (1 − µ ) (cid:16) z − r ˜ D (cid:17) + (1 − µ ) (cid:16) w z + r ˜ D (cid:17) (7) z ′ = µ (cid:16) z − r ˜ D (cid:17) + 2 µ (cid:16) w z + r ˜ D (cid:17) + (cid:16) w z − r ˜ D (cid:17) (8)with the initial condition z k (0) = δ k, . In the above equations,˜ D = D X i =1 z i = w z z − w z ( P i =1 w i z i ) X i =1 z i (9)For r = 0, the above model reduces to the standard quasispecies modelfor asexuals and can be solved exactly as the population z i ’s obey linearequations [11]. For µ →
0, we find z ( t ) ≈ z ( t ) ≈ µ (cid:20) w t − w − (cid:21) (11) z ( t ) ≈ µ (cid:20) w w − w t − w t w − w − w + 1 w − w t − w − (cid:21) (12)Figure 1 shows the time evolution of the populations z k ’s when r = 0 for negative and positive epistasis. With nonzero recombination, it does not seem possible to solve the above equations for z i ( t ) exactly due to the bilinear terms in linkage disequilbrium D . However, in the next section, we will obtain approximate expressions for the unnormalised population z i .
3. Time evolution of populations As we shall see, the dynamics of population z i ’s can be divided in following three dynamical phases: (i) z ≫ z , z (phase I) (ii) z ≫ z , z (phase t Figure 1: Recombination probability r = 0: Time evolution of z (solid), z (broken) and z (dotted) for w = 2 , w = 3 , µ = 10 − , and z (+), z ( × ) and z ( ⊡ ) for w = 2 , w =5 , µ = 10 − using exact equations (6)-(8). z ≫ z , z (phase III). Thus we can expand equations (6)-(8) in powers of z /z , z /z in phase I, z /z , z /z in phase II and similarly, z /z , z /z in phase III. The time scale at which a phase ends is obtained by matching the solutions of the relevant populations in the two phases. In the crossover region however the above assumptions are not expected to hold strictly. But as we shall see, the fixation time is nevertheless well approximated. Note that the perturbation expansions here are different from a small r expansion [15]. When w = w , the epistasis parameter e = 0. The dynamics of popula-tions x i ’s and z i ’s evolving for such a fitness choice is shown in Fig. 2. For e = 0, the linkage disequilibrium ˜ D ( t ) obeys the following evolution equation: D ′ = w (1 − µ ) [ w ( z z − z ) − rDS S ][(1 − µ + µw ) z + 2 w (1 − µ + µw ) z + w z − r (1 − µ ) (1 − w ) DS ] ∝ w ( z z − z ) − rDS S (13)where S = P i =1 z i and S = P i =1 w i z i . As D ∝ z z − z for e = 0, it follows that D ′ ∼ D . Therefore, if the population has zero linkagedisequilibrium to start with, it remains zero for all times in the absenceof epistasis [5]. Thus for e = 0, the population z k ’s obey following linearequations, z ′ = (1 − µ ) z z ′ = µ (1 − µ ) z + (1 − µ ) w z z ′ = µ z + 2 µw z + w z which can be easily solved. For small µ , we obtain z ( t ) ≈ z ( t ) ≈ µ (cid:18) w t − w − (cid:19) (15) z ( t ) ≈ (cid:20) µ (cid:18) w t − w − (cid:19)(cid:21) . (16) From the above solution, we see that at time τ at which phase I ends, z z − z = z − z = 0 so that z = z and therefore the phase II is absent in this case. t Figure 2: Non-epistatic interaction: Time evolution of z (solid), z (broken) and z (dotted) for w = 1 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). Thenormalised fractions are shown in the inset. t Figure 3: Negative epistasis: Time evolution of z (solid), z (broken) and z (dotted) for w = 2 , w = 3 , r = 0 . , µ = 10 − using exact equations (6)-(8). We now consider the case when epistasis is negative, w < w . The time evolution of populations for this fitness scheme is shown in Fig. 1 for r = 0 and Fig. 3 for r >
0. In both cases, the population z dominates at short times followed by z and finally z takes over. This behavior is also reflected in the normalised populations x k ’s shown in the inset of Fig. 3. Phase I.
Since z k (0) = δ k, , initially z , z ≪ z so that the sum P i =1 z i ≈ z .Using this in the expression for ˜ D , we find that ˜ D ≈ w z − ( w z /z ). Taking µ → z ′ ≈ z + r (cid:18) w z − w z z z (cid:19) z ′ ≈ µz + w z − r (cid:18) w z − w z z z (cid:19) z ′ ≈ µ z + 2 µw z + w z + r (cid:18) w z − w z z z (cid:19) Since z /z , z /z ≪
1, we can write z ′ ≈ z which immediately gives z ( t ) =8 for t < τ where τ is the time at which phase I ends. In the equation for z ( z ), the first term on the RHS is the mutation term which signifies that evenif w ( w ) and r are equal to zero, the population z ( z ) will remain nonzerodue to a constant mutational supply from ab . The second term on the RHSof the z equation is the selection term and the last two terms are due torecombination. It turns out that the recombination term can be ignored toyield z ′ ≈ µz + w z . This approximation is later justified by showing thatindeed the last two terms are negligible compared to w z . In summary, wehave z ′ ≈ z (17) z ′ ≈ µz + w z (18) z ′ ≈ µ z + 2 µw z + (1 − r ) w z + rw z z (19)On solving the above equations, we obtain z ( t ) ≈ z ( t ) ≈ µ (cid:18) w t − w − (cid:19) (21) z ( t ) ≈ µ (cid:20) ( w (1 − r )) t − w (1 − r ) − (cid:21) + 2 µ w w − (cid:20) ( w (1 − r )) t − w t w (1 − r ) − w − ( w (1 − r )) t − w (1 − r ) − (cid:21) + rµ w ( w − (cid:20) ( w (1 − r )) t − w t w (1 − r ) − w − w (1 − r )) t − w t w (1 − r ) − w + ( w (1 − r )) t − w (1 − r ) − (cid:21) (22)We check that the r = 0 limit is recovered from the above solution. Fromthe last equation, we find that for r >
0, the growth rate of population z isgiven by max { w (1 − r ) , w } . Since r must be positive, for e <
0, the growthrate of z is w and we have z ( t ) ≈ µ rw ( rw + | e | )( w − × w t , r > w z ≫ r ( w z − w z ) thus justifying (18). It is evident from (21) and (23) that when z becomes one, z = rw / ( rw + | e | ) < z intersects z before z . The time τ at which z ( τ ) = 1 isgiven by τ = 1ln w ln (cid:18) w − µ (cid:19) (24)9hich is independent of r . Phase II.
After time τ , the population z ≫ z , z and therefore the sum P i =1 z i ≈ z . For weak selection, this gives ˜ D ≈ − z /
2. Thus in thisphase, z k ’s obey the following equations: z ′ ≈ (cid:16) w − r (cid:17) z (25) z ′ ≈ w z + r z (26) z ′ ≈ z + r z (27)As the equation for z is decoupled from z and z , we can first solve for z and then use the solution to find z and z . This finally gives z ( t ) ≈ (cid:16) w − r (cid:17) t − τ z ( τ ) (28) z ( t ) ≈ w t − τ z ( τ ) + r w − ( r/ t − τ − w t − τ w − ( r/ − w z ( τ ) (29) z ( t ) ≈ z ( τ ) + r w − ( r/ t − τ − w − ( r/ − z ( τ ) (30)where z ( τ ) = z ( τ ) = 1 , z ( τ ) = rw / ( rw + | e | ). The time τ at which z overtakes z is given by τ = τ + 1ln [2 w / (2 w − r )] ln (cid:20) rw + | e | )( w − w − r )[2( w − w ) − r ] rw − r w − r | e | (cid:21) , r > e = 0, the time τ − τ during which phase II is presentvanishes. To understand how τ varies with r , consider the r -dependent term g ( r ) in τ which can be rewritten as g ( r ) = 1ln [2 w / (2 w − r )] ln (cid:20) | e | r (1 − r )( r + 2( w − w )) | e | + rw − w [2( w − w ) − r ] (cid:21) ≈ w /w ) ln (cid:20) | e | r (1 − r )( r + 2( w − w )) | e | + rw − w [2( w − w ) − r ] (cid:21) = 1ln( w /w ) ln [1 + R ]The ratio R = 1 when r satisfies the quadratic equation w r + ( w − w )(2 w − w ) r − ( w − w ) | e | = ( r − r + )( r + | r − | ) = 0 where r + ( r − ) is10he positive (negative) root of the quadratic equation. For r ≪ r + , the ratio R ≫ R ) ≈ ln R . For r ≫ r + , R ≪ R ) ≈ R .Using these approximations in the expression for g ( r ) above, we find that g ( r ) ∼ ( ln [2( w − w ) /r ] , r ≪ r + | e | /r , r ≫ r + (32)Thus the time τ − τ decreases slowly as ln(1 /r ) for small r and as 1 /r for large r . Due to these properties of τ , the single mutant population can dominate for appreciable time interval for small r . Phase III.
For t > τ , the population z ≫ z , z so that P i z i ≈ z . Due tothis, ˜ D ≈ ( z /w ) − (( w z ) / ( w z )). For small µ , the equation for z k ’s in(6)-(8) can thus be simplified to give z ′ ≈ w z (33) z ′ ≈ w z (34) z ′ ≈ (cid:18) − rw (cid:19) z + rw w z z (35)where we have neglected the recombination term contribution to the equationfor z by assuming w z ≫ rz /w (see below). From the first two equations,we see that z ∼ w t and z ∼ w t . Thus the last term in the equation for z can contribute when e <
0. Explicitly, we obtain z ( t ) = w t − τ z ( τ ) (36) z ( t ) = w t − τ z ( τ ) (37) z ( t ) = (1 − rw ) t − τ z ( τ ) + r w w z ( τ ) z ( τ ) (1 − ( r/w )) t − τ − ( w /w ) t − τ (1 − ( r/w )) − ( w /w ) (38) ≈ rw z ( τ ) w ( r + | e | ) z ( τ ) (cid:18) w w (cid:19) t − τ , e < z k ( τ ) are given by (28)-(30) at time t = τ . From the above solution, it is easily verified that w z ≫ rz /w is a good approximation for t > τ . We now turn to the case when epistasis is positive. The condition w > w can be satisfied for w < w >
1. For w >
1, the time evolution of populations for this fitness scheme is shown in Fig. 4 for e < rw and e > rw . populations z , z and z for w < is absent in all these cases. Phase I.
As discussed for e <
0, in this phase, z ≫ z , z and can be wellapproximated by one, z ( t ) ≈ t < τ . Then the populations z and z obey the following equations: z ′ ≈ µ + w z + r ( w z − w z ) (40) z ′ ≈ µ + 2 µw z + w (1 − r ) z + rw z (41)Figures 4 and 5 show that initially z > z but after some time ( < τ ), z canovertake z while both z , z <
1. This behavior is characteristic of positiveepistasis as can be seen from Fig. 1 for r = 0 also. For this reason, thepopulation z can get a contribution from z in phase I when e > r -dependent term in the equation for z . The equationfor z remains the same as for negative epistasis. Since z appears linearlyin the above equations for z and z , it is possible to eliminate z from theequation for z and express it in terms of z alone. This gives a three termrecursion relation for z : z ( t + 1) = (1 + rw − w ) µ + rw µ + (2 µrw − w + rw ) w z ( t − w + w − rw ) z ( t ) + rw w z ( t − − rw z ( t ) , t ≥ z (0) = 0 and z (1) = µ . We will find the solution to the nonlinear equation for z ( t ) iteratively [1]. We first find the solution f ( t ) of the above difference equation for z when the nonlinear terms are set to zero. The corrections to z ( t ) arising due to nonlinearity will then be determined by writing z ( t ) = f ( t )(1 + f ( t )). The solution f ( t ) satisfies the following linear, inhomogeneous differenceequation: f ( t + 1) = B f ( t ) + C f ( t −
1) + A , t ≥ A = (1+ rw − w ) µ + rw µ , B = ( w + w − rw ) , C = (2 µrw − w + rw ) w . The solution of this linear equation subject to f (0) = 0 , f (1) = µ can be found by using the method of variation of parameters [1] and is givenby f ( t ) = µ (1 − α + ) − A ( α + − α − )(1 − α + ) α t + + µ (1 − α − ) − A ( α − − α + )(1 − α − ) α t − + A (1 − α − )(1 − α + )12 t t Figure 4: Positive epistasis: Time evolution of z (solid), z (broken) and z (dotted) for (a) w = 1 . , w = 1 . , r = 0 . , µ = 10 − (b) w = 1 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). The normalised fractions are shown in the inset. where α ± = ( w − rw + w ) ± p ( w − rw − w ) + 8 µrw w O ( µ ) in the last expressionwhich gives f ( t ) ≈ (cid:18) µ + µ rw [2 w (1 − w ) + (1 + w )( w − w − rw )]( w − w − w − rw ) (cid:19) (cid:20) w t − w − (cid:21) + µ rw ( w + w − rw )( w − rw − w ) (cid:20) ( w − rw ) t − w − rw − (cid:21) (44)Note that the above solution consists of two growth rates for z namely w and w (1 − r ). Using z ( t ) ≈ f ( t ) in (41) and keeping terms to O ( µ ), we get (22) for z ( t ). It follows that the population z does not grow if r > r c = ( w − /w for w <
1. But for w >
1, the population z always grows and the growth rate is given by max { w (1 − r ) , w } as in the case of negative e . In the following, we will discuss the two cases w > w < w >
1: When w >
1, due to (22), the following subcases arise for z ( t ): z ( t ) ≈ ( µ e (1 − r )( w − rw + w )( e − rw )( w − rw − w − rw − w ) × ( w − rw ) t , r < e/w µ rw ( rw − e )( w − × w t , r > e/w (45)13or r > e/w , it follows from (45) that when z becomes one, z = rw / ( rw − e ) > z hits unity before z and thus phase II is absent. The time τ at which z ( τ ) = 1 is given by τ ( r ) = 1ln w ln (cid:18) w − µw (cid:19) + 12 ln w ln (cid:16) w − er (cid:17) , r > e/w (46)For r ≫ e , the last term in the above expression (and hence τ ) increases as ∼ − e/ ( rw ) with increasing r . For r < e/w , using (45), we find that the time τ at which z ( τ ) = 1 isgiven by τ ( r ) = 1ln( w − rw ) ln (cid:20) ( e − rw )( w − − rw )( w − w − rw ) µ e (1 − r )( w + w − rw ) (cid:21) , r < e/w (47)For r = 0, we have τ (0) ≈ w ln (cid:20) ( w − w − w ) µ ( w + w ) (cid:21) (48)which matches the one obtained using (12) or (22) for w ≪ w . To find thebehavior of τ for r ≪ e/w , we rewrite the expression for τ ( r ) asln( w − rw ) τ ( r ) − ln w τ (0)= ln "(cid:18) − rw e (1 − r ) (cid:19) (cid:18) − rw w − (cid:19) (cid:18) − rw w − w (cid:19) (cid:18) − rw w + w (cid:19) − Using the inequality r < rw < e < w − w < w − < w + w in the lastequation, we findln( w − rw ) τ ( r ) ≈ ln w τ (0) + ln (cid:18) − rw e (cid:19) (49)The above expression can be further simplified to give τ ( r ) ≈ τ (0) (cid:18) r ln w (cid:19) , r ≪ e/w (50)which shows that τ increases linearly with r for r ≪ e/w . w <
1: It is known that for an infinite population, there exits a critical recombination fraction r c beyond which a population initially located at ab t Figure 5: Compensatory mutation: Time evolution of z (solid), z (broken) and z (dotted) for w = 0 . , w = 1 . , r = 0 . , µ = 10 − using exact equations (6)-(8). Thenormalised fractions are shown in the inset. fitness peak [3, 5, 8, 22]. For our model, as discussed above, z can grow (and hence x can be fixed) provided w (1 − r ) > r < r c ≈ ( w − /w . Due to (44) obtained by dropping nonlinear terms in the equation for z ,both z and z increase as ( w − rw ) t . But Fig. 5 shows that z and z donot continue to grow in this manner but rise sharply as the end of phaseI approaches. To understand this, it is essential to include the nonlinearterms in the equations (40) and (41) for z and z . To this end, we write z ( t ) = f ( t ) [1 + f ( t )] where f ( t ) given by (44) reduces to f ( t ) ≈ µ − w + µ rw ( w + w − rw )( w − rw − w ) (cid:20) ( w − rw ) t w − rw − (cid:21) (51)for w < , w (1 − r ) >
1. As we shall later, f ( t ) remains close to zero forshort times but contributes substantially at large times. Using this form of z in (42) and neglecting the quadratic terms in f , we find that f ( t ) obeys thefollowing approximate linear inhomogeneous equation with time-dependentcoefficients, f ( t + 1) = B ( t ) f ( t ) + C ( t ) f ( t −
1) + A ( t ) (52)where the coefficient B ( t ) = w + w − rw − rw f ( t ) (53) C ( t ) = (2 µrw − w + rw ) w + 2 rw w f ( t ) (54) A ( t ) = r ( w − w f ( t ) (55)For w − rw >
1, we define ǫ = r c − r . For ǫ → f ( t ) ≈ a (1 + ǫw ) t where a = µ r c (1 + w )(1 − w ) ǫ (56)and the coefficients are given by B ( t ) ≈ w + 1 − r c w f ( t ) (57) C ( t ) ≈ − w + 2 r c w w f ( t ) (58) A ( t ) ≈ r c w w f ( t ) (59)16riting the difference equation (52) for f ( t ) as a differential equation,we get df ( t ) dt + 1 − B ( t ) − C ( t )1 + C ( t ) f ( t ) = A ( t )1 + C ( t ) (60)with the initial condition f (0) = 0. It is straightforward to solve the abovedifferential equation and we obtain f ( t ) ≈ (cid:2)(cid:0) b + c (1 + ǫw ) t (cid:1) α − (cid:3) (61)where b = 1 − w − w + 2 r c w w a (62) c = 1 − b (63) α = r c w ǫ (64)The second term in the parentheses in the above equation can be neglectedfor t ≪ t = ln [(1 − w ) / (2 r c w w a )] /ǫw and since a ∼ µ , we obtain f ( t ) ≈ t ≫ t , the first term inthe parentheses can be ignored and for ǫ →
0, we obtain f ( t ) ∼ e r c t . Thus z ( t ) = f ( t )(1 + f ( t )) increases as z ( t ) ∼ ( (1 + ǫw ) t , t ≪ t (1 + ǫw ) [1+ r c / ( w ǫ )] t , t ≫ t (65)Thus at times close to the end of phase I, z increases at a faster rate. To find the time τ at which phase I ends, we first calculate z ( t ) using z ( t ) = f ( t )(1 + f ( t )) in (40). This yields z ( t ) ≈ f ( t + 1) [1 + f ( t + 1)] − µ − w f ( t ) [1 + f ( t )] + rw f ( t ) [1 + 2 f ( t )] rw (66)On expanding f ( t + 1) and f ( t + 1) for small ǫ , we obtain f ( t + 1) ≈ (1 + ǫw ) f ( t ) (67) f ( t + 1) ≈ f ( t ) + 12 cǫαw (1 + ǫw ) t (cid:2) b + c (1 + ǫw ) t (cid:3) α − (68)17ubstituting this in the above expression for z ( t ) and using z ( τ ) = 1, wefind that τ is determined from the following equation:(1 − w ) a y (1 + y ) + ar c ( c + 2 aw )2 y y = rw + µ (69)where y = (1 + ǫw ) τ and y = ( b + c (1 + ǫw ) τ ) α . As this equation isdifficult to analyse, we consider only the fastest growing term to obtain thefollowing approximate equation for τ : y y ar c (cid:2) c + 2 aw (cid:3) ≈ rw (70) Phase III.
For t > τ , the populations z k ’s obey the equations (33)-(35) as for e < τ replaced by τ . For w > z and z grow exponentially fast with their respective fitnesses but z decays with time. The rate of decline is determined by the ratio w / ( w − r ). If this ratio is larger than unity, z ∼ ( w /w ) t and as (( w − r ) /w ) t otherwise. For w < z decays with time while z continues to grow. The remarks above for z behavior when w > w < case also.
4. Fixation time
As seen in the last section, the unnormalised populations z k ’s vary ex-ponentially (or faster) with time so that the normalised population x willreach unity asymptotically. Therefore, we define the fixation time T as thetime when the population fraction x ( T ) = 1 − δ where δ →
0. In terms of z k ’s, this condition gives z ( T ) + 2 z ( T ) − δ − δ z ( T ) = 0 (71)where z k ( T ) in the above equation is the population fraction in the Phase III at t = T . Another reason why δ > δ = 0, the above equation cannot be satisfied as both z and z are always positive. For r = 0 and w >
1, since z ≈
1, we can write δz ( T ) ≈ z ( T ) whichgives T ≈ w /w ) ln (cid:20) − δ )( w − w )( w − µδ ( w + w )( w − (cid:21) (72)which decreases monotonically as w increases. .1. No epistasis Since z ( t ) = z ( t ) due to (15) and (16), the condition (71) simplifies togive z ( T ) = (1 + z ( T )) √ − δ which leads to T ≈ w ln (cid:20) w − √ − δµ (1 − √ − δ ) (cid:21) (73)For w = 2 , w = 4 , µ = 10 − and δ = 0 .
01, the above expression yields T = 27 .
56 in excellent agreement with the result of our exact numerical iteration which gives the fixation time equal to 28 for various values of r . Using (36)-(38) at t = T in (71), we have rw w ( r − e ) (cid:18) w w (cid:19) T − τ + 2 − δ − δ (cid:18) w w (cid:19) T − τ = 0 (74)Since the first term on the LHS is exponentially decaying, we neglect it toobtain T = τ + 1ln( w /w ) ln (cid:20) − δ ) δ (cid:21) (75)where τ is given by (31). As discussed in Sec. 3.2, since τ decreases with r , the fixation time T is a decreasing function of recombination probability r when epistasis is negative. A comparison of the analytical estimate with the exact numerical result for two sets of parameters shows a good agreement (see Fig. 6). Dividing both sides of (71) by w and using (36)-(38), we find that thecontribution due to z term can be neglected as it is exponentially decaying.This gives 2 z ( τ ) ≈ δ − δ (cid:18) w w (cid:19) T − τ (76)and hence T = τ + ln [2(1 − δ ) /δ ]ln( w /w ) + ln z ( τ )ln( w /w ) (77)We first consider the w > w <
44 46 48 50 52 54 56 58 60 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =2.0, w =3.0w =1.5, w =2.0 Figure 6: Negative epistasis: Fixation time as a function of r obtained using exact iteration( • ) and analytical results ( ◦ ) given by (75) for two sets of fitnesses and µ = 10 − , δ = 0 . w = 2 . , w = 3 .
30 35 40 45 50 55 60 65 70 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 T r w =1.25, w =1.8125w =1.25, w =2.5 Figure 7: Positive epistasis: Fixation time as a function of r obtained using exact iteration( • ) and analytical result ( ◦ and × ) given by (77) and τ (upto a constant) respectively fortwo sets of fitnesses and µ = 10 − , δ = 0 . w >
1: In the above expression, z ( τ ) is given by (44) and τ by (47) for r < e/w and (46) for r > e/w . Computing T using these formulae in (77), we obtain the fixation time as a function of r shown in Fig. 7 (open circles) for fitness choices w = 1 . , w = 1 . , e/w ≈ .
14 and w = 1 . , w = . , e/w = 0 . the exact numerical results except in the vicinity of r = e/w . Figure 7 also shows τ which displays a similar behavior as T . We have already seen that the time τ increases linearly with r for r ≪ e/w but weakly for r ≫ e/w . Thus the fixation time T for e > , w > r but is weakly dependent on r for r > e/w . w <
1: The inset of Fig. 8 shows that the fixation time diverges as r ap-proaches critical recombination probability r c = ( w − /w . From (77),the fixation time T for w < τ from (69) and z ( τ ) = f ( τ )(1 + f ( τ )). The time T thus obtained (open circles) whenplotted against r c − r is compared with the results from exact iteration inFig. 8 and shows a good agreement. The approximate τ obtained using (70)is also plotted which well approximates the fixation time T . Therefore, it is21
100 1000 10000 100000 0.001 0.01 0.1 T r c -r
100 1000 10000 0.1 0.15 0.2 T r Figure 8: Compensatory mutation: Fixation time as a function of r obtained using exactiteration ( • ) and analytical results ( ◦ and × ) given by (77) and (70) respectively forparameters w = 0 . , w = 1 . , µ = 10 − , δ = 0 .
01 with r c = ( w − /w . The solidline has a slope equal to − r → r c . For ǫ →
0, (70) can be written as y y ar c (cid:2) c + 2 aw (cid:3) ≈ r c w (78)On taking logarithms both sides, the above equation reduces to2 r c w a (1 − w ) ǫ e τ ǫw ≈ ln w (1 − w )(1 − w + r c w ) w a (79)where we have neglected the linear term in τ as compared to the exponentialterm in τ . Writing a = ˜ a/ǫ , we finally obtain τ ≈ ǫw (cid:20) ln ln (cid:18) w (1 − w ) ǫ (1 − w + r c w ) w ˜ a (cid:19) − ln (cid:18) r c w ˜ a (1 − w ) ǫ (cid:19)(cid:21) (80)which decays slower than 1 /ǫ (see Fig. 8) due to the logarithmic corrections.
5. Initial Condition with nonzero linkage disequilibrium
So far, we have discussed the population dynamics starting with an initial condition in which only one genotype has a nonzero population. In this sec- tion, we consider the situation when a small finite frequency at the interme- diate loci is also present at t = 0 i.e. x (0) = 0 , x (0) = x (0) = (1 − x (0)) / As the analytical method presented in the last sections assumes that all but one frequencies is rare at a given time, it seems difficult to obtain analytical results. Therefore we present numerical results to show how the change in initial condition affects the fixation time.
As shown in Fig. 9, due to a nonzero population at intermediate loci, the fixation time at a given r is reduced as compared to the situation when only the genotype ab is present initially. For negative epistasis (Fig. 9a), the trend in the generalised situation appears similar to that discussed in Fig. 6 in that the T decreases slowly for small r but fast for large r . However for positive epistasis with w > and unlike Fig. 7 does not increase. For compensatory mutations, the fixation time shown in Fig. 10 increases as r approaches a critical recombination rate and diverges slower than ( r c − r ) − .
20 22 24 26 28 30 32 34 36 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =2.0, w =3.0w =1.5, w =2.0
16 18 20 22 24 26 28 30 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 T r w =1.25, w =1.8125w =1.25, w =2.5 Figure 9: Fixation time as a function of r obtained using exact iteration ( • ) when x (0) =0 . , x (0) = x (0) = 0 .
025 for the same parameters as in Figs. 6 and 7.
6. Conclusions
In this article, we have studied the dynamics of a 2 locus model in which the population evolves deterministically under mutation, selection and re- combination. As the recombination process makes the equations nonlinear, in general it is difficult to study such problems analytically. Here we have developed an analytical method to find the fixation time to the best locus for various fitness schemes.
The fixation time T is one of the measures for judging whether recombi- nation is beneficial for a population [6]. If the fixation time decreases with increasing recombination rate r , one may deduce that recombining popu- lation has an advantage over an asexual one. Our calculations show that when the epistasis parameter e is negative, the fixation time decreases fast for large r which suggests that high recombination rate may be beneficial for populations with negatively epistatic fitness. In a fitness landscape with positive epistasis and fitness increasing mono- tonically with mutational distance from the initial sequence, the fixation time is shown to increase with recombination. As already discussed in the
Introduction, the result that T increases with r for positive epistasis is in qualitative agreement with the expectation from the results of Eshel and Feldman [5]. Our analytical calculations show that the functional behavior of time T depends on the ratio rw /e . The fixation time is shown to increase linearly when r ≪ e/w but remains roughly constant for r ≫ e/w . A compensatory mutation is said to occur when the fitness loss caused by
100 1000 10000 100000 0.0001 0.001 0.01 0.1 T r c -r
100 1000 10000 0.15 0.2 T r Figure 10: Fixation time as a function of r obtained using exact iteration ( • ) when x (0) =0 . , x (0) = x (0) = 0 .
025 for the same parameters as in Figs. 8. The numericallydetermined critical recombination rate r c ≈ . − Te = 0 r > r (73) e < , w > r > e Decreases with r (75) e > , w > r < e/w Increases linearly with r (50) r > e/w Increases weakly with r (46) e > , w < r < r c Diverges with r c − r (80) r > r c Infinite
Table 1: Table summarising the dependence of fixation time T on recombination proba-bility r for various choices of epistasis e . one mutation is remedied by its epistatic interaction with a second mutation at a different site in the genome. For such a fitness scheme in which the initial and final fitness hills are separated by a fitness valley, it is known that an infinitely large population cannot cross the intermediate valley beyond a critical recombination rate r c [3, 5]. This implies that the fixation time diverges as r approaches r c . Our exact numerical results for fixation time when plotted against r c − r on a double logarithmic scale indicate a power law decay. Assuming that the divergence is purely algebraic, a fit to the numerical data then gives T ∼ ( r c − r ) − . . However our calculation that takes the nonlinearities into account shows that the divergence is actually a power law with logarithmic corrections. Here we have focused on the evolution of deterministic population but it is important to include drift effects as the real populations have a finite size N . The finite population problem with compensatory mutation has been studied in certain parameter regimes using simulations and within a diffu- sion approximation. The analytical calculations of [26, 9, 8] and numerical simulations of [18, 23] for a two-locus model with compensatory mutation indicates that for fixed s and N , a finite population manages to reach the best locus for any r and the fixation time increases with r as seen for in- finite population. However it is not clear how the population approaches the infinite N limit. For e > , w >
1, it is found numerically there exits a critical epistasis value below which the fixation time decreases [20]. An analytical understanding of such interesting aspects of the interplay between recombination and drift remains an open problem.
Acknowledgement: The author is grateful to J. Krug for introducing her to this problem and useful discussions. She also thanks S.-C. Park for com- part of this work was done.
References [1] C.M. Bender and S.A. Orszag.
Advanced Mathematical Methods for
Scientists and Engineers . Springer, 1999. [2] R. B¨urger. Linkage and the maintenance of heritable variation by mutation-selection balance.
Genetics , 121:175–184, 1989. [3] J.F. Crow and M. Kimura. Evolution in sexual and asexual populations.
Am. Nat. , 99:439–450, 1965. [4] J.A.G.M. de Visser, S.-C. Park, and J. Krug. Exploring the effect of sex on an empirical fitness landscape.
Am. Nat. , 174:S15-S30, 2009. [5] I. Eshel and M.W. Feldman. On the evolutionary effect of recombination.
Theo. Pop. Biol. , 1:88–100, 1970. [6] M.W. Feldman, S.P. Otto, and F.B. Christiansen. Population genetic perspectives on the evolution of recombination.
Annu. Rev. Genet. , [7] J. Felsenstein. The evolutionary advantage of recombination. Genetics , [8] P.G. Higgs. Compensatory neutral mutations and the evolution of RNA. Genetica , 102/103:91–101, 1998. [9] M. Iizuka and M. Takefu. Average time until fixation of mutants with compensatory fitness interaction.
Genes Genet. Syst. , 71:167–173, 1996. [10] K. Jain. Loss of least-loaded class in asexual populations due to drift and epistasis.
Genetics , 179:2125, 2008. [11] K. Jain and J. Krug. Adaptation in simple and complex fitness land- scapes. In U. Bastolla, M. Porto, H.E. Roman, and M. Vendruscolo, editors,
Structural Approaches to Sequence Evolution: Molecules, Net- works and Populations , pages 299–340. Springer, Berlin, 2007.
Trends Ecol.
Evol. , 11:41–46, 1996. [13] A.S. Kondrashov. Muller’s ratchet under epistatic selection.
Genetics , [14] F.A. Kondrashov and A.S. Kondrashov. Multidimensional epistasis and the disadvantage of sex. Proc. Natl. Acad. Sci. U.S.A. , 98:12089–12092, [15] R.D. Kouyos, S.P. Otto, and S. Bonhoeffer. Effect of varying epistasis on the evolution of recombination.
Genetics , 173:589–597, 2006. [16] R.D. Kouyos, O.K. Silander, and S. Bonhoeffer. Epistasis between dele- terious mutations and the evolution of recombination.
Trends Ecol.
Evol. , 22:308–315, 2007. [17] L. Loewe. Quantifying the genomic decay paradox due to Muller’s ratchet in human mitochondrial DNA.
Genet. Res. Camb. , 87:133–159, [18] Y. Michalakis and M. Slatkin. Interaction of selection and recombination in the fixation of negative-epistatic genes.
Genet. Res. , 67:257–269, 1996. [19] H. J. Muller. The relation of recombination to mutational advance.
Mutation Res. , 1:2–9, 1964. [20] S.P. Otto, M.W. Feldman, and F.B. Christiansen. Some advantages and disadvantages of recombination. In S.A. Levin, editor,
Frontiers in
Mathematical Biology , pages 198–211. Berlin: Springer Verlag, 1994. [21] S.P. Otto and T. Lenormand. Evolution of sex: Resolving the paradox of sex and recombination.
Nat. Rev. Genet. , 3:252, 2002. [22] S.-C. Park and J. Krug. Bistability in two-locus models with selection, mutation, and recombination, submitted to Theo. Pop. Biol. [23] P.C. Phillips. Waiting for a compensatory mutation: Phase zero of the shifting-balance process.
Genet. Res. , 67:271–283, 1996. structure and evolution of genetic systems.
Nat. Rev. Genet. , 9:855– [25] W. R. Rice. Evolution of sex: Experimental tests of the adaptive signif- icance of sexual recombination.
Nat. Rev. Genet. , 3:241–251, 2002. [26] W. Stephan. The rate of compensatory evolution.
Genetics , 144:419– [27] D. M. Weinreich, R. A. Watson, and L. Chao. Sign epistasis and genetic constraint on evolutionary trajectories.
Evolution , 59:1165–1174, 2005. [28] J.B. Wolf, E.D.I. Brodie, and M.J. Wade, editors.
Epistasis and the evolutionary process . Oxford Univ. Press, New York, 2000.290