A dynamical phase transition in a model for evolution with migration
aa r X i v : . [ q - b i o . P E ] A ug A dynamical phase transition in a model for evolution with migration
Bartłomiej Waclaw, Rosalind J. Allen, and Martin R. Evans
SUPA, School of Physics and Astronomy, University of Edinburgh,Mayfield Road, Edinburgh EH9 3JZ, United Kingdom
Migration between different habitats is ubiquitous among biological populations. In this Letter,we study a simple quasispecies model for evolution in two different habitats, with different fitnesslandscapes, coupled through one-way migration. Our model applies to asexual, rapidly evolvingorganisms such as microbes. Our key finding is a dynamical phase transition at a critical value ofthe migration rate. The time to reach steady state diverges at this critical migration rate. Abovethe transition, the population is dominated by immigrants from the primary habitat. Below thetransition, the genetic composition of the population is highly non-trivial, with multiple coexistingquasispecies which are not native to either habitat. Using results from localization theory, we showthat the critical migration rate may be very small – demonstrating that evolutionary outcomes canbe very sensitive to even a small amount of migration.
PACS numbers: 87.23.Kg, 87.10.-e, 02.50.-r, 05.70.Fh
Biological dispersal—the movement of organisms be-tween habitats—is a ubiquitous phenomenon with im-portant and wide-ranging consequences. In the natu-ral environment, organisms expand their ranges, colonisenew habitats, and can undergo speciation if they becomespatially isolated. Dispersal plays a key role in deter-mining spatial and temporal patterns of genetic diver-sity in all organisms [1]. For sexual organisms, withlow mutation rates, population subdivision into demes,connected by migration, can have important effects ongenetic diversity [2, 3], while in continuous space, trans-mission of unfit alleles can prevent the expansion of aspecies’ range [4]. For asexual, rapidly evolving organ-isms such as bacteria and viruses, dispersal also facilitatesthe emergence of new diseases and resistance to knowntreatments. The “source-sink” paradigm [5, 6], in whichmigration from a favourable habitat maintains organismsin an unfavourable one, has recently been used to ex-plain the microbial genetics of urinary tract infections [7].However, despite its importance, a general understand-ing of how migration affects mutation-selection balancein microbial systems is lacking. In particular, one wouldlike to know how migration changes the proportions ofdifferent genotypes in the evolving population.In order to study the role of migration we introduce inthis letter a simple statistical physics model for the evo-lutionary dynamics of migrating asexual organisms. Ourmodel comprises two environmental habitats (with dif-ferent fitness landscapes) coupled by one-way migrationof organisms from the primary to the secondary habitat.Using a quasispecies approach, we find that the model un-dergoes a dynamical phase transition: at a critical valueof the migration rate, the time to reach the steady statediverges. For sub-critical migration rates, the steady-state population in the secondary habitat is made up ofthe organisms “native” (best adapted) to this habitat, as well as other, non-trivial, quasispecies, which are not na-tive to either habitat. Above the critical migration rate,the native quasispecies in the secondary habitat is wipedout by immigrants from the primary habitat. We useresults from localization theory to gain insight into thetransition and to show that the critical migration rate istypically small, demonstrating that even a small amountof migration can have an important effect on evolutionarydynamics.In our model, organisms have M possible genotypes. N i and n i denote the abundance (number density) of or-ganisms with genotype i in the primary and secondaryhabitat, respectively. The populations in the two habi-tats are thus described by the vectors ~N = ( N , . . . , N M ) and ~n = ( n , . . . , n M ) . Organisms migrate from the pri-mary to the secondary habitat with rate k . Within eachhabitat, mutations transform organism i to j with rate γA ij , where A ij is a symmetric adjacency matrix, to bediscussed later. Organisms of type i reproduce at a rate Φ i − P j N j in the primary habitat and φ i − P j n j in thesecondary habitat. The vectors ~ Φ = (Φ , . . . , Φ M ) and ~φ = ( φ , . . . , φ M ) thus describe the fitness landscapes (orthe maximal growth rate for organisms with genotype i )in each habitat. The terms − P j N j and − P j n j in thegrowth rates account for population saturation due to fi-nite resources, as in the logistic equation. This model isbased on the para-mu-se (parallel mutation and selection)[8] version of quasispecies theory [9], widely discussed inthe biological, chemical and physical literature [10–15].The time evolution of the system is governed by thefollowing set of equations for i = 1 , . . . , M : ˙ N i = N i (Φ i − X j N j ) + γ X j A ij ( N j − N i ) , (1) ˙ n i = n i ( φ i − X j n j ) + γ X j A ij ( n j − n i ) + kN i , (2)where we have assumed that the primary habitat is large,so that the loss of individuals due to migration has a neg-ligible effect on its population [16]. For the calculationspresented here, we suppose that the fitness values Φ i , φ i are independent random numbers drawn from a distribu-tion P ( ϕ ) , common to both environments. Thus genomeswhich are well-adapted in the primary habitat are likelyto be maladapted in the secondary habitat.We first present the analytical solution for the steadystate [17]. For the primary habitat it is known from qua-sispecies theory [8, 9] that the steady-state abundances ~N ∗ are ~N ∗ = Λ ( ~ Ψ T · ~e ) ~ Ψ , (3)with ~e = (1 , . . . , . Here Λ is the largest eigenvalue ofthe matrix W ij = δ ij Φ i + γ ∆ ij , (∆ ij = A ij − δ ij P k A ik being the graph Laplacian), and ~ Ψ is the correspondingeigenvector. We now determine the steady-state geno-type abundances in the secondary habitat by expandingin the eigenbasis of V ij = δ ij φ i + γ ∆ ij ~n ∗ = k M X α =1 ~ψ αT · ~N ∗ n tot − λ α ~ψ α , (4)where ~ψ α and λ α are the eigenvectors and eigenvaluesof V ij (ordered as λ > λ > . . . ) and n tot is the totalsteady-state population in the secondary habitat, whichis determined self-consistently as the largest root of n tot = k M X α =1 ( ~ψ Tα · ~N ∗ )( ~ψ Tα · ~e ) n tot − λ α . (5)To proceed further, we now make some specific as-sumptions about the structure of the genome space (themutation matrix A ij ) and the fitness landscape P ( ϕ ) . Tothis end, we suppose that the mutation graph is a one di-mensional closed chain, in which mutations are possibleonly between neighbouring genotypes ( i.e. A ij = 1 if i = ( M + j ± M , and zero otherwise). We furthersuppose that the fitness can take only two values: and with probability p and − p , respectively. Since it hasbeen suggested that viable genotypes form an intercon-nected network in genome space [18], we shall considerthe case p ≈ , so that the fitness landscape is charac-terised by “islands” of fit genotypes separated by unfitones.Figure 1 shows how the population composition inthe secondary habitat depends on the migration rate k .When k is very small, the steady-state distribution n ∗ i ispeaked around the longest sequence of maximal fitnessvalues: this peak corresponds to the “native” (or best-adapted) quasispecies for the secondary habitat. When k is very large (much larger than the mutation rate γ ),the secondary habitat becomes dominated by immigrants PSfrag replacements k = 0 k = 0 . k = 0 . k = 0 . k = 0 . k = 1 ii n ∗ i n ∗ i n ∗ i N ∗ i Figure 1: Examples of steady-state genotype abundances inthe secondary habitat n ∗ i , for different values of the migra-tion rate k where p = 0 . , γ = 0 . and M = 128 . Theseresults were obtained by numerical self-consistent solution ofEqs. (3), (4) and (5). The top left panel shows zero abun-dances ( n ∗ i = 0 ) in the absence of migration. The inset showsthe abundances N ∗ i in the primary habitat (red line). Thetop right panel shows the “native” steady-state n ∗ i (blue line),for very small migration rate, as well as the fitness landscape φ i / (black line). The other panels show n ∗ i in the sec-ondary habitat, for various values of the migration rate k . × × × × PSfrag replacements k/k T T k Figure 2: Numerical results (from Eq. (2), using Eq. (3) for N ∗ i )) for the time T to reach the steady-state, starting from N i = N ∗ i and n i = 0 . Left panel: T as a function of migrationrate k , for the same system as in Fig. 1. The steady state wasassumed to have been reached when P i | n i ( t + 1) /n i ( t ) − | /M < − . Right panel: T ( k/k ) , where k is determinedfrom Eq. (6), for M = 64 , γ = 0 . , p = 0 . , normalized sothat T (10) = 1 . Results for 20 representative sets of ~ Φ , ~φ arepresented on a log-log plot. from the primary habitat and the distribution of n ∗ i tendsto the primary-habitat steady-state distribution N ∗ i .In contrast, for intermediate migration rates, the ge-netic composition in the secondary habitat is highly non-trivial. As k increases from zero, the quasispecies nativeto the secondary habitat is joined by additional, non-native, quasispecies peaks. These do not correspond tothe native quasispecies from the primary habitat, but × -4 × -4 PSfrag replacements t = 1 t = 2 t = 2 t = 2 . t = 2 t = 2 t = 2 n i n i n i n i n i n i iik = 0 . k = 0 . Figure 3: Genetic composition in the secondary habitat n i ( t ) ,during the approach to steady state, for the same system asin Figure 1, for two migration rates k = 0 . ( k ≪ k , blackline, left) and k = 0 . ( k ≫ k , red, line right, the samevertical scale), for t = 1 , . . . , . are instead determined by the overlap of eigenvectors inthe primary and secondary habitats (as in Eq. (4)). Asthe migration rate is increased slightly further from 0.002to 0.003, these new peaks dominate completely and thenative quasispecies of the secondary habitat disappears.This effect can be triggered by a very moderate change inthe migration rate. The appearance of these new quasis-pecies peaks suggests that migration coupled to mutationcan provide a mechanism for generation and maintenanceof genetic diversity (as will be shown later in Figure 4).Finally, as the migration rate increases further, the newpeaks merge into the native quasispecies peak from theprimary habitat.Figure 2 (left panel) shows that this non-trivial depen-dence of the steady state population on the migrationrate is accompanied by striking changes in the systemdynamics. The time to reach the steady state plotted asa function of k shows a striking maximum at k ≈ . ,suggesting a critical slowing down and a likely dynami-cal phase transition. The approach to the steady statefor k ≪ k is much slower than for k ≫ k . Figure 3illustrates the underlying reason for this. Here we plotsnapshots of ~n at various moments in time during theapproach to the steady state, for the same parameterset and fitness landscape, for migration rates below andabove k . For both migration rates, the immigrating pop-ulation initially has the same composition as the primary habitat. For k ≪ k , the primary habitat quasispeciespeak is lost entirely and the system undergoes a slowprocess of jumps between various local fitness maximabefore finally settling in the global optimum. In con-trast, for k ≫ k , the system rapidly relaxes to a steadystate which overlaps strongly with that of the primaryhabitat.Returning to our analytical expressions for n ∗ i , Eqs. (4)and (5), we can estimate the critical migration rate k atwhich the dynamical phase transition takes place. Equa-tion (4) expresses ~n as a sum of eigenvectors ~ψ for thesecondary habitat, weighted by their overlap with ~N ∗ .When k → , n tot → λ ≃ and ~n → ~ψ (see below).This is the native quasispecies solution for the secondaryhabitat [19]. The phase transition occurs when this so-lution becomes dominated by the contributions from theother terms ( α = 2 , . . . , M in Eqs. (4) and (5)), whicharise from overlap with the primary habitat solutions.This happens at a migration rate approximately givenby k = λ M X α =2 ( ~ψ Tα · ~N ∗ )( ~ψ Tα · ~e ) λ − λ α ! − . (6)To show that this result indeed corresponds to the criticalmigration rate at which the transition happens, we plotin Figure 2, right panel, the time T to reach steady stateas a function of k/k , where k is determined from (6),for simulated dynamics on ≈ representative randomfitness sequences ~ Φ , ~φ . Each fitness landscape generates aslightly different curve T ( k/k ) , but all the curves appearto diverge at k = k , indicating a phase transition.We now briefly discuss how the steady-state propertiesof the system are affected by this phase transition. For k < k , the α = 1 terms in Eqs. (4) and (5) dominate,while for k > k , the terms α = 2 , . . . , M dominate.This has important consequences for the total population n tot in the secondary habitat: for k < k , n tot ≃ [19],while for k > k , n tot grows with k . This prediction isconfirmed numerically in Fig. 4, left panel, for a numberof randomly generated fitness landscapes.Another important steady-state property is genetic di-versity. We noted from Figure 1 that multiple quasis-pecies peaks can coexist in steady state for intermediatemigration rates. Figure 4, right panel, plots the partici-pation ratio (PR) r = ( P i n ∗ i ) / P i ( n ∗ i ) , as a functionof k/k , for several realisations of the fitness landscape.The PR is a convenient measure of diversity, which showshow many of the n ∗ i ’s are much larger than zero. Fig-ure 4 shows that r reaches a maximal value at about k ≈ . − . k and then decreases as k approaches k ;above k , the diversity remains approximately constant.Thus for weak migration the genetic diversity is increasedwhereas for strong migration it is washed out.For the one-dimensional model considered here, re-sults from localisation theory [20] allow us to estimate PSfrag replacements k/k k/k r / r n t o t Figure 4: Left: Total steady-state population n tot in the sec-ondary habitat, as a function of rescaled migration rate k/k ,for M = 128 , p = 0 . , γ = 0 . , for a number of typical randomfitness landscapes. Right: The normalized participation ratio r/r for r = r ( k → , for the same set of fitness landscapes. the value of the critical migration rate k . The eigen-vector equation for ~ Ψ α in the primary habitat mapsonto a Schrödinger equation with random potential U j = − Φ j /γ : − (∆ ~ Ψ α ) j + U j ( ~ Ψ α ) j = E α ( ~ Ψ α ) j , (7)where E α = − Λ α /γ , and likewise for an eigenvector ψ α in the secondary habitat. Equation (7) is essen-tially a 1D tight-binding electron model [21], in which U j = − /γ with probability p and U j = 0 with prob-ability − p . Localization theory tells us that for thisproblem the ground state eigenvector is localized, tak-ing the form Ψ ,j ∼ sin( jπ/w ) on the longest run w ofconsecutive sites with U j = − /γ , and has eigenvalue Λ ≃ − γπ /w . Eigenvectors corresponding to ex-cited states are similarly localized on other, shorter po-tential wells. To estimate k , we observe that the largestcontribution to the sum in (6) comes from the eigenvec-tor with the greatest overlap with ~N ∗ , which we denote ~ψ β . Assuming that ~N ∗ and ~ψ β are localized on poten-tial wells of length w and v , respectively, we can esti-mate that ( ~ψ Tβ · ~N ∗ )( ~ψ Tβ · ~e ) ∼ v/w . The lengths w, v are the longest runs of U j = − /γ in sequences of in-dependent binary random numbers of length M and w ,respectively, therefore w ≃ ln( M (1 − p )) / ln(1 /p ) and v ≃ ln( w (1 − p )) / ln(1 /p ) . For large M , v is much smallerthan w , so λ − λ β ≃ γπ /v . Inserting this into Eq. (6),and setting ǫ = 1 − p , we finally obtain h k i ∼ γw/v ≃ γǫ ln( M ǫ )(ln ln
M ǫ ) . (8)Remarkably, this rough estimate agrees up to a factor ≈ with our simulation results. Here we have consideredsmall ǫ , where multiple fit genomes lie close together ingenotype space, and we see from (8) that h k i is muchsmaller than γ for moderately large M ǫ . This means thateven a very small migration rate (smaller than the muta-tion rate) can dramatically change the course of evolutionin the secondary environment [22]. In summary, we have introduced a simple model forthe evolution of asexual organisms in two coupled habi-tats with different fitness landscapes. We have shownthat a dynamical phase transition occurs as the migra-tion rate changes. Bifurcations caused by migration havebeen observed in several models of sexual populations[3, 4] but, to our knowledge, the present work is the firstto consider the effects of migration on the evolutionarydynamics of asexual organisms from a quasispecies per-spective. In our model, at the critical migration rate, thepopulation in the secondary habitat becomes dominatedby immigrants from the primary habitat. For subcriticalmigration rates, our quasispecies model also reveals thatmigration can provide a novel mechanism for creationand maintenance of genetic diversity.To obtain analytical results and clear insights into thephysics of the model, we have mainly considered a simpleone-dimensional closed-chain representation of the geno-type space and binary random fitness landscapes. As astep towards more complex and realistic representationsof the genome space and fitness landscape, we have alsocarried out numerical simulations for a continuous, uni-form distribution of the fitness, as well as a hypercubicmutation graph. Our key results (in particular the dy-namical phase transition as a function of migration rate)remain valid in these cases, suggesting that our findingsare likely to be of general significance. It will be inter-esting to extend our work to empirical fitness landscapesgenerated from experimental data [23], and, inspired byexisting models for sexual organisms [3, 4], and recentmodels in microbial ecology [24], to multiple connectedhabitats and spatially varying environments.The authors are grateful to N. H. Barton, R. A. Blythe,A. Free, and J.-M. Luck for valuable discussions. R.J.A.was supported by the Royal Society. This work wasfunded by the EPSRC under grant EP/E030173. [1] A. K. Sakai, et al., Annu. Rev. Ecol. Syst. , 305 (2001).[2] R. Bürger, J. Math. Biol. , 939-997 (2009); T. Nagylakiand Y. Lou, in Tutorials in Mathematical Biosciences IV ,edited by A. Friedman, Vol. 1922, p. 119 (Springer, Berlin2008).[3] R. Lande, Evolution , 234 (1979).[4] M. Kirkpatrick and N. H. Barton, Am. Nat. , 1(1997).[5] R. D. Holt, Theor. Popul. Biol. , 181 (1985).[6] H. R. Pulliam, Am. Nat. , 652 (1988).[7] S. Chattopadhyay et al., Proc. Natl Acad. Sci. USA ,12412 (2009).[8] E. Baake and H. Wagner, Genet. Res. Camb. , 93(2001).[9] M. Eigen, P. Schuster, Naturwiss. , 541 (1977).[10] C. O. Wilke, BMC Evolutionary Biology , 44 (2005).[11] M. A. Nowak, Trends in Ecology and Evolution, (4),118 (1992). [12] K. Jain and J. Krug, in Adaptation in Simple and Com-plex Fitness Landscapes , p. 299 (Springer Berlin 2007).[13] L. Demetrius, J. Chem. Phys. (12), 6939 (1987).[14] I. Leuthäusser, J. Stat. Phys. , 343 (1987).[15] E. Baake, M. Baake and H. Wagner, Phys. Rev. Lett. ,559 (1997).[16] The inclusion of an extra term − kN i in Eq. (1) to accountfor the loss of organisms from the primary habitat dueto migration would be equivalent to adding a constantto the fitness landscape ~ Φ and would have no qualitativeeffect on the results reported here.[17] B. Waclaw, R. J. Allen and M. R. Evans, to be published[18] S. Gavrilets, Am. Nat. , 1 (1999).[19] n tot ≃ is a consequence of our choice of constants anddoes not imply a single organism. For example, choosing Φ i = c for fit phenotypes yields n tot ≃ c [20] I. M. Lifshitz, Adv. Phys. , 483 (1964); Sov. Phys. –Uspekhi , 549 (1965).[21] K. Ishii, Supp. of Progress of Theor. Phys. , 77 (1973).[22] One can perform a similar analysis for p ≪ , for whichfit genomes are widely separated in genome space, andcheck that h k i ∼ λ = p γ − γ . The migrationthreshold is of order in this limit, but the conclusionthat migrants inundate better adapted genotypes at somecritical migration rate k remains valid.[23] D. M. Weinreich, et al., Science , 111 (2006); F. J.Poelwijk et al., PLoS Comput. Biol.2