Simple model for the Darwinian transition in early evolution
arXiv [q-bio.PE], Jan
Hinrich Arnoldt (a), Steven H. Strogatz (b), and Marc Timme (∗, a, c)
(a) Network Dynamics, Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany
(b) Department of Mathematics, Cornell University, Ithaca, New York 14853, USA
(c) Institute for Nonlinear Dynamics, Faculty of Physics, Georg August University Göttingen, 37077 Göttingen, Germany
∗ Corresponding author: [email protected]
It has been hypothesized that in the era just before the last universal common ancestor emerged, life on earth was fundamentally collective. Ancient life forms shared their genetic material freely through massive horizontal gene transfer (HGT). At a certain point, however, life made a transition to the modern era of individuality and vertical descent. Here we present a minimal model for this hypothesized “Darwinian transition.” The model suggests that HGT-dominated dynamics may have been intermittently interrupted by selection-driven processes during which genotypes became fitter and decreased their inclination toward HGT. Stochastic switching in the population dynamics with three-point (hypernetwork) interactions may have destabilized the HGT-dominated collective state and led to the emergence of vertical descent and the first well-defined species in early evolution. A nonlinear analysis of a stochastic model dynamics covering key features of evolutionary processes (such as selection, mutation, drift and HGT) supports this view. Our findings thus suggest a viable route from early collective evolution to the start of individuality and vertical Darwinian evolution.
Keywords:
Evolutionary dynamics, population genetics, horizontal gene transfer, early evolution, emergence of the first species, hypernetwork dynamics, Darwinian threshold
I. INTRODUCTION
A. The last universal common ancestor
In the final chapter of “On the Origin of Species”, Charles Darwin speculated that all life on earth may have descended from a common ancestor. As he observed, “all living things have much in common, in their chemical composition, their germinal vesicles, their cellular structure, and their laws of growth and reproduction. . . . Therefore I should infer from analogy that probably all the organic beings which have ever lived on this earth have descended from some one primordial form, into which life was first breathed” [1].
A century after Darwin, molecular biology provided new lines of circumstantial evidence for a universal common ancestor. All organisms were found to use the same molecule (DNA) for their genetic material, as well as a canonical look-up table (the genetic code) for translating nucleotide sequences into amino-acid sequences [2–4]. Further clues came from cross-species comparisons of the molecules involved in the most fundamental processes of life, such as protein synthesis, core metabolism, and the storage and handling of the genetic material. The first such analysis [5], based on snippets of ribosomal RNA, provoked a revolution in our understanding of life’s family tree [6–8]. It indicated that life is divided into three different domains: the Archaea, the Bacteria and the Eucarya [5, 6, 8]. Later studies using other molecular sequences placed the root of the tree, corresponding to the last universal common ancestor, somewhere between the Bacteria and Archaea [9–13], roughly . − . billion years ago. The nature of the last universal common ancestor, however, remains unresolved: Was it prokaryotic or eukaryotic? Did it thrive in extreme or moderate temperatures? Was its genome based on RNA or DNA? For a review, see Ref. [14] and references therein.
B. The era of collective evolution
Our work in this paper was inspired by a conjecture proposed by Woese and his colleagues [15–18]. According to this conjecture, the last universal common ancestor was a community, not a single creature. It marked a turning point in the history of life: before it, evolution was collective and dominated by horizontal gene transfer; after it, evolution was Darwinian and dominated by vertical gene transfer.
In Woese’s scenario, life in the epoch leading up to the universal ancestor was intensely communal. It was organized into loose-knit consortia of protocells far simpler than the bacteria or archaea we know today. Woese and Fox [15] called those hypothetical ancient life forms “progenotes.” The term signifies that the coupling between genotype and phenotype had not yet fully evolved, mainly because the process for translating genes into proteins had not yet fully evolved either. A rudimentary form of translation existed, but it was ambiguous and hence had a statistical character. Instead of producing a single protein, early translation produced a cloud of similar proteins. This ambiguity in protein synthesis in turn limited the specificity of all the progenote’s interactions. For example, lacking the large, complex proteins necessary for accurate copying and repair of the genetic material, the progenote’s genome was tiny and subject to high mutation rates.
Progenotes were not well-defined organisms as such, because they had no individuality and no long-term genetic pedigree. Their genes and component parts could come and go, being swapped in or out with other members of the community via horizontal transfer. But because biochemical innovations produced by any member of the community were available to all, evolution at this time was rapid, probably more rapid than at any time since. Selection acted on whole communities, not on individual progenotes. Those communities that were better at sharing their biochemical breakthroughs flourished. Out of this cauldron of evolutionary innovation, the universal genetic code and its translational machinery coevolved, in response to the selective pressures favoring efficient sharing and interoperability.
Vetsigian, Woese, and Goldenfeld [19] confronted and constrained these speculations with mathematical analysis and computer simulations. Going beyond Woese’s conjectures, they probed early evolution scientifically by interpreting the available data on the genetic code. Their dynamical model for the evolution of the genetic code [19] showed that a collective state of life is required to obtain the observed [20, 21] statistical properties of the code, in particular, its simultaneous universality and optimality. A later study by Goldenfeld and colleagues [22] provided further evidence that only a collective state of life could have created the highly optimized code used by all life today [21].
C. The Darwinian transition
How did the era of collective evolution come to an end? Woese speculated that as the translation process began to improve, and as progenotic subsystems became increasingly complex and specialized, it would have become harder to find foreign parts compatible with them. Thus, horizontal gene transfer would have become increasingly difficult. The only possible modifications at this point would have come from within the progenote’s lineage itself, through mutation and gene duplication. It was in this way that the progenotes would have made the Darwinian transition [17, 18] to become “genotes,” i.e., life forms with a tight coupling between their nucleic acid genotypes and their protein phenotypes, and that could therefore evolve through the familiar Darwinian process of vertical descent.
The model considered below is an attempt to explore, in mathematical terms, how the Darwinian transition from the collective state to the modern era of individuality might have taken place. Our approach shares with Ref. [19] the outlook that a dynamical systems calculation should be devised to support or refute the hypotheses considered. Our results lend support to the proposed collective state of life [15–19] by providing a potential mechanism for the exit from that state.
D. Horizontal gene transfer
Over the last decades more and more evidence has accumulated that, besides selection, mutation, and drift [23], another process drives evolution: horizontal gene transfer. Here we briefly review the main points about horizontal gene transfer relevant to the mathematical model developed below.
While reproduction implies a vertical transfer of genes from one entity to the next in the phylogenetic tree, there are also processes in which possibly unrelated individuals exchange genetic material during their lifetimes, i.e., horizontally in the sense of the tree. This transfer of genes within one generation is consequently termed horizontal gene transfer (HGT) [24–27] or lateral gene transfer. It is now widely accepted that HGT is a fundamental driving force of evolution [25, 28–33], and that its existence raises profound theoretical problems for evolutionary biology. For example, the longstanding problem of defining bacterial species [30, 34–38] is due, in part, to the promiscuous use of HGT by bacteria. A recent primer on horizontal gene transfer and its potential for evolutionary processes in general is given in [39].
As discussed above, if HGT was rampant in the early stages of evolution, the last universal common ancestor was a community, not a single organism [16, 17, 19, 40]. In this collective state, individuals could not yet be distinguished, as each progenote’s genes were frequently exchanged through HGT. In terms of the model to be developed below, the total pool of genotypes in the collective state would be spread out and thus broadly distributed in the state space of all theoretically possible genotypes. Conversely, a genotype distribution that is highly localized in state space, being concentrated on just one or a few genotypes, would be the model’s version of a well-defined species.
Woese postulated that as the collective state of the progenote population slowly evolved toward higher complexity, its rate of HGT slowly decreased [17].
At some point the system crossed the “Darwinian threshold” [17]. Then natural selection instead of HGT started to dominate the dynamics. The fitter individuals were selected for and the first species emerged from the distributed state. In the colorful language of Dyson [41]:
“But then, one evil day, a cell resembling a primitive bacterium happened to find itself one jump ahead of its neighbors in efficiency. That cell, anticipating Bill Gates by three billion years, separated itself from the community and refused to share. Its offspring became the first species of bacteria – and the first species of any kind – reserving their intellectual property for their own private use. With their superior efficiency, the bacteria continued to prosper and to evolve separately, while the rest of the community continued its communal life. Some millions of years later, another cell separated itself from the community and became the ancestor of the archaea. Some time after that, a third cell separated itself and became the ancestor of the eukaryotes.”
After making a Darwinian transition, evolution proceeds in the familiar vertical manner, being driven by selection, mutation, and drift [23], with HGT playing only a minor role. Such Darwinian dynamics have, of course, been studied extensively in both experimental and model settings [23, 42–46]. Compared to the dynamics of HGT, their properties are relatively well understood. Recently, potential influences of HGT on such evolutionary dynamics have been investigated [16, 19, 47–51]. Some mathematical models of HGT have focused on how it can increase a population’s fitness in Darwinian evolution [52]. Keep in mind, however, that the hypothesized HGT associated with progenotes and the Darwinian transition, being associated with ribosomal genes and the rest of the core machinery of the cell, would have been of far greater evolutionary significance than the HGT of, say, antibiotic resistance genes seen in bacterial communities today.
In Woese’s scenario, the ancient form of HGT was rampant, pervasive, and tremendously disruptive and innovative. It was the prime mover in shaping the fabric of the cell [18].
We would like to understand what such a Darwinian transition would look like, mathematically. The model described in the next section is deliberately minimal. It leaves out all the biology of ribosomes, proteins, genetic codes, and the like. What remains is an attempt to capture the essence of Woese’s speculations. In place of a community of progenotes, we consider a community of abstract genomes, represented by binary sequences. They interact via HGT, and are subject to mutation, selection, and drift on a fitness landscape. Our work suggests that HGT-dominated dynamics may have been intermittently interrupted by selection-driven processes during which genotypes became fitter and decreased their inclination toward HGT. Such stochastic switching in the nonlinear population dynamics may have destabilized the HGT-dominated state and thus led to a Darwinian transition and the emergence of the first species in early evolution.
On a side note, an interesting mathematical aspect of the model is that it necessarily involves three-point interactions, since HGT transforms one genotype into a second by importing pieces of a third. Thus the model provides a natural biological example of a complex hypernetwork. Until now, most models in evolutionary dynamics and population biology did not need to go beyond ordinary network structure, with two-point interactions between nodes connected by links.
II. EVOLUTIONARY MODEL
To explore the consequences of HGT on evolution, we consider a model community of $N$ progenotes evolving on a fitness landscape [23, 53] in the presence of selection, mutation, drift, and HGT. Each progenote carries a genome of length $l$ composed of a sequence of the bases 0 and 1. The genome of progenote $i$ determines its fitness $f_i$. The progenotes reproduce by the Moran process [23, 54], i.e., each progenote reproduces randomly in time, with its reproduction rate given by its fitness $f_i$. Whenever a progenote of genotype $i$ reproduces, an offspring is added to the population which is either identical to genotype $i$ or, with probability $\mu_{ij}$, a mutant of genotype $j$. Instantaneously after such a reproduction event, one progenote in the population is chosen randomly to die and is hence removed from the population. We assume that one mutation event will only affect one of the bases of the genome, so that the Hamming distance between genotypes $i$ and $j$ is 1.
Hence, our fitness landscape may be represented by a network where the different genotypes are the nodes of the network and the possible mutations form the links. With two different bases, 0 and 1, and given the structure of the mutations, the resulting network is an $l$-dimensional hypercube (Fig. 1).
The fitness landscape underlying our model is assumed to be a Mount Fuji landscape [23]: The highest fitness is assigned to one single genotype, the peak. Other genotypes are assigned lower fitness: the farther away from the peak in genotype space, the lower the fitness. Thus, a single-peaked mountain landscape is created on genotype space, and a population evolving purely through the processes of selection and mutation should converge to this peak. Note that the Moran process described above is a random process. It thereby constitutes a minimal model intrinsically including the effects of selection, mutation and genetic drift [23].
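The reproduction part of the model can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: genotypes are encoded as integers whose bits are the bases, fitness decays linearly with Hamming distance from the peak (the text only requires a monotone decrease), a single total mutation probability `mu` is spread uniformly over the $l$ one-bit neighbors, and the newborn is included among the candidates for death. The function names and the parameter values in the usage lines are hypothetical.

```python
import random

def hamming(a: int, b: int) -> int:
    """Number of differing bases between two genotypes encoded as integers."""
    return bin(a ^ b).count("1")

def fuji_fitness(g: int, peak: int, l: int, f_min: float, f_max: float) -> float:
    """Single-peaked landscape; the linear decay with distance is an assumption."""
    return f_max - (f_max - f_min) * hamming(g, peak) / l

def moran_step(counts: dict, l: int, mu: float, fitness, rng: random.Random) -> None:
    """One reproduction-plus-death event; `counts` maps genotype -> number of progenotes."""
    genotypes = list(counts)
    # The reproducing genotype is drawn with probability proportional to count * fitness.
    parent = rng.choices(genotypes, weights=[counts[g] * fitness(g) for g in genotypes])[0]
    # With probability mu the offspring differs from the parent in exactly one base.
    child = parent ^ (1 << rng.randrange(l)) if rng.random() < mu else parent
    counts[child] = counts.get(child, 0) + 1
    # One progenote is chosen uniformly at random to die (newborn included: an assumption).
    genotypes = list(counts)
    dead = rng.choices(genotypes, weights=[counts[g] for g in genotypes])[0]
    counts[dead] -= 1
    if counts[dead] == 0:
        del counts[dead]

# Usage with hypothetical parameters: 50 progenotes, genome length 4, peak at 1111.
rng = random.Random(0)
fit = lambda g: fuji_fitness(g, peak=0b1111, l=4, f_min=0.1, f_max=1.0)
population = {0b0000: 50}
for _ in range(200):
    moran_step(population, 4, 0.05, fit, rng)
```

Because every event adds one offspring and removes one progenote, the population size $N$ stays constant, which is the property the rest of the analysis relies on. Waiting times between events are not modeled here; the sketch only follows the sequence of events.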
Genetic drift is induced by the stochastic selection in the combined process of reproduction, mutation and death. It makes the population perform a random walk in genotype space even if no fitness differences are present.
To reveal the potential impact of HGT we incorporate its basic features into the stochastic evolution model. Two progenotes $A$ and $B$ may meet and a subsequence $s$ of progenote $B$’s genome may be inserted into $A$’s genome. As a result of this horizontal gene transfer event, the genotype of progenote $A$ will transform into another genotype $C$, determined by its original genotype and the subsequence $s$. This process is illustrated in Figure 1.
Figure 1. HGT-hyperlinks introduce three-genotype interactions to the evolutionary dynamics, creating hypernetwork dynamics. This schematic example illustrates the insertion of an HGT-hyperlink (red solid and dashed) into the sequence space of genomes of length $l = 4$. Individuals of genotype $A = 0100$ take up the first two bases from genotype $B = 1101$, which are inserted at position three of $A$, giving the intermediate sequence 011100. After the last two bits are cut from the sequence, a progenote of genotype $A$ becomes one of the new genotype $C = 0111$.
To model this process we add HGT-hyperlinks to the hypercube network representing the fitness landscape. One such hyperlink symbolizes a three-genotype interaction and is defined through the following process. We choose two genotypes $A$ and $B$ randomly as well as a random subsequence of genome $B$ with length between $x = 2$ and $x = l -$ bases. This subsequence is inserted at a random position of genome $A$. The remaining $x$ bases at the end of $A$’s sequence are cut off, keeping the sequence length of $A$ constant. The new sequence determines a genotype $C$, which genotype $A$ becomes on interacting with $B$ via this HGT-link, denoted $\overrightarrow{(A, B, C)}$. If the resulting genotype $C$ is identical to $A$, this HGT-link would not alter the population dynamics and would thus be irrelevant. We therefore neglect such self-projecting HGT-links. We repeat the above procedure until a predefined number $m$ of new HGT-links has been added to the system.
An HGT-link defines one type of HGT-event, in which part of genotype $A$ is replaced by part of genotype $B$, so that $A$ is transformed to genotype $C$. We consider these events to occur independently of each other. Let $k_X$ denote the number of progenotes of genotype $X$ in the population. Then the HGT-events above occur at a rate
$$ r_{BA \to C} = c \cdot \frac{k_A k_B}{N}. \tag{1} $$
Here the effective competence for HGT is modeled as a constant $c \geq 0$ that captures both the rate at which the progenotes meet and their actual preference for the initiation of an HGT event, given that they meet.
Note that interactions of the form (1), independent of any model details, imply collective dynamics on a complex hypernetwork, due to their intrinsic three-genotype coupling involving $A$, $B$, and $C$. The dynamics of horizontal gene transfer in biological systems depends on a multitude of factors, including the mode (e.g., natural transformation or conjugative transfer) of HGT [27], and may vary with the fitness of the donor and recipient [47, 55] and other factors such as environmental conditions [56]. To focus on qualitative mechanisms, we here consider the simplest setting where $c$ is just a non-negative constant. We note that, via the factors $k_A$, $k_B$ and the presence or absence of HGT-links $\overrightarrow{(A, B, C)}$, the actual rate of all HGT events in the population still depends on how the population is distributed in genotype space.
III. QUANTIFYING STOCHASTIC SWITCHING
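Before turning to the measurements, here is a minimal sketch of the HGT-link construction and of the rate (1) from Section II, since both enter every quantity analyzed in this section. It is an illustration under assumptions rather than the authors' code: genomes are strings of "0" and "1", positions are zero-based, and the maximal transferred length is passed in as the hypothetical parameter `x_max` because its value is not recoverable from the text. The function names are likewise hypothetical.

```python
import random

def hgt_product(a: str, b: str, start: int, x: int, pos: int) -> str:
    """Insert b[start:start+x] into a before index pos, then cut x bases from the end."""
    s = b[start:start + x]
    return (a[:pos] + s + a[pos:])[:len(a)]

def random_hgt_link(l: int, x_max: int, rng: random.Random):
    """Draw one HGT-link (A, B, C), rejecting self-projecting links with C == A."""
    while True:
        a = "".join(rng.choice("01") for _ in range(l))
        b = "".join(rng.choice("01") for _ in range(l))
        x = rng.randint(2, x_max)          # transferred length (x_max is an assumption)
        start = rng.randrange(l - x + 1)   # where the subsequence starts in B
        pos = rng.randrange(l)             # insertion point in A
        c = hgt_product(a, b, start, x, pos)
        if c != a:
            return a, b, c

def hgt_rate(c: float, k_a: int, k_b: int, n: int) -> float:
    """Rate of Eq. (1): r = c * k_A * k_B / N."""
    return c * k_a * k_b / n

# The example of Figure 1: first two bases of B inserted at the third position of A.
example = hgt_product("0100", "1101", start=0, x=2, pos=2)  # -> "0111"
```

The rate depends on the occupation numbers of two genotypes while the event changes the occupation of a third, which is the three-point coupling that makes the interaction structure a hypernetwork.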
To see how HGT influences the evolutionary dynamics we study how the collective model dynamics depends on the competence $c$. The population sizes $k_i(t)$ of progenotes of different genotypes $i$ present in the population fully describe the state of the system at time $t$. We introduce the population entropy
$$ S(t) = -\sum_{i=0}^{2^l - 1} \frac{k_i(t)}{N} \log\left[\frac{k_i(t)}{N}\right] \tag{2} $$
to quantify how broadly the population is distributed in genotype space. Populations consisting of only one genotype have population entropy zero. If the population is uniformly spread out in genotype space, the population entropy takes its maximal value $S_{\max} = l \log(2)$.
Direct simulations of the stochastic dynamics reveal that for large competences $c$, the collective dynamics converge to a state of high population entropy where the population is highly spread out in genotype space (Figure 2a). It may only transiently switch to a state localized in state space, i.e., with relatively little spread in genetic material. In this high entropy state the total HGT rate in the population is orders of magnitude higher than in a speciated state (see below). The population does not adapt to the underlying fitness landscape; in that sense, HGT is the main driving process in this large-$c$ scenario. We identify this state of high population entropy with a pre-Darwinian collective state, as in this state no distinct species can be distinguished and HGT is the dominant force driving the evolutionary dynamics.
Figure 2. The population dynamics is dominated by a speciated state for low competence and a distributed state for high competence. Shown are example dynamics of the population entropy (2) for competence $c = 5$ (a), $c = 3$ (b) and $c = 1$ (c) in an example population of $N = 1000$ progenotes with genome length $l = 7$. (a): For high competence $c$ the population entropy almost always fluctuates around a high value for all initial conditions. (b): The dynamics switch stochastically to a low entropy state and stay there longer for lower values of $c$. (c): The low entropy state is rendered globally stable for low $c$ so that the population entropy fluctuates slightly above zero for all initial conditions. In the low entropy state the population dynamics are driven by selection, in the high entropy state by HGT. Panels (i) show the entropy dynamics, (ii) the average fitness $\langle f \rangle$ of the population corresponding to the entropy dynamics and (iii) the corresponding HGT rates $r_{\mathrm{HGT}}$ that the population exhibits at time $t$. For low population entropies the fitness is high and the HGT rate small, and vice versa for high population entropies. The mutation probability was set to $\mu_{ij} = 0.$ . Into the resulting Fujiyama fitness landscape [23] with fitness values between $f_{\min} = 0.$ and $f_{\max} = 1.$ we inserted $m = 2000$ HGT-links.
In contrast, if the progenotes’ competence for HGT is low, we observe a population dynamics which converges toward a state of low population entropy (Figure 2c). This confirms the observation that selection, mutation and drift will drive a population to adapt to a fitness landscape if the mutation rate is not too high [23, 44]. The population is thus concentrated around the fittest genotype with only rare mutations and genetic drift causing some spread of the population.
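The population entropy (2) and its two limiting values quoted above can be checked with a few lines of code. This is a small sketch, not part of the original analysis; the function name and the population size $N = 1024$ (chosen so that the uniform distribution over $2^7$ genotypes is exactly realizable) are assumptions made for the illustration.

```python
import math

def population_entropy(counts: dict) -> float:
    """Eq. (2): S = -sum_i (k_i/N) log(k_i/N); absent genotypes contribute zero."""
    n = sum(counts.values())
    return -sum((k / n) * math.log(k / n) for k in counts.values() if k > 0)

l = 7
# Speciated extreme: the whole population sits on a single genotype, so S = 0.
s_single = population_entropy({0b1111111: 1024})
# Distributed extreme: 8 progenotes on each of the 2**l genotypes, so S = l * log(2).
s_uniform = population_entropy({g: 8 for g in range(2 ** l)})
s_max = l * math.log(2)
```

Intermediate values of $S$ then locate a simulated population between the speciated and the distributed extreme, which is how the time series in Figure 2 are to be read.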
In this low-competence regime, large parts of the population therefore exhibit the same or similar genotypes, so that the population is in a speciated state.
While the system spends almost all time close to its speciated state for low competence, the dynamics switch stochastically between the speciated and the distributed state if the competence is not small enough. Figures 2a-c show that the higher the progenotes’ competence for HGT is, the longer the system stays in the distributed state. We conclude that both the speciated state and the distributed state are dynamically accessible metastable states (for high enough competence). The dynamics only switch from one of these states to the other due to rare events in the stochastic dynamics. This is supported by the fact that the dynamics switch between these states on much shorter time scales than the time they remain in them (see also Figure 3 below).
Figure 3. The high entropy state is dynamically stable also for vanishing mutation probabilities. Shown is the measured percentage of time a population stayed in the distributed state (vertical axis) as a function of the competence $c$ (horizontal axis) for a system with mutation probabilities $\mu_{ij} = 10^{-}$ (blue), $\mu_{ij} = 10^{-}$, $\mu_{ij} = 10^{-}$ (orange), $\mu_{ij} = 10^{-}$ (green) and $\mu_{ij} = 0$ (gray). Qualitatively, the results are similar, except that for higher mutation probabilities the critical transition occurs at a lower value $c_{\mathrm{cr}}$. System parameters were $l = 7$, $N = 1000$, and $m = 3000$ HGT-links were introduced into a Fujiyama fitness landscape with fitness values between $f_{\min} = 0.$ and $f_{\max} = 1.$ . Each data point was obtained in a simulation of length $T = 10$ with the initial condition $S(0) = S_{\max}$.
How does the distributed state disappear for low competences? To answer this question we developed a method based on the population entropy defined in (2) to study the forces induced on the dynamics by reproduction and HGT. The evolution dynamics are event-driven, and the population entropy $S$ may only change at these event times. At each event there is a population entropy $S^-$ directly before the event and a population entropy $S^+$ directly after the event. The change of population entropy
$$ \Delta S = S^+ - S^- \tag{3} $$
induced by a single event will in general depend on the type of event (reproduction or HGT) and the actual distribution of the population over genotype space. If the population is in a state with population entropy $S$, one event will thus induce a mean change $\Delta S(S)$ averaged over all events occurring at population entropy $S$. The rate $r(S)$ at which these events occur depends on the state of the system as well. Multiplying the mean change induced by the single events with the rate at which the events occur, we obtain the average rate of change
$$ \frac{dS}{dt} = r(S) \cdot \Delta S(S) \tag{4} $$
induced on the dynamics. The reproduction and HGT events in our model occur independently of each other, so that their contributions separate additively according to
$$ \frac{dS}{dt} = \frac{dS_{\mathrm{Repr}}}{dt} + \frac{dS_{\mathrm{HGT}}}{dt} \tag{5} $$
$$ = r_{\mathrm{Repr}}(S) \cdot \Delta S_{\mathrm{Repr}}(S) + r_{\mathrm{HGT}}(S) \cdot \Delta S_{\mathrm{HGT}}(S). \tag{6} $$
We measured these functional dependencies in simulations of the dynamics (for more details see Supplementary Information), thereby obtaining the forces induced by reproduction and HGT which drive the population entropy dynamics (Figure 4). The bistability of the dynamics emerges because the impact of HGT increases with the diversity of the population.
Thus, if the population entropy is high, HGT will drive it toward even higher population entropy, and hence toward the distributed state. However, if the competence drops below a critical value, the impact of HGT on the population’s dynamics is always smaller than that of selection, independent of the diversity of the population. Thus, the distributed state disappears in a saddle-node bifurcation and the population converges to a speciated state. Furthermore, our analysis reveals that HGT alone can drive a population into a distributed state, even in a total absence of mutations (Figure 3).
Figure 4. The distributed state disappears for low HGT competence. At a critical competence the dynamical fixed point at high population entropy is destroyed in a saddle-node bifurcation. Here we show the analysis of the system yielding the dynamics illustrated in Figure 2. Panels (a) and (b) show the rate of change of the population entropy due to reproduction and HGT obtained with the methods described in the Methods section. The colors indicate the rate of change for competence values $c = 1$ (blue), $c = 3$ (red) and $c = 5$ (orange). Adding the results from (a) and (b) according to equation (6) yields the overall rate of change $\dot{S}$ for the dynamics shown in (c). The arrow indicates the fixed point at high population entropies emerging through a saddle-node bifurcation as the parameter $c$ increases. With equation (7) we define a potential $V(S)$ for the dynamics which is shown in (d) for the competences $c = 1$ (blue), $c = 3$ (red) and $c = 5$ (orange) and additionally for $c = 0.$ (gray), $c = 2$ (green) and $c = 4$ (black). The potential valley at high population entropies emerges between $c = 1$ and $c = 2$ so that the critical competence must lie between these two values. Each dataset was obtained in simulations measuring the dynamics for a time $T = 10$.
Why do the dynamics almost always remain in the high entropy state for high competence? Using the average rate of change $\dot{S}(S)$ we define a potential
$$ V(S) = -\int^{S} \dot{S}(S')\, dS' \tag{7} $$
in which the dynamics move under additional stochastic forcing. This potential is shown in Figure 4d. According to reaction rate theory [57], the depths of the two stable states’ potential wells determine the average time the dynamics stay close to each of the stable states. As the potential well at the distributed state becomes ever deeper for higher competence, the dynamics stay ever longer in this state.
Thus, our results suggest that when progenotes had high competence for HGT in early evolution, a distributed state was dynamically stable.
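To make the construction of the potential (7) concrete, the sketch below integrates a drift $\dot{S}(S)$ numerically and locates the potential wells. The drift used here is a purely hypothetical cubic, chosen only because it has two stable fixed points separated by an unstable one; it is not the measured drift of Figure 4, and the grid, the integration starting point, and the function names are assumptions as well.

```python
def potential(drift, s_grid):
    """Cumulative trapezoidal integral V(S) = -int drift(S') dS', with V = 0 at s_grid[0]."""
    v = [0.0]
    for s0, s1 in zip(s_grid, s_grid[1:]):
        v.append(v[-1] - 0.5 * (drift(s0) + drift(s1)) * (s1 - s0))
    return v

def local_minima(v):
    """Indices of interior grid points lying below both of their neighbors."""
    return [i for i in range(1, len(v) - 1) if v[i] < v[i - 1] and v[i] < v[i + 1]]

def toy_drift(s: float) -> float:
    """Hypothetical bistable drift: stable fixed points at S = 0.5 and S = 4, unstable at S = 2."""
    return -(s - 0.5) * (s - 2.0) * (s - 4.0)

s_grid = [0.1 * i for i in range(49)]      # S from 0 to 4.8, just below 7 * log(2)
v = potential(toy_drift, s_grid)
wells = [s_grid[i] for i in local_minima(v)]
barrier = max(v[5:41])                      # top of the barrier between the two wells
```

In this picture the stable fixed points of the drift are the minima of $V$, and the height of the barrier seen from each well is what reaction rate theory relates to the mean residence time there. Lowering the competence in the model corresponds to making the high-entropy well shallower until it vanishes.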
If the competence subsequently decreased below a critical value, the distributed state may have disappeared and triggered the emergence of the first species. For this Darwinian transition to occur, the population’s competence must have decreased dynamically in the distributed state. How could this have happened?
Figure 5. A possible scenario for the evolution of distinct species from a pre-Darwinian distributed state. The three time series sketched here (competence, fitness, and population entropy versus time) are not simulation data, but encapsulate the speculations in the text, showing how the average competence, the average fitness, and the population entropy may evolve in the transition from a distributed state to the first distinct species. In the initial state (marked in blue) the competence is high, so that HGT drives the dynamics; the population exhibits a high population entropy and low average fitness. Through stochastic switching the dynamics reach a state of low population entropy where the fitness is higher as the population adapts to the fitness landscape. Here the population could evolve slowly toward lower competence. Thus, the dynamics switch back and forth between the low and the high entropy state, remaining longer and longer in the low entropy state as the competence decreases (marked in red). When the competence goes below a critical value (marked by the dashed line in the top panel) the high entropy state disappears (marked in green), the dynamics remain in the low entropy state, the population’s average fitness increases and the first species may robustly evolve.
A decrease in competence may rely on a mechanism that combines the stochastic switching uncovered above with the suggestion that fitter populations may tend to be less prone to HGT events, as schematically illustrated in Figure 5. As the speciated state is always stable, even if the population’s competence is high, the population dynamics will stochastically switch to this state repeatedly for relatively short times. In the selection-dominated state (i.e., at low $S$) the population’s fitness increases. A fitter population that might be less prone to HGT events, as suggested recently [50], has a decreased overall competence (lower $c$ in our simplified model setting). Smaller $c$ in turn increases the stochastic residence times the population spends in the selection-dominated state. This combination of two mutually amplifying contributions (decreasing HGT rate and increasing fitness in the population) may yield decreasing competence in the long term such that after sufficiently many switches to the low-$S$ state, the competence may drop below a critical value where the distributed state disappears. The population then stays localized around the fittest genotypes, thus marking the time of transition to Darwinian evolution. At this time, the first species can robustly emerge. The scenario shown schematically in Figure 5 illustrates one potential course of such repeated switching dynamics, with temporarily increased phases of higher fitness and decreasing HGT competence on long time scales.
IV. CONCLUSION
Our results provide a first glimpse of the possible dynamics that may have led to the emergence of the first species from a distributed state dominated by HGT. We demonstrated that a high competence for HGT in a population may suffice to drive the population into a distributed state (Figure 2). In this state HGT dominates the dynamics, in the sense that it inhibits the population’s ability to adapt to the underlying fitness landscape and thus prevents it from crystallizing into distinct species. Our analysis revealed that, independently of the mutation rate exhibited by the population, HGT can drive the population dynamics into a state where the population is widely spread out in genotype space (Figure 3). We identify this state with a pre-Darwinian collective state envisioned by Woese [15–18].
Similarly, a state where no distinct species exist can emerge if the mutation rate in the population is too large [44, 58, 59]. Above a critical mutation rate (the error threshold) the population cannot adapt to the underlying fitness landscape and will always evolve toward a quasispecies state [44, 58, 59] similar to the distributed state induced by HGT shown above. However, there is a fundamental difference between the dynamics induced by HGT and that induced by mutations: While a mutation rate above an error threshold will always lead to a quasispecies state [44], high rates of HGT as studied above induce a bistability of the dynamics where the distributed state coexists with a localized “speciated” state of low $S$. This coexistence may be essential for the evolution toward lower competence in a population and thus for the emergence of the first species; the coexistence is what enables a population originally in a distributed state to repeatedly switch to a low-$S$ state. As selection plays a major role in such a low-$S$ state, progenotes with lower competence would be selected for.
Thus, with time, the entire population would evolve toward lower competence until the distributed state disappears as selection effects dominate the dynamics and the first species emerge.
For the breakdown of the distributed state it is essential that the population evolves toward a lower competence. That the latter may in principle be possible was already suggested by Vogan and Higgs [50]. Our results on an idealized model now demonstrate how stochastic switching and fitness-dependent competence may combine to create a transition from a bistable state to a speciated-only state. They in particular also suggest that HGT may be present at similar competence levels before and after the emergence of the first species. From a complementary perspective, whereas one or a few species may already have existed, other population parts may still have been mixed without any clear species. So the very first species may only have marked the beginning of the decline of genuinely non-specific life, with other Darwinian transitions to follow.
Acknowledgements: We thank Nigel Goldenfeld, Stefan Grosskinsky, Oskar Hallatschek and Arne Traulsen for valuable comments and discussions. Supported by a grant of the Max Planck Society to MT.
[1] C. R. Darwin, On the Origin of Species (John Murray, London, UK, 1859).
[2] M. W. Nirenberg, O. W. Jones, P. Leder, B. F. C. Clark, W. S. Sly, and S. Pestka, “On the coding of genetic information,” in Cold Spring Harbor Symposia on Quantitative Biology, Vol. 28 (Cold Spring Harbor Laboratory Press, 1963) pp. 549–557.
[3] C. R. Woese,