Simple model for the Darwinian transition in early evolution

Abstract

It has been hypothesized that in the era just before the last universal common ancestor emerged, life on earth was fundamentally collective. Ancient life forms shared their genetic material freely through massive horizontal gene transfer (HGT). At a certain point, however, life made a transition to the modern era of individuality and vertical descent. Here we present a minimal model for this hypothesized "Darwinian transition." The model suggests that HGT-dominated dynamics may have been intermittently interrupted by selection-driven processes during which genotypes became fitter and decreased their inclination toward HGT. Stochastic switching in the population dynamics with three-point (hypernetwork) interactions may have destabilized the HGT-dominated collective state and led to the emergence of vertical descent and the first well-defined species in early evolution. A nonlinear analysis of a stochastic model dynamics covering key features of evolutionary processes (such as selection, mutation, drift and HGT) supports this view. Our findings thus suggest a viable route from early collective evolution to the start of individuality and vertical Darwinian evolution, enabling the emergence of the first species.


arXiv [q-bio.PE], Jan

Hinrich Arnoldt (a), Steven H. Strogatz (b) and Marc Timme (*, a, c)
(a) Network Dynamics, Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany
(b) Department of Mathematics, Cornell University, Ithaca, New York 14853, USA
(c) Institute for Nonlinear Dynamics, Faculty of Physics, Georg August University Göttingen, 37077 Göttingen, Germany
(*) Corresponding author: [email protected]


Keywords: Evolutionary dynamics, population genetics, horizontal gene transfer, early evolution, emergence of the first species, hypernetwork dynamics, Darwinian threshold

I. INTRODUCTION

A. The last universal common ancestor

In the final chapter of “On the Origin of Species”, Charles Darwin speculated that all life on earth may have descended from a common ancestor. As he observed, “all living things have much in common, in their chemical composition, their germinal vesicles, their cellular structure, and their laws of growth and reproduction. . . . Therefore I should infer from analogy that probably all the organic beings which have ever lived on this earth have descended from some one primordial form, into which life was first breathed” [1].

A century after Darwin, molecular biology provided new lines of circumstantial evidence for a universal common ancestor. All organisms were found to use the same molecule (DNA) for their genetic material, as well as a canonical look-up table (the genetic code) for translating nucleotide sequences into amino-acid sequences [2–4]. Further clues came from cross-species comparisons of the molecules involved in the most fundamental processes of life, such as protein synthesis, core metabolism, and the storage and handling of the genetic material. The first such analysis [5], based on snippets of ribosomal RNA, provoked a revolution in our understanding of life’s family tree [6–8]. It indicated that life is divided into three different domains: the Archaea, the Bacteria and the Eucarya [5, 6, 8]. Later studies using other molecular sequences placed the root of the tree, corresponding to the last universal common ancestor, somewhere between the Bacteria and Archaea [9–13], roughly . − . billion years ago. The nature of the last universal common ancestor, however, remains unresolved: Was it prokaryotic or eukaryotic? Did it thrive in extreme or moderate temperatures? Was its genome based on RNA or DNA? For a review, see Ref. [14] and references therein.

B. The era of collective evolution

Our work in this paper was inspired by a conjecture proposed by Woese and his colleagues [15–18]. According to this conjecture, the last universal common ancestor was a community, not a single creature. It marked a turning point in the history of life: before it, evolution was collective and dominated by horizontal gene transfer; after it, evolution was Darwinian and dominated by vertical gene transfer.

In Woese’s scenario, life in the epoch leading up to the universal ancestor was intensely communal. It was organized into loose-knit consortia of protocells far simpler than the bacteria or archaea we know today. Woese and Fox [15] called those hypothetical ancient life forms “progenotes.” The term signifies that the coupling between genotype and phenotype had not yet fully evolved, mainly because the process for translating genes into proteins had not yet fully evolved either. A rudimentary form of translation existed, but it was ambiguous and hence had a statistical character. Instead of producing a single protein, early translation produced a cloud of similar proteins. This ambiguity in protein synthesis in turn limited the specificity of all the progenote’s interactions. For example, lacking the large, complex proteins necessary for accurate copying and repair of the genetic material, the progenote’s genome was tiny and subject to high mutation rates.

Progenotes were not well-defined organisms as such, because they had no individuality and no long-term genetic pedigree. Their genes and component parts could come and go, being swapped in or out with other members of the community via horizontal transfer. But because biochemical innovations produced by any member of the community were available to all, evolution at this time was rapid, probably more rapid than at any time since. Selection acted on whole communities, not on individual progenotes. Those communities that were better at sharing their biochemical breakthroughs flourished. Out of this cauldron of evolutionary innovation, the universal genetic code and its translational machinery coevolved, in response to the selective pressures favoring efficient sharing and interoperability.

Vetsigian, Woese, and Goldenfeld [19] confronted and constrained these speculations with mathematical analysis and computer simulations. Going beyond Woese’s conjectures, they probed early evolution scientifically by interpreting the available data on the genetic code. Their dynamical model for the evolution of the genetic code [19] showed that a collective state of life is required to obtain the observed [20, 21] statistical properties of the code, in particular, its simultaneous universality and optimality. A later study by Goldenfeld and colleagues [22] provided further evidence that only a collective state of life could have created the highly optimized code used by all life today [21].

C. The Darwinian transition

How did the era of collective evolution come to an end? Woese speculated that as the translation process began to improve, and as progenotic subsystems became increasingly complex and specialized, it would have become harder to find foreign parts compatible with them. Thus, horizontal gene transfer would have become increasingly difficult. The only possible modifications at this point would have come from within the progenote’s lineage itself, through mutation and gene duplication. It was in this way that the progenotes would have made the Darwinian transition [17, 18] to become “genotes,” i.e., life forms with a tight coupling between their nucleic acid genotypes and their protein phenotypes, and that could therefore evolve through the familiar Darwinian process of vertical descent.

The model considered below is an attempt to explore, in mathematical terms, how the Darwinian transition from the collective state to the modern era of individuality might have taken place. Our approach shares with Ref. [19] the outlook that a dynamical systems calculation should be devised to support or refute the hypotheses considered. Our results lend support to the proposed collective state of life [15–19] by providing a potential mechanism for the exit from that state.

D. Horizontal gene transfer

Over the last decades more and more evidence has accumulated that, besides selection, mutation, and drift [23], another process drives evolution: horizontal gene transfer. Here we briefly review the main points about horizontal gene transfer relevant to the mathematical model developed below.

While reproduction implies a vertical transfer of genes from one entity to the next in the phylogenetic tree, there are also processes in which possibly unrelated individuals exchange genetic material during their lifetimes, i.e., horizontally in the sense of the tree. This transfer of genes within one generation is consequently termed horizontal gene transfer (HGT) [24–27] or lateral gene transfer. It is now widely accepted that HGT is a fundamental driving force of evolution [25, 28–33], and that its existence raises profound theoretical problems for evolutionary biology. For example, the longstanding problem of defining bacterial species [30, 34–38] is due, in part, to the promiscuous use of HGT by bacteria. A recent primer on horizontal gene transfer and its potential for evolutionary processes in general is given in [39].

As discussed above, if HGT was rampant in the early stages of evolution, the last universal common ancestor was a community, not a single organism [16, 17, 19, 40]. In this collective state, individuals could not yet be distinguished, as each progenote’s genes were frequently exchanged through HGT. In terms of the model to be developed below, the total pool of genotypes in the collective state would be spread out and thus broadly distributed in the state space of all theoretically possible genotypes. Conversely, a genotype distribution that is highly localized in state space, being concentrated on just one or a few genotypes, would be the model’s version of a well-defined species.

Woese postulated that as the collective state of the progenote population slowly evolved toward higher complexity, its rate of HGT slowly decreased [17].
At some point the system crossed the “Darwinian threshold” [17]. Then natural selection instead of HGT started to dominate the dynamics. The fitter individuals were selected for and the first species emerged from the distributed state. In the colorful language of Dyson [41]:

“But then, one evil day, a cell resembling a primitive bacterium happened to find itself one jump ahead of its neighbors in efficiency. That cell, anticipating Bill Gates by three billion years, separated itself from the community and refused to share. Its offspring became the first species of bacteria – and the first species of any kind – reserving their intellectual property for their own private use. With their superior efficiency, the bacteria continued to prosper and to evolve separately, while the rest of the community continued its communal life. Some millions of years later, another cell separated itself from the community and became the ancestor of the archaea. Some time after that, a third cell separated itself and became the ancestor of the eukaryotes.”

After making a Darwinian transition, evolution proceeds in the familiar vertical manner, being driven by selection, mutation, and drift [23], with HGT playing only a minor role. Such Darwinian dynamics have, of course, been studied extensively in both experimental and model settings [23, 42–46]. Compared to the dynamics of HGT their properties are relatively well understood. Recently, potential influences of HGT on such evolutionary dynamics have been investigated [16, 19, 47–51]. Some mathematical models of HGT have focused on how it can increase a population’s fitness in Darwinian evolution [52]. Keep in mind, however, that the hypothesized HGT associated with progenotes and the Darwinian transition, being associated with ribosomal genes and the rest of the core machinery of the cell, would have been of far greater evolutionary significance than the HGT of, say, antibiotic resistance genes seen in bacterial communities today.
In Woese’s scenario, the ancient form of HGT was rampant, pervasive, and tremendously disruptive and innovative. It was the prime mover in shaping the fabric of the cell [18].

We would like to understand what such a Darwinian transition would look like, mathematically. The model described in the next section is deliberately minimal. It leaves out all the biology of ribosomes, proteins, genetic codes, and the like. What remains is an attempt to capture the essence of Woese’s speculations. In place of a community of progenotes, we consider a community of abstract genomes, represented by binary sequences. They interact via HGT, and are subject to mutation, selection, and drift on a fitness landscape. Our work suggests that HGT-dominated dynamics may have been intermittently interrupted by selection-driven processes during which genotypes became fitter and decreased their inclination toward HGT. Such stochastic switching in the nonlinear population dynamics may have destabilized the HGT-dominated state and thus led to a Darwinian transition and the emergence of the first species in early evolution.

On a side note, an interesting mathematical aspect of the model is that it necessarily involves three-point interactions, since HGT transforms one genotype into a second by importing pieces of a third. Thus the model provides a natural biological example of a complex hypernetwork. Until now, most models in evolutionary dynamics and population biology did not need to go beyond ordinary network structure, with two-point interactions between nodes connected by links.

II. EVOLUTIONARY MODEL

To explore the consequences of HGT on evolution, we consider a model community of $N$ progenotes evolving on a fitness landscape [23, 53] in the presence of selection, mutation, drift, and HGT. Each progenote carries a genome of length $l$ composed of a sequence of the bases 0 and 1. The genome of progenote $i$ determines its fitness $f_i$. The progenotes reproduce by the Moran process [23, 54], i.e., each progenote reproduces randomly in time, with its reproduction rate given by its fitness $f_i$. Whenever a progenote of genotype $i$ reproduces, an offspring is added to the population which is either identical to genotype $i$ or, with probability $\mu_{ij}$, a mutant of genotype $j$. Instantaneously after such a reproduction event, one progenote in the population is chosen randomly to die and is hence removed from the population. We assume that one mutation event will only affect one of the bases of the genome, so that the Hamming distance between genotypes $i$ and $j$ is 1.

Hence, our fitness landscape may be represented by a network where the different genotypes are the nodes of the network and the possible mutations form the links. Assigning two different bases, 0 and 1, and given the structure of the mutations, the resulting network is an $l$-dimensional hypercube (Fig. 1).

The fitness landscape underlying our model is assumed to be a Mount Fuji landscape [23]: The highest fitness is assigned to one single genotype, the peak. Other genotypes are assigned lower fitness: the farther away from the peak in genotype space, the lower the fitness. Thus, a single-peaked mountain landscape is created on genotype space, and a population evolving purely through the processes of selection and mutation should converge to this peak. Note that the Moran process described above is a random process. It thereby constitutes a minimal model intrinsically including the effects of selection, mutation and genetic drift [23].
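
The reproduction step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: genotypes are encoded as integers whose bits are the bases, the Mount Fuji fitness is assumed to fall off linearly with Hamming distance from the peak, `mu_total` is an assumed total mutation probability per birth (spread evenly over the $l$ one-bit neighbors), and waiting times between events are ignored. All function names and parameter values are hypothetical.

```python
import random

def hamming(a: int, b: int) -> int:
    """Number of differing bases between two genotypes encoded as integers."""
    return bin(a ^ b).count("1")

def fuji_fitness(l: int, peak: int, f_min: float, f_max: float) -> list:
    """Single-peaked landscape: fitness decreases with Hamming distance from
    the peak. The linear fall-off is an assumption made for this sketch."""
    return [f_max - (f_max - f_min) * hamming(g, peak) / l for g in range(2 ** l)]

def moran_step(counts: list, fitness: list, l: int, mu_total: float,
               rng: random.Random) -> None:
    """One Moran reproduction event, updating counts in place."""
    genotypes = range(len(counts))
    # Birth: the parent is drawn with probability proportional to k_i * f_i.
    birth_weights = [k * f for k, f in zip(counts, fitness)]
    parent = rng.choices(genotypes, weights=birth_weights)[0]
    child = parent
    if rng.random() < mu_total:
        # Mutation changes exactly one base (Hamming distance 1).
        child = parent ^ (1 << rng.randrange(l))
    counts[child] += 1
    # Death: one individual, drawn uniformly from the population, is removed.
    dead = rng.choices(genotypes, weights=counts)[0]
    counts[dead] -= 1

rng = random.Random(0)
l = 4
fitness = fuji_fitness(l, peak=0b1111, f_min=0.5, f_max=1.0)  # illustrative values
counts = [0] * 2 ** l
counts[0b0000] = 50          # start with all 50 progenotes far from the peak
for _ in range(2000):
    moran_step(counts, fitness, l, mu_total=0.05, rng=rng)
```

The population size is conserved by construction, since every birth is followed by exactly one death.
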
Genetic drift is induced by the stochastic selection in the combined process of reproduction, mutation and death and has the effect of randomly walking the population around in genotype space even if no fitness differences were present.

To reveal the potential impact of HGT we incorporate its basic features into the stochastic evolution model. Two progenotes $A$ and $B$ may meet and a subsequence $s$ of progenote $B$’s genome may be inserted into $A$’s genome. As a result of this horizontal gene transfer event, the genotype of progenote $A$ will transform into another genotype $C$, determined by its original genotype and the subsequence $s$. This process is illustrated in Figure 1.

Figure 1. HGT-hyperlinks introduce three-genotype interactions to the evolutionary dynamics, creating hypernetwork dynamics. This schematic example illustrates the insertion of an HGT-hyperlink (red solid and dashed) to the sequence space of genomes of length $l = 4$. Individuals of genotype $A = 0100$ take up the first two bases from genotype $B = 1101$, which are inserted at position three into $A$, giving 011100. After the last two bits are cut from the sequence, a progenote of genotype $A$ takes on the new genotype $C = 0111$.

To model this process we add HGT-hyperlinks to the hypercube network representing the fitness landscape. One such hyperlink symbolizes a three-genotype interaction and is defined through the following process. We choose two genotypes $A$ and $B$ randomly as well as a random subsequence of genome $B$ with length between x = 2 and x = l − bases. This subsequence is inserted at a random position of genome $A$. The remaining $x$ bases at the end of $A$’s sequence are cut off, keeping the sequence length of $A$ constant. The new sequence determines a genotype $C$, which genotype $A$ becomes on interacting with $B$ via this HGT-link, denoted $(\overrightarrow{A, B, C})$. If the resulting genotype $C$ is identical to $A$, this HGT-link would not alter the population dynamics and would thus be irrelevant. We therefore neglect such self-projecting HGT-links. We repeat the above procedure until a predefined number $m$ of new HGT-links has been added to the system.

An HGT-link defines one type of HGT-event, in which part of genotype $A$ is replaced by part of genotype $B$, so that $A$ is transformed to genotype $C$. We consider these events to occur independently of each other. Let $k_X$ denote the number of progenotes of genotype $X$ in the population. Then the HGT-events above occur at a rate

$$r_{BA \to C} = c \cdot \frac{k_A k_B}{N}. \qquad (1)$$

Here the effective competence for HGT is modeled as a constant $c \geq 0$ that captures both the rate at which the progenotes meet and their actual preference for the initiation of an HGT event, given that they meet.

Note that interactions of the form (1), independent of any model details, imply collective dynamics on a complex hypernetwork, due to their intrinsic three-genotype coupling involving $A$, $B$, and $C$. The dynamics of horizontal gene transfer in biological systems depends on a multitude of factors, including the mode (e.g., natural transformation or conjugative transfer) of HGT [27], and may vary with the fitness of the donor and recipient [47, 55] and other factors such as environmental conditions [56]. To focus on qualitative mechanisms, we here consider the simplest setting where $c$ is just a non-negative constant. We note that, via the factors $k_A$, $k_B$ and the presence or absence of HGT-links $(\overrightarrow{A, B, C})$, the actual rate of all HGT events in the population still depends on how the population is distributed in genotype space.

III. QUANTIFYING STOCHASTIC SWITCHING
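
Before turning to the entropy analysis, it may help to make the HGT-links and the rate (1) of Section II concrete. The following Python sketch is one possible reading of the construction, not the authors' code. Genotypes are written as bit strings; the upper bound on the transferred length (taken here as $l - 1$), the range of insertion positions, and the fact that duplicate links are not filtered out are assumptions of this sketch, and all function names are hypothetical.

```python
import random

def apply_hgt(a: str, b: str, start: int, x: int, pos: int) -> str:
    """Insert the piece b[start:start+x] into a at index pos, then cut the
    surplus x bases off the end so that the genome length stays constant."""
    piece = b[start:start + x]
    return (a[:pos] + piece + a[pos:])[:len(a)]

def random_genotype(l: int, rng: random.Random) -> str:
    return "".join(rng.choice("01") for _ in range(l))

def build_links(l: int, m: int, rng: random.Random, x_min: int = 2, x_max=None):
    """Draw m HGT-links (A, B, C), skipping self-projecting ones (C == A)."""
    x_max = l - 1 if x_max is None else x_max   # assumed upper bound on x
    links = []
    while len(links) < m:
        a, b = random_genotype(l, rng), random_genotype(l, rng)
        x = rng.randint(x_min, x_max)           # length of the transferred piece
        start = rng.randint(0, l - x)           # where the piece sits in B
        pos = rng.randint(0, l - 1)             # assumed range of insertion positions
        c = apply_hgt(a, b, start, x, pos)
        if c != a:                              # neglect self-projecting links
            links.append((a, b, c))
    return links

def hgt_rate(c: float, k_a: int, k_b: int, n: int) -> float:
    """Rate of one HGT-event type, Eq. (1): r = c * k_A * k_B / N."""
    return c * k_a * k_b / n

# The example of Figure 1: the first two bases of B go into A at position three.
c_genotype = apply_hgt("0100", "1101", start=0, x=2, pos=2)
links = build_links(l=7, m=50, rng=random.Random(1))
```

Each link is a triple rather than a pair, which is what makes the resulting structure a hypernetwork: the rate of the event `A -> C` depends on the abundance of a third genotype `B`.
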

To see how HGT influences the evolutionary dynamics we study how the collective model dynamics depends on the competence $c$. The population sizes $k_i(t)$ of progenotes of different genotypes $i$ present in the population fully describe the state of the system at time $t$. We introduce the population entropy

$$S(t) = -\sum_{i=0}^{2^l - 1} \frac{k_i(t)}{N} \log\left[\frac{k_i(t)}{N}\right] \qquad (2)$$

to quantify how broadly the population is distributed in genotype space. Populations consisting of only one genotype have population entropy zero. If the population is uniformly spread out in genotype space, the population entropy takes its maximal value $S_{\max} = l \log(2)$.

Direct simulations of the stochastic dynamics reveal that for large competences $c$, the collective dynamics converge to a state of high population entropy where the population is highly spread out in genotype space (Figure 2a). It may only transiently switch to a state localized in state space, i.e., with relatively little spread in genetic material. In this high entropy state the total HGT rate in the population is orders of magnitude higher than in a speciated state (see below). The population does not adapt to the underlying fitness landscape; in that sense, HGT is the main driving process in this large-$c$ scenario. We identify this state of high population entropy with a pre-Darwinian collective state, as in this state no distinct species can be distinguished and HGT is the dominant force driving the evolutionary dynamics.

Figure 2. The population dynamics is dominated by a speciated state for low competence and a distributed state for high competence. Shown are example dynamics of the population entropy (2) for competence $c = 5$ (a), $c = 3$ (b) and $c = 1$ (c) in an example population of $N = 1000$ progenotes with genome length $l = 7$. (a): For high competence $c$ the population entropy almost always fluctuates around a high value for all initial conditions. (b): The dynamics switch stochastically to a low entropy state and stay there longer for lower values of $c$. (c): The low entropy state is rendered globally stable for low $c$ so that the population entropy fluctuates slightly above zero for all initial conditions. In the low entropy state the population dynamics are driven by selection, in the high entropy state by HGT. Panels (i) show the entropy dynamics, (ii) the average fitness $\langle f \rangle$ of the population corresponding to the entropy dynamics and (iii) the corresponding HGT rates $r_{\mathrm{HGT}}$ that the population exhibits at time $t$. For low population entropies the fitness is high and the HGT rate small, and vice versa for high population entropies. The mutation probability was set to $\mu_{ij}$ = 0 . . Into the resulting Fujiyama fitness landscape [23] with fitness values between $f_{\min}$ = 0 . and $f_{\max}$ = 1 . we inserted $m = 2000$ HGT-links.

In contrast, if the progenotes’ competence for HGT is low, we observe a population dynamics which converges toward a state of low population entropy (Figure 2c). This confirms the observation that selection, mutation and drift will drive a population to adapt to a fitness landscape if the mutation rate is not too high [23, 44]. The population is thus concentrated around the fittest genotype with only rare mutations and genetic drift causing some spread of the population.
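
The population entropy (2) that underlies these observations is straightforward to compute from the genotype counts. The short Python sketch below assumes the usual convention that empty genotypes contribute nothing to the sum (0 log 0 taken as 0); the function name and the two example populations are illustrative, not taken from the simulations.

```python
import math

def population_entropy(counts):
    """Population entropy, Eq. (2), from the genotype counts k_i.
    Genotypes with k_i = 0 are skipped (0 log 0 is taken as 0)."""
    n = sum(counts)
    return -sum((k / n) * math.log(k / n) for k in counts if k > 0)

l = 7
speciated = [1000] + [0] * (2 ** l - 1)   # whole population on one genotype
uniform = [1] * 2 ** l                    # evenly spread over all 2^l genotypes
s_low = population_entropy(speciated)     # lower limit of the entropy
s_high = population_entropy(uniform)      # upper limit of the entropy
s_max = l * math.log(2)                   # S_max = l log(2) from the text
```

The two examples reproduce the limiting cases named in the text: a single-genotype population has entropy zero, and a uniform population reaches $S_{\max}$.
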
In this low-competence regime, large parts of the population exhibit the same or similar genotypes, so that it is in a speciated state.

While the system spends almost all time close to its speciated state for low competence, the dynamics switch stochastically between the speciated and the distributed state if the competence is not small enough. Figures 2a-c show that the higher the progenotes’ competence for HGT is, the longer the system stays in the distributed state. We conclude that both the speciated state and the distributed state are dynamically accessible metastable states (for high enough competence). The dynamics only switch from one of these states to the other due to rare events in the stochastic dynamics. This is supported by the fact that the dynamics switch between these states on much shorter time scales than the time they remain in them (see also Figure 3 below).

Figure 3. The high entropy state is dynamically stable also for vanishing mutation probabilities. Shown is the measured percentage of time a population stayed in the distributed state for a system with mutation probabilities $\mu_{ij}$ = 10 − (blue), $\mu_{ij}$ = 10 − , $\mu_{ij}$ = 10 − (orange), $\mu_{ij}$ = 10 − (green) and $\mu_{ij} = 0$ (gray). Qualitatively, the results are similar, only for higher mutation probabilities the critical transition occurs at a lower value $c_{\mathrm{cr}}$. System parameters were $l = 7$, $N = 1000$, and $m = 3000$ HGT-links were introduced into a Fujiyama fitness landscape with fitness values between $f_{\min}$ = 0 . and $f_{\max}$ = 1 . . Each datapoint was obtained in a simulation of length T = 10 with the initial condition $S(0) = S_{\max}$.

How does the distributed state disappear for low competences? To answer this question we developed a method based on the population entropy defined in (2) to study the forces induced on the dynamics by reproduction and HGT. The evolution dynamics are event-driven, and the population entropy $S$ may only change at these event times. At each event there is a population entropy $S_-$ directly before the event and a population entropy $S_+$ directly after the event. The change of population entropy

$$\Delta S = S_+ - S_- \qquad (3)$$

induced by a single event will in general depend on the type of event (reproduction or HGT) and the actual distribution of the population over genotype space. If the population is in a state with population entropy $S$, one event will thus induce a mean change $\Delta S(S)$ averaged over all events occurring at population entropy $S$. The rate $r(S)$ at which these events occur depends on the state of the system as well. Multiplying the mean change induced by the single events with the rate at which the events occur, we obtain the average rate of change

$$\frac{dS}{dt} = r(S) \cdot \Delta S(S) \qquad (4)$$

induced on the dynamics. The reproduction and HGT events in our model occur independently of each other, so that their contributions separate additively according to

$$\frac{dS}{dt} = \frac{dS_{\mathrm{Repr}}}{dt} + \frac{dS_{\mathrm{HGT}}}{dt} \qquad (5)$$

$$= r_{\mathrm{Repr}}(S) \cdot \Delta S_{\mathrm{Repr}}(S) + r_{\mathrm{HGT}}(S) \cdot \Delta S_{\mathrm{HGT}}(S). \qquad (6)$$

We measured these functional dependencies in simulations of the dynamics (for more details see Supplementary Information), thereby obtaining the forces induced by reproduction and HGT which drive the population entropy dynamics (Figure 4). The bistability of the dynamics emerges because the impact of HGT increases with the diversity of the population.
Thus, if the population entropy is high, HGT will drive it toward even higher population entropy, and hence toward the distributed state. However, if the competence drops below a critical value, the impact of HGT on the population’s dynamics is always smaller than that of selection, independent of the diversity of the population. Thus, the distributed state disappears in a saddle-node bifurcation and the population converges to a speciated state. Furthermore, our analysis reveals that HGT alone can drive a population into a distributed state, even in a total absence of mutations (Figure 3).

Figure 4. The distributed state disappears for low HGT competence. At a critical competence the dynamical fixed point at high population entropy is destroyed in a saddle-node bifurcation. Here we show the analysis of the system yielding the dynamics illustrated in Figure 2. Panels (a) and (b) show the rate of change of the population entropy due to reproduction and HGT obtained with the methods described in the Methods section. The colors indicate the rate of change for competence values $c = 1$ (blue), $c = 3$ (red) and $c = 5$ (orange). Adding the results from (a) and (b) according to equation (6) yields the overall rate of change $\dot{S}$ for the dynamics shown in (c). The arrow indicates the fixed point at high population entropies emerging through a saddle-node bifurcation for increasing the parameter $c$. With equation (7) we define a potential $V(S)$ for the dynamics which is shown in (d) for the competences $c = 1$ (blue), $c = 3$ (red) and $c = 5$ (orange) and additionally for c = 0 . (gray), $c = 2$ (green) and $c = 4$ (black). The potential valley at high population entropies emerges between $c = 1$ and $c = 2$ so that the critical competence must lie between these two values. Each dataset was obtained in simulations measuring the dynamics for a time T = 10 .

Why do the dynamics almost always remain in the high entropy state for high competence? Using the average rate of change $\dot{S}(S)$ we define a potential

$$V(S) = -\int^{S} \dot{S}(S') \, dS' \qquad (7)$$

in which the dynamics move under additional stochastic forcing. This potential is shown in Figure 4d. According to reaction rate theory [57], the depths of the two stable states’ potential wells determine the average time the dynamics stay close to each of the stable states. As the potential well at the distributed state becomes ever deeper for higher competence the dynamics hence stay ever longer in this state.

Thus, our results suggest that when progenotes had high competence for HGT in early evolution, a distributed state was dynamically stable.
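
The step from the measured rates of change to the potential can be illustrated numerically. The Python sketch below combines two binned contributions as in Eq. (6), integrates the resulting drift with the trapezoidal rule as in Eq. (7), and lists the local minima of $V$ as candidate stable states. It is a hedged illustration only: the cubic toy drift is invented for this example and is not measured from the model, the integration constant is fixed by setting $V = 0$ at the first grid point, and all function names are hypothetical.

```python
def total_drift(r_repr, ds_repr, r_hgt, ds_hgt):
    """Eq. (6): add the reproduction and HGT contributions bin by bin."""
    return [rr * dr + rh * dh
            for rr, dr, rh, dh in zip(r_repr, ds_repr, r_hgt, ds_hgt)]

def potential(s_grid, drift):
    """Eq. (7) via the trapezoidal rule, with V = 0 at the first grid point."""
    v = [0.0]
    for i in range(1, len(s_grid)):
        ds = s_grid[i] - s_grid[i - 1]
        v.append(v[-1] - 0.5 * (drift[i] + drift[i - 1]) * ds)
    return v

def wells(s_grid, v):
    """Grid positions of interior local minima of V (candidate stable states)."""
    return [s_grid[i] for i in range(1, len(v) - 1)
            if v[i] < v[i - 1] and v[i] < v[i + 1]]

# Toy drift with stable fixed points near S = 0.5 and S = 4 and an unstable
# one near S = 2. Purely illustrative, not measured from the model.
s_grid = [0.01 * i for i in range(501)]
toy_drift = [-(s - 0.5) * (s - 2.0) * (s - 4.0) for s in s_grid]
v = potential(s_grid, toy_drift)
minima = wells(s_grid, v)
```

In this toy setting the two minima play the roles of the speciated (low $S$) and distributed (high $S$) states; removing the upper minimum by deforming the drift corresponds to the saddle-node scenario described above.
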
If the competence of such a population decreased below a critical value, the distributed state may have disappeared, triggering the emergence of the first species. For this Darwinian transition to occur, the population’s competence must have decreased dynamically in the distributed state. How could this have happened?

Figure 5. A possible scenario for the evolution of distinct species from a pre-Darwinian distributed state. The three time series sketched here are not simulation data, but encapsulate the speculations in the text, showing how the average competence, the average fitness, and the population entropy may evolve in the transition from a distributed state to the first distinct species. In the initial state (marked in blue) the competence is high, so that HGT drives the dynamics; the population exhibits a high population entropy and low average fitness. Through a stochastic switching the dynamics reaches a state of low population entropy where the fitness is higher as the population adapts to the fitness landscape. Here the population could evolve slowly toward lower competence. Thus, the dynamics switch back and forth between the low and the high entropy state, remaining longer and longer in the low entropy state as the competence decreases (marked in red). When the competence goes below a critical value (marked by the dashed line in the top panel) the high entropy state disappears (marked in green), the dynamics remains in the low entropy state, the population’s average fitness increases and the first species may robustly evolve.
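
The qualitative scenario of Figure 5 can be mimicked by a deliberately crude two-state toy process in Python. Everything in this sketch is an illustrative assumption: the switching probabilities, the multiplicative decay of the competence while the population is in the low entropy state, and the critical value `c_crit` are invented for the example and are neither part of the model of Section II nor fitted to any simulation.

```python
import random

def toy_transition(c0=5.0, c_crit=1.5, decay=0.01, max_steps=100_000, seed=0):
    """Two-state caricature of the Figure 5 scenario.

    The state is either "high" (distributed) or "low" (speciated). All rates
    and functional forms below are illustrative assumptions."""
    rng = random.Random(seed)
    c, state, switches = c0, "high", 0
    for step in range(max_steps):
        if c < c_crit:
            # Below the critical competence only the speciated state remains.
            return {"steps": step, "c": c, "state": "low", "switches": switches}
        if state == "high":
            # Escape from the distributed state is rarer for larger c.
            if rng.random() < 0.2 / c:
                state, switches = "low", switches + 1
        else:
            # In the selection-dominated state the competence slowly decays.
            c *= 1.0 - decay
            # A return to the distributed state is likelier for larger c.
            if rng.random() < 0.05 * c / c0:
                state, switches = "high", switches + 1
    return {"steps": max_steps, "c": c, "state": state, "switches": switches}

result = toy_transition()
```

Because the competence only decreases during visits to the low entropy state, and lower competence in turn lengthens those visits, the toy process ratchets toward `c_crit`, which is the feedback the caption describes in words.
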

A decrease in competence may rely on a mechanism that combines the stochastic switching uncovered above with the suggestion that fitter populations may tend to be less prone to HGT events, as schematically illustrated in Figure 5. As the speciated state is always stable, even if the population’s competence is high, the population dynamics will stochastically switch to this state repeatedly for relatively short times. In the selection-dominated state (i.e., at low $S$) the population’s fitness increases. A fitter population that might be less prone to HGT events, as suggested recently [50], has a decreased overall competence (lower $c$ in our simplified model setting). Smaller $c$ in turn increases the stochastic residence times the population spends in the selection-dominated state. This combination of two mutually amplifying contributions (decreasing HGT rate and increasing fitness in the population) may yield decreasing competence in the long term such that after sufficiently many switches to the low-$S$ state, the competence may drop below a critical value where the distributed state disappears. The population then stays localized around the fittest genotypes, thus marking the time of transition to Darwinian evolution. At this time, the first species can robustly emerge. The scenario shown schematically in Figure 5 illustrates one potential course of such repeated switching dynamics, with temporarily increased phases of higher fitness and decreasing HGT competence on long time scales.

IV. CONCLUSION

Our results provide a first glimpse of the possible dynamics that may have led to the emergence of the first species from a distributed state dominated by HGT. We demonstrated that a high competence for HGT in a population may suffice to drive the population into a distributed state (Figure 2). In this state HGT dominates the dynamics, in the sense that it inhibits the population’s ability to adapt to the underlying fitness landscape and thus prevents it from crystallizing into distinct species. Our analysis revealed that, independently of the mutation rate exhibited by the population, HGT can drive the population dynamics into a state where the population is widely spread out in genotype space (Figure 3). We identify this state with a pre-Darwinian collective state envisioned by Woese [15–18].

Similarly, a state where no distinct species exist can emerge if the mutation rate in the population is too large [44, 58, 59]. Above a critical mutation rate (the error threshold) the population cannot adapt to the underlying fitness landscape and will always evolve toward a quasispecies state [44, 58, 59] similar to the distributed state induced by HGT shown above. However, there is a fundamental difference between the dynamics induced by HGT and those induced by mutations: While a mutation rate above an error threshold will always lead to a quasispecies state [44], high rates of HGT as studied above induce a bistability of the dynamics where the distributed state coexists with a localized “speciated” state of low $S$. This coexistence may be essential for the evolution toward lower competence in a population and thus for the emergence of the first species; the coexistence is what enables a population originally in a distributed state to repeatedly switch to a low-$S$ state. As selection plays a major role in such a low-$S$ state, progenotes with lower competence would be selected for.
Thus, with time, the entire population would evolve toward lower competence until the distributed state disappears as selection effects dominate the dynamics and the first species emerge.

For the breakdown of the distributed state it is essential that the population evolves toward a lower competence. That the latter may in principle be possible was already suggested by Vogan and Higgs [50]. Our results on an idealized model now demonstrate how stochastic switching and fitness-dependent competence may combine to create a transition from a bistable state to a speciated-only state. In particular, they also suggest that HGT may be present at similar competence levels before and after the emergence of the first species. From a complementary perspective, whereas one or a few species may already have existed, other population parts may still have been mixed without any clear species. So the very first species may only have marked the beginning of the decline of genuinely non-specific life, with other Darwinian transitions to follow.

Acknowledgments:

We thank Nigel Goldenfeld, Stefan Grosskinsky, Oskar Hallatschek and Arne Traulsen for valuable comments and discussions. Supported by a grant of the Max Planck Society to MT.

[1] C. R. Darwin, On the Origin of Species (John Murray, London, UK, 1859).
[2] M. W. Nirenberg, O. W. Jones, P. Leder, B. F. C. Clark, W. S. Sly, and S. Pestka, “On the coding of genetic information,” in Cold Spring Harbor Symposia on Quantitative Biology, Vol. 28 (Cold Spring Harbor Laboratory Press, 1963) pp. 549–557.
[3] C. R. Woese,