[PDF] Phylodynamics of rapidly adapting pathogens: extinction and speciation of a Red Queen

Abstract

Rapidly evolving pathogens like influenza viruses can persist by accumulating antigenic novelty fast enough to evade the adaptive immunity of the host population, yet without continuous accumulation of genetic diversity. This dynamical state is often compared to the Red Queen evolving as fast as it can just to maintain its foothold in the host population: Accumulation of antigenic novelty is balanced by the build-up of host immunity. Such Red Queen States (RQS) of continuous adaptation in large rapidly mutating populations are well understood in terms of Traveling Wave (TW) theories of population genetics. Here we shall make explicit the mapping of the established Multi-strain Susceptible-Infected-Recovered (SIR) model onto the TW theory and demonstrate that a pathogen can persist in RQS if cross-immunity is long-ranged and its population size is large populations allowing for rapid adaptation. We then investigate the stability of this state focusing on the rate of extinction and the rate of "speciation" defined as antigenic divergence of viral strains beyond the range of cross-inhibition. RQS states are transient, but in a certain range of evolutionary parameters can exist for the time long compared to the typical time to the most recent common ancestor ( T MRCA ). In this range the steady TW is unstable and the antigenic advance of the lead strains relative to the typical co-circulating viruses tends to oscillate. This results in large fluctuations in prevalence that facilitate extinction. We shall demonstrate that the rate of TW fission into antigenically uncoupled viral populations is related to fluctuations of T MRCA and construct a "phase diagram" identifying different regimes of viral phylodynamics as a function of evolutionary parameters.

Full PDF

PPhylodynamics of rapidly adapting pathogens: extinction and speciation of aRed Queen.

Le Yan, Richard A. Neher, and Boris I. Shraiman Kavli Institute for Theoretical Physics,UC Santa Barbara Biozentrum, University of Basel,Switzerland (Dated: October 30, 2018)

Rapidly evolving pathogens like inﬂuenza viruses can persist by accumulating antigenicnovelty fast enough to evade the adaptive immunity of the host population, yet withoutcontinuous accumulation of genetic diversity. This dynamical state is often comparedto the Red Queen evolving as fast as it can just to maintain its foothold in the hostpopulation: Accumulation of antigenic novelty is balanced by the build-up of host immu-nity. Such Red Queen States (RQS) of continuous adaptation in large rapidly mutatingpopulations are well understood in terms of Traveling Wave (TW) theories of popula-tion genetics. Here we shall make explicit the mapping of the established Multi-strainSusceptible-Infected-Recovered (SIR) model onto the TW theory and demonstrate thata pathogen can persist in RQS if cross-immunity is long-ranged and its population sizeis large populations allowing for rapid adaptation. We then investigate the stabilityof this state focusing on the rate of extinction and the rate of “speciation” deﬁned asantigenic divergence of viral strains beyond the range of cross-inhibition. RQS states aretransient, but in a certain range of evolutionary parameters can exist for the time longcompared to the typical time to the most recent common ancestor ( T MRCA ). In thisrange the steady TW is unstable and the antigenic advance of the lead strains relative tothe typical co-circulating viruses tends to oscillate. This results in large ﬂuctuations inprevalence that facilitate extinction. We shall demonstrate that the rate of TW ﬁssioninto antigenically uncoupled viral populations is related to ﬂuctuations of T MRCA andconstruct a “phase diagram” identifying diﬀerent regimes of viral phylodynamics as afunction of evolutionary parameters.

In a host population that develops long lasting immu-nity against a pathogen, the pathogen can persist eitherby infecting immunological naive individuals such as chil-dren or through rapid antigenic evolution that enablesthe pathogen to evade immunity and re-infect individu-als. Childhood diseases like measles or chicken pox fallinto the former category, while inﬂuenza virus popula-tions adapt rapidly and reinfect most humans multipletimes during their lifespan. Continuous adaptation of in-ﬂuenza viruses is facilitated by high mutation rates and itis common that many diﬀerent variants of the same sub-type co-circulate. Nevertheless, almost always a singlevariant eventually outcompetes the others such that di-versity within one subtype remains limited (Petrova andRussell, 2018).The contrast of rapid evolution while maintaining lim-ited genetic diversity is most pronounced for the inﬂuenzavirus subtype A/H3N2. Fig. 1 shows a phylogenetic treeof HA sequences of type A/H3N2 with the characteristic“spindly” shape. The most recent common ancestor ofthe population is rarely more than 3-5 years in the past.Other pathogenic RNA viruses that typically don’t rein-fect the same individual, (measles, mumps, HCV or HIV)diversify for decades or centuries (Grenfell et al. , 2004).Interestingly, inﬂuenza B has split into two co-circulatinglineages in the 1970ies which by now are antigenicallydistinct (Rota et al. , 1990) and maintain intermediate levels of diversity. Thus, one may wish to understandunder what conditions a virus can continuously evolvein competition with host immunity, neither going extinctnor spawning diverging lineages.A variety of mechanisms have been proposed that canlead to spindly phylogenies in rapidly evolving viral pop-ulations. A low dimensional antigenic space might limitthe number of distinct directions in which the populationcan evolve thereby preventing diversiﬁcation (Andreasen et al. , 1997; Andreasen and Sasaki, 2006; Gog and Gren-fell, 2002). This scenario seems inconsistent with esti-mates of the dimensionality of “antigenic shape space”( d = 5 (Perelson and Oster, 1979)) and the number ofdistinct positions in surface proteins that evolve underimmune selection (Bhatt et al. , 2011; Koel et al. , 2013;Neher et al. , 2016). Others have shown that competitionbetween lineages and long range cross-immunity betweenstrains can prevent diversiﬁcation, eﬀectively canalizingthe population into a single lineage (Bedford et al. , 2012;Ferguson et al. , 2003; Tria et al. , 2005). Deleterious mu-tational load also facilitates the maintenance of a singlelineage (Koelle and Rasmussen, 2015).While these previous studies have identiﬁed competi-tion mediated by cross-immunity as important determi-nants of lineage structure, these prior works mostly reliedon simulations with the aim to recapitulate patterns ob-served for human seasonal inﬂuenza viruses. Here, we a r X i v : . [ q - b i o . P E ] O c t In ﬂ uenza A/H3N2 In ﬂ uenza B VictorialineageYamagata lineage T M R C A A/H3N2 T MRCA = 2.0 T MRCA = 1.1 T M R C A A/H1N1pdm(2009) T MRCA = 2.4 T MRCA = 1.0 T M R C A A/H1N1(1918) T MRCA = 3.7 T MRCA = 2.4 T M R C A B/Vic T MRCA = 3.3 T MRCA = 2.1 year T M R C A B/Yam T MRCA = 4.4 T MRCA = 2.6

FIG. 1

Spindly phylogenies and speciation in diﬀerent human seasonal inﬂuenza virus lineages.

The top leftpanel shows a phylogeny of the HA segment of inﬂuenza A virus of subtype H3N2 from its emergence in 1968 to 2018. Thevirus population never accumulates much diversity but is rapidly evolving. The lower left panel shows a phylogeny of the HAsegment of inﬂuenza B viruses from 1940 to 2018. In the 70ies, the population split into two lineages known as Victoria (B/Vic)and Yamagata (B/Yam). The graphs on the right quantify diversity via the time to the most recent common ancestor T MRCA for diﬀerent inﬂuenza virus lineages. Inﬂuenza B viruses harbor more genetic diversity than inﬂuenza A viruses. The subtypeA/H3N2 in particular coalesces typically in 3y while deeps splits in excess of 5y are rare. show that generic stochastic models of antigenic evolu-tion with ﬁnite cross-immunity are compatible with thespindly phylogenies of inﬂuenza viruses if the range ofcross-immunity is large compared to population diversity.If cross-immunity decays more rapidly with mutationaldistance, viral population becomes prone to speciation.Finally, if antigenic evolution is too slow, the virus willgo extinct after a brief pandemic. This rich behavior iscontrolled by three dimensionless parameters that have adirect relationship to the parameters of the multi-strainSIR model. We show how multi-strain SIR models relateto traveling wave models of adaptive evolution (Desai andFisher, 2007; Neher, 2013; Rouzine et al. , 2003; Tsimring et al. , 1996) and how extinction and speciation relate tooscillations of prevalence and genetic diversity.

Model

A model of an antigenically evolving pathogen popula-tion needs to account for cross-immunity between strainsand the evolution of antigenically novel strains. We usean extension of the standard multi-strain SIR model (Gogand Grenfell, 2002). The fraction of individuals I a in-fected with viral strain a changes according to ddt I a = βS a I a − ( ν + γ ) I a (1)where β is the transmissibilty, S a is the fraction of thepopulation that is susceptible to strain a , ν is the recov-ery rate, and γ is the population turnover rate. Suscep-tibility of strain S a depends on the fraction R b of thepopulation recovered from infections with strain bS a = e − (cid:80) b K ab R b (2) ddt R a = νI a − γR a (3) successfullineage FIG. 2

Viral populations escape adaptive immunity byaccumulating antigenic mutations.

Via cross-reactivity,the immunity foot-print of ancestral variants (center of thegraph) mediates competition between related emerging viralstrains and can drive all but one of the competing lineagesextinct. At high mutation rates and relatively short range ofantigenic cross-reactivity, diﬀerent viral lineages can escapeinhibition and continue to evolve independently.

The matrix K ab quantiﬁes the cross-immunity to strain a elicited by infection with strain b , while Eq. 3 describesrecovery. This model approximates the population sus-ceptibility by the average number of infections with eachstrain b and ignores the explicit infection history of indi-viduals.New strains are constantly produced by mutation withrate m . The novel strain will diﬀer from its parent at oneposition in its genome. We shall consider only mutationsthat contribute to the loss of immune recognition andassume that cross-immunity decays exponentially withthe number of mutations that separate two strains: K ab = e − | a − b | d (4)where | a − b | denotes the mutational distance between thetwo strains and d denotes the radius of cross-immunitymeasured in units of mutations. Antigenic space isthereby assumed to be high dimensional and antigenicdistance is proportional to genetic distance in the phylo-genetic tree (Neher et al. , 2016). The distance betweentwo contemporaneous strains is on average twice the dis-tance to their common ancestor.Cross-immunity and the mutation/diversiﬁcation pro-cess are illustrated in Fig. 2. An infection with a virus(center of the graph) generates a cross-immunity foot-print (shaded circles). Mutation away from the focalstrain reduces the eﬀect of existing immunity in the hostpopulation, but complete escape requires many muta-tions. Hence closely related viruses compete against each other for susceptible individuals.The above model was formulated in terms of the de-terministic Eqs. 1 and 2. The actual dynamics, however,is stochastic in two respects: i) antigenic mutations aregenerated at random with rate m and ii) stochasticityof infection and transmission. This stochasticity can becaptured by interpreting the terms in Eq. 1 as rates ofdiscrete transitions in a total population of N h hosts.The latter manifestation of stochasticity is particularlyimportant for novel mutant strains that are rare. Thegreat majority of novel strains are quickly lost to stochas-tic extinction even if they have a growth advantage dueto antigenic novelty. To account for stochasticity in acomputationally eﬃcient way, we employ a clone-basedhybrid scheme where mutation and the dynamics of raremutants is modeled stochastically, while common strainsfollow the deterministic dynamics, see Clone-based sim-ulation in Methods. To simplify the analysis, we willassume that the population turn-over rate γ is small com-pared to other time scales of the system and we will set γ = 0. We will use the recovery rate ν to set the unit oftime, ﬁxing ν = 1 in rescaled units. The remaining pa-rameters of the model are 1) the transmission rate β - inour units the number of transmission events per infectionand hence equal to the basic reproduction number R ,2) the mutation rate m , 3) the range of cross-immunity d measured as a typical number of antigenic mutationsneeded for an e -fold drop of cross-inhibition, and 4) thehost population size N h , which controls the number of op-portunities for adaptive evolution and the time it takesfor a newly mutated strain to reach population frequency(i.e. prevalence) of order one.Before proceeding with a quantitative analysis we dis-cuss diﬀerent behaviors qualitatively. Fig. 3A shows sev-eral trajectories of prevalence I tot = (cid:80) a I a (i.e. total ac-tively infected fraction) for several diﬀerent parameters.Depending on the range of cross-immunity, the pathogeneither goes extinct after a single pandemic (red lines) orsettles into a persistently evolving state, the Red QueenState (RQS) traveling wave (Van Valen, 1973). In largepopulations the RQS exhibits oscillation in prevalence.The RQS is only transient, but its lifetime increases withthe host population size N h and the mutation rate m .To quantitatively understand the dependence on param-eters (shown schematically in Fig. 3BC), we will furthersimplify the model and establish a connection to modelsof rapid adaptation in population genetics. Large eﬀect antigenic mutations allow transition frompandemic to seasonal dynamics

A novel virus in a completely susceptible populationwill initially spread with rate β − β − . The trajec-tory of such an pandemic strain in the time-susceptibility log( N h s ) d E x t i n c t i on log( s=m ) l og ( N h s )

10 20 30ExtinctionTW RQSOscillatory RQS t= = sw N h I t o t ( t ) Extinction q log q SpeciationOscillatoryRQSTWRQSTransientTransientSpeciation OscillatoryRQS (A)(B) (C)

TWRQS log( s=m ) log( s=m ) d d log( s=m ) susceptibility t i m e antigenic mutationspandemic strain endemic strainsnose (D) FIG. 3

Phases in the multi-strain SIR simulation and the transition from the pandemic to endemic. (A) Typicaltrajectories of infection prevalence in the regime of extinction (red), traveling wave RQS (yellow) and oscillatory RQS (green)which are realized for diﬀerent values of model parameters ( N h and m ). (B,C) The extended multi-strain SIR model supportsa long-lived but transient RQS regime in an intermediate range of parameters, ﬂanked be the regime of deterministic extinction(red) and the regime of continuous branching and diversiﬁcation – the “speciation” regime (blue). Panel (B) shows dependenceon population size and mutation rate: small populations at low mutation rates go extinct, while large and rapidly mutatingpopulations continuously produce divergent lineages (speciation). Simulation results supporting this phase diagram are shownin S2. Panel (C) presents dependence on the range of antigenic cross-inhibition ( d ) and the population size. The range of crossinhibition controls the crossover between RQS and speciation regimes. The RQS regime itself undergoes a transition from asteady travelling wave (yellow) to a limit cycle oscillation (green) with increasing population size. Panel (D) is a schematicillustrating the dynamics of the RQS state. A novel pandemic strain (red) initially expands in fully susceptible population. Asthe cumulative number of infected individuals increases, the susceptible fraction decreases, and survival of the strain dependson the emergence of antigenic escape mutations (grey). The pathogen may then settle into an endemic, Red Queen-type statewhere new antigenic variants are continuously produced. The top part of the panel illustrates the population composition at aparticular time point. Rare pioneering variants are q mutations ahead of the dominant variant and grow with rate x n . Diﬀerentlineages are related via their phylogenetic tree embedded in the ﬁtness distribution in the population. plane is indicated in red in Fig. 3D. Further infections inthe contracting epidemic will then push susceptibility be-low β − – the propagation threshold for the virus – andwithout rapid antigenic evolution the pathogen will goextinct after a time t ∼ β − log N h . Such boom-bust epi-demics are reminiscent of the recent Zika virus outbreakin French Polynesia and the Americas where in a shorttime a large fraction of the population was infected anddeveloped protective immunity (O?Reilly et al. , 2018).Persistence and transition to an endemic state is onlypossible if the pathogen can evade the rapid build-up ofimmunity via a small number of large eﬀect antigenicmutations. This process is indicated in Fig. 3D by hor-izontal arrows leading to antigenically evolved strainsof higher susceptibility and bears similarity to the con-cept of “evolutionary rescue” in population genetics (Go-mulkiewicz and Holt, 1995). The parameter range of theidealized SIR model that avoid extinction after a pan- demic resulting in persistent endemic disease is relativelysmall. Yet, various factors like geographic structure, het-erogeneity of host adaptation and population turn-overslow down the pandemic and extinction, thereby increas-ing the chances of suﬃcient antigenic evolution to enterthe endemic, RQS-type, regime. The 2009 pandemic in-ﬂuenza A/H1N1 has undergone such a transition froma pandemic to a seasonal/endemic state. We shall notinvestigate the transition process in detail here, but willassume that endemic regime has been reached. Long range cross-immunity results in evolving but lowdiversity pathogen populations

Once the pathogen population has established an en-demic circulation through continuous antigenic evolution(green and yellow regimes in Fig. 3), the average rate ofnew infections β (cid:80) a I a S a /I tot ﬂuctuates around the rateof recovery ν = 1 (in our time units). This balance ismaintained by the steady decrease in susceptibility dueto rising immunity against resident strains and the emer-gence of antigenically novel strains, see Fig. 3D. If thetypical mutational distance between strains is small com-pared to the cross-immunity range d , the rate at whichsusceptibility decreases is similar for all strains. To seethis we diﬀerentiate Eq. 2 with respect to t and divideby S a : S − a ddt S a ( t ) = − (cid:88) b e − | a − b | d I b ≈ − I tot + (cid:88) b | a − b | d I b (5)where we have used that | a − b | (cid:28) d for all pairs of strainswith substantial prevalence. In fact it will suﬃce to keeponly the ﬁrst, leading, term on the right hand side. Closeto a steady state, prevalent strains obey βS a ≈

1. Wecan hence deﬁne the instantaneous growth rate of strain x a = ( βS a − (cid:28) ddt I a = x a I a ddt x a ≈ − I tot (6)The second equation means that eﬀective ﬁtness of allstrains a decreases approximately at the same rate sincethe pathogen population is dominated by antigenicallysimilar strains.If a new strain c emerged from strain a by a single anti-genic mutation, its mutational distance from a strain b is | c − b | = | a − b | + 1 and K cb = K ab e − d − ≈ K ab (1 − d − ).The population susceptibility of strain c is therefore in-creased to S c ≈ e − (1 − d − ) (cid:80) b K ab R b ≈ S a (cid:18) − log S a d (cid:19) (7)Since the typical susceptibility is of order β − , the growthrate of strain of the mutant c is s = d − log β higher thanthat of its parent. The growth rate increment plays therole of a selection coeﬃcient in typical population geneticmodels and corresponds to the step size of the ﬁtness dis-tribution in Fig. 3D. In this model, individuals within aﬁtness class (bin of the histogram) are equivalent anddiﬀerent classes can be modeled as homogeneous popula-tions which greatly accelerates numerical analysis of themodel, see Methods.The simpliﬁed model in Eq. 6 is analogous to the trav-eling wave (TW) models of rapidly adapting asexual pop-ulations that have been studied extensively over the pasttwo decades (Desai and Fisher, 2007; Hallatschek, 2011;Rouzine et al. , 2003; Tsimring et al. , 1996), see (Neher,2013) for a review. These models describe large popula-tions that generate beneﬁcial mutations rapidly enoughthat many strains co-circulate and compete against each other. The ﬁttest (most antigenically advanced) strainsare often multiple mutational steps ahead of the mostcommon strains. This “nose” of the ﬁtness distributionscontains the strains that dominate in the future and theonly adaptive mutations that ﬁxate in the populationarise in pioneer strains in the nose. Consequently, therate with which antigenic mutations establish in the pop-ulation is controlled by the rate at which they arise in thenose (Desai and Fisher, 2007). If the growth rate at thenose of the distribution, x n , is much faster than antigenicmutation rate, x n (cid:29) m it takes typically τ a = log( x n /m ) x n (8)generations before a novel antigenic mutation arises ina newly arisen pioneer strain that grows exponentiallywith rate x n . The advancement of the nose is bal-anced rapidly by the increasing population mean ﬁt-ness. Previous analyses have shown that in the limitof very large population N (cid:29)

1, the ﬁtness distributionhas an approximately Gaussian shape with a variance σ ≈ s log( N s ) / log ( x n /m ). The wave is σ/s muta-tions wide, while the most advanced strains are approxi-mately q = 2 log( N s ) / log( x n /m ) ahead of the mean (De-sai and Fisher, 2007). Two contemporaneous lineages co-alesce on a time scale τ sw = sq/σ = s − log( x n /m ) andthe branching patterns of the tree resemble a Bolthausen-Sznitman coalescent rather than a Kingman coalescent(Desai et al. , 2013; Neher and Hallatschek, 2013). Wenote that, while parameter N in the TW analysis summa-rized above is the ﬁxed population size, the correspondingentity in our SIR model is the pathogen population size N p , which is related to the (ﬁxed) host population size N h by N p = N h ¯ I where ¯ I is the average viral prevalence,which itself depends on other parameters of the model,scaling in particular with s . Hence, it will be convenientfor us to use N h s as one of the relevant “control param-eters”, replacing N of the standard TW model. A recentrelated work, that also explicitly maps a multi-strain SIRmodels to the TW models, but does not consider the roleof population size ﬂuctuations (Rouzine and Rozhnova,2018). Stability and ﬂuctuations of the RQS

In contrast to most population genetic models of rapidadaptation, our epidemiological model does not controlthe total population size directly. Instead, the pathogenpopulation size (or prevalence) depends on the host sus-ceptibility, which in itself is determined by recent anti-genic evolution of the pathogen. The coupling of thesetwo diﬀerent eﬀects results in a rich and complicated dy-namics: The ﬁrst eﬀect is ecological: a bloom of pathogendepletes susceptible hosts leading to a crash in pathogenpopulation and a tendency of the population size to os- time t= = sw = I tot ( t ) = sw x n ( t ) h x n ( t ) < ( t + = ) i t = = = sw -2 -1 0 1 2-101 mean - tness = sw x -10 -5 0 5 10 ? = l og ( I t o t < ! ) -4-202 0 10 20 30 40Constant < Deterministic DDEStochastic DDEFC simulation nose - tness = sw x n ? = l og ( I t o t < ! ) -4-202 (C)(B)(A) FIG. 4

Oscillatory RQS. (A) An example of the stochasticlimit cycle trajectory from the ﬁtness-class simulation. Notethe rapid rise and fall of infection prevalence (blue), whichcauses a drop in nose ﬁtness (yellow) which subsequently re-covers (approximately linearly) during the remainder of thecycle. Fluctuations in I tot ( t ) and x n ( t ) from cycle to cycle arecaused by the stochasticity of x n , i.e. antigenic evolution in pi-oneer strains. A particularly large ﬂuctuation about τ sw priorto the end, caused a large spike in prevalence, followed by thecollapse of x n below zero and complete extinction. Inset (red)shows the cross-correlation between x n and σ which peakswith the delay τ = τ sw (additional peaks reﬂect the oscillatorynature of the state and are displaced by integer multiples ofmean period); (B) A family of limit cycles in infection preva-lence/mean ﬁtness plane as described by Eq. (9) with ﬁxedvariance. The variation of σ governed by the Eqs. 9-10 (in thedeterministic limit) reduces the family to a single limit cycle(red); (C) Trajectories in the infection prevalence/nose ﬁtnessgenerated by the stochastic DD system in the regime above(right panel) and below (left panel) the oscillatory instabilityof the deterministic DD system. cillate (London and Yorke, 1973). The second eﬀect isevolutionary: higher nose ﬁtness x n begets faster anti-genic evolution and vice versa, resulting in an apparentinstability in the advancement of the antigenic pioneerstrains. This instability has been identiﬁed in the studyof adaptive traveling waves (Fisher, 2013). In our epi-demiological model, ﬂuctuations in the rate of antigenic advance of the pioneer strains couple, with a delay of τ sw ,to the ecological oscillation.To recognize the origin of the oscillatory tendency, con-sider the total prevalence I tot and the mean ﬁtness of thepathogen X = (cid:80) a x a I a /I tot ddt I tot = XI tot ; ddt X = σ − I tot (9)At ﬁxed variance σ = ¯ σ this system is equivalent to a non-linear oscillator, describing a family of limit cycles oscil-lating about I tot = ¯ σ and X = 0 as shown in Fig. 4B.While Eq. (9) describes the behavior of commonstrains, the dynamics of the antigenic pioneer strains isgoverned by the equation for x n that in a continuum limit(suitable for the limit of high mutation rate) reads: ddt x n = τ − x n − I tot + sξ ( t ) (10)The ﬁrst term on the right hand side represents the ratewith which antigenic pioneer strains enter the popula-tion, τ − a , advancing the nose ﬁtness by an increment s ( τ − a s = τ − x n ). The second term on the right hand sideof Eq. (10) represents gradual reduction of susceptibilityof the host population, and ξ ( t ) is a random noise variablerepresenting the stochasticity of the establishment of newstrains. (The Gaussian white noise ξ ( t ) is deﬁned statis-tically by its correlation function (cid:104) ξ ( t ) ξ (0) (cid:105) = τ − a δ ( t ),see Methods. )The ﬁrst term of Eq. (10) captures the apparent in-stability of the nose: an advance of the nose to higher x n accelerates its rate of advancement. The stabilizingfactor is the subsequent increase in I tot , but to see howthat comes about we must connect Eq. (10) to Eq. (9).The connection is provided by σ since it is controlled bythe emergence of novel strains, i.e. the dynamics of the“nose” x n , which impacts the bulk of the distributionafter a delay τ sw . Based on the analysis detailed in theAppendix A we approximate σ ( t ) ≈ τ − x n ( t − τ sw ) (11)relating population dynamics, Eq. (9), to antigenic evo-lution of pioneer strains described by Eq. (10). Takentogether Eqs. (9-11) deﬁne a Diﬀerential Delay (DD)system of equations derived in Appendix A. Samplesimulations of this stochastic DD system are shown inFig. 4(BC). The delay approximation Eq. (11) is sup-ported by the cross-correlation of x n ( t ) and σ ( t (cid:48) ) mea-sured using ﬁtness-class simulations (see Fig. 4A Inset)The deterministic limit of the DD system (obtained byomitting the noise term in Eq. (10)) has two qualitativelydiﬀerent regimes that correspond to the TW and oscil-latory regimes. Small deviations from the steady statewith τ − ¯ x n = ¯ σ = 2 τ − log( N h ¯ I ), are underdampedand oscillate with frequency ω = ¯ σ = τ sw − (cid:112) N h ¯ I )(as determined by linearizing Eq. (9)). When ωτ sw < π , q = e x t = = s w log( N h s ) = 12 : N h s ) = 10 : N h s ) = 8 : N h s ) = 6 : N h s ) = 4 : N h s ) = 3 : q = = = s w log( N h I ) = 9 : N h I ) = 6 : N h I ) = 4 : = ext = sp d = 50 d = 125 d = 200 (A) (B) q : FIG. 5 (A) Simulation results for the average extinction time and the average speciation time obtained in the clone-basedsimulation. Extinction time τ ext , scaled with the sweep time τ sw , increases with the depth of the genealogy measured by thenumber of mutations q separating the most antigenically advanced strain from the most common strain. Also shown is thecharacteristic time to speciation τ sp , which increases with the range of cross-inhibition and decreases with q . The crossoverof the two time scales deﬁnes the transition from transient RQS to speciation. (B) Extinction time over a broad range ofparameters, obtained via ﬁtness class-based simulation of population dynamics, conﬁrms its primary dependence on q for largepopulation sizes. Note the agreement between the results of the ﬁtness class-based simulation (black line in (A)) and theclone-based simulation (colored squares in (A)). the delayed feedback via I tot stabilizes the steady state,while in the opposite regime, the system fails to recoverfrom a deviation of the nose in a single period and thesteady state becomes unstable to a limit cycle oscillation.As the nonlinearity of Eq. (9) implies a longer period withincreasing amplitude, the system is stabilized at a limitcycle with the period long enough compared to the feed-back delay τ sw . In Appendix A we derive the threshold ofoscillatory instability to lie at log( N h ¯ I osc s ) ≈ . T ≈ . τ sw , see Fig. S1 in SI). Wealso ﬁnd that the amplitude of the oscillation log( I max / ¯ I )scales as log( N h ¯ I ) for large values of the later. This tran-sition deﬁnes quantitatively the boundary between theTW RQS and the Oscillatory RQS regimes that appearon the phase diagrams in Fig. 3(BC).The distinction between the TW and Oscillatory RQSis obscured by the stochasticity of antigenic advance,Eq. (10), which continuously feeds the underdamped re-laxation mode, generating a noisy oscillation with thefrequency ω deﬁned above. The diﬀerence between thetwo regimes is illustrated by Fig. 4C: in the TW RQSnoisy oscillation is about the ﬁxed point, whereas in theOscillatory RQS it is about deterministic limit cycle.Interestingly, the dynamics of the Oscillatory RQS,as shown in Fig. 4A, can be understood in terms of anon-linear relaxation oscillator. At relatively low infec-tion prevalence nose ﬁtness x n increases until rising I tot catches up with it (when I tot = τ − x n ) driving it downrapidly. Once this “mini-pandemic” burns out, the pop-ulation returns to the low prevalence part of the cycle I tot < τ − x n , when x n begins to increase again. Thisrelaxation oscillator approximation is discussed in more detail in the Supplementary Information. The rate of extinction

While in the deterministic limit the diﬀerential-delaysystem predicts a stable steady TW (for q > q ex , ¯ I < ¯ I osc ) and a limit cycle (above ¯ I osc ), (see SI) ﬂuctua-tions in the establishment of the antigenic pioneer strains(Eq. (10)) can lead to stochastic extinction. In fact,both the TW and Oscillatory RQS (see Fig. 3BC) aretransient, subject to extinction due to a suﬃciently largestochastic ﬂuctuation. (Note however the contrast withthe “Extinction” state in Fig. 3BC, where extinction isdeterministic and rapid.) The rate of extinction dependson q and log( N h ¯ I ) as shown in Figure 5A, which com-pares results of the clone-based simulations of the multi-strain SIR model with the ﬁtness class-based simulation.Although extinction is ﬂuctuation driven, the mechanismof extinction in the oscillatory state is related closely tothe deterministic dynamics, according to which large am-plitude excursion in infection prevalence can lead to ex-tinction. A large x n advance leads, after a time τ sw toa rise in prevalence I tot , followed by the rapid fall inthe number of susceptible hosts and hence loss of viralﬁtness. This turns out to be the main mode of ﬂuctu-ation driven extinction as illustrated by Fig. 4C. Oneexpects extinction to take place when the ﬂuctuation in-duced deviation of x n (from its mean) δx ≈ s (cid:112) τ ext /τ a becomes of the order of the mean δx ≈ ¯ x n θ (log( N h ¯ I ))with the extinction threshold θ (log( N h ¯ I )) dependent onthe shape of the oscillatory limit cycle (as it depends onthe minimum of infection prevalence during the cycle).This argument suggests τ ext /τ sw ∼ f ( √ qθ (log( N h ¯ I ))) –a functional relation borne out by the results of numeri-cal simulations in Fig. 5. We note that the rate increasein τ ext with increasing q slows down in the oscillatoryregime and appears to approach a power law dependence τ ext /τ sw ∼ q . (albeit over a limited accessible range):presently we do not have analytic understanding of thisspeciﬁc functional form. The rate of speciation

The correspondence of the multi-strain SIR andthe TW models discussed above assumes that cross-immunity decays slowly compared to the coalescent timeof the populations, i.e., d/q (cid:29)

1. In this case, allmembers of the population compete against each otherfor the same susceptible hosts. Conversely, if the viralpopulation were to split into two sub-populations sep-arated by antigenic distance greater than the range ofcross-inhibition d , these sub-population would no-longercompete for the hosts, becoming eﬀectively distinct vi-ral “species” that propagate (or fail) independently ofeach other. Such a split has for example occurred amonginﬂuenza B viruses, see Fig. 1.A “speciation” event corresponds to a deep split inthe viral phylogeny, with the T MRCA growing withoutbounds, see Fig. 1 and Fig. 6A. This situation contraststhe phylogeny of the single competing population, where T MRCA ﬂuctuates with a characteristic ramp-like struc-ture generated by stochastic extinction of one of the twooldest clades. In each such extinction event the MRCAjumps forward by δT . Hence the probability of specia-tion depends on the probability of the two oldest clades topersist without extinction for a time long enough to accu-mulate antigenic divergence in excess of d . The combinedcarrying capacity of the resulting independent lineages isthen twice their original carrying capacity as observed insimulations, see Fig. 6B.To gain better intuition into this process let’s followtwo most antigenically advanced “pioneer strains”. Inthe TW approximation one of these will with high prob-ability belong to the backbone giving the rise to the per-sisting clade, while the other clade will become extinct,unless it persist long enough to diverge antigenically be-yond d , becoming a speciation event. As their antigenicdistance gradually increases, the two clades are evolvingto evade immunity built up against the common ances-tor. The less advanced of the two clades is growing lessrapidly and takes longer to generate antigenic advancemutations, resulting in still slower growth and slowerantigenic advance. Deep splits are hence unstable andit is rare for a split to persist long enough for speciation.In Appendix B we reformulate this intuition mathemati-cally as a “ﬁrst passage”-type problem which shows that T MRCA distribution has an exponential tail which gov-erns the probability of speciation events. The propensityto speciate depends on the radius of cross-immunity d and the typically genetic diversity q . Fig. 6 shows thatthe time to speciation increases approximately exponen-tially with the ratio d/q . More precisely we found thataverage simulated speciation time behaves as τ ∗ sw e f ( d/q ∗ ) with “eﬀective” τ ∗ sw = τ sw / (1 + log q/ log( s/m )) and q ∗ = q (1 + log q/ log( s/m )) picking up an additionallogarithmic dependence on parameters, the exact originof which is beyond our current approximations. Thiscorrection plausibly suggests rapid speciation, τ ∗ sw → m/s → DISCUSSION

The epidemiological and evolutionary dynamics of hu-man RNA viruses show a number of qualitatively distinctpatterns. While agents of classical childhood diseases likemeasles or mumps virus show little antigenic evolution,others viruses like dengue- or norovirus exist in distinctserotypes, while seasonal inﬂuenza viruses undergo con-tinuous antigenic evolution enabling the viruses of thesame lineage to reinfect the same individual.Here, we have integrated classical multi-strain SIRmodels with stochastic models of adaptation to under-stand the interplay between the epidemiological dynam-ics and the accumulation of antigenic novelty. The for-mer is dominated by the most prevalent strains, whilethe latter depends critically on rare pioneer strains thatbecome dominant at later times. The integration of thesetwo diﬀerent crucial aspects of the epi-evolutionary dy-namics allowed to deﬁne a “phase” diagram that summa-rizes qualitatively diﬀerent behavior as a function of therelevant parameter combinations, see Fig. 3B&C.The phase diagram shows diﬀerent combinations ofkey parameters that lead to three distinct outcomes:(1) extinction (red), (2) an evolving but low diversitypathogen population (yellow and green), (3) a deeplybranching and continuously diversifying pathogen pop-ulation (blue). The key parameters are the size of thepopulation log( N h s ), the ratio of mutational eﬀectsand mutation rate log( s/m ), and the cross-immunityrange d . Large d prevents speciation, while rapid mu-tation and large population sizes facilitate speciation.Of the diﬀerent regimes, only extinction (1) and specia- t= = sw

0 20 40 60 80 100 120 T T = = a

0 2q 20 40 60 80 d I t o t = I d=q $ = s p = = $ s w d = 50 d = 67 d = 83 d = 100 d = 125 d = 167 d = 200 y = 1 : e : x ! /- = - = 0 : =d ! /- = - = 0 : =d ! /- = - = 0 : =d ! /- = - = 0 : =d T / T / T (C)(B)(A) FIG. 6

Speciation into antigenically distinct lineages . (A) To speciate, two lineage have to diverge enough to substantiallyreduce cross-reactivity, i.e., T needs to be comparable to d . Inset: Illustration of the deﬁnition of time to most recent commonancestor T and the time interval δT by which T advances. (B) If such speciation happens, the host capacity - the averagenumber of infected individuals increases two-fold. (C) The probability of such deep divergences decreases exponentially withthe ratio d/q ∗ , where eﬀective antigenic diversity is q ∗ = 2 log( N h s ) / log( s/m ). In the presence of deleterious mutations,the relevant q is not necessarily the total advance of the pioneer strains, but only the antigenic contribution. This antigenicadvance q ∗ can be computed as q ∗ = (cid:112) N h s ) σ ag with antigenic variance σ ag = σ − σ β , where σ β is ﬁtness variance dueto deleterious mutations. With this correction, speciation times agree with the predicted dependence (colored lines). tion (3) are truly asymptotic. The intermediate regimesof continuously evolving low diversity pathogen popula-tion - the Red Queen State (RQS) - are strictly speakingmetastable states which eventually either go extinct orundergo branching, but in a certain regime of parame-ters can be very long lived.Outbreaks of emerging viruses that quickly infect alarge fraction of the population, as for example the re-cent Zika virus outbreak in the Americas, fall into regime(1): In 2-3 years, large fractions of the population wereinfected and have developed long-lasting immunity. Asfar as we know, the viral population didn’t evolve anti-genically to escape this build up of herd immunity andthe virus population is not expected to continue to cir-culate in the Americas (O?Reilly et al. , 2018).Diﬀerent inﬂuenza virus lineages, in contrast, persistin the human population, suggesting that they corre-spond to parameters that fall into the RQS region ofthe phase diagram. Furthermore, the diﬀerent subtypesdisplay quantitatively diﬀerent circulation and diversitypatterns that allow for a direct, albeit limited, compar-ison to theoretical models. We know of four seasonalinﬂuenza A lineages: subtype A/H1N1 circulated withinterruption from 1918 to 2009, A/H2N2 circulated forabout 10 years until 1968, A/H3N2 emerged in 1968 andis still circulating today, and the triple reassortant 2009H1N1 lineage, called A/H1N1pdm, settled into a seasonalpattern following the pandemic in 2009. Inﬂuenza Bviruses have split into two separate lineages (B/Victoriaand B/Yamagata) around 1983 (Rota et al. , 1990). Phy-logenetic trees of A/H3N2 and the inﬂuenza B lineagesare shown in Fig. 1. In addition, the ﬁgure shows di-versity of ﬁve lineages as measured by the instantaneous T MRCA through time.The inﬂuenza B lineages tend to be more genetically di-verse than the inﬂuenza A lineages with a typical time tothe most recent common ancestor of around 6 comparedto 3 years, see Fig. 1. A/H3N2 tends to have the lowestdiversity and most rapid population turnover. This dif-ference in diversity is consistent with inﬂuenza B lineagesbeing more prone to speciation.The typical diversity of these viruses needs to be com-pared to their rate of antigenic evolution. Hemagglu-tination inhibition titers drop by about 0.7-1 log2 peryear in A/H3N2 compared to 0.1-0.4 log2 per year forinﬂuenza B lineages (Bedford et al. , 2014; Neher et al. ,2016; Smith et al. , 2004). Hence the ratio of the time re-quired to loose immunity and T is similar for the diﬀerentlineages, suggesting that the distinct rates of genetic andantigenic evolution can not be used as a straight forwardrationalization of the speciation event of Inﬂuenza B andthe lack of speciation of inﬂuenza A lineages. Nor shouldsuch an explanation be expected as there is only a singleobservation of speciation.From the phase diagram, we found the most relevantparameters that determine the fate of a pathogen arethe antigenic diversity q and the range of cross-immunity d . A previous study by Koelle and Rasmussen (2015)has implicated deleterious mutation load as a cause ofspindly phylogenies. Our model can readily incorporatethe eﬀect of deleterious mutations aﬀecting transmissioncoeﬃcient β by δβ and subsequent compensatory muta-tions that increase β . Such a modiﬁcation is expectedto reduce the average β and reduce the selection coeﬃ-cient of antigenic mutations, which in turn reduces theﬁtness variance σ , as derived in Supplementary Infor-0mation. After subtracting the contribution of deleteriousmutations from the the ﬁtness variance, the times to spe-ciation follow the predicted dependence on q and d , seeFig. 6C.While we have shown that the natural tendency of SIRmodels to oscillate couples to the instability of the nose ofthe pathogen ﬁtness distribution, making a quantitativelink to the observed epidemiological dynamics of the ﬂu isdiﬃcult on account of seasonal oscillation in transmissiv-ity. The latter confounding factor is widely believed to bethe cause behind observed seasonality of the ﬂu. Includ-ing explicit temporal variation (in β ) in our model wouldlock the frequency of the prevalence oscillation to theseasonal cycle, possibly resulting in subharmonic mod-ulation, yet distinguishing such a modulation on top ofan already stochastic process is hard. Much remains tobe done: ﬁnite birth rates, distinct age distributions (asfor example is the case for the two inﬂuenza B lineages),realistic distribution of antigenic eﬀect sizes, or very longrange T-cell mediated immunity would all be interestingavenues for future work. METHODSA. Clone-based simulations

We simulate the original model on a genealogical treein two phases: one on the deterministic SIR-type epi-demics and one on the stochastic mutation introducingnew strains. In each time step ∆ t <

1, we apply themid-point method to advance the epidemic equationsEqs. (1,2,3). We then generate a random number uni-formly sampled between zero and one for each survivingstrain with N h I a ≥

1. If the random number is smallerthan mN h I a ∆ t for strain a , we append a new strain b as a descendent to a . The susceptibility to strain b isrelated to susceptibility to strain a via S b = ( S a ) e − /d .In most of the simulations, the transmissibility of diﬀer-ent strains is held constant β . Otherwise we allow fora strain speciﬁed transmissibility that is drawn from itsparent β b = β a − δβ with δβ > β b = β max if the mutation iscompensatory. The new strain grows deterministicallyonly if β b S b > β , recovery rate ν , mutation rateof the virus m , birth/death rate of the hosts γ , the ef-fective cross-immunity range d , and the eﬀective size ofthe hosts N h , whose empirical ranges are summarized inthe Table. I. For ﬂu and other asexual systems in RQS, β (cid:38) ν (cid:29) m, γ , d (cid:29)

1, and N h (cid:29) B. Fitness class-based simulations

The stability of the RQS and the extinction dynamicsis fully captured by the traveling wave equations (6). Wesimulate the traveling wave by gridding the ﬁtness space x into bins of step size s around zero. The infections ofdiﬀerent strains correspond to a natural number in eachbin x i . At each time step, the population in each bin I i updates to a number sampled from the Poisson distribu-tion with parameter λ i = I i (1 + ( x i − ¯ x )∆ t ) determinedby mean ﬁtness x i and a dynamic mean ﬁtness ¯ x , whichincreases by ∆ tI tot /N h , where I tot is the total populationsummed over all bins and N h is the parameter giving thehost population. When ¯ x becomes larger than one binsize s , we shift the all populations to left by one bin andreset ¯ x to 0, a trick to keep only a ﬁnite number of binsin the simulation. At the same time, antigenic mutationis represented by moving a the mutated fraction in eachbin to the adjacent bin on the right. The faction is de-termined by a random number drawn from the Poissondistribution with the mean mI i ∆ t . The typical rangesof the three parameters s , m , and N h are according tothe parameters in the genealogical simulation, as docu-mented also in Table. I. C. Stochastic diﬀerential-delay simulation

To simulate the diﬀerential delay equationsEqs. (9,10,11), we discretize time in increments of∆ t = τ sw /k and update the dynamical variables χ i = x n ( t i ) and η i = I tot ( t i ) via the simple Eulerscheme: χ i +1 = χ i + ∆ t ( χ i − η i ) + χ i qs √ ∆ tξ i ; (12) η i +1 = ¯ I exp  τ sw χ i − k − τ k k (cid:88) j =0 jη i − j  , (13)where ξ i is a Gaussian random variable with zero meanand unit variance. Mean prevalence, ¯ I , enters as thecontrol parameter (which deﬁnes the time average of η i ). D. Inﬂuenza phylogenies

Inﬂuenza virus HA sequences for the subtypesA/H3N2, A/H1N1, A/H1N1pdm, as well as inﬂuenza Blineages Victoria and Yamagata were downloaded from fludb.org .We aligned HA sequences using maﬀt (Katoh et al. ,2002) and reconstructed phylogenies with IQ-Tree(Nguyen et al. , 2015). Phylogenies were further processedand time-scaled with the augur (Hadﬁeld et al. , 2017) andTreeTime (Sagulenko et al. , 2018).1

Symbol meaning comment/range numerical range I a number of individuals infected with strain aS a weighted fraction of individuals susceptible to strain a ∼ . d cross-immunity range ∼

10y [50 , β = νR transmission rate ∼ ν recovery rate ∼

1w 1 γ birth/death rate of people ∼ . y − K ab = e −| b − a | /d cross-immunity of strains a , b ∼

20 mutations τ sw coalescent time scale/sweep time 2 − yT T MRCA ∼ − δT T MRCA ∼ − s Selection coeﬃcient ∼ . w − [0 . , . ∼ . . dm mutation rate beneﬁcial 10 − per week and genome [10 − , − ] I tot = (cid:80) a I i total prevalence 0 . I average prevalence N h host population human population 10 [10 , ]TABLE I Relevant quantities of inﬂuenza virus and parameters in multi-strain SIR model.

APPENDIXA. Diﬀerential-delay approximation of RQS dynamics

Here we derive the diﬀerential delay system of equa-tions that relate the behavior of the pioneer strains withthe bulk of the population. Let us consider the generat-ing function associated with the virus ﬁtness distributionat time t : G ( λ, t ) = (cid:88) i I i ( t ) e λx i ( t ) (14)where x i ( t ) = x n ( t i ) − (cid:82) tt i dt (cid:48) I t ( t (cid:48) ) is the current ﬁtnessof the pioneer strain that ﬁrst appeared at time t i and I i ( t ) is the fraction of the hosts infected by it: I i ( t ) = N − e (cid:82) tti dt (cid:48) x t ( t (cid:48) ) = N − e x n ( t i )( t − t i ) − (cid:82) t − ti dt (cid:48) t (cid:48) I tot ( t − t (cid:48) ) (15)We next take a coarse grained view of pioneer strain es-tablishment replacing the sum in Eq. (13) by an integralover initial times t i → t − τG ( λ, t ) = (cid:90) ∞ dτ N − τ a ( t − τ ) e ( τ + λ ) x n ( t − τ ) − (cid:82) τ dt (cid:48) ( t (cid:48) + λ ) I tot ( t − t (cid:48) ) (16)Let us evaluate the integral in the saddle approximationwhich is dominated by τ = τ ∗ corresponding to the max-imum in the exponential τ ∗ + λ = x n ( t − τ ∗ ) x (cid:48) n ( t − τ ∗ ) + I tot ( t − τ ∗ ) ≈ τ sw (17)where we have used the deterministic limit of Eq (9).To simplify presentation we shall ignore the time depen-dence of τ sw = s − log( x n /m ) replacing x n ( t − τ ∗ ) in thelogarithm by the time average ¯ x n . In the saddle approximation we then havelog N G ( λ, t ) ≈ x n ( t − τ sw + λ ) τ sw − (cid:90) τ sw − λ dt (cid:48) ( t (cid:48) + λ ) I tot ( t − t (cid:48) )(18)(where for simplicity we have omitted the logarithmiccorrections). Note that by deﬁnition G (0 , t ) = I tot ( t ).We can now estimate ﬁtness mean¯ x ( t ) = ddλ log G ( λ, t ) | λ =0 = τ sw [ x (cid:48) n ( t − τ sw ) + I tot ( t − τ sw )] − (cid:90) τ sw dt (cid:48) I tot ( t − t (cid:48) )= x n ( t − τ sw ) − (cid:90) τ sw dt (cid:48) I t ( t − t (cid:48) ) (19)and variance σ ( t ) = d dλ log G ( λ, t ) | λ =0 = τ sw [ x (cid:48)(cid:48) n ( t − τ sw ) + I (cid:48) tot ( t − τ sw )] + I tot ( t − τ sw ) (20)Eq. (20) involves the second derivative x (cid:48)(cid:48) n and expectﬂuctuations in the establishment of new lineages (whichcontribute to x (cid:48) n ) to be quite important. Yet we canget useful insight by continuing to use the deterministicapproximation to x n dynamics, in which case we arriveat simple delay relation between the variance and x n σ ( t ) = τ − x n ( t − τ sw ) (21)which is consistent with the variance calculated for thecase of the steady TW and also satisﬁes the generalizedFisher theorem ddt ¯ x = x (cid:48) n ( t − τ sw ) + I t ( t − τ sw ) − I t ( t )= σ ( t ) − I t ( t ) (22)2Combining Eqs. (9,10,21) we arrive at the determin-istic dynamical system approximating coupled “ecolog-ical” SIR dynamics with the evolutionary dynamics ofantigenic innovation due to the pioneer strains. d dt log I ( t ) = τ − x n ( t − τ sw ) − I tot ( t ) (23) ddt x n ( t ) = τ − x n ( t ) − I tot ( t ) (24)This system admits a family of ﬁxed points of the form τ sw I tot = x n = ¯ x n , but as we show in the SI, the corre-sponding steady TW states are not always stable giv-ing rise to limit cycle oscillations or leading to rapid ex-tinction. Self-consistency condition relating x n and I tot for the steady traveling wave is readily generalized tolimit cycle states. Integrating the diﬀerential-delay sys-tem over one cycle yields (cid:104) x n (cid:105) = τ sw (cid:104) I (cid:105) . An additionalrelation is provided by integrating log N G (0 , t ) over thecycle: (cid:104) log N h I tot (cid:105) = τ (cid:104) I tot (cid:105) (25)A great deal of insight into the behavior of the (deter-ministic) diﬀerential delay system deﬁned above is pro-vided by its deterministic limit (see SI) which deﬁnes thestability “phase diagram” shown in Fig. 3(BC) that cor-rectly captures key aspects of the behavior observed infully stochastic simulations. B. Speciation rate as a stochastic “First Passage” problem.

Speciation occurs when two most distant clades persistto the antigenic independence. This persistence problemcan be formulated as a ﬁrst passage problem by includingthe second ”nose” in the TW approximation.We consider the births of two pioneer strains at time t = 0, as illustrated in Fig. 7. The descendants of thetwo strains forming two branches 1 and 2 diverge in theantigenic space as they persist in time. Suppose that attime t , the nose of branch 1 is at ﬁtness x , and the noseof branch 2 is at x . Before the sweep time t < τ sw , thecross-immunity grows mainly from the prevalent strainsin the common ancestors of the two branches, ddt x i = τ − x i − I tot + sξ i . i = 1 , t > τ sw , the infection bulk splits and moveson to the two branches. As the antigenic distances fromthe noses to the infection bulks on diﬀerent branches arediﬀerent, cross-immunity eﬀects to diﬀerent noses growin diﬀerent rates, ddt x = τ − x − I e − d /d − I e − d /d + sξ ; ddt x = τ − x − I e − d /d − I e − d /d + sξ , (27) t=0 x x d d tsx -x x n 𝜏 𝜏 FIG. 7 Left: Sketch of a branching event at t = 0 with twobranches 1 and 2. The ﬁtnesses of the most ﬁttest strains(noses) in branch 1 and 2 are x and x . Branch 1 is theﬁtter one x > x . The antigenic distances from the cross-immune bulk to the noses of the two branches are d and d .The Gaussian proﬁle in ﬁtness is illustrated in blue. Right:The ﬁtness diﬀerence between the two branches x − x isdoing a biased random walk in time t of step size s with areﬂecting boundary at x = 0 and an absorbing boundary at x = x n . where d and d scale roughly as q , the typical antigenicdistance to the nose. In the limit d ≈ d (cid:38) d , Eqs.(27)reduce to two independent ones of Eq. (10) and the twobranches are thus antigenically independent. What isthe probability of reaching this limit? The approach tothis question rather relies on the persistence probabilityof two branches in the other limit when d ≈ d (cid:46) d , where I + I ≈ I tot cross-immunity growth rate isapproximately the same at both noses.In this limit, the survival probability of the less ﬁtnose maps to a ﬁrst passage problem in the random walkof relative ﬁtness ζ ≡ ( x − x ) /x n . As illustrated inFig. 7, an establishment of nose 1 is a positive step of δζ = s/x n , while an establishment of nose 2 ends up intoa backward step of the same size. As the mutations arrivein characteristic times τ and τ depending on the noseﬁtnesses, in the continuum limit, we have ddt ζ = τ − ζ + sx n ξ, (28)where ξ is a random noise. There are two relevant bound-aries: a reﬂecting boundary at ζ = 0 where two branchesswitch roles in leading the ﬁtness, and an absorbingboundary at ζ = 1 where the ﬁtness of less ﬁt nose dropsbelow the mean ﬁtness and becomes destined.The system can be solved for the probability densitydistribution ρ ( ζ, t ), which obeys ∂ t ρ ( ζ, t ) = − ∂ ζ [ v ( ζ ) ρ ( ζ, t )] + ∂ ζ [ D ( ζ ) ρ ( ζ, t )] , (29)where the drift v and diﬀusivity D depend on ζ , v ( ζ ) = 1 τ sw ζ ; D ( ζ ) ≈ qτ sw . (30)Solving with boundary and initial conditions, ∂ ζ ρ ( ζ, t ) | ζ =0 = 0; ρ ( ζ = 1 , t ) = 0; ρ ( ζ, t = 0) = δ ( ζ ) , (31)3we have ρ ( ζ, t ) = ∞ (cid:88) n =1 e − λ n t/τ sw c n F ( 1 − λ n , , q ζ ) , (32)where F is the generalized hypergeometric function, λ n is the n th smallest values solving F ( − λ , , q ) = 0, andcoeﬃcient c n is determined by the initial condition. Inlong time t , the slowest mode dominates the dynamics.In the large q limit, we have λ = 1. Since F ≈ constfor ζ ∈ (0 , P ( T > t ) ≈ ce − t/τ sw . (33)The typical time interval between successive strainsscales as τ a = τ sw /q . So the probability of the branchdepth being deeper than a is then P ( D > a ) ≈ e − aτ a /τ sw = e − a/q . (34)Recalling that the speciation, or escape of the cross-immunity occurs when the branch depth is larger than d , we ﬁnd the probability of a successful branching p isproportional to e − d/q .In the phylogenetic tree, t/τ a trial branchings from thebackbone arrive in time t . The probability that none ofthem successfully speciate is thus P nsp ( t ) = (1 − p ) t/τ a = e − t/τ sp , (35)where the waiting time for speciation event is τ sp ∝ τ sw q e d/q , (36)as numerically veriﬁed in Fig. 6. SUPPLEMENTARY INFORMATIONStability analysis of the diﬀerential-delay approximation

It is natural to measure time in the units of τ sw : t = τ sw ζ , x n = τ − χ and deﬁne u = τ I and d dζ u ( ζ ) = χ ( ζ − − e u ( ζ ) (S1) ddζ χ ( ζ ) = χ ( ζ ) − e u ( ζ ) (S2)This system has a one parameter family of ﬁxed points χ = ¯ χ, u = log ¯ χ . To analyze ﬁxed point stability welinearize and Laplace transform, yielding z δ ˆ u ( z ) = e − z δ ˆ χ ( z ) − ¯ χδ ˆ u ( z ) + zδu (0) + δu (cid:48) (0) (S3) zδ ˆ χ ( z ) = δ ˆ χ ( za ) − ¯ χδ ˆ u ( z ) + zδχ (0) (S4) v a r ( l og I t o t ) log( N h Is ) T = = s w log( N h Is ) osc = 8 : FIG. S1 Top: Amplitude of epidemic circulations in loga-rithm of total prevalence. The amplitude predicted in theﬁtness class-based (FC) simulation is consistent with theclone-based simulation, bounded from below by the am-plitude in the deterministic diﬀerential-delay approximation(DDE), which sets on at the spontaneous oscillation thresholdlog( N h ¯ Is ) osc = 8 .

3, indicated by the dashed line. When thenoise is properly considered as in Eq. (13), the amplitude canalso be predicted from the stochastic diﬀerential-delay equa-tions (DDE). Bottom: Period of epidemic oscillation, alsobounded below by the limit cycle period of the deterministicDDE. log( s=m ) l og ( N h s ) ExtinctionOscilitoryRQSTWRQSSpeciation

FIG. S2 ”Phase diagram” in the clone-based simu-lation.

The data points in corresponding phases are de-termined by following criteria. The extinction phase (red): τ ext < τ sw ; The speciation phase (blue): τ sp < τ ext ; Thetransient RQS regime (yellow and green): otherwise. Colorin the transient phase labels the amplitude of prevalencevar(log I tot ), increasing from yellow to green. δu (0) , δu (cid:48) (0) , δχ (0) and these poles are at the complex z that solve: z = 1 + ¯ χ (1 − z − e − z ) /z (S5)Fixed point - and hence steady RQS - stability requires (cid:60) ( z ) < < ¯ χ < ¯ χ c . For ¯ χ > .

845 one ﬁnds (cid:61) ( z ) (cid:54) = 0 corresponding to the onset ofoscillatory relaxation which turns into a limit cycle for¯ χ > ¯ χ c ≈ .

6. The period of the limit cycle is wellapproximated by (cid:61) ( z ), as the dashed line shown in thebottom panel of Fig. S1. Stochastic form of the diﬀerential-delay approximation

A sensible stochastic generalization is obtained bythe stochastic approximation for the “nose” dynamics(Eq. (10)) ddt x n = τ − x n − I t ( t ) + sξ ( t ) , (S6)combined with Eq. (18) at λ = 0log I ( t ) = τ sw x n ( t − τ sw ) − (cid:90) τ sw dt (cid:48) t (cid:48) I t ( t − t (cid:48) ) . (S7)Note that in this derivation we have avoided the need forexplicitly approximating σ ! (We have also neglected theeﬀect of ﬂuctuations arising from the logarithmic correc-tion term eﬀectively replacing it by its average value.) Eﬀect of mutations in infectivity

Suppose an antigenic advance mutation has deleteri-ous eﬀect on infectivity reducing the latter by δβ d onaverage. This would eﬀectively reduce the ﬁtness gain ofantigenic innovation from s to s d ( β ) = s ( β ) − ∆ d , with∆ d = β − δβ d . In addition let us assume that there alsoare compensatory mutations which restore maximal in-fectivity β max . These compensatory mutation thus havea beneﬁcial eﬀect on ﬁtness ∆ b ( β ) = β − β max −

1. Weassume that these mutations occur with rate m βb . Ina dynamic balance state the rate of ﬁxation of com-pensatory mutations would exactly balance the delete-rious mutation eﬀect on β so that τ − b ∆ d = τ − a ∆ d withthe ﬁxation rate controlled by the ﬁtness of the lead-ing strain via τ − b = x n / log( x n m βb ). This dynamic bal-ance is achieved at a certain value of β ∗ < β max , specif-ically β max − β ∗ = δβ d τ b τ − a or β ∗ = β max − δβ d r where r = log( x n m ) / log( x n m βb ).The ﬁtness of the nose of the distribution obeys dx n dt = s d ( β ) τ − a + ∆ b τ − b − I (S8) where the 1st term on the RHS is rate of nose advance-ment due to antigenetic mutations τ − a = x n / log( x n m )as before, but with reduced ﬁtness gain s d ( β ). The 2ndterm describes the contribution of compensatory muta-tions. However in the dynamic equilibrium (at β ∗ ) com-pensatory mutations exactly cancel the contribution thedeleterious mutation contribution to s so that for thesteady state we recover I tot = s ( β ∗ ) τ − ag = s ( β ∗ ) x n log x n m (S9)as we had for the TW driven by antigenic advancementonly. The only eﬀect is the reduction of s from s ( β max )to s ( β ∗ ) = d − log β ∗ .The sweep time, τ sw , upon which the ﬁtness of theformer pioneer strain comes down to the mean ﬁtness)and the nose ﬁtness, x n , retain the TW form τ sw = x n I tot = s ( β ∗ )log( x n m ) (S10)Following TW approximation to estimate infection preva-lence √ I tot ∼ N − exp( x n τ sw /

2) as before one ﬁnds x n = 2 τ − log CN h = 2 s ( β ∗ ) log[ N h s / log( x n m )]log( x n m ) (S11)The total ﬁtness variance of the population containsa contribution, from antigenic mutations and the muta-tions in infectivity: σ = τ sw ( s d τ − a + ∆ b τ − b ) = x n [ s − d + ∆ d s + ∆ d ∆ b s ](S12)but under conditions of ∆ d , ∆ b (cid:28) s ( β ∗ ) total variancewould also be decreasing.Most relevant for our analysis however is not the typ-ical, but the maximal antigenic distance within the viralpopulation: q ag = τ sw τ − ag = x n s ( β ∗ ) = 2 log N h s ( β ∗ ) c log x n m (S13)which is basically unchanged in the presence of infectiv-ity mutations except for the expected reduction in themagnitude of s factor inside the logarithm. Therefore,speciation rate would be reduced, but rather weakly, viaa contribution subleading in o (log N h ) REFERENCES

Andreasen, V., J. Lin, and S. A. Levin (1997), Journal ofmathematical biology (7), 825.Andreasen, V., and A. Sasaki (2006), Theoretical PopulationBiology (2), 164.Bedford, T., A. Rambaut, and M. Pascual (2012), BMC bi-ology (1), 38. Bedford, T., M. A. Suchard, P. Lemey, G. Dudas, V. Gregory,A. J. Hay, J. W. McCauley, C. A. Russell, D. J. Smith, andA. Rambaut (2014), Elife , e01914.Bhatt, S., E. C. Holmes, and O. G. Pybus (2011), MolecularBiology and Evolution (9), 2443.Desai, M. M., and D. S. Fisher (2007), Genetics (3),1759.Desai, M. M., A. M. Walczak, and D. S. Fisher (2013), Ge-netics (2), 565.Ferguson, N. M., A. P. Galvani, and R. M. Bush (2003),Nature (6930), 428.Fisher, D. S. (2013), Journal of Statistical Mechanics: Theoryand Experiment (01), P01011.Gog, J. R., and B. T. Grenfell (2002), Proceedings of theNational Academy of Sciences (26), 17209.Gomulkiewicz, R., and R. D. Holt (1995), Evolution (1),201.Grenfell, B. T., O. G. Pybus, J. R. Gog, J. L. Wood, J. M.Daly, J. A. Mumford, and E. C. Holmes (2004), science (5656), 327.Hadﬁeld, J., C. Megill, S. M. Bell, J. Huddleston, B. Potter,C. Callender, P. Sagulenko, T. Bedford, and R. A. Neher(2017), bioRxiv , 224048.Hallatschek, O. (2011), Proceedings of the National Academyof Sciences (5), 1783.Katoh, K., K. Misawa, K.-i. Kuma, and T. Miyata (2002),Nucleic Acids Research (14), 3059.Koel, B. F., D. F. Burke, T. M. Bestebroer, S. v. d. Vliet,G. C. M. Zondag, G. Vervaet, E. Skepner, N. S. Lewis,M. I. J. Spronken, C. A. Russell, M. Y. Eropkin, A. C.Hurt, I. G. Barr, J. C. d. Jong, G. F. Rimmelzwaan, A. D.M. E. Osterhaus, R. A. M. Fouchier, and D. J. Smith(2013), Science (6161), 976.Koelle, K., and D. A. Rasmussen (2015), eLife , e07361.London, W. P., and J. A. Yorke (1973), American journal ofepidemiology (6), 453.Neher, R. A. (2013), Annual Review of Ecology, Evolution, and Systematics (1), 195.Neher, R. A., T. Bedford, R. S. Daniels, C. A. Rus-sell, and B. I. Shraiman (2016), Proceedings ofthe National Academy of Sciences (2), 437.Nguyen, L.-T., H. A. Schmidt, A. von Haeseler, and B. Q.Minh (2015), Molecular Biology and Evolution (1), 268.O?Reilly, K. M., R. Lowe, W. J. Edmunds, P. Mayaud,A. Kucharski, R. M. Eggo, S. Funk, D. Bhatia, K. Khan,M. U. G. Kraemer, A. Wilder-Smith, L. C. Rodrigues,P. Brasil, E. Massad, T. Jaenisch, S. Cauchemez, O. J.Brady, and L. Yakob (2018), BMC Medicine (1), 180.Perelson, A. S., and G. F. Oster (1979), Journal of Theoret-ical Biology (4), 645.Petrova, V. N., and C. A. Russell (2018), Nature ReviewsMicrobiology (1), 47.Rota, P. A., T. R. Wallis, M. W. Harmon, J. S. Rota, A. P.Kendal, and K. Nerome (1990), Virology (1), 59.Rouzine, I. M., and G. Rozhnova (2018), PLOS Pathogens (9), e1007291.Rouzine, I. M., J. Wakeley, and J. M. Coﬃn (2003), Proceed-ings of the National Academy of Sciences (2), 587.Sagulenko, P., V. Puller, and R. A. Neher (2018), Virus Evo-lution (1), 10.1093/ve/vex042.Smith, D. J., A. S. Lapedes, J. C. de Jong, T. M. Bestebroer,G. F. Rimmelzwaan, A. D. Osterhaus, and R. A. Fouchier(2004), Science.Tria, F., M. Lssig, L. Peliti, and S. Franz (2005), Journal ofStatistical Mechanics: Theory and Experiment (07),P07008.Tsimring, L. S., H. Levine, and D. A. Kessler (1996), Physicalreview letters (23), 4440.Van Valen, L. (1973), Evol Theory1