[PDF] Antigenic waves of virus-immune co-evolution

Abstract

The evolution of many microbes and pathogens, including circulating viruses such as seasonal influenza, is driven by immune pressure from the host population. In turn, the immune systems of infected populations get updated, chasing viruses even further away. Quantitatively understanding how these dynamics result in observed patterns of rapid pathogen and immune adaptation is instrumental to epidemiological and evolutionary forecasting. Here we present a mathematical theory of co-evolution between immune systems and viruses in a finite-dimensional antigenic space, which describes the cross-reactivity of viral strains and immune systems primed by previous infections. We show the emergence of an antigenic wave that is pushed forward and canalized by cross-reactivity. We obtain analytical results for shape, speed, and angular diffusion of the wave. In particular, we show that viral-immune co-evolution generates a new emergent timescale, the persistence time of the wave's direction in antigenic space, which can be much longer than the coalescence time of the viral population. We compare these dynamics to the observed antigenic turnover of influenza strains, and we discuss how the dimensionality of antigenic space impacts on the predictability of the evolutionary dynamics. Our results provide a concrete and tractable framework to describe pathogen-host co-evolution.

Full PDF

AAntigenic waves of virus-immune co-evolution

Jacopo Marchi, Michael L¨assig, Aleksandra M. Walczak, ∗ and Thierry Mora ∗ Laboratoire de physique de l’ ´Ecole normale sup´erieure, CNRS, PSL University,Sorbonne Universit´e, and Universit´e de Paris, 75005 Paris, France Institute for Biological Physics, University of Cologne, 50937 Cologne, Germany

The evolution of many microbes and pathogens, including circulating viruses such as seasonalinﬂuenza, is driven by immune pressure from the host population. In turn, the immune systems ofinfected populations get updated, chasing viruses even further away. Quantitatively understandinghow these dynamics result in observed patterns of rapid pathogen and immune adaptation is in-strumental to epidemiological and evolutionary forecasting. Here we present a mathematical theoryof co-evolution between immune systems and viruses in a ﬁnite-dimensional antigenic space, whichdescribes the cross-reactivity of viral strains and immune systems primed by previous infections. Weshow the emergence of an antigenic wave that is pushed forward and canalized by cross-reactivity.We obtain analytical results for shape, speed, and angular diﬀusion of the wave. In particular,we show that viral-immune co-evolution generates a new emergent timescale, the persistence timeof the wave’s direction in antigenic space, which can be much longer than the coalescence time ofthe viral population. We compare these dynamics to the observed antigenic turnover of inﬂuenzastrains, and we discuss how the dimensionality of antigenic space impacts on the predictability ofthe evolutionary dynamics. Our results provide a rigorous and tractable framework to describepathogen-host co-evolution.

I. INTRODUCTION

The evolution of viral pathogens under the selectivepressure of its hosts’ immunity is an example of rapid co-evolution. Viruses adapt in the usual Darwinian senseby evading immunity through antigenic mutations, whileimmune repertoires adapt by creating memory againstpreviously encountered strains. Some mechanisms of in-host immune evolution, such as aﬃnity maturation pro-cess, are important for the rational design of vaccines.Examples are the seasonal human inﬂuenza virus, wherevaccine strain selection can be informed by predictingviral evolution in response to collective immunity[1], aswell as chronic infections such as HIV [2–5], where co-evolution occurs within each host. Because of the rela-tively short time scales of selection and strain turnover,these dynamics also provide a laboratory for studyingevolution and its link to ecology [6].It is useful to think of both viral strains and immuneprotections as living in a common antigenic space [6],corresponding to an idealized “shape space” of bindingmotifs between antibodies and their cognate epitopes[7]. While the space of molecular recognition is high-dimensional, projections onto a low-dimensional eﬀectiveshape space have provided useful descriptions of the anti-genic evolution. In the example of inﬂuenza, neutraliza-tion data from hemagglutination-inhibition assays can beprojected onto a two-dimensional antigenic space [8–10].Mapping historical antigenic evolution in this space sug-gests a co-evolutionary dynamics pushing the virus awayfrom its past positions, where collective immunity has de-veloped. Importantly, the evolution of inﬂuenza involves ∗ Corresponding authors. competitive interactions of antigenically distinct cladesin the viral population, generating a “Red Queen” dy-namics of pathogen evolution [11, 12]. Genomic analysisof inﬂuenza data has revealed evolution by clonal inter-ference [13]; this mode of evolution is well-known fromlaboratory microbial populations [14]. In addition, theviral population may split into subtypes. Such splittingor “speciation” events, which are marked by a decouplingof the corresponding immune interactions, happened inthe evolution of inﬂuenza B [15] and of noroviruses [16].The joint dynamics of viral strains and the immunesystems of the host population may be modeled usingagent-based simulations [17, 18] that track individualhosts and strains. Such approaches have been used tostudy the eﬀect of competition on viral genetic diver-sity [19], to study geographical eﬀects [20], and the ef-fect of vaccination [21]. Alternatively, systems of cou-pled diﬀerential equations known as Susceptible-Infected-Recovered (SIR) models may be adapted to incorpo-rate evolutionary mechanisms of antigenic adaptation[6, 22, 23]. Agent-based simulations in 2 dimensions wereused to recapitulate the ballistic evolution characteristicof inﬂuenza A [18], and to predict the occurence of split-ting and extinction events [24]. In parallel, theory wasdeveloped to study the Red Queen eﬀect [12, 25], basedon the well established theory of the traveling ﬁtness wave[26–28]. While eﬀectively set in one dimension, this classof models can nonetheless predict extinction and splittingevents assuming an inﬁnite antigenic genome [12].In this work, we propose a co-evolutionary theory inan antigenic interaction space of arbitrary dimension d ,which is described by joint non-linear stochastic diﬀeren-tial equations coupling the population densities of virusesand of protected hosts. We show that these equationsadmit a d -dimensional antigenic wave solution, and westudy its motion, shape, and stability, using simulations a r X i v : . [ q - b i o . P E ] F e b and analytical approximations. Based on these results,we discuss how canalization and predictability of anti-genic evolution depend on the dimensionality d . II. RESULTSA. Coarse-grained model of viral-immuneco-evolution

Our model describes the joint temporal evolution ofpopulations of viruses and immune protections in someeﬀective antigenic space of dimension d . Both viralstrains and immune protections are labeled by their po-sition x = ( x , . . . , x d ) in that common antigenic space,called “phenotype” (Fig. 1A). In that space, viruses ran-domly move as a result of antigenic mutations, as wellas proliferate through infections of new hosts. Immunememories are added at the past positions of viruses. Im-mune memories distributed across the host populationprovide protection that reduce the eﬀective ﬁtness of thevirus. We coarse-grain that description by summarizingthe viral population by a density n ( x , t ) of hosts infectedby a particular viral strain x , and immunity by a density h ( x , t ) of immune memories speciﬁc to strain x in thehost population.At each infection cycle, each host may infect R unpro-tected hosts, where R is called the basic reproductionnumber. However, a randomly picked host is susceptibleto strain x with probability (1 − c ( x , t )) M , where c ( x , t ) isthe coverage of strain x by immune memories of the pop-ulation, and the number M of immune memories carriedby each host. Because of cross-reactivity, which allowsimmune memories to confer protection against closebystrains, immune coverage is given as a function of thedensity of immune memories: c ( x , t ) = 1 M (cid:90) d x (cid:48) h ( x (cid:48) , t ) H ( x − x (cid:48) ) , (1)where H ( x − x (cid:48) ) = exp( −| x − x (cid:48) | /r ) is a cross-reactivityKernel describing how well memory x (cid:48) protects againststrain x , and r is the range of the coverage providedby cross-reactivity. In summary, the eﬀective growthrate, or “ﬁtness”, of the virus is given by f ( x , t ) ≡ ln[ R (1 − c ( x , t )) M ] in units of infection cycles.The coupled dynamics of viruses and immune memo-ries is then described by the stochastic diﬀerential equa-tions: ∂ t n ( x , t ) = f ( x , t ) n ( x , t ) + D∂ x n + (cid:112) n ( x , t ) η ( x , t ) (2) ∂ t h ( x , t ) = 1 N h (cid:20) n ( x , t ) − N ( t ) h ( x , t ) M (cid:21) , (3)and η is a Gaussian white noise in time and space (cid:104) η ( x , t ) η ( x (cid:48) , t (cid:48) ) (cid:105) = δ ( x − x (cid:48) ) δ ( t − t (cid:48) ) accounting for de-mographic noise [29]. This stochastic term is crucial,as it will drive the evolution of the wave. D describesthe eﬀect of inﬁnitesimal mutations on the phenotype, virus immune memoryviral fitness D FIG. 1:

A simple model of viral-host co-evolution pre-dicts the emergence of an antigenic wave. A.

Schematicof the co-evolution model. Viruses proliferate while eﬀectivelydiﬀusing in antigenic space (here in 2 dimensions) throughmutations, with coeﬃcient D . Past virus positions are re-placed by immune protections (light blue). Immune protec-tions create a ﬁtness gradient for the viruses (green gradi-ent) favoring strains at the front. Both populations of virusesand immune populations are coarse-grained into densities inantigenic space. B. Snapshot of a numerical simulation ofEq. 2-3 showing the existence of a wave solution. The bluecolormap represents the density of immune protections h ( x , t )left behind by past viral strains. The current virus density n ( x ) is shown in red. C. Close-up onto the viral population,showing ﬁtness isolines. The wave moves in the direction ofthe ﬁtness gradient (arrow) through the enhanced growth ofstains at the edge of the wave (black dots). D. Distributionof ﬁtness across the viral population (corresponding to theprojection of B. along the ﬁtness gradient). Parameters forB-D:

D/r = 3 · − , N h = 10 , ln R = 3, M = 1. D = µ (cid:104) δx (cid:105) /

2, where µ is the mean number of muta-tions per cycle, and (cid:104) δx (cid:105) the mean squared eﬀect of eachmutation along each antigenic dimension (assuming thatmutations do not have a systematic bias, (cid:104) δx (cid:105) = 0). Thecontinuous diﬀusion assumption implied by Eq. 2 is onlyvalid when there are many small mutation eﬀects, µ (cid:29) δ x (cid:28) r . The total viral population size, or numberof infected hosts, N ( t ) = (cid:82) d x n ( x , t ) may ﬂuctuate, butnot the host population size N h , which is constant: newadded memories (ﬁrst term of right-hand side of Eq. 3)overwrites existing ones picked uniformly at random (sec-ond term of r.h.s. of Eq. 3). Since each host carries M immune receptors, we have (cid:82) d x h ( x , t ) = M .If we assume that the system reaches an evolutionarysteady state, with stable viral population size N ( t ) = N ,then Eq. 3 can be integrated explicitly: h ( x , t ) = MN (cid:90) t −∞ dt (cid:48) τ e − t − t (cid:48) τ n ( x , t (cid:48) ) , (4)with τ = M N h /N . This equation shows how the densityof protections reﬂects the past of the virus evolution. B. Antigenic waves

We simulated (2)-(3) on a square lattice (Methods) andfound a stable wave solution (Fig. 1B-D). The wave has astable population size N , and moves approximately bal-listically through antigenic space, pushed from behind bythe immune memories left in the trail of past viral strains(Fig. 1B). These memories exert an immune pressure onthe viruses, forming a ﬁtness gradient across the widthof the wave (Fig. 1C), favoring the few strains that arefurthest from immune memories, at the edge of the wave.We assume that the solution of the coupled evolutionequations (2)-(3) takes the form of a moving quasispeciesin a d -dimensional antigenic space, n ( x , t ) = N √ πσ exp (cid:20) − ( x − vt ) σ (cid:21) ρ ( x , . . . , x d ) . (5)Here, we have written the solution in a co-moving frame,in which a motion with constant speed v takes place inthe direction of the coordinate x , and ﬂuctuations in theother dimensions, ρ ( x , . . . , x d , t ), centered around x i = 0for i >

1, are assumed to be independent. In the nextsections, we will analyse solutions of this form. First, wewill project the d -dimensional antigenic wave onto theone-dimensional ﬁtness space; this projection produces atravelling ﬁtness wave [26–28, 30, 31] that determines theantigenic speed v and the mean pair coalescence time (cid:104) T (cid:105) of the viral genealogy. Second, we will study the shapeof the d -dimensional quasispecies and determine the ﬂuc-tuations in the transverse directions. These ﬂuctuationsproduce a key result of this paper: immune interactionscanalize the evolution of the antigenic wave; this con-straint can be quantiﬁed by a persistence time govern-ing the transverse antigenic ﬂuctuations. Canalization ismost pronounced spaces of low dimensionality d and, aswe discuss below, aﬀects the predictability of antigenicevolution. C. Speed of antigenic evolution

Projected onto the ﬁtness axis f = f ( x , t ), the solutionis approximately Gaussian (Fig. 1D). This representationsuggests a strong similarity to the ﬁtness wave solutionfound in models of rapidly adapting populations withan inﬁnite reservoir of beneﬁcial mutations [26–28, 30,31]. To make the analogy rigorous, we must assume thatthe ﬁtness gradient in antigenic space is approximatelyconstant, meaning that ﬁtness isolines are straight andequidistant. Mutations along the gradient direction havea ﬁtness eﬀect that is linear in the displacement, whilemutations along perpendicular directions are neutral andcan be treated independently. Note that while we will usethis projection onto ﬁtness to compute the speed of the antigenic wave, our description remains in d dimension,and we will come back to transverse ﬂuctuations in thenext sections.There exist diﬀerent theories for the ﬁtness wave, de-pending on the statistics of mutational eﬀects. Our as-sumption of diﬀusive motion makes our projected dynam-ics equivalent to that studied in Ref. [31], which itselfbuilds on earlier work [27]. In the limit where the waveis small compared to the adaptation time scale, vτ (cid:29) σ ,the wave may be replaced by a Dirac delta function at x = ( vt, , . . . ,

0) in Eq. 4. One can then calculate ex-plicitly the immune density (upstream of the wave) andcoverage (downstream of the wave, using Eq. 1): h ( x , t ) ≈ Mvτ e − vt − x vτ Θ( vt − x ) δ ( x ) · · · δ ( x d ) , (6) c ( x , t ) ≈ e − ( x − vt ) /r vτ /r , x ≥ vt, x i> (cid:28) r (7)where Θ( x ) = 1 for x ≥ h ( x , t ) cor-responds to the blue trace of Fig. 1B, and the coverageor ﬁtness gradient to the isolines of Fig. 1C.In the moving frame of the wave, ( u, x , . . . , x d ), with u = x − vt , the local immune protection and viral ﬁtnesscan be expanded locally for u, x i (cid:28) vτ (see [25] for asimilar treatment in a one-dimensional antigenic space): f (( u, x i> ); t ) ≈ ln (cid:34) R (cid:18) − e − u/r vτ /r (cid:19) M (cid:35) ≈ f + su, (8)where f = ln R − M ln[1 + r/ ( vτ )] is the average pop-ulation ﬁtness, and s = | ∂ x f | = Mr (cid:16) R /M − (cid:17) (9)is the ﬁtness gradient. Rescaling the antigenic variable x as sx , this process is equivalent to the evolution of apopulation where mutation eﬀects are described by dif-fusion in ﬁtness space with coeﬃcient Ds . This is pre-cisely the model from which the ﬁtness wave solution ofRef. [27, 31] was described (see Appendix). In the fol-lowing we will use results from these works to describethe antigenic wave, with however the following diﬀerenceconcerning the regulation of the population size. In theusual ﬁtness wave theory, population is kept constant byconstruction, meaning that ﬁtness is only relevant whencompared to the mean of the population. By contrast, inour model population size is left free, and ﬁtness is de-ﬁned as an absolute growth rate. However the ﬁtness ofthe whole viral population undergoes continuous negativedrift due to the constant adaptation of immune systems,encoded in the − svt term in Eq. 8. This negative ﬁtnessdrift has an analogous eﬀect to subtracting the mean ﬁt-ness in models with constant population size, making theequivalence possible.The ﬁtness wave theory allows us to make analyticalprediction about the properties of the antigenic wave.Let us start with its population size N , which is regu-lated by how fast the immune system catches up withthe wave. The immune turnover time τ in Eq. 4 is in-versely proportional to N : the larger the population size,the faster immune memories are updated, increasing theimmune pressure on current viral strains (lower f ), andthus decreasing N . As the moving wave reaches a stablemoving state, its size N becomes stable over time, giv-ing the condition (1 /N ) dN/dt = f = 0, which in turnconstraints the ratio between the wave’s size and speed: Nv = M N h r (cid:16) R /M − (cid:17) = N h s. (10)But the ﬁtness wave theory predicts that the speedof the wave itself depends on the population size. Thelarger N , the more outliers at the nose the ﬁtness wave,and the further out they may jump in antigenic space, es-tablishing ﬁtter ancestors of the future population. Thisresults in a ﬁtness wave whose speed depends only weaklyon population size and mutation rate (see [31] and Ap-pendix), v F ≈ D / F (cid:104)

24 ln(

N D / F ) (cid:105) / , (11)where D F = s D and v F = sv are the diﬀusivity andwave speed in ﬁtness space, which are related to theircounterparts in antigenic space through the scaling factor s . Replacing this scaling into Eq. 11 yields a relationbetween antigenic speed and population size, v ≈ D / s / (cid:104)

24 ln( N ( Ds ) / ) (cid:105) / , (12)which closes the system of equations: using the deﬁni-tion of s (Eq. 9), Eqs. 10 and 12 completely determine N and v as a function of the model’s parameters (througha transcendental equation, see Appendix). We validatedthese theoretical predictions for N and v by comparingthem to numerical simulations, which show good agree-ment over a wide range of parameters (Fig. 2A-B). D. Shape of the antigenic wave

The width σ of the wave in the direction of motionis given by Fisher’s theorem, which relates the rate ofchange of the average ﬁtness to its variance in the popu-lation: ∂ t f = Var( f ). In our description ﬁtness and theantigenic dimension x are linearly related with coeﬃ-cient s , implying s σ = sv . The result of that predic-tion for σ is validated against numerical simulations inFig. 2C.The wave is led by an antigenic ‘nose’ formed byfew outlying strains of reduced cross-reactivity with theconcurrent immune population, generating high ﬁtness.These strains have phenotype u c = sσ / D = v / (4 Ds )and ﬁtness su c . They serve as founder strains from whichthe bulk of the future population will derive some time numerical N t h e o r e t i c a l N A D numerical v t h e o r e t i c a l v B numerical 10 t h e o r e t i c a l C D FIG. 2:

Analytical prediction of wave properties.

Shown are the numerical versus analytical predictions for thewave’s population size N ( A ), speed v ( B ), width σ along thewave’s direction of motion ( C ), and width σ ⊥ in the directionperpendicular to motion ( D ), with d = 2 dimensions. Lengthare in units of the cross-reactivity range (so that r = 1, withno loss of generality). Parameters: N h = 10 (squares), 10 (circles), or 10 (triangles); ln R = 1 (ﬁlled symbols) or 3(empty symbols); M = 1 (small symbols) or 5 (large sym-bols). ∼ u c /v = σ / D later (see Appendix). As a result, twostrains taken at random can trace back their most recentcommon ancestor to some averge time (cid:104) T (cid:105) = ασ / D inthe past, where α ≈ .

66 is a numerical factor estimatedfrom simulations [31].To explain the width σ ⊥ of the wave in the otherphenotypic dimensions than that of motion ( x i> ), wenote that in these directions evolution is neutral. Twostrains taken at random in the bulk are expected to havedrifted, or ‘diﬀused’ in physical language, by an averagesquared displacement (cid:104) ∆ x i (cid:105) = 2 DT from their com-mon ancestor, so that their mean squared distance is4 D (cid:104) T (cid:105) = 2 ασ along x i . If one assumes an approx-imately Gaussian wave of width σ ⊥ , the mean squaredistance between two random strains along x i shouldbe equal to 2 σ ⊥ . Equating the two estimates yields σ ⊥ = ασ . Fig. 2D checks the validity of this predic-tion against simulations.Both longitudinal and transversal ﬂuctuations in anti-genic space are instances of quantitative traits underinterference selection generated by multiple small-eﬀectmutations. The width of these traits is governed by thecommon relation (cid:104) ∆ x i (cid:105) = 2 D (cid:104) T (cid:105) ∼ σ , which expressesthe eﬀective neutrality of the underlying genetic muta-tions [32]. This relation says that antigenic variationsin all dimensions scale in the same way with the modelparameters, and the wave should have an approximatelyspherical shape. Consistently, here we ﬁnd a wave witha ﬁxed ratio α ≈ .

66 between transverse and longitudi-nal variations. This implies a slightly asymmetric shape(which may be non-universal and depend on the micro-scopic assumptions of our mutation model).In what parameter regime is our theory valid? Theﬁtness wave theory we built upon is meant to be validin the large population size, N (cid:29)

1. In addition, weassumed that the ﬁtness landscape was locally linearacross the wave. This approximation should be validall the way up to the tip of wave, given by u c , sincethis is where the selection of future founder strains hap-pen. This condition translates into u c (cid:28) r , implying D (cid:29) r / ln( N ) , where D is in antigenic unit squaredper infection cycle. This result means that one infectioncycle will not produce enough mutations for the virusto leave the cross-reactivity range. In that limit, an-other assumption is automatically fulﬁlled, namely thatthe width of the wave be small compared to the span ofimmune memory: σ (cid:28) vτ . Our simulations, which run inthe regime of very slow eﬀective diﬀusion ( D/r (cid:46) − )and have relatively large population sizes ( N (cid:38) ), sat-isfy these conditions. This explains the good agreementbetween analytics and numerics. E. Equations of motion of the wave’s position

The wave solution allows for a simpliﬁed picture. Thewave travels in the direction of the ﬁtness gradient (orequivalent the gradient of immune coverage) with speed v (Fig. 3A). Occasionally the population splits into twoseparate waves that then travel away from each otherand from their common ancestor (Fig. 3B). The tip ofthe wave’s nose, which contains the high-ﬁtness individ-ual that will seed the future population, determines itsfuture position in antigenic space. In the directions per-pendicular to the ﬁtness gradient, this position diﬀusesneutrally with coeﬃcient D . This motivates us to writeeﬀective equations of motion for the mean position of thewave: d x dt = − (cid:16) v + (cid:113) D (cid:107) ξ (cid:107) ( t ) (cid:17) ∂ x c | ∂ x c | + √ D ξ ⊥ ( t ) , (13) c ( x , t ) = (cid:90) t −∞ dt (cid:48) τ e − t − t (cid:48) τ − | x − x ( t (cid:48) ) | r , (14)where ξ (cid:107) and ξ ⊥ are Gaussian white noises in the di-rections along, and perpendicular to, the ﬁtness gradient ∂ x f / | ∂ x f | = − ∂ x c/ | ∂ x c | . D (cid:107) is an eﬀective diﬀusivityin the direction of motion resulting from the ﬂuctuationsat the nose tip. These ﬂuctuations are diﬀerent thansuggested by D , as they involve feedback mechanismsbetween the wave’s speed v , size N , and advancement ofthe ﬁtness nose u c . In the following we do not consider D FIG. 3:

Stochastic behaviour of the wave: diﬀusivemotion, splits, and extinctions. A.

The wave moves for-ward in antigenic space but is driven by its nose tip, whichundergoes antigenic drift (diﬀusion) in directions perpendicu-lar to its direction of motion. These ﬂuctuations deviate thatdirection, resulting in eﬀective angular diﬀusion. B. Whenantigenic drift is large, the wave may randomly split into sub-populations, creating independent waves going in diﬀerent di-rections. Each wave can also go extinct as size ﬂuctuationsbring it to 0. C. Cartoon illustrating the wave’s angular dif-fusion. Selection and drift combine to create a inertial ran-dom walk of persistence time t persist . D. Analytical prediction(Eq. 17) for the persistence time, versus estimates from sim-ulations. Symbols and colors are the same as in Fig. 2. these ﬂuctuations, and focus on perpendicular ﬂuctua-tions instead.

F. Angular diﬀusion and antigenic canalization

In the description of Eqs. 13-14, the viral wave ispushed by immune protections left in its trail. The ﬁt-ness gradient, and thus the direction of motion, points inthe direction that is set by the wave’s own path. Thiscreates an inertial eﬀect that stabilizes forward motion.On the other hand, ﬂuctuations in perpendicular direc-tions are expected to deviate the course of that mo-tion, contributing to eﬀective angular diﬀusion. To studythis behaviour, we assume that motion is approximatelystraight in direction x = vt , and study small ﬂuctua-tions in the perpendicular directions, x ⊥ = ( x , . . . , x d ),with | x ⊥ | (cid:28) r (as illustrated in Fig. 3C). Eqs (13)-(14)simplify to (see Appendix): ∂ t x ⊥ ( t ) = (cid:90) + ∞ dt (cid:48) T x ⊥ ( t ) − x ⊥ ( t − t (cid:48) ) t (cid:48) e − t (cid:48) /T + √ D ξ ⊥ ( t ) , (15)where T = ( v/r + 1 /τ ) − = ( r/v ) R − /M is an eﬀectivememory timescale combining the host’s actual immunememory, and the cross-reactivity with strains encoun-tered in the past.Eq. (15) may be solved in Fourier space. Deﬁning˜ x ⊥ ( ω ) = (cid:82) + ∞−∞ dte iωt x ⊥ ( t ), it becomes: − iω ˜ x ⊥ ( ω ) (cid:18) − iωT ) iωT (cid:19) = √ D ˜ ξ ⊥ ( ω ) . (16)To understand the behaviour at long times (cid:29) T , we ex-pand at small ω : − ω x ⊥ ( ω ) ≈ √ D ˜ ξ ⊥ ( ω ) /T or equiv-alently in the temporal domain ∂ t x ⊥ ≈ √ D ξ ⊥ ( t ) /T .This implies that the direction of motion, ˆ e ∼ ∂ x f / | ∂ x f | ∼ ∂ t x / | ∂ t x | , undergoes eﬀective angular dif-fusion in the long run: ∂ t ˆ e = √ D ξ ⊥ ( t ) / ( vT ) . The per-sistence time of that inertial motion, t persist = v T D = r D R − /M , (17)does not depend explicitly on the speed and size. How-ever, a larger diﬀusivity implies larger N and v while re-ducing the persistence time. Likewise, a larger reproduc-tion number R or smaller memory capacity M speedsup the wave and increases its size, but also reduces itspersistence time. This implies that, for a ﬁxed numberof hosts N h , larger epidemic waves not only move fasteracross antigenic space, but also change course faster.This persistence time scales as the time it would takea single virus drifting neutrally to escape the cross-reactivity range, r /D . For comparison, the muchshorter timescale for a population of viruses to escapefrom the cross-reactivity range r , t escape = rv = T R /M = N h MN ( R /M − , (18)scales with the inverse incidence rate N h /N . This is con-sistent with the whole population having been infectedat least one every ∼ N h /N infection cycles. This separa-tion of time scales is consistent with the observation thatevolution in the transverse directions is driven by neutraldrift, which is much slower than adaptive evolution in thelongitudinal direction. Both t persist and t escape are muchlonger than the coalescence time of the viral population, u c /v ∼ σ / D , since they reﬂect long-term memory fromthe immune system. However, while t escape ∼ N h /N cor-responds to the re-infection period and is thus boundedby the hosts’ immune memory (itself bounded by theirlifetime, which we do not consider), t persist may be muchlonger than that. This is possible thanks to inertial ef-fects, which are allowed by the high-order dynamics ofEq. 15 generated by the immune system. This very muchlike when, in mechanics, a massive object set in motionin a given direction will keep that direction without theneed for an external force to maintain it.The high-frequency behaviour of (16) has a logarithmicdivergence, meaning that the total power of ˆ e is inﬁniteunless we impose a (ultraviolet) cutoﬀ. Such a regulariza-tion emerges from the ﬁne structure of the wave. While the motion of the wave is driven by its nose tip, the im-mune pressure only extends back to the recent past ofthe bulk of the distribution, which stands at a distance u c away from the nose. In other words, there is a lag (andthus an gap u c in antigenic space) between the most in-novative variants that drive viral evolution, and the ma-jority of currently circulating variants which drive hostimmunity. Mathematically, this implies that the domainof integration of the ﬁrst term in the right-hand side of(15) should start at t c = u c /v , which regularizes the di-vergence. A more careful analysis provided in the Ap-pendix shows that this regularization does not aﬀect thelong-term diﬀusive behaviour of the wave. G. Deﬂection, speciations, and predictability ofantigenic evolution

We now examine how deﬂections of the wave in thetransverse direction determines the predictability andstability of the viral quasi-species. Assuming t (cid:29) T ,angular diﬀusion causes motion to be deﬂected as (seeAppendix) (cid:104) x ⊥ (cid:105) = d − D T t . Crucially, this deﬂectiondepends on the dimension of the antigenic space. Higherdimension means more deviation from the predictablecourse of the wave, and thus less predictability. We candeﬁne a predictability time scale t predict ∼ [8( d − / − / T / ( r /D ) − / , (19)which is the time it takes for prediction errors to becomeof the order of the cross-reactivity range. In low dimen-sions, this time scales as a weighted geometric mean be-tween t escape ∼ T and t persist ∼ r /D . However, at highdimensions t predict may be signiﬁcantly reduced, causingloss of predictability even below t escape .To get a sense of numbers, we can compare our re-sults with epidemiological data, taking the evolution ofinﬂuenza as an example, with an infection cycle time of3 days. It is assumed that individuals lose immunity tothe circulating strain of the ﬂu within ∼ ∼ r in t = 500, i.e. v/r ∼ · − . For instance, with N h = 10 -10 , R = 2, D/r = 3 · − , and M = 1,we get v/r ∼ . · − and t persist ∼ · ∼ t predict ismuch shorter and depends on dimension, albeit slowly,ranging from ∼

20 years for d = 2 to about 2 years for d = 1000. We stress that these numbers are obtained byscaling laws, and should not be taken as precise quanti-tative predictions.Large deﬂections may also cause speciations, or splits,which occur when two substrains co-exist long enough tobecome independent from the immune standpoint. Thishappens when two sub-lineages see the diﬀerence of theirtransverse positions ∆ x ⊥ become larger than ∆ x ∼ r ,within some limited period given by the coalescence time.We estimated the rate of such splitting events using a

10 20 3010 s p li t / (( / ) / v / D ) FIG. 4:

Rate of speciation.

Rescaled rate of splittingevents, deﬁned as the emergence of two substrains at distance∆ x = 0 . r from each other in antigenic space, meaning thatthey are becoming antigenically independent. The predictedscaling, k split ∼ ( v /D ) e −L , as well as the deﬁnition of thecollective variable L as a function of the model parameters,are given by Eq. 20. The line shows a linear ﬁt of the loga-rithm of the ordinate. saddle-point approximation (see Appendix): k split ≈ (cid:114) v D e −L , L = α (cid:32) s R − /M D r ( d − v (cid:33) / (20)with α some numerical factor. Simulations conﬁrmed thevalidity of this scaling (Fig. 4).The splitting rate grows with the dimension, consistentwith the intuition that departure from canalized evolu-tion is easier when more directions of escape are available.Splitting events are expected to strongly aﬀect our abil-ity to predict the future course of the wave. However, therarity of such events (exponential scaling of k split ) meansthat they will have a lower impact on predictability thandeﬂections. These results provide a theoretical and quan-titative basis from which to assess the eﬀect of dimensionon predictability, and possibly estimate d from antigenictime course data of real viral populations. III. DISCUSSION

In this work we have developed an analytical theory forstudying antigenic waves of viral evolution in responseto immune pressure. We showed that predictabilty islimited by two features of antigenic evolution, directionaldiﬀusion and lineage speciations of the antigenic wave.Unlike previous eﬀorts that considered one- [25] orinﬁnite-dimensional antigenic spaces [12], we explicitlyembedded the antigenic phenotype in a d -dimensionalspace. This description allows for the possibility of com-pensatory mutations, and makes it easier to compare re-sults with empirical studies of viral evolution projected onto low-dimensional spaces [8, 9]. Unlike these stud-ies however, our work does not address the question howan eﬀective dimension of antigenic space arises from themolecular architecture of immune interactions. Rather,we focused on the implications of the dimensionalityof antigenic space for phenotypic evolution and its pre-dictability.Our results suggest a hierarchy of time scales for vi-ral evolution. The shortest is the coalescence time (cid:104) T (cid:105) ,which determines population turnover. Then comes t escape , which is the time it takes the viral population toescape immunity elicited at a previous time point. Thelongest timescale is the persistence time t persist , whichgoverns the angular diﬀusion of the wave’s direction.That time scale is due to inertial eﬀects rather than relydirectly on the hosts’ immune memories, and may thusexceed their individual lifetimes. Finally, the predictiontimescale t predict , beyond which prediction accuracy fallsbelow the resolution of cross-reactivity, scales between t escape and t persist at low dimensions. However, unlikethe other timescales, it decreases with the dimension ofthe antigenic space, and may become arbitrarily low atvery high dimensions.Our framework should be applicable to general host-pathogens systems. For instance, co-evolution betweenviral phages and bacteria protected by the CRISPR-Cassystem [33] is governed by the same principles of escapeand adaptation as vertebrate immunity. Even more gen-erally, our theory (Eqs. 2,3) may be relevant to the cou-pled dynamics of predators and preys interacting in space(geographical or phenotypic), opening potential avenuesfor experimental tests of these theories in synthetic mi-crobial systems. IV. METHODS

We simulated discrete population dynamics of in-fected hosts n ( x , t )) and immune protections n h ( x , t ) ≡ N h h ( x , t ) (all integers) on a 2 D square lattice with lat-tice size ∆ x ranging from 10 − r to 0 . r . Each time stepcorresponds to a single infection cycle, ∆ t = 1. At eachtime step: (1) viral ﬁtness f is computed at each oc-cupied lattice site from the immune coverage Eq. 1; (2) viruses at each occupied lattice site are grown accordingto their ﬁtness, n ( x , t + 1) ∼ Poisson[(1 + f ∆ t ) n ( x , t )]; (3) viruses are mutated by jumping to nearby sites onthe lattice; (4) the immune system is updated accordinga discrete version of Eq. 3, by implementing n h ( x , t +1) = n h ( x , t ) + n ( x , t ) and then removing N ( t ) protections atrandom (so that N h remains constant).To implement (1) , we used a combination of exactcomputation of Eq. 1 and approximate methods, includ-ing one based based non-homogeneous fast Fourier trans-forms [34, 35]. Details are given in the Appendix.To implement (3) , we drew the number of mu-tants at each occupied site from a binomial distributionBinomial( n ( x , t ) , − e − µ ∆ t ). The number of new muta-tions m aﬀecting each of these mutants is drawn from aPoisson distribution of mean µ ∆ t conditioned on havingat least 1 mutation. The new location of each mutantis drawn as x + δ x , with δ x = round( (cid:80) mi =1 (cid:15) i ) (round-ing is applied to each dimension), where (cid:15) i is a vector ofrandom orientation and modulus drawn from a Gammadistribution of mean δ ∼ x and shape parameter 20.This distribution was chosen so as to maximize the num-ber of non-zero jumps while maintaining isotropy. Wethen deﬁne D = µ (cid:104) δx (cid:105) / ,

0) with size N and width σ in all dimensions, towhich 0 .

1% additional viruses are randomly added withinthe interval (0; u c ) along x ( N , σ , and u c being all givenby the theory prediction). Immune protections are placedaccording to Eq. 6. The ﬁrst 20,000 time steps serveto reach steady state and are discarded from the analy-sis. When a population extinction ( N = 0) or explosion( N = N h /

2) occurs, the simulation is resumed at an ear-lier checkpoint to avoid re-equilibrating. Simulations areended after 5 · steps of after 20 consecutive extinctionsor explosions from the same checkpoint.In order to analyze the organization of viruses in phe-notypic space, we save snapshots of the simulation atregular time intervals. For each saved snapshot we takeall the coordinates with n > (cid:15) parameter deﬁnes themaximum distance between two samples that are con-sidered to be in the neighborhood of each other. Weperform the clustering for diﬀerent values of (cid:15) and selectthe value that minimizes the variance of the 10th nearestneighbor distance. Clustering results are not sensitive tothis choice. This preliminary clustering step is reﬁnedby merging clusters if their centroids are closer than thesum of the maximum distances of all the points in each cluster from the corresponding centroid.From the clustered lineages we can easily obtain a se-ries of related observables, such as its speed v obtainedas the derivative of the center’s position. The width ofthe lineage proﬁle in the direction of motion σ as well asin the perpendicular direction σ ⊥ are obtained by tak-ing the standard deviaton of the desired component ofthe distances of all the lineage viruses from the lineagecentroid. Reported numbers are time averages of theseobservables. We can track their separate trajectories inantigenic space. A split of a lineage into two new lineagesis deﬁned when two clusters are detected where previ-ously there was one, and their distance is larger than∆ x .To estimate the persistence time, we ﬁrst subsamplethe trajectory so that the distance between consecutivepoints is bigger than 6( (cid:104) σ (cid:105) + std( σ )) so that fast ﬂuctu-ations in the population size do not aﬀect the inference.We take the resulting trajectory angles and smooth themwith a sliding window of 5. Then we divide the tra-jectory into subsegments, and compute the angles meansquared displacement (MSD) over all lineages and all sub-segments. We consider time lags only bigger than twicethe typical smoothing time, and if the MSD trace is longenough we also require the time lag to be bigger than2 T . Finally we only keep time lag bins with at least 10datapoints. We ﬁt the resulting time series to a linearfunction ax + b , and get the persistence time as a . Wecompute the reduced χ as a goodness-of-ﬁt score. Re-sults are shown for simulations that had enough statisticsto perform the ﬁt, lasted at least 10 cycles, and had areduced χ below 3. Acknowledgements.

The study was supported bythe European Research Council COG 724208 and ANR-19-CE45-0018 “RESP-REP” from the Agence Nationalede la Recherche and DFG grant CRC 1310 “Predictabil-ity in Evolution”. [1] Morris DH, et al. (2018) Predictive Modeling of In-ﬂuenza Shows the Promise of Applied Evolutionary Bi-ology.

Trends in Microbiology

Cell

Nature Communications

PLoS genetics bioRxiv .[6] Gandon S, Day T, Metcalf CJE, Grenfell BT (2016)Forecasting Epidemiological and Evolutionary Dynamics of Infectious Diseases.

Trends in Ecology and Evolution

Immunology Let-ters

Science eLife

Science (New York,N.Y.)

Evol Theory eLife [13] Strelkowa N, L¨assig M (2012) Clonal interference in theevolution of inﬂuenza.

Genetics

Genetica

Virology

Clinical Micro-biology and Infection

Nature

BMC Biology

PLoS Pathogens

Pro-ceedings of the Royal Society B: Biological Sciences bioRxiv .[22] Gog JR, Grenfell BT (2002) Dynamics and selection ofmany-strain pathogens.

PNAS

Epidemics

Pathogens

PLoS Pathogens pp 1–16.[26] Rouzine IM, Wakeley J, Coﬃn JM (2003) The solitarywave of asexual evolution.

Proceedings of the National Academy of Sciences of the United States of America

Phys. Rev. E - Stat.Nonlinear, Soft Matter Phys.

Genetics

Proceedings of the National Academy of Sciences of theUnited States of America

Physical Review Let-ters

PNAS

Nature Communications

Philosophical Transactions ofthe Royal Society B: Biological Sciences

ACM Transactions on Mathematical Software(TOMS)

NumerischeMathematik

Journal of Machine Learning Research

A density-based algorithm for discovering clusters in large spatialdatabases with noise (AAAI Press), pp 226–231.

Appendix A: Fitness wave theory

We decompose the density of viral strains according to the main direction of the wave x : n ( x , t ) = n ( x , t ) φ ( x , . . . , x d ) , (A1)where φ is normalized to 1. Projecting and linearizing Eq. (2) of the main text yields: ∂n ( x , t ) ∂t = s ( x − vt ) n ( x , t ) + D ∂ n ∂x + (cid:112) n ( x , t ) η ( x , t ) , (A2)with s deﬁned by (9), and η = (cid:82) dx · dx d η ( x , t ), so that (cid:104) η ( x , t ) η ( x (cid:48) , t (cid:48) ) (cid:105) = δ ( x − x (cid:48) ) δ ( t − t (cid:48) ). The change ofvariable ˜ x = sx , ˜ v = sv , ˜ n = s − d n , yields the traveling wave equation of Ref. [31]: ∂ ˜ n (˜ x , t ) ∂t = (˜ x − ˜ vt )˜ n ( x , t ) + ˜ D ∂ ˜ n ∂ ˜ x + (cid:112) ˜ n ( x , t )˜ η (˜ x , t ) , (A3)with ˜ D = Ds and (cid:104) ˜ η (˜ x , t )˜ η (˜ x (cid:48) , t (cid:48) ) (cid:105) = δ (˜ x − ˜ x (cid:48) ) δ ( t − t (cid:48) ).Note that this continuous description diﬀers from that used in Ref. [12], which also describes a ﬁtness wave inantigenic space. Their approach relies on a discrete evolutionary model where each mutation confers a ﬁxed ﬁtnessadvantage, as described by the ﬁtness wave solution of Desai and Fisher [28].Applying the formulas of that theory yield in the limit of large populations: s σ ≈ ˜ D / (24 ln( N ˜ D / )) / , (A4)0or σ ≈ ( D/s ) / (24 ln( N ( Ds ) / )) / , (A5)and v ≈ D / s / (24 ln( N ( Ds ) / )) / . (A6)The ﬁttest in the population is ahead of the bulk by u c = sσ / D in phenotypic space, with u c ≈

14 (

D/s ) / (24 ln( N ( Ds ) / )) / (A7)Plugging in (9) yields: σ = (cid:32) DrM ( R /M − (cid:33) / 

24 ln  N D / (cid:32) M ( R /M − r (cid:33) /  / , (A8) v = D / (cid:32) M ( R /M − r (cid:33) / 

24 ln  N D / (cid:32) M ( R /M − r (cid:33) /  / , (A9) u c ∼ (cid:32) DrM ( R /M − (cid:33) / 

24 ln  N D / (cid:32) M ( R /M − r (cid:33) /  / . (A10)From the stationarity condition (10) we obtain a self-consistent equation for N : NN h = Mτ = sv = D / (cid:32) M ( R /M − r (cid:33) / 

24 ln  N D / (cid:32) M ( R /M − r (cid:33) /  / . (A11)The condition u c (cid:28) r , implies that r scales with N faster than u c , r (cid:29) ln( N ). We also want σ (cid:28) vτ , therefore r (cid:29) M ( R /M − M / ln( N ) / , which is automatically satisﬁed by the previous condition. Appendix B: Fluctuations in the direction perpendicular to motion

The dynamics of the wave in the directions that are orthogonal to x is governed by the projection of (13) onto x ⊥ = ( x , . . . , x d ): ∂ t x ⊥ = − v ∂ x ⊥ c | ∂ x c | + √ D ξ ⊥ . (B1)From (14) we have c ( x , t ) ≈ (cid:90) t −∞ dt (cid:48) τ e − ( t − t ) (cid:48) /τ − √ ( x − vt (cid:48) ) +( x ⊥ ( t ) − x ⊥ ( t (cid:48) )) /r . (B2)Taking the derivative along x ⊥ yields: ∂ x ⊥ c | x = vt ≈ r (cid:90) t −∞ dt (cid:48) τ − ( x ⊥ ( t ) − x ⊥ ( t (cid:48) )) (cid:112) ( vt − vt (cid:48) ) + ( x ⊥ ( t ) − x ⊥ ( t (cid:48) )) e − ( t − t ) (cid:48) /τ − √ ( vt − vt (cid:48) ) +( x ⊥ ( t ) − x ⊥ ( t (cid:48) )) /r ≈ − rvτ (cid:90) t −∞ dt (cid:48) x ⊥ ( t ) − x ⊥ ( t (cid:48) ) t − t (cid:48) e − ( t − t (cid:48) )(1 /τ + v/r ) , (B3)1where we assumed | x ⊥ | (cid:28) r, τ /v . This derivative is small compared to the gradient along the x , so that we mayapproximate | ∂ x c | ≈ | ∂ x c | = ( r + vτ ) − .Replacing into (B1), we obtain: ∂ t x ⊥ ≈ (cid:90) t −∞ dt (cid:48) T x ⊥ ( t ) − x ⊥ ( t (cid:48) ) t − t (cid:48) e − ( t − t (cid:48) ) /T + √ D ξ ⊥ , (B4)with 1 /T = 1 /τ + v/r . Using integration by part, this equation can be rewritten as an auto-regressive process on ∂ t x ⊥ : ∂ t x ⊥ = (cid:90) ∞ dt (cid:48) T E ( t (cid:48) /T ) ∂ t x ⊥ ( t − t (cid:48) ) + √ D ξ ⊥ ( t ) (B5)where E ( x ) ≡ (cid:82) ∞ x dx (cid:48) e − x (cid:48) /x (cid:48) has the property (cid:82) ∞ dx E ( x ) = 1. Computing the Fourier transform of E ( t/T ) /T ,which is given by − iωT ln(1 − iωT ) and using the rule of convolution in Fourier space yields (16). Eq. (B5) can bere-written in terms of the direction of motion ˆ e = ∂ t x / | ∂ t x | ≈ ∂ t x /v :ˆ e ( t ) = (cid:90) ∞ dt (cid:48) T E ( t (cid:48) /T )ˆ e ⊥ ( t − t (cid:48) ) + √ Dv ξ ⊥ ( t ) (B6)Focusing on long-term behaviour yields the angular diﬀusion equation, ∂ t ˆ e = √ D ξ ⊥ ( t ) / ( vT ), for the direction ofmotion ˆ e . The two-point function of ˆ e follows the equation ∂ t arccos [ˆ e ( t )ˆ e ( t + t )] = √ DvT η ( t ) , (B7)where η ( t ) is a unit Gaussian white noise, leading to: (cid:104) ˆ e ( t )ˆ e ( t + ∆ t ) (cid:105) = e − ∆ t/t persist , (B8)with t persist = v T / (4 D ) is deﬁned as the persistence time. Going along the curviline coordinate that follows thetrajectory with speed v , we obtain a persistence length of vt persist .The logarithmic divergence at high frequencies in (16), which is also apparent in the logarithmic divergence in thetemporal domain at small t in the auto-regressive Kernel E ( t/T ) /T . This divergence may be regularized by realizingthat there is a lag u c /v between the nose of the wave, which drives the behaviour of the wave, and its bulk. Thisimplies that the integral over the past trajectory encoding the immune memory extends only up t − u c /v in the past: ∂ t x ⊥ ≈ (cid:90) t − u c /v −∞ dt (cid:48) T x ⊥ ( t ) − x ⊥ ( t (cid:48) ) t − t (cid:48) e − ( t − u c /v − t (cid:48) ) /T + √ D ξ ⊥ , (B9)or after integration by parts: ∂ t x ⊥ = e (cid:15) (cid:20) E ( (cid:15) )( x ⊥ ( t ) − x ⊥ ( x ⊥ − T (cid:15) )) + (cid:90) ∞ (cid:15)T dt (cid:48) T E ( t (cid:48) /T ) ∂ t x ⊥ ( t − t (cid:48) ) (cid:21) + √ D ξ ⊥ , (B10)with (cid:15) = t c /T (cid:28) − iω ˜ x ⊥ = K ( − iωt )˜ x ⊥ + √ D ˜ ξ ⊥ (B11)where now the K is the Laplace transform of the operator: K ( z ) = e (cid:15) [ E ( (cid:15) ) − E ( (cid:15) (1 + z ))] (B12)Since E ( z ) goes to 0 for large z , ( z − K ( z )) − goes as 1 /z for z → ∞ (small time scales), which means that athigh frequency ﬂuctuations of ∂ t x ⊥ (and thus of the direction of motion ˆ e ) track those of ξ ⊥ .Since E ( x ) ∼ − γ − ln( x ) + x at small x , then E ( (cid:15) ) − E ( (cid:15) (1 + z )) ≈ ln(1 + z ) − (cid:15)z for moderate z and small (cid:15) ,and K ( z ) ≈ z − z / O ( z (cid:15) ), so that ( z − K ( z )) − ∼ /z . We thus recover that at long time scales ∂ t x ⊥ diﬀuseswith diﬀusivity 4 D/T .2 Appendix C: Rate of speciation

A speciation, or split, occurs when two strains starting at the tip of the nose of the ﬁtness wave, and continuingthrough their progenies, co-exist long enough for them to become independent from the viewpoint of the immunepressure. This happens when their distance in the x ⊥ direction become larger than some threshold scaling with thecross-reactivity range, ∆ x ∼ r .Assuming t (cid:29) T , angular diﬀusion causes a strain to bend from the main direction of the wave as: ∂ t x ⊥ = √ DT ξ ⊥ ( t ) . (C1)After integration, the expected deviation reads: (cid:104) x ⊥ (cid:105) = 8( d − D T t (C2)assuming x ⊥ ( t = 0) = 0. Now if there are two strains a and b co-existing, their divergence in the perpendiculardirection is Gaussian distributed with: (cid:104) ∆ x (cid:105) = (cid:104) ( x ⊥ a − x ⊥ b ) (cid:105) = 16( d − D T t . (C3)Two strains are expected to co-exist at the leading edge for a time t before one of them gets absorbed into the bulkand goes extinct. The expected time for that scales as ∼ τ sw = u c /v = σ / D = v/ Ds . Assuming splitting eventsare rare, they occur when two co-existing strains both survive for an unusually long time. The distribution of suchrare events is asymptotically given by the probability density function P ( t coexist > t ) = e − t/τ sw . The probability thatthe two strains have drifted by at least r before that happens is then given by: P (∆ x > ∆ x ; coexist) = (cid:90) + ∞ dtτ sw e − t/τ sw (cid:90) + ∞ ∆ x d ∆ x (cid:112) π d − Dt / (3 T ) exp (cid:18) − ∆ x d − Dt / (3 T ) (cid:19) . (C4)Since we assume that this even is rare, P (∆ x > ∆ x ; coexist) (cid:28)

1, we make a saddle-point approximation (Laplacemethod) in the t variable. We look for the maximum of L ( t, ∆ x ) = tτ sw + ∆ x d − Dt / (3 T ) , (C5)with respect to t , ∂ t L = 0, which gives: t ∗ = √ / (cid:18) T τ sw ∆ x ( d − D (cid:19) / . (C6)Applying Laplace’s method with ∆ x = ∆ x along with a linear approximation of L in the vicinity of ∆ x (cid:38) ∆ x ,we obtain: P (∆ x > ∆ x ; coexist) ≈ τ sw (cid:112) π d − Dt ∗ / (3 T ) √ π (cid:112) ∂ t L ( t ∗ , ∆ x ) 1 ∂ x L ( t ∗ , ∆ x ) e −L ( t ∗ , ∆ x ) = (cid:114) e − (cid:18) T x d − Dτ (cid:19) / . (C7)Replacing T = ( r/v ) R − /M and τ sw = v/ (4 Ds ) yields: P (∆ x > ∆ x ; coexist) ≈ (cid:114)

38 exp  − (cid:32) s R − /M D ∆ x r d − v (cid:33) /  . (C8)We check self-consistently that our approximation of angular diﬀusion is correct for ∆ x ∼ r . The condition is metwhen t ∗ (cid:29) T , or t ∗ = (cid:113) r ∆ x R − /M / (( d − D sv ) / (cid:29) T = rv R − /M . (C9)3This condition is equivalent to: v (cid:29) D / r − / , (C10)which is (barely) satisﬁed for large population sizes (Eq. 12).Finally, to get the rate of splitting events, we must multiply the probability of a successful splitting event, P (∆ x > ∆ x ; coexist), by the rate with which branches sprout from the main trunk of the phylogenic tree. Since mutationsare modeled by continuous diﬀusion in antigenic space, such a new branch occurs whenever the individual virus onthe trunk of the tree (deﬁned as the virus that will eventually seed the entire future population) reproduces, as thetwo oﬀspring immediately become antigenically distinct because of diﬀusion, and thus make two distinct branches.This happens with rate f ( u c ) = u c s = v / D , so that the overall rate of speciation should scale as: k split ≈ (cid:114) v D exp  − (cid:32) s R − /M D ∆ x r d − v (cid:33) /  . (C11)Replacing ∆ x = ar , with a a numerical scaling factor, gives the result of the main text. Appendix D: Details of simulation implementation

To update the ﬁtness at each time step, we used either an exact computation of Eq. 1, or a faster approximatemethod based non-homogeneous fast Fourier transforms [34, 35]. For the exact computation, c ( x , t + ∆ t ) − c ( x , t ) wascalculated at each time step by convolving the Kernel H with n h ( x , t + ∆ t ) − n h ( x , t ) (exploiting the fact that thissum is sparse because not all positions get updated). Both algorithms can be preceded by an extra-approximationfor positions x (cid:48) with n h ( x (cid:48) , t + ∆ t ) > H byreducing | x − x (cid:48) | to its projection along the direction given by x (cid:48) − (cid:104) x (cid:105) n , with (cid:104) x (cid:105) n = (cid:82) d x x n ( x ) /N . This allows usto pre-compute the contributions of all these x (cid:48) to Eq. 1 with a mere 1D convolution, ∀ x where n ( x ) >

0, speedingup the computation considerably. We choose the desired combination of approximations based on the convolutioncomputational complexity, driven by the number of positions with n ( x , t + ∆ t ) > n h ( x , t + ∆ t ) − n h ( x , t ) >>