A preceding low-virulence strain pandemic inducing immunity against COVID-19

Abstract

Countries highly exposed to incoming traffic from China were expected to be at the highest risk of COVID-19 spread. However, COVID-19 case numbers (infection levels) are negatively correlated with the incoming traffic level. Moreover, infection levels are positively correlated with population size, while the latter should only affect the infection level once herd immunity is reached. These could be explained if a low-virulence strain (LVS) began spreading a few months earlier from China, providing immunity from the later emerging known SARS-CoV-2 high-virulence strain (HVS). We find that the dynamics of the COVID-19 pandemic depend on the LVS and HVS spread doubling-times and the delay between their initial onsets. We find the LVS doubling-time, $T_L$, to be $\sim 1.59 \pm 0.17$ times longer than that of the HVS, $T_H$, but its earlier onset allowed it to spread globally to the levels required for herd-immunity. In countries exposed earlier to the LVS and/or having a smaller population size, the LVS achieved herd-immunity earlier, allowing less time for the spread of the HVS, and giving rise to lower HVS infection levels. Such a model accurately predicts a country's infection level ($R^2 = 0.74$; p-value of $5.2 \times 10^{-13}$), given only its population size and incoming traffic from China. It explains the negative correlation with incoming traffic ($c_{exp}$), the positive correlation with the population size ($n_{pop}$) and their specific relations ($N_{cases} \propto n_{pop}^{T_L/T_H} \times c_{exp}^{1 - T_L/T_H}$). We find that most countries should have already achieved herd-immunity. Further COVID-19 spread in these countries is limited and is not expected to rise by more than a factor of 2-3. We suggest tests/predictions to further verify the model and biologically identify the LVS, and discuss the implications.


A preceding low-virulence strain pandemic inducing immunity against COVID-19

Hagai B. Perets∗ and Ruth Perets
July 15, 2020
Technion - Israel Institute of Technology, Haifa, Israel 3200004
Rambam Health Care Campus, Haifa, Israel

Abstract

The COVID-19 pandemic is thought to have begun in Wuhan, China in December 2019. Mobility analysis identified East-Asia and Oceania countries to be highly exposed to COVID-19 spread, consistent with the earliest spread occurring in these regions. However, here we show that while a strong positive correlation between case numbers and exposure level could be seen early on, as expected, at later times the infection level is found to be negatively correlated with the exposure level. Moreover, the infection level is positively correlated with the population size, which is puzzling since it has not reached the level necessary for population size to affect the infection level through herd immunity. These issues are resolved if a low-virulence Corona-strain (LVS) began spreading earlier in China outside of Wuhan, and later globally, providing immunity from the later appearing high-virulence strain (HVS). Following its spread into Wuhan, cumulative mutations gave rise to the emergence of a HVS, known as SARS-CoV-2, starting the COVID-19 pandemic. We model the co-infection by a LVS and a HVS, and show that it can explain the evolution of the COVID-19 pandemic and the non-trivial dependence on the exposure level to China and the population size in each country. We find that the LVS began its spread a few months before the onset of the HVS, and that its spread doubling-time is $\sim 1.59 \pm 0.17$ times slower than that of the HVS. Although more slowly spreading, its earlier onset allowed the LVS to spread globally before the emergence of the HVS. In particular, in countries exposed earlier to the LVS and/or having a smaller population size, the LVS could achieve herd-immunity earlier, and quench the later-spreading HVS at earlier stages. We find that our two-parameter model (the spread rate and the initial onset time of the LVS) can accurately explain the current infection levels ($R^2 = 0.74$; correlation p-value of $5.2 \times 10^{-13}$). Furthermore, countries exposed early should have already achieved herd-immunity.
We predict that in those countries cumulative infection levels could rise by no more than 2-3 times the current level through local outbreaks, even in the absence of any containment measures. We suggest several tests and predictions to further verify the double-strain co-infection model, and discuss the implications of identifying the LVS.

COVID-19 is thought to be a zoonotic disease, which currently spreads by human-to-human transmission [1, 2]. Human mobility mediates the geographic spread of the disease between countries through ground and air transportation. Given the origin of COVID-19 in China, it is therefore expected that traffic from China to other countries would drive the initial epidemic spread world-wide (e.g. [3]). In particular, it is expected that outbound mobility levels from China into other countries (i.e. the numbers of incoming passengers from China) should predict the risk/exposure level of the disease spread into these regions. Here we make use of the exposure level, defined as the mobility level and risk assessment provided by the GLEAM epiRisk module (e.g. Refs. [4, 5, 6]), and compare it with the infection level (see Fig. 1a), measured either by the number of confirmed COVID-19 cases or confirmed COVID-19 deaths (and later accounting for normalization by testing levels and/or age-structure; see Methods).

We find that shortly after the beginning of COVID-19 spread, highly exposed countries show a linear positive correlation between the infection level (as measured by the number of COVID-19 confirmed cases) and the mobility-exposure level (given by the epiRisk module, based on flight-traffic data) at early times ( p = 5 . × − ; for countries with first cases by 80 days after December 1st 2019; see additional discussion of the early spread in Methods), but show no correlation with the population size. However, at a later time (day 216), the same countries show a negative correlation between the infection level and the mobility-exposure level ( p = 0 . , ρ = − . ; and p = 0 . ; ρ = − . , for age-normalized deaths), as can be seen in Fig. 1a. Moreover, we find that at the later time point, the infection level for all countries in our sample is positively correlated with the population size (Pearson test; p = 1 . × − ρ = 0 . for confirmed cases; and p = 2 . × − , ρ = 0 . for age-normalized deaths). While the early direct correlation of cases with exposure supports exposure-level risk estimates, our attention was drawn to the unexpected negative correlation between the infection level and the exposure level at later times, and to the correlation with population size at a relatively early epidemic stage, where the numbers of cases are far below the levels necessary for herd immunity to take effect.

∗ Corresponding author: [email protected]
Note that unless otherwise stated, all correlations discussed are of the logarithms of the parameters.
arXiv [q-bio.PE]

Figure 1: Left: Comparison of the number of confirmed cases as a function of the exposure level, at early (80 days since December 1st [...] negative correlation between the infection level and the mobility-exposure level. Right: The testing-normalized numbers of confirmed cases and testing (and age) normalized numbers of deaths (shifted down by a factor of ten for clarity), as a function of the risk-parameter (see definition in the main text), showing a direct linear relation as expected in the double-strain co-infection model. The least-populated and highest-populated countries (marked by open circles) were not used in deriving the model parameters, so as to ensure they do not bias the results (see Methods). Nevertheless, as can be seen, these too are consistent with the model. See Methods for a similar figure with annotated country names (not shown here for clarity).

Here we show that the existence of an earlier low-virulence (ancestor) strain (LVS) providing immunity from the HVS could explain the current global distribution of COVID-19.
We propose that the LVS spread from a location in China, but far outside Wuhan, and later gave rise to the emergence of the highly virulent strain (HVS), known as SARS-CoV-2. The exact location of the initial LVS outbreak is unknown, but is likely to be in one of the least HVS-infected regions in China, and possibly close to Vietnam, given the very low infection levels observed there (see also section 1.3).

Previous studies explored epidemic models of co-infection and cross-immunity by multiple viral strains [7, 8, 9, 10, 11, 12, 13, 14, 15]. The emergence of highly pathogenic respiratory viruses from low-pathogenic ancestors has been previously described [16], and cross-immunity between respiratory low-virulence strains and high-virulence strains has been described as well [17, 18, 19]. In these cases the LVS effectively gave rise to a natural live vaccine, leading to a mild-to-asymptomatic presentation of the HVS disease later on [17, 18, 19]. In the case of SARS-CoV-2, the mutation rate was found to be slow, at the level of . × − substitutions per site per year [20], slower than the Influenza A/H1N1 pandemic virus, which mutates at . × − substitutions per site per year [21]. Unlike the "antigenic shift" in fast mutating viruses that thwarts cross-immunity and requires yearly vaccination [22], a slower mutation rate may allow for the co-existence of closer strains, retaining cross-immunity, or cross-response [23, 22]. We suggest that the mutation(s) that occurred in the LVS leading to the emergence of COVID-19 affected the pathogenicity and transmissibility of the virus, but not its antigenicity. To date, genetic sequencing has shown that SARS-CoV-2 mutated and diverged into several genetic groups [24]. Mutations affecting pathogenicity have been previously described for SARS-CoV-2 [25], but currently all identified groups, to the best of our knowledge, are highly virulent, inconsistent with the expected characteristics of the proposed LVS.
Mutations affecting transmissibility, but not antigenicity, have been previously suggested for other respiratory pandemics [22]. Specifically, mutations affecting transmissibility and pathogenicity were previously described for the Ebola pandemic of 2013-2016 [26, 27]. Our proposed model therefore follows similar pathways found earlier in other viruses.

As in those cases of cross-immunizing strains, our model suggests that although both strains share cross-immunity, the earlier strain has mild to no clinical symptoms, leading to it being as yet unrecognized and not raising world-wide alarm. However, when this LVS reached Wuhan, an accumulation of mutations occurred that gave rise to the emergence of a faster spreading and more pathogenic HVS, namely the currently identified SARS-CoV-2, which then spread from Wuhan to China and the rest of the world. Our model suggests, given the delay between the onset of the LVS and the HVS, that countries with high exposure to China could achieve earlier LVS-induced herd-immunity (or near herd-immunity), leading to earlier quenching of the HVS spread. In countries that have lower exposure to China the LVS achieves herd-immunity only later, or might not achieve herd-immunity at all before the HVS becomes widely spread. In these countries a larger COVID-19 pandemic would be expected. Such dynamics can explain the observed negative correlation between traffic from China and the infection levels in each country.
Similarly, since herd immunity is dependent on the percentage of the population that is infected by a virus [28], less-populated countries could achieve herd immunity to the LVS earlier on, leading to the positive correlation between the COVID-19 infection level and the population size.

As we show below, in our double-strain co-infection model, the overall infection level for any given country is prescribed by the mobility-exposure level of the country and its population size, where the specific weight of these two parameters is determined by the spread rate of the LVS. Using a linear-regression analysis we can identify the LVS spread rate, and explain the current infection levels for the countries in our sample, which extend over four orders of magnitude in the number of cases. In Fig. 1b we show the correlation between the number of (testing-normalized) confirmed cases (see Methods) and the risk-parameter, which we define below (adjusted $R^2 = 0.74$; F-statistic vs. a constant model of 112, with a p-value of $p = 5.2 \times 10^{-13}$; 41 observation points, 39 degrees of freedom; using the 'fitlm' function in Matlab [29]), allowing us to explain (and predict) the past (and future) dynamics of the pandemic.

In order to study the dynamics of two-virus spread with cross antigenicity, we utilized the well-established Susceptible-Infected-Recovered (SIR) model [30] (see ref. [28] for an overview). We extended the SIR model to the case of two circulating cross-immunizing viruses (i.e. a person infected by one of the viruses is immune to infection from the other virus, both during and after the infection; see Methods and related previous studies of co-infection and the spread of multiple strains [7, 8, 9, 10, 11, 12, 13, 14, 6]).
In the model we track the number of people susceptible to both viruses ($S$), the numbers of people infected with the LVS, or with the later-emerging HVS ($I_L$ and $I_H$, respectively), and the number of people who have contracted either strain, and then recovered or died ($R_L$ and $R_H$, for the LVS and HVS, respectively). It is assumed that the total population in a given country, $N = S(t) + I_L(t) + I_H(t) + R_L(t) + R_H(t)$, is fixed, and that those who have recovered from either virus are immune to both. As we discuss later on, we assume the precursor LVS spreads slower than the HVS. The probability of a susceptible person contracting a given virus at time $t$ is proportional to the fraction of people infected with that virus among the total population, $I_H/N$ for the HVS, and $I_L/N$ for the LVS. These assumptions lead us to a set of five ordinary differential equations for $S(t)$, $I(t)$, and $R(t)$:

$$\frac{dS}{dt} = -\beta_L \frac{S(t) I_L(t)}{N} - \beta_H \frac{S(t) I_H(t)}{N} \qquad (1)$$

$$\frac{dI_L}{dt} = \beta_L \frac{S(t) I_L(t)}{N} - \gamma_L I_L(t) \qquad (2)$$

$$\frac{dI_H}{dt} = \beta_H \frac{S(t) I_H(t)}{N} - \gamma_H I_H(t) \qquad (3)$$

$$\frac{dR_L}{dt} = \gamma_L I_L(t) \qquad (4)$$

$$\frac{dR_H}{dt} = \gamma_H I_H(t) \qquad (5)$$

Here, $\gamma_L, \gamma_H \geq 0$ are the recovery/death rates, and $\beta_L, \beta_H \geq 0$, the transmissibility parameters, measure the likelihood of transmitting the LVS and HVS, respectively, when an infected and a susceptible person come into contact. Here we make the simplifying assumption that a person can transmit the virus to other people for a total period of 14 days, before the person recovers (or dies). Although the LVS has low/negligible virulence, as discussed earlier, we assume for simplicity that infected people recover at the same rate from both viruses. In both cases we assume for simplicity that no measures to slow or stop the spread of either virus have been implemented. As can be seen in Fig. 1b, our results provide an excellent fit to the data, even without considering any containment measures, suggesting that the latter might have only a relatively small effect on the final infection level, although they could give rise to an overall slower progression of the pandemic (see Methods for further discussion of our assumptions). Examples of the evolution of such double-strain co-infection model dynamics are shown in Fig. 2.

As can be seen in Fig. 2 (see also Methods for further details), the larger the initial spread of the LVS (LVS seed-infection) at the time when the HVS began spreading in a given country, the faster the LVS achieves herd-immunity, thereby also quenching the fast spread of the HVS, and giving rise to a lower HVS infection level. Since herd-immunity requires a population-wide infection level (e.g. $\sim 1 - \gamma/\beta$ of the population; see [31, and references therein]), more populated countries take longer to achieve herd-immunity for the same seed-infection (Fig. 2c & Fig. 2d). This allows more time for the spread of the HVS, giving rise to an overall higher HVS infection level. Since the spread of the HVS is faster than that of the LVS, the former (HVS) can even outrun the latter (LVS), in which case the HVS will not be limited by the LVS, and will only be limited by the population size (see Fig. 2c).

Based on the double-strain co-infection model we can now explain the observed distribution of COVID-19. The LVS outbreak began in China, far from Wuhan. Shortly after the LVS reached Wuhan, and before the LVS achieved herd-immunity in this region (e.g. similar to the case in Fig. 2b), cumulative mutations led to the emergence of the HVS in Wuhan, initiating the exponential spread of the HVS and its outbreak in China and later worldwide. At a later time the HVS spread slowed down and almost completely stopped once the LVS achieved herd-immunity, quenching the exponential spread of both the LVS and the HVS.
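The two-strain system of Eqs. (1)-(5) can be sketched numerically as follows. This is a minimal illustration and not the authors' code: it uses plain forward-Euler integration, and the doubling time of 3 days, the single-case seeds, the population of 10^6 and the function names are all hypothetical choices made for the example. Only the 14-day infectious period and the doubling-time ratio of 1.59 are taken from the text.

```python
import math


def beta_from_doubling_time(doubling_days, gamma):
    """Transmissibility giving the requested early doubling time.

    Early SIR growth is exp((beta - gamma) * t), so T = ln(2) / (beta - gamma).
    """
    return gamma + math.log(2.0) / doubling_days


def herd_immunity_threshold(beta, gamma):
    """Immune fraction 1 - gamma/beta above which infections stop growing."""
    return 1.0 - gamma / beta


def simulate_two_strain_sir(n_pop, beta_l, beta_h, gamma_l, gamma_h,
                            hvs_delay_days, t_end_days=600.0, dt=0.05):
    """Forward-Euler integration of Eqs. (1)-(5).

    One LVS case is seeded at t = 0 and one HVS case after hvs_delay_days
    (both seed sizes are illustrative assumptions).
    """
    s, i_l, i_h, r_l, r_h = n_pop - 1.0, 1.0, 0.0, 0.0, 0.0
    hvs_seeded = False
    for step in range(int(round(t_end_days / dt))):
        if not hvs_seeded and step * dt >= hvs_delay_days:
            s, i_h, hvs_seeded = s - 1.0, i_h + 1.0, True
        new_l = beta_l * s * i_l / n_pop   # new LVS infections per day
        new_h = beta_h * s * i_h / n_pop   # new HVS infections per day
        rec_l = gamma_l * i_l              # LVS recoveries per day
        rec_h = gamma_h * i_h              # HVS recoveries/deaths per day
        s -= (new_l + new_h) * dt
        i_l += (new_l - rec_l) * dt
        i_h += (new_h - rec_h) * dt
        r_l += rec_l * dt
        r_h += rec_h * dt
    return {"S": s, "I_L": i_l, "I_H": i_h, "R_L": r_l, "R_H": r_h}


GAMMA = 1.0 / 14.0                                   # 14-day infectious period
BETA_H = beta_from_doubling_time(3.0, GAMMA)         # hypothetical HVS doubling time
BETA_L = beta_from_doubling_time(3.0 * 1.59, GAMMA)  # LVS doubles 1.59 times slower

for delay in (50.0, 65.0):
    out = simulate_two_strain_sir(1e6, BETA_L, BETA_H, GAMMA, GAMMA, delay)
    # Cumulative HVS infections: still infected plus recovered/died.
    print(delay, round(out["I_H"] + out["R_H"]))
```

With these illustrative parameters, the longer head start of the LVS should yield a lower cumulative HVS count, which is the qualitative behaviour described for Fig. 2. A production analysis would more likely use an adaptive ODE solver than a fixed Euler step.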
Note that in the model the total cumulative number of infections can still slowly rise by an additional factor of 2-3 after herd-immunity is achieved (which approximately corresponds to the time when the peak in active cases is seen; see Fig. 2). Throughout China the LVS had more time to spread before the spread of the HVS, leading to immunization of an even larger fraction of the population (compared to Wuhan and the Hubei region) before the spread of the HVS from Wuhan, and hence to an even lower COVID-19 infection rate in China outside Hubei province.

The global spread of the LVS followed the exposure level, with more exposed countries receiving LVS-infected people earlier (strong negative correlation between the logarithm of the exposure level and the day of first infection; see Methods and Fig. 4; p = 6 . × − ). If the LVS and HVS spread rates were the same, the spread of both strains would have followed the exact same dynamics, and the delay between the arrival of the LVS and the HVS would be constant. Consequently, a fixed ratio of LVS infection level to HVS infection level in different countries (with similar population sizes, see below) would be expected, leading to overall similar HVS infection levels. However, the infection levels in countries of similar population sizes differ between lower and higher exposures (e.g. compare France and Thailand, Taiwan and the Netherlands in Fig. 1). Therefore, the observed differential infection level requires the LVS transmissibility level to be lower (consistent with our finding below). In this case, as shown in Fig. 2 and discussed in the Methods, the HVS spread can outrun the LVS spread and decrease the relative LVS to HVS infection levels with time. The LVS spread to less exposed countries started later on, allowing for a shorter time frame for the faster HVS spread to outrun the LVS spread, giving rise to a differentiated increase in the ratio of HVS-to-LVS infection with decreasing exposure level. For a given population size this ratio determines the final HVS infection level (compare Figs. 2a and 2c).

Figure 2: The results of a simple SIR co-infection model, showing the evolution of two viruses which share cross-immunity, where the slower-spreading one (the LVS, in our case) begins spreading earlier. We show four different cases to exemplify the dynamics. Top (bottom) left: The LVS begins spreading 50 (65) days earlier than the HVS in a population of × people. Top (bottom) right: The LVS begins spreading 50 (65) days earlier than the HVS in a population of × people. As can be seen, the longer the delay between the onset of the LVS and the HVS (which is a function of the mobility-exposure level, see Eq. 8), and the smaller the population, the earlier the spread of the HVS is quenched, leading to an overall lower infection level of the HVS (see text for further discussion).

Another important parameter that determines viral spread in the double-strain co-infection model is population size.
As discussed above, the time to achieve herd immunity depends logarithmically on the population size. Therefore, countries with smaller population size would approach herd immunity faster, allowing less time for the HVS spread to outrun the LVS spread before its exponential growth is quenched.

Considering the exposure-level-dependent HVS to LVS infection ratio, the population size, and the initial delay between the onset of the LVS and the onset of the HVS in China, we expect the HVS infection level to have a specific dependence on these parameters (as we derive in detail in the Methods), given by

$$\log_2 I_H^{max} = \frac{t_L - t_H}{T_H} + \left(\frac{T_L}{T_H}\right) \log_2 N - \left(\frac{T_L}{T_H} - 1\right) \log_2 c_{exp}, \qquad (6)$$

where $I_H^{max}$ is the expected maximal infection level of the HVS; $t_L$ and $t_H$ are the onset times of the LVS and the HVS, respectively; and $T_L$ and $T_H$ are the doubling times for the LVS and HVS, respectively. In particular, we expect a specific relation between the dependence on the population size and the dependence on the exposure level ($T_L/T_H$ compared with $T_L/T_H - 1$; see Methods). Linear regression can therefore allow us to identify the specific dependence. Using the data on confirmed cases we find best-fit linear regression models (for Eq. 6) of $T_L/T_H$ = 1 . ± . (adjusted $R^2$ of 0.42). Although these already provide highly significant results ( p = 5 . × − ; Pearson correlation test), such data are noisy due to differences in the testing level. When analyzing the testing-normalized confirmed cases (see Methods) we find best-fit linear regression models (for Eq. 6) of $T_L/T_H = 1.59 \pm 0.17$ (adjusted $R^2$ of 0.74). Using the age-normalized deaths data we find $T_L/T_H$ = 1 . ± . (adjusted $R^2$ of 0.41), or, when using age- and testing-normalized data, $T_L/T_H$ = 1 . ± . (adjusted $R^2$ of 0.62). All of these are consistent with each other and with the model. Hereafter, we adopt the most significant result, $T_L/T_H = 1.59$, as our fiducial ratio.
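The scaling implied by Eq. 6 can be checked with a short sketch. It reads Eq. 6 with base-2 logarithms, which is an assumption tied to the use of doubling times, and all input values below (population, exposure level, onset times, HVS doubling time) are hypothetical placeholders rather than numbers from the paper. The function name is likewise illustrative.

```python
import math


def log2_max_hvs_level(n_pop, c_exp, onset_lvs, onset_hvs,
                       doubling_hvs, ratio=1.59):
    """Evaluate the right-hand side of Eq. (6), read with base-2 logarithms.

    ratio is T_L / T_H; onset times and doubling_hvs share one unit (days).
    """
    return ((onset_lvs - onset_hvs) / doubling_hvs
            + ratio * math.log2(n_pop)
            - (ratio - 1.0) * math.log2(c_exp))


# Hypothetical reference country.
base = log2_max_hvs_level(1e7, 1e-3, 0.0, 90.0, 3.0)
# Doubling the population raises log2(I_H^max) by T_L/T_H ...
more_people = log2_max_hvs_level(2e7, 1e-3, 0.0, 90.0, 3.0)
# ... while doubling the exposure level lowers it by T_L/T_H - 1.
more_exposed = log2_max_hvs_level(1e7, 2e-3, 0.0, 90.0, 3.0)

print(more_people - base, more_exposed - base)
```

The two differences isolate the regression coefficients of Eq. 6: the population coefficient is $T_L/T_H$ and the exposure coefficient is $-(T_L/T_H - 1)$, so a single fitted ratio has to account for both slopes.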
Correlation tests between the risk-parameter, $\chi$, which we define (following Eq. 6) as

$$\chi = \left(\frac{N}{2^{(t_{H0} - t_{L0})/T_L}\, c_{exp}^{1 - T_H/T_L}}\right)^{T_L/T_H},$$

and the number of testing-normalized cases (age- and testing-normalized deaths) for $T_L/T_H = 1.59$ give p = 5 . × − , . × − and p ≪ − ( p = 1 . × − , × − and p = 1 . × − ), for the Pearson, Kendall, and Spearman tests, respectively.

Taking the mobility exposure level in China to be unity, by definition, and plugging in the population size and the infection level, we can then find the delay time between the onset of the LVS and the HVS. If we assume a doubling time of . days (in the absence of any containment measures), consistent with the observed spread rate at the earliest days, we find $(t_L - t_H)$ ∼ days, suggesting the initial human infection by the LVS began in China around September. This, however, does not account for non-diagnosed infections, population structure and inter-China mobility that may slow transmission, and it is possible that the delay might somewhat differ. If we assume only ∼ . of the infections are diagnosed we get ∼ days.

In Figs. 1c and 1d we show $I_H^{max}$ for both the testing-normalized confirmed cases and for the age- and testing-normalized deaths, as a function of the risk-parameter defined above. In fact, low- and high-population countries, not used to derive the model parameters so as not to introduce potential bias (see Methods), also fit well (including them in the statistical score gives p = 7 × − (Pearson) for the same model parameters), and the model shows excellent agreement with observations, over five orders of magnitude in infection level. We find that in most countries in our sample the spread of the LVS should achieve the herd-immunity level, and quench further exponential spread of the HVS, long before the HVS achieves a herd-immunity infection level.
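As a consistency check between the risk-parameter and Eq. 6, the logarithm of $\chi$ can be expanded term by term. The sketch below assumes base-2 logarithms, reads the risk-parameter as a ratio with $N$ in the numerator, and takes $t_{L0}$, $t_{H0}$ to be the same onset times as $t_L$, $t_H$ in Eq. 6. All three readings are assumptions made here, not statements from the text.

```latex
\begin{align}
\chi &= \left(\frac{N}{2^{(t_{H0}-t_{L0})/T_L}\, c_{exp}^{\,1-T_H/T_L}}\right)^{T_L/T_H} \\
\log_2 \chi
  &= \frac{T_L}{T_H}\left[\log_2 N - \frac{t_{H0}-t_{L0}}{T_L}
     - \left(1-\frac{T_H}{T_L}\right)\log_2 c_{exp}\right] \\
  &= \frac{t_{L0}-t_{H0}}{T_H} + \frac{T_L}{T_H}\,\log_2 N
     - \left(\frac{T_L}{T_H}-1\right)\log_2 c_{exp}
\end{align}
```

Under these assumptions the last line coincides with the right-hand side of Eq. 6, so a direct linear relation between the infection level and $\chi$ is equivalent to the regression form of Eq. 6.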
We find low dispersion around the expected values from the model (up to a factor of a few for the testing-corrected confirmed cases), leaving relatively little room for parameters other than exposure level and population size to affect the final outcome, as we further discuss below. A wider dispersion can be seen in the age-corrected deaths. However, the amplitude of the dispersion grows with the number of deaths/confirmed cases, suggesting a relation to the load on the health system affecting treatment, and thereby the number of deaths. A further inquiry into this issue, though important, is beyond the scope of this paper and will be explored elsewhere.

Our model and analysis of the current statistical data support the existence of a preceding LVS. They explain and predict upper limits for infection levels as a function of the mobility-exposure to China and the population size. In particular, we find that in more exposed countries the spread of the LVS is already at or close to the herd-immunity level, which explains the puzzling low-level infections observed in these countries, and the direct, but non-linear, dependence of infection level on population size, which is otherwise unexpected given the low infection levels with respect to those required for herd-immunity. However, additional independent tests could further support or refute the model. These can be divided into biological tests and demographic ones, as we discuss in the following.

Cross antigenicity between the LVS and COVID-19 is an essential part of our model and can be either humoral or cellular. If cross antigenicity relies on cross-reactive antibodies, it is possible that antibodies to the LVS will be detected by serological testing for SARS-CoV-2, already employed to some extent in several countries [32].
In particular, countries which appear to have achieved LVS herd-immunity should show a large fraction of the population to be sero-positive, in contrast with the currently directly measured lower per-population HVS infection levels [32, 33]. However, current serologic testing is optimized for the HVS, and might be less efficient for detection of the LVS. Recent findings are at the level of a few percent up to 30 percent sero-positive, i.e. tens to a hundred times larger than the infection rates inferred from the confirmed cases, but still considerably lower than required for herd-immunity. These likely reflect low testing levels in most countries, suggesting that actual HVS infection levels are typically much higher than inferred from the reported cases, consistent with our use of testing-normalization in our analysis (see Methods). It also suggests that none of the currently identified genetic groups of SARS-CoV-2 could be the LVS proposed here. Moreover, it is important to note that it is currently unclear whether the antibodies detected in the currently available serologic tests are indeed protective antibodies, and therefore even in the case of humoral cross antigenicity, LVS strains might not be detected at all by SARS-CoV-2 serologic tests. We therefore suggest using a more direct and accurate method to test for the existence of LVS antibodies, by employing viral micro-neutralization testing of the HVS on serum from a sample of people (preferably from highly exposed countries, for which the majority of the population should already have been infected by the LVS) who are found to be negative for the SARS-CoV-2 HVS in serologic tests. These should be able to identify antibody reactions to the HVS otherwise undetected by currently employed serologic tests, in a similar manner to that used to test acquired immunity to a highly pathogenic flu virus following vaccination for a less pathogenic flu virus [34]. To the best of our knowledge only one micro-neutralization study (with limited statistics) has been done on SARS-CoV-2 to date [35], and it did not identify antibodies in the serum of people who were found to be negative for the SARS-CoV-2 HVS in serologic tests.

A likely possibility is therefore that cross antigenicity between the two strains might be dependent on cellular immunity, rather than humoral immunity. For example, previous studies on protection from the avian influenza HVS H5N1 by the LVSs H9N2 [17, 18], H1N1, or H1N2 [19] showed that the immunity was cellular rather than humoral, and H5N1 antibodies were not found in chickens that were previously infected by low-virulence strains [17, 18, 19]. In fact, very recently, and after the completion of our analysis, Sekine et al. [36] found that SARS-CoV-2-specific T cells were detectable in antibody-seronegative family members and individuals with a history of asymptomatic or mild COVID-19. This could be consistent with our suggestion of cellular cross-immunity and the possible wide-spread immunity due to the LVS. Our model suggests that T-cell response mapping tests among a current sample of people with no known infection in most countries, and especially in highly exposed countries (e.g. China, Vietnam, Thailand), should show the majority to already be immune, in contrast with the known low infection levels in these countries with respect to the population size, verifying our results. At this stage, one cannot exclude acquired immunity through past infections by other viruses; however, such immunity levels should not follow the specific correlations expected from our model, which explain the current global distribution.

Assuming a LVS SARS-CoV-2 is identified and a specific serologic or T-cell response mapping test is developed for it, tests of blood samples taken across China (or in other countries achieving LVS herd-immunity) around October-November, i.e. before the December 2019 HVS outbreak, but at the point where the LVS should have already been widely spread, could identify positive cases, in particular outside Hubei province. In fact, the study of samples taken at different times would potentially allow for identifying the LVS spread, where positive fractions should gradually increase from September until November. Similarly, early evidence for SARS-CoV-2-like viruses found before the known COVID-19 outbreak in December 2019 could further support our findings; e.g. evidence for SARS-CoV-2 in sewage, well before the COVID-19 outbreak began in the respective country [37], could provide similar clues to the early spread of the LVS. In particular, a much earlier spread of the HVS should have been easily identified due to the large number of patients expected, inconsistent with the data, while a wide-spread LVS could be consistent with such findings.

Biological tests could provide optimal smoking-gun signatures of the LVS spread, but depend on the unknown genetic similarity of the LVS and the HVS, the type of serologic or T-cell response mapping tests and the type of cross antigenicity. A different, sequence-independent statistical approach can be used to test for LVS spread. People visiting China in the 1-2 months before the HVS outbreak (but not afterwards) might have been infected by the LVS during their visit, thereby acquiring immunity to the HVS. The HVS infection level among such people should therefore be lower than that of the background population in their home country, in particular in regions highly infected by COVID-19.
Therefore, testing COVID-19 prevalence among people who visited China in September to November 2019 could confirm the presence of a protective LVS independent of its genomic sequence, but requires non-trivial data collection due to the large amount of travel and health information needed for this statistical analysis.

The existence of a LVS providing cross-immunity to the COVID-19 HVS strains has far-reaching implications. First, it proves the existence of acquired viral immunity and the potential for developing an efficient vaccine, given the immunity provided by the LVS as an effective live vaccine. Biological identification of the LVS could be used as a highly advantageous starting point for developing a safe COVID-19 vaccine, although modifications preventing the occurrence of new pathogenic properties of the natural virus will be essential. Moreover, although the LVS might protect from further infections in already "protected" countries, it is as yet unknown how long such immunity lasts (see e.g. the discussion of limited immunity in Ref. [38]). Lastly, the HVS SARS-CoV-2 might at some point change its structure to generate a novel HVS that would not be cross-immune with the previous strains. Biological identification of the LVS would also enable the development of specific serologic tests and/or T-cell response mapping tests, which can then be used to measure and directly validate the existence of herd-immunity in a given country, and thereby direct the application of containment measures and reopening strategies.

Non-homogeneous geographical transmission and/or non-homogeneous transmission among less connected or disconnected communities can give rise to pockets of less immune populations within largely immune countries, explaining the possibility of local outbreaks of the HVS. More generally, our simplified model considers countries as independent units, while intra-country dynamics could lead to a down-scaled version of the model.
Even in countries in which the early LVS infection provided partial protection, the protection might not be homogeneous. Regions and/or communities more exposed to international traffic would be infected earlier, leading to initially higher infection levels, but would eventually show lower infection-levels at later times, since the same regions/communities also acquired a higher level of immunity through larger prior LVS infection. More isolated regions and/or remote/disconnected communities might eventually be at higher risk and could have higher infection levels compared with the overall country population. Generally, regions/communities having earlier infections are likely to experience overall lower cumulative levels of infection, compared to same-size communities infected later, and larger isolated communities are generally likely to experience higher infection levels. A similar intra-country analysis could serve to predict the intra-country dynamics of the COVID-19 pandemic (e.g. comparing the exposure levels and population-sizes in different states in the US would suggest that some of the more populated states are still likely to experience higher infection levels). Furthermore, COVID-19 can still affect immune-deficient people everywhere, as these are not protected by the LVS, although herd-immunity significantly lowers their chance of infection.

It is possible, and even likely, that in a fraction of the cases the LVS only provides partial immunity, allowing for HVS infection, but leading to a less virulent form of COVID-19. In such a case we might expect a higher fraction of newly identified COVID-19 patients to be, on average, more symptomatic during the early phases of the pandemic, when it still spreads exponentially, before the LVS achieves herd-immunity. At these early times the LVS has not yet infected the majority of the population, and newly HVS-infected people are not likely to have been previously infected by the LVS and partially immunized.
After the LVS infected a large fraction of the population, new HVS-infections are far more likely to be of previously LVS-infected people, who already acquired partial immunity. We might therefore expect a lower fraction of asymptomatic cases, and a higher morbidity rate, during the early exponential growth of the HVS, in comparison with later times, after sub-exponential growth and a later decay in the number of cases are observed. We do note that the improvement in COVID-19 treatment over time, due to the learning curve of the health systems, might give rise to lower morbidity at later times too, but it is less likely to affect the asymptomatic to symptomatic ratio, nor should it be related to the transition from exponential to sub-exponential growth. Note that in a case of only partial immunity, the infection-level, in terms of cases, could somewhat increase beyond the level described above, the transition to sub-exponential growth, but the infection-level in terms of deaths is likely to be far less affected.

Our results indicate that the infection level depends on the exposure level and the population size. In particular, there is relatively little dispersion in the observed infection levels with respect to the basic model fit. This suggests that although the widely varied containment measures applied by different countries [39] might be important in slowing the virus spread to some extent, and possibly allowing the health system to better accommodate more serious cases, the specific and different measures taken by each country eventually had little effect on the number of infected people in a given country.
In particular, it is difficult to understand how the different measures taken at different times could give rise to the type of correlations with exposure-level and population sizes which we identify. Our results would similarly suggest that different opening strategies after lock-downs and social distancing would not considerably affect the overall infection levels, which effectively depend on the double-strain dynamics and not on the containment measures (or the lack of them). While it is not clear whether any of the currently used containment measures had a significant effect on the overall number of infected people, it is important to note that, in principle, measures which could be more effective for eradicating the HVS compared with the LVS should be prioritized. Such measures may change the relative LVS to HVS infection levels, would provide the same effect as introducing a higher per-population exposure, and would in turn constrain the maximal HVS infection level to lower values. Current contact tracing, for example, is focused on HVS infections, while not affecting the LVS spread; similarly, quarantine/isolation of HVS-infected people decreases the HVS transmissibility. Social distancing, however, affects both the LVS and the HVS, and might even be harmful under some conditions, as it lowers the transmissibility parameter $\beta$ for both the LVS and the HVS. Given that the doubling time and $\beta$ are related through $T = \log(2)/\log(1+\beta)$, one gets $T_L/T_H = \log(1+\beta_H)/\log(1+\beta_L)$. Decreasing both $\beta_H$ and $\beta_L$ by the same factor increases $T_L/T_H$, allowing the HVS to outrun the LVS faster, i.e.
leading to overall higher HVS infection-levels. However, this is only true as long as the reproduction number is larger than unity and growth is exponential; if the containment measures are sufficient to drive the reproduction number below one, the exponential spread of the disease is stopped in any case, and then these measures are indeed helpful.

Fast mutating viruses might mutate too fast to give rise to long-lasting cross-immune strains. The slow mutation rate of SARS-CoV-2 could be the reason allowing for LVS and HVS co-infection. Nevertheless, any new low-virulence viruses are likely to go through several mutations before becoming virulent. This would suggest that continuous monitoring of virus mutations in the population could be used both to identify the potential progress into virulent forms, and to serve the development of a new vaccine once a virulent virus forms. In particular, continuous collection and sequencing of samples of non-virulent viruses could serve as key input at the onset of a new epidemic, allowing for immediate back-tracking of the new-virus development, and the possible use of earlier non-virulent strains as an advanced starting point for the development of a vaccine.

Finally, although developed in the context of the COVID-19 pandemic, similar multi-strain analysis and the statistical identification of co-infection described here can be used for the study of other pandemics.

In our analysis we considered data only for countries for which our above-mentioned data sources provided all relevant information, i.e. number of confirmed cases, number of deaths, number of tests, mobility-exposure levels, and population sizes (the latter was available for all countries). We then made use of several filtering and normalization methods to minimize the data noise level, as described below.
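The doubling-time relation $T = \log(2)/\log(1+\beta)$ used in the discussion of social distancing above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the values of `beta_l`, `beta_h` and `factor` are hypothetical and chosen only for illustration.

```python
import math


def doubling_time(beta):
    """Doubling time T = log(2) / log(1 + beta) for a per-step growth rate beta."""
    return math.log(2) / math.log(1 + beta)


def doubling_time_ratio(beta_l, beta_h):
    """T_L / T_H, which equals log(1 + beta_H) / log(1 + beta_L)."""
    return doubling_time(beta_l) / doubling_time(beta_h)


# Hypothetical transmissibility values (assumptions, not fitted parameters).
beta_l, beta_h = 0.18, 0.30

ratio_before = doubling_time_ratio(beta_l, beta_h)

# Reduce both rates by the same factor, e.g. uniform social distancing.
factor = 0.5
ratio_after = doubling_time_ratio(factor * beta_l, factor * beta_h)

print(f"T_L/T_H before: {ratio_before:.3f}, after: {ratio_after:.3f}")
```

With these illustrative values the ratio grows when both rates are reduced together, which is the effect described in the text; it holds only while both strains still spread exponentially.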

The exact numbers of COVID-19 infections are unknown, since screening of the entire population has not been done in any country. We therefore use the numbers of confirmed cases and the numbers of COVID-19 related deaths as proxies for infection levels. However, each of these numbers is influenced by various factors which are difficult to assess, and their exact relation to the actual infection level could differ to some extent between countries, giving rise to effectively noisy data. We made use of two corrections in order to filter some of the main components.

COVID-19 related deaths are typically of patients who were earlier in critical condition. These are more likely to have been identified and tested than asymptomatic patients, who might never have been tested. Confirmed deaths are therefore potentially a more reliable proxy for the infection level. On the other hand, the majority of the deaths are of people older than 70, and therefore differences in age structure between countries will give rise to different ratios between the infection level and the number of deaths. We therefore age-calibrated the death rates in each country by dividing the number of deaths by the fraction of the population older than 70 years. Another potentially important factor is the quality of the health system, where better health systems might be able to provide better treatment and decrease the morbidity level. Finally, even in high quality health systems, an overload of a large number of patients in critical condition might affect the quality of treatment, potentially leading to higher morbidity rates in various countries. The two latter factors are more difficult to assess quantitatively, and are not accounted for in our analysis.
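As an illustration of the age-calibration step described above, the following is a minimal sketch. The function name, the country labels and all numbers are made up for the example and are not taken from the data used in the analysis.

```python
def age_calibrated_deaths(deaths, frac_over_70):
    """Divide the number of deaths by the fraction of the population older than 70."""
    if not 0 < frac_over_70 <= 1:
        raise ValueError("frac_over_70 must be a fraction in (0, 1]")
    return deaths / frac_over_70


# Two hypothetical countries with equal death counts but different age structure.
countries = {
    "A": {"deaths": 1000, "frac_over_70": 0.05},
    "B": {"deaths": 1000, "frac_over_70": 0.10},
}

calibrated = {
    name: age_calibrated_deaths(row["deaths"], row["frac_over_70"])
    for name, row in countries.items()
}

for name, value in calibrated.items():
    print(name, round(value))
```

In this toy example the country with the younger population (A) receives the larger calibrated value, reflecting that the same number of deaths implies a higher underlying infection level when fewer people are in the high-risk age group.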
We only considered countries for which our data source provided the age-structure.

The number of confirmed cases is likely to be less affected by the overall quality of the health system, but is more dependent on the number of COVID-19 tests done relative to the population size, and on the testing strategy. We therefore normalized the numbers of confirmed cases in each country by dividing them by the corresponding numbers of tests per population in these countries. Since our model predicts a relation between infection level and the population size, one might be concerned that inclusion of such a parameter, which is related to population size, could introduce an artificial correlation. We find that the tests per population show no correlation with the population size ( p = 0 . ; see also Fig. 3) when excluding the most ( > people) and least ( < × people) populated countries in our sample. In Fig. 5 we show the data for all countries in our sample, including these high/low population ones. As can be seen, the data for these countries are highly consistent with the model. In fact, when adding these countries to the analysis (but not the systematic outliers, see below, which are also shown in the figure) the results improve even further, and we find adjusted- R = 0 . and Pearson correlation test p-values of × − , for the same model parameters derived without considering these countries.

Throughout our analysis we assume that each country is exposed to the viruses directly through traffic from China. While this is likely to be true for countries highly exposed to China, less exposed countries might be infected via secondary spread from other countries, even before being infected through direct incoming traffic from China. A detailed pandemic model which accounts for the traffic between each pair of countries and the epidemic evolution in time could potentially better account for such secondarily exposed countries.
This is beyond the scope of the current study, and we therefore only account for countries which were likely to be directly exposed to China. In Fig. 4 we plot the day of first confirmed infection for each country as a function of the exposure level to China. From Eq. 8 (see below) we expect a correlation between the logarithm of the exposure level and the day of initial infection in a given country. An overall strong correlation can indeed be found for the full data sample ( p < − ), but above ∼ days the correlation saturates, possibly suggesting that countries exposed to COVID-19 more than ∼ days after December 1st 2019 are more likely to have been exposed indirectly. Therefore, in our analysis we only make use of data from countries in which the first confirmed case was identified at most 100 days after December 1st days suggesting that the exact identification of countries which experienced direct or secondary exposure is difficult to discern without a more detailed model. Nevertheless, we find that even making use of the full data, i.e. even beyond 100 days, does not affect our conclusions, i.e. we can still identify (with a significant statistical score, albeit less significant than those discussed above) model parameters consistent with those we find using the filtered data. In principle, the initial infection day might be used (after appropriate analysis) as a phenomenological proxy for the exposure level, irrespective of the traffic data. This is beyond the scope of the current analysis and will be explored in later studies.

Figure 3: The relation between population size and the number of COVID-19 tests per population. As can be seen, we find no correlation ( P = 0 . ) between these parameters.

Throughout our analysis we have made use of all data available to us, as described in section 1.1, and the statistical significance levels are based on analysis of these data.
Nevertheless, as mentioned in the main text, some clear systematic effects and outliers can be identified with respect to specific countries. It is therefore important to discuss these systematics and the possible origin of outliers.

Our analysis makes use of exposure-level assessments based on air-traffic mobility from China. This may give rise to two systematic types of incorrect estimates of the exposure level: underestimates for countries neighboring China, and overestimates for flight-transportation hubs.

China has long ground borders with neighboring countries, allowing for considerable ground traffic mobility, for which data are not available to us. Passenger trains operate from China to Vietnam, Mongolia, Kazakhstan, Russia and North Korea. Besides trains, border crossing points allow for ground mobility to the other neighboring countries, including Bhutan, Kyrgyzstan, Kazakhstan, Laos, Nepal, Tajikistan, Myanmar, India, Pakistan and Afghanistan. The exposure levels of these countries are therefore likely to be underestimated. Increasing their exposure level could drive them to the regime of higher per-population exposure level, and constrain and lower their HVS infection level due to the LVS herd-immunity limit. We therefore did not include these countries in our main analysis.

Air-traffic data might not account well for flight-hubs, in which a significant fraction of incoming passengers are transiting on their way to other countries. In such cases the exposure level might be overestimated. In more populated countries, the addition of the transit passengers might not considerably affect the overall incoming traffic level. The only cases in which the annual traffic through the airport is at least approximately ten times larger than the population of the host country itself are Dubai airport in the United Arab Emirates, Singapore airport and Doha airport in Qatar (all three by a large margin from other major hubs, according to the Airports Council International world airport traffic and rankings).
The exposure levels of these flight-hubs are therefore likely to be overestimated. We therefore did not include these countries in our analysis.

Although not included in the main analysis, given the uncertainties, these countries do provide some indirect information. In Fig. 5 we show the same data as in Fig. 1, but now highlighting the expected “outliers”: the China-neighbors with underestimated exposure and the flight-hub countries with overestimated exposure. As can be seen, the locations of these countries and their outlying positions are consistent with the appropriate expectations, giving further (though indirect) support for the double-virus co-infection model. In particular, in a single HVS model, the neighbouring countries should be more exposed and show larger infection-levels, and the flight-hubs should be less exposed and show lower infection levels, i.e. the opposite expectation with respect to the double-strain model, and inconsistent with the COVID-19 data.

Figure 4: The day of first confirmed infection for each country as a function of the exposure level to China.

In order to demonstrate the basic aspects of the model and its observable expectations, we make use of the Susceptible-Infected-Recovered (SIR) model [30], often used to study the spread of infectious disease. We extend it to the case of two circulating cross-immunizing viruses (i.e. a person infected by one of the viruses is immune to infection from the other virus, both during and after the infection), following past studies of co-infection and the spread of multiple strains [7, 8, 9, 10, 11, 12, 13, 6], as described by Eqs. 1-5 in the main text.

Let us now consider the dynamics of the epidemic in two possible scenarios: (1) the HVS spreads in a given country earlier than the LVS, or the LVS spreads earlier but the HVS follows shortly after; and (2) the LVS spreads in a given country well before the spread of the HVS. In the first scenario the HVS spreads into the country before or shortly after the first arrival of the LVS. In this case (see Fig. 2b), although some fraction of the population becomes LVS-infected, the spread of the HVS outruns the LVS spread, even if the LVS began before (but not too long before, as that would lead to the second scenario), and the HVS effectively spreads freely with little effect from the early LVS-spread. In particular, in such cases the HVS could infect the majority of the population if no measures are taken, and until then its spread is independent of the country population size, and just grows exponentially with time, as if only a single strain exists. Note that the same evolution also describes the early stages in the second scenario, before the LVS reaches herd-immunity and affects the spread of the HVS.

In these cases, the infection-level at a given time should just be approximately

$$I_H(t) = 2^{(t - t_{\rm exp})/T_H}, \qquad (7)$$

where $t_{\rm exp}$ is the time of the initial infection (exposure) in the given country, which is a function of the exposure level to China, as we discuss below.
The number of infected people arriving at another destination country from China is proportional to the number of infected people in China, and to the fraction of those people who travel from China to that destination. The number of infected people in China at a given time (assuming no containment measures are taken) is $2^{(t - t_0)/T}$, where $t_0$ is the time of the initial infection in China, and $T$ is the doubling time of a given virus. The fraction of travelers from China to a given country is proportional to the mobility exposure level to that country, $c_{\rm exp} = \alpha\mu$, where $\mu$ is taken from the GLEAM project (see section 1.1), and $\alpha$ is a constant unit calibration accounting for the GLEAM data units. The time of the first infection (exposure) in a given country, $t_{\rm exp}$, therefore follows the relation

$$2^{(t_{\rm exp} - t_0)/T}\, c_{\rm exp} = 1, \qquad t_{\rm exp} - t_0 = -T \log_2 c_{\rm exp}. \qquad (8)$$

Plugging this result into Eq. 7 we get

$$I_H(t) = 2^{(t - t_{\rm exp})/T_H} = 2^{(t - t_0)/T_H}\, c_{\rm exp}. \qquad (9)$$

In other words, at a given time we expect to see a direct linear correlation between the exposure level and the number of confirmed cases in a given country, at least until the majority of the population has become infected and herd immunity has been achieved (by either virus). This is consistent with the results in Fig. 1. At early times the evolution follows the regular dynamics of a pandemic driven by a single virus, and allows us to find the calibration parameter $\alpha$ by simple linear regression on the data. At later times we see a different behavior, which can be understood by considering the second scenario.

In the second scenario, the LVS begins spreading in the community and the number of LVS-infected people grows exponentially, following the same evolution as described for the HVS in Eq. 7. If the LVS spreads far earlier than the HVS, it could have infected the majority of the population so as to induce herd-immunity, and by the time of the first HVS infection the HVS cannot spread.
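The relations in Eqs. 7-9 can be sketched numerically as follows. This is an illustrative sketch only: the function names and the exposure values are hypothetical, logarithms are taken in base 2 so that Eq. 9 follows from Eq. 8, and the 2.7-day doubling time is the value quoted in the caption of Fig. 6.

```python
import math


def exposure_time(c_exp, doubling_time, t0=0.0):
    """Eq. 8: time of first infection, t_exp = t0 - T * log2(c_exp)."""
    return t0 - doubling_time * math.log2(c_exp)


def early_infection_level(t, c_exp, doubling_time, t0=0.0):
    """Eq. 7: I(t) = 2 ** ((t - t_exp) / T), valid during free exponential growth."""
    t_exp = exposure_time(c_exp, doubling_time, t0)
    return 2 ** ((t - t_exp) / doubling_time)


T_H = 2.7   # doubling time in days (value quoted for the dashed line in Fig. 6)
t = 60.0    # days since the initial infection in China (arbitrary choice)

# Hypothetical exposure levels in calibrated units (c_exp = alpha * mu).
for c_exp in (1e-4, 1e-3, 1e-2):
    t_exp = exposure_time(c_exp, T_H)
    level = early_infection_level(t, c_exp, T_H)
    print(f"c_exp={c_exp:g}: t_exp={t_exp:.1f} d, I_H(t)={level:.3g}")
```

A tenfold increase in the exposure level moves the first infection earlier by $T_H \log_2 10$ days and raises the early infection level by the same factor of ten, which is the linear relation stated in Eq. 9.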
If the HVS begins its spread long after the LVS, but before the LVS achieved herd-immunity, both viruses initially grow almost unimpeded, until the LVS achieves herd-immunity, at which point both viruses stop their exponential growth, slow down and reach their maximum (see Fig. 2).

A longer delay between the initial spread of the LVS and the initial spread of the HVS therefore translates into a lower number of HVS-infected people and further constrains the HVS spread to a lower fraction of the overall population (compare Figs. 2b and d to Figs. 2a and c, respectively). For the same delay, countries with smaller population sizes will achieve herd-immunity sooner, allowing for a smaller number of HVS infections (compare Figs. 2a and b to Figs. 2c and d, respectively).

The introduction of a precursor LVS can therefore explain the otherwise puzzling findings of the most highly exposed countries showing low infection levels at late times (Fig. 1a), and the transition from initial fast exponential growth to a slow non-exponential level, and even an effective complete stop of the growth in all countries, although none of them have infection levels even close to those required for herd-immunity. In fact, all countries show far lower infection levels than would be expected from the simple exponential growth of Eq. 7 (Fig. 6).

Given some initial number of LVS-infected people ($I^0_L$) in the population at the point when the HVS begins to spread (i.e. the time of first HVS-infection) in a given country, one can find specific relations between the maximal number of infected people, the doubling times $T_L$ and $T_H$ of the two viruses, and the initial fraction of LVS and HVS infected people in the population. During the initial free spread of the LVS, it can spread until it achieves herd-immunity once the majority of the population is infected (we will take the actual population size for simplicity).
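The qualitative dependence on the delay and on the population size discussed above can be explored with a toy simulation. The sketch below is a generic discrete-time SIR model with two fully cross-immunizing strains; it is an assumption-laden illustration and not a reproduction of Eqs. 1-5 of the main text or of the models shown in Fig. 2. The function name, the daily time step and the values of `beta_l`, `beta_h` and `gamma` are all hypothetical.

```python
def simulate_two_strains(n_pop, delay_days, beta_l=0.18, beta_h=0.30,
                         gamma=0.05, n_days=600):
    """Toy discrete-time SIR model with two fully cross-immunizing strains.

    Returns the cumulative number of HVS infections after n_days.
    All default parameter values are illustrative assumptions.
    """
    s = n_pop - 1.0        # people susceptible to both strains
    i_l, i_h = 1.0, 0.0    # currently infected by the LVS / HVS
    cum_h = 0.0            # cumulative HVS infections
    for day in range(n_days):
        if day == delay_days:
            # The HVS is seeded delay_days after the LVS.
            i_h += 1.0
            s -= 1.0
            cum_h += 1.0
        new_l = beta_l * i_l * s / n_pop
        new_h = beta_h * i_h * s / n_pop
        # Infection by either strain removes a person from the susceptible pool.
        s -= new_l + new_h
        i_l += new_l - gamma * i_l
        i_h += new_h - gamma * i_h
        cum_h += new_h
    return cum_h


# Same population, different delays between LVS and HVS onset.
short_delay = simulate_two_strains(1e6, delay_days=30)
long_delay = simulate_two_strains(1e6, delay_days=90)

# Same delay, different population sizes.
small_pop = simulate_two_strains(1e5, delay_days=60)
large_pop = simulate_two_strains(1e7, delay_days=60)

print(f"delay 30 d: {short_delay:.3g}, delay 90 d: {long_delay:.3g}")
print(f"N=1e5: {small_pop:.3g}, N=1e7: {large_pop:.3g}")
```

In this toy setting a longer delay or a smaller population leaves fewer susceptible people for the faster strain, so the cumulative HVS count is lower, in line with the trends described for Fig. 2.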
The time it takes to reach the maximal infection point is therefore the number of doublings of the infected population from its initial value up to approximately the size of the population, $\sim N$, times the doubling time,

$$t_{\rm max} = T_L \cdot \log_2\left(\frac{N}{I^0_L}\right). \qquad (10)$$

Until the LVS achieves herd-immunity, the HVS spreads freely. Its spread should follow an exponential growth until slowed/stopped once herd immunity is achieved. As discussed above, if the LVS infects the majority of the population and achieves herd-immunity before being outrun by the spread of the HVS, the mutual immunity ensures that the further spread of the HVS will also be quenched. At this point the HVS would also stop and reach its maximum infection level. Hence, the number of HVS-infected people is just

$$I_H(t_{\rm max}) = I^{\rm max}_H = 2^{t_{\rm max}/T_H} = 2^{(T_L/T_H)\log_2(N/I^0_L)} = \left(\frac{N}{I^0_L}\right)^{T_L/T_H}, \qquad (11)$$

or, writing it in logarithmic terms,

$$\log_2\left(I^{\rm max}_H\right) = -\left(\frac{T_L}{T_H}\right)\log_2\left(\frac{I^0_L}{N}\right). \qquad (12)$$

The exposure time to a virus for a given country was derived in Eq. 8. It is valid both for the LVS and the HVS, where the parameters correspond to the specific strain (i.e. $t^L_{\rm exp}$, $t^L_0$ and $T_L$ for the LVS, and $t^H_{\rm exp}$, $t^H_0$ and $T_H$ for the HVS), and $c_{\rm exp}$ is the same for both (neglecting seasonal variations in mobility). Using this relation, we can find the infection level of the LVS in a given country at the point when the HVS first arrives, $I^0_L$. For a given country we get

$$t^L_{\rm exp} - t^L_0 = -T_L \log_2 c_{\rm exp}, \qquad t^H_{\rm exp} - t^H_0 = -T_H \log_2 c_{\rm exp},$$

and therefore

$$t^H_{\rm exp} - t^L_{\rm exp} = \left(t^H_0 - t^L_0\right) - \left(T_H - T_L\right)\log_2 c_{\rm exp}. \qquad (13)$$

Plugging this into Eq. 11 we get

$$I^0_L = 2^{(t^H_{\rm exp} - t^L_{\rm exp})/T_L} = 2^{\left[t^H_0 - t^L_0 - (T_H - T_L)\log_2 c_{\rm exp}\right]/T_L} = 2^{(t^H_0 - t^L_0)/T_L}\, c_{\rm exp}^{\,1 - T_H/T_L},$$

to finally get from Eq.
12

$$I^{\rm max}_H = \left(\frac{N}{I^0_L}\right)^{T_L/T_H} = \left(\frac{N}{2^{(t^H_0 - t^L_0)/T_L}\, c_{\rm exp}^{\,1 - T_H/T_L}}\right)^{T_L/T_H},$$

or in logarithmic terms

$$\log_2\left(I^{\rm max}_H\right) = \frac{t^L_0 - t^H_0}{T_H} + \left(\frac{T_L}{T_H}\right)\log_2 N - \left(\frac{T_L}{T_H} - 1\right)\log_2 c_{\rm exp}, \qquad (14)$$

i.e. the maximal number of infected people in a given country in which the LVS reached herd-immunity before the HVS should be directly correlated with the total population of the country, and inversely correlated with the exposure level to China. The exact powers are determined by the doubling times (spread rates) of the LVS and the HVS. The larger the exposure to China, the longer the delay between the initial LVS infection and the initial HVS infection (see Eq. 13), allowing the LVS more time to spread and achieve herd-immunity while the HVS is still at earlier stages. The smaller the population size in a given country, the faster the LVS can achieve herd-immunity and quench the spread of the HVS at earlier times, leading to lower infection levels, consistent with the models in Fig. 2. A critical transition can occur for less exposed and/or more populated countries, when $I^{\rm max}_H$ is comparable with the overall population, i.e. providing no constraint on the HVS spread (needless to say, $I^{\rm max}_H$ always needs to be smaller than or equal to the total population $N$). This happens when the HVS had sufficient time to spread and eventually outrun the spread of the LVS, before the latter achieved the herd-immunity level. As discussed in the main text, we can therefore fit our model using the data on the infection-level (testing-normalized cases or age-normalized deaths) and the mobility-exposure data, to find the model parameters. In particular, we can infer the doubling-time of the LVS using a linear-regression analysis, and then also infer the delay between the onset of the LVS and the HVS (see main text), which then also provides the mobility-exposure calibration unit $\alpha$ mentioned above.

Figure 6: The number of confirmed cases vs. the expected number of cases for an exponential growth, at different epochs. Early on (day 70) all infected countries followed a simple exponential growth (compare with the dashed line). At later times newly infected countries follow an exponential growth, while countries infected earlier stop following an exponential growth and show a sub-exponential growth. This pattern repeats, with more countries stopping their exponential growth after their initial exponential spread. By day 185 all countries in our sample had already stopped spreading exponentially. These dynamics follow the expectation from the double-strain co-infection model (see text). The dashed line corresponds to a simple exponential growth with a doubling time of 2.7 days.

For any given country we can now predict the expected infection levels. Our models in Fig. 2 suggest that the HVS spread should already slow down when the infection level reaches ∼ of the maximal level, and that the rise and decay of the HVS infection level should be asymmetric, with a fast rise and a slow decay. In particular, it is more asymmetric than would be expected for the case of a single virus evolution (e.g. compare the more symmetric structure of the LVS evolution with that of the HVS), potentially consistent with asymmetries in the epidemic dynamics observed in many countries.

If allowed to spread with no constraint, the number of infected people should rise exponentially to be

$$\log_2\left[I_{\rm exp}(t)\right] = (t - t_{\rm exp})/T, \qquad (15)$$

at time $t$, where $I_{\rm exp}$ is the number of infected people, $t_{\rm exp}$ is the time of the initial exposure (first infection), and $T$ is the doubling time for the given virus. As discussed above (section 1.4), in the early stages of the HVS infection the evolution should effectively be indistinguishable from that expected for the unconstrained spread of a single virus. In Fig.
6 we show the number of age-corrected deaths in different countries at several points in time, as a function of the initial infection time in each country. As expected, in countries infected earlier (larger number of days since first infection), the number of age-normalized deaths is larger, and consistent with an exponential growth as featured in Eq. 15. However, at later times one can observe that the number of deaths stops growing exponentially, and the overall structure of the number of deaths in each country hardly evolves (on logarithmic scales; one can still see a very slow, non-exponential growth). This behavior is explained by the double-strain co-infection model, which suggests (see Fig. 2) that the exponential growth should begin to slow down once the LVS-infection levels reach a large fraction of the population, and later become sub-exponential until it eventually stops growing.

References

[1] Lu, R. et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding.

Lancet, 565–574 (2020).
[2] Wu, A. et al.

Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating inChina.

Cell Host Microbe, 325–328 (2020).
[3] Haider, N. et al. Passengers’ destinations from China: low risk of Novel Coronavirus (2019-nCoV) transmission into Africa and South America.

Epidemiol. Infect., e41 (2020).
[4] Van den Broeck, W. et al.

The GLEaMviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale.

BMC Infect. Dis., 37 (2011).
[5] Pastore y Piontti, A., Perra, N., Rossi, L., Samay, N. & Vespignani, A. Charting the Next Pandemic: Modeling Infectious Disease Spreading in the Data Science Age (2019).
[6] Chinazzi, M. et al.

The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak.

Science, 395–400 (2020). URL https://science.sciencemag.org/content/368/6489/395.
[7] Dietz, K. Epidemiologic interference of virus populations. J Math Biol, 291–300 (1979).
[8] Castillo-Chavez, C., Hethcote, H. W., Andreasen, V., Levin, S. A. & Liu, W. M. Epidemiological models with age structure, proportionate mixing, and cross-immunity. J Math Biol, 233–258 (1989).
[9] Adler, F. R. & Brunet, R. C. The dynamics of simultaneous infections with altered susceptibilities. Theor Popul Biol, 369–410 (1991).
[10] Balmer, O. & Tanner, M. Prevalence and implications of multiple-strain infections. Lancet Infect Dis, 868–878 (2011).
[11] Susi, H., Barrès, B., Vale, P. F. & Laine, A. L. Co-infection alters population dynamics of infectious disease. Nat Commun, 5975 (2015).
[12] Kucharski, A. J., Andreasen, V. & Gog, J. R. Capturing the dynamics of pathogens with many strains. J Math Biol, 1–24 (2016).
[13] Nickbakhsh, S. et al. Modelling the impact of co-circulating low pathogenic avian influenza viruses on epidemics of highly pathogenic avian influenza in poultry.

Epidemics, 27–34 (2016).
[14] Thompson, R., Thompson, C., Pelerman, O., Gupta, S. & Obolski, U. Increased frequency of travel in the presence of cross-immunity may act to decrease the chance of a global pandemic. Phil. Trans. R. Soc. B, 20180274 (2019).
[15] Kamikubo, Y. & Takahashi, A. Epidemiological tools that predict partial herd immunity to SARS coronavirus 2. medRxiv (2020).
[16] Qi, W. et al. Emergence and Adaptation of a Novel Highly Pathogenic H7N9 Influenza Virus in Birds and Humans from a 2013 Human-Infecting Low-Pathogenic Ancestor.

J. Virol. (2018).
[17] Seo, S. H. & Webster, R. G. Cross-reactive, cell-mediated immunity and protection of chickens from lethal H5N1 influenza virus infection in Hong Kong poultry markets. J. Virol., 2516–2525 (2001).
[18] Khalenkov, A., Perk, S., Panshin, A., Golender, N. & Webster, R. G. Modulation of the severity of highly pathogenic H5N1 influenza in chickens previously inoculated with Israeli H9N2 influenza viruses. Virology, 32–38 (2009).
[19] Nfon, C. et al.

Prior infection of chickens with H1N1 or H1N2 avian influenza elicits partial heterologous protection against highly pathogenic H5N1.

PLoS ONE, e51933 (2012).
[20] Li, X. et al. Transmission dynamics and evolutionary history of 2019-nCoV.

J. Med. Virol., 501–511 (2020).
[21] Su, Y. C. F. et al. Phylodynamics of H1N1/2009 influenza reveals the transition from host adaptation to immune-driven selection.

Nat Commun, 7952 (2015).
[22] Ferguson, N. M., Galvani, A. P. & Bush, R. M. Ecological and immunological determinants of influenza evolution. Nature, 428–433 (2003).
[23] Nickbakhsh, S. et al.

Virus-virus interactions impact the population dynamics of influenza and the common cold.

Proc. Natl. Acad. Sci. U.S.A. (2019).
[24] Chen, Z.-w., Li, Z., Li, H., Ren, H. & Hu, P. Global genetic diversity patterns and transmissions of SARS-CoV-2. medRxiv (2020).
[25] Yao, H. et al. Patient-derived mutations impact pathogenicity of SARS-CoV-2. medRxiv (2020).
[26] Urbanowicz, R. A. et al. Human Adaptation of Ebola Virus during the West African Outbreak.

Cell ,1079–1087 (2016).[27] Diehl, W. E. et al.

Ebola Virus Glycoprotein with Increased Infectivity Dominated the 2013-2016 Epidemic.

Cell , 1088–1098 (2016).[28] Hethcote, H. W. The mathematics of infectious diseases.

SIAM Review , 599–653 (2000).[29] MATLAB. version 7.10.0 (R2010a) (The MathWorks Inc., Natick, Massachusetts, 2010).[30] Kermack, W. O., McKendrick, A. G., Kermack, W. O. & McKendrick, A. G. Contributions to the mathematicaltheory of epidemics–I. 1927. Bull. Math. Biol. , 33–55 (1991).[31] Britton, T., Trapman, P. & Ball, F. G. The disease-induced herd immunity level for covid-19 is substantiallylower than the classical herd immunity level. medRxiv (2020). URL . .[32] Ioannidis, J. The infection fatality rate of covid-19 inferred from seroprevalence data. medRxiv (2020). URL . .[33] Havers, F. P. et al. Seroprevalence of antibodies to sars-cov-2 in six sites in the united states, march 23-may 3, 2020. medRxiv (2020). URL . .[34] Stephenson, I. et al. Cross-reactivity to highly pathogenic avian inﬂuenza H5N1 viruses after vaccination withnonadjuvanted and MF59-adjuvanted inﬂuenza A/Duck/Singapore/97 (H5N3) vaccine: a potential primingstrategy.

J. Infect. Dis. , 1210–1215 (2005).[35] Manenti, A. et al.

Evaluation of SARS-CoV-2 neutralizing antibodies using a CPE-based colorimetric live virusmicro-neutralization assay in human serum samples.

J. Med. Virol. (2020).[36] Sekine, T. et al.

Robust t cell immunity in convalescent individuals with asymptomatic or mild covid-19. bioRxiv (2020). URL . .[37] Fongaro, G. et al. Sars-cov-2 in human sewage in santa catalina, brazil, november 2019. medRxiv (2020). URL . .[38] Huang, A. T. et al. A systematic review of antibody mediated immunity to coronaviruses: antibody kinetics,correlates of protection, and association of antibody responses with severity of disease. medRxiv (2020).[39] Hale, N., T., Angrist, B. K., Petherick, A., Phillips, T. & Webster, S. Variations in government responses tocovid-19.

Blavatnik School of Government, University of Oxford