Real-time projections of epidemic transmission and estimation of vaccination impact during an Ebola virus disease outbreak in the Eastern region of the Democratic Republic of Congo
Lee Worden, Rae Wannier, Nicole A. Hoff, Kamy Musene, Bernice Selo, Mathias Mossoko, Emile Okitolonda-Wemakoy, Jean Jacques Muyembe-Tamfum, George W. Rutherford, Thomas M. Lietman, Anne W. Rimoin, Travis C. Porco, J. Daniel Kelly
F. I. Proctor Foundation, University of California, San Francisco (UCSF), San Francisco, CA, USA
School of Medicine, UCSF, San Francisco, CA, USA
School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
Ministry of Health, Kinshasa, Democratic Republic of Congo
School of Public Health, University of Kinshasa, Kinshasa, Democratic Republic of Congo
National Institute of Biomedical Research, Kinshasa, Democratic Republic of Congo
November 6, 2018
∗ Corresponding author: J. Daniel Kelly
Abstract
Background:
As of October 12, 2018, 211 cases of Ebola virus disease (EVD) were reported in North Kivu Province, Democratic Republic of Congo. Since the beginning of October the outbreak has largely shifted into regions in which active armed conflict is occurring, and in which EVD cases and their contacts are difficult for health workers to reach. We used available data on the current outbreak with case-count time series from prior outbreaks to project the short-term and long-term course of the outbreak.
Methods:
For short- and long-term projections we modeled Ebola virus transmission using a stochastic branching process that assumes gradually quenching transmission estimated from past EVD outbreaks, with outbreak trajectories conditioned on agreement with the course of the current outbreak, and with multiple levels of vaccination coverage. We used a negative binomial autoregression for short-term projections, a Theil-Sen regression model for final sizes, and a baseline minimum-information projection using Gott’s law to construct an ensemble of forecasts to be compared and recorded for future evaluation against final outcomes. From August 20 to October 13, short-term model projections were validated against actual case counts.
Results:
During validation of short-term projections, from one week to four weeks, we found models consistently scored higher on shorter-term forecasts. Based on case counts as of October 13, the stochastic model projected a median case count of 226 cases by October 27 (95% prediction interval: 205–268) and 245 cases by November 10 (95% prediction interval: 208–315), while the auto-regression model projected median case counts of 240 (95% prediction interval: 215–307) and 259 (95% prediction interval: 216–395) cases for those dates, respectively. Projected median final counts range from 274 to 421. Except for Gott’s law, the projected probability of an outbreak comparable to 2013–2016 is exceedingly small. The stochastic model estimates that vaccine coverage in this outbreak is lower than reported in its trial setting in Sierra Leone.
Conclusions:
Based on our projections we believe that the epidemic had not yet peaked at the time of these estimates, though a trajectory on the scale of the West African outbreak is exceedingly improbable. Validating our models in real time allowed us to generate more accurate short-term forecasts, and this process may provide a useful roadmap for real-time short-term forecasting. We estimate that transmission rates are higher than would be seen under target levels of 62% coverage due to contact tracing and vaccination, and this model estimate may offer a surrogate indicator for the outbreak response challenges.
Introduction
On August 1, 2018, the World Health Organization (WHO) announced a new outbreak of Ebola virus disease (EVD) in North Kivu Province in Eastern Democratic Republic of Congo (DRC). Epidemiological investigations traced EVD cases back to the week of April 30 and identified the initial epicenter to be Mabalako. North Kivu has over eight million inhabitants, some of whom suffer from armed conflict, humanitarian crisis, and displacement from their bordering countries of Uganda and Rwanda. Since the outbreak began, EVD cases have spread across ten health zones in two provinces at a rate outpacing the case counts of the other 2018 DRC Ebola outbreak in Equateur Province. As of October 13, 211 EVD cases were reported (31 probable and 180 confirmed); the Ministry of Health of DRC, the World Health Organization, and other organizations were responding to the Ebola outbreak.
As new interventions such as vaccines or rapid diagnostics are being implemented during outbreaks, their impact on epidemic transmission is poorly understood, requiring assumptions to be made that may lead to inaccurate forecasting results. Unknown social or environmental differences affecting transmission can also affect forecasts in unknown ways.
For example, the overlap of the outbreak with regions where armed conflict is occurring in North Kivu, DRC, might result in higher under-reporting rates and lower vaccine coverage than in other outbreaks, causing increased transmission and decreased accuracy of reporting, or might result in reduced transmission due to reduced mobility or other considerations. Since the beginning of October, an increased rate of detection of new cases has been observed in the conflict zone, perhaps due to reduced disease control.
During an Ebola outbreak, real-time forecasting has the potential to support decision-making and allocation of resources, but highly accurate forecasts have proven difficult for Ebola as well as other diseases.
Moreover, there are mathematical reasons to believe that highly accurate forecasts of small, noisy outbreaks may never be possible. Nevertheless, while predicting the exact number of cases is unlikely to ever be possible, forecasts which are accurate enough to be useful may be possible. Previous work has found that probabilistic forecasts can have relatively high accuracy within a few weeks, though they become less useful as time horizons grow longer, and short-term forecasts may provide useful information for response organizations.
In this paper we apply a suite of independent methods of real-time forecasting to the Eastern DRC outbreak, to generate both short-term and long-term projections of future case counts as of the time of writing. We validate short-term projections by scoring projections derived from case count reports obtained earlier in the outbreak against subsequent known counts. We include past and present projections (in supplemental material) for future evaluation. We summarize model results in terms of projections of the future course of the outbreak, and interpret their implications relevant to current rates of transmission and vaccine coverage in conflict zones and overall.
Methods
We used four techniques to derive real-time projections of future case counts: a stochastic simulation model calibrated to time-dependent transmission rates measured from past outbreaks of EVD and constrained to the observed partial trajectory of the current outbreak, extending the model used in our previous work on the Spring outbreak; a negative binomial auto-regression model predicting the course of the outbreak from its course to date together with the course of previous outbreaks; a regression model for final size based on past outbreaks; and a simple final size projection using Gott’s law, which assumes only that the proportion of the outbreak observed so far is entirely unknown.
Data sources
Data on the current outbreak were collected from the WHO website in real time as updated information was published. A cumulative case count of probable and confirmed cases was generated to be consistent with the best knowledge at the time. Copies of the list of case counts were kept as of multiple dates (Figure 8), to be used in retrospective scoring of model projections against subsequently known counts. Though the epidemic was officially reported in late July as a cluster of cases occurring in June and July, seven sporadic early cases from April and May were later linked to the current outbreak and added to later case totals. This additional knowledge was added retrospectively to the time series of cumulative case counts only for predictions made for days on or after September 15th, when these cases were officially linked to the current outbreak.
Stochastic model
We modeled Ebola virus transmission using a stochastic branching process model, parameterized by transmission rates estimated from the dynamics of prior EVD outbreaks, and conditioned on agreement with reported case counts from the 2018 EVD outbreak to date. We incorporated high and low estimates of vaccination coverage into this model. We used this model to generate a set of probabilistic projections of the size and duration of simulated outbreaks in the current setting. This model is similar to one described in previous work, with the addition of a smoothing step allowing transmission rates intermediate between those estimated from previous outbreaks.
To estimate the reproduction number $R$ as a function of the number of days from the beginning of the outbreak, we included reported cases by date from fourteen prior outbreaks (Table 1). The first historical outbreak reported in each country was excluded (e.g., the 1976 outbreak in Yambuku, DRC). As there is a difference in the Ebola response system as well as community sensitization to EVD following a country’s first outbreak, we employed this inclusion criterion to reflect the Ebola response system in DRC during what is now its tenth outbreak. We used the Wallinga-Teunis technique to estimate $R$ for each case and therefore for each reporting date in these outbreaks. The serial interval distribution used for this estimation was a gamma distribution with a mean of 14.5 days and a standard deviation of 5 days, with intervals rounded to the nearest whole number of days, consistent with the understanding that the serial interval of EVD cases ranges from 3 to 36 days with mean 14 to 15 days.
We estimated an initial reproduction number $R_{\mathrm{initial}}$ and quenching rate $\tau$ for each outbreak by fitting an exponentially quenched curve to the outbreak’s estimates of $R$ by day $d$ (Figure 9).
We modeled transmission using a stochastic branching process model in which the number of secondary cases caused by any given primary case is drawn from a negative binomial distribution whose mean is the reproduction number $R$ as a function of day of the outbreak, and whose variance is controlled by a dispersion parameter $k$. All transmission events were assumed to be independent. The interval between date of detection of each primary case and that of each of its secondary cases is assumed gamma distributed with mean 14.5 days and standard deviation 5 days, rounded to the nearest whole number of days, as above.
We used the ($R_{\mathrm{initial}}$, $\tau$) pairs estimated from past outbreaks to provide $R$ values for simulation. $R_{\mathrm{initial}}$ values were sampled uniformly from the range of values estimated from past outbreaks. We fit a linear regression line through the values of $R_{\mathrm{initial}}$ and $\log(\tau)$ estimated for past outbreaks, above, and used the resulting regression line to assign a mean $\tau$ to each $R$, used with the residual variance of $\log(\tau)$ as a distribution from which to sample $\tau$ values for simulation given $R_{\mathrm{initial}}$. The pair of parameters $R_{\mathrm{initial}}$ and $\tau$ sampled in this way, together with each of three values of the dispersion parameter $k$, 0.3, 0.5, and 0.7, consistent with transmission heterogeneity observed in past Ebola outbreaks, were used to generate simulated outbreaks.
This model generated randomly varying simulated outbreaks with a range of case counts per day. The outbreak was assumed to begin with a single case. The simulation was run multiple times, each instance producing a proposed epidemic trajectory, generated by the above branching process with the given parameters $R_{\mathrm{initial}}$, $\tau$, and $k$, and these were then filtered by discarding all proposed outcomes but those whose cumulative case counts matched known counts of the current 2018 EVD outbreak on known dates. In earlier, smaller, data sets we filtered against all reported case counts, while in later, more complete data sets we thinned the case counts, for computational tractability, by selecting five case counts evenly spaced in the data set plus the final case count (Figure 8). The filtration required an exact match of the first target value, and at subsequent target dates accepted epidemics within a number of cases more or less than each recorded value. On the earlier data sets in which the beginning dates of the epidemic were unknown, the first target value was allowed to match on any day, and subsequent target dates were assigned relative to that day.
Thus this model embodies a set of assumptions that transmission rates are overall gradually declining from the start of the outbreak to its end, though possibly in noisy ways.
When the tolerance of the filter on case counts is small, quenching of transmission through time must closely track case counts, while when tolerance is high, fluctuations in the rate of generation of new cases can reflect a pattern of ongoing quenching of transmission more loosely and on the long term, while being less sensitive to short-term up and down fluctuations in transmission rates reflected by the true case counts.
We varied the tolerance as the data set became more complete to maintain a roughly fixed rate of generation of filtered trajectories: on the August 20 data set we allowed a tolerance of 4 cases more or less than each target count, on August 27 and September 5, 6 cases, on September 15, 10 cases, on October 7, 12 cases, and on October 13, 17 cases. This one-step particle filtering technique produced an ensemble of model outbreaks, filtered on agreement with the recorded trajectory of the outbreak to date. This filtered ensemble was then used to generate projections of the eventual outcome of the outbreak.
To model vaccination coverage with respect to total transmission (unreported and reported), we multiplied the estimate of vaccine effectiveness by low and high estimates of the proportion of cases reported. In a ring vaccination study at the end of the West Africa outbreak, the overall estimated rVSV-vectored vaccine efficacy was 100% and vaccine effectiveness was 64.6% in protecting all contacts and contacts of contacts from EVD in the randomized clusters, including unvaccinated cluster members. We used estimates of vaccine effectiveness in our stochastic model.
The ring vaccination study found the vaccine to be effective against cases with onset dates 10 days or more from the date of vaccine administration, so we modeled the vaccination program as a proportionate reduction in the number of new cases with onsets 10 days or more after the program start date.
We used past estimates of the proportion of unreported cases to estimate the proportion of exposed individuals not covered by the vaccination process. Based on a Sierra Leonean study from the 2013–2016 outbreak, we estimated that the proportion of reported cases in DRC would rise over time from a low of 68% to a high of 96%. Given these low and high estimates of reported cases and the estimate of vaccine effectiveness, a low estimate of vaccination program coverage was 44% (68% × 64.6%) and a high estimate was 62% (96% × 64.6%).
For simulation based on cases as of October 13, 320 outbreaks were retained from 34,663,104 simulated outbreaks after filtering on approximate agreement with DRC case counts. (Numbers of simulations from earlier data sets are reported in Supplemental Materials.) The simulated outbreaks that were retained after filtering were continued until they generated no further cases. Rare simulated outbreaks that exceeded 300,000 cases were capped at the first value reached above that number, to avoid wasted computation. We used this ensemble to derive a distribution of final outbreak sizes, and of cumulative counts at specific forecasting dates. Projection distributions were derived using kernel density estimation with leave-one-out cross-validation to determine bandwidth, using a log-normal kernel for final sizes, due to the extended tail of the values, and a normal kernel for all other estimates. We calculated median values and 95% prediction intervals using the 2.5 and 97.5 percentiles of simulated outbreak size and duration. We conducted the analyses using R 3.4.2 (R Foundation for Statistical Computing, Vienna, Austria).
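The simulation and filtering steps described in this section can be illustrated with a minimal Python sketch. It is not the code used for the analyses, which were run in R. The quenching form R(d) = R_initial · exp(−τ d), the sampling range for R_initial, the regression coefficients for log(τ), the random choice among the three k values, and all function names are assumptions made for illustration; only the serial interval (mean 14.5 days, standard deviation 5 days), the k values, the 10-day vaccination lag, and the 300,000-case cap come from the text.

```python
import math
import random

SI_MEAN, SI_SD = 14.5, 5.0            # serial interval in days, as stated in the text
SI_SHAPE = (SI_MEAN / SI_SD) ** 2     # gamma shape = mean^2 / variance
SI_SCALE = SI_SD ** 2 / SI_MEAN       # gamma scale = variance / mean


def poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def negative_binomial(rng, mean, k):
    """Negative binomial draw as a gamma-Poisson mixture with mean `mean`, dispersion `k`."""
    if mean <= 0:
        return 0
    return poisson(rng, rng.gammavariate(k, mean / k))


def simulate_outbreak(rng, r_initial, tau, k, coverage=0.0, vacc_start=None,
                      max_cases=300_000):
    """Simulate one outbreak; return the sorted detection days of all its cases."""
    days, active = [0], [0]           # a single initial case on day 0
    while active and len(days) <= max_cases:
        day = active.pop()
        r_now = r_initial * math.exp(-tau * day)   # assumed form of exponential quenching
        for _ in range(negative_binomial(rng, r_now, k)):
            child_day = day + round(rng.gammavariate(SI_SHAPE, SI_SCALE))
            in_program = vacc_start is not None and child_day >= vacc_start + 10
            if in_program and rng.random() < coverage:
                continue              # case averted by the vaccination program
            days.append(child_day)
            active.append(child_day)
    return sorted(days)


def cumulative_count(days, day):
    """Number of cases detected on or before `day`."""
    return sum(1 for d in days if d <= day)


def matches(days, targets, tolerance):
    """First (day, count) target must match exactly; later ones within `tolerance` cases."""
    first_day, first_count = targets[0]
    if cumulative_count(days, first_day) != first_count:
        return False
    return all(abs(cumulative_count(days, d) - c) <= tolerance for d, c in targets[1:])


def filtered_ensemble(targets, tolerance, n_proposals, seed=0, max_cases=300_000):
    """Propose outbreaks and keep only those that agree with the reported counts."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_proposals):
        r_initial = rng.uniform(1.0, 3.0)                  # hypothetical range
        log_tau = rng.gauss(-4.5 + 0.5 * r_initial, 0.5)   # hypothetical regression line
        k = rng.choice([0.3, 0.5, 0.7])                    # dispersion values from the text
        days = simulate_outbreak(rng, r_initial, math.exp(log_tau), k,
                                 max_cases=max_cases)
        if matches(days, targets, tolerance):
            kept.append(days)
    return kept
```

Calling `filtered_ensemble(targets, tolerance, n_proposals)` with a list of `(day, cumulative_count)` targets returns the retained trajectories, from which final sizes (`len(days)`) and counts at forecast dates (`cumulative_count`) can be summarized. Because the acceptance rate falls quickly as the number of targets grows, a sketch like this needs many proposals, which is in line with the small retained fraction reported above.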
Auto-Regression model
A negative binomial autoregressive model was chosen through a validation process to forecast additional new case counts at time points one week, two weeks, four weeks, and two months from the current date. To adjust for disparities in the frequency of case reporting in historic outbreaks, the data were weighted by the inverse square root of the number of observations contributed to the model. Models considered included parameters for historic raw case counts at different time points, logs of raw case counts, ratio of historic case counts to try to capture the trend of the epidemic curve, log(time), and an offset for current case total. When historic case counts for specific dates were missing, each missing case count was linearly interpolated from the two nearest case counts, allowing the model to remain agnostic about the current trend of the epidemic. After model fitting and validation, the final model chosen was a log-link regression for additional cases on the number of new cases identified in the previous two weeks, the previous four weeks and the ratio of these two case counts.
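The predictors of the final model can be sketched in Python as follows. This shows only how the inputs might be built (linear interpolation of missing counts, new cases in the previous two and four weeks, their ratio, and inverse-square-root weights); the negative binomial fit itself is not shown. The function names, the clamping of dates outside the reported range, and the example reports are assumptions for illustration, not the authors' code or data.

```python
import math


def interpolate_count(reports, day):
    """Cumulative count at `day`, linearly interpolated between the two nearest reports.

    `reports` is a list of (day, cumulative_count) pairs with strictly increasing days.
    Days outside the reported range are clamped to the first or last report (an assumption).
    """
    if day <= reports[0][0]:
        return float(reports[0][1])
    if day >= reports[-1][0]:
        return float(reports[-1][1])
    for (d0, c0), (d1, c1) in zip(reports, reports[1:]):
        if d0 <= day <= d1:
            return c0 + (c1 - c0) * (day - d0) / (d1 - d0)


def predictors(reports, day):
    """New cases in the previous two and four weeks, and their ratio, as of `day`."""
    now = interpolate_count(reports, day)
    new_2wk = now - interpolate_count(reports, day - 14)
    new_4wk = now - interpolate_count(reports, day - 28)
    ratio = new_2wk / new_4wk if new_4wk > 0 else None   # undefined with no recent cases
    return {"new_2wk": new_2wk, "new_4wk": new_4wk, "ratio": ratio}


def observation_weight(n_observations):
    """Weight for data from an outbreak contributing `n_observations` observations."""
    return 1.0 / math.sqrt(n_observations)


# Illustrative reports only: (day of outbreak, cumulative case count).
reports = [(0, 0), (7, 10), (21, 30), (28, 44)]
features = predictors(reports, day=28)
```

Reading the weighting as one weight per historic outbreak, based on how many observations that outbreak contributes, is an interpretation of the text rather than a documented detail.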
Regression model
We conducted a simple regression forecast based solely on prior outbreaks of size 10 or greater.
Nonparametric Theil-Sen regression (R package mblm) was used to project the final outbreak size based on values of the outbreak size at a specific earlier time. All time series were aligned on the day they reached 10 cases. Finally, we reported the median and 95% central coverage intervals for the prediction distribution, conditional on the predicted value being no smaller than the observed value for each day. Full details are given elsewhere. All analyses were conducted using R 3.4.2 (R Foundation for Statistical Computing, Vienna, Austria).
Gott’s law model
With Gott’s law, we assume we have no special knowledge of our position on the epidemic curve. If we assume a non-informative uniform prior for the portion $\alpha$ of the epidemic included in the last available report, the corresponding probability density function for the final size $Y = Y_0/\alpha$ is $Y_0/y^2$, $Y_0 \le y$, where $Y_0$ is the case count in the last available report. We constructed a probability mass function by assigning all probability density to the whole number of cases given by the integer part of each value. We used this probability mass function as a projection of the final outbreak size.
Scoring
Each of the above models was used to generate an assignment of probability to possible values of multiple quantities:
• Case count 1 week after the last available case count
• Case count 2 weeks after the last available case count
• Case count 4 weeks after the last available case count
• Case count 2 months after the last available case count
• Final outbreak size
Each model’s performance on each of these projections was scored by recording the natural logarithm of the probability it assigned to the subsequently known true value of the quantity in question.
The short-term projections based on real-time reporting were used to evaluate and calibrate the models during the epidemic, based on the data available at multiple time points during the outbreak. Final outbreak size projections were recorded for future evaluation of their performance.
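The scoring rule can be illustrated in Python using the Gott’s law projection as the distribution being scored. This is a sketch under stated assumptions: the mass function follows from a uniform prior on the observed fraction with the final size truncated to its integer part, and the function names and the trial value of 300 are hypothetical.

```python
import math


def gott_cdf(n, observed):
    """P(final size <= n) for final size = integer part of observed / alpha, alpha ~ U(0, 1)."""
    return 0.0 if n < observed else 1.0 - observed / (n + 1)


def gott_pmf(n, observed):
    """Probability mass that the Gott's law projection assigns to a final size of n."""
    return gott_cdf(n, observed) - gott_cdf(n - 1, observed)


def log_score(pmf, true_value):
    """Natural log of the probability a projection assigned to the value that occurred."""
    p = pmf(true_value)
    return math.log(p) if p > 0 else float("-inf")


def median(cdf, start):
    """Smallest integer n >= start whose cumulative probability reaches 0.5."""
    n = start
    while cdf(n) < 0.5:
        n += 1
    return n


observed = 211                                                   # cases in the last report
score = log_score(lambda n: gott_pmf(n, observed), 300)          # 300 is a hypothetical outcome
gott_median = median(lambda n: gott_cdf(n, observed), observed)
```

With 211 observed cases this sketch gives a median of 421, the same value as the Gott’s law median reported in the Results. A higher (less negative) log score means the projection placed more probability on the value that actually occurred, and a projection that assigned zero probability to it scores negative infinity.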
Results
When we started performing our short-term forecasts on August 20, 2018, there were 102 reported EVD cases in North Kivu and Ituri provinces, DRC. We used our stochastic and auto-regression models to project one-week, two-week, four-week, and two-month forecasts of outbreak size. As time elapsed, we compared predicted and actual outbreak sizes and found a higher probability of accurate forecasts at one week than two months (Figures 1, 2). Log-likelihood scores typically declined as the model extended its projection time into the future. However, the largest decline in log-likelihood score occurred between the four-week and two-month forecasts. Concurrently, there were larger prediction intervals associated with these longer-term forecasts. As the epidemic curve accelerated in early October, we observed that model projections were less likely to predict actual case counts. These findings were consistent for both models.
Figure 1: Comparison of retrospective model projections to known case counts when projecting from past snapshots of available data. Panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, and 9-15-2018.
Figure 2: Log-likelihood scores of retrospective model projections on known case counts.
After our model validation process was completed, we used the stochastic and auto-regression models to project one-week, two-week, four-week, and two-month outcomes (Figures 3, 4). We used the Gott’s law and Theil-Sen regression models together with the stochastic model to project final outbreak sizes (Figures 5, 6). As of October 13, there were 216 reported EVD cases. With the stochastic model, the four-week projection of median outbreak size was 245 cases (95% prediction interval: 208–315). Median final outbreak size was 274 cases (95% prediction interval: 210–632). With the auto-regression model, the four-week projection of median outbreak size was 259 cases (95% prediction interval: 216–395). With Gott’s law, median final outbreak size was 421 cases (95% prediction interval: 222–4219). Median final outbreak size projected by the regression model was 277 cases (95% prediction interval: 216–485).
Figure 3: Short-term projections of case counts based on reported counts as of Oct. 13.
Figure 4: Medians and prediction intervals from short-term projections of case counts based on reported counts as of Oct. 13.
Figure 5: Projections of final case counts based on reported counts as of Oct. 13.
Figure 6: Medians and prediction intervals from projections of final case counts based on reported counts as of Oct. 13.
Because the question has been raised of whether the current outbreak might exceed the catastrophic West Africa outbreak in size, we evaluated the model’s projected probability of a final size of at least the 28,616 cases reported in that outbreak.
A final outbreak size of 28,616 or more cases was projected to have an exceedingly low probability of less than 1 in 10,000 by all models except the Gott’s law model, whose projected probability distribution is very long-tailed and which projects a probability of about 0.005 (roughly 1 in 190) for that event. However, as with all of the above projections, it should be understood that they are conditional on model assumptions being met. If unpredictable events should change patterns of transmission, for example escape of the outbreak into a region where sustained high transmission rates violate the assumption of gradual quenching of transmission, model projections will no longer be applicable.
Stochastic model
In order to produce model outbreak trajectories consistent with the reported overall case counts since the beginning of October, it was necessary to make the filtering step of the model more tolerant to variation in counts in order to accommodate the rapidly rising count. This is because higher transmission rates in late September and early October were necessary to generate case counts of that size than are consistent with the earlier counts.
If this model’s assumption of a continually quenching overall rate of transmission is accepted, this result could be taken as evidence in favor of increased transmission in the conflict zone, since the increase in cases reported in October reflects cases recorded there.
The likelihoods of the three scenarios of zero, low, and high vaccine coverage estimated by the stochastic model, on the basis of which scenarios are selected by the filtering step, indicate that the lower vaccine coverage scenario was consistently found more likely than the higher vaccine coverage scenario (Figure 7). However, no vaccine coverage was the most likely scenario in all forecasts.
This could be read as evidence for decreased vaccine coverage in the conflict zone. It should be said clearly, however, that the model’s quenching assumption could be violated by the presence of other causes of increased transmission relative to past outbreaks that it cannot distinguish and would misidentify as decreased vaccine coverage.
Figure 7: Likelihoods of vaccine coverage scenarios estimated by number of simulated outbreaks accepted by the stochastic model’s filtering step in which simulated outbreaks must match reported case counts.
Stochastic model parameters conditioned on filtering by true case counts, stochastic model outcomes for multiple snapshots of reported data, and short-term and long-term projections generated by all models from past snapshots are reported in Supplemental Material.
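The way scenario likelihoods follow from the filtering step can be sketched in Python. The idea is that the fraction of proposed outbreaks accepted under a scenario is a Monte Carlo estimate of how likely the reported counts are under that scenario. The normalization to relative likelihoods, the function name, and all counts below are assumptions for illustration; they are not values from this study.

```python
def scenario_likelihoods(accepted, proposed):
    """Relative likelihood of each scenario from its filter acceptance rate."""
    rates = {name: accepted[name] / proposed[name] for name in accepted}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no simulated outbreaks were accepted under any scenario")
    return {name: rate / total for name, rate in rates.items()}


# Hypothetical counts, for illustration only (not values from this study).
accepted = {"no coverage": 60, "low coverage": 30, "high coverage": 10}
proposed = {"no coverage": 1000, "low coverage": 1000, "high coverage": 1000}
likelihoods = scenario_likelihoods(accepted, proposed)
```

Dividing by the number of proposals per scenario matters only when the scenarios were simulated unequal numbers of times; with equal numbers, the accepted counts alone give the same ranking.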
Discussion
Political instability, mobility and community impenetrability to health workers in Eastern DRC present new challenges to efforts to respond to the ongoing EVD outbreak. Public health responders have not been able to trace up to 80% of contacts of EVD cases, and new chains of transmission are being identified on a routine basis. At present, the most reliable source of data is the weekly case counts that can be found in the WHO situation reports; those have indicated that the number of additional cases is increasing rather than decreasing. In such situations of data scarcity, modeling in real time can be useful, particularly in the short term.
Our projection with multiple models, as of October 13, 2018, predicts growth in the short term consistent with rates recorded in the recent past. In the long term our models do not project large-scale growth of the outbreak into a public health emergency of international concern, even considering worst case scenarios. However, these outcomes are likely dependent on highly contingent events such as escape of the outbreak into additional regions or nations which cannot be predicted by mechanistic models.
This is the first EVD outbreak in a conflict zone, and the first with deployment of the vaccine since the beginning of the outbreak, so the suitability of prior mathematical models to predict the course of this particular outbreak is unknown. We validated the short-term forecasting performance of two mathematical models during the early part of the ongoing outbreak.
Our short-term projections of the future course of the outbreak at the time of writing are in rough agreement with expert consensus that the outbreak has gained speed as cases appear in the conflict zone, and will likely continue at roughly the same rate of growth in the short term. Our models do not indicate that the epidemic has yet peaked.
Longer-term forecasts are less certain and depend on whether intervention, conflict and mass behavior are able to stem continued transmission.
There are limitations to our projections. Projection distributions are right-skewed, with long tails (and we therefore report the median instead of the mean). We were unable to include all the 23 observed EVD outbreaks with a case count greater than ten cases in our estimates due to data availability. The simple regression projection is based entirely on past outbreaks of Ebola virus disease (measured and reported in different ways), and cannot account for the improved control measures and vaccination in the way that a mechanistic model does. We included as much real-time information in our estimates as possible, but situations such as the introduction of EVD into a zone of armed conflict and the recent introduction of vaccination are not reflected by the suite of past outbreaks. The stochastic model did not include vaccination of healthcare workers. We estimated vaccination effectiveness, reported cases, and time from symptom onset to reporting using studies from West Africa, not DRC. A strength of our approach was the use of multiple methods to estimate the outbreak size, even though Gott’s law has not been validated for outbreak projections.
Our models confirm that the speedup in the conflict area appears to reflect increased transmission, possibly due to decreased vaccination coverage. Before October, there were no data to suggest whether the conflict zone would manifest more transmission and less detection due to inaccessibility of health services, or less transmission because of reduced mobility, or other outcomes. The October data suggest that transmission is increased there. It is of course not clear whether reported counts underrepresent true numbers of cases in these areas.
It may be that a model that explicitlydistinguishes transmission rates in these zones from those in other areas would model the dynamicsunderlying these cases more faithfully and produce more accurate projections. Most of the EVDcases reported in late September and early October occurred in active conflict zones, and chal-lenges impeding an effective outbreak response have increased. Strong international partners suchas the U.S. Centers for Disease Control and Prevention withdrew their support due to the securityconcerns. Although there was rapid deployment of vaccines during this outbreak, we found thatthe impact of vaccines on transmission reduction has been limited at best. Our stochastic model,which included high, low, and no vaccine coverage scenarios, was much less likely to fit high cover-age scenarios than those with low or no coverage, especially with more recent data included. Thus,as contact tracing efforts faltered and estimates of vaccine coverage became increasingly unreliable,our stochastic model produced estimates of transmission rates more consistent with levels of vaccinecoverage lower than target levels of 62% coverage associated with past programs of contact tracingand vaccination.As of October 2018, the current outbreak is ongoing and does not yet appear to be concluding.We believe current rates of accumulation of cases will continue at least in the short term. Wedo not see evidence to indicate it will expand to the scale of the 2013–16 outbreak, although thepossibility can not be dismissed. We believe the increased rate of case detection corresponding tothe shift of transmission into conflict zones is due to increased transmission, probably driven byreduced ability to detect and vaccinate contacts in those locations. Even as control efforts falter,case fatality rates have been decreasing during the outbreak. With aggressive supportive care,experimental therapeutics and high-quality facilities ( e.g. 
air-conditioned, individualized), health-seeking behaviors may reduce transmission potential in communities that are resistant to controlefforts. eferences [1] Butler D. Models overestimate Ebola cases. Nature. 2014;515(7525):18.[2] Shaman J, Yang W, Kandula S. Inference and forecast of the current West African Ebolaoutbreak in Guinea, Sierra Leone and Liberia. PLoS Curr. 2014;6.[3] Reis J, Yamana T, Kandula S, Shaman J. Superensemble forecast of respiratory syncytial virusoutbreaks at national, regional, and state levels in the United States. Epidemics. 2018;S1755-4365(17):30174–3.[4] Yamana T, Kandula S, Shaman J. Superensemble forecasts of dengue outbreaks. J R SocInterface. 2016;13(123).[5] Reich N, Lauer S, Sakrejda K, Iamsirithaworn S, Hinjoy S, Suangtho P, et al. Challenges inReal-Time Prediction of Infectious Disease: A Case Study of Dengue in Thailand. PLoS NeglTrop Dis. 2016;10(6):e0004761.[6] Graham M, Suk J, Takahashi S, Metcalf C, Jimenez A, Prikazsky V, et al. Challenges andOpportunities in Disease Forecasting in Outbreak Settings: A Case Study of Measles in LolaPrefecture, Guinea. Am J Trop Med Hyg. 2018;98(5):1489–1497.[7] Li M, Dushoff J, Bolker B. Fitting mechanistic epidemic models to data: A comparison ofsimple Markov chain Monte Carlo approaches. Stat Methods Med Res. 2018;27(7):1956–1967.[8] Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, Edmunds WJ. Assessing the perfor-mance of real-time epidemic forecasts: A case study of the 2013-16 Ebola epidemic. bioRxiv.2018;Available from: .[9] World Health Organization. Ebola situation reports: Democratic Republic of the Congo;2018. Online; accessed 15-October-2018. .[10] Kelly JD, Worden L, Wannier R, Hoff NA, Mukadi P, Sinai C, et al. Real-time projectionsof Ebola outbreak size and duration with and without vaccine use in ´Equateur, Democratic17epublic of Congo, as of May 27, 2018. bioRxiv. 2018;Available from: .[11] Ebola haemorrhagic fever in Sudan, 1976. 
Report of a WHO/International Study Team. Bull World Health Organ. 1978;56(2):247–70.
[12] Breman JG, Heymann DL, Lloyd G, McCormick JB, Miatudila M, Murphy FA, et al. Discovery and Description of Ebola Zaire Virus in 1976 and Relevance to the West African Epidemic During 2013-2016. J Infect Dis. 2016.
[13] Baron RC, McCormick JB, Zubeir OA. Ebola virus disease in southern Sudan: hospital dissemination and intrafamilial spread. Bull World Health Organ. 1983;61(6):997–1003.
[14] Georges AJ, Leroy EM, Renaut AA, Benissan CT, Nabias RJ, Ngoc MT, et al. Ebola hemorrhagic fever outbreaks in Gabon, 1994-1997: epidemiologic and health control issues. J Infect Dis. 1999;179 Suppl 1:S65–75.
[15] Khan AS, Tshioko FK, Heymann DL, Le Guenno B, Nabeth P, Kerstins B, et al. The reemergence of Ebola hemorrhagic fever, Democratic Republic of the Congo, 1995. Commission de Lutte contre les Epidémies à Kikwit. J Infect Dis. 1999;179 Suppl 1:S76–86.
[16] CDC and Ministry of Health: T Oyok, Odonga C, Mulwani E, Abur J, Kaducu F, Akech M, et al. Outbreak of Ebola Hemorrhagic Fever - Uganda, August 2000–January 2001. MMWR: Morbidity and Mortality Weekly Report. 2001;50(5):73–77.
[17] Outbreak(s) of Ebola haemorrhagic fever, Congo and Gabon, October 2001-July 2002. Wkly Epidemiol Rec. 2003;78(26):223–8.
[18] Boumandouki P, Formenty P, Epelboin A, Campbell P, Atsangandoko C, Allarangar Y, et al. Prise en charge des malades et des défunts lors de l'épidémie de fièvre hémorragique due au virus Ebola d'octobre à décembre 2003. Bull Soc Pathol Exot. 2005;98(3):218–223.
[19] World Health Organization. Weekly epidemiological record. Abonnement annuel. 2005;43(80):369–376.
[20] Nkoghe D, Kone ML, Yada A, Leroy E. A limited outbreak of Ebola haemorrhagic fever in Etoumbi, Republic of Congo, 2005. Trans R Soc Trop Med Hyg. 2011;105(8):466–72.
[21] Rosello A, Mossoko M, Flasche S, Van Hoek AJ, Mbala P, Camacho A, et al. Ebola virus disease in the Democratic Republic of the Congo, 1976-2014. Elife. 2015;4.
[22] MacNeil A, Farnon EC, Morgan OW, Gould P, Boehmer TK, Blaney DD, et al. Filovirus outbreak detection and surveillance: lessons from Bundibugyo. J Infect Dis. 2011;204 Suppl 3:S761–7.
[23] Uganda: Ebola Situation Report. Bull World Health Organ. 2013. Available from: https://reliefweb.int/sites/reliefweb.int/files/resources/Uganda-Ebola-27august2012.pdf .
[24] Centers for Disease Control and Prevention. Number of Cases and Deaths in Guinea, Liberia, and Sierra Leone during the 2014-2016 West Africa Ebola Outbreak. 2017. Online; accessed 10-May-2018.
[25] Wallinga J, Teunis P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. Am J Epidemiol. 2004;160(6):509–16.
[26] Blumberg S, Lloyd-Smith JO. Comparing methods for estimating R from the size distribution of subcritical transmission chains. Epidemics. 2013;5(3):131–45.
[27] Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438(7066):355–9.
[28] Dalziel B, Lau M, Tiffany A, et al. Unreported cases in the 2014–2016 Ebola epidemic: Spatiotemporal variation, and implications for estimating transmission. PLoS Negl Trop Dis. 2018;12(e0006161).
[29] Henao-Restrepo AM, Camacho A, Longini IM, Watson CH, Edmunds WJ, Egger M, et al. Efficacy and effectiveness of an rVSV-vectored vaccine in preventing Ebola virus disease: final results from the Guinea ring vaccination, open-label, cluster-randomised trial (Ebola Ça Suffit!). Lancet. 2017;389(10068):505–518.
[30] Dalziel BD, Lau MSY, Tiffany A, McClelland A, Zelner J, Bliss JR, et al.
Unreported cases in the 2014-2016 Ebola epidemic: Spatiotemporal variation, and implications for estimating transmission. PLoS Negl Trop Dis. 2018;12(1):e0006161.
[31] Gott III JR. Implications of the Copernican principle for our future prospects. Nature. 1993;363:315–319.

Supplemental Information

Results
Data Sources
| Time Period | Country | Reported Count | Time Series Count | Regression? | Stochastic? | Auto-Regression? |
|---|---|---|---|---|---|---|
| Aug–Sep 1976 | DRC* | 318 | 262 | Yes | No | Yes |
| Jun–Nov 1976 | Sudan | 284 | 284 | Yes | No | Yes |
| Aug–Sep 1979 | Sudan | 34 | 34 | Yes | Yes | Yes |
| Dec 1994–Feb 1995 | Gabon | 52 | 49 | Yes | No | Yes |
| May–Jul 1995 | DRC | 315 | 317 | Yes | Yes | Yes |
| Jan–Apr 1996 | Gabon | 37 | 29 | Yes | Yes | Yes |
| Jul 1996–Mar 1997 | Gabon | 60 | – | No | No | No |
| Oct 2000–Jan 2001 | Uganda | 425 | 436 | Yes | No | Yes |
| Oct 2001–Jul 2002 | Gabon, Republic of the Congo | 124 | 124 | Yes | Yes | Yes |
| Dec 2002–Mar 2003 | Republic of the Congo | 143 | – | No | No | No |
| Nov–Dec 2003 | Republic of the Congo | 35 | 35 | Yes | Yes | Yes |
| Apr–Jun 2004 | Sudan | 17 | 17 | Yes | Yes | Yes |
| Apr–May 2005 | DRC | 12 | 12 | Yes | Yes | Yes |
| Aug–Nov 2007 | DRC | 264 | 264 | Yes | Yes | Yes |
| Dec 2007–Jan 2008 | Uganda | 131 | 127 | Yes | Yes | Yes |
| Dec 2008–Feb 2009 | DRC | 32 | 32 | Yes | Yes | Yes |
| Jun–Aug 2012 | Uganda | 24 | 24 | Yes | Yes | Yes |
| Jun–Nov 2012 | DRC | 52 | 52 | Yes | Yes | Yes |
| Aug–Nov 2014 | DRC | 66 | 62 | Yes | Yes | Yes |
| Jul–Oct 2014 | Nigeria (offshoot of West African outbreak) | 20 | – | No | No | No |
| Jan 2014–Jun 2016 | Guinea, Liberia, Sierra Leone | 28,616 | 21,422 | Yes | No | Yes |
| Apr–Jun 2018 | DRC | 53 | 53 | Yes | Yes | Yes |
Table 1:
Table of past outbreaks by year and country.
Official reported case counts for each epidemic are given, including suspected cases (“Reported Count”). Case counts for the time series data included in the models include only probable and confirmed cases (“Time Series Count”). Case counts for historic outbreaks were pulled from publicly available literature.
Lastly, each historic outbreak's inclusion in the regression, stochastic, and auto-regression models is enumerated.
*Democratic Republic of Congo (formerly Zaire)

Table 1 summarizes the past outbreaks used as data to inform our models.

We retained snapshots of the set of available case counts at multiple time points, for use in scoring of retrospective model projections against known subsequent counts (Figure 8). In later data sets, due to the larger number of data points, a subset of the case counts was selected for use in the stochastic model's particle filtering step, as noted in the figure.
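The particle filtering step mentioned above keeps only those simulated outbreaks that approximately agree with the reported case counts at the selected dates. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the relative-tolerance rule, and the value of `rel_tol` are all hypothetical, since the exact agreement criterion is not stated in this section.

```python
def within_tolerance(simulated, reported, rel_tol=0.2):
    """Check a simulated trajectory against the selected reported counts.

    simulated: list of cumulative case counts indexed by day since onset
    reported:  dict mapping day index -> reported cumulative case count
               (the subset of data points chosen for filtering)
    rel_tol:   hypothetical relative tolerance, not taken from the paper
    """
    for day, count in reported.items():
        if day >= len(simulated):
            return False  # trajectory ended before this reporting date
        if abs(simulated[day] - count) > rel_tol * count:
            return False  # disagrees too strongly with this data point
    return True


def particle_filter(trajectories, reported, rel_tol=0.2):
    """Retain only the simulated outbreaks that pass the agreement check."""
    return [t for t in trajectories if within_tolerance(t, reported, rel_tol)]


# Toy example: three simulated trajectories, two reported data points.
reported = {2: 10, 4: 20}
trajectories = [
    [1, 5, 10, 15, 21],  # close to both reported counts
    [1, 2, 3, 4, 5],     # far below the reported counts
    [1, 6, 11, 30, 60],  # agrees on day 2, overshoots on day 4
]
kept = particle_filter(trajectories, reported)
```

In this toy example only the first trajectory is retained. Using fewer reporting dates in `reported` loosens the filter, which is one plausible reason a subset of data points would be chosen when the number of available counts grows.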
Stochastic Model
Epidemic curves reported for past Ebola outbreaks were used to estimate time series of effective reproduction number ($R$) by day, which were then fit to an exponential quenching curve (Figure 9). The parameters $R_{\mathrm{initial}}$ and $\tau$ estimated by that curve fitting on past epidemics were then used to create a distribution from which values were sampled to parametrize the stochastic simulation (Figure 10).

The $R_{\mathrm{initial}}$ and $\tau$ parameters driving simulated outbreaks that were successful in passing the particle filtering step tended to cluster in particular locations within the assumed distribution (Figure 11). In some cases, distinct ranges of $R_{\mathrm{initial}}$ and/or $\tau$ were selected in conjunction with the different vaccine coverage scenarios.

For simulation based on cases as of August 20, 320 outbreaks were retained from 10,196,928 simulated outbreaks after filtering based on approximate agreement with reported case counts from the current outbreak. For the August 27 data set, 320 were retained from 11,622,528; for September 5, 321 were retained from 6,492,672; for September 15, 320 from 39,537,792; and for October 7, 320 from 48,845,376.

The simulations passing the particle filtering step, representing a distribution of parameter values and vaccine scenarios, were continued beyond the particle filtering points to generate a spreading set of projections of case counts at later dates (Figure 12), which was smoothed to create probabilistic projections of future case counts at the desired dates.

[Figure 8 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, 10-7-2018, and 10-13-2018]
Figure 8: Reported case counts in current outbreak by date, in multiple snapshots of available data. Where not otherwise noted, all case counts shown were used in the stochastic model's particle filtering step.

Figure 9:
Estimates of reproduction number $R$ by day in past Ebola outbreaks. Thin curves are exponentially quenched curves $R = R_{\mathrm{initial}} e^{-\tau d}$ fit to each series of $R$ estimates.

Figure 10: Distribution of transmission rates sampled for simulation. Black dots are pairs of $R_{\mathrm{initial}}$ and quenching rate $\tau$ estimated from past Ebola outbreaks, and the blue cloud is the continuous distribution from which pairs are sampled for simulation.

[Figure 11 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, 10-7-2018, and 10-13-2018]
Figure 11: Transmission rates selected by the particle filtering process, by vaccine coverage scenario, for successive snapshots of available case count data. As in the previous figure, black dots are $R_{\mathrm{initial}}$, $\tau$ pairs estimated for past outbreaks (for comparison), and colors illustrate the density of $R_{\mathrm{initial}}$, $\tau$ pairs selected by filtering simulated outbreaks, classified by level of vaccine coverage.

[Figure 12 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, 10-7-2018, and 10-13-2018]
Figure 12: Cumulative case counts by date projected by individual realizations of the stochastic model, by vaccine coverage scenario, using successive snapshots of available case count data. The vertical axis is cut off at the upper limit of the 95% prediction interval for outbreak sizes, for readability.

Projections
We have recorded the projections generated by our models from older data sets to assess the development of the projections as the outbreak has progressed, in Figures 13, 14, 15, and 16.

[Figure 13 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, and 10-7-2018]
Figure 13:
Short-term projections based on past data sets.

[Figure 14 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, and 10-7-2018]
Figure 14:
Medians and prediction intervals from short-term projections based on past data sets.

[Figure 15 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, and 10-7-2018]
Figure 15:
Final outbreak size projections based on past data sets.

[Figure 16 panels: data as of 8-20-2018, 8-27-2018, 9-5-2018, 9-15-2018, and 10-7-2018]
Figure 16:
Medians and prediction intervals from final outbreak size projections based on past data sets.

Table 2 summarizes the medians and 95% prediction intervals produced by each model on the most recent data set included, and their probabilities of outcomes exceeding the 2013–2016 outbreak.
Forecast as of Forecast Model Lower Median Upper Over 28,616
Table 2: