A Bayesian Updating Scheme for Pandemics: Estimating the Infection Dynamics of COVID-19

Abstract

Epidemic models play a key role in understanding and responding to the emerging COVID-19 pandemic. Widely used compartmental models are static and are of limited use for evaluating intervention strategies as the pandemic evolves. Applying the technology of data assimilation, we propose a Bayesian updating approach for estimating epidemiological parameters from observable information, for the purpose of assessing the impacts of different intervention strategies. We adopt a concise renewal model and propose new parameters by disentangling the reduction of the instantaneous reproduction number $R_t$ into mitigation and suppression factors, quantifying intervention impacts at a finer granularity. We then develop a data assimilation framework for estimating these parameters, which includes constructing an observation function and developing a Bayesian updating scheme. A statistical analysis framework is then built to quantify the impact of intervention strategies by monitoring the evolution of these estimated parameters. By investigating the impacts of intervention measures in European countries, the United States and Wuhan with this framework, we reveal the effects of interventions in these countries and the resurgence risk in the USA.


Index Terms: COVID-19, Data assimilation, Bayesian updating, Renewal process, Epidemiology, Non-pharmaceutical intervention.

I. INTRODUCTION

In response to the COVID-19 pandemic, governments have taken non-pharmaceutical intervention measures. Common measures include travel restrictions, school and non-essential business closures and social distancing, as well as early isolation of confirmed patients. Recently, as the first-wave epidemic peak has faded in many countries, the accumulated observations of epidemic growth [1] and the corresponding intervention policies [2] have shed more light on how the interventions worked.
Meanwhile, many governments have moved to reopen economic and social activities, with attention on tamping down possible resurgences. However, the recent second-wave outbreaks in some countries and regions (e.g. the United States, Hong Kong) alert us to monitor the epidemic evolution carefully while intervention measures are being relaxed.

* Corresponding author: Yike Guo (email: [email protected]).

Mathematical models play a key role in understanding and responding to the emerging COVID-19 pandemic [3]–[5]. Compartmental models (e.g. SIR, SEIR) and time-since-infection models (i.e. renewal process-based models) are the two well-known approaches to describing the underlying transmission dynamics [6], [7]. Compartmental models describe the transmission among sub-populations, while the renewal process-based approach starts from inter-individual transmission. Despite different nomenclatures and applications, each model contains parameters characterizing the epidemic dynamics. One of the best-known parameters is the reproduction number $R$, which represents the average number of secondary cases induced by an infected primary case [8]. This key parameter is related to the final epidemic size of an infectious disease [9]. Intervention measures aim to keep the reproduction number below one so that the epidemic is contained over time. Thus, an estimate of the time-varying $R$ reflects the impact of interventions. The basic reproduction number $R_0$ is the reproduction number at the beginning of the epidemic outbreak, when the susceptible population is approximately infinite and no intervention measures are in place. Once intervention measures are introduced, the instantaneous reproduction number $R_t$ (also called the effective reproduction number) is of greater interest. To gain insights into epidemic evolution, most existing studies, such as [3], [10], focus on estimating the time-varying instantaneous reproduction number $R_t$. $R_t$ is defined as the average number of secondary cases that would be generated by an infected primary case at time $t$ if conditions remained the same thereafter [8], and so reflects the real-time transmission dynamics. This can help governments monitor the evolution of COVID-19 and update intervention policies accordingly [11]. However, nowcasting $R_t$ from reported data is not an easy task.
Several approaches have been proposed to estimate $R_t$, each with different advantages [12]–[14], but timeliness and accuracy remain a concern. Nowcasting results are affected by several factors, such as the assumptions of the epidemic model, the statistical inference method and the uncertainty of the data sources. Inappropriate interpretation or imprecise estimation of $R_t$ are [...]

Ling Li is with the School of Computing, University of Kent, Kent, UK. Yuan Huang and Zhongzhao Teng are with the Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK. Yike Guo is with Hong Kong Baptist University, Hong Kong, China.


Shuo Wang

To disentangle the reduction of $R_t$, we introduce two complementary parameters: the mitigation factor ($p_t$) captures the effect of shielding the susceptible population (e.g. through social distancing), and the suppression factor ($D_t$) captures the effect of isolating the infected population (e.g. through quarantine) to stop virus transmission. We propose a novel method to estimate these parameters, taking the data assimilation approach with Bayesian updating. We use daily reports of confirmed cases as the observation. A deconvolution method is used to build an observation function that estimates the infection cases by adjusting for the incubation time and report delay. The evolution of the time-varying infectiousness profile (i.e. $p_t$ and $D_t$) is estimated from the adjusted epidemic curve through a Bayesian approach to assimilation. Such a fine-grained infectiousness profile enables us to quantify the impacts of various intervention measures in a comprehensive way.

The paper is structured as follows. We introduce the related work in Section II. In Section III, we present an overview of a time-varying renewal process model in which the two parameters $p_t$ and $D_t$ are proposed. In Section IV, we present in detail the Bayesian updating scheme for estimating the dynamic parameters. In Section V, we develop a statistical analysis method for assessing the intervention impacts based on the estimated results and the record of intervention policies. In Section VI, as applications of our approach, we investigate the impacts of intervention measures in European countries, the United States and Wuhan to illustrate the importance of this development.

II. RELATED WORK

At the beginning of the COVID-19 outbreak in Wuhan, China, compartmental models (e.g. SIR, SEIR) were used to investigate the epidemic dynamics [16]–[18], with the basic reproduction number estimated from models with static parameters. With the spread of COVID-19 worldwide, renewal process-based models (i.e. time-since-infection models) are also widely used in the study of COVID-19. The R package ‘EpiEstim’ [12], [13] is the most widely used tool for estimating the time-varying $R_t$ with a sliding window. In [10], ‘EpiEstim’ was applied to infer $R_t$ via the discrete renewal process for policy impact assessment. Similar work in [3] inferred $R_t$ using ‘EpiEstim’ from laboratory-confirmed cases in Wuhan and hence evaluated the impact of non-pharmaceutical public health interventions. The work in [11] pointed out that infection data is usually not available, and used death data as the observation for updating $R_t$. Instead of simply applying ‘EpiEstim’, the authors estimated $R_t$ by employing the renewal equation as a latent process to model infections and connecting the infections to death data via a generative mechanism. However, the estimated $R_t$ is piecewise, and the number of change points was assumed to be determined by the imposed interventions. The work in [19] also estimates $R_t$ from death data, while linking the disease transmissibility to mobility using the renewal equation. In general, [11] and [19] explicitly formulate the updating function of $R_t$ by introducing external factors (e.g. interventions and mobility). Thus, the estimated $R_t$ curve is largely constrained by the factors that are considered in the model. Data assimilation [20] lends itself naturally to this problem, since it provides a framework for dynamically updating the model states and parameters when new observations become available, while also taking into account model and observation uncertainty.
Data assimilation technologies, such as the Kalman filter and variational methods [21], have been widely used in signal tracking, oceanology, environment monitoring and weather forecasting, where physical models and observation data are assimilated to produce accurate predictions. Data assimilation for epidemiological modelling was first proposed in [22], where compartmental models were used as the underlying model for assimilation. In [25] and [26], the estimation of time-varying parameters in compartmental models was further investigated. To the authors’ best knowledge, our work is the first study to apply data assimilation to the renewal process-based model.

III. EPIDEMIC MODELLING OF COVID-19 TRANSMISSION

In this section, we propose a time-varying renewal process with two complementary parameters $p_t$ and $D_t$ to model the evolving infectiousness profile. The renewal process [8] of infectious disease transmission is:

$$I(t) = \int_0^{\infty} I(t-\tau)\,\beta(\tau)\,d\tau \tag{1}$$

where $I(t)$ is the incident infection at time $t$ and $\beta(\tau)$ is the infectiousness profile. The infectiousness profile means that a primary case who was infected $\tau$ time ago (i.e. with infection-age $\tau$) now generates new secondary cases at a rate of $\beta(\tau)$, describing a homogeneous mixing process. The infectiousness profile $\beta(\tau)$ is related to biological, behavioral and environmental factors. We can calculate the reproduction number $R$ as the area under the curve of $\beta(\tau)$, which is the overall number of secondary cases infected by a primary case. Further, $\beta(\tau)$ can be rewritten as:

$$\beta(\tau) = R \cdot w(\tau) \tag{2}$$

where the unit-normalized transmission rate $w(\tau)$ is the probability density function of the generation time, i.e. the interval between the primary infection and the secondary infection. In the early stage without intervention, the infectiousness profile remains time-independent at the baseline $\beta_0(\tau)$, which describes the transmission dynamics when the susceptible population is infinite. The corresponding $R$ is the well-known basic reproduction number $R_0$.

In reality, the infectiousness profile $\beta(\tau)$ evolves with time $t$; we therefore introduce $\beta_t(\tau)$ to capture the change in its distribution caused by intervention measures. To quantify the impacts of intervention measures on the evolution of $R_t$, we propose two factors, suppression and mitigation, to disentangle the intervention effects. We use two complementary metrics, $D_t$ and $p_t$, to model the suppression and mitigation factors respectively, as illustrated in Figure 1. Suppression mainly shortens the infectious period of the infected population, corresponding to a truncation of $\beta(\tau)$ along the horizontal axis. We use a time-varying parameter $D_t$ to denote the effective infectious window induced by suppression. Mitigation attenuates the overall infectiousness by shielding the susceptible population, corresponding to a scaling in the vertical direction. We introduce another time-varying parameter $p_t$ to describe this attenuation effect. Formally, we parameterize the evolution of the infectiousness profile as:

$$\beta_t(\tau) = \begin{cases} \beta_0(\tau) \cdot p_t, & \tau < D_t \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

Accordingly, the instantaneous reproduction number $R_t$ can be derived:

$$R_t = p_t \cdot \int_0^{D_t} \beta_0(\tau)\,d\tau \tag{4}$$

Therefore, the impact of intervention measures on the reduction of $R_t$ is disentangled: the mitigation factor $p_t$ attenuates the overall infectiousness by shielding the susceptible population, and the suppression factor $D_t$ shortens the infectious period by isolating the infected population. Note that $R_t$ can be derived from $p_t$ and $D_t$, which provide more mechanistic detail about the evolution of the infectiousness profile.

IV. ADAPTIVE PARAMETER ESTIMATION
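The three quantities estimated in this section are tied together by equations (3) and (4). As a minimal sketch, assuming the baseline profile is sampled at daily infection-ages and the integral is replaced by a daily sum, the truncation and scaling can be written as follows. The function names, the generation-time values and the choice of $R_0 = 2.5$ are illustrative assumptions and are not taken from the released package.

```python
import numpy as np

def truncated_profile(beta0, p_t, d_t):
    """Equation (3): scale beta0 by p_t and zero it at infection-ages >= d_t.

    beta0 holds the baseline profile at integer infection-ages 0, 1, 2, ... (days);
    d_t is the effective infectious window in the same units.
    """
    beta0 = np.asarray(beta0, dtype=float)
    tau = np.arange(len(beta0))
    return np.where(tau < d_t, p_t * beta0, 0.0)

def reproduction_number(beta0, p_t, d_t):
    """Equation (4), with the integral approximated by a daily sum."""
    return float(truncated_profile(beta0, p_t, d_t).sum())

# Hypothetical generation-time pmf over infection-ages 0..9 (sums to one).
w = np.array([0.0, 0.05, 0.15, 0.2, 0.2, 0.15, 0.1, 0.08, 0.05, 0.02])
beta0 = 2.5 * w                                   # equation (2) with an assumed R0 of 2.5

r_baseline = reproduction_number(beta0, p_t=1.0, d_t=len(beta0))  # no intervention
r_intervened = reproduction_number(beta0, p_t=0.5, d_t=4)         # halved p_t, window cut to 4 days
```

With these made-up numbers, the same $R_t$ could also be reached by other combinations of $p_t$ and $D_t$, which is why the two factors are tracked separately rather than through $R_t$ alone.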

We aim to develop a comprehensive framework for estimating the parameters of renewal process models using the Bayesian updating approach of data assimilation, especially the three key parameters < $R_t$, $p_t$, $D_t$ >. The estimation is essential for quantifying the impacts of different interventions by monitoring the evolution of < $R_t$, $p_t$, $D_t$ >. The framework consists of building an observation function that maps observations to the model state, modelling, and Bayesian updating, as shown in Figures 2 and 3. By applying the observation function, we reconstruct the number of daily infections from reports of confirmed cases, taking into account the incubation time and report delay with a deconvolution algorithm. Then < $R_t$, $p_t$, $D_t$ > is estimated through a Bayesian approach to data assimilation.

A. Reconstruction of daily infection from reported cases

In data assimilation, model states and parameters can be updated using new observation data. For parameter estimation it is important that a proper observation is chosen and that an observation function can be built which maps observations to a state variable (usually regarded as the output of the model). In this study, the observations are the reported numbers of confirmed cases, and the model output is the daily infection incidence through the renewal process. However, such observations carry an inevitable time delay between the actual infection time and the reporting date (Figure 2). This delay includes an incubation time (i.e. the period between infection and onset of symptoms) and a confirmation period (i.e. the period between onset and the official report after testing). The confirmed cases reported at time $t$ were actually infected over a past period, so the reported number is the convolution of the historical daily infections. Here, we define an observation function to reconstruct the daily infections from the confirmed cases, using the delay distribution $s(\tau)$ for $\tau \in \{0, \dots, d\}$ from infection to report (Figure 2). Denoting the epidemic curve of reported infection cases by $\hat{I}_{1:t} = \{\hat{I}_1, \hat{I}_2, \dots, \hat{I}_t\}$ and the epidemic curve of confirmed cases by $C_{1:t} = \{C_1, C_2, \dots, C_t\}$, the observation of past infections can be modelled as a Poisson process:

$$C_t \sim \mathrm{Poisson}\Big(\text{mean} = \sum_{k<t} s(k)\,\hat{I}_{t-k}\Big) \tag{5}$$

Estimating the daily reported infection curve $\hat{I}_{1:t}$ given the daily confirmed cases curve $C_{1:t}$ and the infection-to-confirmed time distribution $s_{1:d}$ is an ill-posed deconvolution problem, which can be solved using the Richardson-Lucy (RL) iteration method [25]. The initial guess $\hat{I}^0_{1:t}$ is the confirmed cases curve $C_{1:t}$ shifted back by the mode of the infection-to-confirmed time distribution. Let $\hat{C}_i^n = \sum_{k<i} s(k)\,\hat{I}_{i-k}^n$ be the expected number of confirmed cases on day $i$ at iteration $n$, and let $q_t$ be the probability that a reported case resulting from infection on day $t$ will be observed, as defined in [25]. Then the iteration of $\hat{I}_t$ is computed by an expectation-maximization (EM) algorithm as:

$$\hat{I}_t^{n+1} = \frac{\hat{I}_t^n}{q_t} \sum_{i \ge t} s(i-t)\,\frac{C_i}{\hat{C}_i^n} \tag{6}$$

A normalized $\chi^2$ statistic is used as the stopping criterion of the iteration:

$$\chi^2 = \frac{1}{N} \sum_i \frac{(\hat{C}_i^n - C_i)^2}{\hat{C}_i^n} < 1 \tag{7}$$

where $N$ is the total number of data points. Note that the reported number of confirmed cases constitutes a lower bound on the real number of infections, owing to the lack of mass testing and the existence of asymptomatic cases. However, as long as the detection rate remains consistent, the scaling of the reconstructed data does not affect the subsequent inference of transmission dynamics.

Fig. 1. Disentangling the reduction of reproduction number into mitigation and suppression factors.

Fig. 2. Reconstruction of daily infection from the confirmed cases using deconvolution algorithms. The time delay between the infection and onset and report is demonstrated (top). The estimated distribution between infection and report is presented which is used for deconvolution (bottom).

B. Bayesian Updating for Parameter Estimation
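The updating scheme below takes the reconstructed infection curve as its input. As a reference for how that input might be produced, the Richardson-Lucy iteration of equations (6) and (7) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the code of the released package: the function name, the `max_iter` and `tol` parameters, the zero-based day indexing and the way the tail of the initial guess is padded are all illustrative choices.

```python
import numpy as np

def rl_deconvolve(confirmed, delay_pmf, max_iter=200, tol=1.0):
    """Richardson-Lucy reconstruction of daily infections from confirmed cases.

    confirmed : daily confirmed counts, one value per day
    delay_pmf : s(0..d), probability that a case is reported k days after infection
    tol       : stop once the normalized chi-square of equation (7) drops below it
    """
    c = np.asarray(confirmed, dtype=float)
    s = np.asarray(delay_pmf, dtype=float)
    n = len(c)
    # Delay matrix: S[i, t] = s(i - t) when 0 <= i - t <= d, otherwise 0.
    lag = np.arange(n)[:, None] - np.arange(n)[None, :]
    S = np.where((lag >= 0) & (lag < len(s)), s[np.clip(lag, 0, len(s) - 1)], 0.0)
    # q_t: probability that an infection on day t is reported inside the data window.
    q = np.maximum(S.sum(axis=0), 1e-12)
    # Initial guess: confirmed curve shifted back by the mode of the delay distribution
    # (the vacated tail is padded with the last observed value, an assumption).
    shift = int(np.argmax(s))
    infections = np.concatenate([c[shift:], np.full(shift, c[-1])]) + 1e-9
    for _ in range(max_iter):
        expected = np.maximum(S @ infections, 1e-12)           # forward convolution
        chi2 = np.mean((expected - c) ** 2 / expected)          # equation (7)
        if chi2 < tol:
            break
        infections = infections / q * (S.T @ (c / expected))    # equation (6)
    return infections
```

Dividing by `q` compensates for right truncation: infections from the last few days have not yet had time to be fully reported, so their contribution is scaled up accordingly.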

Following the Bayesian updating approach of data assimilation, we propose an instantaneous estimation method. For the defined epidemiological renewal process, the daily incident infection $I_t$ is the state variable and can be assimilated from the infection data reconstructed from observation. The evolution of the state $I_t$ is governed by the renewal process with the time-varying infectiousness profile $\beta_t(\tau)$, parameterized by $p_t$ and $D_t$. Here we present a Bayesian framework to monitor the evolution of $p_t$ and $D_t$ using the daily reports of confirmed cases (Figure 3).

Fig. 3. Illustration of the Bayesian updating framework for estimating suppression and mitigation factors. We employ a two-level hierarchical model: for each time step, the low-level model (i.e. renewal process) provides the likelihood of $p_t$, $D_t$ (green). The posterior (orange) is calculated through the element-wise product of the likelihood and the prior (blue) from the previous time step. To generate the prior for the next time step, we use the high-level model (i.e. the transformation $T$) to induce the evolution of parameters. The high-level model is a piecewise Gaussian random walk process where the fluctuations of $p_t$ and $D_t$ differ before and after an intervention time. The instantaneous reproduction number $R_t$ can be derived from the posterior distribution of $p_t$ and $D_t$.

Our updating scheme employs a two-level hierarchical model for the inference of time-varying parameters [26]. Let us denote the observed daily incidence of infection up to time step $t$ as $\hat{I}_{1:t} = \{\hat{I}_1, \hat{I}_2, \dots, \hat{I}_t\}$. Suppose $p(\boldsymbol{\theta}_{t-1} \mid \hat{I}_{1:t-1})$ is the estimated distribution of $\boldsymbol{\theta} = [p, D]$ at time step $t-1$. Under the assumption of consistent detection rates, the observed daily incidence $\hat{I}_t$ also satisfies the renewal process. The low-level model predicts the observation (i.e. the reconstructed daily infection) given a parameter set through the renewal process:

$$p(\hat{I}_t \mid \boldsymbol{\theta}_t, \hat{I}_{1:t-1}) \sim \mathrm{Poisson}\Big(\text{mean} = \sum_{k=1}^{t-1} \beta_t(k; \boldsymbol{\theta}_t)\,\hat{I}_{t-k}\Big) \tag{8}$$

where a Poisson process of observing the infected cases is assumed. This describes the likelihood of observing the new incidence data given the history of observations and the parameter value $\boldsymbol{\theta}_t$. The high-level model describes the evolution of the model parameters $p_t$ and $D_t$ by transforming the joint distribution:

$$p(\boldsymbol{\theta}_t \mid \hat{I}_{1:t-1}) = T \circ p(\boldsymbol{\theta}_{t-1} \mid \hat{I}_{1:t-1}) \tag{9}$$

where $T(\cdot)$ is a transformation function defining the temporal variation of $\boldsymbol{\theta}$. The prior knowledge of the parameter distribution is transferred to the next time step $t$ by the high-level model $T$. In the scenario without interventions, the parameters $p_t$ and $D_t$ fluctuate around their baseline values. Therefore, we can assume a random walk of $\boldsymbol{\theta}$ in the parameter space as the high-level model. The joint parameter distribution is updated by convolution with a Gaussian kernel with variance $\sigma_1$. When an intervention is introduced at time $d$, the random walk of $\boldsymbol{\theta}$ is altered and the variance of the Gaussian kernel becomes $\sigma_2$. The transformation $T(\cdot)$ is defined as:

$$T \circ p(\boldsymbol{\theta}) = \begin{cases} p(\boldsymbol{\theta}) * K_1(\boldsymbol{\theta}), & t < d \\ p(\boldsymbol{\theta}) * K_2(\boldsymbol{\theta}), & t \ge d \end{cases} \tag{10}$$

where $K_1(\boldsymbol{\theta})$ and $K_2(\boldsymbol{\theta})$ are the Gaussian kernels before and after the deployment of the intervention at time $d$. This high-level model includes three hyperparameters: the variances before and after intervention, $\sigma_1$ and $\sigma_2$, and the change-point time $d$. Let us denote the hyperparameters by $\boldsymbol{\eta} = [\sigma_1, \sigma_2, d]$. After seeing the latest observation $\hat{I}_t$, the posterior estimate of $\boldsymbol{\theta}$ is updated by Bayes' rule:

$$p(\boldsymbol{\theta}_t \mid \hat{I}_{1:t}) = \frac{T \circ p(\boldsymbol{\theta}_{t-1} \mid \hat{I}_{1:t-1}) \cdot p(\hat{I}_t \mid \boldsymbol{\theta}_t, \hat{I}_{1:t-1})}{p(\hat{I}_t \mid \hat{I}_{1:t-1})} \tag{11}$$
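One time step of equations (8) to (11) can be sketched on a discrete grid over $(p, D)$. The sketch below is an illustration under stated assumptions rather than the released implementation: the function names, the grid ranges, the kernel width expressed in grid cells (`sigma_cells`, standing in for $\sigma_1$ or $\sigma_2$ depending on whether $t < d$), the zero-padded grid edges, and the use of `lgamma` to evaluate the Poisson likelihood at non-integer reconstructed counts are all choices made here for illustration. The baseline profile is indexed by infection-age $k = 1, 2, \dots$ days.

```python
import numpy as np
from math import lgamma

def diffuse(prior, sigma_cells):
    """High-level model: Gaussian random walk = blur the grid, then renormalize."""
    radius = max(1, int(np.ceil(3 * sigma_cells)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma_cells) ** 2)
    kernel /= kernel.sum()

    def smooth(v):
        # Centered slice of the full convolution keeps the original length.
        return np.convolve(v, kernel, mode="full")[radius:radius + len(v)]

    out = np.apply_along_axis(smooth, 0, prior)
    out = np.apply_along_axis(smooth, 1, out)
    return out / out.sum()

def update(prior, p_grid, d_grid, beta0, history, observed, sigma_cells):
    """One Bayesian update of the joint distribution of (p, D) on a grid."""
    beta0 = np.asarray(beta0, dtype=float)
    predicted = diffuse(prior, sigma_cells)                       # equation (9)
    tau = np.arange(1, len(beta0) + 1)                            # infection-age k
    past = np.asarray(history, dtype=float)[::-1][:len(beta0)]    # I_{t-1}, I_{t-2}, ...
    log_lik = np.empty_like(predicted)
    for a, p in enumerate(p_grid):
        for b, d in enumerate(d_grid):
            beta_t = np.where(tau < d, p * beta0, 0.0)            # equation (3)
            mean = max(float(beta_t[:len(past)] @ past), 1e-12)   # mean in equation (8)
            log_lik[a, b] = observed * np.log(mean) - mean - lgamma(observed + 1.0)
    posterior = predicted * np.exp(log_lik - log_lik.max())       # numerator of (11)
    return posterior / posterior.sum()                            # division by the evidence

# Hypothetical setup: every number here is illustrative.
w = np.array([0.05, 0.15, 0.2, 0.2, 0.15, 0.1, 0.08, 0.05, 0.02])  # pmf for k = 1..9
beta0 = 2.5 * w
p_grid = np.linspace(0.05, 1.0, 20)
d_grid = np.arange(2, 11)
prior = np.full((len(p_grid), len(d_grid)), 1.0 / (len(p_grid) * len(d_grid)))
posterior = update(prior, p_grid, d_grid, beta0,
                   history=np.full(30, 100.0), observed=50.0, sigma_cells=1.0)

# R_t on the same grid via equation (4), and its posterior mean.
tau = np.arange(1, len(beta0) + 1)
r_grid = np.array([[p * beta0[tau < d].sum() for d in d_grid] for p in p_grid])
r_mean = float((posterior * r_grid).sum())
```

In this toy setup the history is flat, so a single observation constrains only the product captured by $R_t$ and the posterior forms a ridge over $(p, D)$; separating the two factors relies on incidence histories that vary over infection-age.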

This step reflects the Bayesian principle behind the key updating step in Kalman filtering [21]. Unlike the Kalman filtering method, where uncertainty is explicitly modelled through a covariance matrix under the Gaussian assumption, we directly use the posterior probability to capture the uncertainty of estimation. The posterior is usually intractable but can be approximated through grid-based methods. Given a set of hyperparameters $\boldsymbol{\eta}_i$, the hybrid model evidence can be calculated as [26]:

$$p(\hat{I}_{1:t} \mid \boldsymbol{\eta}_i) = \int p(\hat{I}_{1:t}, \boldsymbol{\theta}_t \mid \boldsymbol{\eta}_i)\,d\boldsymbol{\theta}_t \tag{12}$$

Finally, the posterior estimate $p(\boldsymbol{\theta}_t \mid \hat{I}_{1:t})$ can be averaged across the hyperparameter grid, weighted by the hybrid model evidence. The posterior mean and confidence intervals of $p_t$ and $D_t$, as well as the corresponding $R_t$, are obtained in a dynamic manner. The prior of $R_0$ at the first time step is set to be uninformative, as a uniform distribution with pre-set lower and upper limits (e.g., the upper limit for the European countries is set to 8 in the experiment). The shape of $\beta_0(\tau)$ is adapted from the distribution of the generation time interval $w(\tau)$ reported by Ferretti et al. [5]. We applied the above framework to infer the epidemic evolution in 14 European countries, states in the US and Wuhan city, China, in Section VI. The code of our framework is released as an open-source package (https://github.com/whfairy2007/COVID19_Bayesian).

V. EVALUATION OF INTERVENTION MEASURES

With the estimated results from the above Bayesian updating scheme, we can now perform a statistical analysis relating the evolution of the transmission dynamics to the implementation of intervention measures. The whole framework, comprising data reconstruction, dynamic modelling, Bayesian updating and statistical analysis, is presented in Figure 4. In this section, we introduce the quantification of intervention measures and the statistical method.

A. Data Source

For the observations, we use the aggregated data of publicly available daily confirmed cases of 14 European countries (Austria, Belgium, Denmark, France, Germany, Ireland, Italy, Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom) and 52 states of the United States from the Johns Hopkins University database [1]. The data include the time series of confirmed cases from January 22nd to June 8th. [...] the stringency index $S_t$ of intervention measures during the analysis period (accessed on June 9th) [...] $S_t \le 80\%$ and Level 4: $80\% < S_t \le$ [...].

B. Calculation of intervention policy indices

We categorize the dates within our analysis period in European countries into five response levels, based on the overall stringency index $S_t$. To identify the representative measures of each response level, we calculate the quantification indices of the eight intervention measures. Descriptions of the eight intervention measures and the quantification methods are provided in [2]. For each intervention measure, the Oxford report provides an ordinal-scale quantification $v$ of the strength of the j-th policy implementation and a binary flag $f$ [...] $t$. Following similar practice used in the Oxford report, we normalize the implementation of each intervention measure as

$$P = \frac{\max(0,\, v + 0.5f - 0.5)}{N} \times 100\% \tag{13}$$

where $N$ is the maximum value of the indicator $P$. To assign a response-level label to each measure, we calculate the change of the mean policy index across different response levels. The response level with the largest increase is considered to be the level that the measure belongs to (i.e. the measure is a representative measure of this response level). For example, the mean index of school closure showed its largest increase from Level 0 to Level 1, so we consider it a representative measure of Level 1. The representative measures of each response level are listed in Table 1.

Fig. 4. Components of the quantification framework. The evolution of mitigation and suppression factors are estimated using the infection data reconstructed from the daily reported confirmed cases. Given the history of government responses, the impacts of intervention measures are quantified by correlating the inferred epidemic parameters to response levels.

C. Regression analysis of the intervention impacts

We performed a retrospective analysis of the time-varying transmission dynamics during different response levels in European countries. First, the evolution history of $R_t$ and the overall stringency index $S_t$ are obtained using the above framework. The stringency index $S_t$ is categorized into five response levels. We fit a log-linear mixed-effect model, in which the logarithm of $R_t$ is the outcome variable and the categorical stringency index is the predictor. The logarithm is used to obtain the intervention impacts on the relative change of $R_t$ [27]. We performed a partial-pooling analysis by assuming that the impacts of intervention measures (slopes) are shared across all selected European countries, while the basic reproduction number $R_0$ (intercept) varies owing to environmental and social factors. The regression formula is written as:

$$\ln R_j = b_0 + \sum_{k=1}^{4} b_k * D_k + \gamma_j + \epsilon, \qquad j = 1, 2, \dots, 14 \tag{14}$$

where $R_j$ is the estimated reproduction number of the j-th country, $b_0$ is the fixed-effect term of $\ln R_0$ and $b_k$ is the fixed effect of interventions in response level $k$. $D_k$ is the dummy variable that takes the value 1 if and only if the response status is at Level $k$. $\gamma_j$ is the random-effect term, following a zero-mean Gaussian, which explains the difference of $\ln R_0$ across countries, and $\epsilon$ is the Gaussian error term. Equation 14 associates the relative changes in $R$ with the fixed effects of response levels, and can be rewritten in its marginal form as:

$$\ln\Big(1 + \frac{R - R_0}{R_0}\Big) = \sum_{k=1}^{4} b_k * D_k \tag{15}$$

Therefore, the relative change of $R$ due to the intervention measures in the k-th response level can be derived from $b_k$ (i.e. $\Delta R / R_0 = \exp(b_k) - 1$). The country-specific $\ln R_0$ can be estimated as $b_0 + \gamma_j$ at Level 0. The statistical analysis is performed using the R package ‘lme4’. A fixed effect is considered significant at P value < 0.05. The 95% confidence intervals (CI) are estimated using the bootstrap method. The assumption of normality is checked by inspecting the quantile-quantile plot of the residuals.
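The step from a fitted coefficient to a percentage reduction is short: at Level $k$ only one dummy variable is active, so subtracting the Level 0 baseline from the regression equation leaves $\ln(R/R_0) = b_k$, and exponentiating gives the relative change. The following sketch illustrates that conversion in both directions; the helper names are hypothetical, and the 35% value is the Level 1 reduction of $R_t$ quoted in the results.

```python
import math

def relative_change(b_k):
    """Relative change of R versus Level 0 implied by a fixed effect b_k."""
    return math.exp(b_k) - 1.0

def fixed_effect(relative_reduction):
    """Inverse mapping: the b_k that corresponds to a given relative reduction."""
    return math.log(1.0 - relative_reduction)

# A 35% reduction corresponds to a negative coefficient on the log scale.
b_level1 = fixed_effect(0.35)
change_level1 = relative_change(b_level1)   # recovers the relative change, about -0.35
```

Because the model is linear on the log scale, each $b_k$ expresses a multiplicative effect on $R$, which is why reductions are reported relative to the country-specific $R_0$ rather than as absolute differences.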
The same procedure is also applied to the analysis of $D_t$ and $p_t$ to quantify the suppression and mitigation factors, respectively. The results are presented in Table 1.

VI. RESULTS

A. Validation on simulated data

We simulated an artificial epidemic outbreak with a time-varying infectiousness profile using the renewal process. The generation time intervals were adapted from Ferretti et al. [5]. The simulation period spans 50 days, and an intensive intervention measure is introduced on day 35, altering the transmission dynamics. Before the intervention, the ground-truth $R_t$ followed a Gaussian random walk with a mean of 2.5. After the intervention (50% $p_t$ reduction and 67% $D_t$ reduction), the mean of $R_t$ was reduced to 0.5 (black line). We validate the effectiveness of our approach in capturing the sudden change in the evolution of $R_t$ induced by interventions, which is hard to detect with traditional sliding window-based methods (Figure 5). We compared the results of our approach (red line with 95% confidence intervals) with the results computed by the R package ‘EpiEstim v2.2’ [12] (blue), a sliding window-based method widely used for $R_t$ estimation. We observed that the ground-truth $R_t$ lies within our confidence interval. In particular, the sharp change of $R_t$ caused by the intervention is captured immediately by our approach, whereas the sliding window-based method shows a lag.

B. Evaluation of Intervention Measures in European Countries

In this part, we applied the proposed framework to analyze the epidemic evolution in the 14 European Countries and also Wuhan. With the inferred < 𝑅 " , 𝑝 " , 𝐷 " >, we can then assess the impacts of intervention measures. Figure 6 demonstrates the reconstruction of daily infections in the UK from the reported confirmed cases. The infected-to-report delay between report and infected time is composed of the incubation period (a lognormal distribution with a mean of 5.5 days and a standard deviation of 2.1 days [5]) and the onset-to-report period (a gamma distribution with a mean of 4.9 days and a standard deviation of 3.3 days [10]). The blue bars in Figure 6 indicate the number of confirmed cases. After deconvolving the confirmed numbers using infected-to-report delay, we got the infected curve, which is colored in red in Fig. 5. Validation of the proposed Bayesian updating scheme. 𝑅 " of the UK from the infected curve. The missing values in the infected curve are replaced by the average mean of the neighbouring numbers. green bar is the posterior mean of estimated 𝑅 " . To quantitatively show the impacts of different strength levels of interventions, Table 1 summarizes the statistical analysis results of 14 European countries. It shows different reduction rates of < 𝑅 " , 𝑝 " , 𝐷 " > for different response levels. The relative reduction of < 𝑅 " , 𝑝 " , 𝐷 " > compared to the minimal response (Level 0 where 𝑅 " is set to 𝑅 ! ) was estimated for each response level. With soft response (Level 1), the corresponding intervention measures (e.g. school closure, quarantine of international arrivals from high-risk regions) are correlated with a relative reduction of 𝑅 " by 35% showing both strong suppression effect ( 𝐷 " shortening 22%) and mitigation effect ( 𝑝 " reduction 29%). With strong response (Level 2), the relative reduction of 𝑅 " increases to 60% with a strong mitigation effect ( 𝑝 " reduction 56%). 
But the suppression effect ( 𝐷 " shortening 26%) is similar to that of Level 1, indicating marginal incremental suppression effect. This observation shows a consistency with the aim of representative intervention measures on this level (e.g. cancelling public events, restrictions on gathering and internal movements) to reduce the contact rates among the population. The emergent response (Level 3) shows substantial relative reduction of reproductive number ( 𝑅 " reduction 71%) with suppression ( 𝐷 " shortening 37%) and mitigation ( 𝑝 " reduction 67%) effects, correlated to the intensive measures (e.g. workplace closure and stay-at-home requirements). A similar degree of reductions is found for Level 4 ( 𝑅 " reduction 74%; 𝐷 " shortening 40%; 𝑝 " reduction 70%) while the stringency of intervention measures is higher. We find that our estimated evolving patterns of 𝑝 " and 𝐷 " correspond well to the serial strategies taken by some European countries, such as the ‘contain-delay-lockdown’ route taken in the UK. Apart from the results of 14 European Countries, Figure 8 also shows the results of applying our method to the data from Wuhan, where the greens bars indicate the posterior mean of 𝑅 " during the outbreak of COVID-19. We can see that at the early stage of the pandemic, the 𝑅 " levels are above 1. After the lockdown intervention has taken effect, 𝑅 " has experienced a sharp decrease from 23 rd Jan. When the centralized quarantine policy has been enforced from the beginning of February, the 𝑅 " values then largely remain below zero (the spike around 14 th Feb is due to misreporting). Figure 9 compares the reductions in < 𝑅 " , 𝑝 " , 𝐷 " > for different response levels between European Countries and Wuhan. From the analysis of Wuhan data, the strong impact of lockdown is clearly demonstrated with the immediate relative reduction of 𝑅 " by 58%. 
We also observed that the combination of lockdown, centralized quarantine and immediate admission of confirmed patients starting from Feb 2nd in Wuhan was associated with a more substantial relative reduction of $R_t$ with strong suppression and mitigation effects.

Fig. 6. Reconstruction of daily infections from the report of confirmed cases in the UK. The forward convolution on reconstructed data (black line) matches well with actual reported data (blue bars), validating the correctness of the deconvolution method.

Fig. 7. Estimated evolution of transmission dynamics in the UK. The black line represents the reconstructed daily infection number and the green bar is the posterior mean of estimated $R_t$.

Fig. 8. Estimated evolution of transmission dynamics in Wuhan. The black line represents the reconstructed daily infection number and the green bar is the posterior mean of estimated $R_t$. Two major events (city lockdown measure from Jan 23rd and centralized quarantine from Feb 2nd) are annotated with red arrows.

C. Resurgence risks in United States

We also used the proposed framework to estimate the epidemic evolution in different states of the United States. We observed that, as of the week ending May 31st, the averaged reproduction number $R_t$ in 30 states exceeded 1 (Figure 10). This could be related to the recent lifting of government restrictions and calls for close monitoring of the epidemic evolution. At the time of preparing this paper (June 18th [...] June 2020 have experienced an increased number of daily confirmed cases compared to that of May 31st, and 14 states have recorded all-time highs after May 31st. When we prepared the final version in early August, this alarming prediction of a second-wave outbreak had unfortunately proven true for all the states listed.

So far, the application of the framework to many countries and the retrospective impact analysis of intervention measures in European countries indicate the effectiveness of our approach in monitoring $R_t$. This can be further validated by predicting the evolution of $p_t$, $D_t$ and $R_t$ and the projected infections in a future study.

Our current study has several limitations. Firstly, the reporting protocols and standards of confirmed cases, as well as the detection rates, vary among countries. However, as long as the reporting bias is consistent over time, the inference results of $p_t$, $D_t$ and $R_t$ should not be affected. We also note that the implementation of multiple intervention measures within a short interval makes it challenging to quantify the impact of a single measure, which needs further statistical analysis.

VII. CONCLUSIONS

In conclusion, we propose a comprehensive Bayesian updating approach for the timely estimation of parameters of COVID-19 epidemic models. The disease transmission dynamics are modelled by renewal equations with time-varying parameters. Instead of purely focusing on estimating the instantaneous reproduction number $R_t$, we introduce two complementary parameters, the mitigation factor ($p_t$) and the suppression factor ($D_t$), to quantify intervention impacts at a finer granularity. A Bayesian updating scheme is adopted to dynamically infer model parameters. By monitoring and analyzing the evolution of the estimated parameters, impacts of intervention measures in different response levels can be quantitatively assessed. We have applied our method to European countries, the United States and Wuhan, and revealed the effects of interventions in these countries and the resurgence risk in the USA. Our work opens a promising avenue to inform policy for better decision-making in response to a possible second-wave outbreak.

ACKNOWLEDGMENT

We express our sincere thanks to all members of the joint analysis team of Imperial College London, the University of Cambridge, the University of Kent and Hong Kong Baptist University. We thank Yuting Xing for helping collect epidemic data in Wuhan and the United States. We thank Siyao Wang and Liqun Wu for their efforts in developing a digital tracing app for validation and visualization.

Fig. 9. The relative reduction of mitigation factor $p_t$ and suppression factor $D_t$ under different response levels compared to minimal response level.

TABLE I. THE RELATIVE REDUCTION OF MITIGATION FACTOR AND SUPPRESSION FACTOR UNDER DIFFERENT RESPONSE LEVELS OF EUROPEAN COUNTRIES

| Response | Representative measures | Impact of measures: $R_t$ relative reduction | Suppression effect: $D_t$ relative reduction | Mitigation effect: $p_t$ relative reduction |
|---|---|---|---|---|
| Level 0 (Minimal response) | No mandatory restrictions | 0 | 0 | 0 |
| Level 1 (Soft response) | Closing schools, International travel controls | 35%, CI: [25%, 45%] | 22%, CI: [17%, 27%] | 29%, CI: [18%, 38%] |
| Level 2 (Strong response) | Cancel public events, Restrictions on gathering, Restrictions on internal movement | 60%, CI: [54%, 65%] | 26%, CI: [21%, 30%] | 56%, CI: [50%, 61%] |
| Level 3 | Close workplace, Close public transport, Stay-at-home requirements | 71%, CI: [68%, 74%] | 37%, CI: [35%, 40%] | 67%, CI: [64%, 70%] |
| Level 4 (Emergent response) | | 74%, CI: [71%, 77%] | 40%, CI: [37%, 42%] | 70%, CI: [66%, 73%] |
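As a worked reading of Table I, the sketch below converts the point estimates of the relative reduction of $R_t$ into an implied reproduction number, using the convention stated in the text that $R_t$ equals $R_0$ at Level 0, so that $R_t = R_0 (1 - \text{reduction})$. The baseline `R0_ASSUMED = 3.0` is a hypothetical value chosen only for illustration and is not taken from the paper; the function names are likewise illustrative, and the confidence intervals are ignored.

```python
# Point estimates of the relative reduction of R_t, copied from Table I.
RT_REDUCTION = {
    "Level 1": 0.35,
    "Level 2": 0.60,
    "Level 3": 0.71,
    "Level 4": 0.74,
}


def implied_rt(r0, reduction):
    """Reproduction number implied by a relative reduction from baseline r0."""
    return r0 * (1.0 - reduction)


def critical_reduction(r0):
    """Smallest relative reduction that brings the reproduction number to 1."""
    return 1.0 - 1.0 / r0


R0_ASSUMED = 3.0  # hypothetical baseline, NOT a value reported in the paper

for level, reduction in RT_REDUCTION.items():
    rt = implied_rt(R0_ASSUMED, reduction)
    status = "below" if rt < 1.0 else "above"
    print(f"{level}: implied R_t = {rt:.2f} ({status} 1)")

print(f"reduction needed for R_t < 1: {critical_reduction(R0_ASSUMED):.1%}")
```

With this assumed baseline, only the Level 3 and Level 4 reductions push the implied $R_t$ under 1. A different baseline shifts the threshold, since the required reduction is $1 - 1/R_0$.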

REFERENCES

[1] E. Dong, H. Du, and L. Gardner, “An interactive web-based dashboard to track COVID-19 in real time,” Lancet Infect. Dis., vol. 20, no. 5, pp. 533–534, May 2020.
[2] T. Hale, A. Petherick, T. Phillips, and S. Webster, “Variation in government responses to COVID-19,” 2020.
[3] A. Pan et al., “Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan, China,” JAMA, vol. 323, no. 19, p. 1915, May 2020.
[4] R. Li et al., “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2),” Science, vol. 3221, no. March, p. eabb3221, 2020.
[5] L. Ferretti et al., “Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing,” Science, vol. 6936, no. March, pp. 1–13, 2020.
[6] E. Vynnycky and R. White, An introduction to infectious disease modelling. OUP Oxford, 2010.
[7] N. C. Grassly and C. Fraser, “Mathematical models of infectious disease transmission,” Nat. Rev. Microbiol., vol. 6, no. 6, pp. 477–487, 2008.
[8] C. Fraser, “Estimating individual and household reproduction numbers in an emerging epidemic,” PLoS One, vol. 2, no. 8, 2007.
[9] J. Ma and D. J. D. Earn, “Generality of the final size formula for an epidemic of a newly invading infectious disease,” Bull. Math. Biol., vol. 68, no. 3, pp. 679–702, 2006.
[10] K. Leung, J. T. Wu, D. Liu, and G. M. Leung, “First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment,” Lancet, vol. 395, no. 10233, pp. 1382–1393, Apr. 2020.
[11] S. Flaxman et al., “Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe,” Nature, pp. 1–5, 2020.
[12] R. N. Thompson et al., “Improved inference of time-varying reproduction numbers during infectious disease outbreaks,” Epidemics, vol. 29, no. August, 2019.
[13] A. Cori, N. M. Ferguson, C. Fraser, and S. Cauchemez, “A new framework and software to estimate time-varying reproduction numbers during epidemics,” Am. J. Epidemiol., vol. 178, no. 9, pp. 1505–1512, 2013.
[14] J. Wallinga and P. Teunis, “Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures,” Am. J. Epidemiol., vol. 160, no. 6, pp. 509–516, 2004.
[15] D. Adam, “A guide to R-the pandemic’s misunderstood metric,” Nature, vol. 583, no. 7816, pp. 346–348, 2020.
[16] N. Imai, I. Dorigatti, A. Cori, C. Donnelly, S. Riley, and N. Ferguson, “Report 2: Estimating the potential total number of novel Coronavirus cases in Wuhan City, China,” 2020.
[17] Q. Li et al., “Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia,” N. Engl. J. Med., vol. 382, no. 13, pp. 1199–1207, 2020.
[18] J. T. Wu, K. Leung, and G. M. Leung, “Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study,” Lancet, vol. 395, no. 10225, pp. 689–697, 2020.
[19] P. Nouvellet et al., “Report 26: Reduction in mobility and COVID-19 transmission.”
[20] M. Asch, M. Bocquet, and M. Nodet, Data assimilation: methods, algorithms, and applications. 2016.
[21] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond,” Statistics (Ber)., vol. 182, no. 1, pp. 1–69, 2003.
[22] C. J. Rhodes and T. D. Hollingsworth, “Variational data assimilation with epidemic models,” J. Theor. Biol., vol. 258, no. 4, pp. 591–602, 2009.
[23] L. M. A. Bettencourt and R. M. Ribeiro, “Real time bayesian estimation of the epidemic potential of emerging infectious diseases,” PLoS One, vol. 3, no. 5, p. e2185, 2008.
[24] L. Cobb, A. Krishnamurthy, J. Mandel, and J. D. Beezley, “Bayesian tracking of emerging epidemics using ensemble optimal statistical interpolation,” Spat. Spatiotemporal. Epidemiol., vol. 10, pp. 39–48, 2014.
[25] E. Goldstein, J. Dushoff, M. Junling, J. B. Plotkin, D. J. D. Earn, and M. Lipsitch, “Reconstructing influenza incidence by deconvolution of daily mortality time series,” Proc. Natl. Acad. Sci. U. S. A., vol. 106, no. 51, pp. 21825–21829, 2009.
[26] C. Mark, C. Metzner, L. Lautscham, P. L. Strissel, R. Strick, and B. Fabry, “Bayesian model selection for complex dynamic systems,” Nat. Commun., vol. 9, no. 1, 2018.
[27] A. Agresti, An introduction to categorical data analysis. John Wiley & Sons, 2018.

Fig. 10. The averaged $R_t$ values in different states of the United States. We report the result of averaged $R_t$ in the US during the week ending May 31st 2020, which is ranked by the averaged $R_t$.