Monitoring the pandemic: A fractional filter for the COVID-19 contact rate

Abstract

This paper aims to provide reliable estimates for the COVID-19 contact rate of a Susceptible-Infected-Recovered (SIR) model. From observable data on confirmed, recovered, and deceased cases, a noisy measurement for the contact rate can be constructed. To filter out measurement errors and seasonality, a novel unobserved components (UC) model is set up. It specifies the log contact rate as a latent, fractionally integrated process of unknown integration order. The fractional specification reflects key characteristics of aggregate social behavior such as strong persistence and gradual adjustments to new information. A computationally simple modification of the Kalman filter is introduced and is termed the fractional filter. It makes it possible to estimate UC models with richer long-run dynamics, and provides a closed-form expression for the prediction error of UC models. Based on the latter, a conditional-sum-of-squares (CSS) estimator for the model parameters is set up that is shown to be consistent and asymptotically normally distributed. The resulting contact rate estimates for several countries are well in line with the chronology of the pandemic, and make it possible to identify different contact regimes generated by policy interventions. As the fractional filter is shown to provide precise contact rate estimates at the end of the sample, it bears great potential for monitoring the pandemic in real time.


Tobias Hartl ∗ a,b
a University of Regensburg, 93053 Regensburg, Germany
b Institute for Employment Research, 90478 Nuremberg, Germany
February 2021


Keywords.

COVID-19, filtering, long memory, SIR model, unobserved components.

JEL-Classification.

C22, C51, C52.

∗ Corresponding author: Department of Economics and Econometrics, University of Regensburg, Universitätsstr. 31, 93053 Regensburg, Germany, email: [email protected]

Introduction

Since the outbreak of COVID-19, reducing social contacts has been widely viewed as the key way to contain the spread of the virus. In terms of the Susceptible-Infected-Recovered (SIR) model, this relates to the contact rate, defined as the average number of contacts per person per time unit multiplied by the probability of disease transmission between a susceptible and an infected individual (Hethcote; 2000). The probability of disease transmission should only depend on characteristics that are specific to the virus. Therefore, the contact rate can be interpreted as a proxy for aggregate social behavior and is the key variable addressed by social distancing measures. Knowing the trajectory of the contact rate would make it possible to draw inference on the impact of policy measures on contact reduction, to monitor the dynamics of virus dispersion in real time, and to design policy rules based on the current pandemic situation. Since the contact rate itself is unobservable, appropriate methods to estimate it are required, and will be considered in this paper.

At the early stage of the pandemic, first estimates for the natural logarithm of the contact rate were obtained by fitting a deterministic, linear time trend with structural breaks to transformations of data on confirmed, recovered, and deceased cases (Hartl, Wälde and Weber; 2020; Lee et al.; 2021; Liu et al.; 2021). Modeling the log contact rate by a piece-wise linear time trend was a reasonable and pragmatic approximation given the short time series on case numbers available at that time. However, it implies that contact rate growth evolves deterministically as a straight line with jumps at the break dates. This assumption is likely to be violated by the behavior of individuals.
While structural breaks may be suitable to identify turning points of the contact rate, they are inappropriate for monitoring the current pandemic situation, as breaks require at least some post-break observations to be well identified.

This paper aims to improve estimates for the contact rate of COVID-19 by taking into account key features of aggregate social behavior. In detail, the log contact rate, denoted by $\log \beta_t$, is modeled as an unobserved, fractionally integrated process of (unknown) order $d \in \mathbb{R}_+$, generated by stochastic shocks $\{\eta_i\}_{i=1}^{t}$. The stochastic specification of the contact rate is motivated by the consideration that social decisions, e.g. on whether to meet, are made conditional on the information available at that time, e.g. on current social distancing measures or the state of the pandemic. As information does not evolve deterministically but appears as stochastic shocks, this suggests treating $\log \beta_t$ as a stochastic process generated by the information shocks $\{\eta_i\}_{i=1}^{t}$. Specifying $\log \beta_t$ as a fractionally integrated process accounts for strong persistence and nonstationarity (in short: long memory) of social behavior. In contrast to structural breaks but also to random walks, the fractional specification allows social behavior to gradually adjust to new information. The integration order $d$ is treated as an unknown parameter to be estimated.

[Footnote: The SIR model, in its various variants, has recently become a popular tool to study the economic impact of the pandemic and for policy simulations, see Acemoglu et al. (2020); Avery et al. (2020); Korolev (2021); Liu et al. (2021) among others.]

[Footnote: Fractional integration techniques have been found useful for describing the aggregate behavior of individuals in a variety of applications, e.g. for explaining the Deaton paradox (Diebold and Rudebusch; 1991) and for the estimation of the business cycle (Hartl, Tschernig and Weber; 2020).]

Methodologically, this paper contributes to the literature on time series filtering by setting up a novel unobserved components (UC) model that does not require prior knowledge about the integration order of the variable under study. Current UC models and related filtering techniques rely heavily on prior assumptions about the integration order $d$ and typically assume $d = 1$ (e.g. Harvey; 1985; Morley et al.; 2003; Chang et al.; 2009) or $d = 2$ (e.g. Clark; 1987; Hodrick and Prescott; 1997; Oh et al.; 2008) to be known. In contrast, the novel UC model reflects that the degree of persistence of the log contact rate is unknown. It decomposes a noisy measurement for the log contact rate, which is based on a transformation of data on confirmed, recovered, and deceased cases, into measurement errors, seasonal components, and the unobserved log contact rate itself. As the latter is modeled by a fractionally integrated process, the model is called the fractional UC model.

The second methodological contribution of this paper is to derive an estimator for the model parameters and the unobserved components that is computationally much simpler than current state space methods. Current methods typically rely on the Kalman filter to set up a conditional (quasi-)likelihood function for the estimation of the model parameters. Given the parameter estimates, a time-varying signal for the unobserved components is then obtained from the Kalman smoother. Both the Kalman filter and smoother become computationally infeasible when the dimension of the state vector of UC models is high, as for fractionally integrated processes. To address this problem, this paper proposes a computationally simple modification of the Kalman filter and smoother that is termed the fractional filter.
While filtered and smoothed estimates from the fractional filter are identical to those from the Kalman filter and smoother, the fractional filter avoids the computationally intensive recursions for the conditional variance. The fractional filter provides a closed-form expression for the prediction error of UC models, based on which a conditional-sum-of-squares (CSS) estimator for the fractional integration order and other model parameters is set up. While the CSS estimator has been found useful for the estimation of ARFIMA models, see Hualde and Robinson (2011) and Nielsen (2015), it has not been considered in the UC literature so far. The CSS estimator minimizes the sum of squared prediction errors, which is proportional to the exponent in the conditional (quasi-)likelihood function based on the Kalman filter. Due to the computational gains from the fractional filter, the CSS estimator makes it possible to estimate UC models with richer long-run dynamics. The paper provides the asymptotic theory for the CSS estimator, showing it to be consistent and asymptotically normally distributed, while the finite sample properties are assessed by a Monte Carlo study.

Using data from the Johns Hopkins University Center for Systems Science and Engineering (Dong et al.; 2020, JHU CSSE), estimates for the contact and reproduction rates are presented for Canada, Germany, Italy, and the United States, where benefits from the new methods directly become apparent: First, estimation results are not only well in line with the chronology of the pandemic, but also make it possible to identify different contact regimes generated by the strengthening and easing of contact restrictions. Second, a recursive window evaluation shows contact rate estimates at the end of a truncated sample to largely overlap with those based on the full sample information. This makes the fractional filter a suitable candidate for monitoring outbreaks at the current frontier of the data.
And third, the proposed estimation and filtering techniques are shown to be fairly robust to under-reporting of recovered cases, which is of particular importance for the US, as several states do not report data on recovered individuals. While under-reporting heavily downward-biases contact and reproduction rate estimates in Lee et al. (2021), this is shown not to be the case for the fractional filter.

The remainder of the paper is organized as follows: Section 2 motivates the specification of the contact rate and sets up the fractional UC model. Section 3 introduces the fractional filter for $\log \beta_t$, covers parameter estimation via the CSS estimator and presents the asymptotic theory. Section 4 contains empirical results for Canada, Germany, Italy, and the United States, while section 5 concludes. The appendices include proofs for consistency and asymptotic normality of the CSS estimator as well as a Monte Carlo study on the finite sample properties.

To motivate the estimation of the contact rate, consider the discrete SIR model, augmented to include deaths, which also forms the starting point of Pindyck (2020, eqn. 1–4) and Lee et al. (2021, eqn. 2.1)

$$ 1 = S_t + I_t + D_t + R_t, \tag{1} $$
$$ \Delta I_t = \beta_t S_{t-1} I_{t-1} - \gamma I_{t-1}, \tag{2} $$
$$ \Delta D_t = \gamma_d I_{t-1}, \tag{3} $$
$$ \Delta R_t = \gamma_r I_{t-1}. \tag{4} $$

In (1), the (initial) population size, normalized to be one, is decomposed into $S_t$, the proportion of the population susceptible in $t$, $I_t$, the fraction of the population infected in $t$, $D_t$, the fraction that has died until $t$, and $R_t$, the proportion that has recovered until $t$. In (2), $\gamma = \gamma_d + \gamma_r$ denotes the rate at which infected either die, see (3), or recover, see (4), and obviously $\gamma_d, \gamma_r \geq 0$. $\gamma I_{t-1}$ denotes the fraction of outflows of infected at $t$. The fraction of new infections at $t$ is captured by $\beta_t S_{t-1} I_{t-1}$, where $S_{t-1} I_{t-1}$ can be interpreted as the average probability of a contact being between a susceptible subject and an infected subject. $\beta_t > 0$ is the contact rate, from which the reproduction rate $\mathcal{R}_t = \beta_t / \gamma$ can be derived.
It is the average number of infections caused by an infected subject during the infectious period $1/\gamma$ at the early stage of the pandemic (where $S_{t-1} \approx 1$). $\mathcal{R}_t$ is an indicator for the current dynamics of the pandemic, as for $\mathcal{R}_t < 1$ one expects $I_t$ to converge, see (2) where $0 \leq S_{t-1} \leq 1$. Thus, if policy seeks to contain the spread of COVID-19, then it must control the contact rate, which controls the reproduction rate $\mathcal{R}_t$.

As shown by Lee et al. (2021), from (1) to (4) a measurement for the contact rate $\beta_t$ can be obtained directly: Denote $C_t = I_t + R_t + D_t$ as the fraction of confirmed cases (consisting of infected, recovered, and deceased cases) and use $\Delta C_t = \Delta I_t + \Delta R_t + \Delta D_t$ together with (2) to (4) to obtain $\Delta C_t = \beta_t S_{t-1} I_{t-1} - \gamma I_{t-1} + (\gamma_d + \gamma_r) I_{t-1} = \beta_t S_{t-1} I_{t-1}$. Solving for $\beta_t$ yields

$$ \beta_t = \frac{\Delta C_t}{I_{t-1} S_{t-1}} =: Y_t, \tag{5} $$

see Lee et al. (2021, eqn. 2.2). As argued there, if for each $t$ the data $(C_t, R_t, D_t)$ can be observed, then the time-varying contact rate can be calculated straightforwardly via (5) using $S_t = 1 - C_t$, as well as $I_t = C_t - R_t - D_t$.

Unfortunately, reported case numbers for $C_t$, $R_t$, and $D_t$, such as the daily data from JHU CSSE used in the applications in section 4, suffer from measurement errors, see e.g. Hortaçsu et al. (2021). In addition, they display a strong weekly seasonal pattern that is likely to be driven by a varying number of tests conducted over the different days of the week (Bergman et al.; 2020). Under the assumption that $Y_t$ is measured with a proportionally constant error variance resulting from seasonality and measurement errors, one has the following structure for the natural logarithm of the observable $\tilde{Y}_t$.

Assumption 1 (Multiplicative seasonal and measurement errors). For each $t$, the observable $\tilde{Y}_t$ satisfies

$$ \log \tilde{Y}_t = \log Y_t + \sum_{i=1}^{7} \alpha_i s_{i,t} + u_t = \log \beta_t + \sum_{i=1}^{7} \alpha_i s_{i,t} + u_t, \qquad t = 1, ..., n, $$

with $Y_t$ as given in (5). $s_{i,t}$ are seasonal dummies for $i = 1, ..., 7$ that capture the weekly patterns of reported case numbers, $\sum_{i=1}^{7} \alpha_i = 0$, and the measurement error $u_t \sim WN(0, \sigma_u^2)$ is white noise.
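As an illustration of how the measurement in (5) could be computed from cumulative case counts, the following minimal sketch may help. The function name `contact_rate_measurement` and its arguments are hypothetical and not taken from the paper; the inputs are assumed to be cumulative counts for one country, divided by a known population size.

```python
import numpy as np

def contact_rate_measurement(confirmed, recovered, deceased, population):
    """Noisy contact rate Y_t = dC_t / (I_{t-1} * S_{t-1}), following eq. (5).

    Inputs are cumulative counts (hypothetical interface, not from the paper);
    they are converted to population shares before applying the formula.
    """
    C = np.asarray(confirmed, dtype=float) / population
    R = np.asarray(recovered, dtype=float) / population
    D = np.asarray(deceased, dtype=float) / population
    I = C - R - D            # currently infected share, I_t = C_t - R_t - D_t
    S = 1.0 - C              # susceptible share, S_t = 1 - C_t
    dC = np.diff(C)          # new confirmed cases, Delta C_t for t = 2, ..., n
    return dC / (I[:-1] * S[:-1])
```

The returned array has one element fewer than the inputs, since the first difference and the lagged shares are only defined from the second observation onwards. Taking `np.log` of the result would give the quantity that Assumption 1 decomposes.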
Assumption 1 specifies an unobserved components (UC) model where the observable noisy measurement $\log \tilde{Y}_t$ is decomposed into an unobservable measurement error $u_t$, seasonal components $\sum_{i=1}^{7} \alpha_i s_{i,t}$, and the log contact rate $\log \beta_t$. The log specification accounts for a proportional impact of measurement errors and seasonality, and forces the contact rate to be strictly positive.

As the different components are not separately identified, an additional assumption on the dynamic structure of the contact rate is required. Empirical models of COVID-19 case numbers have so far assumed $\log \beta_t$ to follow a piece-wise linear time trend with structural breaks, see Hartl, Wälde and Weber (2020); Lee et al. (2021); Liu et al. (2021). As an alternative, the UC literature suggests modeling time-varying coefficients as random walks (see Durbin and Koopman; 2012, for an overview). Both specifications assume contact rate growth $\Delta \log \beta_t$ to be affected only contemporaneously, either by structural breaks or by stochastic shocks, an assumption that is likely to be violated. Reflecting that the persistence properties of social behavior, and thus of the contact rate, are unknown, assumption 2 specifies the log contact rate as a fractionally integrated process of unknown order $d$.

Assumption 2 (Specification of the contact rate). The log contact rate follows a type II fractionally integrated process of order $d \in \mathbb{R}_+$, denoted as $\log \beta_t \sim I(d)$, where

$$ \log \beta_t = \mu + x_t, \qquad \Delta_+^d x_t = \eta_t, \qquad \eta_t \sim WN(0, \sigma_\eta^2), \qquad t = 1, ..., n, $$

$\mu$ is an intercept, and the $\eta_t$ are white noise and are independent of the measurement error $u_t$.

Under assumption 2, the log contact rate $\log \beta_t$ is a stochastic long memory process generated by the shocks $\{\eta_i\}_{i=1}^{t}$. The shock $\eta_t$ models the information new in $t$, such as news reports or policy announcements. Social decisions, reflected in $\log \beta_t$, however may additionally depend on past information $\eta_{t-1}, ..., \eta_1$.
Together, $\{\eta_i\}_{i=1}^{t}$ forms the information available at $t$, conditional on which social decisions, e.g. on whether to meet, are made. The specification takes into account that new information does not evolve deterministically, but appears as stochastic shocks, which cannot be captured by a deterministic specification as e.g. in Lee et al. (2021).

The degree of persistence of the log contact rate is determined by the integration order $d$, which controls for the persistent impact of past shocks via the fractional difference operator $\Delta_+^d$. The latter exhibits a polynomial expansion in the lag operator $L$ of infinite order

$$ \Delta^d = (1 - L)^d = \sum_{i=0}^{\infty} \pi_i(d) L^i, \qquad \pi_i(d) = \begin{cases} \frac{i - d - 1}{i} \pi_{i-1}(d) & i = 1, 2, ..., \\ 1 & i = 0. \end{cases} \tag{6} $$

The $+$-subscript denotes a truncation of an operator at $t \leq 0$, $\Delta_+^d x_t = \sum_{i=0}^{t-1} \pi_i(d) x_{t-i}$, which reflects the type II definition of fractionally integrated processes (Marinucci and Robinson; 1999). For $d = 1$ the log contact rate is a random walk, which follows from plugging $d = 1$ into (6). Consequently, assumption 2 encompasses the predominant specification in the UC literature. However, assumption 2 allows for a far more general dynamic impact of past shocks $\eta_1, ..., \eta_t$ on $\log \beta_t$, as can be seen by plugging $x_t = \Delta_+^{-d} \eta_t$ into $\log \beta_t = \mu + x_t$, which gives

$$ \log \beta_t = \mu + \Delta_+^{-d} \eta_t = \mu + \sum_{i=0}^{t-1} \pi_i(-d) \eta_{t-i}. \tag{7} $$

While a random walk is an unweighted sum of past shocks $\eta_1, ..., \eta_t$, so that $\pi_i(-1) = 1$ for all $i = 1, ..., t-1$, allowing $d \neq 1$ yields non-uniform weights of past shocks in the impulse response function of $\log \beta_t$ and thus a gradual adjustment of the log contact rate to new information. This reflects that social behavior adjusts step-wise to new information both at the individual and the aggregate level. As processing new information on the Coronavirus and revising individual decisions (e.g. meeting friends, traveling, working from home) takes time and evolves gradually, individuals can be expected to step-wise adjust their contacts in response to new information. Overall, individuals will react heterogeneously both in terms of speed and intensity to novel information: Some will anticipate new information faster than others, and the extent of reaction will depend on individual characteristics such as risk awareness and attitudes. Such gradual adjustments are well captured by assumption 2, in particular when $1 < d < 2$: In that case, contact rate growth $\Delta \log \beta_t \sim I(d - 1)$ is strongly persistent and mean-reverting, as will become apparent in the applications in section 4. Strong persistence reflects the gradual adjustment of social behavior to new information, while mean-reversion ensures an asymptotically declining impact of past information on today's contact rate growth.

The remaining assumptions are imposed mainly for technical reasons. The type II definition of fractional integration assumes zero starting values for the fractionally integrated process by truncating the polynomial expansion of the fractional difference operator, $\Delta_+^d x_t = \sum_{i=0}^{t-1} \pi_i(d) x_{t-i}$. It is required to treat the asymptotically stationary ($d < 1/2$, from now on 'stationary' for brevity) and the asymptotically nonstationary case ($d > 1/2$, from now on 'nonstationary') alongside each other. While the type II definition may be a strong assumption for some time series, it is plausible for the contact rate, as we have data covering roughly the whole pandemic. Thus, the pre-sample shocks $\eta_i$, $i \leq 0$, should be zero. Independence of $u_t$ and $\eta_t$ follows from the characterization of $u_t$ as a measurement error that should not influence the contact rate. In general, the assumption can be relaxed to allow for $\mathrm{Corr}(\eta_t, u_t) \neq 0$, as for instance in correlated UC models (Morley et al.; 2003), and will not affect the asymptotic results in section 3. The distributional assumptions on $\eta_t$ and $u_t$ are somewhat weaker than the assumption of Gaussian white noise on which UC models typically rely (Morley et al.; 2003). They will be shown to be largely satisfied in the applications of section 4. Finally, $d > 0$ is required to distinguish between $\log \beta_t$ and $u_t$.

In this section, the fractional filter is derived. It is a computationally simple modification of the Kalman filter that avoids the Kalman recursions for the conditional variance. The modification is necessary, as the Kalman filter becomes computationally infeasible for UC models when the dimension of the state vector is high, as for fractionally integrated processes. The fractional filter provides a closed-form expression for the prediction error of the UC model. Based on that, a conditional-sum-of-squares (CSS) estimator for the model parameters is set up. It minimizes the sum of squared prediction errors obtained from the fractional filter. The CSS estimator is shown to be consistent and asymptotically normally distributed. Given the CSS parameter estimates, the log contact rate can be estimated by the fractional filter given the full sample information. Finally, estimation of the mean and seasonal components is considered.

Under assumptions 1 and 2, the fractional UC model is given by

$$ \log \tilde{Y}_t = \log \beta_t + \sum_{i=1}^{7} \alpha_i s_{i,t} + u_t, \qquad \log \beta_t = \mu + x_t, \qquad x_t = \Delta_+^{-d} \eta_t, \qquad t = 1, ..., n. \tag{8} $$

Denote $\mu_0, \alpha_{1,0}, ..., \alpha_{7,0}, d_0, \sigma_{\eta,0}^2, \sigma_{u,0}^2$ as the true parameters of the data-generating mechanism. Leaving aside the deterministic terms for the moment, by defining $y_t = \log \tilde{Y}_t - \mu - \sum_{i=1}^{7} \alpha_i s_{i,t}$, the stochastic part of the fractional UC model (8) is

$$ y_t = x_t + u_t, \qquad x_t = \Delta_+^{-d} \eta_t, \qquad t = 1, ..., n. \tag{9} $$

In the following, let $\theta = (d, \sigma_\eta^2, \sigma_u^2)' \in \Theta$ denote the vector holding the parameters of (9), and let $\theta_0 = (d_0, \sigma_{\eta,0}^2, \sigma_{u,0}^2)' \in \Theta$, where $\Theta = D \times \Omega_\eta \times \Omega_u$ denotes the parameter space with $D = \{d \in \mathbb{R} \mid 0 < d \leq d_{max}\}$ and $\Omega_i = \{\sigma_i^2 \in \mathbb{R} \mid 0 < \sigma_i^2 < \infty\}$, $i = \eta, u$. Define $\mathcal{F}_t$ as the $\sigma$-algebra generated by $y_1, ..., y_t$, and let the expected value operator $\mathrm{E}_\theta(z_t)$ of an arbitrary random variable $z_t$ denote that expectation is taken with respect to the distribution of $z_t$ given $\theta$, so that $\mathrm{E}_{\theta_0}(z_t) = \mathrm{E}(z_t)$. Furthermore, let $\Sigma^{(i,j)}$ denote the $(i,j)$-th entry of an arbitrary matrix $\Sigma$.

Estimation of the parameters $\theta_0$ is carried out by the CSS estimator that minimizes the sum of squared prediction errors of model (9). The prediction error is defined as the one-step ahead forecast error of $y_{t+1}$ given $\mathcal{F}_t$

$$ v_{t+1}(\theta) = y_{t+1} - \mathrm{E}_\theta(y_{t+1} \mid \mathcal{F}_t) = y_{t+1} - \mathrm{E}_\theta(x_{t+1} \mid \mathcal{F}_t). \tag{10} $$

It depends on $\mathrm{E}_\theta(x_{t+1} \mid \mathcal{F}_t)$, for which the fractional filter provides an analytical solution. The filter is introduced in the following lemma.

Lemma 3.1 (Fractional filter for $x_{t+1}$ given $\mathcal{F}_t$). Under assumptions 1 and 2

$$ \mathrm{E}_\theta(x_{t+1} \mid \mathcal{F}_t) = \sum_{i=1}^{t} \pi_i(-d) \, \Sigma^{(i,\cdot)}_{\eta_{t:1} y_{t:1}} \Sigma^{-1}_{y_{t:1}} y_{t:1}, $$

where $y_{t:1} = (y_t, ..., y_1)'$, $\eta_{t:1} = (\eta_t, ..., \eta_1)'$, $\Sigma_{\eta_{t:1} y_{t:1}} = \mathrm{Cov}_\theta(\eta_{t:1}, y_{t:1})$, and $\Sigma_{y_{t:1}} = \mathrm{Var}_\theta(y_{t:1})$.
The superscript in $\Sigma^{(i,\cdot)}_{\eta_{t:1} y_{t:1}}$ denotes the $i$-th row of the matrix, and

$$ \Sigma^{(i,j)}_{\eta_{t:1} y_{t:1}} = \begin{cases} \pi_{i-j}(-d) \sigma_\eta^2 & \text{if } i \geq j, \\ 0 & \text{else,} \end{cases} \qquad \Sigma^{(i,j)}_{y_{t:1}} = \begin{cases} \sigma_u^2 + \sigma_\eta^2 \sum_{k=0}^{t-i} \pi_k^2(-d) & \text{if } i = j, \\ \sigma_\eta^2 \sum_{k=0}^{t-\max(i,j)} \pi_k(-d) \pi_{k+|i-j|}(-d) & \text{else.} \end{cases} $$

The proof is contained in appendix C. As can be seen from lemma 3.1, the fractional filter provides a solution for $\mathrm{E}_\theta(x_{t+1} \mid \mathcal{F}_t)$ that only depends on $\theta$ and $y_1, ..., y_t$. By plugging it into (10), one has the closed-form expression for the prediction error

$$ v_{t+1}(\theta) = y_{t+1} - \sum_{i=1}^{t} \pi_i(-d) \, \Sigma^{(i,\cdot)}_{\eta_{t:1} y_{t:1}} \Sigma^{-1}_{y_{t:1}} y_{t:1}. \tag{11} $$

Based on (11) the objective function of the CSS estimator for $\theta_0$ is set up

$$ \hat{\theta} = \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{t=1}^{n} v_t^2(\theta). \tag{12} $$

Note that estimating the parameters of the fractional UC model via the CSS estimator (12) in combination with the fractional filter deviates from the methodological state space literature: There, expectation and variance of $x_{t+1}$ conditional on $\mathcal{F}_t$ are typically obtained from the Kalman filter recursions (see e.g. Durbin and Koopman; 2012, ch. 4.3). The resulting prediction error and its conditional variance then enter the Gaussian (quasi-)likelihood function that is maximized to estimate $\theta_0$. However, the Kalman filter becomes computationally infeasible when the dimension of the state vector is high, as for fractionally integrated processes. Thus, a computationally simpler filter is required. The fractional filter, as defined in lemma 3.1, is a modification of the Kalman filter: Its solution for $\mathrm{E}_\theta(x_{t+1} \mid \mathcal{F}_t)$ is identical to the Kalman filter (see Durbin and Koopman; 2012, ch. 4.2), but it avoids the Kalman recursions for the conditional variance of $x_{t+1}$. While the conditional variance is necessary for (quasi-)maximum likelihood estimation, the CSS estimator only requires a closed-form expression for the prediction error, for which the fractional filter is sufficient.
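The projection in lemma 3.1 and the objective in (12) can be sketched numerically as follows. This is a naive illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the vectors are kept in chronological order (a reordering of $y_{t:1}$ that leaves the projection unchanged), the noise parameters are passed as standard deviations, and a separate linear system is solved for every $t$, so it is only meant for short series.

```python
import numpy as np

def frac_weights(d, n):
    """Coefficients pi_0(d), ..., pi_{n-1}(d) of (1 - L)^d via the recursion in (6)."""
    pi = np.ones(n)
    for j in range(1, n):
        pi[j] = pi[j - 1] * (j - d - 1) / j
    return pi

def prediction_errors(y, d, sigma_eta, sigma_u):
    """One-step-ahead errors v_t = y_t - E(x_t | y_1, ..., y_{t-1}) for model (9)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    w = frac_weights(-d, n)                       # pi_j(-d), weights of Delta_+^{-d}
    lag = np.subtract.outer(np.arange(n), np.arange(n))
    # A[t, s] = pi_{t-s}(-d) for s <= t, so that x = A @ eta (type II process)
    A = np.where(lag >= 0, w[np.clip(lag, 0, None)], 0.0)
    cov_x = sigma_eta**2 * (A @ A.T)              # Var(x_{1:n})
    var_y = cov_x + sigma_u**2 * np.eye(n)        # Var(y_{1:n})
    v = np.empty(n)
    v[0] = y[0]                                   # no information before t = 1
    for t in range(1, n):
        # Cov(x_{t+1}, y_{1:t}) equals Cov(x_{t+1}, x_{1:t}) because u_t is independent
        coef = np.linalg.solve(var_y[:t, :t], cov_x[t, :t])
        v[t] = y[t] - coef @ y[:t]
    return v

def css_objective(theta, y):
    """Average squared prediction error as in (12); theta = (d, sigma_eta, sigma_u)."""
    d, sigma_eta, sigma_u = theta
    return np.mean(prediction_errors(y, d, sigma_eta, sigma_u) ** 2)
```

A numerical optimizer could then be applied to `css_objective` over a bounded parameter region. As a plausibility check, for $d = 1$ and a negligible measurement error the prediction for $y_{t+1}$ collapses to $y_t$, so the prediction errors are the first differences of the series.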
The objective function of the CSS estimator in (12) is of course proportional to the exponent in the conditional Gaussian (quasi-)likelihood function. However, CSS estimation is computationally much simpler due to the fractional filter. Together, the fractional filter and the CSS estimator provide a computationally feasible alternative to the Kalman filter and the (quasi-)maximum likelihood estimator, particularly for UC models with richer long-run dynamics.

While the asymptotic theory of the CSS estimator is well established for ARFIMA models, see Hualde and Robinson (2011) and Nielsen (2015), it has not yet been derived for structural UC models. To fill this gap, theorems 3.2 and 3.3 summarize the asymptotic estimation theory for the CSS estimator for fractional UC models. In addition, the finite sample properties are addressed by a Monte Carlo study in appendix B. For consistency and asymptotic normality of the CSS estimator, the moment assumptions on the shocks $\eta_t$, $u_t$ need to be strengthened.

Assumption 3 (Higher moments of $\eta_t$, $u_t$). The conditional moments of $\eta_t$, $u_t$ (conditional on past $\eta_{t-1}, \eta_{t-2}, ...$, and $u_{t-1}, u_{t-2}, ...$) are finite up to order four and equal the unconditional moments.

Theorem 3.2 (Consistency). Under assumptions 1 to 3 the CSS estimator $\hat{\theta}$ is consistent, $\hat{\theta} \xrightarrow{p} \theta_0$ as $n \to \infty$.

The proof of theorem 3.2 is given in appendix C and is carried out as follows: First, the model in (9) is shown to be identified. Next, $v_t(\theta)$, as given in (11), is shown to be integrated of order $d_0 - d$, and thus is stationary for $d_0 - d < 1/2$ and nonstationary for $d_0 - d > 1/2$ [...] $d_0 - d = 1/2$ [...] $\Theta$. Adopting the results of Hualde and Robinson (2011) and Nielsen (2015), who show for ARFIMA models encompassing the reduced form of (9) that the probability of the CSS estimator staying in the region of the parameter space where $v_t(\theta)$ is nonstationary is asymptotically zero, the relevant region of $\Theta$ asymptotically reduces to the region where $d_0 - d < 1/2$.

Theorem 3.3 (Asymptotic normality).
Under assumptions 1 to 3 the CSS estimator $\hat{\theta}$ is asymptotically normally distributed, $\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} \mathrm{N}(0, \Omega_0^{-1})$ as $n \to \infty$.

The proof of theorem 3.3 is again contained in appendix C. Since the CSS estimator is consistent, the asymptotic distribution theory is inferred from a Taylor expansion of the score function about $\theta_0$. A central limit theorem is shown to hold for the score function at $\theta_0$, together with a uniform weak law of large numbers for the Hessian matrix. The latter makes it possible to evaluate the Hessian matrix in the Taylor expansion of the score function at $\theta_0$. Thus, the asymptotic distribution of the CSS estimator, as given in theorem 3.3, can be inferred from solving the Taylor expansion for $\sqrt{n}(\hat{\theta} - \theta_0)$. As usual in the state space literature, no analytical solution to the asymptotic variance of the CSS estimator can be provided. The parameters of the reduced form depend non-trivially on $\theta$, so that the partial derivatives of the reduced form cannot be analytically derived. However, from theorem 3.3 it follows that an estimate for the parameter covariance matrix can be obtained from the negative inverse of the Hessian matrix computed in the numerical optimization.

Estimation of the latent component $x_t$ in (9) is considered next. In line with the methodological literature on state space models, $x_t$ is estimated by plugging the CSS estimates $\hat{\theta}$ into the projection

$$ x_{t|n}(\theta) = \mathrm{Cov}_\theta(x_t, y_{n:1}) \mathrm{Var}_\theta(y_{n:1})^{-1} y_{n:1} = \sum_{i=0}^{t-1} \pi_i(-d) \, \Sigma^{(i,\cdot)}_{\eta_{t:1} y_{n:1}} \Sigma^{-1}_{y_{n:1}} y_{n:1}, \tag{13} $$

where $y_{n:1} = (y_n, ..., y_1)'$, $\Sigma_{\eta_{t:1} y_{n:1}} = \mathrm{Cov}_\theta(\eta_{t:1}, y_{n:1})$, and $\Sigma_{y_{n:1}} = \mathrm{Var}_\theta(y_{n:1})$. The superscript in $\Sigma^{(i,\cdot)}_{\eta_{t:1} y_{n:1}}$ denotes the $i$-th row of the matrix, and

$$ \Sigma^{(i,j)}_{\eta_{t:1} y_{n:1}} = \begin{cases} \pi_{n-t+i-j}(-d) \sigma_\eta^2 & \text{if } n - j \geq t - i, \\ 0 & \text{else.} \end{cases} $$

The entries of $\Sigma_{y_{n:1}}$ follow from lemma 3.1 by setting $t = n$.
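The full-sample projection in (13) can be written compactly once the covariance of $x_1, ..., x_n$ is available. The sketch below is an illustration under assumptions rather than the paper's code: the function name `fractional_smoother` is hypothetical, the vectors are kept in chronological order, and the noise parameters are passed as standard deviations.

```python
import numpy as np

def fractional_smoother(y, d, sigma_eta, sigma_u):
    """Smoothed x_{t|n} = Cov(x_t, y) Var(y)^{-1} y for t = 1, ..., n, as in (13)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    w = np.ones(n)
    for j in range(1, n):
        w[j] = w[j - 1] * (j + d - 1) / j         # pi_j(-d) from the recursion in (6)
    lag = np.subtract.outer(np.arange(n), np.arange(n))
    # A[t, s] = pi_{t-s}(-d) for s <= t, so that x = A @ eta
    A = np.where(lag >= 0, w[np.clip(lag, 0, None)], 0.0)
    cov_x = sigma_eta**2 * (A @ A.T)              # Cov(x, y) = Var(x), as u is independent
    var_y = cov_x + sigma_u**2 * np.eye(n)        # Var(y)
    return cov_x @ np.linalg.solve(var_y, y)      # all n projections at once
```

Two limiting cases give a quick sanity check: with a negligible measurement error the smoothed path reproduces the data, and with negligible shocks $\eta_t$ it shrinks to zero.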
For $\theta = \theta_0$, $x_{t|n}(\theta)$ is the minimum variance linear unbiased estimator given $y_1, ..., y_n$, see Durbin and Koopman (2012, lemma 2). Due to theorem 3.2, this property holds asymptotically for $x_{t|n}(\hat{\theta})$. Note that (13) is identical to the Kalman smoother, see Durbin and Koopman (2012, ch. 4.4). However, (13) is computationally much simpler, as it avoids the computationally intensive Kalman recursions for the conditional variance. In line with lemma 3.1, (13) is the fractional filter for $x_t$ given $\mathcal{F}_n$.

Finally, estimation of the seasonal components $\alpha_{i,0}$, $i = 1, ..., 7$, and $\mu_0$ is considered. Theoretically, all parameters of the model in (8) could be estimated jointly by the CSS estimator. But as Tschernig et al. (2013) explain, including deterministic terms in the optimization can lead to poor results in finite samples for fractionally integrated processes, particularly when $d$ is close to unity, as the deterministic terms suffer from poor identification. They provide simulation evidence and a line of reasoning explaining why the following two-step estimator is more robust: In the first step, the integration order $d_0$ is estimated using the exact local Whittle estimator of Shimotsu (2010), which allows for unknown deterministic terms and yields $\hat{d}_{EW}$. Based on $\hat{d}_{EW}$, the deterministic terms $\mu_0, \alpha_{1,0}, ..., \alpha_{7,0}$ in

$$ \Delta_+^{\hat{d}_{EW}} \log \tilde{Y}_t = \Delta_+^{\hat{d}_{EW}} \mu + \sum_{i=1}^{7} \alpha_i \Delta_+^{\hat{d}_{EW}} s_{i,t} + \mathrm{error}_t, \tag{14} $$

are estimated by ordinary least squares. In the second step, the objective function of the CSS estimator in (12) is minimized for the adjusted $\log \tilde{Y}_t - \hat{\mu} - \sum_{i=1}^{7} \hat{\alpha}_i s_{i,t}$.

As an alternative to (14), one could also eliminate the seasonal components by averaging over seven neighboring observations, as the seasonal coefficients sum to zero over any seven consecutive days, $\sum_{i=1}^{7} \alpha_{i,0} = 0$. The intercept in

$$ \Delta_+^{\hat{d}_{EW}} \frac{1}{7} \sum_{i=0}^{6} \log \tilde{Y}_{t-q+i} = \Delta_+^{\hat{d}_{EW}} \mu + \mathrm{error}_t, \qquad 0 \leq q \leq 6, \tag{15} $$

could then be estimated by ordinary least squares. $q$ determines whether averages are calculated solely based on past data ($q = 6$), based on centered data around $t$ ($q = 3$), or based on future data ($q = 0$). While the second approach does not require estimating $\alpha_{1,0}, ..., \alpha_{7,0}$, averaging over seven days smooths out potential kinks in the contact rate, which is problematic. Furthermore, averaging may pollute the estimates of $x_t$ and induce spurious long memory. Finally, the choice of $q$ is not trivial: While for forecasting purposes $q = 6$ is adequate, choosing $q = 0$ is likely to account best for the delay in reporting of case numbers, and obviously $q = 3$ may be a good compromise between the two options.
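The least squares step behind (14) can be illustrated as follows. This is a hedged sketch with hypothetical names: an estimate `d_hat` is taken as given (the exact local Whittle step is not implemented here), the first observation is assumed to fall on the day of the first dummy, and the restriction that the seasonal coefficients sum to zero is imposed by substituting out the last coefficient, which is one of several equivalent ways to handle the collinearity between the intercept and seven dummies.

```python
import numpy as np

def frac_diff(x, d):
    """Type II fractional difference: sum_{i=0}^{t-1} pi_i(d) x_{t-i}."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    pi = np.ones(n)
    for j in range(1, n):
        pi[j] = pi[j - 1] * (j - d - 1) / j
    # the first n terms of the full convolution are the truncated filter output
    return np.convolve(x, pi)[:n]

def estimate_deterministics(log_y, d_hat, period=7):
    """OLS for the intercept and seasonal coefficients in a regression like (14)."""
    log_y = np.asarray(log_y, dtype=float)
    n = log_y.size
    day = np.arange(n) % period                   # assumed day-of-week labelling
    S = (day[:, None] == np.arange(period)[None, :]).astype(float)
    # impose sum(alpha) = 0 by setting alpha_7 = -(alpha_1 + ... + alpha_6)
    Z = np.column_stack([np.ones(n), S[:, :-1] - S[:, [-1]]])
    Zd = np.column_stack([frac_diff(Z[:, k], d_hat) for k in range(Z.shape[1])])
    coef, *_ = np.linalg.lstsq(Zd, frac_diff(log_y, d_hat), rcond=None)
    mu = coef[0]
    alpha = np.append(coef[1:], -coef[1:].sum())
    adjusted = log_y - mu - S @ alpha             # input for the CSS step
    return mu, alpha, adjusted
```

The third return value corresponds to the adjusted series on which the CSS objective would then be minimized.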
In the applications $\mu_0, \alpha_{1,0}, ..., \alpha_{7,0}$ will be estimated via (14).

In this section, estimation results for the time-varying contact rate $\beta_t$ are presented for Canada, Germany, Italy, and the United States. The underlying data on confirmed, recovered, and deceased cases stem from the JHU CSSE. As in Lee et al. (2021) and Liu et al. (2021), $t = 1$ is set once the number of cumulative cases reaches 100. Prior smoothing as suggested by Lee et al. (2021) and Liu et al. (2021), who use one-sided three-day rolling averages to smooth the data, is avoided, as this likely blurs the kinks in the contact rate that occur due to containment measures.

Instead of smoothing out seasonality, the data are adjusted for weekly seasonal patterns as described at the end of section 3 using (14). The bandwidth for the exact local Whittle estimator in (14) is set to $m = \lfloor n^{[...]} \rfloor$, which is justified by the Monte Carlo study in appendix B. Based on the seasonally adjusted data, the parameters $\theta$ are estimated via the CSS estimator (12), where 100 combinations of starting values for $\theta$ are drawn from uniform distributions with appropriate support (in particular $d \in [0.$ [...] $\mu$ in (14), yields the log contact rate estimate $\log \hat\beta_t$.

The average infected period is required for $R_t$ and is estimated by solving (2) for $\gamma$ and taking the average

$$\hat\gamma = \frac{1}{n-1} \sum_{t=2}^{n} \left[ \hat\beta_t S_{t-1} - \frac{\Delta I_t}{I_{t-1}} \right]. \qquad (16)$$

This reflects that the definition of recovered varies over the countries under study, particularly as non-hospitalized persons are typically assumed to have recovered $h$ days after they tested positive, and $h$ varies over the countries under study. The choice of $h$ proportionally affects the number of currently infected $I_t$, and thus $\beta_t$ is inversely proportional to $h$ by (5). Consequently, only the dynamics of $\beta_t$ should be compared over the different countries, not the absolute numbers. In contrast, the reproduction rate $R_t = \beta_t / \gamma$ accounts for the different $h$ when $\gamma$ is estimated via (16).
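The averaging in (16) is simple enough to sketch directly. The snippet below is an illustration only: the function name `estimate_gamma` is hypothetical, and the simulated check assumes that (2) takes the discrete SIR form $\Delta I_t = \beta_t S_{t-1} I_{t-1} - \gamma I_{t-1}$ with $S_t$ and $I_t$ expressed as population shares, which is the form that (16) implies when solved for $\gamma$.

```python
import numpy as np


def estimate_gamma(beta_hat, S, I):
    """Average of beta_t * S_{t-1} - (I_t - I_{t-1}) / I_{t-1} over t = 2, ..., n."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    S = np.asarray(S, dtype=float)
    I = np.asarray(I, dtype=float)
    delta_I = np.diff(I)  # Delta I_t for t = 2, ..., n
    terms = beta_hat[1:] * S[:-1] - delta_I / I[:-1]
    return terms.mean()


# Hypothetical check on a simulated SIR path with a known gamma = 1/18.
gamma, n = 1 / 18, 120
beta = 0.08 + 0.04 * np.cos(np.arange(n) / 15)  # assumed time-varying contact rate
S, I = np.empty(n), np.empty(n)
S[0], I[0] = 0.999, 0.001
for t in range(1, n):
    new_infections = beta[t] * S[t - 1] * I[t - 1]
    S[t] = S[t - 1] - new_infections
    I[t] = I[t - 1] + new_infections - gamma * I[t - 1]

gamma_hat = estimate_gamma(beta, S, I)
R = beta / gamma_hat  # reproduction rate R_t = beta_t / gamma
```

On simulated data every summand equals $\gamma$ exactly, so the average recovers it. With reported case data the summands are noisy, which is the reason for averaging over the sample.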
If instead $\gamma = 1/18$ is fixed as in Lee et al. (2021), the dependence on $h$ is not resolved and countries with a higher $h$ will exhibit a smaller reproduction rate by construction. This is precisely the reason for the implausible estimates for $R_t$ in Lee et al. (2021), and is solved by accounting for $h$ via (16).

[Footnote: Liu et al. (2021) argue that one-sided three-day rolling averages smooth out noise generated by the timing of the reporting. However, the opposite should be the case, as one-sided smoothing shifts case numbers from past to present, while a delay in reporting shifts case numbers from present to the future. To fix the latter, a forward-looking filter is required, not a backward-looking one, see the discussion at the end of section 3.]

Results are reported for Canada, Germany, and Italy in subsection 4.1. They are selected as they are all members of the G7 and have implemented containment measures of different strength and duration at different points in time, thus making a comparison interesting. For the selected countries, there exist reliable data on confirmed, recovered, and deceased cases provided by the JHU CSSE. As will be shown, the latter is not the case for the US, where data on recovered subjects suffer heavily from under-reporting, yielding a severe downward bias for the estimated contact rate and the resulting reproduction rate $R_t$ as reported by Lee et al. (2021). The problem is fixed by an assumption on the average duration of an infection, and results for the US are presented in subsection 4.2. As will become apparent there, the fractional filter is quite robust to under-reporting of recovered cases. To monitor the pandemic in real time, subsection 4.3 examines the precision of the fractional filter at the end of the sample.

For Canada, figure 1 sketches the estimated log contact rate and the resulting reproduction rate $\hat R_t = \hat\beta_t / \hat\gamma$ in the first row. The average duration of an infection is estimated to be $1/\hat\gamma = 18.$[...]
$v_t(\hat\theta)$ and its estimated autocorrelation function.

Based on the top-left panel of figure 1, several turning points of the contact rate can be identified using a simple algorithm that defines a minimum (maximum) whenever the contact rate $\beta_t$ at $t$ is smaller (greater) than all $\beta_{t+1}, ..., \beta_{t+10}$, the contact rates of the next ten days. These periods correspond to several policy regimes characterized by the strengthening and easing of containment measures. While a small selection of policy measures is presented below, a detailed overview is given by McCoy et al. (2020).

1. March 13 – March 21: The contact rate increases and peaks on March 21. As a reaction, several provinces and territories declare a state of emergency between March 13 and March 22, impose gathering bans, close schools, universities, and businesses, and cancel mass events, among others.
2. March 22 – July 5: After the implementation of containment measures the contact rate decreases continuously. While additional containment measures such as travel restrictions are implemented in April, several provinces and territories start to relax their restrictions step by step in May. On July 1, Canada's national summer holidays begin.
3. July 6 – July 24: During the first half of the summer holidays the contact rate increases sharply. The reproduction rate increases above unity on July 20.
4. July 25 – August 3: The contact rate slightly decreases, while the reproduction rate remains above unity.
5. August 4 – September 30: After a short phase of reduction, contact and reproduction rate start to increase again, while schools re-open on September 8. Canada's prime minister Trudeau says the second wave of COVID-19 is already underway.

[Figure 1 panels: "Canada: Contact rate" (log β_t against date); "Canada: Reproduction rate" (R_t against date); "Canada: Prediction error" (v_t against date); "Canada: ACF prediction err." (autocorrelation of v_t against lags 0 to 60).]

Figure 1: Estimation results for Canada. The top-left panel displays the estimated contact rate $\log \hat\beta_t$ in blue together with the observable $\log \tilde Y_t$ in gray. The top-right panel shows the estimated reproduction rate $\hat R_t = \hat\beta_t / \hat\gamma$ in blue together with $\tilde Y_t / \hat\gamma$ in gray. The dashed horizontal line corresponds to $R = 1$. The dashed vertical lines correspond to turning points of the contact rate. The bottom-left panel shows the estimated prediction error $v_t(\hat\theta)$ in (11) together with two standard deviations in blue, dashed. The bottom-right panel sketches the estimated autocorrelation function of the prediction error $v_t(\hat\theta)$ together with a 95% confidence interval.

6. October 1 – October 17: Contact and reproduction rate slightly decrease. Thanksgiving takes place on October 12.
7. October 18 – November 7: After Thanksgiving, contact and reproduction rate increase slightly. On November 3, Ontario introduces an incidence-based system for when to tighten containment measures.
8. November 8 – December 23: Contact and reproduction rate exhibit a slight but steady decrease. The reproduction rate remains above unity.

The estimated contact rate and the resulting reproduction rate are well in line with the chronology of policy interventions. In particular, the fractional filter allows turning points of the contact rate to be identified that are not visible from the raw data plotted in gray in figure 1.

The two graphs at the bottom of figure 1 illustrate how well the Canadian data fit the model assumptions. Assumptions 1 and 2 assume the measurement error $u_t$ and the log contact rate shock $\eta_t$ to be homoscedastic white noise processes. Since $\hat\theta$ is consistent, see theorem 3.2, by (C.1) the prediction error $v_t(\hat\theta)$ becomes a white noise process as $n \to \infty$ if assumptions 1 and 2 hold. While some outliers exist, about 95% of the prediction errors lie within two standard deviations, as the bottom-left panel illustrates. The bottom-right panel shows that there is not much autocorrelation left in the prediction error. Given the parsimonious parametrization of the fractional UC model, this is surprising. The two panels at the bottom of figure 1 thus substantiate that the dynamics of the log contact rate are well captured by a fractionally integrated process.

The estimated integration order is $\hat d = 1.$[...] $\beta_t$ retains 21.66% of its impact in $t + 1$, 13.17% in $t + 2$, and 9.73% in $t + 3$. After one week, the impact is still 5.[...] $\pi_i(d -$ [...] $/ \hat\gamma = 21.27$ days, which is slightly greater compared to Canada and likely results from different algorithms to estimate the number of recovered individuals. The integration order estimate is of similar size as for Canada ($\hat d = 1.$[...]

1. March 2 – March 5: The contact rate starts at a comparably high level on March 2, likely caused by Carnival celebrations and ski tourism during Germany's winter holidays (Felbermayr et al.; 2020). It peaks on March 5.
2. March 6 – May 2: A slight decrease at the beginning of March turns into a sharp decrease around March 16. Measures to contain the spread of the virus, such as school closings and event cancellations, are implemented from March 13 on. From March 22 on, gatherings of more than two people are prohibited and several businesses are closed (Hartl, Wälde and Weber; 2020). At the end of April, schools and businesses partly re-open.

[Figure 2 panels: "Germany: Contact rate" (log β_t against date); "Germany: Reproduction rate" (R_t against date); "Germany: Prediction error" (v_t against date); "Germany: ACF prediction err." (autocorrelation of v_t against lags 0 to 60).]

Figure 2: Estimation results for Germany. For a description see figure 1.

3. May 3 – May 19: Contact and reproduction rate slightly increase. Gathering restrictions are relaxed and most businesses are allowed to re-open on May 6.
4. May 20 – June 10: Contact and reproduction rate slightly decrease.
5. June 11 – June 23: A short but strong increase is caused, among others, by a massive outbreak in a meat factory (BBC News; 2020d).
6. June 24 – July 2: A slight decline follows. On June 29, summer holidays begin in Germany's largest state.
7. July 3 – August 11: During the summer holidays, Germany experiences a further increase in its contact and reproduction rate.
8. August 12 – August 30: Contact and reproduction rate decrease.
9. August 31 – October 19: A slight increase of the contact rate is followed by a strong increase at the end of September. On October 15 stricter rules for hotspots are implemented, including mask obligations, contact restrictions, and curfews (Deutsche Welle; 2020b).
10. October 20 – November 28: Contact and reproduction rate decrease, but the latter remains above unity. On October 28 a 'lockdown light' is announced for Germany, inducing gathering restrictions and business closings among others (BBC News; 2020a).
11. November 29 – December 23: While contact and reproduction rate slightly increase, the government announces a tightening of its lockdown measures on December 13 (BBC News; 2020b).

Similar to Canada, the two panels at the bottom of figure 2 indicate that the model assumptions are largely satisfied, despite a little remaining autocorrelation in the prediction error.

For Italy, a slight adjustment of the data is required, as $\Delta C_t = 0$ on June 19 and thus $\log \tilde Y_t$ is not defined, see (5). To adjust the single observation, averages from the neighboring observations are used (i.e. $\Delta C_t = 1/3\,(\Delta C_{t-1} + \Delta C_{t+1})$), while the adjusted cases in $t - 1$ and $t + 1$ will equal 2/3 [...] $/ \hat\gamma = 35.92$ days, which is significantly higher than for Canada and Germany. The discussion below (16) gives an explanation for the high variation in $\gamma$ over the different countries. The estimated integration order is $\hat d = 1.$[...] $\eta_t$ in Italy will yield a more persistent effect on the contact rate. This may be explained by the severity of the pandemic in Italy in spring 2020, which likely had a long-lasting impact on social behavior. Based on the top-left panel of figure 3 the following regimes and turning points are visible:

1. February 24 – February 25: Due to several clusters in northern Italy more than 100 cumulative cases are counted on February 23. While the contact rate peaks on February 25, epicenters in northern Italy effectively went into lockdown on February 22 (Signorelli et al.; 2020).

[Figure 3 panels: "Italy: Contact rate" (log β_t against date); "Italy: Reproduction rate" (R_t against date); "Italy: Prediction error" (v_t against date); "Italy: ACF prediction err." (autocorrelation of v_t against lags 0 to 60).]

Figure 3: Estimation results for Italy. For a description see figure 1.

2. February 26 – June 4: The contact rate exhibits the strongest decline among all countries under study. During March, the government announces several containment measures such as school closings, halting all non-essential businesses, and tight regulations on free movement (Signorelli et al.; 2020). During May, the lockdown is lifted gradually. It effectively ends on June 3 (BBC News; 2020c).
3. June 5 – August 26: The contact rate exhibits a long and steep increase.
4. August 27 – September 23: A short decrease of the contact rate follows.
5. September 24 – October 24: The contact rate again increases and reaches a level as high as at the end of March. Gatherings, restaurants, sports and school activities are again restricted (Deutsche Welle; 2020a).
6. October 25 – December 9: The contact rate strongly declines, while additional restrictions on bars and restaurants are implemented (Deutsche Welle; 2020c).
7. December 10 – December 23: A minor increase in the contact rate is visible in the days before Christmas.

The two panels at the bottom of figure 3 are similar to those for Canada and Germany, and indicate that the prediction errors are rather homoscedastic, although outliers exist, and that little autocorrelation is left.
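The turning-point rule used throughout this subsection (a minimum or maximum whenever $\beta_t$ is smaller or greater than all of the next ten values) can be sketched as follows. This is a literal reading of that rule, not the author's code: the function name `turning_points` is hypothetical, and collapsing each run of identical labels to its first date is an added assumption, made because the literal rule flags every point of a monotone stretch.

```python
import numpy as np


def turning_points(beta, horizon=10):
    """Flag t as 'min' ('max') if beta[t] is below (above) the next `horizon` values.

    Consecutive points with the same label are collapsed to the first of the run,
    so the output lists the dates at which the regime switches.
    """
    beta = np.asarray(beta, dtype=float)
    points = []
    last_label = None
    for t in range(len(beta) - horizon):
        future = beta[t + 1 : t + 1 + horizon]
        if beta[t] < future.min():
            label = "min"
        elif beta[t] > future.max():
            label = "max"
        else:
            continue  # neither rule applies at t
        if label != last_label:
            points.append((t, label))
            last_label = label
    return points


# Hypothetical example: a contact rate that rises for 20 days and then falls.
beta = list(range(21)) + list(range(19, -1, -1))
print(turning_points(beta))
```

In this example the peak at index 20 is reported as a maximum. The first observation is also flagged as a minimum, which is an artifact of starting the sample during an increase. The last `horizon` observations are never classified, since the rule needs ten future values.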

The US is treated separately, since data on recovered cases reported by the JHU CSSE seem heavily downward-biased. To see this, consider the difference between lagged cumulative confirmed, cumulative recovered, and cumulative deceased cases for different lags $h$

$$C_{t-h} - R_t - D_t. \qquad (17)$$

For $h = 0$, (17) measures the number of currently infected subjects. For small $h$, (17) should be positive, as it takes some time for the infected subjects to either recover or die. As $h$ increases, (17) should turn negative, as an increasing number of subjects contained in the cumulative cases $C_{t-h}$ and subjects infected between $t - h$ and $t$ (and thus contained in $C_t - C_{t-h}$) either recover or die. The turning point, denoted by $\bar h$, should be close to the average infected period $1/\gamma$, as long as new confirmed cases between $t - h$ and $t$, i.e. $C_t - C_{t-h}$, do not explode. If they do, then $\bar h$ should be smaller than $1/\gamma$, as outflows from $C_t - C_{t-h}$ disproportionally increase $R_t$ and $D_t$.

Figure 4 plots (17) in case numbers for lags $h = 15, ..., 45$. As can be seen, even after 45 days the difference between lagged cumulative confirmed, cumulative recovered, and cumulative deceased cases is predominantly positive. This is at odds with the average infected periods for Canada, Germany, and Italy as found in subsection 4.1, and indicates that data on $R_t$, $D_t$ [...]

[Figure 4 panel: "US: Gap between lagged cumulative confirmed, recovered, and deceased cases" ($C_{t-h} - R_t - D_t$ against date).]

Figure 4: Difference between lagged cumulative confirmed, recovered, and deceased cases $C_{t-h} - R_t - D_t$ (in case numbers) for $h = 15, ..., 45$.

[...] $h = 21$ days after they tested positive, $C_{t-21} = D_t + R_t$. The assumption is justified as follows: First, it is similar to the average infected period estimated for Germany and more conservative than the estimate for Canada. Second, it is centered in the range of definitions for recovered individuals by the federal states. In addition, estimates for $\bar h = 18$ and $\bar h = 24$ days are presented, which gives a reasonable interval for the contact rate.

Under the assumption of $\bar h = 21$, the estimated integration order equals $\hat d = 1.$[...]

1. March 5 – March 6: Contact and reproduction rate peak at the initial stage of the epidemic.
2. March 7 – May 11: Contact and reproduction rate steadily decrease, despite a small blip on March 19. The national state of emergency is declared on March 13, and several containment measures such as business closings and stay-at-home orders are implemented mainly during the second half of March. Depending on the state, businesses re-open from April 20 on. More than half of the states have opened businesses on May 7 (Chernozhukov et al.; 2021).
3. May 12 – June 28: As containment measures are relaxed, contact and reproduction rate increase slightly during the second half of May and experience a strong increase during June.
4. June 29 – August 17: Contact and reproduction rate decline, reaching the level of May.
5. August 18 – September 21: A slight increase is visible.
6. September 22 – September 30: A short decrease follows.
7. October 1 – November 10: Contact and reproduction rate exhibit a steady increase, reaching the June peak. Regional containment measures are implemented at the beginning of November.
8. November 11 – November 26: Contact and reproduction rate decrease.
9. November 27 – December 7: After Thanksgiving, contact and reproduction rate experience a short increase.
10. December 8 – December 23: A decrease is visible from December 8 on.

[Figure 5 panels: "US: Contact rate" (log β_t against date); "US: Reproduction rate" (R_t against date); "US: Prediction error" (v_t against date); "US: ACF prediction err." (autocorrelation of v_t against lags 0 to 60).]

Figure 5: Estimation results for the United States for $\bar h = 21$. Shaded areas correspond to $\bar h = 18$ and $\bar h = 24$. For a description see figure 1.

As can be seen from figure 5, estimates for the contact rate are rather robust to the choice of $\bar h$. They are slightly greater for $\bar h = 18$, as the number of currently infected $I_t$ is smaller by construction and thus additional contacts are required to explain new confirmed cases, while they are slightly smaller for $\bar h = 24$ for exactly the opposite reason. The estimated reproduction rate is virtually identical among the three scenarios, as it is normalized by the average infected period, $\hat R_t = \hat\beta_t / \hat\gamma$. For $\bar h = 21$, the two panels at the bottom of figure 5 indicate that the model assumptions are largely satisfied, despite some weak correlation in the prediction errors. The plots are very similar for $\bar h = 18$ and $\bar h = 24$ and thus not shown. The reproduction rate $\hat R_t$ is greater than unity during the whole sample, which contradicts the results of Lee et al. (2021), who rely on the downward-biased data on recovered subjects from the JHU CSSE while fixing the average infected period to be $1/\gamma = 18$.

This subsection investigates the end-of-sample properties of the fractional filter for real-time estimation of the contact rate. Reliable contact rate estimates at the current frontier of the data would allow the state of the pandemic to be monitored in real time and can serve as a surveillance measure for future outbreaks. Based on reliable real-time estimates for the contact rate, policy rules can be implemented to prevent an exponential growth of case numbers. Acting early reduces economic and social costs of containment measures, and consequently a well-designed policy rule will be beneficial, given that the fractional filter yields a reliable estimate for the current level of the contact rate. Drawing inference on the latter is the focus of this subsection.

In detail, real-time monitoring is simulated by truncating the sample at a certain point $t$, $r \le t \le n$, where $r$ is the minimum sample size for the CSS estimator to produce reasonable estimates.
The parameters $\theta_0, \mu_0, \alpha_{1,0}, ..., \alpha_{7,0}$ of (8) are then estimated as described in section 3 using the information available at time $t$, $\mathcal{F}_t$, and the resulting parameter estimates are denoted as $\hat\theta^{(t)}$, $\hat\mu^{(t)}$, etc. To take into account reporting lags, and to be robust against outliers at the end of the sample, a little backward-smoothing is allowed by reporting the smoothed estimate for the log contact rate at period $t - 3$ in period $t$. From (13), the smoothed estimates are

$$\log \hat\beta_{t-3|t} = \hat\mu^{(t)} + x_{t-3|t}(\hat\theta^{(t)}). \qquad (18)$$

As (18) only depends on information available at $t$, it mimics the situation of a policy maker at $t$ and can be used to draw inference on the monitoring properties of the fractional filter at time $t$. Based on $\hat\beta_{t-3|t}$, policy rules to prevent an exponential spread of the virus can be designed. Such rules could, for instance, define a threshold for $\hat R_{t-3}$ at which additional containment measures are implemented. As the threshold should naturally depend on the number of currently infected, current hospital capacities, and other parameters, the precise design of such a policy rule is left to the experts, and only a primitive policy rule will be introduced later for illustrative purposes.

The reliability of real-time estimates for the contact rate is assessed by the following experiment: First, an estimation sample that consists of information available until May 31 is defined, for which $\theta_0, \mu_0, \alpha_{1,0}, ..., \alpha_{7,0}$ are estimated. It consists of at least 80 observations, which is considered a reasonable sample size for the estimation sample. Based on these estimates, $\log \hat\beta_{r-3|r}$ is obtained as described above. In a second step, information available on June 1 is added to the sample and parameter estimates are updated using the $\hat\theta^{(r)}$ from the estimation sample as starting values for the CSS estimator, which gives $\hat\theta^{(r+1)}$. As before, the estimate for $\log \hat\beta_{r-2|r+1}$ is stored. The procedure repeats for all $t$, $r < t \le n$, where in every step $t$ the CSS estimator is initialized by $\hat\theta^{(t-1)}$. The resulting real-time estimates for the contact rate are then compared to those of subsections 4.1 and 4.2 to draw inference on their reliability. In addition, a primitive policy rule is introduced. It assumes governments to take action as soon as $\hat R_{t-3} = \hat\beta_{t-3|t} / \hat\gamma > 1.2$. The latter is motivated by the observation that preventing an exponential propagation (i.e. $R_{t-3} >$ [...]

$$\log \hat\beta^{benchmark}_{t-3|t} = \frac{1}{7} \sum_{i=0}^{6} \log \tilde Y_{t-i}, \qquad (19)$$

which includes three forward-looking observations and should smooth out the seasonality.

The real-time experiment considered in this paper deviates from Lee et al. (2021), who suggest monitoring the current state of the pandemic by fitting a linear time trend with structural breaks to $\log \tilde Y_t$. Lee et al. (2021) evaluate the monitoring properties of their contact rate estimate ex post, using all information available in their sample. Consequently, their estimates at point $t$ depend on information that was not available to policy makers at period $t$ whenever $t < n$. As structural breaks are not well identified at the end of the sample, recent changes in the contact rate cannot be expected to be found by the estimator of Lee et al. (2021).

[Figure 6 panels, one row each for Canada, Germany, Italy, and the US: left "Contact rate" ($\log \hat\beta_{t-3|t}$ against date); right "Deviations from Contact rate" ($\log \hat\beta_{t-3|t} - \log \hat\beta_{t-3|n}$ against date).]

Figure 6: Real-time estimates for the log contact rate. The left panels display real-time contact rate estimates $\log \hat\beta_{t-3|t}$ (blue, solid), full sample estimates $\log \hat\beta_{t-3|n}$ (black, dashed), and the observable data $\log \tilde Y_{t-3}$ (gray) for Canada, Germany, Italy, and the United States. The dashed vertical line corresponds to the date where the real-time estimate for the reproduction rate exceeds 1.2. The right panels show deviations from the full sample contact rate estimates for the real-time estimates $\log \hat\beta_{t-3|t} - \log \hat\beta_{t-3|n}$ (blue, solid) and the benchmark $\log \hat\beta^{benchmark}_{t-3|t} - \log \hat\beta_{t-3|n}$ (red, dashed).

The results of the real-time experiment are visualized in figure 6. The four panels on the left side sketch the resulting real-time estimates for the contact rate for the four countries under study, together with the (full sample) results of subsections 4.1 and 4.2. As can be seen, the real-time estimates almost perfectly overlap with estimates using the full sample information. This implies that the real-time estimates are well suited for monitoring the current state of the contact rate. Minor deviations are visible for Canada at the end of July, and for Germany in the middle of June, while no greater deviations are visible for Italy and the US. The real-time estimates exceed the threshold $\hat R_{t-3} = \hat\beta_{t-3|t} / \hat\gamma > 1.$[...] $\beta_{t-3|t}$ and $\log \hat\beta^{benchmark}_{t-3|t}$ from the log contact rate estimates based on the full sample information $\log \hat\beta_{t-3|n}$. Thus, they shed light on whether the fractional filter improves estimates for the contact rate compared to a rolling seven-day average that uses three forward-looking observations. For Italy and the US, the advantages of the fractional filter directly become apparent, as the benchmark exhibits greater deviations. For Canada and Germany, the fractional filter performs comparably well when large outliers occur, e.g. around July 20 for Canada and around June 20 for Germany.

To extract a time-varying signal for the COVID-19 contact rate from daily data on confirmed, recovered, and deceased cases, this paper introduces a novel unobserved components model. It models the log contact rate as a fractionally integrated process of unknown integration order. A computationally simple modification of the Kalman filter is introduced and is termed the fractional filter.
It provides a closed-form expression for the prediction error that allows the model parameters to be estimated by a conditional-sum-of-squares (CSS) estimator. The asymptotic theory for the CSS estimator is provided. For the countries under study, estimation results are well in line with the chronology of the pandemic. They allow inference to be drawn on the impact of policy measures such as contact restrictions. The new filtering method bears great potential as a monitoring device for the current state of the pandemic, as it yields reliable contact rate estimates at the current frontier of the data.

As vaccines become more and more available, future research can generalize the model to include the number of vaccinated. For instance, this can be done by decomposing $1 = S_t + I_t + R_t + D_t + V_t$, where $V_t$ is the fraction of vaccinated. The states $R_t$ and $V_t$ should be non-overlapping as long as vaccines are not rolled out to recovered subjects. While vaccine recommendations vary over the different countries, some assign a lower priority to recovered subjects, so that $R_t$ and $V_t$ are non-overlapping at the early stage of the vaccine roll-out. Furthermore, mutations of the Coronavirus can be taken into account, e.g. by allowing for a smooth transition between a contact rate with a low probability of virus transmission and one with a high probability.

For applications beyond COVID-19 related data, the fractional filter offers a robust, flexible, and data-driven way for signal extraction from data of unknown persistence. It requires no prior assumptions on the integration order of a process, and thus provides a solution to model specification in the unobserved components literature. Due to its computational advantages compared to the classic Kalman filter, it allows unobserved components models with richer dynamics to be estimated.

Acknowledgments

The author thanks Nicolas Apfel, Uwe Hassler, Timon Hellwagner, Roland Jucknewitz, Alina Prechtl, Veronika Püschel, Lars Schlereth, Rolf Tschernig, Enzo Weber, and the participants of the Department Seminar at the University of Regensburg for very helpful comments.

A Estimation results

              Canada    Germany   Italy     United States
d             [...]     [...]     [...]     [...]
σ²_η          [...]     [...]     [...]     [...]
σ²_u          [...]     [...]     [...]     [...]

[...] $\theta$ from the CSS estimator as described in section 3 for Canada, Germany, Italy, and the United States. Standard errors are denoted in parentheses and were calculated based on the inverse of the numeric Hessian matrix, see theorem 3.3.

B Monte Carlo evidence

The finite sample performance of the CSS estimator is assessed in a Monte Carlo study, where, to be in line with (9), the data-generating mechanism is given by

$$y_t = x_t + u_t, \qquad \Delta_+^{d_0} x_t = \eta_t, \qquad t = 1, ..., n. \tag{B.1}$$

Here $u_t \sim NID(0, \sigma_{u,0}^2)$, $\eta_t \sim NID(0, \sigma_{\eta,0}^2)$, $u_t$ and $\eta_t$ are uncorrelated, and $\sigma_{\eta,0}^2 = \rho\,\sigma_{u,0}^2$, so that $\rho$ controls the signal-to-noise ratio. The integration orders $d_0 \in \{0.75, 1.25, 1.75\}$ cover the relevant interval for the applications in section 4, while $\rho \in \{0.5, 1, 2\}$ captures high and low signal-to-noise ratios. The variance parameter is set to $\sigma_{u,0}^2 = 1$. Different sample sizes $n \in \{100, 200, 300\}$ covering the relevant regions for the applications in section 4 are considered. The parameters $\theta_0 = (d_0, \sigma_{\eta,0}^2, \sigma_{u,0}^2)'$ are estimated via the CSS estimator as described in section 3. For each specification, 1000 replications are simulated, and starting values are set to $\theta_{start} = (1, \ldots)'$. [...] $d_0$ from the exact local Whittle estimator of Shimotsu (2010) are reported as benchmarks for $m = \lfloor n^j \rfloor$ Fourier frequencies, $j \in \{\ldots\}$ (six values). Finally, the mean squared error $MSE_x$ and the coefficient of determination $R_x^2$ for the estimation of $x_t$, which are calculated via

$$MSE_x = \frac{1}{n} \sum_{t=1}^{n} \big(x_t - x_{t|n}(\hat\theta)\big)^2, \qquad R_x^2 = 1 - \frac{\sum_{t=1}^{n} \big(x_t - x_{t|n}(\hat\theta)\big)^2}{\sum_{t=1}^{n} (x_t - \bar x)^2}, \tag{B.2}$$

are reported, and indicate how well $x_t$ is estimated by the fractional filter (13).

The results for the Monte Carlo study are contained in table B.1. Not surprisingly, the parametric CSS estimator outperforms the exact local Whittle estimator. However, gains are quite large in terms of the MSE for the integration order for all $n \in \{100, 200, 300\}$ and all combinations of $\rho$ and $d_0$. The mean squared error of the integration order estimate becomes smaller for higher $d_0$, which is plausible as the fraction of total variation of $y_t$ generated by $x_t$ increases with $d_0$. For the same reason, it decreases with increasing $\rho$. The same conclusions on the precision with which $d_0$ is estimated hold for the mean squared error of $x_t$, which decreases as $n$, $d$, and $\rho$ increase. The proportion of explained variation of $x_t$, measured by $R_x^2$, is high and thus $x_t$ is estimated well via (13). Particularly for $d_0 = 1.25$, which is the relevant case for the applications in section 4, the $R_x^2$ is close to unity for all $n$.

| $n$ | $\rho$ | $d_0$ | $\hat d$ | $\hat d^{\,j}_{EW}$ | $\hat d^{\,j}_{EW}$ | $\hat d^{\,j}_{EW}$ | $\hat d^{\,j}_{EW}$ | $\hat d^{\,j}_{EW}$ | $\hat d^{\,j}_{EW}$ | $MSE_x$ | $R_x^2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.5 | 0.75 | 0.0641 | 0.1021 | 0.0804 | 0.0762 | 0.0736 | 0.0728 | 0.0775 | 0.4786 | 0.6747 |
| 100 | 0.5 | 1.25 | 0.0387 | 0.1011 | 0.0721 | 0.0664 | 0.0694 | 0.0789 | 0.1054 | 0.3719 | 0.9796 |
| 100 | 0.5 | 1.75 | 0.0285 | 0.0876 | 0.0620 | 0.0576 | 0.0637 | 0.0809 | 0.1293 | 0.3418 | 0.9992 |
| 100 | 1 | 0.75 | 0.0409 | 0.0943 | 0.0673 | 0.0585 | 0.0505 | 0.0446 | 0.0433 | 0.6245 | 0.7914 |
| 100 | 1 | 1.25 | 0.0299 | 0.0978 | 0.0644 | 0.0535 | 0.0484 | 0.0465 | 0.0570 | 0.4880 | 0.9867 |
| 100 | 1 | 1.75 | 0.0239 | 0.0851 | 0.0539 | 0.0470 | 0.0453 | 0.0475 | 0.0710 | 0.4258 | 0.9995 |
| 100 | 2 | 0.75 | 0.0277 | 0.0919 | 0.0615 | 0.0504 | 0.0407 | 0.0318 | 0.0264 | 0.7861 | 0.8711 |
| 100 | 2 | 1.25 | 0.0231 | 0.0977 | 0.0601 | 0.0489 | 0.0393 | 0.0323 | 0.0319 | 0.6282 | 0.9915 |
| 100 | 2 | 1.75 | 0.0204 | 0.0830 | 0.0511 | 0.0422 | 0.0372 | 0.0325 | 0.0384 | 0.5306 | 0.9997 |
| 200 | 0.5 | 0.75 | 0.0232 | 0.0662 | 0.0446 | 0.0396 | 0.0393 | 0.0427 | 0.0488 | 0.3985 | 0.8158 |
| 200 | 0.5 | 1.25 | 0.0154 | 0.0615 | 0.0394 | 0.0320 | 0.0313 | 0.0381 | 0.0541 | 0.3432 | 0.9934 |
| 200 | 0.5 | 1.75 | 0.0128 | 0.0519 | 0.0348 | 0.0281 | 0.0260 | 0.0328 | 0.0561 | 0.3287 | 0.9999 |
| 200 | 1 | 0.75 | 0.0171 | 0.0641 | 0.0390 | 0.0307 | 0.0256 | 0.0238 | 0.0248 | 0.5479 | 0.8742 |
| 200 | 1 | 1.25 | 0.0124 | 0.0614 | 0.0378 | 0.0280 | 0.0225 | 0.0214 | 0.0267 | 0.4502 | 0.9956 |
| 200 | 1 | 1.75 | 0.0106 | 0.0513 | 0.0335 | 0.0260 | 0.0206 | 0.0192 | 0.0276 | 0.4085 | 0.9999 |
| 200 | 2 | 0.75 | 0.0128 | 0.0622 | 0.0372 | 0.0275 | 0.0202 | 0.0159 | 0.0140 | 0.7169 | 0.9185 |
| 200 | 2 | 1.25 | 0.0104 | 0.0620 | 0.0370 | 0.0268 | 0.0193 | 0.0149 | 0.0145 | 0.5815 | 0.9972 |
| 200 | 2 | 1.75 | 0.0091 | 0.0510 | 0.0331 | 0.0252 | 0.0185 | 0.0141 | 0.0148 | 0.5068 | 0.9999 |
| 300 | 0.5 | 0.75 | 0.0157 | 0.0448 | 0.0339 | 0.0284 | 0.0274 | 0.0311 | 0.0386 | 0.3770 | 0.8594 |
| 300 | 0.5 | 1.25 | 0.0106 | 0.0404 | 0.0288 | 0.0218 | 0.0199 | 0.0242 | 0.0386 | 0.3394 | 0.9964 |
| 300 | 0.5 | 1.75 | 0.0089 | 0.0361 | 0.0265 | 0.0195 | 0.0164 | 0.0189 | 0.0364 | 0.3278 | 0.9999 |
| 300 | 1 | 0.75 | 0.0120 | 0.0423 | 0.0301 | 0.0224 | 0.0185 | 0.0173 | 0.0190 | 0.5227 | 0.9031 |
| 300 | 1 | 1.25 | 0.0086 | 0.0402 | 0.0282 | 0.0199 | 0.0157 | 0.0145 | 0.0187 | 0.4442 | 0.9976 |
| 300 | 1 | 1.75 | 0.0075 | 0.0361 | 0.0258 | 0.0187 | 0.0145 | 0.0125 | 0.0177 | 0.4069 | 1.0000 |
| 300 | 2 | 0.75 | 0.0093 | 0.0412 | 0.0288 | 0.0203 | 0.0153 | 0.0118 | 0.0105 | 0.6895 | 0.9366 |
| 300 | 2 | 1.25 | 0.0072 | 0.0401 | 0.0276 | 0.0193 | 0.0144 | 0.0109 | 0.0103 | 0.5712 | 0.9985 |
| 300 | 2 | 1.75 | 0.0064 | 0.0360 | 0.0257 | 0.0186 | 0.0140 | 0.0102 | 0.0099 | 0.5037 | 1.0000 |

Table B.1: Mean squared error (MSE) and $R_x^2$ for $d_0$ and $x_t$ in (B.1). The columns $\hat d$ and $\hat d^{\,j}_{EW}$ show the MSE for the CSS estimator of $d_0$ as well as for the exact local Whittle estimator of Shimotsu (2010) for $m = \lfloor n^j \rfloor$ Fourier frequencies, $j \in \{\ldots\}$ (six values, one per column). $MSE_x$ displays the mean squared error for $x_t$, while $R_x^2$ is the coefficient of determination, see (B.2).

C Proofs
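The proofs below work with the truncated (type II) fractional difference operator $\Delta_+^d$ and its coefficients $\pi_i(d)$, which also generate the Monte Carlo design (B.1). As a purely illustrative aid, the following Python sketch computes $\pi_i(d)$ by the standard binomial recursion, applies $\Delta_+^d$, simulates (B.1), and evaluates the two accuracy measures in (B.2). The function names (`pi_coeffs`, `frac_diff`, `simulate_uc`, `mse_and_r2`) and the use of NumPy are assumptions of this sketch and do not appear in the paper. The fractional filter (13) is not implemented here, so `mse_and_r2` expects an estimate of $x_t$ supplied from elsewhere.

```python
import numpy as np


def pi_coeffs(d, n):
    """First n coefficients pi_0(d), ..., pi_{n-1}(d) of (1 - L)^d."""
    pi = np.empty(n)
    pi[0] = 1.0
    for i in range(1, n):
        # Binomial recursion: pi_i(d) = pi_{i-1}(d) * (i - 1 - d) / i
        pi[i] = pi[i - 1] * (i - 1 - d) / i
    return pi


def frac_diff(z, d):
    """Type II fractional difference: sum_{i=0}^{t-1} pi_i(d) z_{t-i}."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Truncating the full convolution at n terms sets pre-sample values to zero.
    return np.convolve(pi_coeffs(d, n), z)[:n]


def simulate_uc(n, d, sigma2_eta, sigma2_u, rng):
    """Simulate y_t = x_t + u_t with Delta_+^d x_t = eta_t, as in (B.1)."""
    eta = rng.normal(0.0, np.sqrt(sigma2_eta), n)
    u = rng.normal(0.0, np.sqrt(sigma2_u), n)
    x = frac_diff(eta, -d)  # x_t = Delta_+^{-d} eta_t
    return x + u, x, eta


def mse_and_r2(x, x_hat):
    """MSE_x and R_x^2 from (B.2) for a given estimate x_hat of x."""
    x = np.asarray(x, dtype=float)
    resid = x - np.asarray(x_hat, dtype=float)
    mse = np.mean(resid ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((x - x.mean()) ** 2)
    return mse, r2
```

Because the truncated operators satisfy $\Delta_+^{d}\Delta_+^{-d} = 1$ exactly, applying `frac_diff(x, d)` to a simulated path returns the shocks `eta` up to floating-point error, which serves as a simple sanity check on the recursion.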

Proof of Lemma 3.1. First, note that $\mathrm{E}_\theta(x_{t+1}|\mathcal{F}_t) = \mathrm{E}_\theta(y_{t+1}|\mathcal{F}_t)$, so that it is sufficient to derive the latter expression. For this, consider the reduced form of (9), which follows from taking fractional differences and utilizing the aggregation properties of MA processes, see Granger and Morris (1976), so that

$$\Delta_+^{d} y_t = \eta_t + \Delta_+^{d} u_t = \eta_t + \sum_{i=0}^{t-1} \pi_i(d)\, u_{t-i} = \sum_{i=0}^{t-1} \varphi_i(\theta)\, \varepsilon_{t-i} = \varphi(L, \theta)\, \varepsilon_t, \tag{C.1}$$

with $\varphi_0(\theta) = 1$, $\varepsilon_t \sim WN(0, \sigma_\varepsilon^2)$, and $\varphi(L, \theta)$ is invertible. $\sigma_\varepsilon^2$ and the coefficients in $\varphi(L, \theta)$ can be derived by matching the autocovariance functions of (C.1), see Watson (1986, eqn. 2.6), and depend non-linearly on $\theta$. However, they are not required for the proof. Solving for $\varepsilon_t$ yields

$$\varepsilon_t = \varphi(L, \theta)^{-1} \Delta_+^{d} y_t = y_t - \sum_{i=1}^{\infty} A_i(\theta)\, y_{t-i}.$$

From the type II definition of fractional integration, see assumption 2, it follows that $\mathrm{Cov}_\theta(y_t, y_j) = 0$ for all $j \le 0$, $t > 0$, and thus $\mathrm{E}_\theta(y_{t+1}|\mathcal{F}_t) = \sum_{i=1}^{t} A_i(\theta)\, y_{t+1-i}$. The (yet unknown) coefficients $A_i(\theta)$ follow from the Yule-Walker equations

$$\begin{pmatrix} \mathrm{Cov}_\theta(y_{t+1}, y_t) \\ \mathrm{Cov}_\theta(y_{t+1}, y_{t-1}) \\ \vdots \\ \mathrm{Cov}_\theta(y_{t+1}, y_1) \end{pmatrix} = \begin{pmatrix} \mathrm{Var}_\theta(y_t) & \mathrm{Cov}_\theta(y_{t-1}, y_t) & \cdots & \mathrm{Cov}_\theta(y_1, y_t) \\ \mathrm{Cov}_\theta(y_t, y_{t-1}) & \mathrm{Var}_\theta(y_{t-1}) & \cdots & \mathrm{Cov}_\theta(y_1, y_{t-1}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}_\theta(y_t, y_1) & \mathrm{Cov}_\theta(y_{t-1}, y_1) & \cdots & \mathrm{Var}_\theta(y_1) \end{pmatrix} \begin{pmatrix} A_1(\theta) \\ A_2(\theta) \\ \vdots \\ A_t(\theta) \end{pmatrix},$$

so that by defining the vectors $A(\theta) = (A_1(\theta), ..., A_t(\theta))$, $y_{t:1} = (y_t, ..., y_1)'$, and solving the Yule-Walker equations for $A(\theta)$, one has $A(\theta) = \mathrm{Cov}_\theta(y_{t+1}, y_{t:1})\, \mathrm{Var}_\theta(y_{t:1})^{-1}$, which implies

$$\mathrm{E}_\theta(y_{t+1}|\mathcal{F}_t) = \sum_{i=1}^{t} A_i(\theta)\, y_{t+1-i} = \mathrm{Cov}_\theta(y_{t+1}, y_{t:1})\, \mathrm{Var}_\theta(y_{t:1})^{-1}\, y_{t:1}.$$

From $\mathrm{Cov}_\theta(y_{t+1}, y_{t:1}) = \mathrm{Cov}_\theta(x_{t+1}, y_{t:1}) = \sum_{i=1}^{t} \pi_i(-d)\, \mathrm{Cov}_\theta(\eta_{t+1-i}, y_{t:1})$, see assumption 2, lemma 3.1 follows.

Proof of Theorem 3.2. First, the model in (9) is shown to be identified. Identification follows if the parameters $\sigma_\eta^2$, $\sigma_u^2$ can be recovered from the autocovariance function of the reduced form $\varphi(L, \theta)\varepsilon_t$ in (C.1).
To see this, consider the covariances $\mathrm{Var}_\theta(\varphi(L, \theta)\varepsilon_t) = \sigma_\varepsilon^2 \sum_{i=0}^{t-1} \varphi_i^2(\theta) = \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d)$, and $\mathrm{Cov}_\theta(\varphi(L, \theta)\varepsilon_t, \varphi(L, \theta)\varepsilon_{t-1}) = \sigma_\varepsilon^2 \sum_{i=0}^{t-2} \varphi_i(\theta)\varphi_{i+1}(\theta) = \sigma_u^2 \sum_{i=0}^{t-2} \pi_i(d)\pi_{i+1}(d)$. In matrix form this gives

$$\sigma_\varepsilon^2 \begin{pmatrix} \sum_{i=0}^{t-1} \varphi_i^2(\theta) \\ \sum_{i=0}^{t-2} \varphi_i(\theta)\varphi_{i+1}(\theta) \end{pmatrix} = \begin{bmatrix} 1 & \sum_{i=0}^{t-1} \pi_i^2(d) \\ 0 & \sum_{i=0}^{t-2} \pi_i(d)\pi_{i+1}(d) \end{bmatrix} \begin{pmatrix} \sigma_\eta^2 \\ \sigma_u^2 \end{pmatrix}, \tag{C.2}$$

so that solving for $(\sigma_\eta^2, \sigma_u^2)'$ yields

$$\begin{pmatrix} \sigma_\eta^2 \\ \sigma_u^2 \end{pmatrix} = \frac{1}{\sum_{i=0}^{t-2} \pi_i(d)\pi_{i+1}(d)} \begin{bmatrix} \sum_{i=0}^{t-2} \pi_i(d)\pi_{i+1}(d) & -\sum_{i=0}^{t-1} \pi_i^2(d) \\ 0 & 1 \end{bmatrix} \begin{pmatrix} \sum_{i=0}^{t-1} \varphi_i^2(\theta) \\ \sum_{i=0}^{t-2} \varphi_i(\theta)\varphi_{i+1}(\theta) \end{pmatrix} \sigma_\varepsilon^2,$$

and thus $(\sigma_\eta^2, \sigma_u^2)'$ can be uniquely recovered from the reduced form. The assumption that $d > 0$ ensures $\sum_{i=0}^{t-2} \pi_i(d)\pi_{i+1}(d) \neq 0$, so that the matrix in (C.2) has full rank.

Next, the CSS estimator based on the reduced form (C.1) is derived and is shown to be identical to (12). Multiplying (C.1) by $\varphi(L, \theta)^{-1}$ yields $\varepsilon_t = \varphi(L, \theta)^{-1} \Delta_+^{d} y_t$, based on which a reduced form CSS estimator can be constructed. Define $\psi_+(L, \theta) = [\varphi(L, \theta)^{-1}]_+ = [1 - \sum_{i=1}^{\infty} \psi_i(\theta) L^i]_+$ as the (truncated) inverse of $\varphi(L, \theta)$, and denote $\varepsilon_t(\theta) = [\varphi(L, \theta)^{-1}]_+ \Delta_+^{d} y_t = \psi_+(L, \theta)\Delta_+^{d} y_t$ as the reduced form residual given the observable variables $y_1, ..., y_n$ and $\theta$. From $\varepsilon_t(\theta)$, the reduced form CSS estimator is

$$\hat\theta = \arg\min_{\theta \in \Theta} R(\theta), \qquad R(\theta) = \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t^2(\theta), \tag{C.3}$$

and equals the CSS estimator in (12).
To see this, add and subtract $y_t$ from $\varepsilon_t(\theta) = \psi_+(L, \theta)\Delta_+^{d} y_t$, so that $y_t = (1 - \psi_+(L, \theta)\Delta_+^{d}) y_t + \varepsilon_t(\theta)$, and plug $y_t$ into the conditional expectation in (10)

$$v_t(\theta) = y_t - \mathrm{E}_\theta(y_t|\mathcal{F}_{t-1}) = y_t - \mathrm{E}_\theta\big[(1 - \psi_+(L, \theta)\Delta_+^{d}) y_t \,\big|\, \mathcal{F}_{t-1}\big] = \psi_+(L, \theta)\Delta_+^{d} y_t = \varepsilon_t(\theta).$$

The third equality follows from $(1 - \psi_+(L, \theta)\Delta_+^{d}) y_t$ being $\mathcal{F}_{t-1}$-measurable, since $\psi(L, \theta) = 1 - \sum_{i=1}^{\infty} \psi_i(\theta) L^i$ and $\pi_0(d) = 1$. Thus, the contemporaneous $y_t$ cancel in the expectation operator and the whole term can be taken out of the expectation operator. From $v_t(\theta) = \varepsilon_t(\theta)$ it follows that the optimization problems in (12) and (C.3) are identical.

Next, the integration order of the residuals is assessed. Since $y_t \sim I(d_0)$, the residuals satisfy

$$\varepsilon_t(\theta) = \psi_+(L, \theta)\Delta_+^{d} y_t = \psi_+(L, \theta)\Delta_+^{d - d_0} \eta_t + \psi_+(L, \theta)\Delta_+^{d} u_t \sim I(d_0 - d). \tag{C.4}$$

For $d_0 - d < 1/2$ the residuals are (asymptotically) stationary, while for $d_0 - d > 1/2$ they are nonstationary. As the behavior of the objective function changes around $d_0 - d = 1/2$, the objective function does not uniformly converge on the set of admissible values for $d$. The same problem is addressed by Hualde and Robinson (2011) and by Nielsen (2015) for ARFIMA models encompassing (C.1). Nielsen (2015, eqn. 8) shows that a weak law of large numbers (WLLN) applies to the sum of squared residuals whenever $d_0 - d < 1/2$, while the sum of squared residuals diverges in probability whenever $d_0 - d \ge 1/2$, which translates into

$$\operatorname*{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t^2(\theta) = \begin{cases} \mathrm{E}[\tilde\varepsilon_t^2(\theta)] & \text{if } d_0 - d < 1/2, \\ \infty & \text{else.} \end{cases} \tag{C.5}$$

$\tilde\varepsilon_t(\theta) = \psi(L, \theta)\varphi(L, \theta_0)\Delta^{d - d_0} \varepsilon_t$ is the untruncated residual generated by the untruncated $\Delta^d$ and $\psi(L, \theta)$. In addition, letting $D^*(\kappa) = D \cap \{d : d_0 - d \le 1/2 - \kappa\}$, $0 < \kappa < 1/2$, denote the set of integration orders for which $\varepsilon_t(\theta)$ is stationary, Nielsen (2015, eqn. 13) shows that for any constant $K > 0$ there exists a fixed $\bar\kappa > 0$ such that

$$\Pr\left( \inf_{d \in D \setminus D^*(\bar\kappa),\ \theta \in \Theta} \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t^2(\theta) > K \right) \to 1, \quad \text{as } n \to \infty, \tag{C.6}$$

implying that $\Pr(\hat d \in D^*(\bar\kappa) \cap \theta \in \Theta) \to 1$ as $n \to \infty$. From (C.6) it follows that the relevant parameter space asymptotically reduces to the stationary region $\Theta^*(\bar\kappa) = \{\theta \,|\, \theta \in \Theta,\ d \in D^*(\bar\kappa)\}$.

Since the model is identified, for consistency it remains to be shown that a uniform weak law of large numbers (UWLLN) holds for the objective function within the stationary region of the parameter space. A UWLLN holds if both the objective function and the supremum of the gradient satisfy a WLLN, see Wooldridge (1994, thm. 4.2 and eqn. 4.4) and Newey (1991, cor. 2.2). While a WLLN for the objective function follows directly from (C.5), it remains to be shown that

$$\sup_{\theta \in \Theta^*(\kappa)} \left| \frac{\partial R(\theta)}{\partial \theta} \right| = O_p(1), \tag{C.7}$$

for any fixed $0 < \kappa < 1/2$. For [...] $\varepsilon_t$, MA weights $\sum_{i=0}^{\infty} |m_{h,i}(\theta)| < \infty$, $h = 1, 2$, and the set $\tilde\Theta = \{\theta \,|\, \theta \in \Theta,\ d_0 - d < 1/2\}$, it holds that

$$\sup_{\theta \in \tilde\Theta} \left| \frac{1}{n} \sum_{t=1}^{n} \left[ \frac{\partial^j \Delta_+^{d - d_0}}{\partial d^j} \sum_{i=0}^{\infty} m_{1,i}(\theta)\, \varepsilon_{t-i} \right] \left[ \frac{\partial^k \Delta_+^{d - d_0}}{\partial d^k} \sum_{i=0}^{\infty} m_{2,i}(\theta)\, \varepsilon_{t-i} \right] \right| = O_p(1), \tag{C.8}$$

for $j, k \ge 0$. The gradient is

$$\frac{\partial R(\theta)}{\partial \theta} = \frac{2}{n} \sum_{t=1}^{n} \varepsilon_t(\theta) \frac{\partial \varepsilon_t(\theta)}{\partial \theta}, \qquad \frac{\partial \varepsilon_t(\theta)}{\partial \theta} = \frac{\partial \psi_+(L, \theta)}{\partial \theta} \Delta_+^{d} y_t + \psi_+(L, \theta) \frac{\partial \Delta_+^{d - d_0}}{\partial \theta} \Delta_+^{d_0} y_t. \tag{C.9}$$

Since $\psi_+(L, \theta)$ satisfies the absolute summability condition for (C.8), it follows that

$$\sup_{\theta \in \Theta^*(\kappa)} \left| \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t(\theta)\, \psi_+(L, \theta) \frac{\partial \Delta_+^{d - d_0}}{\partial d} \Delta_+^{d_0} y_t \right| = O_p(1), \tag{C.10}$$

while the partial derivatives of $\Delta_+^{d - d_0}$ w.r.t. $\sigma_\eta^2$, $\sigma_u^2$ are zero.

For the remaining term in (C.9), note that the sum of absolute coefficients of the truncated polynomial $\psi_+(L, \theta)$ is bounded by the sum of absolute coefficients of the untruncated polynomial $\psi(L, \theta) = \varphi(L, \theta)^{-1}$. Thus, it is sufficient to prove absolute summability of the coefficients in $\partial \psi(L, \theta)/\partial \theta = -\psi^2(L, \theta)\, (\partial \varphi(L, \theta)/\partial \theta)$. Absolute summability of the coefficients in $\partial \varphi(L, \theta)/\partial \theta$ is shown in lemma D.1 in appendix D. Since $\psi(L, \theta)$ is stable, $\partial \psi(L, \theta)/\partial \theta$ satisfies the absolute summability condition for (C.8) and thus

$$\sup_{\theta \in \Theta^*(\kappa)} \left| \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t(\theta) \frac{\partial \psi_+(L, \theta)}{\partial \theta} \Delta_+^{d} y_t \right| = O_p(1). \tag{C.11}$$

From (C.10) and (C.11) it follows that (C.7) holds. Consequently, the supremum of the gradient satisfies a WLLN for $\theta \in \Theta^*(\kappa)$, which generalizes the pointwise convergence of the objective function to weak convergence, implying that a UWLLN holds for the objective function. Since the model is identified, consistency of the CSS estimator follows from the UWLLN together with (C.6), and thus $\hat\theta \xrightarrow{p} \theta_0$ as $n \to \infty$, see Wooldridge (1994, thm. 4.3).
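The prediction error $v_t(\theta) = y_t - \mathrm{E}_\theta(y_t|\mathcal{F}_{t-1})$ that enters the CSS objective can be checked numerically without the reduced-form polynomial $\psi_+(L, \theta)$, by using the covariance representation from lemma 3.1. The sketch below is an illustration under stated assumptions and is not the paper's fractional filter (13): it builds the full covariance matrix of $(y_1, ..., y_n)$ implied by (9) under the type II definition, and reads all one-step prediction errors off its Cholesky factor at $O(n^3)$ cost. The names `cov_y`, `prediction_errors`, and `css_objective` are hypothetical, and $\theta$ is passed as the tuple $(d, \sigma_\eta^2, \sigma_u^2)$.

```python
import numpy as np


def pi_coeffs(d, n):
    """First n coefficients of (1 - L)^d via the binomial recursion."""
    pi = np.empty(n)
    pi[0] = 1.0
    for i in range(1, n):
        pi[i] = pi[i - 1] * (i - 1 - d) / i
    return pi


def cov_y(theta, n):
    """Covariance matrix of (y_1, ..., y_n) for y = x + u, Delta_+^d x = eta."""
    d, sigma2_eta, sigma2_u = theta
    weights = pi_coeffs(-d, n)  # MA weights of x_t in terms of eta
    lag = np.subtract.outer(np.arange(n), np.arange(n))
    # Lower-triangular Toeplitz matrix S with S[t, s] = pi_{t-s}(-d), so x = S @ eta.
    S = np.where(lag >= 0, weights[np.abs(lag)], 0.0)
    return sigma2_eta * S @ S.T + sigma2_u * np.eye(n)


def prediction_errors(theta, y):
    """One-step errors v_t = y_t - E(y_t | y_1, ..., y_{t-1}) by linear projection."""
    y = np.asarray(y, dtype=float)
    C = np.linalg.cholesky(cov_y(theta, len(y)))  # Sigma = C C'
    standardized = np.linalg.solve(C, y)          # unit-variance innovations
    # Rescaling by the diagonal of C undoes the standardization.
    return np.diag(C) * standardized


def css_objective(theta, y):
    """Mean of squared one-step prediction errors, cf. R(theta) in (C.3)."""
    return np.mean(prediction_errors(theta, y) ** 2)
```

As a plausibility check, for a pure random walk ($d = 1$, $\sigma_u^2 = 0$) the prediction errors reduce to the first differences of $y_t$. The objective could in principle be handed to a numerical optimizer such as `scipy.optimize.minimize`, although for longer samples a recursive scheme is the more sensible route.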
Proof of Theorem 3.3. Since the CSS estimator is consistent, see theorem 3.2, the asymptotic distribution theory can be inferred from a Taylor expansion of the score function about $\theta_0$

$$\sqrt{n}\, \frac{\partial R(\theta)}{\partial \theta}\bigg|_{\theta = \hat\theta} = \sqrt{n}\, \frac{\partial R(\theta)}{\partial \theta}\bigg|_{\theta = \theta_0} + \sqrt{n}\, \frac{\partial^2 R(\theta)}{\partial \theta \partial \theta'}\bigg|_{\theta = \bar\theta} (\hat\theta - \theta_0), \tag{C.12}$$

where the entries in $\bar\theta$ satisfy $|\bar\theta_i - \theta_{0,i}| \le |\hat\theta_i - \theta_{0,i}|$ for all $i = 1, 2, 3$, and $\theta_i$ denotes the $i$-th entry of $\theta = (d, \sigma_\eta^2, \sigma_u^2)'$, $i = 1, 2, 3$. The score function at $\theta_0$ follows from (C.9)

$$\sqrt{n}\, \frac{\partial R(\theta)}{\partial \theta}\bigg|_{\theta = \theta_0} = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \varepsilon_t(\theta) \frac{\partial \varepsilon_t(\theta)}{\partial \theta}\bigg|_{\theta = \theta_0} = S_n + o_p(1), \tag{C.13}$$

where

$$S_n = \frac{2}{\sqrt{n}} \sum_{t=1}^{n} \varepsilon_t \frac{\partial \tilde\varepsilon_t(\theta)}{\partial \theta}\bigg|_{\theta = \theta_0}. \tag{C.14}$$

$\tilde\varepsilon_t(\theta) = \psi(L, \theta)\varphi(L, \theta_0)\Delta^{d - d_0} \varepsilon_t$ is the untruncated residual generated by the untruncated $\Delta^d$ and $\psi(L, \theta) = 1 - \sum_{i=1}^{\infty} \psi_i(\theta) L^i$, and the second equality in (C.13) is shown to hold by Robinson (2006, pp. 135-136). In the following, let $S_n^{(j)}$ denote the $j$-th entry of $S_n$ holding the partial derivative w.r.t. $\theta_j$, $j = 1, 2, 3$, and let $C_{0,j}(L, \theta) = \sum_{i=1}^{\infty} C_{0,j,i}(\theta) L^i = \varphi(L, \theta_0)(\partial/\partial \theta_j)[\psi(L, \theta)\Delta^{d - d_0}]$ denote the coefficients of the partial derivative of $\tilde\varepsilon_t(\theta)$ w.r.t. $\theta_j$.

To derive the asymptotic distribution theory for the CSS estimator, a central limit theorem (CLT) is shown to hold for the score function at $\theta_0$. Next, it is proven that a UWLLN holds for the Hessian matrix by showing that the Hessian matrix and its first partial derivatives satisfy a WLLN (Wooldridge; 1994, thm. 4.2). The UWLLN allows to evaluate the Hessian matrix in (C.12) at $\theta_0$ and yields the asymptotic distribution of $\sqrt{n}(\hat\theta - \theta_0)$. As the reduced form coefficients $\varphi(L)$ depend non-trivially on $\theta$, no analytical expression for the asymptotic variance of the CSS estimator is provided. Instead, it will be shown that the CSS estimator is asymptotically normally distributed, and its asymptotic variance is shown to exist. This allows to estimate $\mathrm{Var}(\hat\theta)$ e.g. via the inverse of the numerical Hessian matrix.

Starting with the score function, similar to Nielsen (2015, p. 175) a CLT can be inferred from the Cramér-Wold device by showing that for any 3-dimensional vector $\mu = (\mu_1, \mu_2, \mu_3)'$, it holds that $\mu' S_n = \sum_{j=1}^{3} \mu_j S_n^{(j)} \xrightarrow{d} N(0, \mu' \Omega_0 \mu)$. To see this, define the $\sigma$-algebra $\tilde{\mathcal{F}}_t = \sigma(\{\varepsilon_s, s \le t\})$ generated by the white noise $\varepsilon_t$ and its lags. Next, note that in (C.14) the term $\varepsilon_t [\partial \tilde\varepsilon_t(\theta)/\partial \theta |_{\theta = \theta_0}]$ adapted to $\tilde{\mathcal{F}}_t$ is a stationary MDS, since $\varepsilon_t$ is white noise, the partial derivatives are $\tilde{\mathcal{F}}_{t-1}$-measurable, and the coefficients of the partial derivatives are absolutely summable, as shown in the proof of theorem 3.2. It follows for $\mu' S_n = 2 n^{-1/2} \sum_{t=1}^{n} \nu_t$ with

$$\nu_t = \sum_{j=1}^{3} \nu_{j,t}, \qquad \nu_{j,t} = \mu_j\, \varepsilon_t \frac{\partial \tilde\varepsilon_t(\theta)}{\partial \theta_j}\bigg|_{\theta = \theta_0},$$

that $\nu_t$ adapted to $\tilde{\mathcal{F}}_t$ is a stationary MDS. Similar to Nielsen (2015, p. 175), by the law of large numbers for stationary and ergodic processes, the sum of conditional variances for $\mu' S_n$ with $S_n$ as given in (C.14) is then

$$\frac{1}{n} \sum_{t=1}^{n} \mathrm{E}(\nu_t^2 | \tilde{\mathcal{F}}_{t-1}) = \frac{1}{n} \sum_{t=1}^{n} \sum_{j,k=1}^{3} \mathrm{E}\big[\nu_{j,t} \nu_{k,t} \,\big|\, \tilde{\mathcal{F}}_{t-1}\big] = \sum_{j,k=1}^{3} \mu_j \mu_k\, \sigma_{\varepsilon,0}^2\, \frac{1}{n} \sum_{t=1}^{n} \frac{\partial \tilde\varepsilon_t(\theta)}{\partial \theta_j}\bigg|_{\theta = \theta_0} \frac{\partial \tilde\varepsilon_t(\theta)}{\partial \theta_k}\bigg|_{\theta = \theta_0}$$
$$\xrightarrow{p} \sum_{j,k=1}^{3} \mu_j \mu_k\, \sigma_{\varepsilon,0}^4 \sum_{i=1}^{\infty} C_{0,j,i}(\theta_0)\, C_{0,k,i}(\theta_0) = \sum_{j,k=1}^{3} \mu_j \mu_k\, \Omega_0^{(j,k)}. \tag{C.15}$$

In $C_{0,j}(\theta_0) = \varphi(L, \theta_0)(\partial/\partial \theta_j)[\psi(L, \theta)\Delta^{d - d_0}] \big|_{\theta = \theta_0}$, the partial derivatives of the first polynomial $\partial \psi(L, \theta)/\partial \theta_j = -\psi^2(L, \theta)(\partial \varphi(L, \theta)/\partial \theta_j)$ are absolutely summable for all $j = 1, 2, 3$, as $\psi(L, \theta)$ and $\partial \varphi(L, \theta)/\partial \theta_j$ are absolutely summable, see lemma D.1 in appendix D. Furthermore, $(\partial/\partial d)\Delta^{d - d_0} \big|_{\theta = \theta_0} = -\sum_{j=1}^{\infty} j^{-1} L^j$ (Nielsen; 2015, p. 175), so that $\sum_{i=1}^{\infty} C_{0,j,i}(\theta_0)\, C_{0,k,i}(\theta_0) = O(1)$. Consequently, by the CLT for stationary MDS (see e.g. Davidson; 2000, thm. 6.2.3) $S_n \xrightarrow{d} N(0, \Omega_0)$.

To evaluate the Hessian matrix in (C.12) at $\theta_0$, it remains to be shown that a UWLLN applies to the Hessian matrix (Wooldridge; 1994, thm. 4.4), for which it is sufficient to show that a WLLN holds for the Hessian matrix and for the supremum of its first partial derivatives

$$\sup_{\theta \in \Theta^*(\kappa)} \left| \frac{\partial^3 R(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \right| = O_p(1), \qquad j, k, l = 1, 2, 3, \tag{C.16}$$

for any fixed $\kappa \in (0, 1/2)$. The Hessian matrix is

$$H(\theta) = \frac{\partial^2 R(\theta)}{\partial \theta \partial \theta'} = \frac{2}{n} \sum_{t=1}^{n} \left[ \frac{\partial \varepsilon_t(\theta)}{\partial \theta} \frac{\partial \varepsilon_t(\theta)}{\partial \theta'} + \varepsilon_t(\theta) \frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta \partial \theta'} \right], \tag{C.17}$$

and a WLLN holds for the Hessian matrix if the absolute summability condition for (C.8) is satisfied by the two different terms of the Hessian matrix. Since the coefficients of the first partial derivatives of $\varepsilon_t(\theta)$ were shown to be absolutely summable in the proof of theorem 3.2 for $\theta \in \Theta^*(\kappa)$, the first term in (C.17) directly satisfies the condition for (C.8) and thus is bounded in probability. It remains to be shown that absolute summability holds for the coefficients of $\partial^2 \varepsilon_t(\theta)/(\partial \theta \partial \theta')$. From (C.9)

$$\frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_k} = \frac{\partial \psi_+(L, \theta)}{\partial \theta_j} \frac{\partial \Delta_+^{d - d_0}}{\partial \theta_k} \Delta_+^{d_0} y_t + \frac{\partial \psi_+(L, \theta)}{\partial \theta_k} \frac{\partial \Delta_+^{d - d_0}}{\partial \theta_j} \Delta_+^{d_0} y_t + \psi_+(L, \theta) \frac{\partial^2 \Delta_+^{d - d_0}}{\partial \theta_j \partial \theta_k} \Delta_+^{d_0} y_t + \frac{\partial^2 \psi_+(L, \theta)}{\partial \theta_j \partial \theta_k} \Delta_+^{d} y_t, \tag{C.18}$$

for $j, k = 1, 2, 3$. The coefficients in $\partial \psi_+(L, \theta)/\partial \theta_j$ were already shown to be absolutely summable in the proof of theorem 3.2, and thus the first and second term in (C.18) satisfy the absolute summability condition for (C.8). As the coefficients in $\psi(L, \theta)$ are absolutely summable, the third term in (C.18) is also bounded by (C.8), so that only the coefficients of the second partial derivatives of $\psi_+(L, \theta)$ need to be shown to be absolutely summable. As their sum is bounded by the sum of absolute coefficients of the untruncated polynomial $\psi(L, \theta) = \varphi(L, \theta)^{-1}$, it is sufficient to prove absolute summability for the latter. For this, consider

$$\frac{\partial^2 \psi(L, \theta)}{\partial \theta_j \partial \theta_k} = 2 \psi^3(L, \theta) \frac{\partial \varphi(L, \theta)}{\partial \theta_j} \frac{\partial \varphi(L, \theta)}{\partial \theta_k} - \psi^2(L, \theta) \frac{\partial^2 \varphi(L, \theta)}{\partial \theta_j \partial \theta_k}, \qquad j, k = 1, 2, 3, \tag{C.19}$$

where the coefficients of first and second partial derivatives of $\varphi(L, \theta)$ are shown to be absolutely summable in lemma D.1 in appendix D. Thus, (C.18) satisfies the absolute summability condition for (C.8), so that the Hessian matrix (C.17) satisfies a WLLN.

To prove (C.16), consider

$$\frac{\partial^3 R(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} = \frac{2}{n} \sum_{t=1}^{n} \left[ \frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_k} \frac{\partial \varepsilon_t(\theta)}{\partial \theta_l} + \frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_l} \frac{\partial \varepsilon_t(\theta)}{\partial \theta_k} + \frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta_k \partial \theta_l} \frac{\partial \varepsilon_t(\theta)}{\partial \theta_j} + \varepsilon_t(\theta) \frac{\partial^3 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \right],$$

$j, k, l = 1, 2, 3$, where absolute summability of the coefficients of the first three terms was already shown. Consequently, for the last term to also satisfy the condition for (C.8), the coefficients of the third partial derivatives of $\varepsilon_t(\theta)$ need to be shown to be absolutely summable. The derivatives are

$$\frac{\partial^3 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} = \frac{\partial^3 \psi_+(L, \theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \Delta_+^{d} y_t + \psi_+(L, \theta) \frac{\partial^3 \Delta_+^{d - d_0}}{\partial \theta_j \partial \theta_k \partial \theta_l} \Delta_+^{d_0} y_t + r_t(\theta), \tag{C.20}$$

and $r_t(\theta)$ holds the products of first and second partial derivatives of $\psi(L, \theta)$ and $\Delta_+^{d - d_0}$ that have already been shown to satisfy the absolute summability condition for (C.8). The second term in (C.20) directly satisfies the condition for (C.8), so that only the first term remains to be checked. As before, the partial derivatives of the untruncated polynomial are considered, as they are an upper bound for the sum of absolute coefficients of the truncated polynomial. From (C.19)

$$\frac{\partial^3 \psi(L, \theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} = 2 \psi^3(L, \theta) \left[ \frac{\partial^2 \varphi(L, \theta)}{\partial \theta_j \partial \theta_k} \frac{\partial \varphi(L, \theta)}{\partial \theta_l} + \frac{\partial^2 \varphi(L, \theta)}{\partial \theta_j \partial \theta_l} \frac{\partial \varphi(L, \theta)}{\partial \theta_k} + \frac{\partial^2 \varphi(L, \theta)}{\partial \theta_k \partial \theta_l} \frac{\partial \varphi(L, \theta)}{\partial \theta_j} \right]$$
$$\qquad - 6 \psi^4(L, \theta) \frac{\partial \varphi(L, \theta)}{\partial \theta_j} \frac{\partial \varphi(L, \theta)}{\partial \theta_k} \frac{\partial \varphi(L, \theta)}{\partial \theta_l} - \psi^2(L, \theta) \frac{\partial^3 \varphi(L, \theta)}{\partial \theta_j \partial \theta_k \partial \theta_l}, \qquad j, k, l = 1, 2, 3.$$

Absolute summability of the coefficients of the partial derivatives of $\varphi(L, \theta)$ up to order three is shown in lemma D.1 in appendix D. Consequently, (C.20) satisfies the absolute summability condition for (C.8), so that (C.16) holds. Thus, a UWLLN holds for the Hessian matrix, so that pointwise convergence generalizes to weak convergence. This, together with consistency of $\hat\theta$ (see theorem 3.2), allows to evaluate the Hessian matrix in (C.12) at $\theta_0$. Analogously to (C.13), it follows from the argument of Robinson (2006, pp. 135-136) that the partial derivatives of $\varepsilon_t(\theta)$ in (C.17) can be replaced by those of $\tilde\varepsilon_t(\theta)$ as $n \to \infty$, and $\varepsilon_t(\theta_0)$ can be replaced by $\varepsilon_t$, which yields

$$\frac{\partial^2 R(\theta)}{\partial \theta_j \partial \theta_k}\bigg|_{\theta = \theta_0} = \frac{2}{n} \sum_{t=1}^{n} \left[ \frac{\partial \varepsilon_t(\theta)}{\partial \theta_j}\bigg|_{\theta = \theta_0} \frac{\partial \varepsilon_t(\theta)}{\partial \theta_k}\bigg|_{\theta = \theta_0} + \varepsilon_t(\theta) \frac{\partial^2 \varepsilon_t(\theta)}{\partial \theta_j \partial \theta_k}\bigg|_{\theta = \theta_0} \right] \xrightarrow{p} \Omega_0^{(j,k)}, \tag{C.21}$$

as $n \to \infty$. The second term converges to zero in probability, as the second partial derivatives are $\tilde{\mathcal{F}}_{t-1}$-measurable, and thus the second term adapted to $\tilde{\mathcal{F}}_{t-1}$ is a stationary MDS.

Solving (C.12) for $\sqrt{n}(\hat\theta - \theta_0)$ and plugging in the limits for first and second partial derivatives yields

$$\sqrt{n}(\hat\theta - \theta_0) = -H(\bar\theta)^{-1} \sqrt{n}\, \frac{\partial R(\theta)}{\partial \theta}\bigg|_{\theta = \theta_0} \xrightarrow{d} N(0, \Omega_0^{-1}), \tag{C.22}$$

as $n \to \infty$, which completes the proof.

D Partial derivatives of $\varphi(L, \theta)$

Lemma D.1 (Absolute summability of partial derivatives). For $\varphi(L, \theta)$ in

$$\varphi(L, \theta)\, \sigma_\varepsilon \varepsilon_t^* = \sigma_\varepsilon \sum_{i=0}^{t-1} \varphi_i(\theta)\, \varepsilon_{t-i}^* = \sigma_\eta \eta_t^* + \Delta_+^{d} \sigma_u u_t^* = \sigma_\eta \eta_t^* + \sigma_u \sum_{i=0}^{t-1} \pi_i(d)\, u_{t-i}^*, \tag{D.1}$$

with $\varepsilon_t^* \sim WN(0, 1)$, $u_t^* \sim WN(0, 1)$, $\eta_t^* \sim WN(0, 1)$, $\varphi_0(\theta) = 1$, it holds that

$$\lim_{t \to \infty} \sum_{i=1}^{t-1} \left| \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \right| < \infty, \tag{D.2}$$

$$\lim_{t \to \infty} \sum_{i=1}^{t-1} \left| \frac{\partial^2 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k} \right| < \infty, \tag{D.3}$$

$$\lim_{t \to \infty} \sum_{i=1}^{t-1} \left| \frac{\partial^3 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \right| < \infty, \tag{D.4}$$

for all $j, k, l = 1, 2, 3$, and all $\theta \in \Theta$, where $\theta_j$ denotes the $j$-th entry of $\theta = (d, \sigma_\eta^2, \sigma_u^2)'$.

Proof of lemma D.1. The following results are required to prove (D.2) to (D.4).
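For $\sigma_\varepsilon^2$, note that by solving the variance of (D.1) for $\sigma_\varepsilon^2$

$$\sigma_\varepsilon^2 = \frac{\sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d)}{\sum_{i=0}^{t-1} \varphi_i^2(\theta)}. \tag{D.5}$$

Since $\partial^j \pi_i(d)/\partial d^j = O(i^{-d-1}(1 + \log i)^j)$ for all $i \ge 1$, $j \ge 0$, see Johansen and Nielsen (2010, lemma B.3), and thus $\lim_{t \to \infty} \sum_{i=1}^{t-1} |\partial^j \pi_i(d)/\partial d^j| < \infty$ for all $j \ge 0$, it follows that

$$\frac{\partial}{\partial \theta_j} \left[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \right] = O(1), \tag{D.6}$$

$$\frac{\partial^2}{\partial \theta_j \partial \theta_k} \left[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \right] = O(1), \tag{D.7}$$

$$\frac{\partial^3}{\partial \theta_j \partial \theta_k \partial \theta_l} \left[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \right] = O(1), \tag{D.8}$$

for all $j, k, l = 1, 2, 3$. For the same reason, it follows from (D.1) that

$$\frac{\partial\, \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j} = O(1)\, \varepsilon_t^* + \sum_{i=1}^{t-1} O\big(i^{-d-1}(1 + \log i)\big)\, \varepsilon_{t-i}^*, \tag{D.9}$$

$$\frac{\partial^2 \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j \partial \theta_k} = O(1)\, \varepsilon_t^* + \sum_{i=1}^{t-1} O\big(i^{-d-1}(1 + \log i)^2\big)\, \varepsilon_{t-i}^*, \tag{D.10}$$

$$\frac{\partial^3 \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j \partial \theta_k \partial \theta_l} = O(1)\, \varepsilon_t^* + \sum_{i=1}^{t-1} O\big(i^{-d-1}(1 + \log i)^3\big)\, \varepsilon_{t-i}^*, \tag{D.11}$$

and the limits stem from the first, second and third partial derivatives of $\sigma_\eta \eta_t^* + \sigma_u \sum_{i=0}^{t-1} \pi_i(d)\, u_{t-i}^*$ w.r.t. $d$, while all coefficients of the other partial derivatives are bounded below. Consequently, (D.9) to (D.11) are MA processes with absolutely summable coefficients. Note that this is not sufficient for absolute summability of the partial derivatives of $\varphi(L, \theta)$, as $\sigma_\varepsilon$ in the numerators of (D.9) to (D.11) also depends on $\theta$.

For (D.2), consider $\partial \sigma_\varepsilon^2/\partial \theta_j = c_1(\theta, \theta_j) - c_2(\theta, \theta_j)$, where

$$c_1(\theta, \theta_j) = \frac{\frac{\partial}{\partial \theta_j} \left[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \right]}{\sum_{i=0}^{t-1} \varphi_i^2(\theta)} = O(1), \qquad c_2(\theta, \theta_j) = \frac{2 \sigma_\varepsilon^2 \sum_{i=1}^{t-1} \varphi_i(\theta) \frac{\partial \varphi_i(\theta)}{\partial \theta_j}}{\sum_{i=0}^{t-1} \varphi_i^2(\theta)}, \tag{D.12}$$

and the first term is $O(1)$ due to (D.6). For the partial derivative of $\varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*$ one then has

$$\frac{\partial\, \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j} = \sigma_\varepsilon \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \varepsilon_{t-i}^* + \frac{1}{2 \sigma_\varepsilon} \frac{\partial \sigma_\varepsilon^2}{\partial \theta_j} \varphi(L, \theta) \varepsilon_t^*$$
$$= \frac{c_1(\theta, \theta_j)}{2 \sigma_\varepsilon} \varphi(L, \theta) \varepsilon_t^* - \frac{c_2(\theta, \theta_j)}{2 \sigma_\varepsilon} \varphi(L, \theta) \varepsilon_t^* + \sigma_\varepsilon \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \varepsilon_{t-i}^*. \tag{D.13}$$

From (D.9) it follows that the term on the left hand side (LHS) is a MA process with absolutely summable coefficients for any $t$. Since the same holds for $\varphi(L, \theta) \varepsilon_t^*$, by (D.12) the first term on the right hand side (RHS) is also a MA process with absolutely summable coefficients.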
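Consequently, the difference of the latter two terms on the RHS

$$\sigma_\varepsilon \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \varepsilon_{t-i}^* - \frac{c_2(\theta, \theta_j)}{2 \sigma_\varepsilon} \varphi(L, \theta) \varepsilon_t^* = -\frac{c_2(\theta, \theta_j)}{2 \sigma_\varepsilon} \varepsilon_t^* + \sum_{i=1}^{t-1} \left[ \sigma_\varepsilon \frac{\partial \varphi_i(\theta)}{\partial \theta_j} - \frac{c_2(\theta, \theta_j)\, \varphi_i(\theta)}{2 \sigma_\varepsilon} \right] \varepsilon_{t-i}^*,$$

is also a MA process with absolutely summable coefficients. As the contemporaneous impact of $\varepsilon_t^*$ cannot cancel, it follows that $c_2(\theta, \theta_j) = O(1)$ is bounded, and thus the second term on the RHS of (D.13) is a MA process with absolutely summable coefficients. For the equality in (D.13) to hold, it must thus hold that $\sigma_\varepsilon \sum_{i=1}^{t-1} (\partial \varphi_i(\theta)/\partial \theta_j)\, \varepsilon_{t-i}^*$ is also a MA process with absolutely summable coefficients for any $t$, which proves (D.2).

For (D.3) one has $\partial^2 \sigma_\varepsilon^2/(\partial \theta_j \partial \theta_k) = c_1(\theta, \theta_j, \theta_k) - c_2(\theta, \theta_j, \theta_k)$ with

$$c_1(\theta, \theta_j, \theta_k) = \left\{ \frac{\partial^2}{\partial \theta_j \partial \theta_k} \left[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \right] - 2 \frac{\partial \sigma_\varepsilon^2}{\partial \theta_k} \sum_{i=1}^{t-1} \varphi_i(\theta) \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \right\} \left[ \sum_{i=0}^{t-1} \varphi_i^2(\theta) \right]^{-1}$$
$$\qquad - \left\{ 2 \frac{\partial \sigma_\varepsilon^2}{\partial \theta_j} \sum_{i=1}^{t-1} \varphi_i(\theta) \frac{\partial \varphi_i(\theta)}{\partial \theta_k} + 2 \sigma_\varepsilon^2 \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \frac{\partial \varphi_i(\theta)}{\partial \theta_k} \right\} \left[ \sum_{i=0}^{t-1} \varphi_i^2(\theta) \right]^{-1}, \tag{D.14}$$

$$c_2(\theta, \theta_j, \theta_k) = \left[ 2 \sigma_\varepsilon^2 \sum_{i=1}^{t-1} \varphi_i(\theta) \frac{\partial^2 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k} \right] \left[ \sum_{i=0}^{t-1} \varphi_i^2(\theta) \right]^{-1}, \tag{D.15}$$

and $c_1(\theta, \theta_j, \theta_k) = O(1)$ is bounded due to (D.2) and (D.7). The second partial derivatives of $\varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*$ are

$$\frac{\partial^2 \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j \partial \theta_k} = z(\theta, \theta_j, \theta_k) + \frac{1}{2 \sigma_\varepsilon} \frac{\partial^2 \sigma_\varepsilon^2}{\partial \theta_j \partial \theta_k} \varphi(L, \theta) \varepsilon_t^* + \sigma_\varepsilon \sum_{i=1}^{t-1} \frac{\partial^2 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k} \varepsilon_{t-i}^*, \tag{D.16}$$

$$z(\theta, \theta_j, \theta_k) = \frac{1}{2 \sigma_\varepsilon} \frac{\partial \sigma_\varepsilon^2}{\partial \theta_k} \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_j} \varepsilon_{t-i}^* + \frac{1}{2 \sigma_\varepsilon} \frac{\partial \sigma_\varepsilon^2}{\partial \theta_j} \sum_{i=1}^{t-1} \frac{\partial \varphi_i(\theta)}{\partial \theta_k} \varepsilon_{t-i}^* - \frac{1}{4 \sigma_\varepsilon^3} \frac{\partial \sigma_\varepsilon^2}{\partial \theta_j} \frac{\partial \sigma_\varepsilon^2}{\partial \theta_k} \varphi(L, \theta) \varepsilon_t^*,$$

and $z(\theta, \theta_j, \theta_k)$ is a MA process with absolutely summable coefficients due to (D.2).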
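Plugging in $\partial^2 \sigma_\varepsilon^2/(\partial \theta_j \partial \theta_k) = c_1(\theta, \theta_j, \theta_k) - c_2(\theta, \theta_j, \theta_k)$ and rearranging terms yields

$$\frac{\partial^2 \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j \partial \theta_k} - \frac{c_1(\theta, \theta_j, \theta_k)}{2 \sigma_\varepsilon} \varphi(L, \theta) \varepsilon_t^* - z(\theta, \theta_j, \theta_k) = -\frac{c_2(\theta, \theta_j, \theta_k)}{2 \sigma_\varepsilon} \varepsilon_t^* + \sum_{i=1}^{t-1} \left[ \sigma_\varepsilon \frac{\partial^2 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k} - \frac{c_2(\theta, \theta_j, \theta_k)}{2 \sigma_\varepsilon} \varphi_i(\theta) \right] \varepsilon_{t-i}^*,$$

where the LHS is a MA process with absolutely summable coefficients for any $t$ due to (D.10) and (D.14). Again, as the contemporaneous $\varepsilon_t^*$ cannot cancel out, $c_2(\theta, \theta_j, \theta_k) = O(1)$ is bounded. Therefore, $c_2(\theta, \theta_j, \theta_k)/(2 \sigma_\varepsilon)\, \varphi(L, \theta) \varepsilon_t^*$ is a MA process with absolutely summable weights, so that for the equality above to hold, $\sum_{i=1}^{t-1} \partial^2 \varphi_i(\theta)/(\partial \theta_j \partial \theta_k)\, \varepsilon_{t-i}^*$ must also be a MA process with absolutely summable weights for any $t$, which proves (D.3).

Turning to (D.4), the third partial derivatives of the variance parameter $\sigma_\varepsilon^2$ can be represented as $\partial^3 \sigma_\varepsilon^2/(\partial \theta_j \partial \theta_k \partial \theta_l) = c_1(\theta, \theta_j, \theta_k, \theta_l) - c_2(\theta, \theta_j, \theta_k, \theta_l)$ with

$$c_2(\theta, \theta_j, \theta_k, \theta_l) = \left[ 2 \sigma_\varepsilon^2 \sum_{i=1}^{t-1} \varphi_i(\theta) \frac{\partial^3 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \right] \left[ \sum_{i=0}^{t-1} \varphi_i^2(\theta) \right]^{-1}. \tag{D.17}$$

$c_1(\theta, \theta_j, \theta_k, \theta_l)$ holds the products of first and second partial derivatives of $\sigma_\varepsilon^2$ and $\varphi(1, \theta)$ that have already been shown to be $O(1)$, as well as $\partial^3/(\partial \theta_j \partial \theta_k \partial \theta_l) \big[ \sigma_\eta^2 + \sigma_u^2 \sum_{i=0}^{t-1} \pi_i^2(d) \big]$ that is $O(1)$ as shown in (D.8). Consequently $c_1(\theta, \theta_j, \theta_k, \theta_l) = O(1)$, and the exact expression is omitted for brevity. The third partial derivatives of $\varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*$ follow from (D.16) and equal

$$\frac{\partial^3 \varphi(L, \theta) \sigma_\varepsilon \varepsilon_t^*}{\partial \theta_j \partial \theta_k \partial \theta_l} = z(\theta, \theta_j, \theta_k, \theta_l) + \frac{1}{2 \sigma_\varepsilon} \frac{\partial^3 \sigma_\varepsilon^2}{\partial \theta_j \partial \theta_k \partial \theta_l} \varphi(L, \theta) \varepsilon_t^* + \sigma_\varepsilon \sum_{i=1}^{t-1} \frac{\partial^3 \varphi_i(\theta)}{\partial \theta_j \partial \theta_k \partial \theta_l} \varepsilon_{t-i}^*, \tag{D.18}$$

where $z(\theta, \theta_j, \theta_k, \theta_l)$ holds the products of the first and second partial derivatives of $\sigma_\varepsilon^2$ and $\varphi(L, \theta)$ for which absolute summability was shown above.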
Therefore, z ( θ, θ j , θ k , θ l ) is a MA process withabsolutely summable coeﬃcients. Plugging in ∂ σ ε / ( ∂θ j ∂θ k ∂θ l ) = c ( θ, θ j , θ k , θ l ) − c ( θ, θ j , θ k , θ l )37nd rearranging gives ∂ φ ( L, θ ) σ ε ε ∗ t ∂θ j ∂θ k ∂θ l − c ( θ, θ j , θ k , θ l )2 σ ε φ ( L, θ ) ε ∗ t − z ( θ, θ j , θ k , θ l ) = − c ( θ, θ j , θ k , θ l )2 σ ε ε ∗ t + t − (cid:88) i =1 (cid:20) σ ε ∂ φ i ( θ ) ∂θ j ∂θ k ∂θ l − c ( θ, θ j , θ k , θ l )2 σ ε φ i ( θ ) (cid:21) ε ∗ t − i , where the LHS is a MA process with absolutely summable coeﬃcients for any t by (D.11). Asfor the ﬁrst and second partial derivatives, c ( θ, θ j , θ k , θ l ) = O (1) holds, as the contemporaneous ε ∗ t do not cancel on the RHS. Due to boundedness of c ( θ, θ j , θ k , θ l ), the term c ( θ, θ j , θ k , θ l ) = O (1) φ ( L, θ ) ε ∗ t is a MA process with absolutely summable weights. Since all other terms are MAprocesses with absolutely summable weights, (cid:80) t − i =1 ∂ φ i ( θ ) / ( ∂θ j ∂θ k ∂θ l ) ε ∗ t − i must also be a MAprocess with absolutely summable coeﬃcients for the above equality to hold. This proves (D.4).38 eferences Acemoglu, D., Chernozhukov, V., Werning, I. and Whinston, M. D. (2020). Optimal targetedlockdowns in a multi-group SIR model,

NBER Working Paper 27102 , National Bureau ofEconomic Research.

URL: https://ideas.repec.org/p/nbr/nberwo/27102.html

Avery, C., Bossert, W., Clark, A., Ellison, G. and Ellison, S. F. (2020). An economist’s guide toepidemiology models of infectious disease,

Journal of Economic Perspectives (4): 79–104.BBC News (2020a). Coronavirus: Germany restricts social life in ‘lockdown light’. 2 November2020. (Accessed 15 January 2021). URL:

BBC News (2020b). Coronavirus: Germany to go into lockdown over Christmas. 13 December2020. (Accessed 15 January 2021).

URL:

BBC News (2020c). Coronavirus: Italy’s Conte oﬀers hope as travel restrictions end. 3 June2020. (Accessed 15 January 2021).

URL:

BBC News (2020d). Coronavirus: What went wrong at Germany’s G¨utersloh meat factory. 25June 2020. (Accessed 15 January 2021).

URL:

Bergman, A., Sella, Y., Agre, P. and Casadevall, A. (2020). Oscillations in U.S. COVID-19incidence and mortality data reﬂect diagnostic and reporting factors, mSystems (4).Chang, Y., Miller, J. I. and Park, J. Y. (2009). Extracting a common stochastic trend: Theorywith some applications, Journal of Econometrics (2): 231–247.Chernozhukov, V., Kasahara, H. and Schrimpf, P. (2021). Causal impact of masks, policies,behavior on early covid-19 pandemic in the U.S.,

Journal of Econometrics (1): 23–62.Clark, P. K. (1987). The cyclical component of U.S. economic activity,

The Quarterly Journal ofEconomics (4): 797–814.Davidson, J. (2000).

Econometric Theory , Blackwell Publishers.Deutsche Welle (2020a). Coronavirus digest: Europe toughens restrictions as cases rise. 13October 2020. (Accessed 15 January 2021).

URL:

Deutsche Welle (2020b). Coronavirus: Germany toughens restrictions as it enters ‘decisive’phase. 14 October 2020. (Accessed 15 January 2021).

URL:

Deutsche Welle (2020c). Italy toughens coronavirus measures amid second wave surge. 25October 2020. (Accessed 15 January 2021). 39

RL:

Diebold, F. X. and Rudebusch, G. D. (1991). Is consumption too smooth? Long memory andthe Deaton paradox,

The Review of Economics and Statistics (1): 1–9.Dong, E., Du, H. and Gardner, L. (2020). An interactive web-based dashboard to track COVID-19in real time, The Lancet Infectious Diseases (5): 533–534.Durbin, J. and Koopman, S. J. (2012). Time Series Analysis by State Space Methods: SecondEdition , Oxford University Press, Oxford.Felbermayr, G., Hinz, J. and Chowdhry, S. (2020). Apr`es-ski: The spread of Coronavirus fromIschgl through Germany,

Covid Economics: Vetted and Real-Time Papers : 177–204.Granger, C. W. J. and Morris, M. J. (1976). Time series modelling and interpretation, Journalof the Royal Statistical Society. Series A (General) (2): 246–257.Hartl, T., Tschernig, R. and Weber, E. (2020). Fractional trends and cycles in macroeconomictime series, arXiv:2005.05266v2 . URL: https://arxiv.org/abs/2005.05266v2

Hartl, T., Wälde, K. and Weber, E. (2020). Measuring the impact of the German public shutdown on the spread of COVID-19, Covid Economics: Vetted and Real-Time Papers: 25–32.

Harvey, A. C. (1985). Trends and cycles in macroeconomic time series, Journal of Business & Economic Statistics (3): 216–227.

Hethcote, H. W. (2000). The mathematics of infectious diseases, SIAM Review (4): 599–653.

Hodrick, R. J. and Prescott, E. C. (1997). Postwar U.S. business cycles: An empirical investigation, Journal of Money, Credit and Banking (1): 1–16.

Hortaçsu, A., Liu, J. and Schwieg, T. (2021). Estimating the fraction of unreported infections in epidemics with a known epicenter: An application to COVID-19, Journal of Econometrics (1): 106–129.

Hualde, J. and Robinson, P. M. (2011). Gaussian pseudo-maximum likelihood estimation of fractional time series models, The Annals of Statistics (6): 3152–3181.

Johansen, S. and Nielsen, M. Ø. (2010). Likelihood inference for a nonstationary fractional autoregressive model, Journal of Econometrics (1): 51–66.

Korolev, I. (2021). Identification and estimation of the SEIRD epidemic model for COVID-19, Journal of Econometrics (1): 63–85.

Lee, S., Liao, Y., Seo, M. H. and Shin, Y. (2021). Sparse HP filter: Finding kinks in the COVID-19 contact rate, Journal of Econometrics (1): 158–180.

Liu, L., Moon, H. R. and Schorfheide, F. (2021). Panel forecasts of country-level Covid-19 infections, Journal of Econometrics (1): 2–22.

Marinucci, D. and Robinson, P. M. (1999). Alternative forms of fractional Brownian motion, Journal of Statistical Planning and Inference (1–2): 111–122.

McCoy, L. G., Smith, J., Anchuri, K., Berry, I., Pineda, J., Harish, V., Lam, A. T., Yi, S. E., Hu, S., COVID-19 Canada Open Data Working Group: Non-Pharmaceutical Interventions and Fine, B. (2020). CAN-NPI: A curated open dataset of Canadian non-pharmaceutical interventions in response to the global COVID-19 pandemic, Working paper, medRxiv. URL: https://doi.org/10.1101/2020.04.17.20068460

Morley, J. C., Nelson, C. R. and Zivot, E. (2003). Why are the Beveridge-Nelson and unobserved-components decompositions of GDP so different?, The Review of Economics and Statistics (2): 235–243.

Newey, W. K. (1991). Uniform convergence in probability and stochastic equicontinuity, Econometrica (4): 1161–1167.

Nielsen, M. Ø. (2015). Asymptotics for the conditional-sum-of-squares estimator in multivariate fractional time-series models, Journal of Time Series Analysis (2): 154–188.

Oh, K. H., Zivot, E. and Creal, D. (2008). The relationship between the Beveridge-Nelson decomposition and other permanent-transitory decompositions that are popular in economics, Journal of Econometrics (2): 207–219.

Pindyck, R. S. (2020). COVID-19 and the welfare effects of reducing contagion, NBER Working Paper 27121, National Bureau of Economic Research. URL: https://ideas.repec.org/p/nbr/nberwo/27121.html

Robinson, P. M. (2006). Conditional-sum-of-squares estimation of models for stationary time series with long memory, in H.-C. Ho, C.-K. Ing and T. L. Lai (eds), Time Series and Related Topics: In Memory of Ching-Zong Wei, Vol. 52 of IMS Lecture Notes-Monograph Series, Institute of Mathematical Statistics, Beachwood, Ohio, pp. 130–137.

Shimotsu, K. (2010). Exact local Whittle estimation of fractional integration with unknown mean and time trend, Econometric Theory (2): 501–540.

Signorelli, C., Scognamiglio, T. and Odone, A. (2020). Covid-19 in Italy: impact of containment measures and prevalence estimates of infection in the general population, Acta Biomedica (3): 175–179.

The COVID Tracking Project (2021). The “good” metric is pretty bad: Why it’s hard to count the people who have recovered from COVID-19. 13 January 2021. (Accessed 16 January 2021). URL: https://covidtracking.com/analysis-updates/why-its-hard-to-count-recovered

Tschernig, R., Weber, E. and Weigand, R. (2013). Fractionally integrated VAR models with a fractional lag operator and deterministic trends: Finite sample identification and two-step estimation, Working Paper 471, University of Regensburg, Regensburg. URL: https://epub.uni-regensburg.de/27269/

Watson, M. W. (1986). Univariate detrending methods with stochastic trends, Journal of Monetary Economics (1): 49–75.

Wooldridge, J. M. (1994). Estimation and inference for dependent processes, in R. F. Engle and D. McFadden (eds),