On the Time Trend of COVID-19: A Panel Data Study

Abstract

In this paper, we study the trending behaviour of COVID-19 data at country level, and draw attention to some existing econometric tools which are potentially helpful to understand the trend better in future studies. In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic elsewhere. Africa and America are still facing serious challenges in terms of managing the spread of the virus, and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than the other regions. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.


Chaohua Dong†, Jiti Gao‡, Oliver Linton⋆ and Bin Peng‡

†Zhongnan University of Economics and Law, ‡Monash University, ⋆University of Cambridge

June 25, 2020


Keywords: COVID-19, Deterministic time trend, Panel data, Varying-coefficient

JEL classification: C23, C54

The first author thanks financial support from the National Natural Science Foundation of China under grant number 71671143; the second and the third authors would like to acknowledge the financial support of the Australian Research Council Discovery Grants program under Grant Number DP200102769.

⋆Corresponding Author: Oliver Linton, Faculty of Economics, University of Cambridge, Cambridge CB3 9DD, U.K. Email: [email protected]

arXiv [econ.EM]

1 Introduction

Words like “exponential rate” and “flatten the curve” have been widely cited by all sorts of social media since the outbreak of the pandemic caused by COVID-19. Since early 2020, governments around the world have been frequently updating their policies in order to manage the spread of the virus and reduce the death rate while constrained by limited medical resources. Understanding the trending behaviour of the pandemic is therefore crucial from the perspective of policy making.

The paper investigates the trending behaviour of COVID-19 data at country level, and draws attention to some existing econometric tools which are potentially helpful in future work. Trend modelling of COVID-19 data is challenging for at least the following reasons. First, each country shows a dominating deterministic trend, which wipes out other information. Second, the policy of each country has been updated frequently during the pandemic, so analysis using constant parameters may not reflect these impacts properly. Third, time series analysis cannot be conducted for some countries due to small samples, while pooling data together yields a highly unbalanced dataset. In this study, we aim to address the aforementioned challenges, highlight some difficulties, and call for studies which can account for these features simultaneously.

Based on our investigation, we find the following econometric literature particularly useful. Deterministic time trend modelling (such as Phillips, 2007, Robinson, 2012 and Gao et al., 2020) helps address the first challenge. Time-varying coefficient models, which date back to Robinson (1989, 1991) or even earlier works, are useful to address the second challenge. In some recent studies, both Gu et al. (2020) and Li and Linton (2020) conduct time series analysis on COVID-19 data of selected countries for different purposes, while Liu et al. (2020) forecast infection of COVID-19 using panel data by a Bayesian methodology. They all agree that time-varying coefficients should be adopted to investigate pandemic data. Factor models and relevant data imputation techniques are closely related to the first and third challenges (e.g., Bai and Ng, 2002, 2019, Su and Wang, 2017, and Su et al., 2019). It is noteworthy that Bai and Ng (2019) and Su et al. (2019) have worked out that certain types of random missing data can be dealt with effectively within the framework of factor analysis.

In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic elsewhere. Africa and America are still facing serious challenges in terms of managing the spread of the virus and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than in the other regions. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.

The rest of this paper is organized as follows. Section 2 presents the model, and the estimation strategy with associated asymptotic properties. In Section 3, we provide our empirical findings. Section 4 concludes. Theoretical development, tables and figures are provided in the appendix.

Before proceeding further, it is convenient to introduce some notation that will be used throughout this paper. $\lfloor A \rfloor$ means the largest integer not exceeding $A$; $K(\cdot)$ and $h$ represent a kernel function and a bandwidth of the nonparametric kernel method, respectively; $K_h(u) = K(u/h)/h$; $I(\cdot)$ stands for the indicator function; $\mathrm{diag}\{A_1, \ldots, A_k\}$ means constructing a diagonal matrix from $A_1, \ldots, A_k$.
In this section, we consider two models which we believe are useful to investigate the time trend of COVID-19.

We now present the first model, which captures the trend aspect. The countries, indexed by $i = 1, \ldots, N$, start experiencing the virus at different time points $b_{iT} \in \{1, \ldots, T\}$. For many countries we may have $b_{iT} = 1$, but not all of them. We propose the following model:

$$ y_{it} = \begin{cases} g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon_{it}, & \text{for } t \geq b_{iT}, \\ 0, & \text{otherwise}. \end{cases} \tag{2.1} $$

In model (2.1), $y_{it}$ is the logarithm of the observed number of new cases (plus one to include days that have zero outcomes), and $\varepsilon_{it}$ is an error term capturing information less dominating than the trend. Further assumptions will be imposed on $\varepsilon_{it}$ later to account for potential omitted variable issues, to capture second tier information over time, and to allow for certain types of heterogeneity. Theoretically, $\beta_{t,b_{iT}}$ may be unknown. Practically, $\beta_{t,b_{iT}}$ accounts for the impacts of different starting points, and may have different forms depending on the research questions. A commonly used form may be $\beta_{t,b_{iT}} \equiv b_{iT} - 1$. This is not the main focus of the paper, as it does not impact on our empirical study very much. The trend of (2.1) can be regarded as a common feature of the virus. Specifically, the value of $a$ characterizes the rate of infection or death, and a larger $a$ indicates a faster rate. $g_i(\cdot)$ is a function reflecting the change of policy over time for country $i$, and captures some heterogeneous features across countries.

We can regard (2.1) as a panel data version of Gao et al. (2020) with an extra moving mean $\beta_{t,b_{iT}}$. This raises a few challenges that are addressed in both the main text and the online supplementary appendix. Before proceeding further, we impose a condition to quantify the impacts of missing values. Specifically, suppose that there exist a sequence of fixed points $\{b_1^*, \ldots, b_N^*\}$ and a known function $\beta^*(\cdot, \cdot)$ such that

$$ (1).\ \max_{i \geq 1} \left| \frac{b_{iT}}{T} - b_i^* \right| = O(T^{-\nu_1}), \qquad (2).\ \max_{i \geq 1} \left| \left| \frac{t - \beta_{t,b_{iT}}}{T} \right|^{a} - \beta^*(\tau_t, b_i^*) \right| \leq C\, T^{-\nu_2}, \tag{2.2} $$

where $0 \leq C < \infty$, and $\nu_1$ and $\nu_2$ are fixed positive constants. When $\beta_{t,b_{iT}} \equiv b_{iT} - 1$ and $\beta^*(\tau_t, b_i^*) = |\tau_t - b_i^*|^{a}$, part (2) of (2.2) holds trivially. Without missing values, the $b_{iT}$'s and $b_i^*$'s reduce to 1 and 0 respectively. Practically, the values of the $b_{iT}$'s and $b_i^*$'s can be controlled by removing a reasonable range of periods from the beginning in order to reduce the impacts of missing values. In practice, one has to find a balance between available sample size and the impact of missing data.

We are interested in recovering information under the framework of (2.1)-(2.2). To carry on our analysis, we write (2.1) in vector form:

$$ Y_t = I_t\, G(\tau_t) + I_t\, E_t, \tag{2.3} $$

where $I_t = \mathrm{diag}\{I(t \geq b_{1T}), \ldots, I(t \geq b_{NT})\}$, $Y_t = (y_{1t}, \ldots, y_{Nt})'$, $E_t = (\varepsilon_{1t}, \ldots, \varepsilon_{Nt})'$, and

$$ G(\tau_t) = \big( g_1(\tau_t)\,|t - \beta_{t,b_{1T}}|^{a}, \ldots, g_N(\tau_t)\,|t - \beta_{t,b_{NT}}|^{a} \big)'. $$

Since $G(\cdot)$ is unknown, we adopt the nonparametric kernel approach, and multiply both sides of (2.3) by $K_h^{1/2}(\tau_t - u)$. Given $\tau_t$ in a small neighbourhood of $u$, we obtain

$$ \frac{G(\tau_t)}{T^{a}}\, K_h^{1/2}(\tau_t - u) \approx \mathcal{G}(u), \tag{2.4} $$

where $\mathcal{G}(u) = \big( g_1(u)\beta^*(u, b_1^*), \ldots, g_N(u)\beta^*(u, b_N^*) \big)'$. Thus, after proper normalization (i.e., by $T^{a}$), (2.4) is the leading vector when analysing $Y_t K_h^{1/2}(\tau_t - u)$. However, $a$ is unknown, so it has to be estimated.

In view of (2.3)-(2.4) and motivated by the construction of the Financial Stress Index (for which the largest eigenvalue and the associated eigenvector are calculated using 18 weekly data series in order to measure the degree of financial stress in the markets; see the St. Louis Fed's website for details, https://fred.stlouisfed.org/series/STLFSI2), we conduct the principal component analysis on the sample quantity

$$ \Sigma(u) = \frac{1}{NT} \sum_{t=1}^{T} Y_t Y_t'\, K_h(\tau_t - u) \tag{2.5} $$

for all $u$. We briefly explain the intuition below.
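Before turning to that intuition, the sample quantity (2.5) and its leading eigenpair can be sketched numerically. The snippet below is a minimal illustration and not the authors' code: it assumes $\tau_t = t/T$, a balanced panel ($b_{iT} = 1$ and $\beta_{t,b_{iT}} \equiv 0$), constant $g_i$, an Epanechnikov kernel, and simulated data; the function names `sigma_u` and `leading_eigenpair` are hypothetical.

```python
import numpy as np

def epanechnikov(w):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)

def sigma_u(Y, u, h):
    """Sample quantity (2.5): (1/(N*T)) * sum_t Y_t Y_t' K_h(tau_t - u).

    Y has shape (T, N), so row t-1 holds Y_t'. tau_t = t/T is an assumption.
    """
    T, N = Y.shape
    tau = np.arange(1, T + 1) / T
    weights = epanechnikov((tau - u) / h) / h        # K_h(tau_t - u)
    return (Y * weights[:, None]).T @ Y / (N * T)    # N x N matrix

def leading_eigenpair(S):
    """Largest eigenvalue and unit-norm eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(S)                   # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]

# Hypothetical balanced panel: y_it = g_i * t^a + noise, with constant g_i.
rng = np.random.default_rng(0)
N, T, a = 20, 200, 0.3
g = rng.uniform(0.5, 1.5, size=N)
t = np.arange(1, T + 1)
Y = np.outer(t**a, g) + rng.normal(scale=0.5, size=(T, N))

lam, ell = leading_eigenpair(sigma_u(Y, u=0.8, h=0.1))
# Absolute cosine between the leading eigenvector and the vector of g_i's.
alignment = abs(ell @ g) / np.linalg.norm(g)
```

In this simulated setting the trend term dominates the noise, so `ell` is expected to line up closely with the direction of $(g_1, \ldots, g_N)'$, which is the behaviour the eigen-analysis relies on.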
Note that simple algebra yields

$$ \Sigma(u) = \frac{1}{NT} \sum_{t=1}^{T} I_t G(\tau_t) G(\tau_t)' I_t\, K_h(\tau_t - u) + \frac{1}{NT} \sum_{t=1}^{T} I_t E_t E_t' I_t\, K_h(\tau_t - u) + \text{interaction terms}. \tag{2.6} $$

Loosely speaking, the first term on the right hand side of (2.6), $\frac{1}{NT} \sum_{t=1}^{T} I_t G(\tau_t) G(\tau_t)' I_t K_h(\tau_t - u)$, contains a quadratic in the time trend that will dominate the other terms. As a consequence, the largest eigenvalue and the associated eigenvector of $\Sigma(u)$ reflect the information associated with this term only, which allows us to focus on the trending properties of the virus, and ignore the secondary information asymptotically. To explain the intuition using an even simpler example, one may consider conducting an OLS regression for $y_t = \rho t + \varepsilon_t$, where as long as $\varepsilon_t$ is not diverging faster than $t$, the information of $\rho$ can always be retrieved.

That said, let $\lambda_u$ and $\ell_u$ be the largest eigenvalue and the corresponding eigenvector of $\Sigma(u)$, with $\|\ell_u\| = 1$. Mathematically, it is written as

$$ \lambda_u \ell_u = \Sigma(u)\, \ell_u. \tag{2.7} $$

Accounting for the unbalancedness of the data, we further define the following set:

$$ \mathbb{C} = \Big\{ t \;\Big|\; t = 1, \ldots, T,\ \lim_{N \to \infty} \frac{\sharp \mathbb{N}_{\tau_t}}{N} = 1 \Big\}, $$

where $\mathbb{N}_u = \{ i \mid b_i^* \leq u - h,\ 1 \leq i \leq N \}$, and $\sharp \mathbb{N}_u$ represents the cardinality of $\mathbb{N}_u$. Let $\mathbb{N}_u^c = \{1, \ldots, N\} \setminus \mathbb{N}_u$. By construction, $\mathbb{C}$ rules out a set of time periods that we cannot make inference on due to the availability of data. In practice, we may let $\sharp \mathbb{N}_{\tau_t} \geq N - \ln N$, which replaces the limit in the definition of $\mathbb{C}$ as a practical guide to choose $\mathbb{C}$. Alternatively, we can let $\mathbb{C} = \{ \max_{i \geq 1} b_{iT} - c, \ldots, T \}$ with $c$ being a reasonably small positive integer for feasibility and simplicity.

Finally, the estimator of $a$ is presented as follows:

$$ \widehat{a} = \frac{1}{2 \ln T} \cdot \ln \Big\{ \frac{1}{\sharp \mathbb{C}} \sum_{t \in \mathbb{C}} \lambda_{\tau_t} \Big\}. \tag{2.8} $$

Intuitively, $\frac{1}{\sharp \mathbb{C}} \sum_{t \in \mathbb{C}} \lambda_{\tau_t}$ yields an estimate of $O(T^{2a})$ using (2.6), so the logarithm of $\frac{1}{\sharp \mathbb{C}} \sum_{t \in \mathbb{C}} \lambda_{\tau_t}$ is divided by $2 \ln T$ to yield an estimate of $a$.

Below, we present our assumptions and give some justifications.

Assumption 1

1. Let $K(\cdot)$ be a function defined on $[-1, 1]$, and let $K^{(1)}(w)$ be uniformly bounded on $[-1, 1]$. Suppose that $\int_{-1}^{1} K(w)\, dw = 1$ and $\int_{-1}^{1} |w| K(w)\, dw < \infty$. Suppose that $h \to 0$ and $Th \to \infty$.

2. (a) Suppose that $\max_{i \geq 1} \sup_{\tau \in D} |F_i(\tau)| < \infty$, where $F_i(\tau) = g_i(\tau) \beta^*(\tau, b_i^*)$ and $D = [\inf_{t \in \mathbb{C}} \tau_t, 1]$. As $w \to 0$, let $\max_{i \geq 1} \sup_{\tau \in D} |F_i(\tau + w) - F_i(\tau)| \leq c |w|^{\mu}$, where $\mu$ and $c > 0$ are constants.

   (b) Suppose that there exists a function $\bar{g}(u)$ such that $\sup_{u \in D} \big| \frac{1}{N} \mathcal{G}(u)' \mathcal{G}(u) - \bar{g}(u) \big| = O(\phi_{N})$, where $\phi_{N} \to 0$, $\int_D \bar{g}(u)\, du = 1$, and $\mathcal{G}(\cdot)$ is defined in (2.4).

3. Suppose that $\sup_{u \in D} \frac{1}{NT} \sum_{t=1}^{T} E_t' E_t K_h(\tau_t - u) = O_P(\delta_T)$, and $\delta_T / T^{2a} \to 0$.

Assumption 1.2.a on $F_i(\tau)$ requires Lipschitz continuity. It can be further decomposed by putting restrictions on $a$, $\beta^*(\cdot, \cdot)$ and the $g_i(\cdot)$'s, but this would lead to quite lengthy notation and development. Assumption 1.2.b imposes an identification restriction. The condition $\int_D \bar{g}(u)\, du = 1$ fixes the location of $\bar{g}(u)$ along the $Y$-axis, and it has no impact on the quantities in relative terms that we shall explore in the empirical study.

As the error terms include information less dominating than the trend, all we require in Assumption 1.3 is that the magnitude of the secondary information does not overwhelm the trend presented by the virus, which can be regarded as how we model the omitted variable issues in the current setting.

We are now ready to present the asymptotic results associated with our empirical investigation.

Theorem 2.1. Consider the model stated in (2.1) and (2.2). Under Assumption 1, as $(N, T) \to (\infty, \infty)$,

1. $\sup_{u \in D} \| \ell_u \ell_u' - P_{\mathcal{G}(u)} \| = O_P(\phi_{NT})$, and $\sup_{u \in D} \Big| \frac{\lambda_u}{T^{2a}} - \frac{\mathcal{G}(u)' \mathcal{G}(u)}{N} \Big| = O_P(\phi_{NT})$;

2. $\widehat{a} - a = O_P\Big( \frac{\phi_{NT} + \phi_{N}}{\ln T} \Big)$,

where $\phi_{NT} = \frac{\delta_T^{1/2}}{T^{a}} + \Big\{ T^{-\min\{\nu_1, \nu_2\}} + \frac{\sharp \mathbb{N}_u^c}{N} + h^{\mu} \Big\}^{1/2}$, and $P_{\mathcal{G}(u)} = \mathcal{G}(u) \{ \mathcal{G}(u)' \mathcal{G}(u) \}^{-1} \mathcal{G}(u)'$.

In addition, suppose $b_i^* = 0$ for $i \geq 1$.

3. For $\forall\, t, s \in \mathbb{C}$, $R_{ts} - \frac{(\beta^*(\tau_t, 0))^2}{(\beta^*(\tau_s, 0))^2} \cdot \frac{\| \mathbf{G}(\tau_t) \|^2}{\| \mathbf{G}(\tau_s) \|^2} = O_P(\phi_{NT})$;

4. For $\forall\, i, j \in \mathbb{N}_u$, $\sup_{u \in D} \Big| Q_{u,ij} - \frac{g_i(u)}{g_j(u)} \Big| = O_P(\phi_{NT})$,

where $R_{ts} = \frac{\lambda_{\tau_t}}{\lambda_{\tau_s}}$, $\mathbf{G}(u) = (g_1(u), \ldots, g_N(u))'$, $Q_{u,ij} = \frac{\ell_{u,i}}{\ell_{u,j}}$, and $\ell_{u,i}$ stands for the $i$th element of $\ell_u$.

With a balanced dataset, the terms involving $\nu_1$, $\nu_2$, $\mu$ and $\sharp \mathbb{N}_u^c$ in the above theorem will vanish, and the asymptotic development will be much simplified. Utilizing panel data, the rate of the second result improves the slow rate of Theorem 4.2 of Gao et al. (2020), wherein a detailed explanation can be found.

The first result explains how the unbalancedness of the data affects the asymptotic results. Also, it implies that we can recover the space spanned by $\mathcal{G}(u)$. Under the condition $b_i^* = 0$ for $i \geq 1$, the result will reduce to $\sup_{u \in D} \| \ell_u \ell_u' - P_{\mathbf{G}(u)} \| = O_P(\phi_{NT})$. It is noteworthy that the condition $b_i^* = 0$ for $i \geq 1$ is connected with the choice of $\mathbb{C}$ in practice.

For the third result, without loss of generality, suppose that $t > s$. Note that there are two ratios involved in $R_{ts}$, i.e., $\frac{|\beta^*(\tau_t, 0)|^2}{|\beta^*(\tau_s, 0)|^2}$ and $\frac{\| \mathbf{G}(\tau_t) \|^2}{\| \mathbf{G}(\tau_s) \|^2}$. It is not hard to see that the first ratio measures the rate associated with the virus, while the second ratio reflects the efforts that the countries make to flatten the curves. For effective policies, the ratio $R_{ts}$ should be lower than 1.

The fourth result is also about a ratio, one that provides a way of comparing the effectiveness of two different policies at the same time point. Note that the $g_i(\cdot)$'s model the effectiveness of the policies. A lower value of $g_i(\cdot)$ indicates better efforts in terms of flattening the curve. Thus, if $0 < \frac{g_i(u)}{g_j(u)} < 1$, we may conclude that country $i$ has a more effective policy compared to country $j$. Otherwise, country $j$ performs relatively better.

Finally, we comment on how the $g_i(\cdot)$'s and $\beta^*(\cdot, \cdot)$ can be recovered. Since $\beta^*(\cdot, \cdot)$ and the $g_i(\cdot)$'s enter the model in a multiplicative form, they cannot be individually estimated without further identification restrictions. If one is willing to impose a restriction (such as $\frac{\mathbf{G}(u)' \mathbf{G}(u)}{N} = 1$), then $\beta^*(\cdot, \cdot)$ can be recovered as suggested by the second argument of Theorem 2.1.1. If the form of $\beta^*(\cdot, \cdot)$ were known, the asymptotic distribution associated with the estimate of each $g_i(\cdot)$ could be constructed as in Theorem 4.3 of Gao et al. (2020). Alternatively, Theorem 2.1.4 suggests that for any given $u$ we may pick an individual $i$ as a benchmark, then recover the remaining $g_j(u)$'s and $\beta^*(\cdot, \cdot)$ utilizing the ratio of the fourth result. As these are not the main focus of this paper, we leave the choice of identification strategy to future study. In our empirical work we will emphasise the identified quantities: $a$, and the ratios

$$ R_{ts}^* = \frac{(\beta^*(\tau_t, 0))^2}{(\beta^*(\tau_s, 0))^2} \cdot \frac{\| \mathbf{G}(\tau_t) \|^2}{\| \mathbf{G}(\tau_s) \|^2} \quad \text{and} \quad Q(u) = \frac{g_i(u)}{g_j(u)}. $$

We consider a second model that is designed to capture a single peaked epidemic trajectory, similar to Li and Linton (2020). We consider the following regression:

$$ y_{it} = \begin{cases} \gamma_i - g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon_{it}, & \text{for } t \geq b_{iT}, \\ 0, & \text{otherwise}, \end{cases} \tag{2.9} $$

where $\gamma_i$ is the global maximum of each individual. When $t = \beta_{t,b_{iT}}$, the global maximum is achieved at $\gamma_i$.

If we have a complete trajectory of the epidemic, or at least data that includes the peak and some time afterwards, we may estimate $\gamma_i$ directly. Specifically, we may take any local (in time) smoother and maximize this over time. The smoothing method eliminates the error term and then the resulting function is uniquely maximized at the true peak time.
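The direct estimation of $\gamma_i$ described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the smoother is a local constant (Nadaraya-Watson) estimator with an Epanechnikov kernel, $\tau_t = t/T$ is assumed, the bandwidth `h = 0.1` is arbitrary, and the series is simulated; `local_constant_smoother` and `estimate_peak` are hypothetical names.

```python
import numpy as np

def local_constant_smoother(y, h):
    """Nadaraya-Watson smoother of y_1, ..., y_T on the grid tau_t = t/T."""
    T = len(y)
    tau = np.arange(1, T + 1) / T
    w = (tau[:, None] - tau[None, :]) / h
    K = np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)
    # Each row of K has a positive diagonal entry, so the row sums are > 0.
    return (K @ y) / K.sum(axis=1)

def estimate_peak(y, h=0.1):
    """Return (gamma_hat, t_hat): the maximum of the smoothed series and its period."""
    smooth = local_constant_smoother(y, h)
    return float(smooth.max()), int(np.argmax(smooth)) + 1   # periods are 1-based

# Hypothetical single-peaked series: gamma - c * |t - t_peak|^a + noise.
rng = np.random.default_rng(1)
T, gamma, t_peak, a, c = 150, 8.0, 90, 1.5, 0.01
t = np.arange(1, T + 1)
y = gamma - c * np.abs(t - t_peak) ** a + rng.normal(scale=0.1, size=T)

gamma_hat, t_hat = estimate_peak(y)
y_star = gamma_hat - y          # plug-in transformation of the observed series
```

Because the smoother averages the trajectory around its maximum, `gamma_hat` tends to sit slightly below the true peak for a fixed bandwidth, which is one source of the plug-in bias that a joint estimation strategy would try to avoid.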
One can then use the methodology of Section 2.1 to work with the transformed model as follows:

$$ y_{it}^* = \begin{cases} g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon_{it}^*, & \text{for } t \geq b_{iT}, \\ 0, & \text{otherwise}, \end{cases} \tag{2.10} $$

where $y_{it}^* = \widehat{\gamma}_i - y_{it}$ and $\varepsilon_{it}^* = -\varepsilon_{it} + (\widehat{\gamma}_i - \gamma_i)$.

Additionally, one may consider an estimation strategy that estimates the parameters of interest simultaneously to avoid the bias caused by the plug-in procedure. We leave it to future study, but we examine model (2.10) using the approach of Section 2.1 in the empirical study as a robustness check.

In this section, we investigate the time trend of the COVID-19 data. Before proceeding further, we comment on two practical issues: the choice of kernel function and the bandwidth selection procedure.

For the kernel function, we follow Hong and Li (2005) and Su and Wang (2017) to adopt a boundary adjusted kernel, which uses

$$ \begin{cases} K((\tau_t - u)/h), & u \in [h, 1 - h], \\ K((\tau_t - u)/h) \Big/ \int_{-1}^{(1-u)/h} K(w)\, dw, & u \in (1 - h, 1], \end{cases} $$

for $t = 1, \ldots, T$, where $K(w)$ is the Epanechnikov kernel. By construction of $\mathbb{C}$, there is no need to adjust the left boundary.

Next, we provide a bandwidth selection procedure which minimizes a leave-one-out cross validation function as follows:

$$ \widehat{h} = \operatorname*{argmin}_{h} \mathrm{CV}(h), \qquad \mathrm{CV}(h) = \sum_{t \in \mathbb{C}} \Big\| \frac{Y_t}{\sqrt{N}\, T^{a_h}} - \widehat{\ell}_{-\tau_t} \Big\|, $$

where $a_h$ is obtained from (2.8) given $h$, and $\widehat{\ell}_{-\tau_t}$ is obtained from (2.7) by replacing $\Sigma(\tau_t)$ with $\frac{1}{NT} \sum_{s=1, s \neq t}^{T} Y_s Y_s' K_h(\tau_s - \tau_t)$. The terms $\sqrt{N}$ and $T^{a_h}$ are normalizers to ensure that $\widehat{\ell}_{-\tau_t}$ and the normalized $Y_t$ are on the same scale. To examine the sensitivity of the bandwidth selection procedure, we further consider $h_L$ and $h_R$, which scale $\widehat{h}$ by a factor in $(0, 1)$ and by a factor in $(1, 2)$ respectively.
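The boundary adjusted kernel and the estimator (2.8) can be combined in a short numerical sketch. It is an illustration under assumptions rather than the authors' implementation: $\tau_t = t/T$, a balanced simulated panel with constant $g_i$, a fixed bandwidth instead of the cross-validated $\widehat{h}$, a set of periods covering the second half of the sample, and scaling factors 0.8 and 1.2 that are purely illustrative. All function names are hypothetical.

```python
import numpy as np

def epanechnikov(w):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)

def epanechnikov_mass_up_to(x):
    """Integral of the Epanechnikov kernel from -1 to x, for x in [-1, 1]."""
    x = np.clip(x, -1.0, 1.0)
    return 0.75 * (x - x**3 / 3.0) + 0.5

def boundary_adjusted_weights(tau, u, h):
    """Kernel weights K((tau_t - u)/h)/h, rescaled when u lies in (1 - h, 1]."""
    k = epanechnikov((tau - u) / h)
    if u > 1.0 - h:
        k = k / epanechnikov_mass_up_to((1.0 - u) / h)
    return k / h

def estimate_a(Y, h, periods):
    """Estimator (2.8): ln(average leading eigenvalue over `periods`) / (2 ln T)."""
    T, N = Y.shape
    tau = np.arange(1, T + 1) / T
    lams = []
    for t in periods:                                # t is a 1-based period index
        w = boundary_adjusted_weights(tau, tau[t - 1], h)
        S = (Y * w[:, None]).T @ Y / (N * T)         # Sigma(tau_t) as in (2.5)
        lams.append(np.linalg.eigvalsh(S)[-1])       # largest eigenvalue
    return np.log(np.mean(lams)) / (2.0 * np.log(T))

# Hypothetical balanced panel: y_it = g_i * t^a + noise.
rng = np.random.default_rng(2)
N, T, a = 20, 300, 0.3
g = rng.uniform(0.5, 1.5, size=N)
t = np.arange(1, T + 1)
Y = np.outer(t**a, g) + rng.normal(scale=0.5, size=(T, N))

periods = range(T // 2 + 1, T + 1)                   # illustrative choice of the set C
h = 0.1                                              # illustrative bandwidth
# Illustrative scaling factors for the sensitivity check, not taken from the paper.
estimates = {s: estimate_a(Y, s * h, periods) for s in (0.8, 1.0, 1.2)}
```

In this simulation the three estimates should be close to each other and to the true $a$, up to a finite-sample term of order $1/\ln T$ that comes from the unnormalized scale of the $g_i$'s.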
We focus on daily new infections and new deaths from four regions, i.e., Africa (AF), America (AM), Asia & Oceania (AO), and Europe (EU); we account for the population density of each country in the following analysis. Note that there are only 8 countries from Oceania in the data source, so we merge Asia and Oceania together. Population density is based on the data of 2018 from the World Bank, and is measured as people per sq. km of land area. We exclude countries that do not have population density figures. For each region, the sample period starts from the date when the first confirmed case is recorded, but we remove the first 30 days of each region in order to reduce the impacts of missing data. Finally, we summarize the available sample in Table A.1.

For infection data, the four regions have roughly the same number of countries. However, death data are very unbalanced. We remove the countries with total deaths less than 20 at 31/05/2020, which is why the number of countries drops for death data. It is not surprising that Asia & Oceania has the longest period due to the early outbreak in China, while Africa has the shortest period.

3.2 Results Associated with Model (2.1)

We now start conducting numerical analysis using the approach of Section 2.1. Specifically, we consider two sets of $\{y_{it}\}$ for both infection and death.

Case 1: $\ln(\text{daily increase} + 1)$

Case 2: $\ln\Big( \frac{\text{daily increase} + 1}{\text{population density}} \Big)$

3.2.1 Overall Analysis

We let $\mathbb{C} = \{ \lfloor T/\,\cdot\, \rfloor + 1, \ldots, T \}$ for simplicity, and summarize the estimates of $a$ in Table A.2, which shows that the estimates are not overly sensitive to different choices of the bandwidth.

For infection data, Europe has the highest values of $\widehat{a}$ for both Cases 1 and 2, which could be due to overall high quality infrastructure leading to high mobility of the entire population. Moreover, America and Asia & Oceania have roughly similar values in both Cases 1 and 2, while Africa has the lowest value, which implies that the virus spreads in Africa slower than in the other regions.

For death data, America has the highest death rate for both Cases 1 and 2. Although the estimates from the original data (i.e., Case 1) indicate that Africa has a very low death rate, the estimates from the normalized version (i.e., Case 2) indicate that the situation is not too optimistic but is still the best among the four regions.

Next, we examine the ratio $R_{t+1,t}$ for $t = \lfloor T/\,\cdot\, \rfloor + 1, \ldots, T - 1$.

In this subsection, we compare the performances of countries in each region using the fourth result of Theorem 2.1. Specifically, for each region, we let the country that has the largest value of daily increase at 31/05/2020 be the benchmark, and label it by the index $i = 1$. We summarize the reference countries in Table A.3. We then plot $Q_{\tau_t,i}$ for $i \geq 2$ in Figures A.3 and A.4, where the legend of each sub-plot is ranked by $Q_{\tau_t,i}$ from largest to smallest at the time period $T$. The lines in each sub-plot reflect how the corresponding countries perform at different time points compared to the reference country. As explained under Theorem 2.1, (1) a smaller value indicates better performance, and (2) a value less (greater) than 1 indicates better (worse) performance than the reference country.

Our results somewhat agree with the findings of Gu et al. (2020). For example, (1) countries such as USA, UK and Italy are at the top of the corresponding sub-plots in our investigation, which indicates ineffective performance in terms of managing the spread of the virus and reducing the death rate; (2) on the other hand, our findings also suggest that countries such as Australia, China, Japan, Korea, and Singapore perform relatively well, as in Gu et al. (2020).

Finally, we estimate $a$ and the ratio $R$ using a rolling-window sample in order to capture some dynamics, which in a sense can be regarded as a robustness check on the sensitivity of the data. We prepare the data as in Section 3.1, and remove the first 40 days for each region to avoid the impacts of missing values on the 30-day rolling window (i.e., $T = 30$ for each regression). For each window, we let $\mathbb{C}$ consist of the final periods of the window and estimate $\bar{R}$, the average of $R_{t+1,t}$ over those periods, starting from $t = 26$. We then record the estimated $a$ and $\bar{R}$ from the first available window till the end.

For effective policies, we expect the estimates of $a$ to show a turning point at a certain stage, and expect the value of $\bar{R}$ to be below one. We plot the estimates of each region in Figures A.5-A.8, where the $X$-axis is indexed by the last day of the consecutive 30-day period.

First, we take a look at the values associated with infection in Figures A.5 and A.7. In Figure A.5, the curves of Africa and America keep increasing at a very steady rate, which is a concern from the perspective of flattening the curve. The curves of Asia & Oceania become flat gradually, but the turning points have not shown up yet. Europe is the only continent which has a turning point in Figure A.5, and the pattern exists in both Cases 1 and 2. It further supports that European countries have more effective policies overall. In Figure A.7, the curves of Asia & Oceania and Europe are approaching 1, while the curves of Africa and America are not. In particular, the values of $\bar{R}$ of Africa start diverging from 1 from late May, which is also worrisome.

Second, we turn to the results associated with death in Figures A.6 and A.8. Clearly, in Figure A.6, the death rate of Europe has been dropping, while Asia & Oceania have managed to flatten the curve, but the turning point has not shown up yet.
Africa and America have increasing death rates during the entire period. In Figure A.8, Europe still performs much better than the other regions, as it is the only region having $\bar{R}$ less than 1. The curves of Asia & Oceania have been approaching 1, while Africa and America do not show much improvement during the period.

3.3 Results Associated with Model (2.9)

The data and the corresponding settings of this subsection are identical to those in Section 3.2, but we work with the transformed version using (2.10). Still, we consider Cases 1 and 2 for the transformed data. It is noteworthy that under model (2.9), the interpretations of the values of $a$, $R_{t+1,t}$ and $Q_{\tau_t,i}$ are different from those in Section 3.1. Specifically, effective policies would ensure relatively short periods to reach the peak of the pandemic. In this sense, the first difference is that a large $a$ may not be a sign of a bad situation. The second difference is that we expect the ratio $R_{t+1,t}$ to be greater than 1 to indicate more effective policies, since a larger $R_{t+1,t}$ implies reaching the peak within a shorter period. Finally, for the ratio $Q_{\tau_t,i}$ with $i = 2, \ldots, N$, we expect a value greater than 1 to represent a more effective policy compared to the reference individual.

Note that since Africa and America have evidently not reached the peak, as can be seen by inspecting the data plots, we do not comment much on the values associated with Africa and America below, although the values for these two regions are reported.

We first summarize the estimates of $a$ in Table A.4. For both infection and death data, the estimates seem to suggest that in Europe the spread of the virus and the death rate reach the peak slower than in the other regions by nature. For Asia & Oceania, the spread of the virus and the death rate tend to reach the peak slightly faster than in Europe for both Cases 1 and 2.

We now focus on the values of $R_{t+1,t}$ presented in Figures A.9 and A.10. Consistent with what we find in Section 3.2.1, Europe indeed has more effective policies, as the values of $R_{t+1,t}$ are greater than 1 in the entire period for both Cases 1 and 2. Asia & Oceania have some success, but the situation is not as good as in Europe.

For each region, the reference countries are the same as those in Table A.3. The legend of each sub-plot is ranked by $Q_{\tau_t,i}$ from largest to smallest at the time period $T$; however, a larger value implies better performance in this case.

For the infection data of Europe, Case 1 of Figure A.11 fully agrees with Case 1 of Figure A.3, i.e., all countries perform better than the reference country. For the death data of Europe, a similar argument applies to Case 2 of Figure A.12 and Case 2 of Figure A.4.

Interestingly, for Asia & Oceania, the downward trending of Cases 1 and 2 in Figures A.3 and A.4 becomes upward trending in Figures A.11 and A.12. Thus, both models confirm that, compared to the reference country, the rest of the countries in Asia & Oceania have been improving, or the situation of the reference country has been getting out of control.

In this paper, we study the trending behaviour of COVID-19 data at country level, and draw attention to some existing econometric tools which are potentially helpful to understand the trend better in future studies. In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic as in Europe. Africa and America are still facing serious challenges in terms of managing the spread of the virus and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than in the other regions by nature. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.

References

Bai, J. and Ng, S. (2002), ‘Determining the number of factors in approximate factor models’, Econometrica 70(1), 191–221.

Bai, J. and Ng, S. (2019), Matrix completion, counterfactuals, and factor analysis of missing data. Working paper available at https://arxiv.org/abs/1910.06677.

Gao, J., Linton, O. and Peng, B. (2020), ‘Inference on a semiparametric model with global power law and local nonparametric trends’, Econometric Theory 36(2), 223–249.

Gu, J., Yan, H., Huang, Y., Zhu, Y., Sun, H., Zhang, X., Wang, Y., Qiu, Y. and Chen, S. (2020), Better strategies for containing COVID-19 epidemics: A study of 25 countries via an extended varying coefficient SEIR model. Working paper available at https://doi.org/10.1101/2020.04.27.20081232.

Hong, Y. and Li, H. (2005), ‘Nonparametric specification testing for continuous-time models with applications to term structure of interest rates’, The Review of Financial Studies 18(1), 37–84.

Li, Q. and Racine, J. (2006), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Phillips, P. C. B. (2007), ‘Regression with slowly varying regressors and nonlinear trends’, Econometric Theory 23(4), 557–614.

Robinson, P. M. (1989), ‘Chapter 15: Nonparametric estimation of time-varying parameters’, Statistical Analysis and Forecasting of Economic Structural Change pp. 253–264.

Robinson, P. M. (1991), ‘Chapter 13: Time-varying nonlinear regression’, Economic Structural Change pp. 179–190.

Robinson, P. M. (2012), ‘Inference on power law spatial trends’, Bernoulli 18(2), 644–677.

Su, L., Miao, K. and Jin, S. (2019), On factor models with random missing: EM estimation, inference, and cross validation. Working paper available at https://ink.library.smu.edu.sg/soe_research/2231/.

Su, L. and Wang, X. (2017), ‘On time-varying factor models: Estimation and testing’, Journal of Econometrics 198(1), 84–101.

Appendix A

In what follows, $O(1)$ stands for a constant, and may be different at each appearance. Without loss of generality, let $b_{1T} \le b_{2T} \le \cdots \le b_{NT}$ in what follows.

Lemma A.1 Consider the model stated in (2.1) and (2.2). Under Assumption 1, as $(N, T) \to (\infty, \infty)$,

1. $\sup_{u \in D} \left\| \frac{1}{NT} \sum_{t=1}^{T} I_t \frac{G(\tau_t)}{T^a} \frac{G(\tau_t)'}{T^a} I_t K_h(\tau_t - u) - \frac{1}{N} G(u) G(u)' \right\| = O\left( \frac{1}{T^{\min\{\nu_1, \nu_2\}}} + \frac{\sharp N_u^c}{N} + h^{\mu} \right)$;

2. $\sup_{u \in D} \left| \frac{1}{NT} \sum_{t=1}^{T} G(\tau_t)' I_t G(\tau_t)\left(\beta^*(\tau_t, b_i^*)\right) K_h(\tau_t - u) - \frac{1}{N} \sum_{i \in N_u} \beta^*(u, b_i^*) g_i(u) \right| = O\left( \frac{1}{T^{\min\{\nu_1, \nu_2\}}} + \frac{\sharp N_u^c}{N} + h^{\mu} \right)$.

Proof of Lemma A.1: (1). Write

$$
\begin{aligned}
&\sup_{u \in D} \left\| \frac{1}{NT} \sum_{t=1}^{T} I_t \frac{G(\tau_t)}{T^a} \frac{G(\tau_t)'}{T^a} I_t K_h(\tau_t - u) - \frac{1}{N} G(u) G(u)' \right\| && \text{(A.1)} \\
&\le \sup_{u \in D} \frac{1}{N} \sum_{i \in N_u} \sum_{j \in N_u} \left\{ \frac{1}{T} \sum_{t=1}^{T} F_i(u) F_j(u) I(t \ge b_{iT}) I(t \ge b_{jT}) K_h(\tau_t - u) - F_i(u) F_j(u) \right\} && \text{(A.2)} \\
&\quad + O\left( \frac{1}{T^{\nu_1}} + \frac{\sharp N_u^c}{N} \right) && \text{(A.3)} \\
&= \sup_{u \in D} \frac{1}{N} \sum_{i \in N_u} \left\{ \int_{b_i^*}^{1} F_i(u) K_h(w - u)\, dw - F_i(u) \right\} && \text{(A.4)} \\
&\quad + \sup_{u \in D} \frac{1}{N} \sum_{i \in N_u} \sum_{j \in N_u,\, j \ldots}
\end{aligned}
$$
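The step from (A.2) to (A.4) appears to replace a kernel-weighted average over the observed time points $\tau_t = t/T$, restricted to $t \ge b_{iT}$, by an integral of $K_h(w - u)$ over $w \ge b_i^*$. The sketch below checks this Riemann-sum approximation numerically. It is only an illustration under stated assumptions: the Epanechnikov kernel, the rule $b_{iT} = \lceil b_i^* T \rceil$, and the function names `kernel_sum` and `kernel_integral` are hypothetical choices made here, not taken from the paper.

```python
import numpy as np


def epanechnikov(x):
    """Epanechnikov kernel on [-1, 1] (an assumed choice of K, not from the paper)."""
    return 0.75 * np.clip(1.0 - x**2, 0.0, None)


def kernel_sum(u, h, T, b_star):
    """(1/T) * sum_t I(t >= b_T) * K_h(tau_t - u), with tau_t = t/T.

    The start date b_T = ceil(b_star * T) is an assumption made for illustration.
    """
    t = np.arange(1, T + 1)
    tau = t / T
    active = t >= np.ceil(b_star * T)
    return np.sum(active * epanechnikov((tau - u) / h) / h) / T


def kernel_integral(u, h, b_star, grid=200_001):
    """Integral of K_h(w - u) over w in [b_star, 1], by the trapezoidal rule."""
    w = np.linspace(b_star, 1.0, grid)
    vals = epanechnikov((w - u) / h) / h
    dw = w[1] - w[0]
    return dw * (vals.sum() - 0.5 * (vals[0] + vals[-1]))


if __name__ == "__main__":
    u, h, b_star = 0.5, 0.1, 0.2
    target = kernel_integral(u, h, b_star)
    for T in (50, 200, 1000):
        s = kernel_sum(u, h, T, b_star)
        # The gap between the discrete sum and the integral shrinks as T grows.
        print(T, s, abs(s - target))
```

When $u$ lies well inside $[b_i^*, 1]$ the integral equals one, so the bracketed term in (A.4) vanishes. When $u$ is within $h$ of $b_i^*$ the kernel mass is truncated and the integral falls below one, which is one way to see why points near the start dates need separate treatment.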

        N     T     N     T
AF     48    62    23    50
AM     43    98    20    61
AO     48   123    25   108
EU     49    98    41    69

Table A.2: Estimate of $a$ for Model (2.1)

                     Case 1                     Case 2
                $\hat{h}$  $h_L$   $h_R$    $\hat{h}$  $h_L$   $h_R$
Infection  AF   0.239   0.240   0.239    0.393   0.393   0.393
           AM   0.274   0.277   0.272    0.402   0.402   0.402
           AO   0.262   0.263   0.262    0.409   0.401   0.405
           EU   0.328   0.329   0.327    0.449   0.449   0.448
Death      AF   0.021   0.022   0.020    0.321   0.322   0.321
           AM   0.285   0.286   0.285    0.445   0.445   0.445
           AO   0.133   0.133   0.132    0.367   0.366   0.365
           EU   0.235   0.235   0.234    0.409   0.401   0.405

Estimate of $a$ for Model (2.9)

                     Case 1                     Case 2
                $\hat{h}$  $h_L$   $h_R$    $\hat{h}$  $h_L$   $h_R$
Infection  AF   0.229   0.230   0.229    0.396   0.396   0.396
           AM   0.187   0.188   0.186    0.358   0.358   0.357
           AO   0.210   0.212   0.209    0.393   0.394   0.392
           EU   0.164   0.164   0.164    0.375   0.375   0.375
Death      AF   0.108   0.109   0.107    0.339   0.339   0.339
           AM   0.096   0.096   0.095    0.367   0.367   0.367
           AO   0.144   0.146   0.143    0.373   0.373   0.372
           EU   0.103   0.104   0.102    0.345   0.345   0.344

Figure A.1: Model 1, $R_{t+1,t}$ of Infection Data. The left and right panels are Case 1 and Case 2 respectively.

Figure A.2: Model 1, $R_{t+1,t}$ of Death Data. The left and right panels are Case 1 and Case 2 respectively.

Figure A.3: Model 1, $Q_{\tau_t,i}$ of Infection Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.

Figure A.4: Model 1, $Q_{\tau_t,i}$ of Death Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.

Figure A.5: Model 1, Estimated $a$ of Infection Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.

Figure A.6: Model 1, Estimated $a$ of Death Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.

Figure A.7: Model 1, $R$ of Infection Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.

Figure A.8: Model 1, $R$ of Death Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.

Figure A.9: Model 2, $R_{t+1,t}$ of Infection Data. The left and right panels are Case 1 and Case 2 respectively.

Figure A.10: Model 2, $R_{t+1,t}$ of Death Data. The left and right panels are Case 1 and Case 2 respectively.

Figure A.11: Model 2, $Q_{\tau_t,i}$ of Infection Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.

Figure A.12: Model 2, $Q_{\tau_t,i}$ of Death Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.

Submitted on 19 Jun 2020 (v1), last revised 23 Jun 2020 (this version, v2)
