On the Time Trend of COVID-19: A Panel Data Study
Chaohua Dong†, Jiti Gao‡, Oliver Linton* and Bin Peng‡
†Zhongnan University of Economics and Law, ‡Monash University, *University of Cambridge
June 25, 2020
Abstract
In this paper, we study the trending behaviour of COVID-19 data at country level, and draw attention to some existing econometric tools which are potentially helpful to understand the trend better in future studies. In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic elsewhere. Africa and America are still facing serious challenges in terms of managing the spread of the virus and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than the other regions. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.
Keywords : COVID-19, Deterministic time trend, Panel data, Varying-coefficient
JEL classification: C23, C54

The first author thanks financial support from the National Natural Science Foundation of China under grant number 71671143; the second and the third authors would like to acknowledge the financial support of the Australian Research Council Discovery Grants program under Grant Number DP200102769.

* Corresponding Author: Oliver Linton, Faculty of Economics, University of Cambridge, Cambridge CB3 9DD, U.K. Email: [email protected]

1 Introduction
Words like "exponential rate" and "flatten the curve" have been widely used across social media since the outbreak of the pandemic caused by COVID-19. Since early 2020, governments around the world have been frequently updating their policies in order to manage the spread of the virus and reduce the death rate while constrained by limited medical resources. Understanding the trending behaviour of the pandemic is therefore crucial from the perspective of policy making.

The paper investigates the trending behaviour of COVID-19 data at country level, and draws attention to some existing econometric tools which are potentially helpful in future work. Trend modelling of COVID-19 data is challenging for at least the following reasons. First, each country shows a dominating deterministic trend, which wipes out other information. Second, the policy of each country has been updated frequently during the pandemic, so analysis using constant parameters may not reflect these impacts properly. Third, time series analysis cannot be conducted for some countries due to small samples, while pooling data together yields a highly unbalanced dataset. In this study, we aim to address the aforementioned challenges, raise some difficulties, and call for studies which can account for these features simultaneously.

Based on our investigation, we find the following econometric literature particularly useful. Deterministic time trend modelling (such as Phillips, 2007, Robinson, 2012 and Gao et al., 2020) helps address the first challenge. Time-varying coefficient models, which date back to Robinson (1989, 1991) or even earlier works, are useful to address the second challenge. In some recent studies, both Gu et al. (2020) and Li and Linton (2020) conduct time series analysis on COVID-19 data of selected countries for different purposes, while Liu et al. (2020) forecast infection of COVID-19 using panel data by a Bayesian methodology. They all agree that time-varying coefficients should be adopted to investigate pandemic data. Factor models and relevant data imputation techniques are closely related to the first and third challenges (e.g., Bai and Ng, 2002, 2019, Su and Wang, 2017, and Su et al., 2019). It is noteworthy that Bai and Ng (2019) and Su et al. (2019) have worked out that certain types of random missing data can be dealt with effectively within the framework of factor analysis.

In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic elsewhere. Africa and America are still facing serious challenges in terms of managing the spread of the virus and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than the other regions. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.

The rest of this paper is organized as follows. Section 2 presents the model, and the estimation strategy with associated asymptotic properties. In Section 3, we provide our empirical findings. Section 4 concludes. Theoretical development, tables and figures are provided in the appendix.

Before proceeding further, it is convenient to introduce some notation that will be used throughout this paper. $\lfloor A \rfloor$ means the largest integer not exceeding $A$; $K(\cdot)$ and $h$ represent a kernel function and a bandwidth of the nonparametric kernel method, respectively; $K_h(u) = K(u/h)/h$; $I(\cdot)$ stands for the indicator function; $\mathrm{diag}\{A_1, \ldots, A_k\}$ means constructing a diagonal matrix from $A_1, \ldots, A_k$.
In this section, we consider two models which we believe are useful to investigate the time trend of COVID-19.

We now present the first model, which captures the trend aspect. The countries, indexed by $i = 1, \ldots, N$, start experiencing the virus at different time points $b_{iT} \in \{1, \ldots, T\}$. For many countries we may have $b_{iT} = 1$, but not all of them. We propose the following model:

$$ y_{it} = \begin{cases} g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon_{it}, & \text{for } t \ge b_{iT}, \\ 0, & \text{otherwise.} \end{cases} \tag{2.1} $$

In model (2.1), $y_{it}$ is the logarithm of the observed number of new cases (plus one to include days that have zero outcomes). $\varepsilon_{it}$ is an error term capturing information less dominating than the trend. Further assumptions will be imposed on $\varepsilon_{it}$ later to account for potential omitted variable issues, to capture second tier information over time, and to allow for certain types of heterogeneity. Theoretically, $\beta_{t,b_{iT}}$ may be unknown. Practically, $\beta_{t,b_{iT}}$ accounts for the impacts of different starting points, and may have different forms depending on the research questions. A commonly used form may be $\beta_{t,b_{iT}} \equiv b_{iT} - 1$. This is not the main focus of the paper, as it does not impact our empirical study very much. The trend of (2.1) can be regarded as a common feature of the virus. Specifically, the value of $a$ characterizes the rate of infection or death, and a larger $a$ indicates a faster rate. $g_i(\cdot)$ is a function that reflects the change of policy over time for country $i$, and captures some heterogeneous features across countries.

We can regard (2.1) as a panel data version of Gao et al. (2020) with an extra moving mean $\beta_{t,b_{iT}}$. This raises a few challenges that are addressed in both the main text and the online supplementary appendix. Before proceeding further, we impose a condition to quantify the impacts of missing values. Specifically, suppose that there exist a sequence of fixed points $\{b_1^*, \ldots, b_N^*\}$ and a known function $\beta^*(\cdot,\cdot)$ such that

$$ (1)\ \max_{i \ge 1}\left|\frac{b_{iT}}{T} - b_i^*\right| = O(T^{-\nu_1}), \qquad (2)\ \max_{i \ge 1}\left|\,\left|\frac{t - \beta_{t,b_{iT}}}{T}\right|^{a} - \beta^*(\tau_t, b_i^*)\right| \le C\,T^{-\nu_2}, \tag{2.2} $$

where $0 \le C < \infty$, and $\nu_1$ and $\nu_2$ are fixed constants satisfying $0 < \nu_1 \le 1$ and $\nu_2 > 0$. When $\beta_{t,b_{iT}} \equiv b_{iT} - 1$ and $\beta^*(\tau_t, b_i^*) = |\tau_t - b_i^*|^{a}$, part (2) of (2.2) holds trivially. Without missing values, the $b_{iT}$'s and $b_i^*$'s reduce to 1 and 0 respectively. Practically, the values of the $b_{iT}$'s and $b_i^*$'s can be controlled by removing a reasonable range of periods from the beginning in order to reduce the impacts of missing values. In practice, one has to find a balance between available sample size and the impact of missing data.

We are interested in recovering information under the framework of (2.1)-(2.2). To carry on our analysis, we write (2.1) in vector form:

$$ Y_t = I_t G(\tau_t) + I_t E_t, \tag{2.3} $$

where $I_t = \mathrm{diag}\{I(t \ge b_{1T}), \ldots, I(t \ge b_{NT})\}$, $Y_t = (y_{1t}, \ldots, y_{Nt})'$, $E_t = (\varepsilon_{1t}, \ldots, \varepsilon_{Nt})'$, and

$$ G(\tau_t) = \big(g_1(\tau_t)\,|t - \beta_{t,b_{1T}}|^{a}, \ldots, g_N(\tau_t)\,|t - \beta_{t,b_{NT}}|^{a}\big)'. $$

Since $G(\cdot)$ is unknown, we adopt the nonparametric kernel approach, and multiply both sides of (2.3) by $K_h^{1/2}(\tau_t - u)$. Given $\tau_t$ in a small neighbourhood of $u$, we obtain

$$ \frac{G(\tau_t)}{T^{a}}\,K_h^{1/2}(\tau_t - u) \approx \mathcal{G}(u), \tag{2.4} $$

where $\mathcal{G}(u) = \big(g_1(u)\beta^*(u, b_1^*), \ldots, g_N(u)\beta^*(u, b_N^*)\big)'$. Thus, after proper normalization (i.e., by $T^{a}$), (2.4) is the leading vector when analysing $Y_t K_h^{1/2}(\tau_t - u)$. However, $a$ is unknown, so it has to be estimated.

In view of (2.3)-(2.4) and motivated by the construction of the Financial Stress Index, we conduct principal component analysis on the sample quantity

$$ \Sigma(u) = \frac{1}{NT}\sum_{t=1}^{T} Y_t Y_t'\,K_h(\tau_t - u) \tag{2.5} $$

for all $u$. (For the Financial Stress Index, the largest eigenvalue and the associated eigenvector are calculated using 18 weekly data series in order to measure the degree of financial stress in the markets. See St. Louis Fed's website for details: https://fred.stlouisfed.org/series/STLFSI2.)

We briefly explain the intuition below. Note that simple algebra yields

$$ \Sigma(u) = \frac{1}{NT}\sum_{t=1}^{T} I_t G(\tau_t) G(\tau_t)' I_t\,K_h(\tau_t - u) + \frac{1}{NT}\sum_{t=1}^{T} I_t E_t E_t' I_t\,K_h(\tau_t - u) + \text{interaction terms}. \tag{2.6} $$

Loosely speaking, the term $\frac{1}{NT}\sum_{t=1}^{T} I_t G(\tau_t) G(\tau_t)' I_t K_h(\tau_t - u)$ of (2.6) contains a quadratic in the time trend that will dominate the other terms. As a consequence, the largest eigenvalue and the associated eigenvector of $\Sigma(u)$ reflect the information associated with this term only, which allows us to focus on the trending properties of the virus, and ignore the secondary information asymptotically. To explain the intuition using an even simpler example, one may consider conducting an OLS regression for $y_t = \rho t + \varepsilon_t$, where as long as $\varepsilon_t$ is not diverging faster than $t$, the information of $\rho$ can always be retrieved.

That said, let $\lambda_u$ and $\ell_u$ be the largest eigenvalue and the corresponding eigenvector of $\Sigma(u)$, with $\|\ell_u\| = 1$. Mathematically,

$$ \lambda_u \ell_u = \Sigma(u)\,\ell_u. \tag{2.7} $$

Accounting for the unbalancedness of the data, we further define the following set:

$$ C = \Big\{ t \;\Big|\; t = 1, \ldots, T,\ \lim_{N \to \infty} \frac{\sharp N_{\tau_t}}{N} = 1 \Big\}, $$

where $N_u = \{ i \mid b_i^* \le u - h,\ 1 \le i \le N \}$, and $\sharp N_u$ represents the cardinality of $N_u$. Let $N_u^c = \{1, \ldots, N\} \setminus N_u$. By construction, $C$ rules out a set of time periods that we cannot make inference on due to the availability of data. In practice, we may require $\sharp N_{\tau_t} \ge N - \ln N$, which replaces the limit in the definition of $C$ as a practical guide to choose $C$. Alternatively, we can let $C = \{\max_{i \ge 1} b_{iT} - c, \ldots, T\}$ with $c$ being a reasonably small positive integer for feasibility and simplicity.

Finally, the estimator of $a$ is presented as follows:

$$ \hat a = \frac{1}{2 \ln T} \cdot \ln\Big\{ \frac{1}{\sharp C} \sum_{t \in C} \lambda_{\tau_t} \Big\}. \tag{2.8} $$

Intuitively, $\frac{1}{\sharp C}\sum_{t \in C} \lambda_{\tau_t}$ yields an estimate of $O(T^{2a})$ using (2.6), so its logarithm is divided by $2 \ln T$ to yield an estimate of $a$.

Below, we present our assumptions and give some justifications.

Assumption 1
1. Let $K(\cdot)$ be a function defined on $[-1, 1]$, $K^{(1)}(w)$ be uniformly bounded on $[-1, 1]$, $\int_{-1}^{1} K(w)\,dw = 1$ and $\int_{-1}^{1} |w| K(w)\,dw < \infty$. Suppose that $h \to 0$ and $Th \to \infty$.

2. (a) Suppose that $\max_{i \ge 1} \sup_{\tau \in D} |F_i(\tau)| < \infty$, where $F_i(\tau) = g_i(\tau)\beta^*(\tau, b_i^*)$ and $D = [\inf_{t \in C} \tau_t, 1]$. As $w \to 0$, let $\max_{i \ge 1} \sup_{\tau \in D} |F_i(\tau + w) - F_i(\tau)| \le c\,|w|^{\mu}$, where $\mu$ and $c$ are positive constants.
   (b) Suppose that there exists a function $\bar g(u)$ such that $\sup_{u \in D} \left| \frac{1}{N}\mathcal{G}(u)'\mathcal{G}(u) - \bar g(u) \right| = O(\phi_{N})$, where $\phi_{N} \to 0$, $\int_D \bar g(u)\,du = 1$, and $\mathcal{G}(\cdot)$ is defined in (2.4).

3. Suppose that $\sup_{u \in D} \frac{1}{NT}\sum_{t=1}^{T} E_t' E_t\,K_h(\tau_t - u) = O_P(\delta_T)$, and $\delta_T / T^{2a} \to 0$.

Assumption 1.2.a on $F_i(\tau)$ requires Lipschitz continuity. It can be further decomposed by putting restrictions on $a$, $\beta^*(\cdot,\cdot)$ and the $g_i(\cdot)$'s, but this would lead to quite lengthy notation and development. Assumption 1.2.b imposes an identification restriction. The condition $\int_D \bar g(u)\,du = 1$ fixes the location of $\bar g(u)$ along the Y-axis, and it has no impact on the quantities in relative terms that we shall explore in the empirical study.

As the error terms include information less dominating than the trend, all we require in Assumption 1.3 is that the magnitude of the secondary information does not overwhelm the trend presented by the virus, which can be regarded as how we model the omitted variable issues in the current setting.

We are now ready to present the asymptotic results associated with our empirical investigation.

Theorem 2.1.
Consider the model stated in (2.1) and (2.2). Under Assumption 1, as $(N, T) \to (\infty, \infty)$,

1. $\sup_{u \in D} \|\ell_u \ell_u' - P_{\mathcal{G}(u)}\| = O_P(\phi_{NT})$, and $\sup_{u \in D} \left| \frac{\lambda_u}{T^{2a}} - \frac{\mathcal{G}(u)'\mathcal{G}(u)}{N} \right| = O_P(\phi_{NT})$;

2. $\hat a - a = O_P\!\left( \frac{\phi_{NT} + \phi_{N}}{\ln T} \right)$,

where $\phi_{NT} = \frac{\delta_T^{1/2}}{T^{a}} + \left\{ \frac{1}{T^{\min\{\nu_1, \nu_2\}}} + \frac{\sharp N_u^c}{N} + h^{\mu} \right\}^{1/2}$, and $P_{\mathcal{G}(u)} = \mathcal{G}(u)\{\mathcal{G}(u)'\mathcal{G}(u)\}^{-1}\mathcal{G}(u)'$.

In addition, suppose $b_i^* = 0$ for $i \ge 1$.

3. For all $t, s \in C$, $R_{ts} - \frac{\beta^*(\tau_t, 0)^2}{\beta^*(\tau_s, 0)^2} \cdot \frac{\|\mathbf{G}(\tau_t)\|^2}{\|\mathbf{G}(\tau_s)\|^2} = O_P(\phi_{NT})$;

4. For all $i, j \in N_u$, $\sup_{u \in D} \left| Q_{u,ij} - \frac{g_i(u)}{g_j(u)} \right| = O_P(\phi_{NT})$,

where $R_{ts} = \lambda_{\tau_t} / \lambda_{\tau_s}$, $\mathbf{G}(u) = (g_1(u), \ldots, g_N(u))'$, $Q_{u,ij} = \ell_{u,i} / \ell_{u,j}$, and $\ell_{u,i}$ stands for the $i$th element of $\ell_u$.

With a balanced dataset, the terms involving $\nu_1$, $\nu_2$, $\mu$ and $\sharp N_u^c$ in the above theorem will vanish, and the asymptotic development will be much simplified. Utilizing panel data, the rate of the second result improves the slow rate of Theorem 4.2 of Gao et al. (2020), wherein a detailed explanation can be found.

The first result explains how the unbalancedness of the data affects the asymptotic results. Also, it implies that we can recover the space spanned by $\mathcal{G}(u)$. Under the condition $b_i^* = 0$ for $i \ge 1$, the result reduces to $\sup_{u \in D} \|\ell_u \ell_u' - P_{\mathbf{G}(u)}\| = O_P(\phi_{NT})$. It is noteworthy that the condition $b_i^* = 0$ for $i \ge 1$ can be accommodated through the choice of $C$ in practice.

For the third result, without loss of generality, suppose that $t > s$. Note that there are two ratios involved in $R_{ts}$, i.e., $\frac{|\beta^*(\tau_t, 0)|}{|\beta^*(\tau_s, 0)|}$ and $\frac{\|\mathbf{G}(\tau_t)\|}{\|\mathbf{G}(\tau_s)\|}$. It is not hard to see that the first ratio measures the rate associated with the virus, while the second reflects the efforts that the countries make to flatten the curves. For effective policies, the ratio $R_{ts}$ should be lower than 1.

The fourth result is also about a ratio, one that provides a way of comparing the effectiveness of two different policies at the same time point. Note that the $g_i(\cdot)$'s model the effectiveness of the policies. A lower value of $g_i(\cdot)$ indicates better efforts in terms of flattening the curve. Thus, if $0 < \frac{g_i(u)}{g_j(u)} < 1$, we may conclude that country $i$ has a more effective policy than country $j$. Otherwise, country $j$ performs relatively better.

Finally, we comment on how the $g_i(\cdot)$'s and $\beta^*(\cdot,\cdot)$ can be recovered. Since $\beta^*(\cdot,\cdot)$ and the $g_i(\cdot)$'s enter the model in multiplicative form, they cannot be individually estimated without further identification restrictions. If one is willing to impose a restriction (such as $\frac{\mathbf{G}(u)'\mathbf{G}(u)}{N} = 1$), then $\beta^*(\cdot,\cdot)$ can be recovered as suggested by the second argument of Theorem 2.1.1. If the form of $\beta^*(\cdot,\cdot)$ were known, the asymptotic distribution associated with the estimate of each $g_i(\cdot)$ could be constructed as in Theorem 4.3 of Gao et al. (2020). Alternatively, Theorem 2.1.4 suggests that for any given $u$ we may pick an individual $i$ as a benchmark, then recover the remaining $g_j(u)$'s and $\beta^*(\cdot,\cdot)$ utilizing the ratio of the fourth result. As these are not the main focus of this paper, we leave the choice of identification strategy to future study. In our empirical work we will emphasise the identified quantities: $a$, and the ratios

$$ R^*_{ts} = \frac{\beta^*(\tau_t, 0)^2}{\beta^*(\tau_s, 0)^2} \cdot \frac{\|\mathbf{G}(\tau_t)\|^2}{\|\mathbf{G}(\tau_s)\|^2} \quad \text{and} \quad Q(u) = \frac{g_i(u)}{g_j(u)}. $$

We consider a second model that is designed to capture a single peaked epidemic trajectory, similar to Li and Linton (2020). We consider the following regression:

$$ y_{it} = \begin{cases} \gamma_i - g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon_{it}, & \text{for } t \ge b_{iT}, \\ 0, & \text{otherwise,} \end{cases} \tag{2.9} $$

where $\gamma_i$ is the global maximum of each individual. When $t = \beta_{t,b_{iT}}$, the global maximum is achieved at $\gamma_i$.

If we have a complete trajectory of the epidemic, or at least data that include the peak and some time afterwards, we may estimate $\gamma_i$ directly. Specifically, we may take any local (in time) smoother and maximize it over time. The smoothing method eliminates the error term, and the resulting function is then uniquely maximized at the true peak time.
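The plug-in step just described can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson smoother with an Epanechnikov kernel and $\tau_t = t/T$; the paper only asks for "any local (in time) smoother", so these choices, and the helper names `local_smoother` and `estimate_peak`, are hypothetical.

```python
import numpy as np


def epanechnikov(w):
    """Epanechnikov kernel, supported on [-1, 1]."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)


def local_smoother(y, h):
    """Nadaraya-Watson smooth of one country's series y_1, ..., y_T.

    Assumption: rescaled time is tau_t = t / T and h is a bandwidth on
    that scale. Any other local smoother could be swapped in here.
    """
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    tau = np.arange(1, T + 1) / T
    # W[s, t] is the kernel weight of observation t when smoothing at tau_s.
    W = epanechnikov((tau[:, None] - tau[None, :]) / h)
    return (W @ y) / W.sum(axis=1)


def estimate_peak(y, h):
    """Return (gamma_hat, t_peak): the maximum of the smoothed series
    and the (1-indexed) period at which it is attained."""
    m = local_smoother(y, h)
    t_peak = int(np.argmax(m)) + 1
    return float(m[t_peak - 1]), t_peak
```

Because smoothing averages across the peak, `gamma_hat` from this sketch tends to sit slightly below the true maximum, which is one concrete form of the plug-in bias that a joint estimation strategy would aim to avoid.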
One can then use the methodology of Section 2.1 to work with the transformed model as follows:

$$ y^*_{it} = \begin{cases} g_i(\tau_t)\,|t - \beta_{t,b_{iT}}|^{a} + \varepsilon^*_{it}, & \text{for } t \ge b_{iT}, \\ 0, & \text{otherwise,} \end{cases} \tag{2.10} $$

where $y^*_{it} = \hat\gamma_i - y_{it}$ and $\varepsilon^*_{it} = -\varepsilon_{it} + (\hat\gamma_i - \gamma_i)$.

Additionally, one may consider an estimation strategy that estimates the parameters of interest simultaneously to avoid the bias caused by the plug-in procedure. We leave this to future study, but we examine the model (2.10) using the approach of Section 2.1 in the empirical study as a robustness check.

In this section, we investigate the time trend of the COVID-19 data. Before proceeding further, we comment on two practical issues: the choice of kernel function and the bandwidth selection procedure.

For the kernel function, we follow Hong and Li (2005) and Su and Wang (2017) and adopt a boundary adjusted kernel. For $t = 1, \ldots, T$, the kernel evaluated at $(\tau_t - u)/h$ is taken to be

$$ \begin{cases} K\big((\tau_t - u)/h\big), & u \in [h, 1 - h], \\[4pt] K\big((\tau_t - u)/h\big) \Big/ \int_{-1}^{(1-u)/h} K(w)\,dw, & u \in (1 - h, 1], \end{cases} $$

where $K(w)$ is the Epanechnikov kernel. By construction of $C$, there is no need to adjust the left boundary.

Next, we provide a bandwidth selection procedure which minimizes a leave-one-out cross validation function as follows:

$$ \hat h = \operatorname*{argmin}_{h} \mathrm{CV}(h), \qquad \mathrm{CV}(h) = \sum_{t \in C} \left\| \frac{Y_t}{\sqrt{N}\,T^{a_h}} - \hat\ell_{-\tau_t} \right\|, $$

where $a_h$ is obtained from (2.8) given $h$, and $\hat\ell_{-\tau_t}$ is obtained from (2.7) by replacing $\Sigma(\tau_t)$ with $\frac{1}{NT}\sum_{s=1, s \ne t}^{T} Y_s Y_s' K_h(\tau_s - \tau_t)$. The terms $\sqrt{N}$ and $T^{a_h}$ are normalizers to ensure that $\hat\ell_{-\tau_t}$ and the normalized $Y_t$ are on the same scale. To examine the sensitivity of the bandwidth selection procedure, we further consider two alternative bandwidths, $h_L < \hat h$ and $h_R > \hat h$, each a fixed multiple of $\hat h$.
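The estimation steps in (2.5), (2.7) and (2.8), together with the boundary adjusted Epanechnikov kernel, can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: it assumes $\tau_t = t/T$, stores the panel as a $T \times N$ array with zeros before each country's start date, and the function names (`kernel_weights`, `sigma_u`, `top_eigenpair`, `estimate_a`, `policy_ratios`) are hypothetical.

```python
import numpy as np


def epanechnikov(w):
    """Epanechnikov kernel, supported on [-1, 1]."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)


def kernel_weights(T, u, h):
    """K_h(tau_t - u) for t = 1, ..., T with the right-boundary adjustment.

    Assumption: tau_t = t / T. For u in (1 - h, 1] the weights are divided
    by the integral of K over [-1, (1 - u) / h], which has the closed form
    0.75 * (c - c**3 / 3) + 0.5 for the Epanechnikov kernel.
    """
    tau = np.arange(1, T + 1) / T
    w = epanechnikov((tau - u) / h) / h
    if u > 1.0 - h:
        c = (1.0 - u) / h
        w = w / (0.75 * (c - c**3 / 3.0) + 0.5)
    return w


def sigma_u(Y, u, h):
    """Sample quantity (2.5): (1 / (N T)) * sum_t Y_t Y_t' K_h(tau_t - u)."""
    T, N = Y.shape
    w = kernel_weights(T, u, h)
    return (Y * w[:, None]).T @ Y / (N * T)


def top_eigenpair(S):
    """Largest eigenvalue and unit-norm eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(S)  # eigenvalues come back in ascending order
    return vals[-1], vecs[:, -1]


def estimate_a(Y, h, C):
    """Estimator (2.8): ln(mean of lambda_{tau_t} over t in C) / (2 ln T)."""
    T = Y.shape[0]
    lams = [top_eigenpair(sigma_u(Y, t / T, h))[0] for t in C]
    return float(np.log(np.mean(lams)) / (2.0 * np.log(T)))


def policy_ratios(ell, ref=0):
    """Ratios ell_i / ell_ref, the sample analogue of g_i(u) / g_ref(u)."""
    return ell / ell[ref]
```

For instance, with `Y` of shape `(T, N)`, `estimate_a(Y, h=0.1, C=range(T // 2 + 1, T + 1))` uses the second half of the sample as $C$; both the bandwidth and this choice of $C$ are placeholders. Since the estimator divides by $2 \ln T$, even noise-free data leave a visible finite-sample gap between $\hat a$ and $a$. The ratio in `policy_ratios` does not depend on the sign of the eigenvector returned by `eigh`.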
We focus on daily new infections and new deaths from four regions, i.e., Africa (AF), America (AM), Asia & Oceania (AO), and Europe (EU); we account for the population density of each country in the following analysis. Note that there are only 8 countries from Oceania in the data source, so we merge Asia and Oceania together. Population density is based on the 2018 data from the World Bank, and is measured as people per sq. km of land area. We exclude countries that do not have population density figures. For each region, the sample period starts from the date when the first confirmed case is recorded, but we remove the first 30 days of each region in order to reduce the impacts of missing data. Finally, we summarize the available sample in Table A.1.

For infection data, the four regions have roughly the same number of countries. However, death data are very unbalanced. We remove the countries with total deaths less than 20 at 31/05/2020, which is why the number of countries drops for death data. It is not surprising that Asia & Oceania has the longest period due to the early outbreak in China, while Africa has the shortest period.

3.2 Results Associated with Model (2.1)

We now start conducting numerical analysis using the approach of Section 2.1. Specifically, we consider two sets of $\{y_{it}\}$ for both infection and death.

Case 1: $\ln(\text{daily increase} + 1)$
Case 2: $\ln\left(\frac{\text{daily increase} + 1}{\text{population density}}\right)$

3.2.1 Overall Analysis

We let $C = \{\lfloor T/\cdot \rfloor + 1, \ldots, T\}$ for simplicity, and summarize the estimates of $a$ in Table A.2, which shows that the estimates are not overly sensitive to different choices of the bandwidth.

For infection data, Europe has the highest values of $\hat a$ for both Cases 1 and 2, which could be due to overall high quality infrastructure leading to high mobility of the entire population. Moreover, America and Asia & Oceania have roughly similar values in both Cases 1 and 2, while Africa has the lowest value, which implies that the virus spreads slower in Africa than in the other regions.

For death data, America has the highest death rate for both Cases 1 and 2. Although the estimates from the original data (i.e., Case 1) indicate that Africa has a very low death rate, the estimates from the normalized version (i.e., Case 2) indicate that the situation is not too optimistic but is still the best among the four regions.

Next, we examine the ratio $R_{t+1,t}$ for $t = \lfloor T/\cdot \rfloor + 1, \ldots, T - 1$.

In this subsection, we compare the performances of countries in each region using the fourth result of Theorem 2.1. Specifically, for each region, we let the country that has the largest value of daily increase at 31/05/2020 be the benchmark, and label it by the index $i = 1$. We summarize the reference countries in Table A.3. We then plot $Q_{\tau_t,i}$ for $i \ge 2$ in Figures A.3 and A.4, where the legend of each sub-plot is ranked by $Q_{\tau_t,i}$ from largest to smallest at the time period $T$. The lines in each sub-plot reflect how the corresponding countries perform at different time points compared to the reference country. As explained under Theorem 2.1, (1) a smaller value indicates better performance, and (2) a value less (greater) than 1 indicates better (worse) performance than the reference country.

Our results somewhat agree with the findings of Gu et al. (2020). For example, (1) countries such as USA, UK and Italy are at the top of the corresponding sub-plots in our investigation, which indicates ineffective performance in terms of managing the spread of the virus and reducing the death rate; (2) on the other hand, our finding also suggests that countries such as Australia, China, Japan, Korea, and Singapore perform relatively well, as in Gu et al. (2020).

Finally, we estimate $a$ and the ratio $R$ using a rolling-window sample in order to capture some dynamics, which in a sense can be regarded as a robustness check on the sensitivity to the data. We prepare the data as in Section 3.1, and remove the first 40 days for each region to avoid the impacts of missing values on the 30 days rolling-window (i.e., $T = 30$ for each regression). For each window, we restrict $C$ to the last few days of the window and estimate $\bar R$ by aggregating $R_{t+1,t}$ over $t \ge 26$ in $C$. We then record the estimated $a$ and $\bar R$ from the first available window till the end.

For effective policies, we expect the estimates of $a$ to show a turning point at a certain stage, and expect the value of $\bar R$ to be below one. We plot the estimates of each region in Figures A.5-A.8, where the X-axis is indexed by the last day of the consecutive 30 days period.

First, we take a look at the values associated with infection in Figures A.5 and A.7. In Figure A.5, the curves of Africa and America keep increasing at a very steady rate, which is a concern from the perspective of flattening the curve. The curves of Asia & Oceania become flat gradually, but the turning points have not shown up yet. Europe is the only continent which has a turning point in Figure A.5, and the pattern exists in both Cases 1 and 2. This further supports that European countries have more effective policies overall. In Figure A.7, the curves of Asia & Oceania and Europe are approaching 1, while the curves of Africa and America are not. In particular, the values of $\bar R$ of Africa start diverging from 1 from late May, which is also worrisome.

Second, we turn to the results associated with death in Figures A.6 and A.8. Clearly, in Figure A.6, the death rate of Europe has been dropping, while Asia & Oceania have managed to flatten the curve, but the turning point has not shown up yet.
Africa and America have increasing death rates during the entire period. In Figure A.8, Europe still performs much better than the other regions, as it is the only region having $\bar R$ less than 1. The curves of Asia & Oceania have been approaching 1, while Africa and America do not show much improvement during the period.

3.3 Results Associated with Model (2.9)

The data and the corresponding settings of this subsection are identical to those in Section 3.2, but we work with the transformed version using (2.10). Still, we consider Cases 1 and 2 for the transformed data. It is noteworthy that under the model (2.9), the interpretations of the values of $a$, $R_{t+1,t}$ and $Q_{\tau_t,i}$ differ from those in Section 3.1. Specifically, effective policies would ensure relatively short periods to reach the peak of the pandemic. In this sense, the first difference is that a large $a$ may not be a sign of a bad situation. The second difference is that we expect the ratio $R_{t+1,t}$ to be greater than 1 under more effective policies, since a larger $R_{t+1,t}$ implies reaching the peak within a shorter period. Finally, for the ratio $Q_{\tau_t,i}$ with $i = 2, \ldots, N$, we expect a value greater than 1 to represent a more effective policy compared to the reference individual.

Note that since Africa and America have not reached the peak, as is evident from the data plots, we do not comment much on the values associated with Africa and America below, although the values for these two regions are reported.

We first summarize the estimates of $a$ in Table A.4. For both infection and death data, the estimates seem to suggest that in Europe the spread of the virus and the death rate reach the peak slower than in the other regions by nature. For Asia & Oceania, the spread of the virus and the death rate tend to reach the peak slightly faster than in Europe for both Cases 1 and 2.

We now focus on the values of $R_{t+1,t}$ presented in Figures A.9 and A.10.
Consistent with what we find in Section 3.2.1, Europe indeed has more effective policies, as the values of $R_{t+1,t}$ are greater than 1 in the entire period for both Cases 1 and 2. Asia & Oceania have some success, but the situation is not as good as in Europe.

For each region, the reference countries are the same as those in Table A.3. The legend of each sub-plot is ranked by $Q_{\tau_t,i}$ from largest to smallest at the time period $T$; however, a larger value implies better performance in this case.

For the infection data of Europe, Case 1 of Figure A.11 fully agrees with Case 1 of Figure A.3, i.e., all countries perform better than the reference country. For the death data of Europe, a similar argument applies to Case 2 of Figure A.12 and Case 2 of Figure A.4.

Interestingly, for Asia & Oceania, the downward trending of Cases 1 and 2 in Figures A.3 and A.4 becomes upward trending in Figures A.11 and A.12. Thus, both models confirm that compared to the reference country, the remaining countries in Asia & Oceania have been improving, or the situation of the reference country has been getting out of control.

In this paper, we study the trending behaviour of COVID-19 data at country level, and draw attention to some existing econometric tools which are potentially helpful to understand the trend better in future studies. In our empirical study, we find that European countries overall flatten the curves more effectively compared to the other regions, while Asia & Oceania also achieve some success, but the situations are not as optimistic as in Europe. Africa and America are still facing serious challenges in terms of managing the spread of the virus and reducing the death rate, although in Africa the virus spreads slower and has a lower death rate than the other regions by nature. By comparing the performances of different countries, our results incidentally agree with Gu et al. (2020), though different approaches and models are considered. For example, both works agree that countries such as USA, UK and Italy perform relatively poorly; on the other hand, Australia, China, Japan, Korea, and Singapore perform relatively better.
References
Bai, J. and Ng, S. (2002), 'Determining the number of factors in approximate factor models', Econometrica (1), 191–221.

Bai, J. and Ng, S. (2019), Matrix completion, counterfactuals, and factor analysis of missing data. Working paper available at https://arxiv.org/abs/1910.06677.

Gao, J., Linton, O. and Peng, B. (2020), 'Inference on a semiparametric model with global power law and local nonparametric trends', Econometric Theory (2), 223–249.

Gu, J., Yan, H., Huang, Y., Zhu, Y., Sun, H., Zhang, X., Wang, Y., Qiu, Y. and Chen, S. (2020), Better strategies for containing COVID-19 epidemics — A study of 25 countries via an extended varying coefficient SEIR model. Working paper available at https://doi.org/10.1101/2020.04.27.20081232.

Hong, Y. and Li, H. (2005), 'Nonparametric specification testing for continuous-time models with applications to term structure of interest rates', The Review of Financial Studies (1), 37–84.

Li, Q. and Racine, J. (2006), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Phillips, P. C. B. (2007), 'Regression with slowly varying regressors and nonlinear trends', Econometric Theory (4), 557–614.

Robinson, P. M. (1989), 'Chapter 15: Nonparametric estimation of time-varying parameters', Statistical Analysis and Forecasting of Economic Structural Change pp. 253–264.

Robinson, P. M. (1991), 'Chapter 13: Time-varying nonlinear regression', Economic Structural Change pp. 179–190.

Robinson, P. M. (2012), 'Inference on power law spatial trends', Bernoulli (2), 644–677.

Su, L., Miao, K. and Jin, S. (2019), On factor models with random missing: EM estimation, inference, and cross validation. Working paper available at https://ink.library.smu.edu.sg/soe_research/2231/.

Su, L. and Wang, X. (2017), 'On time-varying factor models: Estimation and testing', Journal of Econometrics (1), 84–101.
Appendix A
In what follows, $O(1)$ stands for a constant, and may be different at each appearance. Without loss of generality, let $b_{1T} \le b_{2T} \le \cdots \le b_{NT}$ in what follows.

Lemma A.1
Consider the model stated in (2.1) and (2.2) . Under Assumption 1, as ( N, T ) → ( ∞ , ∞ ) ,1. sup u ∈ D (cid:13)(cid:13)(cid:13) NT (cid:80) Tt =1 I t G ( τ t ) T a G ( τ t ) (cid:48) T a I t K h ( τ t − u ) − N G ( u ) G ( u ) (cid:48) (cid:13)(cid:13)(cid:13) = O (cid:16) T min { ν ,ν } + (cid:93) N cu N + h µ (cid:17) ;2. sup u ∈ D (cid:12)(cid:12)(cid:12) NT (cid:80) Tt =1 G ( τ t ) (cid:48) I t G ( τ t ) ( β ∗ ( τ t , b ∗ i )) K h ( τ t − u ) − N (cid:80) i ∈ N u β ∗ ( u, b ∗ i ) g i ( u ) (cid:12)(cid:12)(cid:12) = O (cid:16) T min { ν ,ν } + (cid:93) N cu N + h µ (cid:17) . Proof of Lemma A.1: (1). Writesup u ∈ D (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N T T (cid:88) t =1 I t G ( τ t ) T a G ( τ t ) (cid:48) T a I t K h ( τ t − u ) − N G ( u ) G ( u ) (cid:48) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) (A.1) sup u ∈ D N (cid:88) i ∈ N u (cid:88) j ∈ N u (cid:110) T T (cid:88) t =1 F i ( u ) F j ( u ) I ( t ≥ b iT ) I ( t ≥ b jT ) K h ( τ t − u ) − F i ( u ) F j ( u ) (cid:111) (A.2)+ O (cid:18) T ν + (cid:93) N cu N (cid:19) (A.3)= sup u ∈ D N (cid:88) i ∈ N u (cid:110) (cid:90) b ∗ i F i ( u ) K h ( w − u ) dw − F i ( u ) (cid:111) (A.4)+ sup u ∈ D N (cid:88) i ∈ N u (cid:88) j ∈ N u ,j
        N     T     N     T
AF     48    62    23    50
AM     43    98    20    61
AO     48   123    25   108
EU     49    98    41    69
Table A.2: Estimate of $a$ for Model (2.1)

                        Case 1                      Case 2
                $\widehat{h}$  $h_L$   $h_R$    $\widehat{h}$  $h_L$   $h_R$
Infection  AF      0.239   0.240   0.239       0.393   0.393   0.393
           AM      0.274   0.277   0.272       0.402   0.402   0.402
           AO      0.262   0.263   0.262       0.409   0.401   0.405
           EU      0.328   0.329   0.327       0.449   0.449   0.448
Death      AF      0.021   0.022   0.020       0.321   0.322   0.321
           AM      0.285   0.286   0.285       0.445   0.445   0.445
           AO      0.133   0.133   0.132       0.367   0.366   0.365
           EU      0.235   0.235   0.234       0.409   0.401   0.405

Estimate of $a$ for Model (2.9)

                        Case 1                      Case 2
                $\widehat{h}$  $h_L$   $h_R$    $\widehat{h}$  $h_L$   $h_R$
Infection  AF      0.229   0.230   0.229       0.396   0.396   0.396
           AM      0.187   0.188   0.186       0.358   0.358   0.357
           AO      0.210   0.212   0.209       0.393   0.394   0.392
           EU      0.164   0.164   0.164       0.375   0.375   0.375
Death      AF      0.108   0.109   0.107       0.339   0.339   0.339
           AM      0.096   0.096   0.095       0.367   0.367   0.367
           AO      0.144   0.146   0.143       0.373   0.373   0.372
           EU      0.103   0.104   0.102       0.345   0.345   0.344

Figure A.1: Model 1 - $R_{t+1,t}$ of Infection Data. The left and right panels are Case 1 and Case 2 respectively.
Figure A.2: Model 1 - $R_{t+1,t}$ of Death Data. The left and right panels are Case 1 and Case 2 respectively.
Figure A.3: Model 1 - $Q_{\tau_t,i}$ of Infection Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.
Figure A.4: Model 1 - $Q_{\tau_t,i}$ of Death Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.
Figure A.5: Model 1 - Estimated $a$ of Infection Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.
Figure A.6: Model 1 - Estimated $a$ of Death Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.
Figure A.7: Model 1 - $R$ of Infection Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.
Figure A.8: Model 1 - $R$ of Death Data using Rolling Window. The left and right panels are Case 1 and Case 2 respectively.
Figure A.9: Model 2 - $R_{t+1,t}$ of Infection Data. The left and right panels are Case 1 and Case 2 respectively.
Figure A.10: Model 2 - $R_{t+1,t}$ of Death Data. The left and right panels are Case 1 and Case 2 respectively.
Figure A.11: Model 2 - $Q_{\tau_t,i}$ of Infection Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.
Figure A.12: Model 2 - $Q_{\tau_t,i}$ of Death Data. The top and bottom panels are Case 1 and Case 2 respectively. The reference countries are presented in Table A.3.