Modeling Long Cycles

Abstract

Recurrent boom-and-bust cycles are a salient feature of economic and financial history. Cycles found in the data are stochastic, often highly persistent, and span substantial fractions of the sample size. We refer to such cycles as "long". In this paper, we develop a novel approach to modeling cyclical behavior specifically designed to capture long cycles. We show that existing inferential procedures may produce misleading results in the presence of long cycles, and propose a new econometric procedure for the inference on the cycle length. Our procedure is asymptotically valid regardless of the cycle length. We apply our methodology to a set of macroeconomic and financial variables for the U.S. We find evidence of long stochastic cycles in the standard business cycle variables, as well as in credit and house prices. However, we rule out the presence of stochastic cycles in asset market data. Moreover, according to our result, financial cycles as characterized by credit and house prices tend to be twice as long as business cycles.



NATASHA KANG AND VADIM MARMER

Vancouver School of Economics, University of British Columbia

Key words. Stochastic cycles, autoregressive processes, local-to-unity asymptotics, confidence sets, business cycle, financial cycle

JEL Classification: C12, C22, C5, E32, E44

1. Introduction

This paper develops an econometric framework for inference on the cyclical properties of time series. We are particularly interested in stochastic cycles arising due to persistent low-frequency oscillatory impulse responses. The period of such cycles spans a substantial fraction of a sample, and the econometrician would be able to observe only a handful of peaks and troughs in the data. We refer to such cycles as "long". It is also important to emphasize the stochastic nature of such cycles. While their peaks may appear to be regularly spaced, the timing of the peaks and even their number in a sample are determined by particular realizations of the shocks.

[Preprint: arXiv, econ.EM, Oct.]

Long cycles are prevalent in macroeconomic and financial data. In a recent paper, Beaudry, Galizia, and Portier (2020) estimate that many variables have cycles of length of approximately 32–40 quarters, which can correspond to 15% and even 20% of their observed samples. Using data from 1960 to 2011, Drehmann, Borio, and Tsatsaronis (2012) estimate that the length of the cycle for credit is 18 years, or approximately 35% of their sample. The first contribution of our paper is to show that existing inferential procedures for the cycle length can be distorted in such cases. We find that substantial distortions occur when the cycle length exceeds 25% of the sample size.

In our second contribution, we propose a new econometric procedure for inference on the periodicity of cycles. The novel aspect of our methodology is that it is specifically designed to take into account the possibility of long persistent stochastic cycles. Our procedure produces a confidence interval for the cycle length that has the following property: its asymptotic coverage probability is correct regardless of the cycle length. Thus, the confidence interval is asymptotically valid both when the period is small relative to the sample size, and when it spans a substantial fraction of observed data. No other procedure in the existing literature has this property. When the data generating process (DGP) is acyclical, in large samples our procedure is expected to produce empty confidence intervals for the cycle length. Hence, the procedure can be used to rule out cyclical behavior.

Stochastic cycles naturally arise in AR(2) models with complex roots. For example, Sargent (1987) describes the region for the values of the autoregressive coefficients that produce a spike in the spectrum in the interior of the $[0, \pi]$ range. [Footnote: The region with an interior spike in the spectrum is a subset of the region with complex roots.] The period of such cycles is determined by the values of the autoregressive coefficients and has a fixed length in time units. According to this model, the cycle length would amount to a negligible fraction of the sample size in large samples. Thus, the long-cycle characteristics of data are not preserved asymptotically: while in a finite sample the cycle length can represent a substantial fraction of the sample size, it would be negligible in the asymptotic approximation. As a result, conventional asymptotic approximations to the finite sample distributions of estimators and statistics would be inaccurate.

The problem can be better understood from the perspective of the literature concerned with inference on the largest autoregressive root (Stock, 1991; Andrews, 1993; Hansen, …). Long cycles correspond to low-frequency fluctuations, which can be indicative of autoregressive roots near unity. [Footnote: Equivalently, the sum of the autoregressive coefficients.] However, it has been shown in the literature that when autoregressive roots are close to unity, the conventional asymptotic theory does not provide an accurate approximation to the finite sample distributions of estimators and statistics. More accurate approximations can be obtained using the so-called local-to-unity asymptotics developed in Phillips (1987, 1988). Moreover, the local-to-unity asymptotics nests the conventional asymptotics as a limiting case. This is achieved by modeling the autoregressive coefficients as drifting with the sample size toward unity at the appropriate rate. Such specifications are consistent with stationary parameter values for finite-sample DGPs and can produce any desired level of persistence. Note also that investigating the asymptotic properties of statistics under specifications with drifting coefficients is required to verify the uniform size of inferential procedures (Andrews, Cheng, and Guggenberger, 2020).

Borrowing from the insights of this literature, we model long cycles using autoregressive specifications with two complex roots as in Sargent (1987); however, the complex roots are drifting and local to one. The drifting specification allows us to preserve the long-cycle feature asymptotically: regardless of the size of the sample, the length of a cycle as a fraction of the sample size remains the same. This point is illustrated in Figure 1. It shows the difference between the simulated sample paths of cyclical processes generated using the fixed-coefficient and drifting-coefficient DGPs for small and large sample sizes. The figure demonstrates that in the model with fixed autoregressive coefficients, by relying on asymptotic approximations one would distort the cyclical properties of data. However, the long-cycle properties are preserved in the limit by relying on asymptotic approximations with drifting coefficients.

[Figure 1. Time plots of AR(2) processes with fixed and drifting coefficients. Panels: (A) fixed coefficients, n = 100; (B) drifting coefficients, n = 100; (C) fixed coefficients, n = 1000; (D) drifting coefficients, n = 1000. In the standard AR(2) specification with fixed coefficients, the period stays the same as the sample size increases. In contrast, the period of an AR(2) process with drifting coefficients grows proportionally with the sample size.]

While we adopt the local-to-unity modeling approach of Phillips (1987, 1988), there are a number of important differences between our paper and the existing literature. First, since the roots are complex conjugates, the long-cycle process is near I(2) instead of near I(1). Second, due to the presence of complex roots, we obtain different asymptotic approximations from those in the literature. The previous results for local-to-unity processes cannot accommodate long cycles, and we develop a novel theory for such processes. [Footnote: In a recent paper, Dou and Müller (2019) propose a generalized local-to-unity ARMA model with multiple local-to-unity autoregressive roots and balanced local-to-unity roots in the moving average component. They do not consider cyclical behavior, and their limiting distributions are different from those arising in our case.]

Thus, the third contribution of this paper is to the literature concerned with inference on autoregressive roots. Most of that literature is focused on the largest autoregressive roots (or the sum of the autoregressive coefficients). In our paper, we also focus on another important and empirically relevant aspect of data: low-frequency stochastic oscillations. To fully analyze such processes, one has to consider complex autoregressive roots. Our results also lay the foundation for a new econometric framework that, besides inference on cyclicality, can also be used to study cointegrating long cycles and phase shifts in macro-financial aggregates. However, the latter are outside the scope of this paper and left for future research.

The fourth contribution of this paper is empirical: we implement our procedure to study the cyclical properties of key macroeconomic and financial indicators using U.S. data. Recurrent boom-and-bust cycles are a salient feature of economic and financial history. A long-standing interest in understanding these ups and downs in the macro-financial aggregates has led to a vast body of literature on business cycles, and a resurgence of research on financial cycles following the financial-crisis-induced Great Recession of 2008. Among these strands of work is the empirical characterization of business and financial cycles. The traditional approach to such characterization is to identify turning points, or peaks and troughs, in the time series using the dating algorithms of Bry and Boschan (1971) and Harding and Pagan (2002). Based on the turning-point analysis, Drehmann, Borio, and Tsatsaronis (2012) highlight the importance of medium-term cycles that last 18 years for credit, 11 years for GDP and 9 years for equity prices.
These findings are in line with studies using frequency-based bandpass filters (see Aikman, Haldane, and Nelson, 2015; Comin and Gertler, 2006).

The cyclical properties of data have also been formally examined in the literature using a variety of methods, including direct and indirect spectrum estimation (e.g. A'Hearn and Woitek, 2001; Strohsal, Proaño, and Wolters, 2019) and structural time series modeling (e.g. Harvey, 1985; Rünstler and Vlekke, 2018). However, they rely on the conventional asymptotic approximations that may produce misleading results with long-cycle data, as we argue in this paper. For example, we show that the periodogram-based estimator is asymptotically biased in the case of long cycles.

Using our methodology, we find that long cycles cannot be ruled out for macroeconomic series such as the real GDP per capita, unemployment rates, and hours per capita. Our results suggest the possibility of much longer cycles than those previously reported in the literature. In addition, we find that financial variables such as credit to the non-financial sector and home prices exhibit long cycles that are even longer than those for the macro variables. Our results support the position that financial cycles operate at a lower frequency than business cycles. However, our most striking result is that we decisively reject stochastic cycles for asset market variables such as the volatility index, credit risk premium, and equity prices. This suggests that the mechanism for asset market fluctuations is different from that of macroeconomic variables and financial variables such as credit and home prices. Importantly, this finding rejects a view suggested in the macro-finance literature that asset prices and economic fluctuations are driven by the same underlying forces: time-varying risk premiums and risk-bearing capacity (see Cochrane, 2017).

The rest of the paper is organized as follows. In Section 2, we present our modeling approach for long cycles. Section 3 presents our core asymptotic results. The results are extended in Section 4 to allow for linear time trends and deterministic cycles. In Section 5, we discuss the size distortions of the conventional inference approach. Section 6 describes our procedure for constructing confidence intervals for the cycle length. Section 7 presents our empirical results. In Appendix A, we discuss the asymptotic properties of the periodogram for long-cycle processes.
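The contrast between fixed and drifting coefficients described around Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulation design: the localization values `c` and `d`, the reference sample size `n_ref`, the Gaussian shocks, and all function names are assumptions made here, and the mapping from roots to AR(2) coefficients anticipates the parametrization the paper introduces in Section 2.

```python
import numpy as np

def ar2_coefficients(r, theta):
    """AR(2) coefficients implied by complex roots r * exp(+/- i * theta)."""
    return 2.0 * r * np.cos(theta), -r ** 2

def simulate_ar2(phi1, phi2, n, rng):
    """Simulate y_t = phi1*y_{t-1} + phi2*y_{t-2} + u_t from zero initial values."""
    u = rng.standard_normal(n)  # iid N(0, 1) shocks (an assumption)
    y = np.zeros(n + 2)
    for t in range(n):
        y[t + 2] = phi1 * y[t + 1] + phi2 * y[t] + u[t]
    return y[2:]

def period_fraction(theta, n):
    """Period 2*pi/theta as a fraction of the sample size n."""
    return 2.0 * np.pi / (theta * n)

rng = np.random.default_rng(0)
c, d = -2.0, 15.0   # hypothetical localization parameters
n_ref = 100         # sample size at which the two designs coincide
fractions, paths = {}, {}
for n in (100, 1000):
    designs = {
        "fixed": (np.exp(c / n_ref), d / n_ref),   # roots frozen at their n_ref values
        "drifting": (np.exp(c / n), d / n),        # roots drift toward one as n grows
    }
    for name, (r, theta) in designs.items():
        paths[name, n] = simulate_ar2(*ar2_coefficients(r, theta), n, rng)
        fractions[name, n] = period_fraction(theta, n)
        print(f"{name:8s} n={n:4d}  period/n = {fractions[name, n]:.3f}")
```

Under the fixed design the period shrinks as a share of the sample when `n` grows, whereas under the drifting design the share stays at $2\pi/d$; plotting the arrays stored in `paths` gives a picture in the spirit of Figure 1.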

2. A model for long cycles

In this section, we present a model for processes exhibiting long cycles. Our objective is to develop a parsimonious modeling approach that allows for cycles with periods spanning non-negligible fractions of observed data. More formally, the model should allow for the period as a fraction of the sample size $n$ to converge to a non-zero constant as $n \to \infty$.

As discussed in the introduction and following Sargent (1987), in the class of autoregressive models, cyclical behavior requires complex roots that come as conjugate pairs, which makes an AR(2) model with serially uncorrelated errors a natural starting point. Later in the paper, we will extend the approach and allow for serially correlated errors. Hence, consider a process $\{y_t\}$ generated according to

$$(1 - \phi_1 L - \phi_2 L^2)\, y_t = u_t, \tag{2.1}$$

where $L$ denotes the lag operator and $\{u_t\}$ is a mean-zero iid sequence with a finite variance. Let $\lambda_1$ and $\lambda_2$ denote the roots of the characteristic equation for the lag polynomial in (2.1):

$$z^2 - \phi_1 z - \phi_2 = 0.$$

When $|\lambda_1| < 1$ and $|\lambda_2| < 1$, $\{y_t\}$ has the following MA($\infty$) representation:

$$y_t = \frac{1}{\lambda_1 - \lambda_2} \sum_{j=0}^{\infty} \left( \lambda_1^{j+1} - \lambda_2^{j+1} \right) u_{t-j}.$$

Suppose the roots $\lambda_1, \lambda_2$ are complex, and consider their polar form representation:

$$\lambda_1, \lambda_2 = r e^{\pm i\theta}, \tag{2.2}$$

where $r$ denotes the modulus, and $\theta$ is the argument of the complex roots.

Given the polar coordinate representation for the roots and the MA($\infty$) representation for the process, we can now write $\{y_t\}$ as

$$y_t = \sum_{j=0}^{\infty} r^j \, \frac{\sin(\theta(j+1))}{\sin(\theta)} \, u_{t-j}. \tag{2.3}$$

According to (2.3), the realized value of $y_t$ is a weighted infinite sum of past realizations of the innovation sequence $\{u_t\}$.
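For completeness, the step from the MA($\infty$) representation to (2.3) can be spelled out as follows. It uses only the polar form (2.2) and Euler's formula; the layout of the derivation is ours and is meant as a reading aid.

```latex
\begin{align*}
\lambda_1^{j+1} - \lambda_2^{j+1}
  &= r^{j+1}\left(e^{i\theta(j+1)} - e^{-i\theta(j+1)}\right)
   = 2i\, r^{j+1} \sin(\theta(j+1)), \\
\lambda_1 - \lambda_2
  &= r\left(e^{i\theta} - e^{-i\theta}\right)
   = 2i\, r \sin(\theta), \\
\frac{\lambda_1^{j+1} - \lambda_2^{j+1}}{\lambda_1 - \lambda_2}
  &= r^{j}\, \frac{\sin(\theta(j+1))}{\sin(\theta)} .
\end{align*}
```

The imaginary unit cancels in the ratio, so the weights on past shocks are real even though the roots are complex.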
When the characteristic roots are complex, the weights or impulse responses are given by a damped sine wave: the impulse response of $y_t$ to $u_{t-j}$ is

$$w_j = r^j \, \frac{\sin(\theta(j+1))}{\sin(\theta)},$$

where the modulus $r$ indicates the rate of decay or the persistence of the sine wave, and the argument $\theta$ corresponds to the angular frequency and determines the period of the sine wave. [Footnote: In the more common exponential decay representation, $r^j = e^{\ln(r) j}$. Restricting to processes with non-explosive roots, i.e. $r \le 1$, we have $\ln(r) \le 0$, and $-\ln(r)$ is known as the decay constant.]

The stochastic process $\{y_t\}$ inherits its oscillatory behaviour precisely from this damped periodic sine weighting function. The closer $r$ is to one, the more persistent is $\{y_t\}$, and the closer $\theta$ is to zero, the lower is the oscillating frequency and the longer the length of cycles in $\{y_t\}$. Stochastic cycles are therefore conveniently captured in an AR(2) model with a pair of complex conjugate roots.

A cyclical process generated according to (2.1) with roots given by (2.2) has an expected cycle length of $2\pi/\theta$. With any fixed parameter value $\theta$, the period as a fraction of the sample size becomes negligible for large $n$. Hence, asymptotic approximations assuming fixed values for the argument $\theta$ can produce distinctly different cyclical behavior from that observed in finite samples. In other words, conventional asymptotics with a fixed complex root argument $\theta$ can distort the cyclical properties of the process. [Footnote: This point is illustrated in Figure 1.] As a result, such asymptotic theory would provide a poor approximation to the actual behavior of the process in finite samples. Since the expected period of a process as a fraction of the sample size is given by $2\pi/(\theta n)$, to preserve the cyclical properties in the limit as $n \to \infty$, one has to consider a drifting sequence of the arguments $\{\theta_n\}$ and allow for $n\theta_n \to d \in [0, \infty]$. [Footnote: The approach can still accommodate conventional asymptotics by allowing $n\theta_n \to \infty$.]

We re-write the AR(2) model in (2.1) as follows:

$$(1 - \phi_{1,n} L - \phi_{2,n} L^2)\, y_t = u_t, \tag{2.4}$$

where $\phi_{1,n}$ and $\phi_{2,n}$ are now drifting coefficients that can change with $n$. [Footnote: As in Phillips (1987, 1988), the solution to the difference equation (2.4) is a triangular array of the form $\{y_{n,t} : t = 1, \ldots, n;\ n \ge 1\}$. However, we suppress the subscript $n$ to simplify the notation.] We denote the corresponding characteristic roots as $\lambda_{1,n}$ and $\lambda_{2,n}$, and make the following assumption.

Assumption 2.1.

The characteristic roots $\lambda_{1,n}, \lambda_{2,n}$ associated with the lag polynomial equation in (2.4) are given by

$$\lambda_{1,n} = e^{(c + id)/n}, \qquad \lambda_{2,n} = e^{(c - id)/n}, \tag{2.5}$$

where $i = \sqrt{-1}$ is the imaginary unit, and $c \le 0$ and $d > 0$ are fixed localization parameters.

In the above assumption, we exclude positive values of $c$ as they correspond to explosive roots. The negative values of $d$ can be excluded as $d$ and $-d$ define the same pair of roots. Note that the autoregressive coefficients are related to the characteristic roots through the following equations:

$$\phi_{1,n} = \lambda_{1,n} + \lambda_{2,n} = 2 e^{c/n} \cos(d/n), \tag{2.6}$$

$$\phi_{2,n} = -\lambda_{1,n} \lambda_{2,n} = -e^{2c/n}. \tag{2.7}$$

Hence, the autoregressive coefficient $\phi_{1,n}$ is local to 2 while $\phi_{2,n}$ is local to $-1$. The sum of the autoregressive coefficients is local to one, and the process can be mistaken for those considered in the local-to-unity literature. As we discuss below, processes defined by (2.6)–(2.7) exhibit persistent stochastic oscillations and, as a result, their asymptotic properties are different from those considered in the local-to-unity literature.

The expressions for the characteristic roots in (2.2) and (2.5) are equivalent with $r$ replaced by $r_n \equiv \exp(c/n)$, and $\theta$ replaced by $\theta_n \equiv d/n$. The modulus $r_n$ in (2.5) has the same representation as the autoregressive parameter in the local-to-unity model in Phillips (1987, 1988). Therefore, values of $c$ close to zero correspond to persistent processes with two roots near unity. Hence, the process defined by Assumption 2.1 can be viewed as near I(2).

The parameter $d$ controls the length of the cycle, where long cycles correspond to values of $d$ near zero. Under Assumption 2.1, the expected period as a fraction of the sample size is given by

$$\tau_\theta \equiv \frac{2\pi}{n\theta_n} = \frac{2\pi}{d}. \tag{2.8}$$

With this parametrization, the length of the cycle as a fraction of the sample size is independent of the sample size, and the resulting asymptotic approximations preserve the cyclical properties of the process in the limit.

An alternative but related measure of the periodicity of a process can be constructed from its spectral properties. The advantage of this measure is that it takes into account the persistence of the process, unlike that based solely on the argument $\theta_n$ of the complex roots. Let $\omega_n^*$ denote the frequency that maximizes the spectral density of the process in (2.4):

$$\omega_n^* = \cos^{-1}\!\left( \frac{-\phi_{1,n}(1 - \phi_{2,n})}{4\phi_{2,n}} \right),$$

and the corresponding period of the process as a fraction of the sample size is given by

$$\tau_{\omega_n^*} \equiv \frac{2\pi/\omega_n^*}{n}.$$

The proposition below provides an asymptotic approximation to the length of a cycle as a fraction of the sample size when measured using the spectrum-based approach.

Proposition 2.1.

Suppose that $\{y_{n,t}\}$ is generated according to (2.4) with characteristic roots satisfying Assumption 2.1 and serially uncorrelated $\{u_t\}$ with a finite variance. Suppose further that $d \ge |c|$. Then its spectrum maximizing frequency $\omega_n^*$ satisfies

$$n\omega_n^* = \sqrt{d^2 - c^2} + O(n^{-1}),$$

and its corresponding spectrum-based period as a fraction of the sample size satisfies

$$\tau_{\omega_n^*} = \frac{2\pi}{\sqrt{d^2 - c^2}} + O(n^{-1}).$$

The proposition shows that, when using spectrum-based measures of the period, the length of a cycle relative to the sample size can be approximated by

$$\tau_\omega \equiv \frac{2\pi}{\sqrt{d^2 - c^2}}. \tag{2.9}$$

Unlike the angular frequency-based measure $\tau_\theta$, the spectrum-based measure $\tau_\omega$ takes into account the persistence of the process as captured by the value of the localization parameter $c$. Note that larger negative values of $c \le 0$ produce less persistent processes. In such cases, the spectrum's peak is closer to the origin, and as a result, less persistent processes may not exhibit any visible cyclical behavior.

In Appendix A, we show that the periodogram-based estimation approach produces biased estimates of $\tau_\omega$. We, therefore, develop below an inference procedure for $\tau_\theta$ and $\tau_\omega$ based on the estimates of the autoregressive coefficients $\phi_{1,n}$ and $\phi_{2,n}$. For that purpose we proceed in two steps. First, we develop a procedure for constructing asymptotically valid confidence sets for the autoregressive parameters $\phi_{1,n}$ and $\phi_{2,n}$. In the second step, we use projection arguments to build confidence intervals for the proposed measures of the length of a cycle $\tau_\theta$ and $\tau_\omega$. The theory developed below relaxes the iid/serially uncorrelated assumptions on $\{u_t\}$.

3. Asymptotics for long-cycle processes

We now provide the asymptotic theory for the process defined in equations (2.4)–(2.5). The theory will be subsequently used for establishing the asymptotic distributions of regression-based statistics involving long-cycle time series. It is also required for developing robust and asymptotically valid inference about the cyclical properties of a process.

The specification proposed in equations (2.4)–(2.5) is akin to the first-order autoregressive local-to-unity root model in Phillips (1987). Assuming that a process $\{x_t\}$ is generated according to $x_t = a_n x_{t-1} + u_t$ with $a_n = \exp(c/n)$, Phillips (1987) shows that the distribution of $\{x_t\}$ can be approximated by an Ornstein-Uhlenbeck diffusion process:

$$n^{-1/2} x_{\lfloor nr \rfloor} = n^{-1/2} \sum_{t=1}^{\lfloor nr \rfloor} e^{c(\lfloor nr \rfloor - t)/n} u_t \Rightarrow \sigma J_c(r), \tag{3.1}$$

where

$$J_c(r) \equiv \int_0^r e^{c(r-s)} \, \mathrm{d}W(s),$$

$r \in [0, 1]$, $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$, $W(\cdot)$ is a standard Brownian motion, $\sigma^2$ denotes the limit of the long-run variance of $\{u_t\}$, "$\Rightarrow$" denotes the weak convergence of probability measures, and it is assumed that $\{u_t\}$ satisfies a Functional Central Limit Theorem (FCLT). Note that the distribution of the Ornstein-Uhlenbeck process $J_c(r)$ depends on the localization parameter $c$. In what follows, we build on these insights.

We make the following assumption on the innovation sequence $\{u_t\}$.

Assumption 3.1 (FCLT).

Let $W(\cdot)$ denote the standard Brownian motion, and let $\sigma^2$ be the limit of the long-run variance of $\{u_t\}$: $\sigma^2 \equiv \lim_{n \to \infty} \mathrm{Var}\left( n^{-1/2} \sum_{t=1}^{n} u_t \right)$. Then for $r \in [0, 1]$,

$$n^{-1/2} \sum_{t=1}^{\lfloor nr \rfloor} u_t \Rightarrow \sigma W(r).$$

[Footnote: Assumption 3.1 holds, for example, when $\{u_t\}$ is a mixing process such that $E(u_t) = 0$ for all $t$, $\sup_t E|u_t|^{\beta + \epsilon} < \infty$ for some $\beta > 2$ and $\epsilon > 0$, and $\{u_t\}$ is $\alpha$-mixing of size $-\beta/(\beta - 2)$ (Phillips, 1987). Alternatively, it holds when $\{u_t\}$ is a linear MA($\infty$) process satisfying the conditions in Phillips and Solo (1992, Theorem 3.15).]

In the case of long cycles, a different limiting process arises from that of the local-to-unity case, with cyclicality reflected by the sine function. However, the process can also be described as an integral with respect to a Brownian motion, and it also depends on the localization parameters $c$ and $d$. We define:

$$J_{c,d}(r) \equiv \frac{1}{d} \int_0^r e^{c(r-s)} \sin(d(r-s)) \, \mathrm{d}W(s). \tag{3.2}$$

The next proposition shows that in large samples and after appropriate scaling, the distribution of a long-cycle process can be approximated by that of $J_{c,d}(\cdot)$.

Proposition 3.1.

Suppose that $\{y_t\}$ is generated according to equation (2.4), and Assumptions 2.1 and 3.1 hold. Then,

$$n^{-3/2} y_{\lfloor nr \rfloor} \Rightarrow \sigma J_{c,d}(r).$$

Proof.

The solution to (2.4) can be expressed in terms of the characteristic roots as

$$y_{n,t} = \frac{1}{\lambda_{1,n} - \lambda_{2,n}} \sum_{k=1}^{t} \left( \lambda_{1,n}^{t-k+1} - \lambda_{2,n}^{t-k+1} \right) u_k = \frac{1}{2i \cdot e^{c/n} \sin(d/n)} \sum_{k=1}^{t} \left( e^{(c+id)(t-k+1)/n} - e^{(c-id)(t-k+1)/n} \right) u_k,$$

where the second equality follows by Assumption 2.1. By Assumption 3.1 and as in (3.1), with $t = \lfloor nr \rfloor$,

$$n^{-1/2} \sum_{k=1}^{\lfloor nr \rfloor} \left( e^{(c+id)(t-k+1)/n} - e^{(c-id)(t-k+1)/n} \right) u_k \Rightarrow \sigma \int_0^r \left( e^{(c+id)(r-s)} - e^{(c-id)(r-s)} \right) \mathrm{d}W(s) = 2i\sigma \int_0^r e^{c(r-s)} \sin(d(r-s)) \, \mathrm{d}W(s).$$

The result follows since $\sin(d/n) = d/n + O(n^{-3})$. $\square$

The continuous time Gaussian process $J_{c,d}(\cdot)$ plays the central role in our analysis. It can be viewed as a continuous time version of the MA($\infty$) representation in (2.3): past shocks are weighted by a damped sine wave. Again, the localization parameter $c$ controls the persistence, and the localization parameter $d$ controls the cyclicality. Note also that the long-cycle process requires stronger scaling than that in the local-to-unity case: $n^{-3/2}$ instead of $n^{-1/2}$. This is a reflection of the fact that long-cycle processes are near I(2).

We now turn to the properties of the least-squares estimators and the corresponding test statistics for the second-order autoregressive model with long cycles. Let $\hat\phi_{1,n}$ and $\hat\phi_{2,n}$ denote the least-squares estimators of the coefficients in (2.4):

$$\begin{pmatrix} \hat\phi_{1,n} \\ \hat\phi_{2,n} \end{pmatrix} = \begin{pmatrix} \sum y_{t-1}^2 & \sum y_{t-1} y_{t-2} \\ \sum y_{t-1} y_{t-2} & \sum y_{t-2}^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum y_{t-1} y_t \\ \sum y_{t-2} y_t \end{pmatrix}. \tag{3.3}$$

As it turns out, the matrix on the right-hand side is asymptotically singular because all three elements $\sum y_{t-1}^2$, $\sum y_{t-2}^2$, and $\sum y_{t-1} y_{t-2}$ converge to the same random limit when properly scaled.
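The asymptotic singularity of the moment matrix in (3.3) can be checked numerically. The sketch below is an illustration under assumptions made here (Gaussian iid shocks, zero initial values, hypothetical values of `c`, `d`, and `n`, and hypothetical function names): it compares the normalized off-diagonal element of the matrix built from $(y_{t-1}, y_{t-2})$ with that of the matrix built from $y_{t-1}$ and the first difference $\Delta y_{t-1} = y_{t-1} - y_{t-2}$, the regressors that the text turns to next in (3.4)–(3.5).

```python
import numpy as np

def simulate_long_cycle(c, d, n, rng):
    """Simulate (2.4) with the drifting coefficients (2.6)-(2.7)."""
    phi1 = 2.0 * np.exp(c / n) * np.cos(d / n)
    phi2 = -np.exp(2.0 * c / n)
    u = rng.standard_normal(n)  # iid N(0, 1) shocks (an assumption)
    y = np.zeros(n + 2)         # zero initial values (an assumption)
    for t in range(n):
        y[t + 2] = phi1 * y[t + 1] + phi2 * y[t] + u[t]
    return y[2:]

def moment_matrices(y):
    """Moment matrices of (y_{t-1}, y_{t-2}) and of (y_{t-1}, -dy_{t-1})."""
    y1, y2 = y[1:-1], y[:-2]
    dy1 = y1 - y2
    levels = np.array([[y1 @ y1, y1 @ y2],
                       [y1 @ y2, y2 @ y2]])
    transformed = np.array([[y1 @ y1, -(y1 @ dy1)],
                            [-(y1 @ dy1), dy1 @ dy1]])
    return levels, transformed

def normalized_offdiag(m):
    """Off-diagonal element after rescaling the matrix to unit diagonal."""
    return m[0, 1] / np.sqrt(m[0, 0] * m[1, 1])

rng = np.random.default_rng(2)
y = simulate_long_cycle(c=-1.0, d=10.0, n=1000, rng=rng)
levels, transformed = moment_matrices(y)
rho_levels = normalized_offdiag(levels)
rho_transformed = normalized_offdiag(transformed)
print(f"levels: {rho_levels:.6f}, transformed: {rho_transformed:.6f}")
```

A normalized off-diagonal element close to one in absolute value means the rescaled matrix is close to singular. In this sketch that is what one expects for the levels matrix, while the transformed matrix stays well away from that boundary.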
The singularity arises because $\sum y_{t-1} y_{t-2} = \sum y_{t-1}^2 +$ smaller order terms, which follows formally from Lemma 3.1(b) below. The singularity complicates the derivation of the limiting distributions of the estimators and the corresponding test statistics.

To eliminate the singularity arising in the limit, we consider the following transformation of the equation in (2.4):

$$y_t = (\phi_{1,n} + \phi_{2,n})\, y_{t-1} - \phi_{2,n} \Delta y_{t-1} + u_t, \tag{3.4}$$

where $\Delta y_{t-1} = y_{t-1} - y_{t-2}$. Since (3.4) is obtained from the original equation through a non-singular linear transformation of the regressors and parameters, the OLS estimator of $\phi_{1,n} + \phi_{2,n}$ is given by $\hat\phi_{1,n} + \hat\phi_{2,n}$. Moreover, the usual Wald test statistic for testing joint hypotheses about $\phi_1$ and $\phi_2$ is the same for both regressions. Thus, we have:

$$\begin{pmatrix} \hat\phi_{1,n} + \hat\phi_{2,n} - \phi_{1,n} - \phi_{2,n} \\ \hat\phi_{2,n} - \phi_{2,n} \end{pmatrix} = \begin{pmatrix} \sum y_{t-1}^2 & -\sum y_{t-1} \Delta y_{t-1} \\ -\sum y_{t-1} \Delta y_{t-1} & \sum (\Delta y_{t-1})^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum y_{t-1} u_t \\ -\sum \Delta y_{t-1} u_t \end{pmatrix}. \tag{3.5}$$

As we show below, the matrix on the right-hand side of (3.5) is no longer singular in the limit.

It follows from the representation in (3.5) that the asymptotic theory of the OLS estimator involves the sample moments of $(y_{t-1}, \Delta y_{t-1})$. Hence, in addition to the asymptotic approximation of $y_{t-1}$, we also need the asymptotic approximation for $\Delta y_{t-1}$. The latter involves two additional continuous time processes. We define:

$$K_{c,d}(r) \equiv \frac{1}{d} \int_0^r e^{c(r-s)} \cos(d(r-s)) \, \mathrm{d}W(s), \qquad G_{c,d}(r) \equiv c \cdot J_{c,d}(r) + d \cdot K_{c,d}(r). \tag{3.6}$$

Note that the diffusion process $K_{c,d}(r)$ is akin to the process $J_{c,d}(r)$ except that it is defined with a cosine function instead of a sine function. The next proposition shows that in large samples and after scaling, the distribution of $\Delta y_{\lfloor nr \rfloor}$ can be approximated by that of $G_{c,d}(r)$.

Proposition 3.2.

Suppose that $\{y_t\}$ is generated according to equation (2.4), and Assumptions 2.1 and 3.1 hold. Then,

$$n^{-1/2} \Delta y_{\lfloor nr \rfloor} \Rightarrow \sigma G_{c,d}(r),$$

where the result holds jointly with that in Proposition 3.1.

Note that in contrast to the $n^{-3/2}$ scaling applied to $y_{t-1}$, its first difference $\Delta y_{t-1}$ requires scaling by $n^{-1/2}$. Hence, the first differences of long-cycle processes have convergence rates of $O(n^{1/2})$, tantamount to that of local-to-unity processes. However, due to cyclicality, the large-sample distribution of $\Delta y_{t-1}$ is different from that arising in the local-to-unity model.

Based on the results of Propositions 3.1 and 3.2, we can now provide the asymptotic theory for the sample moments of long-cycle processes. Parts of the lemma below require the following ergodicity property for $\{u_t\}$.

Assumption 3.2.

Let $\sigma_u^2 \equiv \lim_{n \to \infty} n^{-1} \sum_{t=1}^{n} E u_t^2$ be the average variance of $\{u_t\}$ over time. We assume that $n^{-1} \sum_{t=1}^{n} u_t^2 \to_p \sigma_u^2$.

Lemma 3.1.

Suppose that $\{y_t\}$ is generated according to equation (2.4), and Assumptions 2.1 and 3.1 hold. The following results hold jointly.

(a) $n^{-4} \sum y_{t-1}^2 \Rightarrow \sigma^2 \int_0^1 J_{c,d}^2(r) \, \mathrm{d}r$.

(b) $n^{-3} \sum y_{t-1} \Delta y_{t-1} \Rightarrow \sigma^2 \int_0^1 J_{c,d}(r) G_{c,d}(r) \, \mathrm{d}r$.

(c) $n^{-2} \sum (\Delta y_{t-1})^2 \Rightarrow \sigma^2 \int_0^1 G_{c,d}^2(r) \, \mathrm{d}r$.

Suppose in addition that Assumption 3.2 holds. The following results hold jointly with (a)–(c).

(d) $n^{-2} \sum y_{t-1} u_t \Rightarrow \sigma^2 \int_0^1 J_{c,d}(r) \, \mathrm{d}W(r)$.

(e) $n^{-1} \sum \Delta y_{t-1} u_t \Rightarrow \sigma^2 \int_0^1 G_{c,d}(r) \, \mathrm{d}W(r) + \frac{1}{2}(\sigma^2 - \sigma_u^2)$.

Note that in part (e) of the lemma, the limiting distribution of the sample covariance between $\Delta y_{t-1}$ and $u_t$ depends on the difference between the long-run and the average over time variances of $\{u_t\}$. This reflects the serial correlation in $\{u_t\}$ and is standard in the unit root literature. However, despite the serial correlation, the difference $\sigma^2 - \sigma_u^2$ does not appear in the limiting expressions in part (d) for the sample covariance between $y_{t-1}$ and $u_t$. This is because of the stronger scaling factor required for the near I(2) long-cycle process $\{y_t\}$.

To simplify the notation, in the rest of the paper we use $\int J_{c,d}^2$ to denote $\int_0^1 J_{c,d}^2(r) \, \mathrm{d}r$ and $\int J_{c,d} \, \mathrm{d}W$ to denote $\int_0^1 J_{c,d}(r) \, \mathrm{d}W(r)$. We use the same convention for the integral expressions with $G_{c,d}(r)$, with $J_{c,d}$ replaced by $G_{c,d}$. Lastly, we use $\int J_{c,d} G_{c,d}$ to denote $\int_0^1 J_{c,d}(r) G_{c,d}(r) \, \mathrm{d}r$.

Equipped with the results of Lemma 3.1, we can now describe the asymptotic distribution of the least-squares estimators of $\phi_{1,n}$ and $\phi_{2,n}$.

Proposition 3.3.

Suppose that $\{y_{n,t}\}$ is generated according to equation (2.4), and Assumptions 2.1, 3.1, and 3.2 hold. The following results hold jointly with the results of Lemma 3.1.

(a)
$$n^2 \left( \hat\phi_{1,n} + \hat\phi_{2,n} - \phi_{1,n} - \phi_{2,n} \right) \Rightarrow \frac{\int G_{c,d}^2 \cdot \int J_{c,d} \, \mathrm{d}W - \left( \int G_{c,d} \, \mathrm{d}W + \frac{1}{2}\left(1 - \sigma_u^2/\sigma^2\right) \right) \cdot \int J_{c,d} G_{c,d}}{\int J_{c,d}^2 \cdot \int G_{c,d}^2 - \left( \int J_{c,d} G_{c,d} \right)^2}.$$

(b)
$$n \begin{pmatrix} \hat\phi_{1,n} - \phi_{1,n} \\ \hat\phi_{2,n} - \phi_{2,n} \end{pmatrix} \Rightarrow \begin{pmatrix} -1 \\ 1 \end{pmatrix} \times \frac{\int J_{c,d} G_{c,d} \cdot \int J_{c,d} \, \mathrm{d}W - \left( \int G_{c,d} \, \mathrm{d}W + \frac{1}{2}\left(1 - \sigma_u^2/\sigma^2\right) \right) \cdot \int J_{c,d}^2}{\int J_{c,d}^2 \cdot \int G_{c,d}^2 - \left( \int J_{c,d} G_{c,d} \right)^2}.$$

According to part (b) of the proposition, the joint asymptotic distribution of the least-squares estimators of $\phi_{1,n}$ and $\phi_{2,n}$ is singular and determined by the same random variable. Moreover, their rate of convergence is $O_p(n^{-1})$ despite the process $\{y_t\}$ being near I(2). This is a consequence of the asymptotic singularity in (3.3), as discussed following that equation. However, in part (a) of the proposition, the least-squares estimator of $\phi_{1,n} + \phi_{2,n}$ has the faster convergence rate $O_p(n^{-2})$ characteristic of I(2) processes. Note that the limiting distributions depend on the localization parameters $c$ and $d$.

Next, we discuss inference for the autoregressive coefficients. Consider testing a joint hypothesis $H_0: \phi_1 = \phi_{1,0},\ \phi_2 = \phi_{2,0}$ against $H_1: \phi_1 \ne \phi_{1,0}$ or $\phi_2 \ne \phi_{2,0}$.
The usual Waldstatistic is given by W n ( φ , , φ , ) ≡ (cid:32) (cid:98) φ ,n + (cid:98) φ ,n − φ , − φ , (cid:98) φ ,n − φ , (cid:33) (cid:62) (cid:98) V − n (cid:32) (cid:98) φ ,n + (cid:98) φ ,n − φ , − φ , (cid:98) φ ,n − φ , (cid:33) , (3.7)where (cid:98) V n ≡ (cid:98) σ n (cid:32) (cid:80) y t − − (cid:80) y t − ∆ y t − − (cid:80) y t − ∆ y t − (cid:80) (∆ y t − ) (cid:33) − , and (cid:98) σ n is a consistent estimator of the long-run variance σ constructed using ˆ u t = y t − (cid:98) φ ,n y t − − (cid:98) φ ,n y t − , see Newey and West (1987) and Andrews (1991). The infeasibleestimator of σ that uses u t is constructed as ˜ σ n = n − n (cid:88) t =1 u t + 2 m n (cid:88) h =1 w n ( h ) n − n (cid:88) t = h +1 u t u t − h , where m n = o ( n ) is the lag truncation parameter, and w n ( · ) is a bounded weight functionsuch that lim n →∞ w n ( h ) = 1 for all h . The feasible estimator (cid:98) σ n is constructed similarlyusing the estimated residuals ˆ u t in place of u t . We make the following assumption. ssumption 3.3. The infeasible estimator ˜ σ n of the long-run variance σ is consistent: ˜ σ n → p σ .The conditions for consistency of the infeasible estimator can be found in Newey andWest (1987) and Andrews (1991). Our next result describes the asymptotic null distri-bution of the Wald statistic for long-cycle processes. Proposition 3.4.

Suppose that $\{y_{n,t}\}$ is generated according to equation (2.4), Assumptions 2.1 and 3.1-3.3 hold, and $m_n = o(n)$. Then,
$$ W_n(\phi_{1,n}, \phi_{2,n}) \Rightarrow \frac{\int_0^1 \left\{ J_{c,d}(r) \cdot \left( \int G_{c,d}\,dW + \frac{1}{2}(1 - \sigma_u^2/\sigma^2) \right) - G_{c,d}(r) \cdot \int J_{c,d}\,dW \right\}^2 dr}{\int J_{c,d}^2 \cdot \int G_{c,d}^2 - \left( \int J_{c,d} G_{c,d} \right)^2}. $$

The asymptotic null distribution of the Wald statistic is non-standard and non-pivotal: it depends on the ratio of the average over time and long-run variances $\sigma_u^2/\sigma^2$, and on the unknown localization parameters $c$ and $d$. While the ratio $\sigma_u^2/\sigma^2$ plays no role when $\{u_t\}$ are serially uncorrelated and can be estimated consistently otherwise, the dependence on $c$ and $d$ remains. Hence, the quantiles of the limiting distribution can only be simulated given the values of $c$ and $d$.

In Section 5, we discuss the differences between the usual $\chi^2_2$ critical values and the quantiles of the asymptotic distribution in Proposition 3.4. Depending on the values of $c$ and $d$, the differences can be substantial, especially when the model includes the deterministic components discussed in the next section.
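Because the limit in Proposition 3.4 has no closed form, its quantiles have to be simulated for each $(c, d)$. The sketch below illustrates one way to do this for the base case without serial correlation (so the non-centrality term is zero). It rests on an assumption that is not stated in this part of the text: that $J_{c,d}$ and $G_{c,d}$ solve $dJ = G\,dr$, $dG = (2cG - (c^2 + d^2)J)\,dr + dW$ with zero initial values, which is our reading of the continuous-time limit of an AR(2) process with roots $e^{(c \pm id)/n}$. The definitions given earlier in the paper should be substituted if they differ. The function name `simulate_wald_limit` and its arguments are hypothetical.

```python
import numpy as np


def simulate_wald_limit(c, d, n_steps=500, n_rep=2000, seed=0):
    """Draw from the limiting Wald distribution (base case, no serial correlation).

    ASSUMPTION (not taken from the paper's text): J and G solve
        dJ = G dr,   dG = (2 c G - (c^2 + d^2) J) dr + dW,   J(0) = G(0) = 0.
    The system is discretized with the Euler-Maruyama scheme on [0, 1].
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_rep, n_steps))

    J = np.zeros(n_rep)
    G = np.zeros(n_rep)
    int_JJ = np.zeros(n_rep)   # integral of J^2
    int_GG = np.zeros(n_rep)   # integral of G^2
    int_JG = np.zeros(n_rep)   # integral of J G
    int_JdW = np.zeros(n_rep)  # Ito integral of J dW
    int_GdW = np.zeros(n_rep)  # Ito integral of G dW

    for i in range(n_steps):
        w = dW[:, i]
        # Left-endpoint values keep the stochastic integrals in Ito form.
        int_JJ += J * J * dt
        int_GG += G * G * dt
        int_JG += J * G * dt
        int_JdW += J * w
        int_GdW += G * w
        # Euler-Maruyama update of the assumed system.
        J, G = J + G * dt, G + (2.0 * c * G - (c * c + d * d) * J) * dt + w

    # Numerator: integral of {J * int_GdW - G * int_JdW}^2, expanded.
    num = (int_JJ * int_GdW**2
           - 2.0 * int_JG * int_JdW * int_GdW
           + int_GG * int_JdW**2)
    det = int_JJ * int_GG - int_JG**2
    return num / det


draws = simulate_wald_limit(c=-1.0, d=5.0, n_steps=200, n_rep=500, seed=1)
q95 = np.quantile(draws, 0.95)  # simulated 95% critical value for this (c, d)
```

The number of steps and replications here is kept small for speed; a finer grid and more replications would be needed for critical values intended for actual inference.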

4. Extensions to models with deterministic components

For practical applications, it is important to allow the DGP to include non-zero means, trends, and deterministic cycles. We discuss such adjustments in this section. As the results below show, the limiting distributions of the regression estimators and test statistics take a similar form to those in Section 3, but with $J_{c,d}$ and $G_{c,d}$ replaced with their residuals from appropriate continuous time projections. This property is standard in the unit root literature and continues to hold in our case.

Formally, we assume that the data $\{y_t : t = 1, \ldots, n\}$ are generated according to
$$ (1 - \phi_{1,n} L - \phi_{2,n} L^2)(y_t - D_t) = u_t, \quad (4.1) $$
where $D_t$ is non-random, can vary with $t$, and depends on unknown parameters. To control for the deterministic part, estimation of the autoregressive coefficients requires projecting against the components of $D_t$. The asymptotic distributions of the estimators and test statistics change accordingly.

[Footnote: See the proof of Proposition 3.4.]

We consider the following three formulations of $D_t$:
(i) Constant mean: $D_t = \mu$ for some unknown parameter $\mu$.
(ii) Deterministic cycles: $D_t = \mu + \sum_k \{\theta_{1,k} \cos(2\pi k t/n) + \theta_{2,k} \sin(2\pi k t/n)\}$, where the $k$'s are known positive integers, and $\theta_{1,k}, \theta_{2,k}$ are unknown coefficients.
(iii) Linear time trend: $D_t = \mu + \xi t/n$, where $\xi$ is the unknown coefficient.

The specification in (i) allows $\{y_t\}$ to have a non-zero mean that is constant over time. The DGP in (ii) can be used, for example, to distinguish between very low frequency fluctuations and long cycles, as many time series in economics exhibit such patterns; see Beaudry, Galizia, and Portier (2020). The $D_t$ component in (ii) generates cosine and sine oscillations at frequencies $2\pi k/n$. The period of such oscillations relative to the sample size is $1/k$, and they can capture very low-frequency cycles in data that are outside the range of the econometrician's interest. For practical purposes, we consider a small set of low values of $k$, beginning with $k = 1$. Inclusion of such components can be viewed as de-trending of data by removing fluctuations at the frequencies corresponding to the values of $k$. The asymptotic results developed in this section can be used to account for de-trending in inferential procedures.

[Footnote: We thank Paul Beaudry for pointing our attention to this fact.]

The DGP in (iii) allows for linear time trends, and such adjustments have a long history in the unit root literature. The division by $n$ is required for deriving the asymptotic properties and can be absorbed into the unknown coefficient $\xi$. Hence, observationally the model in (iii) is identical to the model with no adjustment by $n$.

The empirical application in Section 7 also considers the case where $D_t$ consists of seasonal dummies and a constant. However, as shown in Phillips and Jin (2002) for unit root testing, the arising asymptotic distributions have the same form as those in the constant mean case.

As in the previous section and to avoid singularities in the limit, we use the transformed version of the model with $y_{t-1}$ and $\Delta y_{t-1}$:
$$ y_t = (\phi_{1,n} + \phi_{2,n}) y_{t-1} - \phi_{2,n} \Delta y_{t-1} + (1 - \phi_{1,n} L - \phi_{2,n} L^2) D_t + u_t. \quad (4.2) $$

4.1. Constant mean

In this section, we consider case (i) of a constant unknown mean. When $D_t = \mu$, equation (4.2) becomes
$$ y_t = \alpha_n + (\phi_{1,n} + \phi_{2,n}) y_{t-1} - \phi_{2,n} \Delta y_{t-1} + u_t, \quad (4.3) $$
where $\alpha_n \equiv (1 - \phi_{1,n} - \phi_{2,n})\mu = O(n^{-2})$. Let $\hat\phi_{1,n}$ and $\hat\phi_{2,n}$ be the least-squares estimators of the corresponding coefficients in (4.3), and define $\tilde y_{t-1} = y_{t-1} - \bar y_n$ and $\widetilde{\Delta y}_{t-1} = \Delta y_{t-1} - \overline{\Delta y}_n$, where $\bar y_n$ and $\overline{\Delta y}_n$ denote the sample averages of $y_{t-1}$ and $\Delta y_{t-1}$ respectively. Then,
$$ \begin{pmatrix} \hat\phi_{1,n} + \hat\phi_{2,n} - \phi_{1,n} - \phi_{2,n} \\ \hat\phi_{2,n} - \phi_{2,n} \end{pmatrix} = \begin{pmatrix} \sum \tilde y_{t-1}^2 & -\sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} \\ -\sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} & \sum \widetilde{\Delta y}_{t-1}^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum \tilde y_{t-1} u_t \\ -\sum \widetilde{\Delta y}_{t-1} u_t \end{pmatrix}, $$
and we have the following analogue of Lemma 3.1.

Lemma 4.1.

Suppose that $\{y_t\}$ is generated according to equation (4.3), and Assumptions 2.1 and 3.1 hold. Define $\tilde J_{c,d}(r) \equiv J_{c,d}(r) - \int_0^1 J_{c,d}(s)\,ds$ and $\tilde G_{c,d}(r) \equiv G_{c,d}(r) - \int_0^1 G_{c,d}(s)\,ds$. The following results hold jointly.
(a) $n^{-4} \sum \tilde y_{t-1}^2 \Rightarrow \sigma^2 \int \tilde J_{c,d}^2$.
(b) $n^{-2} \sum \widetilde{\Delta y}_{t-1}^2 \Rightarrow \sigma^2 \int \tilde G_{c,d}^2$.
(c) $n^{-3} \sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} \Rightarrow \sigma^2 \int \tilde J_{c,d} \tilde G_{c,d}$.
Suppose in addition that Assumption 3.2 holds. The following results hold jointly with (a)–(c).
(d) $n^{-2} \sum \tilde y_{t-1} u_t \Rightarrow \sigma^2 \int \tilde J_{c,d}\,dW$.
(e) $n^{-1} \sum \widetilde{\Delta y}_{t-1} u_t \Rightarrow \sigma^2 \int \tilde G_{c,d}\,dW + \frac{1}{2}(\sigma^2 - \sigma_u^2)$.

The results in Lemma 4.1 are parallel to those in Lemma 3.1. However, instead of $J_{c,d}$ and $G_{c,d}$, the distributions arising in the limit depend on $\tilde J_{c,d}$ and $\tilde G_{c,d}$. Note that the latter processes are obtained from $J_{c,d}$ and $G_{c,d}$ by subtracting their respective continuous time averages, which matches the construction of $\tilde y_t$ and $\widetilde{\Delta y}_t$ in finite samples.

4.2. Deterministic cycles

In this section, we consider case (ii) of deterministic cycles. When $D_t$ includes deterministic cycles, equation (4.2) takes the form
$$ y_t = \alpha_n + \sum_k \left\{ \gamma_{1,k,n} \cos\left( \frac{2\pi k t}{n} \right) + \gamma_{2,k,n} \sin\left( \frac{2\pi k t}{n} \right) \right\} + (\phi_{1,n} + \phi_{2,n}) y_{t-1} - \phi_{2,n} \Delta y_{t-1} + u_t, \quad (4.4) $$
where the intercept $\alpha_n$ is as defined in the case of a constant mean. The lags of the cosine and sine components can be written as linear combinations of $\cos(2\pi k t/n)$ and $\sin(2\pi k t/n)$ with coefficients depending on $n$ and, therefore, can be omitted. The least-squares estimators $\hat\phi_{1,n}$ and $\hat\phi_{2,n}$ can be obtained by estimating
$$ y_t = (\phi_{1,n} + \phi_{2,n}) \tilde y_{t-1} - \phi_{2,n} \widetilde{\Delta y}_{t-1} + u_t, $$
where $\tilde y_{t-1}$ and $\widetilde{\Delta y}_{t-1}$ are the residuals from the regressions of $y_{t-1}$ and $\Delta y_{t-1}$ respectively on $\cos(2\pi k t/n)$, $\sin(2\pi k t/n)$, and a constant.

[Footnote: See Lemma B.1 in the Appendix.]

The following result describes the asymptotic distributions of the sample moments of $\tilde y_{t-1}$, $\widetilde{\Delta y}_{t-1}$, and $u_t$.

Lemma 4.2.

Suppose that $\{y_t\}$ is generated according to equation (4.4), and Assumptions 2.1 and 3.1 hold. Define
$$ \tilde J_{c,d}(r) \equiv J_{c,d}(r) - \int_0^1 J_{c,d}(s)\,ds - \sum_k \{\psi_{1,k} \cos(2\pi k r) + \psi_{2,k} \sin(2\pi k r)\}, $$
$$ \tilde G_{c,d}(r) \equiv G_{c,d}(r) - \int_0^1 G_{c,d}(s)\,ds - \sum_k \{\varphi_{1,k} \cos(2\pi k r) + \varphi_{2,k} \sin(2\pi k r)\}, $$
where
$$ \psi_{1,k} \equiv 2\int_0^1 \cos(2\pi k s) J_{c,d}(s)\,ds, \quad \psi_{2,k} \equiv 2\int_0^1 \sin(2\pi k s) J_{c,d}(s)\,ds, $$
$$ \varphi_{1,k} \equiv 2\int_0^1 \cos(2\pi k s) G_{c,d}(s)\,ds, \quad \varphi_{2,k} \equiv 2\int_0^1 \sin(2\pi k s) G_{c,d}(s)\,ds. $$
The following results hold jointly.
(a) $n^{-4} \sum \tilde y_{t-1}^2 \Rightarrow \sigma^2 \int \tilde J_{c,d}^2$.
(b) $n^{-2} \sum \widetilde{\Delta y}_{t-1}^2 \Rightarrow \sigma^2 \int \tilde G_{c,d}^2$.
(c) $n^{-3} \sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} \Rightarrow \sigma^2 \int \tilde J_{c,d} \tilde G_{c,d}$.
Suppose in addition that Assumption 3.2 holds. The following results hold jointly with (a)-(c).
(d) $n^{-2} \sum \tilde y_{t-1} u_t \Rightarrow \sigma^2 \int \tilde J_{c,d}\,dW$.
(e) $n^{-1} \sum \widetilde{\Delta y}_{t-1} u_t \Rightarrow \sigma^2 \int \tilde G_{c,d}\,dW + \frac{1}{2}(\sigma^2 - \sigma_u^2)$.

Lemma 4.2 is the analogue of Lemma 3.1 for the model with deterministic cycles. The coefficients $\psi_{1,k}$ and $\psi_{2,k}$ can be viewed as the least-squares coefficients in the continuous time regression of $J_{c,d}(s)$ against $\cos(2\pi k s)$, $\sin(2\pi k s)$, and a constant, with $s$ varying over the $[0, 1]$ interval. The coefficients $\varphi_{1,k}$ and $\varphi_{2,k}$ have a similar interpretation with $J_{c,d}$ replaced by $G_{c,d}$. The processes $\tilde J_{c,d}$ and $\tilde G_{c,d}$ are therefore the residuals from the corresponding continuous time regressions. They are the continuous time versions of $\tilde y_{t-1}$ and $\widetilde{\Delta y}_{t-1}$ respectively. Hence, the results of Lemma 3.1 continue to hold with the processes $J_{c,d}$ and $G_{c,d}$ replaced by their respective residuals from the continuous time regressions.

4.3. Linear time trend

In this section, we consider case (iii) of a linear time trend. The model in equation (4.2) now takes the form
$$ y_t = \delta_n + \beta_n (t/n) + (\phi_{1,n} + \phi_{2,n}) y_{t-1} - \phi_{2,n} \Delta y_{t-1} + u_t, \quad (4.5) $$
where $\delta_n \equiv \alpha_n + (\phi_{1,n} + 2\phi_{2,n})\xi/n = O(n^{-2})$, and $\beta_n \equiv \xi(1 - \phi_{1,n} - \phi_{2,n}) = O(n^{-2})$. Similarly to the previous cases, the least-squares estimators $\hat\phi_{1,n}$ and $\hat\phi_{2,n}$ can be obtained by estimating
$$ \tilde y_t = (\phi_{1,n} + \phi_{2,n}) \tilde y_{t-1} - \phi_{2,n} \widetilde{\Delta y}_{t-1} + \tilde u_t, $$
where $\tilde y_{t-1}$ and $\widetilde{\Delta y}_{t-1}$ are now the residuals from the regressions of $y_{t-1}$ and $\Delta y_{t-1}$ respectively against $t/n$ and a constant.

Lemma 4.3.

Suppose that $\{y_t\}$ is generated according to equation (4.5), and Assumptions 2.1 and 3.1 hold. Define
$$ \tilde J_{c,d}(r) \equiv J_{c,d}(r) - (4 - 6r) \int_0^1 J_{c,d}(s)\,ds - (12r - 6) \int_0^1 s J_{c,d}(s)\,ds, $$
$$ \tilde G_{c,d}(r) \equiv G_{c,d}(r) - (4 - 6r) \int_0^1 G_{c,d}(s)\,ds - (12r - 6) \int_0^1 s G_{c,d}(s)\,ds. $$
The following results hold jointly.
(a) $n^{-4} \sum \tilde y_{t-1}^2 \Rightarrow \sigma^2 \int \tilde J_{c,d}^2$.
(b) $n^{-2} \sum \widetilde{\Delta y}_{t-1}^2 \Rightarrow \sigma^2 \int \tilde G_{c,d}^2$.
(c) $n^{-3} \sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} \Rightarrow \sigma^2 \int \tilde J_{c,d} \tilde G_{c,d}$.
Suppose in addition that Assumption 3.2 holds. The following results hold jointly with (a)-(c).
(d) $n^{-2} \sum \tilde y_{t-1} u_t \Rightarrow \sigma^2 \int \tilde J_{c,d}\,dW$.
(e) $n^{-1} \sum \widetilde{\Delta y}_{t-1} u_t \Rightarrow \sigma^2 \int \tilde G_{c,d}\,dW + \frac{1}{2}(\sigma^2 - \sigma_u^2)$.

Lemma 4.3 is the analogue of Lemmas 4.1 and 4.2 for the case of the linear time trend. The processes $\tilde J_{c,d}(r)$ and $\tilde G_{c,d}(r)$ can be similarly interpreted as the residuals from the continuous time regressions of $J_{c,d}(r)$ and $G_{c,d}(r)$ respectively against a constant and $r$ varying over the interval $[0, 1]$.

[Footnote: See Lemma B.1 in the Appendix.]

The results of Lemmas 4.1–4.3 can now be used to describe the asymptotic distributions of the least-squares estimators of the autoregressive coefficients and the corresponding Wald statistics for the models with a constant mean, deterministic cycles, and a linear time trend respectively.

Under the same assumptions as those in Proposition 3.3, but with the model in equation (2.4) replaced by that in either (4.3), (4.4), or (4.5), the asymptotic distribution of the least-squares estimators of the autoregressive coefficients now satisfies
$$ n^2 \left( \hat\phi_{1,n} + \hat\phi_{2,n} - \phi_{1,n} - \phi_{2,n} \right) \Rightarrow \frac{\int \tilde G_{c,d}^2 \cdot \int \tilde J_{c,d}\,dW - \left( \int \tilde G_{c,d}\,dW + \frac{1}{2}(1 - \sigma_u^2/\sigma^2) \right) \cdot \int \tilde J_{c,d} \tilde G_{c,d}}{\int \tilde J_{c,d}^2 \cdot \int \tilde G_{c,d}^2 - \left( \int \tilde J_{c,d} \tilde G_{c,d} \right)^2}, $$
$$ n \begin{pmatrix} \hat\phi_{1,n} - \phi_{1,n} \\ \hat\phi_{2,n} - \phi_{2,n} \end{pmatrix} \Rightarrow \begin{pmatrix} -1 \\ 1 \end{pmatrix} \times \frac{\int \tilde J_{c,d} \tilde G_{c,d} \cdot \int \tilde J_{c,d}\,dW - \left( \int \tilde G_{c,d}\,dW + \frac{1}{2}(1 - \sigma_u^2/\sigma^2) \right) \cdot \int \tilde J_{c,d}^2}{\int \tilde J_{c,d}^2 \cdot \int \tilde G_{c,d}^2 - \left( \int \tilde J_{c,d} \tilde G_{c,d} \right)^2}, \quad (4.6) $$
where the convergence holds jointly with the results of either Lemma 4.1, 4.2, or 4.3 respectively, with the correspondingly defined residual processes $\tilde J_{c,d}$ and $\tilde G_{c,d}$.

For all three specifications in Sections 4.1–4.3, the Wald statistic for testing $H_0: \phi_1 = \phi_{1,0}, \phi_2 = \phi_{2,0}$ against $H_1: \phi_1 \neq \phi_{1,0}$ or $\phi_2 \neq \phi_{2,0}$ takes the same form as in equation (3.7). However, $\hat V_n$ is now given by
$$ \hat V_n = \hat\sigma_n^2 \begin{pmatrix} \sum \tilde y_{t-1}^2 & -\sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} \\ -\sum \tilde y_{t-1} \widetilde{\Delta y}_{t-1} & \sum (\widetilde{\Delta y}_{t-1})^2 \end{pmatrix}^{-1}, $$
with $\tilde y_{t-1}$ and $\widetilde{\Delta y}_{t-1}$ defined respectively for each specification. Provided that the assumptions of Proposition 3.4 hold with the model in (2.4) replaced by that in either (4.3), (4.4), or (4.5), the asymptotic null distribution of the Wald statistic is given by
$$ W_n(\phi_{1,n}, \phi_{2,n}) \Rightarrow \frac{\int_0^1 \left\{ \tilde J_{c,d}(r) \cdot \left( \int \tilde G_{c,d}\,dW + \frac{1}{2}(1 - \sigma_u^2/\sigma^2) \right) - \tilde G_{c,d}(r) \cdot \int \tilde J_{c,d}\,dW \right\}^2 dr}{\int \tilde J_{c,d}^2 \cdot \int \tilde G_{c,d}^2 - \left( \int \tilde J_{c,d} \tilde G_{c,d} \right)^2} \quad (4.7) $$
with the correspondingly defined residual processes $\tilde J_{c,d}$ and $\tilde G_{c,d}$.

As in the base case with no deterministic components, the asymptotic null distributions of the Wald statistics are non-standard and depend on the unknown parameters $c$ and $d$. The differences between the quantiles of these asymptotic distributions and the $\chi^2_2$ critical values are discussed in Section 5. In comparison with the base case, inclusion of the deterministic components may result in more substantial deviations from the $\chi^2_2$ critical values.
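The residual processes in Lemma 4.3 are described as residuals from a continuous time regression on a constant and $r$, with weights $(4 - 6r)$ and $(12r - 6)$ applied to $\int J_{c,d}$ and $\int s J_{c,d}$. The following sketch checks numerically that this closed form agrees with an ordinary least-squares projection on a fine grid. It uses a smooth deterministic test path in place of $J_{c,d}$; the path, the grid size, and the function names are illustrative assumptions only.

```python
import numpy as np


def detrend_closed_form(f, r):
    """Residual of f from the projection on (1, r) over [0, 1], closed form."""
    m0 = f.mean()        # grid approximation of the integral of f(s) ds
    m1 = (r * f).mean()  # grid approximation of the integral of s f(s) ds
    return f - (4.0 - 6.0 * r) * m0 - (12.0 * r - 6.0) * m1


def detrend_least_squares(f, r):
    """Residual of f from a least-squares regression on a constant and r."""
    Z = np.column_stack([np.ones_like(r), r])
    coef, *_ = np.linalg.lstsq(Z, f, rcond=None)
    return f - Z @ coef


N = 2000
r = (np.arange(N) + 0.5) / N   # midpoint grid on [0, 1]
f = np.sin(3.0 * r) + r**2     # hypothetical smooth path standing in for J

# Largest pointwise difference between the two constructions.
gap = np.max(np.abs(detrend_closed_form(f, r) - detrend_least_squares(f, r)))
```

On this grid the two residual paths differ only by discretization error of order $1/N^2$, which illustrates why the finite-sample detrended series and the continuous time residual processes line up in the limit.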

5. Size distortions of conventional tests

In this section, we discuss the size distortions one would see if the econometrician were to use conventional $\chi^2_2$ critical values in place of the quantiles of the distributions derived in equation (4.7) in the previous section. For the purpose of this exercise, we assume that there is no serial correlation in $\{u_t\}$ and, as a result, the non-centrality term $0.5(1 - \sigma_u^2/\sigma^2)$ is equal to zero. Note that if $\{u_t\}$ is serially correlated, one can expect more substantial size distortions due to the presence of the non-centrality term in the asymptotic distribution.

Let $F_{c,d}(\cdot)$ denote the CDF of the asymptotic null distribution in (4.7). Note that the CDF depends on the unknown localization parameters $c$ and $d$. Consider a test that rejects the null hypothesis when the Wald statistic exceeds the conventional $\chi^2_{2,1-\alpha}$ critical value, where $\chi^2_{2,1-\alpha}$ is the $1 - \alpha$ quantile of the $\chi^2_2$ distribution. The asymptotic size of this test is $1 - F_{c,d}(\chi^2_{2,1-\alpha})$, and the size distortion is given by the difference between the asymptotic size and the nominal size $\alpha$. We next examine the extent of size distortions for different values of $c$ and $d$ in the case of the three specifications for the deterministic component $D_t$ in Section 4.

Table 1 reports the asymptotic size for $\alpha = 0.05$ and different values of $c$ and $d$ for each of the three specifications of $D_t$. The CDF $F_{c,d}(\cdot)$ is computed by Monte Carlo simulation with 100,000 replications, with the $J_{c,d}$ and $G_{c,d}$ processes generated using the Euler-Maruyama method with a fixed time step $\Delta t$. The table also reports the length of the cycle as a fraction of the sample size measured by $\tau_\theta = 2\pi/d$. The smaller the value of $d$, the lower the oscillation frequency, and the longer the cycle length relative to the sample size.

In the case of all three specifications for $D_t$, the table shows similar patterns: the asymptotic size deviates from the nominal $0.05$ value for values of $c$ and $d$ closer to zero. However, as $c$ becomes more negative and $d$ becomes more positive, the asymptotic size starts to approach the nominal value.

For example, in the model with a constant mean, the asymptotic size at $c = -1$ and $d = 5$ is 0.116, which means that the Wald test based on the conventional $\chi^2_{2,1-\alpha}$ critical value over-rejects the null by 0.066. As we move down the rows and across the columns of Table 1, the process becomes less persistent and has a shorter cycle period, and as a result size distortions become negligible. Note, however, that the relationship can be non-monotone.

TABLE 1. Asymptotic size of the conventional Wald test with $\chi^2_{2,1-\alpha}$ critical values for $\alpha = 0.05$, for different values of the localization parameters $c, d$ and different specifications of the deterministic component

Rows: $c$. Columns: increasing values of $d$.

Deterministic cycles, $k = 1$
  -1    .746  .103  .071  .054  .051  .050
  -5    .641  .161  .070  .063  .054  .050
 -10    .441  .192  .090  .059  .056  .051
 -15    .317  .186  .104  .069  .054  .059
 -20    .242  .170  .109  .075  .058  .050
-130    .052  .052  .051  .051  .050  .048

Linear time trend
  -1    .317  .071  .072  .053  .051  .050
  -5    .257  .089  .058  .062  .054  .051
 -10    .188  .101  .066  .053  .056  .051
 -15    .146  .101  .072  .057  .051  .060
 -20    .121  .096  .074  .060  .052  .049
-100    .053  .052  .052  .052  .050  .049

$\tau_\theta = 2\pi/d$: cycle length as a fraction of the sample size.

While the size distortions are relatively minor in the case of the constant mean model, they are much more prominent in the case of the models with deterministic cycles and linear trends. In particular, the use of conventional $\chi^2_2$ critical values may result in severe size distortions in the case of deterministic cycles. For example, when $c = -1$ and $d = 5$, the null rejection probability is approximately 75% instead of 5%. It is approximately 32% in the case of the linear time trend specification. While $d = 5$ corresponds to very long cycles as measured by $\tau_\theta$, size distortions remain substantial even for shorter cycles. For example, in the model with deterministic cycles, the size of the conventional test is approximately 19% for $c = -10$, $d = 15$. These values correspond to $\tau_\theta = 0.42$ and $\tau_\omega = 0.56$.

Note again that the size distortions can be non-monotone across the rows/columns. However, for large negative values of $c$ or large values of $d$, size distortions disappear. This is consistent with the results in Phillips (1987), who shows that in the local-to-unity model, the null distribution of the $t$-statistic for the autoregressive coefficient converges to the standard normal as $c \to -\infty$.

To conclude, depending on the values of $c$ and $d$, the expression on the right-hand side of (4.7) can generate a wide range of different asymptotic distributions. The distributions can deviate substantially from the $\chi^2_2$ for values of $c, d$ sufficiently close to zero. Such specifications correspond to longer cycles. For values of $c, d$ sufficiently far from zero, which correspond to shorter cycles, the distributions converge to $\chi^2_2$. In particular, across all specifications of the deterministic component, the size distortions from using $\chi^2_2$ critical values become negligible for $\tau_\theta \leq 0.14$. However, when the length of the cycle as measured by $\tau_\theta$ exceeds 14% of the sample size, conventional inference procedures can result in size distortions. The distortions are typically more pronounced for longer cycles.
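The quantities used in this section are simple to compute once draws from the limiting distribution are available. The sketch below shows the conventional $\chi^2_2$ critical value (which has a closed form), the two cycle-length measures, and the asymptotic size as a rejection frequency. The spectrum-based measure is written as $2\pi/\sqrt{d^2 - c^2}$, which is our reading of the paper's formula for $\tau_\omega$ and should be treated as an assumption; all function names are hypothetical.

```python
import math


def chi2_2_quantile(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    A chi-square(2) variable is exponential with mean 2, so the quantile
    is available in closed form.
    """
    return -2.0 * math.log(1.0 - p)


def tau_theta(d):
    """Cycle length as a fraction of the sample size, tau_theta = 2*pi/d."""
    return 2.0 * math.pi / d


def tau_omega(c, d):
    """Spectrum-based cycle length; ASSUMED form 2*pi/sqrt(d^2 - c^2)."""
    if d * d <= c * c:
        return math.inf  # no finite value under the assumed formula
    return 2.0 * math.pi / math.sqrt(d * d - c * c)


def asymptotic_size(draws, alpha=0.05):
    """Share of simulated limit draws above the chi-square(2) critical value."""
    crit = chi2_2_quantile(1.0 - alpha)
    return sum(1 for w in draws if w > crit) / len(draws)


crit_95 = chi2_2_quantile(0.95)      # about 5.99
example_theta = tau_theta(15.0)      # about 0.42
example_omega = tau_omega(-10.0, 15.0)  # about 0.56 under the assumed formula
```

In practice `draws` would come from a simulation of the limit in (4.7) for the chosen $(c, d)$ and specification of $D_t$; the size distortion is then `asymptotic_size(draws, alpha) - alpha`.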

6. Inference for cyclicality

In this section, we propose a procedure for inference on the cycle length in terms of the angular frequency-based measure $\tau_\theta$ and the spectrum-based measure $\tau_\omega$ that were introduced in Section 2. Recall that the two measures can be deduced from the autoregressive coefficients $\phi_{1,n}$ and $\phi_{2,n}$ through the relationships in (2.6)–(2.9). Therefore, we first construct confidence sets for the autoregressive parameters by collecting values $(\phi_1, \phi_2)$ consistent with cyclical behavior and not rejected by data. In the second step, we use projection arguments to construct confidence intervals for $\tau_\theta$ and $\tau_\omega$. By multiplying the values of $\tau_\theta$ and $\tau_\omega$ in the confidence intervals by $n$, the length of the cycle can also be expressed in time units instead of fractions of the sample size.

The proposed confidence sets have the following property: if the true DGP is indeed cyclical, the coverage probability is at least $1 - \alpha$ asymptotically whether the roots of the autoregressive equation are close to one or far from it. However, if the true DGP is inconsistent with cyclical behavior, we expect the confidence sets to be empty in large samples. Hence, the proposed procedure can be used to detect cyclical specifications consistent with data. It can also be used to rule out cyclical behavior. However, our procedure is not designed for ruling out non-cyclical specifications.

When the roots of the autoregressive polynomial are local to unity as in Assumption 2.1, the least-squares estimators of the autoregressive coefficients are consistent regardless of whether $\{u_t\}$ is serially correlated or not. This is established in Proposition 3.3 for the base case and in (4.6) for the cases with deterministic components. Serial correlation in $\{u_t\}$ and the resulting correlation between $(y_{t-1}, \Delta y_{t-1})$ and $u_t$ is reflected by the non-centrality term $0.5(1 - \sigma_u^2/\sigma^2)$ in the asymptotic distributions. The non-centrality propagates from the estimators into the asymptotic null distribution of the Wald statistic. This is standard for the unit root literature and continues to hold in our framework.

However, when the roots of the autoregressive polynomial are sufficiently far from unity, the least-squares estimators of the autoregressive coefficients are no longer consistent and suffer from first order bias. Therefore, to design an inferential procedure that remains valid regardless of the magnitude of the roots, we explicitly control for potential serial correlation in $\{u_t\}$.

We proceed as follows. First, in Section 6.1 we discuss how to construct confidence intervals for $\tau_\theta$ and $\tau_\omega$ when $\{u_t\}$ are serially uncorrelated. Then, in Section 6.2 we extend the procedure to a serially correlated innovation process $\{u_t\}$ by assuming that it satisfies an AR($p$) formulation with real roots bounded away from one. We employ the BIC selection procedure to choose the appropriate number of lags $p$ as well as the specification for the deterministic part $D_t$.

6.1. Serially uncorrelated $\{u_t\}$

Suppose that $\{u_t\}$ is serially uncorrelated and, therefore, the least-squares estimators of the autoregressive coefficients $\phi_{1,n}$ and $\phi_{2,n}$ are consistent whether the roots are close to unity or far from it. Recall that the expression on the right-hand side of (4.7) with $1 - \sigma_u^2/\sigma^2 = 0$ approximates well the asymptotic distribution of the Wald statistic for any configuration of the localization parameters $c$ and $d$. Moreover, recall that given the sample size $n$, there is a one-to-one relationship between $(\phi_{1,n}, \phi_{2,n})$ and the localization parameters $(c, d)$, and let
$$ \phi_{1,n} = \Phi_{1,n}(c, d) \quad \text{and} \quad \phi_{2,n} = \Phi_{2,n}(c, d), $$
where the functions $\Phi_{1,n}(c, d)$ and $\Phi_{2,n}(c, d)$ are defined according to (2.6) and (2.7) respectively.
Because the relationship is one-to-one for any given $n$, confidence sets for $(\phi_1, \phi_2)$ can be equivalently represented as confidence sets in terms of $(c, d)$.

By running the regressions of $y_t$ against $y_{t-1}$ and $y_{t-2}$ with different specifications of the deterministic part $D_t$, one can use the Bayesian Information Criterion (BIC) based selection procedure to consistently choose the appropriate specification between the constant mean, deterministic cycle, and linear time trend. Once the specification for $D_t$ has been selected, consider the corresponding Wald statistic $W_n(\Phi_{1,n}(c, d), \Phi_{2,n}(c, d))$. Let $W_{1-\alpha}(c, d)$ denote the $1 - \alpha$ quantile of the asymptotic distribution in (4.7) with $1 - \sigma_u^2/\sigma^2 = 0$, i.e. $W_{1-\alpha}(c, d)$ is the $1 - \alpha$ quantile of the distribution of
$$ W(c, d) \sim \frac{\int_0^1 \left\{ \tilde J_{c,d}(r) \cdot \int \tilde G_{c,d}\,dW - \tilde G_{c,d}(r) \cdot \int \tilde J_{c,d}\,dW \right\}^2 dr}{\int \tilde J_{c,d}^2 \cdot \int \tilde G_{c,d}^2 - \left( \int \tilde J_{c,d} \tilde G_{c,d} \right)^2}, \quad (6.1) $$
where the definitions of $\tilde J_{c,d}$ and $\tilde G_{c,d}$ correspond to the specification for $D_t$. The confidence set for $(c, d)$ can now be constructed by test inversion as
$$ CS_{n,1-\alpha} \equiv \left\{ (c, d) : W_n\big( \Phi_{1,n}(c, d), \Phi_{2,n}(c, d) \big) \leq W_{1-\alpha}(c, d) \right\}. $$
The confidence set $CS_{n,1-\alpha}$ is bounded, as $W_{1-\alpha}(c, d) \to \chi^2_{2,1-\alpha}$ when $c \to -\infty$ or $d \to \infty$. In practice, the confidence set can be approximated by choosing a dense two-dimensional grid of values $c$ and $d$. We use a grid with $c \leq 0$ and $2\pi < d < n\pi$, where the lower bound of $2\pi$ is imposed to rule out cycles longer than the sample size when measured by $\tau_\theta$.

The construction of $CS_{n,1-\alpha}$ is akin to the grid bootstrap procedure of Hansen (1999); however, we use the asymptotic critical values instead of their bootstrap approximation. Note that the critical values must be adjusted for every considered point $(c, d)$. The validity of $CS_{n,1-\alpha}$ is due to the following facts. First, $(c, d)$ is included in the confidence set only if the null hypothesis $H_0: \phi_{1,n} = \Phi_{1,n}(c, d), \phi_{2,n} = \Phi_{2,n}(c, d)$ cannot be rejected by the Wald test with the critical value $W_{1-\alpha}(c, d)$. Second, the critical values are computed using the same values $(c, d)$ as those specified under $H_0$. Third, the distribution in (6.1) nests the $\chi^2_2$ distribution, which arises under the fixed $(\phi_1, \phi_2)$ asymptotics, as a limiting case. Note that having correct size under both drifting and fixed parameter specifications is required for the uniform validity (Andrews, Cheng, and Guggenberger, 2020).

We construct confidence intervals for $\tau_\theta$ and $\tau_\omega$ from $CS_{n,1-\alpha}$ by projection:
$$ CI^{\tau_\theta}_{n,1-\alpha} \equiv \left[ \inf_{d : (c,d) \in CS_{n,1-\alpha}} \frac{2\pi}{d}, \ \sup_{d : (c,d) \in CS_{n,1-\alpha}} \frac{2\pi}{d} \right], \quad (6.2) $$
$$ CI^{\tau_\omega}_{n,1-\alpha} \equiv \left[ \inf_{(c,d) \in CS_{n,1-\alpha}} \frac{2\pi}{\sqrt{d^2 - c^2}}, \ \sup_{(c,d) \in CS_{n,1-\alpha}} \frac{2\pi}{\sqrt{d^2 - c^2}} \right]. \quad (6.3) $$
The confidence interval for $\tau_\theta$ is bounded as long as the grid of $d$ values used to construct $CS_{n,1-\alpha}$ excludes zero. On the other hand, the confidence interval for $\tau_\omega$ can be unbounded if pairs $(c, d)$ with $c^2 = d^2$ are included in $CS_{n,1-\alpha}$.

6.2. Serially correlated $\{u_t\}$

In this section we assume that the innovations process $\{u_t\}$ is generated as AR($p$):
$$ (1 - \rho_1 L - \ldots - \rho_p L^p) u_t = \varepsilon_t, \quad (6.4) $$
where $\{\varepsilon_t\}$ are iid $(0, \sigma_\varepsilon^2)$, and the roots of the polynomial $1 - \rho_1 L - \ldots - \rho_p L^p$ are real and bounded away from unity. By running the regressions of $y_t$ against different specifications of the deterministic part $D_t$ and $y_{t-1}, y_{t-2}, \ldots, y_{t-2-m}$ for some $m > p$, one can again use the BIC selection procedure to consistently estimate the specification for $D_t$ and the number of lags $p$. We now proceed assuming that the model for $D_t$ and the number of lags $p$ are known.

Let $\tilde y_t$ denote the residuals from the projection of $y_t$ against the components of $D_t$. Under $H_0: \phi_{1,n} = \phi_{1,0}, \phi_{2,n} = \phi_{2,0}$, the values $\phi_{1,n}$ and $\phi_{2,n}$ are known and can be computed from the values of $c$ and $d$. Let
$$ \tilde u_{t,0} \equiv \big( 1 - \phi_{1,0} L - \phi_{2,0} L^2 \big) \tilde y_t. $$
Using the null-restricted residuals $\tilde u_{t,0}$, one can estimate the autoregressive coefficients $\rho_1, \ldots, \rho_p$. Let $\hat\rho_{1,0}, \ldots, \hat\rho_{p,0}$ denote their least-squares estimators. Note that under $H_0$, these estimators are consistent. We can now remove the autoregressive part in $u_t$:
$$ \hat x_{t,0} \equiv (1 - \hat\rho_{1,0} L - \ldots - \hat\rho_{p,0} L^p) \tilde y_t. $$
Thus, to construct the process $\hat x_{t,0}$, we have filtered out the deterministic part $D_t$ and serial correlation in $\{u_t\}$. Note that under the null, the population counterpart of $\hat x_{t,0}$ satisfies
$$ \tilde x_t \equiv (1 - \rho_1 L - \ldots - \rho_p L^p) \tilde y_t = \frac{\tilde\varepsilon_t}{1 - \phi_{1,n} L - \phi_{2,n} L^2}, $$
where $\tilde\varepsilon_t$ is the residual from the projection of $\varepsilon_t$ against the components of $D_t$.

One can now use $\{\hat x_{t,0}\}$ for inference on the cyclical properties of $\{y_t\}$; however, additional adjustments are required to account for estimation of $\rho_1, \ldots, \rho_p$. The main purpose of the adjustments discussed below is to ensure that the modified Wald statistic has the correct asymptotic null distributions both under the long-cycle asymptotics proposed in the paper and under the standard asymptotics with $\phi_1$ and $\phi_2$ fixed in the stationary range.

[Footnote: Recall that having correct size under both drifting and fixed parameter specifications is required for the uniform validity (Andrews, Cheng, and Guggenberger, 2020).]

Let $\hat\phi_{1,n}$ and $\hat\phi_{2,n}$ now denote the least-squares estimators of $\phi_1$ and $\phi_2$ respectively from the regression of $\hat x_{t,0}$ against $\hat x_{t-1,0}$ and $\hat x_{t-2,0}$:
$$ \hat x_{t,0} = \hat\phi_{1,n} \hat x_{t-1,0} + \hat\phi_{2,n} \hat x_{t-2,0} + \hat\varepsilon_{t,0}, $$
where $\hat\varepsilon_{t,0}$ denotes the least-squares residuals. The modified Wald statistic takes the form
$$ W_{n,p}(\phi_{1,0}, \phi_{2,0}) \equiv \hat\sigma_{\varepsilon,n}^{-2} \begin{pmatrix} \hat\phi_{1,n} - \phi_{1,0} \\ \hat\phi_{2,n} - \phi_{2,0} \end{pmatrix}^{\top} M_n \Sigma_n^{-1} M_n \begin{pmatrix} \hat\phi_{1,n} - \phi_{1,0} \\ \hat\phi_{2,n} - \phi_{2,0} \end{pmatrix}, $$
where $\hat\sigma_{\varepsilon,n}^2 \equiv n^{-1} \sum \hat\varepsilon_{t,0}^2$, and the matrix $M_n$ is given by
$$ M_n \equiv \begin{pmatrix} \sum \hat x_{t-1,0}^2 & \sum \hat x_{t-1,0} \hat x_{t-2,0} \\ \sum \hat x_{t-1,0} \hat x_{t-2,0} & \sum \hat x_{t-2,0}^2 \end{pmatrix}. $$
To construct $\Sigma_n$, we first define $\dot x_{t,0}$ and $\ddot x_{t-1,0}$ as the residuals from the least-squares regressions of $\hat x_{t,0}$ and $\hat x_{t-1,0}$ respectively against $\tilde u_{t,0}, \ldots, \tilde u_{t-p+1,0}$:
$$ \hat x_{t,0} = \dot\zeta_{1,n} \tilde u_{t,0} + \ldots + \dot\zeta_{p,n} \tilde u_{t-p+1,0} + \dot x_{t,0}, \quad (6.5) $$
$$ \hat x_{t-1,0} = \ddot\zeta_{1,n} \tilde u_{t,0} + \ldots + \ddot\zeta_{p,n} \tilde u_{t-p+1,0} + \ddot x_{t-1,0}, $$
where $\dot\zeta_{1,n}, \ldots, \dot\zeta_{p,n}$ and $\ddot\zeta_{1,n}, \ldots, \ddot\zeta_{p,n}$ are the OLS estimators. The matrix $\Sigma_n$ is given by
$$ \Sigma_n \equiv \begin{pmatrix} \sum \dot x_{t-1,0}^2 & \sum \dot x_{t-1,0} \ddot x_{t-2,0} \\ \sum \dot x_{t-1,0} \ddot x_{t-2,0} & \sum \ddot x_{t-2,0}^2 \end{pmatrix}. $$
The next proposition shows that under the conventional stationary asymptotics, the asymptotic null distribution of the Wald statistic is the usual $\chi^2_2$ distribution.

Proposition 6.1.

Suppose that $\{y_t\}$ is generated according to $(1 - \phi_1 L - \phi_2 L^2) y_t = u_t$ with the coefficients $\phi_1$ and $\phi_2$ fixed in the stationary range, and $\{u_t\}$ satisfying (6.4) with the coefficients $\rho_1, \ldots, \rho_p$ in the stationary range and $\varepsilon_t \sim$ iid $(0, \sigma_\varepsilon^2)$. Then, $W_{n,p}(\phi_1, \phi_2) \Rightarrow \chi^2_2$.

In the case of a long-cycle specification, the null asymptotic distribution of the modified Wald statistic is the same as in (6.1).

Proposition 6.2.

Suppose that $\{y_t\}$ is generated according to (4.1), where $\{u_t\}$ satisfies (6.4) with the coefficients $\rho_1, \ldots, \rho_p$ in the stationary range and $\varepsilon_t \sim$ iid $(0, \sigma_\varepsilon^2)$. Suppose further that Assumption 2.1 holds. Then, $W_{n,p}(\phi_{1,n}, \phi_{2,n}) \Rightarrow W(c, d)$, where $W(c, d)$ is defined in (6.1) with $\tilde J_{c,d}$ and $\tilde G_{c,d}$ defined according to the specification of $D_t$.

Using the results of Propositions 6.1 and 6.2, one can now construct confidence sets for $(c, d)$ using the modified Wald statistic as
$$ CS_{n,p,1-\alpha} \equiv \left\{ (c, d) : W_{n,p}\big( \Phi_{1,n}(c, d), \Phi_{2,n}(c, d) \big) \leq W_{1-\alpha}(c, d) \right\}. $$
Similarly to the construction in (6.2) and (6.3), the confidence set $CS_{n,p,1-\alpha}$ can be projected to construct confidence intervals for $\tau_\theta$ and $\tau_\omega$.
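The test-inversion and projection steps of this section can be sketched as a loop over a grid of $(c, d)$ values. The example below is a minimal illustration, not the authors' implementation. It assumes the mapping $\phi_{1,n} = 2e^{c/n}\cos(d/n)$, $\phi_{2,n} = -e^{2c/n}$ for an AR(2) process with roots $e^{(c \pm id)/n}$, which may differ in form from (2.6) and (2.7). The Wald statistic and the critical-value function are passed in as hypothetical callables `wald_stat` and `crit_value`, and the toy inputs at the end are invented purely to exercise the code.

```python
import math


def ar2_coefficients(c, d, n):
    """ASSUMED mapping from (c, d) to (phi1, phi2) for roots exp((c +/- i d)/n)."""
    phi1 = 2.0 * math.exp(c / n) * math.cos(d / n)
    phi2 = -math.exp(2.0 * c / n)
    return phi1, phi2


def confidence_set(wald_stat, crit_value, c_grid, d_grid, n):
    """Keep every grid point (c, d) whose implied null is not rejected."""
    kept = []
    for c in c_grid:
        for d in d_grid:
            phi1, phi2 = ar2_coefficients(c, d, n)
            # The critical value is recomputed at each (c, d), as in the text.
            if wald_stat(phi1, phi2) <= crit_value(c, d):
                kept.append((c, d))
    return kept


def tau_theta_interval(cs):
    """Project the confidence set onto tau_theta = 2*pi/d; None if it is empty."""
    if not cs:
        return None
    taus = [2.0 * math.pi / d for _, d in cs]
    return min(taus), max(taus)


# Toy illustration with made-up inputs (hypothetical, not real data):
n = 200
phi_true = ar2_coefficients(-5, 20, n)
toy_wald = lambda p1, p2: 1e6 * ((p1 - phi_true[0]) ** 2 + (p2 - phi_true[1]) ** 2)
toy_crit = lambda c, d: 5.99  # stand-in for the simulated quantile at (c, d)

cs = confidence_set(toy_wald, toy_crit, [-10, -5, -1], [10, 20, 30], n)
interval = tau_theta_interval(cs)
```

An empty confidence set is returned as an empty list, which corresponds to the case where no cyclical specification on the grid is consistent with the data.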

7. Cyclical properties of macroeconomic and financial variables

In this section, we apply our inference procedure to the quarterly series of a set of macroeconomic and financial variables for the U.S. All data are publicly available from FRED, Federal Reserve Bank of St. Louis. A detailed description of the data is summarized in Table 3 in Appendix C. All the series are measured in natural logs except for the credit-to-GDP ratio (for the private non-financial sector), which is in percentage points, and the interest rate spread between Moody's seasoned BAA corporate bond yield and the 10-year treasury constant maturity, which is expressed in levels. For each series, we take the longest and the most updated sample ending in 2020. Depending on the series, our samples span periods ranging from 34 to 73 years.

We use the empirical models in (4.1). Let $y_t$ denote the observed data series such that
\[
\begin{aligned}
y_t &= y^c_t + D_t,\\
y^c_t &= \phi_{1,n}y^c_{t-1} + \phi_{2,n}y^c_{t-2} + u_t,
\end{aligned}
\]
where $y^c_t$ is the latent cyclical part, the innovations $\{u_t\}$ are potentially serially correlated according to an AR($p$) specification with unknown $p$, and $D_t$ may contain linear deterministic trends and deterministic cycles as discussed in Section 4. In all specifications, the intercept (constant) is included by default. The raw data for the credit-to-GDP ratio is not seasonally adjusted and, therefore, its specification for $D_t$ also allows for seasonal dummies.

We use the BIC to select the appropriate specification for $D_t$ (i.e. whether to include a linear time trend, deterministic cycles or seasonal dummies). We also rely on the BIC to determine the presence of autocorrelation in $\{u_t\}$. Note that a BIC estimate of the lag order $p + 2 \ge 2$ for $\{y_t\}$ implies an autocorrelation of order $p$ for $\{u_t\}$. In our empirical application, most of the time series do not exhibit autocorrelation in $\{u_t\}$ except for the credit-to-GDP ratio, as indicated by the BIC. Moreover, the credit-to-GDP ratio is the only series where we have included the seasonal dummies.

Table 2. Length of cycle in quarters

| Variable | $n\tau_\theta$ | $n\tau_\omega$ | $n$ | Linear time trend | Deterministic cycles | Autocorr. $u_t$ |
|---|---|---|---|---|---|---|
| Macroeconomic variables | | | | | | |
| Real GDP per capita | — (23, 264) | — (25, 512) | 294 | Yes | No | No |
| Unemployment rate | 52 (22, 260) | — (27, 185) | 290 | No | No | No |
| Hours per capita | 22 (18, 64) | 25 (18, 198) | 294 | Yes | $k = 1$ | No |
| Financial variables | | | | | | |
| VXO S&P 100 volatility index | — ∅ | — ∅ | 139 | No | $k = 3$ | No |
| Credit risk premium, BAA to 10Y | — ∅ | — ∅ | 269 | Yes | No | No |
| Equity price index | — ∅ | — ∅ | 197 | Yes | No | No |
| Private non-financial sector credit, % GDP | — (50, 245) | — (54, 358) | 273 | Yes | No | AR(1) |
| Home price index | 63 (42, 120) | 77 (43, 234) | 134 | Yes | No | No |

Notes: All data series are sampled at quarterly frequencies with the sample size of each series given by $n$. Columns 1 and 2 indicate respectively the length of cycles measured based on the angular frequency, $n\tau_\theta$, and the spectrum-maximizing frequency, $n\tau_\omega$. In columns 1 and 2, the first number is the point estimate of the cycle length. A dash "—" is used when the point estimate corresponds to an acyclical process and when the point estimate is unavailable in the case of autocorrelation. Enclosed in parentheses are the minimum and maximum cycle lengths implied by the confidence intervals of $\tau_\theta$ and $\tau_\omega$. When the interval is empty, it is indicated by the symbol "∅". All numbers are in quarters. The intercept is included in all specifications. Credit to private non-financial sector (% GDP) is seasonally adjusted by including seasonal dummies in the regression.
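To make the two cycle-length measures in columns 1 and 2 of Table 2 concrete, the sketch below computes them from AR(2) coefficients. It is an illustration, not the authors' code: the function name is hypothetical, the angular-frequency measure is taken to be $2\pi$ divided by the argument of the complex roots, and the spectrum-maximizing frequency uses the expression $\cos\omega^* = -\phi_1(1-\phi_2)/(4\phi_2)$ that appears in the proof of Proposition 2.1. The example coefficients assume roots of the form $e^{(c \pm id)/n}$ with illustrative values $c = -4$, $d = 10$, $n = 200$.

```python
import numpy as np

def cycle_lengths(phi1, phi2):
    """Return (period_theta, period_omega) in sampling periods; None marks 'no cycle'."""
    period_theta = period_omega = None
    if phi1 ** 2 + 4.0 * phi2 < 0.0:
        # Complex characteristic roots: period implied by their angular frequency.
        theta = np.arccos(phi1 / (2.0 * np.sqrt(-phi2)))
        period_theta = 2.0 * np.pi / theta
    if phi2 < 0.0:
        # Interior maximum of the AR(2) spectral density, when it exists.
        arg = -phi1 * (1.0 - phi2) / (4.0 * phi2)
        if abs(arg) < 1.0:
            period_omega = 2.0 * np.pi / np.arccos(arg)
    return period_theta, period_omega

# Illustrative long-cycle coefficients (assumed values, not estimates from the paper).
n, c, d = 200, -4.0, 10.0
phi1 = 2.0 * np.exp(c / n) * np.cos(d / n)
phi2 = -np.exp(2.0 * c / n)
period_theta, period_omega = cycle_lengths(phi1, phi2)
```

Under this parametrization the first measure equals $2\pi n/d$ exactly, while the second is close to $2\pi n/\sqrt{d^2-c^2}$, so the spectrum-based cycle is the longer of the two.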

Table 2 presents our results. Note that the last three columns of the table describe the specification selected by the BIC for $D_t$ and the order of autocorrelation for $\{u_t\}$. For example, hours per capita contains a linear time trend and deterministic cycles of cosine and sine waves with $k = 1$, which corresponds to a periodicity of $n/k = n$. According to the BIC, the errors $\{u_t\}$ are serially uncorrelated. Columns 1 and 2 of the table report respectively the angular frequency-based measure $n\tau_\theta$ and the spectrum-maximizing frequency-based measure $n\tau_\omega$ for the cycle length. The point estimates for $n\tau_\theta$ and $n\tau_\omega$ are indicated as "—" when the autoregressive coefficient estimates of $\phi_{1,n}$ and $\phi_{2,n}$ correspond to acyclical processes, or when they are not available as in the case of autocorrelation. The minimum and maximum cycle lengths implied by the 95% confidence intervals of $n\tau_\theta$ and $n\tau_\omega$ are given in the parentheses.

The two alternative measures of cycle length generally produce similar lower bound estimates. Based on the 95% confidence intervals, we are unable to reject the null that macroeconomic variables, such as real GDP per capita, unemployment rate and hours per capita, contain stochastic cycles with periodicity of at least 5-6 years. Partly due to the projection-based construction of the $n\tau_\theta$ and $n\tau_\omega$ confidence intervals, the implied range of the cycle length is typically wide. The upper bound confidence estimates usually are large and differ considerably between the two measures. Nevertheless, our results point to the presence of cyclicality among macroeconomic variables, conforming to the view of endogenous business cycles (Beaudry, Galizia, and Portier, 2020).
On the financial side, we find that credit to private non-financial sector as a percent of GDP and home prices exhibit cycles of at least 10 years in duration, twice as long as the minimum detected cycle length in the macroeconomic variables.

The most striking finding of this section is that for the asset market variables (the volatility index, credit risk premium, and equity prices), our procedure returns empty confidence sets. This suggests that the underlying mechanism for asset market fluctuations is different from that of the macro variables and the financial variables such as credit and home prices. Our results are in favour of the dichotomy between the asset market and the real economy. Moreover, the results do not support the view that recessions are driven by risk perception, risk premiums and risk-bearing capacity suggested in the macro-finance literature (see Cochrane, 2017). Note that the S&P 100 Volatility Index, a measure of market uncertainty, has a deterministic cycle of approximately 46 quarters in length according to the BIC model selection. However, it is different in nature from the stochastic cycles detected in the other variables.

To better visualize the cyclical dynamics consistent with the data, for each series $y_t$, we plot in Figure 2 the impulse responses to a one-standard-deviation shock to the innovation $u$ of all cyclical specifications in the 95% confidence sets $CS_{n,1-\alpha}$. The dynamics shown in the figure resonate with the results in Table 2. Financial variables such as credit to private non-financial sector and home prices exhibit much longer cycles than the business cycle variables. The duration from peaks to troughs is at least 25-30 quarters in financial cycles and at least 15 quarters in business cycles. In addition, the financial cycles are also more pronounced.
For a one-standard-deviation shock, the initial amplitude of the cyclical response is approximately 3 to 7 times the standard deviation for credit and home prices, and about 1.5 to 3 times for unemployment rate and hours per capita. For real GDP per capita, the impulse responses are split into two parts. On the left, the axis corresponds to the set of impulse responses similar to those observed in unemployment rate and hours per capita. On the right, the axis maps to the set of cyclical impulse responses with large amplitudes and high persistence. Note that the scale of the axis on the right is larger by a factor of 10. While the possibility of having much longer and highly persistent stochastic cycles cannot be rejected, real GDP per capita does also share similar dynamics with unemployment rate and hours per capita.

In sum, our results suggest that business cycles as marked by the expansions and contractions of aggregate economic activity are not just recurrent but periodic with an average duration of at least 5-6 years. Furthermore, financial cycles as characterized by the booms and busts in credit and home prices are much longer than the business cycles: at least 10 years in duration. In addition, these financial cycles have more prominent oscillations with much larger amplitudes than business cycles. Moreover, we find that equity prices, though commonly included in the characterization of financial cycles, do not exhibit stochastic cycles, and therefore merit separate consideration from credit and home prices. Lastly, our results suggest that asset market fluctuations are a different phenomenon from the changes in real economic activities.

[Footnote: For credit to private non-financial sector, the standard deviation of the innovation is computed assuming no serial correlation.]

[Footnote: Note also that hours per capita is much more persistent than unemployment rate and real GDP per capita.]
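The impulse responses discussed above can be illustrated with a short sketch. It is not the authors' code: it assumes an AR(2) cyclical component whose roots are $e^{(c \pm id)/n}$, so that $\phi_{1,n} = 2e^{c/n}\cos(d/n)$ and $\phi_{2,n} = -e^{2c/n}$, uses a unit shock rather than a one-standard-deviation shock, and picks the illustrative values $c = -4$, $d = 10$, $n = 200$, which are not estimates from the paper.

```python
import numpy as np

def ar2_impulse_response(phi1, phi2, horizon):
    """Response of y_{t+j}, j = 0..horizon, to a unit shock in u_t for an AR(2)."""
    psi = np.zeros(horizon + 1)
    psi[0] = 1.0
    if horizon >= 1:
        psi[1] = phi1
    for j in range(2, horizon + 1):
        psi[j] = phi1 * psi[j - 1] + phi2 * psi[j - 2]
    return psi

# Illustrative localization parameters and sample size (assumed values).
n, c, d = 200, -4.0, 10.0
phi1 = 2.0 * np.exp(c / n) * np.cos(d / n)
phi2 = -np.exp(2.0 * c / n)

irf = ar2_impulse_response(phi1, phi2, horizon=300)

# Closed form implied by the complex roots: a damped sine wave.
j = np.arange(301)
closed_form = np.exp(c * j / n) * np.sin(d * (j + 1) / n) / np.sin(d / n)
```

In this parametrization the response changes sign roughly every $\pi n/d$ periods, so a smaller $d$ produces longer swings between peaks and troughs, and $c$ closer to zero produces slower damping.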
Figure 2. Impulse responses to a one-standard-deviation shock to innovations. Panels: (A) Real GDP per capita; (B) Unemployment rate; (C) Hours per capita; (D) Credit to private non-financial sector, % GDP; (E) Home price index.

Appendix A. The periodogram of long-cycle processes

A.1. Asymptotic properties

Periodogram-based nonparametric estimators are commonly used for inference on the cyclical behaviour of time series. In this section, we derive the asymptotic properties of the periodogram in the case of long-cycle processes. For $-\pi \le \omega \le \pi$, the periodogram of $\{y_t\}$ is defined as
\[
I_n(\omega) \equiv \frac{1}{2\pi n}\left|\sum_{t=1}^{n} y_t e^{-i\omega t}\right|^2, \tag{A.1}
\]
see, for example, equation (6.1.24) in Priestley (1981). In the case of covariance stationary processes with continuous spectral densities, it is well known that the periodogram is an asymptotically unbiased estimator of the spectral density at $\omega$ (see equation (6.2.12) in Priestley, 1981).

Given the results of Proposition 2.1, in the case of long-cycle processes, we are interested in the spectrum near the origin at frequencies of the form $\omega_n = h/n$ for a constant $h \in \mathbb{R}$. Suppose that $\{y_{n,t}\}$ is generated according to the DGP in equation (2.4) with the roots as in Assumption 2.1. Assume also that $\{u_t\}$ are serially uncorrelated with a zero mean and the variance $\sigma_u^2$. In this case, the spectral density of $\{y_t\}$, denoted $f_n(\omega)$, satisfies
\[
n^{-4}f_n(h/n) = \frac{\sigma_u^2}{2\pi n^4\,|1-\lambda_{1,n}e^{ih/n}|^2\,|1-\lambda_{2,n}e^{ih/n}|^2} \to \frac{\sigma_u^2}{2\pi\,(c^2+(d+h)^2)(c^2+(d-h)^2)}.
\]
We show below that, in the case of long-cycle processes and near-the-origin frequencies, the periodogram is a biased estimator.
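For concreteness, the periodogram in (A.1) can be evaluated at arbitrary frequencies, including the near-the-origin frequencies $h/n$, as in the sketch below. The function name and the simulated input are illustrative assumptions; the cross-check against the fast Fourier transform uses the fact that shifting the time index only changes the phase, not the modulus.

```python
import numpy as np

def periodogram(y, omega):
    """Evaluate I_n(omega) = |sum_t y_t exp(-i*omega*t)|^2 / (2*pi*n) for t = 1..n."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1)
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    dft = np.exp(-1j * np.outer(omega, t)) @ y
    return np.abs(dft) ** 2 / (2.0 * np.pi * n)

# Illustrative data: white noise of length 128 (placeholder, not the paper's data).
rng = np.random.default_rng(0)
y = rng.standard_normal(128)
n = y.size

# At Fourier frequencies 2*pi*k/n the direct formula agrees with the FFT.
k = np.arange(1, 6)
I_direct = periodogram(y, 2.0 * np.pi * k / n)
I_fft = np.abs(np.fft.fft(y)[k]) ** 2 / (2.0 * np.pi * n)

# Near-the-origin frequencies of the form h/n need the direct formula.
I_local = periodogram(y, np.array([1.0, 5.0, 10.0]) / n)
```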

Proposition A.1.

Suppose that $\{y_{n,t}\}$ is generated according to equation (2.4) and Assumption 2.1, and $\{u_t\}$ are serially uncorrelated with a zero mean and the variance $\sigma_u^2 > 0$. Then for a constant $h$, the periodogram of $\{y_{n,t}\}$ satisfies
\[
\lim_{n\to\infty} n^{-4}E\left[I_n\left(\frac{h}{n}\right)\right] = \frac{\sigma_u^2}{2\pi^2}\int_{-\infty}^{\infty}\frac{1-\cos(h-x)}{(h-x)^2\,(c^2+(d+x)^2)(c^2+(d-x)^2)}\,dx. \tag{A.2}
\]

The result can be extended to allow for strictly stationary and serially correlated $\{u_t\}$, when the spectral density $\varphi(\omega)$ of $\{u_t\}$ is bounded, bounded away from zero and continuously differentiable with the derivative satisfying $\sup_{x\in[-\pi n,\pi n]}|\varphi'(x/n)| = O(n^{-1})$. For example, the condition holds when $\{u_t\}$ is an MA($p$) process. In that case, $\sigma_u^2$ in equation (A.2) should be replaced with $\varphi(0)$.

[Footnote: See the proof of Proposition A.1.]

Figure 3. The limits of the expected value of the periodogram (solid line) and the true spectrum (dashed line) for $c = 4$ and $d = 10$. The corresponding vertical lines indicate the maximizing frequencies in terms of $h$, where $h$ is determined by $\omega_n = h/n$, and $\omega_n$ denotes frequencies.

The result in Proposition A.1 can be used to assess the magnitude of the bias implied by the periodogram as we illustrate below. When the cyclical properties of a process are assessed using its spectrum, the appropriate measure of the cycle length is $\tau_\omega$. The solid line in Figure 3 plots the limiting expression for the expected values of the periodogram of a long-cycle process at near-the-origin frequencies. Its maximizing frequency is shown by the solid vertical line. The dashed line displays the limit of the true spectral density. The vertical dashed line indicates the true spectrum-maximizing frequency $\sqrt{d^2-c^2}$ derived in Proposition 2.1. To construct the plot, we use the following values of the localization parameters: $c = 4$ and $d = 10$.

The numerical results displayed in the figure demonstrate that the periodogram may underestimate the spectrum-maximizing frequency and, as a result, overestimate the length of the cycle. According to the true spectrum, the cycle length relative to the sample size is $\tau_\omega = 2\pi/\sqrt{d^2-c^2} \approx 0.69$, while according to the periodogram it is approximately $0.73$. For quarterly data and a sample size $n = 200$, this corresponds to an upward bias of 8 quarters for the cycle length.

Proposition 3.1 can be used to describe the asymptotic distribution of the periodogram of a long-cycle process at near-the-origin frequencies. The next result shows that the asymptotic distribution of the periodogram depends on a continuous-time Fourier transform of the asymptotic approximation of the long-cycle process.

Corollary A.1.

Suppose that $\{y_{n,t}\}$ is generated according to equation (2.4), and Assumptions 2.1 and 3.1 hold. Then,
\[
n^{-4}I_n\left(\frac{h}{n}\right) \Rightarrow \frac{\sigma^2}{2\pi}\left|\int_0^1 J_{c,d}(r)e^{-ihr}\,dr\right|^2,
\]
where the process $J_{c,d}(r)$ is defined in equation (3.2).

A.2. Proofs of the asymptotic properties of the periodogram

Proof of Proposition A.1.

The spectral density of $\{y_{n,t}\}$ is given by
\[
f_n(\omega) = \frac{\sigma_u^2}{2\pi\,|1-\lambda_{n,1}e^{i\omega}|^2\,|1-\lambda_{n,2}e^{i\omega}|^2}. \tag{A.3}
\]
By the results in Priestley (1981), equations (6.2.10)–(6.2.11),
\[
E I_n\left(\frac{h}{n}\right) = \int_{-\pi}^{\pi} f_n(x)F_n\left(x-\frac{h}{n}\right)dx = \frac{1}{n}\int_{-\pi n}^{\pi n} f_n\left(\frac{x}{n}\right)F_n\left(\frac{x-h}{n}\right)dx, \tag{A.4}
\]
where the second result holds by the change of variable, and
\[
F_n(x) = \frac{\sin^2(nx/2)}{2\pi n\sin^2(x/2)} = \frac{1-\cos(nx)}{2\pi n(1-\cos(x))}.
\]
Applying a series expansion $\cos((h-x)/n) = 1 - 0.5((h-x)/n)^2 + O(((h-x)/n)^4)$, we obtain
\[
F_n\left(\frac{h-x}{n}\right) = \frac{n(1-\cos(h-x))}{\pi(h-x)^2}\left(1 + O\left(\left(\frac{h-x}{n}\right)^2\right)\right). \tag{A.5}
\]
Next, we consider an expansion of the elements of $f_n(x/n)$:
\[
\begin{aligned}
1-\lambda_{n,1}e^{ix/n} &= 1 - e^{c/n}\left(\cos\left(\frac{d+x}{n}\right) + i\sin\left(\frac{d+x}{n}\right)\right)\\
&= 1 - \left(1+\frac{c}{n}+O\left(\frac{1}{n^2}\right)\right)\left(1-\frac{1}{2}\left(\frac{d+x}{n}\right)^2 + O\left(\left(\frac{d+x}{n}\right)^4\right) + i\left(\frac{d+x}{n} + O\left(\left(\frac{d+x}{n}\right)^3\right)\right)\right)\\
&= -\frac{c}{n} + O\left(\left(\frac{d+x}{n}\right)^2 + \frac{(d+x)^2}{n^3} + \left(\frac{d+x}{n}\right)^4 + \frac{1}{n^2}\right) - i\left(\frac{d+x}{n} + O\left(\left(\frac{d+x}{n}\right)^3 + \frac{d+x}{n^2}\right)\right).
\end{aligned}
\]
Hence,
\[
\left|1-\lambda_{n,1}e^{ix/n}\right|^2 = \frac{c^2+(d+x)^2}{n^2} + O\left(\left(\frac{d+x}{n}\right)^3\right). \tag{A.6}
\]
Similarly,
\[
\left|1-\lambda_{n,2}e^{ix/n}\right|^2 = \frac{c^2+(d-x)^2}{n^2} + O\left(\left(\frac{d-x}{n}\right)^3\right). \tag{A.7}
\]
By (A.3) and (A.6)–(A.7),
\[
\frac{1}{n^4}f_n\left(\frac{x}{n}\right) = \frac{\sigma_u^2}{2\pi\left(c^2+(d+x)^2\left(1+O\left(\frac{d+x}{n}\right)\right)\right)\left(c^2+(d-x)^2\left(1+O\left(\frac{d-x}{n}\right)\right)\right)}. \tag{A.8}
\]
The result of the proposition follows from (A.4), (A.5), and (A.8). □

Proof of Corollary A.1.

Since
\[
\int_{t/n}^{(t+1)/n} e^{-ihs}\,ds = \frac{e^{-iht/n}}{n}\left(1+O(n^{-1})\right),
\]
we have
\[
\begin{aligned}
n^{-5/2}\sum_{t=1}^{n} y_t e^{-iht/n} &= \frac{1}{1+O(n^{-1})}\sum_{t=1}^{n} n^{-3/2}y_t\int_{t/n}^{(t+1)/n} e^{-ihs}\,ds\\
&= \frac{1}{1+O(n^{-1})}\sum_{t=1}^{n}\int_{t/n}^{(t+1)/n} n^{-3/2}y_{\lfloor ns\rfloor}e^{-ihs}\,ds\\
&= \frac{1}{1+O(n^{-1})}\int_0^1 n^{-3/2}y_{\lfloor ns\rfloor}e^{-ihs}\,ds + O_p(n^{-1})\\
&\Rightarrow \sigma\int_0^1 J_{c,d}(s)e^{-ihs}\,ds,
\end{aligned}
\tag{A.9}
\]
where the equality in the third line holds because $y_{\lfloor ns\rfloor} = y_t$ for $t/n \le s < (t+1)/n$, and the convergence in the last line holds by the Continuous Mapping Theorem (CMT) and Proposition 3.1. The result of the corollary follows by (A.9) and the CMT. □

Appendix B. Proofs of the main results

Proof of Proposition 2.1.

By Assumption 2.1 and (2.6)–(2.7),
\[
\begin{aligned}
-\frac{\phi_{1,n}(1-\phi_{2,n})}{4\phi_{2,n}} &= 0.5\cos\left(dn^{-1}\right)\left(\exp\left(cn^{-1}\right)+\exp\left(-cn^{-1}\right)\right)\\
&= \left(1-0.5d^2n^{-2}+O(n^{-4})\right)\left(1+0.5c^2n^{-2}+O(n^{-4})\right)\\
&= 1-0.5(d^2-c^2)n^{-2}+O(n^{-4}).
\end{aligned}
\]
Since the argument of $\cos^{-1}$ converges to one, it follows that
\[
\omega^*_n = \cos^{-1}\left(1-0.5(d^2-c^2)n^{-2}+O(n^{-4})\right) = o(1). \tag{B.1}
\]
Consider $\cos^{-1}(1-s) = t$ or $1-s = \cos(t)$, where $s$ and $t$ are small. Expanding $\cos(t)$ around $t = 0$, we obtain $s = t^2/2 + O(t^4)$. Hence, $2s = t^2(1+O(t^2))$, and it follows that
\[
\begin{aligned}
t &= \sqrt{2s}\,(1+O(t^2))\\
&= \sqrt{2s}\,(1+O(2s(1+O(t^2))))\\
&= \sqrt{2s}+O(s^{3/2}).
\end{aligned}
\]
Therefore, $n\omega^*_n = \sqrt{d^2-c^2}+O(n^{-2})$. □

Lemmas B.1, B.2, and B.3 below present auxiliary results that are needed for the proof of Proposition 3.2 and Lemma 3.1. In particular, Lemma B.3 establishes the properties of the diffusion processes that appear in the limiting expressions for the estimators and test statistics.
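As a numerical sanity check on the expansion in the proof of Proposition 2.1, the sketch below evaluates $n\omega^*_n$ for increasing $n$ and compares it with $\sqrt{d^2-c^2}$. It assumes the parametrization $\phi_{1,n} = 2e^{c/n}\cos(d/n)$ and $\phi_{2,n} = -e^{2c/n}$, which is consistent with the first display of the proof; the function name and the values $c = -4$, $d = 10$ are illustrative.

```python
import numpy as np

def spectrum_max_frequency(c, d, n):
    """omega*_n = arccos(-phi1 (1 - phi2) / (4 phi2)) for the long-cycle AR(2)."""
    phi1 = 2.0 * np.exp(c / n) * np.cos(d / n)
    phi2 = -np.exp(2.0 * c / n)
    return np.arccos(-phi1 * (1.0 - phi2) / (4.0 * phi2))

c, d = -4.0, 10.0                      # illustrative localization parameters
target = np.sqrt(d ** 2 - c ** 2)

# Approximation error |n * omega*_n - sqrt(d^2 - c^2)| for several sample sizes.
errors = {n: abs(n * spectrum_max_frequency(c, d, n) - target)
          for n in (100, 200, 400, 800)}
```

If the error is of order $n^{-2}$, doubling $n$ should shrink it by a factor of roughly four, which is what the entries of `errors` can be used to check.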

Lemma B.1.

Suppose that Assumption 2.1 holds. The following approximation holds forthe long-cycle autoregressive coefﬁcients in (2.6) and (2.7) :(a) φ ,n = 2 + cn + c − d n + O ( n − ) .(b) − φ ,n = 1 + cn + c n + O ( n − ) .(c) φ ,n − φ ,n = 1 − d + c n + O ( n − ) .(d) − φ ,n = 1 − cn − c n + O ( n − ) .(e) φ ,n + φ ,n = 1 − c + d n + O ( n − ) .(f) − ( φ ,n + φ ,n ) φ ,n = 1 + cn + c − d n + O ( n − ) .(g) − ( φ ,n + φ ,n ) = c + d ) n + O ( n − ) .(h) φ ,n = 1 + cn + c n + O ( n − ) . Lemma B.2.

Suppose $\{y_{n,t}\}$ is generated according to (2.4). We have:
(a) $\sum_{t=2}^{n} y_{t-2}^2 = \sum_{t=1}^{n} y_{t-1}^2 - y_{n-1}^2$.
(b) $\sum_{t=3}^{n} y_{t-1}y_{t-2} = \frac{\phi_{1,n}}{1-\phi_{2,n}}\sum_{t=2}^{n} y_{t-1}^2 + \frac{1}{1-\phi_{2,n}}\left(\sum_{t=2}^{n} y_{t-1}u_t - y_ny_{n-1}\right) + \frac{\phi_{2,n}}{1-\phi_{2,n}}y_1y_0$.
(c) $\sum_{t=2}^{n} y_{t-2}u_t = \sum_{t=2}^{n} y_{t-1}u_t - \sum_{t=2}^{n}(y_{t-1}-y_{t-2})u_t$.
(d) $y_n^2 = y_{n-1}^2 + (y_n^2-y_{n-1}^2)$.
(e) $y_ny_{n-1} = y_n^2 - y_n(y_n-y_{n-1})$.
(f) $y_{n-1}^2 = y_n^2 - 2y_n(y_n-y_{n-1}) + (y_n-y_{n-1})^2$.
(g) $\sum y_{t-1}(\Delta y_t-\Delta y_{t-1}) = y_{n-1}\Delta y_n - \sum(\Delta y_{t-1})^2$.

Lemma B.3.

The diffusion processes $J_{c,d}(\cdot)$, $K_{c,d}(\cdot)$, and $G_{c,d}(\cdot)$ have the following properties:
(a) $\mathrm{d}J_{c,d}(r) = c\cdot J_{c,d}(r)\mathrm{d}r + d\cdot K_{c,d}(r)\mathrm{d}r = G_{c,d}(r)\mathrm{d}r$.
(b) $\mathrm{d}K_{c,d}(r) = c\cdot K_{c,d}(r)\mathrm{d}r - d\cdot J_{c,d}(r)\mathrm{d}r + \frac{1}{d}\mathrm{d}W(r)$.
(c) $\int_0^r e^{2c(r-s)}J_{c,d}(s)\mathrm{d}s = \frac{1}{c^2+d^2}\left\{\int_0^r e^{2c(r-s)}\mathrm{d}W(s) - \left(c\cdot J_{c,d}(r)+d\cdot K_{c,d}(r)\right)\right\}$.
(d) $\mathrm{d}\left(J_{c,d}(r)\cdot K_{c,d}(r)\right) = 2c\cdot J_{c,d}(r)\cdot K_{c,d}(r)\mathrm{d}r + d\cdot(K^2_{c,d}(r)-J^2_{c,d}(r))\mathrm{d}r + \frac{1}{d}J_{c,d}(r)\mathrm{d}W(r)$.
(e) $\int_0^1 G^2_{c,d}(r)\mathrm{d}r = (c^2+d^2)\int_0^1 J^2_{c,d}(r)\mathrm{d}r + J_{c,d}(1)G_{c,d}(1) - \int_0^1 J_{c,d}(r)\mathrm{d}W(r) - c\cdot J^2_{c,d}(1)$.
(f) $J^2_{c,d}(1) = 2\int_0^1 J_{c,d}(r)G_{c,d}(r)\mathrm{d}r$.
(g) $(G^2_{c,d}(1)-1)/2 = c\int_0^1 G^2_{c,d}(r)\mathrm{d}r + cd\int_0^1 K_{c,d}(r)G_{c,d}(r)\mathrm{d}r - d^2\int_0^1 J_{c,d}(r)G_{c,d}(r)\mathrm{d}r + \int_0^1 G_{c,d}(r)\mathrm{d}W(r)$.

Proof of Lemma B.3.

To prove parts (a) and (b), note that by applying trigonometric identities, we have
\[
\begin{aligned}
J_{c,d}(r) &= \frac{1}{d}\int_0^r e^{c(r-s)}\{\sin(dr)\cos(ds)-\cos(dr)\sin(ds)\}\,\mathrm{d}W(s),\\
K_{c,d}(r) &= \frac{1}{d}\int_0^r e^{c(r-s)}\{\cos(dr)\cos(ds)+\sin(dr)\sin(ds)\}\,\mathrm{d}W(s).
\end{aligned}
\]
By applying stochastic differentiation of $J_{c,d}(r)$ and $K_{c,d}(r)$,
\[
\begin{aligned}
d\cdot\mathrm{d}J_{c,d}(r) &= \left(c\cdot e^{cr}\sin(dr)+d\cdot e^{cr}\cos(dr)\right)\int_0^r e^{-cs}\cos(ds)\,\mathrm{d}W(s)\cdot\mathrm{d}r + e^{cr}\sin(dr)e^{-cr}\cos(dr)\,\mathrm{d}W(r)\\
&\quad - \left(c\cdot e^{cr}\cos(dr)-d\cdot e^{cr}\sin(dr)\right)\int_0^r e^{-cs}\sin(ds)\,\mathrm{d}W(s)\cdot\mathrm{d}r - e^{cr}\cos(dr)e^{-cr}\sin(dr)\,\mathrm{d}W(r)\\
&= c\int_0^r e^{c(r-s)}\{\sin(dr)\cos(ds)-\cos(dr)\sin(ds)\}\,\mathrm{d}W(s)\cdot\mathrm{d}r\\
&\quad + d\int_0^r e^{c(r-s)}\{\cos(dr)\cos(ds)+\sin(dr)\sin(ds)\}\,\mathrm{d}W(s)\cdot\mathrm{d}r,
\end{aligned}
\]
\[
\begin{aligned}
d\cdot\mathrm{d}K_{c,d}(r) &= \left(c\cdot e^{cr}\cos(dr)-d\cdot e^{cr}\sin(dr)\right)\int_0^r e^{-cs}\cos(ds)\,\mathrm{d}W(s)\cdot\mathrm{d}r + e^{cr}\cos(dr)e^{-cr}\cos(dr)\,\mathrm{d}W(r)\\
&\quad + \left(c\cdot e^{cr}\sin(dr)+d\cdot e^{cr}\cos(dr)\right)\int_0^r e^{-cs}\sin(ds)\,\mathrm{d}W(s)\cdot\mathrm{d}r + e^{cr}\sin(dr)e^{-cr}\sin(dr)\,\mathrm{d}W(r)\\
&= c\int_0^r e^{c(r-s)}\{\cos(dr)\cos(ds)+\sin(dr)\sin(ds)\}\,\mathrm{d}W(s)\cdot\mathrm{d}r\\
&\quad - d\int_0^r e^{c(r-s)}\{\sin(dr)\cos(ds)-\cos(dr)\sin(ds)\}\,\mathrm{d}W(s)\cdot\mathrm{d}r + \mathrm{d}W(r).
\end{aligned}
\]
Parts (a) and (b) now follow from the trigonometric identities. To prove part (c), use the results from (a) and (b) and evaluate the following integrals using integration by parts:
\[
\begin{aligned}
\int_0^r e^{2c(r-s)}J_{c,d}(s)\,\mathrm{d}s &= \frac{d}{c}\int_0^r e^{2c(r-s)}K_{c,d}(s)\,\mathrm{d}s - \frac{1}{c}J_{c,d}(r),\\
\int_0^r e^{2c(r-s)}K_{c,d}(s)\,\mathrm{d}s &= \frac{1}{cd}\int_0^r e^{2c(r-s)}\,\mathrm{d}W(s) - \frac{d}{c}\int_0^r e^{2c(r-s)}J_{c,d}(s)\,\mathrm{d}s - \frac{1}{c}K_{c,d}(r).
\end{aligned}
\]
Substituting the second equation into the first and solving for $\int_0^r e^{2c(r-s)}J_{c,d}(s)\,\mathrm{d}s$, we obtain part (c).

By Ito's lemma,
\[
\mathrm{d}\left(J_{c,d}(r)\cdot K_{c,d}(r)\right) = \mathrm{d}J_{c,d}(r)\cdot K_{c,d}(r) + J_{c,d}(r)\cdot\mathrm{d}K_{c,d}(r).
\]
Note that the quadratic covariation term vanishes in this case because, by part (a), $\mathrm{d}J_{c,d}(r)$ has no $\mathrm{d}W(r)$ component. Using (a) and (b), part (d) follows immediately.

Next we proceed to prove (e). From (d), it follows that
\[
d\cdot J_{c,d}(1)K_{c,d}(1) = 2cd\int_0^1 J_{c,d}(r)K_{c,d}(r)\,\mathrm{d}r + d^2\int_0^1\left(K^2_{c,d}(r)-J^2_{c,d}(r)\right)\mathrm{d}r + \int_0^1 J_{c,d}(r)\,\mathrm{d}W(r).
\]
By the definition of $G_{c,d}(\cdot)$,
\[
J_{c,d}(1)G_{c,d}(1) = c\cdot J^2_{c,d}(1) + d\cdot J_{c,d}(1)K_{c,d}(1).
\]
By applying the two results from above, we obtain the result in (e):
\[
\begin{aligned}
\int_0^1 G^2_{c,d}(r)\,\mathrm{d}r &= c^2\int_0^1 J^2_{c,d}(r)\,\mathrm{d}r + 2cd\int_0^1 J_{c,d}(r)K_{c,d}(r)\,\mathrm{d}r + d^2\int_0^1 K^2_{c,d}(r)\,\mathrm{d}r\\
&= (c^2+d^2)\int_0^1 J^2_{c,d}(r)\,\mathrm{d}r + d\cdot J_{c,d}(1)K_{c,d}(1) - \int_0^1 J_{c,d}(r)\,\mathrm{d}W(r)\\
&= (c^2+d^2)\int_0^1 J^2_{c,d}(r)\,\mathrm{d}r + J_{c,d}(1)G_{c,d}(1) - c\cdot J^2_{c,d}(1) - \int_0^1 J_{c,d}(r)\,\mathrm{d}W(r).
\end{aligned}
\]
To prove (f) and (g), we use stochastic differentiation of $J^2_{c,d}(r)$ and $G^2_{c,d}(r)$, respectively:
\[
\begin{aligned}
\mathrm{d}J^2_{c,d}(r) &= 2J_{c,d}(r)\,\mathrm{d}J_{c,d}(r) = 2J_{c,d}(r)G_{c,d}(r)\,\mathrm{d}r,\\
\mathrm{d}G^2_{c,d}(r) &= 2G_{c,d}(r)\,\mathrm{d}G_{c,d}(r) + (\mathrm{d}G_{c,d}(r))^2\\
&= 2G_{c,d}(r)\left(c\cdot\mathrm{d}J_{c,d}(r)+d\cdot\mathrm{d}K_{c,d}(r)\right) + \mathrm{d}r\\
&= 2c\cdot G^2_{c,d}(r)\,\mathrm{d}r + 2cd\cdot G_{c,d}(r)K_{c,d}(r)\,\mathrm{d}r - 2d^2\cdot G_{c,d}(r)J_{c,d}(r)\,\mathrm{d}r + 2G_{c,d}(r)\,\mathrm{d}W(r) + \mathrm{d}r.
\end{aligned}
\]
The results in (f) and (g) follow by integrating both sides of the stochastic differential equations above with respect to $r$ over $[0,1]$. □

Proof of Proposition 3.2.

By Lemma B.1(a) and (b),
\[
y_t = \left(2+\frac{2c}{n}+\frac{c^2-d^2}{n^2}+O(n^{-3})\right)y_{t-1} - \left(1+\frac{2c}{n}+\frac{2c^2}{n^2}+O(n^{-3})\right)y_{t-2} + u_t,
\]
and
\[
\begin{aligned}
\Delta y_t &= \left(1+\frac{2c}{n}\right)\Delta y_{t-1} + \left(\frac{c^2-d^2}{n^2}+O(n^{-3})\right)y_{t-1} - \left(\frac{2c^2}{n^2}+O(n^{-3})\right)y_{t-2} + u_t\\
&= \sum_{j=0}^{t}\left(1+\frac{2c}{n}\right)^{t-j}u_j + \left(\frac{c^2-d^2}{n^2}+O(n^{-3})\right)\sum_{j=0}^{t}\left(1+\frac{2c}{n}\right)^{t-j}y_{j-1} - \left(\frac{2c^2}{n^2}+O(n^{-3})\right)\sum_{j=0}^{t}\left(1+\frac{2c}{n}\right)^{t-j}y_{j-2}.
\end{aligned}
\]
Define $S_n(r) \equiv \sum_{t=1}^{\lfloor nr\rfloor}u_t$. We have:
\[
\begin{aligned}
\Delta y_{\lfloor nr\rfloor} &= \sum_{j=0}^{\lfloor nr\rfloor}\left(1+\frac{2c}{n}\right)^{\lfloor nr\rfloor-j}\int_{\frac{j-1}{n}}^{\frac{j}{n}}\mathrm{d}S_n(s)\\
&\quad + \left(\frac{c^2-d^2}{n}+O(n^{-2})\right)\sum_{j=0}^{\lfloor nr\rfloor}\int_{\frac{j-1}{n}}^{\frac{j}{n}}\left(1+\frac{2c}{n}\right)^{\lfloor nr\rfloor-j}y_{\lfloor n\frac{j-1}{n}\rfloor}\,\mathrm{d}s\\
&\quad - \left(\frac{2c^2}{n}+O(n^{-2})\right)\sum_{j=0}^{\lfloor nr\rfloor}\int_{\frac{j-1}{n}}^{\frac{j}{n}}\left(1+\frac{2c}{n}\right)^{\lfloor nr\rfloor-j}y_{\lfloor n\frac{j-2}{n}\rfloor}\,\mathrm{d}s.
\end{aligned}
\]
By the CMT and Proposition 3.1,
\[
\begin{aligned}
n^{-1/2}\Delta y_{\lfloor nr\rfloor} &\Rightarrow \sigma\int_0^r e^{2c(r-s)}\,\mathrm{d}W(s) - \sigma(c^2+d^2)\int_0^r e^{2c(r-s)}J_{c,d}(s)\,\mathrm{d}s\\
&= \sigma\left(c\cdot J_{c,d}(r)+d\cdot K_{c,d}(r)\right),
\end{aligned}
\]
where the result in the last line follows by Lemma B.3(c). The result of the proposition now follows by the definition of $G_{c,d}(r)$ in (3.6). □

Proof of Lemma 3.1.

Parts (a)–(c) follow immediately from Propositions 3.1 and 3.2 by the CMT. To prove the result in part (d), by squaring both sides of equation (2.4) and summing over $t$, we obtain:
\[
\begin{aligned}
\sum y_t^2 &= (\phi_{1,n}+\phi_{2,n})^2\sum y_{t-1}^2 + \phi_{2,n}^2\sum(\Delta y_{t-1})^2 + \sum u_t^2\\
&\quad - 2(\phi_{1,n}+\phi_{2,n})\phi_{2,n}\sum y_{t-1}\Delta y_{t-1} + 2(\phi_{1,n}+\phi_{2,n})\sum y_{t-1}u_t - 2\phi_{2,n}\sum\Delta y_{t-1}u_t.
\end{aligned}
\]
After rearranging and applying the results of Lemmas B.1–B.2, we have:
\[
\sum y_{t-1}u_t = \frac{c^2+d^2}{n^2}\sum y_{t-1}^2 + y_n\Delta y_n - \sum(\Delta y_{t-1})^2 - \frac{2c}{n}\sum y_{t-1}\Delta y_{t-1} + O_p(n).
\]
By the results in parts (a)–(c) of the lemma, and using the shortened notation as explained on page 13,
\[
\begin{aligned}
n^{-2}\sum y_{t-1}u_t &\Rightarrow \sigma^2\left((c^2+d^2)\int J^2_{c,d} + J_{c,d}(1)G_{c,d}(1) - \int G^2_{c,d} - 2c\int J_{c,d}G_{c,d}\right)\\
&= \sigma^2\int J_{c,d}\,\mathrm{d}W,
\end{aligned}
\]
where the result in the last line is by Lemma B.3(e) and (f).

To prove part (e), we follow the same steps as in part (d) using (3.4) to obtain
\[
\begin{aligned}
\frac{1}{n}\sum\Delta y_{t-1}u_t &= \frac{c^2+d^2}{n^2}\sum y_{t-1}\Delta y_{t-1} - \frac{2c}{n^2}\sum(\Delta y_{t-1})^2 - \frac{1}{2n}\sum u_t^2 + \frac{1}{2n}(\Delta y_n)^2 + O_p(n^{-1})\\
&\Rightarrow \sigma^2(c^2+d^2)\int J_{c,d}G_{c,d} - 2c\sigma^2\int G^2_{c,d} - \frac{1}{2}\sigma_u^2 + \frac{1}{2}\sigma^2G^2_{c,d}(1)\\
&= \sigma^2\int G_{c,d}\,\mathrm{d}W + \frac{1}{2}(\sigma^2-\sigma_u^2),
\end{aligned}
\]
where the equality in the last line is by part (g) of Lemma B.3 and the definition of $G_{c,d}$. □

Proof of Proposition 3.3.

By (3.5),
\[
\begin{pmatrix}\hat\phi_{1,n}+\hat\phi_{2,n}-\phi_{1,n}-\phi_{2,n}\\ \hat\phi_{2,n}-\phi_{2,n}\end{pmatrix}
= \frac{1}{\sum y_{t-1}^2\sum(\Delta y_{t-1})^2 - \left(\sum y_{t-1}\Delta y_{t-1}\right)^2}
\begin{pmatrix}\sum(\Delta y_{t-1})^2 & \sum y_{t-1}\Delta y_{t-1}\\ \sum y_{t-1}\Delta y_{t-1} & \sum y_{t-1}^2\end{pmatrix}
\begin{pmatrix}\sum y_{t-1}u_t\\ -\sum\Delta y_{t-1}u_t\end{pmatrix}.
\]
The result in part (a) and the result in part (b) for $\hat\phi_{2,n}$ follow immediately by Lemma 3.1 and the CMT. The result in part (b) for $\hat\phi_{1,n}$ follows since
\[
\begin{aligned}
n(\hat\phi_{1,n}-\phi_{1,n}) &= n(\hat\phi_{1,n}+\hat\phi_{2,n}-\phi_{1,n}-\phi_{2,n}) - n(\hat\phi_{2,n}-\phi_{2,n})\\
&= O_p(n^{-1}) - n(\hat\phi_{2,n}-\phi_{2,n}),
\end{aligned}
\]
where the second equality holds by the result in part (a). □

Proof of Proposition 3.4.

The result follows from Lemma 3.1(a)-(c) and Proposition3.3, provided that ˆ σ n → p σ . The long-run variance estimator (cid:98) σ n is given by (cid:98) σ n = ˆ σ u,n + 2 m n (cid:88) h =1 w n ( h ) n − n (cid:88) t = h +1 ˆ u t ˆ u t − h , where (cid:98) σ u,n = n − n (cid:88) t =1 ˆ u t . Denote φ ,n ≡ φ ,n + φ ,n and (cid:98) φ ,n ≡ (cid:98) φ ,n + (cid:98) φ ,n . We have: (cid:98) σ u,n = 1 n (cid:88) u t − ( (cid:98) φ ,n − φ ,n ) 2 n (cid:88) y t − u t + ( (cid:98) φ ,n − φ ,n ) 2 n (cid:88) ∆ y t − u t + 1 n (cid:88) (cid:18) ( (cid:98) φ ,n − φ ,n ) y t − + ( (cid:98) φ ,n − φ ,n )∆ y t − (cid:19) = 1 n (cid:88) u t + O p ( n − ) → p σ u , where the equality in the line before the last holds by Lemma Lemma 3.1(d),(e) andProposition 3.3, and the result in the last line holds by Assumption 3.2. By the samearguments and since the weight function w n ( · ) is bounded, n − n (cid:88) t = h +1 ˆ u t ˆ u t − h = n − n (cid:88) t = h +1 u t u t − h + O p ( n − ) . Hence, (cid:98) σ n = ˜ σ n + O p ( m n /n ) , and the result follows by Assumption 3.3. (cid:3) Proof of Lemma 4.1.

By the results of Propositions 3.1, 3.2, and the CMT,
\[
n^{-3/2}\bar y_n/\sigma \Rightarrow \int_0^1 J_{c,d}(s)\,\mathrm{d}s, \qquad n^{-1/2}\overline{\Delta y}_n/\sigma \Rightarrow \int_0^1 G_{c,d}(s)\,\mathrm{d}s.
\]
Hence,
\[
\begin{aligned}
n^{-3/2}\left(y_{\lfloor nr\rfloor}-\bar y_n\right)/\sigma &\Rightarrow J_{c,d}(r) - \int_0^1 J_{c,d}(s)\,\mathrm{d}s = \tilde J_{c,d}(r),\\
n^{-1/2}\left(\Delta y_{\lfloor nr\rfloor}-\overline{\Delta y}_n\right)/\sigma &\Rightarrow G_{c,d}(r) - \int_0^1 G_{c,d}(s)\,\mathrm{d}s = \tilde G_{c,d}(r).
\end{aligned}
\]
The results of the lemma now follow by the CMT using the same arguments as those in the proof of Lemma 3.1. □
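The limiting processes and their demeaned versions can be approximated numerically. The sketch below is an Euler discretization of the stochastic differential equations in Lemma B.3(a) and (b), under assumed values $c = -4$ and $d = 10$; the function names, step count and seed are illustrative choices, and the sample mean of a path is used as a Riemann approximation of its integral over $[0,1]$.

```python
import numpy as np

def simulate_limit_processes(c, d, steps=2000, seed=0):
    """Euler scheme for dJ = (cJ + dK) dr and dK = (cK - dJ) dr + dW / d on [0, 1]."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / steps
    dW = rng.normal(0.0, np.sqrt(dt), size=steps)
    J = np.zeros(steps + 1)
    K = np.zeros(steps + 1)
    for i in range(steps):
        J[i + 1] = J[i] + (c * J[i] + d * K[i]) * dt
        K[i + 1] = K[i] + (c * K[i] - d * J[i]) * dt + dW[i] / d
    G = c * J + d * K                      # G = cJ + dK, so that dJ = G dr
    return J, K, G, dt

def demean(path):
    """Approximate path(r) minus its integral over [0, 1] by subtracting the mean."""
    return path - path.mean()

J, K, G, dt = simulate_limit_processes(c=-4.0, d=10.0)
J_tilde, G_tilde = demean(J), demean(G)

# Discrete analogue of Lemma B.3(f): J(1)^2 versus 2 * integral of J * G.
lhs = J[-1] ** 2
rhs = 2.0 * np.sum(J[:-1] * G[:-1]) * dt
```

With a fine grid, `lhs` and `rhs` should be close, which provides a quick check that the discretization respects the identity $J^2_{c,d}(1) = 2\int_0^1 J_{c,d}G_{c,d}$.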

Proof of Lemma 4.2.

The results of the lemma follow by the same arguments as those in the proofs of Lemmas 3.1 and 4.1 after observing that $\int_0^1\cos^2(2\pi ks)\,\mathrm{d}s = \int_0^1\sin^2(2\pi ks)\,\mathrm{d}s = 1/2$. □

Proof of Lemma 4.3.

The results of the lemma follow by the same arguments as those in the proofs of Lemmas 3.1 and 4.1 after observing that
\[
\begin{pmatrix}1 & \int_0^1 s\,\mathrm{d}s\\ \int_0^1 s\,\mathrm{d}s & \int_0^1 s^2\,\mathrm{d}s\end{pmatrix}^{-1} = \begin{pmatrix}4 & -6\\ -6 & 12\end{pmatrix}.
\]
□

Proof of Proposition 6.1.

To simplify the presentation, we prove the result for $p = 1$. For the general case, the proof is similar but requires more complicated notation. Under $H_0$, $\tilde u_{t,0} = \tilde u_t$, where $\{\tilde u_t\}$ are the residuals from the projection of $\{u_t\}$ against the components of $D_t$. Since
\[
(1-\phi_1L-\phi_2L^2)\hat x_{t,0} = \tilde\varepsilon_t - (\hat\rho_{1,0}-\rho_1)\tilde u_{t-1}, \qquad \hat\rho_{1,0}-\rho_1 = \frac{\sum\tilde u_{t-1}\varepsilon_t}{\sum\tilde u_{t-1}^2},
\]
the estimators of $\phi_1$ and $\phi_2$ satisfy:
\[
\begin{pmatrix}\hat\phi_{1,n}-\phi_1\\ \hat\phi_{2,n}-\phi_2\end{pmatrix}
= \begin{pmatrix}\sum\hat x_{t-1,0}^2 & \sum\hat x_{t-1,0}\hat x_{t-2,0}\\ \sum\hat x_{t-1,0}\hat x_{t-2,0} & \sum\hat x_{t-2,0}^2\end{pmatrix}^{-1}
\begin{pmatrix}\sum\hat x_{t-1,0}\left(\tilde\varepsilon_t-(\hat\rho_{1,0}-\rho_1)\tilde u_{t-1}\right)\\ \sum\hat x_{t-2,0}\left(\tilde\varepsilon_t-(\hat\rho_{1,0}-\rho_1)\tilde u_{t-1}\right)\end{pmatrix},
\]
with
\[
\begin{pmatrix}\sum\hat x_{t-1,0}\left(\tilde\varepsilon_t-(\hat\rho_{1,0}-\rho_1)\tilde u_{t-1}\right)\\ \sum\hat x_{t-2,0}\left(\tilde\varepsilon_t-(\hat\rho_{1,0}-\rho_1)\tilde u_{t-1}\right)\end{pmatrix}
= \begin{pmatrix}\sum\left(\hat x_{t-1,0}-\frac{\sum\hat x_{s-1,0}\tilde u_{s-1}}{\sum\tilde u_{s-1}^2}\tilde u_{t-1}\right)\tilde\varepsilon_t\\ \sum\left(\hat x_{t-2,0}-\frac{\sum\hat x_{s-2,0}\tilde u_{s-1}}{\sum\tilde u_{s-1}^2}\tilde u_{t-1}\right)\tilde\varepsilon_t\end{pmatrix}. \tag{B.2}
\]
The result follows since under the null, $\hat\rho_{1,0}-\rho_1 = O_p(n^{-1/2})$ and $\hat x_{t,0} = \tilde x_t - (\hat\rho_{1,0}-\rho_1)\tilde y_{t-1}$. □

Proof of Proposition 6.2.

Similarly to the proof of Proposition 6.1, we prove the result for $p = 1$. For the general case, the proof is analogous, but requires more complicated notation. Consider $\dot\zeta_{1,n}$ in (6.5):
\[
\dot\zeta_{1,n} = \frac{\sum\tilde u_t\left(\tilde x_t-(\hat\rho_{1,0}-\rho_1)\tilde y_{t-1}\right)}{\sum\tilde u_t^2} = O_p(n),
\]
where the second equality holds by the lemmas in Section 4. Next, consider the elements of the matrix $\Sigma_n$:
\[
\begin{aligned}
n^{-4}\sum\dot x_{t-1,0}^2 &= n^{-4}\sum\left(\hat x_{t,0}-\dot\zeta_{1,n}\tilde u_{t,0}\right)^2\\
&= n^{-4}\sum\tilde x_t^2 + o_p(1)\\
&\Rightarrow \sigma_\varepsilon^2\int\tilde J^2_{c,d},
\end{aligned}
\]
where the results in the second and third lines hold again by the lemmas in Section 4. After applying the same arguments to the other elements in $\Sigma_n$, the elements of $M_n$, and the expressions on the right-hand side of (B.2), the result of the proposition follows by the CMT. □

Appendix C. Description of the data in Section 7

Table 3 in this appendix provides a description of the variables used in the empirical application in Section 7. The description includes the source with exact identifiers for each variable, any transformations applied to the raw data, and the sample periods.

Table 3. Data description

| Variable | Source | Identifier | Construction | Sample |
|---|---|---|---|---|
| Real GDP per capita | FRED | A939RX0Q048SBEA | Natural logarithm | 1947Q1 to 2020Q2 |
| Unemployment rate | FRED | UNRATE | Natural logarithm | 1948Q1 to 2020Q2 |
| Hours per capita | FRED | HOANBS, B230RC0Q173SBEA | Ratio of non-farm business hours to population, natural logarithm | 1947Q1 to 2020Q2 |
| VXO S&P 100 volatility index | FRED | VXOCLS | Natural logarithm | 1986Q1 to 2020Q3 |
| Credit risk premium, BAA to 10Y | FRED | BAA10YM | — | 1953Q2 to 2020Q2 |
| Credit to non-financial sector, % GDP | FRED | QUSPAM770A | — | 1952Q1 to 2020Q1 |
| Home price index | FRED | CSUSHPISA | S&P/Case-Shiller U.S. National Home Price Index, natural logarithm | 1987Q1 to 2020Q2 |
| Equity price index | FRED | WILL5000IND, CPALTT01USQ661S | Wilshire 5000 Total Market Index divided by CPI, natural logarithm | 1971Q2 to 2020Q2 |

Note: In the case where aggregation is needed, the end-of-period values are used.

References

A'Hearn, B., and U. Woitek (2001): “More international evidence on the historical properties of business cycles,” Journal of Monetary Economics, 47(2), 321–346.

Aikman, D., A. G. Haldane, and B. D. Nelson (2015): “Curbing the credit cycle,” Economic Journal, 125(585), 1072–1109.

Andrews, D. W. K. (1991): “Heteroskedasticity and autocorrelation consistent covariance matrix estimation,” Econometrica, 59, 817–858.

Andrews, D. W. K. (1993): “Exactly median-unbiased estimation of first order autoregressive/unit root models,” Econometrica, pp. 139–165.

Andrews, D. W. K., X. Cheng, and P. Guggenberger (2020): “Generic results for establishing the asymptotic size of confidence sets and tests,” Journal of Econometrics, 218, 496–531.

Beaudry, P., D. Galizia, and F. Portier (2020): “Putting the cycle back into business cycle analysis,” American Economic Review, 110(1), 1–47.

Bry, G., and C. Boschan (1971): “Cyclical analysis of time series: Selected procedures and computer programs,” NBER technical paper.

Cochrane, J. H. (2017): “Macro-finance,” Review of Finance, 21(3), 945–985.

Comin, D., and M. Gertler (2006): “Medium-term business cycles,” American Economic Review, 96(3), 523–551.

Dou, L., and U. K. Müller (2019): “Generalized local-to-unity models,” Working paper.

Drehmann, M., C. E. Borio, and K. Tsatsaronis (2012): “Characterising the financial cycle: don't lose sight of the medium term!,” BIS working paper.

Elliott, G., and J. H. Stock (2001): “Confidence intervals for autoregressive coefficients near one,” Journal of Econometrics, 103(1-2), 155–181.

Hansen, B. E. (1999): “The grid bootstrap and the autoregressive model,” Review of Economics and Statistics, 81(4), 594–607.

Harding, D., and A. Pagan (2002): “Dissecting the cycle: a methodological investigation,” Journal of Monetary Economics, 49(2), 365–381.

Harvey, A. C. (1985): “Trends and cycles in macroeconomic time series,” Journal of Business & Economic Statistics, 3(3), 216–227.

Mikusheva, A. (2007): “Uniform inference in autoregressive models,” Econometrica, 75(5), 1411–1452.

Mikusheva, A. (2012): “One-dimensional inference in autoregressive models with the potential presence of a unit root,” Econometrica, 80(1), 173–212.

Newey, W. K., and K. D. West (1987): “A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix,” Econometrica, 55, 703–708.

Phillips, P. C. B. (1987): “Towards a unified asymptotic theory for autoregression,” Biometrika, 74(3), 535–547.

Phillips, P. C. B. (1988): “Regression theory for near-integrated time series,”

Econometrica , 56(5),1021–1043.P

HILLIPS , P. C. B.,

AND

S. J IN (2002): “The KPSS test with seasonal dummies,” EconomicsLetters , 77(2), 239–243.P

HILLIPS , P. C. B.,

AND

V. S

OLO (1992): “Asymptotics for Linear Processes,”

Annals ofStatistics , 20(2), 971–1001.P

RIESTLEY , M. B. (1981):

Spectral Analysis and Time Series . Academic press.R¨

UNSTLER , G.,

AND

M. V

LEKKE (2018): “Business, housing, and credit cycles,”

Journal ofApplied Econometrics , 33(2), 212–226.S

ARGENT , T. J. (1987):

Macroeconomic Theory , Economic Theory, Econometrics, andMathematical Economics Series. Emerald Group Publishing Limited, Bingley, UK, 2ndedn.S

TOCK , J. H. (1991): “Conﬁdence intervals for the largest autoregressive root in USmacroeconomic time series,”

Journal of Monetary Economics , 28(3), 435–459.S

TROHSAL , T., C. R. P

ROA ˜ NO , AND

J. W

OLTERS (2019): “Characterizing the ﬁnancialcycle: Evidence from a frequency domain analysis,”

Journal of Banking & Finance , 106,568–591., 106,568–591.
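The data construction summarized in Table 3 of Appendix C (end-of-period aggregation to quarterly frequency, followed by a natural logarithm where the table lists one) can be illustrated with a minimal Python sketch. The paper does not say what software was used, so the function names `quarter_of`, `end_of_period_quarterly`, and `log_transform` are hypothetical, and the monthly values are made-up numbers, not actual FRED observations.

```python
import math
from datetime import date


def quarter_of(d):
    """Map a calendar date to its (year, quarter) pair."""
    return d.year, (d.month - 1) // 3 + 1


def end_of_period_quarterly(observations):
    """Aggregate dated observations to quarterly frequency.

    Keeps the last observation within each quarter (the end-of-period
    value). `observations` is an iterable of (date, value) pairs.
    """
    quarterly = {}
    for d, value in sorted(observations):
        # Dates are processed in ascending order, so a later date in the
        # same quarter overwrites an earlier one.
        quarterly[quarter_of(d)] = value
    return quarterly


def log_transform(quarterly):
    """Apply the natural logarithm to every quarterly value."""
    return {q: math.log(v) for q, v in quarterly.items()}


# Hypothetical monthly index levels (illustrative only, not FRED data).
monthly = [
    (date(1987, 1, 1), 100.0),
    (date(1987, 2, 1), 101.0),
    (date(1987, 3, 1), 102.0),
    (date(1987, 4, 1), 103.0),
    (date(1987, 5, 1), 104.0),
    (date(1987, 6, 1), 105.0),
]

quarterly = end_of_period_quarterly(monthly)
log_quarterly = log_transform(quarterly)

for (year, q), value in log_quarterly.items():
    print(f"{year}Q{q}: {value:.4f}")
```

For series that Table 3 lists without a transformation, only the aggregation step would apply. Ratios such as hours per capita would be formed from the two quarterly series before taking the logarithm.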