Testing for Nonlinear Cointegration under Heteroskedasticity

Abstract

This article discusses cointegration tests for nonlinear cointegration in the presence of variance breaks in the errors. We build on approaches of Cavaliere and Taylor (2006, Journal of Time Series Analysis) for heteroskedastic cointegration tests and of Choi and Saikkonen (2010, Econometric Theory) for nonlinear cointegration tests. We propose a bootstrap test and prove its consistency. A Monte Carlo study shows the approach to have appealing finite sample properties and to work better than an approach using subresiduals. We provide an empirical application to the environmental Kuznets curves (EKC), finding that the cointegration tests do not reject the EKC hypothesis in most cases.

Christoph Hanck ∗ Department of Economics, University of Duisburg-Essen.

Till Massing † Department of Economics, University of Duisburg-Essen.

February 18, 2021

Keywords: Nonlinear cointegration tests; variance breaks; fixed regressor bootstrap.

JEL Classification Numbers: C12, C32, Q2.

∗ Address: Universitätsstr. 12, 45130 Essen, Germany, e-mail: [email protected].
† Address: Universitätsstr. 12, 45130 Essen, Germany, e-mail: [email protected].

1 Introduction

In the past decades, a broad literature on cointegration tests has developed, addressing a variety of possible features of the data such as endogeneity, heteroskedasticity, and nonlinearity. For example, the discussion of the environmental Kuznets curve in our application reveals that the data exhibit both a nonlinear cointegrating relation and variance breaks.

This paper presents a framework capable of testing for cointegration both when the cointegrating relation is nonlinear and in the presence of heteroskedasticity. To achieve this, we mainly build on Choi and Saikkonen (2010) and on Cavaliere and Taylor (2006). The nonlinear cointegrating relation can be very general, and variance breaks can occur both in the integrated regressor and in the (stationary or integrated) error term.

There are two possibilities for specifying a null hypothesis. First, one can formulate the null hypothesis of no cointegration. In this field, e.g., Dickey and Fuller (1979), Phillips and Perron (1988) and their numerous extensions test the null of a unit root for univariate time series. Engle and Granger (1987) extended this to the context of testing for no cointegration. Alternatively, Kwiatkowski et al. (1992) test the null of stationarity against the alternative of a unit root (commonly known as the KPSS test). Shin (1994) extended this approach to test the null of cointegration, as we do here. The basic idea is to use the ordinary least squares (OLS) residuals of a linear cointegrating regression to build the test statistic.

This theory has been enhanced in several directions. For example, Leybourne and McCabe (1994) and McCabe et al. (1997) proposed extensions of the original framework. Cavaliere (2005) and Cavaliere and Taylor (2006) incorporated variance breaks into the linear cointegration model. Saikkonen and Choi (2004) dropped the linearity assumption of the cointegrating regression and proposed a test for cointegrating smooth transition functions. Choi and Saikkonen (2010) further extended this to general kinds of nonlinear cointegrating regressions. Both employed nonlinear least squares (NLS) estimation and leads-and-lags regression instead of OLS for estimating the cointegrating parameter vector.

The paper is organized as follows. Section 2 describes the nonlinear cointegrating regression model and the maintained assumptions. Section 3 presents the cointegration test and develops its large sample properties. Furthermore, Section 3 discusses a bootstrap approach for practical implementation of the test. Section 4 analyzes the finite sample quality of the test in a Monte Carlo study. Section 5 illustrates the approach with an application to the environmental Kuznets curve. Unless stated otherwise, all proofs are relegated to Appendix A.

Some notational remarks: We denote by $\lfloor x \rfloor$ the largest integer smaller than or equal to $x \in \mathbb{R}$ and by $\lceil x \rceil$ the smallest integer larger than or equal to $x$. $\mathbf{1}(\cdot)$ denotes the indicator function and $D_{\mathbb{R}^{m \times m}}[0,1]$ denotes the space of $m \times m$ matrices of càdlàg functions on $[0,1]$. Weak convergence is denoted by $\overset{w}{\to}$, convergence in probability by $\overset{p}{\to}$, weak convergence in probability (see Giné and Zinn, 1990) by $\overset{w}{\to}_p$, and almost sure convergence by $\overset{a.s.}{\to}$. All limits are taken as $T \to \infty$, unless stated otherwise.

In this section, we introduce the model and the underlying assumptions. We consider (as in Choi and Saikkonen, 2010) the nonlinear cointegrating regression
\[ y_t = g(x_t, \theta) + u_t, \quad t = 1, \dots, T, \tag{1} \]
where $y_t$ is 1-dimensional and $x_t$ is the $k$-dimensional regressor vector. Both $y_t$ and $x_t$ are $I(1)$. We assume that $g(x_t, \theta)$ is a known smooth function of $x_t$ up to the unknown $k$-dimensional parameter vector $\theta$. We furthermore assume that the vector elements of $x_t$ are not cointegrated (see Assumption 3 below for a precise statement). This also means that $g(x_t, \theta)$ is not $I(0)$. The error term is taken to be $u_t = \zeta_{u,t} + \mu_t$, where $\mu_t = \mu_{t-1} + \rho_\mu \zeta_{\mu,t}$, $\mu_0 = 0$. The random walk behavior of $x_t$ is specified by $x_t = x_{t-1} + \zeta_{x,t}$. The following assumption concerns the $(k+2)$-dimensional vector process $\zeta_t := (\zeta_{u,t}, \zeta_{x,t}', \zeta_{\mu,t})'$.

Assumption 1
(i) $\{\zeta_{u,t}\}$ and $\{\zeta_{\mu,t}\}$ are independent.
(ii) $\zeta_t := (\zeta_{u,t}, \zeta_{x,t}', \zeta_{\mu,t})' = \Sigma_t^{1/2} \zeta_t^*$, where $\{\zeta_t^*\}$ is a stationary, zero-mean, unit variance, strong-mixing sequence with mixing coefficient of size $-4r/(r-4)$, for some $r > 4$ and $E\|\zeta_t^*\|^r < \infty$, and
\[ \Sigma_t := \begin{pmatrix} \sigma_{u,t}^2 & \sigma_{ux,t}' & 0 \\ \sigma_{ux,t} & \Sigma_{x,t} & 0 \\ 0 & 0' & \sigma_{\mu,t}^2 \end{pmatrix}. \]
The scalars $\sigma_{u,t}$ and $\sigma_{\mu,t}$ are strictly positive, $\sigma_{ux,t}$ is $k$-dimensional, and $\Sigma_{x,t}$ ($k \times k$) is positive definite. All entries may depend on $t$. We assume that $\Sigma_t$ is positive definite for any $t$.

This means that $u_t$ has a random walk component unless $\rho_\mu = 0$. Hence the null hypothesis of cointegration is given by $H_0: \rho_\mu = 0$ against the alternative $H_1: \rho_\mu > 0$. The covariance matrix in (ii) permits correlation between $\zeta_{u,t}$ and $\zeta_{x,t}$ to allow for endogeneity. The Monte Carlo experiments in Section 4 will reveal that the proposed bootstrap approach effectively handles endogeneity. We conjecture that a correlation between, e.g., $\zeta_{u,t}$ and $\zeta_{\mu,t}$ will not reveal different insights. We therefore abstain from considering further non-zero covariance terms in (ii). Moreover, we also generalize Cavaliere and Taylor (2006, Assumption 1) by permitting autocorrelation of the $\zeta_t$'s. This is adopted from Assumption 2 of Choi and Saikkonen (2010).

Following Cavaliere (2005) and Cavaliere and Taylor (2006), we allow for general forms of heteroskedastic errors.

Assumption 2

The sequence $\{\Sigma_t\}_{t=1}^T$ satisfies $\Sigma_T(s) := \Sigma_{\lfloor Ts \rfloor} = \Sigma(s)$, where $\Sigma(\cdot)$ is a non-stochastic function which lies in $D_{\mathbb{R}^{(k+2) \times (k+2)}}[0,1]$, with $i,j$-th element $\Sigma_{ij}(\cdot)$.

Assumption 2 allows for many possible models for the covariance matrix of $\zeta_t$. For simple or multiple variance shifts, $\Sigma_{ij}(\cdot)$ is a piecewise constant function. For example, $\Sigma_{ij}(s) := \Sigma_{ij}^0 + (\Sigma_{ij}^1 - \Sigma_{ij}^0)\,\mathbf{1}(s \ge \tau_{ij})$ represents a shift from $\Sigma_{ij}^0$ to $\Sigma_{ij}^1$ at time $\lfloor \tau_{ij} T \rfloor$ ($0 \le \tau_{ij} \le 1$). Other admissible specifications include trends (e.g., $\Sigma_{t,ij}$ exhibits a linear trend), piecewise affine functions, or smooth transition functions. The assumption also allows for very general combinations of variance-covariance shifts. For example, the variance of $\zeta_{u,t}$ can have a shift while $\zeta_{x,t}$ is homoskedastic or heteroskedastic with a different shift function $\Sigma_{ij}(s)$. Notice that variance shifts in $\zeta_{\mu,t}$ are only relevant if the alternative $H_1$ is true. Although we rule out stochastic volatility here, a generalization to a stochastic $\{\Sigma_t\}$, such that $\{\Sigma_t\}$ is strictly exogenous with respect to $\{\zeta_t^*\}$, is possible. We refer to Cavaliere and Taylor (2006) for details.

Furthermore, we define $\Omega_t := t^{-1}\,\mathrm{Var}\big(\sum_{i=1}^t \zeta_i\big)$, which can be decomposed as
\[ \Omega_t = \begin{pmatrix} \omega_{u,t}^2 & \omega_{ux,t}' & 0 \\ \omega_{ux,t} & \Omega_{x,t} & 0 \\ 0 & 0' & \omega_{\mu,t}^2 \end{pmatrix}. \]
Analogously, $\Omega(s) := \Omega_{\lfloor Ts \rfloor}$. Then, the average long-run covariance matrix $\lim_{T \to \infty} \Omega_T$ is given by
\[ \bar\Omega = \int_0^1 \Omega(s)\,\mathrm{d}s, \]
which can be partitioned into
\[ \bar\Omega = \begin{pmatrix} \bar\omega_u^2 & \bar\omega_{ux}' & 0 \\ \bar\omega_{ux} & \bar\Omega_x & 0 \\ 0 & 0' & \bar\omega_\mu^2 \end{pmatrix}. \]
Assumptions 1 and 2 imply a generalized invariance principle as stated in Lemma 1. The standard invariance principle as in Shin (1994) would require a time-constant covariance matrix $\Sigma$.
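
As a rough illustration of the single-shift case covered by Assumption 2, the following sketch builds a scalar variance profile with one break at sample fraction tau and uses it to generate a random walk with heteroskedastic increments. The function names and the restriction to a scalar, serially independent Gaussian innovation are assumptions made here for illustration only; they are not part of the paper.

```python
import math
import random


def single_shift_profile(s, var_before, var_after, tau):
    """Piecewise-constant variance profile with one shift at sample fraction tau.

    Hypothetical helper: returns var_before for s < tau and var_after for s >= tau.
    """
    return var_after if s >= tau else var_before


def heteroskedastic_random_walk(T, var_before, var_after, tau, rng):
    """Random walk x_t = x_{t-1} + zeta_t with Var(zeta_t) = profile(t / T).

    Assumes independent Gaussian innovations (a simplification).
    """
    path, level = [], 0.0
    for t in range(1, T + 1):
        sd = math.sqrt(single_shift_profile(t / T, var_before, var_after, tau))
        level += sd * rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def average_variance(var_before, var_after, tau):
    """Integral of the single-shift profile over [0, 1]."""
    return tau * var_before + (1.0 - tau) * var_after


rng = random.Random(42)
x = heteroskedastic_random_walk(T=200, var_before=1.0, var_after=9.0, tau=0.5, rng=rng)
print(len(x), average_variance(1.0, 9.0, 0.5))
```

Other profiles admitted by the assumption (trends, several shifts, smooth transitions) could be plugged in by swapping the profile function.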

Lemma 1

Let Assumptions 1 and 2 hold on $\{\zeta_t\}$. Then, as $T \to \infty$,
\[ T^{-1/2} \sum_{t=1}^{\lfloor Ts \rfloor} \zeta_t \overset{w}{\to} B_\Omega(s), \quad s \in [0,1], \]
where
\[ B_\Omega(s) := \big(B_{1,\Omega}(s), B_{2,\Omega}(s)', B_{3,\Omega}(s)\big)' := \int_0^s \Omega^{1/2}(r)\,\mathrm{d}B(r), \]
and $B = (B_1, B_2', B_3)'$ is a $(k+2)$-dimensional Brownian motion with unit covariance matrix.

Proof.

The proof is analogous to the proof of Lemma 1 in Cavaliere and Taylor (2006) and is thus omitted.

The next assumption ensures that the components of $x_t$ are not cointegrated. This is given by the special case $\lambda = 0$.

Assumption 3

The spectral density matrix $f_{\zeta\zeta}(\lambda)$ is bounded away from zero: $f_{\zeta\zeta}(\lambda) \ge \varepsilon I_{k+2}$, $\varepsilon > 0$.

Assumption 4 is the usual assumption required for deriving consistency and the asymptotic distribution of the NLS estimator.

Assumption 4
(i) The parameter space $\Theta$ of $\theta$ is a compact subset of $\mathbb{R}^k$ and the true parameter $\theta_0 \in \Theta^0$, where $\Theta^0$ denotes the interior of $\Theta$.
(ii) $g(x, \theta)$ is three times continuously differentiable on $\mathbb{R} \times \Theta^*$, where $\Theta^* \supset \Theta$ is open.

The assumptions on $x_t$ theoretically rule out the possibility of deterministic regressors like an intercept or a time trend because they are not $I(1)$. However, we discuss these interesting scenarios in Appendix B and illustrate that the bootstrap generally works well.

Following Saikkonen and Choi (2004) and Choi and Saikkonen (2010), we use triangular array asymptotics in order to study the large sample behavior of the proposed test statistic (2), presented below. We fix the actual sample size at $T_0$ and embed the model in a sequence of models dependent on the sample size $T$, which tends to infinity. We replace the regressor $x_t$ by $x_{tT} := (T_0/T)^{1/2} x_t$. This makes the regressor and regressand dependent on $T$, and we obtain the actual model for $T = T_0$. If $T_0$ is large, the triangular asymptotics can be expected to give reasonable approximations to the finite sample distributions of the estimator and test statistics, see Saikkonen and Choi (2004). Choi and Saikkonen (2010) note that conventional asymptotic results on the NLS estimator are not available when the error term $u_t$ is allowed to be serially correlated or $x_t$ is not exogenous. See Saikkonen and Choi (2004) and Choi and Saikkonen (2010) for a more detailed discussion of triangular asymptotics.

In particular, we embed the model (1) in a sequence of models
\[ y_{tT} = g(x_{tT}, \theta_0) + u_t, \quad t = 1, \dots, T. \]
In practice, we always choose $T = T_0$, so that the transformation $x_{tT}$ is not needed. The transformation is made only to apply triangular asymptotics. We define $B_{2,\Omega}^0 := T_0^{1/2} B_{2,\Omega}$.

We use NLS regression to estimate $\theta$. Let
\[ Q(\theta) = \sum_{t=1}^T \big(y_{tT} - g(x_{tT}, \theta)\big)^2 \]
be the objective function to be minimized with respect to $\theta \in \Theta$. Since $Q$ is continuous on $\Theta$ for each $(y_{1T}, \dots, y_{TT}, x_{1T}, \dots, x_{TT})$ and $\Theta$ is compact by Assumption 4, the NLS estimator $\hat\theta_T$ exists and is Borel measurable (Pötscher and Prucha, 2013).

We need to make additional assumptions about the functions $g$ and $K$, where $K(x, \theta_0) := \frac{\partial g(x, \theta)}{\partial \theta}\big|_{\theta = \theta_0}$, to show that, under the null, the NLS estimator is consistent and to derive its asymptotic distribution in Proposition 1 below. Assumption 5 guarantees that the limit of the objective function is minimized (a.s.) at the true parameter vector $\theta_0$.

Assumption 5

For some $s \in [0,1]$ and all $\theta \ne \theta_0$, $g\big(B_{2,\Omega}^0(s), \theta\big) \ne g\big(B_{2,\Omega}^0(s), \theta_0\big)$ (a.s.).

Assumption 6 allows us to establish the limiting distribution of the NLS estimator.

Assumption 6

\[ \int_0^1 K\big(B_{2,\Omega}^0(s), \theta_0\big) K\big(B_{2,\Omega}^0(s), \theta_0\big)'\,\mathrm{d}s > 0 \quad \text{(a.s.)}. \]

Proposition 1

Suppose that Assumptions 1–6 hold. Then, under $H_0$,
\[ T^{1/2}\big(\hat\theta_T - \theta_0\big) \overset{w}{\to} \left(\int_0^1 K\big(B_{2,\Omega}^0(s), \theta_0\big) K\big(B_{2,\Omega}^0(s), \theta_0\big)'\,\mathrm{d}s\right)^{-1} \left(\int_0^1 K\big(B_{2,\Omega}^0(s), \theta_0\big)\,\mathrm{d}B_{1,\Omega}(s) + \int_0^1 K_1\big(B_{2,\Omega}^0(s), \theta_0\big)\,\mathrm{d}s\,\kappa\right) =: \psi\big(B_{2,\Omega}^0, \theta_0, \kappa\big), \]
where $K_1(x, \theta_0) = \frac{\partial K(x, \theta)}{\partial x'}\big|_{\theta = \theta_0}$ and $\kappa = \sum_{j=0}^{\infty} E(\zeta_{x,0}\,\zeta_{u,j})$.

Proof.

The proof can be directly adapted from the proof of Theorem 2 in Saikkonen and Choi (2004) and Theorem A.1 in Choi and Saikkonen (2010).
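
To make the objects entering Proposition 1 more concrete, the following worked example evaluates them for a quadratic regression function with a scalar regressor ($k = 1$), the specification used later in the Monte Carlo study. It is an illustration added here under the assumption that the derivative of $K$ with respect to $x$ is written $K_1$ and that $B$ stands for the limiting process of the scaled regressor; it is not taken from the cited proofs.

```latex
% Illustration: g(x, \theta) = \theta_1 x + \theta_2 x^2 with scalar x.
\begin{align*}
K(x, \theta_0) &= \frac{\partial g(x, \theta)}{\partial \theta}\Big|_{\theta = \theta_0}
  = \begin{pmatrix} x \\ x^2 \end{pmatrix}, \\
K_1(x, \theta_0) &= \frac{\partial K(x, \theta)}{\partial x}\Big|_{\theta = \theta_0}
  = \begin{pmatrix} 1 \\ 2x \end{pmatrix}, \\
\int_0^1 K\big(B(s), \theta_0\big) K\big(B(s), \theta_0\big)'\,\mathrm{d}s
  &= \begin{pmatrix}
       \int_0^1 B(s)^2\,\mathrm{d}s & \int_0^1 B(s)^3\,\mathrm{d}s \\
       \int_0^1 B(s)^3\,\mathrm{d}s & \int_0^1 B(s)^4\,\mathrm{d}s
     \end{pmatrix}.
\end{align*}
```

In this example neither $K$ nor $K_1$ depends on $\theta$ because $g$ is linear in the parameters, so the NLS estimator coincides with OLS, and Assumption 6 amounts to the displayed $2 \times 2$ matrix being positive definite almost surely.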

This subsection introduces the test statistic we work with and establishes its large sample behavior. In order to test for cointegration we test for the stationarity of the error process $u_t$. The test is residual-based and builds on the cointegration test of Shin (1994), which, in turn, is based on the KPSS test (Kwiatkowski et al., 1992). We use the test statistic
\[ \hat\eta := \big(T^2 \hat\omega_u^2\big)^{-1} \sum_{t=1}^T \Big(\sum_{j=1}^t \hat u_j\Big)^2, \tag{2} \]
where $\hat u_t := y_t - g(x_{tT}, \hat\theta_T)$ and
\[ \hat\omega_u^2 := \hat\omega_u^2(l) := T^{-1} \sum_{t=1}^T \hat u_t^2 + 2 T^{-1} \sum_{s=1}^{l} w(s, l) \sum_{t=s+1}^T \hat u_t \hat u_{t-s}, \]
where $w$ is a kernel which fulfills, e.g., the conditions of Andrews (1991) and the lag truncation parameter $l := l_T$ depends on the sample size. Here, $\hat\omega_u^2$ is a consistent estimator of the long-run variance, as long as $T/l \to \infty$ for $T \to \infty$.

The linear case without autocorrelation gives us the model of Cavaliere and Taylor (2006). We may then use the parametric estimator
\[ \hat\sigma_u^2 := T^{-1} \sum_{t=1}^T \hat u_t^2 \tag{3} \]
for the variance. In this case one can show that $\hat\sigma_u^2$ is consistent, similarly to Cavaliere and Taylor (2006).

Under the null hypothesis, we obtain the following asymptotic behavior of the test statistic.

Theorem 1

Under Assumptions 1–6 and $H_0$,
\[ \hat\eta \overset{w}{\to} \bar\omega_u^{-2} \int_0^1 \Big(B_{1,\Omega}(s) - F\big(s, B_{2,\Omega}^0, \theta_0\big)' \psi\big(B_{2,\Omega}^0, \theta_0, \kappa\big)\Big)^2 \mathrm{d}s, \tag{4} \]
where $F\big(s, B_{2,\Omega}^0, \theta_0\big) := \int_0^s K\big(B_{2,\Omega}^0(r), \theta_0\big)\,\mathrm{d}r$ and $\psi\big(B_{2,\Omega}^0, \theta_0, \kappa\big)$ is defined in Proposition 1.

As the variance profile $\Sigma(s)$, and thus $\Omega(s)$, is generally unknown, the limiting distribution depends on nuisance parameters, which makes tabulated critical values impractical. The bootstrap, discussed in Section 3.3, is a natural solution.

Under the alternative, asymptotic theory becomes even more tedious. Since the NLS estimator $\hat\theta_T$ is no longer consistent, a limiting distribution is hard to derive. We may, however, establish the order of magnitude of $\hat\eta$ under $H_1$, which is enough to justify consistency of the cointegration test.

Theorem 2

Let $H_1$ be true. Under Assumptions 1–6, $\hat\eta = O_p(T/l)$, where $l$ is the lag truncation used in the estimation of $\hat\omega_u^2$.

We adopt a bootstrap solution to provide feasible inference, building on Cavaliere and Taylor's (2006) bootstrap test for linear cointegration in the presence of variance breaks. They used the heteroskedastic fixed regressor bootstrap of Hansen (2000). It treats the regressors as fixed, without imposing strong assumptions on the data generating process (DGP). In Theorem 3 we show that the fixed regressor bootstrap replicates the correct asymptotic distribution of the test statistic. As usual, it does not replicate the finite sample distribution of the test statistic, see Hansen (2000). However, Section 4 will demonstrate that the bootstrap works well in finite samples, as also observed by Cavaliere and Taylor (2006) for testing linear cointegration. Other popular bootstraps, e.g., block resampling (Lahiri, 1999), are not applicable because the regressor is integrated and heteroskedastic and the error term is potentially heteroskedastic under the null hypothesis.

More specifically, the heteroskedastic fixed regressor bootstrap works as follows:

1. Run the original NLS regression, save the residuals $\hat u_t$ and compute the test statistic $\hat\eta$ as given in (2).
2. Construct the bootstrap sample $y_{tT}^b := u_t^b := \hat u_t z_t$, $t = 1, \dots, T$, where $\{z_t\}$ is a sequence of i.i.d. standard normal variates.
3. Estimate $\hat\theta_T^b$ via NLS of $y_{tT}^b$ on $g(x_{tT}, \theta)$, save the residuals $\hat u_t^b := y_{tT}^b - g(x_{tT}, \hat\theta_T^b)$ and compute the bootstrap test statistic as
\[ \hat\eta^b := \big(T^2 (\hat\omega_u^b)^2\big)^{-1} \sum_{t=1}^T \Big(\sum_{j=1}^t \hat u_j^b\Big)^2, \]
where $(\hat\omega_u^b)^2$ is the long-run variance estimate using the bootstrap sample.
4. Repeat steps 2 and 3 independently $B$ times and, given that we reject for large values, compute the simulated bootstrap $p$-value $\tilde p_T^b := 1 - \tilde G_T^b(\hat\eta)$, where $\tilde G_T^b$ is the empirical cumulative distribution function of the bootstrap test statistics $\{\hat\eta^b\}_{b=1}^B$.

The replications, for $B$ sufficiently large, approximate the true bootstrap distribution $G_T^b$, which is the theoretical cumulative distribution function of $\hat\eta^b$; the associated bootstrap $p$-value is defined as $p_T^b := 1 - G_T^b(\hat\eta)$. Then, as $B \to \infty$, $\tilde p_T^b \overset{a.s.}{\to} p_T^b$.

The next theorem shows that (i) the bootstrap replicates the correct asymptotic null distribution, and (ii) the test based on the bootstrap $p$-values is consistent.

Theorem 3
(i) Under Assumptions 1–6 and $H_0$, $p_T^b \overset{w}{\to} U[0,1]$.
(ii) Under Assumptions 1–6 and $H_1$, $p_T^b \overset{p}{\to} 0$.

Choi and Saikkonen (2010) proposed a KPSS-type test for cointegration using subresiduals, which we describe below. Its advantage is that the limiting distribution of the test statistic, under homoskedasticity, is nuisance parameter-free and explicitly given, although, for nonlinear cointegration, the limiting distribution of the original test statistic is of a form like that in Theorem 1.

However, in the presence of variance breaks the limiting distribution of the subresidual-based statistic depends on nuisance parameters, as we will show in Corollary 1. This makes its direct use impractical. We hence favor the bootstrap approach.

The subresidual-based test statistic is of the same form as $\hat\eta$ in (2) but uses only a subset of the residuals, $\{\hat u_t\}_{t=i}^{i+\ell-1}$. We define
\[ \hat\eta_{i,\ell} = \big(\ell^2 (\hat\omega_u^{\ell})^2\big)^{-1} \sum_{t=i}^{i+\ell-1} \Big(\sum_{j=i}^{t} \hat u_j\Big)^2. \]
The index $i$ is the starting point of the subresiduals and $\ell$ denotes the size of the set of subresiduals, also called the block size. $(\hat\omega_u^{\ell})^2$ is the long-run variance estimate using the subset of residuals. We then have the following result.

Corollary 1

Suppose that Assumptions 1–6 and $H_0$ hold. If $\ell \to \infty$ and $\ell/T \to 0$ as $T \to \infty$, we have for any $i$ with $1 \le i \le T - \ell$ that
\[ \hat\eta_{i,\ell} \overset{w}{\to} \bar\omega_u^{-2} \int_0^1 B_{1,\Omega}^2(s)\,\mathrm{d}s. \tag{5} \]

Choi and Saikkonen (2010) found that, under homoskedasticity, $\hat\eta_{i,\ell}$ weakly converges to
\[ \int_0^1 W^2(s)\,\mathrm{d}s, \tag{6} \]
where $W(s)$ is a standard Brownian motion. Moreover, they derived the distribution function of (6) and provided an easy series representation. This makes the subresidual approach easy to use. However, for heteroskedastic errors the variance terms in (5) do not cancel out in general. Thus, the limiting distribution depends on nuisance parameters.

For comparative purposes, we still use the distribution of (6) for testing the null of nonlinear cointegration in the Monte Carlo experiments in Section 4, ignoring potential heteroskedasticity. This is because we want to investigate the impact of variance breaks on the approach. Moreover, we will compare it with the bootstrap test.

The c.d.f. of (6) is given by
\[ F(z) = \sqrt{2} \sum_{n=0}^{\infty} \frac{\Gamma(n + 1/2)}{n!\,\Gamma(1/2)} (-1)^n \left(1 - \mathrm{Erf}\left(\frac{\sqrt{2}/2 + 2n\sqrt{2}}{2\sqrt{z}}\right)\right), \quad z \ge 0, \]
where $\mathrm{Erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-y^2)\,\mathrm{d}y$ is the error function. Choi and Saikkonen (2010) demonstrated that truncating the series at $n = 10$ is sufficiently accurate, and we follow their choice.

In order to aggregate subsample tests using different starting points $i$, Choi and Saikkonen (2010) proposed a Bonferroni procedure. For this, we compute $M$ test statistics $\hat\eta_{i_1,\ell}, \dots, \hat\eta_{i_M,\ell}$ and define $\hat\eta_{\max,\ell} := \max\{\hat\eta_{i_1,\ell}, \dots, \hat\eta_{i_M,\ell}\}$. Due to the Bonferroni inequality,
\[ \lim_{T \to \infty} P\big(\hat\eta_{\max,\ell} \le c_{\alpha/M}\big) \ge 1 - \alpha, \]
where $c_{\alpha/M}$ is the $\alpha/M$-critical value from the distribution of $\int_0^1 W^2(s)\,\mathrm{d}s$. We choose $M = \lceil T/\ell \rceil$ and, as in Choi and Saikkonen (2010), select $\ell$ with the minimum volatility rule proposed by Romano and Wolf (2001).

This section provides evidence that the proposed nonlinear cointegration test works well in finite samples. We conduct several simulation studies for different settings. In particular, we study the proposed bootstrap test for linear, polynomial, and smooth transition regression cointegration. We compare the empirical rejection rates with those of the standard Shin (1994) test. Moreover, we compare the bootstrap cointegration test with the subresidual-based approach. For the DGP we extend the example of Cavaliere and Taylor (2006), who generated data with a linear cointegration relation under variance breaks, by also considering nonlinear cointegration. We start with the linear case.

4.1 Linear regression model

We consider the DGP
\begin{align}
y_t &= x_t + u_t, \quad t = 1, \dots, T, \tag{7} \\
u_t &= \rho u_{t-1} + \zeta_{u,t} + \mu_t, \quad u_0 = 0, \tag{8} \\
\mu_t &= \mu_{t-1} + \rho_\mu \zeta_{\mu,t}, \quad \mu_0 = 0, \tag{9} \\
x_t &= x_{t-1} + \zeta_{x,t}, \quad x_0 = 0, \tag{10}
\end{align}
where $\zeta_t := (\zeta_{u,t}, \zeta_{x,t}, \zeta_{\mu,t})' = \Sigma_t^{1/2} \zeta_t^*$, $\zeta_t^* \sim N(0, I_3)$, i.i.d., $|\rho| < 1$ and
\[ \Sigma_t := \begin{pmatrix} \sigma_{u,t}^2 & \sigma_{ux,t} & 0 \\ \sigma_{ux,t} & \sigma_{x,t}^2 & 0 \\ 0 & 0 & \sigma_{\mu,t}^2 \end{pmatrix}. \]
In particular, we initially consider the case of a simple linear cointegrating regression with a single non-deterministic integrated regressor.

We consider abrupt variance breaks of the form
\begin{align*}
\sigma_{u,t}^2 &= \sigma_{u,0}^2 + (\sigma_{u,1}^2 - \sigma_{u,0}^2)\,\mathbf{1}(t \ge \lfloor \tau_u T \rfloor), \\
\sigma_{x,t}^2 &= \sigma_{x,0}^2 + (\sigma_{x,1}^2 - \sigma_{x,0}^2)\,\mathbf{1}(t \ge \lfloor \tau_x T \rfloor), \\
\sigma_{\mu,t}^2 &= \sigma_{\mu,0}^2 + (\sigma_{\mu,1}^2 - \sigma_{\mu,0}^2)\,\mathbf{1}(t \ge \lfloor \tau_\mu T \rfloor).
\end{align*}
In all simulations we set $\sigma_{u,0} = \sigma_{x,0} = \sigma_{\mu,0} = 1$.

As Cavaliere and Taylor (2006) noted, under the null hypothesis $\rho_\mu = 0$ four cases can occur:
(i) if $\tau_u = \tau_x = 0$, then $y_t$ and $x_t$ are both standard $I(1)$ processes with homoskedastic increments and cointegrated;
(ii) if $\tau_u \ne 0$, $\tau_x = 0$, the permanent shocks to the system are homoskedastic (i.e., $x_t$ is integrated with homoskedastic innovations) but there is a variance shift in both the transitory component of $y_t$ and in the cointegrating relation;
(iii) if $\tau_u = 0$, $\tau_x \ne 0$, the permanent shocks to the system are heteroskedastic, with changes to both $x_t$ and $y_t$ being heteroskedastic, but there are no variance shifts in the cointegrating relation;
(iv) if $\tau_u \ne 0$, $\tau_x \ne 0$, the permanent shocks to the system are heteroskedastic, changes to both $x_t$ and $y_t$ are heteroskedastic and there is a variance shift both in the transitory component of $y_t$ and in the cointegrating relation.
If $H_0$ is true, variance shifts in $\zeta_\mu$ have no influence. Under the alternative we also allow for variance breaks in $\zeta_\mu$, which lead to variance breaks in $u_t$ that are similar to cases (ii) and (iv).

Moreover, we consider covariance breaks of the form
\[ \sigma_{ux,t} = \sigma_{ux,0} + (\sigma_{ux,1} - \sigma_{ux,0})\,\mathbf{1}(t \ge \lfloor \tau_{ux} T \rfloor). \]
In our simulations we only consider the case where all variance shifts occur at the same time, i.e., $\tau := \tau_u = \tau_x = \tau_\mu = \tau_{ux}$. For results on other possible scenarios see the simulation study of Cavaliere and Taylor (2006).

We investigate the following parameter constellations. We consider two sample sizes $T$. We take $\rho_\mu \in \{0, 0.001, 0.01, 0.1\}$.
The case $\rho_\mu = 0$ serves to estimate size; the other constellations are for the power analysis. We consider four values of the break fraction $\tau$: the first corresponds to the case of no variance breaks, while the others stand for early, middle, and late variance breaks. We also fix the magnitude of the variance breaks by setting $\sigma_1 := \sigma_{u,1} = \sigma_{x,1} = \sigma_{\mu,1}$ to one of two values, as in Cavaliere and Taylor (2006). The parameters for the covariance $\sigma_{ux,t}$ are chosen in such a way that the correlation between $\zeta_{u,t}$ and $\zeta_{x,t}$ is fixed over time at $\lambda$, which takes one of two values, i.e., without or with endogeneity. The AR(1) parameter of $u_t$ is set to $\rho \in \{0, 0.5\}$. Empirical rejection rates are based on 10,000 replications (unless stated otherwise) and the number of bootstrap replications is $B = 500$. Finally, the nominal level of significance is $\alpha = 0.05$ for the remainder of this paper.

We perform the test by estimating $\theta$ in the linear regression of $y_t$ on $g(x_t, \theta) \equiv \theta x_t$ and using the residuals to compute $\hat\eta$. (While we formulate the theory for nonlinear cointegrating regressions, we use the OLS estimator whenever possible, for simplicity and to speed up the computations.) We use the estimator $\hat\sigma_u^2$ given in (3) for $\rho = 0$ and, for $\rho = 0.5$, a non-parametric autocorrelation-robust estimator for the long-run variance with a Bartlett kernel and a spectral window chosen as suggested in Kwiatkowski et al. (1992). Table 1 reports empirical rejection rates (as percentages) for the different parameter constellations. Panel (a) shows the rates for the bootstrap approach, panel (b) for the subsample approach and panel (c) for the standard Shin (1994) test. First, the bootstrap generally yields very good empirical sizes and powers. Both the timing (early or late) and the direction (increase or decrease) of a variance break do not [...] $\rho_\mu$.

The subsample-based test is undersized in the constellation without heteroskedasticity in the absence of endogeneity and autocorrelation. Interestingly, it is oversized under endogeneity and autocorrelation, especially in the presence of early downward variance breaks. This effect diminishes if the shifts occur later. Moreover, the bootstrap test has higher power for $\rho = 0$, especially if the alternative is close to the null; otherwise the subresidual test has higher power.

Panel (c) shows the results for the test based on critical values tabulated by Shin (1994). We observe that variance breaks are an issue and that the test is oversized or undersized depending on whether the break is downward or upward. The empirical power is generally smaller than for the bootstrap test.

In this subsection, we consider the case of polynomial cointegrating regression, in particular a quadratic and a cubic relation. We replace the linear model (7) and simulate according to
\[ y_t = x_t + x_t^2 + u_t \]
for the quadratic relation, while (8), (9) and (10) and all further parameter constellations of Subsection 4.1 still hold. We now estimate $\theta = (\theta_1, \theta_2)'$ by regressing $y_t$ on $g(x_t, \theta) = \theta_1 x_t + \theta_2 x_t^2$.
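
As a rough sketch of how one replication of the quadratic design could be combined with the heteroskedastic fixed regressor bootstrap of Section 3.3, consider the following code. It is deliberately simplified and rests on assumptions that are not taken from the paper: all function names are hypothetical, the break multiplies the innovation standard deviations (switching from 1 to `sd_after`), endogeneity and the AR(1) term are omitted, the variance estimator (3) is used in place of a kernel estimator, and the number of bootstrap draws is kept small.

```python
import random


def partial_sum_statistic(resid):
    """Statistic (T^2 * s2)^(-1) * sum_t (sum_{j<=t} resid_j)^2, with s2 as in (3)."""
    T = len(resid)
    s2 = sum(u * u for u in resid) / T
    cum, total = 0.0, 0.0
    for u in resid:
        cum += u
        total += cum * cum
    return total / (T * T * s2)


def ols_residuals_quadratic(y, x):
    """Residuals from regressing y on (x, x^2) without intercept (2x2 normal equations)."""
    z = [v * v for v in x]
    sxx = sum(a * a for a in x)
    sxz = sum(a * b for a, b in zip(x, z))
    szz = sum(b * b for b in z)
    sxy = sum(a * c for a, c in zip(x, y))
    szy = sum(b * c for b, c in zip(z, y))
    det = sxx * szz - sxz * sxz
    b1 = (szz * sxy - sxz * szy) / det
    b2 = (sxx * szy - sxz * sxy) / det
    return [c - b1 * a - b2 * b for a, b, c in zip(x, z, y)]


def simulate_quadratic_design(T, tau, sd_after, rho_mu, rng):
    """Simplified draw of y_t = x_t + x_t^2 + u_t with a common break at floor(tau * T)."""
    x, y = [], []
    x_t = mu_t = 0.0
    for t in range(1, T + 1):
        sd = sd_after if t >= int(tau * T) else 1.0
        x_t += sd * rng.gauss(0.0, 1.0)              # integrated regressor
        mu_t += rho_mu * sd * rng.gauss(0.0, 1.0)    # random walk part of the error
        u_t = sd * rng.gauss(0.0, 1.0) + mu_t
        x.append(x_t)
        y.append(x_t + x_t ** 2 + u_t)
    return y, x


def bootstrap_p_value(y, x, B, rng):
    """Steps 1-4 of the bootstrap: the regressors stay fixed, residuals get N(0,1) weights."""
    resid = ols_residuals_quadratic(y, x)
    eta = partial_sum_statistic(resid)
    exceed = 0
    for _ in range(B):
        y_b = [u * rng.gauss(0.0, 1.0) for u in resid]
        eta_b = partial_sum_statistic(ols_residuals_quadratic(y_b, x))
        if eta_b > eta:
            exceed += 1
    return eta, exceed / B  # share of bootstrap statistics above the observed one


rng = random.Random(0)
y, x = simulate_quadratic_design(T=100, tau=0.5, sd_after=3.0, rho_mu=0.0, rng=rng)
eta, p = bootstrap_p_value(y, x, B=199, rng=rng)
print(f"eta = {eta:.3f}, bootstrap p-value = {p:.3f}")
```

Repeating such replications over many simulated samples and recording how often the p-value falls below the nominal level would give an empirical rejection rate of the kind reported in the tables.
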
In this model, we can no longer use the critical values of Shin (1994), because treating both $x_t$ and $x_t^2$ as integrated regressors violates the model assumptions. This is also discussed in Wagner and Hong (2016).

Table 2 shows the tests' rejection frequencies. Interpretations similar to those in Subsection 4.1 for the linear case apply here, too. In addition, we observe a decrease of empirical power relative to Table 1, plausibly due to the more complex model to be fitted. The loss is more moderate for the bootstrap test.

Inspired by the application in Section 5, we also consider a cubic cointegrating regression. We simulate from the model
\[ y_t = x_t + 2 x_t^2 + x_t^3 + u_t. \]

Table 1: The table reports the empirical rejection frequencies for testing the null of cointegration in the linear regression model for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test, panel (b) for the subresidual-based test and panel (c) for the Shin (1994) test. Column headers: $\rho_\mu$: 0, 0.001, 0.01, 0.1; $\rho$: 0, 0.5. Row labels: $T$, $\tau$, $\sigma$, $\lambda$.

Table 2: The table reports the empirical rejection frequencies for testing the null of cointegration in the quadratic regression model for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test and panel (b) for the subresidual-based test. Column headers: $\sigma_\mu$: 0, 0.001, 0.01, 0.1; $\rho$: 0, 0.5. Row labels: $T$, $\tau$, $\sigma$, $\lambda$.

Table 3: The table reports the empirical rejection frequencies for testing the null of cointegration in the cubic regression model for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test and panel (b) for the subresidual-based test. Column headers: $\rho_\mu$: 0, 0.001, 0.01, 0.1; $\rho$: 0, 0.5. Row labels: $T$, $\tau$, $\sigma$, $\lambda$.

We now discuss an example of a cointegrating regression which is indeed nonlinear in the parameters. Thus, NLS is needed for estimation. We adopt the example of cointegrating smooth transition functions, which is also considered in Saikkonen and Choi (2004) and Choi and Saikkonen (2010). We generate data according to
\[ y_t = \theta_1 + \theta_2 x_t + \theta_3 \frac{1}{1 + \exp(-(x_t - \theta_4))} + u_t, \]
with the parameter constellation $\theta_1 = 0$, $\theta_2 = 1$, $\theta_3 = 1$, $\theta_4 = 5$. In rare cases, the NLS algorithm does not converge for some generated samples. We exclude these cases from the analysis. To save computational time we run 1,000 repetitions for each constellation. Note that while the true parameter $\theta_1 = 0$, we include $\hat\theta_1$ in the estimation. This means we are in a setting beyond our model assumptions, with an additional deterministic regressor. For a more detailed discussion see Appendix B.

Table 4 panel (a) reports the rejection rates for the bootstrap test and panel (b) for the subresidual-based test. We observe that the bootstrap test works well, again, with some moderate size problems in the presence of either endogeneity or autocorrelation (which can be solved using leads-and-lags as in Appendix B) and somewhat larger size distortions when both endogeneity and autocorrelation are present. The subresidual-based test delivers mixed results, being undersized in some scenarios of variance breaks and oversized in others.

Table 4: The table reports the empirical rejection frequencies for testing the null of cointegration in the smooth transition regression model for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test and panel (b) for the subresidual-based test. Column headers: $\sigma_\mu$: 0, 0.001, 0.01, 0.1; $\rho$: 0, 0.5. Row labels: $T$, $\tau$, $\sigma$, $\lambda$.

We now discuss an application of cointegrating polynomial regressions (CPRs) for the environmental Kuznets curve (EKC). It relates per capita GDP and per capita pollution of, e.g., CO$_2$ emissions. [...] $k$-th power $x_t^k$ of an integrated regressor into the regression, this power itself is not $I(1)$ anymore and thus violates the assumptions of the Shin (1994) test. Based on Wagner and Hong (2016), the aforementioned authors applied a fully modified OLS approach for CPRs.
However, they did not allow for variance breaks in their approach, which could lead to erroneous inference regarding the EKC hypothesis. We apply the bootstrap discussed above to address this possible issue in the following.

We study data of 19 industrialized countries (see Table 5) over the period from 1870 to 2014 (New Zealand is an exception, where data are available for 1878–2014). We use per capita GDP data of the Maddison database. CO$_2$ data are taken from the homepage of the Carbon Dioxide Information Analysis Center (https://cdiac.ess-dive.lbl.gov/) and are expressed as 1,000 tons per capita. We convert all time series to natural logarithms. Among others, Wagner (2015) also examined sulfur dioxide data, but discussion and results are similar. For brevity, we only focus on (the more relevant) CO$_2$ emissions. Let $e_t$ denote log per capita GDP and $y_t$ denote log CO$_2$ emissions per capita. We then study the model
\[ e_t = c + \delta t + \theta_1 y_t + \theta_2 y_t^2 + \theta_3 y_t^3 + u_t. \]

To assess whether variance breaks are present in the error term, we follow Cavaliere and Taylor (2008) and define the empirical variance profile as
\[ \hat\rho(s) := \frac{\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t^2 + (sT - \lfloor Ts \rfloor)\,\hat u_{\lfloor Ts \rfloor + 1}^2}{\sum_{t=1}^{T} \hat u_t^2} \tag{11} \]
for $s \in (0,1)$, with $\hat\rho(0) := 0$ and $\hat\rho(1) := 1$. In case of homoskedasticity, we should have $\hat\rho(s) \approx s$. Figure 1 shows the empirical variance profiles of selected countries together with the reference line $s$. Figures 2–5 for the remaining countries are given in Appendix C. We observe the presence of variance breaks for all countries (except maybe Denmark). For example, there is an early upward variance break for Canada. Thus, the use of heteroskedasticity-robust tests is advisable.

Next, we run a few univariate tests to characterize the series. In particular, we test for stationarity using a KPSS test (with the null of no unit root) and the test by Phillips and Perron (1988) (with the null of a unit root). Note that heteroskedasticity is an issue for the KPSS test, making the critical values derived by Kwiatkowski et al. (1992) invalid. A possible remedy is to proceed as in Cavaliere (2005).
We use the proposed bootstrap for the series $y_t$ and $e_t$, instead of the residuals, to test whether they have no unit root.

We perform three tests for cointegration: the bootstrap test using NLS residuals, the bootstrap test using leads-and-lags (LL) residuals (see Appendix B) and the subresidual-based test. We use a non-parametric autocorrelation-robust estimator for the variance with a Bartlett kernel and a spectral window of the form $\lfloor\,\cdot\,\rfloor$ in $T$, as suggested in Kwiatkowski et al. (1992).

Table 5 reports the test results for the different countries given in the first column. The second to fourth columns are for the cointegration tests with NLS, LL and the subresidual-based test. Columns 5 and 6 give results for the KPSS test for $e_t$ and $y_t$, and columns 7 and 8 for the Phillips-Perron (PP) test, respectively. All test results are given by the corresponding $p$-values, where very small $p$-values are abbreviated with $<.01$. [...] $e_t$ and $y_t$, while the PP test does not reject the null of a unit root. This provides evidence that the regressor and the regressand are both $I(1)$.

The three cointegration tests reveal mixed results. The first observation is that all three do not reject the null in the majority of the cases. We recall that the subresidual-based test is, first, in general undersized and, second, not robust to variance breaks, making it unreliable. Of course, bootstrap tests depend on simulation. Moreover, the $p$-values are all close to the nominal size, so that decisions may hinge on simulation variability. To reduce the effects of randomness we increased the number of bootstrap runs to 2,000. The two bootstrap tests come to different test results in the case of Canada, Germany, Japan and Switzerland. Both tests reject only for Australia, New Zealand, [...]

Figure 1: Empirical variance profile (11) for different countries. The dashed line is the reference line for homoskedasticity.
[Panels: Australia, Austria, Belgium; horizontal axis: $s$; vertical axis: variance profile.]
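The variance profile (11) plotted in Figure 1 can be computed directly from a residual series. The following Python sketch assumes NumPy and a one-dimensional array of residuals; the function name `variance_profile` is hypothetical and only illustrates the formula, it is not the authors' implementation.

```python
import numpy as np

def variance_profile(resid, s):
    """Empirical variance profile (11) evaluated at points s in [0, 1].

    resid : residuals u_hat_1, ..., u_hat_T (hypothetical input name)
    s     : scalar or array of evaluation points
    """
    u2 = np.asarray(resid, dtype=float) ** 2
    T = u2.size
    s = np.atleast_1d(np.asarray(s, dtype=float))
    k = np.floor(T * s).astype(int)                 # floor(T s)
    csum = np.concatenate(([0.0], np.cumsum(u2)))   # csum[k] = sum_{t<=k} u_hat_t^2
    # u_hat^2_{floor(Ts)+1}; padded with 0 so that s = 1 is well defined
    nxt = np.append(u2, 0.0)[k]
    return (csum[k] + (T * s - k) * nxt) / csum[-1]

# Under homoskedasticity the profile is close to the 45-degree line.
grid = np.linspace(0.0, 1.0, 5)
print(variance_profile(np.ones(8), grid))
```

A profile that stays below the diagonal early on and catches up later indicates an upward variance break, which is the pattern described for Canada.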

Table 5: $p$-values of the cointegration tests (NLS, LL, subresidual-based), the KPSS test and the PP test.

Country         NLS   LL    Subres.  KPSS e_t  KPSS y_t  PP e_t  PP y_t
Australia       .035  .032  .024     <.01      <.01      .686    .399
Austria         .390  .366  .469     <.01      <.01      .044    .774
Belgium         .560  .474  .797     .010      <.01      .040    .952
Canada          .053  .043  .316     <.01      .020      .738    .023
Denmark         .097  .142  .659     <.01      <.01      >.99    .747
Finland         .251  .187  .757     .029      <.01      .034    .739
France          .186  .163  .674     <.01      <.01      .607    .708
Germany         .027  .068  .124     <.01      <.01      .065    .582
Italy           .134  .143  .584     .045      <.01      .152    .900
Japan           .061  .019  .611     <.01      <.01      .130    .794
Netherlands     .329  .276  .719     .085      <.01      .017    .814
New Zealand     .024  .027  .032     .029      <.01      .094    .233
Norway          .090  .103  .695     .018      <.01      .073    .959
Portugal        .016  .026  .051     <.01      <.01      <.01    .951
Spain           .191  .113  .264     <.01      <.01      .430    .970
Sweden          .538  .432  .900     <.01      <.01      .478    .735
Switzerland     .049  .053  .063     .016      <.01      .239    .935
United Kingdom  .174  .111  .427     <.01      <.01      .015    .731
United States   .042  .037  .114     <.01      <.01      .830    .071
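The bootstrap $p$-values in the NLS and LL columns come from a wild bootstrap of the residuals, $u_t^b = \hat u_t z_t$, with the regressors held fixed (see the proof of Theorem 3). The sketch below illustrates the idea in Python under several assumptions that are not taken from the text: the regression is linear in its parameters (as for the cubic polynomial), so OLS replaces NLS; the statistic has the KPSS form $T^{-2}\sum_t (\sum_{j \le t} \hat u_j)^2 / \hat\omega_u^2$ with Bartlett weights $1 - s/(l+1)$; and the truncation lag `l` is supplied by the user. All function names are hypothetical.

```python
import numpy as np

def bartlett_lrv(u, l):
    """Bartlett-kernel long-run variance estimate with truncation lag l."""
    T = u.size
    lrv = u @ u / T
    for s in range(1, l + 1):
        lrv += 2.0 * (1.0 - s / (l + 1.0)) * (u[s:] @ u[:-s]) / T
    return lrv

def kpss_stat(u, l):
    """T^{-2} times the sum of squared partial sums, scaled by the long-run variance."""
    S = np.cumsum(u)
    return (S @ S) / (u.size ** 2 * bartlett_lrv(u, l))

def ols_residuals(y, X):
    """Residuals of a least-squares regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def wild_bootstrap_pvalue(y, X, l, n_boot=499, seed=0):
    """Return the statistic and its wild-bootstrap p-value (right-tailed)."""
    rng = np.random.default_rng(seed)
    u_hat = ols_residuals(y, X)
    stat = kpss_stat(u_hat, l)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        u_b = u_hat * rng.standard_normal(u_hat.size)   # u_t^b = u_hat_t * z_t
        boot[b] = kpss_stat(ols_residuals(u_b, X), l)   # regressors kept fixed
    return stat, float(np.mean(boot >= stat))

# Illustration on simulated data (not the EKC data set).
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(100))
y = 1.0 + x + rng.standard_normal(100)
X = np.column_stack([np.ones(100), x, x**2, x**3])
print(wild_bootstrap_pvalue(y, X, l=4, n_boot=199))
```

Because the multipliers $z_t$ are independent of the data, the bootstrap errors inherit the variance pattern of $\hat u_t$, which is what makes the procedure robust to variance breaks.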

Acknowledgements

Financial support of the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) via the Collaborative Research Center “Statistical modelling of nonlinear dynamic processes” (SFB 823, Teilprojekt A4) is gratefully acknowledged. The authors thank Janine Langerbein for excellent research assistance.

The results are, in any case, not directly comparable, since the Maddison database has had a major update since then and since polynomials are sensitive to even small changes in the scale.

References

Andrews, D. W. (1991), ‘Heteroskedasticity and autocorrelation consistent covariance matrix estimation’, Econometrica (3), 817–858.

Cavaliere, G. (2005), ‘Unit root tests under time-varying variances’, Econometric Reviews (3), 259–292.

Cavaliere, G., Rahbek, A. and Taylor, A. R. (2010), ‘Testing for co-integration in vector autoregressions with non-stationary volatility’, Journal of Econometrics (1), 7–24.

Cavaliere, G. and Taylor, A. (2006), ‘Testing the null of co-integration in the presence of variance breaks’, Journal of Time Series Analysis (4), 613–636.

Cavaliere, G. and Taylor, A. (2008), ‘Time-transformed unit root tests for models with non-stationary volatility’, Journal of Time Series Analysis (2), 300–330.

Choi, I. and Saikkonen, P. (2010), ‘Tests for nonlinear cointegration’, Econometric Theory (3), 682–709.

Demetrescu, M., Hanck, C. and Kruse, R. (2019), Robust fixed-b inference in the presence of time-varying volatility, Technical report.

Dickey, D. A. and Fuller, W. A. (1979), ‘Distribution of the estimators for autoregressive time series with a unit root’, Journal of the American Statistical Association (366a), 427–431.

Engle, R. F. and Granger, C. W. (1987), ‘Co-integration and error correction: representation, estimation, and testing’, Econometrica, 251–276.

Giné, E. and Zinn, J. (1990), ‘Bootstrapping general empirical measures’, The Annals of Probability (2), 851–869.

Grossman, G. M. and Krueger, A. B. (1995), ‘Economic growth and the environment’, The Quarterly Journal of Economics (2), 353–377.

Hansen, B. E. (1996), ‘Inference when a nuisance parameter is not identified under the null hypothesis’, Econometrica (2), 413–430.

Hansen, B. E. (2000), ‘Testing for structural change in conditional models’, Journal of Econometrics (1), 93–115.

Kiefer, N. M. and Vogelsang, T. J. (2005), ‘A new asymptotic theory for heteroskedasticity-autocorrelation robust tests’, Econometric Theory (6), 1130–1164.

Kuznets, S. (1955), ‘Economic growth and income inequality’, The American Economic Review (1), 1–28.

Kwiatkowski, D., Phillips, P. C., Schmidt, P. and Shin, Y. (1992), ‘Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root?’, Journal of Econometrics (1-3), 159–178.

Lahiri, S. N. (1999), ‘Theoretical comparisons of block bootstrap methods’, Annals of Statistics (1), 386–404.

Leybourne, S. J. and McCabe, B. (1994), ‘A simple test for cointegration’, Oxford Bulletin of Economics and Statistics (1), 97–103.

McCabe, B., Leybourne, S. and Shin, Y. (1997), ‘A parametric approach to testing the null of cointegration’, Journal of Time Series Analysis (4), 395–413.

Phillips, P. C. B. and Hansen, B. E. (1990), ‘Statistical inference in instrumental variables regression with I(1) processes’, The Review of Economic Studies (1), 99–125.

Phillips, P. C. B. and Perron, P. (1988), ‘Testing for a unit root in time series regression’, Biometrika (2), 335–346.

Pötscher, B. M. and Prucha, I. R. (2013), Dynamic nonlinear econometric models: Asymptotic theory, Springer Berlin Heidelberg.

Romano, J. P. and Wolf, M. (2001), ‘Subsampling intervals in autoregressive models with linear time trend’, Econometrica (5), 1283–1314.

Saikkonen, P. (1991), ‘Asymptotically efficient estimation of cointegration regressions’, Econometric Theory (1), 1–21.

Saikkonen, P. and Choi, I. (2004), ‘Cointegrating smooth transition regressions’, Econometric Theory (2), 301–340.

Shin, Y. (1994), ‘A residual-based test of the null of cointegration against the alternative of no cointegration’, Econometric Theory (1), 91–115.

Stern, D. I. (2004), ‘The rise and fall of the environmental Kuznets curve’, World Development (8), 1419–1439.

Stern, D. I. (2018), The environmental Kuznets curve, in ‘Companion to Environmental Studies’, Routledge in association with GSE Research, pp. 49–54.

Stypka, O., Wagner, M., Grabarczyk, P. and Kawka, R. (2017), ‘The asymptotic validity of "standard" fully modified OLS estimation and inference in cointegrating polynomial regressions’. Working Paper.

Wagner, M. (2015), ‘The environmental Kuznets curve, cointegration and nonlinearity’, Journal of Applied Econometrics (6), 948–967.

Wagner, M. and Hong, S. H. (2016), ‘Cointegrating polynomial regressions: fully modified OLS estimation and inference’, Econometric Theory (5), 1289–1315.

A Appendix: Proofs

Proof of Theorem 1.

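Consider $T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t$. Since $\hat u_t = u_t - \big(g(x_{tT}, \hat\theta_T) - g(x_{tT}, \theta_0)\big)$, a second-order Taylor expansion of $g(x_{tT}, \hat\theta_T)$ around $\theta_0$ gives

$$T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t = T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t - T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} K(x_{tT}, \theta_0)'(\hat\theta_T - \theta_0) + T^{1/2}(\hat\theta_T - \theta_0)'\left[T^{-1}\sum_{t=1}^{\lfloor Ts \rfloor} \frac{\partial^2 g(x_{tT}, \tilde\theta)}{\partial\theta\,\partial\theta'}\right](\hat\theta_T - \theta_0), \qquad (12)$$

where $\|\tilde\theta - \theta_0\| \le \|\hat\theta_T - \theta_0\|$.

For the first term in (12), Lemma 1 gives that, under $H_0$,

$$T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t = T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \zeta_{u,t} \xrightarrow{w} B_{1,\Omega}(s).$$

For the second term in (12), recall that $T^{1/2}(\hat\theta_T - \theta_0) \xrightarrow{w} \psi\big(B^0_{2,\Omega}, \theta_0, \kappa\big)$ (Proposition 1). By Lemma 1, with $t = \lfloor Ts \rfloor$,

$$x_{tT} = (T_0/T)^{1/2} x_t = (T_0/T)^{1/2}\sum_{j=1}^{\lfloor Ts \rfloor} \zeta_{2,j} \xrightarrow{w} T_0^{1/2} B_{2,\Omega}(s) =: B^0_{2,\Omega}(s).$$

This implies that

$$T^{-1}\sum_{t=1}^{\lfloor Ts \rfloor} x_{tT} \xrightarrow{w} \int_0^s B^0_{2,\Omega}(r)\,\mathrm{d}r,$$

and by the continuous mapping theorem,

$$T^{-1}\sum_{t=1}^{\lfloor Ts \rfloor} K(x_{tT}, \theta_0) \xrightarrow{w} \int_0^s K\big(B^0_{2,\Omega}(r), \theta_0\big)\,\mathrm{d}r =: F\big(s, B^0_{2,\Omega}, \theta_0\big).$$

The third term in (12) is asymptotically negligible whenever the averaged Hessian in brackets is $O_p(1)$, because $T^{1/2}(\hat\theta_T - \theta_0) = O_p(1)$ implies $\hat\theta_T - \theta_0 = O_p(T^{-1/2})$. We conclude that

$$T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t \xrightarrow{w} B_{1,\Omega}(s) - F\big(s, B^0_{2,\Omega}, \theta_0\big)'\,\psi\big(B^0_{2,\Omega}, \theta_0, \kappa\big),$$

since all weak convergences hold jointly. Another application of the continuous mapping theorem yields

$$T^{-2}\sum_{t=1}^{T}\left(\sum_{j=1}^{t} \hat u_j\right)^2 \xrightarrow{w} \int_0^1 \Big(B_{1,\Omega}(s) - F\big(s, B^0_{2,\Omega}, \theta_0\big)'\,\psi\big(B^0_{2,\Omega}, \theta_0, \kappa\big)\Big)^2\,\mathrm{d}s.$$

Finally, (4) follows by the continuous mapping theorem.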
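A worked special case may make the objects in this proof more concrete. The following is only an illustration under stated assumptions: it takes the cubic polynomial of Section 5 without its deterministic terms (which fall outside the model assumptions), uses $K = \partial g / \partial\theta$ as in the proof of Theorem 2, and writes $B$ for the weak limit of $x_{tT}$.

```latex
% Cubic polynomial regression function (deterministic terms omitted by assumption)
g(x, \theta) = \theta_1 x + \theta_2 x^2 + \theta_3 x^3,
\qquad
K(x, \theta) = \frac{\partial g(x,\theta)}{\partial \theta} = (x,\; x^2,\; x^3)',
\qquad
\frac{\partial^2 g(x,\theta)}{\partial\theta\,\partial\theta'} = 0 .

% Hence the Taylor expansion is exact and has no second-order term:
T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t
  = T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t
  - \Big[ T^{-1}\sum_{t=1}^{\lfloor Ts \rfloor} (x_{tT},\; x_{tT}^2,\; x_{tT}^3) \Big]\,
    T^{1/2}(\hat\theta_T - \theta_0),

% and the limit functional F collects integrated powers of B:
F(s, B, \theta_0)
  = \Big( \int_0^s B(r)\,\mathrm{d}r,\;
          \int_0^s B(r)^2\,\mathrm{d}r,\;
          \int_0^s B(r)^3\,\mathrm{d}r \Big)' .
```

Because $g$ is linear in $\theta$ here, $F$ does not depend on $\theta_0$, and NLS coincides with OLS on the regressors $(x_{tT}, x_{tT}^2, x_{tT}^3)$.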

Proof of Theorem 2.

Under the alternative $H_1: \rho_\mu > 0$,

$$T^{-1/2} u_{\lfloor Ts \rfloor} = T^{-1/2}\zeta_{1,\lfloor Ts \rfloor} + T^{-1/2}\rho_\mu \sum_{t=1}^{\lfloor Ts \rfloor} \zeta_{\mu,t} \xrightarrow{w} \rho_\mu B_{3,\Omega}(s).$$

This implies that $T^{-3/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t \xrightarrow{w} \rho_\mu \int_0^s B_{3,\Omega}(r)\,\mathrm{d}r$ and hence $T^{-3/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t = O_p(1)$.

As in the proof of Theorem 1 we use a Taylor expansion to obtain

$$T^{-3/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t = T^{-3/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t - T^{-3/2}\sum_{t=1}^{\lfloor Ts \rfloor} K(x_{tT}, \theta_0)'(\hat\theta_T - \theta_0) + o_p(1).$$

Next, observe that $|\hat\theta_T - \theta_0| = O_p(T^{1/2})$. To see this we use the linear approximation

$$g(x_{tT}, \hat\theta_T) \approx g(x_{tT}, \theta_0) + K_t(\hat\theta_T - \theta_0),$$

where $K$ is the Jacobian matrix with entries $K_{ti} = \partial g(x_{tT}, \theta)/\partial\theta_i$, for $t = 1, \dots, T$, $i = 1, \dots, k$, and $K_t$ is its $t$-th row. We can use this approximation and the normal equations of the resulting linear model,

$$(K'K)(\hat\theta_T - \theta_0) = K'\tilde y,$$

with $\tilde y_t = y_{tT} - g(x_{tT}, \theta_0)$. We now obtain the asymptotics as for ordinary least squares, as in Shin (1994) and McCabe et al. (1997), using that $\sum_{t=1}^{\lfloor Ts \rfloor} g(x_{tT}, \theta_0) = O_p(T)$, $\sum_{t=1}^{\lfloor Ts \rfloor} K(x_{tT}, \theta_0) = O_p(T)$, and $\sum_{t=1}^{\lfloor Ts \rfloor} u_t = O_p(T^{3/2})$. Thus, $\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t = O_p(T^{3/2})$, which leads to

$$T^{-2}\sum_{t=1}^{T}\left(\sum_{j=1}^{t} \hat u_j\right)^2 = O_p(T^2).$$

Moreover, Kwiatkowski et al. (1992) showed that the long-run variance estimator satisfies $\hat\omega_u^2 = O_p(lT)$, which implies $\hat\eta = O_p(T/l)$. As long as $T/l \to \infty$ for $T \to \infty$, the test is consistent.

Proof of Theorem 3.

(i) Similarly to the proof of Theorem 3 in Cavaliere and Taylor (2006), consider the process $M_T^b$ such that

$$M_T^b(s) := T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} u_t^b = T^{-1/2}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t z_t.$$

Conditionally on $\{\hat u_t, x_{tT}\}_{t=1}^T$, this is an exact Gaussian process with kernel

$$\Lambda_T^M(s, s') = T^{-1}\sum_{t=1}^{\lfloor T(s \wedge s') \rfloor} \hat u_t^2,$$

where $s \wedge s'$ denotes the minimum of $s$ and $s'$.

Under the null, $\mathrm{Var}(u_t) = \sigma_{u,t}^2$ and $\sigma(s) = \sigma_{u,\lfloor Ts \rfloor}$, which is the variance profile of the $u_t$. As in the proof of Lemma A.5 in Cavaliere et al. (2010) we see that

$$T^{-1}\sum_{t=1}^{\lfloor T(s \wedge s') \rfloor} \hat u_t^2 = T^{-1}\sum_{t=1}^{\lfloor T(s \wedge s') \rfloor} u_t^2 + o_p(1) \xrightarrow{p} \int_0^{s \wedge s'} \sigma^2(r)\,\mathrm{d}r,$$

pointwise, where the first equality follows by McCabe et al. (1997). Since $T^{-1}\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t^2$ is monotonically increasing in $s$ and the limit function is continuous in $s$, the convergence in probability is also uniform. The right-hand side is the kernel of the Gaussian process $W_\sigma$ such that $W_\sigma(s) := \int_0^s \sigma(r)\,\mathrm{d}W(r)$, where $W$ is a standard Brownian motion. This implies that $M_T^b(s) \xrightarrow{w}_p W_\sigma(s)$, as in Hansen (1996).

Analogously, applying the same mappings as in the proof of Theorem 1,

$$T^{-2}\sum_{t=1}^{T}\left(\sum_{j=1}^{t} \hat u_j^b\right)^2 \xrightarrow{w}_p \int_0^1 \Big(W_\sigma(s) - F\big(s, B^0_{2,\Omega}, \theta_0\big)'\,\psi\big(B^0_{2,\Omega}, \theta_0, \kappa\big)\Big)^2\,\mathrm{d}s.$$

Now, we derive the large sample behavior of $(\hat\omega_u^b)^2$:

$$(\hat\omega_u^b)^2 = T^{-1}\sum_{t=1}^{T} (\hat u_t^b)^2 + 2T^{-1}\sum_{s=1}^{l} w(s, l)\sum_{t=s+1}^{T} \hat u_t^b \hat u_{t-s}^b = T^{-1}\sum_{t=1}^{T} (u_t^b)^2 + 2T^{-1}\sum_{s=1}^{l} w(s, l)\sum_{t=s+1}^{T} u_t^b u_{t-s}^b + o_p(1) \xrightarrow{p} \int_0^1 \sigma^2(r)\,\mathrm{d}r,$$

because $E\big(z_t z_{t-s} \mid \{\hat u_t, x_{tT}\}_{t=1}^T\big) = 0$ for all $s > 0$ and $= 1$ for $s = 0$, and by the same argument as above from McCabe et al. (1997).

This implies that the bootstrap test statistic $\hat\eta^b$ samples from a distribution that has the same variance profile as the distribution of $\hat\eta$ but without serial correlation (white noise). Using the arguments in Demetrescu et al. (2019), which are based upon Kiefer and Vogelsang (2005), the bootstrap (asymptotically) controls size.

(ii) We again consider $M_T^b(s)$ and $\Lambda_T^M(s, s')$, but now it suffices to look at the order of convergence. Recall that under the alternative $\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t = O_p(T^{3/2})$ and $\sum_{t=1}^{\lfloor Ts \rfloor} \hat u_t^2 = O_p(T^2)$. This implies that $\Lambda_T^M(s, s') = O_p(T)$ and, as in part (i), $T^{-1/2} M_T^b(s)$ converges weakly in probability to a Gaussian process whose kernel is given by the weak limit of $T^{-1}\Lambda_T^M(s, s')$.

By the continuous mapping theorem it follows that $\sum_{t=1}^{T} \hat u_t^b = O_p(T)$ and, hence, that

$$\sum_{t=1}^{T}\left(\sum_{j=1}^{t} \hat u_j^b\right)^2 = O_p(T^3).$$

Consider next the long-run variance estimator $(\hat\omega_u^b)^2$. Again, as in the proof of Theorem 2, $(\hat\omega_u^b)^2$ is of order $O_p(lT)$ under the alternative. All in all, we get $\hat\eta^b = O_p(1/l)$. Since $\hat\eta = O_p(T/l)$ (Theorem 2), it follows that $p_T^b \xrightarrow{w} 0$, as long as $l \to \infty$ for $T \to \infty$.

Proof of Corollary 1.

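As in Choi and Saikkonen (2010) we modify equation (12) to

$$\ell^{-1/2}\sum_{t=i}^{\lfloor \ell s + i - 1 \rfloor} \hat u_t = \ell^{-1/2}\sum_{t=i}^{\lfloor \ell s + i - 1 \rfloor} u_t - \ell^{-1}\sum_{t=i}^{\lfloor \ell s + i - 1 \rfloor} K(x_{tT}, \theta_0)'\,\sqrt{T}(\hat\theta_T - \theta_0)\sqrt{\frac{\ell}{T}} + T(\hat\theta_T - \theta_0)'\left[\ell^{-3/2}\sum_{t=i}^{\lfloor \ell s + i - 1 \rfloor} \frac{\partial^2 g(x_{tT}, \tilde\theta)}{\partial\theta\,\partial\theta'}\right](\hat\theta_T - \theta_0)\,\frac{\ell}{T}.$$

We use the arguments from the proof of Theorem 1 and $\ell/T \to 0$ to obtain

$$\ell^{-1/2}\sum_{t=i}^{\lfloor \ell s + i - 1 \rfloor} \hat u_t \xrightarrow{w} B_{1,\Omega}(s).$$

The remainder follows by the continuous mapping theorem and $(\hat\omega_u^\ell)^2 \to \bar\omega_u^2$.

B Appendix: Additional Simulations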

This section discusses the estimation of polynomial regressions with additional deterministic regressors. This is beyond our model assumptions, which follow the assumption of Choi and Saikkonen (2010) that all regressors are integrated; deterministic regressors are not integrated. However, deterministic regressors are useful in many applications. Therefore, we extend the simulations of Section 4.2 to study the impact of an intercept or a time trend on the rejection rates of the bootstrap test. More specifically, we discuss the cubic regression model with deterministic components because it is the model in Section 5. Unreported results show that the results are qualitatively similar for a linear cointegrating regression model with a deterministic regressor.

First we consider the cubic model including an intercept,

$$y_t = 1 + x_t + 2x_t^2 + x_t^3 + u_t.$$

Panel (a) of Table 6 shows, analogously to the previous results, the rejection frequencies of the bootstrap test using NLS. We observe that in the presence of endogeneity the test is somewhat oversized, with a rejection rate of about 10%.

We also discuss a version of the cubic polynomial regression with a time trend of the form

$$y_t = 1 + t + x_t + 2x_t^2 + x_t^3 + u_t.$$

We do so mainly because there are some notable differences to the case without deterministic components, and because we use this model for the application in Section 5. Panel (a) of Table 7 shows, analogously to the previous results, the rejection frequencies of the bootstrap test using NLS. We observe that in the presence of endogeneity the test is oversized, with a rejection rate of about 10%. This is no surprise, as the literature has already documented this issue and proposed several solutions. For example, one could use fully modified OLS developed in Phillips and Hansen (1990), as suggested in Wagner and Hong (2016). We here follow Choi and Saikkonen (2010), who use the leads-and-lags (LL) estimator proposed by Saikkonen (1991) (which is also known under the name dynamic (non)-linear least squares). We briefly describe the procedure. We estimate the coefficients in the model

$$y_t = c + \delta t + \theta_1 x_t + \theta_2 x_t^2 + \theta_3 x_t^3 + \sum_{j=-K}^{K} \pi_j \Delta x_{t-j} + e_t,$$

which means that we include $K$ leads and $K$ lags of $\Delta x_t$ (in addition to the contemporaneous difference) in the regression. As in Choi and Saikkonen (2010) we take $K = 1, 2, 3$. However, panel (b) in Table 7 only reports the case of $K = 1$, as the others have shown similar results. We compute test statistics and bootstrap $p$-values analogously, [...] $e_t$. To save computational time we run 1,000 replications in this example for all settings.

Comparing both panels of Table 7 shows that the size problem is corrected. Moreover, the empirical power is of comparable magnitude. We also employ this test based on LL for the application in Section 5.

Table 6: The table reports the empirical rejection frequencies for testing the null of cointegration in the cubic regression model with intercept for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test using least squares and panel (b) is for the bootstrap test using leads and lags.
[Table entries not recovered. Column keys: $\sigma_\mu \in \{0, 0.001, 0.01, 0.1\}$, each combined with $\rho \in \{0, 0.5\}$; row keys: $T$, $\tau$, $\sigma$, $\lambda$.]

Table 7: The table reports the empirical rejection frequencies for testing the null of cointegration in the cubic regression model with time trend for various parameter constellations. All rejection rates are given as percentages. The nominal size is 5%. Panel (a) is for the bootstrap test using least squares and panel (b) is for the bootstrap test using leads and lags.
[Table entries not recovered. Column keys: $\sigma_\mu \in \{0, 0.001, 0.01, 0.1\}$, each combined with $\rho \in \{0, 0.5\}$; row keys: $T$, $\tau$, $\sigma$, $\lambda$.]

C Appendix: Plots

[Figure panels: Denmark, Finland, France, Germany, Italy, Japan, Netherlands; horizontal axis: $s$; vertical axis: variance profile.]