Backward CUSUM for Testing and Monitoring Structural Change
Sven Otto∗, University of Bonn
Jörg Breitung, University of Cologne
March 6, 2020
Abstract
It is well known that the conventional CUSUM test suffers from low power and large detection delay. We therefore propose two alternative detector statistics. The backward CUSUM detector sequentially cumulates the recursive residuals in reverse chronological order, whereas the stacked backward CUSUM detector considers a triangular array of backward cumulated residuals. While both the backward CUSUM detector and the stacked backward CUSUM detector are suitable for retrospective testing, only the stacked backward CUSUM detector can be monitored on-line. The limiting distributions of the maximum statistics under suitable sequences of alternatives are derived for retrospective testing and fixed endpoint monitoring. In the retrospective testing context, the local power of the tests is shown to be substantially higher than that for the conventional CUSUM test if a single break occurs after one third of the sample size. When applied to monitoring schemes, the detection delay of the stacked backward CUSUM is shown to be much shorter than that of the conventional monitoring CUSUM procedure. Moreover, an infinite horizon monitoring procedure and critical values are presented.
Keywords: structural breaks, recursive residuals, sequential tests, change-point detection, local power, local delay

∗ Corresponding author: Sven Otto, University of Bonn, Institute for Finance and Statistics, Adenauerallee 24-26, 53113 Bonn, Germany. Tel.: +49-228-73-9271. Mail: [email protected].

1 Introduction
Cumulative sums have become a standard statistical tool for testing and monitoring structural changes in time series models. The CUSUM test was introduced by Brown et al. (1975) as a structural break test for the coefficient vector in the linear regression model $y_t = x_t'\beta_t + u_t$ with time index $t$, where $\beta_t$ denotes the coefficient vector and $x_t$ is the vector of regressor variables. Under the null hypothesis, there is no structural change, such that $\beta_t = \beta$ for all $t = 1, \ldots, T$, while, under the alternative hypothesis, the coefficient vector changes at unknown time $T^*$, where $1 < T^* \le T$.

Sequential tests, such as the CUSUM test, consist of a detector statistic and a critical boundary function. The CUSUM detector sequentially cumulates standardized one-step ahead forecast errors, which are also referred to as recursive residuals. The detector is evaluated for each time point within the testing period, and, if its path crosses the boundary function at least once, the null hypothesis is rejected.

A variety of retrospective structural break tests have been proposed in the literature. Krämer et al. (1988) investigated the CUSUM test of Brown et al. (1975) under a more general setting. The MOSUM tests by Bauer and Hackl (1978) and Chu et al. (1995) are based on a moving time window of fixed length. A CUSUM test statistic that cumulates OLS residuals was proposed by Ploberger and Krämer (1992), and Ploberger et al. (1989) presented a fluctuation test based on a sequence of OLS estimates. Kuan and Hornik (1995) studied generalized fluctuation tests. Andrews (1993) proposed a sup-Wald test, and the tests by Nyblom (1989) and Hansen (1992) consider likelihood scores instead of residuals.

Since the seminal work of Chu et al. (1996), increasing interest has been focused on monitoring structural stability in real time.
Sequential monitoring procedures consist of a detector statistic and a boundary function that are evaluated for periods beyond some historical time span $\{1, 2, \ldots, T\}$. It is assumed that there is no structural change within the historical time period. The monitoring time span with $t > T$ can either have a fixed endpoint $M < \infty$ or an infinite horizon (see Figure 1). In the fixed endpoint setting, the monitoring period starts at $T + 1$ and ends at $M$, while the boundary function depends on the ratio $m = M/T$. This setting is suitable if the length of the monitoring period is known in advance. In case of an infinite horizon, the monitoring time span does not need to be specified before the monitoring procedure starts. These two monitoring schemes are also referred to as closed-end and open-end procedures (see Kirch and Kamgaing 2015). The null hypothesis of no structural change is rejected whenever the path of the detector crosses some critical boundary function for the first time. CUSUM-based monitoring procedures for a fixed endpoint are proposed in Leisch et al. (2000), Zeileis et al. (2005), Wied and Galeano (2013), and Dette and Gösmann (2019), whereas Chu et al. (1996), Horváth et al. (2004), Aue et al. (2006), Fremdt (2015), and Gösmann et al. (2019) considered an infinite monitoring horizon.

Figure 1: Retrospective testing and monitoring
[Figure: timeline with marks 0, T ("You are here"), and M; regions labeled retrospective, fixed endpoint monitoring, and infinite horizon monitoring.]

A drawback of the conventional retrospective CUSUM test is its low power, whereas the conventional monitoring CUSUM procedure exhibits large detection delays. This is due to the fact that the pre-break recursive residuals are uninformative, as their expectation is equal to zero up to the break date, while the recursive residuals have a non-zero expectation after the break. Hence, the cumulative sums of the recursive residuals typically contain a large number of uninformative residuals that only add noise to the statistic. In contrast, if one cumulates the recursive residuals backwards from the end of the sample to the beginning, the cumulative sum collects the informative residuals first, and the likelihood of exceeding the critical boundary will typically be larger than when cumulating residuals from the beginning onwards. In this paper, we show that such backward CUSUM tests may indeed have a much higher power and lower detection delay than the conventional forward CUSUM tests.

Another way of motivating the backward CUSUM testing approach is to consider the simplest possible situation, where, under the null hypothesis, it is assumed that the process is generated as $y_t = \beta + u_t$, with $\beta$ and $\sigma^2 = Var(u_t)$ assumed to be known. We are interested in testing the hypothesis that, at some time period $T^*$, the mean changes to some unknown value $\beta^* > 0$. To test this hypothesis, we introduce the dummy variable $D_t^*$, which is unity for $t \ge T^*$ and zero elsewhere. For this one-sided testing problem, there exists a uniformly most powerful test statistic, which is the $t$-statistic of the hypothesis $\delta = 0$ in the regression $(y_t - \beta) = \delta D_t^* + u_t$:

$$\mathcal{T}_{T^*} = \frac{1}{\sigma \sqrt{T - T^* + 1}} \sum_{t=T^*}^{T} (y_t - \beta).$$

If $\beta$ is unknown, we may replace it by the full sample mean $\bar{y}$, resulting in the backward cumulative sum of the OLS residuals from period $T$ through $T^*$. Note that if $T^*$ is unknown, the test statistic is computed for all possible values of $T^*$, whereas the starting point $T$ of the backward cumulative sum remains constant. Since the sum of the OLS residuals is zero, it follows that the test is equivalent to a test based on the forward cumulative sum of the OLS residuals. In contrast, if we replace $\beta$ with the recursive mean $\mu_{t-1} = (t-1)^{-1} \sum_{i=1}^{t-1} y_i$, we obtain a test statistic based on the backward cumulative sum of the recursive residuals (henceforth, backward CUSUM). In this case, however, the test is different from a test based on the forward cumulative sum of the recursive residuals (henceforth, forward CUSUM). This is due to the fact that the sum of the recursive residuals is an unrestricted random variable. Accordingly, the two versions of the test may have quite different properties. In particular, it turns out that the backward CUSUM is much more powerful than the standard forward CUSUM at the end of the sample. Accordingly, this version of the CUSUM test procedure is better suited for the purpose of real-time monitoring, where it is crucial to be powerful at the end of the sample.

Furthermore, the conventional CUSUM test has no power against alternatives that do not affect the unconditional mean of $y_t$.
In order to obtain tests that have power against breaks of this kind, we extend the existing invariance principle for recursive residuals to a multivariate version and consider a vector-valued CUSUM process instead of the univariate CUSUM detector. For both retrospective testing and monitoring, we propose a vector-valued sequential statistic in the fashion of the score-based cumulative sum statistic of Hansen (1992). The maximum vector entry of the multivariate statistic then yields a detector and a sequential test that has power against a much larger class of structural breaks.

In Section 2, the limiting distribution of the multivariate CUSUM process is derived under both the null hypothesis and local alternatives. Section 3 introduces the forward CUSUM, the backward CUSUM, and the stacked backward CUSUM tests for both retrospective testing and monitoring. While the backward CUSUM is only defined for $t \le T$ and can thus be implemented only for retrospective testing, the stacked backward CUSUM cumulates recursive residuals backwardly in a triangular scheme and is therefore suitable for real-time monitoring. Furthermore, the local powers of the tests are compared. In the retrospective setting, the powers of the backward CUSUM and the stacked backward CUSUM tests are substantially higher than that of the conventional forward CUSUM test if a single break occurs after one third of the sample size. In the case of monitoring, the detection delay of the stacked backward CUSUM under local alternatives is shown to be much lower than that of the monitoring CUSUM detector by Chu et al. (1996). Section 4 considers the estimation of the break date based on backward cumulated recursive residuals. We present an estimator that is more accurate than the conventional maximum likelihood estimator if the break is located at the end of the sample. In Section 5 we discuss testing against partial structural breaks.
Section 6 presents simulated critical values and Monte Carlo simulation results, and Section 7 concludes.

We consider the multiple linear regression model

$$y_t = x_t'\beta_t + u_t, \quad t \in \mathbb{N},$$

where $y_t$ is the dependent variable, and $x_t = (1, x_{t2}, \ldots, x_{tk})'$ is the vector of regressor variables including a constant. The $k \times 1$ vector $\beta_t$ depends on the time index $t$, and $u_t$ is an error term. Let $\{(y_t, x_t')', 1 \le t \le T\}$ be the set of historical observations, such that the time point $T$ divides the time horizon into the retrospective time period $1 \le t \le T$ and the monitoring period $t > T$. We impose the following assumptions on the regressors and the error term.

Assumption 1.
(a) $\{x_t\}_{t \in \mathbb{N}}$ is stationary and ergodic with $E[x_t x_t'] = C$, where $C$ is positive definite, and $E|x_{tj}|^{\kappa} < \infty$ for some $\kappa >$ [...], for all $j = 2, \ldots, k$.
(b) $\{u_t\}_{t \in \mathbb{N}}$ is a stationary martingale difference sequence with respect to $\mathcal{F}_t$, the $\sigma$-algebra generated by $\{(x_{i+1}', u_i)', i \le t\}$, such that $E[u_t | \mathcal{F}_{t-1}] = 0$, $E[u_t^2 | \mathcal{F}_{t-1}] = \sigma^2 > 0$, and $E|u_t|^{\kappa} < \infty$ for some $\kappa >$ [...].

Recursive residuals for linear regression models were introduced by Brown et al. (1975) as standardized one-step ahead forecast errors. Let $\hat{\beta}_{t-1} = \big(\sum_{i=1}^{t-1} x_i x_i'\big)^{-1} \big(\sum_{i=1}^{t-1} x_i y_i\big)$ be the OLS estimator at time $t - 1$. The recursive residuals are given by

$$w_t = \frac{y_t - x_t'\hat{\beta}_{t-1}}{\sqrt{1 + x_t'\big(\sum_{i=1}^{t-1} x_i x_i'\big)^{-1} x_t}}, \quad t \ge k + 1,$$

and $w_t = 0$ for $t = 1, \ldots, k$.

For testing against structural changes in the regression coefficient vector, Brown et al. (1975) introduced the sequential statistic $Q_{t,T} = (\hat{\sigma}^2 T)^{-1/2} \sum_{j=1}^{t} w_j$ for $t = 1, \ldots, T$, where $\hat{\sigma}^2$ is a consistent estimator for $\sigma^2$. In the monitoring context, Chu et al. (1996) considered the detector statistic $Q_{t,T} - Q_{T,T}$ for $t > T$. The limiting behavior of the underlying empirical process has been thoroughly analyzed in the literature. Under $H_0: \beta_t = \beta$ for all $t \in \mathbb{N}$, Sen (1982) showed that $Q_{\lfloor rT \rfloor,T} = (\hat{\sigma}^2 T)^{-1/2} \sum_{j=1}^{\lfloor rT \rfloor} w_j$ converges weakly and uniformly to a standard Brownian motion $W(r)$ for $r \in [0, 1]$. [...] $H_1: \beta_t = \beta + T^{-1/2} g(t/T)$, where $g(r)$ is piecewise constant and bounded. Let $\mu = \operatorname{plim}_{T \to \infty} (\bar{x}_1, \ldots, \bar{x}_k)'$ be the mean regressor, where $\bar{x}_j$ is the sample mean of the $j$-th component of the regressors, and let

$$h(r) = \frac{1}{\sigma} \int_0^r g(z)\, dz - \frac{1}{\sigma} \int_0^r \int_0^z \frac{1}{z}\, g(v)\, dv\, dz. \quad (1)$$

The authors showed that $Q_{\lfloor rT \rfloor,T}$ converges weakly and uniformly to $W(r) + \mu' h(r)$ for $r \in [0, 1]$. If $g(r)$ is orthogonal to $\mu$, the limiting distributions under $H_0$ and $H_1$ coincide. Hence, if the break in the coefficient vector does not affect the unconditional mean of $y_t$, then the CUSUM tests of Brown et al. (1975) and Chu et al. (1996) have no power against such an alternative.

To sidestep this difficulty, we consider a multivariate cumulative sum process of recursive residuals, which is defined as

$$\mathbf{Q}_T(r) = \frac{1}{\hat{\sigma}\sqrt{T}}\, C_T^{-1/2} \sum_{t=1}^{\lfloor rT \rfloor} x_t w_t, \quad r \ge 0, \quad (2)$$

where $\hat{\sigma}^2 = (T - k - 1)^{-1} \sum_{j=1}^{T} (w_j - \bar{w})^2$ is a consistent estimator for $\sigma^2$ (see Krämer et al. 1988), and $C_T = T^{-1} \sum_{t=1}^{T} x_t x_t'$ denotes the sample covariance matrix. Note that $\mathbf{Q}_T(r)$ is a vector of piecewise constant processes, where its domain can be divided into the retrospective time period $r \in [0, 1]$ and the monitoring period $r > 1$. On the domain $r \in [0, m]$, $m < \infty$, the multivariate CUSUM process is bounded in probability. Hence, each component of $\mathbf{Q}_T(r)$ is in the space $D([0, m])$ of càdlàg functions on $[0, m]$, and $\mathbf{Q}_T(r)$ is an element of the $k$-fold product space $D([0, m])^k = D([0, m]) \times \ldots \times D([0, m])$. The space is equipped with the Skorokhod metric (see Billingsley 1999, p.166 and p.244), and the symbol "$\Rightarrow$" denotes weak convergence with respect to this metric. The result presented below summarizes the limiting behavior of $\mathbf{Q}_T(r)$ for both the retrospective and the fixed endpoint monitoring time period under both $H_0$ and $H_1$:

Theorem 1.
Let $\{(x_t, u_t)\}_{t \in \mathbb{N}}$ satisfy Assumption 1, let $g(r)$ be piecewise constant and bounded, and let $\beta_t = \beta + T^{-1/2} g(t/T)$ for all $t \in \mathbb{N}$. Then, for any fixed and positive $m < \infty$,

$$\mathbf{Q}_T(r) \Rightarrow \mathbf{W}(r) + C^{1/2} h(r), \quad r \in [0, m], \quad (3)$$

as $T \to \infty$, where $\mathbf{W}(r)$ is a $k$-dimensional standard Brownian motion and $h(r)$ is defined as in (1).

Note that the function $g(r)$ is constant if and only if $\beta_t$ is constant for all $t \in \mathbb{N}$. Under $H_0$, we then obtain $C^{1/2} h(r) = 0$, and thus $\mathbf{Q}_T(r) \Rightarrow \mathbf{W}(r)$. By contrast, under a local alternative with a non-constant break function $g(r)$, it follows that $h(r)$ is non-zero, and, consequently, $C^{1/2} h(r)$ is non-zero, since $C^{1/2}$ is positive definite. The limiting distributions of $\mathbf{Q}_T(r)$ under $H_0$ and $H_1$ coincide only for the trivial case where $g(r)$ is constant. Therefore, tests that are based on $\mathbf{Q}_T(r)$ have power against a larger class of alternatives than the tests of Brown et al. (1975) and Chu et al. (1996).

The functional central limit theorem given by equation (3) is not suitable for analyzing the asymptotic behavior of an infinite horizon monitoring statistic, since the variance of $\mathbf{Q}_T(r)$ is unbounded as $r \to \infty$, and the supremum of $\|\mathbf{Q}_T(r) - \mathbf{W}(r)\|$ over an unbounded range of $r$ might not converge in general. For an i.i.d. random process $\{v_t\}_{t \in \mathbb{N}}$ with $E[v_1] = 0$, $E[v_1^2] = \sigma^2$ and $E[|v_1|^{\kappa}] < \infty$, $\kappa >$ [...] $W(r)$, such that $\sigma^{-1} \sum_{t=1}^{T} v_t = W(T) + o(T^{1/\kappa})$, a.s., as $T \to \infty$, where the approximation rate is optimal. This almost sure invariance principle is known as the KMT approximation, which was employed by Horváth (1995) to derive the limiting distribution of the infinite horizon statistic $\sup_{t > T} |Q_{t,T} - Q_{T,T}| / d(t/T)$ for an appropriate boundary function $d(r)$. Wu et al. (2007) and Berkes et al. (2014) extended the almost sure invariance principle to more general classes of dependent random processes, which can be used to formulate the following stochastic approximation result:

Theorem 2.
Let $\{(x_t, u_t)\}_{t \in \mathbb{N}}$ satisfy Assumption 1 and let $\beta_t = \beta$ for all $t \in \mathbb{N}$. Then, there exists a $k$-dimensional standard Brownian motion $\mathbf{W}(r)$, such that, as $T \to \infty$,

$$\sup_{r \ge 1} \frac{\|\mathbf{Q}_T(r) - \mathbf{W}(r)\|}{\sqrt{r}} = o_P(1),$$

where $\|\cdot\|$ denotes the maximum norm, which is the largest vector entry in absolute value.

This result is the key tool to derive the limiting distribution of infinite horizon monitoring statistics that are based on the multivariate CUSUM process, which is done in the next section. It also indicates that $\mathbf{Q}_T(r)$ should be scaled by a factor of at least order $\sqrt{r}$ to approximate the process by a Brownian motion.

3 CUSUM detectors
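As a computational reference for the detectors in this section, the recursive residuals $w_t$ and the multivariate CUSUM process $Q_T(r)$ of equation (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `recursive_residuals` and `cusum_process` are hypothetical, the OLS estimate is recomputed from scratch at each step for readability rather than updated recursively, and $\hat{\sigma}$ is taken to be the sample standard deviation of $w_{k+1}, \ldots, w_T$ with divisor $T - k - 1$, which is one reading of the variance estimator in Section 2.

```python
import numpy as np

def recursive_residuals(y, X):
    """Standardized one-step-ahead forecast errors; w_t = 0 for t = 1, ..., k."""
    n, k = X.shape
    w = np.zeros(n)
    for t in range(k, n):                  # 0-based row t is time point t + 1
        Xp, yp = X[:t], y[:t]              # all observations before time t + 1
        G = np.linalg.inv(Xp.T @ Xp)       # (sum of x_i x_i' over earlier i)^{-1}
        beta = G @ (Xp.T @ yp)             # OLS estimate from earlier observations
        x = X[t]
        w[t] = (y[t] - x @ beta) / np.sqrt(1.0 + x @ G @ x)
    return w

def cusum_process(y, X, T):
    """Rows are Q_T(t/T), t = 1, ..., n, standardized with the first T observations."""
    n, k = X.shape
    w = recursive_residuals(y, X)
    sigma = np.std(w[k:T], ddof=1)         # assumed reading of sigma-hat
    C = X[:T].T @ X[:T] / T                # C_T
    vals, vecs = np.linalg.eigh(C)
    C_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T   # symmetric C_T^{-1/2}
    scores = X * w[:, None]                # row t holds x_t' * w_t
    return np.cumsum(scores, axis=0) @ C_inv_sqrt / (sigma * np.sqrt(T))

# Usage: intercept plus one regressor, slope break after 75% of the sample.
rng = np.random.default_rng(1)
T = 200
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
beta = np.where(np.arange(T)[:, None] < 150, [1.0, 0.5], [1.0, 1.5])
y = (X * beta).sum(axis=1) + rng.standard_normal(T)
Q = cusum_process(y, X, T)
print(np.abs(Q).max(axis=1).max())         # largest maximum-norm value on the path
```

Each row of the returned array is the process evaluated at $r = t/T$, so differences of rows give the partial sums over any window of residuals, and the maximum norm is the largest absolute entry of a row.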
In this section, we consider sequential tests for both retrospective testing and monitoring that are based on the multivariate CUSUM process $\mathbf{Q}_T(r)$. The null hypothesis of no structural change in the regression coefficient vector is formulated as $H_0: \beta_t = \beta$ for all $t \in I$, where the testing period is given by

$$I = \begin{cases} \{t \in \mathbb{N} : 1 \le t \le T\} & \text{in the retrospective context,} \\ \{t \in \mathbb{N} : T + 1 \le t \le mT\} & \text{in the fixed endpoint monitoring context,} \\ \{t \in \mathbb{N} : T + 1 \le t < \infty\} & \text{in the infinite horizon monitoring context.} \end{cases}$$

In the monitoring context, the non-contamination assumption $\beta_t = \beta$ for the historical time period $t = 1, \ldots, T$ is imposed. The monitoring time span could have either a fixed endpoint $M = \lfloor mT \rfloor$ with $1 < m < \infty$ or an infinite horizon with $m = \infty$.

The sequential tests consist of a detector statistic and a critical boundary function, in which the detector is evaluated for each time point within the testing period, and, if its path crosses the boundary function at least once, the null hypothesis is rejected. We make the following assumption on the boundary function:

Assumption 2.
The boundary function is of the form $b(r) = \lambda_{\alpha} \cdot d(r)$, where $\lambda_{\alpha}$ denotes the critical value, which depends on the significance level $\alpha$, and $d(r)$ is a continuous and strictly increasing function with $d(0) > 0$ and $\sup_{r \ge 0} \sqrt{r + 1}/d(r) < \infty$.

While the forward CUSUM detectors for retrospective testing and monitoring are discussed in Section 3.1, we introduce the backward CUSUM detector in Section 3.2 and the stacked backward CUSUM detectors in Section 3.3. In Section 5 we present modified detectors for testing and monitoring partial structural change.

3.1 Forward CUSUM
As an extension of the univariate CUSUM detector by Brown et al. (1975) we consider the multivariate retrospective CUSUM detector

$$\mathbf{Q}_{t,T} = \mathbf{Q}_T\big(\tfrac{t}{T}\big) = \frac{1}{\hat{\sigma}\sqrt{T}}\, C_T^{-1/2} \sum_{j=1}^{t} x_j w_j, \quad 1 \le t \le T.$$

The vector-valued detector is inspired by the score-based cumulative sum statistic of Hansen (1992). While Hansen (1992) considered OLS residuals and proposed averaging all entries of the vector-valued cumulative sum, we consider recursive residuals and formulate the multivariate detectors with respect to the maximum norm $\|\cdot\|$. The null hypothesis is rejected if the path of $\|\mathbf{Q}_{t,T}\|$ exceeds the critical boundary function $b_t = \lambda_{\alpha} \cdot d(t/T)$ at least once within the retrospective testing period. The critical value $\lambda_{\alpha}$ determines the significance level $\alpha$ such that

$$\lim_{T \to \infty} P\Big( \|\mathbf{Q}_{t,T}\| \ge \lambda_{\alpha} \cdot d\big(\tfrac{t}{T}\big) \text{ for at least one } t = 1, \ldots, T \,\Big|\, H_0 \Big) = \alpha.$$

Let $M^{\mathrm{ret}}_{Q} = \max_{1 \le t \le T} \|\mathbf{Q}_{t,T}\| / d(t/T)$ be the maximum statistic representation of the CUSUM detector. The above condition can be equivalently expressed as

$$\lim_{T \to \infty} P\big( M^{\mathrm{ret}}_{Q} \ge \lambda_{\alpha} \,\big|\, H_0 \big) = \alpha.$$

Hence, $\lambda_{\alpha}$ is the $(1 - \alpha)$ quantile of the limiting null distribution of $M^{\mathrm{ret}}_{Q}$. Note that $M^{\mathrm{ret}}_{Q}$ together with the critical value $\lambda_{\alpha}$ defines a one-shot test that is equivalent to the sequential CUSUM test.

For real-time monitoring, we follow Chu et al. (1996) and define the multivariate monitoring CUSUM detector as

$$\mathbf{Q}^{\mathrm{mon}}_{t,T} = \mathbf{Q}_T\big(\tfrac{t}{T}\big) - \mathbf{Q}_T(1) = \frac{1}{\hat{\sigma}\sqrt{T}}\, C_T^{-1/2} \sum_{j=T+1}^{t} x_j w_j, \quad t > T,$$

and $H_0$ is rejected if its maximum norm $\|\mathbf{Q}^{\mathrm{mon}}_{t,T}\|$ exceeds the boundary $b_t = \lambda_{\alpha} \cdot d\big((t - T)/T\big)$ at least once for some $t > T$. For a fixed endpoint $M = \lfloor mT \rfloor$, where $1 < m < \infty$, let $M^{\mathrm{mon}}_{Q,m} = \max_{T}$ [...]
Let $\beta_t = \beta$ for all $t \in \mathbb{N}$ and let Assumptions 1 and 2 hold true. Then,

$$M^{\mathrm{ret}}_{BQ} = \max_{1 \le t \le T} \frac{\|\mathbf{BQ}_{t,T}\|}{d\big(\frac{T - t + 1}{T}\big)} \xrightarrow{d} \sup_{r \in (0,1)} \frac{\|\mathbf{W}(r)\|}{d(r)}$$

as $T \to \infty$, where $\mathbf{W}(r)$ is a $k$-dimensional standard Brownian motion.

Using the same boundary as for the retrospective forward CUSUM, the limiting null distributions of their maximum statistics coincide. Simulated critical values when using the linear boundary are presented in Table 1. A simple illustrative example of the detector paths together with the linear boundary of Brown et al. (1975) is depicted in Figure 2, in which a process with $k = 1$ and a single break in the mean at 3/4 [...] $t$ and is therefore not suitable for a monitoring procedure. The path of $\|\mathbf{BQ}_{t,T}\|$ is only defined for $t \le T$, as its endpoint $T$ is fixed.

Figure 2: Illustrative example for the backward CUSUM with a break in the mean
[Figure: two panels, "Forward CUSUM" and "Backward CUSUM"; horizontal axis: time (0 to 100); legend: detector statistic, linear boundary (5%), recursive residuals.]
Note: The process $y_t = \mu_t + u_t$, $t = 1, \ldots, T$, is simulated for $T = 100$ with $\mu_t = 0$ for $t < 75$, $\mu_t = 1$ for $t \ge 75$, and i.i.d. standard normal innovations $u_t$. The bold solid line paths are the trajectories of $\|\mathbf{Q}_{t,T}\|$ and $\|\mathbf{BQ}_{t,T}\|$, where the detectors are univariate such that the norm is just the absolute value. In the background, the recursive residuals are plotted. The dashed lines correspond to the linear boundary (4) with significance level $\alpha = 5\%$ and critical value $\lambda_{\alpha} = 0.945$.

To combine the advantages of the backward CUSUM with the measurability properties of the forward CUSUM for monitoring, we resort to an inspection scheme, which goes back to Page (1954) and involves a triangular array of residuals together with an additional maximum. Let

$$M^{\mathrm{ret}}_{BQ}(t) = \max_{1 \le s \le t} \frac{\big\|\mathbf{Q}_T\big(\frac{t}{T}\big) - \mathbf{Q}_T\big(\frac{s-1}{T}\big)\big\|}{d\big(\frac{t - s + 1}{T}\big)}$$

be the backward CUSUM statistic with endpoint $t$. The idea is to compute this statistic sequentially for each time point $t = 1, \ldots, T$, yielding $M^{\mathrm{ret}}_{BQ}(1), M^{\mathrm{ret}}_{BQ}(2), \ldots, M^{\mathrm{ret}}_{BQ}(T)$. The stacked backward CUSUM statistic is the maximum among this sequence of backward CUSUM statistics. An important feature of this sequence is that it is measurable with respect to the filtration of information at time $t$, and $M^{\mathrm{ret}}_{BQ}(t)$ can thus be adapted for real-time monitoring. The stacked backward CUSUM detector is defined as

$$\mathbf{SBQ}_{s,t,T} = \mathbf{Q}_T\big(\tfrac{t}{T}\big) - \mathbf{Q}_T\big(\tfrac{s-1}{T}\big) = \frac{1}{\hat{\sigma}\sqrt{T}}\, C_T^{-1/2} \sum_{j=s}^{t} x_j w_j, \quad 1 \le s \le t < \infty.$$

Since the upper and the lower summation index of $\mathbf{SBQ}_{s,t,T}$ are both flexible with $s \le t$, this induces a triangular scheme. $H_0$ is rejected if $\|\mathbf{SBQ}_{s,t,T}\|$ exceeds the two-dimensional boundary $b_{s,t} = \lambda_{\alpha} \cdot d\big((t - s + 1)/T\big)$ for some $s$ and $t$ with $1 \le s \le t \le T$, or, equivalently, if the double maximum statistic

$$M^{\mathrm{ret}}_{SBQ} = \max_{1 \le t \le T} M^{\mathrm{ret}}_{BQ}(t) = \max_{1 \le t \le T} \max_{1 \le s \le t} \frac{\|\mathbf{SBQ}_{s,t,T}\|}{d\big(\frac{t - s + 1}{T}\big)}$$

exceeds $\lambda_{\alpha}$.

The backward CUSUM maximum statistic $M^{\mathrm{ret}}_{BQ}(t)$ is itself a sequential statistic. Stacking all those maximum statistics on one another leads to an additional maximum and a double supremum in the limiting distribution. The stacked backward CUSUM uses the recursive residuals in a multiple way such that the set over which the maximum is taken has many more elements than for the forward CUSUM and the backward CUSUM. For $t = 1$ only $w_1$ is cumulated, for $t = 2$ the residuals $w_1$ and $w_2$ are cumulated, for $t = 3$ we consider $w_1$, $w_2$, and $w_3$, and so forth.

The triangular detector can also be monitored on-line across all the time points $t > T$. The null hypothesis is rejected if $\|\mathbf{SBQ}_{s,t,T}\|$ exceeds $b_{s,t} = \lambda_{\alpha} \cdot d\big((t - s + 1)/T\big)$ at least once for some $s$ and $t$ with $T < s \le t$. Analogously to the retrospective case, let

$$M^{\mathrm{mon}}_{BQ}(t) = \max_{T < s \le t} \frac{\|\mathbf{SBQ}_{s,t,T}\|}{d\big(\frac{t - s + 1}{T}\big)}, \quad t > T,$$

and let $M^{\mathrm{mon}}_{SBQ,m} = \max$ [...]
Let $\beta_t = \beta$ for all $t \in \mathbb{N}$ and let Assumptions 1 and 2 hold true. Then,

(a) $$M^{\mathrm{ret}}_{SBQ} \xrightarrow{d} \sup_{r \in (0,1)} \sup_{s \in (0,r)} \frac{\|\mathbf{W}(r) - \mathbf{W}(s)\|}{d(r - s)},$$

(b) $$M^{\mathrm{mon}}_{SBQ,m} \xrightarrow{d} \sup_{r \in (0, m-1)} \sup_{s \in (0,r)} \frac{\|\mathbf{W}(r) - \mathbf{W}(s)\|}{d(r - s)} \overset{d}{=} \sup_{r \in (0, \frac{m-1}{m})} \sup_{s \in (0,r)} \frac{\|(1-s)\mathbf{B}(r) - (1-r)\mathbf{B}(s)\|}{(1-r)(1-s)\, d\big(\frac{r - s}{(1-r)(1-s)}\big)}, \quad 1 < m < \infty,$$

(c) $$M^{\mathrm{mon}}_{SBQ,\infty} \xrightarrow{d} \sup_{r \in (0,\infty)} \sup_{s \in (0,r)} \frac{\|\mathbf{W}(r) - \mathbf{W}(s)\|}{d(r - s)} \overset{d}{=} \sup_{r \in (0,1)} \sup_{s \in (0,r)} \frac{\|(1-s)\mathbf{B}(r) - (1-r)\mathbf{B}(s)\|}{(1-r)(1-s)\, d\big(\frac{r - s}{(1-r)(1-s)}\big)},$$

as $T \to \infty$, where $\mathbf{W}(r)$ is a $k$-dimensional standard Brownian motion and $\mathbf{B}(r)$ is a $k$-dimensional standard Brownian bridge.

Analogously to the forward CUSUM, for the linear boundary of Brown et al. (1975), it follows that [...] 3, while the stacked backward CUSUM outperforms the other two tests if the break is located at around 1/ [...] $m = 2$, the local power curves of the forward CUSUM test and the stacked backward CUSUM test have exactly the same shape as their counterparts in the retrospective case. The monitoring local power curve for a break at $\tau^* \in (1, 2)$ then coincides with the corresponding retrospective curve in Figure 3 with a single break at $\tau^* - 1$. Hence, the power of the stacked backward CUSUM is always higher than that of the forward CUSUM if $\tau^* \ge 1.15$ in the monitoring case.

The much more important performance measure for monitoring detectors is the delay between the actual break and the detection time point, since every fixed nontrivial alternative will be detected if the monitoring horizon is long enough. Let $T_d$ denote the stopping time given by the time point of the first boundary crossing, and let the mean local relative delay be given by $E\big[T_d/T \mid \tau^* \le T_d/T \le m\big] - \tau^*$.
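The stopping time $T_d$ can be made concrete with a small sketch for the simplest case $k = 1$ with $x_t = 1$ (a mean-only model), where the monitoring detector reduces to scaled backward sums of the recursive residuals over windows $T < s \le t$. The code below is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the default boundary shape $d(r) = 1 + 2r$ is assumed to be the linear boundary in the form associated with Brown et al. (1975), the argument `lam` must be supplied by the user (the value 1.5 in the usage lines is a placeholder, not a tabulated critical value), and $\hat{\sigma}$ is estimated from the historical residuals $w_2, \ldots, w_T$.

```python
import math
import random

def recursive_residuals_mean(y):
    """Recursive residuals for the mean-only model y_t = beta + u_t (w_1 = 0)."""
    w = [0.0]
    total = y[0]
    for t in range(1, len(y)):             # t earlier observations are available
        w.append((y[t] - total / t) / math.sqrt(1.0 + 1.0 / t))
        total += y[t]
    return w

def first_crossing(y, T, lam, d=lambda r: 1.0 + 2.0 * r):
    """First monitoring time t > T at which some backward sum over s = T+1, ..., t
    reaches lam * d((t - s + 1) / T); returns None if no crossing occurs."""
    w = recursive_residuals_mean(y)
    hist = w[1:T]                          # historical residuals w_2, ..., w_T
    mean = sum(hist) / len(hist)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in hist) / (len(hist) - 1))
    scale = sigma * math.sqrt(T)
    for t in range(T + 1, len(y) + 1):     # 1-based time points T+1, ..., M
        back = 0.0
        for s in range(t, T, -1):          # s = t, t-1, ..., T+1
            back += w[s - 1]               # running sum of w_s, ..., w_t
            if abs(back) / scale >= lam * d((t - s + 1) / T):
                return t
    return None

# Usage: T = 100 historical points, monitoring until M = 400, mean shift at T* = 151.
random.seed(0)
T, M, T_star = 100, 400, 151
y = [random.gauss(0.0, 1.0) + (2.0 if t >= T_star else 0.0) for t in range(1, M + 1)]
T_d = first_crossing(y, T, lam=1.5)        # lam is a placeholder value
if T_d is not None:
    print("relative delay:", T_d / T - T_star / T)
```

Averaging the relative delay over many simulated paths, restricted to crossings that occur at or after the break, gives a finite-sample analogue of the mean relative delay defined above.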
Figure 4 presents the simulated mean local relative delay curves for the fixed endpoint $m = 4$ for $M^{\mathrm{mon}}_{SBQ,4}$ with the linear boundary, for $M^{\mathrm{mon}}_{Q,4}$ with the linear boundary, and for $M^{\mathrm{mon}}_{Q,4}$ with the radical boundary by Chu et al. (1996). The mean local relative delay of the stacked backward CUSUM is much lower than that of the forward CUSUM. Furthermore, the mean local relative delay is constant across different break locations, with the exception of breaks that are located at $\tau^* <$ [...] $H_0$. The upper three pictures in Figure 5 present histograms of the asymptotic size distributions for retrospective tests using the linear boundary. For the forward CUSUM, the highest rejection rates under $H_0$ are obtained at relative locations between 0.15 and 0. [...] .85. The distribution for the forward CUSUM is right-skewed, whereas, for the backward CUSUM, it is left-skewed.

Figure 5: Size distributions of the retrospective and monitoring detectors
[Figure: six histograms. Upper row: "Forward CUSUM", "Backward CUSUM", "Stacked backward CUSUM". Lower row: "Stacked backward CUSUM", "Forward CUSUM", "Forward CUSUM (radical boundary)". Horizontal axis: time point of rejection; vertical axis: density.]
Note: The frequencies of the location of the first boundary exceedance under the null hypothesis are shown for a significance level of 5% for the model with $k = 1$. The frequencies are based on random draws under the limiting null distribution of the maximum statistics. The retrospective case is considered for the upper three histograms and the fixed endpoint monitoring case with $m = 10$ for the lower three. The linear boundary (4) is considered in the first five plots and the radical boundary by Chu et al. (1996) is used in the last plot.
For the stacked backward CUSUM, the distribution is much closer to a uniform distribution, although it is slightly left-skewed. Note that the size distributions provide information about the location of false rejections, but, when comparing Figure 3 with Figure 5, it is reasonable to assume that this is also related to the distribution of the power across different time points. There is no consensus on which distribution should be preferred, as whether one wishes to put more weight on particular regions of time points of rejection depends on the particular application. However, Zeileis et al. (2005) and Anatolyev and Kosenok (2018) argue that if no further information is available, one might prefer a uniform distribution to a skewed one. The lower three pictures in Figure 5 present the distributions of the size for the fixed monitoring horizon with $m = 10$. The distribution for the stacked backward CUSUM is much closer to a uniform distribution compared to those of the forward CUSUM variants.

As soon as the testing procedure has indicated a structural instability in the coefficient vector, the next step is to locate the break point. In the single break model with coefficient vector

$$\beta_t = \beta + \delta\, 1_{\{t \ge T^*\}}, \quad \delta \ne 0, \quad (9)$$

Horváth (1995) suggested estimating the relative break date $\tau^* = T^*/T$ by the relative time index for which the likelihood ratio statistic is maximized. As an asymptotically equivalent estimator, Bai (1997) proposed the maximum likelihood estimator

$$\hat{\tau}^{\mathrm{ret}}_{ML} = \frac{1}{T} \cdot \operatorname*{argmin}_{1 \le t \le T} \big( S_1(t) + S_2(t) \big), \quad (10)$$

where $S_1(t)$ is the OLS residual sum of squares using observations until time point $t$ and $S_2(t)$ is the OLS residual sum of squares using observations from time $t + 1$ onwards. In case of monitoring, Chu et al. (1996) considered $\hat{\tau}^{\mathrm{mon}}_{ML} = \frac{1}{T} \cdot \operatorname{argmin}_{T}$ [...] 5, the solid line shows the trajectory of the asymptotic mean of the scaled detector $h^*(r)/\sqrt{1 - r}$ and the dashed line shows the trajectory of $h^*(r)$ given by equations (13) and (12).

To bypass this problem, we use backwardly cumulated recursive residuals to estimate the relative break location. As illustrated in Figure 2, the backward CUSUM detector is approximately constant in the pre-break period and decreases to zero in the post-break period, and the maximum is attained near the break location $t = T^*$ when dividing $\|\mathbf{BQ}_{t,T}\|$ by its standard deviation $\sqrt{(T - t + 1)/T}$. Accordingly, we consider the estimators

$$\hat{\tau}_{\mathrm{ret}} = \frac{1}{T} \cdot \operatorname*{argmax}_{1 \le t \le T} \big\| BS_{t,T} \big\|, \qquad \hat{\tau}_{\mathrm{mon}} = \frac{1}{T} \cdot \operatorname{argmax}_{T} \text{ [...]}$$

Let $\{(x_t, u_t)\}_{t \in \mathbb{N}}$ satisfy Assumption 1 and let $\beta_t$ be given by equation (9). Then, as $T \to \infty$,
(a) $\hat{\tau}_{\mathrm{ret}} \xrightarrow{p} \tau^*$, if $\tau^* \in (0, 1)$,
(b) $\hat{\tau}_{\mathrm{mon}} \xrightarrow{p} \tau^*$, if $\tau^* \in (1, T_d/T]$.

It is not always a good idea to use all entries of the multivariate CUSUM process, especially if $k$ is large and if the focus is to test for breaks in only some regression coefficients. Following the discussion of Section 2, the univariate CUSUM tests of Brown et al. (1975) and Chu et al. (1996) are partial structural break tests in the sense that they have power only against a break in the intercept. However, since the critical values for the multivariate CUSUM test increase with the number of regressors $k$, the univariate CUSUM test has a higher power against a break in the intercept than the multivariate counterpart if $k \ge$ [...] $l < k$ linear combinations of the regression coefficients, which can be expressed by some orthonormal $k \times l$ matrix $H$, such that the partial stability hypothesis $\tilde{H}_0: H'\beta_t = H'\beta$ is tested against $\tilde{H}_1: H'\beta_t \ne H'\beta$ for some $t$.
The corresponding partial multivariate CUSUM statistic is given by $\widetilde{Q}_{t,T} = H' Q_{t,T}$. In case of a test for a break in only the intercept, $\widetilde{Q}_{t,T}$ coincides with the univariate CUSUM detector $Q_{t,T}$, where $H = (1, 0, \ldots, 0)'$. Analogously, we define
$$\widetilde{BQ}_{t,T} = \widetilde{Q}_{T,T} - \widetilde{Q}_{t-1,T}, \qquad \widetilde{SBQ}_{s,t,T} = \widetilde{Q}_{t,T} - \widetilde{Q}_{s-1,T}.$$
Under $\widetilde{H}_0$, Theorem 1 yields $\widetilde{Q}_{\lfloor rT \rfloor, T} \Rightarrow H' W(r)$, where $H' W(r)$ is an $l$-dimensional standard Brownian motion, since the columns of $H$ are orthonormal. Hence, the limiting distributions of the maximum statistics that are based on the modified detectors coincide with those presented in Theorems 3–5, except that the Brownian motions are $l$-dimensional instead of $k$-dimensional. Critical values are presented in Tables 1 and 2 in the subsequent section.

Table 1: Asymptotic critical values for the retrospective tests

       M^ret_Q and M^ret_BQ                   M^ret_SBQ
ν      20%    10%    5%     2.5%   1%         20%    10%    5%     2.5%   1%
1      0.734  0.847  0.945  1.034  1.143      1.018  1.113  1.198  1.278  1.374
2      0.839  0.941  1.032  1.115  1.219      1.107  1.196  1.277  1.352  1.442
3      0.895  0.993  1.081  1.163  1.260      1.156  1.244  1.321  1.392  1.481
4      0.933  1.029  1.114  1.192  1.287      1.190  1.275  1.350  1.419  1.506
5      0.962  1.056  1.139  1.216  1.307      1.216  1.299  1.372  1.441  1.526
6      0.985  1.077  1.160  1.235  1.323      1.237  1.317  1.388  1.457  1.541
7      1.005  1.095  1.176  1.249  1.338      1.253  1.333  1.404  1.471  1.556
8      1.021  1.110  1.189  1.261  1.349      1.268  1.347  1.418  1.483  1.566

Note: Critical values $\lambda_\alpha$ are reported for the linear boundary in (4). The $\nu$-dimensional Gaussian processes in the limiting distributions are simulated on a grid of 10,000 equidistant points with 100,000 Monte Carlo repetitions. In case of a global structural break test we have $\nu = k$, and in case of a partial structural break test we have $\nu = l$.
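The partial detectors defined above amount to a projection of the multivariate path followed by differencing. The following is a minimal Python sketch of that bookkeeping, assuming the path $Q_{t,T}$ has already been computed and is stored as a list of $k$-vectors indexed by $t = 0, \ldots, T$ with $Q_{0,T} = 0$. The function names and the list-based storage are illustrative choices, not part of the paper.

```python
def is_orthonormal(H, tol=1e-12):
    """Check that the columns of the k x l matrix H (list of rows) are orthonormal."""
    k, l = len(H), len(H[0])
    for a in range(l):
        for b in range(l):
            dot = sum(H[i][a] * H[i][b] for i in range(k))
            target = 1.0 if a == b else 0.0
            if abs(dot - target) > tol:
                return False
    return True


def partial_detector(Q_path, H):
    """Return the projected path H'Q_{t,T} for t = 0..T as a list of l-vectors."""
    k, l = len(H), len(H[0])
    return [[sum(H[i][j] * q[i] for i in range(k)) for j in range(l)]
            for q in Q_path]


def backward(Q_tilde, t):
    """Partial backward detector: Q~_{T,T} - Q~_{t-1,T} for 1 <= t <= T."""
    return [a - b for a, b in zip(Q_tilde[-1], Q_tilde[t - 1])]


def stacked_backward(Q_tilde, s, t):
    """Partial stacked backward detector: Q~_{t,T} - Q~_{s-1,T} for 1 <= s <= t."""
    return [a - b for a, b in zip(Q_tilde[t], Q_tilde[s - 1])]


if __name__ == "__main__":
    # Intercept-only test in a model with k = 2 regressors: H = (1, 0)'.
    H = [[1.0], [0.0]]
    Q_path = [[0.0, 0.0], [0.5, -0.2], [0.7, 0.1], [1.0, 0.4]]  # made-up numbers
    Q_tilde = partial_detector(Q_path, H)
    print(is_orthonormal(H), Q_tilde, backward(Q_tilde, 2))
```

With $H = (1, 0, \ldots, 0)'$ the projection simply selects the first component of $Q_{t,T}$, so the resulting maximum statistics would be compared with the $\nu = 1$ row of the critical value tables.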
Under the conditions of Theorem 1, it follows that (cid:101) Q (cid:98) rT (cid:99) ,T ⇒ H (cid:48) W ( r ) + H (cid:48) C / h ( r ),where H (cid:48) C / h ( r ) (cid:54) = if H (cid:48) g ( r ) is not constant, Hence, the modified tests have poweragainst all nontrivial alternatives of the form H (cid:48) β t = H (cid:48) β + T − / H (cid:48) g ( t/T ). Tables 1 and 2 present critical values for the retrospective and monitoring detectors usingthe linear boundary (4). Empirical sizes for the retrospective case are shown in Table 3.The tests have only minor size distortions in finite samples. The empirical powers of theretrospective tests are compared with that of the sup-Wald test of Andrews (1993). Thesup-Wald statistic is given by max r ∈ [ r , − r ] T · S − S ( r ) − S ( r ) r (1 − r ) , where S is the OLS residual sum of squares using observations { , . . . , T } , S ( r ) is theOLS residual sum of squares using observations { , . . . , (cid:98) rT (cid:99)} , and S ( r ) is the OLS residualsum of squares using observations {(cid:98) rT (cid:99) + 1 , . . . , T } . 
The parameter $r_0$ defines the lower

Table 2: Asymptotic critical values for $M^{mon}_{SBQ,m}$

        ν = 1                 ν = 2                 ν = 3                 ν = 4
m       10%    5%     1%      10%    5%     1%      10%    5%     1%      10%    5%     1%
1.2     0.782  0.859  1.024   0.859  0.935  1.092   0.902  0.975  1.129   0.932  1.003  1.152
1.4     0.941  1.030  1.208   1.028  1.111  1.277   1.076  1.156  1.320   1.108  1.185  1.345
1.6     1.026  1.113  1.292   1.111  1.192  1.365   1.158  1.238  1.406   1.189  1.269  1.432
1.8     1.077  1.162  1.344   1.161  1.244  1.411   1.208  1.286  1.452   1.240  1.317  1.476
2       1.113  1.198  1.374   1.196  1.277  1.442   1.244  1.321  1.481   1.275  1.350  1.506
3       1.211  1.293  1.462   1.291  1.366  1.524   1.334  1.407  1.558   1.363  1.436  1.582
4       1.262  1.339  1.500   1.336  1.410  1.564   1.378  1.450  1.599   1.407  1.478  1.621
6       1.316  1.390  1.544   1.387  1.460  1.606   1.428  1.496  1.638   1.456  1.522  1.660
8       1.346  1.419  1.569   1.417  1.486  1.629   1.456  1.522  1.661   1.483  1.548  1.686
10      1.367  1.440  1.588   1.437  1.503  1.644   1.475  1.540  1.677   1.500  1.565  1.703
∞

        ν = 5                 ν = 6                 ν = 7                 ν = 8
m       10%    5%     1%      10%    5%     1%      10%    5%     1%      10%    5%     1%
1.2     0.954  1.023  1.170   0.972  1.041  1.186   0.987  1.054  1.198   1.000  1.065  1.206
1.4     1.133  1.208  1.366   1.152  1.225  1.381   1.167  1.241  1.396   1.181  1.253  1.409
1.6     1.214  1.293  1.452   1.235  1.311  1.466   1.251  1.325  1.477   1.265  1.339  1.488
1.8     1.265  1.340  1.496   1.283  1.357  1.511   1.300  1.372  1.525   1.315  1.385  1.537
2       1.299  1.372  1.526   1.317  1.388  1.541   1.333  1.404  1.556   1.347  1.418  1.566
3       1.386  1.457  1.601   1.404  1.472  1.615   1.420  1.487  1.629   1.433  1.500  1.640
4       1.429  1.497  1.638   1.446  1.513  1.651   1.461  1.527  1.665   1.473  1.539  1.679
6       1.476  1.541  1.680   1.492  1.557  1.696   1.507  1.571  1.709   1.519  1.583  1.718
8       1.503  1.567  1.706   1.519  1.582  1.718   1.533  1.596  1.728   1.545  1.607  1.739
10      1.520  1.584  1.718   1.536  1.599  1.732   1.551  1.612  1.744   1.562  1.623  1.752
∞

Note: Critical values $\lambda_\alpha$ are reported using the linear boundary (4). The $\nu$-dimensional Gaussian processes in the limiting distributions are simulated on a grid of 10,000 equidistant points with 100,000 Monte Carlo repetitions.
In case of a globalstructural break test we have ν = k , and in case of a partial structural break test we have ν = l . The critical value for m = ∞ corresponds to the right-hand side process of equation (6). k = 1 k = 2 k = 3 k = 4 T 100 200 500 100 200 500 100 200 500 100 200 500 M ret Q M ret BQ M ret SBQ Note: Simulated rejection rates under H are presented in percentage points. The values are obtainedfrom 100,000 Monte Carlo repetitions using the critical values from Table 1 for the linear boundarywith α = 5%. The cases k = 1 , . . . , y t = β + u t , y t = β + β x t + u t , y t = β + β x t + β x t + u t , and y t = β + β x t + β x t + β x t + u t , respectively, where x t , x t , x t , and u t are simulated independently as standard normal random variables for all t = 1 , . . . , T . Table 4: Size-adjusted powers of the retrospective tests Model (14) ( k = 1) Model (15) ( k = 2) M ret Q M ret BQ M ret SBQ supW M ret Q M ret BQ M ret SBQ supW τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . τ ∗ = 0 . Note: Simulated size-adjusted rejection rates under models (14) and (15) are presented in percentagepoints for a significance level of 5% and a sample size of T = 100, where supW denotes the sup-Waldtest with r = 0 . 15. The values are obtained from 100,000 Monte Carlo repetitions for a sample sizeof T = 100, while the linear boundary (4) is implemented. and upper trimming parameters. In the subsequent simulations, we consider r = 0 . r ∈ [ r , − r ] B ( r ) (cid:48) B ( r ) / ( r (1 − r )), and critical values for different values of r and k are tabulated in Andrews (1993), where it is also shown that the sup-Wald test hasweak optimality properties. In the case of a single structural break, its local power curveapproaches the power curve from the infeasible point optimal maximum likelihood testasymptotically, as the significance level tends to zero. 
Note that the sup-Wald statistic is not suitable for monitoring, since its numerator statistic $T(S_0 - S_1(t/T) - S_2(t/T))$ is not measurable with respect to the filtration of information at time $t$.

We illustrate the finite sample performance for a simple model with $k = 1$ and a break in the mean, which is given by
$$y_t = \mu_t + u_t, \quad \mu_t = 2 + 0.\ \cdot 1_{\{t \geq \tau^* T\}}, \quad u_t \overset{iid}{\sim} N(0,1), \tag{14}$$
and for a univariate linear regression model with a break in the slope coefficient, which is given by
$$y_t = \mu_t + \beta_t x_t + u_t, \quad \mu_t = 2, \quad \beta_t = 1 + 0.\ \cdot 1_{\{t \geq \tau^* T\}}, \quad x_t, u_t \overset{iid}{\sim} N(0,1), \tag{15}$$
where $t = 1, \ldots, T$. Table 4 presents the size-adjusted power results.

First, we observe that the backward CUSUM and the stacked backward CUSUM outperform the forward CUSUM, except for the case $\tau^* = 0.1$. Second, while the forward CUSUM test has much lower power than the sup-Wald test, the reversed order cumulation structure in the backward CUSUM seems to compensate for this weakness of the forward CUSUM test. The backward CUSUM performs as well as the sup-Wald test, which is remarkable since, as discussed previously, the latter test has weak optimality properties. Finally, while the sup-Wald statistic and the backward CUSUM detector are not suitable for monitoring, the stacked backward CUSUM test is much more powerful than the forward CUSUM test, and its detector statistic is therefore well suited for real-time monitoring.

In order to evaluate the finite sample performances of the monitoring detectors, we consider models (14) and (15) for the time points $t = T+1, \ldots, \lfloor mT \rfloor$. We simulate the series up to the fixed endpoints m ∈ { . , , , }, while the critical values for the case $m = \infty$ are implemented (see Table 2). For $M^{mon}_{Q,\infty}$ with the linear boundary, the 5% critical values are given by 0.957 for $k = 1$ and 1.044 for $k = 2$. Table 5 presents the empirical sizes.
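To make the retrospective comparison concrete, the following Python sketch simulates a mean-shift series of the form (14) and evaluates the forward, backward, and stacked backward maximum statistics for the intercept-only case $k = 1$. Several ingredients are assumptions rather than statements from this section: the linear boundary (4) is taken to be $d(r) = 1 + 2r$, the scale $\widehat{\sigma}$ is estimated by the sample standard deviation of the recursive residuals, the discrete boundary arguments $t/T$, $(T-t+1)/T$, and $(t-s+1)/T$ are one natural discretization of the limiting expressions, and the break size `delta` is a free parameter whose value below is purely illustrative.

```python
import math
import random


def simulate_mean_shift(T, tau, delta, rng):
    """Draw y_t = 2 + delta * 1{t >= tau*T} + u_t with u_t ~ N(0, 1), t = 1..T."""
    return [2.0 + (delta if t >= tau * T else 0.0) + rng.gauss(0.0, 1.0)
            for t in range(1, T + 1)]


def recursive_residuals(y):
    """Standardized one-step-ahead forecast errors for an intercept-only model.

    The first entry is set to 0 because no forecast exists for t = 1.
    """
    w = [0.0]
    running_sum = y[0]
    for t in range(2, len(y) + 1):
        mean_prev = running_sum / (t - 1)        # OLS estimate from y_1..y_{t-1}
        f_t = math.sqrt(1.0 + 1.0 / (t - 1))     # forecast error scale
        w.append((y[t - 1] - mean_prev) / f_t)
        running_sum += y[t - 1]
    return w


def forward_detector(w):
    """Return Q[0..T] with Q[t] = sum_{j<=t} w_j / (sigma_hat * sqrt(T))."""
    T = len(w)
    resid = w[1:]
    mean_w = sum(resid) / len(resid)
    # Assumed variance estimator: sample variance of the recursive residuals.
    sigma = math.sqrt(sum((v - mean_w) ** 2 for v in resid) / (len(resid) - 1))
    Q = [0.0]
    for v in w:
        Q.append(Q[-1] + v / (sigma * math.sqrt(T)))
    return Q


def d(r):
    """Linear boundary, assumed here to be d(r) = 1 + 2r."""
    return 1.0 + 2.0 * r


def max_statistics(Q):
    """Forward, backward, and stacked backward maximum statistics (k = 1)."""
    T = len(Q) - 1
    m_q = max(abs(Q[t]) / d(t / T) for t in range(1, T + 1))
    m_bq = max(abs(Q[T] - Q[t - 1]) / d((T - t + 1) / T)
               for t in range(1, T + 1))
    m_sbq = max(abs(Q[t] - Q[s - 1]) / d((t - s + 1) / T)
                for t in range(1, T + 1) for s in range(1, t + 1))
    return m_q, m_bq, m_sbq


if __name__ == "__main__":
    rng = random.Random(1)
    y = simulate_mean_shift(T=100, tau=0.7, delta=0.8, rng=rng)  # delta is illustrative
    print(max_statistics(forward_detector(recursive_residuals(y))))
```

For $\nu = 1$ and a 5% level, the forward and backward statistics would be compared with 0.945 and the stacked backward statistic with 1.198, the values reported in Table 1. By construction the stacked statistic is never smaller than the other two, since it includes them as the special cases $s = 1$ and $t = T$, which is why it needs the larger critical value.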
Note that the tests are undersized by construction, as not all of the size is used up to the time point $mT$. For $k \geq 2$, we observe some size distortions for small sample sizes. The results in Table 6 show that the mean delay for the stacked backward CUSUM is much lower than that of the forward CUSUM and is almost constant across the breakpoint locations.

Table 5: Empirical sizes of the infinite horizon monitoring detectors

                     k = 1                              k = 2
            T = 100          T = 500          T = 100      T = 200      T = 500
horizon     SBQ   Q    CSW   SBQ   Q    CSW   SBQ   Q      SBQ   Q      SBQ   Q
m = 1.
m = 2       0.2   4.2  0.1   0.2   4.4  0.1   1.4   6.6    0.7   5.5    0.4   4.8
m = 4       1.0   4.7  0.9   0.9   4.8  0.8   4.8   7.3    2.5   6.0    1.4   5.2
m = 6       1.7   4.7  1.6   1.4   4.8  1.4   7.7   7.4    4.1   6.0    2.3   5.2
m = 8       2.4   4.7  2.0   2.0   4.8  1.8   10.3  7.4    5.7   6.0    3.3   5.2
m = 10      3.1   4.7  2.3   2.7   4.8  2.0   12.7  7.4    7.2   6.0    4.3   5.2

Note: Simulated rejection rates under $H_0$ are presented in percentage points. The linear boundary (4) is implemented, while critical values for $\alpha = 5\%$ and $m = \infty$ are considered. The values are obtained from 100,000 random draws of the models $y_t = \beta_1 + u_t$ and $y_t = \beta_1 + \beta_2 x_t + u_t$ for $t = 1, \ldots, \lfloor mT \rfloor$, where $x_t$ and $u_t$ are i.i.d. and standard normal. While SBQ and Q correspond to the stacked backward CUSUM and the forward CUSUM with critical values for the case $m = \infty$, the univariate test by Chu et al. (1996) using the radical boundary (5) is denoted by CSW.

Table 6: Empirical mean detection delays of the monitoring detectors

            Model (14)               Model (15)
            SBQ    Q      CSW        SBQ    Q
τ* = 1.
τ* = 2      38.4   59.4   60.1       57.7   77.0
τ* = 2.
τ* = 3      36.0   99.1   71.1       52.4   129.6
τ* = 5      34.5   178.0  89.4       48.1   233.6
τ* = 10     33.5   374.6  124.2      45.7   487.8

Note: The empirical mean detection delays are obtained from 100,000 Monte Carlo repetitions using size-adjusted critical values for a significance level of 5%, where models (14) and (15) are simulated for $t = 1, \ldots, \lfloor mT \rfloor$ with $T = 100$ and $m = 20$.
While SBQ and Q correspond to the stacked backward CUSUM and the forward CUSUM with the linear boundary (4) and with critical values for the case $m = \infty$, the univariate test by Chu et al. (1996) with the radical boundary (5) is denoted by CSW.

To compare the breakpoint estimator (11) with its maximum likelihood benchmark (10), we present Monte Carlo simulation results for model (14) for the bias and the mean squared error (MSE) in Table 7. If the break $\tau^*$ is located after 85% of the sample, the estimator based on backwardly cumulated recursive residuals has a much lower bias and MSE than the maximum likelihood estimator, which is due to the fact that the post-break sample consists of too few observations for an accurate maximum likelihood estimation.

Table 7: Bias and MSE of breakpoint estimators

            T = 100                      T = 200
            Bias         MSE             Bias         MSE
τ*          ML     BQ    ML     BQ       ML     BQ    ML     BQ
0.5         0.000

Note: The Bias and MSE results for the break date estimators (10) and (11) are obtained from 100,000 Monte Carlo repetitions, where model (14) is simulated for $t = 1, \ldots, T$. ML denotes the maximum likelihood estimator $\widehat{\tau}^{ret}_{ML}$ and BQ denotes the estimator $\widehat{\tau}^{ret}$, which is based on backwardly cumulated recursive residuals.

In this paper we propose two alternatives to the conventional CUSUM detectors by Brown et al. (1975) and Chu et al. (1996). It has been demonstrated that cumulating the recursive residuals backwardly results in much higher power than using forwardly cumulated recursive residuals, in particular if the break is located at the end of the sample. Accordingly, the backward scheme is especially attractive for on-line monitoring. To this end, the stacked triangular array of backwardly cumulated recursive residuals is employed, and we find that this approach yields a much lower detection delay than the monitoring procedure by Chu et al. (1996).
Due to the multivariate nature of our tests, they also have power against structural breaks that do not affect the unconditional mean of the dependent variable. We also suggest a new estimator for the break date based on backwardly cumulated recursive residuals. This estimator outperforms the conventional estimator constructed by the sum of squared residuals whenever the break occurs close to the end of the sample, which is the relevant scenario for on-line monitoring.

Acknowledgements

We are thankful to Holger Dette, Josua Gösmann and Dominik Wied for very helpful comments and suggestions. Further, we would like to thank the participants of the RMSE meeting 2018 in Vallendar, the econometrics research seminar at the UC3M in Madrid, and the DAGStat Conference 2019 in Munich.

Appendix: Proofs

We first present some auxiliary lemmas which we require for the proofs of Theorems 1 and 2.

Lemma 1. Under Assumption 1, there exists a $k$-dimensional standard Brownian motion $W(r)$, such that the following statements hold true:
(a) For any fixed $m < \infty$, as $T \to \infty$,
$$\frac{1}{\sqrt{T}} \sum_{t=1}^{\lfloor rT \rfloor} x_t u_t \Rightarrow \sigma C^{1/2} W(r), \quad r \in [0, m].$$
(b)
$$\lim_{t \to \infty} \frac{\big\| \sum_{j=1}^{t} x_j u_j - \sigma C^{1/2} W(t) \big\|}{\sqrt{t}} = 0 \quad \text{(a.s.)}.$$

Proof. For (a), note that a direct consequence of the functional central limit theorem for multiple time series on the space $D([0,1])^k$ given by Theorem 2.1 in Phillips and Durlauf (1986) is that $M^{-1/2} \sum_{t=1}^{\lfloor sM \rfloor} x_t u_t \Rightarrow \sigma C^{1/2} W(s)$, $s \in [0,1]$, as $M \to \infty$ (see also Lemma 3 in Krämer et al. 1988). Then, with $M = mT$, on the space $D([0,m])^k$,
$$\frac{1}{\sqrt{T}} \sum_{t=1}^{\lfloor rT \rfloor} x_t u_t = \frac{\sqrt{m}}{\sqrt{M}} \sum_{t=1}^{\lfloor (r/m) M \rfloor} x_t u_t \Rightarrow \sqrt{m}\, \sigma C^{1/2} W(r/m) \overset{d}{=} \sigma C^{1/2} W(r), \quad r \in [0, m].$$
To show (b), note that $\{x_t u_t\}_{t \in \mathbb{N}}$ is a stationary and ergodic martingale difference sequence with $E[x_t u_t] = 0$ and $E[(x_t u_t)(x_t u_t)'] = \sigma^2 C$. We apply the strong invariance principle given by Theorem 3 in Wu et al. (2007).
Then,lim t →∞ (cid:107) σ − C − / (cid:80) tj =1 x j u j − W ( t ) (cid:107) t /q (cid:112) ln( t )(ln(ln( t ))) / < ∞ , (a.s.) , where q = min { κ, } (see also Strassen 1967), and the assertion follows from the fact thatlim t →∞ t /q (cid:112) ln( t )(ln(ln( t ))) / / √ t = 0. Lemma 2. Let { ( x t , u t ) } t ∈ N satisfy Assumption 1, let β t = β for all t ∈ N , and let m ∈ (0 , ∞ ) . Let X t = (cid:80) tj =1 x j w j , Y t = (cid:80) tj =1 x j u j , and Z t = (cid:80) t − j =1 (cid:80) ji =1 j − x i u i . Then,as T → ∞ , sup ≤ t ≤ mT (cid:107) X t − ( Y t − Z t ) (cid:107)√ T = o P (1) , and sup T 1) = O P (1), as T → ∞ , and let a t = t − / (cid:80) tj =1 a j x j u j , where (cid:107) a T (cid:107) = O P (1). Furthermore, note that j − / − ( j + 1) − / < j − / . Then, (cid:101) Y t − Y t = t (cid:88) j =1 ( a j x j u j ) j − / = a t + t − (cid:88) j =1 (cid:16) a j j / (cid:2) j − / − ( j + 1) − / (cid:3)(cid:17) < a t + t − (cid:88) j =1 j a j , which implies thatsup ≤ t ≤ mT (cid:107) (cid:101) Y t − Y t (cid:107)√ T < sup ≤ t ≤ mT (cid:16) (cid:107) a t (cid:107)√ T + mT / t − (cid:88) j =1 (cid:107) a j (cid:107) j / (cid:17) = o P (1) , and sup T Let W ( r ) be a k -dimensional standard Brownian motion and let B ( r ) be a k -dimensional standard Brownian bridge. Then,(a) W ( r ) − (cid:82) r z − W ( z ) d z d = W ( r ) , for r ≥ ,(b) W ( r/ (1 − r )) d = B ( r ) / (1 − r ) , for r ∈ (0 , .Proof. Let W j ( r ) and B j ( r ) be the j -th component of W ( r ) and B ( r ), respectively. Weshow the identities for each j = 1 , . . . , k , separately. Using Cauchy-Schwarz and Jensen’s33nequalities, we obtain (cid:82) r z − E [ | W j ( z ) | ] d z < ∞ as well as (cid:82) r z − E [ | W j ( r ) W j ( z ) | ] d z < ∞ ,which justifies the application of Fubini’s theorem in the subsequent steps. Since both W j ( r ) and F ( W j ( r )) = W j ( r ) − (cid:82) r z − W j ( z ) d z are Gaussian with zero mean, it remainsto show that their covariance functions coincide. 
Let w.l.o.g. $r \leq s$. Then,
$$\begin{aligned}
& E[F(W_j(r)) F(W_j(s))] - E[W_j(r) W_j(s)] \\
&= \int_0^r \int_0^s \frac{E[W_j(z_1) W_j(z_2)]}{z_1 z_2} \, dz_2 \, dz_1 - \int_0^s \frac{E[W_j(r) W_j(z)]}{z} \, dz - \int_0^r \frac{E[W_j(s) W_j(z)]}{z} \, dz \\
&= \big(2r + r \ln(s) - r \ln(r)\big) - \big(r + r \ln(s) - r \ln(r)\big) - r = 0,
\end{aligned}$$
and (a) has been shown. The second result follows from the fact that both processes are Gaussian with zero mean and
$$E\left[ \frac{B_j(r)}{1-r} \frac{B_j(s)}{1-s} \right] = \frac{\min\{ r(1-s), s(1-r) \}}{(1-r)(1-s)} = \min\left\{ \frac{r}{1-r}, \frac{s}{1-s} \right\} = E\left[ W_j\Big(\frac{r}{1-r}\Big) W_j\Big(\frac{s}{1-s}\Big) \right].$$

Lemma 4. Let $\{(x_t, u_t)\}_{t \in \mathbb{N}}$ satisfy Assumption 1, let $\beta_t = \beta_0$ for all $t \in \mathbb{N}$, and let $m \in (0, \infty)$. Then, as $T \to \infty$,
$$\frac{1}{\sqrt{T}} \sum_{t=1}^{\lfloor rT \rfloor} x_t w_t \Rightarrow \sigma C^{1/2} W(r), \quad r \in [0, m],$$
where $W(r)$ is a $k$-dimensional standard Brownian motion.

Proof. From Lemma 2, we have $\sup_{r \in [0,m]} T^{-1/2} \| X_{\lfloor rT \rfloor} - (Y_{\lfloor rT \rfloor} - Z_{\lfloor rT \rfloor}) \| = o_P(1)$. Let $F(Y_{\lfloor rT \rfloor}) = Y_{\lfloor rT \rfloor} - \int_0^r z^{-1} Y_{\lfloor zT \rfloor} \, dz$. Then, $\lim_{T \to \infty} \| (Y_{\lfloor rT \rfloor} - Z_{\lfloor rT \rfloor}) - F(Y_{\lfloor rT \rfloor}) \| = 0$, and $\sup_{r \in [0,m]} \| T^{-1/2} X_{\lfloor rT \rfloor} - F(T^{-1/2} Y_{\lfloor rT \rfloor}) \| = o_P(1)$. Lemma 1(a) and the continuous mapping theorem imply $F(T^{-1/2} Y_{\lfloor rT \rfloor}) \Rightarrow F(\sigma C^{1/2} W(r)) = \sigma C^{1/2} F(W(r))$. Furthermore, from Lemma 3, it follows that $F(W(r)) \overset{d}{=} W(r)$. Consequently, $T^{-1/2} X_{\lfloor rT \rfloor} \Rightarrow \sigma C^{1/2} W(r)$.

Lemma 5. Let $\| \cdot \|_M$ be the induced matrix norm of $\| \cdot \|$. Let $h$ be an $\mathbb{R}^k$-valued function of bounded variation, and let $\{A_t\}_{t \in \mathbb{N}}$ be a sequence of random $(k \times k)$ matrices with $\sup_{r \in [0,m]} \| T^{-1} \sum_{t=1}^{\lfloor rT \rfloor} (A_t - A) \|_M = o_P(1)$, where $m \in (0, \infty)$.
Then, as $T \to \infty$,
$$\sup_{r \in [0,m]} \Big\| \frac{1}{T} \sum_{t=1}^{\lfloor rT \rfloor} (A_t - A)\, h\big(\tfrac{t}{T}\big) \Big\| = o_P(1).$$

Proof. By the application of Abel's formula of summation by parts, which is given in (18), it follows that
$$\sum_{t=1}^{\lfloor rT \rfloor} (A_t - A)\, h\big(\tfrac{t}{T}\big) = \sum_{t=1}^{\lfloor rT \rfloor} (A_t - A)\, h\big(\tfrac{\lfloor rT \rfloor}{T}\big) + \sum_{t=1}^{\lfloor rT \rfloor - 1} \sum_{j=1}^{t} (A_j - A) \Big( h\big(\tfrac{t}{T}\big) - h\big(\tfrac{t+1}{T}\big) \Big).$$
The fact that $h(r)$ is of bounded variation yields
$$\sup_{r \in [0,m]} \| h(r) \| = O(1), \qquad \sup_{r \in [0,m]} \Big\| \sum_{t=1}^{\lfloor rT \rfloor - 1} \frac{t}{T} \Big( h\big(\tfrac{t}{T}\big) - h\big(\tfrac{t+1}{T}\big) \Big) \Big\| = O(1).$$
Consequently,
$$\sup_{r \in [0,m]} \Big\| \frac{1}{T} \sum_{t=1}^{\lfloor rT \rfloor} (A_t - A)\, h\big(\tfrac{\lfloor rT \rfloor}{T}\big) \Big\| \leq \sup_{r \in [0,m]} \Big\| \frac{1}{T} \sum_{t=1}^{\lfloor rT \rfloor} (A_t - A) \Big\|_M \Big\| h\big(\tfrac{\lfloor rT \rfloor}{T}\big) \Big\| = o_P(1)$$
and
$$\sup_{r \in [0,m]} \Big\| \frac{1}{T} \sum_{t=1}^{\lfloor rT \rfloor - 1} \sum_{j=1}^{t} (A_j - A) \Big( h\big(\tfrac{t}{T}\big) - h\big(\tfrac{t+1}{T}\big) \Big) \Big\| \leq \sup_{r \in [0,m]} \sum_{t=1}^{\lfloor rT \rfloor - 1} \frac{t}{T} \Big\| \frac{1}{t} \sum_{j=1}^{t} (A_j - A) \Big\|_M \Big\| h\big(\tfrac{t}{T}\big) - h\big(\tfrac{t+1}{T}\big) \Big\| = o_P(1).$$
Then, by the triangle inequality, the assertion follows.

Proof of Theorem 1

Let $w_t^* = f_t^{-1} (y_t^* - x_t' \widehat{\beta}_{t-1}^*)$, which are recursive residuals from a regression without any structural break, where $f_t = (1 + (t-1)^{-1} x_t' C_{t-1}^{-1} x_t)^{1/2}$, $y_t^* = x_t' \beta_0 + u_t$, and
$$\widehat{\beta}_{t-1}^* = \Big( \sum_{j=1}^{t-1} x_j x_j' \Big)^{-1} \Big( \sum_{j=1}^{t-1} x_j y_j^* \Big).$$
Then, y t = x (cid:48) t β t + u t = y ∗ t + T − / x (cid:48) t g ( t/T ), and (cid:98) β t − = (cid:98) β ∗ t − + 1 √ T ( t − C − t − t − (cid:88) j =1 x j x (cid:48) j g ( j/T ) . w t = w ∗ t + f − t T − / x (cid:48) t g ( t/T ) − f − t T − / ( t − − C − t − (cid:80) t − j =1 x j x (cid:48) j g ( j/T ). Wecan decompose the partial sum process as T − / (cid:80) (cid:98) rT (cid:99) t =1 x t w t = S ,T ( r ) + S ,T ( r ) + S ,T ( r ),where S ,T ( r ) = 1 √ T (cid:98) rT (cid:99) (cid:88) t =1 x t w ∗ t , S ,T ( r ) = 1 T (cid:98) rT (cid:99) (cid:88) t =1 f − t x t x (cid:48) t g ( tT ) , (19) S ,T ( r ) = − T (cid:98) rT (cid:99) (cid:88) t =1 f t ( t − x t x (cid:48) t C − t − t − (cid:88) j =1 x j x (cid:48) j g ( jT ) . (20)Let (cid:107) · (cid:107) M be the induced matrix norm of (cid:107) · (cid:107) . Lemma 4 yields S ,T ( r ) ⇒ σ C / W ( r ). Forthe second term, note that, from Assumption 1(a) and the fact that √ T ( f − T − 1) = O P (1),it follows that sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) T (cid:98) rT (cid:99) (cid:88) t =1 ( f − t x t x (cid:48) t − C ) (cid:13)(cid:13)(cid:13) M = o P (1) . (21)Since g ( r ) is piecewise constant and therefore of bounded variation, Lemma 5 yieldssup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) S ( r ) − (cid:90) r C g ( s ) d s (cid:13)(cid:13)(cid:13) = sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) T (cid:98) rT (cid:99) (cid:88) t =1 ( f − t x t x (cid:48) t − C ) g ( tT ) (cid:13)(cid:13)(cid:13) = o P (1) . (22)For the third term, let p ( r ) = 1 (cid:98) rT (cid:99) C − (cid:98) rT (cid:99) (cid:98) rT (cid:99) (cid:88) j =1 x j x (cid:48) j g ( jT ) , p ( r ) = 1 (cid:98) rT (cid:99) C − (cid:98) rT (cid:99) (cid:98) rT (cid:99) (cid:88) j =1 C g ( jT ) , p ( r ) = 1 (cid:98) rT (cid:99) (cid:98) rT (cid:99) (cid:88) j =1 g ( jT ) . From Assumption 1(a), it follows that sup r ∈ [0 ,m ] (cid:107) p ( r ) − p ( r ) (cid:107) M = o P (1). 
Furthermore,from Lemma 5 and from the fact that sup r ∈ [0 ,m ] (cid:107) (cid:98) rT (cid:99) (cid:80) (cid:98) rT (cid:99) t =1 ( x t x (cid:48) t − C ) (cid:107) M = o P (1), itfollows that sup r ∈ [0 ,m ] (cid:107) p ( r ) − p ( r ) (cid:107) = o P (1). Thus, sup r ∈ [0 ,m ] (cid:107) p ( r ) − p ( r ) (cid:107) = o P (1).Consequently, sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) S ,T ( r ) + 1 T (cid:98) rT (cid:99) (cid:88) t =1 f − t x t x (cid:48) t h ( t − T ) (cid:13)(cid:13)(cid:13) ≤ sup r ∈ [0 ,m ] T (cid:98) rT (cid:99) (cid:88) t =1 (cid:107) f − t x t x (cid:48) t (cid:107) M (cid:107) p ( t − T ) − p ( t − T ) (cid:107) , (23)36hich is o P (1). Since p is a partial sum of a piecewise constant function, it is of boundedvariation, and, together with (21), we can apply Lemma 5. Then,sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) T (cid:98) rT (cid:99) (cid:88) t =1 ( f − t x t x (cid:48) t − C ) p ( t − T ) (cid:13)(cid:13)(cid:13) = o P (1) , which yields sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) S ,T ( r ) + (cid:90) r (cid:90) s s C g ( v ) d v d s (cid:13)(cid:13)(cid:13) = sup r ∈ [0 ,m ] (cid:13)(cid:13)(cid:13) S ,T ( r ) + 1 T C (cid:98) rT (cid:99) (cid:88) t =1 p ( t − T ) (cid:13)(cid:13)(cid:13) + o P (1) = o P (1) . Finally, Slutsky’s theorem implies that S ,T ( r ) + S ,T ( r ) + S ,T ( r ) ⇒ σ C / W ( r ) + σ C h ( r ),which yields Q T ( r ) = (cid:98) σ − C − / T ( S ,T ( r ) + S ,T ( r ) + S ,T ( r )) ⇒ W ( r ) + C / h ( r ) , since (cid:98) σ is consistent for σ (see Kr¨amer et al. 1988). Proof of Theorem 2 Lemma 2 yields sup t ≥ T (cid:107) (cid:80) tj =1 x j w j − (cid:80) tj =1 ( x j u j − j − (cid:80) ji =1 x i u i ) (cid:107)√ t = o P (1) . Let W ( r ) be the k -dimensional standard Brownian motion given by Lemma 1(b). Then, A T = sup t ≥ T (cid:107) (cid:80) tj =1 x j u j − σ C / W ( t ) (cid:107)√ t = o P (1) , Furthermore, (cid:107) (cid:80) tj =1 x t u t − W ( t ) (cid:107) ≤ ξt / − (cid:15) , for some (cid:15) > ξ ,for all t ∈ N . 
It follows thatsup t ≥ T (cid:107) ( (cid:80) tj =1 x j u j − j − (cid:80) ji =1 x i u i ) − σ C / ( W ( t ) − (cid:80) tj =1 j − W ( j )) (cid:107)√ t ≤ A T + sup t ≥ T t (cid:88) j =1 (cid:107) (cid:80) ji =1 x i u i − W ( j ) (cid:107) j √ t ≤ A T + ξ · (cid:16) sup t ≥ T t (cid:88) j =1 j / − (cid:15) j √ t (cid:17) = o P (1) , t ≥ T t (cid:88) j =1 j / − (cid:15) j √ t ≤ sup t ≥ T t (cid:88) j =1 j (cid:15) T (cid:15) ≤ T (cid:15) ∞ (cid:88) j =1 j (cid:15) = o P (1) . Consequently, sup t ≥ T (cid:107) (cid:80) tj =1 x j w j − σ C − / ( W ( t ) − (cid:80) tj =1 j − W ( j )) (cid:107)√ t = o P (1) . From the fact that T − / W ( t ) d = W ( t/T ) it follows that there exists some k -dimensionalstandard Brownian motion W ∗ ( t ), such thatsup r ≥ (cid:107) T − / (cid:80) (cid:98) rT (cid:99) j =1 x j w j − σ C − / ( W ∗ ( r ) − (cid:80) (cid:98) rT (cid:99) j =1 j − W ∗ ( j/T )) √ t = o P (1) . Moreover, from Lemma 3 and the fact that lim T →∞ (cid:80) (cid:98) rT (cid:99) j =1 j − W ∗ ( j/T ) = (cid:82) r z − W ∗ ( z ) d z ,there exists some k -dimensional standard Brownian motion W ∗∗ ( t ), such thatsup r ≥ (cid:107) T − / (cid:80) (cid:98) rT (cid:99) j =1 x j w j − σ C / W ∗∗ ( r ) (cid:107)√ r = o P (1) , and, therefore, sup r ≥ (cid:107) σ − C − / T − / (cid:80) (cid:98) rT (cid:99) j =1 x j w j − W ∗∗ ( r ) (cid:107)√ r = o P (1) . Since ˆ σ is consistent for σ (see Kr¨amer et al. 1988) and { x t } t ∈ N is ergodic, we have (cid:107) ˆ σ − C − / T − σ − C − / (cid:107) M = o P (1) , where (cid:107) · (cid:107) M denotes the matrix norm induced by (cid:107) · (cid:107) . Consequently,sup r ≥ (cid:107) Q T ( r ) − W ∗∗ ( r ) (cid:107)√ r = o P (1) . Proof of Theorem 3 For any fixed m ∈ (1 , ∞ ), Theorem 1 yields Q T ( r ) ⇒ W ( r ), r ∈ [0 , m ], under H . Then,(a) follows with the continuous mapping theorem. 
For (b), the continuous mapping theorem implies that
$$M^{mon}_{Q,m} = \sup_{r \in (1,m)} \frac{\| Q_T(r) - Q_T(1) \|}{d(r-1)} \overset{d}{\longrightarrow} \sup_{r \in (1,m)} \frac{\| W(r) - W(1) \|}{d(r-1)} \overset{d}{=} \sup_{r \in (0,m-1)} \frac{\| W(r) \|}{d(r)}.$$
We transform the supremum to a supremum over a subset of the unit interval. Consider the bijective function $g: (0, (m-1)/m) \to (0, m-1)$ that is given by $g(\eta) = \eta / (1-\eta)$. Furthermore, note that $W(g(\eta)) \overset{d}{=} B(\eta)/(1-\eta)$, which follows from Lemma 3. Consequently,
$$\sup_{r \in (0,m-1)} \frac{\| W(r) \|}{d(r)} = \sup_{\eta \in (0, \frac{m-1}{m})} \frac{\| W(g(\eta)) \|}{d(g(\eta))} \overset{d}{=} \sup_{\eta \in (0, \frac{m-1}{m})} \frac{\| B(\eta) \|}{(1-\eta)\, d\big( \frac{\eta}{1-\eta} \big)}.$$
For the last result, Theorem 2 and Assumption 2 imply
$$\begin{aligned}
& \sup_{r > 1} \frac{\| Q_T(r) - Q_T(1) \|}{d(r-1)} - \sup_{r > 1} \frac{\| W(r) - W(1) \|}{d(r-1)} \\
&\leq \sup_{r > 1} \frac{\| Q_T(r) - Q_T(1) - (W(r) - W(1)) \|}{d(r-1)} \\
&\leq \sup_{r > 1} \frac{\| Q_T(r) - W(r) \|}{d(r-1)} + \sup_{r > 1} \frac{\| Q_T(1) - W(1) \|}{d(r-1)} \\
&\leq \sup_{r > 1} \left( \frac{\| Q_T(r) - W(r) \|}{\sqrt{r}} \cdot \frac{\sqrt{r}}{d(r-1)} \right) + \| Q_T(1) - W(1) \| \cdot \sup_{r > 1} \frac{1}{d(r-1)} \\
&\leq 2 \left( \sup_{r > 1} \frac{\sqrt{r}}{d(r-1)} \right) \cdot \left( \sup_{r \geq 1} \frac{\| Q_T(r) - W(r) \|}{\sqrt{r}} \right) = o_P(1)
\end{aligned}$$
for some $k$-dimensional standard Brownian motion $W(r)$. Then,
$$M^{mon}_{Q,\infty} = \sup_{r \in (1,\infty)} \frac{\| Q_T(r) - Q_T(1) \|}{d(r-1)} \overset{d}{\longrightarrow} \sup_{r \in (1,\infty)} \frac{\| W(r) - W(1) \|}{d(r-1)}.$$
Consider the bijective function $g: (0,1) \to (0,\infty)$ that is given by $g(\eta) = \eta/(1-\eta)$, which yields
$$\sup_{r \in (1,\infty)} \frac{\| W(r) - W(1) \|}{d(r-1)} \overset{d}{=} \sup_{r \in (0,\infty)} \frac{\| W(r) \|}{d(r)} = \sup_{\eta \in (0,1)} \frac{\| W(g(\eta)) \|}{d(g(\eta))} \overset{d}{=} \sup_{\eta \in (0,1)} \frac{\| B(\eta) \|}{(1-\eta)\, d\big( \frac{\eta}{1-\eta} \big)}.$$
Proof of Theorem 4

Theorem 1 and the continuous mapping theorem imply that
$$M^{ret}_{BQ} = \sup_{r \in (0,1)} \frac{\| Q_T(1) - Q_T(r) \|}{d(1-r)} \overset{d}{\longrightarrow} \sup_{r \in (0,1)} \frac{\| W(1) - W(r) \|}{d(1-r)} \overset{d}{=} \sup_{r \in (0,1)} \frac{\| W(r) \|}{d(r)}.$$

Proof of Theorem 5

Analogously to the proof of Theorem 3,
$$M^{ret}_{SBQ} \overset{d}{\longrightarrow} \sup_{r \in (0,1)} \sup_{s \in (0,r)} \frac{\| W(r) - W(s) \|}{d(r-s)}, \qquad M^{mon}_{SBQ,m} \overset{d}{\longrightarrow} \sup_{r \in (1,m)} \sup_{s \in (1,r)} \frac{\| W(r) - W(s) \|}{d(r-s)}$$
follow with Theorem 1 and the continuous mapping theorem. Furthermore, let the function $g: (0, (m-1)/m) \to (0, m-1)$ be given by $g(\eta) = \eta/(1-\eta)$. With Lemma 3(b), we have
$$\begin{aligned}
\sup_{r \in (1,m)} \sup_{s \in (1,r)} \frac{\| W(r) - W(s) \|}{d(r-s)}
&\overset{d}{=} \sup_{r \in (0,m-1)} \sup_{s \in (0,r)} \frac{\| W(r) - W(s) \|}{d(r-s)} \\
&= \sup_{\eta \in (0, \frac{m-1}{m})} \sup_{s \in (0, g(\eta))} \frac{\| W(g(\eta)) - W(s) \|}{d(g(\eta) - s)} \\
&= \sup_{\eta \in (0, \frac{m-1}{m})} \sup_{\zeta \in (0,\eta)} \frac{\| W(g(\eta)) - W(g(\zeta)) \|}{d(g(\eta) - g(\zeta))} \\
&\overset{d}{=} \sup_{\eta \in (0, \frac{m-1}{m})} \sup_{\zeta \in (0,\eta)} \frac{\| B(\eta)/(1-\eta) - B(\zeta)/(1-\zeta) \|}{d\big( \frac{\eta}{1-\eta} - \frac{\zeta}{1-\zeta} \big)} \\
&= \sup_{\eta \in (0, \frac{m-1}{m})} \sup_{\zeta \in (0,\eta)} \frac{\| (1-\zeta) B(\eta) - (1-\eta) B(\zeta) \|}{(1-\eta)(1-\zeta)\, d\big( \frac{\eta - \zeta}{(1-\eta)(1-\zeta)} \big)}.
\end{aligned}$$
Finally, for (c), Theorem 2 and Assumption 2 imply
$$\begin{aligned}
& \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(r) - Q_T(s) \|}{d(r-s)} - \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| W(r) - W(s) \|}{d(r-s)} \\
&\leq \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(r) - Q_T(s) - (W(r) - W(s)) \|}{d(r-s)} \\
&\leq \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(r) - W(r) \|}{d(r-s)} + \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(s) - W(s) \|}{d(r-s)} \\
&\leq \sup_{r \in (1,\infty)} \frac{\| Q_T(r) - W(r) \|}{d(r-1)} + \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(s) - W(s) \|}{d(r-1)} \\
&\leq 2 \left( \sup_{r \in (1,\infty)} \frac{\sqrt{r}}{d(r-1)} \right) \cdot \left( \sup_{r \in (1,\infty)} \frac{\| Q_T(r) - W(r) \|}{\sqrt{r}} \right) = o_P(1)
\end{aligned}$$
for some $k$-dimensional standard Brownian motion $W(r)$. Then,
$$M^{mon}_{SBQ,\infty} = \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| Q_T(r) - Q_T(s) \|}{d(r-s)} \overset{d}{\longrightarrow} \sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| W(r) - W(s) \|}{d(r-s)}.$$
Consider now the bijective function $g: (0,1) \to (0,\infty)$ that is given by $g(\eta) = \eta/(1-\eta)$. Analogously to the derivations above, we obtain
$$\begin{aligned}
\sup_{r \in (1,\infty)} \sup_{s \in (1,r)} \frac{\| W(r) - W(s) \|}{d(r-s)}
&\overset{d}{=} \sup_{r \in (0,\infty)} \sup_{s \in (0,r)} \frac{\| W(r) - W(s) \|}{d(r-s)} \\
&= \sup_{\eta \in (0,1)} \sup_{\zeta \in (0,\eta)} \frac{\| W(g(\eta)) - W(g(\zeta)) \|}{d(g(\eta) - g(\zeta))} \\
&\overset{d}{=} \sup_{\eta \in (0,1)} \sup_{\zeta \in (0,\eta)} \frac{\| (1-\zeta) B(\eta) - (1-\eta) B(\zeta) \|}{(1-\eta)(1-\zeta)\, d\big( \frac{\eta - \zeta}{(1-\eta)(1-\zeta)} \big)}.
\end{aligned}$$

Proof of Theorem 6

Adopting the notation of the local break in Theorem 1, we have $\beta_t = \beta_0 + T^{-1/2} g(t/T)$ with $g(t/T) = T^{1/2} \delta \, 1_{\{t \geq T^*\}}$. Unlike in Theorem 1, the alternative does not converge to the null as the sample size grows.
Following equations (19)–(23), we have1 T (cid:98) rT (cid:99) (cid:88) t =1 x t w t = 1 T / (cid:0) S ,T ( r ) + S ,T ( r ) + S ,T ( r ) (cid:1) , where sup r ∈ [0 , (cid:107) T − / S ,T ( r ) (cid:107) = o P (1), andsup r ∈ [0 , (cid:13)(cid:13)(cid:13)(cid:13) S ,T ( r ) + S ,T ( r ) − C (cid:16) (cid:90) r g ∗ ( z ) d z − (cid:90) r (cid:90) z z g ∗ ( v ) d v d z (cid:17)(cid:13)(cid:13)(cid:13)(cid:13) = o P (1) , where g ∗ ( r ) = δ { r ≥ τ ∗ } . Note that (cid:90) r g ∗ ( z ) d z − (cid:90) r (cid:90) z z g ∗ ( v ) d v d z = δ (cid:90) r (cid:16) { s ≥ τ ∗ } − (cid:90) s s { v ≥ τ ∗ } (cid:17) d s = δ (cid:90) rτ ∗ (cid:16) − s − τ ∗ s (cid:17) d s = δ (cid:90) rτ ∗ s d s = τ ∗ δ (cid:0) ln( r ) − ln( τ ∗ ) (cid:1) { r ≥ τ ∗ } , which implies that σT − / Q T ( r ) ⇒ τ ∗ C / δ (cid:0) ln( r ) − ln( τ ∗ ) (cid:1) { r ≥ τ ∗ } . Then, (cid:98) τ ret = 1 T · argmax ≤ t ≤ T (cid:13)(cid:13)(cid:13) (cid:98) σ √ T √ T − t + 1 (cid:0) Q T (1) − Q T ( t +1 T ) (cid:1)(cid:13)(cid:13)(cid:13) , (cid:98) τ mon = 1 T · argmax T Journal of Time Series Econometrics , 10:1941–1928.Andrews, D. W. (1993). Tests for parameter instability and structural change with unknownchange point. Econometrica , 61:821–856.Aue, A., Horv´ath, L., Huˇskov´a, M., and Kokoszka, P. (2006). Change-point monitoring inlinear models. Econometrics Journal , 9:373–403.Bai, J. (1997). Estimation of a change point in multiple regression models. Review ofEconomics and Statistics , 79:551–563.Bauer, P. and Hackl, P. (1978). The use of MOSUMS for quality control. Technometrics ,20:431–436.Berkes, I., Liu, W., and Wu, W. B. (2014). Koml´os–major–tusn´ady approximation underdependence. The Annals of Probability , 42:794–817.Billingsley, P. (1999). Convergence of probability measures, second edition . New York:Wiley.Brown, R. L., Durbin, J., and Evans, J. M. (1975). Techniques for testing the constancyof regression relationships over time. Journal of the Royal Statistical Society. 
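The closed form of the drift integral used in the proof of Theorem 6, namely $\tau^* \delta (\ln(r) - \ln(\tau^*))$ for $r \geq \tau^*$ and zero otherwise, can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes a scalar break size `delta`, and the function names and the grid size `n` are hypothetical choices made for illustration. The inner integral is evaluated analytically as $s - \tau^*$, so only the outer integral is approximated by a midpoint Riemann sum.

```python
import math


def drift_closed_form(r, tau, delta=1.0):
    """Closed form tau * delta * (ln r - ln tau) for r >= tau, and 0 otherwise."""
    if r < tau:
        return 0.0
    return tau * delta * (math.log(r) - math.log(tau))


def drift_numeric(r, tau, delta=1.0, n=200_000):
    """Midpoint Riemann sum over [0, r] of the integrand
    1{s >= tau} - (1/s) * int_0^s 1{v >= tau} dv.
    """
    h = r / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h  # midpoint of the i-th grid cell
        if s >= tau:
            # For s >= tau the inner integral equals s - tau,
            # so the integrand is 1 - (s - tau) / s = tau / s.
            total += (1.0 - (s - tau) / s) * h
        # For s < tau both indicator terms vanish and nothing is added.
    return delta * total


if __name__ == "__main__":
    # Illustrative values only: break at tau* = 0.4, evaluated at r = 0.8.
    print(drift_numeric(0.8, 0.4), drift_closed_form(0.8, 0.4))
```

With a break at $\tau^* = 0.4$ and $r = 0.8$, the two values should agree up to the discretization error of the grid, which is of the order of the cell width because the integrand jumps at $s = \tau^*$.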