The S-Estimator in Change-Point Random Model with Long Memory
GABRIELA CIUPERCA
Université de Lyon, Université Lyon 1, CNRS, UMR 5208, Institut Camille Jordan, Bât. Braconnier, 43, blvd du 11 novembre 1918, F-69622 Villeurbanne Cedex, France; email: [email protected]; tel: 33(0)4.72.43.16.90, fax: 33(0)4.72.43.16.87
Abstract
The paper considers two-phase random design linear regression models in which the errors and the regressors are stationary long-range dependent Gaussian processes. The regression parameters, the scale parameter and the change-point are estimated using a method introduced by Rousseeuw and Yohai [33], called the S-estimator. It is more robust than the classical estimators: outliers do not spoil the estimation results. Asymptotic results, including the strong consistency and the convergence rates of the S-estimators, are proved.
Key words:
Change-points, S-estimator, Long-memory, Asymptotic properties
AMS 2000 subject classifications: primary 62F12; secondary 62H12, 60G15.
Consider the two-phase linear regression model
\[ Y_t = X_t \beta_1 1\!\!1_{1 \le t \le [n\pi]} + X_t \beta_2 1\!\!1_{[n\pi]+1 \le t \le n} + \varepsilon_t, \qquad t = 1, \dots, n, \tag{1} \]
where $1\!\!1_{(.)}$ is the indicator function, $\pi \in (0,1)$, $\xi = (\beta_1, \beta_2, \pi)$ and $\beta_1, \beta_2 \in \Upsilon$. The set $\Upsilon$ is a compact of $\mathbb{R}^d$, $d \ge 1$. For this model, $Y_t$ denotes the response variable, $X_t$ is a $d$-vector of regressors and $\varepsilon_t$ is the error.

Preprint submitted to Elsevier, 13 November 2018.

The model parameters are the regression parameters $\beta_1$ and $\beta_2$, the change-point $\pi$ and the error variance $\sigma^2$, with $\sigma \in (0, \infty)$. Let $\xi^0 = (\beta^0_1, \beta^0_2, \pi^0)$ and $\sigma_0$ denote the true values of these parameters. In this paper we consider the problem of estimating $\xi^0$ and $\sigma_0$ from the observations $(Y_t, X_t)_{1 \le t \le n}$.

Classical estimation methods studied in the statistical literature are least squares (LS), maximum likelihood (ML), or the wider class of M-estimation methods. For each of these methods one has to distinguish between independent and dependent errors, and in the dependent case the covariance structure must be taken into account. The same distinctions apply to the regressors $X_t$. In the traditional methodology, these variables are usually assumed independent or short-memory. Thus, when the errors are i.i.d. or short-memory, the literature on parametric change-point estimation is very vast. Recent developments for LS estimation include Feder ([13], [14]), Bai and Perron [3], Kim and Kim [22]. Bai [1] also considers least squares estimation of a shift in a linear process: there the errors are $\varepsilon_t = \sum_{j=0}^{\infty} c_j u_{t-j}$, where $u_j$ is white noise with mean zero and finite variance, and the coefficients satisfy $\sum_{j=0}^{\infty} j |c_j| < \infty$; this condition excludes long memory. For ML estimation we refer to Bhattacharya [7], Koul and Qian [23], Ciuperca and Dapzol [9]. In the general case of M-estimators, we can cite the papers of Rukhin and Vajda [34] and Koul et al. [25]. Obviously the list is not exhaustive; the subject is so large and productive that we cannot cite all the papers.
The convergence rates and limiting distributions of the M-estimators of the change-point and of the regression parameters are derived for model (1) by Fiteni [15], under numerous restrictive assumptions. Among these conditions, she assumes that $(Y_t, X_t)$ is a random vector, $L_2$-NED on a strong mixing base $\{w_t; t = \dots, -1, 0, 1, \dots\}$, that $\rho'(\varepsilon_t + \theta X_t) X_t$ is a mean-zero random sequence, $L_2$-NED of size $-1/2$ on $\{w_t\}$, and that $\sup_{t \le n} \mathbb{E}[\|\rho'(\varepsilon_t + \theta X_t) X_t\|^r] < \infty$ for some $r > 2$. Under the same dependence assumptions, Fiteni [16] considers $\tau$-estimators.

On the contrary, in the case of long-memory errors or regressors, the literature on parametric change-point estimation is much less vast. For the simpler model
\[ Y_t = \mu \, 1\!\!1_{1 \le t \le k^*} + (\mu + \delta) 1\!\!1_{k^* < t \le n} + \varepsilon_t \tag{2} \]
with long-range dependent errors, we refer to Horvath and Kokoszka [20].

Long-memory (long-range dependent) processes arise in numerous physical and social sciences; for several examples, see e.g. Baillie [4], Cheung [8], Lo [28], among others. We also mention Guo and Kuol [17], where some currency exchange data sets with long memory are considered. Another long-memory example in economics is found in Ding et al. [12] on the S&P 500 daily stock market returns: they found that, although the returns themselves contain little serial correlation, the absolute returns have significantly positive serial correlation up to 2700 lags.

For the construction of the S-estimators, a function $\rho : \mathbb{R} \to [0,1]$ is needed. Throughout the article, we assume that $\rho$ satisfies the following classical conditions:
- $\rho$ is symmetric, continuously differentiable on $\mathbb{R}$ and $\rho(0) = 0$;
- $\rho$ is increasing on $[0, c)$, for some $c > 0$, and constant on $[c, \infty)$.
Let us denote $\psi(z) = \rho'(z)$. An example of $\rho$ satisfying these conditions was proposed by Beaton and Tukey [5]: for some $c > 0$,
\[ \rho(x) = \begin{cases} 3(x/c)^2 - 3(x/c)^4 + (x/c)^6, & \text{if } |x| \le c, \\ 1, & \text{if } |x| > c. \end{cases} \tag{3} \]
For model (1), the following assumptions are considered:

(A1) $X_t$ is a sequence of $d$-dimensional stationary long-range dependent Gaussian vectors, with $\mathbb{E}[X_t] = 0$ and covariance matrix $\Gamma(t) = \mathbb{E}[X_1^T X_{t+1}] = L_1(t)^T N(t) L_1(t)$, where $N(t) = \mathrm{diag}(t^{-\theta_1}, \dots, t^{-\theta_d})$, $\theta_1, \dots, \theta_d \in (0,1)$, for $t \ge 1$, $\Gamma(0) = \mathrm{Var}(X_1)$, and $L_1(x)$ is a $d \times d$ orthogonal matrix of slowly varying functions;

(A2) $\varepsilon_t$ is a sequence of stationary long-range dependent Gaussian variables, with $\mathbb{E}[\varepsilon_t] = 0$, $\gamma(0) = \mathrm{Var}[\varepsilon_t] = \sigma_0^2$ and covariance $\gamma(t) = \mathbb{E}[\varepsilon_1 \varepsilon_{t+1}] = t^{-\alpha} L_2(t)$, $\alpha \in (0,1)$, for $t \ge 1$, with $L_2(x)$ a positive slowly varying function;

(A3) the errors $\varepsilon_t$ are independent of $X_t$.

The values of $\theta_1, \dots, \theta_d$, $\alpha$ and the expressions of the functions $L_1(x)$, $L_2(x)$ are known. Recall that a positive measurable function $h$ is slowly varying in Karamata's sense if and only if, for any $\lambda > 0$, $h(\lambda x)/h(x)$ converges to 1 as $x$ tends to infinity. Examples of slowly varying functions: $\log x$, $\log\log x$, $\log\log\log x$, ... Interested readers are referred to Beran [6] or Robinson [31] for a complete reference on long-memory processes.

An example of such a process $X_t = (X_{t1}, \dots, X_{td})$ is obtained when, for some $0 < d_0 < 1/2$,
\[ X_{tj} = \sum_{l=1}^{d} \sum_{v \in \mathbb{Z}} B_{jl}(t-v)\, \varsigma_{v,j}, \qquad B_{jl}(v) = v^{-(1-d_0)} L_{jl}(v), \quad v \ge 1, \quad j, l = 1, \dots, d, \]
where the $L_{jl}$ are slowly varying functions and $\varsigma_v = (\varsigma_{v,1}, \dots, \varsigma_{v,d})^T$, $v \in \mathbb{Z}$, are i.i.d. with $\varsigma_{v,j}$, $j = 1, \dots, d$, standard Gaussian variables (see Koul and Baillie [24]).

For the residuals, consider the classical notation $r_t(\beta) = Y_t - X_t \beta$, and let $K$ be the constant $K = \mathbb{E}_\Phi[\rho(\varepsilon_1/\sigma_0)]$, where $\Phi$ is the standard Gaussian distribution.

In order to construct the S-estimator in the change-point model (1), we proceed as follows:
- first, for $(\beta_1, \beta_2, \pi) \in \Upsilon \times \Upsilon \times (0,1)$ fixed, the scale parameter $\sigma$ is estimated by the positive solution $s_n(\xi) = s_n(\beta_1, \beta_2, \pi)$ of the equation
\[ n^{-1} \sum_{t=1}^{[n\pi]} \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) = K; \tag{4} \]
- at the second stage, the regression parameters are estimated by the minimizer of the solution $s_n(\xi)$ obtained at the previous stage:
\[ \big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi) \big) = \arg\min_{(\beta_1, \beta_2) \in \Upsilon \times \Upsilon} s_n(\beta_1, \beta_2, \pi); \tag{5} \]
- finally, the change-point is estimated by
\[ \hat\pi_n = \arg\min_{\pi \in [0,1]} s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big). \tag{6} \]
We make the usual identifiability assumption that the two segments are different:
\[ \beta_1 \ne \beta_2, \qquad \forall \xi \in \Upsilon \times \Upsilon \times (0,1), \tag{7} \]
i.e. at least one coefficient of $X_t$ has a shift, so the jump at $\pi^0$ is non-zero. This condition implies that the solution of (6) is unique; it will be essential in the proof of the strong consistency.

If a solution $s_n(\xi)$ to (4) exists, then it is well-defined, bounded and strictly positive, with probability arbitrarily close to 1 (see Lemma 5.1). These results hold regardless of the covariance structures of $X_t$ and $\varepsilon_t$ and of their distributions: what matters is that their means are 0 and their variances are bounded. If (4) has more than one solution, $s_n(\xi)$ is defined as the supremum of all solutions. Obviously, if the function $\rho$ is given by (3), then equation (4) has at least one solution. In this context, we define $\hat\sigma_n = s_n\big( \tilde\beta_{1n}(\hat\pi_n), \tilde\beta_{2n}(\hat\pi_n), \hat\pi_n \big)$ as the S-estimator of $\sigma_0$, and $(\hat\beta_{1n}, \hat\beta_{2n}) = (\tilde\beta_{1n}(\hat\pi_n), \tilde\beta_{2n}(\hat\pi_n))$ as that of $(\beta^0_1, \beta^0_2)$.
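To make the three-stage construction (4)-(6) concrete, here is a minimal numerical sketch. It is an illustration only, not the paper's exact procedure: it assumes $d = 1$, the Beaton-Tukey $\rho$ of (3) with tuning constant $c = 1.547$ (a common choice, assumed here, for which $K \approx 1/2$), i.i.d. Gaussian data in place of the long-memory design of (A1)-(A2), and coarse grid searches in place of the continuous minimizations in (5) and (6).

```python
import math
import random

C_TUNE = 1.547  # illustrative tuning constant, not fixed by the paper

def rho(x, c=C_TUNE):
    # Beaton-Tukey function (3), normalized so that rho : R -> [0, 1]
    u = min(abs(x) / c, 1.0)
    return 3*u**2 - 3*u**4 + u**6

def K_const(c=C_TUNE, m=4001, lim=8.0):
    # K = E_Phi[rho(nu)], nu ~ N(0, 1), by trapezoidal quadrature
    h = 2*lim / (m - 1)
    xs = [-lim + i*h for i in range(m)]
    ys = [rho(x, c) * math.exp(-x*x/2) / math.sqrt(2*math.pi) for x in xs]
    return h * (sum(ys) - 0.5*(ys[0] + ys[-1]))

def s_n(res1, res2, K):
    # Positive solution of (4): the average of rho(r_t/s) decreases in s
    # from 1 to 0, so geometric bisection finds the root.
    n = len(res1) + len(res2)
    lo, hi = 1e-8, 1e8
    for _ in range(40):
        mid = math.sqrt(lo*hi)
        avg = (sum(rho(r/mid) for r in res1) + sum(rho(r/mid) for r in res2)) / n
        if avg > K:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo*hi)

def estimate(y, x, b_grid, pi_grid, K):
    # Stages (5)-(6): minimize s_n over (beta1, beta2) for each pi, then over pi
    n = len(y)
    best = (float("inf"), None)
    for pi in pi_grid:
        m = int(n*pi)
        for b1 in b_grid:
            for b2 in b_grid:
                r1 = [y[t] - x[t]*b1 for t in range(m)]
                r2 = [y[t] - x[t]*b2 for t in range(m, n)]
                s = s_n(r1, r2, K)
                if s < best[0]:
                    best = (s, (b1, b2, pi))
    return best  # (sigma_hat, (beta1_hat, beta2_hat, pi_hat))

# Illustrative data from model (1): slope jumps from -2 to 2 at pi0 = 0.5
random.seed(1)
n = 80
x = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 0.3) for _ in range(n)]
y = [x[t]*(-2.0 if t < n//2 else 2.0) + e[t] for t in range(n)]
K = K_const()
sig, (b1, b2, pi) = estimate(y, x, [i/2 - 3 for i in range(13)],
                             [0.3, 0.4, 0.5, 0.6, 0.7], K)
```

On this simulated sample the grid minimizer recovers the change-point and both slopes; with real long-memory data the grids would of course have to be refined.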
We shall study the asymptotic behaviour of $\hat\sigma_n$, $(\hat\beta_{1n}, \hat\beta_{2n})$ and $\hat\pi_n$, in the case where equation (4) has at least one solution.

For any twice differentiable function $\varphi$ and $x, h \in \mathbb{R}$, throughout this paper we use the mean value theorem in the form
\[ \varphi(x+h) = \varphi(x) + h \left[ \varphi'(x) + h \int_0^1 (1-s)\, \varphi''(x+sh) \, ds \right]. \tag{8} \]
For a vector $V = (v_1, \dots, v_m)$, we denote by $\|V\|$ its Euclidean norm and adopt the convention $|V| = (|v_1|, \dots, |v_m|)$. In the following, $C$ denotes a generic positive finite constant that may differ from one context to another, but never depends on $n$.

This section establishes asymptotic properties of the S-estimator in model (1). For this purpose, we first calculate, for the solution $s_n(\xi)$ of equation (4), the partial derivatives with respect to $\beta_1$ and $\beta_2$. Differentiating (4) with respect to $\beta_1$, we obtain
\[ \sum_{t=1}^{[n\pi]} \frac{r_t(\beta_1)}{s_n^2(\xi)} \frac{\partial s_n(\xi)}{\partial \beta_1} \psi\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) + \sum_{t=[n\pi]+1}^{n} \frac{r_t(\beta_2)}{s_n^2(\xi)} \frac{\partial s_n(\xi)}{\partial \beta_1} \psi\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) + \sum_{t=1}^{[n\pi]} \frac{X_t}{s_n(\xi)} \psi\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) = 0. \]
Considering the notation
\[ D_n(\xi) = n^{-1} \sum_{t=1}^{[n\pi]} \frac{r_t(\beta_1)}{s_n^2(\xi)} \psi\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \frac{r_t(\beta_2)}{s_n^2(\xi)} \psi\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) \tag{9} \]
and making a similar calculation for $\partial s_n(\xi)/\partial \beta_2$, we obtain
\[ \frac{\partial s_n(\xi)}{\partial \beta_1} = - n^{-1} D_n(\xi)^{-1} \sum_{t=1}^{[n\pi]} \frac{X_t}{s_n(\xi)} \psi\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right), \qquad \frac{\partial s_n(\xi)}{\partial \beta_2} = - n^{-1} D_n(\xi)^{-1} \sum_{t=[n\pi]+1}^{n} \frac{X_t}{s_n(\xi)} \psi\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right). \tag{10} \]
Since $\rho$ is symmetric and increasing on $[0, c)$ (choosing $c$ suitably), we have
\[ x\psi(x) > 0 \ \text{ if } x \in (-c, c) \setminus \{0\}, \qquad x\psi(x) = 0 \ \text{ if } x = 0 \text{ or } |x| \ge c. \tag{11} \]
By means of Lemma 5.2, we prove that the random process $D_n(\xi)^{-1}$ is bounded with probability close to 1. In fact, the covariance structures of $X_t$ and of $\varepsilon_t$, respectively, play no role in this result.
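The implicit differentiation leading to (9) and (10) can be checked against a finite difference. The sketch below is an illustration under assumptions not made in the paper: $d = 1$, a fixed split $m = [n\pi]$, Tukey's $\rho$ of (3) with $c = 1.547$, and $K = 0.5$; it verifies that the closed form for $\partial s_n/\partial \beta_1$ matches a central difference of the bisection root of (4).

```python
import math
import random

c = 1.547  # illustrative tuning constant

def rho(x):
    u = min(abs(x)/c, 1.0)
    return 3*u**2 - 3*u**4 + u**6

def psi(x):
    # psi = rho' for the rho of (3): (6x/c^2)(1 - (x/c)^2)^2 on [-c, c]
    u = x/c
    return (6*x/c**2) * (1 - u*u)**2 if abs(u) <= 1 else 0.0

def solve_s(res, K):
    # root of (4) for a single residual vector, by geometric bisection
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = math.sqrt(lo*hi)
        if sum(rho(r/mid) for r in res)/len(res) > K:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo*hi)

# fixed toy data with d = 1 and split at m (hypothetical values)
random.seed(0)
n, m, K = 60, 30, 0.5
x = [random.gauss(0, 1) for _ in range(n)]
y = [x[t]*(1.0 if t < m else -1.0) + random.gauss(0, 1) for t in range(n)]
b2 = -1.2

def s_of(b1):
    res = [y[t] - x[t]*(b1 if t < m else b2) for t in range(n)]
    return solve_s(res, K)

b1 = 0.8
s = s_of(b1)
res = [y[t] - x[t]*(b1 if t < m else b2) for t in range(n)]
D = sum(r*psi(r/s) for r in res) / (n*s*s)                  # D_n as in (9)
grad = -sum(x[t]*psi(res[t]/s) for t in range(m)) / (n*s*D)  # formula (10)
h = 1e-5
num = (s_of(b1 + h) - s_of(b1 - h)) / (2*h)                  # central difference
```

The agreement of `grad` and `num` is just the implicit function theorem applied to (4), independently of the distributional assumptions.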
Moreover, if the two random variables are no longer Gaussian, Lemma 5.2 still holds provided $X_t$ and $\varepsilon_t$ are bounded with probability close to 1.

In order to prove the consistency, we also require that the function $\psi$ be differentiable and strictly increasing on $(0, c)$. This condition will be used for the Taylor expansion of $\rho$ around $(\beta^0_1, \beta^0_2)$ up to second order.

(H1) $\psi(.)$ is differentiable and $\psi'(u) > 0$ for all $u \in (0, c)$.

Theorem 3.1 Under assumptions (A1)-(A3), (H1), (7), the estimator $\hat\xi_n = (\hat\beta_{1n}, \hat\beta_{2n}, \hat\pi_n)$ is strongly consistent: $\hat\xi_n \overset{a.s.}{\longrightarrow} \xi^0$ as $n \to \infty$.

Remark 3.1 The statement of Theorem 3.1 remains valid if $X_t$ is not Gaussian but i.i.d. with $\mathbb{E}[X_t X_t^T] < \infty$. If $\varepsilon_t$ is not Gaussian, it has to be bounded with probability close to 1.

As a consequence of relation (10), the first two stages (4) and (5) in the construction of the estimators yield solutions of the system of equations
\[ (a)\ n^{-1} \sum_{t=1}^{[n\pi]} \rho\!\left( \frac{r_t(\beta_1)}{\sigma} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \rho\!\left( \frac{r_t(\beta_2)}{\sigma} \right) - K = 0, \qquad (b)\ n^{-1} \sum_{t=1}^{[n\pi]} \psi\!\left( \frac{r_t(\beta_1)}{\sigma} \right) X_t = 0, \qquad (c)\ n^{-1} \sum_{t=[n\pi]+1}^{n} \psi\!\left( \frac{r_t(\beta_2)}{\sigma} \right) X_t = 0. \tag{12} \]
Since the change-point plays an essential role, the convergence of the scale parameter estimator is studied separately. According to Theorem 3.1, we may fix $\pi$ in a neighbourhood $\mathcal{V}(\pi^0)$ of $\pi^0$. In order to show the convergence of the scale parameter estimator, supplementary assumptions are needed.

(H2) $\psi$ is twice differentiable with bounded second derivative.
(H3) $\psi(x)/x$ is nonincreasing for $x > 0$.

Assumption (H2) allows a second-order expansion around $s_n(\xi)$, while (H3) is used in order to apply the results of Zhengyan et al. [36] on the consistency of the scale S-estimator in a model without change-point. Moreover, in the paper of Zhengyan et al. [36], assumption (H3) is also needed to show the convergence of the regression parameter estimator, which is not the case here.
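Conditions (11) and (H3) can be verified numerically for a concrete $\psi$. The sketch below assumes the Beaton-Tukey $\rho$ of (3) with $c = 1.547$ (an illustrative choice), for which $\psi(x) = (6x/c^2)(1-(x/c)^2)^2$ on $[-c, c]$ and $0$ outside; it checks the sign property (11) and the monotonicity of $\psi(x)/x$ in (H3) on a grid. Whether (H1) holds must be examined separately for one's chosen $\rho$.

```python
import math

c = 1.547  # illustrative tuning constant, not fixed by the paper

def psi(x):
    # psi(z) = rho'(z) for the Beaton-Tukey rho of (3):
    # (6x/c^2)(1 - (x/c)^2)^2 on |x| <= c, and 0 for |x| > c
    u = x / c
    return (6*x/c**2) * (1 - u*u)**2 if abs(u) <= 1 else 0.0

grid = [i/1000 for i in range(1, 1547)]  # grid over (0, c)

# (11): x psi(x) > 0 on (-c, c) \ {0}, and x psi(x) = 0 for |x| >= c
ok11 = (all(x*psi(x) > 0 for x in grid)
        and all(x*psi(x) > 0 for x in [-g for g in grid]))
ok11_out = psi(c) == 0.0 and psi(2*c) == 0.0 and psi(-3*c) == 0.0

# (H3): psi(x)/x nonincreasing for x > 0 (here it equals (6/c^2)(1-(x/c)^2)^2)
ratios = [psi(x)/x for x in grid] + [psi(x)/x for x in [1.6, 2.0, 5.0]]
okH3 = all(a >= b - 1e-12 for a, b in zip(ratios, ratios[1:]))
```

A grid check of this kind is of course no proof, but it catches a $\rho$ that violates the sign or monotonicity conditions before one relies on the asymptotic theory.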
Theorem 3.2 Under (A1)-(A3), (H1)-(H3), (7), for all $\pi$ in a neighbourhood $\mathcal{V}(\pi^0)$ of $\pi^0$, the estimator of $\sigma_0$ is strongly consistent: $s_n(\tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi) \overset{a.s.}{\longrightarrow} \sigma_0$ as $n \to \infty$.

Corollary 3.1 Under (A1)-(A3), (H1)-(H3), (7), the scale parameter S-estimator $\hat\sigma_n = s_n(\hat\beta_{1n}, \hat\beta_{2n}, \hat\pi_n)$ is strongly consistent for $\sigma_0$.

Remark 3.2 In a model without change-point, assumption (H2) is needed to establish the convergence rate and the asymptotic distribution of the estimators, but not in the consistency proof.

Remark 3.3 The convergence result of Theorem 3.2 holds if the random vector $X_t$ is no longer Gaussian but i.i.d. with $\mathbb{E}[X_t] = 0$ and $\mathbb{E}[X_t X_t^T] < \infty$.

In order to find the convergence rate, we use the Hermite expansion for a function of a standard Gaussian variable (for details on the Hermite expansion see, for example, Palma [30]). Consider the function $\chi(.) := \rho(.) - K$, where $K = \mathbb{E}_\Phi[\rho(\varepsilon_1/\sigma_0)]$, and suppose that the Hermite rank of $\chi(\varepsilon_1/\sigma_0)$ is $q_1$. Because the function $\rho$ is symmetric and $\rho(0) = 0$, we have $q_1 \ge 2$. If we denote $\nu_t = \varepsilon_t/\sigma_0$, then
\[ \chi(\nu_t) = \sum_{q \ge q_1} \frac{J_q(\chi)}{q!} H_q(\nu_t), \]
with $H_q$ the Hermite polynomial of degree $q$, $J_q(\chi) = \mathbb{E}[\chi(\nu_1) H_q(\nu_1)]$, and, for all $t, t' = 1, \dots, n$,
\[ \mathbb{E}[H_p(\nu_t) H_q(\nu_{t'})] = q! \, \gamma^q(t - t') \, 1\!\!1_{p=q}. \tag{13} \]
Let also $k_1 = \min\{ \alpha q_1/2, \ (\theta_i + \alpha)/2, \ 1 \le i \le d \}$. In order to obtain the rate of convergence of the estimators in a model without change-point, the following assumptions are imposed by Zhengyan et al. [36]: $\alpha q_1 < 1$ and $\alpha + \theta_j < 1$, $1 \le j \le d$.

The following theorem gives the convergence rates of the regression parameter and scale parameter estimators; these rates are the same as in a model without change-point.
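The claim that the Hermite rank of $\chi = \rho - K$ is at least 2, by symmetry of $\rho$, can be checked numerically. The sketch below assumes Tukey's $\rho$ of (3) with $c = 1.547$ (an illustrative choice) and computes $J_q(\chi) = \mathbb{E}[\chi(\nu)H_q(\nu)]$ for $\nu \sim \mathcal{N}(0,1)$ by quadrature, using $H_1(x) = x$, $H_2(x) = x^2 - 1$, $H_3(x) = x^3 - 3x$.

```python
import math

c = 1.547  # illustrative tuning constant

def rho(x):
    u = min(abs(x)/c, 1.0)
    return 3*u**2 - 3*u**4 + u**6

def gauss_expect(f, m=20001, lim=10.0):
    # E[f(nu)], nu ~ N(0, 1), by trapezoidal quadrature on [-lim, lim]
    h = 2*lim/(m - 1)
    tot = 0.0
    for i in range(m):
        x = -lim + i*h
        w = 0.5 if i in (0, m - 1) else 1.0
        tot += w * f(x) * math.exp(-x*x/2) / math.sqrt(2*math.pi)
    return h*tot

K = gauss_expect(rho)           # K = E_Phi[rho(nu)]

def chi(x):
    return rho(x) - K           # chi = rho - K, so E[chi(nu)] = 0

hermite = {1: lambda x: x, 2: lambda x: x*x - 1, 3: lambda x: x**3 - 3*x}
J = {q: gauss_expect(lambda x, H=H: chi(x)*H(x)) for q, H in hermite.items()}
# chi is even, H_1 and H_3 are odd, so J[1] = J[3] = 0 while J[2] != 0:
# the Hermite rank q_1 equals 2 for this rho.
```

The vanishing odd coefficients are exactly the symmetry argument in the text; $J_2 \neq 0$ confirms that the rank is not larger than 2 for this particular $\rho$.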
Theorem 3.3 For all $\pi \in (0,1)$, if (A1)-(A3), (H1)-(H3), (7) hold, we have
\[ \| \tilde\beta_{1n}(\pi) - \beta^0_1 \| = O_{\mathbb{P}}\big( (n\pi)^{-k_1} L(n\pi) \big) = O_{\mathbb{P}}\big( n^{-k_1} \tilde L(n) \big), \qquad \| \tilde\beta_{2n}(\pi) - \beta^0_2 \| = O_{\mathbb{P}}\big( (n(1-\pi))^{-k_1} L(n(1-\pi)) \big) = O_{\mathbb{P}}\big( n^{-k_1} \tilde L(n) \big), \]
where $L$ and $\tilde L$ are slowly varying functions. For the scale parameter, putting $\tilde s_n(\pi) := s_n(\tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi)$, we have $|\tilde s_n(\pi) - \sigma_0| = O_{\mathbb{P}}(n^{-k_1} \tilde L(n))$.

Now let us study the convergence rate of the change-point estimator:
\[ \hat\pi_n = \arg\min_\pi s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big) = \arg\min_\pi \big[ s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big) - s_n(\beta^0_1, \beta^0_2, \pi^0) \big]. \]
For that, we consider one of the last two equations of (12), for instance (c):
\[ n^{-1} \sum_{t=[n\pi]+1}^{n} \psi\!\left( \frac{r_t(\tilde\beta_{2n}(\pi))}{s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)} \right) X_t = 0. \]

Theorem 3.4 Under assumptions (A1)-(A3), (H1)-(H3), (7), we have $\hat\pi_n - \pi^0 = O_{\mathbb{P}}(n^{-1-k_1} \tilde L(n))$, with $\tilde L(n)$ a slowly varying function.

Example. If $\alpha \ge \max_{i=1,\dots,d} \theta_i$, then $k_1 = (\alpha + \min_{i=1,\dots,d} \theta_i)/2 \le \alpha$.

What is remarkable, compared to the independence or short-memory case, is that $\hat\pi_n$ converges faster towards $\pi^0$ when $X_t$ or $\varepsilon_t$ are long-range dependent. Consider the particular case $\alpha = \theta_1 = \dots = \theta_d$; then $k_1 = \alpha$. Further, if $\alpha \in (1/2, 1)$, for $\beta^0_1$ and $\beta^0_2$ we also have a faster convergence rate than in the independence or short-memory case. In short, long memory makes the true parameter values be approached faster. Remark also that the obtained convergence rate differs completely from that of the change-point $\tau$-estimators when the $X_t$ are NED-dependent (Fiteni [15], [16]). If the $X_t$ are independent, the convergence rate is $n^{-1}$ for the change-point estimator and $n^{-1/2}$ for the regression parameter estimators, regardless of the method used: M-method (Koul et al. [25]), ML-method (Ciuperca and Dapzol [9]), LS-estimation (Bai and Perron [3]).
The same convergence rate, $n^{-1}$, is obtained for the change-point LS-estimator in a model with correlated errors, but without long memory (Bai [1]). It is interesting to note that the convergence rate of the change-point estimator in the mean of Gaussian variables, model (2), with long-range dependence, considered by Horvath and Kokoszka [20], is $n^{-1} g^{-1}(1/\delta)$, with $g$ a regularly varying function and $\delta$ the jump size. Thus the estimator of Horvath and Kokoszka [20] is slower than our estimator. On the other hand, remark that the convergence rate of the S-estimators depends on the Hermite rank of $\rho(\varepsilon_1/\sigma_0) - K$ and on the covariance structures of $X_t$ and $\varepsilon_t$.

Proof of Theorem 3.1. Let us consider the function $e(\xi) = \mathbb{E}[s_n(\eta, \pi) - s_n(\eta^0, \pi^0)]$, supposing, without loss of generality, that $\pi \le \pi^0$. Using the same arguments as for (37), we obtain
\[ \mathbb{E}[|s_n(\eta, \pi) - s_n(\eta, \pi^0)|] \le C \|\beta_1 - \beta_2\| \cdot |\pi - \pi^0| < \infty, \]
and, similarly to (34),
\[ \mathbb{E}[|s_n(\eta, \pi^0) - s_n(\eta^0, \pi^0)|] \le C \|\eta - \eta^0\|. \]
Thus the function $e(\xi)$ is well-defined. By Lemma 5.3, $e(\xi)$ is continuous and furthermore $e(\xi^0) = 0$. In order to use an argument like the one in Huber [21], we shall prove that $\mathbb{E}[s_n(\eta, \pi) - s_n(\eta^0, \pi^0)] > 0$ for every $\xi \ne \xi^0$. Since $s_n(\xi)$ and $s_n(\xi^0)$ are both solutions of equation (4), we have $0 = (S^{(0)}_{1,n} + S^{(1)}_{1,n}) + (S^{(0)}_{2,n} + S^{(1)}_{2,n}) + (S^{(0)}_{3,n} + S^{(1)}_{3,n})$, with
\[ S^{(0)}_{1,n} \equiv n^{-1} \sum_{t=1}^{[n\pi]} \left[ \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) - \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi^0)} \right) \right], \qquad S^{(1)}_{1,n} \equiv n^{-1} \sum_{t=1}^{[n\pi]} \left[ \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi^0)} \right) - \rho\!\left( \frac{r_t(\beta^0_1)}{s_n(\xi^0)} \right) \right], \]
\[ S^{(0)}_{2,n} \equiv n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \left[ \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) - \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi^0)} \right) \right], \qquad S^{(1)}_{2,n} \equiv n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \left[ \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi^0)} \right) - \rho\!\left( \frac{r_t(\beta^0_1)}{s_n(\xi^0)} \right) \right], \]
\[ S^{(0)}_{3,n} \equiv n^{-1} \sum_{t=[n\pi^0]+1}^{n} \left[ \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) - \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi^0)} \right) \right], \qquad S^{(1)}_{3,n} \equiv n^{-1} \sum_{t=[n\pi^0]+1}^{n} \left[ \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi^0)} \right) - \rho\!\left( \frac{r_t(\beta^0_2)}{s_n(\xi^0)} \right) \right]. \]
Then, by the mean value theorem (MVT), $S^{(0)}_{1,n} + S^{(0)}_{2,n} + S^{(0)}_{3,n}$ can be written as
\[ n^{-1} \left( \frac{s_n(\xi^0) - s_n(\xi)}{s_n(\xi)\, s_n(\xi^0)} \right) \left[ \sum_{t=1}^{[n\pi]} r_t(\beta_1)\, \psi\!\left( \frac{r_t(\beta_1)}{u^{(1)}_n(\eta, \pi, \pi^0)} \right) + \sum_{t=[n\pi]+1}^{[n\pi^0]} r_t(\beta_2)\, \psi\!\left( \frac{r_t(\beta_2)}{u^{(2)}_n(\eta, \pi, \pi^0)} \right) + \sum_{t=[n\pi^0]+1}^{n} r_t(\beta_2)\, \psi\!\left( \frac{r_t(\beta_2)}{u^{(3)}_n(\eta, \pi, \pi^0)} \right) \right], \]
with $u^{(1)}_n, u^{(2)}_n, u^{(3)}_n$ defined in the same way as in the proof of Lemma 5.3. Moreover, using property (11), we have $S^{(0)}_{1,n} + S^{(0)}_{2,n} + S^{(0)}_{3,n} = [s_n(\xi^0) - s_n(\xi)] V_n$, where $V_n$ is a positive random variable with probability close to 1.

Moreover, using Taylor's expansion, $S^{(1)}_{1,n}$, $S^{(1)}_{2,n}$ and $S^{(1)}_{3,n}$ can be written as
\[ S^{(1)}_{1,n} = n^{-1} \sum_{t=1}^{[n\pi]} \frac{X_t(\beta^0_1 - \beta_1)}{s_n(\xi^0)} \left[ \psi\!\left( \frac{r_t(\beta^0_1)}{s_n(\xi^0)} \right) + \frac{1}{2}\, \psi'\!\left( \frac{\varepsilon_t + \delta_1 X_t(\beta^0_1 - \beta_1)}{s_n(\xi^0)} \right) \frac{(\beta^0_1 - \beta_1)^T X_t^T}{s_n(\xi^0)} \right], \]
and similarly $S^{(1)}_{2,n}$ with the sum over $t = [n\pi]+1, \dots, [n\pi^0]$ and the difference $\beta^0_1 - \beta_2$, and $S^{(1)}_{3,n}$ with the sum over $t = [n\pi^0]+1, \dots, n$ and the difference $\beta^0_2 - \beta_2$, where $\delta_1, \delta_2, \delta_3 \in (0,1)$. Moreover,
\[ n^{-1} \sum_{t=1}^{[n\pi]} X_t(\beta^0_1 - \beta_1)\, \psi\!\left( \frac{r_t(\beta^0_1)}{s_n(\xi^0)} \right) = o_{\mathbb{P}}(1), \quad n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} X_t(\beta^0_1 - \beta_2)\, \psi\!\left( \frac{r_t(\beta^0_1)}{s_n(\xi^0)} \right) = o_{\mathbb{P}}(1), \quad n^{-1} \sum_{t=[n\pi^0]+1}^{n} X_t(\beta^0_2 - \beta_2)\, \psi\!\left( \frac{r_t(\beta^0_2)}{s_n(\xi^0)} \right) = o_{\mathbb{P}}(1). \tag{14} \]
Relation (14) and assumption (H1) imply that, for any $\xi \ne \xi^0$ and all $\epsilon > 0$, there exists $a > 0$ such that
\[ \mathbb{P}\big[ S^{(1)}_{1,n} + S^{(1)}_{2,n} + S^{(1)}_{3,n} > a \big] > 1 - \epsilon. \tag{15} \]
Assumption (7), the above relation and $S^{(1)}_{1,n} + S^{(1)}_{2,n} + S^{(1)}_{3,n} = -(S^{(0)}_{1,n} + S^{(0)}_{2,n} + S^{(0)}_{3,n}) = [s_n(\xi) - s_n(\xi^0)] V_n$, with $V_n > 0$, imply the conclusion $\mathbb{E}[s_n(\eta, \pi) - s_n(\eta^0, \pi^0)] > 0$ for all $\xi \ne \xi^0$. Using this, the compactness of the parameter space, $\hat\xi_n = \arg\min_{\xi \in \Upsilon \times \Upsilon \times [0,1]} s_n(\xi)$ and an argument like the one in Huber [21], the strong convergence of $\hat\xi_n$ follows. $\square$

Proof of Theorem 3.2.
We first prove that, if we consider in (1) the true values of $\eta$ and $\pi$, then the scale parameter estimator is strongly consistent:
\[ s_n(\eta^0, \pi^0) \overset{a.s.}{\longrightarrow} \sigma_0, \qquad n \to \infty. \tag{16} \]
Observe that $s_n(\eta^0, \pi^0)$ is in fact the solution of a problem without a break:
\[ K = n^{-1} \sum_{t=1}^{[n\pi^0]} \rho\!\left( \frac{\varepsilon_t}{s_n(\xi^0)} \right) + n^{-1} \sum_{t=[n\pi^0]+1}^{n} \rho\!\left( \frac{\varepsilon_t}{s_n(\xi^0)} \right) = n^{-1} \sum_{t=1}^{n} \rho\!\left( \frac{\varepsilon_t}{s_n(\xi^0)} \right), \]
and then relation (16) is obtained by Theorem 3.1 of Zhengyan et al. [36]. Now, as a consequence of Theorem 3.1, we may consider only the case of $(\eta, \pi)$ in a neighbourhood $\mathcal{V}(\eta^0, \pi^0)$ of $(\eta^0, \pi^0)$. Consider the decomposition
\[ s_n(\eta, \pi) - s_n(\eta^0, \pi^0) = [s_n(\eta, \pi) - s_n(\eta^0, \pi)] + [s_n(\eta^0, \pi) - s_n(\eta^0, \pi^0)] :\equiv S_1(n) + S_2(n). \tag{17} \]
Since $S_1(n)$ depends only on the regression parameters, by Theorem 3.1, taking into account relations (10) and (32), we readily obtain
\[ \sup_{\eta \in \mathcal{V}(\eta^0)} |s_n(\eta, \pi) - s_n(\eta^0, \pi)| \overset{a.s.}{\longrightarrow} 0, \qquad n \to \infty. \tag{18} \]
For $S_2(n)$, an argument like the one used for (35) yields that $s_n(\eta^0, \pi) - s_n(\eta^0, \pi^0)$ behaves as
\[ n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} X_t\, \psi\!\left( \frac{\tilde r_t(\beta^0_1, \beta^0_2)}{s_n(\eta^0, \pi^0)} \right), \]
where $\tilde r_t(\beta_1, \beta_2) = r_t(\beta_1) + m_t[r_t(\beta_2) - r_t(\beta_1)]$, with $0 < m_t < 1$; remark that, at the true parameters, $\tilde r_t = \varepsilon_t$. We write the Taylor expansion of $\psi(\varepsilon_t/s_n(\eta^0, \pi^0))$ around $\psi(\varepsilon_t/\sigma_0)$ up to second order:
\[ n^{-1} \sum_{t=1}^{[n\pi]} X_t\, \psi\!\left( \frac{\varepsilon_t}{s_n(\eta^0, \pi^0)} \right) = n^{-1} \sum_{t=1}^{[n\pi]} X_t\, \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) - n^{-1} \frac{s_n(\eta^0, \pi^0) - \sigma_0}{\sigma_0\, s_n(\eta^0, \pi^0)} \sum_{t=1}^{[n\pi]} X_t \varepsilon_t\, \psi'\!\left( \frac{\varepsilon_t}{\sigma_0} \right) + n^{-1} \frac{(s_n(\eta^0, \pi^0) - \sigma_0)^2}{\sigma_0^2\, s_n^2(\eta^0, \pi^0)} \sum_{t=1}^{[n\pi]} \psi''(\varsigma_t)\, \varepsilon_t^2 X_t, \]
with $\varsigma_t = \varepsilon_t [s_n(\eta^0, \pi^0) + \upsilon_t(\sigma_0 - s_n(\eta^0, \pi^0))]/(\sigma_0 s_n(\eta^0, \pi^0))$, $\upsilon_t \in (0,1)$. Since $\psi''$ is bounded, we have $n^{-1} \sum_{t=1}^{[n\pi]} \psi''(\varsigma_t)\, \varepsilon_t^2 X_t < \infty$ with probability 1. Moreover,
\[ \frac{1}{[n\pi]} \sum_{t=1}^{[n\pi]} X_t \varepsilon_t\, \psi'\!\left( \frac{\varepsilon_t}{\sigma_0} \right) \overset{a.s.}{\longrightarrow} \mathbb{E}\!\left[ X_1 \varepsilon_1\, \psi'\!\left( \frac{\varepsilon_1}{\sigma_0} \right) \right] = 0, \qquad n \to \infty. \]
Hence $n^{-1} \sum_{t=1}^{[n\pi]} X_t [\psi(\varepsilon_t/s_n(\eta^0, \pi^0)) - \psi(\varepsilon_t/\sigma_0)] = o_{\mathbb{P}}(s_n(\eta^0, \pi^0) - \sigma_0)$.
This relation and $n^{-1} \sum_{t=1}^{[n\pi]} X_t\, \psi(\varepsilon_t/\sigma_0) \overset{a.s.}{\longrightarrow} 0$ imply $S_2(n) = s_n(\eta^0, \pi) - s_n(\eta^0, \pi^0) = o_{\mathbb{P}}(1) + o_{\mathbb{P}}(s_n(\eta^0, \pi^0) - \sigma_0) = o_{\mathbb{P}}(1) + o_{\mathbb{P}}(S_2(n))$, where for the last relation we have used (16). Then $\sup_{\pi \in \mathcal{V}(\pi^0)} |S_2(n)| \overset{a.s.}{\longrightarrow} 0$. This fact, together with relation (18), decomposition (17) and relation (16), yields the Theorem. $\square$

Proof of Theorem 3.3. For $\pi \in (0,1)$ fixed, the convergence rates of the regression parameter estimators $\tilde\beta_{1n}(\pi)$ and $\tilde\beta_{2n}(\pi)$ are obtained by applying the results of Zhengyan et al. [36] on each segment. On the other hand, the study of the convergence rate of $\tilde s_n$ is more difficult, because it involves both segments. For notational simplicity, in the rest of this proof we write $\tilde\beta_{1n} = \tilde\beta_{1n}(\pi)$, $\tilde\beta_{2n} = \tilde\beta_{2n}(\pi)$ and $\tilde s_n = \tilde s_n(\pi)$. The study is made in two stages. First, we write equation (12)(a) in another form, bringing out $\sigma_0$ by a limited expansion. Afterwards, in the second stage, the obtained form is studied taking into account the convergence rates of the regression parameter estimators and the fact that $X_t$, $\varepsilon_t$ are long-memory Gaussian.

Stage 1. Equation (12)(a) can be expressed as
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi\!\left( \frac{r_t(\tilde\beta_{1n})}{\tilde s_n} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \chi\!\left( \frac{r_t(\tilde\beta_{2n})}{\tilde s_n} \right) = 0. \tag{19} \]
We apply (8) to the function $\chi$ with
\[ t = 1, \dots, [n\pi]: \ x_t = \frac{r_t(\tilde\beta_{1n})}{\tilde s_n}, \ h_t = \left( \frac{1}{\sigma_0} - \frac{1}{\tilde s_n} \right) r_t(\tilde\beta_{1n}); \qquad t = [n\pi]+1, \dots, n: \ x_t = \frac{r_t(\tilde\beta_{2n})}{\tilde s_n}, \ h_t = \left( \frac{1}{\sigma_0} - \frac{1}{\tilde s_n} \right) r_t(\tilde\beta_{2n}). \]
Hence, for the part $t = 1, \dots, [n\pi]$, we have
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi(x_t + h_t) = n^{-1} \sum_{t=1}^{[n\pi]} \chi(x_t) + \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \left[ n^{-1} \sum_{t=1}^{[n\pi]} r_t(\tilde\beta_{1n})\, \psi(x_t) \right] + n^{-1} \left( \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \right)^2 \sum_{t=1}^{[n\pi]} r_t^2(\tilde\beta_{1n}) \int_0^1 (1-s)\, \psi'(x_t + s h_t)\, ds. \tag{20} \]
Thus, in order to study the first sum of (19), we analyse the terms on the right-hand side of (20).

We first consider the last term of the right-hand side of (20). Elementary algebra yields
\[ n^{-1} \sum_{t=1}^{[n\pi]} r_t^2(\tilde\beta_{1n}) = n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t^2 + n^{-1} \sum_{t=1}^{[n\pi]} (\tilde\beta_{1n} - \beta^0_1)^T X_t^T X_t (\tilde\beta_{1n} - \beta^0_1) - 2 n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t X_t (\tilde\beta_{1n} - \beta^0_1). \]
By the ergodic theorem, $n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t^2 = O_{\mathbb{P}}(1)$ and $n^{-1} \sum_{t=1}^{[n\pi]} X_t^T X_t = O_{\mathbb{P}}(1)$, and since $\varepsilon_t$ and $X_t$ are independent, $n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t X_t = o_{\mathbb{P}}(1)$. Thus, since $\psi'$ is bounded, $\tilde s_n - \sigma_0 = o_{\mathbb{P}}(1)$ and $\tilde s_n$ is bounded away from 0 with probability tending to 1, the last term of (20) is $o_{\mathbb{P}}(\tilde s_n - \sigma_0)$.

We now consider the second term of the right-hand side of (20). For the sum, we have
\[ n^{-1} \left\| \sum_{t=1}^{[n\pi]} r_t(\tilde\beta_{1n})\, \psi\!\left( \frac{r_t(\tilde\beta_{1n})}{\tilde s_n} \right) - \sum_{t=1}^{[n\pi]} r_t(\beta^0_1)\, \psi\!\left( \frac{r_t(\beta^0_1)}{\sigma_0} \right) \right\| \le n^{-1} \left\| \sum_{t=1}^{[n\pi]} \big[ r_t(\tilde\beta_{1n}) - r_t(\beta^0_1) \big] \psi\!\left( \frac{r_t(\tilde\beta_{1n})}{\tilde s_n} \right) \right\| + n^{-1} \left\| \sum_{t=1}^{[n\pi]} r_t(\beta^0_1) \left[ \psi\!\left( \frac{r_t(\tilde\beta_{1n})}{\tilde s_n} \right) - \psi\!\left( \frac{r_t(\beta^0_1)}{\sigma_0} \right) \right] \right\|, \]
and, since $\psi$ is bounded,
\[ \le C \|\tilde\beta_{1n} - \beta^0_1\| \, n^{-1} \sum_{t=1}^{[n\pi]} \|X_t\| + C n^{-1} \sum_{t=1}^{[n\pi]} |\varepsilon_t|. \]
The above inequality, the ergodic theorem, $\mathbb{E}[\|X_1\|] < \infty$, $\mathbb{E}[|\varepsilon_1|] < \infty$ and $\tilde\beta_{1n} - \beta^0_1 = o_{\mathbb{P}}(1)$ imply that
\[ n^{-1} \sum_{t=1}^{[n\pi]} r_t(\tilde\beta_{1n})\, \psi\!\left( \frac{r_t(\tilde\beta_{1n})}{\tilde s_n} \right) - n^{-1} \sum_{t=1}^{[n\pi]} r_t(\beta^0_1)\, \psi\!\left( \frac{r_t(\beta^0_1)}{\sigma_0} \right) = o_{\mathbb{P}}(1). \]
Thus, the second term of the right-hand side of (20) can be expressed as
\[ n^{-1} \sum_{t=1}^{[n\pi]} h_t\, \psi(x_t) = \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \left[ n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t\, \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) + o_{\mathbb{P}}(1) \right]. \]
Then relation (20) becomes
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi(x_t + h_t) = n^{-1} \sum_{t=1}^{[n\pi]} \chi(x_t) + \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \left[ n^{-1} \sum_{t=1}^{[n\pi]} \varepsilon_t\, \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) + o_{\mathbb{P}}(1) \right]. \tag{21} \]
A similar relation holds for the part $t = [n\pi]+1, \dots, n$:
\[ n^{-1} \sum_{t=[n\pi]+1}^{n} \chi(x_t + h_t) = n^{-1} \sum_{t=[n\pi]+1}^{n} \chi(x_t) + \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \left[ n^{-1} \sum_{t=[n\pi]+1}^{n} \varepsilon_t\, \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) + o_{\mathbb{P}}(1) \right]. \tag{22} \]
Adding (21) and (22) and taking relation (19) into account, we obtain
\[ 0 = n^{-1} \sum_{t=1}^{[n\pi]} \chi\!\left( \frac{r_t(\tilde\beta_{1n})}{\sigma_0} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \chi\!\left( \frac{r_t(\tilde\beta_{2n})}{\sigma_0} \right) + \frac{\tilde s_n - \sigma_0}{\sigma_0 \tilde s_n} \left( n^{-1} \sum_{t=1}^{n} \varepsilon_t\, \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) + o_{\mathbb{P}}(1) \right). \]
By the ergodic theorem, $n^{-1} \sum_{t=1}^{n} \varepsilon_t\, \psi(\varepsilon_t/\sigma_0) \overset{\mathbb{P}}{\longrightarrow} \mathbb{E}[\varepsilon_1 \psi(\varepsilon_1/\sigma_0)]$ as $n \to \infty$.

Stage 2. The convergence rate of $\tilde s_n$ will then be obtained by studying
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi\!\left( \frac{r_t(\tilde\beta_{1n})}{\sigma_0} \right) + n^{-1} \sum_{t=[n\pi]+1}^{n} \chi\!\left( \frac{r_t(\tilde\beta_{2n})}{\sigma_0} \right) = \frac{\sigma_0 - \tilde s_n}{\sigma_0 \tilde s_n} \left[ \mathbb{E}\!\left[ \varepsilon_1\, \psi\!\left( \frac{\varepsilon_1}{\sigma_0} \right) \right] + o_{\mathbb{P}}(1) \right]. \tag{23} \]
For $t = 1, \dots, [n\pi]$, a second-order Taylor expansion of $\chi$ shows that $n^{-1} \sum_{t=1}^{[n\pi]} \chi(\sigma_0^{-1} r_t(\tilde\beta_{1n}))$ can be written as
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) - \frac{1}{n\sigma_0} \sum_{t=1}^{[n\pi]} \chi'\!\left( \frac{\varepsilon_t}{\sigma_0} \right) X_t (\tilde\beta_{1n} - \beta^0_1) - \frac{1}{2n\sigma_0^2} \sum_{t=1}^{[n\pi]} \chi''\!\left( \frac{\varepsilon_t - \delta_t X_t(\tilde\beta_{1n} - \beta^0_1)}{\sigma_0} \right) [X_t(\tilde\beta_{1n} - \beta^0_1)]^2, \tag{24} \]
with $\delta_t \in (0,1)$. Let us analyse the three terms of (24) separately.
- For the first term, denote $\nu_t = \varepsilon_t/\sigma_0 \sim \mathcal{N}(0,1)$.
We use the Hermite expansion of $\sum_{t=1}^{[n\pi]} \chi(\nu_t)$. Because the Hermite rank of $\chi(\nu_t)$ is $q_1 \ge 2$, by (13) we have
\[ \sum_{t=1}^{[n\pi]} \chi(\nu_t) = \frac{J_{q_1}(\chi)}{q_1!} \sum_{t=1}^{[n\pi]} H_{q_1}(\nu_t) + \sum_{t=1}^{[n\pi]} \sum_{q \ge q_1+1} \frac{J_q(\chi)}{q!} H_q(\nu_t) :\equiv T_{1,n} + T_{2,n}. \tag{25} \]
For $T_{1,n}$ we have
\[ \mathbb{E}[T_{1,n}^2] = \frac{J_{q_1}^2(\chi)}{(q_1!)^2} \sum_{t=1}^{[n\pi]} \sum_{j=1}^{[n\pi]} q_1! \, \gamma^{q_1}(|t-j|) = \frac{J_{q_1}^2(\chi)}{q_1!} \left[ [n\pi]\, \gamma^{q_1}(0) + 2 \sum_{t=1}^{[n\pi]-1} ([n\pi]-t)\, \gamma^{q_1}(t) \right] = \frac{J_{q_1}^2(\chi)}{q_1!} \left[ O(n) + O(n^{2-\alpha q_1}) L_2^{q_1}([n\pi]) \right] = O(n^{2-\alpha q_1})\, L_2^{q_1}([n\pi]). \]
For $T_{2,n}$ we have
\[ \mathbb{E}[T_{2,n}^2] = \sum_{q \ge q_1+1} \frac{J_q^2(\chi)}{q!} \sum_{t=1}^{[n\pi]} \sum_{j=1}^{[n\pi]} \gamma^{q}(|t-j|) \le O(n) + 2 \sum_{q \ge q_1+1} \frac{J_q^2(\chi)}{q!} \sum_{t=1}^{[n\pi]-1} ([n\pi]-t)\, \gamma^{q_1+1}(t) = O(n) + O\big( n^{2-(q_1+1)\alpha} L_2^{q_1+1}([n\pi]) \big). \]
Hence $\mathbb{E}[T_{2,n}^2] = o(\mathbb{E}[T_{1,n}^2])$. Then, from (25), we straightforwardly have
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi(\nu_t) = O_{\mathbb{P}}\big( n^{-1} (\mathbb{E}[T_{1,n}^2])^{1/2} \big) = O_{\mathbb{P}}(n^{-\alpha q_1/2})\, L_2^{q_1/2}([n\pi]). \tag{26} \]
- For the second term of (24), since $\nu_t$ and $X_t$ are independent, by the ergodic theorem we have
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi'(\nu_t)\, X_t (\tilde\beta_{1n} - \beta^0_1) = o_{\mathbb{P}}(\|\tilde\beta_{1n} - \beta^0_1\|). \tag{27} \]
- For the third term of (24), since $\psi'$ is bounded and $n^{-1} \sum_{t=1}^{[n\pi]} X_t X_t^T = O_{\mathbb{P}}(1)$, we have
\[ n^{-1} \sum_{t=1}^{[n\pi]} \chi''(\cdot)\, [X_t(\tilde\beta_{1n} - \beta^0_1)]^2 = O_{\mathbb{P}}(\|\tilde\beta_{1n} - \beta^0_1\|^2) = o_{\mathbb{P}}(\|\tilde\beta_{1n} - \beta^0_1\|). \tag{28} \]
Then, taking (26), (27), (28) into account, the behaviour of (24) is given by (26): it is $O_{\mathbb{P}}(n^{-\alpha q_1/2})\, L_2^{q_1/2}([n\pi]) + o_{\mathbb{P}}(\|\tilde\beta_{1n} - \beta^0_1\|)$. A similar reasoning for the part $t = [n\pi]+1, \dots, n$ gives $n^{-1} \sum_{t=[n\pi]+1}^{n} \chi(\sigma_0^{-1} r_t(\tilde\beta_{2n})) = O_{\mathbb{P}}(n^{-\alpha q_1/2})\, L_2^{q_1/2}(n(1-\pi)) + o_{\mathbb{P}}(\|\tilde\beta_{2n} - \beta^0_2\|)$.
Then, for relation (23), we have
\[ \frac{\sigma_0 - \tilde s_n}{\sigma_0 \tilde s_n} \left[ \mathbb{E}\!\left[ \varepsilon_1\, \psi\!\left( \frac{\varepsilon_1}{\sigma_0} \right) \right] + o_{\mathbb{P}}(1) \right] = O_{\mathbb{P}}(n^{-\alpha q_1/2})\, L_2^{q_1/2}(n) + o_{\mathbb{P}}(\|\tilde\beta_{1n} - \beta^0_1\| + \|\tilde\beta_{2n} - \beta^0_2\|), \]
and the convergence rate of $\tilde s_n$ follows. $\square$

Proof of Theorem 3.4. As a consequence of Theorem 3.1, we consider $\pi$ in a neighbourhood of $\pi^0$, and suppose, without loss of generality, that $\pi < \pi^0$. Considering relation (12)(c), we have
\[ n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \psi\!\left( \frac{r_t(\tilde\beta_{2n}(\pi))}{s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)} \right) X_t = - n^{-1} \sum_{t=[n\pi^0]+1}^{n} \psi\!\left( \frac{r_t(\tilde\beta_{2n}(\pi))}{s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)} \right) X_t. \tag{29} \]
Since $\|\tilde\beta_{2n}(\pi) - \beta^0_2\| = O_{\mathbb{P}}(n^{-k_1} \tilde L(n))$, an argument like the one used for relation (27) yields that the right-hand side of (29) is $O_{\mathbb{P}}(n^{-k_1} \tilde L(n))$.

We apply (8) to the function $\psi$, with $x_t = \varepsilon_t / s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)$ and $h_t = - X_t(\tilde\beta_{2n}(\pi) - \beta^0_1) / s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)$. For the left-hand side of (29), since $s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big) \to \sigma_0$ a.s. as $n \to \infty$, and $\beta^0_1 \ne \beta^0_2$, we obtain
\[ n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \psi\!\left( \frac{r_t(\tilde\beta_{2n}(\pi))}{s_n\big( \tilde\beta_{1n}(\pi), \tilde\beta_{2n}(\pi), \pi \big)} \right) X_t = n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) X_t + O_{\mathbb{P}}\big( n(\pi^0 - \pi) \big). \tag{30} \]
But, making the Hermite expansion of $\psi(\nu_t)$, we get
\[ \sum_{t=[n\pi]+1}^{[n\pi^0]} \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) X_t = \frac{J_1(\psi)}{\sigma_0} \sum_{t=[n\pi]+1}^{[n\pi^0]} \varepsilon_t X_t + \sum_{t=[n\pi]+1}^{[n\pi^0]} \sum_{q > 1} \frac{J_q(\psi)}{q!} H_q(\nu_t) X_t :\equiv I_{1,n} + I_{2,n}, \]
with $J_q(\psi) = \mathbb{E}[\psi(\nu_1) H_q(\nu_1)]$. On the other hand, as in the proof of Theorem 3.3, we have $I_{2,n} = o_{\mathbb{P}}(I_{1,n})$.
The variance of $I_{1,n}$ is
\[ \mathbb{E}[I_{1,n} I_{1,n}^T] = \frac{J_1^2(\psi)}{\sigma_0^2} \sum_{i=1}^{[n(\pi^0-\pi)]} \sum_{j=1}^{[n(\pi^0-\pi)]} \gamma(|i-j|)\, \Gamma(|i-j|) = \frac{J_1^2(\psi)}{\sigma_0^2} \left[ [n(\pi^0-\pi)]\, \gamma(0)\Gamma(0) + 2 \sum_{i=1}^{[n(\pi^0-\pi)]-1} ([n(\pi^0-\pi)] - i)\, \gamma(i)\Gamma(i) \right], \]
which is $O\big( (n(\pi^0-\pi))^{2-\alpha-\min_i \theta_i} \big)$ up to slowly varying factors built from $L_1$ and $L_2$. This implies
\[ n^{-1} \sum_{t=[n\pi]+1}^{[n\pi^0]} \psi\!\left( \frac{\varepsilon_t}{\sigma_0} \right) X_t = O_{\mathbb{P}}\big( (n(\pi^0 - \pi))^{-\min_i(\theta_i + \alpha)/2}\, L(n(\pi^0 - \pi)) \big), \]
with $L$ a slowly varying function. This last relation, together with (29), (30) and the fact that the right-hand side of (29) is $O_{\mathbb{P}}(n^{-k_1} \tilde L(n))$, implies
\[ O_{\mathbb{P}}(n^{-k_1} \tilde L(n)) = O_{\mathbb{P}}\big( n(\pi^0 - \pi) \big) + O_{\mathbb{P}}\big( (n(\pi^0 - \pi))^{-\min_i(\theta_i + \alpha)/2}\, L(n(\pi^0 - \pi)) \big). \]
We obtain $\hat\pi_n - \pi^0 = O_{\mathbb{P}}(n^{-1-k_1} \tilde L(n))$. $\square$

Lemma 5.1 If a solution $s_n(\xi)$ of equation (4) exists, then it is well-defined, bounded and strictly positive, with probability arbitrarily large.

Proof of Lemma 5.1. Since $\mathbb{E}[r_t(\beta)] = 0$ and $\mathrm{Var}[r_t(\beta)] < \infty$ uniformly over the compact $\Upsilon$, by the Bienaymé-Tchebichev inequality we obtain that $r_t(\beta)$ is bounded with probability arbitrarily large.

We prove that $s_n(\xi)$ is bounded by contradiction. If $s_n(\xi)$ is not bounded, then there exist $\xi \in \Upsilon \times \Upsilon \times (0,1)$ and $n_\xi \in \mathbb{N}$ such that, for all $n > n_\xi$, $M > 0$ and $\epsilon > 0$, $\mathbb{P}[s_n(\xi) > M] \ge 1 - \epsilon$. Since $\rho$ is continuous and $\rho(0) = 0$, then
\[ \rho\!\left( \frac{r_t(\beta_i)}{s_n(\xi)} \right) \overset{\mathbb{P}}{\longrightarrow} 0, \qquad t = 1, \dots, n, \tag{31} \]
and
\[ \frac{1}{n} \sum_{t=1}^{[n\pi]} \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) + \frac{1}{n} \sum_{t=[n\pi]+1}^{n} \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right) \le \frac{[n\pi]}{n} \max_{1 \le t \le [n\pi]} \rho\!\left( \frac{r_t(\beta_1)}{s_n(\xi)} \right) + \frac{n - [n\pi]}{n} \max_{[n\pi]+1 \le t \le n} \rho\!\left( \frac{r_t(\beta_2)}{s_n(\xi)} \right), \]
which, by (31), converges to 0 in probability as $n \to \infty$. This contradicts (4). To prove that $s_n(\xi) > 0$, consider the function $g(\beta, s) = (\varepsilon - X\beta)/s$, with $\beta$ in a compact of $\mathbb{R}^d$ containing 0 and $s \in (0, \infty)$.
Since $\varepsilon-X\beta$ is bounded with a probability close to 1, if $s_n(\xi)=0$, then $\lim_{s\to 0}|g(\beta,s)|=\infty$, which contradicts (4). Hence, for all $\epsilon>0$, there exists $\delta>0$ such that:
$$\mathbb{P}\Big[\inf_{\xi\in\Upsilon\times\Upsilon\times[0,1]}s_n(\xi)>\delta\Big]>1-\epsilon.\qquad\blacksquare$$

Lemma 5.2 Under assumptions (A1)-(A3), for any $\epsilon\in(0,1)$ and $\xi\in\Upsilon\times\Upsilon\times[0,1]$, there exists a positive constant $\delta_0$ such that:
$$\mathbb{P}\Big[\inf_{\xi\in\Upsilon\times\Upsilon\times[0,1]}D_n(\xi)>\delta_0\Big]>1-\epsilon.$$

Proof of Lemma 5.2. Because $\xi$ belongs to a compact set and taking into account relation (11), we have to prove that, for all $\epsilon>0$ and $\xi\in\Upsilon\times\Upsilon\times[0,1]$, there exists $\delta>0$ such that:
$$\mathbb{P}\left[n^{-1}\left(\sum_{t=1}^{[n\pi]}r_t(\beta_1)\psi\!\left(\frac{r_t(\beta_1)}{s_n(\xi)}\right)+\sum_{t=[n\pi]+1}^{n}r_t(\beta_2)\psi\!\left(\frac{r_t(\beta_2)}{s_n(\xi)}\right)\right)>\delta\right]>1-\epsilon$$
Since $r_t(\beta)$ and $\psi\big(r_t(\beta)/s_n(\xi)\big)$ have the same sign and since $\psi$ is continuous, it suffices to show that, for all $\epsilon>0$ and all $\beta$ in the compact set $\Upsilon$, there exists a $\delta>0$ such that $\mathbb{P}[|\varepsilon-X\beta|>\delta]>1-\epsilon$.
The random variables $\varepsilon$ and $X$ are Gaussian and independent. Then: $\mathbb{P}[|\varepsilon-X\beta|>\delta]=2\,\mathbb{P}[\varepsilon-X\beta<-\delta]=2\,\Phi\big(-\delta/[\gamma(0)+\beta\Gamma(0)\beta^T]^{1/2}\big)$. We recall that $\Phi$ denotes the standard Gaussian distribution function. The lemma then follows by setting:
$$\delta_0=\inf_{\beta\in\Upsilon}\big[\gamma(0)+\beta\Gamma(0)\beta^T\big]^{1/2}\left|\Phi^{-1}\!\left(\frac{1-\epsilon}{2}\right)\right|.\qquad\blacksquare$$

The key to the proof of strong convergence is the following uniform convergence result.

Lemma 5.3 For all $\varrho>0$, under assumptions (A1)-(A3), for $\Omega_\varrho(\xi)=\{\xi^*=(\eta^*,\pi^*)\in\Upsilon\times\Upsilon\times[0,1];\ \|\eta-\eta^*\|<\varrho,\ |\pi-\pi^*|<\varrho\}$, we have:
$$\mathbb{E}\Big[\sup_{\xi^*\in\Omega_\varrho(\xi)}|s_n(\eta,\pi)-s_n(\eta^*,\pi^*)|\Big]\xrightarrow[\varrho\to 0]{}0$$

Proof of Lemma 5.3. We have the triangle inequality: $|s_n(\eta,\pi)-s_n(\eta^*,\pi^*)|\le|s_n(\eta,\pi)-s_n(\eta,\pi^*)|+|s_n(\eta,\pi^*)-s_n(\eta^*,\pi^*)|$. First, we will study $s_n(\eta,\pi^*)-s_n(\eta^*,\pi^*)$.
By the mean value theorem (MVT), we have:
$$s_n(\eta,\pi^*)-s_n(\eta^*,\pi^*)=(\beta_1-\beta_1^*)\frac{\partial s_n}{\partial\beta_1}(\tilde\beta_1,\beta_2^*,\pi^*)+(\beta_2-\beta_2^*)\frac{\partial s_n}{\partial\beta_2}(\beta_1^*,\tilde\beta_2,\pi^*)\qquad(32)$$
where $\tilde\beta_1=\beta_1^*+\upsilon_1(\beta_1-\beta_1^*)$, $\tilde\beta_2=\beta_2^*+\upsilon_2(\beta_2-\beta_2^*)$, $\upsilon_1,\upsilon_2\in(0,1)$. Since $\psi$ is bounded, we obtain:
$$\mathbb{E}\left[\left|\frac{\partial s_n(\tilde\beta_1,\beta_2^*,\pi^*)}{\partial\beta_1}\right|\right]\le C\,n^{-1}\sum_{t=1}^{[n\pi^*]}\mathbb{E}\big[|X_t|\big]\,\mathbb{E}\left[\psi^2\!\left(\frac{r_t(\tilde\beta_1)}{s_n(\tilde\beta_1,\beta_2^*,\pi^*)}\right)\right]^{1/2}<C\qquad(33)$$
Then, writing a similar relation for $(\partial s_n/\partial\beta_2)(\beta_1^*,\tilde\beta_2,\pi^*)$, we have for (32):
$$\mathbb{E}\big[|s_n(\eta,\pi^*)-s_n(\eta^*,\pi^*)|\big]\longrightarrow 0,\quad\text{for }\varrho\to 0\qquad(34)$$
If $\pi^*=0$ or $\pi^*=1$, then in relation (32) the term in $\beta_1$, respectively $\beta_2$, does not appear.
Now, we study $|s_n(\eta,\pi)-s_n(\eta,\pi^*)|$, supposing that $\pi<\pi^*$. Since $s_n(\eta,\pi)$ and $s_n(\eta,\pi^*)$ are both solutions of (4), we have:
$$n^{-1}\sum_{t=1}^{[n\pi^*]}\left[\rho\!\left(\frac{r_t(\beta_1)}{s_n(\eta,\pi)}\right)-\rho\!\left(\frac{r_t(\beta_1)}{s_n(\eta,\pi^*)}\right)\right]+n^{-1}\sum_{t=[n\pi^*]+1}^{n}\left[\rho\!\left(\frac{r_t(\beta_2)}{s_n(\eta,\pi)}\right)-\rho\!\left(\frac{r_t(\beta_2)}{s_n(\eta,\pi^*)}\right)\right]=n^{-1}\sum_{t=[n\pi]+1}^{[n\pi^*]}\left[\rho\!\left(\frac{r_t(\beta_1)}{s_n(\eta,\pi)}\right)-\rho\!\left(\frac{r_t(\beta_2)}{s_n(\eta,\pi)}\right)\right]$$
Thus, applying the MVT:
$$n^{-1}\big[s_n(\eta,\pi^*)-s_n(\eta,\pi)\big]\left[\sum_{t=1}^{[n\pi^*]}r_t(\beta_1)\psi\!\left(\frac{r_t(\beta_1)}{u_n^{(1)}(\eta,\pi,\pi^*)}\right)+\sum_{t=[n\pi^*]+1}^{n}r_t(\beta_2)\psi\!\left(\frac{r_t(\beta_2)}{u_n^{(2)}(\eta,\pi,\pi^*)}\right)\right]=n^{-1}\sum_{t=[n\pi]+1}^{[n\pi^*]}\big[X_t(\beta_2-\beta_1)\big]\,\psi\!\left(\frac{\tilde r_t(\beta_1,\beta_2)}{s_n(\eta,\pi)}\right)\qquad(35)$$
where $u_n^{(1)}$, $u_n^{(2)}$ are two positive bounded functions, not necessarily solutions of (4), and $\tilde r_t(\beta_1,\beta_2)=r_t(\beta_1)+m_t[r_t(\beta_2)-r_t(\beta_1)]$, with $0<m_t<1$. By relation (11):
$$\sum_{t=1}^{[n\pi^*]}r_t(\beta_1)\psi\!\left(\frac{r_t(\beta_1)}{u_n^{(1)}(\eta,\pi,\pi^*)}\right)+\sum_{t=[n\pi^*]+1}^{n}r_t(\beta_2)\psi\!\left(\frac{r_t(\beta_2)}{u_n^{(2)}(\eta,\pi,\pi^*)}\right)>0\qquad(36)$$
Note that $\tilde r_t(\beta_1,\beta_2)=Y_t-X_t[\beta_1+m_t(\beta_2-\beta_1)]=r_t(\beta_1+m_t(\beta_2-\beta_1))$. Using the same arguments as for (33), we obtain that:
$$n^{-1}\sum_{t=[n\pi]+1}^{[n\pi^*]}\mathbb{E}\big[|X_t|\big]\,\mathbb{E}\left[\psi^2\!\left(\frac{\tilde r_t(\beta_1,\beta_2)}{s_n(\eta,\pi)}\right)\right]^{1/2}\le C(\pi^*-\pi)$$
where $C$ is a vector with all components bounded.
Taking into account also (36), we obtain for (35):
$$\mathbb{E}\big[|s_n(\eta,\pi)-s_n(\eta,\pi^*)|\big]\le C\,\|\beta_2-\beta_1\|\cdot|\pi-\pi^*|<C\varrho\xrightarrow[\varrho\to 0]{}0\qquad\blacksquare$$

References

[1] J. Bai, Least squares estimation of a shift in linear processes, Journal of Time Series Analysis 15 (1994) 453-472.
[2] J. Bai, Estimation of multiple-regime regressions with least absolute deviation, Journal of Statistical Planning and Inference 74 (1998) 103-134.
[3] J. Bai, P. Perron, Estimating and testing linear models with multiple structural changes, Econometrica 66 (1998) 47-78.
[4] R. Baillie, Long memory processes and fractional integration in econometrics, Journal of Econometrics 73 (1996) 5-59.
[5] A.E. Beaton, J.W. Tukey, The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data, Technometrics 16 (1974) 147-185.
[6] J. Beran, Statistics for Long-Memory Processes, Chapman & Hall, New York, 1994.
[7] P.K. Bhattacharya, Some aspects of change-point analysis, IMS Lecture Notes - Monograph Series, Vol. 23, Hayward, CA, 1994, pp. 28-56.
[8] Y.X. Cheung, Long memory in foreign exchange rates, Journal of Business and Economic Statistics 11 (1993) 93-101.
[9] G. Ciuperca, N. Dapzol, Maximum likelihood estimator in a multi-phase random regression model, Statistics 42 (2008) 363-381.
[10] G. Ciuperca, Estimating nonlinear regression with and without change-points by the LAD-method, Annals of the Institute of Statistical Mathematics, in revision.
[11] L. Davies, The asymptotics of S-estimators in the linear regression model, Annals of Statistics 18 (1990) 1651-1675.
[12] Z. Ding, C.W.J. Granger, R.F. Engle, A long memory property of stock market returns and a new model, Journal of Empirical Finance 1 (1993) 83-106.
[13] P.I. Feder, On asymptotic distribution theory in segmented regression problems - identified case, Annals of Statistics 3 (1975) 49-83.
[14] P.I. Feder, The log likelihood ratio in segmented regression, Annals of Statistics 3 (1975) 84-97.
[15] I.
Fiteni, Robust estimation of structural break points, Econometric Theory 18 (2002) 349-386.
[16] I. Fiteni, τ-estimators of regression models with structural change of unknown location, Journal of Econometrics 119 (2004) 19-44.
[17] H. Guo, H.L. Koul, Nonparametric regression with heteroscedastic long memory errors, Journal of Statistical Planning and Inference 137 (2007) 379-404.
[18] F.R. Hampel, A general qualitative definition of robustness, Annals of Mathematical Statistics 42 (1971) 1887-1896.
[19] J. Hidalgo, P.M. Robinson, Testing for structural change in a long-memory environment, Journal of Econometrics 70 (1996) 159-174.
[20] L. Horváth, P. Kokoszka, The effect of long-range dependence on change-point estimators, Journal of Statistical Planning and Inference 64 (1997) 57-81.
[21] P.J. Huber, The behaviour of maximum likelihood estimates under nonstandard conditions, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, Berkeley, 1967, pp. 221-234.
[22] J. Kim, H.J. Kim, Asymptotic results in segmented multiple regression, Journal of Multivariate Analysis 99 (2008) 2016-2038.
[23] H.L. Koul, L. Qian, Asymptotics of maximum likelihood estimator in a two-phase linear regression model, Journal of Statistical Planning and Inference 108 (2002) 99-119.
[24] H.L. Koul, R.T. Baillie, Asymptotics of M-estimators in non-linear regression with long memory design, Statistics and Probability Letters 61 (2003) 237-252.
[25] H.L. Koul, L. Qian, D. Surgailis, Asymptotics of M-estimators in two-phase linear regression models, Stochastic Processes and their Applications 103 (2003) 123-154.
[26] C.M. Kuan, C.C. Hsu, Change-point estimation of fractionally integrated processes, Journal of Time Series Analysis 19 (1998) 693-708.
[27] S. Lazarová, Testing for structural change in regression with long memory processes, Journal of Econometrics 129 (2005) 329-372.
[28] A.W.
Lo, Long term memory in stock market prices, Econometrica 59 (1991) 1279-1313.
[29] L.C. Nunes, C.M. Kuan, P. Newbold, Spurious break, Econometric Theory 11 (1995) 736-749.
[30] W. Palma, Long-Memory Time Series: Theory and Methods, Wiley, New Jersey, 2007.
[31] P.M. Robinson, Time series with strong dependence, in: C.A. Sims (Ed.), Advances in Econometrics, Sixth World Congress, Cambridge University Press, 1994.
[32] E. Roelant, S. Van Aelst, C. Croux, Multivariate generalized S-estimators, Journal of Multivariate Analysis (2008), in press.
[33] P.J. Rousseeuw, V.J. Yohai, Robust regression by means of S-estimators, in: J. Franke, W. Härdle, R.D. Martin (Eds.), Robust and Nonlinear Time Series Analysis, Lecture Notes in Statistics, Vol. 26, Springer, New York, 1984, pp. 256-272.
[34] A.L. Rukhin, I. Vajda, Change-point estimation as a nonlinear regression problem, Statistics 30 (1997) 181-200.
[35] Ph. Sibbertsen, Long memory versus structural breaks: an overview, Statistical Papers 45 (2004) 465-515.
[36] L. Zhengyan, L. Degui, C. Jia, Asymptotic behavior for S-estimators in random design linear model with long-range-dependent errors, Metrika 66 (2007) 289-303.
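To make the objects manipulated in the proofs concrete, here is a minimal numerical sketch of S-estimation in a two-phase model of type (1). All simplifications are mine, not the paper's: scalar i.i.d. Gaussian errors and regressors replace the long-memory setting, Tukey's biweight $\rho$ with $b=1/2$ defines the M-scale (the analogue of equation (4)), per-segment least-squares fits stand in for the full S-minimization over $\beta$, and the change point is chosen by a grid search retaining the candidate with the smallest scale $s_n$.

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(u, c=1.547):
    # Tukey biweight rho, normalized so that rho tends to 1 at infinity
    v = np.minimum(np.abs(u) / c, 1.0)
    return 1 - (1 - v ** 2) ** 3

def m_scale(r, b=0.5, tol=1e-10):
    # Solve (1/n) * sum rho(r_t / s) = b for s by bisection;
    # the mean of rho(r/s) is decreasing in s
    lo, hi = 1e-8, 10 * (np.max(np.abs(r)) + 1e-8)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.mean(rho(r / mid)) > b:
            lo = mid   # scale too small -> average rho too large
        else:
            hi = mid
    return 0.5 * (lo + hi)

# two-phase data: Y_t = X_t*beta1 for t <= n*pi0, X_t*beta2 after, plus noise
n, pi0, beta1, beta2 = 200, 0.5, 1.0, 3.0
x = rng.normal(size=n)
eps = rng.normal(scale=0.5, size=n)
k0 = int(n * pi0)
y = np.where(np.arange(n) < k0, beta1 * x, beta2 * x) + eps

def seg_fit(xs, ys):
    # least-squares slope, used as a simplified inner regression fit
    return np.dot(xs, ys) / np.dot(xs, xs)

# grid search: the change-point estimate minimizes the M-scale of the residuals
best = min(range(20, n - 20),
           key=lambda k: m_scale(np.concatenate([
               y[:k] - seg_fit(x[:k], y[:k]) * x[:k],
               y[k:] - seg_fit(x[k:], y[k:]) * x[k:]])))
print("estimated change point:", best / n)
```

With $c=1.547$ and $b=1/2$, the M-scale is calibrated so that $s_n\approx 1$ for standard Gaussian residuals, which matches the consistency requirement $s_n\to\sigma$ used throughout the proofs.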