On detecting weak changes in the mean of CHARN models
Joseph Ngatchou-Wandji^{a,*} and Marwa Ltaifa^{b}

^{a} EHESP Sorbonne Paris Cité and Institut Élie Cartan de Lorraine, 54506 Vandoeuvre-lès-Nancy cedex, France. E-mail: [email protected]

^{b} Institut Élie Cartan de Lorraine, 54506 Vandoeuvre-lès-Nancy cedex, France, and LAMMDA-ESST Hammam-Sousse, University of Sousse, 4011 Hammam-Sousse, Tunisia. E-mail: [email protected]

^{*} Corresponding author.
Abstract
We study a likelihood ratio test for detecting multiple weak changes in the mean of a class of CHARN models. The locally asymptotically normal (LAN) structure of the family of likelihoods under study is established. It follows that the test is asymptotically optimal, and an explicit form of its asymptotic local power is given as a function of the candidate change locations. Estimates of the weak change locations are obtained as the time indices maximizing an estimate of the local power. A simulation study shows the good performance of our methods compared with some CUSUM approaches. Our results are also applied to three sets of real data.
Running headline.
Change detection in CHARN models
AMS 2000 subject classification.
62M10; 62M02; 62M05; 62F03; 62F05.
Key words.
Change-points; LAN; Likelihood-ratio tests; CHARN models.
1 Introduction

This paper deals with the detection of weak changes in the mean of Conditional Heteroscedastic AutoRegressive Nonlinear (CHARN) models (see [34]). By a weak change we mean one whose magnitude is very small. In a series issued from a phenomenon under monitoring, a weak change may be a precursor sign of upcoming more significant changes indicating a critical behavior. Such phenomena can be found in economics and finance, public health, biosciences, engineering, climatology, hydrology, linguistics, genomics and many other domains. Because weak changes may be missed by classical change detection methods, it can be of primary importance to find screening methods better adapted to their detection.

Since [57], the study of change-points has attracted a lot of attention and has been the subject of a large amount of work. The monographs of [7] and [18] present the basic notions and theory on the subject. For long, studies were carried out in the context of independent observations, as in [16], [44], [29], [50], [51], [32], [67], [59], [58] and [64]. In the last decades there has been growing attention to change-point study in dependent data. [38] propose several methods of on-line change detection in the variance of a heteroskedastic time series, [61] constructs Wald-type tests for break detection in the trend of a dynamic time series, [62] and [38] study tests for detecting changes in the mean, and [38], [10], [30] and [53] study tests for change detection in the parameters of various time series models. [3] and [60] give surveys of recent methods and techniques for change-point study in time series.

Most change detection strategies are based either on estimation or on testing. Non-exhaustive references on estimation approaches are [6], [23], [24], [17], [4], [2], [37] and [66]. Some references on testing approaches are, among others, [45], [15], [20], [21], [63], [28], [5], [68], [27], [43], [26] and [31].
Some optimization-based methods are described in [47], [35] and [52]. However, except perhaps [27], [46] and [11], where weak changes are considered in the context of independent data, most of the existing studies focus on the detection of abrupt changes that may be seen with the naked eye. Moreover, for testing approaches, the theoretical local power is rarely studied. Such a study seems to have been done only in [21], which studies the local power of the CUSUM and Wilcoxon tests in the context of long-range dependent shifted stationary time series. So, to the best of our knowledge, the study of weak changes in time series remains very little explored.

The methods proposed here are for multiple weak change detection in off-line data. They are, in spirit, model-choice approaches, based on a likelihood-ratio test for discriminating between contiguous time-varying-mean CHARN models built from a segmentation of the observations. The theoretical power of the test, expressed as a function of the candidate breaks, plays an important role in the detection of the changes and the estimation of their locations. The techniques for its computation are closer to those of [41] and [42] or [8] and [9] than to those in [7] and [22]. We use them in Section 3, where we present our test as well as the associated theoretical results, based on the models, notation and assumptions listed in Section 2. In Section 4, a simulation experiment shows the good performance of our methods compared with the CUSUM and weighted CUSUM tests studied in [39]. Our methods are also applied to real data from the floods of the Upper Hanjiang River in China. Section 5 contains the proofs of the theoretical results stated in Section 3.
2 Models, notation and assumptions

Our methods rest on the local power of a likelihood-ratio test between contiguous time-dependent-coefficient CHARN models. Before studying the test, we fix the notation, present the models and list the main assumptions.
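To fix ideas before the formal definitions, the class of models considered below (CHARN models with a piecewise-constant mean shift) can be simulated in a minimal special case: $p = 1$, constant $T$ and $V$, and a single candidate change location. This is a hedged Python sketch (the paper's own computations use R); all names and parameter values are illustrative:

```python
import numpy as np

def simulate_charn(n, t1, gamma1, gamma2, T=0.0, V=1.0, seed=0):
    """Simulate X_t = T + gamma^T omega(t) + V * eps_t in the special
    case of constant T and V: mean gamma1 before t1, gamma2 from t1 on."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)                      # standard white noise
    mean = np.where(np.arange(1, n + 1) < t1, gamma1, gamma2)
    return T + mean + V * eps

# a series of length 200 with a weak mean shift of 0.3 at index 100
x = simulate_charn(n=200, t1=100, gamma1=0.0, gamma2=0.3)
```

With a shift this weak relative to the noise level, the change is hard to see on the chronogram, which is the situation the paper targets.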
Let $d_1, d_2$ be positive integers with $d = d_1 + d_2$, and let $G$ be a real-valued function defined and differentiable on $\mathbb{R}^d$. For $z = (z_1, \ldots, z_d) = (z^{(1)}, z^{(2)}) \in \mathbb{R}^d$ and any $k = 1, 2$, we denote
$$\partial_z G(z) := \left( \frac{\partial G(z)}{\partial z_1}, \ldots, \frac{\partial G(z)}{\partial z_d} \right), \qquad \partial_{z^{(k)}} G(z) := \left( \frac{\partial G(z)}{\partial z^{(k)}_1}, \ldots, \frac{\partial G(z)}{\partial z^{(k)}_{d_k}} \right),$$
$$\partial^2_z G(z) := \left( \frac{\partial^2 G(z)}{\partial z_i \partial z_j} : 1 \le i, j \le d \right), \qquad \partial^2_{z^{(k)}} G(z) := \left( \frac{\partial^2 G(z)}{\partial z^{(k)}_i \partial z^{(k)}_j} : 1 \le i, j \le d_k \right),$$
$$\partial^2_{z^{(1)} z^{(2)}} G(z) := \left( \frac{\partial^2 G(z)}{\partial z^{(1)}_i \partial z^{(2)}_j} : 1 \le i \le d_1,\ 1 \le j \le d_2 \right).$$
Let $U = (U_1, \ldots, U_d)^\top$ and $D = (D_1^\top, \ldots, D_d^\top)^\top$, where for all $i = 1, \ldots, d$, $D_i$ is a $1 \times d$ matrix and $H^\top$ stands for the transpose of a vector or matrix $H$. We define $DU^\top = U^\top D = (D_1^\top U_1, \ldots, D_d^\top U_d)^\top$. We denote by $\|\cdot\|$ the Euclidean norm on $\mathbb{R}^d$ and by $\|A\|_M = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$ the norm of any square matrix $A = (a_{ij})_{1 \le i,j \le n}$. For a differentiable function $h$ with derivative $h'$, we denote
$$\varphi_h = \frac{h'}{h} \qquad \text{and} \qquad I(h) = \int \varphi_h^2(x)\, h(x)\, dx.$$

Let $k, n \in \mathbb{N}$ with $k \ll n$. For the full description of our methodology, we assume the observations $X_1, \ldots, X_n$ are issued from the following time-varying-coefficient $p$th-order CHARN($p$) model:
$$X_t = T(Z_{t-1}) + \gamma^\top \omega(t) + V(Z_{t-1})\, \varepsilon_t, \quad t \in \mathbb{Z}, \qquad (1)$$
where $T$ and $V$ are real-valued functions such that $\inf_{x \in \mathbb{R}^p} V(x) > 0$, $\gamma = (\gamma_1, \ldots, \gamma_k, \gamma_{k+1})^\top \in \mathbb{R}^{k+1}$,
$$\omega(t) = \left( \mathbb{1}_{t \in [t_0, t_1)},\ \mathbb{1}_{t \in [t_1, t_2)},\ \ldots,\ \mathbb{1}_{t \in [t_{k-1}, t_k)},\ \mathbb{1}_{t \in [t_k, t_{k+1})} \right)^\top,$$
with $1 = t_0 < t_1 < \ldots < t_k < t_{k+1} = n$ standing for the candidate locations of the changes, $(X_t)_{t \in \mathbb{Z}}$ is piece-wise stationary and ergodic on the $[t_j, t_{j+1})$'s, $(\varepsilon_t)_{t \in \mathbb{Z}}$ is a white noise with density function $f$, and for all $t \in \mathbb{Z}$, $Z_t = (X_t, \ldots, X_{t-p+1})^\top$ and $\varepsilon_t$ is independent of the $\sigma$-algebra $\mathcal{G}_{t-1} = \sigma(Z_0, \ldots, Z_{t-1})$.

In the next section, we construct a likelihood-ratio test for testing $H_0 : \gamma = \gamma_0$ against $H^{(n)}_\beta : \gamma = \gamma_0 + \frac{\beta}{\sqrt{n}} = \gamma_n$, $n \ge 1$, for some $\gamma_0 \in \mathbb{R}^{k+1}$ and $\beta \in \mathbb{R}^{k+1}$ depending on the $t_j$'s. We study its properties under $H_0$ and $H^{(n)}_\beta$. For any $k \ge$
0, denote by $P_{k, \mathbf{t}_k}$ the theoretical power of this test at $\mathbf{t}_k = (t_1, \ldots, t_k)$, with the convention that $P_{0, t_0} = \alpha$, $\alpha \in (0, 1)$, the level of the test. An expression of $P_{k, \mathbf{t}_k}$ is given in the next section as a function of the parameters. It is computed under the following general assumptions:

$(A_1)$: $f$ is differentiable and $\int x f(x)\,dx = 0$, $\int x^2 f(x)\,dx = 1$.

$(A_2)$: $\varphi_f$ is differentiable with a $c_\varphi$-Lipschitzian derivative $\varphi'_f$, where $0 < c_\varphi < \infty$.

$(A_3)$: $\lim_{x \to +\infty} f(x) = \lim_{x \to -\infty} f(x) = \lim_{x \to +\infty} f'(x) = \lim_{x \to -\infty} f'(x) = 0$.

$(A_4)$: $\int |\varphi_f(x)|^3 f(x)\,dx < \infty$.

$(A_5)$: For all $j = 1, \ldots, k+1$, $n_j(n) = t_j - t_{j-1} \longrightarrow \infty$ and $n_j(n)/n \longrightarrow \alpha_j$ as $n \to \infty$.

$(A_6)$: The sequence of the $Z_t$'s whose components are associated with indices within $[t_{j-1}, t_j)$, $j = 1, \ldots, k+1$, is stationary and ergodic with stationary cumulative distribution function $F_j$.

$(A_7)$: $\mu_{j\ell}(\gamma_0) = I(f) \int V^{-\ell}(x)\, dF_j(x) < \infty$, $j = 1, \ldots, k+1$, $\ell \le 2$.

Remark 1 (i) In the case $\gamma_0 \ne 0$, assumption $(A_6)$ may not hold. But one can check that it does at least for the subclass of (1) with constant function $T$. These models have the form of those studied in [20] and [21].

(ii) On each $[t_{j-1}, t_j)$, $j = 2, \ldots, k+1$, there are at most $p$ random vectors $Z_t$ whose components are associated with indices within both $[t_{j-1}, t_j)$ and $[t_{j-2}, t_{j-1})$. They may not have the same stationary distribution. But since their number is negligible compared with $n_j(n)$, their distributions do not affect the asymptotic results.

Remark 2 (i) The $\mu_{j\ell}(\gamma_0)$'s are functions of $\gamma_0$ through the cumulative distribution $F_j$.

(ii) Assumption $(A_3)$ implies $\int \varphi_f(x) f(x)\,dx = 0$ and $\int \varphi_f^2(x) f(x)\,dx = \int \varphi'_f(x) f(x)\,dx$.
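As a concrete instance (added here for illustration), the standard Gaussian density satisfies $(A_1)$-$(A_4)$:

```latex
f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad
\varphi_f(x) = \frac{f'(x)}{f(x)} = -x, \qquad
\varphi'_f(x) = -1, \qquad
I(f) = \int \varphi_f^2(x) f(x)\,dx = \int x^2 f(x)\,dx = 1.
```

Indeed, $f$ has mean 0 and variance 1, $\varphi'_f$ is constant (hence Lipschitzian), $f$ and $f'$ vanish at $\pm\infty$, and the moments of $\varphi_f$ under $f$ are finite.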
(iii) We assume that, as $n$ tends to infinity, the density function of $(\varepsilon_0, Z_0)^\top$ under $H^{(n)}_\beta$ tends to its density under $H_0$. With this, asymptotically, $(\varepsilon_0, Z_0)^\top$ has no influence on the test statistic derived here. As it allows for a simplification of the form of the likelihoods, we consider in this paper the likelihoods and log-likelihoods associated with $(\varepsilon_0, Z_0^\top, X_1, X_2, \ldots, X_n)^\top$.

Remark 3
A decision about the above hypotheses can be taken on the basis of $P_{k,\mathbf{t}_k}$. Since it is unknown, an estimator $\widehat{P}_{k,\mathbf{t}_k}$ can be obtained by substituting the parameters with their estimators in its expression. In particular, for $j = 1, \ldots, k+1$, the $j$th component of $\beta$ can be estimated by $\sqrt{n}\,(\overline{X}_{j,n} - \widehat{\gamma}_{j,n})$, where $\widehat{\gamma}_{j,n}$ is an estimator of the $j$th component of $\gamma_0$, and $\overline{X}_{j,n}$ is the sample mean of the observations with time indices within $[\tau_j, \tau_{j+1})$. A possible $\widehat{\gamma}_{j,n}$ is the sample mean of the observations with indices in $[t_{j-1}, t_j)$.

2.3 Strategies for detecting changes and estimating their locations

Many strategies for change detection and for estimating change locations in off-line data have been proposed in the literature. One may refer to [56], or to [55] for a review. The strategies that we propose here rest on the power of the test studied in this paper. Before presenting them, we first discuss the forms of $\gamma_0$ and $\beta$ in (1).

One would invoke a change in the mean of the data when, at a certain time index, the mean changes. Thus, the reference mean would be that of the earliest observations, assumed to be stationary up to the time index $t_1$ where the first change occurs or is assumed to occur. Hence, the first component of $\gamma_0$ will always be taken to be the theoretical mean $\mu_1$ of the data on $[t_0, t_1)$, while the first component of $\beta$ would be nil. [4] explain how to check the stationarity of these earliest observations, called there historical data. [1] proposes change-point tests assuming the location of the potential change is known a priori and the number of observations before or after the change point is presumed fixed. [14] gives a survey of change detection methods with known or unknown potential change points. All this suggests that in change-point study one may need some a priori information about the data.

It is well known (see, e.g., [13], page 14) that one of the starting points of the modeling of a time series $X_1, X_2, \ldots$
, $X_n$ is the analysis of its graph (or chronogram). This can provide some information, for example a guessed historical data set $X_1, X_2, \ldots, X_m$, $m \ll n$, a maximum number $K$ of breaks and areas where they may be located, and hence a minimum distance $h \ll n$ between adjacent breaks.

Let $\zeta \in (0, 0.5)$ be a given small positive number. Our procedures for detecting changes and estimating their locations in a time series $X_1, X_2, \ldots, X_n$ start with finding the above a priori information and continue as follows.

S1 - Change detection:
i- Take $k = 1$ and all $t_1$'s satisfying $m \le t_1 \le n - h$. If $|\widehat{P}_{1,t_1} - P_{0,t_0}| \le \zeta$ for all such $t_1$, then no change is detected in the series.

ii- If $|\widehat{P}_{1,t_1} - P_{0,t_0}| > \zeta$ for some $t_1$, then there is at least one change in the series.

S2 - Estimating the change locations:
For $k = 1, \ldots, K$, let $m < \tau_1 < \ldots < \tau_k \le n - h$, $\tau_j - \tau_{j-1} \ge h$, $j = 2, \ldots, k$, be the potential locations of the changes obtained from the chronogram. Let $C_j$ be an arbitrary set of time indices containing $\tau_j$, with $C_j \cap C_\ell = \emptyset$, $j \ne \ell = 1, \ldots, k$. Consider $S_k = C_1 \times C_2 \times \ldots \times C_k$. For any $k$-tuple $\boldsymbol{\tau}_k = (\tau_1, \ldots, \tau_k) \in S_k$, apply the above testing problem with $t_j = \tau_j$, $j = 1, \ldots, k$, and compute $\widehat{P}_{k, \mathbf{t}_k}$.

i- At step $k+1$, if $|\widehat{P}_{k+1, \mathbf{t}_{k+1}} - \widehat{P}_{k, \mathbf{t}_k}| \le \zeta$ and $|\widehat{P}_{k, \mathbf{t}_k} - \widehat{P}_{k-1, \mathbf{t}_{k-1}}| > \zeta$, then an estimate $(\widehat{k}, \widehat{\mathbf{t}}_k)$ of the couple of the number of changes and the vector of their locations can be obtained as
$$(\widehat{k}, \widehat{\mathbf{t}}_k) = \arg\max_{\mathbf{t}_k \in S_k} \widehat{P}_{k, \mathbf{t}_k}.$$

ii- If $|\widehat{P}_{k+1, \mathbf{t}_{k+1}} - \widehat{P}_{k, \mathbf{t}_k}| > \zeta$, repeat item i with $k+1$ substituted for $k$.

Clearly, if $k$, the number of changes, is known, $\mathbf{t}_k$ can be estimated by
$$\widehat{\mathbf{t}}_k = \arg\max_{\mathbf{t}_k \in S_k} \widehat{P}_{k, \mathbf{t}_k}. \qquad (2)$$

S3 - Alternative to
S2: As a substitute for S2, we can use the following sequential strategy. Consider the segments $I'_0 = \{1, \ldots, m, \ldots, \tau_1 + h\}$ and $I'_k = \{\tau_k + h, \ldots, \tau_{k+1} + h\}$, $k = 1, \ldots, K$.

i- Step 1: Apply the above test for each $t_1 \in I'_0$. If $|\widehat{P}_{1,t_1} - P_{0,t_0}| \le \zeta$ for all these $t_1$'s, then there is no change location on $I'_0$. Otherwise, there is one, estimated by $\widehat{t}_1 = \arg\max_{t_1 \in I'_0} \widehat{P}_{1,t_1}$.

ii- Step $k+1$: Set $\widetilde{I}'_{k+1} = I'_{k+1}$ if no change was detected on $I'_k$, and $\widetilde{I}'_{k+1} = \{\widehat{t}_k + h, \ldots, \tau_{k+1} + h\}$ if a change was detected on $I'_k$ at $\widehat{t}_k$. Then repeat item i with $\widetilde{I}'_{k+1}$ substituted for $I'_0$.

It is clear that the total number of changes in the series is the number of the $I'_k$'s where a change has been located, and the locations are estimated by the $\widehat{t}_k$'s.

Note - Part i of S1 comes from the idea that, in the case of no change, all the estimates of the possible $\beta$ would be close to 0, and the estimated power would be close to the level of the test. Part ii is motivated by the fact that if there is at least one change, then one can find one $t_1$ for which the means of the observations before and after that time index are different. The test would then reject the null hypothesis of no break. S2 and S3 are also built on such ideas.

Remark 4 i- In practice, the $C_j$'s in strategy S2 must not be too large.

ii- In the case of a single change, one can use strategy S1 for change detection and, for estimating its location, apply (2) with $S_1 = \{m+1, \ldots, n-m\}$, or S3 with $I'_0 = \{m+1, \ldots, n-m\}$.

3 The test and its asymptotic properties

3.1 The case of known parameters

We first establish the contiguity of the sequences $\{H^{(n)} = H_0\}$ and $\{H^{(n)} = H^{(n)}_\beta\}$, $\beta \in \mathbb{R}^{k+1}$. We denote by $\Lambda_n(\gamma_0, \beta)$ the log-likelihood ratio of $H_0$ against $H^{(n)}_\beta$, and we define the central statistic
$$\Delta_n(\gamma_0, \beta) = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \frac{\beta^\top \omega(t)}{V(Z_{t-1})}\, \varphi_f[\varepsilon_t(\gamma_0)], \qquad (3)$$
where for all $\gamma = (\gamma_1, \ldots$
, \gamma_{k+1})^\top \in \mathbb{R}^{k+1}$ and all $t \in \mathbb{Z}$,
$$\varepsilon_t(\gamma) = \frac{X_t - T(Z_{t-1}) - \gamma^\top \omega(t)}{V(Z_{t-1})}. \qquad (4)$$

Theorem 1
Assume that $(A_1)$-$(A_7)$ hold. Then, for any $\beta \in \mathbb{R}^{k+1}$, under $H_0$, as $n \to \infty$,
$$\Lambda_n(\gamma_0, \beta) = \Delta_n(\gamma_0, \beta) - \frac{1}{2}\,\mu(\gamma_0, \beta) + o_P(1), \qquad \Delta_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}\big(0,\ \mu(\gamma_0, \beta)\big),$$
with
$$\mu(\gamma_0, \beta) = \sum_{j=1}^{k+1} \alpha_j\, \beta_j^2\, \mu_{j2}(\gamma_0) = \varpi^2(\gamma_0, \beta).$$

Proof.
See Appendix.
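Theorem 1 can be checked numerically in a simple special case: take $T \equiv 0$, $V \equiv 1$, Gaussian noise (so $\varphi_f(x) = -x$ and $I(f) = 1$), $k = 1$ and $\beta = (0, b)^\top$; then $\Delta_n = -n^{-1/2}\, b \sum_{t \ge t_1} \varepsilon_t$ and $\mu = \alpha_2 b^2$. A hedged Monte Carlo sketch (Python stand-in, illustrative values only):

```python
import numpy as np

# Special case of the central statistic (3): T = 0, V = 1, Gaussian f,
# one candidate break t1, beta = (0, b); under H0, eps_t(gamma0) = eps_t.
def delta_n(eps, t1, b):
    n = len(eps)
    # phi_f(x) = -x for the standard Gaussian density
    return (1.0 / np.sqrt(n)) * b * np.sum(-eps[t1 - 1:])

rng = np.random.default_rng(1)
n, t1, b = 400, 200, 1.0
reps = np.array([delta_n(rng.standard_normal(n), t1, b) for _ in range(4000)])
mu = ((n - t1 + 1) / n) * b**2          # alpha_2 * b^2, the limiting variance
print(reps.mean(), reps.var(), mu)      # empirical mean near 0, variance near mu
```

The empirical distribution of the replicated $\Delta_n$ values is close to $\mathcal{N}(0, \mu)$, as the theorem states in this special case.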
Corollary 1
Assume that $(A_1)$-$(A_7)$ hold. Then, for any $\beta \in \mathbb{R}^{k+1}$, the sequences $\{H^{(n)}_\beta : n \ge 1\}$ and $\{H^{(n)} = H_0 : n \ge 1\}$ are contiguous. Moreover, under $H^{(n)}_\beta$, as $n \to \infty$, $\Delta_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}\big(\mu(\gamma_0, \beta),\ \mu(\gamma_0, \beta)\big)$.

Proof.
For any $\beta \in \mathbb{R}^{k+1}$, from Theorem 1, under $H_0$, as $n \to \infty$, $\Delta_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, \mu(\gamma_0, \beta))$ and
$$\Lambda_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}\left( -\frac{1}{2}\,\mu(\gamma_0, \beta),\ \mu(\gamma_0, \beta) \right).$$
Then it is easy to see that, under $H_0$, as $n \to \infty$,
$$\begin{pmatrix} \Delta_n(\gamma_0, \beta) \\ \Lambda_n(\gamma_0, \beta) \end{pmatrix} \xrightarrow{\ \mathcal{D}\ } \mathcal{N}\left( \begin{pmatrix} 0 \\ -\frac{1}{2}\,\mu(\gamma_0, \beta) \end{pmatrix},\ \begin{pmatrix} \mu(\gamma_0, \beta) & \mu(\gamma_0, \beta) \\ \mu(\gamma_0, \beta) & \mu(\gamma_0, \beta) \end{pmatrix} \right).$$
It results from [48] or [25] that $\{H^{(n)}_\beta : n \ge 1\}$ and $\{H^{(n)} = H_0 : n \ge 1\}$ are contiguous and that, under $H^{(n)}_\beta$, as $n \to \infty$, $\Delta_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(\mu(\gamma_0, \beta), \mu(\gamma_0, \beta))$.

For known $\gamma_0$ and any $\beta \in \mathbb{R}^{k+1}$, for testing $H_0$ against $H^{(n)}_\beta$, we base our test on the statistic
$$T_n(\gamma_0, \beta) = \frac{\Delta_n(\gamma_0, \beta)}{\widehat{\varpi}_n(\gamma_0, \beta)},$$
where $\widehat{\varpi}_n(\gamma_0, \beta)$ is any consistent estimator of $\varpi(\gamma_0, \beta) = \sqrt{\mu(\gamma_0, \beta)}$. In the sequel, $\widehat{\varpi}_n(\gamma_0, \beta)$ will be taken to be the natural estimator $\sqrt{\widehat{\mu}_n(\gamma_0, \beta)}$, with $\widehat{\mu}_n(\gamma_0, \beta) = \sum_{j=1}^{k+1} \widehat{\alpha}_j\, \beta_j^2\, \widehat{\mu}_{j2}(\gamma_0)$, where for $j = 1, \ldots, k+1$, $\widehat{\alpha}_j$ is an estimator of $\alpha_j = \lim_{n \to \infty} n_j(n)/n$ and
$$\widehat{\mu}_{j2}(\gamma_0) = \frac{I(f)}{n_j(n)} \sum_{t=t_{j-1}}^{t_j} V^{-2}(Z_{t-1}).$$

Theorem 2
Assume that $(A_1)$-$(A_7)$ hold. Then, for any $\beta \in \mathbb{R}^{k+1}$:

(i) Under $H_0$, $T_n(\gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, 1)$, as $n \to \infty$.

(ii) Under $H^{(n)}_\beta$, at level of significance $\alpha \in (0, 1)$, the asymptotic power of the test based on $T_n(\gamma_0, \beta)$ is $P_{k, \mathbf{t}_k} = 1 - \Phi(u_\alpha - \varpi(\gamma_0, \beta))$, where $u_\alpha$ is the $(1-\alpha)$-quantile of a standard Gaussian distribution with cumulative distribution function $\Phi$.

(iii) The test based on $T_n(\gamma_0, \beta)$ is locally asymptotically optimal.

Proof.
See Appendix.

3.2 The parametric models
Now we place ourselves in the framework of model (1) with the functions $T$ and $V$ of known forms but depending on unknown parameters. More precisely, we assume that $T(x) = T_{\rho_0}(x)$, $V(x) = V_{\theta_0}(x)$, $\rho_0 \in \Theta \subset \mathbb{R}^l$ and $\theta_0 \in \widetilde{\Theta} \subset \mathbb{R}^q$. Let $\psi_0 = (\rho_0^\top, \theta_0^\top)^\top \in \Theta \times \widetilde{\Theta} \subset \mathbb{R}^l \times \mathbb{R}^q$ be the true nuisance parameter of model (1). For $\gamma \in \mathbb{R}^{k+1}$ and $\psi = (\rho^\top, \theta^\top)^\top \in \Theta \times \widetilde{\Theta}$, define
$$\varepsilon_t(\psi, \gamma) = \frac{X_t - T_\rho(Z_{t-1}) - \gamma^\top \omega(t)}{V_\theta(Z_{t-1})}, \quad t \in \mathbb{Z}. \qquad (5)$$
We make the following additional assumptions:

$(B_1)$: For any $\theta \in \widetilde{\Theta}$ and $z \in \mathbb{R}^p$, $V_\theta(z) > \tau$, where $\tau$ is a positive real number.

$(B_2)$: There exists $\delta > 0$ such that $\int \|x\|^{2+\delta}\, dF_j(x) < \infty$, $j = 1, \ldots, k+1$.

$(B_3)$: $\int |x\, \varphi'_f(x)|\, f(x)\,dx < \infty$.

$(B_4)$: Denote by $\mathrm{Int}(\Theta)$ and $\mathrm{Int}(\widetilde{\Theta})$ the interiors of $\Theta$ and $\widetilde{\Theta}$, respectively. The functions $T_\rho(z)$ and $V_\theta(z)$ are continuous and differentiable with respect to $\rho \in \mathrm{Int}(\Theta)$ and $\theta \in \mathrm{Int}(\widetilde{\Theta})$, respectively, and there exist finite numbers $r_1$ and $r_2$ such that $B(\rho_0, r_1) \subset \mathrm{Int}(\Theta)$, $B(\theta_0, r_2) \subset \mathrm{Int}(\widetilde{\Theta})$, and such that the largest number among $\sup_{\rho \in B(\rho_0, r_1)} \|\partial_\rho T_\rho(z)\|$, $\sup_{\theta \in B(\theta_0, r_2)} |V_\theta(z)|$, $\sup_{\theta \in B(\theta_0, r_2)} \|\partial_\theta V_\theta(z)\|$ and $\sup_{\theta \in B(\theta_0, r_2)} \|\partial^2_\theta V_\theta(z)\|_M$ is bounded by some positive function $\vartheta(z)$ such that, for any $j = 1, \ldots, k+1$, $\int \vartheta(x)\, dF_j(x) < \infty$.

$(B_5)$: The true parameter $\psi_0 = (\rho_0^\top, \theta_0^\top)^\top$ has a consistent estimator $\psi_n = (\rho_n^\top, \theta_n^\top)^\top$ satisfying
$$\sqrt{n}\,(\psi_n - \psi_0) = n^{-1/2} \sum_{t=1}^{n} \Psi(Z_{t-1}, \psi_0)\, \Omega^\top[\varepsilon_t(\psi_0, \gamma_0)] + o_P(1),$$
with $\Psi(z, \psi) = (\Psi_1^\top(z, \psi), \Psi_2^\top(z, \psi))^\top$, $\Omega(z) = (\Omega_1(z), \Omega_2(z))^\top$, $z \in \mathbb{R}^p$, $\Psi_m(z, \psi) = (\Psi_{m1}(z, \psi), \ldots$
, \Psi_{ml}(z, \psi))^\top \in \mathbb{R}^l$, $m = 1, 2$, $\Omega_m(z) \in \mathbb{R}$, $z \in \mathbb{R}^p$, $\int \|\Psi(x, \psi_0)\|^{2+\delta}\, dF_j(x) < \infty$, $j = 1, \ldots, k+1$, $\int \|\Omega(x)\|^{2+\delta} f(x)\,dx < \infty$ and $\int \Omega(x) f(x)\,dx = 0$.

Proposition 1
Assume that $(A_1)$-$(A_7)$ and $(B_5)$ hold. Then, as $n \to \infty$:

(i) Under $H_0$, $\sqrt{n}\,(\psi_n - \psi_0) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, \Sigma)$.

(ii) For any $\beta \in \mathbb{R}^{k+1}$, under $H^{(n)}_\beta$, $\sqrt{n}\,(\psi_n - \psi_0) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(\nu, \Sigma)$, with
$$\nu := \int \Omega^\top(x)\, \varphi_f(x) f(x)\,dx\ \sum_{j=1}^{k+1} \alpha_j \beta_j \int \frac{\Psi(x, \psi_0)}{V_{\theta_0}(x)}\, dF_j(x), \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$
$$\Sigma_{ml} := \int \Omega_m(x)\, \Omega_l(x)\, f(x)\,dx\ \sum_{j=1}^{k+1} \alpha_j \int \Psi_m(x, \psi_0)\, \Psi_l^\top(x, \psi_0)\, dF_j(x), \quad m, l = 1, 2.$$

Proof.
See Appendix.

The case where $\gamma_0$ is known

In practice, the case where the parameter $\gamma_0$ is known may be encountered when there is no apparent change and one wishes to test for possible weak changes. For any $\beta \in \mathbb{R}^{k+1}$, denote by $\Lambda_n(\psi_0, \gamma_0, \beta)$ the log-likelihood ratio of $H_0$ against $H^{(n)}_\beta$, and by $\Delta_n(\psi_0, \gamma_0, \beta)$ the counterpart of $\Delta_n(\gamma_0, \beta)$ given by (3), with $\varepsilon_t(\gamma_0)$ defined by (4) substituted by $\varepsilon_t(\psi_0, \gamma_0)$ defined by (5). For all $\ell \le 2$ and all $j = 1, \ldots, k+$
1, define the real numbers
$$\mu_{j\ell}(\psi_0, \gamma_0) = I(f) \int V_{\theta_0}^{-\ell}(x)\, dF_j(x), \qquad \mu(\psi_0, \gamma_0, \beta) = \sum_{j=1}^{k+1} \alpha_j\, \beta_j^2\, \mu_{j2}(\psi_0, \gamma_0) = \varpi^2(\psi_0, \gamma_0, \beta).$$

Note that, since $V_\theta(x) > \tau >$
0 for any $x$ and $\theta$, the $\mu_{j\ell}(\psi_0, \gamma_0)$'s are finite. Next, since for any $j = 1, \ldots, k+1$, $\mu_{j2}(\psi_0, \gamma_0)$ depends on $F_j$, which itself depends on $\psi_0$ (which is unknown) and on $\gamma_0$, we estimate it by $\widehat{\mu}_{j2}(\psi_n, \gamma_0)$ given by
$$\widehat{\mu}_{j2}(\psi_n, \gamma_0) = \frac{I(f)}{n_j(n)} \sum_{t=t_{j-1}}^{t_j} V_{\widehat{\theta}_n}^{-2}(Z_{t-1}).$$
So, although we can consider any consistent estimators of $\mu(\psi_0, \gamma_0, \beta)$ and $\varpi(\psi_0, \gamma_0, \beta) = \sqrt{\mu(\psi_0, \gamma_0, \beta)}$, we take them here to be, respectively,
$$\widehat{\mu}_n(\psi_n, \gamma_0, \beta) = \sum_{j=1}^{k+1} \widehat{\alpha}_j\, \beta_j^2\, \widehat{\mu}_{j2}(\psi_n, \gamma_0) \qquad \text{and} \qquad \widehat{\varpi}_n(\psi_n, \gamma_0, \beta) = \sqrt{\widehat{\mu}_n(\psi_n, \gamma_0, \beta)},$$
where for all $j = 1, \ldots, k+1$, $\widehat{\alpha}_j$ is an estimator of $\alpha_j$, which can be taken to be $\widehat{\alpha}_j = n_j(n)/n$.

Proposition 2
Assume that $(A_1)$-$(A_7)$ and $(B_1)$-$(B_5)$ hold. Then, for any sequence of consistent and asymptotically normal estimators $\{\psi_n\}_{n \ge 1}$ of $\psi_0$, under $H_0$, as $n \to \infty$, for any $\beta \in \mathbb{R}^{k+1}$, we have:

(i) $\Delta_n(\psi_0, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \gamma_0, \beta) + o_P(1)$,

(ii) $\widehat{\varpi}_n(\psi_n, \gamma_0, \beta) \longrightarrow \varpi(\psi_0, \gamma_0, \beta)$,

where $\{N(n)\}_{n \ge 1}$ stands for a sub-sequence of $\{1, \ldots, n\}$ such that $n/N(n) \longrightarrow 1$ as $n \to \infty$.

Proof.
See Appendix.

For any $\beta \in \mathbb{R}^{k+1}$, for testing $H_0$ against $H^{(n)}_\beta$, we consider the statistic
$$T_n(\psi_{N(n)}, \gamma_0, \beta) = \frac{\Delta_n(\psi_{N(n)}, \gamma_0, \beta)}{\widehat{\varpi}_n(\psi_{N(n)}, \gamma_0, \beta)}.$$

Theorem 3
Assume that $(A_1)$-$(A_7)$ and $(B_1)$-$(B_5)$ hold. Then, for any $\beta \in \mathbb{R}^{k+1}$:

(i) Under $H_0$, $T_n(\psi_{N(n)}, \gamma_0, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, 1)$, as $n \to \infty$.

(ii) Under $H^{(n)}_\beta$, at level of significance $\alpha \in (0, 1)$, the asymptotic power of the test based on $T_n(\psi_{N(n)}, \gamma_0, \beta)$ is $P_{k, \mathbf{t}_k} = 1 - \Phi(u_\alpha - \varpi(\psi_0, \gamma_0, \beta))$, where $u_\alpha$ is the $(1-\alpha)$-quantile of a standard Gaussian distribution with cumulative distribution function $\Phi$.

(iii) The test based on $T_n(\psi_{N(n)}, \gamma_0, \beta)$ is locally asymptotically optimal.

Proof.
See Appendix.

The case where $\gamma_0$ is unknown

The case where $\gamma_0$ is unknown is the one with a real statistical meaning. Both $\gamma_0$ and $\gamma_n = \gamma_0 + \beta/\sqrt{n}$ need to be estimated for the computation of the likelihood-ratio statistic. For any $\beta \in \mathbb{R}^{k+1}$, let $\widehat{\gamma}_{0,n}$ and $\widetilde{\gamma}_n$ be, respectively, the maximum likelihood estimators of $\gamma_0$ and $\gamma_n$. It is easy to check that for large values of $n$, in probability, $\widetilde{\gamma}_n = \widehat{\gamma}_{0,n} + \beta/\sqrt{n}$. Let $N(n)$ be any sub-sequence of $\{1, \ldots, n\}$ satisfying $n/N(n) \longrightarrow 1$ as $n \to \infty$. For any $\beta \in \mathbb{R}^{k+1}$, our test statistic for this testing problem is
$$T_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) = \frac{\Delta_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta)}{\widehat{\varpi}_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta)}.$$

Proposition 3 Assume that $(A_1)$-$(A_7)$ and $(B_1)$-$(B_5)$ hold. Then, for any sequence of consistent and asymptotically normal estimators $\{(\psi_n, \widehat{\gamma}_{0,n})\}_{n \ge 1}$ of $(\psi_0, \gamma_0)$, as $n \to \infty$, under $H_0$, for any $\beta \in \mathbb{R}^{k+1}$,
$$\Delta_n(\psi_0, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1).$$

Proof.
See Appendix.
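In the Gaussian case with $V \equiv 1$ and constant $T$, the maximum-likelihood estimator of each component of $\gamma$ reduces to a segment sample mean, which is also the estimator suggested in Remark 3. A hedged Python sketch (function name and values are ours, for illustration):

```python
import numpy as np

def estimate_gamma(x, breaks):
    """Estimate the components of gamma by segment sample means.
    breaks = [t0, t1, ..., t_{k+1}] are 0-based segment boundaries,
    with t0 = 0 and t_{k+1} = len(x)."""
    return np.array([x[a:b].mean() for a, b in zip(breaks[:-1], breaks[1:])])

rng = np.random.default_rng(2)
# two segments of length 100, with means 0.0 and 0.5
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(0.5, 1.0, 100)])
g = estimate_gamma(x, [0, 100, 200])
# sqrt(n) * (g[1] - g[0]) then estimates the local change magnitude beta_2
```

This is only the special constant-$T$ case; in the general parametric model the full likelihood in $\psi$ and $\gamma$ has to be maximized.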
Theorem 4
Assume that $(A_1)$-$(A_7)$ and $(B_1)$-$(B_5)$ hold. Let $N(n)$ be any sub-sequence of $\{1, \ldots, n\}$ satisfying $n/N(n) \longrightarrow 1$ as $n \to \infty$. Then, for any $\beta \in \mathbb{R}^{k+1}$:

(i) Under $H_0$, $T_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, 1)$, as $n \to \infty$.

(ii) Under $H^{(n)}_\beta$, at level of significance $\alpha \in (0, 1)$, the asymptotic power of the test based on $T_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta)$ is $P_{k, \mathbf{t}_k} = 1 - \Phi(u_\alpha - \varpi(\psi_0, \gamma_0, \beta))$, where $u_\alpha$ is the $(1-\alpha)$-quantile of a standard Gaussian distribution with cumulative distribution function $\Phi$.

(iii) The test based on $T_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta)$ is locally asymptotically optimal.

Proof.
This theorem is a straightforward consequence of Proposition 3 and Theorem 3.
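The closed-form asymptotic local power appearing in Theorems 2-4, $P_{k,\mathbf{t}_k} = 1 - \Phi(u_\alpha - \varpi)$, is straightforward to evaluate numerically. A small helper using only the standard library (the function and argument names are ours):

```python
from statistics import NormalDist

def asymptotic_power(alpha, varpi):
    """P = 1 - Phi(u_alpha - varpi), with u_alpha the (1 - alpha)-quantile
    of the standard Gaussian distribution Phi."""
    nd = NormalDist()
    u_alpha = nd.inv_cdf(1.0 - alpha)
    return 1.0 - nd.cdf(u_alpha - varpi)

print(asymptotic_power(0.05, 0.0))  # equals the level alpha = 0.05
print(asymptotic_power(0.05, 2.0))  # larger noncentrality gives larger power
```

At $\varpi = 0$ (no change) the power equals the level $\alpha$, which is the idea behind part i of strategy S1.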
4 Simulation study and real data applications

In this section, we apply our theoretical results to simulated data, using the software R. We focus on the study of the power as a function of $\beta$ when the break locations are fixed, and as a function of the break locations when the associated $\beta$ is estimated. The breaks are estimated by the strategies described in Subsection 2.3. The results presented in the sequel are obtained for the nominal level $\alpha = 5\%$; those for $\alpha = 10\%$ are omitted as they are very similar. All the estimators in this section are computed from 5000 replications.
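One ingredient used below is the standardized exponential noise $\varepsilon_t = \lambda(\widetilde{\varepsilon}_t - 1/\lambda)$ with $\widetilde{\varepsilon}_t \sim \mathcal{E}(\lambda)$, which has mean 0 and variance 1 as required by $(A_1)$. A quick Python sketch (the paper's simulations are in R; names are ours):

```python
import numpy as np

def standardized_exponential(n, lam=1.25, seed=3):
    """Mean-0, variance-1 noise built from an Exponential(lam) variable:
    eps = lam * (e_tilde - 1/lam)."""
    rng = np.random.default_rng(seed)
    e_tilde = rng.exponential(scale=1.0 / lam, size=n)
    return lam * (e_tilde - 1.0 / lam)

eps = standardized_exponential(100_000)
print(eps.mean(), eps.var())  # close to 0 and 1, respectively
```

Since $\widetilde{\varepsilon}_t$ has mean $1/\lambda$ and variance $1/\lambda^2$, the affine transform indeed centers and standardizes it, whatever the value of $\lambda$.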
4.1 Local power for known breaks

We start with the study of the theoretical local power of our test for known breaks. That is, we consider the testing problem stated in Section 2 when the $t_j$'s are assumed to be given. Thus, the power of the test is a function of $\beta$ only. In this subsection, we study the behavior of this function for several models from the following more general one:
$$X_t = \left( \rho_1 + \rho_2 X_{t-1} e^{-\rho_3 X_{t-1}^2} \right) X_{t-1} + \left( \gamma + \frac{\beta}{\sqrt{n}} \right)^{\!\top} \omega(t) + \left( \theta_1 + \theta_2 X_{t-1} e^{-\theta_3 X_{t-1}^2} \right) \varepsilon_t, \quad t \in \mathbb{Z}, \qquad (6)$$
where the $\rho_j$'s, the $\theta_j$'s and $\gamma$ are parameters specified in each particular model considered, $n$ is the sample size, $(\varepsilon_t)_{t \in \mathbb{Z}}$ is a standard white noise with a differentiable density $f$, and $\beta \in [-1, 1]^{k+1}$, with $k$ standing for the number of breaks.

Figure 1: Power of the test for standard Gaussian and standardized exponential noises (panels (a)-(d) correspond to different parameter settings of model (6)).

• Case $k = 1$: $\rho_1 = \ldots$, $\rho_2 = \ldots$, $\theta_1 = \ldots$, $\theta_2 = \ldots$, $n = \ldots$, $t_1 =$
60. Figure 1 (a)-(b) presents the power of the test as a function of $\beta \in [-1, 1]^2$, for $f$ being, respectively, the standard normal density and the standardized exponential density with parameter 1.25. Note that in the latter situation the $\varepsilon_t$'s are obtained by taking $\varepsilon_t = \lambda(\widetilde{\varepsilon}_t - 1/\lambda)$, with $\widetilde{\varepsilon}_t$ an exponential random variable with parameter $\lambda$. It can be seen on these two graphics that the power tends to 1, and that the power corresponding to the exponential noise tends to 1 faster than that for the Gaussian noise.

We next consider (6) for $\rho_1 = \ldots$, $\rho_2 = \ldots$, $\rho_3 = \ldots$, $\theta_1 = \ldots$, $\theta_2 = \ldots$, $\theta_3 = \ldots$ and $n = \ldots$.

Table 1: Power against an AR(1) model with 2 breaks (for the $\mathcal{N}(0,1)$ and $\mathcal{E}(1.25)$ noises).

Figure 1 (c)-(d) shows the power, respectively, for the standard normal density $f$ and the standardized exponential density $f$ with parameter 1.25. It can be seen that in either case it increases with $\beta$ to 1.

• Case $k = 2$: $n = \ldots$, $t_1 = \ldots$, $t_2 =$
60 and $t_3 = \ldots$; $\rho_1 = \ldots$, $\rho_2 = \ldots$, $\theta_1 = \ldots$, $\theta_2 =$
0. The results for the power corresponding to the standard Gaussian $f$ ($\mathcal{N}(0,1)$) and the standardized exponential $f$ ($\mathcal{E}(1.25)$) are displayed in Table 1. It is seen that the power moves toward 1 as $\beta$ moves away from $(0, 0, 0)^\top$.

• We also studied the case of 3 breaks. The results are not reported, as they were very similar to those of the case of 2 breaks.

The case where $\gamma$ and $f$ are unknown

In practice, $\gamma$ and $f$ may be unknown and may need to be estimated. As the theory is more complex in this case, we have not tried to tackle it, but we have done some trials in order to get an idea of how the test may behave in this situation. While we estimated $\gamma$ by a maximum likelihood method, $f$ was estimated by the well-known kernel (Parzen-Rosenblatt) estimator $\widehat{f}_n$ defined by
$$\widehat{f}_n(x) = \frac{1}{n h_n} \sum_{t=1}^{n} K\left( \frac{x - \widehat{\varepsilon}_t}{h_n} \right), \quad x \in \mathbb{R},$$
where
$$\widehat{\varepsilon}_t = \frac{X_t - T_{\rho_n}(Z_{t-1}) - \widehat{\gamma}_n^\top \omega(t)}{V_{\theta_n}(Z_{t-1})},$$
with $\psi_n = (\rho_n, \theta_n)$ and $\widehat{\gamma}_n$ the maximum likelihood estimates of the nuisance parameter $\psi = (\rho, \theta)$ and $\gamma$, respectively, $h_n = n^{-1/5}$ the bandwidth and $K$ the Gaussian kernel function. The power of the test was computed based on these estimates, which are themselves based on observations from (6) for $\rho_1 = \ldots$, $\rho_2 = \ldots$, $\theta_1 = \ldots$, $\theta_2 = \ldots$ and a given $f$. We studied the case of a single break, and we took $n = \ldots$, $t_1 = 30$ and $t_2 = 60$. The corresponding theoretical powers are not presented, as they were very similar to those in Figure 1.

4.2 Change detection and location estimation
In this subsection, we investigate change detection and location estimation in simulated data and in three series of real data. We compare our methods with the CUSUM tests studied in [39].
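For reference, a textbook single-change CUSUM statistic (a generic variant, not necessarily the exact SCUSUM of [39]) locates a mean change at the index maximizing the centered partial sums. A hedged Python sketch:

```python
import numpy as np

def cusum_changepoint(x):
    """Return (t_hat, max statistic) for the standard CUSUM
    C_k = |S_k - (k/n) S_n| / (sigma_hat * sqrt(n)), k = 1, ..., n-1."""
    n = len(x)
    s = np.cumsum(x)
    k = np.arange(1, n)
    stat = np.abs(s[:-1] - (k / n) * s[-1]) / (x.std(ddof=1) * np.sqrt(n))
    t_hat = int(k[np.argmax(stat)])
    return t_hat, float(stat.max())

rng = np.random.default_rng(4)
# a strong shift of magnitude 2 at index 100: easy for CUSUM
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
t_hat, c = cusum_changepoint(x)
```

Such statistics work well for large shifts; the comparisons below concern the weak-shift regime, where they degrade.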
Simulated data

Here, the data $X_1, \ldots, X_n$, $n = 200$, are simulated from model (1) for $V(x) = \ldots$, $T(x) = \ldots x$, and for several $\gamma$, $\beta$ and $t_j$, $j = 1, \ldots, k$. In each case, the noise is Gaussian.

We first study the situation where the data present no break, and we test no break against the existence of a break. The $n = 200$ observations are simulated for $T(x) = \ldots$, $V(x) = \ldots$, $\gamma = \ldots$ and $\beta = 0$. Thus, they are stationary and contain no break. We simulated four such sets of $n$ observations and took $t_1 = \ldots$ and $\zeta = 0.01$. As can be seen from Figure 2 (a), the curve of the power function, as a function of $t_1$, is flat. Its values lie between 5% and 5.3%, which shows that our test has not detected any significant change in the data obtained from models with no break.

We next considered the case where the data contain a single weak break point, and we tested no break against a single weak break. The data were computed from (1) for $V(x) = \ldots$, $T(x) = \ldots x$, $\gamma = (\ldots, \ldots)$ and $\beta = (0, \beta_2)$, for six values of $\beta_2$ ranging from $-0.8$ to $0.8$, respectively combined with the respective break locations $t_1 = \ldots$, $\ldots$, 120 and 160. In each case, we applied strategy S2. Figure 2 (b) shows that the power is significantly larger than the nominal level 5% and generally reaches its maximum at the right break locations. Thus, our method enables the detection of a change and the correct estimation of its location in data obtained from models with a single break.

Another case studied is that where the data are suspected to contain two weak break points. The test is of no break against two weak breaks. The data are computed from (1) for $V(x) = \ldots$, $T(x) = \ldots x$, $\gamma = (\ldots, \ldots, \ldots)$ and $\beta = (0, \beta_2, \beta_3)$ for several couples $(\beta_2, \beta_3)$, respectively combined with respective couples of break locations $(t_1, t_2)$. We applied S2 with $S_2$ a product of two sets of time indices around the candidate locations. From Figure 2 (c)-(d), it can be seen that the power deviates from the nominal level and generally reaches its maximum at the right couples of break locations. Table 2 gives more extensive results. As can be seen there, our method generally detects the changes and the correct locations in the simulated data.

Finally, we compare our methods with those studied in [39] in the case of single change detection. We first tried to detect changes in series simulated with no change. While our strategy S1 never found any change in those series, the tests studied in [39], termed here SCUSUM (standard CUSUM test) and RCUSUM (Rényi-type CUSUM test), found changes in most of the series
simulated. For those simulated with one change, our test was generally able to find the correct location, even for changes with very small magnitude (see Table 3). SCUSUM and RCUSUM were able to provide reasonable estimates only for changes with large magnitude. The SCUSUM, performed in [39] to estimate change locations among the earliest and the latest observations, seems to be beaten by our methods in estimating such locations. Note that for performing the SCUSUM and the RCUSUM, we used the R routine CUSUM.test in the library CPAT.

Table 2: Change locations for 2 breaks. The first column gives the true couple $(t_1, t_2)$; the remaining columns give the estimated couples for the different $(\beta_2, \beta_3)$ settings.

(60,100): (59,100) (70,100) (60,100) (65,100) (60,100) (60,100) (60,100)
(40,120): (38,120) (40,120) (40,120) (40,120) (40,120) (40,120) (40,120)
(50,140): (40,140) (50,140) (50,140) (50,138) (50,140) (50,140) (50,140)
(80,160): (80,160) (80,160) (80,159) (80,159) (80,160) (80,160) (80,160)
(100,170): (99,170) (100,170) (100,170) (100,170) (100,169) (100,170) (100,170)
Now, our methods are applied to detecting changes in three time series $Q$, $V_1$ and $V_2$ relative to the floods of the Upper Hanjiang River in China, collected from 1950 to 2011. Their length is $n = 62$. The first series represents the annual maximum daily discharge, measured in m$^3$/s. The second represents the annual maximum 3-day flood volume and the third the annual maximum 15-day flood volume, both measured in m$^3$. The chronogram of $Q$ in Figure 3 (a) and those of $V_1$ (black line) and $V_2$ (blue line) in Figure 3 (b) seem to present a slight trend and no apparent seasonality. For each of them, using the R routine ma, we estimated the trend (red line) by a five-order moving average and subtracted it from the corresponding series. As the trend is assumed to be smooth, the eventual changes in the series are expected to be found in the residual series. More explicitly, letting $(T_t)$ be the estimate of the trend and $(Y_t)$ be either of the series, we applied our methods to the residual series $X_t = Y_t - T_t$ and considered the model
$$X_t = \mu_t + \sigma_t \varepsilon_t, \quad 1 \le t \le n,$$
where $\mu_t$ and $\sigma_t$ are piecewise constant and $(\varepsilon_t)$ is a standard white noise with a Gaussian distribution.

We consider the above model because the Ljung-Box and Box-Pierce tests (see [13]) applied to each of the three $(X_t)$ rejected their iid hypothesis. Hence, we considered them as heteroscedastic. Also, the Shapiro-Wilk test applied to several subsets of the residual series suggested they are piecewise Gaussian.

True break         30   60   90   120  150  170
β = (0, β2):
  CURRENT          30   60   91   120  149  170
  SCUSUM           99   99   99   100  101  100
  RCUSUM           91   60   69   143  79   126
β = (0, β2):
  CURRENT          30   60   90   120  150  171
  SCUSUM           76   81   93   113  136  105
  RCUSUM           29   63   82   80   159  44
β = (0, β2):
  CURRENT          30   60   90   120  150  170
  SCUSUM           43   64   91   119  147  153
  RCUSUM           35   77   104  121  150  150
β = (0, β2):
  CURRENT          30   60   90   120  150  170
  SCUSUM           36   62   91   120  148  163
  RCUSUM           38   61   90   121  150  166
β = (0, β2):
  CURRENT          30   60   90   120  150  170
  SCUSUM           34   61   90   120  149  166
  RCUSUM           30   60   90   121  150  170

Table 3: Change-location estimates obtained by the CURRENT, SCUSUM and RCUSUM methods.

The graph of the residuals associated with $Q$ is presented in Figure 3 (c), while those associated with the two other series are in Figure 3 (d) (black line for $V_1$ and blue line for $V_2$). On each of them, the red line represents the trend. In each case, our historical data were the 20 first observations $X_1, \dots, X_{20}$, and we suspected changes around 1986 and 1998. As minimum distance $h$ we took $h =$
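The detrending and whiteness-check steps described above can be sketched as follows. This is an illustrative Python reimplementation (the paper uses the R routine ma and the R Ljung-Box test); the helper names `moving_average_detrend` and `ljung_box_stat`, as well as the toy series, are ours.

```python
import numpy as np

def moving_average_detrend(y, order=5):
    """Centered moving-average trend estimate and residual series.

    Plays the role of the R routine `ma`: the trend at t is the
    average of the `order` observations centered at t; endpoints
    without a full window are dropped from the residual series.
    """
    y = np.asarray(y, dtype=float)
    k = order // 2
    trend = np.full_like(y, np.nan)
    for t in range(k, len(y) - k):
        trend[t] = y[t - k:t + k + 1].mean()
    resid = y - trend
    return trend, resid[k:len(y) - k]

def ljung_box_stat(x, lags=10):
    """Ljung-Box portmanteau statistic Q = n(n+2) * sum_k rho_k^2/(n-k)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    denom = np.sum(x ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        rho = np.sum(x[k:] * x[:-k]) / denom
        q += rho ** 2 / (n - k)
    return n * (n + 2) * q

# a smooth (linear) trend plus a level shift, mimicking the flood series
y = 0.05 * np.arange(62) + np.r_[np.zeros(40), 0.8 * np.ones(22)]
trend, resid = moving_average_detrend(y)
```

The centered moving average reproduces a linear trend exactly, so the residual series is flat away from the shift and the shift itself survives detrending, which is what allows the change-point test to be run on the residuals.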
5, and ζ = . The $S_j$'s containing 5 points gave, for $Q$, the locations 1985 and 1996, and for $V_1$ and $V_2$, the locations 1984 and 1997. The strategy S3 gave the locations 1986 and 1997 for the three series.

We also applied SCUSUM and RCUSUM sequentially with the same subsets of data used in our sequential method S3. Both gave the same results, that is, for $Q$ and $V_1$, changes at 1971 and 2000, and for $V_2$, changes at 1968 and 2000. These location estimates are rather far from ours, which are close to those obtained by [65], who found only one change, located at 1987 for $Q$, and at 1985 for both $V_1$ and $V_2$.

In order to assess the accuracy of our estimates, we recall that the Ankan Reservoir has been shown to be, among other reservoirs, the one most influencing the flow regime of the Upper Hanjiang River. Built in 1982, this reservoir started storing water in 1989. One year later, its first generator was operational, and in 1992 all four of its generators were operational. The entire project of the Ankan Reservoir was complete by 1995. From this, it is clear that our estimated change locations fall within the period of construction and functioning of the Ankan Reservoir.

Figure 2: Empirical power for no change (a), one change (b) and two changes (c)-(d). Panels: (a) AR(1) model; (b) AR(1) model with one break; (c) white noise model with two breaks; (d) AR(1) model with two breaks.
We have constructed and studied a likelihood-ratio test for the detection of weak changes in the mean of CHARN($p$) time series models. We have derived its power function and given its explicit expression. We have used this power in some new strategies for detecting changes in the mean and estimating their locations. Simulation experiments have shown that our method performs well on the examples studied, and that the results do not depend on the level of significance of the test. Compared to some competing methods, ours have shown a better performance in estimating the locations of weak changes. They have also been applied to three sets of real data studied in [65]. Two changes have been detected in each of these data sets and their locations have been estimated. One of our location estimates is very close to the single one obtained in [65].

We carried out numerous trials, from which we observed that, in almost all cases with strategy S1, no change was detected when the data were simulated with no change, a change was detected for data simulated with at least one change, and the location was accurately estimated for data containing one single change.

When our methods are used as screening methods, changes that may have been previously detected by some given method can be considered as known, so that, under the null hypothesis, the data need not be stationary. This constitutes a great advantage of our methods over existing ones. Indeed, in the literature, the stationarity assumption under the null hypothesis plays a key role in the derivation of the theoretical results.

This section provides the proofs of the results.
For any $\beta \in \mathbb{R}^{k+1}$, the log-likelihood ratio of $H_0$ against $H^{(n)}_\beta$ is given by
$$\Lambda_n(\gamma_0, \beta) = \sum_{t=1}^{n} \left\{ \log f[\varepsilon_t(\gamma_n)] - \log f[\varepsilon_t(\gamma_0)] \right\} + o_P(1).$$
We first show that, as $n \to \infty$, $\Lambda_n(\gamma_0, \beta)$ decomposes into $\Lambda_n(\gamma_0, \beta) = \Lambda_{1,n} - \Lambda_{2,n} + o_P(1)$, where
$$\Lambda_{1,n} := \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)], \qquad \Lambda_{2,n} := \frac{1}{2n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1}) \phi'_f[\varepsilon_t(\gamma_0)].$$
By a first-order Taylor expansion of $\log f[\varepsilon_t(\gamma)]$ around $\gamma_0$, one has, for some $\widetilde{\gamma}$ between $\gamma_n$ and $\gamma_0$,
$$\Lambda_n(\gamma_0,\beta) = (\gamma_n - \gamma_0)^\top \sum_{t=1}^{n} \frac{\partial \log f(\varepsilon_t(\gamma))}{\partial \gamma}\Big|_{\gamma = \gamma_0} + \frac{1}{2}(\gamma_n - \gamma_0)^\top \sum_{t=1}^{n} \frac{\partial^2 \log f(\varepsilon_t(\gamma))}{\partial \gamma \partial \gamma^\top}\Big|_{\gamma = \widetilde{\gamma}} (\gamma_n - \gamma_0).$$

Figure 3: Changes locations for the real data. Panels: (a) $Q$: raw data; (b) $V_1$ and $V_2$: raw data; (c)-(d) residual series; (e)-(f) estimated change points.

From the first- and second-order derivatives of $\log f[\varepsilon_t(\gamma)]$ with respect to $\gamma$, one has
$$\Lambda_n(\gamma_0,\beta) = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)] - \frac{1}{2n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1}) \phi'_f[\varepsilon_t(\widetilde{\gamma})] + o_P(1) = \Lambda_{1,n} - \Lambda^{(1)}_{2,n} + o_P(1).$$
Then we can write, by a first-order Taylor expansion of $\varepsilon_t(\gamma_0)$ around $\widetilde{\gamma}$,
$$\Lambda^{(1)}_{2,n} := \frac{1}{2n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1}) \phi'_f\Big[\varepsilon_t(\gamma_0) + (\gamma_0 - \widetilde{\gamma})^\top \omega(t) V(Z_{t-1})\Big].$$
By (A), we have
$$\Big| \Lambda^{(1)}_{2,n} - \Lambda_{2,n} \Big| \le \frac{C_\phi}{2n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1}) \left| (\gamma_0 - \widetilde{\gamma})^\top \omega(t) V(Z_{t-1}) \right| \le \frac{(k+1) C_\phi}{2} \,\|\gamma_0 - \widetilde{\gamma}\| \left( \frac{1}{n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^3(Z_{t-1}) \right).$$
Expanding the term in brackets as a sum of sums over the intervals $[t_{j-1}, t_j)$, it is easy to see that, by the ergodic theorem, it tends almost surely to a finite limit. Since $\widetilde{\gamma}$ is between $\gamma_0$ and $\gamma_n$ and $\gamma_n - \gamma_0 = \beta/\sqrt{n}$, it follows that $\|\gamma_0 - \widetilde{\gamma}\| \le \|\beta\|/\sqrt{n}$, which tends to zero as $n$ goes to infinity. Whence, almost surely,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1}) \,\|\gamma_0 - \widetilde{\gamma}\| = 0.$$
Thus, for large $n$, $\Lambda^{(1)}_{2,n} = \Lambda_{2,n} + o_P(1)$.

Let us now show that, under $H_0$, as $n \to \infty$, $\Lambda_{1,n} \xrightarrow{\mathcal{D}} \mathcal{N}(0, \mu(\gamma_0, \beta))$. For this, we consider
$$\Lambda_{1,n,j} = \frac{1}{\sqrt{n}} \sum_{t=1}^{j} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)], \quad j = 1, \dots, n,$$
and we define, for any $t = 1, \dots, n$,
$$\xi_{n,t} = \frac{1}{\sqrt{n}} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)].$$
Then we observe that $\Lambda_{1,n,j}$ is a zero-mean process that can be rewritten as $\Lambda_{1,n,j} = \sum_{t=1}^{j} \xi_{n,t}$, $j = 1, \dots, n$. First, we show that $\{(\Lambda_{1,n,j}, \mathcal{G}_j),\ j = 1, \dots, n\}$ is a sequence of martingales, where we recall that $\mathcal{G}_t = \sigma(Z_0, \dots, Z_t)$ is the $\sigma$-algebra spanned by $Z_0, \dots, Z_t$, $t \in \mathbb{Z}$. Let $j_1, j_2 \in \mathbb{Z}$, $j_1 < j_2$. We have:
$$E\big(\Lambda_{1,n,j_2} \mid \mathcal{G}_{j_1}\big) = E\left[\frac{1}{\sqrt{n}} \sum_{t=1}^{j_1} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)] \,\Big|\, \mathcal{G}_{j_1}\right] + E\left[\frac{1}{\sqrt{n}} \sum_{t=j_1+1}^{j_2} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)] \,\Big|\, \mathcal{G}_{j_1}\right]$$
$$= \frac{1}{\sqrt{n}} \sum_{t=1}^{j_1} \beta^\top \omega(t) V(Z_{t-1}) \phi_f[\varepsilon_t(\gamma_0)] + \frac{1}{\sqrt{n}} \sum_{t=j_1+1}^{j_2} E\left[\beta^\top \omega(t) V(Z_{t-1}) \mid \mathcal{G}_{j_1}\right] E\big\{\phi_f[\varepsilon_t(\gamma_0)]\big\}.$$
From Remark 2, we have $E\{\phi_f[\varepsilon_t(\gamma_0)]\} =$
0. Then $E(\Lambda_{1,n,j_2} \mid \mathcal{G}_{j_1}) = \Lambda_{1,n,j_1}$. Also, we have
$$\sum_{t=1}^{n} E\big(\xi_{n,t}^2 \mid \mathcal{G}_{t-1}\big) = \frac{1}{n} \sum_{t=1}^{n} \left[\beta^\top \omega(t)\right]^2 V^2(Z_{t-1})\, E\big\{\phi_f^2[\varepsilon_t(\gamma_0)] \mid \mathcal{G}_{t-1}\big\}.$$
Since, for any $t = 1, \dots, n$, $\varepsilon_t$ is independent of $\mathcal{G}_{t-1}$, we have
$$\sum_{t=1}^{n} E\big(\xi_{n,t}^2 \mid \mathcal{G}_{t-1}\big) = \frac{1}{n} \sum_{j=1}^{k+1} \sum_{t=t_{j-1}}^{t_j - 1} \beta_j^2 V^2(Z_{t-1})\, E\big\{\phi_f^2[\varepsilon_t(\gamma_0)]\big\} = \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \beta_j^2 \,\frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} V^2(Z_{t-1})\, E\big\{\phi_f^2[\varepsilon_t(\gamma_0)]\big\}.$$
From our iid assumption on the $\varepsilon_t$'s, we have $E\{\phi_f^2[\varepsilon_t(\gamma_0)]\} = \int \phi_f^2(x) f(x)\, dx = I(f)$. From the ergodic theorem, for any $j = 1, \dots, k+1$, almost surely,
$$\lim_{n \to \infty} \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} V^2(Z_{t-1})\, E\big\{\phi_f^2[\varepsilon_t(\gamma_0)]\big\} = \mu_j(\gamma_0) < \infty.$$
Then,
$$\lim_{n \to \infty} \sum_{t=1}^{n} E\big(\xi_{n,t}^2 \mid \mathcal{G}_{t-1}\big) = \mu(\gamma_0, \beta) = \sum_{j=1}^{k+1} \alpha_j \beta_j^2 \mu_j(\gamma_0) < \infty.$$
Now, we check the conditional Lindeberg condition. Let $\varepsilon >$
0. Using the conditional Hölder inequality followed by the conditional Markov inequality, we have
$$\sum_{t=1}^{n} E\Big(\xi_{n,t}^2\, \mathbf{1}_{\{|\xi_{n,t}| > \varepsilon\}} \,\Big|\, \mathcal{G}_{t-1}\Big) \le \frac{1}{\varepsilon} \sum_{t=1}^{n} E\big( |\xi_{n,t}|^3 \mid \mathcal{G}_{t-1}\big) = \frac{1}{\varepsilon} \sum_{t=1}^{n} n^{-3/2}\, |\beta^\top \omega(t)|^3\, V^3(Z_{t-1})\, E\big\{|\phi_f[\varepsilon_t(\gamma_0)]|^3\big\}$$
$$= \mathrm{Cst} \sum_{j=1}^{k+1} \left(\frac{n_j(n)}{n}\right)^{3/2} |\beta_j|^3\, (n_j(n))^{-3/2} \sum_{t=t_{j-1}}^{t_j - 1} V^3(Z_{t-1})\, E\big\{|\phi_f[\varepsilon_t(\gamma_0)]|^3\big\}.$$
For any $j = 1, \dots, k +$
1, observing that
$$(n_j(n))^{-3/2} \sum_{t=t_{j-1}}^{t_j - 1} V^3(Z_{t-1})\, E\big\{|\phi_f[\varepsilon_t(\gamma_0)]|^3\big\} = (n_j(n))^{-1/2} \times \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} V^3(Z_{t-1})\, E\big\{|\phi_f[\varepsilon_t(\gamma_0)]|^3\big\},$$
and noting that the second factor on the right-hand side tends almost surely to a finite limit, we conclude easily that, for all positive $\varepsilon$,
$$\lim_{n \to \infty} \sum_{t=1}^{n} E\Big(\xi_{n,t}^2\, \mathbf{1}_{\{|\xi_{n,t}| > \varepsilon\}} \,\Big|\, \mathcal{G}_{t-1}\Big) = 0 \quad \text{a.s.}$$
Thus, the conditions of Corollary 3.1 of [33] are verified. Accordingly, under $H_0$, $\Lambda_{1,n} \xrightarrow{\mathcal{D}} \mathcal{N}(0, \mu(\gamma_0, \beta))$, $n \to \infty$.

Now, turning to the study of $\Lambda_{2,n}$, it is easy to see that
$$\Lambda_{2,n} = \frac{1}{2} \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \beta_j^2 \,\frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} V^2(Z_{t-1})\, \phi'_f[\varepsilon_t(\gamma_0)].$$
Since, under our assumptions, $E\{\phi'_f[\varepsilon_1(\gamma_0)]\} = I(f)$, by Remark (ii) and the ergodic theorem, for all $j = 1, \dots, k+1$, we have, almost surely,
$$\lim_{n \to +\infty} \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} V^2(Z_{t-1})\, \phi'_f[\varepsilon_t(\gamma_0)] = I(f) \int V^2(x)\, dF_j(x) = \mu_j(\gamma_0).$$
Thus, almost surely,
$$\lim_{n \to +\infty} \Lambda_{2,n} = \frac{1}{2} \sum_{j=1}^{k+1} \alpha_j \beta_j^2 \mu_j(\gamma_0) = \frac{1}{2}\mu(\gamma_0, \beta),$$
and we can conclude that, as $n \to \infty$, $\Lambda_n(\gamma_0, \beta) = \Lambda_{1,n} - \frac{1}{2}\mu(\gamma_0, \beta) + o_P(1)$.
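For the reader's convenience, the two conditions just verified, which are the ones required by the martingale central limit theorem invoked above (Corollary 3.1 of [33]), can be summarized as follows:

```latex
\begin{align*}
\sum_{t=1}^{n} \mathbb{E}\!\left(\xi_{n,t}^{2} \mid \mathcal{G}_{t-1}\right)
  &\xrightarrow[n\to\infty]{\text{a.s.}} \mu(\gamma_0,\beta)
  && \text{(conditional variance convergence),} \\
\sum_{t=1}^{n} \mathbb{E}\!\left(\xi_{n,t}^{2}\,
  \mathbf{1}_{\{|\xi_{n,t}|>\varepsilon\}} \mid \mathcal{G}_{t-1}\right)
  &\xrightarrow[n\to\infty]{\text{a.s.}} 0
  \quad \text{for all } \varepsilon>0
  && \text{(conditional Lindeberg condition),}
\end{align*}
```

which together yield $\Lambda_{1,n} = \sum_{t=1}^{n} \xi_{n,t} \xrightarrow{\mathcal{D}} \mathcal{N}(0, \mu(\gamma_0,\beta))$ for the zero-mean martingale array $(\Lambda_{1,n,j})_j$.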
It results from the fact that, under $H_0$, $\Lambda_{1,n} \xrightarrow{\mathcal{D}} \mathcal{N}(0, \mu(\gamma_0, \beta))$ as $n \to \infty$, that, under $H_0$, as $n \to \infty$,
$$\Lambda_n(\gamma_0, \beta) \xrightarrow{\mathcal{D}} \mathcal{N}\Big(-\tfrac{1}{2}\mu(\gamma_0, \beta),\ \mu(\gamma_0, \beta)\Big).$$
Finally, the LAN property is established for the central sequence $\Delta_n(\gamma_0, \beta) \equiv \Lambda_{1,n}$.

It follows immediately from Theorem 1 that, under $H_0$, as $n \to \infty$,
$$\begin{pmatrix} \Delta_n(\gamma_0, \beta) \\ \Lambda_n(\gamma_0, \beta) \end{pmatrix} \xrightarrow{\mathcal{D}} \mathcal{N}\left( \begin{pmatrix} 0 \\ -\tfrac{1}{2}\mu(\gamma_0, \beta) \end{pmatrix},\ \begin{pmatrix} \mu(\gamma_0, \beta) & \mu(\gamma_0, \beta) \\ \mu(\gamma_0, \beta) & \mu(\gamma_0, \beta) \end{pmatrix} \right).$$
The part (i) is an easy consequence of Theorem 1. For the part (ii), from [25] (see Proposition 4.2), under $H^{(n)}_\beta$, as $n \to \infty$,
$$\Delta_n(\gamma_0, \beta) \xrightarrow{\mathcal{D}} \mathcal{N}(\mu(\gamma_0, \beta),\ \mu(\gamma_0, \beta)).$$
Observing that, for all $n \ge 1$,
$$\frac{\Delta_n(\gamma_0, \beta)}{\widehat{\varpi}_n(\gamma_0, \beta)} = \frac{\Delta_n(\gamma_0, \beta)}{\varpi(\gamma_0, \beta)} \times \frac{\varpi(\gamma_0, \beta)}{\widehat{\varpi}_n(\gamma_0, \beta)},$$
and since, under $H_0$, as $n \to \infty$, $\widehat{\varpi}_n(\gamma_0, \beta) \longrightarrow \varpi(\gamma_0, \beta)$, by contiguity the latter convergence still holds under $H^{(n)}_\beta$. It follows from Theorem 1 that, under $H^{(n)}_\beta$, as $n \to \infty$,
$$\frac{\Delta_n(\gamma_0, \beta)}{\varpi(\gamma_0, \beta)} \xrightarrow{\mathcal{D}} \mathcal{N}(\varpi(\gamma_0, \beta),\ 1).$$
Accordingly, we obtain the asymptotic power of $T_n(\gamma_0, \beta)$ under $H^{(n)}_\beta$:
$$\lim_{n \to \infty} P\left( \frac{\Delta_n(\gamma_0, \beta)}{\widehat{\varpi}_n(\gamma_0, \beta)} > u_\alpha \,\Big|\, H^{(n)}_\beta \right) = \lim_{n \to \infty} P\left( \frac{\Delta_n(\gamma_0, \beta)}{\varpi(\gamma_0, \beta)} > u_\alpha \,\Big|\, H^{(n)}_\beta \right) = 1 - \Phi\big(u_\alpha - \varpi(\gamma_0, \beta)\big),$$
where, for $\alpha \in (0, 1)$, $u_\alpha$ is the $(1-\alpha)$-quantile of the standard normal distribution, with cumulative distribution function $\Phi$. The part (iii) results from the LAN property and the fact that the limiting model is a Gaussian location model (see, e.g., [48] or [25]).

Proof of Proposition 1. (i) Let, for any $1 \le j \le n$,
$$L_{n,j} = \frac{1}{\sqrt{n}} \sum_{t=1}^{j} \Big( \Psi_1(Z_{t-1}, \psi)\, \Omega_1[\varepsilon_t(\psi, \gamma_0)],\ \Psi_2(Z_{t-1}, \psi)\, \Omega_2[\varepsilon_t(\psi, \gamma_0)] \Big)^\top + o_P(1),$$
and $\eta = (\eta_1^\top, \eta_2^\top)^\top \in \mathbb{R}^{l+q}$ with $\eta_1 = (\eta_{1,1}, \dots, \eta_{1,l})^\top$ and $\eta_2 = (\eta_{2,1}, \dots$
$, \eta_{2,q})^\top$. We observe that, for any $n \ge 1$ and $1 \le j \le n$,
$$\eta^\top L_{n,j} = \frac{1}{\sqrt{n}} \sum_{t=1}^{j} a_t(\eta), \qquad a_t(\eta) = \sum_{\ell=1}^{2} \eta_\ell^\top \Psi_\ell(Z_{t-1}, \psi)\, \Omega_\ell[\varepsilon_t(\psi, \gamma_0)], \quad t \in \mathbb{Z}.$$
It is easy to see that, for any $j = 1, \dots, n$, $\eta^\top L_{n,j}$ is centred at 0. We first show that $\{(\eta^\top L_{n,j}, \mathcal{G}_j),\ j = 1, \dots, n\}$ is a sequence of martingales. Since the proof is very similar to that for $\Lambda_{1,n,j}$ in the proof of Theorem 1, we do not detail it much. Let $j_1, j_2 \in \mathbb{Z}$ be such that $j_1 < j_2$. We have:
$$E(\eta^\top L_{n,j_2} \mid \mathcal{G}_{j_1}) = E(\eta^\top L_{n,j_1} \mid \mathcal{G}_{j_1}) + \frac{1}{\sqrt{n}} \sum_{t=j_1+1}^{j_2} E[a_t(\eta) \mid \mathcal{G}_{j_1}] = \eta^\top L_{n,j_1} + \frac{1}{\sqrt{n}} \sum_{t=j_1+1}^{j_2} \sum_{\ell=1}^{2} E[\eta_\ell^\top \Psi_\ell(Z_{t-1}, \psi)]\, E\{\Omega_\ell[\varepsilon_t(\psi, \gamma_0)]\}.$$
As for any $\ell = 1, 2$ and $t \in \mathbb{Z}$, $E\{\Omega_\ell[\varepsilon_t(\psi, \gamma_0)]\} =$
0, we have $E(\eta^\top L_{n,j_2} \mid \mathcal{G}_{j_1}) = \eta^\top L_{n,j_1}$. Also,
$$\sum_{t=1}^{n} E\big\{ [n^{-1/2} a_t(\eta)]^2 \mid \mathcal{G}_{t-1} \big\} = \frac{1}{n} \sum_{t=1}^{n} E[a_t^2(\eta) \mid \mathcal{G}_{t-1}]. \tag{7}$$
Substituting $a_t(\eta)$ by its expression, we get
$$\sum_{t=1}^{n} E\big\{ [n^{-1/2} a_t(\eta)]^2 \mid \mathcal{G}_{t-1} \big\} = \frac{1}{n} \sum_{t=1}^{n} E\left( \Big\{ \eta_1^\top \Psi_1(Z_{t-1}, \psi)\, \Omega_1[\varepsilon_t(\psi, \gamma_0)] + \eta_2^\top \Psi_2(Z_{t-1}, \psi)\, \Omega_2[\varepsilon_t(\psi, \gamma_0)] \Big\}^2 \,\Big|\, \mathcal{G}_{t-1} \right).$$
Since, for any $t \in \mathbb{Z}$, $\varepsilon_t$ is independent of $\mathcal{G}_{t-1}$, it is easy to see that
$$\sum_{t=1}^{n} E\big\{ [n^{-1/2} a_t(\eta)]^2 \mid \mathcal{G}_{t-1} \big\} = \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} [\eta_1^\top \Psi_1(Z_{t-1}, \psi)]^2\, E\big\{\Omega_1^2[\varepsilon_t(\psi, \gamma_0)]\big\}$$
$$+ \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} [\eta_2^\top \Psi_2(Z_{t-1}, \psi)]^2\, E\big\{\Omega_2^2[\varepsilon_t(\psi, \gamma_0)]\big\}$$
$$+ 2 \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} \eta_1^\top \Psi_1(Z_{t-1}, \psi) \cdot \eta_2^\top \Psi_2(Z_{t-1}, \psi)\, E\big\{\Omega_1[\varepsilon_t(\psi, \gamma_0)]\, \Omega_2[\varepsilon_t(\psi, \gamma_0)]\big\} = S_n^{(1)} + S_n^{(2)} + S_n^{(3)}.$$
From the ergodic theorem, we have, as $n \to \infty$,
$$S_n^{(1)} \xrightarrow{\text{a.s.}} \sum_{j=1}^{k+1} \alpha_j \int [\eta_1^\top \Psi_1(x, \psi)]^2\, dF_j(x) \int \Omega_1^2(x) f(x)\, dx < \infty,$$
$$S_n^{(2)} \xrightarrow{\text{a.s.}} \sum_{j=1}^{k+1} \alpha_j \int [\eta_2^\top \Psi_2(x, \psi)]^2\, dF_j(x) \int \Omega_2^2(x) f(x)\, dx < \infty,$$
$$S_n^{(3)} \xrightarrow{\text{a.s.}} 2 \sum_{j=1}^{k+1} \alpha_j \int \eta_1^\top \Psi_1(x, \psi) \cdot \eta_2^\top \Psi_2(x, \psi)\, dF_j(x) \int \Omega_1(x)\, \Omega_2(x) f(x)\, dx < \infty.$$
Whence, as $n \to \infty$, almost surely, $\sum_{t=1}^{n} E[(n^{-1/2} a_t(\eta))^2 \mid \mathcal{G}_{t-1}] \longrightarrow \xi^2$, with
$$\xi^2 = \sum_{j=1}^{k+1} \alpha_j \Big[ \int \Omega_1^2(x) f(x)\, dx \int [\eta_1^\top \Psi_1(x, \psi)]^2\, dF_j(x) + \int \Omega_2^2(x) f(x)\, dx \int [\eta_2^\top \Psi_2(x, \psi)]^2\, dF_j(x) + 2 \int \Omega_1(x) \Omega_2(x) f(x)\, dx \int \eta_1^\top \Psi_1(x, \psi) \cdot \eta_2^\top \Psi_2(x, \psi)\, dF_j(x) \Big].$$
It remains to check the Lindeberg condition. This can be done along the same lines as in the proof of Theorem 1.
Hence, we deduce from Corollary 3.1 of [33] that, under $H_0$, $\eta^\top \sqrt{n}(\psi_n - \psi) \xrightarrow{\mathcal{D}} \mathcal{N}(0, \xi^2)$, $n \to \infty$, from which it results that, under $H_0$, as $n \to \infty$,
$$\sqrt{n}(\psi_n - \psi) \xrightarrow{\mathcal{D}} \mathcal{N}(0, \Sigma), \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$
with $\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{21}$ and $\Sigma_{22}$ defined in the proposition.

(ii) Under $H_0$, we have, by the part (i) and Theorem 1, that, as $n \to \infty$,
$$\sqrt{n}(\psi_n - \psi) \xrightarrow{\mathcal{D}} \mathcal{N}(0, \Sigma), \qquad \Lambda_n(\psi, \gamma_0, \beta) \xrightarrow{\mathcal{D}} \mathcal{N}\Big(-\tfrac{1}{2}\mu(\psi, \gamma_0, \beta),\ \mu(\psi, \gamma_0, \beta)\Big),$$
where we recall that $\mu(\psi, \gamma_0, \beta) = \sum_{j=1}^{k+1} \alpha_j \beta_j^2 \mu_j(\psi, \gamma_0)$, with $\mu_j(\psi, \gamma_0) = I(f) \int V_\theta^2(x)\, dF_j(x)$, $j = 1, \dots, k+1$. Letting $S_n = \sqrt{n}(\psi_n - \psi)$, it results that, under $H_0$, $(S_n^\top, \Lambda_n(\psi, \gamma_0, \beta))^\top$ is asymptotically normal with expectation $(0^\top, -\tfrac{1}{2}\mu(\psi, \gamma_0, \beta))^\top$ and covariance vector
$$\lim_{n \to \infty} \mathrm{Cov}(S_n, \Lambda_n(\psi, \gamma_0, \beta)) = \lim_{n \to \infty} \mathrm{Cov}\Big(S_n,\ \Delta_n(\psi, \gamma_0, \beta) - \tfrac{1}{2}\mu(\psi, \gamma_0, \beta)\Big) = \lim_{n \to \infty} \mathrm{Cov}(S_n, \Delta_n(\psi, \gamma_0, \beta)).$$
Substituting $\Delta_n(\psi, \gamma_0, \beta)$ by its expression, we get
$$\mathrm{Cov}(S_n, \Lambda_n(\psi, \gamma_0, \beta)) = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \beta^\top \omega(t)\, E\Big\{ S_n\, \phi_f[\varepsilon_t(\psi, \gamma_0)]\, V_\theta(Z_{t-1}) \Big\} - \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \beta^\top \omega(t)\, E(S_n)\, E\Big\{ \phi_f[\varepsilon_t(\psi, \gamma_0)]\, V_\theta(Z_{t-1}) \Big\} = C_{1,n} + C_{2,n}.$$
Since $E\{\phi_f[\varepsilon_t(\psi, \gamma_0)]\} = 0$, $t \in \mathbb{Z}$, it follows that $C_{2,n} = 0$ for all $n$. Now,
$$C_{1,n} = \frac{1}{n} \sum_{t=1}^{n} \beta^\top \omega(t)\, E\Big\{ \Psi(Z_{t-1}, \psi)\, \Omega^\top[\varepsilon_t(\psi, \gamma_0)]\, \phi_f[\varepsilon_t(\psi, \gamma_0)]\, V_\theta(Z_{t-1}) \Big\} + \frac{1}{n} \sum_{\substack{s,t=1 \\ s \ne t}}^{n} \beta^\top \omega(t)\, E\Big\{ \Psi(Z_{s-1}, \psi)\, \Omega^\top[\varepsilon_s(\psi, \gamma_0)]\, \phi_f[\varepsilon_t(\psi, \gamma_0)]\, V_\theta(Z_{t-1}) \Big\} = C_{11,n} + C_{12,n}.$$
Since $E\{\Omega_1[\varepsilon_t(\psi, \gamma_0)]\} = E\{\Omega_2[\varepsilon_t(\psi, \gamma_0)]\} = E\{\phi_f[\varepsilon_t(\psi, \gamma_0)]\} = 0$, $t \in \mathbb{Z}$, and since $(\varepsilon_t)_{t \in \mathbb{Z}}$ is iid and, for any $t \in \mathbb{Z}$, $\varepsilon_t$ is independent of $\mathcal{G}_{t-1} = \sigma(Z_0, \dots, Z_{t-1})$, it follows that $C_{12,n} = 0$.
For $C_{11,n}$, we have
$$C_{11,n} = \int \Omega^\top(x)\, \phi_f(x) f(x)\, dx\ \frac{1}{n} \sum_{t=1}^{n} \beta^\top \omega(t)\, E\big[ \Psi(Z_{t-1}, \psi)\, V_\theta(Z_{t-1}) \big] = \int \Omega^\top(x)\, \phi_f(x) f(x)\, dx\ \sum_{j=1}^{k+1} \frac{n_j(n)}{n} \beta_j \frac{1}{n_j(n)} \sum_{t=t_{j-1}}^{t_j - 1} E\big[ \Psi(Z_{t-1}, \psi)\, V_\theta(Z_{t-1}) \big].$$
Letting $n \to \infty$, it is easy to see that the right member of the above equality tends almost surely to
$$\int \Omega^\top(x)\, \phi_f(x) f(x)\, dx\ \sum_{j=1}^{k+1} \alpha_j \beta_j \int \Psi(x, \psi)\, V_\theta(x)\, dF_j(x).$$
Then, under $H_0$, as $n \to \infty$,
$$\mathrm{Cov}(S_n, \Lambda_n(\psi, \gamma_0, \beta)) \longrightarrow \nu = \int \Omega^\top(x)\, \phi_f(x) f(x)\, dx\ \sum_{j=1}^{k+1} \alpha_j \beta_j \int \Psi(x, \psi)\, V_\theta(x)\, dF_j(x).$$
Whence, under $H_0$, as $n \to \infty$,
$$\begin{pmatrix} S_n \\ \Lambda_n(\psi, \gamma_0, \beta) \end{pmatrix} \xrightarrow{\mathcal{D}} \mathcal{N}\left( \begin{pmatrix} 0 \\ -\tfrac{1}{2}\mu(\psi, \gamma_0, \beta) \end{pmatrix},\ \begin{pmatrix} \Sigma & \nu^\top \\ \nu & \mu(\psi, \gamma_0, \beta) \end{pmatrix} \right).$$
By Le Cam's third lemma, under $H^{(n)}$, as $n \to \infty$, $S_n = \sqrt{n}(\psi_n - \psi) \xrightarrow{\mathcal{D}} \mathcal{N}(\nu, \Sigma)$.

For any $\psi = (\rho^\top, \theta^\top)^\top \in \Theta \times \widetilde{\Theta}$ and $(\gamma_0, \beta) \in \mathbb{R}^{k+1} \times \mathbb{R}^{k+1}$, define
$$\Lambda_n(\psi, \gamma_0, \beta) = \sum_{t=1}^{n} \log \left\{ f\left[ \varepsilon_t\left( \psi, \gamma_0 + \frac{\beta}{\sqrt{n}} \right) \right] \right\} - \log \big\{ f[\varepsilon_t(\psi, \gamma_0)] \big\} + o_P(1).$$
Then the log-likelihood ratio of $H_0$ against $H^{(n)}_\beta$ is $\Lambda_n(\psi, \gamma_0, \beta)$. Writing a Taylor expansion of $\Delta_n(\cdot, \gamma_0, \beta)$ around $\psi_n$, we have, for some $\widetilde{\psi}_n$ lying between $\psi_n$ and $\psi$,
$$\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (\psi - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + \frac{1}{2} (\psi - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)(\psi - \psi_n).$$
First, we show that, under $H_0$, as $n \to \infty$,
$$(\psi - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)(\psi - \psi_n) = o_P(1). \tag{8}$$
For this, we observe that
$$\Big| (\psi - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)(\psi - \psi_n) \Big| \le \|\sqrt{n}(\psi - \psi_n)\| \times \frac{1}{\sqrt{n}} \|\partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)\|_M \times \|\psi - \psi_n\|.$$
From Proposition 1, we have that, under $H_0$, $\sqrt{n}(\psi - \psi_n)$ converges in distribution to a normal distribution as $n \to \infty$, and $\psi - \psi_n$ tends to 0 in probability. So (8) would be handled if we show that $\|\partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)\|_M / \sqrt{n}$ tends in probability to some positive random variable. To get this, we define, for any $(\psi, \gamma_0, \beta) \in \mathbb{R}^{l+q} \times \mathbb{R}^{k+1} \times \mathbb{R}^{k+1}$,
$$D_n(\psi, \gamma_0, \beta) = \frac{1}{\sqrt{n}} \partial^2_\psi \Delta_n(\psi, \gamma_0, \beta) = \begin{pmatrix} D_{n,1,1}(\psi, \gamma_0, \beta) & D_{n,1,2}(\psi, \gamma_0, \beta) \\ D_{n,1,2}^\top(\psi, \gamma_0, \beta) & D_{n,2,2}(\psi, \gamma_0, \beta) \end{pmatrix},$$
$$D_{n,1,1}(\psi, \gamma_0, \beta) := \frac{1}{\sqrt{n}} \partial^2_\rho \Delta_n(\psi, \gamma_0, \beta), \quad D_{n,1,2}(\psi, \gamma_0, \beta) := \frac{1}{\sqrt{n}} \partial^2_{\rho\theta} \Delta_n(\psi, \gamma_0, \beta), \quad D_{n,2,2}(\psi, \gamma_0, \beta) := \frac{1}{\sqrt{n}} \partial^2_\theta \Delta_n(\psi, \gamma_0, \beta).$$
From a lengthy but simple algebra, by ergodicity, it is easy to show that $\|D_{n,1,1}(\widetilde{\psi}_n, \gamma_0, \beta)\|_M$, $\|D_{n,1,2}(\widetilde{\psi}_n, \gamma_0, \beta)\|_M$ and $\|D_{n,2,2}(\widetilde{\psi}_n, \gamma_0, \beta)\|_M$ converge in probability to some positive numbers. It results from all these convergences that $\|\partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)\|_M / \sqrt{n}$ tends in probability to some positive number. Thus (8) is obtained.

Now, as $n \to \infty$, adding and subtracting appropriate terms, we can write
$$\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (\psi - \psi_{N(n)})^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + (\psi_{N(n)} - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + o_P(1).$$
Observing that, as $n \to \infty$,
$$\sqrt{n}(\psi - \psi_{N(n)}) = \sqrt{N(n)}\, (\psi - \psi_{N(n)})\, \frac{\sqrt{n}}{\sqrt{N(n)}} = o_P(1),$$
and that $\partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) / \sqrt{n}$ converges in probability to some random vector, we have proved that, as $n \to \infty$,
$$(\psi - \psi_{N(n)})^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) = \sqrt{n}(\psi - \psi_{N(n)})^\top \frac{1}{\sqrt{n}} \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) = o_P(1).$$
Consequently, as $n \to \infty$,
$$\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (\psi_{N(n)} - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + o_P(1).$$
For ending the proof of Proposition 2, we need the following lemma.

Lemma 1
Let $\psi_n$ be a consistent and asymptotically normal estimator of $\psi$. For $\gamma_0, \beta \in \mathbb{R}^{k+1}$, $\psi_{N(n)}$ is asymptotically in the tangent space $\Gamma_n$ to the curve of $\Delta_n(\psi, \gamma_0, \beta)$ at $\psi_n$, defined as follows:
$$\Gamma_n := \Big\{ x \in \mathbb{R}^{l+q} :\ \Delta_n(x, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (x - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) \Big\},$$
where $\{N(n)\}_{n \ge 1}$ stands for a subset of $\{1, \dots, n\}$ such that $n / N(n)$ tends to 0 as $n$ tends to infinity.

Proof. Writing a second-order Taylor expansion of $\Delta_n(\cdot, \gamma_0, \beta)$ around $\psi_n$, we have, for some $\widetilde{\psi}_{N(n)}$ lying between $\psi_{N(n)}$ and $\psi_n$,
$$\Delta_n(\psi_{N(n)}, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (\psi_{N(n)} - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + \frac{1}{2} (\psi_{N(n)} - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)(\psi_{N(n)} - \psi_n).$$
To show that the sequence $\psi_{N(n)}$ is asymptotically in $\Gamma_n$, we just have to show that
$$(\psi_{N(n)} - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)(\psi_{N(n)} - \psi_n) = o_P(1).$$
But since $n / N(n) \longrightarrow 0$ as $n \to \infty$, we have
$$\sqrt{n}(\psi_{N(n)} - \psi_n) = \sqrt{N(n)}\, (\psi_{N(n)} - \psi)\, \frac{\sqrt{n}}{\sqrt{N(n)}} + \sqrt{n}(\psi - \psi_n) = o_P(1) + \sqrt{n}(\psi - \psi_n).$$
This implies that $\sqrt{n}(\psi_{N(n)} - \psi_n)$ converges in distribution to a Gaussian random vector. So, to show that $(\psi_{N(n)} - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)(\psi_{N(n)} - \psi_n) = o_P(1)$, we just show that the sequence $\|\partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)\|_M / \sqrt{n}$ converges in probability to some positive random variable. For this purpose, observing that $\widetilde{\psi}_{N(n)} - \psi = o_P(1)$, and as we have shown previously that $\|\partial^2_\psi \Delta_n(\widetilde{\psi}_n, \gamma_0, \beta)\|_M / \sqrt{n}$ converges in probability to some random variable, it follows that $\|\partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)\|_M / \sqrt{n}$ converges in probability to some positive random variable. Consequently,
$$(\psi_{N(n)} - \psi_n)^\top \partial^2_\psi \Delta_n(\widetilde{\psi}_{N(n)}, \gamma_0, \beta)(\psi_{N(n)} - \psi_n) = o_P(1).$$
This ends the proof of the lemma.

Now, by Lemma 1, as $n \to \infty$, we have
$$\Delta_n(\psi_{N(n)}, \gamma_0, \beta) = \Delta_n(\psi_n, \gamma_0, \beta) + (\psi_{N(n)} - \psi_n)^\top \partial_\psi \Delta_n(\psi_n, \gamma_0, \beta) + o_P(1).$$
Therefore, from the equality preceding the statement of Lemma 1, we have that, under $H_0$, as $n \to \infty$,
$$\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \gamma_0, \beta) + o_P(1).$$
This ends the proof of the first part of Proposition 2.

For the second part, recall that, for all $j = 1, \dots, k+1$, $\mu_j(\psi, \gamma_0) = I(f) \int V_\theta^2(x)\, dF_j(x)$. Write
$$\widehat{\mu}(\psi_n, \gamma_0) - \mu(\psi, \gamma_0) = \sum_{j=1}^{k+1} \beta_j^2 \big\{ (\widehat{\alpha}_j - \alpha_j)[\widehat{\mu}_j(\psi_n, \gamma_0) - \mu_j(\psi, \gamma_0)] \big\} + \sum_{j=1}^{k+1} \beta_j^2 \big\{ \alpha_j [\widehat{\mu}_j(\psi_n, \gamma_0) - \mu_j(\psi, \gamma_0)] \big\} + \sum_{j=1}^{k+1} \beta_j^2 \big\{ (\widehat{\alpha}_j - \alpha_j)\, \mu_j(\psi, \gamma_0) \big\}.$$
Remembering that, for all $j = 1, \dots, k+1$, $\widehat{\alpha}_j - \alpha_j$ tends to 0 as $n$ tends to infinity, by the continuity of the function $(x, \theta) \mapsto V_\theta(x)$ and the fact that it is bounded from below by some positive number $\tau$, it follows from Lebesgue's convergence theorem that each term in the above sum tends to 0. This handles the second part of Proposition 2.

We recall that
$$T_n(\psi_{N(n)}, \gamma_0, \beta) = \frac{\Delta_n(\psi_{N(n)}, \gamma_0, \beta)}{\widehat{\varpi}_n(\psi_{N(n)}, \gamma_0, \beta)}.$$
From Proposition 2, under $H_0$, as $n \to \infty$, $\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \gamma_0, \beta) + o_P(1)$ and, in probability, $\widehat{\varpi}_n(\psi_{N(n)}, \gamma_0, \beta) \longrightarrow \varpi(\psi, \gamma_0, \beta)$. Whence the part (i) follows from Theorem 1 and Proposition 2. For the part (ii), we first observe that, by contiguity, the above asymptotics still hold under $H^{(n)}$. Next, from an adaptation of the proof of Corollary 1, we can see that, under $H^{(n)}$, as $n \to \infty$,
$$\frac{\Delta_n(\psi, \gamma_0, \beta)}{\varpi(\psi, \gamma_0, \beta)} \xrightarrow{\mathcal{D}} \mathcal{N}(\varpi(\psi, \gamma_0, \beta),\ 1).$$
Now, writing
$$T_n(\psi_{N(n)}, \gamma_0, \beta) = \frac{\Delta_n(\psi, \gamma_0, \beta)}{\varpi(\psi, \gamma_0, \beta)} \times \frac{\varpi(\psi, \gamma_0, \beta)}{\widehat{\varpi}_n(\psi_{N(n)}, \gamma_0, \beta)} + \frac{o_P(1)}{\widehat{\varpi}_n(\psi_{N(n)}, \gamma_0, \beta)},$$
it is easy to see that, under $H^{(n)}_\beta$, as $n \to \infty$,
$$T_n(\psi_{N(n)}, \gamma_0, \beta) \xrightarrow{\mathcal{D}} \mathcal{N}(\varpi(\psi, \gamma_0, \beta),\ 1).$$
Thus, we conclude that the local asymptotic power of the constructed test is preserved: replacing $\Delta_n(\psi, \gamma_0, \beta)$ with its estimated version has no effect. Then the statistics $T_n(\psi_{N(n)}, \gamma_0, \beta)$ and $T_n(\psi, \gamma_0, \beta)$ are locally asymptotically equivalent. Consequently, the last part of the theorem can be handled as that of Theorem 2.

Proof of Proposition 3

For the proof of the proposition, we need the following lemmas.
Lemma 2
Assume that (A)-(A) and (B)-(B) hold. Then, for any sequence of consistent and asymptotically normal estimators $\{\widehat{\gamma}_{0,n}\}_{n \ge 1}$ of $\gamma_0$, under $H_0$, as $n \to \infty$, we have, for any $\beta \in \mathbb{R}^{k+1}$,
$$\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1),$$
where $\{N(n)\}_{n \ge 1}$ stands for a subset of $\{1, \dots, n\}$ such that $n / N(n)$ tends to 0 as $n$ tends to infinity. Proof.
The proof is similar to that of Proposition 2.
Lemma 3
Let $\psi_n$ be a consistent estimator of $\psi$. Assume that (A)-(A) and (B)-(B) hold. Then, for any sequence of consistent and asymptotically normal estimators $\{\widehat{\gamma}_{0,n}\}_{n \ge 1}$ of $\gamma_0$, under $H_0$, as $n \to \infty$, we have, for any $\beta \in \mathbb{R}^{k+1}$,
$$\Delta_n(\psi_n, \gamma_0, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1),$$
with $\{N(n)\}_{n \ge 1}$ standing for a subset of $\{1, \dots, n\}$ such that $n / N(n)$ tends to 0 as $n$ tends to infinity. Proof.
By a Taylor expansion of $\Delta_n(\psi_n, \cdot, \beta)$ around $\widehat{\gamma}_{0,n}$, for some $\dot{\gamma}_{0,n}$ lying between $\widehat{\gamma}_{0,n}$ and $\gamma_0$, we have
$$\Delta_n(\psi_n, \gamma_0, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) - (\widehat{\gamma}_{0,n} - \gamma_0)^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + \frac{1}{2} (\widehat{\gamma}_{0,n} - \gamma_0)^\top \partial^2_\gamma \Delta_n(\psi_n, \dot{\gamma}_{0,n}, \beta)(\widehat{\gamma}_{0,n} - \gamma_0).$$
Proceeding as in the proof of Proposition 2, we have that, as $n \to \infty$,
$$(\gamma_0 - \widehat{\gamma}_{0,n})^\top \partial^2_\gamma \Delta_n(\psi_n, \dot{\gamma}_{0,n}, \beta)(\gamma_0 - \widehat{\gamma}_{0,n}) = o_P(1).$$
Thus, as $n \to \infty$, writing
$$\Delta_n(\psi_n, \gamma_0, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + (\gamma_0 - \widehat{\gamma}_{0,N(n)})^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + (\widehat{\gamma}_{0,N(n)} - \widehat{\gamma}_{0,n})^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + o_P(1)$$
and observing that $\sqrt{n}(\gamma_0 - \widehat{\gamma}_{0,N(n)}) = o_P(1)$, it remains to show that $(1/\sqrt{n})\, \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta)$ converges in probability to some random vector. For this purpose, for some $\ddot{\gamma}_{0,n}$ lying between $\gamma_0$ and $\widehat{\gamma}_{0,n}$, we can write
$$\frac{1}{\sqrt{n}} \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) = \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi'_f[\varepsilon_t(\psi_n, \gamma_0)] - (\widehat{\gamma}_{0,n} - \gamma_0)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, \omega^\top(t)\, V^2_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\psi_n, \ddot{\gamma}_{0,n})] = V_{1,n} + V_{2,n}. \tag{9}$$
Under the assumptions (B) and (B) and the fact that $\widehat{\gamma}_{0,n} - \gamma_0$ converges in probability to 0, the ergodic theorem allows us to conclude that, as $n \to \infty$, $V_{2,n}$ tends in probability to 0.
Now, adding and subtracting appropriate terms, and using a Taylor expansion, it is easy to see that
$$V_{1,n} = (\theta_n - \theta)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\widetilde{\psi}_n, \gamma_0)]\, \partial^\top_\theta \varepsilon_t(\widetilde{\psi}_n, \gamma_0) + (\rho_n - \rho)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\widetilde{\psi}_n, \gamma_0)]\, \partial^\top_\rho \varepsilon_t(\widetilde{\psi}_n, \gamma_0)$$
$$- \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi'_f[\varepsilon_t(\psi, \gamma_0)] = V_{11,n} + V_{12,n} + V_{13,n}.$$
In view of (B) and the ergodic theorem, we can easily show that, as $n$ tends to $\infty$, $V_{13,n}$ converges in probability to some random vector.

For the convergence of the term $V_{12,n}$, we can write
$$V_{12,n} = (\rho_n - \rho)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\widetilde{\psi}_n, \gamma_0)]\, \frac{\partial^\top_\rho T_{\widetilde{\rho}_n}(Z_{t-1})}{V_{\widetilde{\theta}_n}(Z_{t-1})},$$
from which we have
$$\|V_{12,n}\| \le \mathrm{Cst}\, \|\rho_n - \rho\|\, \frac{1}{n} \sum_{t=1}^{n} |\beta^\top \omega(t)|\, \vartheta(Z_{t-1}).$$
By the ergodic theorem and the fact that $\rho_n - \rho = o_P(1)$, it is easy to see that, as $n \to \infty$, the right-hand side of the above inequality tends in probability to 0. Therefore, as $n \to \infty$, $V_{12,n}$ tends in probability to 0.

For the convergence of $V_{11,n}$, we can write
$$V_{11,n} = -(\theta_n - \theta)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\widetilde{\psi}_n, \gamma_0)]\, \partial^\top_\theta V_{\widetilde{\theta}_n}(Z_{t-1}) \big( \varepsilon_t(\widetilde{\psi}_n, \gamma_0) - \varepsilon_t(\psi, \gamma_0) \big)$$
$$- (\theta_n - \theta)^\top \frac{1}{n} \sum_{t=1}^{n} \omega(t)\, \beta^\top \omega(t)\, V_{\theta_n}(Z_{t-1})\, \phi''_f[\varepsilon_t(\widetilde{\psi}_n, \gamma_0)]\, \partial^\top_\theta V_{\widetilde{\theta}_n}(Z_{t-1})\, \varepsilon_t(\psi, \gamma_0). \tag{10}$$
It is easy to see that the first term on the right-hand side of (10) is bounded by
$$\mathrm{Cst}\, \|\theta - \theta_n\|\, \frac{1}{n} \sum_{t=1}^{n} |\beta^\top \omega(t)|\, \vartheta(Z_{t-1})\, |\varepsilon_t(\psi, \gamma_0)| + \mathrm{Cst}\, \|\theta_n - \theta\| \times \|\rho - \rho_n\|\, \frac{1}{n} \sum_{t=1}^{n} |\beta^\top \omega(t)|\, \vartheta(Z_{t-1}).$$
Using again the ergodic theorem and the fact that $\|\psi_n - \psi\|$ converges in probability to 0, the first term on the right-hand side of (10) converges in probability to 0. In the same way, by the ergodic theorem, as $n \to \infty$, the second term on the right-hand side of (10) converges almost surely to 0. Then, as $n \to \infty$, $V_{11,n}$ tends in probability to 0. It results that $V_{1,n}$ converges in probability, so that $(1/\sqrt{n})\, \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta)$ converges in probability to some random vector. Consequently, as $n \to \infty$,
$$(\gamma_0 - \widehat{\gamma}_{0,N(n)})^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) = o_P(1),$$
$$\Delta_n(\psi_n, \gamma_0, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + (\widehat{\gamma}_{0,N(n)} - \widehat{\gamma}_{0,n})^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + o_P(1).$$
Now, to show that $\widehat{\gamma}_{0,N(n)}$ is asymptotically in the tangent space to the curve of $\Delta_n(\psi_n, \gamma, \beta)$ at $\widehat{\gamma}_{0,n}$, we proceed as in the proof of Lemma 1 and obtain
$$\Delta_n(\psi_n, \widehat{\gamma}_{0,N(n)}, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + (\widehat{\gamma}_{0,N(n)} - \widehat{\gamma}_{0,n})^\top \partial_\gamma \Delta_n(\psi_n, \widehat{\gamma}_{0,n}, \beta) + o_P(1).$$
Whence, under $H_0$, as $n \to \infty$,
$$\Delta_n(\psi_n, \gamma_0, \beta) = \Delta_n(\psi_n, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1),$$
where $\{N(n)\}_{n \ge 1}$ stands for a subset of $\{1, \dots, n\}$ such that $n / N(n)$ tends to 0 as $n$ tends to infinity.

Back to the proof of the proposition: from Lemma 3, for any estimator $\psi_n$ of $\psi$ satisfying (B), and for any sequence of consistent and asymptotically normal estimators $\{\widehat{\gamma}_{0,N(n)}\}_{n \ge 1}$ of $\gamma_0$, as $n \to \infty$,
$$\Delta_n(\psi_{N(n)}, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1),$$
from which, subtracting and adding $\Delta_n(\psi_{N(n)}, \gamma_0, \beta)$, we obtain
$$\Delta_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) - \Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \gamma_0, \beta) - \Delta_n(\psi, \gamma_0, \beta) + o_P(1).$$
By Proposition 2 and Lemma 3, we have $\Delta_n(\psi, \gamma_0, \beta) = \Delta_n(\psi_{N(n)}, \widehat{\gamma}_{0,N(n)}, \beta) + o_P(1)$.

Acknowledgements. We thank Professor Lihua Xiong and Professor Cong Jiang for providing us with the Upper Hanjiang River floods data.
References

[1] Andrews, D. W. (2003). End-of-sample instability tests. Econometrica, 71(6), 1661-1694.
[2] Amano, T. (2012). Asymptotic optimality of estimating function estimator for CHARN model. Advances in Decision Sciences, 2012.
[3] Aue, A., & Horváth, L. (2013). Structural breaks in time series. Journal of Time Series Analysis, 34(1), 1-16.
[4] Bardet, J. M., & Kengne, W. (2014). Monitoring procedure for parameter change in causal time series. Journal of Multivariate Analysis, 125, 204-221.
[5] Bardet, J. M., Kengne, W., & Wintenberger, O. (2012). Multiple breaks detection in general causal time series using penalized quasi-likelihood. Electronic Journal of Statistics, 6, 435-477.
[6] Bardet, J. M., & Wintenberger, O. (2009). Asymptotic normality of the quasi-maximum likelihood estimator for multidimensional causal processes. The Annals of Statistics, 37(5B), 2730-2759.
[7] Basseville, M., & Nikiforov, I. V. (1993). Detection of abrupt changes: theory and application (Vol. 104). Englewood Cliffs: Prentice Hall.
[8] Benghabrit, Y., & Hallin, M. (1996). Locally asymptotically optimal tests for autoregressive against bilinear serial dependence. Statistica Sinica, 147-169.
[9] Benghabrit, Y., & Hallin, M. (1998). Locally asymptotically optimal tests for AR(p) against diagonal bilinear dependence. Journal of Statistical Planning and Inference, 68(1), 47-63.
[10] Berkes, I., Horváth, L., & Kokoszka, P. (2004). Testing for parameter constancy in GARCH(p, q) models. Statistics & Probability Letters, 70(4), 263-273.
[11] Bhattacharya, P. K., & Zhou, H. (2017). Nonparametric stopping rules for detecting small changes in location and scale families. In From Statistics to Mathematical Finance (pp. 251-271). Springer, Cham.
[12] Billingsley, P. (1968). Convergence of probability measures. Wiley, New York.
[13] Brockwell, P. J., & Davis, R. A. (2002). Introduction to time series and forecasting. Springer.
[14] Busetti, F. (2015). On detecting end-of-sample instabilities. In Unobserved Components and Time Series Econometrics, eds. S. J. Koopman and N. Shephard. Oxford University Press.
[15] Chen, K. M., Cohen, A., & Sackrowitz, H. (2011). Consistent multiple testing for change points. Journal of Multivariate Analysis, 102(10), 1339-1343.
[16] Chernoff, H., & Zacks, S. (1964). Estimating the current mean of a normal distribution which is subjected to changes in time. The Annals of Mathematical Statistics, 35(3), 999-1018.
[17] Ciuperca, G. (2011). A general criterion to determine the number of change-points. Statistics & Probability Letters, 81(8), 1267-1275.
[18] Csörgő, M., & Horváth, L. (1997). Limit theorems in change-point analysis (Vol. 18). John Wiley & Sons Inc.
[19] Dahlhaus, R. (1997). Fitting time series models to nonstationary processes. The Annals of Statistics, 25(1), 1-37.
[20] Dehling, H., Rooch, A., & Taqqu, M. S. (2013). Non-parametric change-point tests for long-range dependent data. Scandinavian Journal of Statistics, 40(1), 153-173.
[21] Dehling, H., Rooch, A., & Taqqu, M. S. (2017). Power of change-point tests for long-range dependent data. Electronic Journal of Statistics, 11(1), 2168-2198.
[22] Deshayes, J., & Picard, D. (1985). Off-line statistical analysis of change-point models using non parametric and likelihood methods. In Detection of Abrupt Changes in Signals and Dynamical Systems (pp. 103-168). Springer, Berlin, Heidelberg.
[23] Döring, M. (2010). Multiple change-point estimation with U-statistics. Journal of Statistical Planning and Inference, 140(7), 2003-2017.
[24] Döring, M. (2011). Convergence in distribution of multiple change point estimators. Journal of Statistical Planning and Inference, 141(7), 2238-2248.
[25] Droesbeke, J.-J., & Fine, J., Association pour la statistique et ses utilisations (France) (1996). Inférence non paramétrique. Les statistiques de rangs. Éditions de l'Université de Bruxelles.
[26] Enikeeva, F., Munk, A., & Werner, F. (2018). Bump detection in heterogeneous Gaussian regression. Bernoulli, 24(2), 1266-1306.
[27] Fotopoulos, S. B., Jandhyala, V. K., & Tan, L. (2009). Asymptotic study of the change-point MLE in multivariate Gaussian families under contiguous alternatives. Journal of Statistical Planning and Inference, 139(3), 1190-1202.
[28] Francq, C., & Zakoïan, J. M. (2012). Strict stationarity testing and estimation of explosive and stationary generalized autoregressive conditional heteroscedasticity models. Econometrica, 80(2), 821-861.
[29] Gardner, L. A. (1969). On detecting changes in the mean of normal variates. The Annals of Mathematical Statistics, 40(1), 116-126.
[30] Gombay, E. (2008). Change detection in autoregressive time series. Journal of Multivariate Analysis, 99(3), 451-464.
[31] Gombay, E., & Serban, D. (2009). Monitoring parameter change in AR(p) time series models. Journal of Multivariate Analysis, 100(4), 715-725.
[32] Haccou, P., Meelis, E., & van de Geer, S. (1988). The likelihood ratio test for a change point in a sequence of independent exponentially distributed random variables. Stochastic Processes and their Applications, 30, 121-139.
[33] Heyde, C. C. (1980). Martingale limit theory and its application. Academic Press.
[34] Härdle, W., Tsybakov, A., & Yang, L. (1998). Nonparametric vector autoregression. Journal of Statistical Planning and Inference, 68(2), 221-245.
[35] Hawkins, D. M. (2001). Fitting multiple change-point models to data. Computational Statistics & Data Analysis, 37(3), 323-341.
[36] Horváth, L. (2001). Change-point detection in long-memory processes. Journal of Multivariate Analysis, 78(2), 218-234.
[37] Horváth, L., & Hušková, M. (2005). Testing for changes using permutations of U-statistics. Journal of Statistical Planning and Inference, 128(2), 351-371.
[38] Horváth, L., Kokoszka, P., & Zhang, A. (2006). Monitoring constancy of variance in conditionally heteroskedastic time series. Econometric Theory, 373-402.
[39] Horváth, L., Miller, C., & Rice, G. (2020). A new class of change point test statistics of Rényi type. Journal of Business & Economic Statistics, 38(3), 570-579.
[40] Horváth, L., & Steinebach, J. (2000). Testing for changes in the mean or variance of a stochastic process under weak invariance. Journal of Statistical Planning and Inference, 91(2), 365-376.
[41] Hwang, S. Y., & Basawa, I. V. (2001). Nonlinear time series contiguous to AR(1) processes and a related efficient test for linearity. Statistics & Probability Letters, 52(4), 381-390.
[42] Basawa, I. V., & Hwang, S. Y. (2003). Estimation for nonlinear autoregressive models generated by beta-ARCH processes. Sankhyā: The Indian Journal of Statistics, 744-762.
[43] Huh, J. (2010). Detection of a change point based on local-likelihood. Journal of Multivariate Analysis, 101(7), 1681-1700.
[44] Kander, Z., & Zacks, S. (1966). Test procedures for possible changes in parameters of statistical distributions occurring at unknown time points. The Annals of Mathematical Statistics, 37(5), 1196-1210.
[45] Kengne, W. C. (2012). Testing for parameter constancy in general causal time-series models. Journal of Time Series Analysis, 33(3), 503-518.
[46] Khakhubia, T. G. (1987). A limit theorem for a maximum-likelihood estimate of the disorder time. Theory of Probability & Its Applications, 31(1), 141-144.
[47] Lavielle, M. (1998). Optimal segmentation of random processes. IEEE Transactions on Signal Processing, 46(5), 1365-1373.
[48] Le Cam, L. (1986). Asymptotic methods in statistical decision theory. Springer-Verlag.
[49] Liebscher, E. (2003). Strong convergence of estimators in nonlinear autoregressive models. Journal of Multivariate Analysis, 84(2), 247-261.
[50] MacNeill, I. B. (1974). Tests for change of parameter at unknown times and distributions of some related functionals on Brownian motion. The Annals of Statistics, 950-962.
[51] Matthews, D. E., Farewell, V. T., & Pyke, R. (1985). Asymptotic score-statistic processes and tests for constant hazard against a change-point alternative. The Annals of Statistics, 13(2), 583-591.
[52] Menne, M. J., & Williams Jr, C. N. (2005). Detection of undocumented changepoints using multiple test statistics and composite reference series. Journal of Climate, 18(20), 4271-4286.
[53] Mohr, M., & Selk, L. (2020). Estimating change points in nonparametric time series regression models. Statistical Papers, 1-27.
[54] Ngatchou-Wandji, J. (2008). Estimation in a class of nonlinear heteroscedastic time series models. Electronic Journal of Statistics, 2, 40-62.
[55] Niu, Y. S., Hao, N., & Zhang, H. (2016). Multiple change-point detection: A selective overview. Statistical Science, 611-623.
[56] Niu, Y. S., & Zhang, H. (2012). The screening and ranking algorithm to detect DNA copy number variations. The Annals of Applied Statistics, 6(3), 1306.
[57] Page, E. S. (1954). Continuous inspection schemes. Biometrika, 41(1/2), 100-115.
[58] Pettitt, A. N. (1979). A non-parametric approach to the change-point problem. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(2), 126-135.
[59] Sen, A., & Srivastava, M. S. (1975). On tests for detecting change in mean. The Annals of Statistics, 98-108.
[60] Truong, C., Oudre, L., & Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Processing, 167, 107299.
[61] Vogelsang, T. J. (1997). Wald-type tests for detecting breaks in the trend function of a dynamic time series. Econometric Theory, 818-849.
[62] Vogelsang, T. J. (1999). Sources of nonmonotonic power when testing for a shift in mean of a dynamic time series. Journal of Econometrics, 88(2), 283-299.
[63] Wang, Q., & Phillips, P. C. (2012). A specification test for nonlinear nonstationary models. The Annals of Statistics, 40(2), 727-758.
[64] Wolfe, D. A., & Schechtman, E. (1984). Nonparametric statistical procedures for the changepoint problem. Journal of Statistical Planning and Inference, 9(3), 389-396.
[65] Xiong, L., Jiang, C., Xu, C. Y., Yu, K. X., & Guo, S. (2015). A framework of change-point detection for multivariate hydrological series. Water Resources Research, 51(10), 8198-8217.
[66] Yang, Y., & Song, Q. (2014). Jump detection in time series nonparametric regression models: a polynomial spline approach. Annals of the Institute of Statistical Mathematics, 66(2), 325-344.
[67] Yao, Y. C., & Davis, R. A. (1986). The asymptotic behavior of the likelihood ratio statistic for testing a shift in mean in a sequence of independent normal variates. Sankhyā: The Indian Journal of Statistics, Series A, 339-353.
[68] Zhou, Z. (2014).