Multi-frequency-band tests for white noise under heteroskedasticity

Abstract

This paper proposes a new family of multi-frequency-band (MFB) tests for the white noise hypothesis by using the maximum overlap discrete wavelet packet transform (MODWPT). The MODWPT allows the variance of a process to be decomposed into the variances of its components on different equal-length frequency sub-bands, and the MFB tests then measure the distance between the MODWPT-based variance ratio and its theoretical null value jointly over several frequency sub-bands. The resulting MFB tests have chi-squared asymptotic null distributions under mild conditions, which allow the data to be heteroskedastic. Simulation studies show that the MFB tests have desirable size and power performance, and their usefulness is further illustrated by two applications.


By Mengya Liu ∗, Fukang Zhu ∗ and Ke Zhu †
Jilin University ∗ and University of Hong Kong †

Keywords and phrases:

Heteroskedasticity; Maximum overlap discrete wavelet packet transform; Testing for white noise; Variance ratio test; Wavelets

1. Introduction.

Consider a stochastic sequence $\{y_t\}$ with $E(y_t) = 0$ for all $t \in \mathbb{Z}$. A long-standing problem in time series analysis is to test the null hypothesis that $\{y_t\}$ is white noise, i.e.,

(1.1) $H_0: \{y_t\}$ is an uncorrelated process.

In the time domain, Box and Pierce (1970) and later Ljung and Box (1978) proposed portmanteau tests to detect $H_0$ by checking whether $E(y_t y_{t-k}) = 0$ at some finite lags $k = 1, \ldots, K$. Their portmanteau tests require $\{y_t\}$ to be independent and identically distributed (i.i.d.), but the i.i.d. condition is restrictive in many economic and financial applications. To relax this condition, Lobato, Nankervis and Savin (2001) constructed a modified portmanteau test, which is valid when $\{y_t\}$ is a martingale difference sequence (MDS). This method was further studied by Escanciano and Lobato (2009), who added a data-driven method to select an optimal lag. For non-MDS $\{y_t\}$, robust versions of the portmanteau test were proposed in Romano and Thombs (1996) and Horowitz, Lobato, Nankervis and Savin (2006) by implementing block bootstrap methods, in Lobato (2001) by using the self-normalization technique, and in Lobato, Nankervis and Savin (2002) and Zhu (2016) by estimating the asymptotic variance matrix of the first $K$ sample autocorrelations of $\{y_t\}$. However, all of the aforementioned tests require $\{y_t\}$ to be stationary, and they are thus not applicable to heteroskedastic $\{y_t\}$ (i.e., $E y_t^2$ is not constant over $t$).

In the frequency domain, Gençay and Signori (2015) recently introduced a family of multi-scale tests for $H_0$, and their tests work for heteroskedastic $\{y_t\}$. To illustrate the idea of the multi-scale tests, we simply assume that $\{y_t\}$ is a covariance stationary process.
The multi-scale tests first apply the maximum overlap discrete wavelet transform (MODWT) to $\{y_t\}$, and then obtain its high-frequency component $W_m \equiv \{W_{m,t}\}$ and low-frequency component $V_m \equiv \{V_{m,t}\}$ at each scale $m$, where $W_m$ and $V_m$ are related to the frequency sub-bands $[1/2^{m+1}, 1/2^m]$ and $[0, 1/2^{m+1}]$, respectively, and they are decomposed recursively from $V_{m-1}$; see the left panel of Figure 1 for the MODWT decomposition scheme. Next, Gençay and Signori (2015) showed that if $\{y_t\}$ is white noise,

(1.2) $$\frac{\mathrm{var}(W_{m,t})}{\mathrm{var}(y_t)} = \frac{1}{2^m} \quad \text{for } m = 1, 2, \ldots,$$

where $\mathrm{var}(W_{m,t})$ is the MODWT-based wavelet variance, and so $\mathrm{var}(W_{m,t})/\mathrm{var}(y_t)$ is the MODWT-based wavelet variance ratio (WVR). Motivated by (1.2), the multi-scale tests detect $H_0$ by measuring the distance (under a certain norm) between the sample version of the MODWT-based WVR and $1/2^m$ at each scale $m$ (or jointly over the first $m$ scales). With the aid of the wavelet method, the multi-scale tests are particularly suitable in situations where the data $\{y_t\}$ have jumps, kinks, seasonality and non-stationary features. This advantage does not hold for the Fourier-based frequency-domain tests in Hong (1996), Paparoditis (2000), Fan and Zhang (2004), Escanciano and Velasco (2006), and Shao (2011a). Besides the multi-scale tests, some other wavelet-based frequency-domain tests were constructed from the wavelet spectral density estimator. In this context, Lee and Hong (2001) applied the idea of Hong (1996) to construct an asymptotically pivotal test, but their test requires $\{y_t\}$ to be stationary and homoskedastic, and its result is usually sensitive to the choice of the finest scale, especially when the sample size is small; Duchesne, Li and Vandermeerschen (2010) and Li, Yao and Duchesne (2014) further developed some wavelet-based tests by using the idea of Fan (1996), but their methods are only applicable to stationary i.i.d. data and rely on bootstrap methods to obtain the critical values.

Fig 1. The decomposition schemes of the MODWT (left) and MODWPT (right). For the MODWT, only $V_m$ at scale $m$ is decomposed into $V_{m+1}$ and $W_{m+1}$ at scale $m+1$. For the MODWPT, all $\{W_{m,n}\}_{n=0}^{2^m-1}$ at scale $m$ are decomposed into $\{W_{m+1,n}\}_{n=0}^{2^{m+1}-1}$ at scale $m+1$.

Although the multi-scale tests have the aforementioned advantage over the existing ones, they have a drawback due to the way the MODWT decomposes the series. To see this clearly, we note that for any covariance stationary process $\{y_t\}$ and $m = 1, 2, \ldots$,

(1.3) $$\frac{\mathrm{var}(W_{m,t})}{\mathrm{var}(y_t)} \approx \frac{\int_{1/2^{m+1}}^{1/2^m} S_y(f)\,df}{\int_0^{1/2} S_y(f)\,df}$$

(see Gençay and Signori (2015)), where $S_y(f)$ is the spectral density function of $\{y_t\}$, and it is flat under $H_0$. The result (1.3) implies that the MODWT-based WVR at scale $m$ essentially measures the share of the total variance contributed by the frequency sub-band $[1/2^{m+1}, 1/2^m]$. So, the multi-scale tests lack power if $S_y(f)$ is not flat but satisfies the relationship

$$\frac{\int_{1/2^{m+1}}^{1/2^m} S_y(f)\,df}{\int_0^{1/2} S_y(f)\,df} \approx \frac{1}{2^m} \quad \text{for } m = 1, 2, \ldots.$$

As a simple illustrating example, Figure 2 plots $S_y(f)$ for a white noise process and a correlated process. By construction, the contribution of the frequency sub-band $[1/2^{m+1}, 1/2^m]$ to the total variance is the same for both processes, and the multi-scale tests are thus unable to distinguish them. To detect this correlated process, an intuitive way is to further decompose the high-frequency component $W_m$, so that more signals to reject $H_0$ can be found within the frequency sub-band $[1/2^{m+1}, 1/2^m]$. However, the MODWT fails to do this, since it never re-decomposes $W_m$.

Fig 2. The plot of $S_y(f)$ for a white noise process (left) and a correlated process (right). The contribution of the frequency band $[1/4, 1/2]$ to the total variance of each process is in gray.

This paper is motivated to propose a new family of frequency-domain tests for $H_0$ by using the maximum overlap discrete wavelet packet transform (MODWPT).
The MODWPT decomposes the process $\{y_t\}$ into $2^m$ different components $\{W_{m,n}; n = 0, \ldots, 2^m - 1\}$ at each scale $m$, where $W_{m,n} \equiv \{W_{m,n,t}\}$ is related to the frequency sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$, and it is decomposed recursively from the $2^{m-1}$ components $\{W_{m-1,n}\}$ at the previous scale; see the right panel of Figure 1 for the MODWPT decomposition scheme. Unlike the MODWT, the MODWPT re-decomposes each $W_{m,n}$ so that the entire frequency band $[0, 1/2]$ is refined, and it thus provides us with an effective way to largely overcome the inconsistency problem in multi-scale tests. With $\{W_{m,n}; n = 0, \ldots, 2^m - 1\}$, our testing principle uses the fact that if $\{y_t\}$ is stationary white noise,

(1.4) $$\frac{\mathrm{var}(W_{m,n,t})}{\mathrm{var}(y_t)} = \frac{1}{2^m} \quad \text{for } n = 0, \ldots, 2^m - 1,$$

where $\mathrm{var}(W_{m,n,t})$ is the MODWPT-based wavelet variance, and $\mathrm{var}(W_{m,n,t})/\mathrm{var}(y_t)$ is the MODWPT-based WVR. Hence, at each scale $m$, we can look for rejection evidence by measuring the distance between the sample version of the MODWPT-based WVR and $1/2^m$ jointly over $n = 1, \ldots, 2^m - 1$. Note that we do not consider the testing signal in $W_{m,0}$ (which is identical to $V_m$), as done in Gençay and Signori (2015). Our resulting tests are called the multi-frequency-band (MFB) tests, since they are constructed by collecting signals from all frequency sub-bands (except the first one) at each scale $m$. The MFB tests are shown to have simple chi-squared limiting null distributions, under conditions that allow for higher order dependence, heteroskedasticity, and trending moments. Hence, they are easy to implement and apply with great generality. Simulation studies show that the MFB tests can have desirable empirical size and power even when the sample size is small, and they can perform better than the multi-scale tests and other competitors, especially when the serial dependence of the examined data exists at large lags. Also, the simulation studies indicate that the multi-scale tests could serve as diagnostic tools for many non-stationary models, including, for example, the time-varying GARCH model in Subba Rao (2006), the non-stationary GARCH model in Francq and Zakoïan (2012), and the ZD-GARCH model in Li, Zhang, Zhu and Ling (2018), whose model diagnostic checking methods are absent in the literature.

Finally, two applications are given to demonstrate the usefulness of the MFB tests. In the first application, our MFB tests show that although the entire S&P500 return series in 2006–2015 is not white noise, its sub-series in 2009–2015 is white noise. These results are informative for empirical researchers, since they indicate that the S&P500 stock market is possibly not predictable in 2009–2015 but predictable in 2006–2008. Since the S&P500 stock market is relatively more volatile in 2006–2008 than in 2009–2015, our findings may suggest that the S&P500 stock market is more likely to be inefficient when it is more volatile. In the second application, we apply our MFB tests to four non-stationary stock return series in Francq and Zakoïan (2012), and find that three of them are not white noise.
This implies that these three non-white-noise series have some dynamic structure in their conditional means, and they should not be fitted directly by the first-order non-stationary GARCH model as done in Francq and Zakoïan (2012).

The remainder of this paper is organized as follows. Section 2 introduces the MODWPT-based WVR and gives the asymptotics of its estimator. Section 3 proposes our MFB tests and studies their asymptotics. Simulations are provided in Section 4 and applications are offered in Section 5. Technical proofs are deferred to the Appendix.
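As a quick check on the testing principle, the null values in (1.2) and (1.4) can be read off from the spectral approximation (1.3) when the spectral density is flat. The following is a minimal sketch that ignores the leakage of the wavelet filter; the symbol $c > 0$ is introduced here only for illustration and denotes the constant level of $S_y(f)$ under the null hypothesis:

```latex
\frac{\int_{1/2^{m+1}}^{1/2^{m}} c\,df}{\int_{0}^{1/2} c\,df}
  = \frac{2^{-m} - 2^{-(m+1)}}{1/2}
  = \frac{1}{2^{m}},
\qquad
\frac{\int_{n/2^{m+1}}^{(n+1)/2^{m+1}} c\,df}{\int_{0}^{1/2} c\,df}
  = \frac{2^{-(m+1)}}{1/2}
  = \frac{1}{2^{m}}
  \quad (n = 0, \ldots, 2^{m} - 1).
```

The first identity corresponds to the dyadic MODWT sub-bands and the second to the equal-length MODWPT sub-bands, so any departure of a sub-band share from $1/2^m$ points to a non-flat spectrum.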

2. Wavelet variance ratio and its estimator.

The wavelet variance ratio (WVR) plays an important role in our testing principle. Below, we introduce the WVR based on the maximum overlap discrete wavelet packet transform (MODWPT) and its estimator. For more discussion of the MODWPT, we refer to Percival and Walden (2000).

2.1. MODWPT-based WVR.

To elaborate the definition of the MODWPT-based WVR, we simply assume that $\{y_t\}_{t=1}^{T}$ is a stationary process with mean zero. The MODWPT-based WVR is defined in terms of the MODWPT components of $\{y_t\}_{t=1}^{T}$. To compute the MODWPT components, we need a wavelet filter $\{h_l\}_{l=0}^{L-1}$ and its associated scaling filter $\{g_l\}_{l=0}^{L-1}$, where $\{h_l\}_{l=0}^{L-1}$ satisfies $h_l = 0$ for $l < 0$ and $l \ge L$, and

$$\sum_{l=0}^{L-1} h_l = 0, \quad \sum_{l=0}^{L-1} h_l^2 = 1, \quad \sum_{l=-\infty}^{\infty} h_l h_{l+2n} = 0,$$

and $\{g_l\}_{l=0}^{L-1}$ satisfies $g_l = (-1)^{l+1} h_{L-1-l}$ and

$$\sum_{l=0}^{L-1} g_l = \sqrt{2}, \quad \sum_{l=0}^{L-1} g_l^2 = 1, \quad \sum_{l=-\infty}^{\infty} g_l g_{l+2n} = 0, \quad \sum_{l=-\infty}^{\infty} g_l h_{l+2n} = 0,$$

for all nonzero integers $n$. Some well-known choices of $h_l$ and $g_l$ are given as follows:

• Haar wavelet: $\{h_l\}_{l=0}^{1} = (1/\sqrt{2}, -1/\sqrt{2})$ and $\{g_l\}_{l=0}^{1} = (1/\sqrt{2}, 1/\sqrt{2})$.
• Daubechies wavelets ($D(L)$): $D(2)$ is just the Haar wavelet. The wavelet and scaling filters for $D(4)$ are defined as

$$\{h_l\}_{l=0}^{3} = \left(\frac{1-\sqrt{3}}{4\sqrt{2}}, \frac{-3+\sqrt{3}}{4\sqrt{2}}, \frac{3+\sqrt{3}}{4\sqrt{2}}, \frac{-1-\sqrt{3}}{4\sqrt{2}}\right) \quad \text{and} \quad \{g_l\}_{l=0}^{3} = \left(\frac{1+\sqrt{3}}{4\sqrt{2}}, \frac{3+\sqrt{3}}{4\sqrt{2}}, \frac{3-\sqrt{3}}{4\sqrt{2}}, \frac{1-\sqrt{3}}{4\sqrt{2}}\right),$$

respectively. The wavelet and scaling filters for $D(L)$ with $L >$ …

Let $L_m = (2^m - 1)(L - 1) + 1$ for some integer $m \ge 1$. Based on $\{h_l\}_{l=0}^{L-1}$ and $\{g_l\}_{l=0}^{L-1}$, we then compute $\{\tilde{v}_{m,n,l}\}_{l=0}^{L_m-1}$ by

$$\tilde{v}_{m,n,l} = \frac{1}{2^{m/2}}\, v_{m,n,l} \quad \text{for } n = 0, 1, \ldots, 2^m - 1.$$

Here, $v_{m,n,l}$ is defined recursively by

$$v_{m,n,l} = \sum_{k=0}^{L-1} u_{n,k}\, v_{m-1,[n/2],\,l-2^{m-1}k}$$

with $v_{1,0,l} = g_l$ and $v_{1,1,l} = h_l$, where $[\cdot]$ is the integer part operator, and

$$u_{n,l} = \begin{cases} g_l, & \text{if } n \bmod 4 = 0 \text{ or } 3, \\ h_l, & \text{if } n \bmod 4 = 1 \text{ or } 2. \end{cases}$$

Using $\{\tilde{v}_{m,n,l}\}_{l=0}^{L_m-1}$, the MODWPT components $W_{m,n} \equiv \{W_{m,n,t}\}_{t=1}^{T}$ at scale $m$ are computed with the MODWPT coefficients

$$W_{m,n,t} = \sum_{l=0}^{L_m-1} \tilde{v}_{m,n,l}\, y_{t-l \bmod T}.$$

Note that $W_{m,n,t}$ can be computed quickly with the R package "wmtsa". Generally speaking, the MODWPT at each scale $m$ decomposes the entire frequency band $[0, 1/2]$ into $2^m$ equal sub-bands (see the right panel of Figure 1), and the resulting $W_{m,n}$ contains the characteristics of the original time series $\{y_t\}_{t=1}^{T}$ in the sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$.

Similar to Gençay and Signori (2015), we next define the wavelet variance of $\{y_t\}$ in the frequency sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$ by

(2.1) $$\mathrm{wvar}_{m,n}(y) \equiv \mathrm{var}(W_{m,n,t}).$$

With $\{\mathrm{wvar}_{m,n}(y)\}$, we can approximately decompose the variance of $\{y_t\}$ at scale $m$ by

(2.2) $$\mathrm{var}(y) \approx \sum_{n=0}^{2^m-1} \mathrm{wvar}_{m,n}(y),$$

where the result (2.2) holds because $\mathrm{wvar}_{m,n}(y) \approx \mathrm{var}_{m,n}(y) \equiv 2\int_{n/2^{m+1}}^{(n+1)/2^{m+1}} S_y(f)\,df$ when the leakage of the wavelet filter is neglected (see Gençay and Signori (2015)), and $\mathrm{var}(y) = 2\int_0^{1/2} S_y(f)\,df = \sum_{n=0}^{2^m-1} \mathrm{var}_{m,n}(y)$. Here, $S_y(f)$ is the spectral density function of $\{y_t\}$, and $\mathrm{var}_{m,n}(y)$ can be viewed as the general variance of $\{y_t\}$ in the sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$.

Now, we define the MODWPT-based WVR in the frequency sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$ by

(2.3) $$\xi_{m,n}(y) \equiv \frac{\mathrm{wvar}_{m,n}(y)}{\mathrm{var}(y)}.$$

Clearly, the result (2.2) implies that for a general stationary process $\{y_t\}$, $\sum_{n=0}^{2^m-1} \xi_{m,n}(y) \approx 1$. In particular, if $\{y_t\}$ is covariance stationary white noise, Theorem 2.1 below shows that the approximation symbol "$\approx$" can be replaced by the equality symbol "$=$".

Theorem 2.1. Suppose $\{y_t\}$ is covariance stationary white noise.
Then, $\xi_{m,n}(y) = 1/2^m$ at each scale $m$, where $n = 0, \ldots, 2^m - 1$.

The preceding theorem demonstrates that if $\{y_t\}$ is covariance stationary white noise, the MODWPT-based wavelet variance in each sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$ contributes a ratio of $1/2^m$ to the total variance. In the next section, we will apply this result to form a class of tests for $H_0$. Specifically, we will measure the distance between $\xi_{m,n}(y)$ and $1/2^m$ under a certain norm, and a large value of this distance conveys evidence against $H_0$.

2.2. The estimator of $\xi_{m,n}(y)$.

To facilitate our testing idea, an estimator of $\xi_{m,n}(y)$ is needed. In this paper, we estimate $\xi_{m,n}(y)$ by $\hat{\xi}_{m,n,T}$, where

(2.4) $$\hat{\xi}_{m,n,T} = \frac{\widehat{\mathrm{wvar}}_{m,n}(y)}{\widehat{\mathrm{var}}(y)} \equiv \frac{\sum_{t=1}^{T} W_{m,n,t}^2}{\sum_{t=1}^{T} y_t^2}.$$

Let $z_{m,n,t} = \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \tilde{v}_{m,n,i}\, \tilde{v}_{m,n,j}\, y_{t-i}\, y_{t-j}$ and

(2.5) $$s_{m,n,T}(z) = \frac{1}{T} \sum_{t=1}^{T} \mathrm{var}(z_{m,n,t}) + \frac{2}{T} \sum_{t=1}^{T} \sum_{k=1}^{T-1} \mathrm{cov}(z_{m,n,t}, z_{m,n,t-k}),$$

where $s_{m,n,T}(z)$ is the long-run variance of $\frac{1}{\sqrt{T}} \sum_{t=1}^{T} z_{m,n,t}$. Theorem 2.2 below shows that the consistency and asymptotic normality of $\hat{\xi}_{m,n,T}$ hold even for heteroskedastic white noise $\{y_t\}$.

Theorem 2.2. Suppose $\{y_t\}$ is heteroskedastic white noise. For any given $m \ge 1$ and $n = 1, \ldots, 2^m - 1$, (i) if Assumption 1 in the Appendix holds, $\hat{\xi}_{m,n,T} \xrightarrow{p} 1/2^m$ as $T \to \infty$; (ii) if $\lim_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} E y_t^2 = \sigma^2 < \infty$ and Assumption 2 in the Appendix holds,

(2.6) $$WV_{m,n} \equiv \sqrt{\frac{T\sigma^4}{4\,\mathrm{avar}(z_{m,n})}} \left(\hat{\xi}_{m,n,T} - \frac{1}{2^m}\right) \xrightarrow{d} N(0, 1) \quad \text{as } T \to \infty,$$

where $\mathrm{avar}(z_{m,n})$ is the probability limit of $s_{m,n,T}(z)$ in (2.5).

To implement Theorem 2.2(ii), we need to either estimate $\sigma^2$ and $\mathrm{avar}(z_{m,n})$ consistently or calculate them explicitly.
In general, $\sigma^2$ can be consistently estimated by $\hat{\sigma}^2 \equiv \frac{1}{T} \sum_{t=1}^{T} y_t^2$ under some mixingale conditions in Andrews (1988), and $\mathrm{avar}(z_{m,n})$ can be consistently estimated by the conventional Newey–West (NW) estimator $\widehat{\mathrm{avar}}(z_{m,n})$. For the special case that

(2.7) all cross-joint cumulants of order four for $\{y_t\}$ are zero,

we can show that $4\sigma^{-4}\,\mathrm{avar}(z_{m,n})$ in (2.6) has an explicit formula, which can be calculated directly from the wavelet filter $\{h_l\}$. Here, the cross-joint cumulants of order four for $\{y_t\}$ are defined as the coefficients $\kappa_{a,b,c,d}$ in the Taylor expansion

$$\log M(\xi) = \sum_{a} \xi_a \kappa_a + \frac{1}{2!} \sum_{a,b} \xi_a \xi_b \kappa_{a,b} + \frac{1}{3!} \sum_{a,b,c} \xi_a \xi_b \xi_c \kappa_{a,b,c} + \frac{1}{4!} \sum_{a,b,c,d} \xi_a \xi_b \xi_c \xi_d \kappa_{a,b,c,d} + \cdots,$$

where $M(\xi) = E \exp(\xi' y_t^{ijkl})$ with $\xi \in \mathbb{R}^{4 \times 1}$ and $y_t^{ijkl} = (y_{t-i}, y_{t-j}, y_{t-k}, y_{t-l})' \in \mathbb{R}^{4 \times 1}$ for any $i, j, k, l$, and each index in the summation runs from 1 to 4.

Proposition 2.1. Suppose $\{y_t\}$ is heteroskedastic white noise and the condition (2.7) holds. Then, $WV_{m,n}$ defined in (2.6) can be simplified as (2.8)

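$$WV_{m,n} = \sqrt{\frac{T}{a(\tilde{v}_{m,n,n})}} \left(\hat{\xi}_{m,n,T} - \frac{1}{2^m}\right),$$

where

$$a(\tilde{v}_{m,n_1,n_2}) = \sum_{s \in \mathbb{Z}} \sum_{i=i_{\min}}^{i_{\max}} \sum_{j \ge i}^{j_{\max}} \tilde{v}_{m,n_1,i}\, \tilde{v}_{m,n_1,j}\, \tilde{v}_{m,n_2,i-s}\, \tilde{v}_{m,n_2,j-s}$$

with $i_{\min} = \max\{0, s\}$, $i_{\max} = \min\{L_m, L_m + s\} - \ldots$ and $j_{\max} = \min\{L_m, L_m + s\} - \ldots$. Note that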

$WV_{m,n}$ aims to convey the testing signal expressed by the MODWPT-based WVR within the frequency sub-band $[n/2^{m+1}, (n+1)/2^{m+1}]$, and the results for $WV_{m,n}$ in Theorem 2.2(ii) and Proposition 2.1 are key to forming our test statistics below.
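The construction of this section can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it builds the MODWPT filters by the recursion for $v_{m,n,l}$, applies them with circular filtering, and returns the sample ratios in (2.4). The function names `modwpt_filters` and `wvr_estimates` are hypothetical, and only the Python standard library is assumed.

```python
import math
import random


def modwpt_filters(h, m):
    """Return the 2^m MODWPT filters at scale m, built from a wavelet filter h
    normalised so that sum(h_l^2) = 1 (e.g. Haar or D(4))."""
    L = len(h)
    # Scaling filter from the quadrature mirror relation g_l = (-1)^(l+1) h_{L-1-l}.
    g = [(-1) ** (l + 1) * h[L - 1 - l] for l in range(L)]
    filters = [g, h]  # scale 1: n = 0 uses g, n = 1 uses h
    for j in range(2, m + 1):
        step = 2 ** (j - 1)  # lag spacing 2^(j-1) in the recursion
        nxt = []
        for n in range(2 ** j):
            u = g if n % 4 in (0, 3) else h
            parent = filters[n // 2]
            v = [0.0] * (len(parent) + step * (L - 1))
            for k, uk in enumerate(u):
                for i, p in enumerate(parent):
                    v[i + step * k] += uk * p
            nxt.append(v)
        filters = nxt
    scale = 2.0 ** (-m / 2.0)  # rescaling by 1 / 2^(m/2)
    return [[scale * c for c in v] for v in filters]


def wvr_estimates(y, h, m):
    """Sample wavelet variance ratios for n = 0, ..., 2^m - 1, using circular
    filtering y_{(t - l) mod T}."""
    T = len(y)
    total = sum(v * v for v in y)
    ratios = []
    for v in modwpt_filters(h, m):
        energy = 0.0
        for t in range(T):
            w = sum(c * y[(t - l) % T] for l, c in enumerate(v))
            energy += w * w
        ratios.append(energy / total)
    return ratios


if __name__ == "__main__":
    haar = [1 / math.sqrt(2), -1 / math.sqrt(2)]
    rng = random.Random(0)
    y = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    print(wvr_estimates(y, haar, 2))
```

For simulated white noise, each ratio should be close to $1/2^m$, in line with Theorem 2.1, and the ratios sum to one up to rounding error because circular filtering with an orthonormal filter pair preserves energy.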

3. Multi-frequency-band tests.

In this section, we propose some new test statistics based on the MODWPT-based WVR to detect the null hypothesis $H_0$ in (1.1). Let $W_m \equiv (WV_{m,1}, \cdots, WV_{m,2^m-1})' \in \mathbb{R}^{(2^m-1) \times 1}$, and let $\Sigma_m \in \mathbb{R}^{(2^m-1) \times (2^m-1)}$ be the asymptotic covariance matrix of $W_m$ under $H_0$ with its $(i, j)$th entry

$$\Sigma_{m,i,j} = \frac{\mathrm{acov}(z_{m,i} z_{m,j})}{\sqrt{\mathrm{avar}(z_{m,i})}\,\sqrt{\mathrm{avar}(z_{m,j})}},$$

where $\mathrm{acov}(z_{m,i} z_{m,j})$ is the probability limit of the long-run covariance of $\frac{1}{\sqrt{T}} \sum_{t=1}^{T} z_{m,i,t}$ and $\frac{1}{\sqrt{T}} \sum_{t=1}^{T} z_{m,j,t}$. Since our testing principle is to measure the distance between $\hat{\xi}_{m,n,T}$ and $1/2^m$ for $n = 1, \ldots, 2^m - 1$, a straightforward way is to consider a joint multi-frequency-band test statistic

(3.1) $$MFB_m \equiv W_m' \Sigma_m^{-1} W_m$$

at each scale $m$. By construction, under $H_0$, $MFB_m \xrightarrow{d} \chi^2_{2^m-1}$ as $T \to \infty$.

Our test $MFB_m$ is similar to the multi-scale test $GSM_m$ based on the maximum overlap discrete wavelet transform (MODWT) in Gençay and Signori (2015), where

$$GSM_m \equiv (GS_1, \ldots, GS_m)\, \dot{\Sigma}_m^{-1}\, (GS_1, \ldots, GS_m)',$$

and under $H_0$, $GSM_m \xrightarrow{d} \chi^2_m$ as $T \to \infty$. Here, $\dot{\Sigma}_m \in \mathbb{R}^{m \times m}$ is the asymptotic covariance matrix of $(GS_1, \ldots, GS_m)$ with

$$GS_m \equiv \sqrt{\frac{T\sigma^4}{4\,\mathrm{avar}(z_m)}} \left(\hat{\xi}_{m,T} - \frac{1}{2^m}\right),$$

where $\hat{\xi}_{m,T}$ is defined as $\hat{\xi}_{m,n,T}$ in (2.4) with $W_{m,n,t}$ replaced by $W_{m,t}$, $\mathrm{avar}(z_m)$ is defined as $\mathrm{avar}(z_{m,n})$ in Theorem 2.2 with $z_{m,n,t}$ replaced by $z^*_{m,t}$, and

$$z^*_{m,t} = \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} h_{m,i}\, h_{m,j}\, y_{t-i}\, y_{t-j}.$$

Like $GSM_m$, $MFB_m$ can also consistently detect any finite ARMA alternative and has non-trivial power to detect local alternatives of the form

$$H_T: S_T(f) = \frac{1}{\sqrt{T}} \left(S(f) - \frac{1}{2}\right) + \frac{1}{2},$$

by using arguments similar to those in Gençay and Signori (2015), where $S(f)$ is a non-constant spectrum. However, the two tests differ because the MODWT and MODWPT decompose the series differently, as shown in Figure 1. Specifically, $GSM_m$ looks for rejection evidence from the components $\{W_1, \ldots, W_m\}$ at the first $m$ scales, while $MFB_m$ does so from the components $\{W_{m,1}, \ldots, W_{m,2^m-1}\}$ at a given scale $m$. When $m = 1$, $GSM_m$ and $MFB_m$ are identical. However, when $m > 1$, $MFB_m$ tends to find more adequate testing signals than $GSM_m$, since the MODWPT zooms in on the high-frequency sub-bands by further decomposing $W_{m,n}$, while the MODWT does not.

To use $MFB_m$ in practice, we need to calculate $WV_{m,n}$ in (2.6) and replace $\Sigma_m$ in (3.1) by a known matrix. In general, $WV_{m,n}$ can be calculated by replacing $\sigma^2$ and $\mathrm{avar}(z_{m,n})$ with $\hat{\sigma}^2$ and the NW estimator $\widehat{\mathrm{avar}}(z_{m,n})$, and $\Sigma_m$ can be replaced by its NW estimator $\hat{\Sigma}_m$, where the $(i, j)$th entry of $\hat{\Sigma}_m$ is

$$\hat{\Sigma}_{m,i,j} = \frac{\widehat{\mathrm{acov}}(z_{m,i} z_{m,j})}{\sqrt{\widehat{\mathrm{avar}}(z_{m,i})}\,\sqrt{\widehat{\mathrm{avar}}(z_{m,j})}},$$

and $\widehat{\mathrm{acov}}(z_{m,i} z_{m,j})$ is the NW estimator of $\mathrm{acov}(z_{m,i} z_{m,j})$. In the particular case that $\{y_t\}$ satisfies the condition (2.7), $WV_{m,n}$ can be calculated explicitly as in (2.8), and $\Sigma_m$ can be simplified to $A_m$ by arguments similar to those for Proposition 2.1, where the $(i, j)$th entry of $A_m$ is

(3.2) $$A_{m,i,j} = \frac{a(\tilde{v}_{m,i,j})}{\sqrt{a(\tilde{v}_{m,i,i})}\,\sqrt{a(\tilde{v}_{m,j,j})}}.$$

Now, we consider three computational versions of $MFB_m$:

• $MFB^g_m$ calculates $WV_{m,n}$ as in (2.8), and replaces $\Sigma_m$ by $A_m$ in (3.2);
• $MFB^{\triangle}_m$ calculates $WV_{m,n}$ with $\sigma^2$ and $\mathrm{avar}(z_{m,n})$ replaced by $\hat{\sigma}^2$ and $\widehat{\mathrm{avar}}(z_{m,n})$, and replaces $\Sigma_m$ by $A_m$;
• $MFB^e_m$ calculates $WV_{m,n}$ with $\sigma^2$ and $\mathrm{avar}(z_{m,n})$ replaced by $\hat{\sigma}^2$ and $\widehat{\mathrm{avar}}(z_{m,n})$, and replaces $\Sigma_m$ by $\hat{\Sigma}_m$.

Note that $MFB^g_m$, $MFB^{\triangle}_m$ and $MFB^e_m$ are constructed in a similar way to the multi-scale tests $GSM^g_m$, $GSM^{\triangle}_m$ and $GSM^e_m$ in Gençay and Signori (2015), where we use the notation $GSM^e_m$ to denote their test $GSM_m$ for notational consistency. By construction, $MFB^g_m$ and $MFB^{\triangle}_m$ are feasible for the special case that condition (2.7) holds, while $MFB^e_m$ is valid for general cases. The same conclusion holds for their multi-scale counterparts.
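The fully estimated version $MFB^e_m$ can be sketched as follows. This is a hedged illustration rather than the authors' code. It relies on the algebraic identity $\hat{\xi}_{m,n,T} - 1/2^m = 2\bar{z}_{m,n}/\hat{\sigma}^2$ (with $\bar{z}_{m,n}$ the sample mean of $z_{m,n,t}$ under circular filtering), so that $W_m' \hat{\Sigma}_m^{-1} W_m$ equals $T\, d' \hat{V}^{-1} d$ with $d$ the vector of deviations and $\hat{V} = 4\hat{\Omega}/\hat{\sigma}^4$. The following choices are assumptions of this sketch and are not taken from the paper: the Bartlett kernel with bandwidth $\lfloor 4(T/100)^{2/9} \rfloor$, demeaning of $z_{m,n,t}$ before forming autocovariances, and the helper names `solve`, `chi2_sf_odd` and `mfb_e`.

```python
import math
import random


def modwpt_filters(h, m):
    """2^m MODWPT filters at scale m from a unit-norm wavelet filter h."""
    L = len(h)
    g = [(-1) ** (l + 1) * h[L - 1 - l] for l in range(L)]
    filters = [g, h]
    for j in range(2, m + 1):
        step, nxt = 2 ** (j - 1), []
        for n in range(2 ** j):
            u, parent = (g if n % 4 in (0, 3) else h), filters[n // 2]
            v = [0.0] * (len(parent) + step * (L - 1))
            for k, uk in enumerate(u):
                for i, p in enumerate(parent):
                    v[i + step * k] += uk * p
            nxt.append(v)
        filters = nxt
    return [[2.0 ** (-m / 2.0) * c for c in v] for v in filters]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def chi2_sf_odd(x, df):
    """Upper tail probability of a chi-squared variable with odd df."""
    z = x / 2.0
    p = math.erfc(math.sqrt(z))
    for j in range((df - 1) // 2):
        p += z ** (j + 0.5) * math.exp(-z) / math.gamma(j + 1.5)
    return p


def mfb_e(y, h, m, bandwidth=None):
    """Return (statistic, p-value) of an MFB^e-type test at scale m."""
    T = len(y)
    if bandwidth is None:
        bandwidth = int(4 * (T / 100.0) ** (2.0 / 9.0))  # assumed rule of thumb
    sigma2 = sum(v * v for v in y) / T
    Z = []
    for v in modwpt_filters(h, m)[1:]:  # sub-band n = 0 is excluded
        z = []
        for t in range(T):
            lagged = [y[(t - l) % T] for l in range(len(v))]
            w = sum(c * a for c, a in zip(v, lagged))
            diag = sum(c * c * a * a for c, a in zip(v, lagged))
            z.append(0.5 * (w * w - diag))  # sum over i < j of cross products
        Z.append(z)
    d = len(Z)
    zbar = [sum(z) / T for z in Z]
    C = [[a - mu for a in z] for z, mu in zip(Z, zbar)]

    def gamma(a, b, k):
        return sum(C[a][t] * C[b][t - k] for t in range(k, T)) / T

    omega = [[gamma(a, b, 0) for b in range(d)] for a in range(d)]
    for k in range(1, bandwidth + 1):
        wgt = 1.0 - k / (bandwidth + 1.0)  # Bartlett weight
        for a in range(d):
            for b in range(d):
                omega[a][b] += wgt * (gamma(a, b, k) + gamma(b, a, k))
    diff = [2.0 * zb / sigma2 for zb in zbar]  # equals xi_hat - 1/2^m
    V = [[4.0 * omega[a][b] / sigma2 ** 2 for b in range(d)] for a in range(d)]
    stat = T * sum(di * xi for di, xi in zip(diff, solve(V, diff)))
    return stat, chi2_sf_odd(stat, d)
```

A call such as `mfb_e(y, [1 / math.sqrt(2), -1 / math.sqrt(2)], 2)` returns the statistic and its $\chi^2_{2^m-1}$ p-value for the Haar wavelet. The closed-form tail probability is used only because $2^m - 1$ is always odd; a general chi-squared routine would serve equally well.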

4. Simulation.

In this section, we examine the finite-sample performance of our tests $MFB^g_m$, $MFB^{\triangle}_m$ and $MFB^e_m$ in comparison with the portmanteau tests $Q_K$ in Ljung and Box (1978), the automatic portmanteau test $AQ$ in Escanciano and Lobato (2009), and the multi-scale tests $GSM^g_m$, $GSM^{\triangle}_m$ and $GSM^e_m$ in Gençay and Signori (2015). Unless stated otherwise, all MFB and GSM tests are computed with the Haar wavelet in the sequel.

4.1. Size study.

Let $\epsilon_t \overset{i.i.d.}{\sim} N(0, 1)$ unless otherwise specified. To examine the empirical size of all tests, we consider the following null models:

N1 [N(0,1)] a standard normal process: $y_t = \epsilon_t$;
N2 [N(0,1)-GARCH] a GARCH process with $N(0, 1)$ innovations: $y_t = \sigma_t \epsilon_t$ and $\sigma_t^2 = 0.001 + 0.\ldots\, y_{t-1}^2 + 0.\ldots\, \sigma_{t-1}^2$;
N3 [t-GARCH] a GARCH process as in model N2 except that $\epsilon_t$ is i.i.d. Student's $t$ with … degrees of freedom;
N4 [EGARCH] an EGARCH process with $N(0, 1)$ innovations: $y_t = \sigma_t \epsilon_t$ and $\log \sigma_t^2 = 0.001 + 0.\ldots\, |\epsilon_{t-1}| - 0.\ldots\, \epsilon_{t-1} + 0.95 \log \sigma_{t-1}^2$;
N5 [Mixture of normals] a mixture of two normals $N(0, \ldots/2)$ and $N(0, 1)$ with mixing probability $1/\ldots$;
N6 [N(0,t)] a heteroskedastic normal process with trending variance: $y_t = \sqrt{t}\, \epsilon_t$;
N7 [Time-varying GARCH] a time-varying GARCH(1,1) process with $N(0, 1)$ innovations: $y_t = \tau(t/T)\, u_t$, $\tau(x) = I(0 < x < 0.5) + 2 I(0.5 \le x < \ldots)$, $u_t = \sigma_t \epsilon_t$ and $\sigma_t^2 = 0.05 + 0.\ldots\, u_{t-1}^2 + 0.\ldots\, \sigma_{t-1}^2$;
N8 [Non-stationary GARCH] a non-stationary GARCH(1,1) process with $N(0, 1)$ innovations: $y_t = \sigma_t \epsilon_t$ and $\sigma_t^2 = 0.001 + 0.\ldots\, y_{t-1}^2 + 0.\ldots\, \sigma_{t-1}^2$;
N9 [ZD-GARCH] a ZD-GARCH(1,1) process with $N(0, 1)$ innovations: $y_t = \sigma_t \epsilon_t$ and $\sigma_t^2 = 0.\ldots\, y_{t-1}^2 + 0.\ldots\, \sigma_{t-1}^2$;
N10 [All-pass ARMA] an all-pass ARMA(1,1) process with $N(0, 1)$ innovations: $y_t = 0.\ldots\, y_{t-1} + \epsilon_t - (1/0.\ldots)\, \epsilon_{t-1}$;
N11 [Bilinear] a bilinear process with $N(0, 1)$ innovations: $y_t = \epsilon_t + 0.\ldots\, \epsilon_{t-\ldots}\, y_{t-\ldots}$;
N12 [Nonlinear MA] a nonlinear MA model with $N(0, 1)$ innovations: $y_t = \epsilon_t + 0.\ldots\, \epsilon_{t-\ldots}\, \epsilon_{t-\ldots}$.

Models N1–N6 were considered by Gençay and Signori (2015), and except for model N6, these models are stationary MDS with constant variances. Models N7–N9 were studied by Subba Rao (2006), Francq and Zakoïan (2012), and Li, Zhang, Zhu and Ling (2018), respectively. These three models are non-stationary MDS with time-varying variances. Unlike models N1–N9, models N10–N12 are uncorrelated but non-MDS, as shown in Shao (2011b).

Following the settings in Gençay and Signori (2015), Table 1 reports the proportion (in percentage) of rejections at the 5% nominal level for all MFB and GSM tests with $m = 2$, the portmanteau tests $Q_K$ with $K = 5$, … and $20$, and the automatic portmanteau test $AQ$, where 10000 replications are generated from each null model with the sample size $T = 100$, 300 or 1000.

[Table 1. Rejection rates (in percentage) under the null models N1–N12. Columns: models N1–N12, each with $T = 100$, 300 and 1000. Rows: $MFB^g$, $MFB^{\triangle}$, $MFB^e$, $GSM^g$, $GSM^{\triangle}$, $GSM^e$ and the $Q_K$ tests.]

From this table, our findings are as follows:

(i) Our three MFB tests have a size performance similar to that of their GSM counterparts in all examined cases. When the sample size is small (e.g., $T = 100$), $MFB^g$ has an accurate size performance, except for models N4, N6–N9 and N11–N12. As the sample size becomes larger (e.g., $T = 1000$), the over-rejection of $MFB^g$ in these models becomes even worse. In contrast, $MFB^{\triangle}$ and $MFB^e$ always have accurate sizes when the sample size is large, although they (particularly $MFB^e$) tend to be slightly over-sized when the sample size is small.

(ii) All three portmanteau tests $Q_K$ show good size performance in models N1, N5 and N10, but they are severely over-sized in models N3–N4, N6–N9 and N11, and this problem tends to persist in models N2 and N12 even when the sample size is large (e.g., $T = 1000$).

(iii) The automatic portmanteau test $AQ$ exhibits good size performance in all examined cases, except that it tends to be slightly over-sized when the sample size is small, and this problem remains in models N10–N12 even when the sample size is large.

Overall, our findings are similar to those in Gençay and Signori (2015). On the one hand, when the sample size is small, $MFB^g$ (or $GSM^g$) has a relatively better size performance than the others for most stationary MDS data, and $MFB^{\triangle}$ (or $GSM^{\triangle}$ and $AQ$) does so for most non-stationary or non-MDS data. On the other hand, when the sample size is large, $MFB^{\triangle}$ (or $GSM^{\triangle}$) seems to have the best size performance in general.

4.2. Power study.

To examine the empirical power of all tests, we consider the following four alternative models:

A1 [N(0,1)-AR(2)] an AR(2) process with $N(0, 1)$ innovations: $y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \epsilon_t$;
A2 [N(0,1)-AR(3)] an AR(3) process with $N(0, 1)$ innovations: $y_t = \beta_{\ldots}\, y_{t-\ldots} + \beta_{\ldots}\, y_{t-\ldots} + \epsilon_t$;
A3 [N(0,t)-AR(2)] an AR(2) process with $N(0, t)$ innovations: $y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \sqrt{t}\, \epsilon_t$;
A4 [N(0,t)-AR(3)] an AR(3) process with $N(0, t)$ innovations: $y_t = \beta_{\ldots}\, y_{t-\ldots} + \beta_{\ldots}\, y_{t-\ldots} + \sqrt{t}\, \epsilon_t$,

where $\beta_{\ldots}$ (or $\beta_{\ldots}$) is set to be $-0.\ldots, -0.\ldots, \ldots, 0.\ldots$, and $0.\ldots$ … $MFB^g$, $GSM^g$, $Q_{\ldots}$, and $AQ$ when the sample size is small. Tables 2 and 3 report the power (in percentage) at the 5% nominal level for $MFB^g$, where 10000 replications are generated from each alternative model with the sample size $T = 100$. For comparison, Tables 2 and 3 also report the relative power gains of $MFB^g$ with respect to the other three tests. From these two tables, we have the following findings:

(i) For model A1, $MFB^g$ is generally more powerful than $GSM^g$ when $\beta_{\ldots} < 0$, while $GSM^g$ outperforms $MFB^g$ when $\beta_{\ldots} > 0$. For model A2, the advantage of $GSM^g$ over $MFB^g$ largely disappears, but $MFB^g$ has a huge power improvement over $GSM^g$ of up to 786%. This implies that the power advantage of $MFB^g$ over $GSM^g$ tends to be more substantial when the serial dependence of the data occurs at larger lags. For models A3–A4 with heteroskedastic data, a similar conclusion can be drawn.

(ii) For all four models considered, $MFB^g$ is always more powerful than $Q_{\ldots}$. The power comparison between $MFB^g$ and $AQ$ is mixed. For models A1 and A3, $MFB^g$ (or $AQ$) shows relatively better performance when $\beta_{\ldots} > 0$ (or $\beta_{\ldots} < 0$). For model A2, $MFB^g$ has a clear power improvement over $AQ$ of up to 88%, while $AQ$ is only slightly better than $MFB^g$ when $\beta_{\ldots} < 0$ or $\beta_{\ldots}$ is close to 0. For model A4, a phenomenon similar to that for model A2 can be observed. All these findings once again imply that $MFB^g$ has a more substantial power advantage over $AQ$ when the serial dependence of the data occurs at larger lags.

4.3. Robust analysis.

In the previous two subsections, we focus on $m = 2$ for our MFB tests. This subsection carries out some robustness analysis for our MFB tests, based on the settings in Gençay and Signori (2015). First, we explore the finite-sample performance of our MFB tests with respect to the choice of $m$. To this end, we generate 10000 replications with sample size $T = 100$, 300 or 1000 from the following AR($k$) model: $y_t = \beta y_{t-k} + \epsilon_t$, where $|\beta| < 1$. Figures 3 and 4 plot the (size-adjusted) power of $MFB^g_m$ (for $m = 1, \ldots, 5$) … $GSM^g_m$ is also plotted in these two figures. From Figure 3, we find that when $m = 1$, … $m = 4$ and 5, the GSM tests perform better than the MFB tests, especially for $\beta > 0$. In contrast, Figure 4 shows that when $m = 3$, … more powerful than the GSM tests, while all tests exhibit low power when $m = 1$ and 2. These findings suggest that when the serial dependence occurs at a small lag, our MFB tests perform stably over $m$, and when the serial dependence occurs at a large lag, our MFB tests with a large $m$ perform well and are generally more powerful than the GSM tests.

[Table 2. Size-adjusted power and relative power against model A1 (left side) and model A2 (right side). Panels: power of $MFB^g$, and the relative power gains $(MFB^g/GSM^g) - 1$, $(MFB^g/Q) - 1$ and $(MFB^g/AQ) - 1$.]

[Table 3. Size-adjusted power and relative power against model A3 (left side) and model A4 (right side). Panels: power of $MFB^g$, and the relative power gains $(MFB^g/GSM^g) - 1$, $(MFB^g/Q) - 1$ and $(MFB^g/AQ) - 1$.]

[Fig 3. The power of $MFB^g_m$ and $GSM^g_m$ (for $m = 1, \ldots, 5$) against the AR(1) alternative $y_t = \beta y_{t-1} + \epsilon_t$, plotted against $\beta$. The top, middle and bottom panels correspond to the sample sizes $T = 100$, 300 and 1000, respectively.]

[Fig 4. The power of $MFB^g_m$ and $GSM^g_m$ (for $m = 1, \ldots, 5$) against the AR(5) alternative $y_t = \beta y_{t-5} + \epsilon_t$, plotted against $\beta$. The top, middle and bottom panels correspond to the sample sizes $T = 100$, 300 and 1000, respectively.]

Second, we check the finite-sample performance of our MFB tests with respect to the choice of wavelets. Following the settings in Gençay and Signori (2015), we report the size and (size-adjusted) power of $MFB^g$ for the Haar wavelet and the Daubechies wavelets D(4), D(6), D(8) and D(10) in Table 4. From this table, we see no significant difference in terms of size, but the Haar wavelet has some marginal advantages in terms of power.

Table 4. Size and power of $MFB^g$ for various wavelets.

Models      T      Haar   D(4)   D(6)   D(8)   D(10)
Panel A: size study
Model N1    100    4.65   4.55   4.55   4.54   4.60
            300    4.68   4.69   4.75   4.70   4.65
            1000   4.42   4.58   4.59   4.52   4.48
Model N2    100    5.14   5.40   5.26   5.19   5.26
            300    6.69   6.62   6.65   6.58   6.59
            1000   7.22   7.10   7.05   7.06   7.08
Panel B: power study
[Model A1 with $\beta_{\ldots} = \beta_{\ldots} = 0.\ldots$ and with $\beta_{\ldots} = \beta_{\ldots} = 0.\ldots$]
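The Monte Carlo recipe behind the size study in Section 4.1 can be sketched with the simplest competitor, the Ljung–Box statistic $Q_K = T(T+2) \sum_{k=1}^{K} \hat{\rho}_k^2/(T-k)$, under the null models N1 ($y_t = \epsilon_t$) and N6 ($y_t = \sqrt{t}\,\epsilon_t$). This is an illustrative sketch only: the replication count is far smaller than the 10000 used in the paper, the function names are hypothetical, and the closed-form chi-squared tail used here applies to odd $K$ such as $K = 5$.

```python
import math
import random


def chi2_sf_odd(x, df):
    """Upper tail probability of a chi-squared variable with odd df."""
    z = x / 2.0
    p = math.erfc(math.sqrt(z))
    for j in range((df - 1) // 2):
        p += z ** (j + 0.5) * math.exp(-z) / math.gamma(j + 1.5)
    return p


def ljung_box(y, K):
    """Ljung-Box portmanteau statistic based on the first K autocorrelations."""
    T = len(y)
    mean = sum(y) / T
    c = [v - mean for v in y]
    g0 = sum(v * v for v in c)
    q = 0.0
    for k in range(1, K + 1):
        rho = sum(c[t] * c[t - k] for t in range(k, T)) / g0
        q += rho * rho / (T - k)
    return T * (T + 2) * q


def model_n1(T, rng):
    """Null model N1: i.i.d. standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(T)]


def model_n6(T, rng):
    """Null model N6: normal with trending variance, y_t = sqrt(t) * eps_t."""
    return [math.sqrt(t) * rng.gauss(0.0, 1.0) for t in range(1, T + 1)]


def rejection_rate(simulate, K=5, T=100, reps=500, level=0.05, seed=0):
    """Share of replications in which Q_K rejects at the given nominal level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = simulate(T, rng)
        if chi2_sf_odd(ljung_box(y, K), K) < level:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("N1:", rejection_rate(model_n1))
    print("N6:", rejection_rate(model_n6))
```

Swapping `ljung_box` for another test statistic and adding further data-generating functions reproduces the layout of the size tables, with one rejection rate per model and sample size.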

5. Applications.

5.1. Application 1.

Whether market index returns are predictable has been a long-standing problem in the literature. The empirical studies in Lo and MacKinlay (1988) and Hong and Lee (2005) found that the S&P500 index returns are predictable. However, these studies overlooked the fact that a predictability conclusion based on the entire period may not hold for specific sub-periods. To address this concern, we examine whether the recent S&P500 return series and its sub-series are white noises. If the white noise assumption is rejected, the examined series is predictable, which gives empirical evidence against the efficient market hypothesis.

We consider the daily S&P500 index from January 2, 2006 to December 31, 2015, with 2515 observations in total. Denote the S&P500 return by $y_t = 100 \log(P_t / P_{t-1})$, where $P_t$ is the closing S&P500 index on day $t$. We first apply the MFB tests, the GSM tests and the AQ test to the entire 10-year return series, and the results in Panel A of Table 5 show very strong evidence against the white noise assumption for the entire series. Although the entire series is not white noise, some of its sub-series may still be white noise. To examine this, we then apply all tests to five 2-year sub-series, and the results reported in Panel B of Table 5 indicate that both the 2012–2013 and 2014–2015 sub-series are white noises at the 5% level, while the other three 2-year sub-series are not. For these three non-white-noise sub-series, we further check whether their 1-year sub-series are white noises. The results given in Panel C of Table 5 show that among the six 1-year sub-series, the 2009, 2010 and 2011 sub-series are indeed white noises at the 5% level.
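
The drill-down procedure described above (entire series, then 2-year sub-series, then 1-year sub-series of the rejected 2-year blocks) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `log_returns`, `robust_lag1_pvalue` and `drill_down` are hypothetical, and `robust_lag1_pvalue` is a simple heteroskedasticity-robust lag-1 test used as a stand-in, not the MFB, GSM or AQ tests of the paper. Any function that maps a return series to a p-value for the white noise null can be passed in its place.

```python
import math
import numpy as np

def log_returns(prices):
    """Percentage log returns y_t = 100 * log(P_t / P_{t-1})."""
    p = np.asarray(prices, dtype=float)
    return 100.0 * np.diff(np.log(p))

def robust_lag1_pvalue(y):
    """Stand-in test (NOT the MFB test): self-normalized z-test of the
    lag-1 autocovariance, which tolerates heteroskedasticity."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    prod = y[1:] * y[:-1]
    z = prod.sum() / math.sqrt((prod ** 2).sum())
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def drill_down(y, years, pvalue_fn=robust_lag1_pvalue, level=0.05):
    """Test the whole sample; if rejected, test each 2-year block;
    for each rejected 2-year block, test its 1-year sub-series.
    Returns a dict mapping (first_year, last_year) to the p-value."""
    y = np.asarray(y, dtype=float)
    years = np.asarray(years)
    first, last = int(years.min()), int(years.max())
    results = {}

    def rejects(lo, hi):
        p = pvalue_fn(y[(years >= lo) & (years <= hi)])
        results[(lo, hi)] = p
        return p < level

    if rejects(first, last):
        for lo in range(first, last + 1, 2):
            hi = min(lo + 1, last)
            if rejects(lo, hi) and hi > lo:
                for yr in (lo, hi):
                    rejects(yr, yr)
    return results
```

With daily closing prices and the calendar year of each return, `drill_down(log_returns(prices), years[1:])` returns the p-values for every period that was examined. The one-year blocks appear only for two-year blocks in which the null was rejected, which mirrors the structure of Panels A to C.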
In all sub-series studies, our MFB tests exhibit much more rejection evidence than the GSM tests, and the AQ test does not show such evidence for the 2010–2011 sub-series and the 2008 and 2011 sub-series. Overall, our testing results imply that the S&P500 return series is not white noise during 2006–2008, while it is white noise during 2009–2015. Since the S&P500 stock market is relatively more volatile in 2006–2008 than in 2009–2015, our findings may indicate that the S&P500 market is more likely to be inefficient when it is more volatile.

5.2. Application 2.

This subsection revisits the daily stock returns of BTC, CCME, KV-A, and MCBF in Francq and Zakoïan (2012). These four data sets start on June 29, 2007, March 31, 2009, March 31, 2006, and August 28, 2007, respectively, and all end on February 7, 2011, with 907, 468, 1220, and 867 observations, respectively. In Francq and Zakoïan (2012), all four stock return series are fitted by the non-stationary GARCH(1, 1) model. Such a fit is appropriate if the return series are white noises; otherwise, they possibly have some conditional mean dynamics, which need to be filtered out first.

We use our three MFB tests as well as three GSM tests and the automatic portmanteau test AQ to examine whether these four stock return series are white noises. The testing results are summarized in Table 6, from which we find that only the CCME return series is white noise at the 5% level, while the other three return series are not. Specifically, our MFB tests find more rejection evidence than the GSM tests for the KV-A return series, while the GSM tests do better for the BTC return series, especially at the scales $m = 3$ and 4. For the MCBF return series, the white noise hypothesis is strongly rejected by all tests. Compared with the MFB and GSM tests, the AQ test cannot find significant evidence of rejection for the BTC, CCME and KV-A return series.

In summary, our testing results imply that only the CCME return series has no serial dependence in its conditional mean, and it is thus suitable to fit this series by the non-stationary GARCH(1, 1) model. However, the other three return series (particularly MCBF) most likely have serial dependence in their conditional mean, and without filtering out the conditional mean effect in advance, the fittings in Francq and Zakoïan (2012) may be inappropriate for these three series.

Table 5

Testing results for S&P500 returns.

[Numerical entries lost in extraction. Columns: time period, test, scales $m$, and AQ. Panel A: entire 10-year series (2006–2015); Panel B: 2-year sub-series (starting with 2006–2007); Panel C: 1-year sub-series. Each period has the rows $MFB^g_m$, $GSM^g_m$, $MFB^{\triangle}_m$, $GSM^{\triangle}_m$, $MFB^e_m$ and $GSM^e_m$.]

Table 6

Testing results for four stock returns.

[Numerical entries lost in extraction. Columns: series, test, scales $m$, and AQ. Each series (BTC, CCME, KV-A and MCBF) has the rows $MFB^g_m$, $GSM^g_m$, $MFB^{\triangle}_m$, $GSM^{\triangle}_m$, $MFB^e_m$ and $GSM^e_m$.]

Note: The p-value of each test statistic less than 5% is in boldface.

APPENDIX: TECHNICAL CONDITIONS AND PROOFS

To introduce our technical conditions, the definition of near-epoch dependence is needed.

Definition A.1. For a stochastic sequence $\{\epsilon_t\}$, let $\mathcal{F}_{t-m}^{t+m}(\epsilon) = \sigma(\epsilon_{t-m}, \ldots, \epsilon_{t+m})$. A stochastic sequence $\{y_t\}$ is near-epoch dependent (NED) on $\{\epsilon_t\}$ in $L_p$-norm for $p > 0$ if
$$\left\| y_t - E\left[ y_t \mid \mathcal{F}_{t-m}^{t+m}(\epsilon) \right] \right\|_p \le d_t \nu_m,$$
where $\nu_m \to 0$ as $m \to \infty$, and $d_t$ is a sequence of positive real numbers such that $d_t = O(\| y_t \|_p)$.

The concept of near-epoch dependence can be traced back to the work of Ibragimov (1962). The NED processes allow for considerable heterogeneity and also for dependence, and they include the mixing processes as a special case. As shown in Davidson (2002, 2004) and references therein, many nonlinear models are NED.

Next, we give our technical conditions.

Assumption 1. $\{y_t\}$ is a stochastic process which is $L_r$-bounded for $r >$ […] and $L_p$-NED on an $\alpha$-mixing process for $p \ge$ […].

Assumption 2. (i) For all $i, j, k, l$ such that $0 \le i < j \le L_m$ and $0 \le k < l \le L_m$, $\{ y_{t-i} y_{t-j} y_{t-k} y_{t-l} / M_{y,t} \}$ is uniformly $L_r$-bounded for $r >$ […], where
$$M_{y,t} = \sum_{i=0}^{L_m} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} \, E( y_{t-i} y_{t-j} y_{t-k} y_{t-l} ).$$
(ii) For all positive $j \le L_m$, $\{ y_t y_{t-j} \}$ is an $L_r$-bounded stochastic sequence for $r >$ […] and $L_p$-NED of size $-$[…] on a $\phi$-mixing process $\{\epsilon_t\}$ for $p \ge$ […].
(iii) $\mathrm{var}(z_{m,n,t}) \sim t^{\beta}$ and $s_{m,n,T}(z) \sim T^{\gamma}$ for $\beta \le \gamma$.

Assumptions 1–2 are in line with Assumptions A–B in Gençay and Signori (2015), and they allow for heteroskedastic data. For the GARCH(1, 1) model, the NED conditions in Assumptions 1–2 were verified by Gençay and Signori (2015). For a general model, it seems challenging to verify Assumptions 1–2 in theory at this stage. Nevertheless, the good finite-sample performance of our MFB tests in Section 4 suggests that these two assumptions could hold for a variety of time series models.
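
Before turning to the proofs, the filter-norm identity $\| \tilde v_{m,n} \|^2 = 1/2^m$ that Theorem 2.1 establishes in (A.1) can be checked numerically. The sketch below is an illustration under assumptions, not the authors' implementation: it uses the Haar MODWT filters, builds the level-$m$ filters by cascading in natural (not sequency) order, and simulates a heteroskedastic white noise $y_t = \sqrt{t}\,\epsilon_t$ with standard normal $\epsilon_t$. The names `modwpt_filters` and `variance_ratios` are hypothetical.

```python
import numpy as np

# Haar MODWT filters (DWT Haar filters divided by sqrt(2)); squared norm 1/2 each.
HAAR_G = np.array([0.5, 0.5])    # scaling (low-pass) filter
HAAR_H = np.array([0.5, -0.5])   # wavelet (high-pass) filter

def modwpt_filters(g, h, m):
    """Level-m equivalent filters, one per sub-band, in natural order."""
    filters = [np.array([1.0])]
    for j in range(1, m + 1):
        step = 2 ** (j - 1)      # level j inserts step - 1 zeros between taps
        children = []
        for v in filters:
            for u in (g, h):
                up = np.zeros((len(u) - 1) * step + 1)
                up[::step] = u
                children.append(np.convolve(v, up))
        filters = children
    return filters

def variance_ratios(y, filters):
    """Sum_t W_t^2 / Sum_t y_t^2 for each sub-band, with circular filtering."""
    y = np.asarray(y, dtype=float)
    total = np.sum(y ** 2)
    ratios = []
    for v in filters:
        # W_t = sum_l v_l * y_{(t - l) mod T}
        w = sum(v[l] * np.roll(y, l) for l in range(len(v)))
        ratios.append(np.sum(w ** 2) / total)
    return np.array(ratios)

m = 2
filters = modwpt_filters(HAAR_G, HAAR_H, m)
norms = np.array([np.sum(v ** 2) for v in filters])   # each should equal 1 / 2**m

rng = np.random.default_rng(0)
T = 4000
y = np.sqrt(np.arange(1, T + 1)) * rng.standard_normal(T)  # heteroskedastic white noise
xi = variance_ratios(y, filters)                           # each should be near 1 / 2**m
```

Under these assumptions, every entry of `norms` equals $1/2^m$ up to rounding, the entries of `xi` sum to one because circular filtering preserves energy, and each entry of `xi` is close to $1/2^m$ even though the variance of $y_t$ grows with $t$.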

Proof of Theorem 2.1.

According to the construction of the MODWPT, $W_{m,n,t}$ can be obtained by applying the filter $\{\tilde v_{m,n,l}\}$ to the process $\{y_t\}$, where $\{\tilde v_{m,n,l}\}$ depends only on $\{h_l\}$ and $\{g_l\}$. Let $V_{m,n}(\cdot)$ be the discrete Fourier transfer function for $\{\tilde v_{m,n,l}\}$, which depends only on the transfer functions $G_m(\cdot)$ and $H_m(\cdot)$ for $\{h_l\}$ and $\{g_l\}$, respectively (see, e.g., the specific expressions in Percival and Walden (2000, p.215)). Then, when $\{y_t\}$ is stationary, the spectrum of $W_{m,n,t}$ is $S_{W_{m,n}}(\cdot) = |V_{m,n}(\cdot)|^2 S_y(\cdot)$, and since $S_y(f) = \sigma_y^2$ for a covariance stationary white noise $\{y_t\}$, it follows that
$$\begin{aligned}
\mathrm{var}(W_{m,n,t}) &= \int_{-1/2}^{1/2} S_{W_{m,n}}(f)\,\mathrm{d}f = \int_{-1/2}^{1/2} |V_{m,n}(f)|^2 S_y(f)\,\mathrm{d}f = \sigma_y^2 \int_{-1/2}^{1/2} |V_{m,n}(f)|^2\,\mathrm{d}f \\
&= \sigma_y^2 \|\tilde v_{m,n}\|^2 = \sigma_y^2 \|g\|^{2 i_{m,n}} \|h\|^{2(m - i_{m,n})} = \sigma_y^2 / 2^m, \qquad \text{(A.1)}
\end{aligned}$$
where (A.1) holds by Parseval's identity and the basic properties of the wavelet filter and its associated scaling filter, and $i_{m,n}$ is an integer satisfying $0 \le i_{m,n} \le m$, which is determined only by $m$ and $n$ (see Percival and Walden (2000, p.215)). □

Proof of Theorem 2.2.

(i) Since the NED property is preserved under linear combinations (see Davidson (1995, p.267)) and the MODWPT is a linear operator, $\{W_{m,n,t}\}$ is $L_{[\ldots]}$-NED under Assumption 1, and consequently, $\{W_{m,n,t}^2\}$ is $L_{[\ldots]}$-NED (see Davidson (1995, p.268)), where
$$W_{m,n,t}^2 = \sum_{l=0}^{L_m-1} \tilde v_{m,n,l}^2 y_{t-l}^2 + 2 \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} y_{t-i} y_{t-j} = \sum_{l=0}^{L_m-1} \tilde v_{m,n,l}^2 y_{t-l}^2 + 2 z_{m,n,t},$$
and $\{z_{m,n,t}\}$ is $L_{[\ldots]}$-NED because it is a linear combination of $\{W_{m,n,t}^2\}$ and $\{y_t^2\}$, both of which are $L_{[\ldots]}$-NED. Then, it follows that
$$\begin{aligned}
\hat\xi_{m,n,T} &= \frac{\sum_{t=1}^T W_{m,n,t}^2}{\sum_{t=1}^T y_t^2} = \frac{\sum_{t=1}^T \left( \sum_{l=0}^{L_m-1} \tilde v_{m,n,l}^2 y_{t-l}^2 + 2 z_{m,n,t} \right)}{\sum_{t=1}^T y_t^2} \\
&= \frac{\sum_{l=0}^{L_m-1} \tilde v_{m,n,l}^2 \sum_{t=1}^T y_{t-l}^2}{\sum_{t=1}^T y_t^2} + \frac{2 \sum_{t=1}^T z_{m,n,t}}{\sum_{t=1}^T y_t^2} \\
&= \sum_{l=0}^{L_m-1} \tilde v_{m,n,l}^2 + \frac{2 \sum_{t=1}^T z_{m,n,t}}{\sum_{t=1}^T y_t^2} \qquad \text{(A.2)} \\
&= \frac{1}{2^m} + \frac{2 \sum_{t=1}^T z_{m,n,t}}{\sum_{t=1}^T y_t^2}, \qquad \text{(A.3)}
\end{aligned}$$
where (A.2) holds since the filtering is cyclic, so that $\sum_{t=1}^T y_{t-l}^2$ does not depend on $l$ and is equal to $\sum_{t=1}^T y_t^2$, and (A.3) holds since, for the $m$th level of the MODWPT, each of the filters $\{\tilde v_{m,n,l}\}$, $n = 0, \ldots, 2^m - 1$, is a cascade filter obtained by the convolution of $m$ filters with norm $1/2$, and the norm of a convolution is the product of the norms. Finally, the conclusion holds since
$$\frac{2 \sum_{t=1}^T z_{m,n,t}}{\sum_{t=1}^T y_t^2} \xrightarrow{p} 0 \quad \text{as } T \to \infty$$
by Theorem 1 of Andrews (1988) and Slutsky's Theorem.

(ii) Since the NED property is preserved under linear combinations and $\{z_{m,n,t}\}$ is a linear combination of processes of the form $\{y_t y_{t-i}\}$, we can get that $\{z_{m,n,t}\}$ is $L_r$-NED on $\{\epsilon_t\}$ under Assumption 2. Next, we verify that $\{z_{m,n,t}\}$ satisfies the conditions of the Central Limit Theorem for NED processes in De Jong (1997, p.358). Note that
$$\begin{aligned}
\mathrm{var}(z_{m,n,t}) &= \mathrm{var}\left( \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} y_{t-i} y_{t-j} \right) \\
&= \mathrm{cov}\left( \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} y_{t-i} y_{t-j}, \; \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,k} \tilde v_{m,n,l} y_{t-k} y_{t-l} \right) \\
&= \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} \, \mathrm{cov}( y_{t-i} y_{t-j}, y_{t-k} y_{t-l} ) \\
&= \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} \, E( y_{t-i} y_{t-j} y_{t-k} y_{t-l} ),
\end{aligned}$$
where the last equation holds because the mean of $\{y_t\}$ is zero. Then, we have
$$\begin{aligned}
&\left\| \frac{ y_{t-i} y_{t-j} y_{t-k} y_{t-l} }{ \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} E( y_{t-i} y_{t-j} y_{t-k} y_{t-l} ) } \right\|_p \\
&\quad \sim \left\| \frac{ \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} \, y_{t-i} y_{t-j} y_{t-k} y_{t-l} }{ \sum_{i=0}^{L_m-1} \sum_{j>i}^{L_m} \sum_{k=0}^{L_m-1} \sum_{l>k}^{L_m} \tilde v_{m,n,i} \tilde v_{m,n,j} \tilde v_{m,n,k} \tilde v_{m,n,l} E( y_{t-i} y_{t-j} y_{t-k} y_{t-l} ) } \right\|_p \\
&\quad = \left\| \frac{ z_{m,n,t}^2 }{ \mathrm{var}(z_{m,n,t}) } \right\|_p = \left\| \frac{ z_{m,n,t}^2 }{ \sigma_{m,n,t}^2 } \right\|_p,
\end{aligned}$$
which implies that $z_{m,n,t} / \sigma_{m,n,t}$ is $L_q$-bounded for $q = 2p > 2$. Hence, we have verified that $\{z_{m,n,t}\}$ satisfies the conditions of the Central Limit Theorem for NED processes, and so we have
$$\frac{ \sum_{t=1}^T z_{m,n,t} }{ s_{m,n,T}(z) } \xrightarrow{d} N(0, 1) \quad \text{as } T \to \infty.$$
By (A.3), it follows that
$$\frac{ \sum_{t=1}^T y_t^2 }{ 2 s_{m,n,T}(z) } \left( \hat\xi_{m,n,T} - \frac{1}{2^m} \right) \xrightarrow{d} N(0, 1) \quad \text{as } T \to \infty.$$
Since $\frac{1}{T} \sum_{t=1}^T E y_t^2 \to \sigma^2$ as $T \to \infty$, the conclusion follows by Slutsky's Theorem. □

Proof of Proposition 2.1.

The conclusion holds by arguments similar to those for Corollary 13 in Gençay and Signori (2015), and hence the details are omitted. □

REFERENCES

[1] Andrews, D. W. K. (1988) Laws of large numbers for dependent nonidentically distributed random variables. Econom. Theory, 458–467.
[2] Box, G. E. and Pierce, D. A. (1970) Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. J. Am. Stat. Assoc., 1509–1526.
[3] Daubechies, I. (1992) Ten Lectures on Wavelets. Philadelphia: SIAM.
[4] Davidson, J. (1995) Stochastic Limit Theory. Oxford University Press, Oxford.
[5] Davidson, J. (2002) Establishing conditions for the functional central limit theorem in nonlinear and semiparametric time series processes. J. Econometrics, 243–269.
[6] Davidson, J. (2004) Moment and memory properties of linear conditional heteroscedasticity models, and a new model. J. Bus. Econom. Statist., 16–29.
[7] De Jong, R. (1997) Central limit theorems for dependent heterogeneous random variables. Econom. Theory, 353–367.
[8] Duchesne, P., Li, L. and Vandermeerschen, J. (2010) On testing for serial correlation of unknown form using wavelet thresholding. Comput. Statist. Data Anal., 2512–2531.
[9] Escanciano, J. C. and Lobato, I. N. (2009) An automatic portmanteau test for serial correlation. J. Econometrics, 140–149.
[10] Escanciano, J. C. and Velasco, C. (2006) Generalized spectral tests for the martingale difference hypothesis. J. Econometrics, 151–185.
[11] Fan, J. (1996) Test of significance based on wavelet thresholding and Neyman's truncation. J. Am. Stat. Assoc., 674–688.
[12] Fan, J. and Zhang, W. (2004) Generalised likelihood ratio tests for spectral density. Biometrika, 195–209.
[13] Francq, C. and Zakoïan, J. M. (2012) Strict stationarity testing and estimation of explosive and stationary generalized autoregressive conditional heteroscedasticity models. Econometrica, 821–861.
[14] Gençay, R. and Signori, D. (2015) Multi-scale tests for serial correlation. J. Econometrics, 62–80.
[15] Hong, Y. (1996) Consistent testing for serial correlation of unknown form. Econometrica, 837–864.
[16] Hong, Y. and Lee, Y. J. (2005) Generalized spectral tests for conditional mean models in time series with conditional heteroscedasticity of unknown form. Rev. Econ. Stud., 499–541.
[17] Horowitz, J. L., Lobato, I. N., Nankervis, J. C. and Savin, N. E. (2006) Bootstrapping the Box-Pierce Q-test: a robust test of uncorrelatedness. J. Econometrics, 841–862.
[18] Ibragimov, I. (1962) Some limit theorems for stationary processes. Theory of Probab. Appl., 349–382.
[19] Lee, J. and Hong, Y. (2001) Testing for serial correlation of unknown form using wavelet methods. Econom. Theory, 386–423.
[20] Li, D., Zhang, X., Zhu, K. and Ling, S. (2018) The ZD-GARCH model: A new way to study heteroscedasticity. J. Econometrics, 1–17.
[21] Li, L., Yao, S. and Duchesne, P. (2014) On wavelet-based testing for serial correlation of unknown form using Fan's adaptive Neyman method. Comput. Statist. Data Anal., 308–327.
[22] Ljung, G. M. and Box, G. E. (1978) On a measure of lack of fit in time series models. Biometrika, 297–303.
[23] Lo, A. W. and MacKinlay, A. C. (1988) Stock market prices do not follow random walks: Evidence from a simple specification test. Rev. Financial Stud., 41–66.
[24] Lobato, I. N. (2001) Testing that a dependent process is uncorrelated. J. Am. Stat. Assoc., 1066–1076.
[25] Lobato, I. N., Nankervis, J. C. and Savin, N. E. (2002) Testing for zero autocorrelation in the presence of statistical dependence. Econom. Theory, 730–743.
[26] Paparoditis, E. (2000) Spectral density based goodness-of-fit tests for time series analysis. Scand. J. Stat., 143–176.
[27] Percival, D. B. and Walden, A. T. (2000) Wavelet Methods for Time Series Analysis. Cambridge University Press.
[28] Romano, J. L. and Thombs, L. A. (1996) Inference for autocorrelations under weak assumptions. J. Am. Stat. Assoc., 590–600.
[29] Shao, X. (2011a) A bootstrap-assisted spectral test of white noise under unknown dependence. J. Econometrics, 213–224.
[30] Shao, X. (2011b) Testing for white noise under unknown dependence and its applications to diagnostic checking for time series models. Econom. Theory, 312–343.
[31] Subba Rao, S. (2006) On some nonstationary, nonlinear random processes and their stationary approximations. Adv. Appl. Probab., 1155–1172.
[32] Zhu, K. (2016) Bootstrapping the portmanteau tests in weak auto-regressive moving average models. J. Royal Stat. Soc. B78