Permutation-based tests for discontinuities in event studies
Federico A. Bugni, Department of Economics, Duke University
Jia Li, Department of Economics, Duke University
July 21, 2020
Abstract
We propose using a permutation test to detect discontinuities in an underlying economic model at a cutoff point. Relative to the existing literature, we show that this test is well suited for event studies based on time-series data. The test statistic measures the distance between the empirical distribution functions of observed data in two local subsamples on the two sides of the cutoff. Critical values are computed via a standard permutation algorithm. Under a high-level condition that the observed data can be coupled by a collection of conditionally independent variables, we establish the asymptotic validity of the permutation test, allowing the sizes of the local subsamples to be either fixed or growing to infinity. In the latter case, we also establish that the permutation test is consistent. We demonstrate that our high-level condition can be verified in a broad range of problems in the infill asymptotic time-series setting, which justifies using the permutation test to detect jumps in economic variables such as volatility, trading activity, and liquidity. An empirical illustration on a recent sample of daily S&P 500 returns is provided.
KEYWORDS: event study, infill asymptotics, jump, permutation tests, randomization tests, semimartingale.
JEL classification codes: C12, C14, C22, C32.
arXiv [econ.EM]
Introduction
Many econometric problems can be expressed in terms of the continuity or the discontinuity of a certain component in the underlying economic model. In an influential paper, Chow (1960) tested the temporal stability in the demand for automobiles, and subsequently stimulated a large literature on structural breaks in time series analysis; see, for example, Andrews (1993), Stock (1994), Bai and Perron (1998), and many references therein. In microeconometrics, the regression discontinuity design (RDD) has been extensively used for causal inference. This literature identifies and estimates an average treatment effect by evaluating discontinuities of conditional expectation functions of outcome and treatment variables at a cutoff point of the running variable; see Imbens and Lemieux (2008) and Lee and Lemieux (2010) for comprehensive reviews. Meanwhile, a more recent high-frequency financial econometrics literature has been devoted to studying discontinuities, or jumps, in various financial time series (e.g., price, volatility, and trading activity). The high-frequency jump literature was pioneered by Barndorff-Nielsen and Shephard (2006), who propose the first nonparametric test for asset price jumps using high-frequency data in an infill asymptotic setting. More recently, Bollerslev et al. (2018) study the jumps of volatility and trading intensity in high-frequency jump regressions (Li et al. (2017)) that closely resemble the classical RDD.
Although these strands of literature involve apparently different terminology and technical tools, they share a common theme: The econometric goal is to learn about differences in the data generating processes between two subsamples separated by the cutoff. Imbens and Kalyanaraman (2011) emphasize that these subsamples should be “local” to the cutoff point, which is quite natural given the nonparametric nature of discontinuity inference (Hahn et al. (2001)). The issue under study is thus a local version of the classical two-sample problem.
Correspondingly, the related inference is often carried out using nonparametric two-sample t-tests, which are based on kernel regressions in the RDD (Hahn et al. (2001), Imbens and Kalyanaraman (2011), Calonico et al. (2014)) or, in the same spirit, spot high-frequency estimators (Foster and Nelson (1996), Comte and Renault (1998), Jacod and Protter (2012), Li et al. (2017), Bollerslev et al. (2018)) in the infill time-series setting.
In an ideal scenario in which the subsamples separated by the cutoff are i.i.d., the permutation test is an excellent tool to detect differences in their distributions. In particular, standard results for randomization inference (Lehmann and Romano (2005, Chapter 15.2)) indicate that a permutation test implemented with an arbitrary test statistic is finite-sample valid under these conditions. The recent literature has investigated the properties of permutation tests under less ideal conditions. One example is Canay and Kamat (2017), who consider an RDD and show that permutation-based inference is asymptotically valid to detect discontinuities in the distribution of the baseline covariates at the cutoff. These authors implement their test with a finite number of observations closest to the cutoff.
Footnote: Coincidentally, the RDD was first proposed by Thistlethwaite and Campbell (1960) around the same time as the Chow test.
In this paper, we apply the permutation test to event studies based on time-series data. We measure the $L_2$ distance between the empirical cumulative distribution functions for the two local subsamples near the cutoff, and compute the critical value via a standard permutation algorithm. As explained earlier, if the data were i.i.d., the behavior of this permutation test would follow directly from standard results for randomization inference. This “off-the-shelf” theory, however, is not applicable here because time-series data observed in a short event window can be serially highly dependent.
The main theoretical contribution of the present paper is to establish the asymptotic validity of the permutation test in this non-standard setting. The theory has two components.
The first is a new generic result for permutation tests. Specifically, we link the (feasible) permutation test formed using the original data with an infeasible test constructed in a “coupling” problem that involves conditionally i.i.d. coupling variables. Since the latter resembles the classical two-sample problem, the infeasible test controls size exactly under the coupling null hypothesis (i.e., coupling variables in the two subsamples are homogeneous), and is consistent under the complementary alternative hypothesis. Under a proper notion of coupling, which is customized for the permutation test, we show that the feasible test inherits the same asymptotic rejection properties from the infeasible one. Since this result is of independent theoretical interest that is well beyond our subsequent analysis in the infill time-series setting, we frame the theory under general high-level conditions so as to facilitate other types of applications.
The second component of our analysis pertains to specializing the generic result to the infill time-series setting designed for event-study applications. The event-study framework is particularly relevant for studying macroeconomic and financial shocks, including monetary shocks triggered by FOMC announcements (Cochrane and Piazzesi (2002), Nakamura and Steinsson (2018a)), or “natural disasters” such as the ongoing COVID-19 pandemic. Following Li and Xiu (2016) and Bollerslev et al. (2018), we model observed data using a general state-space framework, in which the observations are discretely sampled from a latent state process “contaminated” by random disturbances. This model has been used to model variables such as asset returns, trading volume, duration, and bid-ask spread, and readily accommodates both continuously and discretely valued variables.
Under this state-space model, the temporal discontinuity in the data’s distribution is mainly driven by the jump of the latent state process (e.g., asset volatility, trading intensity, and propensity of informed trading), which can be detected by the permutation test. Under easy-to-verify primitive conditions, we construct coupling variables and apply the aforementioned general theory to establish the permutation test’s asymptotic validity.
We recognize two advantages of the proposed permutation test in comparison with the standard approach based on the nonparametric “spot” estimation of the underlying state process. Firstly, the permutation test attains asymptotic size control even if the number of observations in each subsample is fixed. This remarkable property is reminiscent of the finite-sample exactness of the permutation test in the classical two-sample problem for i.i.d. data. In contrast, the nonparametric estimation approach works in a fundamentally different way, as it relies on the asymptotic (mixed) normality of the estimator, which in turn requires the sizes of the local subsamples to grow to infinity. In empirical applications, however, it is often desirable to use a short time window, either to reduce the effect of confounding factors in the background, or simply because of the lack of observations soon after the occurrence of the economic event (say, in a real-time research situation). Not surprisingly, the conventional inference based on asymptotic Gaussianity often results in large size distortions in this “small-sample” scenario, as we demonstrate concretely in a realistically calibrated Monte Carlo experiment (see Section 3). Meanwhile, the permutation test exhibits much more robust size control in finite samples.
The second advantage of the permutation test is its versatility: The same test can be applied in many different empirical contexts without any modification.
On the other hand, the nonparametric estimation approach often relies on specific features of the problem, and needs to be designed on a case-by-case basis. Therefore, the proposed permutation test may be particularly attractive in new empirical environments for which tests based on the conventional approach are not yet developed or not yet well understood. In Section 2.2, we illustrate this point more concretely in the context of testing for volatility jumps. In that case, the standard approach relies crucially on the assumption that the price shocks are Brownian in its design of the spot volatility estimator and the associated t-statistic, and it cannot be adapted easily to accommodate a more general setting with Lévy-driven shocks. The permutation test, on the other hand, is valid even in the latter, more general, setting.
Footnote: For similar types of results in the context of RDD, see Cattaneo et al. (2015), Cattaneo et al. (2017), and Canay and Kamat (2017).
Footnote: To the best of our knowledge, the estimation and inference of the spot volatility (i.e., the scaling process) in the non-Brownian case remains an open question in the literature. There is some limited work on the inference of integrated volatility functionals for the non-Brownian case (see Todorov and Tauchen (2012)), which demonstrates various distinct complications in the non-Brownian setting.
Notation. We use $\|x\|$ to denote the Euclidean norm of a vector $x$. For any real number $a$, we use $\lceil a \rceil$ to denote the smallest integer that is no smaller than $a$. For any constant $p \ge 1$, $\|\cdot\|_p$ denotes the $L_p$ norm for random variables. For two real sequences $a_n$ and $b_n$, we write $a_n \asymp b_n$ if $a_n / C \le b_n \le C a_n$ for some finite constant $C \ge 1$.
We first prove a new result that is broadly useful for establishing the asymptotic validity of permutation tests. Because of its independent theoretical interest, we develop the theory under high-level conditions. In Section 2.2, below, we shall specialize this general result in event-study applications.
Consider a collection $(Y_{n,i})_{i \in \mathcal{I}_n}$ of $\mathbb{R}$-valued observed variables defined on a probability space $(\Omega, \mathcal{F}, P)$, which may be either “raw” data or preliminary estimators. Our econometric goal is to decide whether two subsamples $(Y_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(Y_{n,i})_{i \in \mathcal{I}_{2,n}}$ have “significantly” different distributions, where $(\mathcal{I}_{1,n}, \mathcal{I}_{2,n})$ is a partition of $\mathcal{I}_n$. For ease of exposition, we assume that $\mathcal{I}_{1,n}$ and $\mathcal{I}_{2,n}$ contain the same number of observations, denoted by $k_n$. We stress from the outset that $k_n$ may either be fixed or grow to infinity in the subsequent analysis. As such, our analysis speaks not only to the classical finite-sample analysis of permutation tests, but also to the large-sample analysis routinely used in econometrics.
To implement the test, we first estimate the empirical cumulative distribution functions (CDF) for the two subsamples using
$$\hat{F}_{j,n}(x) \equiv \frac{1}{k_n} \sum_{i \in \mathcal{I}_{j,n}} 1\{Y_{n,i} \le x\}, \quad j \in \{1, 2\}.$$
We then measure their difference via the Cramér-von Mises statistic given by
$$\hat{T}_n \equiv \frac{1}{2 k_n} \sum_{i \in \mathcal{I}_n} \left( \hat{F}_{1,n}(Y_{n,i}) - \hat{F}_{2,n}(Y_{n,i}) \right)^2.$$
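As a minimal sketch of the statistic just defined, the following Python function evaluates the two subsample empirical CDFs at every pooled observation and averages their squared difference. The function name `cvm_statistic` is hypothetical, and the normalization by $2 k_n$ is an assumption of this sketch; any positive constant would yield the same test decision, because the same constant multiplies every permuted statistic.

```python
import numpy as np


def cvm_statistic(y_pre, y_post):
    """Cramer-von Mises distance between the ECDFs of two equal-sized subsamples.

    The 1 / (2 * k) normalization is an assumption of this sketch.
    """
    y_pre = np.asarray(y_pre, dtype=float)
    y_post = np.asarray(y_post, dtype=float)
    k = y_pre.size
    if y_post.size != k:
        raise ValueError("both subsamples must contain k observations")
    pooled = np.concatenate([y_pre, y_post])
    # searchsorted(..., side="right") counts the subsample values <= each pooled point,
    # so dividing by k gives each subsample ECDF evaluated at the pooled observations.
    f1 = np.searchsorted(np.sort(y_pre), pooled, side="right") / k
    f2 = np.searchsorted(np.sort(y_post), pooled, side="right") / k
    return float(np.sum((f1 - f2) ** 2) / (2 * k))
```

The statistic is zero when the two subsamples coincide and grows as their empirical distributions separate.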
For a significance level $\alpha \in (0, 1)$, the test is implemented via the permutation algorithm described below. We use $\pi$ to denote a permutation of the elements of $\mathcal{I}_n$, that is, a bijective mapping from $\mathcal{I}_n$ to itself. Let $G_n$ denote the collection of all possible permutations of $\mathcal{I}_n$, with $M_n$ being its cardinality.
Algorithm 1.
Step 1. For each permutation $\pi \in G_n$, compute the permuted test statistic $\hat{T}_n(\pi)$ as $\hat{T}_n$, but with $(Y_{n,i})_{i \in \mathcal{I}_n}$ replaced by $(Y_{n,\pi(i)})_{i \in \mathcal{I}_n}$.
Step 2. Order $\{\hat{T}_n(\pi) : \pi \in G_n\}$ as $\hat{T}_n^{(1)} \le \cdots \le \hat{T}_n^{(M_n)}$. Set $\hat{T}_n^{*} = \hat{T}_n^{(k)}$ for $k = \lceil M_n (1 - \alpha) \rceil$.
Step 3. If $\hat{T}_n > \hat{T}_n^{*}$, reject the null hypothesis. If $\hat{T}_n < \hat{T}_n^{*}$, do not reject the null hypothesis. If $\hat{T}_n = \hat{T}_n^{*}$, reject the null hypothesis with probability $\hat{p}_n \equiv (M_n \alpha - \hat{M}_n^{+}) / \hat{M}_n^{0}$, where $\hat{M}_n^{+}$ and $\hat{M}_n^{0}$ are the cardinalities of $\{j : \hat{T}_n^{(j)} > \hat{T}_n^{*}\}$ and $\{j : \hat{T}_n^{(j)} = \hat{T}_n^{*}\}$, respectively. The resulting test then rejects according to
$$\hat{\phi}_n \equiv 1\{\hat{T}_n > \hat{T}_n^{*}\} + \hat{p}_n 1\{\hat{T}_n = \hat{T}_n^{*}\}. \quad \square$$
Remark 2.1.
The test $\hat{\phi}_n$ specified in Algorithm 1 is a randomized test and has a random outcome when $\hat{T}_n = \hat{T}_n^{*}$. One can construct a non-randomized (and more conservative) version by replacing $\hat{p}_n$ with zero. Also, in practice, $M_n$ may be too large to consider $G_n$ in its entirety. In such cases, we could replace $G_n$ with a random subset of it, denoted by $\hat{G}_n$, and composed of the identity permutation and a random sample of permutations drawn from $G_n$. All of the formal results in this paper would apply if we use $\hat{G}_n$ instead of $G_n$ in Algorithm 1.
Footnote: All of our results can be easily extended to the case when $\mathcal{I}_{1,n}$ and $\mathcal{I}_{2,n}$ have different sizes, but with the same order of magnitude.
If the data $(Y_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d., then the null hypothesis of the classical two-sample problem holds, and Lehmann and Romano (2005, Theorem 15.2.1) implies that the aforementioned permutation test has exact size control in finite samples. This is a remarkable property of the permutation test, as it holds without requiring any specific distributional assumptions on the data. In contrast to the classical two-sample problem, however, we shall not assume that the data are independent, or even “weakly” dependent (e.g., mixing). As mentioned in the Introduction, the main goal of this paper is to study the permutation test for time-series data observed within a short event window (say, a few days or hours), which can be serially highly dependent in practice. Our key theoretical insight is that the permutation test is still asymptotically valid if the data $(Y_{n,i})_{i \in \mathcal{I}_n}$ can be approximated, or “coupled,” by another collection of variables that are conditionally independent, as formalized by the following assumption.
Assumption 2.1.
There exists a collection of variables $(U_{n,i})_{i \in \mathcal{I}_n}$ such that the following conditions hold for a sequence $(\mathcal{G}_n)_{n \ge 1}$ of $\sigma$-fields: (i) for each $n \ge 1$, the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally independent, and $U_{n,i}$ has the same $\mathcal{G}_n$-conditional distribution as $U_{n,j}$ if $i, j$ belong to the same subsample (i.e., $\mathcal{I}_{1,n}$ or $\mathcal{I}_{2,n}$); (ii) for any real sequence $\eta_n = o(1)$, we have $\sup_{x \in \mathbb{R}} P(|U_{n,i} - x| \le \eta_n \mid \mathcal{G}_n) = O_p(\eta_n)$; (iii) $\max_{i \in \mathcal{I}_n} |\tilde{Y}_{n,i} - U_{n,i}| = o_p(k_n^{-2})$, where $(\tilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ is an identical copy of $(Y_{n,i})_{i \in \mathcal{I}_n}$ in $\mathcal{G}_n$-conditional distribution.
Assumption 2.1 lays out the high-level structure for bridging our analysis with the classical theory on permutation tests, which we carry out in Theorem 2.1 below. Condition (i) sets up the “coupling” problem, which corresponds to a conditional version of the classical two-sample problem, treating the $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$ variables as “data.” In part (a) of Theorem 2.1, we consider the situation in which both subsamples have the same conditional distribution. In this case, our coupling variables $(U_{n,i})_{i \in \mathcal{I}_n}$ give rise to an infeasible permutation test that can be analyzed as a classical two-sample problem. In particular, this infeasible permutation test attains the exact finite-sample size under our conditions.
This infeasible test, however, only plays an auxiliary role in our analysis, because our interest is in the feasible test $\hat{\phi}_n$ formed using the original $(Y_{n,i})_{i \in \mathcal{I}_n}$ data. Therefore, a key component of our theoretical argument in Theorem 2.1 is to show that the feasible test for the original data inherits asymptotically the same rejection properties from the infeasible test. Conditions (ii) and (iii) in Assumption 2.1 are introduced for this purpose.
Specifically, condition (ii) requires the variable $U_{n,i}$ to be non-degenerate, in the sense that its conditional probability mass within any small $[x - \eta, x + \eta]$ interval is of order $O(\eta)$ in probability. Condition (iii) specifies the requisite closeness between the coupling variables and the original data, which depends on the subsample size $k_n$, as detailed in Section 2.2.
Footnote: Condition (ii) is satisfied if the conditional probability densities of $U_{n,i}$, $n \ge 1$, exist and are uniformly bounded in probability.
Theorem 2.1.
Under Assumption 2.1, the following statements hold for the permutation test $\hat{\phi}_n$ described in Algorithm 1:
(a) If the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d., we have $E[\hat{\phi}_n] \to \alpha$.
(b) Let $Q_{j,n}(\cdot)$ denote the $\mathcal{G}_n$-conditional distribution function of $U_{n,i}$ for $i \in \mathcal{I}_{j,n}$ and $j \in \{1, 2\}$, and $Q_n = (Q_{1,n} + Q_{2,n}) / 2$. If $k_n \to \infty$ and $P\left( \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n \right) \to 1$ for any real sequence $\delta_n = o(1)$, we have $E[\hat{\phi}_n] \to 1$.
Theorem 2.1 characterizes the asymptotic rejection probabilities of the feasible test $\hat{\phi}_n$ under the null and alternative hypotheses of the two-sample problem for the coupling variables. Part (a) pertains to the situation in which the two subsamples of coupling variables, $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$, have the same conditional distribution, which corresponds to the null hypothesis. In this case, the theorem shows that the asymptotic rejection probability of the feasible test is equal to the nominal level $\alpha$. It is relevant to note that this result holds whether $k_n$ is fixed or divergent. This property is clearly reminiscent of the permutation test’s finite-sample exactness in the classical setting.
Part (b) of Theorem 2.1 concerns the power of the feasible test $\hat{\phi}_n$. It shows that the feasible test rejects with probability approaching one when the conditional distributions of the two coupling subsamples, $Q_{1,n}$ and $Q_{2,n}$, are different, in the sense that their “distance” measured by $\int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x)$ is asymptotically non-degenerate, where the mixture distribution $Q_n$ captures approximately the distribution of the permuted data. This consistency-type result requires that the information available from each subsample grows with the sample size, i.e., $k_n \to \infty$. This result appears to be new in the context of permutation-based tests under a fixed alternative for the coupling variables.
In particular, we note that an analogous result is unavailable in Canay and Kamat (2017), as they restrict attention to an asymptotic framework with a fixed $k_n$. Our proof relies on applying Lehmann and Romano (2005, Theorem 15.2.3) to the infeasible test, for which we use the coupling construction developed by Chung and Romano (2013) to show that the so-called Hoeffding (1952) condition is satisfied.
Theorem 2.1 establishes the relation between the rejection probability of the feasible test $\hat{\phi}_n$ and the homogeneity (or the lack of it) across the two coupling subsamples $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$. This result does not speak directly to hypotheses formulated in terms of the original $(Y_{n,i})_{i \in \mathcal{I}_n}$ data.
Footnote: In our applications, we can often verify condition (iii) with $\tilde{Y}_{n,i} = Y_{n,i}$. Nonetheless, allowing $\tilde{Y}_{n,i} \ne Y_{n,i}$ is useful when $Y_{n,i}$ is itself an estimator. For example, if $(Y_{n,i})_{i \in \mathcal{I}_n}$ is a finite collection of estimators that converge jointly in distribution, then the coupling can be obtained via Skorokhod representation; see Canay and Kamat (2017) for an application of this type.
To illustrate, consider the following running example. Let $Y_{n,i} = \Delta_n^{-1/2} (P_{(i+1)\Delta_n} - P_{i\Delta_n})$ be the scaled increment of the asset price process $P_t$ over the $i$th sampling interval $(i\Delta_n, (i+1)\Delta_n]$. Let $\tau = i^{*} \Delta_n$ be a “cutoff” time point of interest (e.g., the announcement time of a news release), and consider two index sets $\mathcal{I}_{1,n} = \{i^{*} - k_n, \ldots, i^{*} - 1\}$ and $\mathcal{I}_{2,n} = \{i^{*} + 1, \ldots, i^{*} + k_n\}$, which collect observations before and after the cutoff, respectively. We consider an asymptotic setting in which these subsamples are “local” in calendar time, that is, $k_n \Delta_n \to 0$. Note that this implies that $\Delta_n \to 0$, which means that we are considering an infill asymptotic setting. If $P_t$ is an Itô process with respect to an information filtration $(\mathcal{F}_t)_{t \ge 0}$, we may represent $Y_{n,i}$ as
$$Y_{n,i} = \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} b_s \, ds + \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} \sigma_s \, dW_s, \quad \text{for } i \in \mathcal{I}_n, \quad (1)$$
where $b_t$ is the drift process, $\sigma_t$ is the stochastic volatility process, and $W_t$ is a standard Brownian motion. If the $\sigma_t$ process is smooth (e.g., Hölder continuous) in a local neighborhood before $\tau$, then the volatility throughout the pre-event subsample $\mathcal{I}_{1,n}$ is approximately $\sigma_{(i^{*} - k_n)\Delta_n}$. Further recognizing that the drift term is negligible relative to the Brownian component, we can approximate $Y_{n,i}$ for each $i \in \mathcal{I}_{1,n}$ using the coupling variables
$$U_{n,i} = \sigma_{(i^{*} - k_n)\Delta_n} \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n}) \sim \mathcal{MN}\left(0, \sigma^2_{(i^{*} - k_n)\Delta_n}\right), \quad (2)$$
where $\mathcal{MN}$ denotes the mixed normal distribution. Since the Brownian motion has independent and stationary increments, it is easy to see that the coupling variables $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ are $\mathcal{F}_{(i^{*} - k_n)\Delta_n}$-conditionally i.i.d. Moreover, if the volatility process $\sigma_t$ does not jump at the cutoff time $\tau$, we may follow the same logic to extend the approximation in (2) further to $i \in \mathcal{I}_{2,n}$. In other words, if the volatility process does not jump, then the coupling variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are conditionally i.i.d., which corresponds to the situation in part (a) of Theorem 2.1. On the other hand, if the volatility process jumps at time $\tau$, say by a constant $c \ne 0$, then the coupling variables for the $\mathcal{I}_{2,n}$ subsample will instead take the form $U_{n,i} = (\sigma_{(i^{*} - k_n)\Delta_n} + c) \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$.
In this case, the two subsamples of $U_{n,i}$’s have distinct conditional distributions (i.e., mixed normal with different conditional variances), corresponding to the scenario in part (b) of Theorem 2.1.
Within the context of this illustrative example, we can further clarify a key feature of the proposed test that holds more generally. It is not aimed at detecting “small” time-variations in the distribution of the observed data. In fact, by allowing the drift $b_t$ and the volatility $\sigma_t$ to be time-varying, a smooth form of heterogeneity is always built in. The test instead detects abrupt changes, or discontinuities, in the evolution of the distribution, which can be more plausibly associated with the “lumpy” information carried by the underlying economic announcement, as emphasized by Nakamura and Steinsson (2018b). Specifically in this example, the asset returns are locally centered Gaussian (due to the assumption that the price is an Itô process), and hence, the temporal discontinuity in the return distribution manifests itself as a volatility jump. The empirical scope of our permutation test, however, is far beyond the volatility-jump testing depicted in this illustration, as we shall demonstrate in the remainder of the paper.
Footnote: Note that $\mathcal{I}_n$ does not include the $i^{*}$th return observation. Therefore, although the returns in (1) do not contain price jumps, an event-induced price jump is allowed to occur at time $\tau$.
We now specialize the generic Theorem 2.1 into an infill asymptotic time-series setting that is particularly suitable for event studies. By introducing a mild additional econometric structure, we shall establish the asymptotic validity of the permutation test under more primitive conditions that are easy to verify in a variety of concrete empirical settings. As in the running example above, we consider an event occurring at time $\tau = i^{*} \Delta_n$, which separates two subsamples indexed by $\mathcal{I}_{1,n} = \{i^{*} - k_n, \ldots, i^{*} - 1\}$ and $\mathcal{I}_{2,n} = \{i^{*} + 1, \ldots, i^{*} + k_n\}$, respectively. All limits in the sequel are obtained under the infill asymptotic setting with $\Delta_n \to 0$. We model the observed data via the state-space representation
$$Y_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}) + R_{n,i}, \quad i \in \mathcal{I}_n, \quad (3)$$
where the state process $\zeta_t$ is càdlàg, adapted to a filtration $\mathcal{F}_t$, and takes values in an open set $\mathcal{Z} \subseteq \mathbb{R}^{\dim(\zeta)}$; $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. random disturbances taking values in some (possibly abstract) space $\mathcal{E}$; $g(\cdot, \cdot)$ is a “smooth” transform; and $R_{n,i}$ is a residual term that is negligible relative to the leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ in a proper sense detailed below.
A simpler version of this state-space model without the $R_{n,i}$ residual term has been used by Li and Xiu (2016) and Bollerslev et al. (2018), among others, for modeling market variables such as trading volume and bid-ask spread. By introducing the $R_{n,i}$ term, we can use a unified framework to accommodate a broader class of models, which in particular include increments of an Itô semimartingale. We now revisit the model in (1) as the first illustration.
Example 1 (Brownian Asset Returns). We represent the Itô-process model (1) for asset returns in the form of (3) by setting $\zeta_t = \sigma_t$, $\epsilon_{n,i} = \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$, and $g(z, \epsilon) = z\epsilon$. The resulting residual term $R_{n,i}$ has the form
$$R_{n,i} = \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} b_s \, ds + \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} (\sigma_s - \sigma_{i\Delta_n}) \, dW_s. \quad (4)$$
Under mild and fairly standard regularity conditions, it is easy to show that the $R_{n,i}$ terms are uniformly $o_p(1)$. On the other hand, the leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ has a non-degenerate centered mixed Gaussian distribution with conditional variance $\sigma^2_{i\Delta_n}$. $\square$
This running example further illustrates the distinct roles played by $\zeta_t$, $\epsilon_{n,i}$, and $R_{n,i}$ in our state-space model (3). The leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ captures the “main feature” of the observed data; in addition, since the $\epsilon_{n,i}$ disturbance terms are i.i.d., any “large” change in the empirical distribution across the two subsamples must be attributed to the time-$\tau$ discontinuity in the state process $\zeta_t$.
From this description, it follows that the hypothesis test for the continuity of the distribution of the main feature of the observed data can be formulated as
$$H_0 : \Delta\zeta_\tau = 0 \quad \text{versus} \quad H_a : \Delta\zeta_\tau \ne 0, \quad (5)$$
where $\Delta\zeta_\tau \equiv \zeta_\tau - \zeta_{\tau-} \equiv \zeta_\tau - \lim_{s \uparrow \tau} \zeta_s$ denotes the jump of the state process at time $\tau$.
With the state-space model (3) in place, we can design more primitive sufficient conditions for establishing the asymptotic validity of the permutation test under the hypotheses in (5). We need some additional notation to describe these conditions. For each fixed $z \in \mathcal{Z}$, let $f_z(\cdot)$ and $F_z(\cdot)$ denote the probability density function (PDF) and the CDF of the random variable $g(z, \epsilon_{n,i})$, respectively. It is also convenient to introduce a “shifted” version of $\zeta_t$ defined as $\tilde{\zeta}_t \equiv \zeta_t - \Delta\zeta_\tau 1\{t \ge \tau\}$, which has the same increments as $\zeta_t$ over time intervals not containing $\tau$.
Assumption 2.2. (i) The variables $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. and, for each $k \in \mathcal{I}_n$, the variables $(\epsilon_{n,i})_{i \ge k}$ are independent of $\mathcal{F}_{k\Delta_n}$. Moreover, for any compact subset $\mathcal{K} \subseteq \mathcal{Z}$, we have (ii) $\sup_{x \in \mathbb{R}, z \in \mathcal{K}} f_z(x) < \infty$; and (iii) $\inf_{z \in \mathcal{K}} \int_{\mathbb{R}} (F_z(x) - F_{z+c}(x))^2 \, dF_z(x) > 0$ whenever $c \ne 0$.
Assumption 2.3.
There exist a sequence $(T_m)_{m \ge 1}$ of stopping times increasing to infinity, a sequence of compact subsets $(\mathcal{K}_m)_{m \ge 1}$ of $\mathcal{Z}$, and a sequence $(K_m)_{m \ge 1}$ of constants such that for some real sequence $a_n \ge 1$ and each $m \ge 1$: (i) $\|g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i})\|_2 \le K_m a_n \|z - z'\|$ for all $z, z' \in \mathcal{K}_m$; (ii) $\zeta_t$ takes values in $\mathcal{K}_m$ for all $t \le T_m$, and $\|\tilde{\zeta}_{t \wedge T_m} - \tilde{\zeta}_{s \wedge T_m}\|_2 \le K_m |t - s|^{1/2}$ for all $t, s$ in some fixed neighborhood of $\tau$; (iii) $\max_{i \in \mathcal{I}_n} |R_{n,i}| = o_p(k_n^{-2})$.
Assumption 2.2 entails regularity conditions pertaining to the random disturbance terms, which are often easy to verify in concrete examples as demonstrated later in this subsection. Assumption 2.3 imposes a set of smoothness conditions that permit the approximation of the observed data using properly constructed coupling variables. Specifically, condition (i) requires that the random function $z \mapsto g(z, \epsilon_{n,i})$ is Lipschitz in $z$ over compact sets under the $L_2$ distance. The $a_n$ sequence captures the scale of the Lipschitz coefficient. In many applications, we can verify this condition with $a_n \equiv 1$, but allowing $a_n$ to diverge to infinity is sometimes necessary (see Example 2 below). Condition (ii) states that the $\zeta_t$ process is locally compact (up to each stopping time $T_m$) and, upon removing the fixed-time discontinuity at $\tau$, it is $(1/2)$-Hölder continuous under the $L_2$ norm. This Hölder-continuity requirement can be easily verified using well-known results provided that the $\tilde{\zeta}$ process is an Itô semimartingale or a long-memory process (see Jacod and Protter (2012, Chapter 2) and Li and Liu (2020)). Condition (iii) imposes the requisite assumptions on the residual terms. In some applications, this condition holds trivially with $R_{n,i} \equiv 0$, but, more generally, it needs to be verified on a case-by-case basis using (relatively standard) infill asymptotic techniques.
Footnote: Note that the assumption is framed in a localized fashion using the stopping times $(T_m)_{m \ge 1}$, which is a standard technique for weakening the regularity condition in the infill asymptotic setting. See Jacod and Protter (2012, Section 4.4.1) for a comprehensive discussion on the localization technique.
Theorem 2.2, below, establishes the size and power properties of the permutation test under the hypotheses described in (5).
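To fix ideas, the following Python sketch applies Algorithm 1 to simulated data in the spirit of Example 1. It rests on several assumptions that are not part of the paper: the volatility is held constant on each side of the cutoff, so the scaled returns are i.i.d. Gaussian within each subsample; the permutation group is replaced by the identity plus `n_perm` random permutations, as suggested in Remark 2.1; and the names `count_stat`, `permutation_test`, and `simulate_returns` are hypothetical. The statistic is kept in integer form (squared differences of counts), which is proportional to the Cramér-von Mises statistic and makes the tie comparison in Step 3 exact.

```python
import numpy as np


def count_stat(y, k):
    """Integer version of the CvM statistic for the subsamples y[:k] and y[k:].

    Returns sum_i (c1_i - c2_i)^2, where cj_i counts subsample-j values <= y[i].
    This is a positive multiple of the CvM statistic, so rankings are unchanged.
    """
    c1 = np.searchsorted(np.sort(y[:k]), y, side="right")
    c2 = np.searchsorted(np.sort(y[k:]), y, side="right")
    return int(np.sum((c1 - c2) ** 2))


def permutation_test(y_pre, y_post, alpha=0.05, n_perm=999, seed=0):
    """Return the rejection probability of the randomized test (1.0 = reject)."""
    rng = np.random.default_rng(seed)
    y = np.concatenate([y_pre, y_post]).astype(float)
    k = len(y_pre)
    t_obs = count_stat(y, k)
    # Step 1: identity permutation first, then random draws (Remark 2.1).
    stats = np.array([t_obs] + [count_stat(rng.permutation(y), k)
                                for _ in range(n_perm)])
    m = stats.size
    # Step 2: critical value is the ceil(m * (1 - alpha))-th order statistic;
    # the small offset guards against floating-point error in the product.
    idx = int(np.ceil(m * (1 - alpha) - 1e-9))
    t_star = np.sort(stats)[idx - 1]
    # Step 3: reject, randomize on ties, or do not reject.
    m_plus = int(np.sum(stats > t_star))
    m_zero = int(np.sum(stats == t_star))
    if t_obs > t_star:
        return 1.0
    if t_obs == t_star:
        return (m * alpha - m_plus) / m_zero
    return 0.0


def simulate_returns(k, sigma_pre=1.0, jump=0.0, seed=0):
    """Scaled returns with constant volatility before and after the cutoff."""
    rng = np.random.default_rng(seed)
    y_pre = sigma_pre * rng.standard_normal(k)
    y_post = (sigma_pre + jump) * rng.standard_normal(k)
    return y_pre, y_post
```

Calling `permutation_test(*simulate_returns(300, jump=2.0))` runs the test with a volatility jump of 2 at the cutoff, while `jump=0.0` corresponds to the null hypothesis in (5). A returned value strictly between 0 and 1 is the randomization probability $\hat{p}_n$ of Step 3.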
Theorem 2.2.
In the state-space model (3), suppose that Assumptions 2.2 and 2.3 hold, and that $a_n k_n^3 \Delta_n^{1/2} = o(1)$. Then, the following statements hold for the permutation test $\hat{\phi}_n$ described in Algorithm 1:
(a) Under the null hypothesis in (5), i.e., $\Delta\zeta_\tau = 0$, we have $E[\hat{\phi}_n] \to \alpha$;
(b) Under a fixed alternative hypothesis in (5), i.e., $\Delta\zeta_\tau = c$ for some (unknown) constant $c \ne 0$, we have $E[\hat{\phi}_n] \to 1$ when $k_n \to \infty$.
This theorem is proved by verifying the high-level conditions in Theorem 2.1 with properly constructed coupling variables analogous to those in equation (2). The condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ mainly requires that the window size $k_n$ does not grow too fast, which ensures the closeness between the coupling variables and the original data. In the typical case with $a_n = 1$, it reduces to $k_n = o(\Delta_n^{-1/6})$. Part (a) shows that the permutation test attains the desired asymptotic level under the null hypothesis in (5). Again, we stress that the test has valid asymptotic size control even in the “small-sample” case with fixed $k_n$. As in Theorem 2.1, the “large-sample” condition $k_n \to \infty$ is needed for establishing the consistency of the test under the alternative, as shown in part (b).
In the remainder of this subsection, we use a few prototype examples to demonstrate how the proposed test may be used in various empirical settings. In particular, we show how to cast the specific problems into the approximate state-space model (3), and discuss how to verify our sufficient regularity conditions. We start by revisiting the running example.
Example 1 (Brownian Asset Returns, Continued). Recall that $\epsilon_{n,i} \equiv \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$, $\zeta_t = \sigma_t$, and $g(z, \epsilon) = z\epsilon$. In this context, the hypothesis testing problem in (5) represents a test of the continuity of the volatility process $\sigma_t$ at time $t = \tau$, i.e.,
$$H_0 : \Delta\sigma_\tau = 0 \quad \text{versus} \quad H_a : \Delta\sigma_\tau \ne 0.$$
We suppose that the volatility process $\sigma_t$ is non-degenerate by setting its domain to $\mathcal{Z} = (0, \infty)$. Since the Brownian motion has independent increments with respect to the underlying filtration, the disturbance term $\epsilon_{n,i}$ satisfies Assumption 2.2(i). In addition, for each point $z \in \mathcal{Z}$, the random variable $g(z, \epsilon_{n,i})$ has an $N(0, z^2)$ distribution. It is then easy to see that conditions (ii) and (iii) in Assumption 2.2 hold for any compact subset $\mathcal{K} \subseteq \mathcal{Z}$ (note that $\mathcal{K}$ is necessarily bounded away from zero). To verify Assumption 2.3, first note that $g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i}) = (z - z')\epsilon_{n,i}$, and hence, $\|g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i})\| = |z - z'|$. Assumption 2.3(i) thus holds for $a_n = 1$. It is well known that $\sigma_t$ is locally (1/2)-Hölder continuous under the $L_2$ norm if it is an Itô semimartingale or a long-memory process; if so, Assumption 2.3(ii) is satisfied if the $\sigma_t$ and $\sigma_t^{-1}$ processes are both locally bounded. Finally, to verify Assumption 2.3(iii), we assume that the drift process $b_t$ is locally bounded. It is then easy to show via routine calculations that $\max_{i \in \mathcal{I}_n} |R_{n,i}| = O_p(k_n^{1/2}\Delta_n^{1/2})$. Since the condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ in Theorem 2.2 implies that $O_p(k_n^{1/2}\Delta_n^{1/2}) = o_p(k_n^{-2})$, we have $\max_{i \in \mathcal{I}_n} |R_{n,i}| = o_p(k_n^{-2})$ as needed in Assumption 2.3(iii). All conditions in Theorem 2.2 are now verified, and this shows that the permutation test $\hat{\phi}_n$ is asymptotically valid for testing the null hypothesis $\Delta\sigma_\tau = 0$. $\square$

Example 1 shows that the permutation test $\hat{\phi}_n$ is asymptotically valid for testing the presence of a volatility jump. This is a relatively familiar problem in the literature.
It is therefore useful to contrast the proposed permutation test with the standard approach, which is based on nonparametric "spot" estimators of the asset price's instantaneous variances before and after the event time given by, respectively,
\[ \hat{\sigma}^2_{\tau-} = \frac{1}{k_n} \sum_{i \in \mathcal{I}_{1,n}} Y_{n,i}^2, \qquad \hat{\sigma}^2_{\tau} = \frac{1}{k_n} \sum_{i \in \mathcal{I}_{2,n}} Y_{n,i}^2. \tag{6} \]
Assuming $k_n \to \infty$ and $k_n \Delta_n \to 0$, it can be shown that (see Jacod and Protter (2012, Chapter 13))
\[ \frac{k_n^{1/2}\left(\hat{\sigma}^2_{\tau} - \hat{\sigma}^2_{\tau-} - (\sigma^2_{\tau} - \sigma^2_{\tau-})\right)}{\sqrt{2\hat{\sigma}^4_{\tau} + 2\hat{\sigma}^4_{\tau-}}} \xrightarrow{d} N(0, 1). \tag{7} \]
Thus, we can test $H_0: \Delta\sigma_\tau = 0$ by comparing the t-statistic $k_n^{1/2}(\hat{\sigma}^2_{\tau} - \hat{\sigma}^2_{\tau-}) / \sqrt{2\hat{\sigma}^4_{\tau} + 2\hat{\sigma}^4_{\tau-}}$ with critical values based on the standard normal distribution.

Two remarks are in order. First, note that the asymptotic size control of the standard approach relies on the asymptotic normal approximation (7), which depends crucially on $k_n \to \infty$ (in addition to having $\Delta_n \to 0$) because the underlying central limit theorem is obtained by aggregating a "large" number of martingale differences. Hence, the t-test may suffer from severe size distortion when $k_n$ is relatively small. This issue is empirically relevant because an applied researcher may use a short time window to capture short-lived "impulse-like" dynamics and/or to minimize the impact of other confounding economic factors in the background. Moreover, for "real-time" applications, the researcher may have no choice but to use a small $k_n$ simply because of the limited amount of available data soon after the event time $\tau$. In sharp contrast, the permutation test controls asymptotic size even when $k_n$ is fixed. This remarkable property is inherited from the coupling two-sample problem, in which the permutation test controls size exactly regardless of whether $k_n$ is fixed or grows to infinity.

The second and perhaps practically more important difference between the two tests is that the permutation test is more versatile. Under the spot-estimation-based approach, both the design of the spot estimators in (6) and the convergence in (7) depend heavily on the fact that the increments of the Brownian motion are not only i.i.d., but also Gaussian. Gaussianity is obviously essential for the conventional approach because, among other things, it ensures that the instantaneous variance of the normalized returns is well defined. The permutation test, on the other hand, only exploits the i.i.d. property of the Brownian shocks, without relying on the Gaussianity. Therefore, the permutation test readily accommodates a more general model for asset returns with Lévy shocks, as we demonstrate in the following example.
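For concreteness, the benchmark t-statistic built from (6) and (7) can be sketched as follows. This is a minimal illustration, assuming the two local windows are passed as arrays of normalized returns of equal length $k_n$; the function and argument names are hypothetical and do not come from the paper.

```python
import numpy as np

def spot_variance_t_stat(y_before, y_after):
    """t-statistic for H0: no volatility jump, following (6) and (7).

    y_before, y_after: normalized returns Y_{n,i} in the pre-event and
    post-event windows (hypothetical argument names), each of length k_n.
    """
    y_before = np.asarray(y_before, dtype=float)
    y_after = np.asarray(y_after, dtype=float)
    if y_before.shape != y_after.shape:
        raise ValueError("both windows must contain k_n observations")
    k_n = y_before.size
    var_before = np.mean(y_before ** 2)  # spot variance estimate before the event
    var_after = np.mean(y_after ** 2)    # spot variance estimate after the event
    # The factor 2 reflects the Gaussian fourth moment of the normalized returns.
    scale = np.sqrt(2.0 * var_after ** 2 + 2.0 * var_before ** 2)
    return np.sqrt(k_n) * (var_after - var_before) / scale

# Illustrative use with simulated Gaussian returns (seed and sizes are arbitrary).
rng = np.random.default_rng(0)
k_n = 30
t_null = spot_variance_t_stat(rng.standard_normal(k_n), rng.standard_normal(k_n))
t_jump = spot_variance_t_stat(rng.standard_normal(k_n), 3.0 * rng.standard_normal(k_n))
```

A two-sided test at the 5% level would reject when the absolute value of the statistic exceeds 1.96, a comparison that is only justified by (7) when $k_n$ is large.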
Example 2 (Lévy-driven Asset Returns). We generalize the model in Example 1 by replacing the Brownian motion $W$ with a Lévy martingale $L$, so that the asset return has the form
\[ P_{(i+1)\Delta_n} - P_{i\Delta_n} = \int_{i\Delta_n}^{(i+1)\Delta_n} b_s\,ds + \int_{i\Delta_n}^{(i+1)\Delta_n} \sigma_s\,dL_s, \quad \text{for } i \in \mathcal{I}_n. \]
In this case, we define the random disturbance as $\epsilon_{n,i} \equiv \Delta_n^{-1/\beta}(L_{(i+1)\Delta_n} - L_{i\Delta_n})$ for some constant $\beta \in (1, 2]$. The scaling factor $\Delta_n^{-1/\beta}$ is used to ensure that $\epsilon_{n,i}$ has a non-degenerate distribution. For instance, if $L$ is a stable process, we take $\beta$ to be its jump-activity index, so that $\epsilon_{n,i}$ has a centered stable distribution (recall that the Brownian motion is a stable process with index $\beta = 2$). We treat the value of $\beta$ as unknown. Since the permutation test is scale-invariant with respect to the data, we can nonetheless regard the normalized return $Y_{n,i} = \Delta_n^{-1/\beta}(P_{(i+1)\Delta_n} - P_{i\Delta_n})$ as directly observable (because tests implemented for $P_{(i+1)\Delta_n} - P_{i\Delta_n}$ and $Y_{n,i}$ are identical). To apply our theory, we represent $Y_{n,i}$ using the state-space model (3) with $\zeta_t = \sigma_t$, $g(z, \epsilon) = z\epsilon$, and the residual term given by
\[ R_{n,i} = \Delta_n^{-1/\beta} \int_{i\Delta_n}^{(i+1)\Delta_n} b_s\,ds + \Delta_n^{-1/\beta} \int_{i\Delta_n}^{(i+1)\Delta_n} (\sigma_s - \sigma_{i\Delta_n})\,dL_s. \]
Recognizing that the scaled Lévy increments $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d., we can verify Assumptions 2.2 and 2.3 using similar arguments as in Example 1 but with $a_n = \Delta_n^{1/2 - 1/\beta}$, which depicts the rate at which $\|\epsilon_{n,i}\|$ diverges. In particular, the condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ requires $k_n$ to obey $k_n = o(\Delta_n^{(1/\beta - 1)/3})$. Then, we can apply Theorem 2.2 to show that the permutation test $\hat{\phi}_n$ is asymptotically valid for testing the discontinuity in the volatility process $\sigma_t$ at time $\tau$, regardless of whether the driving Lévy process is a Brownian motion or not. $\square$

Footnote: Recall that many distributions used in continuous-time models do not have finite second moments. For example, within the class of stable distributions, the Gaussian distribution is the only one with a finite second moment. Moreover, Gaussianity also implies that the variance of $\Delta_n^{-1}(W_{i\Delta_n} - W_{(i-1)\Delta_n})^2$ is 2, which explains the "2" factor in the denominator of the t-statistic.
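The scale-invariance invoked in Example 2 can be checked numerically. The sketch below assumes a statistic that depends on the data only through empirical distribution functions evaluated at the pooled sample (hence only through ranks); the statistic, the value of $\Delta_n$, and all names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def ecdf_gap_statistic(y1, y2):
    """Average squared gap between the two empirical CDFs over the pooled sample."""
    s1 = np.sort(np.asarray(y1, dtype=float))
    s2 = np.sort(np.asarray(y2, dtype=float))
    pooled = np.concatenate([s1, s2])
    ecdf1 = np.searchsorted(s1, pooled, side="right") / s1.size
    ecdf2 = np.searchsorted(s2, pooled, side="right") / s2.size
    return np.mean((ecdf1 - ecdf2) ** 2)

rng = np.random.default_rng(1)
raw = 1e-3 * rng.standard_normal(20)   # stand-in for raw price increments
delta_n, beta = 1.0 / 390.0, 1.5       # illustrative sampling interval and index
scaled = raw * delta_n ** (-1.0 / beta)  # normalized returns Y_{n,i}

t_raw = ecdf_gap_statistic(raw[:10], raw[10:])
t_scaled = ecdf_gap_statistic(scaled[:10], scaled[10:])
# The two statistics coincide because a positive rescaling preserves all ranks.
```

Because every permuted statistic is rescaled in the same way, the critical value and the test decision are also unchanged, which is why $\beta$ does not need to be known.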
So far, we have illustrated the use of the permutation test for high-frequency asset returns data. Under the settings of Examples 1 and 2, the distributional change of asset returns is mainly driven by the time-$\tau$ discontinuity in volatility, and hence, the permutation test is effectively a test for volatility jumps. Example 2, in particular, highlights the versatility and robustness of the permutation test compared with the conventional approach based on spot estimation. Going one step further, we now illustrate how to apply the permutation test to other types of economic variables.

Example 3 (Location-Scale Model for Volume). Consider a simple model for trading volume, under which the volume within the $i$th sampling interval is given by $Y_{n,i} = \mu_{i\Delta_n} + v_{i\Delta_n}\epsilon_{n,i}$. The $\mu_t$ location process captures the local mean, or trading intensity, and the $v_t$ scale process captures the time-varying heterogeneity in the order size. This location-scale model fits directly into the state-space model (3) with $\zeta_t = (\mu_t, v_t)$, $g((\mu, v), \epsilon) = \mu + v\epsilon$, and $R_{n,i} \equiv 0$. Let $\mathcal{F}_t$ be the filtration generated by the $\zeta_t$ process. If $\epsilon_{n,i}$ is independent of the $\zeta_t$ process and has finite second moment and bounded PDF, then it is easy to verify Assumptions 2.2 and 2.3 with $a_n = 1$. Theorem 2.2 thus implies that the permutation test is valid for testing the discontinuity in $\zeta_t = (\mu_t, v_t)$ at time $\tau$. $\square$

The location-scale structure in Example 3 is by no means essential in applications, because the permutation test is valid provided that the more general conditions in Assumptions 2.2 and 2.3 hold. This illustration is pedagogically convenient, in that it permits a straightforward verification of our high-level conditions. That being said, this example does reveal a limitation of our theory developed so far. That is, the data variable needs to be continuously distributed, as required in Assumption 2.2(ii) (which in turn is related to Assumption 2.1(ii)). Observed data in actual applications are invariably discrete, but this continuous-distribution assumption is often deemed a reasonable approximation to reality. In some situations, however, the discreteness in the data is more salient. For example, the trading volume of a relatively illiquid asset may take values as small integer multiples of the lot size (e.g., 100 shares). This motivates us to directly confront the discreteness in the data, as detailed in the next subsection.
The extension will be carried out in similar steps as the theory developed above. We start with modifying the generic result in Theorem 2.1 to accommodate discretely-valued observations. Recall that $Q_{j,n}(\cdot)$ denotes the $\mathcal{G}_n$-conditional distribution function of the coupling variable $U_{n,i}$ for $i \in \mathcal{I}_{j,n}$ and $j \in \{1, 2\}$, and $Q_n = (Q_{1,n} + Q_{2,n})/2$.

Footnote: This issue has become less important in the equity market as retail investors can now trade a single share, or even a fractional share, of a stock. However, the lot size is still relevant for less liquid assets such as option contracts or for equity data from earlier sample periods.

Theorem 2.3. Suppose that there exists a collection of variables $(U_{n,i})_{i \in \mathcal{I}_n}$ that satisfies Assumption 2.1(i) for some sequence $(\mathcal{G}_n)_{n \geq 1}$ of $\sigma$-fields, and $P(\tilde{Y}_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$, where $(\tilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ is an identical copy of $(Y_{n,i})_{i \in \mathcal{I}_n}$ in $\mathcal{G}_n$-conditional distribution. Then, the following statements hold for the test $\hat{\phi}_n$ described in Algorithm 1:
(a) If the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d., we have $E[\hat{\phi}_n] \to \alpha$.
(b) If $k_n \to \infty$ and $P\left(\int (Q_{1,n}(x) - Q_{2,n}(x))^2\,dQ_n(x) > \delta_n\right) \to 1$ for any real sequence $\delta_n = o(1)$, we have $E[\hat{\phi}_n] \to 1$.

Theorem 2.3 establishes exactly the same asymptotic properties for the permutation test as Theorem 2.1, but under different conditions: It does not impose the anti-concentration requirement for the coupling variable (i.e., Assumption 2.1(ii)), and the "distance" between the observed data and the coupling variable is measured by the probability mass of $\{\tilde{Y}_{n,i} \neq U_{n,i}\}$. These modifications seem natural for the discrete-data setting.

Next, we specialize the generic result in Theorem 2.3 to the state-space model (3), starting with some motivating examples.
The first is an alternative model for the trading volume that explicitly features discretely-valued data, which shows an interesting contrast to Example 3.

Example 4 (Poisson Model for Volume). Let $Y_{n,i}$ be the trading volume of an asset within the $i$th sampling interval. Following Andersen (1996), we model the discretely valued volume using a Poisson distribution with time-varying mean. To form a state-space representation, let $(\epsilon_{n,i}(t))_{t \geq 0}$ be a copy of the standard Poisson process on $\mathbb{R}_+$, independent across $i$, and let $\zeta_t$ be the time-varying mean process independent of the $\epsilon_{n,i}$'s. We then set $Y_{n,i} = \epsilon_{n,i}(\zeta_{i\Delta_n})$, which, conditional on the $\zeta$ process, is Poisson distributed with mean $\zeta_{i\Delta_n}$. This representation is a special case of (3), with $g(\zeta, \epsilon) = \epsilon(\zeta)$ being a time-change and $R_{n,i} = 0$. We also note that although the $\epsilon_{n,i}$'s are assumed to be i.i.d., the $(Y_{n,i})_{i \in \mathcal{I}_n}$ series can be highly persistent through its dependence on the stochastic mean process $\zeta_t$. $\square$

To further broaden the empirical scope, we consider another example concerning the bid-ask spread of asset quotes. This example is econometrically interesting because of its resemblance to the discrete-choice models (e.g., probit and logit) commonly used for modeling binary and multinomial data.
Example 5 (Bid-Ask Spread). Let $Y_{n,i}$ be the bid-ask spread of an asset at time $i\Delta_n$. For a liquid asset, the spread is often maintained at 1 tick (e.g., 1 cent), but it may widen to several ticks due to a higher level of asymmetric information or dealer's inventory cost. For ease of exposition, we suppose that $Y_{n,i}$ is a binary variable taking values in $\{1, 2\}$, while noting that a multinomial extension is straightforward. Motivated by the classical discrete-choice models, we model the spread as $Y_{n,i} = 1 + 1\{\zeta_{i\Delta_n} \geq \epsilon_{n,i}\}$, and suppose that the variables $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. and independent of the $\zeta_t$ process. With the CDF of $\epsilon_{n,i}$ denoted by $F_\epsilon(\cdot)$, we have $P(Y_{n,i} = 2 \mid \zeta_{i\Delta_n}) = F_\epsilon(\zeta_{i\Delta_n})$. Evidently, upon redefining $\zeta_t$ as $F_\epsilon(\zeta_t)$, we can assume that $\epsilon_{n,i}$ is uniformly distributed on the $[0, 1]$ interval without loss of generality. This normalization in turn allows us to interpret $\zeta_t$ as the stochastic propensity of a "wide" spread, which may serve as a measure of market illiquidity. $\square$

We now proceed to establish the asymptotic validity of the permutation test for the hypotheses described in (5) for discretely valued observations; see Theorem 2.4 below. Since the state-space representation (3) holds with the residual term $R_{n,i} = 0$ in the examples above, it seems reasonable to avoid unnecessary redundancy by restricting our analysis to a simpler version given by
\[ Y_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}), \quad i \in \mathcal{I}_n. \tag{8} \]
We replace Assumption 2.3 with the following assumption, where we recall that for each $z \in \mathcal{Z}$, $F_z(\cdot)$ denotes the CDF of the random variable $g(z, \epsilon_{n,i})$ and $\tilde{\zeta}_t = \zeta_t - \Delta\zeta_\tau 1\{t \geq \tau\}$.

Assumption 2.4.
There exist a sequence $(T_m)_{m \geq 1}$ of stopping times increasing to infinity, a sequence of compact subsets $(\mathcal{K}_m)_{m \geq 1}$ of $\mathcal{Z}$, and a sequence $(K_m)_{m \geq 1}$ of constants such that for each $m \geq 1$:
(i) $P(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})) \leq K_m \|z - z'\|$ for all $z, z' \in \mathcal{K}_m$;
(ii) $\zeta_t$ takes values in $\mathcal{K}_m$ for all $t \leq T_m$, and $\|\tilde{\zeta}_{t \wedge T_m} - \tilde{\zeta}_{s \wedge T_m}\| \leq K_m |t - s|^{1/2}$ for all $t, s$ in some fixed neighborhood of $\tau$.

Theorem 2.4.
In the state-space model (8), suppose that Assumptions 2.2(i), 2.2(iii) and 2.4 hold, and that $k_n^3 \Delta_n = o(1)$. Then, the following statements hold for the permutation test $\hat{\phi}_n$ described in Algorithm 1:
(a) Under the null hypothesis in (5), i.e., $\Delta\zeta_\tau = 0$, we have $E[\hat{\phi}_n] \to \alpha$;
(b) Under a fixed alternative hypothesis in (5), i.e., $\Delta\zeta_\tau = c$ for some (unknown) constant $c \neq 0$, we have $E[\hat{\phi}_n] \to 1$ when $k_n \to \infty$.

Theorem 2.4 depicts the same asymptotic behavior of the permutation test as in Theorem 2.2. The sufficient conditions of these results differ mainly in how to gauge the closeness between the data and the coupling variable, as manifest in the difference between Assumption 2.3(i) and Assumption 2.4(i). The latter is easy to verify under more primitive conditions in concrete settings. Specifically, in Example 4, we note that $|g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i})|$ is a Poisson random variable with mean $|z - z'|$, and hence,
\[ P(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})) = 1 - \exp(-|z - z'|) \leq |z - z'| \]
as desired. In Example 5, we can use $\epsilon_{n,i} \sim \text{Uniform}[0, 1]$ to deduce that
\[ P(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})) = P(1\{z \geq \epsilon_{n,i}\} \neq 1\{z' \geq \epsilon_{n,i}\}) = |z - z'|, \]
which, again, verifies Assumption 2.4(i). Therefore, in the context of Examples 4 and 5 above, the permutation test is asymptotically valid for detecting discontinuities in trading activity and illiquidity, respectively.

Monte Carlo simulations
Our Monte Carlo experiment is based on the setting of Example 2. We simulate the (log) price process according to $dP_t = \sigma_t\,dL_t$ under an Euler scheme on a 1-second mesh, and then resample the data at the $\Delta_n = 1$ minute frequency. We simulate $L$ either as a standard Brownian motion or as a (centered symmetric) stable process with index $\beta = 1.5$. To avoid unrealistic price paths, we truncate the stable distribution so that its normalized increment $\Delta_n^{-1/\beta}(L_{i\Delta_n} - L_{(i-1)\Delta_n})$ is supported on $[-C, C]$, and we consider $C \in \{10, 20, 30\}$ to examine the effect of the support. The unit of time is one day.

To simulate the volatility process, we first simulate two volatility factors according to the following dynamics (see Bollerslev and Todorov (2011)):
\[ dV_{1,t} = [\ldots]\,([\ldots] - V_{1,t})\,dt + [\ldots]\sqrt{V_{1,t}}\left(\rho\,dL_t + \sqrt{1 - \rho^2}\,dB_{1,t}\right) + c \cdot 1\{t = \tau\}, \]
\[ dV_{2,t} = [\ldots]\,([\ldots] - V_{2,t})\,dt + [\ldots]\sqrt{V_{2,t}}\left(\rho\,dL_t + \sqrt{1 - \rho^2}\,dB_{2,t}\right) + c \cdot 1\{t = \tau\}, \]
where $B_{1,t}$ and $B_{2,t}$ are independent standard Brownian motions that are also independent of $L_t$, $\rho = -[\ldots]$, and $c$ determines the size of the volatility jump at the event time $\tau$. In particular, $c = 0$ corresponds to the null hypothesis, and we consider a range of $c$ values in $(0, [\ldots]]$. The $c$ parameter is calibrated according to Bollerslev et al.'s (2018) empirical estimates for FOMC announcements.

Footnote: Specifically, Bollerslev et al. (2018) estimate the average jump size of $\log(\sigma_t)$ for the S&P 500 ETF around FOMC announcements to be 1.037 (see Table 3 of that paper). This suggests that $\sigma_\tau / \sigma_{\tau-} = \exp(1.037) \approx [\ldots]$, $c \approx [\ldots]$.

We note that the two volatility factors, $V_1$ and $V_2$, capture the slow- and fast-mean-reverting volatility dynamics, respectively, with the former having "smoother" sample paths than the latter. With this in mind, we simulate $\sigma_t$ using two models:
\[ \text{Model A: } \sigma_t^2 = 2V_{1,t}, \qquad \text{Model B: } \sigma_t^2 = V_{1,t} + V_{2,t}. \tag{9} \]
In finite samples, Model A features relatively smooth volatility paths, which is close to the "ideal" scenario underlying the infill asymptotic theory. Meanwhile, Model B generates more realistic, and rougher, sample paths for $\sigma$, providing a nontrivial challenge for the proposed inference theory.

We implement the permutation test at the 5% significance level, with the window size $k_n \in \{15, 30, 60, 90\}$. The six-fold increase from the smallest window size to the largest one represents a considerable range that allows us to explore the robustness of the proposed test with respect to the $k_n$ tuning parameter. The critical value is computed as in Remark 2.1 based on 1,000 i.i.d. permutations. For comparison, we also implement the standard (two-sided) t-test based on (7). Rejection frequencies are computed based on 1,000 Monte Carlo trials.

Table 1

                    Permutation test                    t-test
              (1)     (2)     (3)     (4)       (1)     (2)     (3)     (4)
Model A: One-factor Volatility
k_n = 15     0.041   0.049   0.053   0.049     0.019   0.012   0.022   0.021
k_n = 30     0.056   0.056   0.049   0.044     0.032   0.035   0.044   0.052
k_n = 60     0.041   0.044   0.048   0.054     0.084   0.070   0.075   0.065
k_n = 90     0.050   0.048   0.042   0.046     0.115   0.080   0.098   0.098
Model B: Two-factor Volatility
k_n = 15     0.041   0.049   0.060   0.048     0.025   0.016   0.022   0.024
k_n = 30     0.059   0.054   0.049   0.045     0.087   0.064   0.065   0.081
k_n = 60     0.068   0.053   0.056   0.069     0.208   0.139   0.164   0.153
k_n = 90     0.082   0.064   0.056   0.070     0.289   0.193   0.231   0.196

Note: This table presents rejection frequencies of the permutation test and the t-test under the null hypothesis $\sigma_{\tau-} = \sigma_\tau$. The significance level is fixed at 5%. Column (1) corresponds to the case with $L$ being a standard Brownian motion, and columns (2)–(4) correspond to cases in which $L$ is truncated stable with index 1.5 and truncation parameter $C \in \{10, 20, 30\}$. The rejection frequencies are computed based on 1,000 Monte Carlo trials.

We first examine the size properties of the permutation test $\hat{\phi}_n$ and the t-test based on (7). Table 1 reports the rejection frequencies of these tests under the null hypothesis (i.e., $c = 0$) for various data generating processes. Column (1) corresponds to the case with $L$ being a standard Brownian motion, and columns (2), (3), and (4) report results when $L$ is a truncated stable process with the truncation parameter $C = 10$, 20, and 30, respectively.

The top panel of the table shows results from Model A, where the volatility is solely driven by the "slow" factor. Quite remarkably, the rejection frequencies of the permutation test are very close to the 5% nominal level for all specifications of $L$ and, importantly, for a wide range of the window size $k_n$. In contrast, the rejection rates of the t-test appear to be far more sensitive to the choice of $k_n$.
As we increase $k_n$ from 15 to 90, the rejection rate increases from 1.9% to 11.5% when $L$ is a Brownian motion, and we see a similar pattern of size distortion when $L$ is truncated stable. It should be noted that, although the performance of the t-test shown in columns (2)–(4) appears similar to what we see in column (1), that test is not formally justified when $L$ is not a Brownian motion.

The more challenging case is Model B with the two-factor volatility dynamics. Looking at the bottom panel of Table 1, we find that the permutation test still has rejection rates that are quite close to the nominal level, although we see some over-rejection when $k_n = 90$. This is likely due to the fact that the approximation error in the coupling has nontrivial impact when the window size is large. That being said, this bias issue also affects the benchmark t-test and, apparently, in a more severe fashion.

We next turn to the power comparison. Since the results for the four specifications of $L$ are similar, we focus on the Brownian motion case for brevity. Figure 1 plots the power curves of the permutation test and the t-test for various $k_n$'s in Model A and Model B. We see that the rejection frequencies increase with the window size $k_n$ and the jump size $c$, which is expected from our consistency result obtained under $k_n \to \infty$. The permutation test is generally less powerful than the t-test under the alternative hypothesis. However, the latter is more powerful at the cost of size distortion, which can be very large as shown in Table 1.

Overall, we find that the permutation test controls size remarkably well under the null hypothesis. Although it appears to be less powerful than the t-test, it does not suffer from the latter's size distortion, which can be severe in the two-factor volatility model. Our results suggest that, given its robustness, the permutation test is a useful complement to the conventional test based on spot estimation and asymptotic Gaussian approximation.
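The permutation procedure used in these experiments can be sketched as follows. The sketch assumes a Cramér-von Mises-type statistic (the average squared gap between the two empirical distribution functions over the pooled sample) and a non-randomized rejection rule with Monte Carlo permutations; Algorithm 1 and Remark 2.1 remain the authoritative description, and all function names here are hypothetical.

```python
import numpy as np

def ecdf_gap_statistic(y1, y2):
    """Average squared gap between the two empirical CDFs over the pooled sample."""
    s1 = np.sort(np.asarray(y1, dtype=float))
    s2 = np.sort(np.asarray(y2, dtype=float))
    pooled = np.concatenate([s1, s2])
    ecdf1 = np.searchsorted(s1, pooled, side="right") / s1.size
    ecdf2 = np.searchsorted(s2, pooled, side="right") / s2.size
    return np.mean((ecdf1 - ecdf2) ** 2)

def permutation_test(y_before, y_after, alpha=0.05, n_perm=1000, rng=None):
    """Return (reject, observed statistic, permutation critical value)."""
    rng = np.random.default_rng() if rng is None else rng
    y_before = np.asarray(y_before, dtype=float)
    y_after = np.asarray(y_after, dtype=float)
    k_n = y_before.size
    pooled = np.concatenate([y_before, y_after])
    t_obs = ecdf_gap_statistic(y_before, y_after)
    t_perm = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = rng.permutation(pooled)  # random relabeling of the two windows
        t_perm[b] = ecdf_gap_statistic(shuffled[:k_n], shuffled[k_n:])
    # Empirical (1 - alpha) quantile of the permuted statistics.
    crit = np.sort(t_perm)[int(np.ceil((1.0 - alpha) * n_perm)) - 1]
    return t_obs > crit, t_obs, crit

# Illustrative call: a threefold scale change after the event (arbitrary seed).
rng = np.random.default_rng(0)
reject, t_obs, crit = permutation_test(
    rng.standard_normal(30), 3.0 * rng.standard_normal(30), rng=rng
)
```

Repeating such a call over many simulated paths and averaging the rejection indicator gives rejection frequencies of the kind reported in Table 1.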
[Figure 1: four panels titled "Permutation Test in Model A", "T-test in Model A", "Permutation Test in Model B", and "T-test in Model B"; horizontal axis: Jump Size; vertical axis: Rejection Probability; one curve for each of $k_n = 15$, 30, 60, 90.]

Figure 1: The figure plots the rejection frequencies of the permutation test and the t-test. The significance level is fixed at 5% (highlighted by shade). Results for Model A and Model B are presented in the top and bottom rows, respectively. The Lévy process $L$ is simulated as a standard Brownian motion. The power curves are computed for the jump size parameter $c \in \{\ldots\}$. The rejection frequencies are computed based on 1,000 Monte Carlo trials.

As an empirical illustration, we apply the proposed permutation test in a case study for testing distributional discontinuities in asset returns. We focus on the impact of news related to the novel coronavirus (COVID-19) on the US stock market during the early phase of the ongoing pandemic. Our dataset consists of the daily (adjusted) close prices of the S&P 500 index from December 20, 2019 to March 18, 2020, which is publicly available at Yahoo Finance. According to a New York Times article, the first reporting of COVID-19 was on December 31, 2019, stating that "Chinese authorities treated dozens of cases of pneumonia of unknown cause." On March 11, 2020, the World Health Organization (WHO) declared COVID-19 to be a global pandemic. Several significant news events in between are listed in Table 2, including the first reported COVID-19 case in the US, and the outbreaks in South Korea and Italy. We implement the permutation test with a window size $k_n = 5$, corresponding to 5 trading days. It is natural to consider this short window as a fixed number, which is permitted under the proposed theory. In contrast, the conventional spot-estimation-based approach would require $k_n \to \infty$, which is clearly implausible in the present context. By using a publicly available dataset and a short event window, this illustrative example is intentionally designed to mimic a "real-time" and high-stake research scenario in the public domain, for which the underlying price and risk dynamics are not yet well understood (due to the rare-disaster nature of COVID-19). This example is thus ideal to highlight the two comparative advantages of the proposed permutation test, namely, its small-sample reliability and practical versatility (recall the discussion in Section 2.2).

Footnote: Data source: https://finance.yahoo.com/quote/%5EGSPC/history?p=%5EGSPC.

Footnote: Our sample period is chosen so that there are five observations of daily returns before the initial reporting from China and five observations after the WHO's pandemic declaration.
Table 2

Date ($\tau$)    Headline Event
12/31/2019    Chinese authorities treated dozens of cases of pneumonia of unknown cause.
01/20/2020    Other countries, including the United States, confirmed COVID-19 cases.
01/30/2020    The WHO declared COVID-19 a global health emergency.
02/21/2020    A secretive church is linked to outbreak in South Korea. Italy sees major surge in COVID-19 cases and officials lock down towns (reported on Sunday, February 23, 2020).
03/11/2020    The WHO declared COVID-19 a global pandemic.

Source:
For each event time $\tau$ in Table 2, we implement the permutation test at the 5% significance level. To simplify interpretation, we use the non-randomized version of the permutation test described in Remark 2.1, that is, we report a rejection if and only if $\hat{T}_n > \hat{T}^*_n$. We reject the null hypothesis for two instances: January 20, 2020 and February 21, 2020. The former corresponds to the first reported COVID-19 case in the US, and the latter is associated with the outbreak in South Korea (Friday) and the subsequent reporting of the surge in Italy (Sunday). On the other hand, we do not reject the null hypothesis for either the initial reporting in China or the two WHO declarations.

To gain further insight, we plot in Figure 2 the daily return series of the S&P 500 index in our sample, marked with the aforementioned events. The time series plot provides corroborative evidence for the testing results. The US market indeed did not respond to China's initial reporting on December 31, 2019, but became increasingly alert in the week after COVID-19 cases were also reported in Japan, South Korea, Thailand, and the US. When the WHO declared a global emergency on January 30, 2020, the market was already moderately volatile, so the declaration itself did not trigger any significant (distributional) discontinuity. The outbreaks in South Korea and Italy evidently drove the market into a panic, which we highlight using shaded color in the figure. The WHO's pandemic announcement amid the turmoil is not associated with a rejection from our test, suggesting that the declaration was mostly a response to publicly known information, without providing new "lumpy" information that could cause jumps in the return distribution.
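With a window of $k_n = 5$, the pooled sample has only 10 observations, so all 252 ways of splitting it into two groups of five can be enumerated. The sketch below uses this exhaustive enumeration as an alternative to drawing random permutations; it is not the procedure used in the paper, and it again assumes a Cramér-von Mises-type statistic with hypothetical function names.

```python
from itertools import combinations

import numpy as np

def ecdf_gap_statistic(y1, y2):
    """Average squared gap between the two empirical CDFs over the pooled sample."""
    s1 = np.sort(np.asarray(y1, dtype=float))
    s2 = np.sort(np.asarray(y2, dtype=float))
    pooled = np.concatenate([s1, s2])
    ecdf1 = np.searchsorted(s1, pooled, side="right") / s1.size
    ecdf2 = np.searchsorted(s2, pooled, side="right") / s2.size
    return np.mean((ecdf1 - ecdf2) ** 2)

def exact_permutation_test(y_before, y_after, alpha=0.05):
    """Enumerate every split of the pooled sample; reject iff T_n > T_n^*."""
    y_before = np.asarray(y_before, dtype=float)
    y_after = np.asarray(y_after, dtype=float)
    pooled = np.concatenate([y_before, y_after])
    n, k_n = pooled.size, y_before.size
    t_obs = ecdf_gap_statistic(y_before, y_after)
    stats = []
    for idx in combinations(range(n), k_n):
        mask = np.zeros(n, dtype=bool)
        mask[list(idx)] = True  # observations relabeled as "before"
        stats.append(ecdf_gap_statistic(pooled[mask], pooled[~mask]))
    stats = np.sort(np.array(stats))
    # (1 - alpha) quantile of the full permutation distribution.
    crit = stats[int(np.ceil((1.0 - alpha) * stats.size)) - 1]
    return t_obs > crit, t_obs, crit, stats.size

# Illustrative call on made-up returns (not the S&P 500 data).
reject, t_obs, crit, n_splits = exact_permutation_test(
    [0.1, -0.2, 0.3, 0.0, -0.1], [1.5, -2.0, 2.5, -1.8, 3.0]
)
```

Because the permutation distribution has at most 252 support points here, it is quite coarse, which is one reason the randomized and non-randomized versions of the test can differ in very small samples.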
Footnote: We implement each test as in Remark 2.1 using 100,000 i.i.d. permutations.

[Figure 2, titled "COVID-19 News and S&P 500 Daily Returns"; horizontal axis: Observation Index; vertical axis: Returns. Marked events: Dec. 31, 2019 (first reporting from China); Jan. 20-21, 2020 (cases in Japan, South Korea, Thailand & US); Jan. 30, 2020 (WHO global emergency); Feb. 21, 2020 (outbreak in South Korea); Feb. 23-24, 2020 (surge in Italy); Mar. 11, 2020 (WHO global pandemic).]

Figure 2: The figure plots the daily returns of the S&P 500 index from December 20, 2019 to March 18, 2020. The (adjusted) close price data is obtained from Yahoo Finance. Note that January 20, 2020 (Martin Luther King Jr. Day) and February 23, 2020 (Sunday) are indexed together with their subsequent trading days.

In this paper, we propose using a permutation test to detect discontinuities in an economic model at a cutoff point. Relative to the existing literature, we show that the permutation test is well suited for event studies based on time-series data. While nonparametric t-tests have been widely used for this purpose in various empirical contexts, the permutation test proposed in this paper provides a distinct alternative. Instead of relying on asymptotic (mixed) Gaussianity from central limit theorems, we exploit finite-sample properties of the permutation test in the approximating, or "coupling," two-sample problem.

We demonstrate that our new theory is broadly useful in a wide range of problems in the infill asymptotic time-series setting, which justifies using the permutation test to detect jumps in economic variables such as volatility, trading activity, and liquidity. Compared with the conventional nonparametric t-test, the proposed permutation test has several distinct features. First, the permutation test provides asymptotic size control regardless of whether the sizes of the local subsamples are fixed or growing to infinity. In the latter case, we also establish that the permutation test is consistent. Second, the permutation test is versatile, as it can be applied without modification to many different contexts and under relatively weak conditions.
Appendix: Proofs
Throughout the proofs, we use $K$ to denote a positive constant that may change from line to line, and write $K_p$ to emphasize its dependence on some parameter $p$. For any event $E \in \mathcal{F}$, we identify it with the associated indicator random variable.

Proof of Theorem 2.1.
Step 1. Define $\phi_n$ in the same way as $\hat{\phi}_n$ but with $(Y_{n,i})_{i \in \mathcal{I}_n}$ replaced by $(U_{n,i})_{i \in \mathcal{I}_n}$. In this step, we show that
\[ E[\hat{\phi}_n] = E[\phi_n] + o(1). \tag{10} \]
Let $\tilde{\phi}_n$ be defined in the same way as $\hat{\phi}_n$, but with $(Y_{n,i})_{i \in \mathcal{I}_n}$ replaced by $(\tilde{Y}_{n,i})_{i \in \mathcal{I}_n}$, as defined in Assumption 2.1(iii). Since $(\tilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ and $(Y_{n,i})_{i \in \mathcal{I}_n}$ share the same (conditional) distribution,
\[ E[\tilde{\phi}_n] = E[\hat{\phi}_n]. \tag{11} \]
Let $E_n \in \mathcal{F}$ be the event where the ordered values of $(U_{n,i})_{i \in \mathcal{I}_n}$ and $(\tilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ correspond to the same permutation of $\mathcal{I}_n$. Since the test statistic is only a function of the rank of the observations, we have $\tilde{\phi}_n = \phi_n$ in restriction to $E_n$. Hence,
\[ |E[\tilde{\phi}_n] - E[\phi_n]| = |E[\tilde{\phi}_n E_n^c] - E[\phi_n E_n^c]| \leq P(E_n^c). \tag{12} \]
By (11) and (12), (10) follows from $P(E_n^c) = o(1)$, which will be proved below.

Let $A_{n,i,j} \equiv \{U_{n,j} - U_{n,i} \geq 0,\ \tilde{Y}_{n,j} - \tilde{Y}_{n,i} < 0\}$ for every $(i, j) \in \mathcal{I}_n \times \mathcal{I}_n$, and note that $E_n^c \subseteq \cup_{i,j} A_{n,i,j}$. Recall the elementary fact that if a sequence of random variables $X_n = o_p(1)$, then there exists a real sequence $\delta_n = o(1)$ such that $P(|X_n| \leq \delta_n) \to 1$. Under Assumption 2.1(iii), by applying this result to $X_n = 2 \max_{i \in \mathcal{I}_n} |\tilde{Y}_{n,i} - U_{n,i}|\, k_n^2$, we can find a sequence $\delta_n = o(1)$ such that
\[ P\left(\max_{i \in \mathcal{I}_n} |\tilde{Y}_{n,i} - U_{n,i}| \leq \delta_n k_n^{-2}/2\right) \to 1. \tag{13} \]
We then observe that
\[ \begin{aligned} A_{n,i,j} &\subseteq \{U_{n,j} - U_{n,i} \geq \delta_n k_n^{-2},\ \tilde{Y}_{n,j} - \tilde{Y}_{n,i} < 0\} \cup \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\} \\ &\subseteq \{|\tilde{Y}_{n,j} - \tilde{Y}_{n,i} - (U_{n,j} - U_{n,i})| > \delta_n k_n^{-2}\} \cup \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\} \\ &\subseteq \{\max_{i \in \mathcal{I}_n} |\tilde{Y}_{n,i} - U_{n,i}| > \delta_n k_n^{-2}/2\} \cup \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}. \end{aligned} \]
Therefore,
\[ E_n^c \subseteq \cup_{i,j} A_{n,i,j} \subseteq \{\max_{i \in \mathcal{I}_n} |\tilde{Y}_{n,i} - U_{n,i}| > \delta_n k_n^{-2}/2\} \cup \left(\cup_{i,j \in \mathcal{I}_n} \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right), \]
which, together with (13), implies that
\[ P(E_n^c) \leq P\left(\cup_{i,j \in \mathcal{I}_n} \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right) + o(1). \tag{14} \]
Next, consider the following argument:
\[ \begin{aligned} P\left(\cup_{i,j \in \mathcal{I}_n} \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\} \,\middle|\, \mathcal{G}_n\right) &\leq \sum_{i,j \in \mathcal{I}_n} P(0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2} \mid \mathcal{G}_n) \\ &\leq 2k_n \sum_{i \in \mathcal{I}_n} \sup_{x \in \mathbb{R}} P(|U_{n,i} - x| \leq \delta_n k_n^{-2} \mid \mathcal{G}_n) \\ &= O_p(\delta_n) = o_p(1), \end{aligned} \tag{15} \]
where the last line holds by Assumption 2.1(ii). By (15) and the bounded convergence theorem,
\[ P\left(\cup_{i,j \in \mathcal{I}_n} \{0 \leq U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right) = o(1). \tag{16} \]
By combining (14) and (16), we conclude that $P(E_n^c) = o(1)$, as desired.

Step 2. We now prove the assertions in parts (a) and (b) of the theorem. In view of (10), we only need to prove $E[\phi_n] \to \alpha$ and $E[\phi_n] \to 1$ in parts (a) and (b), respectively. In part (a), the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are conditionally i.i.d. and so permutations constitute a group of transformations that satisfy the randomization hypothesis in Lehmann and Romano (2005, Definition 15.2.1). Then, Lehmann and Romano (2005, Theorem 15.2.1) implies that $E[\phi_n \mid \mathcal{G}_n] = \alpha$, and $E[\phi_n] = \alpha$ then follows from the law of iterated expectations.

To prove part (b), we need some additional notation.
To emphasize the dependence of $\widehat T_n$, $\widehat T_n^*$, and $\hat\phi_n$ on the original data $(Y_{n,i})_{i\in\mathcal{I}_n}$, we explicitly write them as $\widehat T_n(Y)$, $\widehat T_n^*(Y)$, and $\hat\phi_n(Y)$. With this notation, we can rewrite $\phi_n = \hat\phi_n(U)$, since it is computed in the same way as $\hat\phi_n$ but with $(Y_{n,i})_{i\in\mathcal{I}_n}$ replaced by $(U_{n,i})_{i\in\mathcal{I}_n}$.

We first analyze the asymptotic behavior of $\widehat T_n(U)$. Define the empirical analogue of $Q_{j,n}(\cdot)$ as

$$\widehat Q_{j,n}(x) \equiv \frac{1}{k_n} \sum_{i\in\mathcal{I}_{j,n}} \mathbf{1}\{U_{n,i} \le x\}.$$

Since the variables $(U_{n,i})_{i\in\mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d.,

$$\mathbb{E}\big[(\widehat Q_{j,n}(x) - Q_{j,n}(x))^2 \,\big|\, \mathcal{G}_n\big] \le O(k_n^{-1}) = o(1).$$

By Markov's inequality and the law of iterated expectations, this implies that $\widehat Q_{j,n}(x) - Q_{j,n}(x) = o_p(1)$ for each $x \in \mathbb{R}$. This and a classical Glivenko–Cantelli theorem (e.g., Davidson (1994, Theorem 21.5)) imply that

$$\sup_{x\in\mathbb{R}} |\widehat Q_{j,n}(x) - Q_{j,n}(x)| = o_p(1), \quad \text{for } j \in \{1,2\}. \tag{17}$$

By definition,

$$\widehat T_n(U) \equiv \frac{1}{2k_n} \sum_{i\in\mathcal{I}_n} \big(\widehat Q_{1,n}(U_{n,i}) - \widehat Q_{2,n}(U_{n,i})\big)^2.$$

In addition, we define

$$S_n \equiv \frac{1}{2k_n} \sum_{i\in\mathcal{I}_n} \big(Q_{1,n}(U_{n,i}) - Q_{2,n}(U_{n,i})\big)^2.$$

Note that the functions $\widehat Q_{j,n}(\cdot)$ and $Q_{j,n}(\cdot)$ are uniformly bounded. Hence, by the triangle inequality and (17),

$$\begin{aligned}
|\widehat T_n(U) - S_n| &\le \frac{1}{2k_n} \sum_{i\in\mathcal{I}_n} \Big| \big(\widehat Q_{1,n}(U_{n,i}) - \widehat Q_{2,n}(U_{n,i})\big)^2 - \big(Q_{1,n}(U_{n,i}) - Q_{2,n}(U_{n,i})\big)^2 \Big| \\
&\le \frac{K}{k_n} \sum_{i\in\mathcal{I}_n} \sum_{j\in\{1,2\}} |\widehat Q_{j,n}(U_{n,i}) - Q_{j,n}(U_{n,i})| = o_p(1).
\end{aligned} \tag{18}$$

Conditional on $\mathcal{G}_n$, the bounded random functions $Q_{1,n}(\cdot)$ and $Q_{2,n}(\cdot)$ can be treated as deterministic functions. Next, note that

$$S_n = \frac{1}{2} \sum_{j\in\{1,2\}} \int \big(Q_{1,n}(x) - Q_{2,n}(x)\big)^2 \, dQ_{j,n}(x) + o_p(1) = \int \big(Q_{1,n}(x) - Q_{2,n}(x)\big)^2 \, dQ_n(x) + o_p(1), \tag{19}$$

where the first equality holds by a law of large numbers for the conditionally i.i.d. variables $(U_{n,i})_{i\in\mathcal{I}_{j,n}}$ for $j = 1, 2$, and the second equality holds by the definition of $Q_n$. By combining (18) and (19), we deduce that

$$\widehat T_n(U) = \int \big(Q_{1,n}(x) - Q_{2,n}(x)\big)^2 \, dQ_n(x) + o_p(1). \tag{20}$$

Next, we analyze the asymptotic behavior of $\widehat T_n^*(U)$. It is useful to consider the following representation of this variable. We denote $U_{\tilde\pi} = (U_{n,\tilde\pi(i)})_{i\in\mathcal{I}_n}$, where $\tilde\pi$ is a random permutation of $\mathcal{I}_n$ that is independent of the data and drawn uniformly from the set of all permutations of $\mathcal{I}_n$. By definition, $\widehat T_n^*(U)$ is the $1-\alpha$ quantile of $\widehat T_n(U_{\tilde\pi})$, conditional on the sample, where the randomness comes from the random realization of $\tilde\pi$. To analyze the permutation distribution, we construct an additional coupling sequence of $(U_{n,i})_{i\in\mathcal{I}_n}$ following the method of Chung and Romano (2013, Section 5.3). The result of their coupling construction is another random sequence $(U'_{n,i})_{i\in\mathcal{I}_n}$ such that (i) $U_{n,i} = U'_{n,i}$ for all $i$ in some random subset $\mathcal{I}'_n \subseteq \mathcal{I}_n$; (ii) the cardinality of $\mathcal{I}_n \setminus \mathcal{I}'_n$, denoted $D_n$, satisfies $\mathbb{E}[D_n] = O(k_n^{1/2})$; and (iii) $(U'_{n,i})_{i\in\mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d. with marginal distribution $Q_n$.

For $j \in \{1,2\}$, define

$$\widehat Q_{j,n}(x;\pi) \equiv \frac{1}{k_n} \sum_{i\in\mathcal{I}_{j,n}} \mathbf{1}\{U_{n,\pi(i)} \le x\} \quad \text{and} \quad \widehat Q'_{j,n}(x;\pi) \equiv \frac{1}{k_n} \sum_{i\in\mathcal{I}_{j,n}} \mathbf{1}\{U'_{n,\pi(i)} \le x\}.$$
By repeatedly using the triangle inequality,

$$\begin{aligned}
|\widehat T_n(U_\pi) - \widehat T_n(U'_\pi)| &= \frac{1}{2k_n} \bigg| \sum_{i\in\mathcal{I}_n} \Big( \big(\widehat Q_{1,n}(U_{n,\pi(i)};\pi) - \widehat Q_{2,n}(U_{n,\pi(i)};\pi)\big)^2 - \big(\widehat Q'_{1,n}(U'_{n,\pi(i)};\pi) - \widehat Q'_{2,n}(U'_{n,\pi(i)};\pi)\big)^2 \Big) \bigg| \\
&\le \frac{K}{k_n} \sum_{j\in\{1,2\}} \sum_{i\in\mathcal{I}_n} \big| \widehat Q_{j,n}(U_{n,\pi(i)};\pi) - \widehat Q'_{j,n}(U'_{n,\pi(i)};\pi) \big| \\
&\le \frac{K}{k_n^2} \sum_{i,k\in\mathcal{I}_n} \big| \mathbf{1}\{U_{n,\pi(k)} \le U_{n,\pi(i)}\} - \mathbf{1}\{U'_{n,\pi(k)} \le U'_{n,\pi(i)}\} \big| \\
&\le K D_n / k_n = o_p(1),
\end{aligned} \tag{21}$$

where the last inequality uses the fact that $(U_{n,i}, U_{n,k}) = (U'_{n,i}, U'_{n,k})$ if $(i,k) \in \mathcal{I}'_n \times \mathcal{I}'_n$, and so the summation on the previous line only has $(2k_n)^2 - (2k_n - D_n)^2 \le 4 k_n D_n$ bounded terms that can be different from zero; and the $o_p(1)$ statement follows from $\mathbb{E}[D_n] = O(k_n^{1/2})$, $k_n \to \infty$, and Markov's inequality.

For any fixed arbitrary permutation $\pi$, $\widehat T_n(U'_\pi)$ is the Cramér–von Mises statistic for the $\mathcal{G}_n$-conditionally i.i.d. variables $(U'_{n,\pi(i)})_{i\in\mathcal{I}_n}$. Hence, by an argument similar to the one leading to (20), we have $\widehat T_n(U'_\pi) = o_p(1)$. By combining this with (21), it follows that

$$\widehat T_n(U_\pi) = o_p(1). \tag{22}$$

Since this result holds for any arbitrary fixed permutation $\pi$, it also holds for any pair of permutations drawn at random from the set of all possible permutations of $\mathcal{I}_n$, independently of the data. By elementary properties of stochastic convergence, this implies the so-called Hoeffding's condition (e.g., Lehmann and Romano (2005, Equation (15.10))).
By this and Lehmann and Romano (2005, Theorem 15.2.3), the permutation distribution associated with the test statistic $\widehat T_n(U)$, conditional on the data, converges to zero in probability. As a corollary of this,

$$\widehat T_n^*(U) = o_p(1). \tag{23}$$

From (20), (23), and the condition in part (b), it is easy to see that $\widehat T_n(U) > \widehat T_n^*(U)$ with probability approaching 1. This further implies that $\mathbb{E}[\phi_n] \to 1$, which, together with (10), proves the assertion of part (b).
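The objects manipulated in this proof can be made concrete with a small numerical sketch. The Python code below computes a two-sample Cramér–von Mises-type statistic of the form $\frac{1}{2k_n}\sum_i (\widehat Q_{1,n} - \widehat Q_{2,n})^2$ and compares it with the $1-\alpha$ quantile of its permutation distribution. It is only an illustration under stated assumptions, not the authors' implementation: the function names `cvm_statistic` and `permutation_test`, the use of randomly drawn permutations instead of full enumeration, and the non-randomized rejection rule are all choices made here for brevity.

```python
import numpy as np


def cvm_statistic(y, k):
    """Cramer-von Mises-type distance between two local subsamples.

    y holds 2k observations: y[:k] lies on one side of the cutoff,
    y[k:] on the other (an assumed data layout).
    """
    y = np.asarray(y, dtype=float)
    left, right = np.sort(y[:k]), np.sort(y[k:])
    # Empirical CDF of each subsample, evaluated at every pooled observation.
    q1 = np.searchsorted(left, y, side="right") / k
    q2 = np.searchsorted(right, y, side="right") / k
    # Average over all 2k observations, i.e. the 1/(2k) normalization.
    return float(np.mean((q1 - q2) ** 2))


def permutation_test(y, k, alpha=0.05, n_perm=999, seed=0):
    """Return (observed statistic, permutation critical value, reject?).

    Assumption: the permutation distribution is approximated with n_perm
    random permutations rather than all (2k)! of them.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t_obs = cvm_statistic(y, k)
    t_perm = np.array(
        [cvm_statistic(rng.permutation(y), k) for _ in range(n_perm)]
    )
    crit = float(np.quantile(t_perm, 1 - alpha))
    return t_obs, crit, bool(t_obs > crit)
```

For example, calling `permutation_test(y, k)` on a sample whose two halves are clearly separated should reject, because the observed statistic stays bounded away from zero while the permutation critical value shrinks as `k` grows, which is the mechanism behind (20) and (23).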
Proof of Theorem 2.2. (a) We prove the assertion of part (a) by applying Theorem 2.1(a). We construct the coupling variable $U_{n,i}$ as follows:

$$U_{n,i} = g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}), \quad \text{for all } i \in \mathcal{I}_n. \tag{24}$$

We set $\mathcal{G}_n = \mathcal{F}_{(i^*-k_n)\Delta_n}$. By Assumption 2.2, $(\epsilon_{n,i})_{i\in\mathcal{I}_n}$ are i.i.d. and independent of $\mathcal{G}_n$. Since $\zeta_{(i^*-k_n)\Delta_n}$ is $\mathcal{G}_n$-measurable, the variables $(U_{n,i})_{i\in\mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d. This verifies the condition in part (a) of Theorem 2.1, which also implies Assumption 2.1(i). It remains to verify conditions (ii) and (iii) in Assumption 2.1.

By a standard localization argument (see Jacod and Protter (2012, Section 4.4.1)), we can strengthen Assumption 2.3 by assuming $T = \infty$, $\mathcal{K}_m = \mathcal{K}$, and $K_m = K$ for some fixed compact set $\mathcal{K}$ and constant $K > 0$. In particular, $\zeta_{(i^*-k_n)\Delta_n}$ takes values in the compact set $\mathcal{K}$. By Assumption 2.2, it is then easy to see that the $\mathcal{G}_n$-conditional probability density of $U_{n,i} = g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i})$ is uniformly bounded (and it does not depend on $i$). This implies condition (ii) of Assumption 2.1.

Finally, we verify condition (iii) of Assumption 2.1. By Assumption 2.2(i), for each $i \in \mathcal{I}_n$, $\epsilon_{n,i}$ is independent of $\mathcal{F}_{i\Delta_n}$. Since $\zeta_{i\Delta_n}$ and $\zeta_{(i^*-k_n)\Delta_n}$ are $\mathcal{F}_{i\Delta_n}$-measurable, we deduce from Assumption 2.3(i) that

$$\mathbb{E}\big[ |g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i})|^2 \,\big|\, \mathcal{F}_{i\Delta_n} \big] \le K a_n^2 \|\zeta_{i\Delta_n} - \zeta_{(i^*-k_n)\Delta_n}\|^2. \tag{25}$$

Note that under the null hypothesis with $\Delta\zeta_\tau = 0$, the processes $\zeta_t$ and $\tilde\zeta_t$ are identical. Hence, by Assumption 2.3(ii) and (25),

$$\big\| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) \big\|_2 \le K a_n k_n^{1/2} \Delta_n^{1/2}.$$

By the maximal inequality under the $L_2$ norm (see, e.g., van der Vaart and Wellner (1996, Lemma 2.2.2)), we further deduce that

$$\Big\| \max_{i\in\mathcal{I}_n} \big| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) \big| \Big\|_2 \le K a_n k_n \Delta_n^{1/2}. \tag{26}$$

Recall that $a_n k_n^3 \Delta_n^{1/2} = o(1)$ by assumption. Hence,

$$\max_{i\in\mathcal{I}_n} \big| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) \big| = o_p(k_n^{-2}). \tag{27}$$

Note that, by the definitions in (3) and (24),

$$Y_{n,i} - U_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) + R_{n,i}. \tag{28}$$

Combining (27), (28), and Assumption 2.3(iii), we deduce that $\max_{i\in\mathcal{I}_n} |Y_{n,i} - U_{n,i}| = o_p(k_n^{-2})$, which verifies Assumption 2.1(iii). We have now verified all the conditions needed in Theorem 2.1(a), which proves the assertion of part (a) of Theorem 2.2.

(b) We prove the assertion of part (b) by applying Theorem 2.1(b).
Under the maintained alternative hypothesis, we have $\Delta\zeta_\tau = c$ for some constant $c \neq 0$. The coupling variable now takes the following form:

$$U_{n,i} = \begin{cases} g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) & i \in \mathcal{I}_{1,n}, \\ g(\zeta_{(i^*-k_n)\Delta_n} + c, \epsilon_{n,i}) & i \in \mathcal{I}_{2,n}. \end{cases} \tag{29}$$

Under Assumption 2.2, it is easy to see that, for each $j \in \{1,2\}$, the variables $(U_{n,i})_{i\in\mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d., which verifies Assumption 2.1(i).

We now turn to the remaining conditions in Assumption 2.1. As in part (a), we can invoke the standard localization procedure and assume that the $\zeta_t$ process takes values in a compact set $\mathcal{K}$. Note that

$$\zeta_\tau - (\zeta_{(i^*-k_n)\Delta_n} + \Delta\zeta_\tau) = \zeta_{\tau-} - \zeta_{(i^*-k_n)\Delta_n} = o_p(1),$$

where the $o_p(1)$ statement follows from the fact that the $\zeta_t$ process is càdlàg and $k_n \Delta_n \to 0$. Hence, by enlarging the compact set $\mathcal{K}$ slightly if necessary, we also have $\zeta_{(i^*-k_n)\Delta_n} + c \in \mathcal{K}$ with probability approaching 1. Then, we can verify Assumption 2.1(ii) following the same argument as in part (a). The verification of Assumption 2.1(iii) is also similar.

Finally, we verify the condition in Theorem 2.1(b) pertaining to the conditional CDFs. Note that

$$Q_{1,n}(x) = F_{\zeta_{(i^*-k_n)\Delta_n}}(x) \quad \text{and} \quad Q_{2,n}(x) = F_{\zeta_{(i^*-k_n)\Delta_n}+c}(x).$$

It is then easy to see that

$$2 \int \big(Q_{1,n}(x) - Q_{2,n}(x)\big)^2 \, dQ_n(x) \ge \int \Big( F_{\zeta_{(i^*-k_n)\Delta_n}}(x) - F_{\zeta_{(i^*-k_n)\Delta_n}+c}(x) \Big)^2 dF_{\zeta_{(i^*-k_n)\Delta_n}}(x).$$

Since $\zeta_{(i^*-k_n)\Delta_n}$ takes values in the compact set $\mathcal{K}$, Assumption 2.2(iii) implies that the lower bound in the above display is bounded away from zero. Hence, $\int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n$ for any real sequence $\delta_n = o(1)$. We have now verified all conditions for Theorem 2.1(b), which proves the assertion of part (b) of Theorem 2.2.

Proof of Theorem 2.3.
The assertions of the theorem follow from arguments similar to those used to prove Theorem 2.1. For the sake of brevity, we focus on the only substantial difference, which is how we establish that $\mathbb{P}(E_n^c) = o(1)$. Recall that $E_n$ denotes the event where the ordered values of $(U_{n,i})_{i\in\mathcal{I}_n}$ and $(\widetilde Y_{n,i})_{i\in\mathcal{I}_n}$ correspond to the same permutation of $\mathcal{I}_n$. In the case of this proof, this result follows from

$$\mathbb{P}(E_n^c) \le \mathbb{P}\big(\cup_{i\in\mathcal{I}_n} \{\widetilde Y_{n,i} \neq U_{n,i}\}\big) \le \sum_{i\in\mathcal{I}_n} \mathbb{P}(\widetilde Y_{n,i} \neq U_{n,i}) = o(1),$$

where the first inequality follows from $E_n^c \subseteq \cup_{i\in\mathcal{I}_n} \{\widetilde Y_{n,i} \neq U_{n,i}\}$ and the convergence follows from the assumption that $\mathbb{P}(\widetilde Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$.

Proof of Theorem 2.4. (a) We prove this assertion by applying Theorem 2.3(a). We shall verify the conditions in Theorem 2.3 for $\widetilde Y_{n,i} = Y_{n,i}$, $U_{n,i} = g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i})$, and $\mathcal{G}_n = \mathcal{F}_{(i^*-k_n)\Delta_n}$. By assumption, the variables $(\epsilon_{n,i})_{i\in\mathcal{I}_n}$ are i.i.d. and independent of $\mathcal{G}_n$. Hence, the variables $(U_{n,i})_{i\in\mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d.

It remains to verify that $\mathbb{P}(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$. By repeating the localization argument used in the proof of Theorem 2.2, we can strengthen Assumption 2.4 with $T = \infty$ without loss of generality. In particular, $\zeta_t$ takes values in some compact subset $\mathcal{K} \subseteq \mathcal{Z}$. Note that for each $i \in \mathcal{I}_n$, $\epsilon_{n,i}$ is independent of $(\zeta_{i\Delta_n}, \zeta_{(i^*-k_n)\Delta_n})$. By Assumption 2.4(i), we thus have $\mathbb{P}(Y_{n,i} \neq U_{n,i} \mid \mathcal{G}_n) \le K \|\zeta_{i\Delta_n} - \zeta_{(i^*-k_n)\Delta_n}\|$. Then, by Assumption 2.4(ii), we further have $\mathbb{P}(Y_{n,i} \neq U_{n,i}) \le K (k_n \Delta_n)^{1/2}$. The condition $\mathbb{P}(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ then follows from $k_n^3 \Delta_n = o(1)$.
By Theorem 2.3(a), we have $\mathbb{E}[\hat\phi_n] \to \alpha$ as asserted.

(b) We prove this assertion by applying Theorem 2.3(b). We verify the conditions in Theorem 2.3 for $\widetilde Y_{n,i} = Y_{n,i}$, $\mathcal{G}_n = \mathcal{F}_{(i^*-k_n)\Delta_n}$, and

$$U_{n,i} = \begin{cases} g(\zeta_{(i^*-k_n)\Delta_n}, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{1,n}, \\ g(\zeta_{(i^*-k_n)\Delta_n} + c, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{2,n}. \end{cases}$$

By the same argument as in part (a), the variables $(U_{n,i})_{i\in\mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d. for each $j \in \{1,2\}$, and $\mathbb{P}(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$. Assumption 2.2(iii) also ensures that $\mathbb{P}\big(\int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n\big) \to 1$ for any real sequence $\delta_n = o(1)$. By Theorem 2.3(b), we have that $\mathbb{E}[\hat\phi_n] \to 1$, as asserted.
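The coupling arguments in the proofs above all rest on one fact: the test statistic depends on the data only through ranks, so on the event $E_n$ (same ordering of the observed data and the coupling variables) the test and its permutation distribution are unchanged. The Python sketch below checks this numerically. It is a hypothetical illustration, not part of the original proofs: the helper names `cvm_statistic`, `same_ordering`, and `coupled_pair` are assumptions, and the coupling variables are taken to be distinct integers so that a perturbation smaller than half the minimum gap provably preserves the ordering, mirroring the role of $\delta_n k_n^{-2}/2$ in the argument leading to (13) and (14).

```python
import numpy as np


def cvm_statistic(y, k):
    """Cramer-von Mises-type statistic; y[:k] and y[k:] are the two subsamples."""
    y = np.asarray(y, dtype=float)
    left, right = np.sort(y[:k]), np.sort(y[k:])
    q1 = np.searchsorted(left, y, side="right") / k
    q2 = np.searchsorted(right, y, side="right") / k
    return float(np.mean((q1 - q2) ** 2))


def same_ordering(u, y):
    """True when u and y are sorted by the same permutation (the event E_n)."""
    return np.array_equal(np.argsort(u, kind="stable"),
                          np.argsort(y, kind="stable"))


def coupled_pair(k, rng):
    """Coupling variables u and perturbed data y that share the same ranks.

    u has distinct values with gaps of at least 1; every coordinate of y
    moves by less than 0.4, so pairwise differences change by less than 0.8
    and no pair can swap order.
    """
    u = rng.permutation(2 * k).astype(float)
    y = u + rng.uniform(-0.4, 0.4, size=2 * k)
    return u, y


rng = np.random.default_rng(0)
u, y = coupled_pair(25, rng)
pi = rng.permutation(50)  # an arbitrary permutation of the index set

# On E_n the statistic agrees for the data and for any common permutation.
assert same_ordering(u, y)
assert cvm_statistic(u, 25) == cvm_statistic(y, 25)
assert cvm_statistic(u[pi], 25) == cvm_statistic(y[pi], 25)
```

Because the equality holds for every common permutation of the indices, the permutation critical values computed from `u` and from `y` coincide as well, which is why bounding $\mathbb{P}(E_n^c)$ is the only step that differs between the proofs of Theorems 2.1 and 2.3.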
References
Andersen, T. G. (1996): “Return Volatility and Trading Volume: An Information Flow Interpretation of Stochastic Volatility,” Journal of Finance, 51, 169–204.
Andrews, D. W. K. (1993): “Tests for Parameter Instability and Structural Change With Unknown Change Point,” Econometrica, 61, 821–856.
Bai, J. and P. Perron (1998): “Estimating and Testing Linear Models with Multiple Structural Changes,” Econometrica, 66, 47–78.
Barndorff-Nielsen, O. E. and N. Shephard (2006): “Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation,” Journal of Financial Econometrics, 4, 1–30.
Bollerslev, T., J. Li, and Y. Xue (2018): “Volume, Volatility, and Public News Announcements,” Review of Economic Studies, 85, 2005–2041.
Bollerslev, T. and V. Todorov (2011): “Estimation of Jump Tails,” Econometrica, 79, 1727–1783.
Calonico, S., M. D. Cattaneo, and R. Titiunik (2014): “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs,” Econometrica, 82, 2295–2326.
Canay, I. A. and V. Kamat (2017): “Approximate Permutation Tests and Induced Order Statistics in the Regression Discontinuity Design,” The Review of Economic Studies, 85, 1577–1608.
Cattaneo, M. D., B. R. Frandsen, and R. Titiunik (2015): “Randomization inference in the regression discontinuity design: An application to party advantages in the US Senate,” Journal of Causal Inference, 3, 1–24.
Cattaneo, M. D., R. Titiunik, and G. Vazquez-Bare (2017): “Comparing inference approaches for RD designs: A reexamination of the effect of Head Start on child mortality,” Journal of Policy Analysis and Management, 36, 643–681.
Chow, G. C. (1960): “Tests of Equality Between Sets of Coefficients in Two Linear Regressions,” Econometrica, 28, 591–605.
Chung, E. and J. P. Romano (2013): “Exact and Asymptotically Robust Permutation Tests,” Annals of Statistics, 41, 484–507.
Cochrane, J. H. and M. Piazzesi (2002): “The Fed and Interest Rates - A High-Frequency Identification,” American Economic Review, 92, 90–95.
Comte, F. and E. Renault (1998): “Long Memory in Continuous Time Stochastic Volatility Models,” Mathematical Finance, 8, 291–323.
Davidson, J. (1994): Stochastic Limit Theory, Oxford University Press.
DiCiccio, C. J. and J. P. Romano (2017): “Robust Permutation Tests for Correlation and Regression Coefficients,” Journal of the American Statistical Association, 112, 1211–1220.
Foster, D. P. and D. B. Nelson (1996): “Continuous Record Asymptotics for Rolling Sample Variance Estimators,” Econometrica, 64, 139–174.
Hahn, J., P. Todd, and W. van der Klaauw (2001): “Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design,” Econometrica, 69, 201–209.
Hoeffding, W. (1952): “The Large-Sample Power of Tests Based on Permutations of Observations,” Annals of Mathematical Statistics, 23, 169–192.
Imbens, G. and K. Kalyanaraman (2011): “Optimal Bandwidth Choice for the Regression Discontinuity Estimator,” The Review of Economic Studies, 79, 933–959.
Imbens, G. W. and T. Lemieux (2008): “Regression Discontinuity Designs: A Guide to Practice,” Journal of Econometrics, 142, 615–635.
Jacod, J. and P. Protter (2012): Discretization of Processes, Springer.
Lee, D. S. and T. Lemieux (2010): “Regression Discontinuity Designs in Economics,” Journal of Economic Literature, 48, 281–355.
Lehmann, E. L. and J. P. Romano (2005): Testing Statistical Hypotheses, Springer.
Li, J. and Y. Liu (2020): “Efficient Estimation of Integrated Volatility Functionals under General Volatility Dynamics,” Econometric Theory, Forthcoming.
Li, J., V. Todorov, and G. Tauchen (2017): “Jump Regressions,” Econometrica, 85, 173–195.
Li, J. and D. Xiu (2016): “Generalized Method of Integrated Moments for High-frequency Data,” Econometrica, 84, 1613–1633.
Nakamura, E. and J. Steinsson (2018a): “High-Frequency Identification of Monetary Non-Neutrality: The Information Effect,” Quarterly Journal of Economics, 133, 1283–1330.
Nakamura, E. and J. Steinsson (2018b): “Identification in Macroeconomics,” Journal of Economic Perspectives, 32, 59–86.
Stock, J. (1994): “Chapter 46 Unit Roots, Structural Breaks and Trends,” Elsevier, vol. 4 of Handbook of Econometrics, 2739–2841.
Thistlethwaite, D. L. and D. T. Campbell (1960): “Regression-discontinuity Analysis: An alternative to the Ex Post Facto Experiment,” Journal of Educational Psychology, 51, 309–317.
Todorov, V. and G. Tauchen (2012): “Realized Laplace Transforms for Pure-jump Semimartingales,” The Annals of Statistics, 40, 1233–1262.
van der Vaart, A. and J. Wellner (1996):