Permutation-based tests for discontinuities in event studies

Abstract

We propose using a permutation test to detect discontinuities in an underlying economic model at a cutoff point. Relative to the existing literature, we show that this test is well suited for event studies based on time-series data. The test statistic measures the distance between the empirical distribution functions of observed data in two local subsamples on the two sides of the cutoff. Critical values are computed via a standard permutation algorithm. Under a high-level condition that the observed data can be coupled by a collection of conditionally independent variables, we establish the asymptotic validity of the permutation test, allowing the sizes of the local subsamples either to be fixed or to grow to infinity. In the latter case, we also establish that the permutation test is consistent. We demonstrate that our high-level condition can be verified in a broad range of problems in the infill asymptotic time-series setting, which justifies using the permutation test to detect jumps in economic variables such as volatility, trading activity, and liquidity. An empirical illustration on a recent sample of daily S&P 500 returns is provided.


Federico A. Bugni, Department of Economics, Duke University

Jia Li, Department of Economics, Duke University

July 21, 2020

KEYWORDS: event study, infill asymptotics, jump, permutation tests, randomization tests, semi-martingale.
JEL classification codes: C12, C14, C22, C32.

Introduction

Many econometric problems can be expressed in terms of the continuity or the discontinuity of a certain component in the underlying economic model. In an influential paper, Chow (1960) tested the temporal stability in the demand for automobiles, and subsequently stimulated a large literature on structural breaks in time series analysis; see, for example, Andrews (1993), Stock (1994), Bai and Perron (1998), and many references therein. In microeconometrics, the regression discontinuity design (RDD) has been extensively used for causal inference. This literature identifies and estimates an average treatment effect by evaluating discontinuities of conditional expectation functions of outcome and treatment variables at a cutoff point of the running variable; see Imbens and Lemieux (2008) and Lee and Lemieux (2010) for comprehensive reviews. Meanwhile, a more recent high-frequency financial econometrics literature has been devoted to studying discontinuities, or jumps, in various financial time series (e.g., price, volatility, and trading activity). The high-frequency jump literature was pioneered by Barndorff-Nielsen and Shephard (2006), who propose the first nonparametric test for asset price jumps using high-frequency data in an infill asymptotic setting. More recently, Bollerslev et al. (2018) study the jumps of volatility and trading intensity in high-frequency jump regressions (Li et al. (2017)) that closely resemble the classical RDD.

Although these strands of literature involve apparently different terminology and technical tools, they share a common theme: The econometric goal is to learn about differences in the data generating processes between two subsamples separated by the cutoff. Imbens and Kalyanaraman (2011) emphasize that these subsamples should be “local” to the cutoff point, which is quite natural given the nonparametric nature of discontinuity inference (Hahn et al. (2001)). The issue under study is thus a local version of the classical two-sample problem.
Correspondingly, the related inference is often carried out using nonparametric two-sample t-tests, which are based on kernel regressions in the RDD (Hahn et al. (2001), Imbens and Kalyanaraman (2011), Calonico et al. (2014)) or, in the same spirit, spot high-frequency estimators (Foster and Nelson (1996), Comte and Renault (1998), Jacod and Protter (2012), Li et al. (2017), Bollerslev et al. (2018)) in the infill time-series setting.

In an ideal scenario in which the subsamples separated by the cutoff are i.i.d., the permutation test is an excellent tool to detect differences in their distributions. In particular, standard results for randomization inference (Lehmann and Romano (2005, Chapter 15.2)) indicate that a permutation test implemented with an arbitrary test statistic is finite-sample valid under these conditions. The recent literature has investigated the properties of permutation tests under less ideal conditions. One example is Canay and Kamat (2017), who consider an RDD and show that permutation-based inference is asymptotically valid to detect discontinuities in the distribution of the baseline covariates at the cutoff. These authors implement their test with a finite number of [...]

Footnote: Coincidentally, the RDD was first proposed by Thistlethwaite and Campbell (1960) around the same time as the Chow test.

[...] $L_2$ distance between the empirical cumulative distribution functions for the two local subsamples near the cutoff, and compute the critical value via a standard permutation algorithm. As explained earlier, if the data were i.i.d., the behavior of this permutation test would follow directly from standard results for randomization inference. This “off-the-shelf” theory, however, is not applicable here because time-series data observed in a short event window can be serially highly dependent.

The main theoretical contribution of the present paper is to establish the asymptotic validity of the permutation test in this non-standard setting. The theory has two components.
The first is a new generic result for permutation tests. Specifically, we link the (feasible) permutation test formed using the original data with an infeasible test constructed in a “coupling” problem that involves conditionally i.i.d. coupling variables. Since the latter resembles the classical two-sample problem, the infeasible test controls size exactly under the coupling null hypothesis (i.e., coupling variables in the two subsamples are homogeneous), and is consistent under the complementary alternative hypothesis. Under a proper notion of coupling, which is customized for the permutation test, we show that the feasible test inherits the same asymptotic rejection properties from the infeasible one. Since this result is of independent theoretical interest that goes well beyond our subsequent analysis in the infill time-series setting, we frame the theory under general high-level conditions so as to facilitate other types of applications.

The second component of our analysis pertains to specializing the generic result to the infill time-series setting designed for event-study applications. The event-study framework is particularly relevant for studying macroeconomic and financial shocks, including monetary shocks triggered by FOMC announcements (Cochrane and Piazzesi (2002), Nakamura and Steinsson (2018a)), or “natural disasters” such as the ongoing COVID-19 pandemic. Following Li and Xiu (2016) and Bollerslev et al. (2018), we model observed data using a general state-space framework, in which the observations are discretely sampled from a latent state process “contaminated” by random disturbances. This model has been used to model variables such as asset returns, trading volume, duration, and bid-ask spread, and readily accommodates both continuously and discretely valued variables.
Under this state-space model, the temporal discontinuity in the data’s distribution is mainly driven by the jump of the latent state process (e.g., asset volatility, trading intensity, and propensity of informed trading), which can be detected by the permutation test. Under easy-to-verify primitive conditions, we construct coupling variables and apply the aforementioned general theory to establish the permutation test’s asymptotic validity.

We recognize two advantages of the proposed permutation test in comparison with the standard approach based on the nonparametric “spot” estimation of the underlying state process. Firstly, the permutation test attains asymptotic size control even if the number of observations in each subsample is fixed. This remarkable property is reminiscent of the finite-sample exactness of the permutation test in the classical two-sample problem for i.i.d. data. In contrast, the nonparametric estimation approach works in a fundamentally different way, as it relies on the asymptotic (mixed) normality of the estimator, which in turn requires the sizes of the local subsamples to grow to infinity. In empirical applications, however, it is often desirable to use a short time window, either to reduce the effect of confounding factors in the background, or simply because of the lack of observations soon after the occurrence of the economic event (say, in a real-time research situation). Not surprisingly, the conventional inference based on asymptotic Gaussianity often results in large size distortions in this “small-sample” scenario, as we demonstrate concretely in a realistically calibrated Monte Carlo experiment (see Section 3). Meanwhile, the permutation test exhibits much more robust size control in finite samples.

The second advantage of the permutation test is its versatility: The same test can be applied in many different empirical contexts without any modification.
On the other hand, the nonparametric estimation approach often relies on specific features of the problem, and needs to be designed on a case-by-case basis. Therefore, the proposed permutation test may be particularly attractive in new empirical environments for which tests based on the conventional approach are not yet developed or not yet well understood. In Section 2.2, we illustrate this point more concretely in the context of testing for volatility jumps. In that case, the standard approach relies crucially on the assumption that the price shocks are Brownian in its design of the spot volatility estimator and the associated t-statistic, and it cannot be adapted easily to accommodate a more general setting with Lévy-driven shocks. The permutation test, on the other hand, is valid even in the latter, more general, setting.

Footnote: For similar types of results in the context of RDD, see Cattaneo et al. (2015), Cattaneo et al. (2017), and Canay and Kamat (2017).

Footnote: To the best of our knowledge, the estimation and inference of the spot volatility (i.e., the scaling process) in the non-Brownian case remains an open question in the literature. There is some limited work on the inference of integrated volatility functionals for the non-Brownian case (see Todorov and Tauchen (2012)) which demonstrates various distinct complications in the non-Brownian setting.

Notation. We use $\|x\|$ to denote the Euclidean norm of a vector $x$. For any real number $a$, we use $\lceil a \rceil$ to denote the smallest integer that is no smaller than $a$. For any constant $p \ge 1$, $\|\cdot\|_p$ denotes the $L_p$ norm for random variables. For two real sequences $a_n$ and $b_n$, we write $a_n \asymp b_n$ if $a_n/C \le b_n \le C a_n$ for some finite constant $C \ge 1$.

We first prove a new result that is broadly useful for establishing the asymptotic validity of permutation tests. Because of its independent theoretical interest, we develop the theory under high-level conditions. In Section 2.2, below, we shall specialize this general result in event-study applications.

Consider a collection $(Y_{n,i})_{i \in \mathcal{I}_n}$ of $\mathbb{R}$-valued observed variables defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, which may be either “raw” data or preliminary estimators. Our econometric goal is to decide whether two subsamples $(Y_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(Y_{n,i})_{i \in \mathcal{I}_{2,n}}$ have “significantly” different distributions, where $(\mathcal{I}_{1,n}, \mathcal{I}_{2,n})$ is a partition of $\mathcal{I}_n$. For ease of exposition, we assume that $\mathcal{I}_{1,n}$ and $\mathcal{I}_{2,n}$ contain the same number of observations, denoted by $k_n$. We stress from the outset that $k_n$ may either be fixed or grow to infinity in the subsequent analysis. As such, our analysis speaks not only to the classical finite-sample analysis of permutation tests, but also to the large-sample analysis routinely used in econometrics.

To implement the test, we first estimate the empirical cumulative distribution functions (CDF) for the two subsamples using
$$\widehat{F}_{j,n}(x) \equiv \frac{1}{k_n} \sum_{i \in \mathcal{I}_{j,n}} \mathbf{1}\{Y_{n,i} \le x\}, \quad j \in \{1, 2\}.$$
We then measure their difference via the Cramér-von Mises statistic given by
$$\widehat{T}_n \equiv \frac{1}{2k_n} \sum_{i \in \mathcal{I}_n} \left( \widehat{F}_{1,n}(Y_{n,i}) - \widehat{F}_{2,n}(Y_{n,i}) \right)^2.$$
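The two empirical CDFs and the Cramér-von Mises statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cvm_statistic` is hypothetical, NumPy is assumed to be available, and the statistic is normalized by $1/(2k_n)$, the pooled sample size. The scaling is immaterial for the permutation test, because the same factor multiplies every permuted statistic.

```python
import numpy as np


def cvm_statistic(y_pre, y_post):
    """Cramer-von Mises distance between the empirical CDFs of two subsamples.

    y_pre and y_post stand for the observations indexed by I_{1,n} and I_{2,n};
    both are assumed to contain k_n observations (hypothetical helper).
    """
    y_pre = np.asarray(y_pre, dtype=float)
    y_post = np.asarray(y_post, dtype=float)
    k = y_pre.size
    if y_post.size != k:
        raise ValueError("both subsamples must contain k_n observations")
    pooled = np.concatenate([y_pre, y_post])
    # F_hat_{j,n}(x) = (1/k_n) * #{i in I_{j,n} : Y_{n,i} <= x},
    # evaluated at every pooled observation.
    f_pre = np.searchsorted(np.sort(y_pre), pooled, side="right") / k
    f_post = np.searchsorted(np.sort(y_post), pooled, side="right") / k
    # Average squared CDF difference over the 2 * k_n pooled observations.
    return float(np.sum((f_pre - f_post) ** 2) / (2 * k))
```

Identical subsamples give a statistic of zero, and the statistic grows as the two empirical CDFs separate.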
For a significance level $\alpha \in (0, 1)$, the critical value is computed via the permutation algorithm described below. We use $\pi$ to denote a permutation of the elements of $\mathcal{I}_n$, that is, a bijective mapping from $\mathcal{I}_n$ to itself. Let $\mathbf{G}_n$ denote the collection of all possible permutations of $\mathcal{I}_n$, with $M_n$ being its cardinality.

Algorithm 1.
Step 1. For each permutation $\pi \in \mathbf{G}_n$, compute the permuted test statistic $\widehat{T}_n(\pi)$ as $\widehat{T}_n$, but with $(Y_{n,i})_{i \in \mathcal{I}_n}$ replaced by $(Y_{n,\pi(i)})_{i \in \mathcal{I}_n}$.
Step 2. Order $\{\widehat{T}_n(\pi) : \pi \in \mathbf{G}_n\}$ as $\widehat{T}_n^{(1)} \le \cdots \le \widehat{T}_n^{(M_n)}$. Set $\widehat{T}_n^{*} = \widehat{T}_n^{(k)}$ for $k = \lceil M_n (1 - \alpha) \rceil$.
Step 3. If $\widehat{T}_n > \widehat{T}_n^{*}$, reject the null hypothesis. If $\widehat{T}_n < \widehat{T}_n^{*}$, do not reject the null hypothesis. If $\widehat{T}_n = \widehat{T}_n^{*}$, reject the null hypothesis with probability $\hat{p}_n \equiv (M_n \alpha - \widehat{M}_n^{+}) / \widehat{M}_n^{0}$, where $\widehat{M}_n^{+}$ and $\widehat{M}_n^{0}$ are the cardinalities of $\{j : \widehat{T}_n^{(j)} > \widehat{T}_n^{*}\}$ and $\{j : \widehat{T}_n^{(j)} = \widehat{T}_n^{*}\}$, respectively. The resulting test then rejects according to
$$\hat{\phi}_n \equiv \mathbf{1}\{\widehat{T}_n > \widehat{T}_n^{*}\} + \hat{p}_n \mathbf{1}\{\widehat{T}_n = \widehat{T}_n^{*}\}. \quad \square$$

Remark 2.1.

The test $\hat{\phi}_n$ specified in Algorithm 1 is a randomized test and has a random outcome when $\widehat{T}_n = \widehat{T}_n^{*}$. One can construct a non-randomized (and more conservative) version by replacing $\hat{p}_n$ with zero. Also, in practice, $M_n$ may be too large to consider $\mathbf{G}_n$ in its entirety. In such cases, we could replace $\mathbf{G}_n$ with a random subset of it, denoted by $\widehat{\mathbf{G}}_n$, and composed of the identity [...] $\mathbf{G}_n$. All of the formal results in this paper would apply if we use $\widehat{\mathbf{G}}_n$ instead of $\mathbf{G}_n$ in Algorithm 1.

Footnote: All of our results can be easily extended to the case when $\mathcal{I}_{1,n}$ and $\mathcal{I}_{2,n}$ have different sizes, but with the same order of magnitude.

If the data $(Y_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d., then the null hypothesis of the classical two-sample problem holds, and Lehmann and Romano (2005, Theorem 15.2.1) implies that the aforementioned permutation test has exact size control in finite samples. This is a remarkable property of the permutation test, as it holds without requiring any specific distributional assumptions on the data. In contrast to the classical two-sample problem, however, we shall not assume that the data are independent, or even “weakly” dependent (e.g., mixing). As mentioned in the Introduction, the main goal of this paper is to study the permutation test for time-series data observed within a short event window (say, a few days or hours), which can be serially highly dependent in practice. Our key theoretical insight is that the permutation test is still asymptotically valid if the data $(Y_{n,i})_{i \in \mathcal{I}_n}$ can be approximated, or “coupled,” by another collection of variables that are conditionally independent, as formalized by the following assumption.

Assumption 2.1.

There exists a collection of variables $(U_{n,i})_{i \in \mathcal{I}_n}$ such that the following conditions hold for a sequence $(\mathcal{G}_n)_{n \ge 1}$ of $\sigma$-fields: (i) for each $n \ge 1$, the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally independent, and $U_{n,i}$ has the same $\mathcal{G}_n$-conditional distribution as $U_{n,j}$ if $i, j$ belong to the same subsample (i.e., $\mathcal{I}_{1,n}$ or $\mathcal{I}_{2,n}$); (ii) for any real sequence $\eta_n = o(1)$, we have $\sup_{x \in \mathbb{R}} \mathbb{P}(|U_{n,i} - x| \le \eta_n \mid \mathcal{G}_n) = O_p(\eta_n)$; (iii) $\max_{i \in \mathcal{I}_n} |\widetilde{Y}_{n,i} - U_{n,i}| = o_p(k_n^{-2})$, where $(\widetilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ is an identical copy of $(Y_{n,i})_{i \in \mathcal{I}_n}$ in $\mathcal{G}_n$-conditional distribution.

Assumption 2.1 lays out the high-level structure for bridging our analysis with the classical theory on permutation tests, which we carry out in Theorem 2.1 below. Condition (i) sets up the “coupling” problem, which corresponds to a conditional version of the classical two-sample problem, treating the $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$ variables as “data.” In part (a) of Theorem 2.1, we consider the situation in which both subsamples have the same conditional distribution. In this case, our coupling variables $(U_{n,i})_{i \in \mathcal{I}_n}$ give rise to an infeasible permutation test that can be analyzed as a classical two-sample problem. In particular, this infeasible permutation test attains the exact finite-sample size under our conditions.

This infeasible test, however, only plays an auxiliary role in our analysis, because our interest is in the feasible test $\hat{\phi}_n$ formed using the original $(Y_{n,i})_{i \in \mathcal{I}_n}$ data. Therefore, a key component of our theoretical argument in Theorem 2.1 is to show that the feasible test for the original data inherits asymptotically the same rejection properties from the infeasible test. Conditions (ii) and (iii) in Assumption 2.1 are introduced for this purpose.
Specifically, condition (ii) requires the variable $U_{n,i}$ to be non-degenerate, in the sense that its conditional probability mass within any small $[x - \eta, x + \eta]$ interval is of order $O(\eta)$ in probability. Condition (iii) specifies the requisite [...] $k_n$, as detailed in Section 2.2.

Footnote: Condition (ii) is satisfied if the conditional probability densities of $U_{n,i}$, $n \ge 1$, exist and are uniformly bounded in probability.

Theorem 2.1.

Under Assumption 2.1, the following statements hold for the permutation test $\hat{\phi}_n$ described in Algorithm 1:
(a) If the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d., we have $\mathbb{E}[\hat{\phi}_n] \to \alpha$.
(b) Let $Q_{j,n}(\cdot)$ denote the $\mathcal{G}_n$-conditional distribution function of $U_{n,i}$ for $i \in \mathcal{I}_{j,n}$ and $j \in \{1, 2\}$, and $Q_n = (Q_{1,n} + Q_{2,n})/2$. If $k_n \to \infty$ and $\mathbb{P}\left( \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n \right) \to 1$ for any real sequence $\delta_n = o(1)$, we have $\mathbb{E}[\hat{\phi}_n] \to 1$.

Theorem 2.1 characterizes the asymptotic rejection probabilities of the feasible test $\hat{\phi}_n$ under the null and alternative hypotheses of the two-sample problem for the coupling variables. Part (a) pertains to the situation in which the two subsamples of coupling variables, $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$, have the same conditional distribution, which corresponds to the null hypothesis. In this case, the theorem shows that the asymptotic rejection probability of the feasible test is equal to the nominal level $\alpha$. It is relevant to note that this result holds whether $k_n$ is fixed or divergent. This property is clearly reminiscent of the permutation test’s finite-sample exactness in the classical setting.

Part (b) of Theorem 2.1 concerns the power of the feasible test $\hat{\phi}_n$. It shows that the feasible test rejects with probability approaching one when the conditional distributions of the two coupling subsamples, $Q_{1,n}$ and $Q_{2,n}$, are different, in the sense that their “distance” measured by $\int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x)$ is asymptotically non-degenerate, where the mixture distribution $Q_n$ captures approximately the distribution of the permuted data. This consistency-type result requires that the information available from each subsample grows with the sample size, i.e., $k_n \to \infty$. This result appears to be new in the context of permutation-based tests under a fixed alternative for the coupling variables.
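To make the test $\hat{\phi}_n$ in these statements concrete, the following Python sketch implements the steps of Algorithm 1 with the random-subset shortcut mentioned in Remark 2.1. Several choices here are assumptions rather than the authors' implementation: the names `cvm_counts` and `permutation_test`, the parameters `n_perm` and `seed`, and the use of independent uniform draws of permutations alongside the identity. The statistic is computed in integer form (sums of squared count differences), so that ties with the critical value are detected exactly rather than through floating-point comparisons.

```python
import math

import numpy as np


def cvm_counts(pooled, k):
    """Integer form of the Cramer-von Mises statistic.

    The first k entries play the role of I_{1,n}, the last k of I_{2,n}.
    Dividing the result by 2 * k**3 gives a 1/(2 k_n)-normalized statistic.
    """
    c_pre = np.searchsorted(np.sort(pooled[:k]), pooled, side="right")
    c_post = np.searchsorted(np.sort(pooled[k:]), pooled, side="right")
    return int(np.sum((c_pre - c_post) ** 2))


def permutation_test(y_pre, y_post, alpha=0.05, n_perm=999, seed=None):
    """Return (rejection probability phi, observed statistic, critical value)."""
    rng = np.random.default_rng(seed)
    y_pre = np.asarray(y_pre, dtype=float)
    y_post = np.asarray(y_post, dtype=float)
    if y_pre.size != y_post.size:
        raise ValueError("both subsamples must contain k_n observations")
    k = y_pre.size
    pooled = np.concatenate([y_pre, y_post])
    t_obs = cvm_counts(pooled, k)
    # Step 1: identity permutation plus n_perm random permutations (assumed scheme).
    stats = [t_obs] + [cvm_counts(rng.permutation(pooled), k) for _ in range(n_perm)]
    stats = np.sort(np.array(stats))
    m = stats.size
    # Step 2: critical value is the ceil(M(1 - alpha))-th order statistic;
    # the rounding guards against floating-point noise in m * (1 - alpha).
    order = math.ceil(round(m * (1 - alpha), 10))
    t_star = stats[order - 1]
    # Step 3: randomized rejection rule on ties.
    m_plus = int(np.sum(stats > t_star))
    m_zero = int(np.sum(stats == t_star))
    p_hat = (m * alpha - m_plus) / m_zero
    if t_obs > t_star:
        phi = 1.0
    elif t_obs == t_star:
        phi = p_hat
    else:
        phi = 0.0
    scale = 2 * k ** 3
    return phi, t_obs / scale, t_star / scale
```

The returned `phi` is a rejection probability. A randomized decision would reject when an independent uniform draw falls below `phi`, while the more conservative non-randomized version rejects only when `phi` equals one.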
In particular, we note that an analogous result is unavailable in Canay and Kamat (2017), as they restrict attention to an asymptotic framework with a fixed $k_n$. Our proof relies on applying Lehmann and Romano (2005, Theorem 15.2.3) to the infeasible test, for which we use the coupling construction developed by Chung and Romano (2013) to show that the so-called Hoeffding (1952) condition is satisfied.

Theorem 2.1 establishes the relation between the rejection probability of the feasible test $\hat{\phi}_n$ and the homogeneity (or the lack of it) across the two coupling subsamples $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ and $(U_{n,i})_{i \in \mathcal{I}_{2,n}}$. This result does not speak directly to hypotheses formulated in terms of the original $(Y_{n,i})_{i \in \mathcal{I}_n}$ [...]

Footnote: In our applications, we can often verify condition (iii) with $\widetilde{Y}_{n,i} = Y_{n,i}$. Nonetheless, allowing $\widetilde{Y}_{n,i} \ne Y_{n,i}$ is useful when $Y_{n,i}$ is itself an estimator. For example, if $(Y_{n,i})_{i \in \mathcal{I}_n}$ is a finite collection of estimators that converge jointly in distribution, then the coupling can be obtained via Skorokhod representation; see Canay and Kamat (2017) for an application of this type.

[...] Let $Y_{n,i} = \Delta_n^{-1/2} (P_{(i+1)\Delta_n} - P_{i\Delta_n})$ be the scaled increment of the asset price process $P_t$ over the $i$th sampling interval $(i\Delta_n, (i+1)\Delta_n]$. Let $\tau = i^{*}\Delta_n$ be a “cutoff” time point of interest (e.g., the announcement time of a news release), and consider two index sets $\mathcal{I}_{1,n} = \{i^{*} - k_n, \ldots, i^{*} - 1\}$ and $\mathcal{I}_{2,n} = \{i^{*} + 1, \ldots, i^{*} + k_n\}$, which collect observations before and after the cutoff, respectively. We consider an asymptotic setting in which these subsamples are “local” in calendar time, that is, $k_n \Delta_n \to 0$. Note that this implies that $\Delta_n \to 0$, which means that we are considering an infill asymptotic setting. If $P_t$ is an Itô process with respect to an information filtration $(\mathcal{F}_t)_{t \ge 0}$, we may represent $Y_{n,i}$ as
$$Y_{n,i} = \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} b_s \, ds + \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} \sigma_s \, dW_s, \quad \text{for } i \in \mathcal{I}_n, \tag{1}$$
where $b_t$ is the drift process, $\sigma_t$ is the stochastic volatility process, and $W_t$ is a standard Brownian motion. If the $\sigma_t$ process is smooth (e.g., Hölder continuous) in a local neighborhood before $\tau$, then the volatility throughout the pre-event subsample $\mathcal{I}_{1,n}$ is approximately $\sigma_{(i^{*} - k_n)\Delta_n}$. Further recognizing that the drift term is negligible relative to the Brownian component, we can approximate $Y_{n,i}$ for each $i \in \mathcal{I}_{1,n}$ using the coupling variables
$$U_{n,i} = \sigma_{(i^{*} - k_n)\Delta_n} \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n}) \sim \mathcal{MN}\left(0, \sigma^2_{(i^{*} - k_n)\Delta_n}\right), \tag{2}$$
where $\mathcal{MN}$ denotes the mixed normal distribution. Since the Brownian motion has independent and stationary increments, it is easy to see that the coupling variables $(U_{n,i})_{i \in \mathcal{I}_{1,n}}$ are $\mathcal{F}_{(i^{*} - k_n)\Delta_n}$-conditionally i.i.d. Moreover, if the volatility process $\sigma_t$ does not jump at the cutoff time $\tau$, we may follow the same logic to extend the approximation in (2) further to $i \in \mathcal{I}_{2,n}$. In other words, if the volatility process does not jump then the coupling variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are conditionally i.i.d., which corresponds to the situation in part (a) of Theorem 2.1. On the other hand, if the volatility process jumps at time $\tau$, say by a constant $c \ne 0$, then the coupling variables for the $\mathcal{I}_{2,n}$ subsample will instead take the form $U_{n,i} = \left(\sigma_{(i^{*} - k_n)\Delta_n} + c\right) \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$.
In this case, the two subsamples of $U_{n,i}$’s have distinct conditional distributions (i.e., mixed normal with different conditional variances), corresponding to the scenario in part (b) of Theorem 2.1.

Within the context of this illustrative example, we can further clarify a key feature of the proposed test that holds more generally. It is not aimed at detecting “small” time-variations in the distribution of the observed data. In fact, by allowing the drift $b_t$ and the volatility $\sigma_t$ to be time-varying, a smooth form of heterogeneity is always built in. The test instead detects abrupt changes, or discontinuities, in the evolution of the distribution, which can be more plausibly associated with the “lumpy” information carried by the underlying economic announcement, as emphasized by Nakamura and Steinsson (2018b). Specifically in this example, the asset returns are locally centered Gaussian (due to the assumption that the price is an Itô process), and hence, the temporal discontinuity in the return distribution manifests itself as a volatility jump. The empirical scope of our permutation test, however, is far beyond the volatility-jump testing depicted in this illustration, as we shall demonstrate in the remainder of the paper.

Footnote: Note that $\mathcal{I}_n$ does not include the $i^{*}$th return observation. Therefore, although the returns in (1) do not contain price jumps, an event-induced price jump is allowed to occur at time $\tau$.

We now specialize the generic Theorem 2.1 into an infill asymptotic time-series setting that is particularly suitable for event studies. By introducing a mild additional econometric structure, we shall establish the asymptotic validity of the permutation test under more primitive conditions that are easy to verify in a variety of concrete empirical settings. As in the running example above, we consider an event occurring at time $\tau = i^{*}\Delta_n$, which separates two subsamples indexed by $\mathcal{I}_{1,n} = \{i^{*} - k_n, \ldots, i^{*} - 1\}$ and $\mathcal{I}_{2,n} = \{i^{*} + 1, \ldots, i^{*} + k_n\}$, respectively. All limits in the sequel are obtained under the infill asymptotic setting with $\Delta_n \to 0$. The observed data are modeled by the state-space representation
$$Y_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}) + R_{n,i}, \quad i \in \mathcal{I}_n, \tag{3}$$
where the state process $\zeta_t$ is càdlàg, adapted to a filtration $\mathcal{F}_t$, and takes values in an open set $\mathcal{Z} \subseteq \mathbb{R}^{\dim(\zeta)}$; $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. random disturbances taking values in some (possibly abstract) space $\mathcal{E}$; $g(\cdot, \cdot)$ is a “smooth” transform; and $R_{n,i}$ is a residual term that is negligible relative to the leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ in a proper sense detailed below.
A simpler version of this state-space model without the $R_{n,i}$ residual term has been used by Li and Xiu (2016) and Bollerslev et al. (2018), among others, for modeling market variables such as trading volume and bid-ask spread. By introducing the $R_{n,i}$ term, we can use a unified framework to accommodate a broader class of models, which in particular include increments of an Itô semimartingale. We now revisit the model in (1) as the first illustration.

Example 1 (Brownian Asset Returns). We represent the Itô-process model (1) for asset returns in the form of (3) by setting $\zeta_t = \sigma_t$, $\epsilon_{n,i} = \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$, and $g(z, \epsilon) = z\epsilon$. The resulting residual term $R_{n,i}$ has the form
$$R_{n,i} = \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} b_s \, ds + \Delta_n^{-1/2} \int_{i\Delta_n}^{(i+1)\Delta_n} (\sigma_s - \sigma_{i\Delta_n}) \, dW_s. \tag{4}$$
Under mild and fairly standard regularity conditions, it is easy to show that the $R_{n,i}$ terms are uniformly $o_p(1)$. On the other hand, the leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ has a non-degenerate centered mixed Gaussian distribution with conditional variance $\sigma^2_{i\Delta_n}$. $\square$

This running example further illustrates the distinct roles played by $\zeta_t$, $\epsilon_{n,i}$, and $R_{n,i}$ in our state-space model (3). The leading term $g(\zeta_{i\Delta_n}, \epsilon_{n,i})$ captures the “main feature” of the observed data; in addition, since the $\epsilon_{n,i}$ disturbance terms are i.i.d., any “large” change in the empirical distribution across the two subsamples must be attributed to the time-$\tau$ discontinuity in the state process $\zeta_t$.
From this description, it follows that the hypothesis test for the continuity of the distribution of the main feature of the observed data can be formulated as
$$H_0 : \Delta\zeta_\tau = 0 \quad \text{versus} \quad H_a : \Delta\zeta_\tau \ne 0, \tag{5}$$
where $\Delta\zeta_\tau \equiv \zeta_\tau - \zeta_{\tau-} \equiv \zeta_\tau - \lim_{s \uparrow \tau} \zeta_s$ denotes the jump of the state process at time $\tau$.

With the state-space model (3) in place, we can design more primitive sufficient conditions for establishing the asymptotic validity of the permutation test under the hypotheses in (5). We need some additional notation to describe these conditions. For each fixed $z \in \mathcal{Z}$, let $f_z(\cdot)$ and $F_z(\cdot)$ denote the probability density function (PDF) and the CDF of the random variable $g(z, \epsilon_{n,i})$, respectively. It is also convenient to introduce a “shifted” version of $\zeta_t$ defined as $\tilde{\zeta}_t \equiv \zeta_t - \Delta\zeta_\tau \mathbf{1}\{t \ge \tau\}$, which has the same increments as $\zeta_t$ over time intervals not containing $\tau$.

Assumption 2.2. (i) The collection of variables $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. and, for each $k \in \mathcal{I}_n$, the variables $(\epsilon_{n,i})_{i \ge k}$ are independent of $\mathcal{F}_{k\Delta_n}$. Moreover, for any compact subset $\mathcal{K} \subseteq \mathcal{Z}$, we have (ii) $\sup_{x \in \mathbb{R}, z \in \mathcal{K}} f_z(x) < \infty$; and (iii) $\inf_{z \in \mathcal{K}} \int_{\mathbb{R}} (F_z(x) - F_{z+c}(x))^2 \, dF_z(x) > 0$ whenever $c \ne 0$.

Assumption 2.3.

There exist a sequence $(T_m)_{m \ge 1}$ of stopping times increasing to infinity, a sequence of compact subsets $(\mathcal{K}_m)_{m \ge 1}$ of $\mathcal{Z}$, and a sequence $(K_m)_{m \ge 1}$ of constants such that for some real sequence $a_n \ge 1$ and each $m \ge 1$: (i) $\| g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i}) \|_2 \le K_m a_n \| z - z' \|$ for all $z, z' \in \mathcal{K}_m$; (ii) $\zeta_t$ takes values in $\mathcal{K}_m$ for all $t \le T_m$, and $\| \tilde{\zeta}_{t \wedge T_m} - \tilde{\zeta}_{s \wedge T_m} \|_2 \le K_m |t - s|^{1/2}$ for all $t, s$ in some fixed neighborhood of $\tau$; (iii) $\max_{i \in \mathcal{I}_n} |R_{n,i}| = o_p(k_n^{-2})$.

Footnote: Note that the assumption is framed in a localized fashion using the stopping times $(T_m)_{m \ge 1}$, which is a standard technique for weakening the regularity condition in the infill asymptotic setting. See Jacod and Protter (2012, Section 4.4.1) for a comprehensive discussion on the localization technique.

Assumption 2.2 entails regularity conditions pertaining to the random disturbance terms, which are often easy to verify in concrete examples as demonstrated later in this subsection. Assumption 2.3 imposes a set of smoothness conditions that permits the approximation of the observed data using properly constructed coupling variables. Specifically, condition (i) requires that the random function $z \mapsto g(z, \epsilon_{n,i})$ is Lipschitz in $z$ over compact sets under the $L_2$ distance. The $a_n$ sequence captures the scale of the Lipschitz coefficient. In many applications, we can verify this condition with $a_n \equiv 1$, but allowing $a_n$ to diverge to infinity is sometimes necessary (see Example 2 below). Condition (ii) states that the $\zeta_t$ process is locally compact (up to each stopping time $T_m$) and, upon removing the fixed-time discontinuity at $\tau$, it is $(1/2)$-Hölder continuous under the $L_2$ norm. This Hölder-continuity requirement can be easily verified using well-known results provided that the $\tilde{\zeta}$ process is an Itô semimartingale or a long-memory process (see Jacod and Protter (2012, Chapter 2) and Li and Liu (2020)). Condition (iii) imposes the requisite assumptions on the residual terms. In some applications, this condition holds trivially with $R_{n,i} \equiv 0$, but, more generally, it needs to be verified on a case-by-case basis using (relatively standard) infill asymptotic techniques.

Theorem 2.2, below, establishes the size and power properties of the permutation test under the hypotheses described in (5).
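As a concrete instance of the hypotheses in (5) within the state-space model (3), the following Python sketch simulates the leading term of Example 1, $g(\zeta_{i\Delta_n}, \epsilon_{n,i}) = \sigma_{i\Delta_n} \epsilon_{n,i}$, on both sides of the cutoff. It is a deliberately simplified illustration and not the calibrated Monte Carlo design of Section 3: the function name `simulate_scaled_returns` and its parameters are hypothetical, the volatility is assumed constant on each side of $\tau$, the residual term is set to $R_{n,i} \equiv 0$, and the disturbances are drawn as i.i.d. standard normal variables.

```python
import numpy as np


def simulate_scaled_returns(k, sigma_pre=1.0, jump=0.0, seed=None):
    """Simulate Y_{n,i} = sigma * eps_{n,i} before and after the cutoff.

    Simplifying assumptions (not from the paper): volatility is constant on
    each side of tau, R_{n,i} = 0, and eps_{n,i} are i.i.d. N(0, 1).
    jump = 0 corresponds to H_0 in (5); jump != 0 corresponds to H_a.
    """
    if sigma_pre <= 0 or sigma_pre + jump <= 0:
        raise ValueError("volatility must stay in (0, infinity)")
    rng = np.random.default_rng(seed)
    eps_pre = rng.standard_normal(k)   # disturbances for I_{1,n}
    eps_post = rng.standard_normal(k)  # disturbances for I_{2,n}
    y_pre = sigma_pre * eps_pre
    y_post = (sigma_pre + jump) * eps_post
    return y_pre, y_post


# Under H_0 the two subsamples share one distribution; under H_a the
# post-event subsample has a visibly different scale.
y_pre_null, y_post_null = simulate_scaled_returns(k=30, jump=0.0, seed=0)
y_pre_alt, y_post_alt = simulate_scaled_returns(k=30, jump=1.5, seed=0)
```

Feeding such simulated subsamples into a two-sample permutation test is one way to explore the size and power behavior that the theorem below describes asymptotically.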

Theorem 2.2.

In the state-space model (3), suppose that Assumptions 2.2 and 2.3 hold, and that $a_n k_n^3 \Delta_n^{1/2} = o(1)$. Then, the following statements hold for the permutation test $\hat{\phi}_n$ described in Algorithm 1:
(a) Under the null hypothesis in (5), i.e., $\Delta\zeta_\tau = 0$, we have $\mathbb{E}[\hat{\phi}_n] \to \alpha$;
(b) Under a fixed alternative hypothesis in (5), i.e., $\Delta\zeta_\tau = c$ for some (unknown) constant $c \ne 0$, we have $\mathbb{E}[\hat{\phi}_n] \to 1$ when $k_n \to \infty$.

This theorem is proved by verifying the high-level conditions in Theorem 2.1 with properly constructed coupling variables analogous to those in equation (2). The condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ mainly requires that the window size $k_n$ does not grow too fast, which ensures the closeness between the coupling variables and the original data. In the typical case with $a_n = 1$, it reduces to $k_n = o(\Delta_n^{-1/6})$. Part (a) shows that the permutation test attains the desired asymptotic level under the null hypothesis in (5). Again, we stress that the test has valid asymptotic size control even in the “small-sample” case with fixed $k_n$. As in Theorem 2.1, the “large-sample” condition $k_n \to \infty$ is needed for establishing the consistency of the test under the alternative, as shown in part (b).

In the remainder of this subsection, we use a few prototype examples to demonstrate how the proposed test may be used in various empirical settings. In particular, we show how to cast the specific problems into the approximate state-space model (3), and discuss how to verify our sufficient regularity conditions. We start by revisiting the running example.

Example 1 (Brownian Asset Returns, Continued). Recall that $\epsilon_{n,i} \equiv \Delta_n^{-1/2} (W_{(i+1)\Delta_n} - W_{i\Delta_n})$, $\zeta_t = \sigma_t$, and $g(z, \epsilon) = z\epsilon$. In this context, the hypothesis testing problem in (5) represents a test of the continuity of the volatility process $\sigma_t$ at time $t = \tau$, i.e.,
$$H_0 : \Delta\sigma_\tau = 0 \quad \text{versus} \quad H_a : \Delta\sigma_\tau \ne 0.$$
We suppose that the volatility process $\sigma_t$ is non-degenerate by setting its domain to $\mathcal{Z} = (0,\infty)$. Since the Brownian motion has independent increments with respect to the underlying filtration, the disturbance term $\epsilon_{n,i}$ satisfies Assumption 2.2(i). In addition, for each point $z \in \mathcal{Z}$, the random variable $g(z,\epsilon_{n,i})$ has an $N(0, z^2)$ distribution. It is then easy to see that conditions (ii) and (iii) in Assumption 2.2 hold for any compact subset $\mathcal{K} \subseteq \mathcal{Z}$ (note that $\mathcal{K}$ is necessarily bounded away from zero). To verify Assumption 2.3, first note that $g(z,\epsilon_{n,i}) - g(z',\epsilon_{n,i}) = (z - z')\epsilon_{n,i}$, and hence, $\|g(z,\epsilon_{n,i}) - g(z',\epsilon_{n,i})\| = |z - z'|$. Assumption 2.3(i) thus holds for $a_n = 1$. It is well known that $\sigma_t$ is locally $(1/2)$-Hölder continuous under the $L_2$ norm if it is an Itô semimartingale or a long-memory process; if so, Assumption 2.3(ii) is satisfied if the $\sigma_t$ and $\sigma_t^{-1}$ processes are both locally bounded. Finally, to verify Assumption 2.3(iii), we assume that the drift process $b_t$ is locally bounded. It is then easy to show via routine calculations that $\max_{i\in\mathcal{I}_n} |R_{n,i}| = O_p(k_n^{1/2}\Delta_n^{1/2})$. Since the condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ in Theorem 2.2 implies that $O_p(k_n^{1/2}\Delta_n^{1/2}) = o_p(k_n^{-2})$, we have $\max_{i\in\mathcal{I}_n} |R_{n,i}| = o_p(k_n^{-2})$ as needed in Assumption 2.3(iii). All conditions in Theorem 2.2 are now verified, and this shows that the permutation test $\hat\phi_n$ is asymptotically valid for testing the null hypothesis $\Delta\sigma_\tau = 0$. □

Example 1 shows that the permutation test $\hat\phi_n$ is asymptotically valid for testing the presence of a volatility jump. This is a relatively familiar problem in the literature.
It is therefore useful to contrast the proposed permutation test with the standard approach, which is based on nonparametric "spot" estimators of the asset price's instantaneous variances before and after the event time given by, respectively,

$$\hat\sigma^2_{\tau-} = \frac{1}{k_n}\sum_{i\in\mathcal{I}_{1,n}} Y_{n,i}^2, \qquad \hat\sigma^2_{\tau} = \frac{1}{k_n}\sum_{i\in\mathcal{I}_{2,n}} Y_{n,i}^2. \tag{6}$$

Assuming $k_n \to \infty$ and $k_n\Delta_n \to 0$, it can be shown that (see Jacod and Protter (2012, Chapter 13))

$$\frac{k_n^{1/2}\left(\hat\sigma^2_{\tau} - \hat\sigma^2_{\tau-} - (\sigma^2_{\tau} - \sigma^2_{\tau-})\right)}{\sqrt{2\hat\sigma^4_{\tau} + 2\hat\sigma^4_{\tau-}}} \xrightarrow{d} N(0,1). \tag{7}$$

Thus, we can test $H_0: \Delta\sigma_\tau = 0$ by comparing the t-statistic $k_n^{1/2}(\hat\sigma^2_{\tau} - \hat\sigma^2_{\tau-})/\sqrt{2\hat\sigma^4_{\tau} + 2\hat\sigma^4_{\tau-}}$ with critical values based on the standard normal distribution.

Two remarks are in order. First, note that the asymptotic size control of the standard approach relies on the asymptotic normal approximation (7), which depends crucially on $k_n \to \infty$ (in addition to having $\Delta_n \to 0$) because the underlying central limit theorem is obtained by aggregating a "large" number of martingale differences. Hence, the t-test may suffer from severe size distortion when $k_n$ is relatively small. This issue is empirically relevant because an applied researcher may use a short time window to capture short-lived "impulse-like" dynamics and/or to minimize the impact of other confounding economic factors in the background. Moreover, for "real-time" applications, the researcher may have no choice but to use a small $k_n$ simply because of the limited amount of available data soon after the event time $\tau$. In sharp contrast, the permutation test controls asymptotic size even when $k_n$ is fixed. This remarkable property is inherited from the coupling two-sample problem, in which the permutation test controls size exactly regardless of whether $k_n$ is fixed or grows to infinity.

The second and perhaps practically more important difference between the two tests is that the permutation test is more versatile. Under the spot-estimation-based approach, both the design of the spot estimators in (6) and the convergence in (7) depend heavily on the fact that the increments of the Brownian motion are not only i.i.d., but also Gaussian. Gaussianity is obviously essential for the conventional approach because, among other things, it ensures that the instantaneous variance of the normalized returns is well-defined. The permutation test, on the other hand, only exploits the i.i.d. property of the Brownian shocks, without relying on the Gaussianity. Therefore, the permutation test readily accommodates a more general model for asset returns with Lévy shocks, as we demonstrate in the following example.
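As a concrete reference point for the conventional approach, the following minimal Python sketch computes the spot variance estimators in (6) and the t-statistic implied by (7). The function name `spot_variance_t_test` and its arguments are illustrative assumptions rather than part of the paper; the inputs are taken to be the $k_n$ normalized returns on each side of the event time.

```python
import math

import numpy as np


def spot_variance_t_test(y_before, y_after):
    """Two-sided t-test for a volatility jump based on (6) and (7).

    y_before, y_after: normalized returns in the k_n observations before
    and after the event time (hypothetical argument names).
    Returns the t-statistic and its two-sided standard normal p-value.
    """
    y_before = np.asarray(y_before, dtype=float)
    y_after = np.asarray(y_after, dtype=float)
    k = len(y_before)
    if len(y_after) != k:
        raise ValueError("both windows must contain k_n observations")

    # Spot variance estimators in (6): local averages of squared returns.
    var_before = float(np.mean(y_before ** 2))
    var_after = float(np.mean(y_after ** 2))

    # Studentization in (7): sqrt(2 * sigma_hat^4 (after) + 2 * sigma_hat^4 (before)).
    scale = math.sqrt(2.0 * var_after ** 2 + 2.0 * var_before ** 2)
    t_stat = math.sqrt(k) * (var_after - var_before) / scale

    # P(|Z| > |t|) for a standard normal Z.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, p_value
```

The normal p-value is only justified when $k_n$ is large and the shocks are Gaussian, which is exactly the limitation discussed in the two remarks above.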

Example 2 (Lévy-driven Asset Returns). We generalize the model in Example 1 by replacing the Brownian motion $W$ with a Lévy martingale $L$, so that the asset return has the form

$$P_{(i+1)\Delta_n} - P_{i\Delta_n} = \int_{i\Delta_n}^{(i+1)\Delta_n} b_s\,ds + \int_{i\Delta_n}^{(i+1)\Delta_n} \sigma_s\,dL_s, \quad \text{for } i \in \mathcal{I}_n.$$

In this case, we define the random disturbance as $\epsilon_{n,i} \equiv \Delta_n^{-1/\beta}(L_{(i+1)\Delta_n} - L_{i\Delta_n})$ for some constant $\beta \in (1, 2]$, where the scaling factor $\Delta_n^{-1/\beta}$ is used to ensure that $\epsilon_{n,i}$ has a non-degenerate distribution. For instance, if $L$ is a stable process, we take $\beta$ to be its jump-activity index, so that $\epsilon_{n,i}$ has a centered stable distribution (recall that the Brownian motion is a stable process with index $\beta = 2$). We treat the value of $\beta$ as unknown. Since the permutation test is scale-invariant with respect to the data, we can nonetheless regard the normalized return $Y_{n,i} = \Delta_n^{-1/\beta}(P_{(i+1)\Delta_n} - P_{i\Delta_n})$ as directly observable (because tests implemented for $P_{(i+1)\Delta_n} - P_{i\Delta_n}$ and $Y_{n,i}$ are identical). To apply our theory, we represent $Y_{n,i}$ using the state-space model (3) with $\zeta_t = \sigma_t$, $g(z,\epsilon) = z\epsilon$, and the residual term given by

$$R_{n,i} = \Delta_n^{-1/\beta}\int_{i\Delta_n}^{(i+1)\Delta_n} b_s\,ds + \Delta_n^{-1/\beta}\int_{i\Delta_n}^{(i+1)\Delta_n} (\sigma_s - \sigma_{i\Delta_n})\,dL_s.$$

Recognizing that the scaled Lévy increments $(\epsilon_{n,i})_{i\in\mathcal{I}_n}$ are i.i.d., we can verify Assumptions 2.2 and 2.3 using similar arguments as in Example 1 but with $a_n = \Delta_n^{1/2 - 1/\beta}$, which depicts the rate at which $\|\epsilon_{n,i}\|$ diverges. In particular, the condition $a_n k_n^3 \Delta_n^{1/2} = o(1)$ requires $k_n$ to obey $k_n = o(\Delta_n^{(1/\beta - 1)/3})$. Then, we can apply Theorem 2.2 to show that the permutation test $\hat\phi_n$ is asymptotically valid for testing the discontinuity in the volatility process $\sigma_t$ at time $\tau$, regardless of whether the driving Lévy process is a Brownian motion or not. □

Footnote: Recall that many distributions used in continuous-time models do not have finite second moments. For example, within the class of stable distributions, the Gaussian distribution is the only one with a finite second moment. Moreover, Gaussianity also implies that the variance of $\Delta_n^{-1}(W_{i\Delta_n} - W_{(i-1)\Delta_n})^2$ is 2, which explains the "2" factor in the denominator of the t-statistic.
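The scale invariance used in Example 2 can be checked numerically. The sketch below is not Algorithm 1 itself, which is not reproduced in this section; it assumes a Cramér-von Mises-type statistic of the form used in the proof of Theorem 2.1 (the average squared distance between the two empirical distribution functions over the pooled sample) and the non-randomized decision rule. The names `cvm_statistic` and `permutation_test`, the number of permutations, and the seed are illustrative assumptions.

```python
import numpy as np


def cvm_statistic(sample_1, sample_2):
    """Average squared ECDF difference, evaluated at the pooled observations."""
    sample_1 = np.asarray(sample_1, dtype=float)
    sample_2 = np.asarray(sample_2, dtype=float)
    pooled = np.concatenate([sample_1, sample_2])
    # ECDF of each subsample evaluated at every pooled observation.
    ecdf_1 = np.mean(sample_1[:, None] <= pooled[None, :], axis=0)
    ecdf_2 = np.mean(sample_2[:, None] <= pooled[None, :], axis=0)
    return float(np.mean((ecdf_1 - ecdf_2) ** 2))


def permutation_test(before, after, alpha=0.05, n_perm=999, seed=0):
    """Non-randomized permutation test: reject iff statistic > critical value."""
    rng = np.random.default_rng(seed)
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    k = len(before)
    pooled = np.concatenate([before, after])
    observed = cvm_statistic(before, after)

    # Recompute the statistic on randomly permuted copies of the pooled data.
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = rng.permutation(pooled)
        perm_stats[b] = cvm_statistic(shuffled[:k], shuffled[k:])

    # Empirical (1 - alpha) quantile of the permutation distribution.
    index = int(np.ceil((1.0 - alpha) * n_perm)) - 1
    critical = float(np.sort(perm_stats)[index])
    return observed, critical, bool(observed > critical)


# Rescaling the data by a positive constant leaves the outcome unchanged,
# because the statistic depends on the data only through their ordering.
rng = np.random.default_rng(1)
raw_before = rng.standard_normal(15)
raw_after = 3.0 * rng.standard_normal(15)
scale = 2.0 ** 10  # stands in for the unknown normalizing factor
result_raw = permutation_test(raw_before, raw_after, seed=7)
result_scaled = permutation_test(scale * raw_before, scale * raw_after, seed=7)
```

With the same seed, `result_raw` and `result_scaled` coincide, which is why the unknown index $\beta$ never has to be estimated in order to run the test.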

So far, we have illustrated the use of the permutation test for high-frequency asset returns data. Under the settings of Examples 1 and 2, the distributional change of asset returns is mainly driven by the time-$\tau$ discontinuity in volatility, and hence, the permutation test is effectively a test for volatility jumps. Example 2, in particular, highlights the versatility and robustness of the permutation test compared with the conventional approach based on spot estimation. Going one step further, we now illustrate how to apply the permutation test to other types of economic variables.

Example 3 (Location-Scale Model for Volume). Consider a simple model for trading volume, under which the volume within the $i$th sampling interval is given by $Y_{n,i} = \mu_{i\Delta_n} + v_{i\Delta_n}\epsilon_{n,i}$. The $\mu_t$ location process captures the local mean, or trading intensity, and the $v_t$ scale process captures the time-varying heterogeneity in the order size. This location-scale model fits directly into the state-space model (3) with $\zeta_t = (\mu_t, v_t)$, $g((\mu, v), \epsilon) = \mu + v\epsilon$, and $R_{n,i} \equiv 0$. Let $\mathcal{F}_t$ be the filtration generated by the $\zeta_t$ process. If $\epsilon_{n,i}$ is independent of the $\zeta_t$ process and has finite second moment and bounded PDF, then it is easy to verify Assumptions 2.2 and 2.3 with $a_n = 1$. Theorem 2.2 thus implies that the permutation test is valid for testing the discontinuity in $\zeta_t = (\mu_t, v_t)$ at time $\tau$. □

The location-scale structure in Example 3 is by no means essential in applications, because the permutation test is valid provided that the more general conditions in Assumptions 2.2 and 2.3 hold. This illustration is pedagogically convenient, in that it permits a straightforward verification of our high-level conditions. That being said, this example does reveal a limitation of our theory developed so far. That is, the data variable needs to be continuously distributed, as required in Assumption 2.2(ii) (which in turn is related to Assumption 2.1(ii)). Observed data in actual applications are invariably discrete, but this continuous-distribution assumption is often deemed a reasonable approximation to reality. In some situations, however, the discreteness in the data is more salient. For example, the trading volume of a relatively illiquid asset may take values as small integer multiples of the lot size (e.g., 100 shares). This motivates us to directly confront the discreteness in the data, as detailed in the next subsection.

The extension will be carried out in similar steps as the theory developed above. We start with modifying the generic result in Theorem 2.1 to accommodate discretely-valued observations. Recall that $Q_{j,n}(\cdot)$ denotes the $\mathcal{G}_n$-conditional distribution function of the coupling variable $U_{n,i}$ for $i \in \mathcal{I}_{j,n}$ and $j \in \{1, 2\}$, and $Q_n = (Q_{1,n} + Q_{2,n})/2$.

Footnote: This issue has become less important in the equity market as retail investors can now trade a single share, or even a fractional share, of a stock. However, the lot size is still relevant for less liquid assets such as option contracts or for equity data from earlier sample periods.

Theorem 2.3. Suppose that there exists a collection of variables $(U_{n,i})_{i\in\mathcal{I}_n}$ that satisfies Assumption 2.1(i) for some sequence $(\mathcal{G}_n)_{n\ge 1}$ of $\sigma$-fields, and $P(\tilde Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$, where $(\tilde Y_{n,i})_{i\in\mathcal{I}_n}$ is an identical copy of $(Y_{n,i})_{i\in\mathcal{I}_n}$ in $\mathcal{G}_n$-conditional distribution. Then, the following statements hold for the test $\hat\phi_n$ described in Algorithm 1:

(a) If the variables $(U_{n,i})_{i\in\mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d., we have $E[\hat\phi_n] \to \alpha$.

(b) If $k_n \to \infty$ and $P\left(\int (Q_{1,n}(x) - Q_{2,n}(x))^2\,dQ_n(x) > \delta_n\right) \to 1$ for any real sequence $\delta_n = o(1)$, we have $E[\hat\phi_n] \to 1$.

Theorem 2.3 establishes exactly the same asymptotic properties for the permutation test as Theorem 2.1, but under different conditions: It does not impose the anti-concentration requirement for the coupling variable (i.e., Assumption 2.1(ii)), and the "distance" between the observed data and the coupling variable is measured by the probability mass of $\{\tilde Y_{n,i} \neq U_{n,i}\}$. These modifications seem natural for the discrete-data setting.

Next, we specialize the generic result in Theorem 2.3 to the state-space model (3), starting with some motivating examples.
The first is an alternative model for the trading volume that explicitly features discretely-valued data, which shows an interesting contrast to Example 3.

Example 4 (Poisson Model for Volume). Let $Y_{n,i}$ be the trading volume of an asset within the $i$th sampling interval. Following Andersen (1996), we model the discretely valued volume using a Poisson distribution with time-varying mean. To form a state-space representation, let $(\epsilon_{n,i}(t))_{t\ge 0}$ be a copy of the standard Poisson process on $\mathbb{R}_+$, independent across $i$, and let $\zeta_t$ be the time-varying mean process independent of the $\epsilon_{n,i}$'s. We then set $Y_{n,i} = \epsilon_{n,i}(\zeta_{i\Delta_n})$, which, conditional on the $\zeta$ process, is Poisson distributed with mean $\zeta_{i\Delta_n}$. This representation is a special case of (3), with $g(\zeta, \epsilon) = \epsilon(\zeta)$ being a time-change and $R_{n,i} = 0$. We also note that although the $\epsilon_{n,i}$'s are assumed to be i.i.d., the $(Y_{n,i})_{i\in\mathcal{I}_n}$ series can be highly persistent through its dependence on the stochastic mean process $\zeta_t$. □

To further broaden the empirical scope, we consider another example concerning the bid-ask spread of asset quotes. This example is econometrically interesting because of its resemblance to the discrete-choice models (e.g., probit and logit) commonly used for modeling binary and multinomial data.

Example 5 (Bid-Ask Spread). Let $Y_{n,i}$ be the bid-ask spread of an asset at time $i\Delta_n$. For a liquid asset, the spread is often maintained at 1 tick (e.g., 1 cent), but it may widen to several ticks due to a higher level of asymmetric information or dealer's inventory cost. For ease of exposition, we suppose that $Y_{n,i}$ is a binary variable taking values in $\{1, 2\}$, while noting that a multinomial extension is straightforward. Motivated by the classical discrete-choice models, we model the spread as $Y_{n,i} = 1 + 1\{\zeta_{i\Delta_n} \ge \epsilon_{n,i}\}$, and suppose that the variables $(\epsilon_{n,i})_{i\in\mathcal{I}_n}$ are i.i.d. and independent of the $\zeta_t$ process. With the CDF of $\epsilon_{n,i}$ denoted by $F_\epsilon(\cdot)$, we have $P(Y_{n,i} = 2 \mid \zeta_{i\Delta_n}) = F_\epsilon(\zeta_{i\Delta_n})$. Evidently, upon redefining $\zeta_t$ as $F_\epsilon(\zeta_t)$, we can assume that $\epsilon_{n,i}$ is uniformly distributed on the $[0, 1]$ interval without loss of generality. This normalization in turn allows us to interpret $\zeta_t$ as the stochastic propensity of a "wide" spread, which may serve as a measure of market illiquidity. □

We now proceed to establish the asymptotic validity of the permutation test for the hypotheses described in (5) for discretely valued observations; see Theorem 2.4 below. Since the state-space representation (3) holds with the residual term $R_{n,i} = 0$ in the examples above, it seems reasonable to avoid unnecessary redundancy by restricting our analysis to a simpler version given by

$$Y_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}), \quad i \in \mathcal{I}_n. \tag{8}$$

We replace Assumption 2.3 with the following assumption, where we recall that for each $z \in \mathcal{Z}$, $F_z(\cdot)$ denotes the CDF of the random variable $g(z, \epsilon_{n,i})$ and $\tilde\zeta_t = \zeta_t - \Delta\zeta_\tau 1\{t \ge \tau\}$.

Assumption 2.4.

There exist a sequence $(T_m)_{m\ge 1}$ of stopping times increasing to infinity, a sequence of compact subsets $(\mathcal{K}_m)_{m\ge 1}$ of $\mathcal{Z}$, and a sequence $(K_m)_{m\ge 1}$ of constants such that for each $m \ge 1$: (i) $P(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})) \le K_m \|z - z'\|$ for all $z, z' \in \mathcal{K}_m$; (ii) $\zeta_t$ takes values in $\mathcal{K}_m$ for all $t \le T_m$, and $\|\tilde\zeta_{t\wedge T_m} - \tilde\zeta_{s\wedge T_m}\| \le K_m |t - s|^{1/2}$ for all $t, s$ in some fixed neighborhood of $\tau$.

Theorem 2.4.

In the state-space model (8), suppose that Assumptions 2.2(i), 2.2(iii) and 2.4 hold, and that $k_n^3 \Delta_n = o(1)$. Then, the following statements hold for the permutation test $\hat\phi_n$ described in Algorithm 1:

(a) Under the null hypothesis in (5), i.e., $\Delta\zeta_\tau = 0$, we have $E[\hat\phi_n] \to \alpha$;

(b) Under a fixed alternative hypothesis in (5), i.e., $\Delta\zeta_\tau = c$ for some (unknown) constant $c \neq 0$, we have $E[\hat\phi_n] \to 1$ when $k_n \to \infty$.

Theorem 2.4 depicts the same asymptotic behavior of the permutation test as in Theorem 2.2. The sufficient conditions of these results differ mainly in how to gauge the closeness between the data and the coupling variable, as manifest in the difference between Assumption 2.3(i) and Assumption 2.4(i). The latter is easy to verify under more primitive conditions in concrete settings. Specifically, in Example 4, we note that $|g(z, \epsilon_{n,i}) - g(z', \epsilon_{n,i})|$ is a Poisson random variable with mean $|z - z'|$, and hence, $P(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})) = 1 - \exp(-|z - z'|) \le |z - z'|$ as desired. In Example 5, we can use $\epsilon_{n,i} \sim \text{Uniform}[0, 1]$ to deduce that

$$P\left(g(z, \epsilon_{n,i}) \neq g(z', \epsilon_{n,i})\right) = P\left(1\{z \ge \epsilon_{n,i}\} \neq 1\{z' \ge \epsilon_{n,i}\}\right) = |z - z'|,$$

which, again, verifies Assumption 2.4(i). Therefore, in the context of Examples 4 and 5 above, the permutation test is asymptotically valid for detecting discontinuities in trading activity and illiquidity, respectively.

Monte Carlo simulations

Our Monte Carlo experiment is based on the setting of Example 2. We simulate the (log) price process according to $dP_t = \sigma_t\,dL_t$ under an Euler scheme on a 1-second mesh, and then resample the data at the $\Delta_n = 1$ minute frequency. We simulate $L$ either as a standard Brownian motion or as a (centered symmetric) stable process with index $\beta = 1.5$. To avoid unrealistic price paths, we truncate the stable distribution so that its normalized increment $\Delta_n^{-1/\beta}(L_{i\Delta_n} - L_{(i-1)\Delta_n})$ is supported on $[-C, C]$, and we consider $C \in \{10, 20, 30\}$ to examine the effect of the support. The unit of time is one day.

To simulate the volatility process, we first simulate two volatility factors according to the following dynamics (see Bollerslev and Todorov (2011)):

$$dV_{1,t} = 0.\ldots\,(0.\ldots - V_{1,t})\,dt + 0.\ldots\sqrt{V_{1,t}}\left(\rho\,dL_t + \sqrt{1 - \rho^2}\,dB_{1,t}\right) + c \cdot 1\{t = \tau\},$$

$$dV_{2,t} = 0.\ldots\,(0.\ldots - V_{2,t})\,dt + 0.\ldots\sqrt{V_{2,t}}\left(\rho\,dL_t + \sqrt{1 - \rho^2}\,dB_{2,t}\right) + c \cdot 1\{t = \tau\},$$

where $B_{1,t}$ and $B_{2,t}$ are independent standard Brownian motions that are also independent of $L_t$, $\rho = -0.\ldots$, and the $c$ parameter determines the size of the volatility jump at the event time $\tau$. In particular, $c = 0$ corresponds to the null hypothesis, and we consider a range of positive $c$ values under the alternative. The $c$ parameter is calibrated according to Bollerslev et al.'s (2018) empirical estimates for FOMC announcements.

Footnote: Specifically, Bollerslev et al. (2018) estimate the average jump size of $\log(\sigma_t)$ for the S&P 500 ETF around FOMC announcements to be 1.037 (see Table 3 of that paper). This suggests that $\sigma_\tau/\sigma_{\tau-} = \exp(1.037)$, from which the value of $c$ is calibrated.

We note that the two volatility factors, $V_1$ and $V_2$, capture the slow- and fast-mean-reverting volatility dynamics, respectively, with the former having "smoother" sample paths than the latter. With this in mind, we simulate $\sigma_t$ using two models:

$$\text{Model A: } \sigma_t^2 = 2V_{1,t}, \qquad \text{Model B: } \sigma_t^2 = V_{1,t} + V_{2,t}. \tag{9}$$

In finite samples, Model A features relatively smooth volatility paths, which is close to the "ideal" scenario underlying the infill asymptotic theory. Meanwhile, Model B generates more realistic, and rougher, sample paths for $\sigma$, providing a nontrivial challenge for the proposed inference theory.

We implement the permutation test at the 5% significance level, with the window size $k_n \in \{15, 30, 60, 90\}$. The six-fold increase from the smallest window size to the largest one represents a considerable range that allows us to explore the robustness of the proposed test with respect to the $k_n$ tuning parameter. The critical value is computed as in Remark 2.1 based on 1,000 i.i.d. permutations. For comparison, we also implement the standard (two-sided) t-test based on (7). Rejection frequencies are computed based on 1,000 Monte Carlo trials.

Table 1

                    Permutation test                  t-test
               (1)     (2)     (3)     (4)      (1)     (2)     (3)     (4)
Model A: One-factor Volatility
k_n = 15      0.041   0.049   0.053   0.049    0.019   0.012   0.022   0.021
k_n = 30      0.056   0.056   0.049   0.044    0.032   0.035   0.044   0.052
k_n = 60      0.041   0.044   0.048   0.054    0.084   0.070   0.075   0.065
k_n = 90      0.050   0.048   0.042   0.046    0.115   0.080   0.098   0.098
Model B: Two-factor Volatility
k_n = 15      0.041   0.049   0.060   0.048    0.025   0.016   0.022   0.024
k_n = 30      0.059   0.054   0.049   0.045    0.087   0.064   0.065   0.081
k_n = 60      0.068   0.053   0.056   0.069    0.208   0.139   0.164   0.153
k_n = 90      0.082   0.064   0.056   0.070    0.289   0.193   0.231   0.196

Note: This table presents rejection frequencies of the permutation test and the t-test under the null hypothesis $\sigma_{\tau-} = \sigma_\tau$. The significance level is fixed at 5%. Column (1) corresponds to the case with $L$ being a standard Brownian motion, and columns (2)–(4) correspond to cases in which $L$ is truncated stable with index 1.5 and $C \in \{10, 20, 30\}$. The rejection frequencies are computed based on 1,000 Monte Carlo trials.

We first examine the size properties of the permutation test $\hat\phi_n$ and the t-test based on (7). Table 1 reports the rejection frequencies of these tests under the null hypothesis (i.e., $c = 0$) for various data generating processes. Column (1) corresponds to the case with $L$ being a standard Brownian motion, and columns (2), (3), and (4) report results when $L$ is a truncated stable process with the truncation parameter $C = 10$, 20, and 30, respectively.

The top panel of the table shows results from Model A, where the volatility is solely driven by the "slow" factor. Quite remarkably, the rejection frequencies of the permutation test are very close to the 5% nominal level for all specifications of $L$ and, importantly, for a wide range of the window size $k_n$. In contrast, the rejection rates of the t-test appear to be far more sensitive to the choice of $k_n$.
As we increase $k_n$ from 15 to 90, the rejection rate increases from 1.9% to 11.5% when $L$ is a Brownian motion, and we see a similar pattern of size distortion when $L$ is truncated stable. It should be noted that, although the performance of the t-test shown in columns (2)–(4) appears similar to what we see in column (1), that test is not formally justified when $L$ is not a Brownian motion.

The more challenging case is Model B with the two-factor volatility dynamics. Looking at the bottom panel of Table 1, we find that the permutation test still has rejection rates that are quite close to the nominal level, although we see some over-rejection when $k_n = 90$. This is likely due to the fact that the approximation error in the coupling has nontrivial impact when the window size is large. That being said, this bias issue also affects the benchmark t-test and, apparently, in a more severe fashion.

We next turn to the power comparison. Since the results for the four specifications of $L$ are similar, we focus on the Brownian motion case for brevity. Figure 1 plots the power curves of the permutation test and the t-test for various $k_n$'s in Model A and Model B. We see that the rejection frequencies increase with the window size $k_n$ and the jump size $c$, which is expected from our consistency result obtained under $k_n \to \infty$. The permutation test is generally less powerful than the t-test under the alternative hypothesis. However, the latter is more powerful at the cost of size distortion, which can be very large as shown in Table 1.

Overall, we find that the permutation test controls size remarkably well under the null hypothesis. Although it appears to be less powerful than the t-test, it does not suffer from the latter's size distortion, which can be severe in the two-factor volatility model. Our results suggest that, given its robustness, the permutation test is a useful complement to the conventional test based on spot estimation and asymptotic Gaussian approximation.
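The size comparison can be illustrated in a deliberately stripped-down setting. The Python sketch below does not reproduce the paper's simulation design: it assumes i.i.d. standard normal returns with constant volatility (so the null holds exactly and no coupling error is present), uses far fewer trials and permutations, and assumes a Cramér-von Mises-type statistic written in terms of ranks. The function names and default arguments are illustrative.

```python
import numpy as np


def rank_cvm(labels, k):
    """Statistic from group labels (+1 before, -1 after) listed in increasing data order.

    The running sum of labels divided by k equals the difference between the
    two empirical distribution functions at each ordered observation.
    """
    return np.mean((np.cumsum(labels, axis=-1) / k) ** 2, axis=-1)


def null_rejection_rates(k=15, n_trials=400, n_perm=199, alpha=0.05, seed=0):
    """Rejection frequencies of the permutation test and the t-test under the null."""
    rng = np.random.default_rng(seed)
    base = np.repeat([1.0, -1.0], k)  # first k observations form the "before" window
    z_crit = 1.959963984540054        # 97.5% quantile of the standard normal
    index = int(np.ceil((1.0 - alpha) * n_perm)) - 1
    reject_perm = 0
    reject_t = 0
    for _ in range(n_trials):
        y = rng.standard_normal(2 * k)  # same distribution on both sides of the cutoff

        # Permutation test: permuting the data is equivalent to shuffling labels.
        observed = rank_cvm(base[np.argsort(y)], k)
        shuffled = rng.permuted(np.tile(base, (n_perm, 1)), axis=1)
        critical = np.sort(rank_cvm(shuffled, k))[index]
        reject_perm += int(observed > critical)

        # t-test based on the spot variance estimators.
        v1, v2 = np.mean(y[:k] ** 2), np.mean(y[k:] ** 2)
        t_stat = np.sqrt(k) * (v2 - v1) / np.sqrt(2.0 * v1 ** 2 + 2.0 * v2 ** 2)
        reject_t += int(abs(t_stat) > z_crit)
    return reject_perm / n_trials, reject_t / n_trials
```

Calling `null_rejection_rates(k=15)` and `null_rejection_rates(k=90)` gives a rough feel for how each test behaves as the window grows in this idealized case; the numbers are not comparable to Table 1, which is based on stochastic volatility and high-frequency sampling.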
[Figure 1: four panels titled "Permutation Test in Model A", "T-test in Model A", "Permutation Test in Model B", and "T-test in Model B", each plotting Rejection Probability against Jump Size for k_n = 15, 30, 60, 90.]

Figure 1: The figure plots the rejection frequencies of the permutation test and the t-test. The significance level is fixed at 5% (highlighted by shade). Results for Model A and Model B are presented in the top and bottom rows, respectively. The Lévy process $L$ is simulated as a standard Brownian motion. The power curves are computed over a grid of values of the jump size parameter $c$. The rejection frequencies are computed based on 1,000 Monte Carlo trials.

As an empirical illustration, we apply the proposed permutation test in a case study for testing distributional discontinuities in asset returns. We focus on the impact of news related to the novel coronavirus (COVID-19) on the US stock market during the early phase of the ongoing pandemic. Our dataset consists of the daily (adjusted) close prices of the S&P 500 index from December 20, 2019 to March 18, 2020, which is publicly available at Yahoo Finance. According to a New York Times article, the first reporting of COVID-19 was on December 31, 2019, stating that "Chinese authorities treated dozens of cases of pneumonia of unknown cause." On March 11, 2020, the World Health Organization (WHO) declared COVID-19 to be a global pandemic. Several significant news events in between are listed in Table 2, including the first reported COVID-19 case in the US, and the outbreaks in South Korea and Italy. We implement the permutation test with a window size $k_n = 5$, corresponding to 5 trading days. It is natural to consider this short window as a fixed number, which is permitted under the proposed theory. In contrast, the conventional spot-estimation-based approach would require $k_n \to \infty$, which is clearly implausible in the present context. By using a publicly available dataset and a short event window, this illustrative example is intentionally designed to mimic a "real-time" and high-stake research scenario in the public domain, for which the underlying price and risk dynamics are not yet well understood (due to the rare-disaster nature of COVID-19). This example is thus ideal to highlight the two comparative advantages of the proposed permutation test, namely, its small-sample reliability and practical versatility (recall the discussion in Section 2.2).

Footnote: Data source: https://finance.yahoo.com/quote/%5EGSPC/history?p=%5EGSPC.

Footnote: Our sample period is chosen so that there are five observations of daily returns before the initial reporting from China and five observations after the WHO's pandemic declaration.

Table 2

Date (τ)      Headline Event
12/31/2019    Chinese authorities treated dozens of cases of pneumonia of unknown cause.
01/20/2020    Other countries, including the United States, confirmed COVID-19 cases.
01/30/2020    The WHO declared COVID-19 a global health emergency.
02/21/2020    A secretive church is linked to outbreak in South Korea. Italy sees major surge in COVID-19 cases and officials lock down towns (reported on Sunday, February 23, 2020).
03/11/2020    The WHO declared COVID-19 a global pandemic.

Source:

For each event time $\tau$ in Table 2, we implement the permutation test at the 5% significance level. To simplify interpretation, we use the non-randomized version of the permutation test described in Remark 2.1, that is, we report a rejection if and only if $\hat T_n > \hat T_n^*$. We reject the null hypothesis for two instances: January 20, 2020 and February 21, 2020. The former corresponds to the first reported COVID-19 case in the US, and the latter is associated with the outbreak in South Korea (Friday) and the subsequent reporting of the surge in Italy (Sunday). On the other hand, we do not reject the null hypothesis for either the initial reporting in China or the two WHO declarations.

To gain further insight, we plot in Figure 2 the daily return series of the S&P 500 index in our sample, marked with the aforementioned events. The time series plot provides corroborative evidence for the testing results. The US market indeed did not respond to China's initial reporting on December 31, 2019, but became increasingly alerted in the week after COVID-19 cases were also reported in Japan, South Korea, Thailand, and the US. When the WHO declared global emergency on January 30, 2020, the market was already moderately volatile, so the declaration itself did not trigger any significant (distributional) discontinuity. The outbreaks in South Korea and Italy evidently drove the market into a panic, which we highlight using shaded color in the figure. The WHO's pandemic announcement amid the turmoil is not associated with a rejection from our test, suggesting that the declaration was mostly a response to publicly known information, without providing new "lumpy" information that could cause jumps in the return distribution.
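With $k_n = 5$, the pooled window contains ten observations, so there are only $\binom{10}{5} = 252$ distinct ways to split it into a "before" and an "after" group. The Python sketch below enumerates all of them instead of drawing random permutations; this is an alternative to, not a description of, the paper's implementation. It assumes a Cramér-von Mises-type statistic, and the function name `exact_permutation_test` and the input values are made up for illustration (they are not S&P 500 returns).

```python
from itertools import combinations

import numpy as np


def exact_permutation_test(before, after, alpha=0.05):
    """Non-randomized permutation test using the full permutation distribution."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    k = len(before)
    pooled = np.concatenate([before, after])

    def statistic(group_1_idx):
        # Split the pooled sample and compare the two ECDFs at every pooled point.
        mask = np.zeros(2 * k, dtype=bool)
        mask[list(group_1_idx)] = True
        ecdf_1 = np.mean(pooled[mask][:, None] <= pooled[None, :], axis=0)
        ecdf_2 = np.mean(pooled[~mask][:, None] <= pooled[None, :], axis=0)
        return float(np.mean((ecdf_1 - ecdf_2) ** 2))

    observed = statistic(range(k))
    # Every choice of k indices for the first group yields one relabeling.
    all_stats = np.sort([statistic(c) for c in combinations(range(2 * k), k)])
    critical = float(all_stats[int(np.ceil((1.0 - alpha) * len(all_stats))) - 1])
    p_value = float(np.mean(all_stats >= observed))
    return observed, critical, bool(observed > critical), p_value


# Illustrative (synthetic) input: calm returns before, volatile returns after.
synthetic_before = [0.1, -0.2, 0.3, -0.1, 0.2]
synthetic_after = [-3.0, 4.5, -2.5, 3.5, -4.0]
outcome = exact_permutation_test(synthetic_before, synthetic_after)
```

Because the permutation distribution has only 252 support points here, the smallest attainable p-value is 2/252 (the statistic is symmetric in the two groups), which is worth keeping in mind when choosing the significance level for such short windows.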

[Figure 2: "COVID-19 News and S&P 500 Daily Returns". Returns are plotted against Observation Index, with annotations for Dec. 31, 2019 (first reporting from China); Jan. 20-21, 2020 (cases in Japan, South Korea, Thailand & US); Jan. 30, 2020 (WHO global emergency); Feb. 21, 2020 (outbreak in South Korea); Feb. 23-24, 2020 (surge in Italy); and Mar. 11, 2020 (WHO global pandemic).]

Figure 2: The figure plots the daily returns of the S&P 500 index from December 20, 2019 to March 18, 2020. The (adjusted) close price data are obtained from Yahoo Finance. Note that January 20, 2020 (Martin Luther King Jr. Day) and February 23, 2020 (Sunday) are indexed together with their subsequent trading days.

Footnote: We implement each test as in Remark 2.1 using 100,000 i.i.d. permutations.

In this paper, we propose using a permutation test to detect discontinuities in an economic model at a cutoff point. Relative to the existing literature, we show that the permutation test is well suited for event studies based on time-series data. While nonparametric t-tests have been widely used for this purpose in various empirical contexts, the permutation test proposed in this paper provides a distinct alternative. Instead of relying on asymptotic (mixed) Gaussianity from central limit theorems, we exploit finite-sample properties of the permutation test in the approximating, or "coupling," two-sample problem.

We demonstrate that our new theory is broadly useful in a wide range of problems in the infill asymptotic time-series setting, which justifies using the permutation test to detect jumps in economic variables such as volatility, trading activity, and liquidity. Compared with the conventional nonparametric t-test, the proposed permutation test has several distinct features. First, the permutation test provides asymptotic size control regardless of whether the sizes of the local subsamples are fixed or growing to infinity. In the latter case, we also establish that the permutation test is consistent. Second, the permutation test is versatile, as it can be applied without modification to many different contexts and under relatively weak conditions.

Appendix: Proofs

Throughout the proofs, we use $K$ to denote a positive constant that may change from line to line, and write $K_p$ to emphasize its dependence on some parameter $p$. For any event $E \in \mathcal{F}$, we identify it with the associated indicator random variable.

Proof of Theorem 2.1.

Step 1. Define $\phi_n$ in the same way as $\hat\phi_n$ but with $(Y_{n,i})_{i\in\mathcal{I}_n}$ replaced by $(U_{n,i})_{i\in\mathcal{I}_n}$. In this step, we show that

$$E[\hat\phi_n] = E[\phi_n] + o(1). \tag{10}$$

Let $\tilde\phi_n$ be defined in the same way as $\hat\phi_n$, but with $(Y_{n,i})_{i\in\mathcal{I}_n}$ replaced by $(\tilde Y_{n,i})_{i\in\mathcal{I}_n}$, as defined in Assumption 2.1(iii). Since $(\tilde Y_{n,i})_{i\in\mathcal{I}_n}$ and $(Y_{n,i})_{i\in\mathcal{I}_n}$ share the same (conditional) distribution,

$$E[\tilde\phi_n] = E[\hat\phi_n]. \tag{11}$$

Let $E_n \in \mathcal{F}$ be the event where the ordered values of $(U_{n,i})_{i\in\mathcal{I}_n}$ and $(\tilde Y_{n,i})_{i\in\mathcal{I}_n}$ correspond to the same permutation of $\mathcal{I}_n$. Since the test statistic is only a function of the rank of the observations, we have $\tilde\phi_n = \phi_n$ in restriction to $E_n$. Hence,

$$|E[\tilde\phi_n] - E[\phi_n]| = |E[\tilde\phi_n E_n^c] - E[\phi_n E_n^c]| \le P(E_n^c). \tag{12}$$

By (11) and (12), (10) follows from $P(E_n^c) = o(1)$, which will be proved below.

Let $A_{n,i,j} \equiv \{U_{n,j} - U_{n,i} \ge 0,\ \tilde Y_{n,j} - \tilde Y_{n,i} < 0\}$ for every $(i,j) \in \mathcal{I}_n \times \mathcal{I}_n$, and note that $E_n^c \subseteq \cup_{i,j} A_{n,i,j}$. Recall the elementary fact that if a sequence of random variables $X_n = o_p(1)$, then there exists a real sequence $\delta_n = o(1)$ such that $P(|X_n| \le \delta_n) \to 1$. Under Assumption 2.1(iii), by applying this result to $X_n = 2\max_{i\in\mathcal{I}_n} |\tilde Y_{n,i} - U_{n,i}|\,k_n^2$, we can find a sequence $\delta_n = o(1)$ such that

$$P\left(\max_{i\in\mathcal{I}_n} |\tilde Y_{n,i} - U_{n,i}| \le \delta_n k_n^{-2}/2\right) \to 1. \tag{13}$$

We then observe that

$$A_{n,i,j} \subseteq \{U_{n,j} - U_{n,i} \ge \delta_n k_n^{-2},\ \tilde Y_{n,j} - \tilde Y_{n,i} < 0\} \cup \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}$$
$$\subseteq \{|\tilde Y_{n,j} - \tilde Y_{n,i} - (U_{n,j} - U_{n,i})| > \delta_n k_n^{-2}\} \cup \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}$$
$$\subseteq \{\max_{i\in\mathcal{I}_n} |\tilde Y_{n,i} - U_{n,i}| > \delta_n k_n^{-2}/2\} \cup \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}.$$

Therefore,

$$E_n^c \subseteq \cup_{i,j} A_{n,i,j} \subseteq \{\max_{i\in\mathcal{I}_n} |\tilde Y_{n,i} - U_{n,i}| > \delta_n k_n^{-2}/2\} \cup \left(\cup_{i,j\in\mathcal{I}_n} \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right),$$

which, together with (13), implies that

$$P(E_n^c) \le P\left(\cup_{i,j\in\mathcal{I}_n} \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right) + o(1). \tag{14}$$

Next, consider the following argument:

$$P\left(\cup_{i,j\in\mathcal{I}_n} \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\} \,\middle|\, \mathcal{G}_n\right) \le \sum_{i,j\in\mathcal{I}_n} P\left(0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2} \mid \mathcal{G}_n\right)$$
$$\le 2k_n \sum_{i\in\mathcal{I}_n} \sup_{x\in\mathbb{R}} P\left(|U_{n,i} - x| \le \delta_n k_n^{-2} \mid \mathcal{G}_n\right) = O_p(\delta_n) = o_p(1), \tag{15}$$

where the last line holds by Assumption 2.1(ii). By (15) and the bounded convergence theorem,

$$P\left(\cup_{i,j\in\mathcal{I}_n} \{0 \le U_{n,j} - U_{n,i} < \delta_n k_n^{-2}\}\right) = o(1). \tag{16}$$

By combining (14) and (16), we conclude that $P(E_n^c) = o(1)$, as desired.

Step 2. We now prove the assertions in parts (a) and (b) of the theorem. In view of (10), we only need to prove $E[\phi_n] \to \alpha$ and $E[\phi_n] \to 1$ in parts (a) and (b), respectively. In part (a), the variables $(U_{n,i})_{i\in\mathcal{I}_n}$ are conditionally i.i.d. and so permutations constitute a group of transformations that satisfy the randomization hypothesis in Lehmann and Romano (2005, Definition 15.2.1). Then, Lehmann and Romano (2005, Theorem 15.2.1) implies that $E[\phi_n \mid \mathcal{G}_n] = \alpha$, and $E[\phi_n] = \alpha$ then follows from the law of iterated expectations.

To prove part (b), we need some additional notation.
To emphasize the dependence of $\widehat{T}_n$, $\widehat{T}^*_n$, and $\hat{\phi}_n$ on the original data $(Y_{n,i})_{i \in \mathcal{I}_n}$, we explicitly write them as $\widehat{T}_n(Y)$, $\widehat{T}^*_n(Y)$, and $\hat{\phi}_n(Y)$. With this notation, we can rewrite $\phi_n = \hat{\phi}_n(U)$, since it is computed in the same way as $\hat{\phi}_n$ but with $(Y_{n,i})_{i \in \mathcal{I}_n}$ replaced by $(U_{n,i})_{i \in \mathcal{I}_n}$.

We first analyze the asymptotic behavior of $\widehat{T}_n(U)$. Define the empirical analogue of $Q_{j,n}(\cdot)$ as

$$\widehat{Q}_{j,n}(x) \equiv \frac{1}{k_n} \sum_{i \in \mathcal{I}_{j,n}} 1\{U_{n,i} \le x\}.$$

Since the variables $(U_{n,i})_{i \in \mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d.,

$$E[(\widehat{Q}_{j,n}(x) - Q_{j,n}(x))^2 \mid \mathcal{G}_n] = O(k_n^{-1}) = o(1).$$

By Markov's inequality and the law of iterated expectations, this implies that $\widehat{Q}_{j,n}(x) - Q_{j,n}(x) = o_p(1)$ for each $x \in \mathbb{R}$. This and a classical Glivenko–Cantelli theorem (e.g., Davidson (1994, Theorem 21.5)) imply that

$$\sup_{x \in \mathbb{R}} |\widehat{Q}_{j,n}(x) - Q_{j,n}(x)| = o_p(1), \quad \text{for } j \in \{1, 2\}. \tag{17}$$

By definition,

$$\widehat{T}_n(U) \equiv \frac{1}{2k_n} \sum_{i \in \mathcal{I}_n} \left( \widehat{Q}_{1,n}(U_{n,i}) - \widehat{Q}_{2,n}(U_{n,i}) \right)^2.$$

In addition, we define

$$S_n \equiv \frac{1}{2k_n} \sum_{i \in \mathcal{I}_n} \left( Q_{1,n}(U_{n,i}) - Q_{2,n}(U_{n,i}) \right)^2.$$

Note that the functions $\widehat{Q}_{j,n}(\cdot)$ and $Q_{j,n}(\cdot)$ are uniformly bounded. Hence, by the triangle inequality and (17),

$$\begin{aligned}
|\widehat{T}_n(U) - S_n| &\le \frac{1}{2k_n} \sum_{i \in \mathcal{I}_n} \left| \left( \widehat{Q}_{1,n}(U_{n,i}) - \widehat{Q}_{2,n}(U_{n,i}) \right)^2 - \left( Q_{1,n}(U_{n,i}) - Q_{2,n}(U_{n,i}) \right)^2 \right| \\
&\le \frac{K}{k_n} \sum_{i \in \mathcal{I}_n} \sum_{j \in \{1,2\}} |\widehat{Q}_{j,n}(U_{n,i}) - Q_{j,n}(U_{n,i})| = o_p(1).
\end{aligned} \tag{18}$$

Conditional on $\mathcal{G}_n$, the bounded random functions $Q_{1,n}(\cdot)$ and $Q_{2,n}(\cdot)$ can be treated as deterministic functions. Next, note that

$$S_n = \frac{1}{2} \sum_{j \in \{1,2\}} \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_{j,n}(x) + o_p(1) = \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) + o_p(1), \tag{19}$$

where the first equality holds by a law of large numbers for the conditionally i.i.d. variables $(U_{n,i})_{i \in \mathcal{I}_{j,n}}$ for $j = 1, 2$, and the second equality holds by the definition of $Q_n$ (the average of $Q_{1,n}$ and $Q_{2,n}$). By combining (18) and (19), we deduce that

$$\widehat{T}_n(U) = \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) + o_p(1). \tag{20}$$

Next, we analyze the asymptotic behavior of $\widehat{T}^*_n(U)$. It is useful to consider the following representation of this variable. We denote $U_{\tilde{\pi}} = (U_{n,\tilde{\pi}(i)})_{i \in \mathcal{I}_n}$, where $\tilde{\pi}$ is a random permutation of $\mathcal{I}_n$ that is independent of the data and drawn uniformly from the set of all permutations of $\mathcal{I}_n$. By definition, $\widehat{T}^*_n(U)$ is the $1 - \alpha$ quantile of $\widehat{T}_n(U_{\tilde{\pi}})$, conditional on the sample, where the randomness comes from the random realization of $\tilde{\pi}$. To analyze the permutation distribution, we construct an additional coupling sequence of $(U_{n,i})_{i \in \mathcal{I}_n}$ following the method of Chung and Romano (2013, Section 5.3). The result of their coupling construction is another random sequence $(U'_{n,i})_{i \in \mathcal{I}_n}$ such that (i) $U_{n,i} = U'_{n,i}$ for all $i$ in some random subset $\mathcal{I}'_n \subseteq \mathcal{I}_n$; (ii) the cardinality of $\mathcal{I}_n \setminus \mathcal{I}'_n$, denoted $D_n$, satisfies $E[D_n] = O(k_n^{1/2})$; and (iii) $(U'_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d. with marginal distribution $Q_n$.

For $j \in \{1, 2\}$, define

$$\widehat{Q}_{j,n}(x; \pi) \equiv \frac{1}{k_n} \sum_{i \in \mathcal{I}_{j,n}} 1\{U_{n,\pi(i)} \le x\} \quad \text{and} \quad \widehat{Q}'_{j,n}(x; \pi) \equiv \frac{1}{k_n} \sum_{i \in \mathcal{I}_{j,n}} 1\{U'_{n,\pi(i)} \le x\}.$$
By repeatedly using the triangle inequality,

$$\begin{aligned}
|\widehat{T}_n(U_\pi) - \widehat{T}_n(U'_\pi)| &= \frac{1}{2k_n} \left| \sum_{i \in \mathcal{I}_n} \left( \left( \widehat{Q}_{1,n}(U_{n,\pi(i)}; \pi) - \widehat{Q}_{2,n}(U_{n,\pi(i)}; \pi) \right)^2 - \left( \widehat{Q}'_{1,n}(U'_{n,\pi(i)}; \pi) - \widehat{Q}'_{2,n}(U'_{n,\pi(i)}; \pi) \right)^2 \right) \right| \\
&\le \frac{K}{k_n} \sum_{j \in \{1,2\}} \sum_{i \in \mathcal{I}_n} \left| \widehat{Q}_{j,n}(U_{n,\pi(i)}; \pi) - \widehat{Q}'_{j,n}(U'_{n,\pi(i)}; \pi) \right| \\
&\le \frac{K}{k_n^2} \sum_{i,k \in \mathcal{I}_n} \left| 1\{U_{n,\pi(k)} \le U_{n,\pi(i)}\} - 1\{U'_{n,\pi(k)} \le U'_{n,\pi(i)}\} \right| \\
&\le K D_n / k_n = o_p(1),
\end{aligned} \tag{21}$$

where the last inequality uses the fact that $(U_{n,i}, U_{n,k}) = (U'_{n,i}, U'_{n,k})$ if $(i, k) \in \mathcal{I}'_n \times \mathcal{I}'_n$, and so the summation on the previous line has at most $(2k_n)^2 - (2k_n - D_n)^2 = 4 k_n D_n - D_n^2 \le 4 k_n D_n$ bounded terms that can be different from zero; and the $o_p(1)$ statement follows from $E[D_n] = O(k_n^{1/2})$, $k_n \to \infty$, and Markov's inequality.

For any fixed arbitrary permutation $\pi$, $\widehat{T}_n(U'_\pi)$ is the Cramér-von Mises statistic for the $\mathcal{G}_n$-conditionally i.i.d. variables $(U'_{n,\pi(i)})_{i \in \mathcal{I}_n}$. Hence, by an argument similar to the one leading to (20), we have $\widehat{T}_n(U'_\pi) = o_p(1)$. By combining this with (21), it follows that

$$\widehat{T}_n(U_\pi) = o_p(1). \tag{22}$$

Since this result holds for any arbitrary fixed permutation $\pi$, it also holds for any pair of permutations drawn at random from the set of all possible permutations of $\mathcal{I}_n$, independently from the data. By elementary properties of stochastic convergence, this implies the so-called Hoeffding's condition (e.g., Lehmann and Romano (2005, Equation (15.10))).
By this and Lehmann and Romano (2005, Theorem 15.2.3), the permutation distribution associated with the test statistic $\widehat{T}_n(U)$, conditional on the data, converges to zero in probability. As a corollary of this,

$$\widehat{T}^*_n(U) = o_p(1). \tag{23}$$

From (20), (23), and the condition in part (b), it is easy to see that $\widehat{T}_n(U) > \widehat{T}^*_n(U)$ with probability approaching 1. This further implies that $E[\phi_n] \to 1$, which, together with (10), proves the assertion of part (b).
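
The objects used in the proof above can be made concrete with a minimal Python sketch. It is not the authors' implementation: the function names `cvm_statistic` and `permutation_test`, the number of random permutations `n_perm`, and the simulated Gaussian data are assumptions made for this illustration. The critical value in the text is defined over all permutations of $\mathcal{I}_n$; the sketch approximates it with random draws. The last lines check the rank-invariance property used in Step 1: a strictly increasing transformation of the data leaves the statistic unchanged.

```python
import numpy as np

def cvm_statistic(y, k):
    """Two-sample Cramer-von Mises statistic for a window of 2k observations.

    The first k entries of y form the left subsample, the last k the right one.
    """
    y = np.asarray(y, dtype=float)
    left, right = y[:k], y[k:]
    # Empirical CDF of each subsample, evaluated at every pooled observation.
    q1 = (left[None, :] <= y[:, None]).mean(axis=1)
    q2 = (right[None, :] <= y[:, None]).mean(axis=1)
    # Average of squared CDF differences over the 2k pooled observations.
    return float(np.mean((q1 - q2) ** 2))

def permutation_test(y, k, alpha=0.05, n_perm=499, seed=0):
    """Return (statistic, approximate critical value, reject decision)."""
    rng = np.random.default_rng(seed)
    t_obs = cvm_statistic(y, k)
    # Random permutations stand in for the full permutation group (assumption).
    t_perm = np.array(
        [cvm_statistic(rng.permutation(y), k) for _ in range(n_perm)]
    )
    crit = float(np.quantile(t_perm, 1 - alpha))
    return t_obs, crit, t_obs > crit

rng = np.random.default_rng(1)
k = 30
null_data = rng.normal(size=2 * k)   # no discontinuity at the cutoff
jump_data = null_data.copy()
jump_data[k:] += 3.0                 # hypothetical level shift after the cutoff

# Rank invariance: exp is strictly increasing, so the statistic is unchanged.
t_null = cvm_statistic(null_data, k)
t_monotone = cvm_statistic(np.exp(null_data), k)

t_jump, crit_jump, reject_jump = permutation_test(jump_data, k)
print(t_null, t_monotone, t_jump, crit_jump, reject_jump)
```

Because the statistic depends on the data only through indicator comparisons, any coupling that preserves the ordering of the observations yields the same test decision, which is the content of the event $E_n$ in Step 1.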

Proof of Theorem 2.2. (a) We prove the assertion of part (a) by applying Theorem 2.1(a). We construct the coupling variable $U_{n,i}$ as follows:

$$U_{n,i} = g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}), \quad \text{for all } i \in \mathcal{I}_n. \tag{24}$$

We set $\mathcal{G}_n = \mathcal{F}_{(i^* - k_n)\Delta_n}$. By Assumption 2.2, $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. and independent of $\mathcal{G}_n$. Since $\zeta_{(i^* - k_n)\Delta_n}$ is $\mathcal{G}_n$-measurable, the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d. This verifies the condition in part (a) of Theorem 2.1, which also implies Assumption 2.1(i). It remains to verify conditions (ii) and (iii) in Assumption 2.1.

By a standard localization argument (see Jacod and Protter (2012, Section 4.4.1)), we can strengthen Assumption 2.3 by assuming $T_1 = \infty$, $\mathcal{K}_m = \mathcal{K}$, and $K_m = K$ for some fixed compact set $\mathcal{K}$ and constant $K > 0$. In particular, $\zeta_{(i^* - k_n)\Delta_n}$ takes values in the compact set $\mathcal{K}$. By Assumption 2.2, it is then easy to see that the $\mathcal{G}_n$-conditional probability density of $U_{n,i} = g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i})$ is uniformly bounded (and it does not depend on $i$). This implies condition (ii) of Assumption 2.1.

Finally, we verify condition (iii) of Assumption 2.1. By Assumption 2.2(i), for each $i \in \mathcal{I}_n$, $\epsilon_{n,i}$ is independent of $\mathcal{F}_{i\Delta_n}$. Since $\zeta_{i\Delta_n}$ and $\zeta_{(i^* - k_n)\Delta_n}$ are $\mathcal{F}_{i\Delta_n}$-measurable, we deduce from Assumption 2.3(i) that

$$E\left[ |g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i})|^2 \,\middle|\, \mathcal{F}_{i\Delta_n} \right] \le K a_n^2 \|\zeta_{i\Delta_n} - \zeta_{(i^* - k_n)\Delta_n}\|^2. \tag{25}$$

Note that under the null hypothesis with $\Delta\zeta_\tau = 0$, the processes $\zeta_t$ and $\tilde{\zeta}_t$ are identical. Hence, by Assumption 2.3(ii) and (25),

$$\left\| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) \right\|_2 \le K a_n k_n^{1/2} \Delta_n^{1/2}.$$

By the maximal inequality under the $L_2$ norm (see, e.g., van der Vaart and Wellner (1996, Lemma 2.2.2)), we further deduce that

$$\left\| \max_{i \in \mathcal{I}_n} \left| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) \right| \right\|_2 \le K a_n k_n \Delta_n^{1/2}. \tag{26}$$

Recall that $a_n k_n^3 \Delta_n^{1/2} = o(1)$ by assumption. Hence,

$$\max_{i \in \mathcal{I}_n} \left| g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) \right| = o_p(k_n^{-2}). \tag{27}$$

Note that, by the definitions in (3) and (24),

$$Y_{n,i} - U_{n,i} = g(\zeta_{i\Delta_n}, \epsilon_{n,i}) - g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) + R_{n,i}. \tag{28}$$

Combining (27), (28), and Assumption 2.3(iii), we deduce that $\max_{i \in \mathcal{I}_n} |Y_{n,i} - U_{n,i}| = o_p(k_n^{-2})$, which verifies Assumption 2.1(iii). We have now verified all the conditions needed in Theorem 2.1(a), which proves the assertion of part (a) of Theorem 2.2.

(b) We prove the assertion of part (b) by applying Theorem 2.1(b).
Under the maintained alternative hypothesis, we have $\Delta\zeta_\tau = c$ for some constant $c \neq 0$. The coupling variable now takes the following form:

$$U_{n,i} = \begin{cases} g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{1,n}, \\ g(\zeta_{(i^* - k_n)\Delta_n} + c, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{2,n}. \end{cases} \tag{29}$$

Under Assumption 2.2, it is easy to see that, for each $j \in \{1, 2\}$, the variables $(U_{n,i})_{i \in \mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d., which verifies Assumption 2.1(i).

We now turn to the remaining conditions in Assumption 2.1. As in part (a), we can invoke the standard localization procedure and assume that the $\zeta_t$ process takes values in a compact set $\mathcal{K}$. Note that

$$\zeta_\tau - (\zeta_{(i^* - k_n)\Delta_n} + \Delta\zeta_\tau) = \zeta_{\tau-} - \zeta_{(i^* - k_n)\Delta_n} = o_p(1),$$

where the $o_p(1)$ statement follows from the fact that the $\zeta_t$ process is càdlàg and $k_n \Delta_n \to 0$. By enlarging $\mathcal{K}$ slightly if necessary, we also have $\zeta_{(i^* - k_n)\Delta_n} + c \in \mathcal{K}$ with probability approaching 1. Then, we can verify Assumption 2.1(ii) following the same argument as in part (a). The verification of Assumption 2.1(iii) is also similar.

Finally, we verify the condition in Theorem 2.1(b) pertaining to the conditional CDFs. Note that

$$Q_{1,n}(x) = F_{\zeta_{(i^* - k_n)\Delta_n}}(x) \quad \text{and} \quad Q_{2,n}(x) = F_{\zeta_{(i^* - k_n)\Delta_n} + c}(x).$$

It is then easy to see that

$$2 \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) \ge \int \left( F_{\zeta_{(i^* - k_n)\Delta_n}}(x) - F_{\zeta_{(i^* - k_n)\Delta_n} + c}(x) \right)^2 dF_{\zeta_{(i^* - k_n)\Delta_n}}(x).$$

Since $\zeta_{(i^* - k_n)\Delta_n}$ takes values in the compact set $\mathcal{K}$, Assumption 2.2(iii) implies that the lower bound in the above display is bounded away from zero. Hence, $\int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n$ for any real sequence $\delta_n = o(1)$. We have now verified all conditions for Theorem 2.1(b), which proves the assertion of part (b) of Theorem 2.2.

Proof of Theorem 2.3. The assertions of the theorem follow from arguments similar to those used to prove Theorem 2.1. For the sake of brevity, we focus on the only substantial difference, which is how we establish that $P(E_n^c) = o(1)$. Recall that $E_n$ denotes the event where the ordered values of $(U_{n,i})_{i \in \mathcal{I}_n}$ and $(\widetilde{Y}_{n,i})_{i \in \mathcal{I}_n}$ correspond to the same permutation of $\mathcal{I}_n$. In the case of this proof, this result follows from

$$P(E_n^c) \le P\left( \cup_{i \in \mathcal{I}_n} \{\widetilde{Y}_{n,i} \neq U_{n,i}\} \right) \le \sum_{i \in \mathcal{I}_n} P(\widetilde{Y}_{n,i} \neq U_{n,i}) = o(1),$$

where the first inequality follows from $E_n^c \subseteq \cup_{i \in \mathcal{I}_n} \{\widetilde{Y}_{n,i} \neq U_{n,i}\}$ and the convergence follows from the assumption that $P(\widetilde{Y}_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$.

Proof of Theorem 2.4. (a) We prove this assertion by applying Theorem 2.3(a). We shall verify the conditions in Theorem 2.3 for $\widetilde{Y}_{n,i} = Y_{n,i}$, $U_{n,i} = g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i})$, and $\mathcal{G}_n = \mathcal{F}_{(i^* - k_n)\Delta_n}$. By assumption, the variables $(\epsilon_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d. and independent of $\mathcal{G}_n$. Hence, the variables $(U_{n,i})_{i \in \mathcal{I}_n}$ are $\mathcal{G}_n$-conditionally i.i.d.

It remains to verify that $P(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$. By repeating the localization argument used in the proof of Theorem 2.2, we can strengthen Assumption 2.4 with $T_1 = \infty$ without loss of generality. In particular, $\zeta_t$ takes values in some compact subset $\mathcal{K} \subseteq \mathcal{Z}$. Note that for each $i \in \mathcal{I}_n$, $\epsilon_{n,i}$ is independent of $(\zeta_{i\Delta_n}, \zeta_{(i^* - k_n)\Delta_n})$. By Assumption 2.4(i), we thus have $P(Y_{n,i} \neq U_{n,i} \mid \mathcal{G}_n) \le K \|\zeta_{i\Delta_n} - \zeta_{(i^* - k_n)\Delta_n}\|$. Then, by Assumption 2.4(ii), we further have $P(Y_{n,i} \neq U_{n,i}) \le K (k_n \Delta_n)^{1/2}$. The condition $P(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ then follows from $k_n^3 \Delta_n = o(1)$.
By Theorem 2.3(a), we have $E[\hat{\phi}_n] \to \alpha$ as asserted.

(b) We prove this assertion by applying Theorem 2.3(b). We verify the conditions in Theorem 2.3 for $\widetilde{Y}_{n,i} = Y_{n,i}$, $\mathcal{G}_n = \mathcal{F}_{(i^* - k_n)\Delta_n}$, and

$$U_{n,i} = \begin{cases} g(\zeta_{(i^* - k_n)\Delta_n}, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{1,n}, \\ g(\zeta_{(i^* - k_n)\Delta_n} + c, \epsilon_{n,i}) & \text{if } i \in \mathcal{I}_{2,n}. \end{cases}$$

As in part (a), the variables $(U_{n,i})_{i \in \mathcal{I}_{j,n}}$ are $\mathcal{G}_n$-conditionally i.i.d. for each $j \in \{1, 2\}$, and $P(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ uniformly in $i \in \mathcal{I}_n$. Assumption 2.2(iii) also ensures that $P\left( \int (Q_{1,n}(x) - Q_{2,n}(x))^2 \, dQ_n(x) > \delta_n \right) \to 1$ for any real sequence $\delta_n = o(1)$. By Theorem 2.3(b), we have that $E[\hat{\phi}_n] \to 1$, as asserted.
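
To build intuition for the coupling condition $P(Y_{n,i} \neq U_{n,i}) = o(k_n^{-1})$ verified in the proof of Theorem 2.4, the following Python sketch simulates a hypothetical discrete-valued model. Everything specific in it is an assumption made for illustration and is not taken from the paper: the map `g(z, e) = floor(exp(z) * e)`, the exponential disturbances, the scaled random-walk state process, and the function name `mismatch_rate`. The sketch freezes the state at the start of the window to form $U_{n,i}$ and reports the fraction of observations with $Y_{n,i} \neq U_{n,i}$ for a coarse and a fine sampling interval.

```python
import numpy as np

def g(z, e):
    # Hypothetical discrete-valued observation map (an assumption of this sketch).
    return np.floor(np.exp(z) * e)

def mismatch_rate(delta, z_incr, eps):
    """Fraction of observations in the window with Y != U.

    The state zeta is a random walk with increments sqrt(delta) * z_incr,
    normalized so that its value at the left endpoint of the window is 0.
    """
    zeta = np.sqrt(delta) * np.cumsum(z_incr)  # state at each observation time
    y = g(zeta, eps)                           # observed data Y_{n,i}
    u = g(0.0, eps)                            # coupling variable U_{n,i}: state frozen
    return float(np.mean(y != u))

rng = np.random.default_rng(0)
k = 50
z_incr = rng.normal(size=2 * k)                # standardized state increments
eps = rng.exponential(scale=5.0, size=2 * k)   # i.i.d. disturbances

rate_coarse = mismatch_rate(1e-2, z_incr, eps)  # coarse sampling interval
rate_fine = mismatch_rate(1e-8, z_incr, eps)    # fine sampling interval
print(rate_coarse, rate_fine)
```

Reusing the same draws `z_incr` and `eps` for both sampling intervals isolates the effect of shrinking the interval: the state moves less over the window, so the observed and coupled values disagree no more often at the finer interval than at the coarser one.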

References

Andersen, T. G. (1996): “Return Volatility and Trading Volume: An Information Flow Interpretation of Stochastic Volatility,” Journal of Finance, 51, 169–204.

Andrews, D. W. K. (1993): “Tests for Parameter Instability and Structural Change With Unknown Change Point,” Econometrica, 61, 821–856.

Bai, J. and P. Perron (1998): “Estimating and Testing Linear Models with Multiple Structural Changes,” Econometrica, 66, 47–78.

Barndorff-Nielsen, O. E. and N. Shephard (2006): “Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation,” Journal of Financial Econometrics, 4, 1–30.

Bollerslev, T., J. Li, and Y. Xue (2018): “Volume, Volatility, and Public News Announcements,” Review of Economic Studies, 85, 2005–2041.

Bollerslev, T. and V. Todorov (2011): “Estimation of Jump Tails,” Econometrica, 79, 1727–1783.

Calonico, S., M. D. Cattaneo, and R. Titiunik (2014): “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs,” Econometrica, 82, 2295–2326.

Canay, I. A. and V. Kamat (2017): “Approximate Permutation Tests and Induced Order Statistics in the Regression Discontinuity Design,” The Review of Economic Studies, 85, 1577–1608.

Cattaneo, M. D., B. R. Frandsen, and R. Titiunik (2015): “Randomization inference in the regression discontinuity design: An application to party advantages in the US Senate,” Journal of Causal Inference, 3, 1–24.

Cattaneo, M. D., R. Titiunik, and G. Vazquez-Bare (2017): “Comparing inference approaches for RD designs: A reexamination of the effect of Head Start on child mortality,” Journal of Policy Analysis and Management, 36, 643–681.

Chow, G. C. (1960): “Tests of Equality Between Sets of Coefficients in Two Linear Regressions,” Econometrica, 28, 591–605.

Chung, E. and J. P. Romano (2013): “Exact and Asymptotically Robust Permutation Tests,” Annals of Statistics, 41, 484–507.

Cochrane, J. H. and M. Piazzesi (2002): “The Fed and Interest Rates - A High-Frequency Identification,” American Economic Review, 92, 90–95.

Comte, F. and E. Renault (1998): “Long Memory in Continuous Time Stochastic Volatility Models,” Mathematical Finance, 8, 291–323.

Davidson, J. (1994): Stochastic Limit Theory, Oxford University Press.

DiCiccio, C. J. and J. P. Romano (2017): “Robust Permutation Tests for Correlation and Regression Coefficients,” Journal of the American Statistical Association, 112, 1211–1220.

Foster, D. P. and D. B. Nelson (1996): “Continuous Record Asymptotics for Rolling Sample Variance Estimators,” Econometrica, 64, 139–174.

Hahn, J., P. Todd, and W. van der Klaauw (2001): “Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design,” Econometrica, 69, 201–209.

Hoeffding, W. (1952): “The Large-Sample Power of Tests Based on Permutations of Observations,” Annals of Mathematical Statistics, 23, 169–192.

Imbens, G. and K. Kalyanaraman (2011): “Optimal Bandwidth Choice for the Regression Discontinuity Estimator,” The Review of Economic Studies, 79, 933–959.

Imbens, G. W. and T. Lemieux (2008): “Regression Discontinuity Designs: A Guide to Practice,” Journal of Econometrics, 142, 615–635.

Jacod, J. and P. Protter (2012): Discretization of Processes, Springer.

Lee, D. S. and T. Lemieux (2010): “Regression Discontinuity Designs in Economics,” Journal of Economic Literature, 48, 281–355.

Lehmann, E. L. and J. P. Romano (2005): Testing Statistical Hypotheses, Springer.

Li, J. and Y. Liu (2020): “Efficient Estimation of Integrated Volatility Functionals under General Volatility Dynamics,” Econometric Theory, Forthcoming.

Li, J., V. Todorov, and G. Tauchen (2017): “Jump Regressions,” Econometrica, 85, 173–195.

Li, J. and D. Xiu (2016): “Generalized Method of Integrated Moments for High-frequency Data,” Econometrica, 84, 1613–1633.

Nakamura, E. and J. Steinsson (2018a): “High-Frequency Identification of Monetary Non-Neutrality: The Information Effect,” Quarterly Journal of Economics, 133, 1283–1330.

Nakamura, E. and J. Steinsson (2018b): “Identification in Macroeconomics,” Journal of Economic Perspectives, 32, 59–86.

Stock, J. (1994): “Chapter 46 Unit Roots, Structural Breaks and Trends,” Elsevier, vol. 4 of Handbook of Econometrics, 2739–2841.

Thistlethwaite, D. L. and D. T. Campbell (1960): “Regression-discontinuity Analysis: An alternative to the Ex Post Facto Experiment,” Journal of Educational Psychology, 51, 309–317.

Todorov, V. and G. Tauchen (2012): “Realized Laplace Transforms for Pure-jump Semimartingales,” The Annals of Statistics, 40, 1233–1262.

van der Vaart, A. and J. Wellner (1996):