State Heterogeneity Analysis of Financial Volatility Using High-Frequency Financial Data
Dohyun Chun and Donggyu Kim ∗ College of Business, Korea Advanced Institute of Science and Technology (KAIST), Seoul, Korea
March 1, 2021
Abstract
Recently, to account for low-frequency market dynamics, several volatility models employing high-frequency financial data have been developed. However, in financial markets, we often observe that financial volatility processes depend on economic states, so they have a state heterogeneous structure. In this paper, to study state heterogeneous market dynamics based on high-frequency data, we introduce a novel volatility model based on a continuous Itô diffusion process whose intraday instantaneous volatility process evolves depending on the exogenous state variable, as well as its integrated volatility. We call it the state heterogeneous GARCH-Itô (SG-Itô) model. We suggest a quasi-likelihood estimation procedure with the realized volatility proxy and establish its asymptotic behaviors. Moreover, to test the low-frequency state heterogeneity, we develop a Wald test-type hypothesis testing procedure. The results of empirical studies suggest the existence of leverage, investor attention, market illiquidity, stock market comovement, and post-holiday effects in S&P 500 index volatility.
JEL classification:
C22, C53, C58
Key words and phrases:
GARCH, diffusion process, regime switching, quasi-maximum likelihood estimator, Wald test.
∗ Corresponding author. E-mail addresses: [email protected] (D. Chun), [email protected] (D. Kim).
arXiv [stat.AP]
Introduction
Volatility plays an important role in financial asset pricing, risk management, portfolio allocation, and managerial decision-making. These interests have led many researchers to analyze financial volatility features such as time-varying heteroscedasticity, heavy-tailedness, and the volatility clustering effect. To account for stylized market features, GARCH models (Bollerslev, 1986; Engle, 1982) have been introduced. In financial markets, we often observe that volatility varies with economic or financial states, but the plain GARCH model cannot deal with this. To consider this state heterogeneity in the volatility process, researchers have developed state-heterogeneity GARCH-type models, for example, Markov-switching GARCH (Bauwens et al., 2010, 2014; Gray, 1996; Haas et al., 2004; Hamilton and Susmel, 1994; Klaassen, 2002), GJR-GARCH (Glosten et al., 1993), and QR-GARCH (Nyberg, 2012) models. Their empirical studies support the existence of state heterogeneity in financial volatility.

GARCH family models generally use daily return information to determine daily volatility levels, but squared daily returns provide limited information about current volatility levels (Andersen and Bollerslev, 1998). Therefore, the data period should be long enough to enjoy the large-sample asymptotic properties of the estimator. However, structural breaks in long-period data may deteriorate the estimation quality, and the requirement for long-period data hinders the investigation of short-term market dynamics. State heterogeneity models are especially exposed to these issues because the number of parameters increases in proportion to the number of states, and the data are split among states. Recently, widely available financial big data have shed light on this issue. For example, thanks to advances in technology, high-frequency financial data are available, and we can accurately estimate volatility with relatively short-period data.
In particular, researchers have modeled high-frequency data based on continuous-time Itô processes and proposed procedures for estimating realized volatility. Examples include multi-scale realized volatility (Zhang, 2006, 2011), pre-averaging realized volatility (Jacod et al., 2009), quasi-maximum likelihood estimator (QMLE; Aït-Sahalia et al., 2010; Xiu, 2010), kernel realized volatility (Barndorff-Nielsen et al., 2008), and robust pre-averaging realized volatility (Fan and Kim, 2018). Renault and Werker (2011) suggested the endogenous trading time robust realized volatility, and Liu et al. (2018) demonstrated that the pre-averaging estimator is robust for zero-duration high-frequency data. The availability of these efficient realized volatility estimators has had a great impact on volatility modeling and analysis. For example, in regard to the modeling aspect, researchers have tried to bridge the gap between the discrete-time volatility model and the continuous-time process (Kallsen and Taqqu, 1998; Nelson, 1990; Wang, 2002). For the volatility dynamics analysis, realized volatility is employed as an innovation, which helps to improve estimation and prediction performance (Cerovecki et al., 2019; Engle and Gallo, 2006; Kim and Wang, 2020; Shephard and Sheppard, 2010; Song et al., 2021; Tao et al., 2011; Visser, 2011). Recently, Kim and Wang (2016) introduced the unified continuous volatility process (unified GARCH-Itô model) to provide a mathematical background for using high-frequency data in GARCH model estimation. They showed that incorporating high-frequency financial data improves parameter estimation performance and helps analyze low-frequency market dynamics. See also Kim (2016); Kim and Fan (2019). In this manner, some state heterogeneous volatility models also incorporate high-frequency data.
For instance, researchers employed realized volatility in regime-switching ARMA-GARCH (Zhang and Frey, 2015), two-stage three-state FIGARCH (Shi and Ho, 2015), realized GARCH (Hansen et al., 2012), and multivariate Markov regime-switching GARCH (Lai et al., 2017) models. These studies reported the usefulness of high-frequency data in analyzing state heterogeneity in low-frequency financial volatility. The success of previous studies has increased interest in developing volatility models that provide a mathematical background for using high-frequency data to analyze low-frequency volatility dynamics.

To examine and account for state heterogeneity in low-frequency volatility dynamics based on high-frequency financial data, we propose a novel volatility model based on a continuous-time Itô diffusion process whose instantaneous volatility process evolves depending on the state variables. In particular, its instantaneous volatility process is continuous with respect to time and is homogeneous within each low-frequency period. In contrast, the process varies with the state from one low-frequency period to the next, which captures the low-frequency market dynamics. Consequently, its integrated volatility process has the form of the well-known regime-switching GARCH model. The model is called the state heterogeneous GARCH-Itô (SG-Itô) model. To estimate model parameters, we suggest a quasi-maximum likelihood estimation procedure based on the high-frequency data and establish its asymptotic theories. Furthermore, to test state heterogeneity in low-frequency volatility, we introduce a Wald test-type hypothesis testing procedure. The results of empirical studies suggest the existence of leverage, trading volume or investor attention, market illiquidity, stock market comovement, and post-holiday effects on S&P 500 index volatility. However, these state heterogeneities are not revealed with the same period of low-frequency data, because of its inefficiency.
More details are provided in Section 5.2.

The rest of the paper is organized as follows. Section 2 introduces the SG-Itô model and illustrates properties of instantaneous and integrated volatility processes. Section 3 presents the quasi-maximum likelihood method and establishes its asymptotic theories. Section 4 suggests a hypothesis testing procedure. Section 5 provides the results of numerical studies, including simulation and empirical studies. Section 6 concludes the paper. The proofs are provided in the Appendix.
State heterogeneity in financial volatility has long been discussed as a key feature of market dynamics (Lamoureux and Lastrapes, 1990b). To account for state heterogeneity in the volatility process, researchers have proposed various forms of regime-switching GARCH (RS-GARCH) models. For example, Hamilton and Susmel (1994) applied the Markov-switching approach to build the state heterogeneous GARCH process. See also Bauwens et al. (2010, 2014), Gray (1996), Haas et al. (2004), and Klaassen (2002). Glosten et al. (1993) introduced the GJR-GARCH model, which reflects the well-known leverage effect (Black, 1976; Christie, 1982; Figlewski and Wang, 2000; Tauchen et al., 1996). Taking the state of the business cycle into account exogenously, Nyberg (2012) introduced the regime-switching GARCH-M (QR-GARCH) model to examine the state-dependent risk-return relationship.

A regime-switching model is characterized by the joint process of the historical log price $\{X_n\}$ and the state variable $\{s_n\}$. Specifically, a process of the state variable $s_t$ and an evolution process of $X_t$ for a given state identify the model structure (Lange and Rahbek, 2009). In the case of the RS-GARCH model, the conditional volatility process depends on the sigma fields generated by $\{X_n\}$ ($\mathcal{F}^{x,L}_n = \sigma(X_n, X_{n-1}, X_{n-2}, \ldots)$) and $\{s_n\}$ ($\mathcal{F}^{s}_n = \sigma(s_n, s_{n-1}, s_{n-2}, \ldots)$), where $n \in \mathbb{N}$ and $\mathbb{N}$ is the set of all non-negative integers. A general discrete-time RS-GARCH model is described as follows:

$$
\begin{aligned}
X_n - X_{n-1} &= \mu + \sqrt{h_n(\theta^{s,L})}\,\epsilon^L_n, \\
h_n(\theta^{s,L}) &= \omega^L_i + \gamma^L_i h_{n-1}(\theta^{s,L}) + \beta^L_i \zeta_{n-1}^2,
\end{aligned}
\tag{2.1}
$$

where $\theta^{s,L} = (\omega^L_i, \gamma^L_i, \beta^L_i)$ is a model parameter for the state indicator $i = 1, 2, \ldots, D$, $D$ is the number of states, $\mu$ is a drift, $\zeta_n = X_n - X_{n-1} - \mu$, and the random error $\epsilon^L_n$ satisfies $E[\epsilon^L_n \mid \mathcal{F}^L_{n-1}] = 0$ and $E[(\epsilon^L_n)^2 \mid \mathcal{F}^L_{n-1}] = 1$ a.s. for $\mathcal{F}^L_{n-1} = \mathcal{F}^{x,L}_{n-1} \cup \mathcal{F}^{s}_n$.
For the RS-GARCH model, the model parameters vary with the state, so the state variable $s_t$ plays a key role. The state variable $s_t$ may take a variety of forms, and the assumption on the state process distinguishes the model. For example, the Markov-switching GARCH model has a latent Markov state process, whereas the GJR- and QR-GARCH models employ exogenous state variables.

In the study of discrete-time market dynamics, a long period of data is needed to obtain consistent estimation results. However, a long period of data is prone to the structural break issue, especially for RS-GARCH-type models, because of their complexity. Recently, realized volatility estimators based on high-frequency financial data have been well developed (Aït-Sahalia et al., 2010; Barndorff-Nielsen et al., 2008; Jacod et al., 2009; Xiu, 2010; Zhang, 2006). Kim and Wang (2016) showed that parameter estimation efficiency improves when realized volatility estimators are used as the estimation proxy. Therefore, we hypothesize that (1) state heterogeneity exists in the financial volatility process and (2) using high-frequency data facilitates its analysis. For hypothesis testing, we need a model that makes it possible to use realized volatility estimators in the analysis of low-frequency state heterogeneity. This section introduces a novel continuous-time volatility model whose instantaneous volatility process varies with a discrete state process.

For the state variable process $s_n$, we consider discrete-time exogenous variables that determine the state of the volatility process. The term exogenous comes from the independence assumption between the state variable $s_n$ and the price process, which describes the unilateral influence of the state on the volatility process. For simplicity, this study deals with the binary state by assuming that $\{s_n\}$ is a binary process. Let $\mathbb{R}_+ = [0, \infty)$ and $t \in \mathbb{R}_+$. We define a state heterogeneity volatility model with a continuous-time Itô process as follows.

Definition 1.
For $t \in (n-1, n]$, we say that a log stock price $X_t$ follows an SG-Itô model if it satisfies

$$
\begin{aligned}
dX_t &= \mu\,dt + \sigma_t\,dB_t, \qquad \sigma_t^2 = (1 - s_n)\,\sigma_{1,t}^2 + s_n\,\sigma_{2,t}^2, \\
\sigma_{i,t}^2 &= \sigma_{n-1}^2 + (t - n + 1)\{\omega_i + (\gamma_i - 1)\sigma_{n-1}^2\} + \beta_i \left(\int_{n-1}^{t} \sigma_{i,s}\,dB_s\right)^2 \quad \text{for } i \in \{1, 2\},
\end{aligned}
\tag{2.2}
$$

where $X_t$ is a log stock price, $\sigma_{1,t}$ and $\sigma_{2,t}$ are volatility processes adapted to $\mathcal{F}^x_t = \sigma(X_s : s \le t)$, $B_t$ is the standard Brownian motion with respect to the filtration $\mathcal{F}^x_t$, and $\theta = (\omega_1, \omega_2, \gamma_1, \gamma_2, \beta_1, \beta_2)$ are model parameters.

For the SG-Itô model, instantaneous volatility has a continuous-time state heterogeneous process defined at all times $t$. In particular, during the current low-frequency period $t \in (n-1, n]$, the instantaneous volatility $\sigma_{s_n+1,t}$ follows a homogeneous continuous-time Itô process depending on the current state variable $s_n$. During the next low-frequency period $t \in (n, n+1]$, the instantaneous volatility $\sigma_{s_{n+1}+1,t}$ evolves from the end of the previous instantaneous process $\sigma_{s_n+1,n}$, while its evolution is determined by the next-period state variable $s_{n+1}$. This model deals with the low-frequency state heterogeneity in volatility through state-varying coefficients. For example, for $s_n = 0$, we have $(\omega_i, \gamma_i, \beta_i) = (\omega_1, \gamma_1, \beta_1)$ and $\sigma_t = \sigma_{1,t}$, whereas for $s_n = 1$, we have $(\omega_i, \gamma_i, \beta_i) = (\omega_2, \gamma_2, \beta_2)$ and $\sigma_t = \sigma_{2,t}$. Moreover, the current level of volatility depends on past volatility due to its recursive structure. Thus, the model is uniquely identified by the path of the state variables because the sequential order of the states differentiates the volatility process. Accordingly, it can handle the regime shift in volatility with the corresponding state variable. Since the model can incorporate any state process, it allows us to test a given exogenous state. This is a distinguishing feature of the SG-Itô model compared to the single-regime model.
When the states are homogeneous ($s_n = s_{n-1} = \cdots = s_1$), the model reduces to a single-regime model, the unified GARCH-Itô model (Kim and Wang, 2016). That is, the unified GARCH-Itô model is a special case of the SG-Itô model. More details are provided in Appendix A.2.

In this paper, we consider the case where $n \in \mathbb{N}$ denotes a day. In this case, $s_n$ is a daily state variable, and $\sigma_t$ for $t \in [n-1, n)$ denotes the intraday volatility on day $n$. Definition 1 now implies that intraday volatility on day $n$ evolves from the closing volatility on day $n-1$ according to the state $s_n$.

This study aims to investigate the low-frequency market dynamics, so the integrated volatility structure over the low-frequency period is important. Moreover, the integrated volatility process will be used in the parameter estimation procedure. In this section, we study properties of the integrated volatilities.
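As an illustration of Definition 1, the following is a minimal sketch of one low-frequency period simulated with an Euler discretization. The function name `simulate_day`, the grid size `m`, and the dictionary layout of `theta` are hypothetical choices made for this example, not part of the model. The sketch relies on the fact that, at $t = n$, Equation (2.2) gives $\sigma_n^2 = \omega_i + \gamma_i \sigma_{n-1}^2 + \beta_i Z_n^2$ with $Z_n = \int_{n-1}^{n} \sigma_{i,s}\,dB_s$.

```python
import math
import random


def simulate_day(x0, sigma2_0, s_n, theta, mu=0.0, m=390, rng=None):
    """Euler sketch of one period (n-1, n] of the SG-Ito model in Equation (2.2).

    x0       : log price at time n-1
    sigma2_0 : instantaneous variance sigma_{n-1}^2 at time n-1
    s_n      : binary state (0 or 1) for the period
    theta    : {1: (omega_1, gamma_1, beta_1), 2: (omega_2, gamma_2, beta_2)}
               (hypothetical container; state s_n selects index s_n + 1)
    Returns (log price at n, sigma_n^2, Z_n).
    """
    rng = rng or random.Random()
    omega, gamma, beta = theta[s_n + 1]
    dt = 1.0 / m
    x, z = x0, 0.0  # z approximates the stochastic integral of sigma dB
    for k in range(m):
        u = k * dt  # elapsed time t - (n - 1)
        sigma2 = sigma2_0 + u * (omega + (gamma - 1.0) * sigma2_0) + beta * z * z
        incr = math.sqrt(sigma2) * rng.gauss(0.0, math.sqrt(dt))
        x += mu * dt + incr
        z += incr
    # Closing variance: Equation (2.2) evaluated at t = n.
    sigma2_end = omega + gamma * sigma2_0 + beta * z * z
    return x, sigma2_end, z
```

Chaining calls, with each day's returned closing variance passed as the next day's `sigma2_0`, reproduces the recursive structure across days. The sketch assumes positive parameters with $\gamma_i < 1$ so that the variance stays non-negative on the grid.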
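Theorem 1. (a) Under the SG-Itô framework, integrated volatility on state $i \in \{1, 2\}$ can be decomposed into an $\{\mathcal{F}^{x,L}_{n-1}, \mathcal{F}^{s}_{n-1}\}$-adapted process and a martingale difference as follows:

$$
\int_{n-1}^{n} \sigma_{i,t}^2\,dt = h_{i,n}(\theta) + \xi_{i,n} \quad \text{a.s.},
$$

where

$$
\begin{aligned}
h_{i,n}(\theta) &= H_{c,i}(\theta) + H_{\beta,i}(\theta)\,\sigma_{n-1}^2, \\
\xi_{i,n} &= 2\int_{n-1}^{n} \left(e^{(n-t)\beta_i} - 1\right) \int_{n-1}^{t} \sigma_{i,s}\,dB_s\;\sigma_{i,t}\,dB_t, \\
H_{c,i}(\theta) &= \beta_i^{-2}\left(e^{\beta_i} - 1 - \beta_i\right)\omega_i, \\
H_{\beta,i}(\theta) &= (\gamma_i - 1)\,\beta_i^{-2}\left(e^{\beta_i} - 1 - \beta_i\right) + \beta_i^{-1}\left(e^{\beta_i} - 1\right).
\end{aligned}
$$

(b) Let $\mathcal{F}_{n-1} = \mathcal{F}^{x}_{n-1} \cup \mathcal{F}^{s}_{n}$. Then, for given $s_n$ and $s_{n-1}$, the conditional expected integrated volatility $E\left[\int_{n-1}^{n} \sigma_t^2\,dt \mid \mathcal{F}_{n-1}\right] = h_n(\theta)$ a.s. is represented by

$$
\begin{aligned}
h_n(\theta) ={}& s_{11,n}\left(\omega^h_{11} + \gamma^h_{11} h_{n-1}(\theta) + \beta^h_{11} Z_{n-1}^2\right) + s_{12,n}\left(\omega^h_{12} + \gamma^h_{12} h_{n-1}(\theta) + \beta^h_{12} Z_{n-1}^2\right) \\
&+ s_{21,n}\left(\omega^h_{21} + \gamma^h_{21} h_{n-1}(\theta) + \beta^h_{21} Z_{n-1}^2\right) + s_{22,n}\left(\omega^h_{22} + \gamma^h_{22} h_{n-1}(\theta) + \beta^h_{22} Z_{n-1}^2\right),
\end{aligned}
\tag{2.3}
$$

where

$$
\begin{aligned}
& s_{11,n} = (1 - s_{n-1})(1 - s_n), \quad s_{12,n} = (1 - s_{n-1})\,s_n, \quad s_{21,n} = s_{n-1}(1 - s_n), \quad s_{22,n} = s_{n-1}\,s_n, \\
& \theta^h = \{\omega^h_{11}, \omega^h_{12}, \omega^h_{21}, \omega^h_{22}, \gamma^h_{11}, \gamma^h_{12}, \gamma^h_{21}, \gamma^h_{22}, \beta^h_{11}, \beta^h_{12}, \beta^h_{21}, \beta^h_{22}\}, \\
& \omega^h_{ii} = (1 - \gamma_i)H_{c,i}(\theta) + \omega_i H_{\beta,i}(\theta), \quad \gamma^h_{ii} = \gamma_i, \quad \beta^h_{ii} = \beta_i H_{\beta,i}(\theta), \\
& \omega^h_{ij} = H_{c,j}(\theta) - \gamma_i H_{c,i}(\theta)\,\frac{H_{\beta,j}(\theta)}{H_{\beta,i}(\theta)} + \omega_i H_{\beta,j}(\theta), \quad \gamma^h_{ij} = \gamma_i\,\frac{H_{\beta,j}(\theta)}{H_{\beta,i}(\theta)}, \quad \beta^h_{ij} = \beta_i H_{\beta,j}(\theta), \\
& Z_n = (1 - s_n)Z_{1,n} + s_n Z_{2,n}, \quad Z_{i,t} = \int_{t-1}^{t} \sigma_{i,s}\,dB_s \quad \text{for } i \in \{1, 2\}.
\end{aligned}
$$

Theorem 1(a) shows that integrated volatility on state $i$ is decomposed into the GARCH volatility $h_{i,n}(\theta)$ and the martingale difference $\xi_{i,n}$. This decomposition plays a prominent role in the subsequent theorems, and we show that the theorems can be established for any process that satisfies the decomposition in Theorem 1(a). Theorem 1(b) indicates that the expected integrated volatility $h_n(\theta)$ follows a four-state RS-GARCH(1,1) structure.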
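To make the mapping from $\theta$ to $\theta^h$ in Theorem 1(b) concrete, the following is a minimal sketch that evaluates $H_{c,i}$, $H_{\beta,i}$, and the four-state coefficients, then advances $h_n(\theta)$ by one step. The function names and the dictionary layout of `theta` are hypothetical; the formulas are those stated in Theorem 1, and $\beta_i \neq 0$ is assumed.

```python
import math


def H_c(omega, gamma, beta):
    """H_{c,i} = beta^{-2} (e^beta - 1 - beta) omega; gamma is unused and
    kept only so a (omega, gamma, beta) tuple can be unpacked directly."""
    return (math.exp(beta) - 1.0 - beta) / beta ** 2 * omega


def H_beta(omega, gamma, beta):
    """H_{beta,i} = (gamma - 1) beta^{-2} (e^beta - 1 - beta) + beta^{-1} (e^beta - 1)."""
    return ((gamma - 1.0) * (math.exp(beta) - 1.0 - beta) / beta ** 2
            + (math.exp(beta) - 1.0) / beta)


def integrated_params(theta_i, theta_j):
    """(omega^h_ij, gamma^h_ij, beta^h_ij) for previous state i and current state j."""
    omega_i, gamma_i, beta_i = theta_i
    ratio = H_beta(*theta_j) / H_beta(*theta_i)
    omega_h = (H_c(*theta_j) - gamma_i * H_c(*theta_i) * ratio
               + omega_i * H_beta(*theta_j))
    return omega_h, gamma_i * ratio, beta_i * H_beta(*theta_j)


def next_h(h_prev, z_prev, s_prev, s_now, theta):
    """One step of Equation (2.3); theta = {1: (omega_1, gamma_1, beta_1), 2: (...)}."""
    omega_h, gamma_h, beta_h = integrated_params(theta[s_prev + 1], theta[s_now + 1])
    return omega_h + gamma_h * h_prev + beta_h * z_prev ** 2
```

Setting `theta_i == theta_j` recovers the single-regime coefficients $\omega^h_{ii}$, $\gamma^h_{ii}$, $\beta^h_{ii}$, which gives a quick consistency check on the off-diagonal formulas.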
In particular, the integrated form of the model parameter $\theta^h$ is determined by the product of $s_n$ and $s_{n-1}$, so the integrated volatility depends on both the current and previous states. In the sense that its integrated volatility has an RS-GARCH-like structure, the SG-Itô model has an instantaneous volatility process that characterizes the RS-GARCH models. We note that this general model allows us to incorporate and extend existing regime-switching volatility frameworks by employing a suitable state process. For example, this model covers the Markov-switching GARCH model with a latent Markov state process and the GJR- and QR-GARCH models with observed exogenous state processes.

In this paper, we mainly deal with observed state processes. In practice, however, the future state is often unobservable. For instance, the day-of-week or previous-day return state is available at the beginning of the day, whereas daily trading volume or market illiquidity is not available until the end of the day. We note that the SG-Itô model does not require the observability of the state variable $s_n$. For unrevealed $s_n$, Proposition 1 suggests that we can estimate the expected integrated volatility with the state transition probability.

Proposition 1.
For unrevealed $s_n$ and given $s_{n-1}$, we have

$$
\begin{aligned}
E\left[\int_{n-1}^{n} \sigma_t^2\,dt \,\Big|\, \mathcal{F}^{x}_{n-1}, \mathcal{F}^{s}_{n-1}\right]
={}& p_{11,n}(1 - s_{n-1})\left(\omega^h_{11} + \gamma^h_{11} h_{n-1}(\theta) + \beta^h_{11} Z_{n-1}^2\right) \\
&+ p_{12,n}(1 - s_{n-1})\left(\omega^h_{12} + \gamma^h_{12} h_{n-1}(\theta) + \beta^h_{12} Z_{n-1}^2\right) \\
&+ p_{21,n}\,s_{n-1}\left(\omega^h_{21} + \gamma^h_{21} h_{n-1}(\theta) + \beta^h_{21} Z_{n-1}^2\right) \\
&+ p_{22,n}\,s_{n-1}\left(\omega^h_{22} + \gamma^h_{22} h_{n-1}(\theta) + \beta^h_{22} Z_{n-1}^2\right) \quad \text{a.s.},
\end{aligned}
$$

where $p_{ij,n} = P(s_n = j - 1 \mid s_{n-1} = i - 1)$ for $i, j \in \{1, 2\}$.

In practice, the $Z_n$'s are not observable due to the drift term $\mu$. Thus, to predict future volatility, we first need to estimate $\mu$ using the sample mean of the daily log-returns. The martingale convergence theorem shows that the sample mean of daily log-returns converges to $\mu$.

In this paper, we assume that the true log price process follows the SG-Itô model as described in Definition 1. We also distinguish the low- and high-frequency data as follows. The low-frequency data signify the log price observed at integer time points $t = 0, 1, 2, \ldots$, and we assume that the true low-frequency log prices, $X_0, X_1, \ldots$, are observed. At the same time, high-frequency data indicate the log price observed at time points between integer time points, which are denoted by $t_{n,m}$ for $n = 0, 1, \ldots, N$ and $m = 1, \ldots, M_n$ and satisfy $n - 1 = t_{n,0} < t_{n,1} < \cdots < t_{n,M_n} = n$. Asymptotics are taken as $N, M \to \infty$, where $M = \sum_{n=1}^{N} M_n / N$. Let $\xrightarrow{p}$ and $\xrightarrow{d}$ be convergence in probability and distribution, respectively. The $L_p$ norm of a random variable $Z$ is denoted by $\|Z\|_{L_p} = (E[|Z|^p])^{1/p}$. Finally, $\|X\|_{\max} = \max_{j,k} |X_{j,k}|$ for a matrix $X = (X_{j,k})_{j,k=1,\ldots,q}$ and $\|x\|_{\max} = \max_j |x_j|$ for a vector $x = (x_1, \ldots, x_q)$. $C$'s denote positive generic constants whose values may change from appearance to appearance and are free of $\theta$, $N$, and $M_n$.

For statistical inferences, we apply a quasi-maximum likelihood estimation procedure to the integrated volatility process.
Theorem 1(a) suggests that integrated volatility over the $n$th period is decomposed into the GARCH volatility $h_n(\theta)$ and a martingale difference. The well-developed martingale convergence theorem indicates that the integrated volatility converges to $h_n(\theta)$ as $N \to \infty$, so the integrated volatility can be a good proxy of $h_n(\theta)$. Unfortunately, integrated volatility is not observed, so we need to estimate it using the observed noisy high-frequency data. There are well-performing realized volatility estimators such as multi-scale realized volatility (Zhang, 2006, 2011), pre-averaging realized volatility (Jacod et al., 2009), and kernel realized volatility (Barndorff-Nielsen et al., 2008), which have the optimal convergence rate $M_n^{-1/4}$. We adopt the pre-averaging realized volatility in the numerical studies.

Let $\theta = (\omega_1, \omega_2, \beta_1, \beta_2, \gamma_1, \gamma_2) \in \Theta$ with the true value $\theta_0 = (\omega_{0,1}, \omega_{0,2}, \beta_{0,1}, \beta_{0,2}, \gamma_{0,1}, \gamma_{0,2}) \in \Theta$ for the compact parameter space $\Theta$. Then, for the given state variable $s_n$, the quasi-maximum likelihood function is defined as follows:

$$
\widehat{Q}_{N,M}(\theta) = -\frac{1}{2N}\sum_{n=1}^{N}\left[\log(h_n(\theta)) + \frac{RV_n}{h_n(\theta)}\right],
\tag{3.1}
$$

where $RV_n$ is the realized volatility estimator constructed based on high-frequency data during the $n$th period and $h_n(\theta) = (1 - s_n)h_{1,n}(\theta) + s_n h_{2,n}(\theta)$ is the state-heterogeneous GARCH volatility presented in Theorem 1(b). The estimation procedure can be easily generalized to other forms of state processes with a suitable quasi-likelihood function. The QMLE $\widehat{\theta}$ is

$$
\widehat{\theta} = \operatorname*{argmax}_{\theta \in \Theta} \widehat{Q}_{N,M}(\theta).
$$

To investigate the asymptotic behaviors of the QMLE $\widehat{\theta}$, we need the following technical conditions.

Assumption 1.
(a) Let the parameter space be $\Theta = \{\theta = (\omega_1, \omega_2, \gamma_1, \gamma_2, \beta_1, \beta_2) : \omega_l < \omega_i < \omega_u,\ \gamma_l < \gamma_i < \gamma_u,\ \beta_l < \beta_i < \beta_u,\ \omega^h_l < \omega^h_{ij} < \omega^h_u,\ \gamma^h_l < \gamma^h_{ij} < \gamma^h_u,\ \beta^h_l < \beta^h_{ij} < \beta^h_u,\ \gamma^h_{ij} + \beta^h_{ij} < 1\}$ for $i, j \in \{1, 2\}$, where $\omega_l, \omega_u, \gamma_l, \gamma_u, \beta_l, \beta_u, \omega^h_l, \omega^h_u, \gamma^h_l, \gamma^h_u, \beta^h_l, \beta^h_u$ are known positive constants.
(b) $\sup_{n \in \mathbb{N}} E(|\xi_{i,n}|^{\delta}) < \infty$ for $i \in \{1, 2\}$ and some $\delta > 1$.
(c) $\dfrac{E[Z_n^4 \mid \mathcal{F}_{n-1}]}{h_n^2(\theta_0)} \le C$ a.s. for any $n \in \mathbb{N}$.
(d) $\{\xi_{i,n}, Z_{i,n}, s_n\}$ is a stationary and ergodic process.

Remark 1. Assumption 1 is required to handle the low-frequency part. Assumption 1(b) is a sufficient condition for the uniform integrability of the martingale difference process. The uniform integrability is a necessary condition to show the boundedness of derivatives of the quasi-likelihood functions, which is required to obtain the consistency of $\widehat{\theta}$. Assumption 1(c) is the finite fourth moment condition. Because the target parameter is the second moment, the finite fourth moment condition is not strong at all for obtaining the convergence rate $N^{-1/2}$ (see also Lee and Hansen (1994)). Assumption 1(d) is only required to derive asymptotic normality of $\widehat{\theta}$. However, this condition is not an obvious result under the SG-Itô model. It is an interesting theoretical problem to investigate conditions which imply Assumption 1(d) under the SG-Itô model. We leave this for a future study.

Remark 2. The state process should be stationary and ergodic. This includes state processes that are generally applied in existing switching models. For example, any multinomial variables with ergodic probabilities are included. Specifically, suppose the state variable $s_t$ satisfies

$$
s_t \mid \Omega_{t-1} \sim \text{Bernoulli}(p_t), \qquad p_t = E_{t-1}(s_t = 1) = \Phi(\pi_t),
$$

where $\Phi(\cdot)$ is a link function and $\pi_t$ is a vector of explanatory variables. Then, under some stationarity condition for $\pi_t$, the state variable $s_t$ is a stationary and ergodic process.

Assumption 2.
(a) Assume $C_1 M \le M_n \le C_2 M$, $\sup_n \sup_{1 \le m \le M_n} |t_{n,m} - t_{n,m-1}| = O(M^{-1})$, and $N^2 M^{-1} \to 0$ as $N, M \to \infty$.
(b) $\sup_{n \in \mathbb{N}} \left\| RV_n - \int_{n-1}^{n} \sigma_t^2\,dt \right\|_{L_2} \le C M^{-1/4}$.
(c) For any $n \in \mathbb{N}$, $E[RV_n \mid \mathcal{F}_{n-1}] \le C\,E\left[\int_{n-1}^{n} \sigma_t^2\,dt \mid \mathcal{F}_{n-1}\right] + C$ a.s.

Remark 3. Assumption 2 concerns the high-frequency part. Assumption 2(a) is a typical condition for realized volatility estimators. Assumption 2(b)–(c) can be obtained easily under some fourth moment conditions as discussed in Tao et al. (2013), Kim et al. (2016), and Kim et al. (2018).

Theorems 2 and 3 establish the consistency of $\widehat{\theta}$ and its convergence rate, respectively.

Theorem 2. Under Assumption 1(a)–(b) and Assumption 2(a)–(b), we have $\widehat{\theta} \to \theta_0$ in probability.

Theorem 3. Under Assumption 1(a)–(c) and Assumption 2, we have

$$
\left\| \widehat{\theta} - \theta_0 \right\|_{\max} = O_p\left(N^{-1/2} + M^{-1/4}\right).
$$

Remark 4. Theorem 3 shows that the convergence rate of $\widehat{\theta}$ has both high- and low-frequency components. The rate $N^{-1/2}$ is due to the low-frequency part, which is the usual parametric convergence rate. The rate $M^{-1/4}$ comes from the high-frequency volatility estimation related to Assumption 2(b), known as the optimal convergence rate of the realized volatility estimator in the presence of micro-structure noise.

Theorem 4 derives asymptotic normality of $\widehat{\theta}$ using the stationarity and ergodicity assumptions.

Theorem 4. Suppose that Assumptions 1 and 2 are met and

$$
\frac{1}{4N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial \theta}\left[\frac{\partial h_n(\theta)}{\partial \theta}\right]^{T}\bigg|_{\theta = \theta_0} h_n(\theta_0)^{-4}\,\xi_n^2\right] \xrightarrow{p} V,
\qquad
\frac{1}{2N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial \theta}\left[\frac{\partial h_n(\theta)}{\partial \theta}\right]^{T}\bigg|_{\theta = \theta_0} h_n(\theta_0)^{-2}\right] \xrightarrow{p} W.
$$

Then we have

$$
\sqrt{N}(\widehat{\theta} - \theta_0) \xrightarrow{d} N\left(0, W^{-1} V W^{-1}\right).
$$
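The objective in Equation (3.1) that defines $\widehat{\theta}$ can be sketched as follows. This is a minimal illustration rather than the estimation code used in the paper: the function names, the dictionary `theta_h` keyed by (previous state, current state), and the initial value `h0` are assumptions introduced here, and the recursion is the four-state form of Equation (2.3).

```python
import math


def garch_path(theta_h, s, z, h0):
    """Conditional volatilities h_0, ..., h_{len(s)-1} from Equation (2.3).

    theta_h : {(i, j): (omega^h_ij, gamma^h_ij, beta^h_ij)} with i, j in {1, 2}
    s       : list of binary states; s[n] is the state on day n
    z       : list of de-meaned daily returns; z[n] is Z_n
    h0      : assumed starting value for h on day 0 (not specified by the model)
    """
    h = [h0]
    for n in range(1, len(s)):
        omega_h, gamma_h, beta_h = theta_h[(s[n - 1] + 1, s[n] + 1)]
        h.append(omega_h + gamma_h * h[-1] + beta_h * z[n - 1] ** 2)
    return h


def quasi_log_likelihood(h, rv):
    """Q(theta) = -(1 / 2N) * sum_n [log h_n + RV_n / h_n], as in Equation (3.1)."""
    pairs = list(zip(h, rv))
    return -sum(math.log(h_n) + rv_n / h_n for h_n, rv_n in pairs) / (2 * len(pairs))
```

In use, one would rebuild `theta_h` from a candidate $\theta$ at each evaluation and pass the negated objective to a numerical optimizer over the parameter space $\Theta$. Replacing `rv` with squared de-meaned daily returns gives the same objective with a low-frequency proxy.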
Theorem 4 demonstrates that the limiting distribution of the QMLE is Gaussian with variance $W^{-1} V W^{-1}$, where the matrices $V$ and $W$ are information and Hessian matrices, respectively. Theorem 4 implies that the quality of the integrated volatility estimator affects the variance of the parameter estimates. To check the effect of employing integrated volatility estimators as the proxy, we consider the parameter estimation procedure using low-frequency data only. For example, the QMLE $\widehat{\theta}^L$ is obtained as follows:

$$
\widehat{Q}^L_N(\theta) = -\frac{1}{2N}\sum_{n=1}^{N}\left[\log(h_n(\theta)) + \frac{\zeta_n^2}{h_n(\theta)}\right]
\quad \text{and} \quad
\widehat{\theta}^L = \operatorname*{argmax}_{\theta \in \Theta} \widehat{Q}^L_N(\theta),
\tag{3.2}
$$

where $\zeta_n = X_n - X_{n-1} - \mu$ is defined in Equation (2.1). Then, similar to the proof of Theorem 4, the asymptotic distribution of $\widehat{\theta}^L$ can be derived as follows:

$$
\sqrt{N}(\widehat{\theta}^L - \theta_0) \xrightarrow{d} N\left(0, W^{-1} V^L W^{-1}\right),
$$

where

$$
\frac{1}{4N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial \theta}\left[\frac{\partial h_n(\theta)}{\partial \theta}\right]^{T}\bigg|_{\theta = \theta_0} h_n(\theta_0)^{-4}\left(Z_n^2 - h_n(\theta_0)\right)^2\right] \xrightarrow{p} V^L.
$$

From the above results, we find that the estimation errors of the GARCH volatility in realized volatility (i.e., $\int_{n-1}^{n} \sigma_t^2\,dt - h_n(\theta_0)$) or in the squared daily return (i.e., $Z_n^2 - h_n(\theta_0)$) play a key role in the variance of the parameter estimates. The estimation error in realized volatility is smaller than that in the squared daily return.
That is, compared to the squared daily return, the additional information in realized volatility estimators reduces the parameter estimation error and produces accurate parameter estimates with a relatively short period of data (see also Kim and Wang (2016)).

To make inferences based on the asymptotic distribution derived in Theorem 4, we construct consistent estimators for $V$ and $W$ as follows:

$$
\begin{aligned}
\widehat{V} &= \frac{1}{4N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial \theta}\left[\frac{\partial h_n(\theta)}{\partial \theta}\right]^{T}\bigg|_{\theta = \widehat{\theta}} h_n(\widehat{\theta})^{-4}\left(RV_n - h_n(\widehat{\theta})\right)^2\right], \\
\widehat{W} &= \frac{1}{2N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial \theta}\left[\frac{\partial h_n(\theta)}{\partial \theta}\right]^{T}\bigg|_{\theta = \widehat{\theta}} h_n(\widehat{\theta})^{-2}\right].
\end{aligned}
$$

To show the consistency of these estimators, we make the following additional assumptions.

Assumption 3.
(a) $\dfrac{E[Z_n^8 \mid \mathcal{F}_{n-1}]}{h_n^4(\theta_0)} \le C$ a.s. for any $n \in \mathbb{N}$.
(b) $\sup_{n \in \mathbb{N}} \left\| RV_n - \int_{n-1}^{n} \sigma_t^2\,dt \right\|_{L_4} \le C M^{-1/4}$.

Remark 5. Assumption 3 is the finite eighth moment condition. The parameters of interest, $V$ and $W$, are functions of fourth moments. Thus, to establish their asymptotic theorems, we need the finite eighth moment condition.

Proposition 2. Under Assumptions 1–3, we have $\widehat{V} \to V$ and $\widehat{W} \to W$ in probability.

Remark 6. Proposition 2 shows the consistency of $\widehat{V}$ and $\widehat{W}$. We can obtain their convergence rates by imposing some additional condition. For example, by assuming that $\{h_n(\theta_0), \xi_n\}$ is a strong mixing sequence, Theorem 1.2 (Merlevede and Peligrad, 2000) shows that $\widehat{V}$ and $\widehat{W}$ have the convergence rate $N^{-1/2} + M^{-1/4}$.

The main purpose of this paper is to investigate state heterogeneity in the volatility process. In the previous section, we proposed a state-heterogeneous diffusion process that can incorporate the low-frequency state process.
Under the SG-Itô model, state heterogeneity in the volatility process is represented by the state-varying parameters $\omega_i, \gamma_i, \beta_i$. Therefore, we can test the state heterogeneity by conducting a hypothesis test under the null hypothesis

$$
H_0 : \omega_1 = \omega_2, \quad \gamma_1 = \gamma_2, \quad \beta_1 = \beta_2.
$$

In this section, we construct a Wald test-type hypothesis testing procedure with the null hypothesis $H_0$ for the QMLE. The rejection of the null hypothesis signifies that the external state distinguishes the model specification, which implies the existence of state heterogeneity in the volatility process. Let $R$ be the $v \times u$ restriction matrix with full row rank. Theorem 5 defines the Wald-type statistic and establishes its limiting distribution.

Theorem 5. Under Assumptions 1–3 and the null hypothesis of $R\theta_0 = r$, we have

$$
T_{N,M} = N (R\widehat{\theta} - r)^{T} \left(R\,\widehat{W}^{-1} \widehat{V}\, \widehat{W}^{-1} R^{T}\right)^{-1} (R\widehat{\theta} - r) \xrightarrow{d} \chi^2(v),
$$

where $\chi^2(v)$ indicates a chi-squared random variable with $v$ degrees of freedom.

The QMLE $\widehat{\theta}$ induces a Wald-type statistic that asymptotically follows a $\chi^2$ distribution under the null hypothesis. We can test the null hypothesis $H_0$ by setting

$$
R = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{pmatrix}
\quad \text{and} \quad r = (0, 0, 0)^{T}.
$$

Then the Wald-type statistic $T_{N,M}$ follows $\chi^2(3)$. In the empirical study, we reveal the state heterogeneity in S&P 500 index volatility by conducting the proposed Wald-type test. Details can be found in Section 5.

Remark 7. Theorems 2–5 are established based on the decomposition of integrated volatility in Theorem 1(a),

$$
\int_{n-1}^{n} \sigma_{i,t}^2\,dt = h_{i,n}(\theta) + \xi_{i,n} \quad \text{a.s.},
$$

and their results can be established for any instantaneous volatility process that satisfies Theorem 1(a). The SG-Itô model is one such example.

To evaluate the relevance of the asymptotic theories, we conducted simulation studies. We first simulated the log price process and assessed the finite sample performance of the suggested estimator $\widehat{\theta}$.
The log stock price $X_{t_{n,m}}$ for $t_{n,m} = n - 1 + m/M$ was generated from the SG-Itô model in Definition 1 with the following form:

$$
\begin{aligned}
dX_t &= \mu\,dt + \sigma_t\,dB_t, \qquad \sigma_t^2 = (1 - s_n)\,\sigma_{1,t}^2 + s_n\,\sigma_{2,t}^2, \\
\sigma_{i,t}^2 &= \sigma_{[t]}^2 + (t - [t])\{\omega_{0,i} + (\gamma_{0,i} - 1)\sigma_{[t]}^2\} + \beta_{0,i}\left(X_t - X_{[t]} - (t - [t])\mu\right)^2, \\
s_n &= \mathbb{1}\{(X_{n-1} - X_{n-2}) < 0\},
\end{aligned}
\tag{5.1}
$$

where $\mathbb{1}\{\cdot\}$ is an indicator function and $\theta_0 = (\omega_{0,1}, \omega_{0,2}, \beta_{0,1}, \beta_{0,2}, \gamma_{0,1}, \gamma_{0,2})$ is the true model parameter. To capture the leverage effect, we set $s_n = 1$ if the previous day return is negative and $s_n = 0$ otherwise. The price process was generated under the null and alternative hypotheses, respectively. Under the null hypothesis, the true parameter set is given by θ =(0 . , . , . , . , . , . θ = (0 . , . , . , . , . , . $X_0 = 10$, initial instantaneous volatility

$$
\sigma_{i,0}^2 = \frac{\omega^h_i(1 - \beta^h_i - \gamma_i) + \beta_i \omega^h_i}{(1 - \beta^h_i - \gamma_i)(1 - \gamma_i)},
$$

and $\mu = 0$. We chose $N = 1000$ and $M = 23400$, corresponding to the stock price observed every second during four years. The Euler scheme was applied to discretize the continuous-time processes. The observed price $Y_{t_{n,m}}$ was calculated as the sum of the true log price $X_{t_{n,m}}$ and the micro-structure noise $\epsilon_{t_{n,m}}$, where $X_{t_{n,m}}$ was generated from Equation (5.1) and $\epsilon_{t_{n,m}}$ was generated from an i.i.d. normal distribution with mean zero and standard deviation $\sigma_\epsilon = 0.01$. For the realized volatility estimator, we employed the pre-averaging method (Christensen et al., 2010; Jacod et al., 2009), presented as follows:

$$
\begin{aligned}
RV_n &= \frac{1}{\phi_K(f)}\,\frac{M}{M - K}\sum_{k=1}^{M - K + 1}\left(\bar{Y}(t_k)^2 - \frac{1}{2}\widehat{Y}_k\right), \\
\bar{Y}(t_k) &= \sum_{i=1}^{K-1} f\left(\frac{i}{K}\right)\left[Y_{t_{n,k+i}} - Y_{t_{n,k+i-1}}\right], \qquad \phi_K(f) = \sum_{i=1}^{K} f\left(\frac{i}{K}\right)^2, \\
\widehat{Y}_k &= \sum_{i=1}^{K}\left(f\left(\frac{i}{K}\right) - f\left(\frac{i-1}{K}\right)\right)^2\left(Y_{t_{n,k+i}} - Y_{t_{n,k+i-1}}\right)^2,
\end{aligned}
$$

where $K = [\sqrt{M}]$ is the tuning parameter that determines the number of observations used for the pre-averaging step and $f(x) = \min(x, 1 - x)$.
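The pre-averaging estimator above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `preaveraged_rv` is hypothetical, the window index is shifted to start at zero so that every window of $K$ returns stays inside the day, and the squares and the factor $1/2$ on the bias-correction term follow the standard pre-averaging form of Jacod et al. (2009). It requires at least four intraday returns so that $K \ge 2$.

```python
import math


def preaveraged_rv(y):
    """Pre-averaging realized volatility for one period.

    y : observed (noisy) log prices Y_{t_0}, ..., Y_{t_M} within the period.
    """
    m = len(y) - 1                 # number of intraday returns M
    k_win = int(math.sqrt(m))      # tuning parameter K = [sqrt(M)]
    # Weights f(i / K) with f(x) = min(x, 1 - x), for i = 0, ..., K.
    w = [min(i / k_win, 1.0 - i / k_win) for i in range(k_win + 1)]
    phi = sum(w_i * w_i for w_i in w[1:])                       # phi_K(f)
    dw2 = [(w[i] - w[i - 1]) ** 2 for i in range(1, k_win + 1)]
    r = [y[j] - y[j - 1] for j in range(1, m + 1)]              # intraday returns
    total = 0.0
    for k in range(m - k_win + 1):                              # M - K + 1 windows
        window = r[k:k + k_win]
        y_bar = sum(w[i] * window[i - 1] for i in range(1, k_win))
        y_hat = sum(dw2[i - 1] * window[i - 1] ** 2 for i in range(1, k_win + 1))
        total += y_bar ** 2 - 0.5 * y_hat                       # noise bias correction
    return total * m / ((m - k_win) * phi)
```

The estimator is quadratic in the price increments, so rescaling the log prices by a constant $c$ rescales the output by $c^2$, and a constant price path yields zero. Because of the bias correction, the value can be slightly negative in small samples; no truncation is applied in this sketch.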
Using the generated stock prices, we estimated the realized volatility and calculated $\widehat{\theta}$ using the QMLE method in Section 3. The simulation procedure was repeated 1000 times.

We first examined the effect of the period and frequency of the data on parameter estimation. The accuracy of model parameter estimation is expected to improve with longer-period and higher-frequency data. To verify this, we generated additional data sets by resampling the entire data. Specifically, we collected the first 250, 500, 750, and 1000 days of data for $N = 250, 500, 750, 1000$ and sampled 1-minute ($M = 390$), 10-second ($M = 2340$), and 5-second ($M = 4680$) data, respectively. Figure 1 provides the mean squared errors (MSEs) of the estimator $\widehat{\theta}$ for varying $N$ and $M$. From Figure 1, we find that the use of longer-period and higher-frequency data significantly improves estimation performance, which supports the theoretical findings in Section 3.

[Figure 1 inserted about here]

To investigate the advantage of considering state heterogeneity in the model, we compared the prediction performance of the SG-Itô model with that of existing volatility models, including the GARCH(1,1), RS-GARCH(1,1), and unified GARCH-Itô models. The conditional daily volatility processes of the GARCH and unified GARCH-Itô models are as follows:
\[ h_n(\theta^L) = \omega^L + \gamma^L h_{n-1}(\theta^L) + \beta^L Z_{n-1}^2, \tag{5.2} \]
\[ h_n(\theta^g) = \omega^{g*} + \gamma^g h_{n-1}(\theta^g) + \beta^{g*} Z_{n-1}^2, \tag{5.3} \]
where $\theta^L = (\omega^L, \gamma^L, \beta^L)$ is the model parameter of the GARCH model, and $\theta^g$ is the model parameter of the unified GARCH-Itô model described in Appendix A.2. The RS-GARCH(1,1) model is given in Equation (2.1). The QMLEs of $\theta^L$ and $\theta^{s,L}$ were obtained as in Equation (3.2), whereas that of the unified GARCH-Itô parameter was obtained by maximizing Equation (3.1). Note that in parameter estimation, the discrete-time models employ squared daily returns, whereas the continuous-time models employ daily realized volatility estimates.
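The distinction between the two estimation inputs can be made concrete with a small sketch. It assumes a Gaussian quasi-log-likelihood of the form $-\frac{1}{2N}\sum_{n}\{\log h_n(\theta) + p_n / h_n(\theta)\}$, where the proxy $p_n$ is the squared daily return for the discrete-time models and the realized volatility for the continuous-time models. The scaling constant, the function names, and the treatment of the initial value `h1` are assumptions made for illustration; the scaling does not affect the maximizer.

```python
import math


def conditional_variances(omega, gamma, beta, z2, h1):
    """Recursion of the form (5.2)/(5.3): h_n = omega + gamma*h_{n-1} + beta*Z_{n-1}^2.

    z2 : squared daily returns Z_1^2, ..., Z_N^2 (0-based list).
    h1 : initial conditional variance h_1 (treated as given; an assumption).
    """
    h = [h1]
    for n in range(1, len(z2)):
        h.append(omega + gamma * h[-1] + beta * z2[n - 1])
    return h


def quasi_log_likelihood(h, proxy):
    """-(1/(2N)) * sum(log h_n + proxy_n / h_n).

    proxy is the squared return series for the discrete-time models and the
    realized volatility series for the continuous-time models.
    """
    n = len(h)
    return -sum(math.log(hn) + pn / hn for hn, pn in zip(h, proxy)) / (2 * n)
```

For a fixed proxy value, each summand is maximized when $h_n$ equals the proxy, which is why a less noisy proxy such as the realized volatility is expected to give more precise estimates than the squared return.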
To evaluate one-day-ahead out-of-sample prediction performance, we calculated the mean squared prediction error (MSPE) of each model as follows:
\[ MSPE = \frac{1}{d} \sum_{n=1}^{d} (RV_n - V_n)^2, \]
where $V_n$ is the fitted variance generated from each volatility model and $d$ is the length of the prediction window; the lengths of both the estimation and prediction windows are set to 500. Figure 2 plots the log MSPEs of the GARCH, RS-GARCH, unified GARCH-Itô, and SG-Itô models under the null and alternative hypotheses. Comparing the discrete-time models (i.e., GARCH and RS-GARCH) with the continuous-time models (i.e., unified GARCH-Itô and SG-Itô), we find that the continuous-time models perform better. This may be because the discrete-time models need relatively long time periods to obtain consistent estimators, whereas the continuous-time models can estimate the model parameters well over a short time period. This improvement stems from the efficiency of the realized volatility estimator. Comparing the unified GARCH-Itô and SG-Itô models, the unified GARCH-Itô model performs better under the null hypothesis. This is because, under the null hypothesis, the unified GARCH-Itô model is the true model, so the additional complexity of the SG-Itô model makes parameter estimation less efficient. On the other hand, the SG-Itô model performs better under the alternative hypothesis. This may be because the SG-Itô model can deal with the state heterogeneity in volatility dynamics, whereas the unified GARCH-Itô model cannot.

[Figure 2 inserted about here]

To check the performance of the Wald-type test statistic $T_{N,M}$ developed in Section 4, we investigated the asymptotic convergence of the statistic and conducted size-$\alpha$ tests. Figure 3 reports $\chi^2$ quantile-quantile plots of the Wald-type statistic $T_{N,M}$ for varying $N$ and $M$ under the null hypothesis. The solid line in the figures denotes the best linear fitted line that illustrates a perfect $\chi^2$ distribution.
Figure 3 shows that the distribution of the Wald-type statistic $T_{N,M}$ gradually approaches the limiting $\chi^2$ distribution as $N$ and $M$ increase. Table 1 reports the rejection rates of the hypothesis test at significance levels of 0.1, 0.05, 0.025, and 0.01 for varying $N$ and $M$ under the null and alternative hypotheses. In Table 1, we find that under the null hypothesis, the type I error rate approaches the nominal significance level $\alpha$ as $N$ and $M$ increase. That is, the proposed test procedure attains size $\alpha$ asymptotically. Under the alternative hypothesis, the power approaches one as $N$ and $M$ increase. These results support the theoretical findings in Section 4.

[Table 1 inserted about here]

[Figure 3 inserted about here]

In the empirical study, we examined the volatility process of the S&P 500 index return under the SG-Itô framework. We used intraday S&P 500 index data from 9:30 a.m. to 4:00 p.m., spanning from January 2, 2015, to December 31, 2018 ($N = 998$), provided by the Chicago Board Options Exchange. Before July 23, 2015, the data sampling frequency varied from one to three seconds, so the number of intraday observations $M_n$ varied from 10,000 to 23,400. After July 24, 2015, $M_n$ was fixed at 23,400 except for early closing days. We constructed daily pre-averaging realized volatility estimates using the intraday index data.

For the SG-Itô model, the state process plays a prominent role in model specification. In this empirical study, we considered seven state processes that are known to affect financial volatility and defined models (i)–(vii) corresponding to each state variable. The first two models are related to market returns. These models deal with the negative correlation between financial returns and future volatility, which is called the leverage effect (Black, 1976; Christie, 1982; Figlewski and Wang, 2000; Tauchen et al., 1996). For the market return states, we calculated (i) open-to-close returns for the previous day's market return and (ii) close-to-open returns for the overnight return.
We assigned $s_n = 1$ if (i) the open-to-close return was in the lowest three deciles and (ii) the overnight return was negative, respectively, and $s_n = 0$ otherwise. Note that models (i) and (ii) incorporate the GJR-GARCH model. Second, we considered Chinese stock market information in model (iii). As China is the second-largest economy in the world, the Chinese economy and its stock market may comove with those of the U.S. Moreover, the Chinese stock market indices contain information for the non-trading hours of the U.S. stock market. Thus, we suppose that Chinese stock market movements affect U.S. stock market volatility. We assigned $s_n = 1$ if the Hang Seng index return was in the lowest three deciles and $s_n = 0$ otherwise. Third, we considered day-of-week seasonality in the financial market, especially on pre- and post-holiday days (Abraham and Ikenberry, 1994; French, 1980; Lakonishok and Maberly, 1990; Miller, 1988). Specifically, for models (iv) and (v), we constructed pre- and post-holiday indicators using NYSE holiday data and assigned $s_n = 1$ on the day (iv) before and (v) after NYSE holidays, including weekends, respectively, and $s_n = 0$ otherwise. Fourth, we considered trading volume and investor attention. Previous studies showed that they are positively correlated with financial volatility (Andrei and Hasler, 2014; Copeland, 1976; Jennings et al., 1981; Lamoureux and Lastrapes, 1990a, 1994). We measured trading volume and investor attention jointly with the abnormal trading volume $abtv$, calculated as the aggregate market daily dollar volume divided by the sum of the dollar volume over the most recent 20 days (Barber and Odean, 2007). A large $abtv$ indicates a day of high trading volume and high investor attention. For model (vi), we assigned $s_n = 1$ if $abtv$ on that day was greater than its average, and $s_n = 0$ otherwise. Finally, we adopted an illiquidity measure to proxy the bid-ask-based aggregate market illiquidity (Chen et al., 2018; Wang and Yau, 2000).
The Corwin and Schultz (2012, CS) measure gauges the illiquidity of individual stocks based on the daily high-low spread as follows:
\[ cs = \frac{2(e^{\delta} - 1)}{1 + e^{\delta}}, \qquad \delta = \frac{\sqrt{2\tau} - \sqrt{\tau}}{3 - 2\sqrt{2}} - \sqrt{\frac{\rho}{3 - 2\sqrt{2}}}, \]
\[ \tau = \left[ \log\left( \frac{H_{t-1}}{L_{t-1}} \right) \right]^2 + \left[ \log\left( \frac{H_t}{L_t} \right) \right]^2, \qquad \rho = \left[ \log\left( \frac{H_{t-1,t}}{L_{t-1,t}} \right) \right]^2, \]
where $H_{t-1,t}$ and $L_{t-1,t}$ are the high and low prices over days $t-1$ and $t$, respectively. For model (vii), we calculated firm-specific CS measures and value-weighted them to construct the aggregate market illiquidity measure $vwcs$. We assigned $s_n = 1$ if $vwcs$ was in the highest three deciles, which denotes an illiquid day, and $s_n = 0$ otherwise.

Table 2 reports the SG-Itô model parameter estimates and hypothesis test results. The parameter estimates reveal some interesting features of the volatility processes. For example, $\omega_1$ and $\gamma_1$ of models (i), (ii), and (iii) are significantly higher than $\omega_0$ and $\gamma_0$, respectively, which means that volatility is generally greater after a negative return shock and that volatility clustering is strengthened. In particular, the greater $\beta$ of model (ii) (0.212) compared with model (i) (0.136) may suggest that market volatility is more sensitive to overnight shocks than to previous-day market returns. The parameter estimates of model (vi) suggest that the impacts of previous volatility and return shocks on present volatility increase with heavy trading.

[Table 2 inserted about here]

The results of hypothesis testing suggest that the null hypothesis $H_0: \{\omega_0 = \omega_1,\ \gamma_0 = \gamma_1,\ \beta_0 = \beta_1\}$ is rejected at the 1% level for models (i)–(iv), (vi), and (vii).
This implies that the volatility process is distinguished from the homogeneous volatility process when (i) the previous day's open-to-close return is significantly low, (ii) the overnight return is negative, (iii) the Hang Seng index return is significantly low, (iv) investors prepare for upcoming holidays, (vi) aggregate trading volume is abnormally high, and (vii) the market is illiquid. These results are in line with existing studies. For example, Braun et al. (1995), Carr and Wu (2017), and Kim and Kon (1994) demonstrated that market volatility is significantly increased by negative return shocks. Ahoniemi and Lanne (2013), Ahoniemi et al. (2016), and Tsiakas (2008) revealed that overnight information significantly affects stock market dynamics and helps forecast asset volatility. In particular, the Asian stock markets possibly reflect overnight information for the U.S. stock market because of their time lag (Taylor, 2007). We provide evidence of a close relationship between the U.S. and Chinese stock markets. Gallant et al. (1992), Kambouroudis and McMillan (2016), and Karpoff (1987) showed that aggregate trading volume is positively related to future market volatility. The existence of day-of-week and holiday effects remains up for debate. Berument and Kiymaz (2001) and Kiymaz and Berument (2003) showed the existence of a day-of-week effect on market volatility, whereas Birru (2018) claimed that the effect has disappeared at the aggregate level. The hypothesis testing results for models (iv) and (v) suggest that the pre-holiday effect on the market volatility process may exist, whereas the post-holiday effect has disappeared.

Table 3 shows the integrated form of the SG-Itô model parameter estimates described in Theorem 1(b). Table 4 presents the parameter estimates of the GARCH(1,1), unified GARCH-Itô (i.e., $\omega^{g*}$, $\gamma^{g*}$, $\beta^{g*}$), and RS-GARCH(1,1) models. The integrated form of the SG-Itô model parameters can be interpreted similarly to the RS-GARCH(1,1) model parameters.
For example, the two large $\beta^h$ coefficients of model (ii) in Table 3 may suggest that daily integrated volatility is significantly affected by the market return after negative overnight return shocks. This is in line with the large $\beta^L$ of the RS-GARCH model (ii) in Table 4.

[Table 3 inserted about here]

[Table 4 inserted about here]

To investigate the efficiency gained from adopting high-frequency data, we estimated the SG-Itô model parameters and Wald-type statistics using low-frequency data only. We obtained the parameter estimates $\widehat{\theta}^L$ from low-frequency data using Equation (3.2). Then we can calculate the Wald-type statistic for $\widehat{\theta}^L$ as follows:
\[ T_N = N (R\widehat{\theta}^L - r)^T \left( R \widehat{W} (\widehat{V}^L)^{-1} \widehat{W} R^T \right)^{-1} (R\widehat{\theta}^L - r) \xrightarrow{d} \chi^2(v), \]
where
\[ \widehat{V}^L = \frac{1}{4N} \sum_{n=1}^{N} \left[ \frac{\partial h_n(\theta)}{\partial \theta} \left[ \frac{\partial h_n(\theta)}{\partial \theta} \right]^T \Big|_{\theta = \widehat{\theta}} \, h_n(\widehat{\theta})^{-4} \left( \zeta_n - h_n(\widehat{\theta}) \right)^2 \right]. \]
Table 5 reports the SG-Itô model parameter estimates and hypothesis test results based on low-frequency data only. We find that the standard errors of the parameters are substantially larger than those in Table 2, and, accordingly, most of the $\omega$s and $\beta$s are no longer significant at the 1% level. Moreover, the Wald-type test fails to detect the state heterogeneity in models (ii)–(iv), and the significance of rejection is reduced for models (i) and (vi) as well. These results may imply that a relatively short period of low-frequency data does not contain sufficient information and fails to capture low-frequency volatility dynamics. From the results, we can conclude that the use of high-frequency data helps to analyze low-frequency dynamics with relatively short-time-period data, so it would be more robust to the structural break issue.
These findings support our hypotheses of the existence of state heterogeneity and the efficiency of using high-frequency data to examine low-frequency market dynamics.

[Table 5 inserted about here]

We measured out-of-sample prediction performance by the mean absolute percentage error (MAPE),
\[ MAPE = \frac{100}{d} \sum_{n=1}^{d} \left| \frac{RV_n - V_n}{RV_n} \right|, \]
where $d$ is the length of the prediction window and $V_n$ is the fitted variance generated from each volatility model. The estimation window is 750 days, with the prediction period spanning from December 22, 2017, to December 31, 2018 (248 days). The benchmarks are the unified GARCH-Itô, RS-GARCH(1,1), and GARCH(1,1) models, whose specifications are presented in Section 5.1. We also consider the heterogeneous auto-regressive (HAR) model of Corsi (2009) as an additional benchmark. On the one hand, the state variable $s_n$ is available at the beginning of day $n$ for models (i) previous-day open-to-close return, (ii) overnight return, (iii) Hang Seng index return, (iv) pre-holiday, and (v) post-holiday. On the other hand, $s_n$ is not observed until the end of day $n$ for models (vi) abnormal trading volume and (vii) market illiquidity, so we have to utilize a state transition probability as in Proposition 1. To obtain the state transition probability, we simply assumed a time-invariant state transition probability and calculated the proportion of transitions from state $j$ to $i$ for $p_{ij}$. The estimation of the state transition probability significantly affects prediction performance, but we leave a more elaborate probability inference for further research. Table 6 reports the out-of-sample prediction results measured by MAPEs. The results suggest that the continuous-time models (SG-Itô and unified GARCH-Itô) perform better than the discrete-time models (GARCH and RS-GARCH) and the HAR model, and that the state heterogeneity models (SG-Itô and RS-GARCH) are generally superior to the state-homogeneous models (GARCH and unified GARCH-Itô).
Thus, the continuous and state heterogeneous SG-Itô model shows outstanding performance compared to the others. In particular, the MAPE improvement of the SG-Itô model is the greatest in model (ii), in which the state heterogeneity was the greatest, whereas the improvement seems insignificant in model (v), in which state heterogeneity was not detected. For models (vi) and (vii), although we employed a simple procedure to estimate the transition probability, the prediction performance of the SG-Itô model is similar to or even better than that of the benchmark models.

[Table 6 inserted about here]

State heterogeneity in financial volatility has been widely discussed as a representative market characteristic. This study hypothesizes that there exists state heterogeneity in financial volatility and that the use of high-frequency data facilitates analyzing it. To test the hypothesis, we proposed a novel volatility model whose instantaneous volatility follows a continuous-time process and evolves depending on a discrete state process. Through the model, this study provides a mathematical background for applying high-quality realized volatility estimators to the study of discrete-time state heterogeneity volatility frameworks. Along with the model, we construct a Wald-type hypothesis testing procedure to test our hypothesis. Through hypothesis testing, we verify the existence of leverage, investor attention, market illiquidity, stock market comovement, and post-holiday effect in S&P 500 index volatility. The statistical test based on low-frequency data only, however, does not capture these effects well.

In this paper, our focus is to test a given exogenous state. However, in practice, how to define the state process is an important but difficult question. Fortunately, the proposed SG-Itô diffusion process is not affected by the state process, so it is easy to incorporate any state process in the SG-Itô process structure.
Thus, studying state processes based on high-frequency financial data is a promising direction for future research.

Appendix

A.1 Instantaneous volatility

Under the SG-Itô framework, the instantaneous volatility at integer time point $n$ can be written as a linear function of $\sigma_{n-1}^2$ and the squared daily return as follows:
\[ \sigma_n^2 = (1 - s_n)\sigma_{0,n}^2 + s_n \sigma_{1,n}^2 = (1 - s_n)(\omega_0 + \beta_0 Z_{0,n}^2) + s_n(\omega_1 + \beta_1 Z_{1,n}^2) + ((1 - s_n)\gamma_0 + s_n \gamma_1)\sigma_{n-1}^2. \]
Then we can express the instantaneous volatility at integer time points as an infinite sum of the $Z_{i,n}^2$'s using a recursive relationship. Let $D_n(k) = [(1 - s_{n+1-k})\gamma_0 + s_{n+1-k}\gamma_1] D_n(k-1)$ and $D_n(0) = 1$. Then $D_n(k) = \prod_{i=0}^{k-1} [(1 - s_{n-i})\gamma_0 + s_{n-i}\gamma_1]$ and we have
\begin{align*}
\sigma_n^2 &= D_n(0)\{(1 - s_n)(\omega_0 + \beta_0 Z_{0,n}^2) + s_n(\omega_1 + \beta_1 Z_{1,n}^2)\} + D_n(1)\sigma_{n-1}^2 \\
&= D_n(0)\{(1 - s_n)(\omega_0 + \beta_0 Z_{0,n}^2) + s_n(\omega_1 + \beta_1 Z_{1,n}^2)\} \\
&\quad + D_n(1)\{(1 - s_{n-1})(\omega_0 + \beta_0 Z_{0,n-1}^2) + s_{n-1}(\omega_1 + \beta_1 Z_{1,n-1}^2)\} + D_n(2)\sigma_{n-2}^2 \\
&= \sum_{i=0}^{k-1} D_n(i)\{(1 - s_{n-i})(\omega_0 + \beta_0 Z_{0,n-i}^2) + s_{n-i}(\omega_1 + \beta_1 Z_{1,n-i}^2)\} + D_n(k)\sigma_{n-k}^2 \\
&= \sum_{i=0}^{\infty} D_n(i)\{(1 - s_{n-i})(\omega_0 + \beta_0 Z_{0,n-i}^2) + s_{n-i}(\omega_1 + \beta_1 Z_{1,n-i}^2)\} \quad \text{a.s.} \tag{A.1}
\end{align*}
Since $D_n(k) \le (\gamma_u^h)^k$, $\sigma_n^2$ satisfies the following inequality:
\[ \sigma_n^2 \le \sum_{i=0}^{\infty} (\gamma_u^h)^i (\omega_u^h + \beta_u^h Z_{n-i}^2) = \frac{\omega_u^h}{1 - \gamma_u^h} + \beta_u^h \sum_{i=0}^{\infty} (\gamma_u^h)^i Z_{n-i}^2. \]
Then, by Assumption 1(d), we can easily show the existence of the infinite sum in (A.1).

A.2 Connection with the GARCH-Itô model

We can show that when the states are homogeneous (that is, $s_n$ takes the same value for every $n$), the SG-Itô model reduces to the unified GARCH-Itô model (Kim and Wang, 2016). That is, the unified GARCH-Itô model is a special case of the SG-Itô model. The unified GARCH-Itô model can be written as follows:
\[ dX_t = \mu \, dt + \sigma_t \, dB_t, \tag{A.2} \]
\[ \sigma_t^2 = \sigma_{n-1}^2 + (t - n + 1)\{\omega^g + (\gamma^g - 1)\sigma_{n-1}^2\} + \beta^g \left( \int_{n-1}^{t} \sigma_s \, dB_s \right)^2, \tag{A.3} \]
where $\theta^g = (\omega^g, \beta^g, \gamma^g)$ are the model parameters.
Let us assume s n = 1 for all n ∈ N . Then, byEquation (A.1), the instantaneous volatility under the SG-Itˆo model can be presented as follows: σ n = ∞ (cid:88) i =0 γ i ( ω + β Z n − i ) = ω − γ + β ∞ (cid:88) i =0 γ i Z n − i . For h n ( θ ), we have h n ( θ ) = ω h + γ h h n − ( θ ) + β h Z n − = ω g ∗ + γ g h n − ( θ ) + β g ∗ Z n − , where ω g ∗ = ω g ( β g ) − ( e β g − 1) and β g ∗ = ( β g ) − ( γ g − e β g − − β g ) + e β g − A.3 Proof of Theorem 1 Proof. Proof of Theorem 1 Consider ( a ). By Itˆo’s lemma, we have R ( k ) = (cid:90) nn − ( n − t ) k k ! σ ,t dt = (cid:90) nn − ( n − t ) k k ! [ σ n − + ( t − n + 1) { ω + ( γ − σ n − } ] dt + β (cid:90) nn − ( n − t ) k k ! (cid:18)(cid:90) tn − σ ,s dB s (cid:19) dt 27 1( k + 2)! [ ω + ( γ + k + 1) σ n − ]+ 2 β (cid:90) nn − ( n − t ) k +1 ( k + 1)! (cid:18)(cid:90) tn − σ ,s dB s (cid:19) σ ,t dB t + β R ( k + 1) . Then we have (cid:90) nn − σ ,t dt = R (0) = β − ( e β − − β ) ω + [( γ − β − ( e β − − β ) + β − ( e β − σ n − + ξ ,n , where ξ ,n = 2 (cid:82) nn − ( e ( n − t ) β − (cid:82) tn − σ ,s dB s σ ,t dB t . We can calculate (cid:82) nn − σ ,t dt in the sameway. Then integrated volatility under the SG-Itˆo framework can be expressed as a function of {F x,Ln − , F sn − } -adapted process and the martingale difference as follows: (cid:90) nn − σ i,t dt = h i,n ( θ ) + ξ i,n, E (cid:20)(cid:90) nn − σ i,t dt (cid:12)(cid:12)(cid:12) F n − (cid:21) = h i,n ( θ ) = H c,i ( θ ) + H β,i ( θ ) σ n − a.s.Consider ( b ). By the result of ( a ), we have E (cid:20)(cid:90) nn − σ t dt (cid:12)(cid:12)(cid:12) F n − (cid:21) = h n ( θ ) = (1 − s n ) h ,n ( θ ) + s n h ,n ( θ )= (1 − s n ) (cid:0) H c, ( θ ) + H β, ( θ ) σ n − (cid:1) + s n (cid:0) H c, ( θ ) + H β, ( θ ) σ n − (cid:1) = H wc,n ( θ ) + H wβ,n ( θ ) σ n − , where H wc,n ( θ ) = (1 − s n ) H c, ( θ )+ s n H c, ( θ ) and H wβ,n ( θ ) = (1 − s n ) H β, ( θ )+ s n H β, ( θ ). 
By Equation(A.1), we have h n ( θ ) = H wc,n ( θ ) + H wβ,n ( θ ) σ n − = H wc,n ( θ ) + H wβ,n ( θ ) ∞ (cid:88) i =0 D n − ( i ) { (1 − s n − − i )( ω + β Z ,n − − i ) + s n − i ( ω + β Z ,n − − i ) } = H wc,n ( θ ) + H wβ,n ( θ ) ∞ (cid:88) i =1 D n − ( i ) { (1 − s n − − i )( ω + β Z ,n − − i ) + s n − i ( ω + β Z ,n − − i ) } H wβ,n ( θ ) { (1 − s n − )( ω + β Z ,n − ) + s n − ( ω + β Z ,n − ) } = H wc,n ( θ ) + H wβ,n ( θ ) D n − (1) σ n − + H wβ,n ( θ ) { (1 − s n − )( ω + β Z ,n − ) + s n − ( ω + β Z ,n − ) } . Thus, we have h n ( θ ) = ω wn + γ wn h n − ( θ ) + β wn Z n − , (A.4)where ω wn = H wc,n ( θ ) + H wβ,n ( θ ) { (1 − s n − ) ω + s n − ω } − γ wn H ωc,n − ( θ ), γ wn = D n − (1) H ωβ,n ( θ ) H ωβ,n − ( θ ) , β wn = H wβ,n ( θ ) { (1 − s n − ) β + s n − β } . Then we can easily show h n ( θ ) = s ,n ( ω h + β h h n − ( θ ) + γ h Z n − ) + s ,n ( ω h + β h h n − ( θ ) + γ h Z n − )+ s ,n ( ω h + β h h n − ( θ ) + γ h Z n − ) + s ,n ( ω h + β h h n − ( θ ) + γ h Z n − ) . A.4 Proof of asymptotic theories This section provides proofs of asymptotic theories presented in Section 3. First Lemma 1 showsthat the impact of the initial value is asymptotically negligible by showing that the impact of initialvalue on h n ( θ ) is exponentially decaying. Accordingly, the difference between the quasi-likelihoodfunctions with true and arbitrary value decays faster than O p ( N − ). A.4.1 Initial valueLemma 1. Under Assumption 1(a), we have for any ϑ = O p (1) and n ∈ N , | h n ( θ , σ ) − h n ( θ , ϑ ) | = O p (( γ hu ) n − ) .Proof. Proof. Simple algebraic manipulations provide h n ( θ , σ ) − h n ( θ , ϑ ) = n − (cid:89) k =1 γ wn − k +1 ( h ( θ , σ ) − h ( θ , ϑ ))29 ( γ hu ) n − H wβ, ( θ )( σ − ϑ ) = O p (( γ hu ) n − ) . Thus, as N → ∞ , the difference between σ and ϑ become negligible. 
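Lemma 1 can be illustrated numerically with a short sketch. It uses a simplified two-state recursion $h_n = \omega_{s_n} + \gamma_{s_n} h_{n-1} + \beta_{s_n} Z_{n-1}^2$ in which the coefficients depend only on the current state; this is an assumption made for illustration and is simpler than the recursion (A.4). The parameter values, the state path, and the squared returns below are hypothetical. Two paths that share the same inputs but start from different initial values differ by the product of the $\gamma$ coefficients applied so far, so the gap is bounded by a geometric sequence in the largest $\gamma$.

```python
def recursion(params, states, z2, h1):
    """Return [h_1, ..., h_N] for h_n = w[s_n] + g[s_n]*h_{n-1} + b[s_n]*Z_{n-1}^2.

    params : dict mapping state -> (omega, gamma, beta)  (hypothetical values)
    states : list of states s_1, ..., s_N (0-based list)
    z2     : list of squared returns Z_1^2, ..., Z_N^2
    """
    h = [h1]
    for n in range(1, len(states)):
        omega, gamma, beta = params[states[n]]
        h.append(omega + gamma * h[-1] + beta * z2[n - 1])
    return h


# Hypothetical two-state parameters with gamma + beta < 1 in each state.
params = {0: (0.02, 0.30, 0.20), 1: (0.05, 0.50, 0.30)}
states = [n % 2 for n in range(30)]              # alternating state path
z2 = [0.5 + 0.1 * (n % 3) for n in range(30)]    # deterministic squared returns

h_a = recursion(params, states, z2, h1=1.0)      # one initial value
h_b = recursion(params, states, z2, h1=5.0)      # a different initial value

gaps = [abs(a - b) for a, b in zip(h_a, h_b)]
gamma_u = max(g for _, g, _ in params.values())  # largest gamma across states
bounds = [gamma_u ** n * gaps[0] for n in range(len(gaps))]
```

Each entry of `gaps` stays below the corresponding entry of `bounds`, mirroring the exponential decay rate in the lemma.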
A.4.2 Proof of Theorem 2 For given { s n } , we define the log likelihood functions and their derivatives as follows: (cid:98) Q N,M ( θ ) = − N N (cid:88) n =1 (cid:20) log( h n ( θ )) + RV n h n ( θ ) (cid:21) = − N N (cid:88) n =1 (cid:98) q N,M ( θ ) , (cid:98) S N,M ( θ ) = ∂ (cid:98) Q N,M ( θ ) ∂θ , (cid:98) θ = argmax θ ∈ Θ (cid:98) Q N,M ( θ ) , (cid:101) Q N ( θ ) = − N N (cid:88) n =1 (cid:34) log( h n ( θ )) + (cid:82) nn − σ t dth n ( θ ) (cid:35) , (cid:101) S N ( θ ) = ∂ (cid:101) Q N ( θ ) ∂θ ,Q N ( θ ) = − N N (cid:88) n =1 (cid:20) log( h n ( θ )) + h n ( θ ) h n ( θ ) (cid:21) , S N ( θ ) = ∂Q N ( θ ) ∂θ . We denote derivatives of function g at x ∗ by ∂g ( x ∗ ) ∂x = ∂g ( x ) ∂x (cid:12)(cid:12)(cid:12) x = x ∗ . Note that in Assumption 1(a),we defined upper and lower bounds of θ and θ h . Lemma 2. Under Assumption 1(a), we have(a) sup n ∈ N E [ Z n ] ≤ ω hu − β hu − γ hu + E [ h ( θ )] < ∞ , and sup n ∈ N E [sup θ ∈ Θ h n ( θ )] < ∞ .(b) ξ i,n = β ,i (cid:82) nn − e ( n − t ) β ,i ( Z i,t − Z i,n − ) − (cid:82) nn − e ( n − t ) β ,i σ i,t dt a.s. for any n ∈ N and i ∈ { , } .(c) There exists a neighborhood B ( θ ) of θ such that for any p ≥ , sup n ∈ N (cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) h n ( θ ) h n ( θ ) (cid:13)(cid:13)(cid:13) L p < ∞ and B ( θ ) ⊂ Θ .(d) For any n ∈ N , sup n ∈ N (cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) 1 h n ( θ ) ∂h n ( θ ) ∂θ j (cid:13)(cid:13)(cid:13) L p ≤ C , sup n ∈ N (cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) 1 h n ( θ ) ∂ h n ( θ ) ∂θ j ∂θ k (cid:13)(cid:13)(cid:13) L p ≤ C , and sup n ∈ N (cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) 1 h n ( θ ) ∂ h n ( θ ) ∂θ j ∂θ k ∂θ l (cid:13)(cid:13)(cid:13) L p ≤ C for any j, k, l ∈ { , , , , , } , where θ =( θ , θ , θ , θ , θ , θ ) = ( ω , ω , γ , γ , β , β ) . roof. Proof. 
For ( a ), by Equation (A.4), we can express daily integrated volatility as infinite sumof Z i,n ’s as follows: h n ( θ ) = ω wn + γ wn h n − ( θ ) + β wn Z n − = n − (cid:88) k =1 ( ω wn − k +1 + β wn − k +1 Z n − k ) k − (cid:89) l =1 γ wn − l +1 + n − (cid:89) k =1 γ wn − k +1 h ( θ ) . Using iterative relationship and γ wn + β wn < 1, we can show that E [ Z n ] = E [ h n ( θ )] = ω wn + γ wn E [ h n − ( θ )] + β wn E [ Z n − ]= ω wn + ( γ wn + β wn ) E [ h n − ( θ )] ≤ ω hu − ( β hu + γ hu ) n − − ( β hu + γ hu ) + ( γ hu ) n − E [ h ( θ )] ≤ ω hu − β hu − γ hu + E [ h ( θ )] < ∞ . Then we can easily show sup n ∈ N E [sup θ ∈ Θ h n ( θ )] < ∞ .For ( b ), let f ( t, Z i,t ) = ( e ( n − t ) β ,i − Z i,t − Z i,n − ) . Then, by Itˆo’s lemma, we have df ( t, Z i,t ) = (cid:104) − β ,i e ( n − t ) β ,i ( Z i,t − Z i,n − ) + ( e ( n − t ) β ,i − σ i,t (cid:105) dt + 2( e ( n − t ) β ,i − Z i,t − Z i,n − ) dZ i,t ,f ( n, Z i,n ) = 0 = (cid:90) nn − (cid:104) − β ,i e ( n − t ) β ,i ( Z i,t − Z i,n − ) + ( e ( n − t ) β ,i − σ i,t (cid:105) dt + ξ i,n . Consider ( c ). 
For any δ > 0, there exists a neighborhood B ( θ ) ⊂ Θ such that γ w ,n ≤ (1 + δ ) γ wn .Using the fact that x/ (1 + x ) ≤ x q for all x ≥ q ∈ [0 , h n ( θ ) h n ( θ ) = (cid:80) n − k =1 ( ω w ,n − k +1 + β w ,n − k +1 Z n − k ) (cid:81) k − l =1 γ w ,n − l +1 + (cid:81) n − k =1 γ w ,n − k +1 h ( θ ) (cid:80) n − k =1 ( ω wn − k +1 + β wn − k +1 Z n − k ) (cid:81) k − l =1 γ wn − l +1 + (cid:81) n − k =1 γ wn − k +1 h ( θ ) ≤ (cid:80) n − k =1 ( ω w ,n − k +1 + β w ,n − k +1 Z n − k ) (cid:81) k − l =1 γ w ,n − l +1 + Cω wn + (cid:80) n − k =2 ( ω wn − k +1 + β wn − k +1 Z n − k (cid:81) k − l =1 γ wn − l +1 ) + C n − (cid:88) k =1 (cid:34) ω hu ω hl ( γ hu ) k − + β w ,n − k +1 Z n − k (cid:81) k − l =1 γ w ,n − l +1 ω hl + β wn − k +1 Z n − k (cid:81) k − l =1 γ wn − l +1 (cid:35) + C = C + β hl β hu n − (cid:88) k =1 β wn − k +1 Z n − k (cid:81) k − l =1 γ wn − l +1 ω hl + β wn − k +1 Z n − k (cid:81) k − l =1 γ wn − l +1 (cid:32) (cid:81) k − l =1 γ w ,n − l +1 (cid:81) k − l =1 γ wn − l +1 (cid:33) = C + C n − (cid:88) k =1 x k x k k − (cid:89) l =1 γ w ,n − l +1 γ wn − l +1 , ≤ C + C n − (cid:88) k =1 x qk k − (cid:89) l =1 γ w ,n − l +1 γ wn − l +1 ≤ C + C n − (cid:88) k =1 ( β hu ) q Z qn − k ( ω hl ) q ( γ hu ) q ( p − (1 + δ ) p − = C + C n − (cid:88) k =1 ( γ hu ) q ( p − (1 + δ ) p − Z qn − k = C + C n − (cid:88) k =1 ρ p − Z qn − k , where x k = β wn − k +1 Z n − k (cid:81) k − l =1 γ wn − l +1 ω hl . Let 0 < δ < − ( γ hu ) q ( γ hu ) q . Then, (1 + δ ) < γ hu ) q and ρ = (1 + δ )( γ hu ) q < 1. Taking q ∈ [0 , 1] such that E ( Z pqn − k ) < ∞ , we have (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) h n ( θ ) h n ( θ ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L p ≤ C + C n − (cid:88) k =1 ρ p − (cid:13)(cid:13)(cid:13) Z qn − k (cid:13)(cid:13)(cid:13) L p < ∞ . From | ρ | < 1, we conclude that sup n ∈ N (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) sup θ ∈ B ( θ ) h n ( θ ) h n ( θ ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L p < ∞ . 
For (d), we first examine the first derivatives. For ω , ω , β , and β , we can show that1 h n ( θ ) ∂h n ( θ ) ∂θ j ≤ C a.s. for j = 1 , , , , because σ n is their linear function.Consider the case that ( ω wn , β wn , γ wn ) = ( ω h , β h , γ h ). Under Assumption 1(a), we can easily32how that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂θ hj ∂γ k (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ C a.s. for j = 1 , , . . . , 12 and k = 1 , . The property that x/ (1 + x ) ≤ x q for any q ∈ [0 , 1] and all x ≥ (cid:12)(cid:12)(cid:12)(cid:12) h n ( θ ) ∂h n ( θ ) ∂γ (cid:12)(cid:12)(cid:12)(cid:12) = h n ( θ ) − (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − (cid:88) k =1 ∂ω h ∂γ ( γ h ) k − + ( k − ω h + β h Z n − k )( γ h ) k − ∂γ h ∂γ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + ( n − γ h ) n − h ( θ ) ∂γ h ∂γ ≤ C (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − (cid:88) k =1 ( γ h ) k − ( γ h ) k − ( ω h + β h Z n − k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + C (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − (cid:88) k =1 ( k − γ h ) k − ( ω h + β h Z n − k ) ω h + ( γ h ) k − ( ω h + β h Z n − k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + C ≤ C (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − (cid:88) k =1 kρ kq ( ω hu + β hu Z n − k − ) q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + C. We can choose q ∈ [0 , 1] such that E ( ω hu + β hu Z n − k − ) qp < ∞ . Then, since | ρ | < n ∈ N (cid:13)(cid:13)(cid:13)(cid:13) h n ( θ ) ∂h n ( θ ) ∂γ (cid:13)(cid:13)(cid:13)(cid:13) L p < C. Similarly, we can show the bound for the first derivatives of the h n ( θ ) and for the second and thirdderivatives. Lemma 3. Under Assumption 1(a)–(b) and Assumption 2(a)–(b), we have sup θ ∈ Θ (cid:12)(cid:12)(cid:12) (cid:98) Q N,M ( θ ) − Q N ( θ ) (cid:12)(cid:12)(cid:12) = O p ( M − ) + o p (1) . Proof. Proof. 
By the triangular inequality, we have (cid:12)(cid:12)(cid:12) (cid:98) Q N,M ( θ ) − Q N ( θ ) (cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12) (cid:98) Q N,M ( θ ) − (cid:101) Q N ( θ ) (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) (cid:101) Q N ( θ ) − Q N ( θ ) (cid:12)(cid:12)(cid:12) . h n ( θ ) − < ∞ a.s. By Assumption 2(b), we have E (cid:20) sup θ ∈ Θ (cid:12)(cid:12)(cid:12) (cid:98) Q N,M ( θ ) − (cid:101) Q N ( θ ) (cid:12)(cid:12)(cid:12)(cid:21) ≤ C N N (cid:88) n =1 E (cid:34)(cid:13)(cid:13)(cid:13)(cid:13) RV n − (cid:90) nn − σ t dt (cid:13)(cid:13)(cid:13)(cid:13) L (cid:35) ≤ CM − . Accordingly, we have sup θ ∈ Θ (cid:12)(cid:12)(cid:12) (cid:98) Q N,M ( θ ) − (cid:101) Q N ( θ ) (cid:12)(cid:12)(cid:12) = O p ( M − ) . We can easily show that (cid:101) Q N ( θ ) − Q N ( θ ) = N (cid:80) Nn =1 ξ n h n ( θ ) . Because h n ( θ ) is adapted to F n − , ξ n h n ( θ ) is also martingale difference. Furthermore, (cid:12)(cid:12)(cid:12) ξ n h n ( θ ) (cid:12)(cid:12)(cid:12) is uniformly integrable because (cid:12)(cid:12)(cid:12) ξ n h n ( θ ) (cid:12)(cid:12)(cid:12) ≤ ω hl | ξ n | .Thus, (cid:12)(cid:12)(cid:12) (cid:101) Q N ( θ ) − Q N ( θ ) (cid:12)(cid:12)(cid:12) −→ K N ( θ ) = (cid:101) Q N ( θ ) − Q N ( θ ). By mean-value theorem, there exists θ ∗ between θ and θ (cid:48) satisfying (cid:12)(cid:12)(cid:12) K N ( θ ) − K N ( θ (cid:48) ) (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N N (cid:88) n =1 ξ n h n ( θ ∗ ) ∂h n ( θ ∗ ) ∂θ ( θ − θ (cid:48) ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ N N (cid:88) n =1 (cid:13)(cid:13)(cid:13)(cid:13) sup θ ∗ ∈ Θ ξ n h n ( θ ∗ ) ∂h n ( θ ∗ ) ∂θ (cid:13)(cid:13)(cid:13)(cid:13) max (cid:13)(cid:13)(cid:13) ( θ − θ (cid:48) ) (cid:13)(cid:13)(cid:13) max . By Lemma 2(d), (cid:13)(cid:13)(cid:13) ∂h n ( θ ∗ ) ∂θ k h n ( θ ∗ ) (cid:13)(cid:13)(cid:13) L ≤ C for every k ∈ { , , , , , } . 
Therefore, we have (cid:13)(cid:13)(cid:13)(cid:13) sup θ ∗ ∈ Θ ξ n h n ( θ ∗ ) ∂h n ( θ ∗ ) ∂θ (cid:13)(cid:13)(cid:13)(cid:13) L ≤ C (cid:107) ξ n (cid:107) L ≤ C < ∞ . As a result, K N ( θ ) satisfies the weak Lipschitz condtion and uniformly converges to zero by Theorem3 in Andrews (1992). Proof. Proof of Theorem 2. First, let us show the existence of the unique maximizer of Q N ( θ ).From the definition of Q N ( θ ), it is obvious thatmax θ ∈ Θ Q N ( θ ) ≤ − N N (cid:88) n =1 min θ n ∈ Θ (cid:20) log( h n ( θ n )) + h n ( θ ) h n ( θ n ) (cid:21) . θ ,n is the minimizer of the n th summand on right hand side, θ ,n must satisfy h n ( θ ,n ) = h n ( θ )for every n ∈ N . Therefore, if there exists θ ∗ ∈ Θ such that h n ( θ ∗ ) = h n ( θ ) for every n ∈ N , θ ∗ would be the maximizer. In this manner, θ is one of the candidates of θ ∗ . We then show that θ ∗ = θ a.s.Under the SG-Itˆo framework, we have h n ( θ ) = s ,n ( ω h + γ h h n − ( θ ) + β h Z n − ) + s ,n ( ω h + γ h h n − ( θ ) + β h Z n − )+ s ,n ( ω h + γ h h n − ( θ ) + β h Z n − ) + s ,n ( ω h + γ h h n − ( θ ) + β h Z n − ) . Then θ ∗ and θ satisfy AP = 0 a.s., where θ ∗ = ( ω ∗ , ω ∗ , γ ∗ , γ ∗ , β ∗ , β ∗ ) ,A = s , · · · s , s , h ( θ ) · · · s , h ( θ ) s , Z · · · s , Z s , · · · s , s , h ( θ ) · · · s , h ( θ ) s , Z · · · s , Z ... ... ... ... ... ... s ,n · · · s ,n s ,n h n ( θ ) · · · s ,n h n ( θ ) s ,n Z n · · · s ,n Z n ,P T = ( ω h ∗ − ω h , ω h ∗ − ω h , ω h ∗ − ω h , ω h ∗ − ω h , γ h ∗ − γ h , γ h ∗ − γ h , γ h ∗ − γ h , γ h ∗ − γ h , β h ∗ − β h , β h ∗ − β h , β h ∗ − β h , β h ∗ − β h , ) . Note that A is of full rank because Z n is nondegenerated and s n = 0 or 1. Then A T A is invertableand P = 0 a.s. That is, we have ω h ∗ = ω h , , ω h ∗ = ω h , , γ h ∗ = γ h , , γ h ∗ = γ h , , β h ∗ = β h , , β h ∗ = β h , a.s. , which implies θ ∗ = θ a.s. 
This also implies that there is a unique maximizer of $Q_N(\theta)$ because the integrated-form coefficients $\omega^h$ and $\beta^h$ associated with each state are strictly increasing functions of $\beta_1$ ($\beta_2$). Then, for any $\epsilon > 0$, there is a constant $\nu$ such that
\[
Q_N(\theta_0) - \max_{\theta\in\Theta:\ \|\theta-\theta_0\|_{\max} > \epsilon} Q_N(\theta) > \nu \quad \text{a.s.}
\]
Now, the theorem is the result of Theorem 1 in Xiu (2010).

A.4.3 Proof of Theorem 3

Lemma 4. We have the following properties under Assumption 1(a), Assumption 2(c), and Lemma 2(d):
(a) There exists a neighborhood $B(\theta_0)$ of $\theta_0$ such that $\sup_{n\in\mathbb{N}}\left\|\sup_{\theta\in B(\theta_0)}\frac{\partial^3 \widehat{q}_{N,M}(\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\right\|_{L} < \infty$ for any $j, k, l \in \{1,2,3,4,5,6\}$, where $\theta = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \theta_6) = (\omega_1, \omega_2, \gamma_1, \gamma_2, \beta_1, \beta_2)$.
(b) $-\nabla S_N(\theta_0)$ is a positive definite matrix for $N \ge 6$.

Proof. Consider (a). By Assumption 2(c), we have
\[
E\left[RV_n \mid \mathcal{F}_{n-1}\right] \le C\,E\left[\int_{n-1}^{n}\sigma_t^2\,dt \,\Big|\, \mathcal{F}_{n-1}\right] + C = C h_n(\theta_0) + C \quad \text{a.s.}
\]
Then, by Lemma 2(c) and (d), we have
\begin{align*}
E\left[\sup_{\theta\in B(\theta_0)}\left|\frac{RV_n}{h_n(\theta)^2}\frac{\partial^3 h_n(\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\right|\right]
&\le C\,E\left[\sup_{\theta\in B(\theta_0)}\frac{h_n(\theta_0)}{h_n(\theta)}\left|\frac{1}{h_n(\theta)}\frac{\partial^3 h_n(\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\right|\right] + C \\
&\le C\left\|\sup_{\theta\in B(\theta_0)}\frac{h_n(\theta_0)}{h_n(\theta)}\right\|_{L}\left\|\sup_{\theta\in B(\theta_0)}\left|\frac{1}{h_n(\theta)}\frac{\partial^3 h_n(\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\right|\right\|_{L} + C \\
&\le C < \infty .
\end{align*}
We can similarly bound the remaining terms.

Consider (b). Let
\[
h_{\theta,n} = \frac{\partial h_n(\theta_0)}{\partial\theta}h_n(\theta_0)^{-1} = h_n(\theta_0)^{-1}\left(\frac{\partial h_n(\theta_0)}{\partial\omega_1}\ \ \frac{\partial h_n(\theta_0)}{\partial\omega_2}\ \ \frac{\partial h_n(\theta_0)}{\partial\gamma_1}\ \ \frac{\partial h_n(\theta_0)}{\partial\gamma_2}\ \ \frac{\partial h_n(\theta_0)}{\partial\beta_1}\ \ \frac{\partial h_n(\theta_0)}{\partial\beta_2}\right)^T .
\]
Then $-\nabla S_N(\theta_0) = \frac{1}{2N}\sum_{n=1}^{N} h_{\theta,n}h_{\theta,n}^T$. Suppose that $-\nabla S_N(\theta_0)$ is not a positive definite matrix.
This implies the existence of $\lambda \ne 0$ which satisfies $\frac{1}{2N}\sum_{n=1}^{N}\lambda^T h_{\theta,n}h_{\theta,n}^T\lambda = 0$, implying that $h_{\theta,n}^T\lambda = 0$ for all $n = 1, \ldots, N$. Define
\[
J = \left(h_{\theta,1}\ \ h_{\theta,2}\ \ \cdots\ \ h_{\theta,n}\right) =
\begin{pmatrix}
\frac{\partial h_1(\theta_0)}{\partial\omega_1} & \cdots & \frac{\partial\omega^w_n}{\partial\omega_1} + \frac{\partial\gamma^w_n}{\partial\omega_1}h_{n-1}(\theta_0) + \gamma^w_n\frac{\partial h_{n-1}(\theta_0)}{\partial\omega_1} + \frac{\partial\beta^w_n}{\partial\omega_1}Z_{n-1}^2 \\
\frac{\partial h_1(\theta_0)}{\partial\omega_2} & \cdots & \frac{\partial\omega^w_n}{\partial\omega_2} + \frac{\partial\gamma^w_n}{\partial\omega_2}h_{n-1}(\theta_0) + \gamma^w_n\frac{\partial h_{n-1}(\theta_0)}{\partial\omega_2} + \frac{\partial\beta^w_n}{\partial\omega_2}Z_{n-1}^2 \\
\vdots & & \vdots \\
\frac{\partial h_1(\theta_0)}{\partial\beta_2} & \cdots & \frac{\partial\omega^w_n}{\partial\beta_2} + \frac{\partial\gamma^w_n}{\partial\beta_2}h_{n-1}(\theta_0) + \gamma^w_n\frac{\partial h_{n-1}(\theta_0)}{\partial\beta_2} + \frac{\partial\beta^w_n}{\partial\beta_2}Z_{n-1}^2
\end{pmatrix},
\]
with one row for each of $\omega_1, \omega_2, \gamma_1, \gamma_2, \beta_1, \beta_2$, where
\[
\frac{\partial\omega^w_n}{\partial\theta_k} = \sum_{i} s_{i,n}\frac{\partial\omega^h_i}{\partial\theta_k}, \qquad
\frac{\partial\gamma^w_n}{\partial\theta_k} = \sum_{i} s_{i,n}\frac{\partial\gamma^h_i}{\partial\theta_k}, \qquad
\frac{\partial\beta^w_n}{\partial\theta_k} = \sum_{i} s_{i,n}\frac{\partial\beta^h_i}{\partial\theta_k},
\]
each sum running over the four state indicators, for any $k \in \{1,2,3,4,5,6\}$ and $\theta = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \theta_6) = (\omega_1, \omega_2, \gamma_1, \gamma_2, \beta_1, \beta_2)$. Since the $Z_i$'s are nondegenerate, $J$ is of full rank almost surely. Therefore, $J^T\lambda = 0$ a.s. implies $\lambda = 0$ a.s., which is a contradiction.

Proof of Theorem 3. By the mean-value theorem, there exists $\theta^*$ between $\widehat{\theta}$ and $\theta_0$ such that $\widehat{S}_{N,M}(\widehat{\theta}) - \widehat{S}_{N,M}(\theta_0) = -\widehat{S}_{N,M}(\theta_0) = \nabla\widehat{S}_{N,M}(\theta^*)(\widehat{\theta} - \theta_0)$. If $-\nabla\widehat{S}_{N,M}(\theta^*) \xrightarrow{p} -\nabla S_N(\theta_0)$, which is a positive definite matrix, the convergence rates of $|\widehat{S}_{N,M}(\theta_0)|$ and $|\widehat{\theta} - \theta_0|$ are equivalent. Therefore, proving Theorem 3 is equivalent to showing (a) $\widehat{S}_{N,M}(\theta_0) = O_p(M^{-1/4} + N^{-1/2})$ and (b) $\left\|\nabla\widehat{S}_{N,M}(\theta^*) - \nabla S_N(\theta_0)\right\|_{\max} = o_p(1)$. Consider (a).
By Lemma 2(d) and Assumption 2(b), we have
\begin{align*}
\left\|\widehat{S}_{N,M}(\theta_0) - \widetilde{S}_N(\theta_0)\right\|_{L}
&= \frac{1}{2N}\left\|\sum_{n=1}^{N}\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{1}{h_n(\theta_0)^2}\left(RV_n - \int_{n-1}^{n}\sigma_t^2\,dt\right)\right\|_{L} \\
&\le \frac{1}{2N}\sum_{n=1}^{N}\left\|\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{1}{h_n(\theta_0)}\right\|_{L}\left\|RV_n - \int_{n-1}^{n}\sigma_t^2\,dt\right\|_{L} \\
&\le C M^{-1/4} .
\end{align*}
Then we have
\begin{align*}
\widehat{S}_{N,M}(\theta_0) - S_N(\theta_0) = \widehat{S}_{N,M}(\theta_0)
&= -\frac{1}{2N}\sum_{n=1}^{N}\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{1}{h_n(\theta_0)}\cdot\frac{\left(RV_n - \int_{n-1}^{n}\sigma_t^2\,dt\right) + \xi_n}{h_n(\theta_0)} \\
&= -\frac{1}{2N}\sum_{n=1}^{N}\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{1}{h_n(\theta_0)}\cdot\frac{\xi_n}{h_n(\theta_0)} + O_p(M^{-1/4}) .
\end{align*}
By Itô's lemma and Itô's isometry, for $j \in \{1,2,3,4,5,6\}$, we have
\begin{align*}
E\left[\left(\frac{1}{2N}\sum_{n=1}^{N}\frac{\partial h_n(\theta_0)}{\partial\theta_j}\frac{1}{h_n(\theta_0)}\frac{\xi_n}{h_n(\theta_0)}\right)^2\right]
&= \frac{1}{4N^2}E\left[\sum_{n=1}^{N}\left(\frac{\partial h_n(\theta_0)}{\partial\theta_j}\right)^2\left(\frac{1}{h_n(\theta_0)}\right)^2\left(\frac{\xi_n}{h_n(\theta_0)}\right)^2\right] \\
&= \frac{1}{4N^2}E\left[\sum_{n=1}^{N}\left(\frac{\partial h_n(\theta_0)}{\partial\theta_j}\right)^2\left(\frac{1}{h_n(\theta_0)}\right)^2\frac{E[\xi_n^2 \mid \mathcal{F}_{n-1}]}{h_n(\theta_0)^2}\right] \\
&\le \frac{C}{N^2}E\left[\sum_{n=1}^{N}\left(\frac{\partial h_n(\theta_0)}{\partial\theta_j}\right)^2\left(\frac{1}{h_n(\theta_0)}\right)^2\frac{E[Z_n^4 \mid \mathcal{F}_{n-1}]}{h_n(\theta_0)^2}\right] \\
&\le \frac{C}{N^2}\sum_{n=1}^{N}\left\|\left(\frac{\partial h_n(\theta_0)}{\partial\theta_j}\right)^2\left(\frac{1}{h_n(\theta_0)}\right)^2\right\|_{L}\left\|\frac{E[Z_n^4 \mid \mathcal{F}_{n-1}]}{h_n(\theta_0)^2}\right\|_{L} \\
&= O_p(N^{-1}),
\end{align*}
where the last equality holds by Lemma 2(d) and Assumption 1(c). Therefore, statement (a) is proved.

Consider (b).
By the triangle inequality, we have
\begin{align}
\left\|\nabla\widehat{S}_{N,M}(\theta^*) - \nabla S_N(\theta_0)\right\|_{\max}
\le{} & \left\|\nabla\widehat{S}_{N,M}(\theta^*) - \nabla\widehat{S}_{N,M}(\theta_0)\right\|_{\max} \tag{b-1} \\
& + \left\|\nabla\widehat{S}_{N,M}(\theta_0) - \nabla S_N(\theta_0)\right\|_{\max} . \tag{b-2}
\end{align}
For (b-1), let $U_n = \max_{j,k,l\in\{1,2,\ldots,6\}}\sup_{\theta\in\Theta}\left|\frac{\partial^3\widehat{q}_{N,M}(\theta)}{\partial\theta_j\partial\theta_k\partial\theta_l}\right|$. By Taylor expansion and the mean-value theorem, the following inequality is satisfied for $\theta^{**}$ between $\theta_0$ and $\theta^*$:
\begin{align*}
\left\|\nabla\widehat{S}_{N,M}(\theta^*) - \nabla\widehat{S}_{N,M}(\theta_0)\right\|_{\max}
&\le \frac{1}{2N}\sum_{n=1}^{N}\left\|\frac{\partial^3\widehat{q}_{N,M}(\theta^{**})}{\partial\theta_j\partial\theta_k\partial\theta_l}\right\|_{\max}\left\|\theta^* - \theta_0\right\|_{\max} \\
&\le \frac{C}{N}\sum_{n=1}^{N}U_n\left\|\theta^* - \theta_0\right\|_{\max} = o_p(1),
\end{align*}
where the last line is due to Theorem 2 and Lemma 4. For (b-2), by Lemma 2(d) and Assumption 2(b), we have
\begin{align*}
\left\|\nabla\widehat{S}_{N,M}(\theta_0) - \nabla\widetilde{S}_N(\theta_0)\right\|_{\max}
&= \left\|\frac{1}{2N}\sum_{n=1}^{N}\left[\frac{2}{h_n(\theta_0)^3}\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{\partial h_n(\theta_0)}{\partial\theta^T} - \frac{1}{h_n(\theta_0)^2}\frac{\partial^2 h_n(\theta_0)}{\partial\theta\,\partial\theta^T}\right]\left(RV_n - \int_{n-1}^{n}\sigma_t^2\,dt\right)\right\|_{\max} \\
&= O_p(M^{-1/4}) .
\end{align*}
Note that we have
\begin{align*}
\eta_n &= \nabla\widetilde{S}_N(\theta_0) - \nabla S_N(\theta_0) \\
&= \frac{1}{2N}\sum_{n=1}^{N}\left[\frac{\partial^2 h_n(\theta_0)}{\partial\theta\,\partial\theta^T}h_n(\theta_0)^{-1}\left(\frac{\int_{n-1}^{n}\sigma_t^2\,dt - h_n(\theta_0)}{h_n(\theta_0)}\right) - \frac{\partial h_n(\theta_0)}{\partial\theta}\frac{\partial h_n(\theta_0)}{\partial\theta^T}h_n(\theta_0)^{-2}\,\frac{2\int_{n-1}^{n}\sigma_t^2\,dt - 2h_n(\theta_0)}{h_n(\theta_0)}\right] \\
&= \frac{1}{2N}\sum_{n=1}^{N}\frac{\xi_n}{h_n(\theta_0)}\left[\frac{\partial^2 h_n(\theta_0)}{\partial\theta\,\partial\theta^T}h_n(\theta_0)^{-1} - 2\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{\partial h_n(\theta_0)}{\partial\theta^T}h_n(\theta_0)^{-2}\right] .
\end{align*}
We can easily show that $\left\|\eta_n\right\|_{\max}^2 \le \frac{C}{N^2}\sum_{n=1}^{N}\left\|\frac{E[Z_n^4 \mid \mathcal{F}_{n-1}]}{h_n(\theta_0)^2}\right\|_{\max} = O_p(N^{-1})$. As a result, we have
\begin{align*}
\nabla\widehat{S}_{N,M}(\theta_0) &= \nabla\widetilde{S}_N(\theta_0) + O_p(M^{-1/4}) \\
&= \nabla S_N(\theta_0) + \eta_n + O_p(M^{-1/4}) \\
&= \nabla S_N(\theta_0) + O_p(N^{-1/2}) + O_p(M^{-1/4}) .
\end{align*}

A.4.4 Proof of Theorem 4

Proof of Theorem 4. For any $\lambda \in \mathbb{R}^6$, let $v_n = \frac{\partial h_n(\theta_0)}{\partial\theta}h_n(\theta_0)^{-2}\xi_n$ and $\kappa_n = \lambda^T v_n$. Since $\kappa_n$ is a martingale difference, $E(\kappa_n^2) < \infty$. Also, $\kappa_n$ is stationary and ergodic by Assumption 1(d). Let $\sqrt{\frac{1}{N}\sum_{n=1}^{N}\kappa_n^2} \xrightarrow{p} \kappa$. By the martingale CLT, $\frac{1}{\sqrt{N}}\kappa^{-1}\sum_{n=1}^{N}\kappa_n \xrightarrow{d} N(0, 1)$. Let $\frac{1}{4N}\sum_{n=1}^{N}v_n v_n^T \xrightarrow{p} V$. Then, by the Cramér-Wold device, we have
\[
-\sqrt{N}\,V^{-1/2}\widetilde{S}_N(\theta_0) = \frac{\sqrt{N}}{2N}V^{-1/2}\sum_{n=1}^{N}v_n \xrightarrow{d} N(0, I_6),
\]
where $I_k$ denotes the $k$ by $k$ identity matrix. Furthermore, let
\[
-\nabla S_N(\theta_0) = \frac{1}{2N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta_0)}{\partial\theta}\frac{\partial h_n(\theta_0)}{\partial\theta^T}h_n(\theta_0)^{-2}\right] \xrightarrow{p} W
\]
for a positive definite matrix $W$. By the mean-value theorem, there exists $\theta^*$ between $\widehat{\theta}$ and $\theta_0$ which satisfies
\[
\widehat{S}_{N,M}(\widehat{\theta}) - \widehat{S}_{N,M}(\theta_0) = -\widehat{S}_{N,M}(\theta_0) = \nabla\widehat{S}_{N,M}(\theta^*)(\widehat{\theta} - \theta_0) .
\]
Then
\begin{align}
\sqrt{N}(\widehat{\theta} - \theta_0) &= -\sqrt{N}\,\nabla\widehat{S}_{N,M}(\theta^*)^{-1}\widehat{S}_{N,M}(\theta_0) \nonumber \\
&= \sqrt{N}\left(W^{-1} + o_p(1)\right)\left(\widetilde{S}_N(\theta_0) + O_p(M^{-1/4})\right) \nonumber \\
&= \sqrt{N}\,W^{-1}\widetilde{S}_N(\theta_0) + O_p(N^{1/2}M^{-1/4}) + o_p(1) . \tag{A.5}
\end{align}
Thus, we can conclude that
\[
\left(W^{-1}VW^{-1}\right)^{-1/2}\sqrt{N}(\widehat{\theta} - \theta_0) \xrightarrow{d} N(0, I_6) .
\]

A.4.5 Proof of Proposition 2

Proof of Proposition 2. We first consider $\widehat{V}$.
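As a rough numerical companion to the limit in Theorem 4, the sketch below assembles the sandwich matrix $W^{-1}VW^{-1}$ and a Wald-type quadratic form $N(R\widehat{\theta} - r)^T(RW^{-1}VW^{-1}R^T)^{-1}(R\widehat{\theta} - r)$ in plain Python. The helper names, the toy identity matrices, and the restriction matrix `R` encoding $\omega_1 = \omega_2$, $\gamma_1 = \gamma_2$, $\beta_1 = \beta_2$ are illustrative assumptions; in an application $W$ and $V$ would be replaced by their estimates $\widehat{W}$ and $\widehat{V}$.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sandwich(w, v):
    """Return W^{-1} V W^{-1}."""
    w_inv = inverse(w)
    return matmul(matmul(w_inv, v), w_inv)

def wald_statistic(theta_hat, r_mat, r_vec, w, v, n_days):
    """Return N (R theta - r)^T (R W^{-1} V W^{-1} R^T)^{-1} (R theta - r)."""
    cov = sandwich(w, v)
    middle = inverse(matmul(matmul(r_mat, cov), transpose(r_mat)))
    diff = [sum(rij * tj for rij, tj in zip(row, theta_hat)) - ri
            for row, ri in zip(r_mat, r_vec)]
    k = len(diff)
    quad = sum(diff[i] * middle[i][j] * diff[j] for i in range(k) for j in range(k))
    return n_days * quad

# Toy example: theta = (omega1, omega2, gamma1, gamma2, beta1, beta2).
identity6 = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
R = [[1, -1, 0, 0, 0, 0],
     [0, 0, 1, -1, 0, 0],
     [0, 0, 0, 0, 1, -1]]
r = [0.0, 0.0, 0.0]

stat_equal = wald_statistic([0.1, 0.1, 0.3, 0.3, 0.2, 0.2], R, r, identity6, identity6, 100)
stat_shift = wald_statistic([0.2, 0.1, 0.3, 0.3, 0.2, 0.2], R, r, identity6, identity6, 100)
print(stat_equal, stat_shift)
```

The statistic is zero when the estimated parameters satisfy the restrictions exactly and grows with the distance from them; it would then be compared with a $\chi^2$ critical value whose degrees of freedom equal the number of rows of `R`.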
Let
\begin{align*}
I(\theta) &= \frac{1}{4N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial\theta}\left[\frac{\partial h_n(\theta)}{\partial\theta}\right]^T\bigg|_{\theta=\theta}h_n(\theta)^{-4}\left(RV_n - h_n(\theta)\right)^2\right] = \frac{1}{4N}\sum_{n=1}^{N}\iota_n(\theta), \\
\widetilde{I}(\theta) &= \frac{1}{4N}\sum_{n=1}^{N}\left[\frac{\partial h_n(\theta)}{\partial\theta}\left[\frac{\partial h_n(\theta)}{\partial\theta}\right]^T\bigg|_{\theta=\theta}h_n(\theta)^{-4}\xi_n^2\right] = \frac{1}{4N}\sum_{n=1}^{N}\widetilde{\iota}_n(\theta) .
\end{align*}
Then, $\widehat{V} = I(\widehat{\theta})$ and we have
\begin{equation}
\left\|I(\widehat{\theta}) - V\right\|_{\max} \le \left\|I(\widehat{\theta}) - I(\theta_0)\right\|_{\max} + \left\|I(\theta_0) - V\right\|_{\max} . \tag{A.6}
\end{equation}
First, we show that the convergence of $I(\widehat{\theta})$ to $I(\theta_0)$ is equivalent to that of $\widehat{\theta}$ to $\theta_0$. Similar to the proof of Lemma 4(a), under Lemma 2(d) and Assumption 2(c), we can show $U'_n = \max_j \sup_{\theta\in B(\theta_0)}\left|\frac{\partial\iota_n(\theta)}{\partial\theta_j}\right| = O_p(1)$ for $j \in \{1,2,3,4,5,6\}$. For large $N$ and $M$, by the mean-value theorem and Taylor expansion, we have
\[
\left\|I(\widehat{\theta}) - I(\theta_0)\right\|_{\max} \le \frac{C}{N}\sum_{n=1}^{N}U'_n\left\|\widehat{\theta} - \theta_0\right\|_{\max} .
\]
Since $\widehat{\theta}$ converges to $\theta_0$ with the convergence rate $N^{-1/2} + M^{-1/4}$, we have
\[
\left\|I(\widehat{\theta}) - I(\theta_0)\right\|_{\max} = O_p(N^{-1/2} + M^{-1/4}) .
\]
For the second term on the right-hand side of Equation (A.6), we have
\begin{equation}
\left\|I(\theta_0) - V\right\|_{\max} \le \left\|I(\theta_0) - \widetilde{I}(\theta_0)\right\|_{\max} + \left\|\widetilde{I}(\theta_0) - V\right\|_{\max} . \tag{A.7}
\end{equation}
We consider the first term on the right-hand side of Equation (A.7).
For $j \in \{1,2,3,4,5,6\}$, we have
\begin{align*}
\frac{1}{4N}\sum_{n=1}^{N}E\left[\left(\frac{\partial h_n(\theta_0)}{\partial\theta_j}\right)^2 h_n(\theta_0)^{-4}\left\{\left(RV_n - h_n(\theta_0)\right)^2 - \xi_n^2\right\}\right]
&\le \frac{C}{N}\sum_{n=1}^{N}\left\|h_n(\theta_0)^{-2}\left\{\left(RV_n - h_n(\theta_0)\right)^2 - \left(\int_{n-1}^{n}\sigma_t^2\,dt - h_n(\theta_0)\right)^2\right\}\right\|_{L} \\
&\le \frac{C}{N}\sum_{n=1}^{N}\left\|RV_n - \int_{n-1}^{n}\sigma_t^2\,dt\right\|_{L}\left\|\frac{RV_n - \int_{n-1}^{n}\sigma_t^2\,dt + 2\xi_n}{h_n(\theta_0)^2}\right\|_{L} \\
&\le C M^{-1/4},
\end{align*}
where the first inequality holds by Lemma 2(d) and the last inequality holds by Assumption 3. Consequently, we have $\left\|I(\theta_0) - \widetilde{I}(\theta_0)\right\|_{\max} = O_p(M^{-1/4})$ and $\left\|\widetilde{I}(\theta_0) - V\right\|_{\max} = o_p(1)$. In conclusion, we have
\[
\left\|I(\widehat{\theta}) - V\right\|_{\max} = o_p(1) + O_p(N^{-1/2} + M^{-1/4}) .
\]
Similarly, we can show that $\left\|\widehat{W} - W\right\|_{\max} = o_p(1) + O_p(N^{-1/2} + M^{-1/4})$ as well.

A.4.6 Proof of Theorem 5

Proof of Theorem 5. By multiplying both sides of Equation (A.5) by $R$, we obtain
\[
\sqrt{N}R(\widehat{\theta} - \theta_0) = R\,W^{-1}\sqrt{N}\,\widetilde{S}_N(\theta_0) + O_p(N^{1/2}M^{-1/4}) + o_p(1) .
\]
By Theorem 4, we have
\[
\left[RW^{-1}VW^{-1}R^T\right]^{-1/2}\sqrt{N}(R\widehat{\theta} - r) \xrightarrow{d} N(0, I_v) .
\]
Then, by the continuous mapping theorem, we have
\[
N(R\widehat{\theta} - r)^T\left(RW^{-1}VW^{-1}R^T\right)^{-1}(R\widehat{\theta} - r) \xrightarrow{d} \chi^2(v) .
\]
In Proposition 2, we already showed that $\widehat{V} \xrightarrow{p} V$ and $\widehat{W} \xrightarrow{p} W$. Consequently, we have
\[
T_{N,M} = N(R\widehat{\theta} - r)^T\left(R\widehat{W}^{-1}\widehat{V}\widehat{W}^{-1}R^T\right)^{-1}(R\widehat{\theta} - r) \xrightarrow{d} \chi^2(v) .
\]

Table 1. Size $\alpha$ test results for the Wald-type statistic
[Table body not recovered. Columns: $N$, $M$, rejection rates under $H_0$ and under $H_a$.]
Notes.
This table presents Wald-type test rejection rates under the null and alternative hypotheses for four significance levels $\alpha$, with $N = 250, 500, 750, 1000$ and $M = 390, 2340, 4680, 23400$.

Table 2. SG-Itô model parameter estimation and hypothesis test results based on the realized volatility estimates
[Only the first Wald statistic, the significance markers, and the values in parentheses were recovered; the remaining point estimates are missing.]

Parameter         (i)       (ii)      (iii)     (iv)      (v)       (vi)      (vii)     Significance markers
ω1                (0.004)   (0.005)   (0.004)   (0.007)   (0.007)   (0.006)   (0.004)   ∗∗∗ for all models
γ1                (0.037)   (0.037)   (0.034)   (0.045)   (0.051)   (0.050)   (0.038)   ∗∗∗ for all models
β1                (0.024)   (0.019)   (0.021)   (0.020)   (0.026)   (0.031)   (0.028)   ∗∗∗ for all models
ω2                not recovered                                                         ∗∗∗, ∗∗∗, ∗∗∗, ∗∗ (columns unknown)
γ2                (0.072)   (0.048)   (0.059)   (0.078)   (0.084)   (0.071)   (0.050)   ∗∗∗ for all models
β2                (0.030)   (0.032)   (0.033)   (0.042)   (0.030)   (0.036)   (0.021)   ∗∗∗ for all models
Wald statistic    30.252∗∗∗ ∗∗∗       ∗∗∗       ∗∗∗                 ∗∗∗       ∗∗
(p-value)         (0.000)   (0.000)   (0.000)   (0.000)   (0.462)   (0.000)   (0.018)

Notes. This table presents SG-Itô model parameter estimation and hypothesis test results based on the realized volatility estimates. Models (i)–(vii) are constructed to examine the following effects on the volatility process: (i) leverage (previous-day market return), (ii) leverage (overnight return), (iii) Chinese stock market movement, (iv) pre-holiday, (v) post-holiday, (vi) abnormal trading volume, and (vii) aggregate liquidity. The Wald-type statistics are from the Wald-type test under the null hypothesis $H_0: \{\omega_1 = \omega_2, \gamma_1 = \gamma_2, \beta_1 = \beta_2\}$. For the parameter estimation, intraday S&P 500 index data spanning from January 1, 2015, to December 31, 2018, are used. The numbers in parentheses under parameter estimates and Wald-type statistics indicate standard errors and $p$-values, respectively. ∗∗∗ and ∗∗ on coefficients and Wald-type statistics denote statistical significance at the 1% and 5% level, respectively.

Table 3. Integrated form of SG-Itô model parameter estimates
[Table body not recovered. Rows: four sets of integrated-form parameters $\omega^h$, $\gamma^h$, $\beta^h$. Columns: models (i)–(vii).]
Notes.
This table presents the integrated form of SG-Itô model parameter estimates (i.e., $\widehat{\theta}^h$) suggested in Theorem 1(b). Models (i)–(vii) are constructed to examine the following effects on the volatility process: (i) leverage (previous-day market return), (ii) leverage (overnight return), (iii) Chinese stock market movement, (iv) pre-holiday, (v) post-holiday, (vi) abnormal trading volume, and (vii) aggregate liquidity.

Table 4. Estimation of the GARCH, GARCH-Itô, and RS-GARCH model parameters
[Table body not recovered. Rows: two sets of parameters $\omega^L$, $\gamma^L$, $\beta^L$. Columns: GARCH, GARCH-Itô, and RS-GARCH models (i)–(vii).]
Notes. This table presents parameter estimates of the GARCH, GARCH-Itô, and RS-GARCH models. The RS-GARCH models (i)–(vii) are constructed to examine the following effects on the volatility process: (i) leverage (previous-day market return), (ii) leverage (overnight return), (iii) Chinese stock market movement, (iv) pre-holiday, (v) post-holiday, (vi) abnormal trading volume, and (vii) aggregate liquidity.

Table 5. SG-Itô model parameter estimation and hypothesis test results based on the low-frequency data only
[Only the significance markers and the values in parentheses were recovered; the point estimates are missing.]

Parameter         (i)       (ii)      (iii)     (iv)      (v)       (vi)      (vii)     Significance markers
ω1                (0.010)   (0.017)   (0.012)   (0.024)   (0.023)   (0.012)   (0.010)   ∗, ∗, ∗∗ (columns unknown)
γ1                (0.098)   (0.085)   (0.062)   (0.067)   (0.086)   (0.110)   (0.052)   ∗∗∗ for all models
β1                (0.109)   (0.073)   (0.064)   (0.045)   (0.069)   (0.099)   (0.055)   ∗∗, ∗∗∗, ∗∗∗, ∗∗∗ (columns unknown)
ω2                not recovered                                                         ∗ (column unknown)
γ2                (0.273)   (0.125)   (0.148)   (0.138)   (0.139)   (0.202)   (0.093)   ∗∗∗ for all models
β2                not recovered                                                         ∗∗, ∗∗∗, ∗∗∗, ∗∗∗, ∗∗, ∗∗ (columns unknown)
Wald statistic    ∗                                                 ∗         ∗∗
(p-value)         (0.068)   (0.436)   (0.152)   (0.805)   (0.950)   (0.055)   (0.024)

Notes. This table presents SG-Itô model parameter estimation and hypothesis test results based on the low-frequency data for models (i)–(vii).
Models (i)–(vii) are constructed to examine the following effects on the volatility process: (i) leverage (previous-day market return), (ii) leverage (overnight return), (iii) Chinese stock market movement, (iv) pre-holiday, (v) post-holiday, (vi) abnormal trading volume, and (vii) aggregate liquidity. The Wald-type statistics are from the Wald-type test under the null hypothesis $H_0: \{\omega_1 = \omega_2, \gamma_1 = \gamma_2, \beta_1 = \beta_2\}$. For the parameter estimation, daily S&P 500 index data spanning from January 1, 2015, to December 31, 2018, are used. The numbers in parentheses under parameter estimates and Wald-type statistics indicate standard errors and $p$-values, respectively. ∗∗∗, ∗∗, and ∗ on coefficients and Wald-type statistics denote statistical significance at the 1%, 5%, and 10% level, respectively.

Table 6. Out-of-sample prediction performance of the volatility models measured by MAPEs

Out-of-sample MAPEs
Volatility model  (i)     (ii)    (iii)   (iv)    (v)     (vi)    (vii)
SG-Itô            0.560   0.503   0.552   0.558   0.567   0.564   0.572
GARCH-Itô         0.571   0.571   0.571   0.571   0.571   0.571   0.571
RS-GARCH          0.812   0.766   0.805   0.930   0.909   0.917   0.952
GARCH             0.925   0.925   0.925   0.925   0.925   0.925   0.925
HAR               0.646   0.646   0.646   0.646   0.646   0.646   0.646

Notes. This table presents out-of-sample prediction performance of the volatility models measured by MAPEs. Models (i)–(vii) are constructed to examine the following effects on the volatility process: (i) leverage (previous-day market return), (ii) leverage (overnight return), (iii) Chinese stock market movement, (iv) pre-holiday, (v) post-holiday, (vi) abnormal trading volume, and (vii) aggregate liquidity. The estimation window is 750 days and the prediction period is from December 22, 2017, to December 31, 2018 (248 days).

[Figure 1 panels: MSE plotted against M (axis from 0 to 20000), with one series for each of N = 250, 500, 750, 1000.]
Figure 1. MSEs of parameter estimates of the SG-Itô model
Notes.
This figure illustrates MSEs of parameter estimates of the SG-Itô model based on data simulated from the SG-Itô model with $N = 250, 500, 750, 1000$ and $M = 390, 2340, 4680, 23400$.

[Figure 2 panels: log MSPE for the SG-Itô, GARCH-Itô, RS-GARCH, and GARCH models; (a) Null hypothesis, (b) Alternative hypothesis.]
Figure 2. One-day ahead out-of-sample volatility prediction error
Notes. This figure illustrates the one-day ahead out-of-sample volatility prediction error (MSPE) of the volatility models against $M$ under the null and alternative hypotheses, with 500-day estimation window and prediction periods. Note that we took the log transformation of MSPEs.

[Figure 3 panels: ordered values of the statistic, one panel for each combination of N = 250, 500, 750, 1000 and M = 390, 2340, 4680, 23400.]
Figure 3. $\chi^2$ quantile-quantile plots of the Wald-type statistic
Notes. This figure illustrates $\chi^2$ quantile-quantile plots of the Wald-type statistic under the null hypothesis for $N = 250, 500, 750, 1000$ and $M = 390, 2340, 4680, 23400$. [...] $\chi^2$ distribution.

Data availability statement

The S&P 500 intraday index data are provided by the Chicago Board Options Exchange (CBOE) (web link: https://datashop.cboe.com/). Please note that the data sharing policy of CBOE restricts the redistribution of data.

References

Abraham, A. and Ikenberry, D. L. (1994). The individual investor and the weekend effect. Journal of Financial and Quantitative Analysis, 29(2):263–277.
Ahoniemi, K., Fuertes, A.-M., and Olmo, J. (2016). Overnight news and daily equity trading risk limits. Journal of Financial Econometrics, 14(3):525–551.
Ahoniemi, K. and Lanne, M. (2013). Overnight stock returns and realized volatility. International Journal of Forecasting, 29(4):592–604.
Aït-Sahalia, Y., Fan, J., and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. Journal of the American Statistical Association, 105(492):1504–1517.
Andersen, T. G. and Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review, pages 885–905.
Andrei, D. and Hasler, M. (2014). Investor attention and stock market volatility. Review of Financial Studies, 28(1):33–72.
Andrews, D. W. (1992). Generic uniform convergence. Econometric Theory, 8(2):241–257.
Barber, B. M. and Odean, T. (2007). All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. Review of Financial Studies, 21(2):785–818.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008). Designing realized kernels to measure the ex post variation of equity prices in the presence of noise. Econometrica, 76(6):1481–1536.
Bauwens, L., Dufays, A., and Rombouts, J. V. (2014). Marginal likelihood for Markov-switching and change-point GARCH models. Journal of Econometrics, 178:508–522.
Bauwens, L., Preminger, A., and Rombouts, J. V. (2010). Theory and inference for a Markov switching GARCH model. Econometrics Journal, 13(2):218–244.
Berument, H. and Kiymaz, H. (2001). The day of the week effect on stock market volatility. Journal of Economics and Finance, 25(2):181–193.
Birru, J. (2018). Day of the week and the cross-section of returns. Journal of Financial Economics, 130(1):182–214.
Black, F. (1976). Studies of stock market volatility changes.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3):307–327.
Braun, P. A., Nelson, D. B., and Sunier, A. M. (1995). Good news, bad news, volatility, and betas. Journal of Finance, 50(5):1575–1603.
Carr, P. and Wu, L. (2017). Leverage effect, volatility feedback, and self-exciting market disruptions. Journal of Financial and Quantitative Analysis, 52(5):2119–2156.
Cerovecki, C., Francq, C., Hörmann, S., and Zakoian, J.-M. (2019). Functional GARCH models: The quasi-likelihood approach and its applications. Journal of Econometrics, 209(2):353–375.
Chen, Y., Eaton, G. W., and Paye, B. S. (2018). Micro (structure) before macro? The predictive power of aggregate illiquidity for stock returns and economic activity. Journal of Financial Economics, 130(1):48–73.
Christensen, K., Kinnebrock, S., and Podolskij, M. (2010). Pre-averaging estimators of the ex-post covariance matrix in noisy diffusion models with non-synchronous data. Journal of Econometrics, 159(1):116–133.
Christie, A. A. (1982). The stochastic behavior of common stock variances: Value, leverage and interest rate effects. Journal of Financial Economics, 10(4):407–432.
Copeland, T. E. (1976). A model of asset trading under the assumption of sequential information arrival. Journal of Finance, 31(4):1149–1168.
Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7(2):174–196.
Corwin, S. A. and Schultz, P. (2012). A simple way to estimate bid-ask spreads from daily high and low prices. Journal of Finance, 67(2):719–760.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, pages 987–1007.
Engle, R. F. and Gallo, G. M. (2006). A multiple indicators model for volatility using intra-daily data. Journal of Econometrics, 131(1-2):3–27.
Fan, J. and Kim, D. (2018). Robust high-dimensional volatility matrix estimation for high-frequency factor model. Journal of the American Statistical Association, 113(523):1268–1283.
Figlewski, S. and Wang, X. (2000). Is the 'leverage effect' a leverage effect? Available at SSRN 256109.
French, K. R. (1980). Stock returns and the weekend effect. Journal of Financial Economics, 8(1):55–69.
Gallant, A. R., Rossi, P. E., and Tauchen, G. (1992). Stock prices and volume. Review of Financial Studies, 5(2):199–242.
Glosten, L. R., Jagannathan, R., and Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance, 48(5):1779–1801.
Gray, S. F. (1996). Modeling the conditional distribution of interest rates as a regime-switching process. Journal of Financial Economics, 42(1):27–62.
Haas, M., Mittnik, S., and Paolella, M. S. (2004). A new approach to Markov-switching GARCH models. Journal of Financial Econometrics, 2(4):493–530.
Hall, P. and Heyde, C. C. (2014). Martingale limit theory and its application. Academic Press.
Hamilton, J. D. and Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64(1-2):307–333.
Hansen, P. R., Huang, Z., and Shek, H. H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility. Journal of Applied Econometrics, 27(6):877–906.
Jacod, J., Li, Y., Mykland, P. A., Podolskij, M., and Vetter, M. (2009). Microstructure noise in the continuous case: The pre-averaging approach. Stochastic Processes and Their Applications, 119(7):2249–2276.
Jennings, R. H., Starks, L. T., and Fellingham, J. C. (1981). An equilibrium model of asset trading with sequential information arrival. Journal of Finance, 36(1):143–161.
Kallsen, J. and Taqqu, M. S. (1998). Option pricing in ARCH-type models. Mathematical Finance, 8(1):13–26.
Kambouroudis, D. S. and McMillan, D. G. (2016). Does VIX or volume improve GARCH volatility forecasts? Applied Economics, 48(13):1210–1228.
Karpoff, J. M. (1987). The relation between price changes and trading volume: A survey. Journal of Financial and Quantitative Analysis, 22(1):109–126.
Kim, D. (2016). Statistical inference for unified GARCH-Itô models with high-frequency financial data. Journal of Time Series Analysis, 37(4):513–532.
Kim, D. and Fan, J. (2019). Factor GARCH-Itô models for high-frequency data with application to large volatility matrix prediction. Journal of Econometrics, 208(2):395–417.
Kim, D. and Kon, S. J. (1994). Alternative models for the conditional heteroscedasticity of stock returns. Journal of Business, pages 563–598.
Kim, D., Liu, Y., and Wang, Y. (2018). Large volatility matrix estimation with factor-based diffusion model for high-frequency financial data. Bernoulli, 24(4B):3657–3682.
Kim, D. and Wang, Y. (2016). Unified discrete-time and continuous-time models and statistical inferences for merged low-frequency and high-frequency financial data. Journal of Econometrics, 194(2):220–230.
Kim, D. and Wang, Y. (2020). Overnight volatility processes. Manuscript.
Kim, D., Wang, Y., and Zou, J. (2016). Asymptotic theory for large volatility matrix estimation based on high-frequency financial data. Stochastic Processes and Their Applications, 126(11):3527–3577.
Kiymaz, H. and Berument, H. (2003). The day of the week effect on stock market volatility and volume: International evidence. Review of Financial Economics, 12(4):363–380.
Klaassen, F. (2002). Improving GARCH volatility forecasts with regime-switching GARCH. Empirical Economics, 27:363–394.
Lai, Y.-S., Sheu, H.-J., and Lee, H.-T. (2017). A multivariate Markov regime-switching high-frequency-based volatility model for optimal futures hedging. Journal of Futures Markets, 37(11):1124–1140.
Lakonishok, J. and Maberly, E. (1990). The weekend effect: Trading patterns of individual and institutional investors. Journal of Finance, 45(1):231–243.
Lamoureux, C. G. and Lastrapes, W. D. (1990a). Heteroskedasticity in stock return data: Volume versus GARCH effects. Journal of Finance, 45(1):221–229.
Lamoureux, C. G. and Lastrapes, W. D. (1990b). Persistence in variance, structural change, and the GARCH model. Journal of Business & Economic Statistics, 8(2):225–234.
Lamoureux, C. G. and Lastrapes, W. D. (1994). Endogenous trading volume and momentum in stock-return volatility. Journal of Business & Economic Statistics, 12(2):253–260.
Lange, T. and Rahbek, A. (2009). An introduction to regime switching time series models. In Handbook of Financial Time Series, pages 871–887. Springer.
Lee, S.-W. and Hansen, B. E. (1994). Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory, 10(1):29–52.
Liu, Z., Kong, X.-B., and Jing, B.-Y. (2018). Estimating the integrated volatility using high-frequency data with zero durations. Journal of Econometrics, 204(1):18–32.
Merlevede, F. and Peligrad, M. (2000). The functional central limit theorem under the strong mixing condition. Annals of Probability, pages 1336–1352.
Miller, E. M. (1988). Why a weekend effect? Journal of Portfolio Management, 14(4):43.
Nelson, D. B. (1990). ARCH models as diffusion approximations. Journal of Econometrics, 45(1-2):7–38.
Nyberg, H. (2012). Risk-return tradeoff in US stock returns over the business cycle. Journal of Financial and Quantitative Analysis, 47(1):137–158.
Renault, E. and Werker, B. J. (2011). Causality effects in return volatility measures with random times. Journal of Econometrics, 160(1):272–279.
Shephard, N. and Sheppard, K. (2010). Realising the future: Forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics, 25(2):197–231.
Shi, Y. and Ho, K.-Y. (2015). Modeling high-frequency volatility with three-state FIGARCH models. Economic Modelling, 51:473–483.
Song, X., Kim, D., Yuan, H., Cui, X., Lu, Z., Zhou, Y., and Wang, Y. (2021). Volatility analysis with realized GARCH-Itô models. To appear in Journal of Econometrics.
Tao, M., Wang, Y., and Chen, X. (2013). Fast convergence rates in estimating large volatility matrices using high-frequency financial data. Econometric Theory, 29(4):838–856.
Tao, M., Wang, Y., Yao, Q., and Zou, J. (2011). Large volatility matrix inference via combining low-frequency and high-frequency approaches. Journal of the American Statistical Association, 106(495):1025–1040.
Tauchen, G., Zhang, H., and Liu, M. (1996). Volume, volatility, and leverage: A dynamic analysis. Journal of Econometrics, 74(1):177–208.
Taylor, N. (2007). A note on the importance of overnight information in risk management models. Journal of Banking & Finance, 31(1):161–180.
Tsiakas, I. (2008). Overnight information and stochastic volatility: A study of European and US stock exchanges. Journal of Banking & Finance, 32(2):251–268.
Visser, M. P. (2011). GARCH parameter estimation using high-frequency data. Journal of Financial Econometrics, 9(1):162–197.
Wang, G. H. and Yau, J. (2000). Trading volume, bid–ask spread, and price volatility in futures markets. Journal of Futures Markets, 20(10):943–970.
Wang, Y. (2002). Asymptotic nonequivalence of GARCH models and diffusions. Annals of Statistics, 30(3):754–783.
Xiu, D. (2010). Quasi-maximum likelihood estimation of volatility with high frequency data. Journal of Econometrics, 159(1):235–250.
Zhang, L. (2006). Efficient estimation of stochastic volatility using noisy observations: A multi-scale approach. Bernoulli, 12(6):1019–1043.
Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise. Journal of Econometrics, 160(1):33–47.
Zhang, X. and Frey, R. (2015). Improving ARMA-GARCH forecasts for high frequency data with regime-switching ARMA-GARCH.