Overnight GARCH-Itô Volatility Models

Abstract

Various parametric volatility models for financial data have been developed to incorporate high-frequency realized volatilities and better capture market dynamics. However, because high-frequency trading data are not available during the close-to-open period, the volatility models often ignore volatility information over the close-to-open period and thus may suffer from loss of important information relevant to market dynamics. In this paper, to account for whole-day market dynamics, we propose an overnight volatility model based on Itô diffusions to accommodate two different instantaneous volatility processes for the open-to-close and close-to-open periods. We develop a weighted least squares method to estimate model parameters for two different periods and investigate its asymptotic properties. We conduct a simulation study to check the finite sample performance of the proposed model and method. Finally, we apply the proposed approaches to real trading data.


Donggyu Kim (a) and Yazhen Wang (b)
(a) College of Business, Korea Advanced Institute of Science and Technology (KAIST)
(b) Department of Statistics, University of Wisconsin-Madison
March 1, 2021


Keywords: high-frequency financial data, low-frequency financial data, quasi-maximum likelihood estimation, stochastic differential equation, volatility estimation and prediction

Since Markowitz (1952) introduced modern portfolio theory, measuring risk has become important in financial applications. Volatility itself is often employed as a proxy for risk. Furthermore, there are several risk measures, such as Value at Risk (VaR), expected shortfall, and market beta (Duffie and Pan, 1997; Rockafellar et al., 2000; Sharpe, 1964). These risk measures take volatilities as an important ingredient in their formulations, and their performance depends heavily on the accuracy of volatility estimation.

Generalized autoregressive conditional heteroskedasticity (GARCH) models are among the most successful volatility models for low-frequency data (Bollerslev, 1986; Engle, 1982). They employ squared daily log-returns as innovations in conditional expected volatilities, and are able to capture low-frequency market dynamics, such as volatility clustering and heavy tails. At the high-frequency level, nonparametric approaches, such as Itô processes and realized volatility estimators, are often utilized to model and estimate volatilities. Examples include two-time scale realized volatility (TSRV) (Zhang et al., 2005), multi-scale realized volatility (MSRV) (Zhang, 2006), kernel realized volatility (KRV) (Barndorff-Nielsen et al., 2008), quasi-maximum likelihood estimator (QMLE) (Aït-Sahalia et al., 2010; Xiu, 2010), pre-averaging realized volatility (PRV) (Jacod et al., 2009), and robust pre-averaging realized volatility (Fan and Kim, 2018). In practice, we often observe jumps in financial data, and the decomposition of daily variation into continuous and jump components can improve volatility estimation and help explain volatility dynamics (Aït-Sahalia et al., 2012; Andersen et al., 2007; Barndorff-Nielsen and Shephard, 2006; Corsi et al., 2010). For example, Fan and Wang (2007) and Zhang et al. (2016) employed the wavelet method to identify the jumps in given noisy high-frequency data.
Mancini (2004) studied a threshold method for jump detection and presented the order of an optimal threshold, and Jacod et al. (2009) introduced the jump robust pre-averaging realized volatility (PRV) estimator. We call these estimators realized volatility.

There have been several recent attempts to combine low-frequency GARCH and stochastic volatility (SV) models with high-frequency realized volatilities. Examples include the realized volatility based modeling approaches (Andersen and Bollerslev, 1997a,b, 1998a,b; Andersen et al., 2003), the heterogeneous auto-regressive (HAR) models (Corsi, 2009), the high-frequency based volatility (HEAVY) models (Shephard and Sheppard, 2010), the realized GARCH models (Hansen et al., 2012), and the unified GARCH/SV-Itô models (Kim and Fan, 2019; Kim and Wang, 2016; Kim et al., 2020; Song et al., 2020). The realized volatility based models, such as HAR, HEAVY, and realized GARCH models, take reduced ARFIMA forms to model and forecast realized volatilities estimated from high-frequency data. The unified GARCH/SV-Itô models provide a theoretical platform to reconcile low-frequency GARCH/SV volatility representations and high-frequency volatility processes, and they harness realized volatilities and GARCH/SV models to yield better, albeit more complicated, modeling and inference for combining low- and high-frequency data. Empirical studies have shown that, with realized volatility as a part of the innovation, volatility models can better capture market dynamics. However, because high-frequency data are usually available only during trading hours, such as the open-to-close period, the high-frequency volatility models often include open-to-close integrated volatility in the innovation and ignore the overnight risk (Corsi, 2009; Kim and Wang, 2016; Kim et al., 2020; Song et al., 2020).
Taylor (2007) showed that the overnight information is important for evaluating risk management models, so the volatility measured by the open-to-close high-frequency observations may significantly understate the risk. Furthermore, the overnight risk is often severe, for example, during the European debt crisis and the Asian financial crisis, and so it is an important factor that accounts for market dynamics. From this point of view, there are several studies on the impact of overnight returns on volatility and on modeling the volatility process using overnight returns and realized volatility. Hansen and Lunde (2005) studied optimal incorporation of the overnight information and proposed inverse weighting of the realized volatility and squared overnight returns by using the corresponding variance estimates. Andersen et al. (2011) modeled the overnight returns using an augmented GARCH type structure. See also Martens (2002); Todorova and Souček (2014); Tseng et al. (2012) for more information on the impact of overnight volatility. These studies document an increasing interest in developing Itô process-based models that provide a rigorous mathematical formulation for using both open-to-close high-frequency data and close-to-open low-frequency data to analyze whole-day market dynamics.

In this paper, we develop an instantaneous volatility model for a whole-day period. The whole day is broken down into two time periods, the open-to-close and close-to-open periods. During the open-to-close period, we observe high-frequency trading data, whereas during the close-to-open period, we observe low-frequency close and open prices. To reflect this structural difference, we develop two different instantaneous volatility processes for the open-to-close and close-to-open periods.
For example, for the open-to-close period, we use the current integrated volatility as an innovation to reflect the market dynamics immediately, which helps to adapt to rapid changes in the volatility process, as occurs in the high-frequency volatility models (Corsi, 2009; Hansen et al., 2012; Shephard and Sheppard, 2010; Song et al., 2020). For the close-to-open period, we employ the current squared log-return as an innovation, which brings us back to the discrete-time GARCH model for the close-to-open period. The proposed structure implies that the conditional expected volatility for the whole-day period is a function of past open-to-close integrated volatilities and squared close-to-open log-returns. We call this volatility model the overnight GARCH-Itô (OGI) model. Moreover, to estimate its model parameters, we develop a quasi-likelihood estimation procedure. Specifically, for the open-to-close period, we employ realized volatilities as a proxy for the corresponding conditional expected volatilities, whereas for the close-to-open period, we adopt squared close-to-open log-returns as a proxy for the corresponding conditional expected volatilities. These proxies have heterogeneous variances that are related to the accuracy of the proxies. To reflect this, we calculate their variances and assign different weights to each proxy. As a result, the proposed estimation method takes the form of weighted least squares. We apply the overnight GARCH-Itô model to a VaR study.

The rest of this paper is organized as follows: Section 2 introduces the overnight GARCH-Itô model and discusses its properties. Section 3 proposes a weighted least squares estimation method and investigates its asymptotic properties. Section 4 conducts a simulation study to check the finite sample performance of the proposed estimation method. Section 5 applies the proposed overnight GARCH-Itô model and method to real trading data. We collect the proofs in Appendix A.
In this section, we develop an Itô diffusion process to capture the whole-day market dynamics. To separate the parameters for the high-frequency period (open-to-close) and low-frequency period (close-to-open), we use the subscript or superscript $H$ and $L$, respectively. For the low-frequency GARCH volatility related parameter, we use superscript $g$.

Definition 1. We call the log-price $X_t$ an overnight GARCH-Itô (OGI) process if it satisfies

$$dX_t = \mu_t\,dt + \sigma_t(\theta)\,dB_t + J_t\,d\Lambda_t,$$

$$\mu_t = \begin{cases} \mu_H, & \text{if } t \in ([t], [t]+\lambda], \\ \mu_L, & \text{if } t \in [[t]+\lambda, [t]+1], \end{cases}$$

$$\sigma_t^2(\theta) = \begin{cases}
\sigma_{[t]}^2(\theta) + \dfrac{(t-[t])^2}{\lambda^2}\left(\omega_{H1} + \gamma_H \sigma_{[t]}^2(\theta)\right) - \dfrac{t-[t]}{\lambda}\left(\omega_{H2} + \sigma_{[t]}^2(\theta)\right) \\
\quad + \beta_H \dfrac{(t-[t])([t]+\lambda-t)}{\lambda^2(1-\lambda)} \displaystyle\sum_{j=1}^{\infty} \gamma^{j-1} \left( \int_{[t]+\lambda-j}^{[t]+1-j} \sigma_s(\theta)\,dB_s \right)^2 \\
\quad + \dfrac{\alpha_H}{\lambda} \displaystyle\int_{[t]}^{t} \sigma_s^2(\theta)\,ds + \dfrac{\nu_H}{\lambda^2}\,([t]+\lambda-t)\,(Z_t^H)^2, & \text{if } t \in ([t], [t]+\lambda], \\[1.5ex]
\sigma_{[t]+\lambda}^2(\theta) + \dfrac{t-[t]-\lambda}{1-\lambda}\left(\omega_L + (\gamma_L - 1)\sigma_{[t]+\lambda}^2(\theta)\right) \\
\quad + \alpha_L \dfrac{(t-[t]-\lambda)([t]+1-t)}{(1-\lambda)^2\lambda} \displaystyle\sum_{j=1}^{\infty} \gamma^{j-1} \int_{[t]+1-j}^{[t]+1-j+\lambda} \sigma_s^2(\theta)\,ds \\
\quad + \dfrac{\beta_L}{1-\lambda} \left( \displaystyle\int_{[t]+\lambda}^{t} \sigma_s(\theta)\,dB_s \right)^2 + \dfrac{\nu_L}{(1-\lambda)^2}\,([t]+1-t)\,(Z_t^L)^2, & \text{if } t \in [[t]+\lambda, [t]+1],
\end{cases}$$

where $[t]$ denotes the integer part of $t$ except that $[t] = t-1$ when $t$ is an integer, $\lambda$ is the time length of the open-to-close period, $Z_t^H = \int_{[t]}^{t} dW_s$, $Z_t^L = \int_{\lambda+[t]}^{t} dW_s$, $dW_t\,dB_t = 0$ a.s., $\gamma = \gamma_H \gamma_L$, and $\theta = (\omega_{H1}, \omega_{H2}, \omega_L, \gamma_H, \gamma_L, \alpha_H, \alpha_L, \beta_H, \beta_L, \nu_H, \nu_L, \mu_H, \mu_L)$ is the vector of model parameters. For the jump part, $\Lambda_t$ is the standard Poisson process with constant intensity $\mu_J$, and the jump sizes $J_t$ are independent of the continuous diffusion processes.
Furthermore, the jump size $J_t$ is equal to zero for the close-to-open period.

The instantaneous volatility process of the OGI model is continuous with respect to time. For the open-to-close period, that is, $[t] \le t \le [t]+\lambda$, the instantaneous volatility process reflects the market risk via the current integrated volatility and past squared overnight returns, whereas for the close-to-open period, $[t]+\lambda \le t \le [t]+1$, the instantaneous volatility process utilizes the current log-return and past open-to-close integrated volatilities to express the market risk. Specifically, the past risk factors are calculated through exponentially weighted averages with weights of order $\gamma^{j-1}$. Furthermore, to account for the U-shape pattern of the intra-day volatility process (Admati and Pfleiderer, 1988; Andersen and Bollerslev, 1997b; Andersen et al., 2019; Hong and Wang, 2000), the instantaneous volatility process has quadratic terms with respect to time $t$. Thus, with appropriate choices of $\omega_{H1}$ and $\omega_{H2}$, the OGI model can explain the U-shape pattern. At the market open time, the instantaneous volatility process has the following GARCH structure:

$$\sigma_n^2(\theta) = \omega_L + \gamma_L(\omega_{H1} - \omega_{H2}) + \gamma \sigma_{n-1}^2(\theta) + \frac{\gamma_L \alpha_H}{\lambda} \int_{n-1}^{\lambda+n-1} \sigma_t^2(\theta)\,dt + \frac{\beta_L}{1-\lambda} \left( X_n - X_{\lambda+n-1} - (1-\lambda)\mu_L \right)^2,$$

where $n$ is an integer, and at the market close time,

$$\sigma_{n+\lambda}^2(\theta) = \omega_{H1} - \omega_{H2} + \gamma_H \omega_L + \gamma \sigma_{\lambda+n-1}^2(\theta) + \frac{\alpha_H}{\lambda} \int_{n}^{n+\lambda} \sigma_t^2(\theta)\,dt + \frac{\gamma_H \beta_L}{1-\lambda} \left( X_n - X_{\lambda+n-1} - (1-\lambda)\mu_L \right)^2. \tag{2.1}$$

Thus, the instantaneous volatility process is a quadratic interpolation of the GARCH volatility with the open-to-close integrated volatility and squared close-to-open log-return as the innovations. To account for the random fluctuations of the instantaneous volatilities, we introduce $Z_t^H$ and $Z_t^L$ with the scale parameters $\nu_H$ and $\nu_L$.
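To make the U-shape remark concrete, the following minimal sketch evaluates only the deterministic time terms of the open-to-close branch, $\sigma_{[t]}^2 + s^2(\omega_{H1} + \gamma_H \sigma_{[t]}^2) - s(\omega_{H2} + \sigma_{[t]}^2)$ with $s = (t-[t])/\lambda$, as we read them from Definition 1. The innovation and random-fluctuation terms are dropped, and the function name and all parameter values are illustrative assumptions, not estimates from the paper.

```python
import numpy as np


def intraday_deterministic_part(s, sigma2_open, omega_h1, gamma_h, omega_h2):
    """Deterministic time terms of the open-to-close variance at s in [0, 1].

    s = (t - [t]) / lambda is the elapsed fraction of the trading session.
    The innovation and random-fluctuation terms are deliberately omitted.
    """
    s = np.asarray(s, dtype=float)
    quadratic = s**2 * (omega_h1 + gamma_h * sigma2_open)
    linear = s * (omega_h2 + sigma2_open)
    return sigma2_open + quadratic - linear


# Hypothetical parameter values, chosen only to produce an interior minimum.
grid = np.linspace(0.0, 1.0, 101)
curve = intraday_deterministic_part(
    grid, sigma2_open=1.0, omega_h1=1.5, gamma_h=0.5, omega_h2=0.8
)
i_min = int(np.argmin(curve))
print(f"open={curve[0]:.3f}, min={curve[i_min]:.3f} at s={grid[i_min]:.2f}, "
      f"close={curve[-1]:.3f}")
```

With these values the curve starts at the opening variance, dips in the middle of the session, and ends at $\omega_{H1} - \omega_{H2} + \gamma_H \sigma_{[t]}^2$, which is the deterministic part of the close-time value.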
When considering only one of the open-to-close and close-to-open periods and ignoring the other period, the OGI model recovers the realized GARCH-Itô process (Hansen et al., 2012; Song et al., 2020) or the unified GARCH-Itô process (Kim and Wang, 2016). Thus, unlike the proposed OGI model, these models incorporate only one innovation term, either the integrated volatility or the squared log-return, in their conditional volatility.

Because our main interest lies in measuring the whole-day risk, to estimate the model parameters, we use nonparametric integrated volatility estimators (Barndorff-Nielsen et al., 2008; Jacod et al., 2009; Xiu, 2010; Zhang, 2006) and squared log-returns as proxies for the parametric conditional expected integrated volatility. Thus, it is important to investigate the properties of the integrated volatility of the proposed OGI model. The following theorem shows the properties of the integrated volatilities.

Theorem 1.

For the OGI model, we have the following properties.

(a) The integrated volatilities have the following structure. For $0 < \alpha_H < 1$, $0 < \beta_L < 1$, and $n \in \mathbb{N}$, we have

$$\int_{n-1}^{n} \sigma_t^2(\theta)\,dt = h_n(\theta) + D_n \quad \text{a.s.}, \tag{2.2}$$

$$\int_{n-1}^{\lambda+n-1} \sigma_t^2(\theta)\,dt = \lambda h_n^H(\theta) + D_n^H \quad \text{a.s.}, \tag{2.3}$$

$$\int_{\lambda+n-1}^{n} \sigma_t^2(\theta)\,dt = (1-\lambda) h_n^L(\theta) + D_n^L \quad \text{a.s.}, \tag{2.4}$$

where

$$h_n(\theta) = \omega^g + \gamma h_{n-1}(\theta) + \frac{\alpha^g}{\lambda} \int_{n-2}^{\lambda+n-2} \sigma_t^2(\theta)\,dt + \frac{\beta^g}{1-\lambda} \left( X_{n-1} - X_{\lambda+n-2} - (1-\lambda)\mu_L \right)^2,$$

$$h_n^H(\theta) = \omega_H^g + \gamma h_{n-1}^H(\theta) + \frac{\alpha_H^g}{\lambda} \int_{n-2}^{\lambda+n-2} \sigma_t^2(\theta)\,dt + \frac{\beta_H^g}{1-\lambda} \left( X_{n-1} - X_{\lambda+n-2} - (1-\lambda)\mu_L \right)^2,$$

$$h_n^L(\theta) = \omega_L^g + \gamma h_{n-1}^L(\theta) + \frac{\alpha_L^g}{\lambda} \int_{n-2}^{\lambda+n-2} \sigma_t^2(\theta)\,dt + \frac{\beta_L^g}{1-\lambda} \left( X_{n-1} - X_{\lambda+n-2} - (1-\lambda)\mu_L \right)^2,$$

$D_n$, $D_n^H$, $D_n^L$ are martingale differences, and $\omega^g, \gamma, \alpha^g, \beta^g, \omega_H^g, \alpha_H^g, \beta_H^g, \omega_L^g, \alpha_L^g, \beta_L^g$ are functions of $\theta$. Their detailed forms are defined in Theorem 3.

(b) We have

$$E\left[ (D_n^H)^2 \,\middle|\, \mathcal{F}_{n-1} \right] = \varphi_n^H(\theta) = \lambda^2 \nu_H^g \quad \text{a.s.},$$

$$E\left[ (D_n^{LL})^2 \,\middle|\, \mathcal{F}_{n-1} \right] = \varphi_n^L(\theta) = F_{\beta_L,1}\, s_{n-1}^2(\theta) + F_{\beta_L,2}\, \omega_L s_{n-1}(\theta) + F_{\beta_L,3}\, \omega_L^2 + (1-\lambda)^2 \nu_L^g \quad \text{a.s.},$$

where $D_n^{LL} = D_n^L + 2 \int_{\lambda+n-1}^{n} (X_t - X_{\lambda+n-1}) \sigma_t(\theta)\,dB_t$, $\nu_H^g$ and $\nu_L^g$ are defined in (A.2) and (A.7), respectively, $s_{n-1}(\theta)$ is defined in (A.6), and the $F_{\beta_L,i}$ are functions of $\beta_L$ defined in (A.3).

Theorem 1(a) shows that the integrated volatility can be decomposed into the GARCH volatility and a martingale difference. This structure implies that the daily conditional expected volatility is a function of the past open-to-close integrated volatilities and squared close-to-open log-returns.
That is, under the OGI process, the market dynamics can be explained by the open-to-close integrated volatility and squared close-to-open log-returns, which represent volatilities for the open-to-close and close-to-open periods, respectively. Thanks to these two different volatility sources, we expect the proposed OGI model to capture the market dynamics well. In the empirical study, we find that the integrated volatilities and squared log-returns help to account for the market dynamics (see Section 5).

As we discussed above, we estimate the model parameters via the relationship between the conditional GARCH volatilities, $h_n^H(\theta)$, $h_n^L(\theta)$, and $h_n(\theta)$, and the corresponding integrated volatility or squared log-return. Thus, to study the low-frequency volatility dynamics, we only need Theorem 1(a). That is, we develop the rest of the paper under the model assumptions (2.2)–(2.4). In comparison with direct volatility modeling based on realized volatility, such as HAR, HEAVY, and realized GARCH models (Andersen et al., 2003; Corsi, 2009; Hansen et al., 2012; Shephard and Sheppard, 2010), the unified GARCH-Itô model and OGI model may be more difficult or even less practical for drawing statistical inferences from combined low- and high-frequency data. However, like the unified GARCH-Itô model case, the OGI approach indicates the existence of the diffusion process, which satisfies the conditions (2.2)–(2.4) and fills the gap between the low-frequency discrete time series volatility modeling and the high-frequency continuous time diffusion process. Because the purpose of this paper is to develop diffusion processes that can account for the low-frequency market dynamics, the parameter of interest is the GARCH parameter $\theta^g = (\omega_H^g, \omega_L^g, \gamma, \alpha_H^g, \alpha_L^g, \beta_H^g, \beta_L^g)$.
We notice that, under the model assumption, we need the common $\gamma$ condition for the open-to-close and close-to-open conditional volatilities to have the GARCH conditional volatility form for $h_n^H(\theta)$, $h_n^L(\theta)$, and $h_n(\theta)$. When it comes to estimating GARCH parameters, we assume that the open-to-close and close-to-open volatilities have different dynamic structures, so we make inferences for $h_n^H(\theta)$ and $h_n^L(\theta)$ separately under the common $\gamma$ condition. Details can be found in Section 3.

3 Estimation procedure

We assume that the underlying diffusion process follows the OGI process defined in Definition 1. The high-frequency observations during the $d$th open-to-close period are observed at $t_{d,i}$, $i = 1, \ldots, m_d$, where $d-1 = t_{d,0} < t_{d,1} < \cdots < t_{d,m_d} = \lambda + d - 1$. Let $m$ be the average number of the high-frequency observations, that is, $m = \frac{1}{n} \sum_{d=1}^{n} m_d$. Due to market inefficiencies, such as the bid-ask spread and asymmetric information, the high-frequency data are masked by the microstructure noise. To account for this, we assume that the observed log-prices during the open-to-close period have the following additive noise structure:

$$Y_{t_{d,i}} = X_{t_{d,i}} + \epsilon_{t_{d,i}}, \quad \text{for } d = 1, \ldots, n, \; i = 1, \ldots, m_d - 1,$$

where $X_t$ is the true log-price, $\epsilon_{t_{d,i}}$ is microstructure noise with mean zero and variance $\eta_d$, and the log-price and microstructure noise are independent. The low-frequency drifts $\mu_H$ and $\mu_L$ can be easily estimated by the sample means of open-to-close and close-to-open log-returns. Furthermore, the effect of $\mu_t$ is negligible regarding high-frequency realized volatility estimators. Thus, for simplicity, we assume $\mu_t = 0$ in Definition 1. In contrast, during the close-to-open period, we only observe the low-frequency observations, that is, the open and close prices. In low-frequency time series modeling, we often assume that the true low-frequency observations are observed. In practice, the microstructure noise may exist in the low-frequency observations, but its impact on the low-frequency modeling is relatively small. Thus, we also assume that the true low-frequency observations, the open and close prices $X_d$ and $X_{\lambda+d}$, are observed at the open and close times, $t_{d+1,0}$ and $t_{d+1,m_{d+1}}$.

Remark 1.

For the microstructure noise, we may need a stationary condition to estimate the integrated volatility with the optimal convergence rate $m^{-1/4}$ (Andersen et al., 2012, 2014; Barndorff-Nielsen et al., 2008; Fan and Kim, 2018; Jacod et al., 2009; Kim et al., 2016; Zhang, 2006). For example, we may impose an ARMA-type structure on the microstructure noise and assume some dependence between the price processes and the microstructure noise. However, in this paper, we directly adopt a well-performing nonparametric realized volatility estimator, which can be obtained under certain structures of the microstructure noise without affecting the volatility modeling. Thus, we can put such structures on the microstructure noise, as long as we can secure the well-performing realized volatility estimator.

We first fix some notation. For any given vector $b = (b_i)_{i=1,\ldots,k}$, we define $\|b\|_{\max} = \max_i |b_i|$. Let $C$'s be positive generic constants whose values are independent of $\theta$, $n$, and $m$ and may change from occurrence to occurrence. In this section, we develop an estimation procedure for the GARCH parameters, $\theta^g = (\omega_H^g, \omega_L^g, \gamma, \alpha_H^g, \alpha_L^g, \beta_H^g, \beta_L^g)$, which are the minimum required parameters to evaluate the GARCH volatilities defined in Theorem 1, where elements of $\theta^g$ are defined in Theorem 3. We denote the true GARCH parameter by $\theta_0^g = (\omega_{H,0}^g, \omega_{L,0}^g, \gamma_0, \alpha_{H,0}^g, \alpha_{L,0}^g, \beta_{H,0}^g, \beta_{L,0}^g)$.

Theorem 1 indicates that integrated volatilities can be decomposed into the GARCH volatility terms $h_n^H(\theta_0^g)$ and $h_n^L(\theta_0^g)$, and the martingale difference terms $D_n^H$ and $D_n^L$. This fact inspires us to use the integrated volatilities as proxies of the GARCH volatilities. Then, as the sample period goes to infinity, the martingale convergence theorem may provide consistency of the estimators. However, the integrated volatilities are not observable, so we first need to estimate them.
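The need for a noise-robust estimator can be illustrated with a small simulation. The sketch below is built on simplifying assumptions that are ours, not the paper's: constant volatility over a unit-length session, i.i.d. Gaussian noise added at every tick (the paper treats the open and close prices as noise free), and the two-time scale estimator (TSRV) of Zhang et al. (2005) standing in for the PRV estimator used in the numerical study. The function names, the subsampling scale `K`, and all numerical values are hypothetical.

```python
import numpy as np


def naive_rv(y):
    """Sum of squared tick-by-tick returns (not robust to noise)."""
    return float(np.sum(np.diff(y) ** 2))


def tsrv(y, K):
    """Two-time scale realized volatility with subsampling scale K."""
    n = len(y) - 1                                   # number of returns
    rv_all = naive_rv(y)                             # fast time scale
    rv_avg = float(np.sum((y[K:] - y[:-K]) ** 2)) / K  # averaged slow scale
    n_bar = (n - K + 1) / K
    return rv_avg - (n_bar / n) * rv_all             # noise-bias correction


def simulate_noisy_session(m, sigma2, eta, rng):
    """True log-price x with integrated variance sigma2, and noisy y."""
    dx = rng.normal(0.0, np.sqrt(sigma2 / m), size=m)
    x = np.concatenate(([0.0], np.cumsum(dx)))
    y = x + rng.normal(0.0, np.sqrt(eta), size=m + 1)
    return x, y


rng = np.random.default_rng(0)
iv = 1e-4                                            # true integrated variance
x, y = simulate_noisy_session(m=23400, sigma2=iv, eta=2.5e-7, rng=rng)

print("naive RV on true prices :", naive_rv(x))
print("naive RV on noisy prices:", naive_rv(y))
print("TSRV on noisy prices    :", tsrv(y, K=300))
```

Under i.i.d. noise the naive sum of squared returns has expectation close to the integrated variance plus $2m\eta$, so it is dominated by noise at this sampling frequency, whereas the two-scale correction removes that bias.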
For the open-to-close period, we use the high-frequency observations to estimate the open-to-close integrated volatility nonparametrically (Aït-Sahalia et al., 2012; Andersen et al., 2007; Barndorff-Nielsen et al., 2008; Corsi et al., 2010; Fan and Wang, 2007; Jacod et al., 2009; Xiu, 2010; Zhang, 2006; Zhang et al., 2016), and we call these nonparametric estimators "realized volatility." Under mild conditions, we can show that realized volatility converges to integrated volatility with the optimal convergence rate $m^{-1/4}$ (Barndorff-Nielsen et al., 2008; Jacod et al., 2009; Kim et al., 2016; Tao et al., 2013; Xiu, 2010; Zhang, 2006). In the numerical study, we employ the jump robust pre-averaging realized volatility (PRV) estimator (Aït-Sahalia and Xiu, 2016; Jacod et al., 2009). However, for the close-to-open period, high-frequency data are not available, so we use the squared close-to-open return as the proxy. Note that Itô's lemma indicates

$$(X_n - X_{\lambda+n-1})^2 = \int_{\lambda+n-1}^{n} \sigma_t^2(\theta)\,dt + 2 \int_{\lambda+n-1}^{n} (X_t - X_{\lambda+n-1}) \sigma_t(\theta)\,dB_t \quad \text{a.s.}$$

This implies that the squared close-to-open return can also be decomposed into the GARCH volatility and a martingale difference. That is, we have the following relationships:

$$\int_{n-1}^{\lambda+n-1} \sigma_t^2(\theta)\,dt = \lambda h_n^H(\theta_0^g) + D_n^H \quad \text{a.s.},$$

$$(X_n - X_{\lambda+n-1})^2 = (1-\lambda) h_n^L(\theta_0^g) + D_n^{LL} \quad \text{a.s.},$$

where $D_n^{LL} = D_n^L + 2 \int_{\lambda+n-1}^{n} (X_t - X_{\lambda+n-1}) \sigma_t(\theta)\,dB_t$. We use the above relationships to estimate the GARCH parameter $\theta_0^g$.

The variances of the martingale differences $D_n^H$ and $D_n^{LL}$ indicate the accuracy of the GARCH volatility information coming from the proxies $\int_{n-1}^{\lambda+n-1} \sigma_t^2(\theta)\,dt$ and $(X_n - X_{\lambda+n-1})^2$, so the proxy with the smaller variance is closer to the corresponding GARCH volatility. Thus, by incorporating the variance information into the estimation procedure, we expect to improve its performance.
For example, we can standardize the proxies as follows:

$$\frac{\left( \int_{n-1}^{\lambda+n-1} \sigma_t^2(\theta)\,dt - \lambda h_n^H(\theta_0^g) \right)^2}{E\left[ (D_n^H)^2 \right]}, \qquad \frac{\left( (X_n - X_{\lambda+n-1})^2 - (1-\lambda) h_n^L(\theta_0^g) \right)^2}{E\left[ (D_n^{LL})^2 \right]}.$$

Both standardized quantities have unit expectation, which helps to assign a larger weight to a more accurate proxy. In the empirical study, we find that the variance of the integrated volatilities is smaller than that of the squared close-to-open returns. That is, the open-to-close proxy is more accurate, so we make more use of the information from the open-to-close period by assigning it a larger weight. To compare the proxies and GARCH volatilities, we employ the weighted least squares estimation as follows:

$$L_n(\theta^g) = -\frac{1}{2n} \sum_{i=1}^{n} \left[ \frac{\left( IV_i - \lambda h_i^H(\theta^g) \right)^2}{\widehat{\phi}_H} + \frac{\left( (X_i - X_{\lambda+i-1})^2 - (1-\lambda) h_i^L(\theta^g) \right)^2}{\widehat{\phi}_L} \right],$$

where the GARCH volatility terms $h_i^H(\theta^g)$ and $h_i^L(\theta^g)$ are defined in Theorem 1, $IV_i = \int_{i-1}^{\lambda+i-1} \sigma_t^2(\theta)\,dt$, and $\widehat{\phi}_H$ and $\widehat{\phi}_L$ are consistent estimators of the variances of the martingale differences $D_n^H$ and $D_n^{LL}$, respectively. To evaluate the above quasi-likelihood function, we first need to estimate the integrated volatility $IV_i$. It can be estimated by the realized volatility estimator, which is denoted by $RV_i$. Then we estimate the GARCH volatilities as follows:

$$\widehat{h}_n^H(\theta^g) = \omega_H^g + \gamma \widehat{h}_{n-1}^H(\theta^g) + \frac{\alpha_H^g}{\lambda} RV_{n-1} + \frac{\beta_H^g}{1-\lambda} (X_{n-1} - X_{\lambda+n-2})^2, \tag{3.1}$$

$$\widehat{h}_n^L(\theta^g) = \omega_L^g + \gamma \widehat{h}_{n-1}^L(\theta^g) + \frac{\alpha_L^g}{\lambda} RV_{n-1} + \frac{\beta_L^g}{1-\lambda} (X_{n-1} - X_{\lambda+n-2})^2. \tag{3.2}$$

To evaluate the GARCH volatilities, we use $RV_1$ and the sample variance of the close-to-open log-returns as the initial values $h_1^H(\theta^g)$ and $h_1^L(\theta^g)$, respectively. The effect of the initial value has the negligible order $n^{-1}$ (see Lemma 1 in Kim and Wang (2016)), so its choice does not significantly affect the parameter estimation.
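The recursions for the estimated GARCH volatilities can be sketched in a few lines. This is a minimal illustration under our own conventions: `rv[i]` holds the realized volatility of day `i`, `r2_overnight[i]` holds the squared close-to-open return that ends day `i`, and the function name, argument order, and toy numbers are assumptions made for the example.

```python
import numpy as np


def garch_volatility_path(omega, gamma, alpha, beta, rv, r2_overnight, lam, h_init):
    """Filter h_i = omega + gamma * h_{i-1} + (alpha / lam) * rv_{i-1}
    + (beta / (1 - lam)) * r2_overnight_{i-1}, starting from h_init.

    The same function serves the open-to-close and close-to-open recursions;
    only the parameter values and the initial value differ.
    """
    rv = np.asarray(rv, dtype=float)
    r2_overnight = np.asarray(r2_overnight, dtype=float)
    h = np.empty(len(rv))
    h[0] = h_init
    for i in range(1, len(rv)):
        h[i] = (omega + gamma * h[i - 1]
                + alpha / lam * rv[i - 1]
                + beta / (1.0 - lam) * r2_overnight[i - 1])
    return h


# Toy inputs (hypothetical): constant proxies, so the filter sits at its
# fixed point (omega + alpha*rv/lam + beta*r2/(1-lam)) / (1 - gamma) = 0.8.
lam = 0.25
rv = np.full(20, 0.25)
r2 = np.full(20, 0.75)
h_path = garch_volatility_path(0.1, 0.5, 0.2, 0.1, rv, r2, lam, h_init=0.8)
print(h_path[:3])
```

Because day `i` only uses proxies from day `i - 1`, the last filtered value can be rolled forward one more step to obtain a one-day-ahead prediction.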
With these estimators, we define the quasi-likelihood function as follows:

$$\widehat{L}_{n,m}(\theta^g) = -\frac{1}{2n} \sum_{i=1}^{n} \left[ \frac{\left( RV_i - \lambda \widehat{h}_i^H(\theta^g) \right)^2}{\widehat{\phi}_H} + \frac{\left( (X_i - X_{\lambda+i-1})^2 - (1-\lambda) \widehat{h}_i^L(\theta^g) \right)^2}{\widehat{\phi}_L} \right], \tag{3.3}$$

and we obtain the estimator of the GARCH parameters $\theta^g$ by maximizing the quasi-likelihood function. That is,

$$\widehat{\theta}^g = \arg\max_{\theta^g \in \Theta^g} \widehat{L}_{n,m}(\theta^g),$$

where $\Theta^g$ is the parameter space of $\theta^g$. We call the estimator the weighted least squares estimator (WLSE). To obtain the variances of martingale differences, $\widehat{\phi}_H$ and $\widehat{\phi}_L$, we employ the QMLE method as follows. We define the quasi-likelihood functions for the open-to-close and close-to-open periods, respectively, in the following manner:

$$\widehat{L}_{n,m}^H(\theta_H^g) = -\frac{1}{2n} \sum_{i=1}^{n} \left[ \log\left( \lambda \widehat{h}_i^H(\theta_H^g) \right) + \frac{RV_i}{\lambda \widehat{h}_i^H(\theta_H^g)} \right], \tag{3.4}$$

$$\widehat{L}_{n,m}^L(\theta_L^g) = -\frac{1}{2n} \sum_{i=1}^{n} \left[ \log\left( (1-\lambda) \widehat{h}_i^L(\theta_L^g) \right) + \frac{(X_i - X_{\lambda+i-1})^2}{(1-\lambda) \widehat{h}_i^L(\theta_L^g)} \right], \tag{3.5}$$

where $\theta_H^g = (\omega_H^g, \gamma, \alpha_H^g, \beta_H^g)$ and $\theta_L^g = (\omega_L^g, \gamma, \alpha_L^g, \beta_L^g)$. Then we find their maximizers, which are denoted by $\widehat{\theta}_H^g$ and $\widehat{\theta}_L^g$. Using the residuals, we estimate the variances of the martingale differences in the following way:

$$\widehat{\phi}_H = \frac{1}{n} \sum_{i=1}^{n} \left( RV_i - \lambda \widehat{h}_i^H(\widehat{\theta}_H^g) \right)^2, \qquad \widehat{\phi}_L = \frac{1}{n} \sum_{i=1}^{n} \left( (X_i - X_{\lambda+i-1})^2 - (1-\lambda) \widehat{h}_i^L(\widehat{\theta}_L^g) \right)^2.$$

Similar to the proofs of Theorems 3 and 5 in Kim and Wang (2016), we can establish their consistency.
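The two-step procedure can be sketched as three objective functions: a first-step QMLE loss per period, the residual variance that yields the weights, and the second-step weighted least squares loss. This is a hedged illustration, not the authors' code. The losses are written as negatives of the quasi-likelihoods so that they are minimized, the constant factor 1/2 is a convention that does not change the minimizer, the optimizer itself is left to the reader, and all function names and toy values are assumptions.

```python
import numpy as np


def garch_path(omega, gamma, alpha, beta, rv, r2, lam, h_init):
    """GARCH volatility filter driven by lagged RV and squared overnight returns."""
    h = np.empty(len(rv))
    h[0] = h_init
    for i in range(1, len(rv)):
        h[i] = (omega + gamma * h[i - 1] + alpha / lam * rv[i - 1]
                + beta / (1.0 - lam) * r2[i - 1])
    return h


def qmle_loss(params, proxy, scale, rv, r2, lam, h_init):
    """First step: negative Gaussian quasi-likelihood for one period.

    Use proxy=rv, scale=lam for open-to-close and
    proxy=r2, scale=1-lam for close-to-open.
    """
    v = scale * garch_path(*params, rv, r2, lam, h_init)
    return 0.5 * float(np.mean(np.log(v) + proxy / v))


def residual_variance(params, proxy, scale, rv, r2, lam, h_init):
    """Unconditional variance of proxy minus fitted volatility (the weight)."""
    v = scale * garch_path(*params, rv, r2, lam, h_init)
    return float(np.mean((proxy - v) ** 2))


def wls_loss(theta_g, rv, r2, lam, phi_h, phi_l, h_init_h, h_init_l):
    """Second step: weighted least squares loss with a common gamma."""
    omega_h, omega_l, gamma, alpha_h, alpha_l, beta_h, beta_l = theta_g
    h_h = garch_path(omega_h, gamma, alpha_h, beta_h, rv, r2, lam, h_init_h)
    h_l = garch_path(omega_l, gamma, alpha_l, beta_l, rv, r2, lam, h_init_l)
    res_h = rv - lam * h_h
    res_l = r2 - (1.0 - lam) * h_l
    return 0.5 * float(np.mean(res_h**2 / phi_h + res_l**2 / phi_l))


# Toy check (hypothetical numbers): proxies that match the filters exactly.
lam = 0.25
rv = np.full(50, 0.25)        # equals lam * 1.0
r2 = np.full(50, 1.5)         # equals (1 - lam) * 2.0
theta_true = (0.1, 0.6, 0.3, 0.4, 0.2, 0.1, 0.3)
loss_true = wls_loss(theta_true, rv, r2, lam, 1.0, 1.0, 1.0, 2.0)
theta_off = (0.2, 0.6, 0.3, 0.4, 0.2, 0.1, 0.3)
loss_off = wls_loss(theta_off, rv, r2, lam, 1.0, 1.0, 1.0, 2.0)
print(loss_true, loss_off)
```

In practice one would minimize `qmle_loss` for each period, plug the minimizers into `residual_variance` to obtain the two weights, and then minimize `wls_loss` over the constrained parameter space with any bounded numerical optimizer.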

Remark 2.

There are other possible choices of the variance of the martingale differences. For example, we can use the conditional variances in Theorem 1(b) to evaluate the quasi-likelihood function (3.3). However, the conditional variance heavily depends on the underlying OGI process, which may cause some bias when the underlying model is misspecified. Thus, to make robust inferences, we use the unconditional variance instead of the conditional variance. Furthermore, the proposed procedure has a simpler structure, which may help to reduce estimation errors. We note that the proposed two-step weighted least squares estimation procedure works well as long as the first-step variance estimators are consistent. Thus, we can easily incorporate other variance estimators. According to our empirical analysis, the unconditional variance estimator provides more stable results than the conditional variance estimator. Thus, we use the unconditional variance and only report its related results. If we can estimate the conditional variance in a robust way, it may show better performance. However, obtaining the robustness is not straightforward because we need to impose structure on the process to evaluate the conditional variance. We leave this for future study.

To establish asymptotic properties for the proposed WLSE, we need the following technical assumptions.

Assumption 1.

(1) $\theta_0^g \in \Theta^g = \{ (\omega_H^g, \omega_L^g, \gamma, \alpha_H^g, \alpha_L^g, \beta_H^g, \beta_L^g) : \omega_l < \omega_H^g, \omega_L^g < \omega_u, \; \gamma_l < \gamma < \gamma_u < 1, \; \alpha_l < \alpha_H^g, \alpha_L^g < \alpha_u, \; \beta_l < \beta_H^g, \beta_L^g < \beta_u < 1 \}$, where $\omega_l, \omega_u, \gamma_l, \gamma_u, \alpha_l, \alpha_u, \beta_l, \beta_u$ are some known positive constants.

(2) We have, for some positive constant $C$,

$$\sup_d E\left[ (X_{\lambda+d-1} - X_{d-1})^4 \right] \le C, \quad \sup_d E\left[ (X_d - X_{\lambda+d-1})^4 \right] \le C, \quad \sup_d E\left[ (D_d^{LL})^2 \right] \le C.$$

(3) We have $C_1 m \le m_d \le C_2 m$ for all $d$, and $\max_d \sup_{1 \le j \le m_d} |t_{d,j} - t_{d,j-1}| = O(m^{-1})$.

(4) For any $d \in \mathbb{N}$, $E\left( |RV_d - IV_d|^2 \right) \le C m^{-1/2}$.

(5) $\left( IV_d, (X_d - X_{\lambda+d-1})^2, D_d^H, D_d^{LL} \right)$ is a stationary ergodic process.

(6) $|\widehat{\phi}_H - \phi_H| = o_p(1)$ and $|\widehat{\phi}_L - \phi_L| = o_p(1)$, where $\phi_H = E\left[ (D_n^H)^2 \right]$ and $\phi_L = E\left[ (D_n^{LL})^2 \right]$.

Remark 3.

Assumption 1(2) is about the finite 4th moment condition, which is the minimum requirement when handling the second moment target parameter. Under some finite 4th moment conditions, Assumption 1(4) is satisfied (Kim et al., 2018; Tao et al., 2013). However, when there is a jump part in the diffusion process, this condition may be violated. In this case, we need to employ some jump robust realized volatility (Aït-Sahalia and Xiu, 2016; Zhang et al., 2016) and derive some uniform convergence with respect to time $d$. Finally, Assumption 1(5) is required to derive an asymptotic normal distribution of the proposed WLSE.

The following theorem investigates the asymptotic behaviors of the proposed WLSE $\widehat{\theta}^g$.

Theorem 2.

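Under Assumption 1, we have

$$\| \widehat{\theta}^g - \theta_0^g \|_{\max} = O_p\left( m^{-1/4} + n^{-1/2} \right). \tag{3.6}$$

Furthermore, we suppose that $n m^{-1/2} \to 0$ as $m, n \to \infty$. Then we have

$$\sqrt{n} \left( \widehat{\theta}^g - \theta_0^g \right) \xrightarrow{d} N\left( 0, A^{-1} B A^{-1} \right), \tag{3.7}$$

where

$$A = E\left[ \frac{\lambda^2}{\phi_H} \frac{\partial h_1^H(\theta^g)}{\partial \theta^g} \frac{\partial h_1^H(\theta^g)}{\partial \theta^{g\top}} \bigg|_{\theta^g = \theta_0^g} + \frac{(1-\lambda)^2}{\phi_L} \frac{\partial h_1^L(\theta^g)}{\partial \theta^g} \frac{\partial h_1^L(\theta^g)}{\partial \theta^{g\top}} \bigg|_{\theta^g = \theta_0^g} \right],$$

$$B = E\left[ \frac{\lambda^2 \varphi_1^H(\theta_0)}{\phi_H^2} \frac{\partial h_1^H(\theta^g)}{\partial \theta^g} \frac{\partial h_1^H(\theta^g)}{\partial \theta^{g\top}} \bigg|_{\theta^g = \theta_0^g} + \frac{(1-\lambda)^2 \varphi_1^L(\theta_0)}{\phi_L^2} \frac{\partial h_1^L(\theta^g)}{\partial \theta^g} \frac{\partial h_1^L(\theta^g)}{\partial \theta^{g\top}} \bigg|_{\theta^g = \theta_0^g} \right],$$

and $\varphi_i^H(\theta)$ and $\varphi_i^L(\theta)$ are defined in Theorem 1(b).

Remark 4.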

Theorem 2 shows that the WLSE $\widehat{\theta}^g$ has the convergence rate $m^{-1/4} + n^{-1/2}$. The first term, $m^{-1/4}$, comes from estimating the integrated volatility, which is known as the optimal convergence rate in the case of high-frequency data with the presence of the microstructure noise. The second term, $n^{-1/2}$, is the usual convergence rate in the low-frequency data case. Under the ergodic assumption, we also derive the asymptotic normality.

Remark 5.

To derive the asymptotic normality, we need the condition $n m^{-1/2} \to 0$, which may be too restrictive for a long sample period. If this condition is violated, the asymptotic normality may depend on $m^{1/4}(RV_d - IV_d)$, which is the quantity related to high-frequency estimation. If this term is some martingale difference, we may be able to relax the condition to, for example, $n m^{-1} \to 0$. In this case, $m$ is usually huge, so the condition is not restrictive.

One of our objectives in this paper is to predict future volatility. The best predictor given the current available information $\mathcal{F}_n$ is the conditional expected volatility, that is, the GARCH volatility $h_{n+1}(\theta_0^g)$. With the model parameter estimator, we estimate the GARCH volatility as follows:

$$\widehat{h}_{n+1}(\widehat{\theta}^g) = \widehat{\omega}^g + \widehat{\gamma}\, \widehat{h}_n(\widehat{\theta}^g) + \widehat{\alpha}^g \lambda^{-1} RV_n + \widehat{\beta}^g (1-\lambda)^{-1} (X_n - X_{\lambda+n-1})^2,$$

where the GARCH parameters $\widehat{\omega}^g$, $\widehat{\alpha}^g$, and $\widehat{\beta}^g$ are estimated using the plug-in method with the WLSE $\widehat{\theta}^g$. The following corollary provides the consistency of the GARCH volatility estimator.

Corollary 1.

Under the assumptions of Theorem 2 (except for $nm^{-1/2} \to 0$), we have
$$|\widehat{h}_{n+1}(\widehat{\theta}^g) - h_{n+1}(\theta^g_0)| = O_p(n^{-1/2} + m^{-1/4}).$$

In financial practice, we are interested in the GARCH parameters $(\omega^g, \gamma, \alpha^g, \beta^g)$ and often make statistical inferences about them, such as hypothesis tests. In this section, we discuss how to conduct hypothesis tests for the GARCH parameters.

We first derive the asymptotic distribution of the GARCH parameter estimators. Theorem 2 implies that
$$\sqrt{n}(\widehat{\theta}^g - \theta^g_0) \xrightarrow{d} N(0, A^{-1} B A^{-1}),$$
where
$$A = E\left[ \frac{\lambda^2}{\phi_H} \frac{\partial h^H_1(\theta^g)}{\partial \theta^g} \frac{\partial h^H_1(\theta^g)}{\partial \theta^g}^{\top} \Big|_{\theta^g = \theta^g_0} + \frac{(1-\lambda)^2}{\phi_L} \frac{\partial h^L_1(\theta^g)}{\partial \theta^g} \frac{\partial h^L_1(\theta^g)}{\partial \theta^g}^{\top} \Big|_{\theta^g = \theta^g_0} \right],$$
$$B = E\left[ \frac{\lambda^2 \varphi_{H1}(\theta_0)}{\phi_H^2} \frac{\partial h^H_1(\theta^g)}{\partial \theta^g} \frac{\partial h^H_1(\theta^g)}{\partial \theta^g}^{\top} \Big|_{\theta^g = \theta^g_0} + \frac{(1-\lambda)^2 \varphi_{L1}(\theta_0)}{\phi_L^2} \frac{\partial h^L_1(\theta^g)}{\partial \theta^g} \frac{\partial h^L_1(\theta^g)}{\partial \theta^g}^{\top} \Big|_{\theta^g = \theta^g_0} \right].$$
The GARCH parameters are functions of $\theta$. For example,
$$\omega^g = \lambda \omega^g_H + (1-\lambda) \omega^g_L, \quad \alpha^g = \lambda \alpha^g_H + (1-\lambda) \alpha^g_L, \quad \beta^g = \lambda \beta^g_H + (1-\lambda) \beta^g_L,$$
where $\alpha^g_H$, $\alpha^g_L$, $\beta^g_H$, $\beta^g_L$ are defined in Theorem 3. Thus, using the delta method and Slutsky's theorem, we can show that when $\frac{\partial f(\theta^g)}{\partial \theta^g}\big|_{\theta^g = \theta^g_0} \neq 0$,
$$T_{f,n} = \frac{\sqrt{n}\,\bigl(f(\widehat{\theta}^g) - f(\theta^g_0)\bigr)}{\sqrt{\nabla f(\widehat{\theta}^g)^{\top} \bigl(\widehat{A}^{-1} \widehat{B} \widehat{A}^{-1}\bigr) \nabla f(\widehat{\theta}^g)}} \xrightarrow{d} N(0, 1), \tag{3.8}$$
where $\nabla f(\widehat{\theta}^g) = \frac{\partial f(\theta^g)}{\partial \theta^g}\big|_{\theta^g = \widehat{\theta}^g}$, and $\widehat{A}$ and $\widehat{B}$ are consistent estimators of $A$ and $B$, respectively. To evaluate the asymptotic variances of the GARCH parameter estimators, we first need to estimate $A$ and $B$.
We use the following estimators:
$$\widehat{A}(\theta^g) = -\frac{\partial^2 \widehat{L}_{n,m}(\theta^g)}{\partial \theta^g \partial \theta^{g\top}} \quad \text{and} \quad \widehat{B}(\theta^g) = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \widehat{l}_i(\theta^g)}{\partial \theta^g} \frac{\partial \widehat{l}_i(\theta^g)}{\partial \theta^g}^{\top},$$
where
$$\widehat{l}_i(\theta^g) = \frac{\bigl(RV_i - \lambda \widehat{h}^H_i(\theta^g)\bigr)^2}{\widehat{\phi}_H} + \frac{\bigl((X_i - X_{\lambda+i-1})^2 - (1-\lambda) \widehat{h}^L_i(\theta^g)\bigr)^2}{\widehat{\phi}_L},$$
and $\widehat{h}^H_i(\theta^g)$, $\widehat{h}^L_i(\theta^g)$ are defined in (3.1) and (3.2), respectively. Under some stationarity condition, we can establish their consistency. Then, using the proposed Z-statistic $T_{f,n}$ in (3.8), we can conduct hypothesis tests based on the standard normal distribution.

We conducted simulations to check the finite sample performance of the proposed estimation methods. We generated the log-prices for $n$ days with frequency $1/m^{all}$ for each day and let $t_{d,j} = d - 1 + j/m^{all}$, $d = 1, \ldots, n$, $j = 0, \ldots, m^{all}$. We chose the closing time $\lambda$ as $6.5/24$. The log-price process was generated from
$$dX_t = \mu_t\,dt + \sigma_t(\theta)\,dB_t + J_t\,d\Lambda_t,$$
where the instantaneous volatility $\sigma_t(\theta)$ is given by the OGI model defined in Definition 1, the drift $\mu_t = 0$, and $(\omega_{H1,0}, \omega_{H2,0}, \omega_{L,0}, \gamma_{H,0}, \gamma_{L,0}, \alpha_{H,0}, \alpha_{L,0}, \beta_{H,0}, \beta_{L,0}, \nu_{H,0}, \nu_{L,0}) = (0. , . , . , . , . , . , . , . , . , . , . )$, with $(\omega^g_{H,0}, \omega^g_{L,0}, \gamma_0, \alpha^g_{H,0}, \alpha^g_{L,0}, \beta^g_{H,0}, \beta^g_{L,0}) = (0. , . , . , . , . , . , . )$. The jump size was $|J_t| = 0.05$, and the signs of the jumps were randomly generated. $\Lambda_t$ was generated using a Poisson distribution with mean 10 during the open-to-close period.

For the open-to-close period, we generated the noisy observations as follows:
$$Y_{t_{d,j}} = X_{t_{d,j}} + \epsilon_{t_{d,j}}, \quad \text{for } d = 1, \ldots, n,\ j = 1, \ldots, m-1,$$
where $m$ is the number of high-frequency observations for the open-to-close period, and the $\epsilon_{t_{d,j}}$'s are generated from i.i.d. normal distributions with mean zero and standard deviation $0.\ \sqrt{\int_{d-1}^{d} \sigma_t^2(\theta)\,dt}$. To generate the true process, we chose $m^{all} = 43{,}200$. We varied $n$ from 100 to 500 and $m$ from 390 to 11,700, which correspond to sampling every 1 minute and every 2 seconds in the open-to-close period, respectively. We treated $Y_{t_{d,j}}$, $j = 1, \ldots, m-1$, $X_{t_{d,0}}$, and $X_{t_{d,m}}$ as the observed log-prices. To estimate the integrated volatility for the open-to-close period, we employed the jump-adjusted pre-averaging realized volatility estimator (Aït-Sahalia and Xiu, 2016; Jacod et al., 2009) as follows:
$$RV_d = \frac{1}{\psi K} \sum_{k=1}^{m-K+1} \left\{ \bar{Y}^2(t_{d,k}) - \frac{1}{2} \widehat{Y}^2(t_{d,k}) \right\} \mathbf{1}\{|\bar{Y}(t_{d,k})| \le \tau_m\}, \tag{4.1}$$
where
$$\bar{Y}(t_{d,k}) = \sum_{l=1}^{K-1} g\!\left(\frac{l}{K}\right) \bigl(Y_{t_{d,k+l}} - Y_{t_{d,k+l-1}}\bigr), \quad \psi = \int_0^1 g(t)^2\,dt,$$
$$\widehat{Y}^2(t_{d,k}) = \sum_{l=1}^{K} \left\{ g\!\left(\frac{l}{K}\right) - g\!\left(\frac{l-1}{K}\right) \right\}^2 \bigl(Y_{t_{d,k+l-1}} - Y_{t_{d,k+l-2}}\bigr)^2,$$
we take the weight function $g(x) = x \wedge (1-x)$ and the bandwidth size $K = \lfloor m^{1/2} \rfloor$, $\mathbf{1}\{\cdot\}$ is an indicator function, and $\tau_m = c_\tau m^{-0.\ }$ is a truncation level for the constant $c_\tau$. We chose $c_\tau$ as three times the sample standard deviation of the pre-averaged prices $m^{1/4} \bar{Y}(t_{d,k})$. We repeated the whole procedure 500 times.

Table 1: Mean absolute errors (MAE) for the WLSE estimates with $n = 100, 200,$

500 and $m = 390, 1170, 2340$.

MAE ×
n     m          $\omega^g_H$   $\omega^g_L$   $\gamma$   $\alpha^g_H$   $\alpha^g_L$   $\beta^g_H$   $\beta^g_L$
100   390        0.0383   0.0497   0.2329   0.1627   0.2145   0.0455   0.1014
      1170       0.0383   0.0498   0.2325   0.1626   0.2141   0.0453   0.1013
      2340       0.0291   0.0484   0.1696   0.1064   0.2307   0.0299   0.1024
      True vol   0.0231   0.0447   0.1483   0.0817   0.2143   0.0233   0.1023
200   390        0.0247   0.0338   0.1619   0.1444   0.1593   0.0380   0.0685
      1170       0.0247   0.0338   0.1615   0.1445   0.1593   0.0378   0.0685
      2340       0.0191   0.0343   0.1091   0.0781   0.1645   0.0229   0.0730
      True vol   0.0136   0.0307   0.0880   0.0513   0.1583   0.0164   0.0749
500   390        0.0159   0.0230   0.1191   0.1329   0.1171   0.0329   0.0446
      1170       0.0159   0.0230   0.1191   0.1330   0.1173   0.0328   0.0446
      2340       0.0134   0.0235   0.0726   0.0586   0.1079   0.0190   0.0490
      True vol   0.0082   0.0212   0.0545   0.0330   0.1053   0.0095   0.0519
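The pre-averaging realized volatility used to produce these results can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `preaveraged_rv`, the default `tau = inf` (no jump truncation), the factor 1/2 on the bias-correction term (taken from the standard form in Jacod et al., 2009), and the exact alignment of the sliding windows are all choices made here for illustration. The weight function $g(x) = x \wedge (1-x)$, $\psi = \int_0^1 g(t)^2\,dt = 1/12$, and $K = \lfloor m^{1/2} \rfloor$ follow the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def preaveraged_rv(y, K=None, tau=np.inf):
    """Truncated pre-averaging realized volatility for one day of log-prices.

    y   : array of m + 1 observed (noisy) log-prices
    K   : pre-averaging bandwidth; defaults to floor(sqrt(m)) as in the text
    tau : truncation level for |Ybar|; np.inf disables the jump truncation
          (assumption: the paper's tau_m = c_tau * m^(-c) is passed in by the caller)
    """
    dy = np.diff(np.asarray(y, dtype=float))   # the m one-step returns
    m = dy.size
    if K is None:
        K = int(np.floor(np.sqrt(m)))
    grid = np.arange(K + 1) / K
    g = np.minimum(grid, 1.0 - grid)           # g(l/K) for l = 0, ..., K
    psi = 1.0 / 12.0                           # integral of g(t)^2 over [0, 1]

    # Ybar(t_k) = sum_{l=1}^{K-1} g(l/K) * (Y_{k+l} - Y_{k+l-1})
    ybar = sliding_window_view(dy[1:], K - 1) @ g[1:K]
    # Yhat^2(t_k) = sum_{l=1}^{K} {g(l/K) - g((l-1)/K)}^2 * (Y_{k+l-1} - Y_{k+l-2})^2
    yhat2 = sliding_window_view(dy ** 2, K) @ (np.diff(g) ** 2)

    keep = np.abs(ybar) <= tau                 # drop windows dominated by jumps
    return float(np.sum((ybar ** 2 - 0.5 * yhat2) * keep) / (psi * K))
```

With noise-free Brownian paths of unit daily variance, the average of this estimator over many simulated days should be close to one, and adding small i.i.d. noise should not change that materially, which is the property the pre-averaging step is designed to deliver.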

Table 1 reports the mean absolute errors (MAE), $|\widehat{\theta} - \theta_0|$, of the WLSE estimates with $n = 100, 200, 500$ and $m = 390, 1170, 2340$. The close-to-open period parameters, $\omega^g_L$, $\alpha^g_L$, $\beta^g_L$, are mainly dependent on the number of low-frequency observations, whereas the open-to-close period parameters, $\omega^g_H$, $\gamma$, $\alpha^g_H$, $\beta^g_H$, are dependent on the numbers of both the high-frequency and low-frequency observations. This is because the close-to-open period parameters are estimated based on only the low-frequency data, whereas the open-to-close period parameters are estimated based on both the high-frequency and low-frequency data. This result supports the theoretical findings in Section 3.

To check the asymptotic normality of the GARCH parameters $(\omega^g, \gamma, \alpha^g, \beta^g)$, we calculated the Z-statistics defined in Section 3.3. Figure 1 draws the standard normal quantile-quantile plots of the Z-statistics estimates of $\omega^g$, $\gamma$, $\alpha^g$, and $\beta^g$ for $n = 500$ and $m = 390, 1170, 11700$.

Figure 1: Standard normal quantile-quantile plots of the Z-statistics estimates of $\omega^g$, $\gamma$, $\alpha^g$, and $\beta^g$ for $n = 500$ and $m = 390, 1170, 11700$. Panels: each parameter for $m = 390$, $m = 1170$, $m = 11700$, and the true volatility; horizontal axis: standard normal quantiles; vertical axis: ordered estimated values.

To estimate future GARCH volatility, we employed the proposed conditional GARCH volatility estimator $\widehat{h}_{n+1}(\widehat{\theta})$, the realized GARCH volatility estimator (Hansen et al., 2012; Song et al., 2020) with only the open-to-close high-frequency observations, the discrete GARCH(1,1) volatility estimator with the open-to-open log-returns, and the sample variance of the open-to-open log-returns using the in-sample data. For example, the realized GARCH volatility has the following GARCH form:
$$h_n(\theta) = \omega + \gamma h_{n-1}(\theta) + \alpha RV_{n-1},$$
and the discrete GARCH(1,1) has the following GARCH form:
$$h_n(\theta) = \omega + \gamma h_{n-1}(\theta) + \beta (X_{n-1} - X_{n-2})^2.$$
We then adopted the QMLE method with the Gaussian quasi-likelihood function to estimate their GARCH parameters. Because the realized GARCH volatility estimator only covers the open-to-close period, we magnified the estimator by multiplying it by $(1 + \mathrm{mean}[OV/RV])$ to match the magnitude, where $OV$ is the squared overnight return and $RV$ is the open-to-close realized volatility. We call this the adjusted realized GARCH volatility.

Figure 2: One sample path for estimated conditional volatilities of the OGI, S-OGI, A-OGI, realized GARCH, adjusted realized GARCH, and GARCH with $n = 500$ and $m = 11700$. Horizontal axis: $n$; vertical axis: difference from the true volatility.

Finally, we also consider other estimation procedures based on Theorem 1(a). For example, we estimate the open-to-close and close-to-open volatilities separately, as described in the first step in Section 3. That is, without the common $\gamma$ assumption, we make inferences for $h^H_n(\theta)$ and $h^L_n(\theta)$ separately,

using the QMLE method with the normal likelihood function. We call this the separate OGI (S-OGI) model. In contrast, we estimate the open-to-open conditional volatility $h_n(\theta)$ directly. Specifically, Theorem 1(a) shows that the conditional volatility is
$$h_n(\theta^g) = \omega^g + \gamma h_{n-1}(\theta^g) + \frac{\alpha^g}{\lambda} RV_{n-1} + \frac{\beta^g}{1-\lambda} (X_{n-1} - X_{\lambda+n-2})^2.$$
Then we estimate the GARCH parameters $(\omega^g, \gamma, \alpha^g, \beta^g)$ using the QMLE method with $RV + OV$ as the proxy. We call this the aggregated OGI (A-OGI) model. We note that this model can be considered as the realized GARCH model with an additional overnight innovation term. We measure the mean absolute errors with the one-day-ahead sample period over 500 samples as follows:
$$\frac{1}{500} \sum_{i=1}^{500} \bigl|\widehat{var}_{n+1,i} - h_{n+1,i}(\theta)\bigr|,$$
where $\widehat{var}_{n+1,i}$ is one of the above future volatility estimators at the $i$th sample path given the available information at time $n$.

Figure 3: Relative mean absolute errors for the OGI, S-OGI, A-OGI, realized GARCH, adjusted realized GARCH, GARCH, and sample variance with respect to the OGI with $n = 100, 200, 500$ and $m = 390, 1170, 11700$. Panels: true volatility, $m = 11700$, $m = 1170$, $m = 390$; horizontal axis: $n$; vertical axis: relative MAE.

Table 2: Mean absolute errors (MAE) for the one-day-ahead volatility estimators with $n = 100, 200, 500$ and $m = 390, 1170, 2340$.

MAE ×
n     m          OGI      S-OGI    A-OGI    Adj-Realized   Realized   GARCH    Sample
100   390        0.3828   0.4877   0.4346   0.8357         1.4920     0.7644   1.5040
      1170       0.3612   0.4500   0.4224   0.4574         1.4287     0.7644   1.4415
      2340       0.3328   0.4196   0.4038   0.3531         1.3775     0.7644   1.3968
      True vol   0.2991   0.4165   0.3712   0.2966         1.4079     0.7644   1.4320
200   390        0.3090   0.3561   0.3356   0.8186         1.4986     0.5170   1.5027
      1170       0.2847   0.3385   0.3228   0.4567         1.4325     0.5170   1.4386
      2340       0.2570   0.3033   0.2909   0.3365         1.3821     0.5170   1.3873
      True vol   0.2353   0.2881   0.2674   0.2855         1.4086     0.5170   1.4163
500   390        0.2460   0.2527   0.2542   0.7983         1.5168     0.4298   1.5295
      1170       0.2083   0.2271   0.2280   0.4270         1.4526     0.4298   1.4566
      2340       0.1762   0.1984   0.2004   0.3270         1.4013     0.4298   1.4029
      True vol   0.1462   0.1735   0.1744   0.2840         1.4271     0.4298   1.4316
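As a small illustration of the forecasting recursion and the error measure described above, the following Python sketch iterates the GARCH form $h_{i+1} = \omega^g + \gamma h_i + \alpha^g \lambda^{-1} RV_i + \beta^g (1-\lambda)^{-1} OV_i$ and computes a mean absolute error. The function names, the argument `h0` (an initial volatility value), and the use of a generic array `ov` for the squared close-to-open returns are assumptions made for this example; parameter estimation is not shown.

```python
import numpy as np


def ogi_one_step_forecasts(rv, ov, omega, gamma, alpha, beta, lam, h0):
    """Iterate h[i+1] = omega + gamma*h[i] + alpha*rv[i]/lam + beta*ov[i]/(1-lam).

    rv  : open-to-close realized volatilities, one per day
    ov  : squared close-to-open log-returns, one per day
    lam : fraction of the day covered by the open-to-close period
    h0  : starting value for the recursion (hypothetical input for this sketch)
    """
    rv = np.asarray(rv, dtype=float)
    ov = np.asarray(ov, dtype=float)
    h = np.empty(rv.size + 1)
    h[0] = h0
    for i in range(rv.size):
        h[i + 1] = omega + gamma * h[i] + alpha * rv[i] / lam + beta * ov[i] / (1.0 - lam)
    return h


def mean_absolute_error(forecast, truth):
    """Average of |forecast - truth| over sample paths."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(diff)))
```

The last element of the returned array is the one-day-ahead forecast; collecting that value over simulated sample paths and passing it to `mean_absolute_error` together with the true conditional volatilities mirrors the comparison reported in Table 2.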
Figure 2 draws one sample path for the differences between the estimated and true conditional volatilities, where the estimated conditional volatility is one of the OGI, S-OGI, A-OGI, realized GARCH, adjusted realized GARCH, and GARCH with $n = 500$ and $m = 11700$. We chose the sample path that has the smallest mean absolute error. Figure 3 depicts the relative mean absolute errors for the OGI, S-OGI, A-OGI, realized GARCH, adjusted realized GARCH, GARCH, and sample variance with respect to the OGI for varying numbers $n$ of low-frequency observations and $m$ of high-frequency observations. We report their numerical results in Table 2. In Figures 2 and 3 and Table 2, we find that the OGI models can estimate the one-day-ahead GARCH volatility $h_{n+1}(\theta)$ well, but the other estimators cannot. This may be because, under the OGI model, the market dynamics are explained by the open-to-close high-frequency volatility and squared close-to-open log-returns, whereas the other models ignore one of these factors. Among the estimation methods for the OGI models, the WLSE yields the best performance. One possible explanation is that the WLSE procedure gives more weight to the high-frequency observations, which helps reduce the estimation errors. From these results, we conjecture that modeling appropriate overnight processes helps not only to account for market dynamics but also to improve the estimation accuracy.

We applied the proposed OGI model to real high-frequency trading data. We obtained intra-day data for the top 5 trading volume assets (BAC, FCX, INTC, MSFT, MU) from January 2010 to December 2016 from the TAQ database in the Wharton Research Data Services (WRDS) system, 1762 trading days in total. We defined the trading hours from 9:30 to 16:00 as the open-to-close period and the overnight period from 16:00 to 9:30 on the following day as the close-to-open period, that is, $\lambda = 6.5/24$. We used the log-prices and adopted the jump-robust PRV estimation procedure in (4.1) to estimate the open-to-close integrated volatility. In the empirical study, we chose the tuning parameter $c_\tau$ as 10 times the sample standard deviation of the pre-averaged prices $m^{1/4} \bar{Y}(t_{d,k})$. We find that the jump variation accounts for about 13% to 18% of the total variation on average. Details can be found in Table 4.

We first estimated the GARCH parameters using the most recent 1000 days of data. From the estimated GARCH parameters, we obtained the following GARCH volatility for each asset:
$$\widehat{h}_{n+1}(\widehat{\theta}) = \widehat{\omega}^g + \widehat{\gamma}\, \widehat{h}_n(\widehat{\theta}) + \widehat{\alpha}^g \lambda^{-1} RV^c_n + \widehat{\beta}^g (1-\lambda)^{-1} (X_n - X_{\lambda+n-1})^2.$$
Table 3 reports the estimation results. Furthermore, to check the relative importance of each component, we report the average proportion of jumps, and the means and standard deviations of the estimated GARCH volatility, PRV, and squared overnight returns in Table 4. Table 3 shows that the dynamic structure can be explained by the past PRV and squared overnight returns, and the coefficients of the realized and overnight volatilities are statistically significant. From Table 4, we find that the magnitude of the squared overnight returns is comparable to that of PRV, and the squared overnight returns have a greater standard deviation.
These results lead us to conjecture that the overnight risk usually has a significant effect on the structure of the volatility dynamics.

Table 3: OGI model estimation results.

Stock   $\omega^g$ (p-value)   $\gamma$ (p-value)   $\alpha^g$ (p-value)   $\beta^g$ (p-value)
BAC     0.00006 (0.00)   0.16048 (0.00)   0.22391 (0.00)   0.26862 (0.01)
FCX     0.00011 (0.38)   0.28644 (0.01)   0.21289 (0.00)   0.20468 (0.03)
INTC    0.00005 (0.00)   0.18198 (0.00)   0.23510 (0.00)   0.06777 (0.06)
MSFT    0.00001 (0.00)   0.14403 (0.00)   0.17846 (0.00)   0.05760 (0.00)
MU      0.00018 (0.00)   0.39881 (0.00)   0.14263 (0.00)   0.01847 (0.00)

Table 4: Average of the jump proportion, and means and standard deviations for conditional GARCH, pre-averaging realized volatility (PRV), and overnight volatility (OV).

                Mean ×                      SD ×
Stock   Jump    GARCH    PRV     OV      GARCH    PRV     OV
BAC     0.169   2.556    2.420   1.712   1.995    3.572   8.367
FCX     0.133   10.145   5.424   3.341   11.956   7.651   11.086
INTC    0.181   1.936    1.403   0.799   1.285    1.389   3.453
MSFT    0.166   2.169    1.221   0.866   0.911    1.133   4.485
MU      0.164   6.794    5.195   2.585   2.462    4.264   10.236

For comparison, we calculated the OGI, S-OGI, A-OGI, adjusted realized GARCH, and discrete GARCH(1,1) volatilities defined in Section 4 and the GJR GARCH(1,1) (Glosten et al., 1993). To check the performance of the ARFI-type model, we adopted the HAR-RV model (Corsi, 2009), and we magnified the estimator by multiplying it by $(1 + \mathrm{mean}[OV/RV])$ to match the scale of the whole-day variation; we call this the "adjusted HAR." To check the leverage effect, we also considered some variations of the OGI model as follows: h n ( θ ) = (cid:18) ω g + γh n − ( θ ) + α g + I Hn − aλ (cid:90) n − λn − σ t ( θ ) dt + β g + I Ln − b − λ ( X n − − X n − λ ) (cid:19) , where I Hn = { X n − − X n − λ

Table 6: MSPEs, RMSPEs and QLIKEs for the OGI, S-OGI, A-OGI, GJR-OGI, adjustedrealized GARCH, adjusted HAR, discrete GARCH, and GJR GARCH.

Stock OGI-W S-OGI A-OGI GJR-OGI Adj-Real Adj-HAR GARCH GJR
BAC MSPE × × × × ×

where $Vol_i$ is one of the OGI, S-OGI, A-OGI, GJR-OGI, adjusted realized GARCH, adjusted HAR, discrete GARCH, and GJR GARCH predicted volatilities. Then we calculated the regression residuals, $\widehat{\epsilon}_i$, for each model and checked their auto-correlations. Figure 4 draws the auto-correlation function (ACF) of the regression residuals for the five assets. From Figure 4, we find that the OGI, S-OGI, A-OGI, and adjusted HAR models produce relatively small auto-correlations for most assets, but the other models still yield significantly non-zero auto-correlations for some assets. That is, the OGI model can explain the market dynamics in the volatility time series. An intriguing finding is that the HAR model shows similar performance to the models incorporating the overnight risk factor. These numerical results provide evidence that the ARFI structure helps reduce the volatility persistence.

Figure 4: ACF plots for the regression residuals between the nonparametric volatility and estimated volatility, such as the OGI, S-OGI, A-OGI, GJR-OGI, adjusted realized GARCH, adjusted HAR, discrete GARCH, and GJR GARCH predicted volatilities. Panels: one per stock (BAC, FCX, INTC, MSFT, MU) and model (OGI, S-OGI, A-OGI, GJR-OGI, A-Realized, A-HAR, GARCH, GJR-GARCH); horizontal axis: lag; vertical axis: ACF.

We examined the performance of the proposed method in measuring one-day-ahead VaR. To evaluate VaR, we first predicted the one-day-ahead conditional expected volatility by the OGI, S-OGI, A-OGI, GJR-OGI, adjusted realized GARCH, adjusted HAR, discrete GARCH, and GJR GARCH using the in-sample period data. We then calculated the quantiles from historical standardized daily returns. Specifically, we standardized the in-sample daily returns by the fitted conditional volatilities. Then we calculated the sample quantiles for 0.01, 0.02, 0.05, 0.1, and 0.2, and with the sample quantile estimates and predicted volatility, we obtained the one-day-ahead VaR values. We fixed the in-sample period as 500 days and used the rolling window scheme.

To backtest the estimated VaR, we conducted some hypothesis tests. Specifically, we first calculated the sequence $I_i = I(S_i - S_{i-1} < -\widehat{VaR}_{i,q})$, $i = 501, \ldots$, where $S_i$ is the stock price at time $i$ and $\widehat{VaR}_{i,q}$ is the VaR value, with the VaR level $1 - q$, given the available information up to time $i - 1$. The expected value of $I_i$ should equal the target probability $q_0$. So the null statement is $q = q_0$, and the alternative statement is $q \neq q_0$. The following three test statistics are employed to carry out the hypothesis tests. The first one is the likelihood ratio unconditional coverage (LRuc) test, a standard likelihood ratio test based on the binomial distribution (Kupiec, 1995). The second one is the likelihood ratio conditional coverage (LRcc) test proposed by Christoffersen (1998):
$$LR_{cc} = -2 \log\left[ L(q; I_1, \ldots, I_n) / L(\widehat{\Pi}; I_1, \ldots, I_n) \right],$$
where $L(q; I_1, \ldots, I_n) = q^x (1-q)^{n-x}$ and $x = \sum_{i=1}^{n} I_i$; $L(\Pi; I_1, \ldots, I_n) = \pi_{01}^{n_{01}} (1-\pi_{01})^{n_{00}} \pi_{11}^{n_{11}} (1-\pi_{11})^{n_{10}}$, $\pi_{ij} = P(I_{d+1} = j \mid I_d = i)$, and $n_{ij}$ is the number of $j$ outcomes after an $i$ outcome; $\widehat{\Pi}$ is the maximum likelihood estimator.
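The two coverage tests above can be sketched with the Python standard library alone. This is an illustrative sketch rather than the code used in the paper: the function names are hypothetical, and, following the usual practice for Christoffersen's test, the null likelihood in `lr_cc` is evaluated over the same $n - 1$ transitions as the Markov likelihood so that the statistic is non-negative (an assumption of this example). The p-values use the closed forms of the chi-square survival function for one and two degrees of freedom.

```python
import math


def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def lr_uc(hits, q):
    """Kupiec unconditional coverage test; returns (statistic, p-value), chi-square(1)."""
    n, x = len(hits), sum(hits)
    pi_hat = x / n
    ll_null = _xlogy(x, q) + _xlogy(n - x, 1.0 - q)
    ll_alt = _xlogy(x, pi_hat) + _xlogy(n - x, 1.0 - pi_hat)
    stat = max(-2.0 * (ll_null - ll_alt), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


def lr_cc(hits, q):
    """Christoffersen conditional coverage test; returns (statistic, p-value), chi-square(2)."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, curr in zip(hits[:-1], hits[1:]):
        counts[(prev, curr)] += 1
    n00, n01, n10, n11 = counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]
    # Maximum likelihood estimates of the transition probabilities
    pi01 = n01 / (n00 + n01) if n00 + n01 > 0 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 > 0 else 0.0
    ll_null = _xlogy(n01 + n11, q) + _xlogy(n00 + n10, 1.0 - q)
    ll_alt = (_xlogy(n01, pi01) + _xlogy(n00, 1.0 - pi01)
              + _xlogy(n11, pi11) + _xlogy(n10, 1.0 - pi11))
    stat = max(-2.0 * (ll_null - ll_alt), 0.0)
    return stat, math.exp(-stat / 2.0)
```

A hit sequence with the right overall frequency but with all violations clustered together passes the unconditional test and fails the conditional one, which is the distinction the LRcc test is meant to capture.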
So LRcc can test correct conditional coverage. Details can be found in Christoffersen (1998). Finally, we considered the dynamic quantile (DQ) test (Engle and Manganelli, 2004). It generalizes the conditional coverage test by considering the relation between the current hit and multiple lagged hits. For example, we denote the current hit by $H_t$ and obtain the following regression with lag $L$:
$$H_t = \beta_0 + \sum_{i=1}^{L} \beta_i H_{t-i} + \varepsilon_t.$$
Under the null hypothesis, $\beta_0$ should be equal to $\alpha$, and $\beta_i$ should be zero. Moreover, Engle and Manganelli (2004) showed
$$DQ = \frac{\widehat{\beta}^{\top} X^{\top} X \widehat{\beta}}{\alpha(1-\alpha)} \xrightarrow{d} \chi^2_{L+2},$$
where $\chi^2_{L+2}$ is the chi-square distribution with $L + 2$ degrees of freedom, $X = (X_1, \ldots, X_t)$, and $X_t = (H_{t-1}, \ldots, H_{t-L})^{\top}$. We chose $L = 4$.

Figure 5 draws the scatterplots for the p-values of the LRuc, LRcc, and DQ tests for the OGI, S-OGI, A-OGI, GJR-OGI, adjusted realized GARCH, adjusted HAR, discrete GARCH, and GJR GARCH models with $q_0 = 0.01, 0.02, 0.05, 0.1$, and $0.2$. Figure 5 indicates that the VaR estimates from the OGI, S-OGI, and A-OGI exhibit good performance overall. These results show that the overnight risk is important to account for whole-day market dynamics, and the OGI process can account for market dynamics by utilizing the overnight risk information. When comparing the OGI, S-OGI, and A-OGI models, the proposed OGI appears to be slightly better in some cases, and it is robust. These findings prompt us to speculate that estimating the open-to-close and close-to-open volatilities separately with the weighted least squares estimation method under the common $\gamma$ condition may help improve estimation accuracy. In contrast, the GJR-OGI shows relatively worse performance. This may be because the relatively complicated model causes some estimation errors, which result in worse performance. The realized GARCH model produces good performance in the LRuc and LRcc tests, but it has poor performance in the DQ test. This may be because the realized GARCH model cannot explain some dynamics that may come from the overnight risk.

Figure 5: Scatterplots for the p-values of the LRuc, LRcc, and DQ tests with $q_0 = 0.01, 0.02, 0.05, 0.1$, and $0.2$. The models are numbered as follows: OGI (1), S-OGI (2), A-OGI (3), GJR-OGI (4), adjusted realized GARCH (5), adjusted HAR (6), discrete GARCH (7), and GJR GARCH (8). Panels: one per test and level; horizontal axis: model; vertical axis: p-value.

A Proofs

A.1 Proof of Theorem 1

Let $P_t = \int_0^t \sigma_s(\theta)\,dB_s$. Theorem 1 is an immediate consequence of Theorem 3 (a) below.

Theorem 3.

For the OGI model, the integrated volatilities have the following structure.(a) For < α H < and n ∈ N , we have (cid:90) n − λn − σ t ( θ ) dt = λh Hn ( θ ) + D Hn a.s., where h Hn ( θ ) = ω gH + γh Hn − ( θ ) + α gH λ − (cid:90) n − λn − σ t ( θ ) dt + β gH (1 − λ ) − ( P n − − P n − λ ) ,(cid:37) H = α − H ( e α H − , (cid:37) H = α − H ( e α H − − α H ) ,(cid:37) H = α − H ( e α H − − α H − α H / , (cid:37) H = 2 γ H (cid:37) H + (cid:37) H − (cid:37) H ,ω gH = (1 − γ ) [2 ω H (cid:37) H − ω H (cid:37) H + ν H { (cid:37) H − (cid:37) H } ] + γ L ( ω H − ω H ) (cid:37) H + ω L (cid:37) H ,γ = γ H γ L , α gH = (cid:37) H γ L α H , β gH = (cid:37) H β L + β H ( (cid:37) H − (cid:37) H ) , and D Hn = 2 ν H α − H (cid:90) n − λn − { α H λ − ( λ + n − − t − λα − H ) e λ − α H ( λ + n − − t ) + 1 } Z Ht dW t is a martingale diﬀerence.(b) For < β L < , and n ∈ N , we have (cid:90) nn − λ σ t ( θ ) dt = (1 − λ ) h Ln ( θ ) + D Ln a.s., where h Ln ( θ ) = ω gL + γh Ln − ( θ ) + α gL λ − (cid:90) n − λn − σ t ( θ ) dt + β gL (1 − λ ) − ( P n − − P n − λ ) ,(cid:37) L = β − L ( e β L − , (cid:37) L = β − L ( e β L − − β L ) , (cid:37) L = ( γ L − (cid:37) L + (cid:37) L ,ω gL = (1 − γ ) [ ω L (cid:37) L + ν L ( (cid:37) L − (cid:37) L )] + ( ω H − ω H + γ H ω L ) (cid:37) L + (cid:37) L α H ω gH + α L ( (cid:37) L − (cid:37) L ) ω gH , α gL = (cid:37) L α H ( γ + α gH ) + α L ( (cid:37) L − (cid:37) L )( γ + α gH ) β gL = (cid:37) L ( γ H β L + α H β gH ) + α L ( (cid:37) L − (cid:37) L ) β gH , nd D Ln = 2 (cid:90) nn − λ ( e β L (1 − λ ) − ( n − t ) − P t − P λ + n − ) σ t ( θ ) dB t +2 ν L β − L (cid:90) nn − λ (cid:20) β L − λ { n − t − (1 − λ ) β − L } e β L (1 − λ ) − ( n − t ) + 1 (cid:21) Z Lt dW t +( (cid:37) L β H + α L ( (cid:37) L − (cid:37) L )(1 − λ ) λ − D Hn is a martingale diﬀerence.(c) For < β H < , < β L < , and n ∈ N , we have (cid:90) nn − σ t ( θ ) dt = h n ( θ ) + D n a.s. 
, where D n = D Hn + D Ln , h n ( θ ) = ω g + γh n − ( θ ) + α g λ − (cid:90) n − λn − σ t ( θ ) dt + β g (1 − λ ) − ( P n − − P n − λ ) , (A.1) ω g = λω gH + (1 − λ ) ω gL , α g = λα gH + (1 − λ ) γα gL , β g = λβ gH + (1 − λ ) β gL . (d) For < α H < , < β L < , and n ∈ N , we have E (cid:34)(cid:90) n − λn − σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) F n − (cid:35) = λh Hn ( θ ) a.s.,E (cid:34)(cid:90) nn − λ σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) F n − (cid:35) = (1 − λ ) h Ln ( θ ) a.s.,E (cid:34)(cid:90) nn − σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) F n − (cid:35) = h n ( θ ) a.s. Proof of Theorem 3.

Consider ( a ) and ( b ). By Itˆo’s lemma, we obtain R H ( k ) = (cid:90) λ + n − n − ( λ + n − − t ) k k ! σ t ( θ ) dt = ω H λ − λ k +3 ( k + 3)! − ω H λ − λ k +2 ( k + 2)!+ σ n − ( θ ) λ − (cid:26) γ H λ k +3 ( k + 3)! − λ λ k +2 ( k + 2)! + λ λ k +1 ( k + 1)! (cid:27) + β H λ (1 − λ ) (cid:26) λ λ k +2 ( k + 2)! − λ k +3 ( k + 3)! (cid:27) ∞ (cid:88) j =1 γ j − (cid:18)(cid:90) n − jn − λ − j σ s ( θ ) dB s (cid:19) ν H λ − (cid:26) λ λ k +2 ( k + 2)! − λ k +3 ( k + 3)! (cid:27) +2 λ − ν H (cid:90) λ + n − n − ( λ + n − − s ) k +2 ( k + 1)( k + 2)! ( W s − W n − ) dW s + α H λ − R H ( k + 1) a.s.Thus, we have R H (0) = (cid:90) λ + n − n − σ t ( θ ) dt = λ ∞ (cid:88) k =0 (cid:18) ω H α − H α k +3 H ( k + 3)! − ω H α − H α k +2 H ( k + 2)! (cid:19) + ∞ (cid:88) k =0 σ n − ( θ ) λ (cid:26) γ H α − H α k +3 H ( k + 3)! − α − H α k +2 H ( k + 2)! + α − H α k +1 H ( k + 1)! (cid:27) + ∞ (cid:88) k =0 β H (1 − λ ) λ (cid:26) α − H α k +2 H ( k + 2)! − α − H α k +3 H ( k + 3)! (cid:27) ∞ (cid:88) j =1 γ j − (cid:18)(cid:90) n − jn − λ − j σ s ( θ ) dB s (cid:19) + ∞ (cid:88) k =0 ν H λ (cid:26) α − H α k +2 H ( k + 2)! − α − H α k +3 H ( k + 3)! (cid:27) +2 ν H λ − ∞ (cid:88) k =0 (cid:90) λ + n − n − ( λ − α H ) k ( λ + n − − t ) k +2 ( k + 1)( k + 2)! Z Ht dW t = λω H (cid:37) H − λω H (cid:37) H + ν H λ { (cid:37) H − (cid:37) H } + σ n − ( θ ) λ { γ H (cid:37) H + (cid:37) H − (cid:37) H } + λ β H (1 − λ ) { (cid:37) H − (cid:37) H } ∞ (cid:88) j =1 γ j − (cid:18)(cid:90) n − jn − λ − j σ s ( θ ) dB s (cid:19) +2 ν H α − H (cid:90) λ + n − n − { α H λ − ( λ + n − − t − λα − H ) e λ − α H ( λ + n − − t ) + 1 } Z Ht dW t = λ (cid:18) ω gH + γh Hn − ( θ ) + α gH λ − (cid:90) λ + n − n − σ t dt + β gH (1 − λ ) − ( P n − − P λ + n − ) (cid:19) + D Hn a.s.Similarly, we have R L ( k ) = (cid:90) nλ + n − ( n − t ) k k ! σ t ( θ ) dt = ω L (1 − λ ) − (1 − λ ) k +2 ( k + 2)! + σ λ + n − ( θ ) (cid:26) γ L − − λ (1 − λ ) k +2 ( k + 2)! + (1 − λ ) k +1 ( k + 1)! 
(cid:27) + ν L (1 − λ ) − (cid:18) (1 − λ ) k +3 ( k + 2)! − − λ ) k +3 ( k + 3)! (cid:19) + α L (1 − λ ) λ (cid:26) (1 − λ ) (1 − λ ) k +2 ( k + 2)! − − λ ) k +3 ( k + 3)! (cid:27) ∞ (cid:88) j =1 γ j − (cid:90) n − j + λn − j σ s ( θ ) ds ν L (1 − λ ) − (cid:90) nλ + n − ( n − t ) k +2 ( k + 1)( k + 2)! Z Lt dW t +2 β L (1 − λ ) − (cid:90) nλ + n − ( n − t ) k +1 ( k + 1)! ( P s − P λ + n − ) σ s ( θ ) dB s + β L (1 − λ ) − R L ( k + 1) a.s. , and R L (0) = (cid:90) nλ + n − σ t ( θ ) dt = (1 − λ ) ω L (cid:37) L + ν L (1 − λ )( (cid:37) L − (cid:37) L ) + σ λ + n − ( θ )(1 − λ ) { ( γ L − (cid:37) L + (cid:37) L } +(1 − λ ) α L λ { (cid:37) L − (cid:37) L } ∞ (cid:88) j =1 γ j − (cid:90) n − j + λn − j σ s ( θ ) ds +2 ν L β − L (cid:90) nλ + n − { β L − λ ( n − t − (1 − λ ) β − L ) e β L (1 − λ ) − ( n − t ) + 1 } Z Lt dW t +2 (cid:90) nλ + n − ( e β L (1 − λ ) − ( n − t ) − P t − P λ + n − ) σ t ( θ ) dB t = (1 − λ ) (cid:18) ω gL + γh Ln − ( θ ) + α gL λ − (cid:90) λ + n − n − σ t ( θ ) dt + β gL (1 − λ ) − ( P n − − P λ + n − ) (cid:19) + D Ln a.s.The results of ( c ) and ( d ) are immediate consequences of ( a ) and ( b ). (cid:4) Proof of Theorem Theorem 3 (b).

To simplify the notations, we set n = 1. Simplealgebraic manipulations show E (cid:104)(cid:0) D H (cid:1) (cid:12)(cid:12)(cid:12) F (cid:105) = 4 ν H α − H (cid:90) λ t { λ − α H ( λ − t − α − H λ ) e α H λ − ( λ − t ) + 1 } dt = λ ν H (2 α H − α H + 9) e α H + (16 α H − e α H + 4 α H + 22 α H + 392 α H = λ ν gH , where ν gH = 2 α H ν H (2 α H − α H + 9) e α H + (16 α H − e α H + 4 α H + 22 α H + 39 . (A.2)Consider D LLn . We ﬁrst deﬁne f β L , ( t ) = 32 e βL − λ (1 − t ) − e βL − λ (1 − t ) , f β L , ( t ) = 1 − λβ L ( e βL − λ ( t − λ ) − ,f β L , ( t ) = β − L [ { ( λ − β L − λ + 1 } e βL − λ ( t − λ ) − { ( t − β L − λ + 1 } ] . By Itˆo’s Lemma, we obtain E (cid:34)(cid:18)(cid:90) λ { β L − λ (1 − t − (1 − λ ) β − L ) e β L (1 − λ ) − (1 − t ) + 1 } Z Lt dW t (cid:19) (cid:35) (cid:20)(cid:90) λ { β L − λ (1 − t − (1 − λ ) β − L ) e β L (1 − λ ) − (1 − t ) + 1 } ( t − λ ) dt (cid:21) = (1 − λ ) (cid:0) (2 β L − β L + 9) e β L + (16 β L − e β L + (4 β L + 22 β L + 39) (cid:1) β L and 4 E (cid:20)(cid:90) λ e βL − λ (1 − t ) ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = 4 E (cid:20)(cid:90) λ e βL − λ (1 − t ) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) +4 E (cid:20)(cid:90) λ e βL − λ (1 − t ) β L − λ ( P t − P λ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = 4 E (cid:20)(cid:90) λ e βL − λ (1 − t ) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) +12 E (cid:20)(cid:90) λ ( e βL − λ (1 − t ) − P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) a.s.Thus, we have E (cid:20)(cid:90) λ e βL − λ (1 − t ) ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = 32 E (cid:20)(cid:90) λ ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) − E (cid:20)(cid:90) λ e βL − λ (1 − t ) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − 
σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) λ (cid:18) e βL − λ (1 − t ) − e βL − λ (1 − t ) (cid:19) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:34) (cid:90) λ f β L , ( t ) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) }× (cid:8) f β L , ( t ) σ λ ( θ ) + f β L , ( t )( ω L + ( γ L − σ λ ( θ )) (cid:9) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:35) = (cid:90) λ (cid:18) t − λ − λ ( γ L − (cid:19) f β L , ( t )( f β L , ( t ) + ( γ L − f β L , ( t )) dtσ λ ( θ )+ (cid:90) λ (cid:34) (cid:18) t − λ )( γ L − − λ (cid:19) f β L , ( t ) f β L , ( t )+ ( t − λ ) f β L , ( t )1 − λ ( f β L , ( t ) + ( γ L − f β L , ( t )) (cid:35) dtω L σ λ ( θ )+ (cid:90) λ t − λ − λ f β L , ( t ) f β L , ( t ) dtω L

29 14 (cid:0) F β L , σ λ ( θ ) + F β L , ω L σ λ ( θ ) + F β L , ω L (cid:1) a.s. , where the second and third equalities are due to (A.4) and (A.5) below, respectively, and F β L , = 4 (cid:90) λ (cid:18) t − λ − λ ( γ L − (cid:19) f β L , ( t )( f β L , ( t ) + ( γ L − f β L , ( t )) dt,F β L , = 4 (cid:90) λ (cid:34) (cid:18) t − λ )( γ L − − λ (cid:19) f β L , ( t ) f β L , ( t )+ ( t − λ ) f β L , ( t )1 − λ ( f β L , ( t ) + ( γ L − f β L , ( t )) (cid:35) dt,F β L , = 4 (cid:90) λ t − λ − λ f β L , ( t ) f β L , ( t ) dt. (A.3)Hence, we arrive at G ( k ) = E (cid:20)(cid:90) λ (1 − t ) k k ! ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) λ (1 − t ) k k ! { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) + E (cid:20)(cid:90) λ (1 − t ) k k ! β L − λ ( P t − P λ ) (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) λ (1 − t ) k k ! { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) + 6 β L − λ E (cid:20)(cid:90) λ (1 − t ) k +1 ( k + 1)! ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) λ (1 − t ) k k ! { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) + 6 β L − λ G ( k + 1) a.s. , where the second equality is due to Itˆo’s Isometric, and the third equality can be derivedusing arguments similar to the proofs of Theorem 1. Therefore, we obtain G (0) = E (cid:20)(cid:90) λ ( P t − P λ ) σ t ( θ ) dt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = ∞ (cid:88) k =0 (6 β L ) k E (cid:20)(cid:90) λ (1 − t ) k k ! 
{ σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) λ e βL − λ (1 − t ) { σ λ ( θ ) + t − λ − λ ( ω L + ( γ L − σ λ ( θ )) } (cid:90) tλ σ s ( θ ) dsdt (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) a.s. (A.4)Note that E (cid:20)(cid:90) tλ ( t − s ) k k ! σ s ( θ ) ds (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) E (cid:20)(cid:90) tλ ( t − s ) k k ! (cid:26) σ λ ( θ ) + s − λ − λ ( ω L + ( γ L − σ λ ( θ )) (cid:27) ds (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) + β L − λ E (cid:20)(cid:90) tλ ( t − s ) k +1 ( k + 1)! σ s ( θ ) ds (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) a.s. , where the last equality can be derived similar to the proofs Theorem 1. We have E (cid:20)(cid:90) tλ σ s ( θ ) ds (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = E (cid:20)(cid:90) tλ e βL − λ ( t − s ) (cid:26) σ λ ( θ ) + s − λ − λ ( ω L + ( γ L − σ λ ( θ )) (cid:27) ds (cid:12)(cid:12)(cid:12)(cid:12) F λ (cid:21) = f β L , ( t ) σ λ ( θ ) + f β L , ( t )( ω L + ( γ L − σ λ ( θ )) a.s. , (A.5)and E (cid:104)(cid:0) D LL (cid:1) (cid:12)(cid:12)(cid:12) F λ (cid:105) = F β L , σ λ ( θ ) + F β L , ω L σ λ ( θ ) + F β L , ω L + (1 − λ ) ν gL + ( α ∗ L (1 − λ ) λ − D H ) + ν L (cid:0) (2 β L − β L + 9) e β L + (16 β L − e β L + (4 β L + 22 β L + 39) (cid:1) β L a.s.Finally, an application of the tower property leads to E (cid:104)(cid:0) D LL (cid:1) (cid:12)(cid:12)(cid:12) F (cid:105) = F β L , s ( θ ) + F β L , ω L s ( θ ) + F β L , ω L + (1 − λ ) ν gL a.s. , where s ( θ ) = ω H − ω H + γ H ω L + γσ − λ ( θ ) + β H h H ( θ ) + γ H β L − λ ( X n − X n − λ − (1 − λ ) µ L ) , (A.6) ν gL = ν L (cid:0) (2 β L − β L + 9) e β L + (16 β L − e β L + (4 β L + 22 β L + 39) (cid:1) β L + { ( (cid:37) L β H (1 − λ ) λ − ) + ( β H λ − ) } ν gH . (A.7) (cid:4) A.2 Proof of Theorem 2

To ease the notation, we use $\theta$ instead of $\theta^g$ in this subsection. Define

$$\widehat{L}_{n,m}(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\left[\frac{\{RV_i - \lambda \widehat{h}^H_i(\theta)\}^2}{\widehat{\phi}_H} + \frac{\{(X_i - X_{\lambda+i-1})^2 - (1-\lambda)\widehat{h}^L_i(\theta)\}^2}{\widehat{\phi}_L}\right],$$

$$\widehat{L}_{n}(\theta,\phi_H,\phi_L) = -\frac{1}{n}\sum_{i=1}^{n}\left[\frac{\{IV_i - \lambda h^H_i(\theta)\}^2}{\phi_H} + \frac{\{(X_i - X_{\lambda+i-1})^2 - (1-\lambda)h^L_i(\theta)\}^2}{\phi_L}\right],$$

$$L_{n}(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\left[\frac{\lambda^2\{h^H_i(\theta_0) - h^H_i(\theta)\}^2 + \varphi_H(\theta_0)}{\phi_H} + \frac{(1-\lambda)^2\{h^L_i(\theta_0) - h^L_i(\theta)\}^2 + \varphi_{Li}(\theta_0)}{\phi_L}\right],$$

$$\widehat{s}_{n,m}(\theta) = \frac{\partial \widehat{L}_{n,m}(\theta)}{\partial\theta}, \quad \widehat{s}_{n}(\theta,\phi_H,\phi_L) = \frac{\partial \widehat{L}_{n}(\theta,\phi_H,\phi_L)}{\partial\theta}, \quad s_{n}(\theta) = \frac{\partial L_{n}(\theta)}{\partial\theta},$$

where $IV_i = \int_{i-1}^{\lambda+i-1}\sigma_t^2(\theta_0)\,dt$. Since the effect of the initial value $h_1(\theta_0)$ is of order $n^{-1}$ and thus negligible, without loss of generality, we assume $h_1(\theta_0)$ is given.

Proposition 1.

Under the assumptions of Theorem 2, $\widehat{\theta}$ converges to $\theta_0$ in probability.

Proof of Proposition 1.

Note that

$$|\widehat{L}_{n,m}(\theta) - L_n(\theta)| \le |\widehat{L}_{n,m}(\theta) - \widehat{L}_n(\theta,\theta_0)| + |\widehat{L}_n(\theta,\theta_0) - L_n(\theta)|.$$

First consider $|\widehat{L}_{n,m}(\theta) - \widehat{L}_n(\theta,\theta_0)|$. By Assumption 1(4), we have

$$E\left[\sup_{\theta}|\widehat{h}^H_i(\theta) - h^H_i(\theta)|\right] \le C\sum_{k=0}^{i-2}\gamma_u^{k}\,E\left(|RV_{i-1-k} - IV_{i-1-k}|\right) \le C m^{-1/4}, \qquad \text{(A.8)}$$

and similarly, we can show

$$E\left[\sup_{\theta}|\widehat{h}^L_i(\theta) - h^L_i(\theta)|\right] \le C m^{-1/4}. \qquad \text{(A.9)}$$

Together with Assumption 1(6), we obtain

$$\sup_{\theta\in\Theta}|\widehat{L}_{n,m}(\theta) - \widehat{L}_n(\theta,\theta_0)| = o_p(1).$$

Consider the second term $|\widehat{L}_n(\theta,\theta_0) - L_n(\theta)|$. We have

$$\widehat{L}_n(\theta,\theta_0) - L_n(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\left[\frac{2\lambda D^H_i\{h^H_i(\theta_0) - h^H_i(\theta)\} + (D^H_i)^2 - \varphi_H(\theta_0)}{\phi_H} + \frac{2D^{LL}_i(1-\lambda)\{h^L_i(\theta_0) - h^L_i(\theta)\} + (D^{LL}_i)^2 - \varphi_{Li}(\theta_0)}{\phi_L}\right].$$

Note that the martingale difference terms, $D^H_i$ and $D^{LL}_i$, are integrable. The uniform convergence of the second term $|\widehat{L}_n(\theta,\theta_0) - L_n(\theta)|$ can be established using arguments similar to the proof of Theorem 1 in Kim and Wang (2016). Then, to prove the statement, we need to show the uniqueness of the maximizer of $L_n(\theta)$. $L_n(\theta)$ is concave, and the solution of the equation with its first derivative equal to zero must satisfy $h^H_i(\theta) = h^H_i(\theta_0)$ and $h^L_i(\theta) = h^L_i(\theta_0)$ for all $i = 1,\ldots,n$. Thus, the maximizer $\theta^*$ must satisfy $h^H_i(\theta^*) = h^H_i(\theta_0)$ and $h^L_i(\theta^*) = h^L_i(\theta_0)$ for all $i = 1,\ldots,n$. Suppose that the maximizer $\theta^*$ may be different from $\theta_0$. Since

$$h^H_i(\theta) = \omega^g_H + \gamma h^H_{i-1}(\theta) + \frac{\alpha^g_H}{\lambda}IV_{i-1} + \frac{\beta^g_H}{1-\lambda}(X_{i-1} - X_{\lambda+i-2})^2,$$

$\theta^*$ and $\theta_0$ satisfy almost surely

$$T\begin{pmatrix}\omega^g_{H,0} - \omega^*_H\\ \gamma_0 - \gamma^*\\ \beta^g_{H,0} - \beta^*_H\\ \alpha^g_{H,0} - \alpha^*_H\end{pmatrix} = 0, \quad\text{where}\quad T = \begin{pmatrix}1 & h^H_1(\theta_0) & (X_1 - X_{\lambda})^2 & IV_1\\ 1 & h^H_2(\theta_0) & (X_2 - X_{\lambda+1})^2 & IV_2\\ \vdots & \vdots & \vdots & \vdots\\ 1 & h^H_{n-1}(\theta_0) & (X_{n-1} - X_{\lambda+n-2})^2 & IV_{n-1}\end{pmatrix}.$$
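The identification step above uses the fact that $Tv = 0$ forces $v = 0$ once $T$ has full column rank. The following is a minimal numerical sketch of that fact on synthetic data, not part of the authors' proof. The function name `identification_matrix`, the parameter values, and the gamma/normal draws standing in for $IV_i$ and the squared close-to-open returns are all illustrative assumptions.

```python
import numpy as np

def identification_matrix(h_prev, overnight_sq, iv_prev):
    """Stack rows (1, h_{i-1}, squared overnight return, IV_{i-1}).

    Hypothetical helper: the column order mirrors the matrix T in the proof.
    """
    return np.column_stack([np.ones_like(h_prev), h_prev, overnight_sq, iv_prev])

rng = np.random.default_rng(0)
n = 200
lam = 0.7                                            # assumed open-to-close fraction
omega, gamma, alpha, beta = 0.1, 0.4, 0.3, 0.2       # assumed parameter values

iv = rng.gamma(shape=2.0, scale=0.5, size=n)         # synthetic integrated volatilities
overnight_sq = rng.normal(scale=0.7, size=n) ** 2    # synthetic squared overnight returns

# Run the recursion h_i = omega + gamma*h_{i-1} + (alpha/lam)*IV_{i-1}
#                         + (beta/(1-lam))*overnight_sq_{i-1}.
h = np.empty(n + 1)
h[0] = 1.0                                           # initial value, assumed given
for i in range(n):
    h[i + 1] = (omega + gamma * h[i]
                + alpha / lam * iv[i]
                + beta / (1.0 - lam) * overnight_sq[i])

T = identification_matrix(h[:-1], overnight_sq, iv)
rank = np.linalg.matrix_rank(T)
smallest_sv = np.linalg.svd(T, compute_uv=False)[-1]
print("column rank:", rank, "smallest singular value:", smallest_sv)
```

With nondegenerate inputs the four columns are linearly independent, so the smallest singular value is strictly positive and the only solution of $Tv = 0$ is $v = 0$.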
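Then, since $IV_i$'s and $X_i$'s are nondegenerate random variables, we have

$$\begin{pmatrix}\omega^g_{H,0} - \omega^*_H\\ \gamma_0 - \gamma^*\\ \beta^g_{H,0} - \beta^*_H\\ \alpha^g_{H,0} - \alpha^*_H\end{pmatrix} = 0 \quad\text{a.s.},$$

and similarly, we obtain

$$\begin{pmatrix}\omega^g_{L,0} - \omega^*_L\\ \gamma_0 - \gamma^*\\ \beta^g_{L,0} - \beta^*_L\\ \alpha^g_{L,0} - \alpha^*_L\end{pmatrix} = 0 \quad\text{a.s.},$$

which implies $\theta^* = \theta_0$ a.s. This shows that the maximizer is unique. Finally, the result is a consequence of Theorem 1 in Xiu (2010). $\blacksquare$

Proof of Theorem 2.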

The mean value theorem and Taylor expansion indicate that, for some $\theta^*$ between $\theta_0$ and $\widehat{\theta}$, we have

$$\widehat{s}_{n,m}(\widehat{\theta}) - \widehat{s}_{n,m}(\theta_0) = -\widehat{s}_{n,m}(\theta_0) = -\nabla\widehat{s}_{n,m}(\theta^*)(\widehat{\theta} - \theta_0).$$

Similar to the proof of Proposition 1, we can show

$$-\nabla\widehat{s}_{n,m}(\theta^*) \overset{p}{\to} -\nabla s_n(\theta_0).$$

Since $IV_i$'s and $X_i$'s are nondegenerate, $-\nabla s_n(\theta_0)$ is almost surely positive definite. By the ergodic theorem, we have $-\nabla s_n(\theta_0) \overset{p}{\to} A$. Next, we have

$$|\widehat{s}_{n,m}(\theta_0) - \widehat{s}_n(\theta_0,\phi_H,\phi_L)| \le |\widehat{s}_{n,m}(\theta_0) - \widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L)| + |\widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L) - \widehat{s}_n(\theta_0,\phi_H,\phi_L)|.$$

By Assumption 1 and (A.8)–(A.9), we can establish

$$|\widehat{s}_{n,m}(\theta_0) - \widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L)| = O_p(m^{-1/4}). \qquad \text{(A.10)}$$

Consider $|\widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L) - \widehat{s}_n(\theta_0,\phi_H,\phi_L)|$. Simple algebraic manipulations show

$$\widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L) - \widehat{s}_n(\theta_0,\phi_H,\phi_L) = \frac{1}{n}\sum_{i=1}^{n}\left[2\lambda D^H_i\frac{\partial h^H_i(\theta_0)}{\partial\theta}\left(\frac{1}{\widehat{\phi}_H} - \frac{1}{\phi_H}\right) + 2(1-\lambda)D^{LL}_i\frac{\partial h^L_i(\theta_0)}{\partial\theta}\left(\frac{1}{\widehat{\phi}_L} - \frac{1}{\phi_L}\right)\right].$$

Since $D^H_i$'s are martingale differences and $h^H_i(\theta)$ is $\mathcal{F}_{i-1}$-adapted, we have $\frac{\lambda}{n}\sum_{i=1}^{n}D^H_i\frac{\partial h^H_i(\theta_0)}{\partial\theta} = O_p(n^{-1/2})$. Thus, with $\|\widehat{\phi}_H - \phi_H\|_{\max} = o_p(1)$, we obtain

$$\frac{1}{n}\sum_{i=1}^{n}\lambda D^H_i\frac{\partial h^H_i(\theta_0)}{\partial\theta}\left(\frac{1}{\varphi_H(\widehat{\theta})} - \frac{1}{\varphi_H(\theta_0)}\right) = o_p(n^{-1/2}).$$

Similarly, we can show

$$\frac{1}{n}\sum_{i=1}^{n}D^{LL}_i\frac{\partial h^L_i(\theta_0)}{\partial\theta}\left(\frac{1}{\widehat{\phi}_L} - \frac{1}{\phi_L}\right) = o_p(n^{-1/2}).$$

Hence, we have

$$\widehat{s}_n(\theta_0,\widehat{\phi}_H,\widehat{\phi}_L) - \widehat{s}_n(\theta_0,\phi_H,\phi_L) = o_p(n^{-1/2}). \qquad \text{(A.11)}$$

By (A.10) and (A.11), we obtain

$$|\widehat{s}_{n,m}(\theta_0) - \widehat{s}_n(\theta_0,\phi_H,\phi_L)| = O_p(m^{-1/4}) + o_p(n^{-1/2}).$$
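To make the objects in this argument concrete, the sketch below evaluates the open-to-close part of the weighted least squares objective and a finite-difference approximation of its score. It is an illustration under stated assumptions rather than the authors' implementation: the names `h_H_path`, `wls_objective_H`, and `numerical_score` are hypothetical, only the $H$ component is included, $\phi_H$ is held fixed, and the data are noise-free and synthetic (each $RV_i$ is set to $\lambda h^H_i(\theta_0)$), so the objective is maximized at $\theta_0$ by construction.

```python
import numpy as np

def h_H_path(theta, rv, overnight_sq, lam, h_init):
    """h_i = omega + gamma*h_{i-1} + (alpha/lam)*RV_{i-1}
             + (beta/(1-lam))*overnight_sq_{i-1}, started at h_init."""
    omega, gamma, alpha, beta = theta
    n = len(rv)
    h = np.empty(n)
    h[0] = h_init  # initial value, assumed given
    for i in range(1, n):
        h[i] = (omega + gamma * h[i - 1]
                + alpha / lam * rv[i - 1]
                + beta / (1.0 - lam) * overnight_sq[i - 1])
    return h

def wls_objective_H(theta, rv, overnight_sq, lam, phi_H, h_init):
    """Open-to-close part only: -(1/n) * sum_i (RV_i - lam*h_i)^2 / phi_H."""
    h = h_H_path(theta, rv, overnight_sq, lam, h_init)
    return -np.mean((rv - lam * h) ** 2) / phi_H

def numerical_score(theta, *args, eps=1e-5):
    """Central finite-difference gradient of the objective in theta."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (wls_objective_H(theta + step, *args)
                   - wls_objective_H(theta - step, *args)) / (2.0 * eps)
    return grad

# Noise-free synthetic data generated from assumed true parameters theta0.
rng = np.random.default_rng(1)
n, lam, phi_H, h_init = 300, 0.7, 1.0, 1.0
theta0 = np.array([0.1, 0.4, 0.3, 0.2])  # (omega, gamma, alpha, beta), assumed
overnight_sq = rng.normal(scale=0.5, size=n) ** 2
rv = np.empty(n)
h = h_init
for i in range(n):
    rv[i] = lam * h  # RV_i equals lam*h_i exactly, so the residual D_i is zero
    h = (theta0[0] + theta0[1] * h + theta0[2] / lam * rv[i]
         + theta0[3] / (1.0 - lam) * overnight_sq[i])

args = (rv, overnight_sq, lam, phi_H, h_init)
obj_true = wls_objective_H(theta0, *args)
obj_perturbed = wls_objective_H(theta0 + np.array([0.05, 0.0, 0.0, 0.0]), *args)
score_true = numerical_score(theta0, *args)
print(obj_true, obj_perturbed, score_true)
```

In this noise-free setting the score vanishes at $\theta_0$ and any perturbation lowers the objective. With noisy data the score at $\theta_0$ is instead a scaled average of martingale differences, which is the term whose $n^{-1/2}$ rate the proof controls.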
Note that

$$\widehat{s}_n(\theta_0,\phi_H,\phi_L) = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{2\lambda D^H_i}{\phi_H}\frac{\partial h^H_i(\theta_0)}{\partial\theta} + \frac{2(1-\lambda)D^{LL}_i}{\phi_L}\frac{\partial h^L_i(\theta_0)}{\partial\theta}\right].$$

Since the martingale difference terms have finite 4th moments, the convergence rate of the above term is $n^{-1/2}$. Thus, (3.6) is proved. An application of the ergodic theorem leads to

$$\sqrt{n}\,\widehat{s}_n(\theta_0) \overset{d}{\to} N(0, B).$$

Finally, we conclude

$$\sqrt{n}(\widehat{\theta} - \theta_0) = -A^{-1}\sqrt{n}\,\widehat{s}_n(\theta_0) + o_p(1) \overset{d}{\to} N(0, A^{-1}BA^{-1}).$$

The statement (3.7) is proved. $\blacksquare$

Acknowledgements

The research of Donggyu Kim was supported in part by KAIST Settlement/Research Subsidies for Newly-hired Faculty grant G04170049 and KAIST Basic Research Funds by Faculty (A0601003029). The research of Yazhen Wang was supported in part by NSF grants DMS-1707605 and DMS-1913149.

This research was performed using the compute resources and assistance of the UW-Madison Center For High Throughput Computing (CHTC) in the Department of Computer Sciences. The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, the Wisconsin Institutes for Discovery, and the National Science Foundation, and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science.

The authors would like to thank the editor, associate editor and two referees for comments and suggestions that led to significant improvement of the paper.

References

Admati, A. R. and Pfleiderer, P. (1988). A theory of intraday patterns: Volume and price variability. The Review of Financial Studies, 1(1):3–40.

Aït-Sahalia, Y., Fan, J., and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. Journal of the American Statistical Association, 105(492):1504–1517.

Aït-Sahalia, Y., Jacod, J., and Li, J. (2012). Testing for jumps in noisy high frequency data. Journal of Econometrics, 168(2):207–222.

Aït-Sahalia, Y. and Xiu, D. (2016). Increased correlation among asset classes: Are volatility or jumps to blame, or both? Journal of Econometrics, 194(2):205–219.

Andersen, T. G. and Bollerslev, T. (1997a). Heterogeneous information arrivals and return volatility dynamics: Uncovering the long-run in high frequency returns. The Journal of Finance, 52(3):975–1005.

Andersen, T. G. and Bollerslev, T. (1997b). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4(2-3):115–158.

Andersen, T. G. and Bollerslev, T. (1998a). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review, 39(4):885–905.

Andersen, T. G. and Bollerslev, T. (1998b). Deutsche mark-dollar volatility: Intraday activity patterns, macroeconomic announcements, and longer run dependencies. The Journal of Finance, 53(1):219–265.

Andersen, T. G., Bollerslev, T., and Diebold, F. X. (2007). Roughing it up: Including jump components in the measurement, modeling, and forecasting of return volatility. The Review of Economics and Statistics, 89(4):701–720.

Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2003). Modeling and forecasting realized volatility. Econometrica, 71(2):579–625.

Andersen, T. G., Bollerslev, T., and Huang, X. (2011). A reduced form framework for modeling volatility of speculative prices based on realized variation measures. Journal of Econometrics, 160(1):176–189.

Andersen, T. G., Dobrev, D., and Schaumburg, E. (2012). Jump-robust volatility estimation using nearest neighbor truncation. Journal of Econometrics, 169(1):75–93.

Andersen, T. G., Dobrev, D., and Schaumburg, E. (2014). A robust neighborhood truncation approach to estimation of integrated quarticity. Econometric Theory, pages 3–59.

Andersen, T. G., Thyrsgaard, M., and Todorov, V. (2019). Time-varying periodicity in intraday volatility. Journal of the American Statistical Association, 114(528):1695–1707.

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008). Designing realized kernels to measure the ex post variation of equity prices in the presence of noise. Econometrica, 76(6):1481–1536.

Barndorff-Nielsen, O. E. and Shephard, N. (2006). Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics, 4(1):1–30.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3):307–327.

Christoffersen, P. F. (1998). Evaluating interval forecasts. International Economic Review, pages 841–862.

Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7(2):174–196.

Corsi, F., Pirino, D., and Reno, R. (2010). Threshold bipower variation and the impact of jumps on volatility forecasting. Journal of Econometrics, 159(2):276–288.

Duffie, D. and Pan, J. (1997). An overview of value at risk. Journal of Derivatives, 4(3):7–49.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, pages 987–1007.

Engle, R. F. and Manganelli, S. (2004). CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business & Economic Statistics, 22(4):367–381.

Fan, J. and Kim, D. (2018). Robust high-dimensional volatility matrix estimation for high-frequency factor model. Journal of the American Statistical Association, 113(523):1268–1283.

Fan, J. and Wang, Y. (2007). Multi-scale jump and volatility analysis for high-frequency financial data. Journal of the American Statistical Association, 102(480):1349–1362.

Glosten, L. R., Jagannathan, R., and Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. The Journal of Finance, 48(5):1779–1801.

Hansen, P. R., Huang, Z., and Shek, H. H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility. Journal of Applied Econometrics, 27(6):877–906.

Hansen, P. R. and Lunde, A. (2005). A realized variance for the whole day based on intermittent high-frequency data. Journal of Financial Econometrics, 3(4):525–554.

Hong, H. and Wang, J. (2000). Trading and returns under periodic market closures. The Journal of Finance, 55(1):297–354.

Jacod, J., Li, Y., Mykland, P. A., Podolskij, M., and Vetter, M. (2009). Microstructure noise in the continuous case: the pre-averaging approach. Stochastic Processes and their Applications, 119(7):2249–2276.

Kim, D. and Fan, J. (2019). Factor GARCH-Itô models for high-frequency data with application to large volatility matrix prediction. Journal of Econometrics, 208(2):395–417.

Kim, D., Liu, Y., and Wang, Y. (2018). Large volatility matrix estimation with factor-based diffusion model for high-frequency financial data. Bernoulli, 24(4B):3657–3682.

Kim, D., Song, X., and Wang, Y. (2020). Unified discrete-time factor stochastic volatility and continuous-time Itô models for combining inference based on low-frequency and high-frequency. Preprint: arXiv:2006.12039.

Kim, D. and Wang, Y. (2016). Unified discrete-time and continuous-time models and statistical inferences for merged low-frequency and high-frequency financial data. Journal of Econometrics, 194:220–230.

Kim, D., Wang, Y., and Zou, J. (2016). Asymptotic theory for large volatility matrix estimation based on high-frequency financial data. Stochastic Processes and their Applications, 126:3527–3577.

Kupiec, P. H. (1995). Techniques for verifying the accuracy of risk measurement models. The Journal of Derivatives, 3(2):73–84.

Mancini, C. (2004). Estimation of the characteristics of the jumps of a general Poisson-diffusion model. Scandinavian Actuarial Journal, 2004(1):42–52.

Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1):77–91.

Martens, M. (2002). Measuring and forecasting S&P 500 index-futures volatility using high-frequency data. Journal of Futures Markets: Futures, Options, and Other Derivative Products, 22(6):497–518.

Patton, A. J. (2011). Volatility forecast comparison using imperfect volatility proxies. Journal of Econometrics, 160(1):246–256.

Rockafellar, R. T. and Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2:21–42.

Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3):425–442.

Shephard, N. and Sheppard, K. (2010). Realising the future: forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics, 25(2):197–231.

Song, X., Kim, D., Yuan, H., Cui, X., Lu, Z., Zhou, Y., and Wang, Y. (2020). Volatility analysis with realized GARCH-Itô models. Journal of Econometrics, DOI:10.1016/j.jeconom.2020.07.007.

Tao, M., Wang, Y., and Chen, X. (2013). Fast convergence rates in estimating large volatility matrices using high-frequency financial data. Econometric Theory, 29(04):838–856.

Taylor, N. (2007). A note on the importance of overnight information in risk management models. Journal of Banking & Finance, 31(1):161–180.

Todorova, N. and Souček, M. (2014). Overnight information flow and realized volatility forecasting. Finance Research Letters, 11(4):420–428.

Tseng, T.-C., Lai, H.-C., and Lin, C.-F. (2012). The impact of overnight returns on realized volatility. Applied Financial Economics, 22(5):357–364.

Xiu, D. (2010). Quasi-maximum likelihood estimation of volatility with high frequency data. Journal of Econometrics, 159(1):235–250.

Zhang, L. (2006). Efficient estimation of stochastic volatility using noisy observations: A multi-scale approach. Bernoulli, 12(6):1019–1043.

Zhang, L., Mykland, P. A., and Aït-Sahalia, Y. (2005). A tale of two time scales: Determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association, 100(472):1394–1411.

Zhang, X., Kim, D., and Wang, Y. (2016). Jump variation estimation with noisy high frequency financial data via wavelets.