Doubly Multiplicative Error Models with Long- and Short-run Components

Abstract

We suggest the Doubly Multiplicative Error class of models (DMEM) for modeling and forecasting realized volatility, which combines two components accommodating low-, respectively, high-frequency features in the data. We derive the theoretical properties of the Maximum Likelihood and Generalized Method of Moments estimators. Two such models are then proposed, the Component-MEM, which uses daily data for both components, and the MEM-MIDAS, which exploits the logic of MIxed-DAta Sampling (MIDAS). The empirical application involves the S&P 500, NASDAQ, FTSE 100 and Hang Seng indices: irrespective of the market, both DMEM's outperform the HAR and other relevant GARCH-type models.



A. Amendola ∗, V. Candila †, F. Cipollini ‡, G.M. Gallo §

∗ Dept. of Economics and Statistics, University of Salerno, Italy, [email protected]
† MEMOTEF Department, Sapienza University of Rome, Italy, [email protected]
‡ Dipartimento di Statistica "G. Parenti", University of Florence, Italy, fabrizio.cipollini@unifi.it
§ Italian Court of Audits (Corte dei conti), and NYU in Florence, Italy, [email protected] 8, 2020


Keywords: Financial markets; Realized volatility; Multiplicative Error Model; MIDAS; GARCH; HAR.

1 Introduction

More than forty years have passed since Engle's pioneering work (Engle, 1982) on modeling the conditional variance as an autoregressive process of observable variables. GARCH-type models (Bollerslev, 1986) still play a significant role in the financial econometrics literature. This is mainly because this class of models can reproduce several stylized facts, such as the persistence in the conditional second moments (volatility clustering) and, in its extensions, a slow-moving or state-dependent average volatility level. This empirical regularity can be suitably accommodated by assuming that the dynamic evolution of volatility is driven by two components, a high- and a low-frequency one, which combine additively or multiplicatively (Amado et al., 2019, offer a comprehensive survey of the contributions in this field). As a matter of fact, several suggestions exist in the GARCH literature to model the low-frequency component. For instance, Hamilton and Susmel (1994) and Dueker (1997) consider a Markov Switching framework, Amado and Teräsvirta (2008) a Smooth Transition context, and Mazur and Pipień (2012) and Engle and Rangel (2008) introduce deterministic functions in order to make the unconditional variance time-varying with high persistence. This latter contribution points to a relationship between a time-varying average level of volatility and macroeconomic events related to the business cycle: since the macro-variables are observed at a lower frequency than that of the asset returns, the MIxed-DAta Sampling (MIDAS) approach suggested by Ghysels et al. (2007) was extended to allow the real economy to influence financial volatility (GARCH-MIDAS model, Engle et al., 2013; Conrad and Loch, 2015). Some extensions are available, such as the Double Asymmetric GARCH-MIDAS (DAGM) introduced by Amendola et al. (2019), where a variable available at a low frequency drives the slow-moving level of volatility and is allowed to have differentiated effects according to its sign, determining a local time-varying trend around which a GJR-GARCH (Glosten et al., 1993, GJR) describes the short-run dynamics.

[Footnote: A similar approach was independently developed by Pan and Liu (2018).]

Volatility modeling has received a tremendous boost from the availability of ultra-high frequency data, and from the ensuing stream of literature on estimating volatility using tick-by-tick data, conveniently aggregated: following the pathbreaking paper by Andersen and Bollerslev (1998), realized volatility measures have become an ideal target for evaluating volatility forecasting performances. Such forecasts may be generated by GARCH models (for the conditional variances of asset returns) or by models of realized variances themselves (conditional expectations of variances or volatility, or, yet, log-variances), the latter being able to exploit intra-daily information about market movements. For the latter class of models a wide choice exists: the variants of the Multiplicative Error Model (MEM, Engle, 2002; Engle and Gallo, 2006), the Heterogeneous Autoregressive Model (HAR) by Corsi (2009), and the Realized GARCH (RGARCH, Hansen et al., 2012), among others, have proven to be effective in translating the refinement of volatility measurement achieved in the realized variance estimators (for a survey on these estimators in reference to forecasting, cf. Andersen et al., 2006) into good out-of-sample model performances relative to the GARCH results (notoriously based just on squared close-to-close returns).

This paper discusses the presence of a long-run and a short-run component of volatility, combining multiplicatively with one another within a unified general framework in the MEM class, which we label DMEM (Doubly Multiplicative Error Model): in it, the short-run component fluctuates around one and is a function of past volatility or of some predetermined variables, all observed at the same frequency. As for the long-run component (which provides the time-varying average level of volatility), it can be assumed to be: a constant (giving back the base MEM); a smooth function of time (giving rise to a Spline-MEM in the case of a spline); a specification based on daily data which mirrors the structure of the short-run component with a higher persistence (a novel model, which we label Component-MEM); and the extension of the MIDAS approach to the MEM world, providing a tool in which weekly or monthly data for the long-run can be combined with daily data for the short-run (another novel model, the MEM-MIDAS). From an empirical point of view, we are motivated to compare the performance of these models against a few representative models in the GARCH class, in particular those based on a MIDAS approach, on the one side, and against models for realized volatility, on the other, keeping a base asymmetric MEM (AMEM) as a reference, together with (an asymmetric version of) the HAR and the RGARCH, all characterized by the absence of such a low-frequency component.

The theoretical discussion shows that both new models have desirable statistical properties for their estimators (both within a Maximum Likelihood and a Generalized Method of Moments framework). From an empirical point of view, we estimate all the competing models for the realized volatility series of four major indices (the S&P 500, NASDAQ, FTSE 100 and Hang Seng). To summarize the results, to a question like "Is a long-run component advisable?", the answer is yes: the models that do not use it are dominated by the ones that do, within the classes of models for realized volatility on the one hand and models for conditional variances of returns on the other. To a question like "Does modeling realized volatility perform better than a GARCH, even when the latter contains a long-term component?", our answer is still yes, pointing to the richness of intra-daily information over the consideration of just returns. Moreover, our results favor the DMEM approach over the HAR, in spite of the latter's capability of mimicking long memory features in the data.

Our contribution parallels a number of papers where the issue of a low-frequency component was taken into account. Within the MEM context, it has been estimated in several ways: through regime switching and smooth transition functions (Gallo and Otranto, 2015), by deterministic splines (Brownlees and Gallo, 2010) or by a semi-non-parametric vector MEM, where the low-frequency term affecting several assets is obtained non-parametrically (Barigozzi et al., 2014). A comparison with those models goes beyond the scope of this paper.

The rest of the paper is organized as follows. In Section 2 we suggest the rationale and the notation for the DMEM, introducing the two new models (Component-MEM and MEM-MIDAS). Section 3 presents the theoretical results on the estimators' properties and statistical inference. Section 4 introduces the market indices used in the empirical estimation, presents the results in terms of in-sample estimation and performs the main forecasting comparison across the competing models. Section 5 contains some concluding remarks.

Let $\{x_{i,t}\}$ be a time series coming from a non-negative discrete time process for the $i$-th day ($i = 1, \ldots, N_t$) of the period $t$ (for example, a week, a month or a quarter; $t = 1, \ldots, T$): this comprises most financial activity-related variables, such as realized volatility, high-low range, number of trades, volumes, durations, and so on.

Let $\mathcal{F}_{i,t}$ be the information set available at day $i$ of period $t$. In its standard version (Engle, 2002), the MEM assumes that

$$x_{i,t} = \mu_{i,t}\,\varepsilon_{i,t} = \tau\,\xi_{i,t}\,\varepsilon_{i,t}, \tag{1}$$

where: $\tau$ is a constant; $\xi_{i,t}$ is a quantity that, conditionally on $\mathcal{F}_{i-1,t}$ and by means of a parameter vector $\boldsymbol{\theta}$, evolves deterministically; $\varepsilon_{i,t}$ is an error term such that

$$\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t} \overset{iid}{\sim} D^{+}(1, \sigma^2), \tag{2}$$

meaning that it has a unit mean, unknown variance $\sigma^2$ and a probability density function defined over a non-negative support.

[Footnote: For ease of notation, we use the set $\mathcal{F}_{i-1,t}$ even when the first day of a new period, say $x_{1,t}$, depends on the information observed on the last day of the period immediately preceding $t$, that is $\mathcal{F}_{N_{t-1},t-1}$.]

Therefore, independently of the chosen distribution $D^{+}$ and of the function used to build the evolution of $\mu_{i,t}$, we have that

$$E(x_{i,t} \mid \mathcal{F}_{i-1,t}) = \tau\,\xi_{i,t}. \tag{3}$$

Evaluating expression (3) unconditionally, we can interpret $\tau$ as the unconditional expectation of $x_{i,t}$ if we assume that $E(\xi_{i,t}) = 1$, so that $x_{i,t}$ moves around the constant term $\tau$. Correspondingly, the conditional variance can be expressed as

$$Var(x_{i,t} \mid \mathcal{F}_{i-1,t}) = \sigma^2\,\tau^2\,\xi_{i,t}^2. \tag{4}$$

In this paper, we extend the specification for the conditional mean to have a multiplicative component structure, in which both factors of the conditional expectation are time-varying. We have

$$x_{i,t} = \mu_{i,t}\,\varepsilon_{i,t} = \tau_{i,t}\,\xi_{i,t}\,\varepsilon_{i,t}. \tag{5}$$

$\tau_{i,t}$ can be seen as a slow-moving component determining the average level of the conditional mean at any given time, or, which is the same, a long-run component. By the same token, since $\xi_{i,t}$ is a factor centered around one, it plays the role of damping or amplifying $\tau_{i,t}$ depending on whether it is $<$ or $> 1$; for this reason, we label it as a short-run or fast-moving component. Equation (5) with innovation (2) defines a Doubly Multiplicative Error Model, or DMEM.

Let us start by expressing the short-run component in general terms as the GARCH-type expression typical of a MEM, augmented by the contribution of a predetermined de-meaned (vector) variable $\boldsymbol{z}$ (DMEM-X, to parallel the GARCH-X, cf. Han and Kristensen, 2014):

$$\xi_{i,t} = (1 - \alpha - \gamma/2 - \beta) + \alpha\, x^{(\xi)}_{i-1,t} + \gamma\, x^{(\xi-)}_{i-1,t} + \beta\, \xi_{i-1,t} + \boldsymbol{\delta}' \boldsymbol{z}_{i-1,t}, \tag{6}$$

where

$$x^{(\xi)}_{i,t} \equiv \frac{x_{i,t}}{\tau_{i,t}}, \qquad x^{(\xi-)}_{i,t} \equiv x^{(\xi)}_{i,t}\, \mathbb{1}(r_{i,t} < 0). \tag{7}$$

$x^{(\xi-)}_{i,t}$ is a variable derived from $x^{(\xi)}_{i,t}$ which takes a non-zero value only if it corresponds to a negative return (for asymmetric effects).

Starting from $E(\xi_{i,t}) = 1$, we have

$$E(\xi_{i,t}^2) = \frac{1 - \beta^{*2}}{1 - \left[ (\sigma^2 + 1)\left( (\beta^{*} - \beta)^2 + \gamma^2/4 \right) + \beta\,(2\beta^{*} - \beta) \right]},$$

where $\beta^{*} = \alpha + \gamma/2 + \beta$ denotes the persistence. To simplify matters, here we removed the contribution of predetermined variables: explicit inclusion would require assumptions on the correlation between the variables $x$ and $\boldsymbol{z}$.

As far as the long-run is concerned, we consider here different alternatives, apart from it being constant (the resulting model would be the standard MEM).

• [Spline-MEM] We can specify $\tau_{i,t}$ by means of a spline function (for example a linear or a cubic spline),

$$\tau_{i,t} = \exp\left(f_s(i,t)\right),$$

as a smoothing spline or a regression spline with a relatively low number of knots, so as to guarantee the slow-moving feature. The resulting model is the so-called Spline-MEM (the P-Spline MEM of Brownlees and Gallo, 2010, corresponds to a specific choice of spline functions). The Spline-MEM is trend-stationary (stationary around the trend component represented by $\tau_{i,t}$).

• [Component-MEM] Another possibility is to structure $\tau_{i,t}$ in a way similar to $\xi_{i,t}$, namely

$$\tau_{i,t} = \omega^{(\tau)} + \alpha^{(\tau)} x^{(\tau)}_{i-1,t} + \gamma^{(\tau)} x^{(\tau-)}_{i-1,t} + \beta^{(\tau)} \tau_{i-1,t},$$

where

$$x^{(\tau)}_{i,t} \equiv \frac{x_{i,t}}{\xi_{i,t}}, \qquad x^{(\tau-)}_{i,t} \equiv x^{(\tau)}_{i,t}\, \mathbb{1}(r_{i,t} < 0). \tag{8}$$

[Footnote: The consideration of two multiplicative components in the univariate GARCH case is discussed by Conrad and Kleen (2020).]

The essential difference in comparison with $\xi_{i,t}$ is that $\tau_{i,t}$ is not constrained to move around a unit mean, although the persistence features of the components relative to one another characterize the fact that $\tau$ moves differently than $\xi$.

The model resulting from this specification of $\tau_{i,t}$, which we name Component-MEM, is similar to the model introduced by Brownlees et al. (2012), who use, however, an additive (namely $\mu = \tau + \xi$) specification not examined here. Another specification which makes use of different multiplicative components is the Composite-MEM proposed by Brownlees et al. (2011) to model intradaily volumes.

When the Component-MEM is mean-stationary,

$$E(\tau_{i,t}) = \mu \left[ 1 - \frac{\sigma^2 (\alpha + \gamma/2)\left(\alpha^{(\tau)} + \gamma^{(\tau)}/2\right) + (\sigma^2 + 1)\,\gamma\,\gamma^{(\tau)}/4}{1 - \beta^{*} \beta^{(\tau)*}} \right],$$

where $\mu = E(x_{i,t})$ and $\beta^{(\tau)*} = \alpha^{(\tau)} + \gamma^{(\tau)}/2 + \beta^{(\tau)}$ is the persistence of $\tau$. If all parameters are non-negative, this implies that $E(\tau_{i,t}) \le \mu$. This characteristic comes from the fact that the drivers of the $\xi$ and $\tau$ equations, namely $x^{(\xi)}_{i,t}$ and $x^{(\tau)}_{i,t}$, are positively correlated, since they both depend on $\varepsilon_{i,t}$. In case of mean-stationarity we then have

$$\omega^{(\tau)} = \mu \left(1 - \beta^{(\tau)*}\right) \left[ 1 - \frac{\sigma^2 (\alpha + \gamma/2)\left(\alpha^{(\tau)} + \gamma^{(\tau)}/2\right) + (\sigma^2 + 1)\,\gamma\,\gamma^{(\tau)}/4}{1 - \beta^{*} \beta^{(\tau)*}} \right].$$

These expressions are easier to read in the case $\gamma = \gamma^{(\tau)} = 0$:

$$E(\tau_{i,t}) = \mu \left( 1 - \frac{\sigma^2 \alpha\, \alpha^{(\tau)}}{1 - \beta^{*} \beta^{(\tau)*}} \right), \qquad \omega^{(\tau)} = \mu \left(1 - \beta^{(\tau)*}\right) \left( 1 - \frac{\sigma^2 \alpha\, \alpha^{(\tau)}}{1 - \beta^{*} \beta^{(\tau)*}} \right).$$

• [MEM-MIDAS] Yet another option is to allow $\tau_{i,t}$ to have a MIDAS-like structure, adapting the use of mixed frequency data models (Engle et al., 2013; Conrad and Kleen, 2020) to the multiplicative error model context. In its simplest form, for all days $i$ of the same period $t$, $\tau_{i,t}$ can be expressed over a window of $K$ periods as

$$\tau_{i,t} = \tau_t \equiv \exp\left\{ m + \zeta \sum_{k=1}^{K} \delta_k(\boldsymbol{\omega})\, X_{t-k} \right\},$$

where $X_t$ indicates a variable available only at the lower frequency $t$ and

$$\delta_k(\boldsymbol{\omega}) = \frac{(k/K)^{\omega_1 - 1} (1 - k/K)^{\omega_2 - 1}}{\sum_{j=1}^{K} (j/K)^{\omega_1 - 1} (1 - j/K)^{\omega_2 - 1}}. \tag{9}$$

Assuming $\omega_1 = 1$ and $\omega_2 \ge 1$ [...] DAGM (Pan and Liu, 2018; Amendola et al., 2019).

Regarding the choice of the MIDAS driver $X$, one could favor a variable $X \perp \varepsilon$ as in Conrad and Kleen (2020), as this simplifies the analysis, although it may be difficult to meet this condition in practice (as acknowledged by Conrad and Kleen, 2020, p. 4).

Inference on the model defined in Section 2 can be obtained by extending the framework suggested by Brownlees et al. (2012, Section 9.2.2). Assuming that the conditional mean is correctly specified and indicating with $\boldsymbol{\theta}$ the vector of parameters entering it, two estimation strategies are illustrated in what follows: Maximum Likelihood (ML) and Generalized Method of Moments (GMM).
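As a companion to the Section 2 specification, the following Python sketch shows how the two Component-MEM recursions, Equations (6) and (8), could be run jointly over a sample to recover the residuals of Equation (11). It is only an illustration under stated assumptions: the function name `component_mem_filter`, the argument layout, the starting values `tau0` and `xi0`, and the omission of the predetermined variable $\boldsymbol{z}$ are all choices made here, not part of the paper.

```python
def component_mem_filter(x, r, xi_par, tau_par, tau0, xi0=1.0):
    """Run the short-run (eq. 6, without z) and long-run (eq. 8) recursions.

    x        : positive observations, e.g. realized volatilities (hypothetical input)
    r        : returns on the same days; only their sign is used
    xi_par   : (alpha, gamma, beta) of the short-run component
    tau_par  : (omega, alpha, gamma, beta) of the long-run component
    tau0, xi0: starting values (an assumption of this sketch)

    Returns (tau, xi, eps): tau and xi have len(x) + 1 entries, the last one
    being the one-step-ahead value; eps has len(x) entries.
    """
    a, g, b = xi_par
    w_t, a_t, g_t, b_t = tau_par
    tau, xi, eps = [tau0], [xi0], []
    for k in range(len(x)):
        eps.append(x[k] / (tau[k] * xi[k]))   # residual, eq. (11)
        neg = 1.0 if r[k] < 0 else 0.0        # indicator of a negative return
        x_xi = x[k] / tau[k]                  # driver of xi, eq. (7)
        x_tau = x[k] / xi[k]                  # driver of tau, eq. (8)
        xi.append((1 - a - g / 2 - b) + a * x_xi + g * x_xi * neg + b * xi[k])
        tau.append(w_t + a_t * x_tau + g_t * x_tau * neg + b_t * tau[k])
    return tau, xi, eps


# Illustrative call on made-up numbers (not data from the paper).
tau, xi, eps = component_mem_filter(
    x=[12.0, 15.0, 9.0], r=[0.3, -1.1, 0.4],
    xi_par=(0.25, 0.05, 0.60), tau_par=(0.2, 0.03, 0.01, 0.95), tau0=12.0,
)
```

Both components are updated from the previous day's values only, so the order of the two updates inside the loop does not matter. The product of the last entries of `tau` and `xi` gives the one-step-ahead conditional mean implied by Equation (5).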

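The MIDAS long-run component of Section 2 can also be sketched in a few lines of Python. The code below computes the Beta weights of Equation (9) and the resulting $\tau_t$; the function names `beta_weights` and `midas_long_run`, and the convention that `X_lags[0]` holds $X_{t-1}$, are assumptions introduced for this illustration only.

```python
import math


def beta_weights(K, w1, w2):
    """Beta lag weights delta_k(omega), k = 1..K, as in eq. (9).

    Requires w2 >= 1 in this sketch: for k = K the factor (1 - k/K) is zero,
    so a negative exponent would raise ZeroDivisionError.
    """
    raw = [(k / K) ** (w1 - 1) * (1 - k / K) ** (w2 - 1) for k in range(1, K + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def midas_long_run(m, zeta, weights, X_lags):
    """tau_t = exp(m + zeta * sum_k delta_k * X_{t-k}); X_lags[0] is X_{t-1}."""
    return math.exp(m + zeta * sum(d * x for d, x in zip(weights, X_lags)))


# Illustrative call with made-up values for the low-frequency driver.
weights = beta_weights(K=12, w1=1.0, w2=3.0)
tau_t = midas_long_run(m=2.5, zeta=-0.1, weights=weights, X_lags=[0.2] * 12)
```

With `w1 = 1` and `w2 > 1` the weights decline monotonically in the lag `k`, so the most recent low-frequency observations count most; they sum to one by construction.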
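The DMEM Maximum Likelihood estimator $\hat{\boldsymbol{\theta}}_{ML}$ is defined as the value of $\boldsymbol{\theta}$ maximizing the average log-likelihood function

$$l_N = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} l_{i,t} = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left[ \ln f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) + \ln \varepsilon_{i,t} - \ln x_{i,t} \right],$$

where $N = \sum_{t=1}^{T} N_t$ is the number of observations. The portion of the average score function relative to $\boldsymbol{\theta}$ can be expressed as

$$\boldsymbol{s}_N = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \nabla_{\boldsymbol{\theta}}\, l_{i,t} = -N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} (\varepsilon_{i,t}\, b_{i,t} + 1)\, \boldsymbol{a}_{i,t}, \tag{10}$$

where

$$\varepsilon_{i,t} = \frac{x_{i,t}}{\tau_{i,t}\, \xi_{i,t}}, \tag{11}$$

$$\boldsymbol{a}_{i,t} = \frac{1}{\mu_{i,t}} \nabla_{\boldsymbol{\theta}}\, \mu_{i,t} = \frac{1}{\tau_{i,t}} \nabla_{\boldsymbol{\theta}}\, \tau_{i,t} + \frac{1}{\xi_{i,t}} \nabla_{\boldsymbol{\theta}}\, \xi_{i,t}, \qquad b_{i,t} = \nabla_{\varepsilon_{i,t}} \ln f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}).$$

A choice of $f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t})$ giving

$$E(\varepsilon_{i,t}\, b_{i,t} + 1 \mid \mathcal{F}_{i-1,t}) = 0 \tag{12}$$

guarantees a zero expected score and, with it, the consistency of $\hat{\boldsymbol{\theta}}_{ML}$. This condition holds when the error distribution is correctly specified but, as discussed in what follows, there are choices of $f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t})$ able to guarantee (12) even when they are wrongly specified: in this case, $\hat{\boldsymbol{\theta}}_{ML}$ is said to be a QML estimator.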
In what follows we assume that (12) is satisfied by the distribution chosen for $\varepsilon_{i,t}$.

The square portions relative to $\boldsymbol{\theta}$ of the asymptotic OPG ($\boldsymbol{I}_\infty$) and Hessian ($\boldsymbol{H}_\infty$) matrices are given by the $\lim_{N \to \infty}$ of, respectively,

$$\boldsymbol{I}_N = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \nabla_{\boldsymbol{\theta}}\, l_{i,t}\, \nabla_{\boldsymbol{\theta}'}\, l_{i,t} \right) = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left[ (\varepsilon_{i,t}\, b_{i,t} + 1)^2 \mid \mathcal{F}_{i-1,t} \right] E\left( \boldsymbol{a}_{i,t}\, \boldsymbol{a}_{i,t}' \right) \tag{13}$$

$$\begin{aligned}
\boldsymbol{H}_N &= N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \nabla_{\boldsymbol{\theta}\boldsymbol{\theta}'}\, l_{i,t} \right) \\
&= N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left[ E\left[ \varepsilon_{i,t} \left( b_{i,t} + \varepsilon_{i,t} \nabla_{\varepsilon_{i,t}} b_{i,t} \right) \mid \mathcal{F}_{i-1,t} \right] E\left( \boldsymbol{a}_{i,t}\, \boldsymbol{a}_{i,t}' \right) - E\left( \varepsilon_{i,t}\, b_{i,t} + 1 \mid \mathcal{F}_{i-1,t} \right) E\left( \nabla_{\boldsymbol{\theta}}\, \boldsymbol{a}_{i,t}' \right) \right] \\
&= N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left[ \varepsilon_{i,t} \left( b_{i,t} + \varepsilon_{i,t} \nabla_{\varepsilon_{i,t}} b_{i,t} \right) \mid \mathcal{F}_{i-1,t} \right] E\left( \boldsymbol{a}_{i,t}\, \boldsymbol{a}_{i,t}' \right),
\end{aligned} \tag{14}$$

where the last equality is implied by (12).

Expressions (13) and (14) are sufficient to derive $Avar(\hat{\boldsymbol{\theta}}_{ML})$ (the asymptotic variance matrix of $\hat{\boldsymbol{\theta}}_{ML}$), but only when the possible free shape parameter in $f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t})$, say $\lambda$, is "orthogonal" to $\boldsymbol{\theta}$ in the sense that it satisfies

$$\lim_{N \to \infty} \left[ N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \nabla_\lambda \nabla_{\boldsymbol{\theta}'}\, l_{i,t} \right) \right] = -\lim_{N \to \infty} \left[ N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \varepsilon_{i,t} \nabla_\lambda b_{i,t} \mid \mathcal{F}_{i-1,t} \right) E\left( \boldsymbol{a}_{i,t} \right) \right] = 0;$$

otherwise, the asymptotic variance of $\hat{\boldsymbol{\theta}}_{ML}$ depends also on the asymptotic variance of $\hat{\lambda}$. A sufficient condition for orthogonality is

$$E\left( \varepsilon_{i,t} \nabla_\lambda b_{i,t} \mid \mathcal{F}_{i-1,t} \right) = 0. \tag{15}$$

[Footnote: Expressing the full parameter vector as $(\boldsymbol{\theta}; \lambda)$, the corresponding OPG and Hessian matrices are structured in $(i,j)$-blocks ($i, j = 1, 2$) corresponding to the two parameters in that order. Since $Avar(\hat{\boldsymbol{\theta}}_{ML})$ is related to the $(1,1)$-block of some inverse matrix (be it the asymptotic OPG, Hessian or Sandwich matrix), in general it may depend on the asymptotic variance of $\hat{\lambda}$, as a consequence of block matrix algebra. For example, in case of correct model specification, $Avar(\hat{\boldsymbol{\theta}}_{ML}) = -\left( \boldsymbol{H}_{11} - \boldsymbol{H}_{12} \boldsymbol{H}_{22}^{-1} \boldsymbol{H}_{21} \right)^{-1}$ simplifies to $Avar(\hat{\boldsymbol{\theta}}_{ML}) = -\boldsymbol{H}_{11}^{-1}$ only in case $\boldsymbol{H}_{12} = \boldsymbol{0}$ (for the sake of simplicity, we use the symbols $\boldsymbol{I}$ and $\boldsymbol{H}$, reserved in this section to the parameter $\boldsymbol{\theta}$, also for the general $(\boldsymbol{\theta}; \lambda)$ case; we also omit the $\infty$ symbol). If one refers instead to the Sandwich matrix, we have in general $Avar(\hat{\boldsymbol{\theta}}_{ML}) = \boldsymbol{A}^{-1} \left( \boldsymbol{I}_{11} - \boldsymbol{B} \boldsymbol{I}_{21} - \boldsymbol{I}_{12} \boldsymbol{B}' + \boldsymbol{B} \boldsymbol{I}_{22} \boldsymbol{B}' \right) \boldsymbol{A}^{-1}$, with $\boldsymbol{A} = \boldsymbol{H}_{11} - \boldsymbol{H}_{12} \boldsymbol{H}_{22}^{-1} \boldsymbol{H}_{21}$ and $\boldsymbol{B} = \boldsymbol{H}_{12} \boldsymbol{H}_{22}^{-1}$, which simplifies to $Avar(\hat{\boldsymbol{\theta}}_{ML}) = \boldsymbol{H}_{11}^{-1} \boldsymbol{I}_{11} \boldsymbol{H}_{11}^{-1}$ again in case $\boldsymbol{H}_{12} = \boldsymbol{0}$. $\boldsymbol{H}_{12} = \boldsymbol{0}$ is what is labeled the "orthogonality" condition in the text. See Newey and McFadden (1994, Section 6) for a related discussion.]

In the following section we discuss two among the possible specifications of the error distribution.

A sensible specification for the conditional distribution of $\varepsilon_{i,t}$ is the $Gamma(\phi, \phi)$, which guarantees the constraint $E(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = 1$ and implies $V(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = 1/\phi$. This can be seen as a generalization, introduced by Engle and Gallo (2006), of the choice of the exponential distribution (where $\phi = 1$) within the Autoregressive Conditional Duration (ACD) model by Engle and Russell (1998) and of the $\chi^2(1)$ distribution (where $\phi = 1/2$) suggested by Engle (2002). In such a case,

$$b_{i,t} = \frac{\phi - 1}{\varepsilon_{i,t}} - \phi \quad \Rightarrow \quad \varepsilon_{i,t}\, b_{i,t} + 1 = \phi\, (1 - \varepsilon_{i,t}). \tag{16}$$

It is important to remark that this choice guarantees that condition (12) is satisfied even if the Gamma is not the true distribution of the error term (QML property), and irrespective of the value of $\phi$: this makes the results based on assuming the exponential or the $\chi^2(1)$ distributions much more general, upon an appropriate choice of the standard errors.

Plugging Equation (16) into (10) provides the $\boldsymbol{\theta}$-portion of the average score

$$\boldsymbol{s}_N = \phi\, N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} (\varepsilon_{i,t} - 1)\, \boldsymbol{a}_{i,t}, \tag{17}$$

which, in turn, implies the first order condition

$$\sum_{t=1}^{T} \sum_{i=1}^{N_t} (\varepsilon_{i,t} - 1)\, \boldsymbol{a}_{i,t} = \boldsymbol{0}. \tag{18}$$

Equation (16) guarantees also the important implication that the shape parameter $\phi$ is "orthogonal" to $\boldsymbol{\theta}$ in the sense of Equation (15):

$$E\left( \varepsilon_{i,t} \nabla_\phi b_{i,t} \mid \mathcal{F}_{i-1,t} \right) = E\left( \varepsilon_{i,t} \left( \varepsilon_{i,t}^{-1} - 1 \right) \mid \mathcal{F}_{i-1,t} \right) = 0,$$

as a consequence of the unit mean assumption for the error term. This, in turn, implies that the asymptotic variance of $\hat{\boldsymbol{\theta}}_{ML}$ is uniquely determined by the OPG and the Hessian matrices

$$\boldsymbol{I}_\infty = \phi^2 \sigma^2 \boldsymbol{A}, \qquad \boldsymbol{H}_\infty = -\phi\, \boldsymbol{A},$$

where

$$\boldsymbol{A} = \lim_{N \to \infty} \left[ N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \boldsymbol{a}_{i,t}\, \boldsymbol{a}_{i,t}' \right) \right].$$

Correspondingly, the OPG, Hessian and Sandwich versions of the asymptotic variance matrix are, respectively,

$$Avar_I(\hat{\boldsymbol{\theta}}_{ML}) = \phi^{-2} \sigma^{-2} \boldsymbol{A}^{-1}, \qquad Avar_H(\hat{\boldsymbol{\theta}}_{ML}) = \phi^{-1} \boldsymbol{A}^{-1}, \qquad Avar_S(\hat{\boldsymbol{\theta}}_{ML}) = \sigma^2 \boldsymbol{A}^{-1}. \tag{19}$$

Equivalence among the three expressions is ensured by taking $\phi = \sigma^{-2}$ (instead of fixing it, as for instance in the exponential and $\chi^2(1)$ cases); hence, a consistent estimator is

$$\widehat{Avar}(\hat{\boldsymbol{\theta}}_{ML}) = \hat{\sigma}^2 \hat{\boldsymbol{A}}^{-1},$$

where $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$,

$$\hat{\boldsymbol{A}} = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \hat{\boldsymbol{a}}_{i,t}\, \hat{\boldsymbol{a}}_{i,t}',$$

and $\hat{\boldsymbol{a}}_{i,t}$ means $\boldsymbol{a}_{i,t}$ evaluated at $\hat{\boldsymbol{\theta}}_{ML}$.

The ML estimator of $\phi$ solves

$$\ln \phi + 1 - \psi(\phi) + N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left[ \ln \hat{\varepsilon}_{i,t} - \hat{\varepsilon}_{i,t} \right] = 0, \tag{20}$$

where $\psi(\cdot)$ denotes the digamma function and $\hat{\varepsilon}_{i,t}$ indicates the RHS of (11) with the denominator evaluated at $\hat{\boldsymbol{\theta}}_{ML}$. Of course, this estimator is efficient if the true distribution is Gamma, but it cannot be computed when zeros are present in the data, since it requires $\ln \varepsilon_{i,t} = \ln x_{i,t} - \ln \tau_{i,t} - \ln \xi_{i,t}$. An alternative, which does not suffer from this drawback, is provided by using a GMM estimator of $\sigma^2$ (discussed below).

[Footnote: Considering the unit expectation constraint on $\varepsilon_{i,t}$, we likely have $N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \hat{\varepsilon}_{i,t} \approx 1$, so that (20) could be simplified as $\ln \phi - \psi(\phi) + N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \ln \hat{\varepsilon}_{i,t} = 0$.]

Another possible specification for the conditional distribution of $\varepsilon_{i,t}$ is the $Lognormal(-V/2, V)$, which guarantees the constraint $E(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = 1$ and implies $Var(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = \exp(V) - 1$. In such a case,

$$b_{i,t} = -\frac{1}{\varepsilon_{i,t}} \left( 1.5 + \frac{\ln \varepsilon_{i,t}}{V} \right) \quad \Rightarrow \quad \varepsilon_{i,t}\, b_{i,t} + 1 = -V^{-1} \left( \frac{V}{2} + \ln \varepsilon_{i,t} \right). \tag{21}$$

As noted before, if the Lognormal is the true distribution of $\varepsilon_{i,t}$ then condition (12) is satisfied; otherwise, this condition requires $E(\ln \varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = -V/2$. The $\boldsymbol{\theta}$-portion of the average score is then given by

$$\boldsymbol{s}_N = V^{-1} N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left( \ln \varepsilon_{i,t} + \frac{V}{2} \right) \boldsymbol{a}_{i,t}, \tag{22}$$

for the first order condition

$$\sum_{t=1}^{T} \sum_{i=1}^{N_t} \left( \ln \varepsilon_{i,t} + \frac{V}{2} \right) \boldsymbol{a}_{i,t} = \boldsymbol{0}. \tag{23}$$

Notice that, differently from the Gamma case (cf. Equation (18)), Equation (23) depends on the shape parameter $V$. This implies that, during estimation, one should alternate between the estimation of $\boldsymbol{\theta}$ and that of $V$.

Another important difference with the Gamma case is that the shape parameter $V$ is not "orthogonal" to $\boldsymbol{\theta}$, given that the LHS of Equation (15) is now

$$E\left( \varepsilon_{i,t} \nabla_V b_{i,t} \mid \mathcal{F}_{i-1,t} \right) = E\left( \varepsilon_{i,t}\, \varepsilon_{i,t}^{-1}\, \frac{\ln \varepsilon_{i,t}}{V^2} \mid \mathcal{F}_{i-1,t} \right) = -\frac{V}{2}\, V^{-2} = -(2V)^{-1}. \tag{24}$$

Hence, $Avar(\hat{\boldsymbol{\theta}}_{ML})$ depends both on $V$ and on the asymptotic variance of an estimator $\hat{V}$ (more on this below).

Focusing now on the shape parameter, the ML estimator of $V$ solves

$$\frac{V^2}{4} + V - N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \ln^2 \hat{\varepsilon}_{i,t} = 0, \tag{25}$$

which implies

$$\hat{V}_{ML} = 2 \left[ \sqrt{ N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \ln^2 \hat{\varepsilon}_{i,t} + 1 } - 1 \right]. \tag{27}$$

Because of (24), the asymptotic variance matrix of $\hat{\boldsymbol{\theta}}_{ML}$ and $\hat{V}_{ML}$ depends on their joint behavior.
Assuming the correct specification of $f_\varepsilon(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t})$, the joint Hessian matrix is given by

$$-V^{-1} \begin{pmatrix} \boldsymbol{A} & -\boldsymbol{a}/2 \\ -\boldsymbol{a}'/2 & \dfrac{V + 2}{4V} \end{pmatrix}, \tag{28}$$

where

$$\boldsymbol{a} = \lim_{N \to \infty} \left[ N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\left( \boldsymbol{a}_{i,t} \right) \right].$$

This implies

$$Avar(\hat{\boldsymbol{\theta}}_{ML}) = V \left( \boldsymbol{A} - \frac{V}{V + 2}\, \boldsymbol{a} \boldsymbol{a}' \right)^{-1},$$

which can be estimated by

$$\widehat{Avar}(\hat{\boldsymbol{\theta}}_{ML}) = \hat{V} \left( \hat{\boldsymbol{A}} - \frac{\hat{V}}{\hat{V} + 2}\, \hat{\boldsymbol{a}} \hat{\boldsymbol{a}}' \right)^{-1}, \qquad \text{where} \quad \hat{\boldsymbol{a}} = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \hat{\boldsymbol{a}}_{i,t}.$$

The availability of several different closed form estimators of $V$ (depending on the $\hat{\varepsilon}_{i,t}$'s) makes it possible to build a concentrated log-likelihood by replacing $V$ with the desired $\hat{V}$ formula: since the concentrated log-likelihood depends only on $\boldsymbol{\theta}$, this bypasses the need to alternate between the estimation of $\boldsymbol{\theta}$ and that of $V$ (e.g. expression (27), as in Cattivelli and Gallo, 2020). A simpler alternative may be to resort to the Method of Moments (MM) estimator (26), which is also in line with the zero expected score requirement in (12).

[Footnote: Alternative estimators are possible. For example, the zero expected score condition $E(\ln \varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = -V/2$ leads to

$$\hat{V} = -2\, N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \ln \hat{\varepsilon}_{i,t}, \tag{26}$$

which is non-negative because of Jensen's inequality ($E(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = 1 \Rightarrow E(\ln \varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) \le \ln E(\varepsilon_{i,t} \mid \mathcal{F}_{i-1,t}) = \ln 1 = 0$). Another option replaces $-V/2$ with the sample average of the $\ln \hat{\varepsilon}_{i,t}$'s (justified by the zero expected score condition, again). This leads to estimating $V$ by the sample variance of the $\ln \hat{\varepsilon}_{i,t}$'s.]

3.2 Generalized Method of Moments Inference

A different way to estimate the model, which does not need an explicit choice of the error term distribution, is to resort to the Generalized Method of Moments (GMM). Let

$$\varepsilon_{i,t} - 1 = \frac{x_{i,t}}{\tau_{i,t}\, \xi_{i,t}} - 1. \tag{29}$$

Under the model assumptions in (2), $\varepsilon_{i,t} - 1$ has conditional mean zero and conditional variance $\sigma^2$. Following Brownlees et al. (2012, Section 9.2.2.2), we get that the efficient GMM estimator of $\boldsymbol{\theta}$, say $\hat{\boldsymbol{\theta}}_{GMM}$, solves the criterion equation (18) and has the asymptotic variance matrix given in (19), i.e., the same properties as $\hat{\boldsymbol{\theta}}_{ML}$ assuming Gamma distributed errors.

In the spirit of a semiparametric approach, a straightforward estimator for $\sigma^2$ is

$$\hat{\sigma}^2 = N^{-1} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left( \hat{\varepsilon}_{i,t} - 1 \right)^2,$$

where $\hat{\varepsilon}_{i,t}$ represents here $\varepsilon_{i,t}$ evaluated at $\hat{\boldsymbol{\theta}}_{GMM}$. Note that this estimator does not suffer from the presence of zeros in the data.
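The closed-form estimators of the error variance and of the Lognormal shape parameter discussed in this section are simple enough to illustrate directly. The Python sketch below takes a list of estimated residuals $\hat{\varepsilon}_{i,t}$ and computes the GMM estimator of $\sigma^2$, the ML estimator of $V$ in Equation (27) and the Method of Moments estimator in Equation (26). The function names and the sample residuals are hypothetical, chosen only for this illustration.

```python
import math


def sigma2_gmm(eps):
    """GMM estimator of sigma^2: the average of (eps - 1)^2. Zeros in eps are allowed."""
    return sum((e - 1.0) ** 2 for e in eps) / len(eps)


def v_ml(eps):
    """Lognormal ML estimator of V, eq. (27); requires every residual to be > 0."""
    m2 = sum(math.log(e) ** 2 for e in eps) / len(eps)
    return 2.0 * (math.sqrt(m2 + 1.0) - 1.0)


def v_mm(eps):
    """Method of Moments estimator of V, eq. (26); requires every residual to be > 0."""
    return -2.0 * sum(math.log(e) for e in eps) / len(eps)


# Made-up residuals with sample mean one (not taken from the paper).
residuals = [0.8, 1.3, 0.9, 1.1, 0.9]
print(sigma2_gmm(residuals), v_ml(residuals), v_mm(residuals))
```

Only `sigma2_gmm` remains defined when some residual is exactly zero; the two Lognormal estimators rely on logarithms and would raise an error in that case, which mirrors the remark on zeros in the data.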

Volatility, our main object of interest, is expressed as the square root of the realized kernel variance (Barndorff-Nielsen et al., 2008, 2009) converted into percentage annualized terms: for the sake of comparison, given that the realized volatility refers to the open-to-close period, we estimate the GARCH models also in reference to such period. Data on the S&P 500, FTSE 100, NASDAQ and Hang Seng indices have been collected from the realized library of the Oxford-Man Institute (Heber et al., 2009), which allows us to derive open-to-close returns and their sign. The MIDAS-related macroeconomic variable is the US Industrial Production ($IP_t$), observed monthly and taken from the Federal Reserve Economic Data database. The variable $IPc_t$ is its month-to-month percentage change (as in Conrad and Loch, 2015). The period under consideration for all the variables is from 2 January 2001 to 15 May 2020. For reference purposes, some summary statistics (minimum, maximum, mean, standard deviation, skewness and kurtosis) for all variables considered are in Table 1.

[Table 1 about here.]

Figure 1 depicts the open-to-close log-returns (top panels, black lines) and realized kernel volatilities (bottom panels, blue lines) for the four indices considered over the full sample. We superimposed the US recession periods dated by the NBER in 2001 and then 2008-09, as a reference to periods of slowdown in economic activity (and hence a downturn in industrial production). Although the scales are different, there are features in the dynamics of the series which are common to all four indices, notably the explosion of volatility around the Lehman Brothers demise in September 2008, and other episodes which are more idiosyncratic, although the surge in volatility at the end of 2002 is common to the US and UK indices, and the one in 2015 seems to have affected more the US markets and Hong Kong.

[Figure 1 about here.]

We include in the set of competing models those having the realized volatility as the dependent variable, namely the multiplicative class (the AMEM plus the two proposed specifications MEM-MIDAS and Component-MEM) and the asymmetric version of the HAR model (AHAR), on the one side; and the GARCH class for the conditional variance of open-to-close returns, namely the GJR, the GM, and the DAGM, on the other. To the latter, we add the RGARCH, which is still specified as a GARCH, but makes use of realized variance in its specification. All the functional forms are described in Table 2.

[Table 2 about here.]

The testing ground for the models includes two different robust loss functions (LFs, Patton, 2011): QLIKE and MSE. All LFs have the realized kernel volatility as their target, and the variance forecasts of the GARCH models are modified to match that target. The evaluation makes use of the Model Confidence Set (MCS, Hansen et al., 2011), and the test statistic used in the MCS procedure is the semi-quadratic $T_{SQ}$, as recently done by Cipollini et al. (2020), for instance.

The first in-sample period spans from January 2001 to December 2012. Tables 3 to 6 report the estimated coefficients for each model, some residual diagnostics and the MCS inclusion according to the two LFs. In terms of diagnostics, we consider the Ljung-Box test (Ljung and Box, 1978), applied on standardized residuals (squared standardized residuals for the GARCH-based models) at different lags. Overall, considering higher lags, the tests for the two proposed specifications signal an absence of clustering in the residuals (except for the NASDAQ index), contrary to what happens for many of the other competing specifications. As regards the inclusion in the MCS, we notice that the MEM-based specifications perform better than all the other models. Interestingly, the proposed Component-MEM model is always included in the set of the superior models, independently of the LF adopted. In the case of the FTSE 100, the Component-MEM model is the only specification belonging to the MCS.

[Table 3 about here.]

[Table 4 about here.]

[Table 5 about here.]

[Table 6 about here.]

4.2 A Graphical Appraisal of the long-run

The two DMEM models produce an estimate of the long–run component which is at a daily frequency for the Component-MEM, and at a monthly frequency for the MEM-MIDAS: in order for them to be compared, we aggregate the former at the monthly level by averaging to the same scale, with an obvious change of notation for the objects involved, namely dropping the subscript $i$. In Figure 2 we report the four $\tau_t$ components (one for each index) estimated with the Component-MEM (top plot) and with the MEM-MIDAS (bottom plot). The $\tau_t$ components appear to follow a similar pattern across all the indices, within the same specification (more on this later). To investigate this aspect, in Table 7 we report the correlations among the $\tau_t$ terms of the Component-MEM (numbers in regular text) and among those of the MEM-MIDAS (numbers in italics); on the main diagonal, we reproduce the correlation coefficient between the $\tau_t$'s estimated by the two different methods: they are all above 0.[…] […] $\tau_t$'s for different indices is confirmed and, as expected, the values are higher for the two US and the UK markets.

Footnote: It is not relevant, for the sake of our argument, to address the issue of the different opening schedules across time zones here.

[Figure 2 about here.]
[Table 7 about here.]

In the out-of-sample exercise, each model is estimated using a rolling window of twelve years (approximately 3000 daily observations). Subsequently, one-step-ahead forecasts are generated for the following two months, conditionally on the parameter estimates previously obtained. Then the estimation window shifts forward by two months, new out-of-sample forecasts are produced for the following two months as in the previous step, and so forth until the end of the series. The first estimation period coincides with the in-sample period 2001–2012. The out-of-sample performances of the models, for each index under consideration, are reported in Tables 8 to 11. The largest gray area (indicating inclusion in the MCS) across all tables, LFs and out-of-sample periods belongs to the MEM–based models, followed by a more scattered presence of the AHAR. The consistent presence of these models is reassuring in terms of modeling realized volatility directly, on the one hand, and, within that class, in terms of the convenience of treating innovation terms as entering multiplicatively. Modeling conditional volatility through the conditional second moments of returns seems to be dominated […] RGARCH seldom enters the MCS.

Footnote: The results presented here are robust to larger refitting periods. Additional material is available upon request.

To gain further insight into the behavior of each model in relation to the observed volatility pattern, we suggest a graphical comparison (Figure 3) between the two DMEM models introduced in this paper. To that end, we reproduce, for the last period of our sample (from 2 January 2020 to 15 May 2020), the out–of–sample forecasts next to the realized kernel volatility.

[Table 8 about here.]
[Table 9 about here.]
[Table 10 about here.]
[Table 11 about here.]
[Figure 3 about here.]
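The rolling scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `rolling_schedule` and `rolling_forecasts` are hypothetical, windows are counted in observations rather than calendar dates (for instance, roughly 3000 days for estimation and an assumed 42 trading days for a two-month step), and the `fit` and `predict` callables stand in for whichever model is being evaluated.

```python
from typing import Callable, List, Sequence, Tuple


def rolling_schedule(n_obs: int, window: int, step: int) -> List[Tuple[int, int, int]]:
    """Return (train_start, train_end, test_end) index triples.

    A model is estimated on [train_start, train_end) and then used, with
    parameters held fixed, to forecast the observations in [train_end, test_end).
    """
    schedule = []
    start = 0
    while start + window < n_obs:
        train_end = start + window
        test_end = min(train_end + step, n_obs)  # the last block may be shorter
        schedule.append((start, train_end, test_end))
        start += step  # shift the estimation window forward by one step
    return schedule


def rolling_forecasts(
    series: Sequence[float],
    window: int,
    step: int,
    fit: Callable[[Sequence[float]], object],
    predict: Callable[[object, Sequence[float]], float],
) -> List[float]:
    """One-step-ahead forecasts with parameters refitted once per block."""
    forecasts = []
    for start, train_end, test_end in rolling_schedule(len(series), window, step):
        params = fit(series[start:train_end])
        for t in range(train_end, test_end):
            # Data up to t-1 are available; parameters are not re-estimated.
            forecasts.append(predict(params, series[:t]))
    return forecasts


def mean_fit(window_data: Sequence[float]) -> float:
    """Toy 'model' (assumption for illustration): the window average."""
    return sum(window_data) / len(window_data)


def constant_predict(params: float, history: Sequence[float]) -> float:
    """Toy forecast: repeat the fitted average."""
    return params


if __name__ == "__main__":
    toy = [float(x) for x in range(1, 11)]
    print(rolling_schedule(len(toy), window=6, step=2))
    print(rolling_forecasts(toy, 6, 2, mean_fit, constant_predict))
```

Any model exposing a fit step and a one-step-ahead prediction step can be plugged in; the toy average is only there to make the sketch runnable.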

Two different general approaches can be followed when forecasting asset return volatility: one is the GARCH approach, where the conditional variance is estimated from return data; the other is modeling the conditional expectation of volatility using realized volatility measures built on ultra–high frequency data. In the first approach, therefore, measurement and modeling are comprised within the same framework, while, in the second, the two aspects are decoupled. The merits of the GARCH model are testified by the hundreds of thousands of theoretical and empirical contributions since the seminal paper by Engle (1982). This type of approach has been enriched over the years by successive refinements, with the goal of capturing some empirical regularities in the pattern of the observed time series. This is the case for the consideration of a time–varying local average in the conditional variance, a feature addressed by Engle and Rangel (2008), also with reference to its economic interpretation in terms of macroeconomic fluctuations. As a parallel approach, direct modeling of realized measures of volatility has the advantage of exploiting the better theoretical properties of these ultra–high frequency measures (less noisy than squared returns). For either approach, the difficulty of collecting the data and of fine-tuning a model to derive the forecast has to be weighed against the actual reward in terms of improved forecasting performance. The availability of freely downloadable price data still sustains the popularity of the GARCH approach (especially among practitioners), but it is also true that the number of high–frequency data vendors is expanding and that DIY processing and storing of tick-by-tick data is not a prohibitive task.

A comparison across models can be interpreted as an exercise that aims at assessing the capability of each model to reproduce empirical regularities in the data, but also at establishing how important those stylized facts are when taken on an out–of–sample terrain.

In this context, our paper has two clear outcomes: one is to suggest that modeling realized volatility delivers better results than going through a GARCH–type approach; the second is to show that incorporating the feature that average volatility by subperiod is time–varying provides an advantage in forecasting. For the first outcome, there are clear merits in using a model in which the errors enter multiplicatively, as in the MEM: this mitigates the attenuation bias in realized volatility models documented by Cipollini et al. (2020), because it takes into explicit consideration the heteroskedastic nature of volatility measurement errors. For the second outcome, we suggest that doubling the multiplicative components, incorporating a slow-moving and a short–run component of volatility dynamics, delivers better results, at least for our four stock market indices. We contributed two such models, differentiated by the type of information entering the low–frequency component: in the Component-MEM, we use the same daily data, but we allow for more persistent dynamics; in the MEM-MIDAS, we use a monthly macro-variable (the US industrial production), the variations of which combine in a smooth component which exploits the mixed sampling results by Ghysels et al. (2006) and by Engle et al. (2013). While our MEM-MIDAS performs better than the corresponding GM or DAGM in a GARCH context, its delivering a $\tau_t$ which lags behind the bursts of volatility makes it, at times, less preferable than another member of the DMEM family, namely the Component-MEM. We can see a convenience in using the MEM-MIDAS within a scenario–type approach designing prolonged periods of downturns in economic activity (not necessarily limited to our choice of US industrial production): the impact and aftermath of the COVID–19 health emergency on financial volatility may thus be studied by projecting to the medium term this channel of transmission originating in the real economy.

While refinements are still possible (e.g. the use of a second lag of observed volatility values, or a DAGM extension within the MEM-MIDAS), one indication that emerges from the empirical results is that the components estimated by our models have some commonality that should be exploited, in a common factor sense, by a joint modeling of the series.

References

Amado, C., Silvennoinen, A. and Teräsvirta, T. (2019) Models with multiplicative decomposition of conditional variances and correlations, in Financial Mathematics, Volatility and Covariance Modelling (Eds.) J. Chevallier, S. Goutte, D. Guerreiro, S. Saglio and B. Sanhaji, Routledge, vol. 2.
Amado, C. and Teräsvirta, T. (2008) Modelling conditional and unconditional heteroskedasticity with smoothly time-varying structure, Tech. Rep. 8, CREATES Research Paper.
Amendola, A., Candila, V. and Gallo, G. M. (2019) On the asymmetric impact of macro–variables on volatility, Economic Modelling, 135–152.
Andersen, T. G. and Bollerslev, T. (1998) Answering the skeptics: Yes, standard volatility models do provide accurate forecasts, International Economic Review, 885–905.
Andersen, T. G., Bollerslev, T., Christoffersen, P. F. and Diebold, F. X. (2006) Volatility and correlation forecasting, in Handbook of Economic Forecasting (Eds.) G. Elliott, C. W. J. Granger and A. Timmermann, North Holland.
Barigozzi, M., Brownlees, C., Gallo, G. M. and Veredas, D. (2014) Disentangling systematic and idiosyncratic dynamics in panels of volatility measures, Journal of Econometrics, 364–384.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A. and Shephard, N. (2008) Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise, Econometrica, 1481–1536.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A. and Shephard, N. (2009) Realised kernels in practice: trades and quotes, Econometrics Journal, 1–32.
Bollerslev, T. (1986) Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics, 307–327.
Brownlees, C. T., Cipollini, F. and Gallo, G. M. (2011) Intra-daily volume modeling and prediction for algorithmic trading, Journal of Financial Econometrics, 489–518.
Brownlees, C. T., Cipollini, F. and Gallo, G. M. (2012) Multiplicative error models, in Volatility Models and Their Applications (Eds.) L. Bauwens, C. Hafner and S. Laurent, Wiley, pp. 223–247.
Brownlees, C. T. and Gallo, G. M. (2010) Comparison of volatility measures: a risk management perspective, Journal of Financial Econometrics, 29–56.
Cattivelli, L. and Gallo, G. M. (2020) Adaptive lasso for vector multiplicative error models, Quantitative Finance, 255–274.
Cipollini, F., Gallo, G. M. and Otranto, E. (2020) Realized volatility forecasting: Robustness to measurement errors, International Journal of Forecasting, forthcoming.
Conrad, C. and Kleen, O. (2020) Two are better than one: Volatility forecasting using multiplicative component GARCH-MIDAS models, Journal of Applied Econometrics, 19–45.
Conrad, C. and Loch, K. (2015) Anticipating long-term stock market volatility, Journal of Applied Econometrics, 1090–1114.
Corsi, F. (2009) A simple approximate long-memory model of realized volatility, Journal of Financial Econometrics, 174–196.
Dueker, M. J. (1997) Markov switching in GARCH processes and mean-reverting stock-market volatility, Journal of Business & Economic Statistics, 26–34.
Engle, R. F. (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation, Econometrica, 987–1007.
Engle, R. F. (2002) New frontiers for ARCH models, Journal of Applied Econometrics, 425–446.
Engle, R. F. and Gallo, G. M. (2006) A multiple indicators model for volatility using intra-daily data, Journal of Econometrics, 3–27.
Engle, R. F., Ghysels, E. and Sohn, B. (2013) Stock market volatility and macroeconomic fundamentals, Review of Economics and Statistics, 776–797.
Engle, R. F. and Rangel, J. G. (2008) The spline-GARCH model for low frequency volatility and its global macroeconomic causes, Review of Financial Studies, 1187–1222.
Engle, R. F. and Russell, J. R. (1998) Autoregressive conditional duration: A new model for irregularly spaced transaction data, Econometrica, 1127–62.
Gallo, G. M. and Otranto, E. (2015) Forecasting realized volatility with changing average levels, International Journal of Forecasting, 620–634.
Ghysels, E., Santa-Clara, P. and Valkanov, R. (2006) Predicting volatility: getting the most out of return data sampled at different frequencies, Journal of Econometrics, 59–95.
Ghysels, E., Sinko, A. and Valkanov, R. (2007) MIDAS regressions: Further results and new directions, Econometric Reviews, 53–90.
Glosten, L. R., Jagannathan, R. and Runkle, D. E. (1993) On the relation between the expected value and the volatility of the nominal excess return on stocks, The Journal of Finance, 1779–1801.
Hamilton, J. D. and Susmel, R. (1994) Autoregressive conditional heteroskedasticity and changes in regime, Journal of Econometrics, 307–333.
Han, H. and Kristensen, D. (2014) Asymptotic theory for the QMLE in GARCH-X models with stationary and nonstationary covariates, Journal of Business & Economic Statistics, 416–429.
Hansen, P. R., Huang, Z. and Shek, H. H. (2012) Realized GARCH: a joint model for returns and realized measures of volatility, Journal of Applied Econometrics, 877–906.
Hansen, P. R., Lunde, A. and Nason, J. M. (2011) The Model Confidence Set, Econometrica, 453–497.
Heber, G., Lunde, A., Shephard, N. and Sheppard, K. (2009) OMI's realised library, version 0.1, Tech. rep., Oxford–Man Institute, University of Oxford.
Ljung, G. M. and Box, G. E. P. (1978) On a measure of lack of fit in time series models, Biometrika, 297–303.
Mazur, B. and Pipień, M. (2012) On the empirical importance of periodicity in the volatility of financial returns - time varying GARCH as a second order APC(2) process, Central European Journal of Economic Modelling and Econometrics, 95–116.
Newey, W. K. and McFadden, D. (1994) Large sample estimation and hypothesis testing, in Handbook of Econometrics (Eds.) R. F. Engle and D. McFadden, Elsevier, vol. 4, chap. 36, pp. 2111–2245.
Pan, Z. and Liu, L. (2018) Forecasting stock return volatility: A comparison between the roles of short-term and long-term leverage effects, Physica A: Statistical Mechanics and its Applications, 168–180.
Patton, A. (2011) Volatility forecast comparison using imperfect volatility proxies, Journal of Econometrics, 246–256.

Figure 1: Annualized daily log-returns and realized kernel volatility
[Panels: (a) S&P 500, (b) FTSE 100, (c) NASDAQ, (d) Hang Seng. Axes: daily log-returns (annual. %) and realized kernel volatility (annual. %).]
Notes:

Plots of open-to-close log-returns (top panels, black lines) and realized kernel volatilities (bottom panels, blue lines). Shaded areas represent US recession periods (NBER dating).

Figure 2: Monthly $\tau_t$ term: comparison between Component-MEM and MEM-MIDAS
[Panels: (a) Component-MEM $\tau_t$, (b) MEM-MIDAS $\tau_t$. Series: S&P 500, FTSE 100, NASDAQ, Hang Seng. Vertical axis: low-frequency component (annualized %).]
Notes: Plot of the Component-MEM (top plot) and MEM-MIDAS (bottom plot) $\tau_t$ terms. Shaded areas represent US recession periods (NBER dating).

Figure 3: Component-MEM and MEM-MIDAS out-of-sample volatilities
[Panels: (a) S&P 500, (b) FTSE 100, (c) NASDAQ, (d) Hang Seng. Vertical axis: volatility (annualized %).]
Notes: Realized kernel volatility (grey line), Component-MEM (blue dotted line) and MEM-MIDAS (black line) out-of-sample estimated volatilities. Period: 2 January 2020 - 15 May 2020.

Table 1: Summary statistics

Columns: Obs., Min., Max., Mean, SD, Skew., Kurt.
Daily data: log-returns and realized kernel volatility for each index (S&P 500 log-returns: 4859 observations).
Monthly data: $IPc_t$.
[Remaining cell values not recoverable from the extracted text.]
Notes: The table reports the number of observations (Obs.), the minimum (Min.) and maximum (Max.), the mean, standard deviation (SD), skewness (Skew.) and excess kurtosis (Kurt.). The sample period is 2 January 2001 - 15 May 2020. The daily variables are the open-to-close log-returns and realized kernel volatility, both expressed in annualized percentage. The monthly variable is the US Industrial Production ($IP_t$), expressed as the annualized month-to-month percentage change ($IPc_t$), that is $12 \cdot 100 \cdot ((IP_t / IP_{t-1}) - 1)$.

Table 2: Model specifications

Model; functional form; error distribution

AMEM
$rvol_{i,t} \mid \mathcal{F}_{i-1,t} = \mu_{i,t}\,\varepsilon_{i,t}$; error: $\varepsilon_{i,t} \overset{i.i.d.}{\sim} D^{+}(1, \sigma^{2})$
$\mu_{i,t} = \alpha_0 + (\alpha + \gamma\,\mathbf{1}(r_{i-1,t} < 0))\, rvol_{i-1,t} + \beta\,\mu_{i-1,t}$
$\alpha_0 = (1 - \alpha - \beta - \gamma/2)\,\mu$, with $\mu = E[rvol_{i,t}]$

Component-MEM
$rvol_{i,t} \mid \mathcal{F}_{i-1,t} = \tau_{i,t}\,\xi_{i,t}\,\varepsilon_{i,t}$; error: $\varepsilon_{i,t} \overset{i.i.d.}{\sim} D^{+}(1, \sigma^{2})$
$\xi_{i,t} = (1 - \alpha - \gamma/2 - \beta) + \alpha\, x^{(\xi)}_{i-1,t} + \gamma\, x^{(\xi-)}_{i-1,t} + \beta\,\xi_{i-1,t}$, with $x^{(\xi)}_{i,t} \equiv rvol_{i,t}/\tau_{i,t}$ and $x^{(\xi-)}_{i,t} \equiv x^{(\xi)}_{i,t}\,\mathbf{1}(r_{i,t} < 0)$
$\tau_{i,t} = \omega^{(\tau)} + \alpha^{(\tau)} x^{(\tau)}_{i-1,t} + \gamma^{(\tau)} x^{(\tau-)}_{i-1,t} + \beta^{(\tau)} \tau_{i-1,t}$, with $x^{(\tau)}_{i,t} \equiv rvol_{i,t}/\xi_{i,t}$ and $x^{(\tau-)}_{i,t} \equiv x^{(\tau)}_{i,t}\,\mathbf{1}(r_{i,t} < 0)$

MEM-MIDAS
$rvol_{i,t} \mid \mathcal{F}_{i-1,t} = \tau_{t}\,\xi_{i,t}\,\varepsilon_{i,t}$; error: $\varepsilon_{i,t} \overset{i.i.d.}{\sim} D^{+}(1, \sigma^{2})$
$\xi_{i,t} = (1 - \alpha - \beta - \gamma/2) + (\alpha + \gamma \cdot \mathbf{1}(r_{i-1,t} < 0))\, \dfrac{rvol_{i-1,t}}{\tau_t} + \beta\,\xi_{i-1,t}$
$\tau_t = \exp\{ m + \zeta \sum_{k=1}^{K} \delta_k(\omega)\, X_{t-k} \}$

AHAR
$rvol_{i,t} = c + (\beta_1 + \gamma\,\mathbf{1}(r_{i-1,t} < 0))\, rvol_{i-1,t} + \beta_2\, rvol_{(i-5):(i-1),t} + \beta_3\, rvol_{(i-22):(i-1),t} + u_{i,t}$; error: $u_{i,t} \overset{i.i.d.}{\sim} N(0, \sigma_u^{2})$

GJR
$r_{i,t} \mid \mathcal{F}_{i-1,t} = \sqrt{h_{i,t}}\,\eta_{i,t}$; error: $\eta_{i,t} \overset{i.i.d.}{\sim} N(0, 1)$
$h_{i,t} = const + (\alpha + \gamma\,\mathbf{1}(r_{i-1,t} < 0))\, r^{2}_{i-1,t} + \beta\, h_{i-1,t}$

GM
$r_{i,t} \mid \mathcal{F}_{i-1,t} = \sqrt{\tau_t \times \xi_{i,t}}\,\eta_{i,t}$; error: $\eta_{i,t} \overset{i.i.d.}{\sim} N(0, 1)$
$\xi_{i,t} = (1 - \alpha - \beta - \gamma/2) + (\alpha + \gamma \cdot \mathbf{1}(r_{i-1,t} < 0))\, \dfrac{r^{2}_{i-1,t}}{\tau_t} + \beta\,\xi_{i-1,t}$
$\tau_t = \exp\{ m + \zeta \sum_{k=1}^{K} \delta_k(\omega)\, X_{t-k} \}$

DAGM
$r_{i,t} \mid \mathcal{F}_{i-1,t} = \sqrt{\tau_t \times \xi_{i,t}}\,\eta_{i,t}$; error: $\eta_{i,t} \overset{i.i.d.}{\sim} N(0, 1)$
$\xi_{i,t} = (1 - \alpha - \beta - \gamma/2) + (\alpha + \gamma \cdot \mathbf{1}(r_{i-1,t} < 0))\, \dfrac{r^{2}_{i-1,t}}{\tau_t} + \beta\,\xi_{i-1,t}$
$\tau_t = \exp\{ m + \zeta^{+} \sum_{k=1}^{K} \delta_k(\omega)^{+} X_{t-k}\,\mathbf{1}(X_{t-k} \geq 0) + \zeta^{-} \sum_{k=1}^{K} \delta_k(\omega)^{-} X_{t-k}\,\mathbf{1}(X_{t-k} < 0) \}$

RGARCH
$r_{i,t} \mid \mathcal{F}_{i-1,t} = \sqrt{h_{i,t}}\,\eta_{i,t}$; error: $\eta_{i,t} \overset{i.i.d.}{\sim} N(0, 1)$
$\log(h_{i,t}) = const + \beta \log(h_{i-1,t}) + \alpha \log(rvol_{i-1,t})$

Notes: The table reports the functional forms for the Asymmetric MEM (AMEM), MEM-MIDAS, Component-MEM, Asymmetric HAR (AHAR), GJR, GARCH–MIDAS (GM), Double Asymmetric GARCH–MIDAS (DAGM), and Realized GARCH (

RGARCH) specifications.

Table 3: In-sample comparison. S&P 500
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: const, $\alpha$, $\beta$ (three rows), $\gamma$, $m$, $\zeta$, $\omega$, $\zeta^{+}$, $\omega^{+}$, $\zeta^{-}$, $\omega^{-}$, $\omega^{(\tau)}$, $\alpha^{(\tau)}$, $\beta^{(\tau)}$, $\gamma^{(\tau)}$, LB (at different lags), QLIKE, MSE.
[Cell values (estimates, standard errors, p-values, average losses) and MCS shading not recoverable from the extracted text.]
Notes: The table reports the estimated coefficients of the models in column. ∗, ∗∗ and ∗∗∗ represent significance at levels 10%, […] AMEM model refers to the $\alpha_0$ parameter in Table 2. For ease of notation, the parameter $\alpha$ referred to the RGARCH corresponds to the parameter labelled as $\gamma$ in Hansen et al. (2012). Moreover, the estimated parameters of the measurement equation of this latter model are not reported for space constraints. $LB_l$ represents the p-values of the Ljung-Box (Ljung and Box, 1978) test at lag $l$, applied to standardized residuals (squared for GARCH models). The last two rows report the averages of the QLIKE and MSE loss functions. The chosen volatility proxy is the realized kernel. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […] $IPc_t$. Number of lagged macro-economic variable realizations: $K$ = […]

Table 4: In-sample comparison. FTSE 100
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: const, $\alpha$, $\beta$ (three rows), $\gamma$, $m$, $\zeta$, $\omega$, $\zeta^{+}$, $\omega^{+}$, $\zeta^{-}$, $\omega^{-}$, $\omega^{(\tau)}$, $\alpha^{(\tau)}$, $\beta^{(\tau)}$, $\gamma^{(\tau)}$, LB (at different lags), QLIKE, MSE.
[Cell values (estimates, standard errors, p-values, average losses) and MCS shading not recoverable from the extracted text.]
Notes: The table reports the estimated coefficients of the models in column. ∗, ∗∗ and ∗∗∗ represent significance at levels 10%, […] AMEM model refers to the $\alpha_0$ parameter in Table 2. For ease of notation, the parameter $\alpha$ referred to the RGARCH corresponds to the parameter labelled as $\gamma$ in Hansen et al. (2012). Moreover, the estimated parameters of the measurement equation of this latter model are not reported for space constraints. $LB_l$ represents the p-values of the Ljung-Box (Ljung and Box, 1978) test at lag $l$, applied to standardized residuals (squared for GARCH models). The last two rows report the averages of the QLIKE and MSE loss functions. The chosen volatility proxy is the realized kernel. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […] $IPc_t$. Number of lagged macro-economic variable realizations: $K$ = […]

Table 5: In-sample comparison. NASDAQ
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: const, $\alpha$, $\beta$ (three rows), $\gamma$, $m$, $\zeta$, $\omega$, $\zeta^{+}$, $\omega^{+}$, $\zeta^{-}$, $\omega^{-}$, $\omega^{(\tau)}$, $\alpha^{(\tau)}$, $\beta^{(\tau)}$, $\gamma^{(\tau)}$, LB (at different lags), QLIKE, MSE.
[Cell values (estimates, standard errors, p-values, average losses) and MCS shading not recoverable from the extracted text.]
Notes: The table reports the estimated coefficients of the models in column. ∗, ∗∗ and ∗∗∗ represent significance at levels 10%, […] AMEM model refers to the $\alpha_0$ parameter in Table 2. For ease of notation, the parameter $\alpha$ referred to the RGARCH corresponds to the parameter labelled as $\gamma$ in Hansen et al. (2012). Moreover, the estimated parameters of the measurement equation of this latter model are not reported for space constraints. $LB_l$ represents the p-values of the Ljung-Box (Ljung and Box, 1978) test at lag $l$, applied to standardized residuals (squared for GARCH models). The last two rows report the averages of the QLIKE and MSE loss functions. The chosen volatility proxy is the realized kernel. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […] $IPc_t$. Number of lagged macro-economic variable realizations: $K$ = […]

Table 6: In-sample comparison. Hang Seng
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: const, $\alpha$, $\beta$ (three rows), $\gamma$, $m$, $\zeta$, $\omega$, $\zeta^{+}$, $\omega^{+}$, $\zeta^{-}$, $\omega^{-}$, $\omega^{(\tau)}$, $\alpha^{(\tau)}$, $\beta^{(\tau)}$, $\gamma^{(\tau)}$, LB (at different lags), QLIKE, MSE.
[Cell values (estimates, standard errors, p-values, average losses) and MCS shading not recoverable from the extracted text.]
Notes: The table reports the estimated coefficients of the models in column. ∗, ∗∗ and ∗∗∗ represent significance at levels 10%, […] AMEM model refers to the $\alpha_0$ parameter in Table 2. For ease of notation, the parameter $\alpha$ referred to the RGARCH corresponds to the parameter labelled as $\gamma$ in Hansen et al. (2012). Moreover, the estimated parameters of the measurement equation of this latter model are not reported for space constraints. $LB_l$ represents the p-values of the Ljung-Box (Ljung and Box, 1978) test at lag $l$, applied to standardized residuals (squared for GARCH models). The last two rows report the averages of the QLIKE and MSE loss functions. The chosen volatility proxy is the realized kernel. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […] $IPc_t$. Number of lagged macro-economic variable realizations: $K$ = […]

Table 7: Component-MEM and MEM-MIDAS. Correlations among the $\tau_t$ components
Rows and columns: S&P 500, FTSE 100, NASDAQ, Hang Seng.
[Correlation values not recoverable from the extracted text.]
Notes: Numbers in bold are the correlations among the low-frequency terms of the Component-MEM and MEM-MIDAS models. Numbers in regular text and italics are the correlations among the indexes for the Component-MEM and MEM-MIDAS models, respectively.

Table 8: Out-of-sample comparison. S&P 500
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.

Rows: out-of-sample periods (from 2013), listed under each loss function (QLIKE, MSE).
[Cell values and MCS shading not recoverable from the extracted text.]
Notes: The table reports the averages of the QLIKE and MSE loss functions. Rolling window: twelve years. Refitting frequency: two months. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […]

Table 9: Out-of-sample comparison. FTSE 100

Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: out-of-sample periods (from 2013), listed under each loss function (QLIKE, MSE).
[Cell values and MCS shading not recoverable from the extracted text.]
Notes: The table reports the averages of the QLIKE and MSE loss functions. Rolling window: twelve years. Refitting frequency: two months. Shades of gray denote inclusion in the MCS at significance level $\alpha = 0.25$.

Table 10: Out-of-sample comparison. NASDAQ
Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.

Panel labels as extracted: "Asymmetric LF, under prediction version: QLIKE ($b = -$[…])" and "$b$ = […]".
Rows: out-of-sample periods, listed under each loss function.
[Cell values and MCS shading not recoverable from the extracted text.]
Notes: The table reports the averages of the QLIKE and MSE loss functions. Rolling window: twelve years. Refitting frequency: two months. Shades of gray denote inclusion in the MCS at significance level $\alpha$ = […]

Table 11: Out-of-sample comparison. Hang Seng

Columns: AMEM, Component-MEM, MEM-MIDAS, AHAR, GJR, GM, DAGM, RGARCH.
Rows: out-of-sample periods (from 2013), listed under each loss function (QLIKE, MSE).
[Cell values and MCS shading not recoverable from the extracted text.]
Notes: The table reports the averages of the QLIKE and MSE loss functions. Rolling window: twelve years. Refitting frequency: two months. Shades of gray denote inclusion in the MCS at significance level $\alpha = 0.25$.
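The average losses reported in the out-of-sample tables can be computed along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the QLIKE expression used here, $x/h - \log(x/h) - 1$ with $x$ the volatility proxy and $h$ the forecast, is one common robust parameterization in the sense of Patton (2011); the exact normalization behind the tables is not stated in this section.

```python
import math
from typing import Callable, Sequence


def qlike(proxy: float, forecast: float) -> float:
    """QLIKE loss (assumed normalization): zero when forecast equals proxy."""
    ratio = proxy / forecast
    return ratio - math.log(ratio) - 1.0


def mse(proxy: float, forecast: float) -> float:
    """Squared forecast error."""
    return (proxy - forecast) ** 2


def average_loss(
    loss: Callable[[float, float], float],
    proxies: Sequence[float],
    forecasts: Sequence[float],
) -> float:
    """Average a pointwise loss over an out-of-sample period."""
    if len(proxies) != len(forecasts) or len(proxies) == 0:
        raise ValueError("proxies and forecasts must be non-empty and of equal length")
    total = sum(loss(x, h) for x, h in zip(proxies, forecasts))
    return total / len(proxies)


if __name__ == "__main__":
    realized = [10.0, 12.0, 15.0]   # toy proxy values (annualized %), illustrative only
    predicted = [11.0, 12.0, 13.0]  # toy forecasts, illustrative only
    print(average_loss(qlike, realized, predicted))
    print(average_loss(mse, realized, predicted))
```

Both losses require the proxy and the forecast to be on the same scale, which is why variance forecasts from GARCH-type models need to be converted to the target before comparison.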