A flexible observed factor model with separate dynamics for the factor volatilities and their correlation matrix
Yu-Cheng Ku a,b,∗, Peter Bloomfield a, Robert Kohn b

a Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
b School of Economics, University of New South Wales, Kensington, NSW 2052, Australia

October 30, 2018

Abstract

Our article considers a regression model with observed factors. The observed factors have a flexible stochastic volatility structure with separate dynamics for the volatilities and the correlation matrix. The correlation matrix of the factors is time-varying and its evolution is described by an inverse Wishart process. The model specifies the evolution of the observed volatilities flexibly and is particularly attractive when the dimension of the observations is high. A Markov chain Monte Carlo algorithm is developed to estimate the model. It is straightforward to use this algorithm to obtain the predictive distributions of future observations and to carry out model selection. The model is illustrated and compared to other Wishart-type factor multivariate stochastic volatility models using various empirical data, including monthly stock returns and portfolio weighted returns. The evidence suggests that our model has better predictive performance. The paper also allows the idiosyncratic errors to follow individual stochastic volatility processes in order to deal with more volatile data such as daily or weekly stock returns.

Keywords: Correlated factors; Inverse Wishart; Markov chain Monte Carlo.

∗ Corresponding author. Tel: +61-04-32561617; fax: +61-04-93136337.
E-mail: [email protected]

1 Introduction

For the last two decades, multivariate stochastic volatility (MSV) models have been an important class of models in financial econometrics. Recent developments in this area focus on dimension reduction, since the complexity of computation and the difficulty of model interpretation grow drastically as the dimension of the model increases. Harvey et al. (1994) were the first to discuss a factor structure for MSV models. The seminal work by Jacquier et al. (1995) introduced Bayesian approaches to the factor MSV (FMSV) literature. The FMSV model is also considered by Pitt and Shephard (1999), Chib et al. (2006), and Lopes and Carvalho (2007), among others. A common feature of these FMSV models is that they impose a diagonality assumption on the factor correlation or covariance matrices, implying that the factors are uncorrelated. However, it is often unrealistic to assume that the factors do not interact with each other, especially when the factors are observed.

To relax the diagonality assumption, Philipov and Glickman (2006b) introduce a time-varying FMSV model in which the inverse factor covariance matrices are driven by Wishart processes. The model is a direct application of Philipov and Glickman (2006a) to the factor structure. The inverse Wishart specification introduced by Philipov and Glickman (2006a,b) has the attractive property that it is easily incorporated into model estimation with Bayesian Markov chain Monte Carlo (MCMC) methods. Based on a similar setting, Asai and McAleer (2009) also propose an MSV model, where each individual return series is modeled with a stochastic volatility (SV) process and the covariance process is characterized by the inverse Wishart distribution. Asai and McAleer (2009) call this type of model a "Wishart Inverse Covariance" (WIC) model.
When the vector of dependent variables is high-dimensional, the WIC models proposed by Philipov and Glickman (2006a) and Asai and McAleer (2009) have two problems. First, the computation becomes highly time-consuming as the dimension increases. Second, the time effect among the different series is controlled by just one scalar persistence parameter, which is likely to
be too restrictive in real applications. The factor structure proposed in Philipov and Glickman (2006b) helps resolve the first problem. However, with more factors, say three or more, the distinct underlying factors still have to share a common time effect controlled by a single persistence parameter; hence, the second problem remains. To solve the two problems simultaneously, we propose an observed dynamic-correlation FMSV model (O-DCFMSV). The basic form of O-DCFMSV is similar to that of Asai and McAleer (2009), but the structure is applied to a factor model. Consequently, compared to Asai and McAleer (2009), the O-DCFMSV model has advantages in both estimation and interpretation for high-dimensional data. Moreover, since the O-DCFMSV model allows different time effects on the factors through separate SV processes, it is more flexible than Philipov and Glickman (2006b). To estimate the model, we develop an MCMC algorithm that deals with all unknown parameters and latent variables jointly, which is quite different from the partial MCMC approach used in Asai and McAleer (2009). Using our approach, prediction and model selection become straightforward, issues that are not dealt with in either Philipov and Glickman (2006b) or Asai and McAleer (2009). We illustrate how to implement one-step-ahead prediction, by which we can forecast many quantities of interest, such as the return series, the return covariance matrix, the correlation matrix of the factors, and the value at risk (VaR) of a portfolio. We can also conduct model selection based on predictive performance. To summarize, the contribution of our paper is twofold. First, it introduces a flexible factor model to the MSV literature. Second, the MCMC algorithm designed in this paper can be used for prediction and model selection, which significantly extends the usefulness of WIC models in real problems.
The remainder of the paper is organized as follows. Section 2 presents the model and the MCMC algorithm used to estimate it. Section 3 conducts a simulation study to illustrate the model. Section 4 provides two empirical examples, in which the O-DCFMSV model is applied to the data and evaluated based on the quality of one-step-ahead predictions. Section 5 extends the model to the case where the idiosyncratic error terms are allowed to follow independent SV processes. Section 6 concludes the paper.

2 The Model

Suppose that at time t we have p asset returns, y_t, and q underlying observed factors, f_t, such that

y_t = B f_t + e_t,   (1)

where {f_t, t ≥ 1} and {e_t, t ≥ 1} are independent stochastic processes. The e_t are also assumed to be an independent sequence with e_t ~ N_p(e_t | 0, Ω), Ω = diag(σ²_1, ..., σ²_p), where N_p(X | µ, Σ) is a p-dimensional multivariate normal density in X with mean µ and covariance matrix Σ. The assumption that the conditional variance of e_t is constant is relaxed in Section 5 to allow e_t to have SV dynamics. The model for the factors is as follows:

f_t = V_t^{1/2} ε_t,   (2a)
V_t^{1/2} = diag(e^{h_{t1}/2}, e^{h_{t2}/2}, ..., e^{h_{tq}/2}),  q ≤ p,   (2b)
h_{t+1} = µ + φ ∘ (h_t − µ) + η_t,   (2c)
h_{1i} ~ N(h_{1i} | µ_i, σ²_{η,i} / (1 − φ²_i)),  i = 1, 2, ..., q,   (2d)

where N(x | µ, σ²) is a univariate normal distribution in x with mean µ and variance σ², and ∘ is the elementwise multiplication operator. The stochastic sequences {ε_t, t ≥ 1} and {η_t, t ≥ 1} are independent, with η_t also an independent sequence, and

ε_t | P_t ~ N_q(ε_t | 0, Σ_{ε,t}),   (3a)
η_t ~ N_q(η_t | 0, Σ_η),  Σ_η = diag(σ²_{η,1}, ..., σ²_{η,q}).   (3b)

The covariance matrix Σ_{ε,t} is a correlation matrix obtained by standardizing the q × q stochastic covariance matrix P_t, so that

Σ_{ε,t} = (diag P_t)^{−1/2} P_t (diag P_t)^{−1/2}.   (4)

The dynamics of P_t, and hence of Σ_{ε,t}, are given by the stationary autoregressive inverse Wishart process

P⁻¹_{t+1} | k, P⁻¹_t ~ W_q(P⁻¹_{t+1} | k, S_t),  S_t = (1/k) P_t^{−d/2} A P_t^{−d/2},   (5)

where W_q(X | k, S) is a q × q Wishart density in X with degrees of freedom (df) k ≥ q and scale matrix S. The q × q matrix A is a symmetric positive definite matrix parameter, and d is a scalar parameter that accounts for the memory of the matrix process {P_t}. The matrix power P_t^{−d/2} is defined through a spectral decomposition. Similarly to Philipov and Glickman (2006a,b) and Asai and McAleer (2009), we set the initial value P_1 = I_q for convenience.

In the WIC context, there are two different ways to define the scale matrix. Asai and McAleer (2009) use the specification (5), while Philipov and Glickman (2006b) use a BEKK-type representation

S_{t−1} = (1/k) A^{1/2} (P⁻¹_{t−1})^d (A^{1/2})′,   (6)

where A^{1/2} is defined by a Cholesky decomposition such that A = A^{1/2} (A^{1/2})′. In either case, Philipov and Glickman (2006a) and Asai and McAleer (2009) show that log|P_{t+1}| is a first-order autoregressive process with autoregressive coefficient d; if d ∈ (−1, 1), this process is stationary. We have also conducted simulations suggesting that the whole process P_t is stationary for d ∈ (−1, 1).

Although the O-DCFMSV model has some similarities with Asai and McAleer (2009) in model specification, the two models are in fact different in several respects. First, Asai and McAleer (2009) adopt the settings (2a)–(5) to model the return series, while in O-DCFMSV we apply them to the observed factors. This is in fact the main advantage of O-DCFMSV. In a high-dimensional environment, estimating the model of Asai and McAleer (2009) is extremely tedious, because the model itself is defined through a sequence of spectral decompositions or singular value decompositions, and the posterior densities of the parameters are complicated and depend on the data dimension.
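To make the data-generating side of (2a)–(5) concrete, the sketch below simulates the factor process with SV log-volatilities and inverse-Wishart-driven correlations. All function names and the default parameter values (φ, σ_η, d = 0.8, A = I_q) are illustrative assumptions, not the paper's settings; the Wishart draw uses the standard construction Z′Z for integer degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)

def mat_power(P, expo):
    """Matrix power via spectral decomposition (P symmetric pos. def.)."""
    w, V = np.linalg.eigh(P)
    return (V * w**expo) @ V.T

def simulate_factors(T=500, q=2, mu=np.array([-0.2, -0.5]),
                     phi=np.array([0.95, 0.95]), sig_eta=np.array([0.2, 0.2]),
                     A=None, k=25, d=0.8):
    """Simulate f_t from (2a)-(2c) with the factor correlation matrix
    driven by the inverse Wishart process (5).  Illustrative values only."""
    if A is None:
        A = np.eye(q)
    # h_1 from the stationary distribution (2d)
    h = mu + rng.normal(size=q) * sig_eta / np.sqrt(1.0 - phi**2)
    P = np.eye(q)                                   # P_1 = I_q
    F, R = np.empty((T, q)), np.empty((T, q, q))
    for t in range(T):
        Dinv = np.diag(1.0 / np.sqrt(np.diag(P)))
        Sig_eps = Dinv @ P @ Dinv                   # correlation matrix, eq. (4)
        eps = rng.multivariate_normal(np.zeros(q), Sig_eps)       # (3a)
        F[t] = np.exp(h / 2.0) * eps                # f_t = V_t^{1/2} eps_t, (2a)
        R[t] = Sig_eps
        # transition (5): P_{t+1}^{-1} | k, P_t^{-1} ~ W_q(k, S_t)
        Pd = mat_power(P, -d / 2.0)
        S = Pd @ A @ Pd / k
        L = np.linalg.cholesky(S)
        Z = rng.standard_normal((int(k), q)) @ L.T  # rows ~ N(0, S_t)
        P = np.linalg.inv(Z.T @ Z)                  # Z'Z ~ W_q(k, S_t)
        h = mu + phi * (h - mu) + sig_eta * rng.normal(size=q)    # (2c)
    return F, R
```

Each returned R[t] is a proper correlation matrix (unit diagonal), as required by the standardization in (4).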
The second difference is in the sampling scheme used to estimate the model, which we discuss below.

There are two sets of parameters in the measurement equation (1). For Ω = diag(σ²_1, ..., σ²_p), following Liesenfeld and Richard (2006), we assign independent inverse gamma priors to the idiosyncratic variances σ²_j. Specifically, σ²_j ~ IG(σ²_j | shape = ν₀/2, scale = ν₀s₀/2), j = 1, ..., p. In all our analyses we use ν₀ = 10 and s₀ = 0.01, which defines a vague prior commonly adopted in the literature. For the loading matrix B, following Jacquier et al. (1995), we choose the prior

p(B | Ω) ∝ |Ω|^{−q/2} etr(−(1/2) Ω⁻¹ B B′),   (7)

where etr(X) means exp(trace(X)). This prior implies that the columns B_i of B are a priori independent, each with prior N_p(B_i | 0, Ω), which is uninformative relative to the data.

The priors for the SV parameters are as follows. We adopt the default settings of Kim et al. (1998). For the mean µ_i and variance σ²_{η,i}, i = 1, ..., q, we respectively assume a normal prior and an inverse gamma prior, with the default hyperparameter values of Kim et al. (1998). The prior for φ_i is a shifted and scaled beta distribution: let φ_i = 2φ̃_i − 1 with φ̃_i ~ Beta(φ⁽¹⁾, φ⁽²⁾). We choose φ⁽¹⁾ = 20 and φ⁽²⁾ = 1.5, implying a prior mean of 2φ⁽¹⁾/(φ⁽¹⁾ + φ⁽²⁾) − 1 ≈ 0.86.

The priors for the correlation-level parameters are chosen as follows. For A we specify the prior A⁻¹ ~ W_q(A⁻¹ | q, q⁻¹ I_q), which implies a prior mean of I_q; for d, we choose the vague prior d ~ Unif(d | −1, 1); for k we set p(k) ∝ λ e^{−λk} I_{(q,∞)}(k). Note that the prior for k is a truncated exponential distribution with rate parameter λ. Throughout the paper we set λ = 0.02, which implies a prior mean of 50 + q and a prior standard deviation of 50, a diffuse prior.

We estimate the model using the MCMC simulation method described below. Let the observed data be Y = {y_t}: T × p, the factors F = {f_t}: T × q, the log-volatilities H = {h_t}: T × q, the normalized factors ε = {ε_t}: T × q, and the sequence of unnormalized covariance matrices P = {P_t, t = 1, ..., T}. Let ω = {ω_i, i = 1, ..., q}, with ω_i = {µ_i, φ_i, σ²_{η,i}}, be the parameters of the volatilities of the factors. The joint density of (Y, F, H, ε, P, B, Ω, ω, A, d, k) is

p(Y, F, H, ε, P, B, Ω, ω, A, d, k) = p(Y | B, F, Ω) p(F | H, ε) p(H | ω) p(ε | P)   (8)
    × p(P | A, d, k) p(B | Ω) p(Ω) p(A) p(d) p(k),

with

p(Y | B, F, Ω) = ∏_{t=1}^{T} p(y_t | f_t, Ω),   (9a)
p(F | H, ε) = ∏_{t=1}^{T} p(f_t | h_t, ε_t),   (9b)
p(H | ω) = p(h_1 | ω) ∏_{t=2}^{T} p(h_t | h_{t−1}, ω),  p(h_1 | ω) = ∏_{i=1}^{q} p(h_{1i} | ω_i),  p(h_t | h_{t−1}, ω) = ∏_{i=1}^{q} p(h_{ti} | h_{t−1,i}, ω_i),   (9c)
p(ε | P) = ∏_{t=1}^{T} p(ε_t | P_t),   (9d)
p(P | A, d, k) = ∏_{t=1}^{T} p(P_t | P_{t−1}, A, d, k).   (9e)

The densities p(y_t | f_t, Ω) in (9a) are given by Eq. (1). The densities p(f_t | h_t, ε_t) in (9b) are degenerate and are given by (2a). The densities p(h_{1i} | ω_i) in (9c) are given by (2d), and the densities p(h_{ti} | h_{t−1,i}, ω_i) in (9c) are given by (2c). The densities p(ε_t | P_t) in (9d) are given by (3a) and (4). The densities p(P_t | P_{t−1}, A, d, k) in (9e) are given by (5). The priors p(B | Ω), p(Ω), p(A), p(d), and p(k) are discussed in the previous section. We sample from the following conditional distributions.
For σ²_j, we sample from the inverse gamma distribution:

p(σ²_j | rest) ∝ p(σ²_j) p(y_j | σ²_j, B, F) ∝ (σ²_j)^{−(ν₀+T)/2 − 1} exp{ −(1/(2σ²_j)) [ν₀ s₀ + ∑_{t=1}^{T} (y_{tj} − ∑_{i=1}^{q} b_{ji} f_{ti})²] },   (10)

where y_{tj} is the jth element of y_t, f_{ti} is the ith element of f_t, and b_{ji} denotes the (j, i)th element of B. It follows from (10) that the conditional density of σ²_j is an inverse gamma with shape (ν₀ + T)/2 and scale [ν₀ s₀ + ∑_{t=1}^{T} (y_{tj} − ∑_{i=1}^{q} b_{ji} f_{ti})²]/2.

The posterior density of B is a matrix variate normal density given by:

p(B | rest) ∝ p(B | Ω) p(Y | B, Ω, F) ∝ etr( −(1/2) Ω⁻¹ (B − µ_B) Σ_B⁻¹ (B − µ_B)′ ),   (11)

where Σ_B = (F′F + I)⁻¹ and µ_B = Y′F Σ_B.

We now follow Kim et al. (1998) and discuss how to sample the SV parameters ω and H. First we transform the SV equation (2a) into a linear model by taking logarithms: f*_{ti} = h_{ti} + z_{ti}, where f*_{ti} = log(f²_{ti} + c) and z_{ti} is a log χ²_1 random variable. The scalar c is a small "offset" constant. Following Kim et al. (1998), the distribution of f*_{ti} can be approximated by a seven-component normal mixture with component indicator variables s = {s_{ti}}. Using the offset mixture integration sampler developed by Kim et al. (1998), for each i = 1, ..., q we sample (φ_i, σ²_{η,i}) jointly in one block, marginalized over µ_i and H, and then in another block sample (µ_i, H) conditional on the rest of the model. To save computational cost, we do not impose the additional reweighting step introduced in Kim et al. (1998). Given a draw of H, we obtain ε_t = V_t^{−1/2} f_t, which is used to estimate the correlations and the correlation-level parameters. Now, since the factors are observed and we have ε_t, the estimation procedure for P_t, A, d, and k is exactly the same as that given in Asai and McAleer (2009).
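The inverse gamma conditional in (10) can be sketched as a one-line Gibbs draw; the function name and the use of a reciprocal-gamma draw are implementation choices, but the shape and scale follow the text (ν₀ = 10, s₀ = 0.01).

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_sigma2(y_j, F, b_j, nu0=10.0, s0=0.01):
    """One Gibbs draw of an idiosyncratic variance sigma_j^2 from its
    inverse gamma conditional, eq. (10).
    y_j : (T,) series j;  F : (T, q) factors;  b_j : (q,) row j of B."""
    T = y_j.shape[0]
    resid = y_j - F @ b_j
    shape = (nu0 + T) / 2.0
    scale = (nu0 * s0 + resid @ resid) / 2.0
    # inverse gamma draw via the reciprocal of a gamma draw
    return scale / rng.gamma(shape)
```

With a long simulated series the draws concentrate near the true idiosyncratic variance, as the IG posterior mean is approximately the residual sum of squares divided by T.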
To sample from the complicated and non-conjugate univariate posterior distributions of d and k, following Asai and McAleer (2009), we adopt the adaptive rejection Metropolis sampling (ARMS) of Gilks et al. (1995). The complete MCMC procedure is as follows:

Step 0: Initialize B, Ω, s, ω, H, k, d, and A.
Step 1: Sample B | rest, then sample σ²_j | rest for j = 1, ..., p.
Step 2: Sample (φ, σ²_η) | F*, s and (µ, H) | F*, s, φ, σ²_η using the sampler of Kim et al. (1998).
Step 3: Obtain the standardized factors ε_t = V_t^{−1/2} f_t from the sample.
Step 4: Sample P_t from P_t | rest, and then obtain Σ_{ε,t} = (diag P_t)^{−1/2} P_t (diag P_t)^{−1/2} for t = 1, ..., T.
Step 5: Sample A | rest.
Step 6: Sample d | rest using ARMS.
Step 7: Sample k | rest using ARMS.
Step 8: Go to Step 1.

Looping Steps 1 to 8 is a complete sweep of the MCMC sampler. It is worth noting that our scheme differs from that of Asai and McAleer (2009), who adopt a two-stage procedure. In the first stage they estimate the SV parameters ω and the log-volatilities H in one MCMC procedure and obtain the standardized series ε_{ti} = U_{ti} f_{ti} with U_{ti} = M⁻¹ ∑_{l=1}^{M} exp[−h⁽ˡ⁾_{ti}/2], where x⁽ˡ⁾ denotes the lth draw of the M MCMC iterations. Then, in the second stage, based on the series ε_t = (ε_{t1}, ..., ε_{tq}), they estimate {P_t} and the correlation parameters (A, d, k) using another MCMC procedure. Clearly, this strategy does not conduct the MCMC estimation jointly, which is arguably undesirable and improper in at least two respects. First, the method obtains estimates of the log-volatilities first and then plugs them into a second, separate MCMC run. This averages over the different samples of volatilities, so the inference on the correlation-level parameters works with only one fixed set of log-volatilities and residuals ε. Second, for the purpose of prediction we need to sample h⁽ˡ⁾_{t+1} and then obtain Σ⁽ˡ⁾_{ε,t+1}, for l = 1, ..., M; the plug-in method cannot be applied in this case.
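The order and data flow of Steps 1–8 can be laid out schematically as below. Every `samplers[...]` entry is a placeholder supplied by the caller (hypothetical stand-ins, not the paper's actual conditional samplers); the skeleton only fixes the sweep structure, including the per-iteration standardization of Step 3.

```python
import numpy as np

def mcmc_sweep(Y, F, state, samplers):
    """One complete sweep (Steps 1-8) of the O-DCFMSV sampler, schematically.
    `samplers` maps step names to conditional-draw functions provided by
    the caller; only the order and data flow are fixed here."""
    state["B"] = samplers["B"](Y, F, state)                 # Step 1
    state["sigma2"] = samplers["sigma2"](Y, F, state)       # Step 1
    state["omega"], state["H"] = samplers["sv"](F, state)   # Step 2 (KSC sampler)
    state["eps"] = F * np.exp(-state["H"] / 2.0)            # Step 3: eps_t = V_t^{-1/2} f_t
    state["P"] = samplers["P"](state["eps"], state)         # Step 4
    state["A"] = samplers["A"](state)                       # Step 5
    state["d"] = samplers["d"](state)                       # Step 6 (ARMS)
    state["k"] = samplers["k"](state)                       # Step 7 (ARMS)
    return state                                            # Step 8: repeat
```

The key point the skeleton makes is that the standardized series ε is recomputed inside every sweep from the current draw of H, rather than once from averaged volatilities as in the two-stage scheme.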
Unlike the two-stage scheme of Asai and McAleer (2009), our algorithm draws ω and H and then directly obtains the standardized series for the correlation parameters within each single iteration. In this way, we conduct estimation jointly with a full MCMC procedure, and prediction can be performed directly using the usual MCMC methods.

3 Simulation Study

In this subsection, we use simulated data to illustrate how O-DCFMSV works. Note that the illustration is based on a single run, since at this stage we wish to present the results visually; a complete simulation study based on multiple replications is provided later. We set p = 10 observed series and q = 2 factors with a sample size of T = 1,000. The data generating process (DGP) is described by:

(i) Measurement equation: y_t = B f_t + e_t, with a 10 × 2 loading matrix B whose nonzero loadings include 1.00, −0.05, 0.99, −0.10, 0.56, 1.00, 0.34, 0.95, and 0.95, the remaining loadings being 0.00, and Ω = diag(σ²_1, ..., σ²_10).

(ii) SV structures:
h_{1,t+1} = µ_1 + φ_1 (h_{1t} − µ_1) + η_{1t}, with µ_1 = −0.2,
h_{2,t+1} = µ_2 + φ_2 (h_{2t} − µ_2) + η_{2t}, with µ_2 = −0.5,
and (η_{1t}, η_{2t})′ ~ N(0, Σ_η) with Σ_η diagonal.

(iii) Factor correlation level: A = ((1, 0.05), (0.05, 1))⁻¹, k = 25, and a positive memory parameter d.
050 1 . , k = 25 , d = 0 . , From (iii) and the initial value P = I q we can simulate a sequence of covariance matrices { P t } , from which we can obtain the correlation matrices Σ ǫ,t using Eq. (4). Given { Σ ǫ,t } together with (ii), we can generate two hidden systematic factors with time-varying correlation ρ t = [ Σ ǫ,t ] , . Then, given the factors we can generate ten observed series Y with the setting (i). The MCMC study is conducted with 20,000 iterations, where the first L = 10 ,
000 draws are taken as burn-ins and the remaining M = 10 ,
000 are preserved. The program for the estimation
is coded in Ox (Doornik, 2007). Table 1 summarizes the estimation results. We report the posterior means and the 95% intervals based on the M draws. The posterior mean is calculated by averaging the MCMC draws, and the 95% credible interval is constructed from the (2.5%, 97.5%) percentiles of the simulated draws. We can see that, out of the pq + p + 3q + q(q + 1)/2 parameters, only one loading in B is not covered by its 95% credible interval.

In O-DCFMSV, one of the primary interests is capturing the time-varying factor correlation. The factor correlation provides very useful information since it can reflect the market condition, as we will see later. The top panel of Figure 1 displays the correlation fits. The smoothed estimate at time t is the posterior mean ρ̂_t = M⁻¹ ∑_{l=1}^{M} [Σ_{ε,t}]⁽ˡ⁾_{21}, where Σ⁽ˡ⁾_{ε,t} is the lth draw of the preserved MCMC iterations based on smoothing; we draw Σ⁽ˡ⁾_{ε,t} using Step 4 of the algorithm detailed in Section 2.3.2. The grey line is the true correlation and the black line represents the fits. Though the fitted series appears smoother than the true values, the model in general captures the pattern of the dynamic correlations well, both the movements and the average level. To measure the performance, following Asai and McAleer (2009), we calculate two performance measures, both based on the mean absolute error (MAE) of the smoothed estimates. The first is MAE_ρ ≡ T⁻¹ ∑_t |ρ̂_t − ρ_t|, which measures the quality of the correlation estimates. We obtain MAE_ρ = 0.208, which suggests a satisfactory result given that the correlation varies over a wide range. The second measure is used for evaluating the VaR estimates. This measure is meaningful for the O-DCFMSV model since one important application of the asset-return factor model is to obtain the VaR estimates of portfolios through factor structures.
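The posterior summaries and MAE-type measures above reduce to a few lines of array arithmetic; the sketch below shows one way to compute them from a vector of preserved draws (function names are ours, not the paper's).

```python
import numpy as np

def posterior_summary(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval from MCMC draws:
    the mean of the preserved draws and their (2.5%, 97.5%) percentiles."""
    draws = np.asarray(draws)
    alpha = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(draws, [alpha, 100.0 - alpha])
    return draws.mean(), (lo, hi)

def mae(estimates, truths):
    """MAE_rho = T^{-1} sum_t |rho_hat_t - rho_t| (MAE_VaR has the same form)."""
    return float(np.mean(np.abs(np.asarray(estimates) - np.asarray(truths))))
```

A parameter is "covered" in the sense of Table 1 when its true value falls inside the returned (lo, hi) interval.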
Suppose that we have a vector of portfolio asset weights w. According to Barbieri et al. (2009) and Chib et al. (2006), under the assumptions of normality and a zero mean, the 5% VaR of the simulated portfolio at time t is estimated by 1.645 · σ̂_{Pt}, where σ̂_{Pt} denotes the posterior mean of the portfolio standard deviation σ_{Pt} = [w′(B V_t^{1/2} Σ_{ε,t} V_t^{1/2} B′ + Ω) w]^{1/2}. (In fact this is the 95% quantile of the predictive density when the mean is zero, but by symmetry it is the negative of the 5% quantile.) Suppose that our asset holdings are equally weighted, so that each element of w is 1/p. The true and estimated VaR are shown in the bottom panel of Figure 1, where the grey line represents the true VaR and the black line the estimate. It is readily seen that both the movement and the magnitude are nicely captured. The MAE measure is MAE_VaR ≡ T⁻¹ ∑_t |VaR^est_t − VaR_t|, where VaR^est_t = 1.645 · σ̂_{Pt} and VaR_t = 1.645 · σ_{Pt}. We obtain MAE_VaR = 0.105, a satisfactory result given that the true VaR values range from 0.5 to 1.5.
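The VaR formula just described can be sketched directly; 1.645 is the 5% standard normal quantile, and the function name and argument layout are illustrative assumptions.

```python
import numpy as np

def var_5pct(B, V_diag, Sig_eps, Omega_diag, w=None):
    """5% portfolio VaR under normality and a zero mean:
    1.645 * sqrt(w' (B V^{1/2} Sig_eps V^{1/2} B' + Omega) w),
    with equally-weighted holdings (w_i = 1/p) by default."""
    p = B.shape[0]
    if w is None:
        w = np.full(p, 1.0 / p)
    Vh = np.diag(np.sqrt(V_diag))                 # V_t^{1/2}
    cov_y = B @ Vh @ Sig_eps @ Vh @ B.T + np.diag(Omega_diag)
    return 1.645 * float(np.sqrt(w @ cov_y @ w))
```

With B = I_2, unit factor variances, an identity correlation matrix, and no idiosyncratic noise, the return covariance is the identity and the equally-weighted portfolio variance is 1/2.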
Figure 1: Factor correlations and VaR estimates. The top panel shows the true correlation process ρ_t (grey line) and its posterior mean ρ̂_t (solid black line). The bottom panel shows VaR_t (grey line) and its posterior mean VaR^est_t (solid black line).

To complete the illustration, we finish the simulation study by comparing O-DCFMSV with the benchmark, the model of Philipov and Glickman (2006b, hereafter PG), as it is also a Wishart FMSV model with dynamic factor correlations. The model specification is:

y_t | B, f_t, Ω ~ N_p(B f_t, Ω),
f_t | P_t ~ N_q(0, P_t),
P⁻¹_t | P⁻¹_{t−1}, S_{t−1} ~ W_q(P⁻¹_t | k, S_{t−1}),

where the matrix P_t is a factor covariance matrix, and the matrix A and the scalar parameters d and k have the same meaning as in O-DCFMSV. Here we define the scale matrix as S_t = (1/k) P_t^{−d/2} A P_t^{−d/2}, which is the form of (5). It should be noted that, as mentioned in Section 2.1, Philipov and Glickman (2006b) use the BEKK-type specification (6) for S_t; however, in order to remove the effect caused by different parameterizations, we adopt the setting (5) instead
of (6) for the competing model. Asai and McAleer (2009) point out that it is possible to use either one or the other as an alternative. The matrix power P_t^{−d/2} is calculated by the spectral decomposition.

We take (i), (ii), and (iii) of the last section as the true data generating process (DGP) for O-DCFMSV. For the true DGP of PG's model, we drop (ii) and use only (i) and (iii), since that model does not assume SV structures on the factors. Two datasets with the different DGPs are generated and fitted with both models. To evaluate the performance, we calculate the Kullback-Leibler (KL) divergence as a measure of how far the distribution given the estimated covariance is from that given the truth. Let Σ_t and Σ̂_t be the true and estimated covariance matrices, respectively. Let p_t = p(y_t | Σ_t) denote the density of y_t given the true covariance matrix of y_t; also, let p^est_t = p(y_t | Σ̂_t) be the density of y_t with the estimated covariance matrix plugged in instead of the true covariance matrix. Under normality, the KL divergence between p_t and p^est_t is

KL(p_t || p^est_t) = ∫ p_t(y) log [p_t(y)/p^est_t(y)] dy = (1/2) tr(Σ̂_t⁻¹ Σ_t) − p/2 − (1/2) log|Σ_t| + (1/2) log|Σ̂_t|.

In each replication we record the mean KL divergence (MKL), defined by MKL = T⁻¹ ∑_t KL(p_t || p^est_t), as a summary of the KL divergence over every t. Since we have two true DGPs and two models, there are four combinations, which, in DGP-model order, are referred to as O-O, O-PG, PG-O, and PG-PG. In each combination we conduct the simulation with 40 replications and record the MKL. In each replication we calculate a differenced measure by subtracting the value of the true model from that of the wrong model, i.e. [O-PG minus O-O] and [PG-O minus PG-PG]; the differenced measure is denoted by ∆MKL. We report the sample mean of ∆MKL over the 40 replications and its standard error as the final summarized output. Table 2 summarizes the comparison results. For both DGPs, the differenced values of [wrong minus true] are significantly positive, indicating that the true models win. However, the mean ∆MKL is larger for [O-PG minus O-O] than for [PG-O minus PG-PG], with the implication that, given both models are misspecified, PG's model is more distant from the truth than O-DCFMSV; in other words, the KL loss is greater when PG is used and O-DCFMSV is correct than vice versa.

4 Empirical Examples

This section uses two empirical examples to illustrate applications of the O-DCFMSV model. We use the three monthly Fama-French (F-F) factors obtained from Dr. Kenneth French's data library: the market excess return (Mkt), the Small-Minus-Big factor (SMB), and the High-Minus-Low factor (HML). All the factors are rescaled to (−1, 1) by multiplying by 0.01. The behavior of the volatility is quite different among the factors. In particular, we can observe that a cluster occurs around the late 1990s to early 2000s in all three factors, but the magnitude of the volatility is noticeably larger in SMB and HML than in Mkt.
Accordingly, we need to allow the factor volatilities to have separate dynamics in order to reflect such facts.

Figure 2: Time-series plot of the rescaled F-F factors, Jul 1963 - Dec 2005.

Prior to the examples, we first display the time series plot of the "true" factor correlations to show that the dynamic factor correlation is a reasonable setting. The "true" correlation at time t is ρ_{t−r:t+r}, which we take simply as the empirical correlation calculated from the data within the window [t − r, t + r]. For example, if we choose r = 3, then the correlation of (Mkt, SMB) at January 1980 is calculated as the empirical correlation of (Mkt, SMB) from October 1979 to April 1980, which approximately represents a half-year correlation. Here we choose r = 6 for calculating the 1-year correlation, r = 12 for the 2-year, and r = 18 for the 3-year. By rolling the windows, we obtain the "true" correlations over time. Figure 3 shows the 1-year, 2-year, and 3-year pairwise correlations of the three factors. It is obvious that some pairs have quite large correlations during certain periods. The shaded areas mark events with great economic impact: the first and second oil crises, and the bursting of the Dot-com bubble. Clearly, the factor correlations are changing over time. In particular, we can observe a common pattern: the correlations climb to a higher level or reach local peaks during these turbulent periods, while in "calm" periods such as the 1980s to 1990s the correlations decline to a relatively low level. This represents the well-known "correlation breakdown" phenomenon that has long been recognized
in empirical data, referring to the pattern that the correlations during ordinary and stressful market conditions differ substantially. See Rey (2000) for a detailed discussion. The correlation breakdown implied in Figure 3 suggests that a time-varying factor correlation should be considered in the modeling. We would also like to point out that the main advantage of using O-DCFMSV over the empirical rolling-window method for estimating factor correlations is that we can calculate credible intervals, from which we can assess significance relative to a specific critical level.

Figure 3: "True" correlations of the F-F factors Mkt-RF, SMB, and HML, Jul 1963 - Dec 2005. The shaded areas mark the events with great economic impact, which respectively are the first and the second oil crisis, and the Dot-com bubble burst.

In this first example we use the three F-F factors described in the last section as the covariates. The return series Y are the monthly average value weighted returns for 10 industry portfolios: NoDur, Durbl, Manuf, Enrgy, HiTec, Telcm, Shops, Hlth, Utils, and Other. A detailed description of these portfolios can be found in the data library. Again, we first convert the data to a (−1, 1) scale by multiplying by 0.01. Figure 4 shows the time series plot of the data. We can observe some common clusters occurring in the mid-1970s and the early 2000s, which suggests that the factor SV structure can be useful. Figure 5 shows the fitted factor correlations. The black lines are the fitted values and the grey lines represent the 2-year "true" factor correlations presented in Section 4.1.
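The rolling-window "true" correlation ρ_{t−r:t+r} described in Section 4.1 can be sketched as below; the function name is ours, and windows that run past either end of the sample are left undefined.

```python
import numpy as np

def rolling_corr(x, y, r):
    """Empirical 'true' correlation rho_{t-r:t+r}: for each t, the sample
    correlation of (x, y) over the centered window [t-r, t+r] of length
    2r+1 (r = 6 for the 1-year monthly correlation); NaN near the ends."""
    T = len(x)
    out = np.full(T, np.nan)
    for t in range(r, T - r):
        xw, yw = x[t - r:t + r + 1], y[t - r:t + r + 1]
        out[t] = np.corrcoef(xw, yw)[0, 1]
    return out
```

For two exactly linearly related series the rolling correlation is 1 in every valid window, which gives a quick sanity check.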
We see that the estimates are smoother than the "true" values, but in general the movements and the magnitude of the correlations are properly captured.

Figure 4: Time-series plot of the ten portfolios, Jul 1963 - Dec 2005.

Figure 5: Estimated and "true" correlations of the factors for the portfolio data. The black lines are the fitted values and the grey lines are the "true" values.

We now compare the O-DCFMSV model with PG's model using several performance measures. The first two measures are based on the one-step-ahead predictive ability for the return covariance matrix. Notice that in this example we have 10 return series and each month has more than 10 transaction days, so we can use daily returns to construct a nonsingular empirical covariance matrix as a proxy for the "true" covariance. Given the "true" covariance matrices, we can therefore compute MAE_VaR for the equally-weighted portfolio as we do in Section 3. The empirical covariance matrix at month t, denoted by Σ_t, is simply the sample covariance matrix constructed from the daily observations within that month, with an adjustment factor of n_t/(n_t − 1), where n_t is the number of transaction days within month t. The one-step-ahead predictor of the return covariance matrix given a model is obtained from the conditional covariance:

Σ^M_{t+1} ≡ Var(y_{t+1} | F_t; M) = B Var(f_{t+1} | F_t; M) B′ + Ω,

where F_t = {y_1, ..., y_t} is the set of observations collected up to time t, and M denotes the model, either PG or O-DCFMSV. In the implementation, for each period t + 1 we rerun the MCMC procedure to obtain the one-step-ahead covariance matrix.
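One reading of the empirical covariance proxy just described, with its n_t/(n_t − 1) adjustment applied to the within-month sample covariance, is sketched below (the function name is ours).

```python
import numpy as np

def monthly_cov_proxy(daily_returns):
    """Empirical covariance proxy for one month: the sample covariance of
    the n_t daily return vectors, with the n_t/(n_t - 1) adjustment, i.e.
    the bias-corrected covariance (1/(n_t - 1)) * sum of centered outer
    products."""
    X = np.asarray(daily_returns)        # (n_t, p) daily observations
    n_t = X.shape[0]
    Xc = X - X.mean(axis=0)
    return (Xc.T @ Xc) / n_t * (n_t / (n_t - 1.0))
```

Under this reading the proxy coincides with the usual unbiased sample covariance, so it can be checked against `np.cov`.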
Let $\hat{\Sigma}^{PG}_{t+1}$ be the one-step-ahead predictive covariance matrix of $y_{t+1}$ under PG's model, which is estimated by
$$\hat{\Sigma}^{PG}_{t+1} \approx \frac{1}{M}\sum_{l=1}^{M}\Big[\, B^{(l)} P^{(l)}_{t+1} B^{(l)\prime} + \Omega^{(l)} \Big],$$
where $M$ is the number of preserved MCMC iterations and $P^{(l)}_{t+1} \sim P_{t+1} \mid \text{rest};\, \mathcal{M}_{PG}$. For the O-DCFMSV model, the factor follows
$$f_{t+1} \mid V_{t+1}, \Sigma_{\epsilon,t+1} \sim \mathrm{N}_q\big(0,\; V^{1/2}_{t+1}\, \Sigma_{\epsilon,t+1}\, V^{1/2}_{t+1}\big).$$
Define $R_{t+1} = V^{1/2}_{t+1} \Sigma_{\epsilon,t+1} V^{1/2}_{t+1}$, where $V_{t+1} = \mathrm{diag}(V_{t+1,1}, \dots, V_{t+1,q})$. From (2c), we have that
$$V_{t+1,i} \mid h_{ti}, \omega_i = \exp(h_{t+1,i}) \mid h_{ti}, \omega_i \sim \mathrm{logN}\big(V_{t+1,i} \mid \lambda_{ti}, \sigma^2_{\eta,i}\big), \quad (12a)$$
where $\lambda_{ti} = \phi_i h_{ti} + (1-\phi_i)\mu_i$, so that
$$V^{1/2}_{t+1,i} \mid h_{ti}, \omega_i \sim \mathrm{logN}\big(V^{1/2}_{t+1,i} \mid \lambda_{ti}/2,\; \sigma^2_{\eta,i}/4\big), \quad (12b)$$
and $\mathrm{logN}(x \mid a, b)$ is a lognormal density in $x$ with $\log x \sim \mathrm{N}(a, b)$. The correlation matrix $\Sigma_{\epsilon,t+1}$ is obtained by (4). Then, in the $l$th MCMC iteration we can calculate $R^{(l)}_{t+1}$. Let $\hat{\Sigma}^{O}_{t+1}$ be the one-step-ahead predictive variance of $y_{t+1}$ under the O-DCFMSV model. Given $R^{(l)}_{t+1}$, we have the approximation
$$\hat{\Sigma}^{O}_{t+1} \approx \frac{1}{M}\sum_{l=1}^{M}\Big(\, B^{(l)} R^{(l)}_{t+1} B^{(l)\prime} + \Omega^{(l)} \Big).$$

To evaluate the predictive accuracy of the 5% VaR predictions for the equally-weighted portfolio, we calculate
$$\mathrm{MAE_{VaR}} \equiv \frac{1}{N}\sum_{t}\Big| \mathrm{VaR}^{est}_{t+1} - \mathrm{VaR}_{t+1} \Big|,$$
where $N$ is the number of forecast periods and $\mathrm{VaR}_{t+1} = 1.645 \cdot (w' \Sigma_{t+1} w)^{1/2}$, with $w = p^{-1}\mathbf{1}$ being the weight vector. We calculate the estimate $\mathrm{VaR}^{est}_{t+1}$ using
$$\mathrm{VaR}^{est}_{t+1} = 1.645 \times \Bigg( \frac{1}{M}\sum_{l=1}^{M}\big[ w' \hat{\Sigma}^{M}_{t+1} w \big]^{(l)} \Bigg)^{1/2}.$$

In addition to the $\mathrm{MAE_{VaR}}$ measure, some authors suggest calculating the difference between the "true" and predicted covariance matrices in an elementwise sense. Following Ledoit et al. (2003), we calculate the root-mean-square error based on the Frobenius norm (FN):
$$\mathrm{FN} = \frac{1}{N}\sum_{t}\big\| \Sigma_{t+1} - \hat{\Sigma}^{M}_{t+1} \big\| = \frac{1}{N}\sum_{t}\Bigg( \sum_{i,j}\big( [\Sigma_{t+1}]_{ij} - [\hat{\Sigma}^{M}_{t+1}]_{ij} \big)^2 \Bigg)^{1/2}.$$
Because both $\mathrm{MAE_{VaR}}$ and FN measure deviations from "true" values, a smaller value indicates a better model. Here we calculate the ratio of PG to O-DCFMSV for these measures so that we can compare the deviations, and we report the mean values of the ratios over all the prediction periods as the summarized results.

Besides the empirical covariance error-based measures, we also evaluate model performance in terms of the predictive quality for the return series. To do this, following Geweke and Amisano (2010), we first obtain the one-step-ahead log predictive score (LPS) and then calculate the cumulative log predictive Bayes factor. The one-step-ahead LPS evaluated at $y_{t+1}$ under a specific model $\mathcal{M}$ is given by
$$\mathrm{LPS}(y_{t+1} \mid \mathcal{F}_t; \mathcal{M}) = \log p(y_{t+1} \mid \mathcal{F}_t; \mathcal{M}),$$
where the predictive density $p(y_{t+1} \mid \mathcal{F}_t; \mathcal{M})$ is calculated by
$$p(y_{t+1} \mid \mathcal{F}_t; \mathcal{M}) = \int p(y_{t+1} \mid \mathcal{F}_t; \theta_{\mathcal{M}})\, p(\theta_{\mathcal{M}} \mid \mathcal{F}_t)\, d\theta_{\mathcal{M}} \approx \frac{1}{M}\sum_{l=1}^{M} p\big( y_{t+1} \mid x^{(l)}_{t+1}, \theta^{(l)}_{\mathcal{M}} \big) = \frac{1}{M}\sum_{l=1}^{M} \mathrm{N}_p\big( y_{t+1} \mid B^{(l)} f^{(l)}_{t+1}, \Omega^{(l)}; \mathcal{M} \big),$$
where $\theta_{\mathcal{M}}$ is the set of parameters for model $\mathcal{M}$ and $x_{t+1}$ is the latent state vector. Then we can calculate the cumulative log predictive Bayes factor of Model 1 against Model 0, which is defined by
$$\log(B_{1,0}) = \sum_t \log p(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_1) - \sum_t \log p(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_0) = \sum_t \big[ \mathrm{LPS}(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_1) - \mathrm{LPS}(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_0) \big].$$
In addition, we calculate the LPS for the equally-weighted portfolio $w' y_{t+1}$, denoted LPS-EW:
$$\mathrm{LPS\text{-}EW}(w' y_{t+1} \mid \mathcal{F}_t; \mathcal{M}) = \log p(w' y_{t+1} \mid \mathcal{F}_t; \mathcal{M}) \approx \log\Bigg[ \frac{1}{M}\sum_{l=1}^{M} \mathrm{N}\big( w' y_{t+1} \mid w' B^{(l)} f^{(l)}_{t+1},\; w' \Omega^{(l)} w \big) \Bigg].$$
The reason for calculating LPS-EW is that if a model performs better on this measure, then we have evidence to believe that the model should also be better at forecasting the VaR for an equally-weighted portfolio. In this sense, we can regard LPS-EW as an alternative to $\mathrm{MAE_{VaR}}$. Similar to LPS, we then calculate the cumulative log predictive Bayes factor of Model 1 against Model 0 for the equally-weighted portfolio:
$$\log(B^{EW}_{1,0}) = \sum_t \big[ \mathrm{LPS\text{-}EW}(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_1) - \mathrm{LPS\text{-}EW}(y_t \mid \mathcal{F}_{t-1}; \mathcal{M}_0) \big].$$

The cumulative log predictive Bayes factor has a simple criterion for checking statistical significance. According to Geweke and Amisano (2010), the evaluation is conducted via the log scoring rule described in Gneiting and Raftery (2007). The detailed scale is given in Kass and Raftery (1995), of which we use the following criterion: if $\log(B_{1,0}) < 0$, the evidence is in favor of Model 0; if $\log(B_{1,0}) \in [0, 1)$, the evidence for Model 1 is not worth more than a bare mention; if $\log(B_{1,0}) \in [1, 3)$, the evidence is positive for Model 1; if $\log(B_{1,0}) \in [3, 5]$, the evidence is strongly in favor of Model 1; and if $\log(B_{1,0}) > 5$, we have very strong evidence in favor of Model 1.

We use a three-year out-of-sample prediction period, from January 2006 to December 2008, with a total length $N = 36$. This time frame covers two market conditions: before 2007, when the market is relatively calm, and afterwards, when the market is relatively volatile due to the subprime crisis. Therefore, we can compare model performance across different market conditions. The one-step-ahead prediction is conducted on a rolling basis, i.e., if we use observations $y_1, \dots, y_T$ to forecast $y_{T+1}$, then in the next period $y_{T+1}$ is included as a sample for the prediction of $y_{T+2}$.

Table 3(a) summarizes the results of the comparison using the empirical covariance error-based measures. R-MAE$_{\mathrm{VaR}}$ and R-FN denote the ratio of PG to O-DCFMSV for $\mathrm{MAE_{VaR}}$ and FN, respectively. We see that the mean ratio for $\mathrm{MAE_{VaR}}$ indicates a considerable difference between the two models. The mean ratio for FN is 1.19, which also suggests a considerably large difference. Table 3(b) summarizes the comparison using cumulative log predictive Bayes factors; $\log(B_{O,PG})$ and $\log(B^{EW}_{O,PG})$ respectively denote the cumulative log predictive Bayes factor of O-DCFMSV against PG for $y_{t+1}$ and $w' y_{t+1}$. We see that $\log(B_{O,PG}) = 14.940 > 5$, suggesting very strong evidence in favor of O-DCFMSV. Similarly, for the equally-weighted portfolio, we have $\log(B^{EW}_{O,PG}) = 13.176 > 5$, which again strongly supports O-DCFMSV. In conclusion, all the evidence strongly suggests that O-DCFMSV outperforms PG's model.

The results in Table 3(a) and (b) are aggregated over the prediction period and do not show how the two models perform at each time point. To be more convincing, we examine the "period-wise" performance. Figure 6 shows the period-by-period results for $\mathrm{MAE_{VaR}}$ and FN. It is readily seen that the values for O-DCFMSV are consistently smaller than those for PG. Figure 7 displays the period-by-period plot of the differences in LPS and LPS-EW (O-DCFMSV minus PG). From Figure 7 we observe that these differences are consistently greater than 0, which shows that the LPS and LPS-EW of O-DCFMSV are consistently larger than those of PG's model. The only noticeable drop occurs in October 2008, the month right after the bankruptcy filing of Lehman Brothers and the bailout of Fannie Mae and Freddie Mac. The market was extremely volatile at that time. In fact, according to results not shown here, the values of the predictive density functions for both models are only on the order of $\exp(-87)$, so the realized returns appear essentially equally unlikely under either model (i.e., neither model holds at this point). Consequently, the result at this time point is arguably unrepresentative. Overall, based on these results, we conclude that O-DCFMSV generally performs better than the PG model in terms of one-step-ahead prediction.

[Figure 6: Period-by-period comparison of PG and O-DCFMSV using $\mathrm{MAE_{VaR}}$ and FN. The portfolio return data.]

[Figure 7: Difference of the predictive log-likelihood for the returns (LPS) and for the equally-weighted portfolio returns (LPS-EW). The portfolio return data.]

The second example fits monthly stock return data. We collect 20 historical stock prices from Yahoo! Finance. The observation period is January 1977 - June 2007, with 366 observations. We calculate the stock returns by taking $\log P_{t,j} - \log P_{t-1,j}$, $t = 2, \dots, 366$ and $j = 1, \dots, 20$, where $P_{t,j}$ is the price of the $j$th stock at time $t$. This generates a set of return data with a sample size $T = 365$. Similar to the example given in Philipov and Glickman (2006b), in this illustration we model the stock returns using two pairs of factors, (Mkt, SMB) and (Mkt, HML), respectively. We again compare O-DCFMSV to PG based on the one-step-ahead prediction quality. The out-of-sample period is again January 2006 - December 2008. Notice that in this case we have 20 stock returns, but during the sample period not every month has at least 20 transaction days; for this reason, unlike Example 1, here we do not calculate the empirical covariance error-based measures.

[Table 4: Cumulative log predictive Bayes factors of O-DCFMSV against PG for the stock return data. (Mkt, SMB): $\log(B_{O,PG}) = 7.200$, $\log(B^{EW}_{O,PG}) = 6.408$; (Mkt, HML): $\log(B_{O,PG}) = 7.272$, $\log(B^{EW}_{O,PG}) = 6.480$.]

Table 4 shows the aggregate results, from which we can see that, no matter which pair of factors is used, the cumulative log predictive Bayes factors $\log(B^{EW}_{O,PG})$ and $\log(B_{O,PG})$ are both greater than 5, suggesting very strong evidence in favor of the O-DCFMSV model. Figure 8 and Figure 9 respectively display the period-by-period differences in LPS and LPS-EW (O-DCFMSV minus PG) for the pairs (Mkt, SMB) and (Mkt, HML). As one can see, similar to what we observe in Example 1, the differenced values are uniformly greater than 0 in both cases, except in October 2008, and the reason is the same: in this month, no matter which pair of factors is used, the values of the predictive density functions for both models are extremely low. Thus, again, we argue that the differences obtained at this time point may not be meaningful. Apart from this outlier, O-DCFMSV performs uniformly better over the out-of-sample period.

Another natural question to ask is which of the two combinations of factors provides a better explanation of the data. This is a model selection question, which in its generality asks how many and which factors should be used, and is not discussed in Philipov and Glickman (2006b). Our solution is straightforward.
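Given per-draw predictive densities, the LPS machinery described earlier reduces to a log-mean-exp over MCMC draws, and the cumulative log predictive Bayes factor to a sum of per-period LPS differences. A minimal sketch (function names are ours, not the paper's):

```python
import numpy as np

def log_mean_exp(x):
    """Numerically stable log of the mean of exp(x): the LPS given the
    per-draw log predictive densities of one forecast period."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return float(m + np.log(np.mean(np.exp(x - m))))

def cumulative_log_bf(lps_model1, lps_model0):
    """Cumulative log predictive Bayes factor of Model 1 against
    Model 0, given each model's per-period LPS values."""
    return float(np.sum(np.asarray(lps_model1) - np.asarray(lps_model0)))

# If the per-draw densities are 1, 2, 3, 4, the predictive density is
# their mean, 2.5, so the LPS is log(2.5).
print(np.isclose(log_mean_exp(np.log([1.0, 2.0, 3.0, 4.0])), np.log(2.5)))
```

The max-subtraction guards against underflow, which matters here: the October 2008 densities of order $\exp(-87)$ would underflow a naive mean of raw densities in single precision.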
We can simply compare the predictive performance of the candidate models using the cumulative log predictive Bayes factor. For instance, in this illustration we have two O-DCFMSV models, one with the factors (Mkt, SMB), denoted by MS, and the other with (Mkt, HML), denoted by MH. Table 5 summarizes the comparison of the two models in terms of $\log(B_{MS,MH})$ and $\log(B^{EW}_{MS,MH})$.

[Table 5: $\log(B_{MS,MH}) = 1.269$; $\log(B^{EW}_{MS,MH}) = 1.187$.]

We see that both cumulative log predictive Bayes factors show positive (but not strong) evidence in favor of (Mkt, SMB); therefore, we may conclude that, for a two-factor O-DCFMSV model, (Mkt, SMB) is a better choice for the data.

[Figure 8: Difference of the predictive log-likelihood for the returns and for the equally-weighted portfolio returns, (Mkt, SMB). The stock return data.]

[Figure 9: Difference of the predictive log-likelihood for the returns and for the equally-weighted portfolio returns, (Mkt, HML). The stock return data.]

Furthermore, if we have models that contain different numbers of factors, we can also use this approach to select the "best" model or the optimal number of factors.

It is commonly seen in financial studies that daily and weekly data exhibit more volatility than monthly data. For this reason, we may consider allowing each of the idiosyncratic errors to follow an independent SV process; see, e.g., Pitt and Shephard (1999) and Chib et al. (2006). We modify the model form (1) as
$$y_t = B f_t + e_t, \qquad e_t = \Lambda^{1/2}_t u_t, \qquad (13)$$
where $f_t$ and $e_t$ are independent and $u_t \sim \mathrm{N}_p(0, I)$.
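As an illustrative sketch (not the paper's estimation code), the error specification in (13), combined with the standard SV law of motion for the log-volatilities described in the text below, can be simulated as follows; all dimensions and parameter values are hypothetical:

```python
import numpy as np

# Hypothetical dimensions and SV parameters, chosen for illustration.
rng = np.random.default_rng(42)
T, p = 200, 4
mu = np.full(p, -1.0)        # long-run means of the log-volatilities
phi = np.full(p, 0.95)       # persistence parameters
sigma_eta = np.full(p, 0.2)  # innovation s.d. of each SV process

# h[t, j] = mu_j + phi_j * (h[t-1, j] - mu_j) + eta_{tj}
h = np.zeros((T, p))
h[0] = mu
for t in range(1, T):
    h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * rng.normal(size=p)

# e_t = Lambda_t^{1/2} u_t, Lambda_t = diag(exp(h_t1), ..., exp(h_tp)),
# so elementwise e[t, j] = exp(h[t, j] / 2) * u[t, j].
u = rng.normal(size=(T, p))
e = np.exp(h / 2.0) * u
print(e.shape)  # (200, 4)
```

Because $\Lambda_t$ is diagonal, $\Lambda_t^{1/2} u_t$ reduces to an elementwise product, which is why no matrix square root is needed in the sketch.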
The scaling matrix is $\Lambda_t = \mathrm{diag}\big(e^{h_{t1}}, \dots, e^{h_{tp}}\big)$, where $\{h_{tj},\, j = 1, \dots, p\}$ are the log-volatilities of the error terms, each following the SV process
$$h_{tj} = \mu_j + \phi_j (h_{t-1,j} - \mu_j) + \eta_{tj}, \qquad \eta_{tj} \overset{iid}{\sim} \mathrm{N}(0, \sigma^2_{\eta,j}).$$
Note that in previous sections we use the index $i = 1, \dots, q$ for the SV processes of the factor volatilities; here, for the log-volatilities of the errors, we use the index $j = 1, \dots, p$. With the specification (13) we need to change the sampling scheme for $B$. Following Geweke and Zhou (1996) and Lopes and West (2004), we set the priors $b_j \sim \mathrm{N}_q(0, c_j I_q)$, where $b_j$ is the $j$th row of $B$. Throughout we choose $c_j = 5$, which defines an uninformative prior for each $b_j$. Let $y_j$ be the $j$th column of $Y$ and $h_j = \{h_{1j}, \dots, h_{Tj}\}$. The likelihood function is
$$L(b_j \mid y_j, h_j, F) \propto \exp\Big[ -\tfrac{1}{2}\, (y_j - F b_j)' \Lambda^{-1}_j (y_j - F b_j) \Big],$$
where $\Lambda_j = \mathrm{diag}\big(e^{h_{1j}}, \dots, e^{h_{Tj}}\big)$. The conditional posterior for $b_j$ is then
$$P(b_j \mid \text{rest}) \propto \exp\Big\{ -\tfrac{1}{2} \big[ (b_j - \mu_{b_j})' \Sigma^{-1}_{b_j} (b_j - \mu_{b_j}) \big] \Big\},$$
with $\Sigma_{b_j} = \big(c_j^{-1} I + F' \Lambda^{-1}_j F\big)^{-1}$ and $\mu_{b_j} = \Sigma_{b_j} F' \Lambda^{-1}_j y_j$. For $\{\mu_j, \phi_j, \sigma^2_{\eta,j}\}$ and $h_{tj}$, in the same manner, we use the integration sampler of Kim et al. (1998) to make draws. The sampling scheme for the other parameters remains unchanged.

To see how much we gain from adding individual SV processes to the idiosyncratic error terms, we compare the O-DCFMSV with SV on the errors (SV-Err) to the original model, O-DCFMSV, using both monthly and daily data. Model performance is compared in terms of the cumulative log predictive Bayes factors. The first comparison is based on monthly data, where we use the same dataset as in the first example of Section 4.2. As before, the sample period is July 1963 - December 2005 and the out-of-sample period is three years long, covering January 2006 - December 2008, with a length $N = 36$. Table 6(a) summarizes the results of the comparison. We can see that for the monthly portfolio data, O-DCFMSV beats SV-Err in predicting $y_{t+1}$ but not the equally-weighted portfolio $w' y_{t+1}$. However, if we examine Figure 10, the period-by-period prediction results, we readily see that, in general, O-DCFMSV performs better than or as well as SV-Err. Nonetheless, just as we observed in Section 4.2, O-DCFMSV fails to capture the movement in returns in the single extreme period, October 2008, which skews the overall comparison of the two models. Except for this extreme case, we should agree that O-DCFMSV suffices to model the monthly data.

In the second comparison, both models are fitted to daily data. The dataset contains 30 daily stock prices collected from Yahoo! Finance. The sample period is from January 3, 2006 to July 31, 2008. We calculate the stock returns as described in Example 2.
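Stepping back to the Gibbs update for $b_j$ given above: it is a standard Gaussian linear-regression draw, sketched below with synthetic data (the function name and check are ours, not the paper's):

```python
import numpy as np

def draw_b_j(F, y_j, h_j, c_j, rng):
    """One Gibbs draw of the j-th row of B, given the factors F (T x q),
    the j-th return series y_j (T,), and its log-volatilities h_j (T,).

    Posterior: N(mu_bj, Sigma_bj) with
      Sigma_bj = (c_j^{-1} I + F' Lambda_j^{-1} F)^{-1},
      mu_bj    = Sigma_bj F' Lambda_j^{-1} y_j,
    where Lambda_j = diag(exp(h_1j), ..., exp(h_Tj)).
    """
    q = F.shape[1]
    lam_inv = np.exp(-h_j)  # diagonal of Lambda_j^{-1}
    precision = np.eye(q) / c_j + F.T @ (lam_inv[:, None] * F)
    Sigma_bj = np.linalg.inv(precision)
    mu_bj = Sigma_bj @ (F.T @ (lam_inv * y_j))
    chol = np.linalg.cholesky(Sigma_bj)
    return mu_bj + chol @ rng.normal(size=q)

# Synthetic check: with constant log-volatilities (h_j = 0) and the
# diffuse prior c_j = 5, the draws center near the true loadings.
rng = np.random.default_rng(7)
T, q = 500, 2
F = rng.normal(size=(T, q))
b_true = np.array([0.8, -0.5])
y_j = F @ b_true + 0.1 * rng.normal(size=T)
draws = np.array([draw_b_j(F, y_j, np.zeros(T), 5.0, rng) for _ in range(200)])
print(np.round(draws.mean(axis=0), 1))
```

Exploiting the diagonality of $\Lambda_j$ (an elementwise scaling rather than a dense solve) keeps the update $O(Tq^2)$ per series.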
This gives us return data with a sample size $T = 649$. The out-of-sample period covers one full month, August 2008, with a length $N = 21$. Table 6(b) shows the summarized result. We see clearly that the evidence is in favor of SV-Err, since both cumulative log predictive Bayes factors are smaller than 0. The result suggests that, for daily data, the SV-Err model should be considered.

[Table 6: Cumulative log predictive Bayes factors of O-DCFMSV against SV-Err. (a) Monthly portfolio data: $\log(B_{O,S}) = 41.976$, $\log(B^{EW}_{O,S}) = -27.864$. (b) Daily stock return data: $\log(B_{O,S}) = -55.512$, $\log(B^{EW}_{O,S}) = -47.952$.]

[Figure 10: Comparison of SV-Err and Non-SV-Err using monthly portfolio data.]

In this paper we propose a dynamic-correlation FMSV model where the factors are observable. The novelty is that we simultaneously allow the factors to have separate SV processes and the factor covariance process to follow an inverse Wishart process, which provides great flexibility in describing the dynamics of the factors. We also develop an algorithm based on a full MCMC procedure to estimate the model. A significant advantage of the algorithm is that it makes prediction and model selection feasible. This is an important improvement in this context, as it enlarges the scope of applications of the WIC-type models. From the comparisons using simulated data and various empirical data, we show that O-DCFMSV outperforms the competing model, PG's FMSV model.

Moreover, we also consider the SV-Err setting that is adopted by many authors. Our empirical results show that, for monthly data, the basic O-DCFMSV suffices; for daily data, the SV-Err specification should be used. The model is easy to extend to allow for heavy-tailedness; for example, we can add the ad hoc scaled-$t$ specification suggested by Kim et al. (1998) to the idiosyncratic errors. A possible future direction is to allow the factors to be latent so that the model can be more flexible. When it comes to the latent factor structure, many issues need to be resolved, such as the nonidentification problem, the choice of the number of factors, and so on. We expect further research in this direction.

References

Asai, M. and McAleer, M. (2009), "The structure of dynamic correlations in multivariate stochastic volatility models," Journal of Econometrics, 150, 182-192.

Barbieri, A., Chang, K., Dubikovsky, V., Fox, J., Gladkevich, A., Gold, C., and Goldberg, L. (2009), "Modeling Value at Risk with Factor," Tech. rep., MSCI Barra Research.

Chib, S., Nardari, F., and Shephard, N. (2006), "Analysis of high dimensional multivariate stochastic volatility models," Journal of Econometrics, 134, 341-371.

Doornik, J. (2007), Object-Oriented Matrix Programming Using Ox, London: Timberlake Consultants Press.

Geweke, J. and Amisano, G. (2010), "Comparing and evaluating Bayesian predictive distributions of asset returns," International Journal of Forecasting, 26, 216-230.

Geweke, J. and Zhou, G. (1996), "Measuring the pricing error of the arbitrage pricing theory," Review of Financial Studies, 9, 557-587.

Gilks, W., Best, N., and Tan, K. (1995), "Adaptive rejection Metropolis sampling within Gibbs sampling," Applied Statistics, 44, 455-473.

Gneiting, T. and Raftery, A. E. (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation," Journal of the American Statistical Association, 102, 359-378.

Harvey, A., Ruiz, E., and Shephard, N. (1994), "Multivariate stochastic variance models," Review of Economic Studies, 61, 247-264.

Jacquier, E., Polson, N. G., and Rossi, P. E. (1995), "Models and prior distributions for multivariate stochastic volatility," Tech. Rep. 95-18, CIRANO Scientific Series, Montreal.

Kass, R. E. and Raftery, A. E. (1995), "Bayes Factors," Journal of the American Statistical Association, 90, 773-795.

Kim, S., Shephard, N., and Chib, S. (1998), "Stochastic volatility: Likelihood inference and comparison with ARCH models," Review of Economic Studies, 65, 361-393.

Ledoit, O., Santa-Clara, P., and Wolf, M. (2003), "Flexible multivariate GARCH modeling with an application to international stock markets," The Review of Economics and Statistics, 85, 735-747.

Liesenfeld, R. and Richard, J.-F. (2006), "Classical and Bayesian Analysis of Univariate and Multivariate Stochastic Volatility Models," Econometric Reviews, 25, 335-360.

Lopes, H. and Carvalho, C. M. (2007), "Factor stochastic volatility with time varying loadings and Markov switching regimes," Journal of Statistical Planning and Inference, 137, 3082-3091.

Lopes, H. and West, M. (2004), "Bayesian model assessment in factor analysis," Statistica Sinica, 14, 41-67.

Philipov, A. and Glickman, M. E. (2006a), "Multivariate stochastic volatility via Wishart processes," Journal of Business and Economic Statistics, 24, 313-328.

— (2006b), "Factor multivariate stochastic volatility via Wishart processes," Econometric Reviews, 25, 311-334.

Pitt, M. K. and Shephard, N. (1999), "Time-Varying Covariances: A Factor Stochastic Volatility Approach," in Bayesian Statistics, eds. Bernardo, J., Berger, J., Dawid, A., and Smith, A., Oxford: Oxford University Press, vol. 6, pp. 169-193.

Rey, D. (2000), "Time-varying stock market correlations and correlation breakdown," Financial Markets and Portfolio Management, 14, 387-412.