Bayesian Quantile-Based Portfolio Selection
Taras Bodnar, Mathias Lindholm, Vilhelm Niklasson, Erik Thorsén
Department of Mathematics, Stockholm University, SE-10691 Stockholm, Sweden

December 4, 2020
Abstract
We study the optimal portfolio allocation problem from a Bayesian perspective using value at risk (VaR) and conditional value at risk (CVaR) as risk measures. By applying the posterior predictive distribution for the future portfolio return, we derive relevant quantiles needed in the computations of VaR and CVaR, and express the optimal portfolio weights in terms of observed data only. This is in contrast to the conventional method where the optimal solution is based on unobserved quantities which are estimated, leading to suboptimality. We also obtain the expressions for the weights of the global minimum VaR and CVaR portfolios, and specify conditions for their existence. It is shown that these portfolios may not exist if the confidence level used for the VaR or CVaR computation is too low. Moreover, analytical expressions for the mean-VaR and mean-CVaR efficient frontiers are presented and the extension of theoretical results to general coherent risk measures is provided. One of the main advantages of the suggested Bayesian approach is that the theoretical results are derived in the finite-sample case and thus they are exact and can be applied to large-dimensional portfolios.

By using simulation and real market data, we compare the new Bayesian approach to the conventional method by studying the performance and existence of the global minimum VaR portfolio and by analysing the estimated efficient frontiers. It is concluded that the Bayesian approach outperforms the conventional one, in particular at predicting the out-of-sample VaR.
Keywords : Finance; Bayesian inference; Posterior predictive distribution; Optimal portfolio; Quantile-based risk measure
The economic theory underlying optimal portfolio selection was pioneered by Markowitz (1952). In his seminal paper, the optimal allocation of the available assets was determined by their expected returns and the covariance matrix of the asset returns. In practice, however, both the vector of expected returns and the covariance matrix are unknown and have to be estimated by using historical data. These estimates were traditionally treated as the true parameters of the data-generating process and plugged into the equations for the weights of optimal portfolios. Unfortunately, this causes a challenging problem since the optimal portfolio weights appear to be sensitive to misspecification of the input parameters, especially the expected returns, and thus estimation errors can lead to poorly performing portfolios (see, Chopra and Ziemba 1993, Frankfurter et al. 1971, Klein and Bawa 1976, Merton 1980, Simaan 2014, Bodnar et al. 2018).

The problem concerning parameter uncertainty was one of the reasons why the Bayesian approach was introduced in portfolio theory during the 1970s (Winkler and Barry 1975). In the Bayesian setting, the parameters of the distribution of the asset returns are modelled as random variables with a prior distribution summarizing the information about the asset returns which is not present in the historical data. The posterior distribution, derived from the likelihood function and the used priors, provides updated knowledge on the model parameters conditionally on the observed data. In addition to reducing estimation risk, a Bayesian setting also makes it possible to employ useful prior information and implement fast numerical algorithms (see, Avramov and Zhou 2010, Rachev et al. 2008). Several different ways of specifying a prior have been suggested in the literature (see, e.g., Avramov and Zhou 2010, Bodnar et al. 2017, Tu and Zhou 2010). In general, a prior can be either informative or non-informative.
Early applications of Bayesian statistics in finance were mainly based on uninformative or model-based priors (Bawa et al. 1979). However, during the 1980s and 1990s, some more sophisticated priors were developed. Two such highly influential priors are the hyperparameter prior by Jorion (1986), which was inspired by the Bayes-Stein shrinkage approach, and the informative prior by Black and Litterman (1992), which relies on market equilibrium arguments.

In addition to specifying a prior, an assumption must be made about the return distribution. Such a modelling choice is crucial both in the frequentist and in the Bayesian setup. The standard assumption of unconditional normality has been heavily criticized since actual market returns tend to have a higher density both close to the mean and far away from it (Cont 2001). Because of this, more general assumptions allowing for heavier tails have been suggested (see, e.g., Adcock 2014, Bauder et al. 2020b). For instance, Bauder et al. (2020b) studied a Bayesian setting where they used the general assumption that the asset returns are exchangeable and multivariate centered spherically symmetric (see also, Brugière 2020). They derived a stochastic representation of the posterior predictive distribution, i.e., the distribution of the next return conditioned on the previous returns, under this assumption. By using this representation in the portfolio optimization problem, the common suboptimality issue in the standard plug-in approach was avoided since the estimates were no longer treated as the true parameters. Using a stochastic representation is a well-established tool in computational statistics (Givens and Hoeting 2013) and it reduces the need for demanding Markov Chain Monte Carlo simulations. Moreover, a stochastic representation makes it easier to find quantiles which can be used to measure the risk.

In the standard Markowitz setup, the risk is measured in terms of the variance of the portfolio return (Markowitz 1952).
Recent regulations have forced banks and insurance companies to use other measures of risk, such as value at risk (VaR) and conditional value at risk (CVaR). The former is still in use in the Solvency II directive while the latter is enforced by the Basel III and IV standards. Both VaR and CVaR are quantile-based risk measures focusing on downside risk, meaning that the risk is determined from a quantile in the right tail of a loss distribution. A lot of attention has been given to these two risk measures over the last decades concerning their applicability, computations and back-testing (see, e.g., Baumol 1963, Jorion 1997, Pritsker 1997, Meng and Taylor 2020, Staino and Russo 2020). The two measures differ substantially in the way the quantiles are selected and it is usually argued that CVaR is a better measure of risk in comparison to VaR since it does not violate the desirable property of subadditivity (Artzner et al. 1999).

The usage of VaR and CVaR in portfolio optimization has also become increasingly popular over the last decades. For instance, Rockafellar and Uryasev (2000) and Babat et al. (2018) studied the minimization of CVaR in a portfolio using linear programming. Alexander and Baptista (2002, 2004) derived the weights of a portfolio when using VaR or CVaR in the objective function in the portfolio optimization problem under the assumption of multivariate normally distributed asset returns. They noted that the optimal allocation strategy does not depend on whether the variance, VaR or CVaR is used as a risk measure. However, the portfolio that globally minimizes VaR or CVaR may not coincide with the one which globally minimizes the variance. This is illustrated in Figure 1 where the mean-variance efficient frontier is plotted together with the locations of the global minimum variance (GMV), global minimum VaR (GMVaR) and global minimum CVaR (GMCVaR) portfolios based on weekly returns for 50 randomly selected stocks in the S&P 500 index.
In this figure, the efficient frontier and the optimal portfolios are all calculated using the conventional method, i.e., using sample estimates and treating them as the true parameters of the data-generating model. The fact that the GMVaR and GMCVaR portfolios are located above the GMV portfolio on the mean-variance efficient frontier motivates the study of these portfolios also for an investor who constructs the portfolio following the mean-variance analysis, since their expected returns determine the lower bounds for an investor who wants to be efficient also from a mean-VaR or mean-CVaR perspective. Given the new regulations, this should be highly desirable.

Figure 1: Mean-variance efficient frontier based on empirical data from stocks in the S&P 500 index together with the locations of the GMV, GMVaR and GMCVaR portfolios. Panels: (a) GMVaR, (b) GMCVaR.

This paper contributes to the current literature on portfolio theory by combining a Bayesian framework and a quantile-based asset allocation strategy. This gives new insights into how an optimal portfolio can be constructed in practice. We further compare the new Bayesian approach to its frequentist counterpart and demonstrate some of the advantages of the former through a simulation study and an empirical illustration. Throughout the paper, we consider two different well-known priors (one informative and one non-informative) and we use general (non-normal) assumptions on the asset returns as in Bauder et al. (2020b) and Brugière (2020).

The rest of the paper is organized as follows. In Section 2, we present a Bayesian model of portfolio and asset returns and discuss its time series properties. Moreover, the posterior predictive distribution is derived together with stochastic representations related to it. In Section 3, we use the obtained stochastic representations to establish expressions for VaR and CVaR of a portfolio and we also extend the analysis to more general risk measures.
Section 4 deals with portfolio optimization under parameter uncertainty. The Bayesian quantile-based portfolio optimization problem is solved and conditions for the existence of a global minimum VaR and CVaR portfolio are presented. The mean-VaR and mean-CVaR efficient frontiers are also derived. Section 5 contains a simulation study where our Bayesian approach is compared to the conventional method by analysing the performance and existence of the global minimum VaR portfolio as well as by comparing the estimated efficient frontiers. The comparison is continued in Section 6 with an empirical study based on market data from stocks in the S&P 500 index. Section 7 contains a conclusion and discussion based on our findings. Finally, all of the proofs are moved to the appendix together with a description of the data.
In this section, we introduce our Bayesian model and derive the posterior predictive distributions of the portfolio return using two different priors. We also make a time series interpretation of the model used to describe the stochastic properties of the asset returns.
Let $\mathbf{X}_t$ be a $k$-dimensional vector of logarithmic asset returns at time $t$, i.e., each element $X_{t,i}$, $i = 1, \ldots, k$, of $\mathbf{X}_t$ is defined as

$$X_{t,i} := \log\left(\frac{P_{t,i}}{P_{t-1,i}}\right),$$

where $P_{t,i}$ denotes the price at time $t$ of asset $i$. Throughout the paper we assume that the asset returns $\mathbf{X}_1, \mathbf{X}_2, \ldots$ are infinitely exchangeable and multivariate centered spherically symmetric (see, Bernardo and Smith 2009, for the definition and properties). This assumption in particular implies that the asset returns are neither normally nor independently distributed. Moreover, let $\mathbf{x}_{(t-1)} = (\mathbf{x}_{t-n}, \ldots, \mathbf{x}_{t-1})$ be the observation matrix of the asset returns $\mathbf{x}_{t-n}, \ldots, \mathbf{x}_{t-1}$ taken from time $t-n$ until $t-1$ whose distribution depends on the parameter vector $\boldsymbol{\theta}$. Bayes' theorem provides the posterior distribution of $\boldsymbol{\theta}$ expressed as

$$\pi(\boldsymbol{\theta} \mid \mathbf{x}_{(t-1)}) \propto f(\mathbf{x}_{(t-1)} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta}),$$

where $\pi(\cdot)$ is the prior distribution and $f(\cdot \mid \boldsymbol{\theta})$ is the likelihood function.

At time $t$ the return of the portfolio with weights $\mathbf{w} = (w_1, \ldots, w_k)^\top$, where it is assumed that $\mathbf{1}^\top \mathbf{w} = 1$, is given by

$$X_{P,t} := \mathbf{w}^\top \mathbf{X}_t.$$

The posterior predictive distribution of $X_{P,t}$, i.e., the conditional distribution of $X_{P,t}$ given $\mathbf{x}_{(t-1)}$, is computed by (see, Bernardo and Smith 2009)

$$f(x_{P,t} \mid \mathbf{x}_{(t-1)}) = \int_{\boldsymbol{\theta} \in \Theta} f(x_{P,t} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta} \mid \mathbf{x}_{(t-1)})\, \mathrm{d}\boldsymbol{\theta}, \tag{2.1}$$

where $\Theta$ denotes the parameter space and $f(\cdot \mid \boldsymbol{\theta})$ is the conditional density of $X_{P,t}$ given $\boldsymbol{\theta}$. The posterior predictive distribution (2.1) determines the distribution of the future portfolio return at time $t$ given the information available in the historical data of asset returns up to $t-1$. It can be used to construct a point prediction of the future portfolio return, such as the posterior predictive mean or mode, together with the uncertainty, expressed as the posterior predictive variance.
Furthermore, a prediction interval for the portfolio return can be obtained as a posterior predictive credible interval.

The integration in (2.1) usually makes it impossible to find an analytical expression for the posterior predictive distribution. In such cases, the point prediction of the future portfolio return together with its uncertainty can be obtained by using the Markov Chain Monte Carlo method (see, e.g., Krüger et al. 2020). Moreover, Zellner and Ando (2010) argued that a direct Monte Carlo approach provides an efficient computational method to compute a Bayesian estimate and to determine its uncertainty. Another way to deal with the integration issue is by directly simulating from the posterior predictive distribution by using a stochastic representation (see, e.g., Bauder et al. 2020b).

Employing the non-informative Jeffreys prior and the informative conjugate prior, Bauder et al. (2020b) derived a stochastic representation of a random variable following the posterior predictive distribution (2.1) under the assumptions of infinite exchangeability and multivariate centered spherical symmetry. In this case, the model parameters are the mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ of the asset returns (see, Section 2.2 for the detailed discussion). The Jeffreys prior and the conjugate prior are given by

$$\pi_J(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \propto |\boldsymbol{\Sigma}|^{-(k+1)/2}, \tag{2.2}$$

and

$$\boldsymbol{\mu} \mid \boldsymbol{\Sigma} \sim N_k\left(\mathbf{m}_0, \frac{1}{r_0}\boldsymbol{\Sigma}\right), \quad \text{and} \quad \boldsymbol{\Sigma} \sim IW_k(d_0, \mathbf{S}_0), \tag{2.3}$$

respectively, where $|\cdot|$ stands for the determinant and $IW_k(d_0, \mathbf{S}_0)$ denotes the inverse Wishart distribution with $d_0$ degrees of freedom and parameter matrix $\mathbf{S}_0$ (see, Gupta and Nagar 2000, for the definition and properties). The quantities $\mathbf{m}_0$, $r_0$, $d_0$, and $\mathbf{S}_0$ are the hyperparameters of the conjugate prior, where $\mathbf{m}_0$ and $\mathbf{S}_0$ reflect the investor's prior belief about the mean vector and covariance matrix, whereas $r_0$ and $d_0$ represent the precision of these beliefs.
The two priors (2.2) and (2.3) have been widely used in financial literature where the Jeffreys prior is usually also referred to as the diffuse prior (see, e.g., Barry 1974, Bodnar et al. 2017, Brown 1976, Stambaugh 1997), and the conjugate prior is also commonly related to the Black-Litterman model (see, Bauder et al. 2020b, Black and Litterman 1992, Frost and Savarino 1986, Kolm and Ritter 2017).

The stochastic representations derived by Bauder et al. (2020b) fully determine the posterior predictive distribution. They are expressed in terms of two independent standard $t$-distributed random variables whose degrees of freedom depend on the assigned prior. Using these findings, the posterior predictive distribution is deduced and is presented in Theorem 2.1. Let $t(q, a, b)$ denote the univariate $t$-distribution with $q$ degrees of freedom, location parameter $a$ and scale parameter $b$. Then, we obtain the following results.

Theorem 2.1.
Let asset returns $\mathbf{X}_1, \mathbf{X}_2, \ldots$ be infinitely exchangeable and multivariate centered spherically symmetric. Then

1. under the Jeffreys prior, it holds for $n > k$ that the posterior predictive distribution defined in (2.1) is $t(d_{k,n,J}, \mathbf{w}^\top \bar{\mathbf{x}}_{t-1,J}, r_{k,n,J}\, \mathbf{w}^\top \mathbf{S}_{t-1,J} \mathbf{w})$ with

$$d_{k,n,J} = n - k, \quad r_{k,n,J} = \frac{n+1}{n(n-k)}, \quad \bar{\mathbf{x}}_{t-1,J} = \frac{1}{n} \sum_{i=t-n}^{t-1} \mathbf{x}_i, \tag{2.4}$$

and

$$\mathbf{S}_{t-1,J} = \sum_{i=t-n}^{t-1} (\mathbf{x}_i - \bar{\mathbf{x}}_{t-1,J})(\mathbf{x}_i - \bar{\mathbf{x}}_{t-1,J})^\top; \tag{2.5}$$
2. under the conjugate prior, it holds for $n + d_0 - 2k > 0$ that the posterior predictive distribution defined in (2.1) is $t(d_{k,n,C}, \mathbf{w}^\top \bar{\mathbf{x}}_{t-1,C}, r_{k,n,C}\, \mathbf{w}^\top \mathbf{S}_{t-1,C} \mathbf{w})$ with

$$d_{k,n,C} = n + d_0 - 2k, \quad r_{k,n,C} = \frac{n + r_0 + 1}{(n + r_0)(n + d_0 - 2k)}, \quad \bar{\mathbf{x}}_{t-1,C} = \frac{n \bar{\mathbf{x}}_{t-1,J} + r_0 \mathbf{m}_0}{n + r_0}, \tag{2.6}$$

and

$$\mathbf{S}_{t-1,C} = \mathbf{S}_{t-1,J} + \mathbf{S}_0 + n r_0 \frac{(\mathbf{m}_0 - \bar{\mathbf{x}}_{t-1,C})(\mathbf{m}_0 - \bar{\mathbf{x}}_{t-1,C})^\top}{n + r_0}. \tag{2.7}$$

The proof of Theorem 2.1 is given in the appendix.

Since the structures of the two posterior predictive distributions are similar, we introduce new notation which allows us to combine the two findings of Theorem 2.1 into a single result. Let

$$(d_{k,n}, r_{k,n}, \bar{\mathbf{x}}_{t-1}, \mathbf{S}_{t-1}) = \begin{cases} (d_{k,n,J}, r_{k,n,J}, \bar{\mathbf{x}}_{t-1,J}, \mathbf{S}_{t-1,J}) & \text{under the Jeffreys prior,} \\ (d_{k,n,C}, r_{k,n,C}, \bar{\mathbf{x}}_{t-1,C}, \mathbf{S}_{t-1,C}) & \text{under the conjugate prior.} \end{cases} \tag{2.8}$$

From Theorem 2.1 we deduce that the posterior predictive distribution under both priors can be expressed as $t(d_{k,n}, \mathbf{w}^\top \bar{\mathbf{x}}_{t-1}, r_{k,n}\, \mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w})$ with $d_{k,n}$, $r_{k,n}$, $\bar{\mathbf{x}}_{t-1}$, and $\mathbf{S}_{t-1}$ as in (2.8). Let $\widehat{X}_{P,t}$ denote a random variable which follows the posterior predictive distribution, i.e., whose distribution coincides with the conditional distribution of the portfolio return $X_{P,t}$ given the information available up to time $t-1$. Application of Theorem 2.1 together with (2.8) then gives the following stochastic representation of $\widehat{X}_{P,t}$:

$$\widehat{X}_{P,t} \overset{d}{=} \mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + \tau \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}, \tag{2.9}$$

where $\tau \sim t(d_{k,n})$ with $t(d_{k,n})$ denoting the standard $t$-distribution with $d_{k,n}$ degrees of freedom, i.e., the $t$-distribution with zero location parameter and scale parameter equal to one. Representation (2.9) immediately implies that

$$\mathrm{E}(\widehat{X}_{P,t}) = \mathrm{E}(X_{P,t} \mid \mathbf{x}_{(t-1)}) = \mathbf{w}^\top \bar{\mathbf{x}}_{t-1}, \quad \mathrm{Var}(\widehat{X}_{P,t}) = \mathrm{Var}(X_{P,t} \mid \mathbf{x}_{(t-1)}) = \frac{d_{k,n}\, r_{k,n}}{d_{k,n} - 2}\, \mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}, \tag{2.10}$$

for $d_{k,n} > 1$ and $d_{k,n} > 2$, respectively.

In the financial literature it is common to describe asset returns in terms of a time series model (see, e.g., Tsay 2010). In order to make comparisons to such models easier, we now give a time series representation of the considered Bayesian model.

By De Finetti's theorem, the assumption of infinite exchangeability implies that there exists a probability measure conditionally on which the asset returns are independent and identically distributed (see, Kingman et al. 1978). Moreover, the assumption of multivariate centered spherical symmetry determines the conditional distribution of the asset returns. Namely, we get that the asset returns conditionally on the mean vector $\boldsymbol{\mu}$ and on the covariance matrix $\boldsymbol{\Sigma}$ are independent and normally distributed (see, Proposition 4.6 in Bernardo and Smith 2009). That is,

$$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_t \mid \boldsymbol{\mu}, \boldsymbol{\Sigma} \ \text{are independent with} \ \mathbf{X}_i \mid \boldsymbol{\mu}, \boldsymbol{\Sigma} \sim N_k(\boldsymbol{\mu}, \boldsymbol{\Sigma}). \tag{2.11}$$

Model (2.11) should not be confused with the assumption that the asset returns are independent and normally distributed as it is usually presented in financial and econometric literature. The two models coincide only when $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ possess probability distributions concentrated in single points. This would imply that the investor knows $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ with certainty, which is not the case in practice.

In order to show the considerable difference between the model (2.11) and the model assuming independent and normally distributed asset returns, we next derive the marginal distribution of $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_t$ by using the findings of Theorem 2.1 and the stochastic representation (2.9).
The results will be obtained by assigning the Jeffreys prior to the model parameters $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ as well as the conjugate prior.

In Theorem 2.1 we showed that given a sample of size $n > k$, the posterior predictive distribution of $X_{P,t} = \mathbf{w}^\top \mathbf{X}_t$ given $\mathbf{X}_{t-n}, \ldots, \mathbf{X}_{t-1}$ has a univariate $t$-distribution under both priors (see, also the stochastic representation (2.9)). Since $\mathbf{w}$ is an arbitrary vector with the only restriction $\mathbf{1}^\top \mathbf{w} = 1$, choosing $n = t-1$ we get

$$\mathbf{X}_t \mid \mathbf{X}_1, \ldots, \mathbf{X}_{t-1} \sim t_k(d_{k,t-1}, \bar{\mathbf{x}}_{t-1}, r_{k,t-1} \mathbf{S}_{t-1}), \tag{2.12}$$

and, similarly for $j = k+2, \ldots, t-1$ it holds that

$$\mathbf{X}_{j+1} \mid \mathbf{X}_1, \ldots, \mathbf{X}_j \sim t_k(d_{k,j}, \bar{\mathbf{x}}_j, r_{k,j} \mathbf{S}_j), \tag{2.13}$$

where $d_{k,j}$ and $r_{k,j}$ for $j = k+2, \ldots, t-1$ are defined as in (2.8); $\bar{\mathbf{x}}_j$ and $\mathbf{S}_j$ are defined similarly to $\bar{\mathbf{x}}_{t-1}$ and $\mathbf{S}_{t-1}$, where the sums in (2.4) and (2.5) are from $1$ to $j$. The symbol $t_k(\cdot, \cdot, \cdot)$ stands for the $k$-dimensional multivariate $t$-distribution.

The last result can also be written in the time series context in the following way:

$$\mathbf{X}_{j+1} = \bar{\mathbf{x}}_j + \sqrt{r_{k,j}}\, \mathbf{S}_j^{1/2} \boldsymbol{\varepsilon}_j \quad \text{for} \ j = k+2, \ldots, t-1, \tag{2.14}$$

where $\boldsymbol{\varepsilon}_j \sim t_k(d_{k,j}, \mathbf{0}, \mathbf{I})$ is the error term and $\mathbf{S}_j^{1/2}$ denotes a square root of $\mathbf{S}_j$. Finally, we get that the joint density of $\mathbf{X}_1, \ldots, \mathbf{X}_{k+2}$ is given by (see Appendix B in Bauder et al. 2020a)

$$f(\mathbf{x}_1, \ldots, \mathbf{x}_{k+2}) \propto |\mathbf{S}_{k+2}|^{-\frac{d_{k,k+2} + k - 1}{2}}. \tag{2.15}$$

The derived time series model for $\mathbf{X}_1, \ldots, \mathbf{X}_t$ indicates that a complicated nonlinear dependence structure is present in the conditional model (2.11). Moreover, model (2.11) can be used to capture the time-dependent structure in the dynamics of both the conditional mean vector and the conditional covariance matrix of the asset returns which is usually observed in practice (see, e.g., Tsay 2010). Finally, the error terms in the time series model for $\mathbf{X}_1, \ldots, \mathbf{X}_t$ in (2.14) are multivariate $t$-distributed with degrees of freedom depending on the amount of available information.
This gives rise to an additional source of non-stationarity and allows for heavier tails than in a model where asset returns are assumed to be independent and normally distributed, corresponding to the conventional approach.

The quantile-based risk measures value at risk (VaR) and conditional value at risk (CVaR) are introduced in the following section and we derive their analytical expressions in our Bayesian setting using the posterior predictive distribution. Moreover, we extend the analysis to more general risk measures.
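The quantities of Theorem 2.1 under the Jeffreys prior, the stochastic representation (2.9), and the moments (2.10) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names, the synthetic return matrix, and the equally weighted portfolio are assumptions made for the example, and only the Jeffreys case (2.4)-(2.5) is covered.

```python
import numpy as np

def jeffreys_predictive(x):
    """Quantities (2.4)-(2.5) for an n-by-k matrix of returns (rows = periods)."""
    n, k = x.shape
    if n <= k:
        raise ValueError("Theorem 2.1 requires n > k under the Jeffreys prior")
    d = n - k                        # degrees of freedom d_{k,n,J}
    r = (n + 1) / (n * (n - k))      # scale factor r_{k,n,J}
    xbar = x.mean(axis=0)            # sample mean vector
    centered = x - xbar
    S = centered.T @ centered        # unnormalised scatter matrix S_{t-1,J}
    return d, r, xbar, S

def sample_portfolio_return(w, d, r, xbar, S, size, rng):
    """Draw from the posterior predictive distribution via representation (2.9)."""
    tau = rng.standard_t(d, size=size)
    return w @ xbar + tau * np.sqrt(r * (w @ S @ w))

def predictive_moments(w, d, r, xbar, S):
    """Posterior predictive mean and variance from (2.10); the variance needs d > 2."""
    if d <= 2:
        raise ValueError("the posterior predictive variance requires d > 2")
    return w @ xbar, d * r / (d - 2) * (w @ S @ w)

# Synthetic data, purely illustrative: 60 periods, 5 assets, equal weights.
rng = np.random.default_rng(0)
x = rng.normal(0.001, 0.02, size=(60, 5))
w = np.full(5, 0.2)

d, r, xbar, S = jeffreys_predictive(x)
draws = sample_portfolio_return(w, d, r, xbar, S, 200_000, rng)
mean, var = predictive_moments(w, d, r, xbar, S)
```

Because (2.9) involves a single univariate $t$-variable, the simulated draws should reproduce the closed-form moments in (2.10) up to Monte Carlo error, without any MCMC step.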
The results of Theorem 2.1 provide an easy way to sample from the posterior predictive distribution as well as to find its quantiles. The two most common quantile-based risk measures used in the literature are VaR and CVaR, which we define in this section using the posterior predictive distribution of the portfolio return.

Denoting by $\widehat{X}_{P,t}$ a random variable whose distribution coincides with the posterior predictive distribution of the portfolio return $X_{P,t}$ given $\mathbf{x}_{(t-1)}$ and using that the posterior predictive distribution is absolutely continuous, the two quantile-based risk measures at confidence level $\alpha \in (0.5, 1)$ are defined by

$$\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t}) := F^{-1}_{Y,t-1}(\alpha) \quad \text{and} \quad \mathrm{CVaR}_{\alpha,t-1}(\widehat{X}_{P,t}) := \mathrm{E}[Y \mid Y \geq \mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t})],$$

respectively, where $Y := -\widehat{X}_{P,t}$ is the portfolio loss with cumulative distribution function $F_{Y,t-1}(\cdot)$.

In the following, it is implicit that the probability and expectation in the definitions of the VaR and CVaR are formulated in terms of the posterior predictive distribution and they are conditioned on previous asset returns $\mathbf{x}_{(t-1)}$. Moreover, the posterior predictive distribution is free of parameters and is fully determined by the observed data. This constitutes the main advantage of the suggested approach, namely it takes the parameter uncertainty into account before the optimal portfolio is constructed, i.e., before the optimisation problem is formulated and solved. We discuss this point in detail in Section 4.

By definition, $\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t})$ satisfies

$$P\left(\widehat{X}_{P,t} \leq -\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t})\right) = 1 - \alpha. \tag{3.1}$$

Rewriting the left hand side of (3.1) yields

$$P\left(\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + \tau \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}} \leq -\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t})\right) = P\left(\tau \leq \frac{-\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t}) - \mathbf{w}^\top \bar{\mathbf{x}}_{t-1}}{\sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}}\right),$$

where $\tau$ is standard $t$-distributed with degrees of freedom $d_{k,n}$ defined in (2.8) depending on the prior. Hence,

$$\frac{-\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t}) - \mathbf{w}^\top \bar{\mathbf{x}}_{t-1}}{\sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}} = d_{1-\alpha},$$

where $d_{1-\alpha}$ is the $(1-\alpha)$ quantile of the $t$-distribution with $d_{k,n}$ degrees of freedom, which satisfies $d_\alpha = -d_{1-\alpha}$ due to the symmetry of the $t$-distribution. Thus,

$$\mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t}) = -\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + d_\alpha \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}. \tag{3.2}$$

Similarly, it follows by the definition of CVaR that

$$\mathrm{CVaR}_{\alpha,t-1}(\widehat{X}_{P,t}) = \mathrm{E}\left(-\widehat{X}_{P,t} \mid -\widehat{X}_{P,t} \geq \mathrm{VaR}_{\alpha,t-1}(\widehat{X}_{P,t})\right) = -\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + k_\alpha \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}, \tag{3.3}$$

with

$$k_\alpha = \mathrm{E}(-\tau \mid -\tau \geq d_\alpha) = \frac{1}{1-\alpha} \int_{d_\alpha}^{\infty} t f_{d_{k,n}}(t)\, \mathrm{d}t = \frac{1}{1-\alpha} \frac{\Gamma\left(\frac{d_{k,n}+1}{2}\right)}{\Gamma\left(\frac{d_{k,n}}{2}\right) \sqrt{\pi d_{k,n}}} \frac{d_{k,n}}{d_{k,n}-1} \left(1 + \frac{d_\alpha^2}{d_{k,n}}\right)^{-\frac{d_{k,n}-1}{2}},$$

where $f_{d_{k,n}}(t)$ denotes the density of the $t$-distribution with $d_{k,n}$ degrees of freedom and we use that the distribution of $-\tau$ coincides with that of $\tau$ due to the symmetry of the $t$-distribution.

The expressions for VaR and CVaR given in (3.2) and (3.3), respectively, can be presented in the following form:

$$Q_{t-1}(\mathbf{w}) = -\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + q_\alpha \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}, \tag{3.4}$$

where $q_\alpha = d_\alpha$ when considering VaR and $q_\alpha = k_\alpha$ when considering CVaR. This general formulation will be used extensively in the rest of the paper in order to handle the VaR and CVaR cases simultaneously.

Since $\alpha \in (0.5, 1)$ and the $t$-distribution is symmetric, we get that $d_\alpha > 0$. Moreover, we have that $k_\alpha > 0$ by definition. As a result, $q_\alpha > 0$, which together with the convexity of $\sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}$ implies the following result.

Theorem 3.1.
Under the conditions of Theorem 2.1, $Q_{t-1}(\mathbf{w})$ is convex with respect to $\mathbf{w}$.

The proof of Theorem 3.1 is given in the appendix.
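The multipliers $d_\alpha$ and $k_\alpha$ and the risk functional $Q_{t-1}(\mathbf{w})$ in (3.4) can be evaluated along the following lines. The sketch assumes SciPy is available; the function names and all numerical inputs (mean vector, scatter matrix, weights) are hypothetical values chosen only to make the example run. The expression used for $k_\alpha$ is an algebraic rewriting of the closed form above, namely $k_\alpha = f_{d_{k,n}}(d_\alpha)\,(d_{k,n} + d_\alpha^2) / \{(d_{k,n}-1)(1-\alpha)\}$.

```python
import numpy as np
from scipy import stats

def var_multiplier(alpha, d):
    """d_alpha: the alpha-quantile of the standard t-distribution with d degrees of freedom."""
    return stats.t.ppf(alpha, d)

def cvar_multiplier(alpha, d):
    """k_alpha = E(tau | tau >= d_alpha) for tau ~ t(d); requires d > 1."""
    if d <= 1:
        raise ValueError("k_alpha requires more than one degree of freedom")
    d_alpha = stats.t.ppf(alpha, d)
    # Equivalent to the Gamma-function expression for k_alpha in the text.
    return stats.t.pdf(d_alpha, d) * (d + d_alpha**2) / ((d - 1) * (1 - alpha))

def quantile_risk(w, q_alpha, r, xbar, S):
    """Q_{t-1}(w) from (3.4); q_alpha = d_alpha gives VaR, q_alpha = k_alpha gives CVaR."""
    return -(w @ xbar) + q_alpha * np.sqrt(r * (w @ S @ w))

# Illustrative inputs (assumed): k = 3 assets, n = 60 observations, Jeffreys prior.
n, k, alpha = 60, 3, 0.95
d = n - k
r = (n + 1) / (n * (n - k))
xbar = np.array([0.002, 0.001, 0.003])
S = np.array([[0.030, 0.006, 0.004],
              [0.006, 0.020, 0.005],
              [0.004, 0.005, 0.040]])
w = np.full(k, 1 / k)

d_alpha = var_multiplier(alpha, d)
k_alpha = cvar_multiplier(alpha, d)
var_value = quantile_risk(w, d_alpha, r, xbar, S)
cvar_value = quantile_risk(w, k_alpha, r, xbar, S)
```

Since $k_\alpha$ is a tail expectation beyond $d_\alpha$, the CVaR of a given portfolio is never smaller than its VaR at the same confidence level.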
The stochastic representation in equation (2.9) shows that the general presentation of the posterior predictive VaR and of the posterior predictive CVaR as given in (3.4) can be extended to other risk measures used in portfolio theory and risk management. Such a result makes it possible to formulate and to solve the portfolio choice problem under more general setups considered in financial mathematics which are based on coherent risk functionals (see, e.g., Artzner et al. 1999). Namely, instead of considering the VaR and the CVaR to specify the portfolio risk, one can choose any risk measure which is relevant, translation invariant and positive homogeneous. These properties are all present in the definition of a coherent risk measure. For a risk functional $\rho$, this can be formulated as

(a) Relevance: For all $X_{P,t} \leq 0$ we have $\rho(X_{P,t}) \geq 0$.
(b) Translation invariance: For any real scalar $a$, $\rho(X_{P,t} + a) = \rho(X_{P,t}) - a$.
(c) Positive homogeneity: For a positive scalar $\lambda$, $\rho(\lambda X_{P,t}) = \lambda \rho(X_{P,t})$.

The relevance property provides us with a foundation for the interpretation of risk measures. From an investor's point of view, a risk measure cannot assign a negative risk to a certain loss. That is often, if not always, how we think about risk. The property of translation invariance states that if one adds more cash to the portfolio or invests a larger proportion of the capital in something that is deemed risk-free, then the risk exposure should decrease by the corresponding amount. The property of positive homogeneity can be thought of as using leverage to acquire another (possibly more aggressive) position in the market. Such a position will increase the risk accordingly.

When the asset returns $\mathbf{X}_1, \mathbf{X}_2, \ldots$ are infinitely exchangeable and multivariate centered spherically symmetric, the analytical expression of (posterior) general risk measures is deduced from (2.9) and it is given by

$$\rho_{t-1}(\widehat{X}_{P,t}) = \rho_{t-1}\left(\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + \tau \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}\right) = -\mathbf{w}^\top \bar{\mathbf{x}}_{t-1} + \rho_{t-1}(\tau) \sqrt{r_{k,n}} \sqrt{\mathbf{w}^\top \mathbf{S}_{t-1} \mathbf{w}}, \tag{3.5}$$

where the second equality uses translation invariance and positive homogeneity. This coincides with (3.4) when we set $\rho_{t-1}(\tau) = q_\alpha$. We note that $\rho_{t-1}(\tau)$ does not depend on the portfolio weights $\mathbf{w}$. As a result, finding the optimal portfolio by optimizing (3.5) is the same problem as finding the optimal portfolio based on (3.4). Since the conditions of relevance, translation invariance and positive homogeneity are present in the definition of a coherent risk measure, any (posterior) coherent risk measure should satisfy (3.5) and, consequently, the results of the next section can be applied.

We now turn to the theory related to portfolio optimization and present the solutions of the quantile-based optimal portfolio choice problems from both the Bayesian and the frequentist points of view. First, we describe existing results related to the conventional portfolio optimization problems. This is followed by theory related to the Bayesian quantile-based approach which relies on some previous findings. Special focus is put on the global minimum VaR and global minimum CVaR portfolios. The section ends with a derivation of the efficient frontier in the mean-quantile space.
The mean-variance optimization problem of Markowitz (1952) and its solution provide a fundamental concept for practical asset allocation. It was originally formulated in terms of the population parameters of the asset return distribution: mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. As a result, the solution of Markowitz's optimization problem depends on these unknown quantities which have to be estimated before the implementation in practice.

The population expected portfolio return of the portfolio with weights $\mathbf{w}$ and its population variance are given by $R_P(\mathbf{w}) = \mathbf{w}^\top \boldsymbol{\mu}$ and $V_P(\mathbf{w}) = \mathbf{w}^\top \boldsymbol{\Sigma} \mathbf{w}$, respectively. Then Markowitz's optimization problem is given by

$$\min_{\mathbf{w}:\, R_P(\mathbf{w}) = R_0,\ \mathbf{w}^\top \mathbf{1} = 1} V_P(\mathbf{w}), \tag{4.1}$$

where $\mathbf{1}$ denotes the $k$-dimensional vector of ones. The solution of (4.1) is expressed as

$$\mathbf{w}_{MV} = \mathbf{w}_{GMV} + \frac{R_0 - R_{GMV}}{s} \mathbf{M} \boldsymbol{\mu} \quad \text{with} \quad \mathbf{M} = \boldsymbol{\Sigma}^{-1} - \frac{\boldsymbol{\Sigma}^{-1} \mathbf{1} \mathbf{1}^\top \boldsymbol{\Sigma}^{-1}}{\mathbf{1}^\top \boldsymbol{\Sigma}^{-1} \mathbf{1}}, \tag{4.2}$$

where

$$\mathbf{w}_{GMV} = \frac{\boldsymbol{\Sigma}^{-1} \mathbf{1}}{\mathbf{1}^\top \boldsymbol{\Sigma}^{-1} \mathbf{1}} \tag{4.3}$$

are the weights of the global minimum variance (GMV) portfolio whose population expected return and population variance are given by

$$R_{GMV} = \frac{\mathbf{1}^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}}{\mathbf{1}^\top \boldsymbol{\Sigma}^{-1} \mathbf{1}} \quad \text{and} \quad V_{GMV} = \frac{1}{\mathbf{1}^\top \boldsymbol{\Sigma}^{-1} \mathbf{1}}. \tag{4.4}$$

The quantity $s$ is the slope parameter of the mean-variance efficient frontier, the set of all optimal portfolios in the mean-variance space, and it is given by

$$s = \boldsymbol{\mu}^\top \mathbf{M} \boldsymbol{\mu}. \tag{4.5}$$

The efficient frontier itself is a parabola given by (see, e.g., Bodnar and Schmid 2009, Kan and Smith 2008, Merton 1972)

$$(R - R_{GMV})^2 = s (V - V_{GMV}). \tag{4.6}$$

Recently, Markowitz's optimization problem has been reformulated from the Bayesian perspective by Bauder et al. (2020b).
Using the portfolio expected return $R_{t-1}(\mathbf{w}) = \mathrm{E}(X_{P,t} \mid \mathbf{x}_{(t-1)})$ and the portfolio variance $V_{t-1}(\mathbf{w}) = \mathrm{Var}(X_{P,t} \mid \mathbf{x}_{(t-1)})$ computed from the posterior predictive distribution as in (2.10), Markowitz's optimization problem from the Bayesian point of view is given by

$$\min_{\mathbf{w}:\, R_{t-1}(\mathbf{w}) = R_0,\ \mathbf{w}^\top \mathbf{1} = 1} V_{t-1}(\mathbf{w}), \tag{4.7}$$

with the solution expressed as

$$\mathbf{w}_{MV,t-1} = \mathbf{w}_{GMV,t-1} + \frac{R_0 - R_{GMV,t-1}}{s_{t-1}} \mathbf{M}_{t-1} \bar{\mathbf{x}}_{t-1} \quad \text{with} \quad \mathbf{M}_{t-1} = \mathbf{S}_{t-1}^{-1} - \frac{\mathbf{S}_{t-1}^{-1} \mathbf{1} \mathbf{1}^\top \mathbf{S}_{t-1}^{-1}}{\mathbf{1}^\top \mathbf{S}_{t-1}^{-1} \mathbf{1}}, \tag{4.8}$$

where

$$\mathbf{w}_{GMV,t-1} = \frac{\mathbf{S}_{t-1}^{-1} \mathbf{1}}{\mathbf{1}^\top \mathbf{S}_{t-1}^{-1} \mathbf{1}}, \quad R_{GMV,t-1} = \frac{\mathbf{1}^\top \mathbf{S}_{t-1}^{-1} \bar{\mathbf{x}}_{t-1}}{\mathbf{1}^\top \mathbf{S}_{t-1}^{-1} \mathbf{1}}, \quad \text{and} \quad V_{GMV,t-1} = \frac{d_{k,n}\, r_{k,n}}{d_{k,n} - 2} \frac{1}{\mathbf{1}^\top \mathbf{S}_{t-1}^{-1} \mathbf{1}}. \tag{4.9}$$

The quantity $s_{t-1}$ is one of the factors determining the slope of the Bayesian efficient frontier in the mean-variance space and it is given by

$$s_{t-1} = \bar{\mathbf{x}}_{t-1}^\top \mathbf{M}_{t-1} \bar{\mathbf{x}}_{t-1}. \tag{4.10}$$

Also from the Bayesian perspective, the efficient frontier is a parabola expressed as (see, Bauder et al. 2020b)

$$(R - R_{GMV,t-1})^2 = \frac{d_{k,n} - 2}{d_{k,n}\, r_{k,n}}\, s_{t-1} (V - V_{GMV,t-1}). \tag{4.11}$$

In contrast to the population optimal portfolios and the efficient frontier, the Bayesian optimal portfolio and the Bayesian efficient frontier are presented in terms of the historical data that are observable up to time $t-1$, when the optimal portfolio for the next period is constructed. Moreover, the Bayesian portfolio allocation is based on the posterior predictive distribution and incorporates the parameter uncertainty in the decision process before the weights of optimal portfolios are computed.
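As a rough numerical check of (4.8)-(4.11), the Bayesian mean-variance weights can be computed directly from a return matrix. The sketch below assumes the Jeffreys prior; the function name, the synthetic data, and the target return `target` are hypothetical choices made for illustration only.

```python
import numpy as np

def bayes_mean_variance(xbar, S, d, r, target):
    """Weights (4.8) with the GMV quantities (4.9) and the slope factor (4.10)."""
    ones = np.ones(len(xbar))
    S_inv_ones = np.linalg.solve(S, ones)      # S^{-1} 1
    S_inv_xbar = np.linalg.solve(S, xbar)      # S^{-1} xbar
    denom = ones @ S_inv_ones                  # 1' S^{-1} 1
    w_gmv = S_inv_ones / denom
    R_gmv = (ones @ S_inv_xbar) / denom
    V_gmv = d * r / (d - 2) / denom            # requires d > 2
    # M xbar, with M = S^{-1} - S^{-1} 1 1' S^{-1} / (1' S^{-1} 1)
    M_xbar = S_inv_xbar - S_inv_ones * (ones @ S_inv_xbar) / denom
    s = xbar @ M_xbar
    w = w_gmv + (target - R_gmv) / s * M_xbar
    return w, w_gmv, R_gmv, V_gmv, s

# Synthetic returns (assumed): 120 periods, 4 assets with different means.
rng = np.random.default_rng(0)
x = rng.normal(loc=[0.001, 0.002, 0.003, 0.004], scale=0.02, size=(120, 4))
n, k = x.shape
d, r = n - k, (n + 1) / (n * (n - k))          # Jeffreys prior, see (2.4)
xbar = x.mean(axis=0)
S = (x - xbar).T @ (x - xbar)

target = 0.004                                  # hypothetical target return R_0
w, w_gmv, R_gmv, V_gmv, s = bayes_mean_variance(xbar, S, d, r, target)
V = d * r / (d - 2) * (w @ S @ w)               # posterior predictive variance (2.10)
```

The linear systems are solved with `np.linalg.solve` rather than by forming $\mathbf{S}_{t-1}^{-1}$ explicitly, which is usually the more stable choice when $k$ is large relative to $n$.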
Assuming asset returns to be normally distributed, Alexander and Baptista (2002, 2004) extended Markowitz's optimization problem by replacing the population portfolio variance in (4.1) with the population VaR and CVaR, respectively, given by

$$Q_P(w) = -w^\top \mu + q_{P;\alpha} \sqrt{w^\top \Sigma w}, \tag{4.12}$$

where $q_{P;\alpha} = z_\alpha$ for the VaR and $q_{P;\alpha} = \dfrac{\exp(-z_\alpha^2/2)}{(1-\alpha)\sqrt{2\pi}}$ for the CVaR, and where $z_\alpha$ denotes the $\alpha$-quantile of the standard normal distribution.

The quantile-based optimization problems of Alexander and Baptista (2002, 2004) are given by

$$\min_{w:\ R_P(w) = R_0,\ w^\top \mathbf{1} = 1} Q_P(w). \tag{4.13}$$

If the constraint on the expected return is omitted in (4.13), then the solutions of (4.13) are the weights of the population optimal portfolios with the smallest values of VaR (or CVaR) at confidence level $\alpha$, given by (see Bodnar et al. 2012)

$$w_{GMQ} = w_{GMV} + \frac{\sqrt{V_{GMV}}}{\sqrt{q_{P;\alpha}^2 - s}}\, M \mu. \tag{4.14}$$

Similarly to the mean-variance portfolio, the weights (4.14) of the population minimum VaR (or CVaR) portfolio cannot be computed directly. First, the unknown population parameters $\mu$ and $\Sigma$ have to be estimated by using historical data of asset returns and, then, the estimator of $w_{GMQ}$ is constructed as a proxy of the true portfolio weights. This two-step procedure of constructing an optimal portfolio usually leads to suboptimal solutions since the parameter uncertainty is ignored in its construction. In the next subsection, we deal with the problem from the viewpoint of Bayesian statistics, which allows us to incorporate the parameter uncertainty directly into the decision process before the optimization problem is solved.
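As a small illustration of (4.12), the multiplier $q_{P;\alpha}$ and the resulting risk $Q_P(w)$ can be computed with the Python standard library alone. This is a sketch under the normality assumption stated above; the function names `quantile_multiplier` and `population_quantile_risk` are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist
from typing import List


def quantile_multiplier(alpha: float, measure: str = "VaR") -> float:
    """Multiplier q_{P;alpha} in (4.12) for normally distributed returns."""
    z = NormalDist().inv_cdf(alpha)  # alpha-quantile of the standard normal
    if measure == "VaR":
        return z
    if measure == "CVaR":
        return math.exp(-z * z / 2.0) / ((1.0 - alpha) * math.sqrt(2.0 * math.pi))
    raise ValueError("measure must be 'VaR' or 'CVaR'")


def population_quantile_risk(
    w: List[float],
    mu: List[float],
    sigma: List[List[float]],
    alpha: float,
    measure: str = "VaR",
) -> float:
    """Q_P(w) = -w'mu + q_{P;alpha} * sqrt(w' Sigma w), see (4.12)."""
    k = len(w)
    mean = sum(w[i] * mu[i] for i in range(k))
    var = sum(w[i] * sigma[i][j] * w[j] for i in range(k) for j in range(k))
    return -mean + quantile_multiplier(alpha, measure) * math.sqrt(var)
```

For any confidence level the CVaR multiplier exceeds the VaR multiplier, so by (4.16) the population minimum CVaR portfolio exists whenever the minimum VaR portfolio does.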
For a general quantile-based risk measure, the extension of the Alexander and Baptista optimization problem (4.13) from the Bayesian perspective is given by

$$\min_{w:\ R_{t-1}(w) = R_0,\ w^\top \mathbf{1} = 1} Q_{t-1}(w), \tag{4.15}$$

where $R_{t-1}(w)$ and $Q_{t-1}(w)$ are computed by using the posterior predictive distribution as discussed in Section 2.1 and Section 3. For the special choices of the function $Q_{t-1}(w)$ discussed in Section 3.1, we get the optimization problems that minimize the predictive portfolio VaR and the predictive portfolio CVaR.

The solution of the optimization problem (4.15) can be presented in the following way:

$$\begin{aligned}
\operatorname*{argmin}_{w:\ R_{t-1}(w) = R_0,\ w^\top \mathbf{1} = 1} Q_{t-1}(w)
&= \operatorname*{argmin}_{w:\ R_{t-1}(w) = R_0,\ w^\top \mathbf{1} = 1} \left\{ -R_{t-1}(w) + q_\alpha \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}} \sqrt{V_{t-1}(w)} \right\} \\
&= \operatorname*{argmin}_{w:\ R_{t-1}(w) = R_0,\ w^\top \mathbf{1} = 1} V_{t-1}(w),
\end{aligned}$$

provided that $d_{k,n} > 2$. Hence, on the one hand, all solutions of the quantile-based optimization problem (4.15) are also solutions of the mean-variance optimization problem (4.7) and belong to the efficient frontier (4.11). On the other hand, all four optimization problems (4.1), (4.7), (4.13), and (4.15) possess a solution only if $R_0$ is properly chosen. For example, (4.1) and (4.7) have solutions if and only if $R_0 > R_{GMV}$ and $R_0 > R_{GMV,t-1}$, respectively, while for solving (4.13) one requires that

$$q_{P;\alpha}^2 - s > 0. \tag{4.16}$$

Below in Theorem 4.1, we formulate the conditions needed for the existence of the Bayesian optimal portfolio in the sense of minimizing $Q_{t-1}(w)$.

It has to be noted that the existence conditions formulated for the population optimization problems (4.1) and (4.13) depend on the unknown population parameters of the data-generating process and thus they cannot be validated in practice.
In contrast, the Bayesian formulation of the optimization problems makes it possible to specify the existence conditions in terms of the previously observed data $x_{(t-1)}$ and, thus, to check them before the optimization problem is solved. Finally, the conditions on the existence of the solutions in the quantile-based optimization problems (4.13) and (4.15) depend on the chosen confidence level $\alpha$, although the solutions themselves are independent of it.

Similarly to the mean-variance optimization problems, in order to determine under which conditions imposed on $R_0$ the solutions of the quantile-based optimization problem exist, one has to find the optimal portfolio with the smallest possible value of the objective function $Q_{t-1}(w)$, that is, when the constraint $R_{t-1}(w) = R_0$ is dropped from the optimization problem (4.15). The expected return of this portfolio provides the smallest possible value of $R_0$ for which the optimization problem (4.15) possesses a solution. We note that this is also the portfolio in which a completely risk-averse investor may be interested. The following theorem expresses the variance and return of such a portfolio.

Theorem 4.1.
Let $d_{k,n} > 2$. Then, under the conditions of Theorem 2.1, the global minimum quantile-based (GMQ) optimal portfolio exists if and only if

$$q_\alpha^2 > r_{k,n}^{-1} s_{t-1}, \tag{4.17}$$

where $s_{t-1}$ is defined in (4.10). Moreover, its posterior predictive expected return and variance are given by

$$R_{GMQ,t-1} = R_{GMV,t-1} + \frac{r_{k,n}^{-1} s_{t-1}}{\sqrt{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}} \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}} \sqrt{V_{GMV,t-1}}, \tag{4.18}$$

and

$$V_{GMQ,t-1} = \frac{q_\alpha^2}{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}\, V_{GMV,t-1}, \tag{4.19}$$

where $V_{GMV,t-1}$ and $R_{GMV,t-1}$ are given in (4.9).

The statement of Theorem 4.1 is proved in the appendix. Its results determine the lower bound for possible values of $R_0$ that can be used in the optimization problem (4.15). Since $R_{GMQ,t-1} > R_{GMV,t-1}$, we get that the set of optimal portfolios which solve (4.15) does not coincide with the set of the Bayesian mean-variance optimal portfolios which lie on the upper part of the efficient frontier given by the parabola (4.11) in the mean-variance space.

The findings of Theorem 4.1 lead to the expression of the smallest possible value of $Q_{t-1}(w)$ for the selected confidence level $\alpha$, expressed as

$$Q_{GMQ,t-1} = -R_{GMQ,t-1} + q_\alpha \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}} \sqrt{V_{GMQ,t-1}}. \tag{4.20}$$

Finally, the weights of the global minimum quantile-based portfolio are deduced from the findings of Theorem 4.1 and they are presented in Theorem 4.2.

Theorem 4.2.
Let $d_{k,n} > 2$ and let the inequality (4.17) hold. Then, under the conditions of Theorem 2.1, the weights of the global minimum quantile-based optimal portfolio are given by

$$w_{GMQ,t-1} = w_{GMV,t-1} + \frac{r_{k,n}^{-1} \sqrt{V_{GMV,t-1}}}{\sqrt{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}} \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}}\, M_{t-1} \bar{x}_{t-1}. \tag{4.21}$$

Earlier in this section we showed that the solutions of the quantile-based portfolio optimization problem (4.15) belong to the Bayesian efficient frontier (4.11) in the mean-variance space. We now characterise the location of the Bayesian quantile-based optimal portfolio in the mean-quantile (mean-Q) space. It has to be noted that the population mean-VaR efficient frontier was studied by Alexander and Baptista (2002) under the assumption that the asset returns are multivariate normally distributed. We extend these findings in Theorem 4.3, whose proof is given in the appendix.
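The results of Theorems 4.1 and 4.2 can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the constants $r_{k,n}$ and $d_{k,n}$ from Theorem 2.1 as well as $R_{GMV,t-1}$, $V_{GMV,t-1}$, $s_{t-1}$ and $M_{t-1}\bar{x}_{t-1}$ are taken as given inputs. The helper `frontier_q` evaluates $Q_{t-1}$ along the frontier (4.11) and can be used to check numerically that the GMQ portfolio attains the minimum.

```python
import math
from typing import List, Optional, Tuple


def gmq_portfolio(
    r_gmv: float, v_gmv: float, s: float, q_alpha: float, r_kn: float, d_kn: float
) -> Optional[Tuple[float, float, float]]:
    """Return (R_GMQ, V_GMQ, Q_GMQ) from (4.18)-(4.20), or None if (4.17) fails."""
    if d_kn <= 2:
        raise ValueError("d_kn must exceed 2")
    gap = q_alpha ** 2 - s / r_kn  # q_alpha^2 - r_kn^{-1} s
    if gap <= 0:
        return None  # existence condition (4.17) is violated
    scale = math.sqrt((d_kn - 2.0) / d_kn)
    r_gmq = r_gmv + (s / r_kn) / math.sqrt(gap) * scale * math.sqrt(v_gmv)  # (4.18)
    v_gmq = q_alpha ** 2 / gap * v_gmv                                      # (4.19)
    q_gmq = -r_gmq + q_alpha * scale * math.sqrt(v_gmq)                     # (4.20)
    return r_gmq, v_gmq, q_gmq


def gmq_weights(
    w_gmv: List[float], m_xbar: List[float], v_gmv: float,
    s: float, q_alpha: float, r_kn: float, d_kn: float,
) -> List[float]:
    """Weights (4.21); m_xbar is the vector M_{t-1} xbar_{t-1}."""
    gap = q_alpha ** 2 - s / r_kn
    if gap <= 0:
        raise ValueError("condition (4.17) is not satisfied")
    coef = math.sqrt(v_gmv) / (r_kn * math.sqrt(gap)) * math.sqrt((d_kn - 2.0) / d_kn)
    return [g + coef * m for g, m in zip(w_gmv, m_xbar)]


def frontier_q(
    r: float, r_gmv: float, v_gmv: float, s: float,
    q_alpha: float, r_kn: float, d_kn: float,
) -> float:
    """Q_{t-1} of the frontier portfolio with expected return r, using (4.11)."""
    v = v_gmv + d_kn * r_kn / ((d_kn - 2.0) * s) * (r - r_gmv) ** 2
    return -r + q_alpha * math.sqrt((d_kn - 2.0) / d_kn) * math.sqrt(v)
```

Returning `None` when (4.17) fails mirrors the fact that the existence condition depends only on observed quantities and can therefore be checked before any weights are computed.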
Theorem 4.3.
Let $d_{k,n} > 2$ and $s_{t-1} > 0$. Then, under the conditions of Theorem 2.1, the Bayesian efficient frontier in the mean-Q space is a hyperbola given by

$$Q = q_\alpha \sqrt{\frac{(R - R_{GMV,t-1})^2}{r_{k,n}^{-1} s_{t-1}} + \frac{d_{k,n} - 2}{d_{k,n}}\, V_{GMV,t-1}} - R. \tag{4.22}$$

Expressions for the mean-variance efficient frontier using the Bayesian setup were derived in Bauder et al. (2020b). It holds that the mean-variance efficient frontier (4.11) is a parabola in the mean-variance space and a hyperbola in the mean-standard deviation space for $s_{t-1} > 0$. These findings are in line with the results in Merton (1972), where the same conclusions were drawn for the population efficient frontier. In Theorem 4.3, we prove that the efficient frontier in the mean-Q space is also a hyperbola under the same condition $s_{t-1} > 0$. It is interesting to note that since $M_{t-1}$ is positive semi-definite with $M_{t-1} \mathbf{1} = \mathbf{0}$ by construction, it always holds that $s_{t-1} \ge 0$, with $s_{t-1} = 0$ only if the elements of the vector $\bar{x}_{t-1}$ are all equal. Another important observation is that both efficient frontiers (4.11) and (4.22) are determined by the same set of quantities $R_{GMV,t-1}$, $V_{GMV,t-1}$, and $s_{t-1}$, which are computed from the historical data of asset returns.

Remark. Using the proof of Theorem 4.3, we also obtain the analytical expression of the population efficient frontier in the mean-Q space, thus complementing the findings of Alexander and Baptista (2002), who present this frontier in their empirical study without deriving its closed-form expression. It holds that the population efficient frontier in the mean-Q space is a hyperbola expressed as

$$Q = q_{P;\alpha} \sqrt{\frac{(R - R_{GMV})^2}{s} + V_{GMV}} - R. \tag{4.23}$$

It is fully determined by the same set of constants $R_{GMV}$, $V_{GMV}$, and $s$ as the population efficient frontier (4.6), which is also a hyperbola in the mean-standard deviation space.

In the following section, we analyse how the Bayesian approaches compare to the conventional method via simulations.
We will do so by studying VaR prediction using the global minimum VaR (GMVaR) portfolio and by looking at how frequently the conditions (4.16) and (4.17) are satisfied. The comparison for other quantile-based risk measures can be done similarly, since the coherent risk measures have the same structure as given in (3.5). Utilizing (3.5), it is interesting to note that any coherent risk measure can be rewritten as VaR at confidence level $\beta = F_{d_{k,n}}(\rho_{t-1}(\tau))$ for the Bayesian approaches, where $F_{d_{k,n}}(\cdot)$ stands for the cumulative distribution function of the univariate $t$-distribution with $d_{k,n}$ degrees of freedom and $\tau$ is a $t$-distributed random variable with $d_{k,n}$ degrees of freedom. For the conventional method, one can use the same procedure, where the $t$-distribution is replaced by the standard normal distribution. Finally, we compare the different estimation methods by illustrating their corresponding efficient frontiers. When doing this comparison we also include the global minimum variance (GMV) portfolio in the analysis to see where it is located in the mean-VaR space.

Throughout the simulation study, the asset returns are generated from a multivariate normal distribution. This distribution satisfies the assumptions of infinite exchangeability and multivariate centered spherical symmetry when conditioning on the parameters (see, e.g., Proposition 4.6 in Bernardo and Smith 2009). In order not to restrict the analysis to certain parameters, the mean vector and covariance matrix are randomized in each new simulation iteration. We draw µ from the uniform distribution on [ − . , . ], i.e., µ_i ∼ U( − . , . ), and the covariance matrix is constructed by writing it as Σ = DRD, where R is a correlation matrix with (R)_ij = 0. if i ≠ j and D is a diagonal matrix with entries given by (D)_ii ∼ U(0. , . ). We also consider different sample sizes and portfolio sizes by using n ∈ {100, 200} and k = cn for c ∈ { . , . , . , . }. Moreover, we use α ∈ {0.95, 0.99} to study the impact of the confidence level in the GMVaR computations. For each parameter setup, we consider 10000 independent simulation runs when studying the performance and existence of the GMVaR portfolios. In each such simulation iteration, the out-of-sample performance of the portfolios is evaluated for one period ahead. We then aggregate the obtained results over all simulation runs. Only those results where the conventional and Bayesian GMVaR conditions (4.16) and (4.17) are satisfied simultaneously are considered.

Since the true parameters of the asset return distribution are known during simulation, it is possible to make comparisons with the population GMVaR portfolios as well as the population efficient frontier. The population GMVaR portfolios are constructed from the same mathematical formulas as when using the conventional method, but they are based on the true parameters. Hence the population portfolios can be used as benchmarks for the corresponding Bayesian and conventional portfolios, which are all based on parameter estimates. Similarly, the population efficient frontier can be used as a reference for the estimated efficient frontiers. Finally, the hyperparameters m and S of the conjugate prior are determined by using the empirical Bayesian approach (see, e.g., Bauder et al. 2020a) where we set d = r = n.

The GMVaR existence conditions (4.16) and (4.17) are not satisfied in several simulation runs when the portfolio dimension becomes large in comparison to the sample size. More precisely, in the cases {n = 100, k = 50, α = 0. }, {n = 100, k = 70, α = 0. }, {n = 200, k = 100, α = 0. }, {n = 200, k = 140, α = 0. }, {n = 100, k = 70, α = 0. } and {n = 200, k = 140, α = 0. }, the conventional condition (4.16) is not met for 247, 8432, 1486, 9995, 1027 and 4155 out of the 10000 simulation runs, respectively.
For the Jeffreys prior, the corresponding numbers are 0, 16, 0, 51, 0 and 0, and they are 4, 581, 4, 2694, 1 and 0 for the conjugate prior. For all other values of {n, k, α} the existence conditions are satisfied. They are always fulfilled for the population GMVaR portfolio. Based on these findings we conclude that both Bayesian GMVaR portfolios are more likely to exist than their conventional counterpart, with the Bayesian GMVaR portfolio under the Jeffreys prior demonstrating the lowest frequencies of non-existence in all of the considered cases. Our results also show that it is more likely that the GMVaR conditions are not satisfied when α is small or when c is large. Finally, we point out that when {n = 200, k = 140, α = 0. } the conventional GMVaR portfolio does not exist in 9995 out of 10000 simulation iterations. For this reason, the values of the performance measures for this configuration are not presented in Tables 1 and 2.

We use two measures to analyze the performance of the GMVaR portfolios. The first measure is the relative frequency of times the estimated VaR is exceeded, i.e.,

$$\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left\{ -X^{GMVaR}_{P,i} \ge \widehat{\mathrm{VaR}}_\alpha\big(X^{GMVaR}_{P,i}\big) \right\},$$

where $N$ is the number of simulations, $\mathbb{1}$ is the indicator function, $X^{GMVaR}_{P,i}$ is the actual return of the estimated GMVaR portfolio for simulation $i$ and $\widehat{\mathrm{VaR}}_\alpha(X^{GMVaR}_{P,i})$ is its predicted VaR. The latter two are calculated using equations (4.20) and (4.21) in the Bayesian cases and (4.12) and (4.14) in the conventional and population cases. By the definition of VaR, an exceedance rate close to 1 − α means a good prediction of the VaR.

In Table 1 we observe that the relative VaR exceedance frequencies of the population GMVaR portfolios are always close to the target level 1 − α. The only source of noise is the finite number of simulation runs. Such results do not hold for the three estimated GMVaR portfolios.
The relative exceedance frequencies are close to the target level when the portfolio dimension is small with respect to the sample size. For other values of k and n, the predicted VaRs underestimate the true values. These results are in line with recent findings in portfolio theory, i.e., that the sample optimal portfolios are overoptimistic and tend to underestimate the risk. We note that although the Bayesian approaches underestimate the risk, they still perform considerably better than the conventional approach. In particular, when k/n ≥ . the relative exceedance rate for the conventional approach is almost twice as large as the ones obtained for the Bayesian methods when α = 0. and it is almost three times larger for α = 0. .

Table 1: Relative VaR exceedance frequencies for the population GMVaR portfolio and its three estimates.
Columns: α, n, k (parameter setup); Jeffreys, Conjugate, Conventional, Population (GMVaR portfolio).

The second measure is the average absolute deviation of the predicted VaR from the VaR of the population GMVaR portfolio, i.e.,

$$\frac{1}{N} \sum_{i=1}^{N} \left| \widehat{\mathrm{VaR}}_\alpha\big(X^{GMVaR}_{P,i}\big) - \mathrm{VaR}_\alpha\big(X^{GMVaR}_{P,i}\big) \right|,$$

where $\mathrm{VaR}_\alpha(X^{GMVaR}_{P,i})$ is the VaR of the population GMVaR portfolio. Since the population GMVaR portfolio is based on the true parameter values, the VaR of the population GMVaR portfolio coincides with the true value of VaR. Hence, it can be used as a benchmark and the average absolute deviation should ideally be close to zero.

Table 2 shows the results of the GMVaR portfolio comparison using the average absolute deviation as a performance measure. As in Table 1, the methods perform similarly when the portfolio size is considerably smaller than the sample size. The Bayesian approach based on the Jeffreys prior shows the smallest deviations, although the different portfolios are very close in their performance. For larger portfolio sizes, the Bayesian methods are significantly better than the conventional approach. They yield smaller values of the performance criterion, and their standard deviations are also smaller than those obtained for the conventional procedure. The differences become very large in the extreme case when k/n = 0. .

Table 2: Average absolute deviation of the VaR of the estimated GMVaR portfolios from the VaR of the population GMVaR portfolio. Values inside the brackets represent the standard deviations.
Columns: α, n, k (parameter setup); Jeffreys, Conjugate, Conventional (GMVaR portfolio).

[...] c is large. This is indicated by VaR exceedance frequencies much higher than 1 − α and by large deviations from the population GMVaR. However, even if none of the methods performs very well for large-dimensional portfolios, this is the situation where we see the greatest benefit of using the Bayesian approaches. This point is further studied in the next section, where we investigate the influence of parameter uncertainty on the estimation of the whole mean-VaR efficient frontier.

5.3 Comparison of efficient frontiers

In order to get a better understanding of the impact of parameter uncertainty on the quantile-based portfolio selection, we use the theoretical findings of Section 4.3 and plot the population mean-VaR efficient frontier together with its three estimates in Figures 2 to 4. The estimates of the mean-VaR efficient frontier are computed for a single simulation run as described at the beginning of this section by using (4.22) and (4.23) for the Bayesian and conventional estimates, respectively. It should be noted that the figures present the most common results, which are also observed for other simulation runs. All of the figures also show where the portfolio which globally minimizes the variance is located in the mean-VaR space using each of the methods.

Figure 2: Population mean-VaR efficient frontier together with its three estimates for n = 100, α = 0. and c ∈ { . , . , . , . } (panels (a) to (d)). The locations of the GMV portfolios are marked by circles. Different scales are used on the axes for presentation purposes.

The mean-VaR efficient frontiers and the locations of the GMV portfolios are depicted in Figure 2 for α = 0. , n = 100, and c ∈ { . , . , . , . }. We observe that all methods overestimate the location of the true efficient frontier in the mean-VaR space.
Such behaviour is similar to that previously documented for the Markowitz efficient frontier in the mean-variance space by Broadie (1993), Siegel and Woodgate (2007), Bodnar and Bodnar (2010), and Bauder et al. (2019), among others. Namely, ignoring the parameter uncertainty leads to overoptimistic investment opportunities where the investors expect more return for the same level of risk than the population efficient frontier determines. The situation becomes even worse when the conventional mean-VaR frontier is constructed for c = 0. and especially for c = 0. . The conventional efficient frontier deviates drastically from the population efficient frontier. Among the two Bayesian efficient frontiers, the one based on the Jeffreys prior leads to the curves that are closest to the population frontier for all considered portfolio sizes. We also observe the positive effect of portfolio diversification in Figure 2. Increasing the portfolio dimension leads to a reduction of the VaR of the GMVaR portfolio. Also, we note the positive effect on the slope parameter of the efficient frontier, which becomes larger.

Figure 2 also illustrates that the portfolios that minimize the variance are not located on the mean-VaR efficient frontiers. This is an expected but important observation, which illustrates that an investor who is mean-variance efficient may not always be mean-VaR efficient.

Figure 3: Population mean-VaR efficient frontier together with its three estimates for n = 100, c = 0. and α ∈ {0.95, 0.99} (panels (a) and (b)). The locations of the GMV portfolios are marked by circles. Different scales are used on the axes for presentation purposes.

Figure 4: Population mean-VaR efficient frontier together with its three estimates for c = 0. , α = 0. and n ∈ {100, 200} (panels (a) n = 100 and (b) n = 200). The locations of the GMV portfolios are marked by circles. Different scales are used on the axes for presentation purposes.
Figures 3 and 4 demonstrate that the conclusions drawn from the results of Figure 2 are also valid for other values of α and n. In both figures the Bayesian approach with the Jeffreys prior provides the best fit of the population efficient frontier, followed by the Bayesian estimate based on the conjugate prior. Also, we observe that the increase of the portfolio dimension with the simultaneous increase of the sample size leads to the reduction of the VaR of the GMVaR portfolio and to the increase in the slope parameter of the efficient frontier. Moreover, the GMV portfolios are again shown not to be mean-VaR efficient.

We now continue the comparison between the Bayesian and conventional methodologies through an application to actual market data. As in the simulation study, we study the performance and existence of the GMVaR portfolio and investigate the behaviour of their efficient frontiers. Once again, we consider the cases n ∈ {100, 200}, k = cn for c ∈ { . , . , . , . } and α ∈ {0.95, 0.99}.

We use weekly returns on stocks included in the S&P 500 index for the period from the 1st of January, 2010 to the 28th of March, 2020. In order to circumvent the possible bias of selecting stocks which outperform or underperform the rest of the market, we consider all stocks included in the S&P 500 index by our end date that were already part of the index by our chosen start date. The lack of public information makes it difficult to know exactly when a certain stock was added to this index, but based on Wikipedia contributors (2020) we have chosen to consider 221 stocks that were present in the index before the 1st of January, 2010. A complete list of the stocks is provided in Table 4 in Appendix B.

In the empirical analysis, we randomly choose 500 portfolios of size k from the list of stocks for each possible value of {n, k, α}.
Once the stocks have been selected, they are kept for the whole time period, but the weights of the GMVaR portfolio are re-calculated each week and the performance is evaluated on a weekly basis and then averaged across all sampled portfolios. It should be noted that the performance results are only based on portfolios which satisfy the conventional and Bayesian GMVaR conditions (4.16) and (4.17), respectively, for all estimates of the GMVaR portfolio simultaneously. Finally, the hyperparameters when using the conjugate prior are specified as in the simulation study, i.e., by employing the empirical Bayesian approach and setting d = r = n.

Regarding the existence of the estimates of the GMVaR portfolio, it is more likely that the Bayesian GMVaR condition (4.17) is satisfied than the conventional condition (4.16). The Bayesian GMVaR portfolios exist all the time, whereas the conventional GMVaR portfolio does not always exist when {n = 100, k = 70, α = 0. }, {n = 200, k = 140, α = 0. } and {n = 100, k = 70, α = 0. }. For those values, the conventional GMVaR condition fails during the time period for 369, 5 and 1 portfolios, respectively, out of the 500 portfolios. Hence, we observe that the conditions are more likely to be satisfied when α is large or when c is small.

As in the simulation study, we consider the relative VaR exceedance frequency when evaluating the performance of each estimate of the GMVaR portfolio. This value should ideally be close to 1 − α. The results are summarized in Table 3.

Table 3: Relative VaR exceedance frequencies for the three estimates of the GMVaR portfolio.
Columns: α, n, k (parameter setup); Jeffreys, Conjugate, Conventional (GMVaR portfolio).

[...] 1 − α. Using the Jeffreys prior gives the best results in all situations that we consider. However, as in the simulation study, all of the methods underestimate the VaR since the exceedance frequency is always higher than 1 − α. This is especially pronounced when c is large, i.e., in the case of a large-dimensional portfolio. Even if none of the methods performs very well in such situations, the Bayesian approaches, especially the one based on the Jeffreys prior, provide a considerable improvement in comparison to the conventional method by reducing the relative exceedance frequency by 45% for α = 0. and by 55% for α = 0. when k/n = 0. . The application of the Bayesian approach based on the conjugate prior also results in much lower relative exceedance frequencies compared to the conventional method, although this Bayesian GMVaR portfolio always performs worse than the one based on the Jeffreys prior.

Similar results to those observed in Table 3 are also present in Figure 5, where we plot the estimated mean-VaR efficient frontier for the end date using α = 0. , n = 100, and c = k/n ∈ { . , . , . , . }. Both Bayesian efficient frontiers are always located under the conventional efficient frontier. While the three efficient frontiers almost coincide when c = 0. , the difference between the conventional and Bayesian approaches becomes pronounced when c becomes larger, particularly when c = 0. . Moreover, we again see that the GMV portfolios are not mean-VaR efficient. All of this is in line with the observations made for Figure 2 in the simulation study. Varying α and n using the empirical data also results in the same relationships between the efficient frontiers as shown in Figures 3 and 4 in the simulation study, indicating the considerable overoptimism present in the construction of the conventional efficient frontier.
Figure 5: Bayesian and conventional mean-VaR efficient frontiers based on empirical data for n = 100, α = 0. and c ∈ { . , . , . , . } (panels (a) to (d)). The locations of the GMV portfolios are marked by circles. Different scales are used on the axes for presentation purposes.

The traditional mean-variance analysis has been a paramount foundation for the extension to portfolio analysis based on one-sided risk measures, which are popular in financial mathematics. However, the conventional approaches usually ignore the parameter uncertainty in the construction of an optimal portfolio. It is common to define optimal portfolios by a two-step procedure where first an optimization problem is solved and then the optimal portfolios are estimated by replacing the unknown quantities in the solutions by the corresponding sample counterparts.

The Bayesian methodology poses a fundamental difference to the conventional approaches in its viewpoint on what we want to optimize: investors care about their future risk in taking a position, not the risk of having a certain position today. In light of data, today's outcome is already determined and usually not interesting. The Bayesian framework uses the posterior predictive distribution to cope with this. That is, the Bayesian methodology answers the problem in a straightforward manner, while the conventional method simply ignores it.

We contribute to the existing literature by formulating and solving quantile-based portfolio allocation problems from the perspective of Bayesian statistics. This approach is advantageous since it allows one to take the parameter uncertainty into account before the optimization problem is solved. The development of the general risk functionals from the Bayesian perspective appears to be a very promising subject of research with great potential for future development.
The risk functionals can be defined through all information available up to the point in time when a portfolio is constructed or a decision on the risk of the current position should be made. As a result, no unknown or unobservable quantities are present in their definitions. This is a very appealing property since it takes all uncertainties into account before the risk functional is determined.

In the frequentist setting, the general risk measure of an optimal portfolio choice problem will have a similar structure as under the Bayesian setup when asset returns follow an elliptically contoured distribution (see, e.g., Gupta et al. 2013, for a definition and properties thereof). However, both the portfolio expected return and the portfolio variance are determined by unknown parameters of the distribution of asset returns, which must be estimated in any practical application. This procedure would lead to an important task, namely to include the parameter uncertainty in the definition of the general risk measure. This challenging task has not been treated properly in the literature up to now when frequentist methods are employed, while the Bayesian approach provides an automatic solution.

Results of the simulation study and of the empirical application lead to the conclusion that the Bayesian approaches to portfolio construction provide a good alternative to the conventional procedures and that they are preferable in most of the considered cases. The Bayesian approaches outperform the conventional one in terms of providing a better VaR prediction. This holds uniformly, independently of the portfolio dimension, sample size, and the confidence level used in the computation of the VaR. Only when the portfolio dimension is small relative to the sample size does the conventional method perform similarly to the Bayesian approaches.
Such behaviour is expected since the priors used in the derivation of Bayesian inference can be interpreted as a regularisation, and it might not be necessary to employ it in such cases. Although using the Jeffreys prior gave the best results in our study, a more careful calibration of the hyperparameters of the conjugate prior could have made that one more beneficial. In practice, the hyperparameters would be specified using knowledge from experts within fundamental market analysis.

We also find that the conventional mean-VaR efficient frontier considerably overestimates the location of the true mean-VaR frontier. Although the Bayesian approaches reduce the underestimation of the VaR considerably and shrink the estimates of the efficient frontier, they still show significant overoptimism when the portfolio dimension is large in comparison to the sample size, i.e., when a large-dimensional optimal portfolio is constructed. Further research in this direction is needed, which might lead to interesting results completing the existing findings in the direction of large-dimensional portfolio construction (see, e.g., Fan et al. 2012, Hautsch et al. 2015, Bodnar et al. 2019, Cai et al. 2020).

References
Adcock, C. J. (2014). Mean–variance–skewness efficient surfaces, Stein's lemma and the multivariate extended skew-Student distribution. European Journal of Operational Research, 234(2):392–401.
Alexander, G. J. and Baptista, A. M. (2002). Economic implications of using a mean-VaR model for portfolio selection: A comparison with mean-variance analysis. Journal of Economic Dynamics and Control, 26(7):1159–1193.
Alexander, G. J. and Baptista, A. M. (2004). A comparison of VaR and CVaR constraints on portfolio selection with the mean-variance model. Management Science, 50(9):1261–1273.
Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3):203–228.
Avramov, D. and Zhou, G. (2010). Bayesian portfolio analysis. The Annual Review of Financial Economics, 2(1):25–47.
Babat, O., Vera, J. C., and Zuluaga, L. F. (2018). Computing near-optimal value-at-risk portfolios using integer programming techniques. European Journal of Operational Research, 266(1):304–315.
Barry, C. B. (1974). Portfolio analysis under uncertain means, variances, and covariances. Journal of Finance, 29:515–522.
Bauder, D., Bodnar, R., Bodnar, T., and Schmid, W. (2019). Bayesian estimation of the efficient frontier. Scandinavian Journal of Statistics, 46:802–830.
Bauder, D., Bodnar, T., Parolya, N., and Schmid, W. (2020a). Bayesian inference of the multi-period optimal portfolio for an exponential utility. Journal of Multivariate Analysis, 175:104544.
Bauder, D., Bodnar, T., Parolya, N., and Schmid, W. (2020b). Bayesian mean–variance analysis: optimal portfolio selection under parameter uncertainty. Quantitative Finance, to appear.
Baumol, W. J. (1963). An expected gain-confidence limit criterion for portfolio selection. Management Science, 10(1):174–182.
Bawa, V. S., Brown, S. J., and Klein, R. W. (1979). Estimation risk and optimal portfolio choice. North-Holland.
Bernardo, J. M. and Smith, A. F. (2009). Bayesian theory, volume 405. John Wiley & Sons.
Black, F. and Litterman, R. (1992). Global portfolio optimization. Financial Analysts Journal, 48:28–43.
Bodnar, O. and Bodnar, T. (2010). On the unbiased estimator of the efficient frontier. International Journal of Theoretical and Applied Finance, 13:1065–1073.
Bodnar, T., Dmytriv, S., Parolya, N., and Schmid, W. (2019). Tests for the weights of the global minimum variance portfolio in a high-dimensional setting. IEEE Transactions on Signal Processing, 67(17):4479–4493.
Bodnar, T., Mazur, S., and Okhrin, Y. (2017). Bayesian estimation of the global minimum variance portfolio. European Journal of Operational Research, 256:292–307.
Bodnar, T., Parolya, N., and Schmid, W. (2018). Estimation of the global minimum variance portfolio in high dimensions. European Journal of Operational Research, 266(1):371–390.
Bodnar, T. and Schmid, W. (2009). Econometrical analysis of the sample efficient frontier. The European Journal of Finance, 15(3):317–335.
Bodnar, T., Schmid, W., and Zabolotskyy, T. (2012). Minimum VaR and minimum CVaR optimal portfolios: estimators, confidence regions, and tests. Statistics & Risk Modeling with Applications in Finance and Insurance, 29(4):281–313.
Broadie, M. (1993). Computing efficient frontiers using estimated parameters. Annals of Operations Research, 45:21–58.
Bronshtein, I. N., Semendyayev, K. A., Musiol, G., and Mühlig, H. (2015). Handbook of Mathematics. Springer Science & Business Media.
Brown, S. J. (1976). Optimal portfolio choice under uncertainty: a Bayesian approach. PhD thesis, University of Chicago.
Brugière, P. (2020). Quantitative Portfolio Management. Springer.
Cai, T. T., Hu, J., Li, Y., and Zheng, X. (2020). High-dimensional minimum variance portfolio estimation based on high-frequency data. Journal of Econometrics, 214(2):482–494.
Chopra, V. K. and Ziemba, W. T. (1993). The effect of errors in means, variances, and covariances on optimal portfolio choice. Journal of Portfolio Management, 19(2):6–11.
Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance, 1:223–236.
Fan, J., Zhang, J., and Yu, K. (2012). Vast portfolio selection with gross-exposure constraints. Journal of the American Statistical Association, 107(498):592–606.
Frankfurter, G. M., Phillips, H. E., and Seagle, J. P. (1971). Portfolio selection: the effects of uncertain means, variances, and covariances. Journal of Financial and Quantitative Analysis, 6(5):1251–1262.
Frost, P. A. and Savarino, J. E. (1986). An empirical Bayes approach to efficient portfolio selection. Journal of Financial and Quantitative Analysis, 21(3):293–305.
Givens, G. H. and Hoeting, J. A. (2013). Computational statistics. John Wiley & Sons.
Gupta, A. K. and Nagar, D. K. (2000). Matrix variate distributions. Chapman and Hall/CRC, Boca Raton.
Gupta, A. K., Varga, T., and Bodnar, T. (2013). Elliptically contoured models in statistics and portfolio theory. Springer.
Hautsch, N., Kyj, L. M., and Malec, P. (2015). Do high-frequency data improve high-dimensional portfolio allocations? Journal of Applied Econometrics, 30:263–290.
Jorion, P. (1986). Bayes-Stein estimation for portfolio analysis. Journal of Financial and Quantitative Analysis, 21(3):279–292.
Jorion, P. (1997). Value at risk: the new benchmark for controlling market risk. Irwin Professional Pub.
Kan, R. and Smith, D. R. (2008). The distribution of the sample minimum-variance frontier. Management Science, 54(7):1364–1380.
Kingman, J. F. et al. (1978). Uses of exchangeability. The Annals of Probability, 6(2):183–197.
Klein, R. W. and Bawa, V. S. (1976). The effect of estimation risk on optimal portfolio choice. Journal of Financial Economics, 3(3):215–231.
Kolm, P. and Ritter, G. (2017). On the Bayesian interpretation of Black–Litterman. European Journal of Operational Research, 258(2):564–572.
Kotz, S. and Nadarajah, S. (2004). Multivariate t-distributions and their applications. Cambridge University Press.
Krüger, F., Lerch, S., Thorarinsdottir, T. L., and Gneiting, T. (2020). Predictive inference based on Markov chain Monte Carlo output. arXiv preprint arXiv:1608.06802.
Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1):77–91.
Meng, X. and Taylor, J. W. (2020). Estimating value-at-risk and expected shortfall using the intraday low and range data. European Journal of Operational Research, 280(1):191–202.
Merton, R. C. (1972). An analytic derivation of the efficient portfolio frontier. The Journal of Financial and Quantitative Analysis, 7(4):1851–1872.
Merton, R. C. (1980). On estimating the expected return on the market: An exploratory investigation. Journal of Financial Economics, 8(4):323–361.
Pritsker, M. (1997). Evaluating value at risk methodologies: accuracy versus computational time. Journal of Financial Services Research, 12(2-3):201–242.
Rachev, S. T., Hsu, J. S., Bagasheva, B. S., and Fabozzi, F. J. (2008). Bayesian methods in finance. John Wiley & Sons.
Rockafellar, R. T. and Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2:21–42.
Siegel, A. F. and Woodgate, A. (2007). Performance of portfolios optimized with estimation error. Management Science, 53:1005–1015.
Simaan, Y. (2014). The opportunity cost of mean–variance choice under estimation risk. European Journal of Operational Research, 234(2):382–391.
Staino, A. and Russo, E. (2020). Nested conditional value-at-risk portfolio selection: A model with temporal dependence driven by market-index volatility. European Journal of Operational Research, 280(2):741–753.
Stambaugh, R. F. (1997). Analyzing investments whose histories differ in length. Journal of Financial Economics, 45:285–331.
Tsay, R. S. (2010). Analysis of financial time series. John Wiley & Sons.
Tu, J. and Zhou, G. (2010). Incorporating economic objectives into Bayesian priors: Portfolio choice under parameter uncertainty. Journal of Financial and Quantitative Analysis, pages 959–986.
Wikipedia contributors (2020). List of S&P 500 companies – Wikipedia, the free encyclopedia. [Online; accessed 14-July-2020].
Winkler, R. L. and Barry, C. B. (1975). A Bayesian model for portfolio selection and revision. The Journal of Finance, 30(1):179–192.
Zellner, A. and Ando, T. (2010). A direct Monte Carlo approach for Bayesian analysis of the seemingly unrelated regression model. Journal of Econometrics, 159(1):33–45.

Acknowledgement

This research was partly supported by the Swedish Research Council (VR) via the project “Bayesian Analysis of Optimal Portfolios and Their Risk Measures”.
A Proofs of theoretical results
In order to get the stochastic representations presented below we also need the following result.
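The result in question (Lemma A.1, stated next) says that a particular combination of two independent t-variates is again t-distributed. As an informal sanity check, and not as part of the proof, the following Python sketch simulates $z = \tau_1/\sqrt{vd} + \sqrt{1+\tau_1^2/d}\,\tau_2/\sqrt{d+1}$ with $\tau_1 \sim t(d)$ and $\tau_2 \sim t(d+1)$, and compares its empirical quantiles with those of a $t(d)$ variate multiplied by $\sqrt{(v+1)/(vd)}$. The function names, the values of v and d, the sample size, and the seed are illustrative assumptions; only the Python standard library is used.

```python
import math
import random


def sample_t(df, rng):
    """Draw one Student-t variate with `df` degrees of freedom.

    Uses t = N / sqrt(chi2 / df), where chi2 ~ Gamma(shape=df/2, scale=2).
    """
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def sample_z(v, d, rng):
    """Draw z from the stochastic representation with independent tau1, tau2."""
    tau1 = sample_t(d, rng)
    tau2 = sample_t(d + 1, rng)
    return (tau1 / math.sqrt(v * d)
            + math.sqrt(1.0 + tau1 ** 2 / d) * tau2 / math.sqrt(d + 1))


def empirical_quantile(sorted_xs, p):
    """Order-statistic quantile of an already sorted sample."""
    idx = min(len(sorted_xs) - 1, int(p * len(sorted_xs)))
    return sorted_xs[idx]


def compare_quantiles(v=30, d=10, n_draws=200_000, seed=12345,
                      probs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Return {p: (quantile of z, quantile of scaled t(d))} from simulation."""
    rng = random.Random(seed)
    scale = math.sqrt((v + 1) / (v * d))  # scale parameter claimed by the lemma
    z = sorted(sample_z(v, d, rng) for _ in range(n_draws))
    ref = sorted(scale * sample_t(d, rng) for _ in range(n_draws))
    return {p: (empirical_quantile(z, p), empirical_quantile(ref, p))
            for p in probs}


if __name__ == "__main__":
    for p, (q_z, q_ref) in compare_quantiles().items():
        print(f"p={p:.2f}  z: {q_z:+.4f}  scaled t(d): {q_ref:+.4f}")
```

If the lemma holds, the two columns should agree up to Monte Carlo noise. Matching quantiles of two simulated samples is only supporting evidence; the exact statement follows from the density argument in the proof.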
Lemma A.1.
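Let a random variable $z$ possess the stochastic representation
\[
z \stackrel{d}{=} \frac{\tau_1}{\sqrt{vd}} + \sqrt{1 + \frac{\tau_1^2}{d}}\;\frac{\tau_2}{\sqrt{d+1}}, \tag{A.1}
\]
where $d > 0$, and $\tau_1$ and $\tau_2$ are independent with $\tau_1 \sim t(d)$ and $\tau_2 \sim t(d+1)$. Then $z$ follows a $t$-distribution with $d$ degrees of freedom, location parameter $0$, and scale parameter $\sqrt{(v+1)/(vd)}$.

Proof of Lemma A.1.
Since $\tau_1$ and $\tau_2$ are independent with $\tau_2 \sim t(d+1)$, the conditional distribution of $z$ given $\tau_1$ is a $t$-distribution with $d+1$ degrees of freedom, location parameter $\tau_1/\sqrt{vd}$ and scale parameter $g(\tau_1)/\sqrt{d+1}$ with $g(\tau_1) = \sqrt{1 + \tau_1^2/d}$. Thus the joint distribution of $z$ and $\tau_1$ is given by
\begin{align*}
f(z,\tau_1) &= f(z \mid \tau_1)\, f(\tau_1) \\
&= \frac{\Gamma\!\left(\frac{d+2}{2}\right)}{\Gamma\!\left(\frac{d+1}{2}\right)\sqrt{\pi}\, g(\tau_1)}
\left(1 + \left(\frac{z - \frac{\tau_1}{\sqrt{vd}}}{g(\tau_1)}\right)^{2}\right)^{-\frac{d+2}{2}}
\frac{\Gamma\!\left(\frac{d+1}{2}\right)}{\Gamma\!\left(\frac{d}{2}\right)\sqrt{\pi d}}
\left(1 + \frac{\tau_1^2}{d}\right)^{-\frac{d+1}{2}} \\
&\propto \frac{1}{g(\tau_1)}
\left(1 + \left(\frac{z - \frac{\tau_1}{\sqrt{vd}}}{g(\tau_1)}\right)^{2}\right)^{-\frac{d+2}{2}} g(\tau_1)^{-(d+1)} \\
&= \left(g(\tau_1)^2 + \left(z - \frac{\tau_1}{\sqrt{vd}}\right)^{2}\right)^{-\frac{d+2}{2}} \\
&= \left(1 + \frac{1}{d}\,[z,\ \tau_1]
\begin{pmatrix} d & -\frac{\sqrt{vd}}{v} \\ -\frac{\sqrt{vd}}{v} & \frac{v+1}{v} \end{pmatrix}
\begin{pmatrix} z \\ \tau_1 \end{pmatrix}\right)^{-\frac{d+2}{2}}.
\end{align*}
The last expression is the kernel of a multivariate $t$-distribution with $d$ degrees of freedom, location vector $\boldsymbol{\nu} = \mathbf{0}$ and dispersion matrix $\boldsymbol{\Omega}$ given by
\[
\boldsymbol{\Omega} = \begin{pmatrix} \frac{v+1}{vd} & \frac{\sqrt{vd}}{vd} \\ \frac{\sqrt{vd}}{vd} & 1 \end{pmatrix}.
\]
Hence, the marginal distribution of $z$ is also a $t$-distribution with $d$ degrees of freedom, location $0$ and scale $\sqrt{(v+1)/(vd)}$ (see, e.g., Kotz and Nadarajah 2004).

Proof of Theorem 2.1.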
Bauder et al. (2020b) characterized the posterior predictive distribution of the portfolio return by deriving the stochastic representation of $\widehat{X}_{P,t}$ given by
\[
\widehat{X}_{P,t} \stackrel{d}{=} m + \sqrt{s}\left(\frac{\tau_1}{\sqrt{vd}} + \sqrt{1 + \frac{\tau_1^2}{d}}\;\frac{\tau_2}{\sqrt{d+1}}\right)
\]
with $m = \mathbf{w}^\top \bar{\mathbf{x}}_{t-1,J}$, $s = \mathbf{w}^\top \mathbf{S}_{t-1,J}\mathbf{w}$, $v = n$, and $d = n - k$ under the Jeffreys prior and with $m = \mathbf{w}^\top \bar{\mathbf{x}}_{t-1,I}$, $s = \mathbf{w}^\top \mathbf{S}_{t-1,I}\mathbf{w}$, $v = n + r_0$ and $d = n + d_0 - 2k$ under the conjugate prior. The application of Lemma A.1 leads to the statement of the theorem.

Proof of Theorem 3.1.
The statement of the theorem follows from the fact that $\mathbf{w}^\top \bar{\mathbf{x}}_{t-1}$ is linear in $\mathbf{w}$ and that, since $\mathbf{S}_{t-1}$ is positive definite, $\sqrt{\mathbf{w}^\top \mathbf{S}_{t-1}\mathbf{w}}$ can be regarded as the Euclidean norm of $\mathbf{S}_{t-1}^{1/2}\mathbf{w}$, where $\mathbf{S}_{t-1}^{1/2}$ is the symmetric square root of $\mathbf{S}_{t-1}$. Since $q_\alpha > 0$ and the Euclidean norm is convex, the result follows.

Proof of Theorem 4.1.
Let $c_{k,n} = \dfrac{d_{k,n}\, r_{k,n}}{d_{k,n} - 2}$. Since the solution of
\[
\min_{\mathbf{w}:\ \mathbf{w}^\top \mathbf{1} = 1} Q_{t-1}(\mathbf{w})
\]
belongs to the Bayesian efficient frontier (4.11) in the mean-variance space, it can be found by solving the univariate optimization problem given by
\[
\min_{V:\ V \ge V_{GMV,t-1}} \; -R_{GMV,t-1} - (c_{k,n})^{-1/2}\sqrt{s_{t-1}}\sqrt{V - V_{GMV,t-1}} + q_\alpha (c_{k,n})^{-1/2}\sqrt{r_{k,n}}\sqrt{V}, \tag{A.2}
\]
where $R_{GMV,t-1}$ and $V_{GMV,t-1}$ are given in (4.9) and $s_{t-1}$ is defined in (4.11). The solution of (A.2) satisfies the first-order condition
\[
\frac{q_\alpha \sqrt{r_{k,n}}}{\sqrt{V}} = \frac{\sqrt{s_{t-1}}}{\sqrt{V - V_{GMV,t-1}}} \tag{A.3}
\]
and it is given by
\[
V_{GMQ,t-1} = \frac{q_\alpha^2}{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}\, V_{GMV,t-1}, \tag{A.4}
\]
where $V_{GMQ,t-1} > V_{GMV,t-1}$ holds as soon as $q_\alpha^2 - r_{k,n}^{-1} s_{t-1} > 0$, which coincides with the second-order condition needed to ensure that $V_{GMQ,t-1}$ is the solution of (A.2).

Finally, $R_{GMQ,t-1}$ is obtained from (4.11) and it is given by
\begin{align*}
R_{GMQ,t-1} &= R_{GMV,t-1} + \sqrt{c_{k,n}^{-1} s_{t-1}}\, \sqrt{\frac{q_\alpha^2}{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}\, V_{GMV,t-1} - V_{GMV,t-1}} \\
&= R_{GMV,t-1} + \frac{r_{k,n}^{-1} s_{t-1}}{\sqrt{q_\alpha^2 - r_{k,n}^{-1} s_{t-1}}}\, \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}}\, \sqrt{V_{GMV,t-1}}.
\end{align*}

Proof of Theorem 4.3.
From (4.11) and (3.4), we get
\[
\frac{(R - R_{GMV,t-1})^2}{a_{t-1}} + V_{GMV,t-1} = V \quad \text{and} \quad V = \left(\frac{R + Q}{b}\right)^{2}, \tag{A.5}
\]
where
\[
a_{t-1} = \frac{d_{k,n} - 2}{d_{k,n}\, r_{k,n}}\, s_{t-1} \quad \text{and} \quad b = q_\alpha \sqrt{\frac{d_{k,n} - 2}{d_{k,n}}}.
\]
From (A.5), we get
\begin{align*}
Q &= b\,\sqrt{\frac{(R - R_{GMV,t-1})^2}{a_{t-1}} + V_{GMV,t-1}} - R \\
&= q_\alpha \sqrt{\frac{(R - R_{GMV,t-1})^2}{r_{k,n}^{-1} s_{t-1}} + \frac{d_{k,n} - 2}{d_{k,n}}\, V_{GMV,t-1}} - R.
\end{align*}
Finally, we note that the Bayesian efficient frontier in the mean-Q space can be rewritten as
\[
R^2 - 2 R R_{GMV,t-1} + R_{GMV,t-1}^2 - \frac{a_{t-1}}{b^2} R^2 - 2\frac{a_{t-1}}{b^2} R Q - \frac{a_{t-1}}{b^2} Q^2 + a_{t-1} V_{GMV,t-1} = 0,
\]
which is a hyperbola in the mean-Q space for $s_{t-1} > 0$ (see, e.g., Section 3.5.2.11 in Bronshtein et al. 2015) since
\[
-\frac{a_{t-1}}{b^2}\left(1 - \frac{a_{t-1}}{b^2}\right) - \frac{a_{t-1}^2}{b^4} = -\frac{a_{t-1}}{b^2} = -\frac{s_{t-1}}{r_{k,n}\, q_\alpha^2} < 0.
\]

B List of stocks
Table 4 presents the list of stocks considered in Section 6.
Table 4: Stocks considered in the empirical illustration.