Asset Allocation via Machine Learning and Applications to Equity Portfolio Management
Qing Yang, Zhenning Hong, Ruyan Tian, Tingting Ye, Liangliang Zhang
November 24, 2020

* School of Economics, Fudan University, [email protected]. Qing Yang is a Professor of Finance at the School of Economics of Fudan University, Shanghai, China.
† Dongxing Securities, Co., Ltd., Asset Management Division, [email protected]. Zhenning Hong is a Director and the Head of Quantitative Investment at the Asset Management Division of Dongxing Securities, Co., Ltd., Shanghai, China.
‡ School of Economics, Fudan University, [email protected]. Ruyan Tian is a first-year Ph.D. student at the School of Economics of Fudan University.
§ Maine Business School, University of Maine, [email protected]. Tingting Ye is an Assistant Professor of Accounting at the Maine Business School, University of Maine.
¶ Independent, [email protected]. Liangliang Zhang will soon join the quantitative investment team at the asset management division of Dongxing Securities. Liangliang Zhang is the corresponding author of this article.
Abstract
In this paper, we document a novel machine learning based bottom-up approach for static and dynamic portfolio optimization on, potentially, a large number of assets. The methodology applies to general constrained optimization problems and overcomes many major difficulties arising in current optimization schemes. Taking mean-variance optimization as an example, we no longer need to compute the covariance matrix and its inverse; the method is therefore immune to estimation error in this quantity. Moreover, no explicit calls to optimization routines are needed. Applications to equity portfolio management in the U.S. and China equity markets are studied, and we document significant excess returns over the selected benchmarks.

• This paper proposes a fast and convergent numerical framework, which is universal and applies to arbitrary constrained optimization problems with unique solutions, without explicitly calling any optimization routines, unlike the current problem-specific deep learning-based methods in the literature. The method enjoys global convergence and will not be trapped in local optima;
• Our methodology involves, by construction, no estimation of cross higher-order moments of the asset span. This is crucial for overcoming the curse of dimensionality when higher-order moments are involved and the number of assets is very large;
• We provide empirical studies of portfolio optimization on hundreds and thousands of stocks in the U.S. and China equity markets with exotic objective functions and portfolio constraints, and document the performance results.
Keywords: Portfolio Optimization, Machine Learning, Hierarchical Clustering, K-Means Clustering, Deep Learning Regression, Mean-Variance-Skewness-Kurtosis, Reinforcement Learning, Monte-Carlo Simulation, Top-Down and Bottom-Up Approaches.
JEL Codes: C61, C63.
1 Introduction
In this paper, we propose a novel Monte-Carlo simulation and machine learning-based (unsupervised and supervised learning) static and dynamic portfolio optimization framework. This framework supports arbitrary objective functions and constraints, which can be either linear or nonlinear. Moreover, the number of assets being considered can be very large. Our methodology is fast, accurate and convergent to the global optimum under minimum assumptions, extending the original method introduced in [Zhang, 2020b].

The framework consists of an input preparation module, a hierarchical clustering-based asset space decomposition module, a simulation and portfolio weights selection module, and a profit and loss evaluation module. The input preparation module applies a supervised-learning approach to estimate the inputs that feed into the portfolio optimization module. The hierarchical or regular clustering-based asset space decomposition module partitions the entire asset universe, which often includes a large number of individual assets, based on a predetermined set of risk factors; this results in similar factor values for the assets in each cluster. The third module, the simulation and portfolio weights selection module, generates uniformly distributed portfolio weights in the constraint region and selects the globally optimal one corresponding to a given objective utility function with nonlinear constraints. The method in this module is fast and convergent under minimal assumptions. Most importantly, it does not require the computation of joint higher-order moments of a random vector (e.g., the rate of returns), which is computationally expensive, especially in high dimensions. The last module evaluates the profit and loss of the selected portfolio and performs backtesting. We empirically test our methodology with numerical examples covering various objective functions and constraints in the U.S. and China stock markets.
(For example, the covariance matrix or the tensor of third-order moments.)

1.2 Literature Review

Portfolio optimization is an important topic in financial economics that attracts both academic researchers and financial practitioners. Starting from the seminal work of modern portfolio theory in [Markowitz, 1952], which suggested that correlations between different assets should be included as inputs to portfolio optimization practice, in addition to volatility and expected returns, the theory of portfolio selection has become a rigorous science rather than an art. Modern portfolio theory later inspired the famous CAPM model, first proposed in [Treynor, 1962] and later extended in [Merton, 1973] to a dynamic setting. Over the years, we have witnessed an explosion in the number of results on portfolio optimization, driven by the advancement of theoretical and empirical research and the increase in computational power. Refinements of modern portfolio theory have been studied in [Brodie et al., 2008] in the context of sparse portfolios that stabilize the output of the mean-variance model. The carrier portfolio strategy, described in [Kusiak, 2013], is another approach that can deliver sparse and stable portfolios. It relies on a simple mathematical linear program that directly treats each return observation as an individual datum, with no assumptions on joint return distributions. Other examples, in a single-period setting, include the Black-Litterman approach introduced in [Black and Litterman, 1992], in which a Bayesian-type analysis is used to incorporate investors' views that correct the equilibrium CAPM expected asset returns. Because the first-order moments of asset returns are notoriously hard to estimate, simple return-agnostic strategies were created to account for this phenomenon. For example, the minimum variance portfolio and the equal-weight (1-over-N) portfolio are studied in [DeMiguel et al., 2013] and [Allen et al., 2012].
In addition to the first and second moments, there have been strategies based on higher-order moments, namely skewness and kurtosis, appearing in the work of [Harvey et al., 2010] and many others. The computation of higher-order co-moments may rely on factor models, such as the single-factor method considered in [Martellini and Ziemann, 2010], or on forward-looking information obtained from option prices, as documented in [Christoffersen et al., 2012]. But whichever approach we choose, the methods suffer greatly from the curse of dimensionality as the dimension of the asset span increases. Since the proposal of [Ang, 2013] on factor investing, factor-based strategies have started to emerge; references can be found in [Koedijk et al., 2014] and [Roncalli and Weisang, 2012], among others. Additional return-agnostic, or risk-based, strategies, such as the maximum diversification strategy, risk-parity strategies and their variations, have also been proposed; research results can be found in [Anderson et al., 2014] and references therein. Another strand of literature focuses on dynamic portfolio allocation in a randomly varying market environment and considers multi-period optimization problems. The pioneering work is attributed to [Merton, 1971], with numerous later refinements; see [Cvitanic and Karatzas, 1992] and [Schroder and Skiadas, 1999], for example. The characteristic of this type of problem is that it often involves dynamic programming, and in a continuous-time setting a PDE system often needs to be solved.

Although theoretically sound, Markowitz's mean-variance optimization method, still popular in the financial industry, has many obvious drawbacks, which are well documented in the literature, e.g., [Homescu, 2014] and [Perrin and Roncalli, 2019]. There are three major difficulties with respect to the mean-variance approach. The first is that the method's output is extremely sensitive to the model input, which is often difficult to estimate accurately.
The second is that the analytical solution of the quadratic programming problem involved requires the computation of a large covariance matrix and its inverse when the number of assets is large. The third is that the computational burden increases sharply if multiple linear or nonlinear constraints are added. To address the input estimation problem, many deep learning-based methods have been proposed recently, such as [Yang et al., 2018], [Gu et al., 2020], [Babiak and Barunik, 2020] and references therein. To address the second and third difficulties, [Perrin and Roncalli, 2019] reviewed machine learning optimization approaches to solve the constrained quadratic programming problem in high dimensions. In [Homescu, 2014], general formulations of static portfolio optimization are outlined, taking into consideration the reward, the risk and various constraints on optimal portfolios. Compared to the references in the literature, our method enjoys all of these advantages while being theoretically simple and easy to implement in practice.
The contributions of this paper are fivefold. First, it proposes a fast and convergent numerical framework, which is universal and applies to arbitrary constrained optimization problems with unique solutions, without explicitly calling any optimization routines, unlike the current problem-specific deep learning-based methods in the literature. Second, our methodology involves, by construction, no estimation of cross higher-order moments of the asset span. This is crucial for overcoming the curse of dimensionality when higher-order moments are involved and the number of assets is large. Third, the paper proposes to use the (hierarchical) clustering method to reduce the dimension of the optimization problem when there are many assets in the portfolio. Fourth, the paper advocates using deep learning techniques to estimate the model input. Fifth, we provide empirical studies on portfolio choice among a large number of stocks in both U.S. and China equity markets and provide performance analysis.
The paper is organized as follows. Section 2 describes the methodology and the optimization framework. Section 3 performs numerical experiments and Section 4 concludes.
In this section, we document and present the first three of the aforementioned modules. We first prepare the inputs for the optimization process, which are often the first and higher-order moments of asset return vectors. Second, we decompose the asset span, which usually consists of thousands of assets, with the help of a set of risk factors, into small subsets in a hierarchical manner, perform optimization at each sub-level and obtain the final optimal weight on every asset based on the intermediate weights on each cluster. Third, and most importantly, we perform portfolio optimization at each clustering level in the hierarchy based on Monte-Carlo simulation. The last step is to compute the results of the backtesting and present the transaction cost at each point in time. Theoretical convergence results are provided in this section, and we illustrate how to perform portfolio choice numerically for both static and dynamic problems.

To compute the $\alpha$-th order conditional moments $E_t\left[R_{t+h}^{\alpha}\right]$, where $R = (R^1, \cdots, R^n)$ denotes the vector of asset returns and $\alpha = (\alpha_1, \cdots, \alpha_n)$ represents the standard multi-index notation, i.e., $R_{t+h}^{\alpha} = \prod_{i=1}^{n} \left[R_{t+h}^i\right]^{\alpha_i}$ and $|\alpha| = \sum_{i=1}^{n} \alpha_i$, we first look at the general semi-martingale decomposition below, written in matrix and vector notation:

$$R_{t+h} = E_t[R_{t+h}] + (R_{t+h} - E_t[R_{t+h}]) \quad (1)$$
$$\phantom{R_{t+h}} = \mu_t + \sigma_t U_{t,t+h}. \quad (2)$$

Here the random source term $U_{t,t+h}$ satisfies $E_t[U_{t,t+h}] \equiv 0$ and has unit variance-covariance matrix. The higher-order moments of $U_{t,t+h}$ are functions of $(\mu, \sigma)$ if we assume that its distribution is elliptical. Alternatively, the conditional moments of asset returns can be assumed to be functions of some selected risk factors $f$. For example, $\mu$ and $\sigma$ can be computed via machine learning methods (see [Gu et al., 2020]) or the Monte-Carlo simulation and clustering-based method introduced in [Zhang, 2020a].
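For concreteness, a multi-index moment $E_t[R_{t+h}^\alpha]$ can be estimated by simple sample averaging of the product of powers of returns. The sketch below uses synthetic Gaussian returns; the means, covariance and sample size are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample of n = 3 asset returns, M = 100_000 draws (illustrative data).
M, n = 100_000, 3
R = rng.multivariate_normal([0.01, 0.02, 0.0], 0.0004 * np.eye(n), size=M)

def moment(R, alpha):
    """Sample estimate of E[R^alpha] = E[prod_i (R^i)^alpha_i] (multi-index alpha)."""
    return np.prod(R ** np.asarray(alpha), axis=1).mean()

mu_1 = moment(R, [1, 0, 0])    # first moment of asset 1
m_12 = moment(R, [1, 1, 0])    # cross moment E[R^1 R^2]
```

With independent assets, the estimated cross moment `m_12` is close to the product of the two means, which is how the Monte-Carlo estimate can be sanity-checked.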
More detailed analysis of the choice of factors and a review of popular regression methodologies can be found in [Gu et al., 2020].

In our empirical studies, we always have $|\alpha| = 1$, meaning that we only compute the first-order moments. The estimation of cross higher-order moments, for example, the second-order moments, also known as the covariance matrix, is not necessary. To understand this, consider a lead-lag panel regression of $\left(R_{t+1}^i\right)^2 = g(t, f_t^i) + \varepsilon_{t,t+1}^i$. The second-order moment of the weighted assets $w \cdot R$ can be represented by $E_t[(w \cdot R_{t+1})^2] = g(t, f_t^w)$, where $f_t^w$ is the associated factor value of the synthetic asset $w \cdot R$. This is an interpolation problem when $0 < w < 1$, on which machine learning methods work well. The same applies to higher-order cross moments.

In order to apply the bottom-up approach to construct the optimal portfolios, we first use a top-down (hierarchical) clustering method to decompose the asset universe into stratified sub-spaces. Starting from the sub-spaces of the lowest level, we obtain the optimal portfolio weights based on the parameter inputs from Section 2.1 and the methodology illustrated in Section 2.3. By working from the lowest to the highest level, we obtain the optimal portfolio weights corresponding to each of the sub-spaces, and therefore to each asset. To be specific, suppose that we have a $K$-vector of asset-specific factors $\{f^k\}_{k=1}^K$, and denote the realized values by $f_t^{k,j}$, where $j$ denotes the $j$-th asset and ranges in $[1, n]$; time $t$ ranges in $[1, T]$. Therefore, there are $n \times T$ observations of the $K$-dimensional factor. We use a hierarchical clustering approach on those observations and compute which cluster each related asset in the universe belongs to at each time $t$. A more straightforward way to create clusters is to consider the actual sector that each asset belongs to and categorize the assets by the related industries.
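The factor-based decomposition just described can be sketched as follows; the panel sizes, the Gaussian factor values and the choice of eight clusters are hypothetical, and scikit-learn's agglomerative (hierarchical) clustering stands in for whatever hierarchical scheme is used in practice:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)

# Synthetic factor panel: n assets observed on T dates, K asset-specific factors.
n, T, K = 200, 12, 3
F = rng.normal(size=(n * T, K))        # the n*T observations of the K-dim factor

# Top-down hierarchical decomposition of all observations into sub-spaces.
labels = AgglomerativeClustering(n_clusters=8).fit_predict(F)

# Cluster membership of asset j at time t (observations stored date-major here).
labels = labels.reshape(T, n)
cluster_of_asset_0_at_t0 = labels[0, 0]
```

Each row of `labels` then gives the cluster assignment of every asset at one date, which is the partition on which the per-cluster optimization is run.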
In addition to the clustering approach, we can create various criteria based on the factor values to partition the asset universe into small buckets, with the assets in each bucket presenting some similar behaviors. (For example, we can calculate scores based on a set of predetermined factors for each asset, then rank and divide the asset space by the scores.)

Assume that we are going to maximize an objective function $G(w, R)$, where $w$ denotes the portfolio weights and $R$ is the rate-of-return vector of a certain class of assets. (For example, the objective function can be a reward minus a coherent risk measure on $R$.) There are constraints on $w$, namely

$$F(w) \geq 0, \quad (3)$$
$$H(w) = 0, \quad (4)$$
$$w \in (a, b), \quad (5)$$

where $F$ and $H$ are nonlinear functions of $w$. A general formulation of the static portfolio optimization problem can be found in [Homescu, 2014]. In this section, we outline a simulation and machine learning-based approach to obtain the optimized weights $w$. Assume that Equation (4) can be rewritten as

$$w^n = h\left(w^1, w^2, \cdots, w^{n-1}\right). \quad (6)$$

The method works as follows:

1. Generate $M \times (n-1)$ uniform random numbers $\{w_m^j\}_{j=1,m=1}^{n-1,M}$ which satisfy $a^j < w_m^j < b^j$;
2. Compute $w_m^n = h\left(w_m^1, w_m^2, \cdots, w_m^{n-1}\right)$ for $m = 1, 2, \cdots, M$;
3. Find the subset of $\{w_m^j\}_{j=1,m=1}^{n-1,M}$ that satisfies Equation (3) and denote it by $S^F$;
4. Use a hierarchical or regular clustering method to decompose $S^F$ into $K$ disjoint clusters, denoted by $\{S_k^F\}_{k=1}^K$;
5. Denote the centers of $\{S_k^F\}_{k=1}^K$ by $\{\bar{w}^k\}_{k=1}^K$ and compute $k^* = \operatorname{argmax}_{1 \leq k \leq K} \left[G\left(\bar{w}^k, R\right)\right]$;
6. Use a hierarchical or regular clustering method to decompose $S_{k^*}^F$ into $K$ disjoint clusters, denoted by $\{S_k^{F,k^*}\}_{k=1}^K$;
7. Repeat Steps 5 and 6 until convergence.

It would be interesting to discuss the utility loss introduced by this hierarchical construction. At the last level of clustering, we perform optimization for each of the subsets, and the final global weights are proportional to the weights in each final cluster. Of course, this methodology is sub-optimal compared to global optimization on the whole asset universe. However, we gain a faster computational speed, lower resource requirements, less severe propagation of estimation errors, and the elimination of potential corner solutions or local optima. Moreover, the portfolio optimization is done in a bottom-up way while the clustering is top-down, so our methodology enjoys the benefits of both approaches. Finally, the reason to use the clustering approach in Step 4 above is to reduce the computational burden when $M$ is very large and the evaluations of the objective function are time consuming.
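The simulate-cluster-zoom procedure above can be sketched in a few lines. Everything in this sketch is an illustrative assumption rather than the paper's production setup: a quadratic mean-variance stand-in for $G(w, R)$, Dirichlet sampling to enforce the sum-to-one equality constraint of Equation (6), a hypothetical 40% single-name cap playing the role of $F(w) \geq 0$, and small toy parameter values:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def G(w, mu, cov, lam=2.0):
    """Illustrative mean-variance objective, a stand-in for a general G(w, R)."""
    return w @ mu - lam * w @ cov @ w

def simulate_select(mu, cov, M=20000, K=10, rounds=4):
    n = len(mu)
    # Steps 1-2: draw candidate weights; the equality constraint (sum w = 1)
    # is enforced here by Dirichlet sampling, uniform on the simplex.
    W = rng.dirichlet(np.ones(n), size=M)
    # Step 3: keep candidates satisfying the inequality constraint F(w) >= 0
    # (here, a hypothetical 40% single-name cap).
    W = W[W.max(axis=1) <= 0.4]
    best_w, best_val = None, -np.inf
    for _ in range(rounds):
        # Steps 4-5: cluster the candidates, evaluate G only at the K centers.
        km = KMeans(n_clusters=K, n_init=4, random_state=0).fit(W)
        scores = [G(c, mu, cov) for c in km.cluster_centers_]
        k_star = int(np.argmax(scores))
        if scores[k_star] > best_val:
            best_val, best_w = scores[k_star], km.cluster_centers_[k_star]
        # Step 6: zoom into the best cluster and repeat (Step 7).
        W = W[km.labels_ == k_star]
        if len(W) <= K:
            break
    # Final refinement: exhaustive evaluation on the last (small) subset.
    for w in W:
        if G(w, mu, cov) > best_val:
            best_val, best_w = G(w, mu, cov), w
    return best_w

mu = np.array([0.05, 0.07, 0.03, 0.06])
cov = 0.04 * np.eye(4)
w_star = simulate_select(mu, cov)
```

Note that cluster centers of feasible points remain feasible here because both constraints are convex, so evaluating $G$ only at centers never leaves the constraint region.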
Theoretical Convergence

In this section, we discuss the global convergence of the proposed approach based on the three critical assumptions below.
Assumption 1 (Completeness)
The random number generator $\Upsilon$ satisfies the following: suppose that the number of random draws generated is $N$, and that the random numbers generated by $\Upsilon$ form a set $\beta_N$; then $\cup_{N=1}^{\infty} \beta_N$ is always dense in the compact set $S^F$. In particular, $S^F$ is reachable with the samples generated.

Assumption 2 (Existence and Uniqueness)
The constrained optimization problem $O(w)$ introduced in Section 2.3.1 has a unique solution.

Assumption 3 (Continuity)
The optimization problem is continuous with respect to $\beta_N$. This means that

$$\lim_{K \to \infty} \operatorname{argsup}_{w \in \cup_{N=1}^{K} \beta_N} O(w) = \operatorname{argsup}_{w \in \cup_{N=1}^{\infty} \beta_N} O(w), \quad (7)$$

where $O(w)$ is the original optimization problem with constraints. Combining the above three assumptions, we have the theorem below as our main theoretical result.
Theorem 1 (Global Convergence)
Under Assumptions 1, 2 and 3, our algorithm output is convergent to the unique optimal solution.
In a dynamic portfolio optimization problem, we try to solve the Bellman equation

$$V^{\pi}(s) = R(s) + \gamma \times \left[\max_{\pi} \sum_{s' \in S} P_{s,\pi(s)}(s') V^{\pi}(s')\right], \quad (8)$$

where $R$ is the immediate reward function and $\pi : S \to A$ is a mapping from the state space $S$ to the action space $A$, called the policy function. $P_{s,\pi(s)}(s')$ denotes the probability transition matrix and $V^{\pi}(s)$ is the value function. Last, $\gamma \in (0, 1]$ is the discount factor process, which is often taken as a constant. The goal is to find an optimal policy function $\pi$ such that the value function is maximized. Of course, in general, $\pi$ is nonlinear in both time $t$ and state $s$. (We will often assume that the time variable $t$ is included in the state vector $s$.) However, it can be approximated locally, in an open and sufficiently small region, by its tangent space, which is represented by a linear equation. Further suppose that the state space $S$ and the action space $A$ are compact sub-spaces of Euclidean space, on which we can generate uniform random numbers. Decompose $U = S \cup A$ into small disjoint sub-spaces $\{U_k\}_{k=1}^K$, so that $\pi(s)|_{U_k} \cong \delta_0^k + \delta_1^k \cdot s + \varepsilon^k$, where $\varepsilon^k$ is the approximation error term. The functional form of $\pi$ is solely determined by $(\delta_0^k, \delta_1^k)$ for each $k$. For the transition probability matrix $P_{s,\pi(s)}(s')$, one way to represent it is to assume a parametric model $s_{t+1} = f(s^t, a^t, e_{t,t+1})$, where $s^t = (s_1, s_2, \cdots, s_t)$ and likewise for $a^t$. To solve the optimization problem in Equation (8), we generate $M$ independent copies of $(\delta_0^k, \delta_1^k)_{k=1}^K$, and therefore $M$ different functional forms of $\pi$; for each copy, we compute the value function via Monte-Carlo simulation based on the data generating process for $s_t$ and Equation (8), and use the method proposed in Section 2.3.1 to determine the best choice of $(\delta_0^k, \delta_1^k)_{k=1}^K$ among the $M$ independent samples.
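A minimal sketch of this random-policy-search idea is below, with a toy linear-quadratic state process standing in for a market DGP, a single region ($K = 1$) so the policy is globally linear, and all dynamics, rewards and parameter ranges being hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dynamics and reward (hypothetical stand-ins for the market DGP and R(s)):
# s_{t+1} = 0.9 s_t + a_t + noise, reward = -(s^2 + 0.1 a^2), gamma = 0.95.
def rollout(delta0, delta1, gamma=0.95, T=100, n_paths=200):
    """Monte-Carlo estimate of the value of the linear policy a = delta0 + delta1 * s."""
    s = rng.normal(0.0, 1.0, size=n_paths)
    total = np.zeros(n_paths)
    for t in range(T):
        a = delta0 + delta1 * s
        total += gamma**t * -(s**2 + 0.1 * a**2)
        s = 0.9 * s + a + rng.normal(0.0, 0.1, size=n_paths)
    return total.mean()

# Generate M random coefficient pairs and keep the best by simulated value.
M = 500
cands = rng.uniform(-1.0, 1.0, size=(M, 2))
values = [rollout(d0, d1) for d0, d1 in cands]
d0_star, d1_star = cands[int(np.argmax(values))]
```

On this toy problem the selected slope should be negative (a stabilizing feedback), which is the qualitative behavior one expects from the optimal policy.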
Last, the data generating process for $s_t$ can, alternatively, be replaced by a non-parametric inference directly using historical relationships.

Assume that there is an $m$-dimensional vector process $f_t$ whose data generating process (DGP) is

$$f_{t+h} = g(f_t, \vartheta_t) + e_{t,t+h}^f, \quad (9)$$

where $\vartheta$ is another stochastic process and $E_t\left[e_{t,t+h}^f\right] = 0$. (For example, the DGP can be an ARMA-GARCH process, and $\vartheta$ is then the stochastic variance.) The asset return vector is denoted by $r_t$, which is $n$-dimensional. We have approximately the following regression relationship:

$$r_t = h(f_t) + e_t^r. \quad (10)$$

Here $e_t^r$ is considered a small perturbation term, which might originate from missing factors or measurement errors. Further observe that

$$E_t[r_{t+h}] = h(t, h, f_t, \vartheta_t) + u_{t,t+h}^{r,f}, \quad (11)$$

where $h$ is potentially a nonlinear function of $(f, \vartheta)$ and $u_{t,t+h}^{r,f}$ is the pricing error term. The detailed configurations are described below. The DGP for the factor process of each stock is

$$f_t = \mu + \phi f_{t-h} + \sigma_t \epsilon_t, \quad (12)$$
$$\sigma_t = \alpha + \beta \sigma_{t-h} + \gamma f_{t-h}. \quad (13)$$

The factor $f$ is $n$-dimensional (the factor is asset specific, so we have only $m = 1$ factor for each stock), $\mu$ is $n$-dimensional, $\phi$ is $n \times n$, $\sigma$ is $n \times 1$, and the error term is $\epsilon_t = P \cdot u_t$, where the correlation generator $P$ is an $n \times n$ lower triangular matrix with the squared sum of each row equal to 1. $u_t$ is an $n$-dimensional independent Gaussian process with mean 0 and variance 1. The parameter set $(\mu, \phi, \alpha, \beta, \gamma, P)$ is generated randomly according to uniform distributions, and the values ensure that the ARMA-GARCH models are stationary. The $n$-dimensional return process satisfies $r_t = c \times \sin(f_t) + \epsilon_t$ for a small constant $c$, where $\epsilon_t$ is an $n$-dimensional uniformly distributed random vector on a small symmetric interval, serving as the perturbation term and accounting for missing factors or measurement errors. (This functional form ensures that the generated return series stay within a small range.) The number of factors per stock is 1, the number of assets is $n = 1000$ and the number of time periods is $T = 250$. The number of clusters is $\left[\sqrt{n}\right]$, i.e., the integer part of $\sqrt{n}$.
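A sketch of this simulation setup is below. The parameter ranges, the $|f|$ term in the volatility recursion (used in place of the exact specification above to keep $\sigma$ positive), the constant 0.1 in the return link and the ±0.01 perturbation bounds are all illustrative assumptions, and the cross-sectional correlation generator $P$ is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

n, T = 50, 250                      # assets and periods (smaller than n = 1000 above)

# Hypothetical parameter draws in the spirit of Equations (12)-(13);
# the ranges are illustrative and chosen to keep the processes stationary.
mu    = rng.uniform(0.0, 0.01, n)
phi   = rng.uniform(0.1, 0.8, n)    # diagonal AR coefficients, |phi| < 1
alpha = rng.uniform(0.001, 0.01, n)
beta  = rng.uniform(0.1, 0.5, n)
gamma = rng.uniform(0.01, 0.1, n)

f = np.zeros((T, n))
sigma = np.full(n, 0.1)
for t in range(1, T):
    # Volatility recursion; |f| replaces the raw f term to keep sigma positive.
    sigma = alpha + beta * sigma + gamma * np.abs(f[t - 1])
    f[t] = mu + phi * f[t - 1] + sigma * rng.standard_normal(n)

# Returns: nonlinear factor link plus a small uniform perturbation (Equation (10)).
r = 0.1 * np.sin(f) + rng.uniform(-0.01, 0.01, size=(T, n))
```

Because the return link is bounded by the sine function, the simulated returns stay inside a narrow band regardless of the factor path, mirroring the stationarity requirement on the factor DGP.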
The portfolio weights are constrained within $[0, 1]$ and sum up to 1. The objective function is the classical mean-variance quadratic one. The equity curve of the out-of-sample optimization results is displayed in Exhibit 1 below.

Exhibit 1: Equity Curve for Simulation Study.

According to the equity curve, the method performs well in an artificial simulation environment. The $x$-axis is the number of periods and the $y$-axis is the value of the equity curve. From the plot we can see that the equity curve grows steadily with very limited drawdowns. This is in line with our expectations: in a simulation environment, we know, and are able to recover exactly, the functional relationship between factors and expected future returns. Of course, small negative returns also appear along the equity curve. This is caused by the small perturbation term $e_t^r$ in Equation (10) and by the fact that the realization of future returns can deviate from their expected values, which is illustrated by the pricing error term $u_{t,t+h}^{r,f}$.

The daily OHLC, trading volume and shares outstanding data are downloaded from WRDS for stocks traded on AMEX, NASDAQ and NYSE. Time ranges from 20110103 to 20191231. The OHLC data are before dividends and stock splits; therefore, we use the raw OHLC multiplied by the shares outstanding data to account for stock splits. For simplicity, we ignore the dividend effect. Because the portfolio weights are restricted between 0 and 1, the actual performance of the methodology should be better than what is presented. To carry out the analysis, some details have to be determined.
The objective function is $f(w) = \frac{\mu(w) + s(w)}{\sigma(w) + k(w)}$, where $(\mu, \sigma, s, k)$ are the conditional expected return, empirical volatility, skewness and kurtosis of the portfolio $w$. $(\sigma, s, k)$ are empirical values computed for every simulated portfolio weight vector $w$ using the past months' asset return data. An alternative objective function is based on the CRRA (constant relative risk aversion) utility function on terminal wealth, $f(w) = E_t\left[\frac{(1 + R_{t+h}(w))^{1-\gamma}}{1-\gamma}\right]$, where $R_{t+h}(w)$ is the one-step-ahead portfolio return associated with weight vector $w$. Portfolio weights are constrained within $(0, 1)$ and sum up to 1. The conditional expected returns are estimated via a multi-factor lead-lag regression model implemented with the Python function XGBRegressor provided by the module xgboost. (The details of the factors are available upon request.) The regression is done in a rolling-window manner over a window of several weeks; the prediction is based on a multi-day time frame and the factor values are sampled every few business days. The clustering is done by scores computed via equal weights on the factor values. The cross section is either the largest 500 or the largest 1,600 companies on AMEX, NYSE and NASDAQ by market capitalization. The optimal portfolios are computed at the beginning of each period and are held until the end of the period.

The backtesting results are summarized in the equity curve plot in Exhibit 2 and the performance metrics in Exhibit 3. From Exhibit 2, we can see that CRRA1600 (Constant Relative Risk Aversion objective function optimization on the largest 1,600 stocks in U.S. equity markets by market capitalization) performs best, with the terminal net value more than doubled compared to the initial capital. The second in place is MVSK1600 (Mean-Variance-Skewness-Kurtosis objective function optimization on the largest 1,600 stocks in U.S. equity markets by market capitalization).
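The two candidate objectives can be evaluated per simulated weight vector as follows. The exact combination of the four moments in the ratio, the synthetic return data and $\gamma = 3$ are assumptions for illustration; the sample-moment formulas themselves are standard:

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.normal(0.0005, 0.01, size=(252, 5))   # synthetic daily returns, 5 assets

def _skew(p):
    z = p - p.mean()
    return (z**3).mean() / p.std()**3

def _excess_kurt(p):
    z = p - p.mean()
    return (z**4).mean() / p.std()**4 - 3.0

def mvsk_objective(w, R):
    """(mean + skewness) / (volatility + excess kurtosis) of the portfolio series."""
    p = R @ w
    return (p.mean() + _skew(p)) / (p.std() + _excess_kurt(p))

def crra_objective(w, R, gamma=3.0):
    """Empirical CRRA utility E[(1 + R_p)^(1 - gamma) / (1 - gamma)]."""
    p = R @ w
    return np.mean((1.0 + p) ** (1.0 - gamma) / (1.0 - gamma))

w = np.full(5, 0.2)                            # an equal-weight candidate
u = crra_objective(w, R)
m = mvsk_objective(w, R)
```

In the simulation-and-selection scheme, either function would simply replace `G` and be evaluated on every candidate (or cluster-center) weight vector.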
The performance of MVSK500 and CRRA500 is close to the S&P500, with negligible excess returns. It can be seen from the plot that there is a jump in equity curve value at 20171009 for CRRA1600 and MVSK1600, causing the excess returns. In general, it can be concluded from our experiments that, in U.S. markets, performing the selected naive single-period optimization schemes introduces little economic gain and excess return compared to the market index. Exhibit 3 contains performance metrics, where Return denotes annualized average arithmetic returns,
Vol denotes the annualized standard deviation of the return series, IR denotes the information ratio, SR represents the Sortino ratio, CR is the Calmar ratio and MDD is the abbreviation for maximum drawdown. It is surprising that none of the strategies' information ratios exceed that of the S&P500. However, the annualized returns of some of the curves beat the financial market index.
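Under standard definitions (assuming daily data and 252 periods per year; the information ratio here is computed against a zero benchmark, whereas the paper may use a market benchmark), these metrics can be computed as:

```python
import numpy as np

def performance_metrics(returns, periods=252):
    """Annualized return, volatility, information ratio (vs. a zero benchmark),
    Sortino ratio, maximum drawdown and Calmar ratio for simple period returns."""
    r = np.asarray(returns, dtype=float)
    ann_ret = r.mean() * periods
    ann_vol = r.std(ddof=1) * np.sqrt(periods)
    downside = r[r < 0].std(ddof=1) * np.sqrt(periods)   # downside deviation
    equity = np.cumprod(1.0 + r)                          # equity curve
    peak = np.maximum.accumulate(equity)
    mdd = ((peak - equity) / peak).max()                  # maximum drawdown
    return {"Return": ann_ret, "Vol": ann_vol, "IR": ann_ret / ann_vol,
            "SR": ann_ret / downside, "MDD": mdd, "CR": ann_ret / mdd}
```

For example, `performance_metrics([0.01, -0.005, 0.02, -0.01])` reports a maximum drawdown of exactly 1%, since the last return pulls the equity curve 1% below its running peak.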
The adjusted daily stock OHLC and trading volume data for the CSI300 and CSI800 indexes are downloaded from the Wind terminal. Time ranges from 20120206 to 20200928.

Exhibit 2: Equity Curves for U.S. Stock Market.

Exhibit 3: Performance Metrics in U.S. Stock Market.
Index      Return    Vol       IR      SR      MDD       CR
MVSK500    .20%      11.27%    1.17    1.58    15.17%    0.
MVSK1600   .66%      14.59%    1.21    1.98    16.51%    1.
CRRA500    .23%      11.28%    1.26    1.78    14.10%    1.
CRRA1600   .49%      21.49%    1.05    2.60    16.78%    1.
S&P500     .23%      9.30%     1.52    2.03    13.09%    1.
3.3.2 The Methodology

The objective function is $f(w) = \frac{\mu(w) + s(w)}{\sigma(w) + k(w)}$, where $(\mu, \sigma, s, k)$ are the conditional expected return, empirical volatility, skewness and kurtosis of the portfolio $w$. $(\sigma, s, k)$ are empirical values computed for every simulated portfolio weight vector $w$ using the past months' asset return data. An alternative objective function is based on the CRRA (constant relative risk aversion) utility function on terminal wealth, $f(w) = E_t\left[\frac{(1 + R_{t+h}(w))^{1-\gamma}}{1-\gamma}\right]$, where $R_{t+h}(w)$ is the one-step-ahead portfolio return associated with weight vector $w$. Portfolio weights are constrained within $(0, 1)$ and sum up to 1. The conditional expected returns are estimated via a multi-factor lead-lag regression model implemented with the Python function XGBRegressor provided by the module xgboost. The regression is done in a rolling-window manner over a window of several weeks; the prediction is based on a multi-day time frame and the factor values are sampled every few business days. The clustering is done by scores computed via equal weights on the factor values. The cross section is either the CSI300 or the CSI800 index stocks. The optimal portfolios are computed at the beginning of each period and are held until the end of the period.

The backtesting results are summarized in the equity curve plots in Exhibits 4 and 5 and the performance metrics in Exhibit 6. In the China A-shares market, the simple single-period optimization methods reveal significant excess returns, as can be observed in Exhibits 4 and 5. The MVSK300 scheme tops the horse race, followed by CRRA300. Although beaten by the aforementioned two schemes, the MVSK800 and CRRA800 curves are still above the CSI300 index curve. Exhibit 5 documents significant and stable excess returns, which are positive through time. Exhibit 6, again, documents the performance metrics.
Various risk and reward indexes indicate that MVSK300 is the best strategy among the four competing methods, while CSI300 bears the minimum IR, SR and CR.
Monte-Carlo simulation is used to construct the optimal portfolios. Therefore, a natural question to ask is whether the method is stable for different random numbers generated, and how
Exhibit 6: Performance Metrics in China A-Shares Market.

Index      Return    Vol       IR      SR      MDD       CR
MVSK300    .60%      21.52%    0.96    1.85    23.24%    0.
MVSK800    .41%      22.49%    0.64    1.30    24.62%    0.
CRRA300    .54%      21.26%    0.73    1.28    22.75%    0.
CRRA800    .05%      22.95%    0.51    1.15    27.40%    0.
CSI300     .51%      18.87%    0.56    0.97    27.53%    0.
many samples are considered enough? In this section, we try to answer the question empirically only, although a theoretical derivation of the error bounds for different sample sizes $M$ is possible. In the sequel, we compare the empirical results on MVSK300 in the China A-share market, with $M$ ranging over 10,000, 25,000 and 40,000. The results with $M = 100,000$ are taken as the benchmark values, and RMSEs (root-mean-squared errors), as well as RMSREs (root-mean-squared relative errors), are computed with respect to the benchmark equity curve. We provide both the comparisons between different equity curves graphically, in Exhibit 7, and the RMSEs/RMSREs numerically, in Exhibit 8. It can be observed that, with 40,000 simulated weights, the result is close enough to that of 100,000 samples. This can be considered a convergence test in the language of model validation.

Exhibit 7: Empirical Stability Plot.

Exhibit 8: Empirical Stability Analysis.

Index    10,000   25,000   40,000
RMSE     .08      0.04     0.
RMSRE    .73%     2.79%    1.
In this paper, inspired by the methodology introduced in [Zhang, 2020b], we document a novel four-step portfolio optimization framework and test it with simulated and real financial data in the China A-shares and U.S. equity markets. Our results reveal superior returns over the out-of-sample testing periods for both markets, which illustrates the usefulness of our methodology; it is not only a numerical framework, but also contributes to the literature on large-scale optimization. The empirical study of our proposed dynamic portfolio choice method via reinforcement learning is both interesting and important, yet postponed to future research. In addition, our methodology can be extended to fixed income and option portfolio selection, combining the work of [Zhang, 2020a] and [Zhang, 2020b], which we leave to the interested readers as exercises.

References

[Allen et al., 2012] Allen, D., Lizieri, C., and Satchell, S. (2012). 1/N versus mean-variance: What if we can forecast.
Working Paper .[Anderson et al., 2014] Anderson, R., Bianchi, S., and Goldberg, L. (2014). Will my risk paritystrategy outperform?
Financial Analyst Journal , pages 75–94.[Ang, 2013] Ang, A. (2013). Factor investing.
Working Paper .[Babiak and Barunik, 2020] Babiak, M. and Barunik, J. (2020). Deep learning, predictability, andoptimal portfolio returns.
Working Paper .[Black and Litterman, 1992] Black, F. and Litterman, R. (1992). Global portfolio optimization.
Financial Analyst Journal , 48(5):28–43.[Brodie et al., 2008] Brodie, J., Daubechies, I., De Mol, C., Giannone, D., and Loris, I. (2008).Sparse and stable markowitz portfolios.
Working Paper .[Christoffersen et al., 2012] Christoffersen, P., Jacobs, K., and Chang, B. (2012). Forecasting withoption-implied information.
Handbook of Economic Forecasting , 2.[Cvitanic and Karatzas, 1992] Cvitanic, J. and Karatzas, I. (1992). Convex duality in constrainedportfolio optimization.
Annals of Applied Probability , 4:767–818.[DeMiguel et al., 2013] DeMiguel, V., Plyakha, Y., Uppal, R., and Vilkov, G. (2013). Improvingportfolio selection using option-implied volatility and skewness.
JFQA , 48(6):1813–1845.[Gu et al., 2020] Gu, S., Kelly, B., and Xiu, D. (2020). Empirical asset pricing via machine learn-ing.
Review of Financial Studies , 33:2223–2273.[Harvey et al., 2010] Harvey, C., Liechty, J., Liechty, M., and Mueller, P. (2010). Portfolio selec-tion with higher moments.
Quantitative Finance , pages 469–485.[Homescu, 2014] Homescu, C. (2014). Many risks, one (optimal) portfolio.
Working Paper .21Koedijk et al., 2014] Koedijk, C., Slager, A., and Stork, P. (2014). Factor investing in practice: Atrustees’ guide to implementation.
Working Paper .[Kusiak, 2013] Kusiak, S. (2013). Carrier portfolios.
Journal of Mathematical Finance , 40(1):61–70.[Markowitz, 1952] Markowitz, H. (1952). Portfolio selection.
Journal of Finance , 7(1):77–91.[Martellini and Ziemann, 2010] Martellini, L. and Ziemann, V. (2010). Improved estimates ofhigher-order comoments and implications for portfolio selection.
Review of Financial Studies ,23:1467–1502.[Merton, 1971] Merton, R. (1971). Optimum consumption and portfolio rules in a continuous-time model.
Journal of Economic Theory , 3(4):373–413.[Merton, 1973] Merton, R. (1973). An intertemporal capital asset pricing model.
Econometrica ,41(5):867–887.[Perrin and Roncalli, 2019] Perrin, S. and Roncalli, T. (2019). Machine learning optimizationalgorithms & portfolio allocation.
SSRN .[Roncalli and Weisang, 2012] Roncalli, T. and Weisang, G. (2012). Risk parity portfolios withrisk factors.
SSRN .[Schroder and Skiadas, 1999] Schroder, M. and Skiadas, C. (1999). Optimal consumption andportfolio selection with stochastic differential utility.
Journal of Economic Theory , 89(1):68–126.[Treynor, 1962] Treynor, J. (1962). Toward a theory of market value of risky assets.
WorkingPaper .[Yang et al., 2018] Yang, H., Liu, X., and Wu, Q. (2018). A practical machine learning approachfor dynamic stock recommendation. .22Zhang, 2020a] Zhang, L. (2020a). A clustering method to solve backward stochastic differentialequations with jumps.
Journal of Mathematical Finance , 10(1):1–9.[Zhang, 2020b] Zhang, L. (2020b). A general framework of derivatives pricing.