Equity Factors: To Short Or Not To Short, That Is The Question
Florent Benaych-Georges, Jean-Philippe Bouchaud, Stefano Ciliberti
Capital Fund Management
December 23, 2020
Abstract
What is the best market-neutral implementation of classical Equity Factors? Should one use the specific predictability of the short leg to build a zero-beta Long-Short portfolio, in spite of the specific costs associated with shorting, or is it preferable to ban the shorts and hedge the long leg with, say, an index future? We revisit this question by focusing on the relative predictability of the two legs, the issue of diversification, and various sources of costs. Our conclusion is that, using the same Factors, a Long-Short implementation leads to better risk-adjusted returns than its Hedged Long-Only counterpart, at least when Assets Under Management are not too large.
Contents
3 When Equity Factors Drop Their Shorts
4 A Realistic Implementation Framework
5 Conclusions
A More On The Toy Model
B Brief Description Of The Equity Factors
Equity Factor investing has become increasingly popular over the past decade. Practitioners and academics realized in the 70's that the single-factor CAPM model [ ] has its own limitations and had to be generalized in order to account for more than one risk driver [ ]. Several questions remain open regarding the nature of these extra factors: are they pure risk premia [1, 9, 21, 22], genuine market anomalies [2, 5, 11, 19], or unavoidable consequences of institutionally constrained investors [ ]? How many such factors must be considered: 3, as in the original Fama-French model [ ], or several hundred, as advocated in the so-called factor-zoo literature [13, 14]?

There is also a wide variety of ways these factors can be converted into predictive signals and realistic portfolios. The primary criterion concerns the market exposure of the portfolio: are we looking for a market-neutral implementation of these factors, with the long and short legs of the portfolio offsetting its overall beta exposure, or are we concerned with so-called smart-beta strategies, where the portfolio has a positive beta exposure to the equity market, but is tilted in the direction of said factors?

Both approaches make sense, of course, and correspond to different asset management mandates and different investor profiles. Still, for any given portfolio we can always identify and isolate the portfolio's exposure to the equity index and analyse the performance of its market-neutral, active component. As an example, in the smart-beta style of implementation, this market-neutral component can be represented by a portfolio with long-only equity positions, hedged by a short equity index future. This is in fact a realistic, cost-aware set-up, which allows one to build a market-neutral factor strategy. An alternative is the classical Long-Short equity portfolio, with an explicit short position on some stocks. The question that we are going to address in this paper is whether either of these two implementations yields significantly better results for the active, market-neutral risk component of the portfolio, in particular when realistic transaction costs are accounted for. Or, stated differently: Are explicit short positions beneficial or detrimental to equity factor strategies?
Not surprisingly, we will see that the predictive power of the short signals plays a crucial role in determining which of the two implementations should be preferred. We will also show that a proper answer to this question depends quite heavily on some details related to the factor under scrutiny, on the portfolio construction algorithm, and on the very definition of the equity market. We will pay special attention to implementation costs, including both trading costs (impact) and financing costs, the latter substantially affecting the profitability of the short leg.

This problem has been addressed in the literature many times in the past. Some authors focus on the difficulties related to actually taking short positions [ ], while others find that under some well-defined market conditions the short leg of equity strategies is more profitable [ ]. There seems to be a general agreement that Long-Short strategies are more profitable than constrained long-only ones [ ], despite some claims that the extra benefit is marginal and that it would disappear when costs are properly accounted for [ ].

More recently, the authors of [ ] have posted a study which suggests that an optimal implementation of market-neutral equity factors should not contain explicit short positions at all. Not only do long signals seem to be of better quality than short signals, but long positions also provide a more diversified exposure to different factors than the shorts.

The contribution of the present study to the debate consists in proposing a realistic, cost-aware framework for comparing a Long-Short equity portfolio to a beta-Hedged Long-Only one. In our view, there are often two missing ingredients in previous research that should be carefully taken into account.
One is risk management and portfolio construction, that is, a proper control of the portfolio exposure to the desired descriptor under risk-related constraints, which could be constant volatility and/or maximum leverage for instance. This step is particularly crucial when considering long-only portfolios, as we will show in Section 4. The other key point is costs. Allowing for short positions implies leverage and borrowing costs that we will consider explicitly when building the portfolio.

The outline of the paper is as follows. In Section 2 we introduce a simple toy model which allows one to rationalize when the use of shorts can bring value to a portfolio. When confronted with real predictability data, our criterion suggests that shorts are indeed accretive. In Section 3 we revisit the recent paper of the Robeco group [ ]. While we reproduce the results of that study, we find that their conclusion that one should "drop the shorts" has to be tempered. As we show in Section 4, a cost-aware portfolio construction using the full range of factor predictors leads to a clear over-performance of a Long-Short implementation compared to a beta-hedged long-only portfolio.

Consider an investment universe with a factor $F$, a market $M$ and two assets: Asset 1, positively exposed to $F$, and Asset 2, negatively exposed to $F$ (both being positively exposed to the market). The question is whether, to bet on $F$ without being exposed to the market, we should rather bet on Asset 1 and hedge with a negative market exposure (hedged long-only) or construct a market-neutral portfolio with a bet on Asset 1 and a bet against Asset 2 (long-short).

We naturally model the respective returns $R_1$ and $R_2$ of Assets 1 and 2 as follows:

$$R_1 = \beta_1 M + \alpha_1 F + \varepsilon_1 \quad \text{with } \alpha_1, \beta_1 > 0, \tag{1}$$

$$R_2 = \beta_2 M - \alpha_2 F + \varepsilon_2 \quad \text{with } \alpha_2, \beta_2 > 0, \tag{2}$$

where $\varepsilon_1$ and $\varepsilon_2$ are idiosyncratic residuals. Up to a rescaling of both assets, one can suppose without loss of generality that $\beta_1 = \beta_2 = 1$. Then, up to a further rescaling of $F$, one can set $\alpha_1 = 1$, so that the above equations become

$$R_1 = M + F + \varepsilon_1, \tag{3}$$

$$R_2 = M - \alpha F + \varepsilon_2, \tag{4}$$

with $\alpha > 0$ denoting the rescaled exposure of Asset 2 to $F$. We shall also assume that:
• $\mathbb{E}(F) > 0$ ($F$ has positive performance on average),
• $F$, $\varepsilon_1$, $\varepsilon_2$ are independent and $\mathbb{E}(\varepsilon_i) = 0$ for $i = 1, 2$ (residual returns have no alpha).

We want to build a portfolio betting on $F$ (hence on Asset 1) in a market-neutral way, i.e. with no global exposure to $M$. Up to a global rescaling of the positions, the two possibilities at hand are a Long-Short (LS) portfolio and a long position that we beta-hedge (LH), that is:

$$\pi_{LS} \equiv \{1, -1, 0\}, \qquad \pi_{LH} \equiv \{1, 0, -1\},$$

where $\pi$ = (respective weights of Asset 1, Asset 2 and Market). The following result compares the Sharpe Ratios of these portfolios and determines, depending on the parameters $\alpha$, $\gamma \equiv \mathrm{Var}(\varepsilon_1)/\mathrm{Var}(F)$ and $\kappa \equiv \mathrm{Var}(\varepsilon_2)/\mathrm{Var}(\varepsilon_1)$, which one should be preferred.

We find (see Appendix A for details) the following result:

$$\frac{\mathrm{SR}(\pi_{LS})}{\mathrm{SR}(\pi_{LH})} = \sqrt{\frac{1+\gamma}{1+\gamma\,\dfrac{1+\kappa}{(1+\alpha)^2}}}, \tag{5}$$

which implies that

$$\mathrm{SR}(\pi_{LS}) > \mathrm{SR}(\pi_{LH}) \iff \alpha > \sqrt{1+\kappa} - 1. \tag{6}$$

This simple result reveals a transition between two regimes. If the hedging asset (Asset 2 here) under-performs the market enough, then from a (no-costs) Sharpe Ratio perspective the long-short market-neutral portfolio on Assets 1 and 2 outperforms the beta-hedged long portfolio on Asset 1. This is in line with the intuition that, if we have enough predictability on the short part of our portfolio, then shorts should be included.

Figure 1: Illustration of (5) when $\kappa = \mathrm{Var}(\varepsilon_2)/\mathrm{Var}(\varepsilon_1) = 1$, for various values of $\gamma = \mathrm{Var}(\varepsilon_1)/\mathrm{Var}(F)$. When the variance of $F$ dominates that of the residuals $\varepsilon_i$ (blue curve), the ratio of Sharpe ratios stays close to 1 even for large values of $\alpha$ because, in this case, both portfolios are approximately equivalent up to a global rescaling.

Figure 2: The predictive power of the Momentum Factor (UMD, as defined in Appendix B), on the European stock pool, 1985-2020. For every day and every stock in the pool, we put on the X-axis the value of the descriptor, properly normalized, and on the Y-axis the future residual return of the corresponding stock (i.e. the total stock return minus its beta component on the largest principal component of the covariance matrix). All these points are then averaged inside bins (the standard errors for these averages are also shown). The solid line is a linear regression through the points with a positive predictor. The dashed line shows the 40% threshold that short predictors have to beat in order for shorts to be accretive, according to the model.
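For readers who want to explore the criterion numerically, Eqs. (5) and (6) can be evaluated directly. The following is a minimal sketch in Python; the function names `short_threshold` and `sharpe_ratio_ratio` are ours and purely illustrative, and the parameter grid in the demo is arbitrary.

```python
import math


def short_threshold(kappa: float) -> float:
    """Smallest alpha for which Long-Short beats Hedged Long-Only, Eq. (6)."""
    return math.sqrt(1.0 + kappa) - 1.0


def sharpe_ratio_ratio(alpha: float, gamma: float, kappa: float) -> float:
    """SR(LS) / SR(LH) as given by Eq. (5).

    alpha: exposure of Asset 2 to -F (Asset 1 has exposure 1 after rescaling)
    gamma: Var(eps_1) / Var(F)
    kappa: Var(eps_2) / Var(eps_1)
    """
    denominator = 1.0 + gamma * (1.0 + kappa) / (1.0 + alpha) ** 2
    return math.sqrt((1.0 + gamma) / denominator)


if __name__ == "__main__":
    kappa = 1.0  # residuals of equal volatility
    print(f"threshold on alpha for kappa = 1: {short_threshold(kappa):.3f}")
    # Arbitrary illustrative grid of gamma and alpha values.
    for gamma in (0.1, 1.0, 10.0):
        ratios = [sharpe_ratio_ratio(a, gamma, kappa) for a in (0.0, 0.2, 0.4, 1.0, 2.0)]
        print(f"gamma = {gamma:5.1f}:", [round(r, 3) for r in ratios])
```

By construction the ratio equals 1 exactly at the threshold, whatever the value of $\gamma$, which only controls how quickly the ratio departs from 1 on either side.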
Our simple model suggests a quantitative threshold on the predictability of the short vs. the long positions beyond which a Long-Short implementation is beneficial. This threshold can be used on real data, as we show in Figs. 2 and 3. The sample of the global market we use is a monthly re-balanced world-wide pool made of the 1200 (resp. 1000, 900, 200) most liquid stocks (liquidity being quantified as the 180-day Average Daily Traded Volume) of North America (resp. Europe, Japan, Australia), from 2000 to 2020, using the definition of the Equity Factors we give in Appendix B.

The data reveal that for most of the equity factors that we will consider in this study (see next section), the short side of the predictor is sufficiently strong to make short positions profitable, at least in principle. In other words, taking short equity positions is a priori a better choice than simply shorting the futures contract. If the residual returns $\varepsilon_i$ have the same volatility ($\kappa = 1$), condition (6) reads $\alpha > \sqrt{2} - 1 \approx 0.4$ for $\mathrm{SR}(\pi_{LS}) > \mathrm{SR}(\pi_{LH})$. This threshold is materialized by dashed lines in Figures 1, 2, 3.

Finally, note that since $\sqrt{1+\kappa} - 1$ is an increasing function of $\kappa$, the higher the volatility of the residual returns of Asset 2, the stronger the under-performance of Asset 2 needs to be for $\mathrm{SR}(\pi_{LS}) > \mathrm{SR}(\pi_{LH})$ to hold.

Figure 3: Global (USA-Canada, Europe, Japan, Australia, 2000-2020) view of the predictability plot of Fig. 2. For the factors SMB, MOM, LOWVOL, VALUEEAR and ROA (sample and methodology are described in Section 2), on a world-wide pool of stocks, we show the ratio of the slope of the points corresponding to negative values of the predictor to the slope of the points corresponding to positive values of the predictor. The dashed line materializes the 40% threshold of our toy model beyond which $\mathrm{SR}(\pi_{LS}) > \mathrm{SR}(\pi_{LH})$. It appears that for all factors but SMB, the short leg under-performs the market enough to justify the long-short implementation.

3 When Equity Factors Drop Their Shorts

In this section we revisit the results advertised in ref. [ ]. The study is based on the classical Fama-French factors, as available in the Kenneth French data library (https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html). We consider here the following factors: HML (aka Value), WML (aka Momentum), RMW (aka Profitability), CMA (aka Investment), and also VOL, i.e. a low volatility factor available online (https://…/en/themes/datasets/). These factors are built using monthly returns of so-called Fama-French "2x3" portfolios on the US market (see the corresponding websites for more details). This construction allows one to clearly distinguish a long and a short leg for each factor.

In order to run a fair comparison between long and short legs, one beta-hedges each factor using a "market index" (namely the Fama-French market) built using the same 2x3 building blocks. Note that by construction this index contains 50% of small-cap stocks and 50% of large-cap stocks. Each factor is rescaled to get a beta of one with respect to the index. The index contribution is then removed to get a beta-neutral leg.

The Sharpe ratio of each beta-neutral leg is shown in Fig. 4-left. Overall, there is no striking difference of Sharpe ratio between the long and short legs, except perhaps for VOL. However, Fig. 4-right shows that long legs are less correlated with each other than short legs, as also found in [ ]. Hence, an optimal allocation will be more tilted towards the long portfolios although, at variance with [ ], we find that the short leg should be allocated 30% of the weight, and not zero weight. The discrepancy between our conclusions may be related to subtle implementation differences. We have not been able to identify precisely which implementation details matter most, but we suspect the detailed definition of the optimization problem is relevant. For example, there are different ways to estimate the correlation structure of the factors, and various constraints that one may want to impose on the portfolio. At the very least, this means that the "no-short" recommendation is not robust against such minor changes.

More importantly, the very definition of the market index, which serves as a hedge, is found to matter quite dramatically. Using for example the SPmini index (which is easily implementable as a low-cost hedge), we now find that the Sharpe ratio of the long legs is significantly better than that of the short legs, as reported in [ ]. However, this is because the difference between the two legs is now mechanically exposed to the SMB factor: see Fig. 5.

Figure 4: Fama-French portfolios. Left: Sharpe Ratios of hedged long and short legs. Right: mean correlation, across factors, of long (resp. short) legs with other long (resp. short) legs. These measures result in a portfolio with maximum Sharpe (based on the 10 synthetic assets defined by the hedged long and short legs of these 5 factors) weighting long (resp. short) legs at about 70% (resp. 30%).

Figure 5: Fama-French portfolios. This chart shows that when (longs − market) outperforms (market − shorts), this is in large part explained by an SMB exposure of the difference (longs − market) − (market − shorts). We show here the correlation, for each factor, between SMB and the difference $\Delta$ = (longs − $\beta_{\text{longs}}$ SPmini) − ($\beta_{\text{shorts}}$ SPmini − shorts). This correlation can be explained as follows: $\Delta$ rewrites as (longs + shorts) − ($\beta_{\text{longs}} + \beta_{\text{shorts}}$) SPmini. Since the portfolios of [ ] are 50% small caps and 50% large caps, $\Delta$ is correlated to the difference between an equally-weighted index and a market-cap weighted index, i.e. to SMB.

4 A Realistic Implementation Framework

We now turn to the analysis of a realistic implementation of factor trading, including the important issue of costs, in both its Long-Short (LS) and Hedged Long-Only (LH) incarnations.
Our main conclusion will be that factor trading through Long-Short portfolios actually over-performs Hedged Long-Only portfolios.

We define a set of predictors based on ranked metrics (Low Volatility, Momentum, Returns Over Assets, Small Minus Big, Value Earnings, see Appendix B) which, we believe, represent well the equity factor space, on a pool of stocks distributed over the main exchanges (USA-Canada, Europe, Asia, Australia), proportionally to the available liquidity. (The version of Value based on the Book/Price ratio is excluded because of its long-standing poor performance.) We prefer ranked implementations over 2x3 implementations since, as shown for example in Fig. 2, there is predictability even within the "central" quantiles, not only the extreme ones. Our study spans the period 2000-2020.

We then slow down the corresponding signals (via an Exponential Moving Average of 150 days), so that the turnover of the portfolios reaches reasonable values, compatible with typical trading costs. More precisely, the turnover is ≈ [...] ≈ 2% of the Gross Market Value for the LS portfolio.

The LH portfolio is implemented via a long-only constrained optimization problem where the portfolio's overlap with the factor predictor (or with the aggregated factors) is maximized, at controlled turnover cost. More precisely, the portfolio is updated daily as the solution of the optimization problem

$$\mathrm{portfolio}_t = \mathrm{Argmax}\left( \langle \mathrm{portfolio}_t \cdot \mathrm{factors\ predictor}_t \rangle - \mathrm{trading\ costs} \right),$$

under global AUM constraints plus individual risk constraints (i.e. a maximum position of 3% of the AUMs on every single stock).

The trading costs are the sum, for each trade, of a linear term accounting for bid-ask spread and broker costs, plus a term accounting for market impact that depends on the corresponding stock's liquidity. These trading costs are computed using our best in-house estimates of all these separate contributions, in particular of the square-root impact law documented in, e.g., [ ] and refs. therein.

We then look at the tracking error of the long-only portfolio with respect to easily tradeable market indexes (respectively: S&P 500, DJ EURO STOXX 50, TOPIX, S&P ASX 200 and S&P/TSX 60 Total Return indices). This tracking error corresponds to our LH implementation. The LS portfolio is constructed using the same signals and the same cost control, plus a volatility target equal to that of LH, implemented using a "cleaned" version of the empirical stock returns correlation matrix, see [ ]. This construction requires a certain leverage ratio that leads to some extra financing costs, together with some idiosyncratic shorting costs for hard-to-borrow stocks, which we carefully account for. (We use our in-house database for hard-to-borrow fees, which corresponds to the fees actually paid by the CFM equity market neutral programs over the years.)

Figure 6: Left: Sharpe Ratios of the LH and LS implementations of the different factors. Right: Mean correlation of each factor with the other factors in the LH and LS implementations.
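The effect of the signal slowing step on turnover can be illustrated on synthetic data. The sketch below is not the production pipeline: it treats a single position as proportional to the signal, uses white noise as a stand-in for a raw predictor, and assumes the common convention that an EMA "of 150 days" has decay 2/(span + 1). The helper names `ema` and `mean_daily_turnover` are ours.

```python
import random


def ema(series, span_days):
    """Exponential moving average, assuming decay alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span_days + 1.0)
    smoothed = []
    state = series[0]  # initialise on the first observation
    for x in series:
        state = alpha * x + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


def mean_daily_turnover(positions):
    """Average absolute daily position change, relative to the average absolute position."""
    traded = sum(abs(b - a) for a, b in zip(positions, positions[1:]))
    gross = sum(abs(p) for p in positions)
    return (traded / (len(positions) - 1)) / (gross / len(positions))


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical raw signal: pure white noise, the worst case for turnover.
    raw_signal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    slow_signal = ema(raw_signal, span_days=150)
    print("turnover of raw signal :", round(mean_daily_turnover(raw_signal), 3))
    print("turnover of slow signal:", round(mean_daily_turnover(slow_signal), 3))
```

On such a toy signal the smoothed position trades only a small fraction of its size each day, which is the mechanism that brings turnover down to levels compatible with trading costs.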
Figure 7: Factors' weights in maximum Sharpe LH and LS portfolios.
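To see why correlations between factors matter for the maximum Sharpe weights, one can use the textbook unconstrained solution: for strategies run at equal volatility, the weights are proportional to $C^{-1} S$, with $S$ the vector of Sharpe ratios and $C$ their correlation matrix. The sketch below assumes this formula (it is not necessarily the procedure behind Fig. 7), and the Sharpe ratios and correlation levels in the demo are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def max_sharpe_weights(sharpes, corr):
    """Unconstrained max-Sharpe weights (proportional to C^{-1} S), normalised to sum to 1."""
    raw = solve_linear_system(corr, sharpes)
    total = sum(raw)
    return [w / total for w in raw]


def equicorrelation(n, rho):
    """n x n correlation matrix with a single off-diagonal value rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    sharpes = [0.5, 0.45, 0.4, 0.35, 0.3]  # hypothetical factor Sharpe ratios
    for rho in (0.1, 0.6):                 # hypothetical low / high correlation
        weights = max_sharpe_weights(sharpes, equicorrelation(5, rho))
        print(f"rho = {rho}:", [round(w, 3) for w in weights])
```

With identical Sharpe ratios as input, raising the common correlation concentrates the allocation on the best factor and pushes the weakest ones towards zero or negative weights, which is the loss of diversification discussed in the text.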
The results of these experiments, using either single factors or the aggregate of different factors, are summarized below and illustrated in Figs. 6, 7 and 8. We fix the size of the AUMs to 1 Billion USD.
• First, we see that the LH implementation of individual factors leads to Sharpe Ratios that are comparable, on average, to those of the LS implementation (see Fig. 6, left). However, the LH implementations of the different factors are far more correlated with one another than their LS counterparts (see Fig. 6, right).
• Taking the factors' Sharpe ratios and their correlations into account, we compute the weight of each factor in the maximum Sharpe portfolios for both cases (LH and LS). We see, as expected, that the LS implementation is much more diversified between the different factors (see Fig. 7). Note that better diversification allows one to expect, as a general rule of thumb, a more robust out-of-sample performance.
• Finally, the P&Ls of the LH and LS implementations of aggregated factors (with equal weights) are presented in Fig. 8 (left). The Sharpe ratio of the LS implementation is found to be ≈ 0.98, versus ≈ 0.56 for the LH implementation (see Fig. 8 and Table 1). Note that all costs (trading costs, leverage financing costs of LS, and borrowing costs of LS) are included in these P&Ls. The breakdown of these costs is detailed in Table 1. Although total costs are ≈ 60% higher for the LS implementation, it still significantly over-performs the LH implementation. Note however that this conclusion may not hold for very large AUMs, given the higher turnover of the LS implementation and the super-linearity of impact costs as a function of trading volume.
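The caveat about large AUMs follows from the shape of the cost function. As a rough sketch, assume that the cost of a trade is a linear term plus a square-root impact term, so that the impact cost grows as the traded notional to the power 3/2. All parameter values below (half-spread, impact coefficient, volatility, ADV, number of stocks, turnover) are hypothetical placeholders, not the in-house estimates used in the study, and the function names are ours.

```python
import math


def trade_cost(notional, adv, daily_vol, half_spread=0.0005, y_coeff=1.0):
    """Cost, in currency units, of trading `notional` of one stock within a day.

    Linear term : half_spread * notional (bid-ask spread and broker fees).
    Impact term : notional * y_coeff * daily_vol * sqrt(notional / adv),
                  i.e. a square-root impact law applied to the traded notional.
    """
    linear = half_spread * notional
    impact = notional * y_coeff * daily_vol * math.sqrt(notional / adv)
    return linear + impact


def annual_cost_fraction(aum, n_stocks=1000, daily_turnover=0.02,
                         adv=5e7, daily_vol=0.02, days=250):
    """Yearly trading cost as a fraction of AUM, with trades spread evenly over stocks."""
    per_stock_notional = aum * daily_turnover / n_stocks
    daily_cost = n_stocks * trade_cost(per_stock_notional, adv, daily_vol)
    return days * daily_cost / aum


if __name__ == "__main__":
    for aum in (1e8, 1e9, 1e10):
        print(f"AUM = {aum:.0e}: annual cost = {100 * annual_cost_fraction(aum):.2f}% of AUM")
```

In this sketch the linear part is a fixed fraction of AUM, while the impact part grows like the square root of AUM, so the cost drag of a higher-turnover implementation widens as assets grow.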
Figure 8: Left: P&L (including all costs) of the LH and LS implementations, both with a realised volatility of 6.4% annual. Both portfolios take as an input signal the equi-weighted sum of all factor predictors: Low Volatility, Momentum, Returns Over Assets, Small Minus Big, Value Earnings. The Sharpe ratios of LH and LS are, respectively, 0.56 and 0.98. Right: Drawdown time series ((AUM − Peak Value) / AUM) for both implementations. The mean depth for LH amounts to -8.2%, but only -3.9% for LS.

Table 1: LH vs LS main P&L statistics, with returns and costs in percent per year of the 1 Billion USD AUMs.

      Sharpe   mean drawdown   returns + div.   trad. cost   financ. cost   short borrow. cost
LH    0.56     -8.2%           8.4%             -2.8%        -2.0%          NA
LS    0.98     -3.9%           14.2%            -4.8%        -2.6%          -0.6%

5 Conclusions

The conclusions of our study are quite clear: first, when discussing the relative merits of Hedged Long-Only and Long-Short portfolios, details matter. One should carefully discuss what "market" is used to hedge the Long-Only positions, since different definitions can lead to uncontrolled exposures to some factors, such as Small Minus Big or Low Volatility. All implementation costs should be estimated and integrated in the final P&L horse-race. Once all this is done, and provided our analysis is error-free, we unambiguously find that Long-Short implementations best Hedged Long-Only ones, at least when the AUMs are not too large. This conclusion is bolstered by the analysis of a very simple toy model, which provides a threshold on the strength of short predictors. Empirical predictability of the shorts indeed seems to lie beyond that threshold by a substantial margin.
Acknowledgments. We thank S. Gualdi, M. Cristelli, P. Seager, T. Madaule, J.C. Domenge and S. Vial for interesting discussions and suggestions.
A More On The Toy Model
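In this appendix, we show how we get equation (5) in Section 2. We have

$$\mathrm{PNL}(\pi_{LS}) = (1+\alpha) F + \varepsilon_1 - \varepsilon_2, \tag{7}$$

so that

$$\mathrm{SR}(\pi_{LS}) = \frac{(1+\alpha)\,\mathbb{E}(F)}{\sqrt{(1+\alpha)^2\,\mathrm{Var}(F) + \mathrm{Var}(\varepsilon_1) + \mathrm{Var}(\varepsilon_2)}}. \tag{8}$$

Similarly, since $\mathrm{PNL}(\pi_{LH}) = F + \varepsilon_1$,

$$\mathrm{SR}(\pi_{LH}) = \frac{\mathbb{E}(F)}{\sqrt{\mathrm{Var}(F) + \mathrm{Var}(\varepsilon_1)}}. \tag{9}$$

Eq. (5) follows immediately and we deduce that

$$\mathrm{SR}(\pi_{LS}) > \mathrm{SR}(\pi_{LH}) \iff (1+\alpha)^2 > 1+\kappa \tag{10}$$

$$\iff 1+\alpha > \sqrt{1+\kappa} \tag{11}$$

$$\iff \alpha > \sqrt{1+\kappa} - 1, \tag{12}$$

where we used the hypothesis that $1 + \alpha > 0$.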
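The derivation can be checked by direct simulation of the toy model (3)-(4). The sketch below assumes Gaussian returns with $\mathrm{Var}(F) = 1$ and an arbitrary positive mean for $F$ (the ratio of Sharpe ratios does not depend on it); these distributional choices, the sample size and the function names are ours, since the model itself only requires independence and zero-mean residuals.

```python
import math
import random


def sharpe(pnl):
    """Per-period Sharpe ratio: mean over standard deviation."""
    n = len(pnl)
    mean = math.fsum(pnl) / n
    var = math.fsum((x - mean) ** 2 for x in pnl) / n
    return mean / math.sqrt(var)


def simulated_sharpe_ratio_ratio(alpha, gamma, kappa, mean_f=1.0,
                                 n_samples=100_000, seed=0):
    """Monte Carlo estimate of SR(LS) / SR(LH) in the toy model."""
    rng = random.Random(seed)
    sd_eps1 = math.sqrt(gamma)          # Var(F) = 1, hence Var(eps_1) = gamma
    sd_eps2 = math.sqrt(gamma * kappa)  # Var(eps_2) = kappa * Var(eps_1)
    pnl_ls, pnl_lh = [], []
    for _ in range(n_samples):
        m = rng.gauss(0.0, 1.0)         # market return, hedged out in both portfolios
        f = rng.gauss(mean_f, 1.0)
        r1 = m + f + rng.gauss(0.0, sd_eps1)          # Eq. (3)
        r2 = m - alpha * f + rng.gauss(0.0, sd_eps2)  # Eq. (4)
        pnl_ls.append(r1 - r2)  # long Asset 1, short Asset 2
        pnl_lh.append(r1 - m)   # long Asset 1, short the market
    return sharpe(pnl_ls) / sharpe(pnl_lh)


def analytic_sharpe_ratio_ratio(alpha, gamma, kappa):
    """Closed form of Eq. (5)."""
    return math.sqrt((1 + gamma) / (1 + gamma * (1 + kappa) / (1 + alpha) ** 2))


if __name__ == "__main__":
    for alpha in (0.2, 1.0):
        print(alpha,
              round(simulated_sharpe_ratio_ratio(alpha, 1.0, 1.0), 3),
              round(analytic_sharpe_ratio_ratio(alpha, 1.0, 1.0), 3))
```

Up to sampling noise, the simulated ratio should track the closed form on both sides of the threshold $\sqrt{1+\kappa} - 1$.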
B Brief Description Of The Equity Factors

• Momentum: 11 month mean of returns lagged by 1 month, ranked
• Value Earnings: earnings / price, ranked
• Low Volatility: 250 day volatility, ranked
• Size: market cap (lagged by 20 days, averaged over 40 days), ranked
• ROA: net income / total assets, ranked

References

[1] Y. Amihud (2002), Illiquidity and stock returns: cross-section and time-series effects, Journal of Financial Markets 5 (1), 31-56.
[2] M. Baker, B. Bradley and J. Wurgler (2011), Benchmarks as Limits to Arbitrage: Understanding the Low-Volatility Anomaly, Financial Analysts Journal, Vol. 67, No. 1, pp. 40-54.
[3] J. Bambaci, J. Bender, R. Briand, A. Gupta, B. Hammond and M. Subramanian (2013), Harvesting Risk Premia for Large Scale Portfolios, MSCI report.
[4] D. Blitz, G. Baltussen and P. van Vliet (2019), When Equity Factors Drop Their Shorts, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3493305
[5] J.-P. Bouchaud, P. Krueger, A. Landier and D. Thesmar (2018), Sticky Expectations and the Profitability Anomaly, Journal of Finance.
[6] J.-P. Bouchaud and M. Potters (2003), Theory of financial risk and derivative pricing: from statistical physics to risk management, Cambridge University Press.
[7] F. Bucci, M. Benzaquen, F. Lillo and J.-P. Bouchaud (2019), Slow decay of impact in equity markets: insights from the ANcerno database, https://arxiv.org/abs/1901.05332
[8] M.M. Carhart (1997), On Persistence in Mutual Fund Performance, The Journal of Finance 52, 57.
[9] K.C. Chan and N.F. Chen (1991), Structural and return characteristics of small and large firms, Journal of Finance 46: 1467-84.
[10] A. Dasgupta, A. Prat and M. Verardo (2011), The price impact of institutional herding, Review of Financial Studies, 24 (3): 892-925.
[11] W.F.M. De Bondt and R.H. Thaler (1987), Further Evidence on Investor Overreaction and Stock Market Seasonality, Journal of Finance 42:3, pp. 557-81.
[12] E.F. Fama and K.R. French (1993), Common risk factors in the returns on stocks and bonds, Journal of Financial Economics, 33, 3.
[13] C.R. Harvey and Y. Liu (2019), A Census of the Factor Zoo, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3341728
[14] C.R. Harvey, Y. Liu and C. Zhu (2013), ...and the Cross-Section of Expected Returns, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2249314
[15] J. Huij, S. Lansdorp, D. Blitz and P. van Vliet (2014), Factor Investing: Long-Only versus Long-Short, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2417221
[16] A. Ilmanen and J. Kizer (2012), The Death of Diversification Has Been Greatly Exaggerated, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2998754
[17] S. Ross (1976), The arbitrage theory of capital asset pricing, Journal of Economic Theory 13(3), 341-360.
[18] W. Sharpe (1964), Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk, Journal of Finance 19, 425-442.
[19] R. Sloan (1996), Do stock prices fully reflect information in accruals and cash flows about future earnings? The Accounting Review 71: 289-315.
[20] R.F. Stambaugh, J. Yu and Y. Yuan (2011), The Short of It: Investor Sentiment and Anomalies, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1567616
[21] M. Vassalou and Y. Xing (2004), Default Risk in Equity Returns, Journal of Finance 59, 831-868.
[22] X.F. Zhang (2006),